The first step is to orient yourself to the code in front of you. This is easiest when you have written the code yourself. But you may find yourself debugging code that was contributed (at least in part) by someone else, especially if you are using generative AI code tools, which draft chunks of code based on patterns in millions of online code examples.
Beginner programmers often skip this step, but taking the time to observe the structure of your code raises your odds of catching the bug.
2.1 Strategy
2.1.1 State the Program’s Purpose
First, check that you understand why this program exists in the first place. Can you say the intent in one simple sentence? For example: “This program is supposed to count how many orders there were in a one-month period and print out the total.”
2.1.2 Meet the cast of characters
Just as a theater playbill may start with a list of the characters and their relationships to one another, start by familiarizing yourself with the key players in the code: data objects and computations.
2.1.2.1 Data Objects
Write down an inventory of the important data objects that will be produced and what they are meant to represent. Typically, these will be assigned to variables, and those variable names will work as character names for your cast.
2.1.2.2 Computations
Also, make note of important computations that will occur. Typically, these are functions or methods defined in the code, and their names will work as character names for your cast.
Your cast of characters should include only the most important objects and computations, and describe only some of their properties. A complete and accurate description of all the characters would be at least as long as the program itself, and wouldn’t be helpful for orienting you. For example, if one function calls several other functions in order to perform its computation, you may only need to include the top-level function in your cast. Finding the right level of abstraction for these descriptions is an art form; you will get better at it with practice.
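One way to draft a first cast list is to let Python enumerate the candidates for you. The sketch below uses the standard `ast` module to list the functions defined at the top level of a program and the variable names assigned there. The sample program text and the helper name `list_cast` are invented for illustration; you would still need to decide which of the names it finds are important enough to keep.

```python
import ast

# Hypothetical program text; in practice you might read this from a .py file.
SAMPLE_SOURCE = '''
def load(path):
    return [1, 2, 3]

def summarize(values):
    return sum(values)

numbers = load("data.txt")
total = summarize(numbers)
'''

def list_cast(source):
    """Return (function names, variable names) found at the top level."""
    tree = ast.parse(source)
    functions = []
    variables = []
    for node in tree.body:
        if isinstance(node, ast.FunctionDef):
            functions.append(node.name)
        elif isinstance(node, ast.Assign):
            # Collect simple names on the left-hand side of the assignment.
            for target in node.targets:
                if isinstance(target, ast.Name):
                    variables.append(target.id)
    return functions, variables

functions, variables = list_cast(SAMPLE_SOURCE)
print("Computations:", functions)
print("Data objects:", variables)
```

This only parses the text of the program; it never runs it, so it is safe to use even on code that currently crashes.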
Visual Symbols for Data Types
It is often helpful to note the data types of key data objects and of the inputs and outputs of computations. Identifying that a data object is a list, say, rather than a string or a dictionary, can make it easier to recognize a mismatch. The mismatch might be between your expectation that the object is a list and the reality that the code has created a different type of object. Or the object might really be a list, as expected, while a subsequent computation requires a different type.
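To make the first kind of mismatch concrete, here is a small sketch in which a value that we expect to be a list turns out to be a string. The variable names and values are invented for illustration.

```python
# We expect `scores` to be a list of numbers...
scores = "90, 85, 77"  # ...but this is actually one string.

# Checking the type reveals the mismatch.
print(type(scores).__name__)

# Splitting the string yields a list, but its items are still strings.
score_list = scores.split(", ")
print(type(score_list).__name__, type(score_list[0]).__name__)

# Converting each item gives the list of integers we expected all along.
numeric_scores = [int(s) for s in score_list]
print(sum(numeric_scores))
```

Writing "list of integers" next to `scores` in your cast of characters is what makes the surprise visible when `type(scores)` reports a string.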
In our diagrams, we will use symbols to indicate some common data types. (Note that you may not yet have learned about numpy arrays or pandas DataFrames or Series. Don’t worry; you’ll learn about them later on.)
2.1.3 Understand the Plot
2.1.3.1 (Draw a diagram of the program execution)
Then, look for the plot structure: what happens in what order? Some people like to do this by hand, so don’t hesitate to break out a pen and paper. Or you can use an online drawing tool. We like to use a simplified form of flowchart, with rectangles showing computations, diamonds for conditionals, and arrows that loop back to an earlier computation as a way to represent iteration.
At key points in the execution diagram, it will be useful to draw in an abstract version of a “reference diagram”, a mapping of variable names to data objects. This abstract version may include only some of the available variables, and may include only some properties of the data object. For example, it may indicate that a data object is a list, without specifying how many items are in the list or what their data types are. You can always go back and fill in more detail in this diagram during a later stage of the OILER process, if it turns out to be useful.
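If you prefer to start from the running program rather than from paper, a text version of an abstract reference diagram can be sketched in a few lines. The helper name `abstract_reference_diagram` and the sample state below are assumptions made for illustration; the idea is simply to record the type of each chosen variable and nothing else.

```python
def abstract_reference_diagram(namespace, names):
    """Map each chosen variable name to the type name of its value."""
    return {name: type(namespace[name]).__name__ for name in names}

# Hypothetical snapshot of a program's variables at one point in execution.
state = {
    "orders": [12, 7, 30],
    "customer": "Ada",
    "prices": {"apples": 1.0},
}

# Include only the variables that matter for the current question.
diagram = abstract_reference_diagram(state, ["orders", "prices"])
for name, type_name in diagram.items():
    print(f"{name} -> {type_name}")
```

In a live session you could pass `globals()` or `locals()` in place of the hand-built `state` dictionary.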
A very abstract execution diagram.
How to know if you are done with Orientation
When you finish this stage, you should be able to give a pretty good description of the key steps in the program. That means you know their inputs and outputs, as well as the order in which the steps are executed.
2.2 Practice
Imagine you’ve been given this code, written by a teammate. Your job is to orient yourself to the program using the strategies we’ve just discussed.
Code
import csv

def read_two_column_csv(file_name):
    with open(file_name, "r") as file:
        reader = csv.reader(file)
        next(reader)  # skip header row
        data = {}
        for item, value in reader:
            data[item] = value
    return data

def create_combined_inventory(quantities, prices):
    combined_data = {}
    for item in quantities:
        combined_data[item] = {
            "quantity": int(quantities[item])
        }
    for item in prices:
        price = float(prices[item].replace("$", ""))
        if item in combined_data:
            combined_data[item]["price"] = price
        else:
            combined_data[item] = {
                "price": price
            }
    return combined_data

## Process a shopping request
def check_one_product(inventory, product, request_qty):
    message = ""
    available_qty = inventory[product]["quantity"]
    price = inventory[product]["price"]
    if request_qty > available_qty:
        qty = available_qty
        apology = "Sorry, we only have "
    else:
        qty = request_qty
        apology = ""

    subtotal = qty * price
    message = (
        f"${qty * price:.2f}: {apology}"
        f"{qty} {'unit' if qty == 1 else 'units'} of {product}; "
        f"${price:.2f} per unit.\n"
    )
    return subtotal, message

def process_shopping_request(shopping_request, inventory):
    response = ""
    total = 0
    for product, request_qty in shopping_request.items():
        subtotal, message = check_one_product(inventory, product, request_qty)
        total += subtotal
        response += message
    response += f"----\n${total:.2f} Total\n"
    return response

quantities = read_two_column_csv("data/quantities.csv")
prices = read_two_column_csv("data/prices.csv")

inventory = create_combined_inventory(quantities, prices)

print(process_shopping_request({"apples": 500, "bananas": 3, "pears": 1}, inventory))
print(process_shopping_request({"apples": 500, "bananas": 3, "oranges": 4, "pears": 1}, inventory))
$10.00: Sorry, we only have 10 units of apples; $1.00 per unit.
$6.00: 3 units of bananas; $2.00 per unit.
$1.50: 1 unit of pears; $1.50 per unit.
----
$17.50 Total
2.2.1 State the Program’s Purpose
After examining the code, it appears that this program handles shopping requests, generating a text invoice with one line for each type of fruit that is ordered and a total price.
2.2.2 Meet the cast of characters.
The main computational characters are:
Read. A function that reads in a two-column CSV file and outputs a dictionary with the first column as keys and the second as values. (lines 3-10)
Combine. Turns a quantities dictionary and a prices dictionary into a combined inventory dictionary. (lines 12-26)
Shop. The top-level function is process_shopping_request (lines 48-56), which uses check_one_product as a helper function (lines 29-46).
The helper function checks to make sure there is enough inventory for the requested product, removes the quantity that will be provided from the inventory, and returns the total cost for that item and a feedback string.
The data objects that are produced are:
Two dictionaries, stored in the variables prices and quantities (lines 58-59)
A combined dictionary, stored in the variable inventory (line 61). It has product names as keys and, as values, nested dictionaries with quantity and price as keys.
A response string which acts as an invoice (returned by process_shopping_request on line 56).
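As a rough picture of the combined dictionary’s shape, here is a hand-written sketch. The apples quantity and the three prices are taken from the output shown earlier; the quantities for bananas and pears do not appear in that output, so the values used below are placeholders.

```python
# Hand-written sketch of the inventory's structure, not output from the program.
inventory_sketch = {
    "apples": {"quantity": 10, "price": 1.00},
    "bananas": {"quantity": 5, "price": 2.00},  # quantity is a placeholder
    "pears": {"quantity": 2, "price": 1.50},    # quantity is a placeholder
}

# Each value is itself a dictionary with the same two keys.
for product, details in inventory_sketch.items():
    print(product, type(details).__name__, sorted(details))
```

At this stage the exact numbers matter less than the shape: a dictionary of dictionaries, keyed first by product name and then by "quantity" or "price".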
2.2.3 Understand the Plot
2.2.3.1 (Draw a diagram of the program execution).
The Read and Combine steps are executed in order, the results of the Read step being used as inputs for the Combine step. The Shop step, which may be invoked more than once, invokes check_one_product on each requested product. That function determines how much of the product to supply based on whether there is enough in the inventory, which involves conditional logic.
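If you want to double-check an ordering like this one, you can record calls as they happen. The sketch below is one possible approach, not part of the teammate’s program: the `traced` decorator and the four stand-in functions are hypothetical, and the stand-ins only mimic the call structure described above rather than doing any real work.

```python
import functools

call_log = []  # function names, in the order they are called

def traced(func):
    """Wrap func so that each call appends its name to call_log."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        call_log.append(func.__name__)
        return func(*args, **kwargs)
    return wrapper

# Stand-in functions (hypothetical); they return fixed toy values.
@traced
def read(name):
    return {"apples": "10"}

@traced
def combine(quantities, prices):
    return {"apples": {"quantity": 10, "price": 1.0}}

@traced
def check(inventory, product):
    return product in inventory

@traced
def shop(request, inventory):
    # One call to check per requested product.
    return [check(inventory, product) for product in request]

inventory = combine(read("quantities"), read("prices"))
shop(["apples", "pears"], inventory)
print(" -> ".join(call_log))
```

The log lists both Read calls before Combine, and one check call per requested product inside Shop, which matches the plot described above.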
Here’s how our diagram might look, based on these plot details.
Are we done with orientation?
Depending on what happens in the later steps of the OILER process, we may need a little more detail about some of the data objects or the computations. For example, we might end up wanting to represent explicitly the data types of the inputs and outputs for check_one_product, which are omitted in this diagram. Or we might want to represent that there could be multiple invocations of the Shop computation.
But this figure counts as a “pretty good” representation of what will happen in a program execution. Good enough for now. So, yes, we’re ready to move on from the Orientation stage!