4  Locate the Root Cause

Orient full slide

If you completed the “Investigate the Symptoms” stage and diagnosed the proximate cause as a data fault, you will need to find the root cause. In other words: you need to find the first point during the program’s execution where the data diverged from your expectations and identify why.

If you diagnosed the proximal cause as a code-intention mismatch or a misunderstanding of a computation at the proximate location, you can skip this step. You only need to look for an earlier root cause when you find a data fault as the proximate cause.

4.1 Strategy

Thanks to our investigation, we know that at some point in our program’s execution, one of the data objects has unexpected contents. But how do we know when things first went wrong?

There are specialized tools for helping you scan through a program and track the state of variables- these are called debuggers. Here, we are going to learn a process you can do by hand to follow the journey of a data object through your program.

Importantly, we want to be hypothesis driven in our search. We have good reason to believe tracing the origins of the suspect data object will lead us to a root cause. But this may not always be easy: a data object may be modified many times throughout a program. Several variable names can refer to the same data object. And data objects might originate from other data objects defined even earlier in the program!

To keep the search reasonably organized, programmers can use a technique called tracing- specifically, forward and backward tracing. Why are there two kinds? Well, there are two ways to think about looking for the root cause:

  • We could look backward through the execution to find the first place where the data fault disappears. This is backward tracing.
  • We could look forward through the execution to find the first place where a data fault appears. This is forward tracing.

Either one can be effective. You can choose between them based on your understanding of the program: if you suspect there may be issues with your data before you even read it into your program or during an early transformation, forward tracing could save you time. On the other hand, if you suspect the data object was as expected until quite recently in the program execution, you might opt to trace backwards from the proximal location of the data fault.

4.1.1 Backward tracing

In backward tracing we look for the most recent computational step that may have changed the object, and the computational step before that, and so on.

We should have at least one variable name bound to the problematic object, or to some container object that includes it. In our running example, inventory is bound to a dictionary whose values are also dictionaries, and one of those inner dictionaries is missing the 'price' key.

We need to be cautious, as there could be other variables, by other names, that are aliases for the same dictionary, or parts of it. We don’t want to ignore the possibility that the object was changed as a result of operations on those other variables. But tracing back to see how inventory gots its value will be a good start.

The first step is just to search for that variable name. Using the text search feature of your browser or programming environment will be helpful, if you have a long name like inventory. If you have a short name like a, text search won’t be so helpful because you’ll get lots of matches that aren’t your variable– another reason to use longer, descriptive variable names. It may be especially helpful to put your cursor on the line that you’re trying to trace back from and do a “reverse” text search from there, to find recent references to that variable name.

Commonly, you will find that the variable name in question is a parameter name of a function that was invoked. In other words, the data object was passed in as a parameter value. In the calling environment (the place where the function was invoked) the data object may have been referred to by some other name, or be the value resulting from some complicated expression. So you may have to start tracing backward with a different variable name.

Even figuring out where the function was invoked can be tricky. The same function may be invoked in multiple places. Text search for the function name may be helpful for identifying candidates.

Even better, you can use the stack trace to help you! Remember how we said the stack trace for an error message includes a “frame” for each function that is invoked? We started by looking at the bottom of the stack trace and went up to find the first mentioned location in our code, ignoring function calls that were deep in libraries that we would never considering editing. If we keep going up in the stack trace, we will find information about where each function was invoked. It will even tell us the line number.

For example, the stack trace for our running example tells us that the error ocurred while executing check_one_product(), which was invoked on line 52, inside process_shopping_request(). We see there that a variable name inventory was passed as the inventory parameter of check_one_product(). So, in this case, the variable name in the calling location is the same as the variable name at our proximal cause location. We can continue our search backward from line 52 to see where inventory comes from there.

Eventually, in this backward tracing, we may find a computation that sets or transforms the object. Whenever we find one, we can hypothesize that perhaps it is not doing the transformation we expect. Then, we can check that hypothesis.

Eventually, we will get back to the first step in the computation that does any transformation, which may just be directly setting it, or reading contents from a file.

4.2 Forward Tracing

Alternatively, we may have a pretty good idea, based on our plot diagram, about the original data source. Especially if we suspect that the problem may originate with that data, we might locate the root cause a lot faster by starting with the original data source. We can check whether it meets our expectations. If so, we can trace forward to see the first place where it is used, and then the second, and so on, until we find the first place where it is transformed in a way that doesn’t match our expectations.

Forward tracing can be a nice shortcut if we guess right about the original data source and either it is incorrect from the start or a data fault occurs early on in the program execution. However, it can also lead to some wasted time, if we guess wrong about the original data source, or the fault occurred just before the location of the proximal cause.

How to know if you are done with Locating the Root Cause

You will know you have located the root cause when you find a computation that transforms a good data object into a bad one, or when you find input data (e.g., a .csv file) that has data violates the expectations of your program.

4.3 Practice

Let’s review our code and error message:

Code
import csv


def read_two_column_csv(file_name):
    with open(file_name, "r") as file:
        reader = csv.reader(file)
        next(reader)  # skip header row
        data = {}
        for item, value in reader:
            data[item] = value
    return data


def create_combined_inventory(quantities, prices):
    combined_data = {}
    for item in quantities:
        combined_data[item] = {"quantity": int(quantities[item])}
    for item in prices:
        price = float(prices[item].replace("$", ""))
        if item in combined_data:
            combined_data[item]["price"] = price
        else:
            combined_data[item] = {"price": price}
    return combined_data


## Process a shopping request
def check_one_product(inventory, product, request_qty):
    message = ""
    available_qty = inventory[product]["quantity"]
    price = inventory[product]["price"]

    if request_qty > available_qty:
        qty = available_qty
        apology = "Sorry, we only have "
    else:
        qty = request_qty
        apology = ""
    subtotal = qty * price
    message = (
        f"${qty * price:.2f}: {apology}"
        f"{qty} {'unit' if qty == 1 else 'units'} of {product}; "
        f"${price:.2f} per unit.\n"
    )
    return subtotal, message


def process_shopping_request(shopping_request, inventory):
    response = ""
    total = 0
    for product, request_qty in shopping_request.items():
        subtotal, message = check_one_product(inventory, product, request_qty)
        total += subtotal
        response += message
    response += f"----\n${total:.2f} Total\n"
    return response


quantities = read_two_column_csv("data/quantities.csv")
prices = read_two_column_csv("data/prices.csv")

inventory = create_combined_inventory(quantities, prices)
print(process_shopping_request({"apples": 500, "bananas": 3, "pears": 1}, inventory))
print(
    process_shopping_request(
        {"apples": 500, "bananas": 3, "oranges": 4, "pears": 1}, inventory
    )
)
{'apples': '$1.00', 'bananas': '$2.00', 'mangoes': '$4.00', 'grapes': '$5.00', 'pears': '$1.50', 'blueberries': '$2.50', 'strawberries': '$3.50', 'raspberries': '$4.50', 'blackberries': '$5.50'}
{'apples': '10', 'bananas': '20', 'oranges': '30', 'mangoes': '40', 'grapes': '50', 'pears': '5', 'blueberries': '15', 'strawberries': '25', 'raspberries': '35', 'blackberries': '45'}
$10.00: Sorry, we only have 10 units of apples; $1.00 per unit.
$6.00: 3 units of bananas; $2.00 per unit.
$1.50: 1 unit of pears; $1.50 per unit.
----
$17.50 Total
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
Cell In[2], line 66
     64 inventory = create_combined_inventory(quantities, prices)
     65 print(process_shopping_request({"apples": 500, "bananas": 3, "pears": 1}, inventory))
---> 66 print(process_shopping_request({"apples": 500, "bananas": 3, "oranges": 4, "pears": 1}, inventory))

Cell In[2], line 52, in process_shopping_request(shopping_request, inventory)
     50 total = 0
     51 for product, request_qty in shopping_request.items():
---> 52     subtotal, message = check_one_product(inventory, product, request_qty)
     53     total += subtotal
     54     response += message

Cell In[2], line 32, in check_one_product(inventory, product, request_qty)
     30 message = ""
     31 available_qty = inventory[product]["quantity"]
---> 32 price = inventory[product]["price"]
     34 if request_qty > available_qty:
     35     qty = available_qty

KeyError: 'price'

4.3.1 Backward Tracing

In the “Investigate the Symptoms” stage, we identifed that our code halted execution while trying to lookup the price of an orange from the inventory nested dictionary. What was the order of events leading up to this point in the code? Before we begin tracing the inventory object, let’s recall our execution diagram:

Execution Diagram

We know the program threw an error at line 32- but where was that in the execution flow? There are some hints in the error message:

Reading this from bottom to top tells us:

  • The error happened while trying to access inventory[product]["price"] inside the check_one_product() function.
  • The check_one_product() function was called by process_shopping_request().
  • The process_shopping_request() function was called in line 63, where we used it to process a shopping cart dictionary ({"apples": 500, "bananas": 3, "oranges": 4, "pears": 1}) alongisde an inventory dictionary (inventory).

Now for the backward tracing. At the point of execution that triggered the error, we know that something was wrong with the inventory nested dictionary- specifically, we expected that inventory["orange"] would have a "price" key, but it didn’t.

We are now going to work backwards through the execution to find any location where inventory is modified (or created).

  • Within the function check_one_product(), inventory is accessed on line 31, but not modified. So that can’t be where things went awry.
  • Stepping backwards to the previous place we see inventory takes us to the check_one_product() arguments. So inventory is passed in here.
  • Where did the function check_one_product() get its inventory argument? Well, remember that function is called by process_shopping_request() in line 52.
  • Where did that inventory object come from? It appears to be an argument passed to process_shopping_request().
  • Where did process_shopping_request() get invoked? That happened in line 63. And there we see inventory.
  • Where did the inventory dictionary in line 63 come from? We see it in line 62, but it isn’t being modified here. If we step back one line further, though, we see it is the result of running create_combined_inventory(quantities, prices). Aha!
  • It could be that the function create_combined_inventory() doesn’t work how we understand it to (a computation misunderstaning), and maybe that is leading us to lose the orange prices. Another possibility is that one of the data objects involved at this step, quantities or prices, doesn’t look how we think it does (another data fault). Which could it be? Let’s first try checking the data objects. We can print out the values of those dictionaries:
Code
quantities
{'apples': '10',
 'bananas': '20',
 'oranges': '30',
 'mangoes': '40',
 'grapes': '50',
 'pears': '5',
 'blueberries': '15',
 'strawberries': '25',
 'raspberries': '35',
 'blackberries': '45'}
Code
prices
{'apples': '$1.00',
 'bananas': '$2.00',
 'mangoes': '$4.00',
 'grapes': '$5.00',
 'pears': '$1.50',
 'blueberries': '$2.50',
 'strawberries': '$3.50',
 'raspberries': '$4.50',
 'blackberries': '$5.50'}

It looks like create_combined_inventory() may be working correctly. But the prices data that is passed to it doesn’t have a price for oranges.

So we have a new data fault, and we have to trace back even farther to find the original root cause.

  • Where did prices come from on line 61? It was set on line 59 to be the return value from calling read_two_column_csv("data/prices.csv").
  • So, again we have two possibilities. read_two_column_csv might not operate as we think it does, or there’s a data fault in the file "data/prices.csv". Now that we’ve gotten this far, it seems plausible that maybe there just wasn’t a price for oranges in that datafile, so let’s look there first.

Here are the contents of the file:

Name,Price
apples,$1.00
bananas,$2.00
mangoes,$4.00
grapes,$5.00
pears,$1.50
blueberries,$2.50
strawberries,$3.50
raspberries,$4.50
blackberries,$5.50

Sure enough: no row for oranges. This is the root cause of our original symptom, the KeyError on line 32.

4.3.2 Forward Tracing

If we had done forward tracing, we would have begun by inspecting the inventory.csv file in the first place. Wait a second, that would’ve been a whole lot faster than backward tracing!

How might we have been inspired to start there, and save a lot of time? We might have asked ourselves from the start, “Hey, did we ever have a price for oranges, or did we lose it along the way in our data manipulations?” Before checking all the manipulations, let me just see if the original source is missing a price for oranges.