Put On Your Deerstalker. You're Now a Detective. It's Time For Debugging
Debugging Python Code Is Like Detective Work
For this week’s article, I’m revisiting an old post about debugging I wrote before I started The Python Coding Stack. Debugging needs a certain mindset, perhaps one that’s shared with detectives investigating crimes.
I first published this article in April 2022 on my old blog. Some of you may have read it then, but I’m hoping most of you will be reading it for the first time!
I’m not being lazy by reposting an old article this week—well, I suppose I am, but I’m busy getting everything ready for the official launch of The Python Coding Place this Monday, recording more videos for The Place and for the teenagers’ platform I’ve also just launched, Codetoday Unlimited. What was I thinking launching two new platforms at the same time?! But it’s been lots of fun getting these projects going…
And one of the courses I was recording recently was about debugging, which got me searching for this article again.
Anyway, here’s the article. I’m posting it without editing, so it’s written in a style that’s two years old. My writing has changed since then…
Debugging Python Code Is Like Detective Work — Let’s Investigate
Debugging Python code is not a mysterious art form. It's like a detective solving a mystery. This analogy comes from one of my favourite programming aphorisms: "Debugging is like being the detective in a crime movie where you are also the murderer" (Felipe Fortes).
So what can real detectives tell us about debugging Python code? I thought of looking up some guidelines that police use when investigating a crime. Here are the areas detectives work on when investigating a crime scene according to the College of Policing in the UK:
Prove that a crime has been committed
Establish the identity of a victim, suspect or witness
Corroborate or disprove witness accounts
Exclude a suspect from a scene
Link a suspect with a scene
Interpret the scene in relation to movements within the scene and sequences of events
Link crime scene to crime scene and provide intelligence on crime patterns
[Source: https://www.app.college.police.uk/app-content/investigations/forensics/]
Let's look at all of these and find their counterparts in debugging Python code.
I'll use the code below as an example throughout this article. This code has a list of dictionaries with books about detectives and crimes, of course! Each item includes the author, title, year published, and the book's rating on Goodreads:
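Here's a minimal sketch of this code. The book data is taken from the output printed further down in this article. The bodies of the two functions are assumptions, written so that the program misbehaves in the way the investigation describes:

```python
# Book data as it appears in the output later in this article
books = [
    {"author": ("Arthur Conan", "Doyle"), "title": "A Study in Scarlet", "published": 1887, "rating": 4.14},
    {"author": ("Arthur Conan", "Doyle"), "title": "The Sign of Four", "published": 1890, "rating": 3.92},
    {"author": ("Arthur Conan", "Doyle"), "title": "The Hound of the Baskervilles", "published": 1901, "rating": 4.13},
    {"author": ("Agatha", "Christie"), "title": "Murder of the Orient Express (Hercule Poirot #4)", "published": 1926, "rating": 4.26},
    {"author": ("Agatha", "Christie"), "title": "Death on the Nile (Hercule Poirot #17)", "published": 1937, "rating": 4.12},
]


def find_by_author(books_list, last_name):
    """Find books by the author's last name (assumed body, with bugs)."""
    for book in books_list:
        output = []
        if book["author"] == last_name:
            output.append(book)
    return output


def find_by_ratings(books_list, lower_bound):
    """Find books rated above lower_bound (hypothetical body, with a bug)."""
    output = []
    for book in books_list:
        if book["rating"] == lower_bound:
            output.append(book)
    return output


doyle_books = find_by_author(books, "Doyle")
doyle_books_above_4 = find_by_ratings(doyle_books, 4)

print(doyle_books_above_4)
```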
There are two functions, too. One finds the books written by a specific author, and the other filters books based on their rating. The two calls at the end should result in all Arthur Conan Doyle books with a rating higher than 4. However, as you'll see soon, there's a problem.
Let's start going through the areas listed in the College of Policing document.
Prove That A Crime Has Been Committed
You need to determine whether there's something that doesn't work in your program. Sometimes, this is obvious. Either an error is raised when you run your code, or the output from your code is clearly wrong.
But often, the bug in your code is not obvious.
You need to be on the lookout for potential crimes in the same way that police forces are on the lookout (or should be) for crimes.
This is why testing your code is crucial. Now, there are different ways of testing your code, depending on the scale and extent of the code and what its purpose is. However, whatever the code, you always need to test it somehow.
This testing will allow you to determine that a crime has been committed—there's a bug somewhere!
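One simple way of testing is to work out the expected result by hand and compare it with what the code gives you. Here's a sketch of that idea. The names expected_titles, actual_titles, and crime_committed are hypothetical, and the function bodies are assumed, as they are throughout this article:

```python
books = [
    {"author": ("Arthur Conan", "Doyle"), "title": "A Study in Scarlet", "published": 1887, "rating": 4.14},
    {"author": ("Arthur Conan", "Doyle"), "title": "The Sign of Four", "published": 1890, "rating": 3.92},
    {"author": ("Arthur Conan", "Doyle"), "title": "The Hound of the Baskervilles", "published": 1901, "rating": 4.13},
    {"author": ("Agatha", "Christie"), "title": "Murder of the Orient Express (Hercule Poirot #4)", "published": 1926, "rating": 4.26},
    {"author": ("Agatha", "Christie"), "title": "Death on the Nile (Hercule Poirot #17)", "published": 1937, "rating": 4.12},
]


def find_by_author(books_list, last_name):
    # Assumed body, still containing the bugs
    for book in books_list:
        output = []
        if book["author"] == last_name:
            output.append(book)
    return output


def find_by_ratings(books_list, lower_bound):
    # Hypothetical body, still containing a bug
    output = []
    for book in books_list:
        if book["rating"] == lower_bound:
            output.append(book)
    return output


doyle_books_above_4 = find_by_ratings(find_by_author(books, "Doyle"), 4)

# The result you expect, worked out by hand from the data
expected_titles = ["A Study in Scarlet", "The Hound of the Baskervilles"]
actual_titles = [book["title"] for book in doyle_books_above_4]

# If these differ, a crime has been committed
crime_committed = actual_titles != expected_titles
print(f"{crime_committed = }")
```

With the buggy functions, this prints crime_committed = True. A testing framework would do the same comparison in a more structured way, but the principle is the same.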
The output of the code I showed you above is the following:
[]
In this case, it's not too difficult to determine that there is indeed a crime that's been committed. In the short list of books, you can see two out of the three Arthur Conan Doyle books have a rating above 4. The code should have output these two books.
Before you send in your complaints that the last name should be Conan Doyle and not Doyle, please note that I've referred to the font of all the world's truth on this matter: Wikipedia! See Arthur Conan Doyle.
Establish The Identity Of A Victim, Suspect Or Witness
Who's the victim? I can see how that's important for a detective trying to solve a crime.
When debugging Python code, you'll need to understand the problem. If your code raises an error, the victim is shown in red writing in your console. If your code doesn't raise an error, but your testing shows there's a problem, you'll need to be clear about what the problem is. How is the output you get different from the output you were expecting?
As you go through the debugging process, you'll need to identify who the suspects are. Which lines of your code could be the ones which committed the crime? I'll talk more about how to deal with suspects later, and how to exclude them or keep them in consideration. But before you can do either of those two things, you'll need to identify a line of code as a suspect!
You also have witnesses in your code. Often, these are the variables containing data: what are the values of the data and what type of data are they? Before you can interrogate the witnesses, you'll need to identify them!
Corroborate Or Disprove Witness Accounts
How do you interrogate witnesses to get accurate witness accounts? You've probably watched as much crime drama on TV as I have, so I'll skip what detectives do in real-world crimes. Besides, I strongly suspect (!) real police interrogations are a lot less exciting than those we see on TV.
How do you interrogate the witnesses in your code? You ask the witnesses (variables) for the values they hold and what data types they are. You can do this with the humble print() function, using print(witness_variable) and print(type(witness_variable)). Or you can use whatever debugging tool you want. A big part of debugging Python code is looking at the variables' values and data types.
Programmers have one advantage over detectives. Witnesses never lie! Once you ask a variable to give up its value and data type, it will always tell you the truth!
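As a quick illustration, here's what this interrogation could look like. The value assigned to witness_variable is just an example, borrowed from the book data used in this article:

```python
# A hypothetical witness: the author entry for one of the books
witness_variable = ("Arthur Conan", "Doyle")

# Ask the witness for its value...
print(witness_variable)

# ...and for its data type
print(type(witness_variable))
```

The first call shows the value, ('Arthur Conan', 'Doyle'), and the second shows its type, <class 'tuple'>.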
Let's start our investigation into the crime in the code above. You can start from the first function call, find_by_author(books, "Doyle"). This takes us to the function definition for find_by_author().
Could the for loop statement have any issues? Is this line a suspect? Let's ask the witnesses:
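Here's a sketch of this step, with two print() calls added to find_by_author() and the final print() switched to the same f-string style. The function bodies are assumed, as before. The f-string = syntax needs Python 3.8 or later:

```python
books = [
    {"author": ("Arthur Conan", "Doyle"), "title": "A Study in Scarlet", "published": 1887, "rating": 4.14},
    {"author": ("Arthur Conan", "Doyle"), "title": "The Sign of Four", "published": 1890, "rating": 3.92},
    {"author": ("Arthur Conan", "Doyle"), "title": "The Hound of the Baskervilles", "published": 1901, "rating": 4.13},
    {"author": ("Agatha", "Christie"), "title": "Murder of the Orient Express (Hercule Poirot #4)", "published": 1926, "rating": 4.26},
    {"author": ("Agatha", "Christie"), "title": "Death on the Nile (Hercule Poirot #17)", "published": 1937, "rating": 4.12},
]


def find_by_author(books_list, last_name):
    print(f"{books_list = }")  # Witness statement before the loop starts
    for book in books_list:
        print(f"{book = }")  # Witness statement on each iteration
        output = []
        if book["author"] == last_name:
            output.append(book)
    return output


def find_by_ratings(books_list, lower_bound):
    # Hypothetical body, not under investigation yet
    output = []
    for book in books_list:
        if book["rating"] == lower_bound:
            output.append(book)
    return output


doyle_books = find_by_author(books, "Doyle")
doyle_books_above_4 = find_by_ratings(doyle_books, 4)

print(f"{doyle_books_above_4 = }")
```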
You've interrogated the witnesses books_list and book as these witnesses were present on the crime scene when the line was executed. You're using the print() function as your forensic tool along with the f-string with an = at the end. This use of the f-string is ideal for debugging!
The output looks like this:
books_list = [{'author': ('Arthur Conan', 'Doyle'), 'title': 'A Study in Scarlet', 'published': 1887, 'rating': 4.14}, {'author': ('Arthur Conan', 'Doyle'), 'title': 'The Sign of Four', 'published': 1890, 'rating': 3.92}, {'author': ('Arthur Conan', 'Doyle'), 'title': 'The Hound of the Baskervilles', 'published': 1901, 'rating': 4.13}, {'author': ('Agatha', 'Christie'), 'title': 'Murder of the Orient Express (Hercule Poirot #4)', 'published': 1926, 'rating': 4.26}, {'author': ('Agatha', 'Christie'), 'title': 'Death on the Nile (Hercule Poirot #17)', 'published': 1937, 'rating': 4.12}]
book = {'author': ('Arthur Conan', 'Doyle'), 'title': 'A Study in Scarlet', 'published': 1887, 'rating': 4.14}
book = {'author': ('Arthur Conan', 'Doyle'), 'title': 'The Sign of Four', 'published': 1890, 'rating': 3.92}
book = {'author': ('Arthur Conan', 'Doyle'), 'title': 'The Hound of the Baskervilles', 'published': 1901, 'rating': 4.13}
book = {'author': ('Agatha', 'Christie'), 'title': 'Murder of the Orient Express (Hercule Poirot #4)', 'published': 1926, 'rating': 4.26}
book = {'author': ('Agatha', 'Christie'), 'title': 'Death on the Nile (Hercule Poirot #17)', 'published': 1937, 'rating': 4.12}
doyle_books_above_4 = []
Exclude A Suspect From A Scene
You saw earlier that you need to identify suspects as you go through your code step by step.
For each line of code you identify as a suspect, you interrogate the witnesses. You can exclude this line of code from your list of suspects if the witness account corroborates what the line is meant to do.
Let's look at the output from the last version of the code above, when you asked for witness statements from books_list and book in find_by_author().
The first output is what's printed by print(f"{books_list = }"). This includes all the books in the original list. It's what you expect from this variable. So far, this witness statement hasn't led you to suspect this line of code!
The remaining outputs come from print(f"{book = }"), which is in the for loop. You expected the loop to run five times as there are five items in the list books. You note that there are five lines output, and they each show one of the books in the list.
It seems that the for statement can be excluded as a suspect.
You can remove the two calls to print() you added.
Link A Suspect With A Scene
However, if the witness account doesn't exonerate the suspect, you'll need to leave that line on the list of suspects for the time being. You've linked the suspect with the scene of the crime.
Back to our code above. You can move your attention to the if statement in the definition of find_by_author(). You've already determined that the variable book contains what you expect. You can look for a clue to help you determine whether the if statement line is a suspect by checking when code in the if block is executed:
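A sketch of this check, with a single print() call placed inside the if block. The function bodies are still the assumed ones:

```python
books = [
    {"author": ("Arthur Conan", "Doyle"), "title": "A Study in Scarlet", "published": 1887, "rating": 4.14},
    {"author": ("Arthur Conan", "Doyle"), "title": "The Sign of Four", "published": 1890, "rating": 3.92},
    {"author": ("Arthur Conan", "Doyle"), "title": "The Hound of the Baskervilles", "published": 1901, "rating": 4.13},
    {"author": ("Agatha", "Christie"), "title": "Murder of the Orient Express (Hercule Poirot #4)", "published": 1926, "rating": 4.26},
    {"author": ("Agatha", "Christie"), "title": "Death on the Nile (Hercule Poirot #17)", "published": 1937, "rating": 4.12},
]


def find_by_author(books_list, last_name):
    for book in books_list:
        output = []
        if book["author"] == last_name:
            print(f"{book = }")  # Only runs if the condition is True
            output.append(book)
    return output


def find_by_ratings(books_list, lower_bound):
    # Hypothetical body, not under investigation yet
    output = []
    for book in books_list:
        if book["rating"] == lower_bound:
            output.append(book)
    return output


doyle_books = find_by_author(books, "Doyle")
doyle_books_above_4 = find_by_ratings(doyle_books, 4)

print(f"{doyle_books_above_4 = }")
```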
The output from this investigation is just the empty list printed by the final print() in the code:
doyle_books_above_4 = []
Therefore, the print(f"{book = }") call you've just added never happened. This puts suspicion on the line containing the if statement.
You need to call the forensics team:
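Here's a sketch of the forensics team's work: two print() calls just before the if statement, one for each object being compared. The function bodies are assumed, as before. Note the single quotes around the first f-string, which leave the double quotes free for the dictionary key:

```python
books = [
    {"author": ("Arthur Conan", "Doyle"), "title": "A Study in Scarlet", "published": 1887, "rating": 4.14},
    {"author": ("Arthur Conan", "Doyle"), "title": "The Sign of Four", "published": 1890, "rating": 3.92},
    {"author": ("Arthur Conan", "Doyle"), "title": "The Hound of the Baskervilles", "published": 1901, "rating": 4.13},
    {"author": ("Agatha", "Christie"), "title": "Murder of the Orient Express (Hercule Poirot #4)", "published": 1926, "rating": 4.26},
    {"author": ("Agatha", "Christie"), "title": "Death on the Nile (Hercule Poirot #17)", "published": 1937, "rating": 4.12},
]


def find_by_author(books_list, last_name):
    for book in books_list:
        output = []
        # Interrogate both sides of the comparison
        print(f'{book["author"] = }')
        print(f"{last_name = }")
        if book["author"] == last_name:
            output.append(book)
    return output


def find_by_ratings(books_list, lower_bound):
    # Hypothetical body, not under investigation yet
    output = []
    for book in books_list:
        if book["rating"] == lower_bound:
            output.append(book)
    return output


doyle_books = find_by_author(books, "Doyle")
doyle_books_above_4 = find_by_ratings(doyle_books, 4)

print(f"{doyle_books_above_4 = }")
```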
The witnesses that were at the crime scene when the if statement was there are book["author"] and last_name. These are the objects being compared using the equality operator == in the if statement. So, the forensics team decide to print these out just before the if statement. This is the forensics team's result:
book["author"] = ('Arthur Conan', 'Doyle')
last_name = 'Doyle'
book["author"] = ('Arthur Conan', 'Doyle')
last_name = 'Doyle'
book["author"] = ('Arthur Conan', 'Doyle')
last_name = 'Doyle'
book["author"] = ('Agatha', 'Christie')
last_name = 'Doyle'
book["author"] = ('Agatha', 'Christie')
last_name = 'Doyle'
doyle_books_above_4 = []
And there you are! You've found evidence that clearly links the if statement with the crime scene! The value of book["author"] is a tuple. The author's last name is the second item in this tuple but the if statement incorrectly tries to compare the whole tuple with the last name. A tuple is never equal to a string, so the condition is always False.
All you need to do is add an index in the if statement:
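A sketch of this fix, with the debugging print() calls removed. The index [1] picks out the last name from the author tuple. The rest of the code is the assumed version used so far:

```python
books = [
    {"author": ("Arthur Conan", "Doyle"), "title": "A Study in Scarlet", "published": 1887, "rating": 4.14},
    {"author": ("Arthur Conan", "Doyle"), "title": "The Sign of Four", "published": 1890, "rating": 3.92},
    {"author": ("Arthur Conan", "Doyle"), "title": "The Hound of the Baskervilles", "published": 1901, "rating": 4.13},
    {"author": ("Agatha", "Christie"), "title": "Murder of the Orient Express (Hercule Poirot #4)", "published": 1926, "rating": 4.26},
    {"author": ("Agatha", "Christie"), "title": "Death on the Nile (Hercule Poirot #17)", "published": 1937, "rating": 4.12},
]


def find_by_author(books_list, last_name):
    for book in books_list:
        output = []
        # Compare the last name only, not the whole tuple
        if book["author"][1] == last_name:
            output.append(book)
    return output


def find_by_ratings(books_list, lower_bound):
    # Hypothetical body, not under investigation yet
    output = []
    for book in books_list:
        if book["rating"] == lower_bound:
            output.append(book)
    return output


doyle_books = find_by_author(books, "Doyle")
doyle_books_above_4 = find_by_ratings(doyle_books, 4)

print(f"{doyle_books_above_4 = }")
```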
You've solved the mystery. But, are you sure? When you run the code now, once you remove the print() calls you used for debugging, the output is still the empty list.
Interpret The Scene In Relation To Movements Within The Scene And Sequences Of Events
Looking at a single suspect line of code in isolation is not sufficient. You need to follow how the data is being manipulated on that line and the lines before and after it.
This is the only way to investigate what has really happened during the crime.
Let's look at the whole for loop in the definition of find_by_author() again.
You've already interrogated book["author"] and last_name. You can even interrogate book["author"][1] just to be sure. If you do so, you'll see that its account seems to make sense.
The other witness on the scene is the list output. You can interrogate output at the end of each iteration of the for loop:
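A sketch of this interrogation. The print() call is the last line inside the for loop, so it runs once per book. The function bodies are the assumed ones, with the index fix already applied:

```python
books = [
    {"author": ("Arthur Conan", "Doyle"), "title": "A Study in Scarlet", "published": 1887, "rating": 4.14},
    {"author": ("Arthur Conan", "Doyle"), "title": "The Sign of Four", "published": 1890, "rating": 3.92},
    {"author": ("Arthur Conan", "Doyle"), "title": "The Hound of the Baskervilles", "published": 1901, "rating": 4.13},
    {"author": ("Agatha", "Christie"), "title": "Murder of the Orient Express (Hercule Poirot #4)", "published": 1926, "rating": 4.26},
    {"author": ("Agatha", "Christie"), "title": "Death on the Nile (Hercule Poirot #17)", "published": 1937, "rating": 4.12},
]


def find_by_author(books_list, last_name):
    for book in books_list:
        output = []
        if book["author"][1] == last_name:
            output.append(book)
        print(f"{output = }")  # Witness statement at the end of each iteration
    return output


def find_by_ratings(books_list, lower_bound):
    # Hypothetical body, not under investigation yet
    output = []
    for book in books_list:
        if book["rating"] == lower_bound:
            output.append(book)
    return output


doyle_books = find_by_author(books, "Doyle")
doyle_books_above_4 = find_by_ratings(doyle_books, 4)

print(f"{doyle_books_above_4 = }")
```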
This code now gives the following result:
output = [{'author': ('Arthur Conan', 'Doyle'), 'title': 'A Study in Scarlet', 'published': 1887, 'rating': 4.14}]
output = [{'author': ('Arthur Conan', 'Doyle'), 'title': 'The Sign of Four', 'published': 1890, 'rating': 3.92}]
output = [{'author': ('Arthur Conan', 'Doyle'), 'title': 'The Hound of the Baskervilles', 'published': 1901, 'rating': 4.13}]
output = []
output = []
doyle_books_above_4 = []
The first line is correct. You expect the first book in the list to be added to output since it's an Arthur Conan Doyle book. However, you expect it to still be there in the second line. "The Sign of Four" should have been added alongside "A Study in Scarlet". Instead, it seems to have replaced it.
You notice the same clues for the other results, too. In fact, the list is empty in the fourth and fifth outputs. (The final empty list is the output from the final print() at the end of the code.)
You interrogated output as a witness, but it's actually a suspect now! Therefore, you study its movements across the crime scene, sketching things on a whiteboard with lots of arrows, as they do in the detective films.
Gotcha! You finally see it. The code is re-initialising output every time inside the for loop. That's a serious crime. You move the line with output = [] outside the loop:
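A sketch of this change. The list output is now created once, before the for loop starts, and the print() call stays inside the loop for now. The body of find_by_ratings() remains an assumption:

```python
books = [
    {"author": ("Arthur Conan", "Doyle"), "title": "A Study in Scarlet", "published": 1887, "rating": 4.14},
    {"author": ("Arthur Conan", "Doyle"), "title": "The Sign of Four", "published": 1890, "rating": 3.92},
    {"author": ("Arthur Conan", "Doyle"), "title": "The Hound of the Baskervilles", "published": 1901, "rating": 4.13},
    {"author": ("Agatha", "Christie"), "title": "Murder of the Orient Express (Hercule Poirot #4)", "published": 1926, "rating": 4.26},
    {"author": ("Agatha", "Christie"), "title": "Death on the Nile (Hercule Poirot #17)", "published": 1937, "rating": 4.12},
]


def find_by_author(books_list, last_name):
    output = []  # Created once, before the loop
    for book in books_list:
        if book["author"][1] == last_name:
            output.append(book)
        print(f"{output = }")  # Still interrogating on each iteration
    return output


def find_by_ratings(books_list, lower_bound):
    # Hypothetical body, not under investigation yet
    output = []
    for book in books_list:
        if book["rating"] == lower_bound:
            output.append(book)
    return output


doyle_books = find_by_author(books, "Doyle")
doyle_books_above_4 = find_by_ratings(doyle_books, 4)

print(f"{doyle_books_above_4 = }")
```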
The code now gives the following. Note that you're still interrogating output at the end of each iteration of the for loop through a print() call:
output = [{'author': ('Arthur Conan', 'Doyle'), 'title': 'A Study in Scarlet', 'published': 1887, 'rating': 4.14}]
output = [{'author': ('Arthur Conan', 'Doyle'), 'title': 'A Study in Scarlet', 'published': 1887, 'rating': 4.14}, {'author': ('Arthur Conan', 'Doyle'), 'title': 'The Sign of Four', 'published': 1890, 'rating': 3.92}]
output = [{'author': ('Arthur Conan', 'Doyle'), 'title': 'A Study in Scarlet', 'published': 1887, 'rating': 4.14}, {'author': ('Arthur Conan', 'Doyle'), 'title': 'The Sign of Four', 'published': 1890, 'rating': 3.92}, {'author': ('Arthur Conan', 'Doyle'), 'title': 'The Hound of the Baskervilles', 'published': 1901, 'rating': 4.13}]
output = [{'author': ('Arthur Conan', 'Doyle'), 'title': 'A Study in Scarlet', 'published': 1887, 'rating': 4.14}, {'author': ('Arthur Conan', 'Doyle'), 'title': 'The Sign of Four', 'published': 1890, 'rating': 3.92}, {'author': ('Arthur Conan', 'Doyle'), 'title': 'The Hound of the Baskervilles', 'published': 1901, 'rating': 4.13}]
output = [{'author': ('Arthur Conan', 'Doyle'), 'title': 'A Study in Scarlet', 'published': 1887, 'rating': 4.14}, {'author': ('Arthur Conan', 'Doyle'), 'title': 'The Sign of Four', 'published': 1890, 'rating': 3.92}, {'author': ('Arthur Conan', 'Doyle'), 'title': 'The Hound of the Baskervilles', 'published': 1901, 'rating': 4.13}]
doyle_books_above_4 = []
You can now remove output from your list of suspects as the five print-outs you get are what you expect. The first three show the Arthur Conan Doyle titles, added one at a time. The last two do not add the Agatha Christie books to the list output.
This is what you expect find_by_author() to do!
Link Crime Scene To Crime Scene And Provide Intelligence On Crime Patterns
Criminals rarely commit just one crime. No wonder one of the guidelines from the College of Policing is to link crime scenes and look for crime patterns.
Don't assume there's only one bug in your code. And bugs may well be interconnected. You may think you've solved the mystery, only to find that there's another crime scene to investigate!
In the last output from the code above, you may have noticed that the final line still shows an empty list! Your detective work leads you to a different crime scene now. You need to explore the find_by_ratings() function definition.
But, by now, you're a senior detective and very experienced. So I'll let you finish off the investigation yourself!
End Of Investigation
Although I couldn't find the titles "Sherlock Holmes and the Python Bugs" or "Debugging Python on the Nile" in my local library, I think it's only a matter of time until we have a new genre of crime fiction novels based on debugging Python code. They'll make for gripping reading.
In the meantime, you can read Sherlock Holmes and Hercule Poirot books to learn how to debug Python code. Or maybe not…
Originally posted on 17 April 2022 at https://thepythoncodingbook.com/2022/04/17/debugging-python-code-is-like-detective-work-lets-investigate/
Normal service resumes next week with a new article, not one from the archives
Stop Stack
#47
The Python Coding Place launches in a few days' time, on the 15 January 2024. Join before that date to make the most of the pre-launch offer at thepythoncodingplace.com. The Place is the hub for all my resources: video courses, members' forum, live cohort courses, weekly videos, and more.
If you read my articles often, and perhaps my posts on social media, too, you've heard me talk about The Python Coding Place several times. But what you haven't heard me talk a lot about is Codetoday Unlimited, a platform for teenagers to learn to code in Python. The beginner levels are free so everyone can start their Python journey. If you have teenage daughters or sons, or a bit younger, too, or nephews and nieces, or neighbours' children, or any teenager you know, really, send them to Codetoday Unlimited so they can start learning Python or take their Python to the next level if they've already covered some of the basics.