Python Quirks? Party Tricks? Peculiarities Revealed…
Three "weird" Python behaviours that aren't weird at all
"That's weird! Surely it must be a flaw in Python." There are a few seemingly-odd behaviours in Python that elicit this response. It's a rite of passage for those learning Python to get from that sentiment to "That makes perfect sense, yes!"
I've selected three of these Python peculiarities, and I'll explore them in this article, taking you on the journey from "it's weird" to "it's clear".
These are the three oddities:
1. The Self-Replicating Trick
Let's create a list of lists to store members of several teams. You can start by creating a list containing empty lists:
And now, add a team member to the first of these teams:
Bob seems to have self-replicated into all the teams.
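Here's a minimal sketch of those two steps. The expression [[]] * 5 is the one this article dissects later on; the exact layout of the original listing is an assumption.

```python
# Build the outer list by multiplying a list that holds one empty list
teams = [[]] * 5
print(teams)  # [[], [], [], [], []]

# Append a name to the first team only
teams[0].append("Bob")
print(teams)  # [['Bob'], ['Bob'], ['Bob'], ['Bob'], ['Bob']]
```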
2. The Teleportation Trick
Let's create a function to add items to a shopping list. The function has a default value for shopping_list:
And now, create two shopping lists, one for groceries and one for books:
Bread has somehow teleported from the grocery list to the bookshop list.
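A sketch of what this could look like. The function name add_to_shopping_list() and the names groceries and books appear later in the article; the function body and the book title are assumptions made for illustration.

```python
# The mutable default argument is deliberate here: it is the source of the "trick"
def add_to_shopping_list(item, shopping_list=[]):
    shopping_list.append(item)
    return shopping_list

groceries = add_to_shopping_list("Bread")
print(groceries)  # ['Bread']

# Hypothetical book title, chosen only for this example
books = add_to_shopping_list("A Brief History of Time")
print(books)  # ['Bread', 'A Brief History of Time']
```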
3. The Vanishing Trick
Let's collect all the doubles of the numbers from 0 to 9:
And check whether 4 is in doubles:
It looks like it was there, but then it vanished.
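A minimal sketch of this trick, assuming doubles is built with a comprehension in parentheses, as described later in the article. The loop variable name and the two result names are assumptions.

```python
# A comprehension in parentheses, doubling the numbers 0 to 9
doubles = (number * 2 for number in range(10))

first_check = 4 in doubles
second_check = 4 in doubles

print(first_check)   # True
print(second_check)  # False
```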
These are not bugs or oversights by the Python core developers. In all of these three "tricks", the result you see is the expected result. Let's look at each in more detail and dig underneath the surface to explain what's happening.
1. The Self-Replicating Trick
Let's look at the example again:
You start with a list containing an empty list, and you multiply this by 5. This gives you a list with five lists. Or so it seems.
When you append a name to teams[0], you might expect the name "Bob" to be added to the first of the lists within teams. However, "Bob" is added to all the lists.
Let's rewind to the creation of the list of lists:
At first sight, it seems you have five empty lists. Let's print the identities of the lists within teams using the built-in id() function:
The same number is printed five times. And if you don't like scanning through those numbers to check they're the same, you can try the following:
The elements of a set are unique, so there's only one number in the set.
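Both checks can be sketched as follows. The actual numbers returned by id() differ from run to run, so none are shown here; the name unique_ids is an assumption.

```python
teams = [[]] * 5

# Print the identity of each inner list: the same number appears five times
for team in teams:
    print(id(team))

# Collect the identities in a set, which discards duplicates
unique_ids = {id(team) for team in teams}
print(len(unique_ids))  # 1
```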
If the five lists have the same identity, they must be the same object. There aren't five lists in teams. Instead, there's one list repeated five times.
A list contains references to other objects. It doesn't contain the objects themselves. When you multiply [[]] by five, you create five copies of the reference to the inner list. You can confirm that only one inner list is created by disassembling [[]] * 5 into bytecode:
Only two lists are created, the outer and inner lists. Multiplying by 5 doesn't create additional lists. It replicates the references to the list already present.
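One way to run this check yourself is with the standard library's dis module. This is a sketch rather than the article's original listing, and the exact bytecode varies between Python versions. The count at the end assumes CPython, where each list display compiles to a BUILD_LIST instruction.

```python
import dis

# Show the full disassembly of the expression
dis.dis("[[]] * 5")

# Count the list-building instructions instead of reading the listing by eye
code = compile("[[]] * 5", "<example>", "eval")
build_list_count = sum(
    1
    for instruction in dis.get_instructions(code)
    if instruction.opname == "BUILD_LIST"
)
print(build_list_count)  # 2 on CPython 3.11: the inner list and the outer list
```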
The alternative
If you want to create a list called teams that contains five empty lists, you can use a comprehension:
When you use a list comprehension, the creation of a new list is repeated five times. The identities of the five lists are different in this case. You can try adding "Bob" to the first team now:
On this occasion, "Bob" is only added to the first list. The name doesn't self-replicate into all the lists.
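A short sketch of the comprehension-based version, with the same identity check as before. The throwaway loop variable _ and the name unique_ids are assumptions.

```python
# The expression [] is evaluated afresh on each of the five iterations
teams = [[] for _ in range(5)]

unique_ids = {id(team) for team in teams}
print(len(unique_ids))  # 5

teams[0].append("Bob")
print(teams)  # [['Bob'], [], [], [], []]
```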
2. The Teleportation Trick
Let's recall the bizarre behaviour of the teleportation trick:
The intended behaviour of this code is the following:
Define a function called add_to_shopping_list() which takes two arguments, item and shopping_list.
The parameter shopping_list has an empty list as a default value.
The expectation is that a new default empty list is created each time you call the function without a second argument.
However, "Bread", the item you add the first time you call add_to_shopping_list(), persists when you call the function the second time. It seems "Bread" has been teleported to a new place!
Let's test the function without using the default argument to check whether this strange behaviour happens in this case, too:
There are no teleportation mysteries when you pass a list that already exists to the function!
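As a sketch of this test, here's the function called with lists that already exist. The names cakes and tools are borrowed from later in the article; the function body and the items are assumptions.

```python
def add_to_shopping_list(item, shopping_list=[]):
    shopping_list.append(item)
    return shopping_list

# Two separate lists created before the function is called
cakes = []
tools = []

# Hypothetical items, passed along with an explicit second argument
add_to_shopping_list("Chocolate Cake", cakes)
add_to_shopping_list("Hammer", tools)

print(cakes)  # ['Chocolate Cake']
print(tools)  # ['Hammer']
```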
So, let's go back to the function definition:
The list to be used as a default value is created when the function is defined. Therefore, each time you use the function without the second argument and the default value for shopping_list is used, the same list is used. This is the list that was created at the time the function was defined. And since lists are a mutable data type, a new item is added to the same list each time you rely on the default value.
You can confirm this by checking the identity of the list used in the function. You can compare the cases when you call the function with or without a second argument:
You add a line in the function definition that prints the identity of the shopping_list using id(). The default list is used since you don't use a second argument in the first two calls to add_to_shopping_list(). You can see that the same number is returned by id() in the first two function calls.
However, when you use the lists cakes and tools in the next two calls, the identity of shopping_list is a different number each time and different from the default list.
You test the function one final time without a second argument to confirm that the same list used in the first two calls is used in this case, too.
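The five calls described above could look like this. The variable names follow the article; the function body and the items are assumptions, and the numbers printed by id() change from run to run.

```python
def add_to_shopping_list(item, shopping_list=[]):
    print(id(shopping_list))  # identity of the list used in this call
    shopping_list.append(item)
    return shopping_list

# First two calls rely on the default list: the same id is printed twice
groceries = add_to_shopping_list("Bread")
books = add_to_shopping_list("A Brief History of Time")

# Next two calls pass existing lists: two different ids are printed
cakes = []
tools = []
add_to_shopping_list("Chocolate Cake", cakes)
add_to_shopping_list("Hammer", tools)

# Final call relies on the default again: the first id reappears
some_other_list = add_to_shopping_list("Pizza")
```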
You used the default list three times in this example. Therefore, you'll find three items within it:
In fact, groceries, books, and some_other_list aren't three separate lists. They're the same list, the one that was created when the function was defined:
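A sketch that checks both claims: the default list holds three items, and the three names refer to one object. The function body and the items are assumptions. The last line goes one step beyond the article and looks at the function's __defaults__ attribute, which is where Python keeps default values.

```python
def add_to_shopping_list(item, shopping_list=[]):
    shopping_list.append(item)
    return shopping_list

groceries = add_to_shopping_list("Bread")
books = add_to_shopping_list("A Brief History of Time")
some_other_list = add_to_shopping_list("Pizza")

print(groceries)  # ['Bread', 'A Brief History of Time', 'Pizza']
print(groceries is books is some_other_list)  # True

# The default list is stored on the function object itself
print(add_to_shopping_list.__defaults__[0] is groceries)  # True
```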
The alternative
This bug creeps in each time you use a mutable data type as a default value when you define a function. You should always avoid mutable data types as default values in functions.
You can use immutable types. Since these cannot change, you cannot add values to an immutable type each time you call a function.
If you want a parameter to have a mutable default value, you can use the following pattern:
The default value for shopping_list is now None. Therefore, no default list is created when the function is defined. Instead, each time you call the function with no second argument, the condition in the if statement is true, and a new empty list is created and assigned to shopping_list.
You can confirm that there are no more teleportation tricks in this case:
Since a new list is created each time you call the function add_to_shopping_list() in the example above, there is no conflict between groceries and books.
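The None pattern and the check described above can be sketched together. The test against None with "is" is the common idiom and is assumed here; the items are hypothetical.

```python
def add_to_shopping_list(item, shopping_list=None):
    # Create a fresh list on every call that omits the second argument
    if shopping_list is None:
        shopping_list = []
    shopping_list.append(item)
    return shopping_list

groceries = add_to_shopping_list("Bread")
books = add_to_shopping_list("A Brief History of Time")

print(groceries)  # ['Bread']
print(books)  # ['A Brief History of Time']
print(groceries is books)  # False
```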
3. The Vanishing Trick
Let's recall the bizarre behaviour of the final one of the three "tricks" in this article:
This seems like a contradictory result. How can 4 be both in and not in doubles? Let's confuse matters a bit further (we'll clarify things very soon!):
In this last example, another_doubles is created using a list comprehension. It's a list, and membership of a list is well-defined. The integer 4 is a member of the list another_doubles, and the expression 4 in another_doubles always evaluates to True.
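For comparison, a sketch of the list version. The name another_doubles comes from the text; the comprehension itself is assumed to mirror the one used for doubles.

```python
# Square brackets: this comprehension builds a list that stores all ten values
another_doubles = [number * 2 for number in range(10)]

first_check = 4 in another_doubles
second_check = 4 in another_doubles

print(first_check)   # True
print(second_check)  # True
```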
This is what you'd expect. So, why is the first example different? It's not a flaw in Python, no.
When you create doubles using a comprehension enclosed in parentheses (), you do not create a tuple. Instead, doubles is a generator:
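You can check the type directly. This sketch also compares against types.GeneratorType from the standard library, which is an addition to what the article describes.

```python
import types

doubles = (number * 2 for number in range(10))

print(type(doubles))  # <class 'generator'>
print(isinstance(doubles, tuple))  # False
print(isinstance(doubles, types.GeneratorType))  # True
```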
The generator object does not contain the numbers 0, 2, 4, 6, and so on. Instead, it will generate and return each number as and when needed. For example, calling next() will generate and return the first value. And each time you call next(), the next value in line is generated and returned:
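A brief sketch of three successive calls to next(). The names first, second, and third are assumptions.

```python
doubles = (number * 2 for number in range(10))

# Each call generates exactly one more value
first = next(doubles)
second = next(doubles)
third = next(doubles)

print(first, second, third)  # 0 2 4
```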
However, a generator is a disposable data structure. Once an item is generated and returned, the generator moves on to the next value and cannot go back. It can only go through the values once. Once you reach the end of the values available, any further call to next() raises a StopIteration exception.
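To see the one-way nature of a generator, you could exhaust it and then ask for more. This is a sketch; the names all_values, leftovers, and exhausted are assumptions.

```python
doubles = (number * 2 for number in range(10))

# Converting to a list consumes every value the generator can produce
all_values = list(doubles)
print(all_values)  # [0, 2, 4, 6, 8, 10, 12, 14, 16, 18]

# A second pass finds nothing left
leftovers = list(doubles)
print(leftovers)  # []

# Asking for one more value raises StopIteration
exhausted = False
try:
    next(doubles)
except StopIteration:
    exhausted = True
print(exhausted)  # True
```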
What's this got to do with our vanishing trick? Let's reset the generator:
Recall that none of the values have been generated yet, and doubles doesn't have any of the values within it. Strictly speaking, doubles doesn't contain any data.
So, what happens when you check whether 4 is "in" doubles?
To determine whether 4 is in doubles, the first value is generated and returned. The first value is 0, which is not 4. Therefore, the next value is generated, which is 2. As this is still not 4, the next value is generated. But this time, the value is 4. Therefore, the expression 4 in doubles can return an answer. Yes, 4 is in doubles, and you get True.
However, the first three values in doubles have now been used up. You cannot go back through a generator.
So what happens if you execute the expression 4 in doubles again?
To check whether 4 is in doubles, the next value in the generator is generated. But since the first three values have already been used up, the next value is 6, which is not 4. So, the next value is generated, and the next one, and so on until the end of the generator is reached. The number 4 never comes up again. This is why the second time you execute 4 in doubles you get False. The value 4 has "vanished" from doubles because you can only generate and use each value in a generator once.
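One way to see what's left after the first membership check is to dump the remaining values into a list. This is a variation on the article's example rather than a copy of it: here list() is what consumes the rest of the generator, not the second membership check. The names first_check, remaining, and second_check are assumptions.

```python
doubles = (number * 2 for number in range(10))

# The first check consumes 0, 2, and 4, then stops
first_check = 4 in doubles
print(first_check)  # True

# Everything that is still available starts at 6
remaining = list(doubles)
print(remaining)  # [6, 8, 10, 12, 14, 16, 18]

# The generator is now exhausted, so there is nothing left to compare against
second_check = 4 in doubles
print(second_check)  # False
```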
Let's reset the generator and experiment a bit more:
Let's see what's happening in these lines of code after you create the generator doubles:
The first time you execute 4 in doubles, values from the generator are generated until 4 is reached. The first expression returns True. The first three values have been used up.
Next, you execute 10 in doubles. The generator resumes from the next value, which is 6, since you stopped at 4 previously. Then 8 and 10 are generated. 10 is the number you're looking for. Therefore, this search ends here and returns True.
The third expression you execute is 4 in doubles again. The search starts again. But it starts from 12, which is the next number that's generated. Then it moves on to 14, 16, and so on until the end of the generator. The value 4 never shows up again, so this expression returns False. But now, the entire generator has been exhausted. There are no more values to generate.
The final expression is 14 in doubles. You haven't looked for 14 or any number larger until now. However, when searching for 4 for the second time, you exhausted the entire generator. Therefore, it's too late to look for 14 or any other number now. This expression returns False.
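The four checks described above can be sketched in one go. The list results is an assumption made to keep the example compact; its items are evaluated in order, from top to bottom.

```python
doubles = (number * 2 for number in range(10))

results = [
    4 in doubles,   # True: consumes 0, 2, 4
    10 in doubles,  # True: consumes 6, 8, 10
    4 in doubles,   # False: consumes 12, 14, 16, 18 without finding 4
    14 in doubles,  # False: the generator is already exhausted
]
print(results)  # [True, True, False, False]
```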
The same thing happens with iterators for the same reason. Here's an example using the iterator returned by the built-in function reversed():
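A sketch with reversed(). The list numbers and its contents are hypothetical, chosen only to illustrate the point.

```python
numbers = [1, 2, 3, 4, 5]
reversed_numbers = reversed(numbers)

# The iterator yields 5, 4, 3 and stops as soon as it finds 3
first_check = 3 in reversed_numbers
print(first_check)  # True

# Only 2 and 1 are left, so 3 is no longer found
second_check = 3 in reversed_numbers
print(second_check)  # False
```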
The alternative
When you're using generators or iterators, you'll need to be aware that you can only use each value once. In most cases, if you're concerned about membership—whether an item is contained in a data structure—you can avoid using generators and iterators.
For example, instead of creating a generator using a comprehension, you can create a tuple:
The tuple more_doubles contains the items. Therefore, you can safely check for membership using the in keyword.
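One way to build the tuple is to pass the comprehension to tuple(). The name more_doubles comes from the text; the rest is a sketch.

```python
# tuple() consumes the generator once and stores every value
more_doubles = tuple(number * 2 for number in range(10))

print(type(more_doubles))  # <class 'tuple'>
print(4 in more_doubles)  # True
print(4 in more_doubles)  # True
```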
Final Words
These are three pitfalls that often come up as examples of "strange" behaviour in Python. The results are undoubtedly unexpected at first sight. However, these examples also offer an opportunity to dig deeper underneath the surface. This "digging" gives us a better understanding of how Python works.
There are more examples. I may write about a few more in a future post.
Code in this article uses Python 3.11