What's In A Name?
Names in Python. Reference counts. Namespaces. And why there can only be one Stephen, sorry to all the others! (in this namespace, at least)
"What's in a name? That which we call a rose By any other name would smell as sweet."
—William Shakespeare, "Romeo and Juliet", Act II, Scene II
I hadn't quoted Shakespeare yet in The Python Coding Stack. I can tick that off now.
Does my name make me who I am? No, of course not. Most call me "Stephen", my children call me "Papà", and almost no one still calls me "Dr Gruppetta" (thankfully—gone are the days in academia!). But, whatever name they use, I'm still the same person.
I could see you glancing at the top to ensure you are reading The Python Coding Stack and not The Philosophy Corner Stack. So, let me get to the point. Here's the outline of this article:
Names Are Not Objects
Keeping Track Of All The Labels • The Reference Counter
Let's start again
More references
Removing Labels • Deleting Names
Removing the last reference • Collecting garbage
There Can Only Be One!
Namespaces
Why The Box Analogy Is "Wrong"
Remember The Weird Reference Counts For "Stephen"?
Final Words
Further Reading (if you're ready to go to the deepest Python dungeons)
Names Are Not Objects
What happens when you write this line in Python?
The answer to this question depends on how deep you want to go into the Python dungeons. I'll stay reasonably close to the surface in this article, but still a bit below ground!
Two things are happening when you execute this assignment statement:
1. A new object of type str is created. Its value is "Stephen".
2. The name first_name is created if it doesn't already exist, and it's made to refer to the object created in step 1.
The name is not the object. It's a label that refers to the object. I often use the box analogy to describe this process. The steps above can be adapted to fit the analogy:
1. A new object of type str is created. Its value is "Stephen". [no change from the previous version]
2. A box is labelled with the name first_name, and the object created in step 1 is placed in the box.
I'm sure you've seen boxes, and you know what they look like. But here's a visual representation anyway!
The box can now go on a shelf with the label clearly visible. Anytime you use the name first_name, your program will fetch the contents of the box. In this case, this will be the string object "Stephen". There are some "buts" in this argument. It's not always this simple. So, let's explore further.
Keeping Track Of All The Labels • The Reference Counter
Did you notice there's another sticker on the front of the box in the earlier diagram? It has the number 1 on it. This keeps track of how many labels refer to the same object. There's only the label first_name at the moment, which is why it shows 1.
So, let's add another line of code:
We dealt with the first line earlier. Let's focus on the second line. And let's start with what comes after the equals sign:
1. The program reads the name first_name.
2. This name refers to the string object with the value "Stephen".
3. A new name, me, is created, which also refers to the same object.
The second assignment does not create a copy of the object. There's only one object. But now, that object has two labels. It has two names you could use to refer to it.
There are two labels on the box, but the box still contains the same object. The label showing the number of references now shows 2. This is the object's reference count.
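One way to check this for yourself is with the built-in id() function, which returns a number that identifies an object for as long as it exists. The sketch below reuses the names first_name and me from above. In CPython, id() happens to return the object's memory address, but that's an implementation detail and not something the language promises:

```python
first_name = "Stephen"
me = first_name  # a second label for the same object, not a copy

# Both names lead to one and the same object,
# so they report the same identity
print(id(first_name) == id(me))  # True
```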
Warning: if you feel you've understood everything so far, get ready to be confused!
You can find an object's reference count in your program using sys.getrefcount(). Let's try this on the code so far:
The output when running Python 3.12 is:
4294967295
4294967295
Eh?! What's going on there? And that's not a random number. It's 2^32 - 1.
And if you're using Python 3.11 or earlier, the same code returns these results:
4
5
Better, but even here, those numbers "don't seem right".
I did warn you it's confusing. I'll get back to this later.
Let's start again
For now, let me use a different example. This example creates a list containing numbers:
The output shows the following reference counts:
2
3
Much, much better. Still a bit puzzling, but we're only off by one now!
The first assignment creates the list object and also creates the name my_numbers. This is the first reference.
On the second line, you call sys.getrefcount(my_numbers). But since you're passing the object as an argument to the function sys.getrefcount(), an additional temporary reference is created. Therefore, as the function is called, the object gets an extra reference. So, sys.getrefcount() returns a value that's 1 more than the actual reference count.
Since 2 - 1 = 1 (you knew that already, right?), we have only one real reference.
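Because the extra temporary reference is there on every call, the difference between two readings is more trustworthy than either raw number. Here's a minimal sketch of that idea, assuming CPython. The names before and after are mine and aren't part of the article's code:

```python
import sys

my_numbers = [7, 19, 51]
before = sys.getrefcount(my_numbers)  # includes the temporary reference

cool_numbers = my_numbers  # add a second name for the same list
after = sys.getrefcount(my_numbers)  # also includes the temporary reference

# The temporary reference cancels out when you subtract
print(after - before)  # 1
```

Comparing readings like this sidesteps the question of how many hidden references each call adds.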
The second call to sys.getrefcount(my_numbers) shows there are 2 references (3 - 1). The same object can be referred to using either my_numbers or cool_numbers. Let's stick with this example for now, so here's the visualisation:
Let's also confirm they're the same object using the is keyword:
Here's the output from this code:
2
3
True
The is keyword checks whether two objects are the same object and not whether they're equal. Let's confirm this:
This returns False since the two lists are separate objects. They are equal (you can verify this using ==), but they're not the same object. You can read more about equality and identity and the difference between == and is in this article from my pre-Substack era: Understanding The Difference Between is and == in Python: The £5 Note and a Trip to a Coffee Shop.
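To see equality and identity side by side, here's a short sketch. The names first_list and second_list are placeholders I've picked for this illustration:

```python
first_list = [7, 19, 51]
second_list = [7, 19, 51]  # same values, but a separate object

print(first_list == second_list)  # True: the values are equal
print(first_list is second_list)  # False: they're two different objects
```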
Note: many aspects I'm discussing in today's article are implementation details of CPython, the main Python implementation and the one you're most likely using. There's a difference between features that are inherent in the Python language in general and others that are a result of how they're implemented behind the scenes. I'll only discuss what happens in CPython since it's the most popular implementation.
More references
You create a list. You also create two variable names referring to the same list. The object has a reference count of 2. Let's add a bit more code:
We're not interested in boring_numbers because they're boring! I needed another list. That's why it's there. But let's look at the list of lists groups_of_numbers. This list contains references to two objects that already exist. Apologies for the wordy descriptions below, but they're required:
The first item in groups_of_numbers is the object that the name my_numbers refers to. Therefore, groups_of_numbers[0] refers to the same object as my_numbers.
The second item in groups_of_numbers is the object that boring_numbers refers to. But these are boring, so let's ignore them.
So, the original list you create at the beginning of the code now has three references:
my_numbers
cool_numbers
groups_of_numbers[0]
They're three different references to the same object:
This code prints the reference count of the original list again after creating groups_of_numbers and checks that all references refer to the same object using the is keyword. This is the output from this code:
2
3
4
True
Recall that you need to subtract 1 from the numbers shown to get the real reference count.
And let's add one more item in groups_of_numbers:
The first item in groups_of_numbers is still my_numbers, as in the previous example. You also add cool_numbers as the last item in the list of lists. Since cool_numbers and my_numbers refer to the same object, you create two more references to the original list when you define groups_of_numbers. Here's the output from this code:
2
3
5
True
There are 4 references (5 - 1) to the same object at the end of this script:
my_numbers
cool_numbers
groups_of_numbers[0]
groups_of_numbers[2]
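A practical consequence of all four references leading to one object is that a change made through any of them is visible through all the others. This short sketch builds on the code above. The append() call and the value 99 are my additions for illustration:

```python
my_numbers = [7, 19, 51]
cool_numbers = my_numbers
boring_numbers = [1, 2, 3]
groups_of_numbers = [my_numbers, boring_numbers, cool_numbers]

# Change the list using just one of its four references
groups_of_numbers[2].append(99)

# Every other reference sees the change since there's only one list
print(my_numbers)  # [7, 19, 51, 99]
print(cool_numbers)  # [7, 19, 51, 99]
print(groups_of_numbers[0])  # [7, 19, 51, 99]
```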
Removing Labels • Deleting Names
Let me return to an earlier version of the code before we added groups_of_numbers. I want to keep the code simple so we can focus on what really matters:
The list has two references. Let's explore the del keyword:
When you use del, you do not delete the object. You delete the name and, therefore, the reference to the object. You can try print(cool_numbers) after the statement with del in it and you'll get a NameError. This error shows you cool_numbers is not defined.
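If you'd rather see the NameError without crashing your script, you can catch it. This is a small sketch of that experiment. The try/except wrapper is my addition:

```python
my_numbers = [7, 19, 51]
cool_numbers = my_numbers

del cool_numbers  # removes the name, not the list

try:
    print(cool_numbers)
except NameError as error:
    print(f"NameError: {error}")

# The list is still there and reachable through the other name
print(my_numbers)  # [7, 19, 51]
```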
The output from the code above shows that the reference count has gone down, too:
2
3
2
We're back down to one reference.
Removing the last reference • Collecting garbage
Let's also remove the last reference:
You create the object and add two references to it. Then you delete both of them. This brings the object's reference count down to 0:
In our analogy, the sticker showing 0 on the box is a signal for the janitor to clean up the unneeded rubbish on the next cleaning round:
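Or, for a more technical way to phrase that, when an object's reference count reaches 0, Python reclaims the memory used for the object. In CPython, this happens straight away rather than on a later cleaning round. The separate garbage collector only needs to step in for objects that refer to each other in a cycle. Either way, the object no longer exists.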
But do you remember when we had the groups_of_numbers list? Let's bring it back:
When you delete the names cool_numbers and my_numbers, you remove two references to the object. But there are two more left since you can still fetch the object using either groups_of_numbers[0] or groups_of_numbers[2]. Even though there's no name that directly refers to the original list, the reference count is not 0 and the object will not be destroyed.
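You can watch this happen with a weak reference, which lets you check whether an object still exists without adding to its reference count. Built-in lists don't support weak references, so this sketch uses a small hypothetical class, Numbers, as a stand-in for the list. It also relies on CPython destroying an object as soon as its reference count reaches 0, which other implementations don't promise:

```python
import weakref


class Numbers:
    """Hypothetical stand-in for the list, which can't be weakly referenced."""


my_numbers = Numbers()
cool_numbers = my_numbers
boring_numbers = [1, 2, 3]
groups_of_numbers = [my_numbers, boring_numbers, cool_numbers]

# A weak reference doesn't add to the reference count
watcher = weakref.ref(my_numbers)

del cool_numbers
del my_numbers

# The list of lists still holds two references to the object
still_exists = watcher() is not None
print(still_exists)  # True

groups_of_numbers.clear()  # remove the last two references

# With no references left, CPython destroys the object
is_gone = watcher() is None
print(is_gone)  # True in CPython
```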
There Can Only Be One!
My name is Stephen. Many years ago, in my class at school, there was another Stephen. But somehow, we managed.
But in a Python program*, you can't have the same name used twice for different objects. (*This is not quite true. We'll talk about namespaces shortly.)
In the first line, you create the list object and the name my_numbers, which refers to the newly created object. But later, you reassign a new object to the same name. You can't have the same name used twice. So, the list [7, 19, 51] is removed from the box labelled my_numbers and the string "I like numbers" is placed in the box instead. The name my_numbers now refers to this string. Here's the output from the code above:
[7, 19, 51]
I like numbers
And since the original list no longer has any references, it will be garbage collected:
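The list is only garbage collected because my_numbers was its sole reference. If a second name refers to the list, reassigning my_numbers leaves that other name, and the list, untouched. Here's a small sketch, reusing cool_numbers from earlier sections as the assumed second name:

```python
my_numbers = [7, 19, 51]
cool_numbers = my_numbers  # second label for the same list

# Reassigning moves only the label my_numbers to a new object
my_numbers = "I like numbers"

print(my_numbers)  # I like numbers
print(cool_numbers)  # [7, 19, 51]
```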
Names can be used for any object, including a function. Here's an example:
This code outputs the following two lines:
[7, 19, 51]
<function my_numbers at 0x10111d080>
The name my_numbers referred to the list at first. But then you reassign the name to the function you define. The list lost its name!
And names are not only for objects you create yourself directly in your code. Here's an example that comes with a "don't do this!" warning:
The first call to the built-in print() function works as expected. However, you then reassign the name print to a different object: a string. Since a name can only be used once, the name print no longer refers to the much-loved built-in function. When you try to use print() as a function, you get an error:
Hello!
Traceback (most recent call last):
  File "...", line 7, in <module>
    print("Hello!")
TypeError: 'str' object is not callable
The TypeError says that a str object is not callable. You cannot use parentheses with a string. You can only do that with callable objects like functions and classes.
Or, for another "party trick", you can choose a new name for a built-in function:
This code displays the string "Hello!". Why? You create a new name, say_something, which refers to the same object that the name print refers to. That's the print() function! Therefore, say_something and print now both refer to the same function, and you can use either name. In this example, you did not replace the name print. Instead, you added a second label to the function.
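Here's a quick sketch to confirm that both labels are stuck to the same function. It also looks at the function's __name__ attribute, which records the name the function was defined with and doesn't change when you add another label:

```python
say_something = print

# Both names refer to the same function object
print(say_something is print)  # True

# The function itself only knows the name it was defined with
print(say_something.__name__)  # print
```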
And if you want to play a prank on a colleague:
You swap the names for the min() and max() functions so that the name min refers to the function that finds the maximum and the name max refers to the function that finds the minimum value. So, here's the output:
The minimum number is 10
The maximum number is 2
Not funny, I know!
Namespaces
You can only use a name once. But is that in all of Python, everywhere and anywhere? No.
Here's another example:
You create an object and the name numbers on the first line. The name numbers refers to this new object. You also create a new list inside the function definition and assign it to the name numbers. However, the final print() still shows that numbers is the original list:
The numbers at the beginning are:
[2, 5, 10, 3, 7]
The numbers at the end are:
[2, 5, 10, 3, 7]
A function has its own namespace. This means that when you create the name numbers inside the function definition, it does not conflict with the name numbers in the main program. They're in different namespaces. You can use the same name in different namespaces.
Let's also call the function before the final call to print():
And here's the output:
The numbers at the beginning are:
[2, 5, 10, 3, 7]
The numbers in the function are:
[100, 200, 300, 400]
The numbers at the end are:
[2, 5, 10, 3, 7]
This confirms that numbers in the function refers to a different object to numbers in the main program.
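If you'd like to peek at a function's namespace directly, the built-in locals() returns a dictionary of the names in the current local namespace. This sketch adapts do_something() to return that dictionary instead of printing. The return statement and the name names_in_function are my additions:

```python
numbers = [2, 5, 10, 3, 7]


def do_something():
    numbers = [100, 200, 300, 400]
    # Return a snapshot of the function's own namespace
    return locals()


names_in_function = do_something()
print(names_in_function)  # {'numbers': [100, 200, 300, 400]}
print(numbers)  # [2, 5, 10, 3, 7]
```

The function's namespace holds its own numbers, and the one in the main program is left alone.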
A word of warning: not all indented blocks have their own namespace. For example, the indented blocks in loops or conditional blocks share the same namespace as the main program:
The output shows that assigning to numbers in the for loop reassigns the name in the main program, so it no longer refers to the original list:
The numbers at the beginning are:
[2, 5, 10, 3, 7]
The numbers in the 'for' loop are:
[100, 200, 300, 400]
The numbers at the end are:
[100, 200, 300, 400]
And one last bit of trivia for this section: a list comprehension does have its own namespace, unlike a standard for loop:
Even though you use the name number in the list comprehension, it doesn't affect the name number in the global namespace:
The number at the beginning is:
999
The number at the end is:
999
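To put the two behaviours next to each other, here's a sketch that uses the same name as the loop variable in a for loop and then in a list comprehension. The names after_loop and doubled are mine:

```python
number = 999

# A for loop shares the surrounding namespace, so the loop
# variable overwrites the existing name
for number in [2, 5, 10]:
    pass
after_loop = number
print(after_loop)  # 10

number = 999

# The loop variable in a list comprehension stays inside the comprehension
doubled = [number * 2 for number in [2, 5, 10]]
print(number)  # 999
```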
Why The Box Analogy Is "Wrong"
I love analogies. But no analogy is perfect. And it's good to know when to quit when using analogies. We can also use the breaking points of an analogy as a way to understand the topic further.
So, let's see why the box analogy is wrong.
You may have noticed that I didn't show you the pretty pictures of the box in the earlier example with my_numbers, cool_numbers, and groups_of_numbers. Here's the code again:
You create a new list, [7, 19, 51], and place it in a box labelled my_numbers. So far, so good.
Next, you create the name cool_numbers, which refers to the same object. So, you place a second label on the same box. There's one object in one box, but the box has two labels. The analogy still holds well.
However, when you create the list groups_of_numbers, you create a new box which contains a new list. But this new list contains a reference to the object in the my_numbers and cool_numbers box. The original list can't be in two boxes. An object can't be in two places in the physical world. In this case, the physical interpretation of names and objects as boxes with labels breaks down.
There is a way out, although I often feel this is the point to let the analogy go! Here's the modification: you can place a strip of paper in the box with a reference to where the object is. So, you could have a set of shelves where you store the objects without any boxes, and the box could contain a paper saying the object is on shelf 2 in section A, for example.
And you can have several strips of paper in different boxes, all referring to the same location and, therefore, to the same object.
There's another misrepresentation in the images I showed you earlier. I showed the reference count on a sticker on the box. But the reference count belongs to the object, not to the box holding it.
In the code above, the box with the two labels my_numbers and cool_numbers accounts for two of the four references to the list. The box labelled groups_of_numbers, which contains a list of its own, accounts for another two references to the original list [7, 19, 51]. This original list is referenced by groups_of_numbers[0] and groups_of_numbers[2]. The object [7, 19, 51] has a reference count of 4.
Remember The Weird Reference Counts For "Stephen"?
I'll finish this article where I started, with the weird behaviour we saw when the object was the string "Stephen". There were no issues when using the list as an example. So what's the difference?
I promised I wouldn't go deep into the dungeons on how Python works. So here's a short version. CPython, the main Python implementation, treats immutable objects differently. Since these can't change, it's fine to reuse the same object in different places. For example, consider the following code where you have two variables referring to the same immutable object:
This returns True. There's no reason to create two different objects for the integer 20. Python simply has one object and re-uses it whenever it needs it. Although score and age are different variables, they share the same object. There are no dangers in doing this since integers are immutable. If you increment the score, for example, the name score will refer to a new object, but it doesn't alter the object with value 20.
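Here's that increment as a short sketch, so you can see that age is unaffected when score moves on to a different object:

```python
score = 20
age = 20

score += 1  # the label score moves to the object with value 21

print(score)  # 21
print(age)  # 20: the object with value 20 hasn't changed
```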
And here's a bit more on this topic. The code above returns True whether you run it in a script or in an interactive console or REPL. However, the following code may not behave in the same way. Let's start with a script:
This returns True.
Now, try this in Python's default REPL:
The same code now returns False.
And, to confuse matters further, copy both lines creating the variables from the script and paste them in one go in the REPL:
You can see the continuation line starts with ... when you do this. And we're back to True.
This behaviour can vary, and you shouldn't rely on it in your code. These are implementation details. But here's what's happening, in simple terms (also because I don't understand the complex terms myself!)
First, let's start with when you used 20. CPython treats the numbers from -5 to 256 differently from the rest. These are pre-allocated in memory. They're present in every program. Therefore, when you write score = 20, the integer 20 already exists in memory.
This is why score is age returns True in all situations when the integer is 20.
However, 2000 is outside the range of numbers that already exist. So, the object is created when it's needed. However, CPython still tries to be efficient and reuse the same objects when possible. This happens in a script and when you define the two variables in one go in the REPL. But it doesn't happen when you define score and age on separate lines in the REPL.
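One way to explore the pre-allocated range without the script-versus-REPL complication is to build the integers at runtime, so there are no identical literals for CPython to share when it compiles the code. The sketch below does that by calling int() on strings, which is my choice for illustration. What the two identity checks print is a CPython implementation detail, so treat this as an experiment and not as behaviour to rely on:

```python
# Build the integers at runtime so no literals can be shared
small_score = int("20")
small_age = int("20")
large_score = int("2000")
large_age = int("2000")

# Integers from -5 to 256 are pre-allocated in CPython,
# so you can expect these two names to share one object
print(small_score is small_age)

# Larger integers are created when they're needed,
# so these two names may well refer to separate objects
print(large_score is large_age)

# Equality doesn't depend on any of this
print(small_score == small_age, large_score == large_age)  # True True
```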
Once again, I'll remind you that these details can vary, and you shouldn't rely on them in your code. But it's a useful exercise to help us peer underneath the surface and understand how objects and their names are treated in Python code.
Back to my first example, I used the string "Stephen". Strings are immutable. Therefore, CPython will try to optimise when possible. This means that there may be other references elsewhere to the same object. This is not a problem when the object is immutable, but it will never happen for mutable objects.
And do you recall how Python 3.12 gave a different result to Python 3.11? The reference count returned was 2^32 - 1. Python 3.12 introduced immortal objects. These are objects whose reference count never changes and is set to the largest number possible since no object will ever be able to reach such a high reference count in a real program. I won't dwell further, but I'll add some links below for those who want to read more.
A reminder that many of the points made in this article refer to CPython, the main Python implementation.
Final Words
When I first started planning this article, I assumed it would be a short article. I was keen not to go deep into Python internals. And I didn't! But even without going into too much detail, this article still went over 4,000 words. That's not short. There's so much more to say just about names and how they refer to objects.
But, to summarise the article, names and objects are not the same thing.
Further Reading (if you're ready to go to the deepest Python dungeons)
If you want to climb down further into Python's dungeons, my fellow author wrote some deep dives into these topics:
Deep into reference counting
Immortalisation in Python 3.12
Code in this article uses Python 3.12
Stop Stack
#51
Appendix: Code Blocks
Code Block #1
first_name = "Stephen"
Code Block #2
first_name = "Stephen"
me = first_name
Code Block #3
import sys
first_name = "Stephen"
print(sys.getrefcount(first_name))
me = first_name
print(sys.getrefcount(first_name))
Code Block #4
import sys
my_numbers = [7, 19, 51]
print(sys.getrefcount(my_numbers))
cool_numbers = my_numbers
print(sys.getrefcount(my_numbers))
Code Block #5
import sys
my_numbers = [7, 19, 51]
print(sys.getrefcount(my_numbers))
cool_numbers = my_numbers
print(sys.getrefcount(my_numbers))
print(my_numbers is cool_numbers)
Code Block #6
print([7, 19, 51] is [7, 19, 51])
Code Block #7
import sys
my_numbers = [7, 19, 51]
print(sys.getrefcount(my_numbers))
cool_numbers = my_numbers
print(sys.getrefcount(my_numbers))
boring_numbers = [1, 2, 3]
groups_of_numbers = [my_numbers, boring_numbers]
print(sys.getrefcount(my_numbers))
Code Block #8
import sys
my_numbers = [7, 19, 51]
print(sys.getrefcount(my_numbers))
cool_numbers = my_numbers
print(sys.getrefcount(my_numbers))
boring_numbers = [1, 2, 3]
groups_of_numbers = [my_numbers, boring_numbers]
print(sys.getrefcount(my_numbers))
print(
    my_numbers
    is cool_numbers
    is groups_of_numbers[0]
)
Code Block #9
import sys
my_numbers = [7, 19, 51]
print(sys.getrefcount(my_numbers))
cool_numbers = my_numbers
print(sys.getrefcount(my_numbers))
boring_numbers = [1, 2, 3]
groups_of_numbers = [my_numbers, boring_numbers, cool_numbers]
print(sys.getrefcount(my_numbers))
print(
    my_numbers
    is cool_numbers
    is groups_of_numbers[0]
    is groups_of_numbers[2]
)
Code Block #10
import sys
my_numbers = [7, 19, 51]
print(sys.getrefcount(my_numbers))
cool_numbers = my_numbers
print(sys.getrefcount(my_numbers))
Code Block #11
import sys
my_numbers = [7, 19, 51]
print(sys.getrefcount(my_numbers))
cool_numbers = my_numbers
print(sys.getrefcount(my_numbers))
del cool_numbers
print(sys.getrefcount(my_numbers))
Code Block #12
import sys
my_numbers = [7, 19, 51]
print(sys.getrefcount(my_numbers))
cool_numbers = my_numbers
print(sys.getrefcount(my_numbers))
del cool_numbers
del my_numbers
Code Block #13
import sys
my_numbers = [7, 19, 51]
print(sys.getrefcount(my_numbers))
cool_numbers = my_numbers
print(sys.getrefcount(my_numbers))
boring_numbers = [1, 2, 3]
groups_of_numbers = [my_numbers, boring_numbers, cool_numbers]
del cool_numbers
del my_numbers
Code Block #14
my_numbers = [7, 19, 51]
print(my_numbers)
my_numbers = "I like numbers"
print(my_numbers)
Code Block #15
my_numbers = [7, 19, 51]
print(my_numbers)
def my_numbers():
    pass
print(my_numbers)
Code Block #16
# Please don't do this!!
print("Hello!")
print = "I'm breaking the print function!"
print("Hello!")
Code Block #17
# Best not to do this,
# unless you have a good reason to!
say_something = print
say_something("Hello!")
Code Block #18
# Please don't do this, either!!
min, max = max, min
numbers = [2, 5, 10, 3, 7]
print(f"The minimum number is {min(numbers)}")
print(f"The maximum number is {max(numbers)}")
Code Block #19
numbers = [2, 5, 10, 3, 7]
print(f"The numbers at the beginning are:\n{numbers}")
def do_something():
    numbers = [100, 200, 300, 400]
    print(f"The numbers in the function are:\n{numbers}")
print(f"The numbers at the end are:\n{numbers}")
Code Block #20
numbers = [2, 5, 10, 3, 7]
print(f"The numbers at the beginning are:\n{numbers}")
def do_something():
    numbers = [100, 200, 300, 400]
    print(f"The numbers in the function are:\n{numbers}")
do_something()
print(f"The numbers at the end are:\n{numbers}")
Code Block #21
numbers = [2, 5, 10, 3, 7]
print(f"The numbers at the beginning are:\n{numbers}")
# Repeat just once, which is pointless
# but it demonstrates the point
for _ in range(1):
    numbers = [100, 200, 300, 400]
    print(f"The numbers in the 'for' loop are:\n{numbers}")
print(f"The numbers at the end are:\n{numbers}")
Code Block #22
number = 999
print(f"The number at the beginning is:\n{number}")
the_list_comp = [number for number in [2, 5, 10, 3, 7]]
print(f"The number at the end is:\n{number}")
Code Block #23
import sys
my_numbers = [7, 19, 51]
print(sys.getrefcount(my_numbers))
cool_numbers = my_numbers
print(sys.getrefcount(my_numbers))
boring_numbers = [1, 2, 3]
groups_of_numbers = [my_numbers, boring_numbers, cool_numbers]
print(sys.getrefcount(my_numbers))
Code Block #24
score = 20
age = 20
print(score is age)
Code Block #25
# This is a script
score = 2000
age = 2000
print(score is age)
# True
Code Block #26
# This is Python's default REPL
score = 2000
age = 2000
score is age
# False
Code Block #27
score = 2000
age = 2000
score is age
# True