Collecting Things • Python's Collections (Data Structure Categories #5)
Collections • Part 5 of the Data Structure Categories Series
Picture a collection of artwork in a museum. The museum collection contains paintings. It also has a size: the number of items in the museum. And when you visit the museum, you can plan a route that takes you from one painting to the next, going past all the items one at a time.
Did you spot three definitions you encountered in previous articles in this Data Structure Categories series?
The collection contains items, has a size, and you can go through each element one by one. In Python-speak, it's a container, a sized object, and an iterable.
The Data Structure Categories Series
We're on the fifth of seven articles in this series. You can read the previous ones by following the links in this overview:
What's a Collection?
It's easy to get collection and container confused. And if you also throw sequence and iterable into the mix, things can get murkier. There's lots of overlap between the different terms. But they're all different.
This is a good time to visualise the hierarchy of these categories again—you've seen this earlier in this series:
Iterables, containers, and sized objects sit at the top. Each of these categories defines a specific property. Containers contain objects. Iterables are objects that can return their elements one at a time. And sized objects have a length.
A collection is an object that has all three of these properties. A collection is a sized, iterable container.
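Here's a minimal sketch of what "sized, iterable container" means in practice, using the museum from the start of this article. The class name MuseumCollection and the painting titles are made up for illustration. The class only defines the three special methods, and that's enough for the in keyword, len(), and a for loop to work:

```python
class MuseumCollection:
    """Hypothetical class, used only to illustrate the three properties."""

    def __init__(self, paintings):
        self._paintings = list(paintings)

    def __contains__(self, item):
        # Container: supports the `in` keyword
        return item in self._paintings

    def __len__(self):
        # Sized: supports len()
        return len(self._paintings)

    def __iter__(self):
        # Iterable: supports for loops and comprehensions
        return iter(self._paintings)


museum = MuseumCollection(["Sunflowers", "Guernica", "The Starry Night"])

has_guernica = "Guernica" in museum
number_of_paintings = len(museum)
route = [painting for painting in museum]

print(has_guernica)  # True
print(number_of_paintings)  # 3
print(route)  # ['Sunflowers', 'Guernica', 'The Starry Night']
```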
Let's look at some data types to determine whether they're collections. Let's start with a list:
All three special methods, __contains__, __len__, and __iter__, exist for a list object. This makes it a collection.
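One way to run this check yourself is sketched here. The list name team_members is used later in this article, but the names in the list are assumptions. The sketch uses hasattr() to look for each special method rather than accessing the methods directly:

```python
# The values in this list are made up for illustration
team_members = ["Sarah", "Ishaan", "Kate"]

list_checks = {
    name: hasattr(team_members, name)
    for name in ("__contains__", "__len__", "__iter__")
}

print(list_checks)
# {'__contains__': True, '__len__': True, '__iter__': True}
```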
Let's look at another data type, the zip object:
You create a zip object called team_points. It has an __iter__ special method, which means it's iterable. But the zip object doesn't have __contains__ or __len__. The error messages confirm this. Therefore, a zip object is not a collection.
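A sketch of the same check for a zip object follows. The data passed to zip() is an assumption. This version uses hasattr() instead of accessing the special methods directly, so you see False rather than an AttributeError. It also calls len() to show that a zip object has no length:

```python
# Hypothetical data, made up for this sketch
team_members = ["Sarah", "Ishaan", "Kate"]
points = [10, 7, 12]
team_points = zip(team_members, points)

zip_checks = {
    name: hasattr(team_points, name)
    for name in ("__contains__", "__len__", "__iter__")
}

print(zip_checks)
# {'__contains__': False, '__len__': False, '__iter__': True}

# A zip object isn't sized, so len() raises a TypeError
try:
    len(team_points)
except TypeError:
    len_failed = True
else:
    len_failed = False

print(len_failed)  # True
```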
Using the Abstract Base Classes
Let's look at another way to determine whether an object is a collection. In fact, this will allow you to check for any of the categories we've been discussing in this series.
You'll use the Abstract Base Classes, or ABCs for short, that you can find in collections.abc, a submodule of the collections module. Note that you can find ABCs for all data type categories there, not just for those that meet the definition of a collection:
You use the built-in function isinstance() to check whether the objects team_members and team_points are collections. You check whether they're instances of the abstract base class collections.abc.Collection.
The results confirm the earlier observation. Lists are collections, but zip objects aren't.
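The check described above can be sketched like this. The contents of team_members and team_points are assumptions, but isinstance() and collections.abc.Collection are used as described:

```python
import collections.abc

# Hypothetical data, made up for this sketch
team_members = ["Sarah", "Ishaan", "Kate"]
team_points = zip(team_members, [10, 7, 12])

list_is_collection = isinstance(team_members, collections.abc.Collection)
zip_is_collection = isinstance(team_points, collections.abc.Collection)

print(list_is_collection)  # True
print(zip_is_collection)  # False
```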
Let's explore a few more data types:
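For instance, here's a sketch that runs the same isinstance() check on a handful of built-in types. The choice of objects is mine, not a definitive list:

```python
from collections.abc import Collection

# A few example objects, chosen for illustration
examples = {
    "tuple": (1, 2),
    "str": "hello",
    "dict": {"a": 1},
    "set": {1, 2},
    "range": range(5),
    "generator": (n for n in range(5)),
    "map": map(str, [1, 2]),
    "int": 5,
}

results = {
    label: isinstance(obj, Collection)
    for label, obj in examples.items()
}

for label, is_collection in results.items():
    print(f"{label}: {is_collection}")
# tuple: True
# str: True
# dict: True
# set: True
# range: True
# generator: False
# map: False
# int: False
```

Generators and map objects are iterable, but they have no length and no __contains__ special method, so they're not collections.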
You can also explore other categories of data types. Let's look at a few examples:
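A sketch of a few such checks, using other ABCs from collections.abc. The objects and their values are assumptions:

```python
from collections.abc import Container, Iterable, Mapping, Sequence, Sized

# Hypothetical data, made up for this sketch
team_points = zip(["Sarah", "Ishaan", "Kate"], [10, 7, 12])
scores = {"Sarah": 10, "Ishaan": 7}

# A zip object is iterable, but it's neither sized nor a container
print(isinstance(team_points, Iterable))  # True
print(isinstance(team_points, Sized))  # False
print(isinstance(team_points, Container))  # False

# A dictionary is a mapping, not a sequence
print(isinstance(scores, Mapping))  # True
print(isinstance(scores, Sequence))  # False

# A string is a sequence
print(isinstance("hello", Sequence))  # True
```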
You can explore other data types in the same way. There are abstract base classes for all the categories of data types we've discussed in this series.
The Next Rung Down
A collection is an object which is a sized, iterable container. However, most collections have additional properties. You already read about sequences and mappings earlier in this series. All sequences and mappings are collections. A set is another Python basic data type that's a collection.
Let's explore this hierarchy using a list. You can start by considering the "bottom" level of this hierarchy, which you can see in the image shown earlier in this article: a list is a sequence. You can use isinstance() and the abstract base class Sequence to confirm this:
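A minimal sketch of this check, assuming a list called team_members with made-up values:

```python
from collections.abc import Sequence

# Hypothetical data, made up for this sketch
team_members = ["Sarah", "Ishaan", "Kate"]

list_is_sequence = isinstance(team_members, Sequence)
print(list_is_sequence)  # True
```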
All sequences are collections:
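One way to see this is with the built-in issubclass(), which checks the relationship between the two abstract base classes directly. This is a sketch, and the list contents are assumptions:

```python
from collections.abc import Collection, Sequence

# Hypothetical data, made up for this sketch
team_members = ["Sarah", "Ishaan", "Kate"]

sequence_is_collection = issubclass(Sequence, Collection)
list_is_collection = isinstance(team_members, Collection)

print(sequence_is_collection)  # True
print(list_is_collection)  # True
```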
And all collections are sized, iterable containers:
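The same idea can be sketched for the top level of the hierarchy, again using issubclass() on the ABCs and isinstance() on a list with made-up values:

```python
from collections.abc import Collection, Container, Iterable, Sized

# Hypothetical data, made up for this sketch
team_members = ["Sarah", "Ishaan", "Kate"]

print(issubclass(Collection, Sized))  # True
print(issubclass(Collection, Iterable))  # True
print(issubclass(Collection, Container))  # True

list_has_all_three = all(
    isinstance(team_members, abc)
    for abc in (Sized, Iterable, Container)
)
print(list_has_all_three)  # True
```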
Etymology Corner
The term “collection” comes from the Latin colligere, which means “gather together”. The Latin prefix con- means “together”, and the verb legere is “to gather”.
Next in the series: iterator
Code in this article uses Python 3.11