Sequences in Python (Data Structure Categories #2)
Sequences are different from iterables • Part 2 of the Data Structure Categories Series
I have an admission to make.
I've used the terms iterable and sequence interchangeably for longer than I wish to admit. You can get away with this in the early days of learning to code in Python. They're quite similar…
…until you dig deeper beneath the surface, which is what we'll do in this article.
The Data Structure Categories Series
This is the second of seven articles in this series. You can read the first one about iterables if you missed it. Here's the overview of the series:
What's a Sequence? The Short Version
There's a reason why it's easy to confuse iterables and sequences. All sequences are iterables. We'll talk more about this later. And you're likely to see common data structures such as lists and tuples referred to as sequences sometimes and as iterables other times.
Let's start with the headline difference between the two terms:
A Python sequence is an iterable that you can index using an integer.
This means you can use an integer inside square brackets to get an item from a sequence, such as some_sequence[0]. You can also use slices within the square brackets.
In the first article in the series, we discussed how an iterable is an object that can return its elements one at a time. With a sequence, we're going further. You can fetch an item based on its position in the sequence.
Let's look at some examples of sequences:
Lists, strings, and tuples are among the most common sequences.
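Here's a minimal sketch of indexing and slicing these three types. The variable names and values are made up for illustration:

```python
# Hypothetical example data, chosen only for illustration
numbers = [2, 4, 6, 8]          # list
name = "Python"                 # string
point = (10, 20, 30)            # tuple

# An integer in square brackets fetches an item by position
first_number = numbers[0]       # 2
last_letter = name[-1]          # "n"
second_coordinate = point[1]    # 20

# A slice in square brackets returns a new object of the same type
middle_numbers = numbers[1:3]   # [4, 6]
first_three = name[:3]          # "Pyt"

print(first_number, last_letter, second_coordinate)
print(middle_numbers, first_three)
```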
There's another requirement for an object to be a sequence. It needs to have a length:
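As a quick sketch, using made-up values, the built-in len() works on each of the sequences mentioned so far:

```python
# Hypothetical example data, chosen only for illustration
numbers = [2, 4, 6, 8]
name = "Python"
point = (10, 20, 30)

# len() returns the number of items in each sequence
lengths = [len(numbers), len(name), len(point)]
print(lengths)  # [4, 6, 3]
```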
You may think this is obvious and that every data structure must have a length. However, later in this series, we'll look at iterables that don't have a length.
Some data types that are not sequences
All sequences are iterables. But not all iterables are sequences.
Let's take a dictionary, for example. In the first article in the series, we determined that a dictionary is an iterable. You could use an integer in the square brackets to fetch an item, but only if that integer is one of the dictionary's keys, and you can also use non-integer data types as keys. The value in the square brackets is a key, not a position. To put it another way, you cannot fetch the second item in a dictionary by using my_dictionary[1].
Therefore, a dictionary is an iterable but not a sequence.
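Here's a short sketch of this behaviour. The dictionaries and their contents are hypothetical:

```python
# Hypothetical dictionary used only for illustration
points = {"Stephen": 10, "Kate": 12, "Ishaan": 7}

# A dictionary is iterable: iterating over it yields its keys
keys = list(points)
print(keys)

# Square brackets look up a key, not a position,
# so 1 is treated as a missing key here
lookup_failed = False
try:
    points[1]
except KeyError:
    lookup_failed = True
print("KeyError raised:", lookup_failed)

# An integer only works when it happens to be one of the keys
numbered = {1: "one", 5: "five"}
value = numbered[1]  # "one", because 1 is a key, not because of its position
print(value)
```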
Let's look at some other data types that are not sequences. Let's start with sets:
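A minimal sketch of such a check, assuming a hypothetical set named some_numbers, could look like this:

```python
# Hypothetical set used only for illustration
some_numbers = {3, 5, 7}

# Check 1: use the set in a for loop
collected = []
for number in some_numbers:
    collected.append(number)

# Check 2: pass the set to iter(), which succeeds and returns an iterator
set_iterator = iter(some_numbers)

# Sets are unordered, so sort before displaying
print(sorted(collected))
print(type(set_iterator).__name__)
```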
We've created a set and checked that it's iterable. We've checked using two techniques for good measure: using the set in a for loop and passing it to iter(). These are not really different checks, since when you use an object in a for loop, Python calls iter() on it to get an iterator.
Therefore, sets meet one of the criteria for being a sequence. How about the "length test"?
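The "length test" can be sketched with the built-in len(), again using a hypothetical set:

```python
# Hypothetical set used only for illustration
some_numbers = {3, 5, 7}

set_length = len(some_numbers)
print(set_length)  # 3
```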
A set has a length. Therefore, it passes the "length test", too. But there's one more test it needs to pass:
A set cannot be indexed with an integer. The TypeError tells us that a set is not subscriptable: it cannot be indexed!
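Here's a small sketch that reproduces this error with a hypothetical set and catches it so the script keeps running:

```python
# Hypothetical set used only for illustration
some_numbers = {3, 5, 7}

try:
    some_numbers[0]
except TypeError as error:
    # Keep the message so it can be inspected after the except block
    message = str(error)
    print("TypeError:", message)
```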
Let's explore another data type:
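As a sketch, assuming two made-up lists named names and ages, you can create a zip object and try both tests on it:

```python
# Hypothetical lists used only for illustration
names = ["Stephen", "Kate", "Ishaan"]
ages = [25, 31, 28]

pairs = zip(names, ages)

failures = []

# The "length" test
try:
    len(pairs)
except TypeError as error:
    failures.append(str(error))

# The "indexed with an integer" test
try:
    pairs[0]
except TypeError as error:
    failures.append(str(error))

for failure in failures:
    print("TypeError:", failure)
```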
You create a zip object using zip(). This object fails both the "length" test and the "indexed with an integer" test. Therefore, zip objects are not sequences. However, they are iterables:
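A minimal sketch of iterating over a zip object, with hypothetical data:

```python
# Hypothetical lists used only for illustration
names = ["Stephen", "Kate", "Ishaan"]
ages = [25, 31, 28]

pairs = zip(names, ages)

# A for loop works, so the zip object is iterable
collected = []
for pair in pairs:
    collected.append(pair)

print(collected)  # [('Stephen', 25), ('Kate', 31), ('Ishaan', 28)]
```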
So, the zip object is another example of an iterable that's not a sequence. We'll look at what category zip objects belong to later in this series.
What's a Sequence? The More Detailed Version
I'll finish this article by bringing everything together to see what makes an object a sequence.
In the first article in the series, we discussed how for a class to create iterables, it must have at least one of __iter__() or __getitem__() defined. This also applies to sequences, since a sequence is also an iterable.
However, a sequence can be indexed, and it must be indexed with an integer or a slice. The __getitem__() special method makes the instances of a class indexable. Therefore, a class needs to define this method to be a sequence. And the implementation of __getitem__() should ensure that it accepts only integers or slices.
Finally, a sequence needs to have a length. You can use the __len__() special method to define the length of an object.
Let’s put all of this together. The minimum requirement to make an object a sequence is to define the following special methods for the class:
__getitem__(), which makes the object indexable. It should only take an integer argument (or a slice)
__len__(), which defines the length of the object
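Here's a minimal sketch of a class that meets these two requirements. The Team class and its member names are hypothetical and only illustrate the idea. It's one possible way to write such a class, not the only one:

```python
class Team:
    """A hypothetical minimal sequence that wraps a list of names."""

    def __init__(self, members):
        self._members = list(members)

    def __getitem__(self, index):
        # Accept only integers or slices
        if not isinstance(index, (int, slice)):
            raise TypeError(
                "indices must be integers or slices, "
                f"not {type(index).__name__}"
            )
        # Delegate to the underlying list (a slice returns a plain list here)
        return self._members[index]

    def __len__(self):
        return len(self._members)


team = Team(["Stephen", "Kate", "Ishaan"])
print(team[0])    # Stephen
print(team[1:])   # ['Kate', 'Ishaan']
print(len(team))  # 3
```

In this sketch, a slice returns a plain list rather than another Team. That's a simplification to keep the example short.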
The __getitem__() special method also makes the object iterable. However, defining __iter__() is preferable to make an object iterable. A sequence should ideally also have the __iter__() method defined.
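The sketch below illustrates both routes with a hypothetical Countdown class. The first version defines only __getitem__(), and Python iterates by calling it with 0, 1, 2, and so on until an IndexError is raised. The second version adds an explicit __iter__():

```python
class Countdown:
    """Hypothetical class that defines __getitem__() but not __iter__()."""

    def __init__(self, start):
        self.start = start

    def __getitem__(self, index):
        if not isinstance(index, int):
            raise TypeError("index must be an integer")
        if not 0 <= index <= self.start:
            # IndexError signals the end when iterating via __getitem__()
            raise IndexError("index out of range")
        return self.start - index

    def __len__(self):
        return self.start + 1


class IterableCountdown(Countdown):
    """Same hypothetical class with an explicit __iter__() added."""

    def __iter__(self):
        return iter(range(self.start, -1, -1))


via_getitem = list(Countdown(3))
via_iter = list(IterableCountdown(3))

print(via_getitem)  # [3, 2, 1, 0]
print(via_iter)     # [3, 2, 1, 0]
```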
As is often the case, there's more to say about this topic. However, I'll return to fill in some blanks once I've covered a few more data structure categories later in this series.
But I'll give you a preview of what's coming next:
We'll talk about this diagram later in the series. However, looking at the two categories we've already discussed, iterables and sequences, you'll see they're in different parts of the hierarchical structure.
Etymology Corner
The term "sequence" comes from the Latin sequi, which means "to follow".
Therefore, each item in a sequence follows another. That's why you need to use integer indices!
Code in this article uses Python 3.11
Stop Stack
Recently published articles on The Stack:
Harry Potter and The Object-Oriented Programming Paradigm. Year 1 at Hogwarts School of Codecraft and Algorithmancy • The Mindset
Iterable: Python's Stepping Stones. What makes an iterable iterable? Part 1 of the Data Structure Categories Series
Why Do 5 + "5" and "5" + 5 Give Different Errors in Python? • Do You Know The Whole Story? If __radd__() is not part of your answer, read on…
I take back what I said last time! I did say I'm still experimenting with format and how to publish on Substack. In the last couple of articles, I mentioned that I may not email all articles in a series. I've had a couple of discussions with readers and other writers, and I've changed my mind. I will email most articles now, but I'm also revising my planned schedule of publication. See the next bullet point.
Originally, I planned to publish weekly on Wednesdays plus another article per week on some weeks. I'm now aiming for a five-day cycle. There will be roughly one article every five days. However, as I mentioned in my introductory post, I will not publish just for the sake of meeting a self-imposed deadline. So there may be times when the frequency of publication will change.
In other news, the first cohort of The Python Coding Programme is underway. It's fun guiding a small group of very keen and eager learners through the fundamentals of Python. Next cohort starts in mid-May!
Do get in touch on Notes (or other platforms) so we can continue the conversation from this and other articles. As those of you who've interacted with me on any social media platform know, I enjoy having conversations on these platforms!