A Slicing Story
Are you sure you know everything there is to know about Python's `slice` object?
I have a few precious words to grab your attention. You're thinking: "I know how to slice a list, what else is there to know about slicing?" So, here's a tasting menu:
Starter • Naming a slice
Main course • Expanding a sequence
Dessert • The iterator slice
And there's more in the rest of this article. Right, so if you're interested in exploring Python's slicing further, get yourself a cup of tea, coffee, or your beverage of choice, and we'll start once you're back…
The conventional approach would be to introduce the basic and most common uses of slicing and then build from there. But I won't do this. It's likely you've seen and used basic slicing already. (But don't worry if you haven't, either.) So, I'll take a different path and won't start from "the beginning".
Everything Is An Object, Including A Slice
Let's start with a basic example of slicing:
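Here's a minimal sketch of such an example. The list name `my_favourite_numbers` appears later in the article, but the values are made up for illustration:

```python
# Made-up values, assumed for illustration only
my_favourite_numbers = [3, 7, 9, 11, 18, 27, 42, 55, 73, 99]

# Items from index 2 up to, but excluding, index 7
some_numbers = my_favourite_numbers[2:7]
print(some_numbers)  # [9, 11, 18, 27, 42]
```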
But what's in the square brackets? Sure, it's a slice. But what is a slice?
Let's find out by creating a new data type: a modified list that prints whatever is in the square brackets, along with its data type, whenever you use a subscript. You can override the `__getitem__()` special method, which is used when fetching items using the square brackets notation, and add a couple of calls to `print()`:
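A sketch of what such a class could look like, using the class name `TestList` and the parameter name `item` from the description in this section. The list values are made up:

```python
class TestList(list):
    def __getitem__(self, item):
        # Show what was passed in the square brackets, and its data type
        print(item)
        print(type(item))
        # Let list's own __getitem__() do the actual work
        return super().__getitem__(item)


# Made-up values, assumed for illustration only
my_favourite_numbers = TestList([3, 7, 9, 11, 18, 27, 42, 55, 73, 99])

result = my_favourite_numbers[2:7]
# The subscript above prints:
# slice(2, 7, None)
# <class 'slice'>
print(result)  # [9, 11, 18, 27, 42]
```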
When you subscript an object using the square brackets notation, its `__getitem__()` method is called. The values within the square brackets are passed as an argument to `__getitem__()`. Therefore, the parameter `item` represents whatever you put in the square brackets when you subscript an object.
You can read more about `__getitem__()` in The Manor House, the Oak-Panelled Library, the Vending Machine, and Python's `__getitem__()`.
In this new class called `TestList`, which inherits from `list`, you first print the value and data type of the argument in the `__getitem__()` method, and then you call the list's `__getitem__()` method and return its value. This is what `super().__getitem__(item)` does since `list` is the superclass for `TestList`.
The syntax `2:7` within the square brackets represents a `slice` object. You've probably heard that everything is an object in Python. And therefore, slices are objects, too.
When you print the item, you see the following:
slice(2, 7, None)
The three arguments represent the start, stop, and step values. Since you use the syntax `2:7`, you only include the start and stop values. Therefore, the step value is `None`, equivalent to a step size of 1.
If you include a step size in the slice, you'll see this represented as the third argument in `slice()`:
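One way to see this on its own is a tiny helper whose `__getitem__()` simply returns whatever it receives. The class name `ShowItem` is hypothetical and used only for this sketch:

```python
class ShowItem:
    def __getitem__(self, item):
        # Hand back whatever was placed in the square brackets
        return item


show = ShowItem()

with_step = show[2:7:2]
only_step = show[::2]

print(with_step)  # slice(2, 7, 2)
print(only_step)  # slice(None, None, 2)
```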
You can also use an integer within the square brackets when indexing a sequence, such as a list. You can try this out with `TestList`. But this article is about slicing, so I'll move on.
Naming Slices
You've seen how the slicing syntax you use within the square brackets, such as `2:7` or `2:7:2`, refers to a `slice` object. So, can you call the constructor `slice()` directly within the square brackets?
Yes, it turns out you can. But why would you want to do this? Probably you wouldn't.
But sometimes you may want to name a slice you want to re-use in your code. Since a slice is an object, just like most other things in Python, you can create a `slice` instance and assign it to a name. You can then use that name within the square brackets:
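A short sketch of both ideas, assuming a made-up list of numbers. The name `middle_items` is hypothetical:

```python
# Made-up values, assumed for illustration only
my_favourite_numbers = [3, 7, 9, 11, 18, 27, 42, 55, 73, 99]

# Calling slice() directly in the square brackets works
# and is equivalent to my_favourite_numbers[2:7]
direct = my_favourite_numbers[slice(2, 7)]

# A named slice can be reused wherever it's needed
middle_items = slice(2, 7)
named = my_favourite_numbers[middle_items]

print(direct)  # [9, 11, 18, 27, 42]
print(named)   # [9, 11, 18, 27, 42]
```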
Using a named slice can also help with code readability in some situations. We'll return to the `slice` object later to look at its `.indices()` method.
Back to Basics
I skipped the description of what slices are and how to use them at the beginning of this article. So, let me go back to the basics now. But I'll be quick. And you can safely skip this section if you're familiar with using slices.
You can extract a subset of a sequence—or a slice of the sequence:
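For instance, with a made-up list of numbers (the values are an assumption for illustration):

```python
# Made-up values, assumed for illustration only
my_favourite_numbers = [3, 7, 9, 11, 18, 27, 42, 55, 73, 99]

subset = my_favourite_numbers[4:8]
print(subset)  # [18, 27, 42, 55]
```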
This gives you a slice of the sequence. This slice contains the elements from index 4 up to, but excluding, the item with index 8. And you can use shortcuts if you want the slice to start from the beginning or end at the end of the sequence:
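A sketch of the shortcuts, again assuming a made-up list. The names `first_four` and `from_fifth` are hypothetical:

```python
# Made-up values, assumed for illustration only
my_favourite_numbers = [3, 7, 9, 11, 18, 27, 42, 55, 73, 99]

# Leave out the start value to begin at the first item
first_four = my_favourite_numbers[:4]
print(first_four)  # [3, 7, 9, 11]

# Leave out the stop value to carry on to the end
from_fifth = my_favourite_numbers[4:]
print(from_fifth)  # [18, 27, 42, 55, 73, 99]
```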
Indeed, you can "slice" the whole sequence:
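A sketch with a made-up list. The two comparison lines are an addition to show that the result has the same values but is a separate list object:

```python
# Made-up values, assumed for illustration only
my_favourite_numbers = [3, 7, 9, 11, 18, 27, 42, 55, 73, 99]

everything = my_favourite_numbers[:]

print(everything == my_favourite_numbers)  # True: same values
print(everything is my_favourite_numbers)  # False: a different list object
```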
You may be thinking this is a pointless slice since it refers to the whole sequence. Why would you do that? However, you'll see this used often in Python code as it creates a copy of the sequence. We'll return to this point later in this article.
For the record, I prefer to use `my_favourite_numbers.copy()` when using a list, as it's more readable. There's also the `copy` module in the standard library to deal with making copies for other data types.
And if you don't want successive elements from the sequence, you can use a different step by adding a third value in the slicing syntax:
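For example, assuming the same kind of made-up list (the names `every_second` and `every_third` are hypothetical):

```python
# Made-up values, assumed for illustration only
my_favourite_numbers = [3, 7, 9, 11, 18, 27, 42, 55, 73, 99]

# Indices 1, 3, 5, and 7
every_second = my_favourite_numbers[1:8:2]
print(every_second)  # [7, 11, 27, 55]

# Indices 0, 3, 6, and 9
every_third = my_favourite_numbers[::3]
print(every_third)  # [3, 11, 42, 99]
```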
And you can go in reverse if you use a negative step size:
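A sketch with a made-up list. Note that the start value needs to be larger than the stop value when the step is negative:

```python
# Made-up values, assumed for illustration only
my_favourite_numbers = [3, 7, 9, 11, 18, 27, 42, 55, 73, 99]

# From index 8 down to, but excluding, index 2
backwards = my_favourite_numbers[8:2:-1]
print(backwards)  # [73, 55, 42, 27, 18, 11]

# The whole list in reverse
reversed_numbers = my_favourite_numbers[::-1]
print(reversed_numbers)  # [99, 73, 55, 42, 27, 18, 11, 9, 7, 3]
```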
Note that if you're slicing a list using a slice that's not possible, you'll get an empty list:
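Here are two such slices applied to a made-up list. The slice values match the two cases discussed in this section, and the variable names are hypothetical:

```python
# Made-up values, assumed for illustration only
my_favourite_numbers = [3, 7, 9, 11, 18, 27, 42, 55, 73, 99]

# First example: start is after stop, with the default step of 1
first_example = my_favourite_numbers[4:2]
print(first_example)  # []

# Second example: start is before stop, but the step is negative
second_example = my_favourite_numbers[2:7:-1]
print(second_example)  # []
```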
In the first example, there are no elements starting from index 4 up to, but excluding, index 2. The default step size is 1.
In the second example, you slice from index 2 up to, but excluding, index 7, but in steps of -1. That's not possible, so you get an empty list.
Expanding and Shrinking A Sequence Using Slicing
You can use slices to replace multiple items in a sequence. I'll use a different list of numbers in this section:
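A sketch of this, assuming the new list is simply the integers 0 to 9, which gives it the length of 10 mentioned later in this section:

```python
# Assumed starting list: the integers 0 to 9
numbers = list(range(10))

# Replace the items at indices 4, 5, and 6
numbers[4:7] = [40, 50, 60]
print(numbers)  # [0, 1, 2, 3, 40, 50, 60, 7, 8, 9]
```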
You assign the elements `40`, `50`, and `60` to the positions in the original list represented by the slice `[4:7]`, which refers to the items with indices 4, 5, and 6.
What if the slice you assign to goes beyond the size of the original list?
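A sketch of one such assignment, assuming a fresh list of the integers 0 to 9. The slice `8:12` runs past the last index, which is 9:

```python
# Assumed starting list: the integers 0 to 9
numbers = list(range(10))

# The slice asks for indices 8 to 11, but the list stops at index 9
numbers[8:12] = [80, 90, 100, 110]
print(numbers)       # [0, 1, 2, 3, 4, 5, 6, 7, 80, 90, 100, 110]
print(len(numbers))  # 12
```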
That's fine. This extends the list to fit the data on the right of the equals sign. The length of the slice doesn't even have to match the length of the data:
Or even simpler:
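A sketch of both ideas, assuming lists of the integers 0 to 9. The first assignment uses a two-item slice to hold four values. The open-ended slice in the second assignment is only a guess at what a simpler form might look like:

```python
# Assumed starting list: the integers 0 to 9
numbers = list(range(10))

# A slice covering two items receives four values
numbers[8:10] = [80, 90, 100, 110]
print(numbers)  # [0, 1, 2, 3, 4, 5, 6, 7, 80, 90, 100, 110]

# An open-ended slice gives the same result (assumed simpler form)
more_numbers = list(range(10))
more_numbers[8:] = [80, 90, 100, 110]
print(more_numbers)  # [0, 1, 2, 3, 4, 5, 6, 7, 80, 90, 100, 110]
```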
And I can hear you ask, so here's the answer: Yes, this works even the other way round, when the slice is longer than the sequence on the right of the equals sign:
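A sketch of the reverse case, assuming the integers 0 to 9 again. Five items are replaced by two, so the list gets shorter:

```python
# Assumed starting list: the integers 0 to 9
numbers = list(range(10))

# The slice covers indices 4 to 8, but only two values are assigned
numbers[4:9] = [40, 50]
print(numbers)       # [0, 1, 2, 3, 40, 50, 9]
print(len(numbers))  # 7
```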
Now, this is where it gets a bit weird:
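A sketch of this case, assuming the integers 0 to 9 and three made-up values on the right of the equals sign:

```python
# Assumed starting list: the integers 0 to 9, so its length is 10
numbers = list(range(10))

# The slice starts well beyond the end of the list
numbers[20:30] = [200, 210, 220]
print(numbers)       # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 200, 210, 220]
print(len(numbers))  # 13
```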
The original list has length 10. You assign values to the slice starting from index 20 up to 29. And it works! But the new elements are added at the end of the original list, starting from index 10, and not starting from index 20, as the slice indicates. There can be no gaps in a list!
Squeezing data into a sequence
Let's look at another example:
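A sketch of this assignment, assuming a fresh list of the integers 0 to 9 so that the slice `4:6` holds the elements 4 and 5:

```python
# Assumed starting list: the integers 0 to 9
numbers = list(range(10))

# Two items make way for four
numbers[4:6] = [4.0, 4.5, 5.0, 5.5]
print(numbers)       # [0, 1, 2, 3, 4.0, 4.5, 5.0, 5.5, 6, 7, 8, 9]
print(len(numbers))  # 12
```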
Let's look at what happened in this assignment. You assign a list with four elements, `[4.0, 4.5, 5.0, 5.5]`, to a slice representing two elements. Therefore, these two elements, `4` and `5`, are replaced by the four new elements.
You can also shrink the list in a similar way:
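A sketch of the shrinking case, assuming a fresh list of the integers 0 to 9. The slice `2:8` is my assumption for the one that covers the numbers from 2 to 7:

```python
# Assumed starting list: the integers 0 to 9
numbers = list(range(10))

# Six items are replaced by two
numbers[2:8] = [100, 200]
print(numbers)       # [0, 1, 100, 200, 8, 9]
print(len(numbers))  # 6
```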
The numbers from `2` to `7` are replaced by two values, `100` and `200`.
Copies and Views
Let's explore slices further. Have a look at this example:
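A sketch of such an example, using the names `numbers` and `subset_of_numbers` from this section and made-up values:

```python
# Made-up values, assumed for illustration only
numbers = [2, 4, 6, 8, 10]

subset_of_numbers = numbers[1:4]
print(subset_of_numbers)  # [4, 6, 8]

# Change an item in the slice
subset_of_numbers[0] = 99
print(subset_of_numbers)  # [99, 6, 8]
print(numbers)            # [2, 4, 6, 8, 10]
```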
You assign a slice of `numbers` to a new variable name, `subset_of_numbers`. But when you make a change to `subset_of_numbers`, the original list `numbers` is unchanged. When you create a slice of a list, you create a new list that's a copy of that part of the original. The slice is a separate list object, so assigning to one of its positions doesn't affect the original list.
Let's go a step further and create nested lists:
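Here's the nested example in script form, using the same names and values as the step-by-step breakdown later in this section:

```python
nested_numbers = [[0, 1, 2], [3, 4, 5], [6, 7, 8]]

subset_of_nested_numbers = nested_numbers[:2]
print(subset_of_nested_numbers)  # [[0, 1, 2], [3, 4, 5]]

# Change a value inside one of the inner lists
subset_of_nested_numbers[1][0] = 300
print(subset_of_nested_numbers)  # [[0, 1, 2], [300, 4, 5]]
print(nested_numbers)            # [[0, 1, 2], [300, 4, 5], [6, 7, 8]]
```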
On this occasion, changing a value within one of the inner lists of `subset_of_nested_numbers` also affects the original list, `nested_numbers`. This behaviour may seem strange, and some beginners assume it's a bug. But it isn't. A slice of a list creates a shallow copy of the list.
The slice is a new list but contains references to the original data. Let's see this in action again by extending the previous REPL/Console session:
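In script form, with the earlier steps repeated so the block runs on its own, the extended session looks something like this. The values match the breakdown that follows:

```python
nested_numbers = [[0, 1, 2], [3, 4, 5], [6, 7, 8]]
subset_of_nested_numbers = nested_numbers[:2]

# First assignment: mutate an inner list shared by both outer lists
subset_of_nested_numbers[1][0] = 300

# Second assignment: replace a whole item in the new outer list
subset_of_nested_numbers[0] = [999, 999, 999]

print(subset_of_nested_numbers)  # [[999, 999, 999], [300, 4, 5]]
print(nested_numbers)            # [[0, 1, 2], [300, 4, 5], [6, 7, 8]]
```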
Confused? One assignment changed the original list, but the second assignment didn't. Let's break this down:
1.
>>> nested_numbers = [[0, 1, 2], [3, 4, 5], [6, 7, 8]]
This creates four list objects: the outer list, which is assigned to the name `nested_numbers`, and three inner lists.
2.
>>> subset_of_nested_numbers = nested_numbers[:2]
>>> subset_of_nested_numbers
[[0, 1, 2], [3, 4, 5]]
This creates a new list assigned to the name `subset_of_nested_numbers`. This list is a new object. However, the inner lists are the same lists as those in the original list, `nested_numbers`. They're not just equal to those lists, they're the same objects. Here's the proof:
>>> id(nested_numbers[0])
140431821514560
>>> id(subset_of_nested_numbers[0])
140431821514560 # <-- Same object id
# But 'nested_numbers' and 'subset_of_nested_numbers' are different
>>> id(nested_numbers)
140431821601728
>>> id(subset_of_nested_numbers)
140431821601216
The first element in `nested_numbers` and the first element in `subset_of_nested_numbers` have the same id. Therefore, they're the same object.
3.
>>> subset_of_nested_numbers[1][0] = 300
>>> subset_of_nested_numbers
[[0, 1, 2], [300, 4, 5]]
When you access `subset_of_nested_numbers[1]`, you access the same object as `nested_numbers[1]`. Therefore, when you change the first element of this inner list, this affects both outer lists:
>>> nested_numbers
[[0, 1, 2], [300, 4, 5], [6, 7, 8]]
4.
>>> subset_of_nested_numbers[0] = [999, 999, 999]
>>> subset_of_nested_numbers
[[999, 999, 999], [300, 4, 5]]
>>> nested_numbers
[[0, 1, 2], [300, 4, 5], [6, 7, 8]]
But when you assign a new list to `subset_of_nested_numbers[0]`, you replace the reference to the original inner list. So, `subset_of_nested_numbers[0]` and `nested_numbers[0]` are no longer the same object. The first item in `nested_numbers` hasn't changed and is the same inner list as before, `[0, 1, 2]`. However, the first item in `subset_of_nested_numbers` is now the new list `[999, 999, 999]`.
You can read more about shallow and deep copies in an article I wrote in the pre-Substack era: Shallow and Deep Copy in Python and How to Use __copy__().
Confused? Let me add to the confusion…
So, slicing creates a copy of the subset of the sequence, right? Not so fast…
I have used lists in all the examples in this article so far. You can slice other sequences, too:
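For instance, with a string and a tuple. All names and values here are made up for illustration:

```python
word = "slicing"
letters = word[1:4]
print(letters)  # lic

point = (10, 20, 30, 40, 50)
part = point[1:4]
print(part)  # (20, 30, 40)
```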
And this also creates a copy. When slicing strings or tuples, there's no option other than creating a copy since they're immutable data types.
So, let's try another data type to see whether slicing always creates a copy. Let's use a NumPy array. I'll replicate the first example in this Copies and Views subsection, which I'm showing again below, and also use NumPy arrays in addition to using lists:
Now, let's replicate these steps using NumPy arrays. You'll need to install NumPy using `pip install numpy` or your favourite package manager:
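A sketch that runs the list version and the NumPy version one after the other. The array names `numbers_array` and `subset_of_numbers_array` come from this section, and the values are made up. It assumes NumPy is installed:

```python
import numpy as np

# The list version (made-up values)
numbers = [2, 4, 6, 8, 10]
subset_of_numbers = numbers[1:4]
subset_of_numbers[0] = 99
print(numbers)  # [2, 4, 6, 8, 10]

# The same steps with a NumPy array
numbers_array = np.array([2, 4, 6, 8, 10])
subset_of_numbers_array = numbers_array[1:4]
subset_of_numbers_array[0] = 99
print(numbers_array.tolist())  # [2, 99, 6, 8, 10]

# Both arrays share the same underlying data
print(np.shares_memory(numbers_array, subset_of_numbers_array))  # True
```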
The behaviour when you slice a NumPy array is different to the case when you slice a list. When you assign a new value to an item in `subset_of_numbers_array`, the original array also changes. Slicing doesn't create a copy of the portion of the array in this case. Instead, it creates a view. This means that `subset_of_numbers_array` refers to the same data as `numbers_array`. When you change one, the other changes, too.
You can read more about copies and views in NumPy in the docs, and I'll write an article about this topic soon.
So, whether a slice creates a copy of the data depends on the data type. Slices create copies when dealing with lists, strings, tuples, and most other built-in sequence types. However, this behaviour is not guaranteed by slicing, as you've seen in the case of NumPy arrays.
Back to The `slice` Object
Before I move on, I should mention the `.indices()` method. This is the only method the `slice` object has, aside from the usual special methods. It's unlikely you'll ever need to use this method, but it can help you understand slices a bit better.
Let's assume you want to extract from the fourth to the eighth element of a sequence in steps of 2:
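A sketch using the slice `3:9:2`, which is the one this section works with, on an assumed list of the integers 0 to 9:

```python
# Assumed list: the integers 0 to 9
numbers = list(range(10))

# Indices 3, 5, and 7
selection = numbers[3:9:2]
print(selection)  # [3, 5, 7]
```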
What if the list is shorter and doesn't have enough elements? The slice still returns a value:
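For example, with an assumed list of only six items. The name `shorter_numbers` is hypothetical:

```python
# Assumed shorter list: the integers 0 to 5
shorter_numbers = list(range(6))

# The slice asks for indices 3, 5, and 7, but only 3 and 5 exist
selection = shorter_numbers[3:9:2]
print(selection)  # [3, 5]
```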
However, if you need a slice that perfectly matches the sequence you're using, you can convert one slice into an equivalent one using the length of the sequence. This will make more sense with an example. Let's take the slice `3:9:2` from the example above. This is equivalent to `slice(3, 9, 2)`. You can convert this slice to the equivalent slice for a list of length 6:
The `.indices()` method with argument 6, the length of the new sequence, returns the ideal start, stop and step values to recreate this slice for a sequence of length 6. The stop value is now 6, since the highest index in the list is 5.
Let's try this with the shorter list:
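A sketch, assuming the shorter list holds the integers 0 to 5. Unpacking the tuple returned by `.indices()` into `slice()` is one way to rebuild the slice, and the names are hypothetical:

```python
# Assumed shorter list: the integers 0 to 5
shorter_numbers = list(range(6))

original_slice = slice(3, 9, 2)

# Build the equivalent slice for this list's length
equivalent_slice = slice(*original_slice.indices(len(shorter_numbers)))
print(equivalent_slice)  # slice(3, 6, 2)

# Both slices pick out the same items from the shorter list
print(shorter_numbers[equivalent_slice])  # [3, 5]
print(shorter_numbers[original_slice])    # [3, 5]
```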
I won't dwell on `.indices()` any longer. You'll be fine if you forget about this method!
The Iterator Slice
I'll finish this article with a brief mention of the `slice` object's 'cousin', the `islice()` object, which is part of the `itertools` module. When you slice a sequence, you get an object of the same data type. A slice of a list is another list, and a slice of a string is also a string. However, you can use `itertools.islice()` to get an iterator instead. I'll use my favourite numbers again:
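A sketch with a made-up list in which the item at index 6 is 42, to match the description in this section. The other values and the variable names are assumptions:

```python
from itertools import islice

# Made-up values, with 42 at index 6
my_favourite_numbers = [3, 7, 9, 11, 18, 27, 42, 55, 73, 99]

numbers_iterator = islice(my_favourite_numbers, 2, 7)

first = next(numbers_iterator)
print(first)  # 9, the item at index 2

rest = list(numbers_iterator)
print(rest)  # [11, 18, 27, 42]

# The iterator is now exhausted
exhausted = False
try:
    next(numbers_iterator)
except StopIteration:
    exhausted = True
print(exhausted)  # True
```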
The function `islice()` returns an iterator that's equivalent to the slice `2:7`. The first argument in `islice()` is the iterable you want to get a slice from. The second and third arguments are the start and stop values of the slice. You can also add another argument if you don't want to use the default step size, which is 1.
The first item returned by the iterator is the element with index 2. The items are returned sequentially since the default step of 1 is used. However, once the item with index 6 is returned, which is the number 42, the iterator is exhausted. The next time you call `next()`, a `StopIteration` exception is raised. Recall that the stop value, which is 7, is excluded.
Another difference between regular slicing and `islice()` is that you cannot use negative values for the start, stop, and step values in `islice()`.
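A short sketch of the difference, assuming a made-up list. A negative start works in a regular slice but raises a `ValueError` when passed to `islice()`:

```python
from itertools import islice

# Made-up values, assumed for illustration only
my_favourite_numbers = [3, 7, 9, 11, 18, 27, 42, 55, 73, 99]

# Regular slicing accepts negative values
last_three = my_favourite_numbers[-3:]
print(last_three)  # [55, 73, 99]

# islice() rejects them
raised = False
try:
    islice(my_favourite_numbers, -3, None)
except ValueError:
    raised = True
print(raised)  # True
```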
You can read more about iterators in A One-Way Stream of Data • Iterators in Python.
Final Words
I covered some of the lesser-known features and quirks of Python slicing in this article. You may find a use for some of these techniques in your code. But even if you don't, it will give you a better understanding of what's going on when you slice a list or another sequence. Keep an eye out for whether the slice returns a copy or a view of the original sequence, especially when slicing data types other than the built-in types.
Code in this article uses Python 3.11