The Key To The `key` Parameter in Python
A parameter named `key` is present in several Python functions, such as `sorted()`. Let's explore what it is and how to use it.
That's not a great title, is it? It refers to a parameter named `key` rather than "the key parameter", as in "the important parameter". You may have seen this parameter in the built-in function `sorted()` or the list method `.sort()`. However, these are not the only places you'll find it. We'll look at a few more functions in this article.
But what's this parameter? How do we use it? And why do we need it?
Keys everywhere! The term "key" appears often in Python in different contexts. There are the keys in dictionaries and other mappings. And there are keyword arguments in functions. There are also Python keywords like `for` and `import`. However, in this article, I'll focus on the parameters named `key` you'll find in several Python functions and methods.
Using `key` With `sorted()`
Let's start with the built-in function `sorted()` and look at an example. There's a list of scientifically interesting numbers (do you recognise all of them?), and you pass this list as an argument to `sorted()`. Next, you print the list returned by `sorted()`. If you're not familiar with the unpacking operator `*` or the `sep` parameter in `print()`, you don't need to worry too much about the weirdness within the parentheses in `print()`:
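The code for this example isn't shown in this version of the article, so here's a minimal sketch of what it could look like. The variable names `numbers` and `sorted_numbers` and the order of the items in the original list are assumptions. The values themselves are taken from the output shown next:

```python
# Assumed variable name and item order; the values match the printed output
numbers = [3.14159, 2.71828, 6.022e23, 6.626e-34, 299_792_458, 6.674e-11, 1.61803]

# sorted() returns a new list with the items in ascending order
sorted_numbers = sorted(numbers)

# * unpacks the list into separate arguments,
# and sep="\n" prints each argument on its own line
print(*sorted_numbers, sep="\n")
```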
Here's the output:
6.626e-34
6.674e-11
1.61803
2.71828
3.14159
299792458
6.022e+23
The numbers are in ascending numerical order. Yawn! Useful. But boring.
Let's snazz it up a bit.
Ok, but before we do so, I know you're frustrated that I skipped the explanation of what's going on within the call to `print()`. That's out of character. Sorry! So here we go:
- The unpacking operator `*` unpacks the items in the list one by one. It's equivalent to writing each element separated by a comma directly within the parentheses. Here's a simple example: If you have a list, such as `numbers = [5, 2, 10]`, then `print(*numbers)` is equivalent to `print(5, 2, 10)`.
- The `print()` function has an optional argument `sep`. Its default value is the space character `" "`, which means that multiple arguments are separated by a space when printed. However, you can assign another value to `sep`. In the example above, `sep` is set to the newline character `"\n"`. Therefore, the code prints each number on a new line.
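If you'd like to try these two points out, here's a short sketch using the small list from the explanation above:

```python
numbers = [5, 2, 10]

# These two calls print the same line: 5 2 10
print(*numbers)
print(5, 2, 10)

# With sep="\n", each number is printed on its own line
print(*numbers, sep="\n")
```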
Now, back to snazzing up `sorted()`. Let's use a list of names and see what `sorted()` does to this list:
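Here's a sketch of what this code could look like. The list name `some_names` is mentioned later in the article. The order of the names in the original list is my reconstruction, inferred from the outputs shown further down:

```python
# Original order inferred from outputs later in the article
some_names = ["Robert", "Ishaan", "Max", "Trevor", "Alexandra", "Albert", "Christine"]

# With no key, strings are sorted alphabetically
sorted_names = sorted(some_names)
print(*sorted_names, sep="\n")
```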
Here's the output:
Albert
Alexandra
Christine
Ishaan
Max
Robert
Trevor
"Where's the snazzing up you promised us?", I can hear you think. The names are in alphabetical order. So what?
What if you want to order the names using their lengths? Or maybe depending on how many letter "a"s they have?
Well, you can. Let's start by ordering the names based on their length. You can set your own rules for sorting the list as long as you can express the rule as a function. And this is where the `key` parameter enters the scene:
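A minimal sketch of this call, assuming the same `some_names` list as before (the name `names_by_length` is just for illustration):

```python
some_names = ["Robert", "Ishaan", "Max", "Trevor", "Alexandra", "Albert", "Christine"]

# Pass the function itself (len, not len()) to the key parameter
names_by_length = sorted(some_names, key=len)
print(*names_by_length, sep="\n")
```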
There's a second argument in `sorted()`. You pass the function name `len` and assign it to the parameter `key`. Here's what happens within `sorted()`:
- Each item from the list `some_names` is passed as an argument to the function `len()`.
- The value returned by `len()` is used by `sorted()` to determine the order of the items.
Therefore, `sorted()` used values returned by `len()` to determine the order of the items. Here's the output from this code:
Max
Robert
Ishaan
Trevor
Albert
Alexandra
Christine
The names are ordered from shortest to longest. Names with the same number of letters keep their original order (`"Robert"` came before `"Ishaan"` in the original list, so it remains ahead in the sorted list since both names have the same length).
Regular Named Functions And Anonymous `lambda` Functions
Let's order the names using the number of times the letter 'a' appears in the name. I'll show two versions that give the same output.
The first version is similar to the earlier example, except that you're defining your own function rather than using a built-in function:
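Here's a sketch of this first version. The function name `get_number_of_a_s()` comes from the article, but the parameter name `name` and the variable `names_by_a_count` are assumptions:

```python
some_names = ["Robert", "Ishaan", "Max", "Trevor", "Alexandra", "Albert", "Christine"]

def get_number_of_a_s(name):
    # Convert to lowercase so that "A" and "a" are both counted
    return name.lower().count("a")

# The function name is passed without parentheses
names_by_a_count = sorted(some_names, key=get_number_of_a_s)
print(*names_by_a_count, sep="\n")
```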
The function `get_number_of_a_s()` converts the input string to lowercase and counts the number of occurrences of the letter 'a'. The function returns this count, which is used by `sorted()` to determine the order of the items in the new list. Here's the output:
Robert
Trevor
Christine
Max
Albert
Ishaan
Alexandra
Since `sorted()` deals with numerical values by ordering them in ascending order, the names with no 'a's appear first since `.count("a")` returns `0` for these names. `"Max"` and `"Albert"` are next since they contain one occurrence of 'a'. `"Max"` is listed first since it occurs before `"Albert"` in the original list. The names with two and three occurrences of 'a' come next.
You can add a third keyword argument to `sorted()`, `reverse=True`, to order the names starting with the one with the most occurrences of the letter 'a'.
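Here's a quick sketch of the reversed version, with the same assumed names as before. Names with the same count still keep their original relative order when `reverse=True` is used:

```python
some_names = ["Robert", "Ishaan", "Max", "Trevor", "Alexandra", "Albert", "Christine"]

def get_number_of_a_s(name):
    # Count the occurrences of 'a', ignoring case
    return name.lower().count("a")

# reverse=True sorts from the highest key value to the lowest
most_a_s_first = sorted(some_names, key=get_number_of_a_s, reverse=True)
print(*most_a_s_first, sep="\n")
```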
This option is perfectly fine. However, you'll often see `lambda` functions used in this scenario. If you're unfamiliar with `lambda` functions, it means you're new here or you weren't paying attention last week. So here's a link to last week's article: What's All the Fuss About 'lambda' Functions in Python?
And here's the same ordering task using a `lambda` function instead of a regular named function:
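A sketch of the `lambda` version, again assuming the parameter name `name`:

```python
some_names = ["Robert", "Ishaan", "Max", "Trevor", "Alexandra", "Albert", "Christine"]

# The lambda function is defined directly where it's needed
names_by_a_count = sorted(some_names, key=lambda name: name.lower().count("a"))
print(*names_by_a_count, sep="\n")
```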
The `lambda` function performs the same task as `get_number_of_a_s()` earlier. It has the same parameter and the same return expression. The output from this code is identical to the earlier one.
More Examples Using `list.sort()`
The `key` parameter appears elsewhere. Let's start with a close relative of `sorted()`, the list method `.sort()`. Whereas `sorted()` is a built-in function that can be used with iterables (not just lists), the list method is, well, a list method. But there's another difference. See whether you can spot this difference compared to when you used `sorted()` in the previous section:
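Here's a sketch of what this could look like, using the same assumed list and `lambda` function as in the previous section:

```python
some_names = ["Robert", "Ishaan", "Max", "Trevor", "Alexandra", "Albert", "Christine"]

# Compare this line with the earlier calls to sorted()
some_names.sort(key=lambda name: name.lower().count("a"))

print(*some_names, sep="\n")
```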
Here's the output, which is the same as in the previous example:
Robert
Trevor
Christine
Max
Albert
Ishaan
Alexandra
The built-in `sorted()` does not change the object passed to it. Instead, it returns a new list with the output. However, the list method `.sort()` mutates the list it acts on. Since lists are mutable, list methods such as `.sort()` modify the original list rather than returning a copy. And since `.sort()` is a method, the list doesn't need to be explicitly passed as an argument since this is implied.
If you want to sort using several rules, you can call `.sort()` multiple times:
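A minimal sketch of sorting twice, assuming the same list of names. The second sort keeps the alphabetical order from the first sort for names with the same number of 'a's:

```python
some_names = ["Robert", "Ishaan", "Max", "Trevor", "Alexandra", "Albert", "Christine"]

# First rule: alphabetical order (no key needed)
some_names.sort()

# Second rule: number of 'a's, applied to the alphabetically sorted list
some_names.sort(key=lambda name: name.lower().count("a"))

print(*some_names, sep="\n")
```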
When you use `.sort()` without any arguments, the method sorts the names alphabetically. Then, when you sort using the "number of 'a's" rule, you start with the list already sorted alphabetically.
Here's another example:
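Here's a sketch of this example. Creating the list with `random.randint()` in a list comprehension is my assumption about how the numbers are generated:

```python
import random

# Assumed way of creating 20 random integers between 1 and 50 (inclusive)
numbers = [random.randint(1, 50) for _ in range(20)]
print(numbers)

# Sort using the remainder when each number is divided by 5
numbers.sort(key=lambda x: x % 5)
print(numbers)
```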
You create a list of `20` random numbers between `1` and `50`. You sort these numbers using a `lambda` function that returns the remainder when the number is divided by `5` using the modulo operator `%`. I'm using `x` as the parameter name in the `lambda` function. Although I'm generally against single-letter names, it's quite common to use `x` as a parameter name in `lambda` functions. Here's the output I got when I ran this. Your results will vary since the numbers are random:
[37, 40, 19, 11, 10, 41, 45, 22, 49, 18, 19, 33, 25, 33, 24, 1, 40, 25, 18, 27]
[40, 10, 45, 25, 40, 25, 11, 41, 1, 37, 22, 27, 18, 33, 33, 18, 19, 49, 19, 24]
The first line shows the original list. The second line in the output shows the modified list. The first numbers shown are those divisible by `5`. These are `40`, `10`, `45`, `25`, `40`, and `25`. Their order is the same as they appear in the original list.
Following these numbers are those that leave a remainder of `1` when divided by `5`. These are `11`, `41`, and `1` in this example. Next are the numbers that leave a remainder of `2`, then a remainder of `3`, and finally, those that leave a remainder of `4` when divided by `5`.
The point is that you can sort using any rule you wish.
Using `key` with `max()` and `min()`
The `key` parameter isn't limited to `sorted()` and `.sort()`. Here it is again in `max()` and `min()`. And you probably figured out what it does in these built-in functions, too.
Here's a list of random numbers again. If you pass the list of numbers to the built-in function `max()`, the function returns the largest value. But in the second scenario, I've set up a different rule. A lot is happening in the `lambda` function. Can you figure out what the rule is?
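Here's a sketch of this code. The `lambda` function follows the description in the appendix at the end of this article, while the way the random list is created and the names `largest` and `largest_by_rule` are assumptions:

```python
import random

# Assumed way of creating 20 random integers between 1 and 50 (inclusive)
numbers = [random.randint(1, 50) for _ in range(20)]
print(numbers)

# Without key, max() returns the largest number
largest = max(numbers)
print(largest)

# With key, max() uses the value returned by the lambda function instead
largest_by_rule = max(numbers, key=lambda x: sum(int(y) for y in str(x)))
print(largest_by_rule)
```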
Here's the output from this code to help you determine whether you figured it out. The three lines show the list of numbers, the maximum value in that list, and finally, the "maximum value" based on the rule in the `lambda` function:
[6, 8, 44, 16, 46, 43, 23, 26, 33, 28, 32, 26, 15, 38, 32, 38, 23, 13, 21, 26]
46
38
And since these are random numbers, here's another output from a different run of this code:
[2, 4, 28, 32, 37, 21, 27, 4, 21, 25, 30, 32, 19, 22, 18, 16, 40, 5, 36, 3]
40
28
If there's more than one value that equals the "maximum", whichever definition of "maximum" is used, `max()` returns the first one.
You can try this or a different rule with `min()`.
Other Places Where You Can Spot `key` In The Wild
The sorting functions, `min()`, and `max()`† aren't the only places you'll find the `key` parameter used. Once you know how to use this parameter in these functions, you'll be able to use it anywhere else you encounter it.
† This is why I'm a fan of the Oxford comma. It makes it clear that `min()` and `max()` are not the sorting functions! But let's move on from linguistic details.
Let's start with close relatives of `min()` and `max()`. We'll look at `nlargest()` and `nsmallest()` in the `heapq` module. This module is part of the standard library. I'll use a similar example to the one in the section above:
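A sketch of this example, using the same `lambda` function as in the `max()` section. As before, the way the random list is created and the name `largest_three` are assumptions:

```python
import heapq
import random

# Assumed way of creating 20 random integers between 1 and 50 (inclusive)
numbers = [random.randint(1, 50) for _ in range(20)]
print(numbers)

# The first argument is how many values to return
largest_three = heapq.nlargest(3, numbers, key=lambda x: sum(int(y) for y in str(x)))
print(largest_three)
```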
Spoiler alert: the cryptic `lambda` function finds the sum of the digits that make up a number. For example, the `lambda` function returns `9` when the input is `36` since the sum of `3` and `6` is `9`. There's a detailed step-by-step description of this `lambda` function at the end of this article.
The function `heapq.nlargest()` returns the largest `n` values of an iterable. The first argument represents `n`. In the above example, the first argument is `3`. Therefore, the function returns the three "largest" values using the sum-of-digits rule defined in the `lambda` function. The output shows the entire list first and then the list returned by `heapq.nlargest()`:
[15, 40, 43, 19, 4, 20, 46, 34, 21, 23, 8, 10, 16, 40, 17, 7, 5, 50, 4, 13]
[19, 46, 8]
Note how the full list includes `50`, but this is not included in the output from `heapq.nlargest()` since `5 + 0 = 5`, which is smaller than the sum of the digits in `19`, `46`, and `8`. These values are considered the "largest" values using this rule since the sums of their digits are `10`, `10`, and `8`, and these values are larger than the sums of the digits of the rest of the numbers. Note that there is another number, `17`, whose sum of digits is `8`. However, this comes after `8` in the original list, so the function picks `8` as the third and final element.
The value returned by the `lambda` function is used to determine the three "largest" values based on this rule.
Another function in the standard library that uses the `key` parameter is `groupby()` in the `itertools` module. Here's how to use `groupby()`. This example uses the list of names used earlier in this article:
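Here's a sketch of this code. The variable name `output` is mentioned in the article. The loop variable names `category` and `group` are assumptions chosen to match the description that follows:

```python
import itertools

some_names = ["Robert", "Ishaan", "Max", "Trevor", "Alexandra", "Albert", "Christine"]

# groupby() returns an iterator of (category, group) tuples
output = itertools.groupby(some_names, key=len)

for category, group in output:
    # Each group is itself an iterator, so convert it to a list to display it
    print(category, list(group))
```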
The function `itertools.groupby()` takes two arguments (the second one is optional). The first argument is the iterable containing the data, and the second argument is the function used as a key.
I won't dive too deep into how `groupby()` works (maybe in a future article), but I'll give a quick overview:
- `groupby()` groups consecutive elements based on whether they return the same value when passed to the `key` function. In this case, items are grouped based on their lengths.
- The function returns an iterator, which is assigned to the variable name `output` in this example.
- Each item in the iterator is a tuple containing two objects. The first of these objects is the category which defines a group, and the second is the group of items that match that category. The second element, which represents the group of items, is another iterator. This explains why it's cast into a list in the `for` loop. All this will make more sense when you see the output from this code below.
Here's the output:
6 ['Robert', 'Ishaan']
3 ['Max']
6 ['Trevor']
9 ['Alexandra']
6 ['Albert']
9 ['Christine']
The category, displayed first on each line, is the length of the names. This is followed by the group, which includes the names that correspond to that length.
But you'll note there are three lines corresponding to `6` and two corresponding to `9`. If you re-read the first bullet point above, you'll see I included the term "consecutive". The `groupby()` function groups consecutive matching items.
Let's sort the names based on their lengths before applying `groupby()`:
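A sketch of the updated code, with the same assumed loop variable names as before:

```python
import itertools

some_names = ["Robert", "Ishaan", "Max", "Trevor", "Alexandra", "Albert", "Christine"]

# Sort by length first so that names of equal length are consecutive
output = itertools.groupby(sorted(some_names, key=len), key=len)

for category, group in output:
    print(category, list(group))
```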
The first argument in `groupby()` is now the built-in function `sorted()`. This function sorts the list of names based on their lengths by using `len()` as the function assigned to `key`. The list returned by `sorted()` is used in `groupby()`, which also uses `len()` as the key. Here's the output:
3 ['Max']
6 ['Robert', 'Ishaan', 'Trevor', 'Albert']
9 ['Alexandra', 'Christine']
There's one name with three letters, four with six letters, and two names with nine letters.
You may also be familiar with another popular `groupby()` function from the `pandas` module. This is a different function and doesn't have a `key` parameter. However, it has a `by` parameter which can serve a similar purpose. Another topic for a future article! But you will find the `key` parameter elsewhere in `pandas`, such as in `sort_index()` or `sort_values()`, for example.
Final Words
There are other places where you'll find the `key` parameter in the standard library and in third-party modules. I'll let you go hunt more of them!
Functions with a `key` parameter are examples of functions that take other functions as arguments. This is possible since functions are objects. You'll often find them referred to as "first class objects", which means they can be used in the same way as other objects, including as arguments in functions.
And let me finish off with the key point of this article…
No, I can't think of another good pun using "key", so I'll end here.
Appendix
Remember this code?
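As a reminder, here's a sketch of the call in question. To keep it short, I'm using the list from the first output shown in the `max()` section rather than generating new random numbers:

```python
# The list from the first run shown in the max() section
numbers = [6, 8, 44, 16, 46, 43, 23, 26, 33, 28, 32, 26, 15, 38, 32, 38, 23, 13, 21, 26]

largest_by_rule = max(numbers, key=lambda x: sum(int(y) for y in str(x)))
print(largest_by_rule)  # 38
```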
Did you figure out what the rule is? Often, `lambda` functions lead to one-liners with reduced readability. This can be seen as a drawback. Let's break down the `lambda` function:
- The parameter is `x`. Since the `lambda` function is the argument for the `key` parameter, each number in the list `numbers` is passed to the `lambda` function's `x` parameter. Let's use `24` as an example in these bullet points.
- The return expression has a generator expression as the argument for the built-in `sum()`. The value of `x` is a number from the list. You convert this to a string using `str()`. Therefore, this becomes the string `"24"` in this example.
- The string is an iterable, and the generator expression loops through this string. Each character in this string is assigned to `y`, which is converted to an integer. In our example, the generator expression will generate the integers `2` and `4`.
- The built-in function `sum()` adds these values to return `6`. This is the value used by `max()` to determine the "largest" number using the rule defined by the `key` parameter.
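The steps above can be sketched one at a time, using `24` as the example. The names `as_text`, `digits`, `total`, and `sum_of_digits()` are hypothetical and only here for illustration. I'm also using a list comprehension instead of a generator expression so that the intermediate digits can be displayed:

```python
x = 24

as_text = str(x)                    # the string "24"
digits = [int(y) for y in as_text]  # the integers 2 and 4
total = sum(digits)                 # 2 + 4 gives 6
print(as_text, digits, total)

# A named function equivalent to the lambda function
def sum_of_digits(x):
    return sum(int(y) for y in str(x))

print(sum_of_digits(24))  # 6
print(sum_of_digits(36))  # 9
```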
Therefore, `max()` returns the number with the largest sum of its digits.
Code in this article uses Python 3.12
Hi, I really liked your article. Never realised that keys can be supplied in so many places!
Please note that instead of sorting twice to get a list of names sorted on nr of 'a's and names with equal nr of 'a's alphabetically , you could sort just once:
sort_names.sort(key=lambda x: (x.lower().count('a'), x))
This was probably the most confusing thing in Python when I started, and the docstring of sorted never helped :)