What's The Difference Between NumPy's `arange()` and `linspace()` (A NumPy for Numpties article)
They both generate NumPy arrays with evenly spaced values, but they're not the same
There is a time for long and detailed articles covering every hidden aspect of a topic. This is not it. This article is brief.
Meet Arav and Linda. Yes, Arav will be arange() and Linda linspace(). How predictable? I know…
Arav is a sports coach. He wants to record his athletes' times in 10 metre gaps as they run, so he gets a bag of plastic cones and places each one exactly 10m from the other. Today, his athletes are running 150m runs, so he lined up the 150m stretch on the track with these cones. Yesterday, they ran 400m, so Arav had a harder job placing the cones around the whole track. But he made sure they're all 10m apart.
Linda is an events planner. She's setting up booths for companies displaying their products at a fair. She has a long corridor to work with, and she knows 17 companies have signed up to display their goods. She needs to work out how far apart to place each booth to fit all 17 companies in the 60m corridor. She's meticulous, so she wants to ensure they're all equally spaced.
NumPy has different functions to deal with Arav's and Linda's tasks.
This is a NumPy for Numpties article. You can see the rest of the articles in this series here: The NumPy for Numpties (sort-of) series
You'll need to install NumPy in your environment using pip install numpy (or python -m pip install numpy) in the Terminal, or with whatever package manager you use to install libraries.
Arav Needs arange()
Arav needs arange(). This function allows you to choose the start and end points of your range of values and also to set the step size:
This code shows the positions for Arav's cones when setting up for the 150m training runs with cones 10m apart:
[ 0 10 20 30 40 50 60 70 80 90 100 110 120 130 140 150]
NumPy's arange() returns a NumPy 'ndarray'.
Like most number ranges in Python, the start value is included, but the stop value isn't. Therefore, Arav uses 151 as the stop value to make sure 150 is included in the cone positions. If the stop argument is 150, this value is excluded from the output:
The end cone is no longer present:
[ 0 10 20 30 40 50 60 70 80 90 100 110 120 130 140]
The stop value can be any value larger than your desired endpoint:
Since 150.01 is larger than 150, 150 will be included as it fits in the pattern of increasing the distance by 10m at each step, starting from 0:
[ 0. 10. 20. 30. 40. 50. 60. 70. 80. 90. 100. 110. 120. 130. 140. 150.]
The stop value is a float. Therefore, the values in the array are floats. Note the dot or period after each number. You can also confirm this by accessing the array's dtype attribute:
This confirms the elements in the array are floats:
float64
But in the original example, when the start, stop, and step values are all integers, the elements in the output array are also integers:
The output confirms the elements are integers:
int64
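If Arav wants floats even when all his arguments are integers, one option is the dtype parameter of arange(). Here's a minimal sketch. The variable names are only for illustration, and the exact integer type you see (int64 above) can differ between platforms and NumPy versions:

```python
import numpy as np

# Integer arguments give an integer array (the exact width depends on the platform)
int_cones = np.arange(0, 151, 10)
print(int_cones.dtype)

# Ask for floats explicitly using the dtype parameter
float_cones = np.arange(0, 151, 10, dtype=float)
print(float_cones.dtype)
```

The cone positions are the same in both arrays. Only the data type of the elements changes.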
What about Python's built-in range()?
Great question. You may have wondered why Arav couldn't just use Python's built-in range(). There's no reason in this case: range() would work perfectly well, too:
Here's the output:
[0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120, 130, 140, 150]
But, Arav is doing some fancy data science with the data he collects during his athletes' training sessions. He now wants to record their times every 7.5m:
Oops:
Traceback (most recent call last):
...
aravs_cones = range(0, 151, 7.5)
^^^^^^^^^^^^^^^^^^
TypeError: 'float' object cannot be interpreted as an integer
Python's range() only deals with integers. However, NumPy's arange() can deal with floats:
Here are the cone locations Arav needs:
[ 0. 7.5 15. 22.5 30. 37.5 45. 52.5 60. 67.5 75. 82.5 90. 97.5 105. 112.5 120. 127.5 135. 142.5 150. ]
Arav cares about the distance between the cones, but he also needs to know how many cones to bring. He can use len() to find out:
He needs 21 cones when recording the times every 7.5m:
21
And you surely noticed the similarity in name between Python's built-in range() and NumPy's arange(). They follow a similar usage pattern. So, if you know how to use range(), you're in a good place to start using NumPy's arange().
You can also call arange() with one or two arguments:
You can call arange() with a start and stop value but without a step size. The default step size is 1.
You can call arange() with only a stop value. The default start value is 0, and the step size is 1.
I'll let you explore these on your own.
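If you'd like a starting point for that exploration, here's a short sketch of both call patterns next to their range() equivalents. The variable names are just for illustration:

```python
import numpy as np

# Only a stop value: start defaults to 0 and step defaults to 1
stop_only = np.arange(5)
print(stop_only)  # [0 1 2 3 4]

# Start and stop values: step defaults to 1
start_and_stop = np.arange(2, 6)
print(start_and_stop)  # [2 3 4 5]

# The built-in range() follows the same pattern
print(list(range(5)))     # [0, 1, 2, 3, 4]
print(list(range(2, 6)))  # [2, 3, 4, 5]
```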
I'll mention this here, but I won't discuss it in detail. Precision errors can creep in when dealing with non-integer step sizes and when casting to specific data types. Therefore, arange() may give some odd results, such as an array that's one element longer than you expect. In these cases, it's best to use linspace(), which you'll meet next. If you want to read more about the issues with arange(), have a look at the warning in the documentation.
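Here's a small sketch of the kind of surprise a float step size can cause. The values 1, 1.3, and 0.1 are my own choice for illustration and aren't part of Arav's story. The number of elements comes from dividing the range by the step and rounding up, so a tiny rounding error in that division can add an extra element:

```python
import numpy as np

# Float step: rounding error in (stop - start) / step can change the length,
# so the "excluded" stop value may sneak into the array
risky = np.arange(1, 1.3, 0.1)
print(len(risky), risky)

# Safer: choose the number of points and let linspace() work out the spacing
safe = np.linspace(1, 1.2, 3)
print(len(safe), safe)
```

With linspace(), the length of the array is whatever you asked for, regardless of rounding.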
Linda Needs linspace()
Let's focus on Linda's task. She has a 60m corridor, and she needs to fit equally-spaced booths for 17 companies. Unlike Arav, whose main requirement was to have a fixed gap size between the cones, Linda's priority is to fit 17 booths. The exact size of the gaps doesn't matter, as long as they're all equal. Enter linspace(), which stands for linear space:
NumPy's linspace() also has a start and stop argument. In this case, the start and stop values are 0 and 60. You'll see soon there's an issue with this choice, but let's ignore it for now.
The third argument represents the number of points needed in the array. Linda has 17 companies, so she needs 17 booths. She uses 17 as the third argument to ensure the ndarray returned by linspace() has 17 values:
[ 0. 3.75 7.5 11.25 15. 18.75 22.5 26.25 30. 33.75 37.5 41.25 45. 48.75 52.5 56.25 60. ]
There are 17 values in the array. They show the points where each booth should start. However, note that the stop value is included. The first value in the array is 0, and that shouldn't come as a surprise. But the stop value is also included when using linspace(). This is different from most number ranges in Python, which are normally half-open. A half-open range includes the start value but excludes the stop value.
But Linda's corridor is 60m long. Therefore, she can't have a booth that starts at 60m as the wall gets in the way.
She has several options.
Linda can override the default behaviour in linspace(), which includes the stop value. The function has an endpoint parameter with a default value of True. Linda passes False as this optional argument:
When you use linspace(), you don't define the step size. The function calculates the step size using the start and stop values and the number of points in the array, which you pass as the third argument. When the endpoint is excluded, linspace() works out the step size needed by using the endpoint as an "extra" point. So, in this example, it divides the range from 0 to 60 into 17 steps, not 16. But it excludes the final value:
[ 0. 3.52941176 7.05882353 10.58823529 14.11764706 17.64705882 21.17647059 24.70588235 28.23529412 31.76470588 35.29411765 38.82352941 42.35294118 45.88235294 49.41176471 52.94117647 56.47058824]
Therefore, there are still 17 points in the array. You can set linspace() to return the step size it calculates by using another optional argument:
When retstep=True, the function also returns the step size. The output is now a tuple with the array as its first element and the step size as its second:
(array([ 0. , 3.52941176, 7.05882353,
10.58823529, 14.11764706, 17.64705882,
21.17647059, 24.70588235, 28.23529412,
31.76470588, 35.29411765, 38.82352941,
42.35294118, 45.88235294, 49.41176471,
52.94117647, 56.47058824]),
3.5294117647058822)
If you add the step size to the final point in the array, you'll get 60.0 (give or take a tiny floating-point rounding error), the stop value that's excluded.
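To see where these step sizes come from, here's a short sketch that unpacks the tuple returned when retstep=True and compares the step with a hand calculation. The names start, stop, num, step_included, and step_excluded are my own, chosen for illustration:

```python
import numpy as np

start, stop, num = 0, 60, 17

# endpoint=True (the default): num points means num - 1 gaps
_, step_included = np.linspace(start, stop, num, retstep=True)
print(step_included, (stop - start) / (num - 1))

# endpoint=False: the excluded stop value acts as an extra point, so num gaps
booths, step_excluded = np.linspace(
    start, stop, num, endpoint=False, retstep=True
)
print(step_excluded, (stop - start) / num)

# Compare floats with a tolerance rather than with ==
print(np.isclose(booths[-1] + step_excluded, stop))
```

Unpacking the tuple into two names is often tidier than indexing it later.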
Unfortunately, this leaves a gap at the end of the corridor, and Linda doesn't like this. Instead, she uses a simpler solution.
Linda's booths are 2m wide. This width is the size of the actual booth and not the gap between booths. She likes to leave empty spaces between booths!
So, Linda can use 58m as the stop value and leave the endpoint included, which is the default behaviour:
Here are the booths' starting positions:
[ 0. 3.625 7.25 10.875 14.5 18.125 21.75 25.375 29. 32.625 36.25 39.875 43.5 47.125 50.75 54.375 58. ]
The final booth starts at the 58m mark, and since it's 2m wide, it reaches the wall perfectly. Linda has efficiently used all the space available.
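Linda may also want to know how much empty space this leaves between neighbouring booths. Here's a minimal sketch of that arithmetic. The variable names are hypothetical, and it assumes every booth is exactly 2m wide:

```python
import numpy as np

corridor_length = 60   # metres
booth_width = 2        # metres
number_of_booths = 17

# The last booth must start one booth width before the wall
starts, step = np.linspace(
    0, corridor_length - booth_width, number_of_booths, retstep=True
)

gap = step - booth_width             # empty space between neighbouring booths
last_end = starts[-1] + booth_width  # where the final booth ends

print(gap)       # 1.625
print(last_end)  # 60.0
```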
If you want to read more about linspace(), here's an article I wrote for Real Python about the topic: np.linspace(): Create Evenly or Non-Evenly Spaced Arrays – Real Python.
And if you want to see linspace() used in a real-world scientific context, you can read my most popular article ever: 2D Fourier transform in Python: Create any image using only sine functions.
Any Other Tools?
Are there other tools you can use? Yes. First of all, linspace() also accepts array-like start and stop values, in which case it returns a multi-dimensional array:
Here's the 2D array returned:
[[ 1. 5. 10. ]
[ 2. 6.11111111 11.11111111]
[ 3. 7.22222222 12.22222222]
[ 4. 8.33333333 13.33333333]
[ 5. 9.44444444 14.44444444]
[ 6. 10.55555556 15.55555556]
[ 7. 11.66666667 16.66666667]
[ 8. 12.77777778 17.77777778]
[ 9. 13.88888889 18.88888889]
[10. 15. 20. ]]
But I promised a brief article, so I'll let you study this on your own. Sorry!
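As a small head start, here's a sketch showing how the 2D result is laid out. Each column is its own evenly spaced range, and the optional axis parameter of linspace() controls where the samples go. The name output_rows is my own:

```python
import numpy as np

output = np.linspace([1, 5, 10], [10, 15, 20], 10)
print(output.shape)  # (10, 3)

# Each column matches a separate 1D call to linspace()
print(np.allclose(output[:, 1], np.linspace(5, 15, 10)))  # True

# axis=-1 puts the samples along the last axis instead of the first
output_rows = np.linspace([1, 5, 10], [10, 15, 20], 10, axis=-1)
print(output_rows.shape)  # (3, 10)
```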
You can use geomspace() to create a geometric progression instead, where the steps are logarithmic:
The output is the following:
[ 1. 1.66810054 2.7825594 4.64158883 7.74263683 12.91549665 21.5443469 35.93813664 59.94842503 100. ]
These values are evenly spaced on a log scale. In this case, each value is multiplied by 1.6681... to get the next one. And if you need to define the start and stop values in log space, you can use logspace():
This code gives the same output as the previous one that used geomspace():
[ 1. 1.66810054 2.7825594 4.64158883 7.74263683 12.91549665 21.5443469 35.93813664 59.94842503 100. ]
You can go and fetch your maths books from the attic now if your logarithm knowledge is a bit rusty!
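If the maths books are out of reach, here's a quick sketch that checks both claims numerically: the ratio between neighbouring values is constant, and geomspace(1, 100, 10) matches logspace(0, 2, 10). The variable names are only for illustration:

```python
import numpy as np

geometric = np.geomspace(1, 100, 10)

# Dividing each value by the previous one gives the same ratio every time
ratios = geometric[1:] / geometric[:-1]
print(ratios)

# That ratio is the ninth root of 100, since there are nine steps from 1 to 100
print(100 ** (1 / 9))

# logspace() takes exponents: 10**0 is 1 and 10**2 is 100 (base 10 by default)
logarithmic = np.logspace(0, 2, 10)
print(np.allclose(geometric, logarithmic))  # True
```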
It's also worth mentioning meshgrid(), mgrid, and ogrid, which are somewhat related (mgrid and ogrid are indexed with square brackets rather than called). But I'll refer you to yet another article I wrote in the era before I started The Python Coding Stack: numpy.meshgrid(): How And When Do You Use It? Are There Alternatives?
In Summary
Arav needs arange(). Arav needs a fixed step size. You use a start, stop and step value when you use arange(). It's up to you to decide the step size. But beware of odd results when using float step sizes or casting into specific data types. If in doubt, use linspace() and work out the number of values you need directly.
Linda needs linspace(). Linda wants her code to calculate the step size for her. She wants to ensure she can fit the number of booths she needs. You use a start value, a stop value, and the number of values needed when you call linspace().
And a summary of the summary:
arange() for when you have a desired step size
linspace() for when you have a desired number of values
But you may also choose to use linspace() when you have a desired step size by working out the number of values first.
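Here's one way to work out the number of values first, using Arav's 7.5m cones as the example. This is a sketch that assumes the step size divides the range evenly, as it does here. The variable names are my own:

```python
import numpy as np

start, stop, step = 0, 150, 7.5

# Number of gaps, rounded to the nearest whole number, plus one for the first cone.
# This assumes the step divides the range evenly, as it does here.
number_of_cones = round((stop - start) / step) + 1
print(number_of_cones)  # 21

aravs_cones = np.linspace(start, stop, number_of_cones)
print(aravs_cones[:3])
print(aravs_cones[-1])  # 150.0
```

Note that the stop value is now the real endpoint, 150, rather than 151, since linspace() includes it by default.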
Code in this article uses Python 3.12
Appendix: Code Blocks
Code Block #1
import numpy as np
aravs_cones = np.arange(0, 151, 10)
print(aravs_cones)
Code Block #2
import numpy as np
aravs_cones = np.arange(0, 150, 10)
print(aravs_cones)
Code Block #3
import numpy as np
aravs_cones = np.arange(0, 150.01, 10)
print(aravs_cones)
Code Block #4
import numpy as np
aravs_cones = np.arange(0, 150.01, 10)
print(aravs_cones.dtype)
Code Block #5
import numpy as np
aravs_cones = np.arange(0, 151, 10)
print(aravs_cones.dtype)
Code Block #6
aravs_cones = range(0, 151, 10)
print(list(aravs_cones))
Code Block #7
aravs_cones = range(0, 151, 7.5)
print(list(aravs_cones))
Code Block #8
import numpy as np
aravs_cones = np.arange(0, 151, 7.5)
print(aravs_cones)
Code Block #9
import numpy as np
aravs_cones = np.arange(0, 151, 7.5)
print(len(aravs_cones))
Code Block #10
import numpy as np
lindas_booths = np.linspace(0, 60, 17)
print(lindas_booths)
Code Block #11
import numpy as np
lindas_booths = np.linspace(0, 60, 17, endpoint=False)
print(lindas_booths)
Code Block #12
import numpy as np
lindas_booths = np.linspace(0, 60, 17, endpoint=False, retstep=True)
print(lindas_booths)
Code Block #13
import numpy as np
lindas_booths = np.linspace(0, 58, 17)
print(lindas_booths)
Code Block #14
import numpy as np
output = np.linspace([1, 5, 10], [10, 15, 20], 10)
print(output)
Code Block #15
import numpy as np
output = np.geomspace(1, 100, 10)
print(output)
Code Block #16
import numpy as np
output = np.logspace(0, 2, 10)
print(output)