NumPy 1.17.0 Release Notes#
This NumPy release contains a number of new features that should substantially improve its performance and usefulness; see Highlights below for a summary. The Python versions supported are 3.5-3.7; note that Python 2.7 has been dropped. Python 3.8b2 should work with the released source packages, but there are no future guarantees.
Downstream developers should use Cython >= 0.29.11 for Python 3.8 support and OpenBLAS >= 3.7 (not currently out) to avoid problems on the Skylake architecture. The NumPy wheels on PyPI are built from the OpenBLAS development branch in order to avoid those problems.
Highlights#
- A new extensible random module along with four selectable random number generators and improved seeding designed for use in parallel processes has been added. The currently available bit generators are MT19937, PCG64, Philox, and SFC64. See below under New Features.
- NumPy’s FFT implementation was changed from fftpack to pocketfft, resulting in faster, more accurate transforms and better handling of datasets of prime length. See below under Improvements.
- New radix sort and timsort sorting methods. It is currently not possible to choose which will be used. They are hardwired to the datatype and used when either stable or mergesort is passed as the method. See below under Improvements.
- Overriding numpy functions is now possible by default, see __array_function__ below.
New functions#
numpy.errstate is now also a function decorator
Deprecations#
numpy.polynomial functions warn when passed float in place of int#
Previously functions in this module would accept float values provided they were integral (1.0, 2.0, etc). For consistency with the rest of numpy, doing so is now deprecated, and in future will raise a TypeError. Similarly, passing a float like 0.5 in place of an integer will now raise a TypeError instead of the previous ValueError.
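For example, a sketch using polyvander as one representative function (the degree argument must now be an integer):
>>> from numpy.polynomial import polynomial as P
>>> P.polyvander([0, 1, 2], 2.0)   # DeprecationWarning: pass an int degree
>>> P.polyvander([0, 1, 2], 2)     # preferred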
Deprecate numpy.distutils.exec_command and temp_file_name#
The internal use of these functions has been refactored and there are better alternatives. Replace exec_command with subprocess.Popen and temp_file_name with tempfile.mkstemp.
Writeable flag of C-API wrapped arrays#
When an array is created from the C-API to wrap a pointer to data, the only
indication we have of the read-write nature of the data is the writeable
flag set during creation. It is dangerous to force the flag to writeable.
In the future it will not be possible to switch the writeable flag to True
from python.
This deprecation should not affect many users since arrays created in such
a manner are very rare in practice and only available through the NumPy C-API.
numpy.nonzero should no longer be called on 0d arrays#
The behavior of numpy.nonzero on 0d arrays was surprising, making uses of it almost always incorrect. If the old behavior was intended, it can be preserved without a warning by using nonzero(atleast_1d(arr)) instead of nonzero(arr). In a future release, it is most likely this will raise a ValueError.
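For example:
>>> arr = np.array(1)
>>> np.nonzero(np.atleast_1d(arr))   # explicit, no deprecation warning
(array([0]),)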
Writing to the result of numpy.broadcast_arrays will warn#
Commonly numpy.broadcast_arrays returns a writeable array with internal overlap, making it unsafe to write to. A future version will set the writeable flag to False, and require users to manually set it to True if they are sure that is what they want to do. Now writing to it will emit a deprecation warning with instructions to set the writeable flag True. Note that if one were to inspect the flag before setting it, one would find it would already be True. Explicitly setting it, though, as one will need to do in future versions, clears an internal flag that is used to produce the deprecation warning. To help alleviate confusion, an additional FutureWarning will be emitted when accessing the writeable flag state to clarify the contradiction.
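A short sketch of the explicit opt-in (array contents here are illustrative):
>>> x, y = np.broadcast_arrays(np.arange(3), np.zeros((2, 1)))
>>> x.flags.writeable = True   # explicit opt-in clears the internal warning flag
>>> x[...] = 0                 # no deprecation warning after the opt-in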
Note that for the C-side buffer protocol such an array will return a readonly buffer immediately unless a writable buffer is requested. If a writeable buffer is requested a warning will be given. When using Cython, the const qualifier should be used with such arrays to avoid the warning (e.g. cdef const double[::1] view).
Future Changes#
Shape-1 fields in dtypes won’t be collapsed to scalars in a future version#
Currently, a field specified as [(name, dtype, 1)] or "1type" is interpreted as a scalar field (i.e., the same as [(name, dtype)] or [(name, dtype, ())]). This now raises a FutureWarning; in a future version, it will be interpreted as a shape-(1,) field, i.e. the same as [(name, dtype, (1,))] or "(1,)type" (consistently with [(name, dtype, n)] / "ntype" with n>1, which is already equivalent to [(name, dtype, (n,))] / "(n,)type").
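For example:
>>> np.dtype([('a', np.float64, 1)])     # FutureWarning: will mean shape (1,)
>>> np.dtype([('a', np.float64)])        # unambiguous: scalar field
>>> np.dtype([('a', np.float64, (1,))])  # unambiguous: shape-(1,) field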
Compatibility notes#
float16 subnormal rounding#
Casting from a different floating point precision to float16
used incorrect
rounding in some edge cases. This means in rare cases, subnormal results will
now be rounded up instead of down, changing the last bit (ULP) of the result.
Signed zero when using divmod#
Starting in version 1.12.0, numpy incorrectly returned a negatively signed zero when using the divmod and floor_divide functions when the result was zero. For example:
>>> np.zeros(10)//1
array([-0., -0., -0., -0., -0., -0., -0., -0., -0., -0.])
With this release, the result is correctly returned as a positively signed zero:
>>> np.zeros(10)//1
array([0., 0., 0., 0., 0., 0., 0., 0., 0., 0.])
MaskedArray.mask now returns a view of the mask, not the mask itself#
Returning the mask itself was unsafe, as it could be reshaped in place which
would violate expectations of the masked array code. The behavior of mask is now consistent with data, which also returns a view. The underlying mask can still be accessed with ._mask if it is needed.
Tests that contain assert x.mask is not y.mask
or similar will need to be
updated.
Do not look up __buffer__ attribute in numpy.frombuffer#
Looking up the __buffer__ attribute in numpy.frombuffer was undocumented and non-functional. This code was removed. If needed, use frombuffer(memoryview(obj), ...) instead.
out is buffered for memory overlaps in take, choose, put#
If the out argument to these functions is provided and has memory overlap with the other arguments, it is now buffered to avoid order-dependent behavior.
Unpickling while loading requires explicit opt-in#
The functions load and lib.format.read_array take an allow_pickle keyword which now defaults to False in response to CVE-2019-6446.
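For example, loading a file containing object arrays now requires the explicit opt-in (the filename here is illustrative):
>>> np.load('objects.npy')                     # raises ValueError for object arrays
>>> np.load('objects.npy', allow_pickle=True)  # explicit opt-in, for trusted files only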
Potential changes to the random stream in old random module#
Due to bugs in the application of log to random floating point numbers, the stream may change when sampling from beta, binomial, laplace, logistic, logseries or multinomial if a 0 is generated in the underlying MT19937 random stream. There is a 1 in \(10^{53}\) chance of this occurring, so the probability that the stream changes for any given seed is extremely small. If a 0 is encountered in the underlying generator, then the incorrect value produced (either numpy.inf or numpy.nan) is now dropped.
i0 now always returns a result with the same shape as the input#
Previously, the output was squeezed, such that, e.g., input with just a single
element would lead to an array scalar being returned, and inputs with shapes
such as (10, 1)
would yield results that would not broadcast against the
input.
Note that we generally recommend the SciPy implementation over the numpy one: it is a proper ufunc written in C, and more than an order of magnitude faster.
can_cast no longer assumes all unsafe casting is allowed#
Previously, can_cast returned True for almost all inputs for casting='unsafe', even for cases where casting was not possible, such as from a structured dtype to a regular one. This has been fixed, making it more consistent with actual casting using, e.g., the .astype method.
ndarray.flags.writeable can be switched to true slightly more often#
In rare cases, it was not possible to switch an array from not writeable
to writeable, although a base array is writeable. This can happen if an
intermediate ndarray.base
object is writeable. Previously, only the deepest
base object was considered for this decision. However, in rare cases this
object does not have the necessary information. In that case switching to
writeable was never allowed. This has now been fixed.
C API changes#
dimension or stride input arguments are now passed by npy_intp const*#
Previously these function arguments were declared as the more strict npy_intp*, which prevented the caller passing constant data. This change is backwards compatible, but now allows code like:
npy_intp const fixed_dims[] = {1, 2, 3};
// no longer complains that the const-qualifier is discarded
npy_intp size = PyArray_MultiplyList(fixed_dims, 3);
New Features#
New extensible numpy.random module with selectable random number generators#
A new extensible numpy.random module along with four selectable random number generators and improved seeding designed for use in parallel processes has been added. The currently available Bit Generators are MT19937, PCG64, Philox, and SFC64. PCG64 is the new default while MT19937 is retained for backwards compatibility. Note that the legacy random module is unchanged and is now frozen; your current results will not change. More information is available in the API change description and in the top-level view documentation.
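For example, a minimal sketch of the new interface (the seed is illustrative):
>>> from numpy.random import default_rng, Generator, PCG64
>>> rng = default_rng(12345)          # Generator backed by the default PCG64 bit generator
>>> vals = rng.standard_normal(2)
>>> same = Generator(PCG64(12345))    # equivalent explicit construction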
libFLAME#
Support for building NumPy with the libFLAME linear algebra package as the LAPACK implementation; see libFLAME for details.
User-defined BLAS detection order#
distutils now uses an environment variable, comma-separated and case insensitive, to determine the detection order for BLAS libraries. By default NPY_BLAS_ORDER=mkl,blis,openblas,atlas,accelerate,blas. To force the use of OpenBLAS:
NPY_BLAS_ORDER=openblas python setup.py build
This may be helpful for users who have an MKL installation but wish to try out different implementations.
User-defined LAPACK detection order#
numpy.distutils now uses an environment variable, comma-separated and case insensitive, to determine the detection order for LAPACK libraries. By default NPY_LAPACK_ORDER=mkl,openblas,flame,atlas,accelerate,lapack. To force the use of OpenBLAS:
NPY_LAPACK_ORDER=openblas python setup.py build
This may be helpful for users who have an MKL installation but wish to try out different implementations.
Timsort and radix sort have replaced mergesort for stable sorting#
Both radix sort and timsort have been implemented and are now used in place of mergesort. Due to the need to maintain backward compatibility, the sorting kind options "stable" and "mergesort" have been made aliases of each other, with the actual sort implementation depending on the array type. Radix sort is used for small integer types of 16 bits or less and timsort for the remaining types. Timsort features improved performance on already or nearly sorted data, performs like mergesort on random data, and requires \(O(n/2)\) working space. Details of the timsort algorithm can be found at CPython listsort.txt.
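For example, both spellings now select the implementation appropriate to the dtype:
>>> np.sort(np.array([3, 1, 2], dtype=np.int16), kind='stable')   # radix sort
array([1, 2, 3], dtype=int16)
>>> np.sort(np.array([3.0, 1.0, 2.0]), kind='mergesort')          # timsort
array([1., 2., 3.])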
packbits and unpackbits accept an order keyword#
The order keyword defaults to big, and will order the bits accordingly. With order='big', 3 becomes [0, 0, 0, 0, 0, 0, 1, 1]; with order='little', it becomes [1, 1, 0, 0, 0, 0, 0, 0].
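A sketch of the two orderings, assuming the keyword spelling given in this note:
>>> a = np.array([3], dtype=np.uint8)
>>> np.unpackbits(a)                   # default big-endian bit order
array([0, 0, 0, 0, 0, 0, 1, 1], dtype=uint8)
>>> np.unpackbits(a, order='little')   # keyword name as spelled in this note
array([1, 1, 0, 0, 0, 0, 0, 0], dtype=uint8)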
unpackbits now accepts a count parameter#
count allows subsetting the number of bits that will be unpacked up-front, rather than reshaping and subsetting later, making the packbits operation invertible, and the unpacking less wasteful.
invertible, and the unpacking less wasteful. Counts larger than the number of
available bits add zero padding. Negative counts trim bits off the end instead
of counting from the beginning. None counts implement the existing behavior of
unpacking everything.
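For example:
>>> a = np.array([7], dtype=np.uint8)   # 7 == 0b00000111
>>> np.unpackbits(a, count=3)           # only the first three bits
array([0, 0, 0], dtype=uint8)
>>> np.unpackbits(a, count=-2)          # trim two bits off the end
array([0, 0, 0, 0, 0, 1], dtype=uint8)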
linalg.svd and linalg.pinv can be faster on hermitian inputs#
These functions now accept a hermitian argument, matching the one added to linalg.matrix_rank in 1.14.0.
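For example (the matrix here is illustrative and must actually be hermitian for the result to be valid):
>>> a = np.array([[2.0, 1.0], [1.0, 3.0]])   # real symmetric, hence hermitian
>>> np.linalg.pinv(a, hermitian=True)        # uses a faster eigenvalue-based path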
divmod operation is now supported for two timedelta64 operands#
The divmod operator now handles two timedelta64 operands, with type signature mm->qm.
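For example:
>>> divmod(np.timedelta64(7, 's'), np.timedelta64(3, 's'))
(2, numpy.timedelta64(1,'s'))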
fromfile now takes an offset argument#
This function now takes an offset keyword argument for binary files, which specifies the offset (in bytes) from the file’s current position. Defaults to 0.
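For example, to skip a hypothetical 8-byte header (the filename and layout are illustrative):
data = np.fromfile('records.bin', dtype=np.float64, offset=8)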
New mode “empty” for pad#
This mode pads an array to a desired shape without initializing the new entries.
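For example:
>>> a = np.arange(4)
>>> np.pad(a, 2, mode='empty')   # shape (8,); the padded entries are uninitialized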
Floating point scalars implement as_integer_ratio to match the builtin float#
This returns a (numerator, denominator) pair, which can be used to construct a fractions.Fraction.
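For example:
>>> np.float64(0.25).as_integer_ratio()
(1, 4)
>>> from fractions import Fraction
>>> Fraction(*np.float16(0.1).as_integer_ratio())   # exact value of the float16
Fraction(819, 8192)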
Structured dtype objects can be indexed with multiple field names#
arr.dtype[['a', 'b']] now returns a dtype that is equivalent to arr[['a', 'b']].dtype, for consistency with arr.dtype['a'] == arr['a'].dtype. Like the dtype of structured arrays indexed with a list of fields, this dtype has the same itemsize as the original, but only keeps a subset of the fields. This means that arr[['a', 'b']] and arr.view(arr.dtype[['a', 'b']]) are equivalent.
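For example:
>>> dt = np.dtype([('a', 'i4'), ('b', 'f4'), ('c', 'f8')])
>>> dt[['a', 'c']].names
('a', 'c')
>>> dt[['a', 'c']].itemsize == dt.itemsize   # itemsize is preserved
True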
.npy files support unicode field names#
A new format version of 3.0 has been introduced, which enables structured types with non-latin1 field names. This is used automatically when needed.
Improvements#
Array comparison assertions include maximum differences#
Error messages from array comparison tests such as
testing.assert_allclose
now include “max absolute difference” and
“max relative difference,” in addition to the previous “mismatch” percentage.
This information makes it easier to update absolute and relative error
tolerances.
Replacement of the fftpack-based fft module by the pocketfft library#
Both implementations have the same ancestor (Fortran77 FFTPACK by Paul N. Swarztrauber), but pocketfft contains additional modifications which improve both accuracy and performance in some circumstances. For FFT lengths containing large prime factors, pocketfft uses Bluestein’s algorithm, which maintains \(O(N \log N)\) run time complexity instead of deteriorating towards \(O(N^2)\) for prime lengths. Also, accuracy for real-valued FFTs with near-prime lengths has improved and is on par with complex-valued FFTs.
Further improvements to ctypes support in numpy.ctypeslib#
A new numpy.ctypeslib.as_ctypes_type function has been added, which can be used to convert a dtype into a best-guess ctypes type. Thanks to this new function, numpy.ctypeslib.as_ctypes now supports a much wider range of array types, including structures, booleans, and integers of non-native endianness.
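For example:
>>> np.ctypeslib.as_ctypes_type(np.dtype(np.float64))
<class 'ctypes.c_double'>
>>> np.ctypeslib.as_ctypes_type(np.dtype([('x', np.int32), ('y', np.float64)]))  # a ctypes.Structure subclass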
numpy.errstate is now also a function decorator#
Currently, if you have a function like:
def foo():
    pass
and you want to wrap the whole thing in errstate, you have to rewrite it like so:
def foo():
    with np.errstate(...):
        pass
but with this change, you can do:
@np.errstate(...)
def foo():
    pass
thereby saving a level of indentation.
numpy.exp and numpy.log speed up for float32 implementation#
The float32 implementation of exp and log now benefits from the AVX2/AVX512 instruction sets, which are detected at runtime. exp has a max ulp error of 2.52 and log has a max ulp error of 3.83.
Improve performance of numpy.pad#
The performance of the function has been improved for most cases by filling in a preallocated array with the desired padded shape instead of using concatenation.
numpy.interp handles infinities more robustly#
In some cases where interp would previously return nan, it now returns an appropriate infinity.
Pathlib support for fromfile, tofile and ndarray.dump#
fromfile, ndarray.tofile and ndarray.dump now support the pathlib.Path type for the file/fid parameter.
Specialized isnan, isinf, and isfinite ufuncs for bool and int types#
The boolean and integer types are incapable of storing nan and inf values, which allows us to provide specialized ufuncs that are up to 250x faster than the previous approach.
isfinite supports datetime64 and timedelta64 types#
Previously, isfinite raised a TypeError when used on these two types.
New keywords added to nan_to_num#
nan_to_num now accepts the keywords nan, posinf and neginf, allowing the user to define the value to replace the nan, positive and negative np.inf values respectively.
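For example:
>>> x = np.array([np.nan, np.inf, -np.inf])
>>> np.nan_to_num(x, nan=0.0, posinf=1e6, neginf=-1e6)
array([       0.,  1000000., -1000000.])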
MemoryErrors caused by allocating overly large arrays are more descriptive#
Often the cause of a MemoryError is incorrect broadcasting, which results in a very large and incorrect shape. The message of the error now includes this shape to help diagnose the cause of failure.
floor, ceil, and trunc now respect builtin magic methods#
These ufuncs now call the __floor__, __ceil__, and __trunc__ methods when called on object arrays, making them compatible with decimal.Decimal and fractions.Fraction objects.
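For example:
>>> from decimal import Decimal
>>> a = np.array([Decimal('1.5'), Decimal('-1.5')], dtype=object)
>>> np.floor(a)   # calls Decimal.__floor__ elementwise
array([1, -2], dtype=object)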
quantile now works on fractions.Fraction and decimal.Decimal objects#
In general, this handles object arrays more gracefully, and avoids floating-point operations if exact arithmetic types are used.
Support of object arrays in matmul#
It is now possible to use matmul (or the @ operator) with object arrays. For instance, it is now possible to do:
from fractions import Fraction
a = np.array([[Fraction(1, 2), Fraction(1, 3)], [Fraction(1, 3), Fraction(1, 2)]])
b = a @ a
Changes#
median and percentile family of functions no longer warn about nan#
numpy.median, numpy.percentile, and numpy.quantile used to emit a RuntimeWarning when encountering a nan. Since they return the nan value, the warning is redundant and has been removed.
timedelta64 % 0 behavior adjusted to return NaT#
The modulus operation with two np.timedelta64 operands now returns NaT in the case of division by zero, rather than returning zero.
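For example:
>>> np.timedelta64(5, 's') % np.timedelta64(0, 's')   # may also emit a RuntimeWarning
numpy.timedelta64('NaT','s')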
NumPy functions now always support overrides with __array_function__#
NumPy now always checks the __array_function__ method to implement overrides of NumPy functions on non-NumPy arrays, as described in NEP 18. The feature was available for testing with NumPy 1.16 if appropriate environment variables were set, but is now always enabled.
lib.recfunctions.structured_to_unstructured does not squeeze single-field views#
Previously structured_to_unstructured(arr[['a']]) would produce a squeezed result inconsistent with structured_to_unstructured(arr[['a', 'b']]). This was accidental. The old behavior can be retained with structured_to_unstructured(arr[['a']]).squeeze(axis=-1) or, far more simply, arr['a'].
clip now uses a ufunc under the hood#
This means that registering clip functions for custom dtypes in C via descr->f->fastclip is deprecated; custom dtypes should use the ufunc registration mechanism instead, attaching to the np.core.umath.clip ufunc. It also means that clip accepts where and casting arguments, and can be overridden with __array_ufunc__.
A consequence of this change is that some behaviors of the old clip have been deprecated:
- Passing nan to mean “do not clip” as one or both bounds. This didn’t work in all cases anyway, and can be better handled by passing infinities of the appropriate sign.
- Using “unsafe” casting by default when an out argument is passed. Using casting="unsafe" explicitly will silence this warning.
Additionally, there are some corner cases with behavior changes:
- Padding max < min has changed to be more consistent across dtypes, but should not be relied upon.
- Scalar min and max take part in promotion rules like they do in all other ufuncs.
__array_interface__ offset now works as documented#
The interface may use an offset value that was previously mistakenly ignored.
Pickle protocol in savez set to 3 to force zip64 flag#
savez was not using the force_zip64 flag, which limited the size of the archive to 2GB. But using the flag requires us to use pickle protocol 3 to write object arrays. The protocol used was bumped to 3, meaning the archive will be unreadable by Python 2.
Structured arrays indexed with non-existent fields raise KeyError not ValueError#
arr['bad_field'] on a structured type raises KeyError, for consistency with dict['bad_field'].