NumPy 1.19.0 Release Notes

This NumPy release is marked by the removal of much technical debt: support for Python 2 has been removed, many deprecations have been expired, and documentation has been improved. The polishing of the random module continues apace with bug fixes and better usability from Cython.

The Python versions supported for this release are 3.6-3.8. Downstream developers should use Cython >= 0.29.16 for Python 3.8 support and OpenBLAS >= 3.7 to avoid problems on the Skylake architecture.

Highlights

  • Code compatibility with Python versions < 3.6 (including Python 2) was dropped from both the python and C code. The shims in numpy.compat will remain to support third-party packages, but they may be deprecated in a future release. Note that 1.19.x will not compile with earlier versions of Python due to the use of f-strings.

    (gh-15233)

Expired deprecations

numpy.insert and numpy.delete can no longer be passed an axis on 0d arrays

This concludes a deprecation from 1.9. Previously, when an axis argument was passed to numpy.insert or numpy.delete on a 0d array, both the axis and obj arguments were completely ignored. In these cases, insert(arr, "nonsense", 42, axis=0) would actually overwrite the entire array, while delete(arr, "nonsense", axis=0) would be equivalent to arr.copy().

Now passing axis on a 0d array raises numpy.AxisError.
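
The new behaviour can be sketched as follows. The variable names are illustrative only, and the example catches ValueError and IndexError (both base classes of AxisError) so that it does not depend on where AxisError is exposed in a given NumPy version.

```python
import numpy as np

scalar_arr = np.array(5)  # a 0d array

errors = []
for func, args in ((np.delete, (scalar_arr, 0)), (np.insert, (scalar_arr, 0, 42))):
    try:
        func(*args, axis=0)
    except (ValueError, IndexError) as exc:
        # AxisError derives from both ValueError and IndexError
        errors.append(type(exc).__name__)

print(errors)
```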

(gh-15802)

numpy.delete no longer ignores out-of-bounds indices

This concludes deprecations from 1.8 and 1.9, where np.delete would ignore both negative and out-of-bounds items in a sequence of indices. This was at odds with its behavior when passed a single index.

Now out-of-bounds items throw IndexError, and negative items index from the end.
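
A minimal sketch of both behaviours, using a small illustrative array:

```python
import numpy as np

arr = np.arange(5)  # [0, 1, 2, 3, 4]

# Negative items in a sequence now index from the end
trimmed = np.delete(arr, [-1, 0])
print(trimmed)

# Out-of-bounds items in a sequence now raise instead of being ignored
try:
    np.delete(arr, [1, 10])
    raised = False
except IndexError:
    raised = True
print(raised)
```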

(gh-15804)

numpy.insert and numpy.delete no longer accept non-integral indices

This concludes a deprecation from 1.9, where sequences of non-integer indices were allowed and cast to integers. Now passing sequences of non-integral indices raises IndexError, just like it does when passing a single non-integral scalar.
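
As a small illustration with numpy.delete (the array and index values are arbitrary):

```python
import numpy as np

arr = np.arange(5)

try:
    np.delete(arr, [1.0, 2.0])  # float indices are no longer cast to integers
    float_indices_rejected = False
except IndexError:
    float_indices_rejected = True

# Integer indices keep working as before
kept = np.delete(arr, [1, 2])
print(float_indices_rejected, kept)
```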

(gh-15805)

numpy.delete no longer casts boolean indices to integers

This concludes a deprecation from 1.8, where np.delete would cast boolean arrays and scalars passed as an index argument into integer indices. The behavior now is to treat boolean arrays as a mask, and to raise an error on boolean scalars.
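
A short sketch of the mask interpretation. The mask must have the same length as the axis being deleted from; entries marked True are removed.

```python
import numpy as np

arr = np.arange(5)
mask = np.array([True, False, True, False, False])

# True entries mark the elements to delete
remaining = np.delete(arr, mask)
print(remaining)
```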

(gh-15815)

Compatibility notes

Changed random variate stream from numpy.random.Generator.dirichlet

A bug in the generation of random variates for the Dirichlet distribution with small ‘alpha’ values was fixed by using a different algorithm when max(alpha) < 0.1. Because of the change, the stream of variates generated by dirichlet in this case will be different from previous releases.

(gh-14924)

Scalar promotion in PyArray_ConvertToCommonType

The promotion of mixed scalars and arrays in PyArray_ConvertToCommonType has been changed to adhere to the rules used by np.result_type. This means that input such as (1000, np.array([1], dtype=np.uint8)) will now return uint16 dtypes. In most cases the behaviour is unchanged. Note that the use of this C-API function is generally discouraged. This also fixes np.choose to behave the same way as the rest of NumPy in this respect.

(gh-14933)

Fasttake and fastputmask slots are deprecated and NULL’ed

The fasttake and fastputmask slots are now never used and must always be set to NULL. This will result in no change in behaviour. However, if a user dtype should set one of these a DeprecationWarning will be given.

(gh-14942)

np.ediff1d casting behaviour with to_end and to_begin

np.ediff1d now uses the "same_kind" casting rule for its additional to_end and to_begin arguments. This ensures type safety except when the input array has a smaller integer type than to_begin or to_end. In rare cases, the behaviour will be more strict than it was previously in 1.16 and 1.17. This is necessary to solve issues with floating point NaN.

(gh-14981)

Converting of empty array-like objects to NumPy arrays

Objects with len(obj) == 0 which implement an “array-like” interface (meaning an object implementing obj.__array__(), obj.__array_interface__, obj.__array_struct__, or the Python buffer interface) and which are also sequences (e.g. Pandas objects) will now always retain their shape correctly when converted to an array. Previously, such an object with a shape of (0, 1) could be converted into an array of shape (0,), losing all dimensions after the first 0.

(gh-14995)

Removed multiarray.int_asbuffer

As part of the continued removal of Python 2 compatibility, multiarray.int_asbuffer was removed. On Python 3, it threw a NotImplementedError and was unused internally. It is expected that there are no downstream use cases for this method with Python 3.

(gh-15229)

numpy.distutils.compat has been removed

This module contained only the function get_exception(), which was used as:

try:
    ...
except Exception:
    e = get_exception()

Its purpose was to handle the change in syntax introduced in Python 2.6, from except Exception, e: to except Exception as e:, meaning it was only necessary for codebases supporting Python 2.5 and older.
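
Code that still calls get_exception() can bind the exception directly in the except clause. A minimal sketch, where parse_int is a hypothetical helper used only for illustration:

```python
def parse_int(text):
    """Return int(text), or a short failure message if conversion fails."""
    try:
        return int(text)
    except Exception as e:  # binds the exception directly, no helper needed
        return "failed: {}".format(type(e).__name__)


print(parse_int("12"))
print(parse_int("twelve"))
```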

(gh-15255)

issubdtype no longer interprets float as np.floating

numpy.issubdtype had a FutureWarning since NumPy 1.14 which has expired now. This means that certain input where the second argument was neither a datatype nor a NumPy scalar type (such as a string or a python type like int or float) will now be consistent with passing in np.dtype(arg2).type. This makes the result consistent with expectations and leads to a false result in some cases which previously returned true.
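
For example, the Python type float is now treated as np.float64 rather than as the abstract np.floating:

```python
import numpy as np

# float is interpreted as np.dtype(float).type, i.e. np.float64
f32_vs_float = np.issubdtype(np.float32, float)
f64_vs_float = np.issubdtype(np.float64, float)

# Use the abstract type explicitly to match any floating-point dtype
f32_vs_floating = np.issubdtype(np.float32, np.floating)

print(f32_vs_float, f64_vs_float, f32_vs_floating)
```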

(gh-15773)

Change output of round on scalars to be consistent with Python

Output of the __round__ dunder method and consequently the Python built-in round has been changed to be a Python int to be consistent with calling it on Python float objects when called with no arguments. Previously, it would return a scalar of the np.dtype that was passed in.
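
A brief illustration with an arbitrary value. Note that passing an ndigits argument still yields a NumPy scalar.

```python
import numpy as np

value = np.float64(2.7)

no_digits = round(value)       # Python int, as with a Python float
two_digits = round(value, 2)   # still a NumPy floating scalar

print(type(no_digits).__name__, no_digits)
print(type(two_digits).__name__, two_digits)
```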

(gh-15840)

The numpy.ndarray constructor no longer interprets strides=() as strides=None

The former has changed to have the expected meaning of setting numpy.ndarray.strides to (), while the latter continues to result in strides being chosen automatically.

(gh-15882)

C-Level string to datetime casts changed

The C-level casts from strings were simplified. This change also fixes string to datetime and timedelta casts to behave correctly (i.e. like Python casts using string_arr.astype("M8")), while previously the cast would behave like string_arr.astype(np.int_).astype("M8"). This only affects code using low-level C-API to do manual casts (not full array casts) of single scalar values or using e.g. PyArray_GetCastFunc, and should thus not affect the vast majority of users.

(gh-16068)

SeedSequence with small seeds no longer conflicts with spawning

Small seeds (less than 2**96) were previously implicitly 0-padded out to 128 bits, the size of the internal entropy pool. When spawned, the spawn key was concatenated before the 0-padding. Since the first spawn key is (0,), small seeds before the spawn created the same states as the first spawned SeedSequence. Now, the seed is explicitly 0-padded out to the internal pool size before concatenating the spawn key. Spawned SeedSequences will produce different results than in the previous release. Unspawned SeedSequences will still produce the same results.
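
The intended behaviour can be checked with a small seed. This is only a sketch; the seed value 42 and the state length 4 are arbitrary choices.

```python
import numpy as np
from numpy.random import SeedSequence

parent = SeedSequence(42)      # a "small" seed, well below 2**96
child = parent.spawn(1)[0]     # first spawned child has spawn_key (0,)

parent_state = parent.generate_state(4)
child_state = child.generate_state(4)

# The parent and its first child should no longer share a state
print(child.spawn_key, np.array_equal(parent_state, child_state))
```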

(gh-16551)

Deprecations

Deprecate automatic dtype=object for ragged input

Calling np.array([[1], [1, 2, 3]]) will issue a DeprecationWarning as per NEP 34. Users should explicitly use dtype=object to avoid the warning.
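
A minimal sketch of the explicit form, using an illustrative ragged list:

```python
import numpy as np

# Rows of different lengths: request an object array explicitly
ragged = np.array([[1], [1, 2, 3]], dtype=object)

print(ragged.shape, ragged.dtype)
```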

(gh-15119)

Passing shape=0 to factory functions in numpy.rec is deprecated

0 is treated as a special case and is aliased to None in the functions:

  • numpy.core.records.fromarrays

  • numpy.core.records.fromrecords

  • numpy.core.records.fromstring

  • numpy.core.records.fromfile

In future, 0 will not be special cased, and will be treated as an array length like any other integer.

(gh-15217)

Deprecation of probably unused C-API functions

The following C-API functions are probably unused and have been deprecated:

  • PyArray_GetArrayParamsFromObject

  • PyUFunc_GenericFunction

  • PyUFunc_SetUsesArraysAsData

In most cases PyArray_GetArrayParamsFromObject should be replaced by converting to an array, while PyUFunc_GenericFunction can be replaced with PyObject_Call (see documentation for details).

(gh-15427)

Converting certain types to dtypes is Deprecated

The super classes of scalar types, such as np.integer, np.generic, or np.inexact will now give a deprecation warning when converted to a dtype (or used in a dtype keyword argument). The reason for this is that np.integer is converted to np.int_, while it would be expected to represent any integer (e.g. also int8, int16, etc.). For example, dtype=np.floating is currently identical to dtype=np.float64, even though np.float32 is also a subclass of np.floating.

(gh-15534)

Deprecation of round for np.complexfloating scalars

Output of the __round__ dunder method and consequently the Python built-in round has been deprecated on complex scalars. This does not affect np.round.

(gh-15840)

numpy.ndarray.tostring() is deprecated in favor of tobytes()

numpy.ndarray.tobytes has existed since the 1.9 release, but until this release numpy.ndarray.tostring emitted no warning. The change to emit a warning brings NumPy in line with the builtin array.array methods of the same name.
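
Migrating is a matter of renaming the call. A small round-trip sketch with an arbitrary uint8 array:

```python
import numpy as np

arr = np.arange(4, dtype=np.uint8)

raw = arr.tobytes()  # replaces arr.tostring()
restored = np.frombuffer(raw, dtype=np.uint8)

print(raw, restored)
```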

(gh-15867)

C API changes

Better support for const dimensions in API functions

The following functions now accept a constant array of npy_intp:

  • PyArray_BroadcastToShape

  • PyArray_IntTupleFromIntp

  • PyArray_OverflowMultiplyList

Previously the caller would have to cast away the const-ness to call these functions.

(gh-15251)

Const qualify UFunc inner loops

UFuncGenericFunction now expects pointers to const dimension and strides as arguments. This means inner loops may no longer modify either dimension or strides. This change leads to an incompatible-pointer-types warning forcing users to either ignore the compiler warnings or to const qualify their own loop signatures.

(gh-15355)

New Features

numpy.frompyfunc now accepts an identity argument

This allows the numpy.ufunc.identity attribute to be set on the resulting ufunc, meaning it can be used for empty and multi-dimensional calls to numpy.ufunc.reduce.
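
A sketch of what this enables. py_add is a hypothetical Python function wrapped for illustration, and the inputs are object arrays because ufuncs created by frompyfunc operate on objects.

```python
import numpy as np


def py_add(a, b):
    return a + b


add_obj = np.frompyfunc(py_add, 2, 1, identity=0)

# Reducing an empty array returns the identity instead of raising
empty_total = add_obj.reduce(np.array([], dtype=object))

# Reducing over several axes at once is also possible with an identity
grid = np.arange(6).reshape(2, 3).astype(object)
grid_total = add_obj.reduce(grid, axis=(0, 1))

print(add_obj.identity, empty_total, grid_total)
```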

(gh-8255)

np.str_ scalars now support the buffer protocol

np.str_ arrays are always stored as UCS4, so the corresponding scalars now expose this through the buffer interface, meaning memoryview(np.str_('test')) now works.
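
For instance, a four-character string occupies four bytes per character in the exported buffer:

```python
import numpy as np

view = memoryview(np.str_("test"))

# 4 characters stored as UCS4, i.e. 4 bytes each
print(view.nbytes)
```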

(gh-15385)

subok option for numpy.copy

A new kwarg, subok, was added to numpy.copy to allow users to toggle the behavior of numpy.copy with respect to array subclasses. The default value is False which is consistent with the behavior of numpy.copy for previous numpy versions. To create a copy that preserves an array subclass with numpy.copy, call np.copy(arr, subok=True). This addition better documents that the default behavior of numpy.copy differs from the numpy.ndarray.copy method which respects array subclasses by default.
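
The difference can be seen with a trivial subclass. TaggedArray is a hypothetical class defined only for this illustration.

```python
import numpy as np


class TaggedArray(np.ndarray):
    """Minimal ndarray subclass used to observe subclass preservation."""


tagged = np.arange(3).view(TaggedArray)

plain_copy = np.copy(tagged)            # default subok=False: base ndarray
sub_copy = np.copy(tagged, subok=True)  # subclass is preserved
method_copy = tagged.copy()             # the method preserves it by default

print(type(plain_copy).__name__, type(sub_copy).__name__, type(method_copy).__name__)
```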

(gh-15685)

numpy.linalg.multi_dot now accepts an out argument

out can be used to avoid creating unnecessary copies of the final product computed by numpy.linalg.multi_dot.
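
A minimal sketch with arbitrary shapes. The out array is assumed to need the exact shape and dtype of the result and to be C-contiguous.

```python
import numpy as np
from numpy.linalg import multi_dot

a = np.ones((3, 4))
b = np.ones((4, 5))
c = np.ones((5, 2))

# Preallocated result buffer with the shape of a @ b @ c
result = np.empty((3, 2))
multi_dot([a, b, c], out=result)

print(result)
```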

(gh-15715)

keepdims parameter for numpy.count_nonzero

The parameter keepdims was added to numpy.count_nonzero. The parameter has the same meaning as it does in reduction functions such as numpy.sum or numpy.mean.
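
For example, counting along an axis while keeping that axis as a length-one dimension:

```python
import numpy as np

data = np.array([[0, 1, 2],
                 [0, 0, 3]])

# The reduced axis is kept, so the result broadcasts against data
counts = np.count_nonzero(data, axis=1, keepdims=True)

print(counts.shape)
print(counts)
```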

(gh-15870)

equal_nan parameter for numpy.array_equal

The keyword argument equal_nan was added to numpy.array_equal. equal_nan is a boolean value that toggles whether or not nan values are considered equal in comparison (default is False). This matches API used in related functions such as numpy.isclose and numpy.allclose.
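
A short illustration with two arrays holding NaN in the same position:

```python
import numpy as np

left = np.array([1.0, np.nan])
right = left.copy()

strict = np.array_equal(left, right)                   # NaN != NaN by default
lenient = np.array_equal(left, right, equal_nan=True)  # NaNs compare equal

print(strict, lenient)
```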

(gh-16128)

Improvements

Improve detection of CPU features

Replace npy_cpu_supports which was a gcc specific mechanism to test support of AVX with more general functions npy_cpu_init and npy_cpu_have, and expose the results via a NPY_CPU_HAVE c-macro as well as a python-level __cpu_features__ dictionary.

(gh-13421)

Use 64-bit integer size on 64-bit platforms in fallback lapack_lite

Use 64-bit integer size on 64-bit platforms in the fallback LAPACK library, which is used when the system has no LAPACK installed, allowing it to deal with linear algebra for large arrays.

(gh-15218)

Use AVX512 intrinsic to implement np.exp when input is np.float64

Use AVX512 intrinsic to implement np.exp when input is np.float64, which can make np.exp with np.float64 input 5-7x faster than before. The _multiarray_umath.so module has grown about 63 KB on linux64.

(gh-15648)

Ability to disable madvise hugepages

On Linux, NumPy has previously added support for madvise hugepages, which can improve performance for very large arrays. Unfortunately, on older kernel versions this led to performance regressions, so by default the support has been disabled on kernels before version 4.6. To override the default, you can use the environment variable:

NUMPY_MADVISE_HUGEPAGE=0

or set it to 1 to force enabling support. Note that this only makes a difference if the operating system is set up to use madvise transparent hugepage.

(gh-15769)

numpy.einsum accepts NumPy int64 type in subscript list

There is no longer a type error thrown when numpy.einsum is passed a NumPy int64 array as its subscript list.

(gh-16080)

np.logaddexp2.identity changed to -inf

The ufunc numpy.logaddexp2 now has an identity of -inf, allowing it to be called on empty sequences. This matches the identity of numpy.logaddexp.
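
A quick check of the identity and of a reduction over an empty array:

```python
import numpy as np

identity = np.logaddexp2.identity

# Reducing an empty float array returns the identity
empty_result = np.logaddexp2.reduce(np.array([]))

print(identity, empty_result)
```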

(gh-16102)

Changes

Remove handling of extra argument to __array__

A code path and test have been in the code since NumPy 0.4 for a two-argument variant of __array__(dtype=None, context=None). It was activated when calling ufunc(op) or ufunc.reduce(op) if op.__array__ existed. However that variant is not documented, and it is not clear what the intention was for its use. It has been removed.

(gh-15118)

numpy.random._bit_generator moved to numpy.random.bit_generator

In order to expose numpy.random.BitGenerator and numpy.random.SeedSequence to Cython, the _bit_generator module is now public as numpy.random.bit_generator.
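
From Python, the public module can be imported directly. A minimal sketch, with PCG64 and the seed 123 chosen only for illustration:

```python
from numpy.random import PCG64
from numpy.random.bit_generator import BitGenerator, SeedSequence

# Concrete bit generators such as PCG64 derive from BitGenerator
bit_gen = PCG64(SeedSequence(123))

print(isinstance(bit_gen, BitGenerator))
```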

Cython access to the random distributions is provided via a pxd file

c_distributions.pxd provides access to the c functions behind many of the random distributions from Cython, making it convenient to use and extend them.

(gh-15463)

Fixed eigh and cholesky methods in numpy.random.multivariate_normal

Previously, when passing method='eigh' or method='cholesky', numpy.random.multivariate_normal produced samples from the wrong distribution. This is now fixed.
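
One way to sanity-check the sampling is to compare the sample covariance with the requested one. This sketch assumes the method keyword of Generator.multivariate_normal; the seed, covariance matrix, sample size and tolerance are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)
mean = [0.0, 0.0]
cov = np.array([[2.0, 0.5],
                [0.5, 1.0]])

sample_covs = {}
for method in ("eigh", "cholesky"):
    samples = rng.multivariate_normal(mean, cov, size=20000, method=method)
    # Empirical covariance, with each column treated as one variable
    sample_covs[method] = np.cov(samples, rowvar=False)

for method, estimate in sample_covs.items():
    print(method, np.allclose(estimate, cov, atol=0.1))
```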

(gh-15872)

Fixed the jumping implementation in MT19937.jumped

This fix changes the stream produced from jumped MT19937 generators. It does not affect the stream produced using RandomState or MT19937 that are directly seeded.

The translation of the jumping code for the MT19937 contained a reversed loop ordering. MT19937.jumped now matches Makoto Matsumoto’s original implementation of the Horner and Sliding Window jump methods.

(gh-16153)