NumPy

NumPy 1.11.0 Release Notes

This release supports Python 2.6 - 2.7 and 3.2 - 3.5 and contains a number of enhancements and improvements. Note also the build system changes listed below as they may have subtle effects.

No Windows (TM) binaries are provided for this release due to a broken toolchain. One of the providers of Python packages for Windows (TM) is your best bet.

Highlights

Details of these improvements can be found below.

  • The datetime64 type is now timezone naive.

  • A dtype parameter has been added to randint.

  • Improved detection of two arrays possibly sharing memory.

  • Automatic bin size estimation for np.histogram.

  • Speed optimization of A @ A.T and dot(A, A.T).

  • New function np.moveaxis for reordering array axes.

Build System Changes

  • Numpy now uses setuptools for its builds instead of plain distutils. This fixes usage of install_requires='numpy' in the setup.py files of projects that depend on Numpy (see gh-6551). It potentially affects the way that build/install methods for Numpy itself behave though. Please report any unexpected behavior on the Numpy issue tracker.

  • Bento build support and related files have been removed.

  • Single file build support and related files have been removed.

Future Changes

The following changes are scheduled for Numpy 1.12.0.

  • Support for Python 2.6, 3.2, and 3.3 will be dropped.

  • Relaxed stride checking will become the default. See the 1.8.0 release notes for a more extended discussion of what this change implies.

  • The behavior of the datetime64 “not a time” (NaT) value will be changed to match that of floating point “not a number” (NaN) values: all comparisons involving NaT will return False, except for NaT != NaT which will return True.

  • Indexing with floats will raise IndexError, e.g., a[0, 0.0].

  • Indexing with non-integer array_like will raise IndexError, e.g., a['1', '2']

  • Indexing with multiple ellipsis will raise IndexError, e.g., a[..., ...].

  • Non-integers used as index values will raise TypeError, e.g., in reshape, take, and specifying reduce axis.

In a future release the following changes will be made.

  • The rand function exposed in numpy.testing will be removed. That function is left over from early Numpy and was implemented using the Python random module. The random number generators from numpy.random should be used instead.

  • The ndarray.view method will only allow c_contiguous arrays to be viewed using a dtype of different size causing the last dimension to change. That differs from the current behavior where arrays that are f_contiguous but not c_contiguous can be viewed as a dtype type of different size causing the first dimension to change.

  • Slicing a MaskedArray will return views of both data and mask. Currently the mask is copy-on-write and changes to the mask in the slice do not propagate to the original mask. See the FutureWarnings section below for details.

Compatibility notes

datetime64 changes

In prior versions of NumPy the experimental datetime64 type always stored times in UTC. By default, creating a datetime64 object from a string or printing it would convert from or to local time:

# old behavior
>>>> np.datetime64('2000-01-01T00:00:00')
numpy.datetime64('2000-01-01T00:00:00-0800')  # note the timezone offset -08:00

A consensus of datetime64 users agreed that this behavior is undesirable and at odds with how datetime64 is usually used (e.g., by pandas). For most use cases, a timezone naive datetime type is preferred, similar to the datetime.datetime type in the Python standard library. Accordingly, datetime64 no longer assumes that input is in local time, nor does it print local times:

>>>> np.datetime64('2000-01-01T00:00:00')
numpy.datetime64('2000-01-01T00:00:00')

For backwards compatibility, datetime64 still parses timezone offsets, which it handles by converting to UTC. However, the resulting datetime is timezone naive:

>>> np.datetime64('2000-01-01T00:00:00-08')
DeprecationWarning: parsing timezone aware datetimes is deprecated;
this will raise an error in the future
numpy.datetime64('2000-01-01T08:00:00')

As a corollary to this change, we no longer prohibit casting between datetimes with date units and datetimes with time units. With timezone naive datetimes, the rule for casting from dates to times is no longer ambiguous.

linalg.norm return type changes

The return type of the linalg.norm function is now floating point without exception. Some of the norm types previously returned integers.

polynomial fit changes

The various fit functions in the numpy polynomial package no longer accept non-integers for degree specification.

np.dot now raises TypeError instead of ValueError

This behaviour mimics that of other functions such as np.inner. If the two arguments cannot be cast to a common type, it could have raised a TypeError or ValueError depending on their order. Now, np.dot will now always raise a TypeError.

FutureWarning to changed behavior

  • In np.lib.split an empty array in the result always had dimension (0,) no matter the dimensions of the array being split. This has been changed so that the dimensions will be preserved. A FutureWarning for this change has been in place since Numpy 1.9 but, due to a bug, sometimes no warning was raised and the dimensions were already preserved.

% and // operators

These operators are implemented with the remainder and floor_divide functions respectively. Those functions are now based around fmod and are computed together so as to be compatible with each other and with the Python versions for float types. The results should be marginally more accurate or outright bug fixes compared to the previous results, but they may differ significantly in cases where roundoff makes a difference in the integer returned by floor_divide. Some corner cases also change, for instance, NaN is always returned for both functions when the divisor is zero, divmod(1.0, inf) returns (0.0, 1.0) except on MSVC 2008, and divmod(-1.0, inf) returns (-1.0, inf).

C API

Removed the check_return and inner_loop_selector members of the PyUFuncObject struct (replacing them with reserved slots to preserve struct layout). These were never used for anything, so it’s unlikely that any third-party code is using them either, but we mention it here for completeness.

object dtype detection for old-style classes

In python 2, objects which are instances of old-style user-defined classes no longer automatically count as ‘object’ type in the dtype-detection handler. Instead, as in python 3, they may potentially count as sequences, but only if they define both a __len__ and a __getitem__ method. This fixes a segfault and inconsistency between python 2 and 3.

New Features

  • np.histogram now provides plugin estimators for automatically estimating the optimal number of bins. Passing one of [‘auto’, ‘fd’, ‘scott’, ‘rice’, ‘sturges’] as the argument to ‘bins’ results in the corresponding estimator being used.

  • A benchmark suite using Airspeed Velocity has been added, converting the previous vbench-based one. You can run the suite locally via python runtests.py --bench. For more details, see benchmarks/README.rst.

  • A new function np.shares_memory that can check exactly whether two arrays have memory overlap is added. np.may_share_memory also now has an option to spend more effort to reduce false positives.

  • SkipTest and KnownFailureException exception classes are exposed in the numpy.testing namespace. Raise them in a test function to mark the test to be skipped or mark it as a known failure, respectively.

  • f2py.compile has a new extension keyword parameter that allows the fortran extension to be specified for generated temp files. For instance, the files can be specifies to be *.f90. The verbose argument is also activated, it was previously ignored.

  • A dtype parameter has been added to np.random.randint Random ndarrays of the following types can now be generated:

    • np.bool_,

    • np.int8, np.uint8,

    • np.int16, np.uint16,

    • np.int32, np.uint32,

    • np.int64, np.uint64,

    • np.int_ ``, ``np.intp

    The specification is by precision rather than by C type. Hence, on some platforms np.int64 may be a long instead of long long even if the specified dtype is long long because the two may have the same precision. The resulting type depends on which C type numpy uses for the given precision. The byteorder specification is also ignored, the generated arrays are always in native byte order.

  • A new np.moveaxis function allows for moving one or more array axes to a new position by explicitly providing source and destination axes. This function should be easier to use than the current rollaxis function as well as providing more functionality.

  • The deg parameter of the various numpy.polynomial fits has been extended to accept a list of the degrees of the terms to be included in the fit, the coefficients of all other terms being constrained to zero. The change is backward compatible, passing a scalar deg will behave as before.

  • A divmod function for float types modeled after the Python version has been added to the npy_math library.

Improvements

np.gradient now supports an axis argument

The axis parameter was added to np.gradient for consistency. It allows to specify over which axes the gradient is calculated.

np.lexsort now supports arrays with object data-type

The function now internally calls the generic npy_amergesort when the type does not implement a merge-sort kind of argsort method.

np.ma.core.MaskedArray now supports an order argument

When constructing a new MaskedArray instance, it can be configured with an order argument analogous to the one when calling np.ndarray. The addition of this argument allows for the proper processing of an order argument in several MaskedArray-related utility functions such as np.ma.core.array and np.ma.core.asarray.

Memory and speed improvements for masked arrays

Creating a masked array with mask=True (resp. mask=False) now uses np.ones (resp. np.zeros) to create the mask, which is faster and avoid a big memory peak. Another optimization was done to avoid a memory peak and useless computations when printing a masked array.

ndarray.tofile now uses fallocate on linux

The function now uses the fallocate system call to reserve sufficient disk space on file systems that support it.

Optimizations for operations of the form A.T @ A and A @ A.T

Previously, gemm BLAS operations were used for all matrix products. Now, if the matrix product is between a matrix and its transpose, it will use syrk BLAS operations for a performance boost. This optimization has been extended to @, numpy.dot, numpy.inner, and numpy.matmul.

Note: Requires the transposed and non-transposed matrices to share data.

np.testing.assert_warns can now be used as a context manager

This matches the behavior of assert_raises.

Speed improvement for np.random.shuffle

np.random.shuffle is now much faster for 1d ndarrays.

Changes

Pyrex support was removed from numpy.distutils

The method build_src.generate_a_pyrex_source will remain available; it has been monkeypatched by users to support Cython instead of Pyrex. It’s recommended to switch to a better supported method of build Cython extensions though.

np.broadcast can now be called with a single argument

The resulting object in that case will simply mimic iteration over a single array. This change obsoletes distinctions like

if len(x) == 1:

shape = x[0].shape

else:

shape = np.broadcast(*x).shape

Instead, np.broadcast can be used in all cases.

np.trace now respects array subclasses

This behaviour mimics that of other functions such as np.diagonal and ensures, e.g., that for masked arrays np.trace(ma) and ma.trace() give the same result.

np.dot now raises TypeError instead of ValueError

This behaviour mimics that of other functions such as np.inner. If the two arguments cannot be cast to a common type, it could have raised a TypeError or ValueError depending on their order. Now, np.dot will now always raise a TypeError.

linalg.norm return type changes

The linalg.norm function now does all its computations in floating point and returns floating results. This change fixes bugs due to integer overflow and the failure of abs with signed integers of minimum value, e.g., int8(-128). For consistency, floats are used even where an integer might work.

Deprecations

Views of arrays in Fortran order

The F_CONTIGUOUS flag was used to signal that views using a dtype that changed the element size would change the first index. This was always problematical for arrays that were both F_CONTIGUOUS and C_CONTIGUOUS because C_CONTIGUOUS took precedence. Relaxed stride checking results in more such dual contiguous arrays and breaks some existing code as a result. Note that this also affects changing the dtype by assigning to the dtype attribute of an array. The aim of this deprecation is to restrict views to C_CONTIGUOUS arrays at some future time. A work around that is backward compatible is to use a.T.view(...).T instead. A parameter may also be added to the view method to explicitly ask for Fortran order views, but that will not be backward compatible.

Invalid arguments for array ordering

It is currently possible to pass in arguments for the order parameter in methods like array.flatten or array.ravel that were not one of the following: ‘C’, ‘F’, ‘A’, ‘K’ (note that all of these possible values are both unicode and case insensitive). Such behavior will not be allowed in future releases.

Random number generator in the testing namespace

The Python standard library random number generator was previously exposed in the testing namespace as testing.rand. Using this generator is not recommended and it will be removed in a future release. Use generators from numpy.random namespace instead.

Random integer generation on a closed interval

In accordance with the Python C API, which gives preference to the half-open interval over the closed one, np.random.random_integers is being deprecated in favor of calling np.random.randint, which has been enhanced with the dtype parameter as described under “New Features”. However, np.random.random_integers will not be removed anytime soon.

FutureWarnings

Assigning to slices/views of MaskedArray

Currently a slice of a masked array contains a view of the original data and a copy-on-write view of the mask. Consequently, any changes to the slice’s mask will result in a copy of the original mask being made and that new mask being changed rather than the original. For example, if we make a slice of the original like so, view = original[:], then modifications to the data in one array will affect the data of the other but, because the mask will be copied during assignment operations, changes to the mask will remain local. A similar situation occurs when explicitly constructing a masked array using MaskedArray(data, mask), the returned array will contain a view of data but the mask will be a copy-on-write view of mask.

In the future, these cases will be normalized so that the data and mask arrays are treated the same way and modifications to either will propagate between views. In 1.11, numpy will issue a MaskedArrayFutureWarning warning whenever user code modifies the mask of a view that in the future may cause values to propagate back to the original. To silence these warnings and make your code robust against the upcoming changes, you have two options: if you want to keep the current behavior, call masked_view.unshare_mask() before modifying the mask. If you want to get the future behavior early, use masked_view._sharedmask = False. However, note that setting the _sharedmask attribute will break following explicit calls to masked_view.unshare_mask().