# NumPy 2.xx.x Release Notes
## Highlights
We’ll choose highlights for this release near the end of the release cycle.
## Deprecations
### Setting the `strides` attribute is deprecated

Setting the `strides` attribute is now deprecated, since mutating the strides of an array is unsafe if the array is shared, especially by multiple threads. As an alternative, you can create a new view (no copy) via:

* `np.lib.stride_tricks.sliding_window_view` if applicable,
* `np.lib.stride_tricks.as_strided` for the general case,
* or the `np.ndarray` constructor (with `buffer` set to the original array) for a lightweight version.

(gh-28925)
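As a small sketch of the recommended replacement, `as_strided` builds a new strided view without mutating the original array's `strides` attribute (the shape and strides here are example values):

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

a = np.arange(6, dtype=np.int64)

# Instead of assigning to a.strides in place, create a new view with
# the desired strides; the original array is left untouched.
b = as_strided(a, shape=(3,), strides=(2 * a.itemsize,))
print(b)  # every second element: [0 2 4]
```

Because `as_strided` does no bounds checking, the caller is responsible for keeping the view inside the original buffer.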
### Positional `out` argument to `np.maximum` and `np.minimum` is deprecated

Passing the output array `out` positionally to `numpy.maximum` and `numpy.minimum` is deprecated. For example, `np.maximum(a, b, c)` will emit a deprecation warning, since `c` is treated as the output buffer rather than a third input. Always pass the output with the keyword form, e.g. `np.maximum(a, b, out=c)`. This makes intent clear and simplifies type annotations.

(gh-29052)
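The keyword form looks like this in practice:

```python
import numpy as np

a = np.array([1, 5, 3])
b = np.array([4, 2, 6])
c = np.empty_like(a)

# Preferred: pass the output buffer by keyword, not positionally.
np.maximum(a, b, out=c)
print(c)  # [4 5 6]
```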
### `align=` must be passed as a boolean to `np.dtype()`

When creating a new `dtype`, a `VisibleDeprecationWarning` will be given if `align=` is not a boolean. This is mainly to prevent accidentally passing a subarray shape as the align flag, where it has no effect, such as `np.dtype("f8", 3)` instead of `np.dtype(("f8", 3))`. We strongly suggest always passing `align=` as a keyword argument.

(gh-29301)
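A brief sketch of the two spellings being distinguished, the subarray form and a genuine aligned dtype:

```python
import numpy as np

# Subarray dtype: each element holds three float64 values.
dt = np.dtype(("f8", 3))
print(dt.shape)  # (3,)

# Aligned structured dtype: align is a keyword boolean.
dt2 = np.dtype({"names": ["x"], "formats": ["f8"]}, align=True)
```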
### Assertion and warning control utilities are deprecated

`np.testing.assert_warns` and `np.testing.suppress_warnings` are deprecated. Use `warnings.catch_warnings`, `warnings.filterwarnings`, `pytest.warns`, or `pytest.filterwarnings` instead.

(gh-29550)
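A minimal standard-library replacement for the `assert_warns` pattern (the `noisy` function here is just an illustrative stand-in):

```python
import warnings

def noisy():
    warnings.warn("old API", DeprecationWarning)

# Record warnings with the standard library instead of
# np.testing.assert_warns:
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    noisy()

saw_warning = any(issubclass(w.category, DeprecationWarning) for w in caught)
print(saw_warning)  # True
```

In a pytest suite, `with pytest.warns(DeprecationWarning): noisy()` expresses the same check more directly.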
## Compatibility notes
NumPy's C extension modules have begun to use multi-phase initialisation, as defined by PEP 489. As part of this, a new explicit check has been added that each such module is only imported once per Python process. This comes with the side effect that deleting `numpy` from `sys.modules` and re-importing it will now fail with an `ImportError`. This has always been unsafe, with unexpected side effects, though it did not previously raise an error.

(gh-29030)
`numpy.round` now always returns a copy. Previously, it returned a view for integer inputs with `decimals >= 0` and a copy in all other cases. This change brings `round` in line with `ceil`, `floor`, and `trunc`.

(gh-29137)
### The macro `NPY_ALIGNMENT_REQUIRED` has been removed

The macro was defined in the `npy_cpu.h` file, so it might be regarded as semi-public. As it turns out, with modern compilers and hardware it is almost always the case that alignment is required, so NumPy no longer uses the macro. It is unlikely anyone uses it, but you may want to compile with the `-Wundef` flag or equivalent to be sure.

(gh-29094)
## New features
`np.size` now accepts multiple axes.

(gh-29240)
`numpy.pad` now accepts a dictionary for the `pad_width` argument.

(gh-29273)
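The dictionary form is a shorthand over the explicit per-axis tuple form shown below (the assumed semantics, per this note, are that the dict maps axis numbers to pad widths, with unlisted axes left unpadded); the tuple form works on all versions:

```python
import numpy as np

a = np.ones((2, 2))

# Pad one row before and one after along axis 0, nothing along axis 1.
padded = np.pad(a, ((1, 1), (0, 0)))
print(padded.shape)  # (4, 2)
```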
### `StringDType` `fill_value` support in `numpy.ma.MaskedArray`

Masked arrays now accept and preserve a Python `str` as their `fill_value` when using the variable-width `StringDType` (kind `'T'`), including through slicing and views. The default is `'N/A'` and may be overridden by any valid string. This fixes issue gh-29421 and was implemented in pull request gh-29423.

(gh-29423)
### `ndmax` option for `numpy.array`

The `ndmax` option is now available for `numpy.array`. It explicitly limits the maximum number of dimensions created from nested sequences. This is particularly useful when creating arrays of list-like objects with `dtype=object`. By default, NumPy recurses through all nesting levels to create the highest possible dimensional array, but this behavior may not be desired when the intent is to preserve nested structures as objects. The `ndmax` parameter provides explicit control over this recursion depth.

# Default behavior: creates a 2D array
>>> a = np.array([[1, 2], [3, 4]], dtype=object)
>>> a
array([[1, 2],
       [3, 4]], dtype=object)
>>> a.shape
(2, 2)

# With ndmax=1: creates a 1D array
>>> b = np.array([[1, 2], [3, 4]], dtype=object, ndmax=1)
>>> b
array([list([1, 2]), list([3, 4])], dtype=object)
>>> b.shape
(2,)

(gh-29569)
## Improvements
### Fix `flatiter` indexing edge cases

The `flatiter` object now shares the same index preparation logic as `ndarray`, ensuring consistent behavior and fixing several issues where invalid indices were previously accepted or misinterpreted.

Key fixes and improvements:

* Stricter index validation:
  * Boolean non-array indices like `arr.flat[[True, True]]` were incorrectly treated as `arr.flat[np.array([1, 1], dtype=int)]`. They now raise an `IndexError`. Note that indices that match the iterator's shape are expected to not raise in the future and be handled as regular boolean indices. Use `np.asarray(<index>)` if you want to match that behavior.
  * Float non-array indices were also cast to integer and incorrectly treated as `arr.flat[np.array([1.0, 1.0], dtype=int)]`. This is now deprecated and will be removed in a future version.
  * 0-dimensional boolean indices like `arr.flat[True]` are also deprecated and will be removed in a future version.
* Consistent error types: certain invalid `flatiter` indices that previously raised `ValueError` now correctly raise `IndexError`, aligning with `ndarray` behavior.
* Improved error messages: the error message for unsupported index operations now provides more specific details, including explicitly listing the valid index types, instead of the generic `IndexError: unsupported index operation`.

(gh-28590)
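The pattern that stays valid across versions is indexing the flat iterator with an actual integer array rather than a bare list of booleans or floats:

```python
import numpy as np

arr = np.arange(6).reshape(2, 3)

# Integer array indices into the flat iterator are always well-defined:
# positions are counted in C (row-major) flat order.
picked = arr.flat[np.array([0, 2, 4])]
print(picked)  # [0 2 4]
```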
### Improved error message for `assert_array_compare`

The error message generated by `assert_array_compare`, which is used by functions like `assert_allclose`, `assert_array_less`, etc., now also includes information about the indices at which the assertion fails.

(gh-29112)
### Show unit information in `__repr__` for `datetime64("NaT")`

When a `datetime64` object is "Not a Time" (NaT), its `__repr__` method now includes the time unit of the `datetime64` type. This makes it consistent with the behavior of a `timedelta64` object.

(gh-29396)
## Performance improvements and changes
### Performance improvements to `np.unique` for string dtypes

The hash-based algorithm for unique extraction provides an order-of-magnitude speedup on large string arrays. In an internal benchmark with about 1 billion string elements, the hash-based `np.unique` completed in roughly 33.5 seconds, compared to 498 seconds with the sort-based method, about 15× faster for unsorted unique operations on strings. This improvement greatly reduces the time to find unique values in very large string datasets.
(gh-28767)
## Changes
Multiplication between a string and an integer now raises `OverflowError` instead of `MemoryError` if the result of the multiplication would create a string that is too large to be represented. This follows Python's behavior.
(gh-29060)
### `unique_values` for string dtypes may return unsorted data

`np.unique` now supports hash-based duplicate removal for string dtypes. This enhancement extends the hash-table algorithm to byte strings (`'S'`), Unicode strings (`'U'`), and the experimental string dtype (`'T'`, `StringDType`). As a result, calling `np.unique()` on an array of strings will use the faster hash-based method to obtain unique values. Note that this hash-based method does not guarantee that the returned unique values will be sorted. This also works for `StringDType` arrays containing `None` (missing values) when using `equal_nan=True` (treating missing values as equal).

(gh-28767)
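Since the hash-based path may return values in arbitrary order, sort explicitly whenever order matters rather than relying on the extraction order; a minimal sketch:

```python
import numpy as np

a = np.array(["b", "a", "b", "c"])

# Extract unique values, then impose an order explicitly; this is
# robust whether the underlying extraction is sort- or hash-based.
vals = np.unique(a)
vals_sorted = np.sort(vals)
print(vals_sorted.tolist())  # ['a', 'b', 'c']
```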
### Fix bug in `matmul` for non-contiguous `out` kwarg parameter

In some cases, if `out` was non-contiguous, `np.matmul` would cause memory corruption or a C-level assert. This was new to v2.3.0 and fixed in v2.3.1.

(gh-29179)
### `__array_interface__` with NULL pointer changed

The array interface now accepts NULL pointers (NumPy will do its own dummy allocation, though). Previously, these incorrectly triggered an undocumented scalar path. In the unlikely event that the scalar path was actually desired, you can (for now) achieve the previous behavior via the correct scalar path by not providing a `data` field at all.

(gh-29338)
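For context, here is a minimal object exposing `__array_interface__` (the `ArrayLike` wrapper name is illustrative): the `data` field carries the buffer pointer that this change concerns, and omitting the field entirely is what selects the scalar path mentioned above.

```python
import numpy as np

class ArrayLike:
    """Illustrative wrapper exposing the array interface protocol."""
    def __init__(self, arr):
        self._arr = arr  # keep the underlying buffer alive
        # The interface dict's 'data' field is a (pointer, read_only)
        # pair; NumPy reads the buffer directly from that pointer.
        self.__array_interface__ = arr.__array_interface__

a = np.arange(3.0)
b = np.asarray(ArrayLike(a))  # zero-copy view over a's buffer
print(b.tolist())  # [0.0, 1.0, 2.0]
```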