This NumPy release is marked by the removal of much technical debt: support for Python 2 has been removed, many deprecations have been expired, and documentation has been improved. The polishing of the random module continues apace with bug fixes and better usability from Cython.
The Python versions supported for this release are 3.6-3.8. Downstream developers should use Cython >= 0.29.16 for Python 3.8 support and OpenBLAS >= 3.7 to avoid problems on the Skylake architecture.
Code compatibility with Python versions < 3.6 (including Python 2) was dropped from both the Python and C code. The shims in numpy.compat will remain to support third-party packages, but they may be deprecated in a future release. Note that 1.19.x will not compile with earlier versions of Python due to the use of f-strings.
(gh-15233)
numpy.insert and numpy.delete can no longer be passed an axis on 0d arrays

This concludes a deprecation from 1.9, where passing an axis argument to numpy.insert or numpy.delete on a 0d array caused the axis and obj arguments to be completely ignored. In these cases, insert(arr, "nonsense", 42, axis=0) would actually overwrite the entire array, while delete(arr, "nonsense", axis=0) would return arr.copy().
Now passing axis on a 0d array raises numpy.AxisError.
(gh-15802)
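A minimal sketch of the new behaviour:

```python
import numpy as np

arr = np.array(1.0)  # a 0d array

# Passing an axis on a 0d array now raises AxisError (a subclass of
# IndexError) instead of silently ignoring the axis and obj arguments.
try:
    np.insert(arr, 0, 42, axis=0)
    raised = False
except IndexError:
    raised = True
```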
This concludes deprecations from 1.8 and 1.9, where np.delete would ignore both negative and out-of-bounds items in a sequence of indices. This was at odds with its behavior when passed a single index.
Now out-of-bounds items throw IndexError, and negative items index from the end.
(gh-15804)
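To illustrate both cases:

```python
import numpy as np

arr = np.arange(5)

# Negative indices now index from the end, as with a single index.
trimmed = np.delete(arr, [-1])  # removes the last element

# Out-of-bounds indices now raise IndexError instead of being ignored.
try:
    np.delete(arr, [10])
    oob_raised = False
except IndexError:
    oob_raised = True
```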
This concludes a deprecation from 1.9, where sequences of non-integer indices were allowed and cast to integers. Now passing sequences of non-integral indices raises IndexError, just like it does when passing a single non-integral scalar.
(gh-15805)
This concludes a deprecation from 1.8, where np.delete would cast boolean arrays and scalars passed as an index argument into integer indices. The behavior now is to treat boolean arrays as a mask, and to raise an error on boolean scalars.
(gh-15815)
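A short example of the mask behaviour:

```python
import numpy as np

arr = np.arange(4)

# A boolean array is now interpreted as a mask over the elements to
# delete: True positions are removed, False positions are kept.
kept = np.delete(arr, np.array([True, False, True, False]))
```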
Changed random variate stream from numpy.random.Generator.dirichlet
A bug in the generation of random variates for the Dirichlet distribution with small alpha values was fixed by using a different algorithm when max(alpha) < 0.1. Because of this change, the stream of variates generated by dirichlet in this case will be different from previous releases.
(gh-14924)
Scalar promotion in PyArray_ConvertToCommonType
The promotion of mixed scalars and arrays in PyArray_ConvertToCommonType has been changed to adhere to the rules used by np.result_type. This means that input such as (1000, np.array([1], dtype=np.uint8)) will now return uint16 dtypes. In most cases the behaviour is unchanged. Note that the use of this C-API function is generally discouraged. This also fixes np.choose to behave the same way as the rest of NumPy in this respect.
(gh-14933)
The fasttake and fastputmask slots are now never used and must always be set to NULL. This results in no change in behaviour; however, if a user dtype sets one of these slots, a DeprecationWarning will be given.
(gh-14942)
np.ediff1d casting behaviour with to_end and to_begin
np.ediff1d now uses the "same_kind" casting rule for its additional to_end and to_begin arguments. This ensures type safety except when the input array has a smaller integer type than to_begin or to_end. In rare cases, the behaviour will be more strict than it was previously in 1.16 and 1.17. This is necessary to solve issues with floating point NaN.
(gh-14981)
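For reference, a small example of the arguments in question:

```python
import numpy as np

# to_begin and to_end are prepended/appended to the differences; they
# are now cast to the input's dtype under the "same_kind" rule.
d = np.ediff1d(np.array([1.0, 2.0, 4.0]), to_begin=-99.0, to_end=[88.0, 99.0])
```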
Objects with len(obj) == 0 which implement an "array-like" interface, meaning an object implementing obj.__array__(), obj.__array_interface__, obj.__array_struct__, or the Python buffer interface, and which are also sequences (i.e. Pandas objects) will now always retain their shape correctly when converted to an array. Previously, if such an object had a shape of (0, 1), it could be converted into an array of shape (0,) (losing all dimensions after the first 0).
(gh-14995)
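A minimal sketch of such an empty array-like sequence (the ArrayLike class and its names here are purely illustrative, not a NumPy API):

```python
import numpy as np

class ArrayLike:
    """A hypothetical sequence that also implements __array__."""
    def __init__(self, arr):
        self._arr = arr
    def __len__(self):
        return len(self._arr)
    def __getitem__(self, i):
        return self._arr[i]
    def __array__(self, dtype=None, copy=None):
        return self._arr

# The (0, 1) shape is now retained instead of collapsing to (0,).
shape = np.asarray(ArrayLike(np.empty((0, 1)))).shape
```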
Removed multiarray.int_asbuffer
As part of the continued removal of Python 2 compatibility, multiarray.int_asbuffer was removed. On Python 3, it threw a NotImplementedError and was unused internally. It is expected that there are no downstream use cases for this method with Python 3.
(gh-15229)
numpy.distutils.compat has been removed
This module contained only the function get_exception(), which was used as:
    try:
        ...
    except Exception:
        e = get_exception()
Its purpose was to handle the change in syntax introduced in Python 2.6, from except Exception, e: to except Exception as e:, meaning it was only necessary for codebases supporting Python 2.5 and older.
(gh-15255)
issubdtype no longer interprets float as np.floating
numpy.issubdtype had a FutureWarning since NumPy 1.14 which has now expired. This means that certain inputs where the second argument was neither a datatype nor a NumPy scalar type (such as a string or a Python type like int or float) will now be consistent with passing in np.dtype(arg2).type. This makes the result consistent with expectations, and means the function returns False in some cases where it previously returned True.
(gh-15773)
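Concretely:

```python
import numpy as np

# The builtin float is now interpreted as np.dtype(float).type, i.e.
# np.float64, rather than the abstract np.floating.
a = np.issubdtype(np.float32, float)        # now False
b = np.issubdtype(np.float32, np.floating)  # still True
```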
Change output of round on scalars to be consistent with Python
The output of the __round__ dunder method, and consequently of the Python built-in round, has been changed to be a Python int when called with no arguments, to be consistent with calling round on Python float objects. Previously, it would return a scalar of the np.dtype that was passed in.
(gh-15840)
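For example:

```python
import numpy as np

# round() with no ndigits on a NumPy scalar now returns a Python int,
# matching round() on a Python float.
r = round(np.float64(3.75))
```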
The numpy.ndarray constructor no longer interprets strides=() as strides=None
The former has changed to have the expected meaning of setting numpy.ndarray.strides to (), while the latter continues to result in strides being chosen automatically.
(gh-15882)
The C-level casts from strings were simplified. This change also fixes string-to-datetime and string-to-timedelta casts to behave correctly (i.e. like Python casts using string_arr.astype("M8"), while previously the cast would behave like string_arr.astype(np.int_).astype("M8")). This only affects code using the low-level C-API to do manual casts (not full array casts) of single scalar values, or using e.g. PyArray_GetCastFunc, and should thus not affect the vast majority of users.
(gh-16068)
SeedSequence with small seeds no longer conflicts with spawning
Small seeds (less than 2**96) were previously implicitly 0-padded out to 128 bits, the size of the internal entropy pool. When spawned, the spawn key was concatenated before the 0-padding. Since the first spawn key is (0,), small seeds before the spawn created the same states as the first spawned SeedSequence. Now, the seed is explicitly 0-padded out to the internal pool size before concatenating the spawn key. Spawned SeedSequences will produce different results than in the previous release. Unspawned SeedSequences will still produce the same results.
(gh-16551)
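A quick way to see the distinction:

```python
import numpy as np
from numpy.random import SeedSequence

ss = SeedSequence(1234)  # a small (< 2**96) seed
child = ss.spawn(1)[0]   # first spawned child, spawn key (0,)

# The parent and its first spawned child now yield different states.
parent_state = ss.generate_state(4)
child_state = child.generate_state(4)
```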
Deprecate automatic dtype=object for ragged input
Calling np.array([[1], [1, 2, 3]]) will issue a DeprecationWarning as per NEP 34. Users should explicitly use dtype=object to avoid the warning.
(gh-15119)
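The warning-free spelling for ragged input:

```python
import numpy as np

# Ragged input must now be marked explicitly as an object array.
a = np.array([[1], [1, 2, 3]], dtype=object)
```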
Passing shape=0 to factory functions in numpy.rec is deprecated

0 is treated as a special case and is aliased to None in the functions:
numpy.core.records.fromarrays
numpy.core.records.fromrecords
numpy.core.records.fromstring
numpy.core.records.fromfile
In the future, 0 will not be special-cased and will be treated as an array length like any other integer.
(gh-15217)
The following C-API functions are probably unused and have been deprecated:
PyArray_GetArrayParamsFromObject
PyUFunc_GenericFunction
PyUFunc_SetUsesArraysAsData
In most cases PyArray_GetArrayParamsFromObject should be replaced by converting to an array, while PyUFunc_GenericFunction can be replaced with PyObject_Call (see documentation for details).
(gh-15427)
The super classes of scalar types, such as np.integer, np.generic, or np.inexact, will now give a deprecation warning when converted to a dtype (or used in a dtype keyword argument). The reason for this is that np.integer is converted to np.int_, while it would be expected to represent any integer (e.g. also int8, int16, etc.). For example, dtype=np.floating is currently identical to dtype=np.float64, even though np.float32 is also a subclass of np.floating.
(gh-15534)
Deprecation of round for np.complexfloating scalars
Output of the __round__ dunder method and consequently the Python built-in round has been deprecated on complex scalars. This does not affect np.round.
numpy.ndarray.tostring() is deprecated in favor of tobytes()
numpy.ndarray.tobytes has existed since the 1.9 release, but until this release numpy.ndarray.tostring emitted no warning. The change to emit a warning brings NumPy in line with the builtin array.array methods of the same name.
(gh-15867)
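The replacement is a drop-in rename:

```python
import numpy as np

arr = np.arange(3, dtype=np.uint8)

# tobytes() is the supported spelling; tostring() now warns.
data = arr.tobytes()
```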
The following functions now accept a constant array of npy_intp:
PyArray_BroadcastToShape
PyArray_IntTupleFromIntp
PyArray_OverflowMultiplyList
Previously the caller would have to cast away the const-ness to call these functions.
(gh-15251)
UFuncGenericFunction now expects pointers to const dimension and strides as arguments. This means inner loops may no longer modify either dimension or strides. This change leads to an incompatible-pointer-types warning, forcing users to either ignore the compiler warnings or const-qualify their own loop signatures.
(gh-15355)
numpy.frompyfunc now accepts an identity argument
This allows the numpy.ufunc.identity attribute to be set on the resulting ufunc, meaning it can be used for empty and multi-dimensional calls to numpy.ufunc.reduce.
(gh-8255)
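A short example (py_add is an illustrative name, not a NumPy function):

```python
import numpy as np

# Build a ufunc from a Python function, with an explicit identity so
# that reduce is defined for empty inputs.
py_add = np.frompyfunc(lambda a, b: a + b, 2, 1, identity=0)

total = py_add.reduce(np.arange(5, dtype=object))  # 0 + 1 + 2 + 3 + 4
empty = py_add.reduce(np.array([], dtype=object))  # returns the identity
```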
np.str_ scalars now support the buffer protocol
np.str_ arrays are always stored as UCS4, so the corresponding scalars now expose this through the buffer interface, meaning memoryview(np.str_('test')) now works.
(gh-15385)
subok option for numpy.copy
A new kwarg, subok, was added to numpy.copy to allow users to toggle the behavior of numpy.copy with respect to array subclasses. The default value is False which is consistent with the behavior of numpy.copy for previous numpy versions. To create a copy that preserves an array subclass with numpy.copy, call np.copy(arr, subok=True). This addition better documents that the default behavior of numpy.copy differs from the numpy.ndarray.copy method which respects array subclasses by default.
(gh-15685)
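To illustrate (MyArray is a hypothetical subclass used only for this sketch):

```python
import numpy as np

class MyArray(np.ndarray):
    """A hypothetical ndarray subclass, for illustration only."""

a = np.arange(3).view(MyArray)

plain = np.copy(a)            # default subok=False drops the subclass
sub = np.copy(a, subok=True)  # the subclass is preserved
```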
numpy.linalg.multi_dot now accepts an out argument
out can be used to avoid creating unnecessary copies of the final product computed by numpy.linalg.multi_dot.
(gh-15715)
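A minimal sketch of the new argument:

```python
import numpy as np

rng = np.random.default_rng(0)
A, B, C = rng.random((3, 4)), rng.random((4, 5)), rng.random((5, 2))

# The final (3, 2) product is written directly into a preallocated array.
out = np.empty((3, 2))
np.linalg.multi_dot([A, B, C], out=out)
```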
keepdims parameter for numpy.count_nonzero
The parameter keepdims was added to numpy.count_nonzero. The parameter has the same meaning as it does in reduction functions such as numpy.sum or numpy.mean.
(gh-15870)
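For example:

```python
import numpy as np

a = np.array([[0, 1, 7],
              [2, 0, 0]])

# keepdims retains the reduced axis with length one, as in numpy.sum.
counts = np.count_nonzero(a, axis=1, keepdims=True)
```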
equal_nan parameter for numpy.array_equal
The keyword argument equal_nan was added to numpy.array_equal. equal_nan is a boolean value that toggles whether or not nan values are considered equal in comparison (default is False). This matches API used in related functions such as numpy.isclose and numpy.allclose.
(gh-16128)
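For example:

```python
import numpy as np

a = np.array([1.0, np.nan])
b = np.array([1.0, np.nan])

default = np.array_equal(a, b)                   # False: nan != nan
with_nan = np.array_equal(a, b, equal_nan=True)  # True
```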
The gcc-specific mechanism npy_cpu_supports, used to test for AVX support, has been replaced with the more general functions npy_cpu_init and npy_cpu_have, and the results are exposed via a NPY_CPU_HAVE C macro as well as a Python-level __cpu_features__ dictionary.
(gh-13421)
Use 64-bit integer size on 64-bit platforms in the fallback LAPACK library, which is used when the system has no LAPACK installed, allowing it to deal with linear algebra for large arrays.
(gh-15218)
Use AVX512 intrinsic to implement np.exp when input is np.float64
np.exp is now implemented with AVX512 intrinsics when the input is np.float64, making it 5-7x faster than before for such input. The _multiarray_umath.so module has grown about 63 KB on linux64.
(gh-15648)
On Linux, NumPy has previously added support for madvise hugepages, which can improve performance for very large arrays. Unfortunately, on older kernel versions this led to performance regressions, so by default the support has been disabled on kernels before version 4.6. To override the default, you can use the environment variable:

NUMPY_MADVISE_HUGEPAGE=0

or set it to 1 to force enabling support. Note that this only makes a difference if the operating system is set up to use madvise transparent hugepage.
(gh-15769)
numpy.einsum accepts NumPy int64 arrays as its subscript list
numpy.einsum no longer throws a type error when passed a NumPy int64 array as its subscript list.
(gh-16080)
np.logaddexp2.identity changed to -inf
The ufunc numpy.logaddexp2 now has an identity of -inf, allowing it to be called on empty sequences. This matches the identity of numpy.logaddexp.
(gh-16102)
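A quick check of the new identity:

```python
import numpy as np

# With an identity of -inf, reducing an empty sequence is well defined.
empty_reduce = np.logaddexp2.reduce(np.array([], dtype=np.float64))
```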
Remove handling of extra argument to __array__
A code path and test have been in the code since NumPy 0.4 for a two-argument variant of __array__(dtype=None, context=None). It was activated when calling ufunc(op) or ufunc.reduce(op) if op.__array__ existed. However that variant is not documented, and it is not clear what the intention was for its use. It has been removed.
(gh-15118)
numpy.random._bit_generator moved to numpy.random.bit_generator

In order to expose numpy.random.BitGenerator and numpy.random.SeedSequence to Cython, the _bit_generator module is now public as numpy.random.bit_generator.
Cython access to the random distributions

c_distributions.pxd provides access to the C functions behind many of the random distributions from Cython, making it convenient to use and extend them.
(gh-15463)
Fixed eigh and cholesky methods in numpy.random.multivariate_normal
Previously, when passing method='eigh' or method='cholesky', numpy.random.multivariate_normal produced samples from the wrong distribution. This is now fixed.
(gh-15872)
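A sketch of the method argument (the sanity check on the sample covariance is illustrative):

```python
import numpy as np

rng = np.random.default_rng(12345)
mean = np.zeros(2)
cov = np.array([[1.0, 0.5],
                [0.5, 2.0]])

# 'cholesky' is the fastest method; it now samples from the correct
# distribution, so the sample covariance should approximate cov.
samples = rng.multivariate_normal(mean, cov, size=20000, method="cholesky")
```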
Fixed the jump implementation in MT19937.jumped
This fix changes the stream produced from jumped MT19937 generators. It does not affect the stream produced using RandomState or MT19937 that are directly seeded.
The translation of the jumping code for the MT19937 contained a reversed loop ordering. MT19937.jumped now matches Makoto Matsumoto's original implementation of the Horner and Sliding Window jump methods.
(gh-16153)