Testing guidelines#

Introduction#

Until the 1.15 release, NumPy used the nose testing framework, it now uses the pytest framework. The older framework is still maintained in order to support downstream projects that use the old numpy framework, but all tests for NumPy should use pytest.

Our goal is that every module and package in NumPy should have a thorough set of unit tests. These tests should exercise the full functionality of a given routine as well as its robustness to erroneous or unexpected input arguments. Well-designed tests with good coverage make an enormous difference to the ease of refactoring. Whenever a new bug is found in a routine, you should write a new test for that specific case and add it to the test suite to prevent that bug from creeping back in unnoticed.

Note

SciPy uses the testing framework from numpy.testing, so all of the NumPy examples shown below are also applicable to SciPy

Testing NumPy#

NumPy can be tested in a number of ways, choose any way you feel comfortable.

Running tests from inside Python#

You can test an installed NumPy by numpy.test, for example, To run NumPy’s full test suite, use the following:

>>> import numpy
>>> numpy.test(label='slow')

The test method may take two or more arguments; the first label is a string specifying what should be tested and the second verbose is an integer giving the level of output verbosity. See the docstring numpy.test for details. The default value for label is ‘fast’ - which will run the standard tests. The string ‘full’ will run the full battery of tests, including those identified as being slow to run. If verbose is 1 or less, the tests will just show information messages about the tests that are run; but if it is greater than 1, then the tests will also provide warnings on missing tests. So if you want to run every test and get messages about which modules don’t have tests:

>>> numpy.test(label='full', verbose=2)  # or numpy.test('full', 2)

Finally, if you are only interested in testing a subset of NumPy, for example, the _core module, use the following:

>>> numpy._core.test()

Running tests from the command line#

If you want to build NumPy in order to work on NumPy itself, use the spin utility.

To run NumPy’s full test suite:

$ spin test -m full

Testing a subset of NumPy:

$ spin test -t numpy/_core/tests

For detailed info on testing, see Testing builds

Running tests in multiple threads#

To help with stress testing NumPy for thread safety, the test suite can be run under pytest-run-parallel. To install pytest-run-parallel:

$ pip install pytest-run-parallel

To run the test suite in multiple threads:

$ spin test -p auto # have pytest-run-parallel detect the number of available cores
$ spin test -p 4 # run each test under 4 threads
$ spin test -p auto -- --skip-thread-unsafe=true # run ONLY tests that are thread-safe

When you write new tests, it is worth testing to make sure they do not fail under pytest-run-parallel, since the CI jobs make use of it. Some tips on how to write thread-safe tests can be found here.

Note

Ideally you should run pytest-run-parallel using a free-threaded build of Python that is 3.14 or higher. If you decide to use a version of Python that is not free-threaded, you will need to set the environment variables PYTHON_CONTEXT_AWARE_WARNINGS and PYTHON_THREAD_INHERIT_CONTEXT to 1.

Running doctests#

NumPy documentation contains code examples, “doctests”. To check that the examples are correct, install the scipy-doctest package:

$ pip install scipy-doctest

and run one of:

$ spin check-docs -v
$ spin check-docs numpy/linalg
$ spin check-docs -- -k 'det and not slogdet'

Note that the doctests are not run when you use spin test.

Other methods of running tests#

Run tests using your favourite IDE such as vscode or pycharm

Writing your own tests#

If you are writing code that you’d like to become part of NumPy, please write the tests as you develop your code. Every Python module, extension module, or subpackage in the NumPy package directory should have a corresponding test_<name>.py file. Pytest examines these files for test methods (named test*) and test classes (named Test*).

Suppose you have a NumPy module numpy/xxx/yyy.py containing a function zzz(). To test this function you would create a test module called test_yyy.py. If you only need to test one aspect of zzz, you can simply add a test function:

def test_zzz():
    assert zzz() == 'Hello from zzz'

More often, we need to group a number of tests together, so we create a test class:

# import xxx symbols
from numpy.xxx.yyy import zzz
import pytest

class TestZzz:
    def test_simple(self):
        assert zzz() == 'Hello from zzz'

    def test_invalid_parameter(self):
        with pytest.raises(ValueError, match='.*some matching regex.*'):
            ...

Within these test methods, the assert statement or a specialized assertion function is used to test whether a certain assumption is valid. If the assertion fails, the test fails. Common assertion functions include:

numpy.testing.assert_equal for testing exact elementwise equality between a result array and a reference,
numpy.testing.assert_allclose for testing near elementwise equality between a result array and a reference (i.e. with specified relative and absolute tolerances), and
numpy.testing.assert_array_less for testing (strict) elementwise ordering between a result array and a reference.

By default, these assertion functions only compare the numerical values in the arrays. Consider using the strict=True option to check the array dtype and shape, too.

When you need custom assertions, use the Python assert statement. Note that pytest internally rewrites assert statements to give informative output when it fails, so it should be preferred over the legacy variant numpy.testing.assert_. Whereas plain assert statements are ignored when running Python in optimized mode with -O, this is not an issue when running tests with pytest.

Similarly, the pytest functions pytest.raises and pytest.warns should be preferred over their legacy counterparts numpy.testing.assert_raises and numpy.testing.assert_warns, which are more broadly used. These versions also accept a match parameter, which should always be used to precisely target the intended warning or error.

Note that test_ functions or methods should not have a docstring, because that makes it hard to identify the test from the output of running the test suite with verbose=2 (or similar verbosity setting). Use plain comments (#) to describe the intent of the test and help the unfamiliar reader to interpret the code.

Also, since much of NumPy is legacy code that was originally written without unit tests, there are still several modules that don’t have tests yet. Please feel free to choose one of these modules and develop tests for it.

Using C code in tests#

NumPy exposes a rich C-API . These are tested using c-extension modules written “as-if” they know nothing about the internals of NumPy, rather using the official C-API interfaces only. Examples of such modules are tests for a user-defined rational dtype in _rational_tests or the ufunc machinery tests in _umath_tests which are part of the binary distribution. Starting from version 1.21, you can also write snippets of C code in tests that will be compiled locally into c-extension modules and loaded into python.

numpy.testing.extbuild.build_and_import_extension(modname, functions, *, prologue='', build_dir=None, include_dirs=None, more_init='')#

Build and imports a c-extension module modname from a list of function fragments functions.

Parameters:

functionslist of fragments: Each fragment is a sequence of func_name, calling convention, snippet.
prologuestring: Code to precede the rest, usually extra #include or #define macros.
build_dirpathlib.Path: Where to build the module, usually a temporary directory
include_dirslist: Extra directories to find include files when compiling
more_initstring: Code to appear in the module PyMODINIT_FUNC

Returns:

out: module: The module will have been loaded and is ready for use

Examples

>>> functions = [("test_bytes", "METH_O", """
    if ( !PyBytesCheck(args)) {
        Py_RETURN_FALSE;
    }
    Py_RETURN_TRUE;
""")]
>>> mod = build_and_import_extension("testme", functions)
>>> assert not mod.test_bytes('abc')
>>> assert mod.test_bytes(b'abc')

Labeling tests#

Unlabeled tests like the ones above are run in the default numpy.test() run. If you want to label your test as slow - and therefore reserved for a full numpy.test(label='full') run, you can label it with pytest.mark.slow:

import pytest

@pytest.mark.slow
def test_big(self):
    print('Big, slow test')

Similarly for methods:

class test_zzz:
    @pytest.mark.slow
    def test_simple(self):
        assert_(zzz() == 'Hello from zzz')

Setup and teardown methods#

NumPy originally used xunit setup and teardown, a feature of pytest. We now encourage the usage of setup and teardown methods that are called explicitly by the tests that need them:

class TestMe:
    def setup(self):
        print('doing setup')
        return 1

    def teardown(self):
        print('doing teardown')

    def test_xyz(self):
        x = self.setup()
        assert x == 1
        self.teardown()

This approach is thread-safe, ensuring tests can run under pytest-run-parallel. Using pytest setup fixtures (such as xunit setup methods) is generally not thread-safe and will likely cause thread-safety test failures.

pytest supports more general fixture at various scopes which may be used automatically via special arguments. For example, the special argument name tmp_path is used in tests to create temporary directories. However, fixtures should be used sparingly.

Parametric tests#

One very nice feature of pytest is the ease of testing across a range of parameter values using the pytest.mark.parametrize decorator. For example, suppose you wish to test linalg.solve for all combinations of three array sizes and two data types:

@pytest.mark.parametrize('dimensionality', [3, 10, 25])
@pytest.mark.parametrize('dtype', [np.float32, np.float64])
def test_solve(dimensionality, dtype):
    np.random.seed(842523)
    A = np.random.random(size=(dimensionality, dimensionality)).astype(dtype)
    b = np.random.random(size=dimensionality).astype(dtype)
    x = np.linalg.solve(A, b)
    eps = np.finfo(dtype).eps
    assert_allclose(A @ x, b, rtol=eps*1e2, atol=0)
    assert x.dtype == np.dtype(dtype)

Doctests#

Doctests are a convenient way of documenting the behavior of a function and allowing that behavior to be tested at the same time. The output of an interactive Python session can be included in the docstring of a function, and the test framework can run the example and compare the actual output to the expected output.

The doctests can be run by adding the doctests argument to the test() call; for example, to run all tests (including doctests) for numpy.lib:

>>> import numpy as np
>>> np.lib.test(doctests=True)

The doctests are run as if they are in a fresh Python instance which has executed import numpy as np. Tests that are part of a NumPy subpackage will have that subpackage already imported. E.g. for a test in numpy/linalg/tests/, the namespace will be created such that from numpy import linalg has already executed.

`tests/`#

Rather than keeping the code and the tests in the same directory, we put all the tests for a given subpackage in a tests/ subdirectory. For our example, if it doesn’t already exist you will need to create a tests/ directory in numpy/xxx/. So the path for test_yyy.py is numpy/xxx/tests/test_yyy.py.

Once the numpy/xxx/tests/test_yyy.py is written, its possible to run the tests by going to the tests/ directory and typing:

python test_yyy.py

Or if you add numpy/xxx/tests/ to the Python path, you could run the tests interactively in the interpreter like this:

>>> import test_yyy
>>> test_yyy.test()

`init.py` and `setup.py`#

Usually, however, adding the tests/ directory to the python path isn’t desirable. Instead it would better to invoke the test straight from the module xxx. To this end, simply place the following lines at the end of your package’s __init__.py file:

...
def test(level=1, verbosity=1):
    from numpy.testing import Tester
    return Tester().test(level, verbosity)

You will also need to add the tests directory in the configuration section of your setup.py:

...
def configuration(parent_package='', top_path=None):
    ...
    config.add_subpackage('tests')
    return config
...

Now you can do the following to test your module:

>>> import numpy
>>> numpy.xxx.test()

Also, when invoking the entire NumPy test suite, your tests will be found and run:

>>> import numpy
>>> numpy.test()
# your tests are included and run automatically!

Tips & Tricks#

Known failures & skipping tests#

Sometimes you might want to skip a test or mark it as a known failure, such as when the test suite is being written before the code it’s meant to test, or if a test only fails on a particular architecture.

To skip a test, simply use skipif:

import pytest

@pytest.mark.skipif(SkipMyTest, reason="Skipping this test because...")
def test_something(foo):
    ...

The test is marked as skipped if SkipMyTest evaluates to nonzero, and the message in verbose test output is the second argument given to skipif. Similarly, a test can be marked as a known failure by using xfail:

import pytest

@pytest.mark.xfail(MyTestFails, reason="This test is known to fail because...")
def test_something_else(foo):
    ...

Of course, a test can be unconditionally skipped or marked as a known failure by using skip or xfail without argument, respectively.

A total of the number of skipped and known failing tests is displayed at the end of the test run. Skipped tests are marked as 'S' in the test results (or 'SKIPPED' for verbose > 1), and known failing tests are marked as 'x' (or 'XFAIL' if verbose > 1).

Tests on random data#

Tests on random data are good, but since test failures are meant to expose new bugs or regressions, a test that passes most of the time but fails occasionally with no code changes is not helpful. Make the random data deterministic by setting the random number seed before generating it. Use rng = numpy.random.RandomState(some_number) to set a seed on a local instance of numpy.random.RandomState.

Alternatively, you can use Hypothesis to generate arbitrary data. Hypothesis manages both Python’s and Numpy’s random seeds for you, and provides a very concise and powerful way to describe data (including hypothesis.extra.numpy, e.g. for a set of mutually-broadcastable shapes).

The advantages over random generation include tools to replay and share failures without requiring a fixed seed, reporting minimal examples for each failure, and better-than-naive-random techniques for triggering bugs.

Writing thread-safe tests#

Writing thread-safe tests may require some trial-and-error. Generally you should follow the guidelines stated so far, especially when it comes to setup methods and seeding random data. Explicit setup and the usage of local RNG are thread-safe practices. Here are tips for some other common problems you may run into.

Using pytest.mark.parametrize may occasionally cause thread-safety issues. To fix this, you can use copy():

@pytest.mark.parametrize('dimensionality', [3, 10, 25])
@pytest.mark.parametrize('dtype', [np.float32, np.float64])
def test_solve(dimensionality, dtype):
    dimen = dimensionality.copy()
    d = dtype.copy()
    # use these copied variables instead
    ...

If you are testing something that is inherently thread-unsafe, you can label your test with pytest.mark.thread_unsafe so that it will run under a single thread and not cause test failures:

@pytest.mark.thread_unsafe(reason="reason this test is thread-unsafe")
def test_thread_unsafe():
  ...

Some examples of what should be labeled as thread-unsafe:

Usage of sys.stdout and sys.stderr
Mutation of global data, like docstrings, modules, garbage collectors, etc.
Tests that require a lot of memory, since they could cause crashes.

Additionally, some pytest fixtures are thread-unsafe, such as monkeypatch and capsys. However, pytest-run-parallel will automatically mark these as thread-unsafe if you decide to use them. Some fixtures have been patched to be thread-safe, like tmp_path.

Documentation for `numpy.test`#

numpy.test(label='fast', verbose=1, extra_argv=None, doctests=False, coverage=False, durations=-1, tests=None)#

Pytest test runner.

A test function is typically added to a package’s __init__.py like so:

from numpy._pytesttester import PytestTester
test = PytestTester(__name__).test
del PytestTester

Calling this test function finds and runs all tests associated with the module and all its sub-modules.