NumPy
The fundamental package for scientific computing with Python
D&I Grant from CZI

Powerful N-dimensional arrays
Fast and versatile, the NumPy vectorization, indexing, and broadcasting concepts are the de-facto standards of array computing today.

Numerical computing tools
NumPy offers comprehensive mathematical functions, random number generators, linear algebra routines, Fourier transforms, and more.

Interoperable
NumPy supports a wide range of hardware and computing platforms, and plays well with distributed, GPU, and sparse array libraries.

Performant
The core of NumPy is well-optimized C code. Enjoy the flexibility of Python with the speed of compiled code.

Easy to use
NumPy’s high level syntax makes it accessible and productive for programmers from any background or experience level.

Open source
Distributed under a liberal BSD license, NumPy is developed and maintained publicly on GitHub by a vibrant, responsive, and diverse community.

Try NumPy
Enable the interactive shell
>

ECOSYSTEM

  • Nearly every scientist working in Python draws on the power of NumPy.

    NumPy brings the computational power of languages like C and Fortran to Python, a language much easier to learn and use. With this power comes simplicity: a solution in NumPy is often clear and elegant.

    Quantum Computing Statistical Computing Signal Processing Image Processing Graphs and Networks Astronomy Processes Cognitive Psychology
    A computer chip. A line graph with the line moving up. A bar chart with positive and negative values. An photograph of the mountains. A simple graph. A telescope. A human head with gears.
    QuTiP Pandas SciPy Scikit-image NetworkX AstroPy PsychoPy
    PyQuil statsmodels PyWavelets OpenCV graph-tool SunPy
    Qiskit Xarray python-control Mahotas igraph SpacePy
    Seaborn PyGSP
    Bioinformatics Bayesian Inference Mathematical Analysis Chemistry Geoscience Geographic Processing Architecture & Engineering
    A strand of DNA. A graph with a bell-shaped curve. Four mathematical symbols. A test tube. The Earth. A map. A microprocessor development board.
    BioPython PyStan SciPy Cantera Pangeo Shapely COMPAS
    Scikit-Bio PyMC3 SymPy MDAnalysis Simpeg GeoPandas City Energy Analyst
    PyEnsembl ArviZ cvxpy RDKit ObsPy Folium Sverchok
    ETE emcee FEniCS Fatiando a Terra
  • NumPy's API is the starting point when libraries are written to exploit innovative hardware, create specialized array types, or add capabilities beyond what NumPy provides.

    Array Library Capabilities & Application areas
    Dask Dask Distributed arrays and advanced parallelism for analytics, enabling performance at scale.
    CuPy CuPy NumPy-compatible array library for GPU-accelerated computing with Python.
    JAX JAX Composable transformations of NumPy programs: differentiate, vectorize, just-in-time compilation to GPU/TPU.
    xarray Xarray Labeled, indexed multi-dimensional arrays for advanced analytics and visualization
    sparse Sparse NumPy-compatible sparse array library that integrates with Dask and SciPy's sparse linear algebra.
    PyTorch PyTorch Deep learning framework that accelerates the path from research prototyping to production deployment.
    TensorFlow TensorFlow An end-to-end platform for machine learning to easily build and deploy ML powered applications.
    MXNet MXNet Deep learning framework suited for flexible research prototyping and production.
    arrow Arrow A cross-language development platform for columnar in-memory data and analytics.
    xtensor xtensor Multi-dimensional arrays with broadcasting and lazy computing for numerical analysis.
    xnd XND Develop libraries for array computing, recreating NumPy's foundational concepts.
    uarray uarray Python backend system that decouples API from implementation; unumpy provides a NumPy API.
    tensorly tensorly Tensor learning, algebra and backends to seamlessly use NumPy, MXNet, PyTorch, TensorFlow or CuPy.
  • NumPy lies at the core of a rich ecosystem of data science libraries. A typical exploratory data science workflow might look like:

    For high data volumes, Dask and Ray are designed to scale. Stable deployments rely on data versioning (DVC), experiment tracking (MLFlow), and workflow automation (Airflow and Prefect).

    Diagram of three overlapping circles. The circles are labeled 'Mathematics', 'Computer Science' and 'Domain Expertise'. In the middle of the diagram, which has the three circles overlapping it, is an area labeled 'Data Science'.
  • NumPy forms the basis of powerful machine learning libraries like scikit-learn and SciPy. As machine learning grows, so does the list of libraries built on NumPy. TensorFlow’s deep learning capabilities have broad applications — among them speech and image recognition, text-based applications, time-series analysis, and video detection. PyTorch, another deep learning library, is popular among researchers in computer vision and natural language processing. MXNet is another AI package, providing blueprints and templates for deep learning.

    Statistical techniques called ensemble methods such as binning, bagging, stacking, and boosting are among the ML algorithms implemented by tools such as XGBoost, LightGBM, and CatBoost — one of the fastest inference engines. Yellowbrick and Eli5 offer machine learning visualizations.

  • NumPy is an essential component in the burgeoning Python visualization landscape, which includes Matplotlib, Seaborn, Plotly, Altair, Bokeh, Holoviz, Vispy, Napari, and PyVista, to name a few.

    NumPy’s accelerated processing of large arrays allows researchers to visualize datasets far larger than native Python could handle.

CASE STUDIES