[go: up one dir, main page]

Skip to content

scikit-hep/hist

Repository files navigation

histogram

Hist

Actions Status Documentation Status pre-commit.ci status

PyPI version Conda-Forge PyPI platforms DOI License

GitHub Discussion Gitter Binder Scikit-HEP

Hist is an analyst-friendly front-end for boost-histogram, designed for Python 3.8+ (3.6-3.7 users get older versions). See what's new.

Slideshow of features. See docs/banner_slides.md for text if the image is not readable.

Installation

You can install this library from PyPI with pip:

python3 -m pip install "hist[plot,fit]"

If you do not need the plotting features, you can skip the [plot] and/or [fit] extras. [fit] is not currently supported in WebAssembly.

Features

Hist currently provides everything boost-histogram provides, and the following enhancements:

  • Hist augments axes with names:

    • name= is a unique label describing each axis.
    • label= is an optional string that is used in plotting (defaults to name if not provided).
    • Indexing, projection, and more support named axes.
    • Experimental NamedHist is a Hist that disables most forms of positional access, forcing users to use only names.
  • The Hist class augments bh.Histogram with simpler construction:

    • flow=False is a fast way to turn off flow for the axes on construction.
    • Storages can be given by string.
    • storage= can be omitted, strings and storages can be positional.
    • data= can initialize a histogram with existing data.
    • Hist.from_columns can be used to initialize with a DataFrame or dict.
    • You can cast back and forth with boost-histogram (or any other extensions).
  • Hist support QuickConstruct, an import-free construction system that does not require extra imports:

    • Use Hist.new.<axis>().<axis>().<storage>().
    • Axes names can be full (Regular) or short (Reg).
    • Histogram arguments (like data=) can go in the storage.
  • Extended Histogram features:

    • Direct support for .name and .label, like axes.
    • .density() computes the density as an array.
    • .profile(remove_ax) can convert a ND COUNT histogram into a (N-1)D MEAN histogram.
    • .sort(axis) supports sorting a histogram by a categorical axis. Optionally takes a function to sort by.
    • .fill_flattened(...) will flatten and fill, including support for AwkwardArray.
    • .integrate(...), which takes the opposite arguments as .project.
  • Hist implements UHI+; an extension to the UHI (Unified Histogram Indexing) system designed for import-free interactivity:

    • Uses j suffix to switch to data coordinates in access or slices.
    • Uses j suffix on slices to rebin.
    • Strings can be used directly to index into string category axes.
  • Quick plotting routines encourage exploration:

    • .plot() provides 1D and 2D plots (or use plot1d(), plot2d())
    • .plot2d_full() shows 1D projects around a 2D plot.
    • .plot_ratio(...) make a ratio plot between the histogram and another histogram or callable.
    • .plot_pull(...) performs a pull plot.
    • .plot_pie() makes a pie plot.
    • .show() provides a nice str printout using Histoprint.
  • Stacks: work with groups of histograms with identical axes

    • Stacks can be created with h.stack(axis), using index or name of an axis (StrCategory axes ideal).
    • You can also create with hist.stacks.Stack(h1, h2, ...), or use from_iter or from_dict.
    • You can index a stack, and set an entry with a matching histogram.
    • Stacks support .plot() and .show(), with names (plot labels default to original axes info).
    • Stacks pass through .project, *, +, and -.
  • New modules

    • intervals supports frequentist coverage intervals.
  • Notebook ready: Hist has gorgeous in-notebook representation.

    • No dependencies required

Usage

from hist import Hist

# Quick construction, no other imports needed:
h = (
    Hist.new.Reg(10, 0, 1, name="x", label="x-axis")
    .Var(range(10), name="y", label="y-axis")
    .Int64()
)

# Filling by names is allowed:
h.fill(y=[1, 4, 6], x=[3, 5, 2])

# Names can be used to manipulate the histogram:
h.project("x")
h[{"y": 0.5j + 3, "x": 5j}]

# You can access data coordinates or rebin with a `j` suffix:
h[0.3j:, ::2j]  # x from .3 to the end, y is rebinned by 2

# Elegant plotting functions:
h.plot()
h.plot2d_full()
h.plot_pull(Callable)

Development

From a git checkout, either use nox, or run:

python -m pip install -e .[dev]

See Contributing guidelines for information on setting up a development environment.

Contributors

We would like to acknowledge the contributors that made this project possible (emoji key):


Henry Schreiner

🚧 πŸ’» πŸ“–

Nino Lau

🚧 πŸ’» πŸ“–

Chris Burr

πŸ’»

Nick Amin

πŸ’»

Eduardo Rodrigues

πŸ’»

Andrzej Novak

πŸ’»

Matthew Feickert

πŸ’»

Kyle Cranmer

πŸ“–

Daniel Antrim

πŸ’»

Nicholas Smith

πŸ’»

Michael Eliachevitch

πŸ’»

Jonas Eschle

πŸ“–

This project follows the all-contributors specification.

Talks


Acknowledgements

This library was primarily developed by Henry Schreiner and Nino Lau.

Support for this work was provided by the National Science Foundation cooperative agreement OAC-1836650 (IRIS-HEP) and OAC-1450377 (DIANA/HEP). Any opinions, findings, conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation.