hep.aida.tfloat.package.html
Interfaces for compact, extensible, modular and performant histogramming functionality.
Getting Started
1. Overview
Aida itself offers the histogramming features of HTL and HBOOK, the de-facto
standard for histogramming for many years. It also offers a number of useful extensions,
with an object-oriented approach. These features include the following:
- creating and filling 1D and 2D histograms (profile histograms are planned for the future)
- computation of statistics such as the mean, rms, etc. of a histogram
- support for operations between histograms (in the future)
- browsing of and access to characteristics of individual histograms
File-based I/O can be achieved through the standard Java built-in serialization
mechanism. All classes implement the {@link java.io.Serializable} interface.
However, the toolkit is entirely decoupled from advanced I/O and visualisation
techniques. It provides data structures and algorithms only.
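As a rough illustration of the serialization route described above, the following self-contained sketch round-trips an object through standard Java serialization. The class ToyHistogram and the helper names are hypothetical stand-ins, not part of this toolkit; any object implementing java.io.Serializable is handled the same way. The sketch writes to a byte array; for file-based I/O the same code can wrap a java.io.FileOutputStream and java.io.FileInputStream instead.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.Arrays;

class SerializationSketch {

    // Hypothetical stand-in for a histogram; not a class of this toolkit.
    static final class ToyHistogram implements Serializable {
        private static final long serialVersionUID = 1L;
        final String title;
        final int[] heights;

        ToyHistogram(String title, int[] heights) {
            this.title = title;
            this.heights = heights;
        }
    }

    // Writes the object into a byte array using standard Java serialization.
    static byte[] toBytes(Serializable object) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(object);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Reads an object back from the bytes produced by toBytes.
    static Object fromBytes(byte[] bytes) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        ToyHistogram original = new ToyHistogram("my histo 1", new int[] {3, 7, 2});
        ToyHistogram copy = (ToyHistogram) fromBytes(toBytes(original));
        System.out.println(copy.title + " " + Arrays.toString(copy.heights));
    }
}
```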
This toolkit borrows many concepts from HBOOK and the CERN
HTL package (C++) largely written by Savrak Sar.
The definition of an abstract histogram interface allows functionality that
is provided by external packages, such as plotting or fitting, to be decoupled
from the actual implementation of the histogram. This feature paves the way
for co-existence of different histogram packages that conform to the abstract
interface.
A reference implementation of the interfaces is provided by package {@link
hep.aida.tdouble.ref}.
2. AIDA at a glance
Fixed-width histogram
The following code snippet demonstrates example usage:
IHistogram1D h1 = new Histogram1D("my histo 1",10, -2, +2); // 10 bins, min=-2, max=2
IHistogram2D h2 = new Histogram2D("my histo 2",10, -2, +2, 5, -2, +2);
IHistogram3D h3 = new Histogram3D("my histo 3",10, -2, +2, 5, -2, +2, 3, -2, +2);
// equivalent
// IHistogram1D h1 = new Histogram1D("my histo 1",new FixedAxis(10, -2, +2));
// IHistogram2D h2 = new Histogram2D("my histo 2",new FixedAxis(10, -2, +2),new FixedAxis(5, -2, +2));
// your favourite distribution goes here
cern.jet.random.AbstractDistribution gauss = new cern.jet.random.Normal(0,1,new cern.jet.random.engine.MersenneTwister());
for (int i=0; i < 10000; i++) {
h1.{@link hep.aida.tdouble.DoubleIHistogram1D#fill fill}(gauss.nextDouble());
h2.{@link hep.aida.tdouble.DoubleIHistogram2D#fill fill}(gauss.nextDouble(),gauss.nextDouble());
h3.{@link hep.aida.tdouble.DoubleIHistogram3D#fill fill}(gauss.nextDouble(),gauss.nextDouble(),gauss.nextDouble());
}
System.out.println(h1);
System.out.println(h2);
System.out.println(h3);
double rms = h1.rms();
double sum = h1.sumBinHeights();
...
Variable-width histogram
The following code snippet demonstrates example usage:
double[] xedges = { -5, -1, 0, 1, 5 };
double[] yedges = { -5, -1, -0.2, 0, 0.2, 1, 5 }; // edges must be in ascending order
double[] zedges = { -5, 0, 7 };
IHistogram1D h1 = new Histogram1D("my histo 1",xedges);
IHistogram2D h2 = new Histogram2D("my histo 2",xedges,yedges);
IHistogram3D h3 = new Histogram3D("my histo 3",xedges,yedges,zedges);
// equivalent
// IHistogram1D h1 = new Histogram1D("my histo 1",new VariableAxis(xedges));
// IHistogram2D h2 = new Histogram2D("my histo 2",new VariableAxis(xedges),new VariableAxis(yedges));
// your favourite distribution goes here
cern.jet.random.AbstractDistribution gauss = new cern.jet.random.Normal(0,1,new cern.jet.random.engine.MersenneTwister());
for (int i=0; i < 10000; i++) {
h1.{@link hep.aida.tdouble.DoubleIHistogram1D#fill fill}(gauss.nextDouble());
h2.{@link hep.aida.tdouble.DoubleIHistogram2D#fill fill}(gauss.nextDouble(),gauss.nextDouble());
h3.{@link hep.aida.tdouble.DoubleIHistogram3D#fill fill}(gauss.nextDouble(),gauss.nextDouble(),gauss.nextDouble());
}
System.out.println(h1);
System.out.println(h2);
System.out.println(h3);
double rms = h1.rms();
double sum = h1.sumBinHeights();
...
Here are some example histograms, as rendered by Java
Analysis Studio.
And here is an example output of {@link hep.aida.tdouble.ref.DoubleConverter#toString(DoubleIHistogram2D)}.
my histo 2:
Entries=5000, ExtraEntries=0
MeanX=4.9838, RmsX=NaN
MeanY=2.5304, RmsY=NaN
xAxis: Bins=11, Min=0, Max=11
yAxis: Bins=6, Min=0, Max=6
Heights:
        | X
        |    0    1    2    3    4    5    6    7    8    9   10 |  Sum
-----------------------------------------------------------------------
Y   5   |   30   53   51   52   57   39   65   61   55   49   22 |  534
    4   |   43  106  112   96   92   94  107   98   98  110   47 | 1003
    3   |   39  134   87   93  102  103  110   90  114   98   51 | 1021
    2   |   44   81  113   96  101   86  109   83  111   93   42 |  959
    1   |   54   94  103   99  115   92   98   97  103   90   44 |  989
    0   |   24   54   52   44   42   56   46   47   56   53   20 |  494
-----------------------------------------------------------------------
    Sum |  234  522  518  480  509  470  535  476  537  493  226 |
And here is a sample 3D histogram output.
3. Histograms
3.1 Axes
An axis ({@link hep.aida.tdouble.DoubleIAxis}) describes how one dimension of the problem
space is divided into intervals. Consider the case of a 10 bin histogram in
the range [0,100]. An axis object containing the number of bins
and the interval limits will describe completely how we divide such an interval:
a set of 10 sub-intervals of equal width. This is termed a {@link hep.aida.tdouble.ref.DoubleFixedAxis}
and can be constructed as follows:
IAxis axis = new FixedAxis(10, 0.0, 100.0);
It may be necessary to work with a histogram over the same range as the example
above, but with bins of variable widths. In this case, an axis containing the
bin edges will describe completely how the interval [0,100] is divided.
Such an axis is termed a {@link hep.aida.tdouble.ref.DoubleVariableAxis} and can be constructed
as follows:
double[] edges = { 0.0, 10.0, 40.0, 49.0, 50.0, 51.0, 60.0, 100.0 };
IAxis axis = new VariableAxis(edges);
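To make the relation between the edge array and the resulting bins concrete, here is a small self-contained sketch. It does not use the toolkit; the class and method names are hypothetical. It assumes that n+1 strictly ascending edges define n in-range bins, and that the width of bin i is the distance between edge i and edge i+1.

```java
import java.util.Arrays;

class VariableEdgesSketch {

    // Derives the width of every in-range bin from the bin edges.
    // Assumption: edges are strictly ascending; n+1 edges give n bins.
    static double[] binWidths(double[] edges) {
        if (edges.length < 2) {
            throw new IllegalArgumentException("at least two edges are required");
        }
        double[] widths = new double[edges.length - 1];
        for (int i = 0; i < widths.length; i++) {
            if (edges[i + 1] <= edges[i]) {
                throw new IllegalArgumentException("edges must be strictly ascending");
            }
            widths[i] = edges[i + 1] - edges[i];
        }
        return widths;
    }

    public static void main(String[] args) {
        double[] edges = { 0.0, 10.0, 40.0, 49.0, 50.0, 51.0, 60.0, 100.0 };
        // prints [10.0, 30.0, 9.0, 1.0, 1.0, 9.0, 40.0]
        System.out.println(Arrays.toString(binWidths(edges)));
    }
}
```

The eight edges above therefore describe seven in-range bins, with narrow bins around 50 and wide bins towards the ends of the range.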
An n-dimensional histogram thus contains n axes, one for each
dimension. The only concern of an axis is to associate any ordered 1D space with
a discrete numbered space. Thus it associates an interval to an integer. Hence,
an axis knows about the width of the intervals and their lower point/bound or
upper point/bound. An axis can be asked for such information as follows:
IAxis axis = new FixedAxis(2, 0.0, 20.0); // 2 bins, min=0, max=20
...
axis.{@link hep.aida.tdouble.DoubleIAxis#bins bins()}; // Number of in-range bins (excluding underflow and overflow bins)
axis.{@link hep.aida.tdouble.DoubleIAxis#binLowerEdge binLowerEdge(i)}; // and the lower edge of bin i
axis.{@link hep.aida.tdouble.DoubleIAxis#binWidth binWidth(i)}; // and its width
axis.{@link hep.aida.tdouble.DoubleIAxis#binUpperEdge binUpperEdge(i)}; // and its upper edge
double point = 1.23;
int binIndex = axis.{@link hep.aida.tdouble.DoubleIAxis#coordToIndex coordToIndex(point)}; // Obtain index of bin the point falls into (maps to)
In this package, a histogram delegates to its axes the task of locating a
bin. In other words, information about the lower and upper edges of a bin or
the width of a given bin are obtained from the corresponding axis. This is shown
in the following code fragment, which demonstrates how the lower and upper edges
and width of a given bin can be obtained.
IHistogram1D histo = new Histogram1D("Histo1D", 10, 0.0, 100.0 );
...
histo.{@link hep.aida.tdouble.DoubleIHistogram1D#xAxis xAxis()}.bins() // Obtain the number of bins (excluding underflow and overflow bins)
histo.xAxis().binLowerEdge(i) // and the lower edge of bin i
histo.xAxis().binWidth(i) // and its width
histo.xAxis().binUpperEdge(i) // and its upper edge
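The delegation described above can be sketched in a few lines. The following is a hypothetical toy histogram, not the reference implementation: it knows nothing about bin boundaries and leaves the mapping from coordinate to bin index entirely to an axis, represented here by a plain function. The values chosen for the OVERFLOW and UNDERFLOW constants, and the storage layout with two extra slots, are assumptions made for this illustration.

```java
import java.util.function.DoubleToIntFunction;

class DelegationSketch {
    // Assumed marker values for the two extra bins.
    static final int OVERFLOW = -1;
    static final int UNDERFLOW = -2;

    // Hypothetical toy histogram: stores counts only, the axis locates the bin.
    static final class ToyHistogram1D {
        private final int bins;
        private final DoubleToIntFunction axis;
        private final int[] entries; // [0] underflow, [1..bins] in-range, [bins+1] overflow

        ToyHistogram1D(int bins, DoubleToIntFunction axis) {
            this.bins = bins;
            this.axis = axis;
            this.entries = new int[bins + 2];
        }

        // The histogram asks the axis for the bin index, then increments that bin.
        void fill(double x) {
            entries[slot(axis.applyAsInt(x))]++;
        }

        int binEntries(int index) {
            return entries[slot(index)];
        }

        // Translates a bin index (including the two markers) to a storage slot.
        private int slot(int index) {
            if (index == UNDERFLOW) return 0;
            if (index == OVERFLOW) return bins + 1;
            return index + 1;
        }
    }

    public static void main(String[] args) {
        // Axis with 10 bins of width 10 over [0,100), written as a lambda.
        DoubleToIntFunction axis =
            x -> x < 0.0 ? UNDERFLOW : (x >= 100.0 ? OVERFLOW : (int) (x / 10.0));
        ToyHistogram1D histo = new ToyHistogram1D(10, axis);
        histo.fill(5.0);
        histo.fill(15.0);
        histo.fill(250.0);
        System.out.println(histo.binEntries(0) + " " + histo.binEntries(1)
            + " " + histo.binEntries(OVERFLOW));
    }
}
```

Because the histogram only sees bin indexes, swapping a fixed-width axis for a variable-width one would not require any change to the histogram itself.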
An axis always successfully maps any arbitrary point drawn from the universe
[-infinity,+infinity] to a bin index, because it implicitly defines
an additional underflow bin and an additional overflow bin, together called
extra bins.
IHistogram2D h = new Histogram2D(new FixedAxis(2, 0.0, 100.0), new FixedAxis(2, 0.0, 100.0), ...);
     y ^                        i ... in-range bin, e ... extra bin
       |
  +inf |
       |  e | e | e | e
   100 - ---------------
       |  e | i | i | e     --> in-range   == [0,100]^2
       | ---------------    --> universe   == [-infinity,+infinity]^2
       |  e | i | i | e     --> extra bins == universe - in-range
     0 - ---------------
       |  e | e | e | e
  -inf |
        ----|-------|------> x
       -inf 0      100  +inf
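The bin counts in the picture above follow from simple arithmetic: each axis contributes its in-range bins plus one underflow and one overflow bin. The self-contained sketch below (hypothetical helper names, not part of the toolkit) computes the totals and reproduces the layout for the 2 x 2 example.

```java
class ExtraBins2DSketch {

    // Every axis contributes its in-range bins plus one underflow and one overflow bin.
    static int totalBins(int xBins, int yBins) {
        return (xBins + 2) * (yBins + 2);
    }

    // Extra bins are all bins that are not in-range in both dimensions.
    static int extraBins(int xBins, int yBins) {
        return totalBins(xBins, yBins) - xBins * yBins;
    }

    // Renders the layout row by row: 'i' for in-range bins, 'e' for extra bins.
    static String[] layout(int xBins, int yBins) {
        String[] rows = new String[yBins + 2];
        for (int y = 0; y < rows.length; y++) {
            StringBuilder row = new StringBuilder();
            for (int x = 0; x < xBins + 2; x++) {
                boolean extra = x == 0 || x == xBins + 1 || y == 0 || y == yBins + 1;
                if (x > 0) row.append(' ');
                row.append(extra ? 'e' : 'i');
            }
            rows[y] = row.toString();
        }
        return rows;
    }

    public static void main(String[] args) {
        for (String row : layout(2, 2)) {
            System.out.println(row);
        }
        System.out.println("total=" + totalBins(2, 2) + ", extra=" + extraBins(2, 2));
    }
}
```

For the 2 x 2 histogram this gives 16 bins in total, of which 4 are in-range and 12 are extra bins.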
For example if an axis is defined to be new FixedAxis(2, 0.0, 20.0),
it has 2 in-range bins plus one for underflow and one for overflow. axis.bins()==2.
Its boundaries are [Double.NEGATIVE_INFINITY,0.0), [0.0, 10.0), [10.0, 20.0),
[20.0, Double.POSITIVE_INFINITY]. As a consequence point -5.0 maps to bin
index IHistogram.UNDERFLOW, point 5.0 maps to bin index 0, 15.0 maps
to bin index 1 and 25.0 maps to bin index IHistogram.OVERFLOW.
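The mapping in this example can be reproduced with a short, self-contained sketch of a fixed-width lookup. This is an illustration under stated assumptions, not the reference implementation: the method name mirrors coordToIndex, but the static signature and the values of the two marker constants are chosen here for the sake of the example.

```java
class FixedAxisSketch {
    // Assumed marker values for the two extra bins.
    static final int OVERFLOW = -1;
    static final int UNDERFLOW = -2;

    // Maps a coordinate to a bin index for 'bins' equal-width bins over [min, max).
    // Each in-range bin is half-open: it includes its lower edge, not its upper edge.
    static int coordToIndex(int bins, double min, double max, double coord) {
        if (coord < min) {
            return UNDERFLOW;
        }
        double binWidth = (max - min) / bins;
        int index = (int) Math.floor((coord - min) / binWidth);
        if (index >= bins) {
            return OVERFLOW;
        }
        return index;
    }

    public static void main(String[] args) {
        double[] points = { -5.0, 5.0, 15.0, 25.0 };
        for (double point : points) {
            System.out.println(point + " -> " + coordToIndex(2, 0.0, 20.0, point));
        }
    }
}
```

Under the half-open convention a point lying exactly on an inner edge, such as 10.0, belongs to the bin above it (index 1), and the upper limit 20.0 already falls into the overflow bin.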
As a further example, consider the following case: new VariableAxis(new
double[] { 10.0, 20.0 }). The axis has 1 in-range bin: axis.bins()==1.
Its boundaries are [Double.NEGATIVE_INFINITY,10.0), [10.0, 20.0), [20.0,
Double.POSITIVE_INFINITY]. Point 5.0 maps to bin index IHistogram.UNDERFLOW,
point 15.0 maps to bin index 0 and 25.0 maps to bin index IHistogram.OVERFLOW.
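For variable-width bins the lookup cannot be computed by division, but a binary search over the sorted edges does the job. The sketch below is one possible way to do this with java.util.Arrays.binarySearch; it is an assumption about a reasonable approach, not a statement about how the reference implementation works, and the marker constant values are likewise assumed.

```java
import java.util.Arrays;

class VariableAxisSketch {
    // Assumed marker values for the two extra bins.
    static final int OVERFLOW = -1;
    static final int UNDERFLOW = -2;

    // Maps a coordinate to a bin index, given strictly ascending bin edges.
    // n+1 edges define n half-open in-range bins [edges[i], edges[i+1]).
    static int coordToIndex(double[] edges, double coord) {
        int bins = edges.length - 1;
        if (coord < edges[0]) {
            return UNDERFLOW;
        }
        int pos = Arrays.binarySearch(edges, coord);
        // Exact hit: coord is the lower edge of bin 'pos'.
        // Miss: pos == -(insertionPoint) - 1, and the bin is insertionPoint - 1.
        int index = pos >= 0 ? pos : -pos - 2;
        return index >= bins ? OVERFLOW : index;
    }

    public static void main(String[] args) {
        double[] edges = { 10.0, 20.0 };
        double[] points = { 5.0, 15.0, 25.0 };
        for (double point : points) {
            System.out.println(point + " -> " + coordToIndex(edges, point));
        }
    }
}
```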
As can be seen, underflow bins always have an index of IHistogram.UNDERFLOW,
whereas overflow bins always have an index of IHistogram.OVERFLOW.
3.2 Bins
Bins themselves contain information about the data filled into them. They
can be asked for various descriptive statistical measures, such as the minimum,
maximum, size, mean, rms, variance, etc.
Note that bins (of any kind) only know about their contents. They do not know
where they are located in the histogram to which they belong, nor about
their widths or bounds - this information is stored in the axis to which they
belong, which also defines the bin layout within a histogram.
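The separation of concerns can be illustrated with a toy bin that accumulates statistics about the values filled into it and holds no position or width information at all. The class ToyBin is hypothetical and not part of the toolkit. The sketch assumes that rms means the root mean square of the filled values, sqrt(sum(x^2)/n); other conventions exist.

```java
class BinStatsSketch {

    // Hypothetical bin: knows only about the values filled into it.
    static final class ToyBin {
        private int size;
        private double sum;
        private double sumOfSquares;
        private double min = Double.POSITIVE_INFINITY;
        private double max = Double.NEGATIVE_INFINITY;

        // Updates the running sums and extremes with one more value.
        void add(double x) {
            size++;
            sum += x;
            sumOfSquares += x * x;
            min = Math.min(min, x);
            max = Math.max(max, x);
        }

        int size() { return size; }
        double min() { return min; }
        double max() { return max; }
        double mean() { return sum / size; }

        // Root mean square of the filled values (assumed convention).
        double rms() { return Math.sqrt(sumOfSquares / size); }
    }

    public static void main(String[] args) {
        ToyBin bin = new ToyBin();
        bin.add(3.0);
        bin.add(4.0);
        System.out.println("size=" + bin.size() + ", min=" + bin.min() + ", max=" + bin.max()
            + ", mean=" + bin.mean() + ", rms=" + bin.rms());
    }
}
```

Nothing in the toy bin refers to edges or indexes; in the design described above that information would be obtained from the axis instead.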
4. Advanced Histogramming
TODO.
Comparison with the old AIDA interfaces
A proposed simpler alternative to the current hep.aida.flat classes.
The classes in this directory have been proposed by Mark Donselmann, Wolfgang
Hoschek and Tony Johnson as a simpler, easier to use alternative to the classes
originally proposed as the AIDA standard.
Our goals were:
- Eliminate methods that are primarily for developers
writing display packages; they should not complicate the public user
interfaces.
- Reduce unnecessary duplication, which makes the
interfaces very long without adding any functionality or
ease of use.
- Eliminate methods that are hard to use (we
could not think of any occasion where the 8 separate methods for getting the 2D
overflow bins would be convenient for anyone).
Note that
ease of implementation was NOT a primary goal.
Following these goals we were able to reduce the number of methods as
follows:
OLD            # methods    NEW            # methods
IHistogram1D   45           IHistogram     9
IHistogram2D   89           IHistogram1D   9 (+ inherited from IHistogram)
                            IHistogram2D   23 (+9 inherited from IHistogram)
                            Axis           8
The primary differences between the old classes and the new classes
are:
- Introduction of an IAxis class, to describe the X
axis for 1D histograms, and the X and Y axes of 2D histograms. We understand
that the desire is to keep the interfaces as flat as possible, but feel this
introduces a significant improvement in terms of reducing complexity, and is
an abstraction that is easy for even the most object-phobic physicist to
grasp.
- We define constants OVERFLOW and UNDERFLOW to
represent the overflow and underflow bins on an axis. This eliminates the need
for special routines that deal with overflows/underflows. It also improves the
interface since it exposes the full set of overflow/underflow bins for 2D
histograms. Under the previous proposal it was necessary for the
implementation to keep the full set of overflow/underflow bins, in order to be
able to do the projections correctly, but there was no way for the end-user to
access them (they were restricted to the 8 overflow bins N,E,S,W,NE,SE,SW,NW).
- We eliminated the methods which return information
about bins based on coordinate (as opposed to index). We felt these functions
were rarely used and were in some cases ambiguous (for example, when
projections/slices were specified in terms of coordinates, it was unclear
exactly what was meant). The same functionality, with less ambiguity, is
available by calling coordToIndex() first.
A UML diagram of the classes is given below: