/*
Copyright 2006 by Sean Luke and George Mason University
Licensed under the Academic Free License version 3.0
See the file "LICENSE" for more information
*/
package sim.util.media.chart;
import sim.util.IntBag;
/**
 *
 * This is meant to accommodate an on-line algorithm for keeping a constant number of data points
 * from an ongoing time series.
 *
 * It only looks at the X values of the time series data points.
 *
 * Implementations can assume the X values arrive in increasing order.
*
*
 * Now if a series holds the data points in a Bag, removing data points
 * screws with the order, so the next time a data point needs to be
 * removed, its index will be compromised.
 *
 * The way this works is: I get a bunch of X values and I return the indices of the values that should be dropped.
 * It's then your job to delete the correct items and recompact the remaining data.
*
*
*
 * Changing the API to include the actual data item, not just its X value,
 * could be useful if one wants to choose based on similarity of Y values, not
 * just closeness of X values. Another example would be an implementation that averages data
 * (e.g. averaging 2 consecutive data points).
*
*
*
 * If a chart has multiple time series, does one cull each data series
 * separately? That is, is this a chart-global property, or does each series
 * have its own?
*
* I kinda expect that when multiple series reside in
* the same chart, they're "synchronized" in the sense that they have
* data points at the same moments in time. In this case, the cullers of
* those series do the same work, so one should be enough.
*
*
 * The user is expected to add the series at the same time. I make no attempt to
 * detect whether series in the same chart have the same set of X values.
*
*
 * The current implementation in TimeSeriesChartGenerator and the corresponding property inspector
 * uses a single DataCuller for all series. In the future I guess I could give each series a clone of the
 * data culling algorithm, if it turns out that stateful algorithms are needed.
*
*
*
 * In order to improve the amortized time complexity, more than 1 data point
 * should be culled at a time. E.g. as soon as you get 200 points, drop 100.
 *
 * After each such operation there's a linear-time
 * data-shifting pass, so it pays off to delete multiple points at a time.
 * It also reduces work during the operation itself, since one does not have to
 * scan from the beginning for each data point to be deleted.
*
 * Heaps might be helpful when deciding which points to drop. The recompacting is still linear,
 * but it should be very fast.
*
* @author Gabriel Balan
*/
public interface DataCuller
{
/** Returns true if the current number of points warrants culling. */
public boolean tooManyPoints(int currentPointCount);
/** Given the series' X values, returns the indices of the points to drop. */
public IntBag cull(double[] xValues, boolean sortOutput);
}
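
A minimal sketch of one possible implementation, following the "get 200 points, drop 100" strategy described above. This is illustrative only: since MASON's sim.util.IntBag is not shown here, the sketch stands in a plain List<Integer> for the returned indices, and the class name HalvingDataCuller is hypothetical, not part of MASON.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the DataCuller interface, returning a
// List<Integer> in place of MASON's sim.util.IntBag.
interface SimpleDataCuller
    {
    boolean tooManyPoints(int currentPointCount);
    List<Integer> cull(double[] xValues, boolean sortOutput);
    }

// Once the series exceeds maxPoints, drop every other interior point,
// roughly halving the series in one linear pass. The culler is stateless,
// so a single instance can safely serve multiple synchronized series.
class HalvingDataCuller implements SimpleDataCuller
    {
    final int maxPoints;

    HalvingDataCuller(int maxPoints) { this.maxPoints = maxPoints; }

    public boolean tooManyPoints(int currentPointCount)
        {
        return currentPointCount > maxPoints;
        }

    public List<Integer> cull(double[] xValues, boolean sortOutput)
        {
        List<Integer> dropped = new ArrayList<Integer>();
        // Drop odd indices, keeping the first and last points so the
        // visible range of the series is preserved. Indices are emitted
        // in increasing order, so the output is already sorted.
        for (int i = 1; i < xValues.length - 1; i += 2)
            dropped.add(i);
        return dropped;
        }
    }
```

The caller is still responsible, per the contract above, for deleting the returned indices and recompacting the remaining data in one pass.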