
net.algart.arrays.GeneralizedBitProcessing Maven / Gradle / Ivy


Open-source Java libraries supporting generalized smart arrays and matrices with elements of any type, including a wide set of 2D, 3D and multidimensional image processing and other algorithms working with arrays and matrices.

/*
 * The MIT License (MIT)
 *
 * Copyright (c) 2007-2024 Daniel Alievsky, AlgART Laboratory (http://algart.net)
 *
 * Permission is hereby granted, free of charge, to any person obtaining a copy
 * of this software and associated documentation files (the "Software"), to deal
 * in the Software without restriction, including without limitation the rights
 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
 * copies of the Software, and to permit persons to whom the Software is
 * furnished to do so, subject to the following conditions:
 *
 * The above copyright notice and this permission notice shall be included in all
 * copies or substantial portions of the Software.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
 * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
 * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
 * SOFTWARE.
 */

package net.algart.arrays;

import net.algart.math.functions.ConstantFunc;
import net.algart.math.Range;

import java.util.Objects;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

/**

Universal converter of a bitwise operation (an algorithm processing a {@link BitArray})
to an operation over any primitive type (an algorithm processing a {@link PArray}).

This class implements the following common idea. Suppose we have some algorithm
that transforms a source bit array ({@link BitArray}) b to another bit array ƒ(b).
(The interface {@link GeneralizedBitProcessing.SliceOperation} represents such an algorithm.)
Suppose we want to generalize this algorithm to the case of any element type —
byte, int, double, etc.; in other words, to the common case of {@link PArray}.
This class allows doing this.

Namely, let a be the source array ({@link PArray}), and let a_min..a_max be some numeric range,
a_min <= a_max, usually (but not necessarily) from the minimal to the maximal value stored
in the source array ({@link Arrays#rangeOf(PArray) Arrays.rangeOf(a)}).
This class can work in two modes, called rounding modes
(represented by the {@link GeneralizedBitProcessing.RoundingMode} enum):
the first mode is called round-down mode, and the second is called round-up mode.
Below is the specification of the behavior in both modes.

In both modes, we consider n+1 (n>=0) numeric thresholds
t_0, t_1, ..., t_n in the a_min..a_max range:

    t_0 = a_min,
    t_1 = a_min + (a_max - a_min)/n,
    . . .
    t_i = a_min + i * (a_max - a_min)/n,
    . . .
    t_n = a_max

(In the degenerate case n=0, we consider t_0 = t_n = a_min in the round-down mode
and t_0 = t_n = a_max in the round-up mode.)

Then we define the bit slice b_i, i=0,1,...,n, as the bit array a >= t_i (round-down mode)
or a > t_i (round-up mode), i.e. a {@link BitArray} with the same length as a, where each element is

    round-down mode:  b_i[k] = a[k] >= t_i ? 1 : 0
    round-up mode:    b_i[k] = a[k] > t_i ? 1 : 0

The described transformation of the numeric array a to the set of n+1 "slices"
(bit arrays) b_i is called splitting to slices.
Then we consider the backward conversion of the set of bit slices b_i
to a numeric array a', called joining the slices:

    round-down mode:  a'[k] = t_i for the maximal i such that b_i[k] = 1,
                      or a'[k] = a_min if all b_i[k] = 0
    round-up mode:    a'[k] = t_i for the minimal i such that b_i[k] = 0,
                      or a'[k] = a_max if all b_i[k] = 1

It is obvious that if all elements of a are inside the a_min..a_max range and if n is large enough,
then a' is almost equal to a. In particular, if a is a byte array ({@link ByteArray}), a_min=0,
a_max=255 and n=255, then a' is strictly equal to a (in both rounding modes).
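For illustration, here is a minimal standalone sketch (not part of the AlgART library; the class name,
the sample values and the number of slices are arbitrary assumptions) that performs the splitting and
joining described above for a tiny array in the round-down mode:

    // Plain-Java illustration of "splitting to slices" and "joining the slices" (round-down mode).
    public class SlicingDemo {
        public static void main(String[] args) {
            int[] a = {0, 10, 128, 200, 255};       // hypothetical source data
            double aMin = 0, aMax = 255;
            int n = 255;                            // n+1 = 256 thresholds t_0..t_n
            double[] joined = new double[a.length];
            java.util.Arrays.fill(joined, aMin);    // a'[k] = a_min if all slices contain 0 at k
            for (int i = 1; i <= n; i++) {
                double t = aMin + i * (aMax - aMin) / n;    // threshold t_i
                for (int k = 0; k < a.length; k++) {
                    boolean bit = a[k] >= t;                // slice element b_i[k]
                    if (bit) {
                        joined[k] = t;              // keeps t_i for the maximal i with b_i[k] = 1
                    }
                }
            }
            System.out.println(java.util.Arrays.toString(joined));
            // With a_min=0, a_max=255 and n=255 the joined values coincide with the source values.
        }
    }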

The main task solved by this class is converting any bitwise operation ƒ(b)
to a new operation g(a), defined for any primitive-type array a, according to the following rule:

    round-down mode:  g(a)[k] = t_i for the maximal i such that ƒ(b_i)[k] = 1,
                      or g(a)[k] = a_min if all ƒ(b_i)[k] = 0
    round-up mode:    g(a)[k] = t_i for the minimal i such that ƒ(b_i)[k] = 0,
                      or g(a)[k] = a_max if all ƒ(b_i)[k] = 1

In other words, the source array is split into bit slices, the bitwise operation is applied
to all slices, and then we perform the backward conversion (joining) of the slice set
to the numeric array g(a).

The conversion of the source array a to the target array c=g(a) is performed by the main
method of this class: {@link #process(UpdatablePArray, PArray, Range, long) process(c, a, range, numberOfSlices)}.
The a_min..a_max range and the number of slices numberOfSlices=n+1 are specified as arguments of this method.
The original bitwise algorithm should be specified while creating an instance of this class by the
{@link #getInstance(ArrayContext, GeneralizedBitProcessing.SliceOperation, GeneralizedBitProcessing.RoundingMode)
getInstance} method.
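The following hedged usage sketch shows how these pieces fit together; the COPY_SLICE operation,
the processExample method and the choice of 256 slices are illustrative assumptions, not part of
the library (a real application would implement some non-trivial bitwise algorithm in processBits):

    import net.algart.arrays.*;
    import net.algart.math.Range;

    public class GeneralizedBitProcessingUsageSketch {
        // A deliberately trivial slice operation: every bit slice is copied unchanged,
        // so g(a) approximately reproduces a.
        static final GeneralizedBitProcessing.SliceOperation COPY_SLICE =
            new GeneralizedBitProcessing.SliceOperation() {
                public void processBits(ArrayContext context,
                                        UpdatableBitArray destBits, BitArray srcBits,
                                        long sliceIndex, int threadIndex, int numberOfThreads) {
                    Arrays.copy(context, destBits, srcBits);
                }

                public boolean isInPlaceProcessingAllowed() {
                    return false;   // request distinct source and destination bit arrays
                }
            };

        static void processExample(ArrayContext context, UpdatablePArray dest, PArray src) {
            GeneralizedBitProcessing gbp = GeneralizedBitProcessing.getInstance(
                context, COPY_SLICE, GeneralizedBitProcessing.RoundingMode.ROUND_DOWN);
            Range range = Arrays.rangeOf(src);  // a_min..a_max: the full range of the source values
            long numberOfSlices = 256;          // n+1; an arbitrary choice for this sketch
            gbp.process(dest, src, range, numberOfSlices);
        }
    }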

Note that the described operation does not require calculating ƒ(b_0) in the round-down mode
or ƒ(b_n) in the round-up mode: the result does not depend on it.
Also note that the case n=0 is degenerate: in this case we always have
g(a)[k] = a_min (round-down mode) or a_max (round-up mode).

An additional useful note: for some kinds of bitwise algorithms, you can improve the precision
of the results by replacing (after calling the {@link #process(UpdatablePArray, PArray, Range, long) process}
method) the resulting array c=g(a) with the elementwise maximum (round-down mode) or elementwise
minimum (round-up mode) of c and the source array a: c=max(c,a) or c=min(c,a) correspondingly.
You can do this easily by the
{@link Arrays#applyFunc(ArrayContext, net.algart.math.functions.Func, UpdatablePArray, PArray...)} method
with the {@link net.algart.math.functions.Func#MAX Func.MAX} or {@link net.algart.math.functions.Func#MIN Func.MIN}
argument.
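A sketch of this refinement for the round-down mode, continuing the usage example above
(the helper name refineRoundDownResult is hypothetical; dest and src are the arrays passed to process):

    // Replace dest with the elementwise maximum of dest and the source array (round-down mode).
    static void refineRoundDownResult(ArrayContext context, UpdatablePArray dest, PArray src) {
        net.algart.arrays.Arrays.applyFunc(
            context, net.algart.math.functions.Func.MAX, dest, dest, src);
        // In the round-up mode, Func.MIN would be used instead of Func.MAX.
    }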

This class allocates, in the {@link #process(UpdatablePArray, PArray, Range, long) process} method,
some temporary {@link UpdatableBitArray bit arrays}. The arrays are allocated with the help of the
memory model returned by the context.{@link ArrayContext#getMemoryModel() getMemoryModel()} method
of the context specified while creating an instance of this class.
If the context is {@code null}, or if the necessary amount of memory is less than
{@link Arrays.SystemSettings#maxTempJavaMemory()}, then {@link SimpleMemoryModel} is used
for allocating temporary arrays. There is a special case when the
{@link #process(UpdatablePArray, PArray, Range, long) process} method
is called for bit arrays (element type boolean); in this case, no AlgART arrays are allocated.

When this class allocates temporary AlgART arrays, it also checks whether the passed (source and
destination) arrays are tiled, i.e. are underlying arrays of some matrices created by the
{@link Matrix#tile(long...)} or {@link Matrix#tile()} method.
Only the case when these methods are implemented in this package is recognized.
In this case, if the src array passed to the {@link #process(UpdatablePArray, PArray, Range, long) process}
method is tiled, then the temporary AlgART arrays are tiled with the same tile structure.

This class uses multithreading (similarly to {@link Arrays#copy(ArrayContext, UpdatableArray, Array)}
and similar methods) to optimize calculations on multiprocessor or multicore computers.
Namely, the {@link #process(UpdatablePArray, PArray, Range, long) process} method
processes different bit slices in several parallel threads.
However, this is not useful if the bitwise processing method {@link
GeneralizedBitProcessing.SliceOperation#processBits(ArrayContext, UpdatableBitArray, BitArray, long, int, int)}
already uses multithreading optimization. In this case, please create an instance of this class via the
{@link #getSingleThreadInstance(ArrayContext, GeneralizedBitProcessing.SliceOperation,
GeneralizedBitProcessing.RoundingMode) getSingleThreadInstance} method.

This class is not thread-safe, but is thread-compatible
and can be synchronized manually if multithreaded access is necessary.
However, usually there are no reasons to use the same instance of this class in different threads:
it is usually much better to create a separate instance for every thread.

* * @author Daniel Alievsky */ public class GeneralizedBitProcessing extends AbstractArrayProcessorWithContextSwitching { /** * Rounding mode, in which {@link GeneralizedBitProcessing} class works: see comments to that class. */ public enum RoundingMode { /** * Round-down mode. */ ROUND_DOWN, /** * Round-up mode. */ ROUND_UP } /** * Algorithm of processing bit arrays, that should be generalized for another element types via * {@link GeneralizedBitProcessing} class. */ public interface SliceOperation { /** * Processes the source bit array srcBits and saves the results in * destBits bit array. * This method is called by * {@link GeneralizedBitProcessing#process(UpdatablePArray, PArray, Range, long)} method * for every {@link GeneralizedBitProcessing bit slice} of the source non-bit array. * It is the main method, which you should implement to generalize some bit algorithm for non-bit case. * *

The destBits and srcBits arrays will be different bit arrays, allocated by the
{@link GeneralizedBitProcessing#process(UpdatablePArray, PArray, Range, long) process} method
(according to the memory model recommended by the {@link GeneralizedBitProcessing#context() context}),
if the {@link #isInPlaceProcessingAllowed()} method returns false.
If that method returns true, the destBits and srcBits arguments
will be references to the same bit array.

The index i of the slice processed by this method is passed via the sliceIndex argument.
In other words, the threshold corresponding to this slice is

    t_i = a_min + i * (a_max - a_min)/n,  i = sliceIndex

(here a_min..a_max is the range of processed values and n+1 is the desired number of slices,
passed via the range and numberOfSlices arguments of the
{@link GeneralizedBitProcessing#process(UpdatablePArray, PArray, Range, long) process} method).
In the round-down mode sliceIndex is always in the range 1..n,
and in the round-up mode it is always in the range 0..n-1.
The slice with index i=0 (round-down mode) or i=n (round-up mode)
is never processed, because the end result does not depend on it.
See comments to the {@link GeneralizedBitProcessing} class for more details.

Please note that this method can be called simultaneously in different threads,
when the {@link GeneralizedBitProcessing} class uses multithreading optimization.
In this case, the threadIndex argument will be the index of the thread
in which this method is called, i.e. an integer in the range

    0..min(n-1, {@link GeneralizedBitProcessing#numberOfTasks()}-1)

where n+1 is the desired number of slices.
The high bound of this range, increased by 1, is also passed via the numberOfThreads argument.
The threadIndex can be very useful if your algorithm requires some work memory or other objects:
in this case, you should allocate different work memory for different threads.
If multithreading optimization is not used, in particular, if the arrays, processed by * {@link GeneralizedBitProcessing#process(UpdatablePArray, PArray, Range, long) process} method, * are {@link BitArray bit arrays}, then threadIndex=0 and numberOfThreads=1. * * @param context the context of execution. It will be {@code null}, if (and only if) * the same argument of * {@link GeneralizedBitProcessing#process(UpdatablePArray, PArray, Range, long) * process} method is {@code null}; in this case, the context should be ignored. * The main purpose of the context is to allow interruption of this method via * {@link ArrayContext#checkInterruption()} and to allocate * work memory via {@link ArrayContext#getMemoryModel()}. * @param destBits the destination bit array, where results of processing should be stored. * @param srcBits the source bit array for processing. * @param sliceIndex the index of the currently processed slice, * from 1 to n in the round-down mode or * from 0 to n-1 the round-up mode * (n is the desired number of slices minus 1). * @param threadIndex the index of the current thread (different for different threads in a case of * multithreading optimization). * @param numberOfThreads the maximal possible value of threadIndex+1: equal to * min(numberOfSlices−1,{@link * GeneralizedBitProcessing#numberOfTasks()}), * where numberOfSlices=n+1 is the argument of * {@link GeneralizedBitProcessing#process(UpdatablePArray, PArray, Range, long) * process} method. * @throws NullPointerException if srcBits or destBits argument is {@code null}. */ void processBits( ArrayContext context, UpdatableBitArray destBits, BitArray srcBits, long sliceIndex, int threadIndex, int numberOfThreads); /** * Indicates whether this algorithm can work in place. * *

Some algorithms, processing bit arrays, can work in place, when the results are stored in the source * array. In this case, this method should return true. If it returns true, * {@link GeneralizedBitProcessing#process(UpdatablePArray, PArray, Range, long)} method * saves memory and time by passing the same bit array as destBits and * srcBits arguments * of {@link #processBits(ArrayContext, UpdatableBitArray, BitArray, long, int, int) processBits} method. * If it returns false, * {@link GeneralizedBitProcessing#process(UpdatablePArray, PArray, Range, long)} method * allocates 2 different bit arrays for source and resulting bit arrays and passes * them as destBits and srcBits arguments * of {@link #processBits(ArrayContext, UpdatableBitArray, BitArray, long, int, int) processBits} method. * * @return true if {@link #processBits(ArrayContext, UpdatableBitArray, BitArray, long, int, int) * processBits} method can work correctly when destBits==srcBits or false * if that method requires different source and destination arrays. */ boolean isInPlaceProcessingAllowed(); } private final ThreadPoolFactory threadPoolFactory; private final int numberOfTasks; private final SliceOperation sliceOperation; private final RoundingMode roundingMode; private final boolean inPlaceProcessingAllowed; private GeneralizedBitProcessing(ArrayContext context, SliceOperation sliceOperation, RoundingMode roundingMode, int numberOfTasks) { super(context); Objects.requireNonNull(sliceOperation, "Null sliceOperation argument"); Objects.requireNonNull(roundingMode, "Null roundingMode argument"); this.threadPoolFactory = Arrays.getThreadPoolFactory(context); this.numberOfTasks = numberOfTasks > 0 ? numberOfTasks : Math.max(1, this.threadPoolFactory.recommendedNumberOfTasks()); this.sliceOperation = sliceOperation; this.roundingMode = roundingMode; this.inPlaceProcessingAllowed = sliceOperation.isInPlaceProcessingAllowed(); } /** * Returns new instance of this class. * * @param context the {@link #context() context} that will be used by this object; * can be {@code null}, then it will be ignored, and all temporary arrays * will be created by {@link SimpleMemoryModel}. * @param sliceOperation the bit processing operation that will be generalized by this instance. * @param roundingMode the rounding mode, used by the created instance. * @return new instance of this class. * @throws NullPointerException if sliceOperation or roundingMode argument is {@code null}. * @see #getSingleThreadInstance(ArrayContext, SliceOperation, net.algart.arrays.GeneralizedBitProcessing.RoundingMode) */ public static GeneralizedBitProcessing getInstance(ArrayContext context, SliceOperation sliceOperation, RoundingMode roundingMode) { return new GeneralizedBitProcessing(context, sliceOperation, roundingMode, 0); } /** * Returns new instance of this class, that does not use multithreading optimization. * Usually this class performs calculations in many threads * (different slides in different threads), according to information from the passed context, * but an instance, created by this method, does not perform this. * It is the best choice if the operation, implemented by passed sliceOperation object, * already uses multithreading. * * @param context the {@link #context() context} that will be used by this object; * can be {@code null}, then it will be ignored, and all temporary arrays * will be created by {@link SimpleMemoryModel}. * @param sliceOperation the bit processing operation that will be generalized by this instance. 
* @param roundingMode the rounding mode, used by the created instance. * @return new instance of this class. * @throws NullPointerException if sliceOperation or roundingMode argument is {@code null}. * @see #getInstance(ArrayContext, SliceOperation, net.algart.arrays.GeneralizedBitProcessing.RoundingMode) */ public static GeneralizedBitProcessing getSingleThreadInstance(ArrayContext context, SliceOperation sliceOperation, RoundingMode roundingMode) { return new GeneralizedBitProcessing(context, sliceOperation, roundingMode, 1); } /** * Switches the context: returns an instance, identical to this one excepting * that it uses the specified newContext for all operations. * The returned instance is a clone of this one, but there is no guarantees * that it is a deep clone. * Usually, the returned instance is used only for performing a * {@link ArrayContext#part(double, double) subtask} of the full task. * * @param newContext another context, used by the returned instance; can be {@code null}. * @return new instance with another context. */ @Override public GeneralizedBitProcessing context(ArrayContext newContext) { return newContext == this.context() ? this : new GeneralizedBitProcessing(newContext, this.sliceOperation, this.roundingMode, this.numberOfTasks); } /** * Returns the bit processing algorithm, used by this instance. * More precisely, this method just returns a reference to the sliceOperation object, * passed to {@link * #getInstance(ArrayContext, GeneralizedBitProcessing.SliceOperation, GeneralizedBitProcessing.RoundingMode) * getInstance} or {@link #getSingleThreadInstance(ArrayContext, GeneralizedBitProcessing.SliceOperation, * GeneralizedBitProcessing.RoundingMode) getSingleThreadInstance} method. * * @return the bit processing algorithm, used by this instance. */ public SliceOperation sliceOperation() { return this.sliceOperation; } /** * Returns the rounding mode, used by this instance. * More precisely, this method just returns a reference to the roundingMode object, * passed to {@link * #getInstance(ArrayContext, GeneralizedBitProcessing.SliceOperation, GeneralizedBitProcessing.RoundingMode) * getInstance} or {@link #getSingleThreadInstance(ArrayContext, GeneralizedBitProcessing.SliceOperation, * GeneralizedBitProcessing.RoundingMode) getSingleThreadInstance} method. * * @return the roundingMode, used by this instance. */ public RoundingMode roundingMode() { return this.roundingMode; } /** * Returns the number of threads, that this class uses for multithreading optimization. * It is equal to: *

  • 1, if this instance was created by the
    {@link #getSingleThreadInstance(ArrayContext, GeneralizedBitProcessing.SliceOperation,
    GeneralizedBitProcessing.RoundingMode) getSingleThreadInstance} method, or
  • context.{@link ArrayContext#getThreadPoolFactory()
    getThreadPoolFactory()}.{@link ThreadPoolFactory#recommendedNumberOfTasks() recommendedNumberOfTasks()},
    if it was created by the {@link #getInstance(ArrayContext, GeneralizedBitProcessing.SliceOperation,
    GeneralizedBitProcessing.RoundingMode) getInstance} method.

(As in {@link Arrays#copy(ArrayContext, UpdatableArray, Array)}, if the context argument of the
{@link #getInstance(ArrayContext, GeneralizedBitProcessing.SliceOperation, GeneralizedBitProcessing.RoundingMode)
getInstance} method is {@code null}, then {@link DefaultThreadPoolFactory} is used.)

Note that the real number of parallel threads will be a minimum from this value and * n, where n+1 is the desired number of slices (the last argument of * {@link #process(UpdatablePArray, PArray, Range, long)} method). * * @return the number of threads, that this class uses for multithreading optimization. */ public int numberOfTasks() { return this.numberOfTasks; } /** * Performs processing of the source array src, with saving results in dest array, * on the base of the bit processing algorithm, specified for this instance. * Namely, the source array src is splitted to bit slices, each slice is processed by * {@link * GeneralizedBitProcessing.SliceOperation#processBits(ArrayContext, UpdatableBitArray, BitArray, long, int, int)} * method of the {@link #sliceOperation()} object (representing the used bit processing algorithm), * and the processed slices are joined to the resulting array dest. * See the precise description of this generalization of a bit processing algorithm * in the {@link GeneralizedBitProcessing comments to this class}. * *

The src and dest arrays must not be the same array or views of the same array;
otherwise, the results will be incorrect.
These arrays must have the same element types and the same lengths; otherwise,
an exception will occur.

The range argument specifies the a_min..a_max range used for splitting to bit slices.
If you do not want to lose too small or too large values in the processed array, this range should
contain all possible values of the src array.
The simplest way to ensure this is to use the result of {@link Arrays#rangeOf(PArray) Arrays.rangeOf(src)}.

The numberOfSlices argument is the number of bit slices used for splitting the src array,
which is equal to n+1 in the {@link GeneralizedBitProcessing comments to this class}.
Smaller values of this argument increase speed, larger values increase the precision of the result
(only numberOfSlices different values are possible for dest elements).
If the element type is a fixed-point type (src and dest are {@link PFixedArray} instances),
this argument is automatically truncated to
min(numberOfSlices, (long)(range.{@link Range#size() size()}+1.0)),
because larger values cannot increase the precision.

Please remember that numberOfSlices=1 (n=0) is a degenerate case: in this case,
the dest array is just filled with range.min() (round-down mode) or range.max() (round-up mode),
as by the {@link UpdatablePArray#fill(double)} method.

If the element type is boolean ({@link BitArray}), then a generalization of the bitwise algorithm
is not necessary. In this case, if numberOfSlices>1, this method just calls the
{@link #sliceOperation()}.{@link
GeneralizedBitProcessing.SliceOperation#processBits(ArrayContext, UpdatableBitArray, BitArray, long, int, int)
processBits} method for the passed dest and src arrays.
(However, according to the specification of {@link GeneralizedBitProcessing.SliceOperation}, if its
{@link GeneralizedBitProcessing.SliceOperation#isInPlaceProcessingAllowed() isInPlaceProcessingAllowed()}
method returns true, then the src array is first copied into dest,
and then the dest array is passed as both the srcBits and destBits arguments.)

Note: if the element type is boolean, then multithreading is never used: * {@link * GeneralizedBitProcessing.SliceOperation#processBits(ArrayContext, UpdatableBitArray, BitArray, long, int, int) * processBits} method is called * in the current thread, and its threadIndex and numberOfThreads arguments are * 0 and 1 correspondingly. * * @param dest the result of processing. * @param src the source AlgART array. * @param range the amin..amax range, * used for splitting to bit slices. * @param numberOfSlices the number of bit slices (i.e. n+1); must be positive. * @throws NullPointerException if one of the arguments is {@code null}. * @throws IllegalArgumentException if dest and src arrays have different lengths * or element types, or if numberOfSlices<=0. */ public void process(UpdatablePArray dest, PArray src, Range range, long numberOfSlices) { Objects.requireNonNull(dest, "Null dest argument"); Objects.requireNonNull(src, "Null src argument"); Objects.requireNonNull(range, "Null range argument"); if (src.elementType() != dest.elementType()) { throw new IllegalArgumentException("Different element types of src and dest arrays (" + src.elementType().getCanonicalName() + " and " + dest.elementType().getCanonicalName() + ")"); } if (dest.length() != src.length()) { throw new SizeMismatchException("Different lengths of src and dest arrays (" + src.length() + " and " + dest.length() + ")"); } if (numberOfSlices <= 0) { throw new IllegalArgumentException("numberOfSlices must be positive"); } if (src instanceof PFixedArray) { numberOfSlices = Math.min(numberOfSlices, (long)(range.size() + 1.0)); } if (numberOfSlices >= 2 && src instanceof BitArray) { if (inPlaceProcessingAllowed) { Arrays.copy(contextPart(0.0, 0.05), dest, src); this.sliceOperation.processBits(contextPart(0.05, 1.0), (UpdatableBitArray)dest, (BitArray)dest, 1, 0, 1); } else { this.sliceOperation.processBits(context(), (UpdatableBitArray)dest, (BitArray)src, 1, 0, 1); } return; } long numberOfRanges = numberOfSlices - 1; // numberOfSlices thresholds split the range into numberOfRanges ranges: // range.min(), range.min() + range.size() / numberOfRanges, ..., range.max() ArrayContext acFill = numberOfRanges == 0 ? context() : contextPart(0.0, 0.1 / numberOfRanges); ConstantFunc filler = switch (roundingMode) { case ROUND_DOWN -> ConstantFunc.getInstance(range.min()); case ROUND_UP -> ConstantFunc.getInstance(range.max()); default -> throw new AssertionError(); }; Arrays.copy(acFill, dest, Arrays.asIndexFuncArray(filler, dest.type(), dest.length())); // equivalent to dest.fill(range.min()), but supports context (important for a case of very large disk array) if (numberOfRanges == 0) { // degenerated case: numberOfSlices == 1 return; } assert numberOfSlices >= 2; Runnable[] tasks = createTasks(contextPart(0.1 / numberOfRanges, 1.0), dest, src, range, numberOfRanges); threadPoolFactory.performTasks(tasks); } private Runnable[] createTasks(ArrayContext context, final UpdatablePArray dest, final PArray src, final Range range, final long numberOfRanges) { long len = src.length(); assert dest.length() == len; assert dest.elementType() == src.elementType(); assert numberOfRanges >= 1 : "1 slice must be processed by trivial filling out of this method"; int nt = (int)Math.min(this.numberOfTasks, numberOfRanges); UpdatableBitArray[] srcBits = allocateMemory(nt, src); UpdatableBitArray[] destBits = this.inPlaceProcessingAllowed ? 
srcBits : allocateMemory(nt, src); Runnable[] tasks = new Runnable[nt]; AtomicLong readyLayers = new AtomicLong(0); AtomicBoolean interruptionRequest = new AtomicBoolean(false); if (context != null && nt > 1) { context = context.singleThreadVersion(); } Lock lock = new ReentrantLock(); Condition condition = lock.newCondition(); for (int threadIndex = 0; threadIndex < tasks.length; threadIndex++) { SliceProcessingTask task = new SliceProcessingTask(context, threadIndex, nt, lock, condition); task.setProcessedData(dest, src, destBits, srcBits); task.setRanges(range, numberOfRanges); task.setSynchronizationVariables(readyLayers, interruptionRequest); tasks[threadIndex] = task; } return tasks; } private UpdatableBitArray[] allocateMemory(int numberOfTasks, PArray src) { long nba = this.sliceOperation.isInPlaceProcessingAllowed() ? numberOfTasks : 2 * (long)numberOfTasks; long arrayLength = src.length(); MemoryModel mm = Arrays.sizeOf(boolean.class, arrayLength) < Arrays.SystemSettings.maxTempJavaMemory() / nba ? SimpleMemoryModel.getInstance() : memoryModel(); UpdatableBitArray[] result = new UpdatableBitArray[numberOfTasks]; Matrix srcBaseMatrix = !Arrays.isTiled(src) ? null : ((ArraysTileMatrixImpl.TileMatrixArray)src).baseMatrix().cast(PArray.class); for (int k = 0; k < result.length; k++) { result[k] = mm.newUnresizableBitArray(arrayLength); if (srcBaseMatrix != null) { result[k] = srcBaseMatrix.matrix(result[k]).tile(Arrays.tileDimensions(src)).array(); } } return result; } private class SliceProcessingTask implements Runnable { private final ArrayContext context; private final int threadIndex; private final int numberOfThreads; private final Lock lock; private final Condition condition; private Range range; private long numberOfRanges; private AtomicLong readyLayers; private AtomicBoolean interruptionRequest; private UpdatablePArray dest; private PArray src; private UpdatableBitArray[] srcBits; private UpdatableBitArray[] destBits; private SliceProcessingTask(ArrayContext context, int threadIndex, int numberOfThreads, Lock lock, Condition condition) { this.context = context; this.threadIndex = threadIndex; this.numberOfThreads = numberOfThreads; this.lock = lock; this.condition = condition; } public void setRanges(Range range, long numberOfRanges) { this.range = range; this.numberOfRanges = numberOfRanges; } private void setProcessedData(UpdatablePArray dest, PArray src, UpdatableBitArray[] destBits, UpdatableBitArray[] srcBits) { this.dest = dest; this.src = src; this.destBits = destBits; this.srcBits = srcBits; } private void setSynchronizationVariables(AtomicLong readyLayers, AtomicBoolean interruptionRequest) { this.readyLayers = readyLayers; this.interruptionRequest = interruptionRequest; } public void run() { if (lock == null || range == null || dest == null || src == null) { throw new AssertionError(this + " is not initialized correctly"); } for (long sliceIndex = threadIndex + 1; sliceIndex <= numberOfRanges; sliceIndex += numberOfThreads) { // threadIndex+1 because processing the layer #0 is trivial and should be skipped double value = switch (roundingMode) { case ROUND_DOWN -> sliceIndex == numberOfRanges ? range.max() : // to be on the safe side range.min() + range.size() * (double) sliceIndex / numberOfRanges; case ROUND_UP -> sliceIndex == numberOfRanges ? 
range.min() : // to be on the safe side range.max() - range.size() * (double) sliceIndex / numberOfRanges; default -> throw new AssertionError(); }; ArrayContext acToBit, acOp, acFromBit; if (context != null) { ArrayContext ac = context.part(sliceIndex - 1, sliceIndex, numberOfRanges); acToBit = numberOfThreads == 1 ? ac.part(0.0, 0.05) : context.noProgressVersion(); acOp = numberOfThreads == 1 ? ac.part(0.05, 0.95) : context.noProgressVersion(); acFromBit = ac.part(0.95, 1.0); } else { acToBit = acOp = acFromBit = ArrayContext.DEFAULT_SINGLE_THREAD; // if context==null, then numberOfThreads is the default system recommended number of threads; // if it is >1, then we need provide single-thread processing in every slice, // and if it is 1, then "null" context works like DEFAULT_SINGLE_THREAD } if (interruptionRequest.get()) { return; } // long t1 = System.nanoTime(); switch (roundingMode) { case ROUND_DOWN: Arrays.packBitsGreaterOrEqual(acToBit, srcBits[threadIndex], src, value); break; case ROUND_UP: Arrays.packBitsGreater(acToBit, srcBits[threadIndex], src, value); break; } // Below is a slower code (commented out), equivalent to this "packBitsGreaterOrEqual" // (but not to "packBitsLessOrEqual"): // // Arrays.applyFunc(acToBit, // net.algart.math.functions.RectangularFunc.getInstance(value, // Double.POSITIVE_INFINITY, 1.0, 0.0), srcBits[threadIndex], src); // long t2 = System.nanoTime(); sliceOperation.processBits(acOp, destBits[threadIndex], srcBits[threadIndex], roundingMode == RoundingMode.ROUND_DOWN ? sliceIndex : numberOfRanges - sliceIndex, threadIndex, numberOfThreads); // long t3 = System.nanoTime(); lock.lock(); try { if (interruptionRequest.get()) { return; // important to avoid writing anywhere if interruption is necessary } while (readyLayers.get() < sliceIndex - 1) { // This loop cannot be infinite. // Proof. // Note that sliceIndex takes all values m=1,2,...,numberOfRanges in some order. // Let's prove by induction with m, that this loop is not infinite when sliceIndex=m. // 1. If sliceIndex=m=1, it is obvious: readyLayers.get() is never < 0. // 2. Let's suppose, that this loop was successfully performed and finished in some thread, // when sliceIndex was equal to m-1. And let now sliceIndex be equal to m. // Because this loop was successfully performed with sliceIndex = m-1, // so, readyLayers.get() had become >= m-2. Some time after this, // readyLayers.incrementAndGet() operator below was executed, // and readyLayers.get() must become >= m-1. // At this moment, this loop must finish with sliceIndex = m. 
try { condition.await(100, TimeUnit.MILLISECONDS); // 100 ms - to be on the safe side, if "signalAll" has not help } catch (InterruptedException e) { interruptionRequest.set(true); return; } if (context != null) { context.checkInterruption(); } // to be on the safe side, allow interruption even in a case of a bug here } switch (roundingMode) { case ROUND_DOWN: Arrays.unpackUnitBits(acFromBit, dest, destBits[threadIndex], value); break; case ROUND_UP: Arrays.unpackZeroBits(acFromBit, dest, destBits[threadIndex], value); break; } // Below is a slower code (commented out), equivalent to this "unpackUnitBits" // (but not to "unpackZeroBits"), which, however, // does not need waiting for readyLayers.get() == sliceIndex-1: // // PArray castBits = Arrays.asFuncArray( // net.algart.math.functions.LinearFunc.getInstance(0.0, value), // dest.type(), destBits[threadIndex]); // Arrays.applyFunc(acFromBit, net.algart.math.functions.Func.MAX, // dest, dest, castBits); long ready = readyLayers.incrementAndGet(); if (ready != sliceIndex) { throw new AssertionError("Invalid synchronization in " + GeneralizedBitProcessing.class); } condition.signalAll(); } finally { lock.unlock(); } // System.out.printf("%d: %.5f + %.5f + %.5f, %d tasks (%s <-- %s)%n", // sliceIndex, (t2 - t1) * 1e-6, (t3 - t2) * 1e-6, (System.nanoTime() - t3) * 1e-6, // numberOfThreads, destBits[threadIndex], srcBits[threadIndex]); } } } }




