Open-source Java libraries, supporting generalized smart arrays and matrices with elements of any types, including a wide set of 2D-, 3D- and multidimensional image processing and other algorithms, working with arrays and matrices.

/*
 * The MIT License (MIT)
 *
 * Copyright (c) 2007-2024 Daniel Alievsky, AlgART Laboratory (http://algart.net)
 *
 * Permission is hereby granted, free of charge, to any person obtaining a copy
 * of this software and associated documentation files (the "Software"), to deal
 * in the Software without restriction, including without limitation the rights
 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
 * copies of the Software, and to permit persons to whom the Software is
 * furnished to do so, subject to the following conditions:
 *
 * The above copyright notice and this permission notice shall be included in all
 * copies or substantial portions of the Software.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
 * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
 * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
 * SOFTWARE.
 */

package net.algart.matrices;

import net.algart.arrays.*;
import net.algart.arrays.Arrays;
import net.algart.math.IPoint;
import net.algart.math.IRectangularArea;

import java.util.*;

/**
 * 

Tiler: generator of tiled {@link ApertureProcessor aperture matrix processors}. The tiler works with some algorithm, represented by the {@link ApertureProcessor} interface and called the one-tile processor: it can be any algorithm that processes one or several {@link Matrix n-dimensional matrices} (with identical sets of dimensions) and returns one or several other matrices as a result (with the same sets of dimensions). The only requirement is that the value of every element of the resulting matrices depends only on the elements of the source matrices in a fixed rectangular aperture "around" the same position, as described in the {@link ApertureProcessor} interface. This class converts the given one-tile processor into another aperture processor, called a tiled processor, which is built on the original one-tile processor and is functionally equivalent to it. (The equivalence can be violated near the bounds of the matrices, where the tiled processor provides several models of continuation: see the section "Continuation model outside the bounds of the large matrices" below.) This new processor splits all the matrices into relatively small tiles (rectangular areas, i.e. sub-matrices), processes every tile with the one-tile processor and places the results into the corresponding sub-matrices of the resulting matrices. Such a conversion of one algorithm into another is called tiling an algorithm and is performed by the {@link #tile(ApertureProcessor)} method, the main method of this class.

* * *

Why to tile aperture processors

* *

The goal of tiling an algorithm is to optimize the processing of very large matrices, usually located on external storage devices (for example, with the help of {@link LargeMemoryModel}).

* *

First, very large matrices are usually {@link Matrix#tile() tiled}, but many algorithms process matrices in a simple "streaming" manner, i.e. load elements in the order corresponding to the order of elements in the {@link Matrix#array() built-in AlgART array}. This loading order is inefficient for tiled matrices. The same algorithms, tiled with the help of this class, process large tiled matrices in a more efficient order: they load a rectangular block from all source matrices into newly created (relatively small) matrices, process them and store the results into the corresponding sub-matrices of the destination matrices. For maximal efficiency, the tiler tries to use {@link SimpleMemoryModel} for storing and processing every rectangular block (a tile); you can control this with the maxTempJavaMemory argument of all instantiation methods getInstance. In addition, the matrices allocated by the tiled processor (if it creates them) are automatically tiled by the {@link Matrix#tile(long...)} method (see more details below in the specification of the process method in the tiled aperture processors, stage 4.d).

* *

Second, many algorithms (for example, the basic implementation of {@link net.algart.matrices.morphology.Morphology mathematical morphology} from the {@link net.algart.matrices.morphology} package) are multipass, i.e. process source matrices in many passes through the whole matrix. This can be very important for high performance when all data are located in RAM, especially in the form of Java memory (via {@link SimpleMemoryModel}), but it can slow down the calculations extremely when the source matrices are very large and located on a disk or another storage, because the algorithm loads all data from external devices again during each pass. Unlike this, a matrix processor generated by the tiler (by the {@link #tile(ApertureProcessor)} method) is always single-pass: each tile is loaded and saved only once, and multipass processing is applied to relatively small tile matrices, usually allocated in {@link SimpleMemoryModel}.

* *

Third, tiling by this class is the simplest way to optimize an algorithm for multiprocessor or multicore computers if the algorithm does not provide multithreading optimization itself: several tiles can be processed simultaneously in parallel threads. See the notes on multithreading below.

* * *

Specification of {@link ApertureProcessor#process(Map, Map) process} and other methods in the tiled aperture processors

* *

Here is the precise specification of the behavior of the {@link ApertureProcessor} tiled by this tiler, i.e. of the result of {@link #tile(ApertureProcessor oneTileProcessor)}. We call the argument of this method the one-tile processor, and the result of this method the tiled processor.

* *
  • The generic type K of the indexes of the source and resulting matrices in the tiled processor is the same as in the one-tile processor. (This is obvious from the declaration of the {@link #tile(ApertureProcessor) tile} method.)

  • The {@link ApertureProcessor#process(Map dest, Map src)} method of the tiled processor does the following.

    1. It checks whether the arguments are correct. If they violate one of the common requirements described in the "Throws" section of the {@link ApertureProcessor#process(Map, Map) comments to the "process" method}, a corresponding exception is thrown. In particular, this implementation checks that all matrices in the dest map are either {@code null} or updatable, i.e. their {@link Matrix#array() built-in arrays} implement the {@link UpdatableArray} interface: if at least one non-null matrix in the dest map is not updatable, IllegalArgumentException is thrown.
       In addition, this implementation throws IllegalArgumentException if one of the passed matrices has a {@link Matrix#dimCount() number of dimensions} other than the number of dimensions of this tiler, returned by its {@link #dimCount()} method.
       If no source matrices and no non-null resulting matrices were passed to this method, i.e. if src.isEmpty() and either dest.isEmpty() or dest.get(key)==null for all keys, then the {@link ApertureProcessor#process(Map, Map) process} method does nothing and returns immediately. If at least one of the dimensions of the passed matrices is 0, then the {@link ApertureProcessor#process(Map, Map) process} method also does nothing and returns immediately (there are no elements to process).

    2. It calculates the maximal dependence aperture A_m: the minimal integer rectangular area (an instance of {@link IRectangularArea}) containing all dependence apertures A_i of the one-tile processor (returned by its {@link ApertureProcessor#dependenceAperture(Object) dependenceAperture(i)} method) for all indexes i ∈ Q = src.keySet(), and also containing the origin of coordinates. During this calculation, IndexOutOfBoundsException is thrown if the number of dimensions of one of the results of the {@link ApertureProcessor#dependenceAperture(Object) dependenceAperture(i)} calls is less than {@link #dimCount()}; but if some of them have more than {@link #dimCount()} dimensions, the extra dimensions of such apertures are just ignored (here and in the rest of the algorithm).

    3. It splits all source matrices M_i (src.get(i)) and all resulting matrices M'_j (dest.get(j)) into a set of rectangular non-overlapping tiles, i.e. sub-matrices, the dimensions of which are chosen to be equal to the desired tile dimensions of this tiler ({@link #tileDim()}) or, maybe, less. (This stage does not involve any actual calculations: we describe it for the sake of simplicity.)
       For every tile, we denote by f = (f_0, f_1, ..., f_{n−1}) the n-dimensional starting point of the tile (inclusive) and by t = (t_0, t_1, ..., t_{n−1}) the n-dimensional ending point of the tile (exclusive). (Here "f" is the first letter of the word "from", "t" is the first letter of the word "to".) More precisely, this tile consists of all elements of the source and target matrices with such indexes (i_0, i_1, ..., i_{n−1}) that
           f_k ≤ i_k < t_k,  k = 0, 1, ..., n−1.
       Besides this tile (f, t), we also consider the extended tile (fe, te), consisting of all elements of the source and target matrices with such indexes (i_0, i_1, ..., i_{n−1}) that
           fe_k = f_k + A_m.{@link IRectangularArea#min(int) min(k)} ≤ i_k < te_k = t_k + A_m.{@link IRectangularArea#max(int) max(k)}.
       Note that all tiles (f, t) lie fully inside the dimensions of the source and target matrices, but this is not always so for extended tiles (fe, te). Also note that each tile (f, t) is a subset of the corresponding extended tile (fe, te), because the maximal dependence aperture A_m always contains the origin of coordinates (as stated in item 2).

    4. Then the {@link ApertureProcessor#process(Map dest, Map src) process} method of the tiled processor does the following for every tile (f, t) and the corresponding extended tile (fe, te):

       a. For each index i ∈ Q = src.keySet() of a source matrix and for each index j ∈ R = dest.keySet() of a non-null resulting matrix passed to this method, it allocates new matrices m_i and m'_j with the same element type as M_i and M'_j and with dimensions equal to the sizes of the extended tile, te_k − fe_k. For each index j of a null resulting matrix M'_j = null passed to this method, it is assumed that m'_j = null. The newly created (relatively small) matrices (or {@code null} references) m_i and m'_j are stored in two Map<K, Matrix<?>> objects, srcTile (m_i) and destTile (m'_j), in the same manner as the original src and dest arguments.
          Note that the tiled processor tries to reuse the created matrices m_i and m'_j for different tiles, because there is no sense in creating them again for every tile.
          This algorithm tries to create all matrices in {@link SimpleMemoryModel}. But if the total amount of memory needed simultaneously for all these matrices is greater than {@link #maxTempJavaMemory()} bytes (this parameter is passed to all instantiation methods getInstance of the tiler), then the memory model from the {@link #context() current context} is used instead. Note that the total amount of memory depends not only on the number of arguments and results and the tile dimensions, but also on the desired {@link #numberOfTasks() number of parallel tasks}.
          It is guaranteed that all non-null matrices m'_j (stored in destTile) are updatable, i.e. their {@link Matrix#array() built-in arrays} implement the {@link UpdatableArray} interface.

       b. For each source matrix M_i, its submatrix, corresponding to the extended tile and extracted with the M_i.{@link Matrix#subMatrix(long[], long[], Matrix.ContinuationMode) subMatrix}(fe, te, continuationMode) call, is copied into the corresponding small matrix m_i (srcTile.get(i)). Here continuationMode is the continuation mode of this tiler, returned by the {@link #continuationMode()} method.
          At this step, the method may actually copy less data from some source matrices (with the same results) if the corresponding dependence apertures A_i are smaller than the maximal aperture A_m.

       c. Main part of the algorithm: the {@link ApertureProcessor#process(Map, Map) process} method of the one-tile processor is called with the arguments m'_j and m_i: oneTileProcessor.{@link ApertureProcessor#process(Map, Map) process}(destTile, srcTile).
          Note that, as a result, all null small matrices m'_j (destTile.get(j)) will become non-null: this is a requirement described in the {@link ApertureProcessor#process(Map, Map) comments to the "process" method}. If some matrix destTile.get(j) is {@code null} after calling the one-tile processor, it means an invalid implementation of the one-tile processor: AssertionError is thrown in this case.

       d. If it is the first processed tile, this algorithm scans the whole map destTile. If it contains some matrix m'_j for which the index j is not present in the dest map (!dest.containsKey(j)) or the value dest.get(j)==null, the corresponding resulting matrix M'_j is created with the element type equal to the corresponding m'_j.{@link Matrix#elementType() elementType()} and with dimensions equal to the dimensions of the other source and resulting matrices M_i and M'_j. Each created matrix M'_j is saved back into the dest argument: dest.put(j, M'_j).
          The resulting matrices M'_j are created in the memory model from the current context of the tiler: {@link #context()}.{@link ArrayContext#getMemoryModel() getMemoryModel()}. Moreover, every newly created matrix is automatically tiled, i.e. replaced with newMatrix.{@link Matrix#tile(long...) tile}(allocationTileDim), where allocationTileDim is the corresponding argument of the getInstance instantiation method. The only exception is when you explicitly specify {@code null} for this argument (in this case the new matrices are not tiled). In most cases, the tiler is used with very large matrices, and automatic tiling of the resulting matrices improves performance.

       e. The central part (submatrix) of all matrices m'_j, corresponding to the original (non-extended) tile (f, t), is copied into the corresponding tile of the resulting matrices M'_j, i.e. into their sub-matrices M'_j.{@link Matrix#subMatrix(long[], long[]) subMatrix}(f, t).

  • The {@link ApertureProcessor#dependenceAperture(Object srcMatrixKey)} method of the tiled processor just calls the same method of the one-tile processor with the same srcMatrixKey argument and returns its result.
* *
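The tile geometry of stages 2 and 3 can be illustrated with a small stand-alone sketch. It does not use the AlgART API: the class TileGeometrySketch, its method names and the representation of an aperture or a tile as a pair of long[] arrays are assumptions made only for this illustration, and the tiler itself may enumerate tiles in a different order.

```java
import java.util.ArrayList;
import java.util.List;

public class TileGeometrySketch {
    // Stage 2: minimal area {min[], max[]} (inclusive bounds) containing all given
    // apertures and the origin of coordinates.
    static long[][] maxAperture(int dimCount, long[][][] apertures) {
        long[] min = new long[dimCount]; // zero-filled, so the origin is always included
        long[] max = new long[dimCount];
        for (long[][] a : apertures) {
            for (int k = 0; k < dimCount; k++) {
                min[k] = Math.min(min[k], a[0][k]);
                max[k] = Math.max(max[k], a[1][k]);
            }
        }
        return new long[][]{min, max};
    }

    // Stage 3: splits a matrix with dimensions dim into non-overlapping tiles {f, t}
    // (f inclusive, t exclusive); every tile is tileDim in size or smaller.
    static List<long[][]> tiles(long[] dim, long[] tileDim) {
        List<long[][]> result = new ArrayList<>();
        int n = dim.length;
        for (long d : dim) {
            if (d == 0) {
                return result; // no elements, no tiles
            }
        }
        long[] f = new long[n];
        while (true) {
            long[] t = new long[n];
            for (int k = 0; k < n; k++) {
                t[k] = Math.min(f[k] + tileDim[k], dim[k]);
            }
            result.add(new long[][]{f.clone(), t});
            // move to the next tile position, the first coordinate changing fastest
            int k = 0;
            while (k < n) {
                f[k] += tileDim[k];
                if (f[k] < dim[k]) {
                    break;
                }
                f[k] = 0;
                k++;
            }
            if (k == n) {
                return result;
            }
        }
    }

    // Extended tile: fe_k = f_k + A_m.min(k), te_k = t_k + A_m.max(k).
    static long[][] extend(long[][] tile, long[][] aperture) {
        int n = tile[0].length;
        long[] fe = new long[n];
        long[] te = new long[n];
        for (int k = 0; k < n; k++) {
            fe[k] = tile[0][k] + aperture[0][k];
            te[k] = tile[1][k] + aperture[1][k];
        }
        return new long[][]{fe, te};
    }

    public static void main(String[] args) {
        long[][] am = maxAperture(2, new long[][][]{{{-1, -2}, {1, 0}}, {{0, 0}, {3, 1}}});
        for (long[][] tile : tiles(new long[]{5, 3}, new long[]{2, 2})) {
            long[][] ext = extend(tile, am);
            System.out.println(java.util.Arrays.toString(tile[0]) + ".." + java.util.Arrays.toString(tile[1])
                    + " extended to " + java.util.Arrays.toString(ext[0]) + ".." + java.util.Arrays.toString(ext[1]));
        }
    }
}
```

Note that the extended tiles in this sketch may start at negative coordinates or end beyond the matrix dimensions; this is exactly the case that the continuation mode of the tiler resolves at stage 4.b.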

Note: it is guaranteed that each resulting matrix M'_j created by the {@link ApertureProcessor#process(Map dest, Map src) process} method of the tiled processor at stage 4.d is updatable: its {@link Matrix#array() built-in array} is an {@link UpdatableArray} and, thus, the matrix can be cast to {@link Matrix}<UpdatableArray> with the help of the {@link Matrix#cast(Class) Matrix.cast}(UpdatableArray.class) call.

* *

Note: the {@link ApertureProcessor#process(Map dest, Map src) process} method of the tiled processor can process several tiles simultaneously in parallel threads to optimize calculations on multiprocessor or multicore computers. This depends on the numberOfTasks argument of the instantiation methods getInstance. If it is 0 (or absent), the desired number of parallel tasks is detected automatically on the basis of the {@link ArrayContext} argument of the instantiation methods. Many algorithms (one-tile processors) provide multithreading optimization themselves, so there is no sense in using this feature for them: in this case you may specify numberOfTasks=1.
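As a rough illustration of processing several tiles in parallel, here is a minimal sketch based only on java.util.concurrent. It is not the tiler's implementation: the class ParallelTilesSketch, the one-dimensional "tiles" and the use of Runtime.availableProcessors() in place of the context-based detection are all assumptions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelTilesSketch {
    // 0 means "detect automatically"; this sketch simply asks the JVM for the processor count.
    static int resolveNumberOfTasks(int numberOfTasks) {
        if (numberOfTasks < 0) {
            throw new IllegalArgumentException("Negative numberOfTasks=" + numberOfTasks);
        }
        return numberOfTasks > 0 ? numberOfTasks : Math.max(1, Runtime.getRuntime().availableProcessors());
    }

    // Doubles every element of src into dest; each tile (index range) is a separate task.
    // tileLength must be positive.
    static void processTiled(int[] src, int[] dest, int tileLength, int numberOfTasks) {
        ExecutorService pool = Executors.newFixedThreadPool(resolveNumberOfTasks(numberOfTasks));
        try {
            List<Future<?>> futures = new ArrayList<>();
            for (int from = 0; from < src.length; from += tileLength) {
                final int f = from;
                final int t = Math.min(from + tileLength, src.length);
                futures.add(pool.submit(() -> {
                    // tiles do not overlap, so the threads never write to the same element
                    for (int i = f; i < t; i++) {
                        dest[i] = 2 * src[i];
                    }
                }));
            }
            for (Future<?> future : futures) {
                future.get(); // wait for the tile and propagate its failure, if any
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while processing tiles", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("Tile processing failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        int[] src = {1, 2, 3, 4, 5};
        int[] dest = new int[src.length];
        processTiled(src, dest, 2, 0);
        System.out.println(java.util.Arrays.toString(dest));
    }
}
```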

* * *

Continuation model outside the bounds of the large matrices

* *

The behavior of the aperture processor tiled by the {@link #tile(ApertureProcessor oneTileProcessor)} method can differ slightly from the behavior of the original one-tile processor near the bounds of the matrices, namely for the resulting elements for which the dependence aperture A_i = {@link ApertureProcessor#dependenceAperture(Object) dependenceAperture(i)} (for at least one index i of a source matrix) does not lie fully inside the corresponding source matrix M_i.

* *

In such a situation the behavior of the original one-tile processor depends on the implementation: for example, many algorithms assume the so-called pseudo-cyclic continuation mode, described in the comments to the {@link net.algart.arrays.Matrix.ContinuationMode#PSEUDO_CYCLIC} constant. But the behavior of the resulting processor, tiled by the {@link #tile(ApertureProcessor)} method, is always strictly defined and corresponds to the {@link net.algart.arrays.Matrix.ContinuationMode continuation mode} passed as the continuationMode argument to an instantiation method getInstance of the tiler and returned by the {@link #continuationMode()} method. You can see this from the specification of the behavior of the {@link ApertureProcessor#process(Map dest, Map src) process} method above, stage 4.b.

* *

If the one-tile processor works according to one of the continuation models provided by the {@link net.algart.arrays.Matrix.ContinuationMode} class, you can guarantee identical behavior of the tiled processor by passing the same continuation mode to a tiler instantiation method getInstance; otherwise, the tiled processor cannot provide identical results.

* *

Note that the {@link net.algart.arrays.Matrix.ContinuationMode#NONE} continuation mode cannot be used in the tiler: such a value of the continuationMode argument of the instantiation methods getInstance leads to IllegalArgumentException.
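To make the role of the continuation model concrete, here is a simplified one-dimensional sketch of filling an extended tile that sticks out of the source array. The class ContinuationSketch and its Mode enum are hypothetical and only imitate two generic models (cyclic repetition and a zero constant); they are not the {@link net.algart.arrays.Matrix.ContinuationMode} class.

```java
public class ContinuationSketch {
    // Two illustrative continuation models (hypothetical names)
    enum Mode { CYCLIC, ZERO_CONSTANT }

    // Copies elements fe..te-1 (te exclusive) of src into a new array; indexes outside
    // 0..src.length-1 are resolved according to the given mode. src must be non-empty.
    static int[] extendedTile(int[] src, int fe, int te, Mode mode) {
        int[] tile = new int[te - fe];
        for (int i = fe; i < te; i++) {
            if (i >= 0 && i < src.length) {
                tile[i - fe] = src[i];                            // inside the source array
            } else if (mode == Mode.CYCLIC) {
                tile[i - fe] = src[Math.floorMod(i, src.length)]; // wrap around
            } else {
                tile[i - fe] = 0;                                 // constant outside the bounds
            }
        }
        return tile;
    }

    public static void main(String[] args) {
        int[] src = {10, 20, 30, 40};
        System.out.println(java.util.Arrays.toString(extendedTile(src, -1, 3, Mode.CYCLIC)));
        System.out.println(java.util.Arrays.toString(extendedTile(src, -1, 3, Mode.ZERO_CONSTANT)));
    }
}
```

The two modes give different values only for the elements outside the source array, which is why a tiled processor and its one-tile processor agree near the bounds only when both use the same model.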

* * *

Contexts for the one-tile processor

* *

First of all, we note that every tiled processor (a result of the {@link #tile(ApertureProcessor)} method) always implements not only {@link ApertureProcessor}, but also the {@link ArrayProcessorWithContextSwitching} interface. So, you can use its {@link ArrayProcessor#context()} and {@link ArrayProcessorWithContextSwitching#context(ArrayContext)} methods after the corresponding type cast. The current context of the tiled processor (returned by the {@link ArrayProcessor#context() context()} method) is initially equal to the {@link #context() current context} of the tiler, and you can change it with the help of the {@link ArrayProcessorWithContextSwitching#context(ArrayContext) context(ArrayContext)} method. This context (if it is not {@code null}) is used for determining the memory model that should be used for allocating matrices, for showing execution progress and allowing execution to be stopped after processing every tile (even if the one-tile processor does not support these features), and for multithreaded simultaneous processing of several tiles if {@link #numberOfTasks()}>1. It will be initially {@code null} if the {@link #context() current context} of the tiler is {@code null}; in that case it is ignored.

* *

Many algorithms that can be tiled by this class also work with some {@link ArrayContext} to provide the ability to stop calculations, show progress, determine the desired memory model for allocating AlgART arrays, etc. Such algorithms should implement not only the {@link ApertureProcessor} interface, but also the {@link ArrayProcessorWithContextSwitching} interface, and should get the current context via their {@link #context()} method. This requirement is not absolute, but if your algorithm retrieves the context in some other way, then the behavior of its {@link ArrayContext#updateProgress(ArrayContext.Event)} method can be incorrect: your processor, processing one tile, will not "know" that it is only a part of the full task (processing all tiles).

* *

If a one-tile processor, tiled by the {@link #tile(ApertureProcessor)} method, really implements {@link ArrayProcessorWithContextSwitching}, then the {@link ApertureProcessor#process(Map dest, Map src) process} method of the tiled processor creates a special tile context before processing every tile and {@link ArrayProcessorWithContextSwitching#context(ArrayContext) switches} the one-tile processor to this context before calling its {@link ApertureProcessor#process(Map dest, Map src) process} method. In other words, at stage 4.c the tiled processor calls not
     oneTileProcessor.{@link ApertureProcessor#process(Map, Map) process}(destTile, srcTile),
but
     ((ApertureProcessor<K>)(oneTileProcessor.{@link ArrayProcessorWithContextSwitching#context(ArrayContext) context}(tileContext))).{@link ApertureProcessor#process(Map, Map) process}(destTile, srcTile).
(By the way, it means that you need not think about the initial value of the {@link ArrayProcessor#context() current context} in the constructor of your one-tile processor: it will surely be replaced with tileContext before your processor is used. For example, you may initialize it with {@code null}.) Of course, it is supposed that the switching method oneTileProcessor.{@link ArrayProcessorWithContextSwitching#context(ArrayContext) context}(tileContext) returns an object that also implements {@link ApertureProcessor}; if it does not, this means an invalid implementation of that method, and AssertionError or ClassCastException can be thrown in this case.

* *

The tileContext here is never {@code null}: you can freely use this fact in your implementation of the one-tile processor. This context is formed automatically as a {@link ArrayContext#part(double, double) part} of the current context of the tiled processor, returned by its {@link ArrayProcessor#context() context()} method: a part corresponding to the processing of only one of many tiles. (As written above, by default the current context of the tiled processor is equal to the {@link #context() current context} of the tiler.) Thus, the tiler provides correct behavior of oneTileProcessor.{@link ArrayProcessor#context() context()}.{@link ArrayContext#updateProgress(ArrayContext.Event) updateProgress(...)} inside the {@link ApertureProcessor#process(Map, Map) process} method of your one-tile processor. If the current context of the tiled processor is {@code null}, tileContext is formed from {@link ArrayContext#DEFAULT}.

* *

The tileContext also provides additional information about the position and sizes of the currently processed tile. Namely, it is created with the help of the {@link ArrayContext#customDataVersion(Object)} method in such a way that its {@link ArrayContext#customData() customData()} method always returns a correctly filled instance of the {@link TileInformation} class, describing the currently processed tile.

* *

If the current {@link #numberOfTasks() number of tasks}, desired for this tiler, is greater than 1, and the tiled processor uses multithreading for parallel processing of several tiles, then the tileContext is formed in a more complex way. Namely, in this case it is also a {@link ArrayContext#part(double, double) part} of the full context with correctly filled {@link ArrayContext#customData() customData()} (an instance of {@link TileInformation}), and in addition:

  • the {@link ArrayContext#multithreadingVersion(int k, int n)} method is called, so the one-tile processor can determine in which of several parallel threads it is called (the index k) and what the total number of parallel threads is (the value n≤{@link #numberOfTasks()}: it can be less than {@link #numberOfTasks()}, for example, when the total number of tiles is less than it). This is helpful if the implementation of the one-tile processor needs some work memory or other objects, which should be created before all calculations and must be separate for different threads;

  • the {@link ArrayContext#singleThreadVersion()} method is called: in other words, the tiler tries to suppress multithreading in the one-tile processor when it uses multithreading itself for parallel processing of several tiles;

  • the {@link ArrayContext#noProgressVersion()} method is called, because a progress bar cannot be updated correctly during parallel processing of several tiles (it will be updated after processing of this group of tiles is finished).
* * *

Restrictions

* *

Every instance of this class can work only with some fixed number n of matrix dimensions, returned by the {@link #dimCount()} method and equal to the length of the tileDim array passed as an argument of the instantiation methods getInstance. It means that the {@link ApertureProcessor#process(Map, Map) process} method of an aperture processor returned by the {@link #tile(ApertureProcessor)} method can process only n-dimensional matrices with n={@link #dimCount()} and throws IllegalArgumentException if some of the passed matrices have another number of dimensions.

* *

The tiler has no restrictions on the types of matrix elements: it can work with any element types, including non-primitive types. But usually the types of matrix elements are primitive.

* *

Note: in improbable cases, when the dimensions of the source and resulting matrices and/or the sizes of the {@link ApertureProcessor#dependenceAperture(Object) dependence apertures} are extremely large (about 2^63), so that the sum of some matrix dimension and the corresponding size of the aperture ({@link IRectangularArea#width(int)}) or the product of all such sums (i.e. the number of elements in a source/resulting matrix {@link DependenceApertureBuilder#extendDimensions(long[], IRectangularArea) extended} by such an aperture) is greater than Long.MAX_VALUE, the {@link ApertureProcessor#process(Map dest, Map src) process} method of the {@link #tile(ApertureProcessor) tiled} processor throws IndexOutOfBoundsException and does nothing. Of course, these are very improbable cases.
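The overflow condition above can be sketched with the exact-arithmetic methods of java.lang.Math. This is not the code of {@link DependenceApertureBuilder}: the class ExtendDimensionsSketch and its method are hypothetical, and the aperture is passed as two long[] arrays of its minimal and maximal coordinates. The extension max(k) − min(k) follows from the definition of the extended tile, where te_k − fe_k = (t_k − f_k) + A_m.max(k) − A_m.min(k).

```java
public class ExtendDimensionsSketch {
    // Returns dim[k] + (apertureMax[k] - apertureMin[k]) for every k and checks that
    // neither these sums nor their product exceed Long.MAX_VALUE.
    static long[] extendDimensions(long[] dim, long[] apertureMin, long[] apertureMax) {
        long[] result = new long[dim.length];
        long product = 1;
        try {
            for (int k = 0; k < dim.length; k++) {
                long extension = Math.subtractExact(apertureMax[k], apertureMin[k]);
                result[k] = Math.addExact(dim[k], extension);
                product = Math.multiplyExact(product, result[k]);
            }
        } catch (ArithmeticException e) {
            // the same exception type as the one named in the note above
            throw new IndexOutOfBoundsException("Extended matrix is too large: " + e.getMessage());
        }
        return result;
    }

    public static void main(String[] args) {
        long[] extended = extendDimensions(new long[]{100, 50}, new long[]{-1, -2}, new long[]{3, 1});
        System.out.println(java.util.Arrays.toString(extended));
    }
}
```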

* *

Creating instances of this class

* *

To create instances of this class, you should use one of the following methods:

* *
  • {@link #getInstance(ArrayContext, Matrix.ContinuationMode, long, long[])},
  • {@link #getInstance(ArrayContext, Matrix.ContinuationMode, long, long[], long[])},
  • {@link #getInstance(ArrayContext, Matrix.ContinuationMode, long, long[], int)},
  • {@link #getInstance(ArrayContext, Matrix.ContinuationMode, long, long[], long[], int)}.
* * *

Multithreading compatibility

* *

This class is immutable and thread-safe: there are no ways to modify the settings of a created instance. The same is true for the tiled processors created by the {@link #tile(ApertureProcessor)} method.

* * @author Daniel Alievsky */ public final class TiledApertureProcessorFactory { /** *

Additional information about the current processed tile, available for tiled aperture processors * via their context.

* *

This object is returned by {@link ArrayContext#customData() customData()} method of the current context * {@link ArrayProcessor#context()} of the one-tile aperture processor — * the argument of {@link TiledApertureProcessorFactory#tile(ApertureProcessor)} method — * if this one-tile processor implements {@link ArrayProcessorWithContextSwitching} interface and * is called from the tiled processor (the result of * {@link TiledApertureProcessorFactory#tile(ApertureProcessor) tile} method) * for processing a tile. See comments to {@link TiledApertureProcessorFactory}, * the section "Contexts for the one-tile processor".

* *

This class is immutable and thread-safe: * there are no ways to modify settings of the created instance.

*/ public static final class TileInformation { private final IRectangularArea tile; private final IRectangularArea extendedTile; private TileInformation(IRectangularArea tile, IRectangularArea extendedTile) { assert tile != null && extendedTile != null; this.tile = tile; this.extendedTile = extendedTile; } /** * Returns the position and sizes of the currently processed tile (ft). * See the strict definition of (ft) area in the specification * of process method, stage 3, in comments to {@link TiledApertureProcessorFactory}. * *

The {@link IRectangularArea#min() min()} point of the result contains the minimal coordinates * of the matrix elements, belonging to this tile: * {@link IRectangularArea#min() min()} = * f = (f0, f1, ..., * fn−1). * The {@link IRectangularArea#max() max()} point of the result contains the maximal coordinates * of the matrix elements, belonging to this tile: * {@link IRectangularArea#max() max()} = * t−1 = (t0−1, t1−1, ..., * tn−1−1). * * @return the currently processed tile (ft). */ public IRectangularArea getTile() { return tile; } /** * Returns the position and sizes of the currently processed extended tile (fete). * See the strict definition of (fete) area in the specification * of process method, stage 3, in comments to {@link TiledApertureProcessorFactory}. * *

The {@link IRectangularArea#min() min()} point of the result contains the minimal coordinates * of the matrix elements, belonging to this extended tile: * {@link IRectangularArea#min() min()} = * fe = (fe0, fe1, ..., * fen−1). * The {@link IRectangularArea#max() max()} point of the result contains the maximal coordinates * of the matrix elements, belonging to this extended tile: * {@link IRectangularArea#max() max()} = * te−1 = (te0−1, te1−1, ..., * ten−1−1). * * @return the currently processed extended tile (fete). */ public IRectangularArea getExtendedTile() { return extendedTile; } } private final ArrayContext context; private final ThreadPoolFactory threadPoolFactory; private final int numberOfTasks; private final int originalNumberOfTasks; private final long maxTempJavaMemory; private final Matrix.ContinuationMode continuationMode; private final long[] processingTileDim; private final long[] allocationTileDim; private final int dimCount; // == tileDim.length private TiledApertureProcessorFactory( ArrayContext context, Matrix.ContinuationMode continuationMode, long maxTempJavaMemory, long[] processingTileDim, long[] allocationTileDim, int numberOfTasks) { Objects.requireNonNull(continuationMode, "Null continuation mode"); Objects.requireNonNull(processingTileDim, "Null processing tile dimensions Java array"); if (continuationMode == Matrix.ContinuationMode.NONE) { throw new IllegalArgumentException(getClass() + " cannot be used with continuation mode \"" + continuationMode + "\""); } if (processingTileDim.length == 0) { throw new IllegalArgumentException("Empty processing tile dimensions Java array"); } if (allocationTileDim != null && allocationTileDim.length != processingTileDim.length) { throw new IllegalArgumentException("Different number of allocation and processing tile dimensions"); } if (numberOfTasks < 0) { throw new IllegalArgumentException("Negative numberOfTasks=" + numberOfTasks); } this.processingTileDim = processingTileDim.clone(); 
this.allocationTileDim = allocationTileDim == null ? null : allocationTileDim.clone(); this.dimCount = processingTileDim.length; for (int k = 0; k < dimCount; k++) { if (this.processingTileDim[k] <= 0) { throw new IllegalArgumentException("Negative or zero processing tile dimension #" + k + ": " + this.processingTileDim[k]); } if (this.allocationTileDim != null && this.allocationTileDim[k] <= 0) { throw new IllegalArgumentException("Negative or zero allocation tile dimension #" + k + ": " + this.allocationTileDim[k]); } } if (maxTempJavaMemory < 0) { throw new IllegalArgumentException("Negative maxTempJavaMemory argument"); } this.context = context; this.maxTempJavaMemory = maxTempJavaMemory; this.continuationMode = continuationMode; this.threadPoolFactory = Arrays.getThreadPoolFactory(context); this.originalNumberOfTasks = numberOfTasks; this.numberOfTasks = numberOfTasks > 0 ? numberOfTasks : Math.max(1, this.threadPoolFactory.recommendedNumberOfTasks()); } /** * Creates new instance of the tiler. Equivalent to the following call of the basic instantiation method:
     * <pre>
     * {@link #getInstance(ArrayContext, Matrix.ContinuationMode, long, long[], long[], int)
     * getInstance}(context, continuationMode, maxTempJavaMemory, tileDim, {@link
     * Matrices#defaultTileDimensions(int) Matrices.defaultTileDimensions}(tileDim.length), 0).
     * </pre>
     *
     * @param context           see the basic {@link
     *                          #getInstance(ArrayContext, Matrix.ContinuationMode, long, long[], long[], int)
     *                          getInstance} method.
     * @param continuationMode  see the basic {@link
     *                          #getInstance(ArrayContext, Matrix.ContinuationMode, long, long[], long[], int)
     *                          getInstance} method.
     * @param maxTempJavaMemory see the basic {@link
     *                          #getInstance(ArrayContext, Matrix.ContinuationMode, long, long[], long[], int)
     *                          getInstance} method.
     * @param tileDim           see the basic {@link
     *                          #getInstance(ArrayContext, Matrix.ContinuationMode, long, long[], long[], int)
     *                          getInstance} method.
     * @return new tiler.
     * @throws NullPointerException     if {@code continuationMode} or {@code tileDim}
     *                                  argument is {@code null}.
     * @throws IllegalArgumentException if {@code continuationMode}=={@link
     *                                  net.algart.arrays.Matrix.ContinuationMode#NONE},
     *                                  or if {@code maxTempJavaMemory<0},
     *                                  or if {@code tileDim.length==0},
     *                                  or if one of elements of {@code tileDim} Java array is zero or negative.
     */
    public static TiledApertureProcessorFactory getInstance(
            ArrayContext context,
            Matrix.ContinuationMode continuationMode,
            long maxTempJavaMemory,
            long[] tileDim) {
        return new TiledApertureProcessorFactory(context, continuationMode, maxTempJavaMemory,
                tileDim, Matrices.defaultTileDimensions(tileDim.length), 0);
    }

    /**
     * Creates new instance of the tiler. Equivalent to the following call of the basic instantiation method:
     * <pre>
     * {@link #getInstance(ArrayContext, Matrix.ContinuationMode, long, long[], long[], int)
     * getInstance}(context, continuationMode, maxTempJavaMemory, tileDim, allocationTileDim, 0).
     * </pre>
     *
     * @param context           see the basic {@link
     *                          #getInstance(ArrayContext, Matrix.ContinuationMode, long, long[], long[], int)
     *                          getInstance} method.
     * @param continuationMode  see the basic {@link
     *                          #getInstance(ArrayContext, Matrix.ContinuationMode, long, long[], long[], int)
     *                          getInstance} method.
     * @param maxTempJavaMemory see the basic {@link
     *                          #getInstance(ArrayContext, Matrix.ContinuationMode, long, long[], long[], int)
     *                          getInstance} method.
     * @param tileDim           see the basic {@link
     *                          #getInstance(ArrayContext, Matrix.ContinuationMode, long, long[], long[], int)
     *                          getInstance} method.
     * @param allocationTileDim see the basic {@link
     *                          #getInstance(ArrayContext, Matrix.ContinuationMode, long, long[], long[], int)
     *                          getInstance} method.
     * @return new tiler.
     * @throws NullPointerException     if {@code continuationMode} or {@code tileDim}
     *                                  argument is {@code null}.
     * @throws IllegalArgumentException if {@code continuationMode}=={@link
     *                                  net.algart.arrays.Matrix.ContinuationMode#NONE},
     *                                  or if {@code maxTempJavaMemory<0},
     *                                  or if {@code tileDim.length==0},
     *                                  or if {@code allocationTileDim!=null} and
     *                                  {@code allocationTileDim.length!=tileDim.length},
     *                                  or if one of elements of {@code tileDim} or (non-null)
     *                                  {@code allocationTileDim} Java arrays is zero or negative.
     */
    public static TiledApertureProcessorFactory getInstance(
            ArrayContext context,
            Matrix.ContinuationMode continuationMode,
            long maxTempJavaMemory,
            long[] tileDim,
            long[] allocationTileDim) {
        return new TiledApertureProcessorFactory(context, continuationMode, maxTempJavaMemory,
                tileDim, allocationTileDim, 0);
    }

    /**
     * Creates new instance of the tiler. Equivalent to the following call of the basic instantiation method:
     * <pre>
     * {@link #getInstance(ArrayContext, Matrix.ContinuationMode, long, long[], long[], int)
     * getInstance}(context, continuationMode, maxTempJavaMemory, tileDim, {@link
     * Matrices#defaultTileDimensions(int) Matrices.defaultTileDimensions}(tileDim.length), numberOfTasks).
     * </pre>
     *
     * @param context           see the basic {@link
     *                          #getInstance(ArrayContext, Matrix.ContinuationMode, long, long[], long[], int)
     *                          getInstance} method.
     * @param continuationMode  see the basic {@link
     *                          #getInstance(ArrayContext, Matrix.ContinuationMode, long, long[], long[], int)
     *                          getInstance} method.
     * @param maxTempJavaMemory see the basic {@link
     *                          #getInstance(ArrayContext, Matrix.ContinuationMode, long, long[], long[], int)
     *                          getInstance} method.
     * @param tileDim           see the basic {@link
     *                          #getInstance(ArrayContext, Matrix.ContinuationMode, long, long[], long[], int)
     *                          getInstance} method.
     * @param numberOfTasks     see the basic {@link
     *                          #getInstance(ArrayContext, Matrix.ContinuationMode, long, long[], long[], int)
     *                          getInstance} method.
     * @return new tiler.
     * @throws NullPointerException     if {@code continuationMode} or {@code tileDim}
     *                                  argument is {@code null}.
     * @throws IllegalArgumentException if {@code continuationMode}=={@link
     *                                  net.algart.arrays.Matrix.ContinuationMode#NONE},
     *                                  or if {@code maxTempJavaMemory<0},
     *                                  or if {@code tileDim.length==0},
     *                                  or if {@code numberOfTasks<0},
     *                                  or if one of elements of {@code tileDim} Java array is zero or negative.
     */
    public static TiledApertureProcessorFactory getInstance(
            ArrayContext context,
            Matrix.ContinuationMode continuationMode,
            long maxTempJavaMemory,
            long[] tileDim,
            int numberOfTasks) {
        return new TiledApertureProcessorFactory(context, continuationMode, maxTempJavaMemory,
                tileDim, Matrices.defaultTileDimensions(tileDim.length), numberOfTasks);
    }

    /**
     * Creates new instance of the tiler.
     *

     * <p>The passed Java arrays {@code tileDim} and {@code allocationTileDim} are cloned by this method:
     * no references to them are maintained by the created object.
     *
     * @param context           the {@link #context() context} that will be used by this tiler;
     *                          can be {@code null}, then it will be ignored, and
     *                          the {@link #tile(ApertureProcessor) tiled} processor will create all temporary
     *                          matrices in {@link SimpleMemoryModel}.
     * @param continuationMode  continuation mode, used by the {@link #tile(ApertureProcessor) tiled} processor
     *                          (see also the specification of the
     *                          {@link ApertureProcessor#process(Map, Map) process}
     *                          method in the {@link TiledApertureProcessorFactory comments to this class},
     *                          stage 4.b).
     * @param maxTempJavaMemory maximal amount of Java memory, in bytes, allowed for allocating by the
     *                          {@link ApertureProcessor#process(Map, Map) process} method
     *                          of the {@link #tile(ApertureProcessor) tiled} processor
     *                          (see ibid., stage 4.a). If you are sure that there is enough Java memory
     *                          for allocating all necessary matrices for {@link #numberOfTasks()} tiles
     *                          of all source and resulting matrices (with the given dimensions
     *                          {@code tileDim}), you may specify here {@code Long.MAX_VALUE}.
     * @param tileDim           the desired dimensions of tiles, into which the source and resulting matrices
     *                          are split by the {@link #tile(ApertureProcessor) tiled} processor
     *                          (see ibid., stage 3). Typical values for most applications are 4096x4096
     *                          or 2048x2048 (in 2-dimensional case).
     * @param allocationTileDim if not {@code null}, then the resulting matrices M'<sub>j</sub>,
     *                          created by the {@link #tile(ApertureProcessor) tiled} processor
     *                          (see ibid., stage 4.d), are automatically tiled by the call
     *                          {@code newMatrix.}{@link Matrix#tile(long...) tile}{@code (allocationTileDim)}.
     *                          If it is {@code null}, the resulting matrices are not tiled.
     * @param numberOfTasks     the desired number of tiles, which should be processed simultaneously in
     *                          parallel threads to optimize calculations on multiprocessor or multicore
     *                          computers; may be 0, then it will be detected automatically as
     *                          {@link Arrays#getThreadPoolFactory(ArrayContext)
     *                          Arrays.getThreadPoolFactory}(context).{@link
     *                          ThreadPoolFactory#recommendedNumberOfTasks() recommendedNumberOfTasks()}.
     *                          You may specify {@code numberOfTasks=1} for saving memory, if you know that
     *                          the one-tile processors, which you are going to tile, provide multithreading
     *                          optimization themselves.
     * @return new tiler.
     * @throws NullPointerException     if {@code continuationMode} or {@code tileDim}
     *                                  argument is {@code null}.
     * @throws IllegalArgumentException if {@code continuationMode}=={@link
     *                                  net.algart.arrays.Matrix.ContinuationMode#NONE},
     *                                  or if {@code maxTempJavaMemory<0},
     *                                  or if {@code tileDim.length==0},
     *                                  or if {@code allocationTileDim!=null} and
     *                                  {@code allocationTileDim.length!=tileDim.length},
     *                                  or if {@code numberOfTasks<0},
     *                                  or if one of elements of {@code tileDim} or (non-null)
     *                                  {@code allocationTileDim} Java arrays is zero or negative.
     * @see #context()
     * @see #continuationMode()
     * @see #maxTempJavaMemory()
     * @see #dimCount()
     * @see #tileDim()
     * @see #numberOfTasks()
     */
    public static TiledApertureProcessorFactory getInstance(
            ArrayContext context,
            Matrix.ContinuationMode continuationMode,
            long maxTempJavaMemory,
            long[] tileDim,
            long[] allocationTileDim,
            int numberOfTasks) {
        return new TiledApertureProcessorFactory(context, continuationMode, maxTempJavaMemory,
                tileDim, allocationTileDim, numberOfTasks);
    }

    /**
     * Returns the current context, used by this tiler. Equal to the first argument,
     * passed to an instantiation method {@code getInstance}.
     *

     * <p>This context (if it is not {@code null}) is used by the {@link #tile(ApertureProcessor) tiled} processor
     * for determining the memory model, which should be used for allocating resulting matrices and, maybe,
     * temporary matrices for every tile (if {@link #maxTempJavaMemory()} is too small to allocate them
     * in {@link SimpleMemoryModel}), for showing execution progress and allowing to stop execution after
     * processing every tile (even if the one-tile processor does not support these features)
     * and for multithreaded simultaneous processing of several tiles, if {@link #numberOfTasks()}&gt;1.
     *
     * <p>See also the {@link TiledApertureProcessorFactory comments to this class}, the section
     * "Contexts for the one-tile processor".
     *
     * @return the current context, used by this tiler; can be {@code null}.
     */
    public ArrayContext context() {
        return this.context;
    }

    /**
     * Switches the context: returns an instance, identical to this one except
     * that it uses the specified {@code newContext} for all operations.
     *
     * @param newContext another context, used by the returned instance; can be {@code null}.
     * @return new instance with another context.
     */
    public TiledApertureProcessorFactory context(ArrayContext newContext) {
        // originalNumberOfTasks (possibly 0) is passed so that automatic detection is repeated for newContext
        return new TiledApertureProcessorFactory(newContext, continuationMode, maxTempJavaMemory,
                processingTileDim, allocationTileDim, originalNumberOfTasks);
    }

    /**
     * Returns the number of dimensions of this tiler. Equal to the number of elements of {@code tileDim}
     * arrays, passed to an instantiation method {@code getInstance}.
     *

     * <p>The tiled processor, created by {@link #tile(ApertureProcessor)} method of this tiler,
     * can process only matrices with this number of dimensions.
     *
     * @return the number of dimensions of matrices, which can be processed by aperture processors,
     * tiled by this tiler; always ≥1.
     */
    public int dimCount() {
        return this.dimCount;
    }

    /**
     * Returns the continuation mode, used by this tiler. Equal to the corresponding argument,
     * passed to an instantiation method {@code getInstance}.
     *
     * <p>See comments to the basic
     * {@link #getInstance(ArrayContext, Matrix.ContinuationMode, long, long[], long[], int) getInstance}
     * method and the {@link TiledApertureProcessorFactory comments to this class} for more details.
     *
     * @return the continuation mode of this tiler; cannot be {@code null} or
     * {@link net.algart.arrays.Matrix.ContinuationMode#NONE}.
     */
    public Matrix.ContinuationMode continuationMode() {
        return this.continuationMode;
    }

    /**
     * Returns the maximal amount of Java memory, in bytes, allowed for allocating temporary matrices
     * for storing a tile. Equal to the corresponding argument,
     * passed to an instantiation method {@code getInstance}.
     *

     * <p>See comments to the basic
     * {@link #getInstance(ArrayContext, Matrix.ContinuationMode, long, long[], long[], int) getInstance}
     * method and the {@link TiledApertureProcessorFactory comments to this class} for more details.
     *
     * @return the maximal amount of Java memory, in bytes, allowed for allocating temporary matrices
     * for storing a tile; always ≥0.
     */
    public long maxTempJavaMemory() {
        return this.maxTempJavaMemory;
    }

    /**
     * Returns the desired dimensions of every tile. Equal to the {@code tileDim}
     * array, passed to an instantiation method {@code getInstance}.
     *
     * <p>The returned array is a clone of the internal dimension array stored in this object.
     * The returned array is never empty (its length cannot be zero).
     * The elements of the returned array are never zero or negative.
     *
     * <p>See comments to the basic
     * {@link #getInstance(ArrayContext, Matrix.ContinuationMode, long, long[], long[], int) getInstance}
     * method and the {@link TiledApertureProcessorFactory comments to this class} for more details.
     *
     * @return the desired dimensions of every tile.
     */
    public long[] tileDim() {
        return this.processingTileDim.clone();
    }

    /**
     * Returns the number of tiles, which should be processed simultaneously in
     * parallel threads to optimize calculations on multiprocessor or multicore computers.
     * It is equal to:
     *

    *
     * <ul>
     * <li>{@code numberOfTasks} argument if this instance was created by
     * {@link #getInstance(ArrayContext, Matrix.ContinuationMode, long, long[], long[], int)
     * getInstance} method, having such argument, and if this argument was non-zero, or</li>
     * <li>{@link Arrays#getThreadPoolFactory(ArrayContext)
     * Arrays.getThreadPoolFactory}({@link #context()}).{@link ThreadPoolFactory#recommendedNumberOfTasks()
     * recommendedNumberOfTasks()} if this instance was created by
     * {@link #getInstance(ArrayContext, Matrix.ContinuationMode, long, long[], long[]) getInstance}
     * method without {@code numberOfTasks} argument or if this argument was zero
     * ({@code numberOfTasks=0}).</li>
     * </ul>
     *
     * @return the number of threads, that this class uses for multithreading optimization; always ≥1.
     */
    public int numberOfTasks() {
        return numberOfTasks;
    }

    /**
     * The main method: builds the tiled aperture processor on the base of the given one-tile processor.
     * See the {@link TiledApertureProcessorFactory comments to this class} for more details.
     *
     * <p>The result of this method always implements {@link ArrayProcessorWithContextSwitching} interface.
     * See the {@link TiledApertureProcessorFactory comments to this class}, the section
     * "Contexts for the one-tile processor".
     *
     * @param <K>              the generic type of the keys in the tile processor.
     * @param oneTileProcessor one-tile aperture processor.
     * @return tiled aperture processor.
     * @throws NullPointerException if the argument is {@code null}.
     */
    public <K> ApertureProcessor<K> tile(ApertureProcessor<K> oneTileProcessor) {
        return new BasicTilingProcessor<>(oneTileProcessor);
    }

    /**
     * Returns a brief string description of this object.
     *

     * <p>The result of this method may depend on implementation and usually contains
     * a short description of this tiler.
     *
     * @return a brief string description of this object.
     */
    @Override
    public String toString() {
        return "universal " + dimCount + "-dimensional "
                + (numberOfTasks == 1 ? "" : numberOfTasks + "-threading ")
                + "processing tiler (tiles " + JArrays.toString(processingTileDim, "x", 1000) + ")";
    }

    private class BasicTilingProcessor<K>
            extends AbstractArrayProcessorWithContextSwitching
            implements ApertureProcessor<K>, ArrayProcessorWithContextSwitching {
        private final ApertureProcessor<K> oneTileProcessor;

        private BasicTilingProcessor(ApertureProcessor<K> oneTileProcessor) {
            super(context);
            Objects.requireNonNull(oneTileProcessor, "Null one-tile processor");
            this.oneTileProcessor = oneTileProcessor;
        }

        public void process(Map<K, Matrix<?>> dest, Map<K, Matrix<?>> src) {
            Objects.requireNonNull(src, "Null table of source matrices");
            Objects.requireNonNull(dest, "Null table of destination matrices");
            final Map<K, Matrix<? extends UpdatableArray>> destCopy = new LinkedHashMap<>();
            for (Map.Entry<K, Matrix<?>> e : dest.entrySet()) {
                K key = e.getKey();
                Matrix<?> m = e.getValue();
                if (m != null && !(m.array() instanceof UpdatableArray)) {
                    throw new IllegalArgumentException("The destination matrix with key \"" + key
                            + "\" is not updatable and cannot be used for returning result");
                }
                destCopy.put(key, m == null ?
null : m.cast(UpdatableArray.class)); } final Map> srcCopy = new LinkedHashMap<>(); for (Map.Entry> e : src.entrySet()) { K key = e.getKey(); Matrix m = e.getValue(); Objects.requireNonNull(m, "Null source matrix with key \"" + key + "\""); srcCopy.put(key, m); } // - this cloning is useful if some parallel thread is changing these lists right now // (but not enough: we need to save new references to some dest elements) final long[] dim = getDimensionsAndCheck(destCopy, srcCopy); if (dim == null) { return; // no arguments and results: nothing to do } final long matrixSize = Arrays.longMul(dim); final long[] tileCounts = new long[dimCount]; long[] maxTileDim = new long[dimCount]; // maximal tile dimensions; some tiles near the bounds can be less long tileCount = 1; for (int k = 0; k < dimCount; k++) { if (dim[k] == 0) { return; // no elements in the matrices: nothing to do } tileCounts[k] = (dim[k] - 1) / processingTileDim[k] + 1; assert tileCounts[k] <= dim[k]; // because processingTileDim[k] > 0 maxTileDim[k] = Math.min(dim[k], processingTileDim[k]); // so, there is a guarantee that maxTileDim are allowed matrix dimensions tileCount *= tileCounts[k]; // overflow impossible, because tileCounts[k] <= dim[k] } final int nt = (int) Math.min(numberOfTasks, tileCount); final IRectangularArea maxAperture = maxDependenceAperture(srcCopy.keySet()); DependenceApertureBuilder.extendDimensions(dim, maxAperture); // overflow check before any calculations long maxExtTileSize = Arrays.longMul(DependenceApertureBuilder.extendDimensions(maxTileDim, maxAperture)); double estimatedMemory = estimateWorkMemory(maxExtTileSize, destCopy.values(), srcCopy.values(), nt); MemoryModel betterModel = estimatedMemory < maxTempJavaMemory ? 
Arrays.SMM : memoryModel(); final List> srcTileMem = allocateTile(betterModel, maxExtTileSize, srcCopy, nt); final List> destTileMem = allocateTile(betterModel, maxExtTileSize, destCopy, nt); final Matrix enumerator = Matrices.matrix(Arrays.nIntCopies(tileCount, 157), tileCounts); // - this trivial virtual matrix is a simplest way to enumerate all tiles ArrayContext context = this.context(); // maybe, already not a context of TilingProcessorFactory! if (nt > 1) { context = context == null ? ArrayContext.DEFAULT_SINGLE_THREAD : context.singleThreadVersion(); } else if (context == null) { context = ArrayContext.DEFAULT; } Runnable[] tasks = new Runnable[nt]; Runnable[] postprocessing = new Runnable[nt]; // non-parallel long readyElementsCount = 0; int taskIndex = 0; // System.out.println("Number of tasks/tiles: " + nt + "/" + tileCount + ", " + maxAperture + "; " // + src.size() + " arguments and " + dest.size() + " results " + JArrays.toString(dim, "x", 1000)); for (long tileIndex = 0; tileIndex < tileCount; tileIndex++) { long[] tileIndexes = enumerator.coordinates(tileIndex, null); final long[] tilePos = new long[dimCount]; final long[] tileDim = new long[dimCount]; final long[] tileMax = new long[dimCount]; final long[] extTilePos = new long[dimCount]; final long[] extTileDim = new long[dimCount]; final long[] extTileMax = new long[dimCount]; long tileSize = 1; for (int k = 0; k < dimCount; k++) { tilePos[k] = tileIndexes[k] * processingTileDim[k]; assert tilePos[k] < dim[k]; tileDim[k] = Math.min(processingTileDim[k], dim[k] - tilePos[k]); // exclusive assert tileDim[k] > 0; // because processingTileDim[k] > 0: checked in the constructor tileMax[k] = tilePos[k] + tileDim[k] - 1; extTileDim[k] = DependenceApertureBuilder.safelyAdd(tileDim[k], maxAperture.width(k)); extTilePos[k] = tilePos[k] + maxAperture.min(k); extTileMax[k] = tileMax[k] + maxAperture.max(k); tileSize *= tileDim[k]; } final ArrayContext ac = nt == 1 ? 
context.part(readyElementsCount, readyElementsCount + tileSize, matrixSize) : context.noProgressVersion(); final Map> srcTile = loadSrcTile(ac.part(0.0, 0.05), maxAperture, srcTileMem.get(taskIndex), srcCopy, tilePos, tileDim, extTileDim); final Map> destTile = prepareDestTile( destTileMem.get(taskIndex), destCopy.keySet(), extTileDim); final int ti = taskIndex; tasks[taskIndex] = () -> { ArrayContext tileContext = switchingContextSupported() ? ac.part(0.05, 0.95).multithreadingVersion(ti, nt).customDataVersion( new TileInformation( IRectangularArea.valueOf( IPoint.valueOf(tilePos), IPoint.valueOf(tileMax)), IRectangularArea.valueOf( IPoint.valueOf(extTilePos), IPoint.valueOf(extTileMax)))) : ac; subtaskTileProcessor(tileContext).process(destTile, srcTile); // additional matrices CAN appear in destTile }; postprocessing[taskIndex] = () -> { allocateDestMatricesIfNecessary(dim, destCopy, destTile); // - synchronization is not necessary, because we call postprocessing in a single thread saveDestTile(ac.part(0.95, 1.0), maxAperture, destCopy, destTile, tilePos, tileDim); freeResources(destTile); // maybe, destTile was created by the parent processor in some file, which should be released }; taskIndex++; readyElementsCount += tileSize; if (taskIndex == nt || tileIndex == tileCount - 1) { threadPoolFactory.performTasks(tasks, 0, taskIndex); for (int i = 0; i < taskIndex; i++) { postprocessing[i].run(); } Class elementType = (!destCopy.isEmpty() ? 
destCopy.values().iterator().next() : srcCopy.values().iterator().next()).elementType(); context.checkInterruptionAndUpdateProgress(elementType, readyElementsCount, matrixSize); } if (taskIndex == nt) { taskIndex = 0; } } for (Map.Entry> e : destCopy.entrySet()) { // saving newly created dest matrices back into the original dest list K key = e.getKey(); if (dest.get(key) == null) { dest.put(key, e.getValue()); } } } public IRectangularArea dependenceAperture(K srcMatrixKey) { return oneTileProcessor.dependenceAperture(srcMatrixKey); } @Override public String toString() { return "aperture-dependent tiled processor of an " + TiledApertureProcessorFactory.this; } private long[] getDimensionsAndCheck( Map> dest, Map> src) { long[] result = null; for (Map.Entry> e : dest.entrySet()) { K key = e.getKey(); Matrix m = e.getValue(); if (m == null) { continue; } if (m.dimCount() != dimCount) { throw new IllegalArgumentException("The destination matrix with key \"" + key + "\" has " + m.dimCount() + " dimensions, but this processing tiler works with " + dimCount + " dimensions"); } if (result == null) { result = m.dimensions(); } else if (!m.dimEquals(result)) { throw new SizeMismatchException("The destination matrix with key \"" + key + "\" and the first matrix dimensions mismatch: " + "the destination matrix with key \"" + key + "\" is " + m + ", but the first matrix has dimensions " + JArrays.toString(result, "x", 1000)); } } for (Map.Entry> e : src.entrySet()) { K key = e.getKey(); Matrix m = e.getValue(); assert m != null; if (m.dimCount() != dimCount) { throw new IllegalArgumentException("The source matrix with key \"" + key + "\" has " + m.dimCount() + " dimensions, but this processing tiler works with " + dimCount + " dimensions"); } if (result == null) { result = m.dimensions(); } else if (!m.dimEquals(result)) { throw new SizeMismatchException("The source matrix with key \"" + key + "\" and the first matrix dimensions mismatch: " + "the source matrix with key \"" + 
key + "\" is " + m + ", but the first matrix has dimensions " + JArrays.toString(result, "x", 1000)); } } return result; } private IRectangularArea maxDependenceAperture(Set srcKeys) { long[] min = new long[dimCount]; // zero-filled by Java long[] max = new long[dimCount]; // zero-filled by Java for (K key : srcKeys) { IRectangularArea a = oneTileProcessor.dependenceAperture(key); for (int k = 0; k < dimCount; k++) { min[k] = Math.min(min[k], a.min(k)); max[k] = Math.max(max[k], a.max(k)); } } return IRectangularArea.valueOf(IPoint.valueOf(min), IPoint.valueOf(max)); } private double estimateWorkMemory( long extendedTileSize, Collection> destList, Collection> srcList, int numberOfTasks) { double result = 0.0; for (Matrix m : srcList) { result += Math.max(Arrays.sizeOf(m.elementType()), 0.0) * extendedTileSize; // Math.max: we shall not try to use optimized memory model for non-primitive element types } for (Matrix m : destList) { if (m != null) { result += Math.max(Arrays.sizeOf(m.elementType()), 0.0) * extendedTileSize; } } return result * numberOfTasks; } private List> allocateTile( MemoryModel betterMemoryModel, long extendedTileSize, Map> processorArguments, int numberOfTasks) { List> result = new ArrayList<>(); for (int taskIndex = 0; taskIndex < numberOfTasks; taskIndex++) { Map tileMemory = new LinkedHashMap<>(); for (Map.Entry> e : processorArguments.entrySet()) { Matrix m = e.getValue(); if (m != null) { K key = e.getKey(); MemoryModel mm = Arrays.sizeOf(m.elementType()) < 0.0 ? 
memoryModel() : betterMemoryModel; tileMemory.put(key, mm.newUnresizableArray(m.elementType(), extendedTileSize)); } } result.add(tileMemory); } return result; } private Map> loadSrcTile( ArrayContext ac, IRectangularArea maxAperture, Map srcTileMem, Map> src, long[] tilePos, long[] tileDim, long[] extTileDim) { long len = Arrays.longMul(extTileDim); Map> srcTile = new LinkedHashMap<>(); long[] inTilePos = new long[dimCount]; long[] preciseTileDim = new long[dimCount]; long[] preciseTilePos = new long[dimCount]; int i = 0, n = src.size(); for (Map.Entry> e : src.entrySet()) { K key = e.getKey(); Matrix tileMatrix = Matrices.matrix(srcTileMem.get(key).subArr(0, len), extTileDim); IRectangularArea a = oneTileProcessor.dependenceAperture(key); assert a != null : "Null dependenceAperture(" + key + ")"; for (int k = 0; k < dimCount; k++) { inTilePos[k] = a.min(k) - maxAperture.min(k); preciseTilePos[k] = tilePos[k] + a.min(k); preciseTileDim[k] = DependenceApertureBuilder.safelyAdd(tileDim[k], a.width(k)); } Matrices.copy(ac.part(i, ++i, n), tileMatrix.subMatr(inTilePos, preciseTileDim), e.getValue().subMatr(preciseTilePos, preciseTileDim, continuationMode)); srcTile.put(key, tileMatrix); } return srcTile; } private Map> prepareDestTile( Map destTileMem, Set destKeys, long[] extTileDim) { long len = Arrays.longMul(extTileDim); Map> destTile = new LinkedHashMap<>(); for (K key : destKeys) { UpdatableArray a = destTileMem.get(key); destTile.put(key, a == null ? 
null : Matrices.matrix(a.subArr(0, len), extTileDim)); } return destTile; } private void allocateDestMatricesIfNecessary( long[] dim, Map> dest, Map> destTile) { for (Map.Entry> e : destTile.entrySet()) { K key = e.getKey(); if (dest.get(key) != null) { continue; // this resulting argument is pre-allocated by the external client } Matrix destTileMatrix = e.getValue(); if (destTileMatrix == null) throw new AssertionError("Illegal implementation of one-tile processor " + oneTileProcessor.getClass() + (dest.containsKey(key) ? ": it leaves null result matrix" : ": it creates null result") + " for the key \"" + key + "\""); Matrix destMatrix = this.memoryModel().newMatrix(UpdatableArray.class, destTileMatrix.elementType(), dim); if (allocationTileDim != null) { destMatrix = destMatrix.tile(allocationTileDim); } dest.put(key, destMatrix); } for (Map.Entry> e : dest.entrySet()) { K key = e.getKey(); if (e.getValue() == null) throw new AssertionError("Illegal implementation of one-tile processor " + oneTileProcessor.getClass() + ": it does not allocate necessary result matrix with the key \"" + key + "\""); if (destTile.get(key) == null) throw new AssertionError("Illegal implementation of one-tile processor " + oneTileProcessor.getClass() + ": it removes the matrix with the key \"" + key + "\" from the list of resulting arguments"); } assert dest.size() == destTile.size(); // moreover, they have identical key sets } private void saveDestTile( ArrayContext ac, IRectangularArea maxAperture, Map> dest, Map> destTile, long[] tilePos, long[] tileDim) { long[] inTilePos = maxAperture.min().symmetric().coordinates(); int i = 0, n = dest.size(); for (Map.Entry> e : dest.entrySet()) { K key = e.getKey(); Matrix destMatrix = e.getValue(); assert destMatrix != null : "internal bug: dest matrix with the key \"" + key + "\" is not allocated"; Matrix destTileMatrix = destTile.get(key); Matrices.copy(ac.part(i, ++i, n), destMatrix.subMatr(tilePos, tileDim), 
                        destTileMatrix.subMatr(inTilePos, tileDim));
            }
        }

        private void freeResources(Map<K, Matrix<?>> tile) {
            for (Matrix<?> m : tile.values()) {
                m.freeResources();
            }
        }

        @SuppressWarnings("unchecked")
        private ApertureProcessor<K> subtaskTileProcessor(ArrayContext tileContext) {
            assert tileContext != null : "Null tileContext";
            if (!switchingContextSupported()) {
                return oneTileProcessor;
            }
            // Ask the one-tile processor for a copy of itself that works in the context of this tile
            Object p = ((ArrayProcessorWithContextSwitching) oneTileProcessor).context(tileContext);
            if (!(p instanceof ApertureProcessor)) {
                throw new AssertionError("Illegal implementation of one-tile processor, "
                        + oneTileProcessor.getClass() + ": it implements " + ApertureProcessor.class
                        + ", but after switching context the result does not implement it");
            }
            return (ApertureProcessor<K>) p;
        }

        private boolean switchingContextSupported() {
            return oneTileProcessor instanceof ArrayProcessorWithContextSwitching;
        }
    }
}
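
The tile arithmetic used by the `process` method above (number of tiles per axis, position and size of each tile, and the extended tile obtained by adding the dependence aperture) can be tried without the AlgART library. The following is only a minimal sketch of that arithmetic under stated assumptions: the class `TileGridSketch` and its methods are hypothetical names that are not part of the library, all matrix dimensions are assumed to be positive, and the linear tile index is assumed to be decoded with the first coordinate varying fastest.

```java
import java.util.Arrays;

// Hypothetical illustration class; it is not part of net.algart.
public class TileGridSketch {
    /** Number of tiles along each axis: (dim[k] - 1) / tileDim[k] + 1, as in process(). */
    public static long[] tileCounts(long[] dim, long[] tileDim) {
        long[] counts = new long[dim.length];
        for (int k = 0; k < dim.length; k++) {
            if (dim[k] <= 0 || tileDim[k] <= 0) {
                throw new IllegalArgumentException("Dimensions must be positive in this sketch");
            }
            counts[k] = (dim[k] - 1) / tileDim[k] + 1;
        }
        return counts;
    }

    /**
     * Returns {tilePos, tileSize, extTilePos, extTileMax} for the tile with the given linear index.
     * apertureMin[k] <= 0 <= apertureMax[k] describe the maximal dependence aperture.
     */
    public static long[][] tileBounds(
            long tileIndex, long[] dim, long[] tileDim, long[] apertureMin, long[] apertureMax) {
        long[] counts = tileCounts(dim, tileDim);
        int n = dim.length;
        long[] pos = new long[n];
        long[] size = new long[n];
        long[] extPos = new long[n];
        long[] extMax = new long[n];
        long rest = tileIndex;
        for (int k = 0; k < n; k++) {
            long tileCoordinate = rest % counts[k]; // assumption: first coordinate varies fastest
            rest /= counts[k];
            pos[k] = tileCoordinate * tileDim[k];
            size[k] = Math.min(tileDim[k], dim[k] - pos[k]); // tiles near the bounds can be smaller
            extPos[k] = pos[k] + apertureMin[k];             // may be negative: outside the matrix
            extMax[k] = pos[k] + size[k] - 1 + apertureMax[k];
        }
        return new long[][]{pos, size, extPos, extMax};
    }

    public static void main(String[] args) {
        long[] dim = {10, 7};
        long[] tileDim = {4, 4};
        long[] apertureMin = {-1, -1};
        long[] apertureMax = {1, 1};
        long[] counts = tileCounts(dim, tileDim);
        long total = counts[0] * counts[1];
        for (long i = 0; i < total; i++) {
            long[][] b = tileBounds(i, dim, tileDim, apertureMin, apertureMax);
            System.out.println("tile " + i + ": pos=" + Arrays.toString(b[0])
                    + ", size=" + Arrays.toString(b[1])
                    + ", extended " + Arrays.toString(b[2]) + ".." + Arrays.toString(b[3]));
        }
    }
}
```

Coordinates of the extended tile that fall outside the matrix are exactly the places where the real tiler relies on the continuation mode when loading the source tile.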




