net.algart.matrices.spectra.SeparableFastHartleyTransform Maven / Gradle / Ivy
Show all versions of algart Show documentation
/*
* The MIT License (MIT)
*
* Copyright (c) 2007-2024 Daniel Alievsky, AlgART Laboratory (http://algart.net)
*
* Permission is hereby granted, free of charge, to any person obtaining a copy
* of this software and associated documentation files (the "Software"), to deal
* in the Software without restriction, including without limitation the rights
* to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
* copies of the Software, and to permit persons to whom the Software is
* furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in all
* copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
* AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
* OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
* SOFTWARE.
*/
package net.algart.matrices.spectra;
import net.algart.arrays.*;
import java.util.Objects;
/**
* Fast Hartley transform (FHT)
* (in multidimensional case — the separable fast Hartley transform).
* This class implements standard one-dimensional FHT algorithm over an abstract {@link SampleArray}.
* It is generalized to multidimensional case by the simplest way, implemented in
* {@link AbstractSpectralTransform} class (applying the transform separably to each dimension);
* the resulting transformation for 2- or multidimensional {@link Matrix AlgART numeric matrices} is usually called
* separable fast Hartley transform (SFHT).
* The samples, processed by this class, can be both real or complex ({@link #areComplexSamplesRequired()} method
* returns false
). This class is especially useful in a case of real samples. In this case
* it is performed faster than the classic FFT, {@link FastFourierTransform} class, because
* there are no needs to allocate and process arrays of imaginary parts.
* The simple relation between Hartley and Fourier transform (see below) allows to use this transform
* almost in all areas where Fourier transform is applicable.
*
* More precisely, this class implements the classic fast "butterfly" algorithm (FHT) for calculating
* discrete Hartley transform (DHT), described at
* http://en.wikipedia.org/wiki/Discrete_Hartley_transform.
*
* Namely, let x0,x1,...,xN−1 are some
* real or complex samples (represented by abstract {@link SampleArray}), and
* H0,H1,...,HN−1 are their Hartley spectrum:
* the result of DHT. Let's designate cas θ = cos θ + sin θ.
* This class implements two possible definitions of DHT:
*
*
* - direct transform is
* Hk =
* ∑(0≤n<N)
* xn cas(2knπ/N),
* inverse transform is
* xn = N −1
* ∑(0≤k<N)
* Hk cas(2knπ/N).
*
*
* - direct transform is
* Hk = N −1
* ∑(0≤n<N)
* xn cas(2knπ/N),
* inverse transform is
* xn =
* ∑(0≤k<N)
* Hk cas(2knπ/N).
*
*
*
* The only difference is when to normalize the result: while inverse transform (case 1) or direct transform
* (case 2). The Wikipedia offers formulas of the 1st case. This class allows to calculate both variants:
* the 1st case is chosen if the normalizeDirectTransform
argument
* of the constructors is false
* or if this class is created by a constructor without this argument (it is the default behavior),
* the 2nd case is chosen if the normalizeDirectTransform
* argument of the constructors is true
.
*
* The very useful feature of DHT is that for real samples xk the Hartley spectrum
* Hk is also real — unlike DFT, when even real samples lead to complex spectrum.
* As a result, the transformation algorithms in this class can process real arrays and matrices,
* without imaginary parts. In this case, they work in two and even more times faster than FFT algorithms,
* implemented in {@link FastFourierTransform}, and do not require allocating additional memory
* for storing imaginary parts of the complex numbers.
*
* The formulas above correspond to one-dimensional transforms and specify the results of
* {@link #directTransform directTransform} / {@link #inverseTransform inverseTransform} methods.
* They are generalized to multidimensional case by default algorithms, implemented in
* {@link AbstractSpectralTransform} class, i.e. by applying the transform separably to each dimension.
* It leads to so-called multidimensional separable discrete Hartley transformations (SDHT). Below are
* the formulas for 2-dimensional separable discrete Hartley transformation of the matrix xij
* (0≤i<M, 0≤j<N) for the case 1 (normalizing the inverse transform):
*
*
* direct:
* Hij =
* ∑(0≤m<M)
* ∑(0≤n<N)
* xmn cas(2imπ/M) cas(2jnπ/N),
* inverse:
* xmn = (MN) −1
* ∑(0≤i<M)
* ∑(0≤j<N)
* xij cas(2imπ/M) cas(2jnπ/N).
*
*
* There is the simple relation between classic DFT (discrete Fourier transform) and
* SDHT (separable discrete Hartley transform).
*
* Let's consider one-dimensional case (usual DHT).
* Let x0,x1,...,xN−1 are some
* real or complex samples,
* F0,F1,...,FN−1 are their Fourier spectrum and
* H0,H1,...,HN−1 are their Hartley spectrum.
* Let i is the usual imaginary unit.
* For simplicity, let's consider that F−k=FN−k,
* H−k=HN−k, k=1,2,... Then:
*
*
* Fk = (Hk+H−k)/2
* − i (Hk−H−k)/2,
* Hk = (Fk+F−k)/2
* + i (Fk−F−k)/2,
* in a case of real samples: Hk =
* Re Fk − Im Fk
*
*
* (of course, we consider the same definition, 1 or 2, for both {@link FastFourierTransform DFT} and
* {@link SeparableFastHartleyTransform SDHT} spectra).
*
* In 2-dimensional case, the relation between DFT and SDHT is the following (we similarly suppose that
* F−i, j=FM−i, j,
* Fi,−j=Fi, N−j,
* H−i, j=HM−i, j,
* Hi,−j=Hi, N−j):
*
*
* Fi, j = (Hi,−j+H−i, j)/2
* − i (Hi, j−H−i,−j)/2,
* Hi, j = (Fi,−j+F−i, j)/2
* + i (Fi, j−F−i,−j)/2,
* in a case of real samples: Hi, j =
* Re Fi,−j − Im Fi, j.
*
*
* In the common n-dimensional case, there are similar formulas, which express
* Fi, j,...,k through a linear combination of 2n numbers
* H± i,± j,...,± k and, vice versa, express
* Hi, j,...,k through a linear combination of 2n numbers
* F± i,± j,...,± k.
*
* This class contains the ready for use methods, allowing to convert n-dimensional
* separable Hartley spectrum to Fourier one and vice versa, n=1,2,3,...:
*
*
* - {@link #separableHartleyToFourier(ArrayContext context, Matrix fRe, Matrix fIm, Matrix h)}
* converts (real) Hartley spectrum of the real matrix to its Fourier spectrum;
* - {@link #separableHartleyToFourier(ArrayContext context, Matrix fRe, Matrix fIm, Matrix hRe, Matrix hIm)}
* converts (complex) Hartley spectrum of the complex matrix to its Fourier spectrum;
* - {@link #fourierToSeparableHartley(ArrayContext context, Matrix h, Matrix fRe, Matrix fIm)}
* converts Fourer spectrum of the real matrix to its (real) Hartley spectrum;
* - {@link #fourierToSeparableHartley(ArrayContext context, Matrix hRe, Matrix hIm, Matrix fRe, Matrix fIm)}
* converts Fourer spectrum of the complex matrix to its (complex) Hartley spectrum.
*
*
* If it is necessary to get the Fourier spectrum of some real matrix, probably process it and transform
* the Fourier spectrum back to the real matrix, you can use a combination of SHFT,
* provided by this class, and the conversion methods listed above (cases of real matrices).
* But if all that you need is to calculate a convolution of two real matrices, there is a better way: see below.
*
* One-dimensional Hartley transform, defined by the formulas 1 and 2 above, complies with the
* convolution theorem. Namely, let
* p0,p1,...,pN−1 is the first complex or real
* numeric function and q0,q1,...,qN−1 is
* the second complex or real function, and
* c0,c1,...,cN−1 is
* their (complex or real) convolution, defined as:
*
*
* ck =
* ∑(0≤n<N)
* pnqk−n
*
*
* (here and below we consider that Z−k=ZN−k for all samples
* and spectra).
* Also, let P0,P1,...,PN−1,
* Q0,Q1,...,QN−1 and
* C0,C1,...,CN−1 are
* Hartley spectra of these functions. Then:
*
*
* - Ck =
* (PkQ−k+P−kQk)/2 +
* (PkQk−P−kQ−k)/2,
* if the spectra were calculated according formula 1 above (default method);
*
* - Ck = N
* ((PkQ−k+P−kQk)/2 +
* (PkQk−P−kQ−k)/2),
* if the spectra were calculated according formula 2 above.
*
*
* There are similar formulas in the common n-dimensional case, allowing to express the separable
* Hartley spectrum of the convolution of two n-dimensional matrices via the spectra of the source matrices.
* In particular, in the 2-dimensional case:
*
*
* - Ci, j =
* ((Pi, j+P−i,−j)
* (Qi, j+Q−i,−j) −
* (Pi,−j−P−i, j)
* (Qi,−j−Q−i, j) +
* (Pi, j−P−i,−j)
* (Qi,−j+Q−i, j) +
* (Pi,−j+P−i, j)
* (Qi, j−Q−i,−j))/4,
* if the spectra were calculated according formula 1 above (default method);
*
* - Ci, j = the same expression multiplied by MN
* (M and N are the dimensions of the matrices),
* if the spectra were calculated according formula 2 above.
*
*
* This class contains the ready for use methods, allowing to calculate a spectrum of convolution C
* on the base of the given spectra P and Q of two source numeric matrices x and y
* according the formulas A, C and their generalization for any number of dimensions:
*
*
* - {@link
* #spectrumOfConvolution(ArrayContext context, Matrix cRe, Matrix cIm, Matrix pRe, Matrix pRe, Matrix qRe, Matrix qRe)}
* calculates separable Hartley spectrum of the convolution C on the base of separable Hartley spectra
* P and Q of two source complex matrices;
* - {@link #spectrumOfConvolution(ArrayContext context, Matrix c, Matrix p, Matrix q)} calculates
* separable Hartley spectrum of the convolution C on the base of separable Hartley spectra
* P and Q of two source real matrices.
*
*
* So, if you need to calculate a convolution of some real matrices, for example, for goals of linear filtering,
* you can use the SFHT transform and the
* {@link #spectrumOfConvolution(ArrayContext context, Matrix c, Matrix p, Matrix q) spectrumOfConvolution}
* method, provided by this class: it is much better idea than using {@link FastFourierTransform} class.
*
* Please note: in the one-dimensional case, the spectral transofmation algorithms, implemented by
* {@link #directTransformMatrix directTransformMatrix} / {@link #inverseTransformMatrix inverseTransformMatrix}
* methods of this class, work with normal (i.e. high) performance only if
* the passed one-dimensional AlgART matrices are stored in {@link SimpleMemoryModel}
* (more precisely, if they are {@link DirectAccessible directly accessible}).
* In another case, each access to every sample leads to calling accessing methods
* {@link PArray#getDouble(long) getDouble} and {@link UpdatablePArray#setDouble(long, double) setDouble},
* which can work slowly in non-simple memory models like {@link LargeMemoryModel}. There is the same problem for
* {@link #directTransform directTransform} / {@link #inverseTransform inverseTransform} methods, if the passed
* sample arrays are created via {@link RealScalarSampleArray#asSampleArray RealScalarSampleArray.asSampleArray}
* or {@link ComplexScalarSampleArray#asSampleArray ComplexScalarSampleArray.asSampleArray} methods on the base of
* updatable AlgART arrays, created by memory model other than {@link SimpleMemoryModel}.
*
* For n-dimensional matrices (n≥2), this problem usually does not occur at all, even for non-simple
* memory models, if you use standard implementations of
* {@link #directTransformMatrix directTransformMatrix} / {@link #inverseTransformMatrix inverseTransformMatrix}
* from {@link AbstractSpectralTransform} class: these implementations automatically download necessary parts
* of the matrix into {@link SimpleMemoryModel}. This problem also does not occur while using
* conversion methods {@link #separableHartleyToFourier(ArrayContext, Matrix, Matrix, Matrix)},
* {@link #separableHartleyToFourier(ArrayContext, Matrix, Matrix, Matrix, Matrix)},
* {@link #fourierToSeparableHartley(ArrayContext, Matrix, Matrix, Matrix)},
* {@link #fourierToSeparableHartley(ArrayContext, Matrix, Matrix, Matrix, Matrix)} and
* methods of calculation of the spectrum of convolution
* {@link #spectrumOfConvolution(ArrayContext, Matrix, Matrix, Matrix)} and
* {@link #spectrumOfConvolution(ArrayContext, Matrix, Matrix, Matrix, Matrix, Matrix, Matrix)},
* if all processed matrices have the same float
or double
element types.
*/
public class SeparableFastHartleyTransform extends AbstractSpectralTransform implements SpectralTransform {
final boolean normalizeDirectTransform;
/**
* Creates a new instance of this class, performing separable Hartley transform according to the formula 1 from
* the {@link SeparableFastHartleyTransform comments to this class}.
* Equivalent to {@link #SeparableFastHartleyTransform(boolean) SeparableFastHartleyTransform(false)}.
*
* @see #SeparableFastHartleyTransform(long)
*/
public SeparableFastHartleyTransform() {
this.normalizeDirectTransform = FastFourierTransform.DEFAULT_NORMALIZE_DIRECT_TRANSFORM;
}
/**
* Creates a new instance of this class, performing separable Hartley transform according to the formula 1 from
* the {@link SeparableFastHartleyTransform comments to this class}.
*
* The maxTempJavaMemory
argument specifies the amount of Java memory (heap),
* that can be used by methods of this class for internal needs. It is passed to the corresponding
* constructor of {@link AbstractSpectralTransform}: see
* {@link AbstractSpectralTransform#AbstractSpectralTransform(long) comments to that constructor}.
*
* @param maxTempJavaMemory desired maximal amount of Java memory, in bytes, allowed for allocation
* by methods of this class for internal needs.
* @see #SeparableFastHartleyTransform()
*/
public SeparableFastHartleyTransform(long maxTempJavaMemory) {
super(maxTempJavaMemory);
this.normalizeDirectTransform = FastFourierTransform.DEFAULT_NORMALIZE_DIRECT_TRANSFORM;
}
/**
* Creates a new instance of this class, performing separable Hartley transform according either to the formula 1
* from the {@link SeparableFastHartleyTransform comments to this class},
* if normalizeDirectTransform
argument is false
,
* or to the formula 2, if this argument is true
.
* The default value, used by the constructors without normalizeDirectTransform
argument,
* is false
.
*
*
Please note: the value of normalizeDirectTransform
argument affects only the transformation
* methods {@link #directTransform directTransform}, {@link #inverseTransform inverseTransform},
* {@link #directTransformMatrix directTransformMatrix}, {@link #inverseTransformMatrix inverseTransformMatrix}.
* This value does not matter in other methods of this class: conversions between Hartley and Fourier spectrum,
* {@link #spectrumOfConvolution(ArrayContext, Matrix, Matrix, Matrix)} and
* {@link #spectrumOfConvolution(ArrayContext, Matrix, Matrix, Matrix, Matrix, Matrix, Matrix)}.
*
* @param normalizeDirectTransform true
if you want to perform normalization
* (division by the number of
* samples N) after the direct transform,
* false
(the default value)
* if you want to perform normalization after the inverse transform.
* @see #SeparableFastHartleyTransform(boolean, long)
*/
public SeparableFastHartleyTransform(boolean normalizeDirectTransform) {
this.normalizeDirectTransform = normalizeDirectTransform;
}
/**
* Creates a new instance of this class, performing separable Hartley transform according either to the formula 1
* from the {@link SeparableFastHartleyTransform comments to this class},
* if normalizeDirectTransform
argument is false
,
* or to the formula 2, if this argument is true
.
* The default value, used by the constructors without normalizeDirectTransform
argument,
* is false
.
*
*
Please note: the value of normalizeDirectTransform
argument affects only the transformation
* methods {@link #directTransform directTransform}, {@link #inverseTransform inverseTransform},
* {@link #directTransformMatrix directTransformMatrix}, {@link #inverseTransformMatrix inverseTransformMatrix}.
* This value does not matter in other methods of this class: conversions between Hartley and Fourier spectrum,
* {@link #spectrumOfConvolution(ArrayContext, Matrix, Matrix, Matrix)} and
* {@link #spectrumOfConvolution(ArrayContext, Matrix, Matrix, Matrix, Matrix, Matrix, Matrix)}.
*
*
The maxTempJavaMemory
argument specifies the amount of Java memory (heap),
* that can be used by methods of this class for internal needs. It is passed to the corresponding
* constructor of {@link AbstractSpectralTransform}: see
* {@link AbstractSpectralTransform#AbstractSpectralTransform(long) comments to that constructor}.
*
* @param maxTempJavaMemory desired maximal amount of Java memory, in bytes, allowed for allocating
* by methods of this class for internal needs.
* @param normalizeDirectTransform true
if you want to perform normalization
* (division by the number of
* samples N) after the direct transform,
* false
(the default value)
* if you want to perform normalization after the inverse transform.
* @see #SeparableFastHartleyTransform(boolean)
*/
public SeparableFastHartleyTransform(boolean normalizeDirectTransform, long maxTempJavaMemory) {
super(maxTempJavaMemory);
this.normalizeDirectTransform = normalizeDirectTransform;
}
/**
* Converts the separable Hartley spectrum H of some real n-dimensional matrix into
* the (complex) Fourier spectrum F of the same matrix.
* See the {@link SeparableFastHartleyTransform comments to this class} about the relation formulas between
* separable Hartley and Fourier spectra.
*
*
The complex matrix F is represented as a pair of AlgART matrices (fRe,fIm)
:
* the corresponding elements of these 2 matrices contain the real and imaginary parts
* of the corresponding elements of the complex matrix F.
* The real matrix H is passed as an AlgART matrix h
.
*
*
All matrices, passed to this method, must have {@link Matrix#dimEquals(Matrix) equal dimensions}.
* The {@link Matrix#elementType() element type} of the passed matrices can be different, but we recommend
* using the same float
or double
element type for all matrices.
* There are no restrictions for the dimensions of the passed matrices:
* {@link #isLengthAllowed(long)} method is not used here.
*
*
This method works correctly, if you pass the same matrix as
* fRe
/ fIm
and h
.
*
*
If you need to convert spectrum in a case of one-dimensional
* numeric AlgART arrays, you just need to convert them into one-dimensional AlgART matrices by
* {@link Matrices#matrix(Array, long...)} call, for example:
* {@link Matrices#matrix(Array, long...) Matrices.matrix}(array, array.length())
.
*
* @param context the context that will be used by this algorithm; can be {@code null}
* (see comments to {@link SpectralTransform}).
* @param fRe the real parts of the elements of the resulting matrix (Fourier spectrum).
* @param fIm the imaginary parts of the elements of the resulting matrix (Fourier spectrum).
* @param h the source real matrix (separable Hartley spectrum).
* @throws NullPointerException if one of fRe
, fIm
,
* h
arguments is {@code null}.
* @throws SizeMismatchException if some of the passed matrices have different dimensions.
* @see #separableHartleyToFourier(ArrayContext, Matrix, Matrix, Matrix, Matrix)
*/
public void separableHartleyToFourier(
ArrayContext context,
Matrix extends UpdatablePNumberArray> fRe, Matrix extends UpdatablePNumberArray> fIm,
Matrix extends PNumberArray> h) {
Objects.requireNonNull(fRe, "Null fRe argument");
Objects.requireNonNull(fIm, "Null fIm argument");
Objects.requireNonNull(h, "Null h argument");
if (!fRe.dimEquals(fIm)) {
throw new SizeMismatchException("fRe and fIm dimensions mismatch: fRe is " + fRe + ", fIm " + fIm);
}
if (!h.dimEquals(fRe)) {
throw new SizeMismatchException("h and fRe dimensions mismatch: h is " + h + ", fRe " + fRe);
}
ThreadPoolFactory tpf = Arrays.getThreadPoolFactory(context);
Conversions.separableHartleyToFourierRecoursive(context, maxTempJavaMemory(),
fRe.array(), fIm.array(), h.array(), null,
fRe.dimensions(), Math.max(1, tpf.recommendedNumberOfTasks()));
}
/**
* Converts the separable Hartley spectrum H of some complex n-dimensional matrix into
* the (complex) Fourier spectrum F of the same matrix.
* See the {@link SeparableFastHartleyTransform comments to this class} about the relation formulas between
* separable Hartley and Fourier spectra.
*
*
The complex matrix F is represented as a pair of AlgART matrices (fRe,fIm)
:
* the corresponding elements of these 2 matrices contain the real and imaginary parts
* of the corresponding elements of the complex matrix F.
* Similarly, the complex matrix H is represented as a pair of AlgART matrices (hRe,hIm)
.
*
*
All matrices, passed to this method, must have {@link Matrix#dimEquals(Matrix) equal dimensions}.
* The {@link Matrix#elementType() element type} of the passed matrices can be different, but we recommend
* using the same float
or double
element type for all matrices.
* There are no restrictions for the dimensions of the passed matrices:
* {@link #isLengthAllowed(long)} method is not used here.
*
*
This method works correctly, if you pass the same complex matrix as F and H.
* So, you can calculate and return the result in the source matrices.
*
*
If you need to convert spectrum in a case of one-dimensional
* numeric AlgART arrays, you just need to convert them into one-dimensional AlgART matrices by
* {@link Matrices#matrix(Array, long...)} call, for example:
* {@link Matrices#matrix(Array, long...) Matrices.matrix}(array, array.length())
.
*
* @param context the context that will be used by this algorithm; can be {@code null}
* (see comments to {@link SpectralTransform}).
* @param fRe the real parts of the elements of the resulting matrix (Fourier spectrum).
* @param fIm the imaginary parts of the elements of the resulting matrix (Fourier spectrum).
* @param hRe the real parts of the elements of the source matrix (separable Hartley spectrum).
* @param hIm the imaginary parts of the elements of the source matrix (separable Hartley spectrum).
* @throws NullPointerException if one of fRe
, fIm
, hRe
, hIm
* arguments is {@code null}.
* @throws SizeMismatchException if some of the passed matrices have different dimensions.
* @see #separableHartleyToFourier(ArrayContext, Matrix, Matrix, Matrix)
*/
public void separableHartleyToFourier(
ArrayContext context,
Matrix extends UpdatablePNumberArray> fRe, Matrix extends UpdatablePNumberArray> fIm,
Matrix extends PNumberArray> hRe, Matrix extends PNumberArray> hIm) {
Objects.requireNonNull(fRe, "Null fRe argument");
Objects.requireNonNull(fIm, "Null fIm argument");
Objects.requireNonNull(hRe, "Null hRe argument");
Objects.requireNonNull(hIm, "Null hIm argument");
if (!fRe.dimEquals(fIm)) {
throw new SizeMismatchException("fRe and fIm dimensions mismatch: fRe is " + fRe + ", fIm " + fIm);
}
if (!hRe.dimEquals(fRe)) {
throw new SizeMismatchException("hRe and fRe dimensions mismatch: hRe is " + hRe + ", fRe " + fRe);
}
if (!hIm.dimEquals(fRe)) {
throw new SizeMismatchException("hIm and fRe dimensions mismatch: hIm is " + hIm + ", fRe " + fRe);
}
ThreadPoolFactory tpf = Arrays.getThreadPoolFactory(context);
Conversions.separableHartleyToFourierRecoursive(context, maxTempJavaMemory(),
fRe.array(), fIm.array(), hRe.array(), hIm.array(),
fRe.dimensions(), Math.max(1, tpf.recommendedNumberOfTasks()));
}
/**
* Converts the Fourier spectrum F of some real n-dimensional matrix into
* the (real) separable Hartley spectrum H of the same matrix.
* See the {@link SeparableFastHartleyTransform comments to this class} about the relation formulas between
* separable Hartley and Fourier spectra.
* If the passed Fourier spectrum is not a spectrum of a real matrix (in other words,
* if the inverse Fourier transform of F matrix contains nonzero imaginary parts),
* then this method still correctly calculates the real parts of the separable Hartley spectrum H.
*
*
The complex matrix F is represented as a pair of AlgART matrices (fRe,fIm)
:
* the corresponding elements of these 2 matrices contain the real and imaginary parts
* of the corresponding elements of the complex matrix F.
* The real matrix H (or the real parts of H, if the passed F matrix
* is not a spectrum of a real matrix) is passed as an AlgART matrix h
.
*
*
All matrices, passed to this method, must have {@link Matrix#dimEquals(Matrix) equal dimensions}.
* The {@link Matrix#elementType() element type} of the passed matrices can be different, but we recommend
* using the same float
or double
element type for all matrices.
* There are no restrictions for the dimensions of the passed matrices:
* {@link #isLengthAllowed(long)} method is not used here.
*
*
This method works correctly, if you pass the same matrix as
* fRe
/ fIm
and h
.
*
*
If you need to convert spectrum in a case of one-dimensional
* numeric AlgART arrays, you just need to convert them into one-dimensional AlgART matrices by
* {@link Matrices#matrix(Array, long...)} call, for example:
* {@link Matrices#matrix(Array, long...) Matrices.matrix}(array, array.length())
.
*
* @param context the context that will be used by this algorithm; can be {@code null}
* (see comments to {@link SpectralTransform}).
* @param h the resulting real matrix (separable Hartley spectrum).
* @param fRe the real parts of the elements of the source matrix (Fourier spectrum).
* @param fIm the imaginary parts of the elements of the source matrix (Fourier spectrum).
* @throws NullPointerException if one of h
, fRe
, fIm
arguments is {@code null}.
* @throws SizeMismatchException if some of the passed matrices have different dimensions.
* @see #fourierToSeparableHartley(ArrayContext, Matrix, Matrix, Matrix, Matrix)
*/
public void fourierToSeparableHartley(
ArrayContext context,
Matrix extends UpdatablePNumberArray> h,
Matrix extends PNumberArray> fRe, Matrix extends PNumberArray> fIm) {
Objects.requireNonNull(h, "Null h argument");
Objects.requireNonNull(fRe, "Null fRe argument");
Objects.requireNonNull(fIm, "Null fIm argument");
if (!fRe.dimEquals(fIm)) {
throw new SizeMismatchException("fRe and fIm dimensions mismatch: fRe is " + fRe + ", fIm " + fIm);
}
if (!h.dimEquals(fRe)) {
throw new SizeMismatchException("h and fRe dimensions mismatch: h is " + h + ", fRe " + fRe);
}
ThreadPoolFactory tpf = Arrays.getThreadPoolFactory(context);
Conversions.fourierToSeparableHartleyRecursive(context, maxTempJavaMemory(),
h.array(), null, fRe.array(), fIm.array(),
fRe.dimensions(), Math.max(1, tpf.recommendedNumberOfTasks()));
}
/**
* Converts the Fourier spectrum F of some complex n-dimensional matrix into
* the (complex) separable Hartley spectrum H of the same matrix.
* See the {@link SeparableFastHartleyTransform comments to this class} about the relation formulas between
* separable Hartley and Fourier spectra.
*
*
The complex matrix F is represented as a pair of AlgART matrices (fRe,fIm)
:
* the corresponding elements of these 2 matrices contain the real and imaginary parts
* of the corresponding elements of the complex matrix F.
* Similarly, the complex matrix H is represented as a pair of AlgART matrices (hRe,hIm)
.
*
*
All matrices, passed to this method, must have {@link Matrix#dimEquals(Matrix) equal dimensions}.
* The {@link Matrix#elementType() element type} of the passed matrices can be different, but we recommend
* using the same float
or double
element type for all matrices.
* There are no restrictions for the dimensions of the passed matrices:
* {@link #isLengthAllowed(long)} method is not used here.
*
*
This method works correctly, if you pass the same complex matrix as F and H.
* So, you can calculate and return the result in the source matrices.
*
*
If you need to convert spectrum in a case of one-dimensional
* numeric AlgART arrays, you just need to convert them into one-dimensional AlgART matrices by
* {@link Matrices#matrix(Array, long...)} call, for example:
* {@link Matrices#matrix(Array, long...) Matrices.matrix}(array, array.length())
.
*
* @param context the context that will be used by this algorithm; can be {@code null}
* (see comments to {@link SpectralTransform}).
* @param hRe the real parts of the elements of the resulting matrix (separable Hartley spectrum).
* @param hIm the imaginary parts of the elements of the resulting matrix (separable Hartley spectrum).
* @param fRe the real parts of the elements of the source matrix (Fourier spectrum).
* @param fIm the imaginary parts of the elements of the source matrix (Fourier spectrum).
* @throws NullPointerException if one of hRe
, hIm
, fRe
, fIm
* arguments is {@code null}.
* @throws SizeMismatchException if some of the passed matrices have different dimensions.
* @see #fourierToSeparableHartley(ArrayContext, Matrix, Matrix, Matrix)
*/
public void fourierToSeparableHartley(
ArrayContext context,
Matrix extends UpdatablePNumberArray> hRe, Matrix extends UpdatablePNumberArray> hIm,
Matrix extends PNumberArray> fRe, Matrix extends PNumberArray> fIm) {
Objects.requireNonNull(hRe, "Null hRe argument");
Objects.requireNonNull(fRe, "Null fRe argument");
Objects.requireNonNull(fIm, "Null fIm argument");
Objects.requireNonNull(hIm, "Null hIm argument");
if (!fRe.dimEquals(fIm)) {
throw new SizeMismatchException("fRe and fIm dimensions mismatch: fRe is " + fRe + ", fIm " + fIm);
}
if (!hRe.dimEquals(fRe)) {
throw new SizeMismatchException("hRe and fRe dimensions mismatch: hRe is " + hRe + ", fRe " + fRe);
}
if (!hIm.dimEquals(fRe)) {
throw new SizeMismatchException("hIm and fRe dimensions mismatch: hIm is " + hIm + ", fRe " + fRe);
}
ThreadPoolFactory tpf = Arrays.getThreadPoolFactory(context);
Conversions.fourierToSeparableHartleyRecursive(context, maxTempJavaMemory(),
hRe.array(), hIm.array(), fRe.array(), fIm.array(),
fRe.dimensions(), Math.max(1, tpf.recommendedNumberOfTasks()));
}
/**
* Calculates C, the separable Hartley spectrum of the convolution of some two real matrices,
* on the base of P and Q — the separable Hartley spectra of these two real matrices.
*
*
The real matrices P, Q, C are passed as AlgART matrices
* p
, q
, c
.
*
*
All matrices, passed to this method, must have {@link Matrix#dimEquals(Matrix) equal dimensions}.
* The {@link Matrix#elementType() element type} of the passed matrices can be different, but we recommend
* using the same float
or double
element type for all matrices.
* There are no restrictions for the dimensions of the passed matrices:
* {@link #isLengthAllowed(long)} method is not used here.
*
*
This method works correctly, if you pass the same complex matrix as P and Q,
* or as P and C, or as Q and C, or even as all three matrices.
* So, you can calculate and return the result in one of the source matrices.
*
*
If you need to calculate the Hartley spectrum of convolution for a case of one-dimensional
* numeric AlgART arrays, you just need to convert them into one-dimensional AlgART matrices by
* {@link Matrices#matrix(Array, long...)} call, for example:
* {@link Matrices#matrix(Array, long...) Matrices.matrix}(array, array.length())
.
*
* @param context the context that will be used by this algorithm; can be {@code null}
* (see comments to {@link SpectralTransform}).
* @param c the resulting matrix (spectrum of the convolution).
* @param p the spectrum of the 1st matrix.
* @param q the spectrum of the 2nd matrix.
* @throws NullPointerException if one of c
, p
, q
arguments is {@code null}.
* @throws SizeMismatchException if some of the passed matrices have different dimensions.
* @see #spectrumOfConvolution(ArrayContext, Matrix, Matrix, Matrix, Matrix, Matrix, Matrix)
*/
public void spectrumOfConvolution(
ArrayContext context,
Matrix extends UpdatablePNumberArray> c,
Matrix extends PNumberArray> p,
Matrix extends PNumberArray> q) {
Objects.requireNonNull(c, "Null c argument");
Objects.requireNonNull(p, "Null p argument");
Objects.requireNonNull(q, "Null q argument");
if (!p.dimEquals(c)) {
throw new SizeMismatchException("c and p dimensions mismatch: c is " + c + ", p " + p);
}
if (!q.dimEquals(c)) {
throw new SizeMismatchException("c and q dimensions mismatch: c is " + c + ", q " + q);
}
ThreadPoolFactory tpf = Arrays.getThreadPoolFactory(context);
SpectraOfConvolution.separableHartleySpectrumOfConvolution(context, maxTempJavaMemory(),
c.array(), null, p.array(), null, q.array(), null,
c.dimensions(), Math.max(1, tpf.recommendedNumberOfTasks()));
}
/**
* Calculates C, the separable Hartley spectrum of the convolution of some two complex matrices,
* on the base of P and Q — the separable Hartley spectra of these two complex matrices.
*
*
The complex matrix P is represented as a pair of AlgART matrices (pRe,pIm)
:
* the corresponding elements of these 2 matrices contain the real and imaginary parts
* of the corresponding elements of the complex matrix P.
* Similarly, the complex matrix Q is represented as a pair of AlgART matrices (qRe,qIm)
,
* and the complex matrix C is represented as a pair of AlgART matrices (cRe,cIm)
.
*
*
All matrices, passed to this method, must have {@link Matrix#dimEquals(Matrix) equal dimensions}.
* The {@link Matrix#elementType() element type} of the passed matrices can be different, but we recommend
* using the same float
or double
element type for all matrices.
* There are no restrictions for the dimensions of the passed matrices:
* {@link #isLengthAllowed(long)} method is not used here.
*
*
This method works correctly, if you pass the same complex matrix as P and Q,
* or as P and C, or as Q and C, or even as all three matrices.
* So, you can calculate and return the result in one of the source matrices.
*
*
If you need to calculate the Hartley spectrum of convolution for a case of one-dimensional
* numeric AlgART arrays, you just need to convert them into one-dimensional AlgART matrices by
* {@link Matrices#matrix(Array, long...)} call, for example:
* {@link Matrices#matrix(Array, long...) Matrices.matrix}(array, array.length())
.
*
* @param context the context that will be used by this algorithm; can be {@code null}
* (see comments to {@link SpectralTransform}).
* @param cRe the real parts of the elements of the resulting matrix (spectrum of the convolution).
* @param cIm the imaginary parts of the elements of the resulting matrix (spectrum of the convolution).
* @param pRe the real parts of the elements of the spectrum of the 1st matrix.
* @param pIm the imaginary parts of the elements of the spectrum of the 1st matrix.
* @param qRe the real parts of the elements of the spectrum of the 2nd matrix.
* @param qIm the imaginary parts of the elements of the spectrum of the 2nd matrix.
* @throws NullPointerException if one of cRe
, cIm
, pRe
, pIm
,
* qRe
, qIm
arguments is {@code null}.
* @throws SizeMismatchException if some of the passed matrices have different dimensions.
* @see #spectrumOfConvolution(ArrayContext, Matrix, Matrix, Matrix)
*/
public void spectrumOfConvolution(
ArrayContext context,
Matrix extends UpdatablePNumberArray> cRe, Matrix extends UpdatablePNumberArray> cIm,
Matrix extends PNumberArray> pRe, Matrix extends PNumberArray> pIm,
Matrix extends PNumberArray> qRe, Matrix extends PNumberArray> qIm) {
Objects.requireNonNull(cRe, "Null cRe argument");
Objects.requireNonNull(cIm, "Null cIm argument");
Objects.requireNonNull(pRe, "Null pRe argument");
Objects.requireNonNull(pIm, "Null pIm argument");
Objects.requireNonNull(qRe, "Null qRe argument");
Objects.requireNonNull(qIm, "Null qIm argument");
if (!cRe.dimEquals(cIm)) {
throw new SizeMismatchException("cRe and cIm dimensions mismatch: cRe is " + cRe + ", cIm " + cIm);
}
if (!pRe.dimEquals(cRe)) {
throw new SizeMismatchException("cRe and pRe dimensions mismatch: cRe is " + cRe + ", pRe " + pRe);
}
if (!pIm.dimEquals(cRe)) {
throw new SizeMismatchException("cRe and pIm dimensions mismatch: cRe is " + cRe + ", pIm " + pIm);
}
if (!qRe.dimEquals(cRe)) {
throw new SizeMismatchException("cRe and qRe dimensions mismatch: cRe is " + cRe + ", qRe " + qRe);
}
if (!qIm.dimEquals(cRe)) {
throw new SizeMismatchException("cRe and qIm dimensions mismatch: cRe is " + cRe + ", qIm " + qIm);
}
ThreadPoolFactory tpf = Arrays.getThreadPoolFactory(context);
SpectraOfConvolution.separableHartleySpectrumOfConvolution(context, maxTempJavaMemory(),
cRe.array(), cIm.array(), pRe.array(), pIm.array(), qRe.array(), qIm.array(),
cRe.dimensions(), Math.max(1, tpf.recommendedNumberOfTasks()));
}
@Override
public final boolean isLengthAllowed(long length) {
return (length & (length - 1)) == 0;
}
@Override
public boolean areComplexSamplesRequired() {
return false;
}
@Override
protected String unallowedLengthMessage() {
return "FHT algorithm can process only 2^k elements";
}
@Override
protected final void transform(ArrayContext context, SampleArray samples, boolean inverse) {
assert isLengthAllowed(samples.length());
boolean normalize = normalizeDirectTransform ? !inverse : inverse;
// long t1 = System.nanoTime();
ReverseBits.reorderForButterfly(context == null ? null : context.part(0.0, 0.2), samples);
// long t2 = System.nanoTime();
fhtMainLoop(context == null ? null : context.part(0.2, normalize ? 0.95 : 1.0), samples);
// long t3 = System.nanoTime();
if (normalize) {
FastFourierTransform.fftNormalize(context == null ? null : context.part(0.95, 1.0), samples);
}
// long t4 = System.nanoTime();
// System.out.printf("FHT time (common branch) %.3f ms reorder, %.3f ms main, %.3f normalizing%n",
// (t2 - t1) * 1e-6, (t3 - t2) * 1e-6, (t4 - t3) * 1e-6);
}
private static final int LOG_ANGLE_STEP = 4, ANGLE_STEP = 1 << LOG_ANGLE_STEP;
private static final double SQRT2 = StrictMath.sqrt(2.0);
private static final double HALF_SQRT2 = 0.5 * SQRT2;
private static final int
S01 = 0, D01 = 1, S23 = 2, D23 = 3,
S45 = 0, D45 = 1, S67 = 2, D67 = 3, // reusing
SS0123 = 4, SD0123 = 5, DS0123 = 6, DD0123 = 7,
SS4567 = 8, DS4567 = 9,
R1 = 0, R2 = 1,
L1 = 2,
L2 = 2, // reusing,
CAS = 3,
NUMBER_OF_WORK_VARIABLES = 10;
private static void fhtMainLoop(ArrayContext context, SampleArray samples) {
final int logN = 63 - Long.numberOfLeadingZeros(samples.length());
if (logN <= 0) {
return; // nothing to do: this check little simplifies further checks in the recursive procedure
}
if (samples instanceof RealScalarSampleArray.DirectZeroOffsetsRealFloatSampleArray
|| samples instanceof RealScalarSampleArray.DirectRealFloatSampleArray) {
fhtJavaFloatMainLoop(context, samples, 0, logN, logN);
} else if (samples instanceof RealScalarSampleArray.DirectZeroOffsetsRealDoubleSampleArray
|| samples instanceof RealScalarSampleArray.DirectRealDoubleSampleArray) {
fhtJavaDoubleMainLoop(context, samples, 0, logN, logN);
// The functions below are anti-optimization in the large applications in the current version of 32-bit Java 1.7
// } else if (samples instanceof RealVectorSampleArray.DirectRealFloatVectorSampleArray) {
// fhtJavaFloatMultidimensionalMainLoop(context,
// (RealVectorSampleArray.DirectRealFloatVectorSampleArray)samples, 0, logN, logN,
// ((RealVectorSampleArray.DirectRealFloatVectorSampleArray)samples).
// newCompatibleSamplesArray(NUMBER_OF_WORK_VARIABLES));
// } else if (samples instanceof RealVectorSampleArray.DirectRealDoubleVectorSampleArray) {
// fhtJavaDoubleMultidimensionalMainLoop(context,
// (RealVectorSampleArray.DirectRealDoubleVectorSampleArray)samples, 0, logN, logN,
// ((RealVectorSampleArray.DirectRealDoubleVectorSampleArray)samples).
// newCompatibleSamplesArray(NUMBER_OF_WORK_VARIABLES));
} else {
fhtCommonMainLoop(context, samples,
0, logN, logN, samples.newCompatibleSamplesArray(NUMBER_OF_WORK_VARIABLES));
}
}
private static void fhtCommonMainLoop(
ArrayContext context, SampleArray samples,
final long pos, final int logN, final int originalLogN, SampleArray work) {
switch (logN) { // in comments below xK, rK, yK means the source, the reordered source and the result
case 1: {
work.sub(0, samples, pos, pos + 1);
samples.add(pos, pos, pos + 1); // y0 = x0 + x1
samples.copy(pos + 1, work, 0); // y1 = x0 - x1
break;
}
case 2: {
work.sub(D01, samples, pos, pos + 1);
work.add(S01, samples, pos, pos + 1);
work.sub(D23, samples, pos + 2, pos + 3);
work.add(S23, samples, pos + 2, pos + 3);
samples.add(pos, work, S01, S23); // y0 = r0 + r1 + r2 + r3 = x0 + x2 + x1 + x3
samples.add(pos + 1, work, D01, D23); // y1 = r0 - r1 + r2 - r3 = x0 - x2 + x1 - x3
samples.sub(pos + 2, work, S01, S23); // y2 = r0 + r1 - r2 - r3 = x0 + x2 - x1 - x3
samples.sub(pos + 3, work, D01, D23); // y3 = r0 - r1 - r2 + r3 = x0 - x2 - x1 + x3
break;
}
case 3: {
work.sub(D01, samples, pos, pos + 1);
work.add(S01, samples, pos, pos + 1);
work.sub(D23, samples, pos + 2, pos + 3);
work.add(S23, samples, pos + 2, pos + 3);
work.sub(DS0123, S01, S23);
work.add(SS0123, S01, S23);
work.sub(DD0123, D01, D23);
work.add(SD0123, D01, D23); // S01, S23, D01, D23 will not be used more and can be reused
work.add(S45, samples, pos + 4, pos + 5);
work.add(S67, samples, pos + 6, pos + 7);
work.sub(D45, samples, pos + 4, pos + 5);
work.sub(D67, samples, pos + 6, pos + 7);
work.sub(DS4567, S45, S67);
work.add(SS4567, S45, S67);
samples.sub(pos + 4, work, SS0123, SS4567); // y4 = r0 + r1 + r2 + r3 - r4 - r5 - r6 - r7
samples.add(pos, work, SS0123, SS4567); // y0 = r0 + ... + r7 = x0 + ... + x7
samples.sub(pos + 6, work, DS0123, DS4567); // y6 = r0 + r1 - r2 - r3 - r4 - r5 + r6 + r7
// = x0 + x4 - x2 - x6 - x1 - x5 + x3 + x7
samples.add(pos + 2, work, DS0123, DS4567); // y2 = r0 + r1 - r2 - r3 + r4 + r5 - r6 - r7
// = x0 + x4 - x2 - x6 + x1 + x5 - x3 - x7
work.multiplyByRealScalar(D45, SQRT2);
work.multiplyByRealScalar(D67, SQRT2);
samples.sub(pos + 5, work, SD0123, D45); // y5 = r0 - r1 + r2 - r3 - sqrt(2)*r4 + sqrt(2)*r5
// = x0 - x4 + x2 - x6 - sqrt(2)*x1 + sqrt(2)*x5
samples.add(pos + 1, work, SD0123, D45); // y1 = r0 - r1 + r2 - r3 + sqrt(2)*r4 - sqrt(2)*r5
// = x0 - x4 + x2 - x6 + sqrt(2)*x1 - sqrt(2)*x5
samples.sub(pos + 7, work, DD0123, D67); // y7 = r0 - r1 - r2 + r3 - sqrt(2)*r6 + sqrt(2)*r7
// = x0 - x4 - x2 + x6 - sqrt(2)*x3 + sqrt(2)*x7
samples.add(pos + 3, work, DD0123, D67); // y3 = r0 - r1 - r2 + r3 + sqrt(2)*r6 - sqrt(2)*r7
// = x0 - x4 - x2 + x6 + sqrt(2)*x3 - sqrt(2)*x7
break;
}
default: {
assert logN > 3;
final long nDiv8 = 1L << (logN - 3);
final long nDiv4 = nDiv8 * 2;
final long nDiv2 = nDiv4 * 2;
fhtCommonMainLoop(context, samples, pos, logN - 1, originalLogN, work);
fhtCommonMainLoop(context, samples, pos + nDiv2, logN - 1, originalLogN, work);
final boolean allAnglesInCache = logN - 1 <= RootsOfUnity.LOG_CACHE_SIZE + LOG_ANGLE_STEP;
final double rotationAngle = StrictMath.PI / nDiv2;
final double sin0Half = RootsOfUnity.LOGARITHMICAL_SINE_CACHE[logN]; //sin(0.5 * rotationAngle);
final double cos0M1 = -2.0 * sin0Half * sin0Half; // cos(rotationAngle)-1
final double sin0 = RootsOfUnity.LOGARITHMICAL_SINE_CACHE[logN - 1]; //sin(rotationAngle)
double cos = 1.0 + cos0M1;
double sin = sin0;
long j1, j2;
for (long j = 1; j < nDiv8; j++) {
// (cos, sin) = (cos(j*2*PI/N), sin(j*2*PI/N))
j1 = pos + j;
j2 = pos + nDiv2 - j;
//[[Repeat.SectionStart CommonButterfly]]
work.copy(R1, samples, nDiv2 + j1);
work.copy(R2, samples, nDiv2 + j2);
work.combineWithRealMultipliers(CAS, R1, cos, R2, sin);
work.copy(L1, samples, j1);
samples.add(j1, work, L1, CAS);
samples.sub(nDiv2 + j1, work, L1, CAS);
work.combineWithRealMultipliers(CAS, R1, sin, R2, -cos);
work.copy(L2, samples, j2);
samples.add(j2, work, L2, CAS);
samples.sub(nDiv2 + j2, work, L2, CAS);
//[[Repeat.SectionEnd CommonButterfly]]
j1 = pos + nDiv4 - j;
j2 = pos + nDiv4 + j; // below is the same code as above with exchanging sin<->cos
//[[Repeat(INCLUDE_FROM_FILE, THIS_FILE, CommonButterfly)
// cos ==> TTTT;; sin ==> cos;; TTTT ==> sin !! Auto-generated: NOT EDIT !! ]]
work.copy(R1, samples, nDiv2 + j1);
work.copy(R2, samples, nDiv2 + j2);
work.combineWithRealMultipliers(CAS, R1, sin, R2, cos);
work.copy(L1, samples, j1);
samples.add(j1, work, L1, CAS);
samples.sub(nDiv2 + j1, work, L1, CAS);
work.combineWithRealMultipliers(CAS, R1, cos, R2, -sin);
work.copy(L2, samples, j2);
samples.add(j2, work, L2, CAS);
samples.sub(nDiv2 + j2, work, L2, CAS);
//[[Repeat.IncludeEnd]]
if ((j & (ANGLE_STEP - 1)) == ANGLE_STEP - 1) {
if (allAnglesInCache) {
int angleIndex = (int) (j + 1) >> LOG_ANGLE_STEP
<< (RootsOfUnity.LOG_CACHE_SIZE - (logN - 1) + LOG_ANGLE_STEP);
// (j + 1) * CACHE_SIZE/nDiv2
// cos = RootsOfUnity.quickCos(angleIndex);
// sin = RootsOfUnity.quickSin(angleIndex);
cos = angleIndex < RootsOfUnity.HALF_CACHE_SIZE ?
RootsOfUnity.SINE_CACHE[RootsOfUnity.HALF_CACHE_SIZE - angleIndex] :
-RootsOfUnity.SINE_CACHE[angleIndex - RootsOfUnity.HALF_CACHE_SIZE];
sin = angleIndex < RootsOfUnity.HALF_CACHE_SIZE ?
RootsOfUnity.SINE_CACHE[angleIndex] :
RootsOfUnity.SINE_CACHE[RootsOfUnity.CACHE_SIZE - angleIndex];
} else {
double angle = (j + 1) * rotationAngle;
double sinHalf = Math.sin(0.5 * angle);
cos = 1.0 - 2.0 * sinHalf * sinHalf; // = cos(angle)
sin = Math.sin(angle);
}
} else {
// on modern JVM and CPU, the following 4 multiplications and 4 additions work faster
// then extracting cos and sin from the table
double temp = cos;
cos = cos * cos0M1 - sin * sin0 + cos;
sin = sin * cos0M1 + temp * sin0 + sin;
}
}
j1 = pos + nDiv8;
j2 = pos + nDiv2 - nDiv8;
//[[Repeat(INCLUDE_FROM_FILE, THIS_FILE, CommonButterfly)
// (cos|sin) ==> HALF_SQRT2;;
// \s\s\s\s(work\.|samples\.|$) ==> $1 !! Auto-generated: NOT EDIT !! ]]
work.copy(R1, samples, nDiv2 + j1);
work.copy(R2, samples, nDiv2 + j2);
work.combineWithRealMultipliers(CAS, R1, HALF_SQRT2, R2, HALF_SQRT2);
work.copy(L1, samples, j1);
samples.add(j1, work, L1, CAS);
samples.sub(nDiv2 + j1, work, L1, CAS);
work.combineWithRealMultipliers(CAS, R1, HALF_SQRT2, R2, -HALF_SQRT2);
work.copy(L2, samples, j2);
samples.add(j2, work, L2, CAS);
samples.sub(nDiv2 + j2, work, L2, CAS);
//[[Repeat.IncludeEnd]]
work.sub(0, samples, pos, pos + nDiv2);
samples.add(pos, pos, pos + nDiv2); // r[0] + r[n/2]
samples.copy(pos + nDiv2, work, 0); // r[0] - r[n/2]
work.sub(0, samples, pos + nDiv4, pos + nDiv2 + nDiv4);
samples.add(pos + nDiv4, pos + nDiv4, pos + nDiv2 + nDiv4); // r[n/4] + r[3*n/4]
samples.copy(pos + nDiv2 + nDiv4, work, 0); // r[n/4] - r[3*n/4]
}
if (context != null && (pos + (1 << logN) == samples.length())) {
context.checkInterruptionAndUpdateProgress(null, logN, originalLogN);
}
}
}
//[[Repeat() Float ==> Double;;
// float ==> double;;
// \(double\)\s*\(([^)]+)\) ==> $1 ]]
private static void fhtJavaFloatMainLoop(
ArrayContext context, SampleArray samples,
final int pos, final int logN, final int originalLogN) {
final float[] values = samples instanceof RealScalarSampleArray.DirectZeroOffsetsRealFloatSampleArray ?
((RealScalarSampleArray.DirectZeroOffsetsRealFloatSampleArray) samples).samples :
((RealScalarSampleArray.DirectRealFloatSampleArray) samples).samples;
final int ofs = pos + (samples instanceof RealScalarSampleArray.DirectZeroOffsetsRealFloatSampleArray ? 0 :
((RealScalarSampleArray.DirectRealFloatSampleArray) samples).ofs);
switch (logN) { // in comments below xK, rK, yK means the source, the reordered source and the result
case 1: {
float temp = values[ofs] - values[ofs + 1];
values[ofs] += values[ofs + 1]; // y0 = x0 + x1
values[ofs + 1] = temp; // y1 = x0 - x1
break;
}
case 2: {
float d01 = values[ofs] - values[ofs + 1];
float s01 = values[ofs] + values[ofs + 1];
float d23 = values[ofs + 2] - values[ofs + 3];
float s23 = values[ofs + 2] + values[ofs + 3];
values[ofs] = s01 + s23; // y0 = r0 + r1 + r2 + r3 = x0 + x2 + x1 + x3
values[ofs + 1] = d01 + d23; // y1 = r0 - r1 + r2 - r3 = x0 - x2 + x1 - x3
values[ofs + 2] = s01 - s23; // y2 = r0 + r1 - r2 - r3 = x0 + x2 - x1 - x3
values[ofs + 3] = d01 - d23; // y3 = r0 - r1 - r2 + r3 = x0 - x2 - x1 + x3
break;
}
case 3: {
float d01 = values[ofs] - values[ofs + 1];
float s01 = values[ofs] + values[ofs + 1];
float d23 = values[ofs + 2] - values[ofs + 3];
float s23 = values[ofs + 2] + values[ofs + 3];
float ds0123 = s01 - s23;
float ss0123 = s01 + s23;
float dd0123 = d01 - d23;
float sd0123 = d01 + d23;
float s45 = values[ofs + 4] + values[ofs + 5];
float s67 = values[ofs + 6] + values[ofs + 7];
float d45 = values[ofs + 4] - values[ofs + 5];
float d67 = values[ofs + 6] - values[ofs + 7];
float ds4567 = s45 - s67;
float ss4567 = s45 + s67;
values[ofs + 4] = ss0123 - ss4567; // y4 = r0 + r1 + r2 + r3 - r4 - r5 - r6 - r7
values[ofs] = ss0123 + ss4567; // y0 = r0 + ... + r7 = x0 + ... + x7
values[ofs + 6] = ds0123 - ds4567; // y6 = r0 + r1 - r2 - r3 - r4 - r5 + r6 + r7
// = x0 + x4 - x2 - x6 - x1 - x5 + x3 + x7
values[ofs + 2] = ds0123 + ds4567; // y2 = r0 + r1 - r2 - r3 + r4 + r5 - r6 - r7
// = x0 + x4 - x2 - x6 + x1 + x5 - x3 - x7
d45 *= SQRT2;
d67 *= SQRT2;
values[ofs + 5] = sd0123 - d45; // y5 = r0 - r1 + r2 - r3 - sqrt(2)*r4 + sqrt(2)*r5
// = x0 - x4 + x2 - x6 - sqrt(2)*x1 + sqrt(2)*x5
values[ofs + 1] = sd0123 + d45; // y1 = r0 - r1 + r2 - r3 + sqrt(2)*r4 - sqrt(2)*r5
// = x0 - x4 + x2 - x6 + sqrt(2)*x1 - sqrt(2)*x5
values[ofs + 7] = dd0123 - d67; // y7 = r0 - r1 - r2 + r3 - sqrt(2)*r6 + sqrt(2)*r7
// = x0 - x4 - x2 + x6 - sqrt(2)*x3 + sqrt(2)*x7
values[ofs + 3] = dd0123 + d67; // y3 = r0 - r1 - r2 + r3 + sqrt(2)*r6 - sqrt(2)*r7
// = x0 - x4 - x2 + x6 + sqrt(2)*x3 - sqrt(2)*x7
break;
}
default: {
assert logN > 3;
final int nDiv8 = 1 << (logN - 3);
final int nDiv4 = nDiv8 * 2;
final int nDiv2 = nDiv4 * 2;
fhtJavaFloatMainLoop(context, samples, pos, logN - 1, originalLogN);
fhtJavaFloatMainLoop(context, samples, pos + nDiv2, logN - 1, originalLogN);
final boolean allAnglesInCache = logN - 1 <= RootsOfUnity.LOG_CACHE_SIZE + LOG_ANGLE_STEP;
final double rotationAngle = StrictMath.PI / nDiv2;
final double sin0Half = RootsOfUnity.LOGARITHMICAL_SINE_CACHE[logN]; //sin(0.5 * rotationAngle);
final double cos0M1 = -2.0 * sin0Half * sin0Half; // cos(rotationAngle)-1
final double sin0 = RootsOfUnity.LOGARITHMICAL_SINE_CACHE[logN - 1]; //sin(rotationAngle)
double cos = 1.0 + cos0M1;
double sin = sin0;
int j1, j2;
float r1, r2, cas, l1, l2;
for (int j = 1; j < nDiv8; j++) {
// (cos, sin) = (cos(j*2*PI/N), sin(j*2*PI/N))
j1 = ofs + j;
j2 = ofs + nDiv2 - j;
//[[Repeat.SectionStart JavaFloatButterfly]]
r1 = values[nDiv2 + j1];
r2 = values[nDiv2 + j2];
cas = (float) (r1 * cos + r2 * sin);
l1 = values[j1];
values[j1] = l1 + cas;
values[nDiv2 + j1] = l1 - cas;
cas = (float) (r1 * sin - r2 * cos);
l2 = values[j2];
values[j2] = l2 + cas;
values[nDiv2 + j2] = l2 - cas;
//[[Repeat.SectionEnd JavaFloatButterfly]]
j1 = ofs + nDiv4 - j;
j2 = ofs + nDiv4 + j; // below is the same code as above with exchanging sin<->cos
//[[Repeat(INCLUDE_FROM_FILE, THIS_FILE, JavaFloatButterfly)
// cos ==> TTTT;; sin ==> cos;; TTTT ==> sin !! Generated by Repeater: DO NOT EDIT !! ]]
r1 = values[nDiv2 + j1];
r2 = values[nDiv2 + j2];
cas = (float) (r1 * sin + r2 * cos);
l1 = values[j1];
values[j1] = l1 + cas;
values[nDiv2 + j1] = l1 - cas;
cas = (float) (r1 * cos - r2 * sin);
l2 = values[j2];
values[j2] = l2 + cas;
values[nDiv2 + j2] = l2 - cas;
//[[Repeat.IncludeEnd]]
if ((j & (ANGLE_STEP - 1)) == ANGLE_STEP - 1) {
if (allAnglesInCache) {
int angleIndex = (j + 1) >> LOG_ANGLE_STEP
<< (RootsOfUnity.LOG_CACHE_SIZE - (logN - 1) + LOG_ANGLE_STEP);
// (j + 1) * CACHE_SIZE/nDiv2
// cos = RootsOfUnity.quickCos(angleIndex);
// sin = RootsOfUnity.quickSin(angleIndex);
cos = angleIndex < RootsOfUnity.HALF_CACHE_SIZE ?
RootsOfUnity.SINE_CACHE[RootsOfUnity.HALF_CACHE_SIZE - angleIndex] :
-RootsOfUnity.SINE_CACHE[angleIndex - RootsOfUnity.HALF_CACHE_SIZE];
sin = angleIndex < RootsOfUnity.HALF_CACHE_SIZE ?
RootsOfUnity.SINE_CACHE[angleIndex] :
RootsOfUnity.SINE_CACHE[RootsOfUnity.CACHE_SIZE - angleIndex];
} else {
double angle = (j + 1) * rotationAngle;
double sinHalf = Math.sin(0.5 * angle);
cos = 1.0 - 2.0 * sinHalf * sinHalf; // = cos(angle)
sin = Math.sin(angle);
}
} else {
// on modern JVM and CPU, the following 4 multiplications and 4 additions work faster
// then extracting cos and sin from the table
double temp = cos;
cos = cos * cos0M1 - sin * sin0 + cos;
sin = sin * cos0M1 + temp * sin0 + sin;
}
}
j1 = ofs + nDiv8;
j2 = ofs + nDiv2 - nDiv8;
//[[Repeat(INCLUDE_FROM_FILE, THIS_FILE, JavaFloatButterfly)
// (cos|sin) ==> HALF_SQRT2;;
// \s\s\s\s(r1\s|r2\s|cas\s|l1\s|l2\s|values\b|$) ==> $1 !! Generated by Repeater: DO NOT EDIT !! ]]
r1 = values[nDiv2 + j1];
r2 = values[nDiv2 + j2];
cas = (float) (r1 * HALF_SQRT2 + r2 * HALF_SQRT2);
l1 = values[j1];
values[j1] = l1 + cas;
values[nDiv2 + j1] = l1 - cas;
cas = (float) (r1 * HALF_SQRT2 - r2 * HALF_SQRT2);
l2 = values[j2];
values[j2] = l2 + cas;
values[nDiv2 + j2] = l2 - cas;
//[[Repeat.IncludeEnd]]
float temp = values[ofs] - values[ofs + nDiv2];
values[ofs] += values[ofs + nDiv2]; // r[0] + r[n/2]
values[ofs + nDiv2] = temp; // r[0] - r[n/2]
temp = values[ofs + nDiv4] - values[ofs + nDiv2 + nDiv4];
values[ofs + nDiv4] += values[ofs + nDiv2 + nDiv4]; // r[n/4] + r[3*n/4]
values[ofs + nDiv2 + nDiv4] = temp; // r[n/4] - r[3*n/4]
}
if (context != null && (pos + (1 << logN) == samples.length())) {
context.checkInterruptionAndUpdateProgress(null, logN, originalLogN);
}
}
}
//[[Repeat.AutoGeneratedStart !! Auto-generated: NOT EDIT !! ]]
private static void fhtJavaDoubleMainLoop(
ArrayContext context, SampleArray samples,
final int pos, final int logN, final int originalLogN) {
final double[] values = samples instanceof RealScalarSampleArray.DirectZeroOffsetsRealDoubleSampleArray ?
((RealScalarSampleArray.DirectZeroOffsetsRealDoubleSampleArray) samples).samples :
((RealScalarSampleArray.DirectRealDoubleSampleArray) samples).samples;
final int ofs = pos + (samples instanceof RealScalarSampleArray.DirectZeroOffsetsRealDoubleSampleArray ? 0 :
((RealScalarSampleArray.DirectRealDoubleSampleArray) samples).ofs);
switch (logN) { // in comments below xK, rK, yK means the source, the reordered source and the result
case 1: {
double temp = values[ofs] - values[ofs + 1];
values[ofs] += values[ofs + 1]; // y0 = x0 + x1
values[ofs + 1] = temp; // y1 = x0 - x1
break;
}
case 2: {
double d01 = values[ofs] - values[ofs + 1];
double s01 = values[ofs] + values[ofs + 1];
double d23 = values[ofs + 2] - values[ofs + 3];
double s23 = values[ofs + 2] + values[ofs + 3];
values[ofs] = s01 + s23; // y0 = r0 + r1 + r2 + r3 = x0 + x2 + x1 + x3
values[ofs + 1] = d01 + d23; // y1 = r0 - r1 + r2 - r3 = x0 - x2 + x1 - x3
values[ofs + 2] = s01 - s23; // y2 = r0 + r1 - r2 - r3 = x0 + x2 - x1 - x3
values[ofs + 3] = d01 - d23; // y3 = r0 - r1 - r2 + r3 = x0 - x2 - x1 + x3
break;
}
case 3: {
double d01 = values[ofs] - values[ofs + 1];
double s01 = values[ofs] + values[ofs + 1];
double d23 = values[ofs + 2] - values[ofs + 3];
double s23 = values[ofs + 2] + values[ofs + 3];
double ds0123 = s01 - s23;
double ss0123 = s01 + s23;
double dd0123 = d01 - d23;
double sd0123 = d01 + d23;
double s45 = values[ofs + 4] + values[ofs + 5];
double s67 = values[ofs + 6] + values[ofs + 7];
double d45 = values[ofs + 4] - values[ofs + 5];
double d67 = values[ofs + 6] - values[ofs + 7];
double ds4567 = s45 - s67;
double ss4567 = s45 + s67;
values[ofs + 4] = ss0123 - ss4567; // y4 = r0 + r1 + r2 + r3 - r4 - r5 - r6 - r7
values[ofs] = ss0123 + ss4567; // y0 = r0 + ... + r7 = x0 + ... + x7
values[ofs + 6] = ds0123 - ds4567; // y6 = r0 + r1 - r2 - r3 - r4 - r5 + r6 + r7
// = x0 + x4 - x2 - x6 - x1 - x5 + x3 + x7
values[ofs + 2] = ds0123 + ds4567; // y2 = r0 + r1 - r2 - r3 + r4 + r5 - r6 - r7
// = x0 + x4 - x2 - x6 + x1 + x5 - x3 - x7
d45 *= SQRT2;
d67 *= SQRT2;
values[ofs + 5] = sd0123 - d45; // y5 = r0 - r1 + r2 - r3 - sqrt(2)*r4 + sqrt(2)*r5
// = x0 - x4 + x2 - x6 - sqrt(2)*x1 + sqrt(2)*x5
values[ofs + 1] = sd0123 + d45; // y1 = r0 - r1 + r2 - r3 + sqrt(2)*r4 - sqrt(2)*r5
// = x0 - x4 + x2 - x6 + sqrt(2)*x1 - sqrt(2)*x5
values[ofs + 7] = dd0123 - d67; // y7 = r0 - r1 - r2 + r3 - sqrt(2)*r6 + sqrt(2)*r7
// = x0 - x4 - x2 + x6 - sqrt(2)*x3 + sqrt(2)*x7
values[ofs + 3] = dd0123 + d67; // y3 = r0 - r1 - r2 + r3 + sqrt(2)*r6 - sqrt(2)*r7
// = x0 - x4 - x2 + x6 + sqrt(2)*x3 - sqrt(2)*x7
break;
}
default: {
assert logN > 3;
final int nDiv8 = 1 << (logN - 3);
final int nDiv4 = nDiv8 * 2;
final int nDiv2 = nDiv4 * 2;
fhtJavaDoubleMainLoop(context, samples, pos, logN - 1, originalLogN);
fhtJavaDoubleMainLoop(context, samples, pos + nDiv2, logN - 1, originalLogN);
final boolean allAnglesInCache = logN - 1 <= RootsOfUnity.LOG_CACHE_SIZE + LOG_ANGLE_STEP;
final double rotationAngle = StrictMath.PI / nDiv2;
final double sin0Half = RootsOfUnity.LOGARITHMICAL_SINE_CACHE[logN]; //sin(0.5 * rotationAngle);
final double cos0M1 = -2.0 * sin0Half * sin0Half; // cos(rotationAngle)-1
final double sin0 = RootsOfUnity.LOGARITHMICAL_SINE_CACHE[logN - 1]; //sin(rotationAngle)
double cos = 1.0 + cos0M1;
double sin = sin0;
int j1, j2;
double r1, r2, cas, l1, l2;
for (int j = 1; j < nDiv8; j++) {
// (cos, sin) = (cos(j*2*PI/N), sin(j*2*PI/N))
j1 = ofs + j;
j2 = ofs + nDiv2 - j;
r1 = values[nDiv2 + j1];
r2 = values[nDiv2 + j2];
cas = r1 * cos + r2 * sin;
l1 = values[j1];
values[j1] = l1 + cas;
values[nDiv2 + j1] = l1 - cas;
cas = r1 * sin - r2 * cos;
l2 = values[j2];
values[j2] = l2 + cas;
values[nDiv2 + j2] = l2 - cas;
j1 = ofs + nDiv4 - j;
j2 = ofs + nDiv4 + j; // below is the same code as above with exchanging sin<->cos
r1 = values[nDiv2 + j1];
r2 = values[nDiv2 + j2];
cas = r1 * sin + r2 * cos;
l1 = values[j1];
values[j1] = l1 + cas;
values[nDiv2 + j1] = l1 - cas;
cas = r1 * cos - r2 * sin;
l2 = values[j2];
values[j2] = l2 + cas;
values[nDiv2 + j2] = l2 - cas;
if ((j & (ANGLE_STEP - 1)) == ANGLE_STEP - 1) {
if (allAnglesInCache) {
int angleIndex = (j + 1) >> LOG_ANGLE_STEP
<< (RootsOfUnity.LOG_CACHE_SIZE - (logN - 1) + LOG_ANGLE_STEP);
// (j + 1) * CACHE_SIZE/nDiv2
// cos = RootsOfUnity.quickCos(angleIndex);
// sin = RootsOfUnity.quickSin(angleIndex);
cos = angleIndex < RootsOfUnity.HALF_CACHE_SIZE ?
RootsOfUnity.SINE_CACHE[RootsOfUnity.HALF_CACHE_SIZE - angleIndex] :
-RootsOfUnity.SINE_CACHE[angleIndex - RootsOfUnity.HALF_CACHE_SIZE];
sin = angleIndex < RootsOfUnity.HALF_CACHE_SIZE ?
RootsOfUnity.SINE_CACHE[angleIndex] :
RootsOfUnity.SINE_CACHE[RootsOfUnity.CACHE_SIZE - angleIndex];
} else {
double angle = (j + 1) * rotationAngle;
double sinHalf = Math.sin(0.5 * angle);
cos = 1.0 - 2.0 * sinHalf * sinHalf; // = cos(angle)
sin = Math.sin(angle);
}
} else {
// on modern JVM and CPU, the following 4 multiplications and 4 additions work faster
// then extracting cos and sin from the table
double temp = cos;
cos = cos * cos0M1 - sin * sin0 + cos;
sin = sin * cos0M1 + temp * sin0 + sin;
}
}
j1 = ofs + nDiv8;
j2 = ofs + nDiv2 - nDiv8;
r1 = values[nDiv2 + j1];
r2 = values[nDiv2 + j2];
cas = r1 * HALF_SQRT2 + r2 * HALF_SQRT2;
l1 = values[j1];
values[j1] = l1 + cas;
values[nDiv2 + j1] = l1 - cas;
cas = r1 * HALF_SQRT2 - r2 * HALF_SQRT2;
l2 = values[j2];
values[j2] = l2 + cas;
values[nDiv2 + j2] = l2 - cas;
double temp = values[ofs] - values[ofs + nDiv2];
values[ofs] += values[ofs + nDiv2]; // r[0] + r[n/2]
values[ofs + nDiv2] = temp; // r[0] - r[n/2]
temp = values[ofs + nDiv4] - values[ofs + nDiv2 + nDiv4];
values[ofs + nDiv4] += values[ofs + nDiv2 + nDiv4]; // r[n/4] + r[3*n/4]
values[ofs + nDiv2 + nDiv4] = temp; // r[n/4] - r[3*n/4]
}
if (context != null && (pos + (1 << logN) == samples.length())) {
context.checkInterruptionAndUpdateProgress(null, logN, originalLogN);
}
}
}
//[[Repeat.AutoGeneratedEnd]]
//[[Repeat() Float ==> Double;;
// float ==> double;;
// \(double\)\s*\(([^)]+)\) ==> $1 ]]
private static void fhtJavaFloatMultidimensionalMainLoop(
ArrayContext context,
RealVectorSampleArray.DirectRealFloatVectorSampleArray samples,
final int pos, final int logN, final int originalLogN,
RealVectorSampleArray.DirectRealFloatVectorSampleArray work) {
final float[] values = samples.samples;
final int step = (int) samples.vectorStep;
final int ofs = pos * step + samples.ofs;
switch (logN) { // in comments below xK, rK, yK means the source, the reordered source and the result
case 1: {
for (int k = ofs, kMax = k + samples.vectorLen; k < kMax; k++) {
float temp = values[k] - values[k + step];
values[k] += values[k + step]; // y0 = x0 + x1
values[k + step] = temp; // y1 = x0 - x1
}
break;
}
case 2: {
for (int k = ofs, kMax = k + samples.vectorLen; k < kMax; k++) {
float d01 = values[k] - values[k + step];
float s01 = values[k] + values[k + step];
float d23 = values[k + 2 * step] - values[k + 3 * step];
float s23 = values[k + 2 * step] + values[k + 3 * step];
values[k] = s01 + s23; // y0 = r0 + r1 + r2 + r3 = x0 + x2 + x1 + x3
values[k + step] = d01 + d23; // y1 = r0 - r1 + r2 - r3 = x0 - x2 + x1 - x3
values[k + 2 * step] = s01 - s23; // y2 = r0 + r1 - r2 - r3 = x0 + x2 - x1 - x3
values[k + 3 * step] = d01 - d23; // y3 = r0 - r1 - r2 + r3 = x0 - x2 - x1 + x3
}
break;
}
case 3: {
for (int k = ofs, kMax = k + samples.vectorLen; k < kMax; k++) {
float d01 = values[k] - values[k + step];
float s01 = values[k] + values[k + step];
float d23 = values[k + 2 * step] - values[k + 3 * step];
float s23 = values[k + 2 * step] + values[k + 3 * step];
float ds0123 = s01 - s23;
float ss0123 = s01 + s23;
float dd0123 = d01 - d23;
float sd0123 = d01 + d23;
float s45 = values[k + 4 * step] + values[k + 5 * step];
float s67 = values[k + 6 * step] + values[k + 7 * step];
float d45 = values[k + 4 * step] - values[k + 5 * step];
float d67 = values[k + 6 * step] - values[k + 7 * step];
float ds4567 = s45 - s67;
float ss4567 = s45 + s67;
values[k + 4 * step] = ss0123 - ss4567; // y4 = r0 + r1 + r2 + r3 - r4 - r5 - r6 - r7
values[k] = ss0123 + ss4567; // y0 = r0 + ... + r7 = x0 + ... + x7
values[k + 6 * step] = ds0123 - ds4567; // y6 = r0 + r1 - r2 - r3 - r4 - r5 + r6 + r7
// = x0 + x4 - x2 - x6 - x1 - x5 + x3 + x7
values[k + 2 * step] = ds0123 + ds4567; // y2 = r0 + r1 - r2 - r3 + r4 + r5 - r6 - r7
// = x0 + x4 - x2 - x6 + x1 + x5 - x3 - x7
d45 *= SQRT2;
d67 *= SQRT2;
values[k + 5 * step] = sd0123 - d45; // y5 = r0 - r1 + r2 - r3 - sqrt(2)*r4 + sqrt(2)*r5
// = x0 - x4 + x2 - x6 - sqrt(2)*x1 + sqrt(2)*x5
values[k + step] = sd0123 + d45; // y1 = r0 - r1 + r2 - r3 + sqrt(2)*r4 - sqrt(2)*r5
// = x0 - x4 + x2 - x6 + sqrt(2)*x1 - sqrt(2)*x5
values[k + 7 * step] = dd0123 - d67; // y7 = r0 - r1 - r2 + r3 - sqrt(2)*r6 + sqrt(2)*r7
// = x0 - x4 - x2 + x6 - sqrt(2)*x3 + sqrt(2)*x7
values[k + 3 * step] = dd0123 + d67; // y3 = r0 - r1 - r2 + r3 + sqrt(2)*r6 - sqrt(2)*r7
// = x0 - x4 - x2 + x6 + sqrt(2)*x3 - sqrt(2)*x7
}
break;
}
default: {
assert logN > 3;
final int nDiv8 = 1 << (logN - 3);
final int nDiv4 = nDiv8 * 2;
final int nDiv2 = nDiv4 * 2;
final int nDiv8Step = nDiv8 * step;
final int nDiv4Step = nDiv8Step * 2;
final int nDiv2Step = nDiv4Step * 2;
fhtJavaFloatMultidimensionalMainLoop(context, samples, pos, logN - 1, originalLogN, work);
fhtJavaFloatMultidimensionalMainLoop(context, samples, pos + nDiv2, logN - 1, originalLogN, work);
final boolean allAnglesInCache = logN - 1 <= RootsOfUnity.LOG_CACHE_SIZE + LOG_ANGLE_STEP;
final double rotationAngle = StrictMath.PI / nDiv2;
final double sin0Half = RootsOfUnity.LOGARITHMICAL_SINE_CACHE[logN]; //sin(0.5 * rotationAngle);
final double cos0M1 = -2.0 * sin0Half * sin0Half; // cos(rotationAngle)-1
final double sin0 = RootsOfUnity.LOGARITHMICAL_SINE_CACHE[logN - 1]; //sin(rotationAngle)
double cos = 1.0 + cos0M1;
double sin = sin0;
int j1, j2, j1Step, j2Step;
float r1, r2, cas, l1, l2;
for (int j = 1, jStep = step; j < nDiv8; j++, jStep += step) {
// (cos, sin) = (cos(j*2*PI/N), sin(j*2*PI/N))
for (int k = ofs, kMax = k + samples.vectorLen; k < kMax; k++) {
int j1Step1 = k + jStep;
int j2Step1 = k + nDiv2Step - jStep;
//[[Repeat.SectionStart JavaFloatMultidimensionalButterfly]]
r1 = values[nDiv2Step + j1Step1];
r2 = values[nDiv2Step + j2Step1];
cas = (float) (r1 * cos + r2 * sin);
l1 = values[j1Step1];
values[j1Step1] = l1 + cas;
values[nDiv2Step + j1Step1] = l1 - cas;
cas = (float) (r1 * sin - r2 * cos);
l2 = values[j2Step1];
values[j2Step1] = l2 + cas;
values[nDiv2Step + j2Step1] = l2 - cas;
//[[Repeat.SectionEnd JavaFloatMultidimensionalButterfly]]
j1Step1 = k + nDiv4Step - jStep;
j2Step1 = k + nDiv4Step + jStep;
//[[Repeat(INCLUDE_FROM_FILE, THIS_FILE, JavaFloatMultidimensionalButterfly)
// cos ==> TTTT;; sin ==> cos;; TTTT ==> sin !! Generated by Repeater: DO NOT EDIT !! ]]
r1 = values[nDiv2Step + j1Step1];
r2 = values[nDiv2Step + j2Step1];
cas = (float) (r1 * sin + r2 * cos);
l1 = values[j1Step1];
values[j1Step1] = l1 + cas;
values[nDiv2Step + j1Step1] = l1 - cas;
cas = (float) (r1 * cos - r2 * sin);
l2 = values[j2Step1];
values[j2Step1] = l2 + cas;
values[nDiv2Step + j2Step1] = l2 - cas;
//[[Repeat.IncludeEnd]]
}
if ((j & (ANGLE_STEP - 1)) == ANGLE_STEP - 1) {
if (allAnglesInCache) {
int angleIndex = (j + 1) >> LOG_ANGLE_STEP
<< (RootsOfUnity.LOG_CACHE_SIZE - (logN - 1) + LOG_ANGLE_STEP);
// (j + 1) * CACHE_SIZE/nDiv2
// cos = RootsOfUnity.quickCos(angleIndex);
// sin = RootsOfUnity.quickSin(angleIndex);
cos = angleIndex < RootsOfUnity.HALF_CACHE_SIZE ?
RootsOfUnity.SINE_CACHE[RootsOfUnity.HALF_CACHE_SIZE - angleIndex] :
-RootsOfUnity.SINE_CACHE[angleIndex - RootsOfUnity.HALF_CACHE_SIZE];
sin = angleIndex < RootsOfUnity.HALF_CACHE_SIZE ?
RootsOfUnity.SINE_CACHE[angleIndex] :
RootsOfUnity.SINE_CACHE[RootsOfUnity.CACHE_SIZE - angleIndex];
} else {
double angle = (j + 1) * rotationAngle;
double sinHalf = Math.sin(0.5 * angle);
cos = 1.0 - 2.0 * sinHalf * sinHalf; // = cos(angle)
sin = Math.sin(angle);
}
} else {
// on modern JVM and CPU, the following 4 multiplications and 4 additions work faster
// then extracting cos and sin from the table
double temp = cos;
cos = cos * cos0M1 - sin * sin0 + cos;
sin = sin * cos0M1 + temp * sin0 + sin;
}
}
for (int k = ofs, kMax = k + samples.vectorLen; k < kMax; k++) {
j1Step = k + nDiv8Step;
j2Step = k + nDiv2Step - nDiv8Step;
//[[Repeat(INCLUDE_FROM_FILE, THIS_FILE, JavaFloatMultidimensionalButterfly)
// (cos|sin) ==> HALF_SQRT2;;
// \s\s\s\s(r1\s|r2\s|cas\s|l1\s|l2\s|values\b|$) ==> $1 !! Generated by Repeater: DO NOT EDIT !! ]]
r1 = values[nDiv2Step + j1Step];
r2 = values[nDiv2Step + j2Step];
cas = (float) (r1 * HALF_SQRT2 + r2 * HALF_SQRT2);
l1 = values[j1Step];
values[j1Step] = l1 + cas;
values[nDiv2Step + j1Step] = l1 - cas;
cas = (float) (r1 * HALF_SQRT2 - r2 * HALF_SQRT2);
l2 = values[j2Step];
values[j2Step] = l2 + cas;
values[nDiv2Step + j2Step] = l2 - cas;
//[[Repeat.IncludeEnd]]
float temp = values[k] - values[k + nDiv2Step];
values[k] += values[k + nDiv2Step]; // r[0] + r[n/2]
values[k + nDiv2Step] = temp; // r[0] - r[n/2]
temp = values[k + nDiv4Step] - values[k + nDiv2Step + nDiv4Step];
values[k + nDiv4Step] += values[k + nDiv2Step + nDiv4Step]; // r[n/4] + r[3*n/4]
values[k + nDiv2Step + nDiv4Step] = temp; // r[n/4] - r[3*n/4]
}
}
if (context != null && (pos + (1 << logN) == samples.length())) {
context.checkInterruptionAndUpdateProgress(null, logN, originalLogN);
}
}
}
//[[Repeat.AutoGeneratedStart !! Auto-generated: NOT EDIT !! ]]
private static void fhtJavaDoubleMultidimensionalMainLoop(
ArrayContext context,
RealVectorSampleArray.DirectRealDoubleVectorSampleArray samples,
final int pos, final int logN, final int originalLogN,
RealVectorSampleArray.DirectRealDoubleVectorSampleArray work) {
final double[] values = samples.samples;
final int step = (int) samples.vectorStep;
final int ofs = pos * step + samples.ofs;
switch (logN) { // in comments below xK, rK, yK means the source, the reordered source and the result
case 1: {
for (int k = ofs, kMax = k + samples.vectorLen; k < kMax; k++) {
double temp = values[k] - values[k + step];
values[k] += values[k + step]; // y0 = x0 + x1
values[k + step] = temp; // y1 = x0 - x1
}
break;
}
case 2: {
for (int k = ofs, kMax = k + samples.vectorLen; k < kMax; k++) {
double d01 = values[k] - values[k + step];
double s01 = values[k] + values[k + step];
double d23 = values[k + 2 * step] - values[k + 3 * step];
double s23 = values[k + 2 * step] + values[k + 3 * step];
values[k] = s01 + s23; // y0 = r0 + r1 + r2 + r3 = x0 + x2 + x1 + x3
values[k + step] = d01 + d23; // y1 = r0 - r1 + r2 - r3 = x0 - x2 + x1 - x3
values[k + 2 * step] = s01 - s23; // y2 = r0 + r1 - r2 - r3 = x0 + x2 - x1 - x3
values[k + 3 * step] = d01 - d23; // y3 = r0 - r1 - r2 + r3 = x0 - x2 - x1 + x3
}
break;
}
case 3: {
for (int k = ofs, kMax = k + samples.vectorLen; k < kMax; k++) {
double d01 = values[k] - values[k + step];
double s01 = values[k] + values[k + step];
double d23 = values[k + 2 * step] - values[k + 3 * step];
double s23 = values[k + 2 * step] + values[k + 3 * step];
double ds0123 = s01 - s23;
double ss0123 = s01 + s23;
double dd0123 = d01 - d23;
double sd0123 = d01 + d23;
double s45 = values[k + 4 * step] + values[k + 5 * step];
double s67 = values[k + 6 * step] + values[k + 7 * step];
double d45 = values[k + 4 * step] - values[k + 5 * step];
double d67 = values[k + 6 * step] - values[k + 7 * step];
double ds4567 = s45 - s67;
double ss4567 = s45 + s67;
values[k + 4 * step] = ss0123 - ss4567; // y4 = r0 + r1 + r2 + r3 - r4 - r5 - r6 - r7
values[k] = ss0123 + ss4567; // y0 = r0 + ... + r7 = x0 + ... + x7
values[k + 6 * step] = ds0123 - ds4567; // y6 = r0 + r1 - r2 - r3 - r4 - r5 + r6 + r7
// = x0 + x4 - x2 - x6 - x1 - x5 + x3 + x7
values[k + 2 * step] = ds0123 + ds4567; // y2 = r0 + r1 - r2 - r3 + r4 + r5 - r6 - r7
// = x0 + x4 - x2 - x6 + x1 + x5 - x3 - x7
d45 *= SQRT2;
d67 *= SQRT2;
values[k + 5 * step] = sd0123 - d45; // y5 = r0 - r1 + r2 - r3 - sqrt(2)*r4 + sqrt(2)*r5
// = x0 - x4 + x2 - x6 - sqrt(2)*x1 + sqrt(2)*x5
values[k + step] = sd0123 + d45; // y1 = r0 - r1 + r2 - r3 + sqrt(2)*r4 - sqrt(2)*r5
// = x0 - x4 + x2 - x6 + sqrt(2)*x1 - sqrt(2)*x5
values[k + 7 * step] = dd0123 - d67; // y7 = r0 - r1 - r2 + r3 - sqrt(2)*r6 + sqrt(2)*r7
// = x0 - x4 - x2 + x6 - sqrt(2)*x3 + sqrt(2)*x7
values[k + 3 * step] = dd0123 + d67; // y3 = r0 - r1 - r2 + r3 + sqrt(2)*r6 - sqrt(2)*r7
// = x0 - x4 - x2 + x6 + sqrt(2)*x3 - sqrt(2)*x7
}
break;
}
default: {
assert logN > 3;
final int nDiv8 = 1 << (logN - 3);
final int nDiv4 = nDiv8 * 2;
final int nDiv2 = nDiv4 * 2;
final int nDiv8Step = nDiv8 * step;
final int nDiv4Step = nDiv8Step * 2;
final int nDiv2Step = nDiv4Step * 2;
fhtJavaDoubleMultidimensionalMainLoop(context, samples, pos, logN - 1, originalLogN, work);
fhtJavaDoubleMultidimensionalMainLoop(context, samples, pos + nDiv2, logN - 1, originalLogN, work);
final boolean allAnglesInCache = logN - 1 <= RootsOfUnity.LOG_CACHE_SIZE + LOG_ANGLE_STEP;
final double rotationAngle = StrictMath.PI / nDiv2;
final double sin0Half = RootsOfUnity.LOGARITHMICAL_SINE_CACHE[logN]; //sin(0.5 * rotationAngle);
final double cos0M1 = -2.0 * sin0Half * sin0Half; // cos(rotationAngle)-1
final double sin0 = RootsOfUnity.LOGARITHMICAL_SINE_CACHE[logN - 1]; //sin(rotationAngle)
double cos = 1.0 + cos0M1;
double sin = sin0;
int j1, j2, j1Step, j2Step;
double r1, r2, cas, l1, l2;
for (int j = 1, jStep = step; j < nDiv8; j++, jStep += step) {
// (cos, sin) = (cos(j*2*PI/N), sin(j*2*PI/N))
for (int k = ofs, kMax = k + samples.vectorLen; k < kMax; k++) {
int j1Step1 = k + jStep;
int j2Step1 = k + nDiv2Step - jStep;
r1 = values[nDiv2Step + j1Step1];
r2 = values[nDiv2Step + j2Step1];
cas = r1 * cos + r2 * sin;
l1 = values[j1Step1];
values[j1Step1] = l1 + cas;
values[nDiv2Step + j1Step1] = l1 - cas;
cas = r1 * sin - r2 * cos;
l2 = values[j2Step1];
values[j2Step1] = l2 + cas;
values[nDiv2Step + j2Step1] = l2 - cas;
j1Step1 = k + nDiv4Step - jStep;
j2Step1 = k + nDiv4Step + jStep;
r1 = values[nDiv2Step + j1Step1];
r2 = values[nDiv2Step + j2Step1];
cas = r1 * sin + r2 * cos;
l1 = values[j1Step1];
values[j1Step1] = l1 + cas;
values[nDiv2Step + j1Step1] = l1 - cas;
cas = r1 * cos - r2 * sin;
l2 = values[j2Step1];
values[j2Step1] = l2 + cas;
values[nDiv2Step + j2Step1] = l2 - cas;
}
if ((j & (ANGLE_STEP - 1)) == ANGLE_STEP - 1) {
if (allAnglesInCache) {
int angleIndex = (j + 1) >> LOG_ANGLE_STEP
<< (RootsOfUnity.LOG_CACHE_SIZE - (logN - 1) + LOG_ANGLE_STEP);
// (j + 1) * CACHE_SIZE/nDiv2
// cos = RootsOfUnity.quickCos(angleIndex);
// sin = RootsOfUnity.quickSin(angleIndex);
cos = angleIndex < RootsOfUnity.HALF_CACHE_SIZE ?
RootsOfUnity.SINE_CACHE[RootsOfUnity.HALF_CACHE_SIZE - angleIndex] :
-RootsOfUnity.SINE_CACHE[angleIndex - RootsOfUnity.HALF_CACHE_SIZE];
sin = angleIndex < RootsOfUnity.HALF_CACHE_SIZE ?
RootsOfUnity.SINE_CACHE[angleIndex] :
RootsOfUnity.SINE_CACHE[RootsOfUnity.CACHE_SIZE - angleIndex];
} else {
double angle = (j + 1) * rotationAngle;
double sinHalf = Math.sin(0.5 * angle);
cos = 1.0 - 2.0 * sinHalf * sinHalf; // = cos(angle)
sin = Math.sin(angle);
}
} else {
// on modern JVM and CPU, the following 4 multiplications and 4 additions work faster
// then extracting cos and sin from the table
double temp = cos;
cos = cos * cos0M1 - sin * sin0 + cos;
sin = sin * cos0M1 + temp * sin0 + sin;
}
}
for (int k = ofs, kMax = k + samples.vectorLen; k < kMax; k++) {
j1Step = k + nDiv8Step;
j2Step = k + nDiv2Step - nDiv8Step;
r1 = values[nDiv2Step + j1Step];
r2 = values[nDiv2Step + j2Step];
cas = r1 * HALF_SQRT2 + r2 * HALF_SQRT2;
l1 = values[j1Step];
values[j1Step] = l1 + cas;
values[nDiv2Step + j1Step] = l1 - cas;
cas = r1 * HALF_SQRT2 - r2 * HALF_SQRT2;
l2 = values[j2Step];
values[j2Step] = l2 + cas;
values[nDiv2Step + j2Step] = l2 - cas;
double temp = values[k] - values[k + nDiv2Step];
values[k] += values[k + nDiv2Step]; // r[0] + r[n/2]
values[k + nDiv2Step] = temp; // r[0] - r[n/2]
temp = values[k + nDiv4Step] - values[k + nDiv2Step + nDiv4Step];
values[k + nDiv4Step] += values[k + nDiv2Step + nDiv4Step]; // r[n/4] + r[3*n/4]
values[k + nDiv2Step + nDiv4Step] = temp; // r[n/4] - r[3*n/4]
}
}
if (context != null && (pos + (1 << logN) == samples.length())) {
context.checkInterruptionAndUpdateProgress(null, logN, originalLogN);
}
}
}
//[[Repeat.AutoGeneratedEnd]]
}