/*
 * LingPipe v. 4.1.0
 * Copyright (C) 2003-2011 Alias-i
 *
 * This program is licensed under the Alias-i Royalty Free License
 * Version 1 WITHOUT ANY WARRANTY, without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the Alias-i
 * Royalty Free License Version 1 for more details.
 *
 * You should have received a copy of the Alias-i Royalty Free License
 * Version 1 along with this program; if not, visit
 * http://alias-i.com/lingpipe/licenses/lingpipe-license-1.txt or contact
 * Alias-i, Inc. at 181 North 11th Street, Suite 401, Brooklyn, NY 11211,
 * +1 (718) 290-9170.
 */
package com.aliasi.stats;

import com.aliasi.matrix.Vector;

import com.aliasi.util.AbstractExternalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectOutput;
import java.io.Serializable;

/**
 * A RegressionPrior instance represents a prior
 * distribution on parameters for linear or logistic regression.
 * It has methods to return the log probabilities of input
 * parameters and compute the gradient of the log probability
 * for estimation.
 *
 * <p>Instances of this class are used as parameters in the {@link
 * LogisticRegression} class to control the regularization, or lack
 * thereof, used by the stochastic gradient descent optimizers.  The
 * priors typically assume a zero mode (maximal value) for each
 * dimension, but allow variances (or scales) to vary by input
 * dimension.  The method {@link #shiftMeans(double[],RegressionPrior)}
 * may be used to shift the means (and hence modes) of priors.
 *
 * <p>The behavior of a prior under stochastic gradient fitting is
 * determined by its gradient: the partial derivative of the prior's
 * error function (its negative log likelihood) with respect to the
 * coefficient βi,
 *
 * <blockquote><pre>
 * gradient(β,i) = - ∂ log p(β) / ∂ βi</pre></blockquote>
 *
 * See the class documentation for {@link LogisticRegression}
 * for more information.
 *
 * <p>Priors also implement a log (base 2) probability density for
 * a given parameter value in a given dimension.  The total log
 * prior probability is defined as the sum of the log probabilities
 * for the dimensions,
 *
 * <blockquote><pre>
 * log p(β) = Σi log p(βi)</pre></blockquote>
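 *
 * <p>For example (an illustrative sketch; it assumes the {@code
 * com.aliasi.matrix.DenseVector} implementation of {@link Vector}),
 * the total log (base 2) density of a coefficient vector under a
 * two-dimensional Gaussian prior may be computed as
 *
 * <blockquote><pre>
 * RegressionPrior prior
 *     = RegressionPrior.gaussian(new double[] { 2.0, 2.0 });
 * Vector beta = new DenseVector(new double[] { 0.5, -1.0 });
 * double log2p = prior.log2Prior(beta); // sum over both dimensions</pre></blockquote>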

 *
 * <p>Priors affect gradient descent fitting of regression through
 * their contribution to the gradient of the error function with
 * respect to the parameter vector.  The contribution of the prior to
 * the error function is the negative log probability of the parameter
 * vector(s) under the prior distribution.  The gradient of the error
 * function is the collection of partial derivatives of the error
 * function with respect to the components of the parameter vector.
 * The regression prior abstract base class is defined in terms of a
 * single method, {@link #gradient(double,int)}, which returns the
 * value of the gradient of the error function for a specified
 * dimension with a specified value in that dimension.
 *
 * <p>This class implements static factory methods to construct
 * noninformative, Gaussian and Laplace priors.  The Gaussian and
 * Laplace priors may specify a different variance for each dimension,
 * but assume all the prior means (which are equivalent to the modes)
 * are zero.  The priors also assume the dimensions are independent,
 * so the full covariance matrix is diagonal (that is, there is zero
 * covariance between different dimensions).

 * <h3>Noninformative Prior &amp; Maximum Likelihood Estimation</h3>
 *
 * <p>Using a noninformative prior for regression results in standard
 * maximum likelihood estimation.
 *
 * <p>The noninformative prior assumes an improper uniform
 * distribution over parameter vectors,
 *
 * <blockquote><pre>
 * p(βi) = Uniform(βi) = constant</pre></blockquote>
 *
 * and thus the log probability is constant,
 *
 * <blockquote><pre>
 * log p(βi) = log constant</pre></blockquote>
 *
 * and therefore contributes nothing to the gradient:
 *
 * <blockquote><pre>
 * gradient(β,i) = 0.0</pre></blockquote>
 *
 * A noninformative prior is constructed using the static method
 * {@link #noninformative()}.
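 *
 * <p>For instance (an illustrative sketch), the noninformative prior
 * reports itself as uniform, and its gradient and log density are
 * zero everywhere:
 *
 * <blockquote><pre>
 * RegressionPrior prior = RegressionPrior.noninformative();
 * prior.isUniform();       // == true
 * prior.gradient(3.0, 1);  // == 0.0
 * prior.log2Prior(3.0, 1); // == 0.0, a constant</pre></blockquote>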

 * <h3>Gaussian Prior, L2 Regularization &amp; Ridge Regression</h3>
 *
 * <p>The Gaussian prior assumes a Gaussian (also known as normal)
 * density over parameter vectors, which results in L2-regularized
 * regression, also known as ridge regression.  Specifically, the
 * prior allows a variance to be specified per dimension, but assumes
 * the dimensions are independent in that all off-diagonal covariances
 * are zero.  The Gaussian prior has a single mode, which is the same
 * as its mean.
 *
 * <p>The Gaussian density with variance σi2 is defined by
 *
 * <blockquote><pre>
 * p(βi) = 1/sqrt(2 * π * σi2) * exp(-βi2/(2 * σi2))</pre></blockquote>
 *
 * which on a log scale is
 *
 * <blockquote><pre>
 * log p(βi) = log (1/sqrt(2 * π * σi2)) - βi2/(2 * σi2)</pre></blockquote>
 *
 * <p>The Gaussian prior leads to the following contribution to the
 * gradient for a dimension i with parameter βi and
 * variance σi2:
 *
 * <blockquote><pre>
 * gradient(β,i) = βi / σi2</pre></blockquote>
 *
 * As usual, the lower the variance, the steeper the gradient, and the
 * stronger the effect on the maximum a posteriori estimate.
 *
 * <p>Gaussian priors are constructed using one of the static factory
 * methods, {@link #gaussian(double[])} or
 * {@link #gaussian(double,boolean)}.
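 *
 * <p>For example (values are illustrative), a ridge-style prior with
 * variance 2.0 in every dimension and a noninformative prior on the
 * intercept (dimension 0) behaves as follows:
 *
 * <blockquote><pre>
 * RegressionPrior prior = RegressionPrior.gaussian(2.0, true);
 * prior.gradient(1.0, 3); // == 1.0/2.0, that is, βi/σi2
 * prior.gradient(1.0, 0); // == 0.0, intercept unregularized</pre></blockquote>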

 * <h3>Laplace Prior, L1 Regularization &amp; the Lasso</h3>
 *
 * <p>The Laplace prior assumes a Laplace density over parameter
 * vectors, which results in L1-regularized regression, also known
 * as the lasso.  The Laplace density is called a double-exponential
 * distribution because it looks like an exponential distribution for
 * positive values joined with the reflection of that exponential
 * distribution around zero (or, more generally, around its mean
 * parameter).  The Laplace prior has its mode in the same location
 * as its mean.
 *
 * <p>A Laplace prior allows a variance to be specified per dimension,
 * but like the Gaussian prior, assumes means are zero and that the
 * dimensions are independent in that all off-diagonal covariances
 * are zero.
 *
 * <p>The Laplace density is defined by
 *
 * <blockquote><pre>
 * p(βi) = (sqrt(2)/(2 * σi)) * exp(- sqrt(2) * abs(βi) / σi)</pre></blockquote>
 *
 * which on the log scale is
 *
 * <blockquote><pre>
 * log p(βi) = log (sqrt(2)/(2 * σi)) - sqrt(2) * abs(βi) / σi</pre></blockquote>
 *
 * <p>The Laplace prior leads to the following contribution to the
 * gradient for a dimension i with parameter βi,
 * mean zero and variance σi2:
 *
 * <blockquote><pre>
 * gradient(β,i) = sqrt(2) * signum(βi) / σi</pre></blockquote>
 *
 * where the derivative of the absolute value function is the signum
 * function, as defined by {@link Math#signum(double)}:
 *
 * <blockquote><pre>
 * signum(x) = x &gt; 0 ? 1 : (x &lt; 0 ? -1 : 0)</pre></blockquote>
 *
 * <p>Laplace priors are constructed using one of the static factory
 * methods, {@link #laplace(double[])} or
 * {@link #laplace(double,boolean)}.
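 *
 * <p>For example (an illustrative sketch), with per-dimension
 * variances of 2.0, the gradient depends only on the sign of the
 * coefficient:
 *
 * <blockquote><pre>
 * RegressionPrior prior
 *     = RegressionPrior.laplace(new double[] { 2.0, 2.0 });
 * prior.gradient(0.5, 1);  // == sqrt(2.0/2.0) == 1.0
 * prior.gradient(-3.0, 1); // == -1.0, same magnitude
 * prior.gradient(0.0, 1);  // == 0.0 at the mode</pre></blockquote>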

 * <h3>Cauchy Prior</h3>
 *
 * <p>The Cauchy prior assumes a Cauchy density (also known as a
 * Lorentz density) over parameter vectors.  The Cauchy density
 * allows a scale to be specified for each dimension.  Its mean and
 * variance are undefined, as their integrals diverge.  The Cauchy
 * distribution is symmetric, and for regression priors we assume a
 * mode of zero for the base distribution; the density is unimodal,
 * with its single mode at that location.
 *
 * <p>The Cauchy density with a scale of 1 is a Student-t density
 * with one degree of freedom.
 *
 * <p>The Cauchy density is defined by
 *
 * <blockquote><pre>
 * p(βi) = (1 / π) * (λi / (βi2 + λi2))</pre></blockquote>
 *
 * which on a log scale is
 *
 * <blockquote><pre>
 * log p(βi) = log (1 / π) + log λi - log (βi2 + λi2)</pre></blockquote>
 *
 * <p>The Cauchy prior leads to the following contribution to the
 * gradient for dimension i with parameter βi and
 * scale λi:
 *
 * <blockquote><pre>
 * gradient(β,i) = 2 * βi / (βi2 + λi2)</pre></blockquote>
 *
 * <p>Cauchy priors are constructed using one of the static factory
 * methods, {@link #cauchy(double[])} or {@link #cauchy(double,boolean)}.
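 *
 * <p>Note that the factory methods take squared scales
 * λi2, not scales.  For instance (values are
 * illustrative),
 *
 * <blockquote><pre>
 * RegressionPrior prior = RegressionPrior.cauchy(4.0, true); // λ2 = 4
 * prior.gradient(1.0, 2); // == 2*1.0/(1.0 + 4.0) == 0.4</pre></blockquote>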

 * <h3>Log Interpolated Priors</h3>
 *
 * <p>For use in gradient-based algorithms, the gradients of two
 * different priors may be interpolated.  A special case is the
 * elastic net, discussed in the next section.  Given two priors
 * p1 and p2, and an interpolation ratio α between
 * 0 and 1, the interpolated prior is defined by
 *
 * <blockquote><pre>
 * log p(βi) = α * log p1(βi) + (1 - α) * log p2(βi) - Z</pre></blockquote>
 *
 * where Z is a normalization constant, not depending on β,
 * that normalizes the density,
 *
 * <blockquote><pre>
 * p(βi) = exp(log p(βi))
 *
 *        = exp(α * log p1(βi)) * exp((1 - α) * log p2(βi)) / exp(Z)
 *
 *        = p1(βi)α * p2(βi)(1 - α) / exp(Z)</pre></blockquote>
 *
 * <p>The gradient, being a derivative, is the weighted sum of the
 * underlying gradients gradient1 and gradient2,
 *
 * <blockquote><pre>
 * gradient(β,i) = α * gradient1(β,i) + (1 - α) * gradient2(β,i)</pre></blockquote>
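 *
 * <p>For example (an illustrative sketch), a prior giving equal
 * weight to Laplace and Gaussian behavior may be constructed as
 *
 * <blockquote><pre>
 * RegressionPrior prior
 *     = RegressionPrior.logInterpolated(0.5,
 *                                       RegressionPrior.laplace(1.0, true),
 *                                       RegressionPrior.gaussian(1.0, true));
 * // gradient(β,i) = 0.5 * laplaceGradient + 0.5 * gaussianGradient</pre></blockquote>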

 * <h3>Elastic Net Prior</h3>
 *
 * <p>The elastic net prior interpolates between a Laplace prior and
 * a Gaussian prior on the log scale, uniformly for all dimensions.
 * There are two parameters: a scale parameter for the prior
 * variances, and an interpolation parameter that determines the
 * weight given to the Laplace prior versus the Gaussian prior.  The
 * elastic net prior with Laplace weight α and scale λ
 * is defined by
 *
 * <blockquote><pre>
 * log p(βi) = α * log Laplace(βi|1/sqrt(λ))
 *             + (1 - α) * log Gaussian(βi|sqrt(2)/λ)</pre></blockquote>
 *
 * where Laplace(βi|1/sqrt(λ)) is the density of the
 * (zero-mean) Laplace distribution with variance 1/sqrt(λ), and
 * Gaussian(βi|sqrt(2)/λ) is the (zero-mean) Gaussian
 * density function with variance sqrt(2)/λ.
 *
 * <p>Thus the gradient is an interpolation of the gradients of the
 * Laplace prior with variance σ2 = 1/sqrt(λ) and the
 * Gaussian prior with variance σ2 = sqrt(2)/λ, leading
 * to a simple gradient form,
 *
 * <blockquote><pre>
 * gradient(β,i) = α * λ * signum(βi) + (1 - α) * λ * βi</pre></blockquote>
 *
 * <p>The basic elastic net prior has zero means and modes in all
 * dimensions, but may be shifted like other priors.
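 *
 * <p>For instance (an illustrative sketch), an elastic net prior
 * with Laplace weight 0.9 and scale 2.0, leaving the intercept
 * unregularized, is constructed as
 *
 * <blockquote><pre>
 * RegressionPrior prior = RegressionPrior.elasticNet(0.9, 2.0, true);
 * // == logInterpolated(0.9,
 * //                    laplace(1/sqrt(2.0), true),
 * //                    gaussian(sqrt(2)/2.0, true))</pre></blockquote>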

 * <h3>Non-Zero Means and Modes</h3>
 *
 * <p>Priors with non-zero means or modes typically arise in
 * hierarchical or multilevel regression models, or in models for
 * which informative priors are available on a
 * dimension-by-dimension basis.
 *
 * <p>Through the method {@link #shiftMeans(double[],RegressionPrior)}
 * it is possible to shift the means of a prior by specified amounts,
 * which allows any prior to be used with non-zero means.  Suppose
 * p2 is the density and gradient2 the gradient of the
 * specified prior, and shifts is the specified array of
 * doubles specifying the mean shifts.  Probabilities and gradients
 * are then computed by shifting back,
 *
 * <blockquote><pre>
 * p(β) = p2(β - shifts)</pre></blockquote>
 *
 * and
 *
 * <blockquote><pre>
 * gradient(β,i) = gradient2(β - shifts,i)</pre></blockquote>
 *
 * Dimension by dimension, the value is computed by subtracting the
 * shift from the value and plugging the result into the underlying
 * prior.
 *
 * <p>For example, to specify a Gaussian prior with means
 * mus and variances vars, use
 *
 * <blockquote><pre>
 * double[] mus = ...
 * double[] vars = ...
 * RegressionPrior prior = shiftMeans(mus, gaussian(vars));</pre></blockquote>

 * <h3>Special Treatment of Intercept</h3>
 *
 * <p>By convention, input dimension zero (0) may be reserved for the
 * intercept and set to the value 1.0 in all input vectors.  For
 * regularized regression, the regularization is typically not
 * applied to the intercept term.  To match this convention, the
 * factory methods allow a boolean parameter indicating whether the
 * intercept parameter has a noninformative (uniform) prior.  If the
 * intercept flag indicates it is noninformative, then dimension 0
 * has an infinite prior variance or scale, and hence a zero
 * gradient.  The result is that the intercept is fit by maximum
 * likelihood.

 * <h3>Serialization</h3>
 *
 * <p>All of the regression priors may be serialized.
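 *
 * <p>For instance (an illustrative sketch using standard Java
 * serialization over an in-memory byte array),
 *
 * <blockquote><pre>
 * RegressionPrior prior = RegressionPrior.laplace(1.0, true);
 * ByteArrayOutputStream bytesOut = new ByteArrayOutputStream();
 * new ObjectOutputStream(bytesOut).writeObject(prior);
 * ObjectInputStream objIn
 *     = new ObjectInputStream(new ByteArrayInputStream(bytesOut.toByteArray()));
 * RegressionPrior deserialized = (RegressionPrior) objIn.readObject();</pre></blockquote>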

 *
 * @author  Bob Carpenter
 * @version 3.9.2
 * @since   LingPipe3.5
 */
public abstract class RegressionPrior implements Serializable {

    static final long serialVersionUID = 2955531646832969891L;

    // do not allow instances or subclasses
    private RegressionPrior() {
        /* empty constructor */
    }

    /**
     * Returns {@code true} if this prior is the uniform distribution.
     * Uniform priors reduce to maximum likelihood calculations.
     *
     * @return {@code true} if this prior is the uniform distribution.
     */
    public boolean isUniform() {
        return false;
    }

    /**
     * Returns the mode of the prior.  The mode is used to clip
     * gradient steps of the prior so they do not pass through the
     * mode.
     *
     * @param dimension Dimension position in vector.
     * @return The mode of the prior for the specified dimension.
     */
    public double mode(int dimension) {
        return 0.0;
    }

    /**
     * Returns the contribution to the gradient of the error function
     * of the specified parameter value for the specified dimension.
     *
     * @param betaForDimension Parameter value for the specified dimension.
     * @param dimension The dimension.
     * @return The contribution to the gradient of the error function
     * of the parameter value and dimension.
     */
    public abstract double gradient(double betaForDimension, int dimension);

    /**
     * Returns the log (base 2) of the prior density evaluated at the
     * specified coefficient value for the specified dimension (up to
     * an additive constant).  The overall error function is the sum of
     * the negative log likelihood of the data under the model and the
     * negative log of the prior.
     *
     * @param betaForDimension Parameter value for the specified dimension.
     * @param dimension The dimension.
     * @return The prior probability of the specified parameter value
     * for the specified dimension.
     */
    public abstract double log2Prior(double betaForDimension, int dimension);

    /**
     * Returns the log (base 2) prior density for a specified
     * coefficient vector (up to an additive constant).
     *
     * @param beta Parameter vector.
     * @return The log (base 2) prior for the specified parameter
     * vector.
     * @throws IllegalArgumentException If the specified parameter
     * vector does not match the dimensionality of the prior (if
     * specified).
     */
    public double log2Prior(Vector beta) {
        int numDimensions = beta.numDimensions();
        verifyNumberOfDimensions(numDimensions);
        double log2Prior = 0.0;
        for (int i = 0; i < numDimensions; ++i)
            log2Prior += log2Prior(beta.value(i),i);
        return log2Prior;
    }

    /**
     * Returns the log (base 2) prior density for the specified
     * array of coefficient vectors (up to an additive constant).
     *
     * @param betas The parameter vectors.
     * @return The log (base 2) prior density for the specified
     * parameter vectors.
     * @throws IllegalArgumentException If any of the specified
     * parameter vectors does not match the dimensionality of the
     * prior (if specified).
     */
    public double log2Prior(Vector[] betas) {
        double log2Prior = 0.0;
        for (Vector beta : betas)
            log2Prior += log2Prior(beta);
        return log2Prior;
    }

    // package local on purpose
    void verifyNumberOfDimensions(int ignoreMeNumDimensions) {
        // do nothing on purpose
    }

    static final double SQRT_2 = Math.sqrt(2.0);

    private final static RegressionPrior NONINFORMATIVE_PRIOR
        = new NoninformativeRegressionPrior();

    /**
     * Returns the noninformative or uniform prior to use for
     * maximum likelihood regression fitting.
     *
     * @return The noninformative prior.
     */
    public static RegressionPrior noninformative() {
        return NONINFORMATIVE_PRIOR;
    }

    /**
     * Returns the Gaussian prior with the specified prior variance
     * and indication of whether the intercept is given a
     * noninformative prior.
     *
     * <p>If the noninformative-intercept flag is set to {@code true},
     * the prior variance for dimension zero (0) is set to
     * {@link Double#POSITIVE_INFINITY}.
     *
     * <p>See the class documentation above for more information on
     * Gaussian priors.
     *
     * @param priorVariance Variance of the Gaussian prior for each
     * dimension.
     * @param noninformativeIntercept Flag indicating if intercept is
     * given a noninformative (uniform) prior.
     * @return The Gaussian prior with the specified parameters.
     * @throws IllegalArgumentException If the prior variance is not
     * a non-negative number.
     */
    public static RegressionPrior gaussian(double priorVariance,
                                           boolean noninformativeIntercept) {
        verifyPriorVariance(priorVariance);
        return new VariableGaussianRegressionPrior(priorVariance,noninformativeIntercept);
    }

    /**
     * Returns the Gaussian prior with the specified prior variances
     * for the dimensions.  The number of dimensions is taken to be
     * the length of the variance array.
     *
     * <p>See the class documentation above for more information on
     * Gaussian priors.
     *
     * @param priorVariances Array of prior variances for dimensions.
     * @return The Gaussian prior with the specified variances.
     * @throws IllegalArgumentException If any of the variances are not
     * non-negative numbers.
     */
    public static RegressionPrior gaussian(double[] priorVariances) {
        verifyPriorVariances(priorVariances);
        return new GaussianRegressionPrior(priorVariances);
    }

    /**
     * Returns the Laplace prior with the specified prior variance
     * and indication of whether the intercept dimension is given a
     * noninformative prior.
     *
     * <p>If the noninformative-intercept flag is set to {@code true},
     * the prior variance for dimension zero (0) is set to
     * {@link Double#POSITIVE_INFINITY}.
     *
     * <p>See the class documentation above for more information on
     * Laplace priors.
     *
     * @param priorVariance Variance of the Laplace prior for each
     * dimension.
     * @param noninformativeIntercept Flag indicating if intercept is
     * given a noninformative (uniform) prior.
     * @return The Laplace prior with the specified parameters.
     * @throws IllegalArgumentException If the variance is not a non-negative
     * number.
     */
    public static RegressionPrior laplace(double priorVariance,
                                          boolean noninformativeIntercept) {
        verifyPriorVariance(priorVariance);
        return new VariableLaplaceRegressionPrior(priorVariance,noninformativeIntercept);
    }

    /**
     * Returns the Laplace prior with the specified prior variances
     * for the dimensions.
     *
     * <p>See the class documentation above for more information on
     * Laplace priors.
     *
     * @param priorVariances Array of prior variances for dimensions.
     * @return The Laplace prior for the specified variances.
     * @throws IllegalArgumentException If any of the variances is not
     * a non-negative number.
     */
    public static RegressionPrior laplace(double[] priorVariances) {
        verifyPriorVariances(priorVariances);
        return new LaplaceRegressionPrior(priorVariances);
    }

    /**
     * Returns the Cauchy prior with the specified prior squared
     * scale and indication of whether the intercept is given a
     * noninformative prior.
     *
     * <p>See the class documentation above for more information
     * on Cauchy priors.
     *
     * @param priorSquaredScale The square of the prior scale parameter.
     * @param noninformativeIntercept Flag indicating if intercept is
     * given a noninformative (uniform) prior.
     * @return The Cauchy prior for the specified squared scale and
     * intercept flag.
     * @throws IllegalArgumentException If the squared scale is not a
     * non-negative number.
     */
    public static RegressionPrior cauchy(double priorSquaredScale,
                                         boolean noninformativeIntercept) {
        verifyPriorVariance(priorSquaredScale);
        return new VariableCauchyRegressionPrior(priorSquaredScale,noninformativeIntercept);
    }

    /**
     * Returns the Cauchy prior for the specified squared scales.
     *
     * <p>See the class documentation above for more information
     * on Cauchy priors.
     *
     * @param priorSquaredScales Prior squared scale parameters.
     * @return The Cauchy prior for the specified squared scales.
     * @throws IllegalArgumentException If any of the prior squared
     * scales is not a non-negative number.
     */
    public static RegressionPrior cauchy(double[] priorSquaredScales) {
        verifyPriorVariances(priorSquaredScales);
        return new CauchyRegressionPrior(priorSquaredScales);
    }

    /**
     * Returns the prior that interpolates its log probability between
     * the specified priors, with the specified weight going to the
     * first prior.
     *
     * <p>See the class documentation above for more information on
     * log interpolated priors.
     *
     * @param alpha Weight of first prior.
     * @param prior1 First prior for interpolation.
     * @param prior2 Second prior for interpolation.
     * @return The interpolated prior.
     * @throws IllegalArgumentException If the interpolation ratio is
     * not a number between 0 and 1 inclusive.
     */
    public static RegressionPrior logInterpolated(double alpha,
                                                  RegressionPrior prior1,
                                                  RegressionPrior prior2) {
        if (Double.isNaN(alpha) || alpha < 0.0 || alpha > 1.0) {
            String msg = "Weight of first prior must be between 0 and 1 inclusive."
                + " Found alpha=" + alpha;
            throw new IllegalArgumentException(msg);
        }
        return new LogInterpolatedRegressionPrior(alpha,prior1,prior2);
    }

    /**
     * Returns the elastic net prior with the specified weight on
     * the Laplace prior, the specified scale parameter for the
     * elastic net, and a noninformative prior on the intercept
     * (dimension 0) if the specified flag is set.
     *
     * <p>See the class documentation above for more information on
     * elastic net priors.
     *
     * <p>This is a convenience method for
     *
     * <blockquote><pre>
     * logInterpolated(laplaceWeight,
     *                 laplace(1/sqrt(scale),noninformativeIntercept),
     *                 gaussian(sqrt(2)/scale,noninformativeIntercept))
     * </pre></blockquote>
     *
     * @param laplaceWeight Weight on the Laplace prior.
     * @param scale Scale parameter for the elastic net.
     * @param noninformativeIntercept A flag indicating whether or not
     * the intercept (dimension 0) should have a noninformative prior.
     * @return The elastic net prior with the specified parameters.
     * @throws IllegalArgumentException If the interpolation parameter
     * is not between 0 and 1 inclusive, or if the scale is not
     * positive and finite.
     */
    public static RegressionPrior elasticNet(double laplaceWeight,
                                             double scale,
                                             boolean noninformativeIntercept) {
        if (Double.isInfinite(scale) || !(scale > 0.0)) {
            String msg = "Scale parameter must be finite and positive."
                + " Found scale=" + scale;
            throw new IllegalArgumentException(msg);
        }
        return logInterpolated(laplaceWeight,
                               laplace(1.0/Math.sqrt(scale),noninformativeIntercept),
                               gaussian(SQRT_2/scale,noninformativeIntercept));
    }

    /**
     * Returns the prior that shifts the means of the specified prior
     * by the specified values.
     *
     * <p>See the class documentation above for more information.
     *
     * @param shifts Mean shifts indexed by dimension.
     * @param prior Prior to apply to shifted values.
     * @return Prior that shifts values before delegating to the
     * specified prior.
     */
    public static RegressionPrior shiftMeans(double[] shifts,
                                             RegressionPrior prior) {
        return new ShiftMeans(shifts,prior);
    }

    static void verifyPriorVariance(double priorVariance) {
        if (priorVariance < 0
            || Double.isNaN(priorVariance)
            || priorVariance == Double.NEGATIVE_INFINITY) {
            String msg = "Prior variance must be a non-negative number."
                + " Found priorVariance=" + priorVariance;
            throw new IllegalArgumentException(msg);
        }
    }

    static void verifyPriorVariances(double[] priorVariances) {
        for (int i = 0; i < priorVariances.length; ++i) {
            if (priorVariances[i] < 0
                || Double.isNaN(priorVariances[i])
                || priorVariances[i] == Double.NEGATIVE_INFINITY) {
                String msg = "Prior variances must be non-negative numbers."
                    + " Found priorVariances[" + i + "]=" + priorVariances[i];
                throw new IllegalArgumentException(msg);
            }
        }
    }

    static class NoninformativeRegressionPrior
        extends RegressionPrior
        implements Serializable {

        static final long serialVersionUID = -582012445093979284L;

        @Override
        public double gradient(double beta, int dimension) {
            return 0.0;
        }
        @Override
        public double log2Prior(double beta, int dimension) {
            return 0.0;  // log2(1) = 0
        }
        @Override
        public double log2Prior(Vector beta) {
            return 0.0;
        }
        @Override
        public double log2Prior(Vector[] betas) {
            return 0.0;
        }
        @Override
        public String toString() {
            return "NoninformativeRegressionPrior";
        }
        @Override
        public boolean isUniform() {
            return true;
        }
    }

    static abstract class ArrayRegressionPrior extends RegressionPrior {
        static final long serialVersionUID = -1887383164794837169L;
        final double[] mValues;
        ArrayRegressionPrior(double[] values) {
            mValues = values;
        }
        @Override
        void verifyNumberOfDimensions(int numDimensions) {
            if (mValues.length != numDimensions) {
                String msg = "Prior and instances must match in number of dimensions."
                    + " Found prior numDimensions=" + mValues.length
                    + " instance numDimensions=" + numDimensions;
                throw new IllegalArgumentException(msg);
            }
        }
        public String toString(String priorName, String paramName) {
            StringBuilder sb = new StringBuilder();
            sb.append(priorName + "\n");
            sb.append(" dimensionality=" + mValues.length);
            for (int i = 0; i < mValues.length; ++i)
                sb.append(" " + paramName + "[" + i + "]=" + mValues[i] + "\n");
            return sb.toString();
        }
    }

    static class GaussianRegressionPrior
        extends ArrayRegressionPrior
        implements Serializable {

        static final long serialVersionUID = 8257747607648390037L;

        GaussianRegressionPrior(double[] priorVariances) {
            super(priorVariances);
        }
        @Override
        public double gradient(double beta, int dimension) {
            return beta / mValues[dimension];  // β/σ² for the L2 penalty
        }
        @Override
        public double log2Prior(double beta, int dimension) {
            return -log2Sqrt2Pi
                - 0.5 * com.aliasi.util.Math.log2(mValues[dimension])
                - beta * beta / (2.0 * mValues[dimension]);
        }
        @Override
        public String toString() {
            return toString("GaussianRegressionPrior","Variance");
        }
        private Object writeReplace() {
            return new Serializer(this);
        }
        private static class Serializer extends AbstractExternalizable {
            static final long serialVersionUID = -1129377549371296060L;
            final GaussianRegressionPrior mPrior;
            public Serializer(GaussianRegressionPrior prior) {
                mPrior = prior;
            }
            public Serializer() {
                this(null);
            }
            @Override
            public void writeExternal(ObjectOutput out) throws IOException {
                out.writeInt(mPrior.mValues.length);
                for (int i = 0; i < mPrior.mValues.length; ++i)
                    out.writeDouble(mPrior.mValues[i]);
            }
            @Override
            public Object read(ObjectInput in)
                throws IOException, ClassNotFoundException {

                int numDimensions = in.readInt();
                double[] priorVariances = new double[numDimensions];
                for (int i = 0; i < numDimensions; ++i)
                    priorVariances[i] = in.readDouble();
                return new GaussianRegressionPrior(priorVariances);
            }
        }
    }

    static final double sqrt2 = Math.sqrt(2.0);
    static final double log2Sqrt2Over2 = com.aliasi.util.Math.log2(sqrt2/2.0);
    static final double log2Sqrt2Pi = com.aliasi.util.Math.log2(Math.sqrt(2.0 * Math.PI));
    static final double log21OverPi = -com.aliasi.util.Math.log2(Math.PI);

    static class LaplaceRegressionPrior
        extends ArrayRegressionPrior
        implements Serializable {

        static final long serialVersionUID = 9120480132502062861L;

        LaplaceRegressionPrior(double[] priorVariances) {
            super(priorVariances);
        }
        @Override
        public double gradient(double beta, int dimension) {
            // sqrt(2)/σ * signum(β); zero at the mode
            if (beta == 0.0) return 0.0;
            if (beta > 0) return Math.sqrt(2.0/mValues[dimension]);
            return -Math.sqrt(2.0/mValues[dimension]);
        }
        @Override
        public double log2Prior(double beta, int dimension) {
            return log2Sqrt2Over2
                - 0.5 * com.aliasi.util.Math.log2(mValues[dimension])
                - sqrt2 * Math.abs(beta) / Math.sqrt(mValues[dimension]);
        }
        @Override
        public String toString() {
            return toString("LaplaceRegressionPrior","Variance");
        }
        private Object writeReplace() {
            return new Serializer(this);
        }
        private static class Serializer extends AbstractExternalizable {
            static final long serialVersionUID = 7844951573062416091L;
            final LaplaceRegressionPrior mPrior;
            public Serializer(LaplaceRegressionPrior prior) {
                mPrior = prior;
            }
            public Serializer() {
                this(null);
            }
            @Override
            public void writeExternal(ObjectOutput out) throws IOException {
                out.writeInt(mPrior.mValues.length);
                for (int i = 0; i < mPrior.mValues.length; ++i)
                    out.writeDouble(mPrior.mValues[i]);
            }
            @Override
            public Object read(ObjectInput in)
                throws IOException, ClassNotFoundException {

                int numDimensions = in.readInt();
                double[] priorVariances = new double[numDimensions];
                for (int i = 0; i < numDimensions; ++i)
                    priorVariances[i] = in.readDouble();
                return new LaplaceRegressionPrior(priorVariances);
            }
        }
    }

    static class CauchyRegressionPrior
        extends ArrayRegressionPrior
        implements Serializable {

        static final long serialVersionUID = 2351846943518745614L;

        CauchyRegressionPrior(double[] priorSquaredScales) {
            super(priorSquaredScales);
        }
        @Override
        public double gradient(double beta, int dimension) {
            return 2.0 * beta / (beta * beta + mValues[dimension]);
        }
        @Override
        public double log2Prior(double beta, int dimension) {
            // mValues holds squared scales λ², so the denominator is β² + λ²
            return log21OverPi
                + 0.5 * com.aliasi.util.Math.log2(mValues[dimension])
                - com.aliasi.util.Math.log2(beta * beta + mValues[dimension]);
        }
        @Override
        public String toString() {
            return toString("CauchyRegressionPrior","Scale");
        }
        private Object writeReplace() {
            return new Serializer(this);
        }
        private static class Serializer extends AbstractExternalizable {
            static final long serialVersionUID = 5202676106810759907L;
            final CauchyRegressionPrior mPrior;
            public Serializer(CauchyRegressionPrior prior) {
                mPrior = prior;
            }
            public Serializer() {
                this(null);
            }
            @Override
            public void writeExternal(ObjectOutput out) throws IOException {
                out.writeInt(mPrior.mValues.length);
                for (int i = 0; i < mPrior.mValues.length; ++i)
                    out.writeDouble(mPrior.mValues[i]);
            }
            @Override
            public Object read(ObjectInput in)
                throws IOException, ClassNotFoundException {

                int numDimensions = in.readInt();
                double[] priorScales = new double[numDimensions];
                for (int i = 0; i < numDimensions; ++i)
                    priorScales[i] = in.readDouble();
                return new CauchyRegressionPrior(priorScales);
            }
        }
    }

    static abstract class VariableRegressionPrior extends RegressionPrior {
        static final long serialVersionUID = -7527207309328127863L;
        final double mPriorVariance;
        final boolean mNoninformativeIntercept;
        VariableRegressionPrior(double priorVariance,
                                boolean noninformativeIntercept) {
            mPriorVariance = priorVariance;
            mNoninformativeIntercept = noninformativeIntercept;
        }
        public String toString(String priorName, String paramName) {
            return priorName + "(" + paramName + "=" + mPriorVariance
                + ", noninformativeIntercept=" + mNoninformativeIntercept + ")";
        }
    }

    static class VariableGaussianRegressionPrior
        extends VariableRegressionPrior
        implements Serializable {

        static final long serialVersionUID = -7527207309328127863L;

        VariableGaussianRegressionPrior(double priorVariance,
                                        boolean noninformativeIntercept) {
            super(priorVariance,noninformativeIntercept);
        }
        @Override
        public double gradient(double beta, int dimension) {
            return (dimension == 0 && mNoninformativeIntercept)
                ? 0.0
                : beta / mPriorVariance;
        }
        @Override
        public double log2Prior(double beta, int dimension) {
            if (dimension == 0 && mNoninformativeIntercept)
                return 0.0; // log(1)=0.0
            return -log2Sqrt2Pi
                - 0.5 * com.aliasi.util.Math.log2(mPriorVariance)
                - beta * beta / (2.0 * mPriorVariance);
        }
        @Override
        public String toString() {
            return toString("GaussianRegressionPrior","Variance");
        }
        private Object writeReplace() {
            return new Serializer(this);
        }
        private static class Serializer extends AbstractExternalizable {
            static final long serialVersionUID = 5979483825025936160L;
            final VariableGaussianRegressionPrior mPrior;
            public Serializer(VariableGaussianRegressionPrior prior) {
                mPrior = prior;
            }
            public Serializer() {
                this(null);
            }
            @Override
            public void writeExternal(ObjectOutput out) throws IOException {
                out.writeDouble(mPrior.mPriorVariance);
                out.writeBoolean(mPrior.mNoninformativeIntercept);
            }
            @Override
            public Object read(ObjectInput in)
                throws IOException, ClassNotFoundException {

                double priorVariance = in.readDouble();
                boolean noninformativeIntercept = in.readBoolean();
                return new VariableGaussianRegressionPrior(priorVariance,
                                                           noninformativeIntercept);
            }
        }
    }

    static class VariableLaplaceRegressionPrior
        extends VariableRegressionPrior
        implements Serializable {

        static final long serialVersionUID = -4286001162222250623L;

        // precomputed gradient and log density terms
        final double mPositiveGradient;
        final double mNegativeGradient;
        final double mPriorIntercept;
        final double mPriorCoefficient;

        VariableLaplaceRegressionPrior(double priorVariance,
                                       boolean noninformativeIntercept) {
            super(priorVariance,noninformativeIntercept);
            mPositiveGradient = Math.sqrt(2.0/priorVariance);
            mNegativeGradient = -mPositiveGradient;
            mPriorIntercept = log2Sqrt2Over2
                - 0.5 * com.aliasi.util.Math.log2(priorVariance);
            mPriorCoefficient = -sqrt2 / Math.sqrt(priorVariance);
        }
        @Override
        public double gradient(double beta, int dimension) {
            return ((dimension == 0 && mNoninformativeIntercept) || beta == 0.0)
                ? 0.0
                : (beta > 0 ? mPositiveGradient : mNegativeGradient);
        }
        @Override
        public double log2Prior(double beta, int dimension) {
            if (dimension == 0 && mNoninformativeIntercept)
                return 0.0;
            return mPriorIntercept + mPriorCoefficient * Math.abs(beta);
            // equivalent to:
            // return log2Sqrt2Over2
            //     - 0.5 * com.aliasi.util.Math.log2(mPriorVariance)
            //     - sqrt2 * Math.abs(beta) / Math.sqrt(mPriorVariance);
        }
        @Override
        public String toString() {
            return toString("LaplaceRegressionPrior","Variance");
        }
        private Object writeReplace() {
            return new Serializer(this);
        }
        private static class Serializer extends AbstractExternalizable {
            static final long serialVersionUID = 2321796089407881776L;
            final VariableLaplaceRegressionPrior mPrior;
            public Serializer(VariableLaplaceRegressionPrior prior) {
                mPrior = prior;
            }
            public Serializer() {
                this(null);
            }
            @Override
            public void writeExternal(ObjectOutput out) throws IOException {
                out.writeDouble(mPrior.mPriorVariance);
                out.writeBoolean(mPrior.mNoninformativeIntercept);
            }
            @Override
            public Object read(ObjectInput in)
                throws IOException, ClassNotFoundException {

                double priorVariance = in.readDouble();
                boolean noninformativeIntercept = in.readBoolean();
                return new VariableLaplaceRegressionPrior(priorVariance,
                                                          noninformativeIntercept);
            }
        }
    }

    static class VariableCauchyRegressionPrior
        extends VariableRegressionPrior {

        static final long serialVersionUID = 3368658136325392652L;

        VariableCauchyRegressionPrior(double priorVariance,
                                      boolean noninformativeIntercept) {
            super(priorVariance,noninformativeIntercept);
        }
        @Override
        public double gradient(double beta, int dimension) {
            return (dimension == 0 && mNoninformativeIntercept)
                ? 0
                : 2.0 * beta / (beta * beta + mPriorVariance);
        }
        @Override
        public double log2Prior(double beta, int dimension) {
            if (dimension == 0 && mNoninformativeIntercept)
                return 0.0;
            return log21OverPi
                + 0.5 * com.aliasi.util.Math.log2(mPriorVariance)
                - com.aliasi.util.Math.log2(beta * beta + mPriorVariance);
        }
        @Override
        public String toString() {
            return toString("CauchyRegressionPrior","Scale");
        }
        public Object writeReplace() {
            return new Serializer(this);
        }
        private static class Serializer extends AbstractExternalizable {
            static final long serialVersionUID = -7209096281888148303L;
            final VariableCauchyRegressionPrior mPrior;
            public Serializer(VariableCauchyRegressionPrior prior) {
                mPrior = prior;
            }
            public Serializer() {
                this(null);
            }
            @Override
            public void writeExternal(ObjectOutput out) throws IOException {
                out.writeDouble(mPrior.mPriorVariance);
                out.writeBoolean(mPrior.mNoninformativeIntercept);
            }
            @Override
            public Object read(ObjectInput in)
                throws IOException, ClassNotFoundException {

                double priorScale = in.readDouble();
                boolean noninformativeIntercept = in.readBoolean();
                return new VariableCauchyRegressionPrior(priorScale,
                                                         noninformativeIntercept);
            }
        }
    }

    static class LogInterpolatedRegressionPrior extends RegressionPrior {
        static final long serialVersionUID = 1052451778773339516L;
        private final double mAlpha;
        private final RegressionPrior mPrior1;
        private final RegressionPrior mPrior2;
        LogInterpolatedRegressionPrior(double alpha,
                                       RegressionPrior prior1,
                                       RegressionPrior prior2) {
            mAlpha = alpha;
            mPrior1 = prior1;
            mPrior2 = prior2;
        }
        @Override
        public double gradient(double beta, int dimension) {
            return mAlpha * mPrior1.gradient(beta,dimension)
                + (1 - mAlpha) * mPrior2.gradient(beta,dimension);
        }
        @Override
        public double log2Prior(double beta, int dimension) {
            return mAlpha * mPrior1.log2Prior(beta,dimension)
                + (1 - mAlpha) * mPrior2.log2Prior(beta,dimension);
        }
        @Override
        public String toString() {
            return "LogInterpolatedRegressionPrior("
                + "alpha=" + mAlpha
                + ", prior1=" + mPrior1
                + ", prior2=" + mPrior2 + ")";
        }
        Object writeReplace() {
            return new Serializer(this);
        }
        static class Serializer extends AbstractExternalizable {
            static final long serialVersionUID = 1071183663202516816L;
            final LogInterpolatedRegressionPrior mPrior;
            public Serializer() {
                this(null);
            }
            public Serializer(LogInterpolatedRegressionPrior prior) {
                mPrior = prior;
            }
            @Override
            public void writeExternal(ObjectOutput out) throws IOException {
                out.writeDouble(mPrior.mAlpha);
                out.writeObject(mPrior.mPrior1);
                out.writeObject(mPrior.mPrior2);
            }
            @Override
            public Object read(ObjectInput in)
                throws IOException, ClassNotFoundException {

                double alpha = in.readDouble();
                @SuppressWarnings("unchecked")
                RegressionPrior prior1 = (RegressionPrior) in.readObject();
                @SuppressWarnings("unchecked")
                RegressionPrior prior2 = (RegressionPrior) in.readObject();
                return new LogInterpolatedRegressionPrior(alpha,prior1,prior2);
            }
        }
    }

    static class ShiftMeans extends RegressionPrior {
        static final long serialVersionUID = 5159543505446681732L;
        private final double[] mMeans;
        private final RegressionPrior mPrior;
        ShiftMeans(double[] means,
                   RegressionPrior prior) {
            mPrior = prior;
            mMeans = means;
        }
        @Override
        public double mode(int i) {
            return mMeans[i] + mPrior.mode(i);
        }
        @Override
        public boolean isUniform() {
            return mPrior.isUniform();
        }
        @Override
        public double log2Prior(double betaI, int i) {
            return mPrior.log2Prior(betaI - mMeans[i],i);
        }
        @Override
        public double gradient(double betaI, int i) {
            return mPrior.gradient(betaI - mMeans[i],i);
        }
        @Override
        public String toString() {
            return "ShiftMeans(means=...,prior=" + mPrior + ")";
        }
        Object writeReplace() {
            // serialization proxy, as for the other priors
            return new Serializer(this);
        }
        static class Serializer extends AbstractExternalizable {
            static final long serialVersionUID = -777157399350907424L;
            final ShiftMeans mPrior;
            public Serializer() {
                this(null);
            }
            public Serializer(ShiftMeans prior) {
                mPrior = prior;
            }
            @Override
            public void writeExternal(ObjectOutput out) throws IOException {
                writeDoubles(mPrior.mMeans,out);
                out.writeObject(mPrior.mPrior);
            }
            @Override
            public Object read(ObjectInput in)
                throws IOException, ClassNotFoundException {

                double[] means = readDoubles(in);
                @SuppressWarnings("unchecked")
                RegressionPrior prior = (RegressionPrior) in.readObject();
                return new ShiftMeans(means,prior);
            }
        }
    }
}


