/*-
*
* * Copyright 2015 Skymind,Inc.
* *
* * Licensed under the Apache License, Version 2.0 (the "License");
* * you may not use this file except in compliance with the License.
* * You may obtain a copy of the License at
* *
* * http://www.apache.org/licenses/LICENSE-2.0
* *
* * Unless required by applicable law or agreed to in writing, software
* * distributed under the License is distributed on an "AS IS" BASIS,
* * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* * See the License for the specific language governing permissions and
* * limitations under the License.
*
*/
package org.nd4j.weightinit;
/**
* Weight initialization scheme
*
* DISTRIBUTION: Sample weights from a provided distribution
*
* ZERO: Generate weights as zeros
*
* ONES: All weights are set to 1
*
* SIGMOID_UNIFORM: A version of XAVIER_UNIFORM for sigmoid activation functions. U(-r,r) with r=4*sqrt(6/(fanIn + fanOut))
*
* NORMAL: Normal/Gaussian distribution, with mean 0 and standard deviation 1/sqrt(fanIn).
* This is the initialization recommended in Klambauer et al. 2017, "Self-Normalizing Neural Networks". Equivalent to
* DL4J's XAVIER_FAN_IN and LECUN_NORMAL (i.e. Keras' "lecun_normal")
*
* LECUN_UNIFORM: Uniform U[-a,a] with a=3/sqrt(fanIn).
*
* UNIFORM: Uniform U[-a,a] with a=1/sqrt(fanIn). "Commonly used heuristic" as per Glorot and Bengio 2010
*
* XAVIER: As per Glorot and Bengio 2010: Gaussian distribution with mean 0, variance 2.0/(fanIn + fanOut)
*
* XAVIER_UNIFORM: As per Glorot and Bengio 2010: Uniform distribution U(-s,s) with s = sqrt(6/(fanIn + fanOut))
*
* XAVIER_FAN_IN: Similar to XAVIER, but with variance 1/fanIn. Caffe originally used this.
*
* XAVIER_LEGACY: Xavier weight init in DL4J up to 0.6.0. XAVIER should be preferred.
*
* RELU: He et al. (2015), "Delving Deep into Rectifiers". Normal distribution with variance 2.0/fanIn
*
* RELU_UNIFORM: He et al. (2015), "Delving Deep into Rectifiers". Uniform distribution U(-s,s) with s = sqrt(6/fanIn)
*
* IDENTITY: Weights are set to an identity matrix. Note: can only be used with square weight matrices
*
* VAR_SCALING_NORMAL_FAN_IN: Gaussian distribution with mean 0, variance 1.0/(fanIn)
*
* VAR_SCALING_NORMAL_FAN_OUT: Gaussian distribution with mean 0, variance 1.0/(fanOut)
*
* VAR_SCALING_NORMAL_FAN_AVG: Gaussian distribution with mean 0, variance 1.0/((fanIn + fanOut)/2)
*
* VAR_SCALING_UNIFORM_FAN_IN: Uniform U[-a,a] with a=3.0/(fanIn)
*
* VAR_SCALING_UNIFORM_FAN_OUT: Uniform U[-a,a] with a=3.0/(fanOut)
*
* VAR_SCALING_UNIFORM_FAN_AVG: Uniform U[-a,a] with a=3.0/((fanIn + fanOut)/2)
*
* @author Adam Gibson
*/
public enum WeightInit {
    DISTRIBUTION, ZERO, ONES, SIGMOID_UNIFORM, NORMAL, LECUN_NORMAL, UNIFORM, XAVIER, XAVIER_UNIFORM, XAVIER_FAN_IN, XAVIER_LEGACY, RELU,
    RELU_UNIFORM, IDENTITY, LECUN_UNIFORM, VAR_SCALING_NORMAL_FAN_IN, VAR_SCALING_NORMAL_FAN_OUT, VAR_SCALING_NORMAL_FAN_AVG,
    VAR_SCALING_UNIFORM_FAN_IN, VAR_SCALING_UNIFORM_FAN_OUT, VAR_SCALING_UNIFORM_FAN_AVG, SUPPLIED
}
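
// A few of the formulas documented in the WeightInit Javadoc can be sketched in plain Java.
// This is a minimal illustration under stated assumptions, not ND4J's implementation:
// the class WeightInitSketch and all of its methods are hypothetical names, weights are
// held in a plain double[fanIn][fanOut] array instead of an INDArray, and sampling uses
// java.util.Random rather than ND4J's random number facilities.

```java
import java.util.Random;

// Hypothetical helper (not part of ND4J): mirrors some formulas from the WeightInit Javadoc.
final class WeightInitSketch {

    private WeightInitSketch() {}

    /** XAVIER_UNIFORM: bound s of U(-s,s), with s = sqrt(6/(fanIn + fanOut)). */
    static double xavierUniformBound(int fanIn, int fanOut) {
        return Math.sqrt(6.0 / (fanIn + fanOut));
    }

    /** XAVIER: standard deviation for a Gaussian with variance 2/(fanIn + fanOut). */
    static double xavierStdDev(int fanIn, int fanOut) {
        return Math.sqrt(2.0 / (fanIn + fanOut));
    }

    /** RELU: standard deviation for a Gaussian with variance 2/fanIn. */
    static double reluStdDev(int fanIn) {
        return Math.sqrt(2.0 / fanIn);
    }

    /** RELU_UNIFORM: bound s of U(-s,s), with s = sqrt(6/fanIn). */
    static double reluUniformBound(int fanIn) {
        return Math.sqrt(6.0 / fanIn);
    }

    /** Fills a fanIn x fanOut array with samples drawn uniformly from [-bound, bound). */
    static double[][] sampleUniform(int fanIn, int fanOut, double bound, Random rng) {
        double[][] w = new double[fanIn][fanOut];
        for (int i = 0; i < fanIn; i++) {
            for (int j = 0; j < fanOut; j++) {
                // nextDouble() is in [0,1); rescale it to [-bound, bound)
                w[i][j] = (2.0 * rng.nextDouble() - 1.0) * bound;
            }
        }
        return w;
    }

    /** Fills a fanIn x fanOut array with samples from a Gaussian with mean 0 and the given standard deviation. */
    static double[][] sampleNormal(int fanIn, int fanOut, double stdDev, Random rng) {
        double[][] w = new double[fanIn][fanOut];
        for (int i = 0; i < fanIn; i++) {
            for (int j = 0; j < fanOut; j++) {
                // nextGaussian() has mean 0 and standard deviation 1; scale it
                w[i][j] = rng.nextGaussian() * stdDev;
            }
        }
        return w;
    }

    public static void main(String[] args) {
        Random rng = new Random(42);
        int fanIn = 256, fanOut = 128;

        // XAVIER_UNIFORM: sqrt(6/384) = 0.125
        double s = xavierUniformBound(fanIn, fanOut);
        double[][] uniformWeights = sampleUniform(fanIn, fanOut, s, rng);
        System.out.printf("XAVIER_UNIFORM bound = %.4f, first weight = %.4f%n", s, uniformWeights[0][0]);

        // RELU: Gaussian with variance 2/fanIn
        double sd = reluStdDev(fanIn);
        double[][] reluWeights = sampleNormal(fanIn, fanOut, sd, rng);
        System.out.printf("RELU std dev = %.4f, first weight = %.4f%n", sd, reluWeights[0][0]);
    }
}
```

// With fanIn = 256 and fanOut = 128 the XAVIER_UNIFORM bound is sqrt(6/384) = 0.125, so every
// sampled weight lies in [-0.125, 0.125). The same pattern extends to the other schemes by
// swapping in the bound or standard deviation given in the Javadoc.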