org.nd4j.weightinit.WeightInit Maven / Gradle / Ivy


There is a newer version: 1.0.0-M2.1

/*-
 *
 *  * Copyright 2015 Skymind,Inc.
 *  *
 *  *    Licensed under the Apache License, Version 2.0 (the "License");
 *  *    you may not use this file except in compliance with the License.
 *  *    You may obtain a copy of the License at
 *  *
 *  *        http://www.apache.org/licenses/LICENSE-2.0
 *  *
 *  *    Unless required by applicable law or agreed to in writing, software
 *  *    distributed under the License is distributed on an "AS IS" BASIS,
 *  *    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 *  *    See the License for the specific language governing permissions and
 *  *    limitations under the License.
 *
 */

package org.nd4j.weightinit;

/**
 * Weight initialization scheme
 *
 * DISTRIBUTION: Sample weights from a provided distribution
 *
 * ZERO: Generate weights as zeros
 *
 * ONES: All weights are set to 1
 *
 * SIGMOID_UNIFORM: A version of XAVIER_UNIFORM for sigmoid activation functions. U(-r,r) with r=4*sqrt(6/(fanIn + fanOut))
 *
 * NORMAL: Normal/Gaussian distribution, with mean 0 and standard deviation 1/sqrt(fanIn).
 * This is the initialization recommended in Klambauer et al. 2017, "Self-Normalizing Neural Networks". Equivalent to
 * DL4J's XAVIER_FAN_IN and LECUN_NORMAL (i.e. Keras' "lecun_normal")
 *
 * LECUN_UNIFORM: Uniform U[-a,a] with a=3/sqrt(fanIn).
 *
 * UNIFORM: Uniform U[-a,a] with a=1/sqrt(fanIn). "Commonly used heuristic" as per Glorot and Bengio 2010
 *
 * XAVIER: As per Glorot and Bengio 2010: Gaussian distribution with mean 0, variance 2.0/(fanIn + fanOut)
 *
 * XAVIER_UNIFORM: As per Glorot and Bengio 2010: Uniform distribution U(-s,s) with s = sqrt(6/(fanIn + fanOut))
 *
 * XAVIER_FAN_IN: Similar to Xavier, but 1/fanIn -> Caffe originally used this.
 *
 * XAVIER_LEGACY: Xavier weight init in DL4J up to 0.6.0. XAVIER should be preferred.
 *
 * RELU: He et al. (2015), "Delving Deep into Rectifiers". Normal distribution with variance 2.0/nIn
 *
 * RELU_UNIFORM: He et al. (2015), "Delving Deep into Rectifiers". Uniform distribution U(-s,s) with s = sqrt(6/fanIn)
 *
 * IDENTITY: Weights are set to an identity matrix. Note: can only be used with square weight matrices
 *
 * VAR_SCALING_NORMAL_FAN_IN: Gaussian distribution with mean 0, variance 1.0/(fanIn)
 *
 * VAR_SCALING_NORMAL_FAN_OUT: Gaussian distribution with mean 0, variance 1.0/(fanOut)
 *
 * VAR_SCALING_NORMAL_FAN_AVG: Gaussian distribution with mean 0, variance 1.0/((fanIn + fanOut)/2)
 *
 * VAR_SCALING_UNIFORM_FAN_IN: Uniform U[-a,a] with a=3.0/(fanIn)
 *
 * VAR_SCALING_UNIFORM_FAN_OUT: Uniform U[-a,a] with a=3.0/(fanOut)
 *
 * VAR_SCALING_UNIFORM_FAN_AVG: Uniform U[-a,a] with a=3.0/((fanIn + fanOut)/2)
 *
 * @author Adam Gibson
 */
public enum WeightInit {
    DISTRIBUTION, ZERO, ONES, SIGMOID_UNIFORM, NORMAL, LECUN_NORMAL, UNIFORM,
    XAVIER, XAVIER_UNIFORM, XAVIER_FAN_IN, XAVIER_LEGACY, RELU, RELU_UNIFORM,
    IDENTITY, LECUN_UNIFORM,
    VAR_SCALING_NORMAL_FAN_IN, VAR_SCALING_NORMAL_FAN_OUT, VAR_SCALING_NORMAL_FAN_AVG,
    VAR_SCALING_UNIFORM_FAN_IN, VAR_SCALING_UNIFORM_FAN_OUT, VAR_SCALING_UNIFORM_FAN_AVG,
    SUPPLIED
}
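
The enum only names the schemes; the scale formulas appear solely in its Javadoc. The sketch below works a few of those formulas out in plain Java, to show how fanIn and fanOut turn into a standard deviation or a uniform bound. It is not ND4J code: the class `WeightInitSketch` and all of its method names are hypothetical, and it uses `java.util.Random` rather than ND4J's own random number generation.

```java
import java.util.Random;

// Hypothetical helper, not part of ND4J: evaluates some of the scale
// formulas quoted in the WeightInit Javadoc.
class WeightInitSketch {

    // XAVIER: Gaussian with variance 2/(fanIn + fanOut); returns the standard deviation.
    static double xavierStd(int fanIn, int fanOut) {
        return Math.sqrt(2.0 / (fanIn + fanOut));
    }

    // XAVIER_UNIFORM: U(-s, s) with s = sqrt(6/(fanIn + fanOut)).
    static double xavierUniformBound(int fanIn, int fanOut) {
        return Math.sqrt(6.0 / (fanIn + fanOut));
    }

    // SIGMOID_UNIFORM: U(-r, r) with r = 4 * sqrt(6/(fanIn + fanOut)).
    static double sigmoidUniformBound(int fanIn, int fanOut) {
        return 4.0 * xavierUniformBound(fanIn, fanOut);
    }

    // RELU: Gaussian with variance 2/fanIn; returns the standard deviation.
    static double reluStd(int fanIn) {
        return Math.sqrt(2.0 / fanIn);
    }

    // RELU_UNIFORM: U(-s, s) with s = sqrt(6/fanIn).
    static double reluUniformBound(int fanIn) {
        return Math.sqrt(6.0 / fanIn);
    }

    // Draws n values uniformly from [-bound, bound).
    static double[] sampleUniform(int n, double bound, Random rng) {
        double[] w = new double[n];
        for (int i = 0; i < n; i++) {
            w[i] = (rng.nextDouble() * 2.0 - 1.0) * bound;
        }
        return w;
    }

    // Draws n values from a Gaussian with mean 0 and the given standard deviation.
    static double[] sampleNormal(int n, double std, Random rng) {
        double[] w = new double[n];
        for (int i = 0; i < n; i++) {
            w[i] = rng.nextGaussian() * std;
        }
        return w;
    }

    public static void main(String[] args) {
        int fanIn = 256, fanOut = 128; // example layer sizes, chosen arbitrarily
        Random rng = new Random(12345);
        System.out.println("XAVIER std           = " + xavierStd(fanIn, fanOut));
        System.out.println("XAVIER_UNIFORM bound = " + xavierUniformBound(fanIn, fanOut));
        System.out.println("RELU std             = " + reluStd(fanIn));
        double[] w = sampleUniform(fanIn * fanOut, reluUniformBound(fanIn), rng);
        System.out.println("sampled " + w.length + " RELU_UNIFORM weights");
    }
}
```

The pairing of a Gaussian and a uniform variant follows from the variance of U(-s, s) being s²/3: a bound of sqrt(6/(fanIn + fanOut)) gives variance 2/(fanIn + fanOut), which matches XAVIER, and sqrt(6/fanIn) gives 2/fanIn, which matches RELU.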