All Downloads are FREE. Search and download functionalities are using the official Maven repository.

org.deeplearning4j.spark.api.RepartitionStrategy Maven / Gradle / Ivy

package org.deeplearning4j.spark.api;

/**
 * RepartitionStrategy: different strategies for conducting repartitioning on training data, when repartitioning is required.
* SparkDefault: repartition using Spark's standard {@code RDD.repartition(int)} method. This results in each value being * randomly mapped to a new partition. This results in approximately equal partitions, though random sampling issues can * be problematic when the number of elements in a RDD is small
* Balanced: a custom repartitioning strategy that attempts to ensure that each partition ends up with the correct number * of elements. It has a slightly higher overhead (need to count the number of values in each partition) but should be less * prone to random sampling variance than the SparkDefault strategy * * * @author Alex Black */ public enum RepartitionStrategy { SparkDefault, Balanced }




© 2015 - 2025 Weber Informatics LLC | Privacy Policy