public static class ParameterAveragingTrainingMaster.Builder
extends java.lang.Object
| Constructor and Description |
|---|
| Builder(int rddDataSetNumExamples) Same as Builder(Integer, int), but automatically sets the number of workers based on JavaSparkContext.defaultParallelism() |
| Builder(java.lang.Integer numWorkers, int rddDataSetNumExamples) Create a builder, specifying the number of workers (Spark executors * number of threads per executor) to be used. Note: this should match the configuration of the cluster. |
| Modifier and Type | Method and Description |
|---|---|
| ParameterAveragingTrainingMaster.Builder | averagingFrequency(int averagingFrequency) Frequency with which to average worker parameters. Note: too high or too low can be bad for different reasons. Too low (such as 1) can result in a lot of network traffic; too high (>> 20 or so) can result in accuracy issues or problems with network convergence. |
| ParameterAveragingTrainingMaster.Builder | batchSizePerWorker(int batchSizePerWorker) Batch size (in number of examples) per worker, for each fit(DataSet) call. |
| ParameterAveragingTrainingMaster | build() |
| ParameterAveragingTrainingMaster.Builder | exportDirectory(java.lang.String exportDirectory) When rddTrainingApproach(RDDTrainingApproach) is set to RDDTrainingApproach.Export (as it is by default), the data is exported to a temporary directory first. |
| ParameterAveragingTrainingMaster.Builder | rddTrainingApproach(RDDTrainingApproach rddTrainingApproach) The approach to use when training on a RDD<DataSet> or RDD<MultiDataSet>. |
| ParameterAveragingTrainingMaster.Builder | repartionData(Repartition repartition) Set if/when repartitioning should be conducted for the training data. Default: always repartition (if required to guarantee the correct number of partitions and the correct number of examples in each partition). |
| ParameterAveragingTrainingMaster.Builder | repartitionStrategy(RepartitionStrategy repartitionStrategy) Used in conjunction with repartionData(Repartition) (which defines when repartitioning should be conducted); repartitionStrategy defines how the repartitioning should be done. |
| ParameterAveragingTrainingMaster.Builder | rngSeed(long rngSeed) Random number generator seed, used mainly for enforcing repeatable splitting on RDDs. Default: no seed set (i.e., random seed). |
| ParameterAveragingTrainingMaster.Builder | saveUpdater(boolean saveUpdater) Set whether the updater (i.e., historical state for momentum, AdaGrad, etc.) should be saved. |
| ParameterAveragingTrainingMaster.Builder | storageLevel(org.apache.spark.storage.StorageLevel storageLevel) Set the storage level for RDD<DataSet>s. Default: StorageLevel.MEMORY_ONLY_SER(), i.e., store in memory, in serialized form. To use no RDD persistence, use null. |
| ParameterAveragingTrainingMaster.Builder | storageLevelStreams(org.apache.spark.storage.StorageLevel storageLevelStreams) Set the storage level for RDDs used when fitting data from streams: either PortableDataStreams (sc.binaryFiles, via SparkDl4jMultiLayer.fit(String) and SparkComputationGraph.fit(String)) or String paths (via SparkDl4jMultiLayer.fitPaths(JavaRDD), SparkComputationGraph.fitPaths(JavaRDD) and SparkComputationGraph.fitPathsMultiDataSet(JavaRDD)). |
| ParameterAveragingTrainingMaster.Builder | trainingHooks(java.util.Collection<TrainingHook> trainingHooks) Adds training hooks to the master. |
| ParameterAveragingTrainingMaster.Builder | trainingHooks(TrainingHook... hooks) Adds training hooks to the master. |
| ParameterAveragingTrainingMaster.Builder | workerPrefetchNumBatches(int prefetchNumBatches) Set the number of minibatches to asynchronously prefetch in the worker. |
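The interaction between batchSizePerWorker and averagingFrequency above can be sketched numerically. All values below are hypothetical, chosen only for illustration:

```java
// Hypothetical cluster settings, for illustration only.
public class AveragingSketch {
    public static void main(String[] args) {
        int numWorkers = 4;           // Spark executors * threads per executor
        int batchSizePerWorker = 16;  // examples per minibatch, per worker
        int averagingFrequency = 5;   // minibatches between parameter averagings

        // Examples processed across the whole cluster between two averaging steps:
        int examplesPerAveragingRound = numWorkers * batchSizePerWorker * averagingFrequency;
        System.out.println(examplesPerAveragingRound); // 320
    }
}
```

With averagingFrequency = 1, every minibatch triggers network traffic for averaging; with a very large value, the workers' parameters can drift apart before being averaged, which is the convergence concern noted above.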
public Builder(int rddDataSetNumExamples)
Same as Builder(Integer, int), but automatically sets the number of workers based on JavaSparkContext.defaultParallelism().
rddDataSetNumExamples - Number of examples in each DataSet object in the RDD<DataSet>
public Builder(java.lang.Integer numWorkers, int rddDataSetNumExamples)
Create a builder, specifying the number of workers (Spark executors * number of threads per executor) to be used. Note: this should match the configuration of the cluster.
It is also necessary to specify how many examples are in each DataSet that appears in the RDD<DataSet> or JavaRDD<DataSet> used for training. The two most common cases:
(a) Preprocessed data pipelines will often load binary DataSet objects with N > 1 examples in each; in this case, rddDataSetNumExamples should be set to N.
(b) "In line" data pipelines (for example, CSV String -> record reader -> DataSet just before training) will typically have exactly 1 example in each DataSet object; in this case, rddDataSetNumExamples should be set to 1.
numWorkers - Number of Spark execution threads in the cluster. May be null. If null, the number of workers will be obtained from JavaSparkContext.defaultParallelism(), which should provide the number of cores in the cluster.
rddDataSetNumExamples - Number of examples in each DataSet object in the RDD<DataSet>
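The two cases above can be checked with simple arithmetic; the DataSet-object counts below are hypothetical, for illustration only:

```java
// Hypothetical RDD sizes for the two common cases described above.
public class RddDataSetNumExamplesSketch {
    public static void main(String[] args) {
        // Case (a): preprocessed pipeline, each serialized DataSet holds N = 32 examples
        int dataSetObjectsA = 1000;
        int rddDataSetNumExamplesA = 32;   // set rddDataSetNumExamples to N = 32
        System.out.println(dataSetObjectsA * rddDataSetNumExamplesA); // 32000 total examples

        // Case (b): in-line pipeline, exactly 1 example per DataSet object
        int dataSetObjectsB = 32000;
        int rddDataSetNumExamplesB = 1;    // set rddDataSetNumExamples to 1
        System.out.println(dataSetObjectsB * rddDataSetNumExamplesB); // 32000 total examples
    }
}
```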
public ParameterAveragingTrainingMaster.Builder trainingHooks(java.util.Collection<TrainingHook> trainingHooks)
Adds training hooks to the master.
trainingHooks - the training hooks to add

public ParameterAveragingTrainingMaster.Builder trainingHooks(TrainingHook... hooks)
Adds training hooks to the master.
hooks - the training hooks to add

public ParameterAveragingTrainingMaster.Builder batchSizePerWorker(int batchSizePerWorker)
Batch size (in number of examples) per worker, for each fit(DataSet) call.
batchSizePerWorker - Size of each minibatch to use for each worker

public ParameterAveragingTrainingMaster.Builder averagingFrequency(int averagingFrequency)
Frequency with which to average worker parameters. Note: too high or too low can be bad for different reasons. Too low (such as 1) can result in a lot of network traffic; too high (>> 20 or so) can result in accuracy issues or problems with network convergence.
averagingFrequency - Frequency (in number of minibatches of size 'batchSizePerWorker') to average parameters

public ParameterAveragingTrainingMaster.Builder workerPrefetchNumBatches(int prefetchNumBatches)
Set the number of minibatches to asynchronously prefetch in the worker. Default: 0 (no prefetching).
prefetchNumBatches - Number of minibatches (DataSets of size batchSizePerWorker) to fetch

public ParameterAveragingTrainingMaster.Builder saveUpdater(boolean saveUpdater)
Set whether the updater (i.e., historical state for momentum, AdaGrad, etc.) should be saved. This is enabled by default.
saveUpdater - If true: retain the updater state (default). If false: don't retain it (updaters will be reinitialized in each worker after averaging).

public ParameterAveragingTrainingMaster.Builder repartionData(Repartition repartition)
Set if/when repartitioning should be conducted for the training data. Default: always repartition (if required to guarantee the correct number of partitions and the correct number of examples in each partition).
repartition - Setting for repartitioning

public ParameterAveragingTrainingMaster.Builder repartitionStrategy(RepartitionStrategy repartitionStrategy)
Used in conjunction with repartionData(Repartition) (which defines when repartitioning should be conducted); repartitionStrategy defines how the repartitioning should be done. See RepartitionStrategy for details.
repartitionStrategy - Repartitioning strategy to use

public ParameterAveragingTrainingMaster.Builder storageLevel(org.apache.spark.storage.StorageLevel storageLevel)
Set the storage level for RDD<DataSet>s. Default: StorageLevel.MEMORY_ONLY_SER(), i.e., store in memory, in serialized form. To use no RDD persistence, use null.
Note: Spark's StorageLevel.MEMORY_ONLY() and StorageLevel.MEMORY_AND_DISK() can be problematic with off-heap data (which DL4J/ND4J uses extensively). Spark does not account for off-heap memory when deciding if/when to drop blocks to ensure enough free memory; consequently, for DataSet RDDs that are larger than the total amount of (off-heap) memory, this can lead to OOM issues. Put another way: Spark counts only the on-heap size of DataSet and INDArray objects (which is negligible), resulting in a significant underestimate of the true DataSet object sizes; more DataSets are thus kept in memory than we can really afford.
storageLevel - Storage level to use for DataSet RDDs

public ParameterAveragingTrainingMaster.Builder storageLevelStreams(org.apache.spark.storage.StorageLevel storageLevelStreams)
Set the storage level for RDDs used when fitting data from streams: either PortableDataStreams (sc.binaryFiles, via SparkDl4jMultiLayer.fit(String) and SparkComputationGraph.fit(String)) or String paths (via SparkDl4jMultiLayer.fitPaths(JavaRDD), SparkComputationGraph.fitPaths(JavaRDD) and SparkComputationGraph.fitPathsMultiDataSet(JavaRDD)).
The default storage level is StorageLevel.MEMORY_ONLY(), which should be appropriate in most cases.
storageLevelStreams - Storage level to use

public ParameterAveragingTrainingMaster.Builder rddTrainingApproach(RDDTrainingApproach rddTrainingApproach)
The approach to use when training on a RDD<DataSet> or RDD<MultiDataSet>.
Default: RDDTrainingApproach.Export, which exports data to a temporary directory first.
rddTrainingApproach - Training approach to use when training from a RDD<DataSet> or RDD<MultiDataSet>
public ParameterAveragingTrainingMaster.Builder exportDirectory(java.lang.String exportDirectory)
When rddTrainingApproach(RDDTrainingApproach) is set to RDDTrainingApproach.Export (as it is by default), the data is exported to a temporary directory first.
Default: null, i.e., use {hadoop.tmp.dir}/dl4j/; in this case, data is exported to {hadoop.tmp.dir}/dl4j/SOME_UNIQUE_ID/.
If you specify a directory, the directory {exportDirectory}/SOME_UNIQUE_ID/ will be used instead.
exportDirectory - Base directory to export data

public ParameterAveragingTrainingMaster.Builder rngSeed(long rngSeed)
Random number generator seed, used mainly for enforcing repeatable splitting on RDDs. Default: no seed set (i.e., random seed).
rngSeed - RNG seed

public ParameterAveragingTrainingMaster build()
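Putting the methods above together, a minimal configuration sketch (not a complete training program: the import path and the surrounding SparkContext, network configuration, and DL4J Spark dependencies are assumptions beyond this page, and all numeric values are illustrative only):

```java
import org.apache.spark.storage.StorageLevel;
// Package path assumed for DL4J's Spark module:
import org.deeplearning4j.spark.impl.paramavg.ParameterAveragingTrainingMaster;

// Sketch: configure a ParameterAveragingTrainingMaster using the builder above.
// Single-arg constructor: workers taken from JavaSparkContext.defaultParallelism().
ParameterAveragingTrainingMaster tm =
        new ParameterAveragingTrainingMaster.Builder(1) // 1 example per DataSet object (case (b))
                .batchSizePerWorker(16)
                .averagingFrequency(5)
                .workerPrefetchNumBatches(2)
                .saveUpdater(true)
                .storageLevel(StorageLevel.MEMORY_ONLY_SER())
                .rngSeed(12345L)
                .build();
```

The resulting training master would then be passed to a Spark-aware wrapper such as SparkDl4jMultiLayer or SparkComputationGraph for fitting.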