SparkSequenceVectors

java.lang.Object
- org.deeplearning4j.models.embeddings.wordvectors.WordVectorsImpl<T>
  - org.deeplearning4j.models.sequencevectors.SequenceVectors<T>
    - org.deeplearning4j.spark.models.sequencevectors.SparkSequenceVectors<T>

All Implemented Interfaces:

java.io.Serializable, WordVectors

Direct Known Subclasses:

SparkParagraphVectors, SparkWord2Vec
```
public class SparkSequenceVectors<T extends SequenceElement>
extends SequenceVectors<T>
```
Generic SequenceVectors implementation for dl4j-spark-nlp.

See Also:

Serialized Form

Nested Class Summary

Nested Classes
Modifier and Type	Class and Description
`static class`	`SparkSequenceVectors.Builder<T extends SequenceElement>`

Nested classes/interfaces inherited from class org.deeplearning4j.models.sequencevectors.SequenceVectors
SequenceVectors.AsyncSequencer

Field Summary

Fields
Modifier and Type	Field and Description
`protected org.apache.spark.broadcast.Broadcast<VectorsConfiguration>`	`configurationBroadcast`
`protected SparkElementsLearningAlgorithm`	`ela`
`protected org.apache.spark.Accumulator<Counter<java.lang.Long>>`	`elementsFreqAccum`
`protected org.apache.spark.Accumulator<ExtraCounter<java.lang.Long>>`	`elementsFreqAccumExtra`
`protected SparkModelExporter<T>`	`exporter`
`protected boolean`	`isAutoDiscoveryMode`
`protected boolean`	`isEnvironmentReady`
`protected org.nd4j.parameterserver.distributed.conf.VoidConfiguration`	`paramServerConfiguration`
`protected VocabCache<ShallowSequenceElement>`	`shallowVocabCache`
`protected org.apache.spark.broadcast.Broadcast<VocabCache<ShallowSequenceElement>>`	`shallowVocabCacheBroadcast`
`protected SparkSequenceLearningAlgorithm`	`sla`
`protected org.apache.spark.storage.StorageLevel`	`storageLevel`
`protected org.apache.spark.broadcast.Broadcast<VocabCache<T>>`	`vocabCacheBroadcast`

Fields inherited from class org.deeplearning4j.models.sequencevectors.SequenceVectors
configuration, configured, elementsLearningAlgorithm, enableScavenger, eventListeners, existingModel, iterator, log, scoreElements, scoreSequences, sequenceLearningAlgorithm, unknownElement

Fields inherited from class org.deeplearning4j.models.embeddings.wordvectors.WordVectorsImpl
batchSize, DEFAULT_UNK, layerSize, learningRate, learningRateDecayWords, lookupTable, minLearningRate, minWordFrequency, modelUtils, negative, numEpochs, numIterations, resetModel, sampling, seed, stopWords, trainElementsVectors, trainSequenceVectors, useAdeGrad, useUnknown, variableWindows, vocab, window, workers

Constructor Summary

Constructors
Modifier	Constructor and Description
`protected`	`SparkSequenceVectors()`
`protected`	`SparkSequenceVectors(VectorsConfiguration configuration)`

Method Summary

Modifier and Type	Method and Description
`protected void`	`broadcastEnvironment(org.apache.spark.api.java.JavaSparkContext context)`
`protected VocabCache<ShallowSequenceElement>`	`buildShallowVocabCache(Counter<java.lang.Long> counter)` Builds the shallow vocabulary and Huffman tree.
`void`	`fit()` Deprecated.
`void`	`fitLists(org.apache.spark.api.java.JavaRDD<java.util.List<T>> corpus)` Utility method that calls fitSequences() internally.
`void`	`fitSequences(org.apache.spark.api.java.JavaRDD<Sequence<T>> corpus)` Base training entry point.
`protected Counter<java.lang.Long>`	`getCounter()`
`protected VocabCache<ShallowSequenceElement>`	`getShallowVocabCache()`
`protected void`	`validateConfiguration()`

Methods inherited from class org.deeplearning4j.models.sequencevectors.SequenceVectors
buildVocab, getElementsScore, getSequencesScore, getUNK, getWordVectorMatrix, initLearners, trainSequence

Methods inherited from class org.deeplearning4j.models.embeddings.wordvectors.WordVectorsImpl
accuracy, getLayerSize, getWordVector, getWordVectorMatrixNormalized, getWordVectors, getWordVectorsMean, hasWord, indexOf, lookupTable, setLookupTable, setModelUtils, setVocab, similarity, similarWordsInVocabTo, update, update, vocab, wordsNearest, wordsNearest, wordsNearest, wordsNearestSum, wordsNearestSum, wordsNearestSum

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Methods inherited from interface org.deeplearning4j.models.embeddings.wordvectors.WordVectors
accuracy, getWordVector, getWordVectorMatrixNormalized, getWordVectors, getWordVectorsMean, hasWord, indexOf, lookupTable, setModelUtils, setUNK, similarity, similarWordsInVocabTo, vocab, wordsNearest, wordsNearest, wordsNearest, wordsNearestSum, wordsNearestSum, wordsNearestSum

Field Detail

elementsFreqAccum

protected org.apache.spark.Accumulator<Counter<java.lang.Long>> elementsFreqAccum

elementsFreqAccumExtra

protected org.apache.spark.Accumulator<ExtraCounter<java.lang.Long>> elementsFreqAccumExtra

storageLevel

protected org.apache.spark.storage.StorageLevel storageLevel

vocabCacheBroadcast

protected org.apache.spark.broadcast.Broadcast<VocabCache<T>> vocabCacheBroadcast

shallowVocabCacheBroadcast

protected org.apache.spark.broadcast.Broadcast<VocabCache<ShallowSequenceElement>> shallowVocabCacheBroadcast

configurationBroadcast

protected org.apache.spark.broadcast.Broadcast<VectorsConfiguration> configurationBroadcast

isEnvironmentReady

protected transient boolean isEnvironmentReady

shallowVocabCache

protected transient VocabCache<ShallowSequenceElement> shallowVocabCache

isAutoDiscoveryMode
```
protected boolean isAutoDiscoveryMode
```

exporter

protected SparkModelExporter<T> exporter

ela

protected SparkElementsLearningAlgorithm ela

sla

protected SparkSequenceLearningAlgorithm sla

paramServerConfiguration

protected org.nd4j.parameterserver.distributed.conf.VoidConfiguration paramServerConfiguration

Constructor Detail

SparkSequenceVectors
```
protected SparkSequenceVectors()
```

SparkSequenceVectors

protected SparkSequenceVectors(@NonNull
                               VectorsConfiguration configuration)

Method Detail
- getShallowVocabCache
```
protected VocabCache<ShallowSequenceElement> getShallowVocabCache()
```
- fit
```
@Deprecated
public void fit()
```
  Deprecated.
  
  This method is not supported in the Spark implementation. Use fitLists() or fitSequences() instead.
  
  Overrides:
  
  fit in class SequenceVectors<T extends SequenceElement>
- validateConfiguration
```
protected void validateConfiguration()
```
- broadcastEnvironment
```
protected void broadcastEnvironment(org.apache.spark.api.java.JavaSparkContext context)
```
- fitLists
```
public void fitLists(org.apache.spark.api.java.JavaRDD<java.util.List<T>> corpus)
```
  Utility method that calls fitSequences() internally. This method cannot be used to train with labels, because a List cannot hold labels. If you need labels, build Sequence objects manually and pass them to fitSequences() instead.
  
  Parameters:
  
  corpus - the training corpus, as an RDD of element lists
- fitSequences
```
public void fitSequences(org.apache.spark.api.java.JavaRDD<Sequence<T>> corpus)
```
  Base training entry point.
  
  Parameters:
  
  corpus - the training corpus, as an RDD of sequences
- buildShallowVocabCache
```
protected VocabCache<ShallowSequenceElement> buildShallowVocabCache(Counter<java.lang.Long> counter)
```
  Builds the shallow vocabulary and Huffman tree.
  
  Parameters:
  
  counter - counter of element frequencies
  
  Returns:
  
  the shallow vocabulary cache built from the counter
- getCounter
```
protected Counter<java.lang.Long> getCounter()
```
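
buildShallowVocabCache is described above as building a vocabulary and a Huffman tree from a frequency counter. The following is a minimal standalone sketch of that general idea, using only the Java standard library. It is not DL4J's implementation: the class HuffmanSketch, its nested Node type, and the codeLengths method are hypothetical names introduced for illustration, and a plain Map of element id to frequency stands in for the Counter.

```java
import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;
import java.util.PriorityQueue;

// Standalone illustration; HuffmanSketch and its members are hypothetical, not DL4J classes.
class HuffmanSketch {

    // A leaf holds an element id; an inner node holds two children and a null id.
    private static final class Node {
        final long weight;
        final Long elementId;
        final Node left;
        final Node right;

        Node(long weight, Long elementId, Node left, Node right) {
            this.weight = weight;
            this.elementId = elementId;
            this.left = left;
            this.right = right;
        }
    }

    // Returns the Huffman code length (leaf depth) for each element id.
    static Map<Long, Integer> codeLengths(Map<Long, Long> frequencies) {
        PriorityQueue<Node> queue =
                new PriorityQueue<>(Comparator.comparingLong((Node n) -> n.weight));
        for (Map.Entry<Long, Long> e : frequencies.entrySet()) {
            queue.add(new Node(e.getValue(), e.getKey(), null, null));
        }
        // Repeatedly merge the two least frequent nodes until one root remains.
        while (queue.size() > 1) {
            Node a = queue.poll();
            Node b = queue.poll();
            queue.add(new Node(a.weight + b.weight, null, a, b));
        }
        Map<Long, Integer> lengths = new HashMap<>();
        if (!queue.isEmpty()) {
            assign(queue.poll(), 0, lengths);
        }
        return lengths;
    }

    // Walks the tree and records the depth of every leaf.
    private static void assign(Node node, int depth, Map<Long, Integer> out) {
        if (node.elementId != null) {
            // A vocabulary with a single element still gets a one-bit code.
            out.put(node.elementId, Math.max(depth, 1));
            return;
        }
        assign(node.left, depth + 1, out);
        assign(node.right, depth + 1, out);
    }

    public static void main(String[] args) {
        Map<Long, Long> frequencies = new HashMap<>();
        frequencies.put(1L, 50L);
        frequencies.put(2L, 30L);
        frequencies.put(3L, 15L);
        frequencies.put(4L, 5L);
        System.out.println(codeLengths(frequencies));
    }
}
```

In this sketch the most frequent element ends up closest to the root and so receives the shortest code, which is the property a Huffman tree over a vocabulary is built for.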
