public class SparkParagraphVectors extends SparkSequenceVectors<VocabWord>
Nested classes inherited from class SparkSequenceVectors: SparkSequenceVectors.Builder<T extends SequenceElement>

Nested classes inherited from class SequenceVectors: SequenceVectors.AsyncSequencer

Fields inherited from class SparkSequenceVectors: configurationBroadcast, ela, elementsFreqAccum, elementsFreqAccumExtra, exporter, isAutoDiscoveryMode, isEnvironmentReady, paramServerConfiguration, shallowVocabCache, shallowVocabCacheBroadcast, sla, storageLevel, vocabCacheBroadcast

Fields inherited from class SequenceVectors: configuration, configured, elementsLearningAlgorithm, enableScavenger, eventListeners, existingModel, iterator, log, scoreElements, scoreSequences, sequenceLearningAlgorithm, unknownElement

Fields inherited from class WordVectorsImpl: batchSize, DEFAULT_UNK, layerSize, learningRate, learningRateDecayWords, lookupTable, minLearningRate, minWordFrequency, modelUtils, negative, numEpochs, numIterations, resetModel, sampling, seed, stopWords, trainElementsVectors, trainSequenceVectors, useAdeGrad, useUnknown, variableWindows, vocab, window, workers

| Modifier | Constructor and Description |
|---|---|
| protected | SparkParagraphVectors() |

| Modifier and Type | Method and Description |
|---|---|
| void | fitLabelledDocuments(org.apache.spark.api.java.JavaRDD<LabelledDocument> documentsRdd): builds a ParagraphVectors model from a JavaRDD of LabelledDocument. |
| void | fitMultipleFiles(org.apache.spark.api.java.JavaPairRDD<java.lang.String,java.lang.String> documentsRdd): builds a ParagraphVectors model from a JavaPairRDD whose key is the label and whose value is the document as a single string. |
| protected VocabCache<ShallowSequenceElement> | getShallowVocabCache() |
| protected void | validateConfiguration() |

Methods inherited from class SparkSequenceVectors: broadcastEnvironment, buildShallowVocabCache, fit, fitLists, fitSequences, getCounter

Methods inherited from class SequenceVectors: buildVocab, getElementsScore, getSequencesScore, getUNK, getWordVectorMatrix, initLearners, trainSequence

Methods inherited from class WordVectorsImpl: accuracy, getLayerSize, getWordVector, getWordVectorMatrixNormalized, getWordVectors, getWordVectorsMean, hasWord, indexOf, lookupTable, setLookupTable, setModelUtils, setVocab, similarity, similarWordsInVocabTo, update, update, vocab, wordsNearest, wordsNearest, wordsNearest, wordsNearestSum, wordsNearestSum, wordsNearestSum

Methods inherited from class java.lang.Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Methods inherited from interface WordVectors: accuracy, getWordVector, getWordVectorMatrixNormalized, getWordVectors, getWordVectorsMean, hasWord, indexOf, lookupTable, setModelUtils, setUNK, similarity, similarWordsInVocabTo, vocab, wordsNearest, wordsNearest, wordsNearest, wordsNearestSum, wordsNearestSum, wordsNearestSum

protected VocabCache<ShallowSequenceElement> getShallowVocabCache()
Overrides: getShallowVocabCache in class SparkSequenceVectors<VocabWord>

protected void validateConfiguration()
Overrides: validateConfiguration in class SparkSequenceVectors<VocabWord>

public void fitMultipleFiles(org.apache.spark.api.java.JavaPairRDD<java.lang.String,java.lang.String> documentsRdd)
Builds a ParagraphVectors model from a JavaPairRDD whose key is the label and whose value is the document as a single string.
Parameters: documentsRdd - pair RDD with the label as key and the document text as value

public void fitLabelledDocuments(org.apache.spark.api.java.JavaRDD<LabelledDocument> documentsRdd)
Builds a ParagraphVectors model from a JavaRDD of LabelledDocument.
Parameters: documentsRdd - RDD of LabelledDocument instances
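
The input shape that fitMultipleFiles expects (label as key, whole document as value) can be illustrated without Spark. The sketch below is a plain-Java stand-in, not part of this API: the class LabelledPairsSketch, its methods, and the convention of using the parent directory name as the label are all hypothetical. In Spark, JavaSparkContext.wholeTextFiles yields pairs of the same String,String type keyed by file path, so a similar re-keying step might be applied before calling fitMultipleFiles. Whether a raw file path is acceptable as a label is not stated here.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/**
 * Plain-Java stand-in for the (label, document) pairs described for fitMultipleFiles.
 * No Spark or DL4J types are used; every name in this class is hypothetical.
 */
public class LabelledPairsSketch {

    /** Derives a label from a file path by taking the parent directory name (an assumed convention). */
    public static String labelFromPath(String path) {
        String[] parts = path.split("/");
        // With fewer than two segments there is no parent directory to use as a label.
        return parts.length >= 2 ? parts[parts.length - 2] : "";
    }

    /** Re-keys (path, content) pairs to (label, content) pairs, preserving order. */
    public static List<Map.Entry<String, String>> toLabelledPairs(List<Map.Entry<String, String>> files) {
        List<Map.Entry<String, String>> out = new ArrayList<>();
        for (Map.Entry<String, String> file : files) {
            out.add(new SimpleEntry<>(labelFromPath(file.getKey()), file.getValue()));
        }
        return out;
    }

    public static void main(String[] args) {
        // Made-up sample data: each entry is (file path, whole document as one string).
        List<Map.Entry<String, String>> files = new ArrayList<>();
        files.add(new SimpleEntry<>("/data/sports/a.txt", "the match ended in a draw"));
        files.add(new SimpleEntry<>("/data/finance/b.txt", "shares rose after the report"));

        for (Map.Entry<String, String> pair : toLabelledPairs(files)) {
            System.out.println(pair.getKey() + " -> " + pair.getValue());
        }
    }
}
```

Each resulting entry holds one label and one complete document, which mirrors the key/value layout described for the JavaPairRDD argument.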