public static class Glove.Builder extends SequenceVectors.Builder<VocabWord>
Modifier and Type | Field and Description |
---|---|
protected double | alpha |
protected DocumentIterator | documentIterator |
protected SentenceIterator | sentenceIterator |
protected TokenizerFactory | tokenFactory |
Fields inherited from class SequenceVectors.Builder: batchSize, configuration, elementsLearningAlgorithm, enableScavenger, existingVectors, hugeModelExpected, iterations, iterator, layerSize, learningRate, learningRateDecayWords, lookupTable, minLearningRate, minWordFrequency, modelUtils, negative, numEpochs, preciseWeightInit, resetModel, sampling, seed, sequenceLearningAlgorithm, STOP, stopWords, trainElementsVectors, trainSequenceVectors, UNK, unknownElement, useAdaGrad, useHierarchicSoftmax, useUnknown, variableWindows, vectorsListeners, vocabCache, window, workers
Constructor and Description |
---|
Builder() |
Builder(VectorsConfiguration configuration) |
Modifier and Type | Method and Description |
---|---|
Glove.Builder | alpha(double alpha): Parameter in the exponent of the weighting function; default 0.75. |
Glove.Builder | batchSize(int batchSize): Specifies the minibatch size for the training process. |
Glove | build(): Builds the SequenceVectors instance with the defined settings and options. |
Glove.Builder | epochs(int numEpochs): Sets the number of iterations over the training corpus during training. |
Glove.Builder | iterate(DocumentIterator iterator) |
Glove.Builder | iterate(SentenceIterator iterator) |
Glove.Builder | iterate(SequenceIterator<VocabWord> iterator): Defines the SequenceIterator to be used for model building. |
Glove.Builder | iterations(int iterations): Iterations and epochs are the same in the GloVe implementation. |
Glove.Builder | layerSize(int layerSize): Defines the number of dimensions of the output vectors. |
Glove.Builder | learningRate(double learningRate): Defines the initial learning rate. |
Glove.Builder | lookupTable(WeightLookupTable<VocabWord> lookupTable): Accepts an externally built WeightLookupTable containing model weights and vocabulary. |
Glove.Builder | maxMemory(int gbytes): Specifies the maximum memory available to the CoOccurrence map builder. |
Glove.Builder | minLearningRate(double minLearningRate): Defines the minimum learning rate after decay is applied. |
Glove.Builder | minWordFrequency(int minWordFrequency): Sets the minimum word frequency used during vocabulary building. |
Glove.Builder | modelUtils(ModelUtils<VocabWord> modelUtils): Sets the ModelUtils that will be used as the provider for utility methods: similarity(), wordsNearest(), accuracy(), etc. |
Glove.Builder | negativeSample(double negative): Deprecated. |
Glove.Builder | resetModel(boolean reallyReset): Defines whether the whole model should be reset before training. |
Glove.Builder | sampling(double sampling): Deprecated. |
Glove.Builder | seed(long randomSeed): Sets the seed for the random number generator. |
Glove.Builder | setVectorsListeners(java.util.Collection<VectorsListener<VocabWord>> vectorsListeners): Sets the VectorsListeners for this SequenceVectors model. |
Glove.Builder | shuffle(boolean reallyShuffle): Specifies whether the co-occurrences list should be shuffled between training epochs. |
Glove.Builder | stopWords(java.util.Collection<VocabWord> stopList): Accepts a collection of objects to be ignored and excluded from the model. Note: object labels and hashCode are used for filtering. |
Glove.Builder | stopWords(java.util.List<java.lang.String> stopList): Accepts a collection of objects to be ignored and excluded from the model. Note: object labels and hashCode are used for filtering. |
Glove.Builder | symmetric(boolean reallySymmetric): Specifies whether the co-occurrences list should be built in both directions from the current word. |
Glove.Builder | tokenizerFactory(TokenizerFactory tokenizerFactory): Sets the TokenizerFactory to be used for training. |
Glove.Builder | trainElementsRepresentation(boolean trainElements) |
Glove.Builder | trainSequencesRepresentation(boolean trainSequences): Deprecated. |
Glove.Builder | unknownElement(VocabWord element): Specifies the SequenceElement that will be used as the UNK element, if UNK is used. |
Glove.Builder | useAdaGrad(boolean reallyUse): Defines whether Adaptive Gradients (AdaGrad) should be used in calculations. |
Glove.Builder | useExistingWordVectors(WordVectors vec): This method has no effect for GloVe. |
Glove.Builder | useUnknown(boolean reallyUse): Specifies whether the UNK word should be used internally. |
Glove.Builder | useVariableWindow(int... windows): This method has no effect for GloVe. |
Glove.Builder | vocabCache(VocabCache<VocabWord> vocabCache): Accepts an externally built VocabCache object containing the vocabulary. |
Glove.Builder | windowSize(int windowSize): Sets the window size for skip-gram training. |
Glove.Builder | workers(int numWorkers): Sets the number of worker threads to be used in calculations. |
Glove.Builder | xMax(double xMax): Parameter specifying the cutoff in the weighting function; default 100.0. |
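The alpha and xMax parameters above match the weighting function of the GloVe formulation: f(x) = (x / xMax)^alpha for x < xMax, and 1 otherwise. The following is a minimal, self-contained sketch of that function, included only to show what the two parameters control. The class name GloveWeightingSketch and its weight method are hypothetical and are not part of the Glove.Builder API; the library's internal implementation may differ.

```java
/** Illustrative sketch of the GloVe weighting function; not taken from the library sources. */
final class GloveWeightingSketch {

    /**
     * Computes f(x) = (x / xMax)^alpha for x below the cutoff, and 1.0 at or above it.
     *
     * @param cooccurrenceCount x, the co-occurrence count of a word pair
     * @param xMax              cutoff, mirroring Glove.Builder.xMax (documented default 100.0)
     * @param alpha             exponent, mirroring Glove.Builder.alpha (documented default 0.75)
     */
    static double weight(double cooccurrenceCount, double xMax, double alpha) {
        if (cooccurrenceCount >= xMax) {
            return 1.0; // frequent pairs are capped so they do not dominate the loss
        }
        return Math.pow(cooccurrenceCount / xMax, alpha);
    }

    public static void main(String[] args) {
        double xMax = 100.0;  // documented default
        double alpha = 0.75;  // documented default
        for (double x : new double[] {1.0, 10.0, 50.0, 100.0, 1000.0}) {
            System.out.printf("f(%.0f) = %.4f%n", x, weight(x, xMax, alpha));
        }
    }
}
```

Under these assumptions, raising xMax pushes the cap further out, and lowering alpha flattens the curve so that rare pairs receive relatively more weight.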
Methods inherited from class SequenceVectors.Builder: elementsLearningAlgorithm, elementsLearningAlgorithm, enableScavenger, presetTables, sequenceLearningAlgorithm, sequenceLearningAlgorithm, useHierarchicSoftmax, usePreciseWeightInit
protected double alpha
protected TokenizerFactory tokenFactory
protected SentenceIterator sentenceIterator
protected DocumentIterator documentIterator
public Builder()
public Builder(@NonNull VectorsConfiguration configuration)
public Glove.Builder useExistingWordVectors(@NonNull WordVectors vec)
Overrides: useExistingWordVectors in class SequenceVectors.Builder<VocabWord>
Parameters: vec - existing WordVectors model
public Glove.Builder iterate(@NonNull SequenceIterator<VocabWord> iterator)
Description copied from class: SequenceVectors.Builder
Overrides: iterate in class SequenceVectors.Builder<VocabWord>
public Glove.Builder batchSize(int batchSize)
Overrides: batchSize in class SequenceVectors.Builder<VocabWord>
Parameters: batchSize
public Glove.Builder iterations(int iterations)
Overrides: iterations in class SequenceVectors.Builder<VocabWord>
Parameters: iterations
public Glove.Builder epochs(int numEpochs)
Overrides: epochs in class SequenceVectors.Builder<VocabWord>
Parameters: numEpochs
public Glove.Builder useAdaGrad(boolean reallyUse)
Description copied from class: SequenceVectors.Builder
Overrides: useAdaGrad in class SequenceVectors.Builder<VocabWord>
public Glove.Builder layerSize(int layerSize)
Description copied from class: SequenceVectors.Builder
Overrides: layerSize in class SequenceVectors.Builder<VocabWord>
public Glove.Builder learningRate(double learningRate)
Description copied from class: SequenceVectors.Builder
Overrides: learningRate in class SequenceVectors.Builder<VocabWord>
public Glove.Builder minWordFrequency(int minWordFrequency)
Overrides: minWordFrequency in class SequenceVectors.Builder<VocabWord>
Parameters: minWordFrequency
public Glove.Builder minLearningRate(double minLearningRate)
Description copied from class: SequenceVectors.Builder
Overrides: minLearningRate in class SequenceVectors.Builder<VocabWord>
public Glove.Builder resetModel(boolean reallyReset)
Description copied from class: SequenceVectors.Builder
Overrides: resetModel in class SequenceVectors.Builder<VocabWord>
public Glove.Builder vocabCache(@NonNull VocabCache<VocabWord> vocabCache)
Description copied from class: SequenceVectors.Builder
Overrides: vocabCache in class SequenceVectors.Builder<VocabWord>
public Glove.Builder lookupTable(@NonNull WeightLookupTable<VocabWord> lookupTable)
Description copied from class: SequenceVectors.Builder
Overrides: lookupTable in class SequenceVectors.Builder<VocabWord>
@Deprecated public Glove.Builder sampling(double sampling)
Description copied from class: SequenceVectors.Builder
Overrides: sampling in class SequenceVectors.Builder<VocabWord>
@Deprecated public Glove.Builder negativeSample(double negative)
Description copied from class: SequenceVectors.Builder
Overrides: negativeSample in class SequenceVectors.Builder<VocabWord>
public Glove.Builder stopWords(@NonNull java.util.List<java.lang.String> stopList)
Description copied from class: SequenceVectors.Builder
Overrides: stopWords in class SequenceVectors.Builder<VocabWord>
public Glove.Builder trainElementsRepresentation(boolean trainElements)
Overrides: trainElementsRepresentation in class SequenceVectors.Builder<VocabWord>
@Deprecated public Glove.Builder trainSequencesRepresentation(boolean trainSequences)
Overrides: trainSequencesRepresentation in class SequenceVectors.Builder<VocabWord>
public Glove.Builder stopWords(@NonNull java.util.Collection<VocabWord> stopList)
Description copied from class: SequenceVectors.Builder
Overrides: stopWords in class SequenceVectors.Builder<VocabWord>
public Glove.Builder windowSize(int windowSize)
Description copied from class: SequenceVectors.Builder
Overrides: windowSize in class SequenceVectors.Builder<VocabWord>
public Glove.Builder seed(long randomSeed)
Description copied from class: SequenceVectors.Builder
Overrides: seed in class SequenceVectors.Builder<VocabWord>
public Glove.Builder workers(int numWorkers)
Description copied from class: SequenceVectors.Builder
Overrides: workers in class SequenceVectors.Builder<VocabWord>
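The seed(long) and shuffle(boolean) settings interact: shuffling the co-occurrence list between epochs is only reproducible when the random number generator is seeded. The sketch below illustrates that idea with plain JDK classes. It is an assumption about the general technique, not a description of how Glove applies these settings internally; EpochShuffleSketch and epochOrders are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Random;

/** Illustrative sketch of seeded shuffling between epochs; not taken from the library sources. */
final class EpochShuffleSketch {

    /**
     * Returns the order in which the given pairs would be visited in each epoch.
     * The input list is copied and never modified.
     */
    static List<List<String>> epochOrders(List<String> pairs, int numEpochs,
                                          boolean shuffle, long seed) {
        Random random = new Random(seed);          // one generator for the whole run
        List<String> working = new ArrayList<>(pairs);
        List<List<String>> orders = new ArrayList<>();
        for (int epoch = 0; epoch < numEpochs; epoch++) {
            if (shuffle) {
                Collections.shuffle(working, random); // reorder before each epoch
            }
            orders.add(new ArrayList<>(working));
        }
        return orders;
    }

    public static void main(String[] args) {
        List<String> pairs = Arrays.asList("cat sat", "sat on", "on the", "the mat");
        List<List<String>> orders = epochOrders(pairs, 3, true, 42L);
        for (int epoch = 0; epoch < orders.size(); epoch++) {
            System.out.println("epoch " + epoch + ": " + orders.get(epoch));
        }
    }
}
```

Running the sketch twice with the same seed yields the same sequence of orders, which is the property a fixed seed is meant to provide.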
public Glove.Builder tokenizerFactory(@NonNull TokenizerFactory tokenizerFactory)
Parameters: tokenizerFactory
public Glove.Builder xMax(double xMax)
Parameters: xMax
public Glove.Builder symmetric(boolean reallySymmetric)
Parameters: reallySymmetric
public Glove.Builder shuffle(boolean reallyShuffle)
Parameters: reallyShuffle
public Glove.Builder useVariableWindow(int... windows)
Overrides: useVariableWindow in class SequenceVectors.Builder<VocabWord>
Parameters: windows
public Glove.Builder alpha(double alpha)
Parameters: alpha
public Glove.Builder iterate(@NonNull SentenceIterator iterator)
public Glove.Builder iterate(@NonNull DocumentIterator iterator)
public Glove.Builder modelUtils(@NonNull ModelUtils<VocabWord> modelUtils)
Overrides: modelUtils in class SequenceVectors.Builder<VocabWord>
Parameters: modelUtils - model utils to be used
public Glove.Builder setVectorsListeners(@NonNull java.util.Collection<VectorsListener<VocabWord>> vectorsListeners)
Overrides: setVectorsListeners in class SequenceVectors.Builder<VocabWord>
Parameters: vectorsListeners
public Glove.Builder maxMemory(int gbytes)
Parameters: gbytes - memory limit, in gigabytes
public Glove.Builder unknownElement(VocabWord element)
Overrides: unknownElement in class SequenceVectors.Builder<VocabWord>
Parameters: element
public Glove.Builder useUnknown(boolean reallyUse)
Overrides: useUnknown in class SequenceVectors.Builder<VocabWord>
Parameters: reallyUse
public Glove build()
Description copied from class: SequenceVectors.Builder
Overrides: build in class SequenceVectors.Builder<VocabWord>
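To make the effect of windowSize(int) and symmetric(boolean) concrete, the sketch below counts word co-occurrences inside a context window using plain JDK collections. It is an illustration under stated assumptions, not the library's CoOccurrence builder: the class CooccurrenceSketch is hypothetical, counts are unweighted, and the non-symmetric case is assumed to look only at words to the left of the current word.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative sketch of windowed co-occurrence counting; not taken from the library sources. */
final class CooccurrenceSketch {

    /**
     * Counts (current word, context word) pairs within the given window.
     *
     * @param symmetric if true, context is taken from both sides of the current word;
     *                  if false, only from the left (an assumption made for this sketch)
     */
    static Map<String, Integer> count(List<String> tokens, int windowSize, boolean symmetric) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (int i = 0; i < tokens.size(); i++) {
            int start = Math.max(0, i - windowSize);
            int end = symmetric ? Math.min(tokens.size() - 1, i + windowSize) : i - 1;
            for (int j = start; j <= end; j++) {
                if (j == i) {
                    continue; // a word is not its own context
                }
                String pair = tokens.get(i) + " " + tokens.get(j);
                counts.merge(pair, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("the", "cat", "sat", "on", "the", "mat");
        System.out.println("symmetric:  " + count(tokens, 2, true));
        System.out.println("left only:  " + count(tokens, 2, false));
    }
}
```

In this sketch the symmetric setting roughly doubles the number of recorded pairs, which is one reason a memory limit such as maxMemory(int) can matter when building the co-occurrence map for a large corpus.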