public static class SequenceVectors.Builder<T extends SequenceElement>
extends java.lang.Object
Modifier and Type | Field and Description |
---|---|
protected int | batchSize |
protected VectorsConfiguration | configuration |
protected ElementsLearningAlgorithm<T> | elementsLearningAlgorithm |
protected boolean | enableScavenger |
protected WordVectors | existingVectors |
protected boolean | hugeModelExpected |
protected int | iterations |
protected SequenceIterator<T> | iterator |
protected int | layerSize |
protected double | learningRate |
protected int | learningRateDecayWords |
protected WeightLookupTable<T> | lookupTable |
protected double | minLearningRate |
protected int | minWordFrequency |
protected ModelUtils<T> | modelUtils |
protected double | negative |
protected int | numEpochs |
protected boolean | preciseWeightInit |
protected boolean | resetModel |
protected double | sampling |
protected long | seed |
protected SequenceLearningAlgorithm<T> | sequenceLearningAlgorithm |
protected java.lang.String | STOP |
protected java.util.Collection<java.lang.String> | stopWords |
protected boolean | trainElementsVectors |
protected boolean | trainSequenceVectors |
protected java.lang.String | UNK |
protected T | unknownElement |
protected boolean | useAdaGrad |
protected boolean | useHierarchicSoftmax |
protected boolean | useUnknown |
protected int[] | variableWindows |
protected java.util.Set<VectorsListener<T>> | vectorsListeners |
protected VocabCache<T> | vocabCache |
protected int | window |
protected int | workers |
Constructor and Description |
---|
Builder() |
Builder(VectorsConfiguration configuration) |
Modifier and Type | Method and Description |
---|---|
SequenceVectors.Builder<T> | batchSize(int batchSize): Defines the batch size; this option applies only if iterations > 1. |
SequenceVectors<T> | build(): Builds a SequenceVectors instance with the defined settings and options. |
SequenceVectors.Builder<T> | elementsLearningAlgorithm(ElementsLearningAlgorithm<T> algorithm): Sets the given learning algorithm as the elements learning algorithm. |
SequenceVectors.Builder<T> | elementsLearningAlgorithm(java.lang.String algoName): Sets the learning algorithm with the given fully qualified class name as the elements learning algorithm. |
SequenceVectors.Builder<T> | enableScavenger(boolean reallyEnable): Enables or disables periodic vocabulary truncation during construction. Default value: disabled. |
SequenceVectors.Builder<T> | epochs(int numEpochs): Defines how many iterations should be done over the whole training corpus during modelling. |
SequenceVectors.Builder<T> | iterate(SequenceIterator<T> iterator): Defines the SequenceIterator to be used for model building. |
SequenceVectors.Builder<T> | iterations(int iterations): Defines how many iterations should be done over batched sequences. |
SequenceVectors.Builder<T> | layerSize(int layerSize): Defines the number of dimensions for the output vectors. |
SequenceVectors.Builder<T> | learningRate(double learningRate): Defines the initial learning rate. |
SequenceVectors.Builder<T> | lookupTable(WeightLookupTable<T> lookupTable): Accepts an externally built WeightLookupTable containing model weights and vocabulary. |
SequenceVectors.Builder<T> | minLearningRate(double minLearningRate): Defines the minimum learning rate after decay is applied. |
SequenceVectors.Builder<T> | minWordFrequency(int minWordFrequency): Defines the minimum frequency for elements found in the training corpus. |
SequenceVectors.Builder<T> | modelUtils(ModelUtils<T> modelUtils): Sets the ModelUtils implementation that will be used to access the model. |
SequenceVectors.Builder<T> | negativeSample(double negative): Defines the negative sampling value for the skip-gram algorithm. |
protected void | presetTables(): Creates a new WeightLookupTable. |
SequenceVectors.Builder<T> | resetModel(boolean reallyReset): Defines whether the whole model should be reset before training. |
SequenceVectors.Builder<T> | sampling(double sampling): Defines the sub-sampling threshold. |
SequenceVectors.Builder<T> | seed(long randomSeed): Sets the seed for the random number generator. |
SequenceVectors.Builder<T> | sequenceLearningAlgorithm(SequenceLearningAlgorithm<T> algorithm): Sets the given learning algorithm as the sequence learning algorithm. |
SequenceVectors.Builder<T> | sequenceLearningAlgorithm(java.lang.String algoName): Sets the learning algorithm with the given fully qualified class name as the sequence learning algorithm. |
SequenceVectors.Builder<T> | setVectorsListeners(java.util.Collection<VectorsListener<T>> listeners): Sets the VectorsListeners for this SequenceVectors model. |
SequenceVectors.Builder<T> | stopWords(java.util.Collection<T> stopList): Provides a collection of objects to be ignored and excluded from the model. Note: object labels and hashCode are used for filtering. |
SequenceVectors.Builder<T> | stopWords(java.util.List<java.lang.String> stopList): Provides a collection of objects to be ignored and excluded from the model. Note: object labels and hashCode are used for filtering. |
SequenceVectors.Builder<T> | trainElementsRepresentation(boolean trainElements) |
SequenceVectors.Builder<T> | trainSequencesRepresentation(boolean trainSequences) |
SequenceVectors.Builder<T> | unknownElement(T element): Specifies the SequenceElement that will be used as the UNK element, if UNK is used. |
SequenceVectors.Builder<T> | useAdaGrad(boolean reallyUse): Deprecated. |
protected SequenceVectors.Builder<T> | useExistingWordVectors(WordVectors vec): Allows a pre-built WordVectors model (SkipGram or GloVe) to be used for DBOW sequence learning. |
SequenceVectors.Builder<T> | useHierarchicSoftmax(boolean reallyUse): Enables or disables hierarchic softmax. |
SequenceVectors.Builder<T> | usePreciseWeightInit(boolean reallyUse): If set to true, initial weights for elements/sequences will be derived from the elements themselves. |
SequenceVectors.Builder<T> | useUnknown(boolean reallyUse): Specifies whether the UNK word should be used internally. |
SequenceVectors.Builder<T> | useVariableWindow(int... windows): Allows a variable window size to be used. |
SequenceVectors.Builder<T> | vocabCache(VocabCache<T> vocabCache): Accepts an externally built VocabCache object containing the vocabulary. |
SequenceVectors.Builder<T> | windowSize(int windowSize): Sets the window size for skip-gram training. |
SequenceVectors.Builder<T> | workers(int numWorkers): Sets the number of worker threads to be used in calculations. |
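Every public setter in the method summary above returns `SequenceVectors.Builder<T>`, so calls can be chained and finished with `build()`. The sketch below is a minimal, self-contained illustration of that chaining convention only. `MiniVectorsBuilder`, `MiniConfig`, the sanity check in `build()`, and the default values are hypothetical stand-ins, not the library's classes, behaviour, or defaults.

```java
// Hypothetical stand-in for the object produced by build(); not a library class.
final class MiniConfig {
    final int layerSize;
    final int windowSize;
    final double learningRate;
    final double minLearningRate;

    MiniConfig(int layerSize, int windowSize, double learningRate, double minLearningRate) {
        this.layerSize = layerSize;
        this.windowSize = windowSize;
        this.learningRate = learningRate;
        this.minLearningRate = minLearningRate;
    }
}

// Hypothetical builder that mirrors the fluent style of the setters listed above.
final class MiniVectorsBuilder {
    // Illustrative defaults only; the real defaults are not stated in this document.
    private int layerSize = 100;
    private int windowSize = 5;
    private double learningRate = 0.025;
    private double minLearningRate = 0.0001;

    // Each setter stores its value and returns the builder so calls can be chained.
    MiniVectorsBuilder layerSize(int layerSize) {
        this.layerSize = layerSize;
        return this;
    }

    MiniVectorsBuilder windowSize(int windowSize) {
        this.windowSize = windowSize;
        return this;
    }

    MiniVectorsBuilder learningRate(double learningRate) {
        this.learningRate = learningRate;
        return this;
    }

    MiniVectorsBuilder minLearningRate(double minLearningRate) {
        this.minLearningRate = minLearningRate;
        return this;
    }

    // Assumed sanity check: the floor must not exceed the initial rate.
    MiniConfig build() {
        if (minLearningRate > learningRate) {
            throw new IllegalStateException("minLearningRate must not exceed learningRate");
        }
        return new MiniConfig(layerSize, windowSize, learningRate, minLearningRate);
    }
}
```

A chained call such as `new MiniVectorsBuilder().layerSize(150).windowSize(7).build()` overrides two options and leaves the rest at their illustrative defaults.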
protected VocabCache<T> vocabCache
protected WeightLookupTable<T> lookupTable
protected SequenceIterator<T> iterator
protected ModelUtils<T> modelUtils
protected WordVectors existingVectors
protected double sampling
protected double negative
protected double learningRate
protected double minLearningRate
protected int minWordFrequency
protected int iterations
protected int numEpochs
protected int layerSize
protected int window
protected boolean hugeModelExpected
protected int batchSize
protected int learningRateDecayWords
protected long seed
protected boolean useAdaGrad
protected boolean resetModel
protected int workers
protected boolean useUnknown
protected boolean useHierarchicSoftmax
protected int[] variableWindows
protected boolean trainSequenceVectors
protected boolean trainElementsVectors
protected boolean preciseWeightInit
protected java.util.Collection<java.lang.String> stopWords
protected VectorsConfiguration configuration
protected transient T unknownElement
protected java.lang.String UNK
protected java.lang.String STOP
protected boolean enableScavenger
protected ElementsLearningAlgorithm<T> elementsLearningAlgorithm
protected SequenceLearningAlgorithm<T> sequenceLearningAlgorithm
protected java.util.Set<VectorsListener<T>> vectorsListeners
public Builder()
public Builder(@NonNull VectorsConfiguration configuration)
protected SequenceVectors.Builder<T> useExistingWordVectors(@NonNull WordVectors vec)
vec - existing WordVectors model
public SequenceVectors.Builder<T> iterate(@NonNull SequenceIterator<T> iterator)
iterator - SequenceIterator to be used for model building
public SequenceVectors.Builder<T> sequenceLearningAlgorithm(@NonNull java.lang.String algoName)
algoName - fully qualified class name
public SequenceVectors.Builder<T> sequenceLearningAlgorithm(@NonNull SequenceLearningAlgorithm<T> algorithm)
algorithm - SequenceLearningAlgorithm implementation
public SequenceVectors.Builder<T> elementsLearningAlgorithm(@NonNull java.lang.String algoName)
algoName - fully qualified class name
public SequenceVectors.Builder<T> elementsLearningAlgorithm(@NonNull ElementsLearningAlgorithm<T> algorithm)
algorithm - ElementsLearningAlgorithm implementation
public SequenceVectors.Builder<T> batchSize(int batchSize)
batchSize - batch size; applies only if iterations > 1
public SequenceVectors.Builder<T> iterations(int iterations)
iterations - number of iterations over batched sequences
public SequenceVectors.Builder<T> epochs(int numEpochs)
numEpochs - number of iterations over the whole training corpus
public SequenceVectors.Builder<T> workers(int numWorkers)
numWorkers - number of worker threads to be used in calculations
public SequenceVectors.Builder<T> useHierarchicSoftmax(boolean reallyUse)
reallyUse - true to enable hierarchic softmax, false to disable it
@Deprecated public SequenceVectors.Builder<T> useAdaGrad(boolean reallyUse)
reallyUse - whether AdaGrad should be used
public SequenceVectors.Builder<T> layerSize(int layerSize)
layerSize - number of dimensions for the output vectors
public SequenceVectors.Builder<T> learningRate(double learningRate)
learningRate - initial learning rate
public SequenceVectors.Builder<T> minWordFrequency(int minWordFrequency)
minWordFrequency - minimum frequency for elements found in the training corpus
public SequenceVectors.Builder<T> minLearningRate(double minLearningRate)
minLearningRate - minimum learning rate after decay is applied
public SequenceVectors.Builder<T> resetModel(boolean reallyReset)
reallyReset - whether the whole model should be reset before training
public SequenceVectors.Builder<T> vocabCache(@NonNull VocabCache<T> vocabCache)
vocabCache - externally built VocabCache containing the vocabulary
public SequenceVectors.Builder<T> lookupTable(@NonNull WeightLookupTable<T> lookupTable)
lookupTable - externally built WeightLookupTable containing model weights and vocabulary
public SequenceVectors.Builder<T> sampling(double sampling)
sampling - sub-sampling threshold
public SequenceVectors.Builder<T> negativeSample(double negative)
negative - negative sampling value for the skip-gram algorithm
public SequenceVectors.Builder<T> stopWords(@NonNull java.util.List<java.lang.String> stopList)
stopList - collection of objects to be ignored and excluded from the model
public SequenceVectors.Builder<T> trainElementsRepresentation(boolean trainElements)
trainElements - whether element representations should be trained
public SequenceVectors.Builder<T> trainSequencesRepresentation(boolean trainSequences)
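The `learningRate` and `minLearningRate` parameters above describe an initial rate and a floor that holds after decay. As a rough illustration, the sketch below assumes a word2vec-style linear schedule in which the rate shrinks with the fraction of the corpus already processed. The class name, method name, and the exact schedule are assumptions made for this example and are not taken from the library source.

```java
// Hypothetical helper illustrating linear learning-rate decay with a floor.
final class LearningRateDecaySketch {

    /**
     * Returns the decayed rate, assuming a linear schedule:
     * max(minLearningRate, learningRate * (1 - progress)),
     * where progress is the fraction of words processed, clamped to [0, 1].
     */
    static double decayedRate(double learningRate, double minLearningRate,
                              long wordsProcessed, long totalWords) {
        if (totalWords <= 0) {
            throw new IllegalArgumentException("totalWords must be positive");
        }
        double progress = (double) wordsProcessed / totalWords;
        progress = Math.min(1.0, Math.max(0.0, progress));
        return Math.max(minLearningRate, learningRate * (1.0 - progress));
    }
}
```

Under this assumed schedule the rate starts at `learningRate`, falls in proportion to progress, and never drops below `minLearningRate`.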
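public SequenceVectors.Builder<T> stopWords(@NonNull java.util.Collection<T> stopList)
stopList - collection of objects to be ignored and excluded from the model
public SequenceVectors.Builder<T> windowSize(int windowSize)
windowSize - window size for skip-gram training
public SequenceVectors.Builder<T> seed(long randomSeed)
randomSeed - seed for the random number generator
public SequenceVectors.Builder<T> modelUtils(@NonNull ModelUtils<T> modelUtils)
modelUtils - model utils to be used
public SequenceVectors.Builder<T> useUnknown(boolean reallyUse)
reallyUse - whether the UNK word should be used internally
public SequenceVectors.Builder<T> unknownElement(@NonNull T element)
element - SequenceElement to be used as the UNK element, if UNK is used
public SequenceVectors.Builder<T> useVariableWindow(int... windows)
windows - variable window sizes
public SequenceVectors.Builder<T> usePreciseWeightInit(boolean reallyUse)
reallyUse - if true, initial weights for elements/sequences are derived from the elements themselves
protected void presetTables()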
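The `stopWords`, `useUnknown`, and `unknownElement` options together influence which elements reach the model. The sketch below shows one plausible reading using plain strings: stop words are dropped, and elements missing from the vocabulary are replaced by an UNK label when that option is on, or dropped otherwise. `SequenceFilterSketch`, `filter`, and the replace-or-drop behaviour are assumptions for illustration; the library's actual handling may differ.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical helper; it does not use or reproduce any library class.
final class SequenceFilterSketch {

    /**
     * Filters one sequence of labels:
     * - labels in stopWords are removed;
     * - labels in vocabulary are kept;
     * - other labels become unk if useUnknown is true, and are removed otherwise.
     */
    static List<String> filter(List<String> sequence, Set<String> vocabulary,
                               Set<String> stopWords, boolean useUnknown, String unk) {
        List<String> result = new ArrayList<>();
        for (String element : sequence) {
            if (stopWords.contains(element)) {
                continue; // excluded from the model entirely
            }
            if (vocabulary.contains(element)) {
                result.add(element);
            } else if (useUnknown) {
                result.add(unk); // assumed substitution for out-of-vocabulary labels
            }
        }
        return result;
    }
}
```

Keeping an UNK placeholder preserves the position of unseen elements inside the context window, whereas dropping them shortens the sequence; which trade-off applies here depends on the real implementation.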
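public SequenceVectors.Builder<T> setVectorsListeners(@NonNull java.util.Collection<VectorsListener<T>> listeners)
listeners - VectorsListeners for this SequenceVectors model
public SequenceVectors.Builder<T> enableScavenger(boolean reallyEnable)
reallyEnable - whether periodic vocabulary truncation during construction is enabled (default: disabled)
public SequenceVectors<T> build()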