public class VocabConstructor<T extends SequenceElement>
extends java.lang.Object
Modifier and Type | Class and Description |
---|---|
static class | VocabConstructor.Builder<T extends SequenceElement> |
protected class | VocabConstructor.VocabRunnable |
Modifier and Type | Field and Description |
---|---|
protected static org.slf4j.Logger | log |
Modifier and Type | Method and Description |
---|---|
protected WeightLookupTable<T> | buildExtendedLookupTable(): Placeholder for future implementation. |
protected VocabCache<T> | buildExtendedVocabulary(): Placeholder for future implementation. |
VocabCache<T> | buildJointVocabulary(boolean resetCounters, boolean buildHuffmanTree): Scans all sources passed through the builder and returns all words as a vocabulary. |
VocabCache<T> | buildMergedVocabulary(VocabCache<T> vocabCache, boolean fetchLabels): Transfers an existing vocabulary into the current one. Note: this method expects the source vocabulary to have Huffman tree indexes already applied. |
VocabCache<T> | buildMergedVocabulary(WordVectors wordVectors, boolean fetchLabels): Transfers an existing WordVectors model into the current one. |
protected void | filterVocab(AbstractCache<T> cache, int minWordFrequency) |
long | getNumberOfSequences(): Returns the total number of sequences passed through VocabConstructor. |
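The idea of transferring an existing vocabulary into the current one, as buildMergedVocabulary is described above, can be pictured with plain maps. This is only a library-independent sketch: the class MergedVocabSketch and its merge method are hypothetical, the policy of summing frequencies for elements present in both vocabularies is an assumption, and the fetchLabels flag and Huffman tree indexes are deliberately left out. The actual implementation may behave differently.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; this is not the library's implementation.
final class MergedVocabSketch {

    // Copies every entry of 'source' into 'target'.
    // Assumption: frequencies of elements present in both maps are summed.
    static Map<String, Long> merge(Map<String, Long> target, Map<String, Long> source) {
        source.forEach((element, count) -> target.merge(element, count, Long::sum));
        return target;
    }

    public static void main(String[] args) {
        Map<String, Long> current = new HashMap<>(Map.of("cat", 3L));
        Map<String, Long> existing = Map.of("cat", 2L, "dog", 5L);
        // After the merge, "cat" carries the summed count and "dog" is newly added.
        System.out.println(merge(current, existing));
    }
}
```

The target map is mutated in place and returned, which loosely mirrors a method that fills the current vocabulary and hands it back.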
protected WeightLookupTable<T> buildExtendedLookupTable()
protected VocabCache<T> buildExtendedVocabulary()
public VocabCache<T> buildMergedVocabulary(@NonNull WordVectors wordVectors, boolean fetchLabels)
Parameters: wordVectors
public long getNumberOfSequences()
public VocabCache<T> buildMergedVocabulary(@NonNull VocabCache<T> vocabCache, boolean fetchLabels)
Parameters: vocabCache
public VocabCache<T> buildJointVocabulary(boolean resetCounters, boolean buildHuffmanTree)
protected void filterVocab(AbstractCache<T> cache, int minWordFrequency)
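As a rough mental model of buildJointVocabulary, filterVocab and getNumberOfSequences taken together, the sketch below counts every element of every sequence, tracks how many sequences were seen, and then drops elements that occur fewer than minWordFrequency times. It uses plain Java collections; the class JointVocabSketch and its build method are hypothetical names, the strict "fewer than" threshold is an assumption, and Huffman tree construction and counter resetting are not modelled.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; this is not the library's implementation.
final class JointVocabSketch {

    // Running total of sequences seen, in the spirit of getNumberOfSequences().
    private long sequences = 0;

    // Counts all elements across all sequences, then prunes rare ones.
    Map<String, Long> build(List<List<String>> source, int minWordFrequency) {
        Map<String, Long> counts = new HashMap<>();
        for (List<String> sequence : source) {
            sequences++;
            for (String element : sequence) {
                counts.merge(element, 1L, Long::sum);
            }
        }
        // Analogue of filterVocab (assumed rule): drop entries seen
        // fewer than minWordFrequency times.
        counts.values().removeIf(count -> count < minWordFrequency);
        return counts;
    }

    long getNumberOfSequences() {
        return sequences;
    }

    public static void main(String[] args) {
        JointVocabSketch sketch = new JointVocabSketch();
        Map<String, Long> vocab = sketch.build(
                List.of(List.of("the", "cat", "sat"), List.of("the", "dog")), 2);
        System.out.println(vocab);                         // prints {the=2}
        System.out.println(sketch.getNumberOfSequences()); // prints 2
    }
}
```

Counting first and pruning afterwards keeps the frequency threshold global across all sources rather than per sequence.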