public class TfidfRecordReader extends FileRecordReader
Fields inherited from FileRecordReader and its supertypes: appendLabel, conf, currentFile, inputSplit, iter, labels, listeners, APPEND_LABEL, LABELS, NAME_SPACE

| Constructor and Description |
|---|
| TfidfRecordReader() |

| Modifier and Type | Method and Description |
|---|---|
| void | close() |
| Configuration | getConf(): Return the configuration used by this object. |
| int | getNumFeatures() |
| TfidfVectorizer | getTfidfVectorizer() |
| boolean | hasNext(): Whether there are any more records. |
| void | initialize(Configuration conf, InputSplit split): Called once at initialization. |
| void | initialize(InputSplit split): Called once at initialization. |
| java.util.List<Record> | loadFromMetaData(java.util.List<RecordMetaData> recordMetaDatas): Load multiple records from the given list of RecordMetaData instances. |
| Record | loadFromMetaData(RecordMetaData recordMetaData): Load a single record from the given RecordMetaData instance. Note that for data that isn't splittable (i.e., text data that needs to be scanned/split), it is more efficient to load multiple records at once using RecordReader.loadFromMetaData(List). |
| java.util.List<Writable> | next(): Get the next record. |
| Record | nextRecord(): Similar to RecordReader.next(), but returns a Record object that may include metadata such as the source of the data. |
| void | reset(): Reset the record reader iterator. |
| void | setConf(Configuration conf): Set the configuration to be used by this object. |
| void | setTfidfVectorizer(TfidfVectorizer tfidfVectorizer) |
| void | shuffle() |
| void | shuffle(java.util.Random random) |
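
As a rough illustration of how the hasNext(), next(), and reset() methods listed above are typically driven, the sketch below uses a hypothetical in-memory stand-in. MiniReader and InMemoryReader are invented for this example and are not DataVec types, and rows are plain List<Double> values rather than List<Writable>. It assumes only the iterator-style contract suggested by the method descriptions.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class ReaderLoopSketch {

    /** Hypothetical stand-in exposing only the three iteration methods. */
    interface MiniReader {
        boolean hasNext();
        List<Double> next();
        void reset();
    }

    /** In-memory reader over fixed rows; NOT the DataVec implementation. */
    static final class InMemoryReader implements MiniReader {
        private final List<List<Double>> rows;
        private Iterator<List<Double>> iter;

        InMemoryReader(List<List<Double>> rows) {
            this.rows = rows;
            this.iter = rows.iterator();
        }

        public boolean hasNext() { return iter.hasNext(); }

        public List<Double> next() { return iter.next(); }

        // Rewind by creating a fresh iterator over the same rows.
        public void reset() { iter = rows.iterator(); }
    }

    /** Drains the reader and returns how many records it produced. */
    static int countRecords(MiniReader reader) {
        int n = 0;
        while (reader.hasNext()) {
            reader.next();
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        MiniReader reader = new InMemoryReader(Arrays.asList(
                Arrays.asList(0.0, 1.2),
                Arrays.asList(0.7, 0.0)));
        int firstPass = countRecords(reader);
        reader.reset(); // without this, a second pass would see no records
        int secondPass = countRecords(reader);
        System.out.println(firstPass + " " + secondPass); // prints "2 2"
    }
}
```

The point of the sketch is the call order: check hasNext() before each next(), and call reset() before iterating a second time.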
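
Methods inherited from FileRecordReader and its supertypes: doInitialize, getCurrentLabel, getLabels, record, setLabels, getListeners, invokeListeners, setListeners, setListeners

public void initialize(InputSplit split) throws java.io.IOException, java.lang.InterruptedException
Description copied from interface: RecordReader
Called once at initialization.
Specified by: initialize in interface RecordReader
Overrides: initialize in class FileRecordReader
Parameters: split - the split that defines the range of records to read
Throws: java.io.IOException, java.lang.InterruptedException

public void initialize(Configuration conf, InputSplit split) throws java.io.IOException, java.lang.InterruptedException
Description copied from interface: RecordReader
Called once at initialization.
Specified by: initialize in interface RecordReader
Overrides: initialize in class FileRecordReader
Parameters: conf - a configuration for initialization; split - the split that defines the range of records to read
Throws: java.io.IOException, java.lang.InterruptedException

public void reset()
Description copied from interface: RecordReader
Reset the record reader iterator.
Specified by: reset in interface RecordReader
Overrides: reset in class FileRecordReader

public Record nextRecord()
Description copied from interface: RecordReader
Similar to RecordReader.next(), but returns a Record object that may include metadata such as the source of the data.
Specified by: nextRecord in interface RecordReader
Overrides: nextRecord in class FileRecordReader

public java.util.List<Writable> next()
Description copied from interface: RecordReader
Get the next record.
Specified by: next in interface RecordReader
Overrides: next in class FileRecordReader

public boolean hasNext()
Description copied from interface: RecordReader
Whether there are any more records.
Specified by: hasNext in interface RecordReader
Overrides: hasNext in class FileRecordReader

public void close() throws java.io.IOException
Specified by: close in interface java.io.Closeable, close in interface java.lang.AutoCloseable
Overrides: close in class FileRecordReader
Throws: java.io.IOException

public void setConf(Configuration conf)
Description copied from interface: Configurable
Set the configuration to be used by this object.
Specified by: setConf in interface Configurable
Overrides: setConf in class FileRecordReader

public Configuration getConf()
Description copied from interface: Configurable
Return the configuration used by this object.
Specified by: getConf in interface Configurable
Overrides: getConf in class FileRecordReader

public TfidfVectorizer getTfidfVectorizer()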
public void setTfidfVectorizer(TfidfVectorizer tfidfVectorizer)
public int getNumFeatures()
public void shuffle()
public void shuffle(java.util.Random random)
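
public Record loadFromMetaData(RecordMetaData recordMetaData) throws java.io.IOException
Description copied from interface: RecordReader
Load a single record from the given RecordMetaData instance.
Note that for data that isn't splittable (i.e., text data that needs to be scanned/split), it is more efficient to load multiple records at once using RecordReader.loadFromMetaData(List).
Specified by: loadFromMetaData in interface RecordReader
Overrides: loadFromMetaData in class FileRecordReader
Parameters: recordMetaData - metadata for the record that we want to load from
Throws: java.io.IOException - if an I/O error occurs during loading

public java.util.List<Record> loadFromMetaData(java.util.List<RecordMetaData> recordMetaDatas) throws java.io.IOException
Description copied from interface: RecordReader
Load multiple records from the given list of RecordMetaData instances.
Specified by: loadFromMetaData in interface RecordReader
Overrides: loadFromMetaData in class FileRecordReader
Parameters: recordMetaDatas - metadata for the records that we want to load from
Throws: java.io.IOException - if an I/O error occurs during loading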
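
The efficiency note on loadFromMetaData can be pictured with a small sketch. It is not how DataVec implements the list overload. The Meta class, its source and index fields, and the groupBySource helper are all hypothetical. The idea shown is only that, given the whole list up front, requests can be grouped so each source is scanned once rather than once per record.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class BatchLoadSketch {

    /** Hypothetical metadata: the source a record came from and its position there. */
    static final class Meta {
        final String source;
        final int index;

        Meta(String source, int index) {
            this.source = source;
            this.index = index;
        }
    }

    /** Groups requested positions by source, preserving first-seen source order. */
    static Map<String, List<Integer>> groupBySource(List<Meta> metas) {
        Map<String, List<Integer>> bySource = new LinkedHashMap<>();
        for (Meta m : metas) {
            bySource.computeIfAbsent(m.source, k -> new ArrayList<>()).add(m.index);
        }
        return bySource;
    }

    public static void main(String[] args) {
        List<Meta> wanted = Arrays.asList(
                new Meta("a.txt", 3), new Meta("b.txt", 0), new Meta("a.txt", 7));
        Map<String, List<Integer>> plan = groupBySource(wanted);
        // One scan per source (2 here) instead of one per requested record (3).
        System.out.println(plan.size() + " scans for " + wanted.size() + " records");
    }
}
```

Calling the single-record overload in a loop gives the reader no chance to plan this way, which is one plausible reading of why the list overload is recommended for data that must be scanned.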