public class MLLibUtil
extends java.lang.Object
Modifier and Type | Method and Description |
---|---|
static org.apache.spark.api.java.JavaRDD<org.apache.spark.mllib.regression.LabeledPoint> | fromBinary(org.apache.spark.api.java.JavaPairRDD<java.lang.String,org.apache.spark.input.PortableDataStream> binaryFiles, org.datavec.api.records.reader.RecordReader reader) Converts the output of sc.binaryFiles into labeled points usable for machine learning. |
static org.apache.spark.api.java.JavaRDD<org.apache.spark.mllib.regression.LabeledPoint> | fromBinary(org.apache.spark.api.java.JavaRDD<scala.Tuple2<java.lang.String,org.apache.spark.input.PortableDataStream>> binaryFiles, org.datavec.api.records.reader.RecordReader reader) Converts the output of sc.binaryFiles into labeled points usable for machine learning. |
static org.apache.spark.api.java.JavaRDD<org.nd4j.linalg.dataset.DataSet> | fromContinuousLabeledPoint(org.apache.spark.api.java.JavaRDD<org.apache.spark.mllib.regression.LabeledPoint> data) Converts a JavaRDD of LabeledPoint with continuous labels to a JavaRDD of DataSet. |
static org.apache.spark.api.java.JavaRDD<org.nd4j.linalg.dataset.DataSet> | fromContinuousLabeledPoint(org.apache.spark.api.java.JavaRDD<org.apache.spark.mllib.regression.LabeledPoint> data, boolean preCache) Converts a JavaRDD of LabeledPoint with continuous labels to a JavaRDD of DataSet. |
static org.apache.spark.api.java.JavaRDD<org.nd4j.linalg.dataset.DataSet> | fromContinuousLabeledPoint(org.apache.spark.api.java.JavaSparkContext sc, org.apache.spark.api.java.JavaRDD<org.apache.spark.mllib.regression.LabeledPoint> data) Deprecated. |
static org.apache.spark.api.java.JavaRDD<org.apache.spark.mllib.regression.LabeledPoint> | fromDataSet(org.apache.spark.api.java.JavaRDD<org.nd4j.linalg.dataset.DataSet> data) Converts a JavaRDD of DataSet into a JavaRDD of LabeledPoint. |
static org.apache.spark.api.java.JavaRDD<org.apache.spark.mllib.regression.LabeledPoint> | fromDataSet(org.apache.spark.api.java.JavaRDD<org.nd4j.linalg.dataset.DataSet> data, boolean preCache) Converts a JavaRDD of DataSet into a JavaRDD of LabeledPoint. |
static org.apache.spark.api.java.JavaRDD<org.apache.spark.mllib.regression.LabeledPoint> | fromDataSet(org.apache.spark.api.java.JavaSparkContext sc, org.apache.spark.api.java.JavaRDD<org.nd4j.linalg.dataset.DataSet> data) Deprecated. |
static org.apache.spark.api.java.JavaRDD<org.nd4j.linalg.dataset.DataSet> | fromLabeledPoint(org.apache.spark.api.java.JavaRDD<org.apache.spark.mllib.regression.LabeledPoint> data, int numPossibleLabels) Converts a JavaRDD of LabeledPoint to a JavaRDD of DataSet. |
static org.apache.spark.api.java.JavaRDD<org.nd4j.linalg.dataset.DataSet> | fromLabeledPoint(org.apache.spark.api.java.JavaRDD<org.apache.spark.mllib.regression.LabeledPoint> data, int numPossibleLabels, boolean preCache) Converts a JavaRDD of LabeledPoint to a JavaRDD of DataSet. |
static org.apache.spark.api.java.JavaRDD<org.nd4j.linalg.dataset.DataSet> | fromLabeledPoint(org.apache.spark.api.java.JavaRDD<org.apache.spark.mllib.regression.LabeledPoint> data, int numPossibleLabels, int batchSize) Converts a JavaRDD of LabeledPoint into a JavaRDD of DataSet, based on the specified batch size. |
static org.apache.spark.api.java.JavaRDD<org.nd4j.linalg.dataset.DataSet> | fromLabeledPoint(org.apache.spark.api.java.JavaSparkContext sc, org.apache.spark.api.java.JavaRDD<org.apache.spark.mllib.regression.LabeledPoint> data, int numPossibleLabels) Deprecated. |
static org.apache.spark.mllib.regression.LabeledPoint | pointOf(java.util.Collection<org.datavec.api.writable.Writable> writables) Returns a labeled point built from the writables, where the final item is the label and the remaining items are the features. |
static double | toClassifierPrediction(org.apache.spark.mllib.linalg.Vector vector) Handles the edge case where you have a single output layer and need to convert its output to an index. |
static org.apache.spark.mllib.linalg.Matrix | toMatrix(org.nd4j.linalg.api.ndarray.INDArray arr) Convert an ndarray to a matrix. |
static org.nd4j.linalg.api.ndarray.INDArray | toMatrix(org.apache.spark.mllib.linalg.Matrix arr) Convert a matrix to an ndarray. |
static org.apache.spark.mllib.linalg.Vector | toVector(org.nd4j.linalg.api.ndarray.INDArray arr) Convert an ndarray to a vector. |
static org.nd4j.linalg.api.ndarray.INDArray | toVector(org.apache.spark.mllib.linalg.Vector arr) Convert a vector to an ndarray. |
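
Two conventions in the summary above can be illustrated without any Spark or ND4J dependency: pointOf splits a record into features plus one trailing value, and toClassifierPrediction reduces an output vector to an index. The following is a minimal stand-alone sketch in plain Java, not the library's implementation. The class LabelConventionSketch and its methods are hypothetical names, and two points are assumptions rather than facts stated on this page: that the trailing value of a record is the label, and that the index is chosen as the position of the largest entry (argmax).

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; it does not call MLLibUtil, Spark, or ND4J.
final class LabelConventionSketch {

    /** Returns every value except the last one (assumed to be the features). */
    static double[] featuresOf(List<Double> record) {
        if (record.isEmpty()) {
            throw new IllegalArgumentException("record must contain at least a label");
        }
        double[] features = new double[record.size() - 1];
        for (int i = 0; i < features.length; i++) {
            features[i] = record.get(i);
        }
        return features;
    }

    /** Returns the last value of the record (assumed to be the label). */
    static double labelOf(List<Double> record) {
        if (record.isEmpty()) {
            throw new IllegalArgumentException("record must contain at least a label");
        }
        return record.get(record.size() - 1);
    }

    /** Returns the index of the largest entry as a double (assumed argmax rule). */
    static double argMax(double[] output) {
        if (output.length == 0) {
            throw new IllegalArgumentException("output must not be empty");
        }
        int maxIndex = 0;
        for (int i = 1; i < output.length; i++) {
            if (output[i] > output[maxIndex]) {
                maxIndex = i;
            }
        }
        return maxIndex;
    }

    public static void main(String[] args) {
        List<Double> record = Arrays.asList(0.5, 1.5, 2.0);
        System.out.println(Arrays.toString(featuresOf(record))); // prints [0.5, 1.5]
        System.out.println(labelOf(record));                     // prints 2.0
        System.out.println(argMax(new double[] {0.1, 0.7, 0.2})); // prints 1.0
    }
}
```

Returning the index as a double mirrors the double return type of toClassifierPrediction and the double label carried by LabeledPoint.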
public static double toClassifierPrediction(org.apache.spark.mllib.linalg.Vector vector)
vector - the vector to get the classifier prediction for

public static org.nd4j.linalg.api.ndarray.INDArray toMatrix(org.apache.spark.mllib.linalg.Matrix arr)
arr - the array

public static org.nd4j.linalg.api.ndarray.INDArray toVector(org.apache.spark.mllib.linalg.Vector arr)
arr - the array

public static org.apache.spark.mllib.linalg.Matrix toMatrix(org.nd4j.linalg.api.ndarray.INDArray arr)
arr - the array

public static org.apache.spark.mllib.linalg.Vector toVector(org.nd4j.linalg.api.ndarray.INDArray arr)
arr - the array

public static org.apache.spark.api.java.JavaRDD<org.apache.spark.mllib.regression.LabeledPoint> fromBinary(org.apache.spark.api.java.JavaPairRDD<java.lang.String,org.apache.spark.input.PortableDataStream> binaryFiles, org.datavec.api.records.reader.RecordReader reader)
binaryFiles - the binary files to convert
reader - the reader to use

public static org.apache.spark.api.java.JavaRDD<org.apache.spark.mllib.regression.LabeledPoint> fromBinary(org.apache.spark.api.java.JavaRDD<scala.Tuple2<java.lang.String,org.apache.spark.input.PortableDataStream>> binaryFiles, org.datavec.api.records.reader.RecordReader reader)
binaryFiles - the binary files to convert
reader - the reader to use

public static org.apache.spark.mllib.regression.LabeledPoint pointOf(java.util.Collection<org.datavec.api.writable.Writable> writables)
writables - the writables

public static org.apache.spark.api.java.JavaRDD<org.nd4j.linalg.dataset.DataSet> fromLabeledPoint(org.apache.spark.api.java.JavaRDD<org.apache.spark.mllib.regression.LabeledPoint> data, int numPossibleLabels, int batchSize)
data - the data to convert
numPossibleLabels - the number of possible labels
batchSize - the batch size

@Deprecated public static org.apache.spark.api.java.JavaRDD<org.nd4j.linalg.dataset.DataSet> fromLabeledPoint(org.apache.spark.api.java.JavaSparkContext sc, org.apache.spark.api.java.JavaRDD<org.apache.spark.mllib.regression.LabeledPoint> data, int numPossibleLabels)
Deprecated. Use fromLabeledPoint(JavaRDD, int) instead.
sc - the Spark context used for creating the RDD
data - the data to convert
numPossibleLabels - the number of possible labels

@Deprecated public static org.apache.spark.api.java.JavaRDD<org.nd4j.linalg.dataset.DataSet> fromContinuousLabeledPoint(org.apache.spark.api.java.JavaSparkContext sc, org.apache.spark.api.java.JavaRDD<org.apache.spark.mllib.regression.LabeledPoint> data)
Deprecated. Use fromContinuousLabeledPoint(JavaRDD) instead.
data - the JavaRDD of labeled points to convert

@Deprecated public static org.apache.spark.api.java.JavaRDD<org.apache.spark.mllib.regression.LabeledPoint> fromDataSet(org.apache.spark.api.java.JavaSparkContext sc, org.apache.spark.api.java.JavaRDD<org.nd4j.linalg.dataset.DataSet> data)
Deprecated. Use fromDataSet(JavaRDD) instead.
sc - the Spark context to use
data - the dataset to convert

public static org.apache.spark.api.java.JavaRDD<org.nd4j.linalg.dataset.DataSet> fromContinuousLabeledPoint(org.apache.spark.api.java.JavaRDD<org.apache.spark.mllib.regression.LabeledPoint> data)
data - the JavaRDD of LabeledPoint to convert

public static org.apache.spark.api.java.JavaRDD<org.nd4j.linalg.dataset.DataSet> fromContinuousLabeledPoint(org.apache.spark.api.java.JavaRDD<org.apache.spark.mllib.regression.LabeledPoint> data, boolean preCache)
data - the JavaRDD of LabeledPoint to convert
preCache - whether to pre-cache the RDD before the operation

public static org.apache.spark.api.java.JavaRDD<org.nd4j.linalg.dataset.DataSet> fromLabeledPoint(org.apache.spark.api.java.JavaRDD<org.apache.spark.mllib.regression.LabeledPoint> data, int numPossibleLabels)
data - the JavaRDD of LabeledPoint to convert
numPossibleLabels - the number of possible labels

public static org.apache.spark.api.java.JavaRDD<org.nd4j.linalg.dataset.DataSet> fromLabeledPoint(org.apache.spark.api.java.JavaRDD<org.apache.spark.mllib.regression.LabeledPoint> data, int numPossibleLabels, boolean preCache)
data - the JavaRDD of LabeledPoint to convert
numPossibleLabels - the number of possible labels
preCache - whether to pre-cache the RDD before the operation

public static org.apache.spark.api.java.JavaRDD<org.apache.spark.mllib.regression.LabeledPoint> fromDataSet(org.apache.spark.api.java.JavaRDD<org.nd4j.linalg.dataset.DataSet> data)
data - the dataset to convert

public static org.apache.spark.api.java.JavaRDD<org.apache.spark.mllib.regression.LabeledPoint> fromDataSet(org.apache.spark.api.java.JavaRDD<org.nd4j.linalg.dataset.DataSet> data, boolean preCache)
data - the dataset to convert
preCache - whether to pre-cache the RDD before the operation
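
The classification variants of fromLabeledPoint take a numPossibleLabels argument, while fromDataSet goes in the opposite direction. One plausible reading, assumed here and not confirmed by this page, is that a numeric class label is expanded into a one-hot vector of length numPossibleLabels on the way in, and collapsed back to the index of its largest entry on the way out. The sketch below shows that round trip in plain Java; OneHotRoundTripSketch, toOneHot, and fromOneHot are hypothetical names and do not exist in MLLibUtil.

```java
import java.util.Arrays;

// Hypothetical sketch of an assumed label encoding; it does not call MLLibUtil.
final class OneHotRoundTripSketch {

    /** Expands a class label such as 2.0 into a one-hot vector of the given length. */
    static double[] toOneHot(double label, int numPossibleLabels) {
        int index = (int) label;
        if (index < 0 || index >= numPossibleLabels) {
            throw new IllegalArgumentException(
                    "label " + label + " is outside [0, " + numPossibleLabels + ")");
        }
        double[] oneHot = new double[numPossibleLabels]; // all zeros by default
        oneHot[index] = 1.0;
        return oneHot;
    }

    /** Collapses a label vector back to the index of its largest entry. */
    static double fromOneHot(double[] labels) {
        if (labels.length == 0) {
            throw new IllegalArgumentException("labels must not be empty");
        }
        int maxIndex = 0;
        for (int i = 1; i < labels.length; i++) {
            if (labels[i] > labels[maxIndex]) {
                maxIndex = i;
            }
        }
        return maxIndex;
    }

    public static void main(String[] args) {
        double[] encoded = toOneHot(2.0, 4);
        System.out.println(Arrays.toString(encoded)); // prints [0.0, 0.0, 1.0, 0.0]
        System.out.println(fromOneHot(encoded));      // prints 2.0
    }
}
```

Under this assumption, numPossibleLabels must be larger than the biggest label value in the data, which is why the sketch rejects out-of-range labels instead of silently truncating them.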