- All Implemented Interfaces:
- java.io.Serializable, org.apache.spark.api.java.function.Function2<java.lang.Integer,java.util.Iterator<T>,java.util.Iterator<T>>
public class SplitPartitionsFunction<T>
extends java.lang.Object
implements org.apache.spark.api.java.function.Function2<java.lang.Integer,java.util.Iterator<T>,java.util.Iterator<T>>
SplitPartitionsFunction is used to split an RDD via filtering, using
AbstractJavaRDDLike.mapPartitionsWithIndex(Function2, boolean).
It is similar in design to JavaRDD.randomSplit(double[]),
but it is less prone to
producing imbalanced splits than that method. Specifically, JavaRDD.randomSplit(double[])
assigns each element to a split independently at random, whereas this function
assigns exactly one out of every numSplits consecutive objects to each output split. Which object in each group goes to which split is chosen at random.
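The selection scheme described above can be illustrated with a minimal sketch in plain Java, without a Spark dependency. This is not the library's source: the class name SplitSketch, the method selectSplit, and the parameters splitIndex and baseSeed are hypothetical names chosen for illustration, and seeding the random generator with baseSeed plus the partition index is an assumption about how every split could see the same shuffles within a partition.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.Random;

// Hypothetical illustration only; not the SplitPartitionsFunction source.
class SplitSketch {

    // Keeps the elements of one partition that belong to output split `splitIndex`.
    // Assumption: the seed depends only on baseSeed and partitionIndex, so calls
    // for different split indices replay identical shuffles and never overlap.
    static <T> Iterator<T> selectSplit(int partitionIndex, Iterator<T> iter,
                                       int splitIndex, int numSplits, long baseSeed) {
        Random rng = new Random(baseSeed + partitionIndex);

        // assignment.get(k) is the split that receives the k-th element of the current group.
        List<Integer> assignment = new ArrayList<>();
        for (int s = 0; s < numSplits; s++) {
            assignment.add(s);
        }

        List<T> out = new ArrayList<>();
        int i = 0;
        while (iter.hasNext()) {
            // Start of a new group of numSplits elements: draw a fresh random permutation.
            if (i % numSplits == 0) {
                Collections.shuffle(assignment, rng);
            }
            T next = iter.next();
            if (assignment.get(i % numSplits) == splitIndex) {
                out.add(next);
            }
            i++;
        }
        return out.iterator();
    }

    public static void main(String[] args) {
        List<Integer> data = new ArrayList<>();
        for (int i = 0; i < 12; i++) {
            data.add(i);
        }
        // Print the elements selected for each of 3 splits of a single partition.
        for (int s = 0; s < 3; s++) {
            List<Integer> picked = new ArrayList<>();
            selectSplit(0, data.iterator(), s, 3, 42L).forEachRemaining(picked::add);
            System.out.println("split " + s + ": " + picked);
        }
    }
}
```

Because every full group of numSplits elements contributes exactly one element to each split, split sizes within a partition can differ by at most one, which is the balance property that independent per-element sampling does not guarantee.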
- See Also:
- Serialized Form