Basically I want something like this,
int count = 100;
Java<String> myRandomRDD = generate(count, new Function<String, String>() {
@Override
public String call(String arg0) throws Exception {
return RandomStringUtils.randomAlphabetic(42);
}
});
Theoretically I could use Spark RandomRDD, but I can't get it working right. I'm overwhelmed by the choices. Should I use RandomRDDs::randomRDD or RandomRDDs::randomRDDVector? Or should I use RandomVectorRDD?
I have tried the following, but I can't even get the syntax to be correct.
RandomRDDs.randomRDD(jsc, new RandomDataGenerator<String>() {
@Override
public void setSeed(long arg0) {
// TODO Auto-generated method stub
}
@Override
public org.apache.spark.mllib.random.RandomDataGenerator<String> copy() {
// TODO Auto-generated method stub
return null;
}
@Override
public String nextValue() {
RandomStringUtils.randomAlphabetic(42);
}
}, count, ??);
The documentation is sparse, I'm confused, and I would appreciate any help.
Thanks!
Aucun commentaire:
Enregistrer un commentaire