GetJavaRDDOfAllElements
See javadoc - uk.gov.gchq.gaffer.spark.operation.javardd.GetJavaRDDOfAllElements
Available since Gaffer version 1.0.0
The GetJavaRDDOfAllElements operation returns all the elements in a graph as a JavaRDD. Some examples follow.
Note: there is an option to read the RFiles directly, rather than the usual approach of obtaining the data from Accumulo's tablet servers. This requires the Hadoop user running the Spark job to have read access to the RFiles in the Accumulo tablet. Data that has not yet been minor compacted will not be read when this option is used, because it is still held in tablet server memory and has not been written to an RFile. To enable this functionality, set the option: "gaffer.accumulo.spark.directrdd.use_rfile_reader=true"
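As a rough sketch of how the RFile reader option might be supplied, the JSON form of the operation can carry it in an options map. This assumes the usual Gaffer convention that operation options are serialised as string key/value pairs under an "options" field; verify this against the Gaffer version you are running.

```json
{
  "class" : "uk.gov.gchq.gaffer.spark.operation.javardd.GetJavaRDDOfAllElements",
  "options" : {
    "gaffer.accumulo.spark.directrdd.use_rfile_reader" : "true"
  }
}
```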
When executing a Spark operation, you can either let Gaffer create a SparkSession for you, or add your own SparkSession to the Context object and provide that Context when you execute the operation. For example:
Context context = SparkContextUtil.createContext(new User("User01"), sparkSession);
graph.execute(operation, context);
Required fields
No required fields
Examples
Get Java RDD of all elements
final GetJavaRDDOfAllElements operation = new GetJavaRDDOfAllElements();
{
  "class" : "GetJavaRDDOfAllElements"
}
{
  "class" : "uk.gov.gchq.gaffer.spark.operation.javardd.GetJavaRDDOfAllElements"
}
The results are:
Entity[vertex=1,group=entity,properties=Properties[count=<java.lang.Integer>3]]
Edge[source=1,destination=2,directed=true,group=edge,properties=Properties[count=<java.lang.Integer>3]]
Edge[source=1,destination=4,directed=true,group=edge,properties=Properties[count=<java.lang.Integer>1]]
Entity[vertex=2,group=entity,properties=Properties[count=<java.lang.Integer>1]]
Edge[source=2,destination=3,directed=true,group=edge,properties=Properties[count=<java.lang.Integer>2]]
Edge[source=2,destination=4,directed=true,group=edge,properties=Properties[count=<java.lang.Integer>1]]
Edge[source=2,destination=5,directed=true,group=edge,properties=Properties[count=<java.lang.Integer>1]]
Entity[vertex=3,group=entity,properties=Properties[count=<java.lang.Integer>2]]
Edge[source=3,destination=4,directed=true,group=edge,properties=Properties[count=<java.lang.Integer>4]]
Entity[vertex=4,group=entity,properties=Properties[count=<java.lang.Integer>1]]
Entity[vertex=5,group=entity,properties=Properties[count=<java.lang.Integer>3]]
Get Java RDD of all elements returning edges only
final GetJavaRDDOfAllElements operation = new GetJavaRDDOfAllElements.Builder()
        .view(new View.Builder()
                .edge("edge")
                .build())
        .build();
{
  "class" : "GetJavaRDDOfAllElements",
  "view" : {
    "edges" : {
      "edge" : { }
    }
  }
}
{
  "class" : "uk.gov.gchq.gaffer.spark.operation.javardd.GetJavaRDDOfAllElements",
  "view" : {
    "edges" : {
      "edge" : { }
    }
  }
}
The results are:
Edge[source=1,destination=2,directed=true,group=edge,properties=Properties[count=<java.lang.Integer>3]]
Edge[source=1,destination=4,directed=true,group=edge,properties=Properties[count=<java.lang.Integer>1]]
Edge[source=2,destination=3,directed=true,group=edge,properties=Properties[count=<java.lang.Integer>2]]
Edge[source=2,destination=4,directed=true,group=edge,properties=Properties[count=<java.lang.Integer>1]]
Edge[source=2,destination=5,directed=true,group=edge,properties=Properties[count=<java.lang.Integer>1]]
Edge[source=3,destination=4,directed=true,group=edge,properties=Properties[count=<java.lang.Integer>4]]