Class DataFrameUtil
java.lang.Object
    uk.gov.gchq.gaffer.spark.utils.scala.DataFrameUtil

public final class DataFrameUtil extends Object
Utility class for manipulating DataFrames.
Method Summary
Modifier and Type: static org.apache.spark.sql.Dataset<org.apache.spark.sql.Row>
Method: emptyEdges(org.apache.spark.sql.SparkSession sparkSession)
Description: Create an empty Dataset of Rows for use as edges in a GraphFrame.

Modifier and Type: static org.apache.spark.sql.Dataset<org.apache.spark.sql.Row>
Method: union(org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> ds1, org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> ds2)
Description: Carry out a union of two Datasets where the input Datasets may contain a different number of columns.
Method Detail
-
union
public static org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> union(org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> ds1, org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> ds2)
Carry out a union of two Datasets where the input Datasets may contain a different number of columns. The resulting Dataset will contain entries for all of the columns found in the input Datasets, with null entries used as placeholders.
Parameters:
ds1 - the first Dataset
ds2 - the second Dataset
Returns:
the combined Dataset
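
The null-padding behaviour described for union can be illustrated with a small plain-Java model. This is only a sketch of the documented semantics, not the Gaffer implementation: it does not use Spark, the class name UnionSketch is hypothetical, each row is modelled as a column-name-to-value map, and the output column order (columns of ds1 first, then any new columns from ds2) is an assumption that the documentation above does not specify.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in that models the documented union semantics.
// It is NOT DataFrameUtil and does not depend on Spark.
final class UnionSketch {

    private UnionSketch() { }

    // Each row is modelled as a column-name -> value map.
    static List<Map<String, Object>> union(List<Map<String, Object>> ds1,
                                           List<Map<String, Object>> ds2) {
        // Collect every column seen in either input.
        // Assumption: ds1 columns come first, then new ds2 columns.
        Set<String> columns = new LinkedHashSet<>();
        for (Map<String, Object> row : ds1) {
            columns.addAll(row.keySet());
        }
        for (Map<String, Object> row : ds2) {
            columns.addAll(row.keySet());
        }

        List<Map<String, Object>> result = new ArrayList<>();
        appendPadded(ds1, columns, result);
        appendPadded(ds2, columns, result);
        return result;
    }

    // Copy each row, writing null for any column the row does not have.
    private static void appendPadded(List<Map<String, Object>> rows,
                                     Set<String> columns,
                                     List<Map<String, Object>> out) {
        for (Map<String, Object> row : rows) {
            Map<String, Object> padded = new LinkedHashMap<>();
            for (String column : columns) {
                padded.put(column, row.get(column)); // null when absent
            }
            out.add(padded);
        }
    }

    public static void main(String[] args) {
        Map<String, Object> vertexRow = new LinkedHashMap<>();
        vertexRow.put("id", "a");
        vertexRow.put("count", 1);

        Map<String, Object> edgeRow = new LinkedHashMap<>();
        edgeRow.put("src", "a");
        edgeRow.put("dst", "b");

        List<Map<String, Object>> combined = union(
                Collections.singletonList(vertexRow),
                Collections.singletonList(edgeRow));

        // Every output row carries all four columns; missing values are null.
        combined.forEach(System.out::println);
    }
}
```

With real Spark Datasets the equivalent call would simply be DataFrameUtil.union(ds1, ds2), where ds1 and ds2 are existing Dataset<Row> instances; the model above only shows what the combined rows are expected to look like.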
-
emptyEdges
public static org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> emptyEdges(org.apache.spark.sql.SparkSession sparkSession)
Create an empty Dataset of Rows for use as edges in a GraphFrame.
Parameters:
sparkSession - the Spark session
Returns:
an empty Dataset of Rows with a src and dst column.