CSV Data Import and Export
Gaffer supports both importing from and exporting to CSV. This page outlines some common ways to do this via the API.
Using Local Files
If you configure your Gaffer graph to support the ImportFromLocalFile and ExportToLocalFile operations, it can read CSV from and write CSV to a local file. To enable these operations, use a JSON configuration file that declares the operations and their handlers, like so:
{
    "operations": [
        {
            "operation": "uk.gov.gchq.gaffer.operation.impl.export.localfile.ImportFromLocalFile",
            "handler": {
                "class": "uk.gov.gchq.gaffer.store.operation.handler.export.localfile.ImportFromLocalFileHandler"
            }
        },
        {
            "operation": "uk.gov.gchq.gaffer.operation.impl.export.localfile.ExportToLocalFile",
            "handler": {
                "class": "uk.gov.gchq.gaffer.store.operation.handler.export.localfile.ExportToLocalFileHandler"
            }
        }
    ]
}
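A declarations file like the one above can be sanity-checked before the store is pointed at it. The following is a minimal Python sketch that assumes only the structure shown above (a top-level "operations" list whose entries have an "operation" name and a "handler" with a "class"). The function name `validate_declarations` is hypothetical and not part of Gaffer.

```python
import json


def validate_declarations(text):
    """Return a list of problems found in an operation declarations document.

    Hypothetical helper: it checks only the shape shown in this page, not
    whether the named classes actually exist.
    """
    try:
        doc = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc}"]

    operations = doc.get("operations") if isinstance(doc, dict) else None
    if not isinstance(operations, list):
        return ['missing top-level "operations" list']

    problems = []
    for index, entry in enumerate(operations):
        if not isinstance(entry, dict):
            problems.append(f"entry {index}: not an object")
            continue
        if not isinstance(entry.get("operation"), str):
            problems.append(f'entry {index}: missing "operation" class name')
        handler = entry.get("handler")
        if not isinstance(handler, dict) or not isinstance(handler.get("class"), str):
            problems.append(f'entry {index}: missing "handler.class"')
    return problems
```

An empty list means the file has the expected shape; it does not guarantee that Gaffer will accept it.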
Usually this file is called operationsDeclarations.json, but the name can be anything. What matters is that the file is specified in your store properties file using the following property:
How to Import and Export
You can use the REST API to add the graph elements. This method is not recommended for large volumes of data in production. However, it is fine for smaller data sets and can generally be done in the few stages outlined in the following diagram.
flowchart LR
A(Raw Data) --> B(GenerateElements)
B --> C(AddElements)
The operation chain below mirrors the stages in the previous diagram. The first stage takes the raw input data and converts it into Gaffer elements via an element generator class. Gaffer includes a few built-in generators, but you can also use a custom class, or pre-process the data before passing it to Gaffer so that a default generator can be used. Once the data has been converted to elements, it needs to be added to the graph. To load elements there is a standard AddElements operation, which takes raw elements JSON as input and adds them to the graph.
Tip
See the page covering Operations for an introduction on how to use them via the API.
{
    "class": "OperationChain",
    "operations": [
        {
            "class": "ImportFromLocalFile", //(1)!
            "filePath": "mydata.csv"
        },
        {
            "class": "GenerateElements", //(2)!
            "elementGenerator": {
                "class": "Neo4jCsvElementGenerator"
            }
        },
        {
            "class": "AddElements" //(3)!
        }
    ]
}
- The ImportFromLocalFile operation reads each line from the file mydata.csv and streams each string into the next parts of the chain.
- The GenerateElements operation transforms each line of the file into a Gaffer Element. You will need to provide an element generator that is suitable for the file you have provided. The two CsvElementGenerators provided in core Gaffer are Neo4jCsvElementGenerator and NeptuneCsvElementGenerator.
- Finally, the stream of Gaffer Elements is added to the graph with an AddElements operation.
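The import chain above can also be assembled and submitted from a script. The following Python sketch builds the same chain and wraps it in a POST request using only the standard library. The helper names (`build_import_chain`, `build_request`, `execute`) are hypothetical, and the execute URL is left to the caller because it depends on how your REST API is deployed.

```python
import json
import urllib.request


def build_import_chain(file_path, generator_class="Neo4jCsvElementGenerator"):
    """Build the three-stage import chain shown above as a plain dict."""
    return {
        "class": "OperationChain",
        "operations": [
            {"class": "ImportFromLocalFile", "filePath": file_path},
            {
                "class": "GenerateElements",
                "elementGenerator": {"class": generator_class},
            },
            {"class": "AddElements"},
        ],
    }


def build_request(execute_url, chain):
    """Wrap the chain in a JSON POST request (nothing is sent yet)."""
    body = json.dumps(chain).encode("utf-8")
    return urllib.request.Request(
        execute_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def execute(execute_url, chain):
    """Send the chain to the REST API and return the response text."""
    with urllib.request.urlopen(build_request(execute_url, chain)) as response:
        return response.read().decode("utf-8")
```

For example, `execute(url, build_import_chain("mydata.csv"))` would submit the chain, where `url` is the operation execute endpoint of your own deployment. Note that the file path is resolved on the machine running Gaffer, not on the machine running the script.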
Exporting to CSV is done with a similar OperationChain.
{
    "class": "OperationChain",
    "operations": [
        {
            "class": "GetAllElements" //(1)!
        },
        {
            "class": "ToCsv", //(2)!
            "csvGenerator": "Neo4jCsvGenerator"
        },
        {
            "class": "ExportToLocalFile", //(3)!
            "filePath": "output.csv"
        }
    ]
}
- First, you need to get the Elements that you want to export. In this example we simply use GetAllElements.
- The ToCsv operation is then used to turn each Element into a CSV-formatted string. You must supply a CsvGenerator to do this. You can build a custom CsvGenerator, or use a supplied one. The two CsvGenerators provided in core Gaffer are Neo4jCsvGenerator and NeptuneCsvGenerator.
- Then the ExportToLocalFile operation is used to save this string output into a local file.
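The export chain can be generated in the same way. This Python sketch builds the chain above as a dict and renders it as a JSON request body. The helper names are hypothetical. The `csv_generator` argument is passed through unchanged, so it can be either a generator name, as in the example above, or a JSON object describing a customised generator.

```python
import json


def build_export_chain(file_path, csv_generator="Neo4jCsvGenerator"):
    """Build the export chain: get elements, convert to CSV, write to a file."""
    return {
        "class": "OperationChain",
        "operations": [
            {"class": "GetAllElements"},
            {"class": "ToCsv", "csvGenerator": csv_generator},
            {"class": "ExportToLocalFile", "filePath": file_path},
        ],
    }


def to_request_body(chain):
    """Render the chain as indented JSON, ready to send to the REST API."""
    return json.dumps(chain, indent=4)
```

Calling `to_request_body(build_export_chain("output.csv", "NeptuneCsvGenerator"))` gives the JSON for a Neptune-format export.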
Formats
Custom formats
Currently, custom formats for import are not supported. Instead you should use one of the two OpenCypher formats. For export, however, you can customise the CsvGenerator class to create a custom export format in a ToCsv operation. For example, take the following operation:
{
    "class": "ToCsv",
    "csvGenerator": {
        "class": "CsvGenerator",
        "fields": ["prop1", "SOURCE", "DESTINATION", "prop2", "GROUP"],
        "constants": ["constant1", "constant2"]
    }
}
This produces CSV rows containing the value of each listed field in order, followed by the constants.
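The way such a row is put together can be sketched in Python. This is an illustration only, not Gaffer's implementation: it assumes an element is represented as a flat dict (a hypothetical representation), that field values are written in the order listed and the constants appended after them, and it ignores quoting and escaping.

```python
def to_csv_row(element, fields, constants):
    """Join the requested field values, then the constants, with commas.

    Assumption: `element` is a flat dict keyed by field name, including the
    GROUP, SOURCE and DESTINATION identifiers. Missing fields become empty.
    """
    values = [str(element.get(field, "")) for field in fields]
    return ",".join(values + list(constants))


# Hypothetical edge, used only to illustrate the column order.
example_edge = {
    "GROUP": "knows",
    "SOURCE": "a",
    "DESTINATION": "b",
    "prop1": 1,
    "prop2": "x",
}
fields = ["prop1", "SOURCE", "DESTINATION", "prop2", "GROUP"]
constants = ["constant1", "constant2"]
print(to_csv_row(example_edge, fields, constants))
# prints: 1,a,b,x,knows,constant1,constant2
```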
OpenCypher Formats
Core Gaffer provides generators that can import from and export to OpenCypher CSV. These work with other graph databases such as Neo4j and Neptune.
Note
Please note that when using these, Gaffer might change your property name headers. All instances of - are replaced with _, and invalid characters are stripped as outlined in PropertiesUtil.
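As a rough illustration of the renaming described in this note, the Python sketch below replaces hyphens with underscores and then strips other characters. The set of characters treated as valid here (ASCII letters, digits and underscore) is an assumption made for the example. The authoritative rules are those in PropertiesUtil, and the function name `sanitise_header` is hypothetical.

```python
import re


def sanitise_header(name):
    """Approximate the header clean-up: '-' becomes '_', the rest is stripped.

    Assumption: only ASCII letters, digits and '_' are valid. Check
    PropertiesUtil for the real definition before relying on this.
    """
    replaced = name.replace("-", "_")
    return re.sub(r"[^A-Za-z0-9_]", "", replaced)


print(sanitise_header("first-name"))   # prints: first_name
print(sanitise_header("age (years)"))  # prints: ageyears
```

A check like this can help predict which property names in a CSV header will differ from the property names that end up in the graph.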
As shown later in the examples, OpenCypher formats let you dictate property types in the header, like propertyName:type. Below is a table that shows which Gaffer transform function is used to deserialise each OpenCypher data type during import.
Gaffer Transform Function | OpenCypher Data Types |
---|---|
ToString | String, Char, Duration, Point, Date, LocalDate, LocalDateTime |
ToBoolean | Bool, Boolean |
ToInteger | Int, Short, Byte |
ToLong | Long |
ToFloat | Float |
ToDouble | Double |
ParseTime | DateTime |
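The table can be read as a lookup from the type suffix in a header to a transform function. The Python sketch below encodes that lookup and splits a propertyName:type header into its two parts. The helper name `parse_header` is hypothetical, and matching the type name case-insensitively and returning `None` when there is no recognised type are assumptions made for the example, not documented Gaffer behaviour.

```python
# OpenCypher data type (lower-cased) -> Gaffer transform function name,
# transcribed from the table above.
TRANSFORM_FOR_TYPE = {
    "string": "ToString",
    "char": "ToString",
    "duration": "ToString",
    "point": "ToString",
    "date": "ToString",
    "localdate": "ToString",
    "localdatetime": "ToString",
    "bool": "ToBoolean",
    "boolean": "ToBoolean",
    "int": "ToInteger",
    "short": "ToInteger",
    "byte": "ToInteger",
    "long": "ToLong",
    "float": "ToFloat",
    "double": "ToDouble",
    "datetime": "ParseTime",
}


def parse_header(header):
    """Split 'propertyName:type' and look up the transform for the type.

    Returns (property_name, transform_name), where transform_name is None if
    the header has no type suffix or the type is not in the table.
    """
    name, separator, type_name = header.partition(":")
    if not separator:
        return name, None
    return name, TRANSFORM_FOR_TYPE.get(type_name.strip().lower())


print(parse_header("age:Int"))           # prints: ('age', 'ToInteger')
print(parse_header("created:DateTime"))  # prints: ('created', 'ParseTime')
```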
Neo4j Generators
You can import CSV from Neo4j using the Neo4jCsvElementGenerator and export using the Neo4jCsvGenerator. The format used is defined here.
Neptune Generators
You can import CSV from Neptune using the NeptuneCsvElementGenerator and export using the NeptuneCsvGenerator. The format used is defined here.