Using the API
As covered in the Getting Started API page, the main POST
request used in the API is /graph/operations/execute.
This part of the guide covers general usage of this endpoint and walks through some common
operations you might want to use to load data or run queries.
Note
The Gaffer operations reference guide has more detail on available operations.
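To illustrate the general shape of a request, the body POSTed to /graph/operations/execute is simply the JSON for a single operation or an operation chain. For example, the minimal body below would ask the graph for all of its elements (GetAllElements is a standard Gaffer operation, though running it against a large graph is rarely a good idea):
{
    "class": "GetAllElements"
}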
Loading Data
Gaffer supports various methods of loading data and, depending on your use case, you can even bypass it altogether and load directly into Accumulo.
This example will focus on using the REST API to add the graph elements. In production this method would not be recommended for large volumes of data; however, it is fine for smaller data sets and can generally be done in a few stages, outlined in the following diagram.
flowchart LR
A(Raw Data) --> B(GenerateElements)
B --> C(AddElements)
The first stage is taking the raw input data and converting it into Gaffer elements via an element
generator class. Gaffer includes a few built-in
generators, but you can use a custom class or
pre-process the data before passing it to Gaffer so that you're able to use a default generator. Once
the data has been converted to elements it needs to be added into the graph. To load elements there
is a standard AddElements
operation, which takes raw element JSON as input and adds the elements into the
graph.
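As a minimal sketch, an AddElements request body with raw element JSON as its input could look like the following (the vertex and property values here are made up and would need to conform to your schema):
{
    "class": "AddElements",
    "input": [
        {
            "class": "uk.gov.gchq.gaffer.data.element.Entity",
            "group": "Person",
            "vertex": "v99",
            "properties": {
                "name": "alice",
                "age": 30
            }
        }
    ]
}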
Info
This is where the schema is used to validate that the elements are correct and conform before they are added to the graph.
Using the example again, we will demonstrate how to write an operation chain to load the data from the Neo4j-formatted CSV file.
{
    "class": "OperationChain",
    "operations": [
        {
            "class": "ImportFromLocalFile",
            "filePath": "/gaffer/data/neo4jExport.csv"
        },
        {
            "class": "GenerateElements",
            "elementGenerator": {
                "class": "Neo4jCsvElementGenerator"
            }
        },
        {
            "class": "AddElements"
        }
    ]
}
The operation chain above essentially mirrors the stages in the previous diagram. In the example
chain we first ingest the data via the ImportFromLocalFile
operation (an additional operation we added
via the operationsDeclarations.json), which
streams the data from the CSV file into the next GenerateElements
operation.
For the generator we have selected the built-in Neo4jCsvElementGenerator
class, which is already
set up to parse a correctly formatted Neo4j export CSV into Gaffer elements via the
schema. If you are curious about the output of each operation, you can try running a subset of
this chain to see how the data changes at each stage; the output is returned in the
server response section of the Swagger API.
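For example, dropping the final AddElements operation from the chain above lets you preview the generated elements in the response without modifying the graph:
{
    "class": "OperationChain",
    "operations": [
        {
            "class": "ImportFromLocalFile",
            "filePath": "/gaffer/data/neo4jExport.csv"
        },
        {
            "class": "GenerateElements",
            "elementGenerator": {
                "class": "Neo4jCsvElementGenerator"
            }
        }
    ]
}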
Querying Data
Once data is loaded into the graph it is now possible to start querying the data to gain insight and
perform analytics. Querying in Gaffer can get fairly complex but generally simple queries are made
up of two parts; a Get
Operation and a View
.
Starting with the Get operation, say we want to get all entities and edges based on their ID. To do
this we can use the GetElements operation and set the Seed to the vertex or edge
where we want to start the search. To demonstrate this on the example graph, we can attempt to get
all entities and edges associated with the Person entity with ID v1.
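A sketch of that query, assuming we seed it with an EntitySeed for the v1 vertex, might look like this:
{
    "class": "GetElements",
    "input": [
        {
            "class": "EntitySeed",
            "vertex": "v1"
        }
    ]
}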
The result from this query should return the entity associated with the v1
id along with any edges
on this vertex, which in this case is just one.
[
    {
        "class": "uk.gov.gchq.gaffer.data.element.Entity",
        "group": "Person",
        "vertex": "v1",
        "properties": {
            "name": "marko",
            "age": 29
        }
    },
    {
        "class": "uk.gov.gchq.gaffer.data.element.Edge",
        "group": "Created",
        "source": "v1",
        "destination": "v2",
        "directed": true,
        "matchedVertex": "SOURCE",
        "properties": {
            "weight": {
                "java.lang.Float": 0.4
            }
        }
    }
]
Filtering Data
The final bit of querying this guide will go into is how to apply a View
to a returned set of
elements. A View
in Gaffer allows you to filter, aggregate, transform and just generally
manipulate the results. In general a View
has the following possible use cases:
- Filtering - General filtering on elements based on predicates. Filtering can be applied pre-aggregation, post-aggregation and post-transformation.
- Aggregation - Controls how similar elements are aggregated together. You can provide a subset of the schema groupBy properties and override the aggregation functions.
- Transformation - Transformations can be applied by providing Functions to transform properties and vertex values. This is a powerful feature: you can override the existing values, or you can transform and save the new value into a new transient property.
- Property Removal - The properties you want returned can be controlled. You can use either properties or excludeProperties to define the list of properties to be included or excluded (a small sketch follows this list).
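As a small sketch of the last point (the group and property names follow the example graph, and the view structure is an assumption based on the general View JSON format), a query that only returns the name property on Person entities could look like:
{
    "class": "GetElements",
    "input": [
        {
            "class": "EntitySeed",
            "vertex": "v1"
        }
    ],
    "view": {
        "entities": {
            "Person": {
                "properties": [ "name" ]
            }
        }
    }
}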
Taking the example from the previous section, we will demonstrate general filtering on a query. As
before, the query returns the vertex v1 and any edges associated with it. We will now filter it to
include only edges where the weight is over a certain value. This is analogous to
asking, "get all the Created edges on vertex v1 that have a weight greater than 0.3".
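A sketch of that filtered query, using the IsMoreThan predicate in a pre-aggregation filter in the View (the group and property names follow the example graph), might be:
{
    "class": "GetElements",
    "input": [
        {
            "class": "EntitySeed",
            "vertex": "v1"
        }
    ],
    "view": {
        "edges": {
            "Created": {
                "preAggregationFilterFunctions": [
                    {
                        "selection": [ "weight" ],
                        "predicate": {
                            "class": "IsMoreThan",
                            "value": {
                                "java.lang.Float": 0.3
                            }
                        }
                    }
                ]
            }
        }
    }
}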
Tip
As you can see, filtering is based around predicates, which are similar to if/else statements in traditional programming. For a full list of available predicates, refer to the reference documentation.