Stores Guide
A Gaffer Store represents the backing database responsible for storing (or facilitating access to) a graph. Ordinarily a Store provides backing for a single graph. Stores which provide access to other stores can support multiple graphs. So far only the Federated Store supports this.
Gaffer currently supplies the following store implementations:
- Map Store - Simple in-memory store
- Accumulo Store - Apache Accumulo backed store
- Proxy Store - Delegates/forwards queries to another Gaffer REST API
- Federated Store - Federates queries across multiple graphs
Store Properties
Stores are configured using key=value style properties stored in a store properties file.
There are general properties which apply to all Stores and per Store properties for configuring specific behaviour.
Most properties are optional; if a property is not specified, its default value is used.
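As a minimal sketch of what such a file can look like, the fragment below sets a handful of the general properties listed in the table that follows. The values are illustrative only (the suffix value is hypothetical), and any property left out simply falls back to its default.

```properties
# Allow Named Operations to reference other Named Operations (default: false)
gaffer.named.operation.nested=true
# Keep the default JSON strictness
gaffer.serialiser.json.strict=false
# Turn on technical debugging output (default: false)
gaffer.error-mode.debug=true
# Hypothetical suffix value, used instead of the graphId for cache entries
gaffer.cache.service.default.suffix=exampleSuffix
```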
All General Store Properties
The properties in bold are set based on the type of Gaffer Store. For how to configure them, see the respective page for each store type.
| Property | Default | Description |
|---|---|---|
| | None | Class Name String to set Gaffer Store class |
| | | Class Name String to set class to use for serialising Schemas |
| | | Class Name String to set Gaffer Store Properties class |
| | None | Path to Operation Declarations files (separate multiple files with commas) |
| | None | JSON String containing Operation Declarations |
| | False | Controls if the Job Tracker is to be used |
| | True | Controls if Named Operations can be used |
| | True | Controls if Named Views can be used |
| | 50 | Number of threads to be used by the Job Tracker ExecutorService |
| | None | String for Auth to associate with Administrator Users |
| | None | Reflection Packages to add to Koryphe ReflectionUtil |
| gaffer.named.operation.nested | False | Controls if NamedOperations are allowed to reference/nest other NamedOperations |
| gaffer.serialiser.json.class | | Class Name String for setting a custom class extending JSONSerialiser |
| gaffer.serialiser.json.modules | None | Class Name String for registering classes implementing JSONSerialiserModules (separate multiple modules with commas) |
| gaffer.serialiser.json.strict | False | Controls if unknown fields should be ignored when serialising JSON (sets Jackson FAIL_ON_UNKNOWN_PROPERTIES internally) |
| gaffer.error-mode.debug | False | Controls technical debugging by methods calling DebugUtil |
| gaffer.cache.service.default.class | None | Fully-qualified class name of a Gaffer cache implementation to use as the default |
| gaffer.cache.service.jobtracker.class | None | Gaffer cache implementation to use for the Job Tracker |
| gaffer.cache.service.namedview.class | None | Gaffer cache implementation to use for Named Views |
| gaffer.cache.service.namedoperation.class | None | Gaffer cache implementation to use for Named Operations |
| gaffer.cache.config.file | None | Config file to use with a Gaffer cache implementation |
| gaffer.cache.service.default.suffix | graphId | String to use as the default cache suffix |
| | None | String to override the default suffix used by Federated Store graph cache |
| gaffer.cache.service.named.operation.suffix | None | String to override the default suffix used by Named Operation cache |
| gaffer.cache.service.job.tracker.suffix | None | String to override the default suffix used by Job Tracker cache |
| gaffer.cache.service.named.view.suffix | None | String to override the default suffix used by Named View cache |
Caches
Gaffer comes with three cache implementations:
- HashMapCacheService - Uses a Java HashMap as the cache data store. See Javadoc.
- JcsCacheService - Uses Apache Commons JCS for the cache data store. See Javadoc.
- HazelcastCacheService - Uses Hazelcast for the cache data store. See Javadoc.
The HashMap cache is not persistent. If using the Hazelcast instance of the Cache service, be aware that once the last node shuts down, all data will be lost. This is due to the data being held in memory in a distributed system.
For information on implementing caches, see the cache developer docs page.
Cache configuration includes selecting which cache service to use and optionally specifying a cache suffix.
Cache Service
For the cache service to run, you must select an implementation. You can set the default implementation with the gaffer.cache.service.default.class property in your store properties file.
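A minimal sketch of selecting the default cache implementation is shown below. The property name comes from the table above; the package prefix of the class name is an assumption, so check the Javadoc for the fully-qualified name in your Gaffer version.

```properties
# Assumed fully-qualified class name of the in-memory HashMap cache
gaffer.cache.service.default.class=uk.gov.gchq.gaffer.cache.impl.HashMapCacheService
```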
Both the JCS and Hazelcast caches require configuration files. For JCS this is a .ccf file, while for Hazelcast it is commonly an XML or YAML file.
You should then specify the location of any configuration file(s) with the gaffer.cache.config.file property in your store properties file.
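For example, a JCS-backed cache might be configured roughly as follows. The class name's package and the file path are assumptions made for illustration, not values taken from this guide.

```properties
# Assumed fully-qualified class name of the JCS cache implementation
gaffer.cache.service.default.class=uk.gov.gchq.gaffer.cache.impl.JcsCacheService
# Hypothetical path to the JCS .ccf configuration file
gaffer.cache.config.file=/path/to/cache.ccf
```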
Additionally, the cache service implementation to use for the Job Tracker, Named Views and Named Operations can be set independently (as given in the properties table above). The default service should still be specified unless all optional cache class properties are given. If cache services are set independently but use the same implementation class, multiple caches of the same kind will be created. Setting the cache service independently is intended to allow different cache implementations to be used at the same time. Depending on the implementation, using multiple instances of the same implementation may not work correctly.
Currently it is not possible to specify different cache config files when multiple cache implementations are used. The same config file property is passed to all implementations.
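The sketch below illustrates mixing implementations: a HashMap cache as the default (used here for Named Operations and Named Views) and Hazelcast for the Job Tracker only. The class name packages and the config path are assumptions. Note that the single config file setting is seen by every implementation.

```properties
# Default implementation for any cache type not overridden below (assumed class name)
gaffer.cache.service.default.class=uk.gov.gchq.gaffer.cache.impl.HashMapCacheService
# Job Tracker cache uses a different implementation (assumed class name)
gaffer.cache.service.jobtracker.class=uk.gov.gchq.gaffer.cache.impl.HazelcastCacheService
# Hypothetical path; this one property is passed to all cache implementations
gaffer.cache.config.file=/path/to/hazelcast.xml
```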
To prevent conflicts between different graphs which share the same cache service, by default the cache entries for each graph are appended with a suffix. The default value of this suffix is the Graph's ID.
You can manually specify the default suffix to use for all types of cache by setting the store property gaffer.cache.service.default.suffix to the desired String.
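As a small illustration, the following line overrides the default suffix for every cache type of a graph. The suffix value is hypothetical.

```properties
# Hypothetical suffix, used for all cache types instead of the graph's ID
gaffer.cache.service.default.suffix=teamA
```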
By default the cache entry is named the type of cache followed by _ and the graphId. For example, when using a Federated Store with multiple sub-graphs named graphA, graphB and graphC, for Named Operations there will be three cache entries called NamedOperationCache_graphA, NamedOperationCache_graphB and NamedOperationCache_graphC.
In the past (Gaffer versions 1.x) these suffixes did not exist, and all graphs used the same cache entries. If you want two or more graphs to share the same cache entry, then configure them to use the same suffix.
An example where you might want to share the same cache entry is when using Named Operations and a Federated Store.
Adding a Named Operation to a Federated Store won't make it available to sub-graphs (when using a FederatedOperation to execute it) unless the sub-graphs share the same Named Operation cache as the Federated Store.
If you only want a certain kind of cache entry to be shared, e.g. only share Named Operations, then set the suffix specific to that cache entry. See the table above for all the properties for this. You could also set the default cache suffix to share everything and set a specific suffix to be different and therefore not shared.
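A rough sketch of the Named Operations case: the same line would be placed in the store properties of the Federated Store and of each sub-graph that should see its Named Operations. The suffix value is hypothetical, and all other cache types keep their per-graph default suffix.

```properties
# Same value in the Federated Store's and each sub-graph's store properties
gaffer.cache.service.named.operation.suffix=sharedNamedOperations
```

The inverse arrangement works the same way: give every graph the same gaffer.cache.service.default.suffix, then set one specific suffix property (for example gaffer.cache.service.job.tracker.suffix) to a distinct value on each graph to keep that cache separate.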
Configuring customisable Operations
Some operations are not available by default and you will need to manually configure them.
These customisable operations can be added to your Gaffer graph by providing config in one or more operation declaration JSON files.
Named Operations
Named Operations depend on the Cache service being active at runtime. See Caches above for how to enable it.
Operation scores determine whether a particular user has the required permissions to execute a given OperationChain. See Operation Scores for how to enable and configure these.