BoundedTimestampSet
The code for this example is BoundedTimestampSetWalkthrough.
This example demonstrates how the BoundedTimestampSet property can be used to maintain a set of the timestamps at which an element was seen active. If the set grows beyond a user-specified size, a uniform random sample of the timestamps is maintained instead. In this example we record timestamps to minute-level accuracy (i.e. the seconds are ignored) and specify that at most 25 timestamps should be retained.
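The two ideas in the paragraph above, truncating to the minute and keeping a bounded uniform sample, can be sketched in plain Java. This is an illustration only and not Gaffer's implementation: the class MinuteSampleSketch and its methods are hypothetical names, and the sampling is a simple reservoir scheme that assumes timestamps arrive one at a time. Because values already held are skipped, the sample is only approximately uniform once the bound has been passed.

```java
import java.time.Instant;
import java.time.temporal.ChronoUnit;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

/** Hypothetical stand-in for illustration; this is not Gaffer's BoundedTimestampSet. */
class MinuteSampleSketch {
    private final int maxSize;
    private final Random random;
    private final List<Instant> kept = new ArrayList<>();
    private int offered = 0; // timestamps counted towards the sample so far

    MinuteSampleSketch(int maxSize, long seed) {
        this.maxSize = maxSize;
        this.random = new Random(seed);
    }

    /** Drops the seconds (and any fraction) so only minute-level accuracy remains. */
    static Instant toMinute(Instant timestamp) {
        return timestamp.truncatedTo(ChronoUnit.MINUTES);
    }

    void add(Instant timestamp) {
        Instant bucket = toMinute(timestamp);
        if (kept.contains(bucket)) {
            return; // already held: a set stores each minute once
        }
        offered++;
        if (kept.size() < maxSize) {
            kept.add(bucket); // below the bound: keep everything
        } else if (random.nextInt(offered) < maxSize) {
            // Reservoir step: accept with probability maxSize/offered,
            // then overwrite a uniformly chosen kept value.
            kept.set(random.nextInt(maxSize), bucket);
        }
    }

    /** True once more timestamps have been counted than can be held. */
    boolean isSampling() {
        return offered > maxSize;
    }

    List<Instant> timestamps() {
        List<Instant> sorted = new ArrayList<>(kept);
        Collections.sort(sorted);
        return sorted;
    }

    public static void main(String[] args) {
        Instant startOf2017 = Instant.parse("2017-01-01T00:00:00Z");
        Random source = new Random(1L);
        MinuteSampleSketch set = new MinuteSampleSketch(25, 2L);
        for (int i = 0; i < 1000; i++) {
            // A random second within 2017 (a 365-day year).
            set.add(startOf2017.plusSeconds(source.nextInt(365 * 24 * 60 * 60)));
        }
        System.out.println(set.isSampling() + " " + set.timestamps());
    }
}
```

Adding only three timestamps to the same sketch would leave isSampling() false and every timestamp retained, which mirrors the difference between the two edges shown at the end of this example.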
Elements schema
This is our new schema. The edge has a property called 'boundedTimestampSet', of type 'bounded.timestamp.set', which stores the BoundedTimestampSet object.
{
  "edges": {
    "red": {
      "source": "vertex.string",
      "destination": "vertex.string",
      "directed": "false",
      "properties": {
        "boundedTimestampSet": "bounded.timestamp.set"
      }
    }
  }
}
Types schema
We have added a new type, 'bounded.timestamp.set', which is a uk.gov.gchq.gaffer.time.BoundedTimestampSet object, together with its serialiser and aggregator. Gaffer will automatically aggregate these sets together to maintain a set of all the times the element was active. Once the set grows beyond 25 timestamps, a uniform random sample of at most 25 of them is maintained instead.
{
  "types": {
    "vertex.string": {
      "class": "java.lang.String",
      "validateFunctions": [
        {
          "class": "uk.gov.gchq.koryphe.impl.predicate.Exists"
        }
      ]
    },
    "bounded.timestamp.set": {
      "class": "uk.gov.gchq.gaffer.time.BoundedTimestampSet",
      "aggregateFunction": {
        "class": "uk.gov.gchq.gaffer.time.binaryoperator.BoundedTimestampSetAggregator"
      },
      "serialiser": {
        "class": "uk.gov.gchq.gaffer.time.serialisation.BoundedTimestampSetSerialiser"
      }
    },
    "false": {
      "class": "java.lang.Boolean",
      "validateFunctions": [
        {
          "class": "uk.gov.gchq.koryphe.impl.predicate.IsFalse"
        }
      ]
    }
  }
}
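To give a feel for what aggregating two bounded sets involves, here is a minimal sketch of one way to merge two uniform samples into a single uniform sample of bounded size. It is not the code behind BoundedTimestampSetAggregator: the names SampleMergeSketch, Sample and merge are hypothetical, and the sketch assumes the two inputs share no timestamps and that each holds either all of its timestamps or a full-size sample of them.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

/** Hypothetical illustration of merging two samples; not Gaffer's aggregator. */
class SampleMergeSketch {

    /** The kept timestamps plus the number of timestamps they were drawn from. */
    static final class Sample {
        final List<Instant> items;
        final long population;

        Sample(List<Instant> items, long population) {
            this.items = items;
            this.population = population;
        }
    }

    static Sample merge(Sample a, Sample b, int maxSize, Random random) {
        long total = a.population + b.population;
        if (total <= maxSize) {
            // Everything fits, so the result is simply the union.
            List<Instant> all = new ArrayList<>(a.items);
            all.addAll(b.items);
            return new Sample(all, total);
        }
        List<Instant> left = new ArrayList<>(a.items);
        List<Instant> right = new ArrayList<>(b.items);
        long leftRemaining = a.population;
        long rightRemaining = b.population;
        List<Instant> out = new ArrayList<>();
        while (out.size() < maxSize) {
            // Choose a side in proportion to how many timestamps it still
            // represents, then take one of its kept values at random.
            boolean fromLeft =
                    random.nextDouble() * (leftRemaining + rightRemaining) < leftRemaining;
            if (fromLeft) {
                out.add(left.remove(random.nextInt(left.size())));
                leftRemaining--;
            } else {
                out.add(right.remove(random.nextInt(right.size())));
                rightRemaining--;
            }
        }
        return new Sample(out, total);
    }
}
```

Weighting each draw by the remaining population is what keeps the result uniform: a set that represents 1000 timestamps should contribute far more of the merged sample than one that represents 3.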
There are two edges in the graph. Edge A-B was added 3 times, and each time it had the 'boundedTimestampSet' property containing a randomly generated timestamp from 2017. Edge A-C was added 1000 times, and each time it also had the 'boundedTimestampSet' property containing a randomly generated timestamp from 2017. Here are the edges:
Edge[source=A,destination=C,directed=false,matchedVertex=SOURCE,group=red,properties=Properties[boundedTimestampSet=<uk.gov.gchq.gaffer.time.BoundedTimestampSet>BoundedTimestampSet[timeBucket=MINUTE,state=SAMPLE,maxSize=25,timestamps=2017-01-01T13:46:00Z,2017-01-18T20:35:00Z,2017-01-23T09:25:00Z,2017-01-26T18:22:00Z,2017-03-03T13:15:00Z,2017-03-11T14:48:00Z,2017-04-09T03:52:00Z,2017-05-09T13:59:00Z,2017-05-10T20:37:00Z,2017-05-28T23:07:00Z,2017-07-24T05:07:00Z,2017-08-16T15:19:00Z,2017-09-02T03:01:00Z,2017-09-25T07:01:00Z,2017-09-27T07:08:00Z,2017-10-07T19:04:00Z,2017-10-16T21:05:00Z,2017-11-03T19:57:00Z,2017-11-11T21:27:00Z,2017-11-17T18:11:00Z,2017-11-22T09:03:00Z,2017-11-30T03:03:00Z,2017-12-09T21:38:00Z,2017-12-16T08:11:00Z,2017-12-22T21:24:00Z]]]
Edge[source=A,destination=B,directed=false,matchedVertex=SOURCE,group=red,properties=Properties[boundedTimestampSet=<uk.gov.gchq.gaffer.time.BoundedTimestampSet>BoundedTimestampSet[timeBucket=MINUTE,state=NOT_FULL,maxSize=25,timestamps=2017-02-12T14:21:00Z,2017-03-21T18:09:00Z,2017-12-24T08:00:00Z]]]
You can see that edge A-B retains all three of its timestamps (state=NOT_FULL), whereas edge A-C, which was added 1000 times, holds only a sample of 25 timestamps (state=SAMPLE).