Class AccumuloIDWithinSetRetriever
- java.lang.Object
- uk.gov.gchq.gaffer.accumulostore.retriever.AccumuloRetriever<OP,Element>
- uk.gov.gchq.gaffer.accumulostore.retriever.AccumuloSetRetriever<GetElementsWithinSet>
- uk.gov.gchq.gaffer.accumulostore.retriever.impl.AccumuloIDWithinSetRetriever
- All Implemented Interfaces: Closeable, AutoCloseable, Iterable<Element>
public class AccumuloIDWithinSetRetriever extends AccumuloSetRetriever<GetElementsWithinSet>
Retrieves Edges where both ends are in a given set of EntityIds, and Entities where the vertex is in the set. BloomFilters are used on the server to identify edges that are likely to be between members of the set, and only these are sent to the client. This reduces the amount of data sent to the client.
This operates in two modes. In the first mode the seeds are loaded into memory (client-side). They are also loaded into a BloomFilter. This is passed to the iterators to filter out all edges that are definitely not between elements of the set. A secondary check is done within this class to confirm that each edge is definitely between elements of the set (this removes any false positives, i.e. edges that passed the BloomFilter check in the iterators). This secondary check uses the in-memory set of seeds, so no false positives are returned to the user.
In the second mode, where there are too many seeds to be loaded into memory, the seeds are queried one batch at a time. When the first batch is queried for, a BloomFilter of the first batch is created and passed to the iterators. This filters out all edges that are definitely not between elements of the first batch. When the second batch is queried for, the same BloomFilter has the second batch added to it. This is passed to the iterators, which filter out all edges that are definitely not between elements of the second batch and the first or second batch. This process repeats until all seeds have been queried for. This is best thought of as a square split into a grid (with the same number of squares in both dimensions). As there are too many seeds to load into memory, we use a client-side BloomFilter to further reduce the chances of false positives making it to the user.
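The first mode described above can be illustrated with a small, self-contained sketch. It is not Gaffer's implementation and uses no Gaffer or Accumulo classes: `WithinSetSketch`, `Edge`, `TinyBloomFilter`, `serverSideFilter`, `clientSideCheck` and `edgesWithinSet` are hypothetical names chosen for illustration, and the toy Bloom filter is an assumed stand-in for the real one. It only shows the idea of an approximate filter followed by an exact check against the in-memory seed set.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.BitSet;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class WithinSetSketch {

    /** Hypothetical stand-in for an edge: just a source and a destination vertex. */
    static final class Edge {
        final String source;
        final String destination;

        Edge(String source, String destination) {
            this.source = source;
            this.destination = destination;
        }

        @Override
        public String toString() {
            return source + "->" + destination;
        }
    }

    /** Toy Bloom filter over strings: may give false positives, never false negatives. */
    static final class TinyBloomFilter {
        private final BitSet bits;
        private final int size;
        private final int numHashes;

        TinyBloomFilter(int size, int numHashes) {
            this.bits = new BitSet(size);
            this.size = size;
            this.numHashes = numHashes;
        }

        void add(String key) {
            for (int i = 0; i < numHashes; i++) {
                bits.set(index(key, i));
            }
        }

        boolean mightContain(String key) {
            for (int i = 0; i < numHashes; i++) {
                if (!bits.get(index(key, i))) {
                    return false;
                }
            }
            return true;
        }

        // Derive the i-th bit position from the key's hash code (simple double hashing).
        private int index(String key, int i) {
            int h1 = key.hashCode();
            int h2 = (31 * h1 + 17) | 1;
            return Math.floorMod(h1 + i * h2, size);
        }
    }

    /** Approximate step (server side in the description): keep edges whose ends both pass the filter. */
    static List<Edge> serverSideFilter(List<Edge> edges, TinyBloomFilter filter) {
        List<Edge> candidates = new ArrayList<>();
        for (Edge e : edges) {
            if (filter.mightContain(e.source) && filter.mightContain(e.destination)) {
                candidates.add(e);
            }
        }
        return candidates;
    }

    /** Exact step (client side): drop any false positives using the in-memory seed set. */
    static List<Edge> clientSideCheck(List<Edge> candidates, Set<String> seeds) {
        List<Edge> result = new ArrayList<>();
        for (Edge e : candidates) {
            if (seeds.contains(e.source) && seeds.contains(e.destination)) {
                result.add(e);
            }
        }
        return result;
    }

    /** Loads the seeds into the filter, then applies the approximate and exact steps in turn. */
    static List<Edge> edgesWithinSet(List<Edge> edges, Set<String> seeds) {
        TinyBloomFilter filter = new TinyBloomFilter(1024, 3);
        for (String seed : seeds) {
            filter.add(seed);
        }
        return clientSideCheck(serverSideFilter(edges, filter), seeds);
    }

    public static void main(String[] args) {
        List<Edge> edges = Arrays.asList(new Edge("A", "B"), new Edge("B", "C"), new Edge("C", "D"));
        Set<String> seeds = new HashSet<>(Arrays.asList("A", "B", "C"));
        System.out.println(edgesWithinSet(edges, seeds));
    }
}
```

Because the exact check runs against the full seed set, the result does not depend on how many false positives the approximate filter lets through; the filter only affects how many candidates have to be examined.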
Constructor Summary
Constructors
AccumuloIDWithinSetRetriever(AccumuloStore store, GetElementsWithinSet operation, User user, boolean readEntriesIntoMemory, org.apache.accumulo.core.client.IteratorSetting... iteratorSettings)
AccumuloIDWithinSetRetriever(AccumuloStore store, GetElementsWithinSet operation, User user, org.apache.accumulo.core.client.IteratorSetting... iteratorSettings)
-
Method Summary
-
Methods inherited from class uk.gov.gchq.gaffer.accumulostore.retriever.AccumuloSetRetriever
iterator, setReadEntriesIntoMemory
-
Methods inherited from class uk.gov.gchq.gaffer.accumulostore.retriever.AccumuloRetriever
close, doPostFilter, doTransformation
-
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
-
Methods inherited from interface java.lang.Iterable
forEach, spliterator
-
Constructor Detail
-
AccumuloIDWithinSetRetriever
public AccumuloIDWithinSetRetriever(AccumuloStore store, GetElementsWithinSet operation, User user, org.apache.accumulo.core.client.IteratorSetting... iteratorSettings) throws StoreException
- Throws: StoreException
-
AccumuloIDWithinSetRetriever
public AccumuloIDWithinSetRetriever(AccumuloStore store, GetElementsWithinSet operation, User user, boolean readEntriesIntoMemory, org.apache.accumulo.core.client.IteratorSetting... iteratorSettings) throws StoreException
- Throws: StoreException