Class BloomFilterUtils
java.lang.Object
  uk.gov.gchq.gaffer.accumulostore.utils.BloomFilterUtils

public final class BloomFilterUtils extends Object
Utilities for the creation of Bloom filters.

Method Summary
Modifier and Type, Method, Description:
static int calculateBloomFilterSize(double falsePositiveRate, int numItemsToBeAdded, int maximumSize)
- Calculates the size of the BloomFilter needed to achieve the desired false positive rate given that the specified number of items will be added to the set, but with the maximum size limited as specified.
static int calculateNumHashes(int bloomFilterSize, int numItemsToBeAdded)
- Calculates the optimal number of hash functions to use in a BloomFilter of the given size, to which the given number of items will be added.
static org.apache.hadoop.util.bloom.BloomFilter getBloomFilter(double falsePositiveRate, int numItemsToBeAdded, int maximumSize)
- Returns a BloomFilter of the necessary size to achieve the given false positive rate (subject to the given maximum size), configured with the optimal number of hash functions.
static org.apache.hadoop.util.bloom.BloomFilter getBloomFilter(int size)
- Returns a BloomFilter of the given size.
Method Detail
-
calculateBloomFilterSize
public static int calculateBloomFilterSize(double falsePositiveRate, int numItemsToBeAdded, int maximumSize)
Calculates the size of the BloomFilter needed to achieve the desired false positive rate given that the specified number of items will be added to the set, but with the maximum size limited as specified.
Parameters:
falsePositiveRate - the false positive rate
numItemsToBeAdded - the number of items to be added
maximumSize - the maximum size
Returns:
An integer representing the size of the Bloom filter needed.
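This page does not state which formula the method uses. As a rough illustration only, the textbook sizing rule for a Bloom filter holding n items at a target false positive rate p is m = -n ln(p) / (ln 2)^2 bits. The sketch below applies that rule and caps the result at a maximum. The names `BloomSizeSketch` and `sizeFor` are hypothetical, and the rounding and argument checks are assumptions; BloomFilterUtils may compute its value differently.

```java
/** Hypothetical sketch of the textbook sizing rule; not the Gaffer implementation. */
final class BloomSizeSketch {
    private BloomSizeSketch() { }

    /**
     * Returns m = -n * ln(p) / (ln 2)^2, rounded up and capped at maximumSize.
     * Rounding up is an assumption made for this sketch.
     */
    static int sizeFor(double falsePositiveRate, int numItemsToBeAdded, int maximumSize) {
        if (falsePositiveRate <= 0.0 || falsePositiveRate >= 1.0) {
            throw new IllegalArgumentException("falsePositiveRate must be in (0, 1)");
        }
        if (numItemsToBeAdded < 0 || maximumSize < 0) {
            throw new IllegalArgumentException("counts and sizes must be non-negative");
        }
        double ln2 = Math.log(2.0);
        double ideal = Math.ceil(-numItemsToBeAdded * Math.log(falsePositiveRate) / (ln2 * ln2));
        // Compare as doubles so a very large ideal size cannot overflow int before capping.
        return (int) Math.min(ideal, (double) maximumSize);
    }

    public static void main(String[] args) {
        // Same inputs, once with a generous cap and once with a cap below the ideal size.
        System.out.println(sizeFor(0.01, 1000, 1_000_000));
        System.out.println(sizeFor(0.01, 1000, 5000));
    }
}
```

When the cap is smaller than the ideal size, the returned filter size is the cap, so the achieved false positive rate will be higher than the one requested.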
-
calculateNumHashes
public static int calculateNumHashes(int bloomFilterSize, int numItemsToBeAdded)
Calculates the optimal number of hash functions to use in a BloomFilter of the given size, to which the given number of items will be added.
Parameters:
bloomFilterSize - the size of the Bloom filter
numItemsToBeAdded - the number of items to be added
Returns:
An integer representing the optimal number of hashes to use.
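For orientation, the standard result for a Bloom filter of m bits holding n items is that k = (m / n) ln 2 hash functions minimise the false positive rate. The sketch below applies that rule. The names `BloomHashSketch` and `numHashesFor` are hypothetical, and rounding to the nearest integer with a floor of one hash is an assumption of this sketch, not a documented behaviour of BloomFilterUtils.

```java
/** Hypothetical sketch of the textbook hash-count rule; not the Gaffer implementation. */
final class BloomHashSketch {
    private BloomHashSketch() { }

    /**
     * Returns k = (m / n) * ln 2, rounded to the nearest integer,
     * and never less than 1 so the filter always uses at least one hash.
     */
    static int numHashesFor(int bloomFilterSize, int numItemsToBeAdded) {
        if (bloomFilterSize <= 0 || numItemsToBeAdded <= 0) {
            throw new IllegalArgumentException("size and item count must be positive");
        }
        // Cast before dividing so the bits-per-item ratio is not truncated by integer division.
        double ideal = ((double) bloomFilterSize / numItemsToBeAdded) * Math.log(2.0);
        return Math.max(1, (int) Math.round(ideal));
    }

    public static void main(String[] args) {
        // Roughly 9.6 bits per item, then an undersized filter with 0.1 bits per item.
        System.out.println(numHashesFor(9586, 1000));
        System.out.println(numHashesFor(100, 1000));
    }
}
```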
-
getBloomFilter
public static org.apache.hadoop.util.bloom.BloomFilter getBloomFilter(double falsePositiveRate, int numItemsToBeAdded, int maximumSize)
Returns a BloomFilter of the necessary size to achieve the given false positive rate (subject to the given maximum size), configured with the optimal number of hash functions.
Parameters:
falsePositiveRate - the false positive rate
numItemsToBeAdded - the number of items to be added
maximumSize - the maximum size
Returns:
A new BloomFilter with the desired settings.
-
getBloomFilter
public static org.apache.hadoop.util.bloom.BloomFilter getBloomFilter(int size)
Returns a BloomFilter of the given size.
Parameters:
size - the size of the Bloom filter to create
Returns:
A new BloomFilter of the desired size.