Class BloomFilterUtils
- java.lang.Object
  - uk.gov.gchq.gaffer.accumulostore.utils.BloomFilterUtils
public final class BloomFilterUtils extends Object
Utilities for the creation of Bloom filters.
Method Summary
- static int calculateBloomFilterSize(double falsePositiveRate, int numItemsToBeAdded, int maximumSize)
  Calculates the size of the BloomFilter needed to achieve the desired false positive rate given that the specified number of items will be added to the set, but with the maximum size limited as specified.
- static int calculateNumHashes(int bloomFilterSize, int numItemsToBeAdded)
  Calculates the optimal number of hash functions to use in a BloomFilter of the given size, to which the given number of items will be added.
- static org.apache.hadoop.util.bloom.BloomFilter getBloomFilter(double falsePositiveRate, int numItemsToBeAdded, int maximumSize)
  Returns a BloomFilter of the necessary size to achieve the given false positive rate (subject to the given maximum size), configured with the optimal number of hash functions.
- static org.apache.hadoop.util.bloom.BloomFilter getBloomFilter(int size)
  Returns a BloomFilter of the given size.
Method Detail
calculateBloomFilterSize
public static int calculateBloomFilterSize(double falsePositiveRate, int numItemsToBeAdded, int maximumSize)
Calculates the size of the BloomFilter needed to achieve the desired false positive rate given that the specified number of items will be added to the set, but with the maximum size limited as specified.
- Parameters:
  falsePositiveRate - the false positive rate
  numItemsToBeAdded - the number of items to be added
  maximumSize - the maximum size
- Returns:
  An integer representing the size of the Bloom filter needed.
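The sizing rule described above can be sketched with the standard Bloom filter formula, m = -n * ln(p) / (ln 2)^2, where n is the number of items and p is the false positive rate. This is an illustration only: the class name BloomSizeSketch, the truncating cast, and the assumption that size is counted in bits are not taken from this page, and the actual Gaffer implementation may round or validate its inputs differently.

```java
// Hypothetical sketch; the class name and rounding choice are illustrative, not part of Gaffer.
final class BloomSizeSketch {

    // Standard sizing formula: m = -n * ln(p) / (ln 2)^2, capped at maximumSize.
    static int calculateSize(double falsePositiveRate, int numItemsToBeAdded, int maximumSize) {
        double ln2 = Math.log(2.0);
        double ideal = -numItemsToBeAdded * Math.log(falsePositiveRate) / (ln2 * ln2);
        // Compare as doubles first so a very large ideal size cannot overflow the int cast.
        return (int) Math.min(ideal, (double) maximumSize);
    }

    public static void main(String[] args) {
        // Roughly 9,585 bits for 1,000 items at a 1% false positive rate.
        System.out.println(calculateSize(0.01, 1000, 1_000_000));
        // The cap wins when the ideal size exceeds it.
        System.out.println(calculateSize(0.01, 1000, 5000)); // prints 5000
    }
}
```

When the cap applies, the filter is smaller than the ideal size, so the achieved false positive rate will be higher than the one requested.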
calculateNumHashes
public static int calculateNumHashes(int bloomFilterSize, int numItemsToBeAdded)
Calculates the optimal number of hash functions to use in a BloomFilter of the given size, to which the given number of items will be added.
- Parameters:
  bloomFilterSize - the size of the Bloom filter
  numItemsToBeAdded - the number of items to be added
- Returns:
  An integer representing the optimal number of hashes to use.
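For a filter of m bits holding n items, the textbook optimum for the number of hash functions is k = (m / n) * ln 2. The sketch below shows that calculation. The class name BloomHashCountSketch, the rounding to the nearest integer, and the lower bound of one hash are assumptions made for illustration; the Gaffer method may truncate or bound the result differently.

```java
// Hypothetical sketch; the class name and rounding choice are illustrative, not part of Gaffer.
final class BloomHashCountSketch {

    // Textbook optimum: k = (m / n) * ln 2, rounded to the nearest integer and at least 1.
    static int calculateNumHashes(int bloomFilterSize, int numItemsToBeAdded) {
        double ideal = ((double) bloomFilterSize / numItemsToBeAdded) * Math.log(2.0);
        return Math.max(1, (int) Math.round(ideal));
    }

    public static void main(String[] args) {
        System.out.println(calculateNumHashes(9585, 1000)); // prints 7
        // An undersized filter still needs at least one hash function.
        System.out.println(calculateNumHashes(100, 1000));  // prints 1
    }
}
```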
getBloomFilter
public static org.apache.hadoop.util.bloom.BloomFilter getBloomFilter(double falsePositiveRate, int numItemsToBeAdded, int maximumSize)
Returns a BloomFilter of the necessary size to achieve the given false positive rate (subject to the given maximum size), configured with the optimal number of hash functions.
- Parameters:
  falsePositiveRate - the false positive rate
  numItemsToBeAdded - the number of items to be added
  maximumSize - the maximum size
- Returns:
  A new BloomFilter with the desired settings.
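To show how a capped size and a hash count fit together in a working filter, here is a minimal, dependency-free sketch built on java.util.BitSet. It is not the Hadoop BloomFilter and not the Gaffer implementation: the class TinyBloomFilterSketch, its method names, and the simple double-hashing scheme over String.hashCode() are all assumptions chosen to keep the example self-contained.

```java
import java.util.BitSet;

// Hypothetical, self-contained sketch; this is not the Hadoop BloomFilter.
final class TinyBloomFilterSketch {
    private final BitSet bits;
    private final int size;      // number of bits in the filter
    private final int numHashes; // number of bit positions set per item

    private TinyBloomFilterSketch(int size, int numHashes) {
        this.bits = new BitSet(size);
        this.size = size;
        this.numHashes = numHashes;
    }

    // Sizes the filter with the standard formulas, capped at maximumSize.
    static TinyBloomFilterSketch create(double falsePositiveRate, int numItemsToBeAdded, int maximumSize) {
        double ln2 = Math.log(2.0);
        double ideal = -numItemsToBeAdded * Math.log(falsePositiveRate) / (ln2 * ln2);
        int size = (int) Math.max(1.0, Math.min(ideal, (double) maximumSize));
        int numHashes = Math.max(1, (int) Math.round(((double) size / numItemsToBeAdded) * ln2));
        return new TinyBloomFilterSketch(size, numHashes);
    }

    // Derives the i-th bit index from two base hashes (double hashing).
    private int index(String item, int i) {
        int h1 = item.hashCode();
        int h2 = Integer.rotateLeft(h1 * 0x9E3779B9, 15) | 1; // forced odd so it is never zero
        return Math.floorMod(h1 + i * h2, size);
    }

    void add(String item) {
        for (int i = 0; i < numHashes; i++) {
            bits.set(index(item, i));
        }
    }

    // False means definitely absent; true means possibly present.
    boolean mightContain(String item) {
        for (int i = 0; i < numHashes; i++) {
            if (!bits.get(index(item, i))) {
                return false;
            }
        }
        return true;
    }

    int size() {
        return size;
    }

    int numHashes() {
        return numHashes;
    }

    public static void main(String[] args) {
        TinyBloomFilterSketch filter = create(0.01, 1000, 1_000_000);
        filter.add("vertexA");
        System.out.println(filter.mightContain("vertexA")); // prints true
        System.out.println(filter.size() + " bits, " + filter.numHashes() + " hashes");
    }
}
```

An item that was added always tests as possibly present; an item that was never added may still do so, at a rate that depends on the size, the hash count, and the quality of the hash functions.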
getBloomFilter
public static org.apache.hadoop.util.bloom.BloomFilter getBloomFilter(int size)
Returns a BloomFilter of the given size.
- Parameters:
  size - the size of the Bloom filter to create
- Returns:
  A new BloomFilter of the desired size.