it.unimi.dsi.util
Class IntBloomFilter

java.lang.Object
  extended by it.unimi.dsi.util.IntBloomFilter
All Implemented Interfaces:
Serializable

public class IntBloomFilter
extends Object
implements Serializable

A Bloom filter for integers.

Instances of this class represent a set of integers (with false positives) using a Bloom filter. Because of the way Bloom filters work, you cannot remove elements.

Bloom filters have an expected error rate, depending on the number of hash functions used, on the filter size and on the number of elements in the filter. This implementation uses a variable optimal number of hash functions, depending on the expected number of elements. More precisely, a Bloom filter for n integers with d hash functions will use ln 2 dn ≈ 1.44 dn bits; false positives will happen with probability 2-d.

Hash functions are generated at creation time using universal hashing. Each hash function uses two integers A and B, and the integer x is mapped to (Ax)⊕B before taking the remainder modulo the number of bits in the filter.

This class exports access methods that are very similar to those of Set, but it does not implement that interface, as too many non-optional methods would be unimplementable (e.g., iterators).

Author:
Sebastiano Vigna
See Also:
Serialized Form

Field Summary
 int d
          The number of hash functions used by this filter.
 long m
          The number of bits in this filter.
 
Constructor Summary
IntBloomFilter(int n, int d)
          Creates a new Bloom filter with given number of hash functions and expected number of elements.
IntBloomFilter(long n, int d)
          Creates a new Bloom filter with given number of hash functions and expected number of elements.
IntBloomFilter(long n, int d, byte[] seed)
          Creates a new Bloom filter with given number of hash functions and expected number of elements.
 
Method Summary
 void add(int x)
          Adds an integer to the filter.
 boolean contains(int x)
          Checks whether the given integer is in this filter.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

m

public final long m
The number of bits in this filter.


d

public final int d
The number of hash functions used by this filter.

Constructor Detail

IntBloomFilter

public IntBloomFilter(int n,
                      int d)
Creates a new Bloom filter with given number of hash functions and expected number of elements.

Parameters:
n - the expected number of elements.
d - the number of hash functions; if the filter add not more than n elements, false positives will happen with probability 2-d.

IntBloomFilter

public IntBloomFilter(long n,
                      int d)
Creates a new Bloom filter with given number of hash functions and expected number of elements.

Parameters:
n - the expected number of elements.
d - the number of hash functions; if the filter add not more than n elements, false positives will happen with probability 2-d.

IntBloomFilter

public IntBloomFilter(long n,
                      int d,
                      byte[] seed)
Creates a new Bloom filter with given number of hash functions and expected number of elements.

Parameters:
n - the expected number of elements.
d - the number of hash functions; if the filter add not more than n elements, false positives will happen with probability 2-d.
seed - an array of bytes that will be used as a seed by SecureRandom.SecureRandom(byte[]).
Method Detail

contains

public boolean contains(int x)
Checks whether the given integer is in this filter.

Note that this method may return true on an integer that has not been added to the filter. This will happen with probability 2-d, where d is the number of hash functions specified at creation time, if the number of the elements in the filter is less than n, the number of expected elements specified at creation time.

Parameters:
x - an integer.
Returns:
true if the integer is in the filter (or if an integer with the same hash sequence is in the filter).

add

public void add(int x)
Adds an integer to the filter.

Parameters:
x - an integer.