Example 11 with Hasher

use of org.apache.commons.collections4.bloomfilter.hasher.Hasher in project commons-collections by apache.

the class ArrayCountingBloomFilterTest method constructorTest_Hasher_Duplicates.

/**
 * Tests that counts are correct when a hasher with duplicates is used in the
 * constructor.
 */
@Test
public void constructorTest_Hasher_Duplicates() {
    final int[] expected = { 0, 1, 1, 0, 0, 1 };
    // Some indexes with duplicates
    final Hasher hasher = new FixedIndexesTestHasher(shape, 1, 2, 2, 5);
    final ArrayCountingBloomFilter bf = createFilter(hasher, shape);
    final long[] lb = bf.getBits();
    assertEquals(1, lb.length);
    assertEquals(0b100110L, lb[0]);
    assertCounts(bf, expected);
}
Also used : Hasher(org.apache.commons.collections4.bloomfilter.hasher.Hasher) Test(org.junit.jupiter.api.Test)
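
The expected values in this test can be reproduced without the library. The following is a minimal sketch, not the ArrayCountingBloomFilter implementation: the class name DuplicateIndexSketch, the helpers countsOf and bitsOf, and the filter size of 6 bits are assumptions made for illustration. It shows why the duplicate index 2 yields a count of 1 and why the bit pattern is 0b100110.

```java
import java.util.Arrays;

// Hypothetical stand-alone sketch; none of these names come from commons-collections.
public class DuplicateIndexSketch {

    /** Builds per-index counts where each distinct index is counted once. */
    static int[] countsOf(int numberOfBits, int... indexes) {
        int[] counts = new int[numberOfBits];
        for (int index : indexes) {
            counts[index] = 1; // a repeated index does not increment the count again
        }
        return counts;
    }

    /** Packs the non-zero counts into 64-bit words, index 0 in the lowest bit. */
    static long[] bitsOf(int[] counts) {
        long[] bits = new long[(counts.length + 63) / 64];
        for (int i = 0; i < counts.length; i++) {
            if (counts[i] != 0) {
                bits[i >>> 6] |= 1L << (i & 63);
            }
        }
        return bits;
    }

    public static void main(String[] args) {
        int[] counts = countsOf(6, 1, 2, 2, 5); // assumed size of 6 bits
        System.out.println(Arrays.toString(counts));                // [0, 1, 1, 0, 0, 1]
        System.out.println(Long.toBinaryString(bitsOf(counts)[0])); // 100110
    }
}
```

Bits 1, 2 and 5 are set, which is 2 + 4 + 32 = 38, the same value as the 0b100110L literal asserted in the test.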

Example 12 with Hasher

use of org.apache.commons.collections4.bloomfilter.hasher.Hasher in project commons-collections by apache.

the class ArrayCountingBloomFilterTest method removeTest_Negative.

/**
 * Tests that removal is flagged as an error (the filter becomes invalid) when counts become negative.
 */
@Test
public void removeTest_Negative() {
    final Hasher hasher = new FixedIndexesTestHasher(shape, 1, 2, 3);
    final ArrayCountingBloomFilter bf = createFilter(hasher, shape);
    final Hasher hasher2 = new FixedIndexesTestHasher(shape, 2);
    final ArrayCountingBloomFilter bf2 = createFilter(hasher2, shape);
    // More - Less = OK
    bf.remove(bf2);
    assertTrue(bf.isValid());
    assertCounts(bf, new int[] { 0, 1, 0, 1 });
    // Less - More = Negative
    assertTrue(bf2.isValid());
    bf2.remove(bf);
    assertFalse(bf2.isValid(), "Remove should create negative counts and the filter is invalid");
    // The counts are not clipped to zero. They have been left as negative.
    assertCounts(bf2, new int[] { 0, -1, 1, -1 });
}
Also used : Hasher(org.apache.commons.collections4.bloomfilter.hasher.Hasher) Test(org.junit.jupiter.api.Test)
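
The arithmetic behind the two removals can be sketched with plain int arrays. This is an illustration under stated assumptions, not the library code: CountingRemoveSketch and its remove method are hypothetical, and returning a validity flag from remove is a simplification of the isValid() check used in the test.

```java
import java.util.Arrays;

// Hypothetical sketch of counting-filter removal; names are not from commons-collections.
public class CountingRemoveSketch {

    /**
     * Subtracts other from counts in place.
     * Returns false if any count went negative; counts are not clipped to zero.
     */
    static boolean remove(int[] counts, int[] other) {
        boolean valid = true;
        for (int i = 0; i < counts.length; i++) {
            counts[i] -= other[i];
            if (counts[i] < 0) {
                valid = false;
            }
        }
        return valid;
    }

    public static void main(String[] args) {
        int[] bf = {0, 1, 1, 1};  // indexes 1, 2, 3
        int[] bf2 = {0, 0, 1, 0}; // index 2

        // More - Less = OK
        System.out.println(remove(bf, bf2));      // true
        System.out.println(Arrays.toString(bf));  // [0, 1, 0, 1]

        // Less - More = Negative
        System.out.println(remove(bf2, bf));      // false
        System.out.println(Arrays.toString(bf2)); // [0, -1, 1, -1]
    }
}
```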

Example 13 with Hasher

use of org.apache.commons.collections4.bloomfilter.hasher.Hasher in project commons-collections by apache.

the class ArrayCountingBloomFilterTest method assertOperation.

/**
 * Assert a counting operation. The first set of indexes is used to create the
 * CountingBloomFilter. The second set of indexes is passed to the converter to
 * construct a suitable object to combine with the counting Bloom filter. The counts
 * of the first Bloom filter are checked using the expected counts.
 *
 * <p>Counts are assumed to map to indexes starting from 0.
 *
 * @param <F> the type of the filter
 * @param indexes1 the first set of indexes
 * @param indexes2 the second set of indexes
 * @param converter the converter
 * @param operation the operation
 * @param isValid the expected value for the operation result
 * @param expected the expected counts after the operation
 */
private <F> void assertOperation(final int[] indexes1, final int[] indexes2, final Function<int[], F> converter, final BiPredicate<ArrayCountingBloomFilter, F> operation, final boolean isValid, final int[] expected) {
    final Hasher hasher = new FixedIndexesTestHasher(shape, indexes1);
    final ArrayCountingBloomFilter bf = createFilter(hasher, shape);
    final F filter = converter.apply(indexes2);
    final boolean result = operation.test(bf, filter);
    assertEquals(isValid, result);
    assertEquals(isValid, bf.isValid());
    assertCounts(bf, expected);
}
Also used : Hasher(org.apache.commons.collections4.bloomfilter.hasher.Hasher)
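
The converter/operation pattern in this helper can be shown in isolation. The sketch below is a simplified stand-in that uses int[] counts in place of ArrayCountingBloomFilter; OperationHarnessSketch, countsOf, applyOperation and subtract are hypothetical names introduced only for this illustration.

```java
import java.util.Arrays;
import java.util.function.BiPredicate;
import java.util.function.Function;

// Hypothetical sketch of a generic "convert, then operate" test helper.
public class OperationHarnessSketch {

    /** Builds counts with a 1 at each given index. */
    static int[] countsOf(int size, int... indexes) {
        int[] counts = new int[size];
        for (int index : indexes) {
            counts[index] = 1;
        }
        return counts;
    }

    /** Converts indexes2 to an operand of type F, then applies the operation to counts. */
    static <F> boolean applyOperation(int[] counts, int[] indexes2,
            Function<int[], F> converter, BiPredicate<int[], F> operation) {
        F operand = converter.apply(indexes2);
        return operation.test(counts, operand);
    }

    /** Example operation: in-place subtraction, false if any count goes negative. */
    static boolean subtract(int[] counts, int[] other) {
        boolean valid = true;
        for (int i = 0; i < counts.length; i++) {
            counts[i] -= other[i];
            valid &= counts[i] >= 0;
        }
        return valid;
    }

    public static void main(String[] args) {
        int[] counts = countsOf(4, 1, 2, 3);
        boolean result = OperationHarnessSketch.<int[]>applyOperation(
                counts, new int[] {2},
                idx -> countsOf(4, idx),           // converter: indexes to counts
                OperationHarnessSketch::subtract); // operation under test
        System.out.println(result);                  // true
        System.out.println(Arrays.toString(counts)); // [0, 1, 0, 1]
    }
}
```

Keeping the operand type generic lets one helper cover operations that accept different argument types, which appears to be the motivation for the type parameter F in the original method.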

Example 14 with Hasher

use of org.apache.commons.collections4.bloomfilter.hasher.Hasher in project commons-collections by apache.

the class HasherBloomFilterTest method getBitsTest_LowestBitOnly.

/**
 * Test the edge case where the filter has only 1 bit in the lowest index and the getBits()
 * function returns an array of length 1.
 */
@Test
public void getBitsTest_LowestBitOnly() {
    final BloomFilter filter = createEmptyFilter(shape);
    // Set the lowest bit index only.
    filter.merge(new Hasher() {

        @Override
        public OfInt iterator(final Shape shape) {
            return Arrays.stream(new int[] { 0 }).iterator();
        }

        @Override
        public HashFunctionIdentity getHashFunctionIdentity() {
            return shape.getHashFunctionIdentity();
        }
    });
    assertArrayEquals(new long[] { 1L }, filter.getBits());
}
Also used : OfInt(java.util.PrimitiveIterator.OfInt) Hasher(org.apache.commons.collections4.bloomfilter.hasher.Hasher) DynamicHasher(org.apache.commons.collections4.bloomfilter.hasher.DynamicHasher) Shape(org.apache.commons.collections4.bloomfilter.hasher.Shape) HashFunctionIdentity(org.apache.commons.collections4.bloomfilter.hasher.HashFunctionIdentity) Test(org.junit.jupiter.api.Test)
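
The edge case checked here is that a single index of 0 still produces a one-element long array. A rough stand-alone sketch follows, assuming a bit array that grows only as far as the highest index seen; LowestBitSketch and toBits are hypothetical and do not mirror how HasherBloomFilter is implemented.

```java
import java.util.Arrays;
import java.util.PrimitiveIterator;

// Hypothetical sketch: build a long[] bit array from an iterator of bit indexes.
public class LowestBitSketch {

    /** Sets one bit per index, growing the array to hold the highest word needed. */
    static long[] toBits(PrimitiveIterator.OfInt indexes) {
        long[] bits = new long[0];
        while (indexes.hasNext()) {
            int index = indexes.nextInt();
            int word = index >>> 6; // which 64-bit word holds this index
            if (word >= bits.length) {
                bits = Arrays.copyOf(bits, word + 1);
            }
            bits[word] |= 1L << (index & 63);
        }
        return bits;
    }

    public static void main(String[] args) {
        long[] bits = toBits(Arrays.stream(new int[] {0}).iterator());
        System.out.println(Arrays.toString(bits)); // [1]
    }
}
```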

Example 15 with Hasher

use of org.apache.commons.collections4.bloomfilter.hasher.Hasher in project commons-collections by apache.

the class SetOperationsTest method cosineSimilarityTest_NoValues.

/**
 * Tests that the cosine similarity is correctly calculated when one or
 * both filters are empty.
 */
@Test
public final void cosineSimilarityTest_NoValues() {
    final BloomFilter filter1 = new HasherBloomFilter(shape);
    final BloomFilter filter2 = new HasherBloomFilter(shape);
    // build a filter
    final List<Integer> lst = Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17);
    final Hasher hasher = new StaticHasher(lst.iterator(), shape);
    final BloomFilter filter3 = new HasherBloomFilter(hasher, shape);
    assertEquals(0.0, SetOperations.cosineSimilarity(filter1, filter2), 0.0001);
    assertEquals(0.0, SetOperations.cosineSimilarity(filter2, filter1), 0.0001);
    assertEquals(0.0, SetOperations.cosineSimilarity(filter1, filter3), 0.0001);
    assertEquals(0.0, SetOperations.cosineSimilarity(filter3, filter1), 0.0001);
}
Also used : Hasher(org.apache.commons.collections4.bloomfilter.hasher.Hasher) StaticHasher(org.apache.commons.collections4.bloomfilter.hasher.StaticHasher) Test(org.junit.jupiter.api.Test)
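
Cosine similarity between two bit sets is commonly defined as |A AND B| / sqrt(|A| * |B|), which is 0/0 when a filter is empty. The sketch below assumes the convention that the test asserts, returning 0.0 in that case. CosineSimilaritySketch and its methods are hypothetical and operate on raw long[] bit arrays; this is not the SetOperations source.

```java
// Hypothetical sketch of cosine similarity over long[] bit arrays.
public class CosineSimilaritySketch {

    /** Number of set bits in the array. */
    static int cardinality(long[] bits) {
        int n = 0;
        for (long word : bits) {
            n += Long.bitCount(word);
        }
        return n;
    }

    /** Number of bits set in both arrays. */
    static int andCardinality(long[] a, long[] b) {
        int n = 0;
        int len = Math.min(a.length, b.length);
        for (int i = 0; i < len; i++) {
            n += Long.bitCount(a[i] & b[i]);
        }
        return n;
    }

    /** Returns 0.0 when either input is empty, avoiding a 0/0 division. */
    static double cosineSimilarity(long[] a, long[] b) {
        int ca = cardinality(a);
        int cb = cardinality(b);
        if (ca == 0 || cb == 0) {
            return 0.0;
        }
        return andCardinality(a, b) / Math.sqrt((double) ca * cb);
    }

    public static void main(String[] args) {
        long[] empty = {0L};
        long[] filter3 = {(1L << 18) - 2}; // bits 1..17 set
        System.out.println(cosineSimilarity(empty, empty));     // 0.0
        System.out.println(cosineSimilarity(empty, filter3));   // 0.0
        System.out.println(cosineSimilarity(filter3, filter3)); // 1.0
    }
}
```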

Aggregations

Hasher (org.apache.commons.collections4.bloomfilter.hasher.Hasher) 37
Test (org.junit.jupiter.api.Test) 35
StaticHasher (org.apache.commons.collections4.bloomfilter.hasher.StaticHasher) 34
Shape (org.apache.commons.collections4.bloomfilter.hasher.Shape) 10
OfInt (java.util.PrimitiveIterator.OfInt) 2
IntConsumer (java.util.function.IntConsumer) 2
DynamicHasher (org.apache.commons.collections4.bloomfilter.hasher.DynamicHasher) 2
IteratorChain (org.apache.commons.collections4.iterators.IteratorChain) 2
ArrayList (java.util.ArrayList) 1
Arrays (java.util.Arrays) 1
Set (java.util.Set) 1
TreeSet (java.util.TreeSet) 1
HashFunctionIdentity (org.apache.commons.collections4.bloomfilter.hasher.HashFunctionIdentity) 1
MD5Cyclic (org.apache.commons.collections4.bloomfilter.hasher.function.MD5Cyclic) 1
EmptyIterator (org.apache.commons.collections4.iterators.EmptyIterator) 1