Example 1 with BloomFilter

Use of org.apache.cassandra.utils.BloomFilter in project cassandra by apache.

From the class BloomFilterSerializerBench, method serializationTest.

@Benchmark
public void serializationTest() throws IOException {
    File file = FileUtils.createTempFile("bloomFilterTest-", ".dat");
    try {
        // Build a filter sized for numElemsInK * 1024 elements with a 0.01 false-positive target.
        BloomFilter filter = (BloomFilter) FilterFactory.getFilter(numElemsInK * 1024, 0.01d);
        filter.add(wrap(testVal));
        // Write the filter to the temp file in either the old or the current format.
        DataOutputStreamPlus out = new FileOutputStreamPlus(file);
        if (oldBfFormat)
            SerializationsTest.serializeOldBfFormat(filter, out);
        else
            BloomFilterSerializer.serialize(filter, out);
        out.close();
        filter.close();
        // Read the filter back with the matching format flag, then release it.
        FileInputStreamPlus in = new FileInputStreamPlus(file);
        BloomFilter filter2 = BloomFilterSerializer.deserialize(in, oldBfFormat);
        FileUtils.closeQuietly(in);
        filter2.close();
    } finally {
        // Remove the temp file whether or not the round trip succeeded.
        file.tryDelete();
    }
}
Also used: FileInputStreamPlus (org.apache.cassandra.io.util.FileInputStreamPlus), File (org.apache.cassandra.io.util.File), FileOutputStreamPlus (org.apache.cassandra.io.util.FileOutputStreamPlus), BloomFilter (org.apache.cassandra.utils.BloomFilter), DataOutputStreamPlus (org.apache.cassandra.io.util.DataOutputStreamPlus), Benchmark (org.openjdk.jmh.annotations.Benchmark)
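
The benchmark above exercises a serialize-then-deserialize round trip. The following is a minimal, self-contained sketch of that pattern using only the JDK. It is not Cassandra's BloomFilter or BloomFilterSerializer: the class SimpleBloomFilterSketch, its byte layout, and its hashing scheme are all assumptions made for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.BitSet;

// Hypothetical stand-in for a Bloom filter; not Cassandra's implementation.
final class SimpleBloomFilterSketch {
    private final BitSet bits;
    private final int numBits;
    private final int numHashes;

    SimpleBloomFilterSketch(int numBits, int numHashes) {
        this(new BitSet(numBits), numBits, numHashes);
    }

    private SimpleBloomFilterSketch(BitSet bits, int numBits, int numHashes) {
        this.bits = bits;
        this.numBits = numBits;
        this.numHashes = numHashes;
    }

    // Derive the i-th bit position from one base hash and an odd step (double hashing).
    private int position(byte[] key, int i) {
        int h1 = Arrays.hashCode(key);
        int h2 = (h1 >>> 16) | 1;
        return Math.floorMod(h1 + i * h2, numBits);
    }

    void add(byte[] key) {
        for (int i = 0; i < numHashes; i++)
            bits.set(position(key, i));
    }

    // False means "definitely absent"; true means "possibly present".
    boolean mightContain(byte[] key) {
        for (int i = 0; i < numHashes; i++)
            if (!bits.get(position(key, i)))
                return false;
        return true;
    }

    // Assumed layout: numBits, numHashes, payload length, payload bytes.
    void serialize(DataOutputStream out) throws IOException {
        byte[] raw = bits.toByteArray();
        out.writeInt(numBits);
        out.writeInt(numHashes);
        out.writeInt(raw.length);
        out.write(raw);
    }

    static SimpleBloomFilterSketch deserialize(DataInputStream in) throws IOException {
        int numBits = in.readInt();
        int numHashes = in.readInt();
        byte[] raw = new byte[in.readInt()];
        in.readFully(raw);
        return new SimpleBloomFilterSketch(BitSet.valueOf(raw), numBits, numHashes);
    }

    // Serialize to memory and read back; try-with-resources closes both streams.
    static SimpleBloomFilterSketch roundTrip(SimpleBloomFilterSketch filter) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (DataOutputStream out = new DataOutputStream(buffer)) {
                filter.serialize(out);
            }
            try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(buffer.toByteArray()))) {
                return deserialize(in);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A key added before the round trip should still be reported as possibly present by the deserialized copy, since the bit set and both parameters are written and read back unchanged.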

Example 2 with BloomFilter

Use of org.apache.cassandra.utils.BloomFilter in project eiger by wlloyd.

From the class ColumnIndexer, method serialize.

/**
 * Serializes the index into an in-memory structure with all required components,
 * such as the Bloom Filter, index block size, and IndexInfo list
 *
 * @param columns Column family to create index for
 *
 * @return information about the index: its Bloom Filter, block size and IndexInfo list
 */
public static RowHeader serialize(IIterableColumns columns) {
    int columnCount = columns.getEstimatedColumnCount();
    BloomFilter bf = BloomFilter.getFilter(columnCount, 4);
    if (columnCount == 0)
        return new RowHeader(bf, Collections.<IndexHelper.IndexInfo>emptyList());
    // update bloom filter and create a list of IndexInfo objects marking the first and last column
    // in each block of ColumnIndexSize
    List<IndexHelper.IndexInfo> indexList = new ArrayList<IndexHelper.IndexInfo>();
    long endPosition = 0, startPosition = -1;
    IColumn lastColumn = null, firstColumn = null;
    for (IColumn column : columns) {
        bf.add(column.name());
        if (firstColumn == null) {
            firstColumn = column;
            startPosition = endPosition;
        }
        endPosition += column.serializedSize();
        // if we hit the column index size that we have to index after, go ahead and index it.
        if (endPosition - startPosition >= DatabaseDescriptor.getColumnIndexSize()) {
            IndexHelper.IndexInfo cIndexInfo = new IndexHelper.IndexInfo(firstColumn.name(), column.name(), startPosition, endPosition - startPosition);
            indexList.add(cIndexInfo);
            firstColumn = null;
        }
        lastColumn = column;
    }
    // all columns were GC'd after all
    if (lastColumn == null)
        return new RowHeader(bf, Collections.<IndexHelper.IndexInfo>emptyList());
    // the last column may have fallen on an index boundary already.  if not, index it explicitly.
    if (indexList.isEmpty() || columns.getComparator().compare(indexList.get(indexList.size() - 1).lastName, lastColumn.name()) != 0) {
        IndexHelper.IndexInfo cIndexInfo = new IndexHelper.IndexInfo(firstColumn.name(), lastColumn.name(), startPosition, endPosition - startPosition);
        indexList.add(cIndexInfo);
    }
    // we should always have at least one computed index block, but we only write it out if there is more than that.
    assert indexList.size() > 0;
    return new RowHeader(bf, indexList);
}
Also used: IndexHelper (org.apache.cassandra.io.sstable.IndexHelper), ArrayList (java.util.ArrayList), BloomFilter (org.apache.cassandra.utils.BloomFilter)
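
The block-boundary logic in the method above can be easier to follow in isolation. The sketch below reproduces only the grouping step, assuming each column is reduced to its serialized size. BlockIndexSketch, Block, and buildIndex are hypothetical names, not part of eiger or Cassandra. It also swaps the comparator check on the last column for a simpler "is a block still open" test.

```java
import java.util.ArrayList;
import java.util.List;

final class BlockIndexSketch {
    // Hypothetical stand-in for IndexHelper.IndexInfo: item range, byte offset, byte width.
    static final class Block {
        final int firstItem;
        final int lastItem;
        final long offset;
        final long width;

        Block(int firstItem, int lastItem, long offset, long width) {
            this.firstItem = firstItem;
            this.lastItem = lastItem;
            this.offset = offset;
            this.width = width;
        }
    }

    // Groups consecutive items into blocks of at least blockSize bytes.
    static List<Block> buildIndex(long[] itemSizes, long blockSize) {
        List<Block> index = new ArrayList<>();
        long endPosition = 0;
        long startPosition = -1;
        int firstItem = -1; // -1 means no block is currently open

        for (int i = 0; i < itemSizes.length; i++) {
            if (firstItem < 0) {
                // Open a new block at the current byte offset.
                firstItem = i;
                startPosition = endPosition;
            }
            endPosition += itemSizes[i];
            // Close the block once it reaches the size threshold.
            if (endPosition - startPosition >= blockSize) {
                index.add(new Block(firstItem, i, startPosition, endPosition - startPosition));
                firstItem = -1;
            }
        }
        // Any items left in an open block form a final, smaller block.
        if (firstItem >= 0)
            index.add(new Block(firstItem, itemSizes.length - 1, startPosition, endPosition - startPosition));
        return index;
    }
}
```

With item sizes 4, 4, 4, 4, 3 and a block size of 8, the loop closes one block after the second item and another after the fourth, and the last item ends up in a trailing block of width 3.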

Aggregations

BloomFilter (org.apache.cassandra.utils.BloomFilter): 2
ArrayList (java.util.ArrayList): 1
IndexHelper (org.apache.cassandra.io.sstable.IndexHelper): 1
DataOutputStreamPlus (org.apache.cassandra.io.util.DataOutputStreamPlus): 1
File (org.apache.cassandra.io.util.File): 1
FileInputStreamPlus (org.apache.cassandra.io.util.FileInputStreamPlus): 1
FileOutputStreamPlus (org.apache.cassandra.io.util.FileOutputStreamPlus): 1
Benchmark (org.openjdk.jmh.annotations.Benchmark): 1