Example 11 with RandomAccessData

Use of org.apache.beam.runners.dataflow.util.RandomAccessData in project beam by apache.

From the class IsmReaderTest, method testReadKeyThatEncodesToEmptyByteArray.

@Test
public void testReadKeyThatEncodesToEmptyByteArray() throws Exception {
    File tmpFile = tmpFolder.newFile();
    // One Void key component: VoidCoder writes no bytes, so the encoded key is empty.
    IsmRecordCoder<Void> coder =
        IsmRecordCoder.of(1, 0, ImmutableList.<Coder<?>>of(VoidCoder.of()), VoidCoder.of());
    IsmSink<Void> sink =
        new IsmSink<>(FileSystems.matchNewResource(tmpFile.getPath(), false), coder, BLOOM_FILTER_SIZE_LIMIT);
    IsmRecord<Void> element = IsmRecord.of(Arrays.asList((Void) null), (Void) null);
    // Write the single record; try-with-resources closes the writer before the file is read.
    try (SinkWriter<WindowedValue<IsmRecord<Void>>> writer = sink.writer()) {
        writer.add(new ValueInEmptyWindows<>(element));
    }
    Cache<IsmShardKey, WeightedValue<NavigableMap<RandomAccessData, WindowedValue<IsmRecord<Void>>>>> cache =
        CacheBuilder.newBuilder().weigher(Weighers.fixedWeightKeys(1)).maximumWeight(10_000).build();
    IsmReader<Void> reader =
        new IsmReaderImpl<>(FileSystems.matchSingleFileSpec(tmpFile.getAbsolutePath()).resourceId(), coder, cache);
    // Read the record back and compare it structurally with the one that was written.
    IsmReader<Void>.IsmPrefixReaderIterator iterator = reader.iterator();
    assertTrue(iterator.start());
    assertEquals(coder.structuralValue(element), coder.structuralValue(iterator.getCurrent().getValue()));
}
Also used: RandomAccessData (org.apache.beam.runners.dataflow.util.RandomAccessData), WeightedValue (org.apache.beam.sdk.util.WeightedValue), WindowedValue (org.apache.beam.sdk.util.WindowedValue), IsmShardKey (org.apache.beam.runners.dataflow.worker.IsmReaderImpl.IsmShardKey), File (java.io.File), Test (org.junit.Test)
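
The test above stores a record whose only key component is Void, so its encoded key is a zero-length byte array. The following is a minimal sketch, not Beam code, of why such a key is still a usable entry in a sorted index. It assumes keys are ordered as unsigned bytes in lexicographic order and uses a plain TreeMap as a hypothetical stand-in for the reader's in-memory index; EmptyKeyDemo, newIndex and UNSIGNED_LEXICOGRAPHIC are illustrative names. Arrays.compareUnsigned requires Java 9 or later.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.NavigableMap;
import java.util.TreeMap;

public class EmptyKeyDemo {
    // Assumed ordering for this sketch: unsigned bytes, lexicographic (Java 9+).
    public static final Comparator<byte[]> UNSIGNED_LEXICOGRAPHIC =
        (a, b) -> Arrays.compareUnsigned(a, b);

    // Hypothetical stand-in for a sorted index keyed by encoded key bytes.
    public static NavigableMap<byte[], String> newIndex() {
        return new TreeMap<>(UNSIGNED_LEXICOGRAPHIC);
    }

    public static void main(String[] args) {
        NavigableMap<byte[], String> index = newIndex();
        index.put(new byte[] {0x04}, "four");
        index.put(new byte[0], "empty-key record");

        // The empty array is a proper prefix of every other array, so it sorts first.
        System.out.println(index.firstKey().length); // prints 0
        // Lookups go through the comparator, so a fresh empty array finds the entry.
        System.out.println(index.get(new byte[0])); // prints "empty-key record"
    }
}
```

Passing an explicit comparator matters here: byte[] does not implement Comparable, and its equals and hashCode are identity-based, so a map that relies on them would not find an entry through a second array with the same contents.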

Example 12 with RandomAccessData

Use of org.apache.beam.runners.dataflow.util.RandomAccessData in project beam by apache.

From the class IsmReaderTest, method testReadMissingKeysBypassingBloomFilter.

@Test
public void testReadMissingKeysBypassingBloomFilter() throws Exception {
    File tmpFile = tmpFolder.newFile();
    List<IsmRecord<byte[]>> data = new ArrayList<>();
    data.add(IsmRecord.<byte[]>of(ImmutableList.of(EMPTY, new byte[] { 0x04 }), EMPTY));
    data.add(IsmRecord.<byte[]>of(ImmutableList.of(EMPTY, new byte[] { 0x08 }), EMPTY));
    writeElementsToFile(data, tmpFile);
    IsmReader<byte[]> reader = new IsmReaderImpl<byte[]>(FileSystems.matchSingleFileSpec(tmpFile.getAbsolutePath()).resourceId(), CODER, cache) {

        // Always report a possible match so the Bloom filter cannot short-circuit lookups for missing keys.
        @Override
        boolean bloomFilterMightContain(RandomAccessData keyBytes) {
            return true;
        }
    };
    // Expect false for a key that sorts before every key in the file.
    assertFalse(reader.overKeyComponents(ImmutableList.of(EMPTY, new byte[] { 0x02 })).start());
    // Expect false for a key that falls between two keys in the file.
    assertFalse(reader.overKeyComponents(ImmutableList.of(EMPTY, new byte[] { 0x06 })).start());
    // Expect false for a key that sorts after every key in the file.
    assertFalse(reader.overKeyComponents(ImmutableList.of(EMPTY, new byte[] { 0x10 })).start());
}
Also used: RandomAccessData (org.apache.beam.runners.dataflow.util.RandomAccessData), ArrayList (java.util.ArrayList), IsmRecord (org.apache.beam.runners.dataflow.internal.IsmFormat.IsmRecord), File (java.io.File), Test (org.junit.Test)
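
The test above overrides the Bloom filter check so that lookups for missing keys are not rejected by the filter and must be answered by the reader itself. The pattern can be sketched without Beam as follows. Everything in this block is hypothetical: ToyReader, its one-hash boolean[] filter, and the indexLookups counter are illustrative stand-ins, not the IsmReaderImpl implementation, and keys are assumed to be non-negative.

```java
import java.util.TreeSet;

public class BloomBypassDemo {
    /** Hypothetical reader: a lossy pre-check guards an exact lookup in a sorted index. */
    public static class ToyReader {
        private final boolean[] filterBits = new boolean[8];
        private final TreeSet<Integer> index = new TreeSet<>();
        public int indexLookups = 0;

        // Keys are assumed non-negative so that key % 8 is a valid array position.
        public ToyReader(int... keys) {
            for (int key : keys) {
                filterBits[key % 8] = true;
                index.add(key);
            }
        }

        // Crude stand-in for a Bloom filter: false positives possible, no false negatives.
        public boolean mightContain(int key) {
            return filterBits[key % 8];
        }

        public boolean contains(int key) {
            if (!mightContain(key)) {
                return false; // rejected by the filter; the index is never consulted
            }
            indexLookups++;
            return index.contains(key);
        }
    }

    public static void main(String[] args) {
        ToyReader filtered = new ToyReader(4, 8);
        ToyReader bypassed = new ToyReader(4, 8) {
            @Override
            public boolean mightContain(int key) {
                return true; // force every lookup through the index
            }
        };
        // Keys before, between, and after the stored keys, mirroring the test above.
        for (int missing : new int[] {2, 6, 16}) {
            System.out.println(missing + ": " + filtered.contains(missing)
                + " / " + bypassed.contains(missing));
        }
        System.out.println("index lookups: " + filtered.indexLookups
            + " vs " + bypassed.indexLookups);
    }
}
```

With the filter in place, only key 16 reaches the index, because it collides with 8 in this toy filter. With the override, all three missing keys exercise the index lookup path, which is the code the bypass is meant to cover.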

Aggregations

RandomAccessData (org.apache.beam.runners.dataflow.util.RandomAccessData): 12
SeekableByteChannel (java.nio.channels.SeekableByteChannel): 5
IsmRecord (org.apache.beam.runners.dataflow.internal.IsmFormat.IsmRecord): 5
File (java.io.File): 3
ArrayList (java.util.ArrayList): 3
IsmShard (org.apache.beam.runners.dataflow.internal.IsmFormat.IsmShard): 3
WindowedValue (org.apache.beam.sdk.util.WindowedValue): 3
Test (org.junit.Test): 3
Closeable (java.io.Closeable): 2
HashMap (java.util.HashMap): 2
NavigableMap (java.util.NavigableMap): 2
SortedMap (java.util.SortedMap): 2
IsmShardKey (org.apache.beam.runners.dataflow.worker.IsmReaderImpl.IsmShardKey): 2
WeightedValue (org.apache.beam.sdk.util.WeightedValue): 2
KV (org.apache.beam.sdk.values.KV): 2
ImmutableSortedMap (org.apache.beam.vendor.guava.v26_0_jre.com.google.common.collect.ImmutableSortedMap): 2
SuppressFBWarnings (edu.umd.cs.findbugs.annotations.SuppressFBWarnings): 1
InputStream (java.io.InputStream): 1
Map (java.util.Map): 1
IsmRecordCoder (org.apache.beam.runners.dataflow.internal.IsmFormat.IsmRecordCoder): 1