Example 11 with BigtableSource

Use of org.apache.beam.sdk.io.gcp.bigtable.BigtableIO.BigtableSource in the Apache Beam project.

Class BigtableIOTest, method testReduceSplitsWithSomeNonAdjacentRanges.

/**
 * Tests reducing splits when some of the key ranges are not adjacent.
 */
@Test
public void testReduceSplitsWithSomeNonAdjacentRanges() throws Exception {
    final String table = "TEST-MANY-ROWS-SPLITS-TABLE";
    final int numRows = 10;
    final int numSamples = 10;
    final long bytesPerRow = 100L;
    final int maxSplit = 3;
    // Set up test table data and sample row keys for size estimation and splitting.
    makeTableData(table, numRows);
    service.setupSampleRowKeys(table, numSamples, bytesPerRow);
    // Construct a few key ranges, some of them non-contiguous: [..1][1..2][3..4][4..5][6..7][8..9]
    List<ByteKeyRange> keyRanges =
        Arrays.asList(
            ByteKeyRange.of(ByteKey.EMPTY, createByteKey(1)),
            ByteKeyRange.of(createByteKey(1), createByteKey(2)),
            ByteKeyRange.of(createByteKey(3), createByteKey(4)),
            ByteKeyRange.of(createByteKey(4), createByteKey(5)),
            ByteKeyRange.of(createByteKey(6), createByteKey(7)),
            ByteKeyRange.of(createByteKey(8), createByteKey(9)));
    // Expected ranges after reducing the splits with maxSplit: [..2][3..5][6..7][8..9].
    // Only adjacent ranges are merged, so four ranges remain even though maxSplit is 3.
    List<ByteKeyRange> expectedKeyRangesAfterReducedSplits =
        Arrays.asList(
            ByteKeyRange.of(ByteKey.EMPTY, createByteKey(2)),
            ByteKeyRange.of(createByteKey(3), createByteKey(5)),
            ByteKeyRange.of(createByteKey(6), createByteKey(7)),
            ByteKeyRange.of(createByteKey(8), createByteKey(9)));
    // Generate the source and split it into one single-range source per key range.
    BigtableSource source =
        new BigtableSource(
            config.withTableId(StaticValueProvider.of(table)),
            BigtableReadOptions.builder().setKeyRanges(StaticValueProvider.of(keyRanges)).build(),
            null);
    List<BigtableSource> splits = new ArrayList<>();
    for (ByteKeyRange range : keyRanges) {
        splits.add(source.withSingleRange(range));
    }
    List<BigtableSource> reducedSplits = source.reduceSplits(splits, null, maxSplit);
    List<ByteKeyRange> actualRangesAfterSplit = new ArrayList<>();
    for (BigtableSource splitSource : reducedSplits) {
        actualRangesAfterSplit.addAll(splitSource.getRanges());
    }
    assertAllSourcesHaveSingleRanges(reducedSplits);
    assertThat(
        actualRangesAfterSplit,
        IsIterableContainingInAnyOrder.containsInAnyOrder(
            expectedKeyRangesAfterReducedSplits.toArray()));
}
Also used: ByteKeyRange (org.apache.beam.sdk.io.range.ByteKeyRange), ArrayList (java.util.ArrayList), ByteString (com.google.protobuf.ByteString), BigtableSource (org.apache.beam.sdk.io.gcp.bigtable.BigtableIO.BigtableSource), Test (org.junit.Test)
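
The test above expects four ranges although maxSplit is 3. A minimal standalone sketch of one strategy that produces this result follows. It is an illustration under stated assumptions, not Beam's implementation: it assumes a greedy pass that merges at most ceil(n / maxSplit) consecutive splits and never merges across a gap. The class ReduceSplitsSketch, the Range type (plain integers standing in for ByteKeyRange, with 0 standing in for ByteKey.EMPTY) and the reduce method are hypothetical names introduced for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ReduceSplitsSketch {

    /** Hypothetical stand-in for ByteKeyRange: an integer range from start to end. */
    public static final class Range {
        public final int start;
        public final int end;

        public Range(int start, int end) {
            this.start = start;
            this.end = end;
        }

        @Override
        public String toString() {
            return "[" + start + ".." + end + "]";
        }
    }

    /**
     * Greedily merges consecutive ranges, assuming the input is sorted by start key.
     * At most ceil(size / maxSplit) ranges go into one merged range, and a merge
     * never crosses a gap between non-adjacent ranges.
     */
    public static List<Range> reduce(List<Range> splits, int maxSplit) {
        int numberToCombine = (splits.size() + maxSplit - 1) / maxSplit;
        if (splits.size() < maxSplit || numberToCombine < 2) {
            return new ArrayList<>(splits); // nothing worth merging
        }
        List<Range> reduced = new ArrayList<>();
        Range current = null; // merged range being built
        int counter = 0;      // number of input ranges merged into current
        for (Range next : splits) {
            boolean adjacent = current != null && current.end == next.start;
            if (current != null && (counter == numberToCombine || !adjacent)) {
                reduced.add(current); // group is full or a gap was found: flush it
                current = null;
                counter = 0;
            }
            current = (current == null) ? next : new Range(current.start, next.end);
            counter++;
        }
        if (current != null) {
            reduced.add(current); // flush the last group
        }
        return reduced;
    }

    public static void main(String[] args) {
        // Same layout as the test: [..1][1..2][3..4][4..5][6..7][8..9], maxSplit = 3.
        List<Range> splits = Arrays.asList(
            new Range(0, 1), new Range(1, 2), new Range(3, 4),
            new Range(4, 5), new Range(6, 7), new Range(8, 9));
        System.out.println(reduce(splits, 3)); // prints [[0..2], [3..5], [6..7], [8..9]]
    }
}
```

With six input ranges and maxSplit = 3, each group holds at most two ranges. The gaps between 5 and 6 and between 7 and 8 force [6..7] and [8..9] to stay separate, which is why the result has four ranges rather than three.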

Aggregations

ByteString (com.google.protobuf.ByteString): 11
BigtableSource (org.apache.beam.sdk.io.gcp.bigtable.BigtableIO.BigtableSource): 11
Test (org.junit.Test): 11
ByteKeyRange (org.apache.beam.sdk.io.range.ByteKeyRange): 5
ArrayList (java.util.ArrayList): 3
ByteKey (org.apache.beam.sdk.io.range.ByteKey): 2
Row (com.google.bigtable.v2.Row): 1
RowFilter (com.google.bigtable.v2.RowFilter): 1