Example 1 with Source

Use of com.google.cloud.bigtable.beam.CloudBigtableIO.Source in project java-bigtable-hbase by googleapis.

From the class CloudBigtableIOTest, method testSampleRowKeys.

@Test
public void testSampleRowKeys() throws Exception {
    List<KeyOffset> sampleRowKeys = new ArrayList<>();
    int count = (int) (AbstractSource.COUNT_MAX_SPLIT_COUNT * 3 - 5);
    byte[][] keys = Bytes.split("A".getBytes(), "Z".getBytes(), count - 2);
    long tabletSize = 2L * 1024L * 1024L * 1024L;
    long boundary = 0;
    for (byte[] currentKey : keys) {
        boundary += tabletSize;
        try {
            sampleRowKeys.add(KeyOffset.create(ByteString.copyFrom(currentKey), boundary));
        } catch (NoClassDefFoundError e) {
            // This could cause some problems for javadoc or cobertura because of the shading magic we
            // do.
            e.printStackTrace();
            return;
        }
    }
    Source source = (Source) CloudBigtableIO.read(scanConfig);
    source.setSampleRowKeys(sampleRowKeys);
    List<SourceWithKeys> splits = source.getSplits(20000);
    Collections.sort(splits, new Comparator<SourceWithKeys>() {
        @Override
        public int compare(SourceWithKeys o1, SourceWithKeys o2) {
            return ByteStringComparator.INSTANCE.compare(o1.getConfiguration().getStartRowByteString(), o2.getConfiguration().getStartRowByteString());
        }
    });
    Assert.assertTrue(splits.size() <= AbstractSource.COUNT_MAX_SPLIT_COUNT);
    Iterator<SourceWithKeys> iter = splits.iterator();
    SourceWithKeys last = iter.next();
    while (iter.hasNext()) {
        SourceWithKeys current = iter.next();
        Assert.assertTrue(Bytes.equals(current.getConfiguration().getZeroCopyStartRow(), last.getConfiguration().getZeroCopyStopRow()));
        // The last source will have a stop key of empty.
        if (iter.hasNext()) {
            Assert.assertTrue(Bytes.compareTo(current.getConfiguration().getZeroCopyStartRow(), current.getConfiguration().getZeroCopyStopRow()) < 0);
        }
        Assert.assertTrue(current.getEstimatedSize() >= tabletSize);
        last = current;
    }
    // check first and last
}
Also used: KeyOffset (com.google.bigtable.repackaged.com.google.cloud.bigtable.data.v2.models.KeyOffset), ArrayList (java.util.ArrayList), AbstractSource (com.google.cloud.bigtable.beam.CloudBigtableIO.AbstractSource), Source (com.google.cloud.bigtable.beam.CloudBigtableIO.Source), BoundedSource (org.apache.beam.sdk.io.BoundedSource), SourceWithKeys (com.google.cloud.bigtable.beam.CloudBigtableIO.SourceWithKeys), Test (org.junit.Test)
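
The test above checks three properties of the returned splits: there are at most COUNT_MAX_SPLIT_COUNT of them, each split starts where the previous one stops, and each has an estimated size of at least one tablet. The following is a minimal sketch of one way to derive ranges with those properties from sampled keys and cumulative offsets. It is not the CloudBigtableIO implementation: SplitSketch, Range, and split are hypothetical names, keys are plain strings rather than byte arrays, and the trailing range up to the end of the table is left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SplitSketch {

    /** Hypothetical stand-in for a key range with an estimated size; not a library type. */
    static final class Range {
        final String start;
        final String stop;
        final long size;

        Range(String start, String stop, long size) {
            this.start = start;
            this.stop = stop;
            this.size = size;
        }

        @Override
        public String toString() {
            return "[" + start + ", " + stop + ") size=" + size;
        }
    }

    /**
     * Builds one range per sampled key (keys sorted, offsets cumulative), then
     * merges neighboring ranges in fixed-size groups so that at most maxSplits remain.
     */
    static List<Range> split(List<String> keys, List<Long> offsets, int maxSplits) {
        if (maxSplits <= 0 || keys.size() != offsets.size()) {
            throw new IllegalArgumentException("invalid arguments");
        }
        List<Range> ranges = new ArrayList<>();
        String prevKey = ""; // assumption: an empty key marks the start of the table
        long prevOffset = 0;
        for (int i = 0; i < keys.size(); i++) {
            // The size of a range is the difference between consecutive cumulative offsets.
            ranges.add(new Range(prevKey, keys.get(i), offsets.get(i) - prevOffset));
            prevKey = keys.get(i);
            prevOffset = offsets.get(i);
        }
        int n = ranges.size();
        if (n <= maxSplits) {
            return ranges;
        }
        // ceil(n / maxSplits) ranges per group guarantees at most maxSplits groups.
        int group = (n + maxSplits - 1) / maxSplits;
        List<Range> merged = new ArrayList<>();
        for (int i = 0; i < n; i += group) {
            int end = Math.min(i + group, n);
            long size = 0;
            for (int j = i; j < end; j++) {
                size += ranges.get(j).size;
            }
            // A merged range keeps the first start and the last stop, so neighbors stay contiguous.
            merged.add(new Range(ranges.get(i).start, ranges.get(end - 1).stop, size));
        }
        return merged;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList("a", "b", "c", "d", "e", "f", "g", "h", "i", "j");
        List<Long> offsets = new ArrayList<>();
        for (int i = 1; i <= keys.size(); i++) {
            offsets.add(100L * i); // every sampled tablet is assumed to hold 100 bytes
        }
        for (Range r : split(keys, offsets, 4)) {
            System.out.println(r);
        }
    }
}
```

Merging only ever adds sizes together, so a merged range is never smaller than a single tablet, which matches the getEstimatedSize() assertion in the test.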

Example 2 with Source

Use of com.google.cloud.bigtable.beam.CloudBigtableIO.Source in project java-bigtable-hbase by googleapis.

From the class CloudBigtableIOTest, method testSourceToString.

@Test
public void testSourceToString() throws Exception {
    Source source = (Source) CloudBigtableIO.read(scanConfig);
    byte[] startKey = "abc d".getBytes();
    byte[] stopKey = "def g".getBytes();
    BoundedSource<Result> sourceWithKeys = source.createSourceWithKeys(startKey, stopKey, 10);
    assertEquals("Split start: 'abc d', end: 'def g', size: 10.", sourceWithKeys.toString());
    startKey = new byte[] { 0, 1, 2, 3, 4, 5 };
    // hello
    stopKey = new byte[] { 104, 101, 108, 108, 111 };
    sourceWithKeys = source.createSourceWithKeys(startKey, stopKey, 10);
    assertEquals("Split start: '\\x00\\x01\\x02\\x03\\x04\\x05', end: 'hello', size: 10.", sourceWithKeys.toString());
}
Also used: AbstractSource (com.google.cloud.bigtable.beam.CloudBigtableIO.AbstractSource), Source (com.google.cloud.bigtable.beam.CloudBigtableIO.Source), BoundedSource (org.apache.beam.sdk.io.BoundedSource), Result (org.apache.hadoop.hbase.client.Result), Test (org.junit.Test)
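
The expected strings in the test above show printable key bytes rendered as characters and non-printable bytes rendered as \xNN escapes. A minimal sketch of an escaping rule consistent with those two expected strings follows. KeyFormatSketch and escape are hypothetical names, and the exact set of characters the real toString() leaves unescaped is an assumption here (printable ASCII except the backslash).

```java
import java.nio.charset.StandardCharsets;

public class KeyFormatSketch {

    /** Hypothetical helper: keeps printable ASCII as-is and writes every other byte as \xNN. */
    static String escape(byte[] key) {
        StringBuilder sb = new StringBuilder();
        for (byte b : key) {
            int ch = b & 0xFF; // treat the byte as unsigned
            if (ch >= 0x20 && ch <= 0x7E && ch != '\\') {
                sb.append((char) ch);
            } else {
                sb.append(String.format("\\x%02X", ch));
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] binaryKey = new byte[] { 0, 1, 2, 3, 4, 5 };
        byte[] textKey = "hello".getBytes(StandardCharsets.UTF_8);
        System.out.println("start: '" + escape(binaryKey) + "', end: '" + escape(textKey) + "'");
    }
}
```

Escaping the backslash itself keeps the output unambiguous: a literal backslash in a key cannot be mistaken for the start of an escape sequence.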

Aggregations

AbstractSource (com.google.cloud.bigtable.beam.CloudBigtableIO.AbstractSource): 2
Source (com.google.cloud.bigtable.beam.CloudBigtableIO.Source): 2
BoundedSource (org.apache.beam.sdk.io.BoundedSource): 2
Test (org.junit.Test): 2
KeyOffset (com.google.bigtable.repackaged.com.google.cloud.bigtable.data.v2.models.KeyOffset): 1
SourceWithKeys (com.google.cloud.bigtable.beam.CloudBigtableIO.SourceWithKeys): 1
ArrayList (java.util.ArrayList): 1
Result (org.apache.hadoop.hbase.client.Result): 1