
Example 1 with AbstractRowKeyDistributor

Use of co.cask.cdap.hbase.wd.AbstractRowKeyDistributor in project cdap by caskdata.

From the class HBaseStreamFileConsumerFactory, method create:

@Override
protected StreamConsumer create(TableId tableId, StreamConfig streamConfig, ConsumerConfig consumerConfig,
                                StreamConsumerStateStore stateStore, StreamConsumerState beginConsumerState,
                                FileReader<StreamEventOffset, Iterable<StreamFileOffset>> reader,
                                @Nullable ReadFilter extraFilter) throws IOException {
    // Spread row keys across the configured number of buckets with a one-byte hash prefix,
    // and derive the matching pre-split keys from the distributor.
    int splits = cConf.getInt(Constants.Stream.CONSUMER_TABLE_PRESPLITS);
    AbstractRowKeyDistributor distributor = new RowKeyDistributorByHashPrefix(new RowKeyDistributorByHashPrefix.OneByteSimpleHash(splits));
    byte[][] splitKeys = HBaseTableUtil.getSplitKeys(splits, splits, distributor);
    // Build the table descriptor with the queue column family and record the bucket count as a table property.
    TableId hBaseTableId = tableUtil.createHTableId(new NamespaceId(tableId.getNamespace()), tableId.getTableName());
    TableDescriptorBuilder tdBuilder = HBaseTableUtil.getTableDescriptorBuilder(hBaseTableId, cConf);
    ColumnFamilyDescriptorBuilder cfdBuilder = HBaseTableUtil.getColumnFamilyDescriptorBuilder(Bytes.toString(QueueEntryRow.COLUMN_FAMILY), hConf);
    tdBuilder.addColumnFamily(cfdBuilder.build());
    tdBuilder.addProperty(QueueConstants.DISTRIBUTOR_BUCKETS, Integer.toString(splits));
    // Create the table, pre-split on the distributor's keys, if it does not already exist.
    try (HBaseDDLExecutor ddlExecutor = ddlExecutorFactory.get()) {
        ddlExecutor.createTableIfNotExists(tdBuilder.build(), splitKeys);
    }
    // Disable auto-flush so writes are buffered client-side up to the configured write buffer size.
    HTable hTable = tableUtil.createHTable(hConf, hBaseTableId);
    hTable.setWriteBufferSize(Constants.Stream.HBASE_WRITE_BUFFER_SIZE);
    hTable.setAutoFlushTo(false);
    return new HBaseStreamFileConsumer(cConf, streamConfig, consumerConfig, tableUtil, hTable, reader, stateStore,
                                       beginConsumerState, extraFilter, createKeyDistributor(hTable.getTableDescriptor()));
}
Also used: TableId (co.cask.cdap.data2.util.TableId), HBaseDDLExecutor (co.cask.cdap.spi.hbase.HBaseDDLExecutor), RowKeyDistributorByHashPrefix (co.cask.cdap.hbase.wd.RowKeyDistributorByHashPrefix), AbstractRowKeyDistributor (co.cask.cdap.hbase.wd.AbstractRowKeyDistributor), ColumnFamilyDescriptorBuilder (co.cask.cdap.data2.util.hbase.ColumnFamilyDescriptorBuilder), TableDescriptorBuilder (co.cask.cdap.data2.util.hbase.TableDescriptorBuilder), NamespaceId (co.cask.cdap.proto.id.NamespaceId), HTable (org.apache.hadoop.hbase.client.HTable)
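
The distributor constructed above is what makes the pre-splitting pay off: every row key written to the consumer table is prefixed with a one-byte hash, so monotonically increasing keys spread across the pre-split regions instead of piling onto one. Below is a minimal sketch of that round trip, assuming the HBaseWD-style AbstractRowKeyDistributor API (getDistributedKey, getOriginalKey, getAllDistributedKeys) that CDAP vendors under co.cask.cdap.hbase.wd; the sample key and the class name KeyDistributionSketch are illustrative only.

import co.cask.cdap.hbase.wd.AbstractRowKeyDistributor;
import co.cask.cdap.hbase.wd.RowKeyDistributorByHashPrefix;
import org.apache.hadoop.hbase.util.Bytes;

public class KeyDistributionSketch {
    public static void main(String[] args) {
        int buckets = 16;
        AbstractRowKeyDistributor distributor =
            new RowKeyDistributorByHashPrefix(new RowKeyDistributorByHashPrefix.OneByteSimpleHash(buckets));

        byte[] originalKey = Bytes.toBytes("stream:consumer:00000042");
        // The distributed key carries a one-byte hash prefix, so sequential logical keys
        // land in different pre-split buckets instead of hammering a single region.
        byte[] distributedKey = distributor.getDistributedKey(originalKey);
        // The prefix is reversible: readers can strip it to recover the logical key.
        byte[] recovered = distributor.getOriginalKey(distributedKey);

        System.out.println("prefix bucket: " + (distributedKey[0] & 0xff));
        System.out.println("round-trips:   " + Bytes.equals(originalKey, recovered));
        // Scans must fan out over all buckets; getAllDistributedKeys yields one prefixed
        // key per bucket for a given logical key.
        System.out.println("fan-out keys:  " + distributor.getAllDistributedKeys(originalKey).length);
    }
}

Because the prefix depends only on the hash of the original key, point reads recompute the single prefixed form, while range scans fan out over every bucket prefix, which is exactly what getAllDistributedKeys supports.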

Example 2 with AbstractRowKeyDistributor

Use of co.cask.cdap.hbase.wd.AbstractRowKeyDistributor in project cdap by caskdata.

From the class HBaseTableUtilTest, method testGetSplitKeys:

@Test
public void testGetSplitKeys() {
    int buckets = 16;
    AbstractRowKeyDistributor distributor = new RowKeyDistributorByHashPrefix(new RowKeyDistributorByHashPrefix.OneByteSimpleHash(buckets));
    // The number of splits will be no less than the user asked for. If splits > buckets, the split count is
    // bumped up to the next multiple of buckets that is no less than the requested number of splits.
    // getSplitKeys returns one key fewer than the resulting split count, because HBase creates the first
    // region automatically.
    Assert.assertEquals(getSplitSize(buckets, 12) - 1, HBaseTableUtil.getSplitKeys(12, buckets, distributor).length);
    Assert.assertEquals(getSplitSize(buckets, 16) - 1, HBaseTableUtil.getSplitKeys(16, buckets, distributor).length);
    // at least buckets - 1 keys are returned, even when the user asks for fewer splits
    Assert.assertEquals(buckets - 1, HBaseTableUtil.getSplitKeys(6, buckets, distributor).length);
    Assert.assertEquals(buckets - 1, HBaseTableUtil.getSplitKeys(2, buckets, distributor).length);
    // "1" can be used for queue tables that we know are not "hot", so we do not pre-split in this case
    Assert.assertEquals(0, HBaseTableUtil.getSplitKeys(1, buckets, distributor).length);
    // allows up to 255 * buckets splits (one byte of hash prefix)
    Assert.assertEquals(255 * buckets - 1, HBaseTableUtil.getSplitKeys(255 * buckets, buckets, distributor).length);
    try {
        HBaseTableUtil.getSplitKeys(256 * buckets, buckets, distributor);
        Assert.fail("getSplitKeys(256 * buckets) should have thrown IllegalArgumentException");
    } catch (IllegalArgumentException e) {
        // expected
    }
    try {
        HBaseTableUtil.getSplitKeys(0, buckets, distributor);
        Assert.fail("getSplitKeys(0) should have thrown IllegalArgumentException");
    } catch (IllegalArgumentException e) {
        // expected
    }
}
Also used: RowKeyDistributorByHashPrefix (co.cask.cdap.hbase.wd.RowKeyDistributorByHashPrefix), AbstractRowKeyDistributor (co.cask.cdap.hbase.wd.AbstractRowKeyDistributor), Test (org.junit.Test)
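
The test relies on a private getSplitSize helper that is not part of the excerpt. Below is a plausible sketch of the rounding it implies, inferred only from the comments and assertions above; the helper name, class name SplitSizeSketch, and exact formula are reconstructions, not the CDAP source.

// A plausible reconstruction of the test's getSplitSize helper, inferred from the
// comments and assertions in testGetSplitKeys.
public class SplitSizeSketch {
    // Effective number of regions: at least `buckets`, otherwise the requested split
    // count rounded up to the next multiple of `buckets`.
    static int getSplitSize(int buckets, int requestedSplits) {
        if (requestedSplits <= buckets) {
            return buckets;
        }
        return buckets * (int) Math.ceil((double) requestedSplits / buckets);
    }

    public static void main(String[] args) {
        System.out.println(getSplitSize(16, 12));  // 16, so getSplitKeys(12, 16, ...) yields 15 keys
        System.out.println(getSplitSize(16, 16));  // 16, so 15 keys
        System.out.println(getSplitSize(16, 20));  // 32 under this rule (inferred, not asserted by the test)
    }
}

Subtracting one in the assertions then matches getSplitKeys returning region boundaries rather than regions: N regions need only N - 1 explicit split keys, since HBase supplies the first region implicitly.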

Aggregations

AbstractRowKeyDistributor (co.cask.cdap.hbase.wd.AbstractRowKeyDistributor): 2
RowKeyDistributorByHashPrefix (co.cask.cdap.hbase.wd.RowKeyDistributorByHashPrefix): 2
TableId (co.cask.cdap.data2.util.TableId): 1
ColumnFamilyDescriptorBuilder (co.cask.cdap.data2.util.hbase.ColumnFamilyDescriptorBuilder): 1
TableDescriptorBuilder (co.cask.cdap.data2.util.hbase.TableDescriptorBuilder): 1
NamespaceId (co.cask.cdap.proto.id.NamespaceId): 1
HBaseDDLExecutor (co.cask.cdap.spi.hbase.HBaseDDLExecutor): 1
HTable (org.apache.hadoop.hbase.client.HTable): 1
Test (org.junit.Test): 1