
Example 11 with DelimitedRecordHiveMapper

Use of org.apache.storm.hive.bolt.mapper.DelimitedRecordHiveMapper in project storm by apache.

From the class TestHiveBolt, method testMultiPartitionTuples:

@Test
public void testMultiPartitionTuples() throws Exception {
    DelimitedRecordHiveMapper mapper = new DelimitedRecordHiveMapper()
        .withColumnFields(new Fields(colNames))
        .withPartitionFields(new Fields(partNames));
    HiveOptions hiveOptions = new HiveOptions(metaStoreURI, dbName, tblName, mapper)
        .withTxnsPerBatch(10)
        .withBatchSize(10);
    bolt = new TestingHiveBolt(hiveOptions);
    bolt.prepare(config, null, new OutputCollector(collector));
    Integer id = 1;
    String msg = "test";
    String city = "San Jose";
    String state = "CA";
    List<Tuple> tuples = new ArrayList<>();
    for (int i = 0; i < 100; i++) {
        Tuple tuple = generateTestTuple(id, msg, city, state);
        tuples.add(tuple);
        bolt.execute(tuple);
    }
    for (Tuple t : tuples) {
        verify(collector).ack(t);
    }
    List<String> partVals = Lists.newArrayList(city, state);
    List<byte[]> recordsWritten = bolt.getRecordWritten(partVals);
    Assert.assertNotNull(recordsWritten);
    Assert.assertEquals(100, recordsWritten.size());
    byte[] mapped = generateDelimiteredRecord(Lists.newArrayList(id, msg), mapper.getFieldDelimiter());
    for (byte[] record : recordsWritten) {
        Assert.assertArrayEquals(mapped, record);
    }
    bolt.cleanup();
}
Also used: OutputCollector (org.apache.storm.task.OutputCollector), ArrayList (java.util.ArrayList), DelimitedRecordHiveMapper (org.apache.storm.hive.bolt.mapper.DelimitedRecordHiveMapper), HiveEndPoint (org.apache.hive.hcatalog.streaming.HiveEndPoint), Fields (org.apache.storm.tuple.Fields), HiveOptions (org.apache.storm.hive.common.HiveOptions), Tuple (org.apache.storm.tuple.Tuple), Test (org.junit.Test)
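The byte-for-byte assertion in the test above relies on the mapper joining the column values with its field delimiter before writing. A minimal, self-contained sketch of that delimiting step (plain Java, independent of the Storm Hive classes; the comma delimiter here is an assumption, matching what `mapper.getFieldDelimiter()` would supply in the test):

```java
import java.nio.charset.StandardCharsets;
import java.util.List;

public class DelimitedRecordSketch {
    // Join field values with a delimiter, mirroring what a mapper like
    // DelimitedRecordHiveMapper.mapRecord is expected to produce per tuple.
    static byte[] toDelimitedRecord(List<Object> values, String delimiter) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < values.size(); i++) {
            if (i > 0) {
                sb.append(delimiter);
            }
            sb.append(values.get(i));
        }
        return sb.toString().getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Same column values the test uses: id = 1, msg = "test".
        byte[] record = toDelimitedRecord(List.of(1, "test"), ",");
        System.out.println(new String(record, StandardCharsets.UTF_8)); // prints 1,test
    }
}
```

Because every loop iteration in the test maps the same (id, msg) pair, all 100 written records compare equal to this one mapped byte array.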

Example 12 with DelimitedRecordHiveMapper

Use of org.apache.storm.hive.bolt.mapper.DelimitedRecordHiveMapper in project storm by apache.

From the class TestHiveWriter, method testWriteMultiFlush:

@Test
public void testWriteMultiFlush() throws Exception {
    DelimitedRecordHiveMapper mapper = new MockedDelemiteredRecordHiveMapper()
        .withColumnFields(new Fields(colNames))
        .withPartitionFields(new Fields(partNames));
    HiveEndPoint endPoint = new HiveEndPoint(metaStoreURI, dbName, tblName, Arrays.asList(partitionVals));
    TestingHiveWriter writer = new TestingHiveWriter(endPoint, 10, true, timeout, callTimeoutPool, mapper, ugi, false);
    Tuple tuple = generateTestTuple("1", "abc");
    writer.write(mapper.mapRecord(tuple));
    tuple = generateTestTuple("2", "def");
    writer.write(mapper.mapRecord(tuple));
    Assert.assertEquals(2, writer.getTotalRecords());
    Mockito.verify(writer.getMockedTxBatch(), Mockito.times(2)).write(Mockito.any(byte[].class));
    Mockito.verify(writer.getMockedTxBatch(), Mockito.never()).commit();
    writer.flush(true);
    Assert.assertEquals(0, writer.getTotalRecords());
    Mockito.verify(writer.getMockedTxBatch(), Mockito.atLeastOnce()).commit();
    tuple = generateTestTuple("3", "ghi");
    writer.write(mapper.mapRecord(tuple));
    writer.flush(true);
    tuple = generateTestTuple("4", "klm");
    writer.write(mapper.mapRecord(tuple));
    writer.flush(true);
    writer.close();
    Mockito.verify(writer.getMockedTxBatch(), Mockito.times(4)).write(Mockito.any(byte[].class));
}
Also used: Fields (org.apache.storm.tuple.Fields), HiveEndPoint (org.apache.hive.hcatalog.streaming.HiveEndPoint), DelimitedRecordHiveMapper (org.apache.storm.hive.bolt.mapper.DelimitedRecordHiveMapper), Tuple (org.apache.storm.tuple.Tuple), Test (org.junit.Test)
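The verify calls in this test pin down the writer's bookkeeping: write() only buffers, flush(true) commits the transaction batch and resets the pending-record count to zero. The same contract can be illustrated with a tiny stand-alone buffer (a hypothetical sketch, not the real HiveWriter or its HCatalog streaming internals):

```java
import java.util.ArrayList;
import java.util.List;

public class BatchingWriterSketch {
    private final List<byte[]> pending = new ArrayList<>();
    private int commits = 0;

    // Buffer a record; nothing is committed until flush() is called.
    void write(byte[] record) {
        pending.add(record);
    }

    // Commit the buffered batch and reset the pending count, mirroring
    // the getTotalRecords()-goes-to-zero assertion in the test above.
    void flush() {
        commits++;
        pending.clear();
    }

    int getTotalRecords() {
        return pending.size();
    }

    int getCommitCount() {
        return commits;
    }

    public static void main(String[] args) {
        BatchingWriterSketch w = new BatchingWriterSketch();
        w.write("1,abc".getBytes());
        w.write("2,def".getBytes());
        System.out.println(w.getTotalRecords()); // prints 2
        w.flush();
        System.out.println(w.getTotalRecords()); // prints 0
        System.out.println(w.getCommitCount()); // prints 1
    }
}
```

This also explains the final Mockito.times(4) check: two writes before the first flush plus one write before each of the next two flushes gives four write() calls on the mocked transaction batch in total.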

Aggregations

DelimitedRecordHiveMapper (org.apache.storm.hive.bolt.mapper.DelimitedRecordHiveMapper): 12
Fields (org.apache.storm.tuple.Fields): 12
HiveOptions (org.apache.storm.hive.common.HiveOptions): 9
Test (org.junit.Test): 8
HiveEndPoint (org.apache.hive.hcatalog.streaming.HiveEndPoint): 7
Tuple (org.apache.storm.tuple.Tuple): 6
Config (org.apache.storm.Config): 3
TopologyBuilder (org.apache.storm.topology.TopologyBuilder): 3
ArrayList (java.util.ArrayList): 2
HashSet (java.util.HashSet): 2
OutputCollector (org.apache.storm.task.OutputCollector): 2
SimpleDateFormat (java.text.SimpleDateFormat): 1
Date (java.util.Date): 1
Stream (org.apache.storm.trident.Stream): 1
TridentState (org.apache.storm.trident.TridentState): 1
TridentTopology (org.apache.storm.trident.TridentTopology): 1
StateFactory (org.apache.storm.trident.state.StateFactory): 1