Example 46 with Tuple

Use of org.apache.storm.tuple.Tuple in the apache/storm project.

Class TestHiveWriter, method writeTuples:

private void writeTuples(HiveWriter writer, HiveMapper mapper, int count) throws HiveWriter.WriteFailure, InterruptedException, SerializationError {
    Integer id = 100;
    String msg = "test-123";
    for (int i = 1; i <= count; i++) {
        Tuple tuple = generateTestTuple(id, msg);
        writer.write(mapper.mapRecord(tuple));
    }
}
Also used: HiveEndPoint (org.apache.hive.hcatalog.streaming.HiveEndPoint), Tuple (org.apache.storm.tuple.Tuple)
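generateTestTuple and the concrete HiveMapper are defined elsewhere in the test class. As a rough, dependency-free illustration of what the loop produces, here is a simplified stand-in (the class and method names are hypothetical, not the real test helpers; it assumes DelimitedRecordHiveMapper's default comma field delimiter, which the "1,SJC,Sunnyvale,CA" check in the next example suggests):

```java
import java.util.ArrayList;
import java.util.List;

public class WriteLoopSketch {

    // Hypothetical stand-in for mapper.mapRecord(tuple):
    // DelimitedRecordHiveMapper joins the mapped column values with a
    // field delimiter (a comma by default).
    public static String mapRecord(int id, String msg) {
        return id + "," + msg;
    }

    // Mirrors the writeTuples loop above: each iteration maps the same
    // (id, msg) pair and collects the delimited record, standing in for
    // writer.write(...).
    public static List<String> writeTuples(int count) {
        List<String> written = new ArrayList<>();
        int id = 100;
        String msg = "test-123";
        for (int i = 1; i <= count; i++) {
            written.add(mapRecord(id, msg));
        }
        return written;
    }

    public static void main(String[] args) {
        System.out.println(writeTuples(3));
    }
}
```

Every iteration maps an identical record on purpose; the real test only cares that `count` records reach the HiveWriter.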

Example 47 with Tuple

Use of org.apache.storm.tuple.Tuple in the apache/storm project.

Class TestHiveBolt, method testData:

@Test
public void testData() throws Exception {
    DelimitedRecordHiveMapper mapper = new DelimitedRecordHiveMapper().withColumnFields(new Fields(colNames)).withPartitionFields(new Fields(partNames));
    HiveOptions hiveOptions = new HiveOptions(metaStoreURI, dbName, tblName, mapper).withTxnsPerBatch(2).withBatchSize(1);
    bolt = new HiveBolt(hiveOptions);
    bolt.prepare(config, null, new OutputCollector(collector));
    Tuple tuple1 = generateTestTuple(1, "SJC", "Sunnyvale", "CA");
    //Tuple tuple2 = generateTestTuple(2,"SFO","San Jose","CA");
    bolt.execute(tuple1);
    verify(collector).ack(tuple1);
    //bolt.execute(tuple2);
    //verify(collector).ack(tuple2);
    checkDataWritten(tblName, dbName, "1,SJC,Sunnyvale,CA");
    bolt.cleanup();
}
Also used: OutputCollector (org.apache.storm.task.OutputCollector), Fields (org.apache.storm.tuple.Fields), DelimitedRecordHiveMapper (org.apache.storm.hive.bolt.mapper.DelimitedRecordHiveMapper), HiveOptions (org.apache.storm.hive.common.HiveOptions), Tuple (org.apache.storm.tuple.Tuple), Test (org.junit.Test)

Example 48 with Tuple

Use of org.apache.storm.tuple.Tuple in the apache/storm project.

Class TestHiveBolt, method testTickTuple:

@Test
public void testTickTuple() {
    JsonRecordHiveMapper mapper = new JsonRecordHiveMapper().withColumnFields(new Fields(colNames1)).withPartitionFields(new Fields(partNames));
    HiveOptions hiveOptions = new HiveOptions(metaStoreURI, dbName, tblName, mapper).withTxnsPerBatch(2).withBatchSize(2);
    bolt = new HiveBolt(hiveOptions);
    bolt.prepare(config, null, new OutputCollector(collector));
    Tuple tuple1 = generateTestTuple(1, "SJC", "Sunnyvale", "CA");
    Tuple tuple2 = generateTestTuple(2, "SFO", "San Jose", "CA");
    bolt.execute(tuple1);
    //The tick should cause tuple1 to be ack'd
    Tuple mockTick = MockTupleHelpers.mockTickTuple();
    bolt.execute(mockTick);
    verify(collector).ack(tuple1);
    //The second tuple should NOT be ack'd because the batch should be cleared and this will be
    //the first transaction in the new batch
    bolt.execute(tuple2);
    verify(collector, never()).ack(tuple2);
    bolt.cleanup();
}
Also used: JsonRecordHiveMapper (org.apache.storm.hive.bolt.mapper.JsonRecordHiveMapper), OutputCollector (org.apache.storm.task.OutputCollector), Fields (org.apache.storm.tuple.Fields), HiveOptions (org.apache.storm.hive.common.HiveOptions), Tuple (org.apache.storm.tuple.Tuple), Test (org.junit.Test)
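The assertions above follow from HiveBolt's batching: tuples are buffered and only acked once their batch is flushed, and a tick tuple forces a flush of any partial batch. A minimal, dependency-free sketch of that accounting (a hypothetical class, not the real HiveBolt, which also manages Hive transactions per batch):

```java
import java.util.ArrayList;
import java.util.List;

public class BatchingBoltSketch {
    private final int batchSize;
    private final List<String> pending = new ArrayList<>();
    public final List<String> acked = new ArrayList<>();

    public BatchingBoltSketch(int batchSize) {
        this.batchSize = batchSize;
    }

    // Data tuple: buffer it; flush (and ack the batch) only when full.
    public void execute(String tuple) {
        pending.add(tuple);
        if (pending.size() >= batchSize) {
            flush();
        }
    }

    // Tick tuple: flush whatever is buffered, acking the partial batch.
    public void executeTick() {
        flush();
    }

    private void flush() {
        acked.addAll(pending);
        pending.clear();
    }

    public static void main(String[] args) {
        BatchingBoltSketch bolt = new BatchingBoltSketch(2);
        bolt.execute("tuple1");   // buffered, batch of 2 not yet full
        bolt.executeTick();       // tick flushes the partial batch: tuple1 acked
        bolt.execute("tuple2");   // first tuple of a fresh batch: not yet acked
        System.out.println(bolt.acked);
    }
}
```

This is why tuple2 must not be acked at the end of the test: after the tick cleared the batch, tuple2 sits alone in a new, unflushed batch.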

Example 49 with Tuple

Use of org.apache.storm.tuple.Tuple in the apache/storm project.

Class TestHiveBolt, method testNoAcksIfFlushFails:

@Test
public void testNoAcksIfFlushFails() throws Exception {
    JsonRecordHiveMapper mapper = new JsonRecordHiveMapper().withColumnFields(new Fields(colNames1)).withPartitionFields(new Fields(partNames));
    HiveOptions hiveOptions = new HiveOptions(metaStoreURI, dbName, tblName, mapper).withTxnsPerBatch(2).withBatchSize(2);
    HiveBolt spyBolt = Mockito.spy(new HiveBolt(hiveOptions));
    //This forces a failure of all the flush attempts
    doThrow(new InterruptedException()).when(spyBolt).flushAllWriters(true);
    spyBolt.prepare(config, null, new OutputCollector(collector));
    Tuple tuple1 = generateTestTuple(1, "SJC", "Sunnyvale", "CA");
    Tuple tuple2 = generateTestTuple(2, "SFO", "San Jose", "CA");
    spyBolt.execute(tuple1);
    spyBolt.execute(tuple2);
    verify(collector, never()).ack(tuple1);
    verify(collector, never()).ack(tuple2);
    spyBolt.cleanup();
}
Also used: JsonRecordHiveMapper (org.apache.storm.hive.bolt.mapper.JsonRecordHiveMapper), OutputCollector (org.apache.storm.task.OutputCollector), Fields (org.apache.storm.tuple.Fields), HiveOptions (org.apache.storm.hive.common.HiveOptions), Tuple (org.apache.storm.tuple.Tuple), Test (org.junit.Test)
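Here doThrow makes every flushAllWriters call fail, so no batch ever commits and no tuple may be acked. A simplified stand-in (hypothetical class, not the real HiveBolt) showing why a failed flush leaves the whole batch un-acked; in a real topology, un-acked tuples eventually time out and are replayed by the spout:

```java
import java.util.ArrayList;
import java.util.List;

public class FlushFailureSketch {
    private final List<String> pending = new ArrayList<>();
    private final boolean flushFails;
    public final List<String> acked = new ArrayList<>();
    public final List<String> failed = new ArrayList<>();

    public FlushFailureSketch(boolean flushFails) {
        this.flushFails = flushFails;
    }

    // Buffer the tuple; on a full batch (size 2, as in the test), flush.
    public void execute(String tuple) {
        pending.add(tuple);
        if (pending.size() >= 2) {
            try {
                flushAllWriters();
                acked.addAll(pending);
            } catch (Exception e) {
                // Flush failed: nothing in this batch was committed,
                // so none of its tuples can be acked.
                failed.addAll(pending);
            }
            pending.clear();
        }
    }

    // Stand-in for the spied flushAllWriters(true) that doThrow breaks.
    private void flushAllWriters() throws Exception {
        if (flushFails) {
            throw new InterruptedException("simulated flush failure");
        }
    }

    public static void main(String[] args) {
        FlushFailureSketch bolt = new FlushFailureSketch(true);
        bolt.execute("tuple1");
        bolt.execute("tuple2");
        System.out.println("acked=" + bolt.acked + " failed=" + bolt.failed);
    }
}
```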

Example 50 with Tuple

Use of org.apache.storm.tuple.Tuple in the apache/storm project.

Class TestHiveBolt, method testMultiPartitionTuples:

@Test
public void testMultiPartitionTuples() throws Exception {
    DelimitedRecordHiveMapper mapper = new DelimitedRecordHiveMapper().withColumnFields(new Fields(colNames)).withPartitionFields(new Fields(partNames));
    HiveOptions hiveOptions = new HiveOptions(metaStoreURI, dbName, tblName, mapper).withTxnsPerBatch(10).withBatchSize(10);
    bolt = new HiveBolt(hiveOptions);
    bolt.prepare(config, null, new OutputCollector(collector));
    Integer id = 1;
    String msg = "test";
    String city = "San Jose";
    String state = "CA";
    checkRecordCountInTable(tblName, dbName, 0);
    Set<Tuple> tupleSet = new HashSet<Tuple>();
    for (int i = 0; i < 100; i++) {
        Tuple tuple = generateTestTuple(id, msg, city, state);
        tupleSet.add(tuple);
        bolt.execute(tuple);
    }
    checkRecordCountInTable(tblName, dbName, 100);
    for (Tuple t : tupleSet) verify(collector).ack(t);
    bolt.cleanup();
}
Also used: OutputCollector (org.apache.storm.task.OutputCollector), Fields (org.apache.storm.tuple.Fields), DelimitedRecordHiveMapper (org.apache.storm.hive.bolt.mapper.DelimitedRecordHiveMapper), HiveOptions (org.apache.storm.hive.common.HiveOptions), Tuple (org.apache.storm.tuple.Tuple), HashSet (java.util.HashSet), Test (org.junit.Test)

Aggregations

Tuple (org.apache.storm.tuple.Tuple): 85
Test (org.junit.Test): 30
Fields (org.apache.storm.tuple.Fields): 13
OutputCollector (org.apache.storm.task.OutputCollector): 11
Values (org.apache.storm.tuple.Values): 11
ArrayList (java.util.ArrayList): 10
HiveOptions (org.apache.storm.hive.common.HiveOptions): 10
TupleWindow (org.apache.storm.windowing.TupleWindow): 9
HashMap (java.util.HashMap): 7
Test (org.testng.annotations.Test): 7
GlobalStreamId (org.apache.storm.generated.GlobalStreamId): 6
DelimitedRecordHiveMapper (org.apache.storm.hive.bolt.mapper.DelimitedRecordHiveMapper): 6
HashSet (java.util.HashSet): 5
JsonRecordHiveMapper (org.apache.storm.hive.bolt.mapper.JsonRecordHiveMapper): 5
TopologyContext (org.apache.storm.task.TopologyContext): 5
TupleImpl (org.apache.storm.tuple.TupleImpl): 5
BasicOutputCollector (org.apache.storm.topology.BasicOutputCollector): 4
Map (java.util.Map): 3
Callback (org.apache.kafka.clients.producer.Callback): 3
ProducerRecord (org.apache.kafka.clients.producer.ProducerRecord): 3