
Example 76 with BytesWritable

use of org.apache.hadoop.io.BytesWritable in project hive by apache.

the class DelimitedInputWriter method encode.

@Override
public Object encode(byte[] record) throws SerializationError {
    try {
        // Wrap the raw record bytes in a reusable Writable container and let
        // the configured SerDe materialize the row object from it.
        BytesWritable blob = new BytesWritable();
        blob.set(record, 0, record.length);
        return serde.deserialize(blob);
    } catch (SerDeException e) {
        throw new SerializationError("Unable to convert byte[] record into Object", e);
    }
}
Also used : BytesWritable(org.apache.hadoop.io.BytesWritable) SerDeException(org.apache.hadoop.hive.serde2.SerDeException)
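
A minimal caller sketch, not from the Hive source: the record contents and the `writer` variable are assumptions, standing in for an already-initialized DelimitedInputWriter.

// Hypothetical usage; assumes `writer` is a configured DelimitedInputWriter.
byte[] record = "1,alice,active".getBytes(StandardCharsets.UTF_8);
// encode() wraps the bytes in a BytesWritable and asks the SerDe for a row object.
Object row = writer.encode(record);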

Example 77 with BytesWritable

use of org.apache.hadoop.io.BytesWritable in project druid by druid-io.

the class IndexGeneratorCombinerTest method testMultipleRowsNotMerged.

@Test
public void testMultipleRowsNotMerged() throws Exception {
    long timestamp = System.currentTimeMillis();
    Bucket bucket = new Bucket(0, new DateTime(timestamp), 0);
    SortableBytes keySortableBytes = new SortableBytes(bucket.toGroupKey(), new byte[0]);
    BytesWritable key = keySortableBytes.toBytesWritable();
    InputRow row1 = new MapBasedInputRow(
        timestamp,
        ImmutableList.<String>of("host", "keywords"),
        ImmutableMap.<String, Object>of("host", "host1", "keywords", Arrays.asList("foo", "bar"), "visited", 10));
    InputRow row2 = new MapBasedInputRow(
        timestamp,
        ImmutableList.<String>of("host", "keywords"),
        ImmutableMap.<String, Object>of("host", "host2", "keywords", Arrays.asList("foo", "bar"), "visited", 5));
    List<BytesWritable> rows = Lists.newArrayList(
        new BytesWritable(InputRowSerde.toBytes(row1, aggregators, true)),
        new BytesWritable(InputRowSerde.toBytes(row2, aggregators, true)));
    Reducer.Context context = EasyMock.createNiceMock(Reducer.Context.class);
    Capture<BytesWritable> captureKey1 = Capture.newInstance();
    Capture<BytesWritable> captureVal1 = Capture.newInstance();
    Capture<BytesWritable> captureKey2 = Capture.newInstance();
    Capture<BytesWritable> captureVal2 = Capture.newInstance();
    context.write(EasyMock.capture(captureKey1), EasyMock.capture(captureVal1));
    context.write(EasyMock.capture(captureKey2), EasyMock.capture(captureVal2));
    EasyMock.replay(context);
    combiner.reduce(key, rows, context);
    EasyMock.verify(context);
    Assert.assertSame(key, captureKey1.getValue());
    Assert.assertSame(key, captureKey2.getValue());
    InputRow capturedRow1 = InputRowSerde.fromBytes(captureVal1.getValue().getBytes(), aggregators);
    Assert.assertEquals(Arrays.asList("host", "keywords"), capturedRow1.getDimensions());
    Assert.assertEquals(Arrays.asList("host1"), capturedRow1.getDimension("host"));
    Assert.assertEquals(Arrays.asList("bar", "foo"), capturedRow1.getDimension("keywords"));
    Assert.assertEquals(10, capturedRow1.getLongMetric("visited_sum"));
    Assert.assertEquals(1.0, (Double) HyperUniquesAggregatorFactory.estimateCardinality(capturedRow1.getRaw("unique_hosts")), 0.001);
    InputRow capturedRow2 = InputRowSerde.fromBytes(captureVal2.getValue().getBytes(), aggregators);
    Assert.assertEquals(Arrays.asList("host", "keywords"), capturedRow2.getDimensions());
    Assert.assertEquals(Arrays.asList("host2"), capturedRow2.getDimension("host"));
    Assert.assertEquals(Arrays.asList("bar", "foo"), capturedRow2.getDimension("keywords"));
    Assert.assertEquals(5, capturedRow2.getLongMetric("visited_sum"));
    Assert.assertEquals(1.0, (Double) HyperUniquesAggregatorFactory.estimateCardinality(capturedRow2.getRaw("unique_hosts")), 0.001);
}
Also used : MapBasedInputRow(io.druid.data.input.MapBasedInputRow) InputRow(io.druid.data.input.InputRow) BytesWritable(org.apache.hadoop.io.BytesWritable) Reducer(org.apache.hadoop.mapreduce.Reducer) DateTime(org.joda.time.DateTime) Test(org.junit.Test)
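
One BytesWritable detail worth keeping in mind when reading captured values back, as the test does with getBytes(): the backing array can be longer than the logical length. A short standalone sketch using only the stock Hadoop API, no project code:

byte[] payload = {1, 2, 3};
BytesWritable bw = new BytesWritable(payload);  // the constructor copies the array
bw.setSize(128);  // grows the backing buffer
bw.setSize(3);    // shrinks the logical length, not the buffer
byte[] raw = bw.getBytes();     // backing array; raw.length may exceed 3
byte[] exact = bw.copyBytes();  // exact-length copy: always getLength() bytes
assert exact.length == bw.getLength();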

Example 78 with BytesWritable

use of org.apache.hadoop.io.BytesWritable in project druid by druid-io.

the class IndexGeneratorCombinerTest method testSingleRowNoMergePassThrough.

@Test
public void testSingleRowNoMergePassThrough() throws Exception {
    Reducer.Context context = EasyMock.createMock(Reducer.Context.class);
    Capture<BytesWritable> captureKey = Capture.newInstance();
    Capture<BytesWritable> captureVal = Capture.newInstance();
    context.write(EasyMock.capture(captureKey), EasyMock.capture(captureVal));
    EasyMock.replay(context);
    // Use an explicit charset rather than the platform default.
    BytesWritable key = new BytesWritable("dummy_key".getBytes(StandardCharsets.UTF_8));
    BytesWritable val = new BytesWritable("dummy_row".getBytes(StandardCharsets.UTF_8));
    combiner.reduce(key, Lists.newArrayList(val), context);
    Assert.assertSame(key, captureKey.getValue());
    Assert.assertSame(val, captureVal.getValue());
}
Also used : BytesWritable(org.apache.hadoop.io.BytesWritable) Reducer(org.apache.hadoop.mapreduce.Reducer) StandardCharsets(java.nio.charset.StandardCharsets) Test(org.junit.Test)
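
The capture-replay-verify pattern these tests rely on can be reduced to a self-contained sketch; the Sink interface below is hypothetical, introduced only to illustrate EasyMock's Capture API:

interface Sink {
    void write(BytesWritable key, BytesWritable val);
}

Sink sink = EasyMock.createMock(Sink.class);
Capture<BytesWritable> captured = Capture.newInstance();
// Record the expected call and capture its first argument.
sink.write(EasyMock.capture(captured), EasyMock.anyObject(BytesWritable.class));
EasyMock.replay(sink);
BytesWritable key = new BytesWritable("k".getBytes(StandardCharsets.UTF_8));
sink.write(key, new BytesWritable());
EasyMock.verify(sink);
Assert.assertSame(key, captured.getValue());  // same instance passed through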

Example 79 with BytesWritable

use of org.apache.hadoop.io.BytesWritable in project druid by druid-io.

the class IndexGeneratorCombinerTest method testMultipleRowsMerged.

@Test
public void testMultipleRowsMerged() throws Exception {
    long timestamp = System.currentTimeMillis();
    Bucket bucket = new Bucket(0, new DateTime(timestamp), 0);
    SortableBytes keySortableBytes = new SortableBytes(bucket.toGroupKey(), new byte[0]);
    BytesWritable key = keySortableBytes.toBytesWritable();
    InputRow row1 = new MapBasedInputRow(
        timestamp,
        ImmutableList.<String>of("keywords"),
        ImmutableMap.<String, Object>of("host", "host1", "keywords", Arrays.asList("foo", "bar"), "visited", 10));
    InputRow row2 = new MapBasedInputRow(
        timestamp,
        ImmutableList.<String>of("keywords"),
        ImmutableMap.<String, Object>of("host", "host2", "keywords", Arrays.asList("foo", "bar"), "visited", 5));
    List<BytesWritable> rows = Lists.newArrayList(
        new BytesWritable(InputRowSerde.toBytes(row1, aggregators, true)),
        new BytesWritable(InputRowSerde.toBytes(row2, aggregators, true)));
    Reducer.Context context = EasyMock.createNiceMock(Reducer.Context.class);
    Capture<BytesWritable> captureKey = Capture.newInstance();
    Capture<BytesWritable> captureVal = Capture.newInstance();
    context.write(EasyMock.capture(captureKey), EasyMock.capture(captureVal));
    EasyMock.replay(context);
    combiner.reduce(key, rows, context);
    EasyMock.verify(context);
    Assert.assertSame(key, captureKey.getValue());
    InputRow capturedRow = InputRowSerde.fromBytes(captureVal.getValue().getBytes(), aggregators);
    Assert.assertEquals(Arrays.asList("host", "keywords"), capturedRow.getDimensions());
    Assert.assertEquals(ImmutableList.of(), capturedRow.getDimension("host"));
    Assert.assertEquals(Arrays.asList("bar", "foo"), capturedRow.getDimension("keywords"));
    Assert.assertEquals(15, capturedRow.getLongMetric("visited_sum"));
    Assert.assertEquals(2.0, (Double) HyperUniquesAggregatorFactory.estimateCardinality(capturedRow.getRaw("unique_hosts")), 0.001);
}
Also used : MapBasedInputRow(io.druid.data.input.MapBasedInputRow) InputRow(io.druid.data.input.InputRow) BytesWritable(org.apache.hadoop.io.BytesWritable) Reducer(org.apache.hadoop.mapreduce.Reducer) DateTime(org.joda.time.DateTime) Test(org.junit.Test)
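
The assertions above follow from rollup: both rows share the timestamp and the `keywords` dimension, so the combiner merges them, summing `visited` into `visited_sum` (10 + 5 = 15) and recording two distinct hosts in the `unique_hosts` sketch (estimated cardinality of about 2.0). A hedged round-trip sketch reusing only the InputRowSerde calls already shown above; the `aggregators` fixture is assumed:

InputRow row = new MapBasedInputRow(
    System.currentTimeMillis(),
    ImmutableList.<String>of("keywords"),
    ImmutableMap.<String, Object>of("keywords", Arrays.asList("foo"), "visited", 10));
// Serialize, wrap, and read back; copyBytes() avoids the padded backing array.
BytesWritable wrapped = new BytesWritable(InputRowSerde.toBytes(row, aggregators, true));
InputRow back = InputRowSerde.fromBytes(wrapped.copyBytes(), aggregators);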

Example 80 with BytesWritable

use of org.apache.hadoop.io.BytesWritable in project bagheera by mozilla-metrics.

the class SequenceFileSink method store.

@Override
public void store(String key, byte[] data, long timestamp) throws IOException {
    try {
        lock.acquire();
        checkRollover();
        if (addTimestamp) {
            data = addTimestampToJson(data, timestamp);
        }
        if (useBytesValue) {
            writer.append(new Text(key), new BytesWritable(data));
        } else {
            writer.append(new Text(key), new Text(data));
        }
        stored.mark();
        bytesWritten.getAndAdd(key.length() + data.length);
    } catch (IOException e) {
        LOG.error("IOException while writing key/value pair", e);
        throw new RuntimeException(e);
    } catch (InterruptedException e) {
        LOG.error("Interrupted while writing key/value pair", e);
        // Restore the interrupt status so callers can observe the interruption.
        Thread.currentThread().interrupt();
    } finally {
        lock.release();
    }
}
Also used : Text(org.apache.hadoop.io.Text) BytesWritable(org.apache.hadoop.io.BytesWritable) IOException(java.io.IOException)
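
A minimal sketch of the kind of writer that store() appends to; the path and configuration below are assumptions, not taken from the bagheera source:

Configuration conf = new Configuration();
Path path = new Path("/tmp/example.seq");  // hypothetical output location
try (SequenceFile.Writer writer = SequenceFile.createWriter(conf,
        SequenceFile.Writer.file(path),
        SequenceFile.Writer.keyClass(Text.class),
        SequenceFile.Writer.valueClass(BytesWritable.class))) {
    byte[] data = "{\"ts\":0}".getBytes(StandardCharsets.UTF_8);
    writer.append(new Text("dummy_key"), new BytesWritable(data));
}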

Aggregations

BytesWritable (org.apache.hadoop.io.BytesWritable): 275
Text (org.apache.hadoop.io.Text): 73
LongWritable (org.apache.hadoop.io.LongWritable): 59
Test (org.junit.Test): 53
ObjectInspector (org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector): 46
IntWritable (org.apache.hadoop.io.IntWritable): 44
ArrayList (java.util.ArrayList): 39
Path (org.apache.hadoop.fs.Path): 38
IOException (java.io.IOException): 36
Configuration (org.apache.hadoop.conf.Configuration): 33
FloatWritable (org.apache.hadoop.io.FloatWritable): 33
Writable (org.apache.hadoop.io.Writable): 32
BooleanWritable (org.apache.hadoop.io.BooleanWritable): 31
List (java.util.List): 30
SequenceFile (org.apache.hadoop.io.SequenceFile): 27
Random (java.util.Random): 24
DoubleWritable (org.apache.hadoop.hive.serde2.io.DoubleWritable): 24
ShortWritable (org.apache.hadoop.hive.serde2.io.ShortWritable): 23
ByteWritable (org.apache.hadoop.hive.serde2.io.ByteWritable): 22
FileSystem (org.apache.hadoop.fs.FileSystem): 21