
Example 11 with SerializationFactory

Use of org.apache.hadoop.io.serializer.SerializationFactory in project cdap by caskdata.

Class ReflectionUtils, method copy.

/**
   * Make a copy of the writable object using serialization to a buffer.
   * @param src the object to copy from
   * @param dst the object to copy into, which is destroyed
   * @return the dst parameter, now holding a copy of src
   * @throws IOException if serialization of src or deserialization into dst fails
   */
@SuppressWarnings("unchecked")
public static <T> T copy(Configuration conf, T src, T dst) throws IOException {
    // cloneBuffers is a thread-local pair of in/out buffers reused across calls.
    CopyInCopyOutBuffer buffer = cloneBuffers.get();
    buffer.outBuffer.reset();
    SerializationFactory factory = getFactory(conf);
    Class<T> cls = (Class<T>) src.getClass();
    Serializer<T> serializer = factory.getSerializer(cls);
    serializer.open(buffer.outBuffer);
    serializer.serialize(src);
    // Point the input buffer at the bytes just written to the output buffer.
    buffer.moveData();
    Deserializer<T> deserializer = factory.getDeserializer(cls);
    deserializer.open(buffer.inBuffer);
    dst = deserializer.deserialize(dst);
    return dst;
}
Also used: SerializationFactory (org.apache.hadoop.io.serializer.SerializationFactory)
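
For context, a minimal usage sketch of copy. The CopyDemo class and values are illustrative, not from the cdap source; the default Hadoop configuration is assumed, whose WritableSerialization handles Writable types such as Text:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.util.ReflectionUtils;

public class CopyDemo {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Text src = new Text("hello");
        Text dst = new Text();
        // Serializes src into the thread-local buffer pair, then
        // deserializes those bytes into dst and returns the copy.
        dst = ReflectionUtils.copy(conf, src, dst);
        System.out.println(dst); // prints "hello"
    }
}

Because the buffers are thread-local, repeated copies on one thread reuse the same backing arrays rather than allocating fresh buffers per call.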

Example 12 with SerializationFactory

Use of org.apache.hadoop.io.serializer.SerializationFactory in project hive by apache.

Class CustomPartitionVertex, method getFileSplitFromEvent.

private FileSplit getFileSplitFromEvent(InputDataInformationEvent event) throws IOException {
    InputSplit inputSplit = null;
    if (event.getDeserializedUserPayload() != null) {
        // Payload already deserialized by the framework; use it directly.
        inputSplit = (InputSplit) event.getDeserializedUserPayload();
    } else {
        // Otherwise parse the raw payload as an MRSplitProto and rebuild the
        // old-format (mapred) split from it.
        MRSplitProto splitProto = MRSplitProto.parseFrom(ByteString.copyFrom(event.getUserPayload()));
        SerializationFactory serializationFactory = new SerializationFactory(new Configuration());
        inputSplit = MRInputHelpers.createOldFormatSplitFromUserPayload(splitProto, serializationFactory);
    }
    if (!(inputSplit instanceof FileSplit)) {
        throw new UnsupportedOperationException("Cannot handle splits other than FileSplit for the moment. Current input split type: " + inputSplit.getClass().getSimpleName());
    }
    return (FileSplit) inputSplit;
}
Also used: Configuration (org.apache.hadoop.conf.Configuration), SerializationFactory (org.apache.hadoop.io.serializer.SerializationFactory), FileSplit (org.apache.hadoop.mapred.FileSplit), InputSplit (org.apache.hadoop.mapred.InputSplit), MRSplitProto (org.apache.tez.mapreduce.protos.MRRuntimeProtos.MRSplitProto)
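
The factory constructed here resolves serializers and deserializers from the io.serializations configuration key. A minimal lookup sketch, assuming the default Hadoop configuration; the FactoryLookupDemo class and variable names are illustrative:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.serializer.Deserializer;
import org.apache.hadoop.io.serializer.SerializationFactory;
import org.apache.hadoop.mapred.FileSplit;

public class FactoryLookupDemo {
    public static void main(String[] args) {
        // The default io.serializations list includes WritableSerialization,
        // so Writable types such as the old-API FileSplit resolve here.
        Configuration conf = new Configuration();
        SerializationFactory factory = new SerializationFactory(conf);
        Deserializer<FileSplit> deserializer = factory.getDeserializer(FileSplit.class);
        System.out.println(deserializer != null
            ? "FileSplit is deserializable"
            : "no registered serialization accepts FileSplit");
    }
}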

Example 13 with SerializationFactory

Use of org.apache.hadoop.io.serializer.SerializationFactory in project goldenorb by jzachr.

Class InputSplitAllocator, method assignInputSplits.

/**
   * Gets the raw splits and calls the overloaded assignInputSplits(List) to assign them.
   * 
   * @return a map from each OrbPartitionMember to the list of RawSplits assigned to it
   */
@SuppressWarnings({ "deprecation", "rawtypes", "unchecked" })
public Map<OrbPartitionMember, List<RawSplit>> assignInputSplits() {
    List<RawSplit> rawSplits = null;
    JobConf job = new JobConf(orbConf);
    LOG.debug(orbConf.getJobNumber().toString());
    JobContext jobContext = new JobContext(job, new JobID(orbConf.getJobNumber(), 0));
    org.apache.hadoop.mapreduce.InputFormat<?, ?> input;
    try {
        input = ReflectionUtils.newInstance(jobContext.getInputFormatClass(), orbConf);
        List<org.apache.hadoop.mapreduce.InputSplit> splits = input.getSplits(jobContext);
        rawSplits = new ArrayList<RawSplit>(splits.size());
        DataOutputBuffer buffer = new DataOutputBuffer();
        SerializationFactory factory = new SerializationFactory(orbConf);
        // Assumes at least one split and that every split shares the same class.
        Serializer serializer = factory.getSerializer(splits.get(0).getClass());
        serializer.open(buffer);
        for (int i = 0; i < splits.size(); i++) {
            buffer.reset();
            serializer.serialize(splits.get(i));
            RawSplit rawSplit = new RawSplit();
            rawSplit.setClassName(splits.get(i).getClass().getName());
            rawSplit.setDataLength(splits.get(i).getLength());
            rawSplit.setBytes(buffer.getData(), 0, buffer.getLength());
            rawSplit.setLocations(splits.get(i).getLocations());
            rawSplits.add(rawSplit);
        }
    } catch (ClassNotFoundException | IOException | InterruptedException e) {
        e.printStackTrace();
        throw new RuntimeException(e);
    }
    return assignInputSplits(rawSplits);
}
Also used: RawSplit (org.goldenorb.io.input.RawSplit), SerializationFactory (org.apache.hadoop.io.serializer.SerializationFactory), IOException (java.io.IOException), DataOutputBuffer (org.apache.hadoop.io.DataOutputBuffer), JobContext (org.apache.hadoop.mapreduce.JobContext), JobConf (org.apache.hadoop.mapred.JobConf), JobID (org.apache.hadoop.mapreduce.JobID), Serializer (org.apache.hadoop.io.serializer.Serializer)
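
The bytes captured in each RawSplit can later be read back with the matching Deserializer. A minimal round-trip sketch under the default configuration; the SplitRoundTripDemo class, the path, and the lengths are illustrative:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.DataInputBuffer;
import org.apache.hadoop.io.DataOutputBuffer;
import org.apache.hadoop.io.serializer.Deserializer;
import org.apache.hadoop.io.serializer.SerializationFactory;
import org.apache.hadoop.io.serializer.Serializer;
import org.apache.hadoop.mapreduce.lib.input.FileSplit;

public class SplitRoundTripDemo {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        SerializationFactory factory = new SerializationFactory(conf);
        FileSplit split = new FileSplit(new Path("/tmp/data"), 0L, 128L, new String[0]);

        // Serialize into a growable in-memory buffer, as assignInputSplits does.
        DataOutputBuffer out = new DataOutputBuffer();
        Serializer<FileSplit> serializer = factory.getSerializer(FileSplit.class);
        serializer.open(out);
        serializer.serialize(split);
        serializer.close();

        // Deserialize from the same bytes, mirroring what a consumer of
        // RawSplit.getBytes() would do.
        DataInputBuffer in = new DataInputBuffer();
        in.reset(out.getData(), 0, out.getLength());
        Deserializer<FileSplit> deserializer = factory.getDeserializer(FileSplit.class);
        deserializer.open(in);
        FileSplit copy = deserializer.deserialize(null);
        deserializer.close();

        System.out.println(copy.getPath() + " : " + copy.getLength());
    }
}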

Example 14 with SerializationFactory

Use of org.apache.hadoop.io.serializer.SerializationFactory in project goldenorb by jzachr.

Class VertexInput, method initialize.

/**
 * Rebuilds the input split from its serialized bytes and initializes the record reader.
 */
@SuppressWarnings("unchecked")
public void initialize() {
    // rebuild the input split
    org.apache.hadoop.mapreduce.InputSplit split = null;
    DataInputBuffer splitBuffer = new DataInputBuffer();
    splitBuffer.reset(rawSplit.getBytes(), 0, rawSplit.getLength());
    SerializationFactory factory = new SerializationFactory(orbConf);
    Deserializer<? extends org.apache.hadoop.mapreduce.InputSplit> deserializer;
    try {
        deserializer = (Deserializer<? extends org.apache.hadoop.mapreduce.InputSplit>) factory.getDeserializer(orbConf.getClassByName(splitClass));
        deserializer.open(splitBuffer);
        split = deserializer.deserialize(null);
        JobConf job = new JobConf(orbConf);
        JobContext jobContext = new JobContext(job, new JobID(getOrbConf().getJobNumber(), 0));
        InputFormat<INPUT_KEY, INPUT_VALUE> inputFormat;
        inputFormat = (InputFormat<INPUT_KEY, INPUT_VALUE>) ReflectionUtils.newInstance(jobContext.getInputFormatClass(), orbConf);
        TaskAttemptContext tao = new TaskAttemptContext(job, new TaskAttemptID(new TaskID(jobContext.getJobID(), true, partitionID), 0));
        recordReader = inputFormat.createRecordReader(split, tao);
        recordReader.initialize(split, tao);
    } catch (ClassNotFoundException e) {
        throw new RuntimeException(e);
    } catch (IOException e) {
        throw new RuntimeException(e);
    } catch (InterruptedException e) {
        throw new RuntimeException(e);
    }
}
Also used: TaskID (org.apache.hadoop.mapreduce.TaskID), TaskAttemptID (org.apache.hadoop.mapreduce.TaskAttemptID), SerializationFactory (org.apache.hadoop.io.serializer.SerializationFactory), TaskAttemptContext (org.apache.hadoop.mapreduce.TaskAttemptContext), IOException (java.io.IOException), DataInputBuffer (org.apache.hadoop.io.DataInputBuffer), JobContext (org.apache.hadoop.mapreduce.JobContext), JobConf (org.apache.hadoop.mapred.JobConf), JobID (org.apache.hadoop.mapreduce.JobID)
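
Note that initialize passes null to deserialize, which asks the serialization to create the instance itself (for WritableSerialization, reflectively via a no-arg constructor); passing an existing instance fills it in instead. A small sketch of both modes using Text; the DeserializeReuseDemo class is illustrative:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.DataInputBuffer;
import org.apache.hadoop.io.DataOutputBuffer;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.serializer.Deserializer;
import org.apache.hadoop.io.serializer.SerializationFactory;
import org.apache.hadoop.io.serializer.Serializer;

public class DeserializeReuseDemo {
    public static void main(String[] args) throws Exception {
        SerializationFactory factory = new SerializationFactory(new Configuration());
        DataOutputBuffer out = new DataOutputBuffer();
        Serializer<Text> serializer = factory.getSerializer(Text.class);
        serializer.open(out);
        serializer.serialize(new Text("first"));
        serializer.serialize(new Text("second"));
        serializer.close();

        DataInputBuffer in = new DataInputBuffer();
        in.reset(out.getData(), 0, out.getLength());
        Deserializer<Text> deserializer = factory.getDeserializer(Text.class);
        deserializer.open(in);
        Text fresh = deserializer.deserialize(null); // a new Text is created for us
        Text reused = new Text();
        reused = deserializer.deserialize(reused); // the supplied Text is filled in
        deserializer.close();

        System.out.println(fresh + " / " + reused); // first / second
    }
}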

Example 15 with SerializationFactory

Use of org.apache.hadoop.io.serializer.SerializationFactory in project hadoop by apache.

Class TestAvroSerialization, method testAcceptHandlingPrimitivesAndArrays.

@Test
public void testAcceptHandlingPrimitivesAndArrays() throws Exception {
    SerializationFactory factory = new SerializationFactory(conf);
    assertNull(factory.getSerializer(byte[].class));
    assertNull(factory.getSerializer(byte.class));
}
Also used: SerializationFactory (org.apache.hadoop.io.serializer.SerializationFactory), Test (org.junit.Test)
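
As the test shows, getSerializer returns null when no registered serialization accepts the class, so callers that cannot tolerate null may want to fail fast. A minimal defensive sketch; the requireSerializer helper is hypothetical, not part of Hadoop:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.serializer.SerializationFactory;
import org.apache.hadoop.io.serializer.Serializer;

public class NullCheckDemo {
    static <T> Serializer<T> requireSerializer(Configuration conf, Class<T> cls) {
        Serializer<T> serializer = new SerializationFactory(conf).getSerializer(cls);
        if (serializer == null) {
            // Matches the test's expectation for byte.class and byte[].class.
            throw new IllegalStateException("no serialization accepts " + cls);
        }
        return serializer;
    }

    public static void main(String[] args) {
        requireSerializer(new Configuration(), byte[].class); // throws
    }
}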

Aggregations

SerializationFactory (org.apache.hadoop.io.serializer.SerializationFactory): 20
Deserializer (org.apache.hadoop.io.serializer.Deserializer): 5
IOException (java.io.IOException): 4
Serializer (org.apache.hadoop.io.serializer.Serializer): 4
ByteBuffer (java.nio.ByteBuffer): 2
FSDataInputStream (org.apache.hadoop.fs.FSDataInputStream): 2
FileSystem (org.apache.hadoop.fs.FileSystem): 2
LocalFileSystem (org.apache.hadoop.fs.LocalFileSystem): 2
DataInputBuffer (org.apache.hadoop.io.DataInputBuffer): 2
DataOutputBuffer (org.apache.hadoop.io.DataOutputBuffer): 2
JobConf (org.apache.hadoop.mapred.JobConf): 2
InputSplit (org.apache.hadoop.mapreduce.InputSplit): 2
JobContext (org.apache.hadoop.mapreduce.JobContext): 2
JobID (org.apache.hadoop.mapreduce.JobID): 2
IgniteCheckedException (org.apache.ignite.IgniteCheckedException): 2
ArrayList (java.util.ArrayList): 1
Map (java.util.Map): 1
ByteBufferInputStream (org.apache.avro.util.ByteBufferInputStream): 1
ByteBufferOutputStream (org.apache.avro.util.ByteBufferOutputStream): 1
Configuration (org.apache.hadoop.conf.Configuration): 1