
Example 1 with JobContextImpl

Use of org.apache.hadoop.mapred.JobContextImpl in project ignite by apache.

From the class HadoopV2SetupTask, method run0:

/**
 * {@inheritDoc}
 */
@SuppressWarnings("ConstantConditions")
@Override
protected void run0(HadoopV2TaskContext taskCtx) throws IgniteCheckedException {
    try {
        JobContextImpl jobCtx = taskCtx.jobContext();
        // Resolve the job's output format and validate the output specification up front.
        OutputFormat outputFormat = getOutputFormat(jobCtx);
        outputFormat.checkOutputSpecs(jobCtx);
        OutputCommitter committer = outputFormat.getOutputCommitter(hadoopContext());
        // Let the committer do job-level setup (e.g. create temporary output directories).
        if (committer != null)
            committer.setupJob(jobCtx);
    } catch (ClassNotFoundException | IOException e) {
        throw new IgniteCheckedException(e);
    } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        throw new IgniteInterruptedCheckedException(e);
    }
}
Also used: OutputCommitter (org.apache.hadoop.mapreduce.OutputCommitter), IgniteInterruptedCheckedException (org.apache.ignite.internal.IgniteInterruptedCheckedException), JobContextImpl (org.apache.hadoop.mapred.JobContextImpl), IgniteCheckedException (org.apache.ignite.IgniteCheckedException), OutputFormat (org.apache.hadoop.mapreduce.OutputFormat), IOException (java.io.IOException)
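For orientation, the JobContextImpl used above simply wraps a JobConf and a JobID behind the old mapred API. A minimal, standalone sketch (not taken from the Ignite source; the identifiers are illustrative, and it assumes a Hadoop version in which the JobContextImpl(JobConf, JobID) constructor is public, as the CDAP and Hive examples below rely on):

import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.JobContextImpl;
import org.apache.hadoop.mapred.JobID;

public class JobContextSketch {
    public static void main(String[] args) {
        // Wrap a JobConf and a JobID into an old-API job context.
        JobConf conf = new JobConf();
        JobContextImpl jobCtx = new JobContextImpl(conf, new JobID("sketch", 0));
        // The context exposes both the job id and the underlying JobConf.
        System.out.println(jobCtx.getJobID() + " -> " + jobCtx.getJobConf().size() + " conf entries");
    }
}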

Example 2 with JobContextImpl

Use of org.apache.hadoop.mapred.JobContextImpl in project cdap by caskdata.

From the class StreamInputFormatTest, method testStreamRecordReader:

@Test
public void testStreamRecordReader() throws Exception {
    File inputDir = tmpFolder.newFolder();
    File partition = new File(inputDir, "1.1000");
    partition.mkdirs();
    File eventFile = new File(partition, "bucket.1.0." + StreamFileType.EVENT.getSuffix());
    File indexFile = new File(partition, "bucket.1.0." + StreamFileType.INDEX.getSuffix());
    // write 1 event
    StreamDataFileWriter writer = new StreamDataFileWriter(Files.newOutputStreamSupplier(eventFile), Files.newOutputStreamSupplier(indexFile), 100L);
    writer.append(StreamFileTestUtils.createEvent(1000, "test"));
    writer.flush();
    // Get splits from the input format. Expect 2 splits:
    // one covering 0 to some offset, and one covering that offset to Long.MAX_VALUE.
    Configuration conf = new Configuration();
    TaskAttemptContext context = new TaskAttemptContextImpl(conf, new TaskAttemptID());
    AbstractStreamInputFormat.setStreamId(conf, DUMMY_ID);
    AbstractStreamInputFormat.setStreamPath(conf, inputDir.toURI());
    AbstractStreamInputFormat format = new AbstractStreamInputFormat() {

        @Override
        public AuthorizationEnforcer getAuthorizationEnforcer(TaskAttemptContext context) {
            return new NoOpAuthorizer();
        }

        @Override
        public AuthenticationContext getAuthenticationContext(TaskAttemptContext context) {
            return new AuthenticationTestContext();
        }
    };
    List<InputSplit> splits = format.getSplits(new JobContextImpl(new JobConf(conf), new JobID()));
    Assert.assertEquals(2, splits.size());
    // write another event so that the 2nd split has something to read
    writer.append(StreamFileTestUtils.createEvent(1001, "test"));
    writer.close();
    // create a record reader for the 2nd split
    StreamRecordReader<LongWritable, StreamEvent> recordReader = new StreamRecordReader<>(new IdentityStreamEventDecoder(), new NoOpAuthorizer(), new AuthenticationTestContext(), DUMMY_ID);
    recordReader.initialize(splits.get(1), context);
    // check that we read the 2nd stream event
    Assert.assertTrue(recordReader.nextKeyValue());
    StreamEvent output = recordReader.getCurrentValue();
    Assert.assertEquals(1001, output.getTimestamp());
    Assert.assertEquals("test", Bytes.toString(output.getBody()));
    // check that there is nothing more to read
    Assert.assertFalse(recordReader.nextKeyValue());
}
Also used: JobContextImpl (org.apache.hadoop.mapred.JobContextImpl), Configuration (org.apache.hadoop.conf.Configuration), TaskAttemptID (org.apache.hadoop.mapred.TaskAttemptID), StreamEvent (co.cask.cdap.api.flow.flowlet.StreamEvent), AuthenticationTestContext (co.cask.cdap.security.auth.context.AuthenticationTestContext), NoOpAuthorizer (co.cask.cdap.security.spi.authorization.NoOpAuthorizer), TaskAttemptContext (org.apache.hadoop.mapreduce.TaskAttemptContext), TaskAttemptContextImpl (org.apache.hadoop.mapreduce.task.TaskAttemptContextImpl), IdentityStreamEventDecoder (co.cask.cdap.data.stream.decoder.IdentityStreamEventDecoder), LongWritable (org.apache.hadoop.io.LongWritable), File (java.io.File), InputSplit (org.apache.hadoop.mapreduce.InputSplit), JobConf (org.apache.hadoop.mapred.JobConf), JobID (org.apache.hadoop.mapred.JobID), Test (org.junit.Test)
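The construction new JobContextImpl(new JobConf(conf), new JobID()) in this test works against any new-API InputFormat, because the mapred JobContextImpl also implements org.apache.hadoop.mapreduce.JobContext. A standalone sketch of that pattern (not part of the CDAP test; it uses the stock TextInputFormat and a temporary local file instead of the stream layout above):

import java.io.File;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.List;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.JobContextImpl;
import org.apache.hadoop.mapred.JobID;
import org.apache.hadoop.mapreduce.InputSplit;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;

public class GetSplitsSketch {
    public static void main(String[] args) throws Exception {
        // A small local file for TextInputFormat to split.
        File input = File.createTempFile("splits", ".txt");
        Files.write(input.toPath(), "hello\nworld\n".getBytes(StandardCharsets.UTF_8));

        // Point the new-API FileInputFormat at the file via its standard input-dir property.
        JobConf conf = new JobConf();
        conf.set("mapreduce.input.fileinputformat.inputdir", input.toURI().toString());

        // The old-API JobContextImpl satisfies the new-API JobContext parameter of getSplits.
        List<InputSplit> splits = new TextInputFormat()
            .getSplits(new JobContextImpl(conf, new JobID()));
        System.out.println("splits: " + splits.size());
    }
}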

Example 3 with JobContextImpl

Use of org.apache.hadoop.mapred.JobContextImpl in project hive by apache.

From the class TestHiveIcebergOutputCommitter, method testSuccessfulUnpartitionedWrite:

@Test
public void testSuccessfulUnpartitionedWrite() throws IOException {
    HiveIcebergOutputCommitter committer = new HiveIcebergOutputCommitter();
    Table table = table(temp.getRoot().getPath(), false);
    JobConf conf = jobConf(table, 1);
    List<Record> expected = writeRecords(table.name(), 1, 0, true, false, conf);
    committer.commitJob(new JobContextImpl(conf, JOB_ID));
    HiveIcebergTestUtils.validateFiles(table, conf, JOB_ID, 1);
    HiveIcebergTestUtils.validateData(table, expected, 0);
}
Also used: JobContextImpl (org.apache.hadoop.mapred.JobContextImpl), Table (org.apache.iceberg.Table), Record (org.apache.iceberg.data.Record), JobConf (org.apache.hadoop.mapred.JobConf), Test (org.junit.Test)
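JOB_ID in this test is a constant defined elsewhere in TestHiveIcebergOutputCommitter and is not shown in the snippet; a purely hypothetical stand-in, only to make the call site readable, would be:

// Hypothetical stand-in for the test's JOB_ID fixture; the actual value lives in the test class.
private static final org.apache.hadoop.mapred.JobID JOB_ID = new org.apache.hadoop.mapred.JobID("test", 0);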

Example 4 with JobContextImpl

Use of org.apache.hadoop.mapred.JobContextImpl in project hive by apache.

From the class TestHiveIcebergOutputCommitter, method testSuccessfulMultipleTasksUnpartitionedWrite:

@Test
public void testSuccessfulMultipleTasksUnpartitionedWrite() throws IOException {
    HiveIcebergOutputCommitter committer = new HiveIcebergOutputCommitter();
    Table table = table(temp.getRoot().getPath(), false);
    JobConf conf = jobConf(table, 2);
    List<Record> expected = writeRecords(table.name(), 2, 0, true, false, conf);
    committer.commitJob(new JobContextImpl(conf, JOB_ID));
    HiveIcebergTestUtils.validateFiles(table, conf, JOB_ID, 2);
    HiveIcebergTestUtils.validateData(table, expected, 0);
}
Also used: JobContextImpl (org.apache.hadoop.mapred.JobContextImpl), Table (org.apache.iceberg.Table), Record (org.apache.iceberg.data.Record), JobConf (org.apache.hadoop.mapred.JobConf), Test (org.junit.Test)

Example 5 with JobContextImpl

Use of org.apache.hadoop.mapred.JobContextImpl in project hive by apache.

From the class TestHiveIcebergOutputCommitter, method testSuccessfulPartitionedWrite:

@Test
public void testSuccessfulPartitionedWrite() throws IOException {
    HiveIcebergOutputCommitter committer = new HiveIcebergOutputCommitter();
    Table table = table(temp.getRoot().getPath(), true);
    JobConf conf = jobConf(table, 1);
    List<Record> expected = writeRecords(table.name(), 1, 0, true, false, conf);
    committer.commitJob(new JobContextImpl(conf, JOB_ID));
    // Expecting 3 files with the fanout writer, 4 with the ClusteredWriter, where writing to already completed partitions is allowed.
    HiveIcebergTestUtils.validateFiles(table, conf, JOB_ID, 4);
    HiveIcebergTestUtils.validateData(table, expected, 0);
}
Also used: JobContextImpl (org.apache.hadoop.mapred.JobContextImpl), Table (org.apache.iceberg.Table), Record (org.apache.iceberg.data.Record), JobConf (org.apache.hadoop.mapred.JobConf), Test (org.junit.Test)
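All three Hive tests above drive the committer the same way: the JobConf prepared by jobConf(table, n) and the fixed JOB_ID are wrapped into a mapred JobContextImpl, which is then handed to commitJob. A hedged, generic sketch of that step (the class and method names are illustrative; this shape works for any old-API OutputCommitter, while HiveIcebergOutputCommitter additionally relies on the Iceberg table details that jobConf(table, n) puts into the JobConf):

import java.io.IOException;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.JobContext;
import org.apache.hadoop.mapred.JobContextImpl;
import org.apache.hadoop.mapred.JobID;
import org.apache.hadoop.mapred.OutputCommitter;

public final class CommitJobSketch {
    // Illustrative helper: commit a finished job through an old-API committer.
    static void commitWith(OutputCommitter committer, JobConf conf, JobID jobId) throws IOException {
        JobContext jobContext = new JobContextImpl(conf, jobId);
        committer.commitJob(jobContext);
    }
}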

Aggregations

JobContextImpl (org.apache.hadoop.mapred.JobContextImpl): 14
JobConf (org.apache.hadoop.mapred.JobConf): 8
Test (org.junit.Test): 7
Table (org.apache.iceberg.Table): 6
Record (org.apache.iceberg.data.Record): 5
IOException (java.io.IOException): 4
JobID (org.apache.hadoop.mapred.JobID): 4
OutputFormat (org.apache.hadoop.mapreduce.OutputFormat): 4
IgniteCheckedException (org.apache.ignite.IgniteCheckedException): 4
IgniteInterruptedCheckedException (org.apache.ignite.internal.IgniteInterruptedCheckedException): 4
JobContext (org.apache.hadoop.mapred.JobContext): 2
TaskAttemptID (org.apache.hadoop.mapred.TaskAttemptID): 2
InputSplit (org.apache.hadoop.mapreduce.InputSplit): 2
OutputCommitter (org.apache.hadoop.mapreduce.OutputCommitter): 2
StreamEvent (co.cask.cdap.api.flow.flowlet.StreamEvent): 1
IdentityStreamEventDecoder (co.cask.cdap.data.stream.decoder.IdentityStreamEventDecoder): 1
AuthenticationTestContext (co.cask.cdap.security.auth.context.AuthenticationTestContext): 1
NoOpAuthorizer (co.cask.cdap.security.spi.authorization.NoOpAuthorizer): 1
File (java.io.File): 1
HadoopDummyProgressable (org.apache.flink.api.java.hadoop.mapred.wrapper.HadoopDummyProgressable): 1