
Example 6 with HadoopJobId

Use of org.apache.ignite.internal.processors.hadoop.HadoopJobId in project ignite by apache.

The class HadoopJobTracker, method submit.

/**
 * Submits execution of Hadoop job to grid.
 *
 * @param jobId Job ID.
 * @param info Job info.
 * @return Job completion future.
 */
@SuppressWarnings("unchecked")
public IgniteInternalFuture<HadoopJobId> submit(HadoopJobId jobId, HadoopJobInfo info) {
    if (!busyLock.tryReadLock()) {
        return new GridFinishedFuture<>(new IgniteCheckedException("Failed to execute map-reduce job (grid is stopping): " + info));
    }
    try {
        long jobPrepare = U.currentTimeMillis();
        if (jobs.containsKey(jobId) || jobMetaCache().containsKey(jobId))
            throw new IgniteCheckedException("Failed to submit job. Job with the same ID already exists: " + jobId);
        HadoopJobEx job = job(jobId, info);
        HadoopMapReducePlan mrPlan = mrPlanner.preparePlan(job, ctx.nodes(), null);
        logPlan(info, mrPlan);
        HadoopJobMetadata meta = new HadoopJobMetadata(ctx.localNodeId(), jobId, info);
        meta.mapReducePlan(mrPlan);
        meta.pendingSplits(allSplits(mrPlan));
        meta.pendingReducers(allReducers(mrPlan));
        GridFutureAdapter<HadoopJobId> completeFut = new GridFutureAdapter<>();
        GridFutureAdapter<HadoopJobId> old = activeFinishFuts.put(jobId, completeFut);
        assert old == null : "Duplicate completion future [jobId=" + jobId + ", old=" + old + ']';
        if (log.isDebugEnabled())
            log.debug("Submitting job metadata [jobId=" + jobId + ", meta=" + meta + ']');
        long jobStart = U.currentTimeMillis();
        HadoopPerformanceCounter perfCntr = HadoopPerformanceCounter.getCounter(meta.counters(), ctx.localNodeId());
        perfCntr.clientSubmissionEvents(info);
        perfCntr.onJobPrepare(jobPrepare);
        perfCntr.onJobStart(jobStart);
        if (jobMetaCache().getAndPutIfAbsent(jobId, meta) != null)
            throw new IgniteCheckedException("Failed to submit job. Job with the same ID already exists: " + jobId);
        return completeFut;
    } catch (IgniteCheckedException e) {
        U.error(log, "Failed to submit job: " + jobId, e);
        return new GridFinishedFuture<>(e);
    } finally {
        busyLock.readUnlock();
    }
}
Also used: HadoopMapReducePlan (org.apache.ignite.hadoop.HadoopMapReducePlan), IgniteCheckedException (org.apache.ignite.IgniteCheckedException), HadoopJobEx (org.apache.ignite.internal.processors.hadoop.HadoopJobEx), GridFutureAdapter (org.apache.ignite.internal.util.future.GridFutureAdapter), HadoopPerformanceCounter (org.apache.ignite.internal.processors.hadoop.counter.HadoopPerformanceCounter), HadoopJobId (org.apache.ignite.internal.processors.hadoop.HadoopJobId), GridFinishedFuture (org.apache.ignite.internal.util.future.GridFinishedFuture)
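
A minimal caller-side sketch of driving this method (the `tracker` and `info` variables are hypothetical, not part of the example above):

// Construct a job ID from a random UUID and a local counter, then submit.
HadoopJobId jobId = new HadoopJobId(UUID.randomUUID(), 1);

IgniteInternalFuture<HadoopJobId> fut = tracker.submit(jobId, info);

// get() rethrows any IgniteCheckedException that submit() wrapped into a
// GridFinishedFuture (duplicate job ID, grid stopping, planner failure).
HadoopJobId completedId = fut.get();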

Example 7 with HadoopJobId

Use of org.apache.ignite.internal.processors.hadoop.HadoopJobId in project ignite by apache.

The class HadoopJobTrackerSelfTest, method testTaskWithCombinerPerMap.

/**
 * @throws Exception If failed.
 */
public void testTaskWithCombinerPerMap() throws Exception {
    try {
        UUID globalId = UUID.randomUUID();
        Job job = Job.getInstance();
        setupFileSystems(job.getConfiguration());
        job.setMapperClass(TestMapper.class);
        job.setReducerClass(TestReducer.class);
        job.setCombinerClass(TestCombiner.class);
        job.setInputFormatClass(InFormat.class);
        FileOutputFormat.setOutputPath(job, new Path(igfsScheme() + PATH_OUTPUT + "2"));
        HadoopJobId jobId = new HadoopJobId(globalId, 1);
        grid(0).hadoop().submit(jobId, createJobInfo(job.getConfiguration()));
        checkStatus(jobId, false);
        info("Releasing map latch.");
        latch.get("mapAwaitLatch").countDown();
        checkStatus(jobId, false);
        // All maps are completed. We have a combiner, so no reducers should be executed
        // before combiner latch is released.
        U.sleep(50);
        assertEquals(0, reduceExecCnt.get());
        info("Releasing combiner latch.");
        latch.get("combineAwaitLatch").countDown();
        checkStatus(jobId, false);
        info("Releasing reduce latch.");
        latch.get("reduceAwaitLatch").countDown();
        checkStatus(jobId, true);
        assertEquals(10, mapExecCnt.get());
        assertEquals(10, combineExecCnt.get());
        assertEquals(1, reduceExecCnt.get());
    } finally {
        // Safety.
        latch.get("mapAwaitLatch").countDown();
        latch.get("combineAwaitLatch").countDown();
        latch.get("reduceAwaitLatch").countDown();
    }
}
Also used: Path (org.apache.hadoop.fs.Path), UUID (java.util.UUID), Job (org.apache.hadoop.mapreduce.Job), HadoopJobId (org.apache.ignite.internal.processors.hadoop.HadoopJobId)
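
The checkStatus transitions above work because the test's task classes park on shared latches and bump shared counters. A sketch of what such a latch-gated combiner might look like, under the assumption that the static `latch` map and `combineExecCnt` counter match the names used above (the key/value types are illustrative):

import java.io.IOException;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

// Hypothetical reconstruction of a latch-gated combiner for this self-test:
// each invocation blocks until the test counts the latch down, then records
// itself, which is what lets the test assert exact invocation counts.
class TestCombiner extends Reducer<Text, IntWritable, Text, IntWritable> {
    static ConcurrentMap<String, CountDownLatch> latch; // Shared with the test (assumption).
    static AtomicInteger combineExecCnt = new AtomicInteger();

    @Override protected void reduce(Text key, Iterable<IntWritable> vals, Context ctx)
        throws IOException, InterruptedException {
        latch.get("combineAwaitLatch").await(); // Parks until the test releases the latch.

        combineExecCnt.incrementAndGet(); // Counted by assertEquals(10, combineExecCnt.get()).
    }
}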

Example 8 with HadoopJobId

Use of org.apache.ignite.internal.processors.hadoop.HadoopJobId in project ignite by apache.

The class HadoopTaskExecutionRequest, method readExternal.

/** {@inheritDoc} */
@Override
public void readExternal(ObjectInput in) throws IOException, ClassNotFoundException {
    jobId = new HadoopJobId();
    jobId.readExternal(in);
    jobInfo = (HadoopJobInfo) in.readObject();
    tasks = U.readCollection(in);
}
Also used: HadoopJobId (org.apache.ignite.internal.processors.hadoop.HadoopJobId)
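
The matching writeExternal must emit the fields in the same order; a sketch of that counterpart, assuming U.writeCollection is the symmetric IgniteUtils helper to U.readCollection:

/** {@inheritDoc} */
@Override
public void writeExternal(ObjectOutput out) throws IOException {
    // Field order must mirror readExternal above: jobId, then jobInfo, then tasks.
    jobId.writeExternal(out);
    out.writeObject(jobInfo);
    U.writeCollection(out, tasks);
}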

Example 9 with HadoopJobId

Use of org.apache.ignite.internal.processors.hadoop.HadoopJobId in project ignite by apache.

The class HadoopJobTrackerSelfTest, method testSimpleTaskSubmit.

/**
 * @throws Exception If failed.
 */
public void testSimpleTaskSubmit() throws Exception {
    try {
        UUID globalId = UUID.randomUUID();
        Job job = Job.getInstance();
        setupFileSystems(job.getConfiguration());
        job.setMapperClass(TestMapper.class);
        job.setReducerClass(TestReducer.class);
        job.setInputFormatClass(InFormat.class);
        FileOutputFormat.setOutputPath(job, new Path(igfsScheme() + PATH_OUTPUT + "1"));
        HadoopJobId jobId = new HadoopJobId(globalId, 1);
        grid(0).hadoop().submit(jobId, createJobInfo(job.getConfiguration()));
        checkStatus(jobId, false);
        info("Releasing map latch.");
        latch.get("mapAwaitLatch").countDown();
        checkStatus(jobId, false);
        info("Releasing reduce latch.");
        latch.get("reduceAwaitLatch").countDown();
        checkStatus(jobId, true);
        assertEquals(10, mapExecCnt.get());
        assertEquals(0, combineExecCnt.get());
        assertEquals(1, reduceExecCnt.get());
    } finally {
        // Safety.
        latch.get("mapAwaitLatch").countDown();
        latch.get("combineAwaitLatch").countDown();
        latch.get("reduceAwaitLatch").countDown();
    }
}
Also used: Path (org.apache.hadoop.fs.Path), UUID (java.util.UUID), Job (org.apache.hadoop.mapreduce.Job), HadoopJobId (org.apache.ignite.internal.processors.hadoop.HadoopJobId)
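
checkStatus(jobId, complete) is a helper of this self-test; a plausible sketch of what it asserts, assuming the Hadoop facade exposes a finishFuture(HadoopJobId) accessor (the body and timeout are assumptions):

// Hypothetical reconstruction of the checkStatus helper used in these tests,
// written as a method of the self-test class.
private void checkStatus(HadoopJobId jobId, boolean complete) throws Exception {
    IgniteInternalFuture<?> fut = grid(0).hadoop().finishFuture(jobId);

    if (complete)
        fut.get(5_000); // Must finish promptly once all latches are released.
    else
        assertFalse("Job unexpectedly finished: " + jobId, fut.isDone());
}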

Example 10 with HadoopJobId

Use of org.apache.ignite.internal.processors.hadoop.HadoopJobId in project ignite by apache.

The class HadoopV2JobSelfTest, method testCustomSerializationApplying.

/**
 * Tests that {@link HadoopJobEx} provides a wrapped serializer if one is set in the configuration.
 *
 * @throws IgniteCheckedException If failed.
 */
public void testCustomSerializationApplying() throws IgniteCheckedException {
    JobConf cfg = new JobConf();
    cfg.setMapOutputKeyClass(IntWritable.class);
    cfg.setMapOutputValueClass(Text.class);
    cfg.set(CommonConfigurationKeys.IO_SERIALIZATIONS_KEY, CustomSerialization.class.getName());
    HadoopDefaultJobInfo info = createJobInfo(cfg);
    final UUID uuid = UUID.randomUUID();
    HadoopJobId id = new HadoopJobId(uuid, 1);
    HadoopJobEx job = info.createJob(HadoopV2Job.class, id, log, null, new HadoopHelperImpl());
    HadoopTaskContext taskCtx = job.getTaskContext(new HadoopTaskInfo(HadoopTaskType.MAP, null, 0, 0, null));
    HadoopSerialization ser = taskCtx.keySerialization();
    assertEquals(HadoopSerializationWrapper.class.getName(), ser.getClass().getName());
    DataInput in = new DataInputStream(new ByteArrayInputStream(new byte[0]));
    assertEquals(TEST_SERIALIZED_VALUE, ser.read(in, null).toString());
    ser = taskCtx.valueSerialization();
    assertEquals(HadoopSerializationWrapper.class.getName(), ser.getClass().getName());
    assertEquals(TEST_SERIALIZED_VALUE, ser.read(in, null).toString());
}
Also used: HadoopHelperImpl (org.apache.ignite.internal.processors.hadoop.HadoopHelperImpl), DataInputStream (java.io.DataInputStream), HadoopJobId (org.apache.ignite.internal.processors.hadoop.HadoopJobId), DataInput (java.io.DataInput), ByteArrayInputStream (java.io.ByteArrayInputStream), HadoopJobEx (org.apache.ignite.internal.processors.hadoop.HadoopJobEx), HadoopTaskContext (org.apache.ignite.internal.processors.hadoop.HadoopTaskContext), HadoopTaskInfo (org.apache.ignite.internal.processors.hadoop.HadoopTaskInfo), HadoopDefaultJobInfo (org.apache.ignite.internal.processors.hadoop.HadoopDefaultJobInfo), HadoopSerializationWrapper (org.apache.ignite.internal.processors.hadoop.impl.v2.HadoopSerializationWrapper), UUID (java.util.UUID), HadoopSerialization (org.apache.ignite.internal.processors.hadoop.HadoopSerialization), JobConf (org.apache.hadoop.mapred.JobConf)
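
The CustomSerialization class registered via IO_SERIALIZATIONS_KEY is a Hadoop Serialization implementation; a minimal sketch of one whose deserializer always yields a fixed marker value, which is what the assertions above check (only the class name and the Serialization SPI come from the test and Hadoop; the body is an assumption):

import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.serializer.Deserializer;
import org.apache.hadoop.io.serializer.Serialization;
import org.apache.hadoop.io.serializer.Serializer;

// Hypothetical sketch of a CustomSerialization like the one the test registers:
// the deserializer ignores the stream and always produces a fixed marker, so
// the test can verify the wrapper was actually applied.
public class CustomSerialization implements Serialization<Text> {
    static final String TEST_SERIALIZED_VALUE = "test"; // Assumed marker value.

    @Override public boolean accept(Class<?> c) {
        return Text.class.isAssignableFrom(c);
    }

    @Override public Serializer<Text> getSerializer(Class<Text> c) {
        return new Serializer<Text>() {
            private OutputStream out;

            @Override public void open(OutputStream out) { this.out = out; }

            @Override public void serialize(Text t) throws IOException {
                out.write(t.getBytes(), 0, t.getLength());
            }

            @Override public void close() throws IOException { out.close(); }
        };
    }

    @Override public Deserializer<Text> getDeserializer(Class<Text> c) {
        return new Deserializer<Text>() {
            @Override public void open(InputStream in) { /* No stream state needed. */ }

            @Override public Text deserialize(Text t) {
                return new Text(TEST_SERIALIZED_VALUE); // Always yields the marker.
            }

            @Override public void close() { /* Nothing to close. */ }
        };
    }
}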

Aggregations

HadoopJobId (org.apache.ignite.internal.processors.hadoop.HadoopJobId): 39
UUID (java.util.UUID): 15
Path (org.apache.hadoop.fs.Path): 13
Job (org.apache.hadoop.mapreduce.Job): 13
IgniteCheckedException (org.apache.ignite.IgniteCheckedException): 10
Configuration (org.apache.hadoop.conf.Configuration): 9
HadoopConfiguration (org.apache.ignite.configuration.HadoopConfiguration): 7
IgfsPath (org.apache.ignite.igfs.IgfsPath): 7
IOException (java.io.IOException): 6
JobConf (org.apache.hadoop.mapred.JobConf): 5
FileSystemConfiguration (org.apache.ignite.configuration.FileSystemConfiguration): 5
HadoopDefaultJobInfo (org.apache.ignite.internal.processors.hadoop.HadoopDefaultJobInfo): 4
IgniteHadoopFileSystem (org.apache.ignite.hadoop.fs.v1.IgniteHadoopFileSystem): 3
HadoopHelperImpl (org.apache.ignite.internal.processors.hadoop.HadoopHelperImpl): 3
HadoopJobEx (org.apache.ignite.internal.processors.hadoop.HadoopJobEx): 3
HadoopTaskCancelledException (org.apache.ignite.internal.processors.hadoop.HadoopTaskCancelledException): 3
HadoopTaskInfo (org.apache.ignite.internal.processors.hadoop.HadoopTaskInfo): 3
ArrayList (java.util.ArrayList): 2
IgniteConfiguration (org.apache.ignite.configuration.IgniteConfiguration): 2
HadoopMapReducePlan (org.apache.ignite.hadoop.HadoopMapReducePlan): 2