
Example 11 with HistoryEvent

use of org.apache.tez.dag.history.HistoryEvent in project tez by apache.

the class TestAMRecovery method testVertexCompletelyFinished_One2One.

/**
 * Fine-grained, task-level recovery: in a vertex (v1), task 0 and task 1 are
 * both done. History flush happens. AM dies. Once the AM is recovered, task 0
 * and task 1 are not re-run. (ONE_TO_ONE)
 *
 * @throws Exception
 */
@Test(timeout = 120000)
public void testVertexCompletelyFinished_One2One() throws Exception {
    DAG dag = createDAG("VertexCompletelyFinished_One2One", ControlledInputReadyVertexManager.class, DataMovementType.ONE_TO_ONE, false);
    TezCounters counters = runDAGAndVerify(dag, DAGStatus.State.SUCCEEDED);
    assertEquals(4, counters.findCounter(DAGCounter.NUM_SUCCEEDED_TASKS).getValue());
    assertEquals(2, counters.findCounter(TestCounter.Counter_1).getValue());
    List<HistoryEvent> historyEvents1 = readRecoveryLog(1);
    List<HistoryEvent> historyEvents2 = readRecoveryLog(2);
    printHistoryEvents(historyEvents1, 1);
    printHistoryEvents(historyEvents2, 2);
    // task_0 and task_1 of v1 are both finished in attempt 1
    assertEquals(1, findTaskAttemptFinishedEvent(historyEvents1, 0, 0).size());
    assertEquals(1, findTaskAttemptFinishedEvent(historyEvents1, 0, 1).size());
    // the logs read through attempt 2 still hold exactly one finished event per
    // task, so neither task_0 nor task_1 of v1 was rerun after recovery
    assertEquals(1, findTaskAttemptFinishedEvent(historyEvents2, 0, 0).size());
    assertEquals(1, findTaskAttemptFinishedEvent(historyEvents2, 0, 1).size());
}
Also used : DAG(org.apache.tez.dag.api.DAG) HistoryEvent(org.apache.tez.dag.history.HistoryEvent) TezCounters(org.apache.tez.common.counters.TezCounters) Test(org.junit.Test)

Example 12 with HistoryEvent

use of org.apache.tez.dag.history.HistoryEvent in project tez by apache.

the class TestAMRecovery method testVertexCompletelyFinished_Broadcast.

/**
 * Fine-grained, task-level recovery: in a vertex (v1), task 0 and task 1 are
 * both done. History flush happens. AM dies. Once the AM is recovered, task 0
 * and task 1 are not re-run. (Broadcast)
 *
 * @throws Exception
 */
@Test(timeout = 120000)
public void testVertexCompletelyFinished_Broadcast() throws Exception {
    DAG dag = createDAG("VertexCompletelyFinished_Broadcast", ControlledImmediateStartVertexManager.class, DataMovementType.BROADCAST, false);
    TezCounters counters = runDAGAndVerify(dag, DAGStatus.State.SUCCEEDED);
    assertEquals(4, counters.findCounter(DAGCounter.NUM_SUCCEEDED_TASKS).getValue());
    assertEquals(2, counters.findCounter(TestCounter.Counter_1).getValue());
    List<HistoryEvent> historyEvents1 = readRecoveryLog(1);
    List<HistoryEvent> historyEvents2 = readRecoveryLog(2);
    printHistoryEvents(historyEvents1, 1);
    printHistoryEvents(historyEvents2, 2);
    // task_0 and task_1 of v1 are both finished in attempt 1
    assertEquals(1, findTaskAttemptFinishedEvent(historyEvents1, 0, 0).size());
    assertEquals(1, findTaskAttemptFinishedEvent(historyEvents1, 0, 1).size());
    // the logs read through attempt 2 still hold exactly one finished event per
    // task, so neither task_0 nor task_1 of v1 was rerun after recovery
    assertEquals(1, findTaskAttemptFinishedEvent(historyEvents2, 0, 0).size());
    assertEquals(1, findTaskAttemptFinishedEvent(historyEvents2, 0, 1).size());
}
Also used : DAG(org.apache.tez.dag.api.DAG) HistoryEvent(org.apache.tez.dag.history.HistoryEvent) TezCounters(org.apache.tez.common.counters.TezCounters) Test(org.junit.Test)

Example 13 with HistoryEvent

use of org.apache.tez.dag.history.HistoryEvent in project tez by apache.

the class TestAMRecovery method readRecoveryLog.

private List<HistoryEvent> readRecoveryLog(int attemptNum) throws IOException {
    ApplicationId appId = tezSession.getAppMasterApplicationId();
    Path tezSystemStagingDir = TezCommonUtils.getTezSystemStagingPath(tezConf, appId.toString());
    Path recoveryDataDir = TezCommonUtils.getRecoveryPath(tezSystemStagingDir, tezConf);
    FileSystem fs = tezSystemStagingDir.getFileSystem(tezConf);
    List<HistoryEvent> historyEvents = new ArrayList<HistoryEvent>();
    for (int i = 1; i <= attemptNum; ++i) {
        Path currentAttemptRecoveryDataDir = TezCommonUtils.getAttemptRecoveryPath(recoveryDataDir, i);
        Path recoveryFilePath = new Path(currentAttemptRecoveryDataDir, appId.toString().replace("application", "dag") + "_1" + TezConstants.DAG_RECOVERY_RECOVER_FILE_SUFFIX);
        if (fs.exists(recoveryFilePath)) {
            LOG.info("Read recovery file:" + recoveryFilePath);
            historyEvents.addAll(RecoveryParser.parseDAGRecoveryFile(fs.open(recoveryFilePath)));
        }
    }
    return historyEvents;
}
Also used : Path(org.apache.hadoop.fs.Path) FileSystem(org.apache.hadoop.fs.FileSystem) ArrayList(java.util.ArrayList) ApplicationId(org.apache.hadoop.yarn.api.records.ApplicationId) HistoryEvent(org.apache.tez.dag.history.HistoryEvent)
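
readRecoveryLog(n) walks the attempt directories 1 through n, so the list returned for attempt 2 also contains everything logged in attempt 1. The following is a minimal sketch of that accumulation using plain java.nio files. The directory layout (<root>/<attempt>/events.log), the one-event-per-line text format, and every name in it are hypothetical stand-ins chosen for illustration; this is not Tez's actual recovery file format or API.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the cumulative read in readRecoveryLog; not Tez code.
class CumulativeLogSketch {

    // Reads <root>/<i>/events.log for i = 1..attemptNum, skipping attempts that have no file.
    static List<String> readUpTo(Path root, int attemptNum) throws IOException {
        List<String> events = new ArrayList<>();
        for (int i = 1; i <= attemptNum; i++) {
            Path log = root.resolve(Integer.toString(i)).resolve("events.log");
            if (Files.exists(log)) {
                events.addAll(Files.readAllLines(log));
            }
        }
        return events;
    }

    // Writes a small two-attempt layout into a temporary directory for demonstration.
    static Path writeDemoLogs() throws IOException {
        Path root = Files.createTempDirectory("recovery-sketch");
        Files.createDirectories(root.resolve("1"));
        Files.write(root.resolve("1").resolve("events.log"),
                Arrays.asList("task0_finished", "task1_finished"));
        Files.createDirectories(root.resolve("2"));
        Files.write(root.resolve("2").resolve("events.log"),
                Arrays.asList("dag_recovered"));
        return root;
    }

    public static void main(String[] args) throws IOException {
        Path root = writeDemoLogs();
        System.out.println(readUpTo(root, 1)); // events from attempt 1 only
        System.out.println(readUpTo(root, 2)); // events from attempts 1 and 2
    }
}
```

Because the read is cumulative, a task that was rerun after recovery would show up as a second finished event in the list for attempt 2, which is why a count of exactly one per task is a reasonable signal that nothing was re-executed.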

Example 14 with HistoryEvent

use of org.apache.tez.dag.history.HistoryEvent in project tez by apache.

the class TestHistoryEventsProtoConversion method testProtoConversion.

private HistoryEvent testProtoConversion(HistoryEvent event) throws IOException, TezException {
    ByteArrayOutputStream os = new ByteArrayOutputStream();
    HistoryEvent deserializedEvent = null;
    event.toProtoStream(os);
    os.flush();
    os.close();
    deserializedEvent = ReflectionUtils.createClazzInstance(event.getClass().getName());
    LOG.info("Serialized event to byte array" + ", eventType=" + event.getEventType() + ", bufLen=" + os.toByteArray().length);
    deserializedEvent.fromProtoStream(new ByteArrayInputStream(os.toByteArray()));
    return deserializedEvent;
}
Also used : ByteArrayInputStream(java.io.ByteArrayInputStream) ByteArrayOutputStream(java.io.ByteArrayOutputStream) HistoryEvent(org.apache.tez.dag.history.HistoryEvent)
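
The helper above follows a common round-trip pattern: serialize to an in-memory buffer, create a fresh instance, and deserialize into it from the same bytes. Below is a small self-contained sketch of that pattern. SketchEvent and its writeTo/readFrom methods are hypothetical and use java.io.DataOutputStream rather than protobuf, so this shows the shape of the check only, not the HistoryEvent API.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Hypothetical illustration of a serialize/deserialize round trip; not Tez code.
class RoundTripSketch {

    // A tiny event with two fields and hand-written binary (de)serialization.
    static final class SketchEvent {
        String name = "";
        long timestamp;

        void writeTo(DataOutputStream out) throws IOException {
            out.writeUTF(name);
            out.writeLong(timestamp);
        }

        void readFrom(DataInputStream in) throws IOException {
            name = in.readUTF();
            timestamp = in.readLong();
        }
    }

    // Serializes the event into a byte array, then fills a fresh instance from those bytes.
    static SketchEvent roundTrip(SketchEvent event) throws IOException {
        ByteArrayOutputStream os = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(os)) {
            event.writeTo(out);
        }
        byte[] bytes = os.toByteArray();
        SketchEvent copy = new SketchEvent();
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            copy.readFrom(in);
        }
        return copy;
    }

    public static void main(String[] args) throws IOException {
        SketchEvent original = new SketchEvent();
        original.name = "DAG_STARTED";
        original.timestamp = 42L;
        SketchEvent copy = roundTrip(original);
        System.out.println(copy.name + " " + copy.timestamp);
    }
}
```

A caller would then compare the fields of the original and the returned copy, which is presumably what the tests around testProtoConversion do with the deserialized event.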

Example 15 with HistoryEvent

use of org.apache.tez.dag.history.HistoryEvent in project tez by apache.

the class TestHistoryEventJsonConversion method testHandlerExists.

@Test(timeout = 5000)
public void testHandlerExists() throws JSONException {
    for (HistoryEventType eventType : HistoryEventType.values()) {
        HistoryEvent event = null;
        switch(eventType) {
            case APP_LAUNCHED:
                event = new AppLaunchedEvent(applicationId, random.nextInt(), random.nextInt(), user, new Configuration(false), null);
                break;
            case AM_LAUNCHED:
                event = new AMLaunchedEvent(applicationAttemptId, random.nextInt(), random.nextInt(), user);
                break;
            case AM_STARTED:
                event = new AMStartedEvent(applicationAttemptId, random.nextInt(), user);
                break;
            case DAG_SUBMITTED:
                event = new DAGSubmittedEvent(tezDAGID, random.nextInt(), dagPlan, applicationAttemptId, null, user, null, null, "Q_" + eventType.name());
                break;
            case DAG_INITIALIZED:
                event = new DAGInitializedEvent(tezDAGID, random.nextInt(), user, dagPlan.getName(), null);
                break;
            case DAG_STARTED:
                event = new DAGStartedEvent(tezDAGID, random.nextInt(), user, dagPlan.getName());
                break;
            case DAG_FINISHED:
                event = new DAGFinishedEvent(tezDAGID, random.nextInt(), random.nextInt(), DAGState.ERROR, null, null, user, dagPlan.getName(), null, applicationAttemptId, dagPlan);
                break;
            case VERTEX_INITIALIZED:
                event = new VertexInitializedEvent(tezVertexID, "v1", random.nextInt(), random.nextInt(), random.nextInt(), "proc", null, null, null);
                break;
            case VERTEX_STARTED:
                event = new VertexStartedEvent(tezVertexID, random.nextInt(), random.nextInt());
                break;
            case VERTEX_CONFIGURE_DONE:
                event = new VertexConfigurationDoneEvent(tezVertexID, 0L, 1, null, null, null, true);
                break;
            case VERTEX_FINISHED:
                event = new VertexFinishedEvent(tezVertexID, "v1", 1, random.nextInt(), random.nextInt(), random.nextInt(), random.nextInt(), random.nextInt(), VertexState.ERROR, null, null, null, null, null);
                break;
            case TASK_STARTED:
                event = new TaskStartedEvent(tezTaskID, "v1", random.nextInt(), random.nextInt());
                break;
            case TASK_FINISHED:
                event = new TaskFinishedEvent(tezTaskID, "v1", random.nextInt(), random.nextInt(), tezTaskAttemptID, TaskState.FAILED, null, null, 0);
                break;
            case TASK_ATTEMPT_STARTED:
                event = new TaskAttemptStartedEvent(tezTaskAttemptID, "v1", random.nextInt(), containerId, nodeId, null, null, "nodeHttpAddress");
                break;
            case TASK_ATTEMPT_FINISHED:
                event = new TaskAttemptFinishedEvent(tezTaskAttemptID, "v1", random.nextInt(), random.nextInt(), TaskAttemptState.KILLED, null, TaskAttemptTerminationCause.TERMINATED_BY_CLIENT, null, null, null, null, 0, null, 0, containerId, nodeId, null, null, "nodeHttpAddress");
                break;
            case CONTAINER_LAUNCHED:
                event = new ContainerLaunchedEvent(containerId, random.nextInt(), applicationAttemptId);
                break;
            case CONTAINER_STOPPED:
                event = new ContainerStoppedEvent(containerId, random.nextInt(), -1, applicationAttemptId);
                break;
            case DAG_COMMIT_STARTED:
                event = new DAGCommitStartedEvent();
                break;
            case VERTEX_COMMIT_STARTED:
                event = new VertexCommitStartedEvent();
                break;
            case VERTEX_GROUP_COMMIT_STARTED:
                event = new VertexGroupCommitStartedEvent();
                break;
            case VERTEX_GROUP_COMMIT_FINISHED:
                event = new VertexGroupCommitFinishedEvent();
                break;
            case DAG_RECOVERED:
                event = new DAGRecoveredEvent(applicationAttemptId, tezDAGID, dagPlan.getName(), user, 1L, null);
                break;
            case DAG_KILL_REQUEST:
                event = new DAGKillRequestEvent();
                break;
            default:
                Assert.fail("Unhandled event type " + eventType);
        }
        if (event == null || !event.isHistoryEvent()) {
            continue;
        }
        JSONObject json = HistoryEventJsonConversion.convertToJson(event);
        if (eventType == HistoryEventType.DAG_SUBMITTED) {
            try {
                Assert.assertEquals("Q_" + eventType.name(), json.getJSONObject(ATSConstants.OTHER_INFO).getString(ATSConstants.DAG_QUEUE_NAME));
                Assert.assertEquals("Q_" + eventType.name(), json.getJSONObject(ATSConstants.PRIMARY_FILTERS).getString(ATSConstants.DAG_QUEUE_NAME));
            } catch (JSONException ex) {
                Assert.fail("Exception: " + ex.getMessage() + " for type: " + eventType);
            }
        }
    }
}
Also used : DAGCommitStartedEvent(org.apache.tez.dag.history.events.DAGCommitStartedEvent) Configuration(org.apache.hadoop.conf.Configuration) VertexInitializedEvent(org.apache.tez.dag.history.events.VertexInitializedEvent) HistoryEventType(org.apache.tez.dag.history.HistoryEventType) DAGInitializedEvent(org.apache.tez.dag.history.events.DAGInitializedEvent) ContainerStoppedEvent(org.apache.tez.dag.history.events.ContainerStoppedEvent) DAGKillRequestEvent(org.apache.tez.dag.history.events.DAGKillRequestEvent) DAGStartedEvent(org.apache.tez.dag.history.events.DAGStartedEvent) VertexConfigurationDoneEvent(org.apache.tez.dag.history.events.VertexConfigurationDoneEvent) DAGRecoveredEvent(org.apache.tez.dag.history.events.DAGRecoveredEvent) TaskAttemptFinishedEvent(org.apache.tez.dag.history.events.TaskAttemptFinishedEvent) AMStartedEvent(org.apache.tez.dag.history.events.AMStartedEvent) VertexStartedEvent(org.apache.tez.dag.history.events.VertexStartedEvent) VertexGroupCommitStartedEvent(org.apache.tez.dag.history.events.VertexGroupCommitStartedEvent) JSONException(org.codehaus.jettison.json.JSONException) HistoryEvent(org.apache.tez.dag.history.HistoryEvent) TaskStartedEvent(org.apache.tez.dag.history.events.TaskStartedEvent) TaskAttemptStartedEvent(org.apache.tez.dag.history.events.TaskAttemptStartedEvent) AppLaunchedEvent(org.apache.tez.dag.history.events.AppLaunchedEvent) TaskFinishedEvent(org.apache.tez.dag.history.events.TaskFinishedEvent) JSONObject(org.codehaus.jettison.json.JSONObject) VertexGroupCommitFinishedEvent(org.apache.tez.dag.history.events.VertexGroupCommitFinishedEvent) AMLaunchedEvent(org.apache.tez.dag.history.events.AMLaunchedEvent) ContainerLaunchedEvent(org.apache.tez.dag.history.events.ContainerLaunchedEvent) DAGFinishedEvent(org.apache.tez.dag.history.events.DAGFinishedEvent) VertexFinishedEvent(org.apache.tez.dag.history.events.VertexFinishedEvent) DAGSubmittedEvent(org.apache.tez.dag.history.events.DAGSubmittedEvent) VertexCommitStartedEvent(org.apache.tez.dag.history.events.VertexCommitStartedEvent) Test(org.junit.Test)
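
testHandlerExists iterates over every HistoryEventType value and fails in the default branch, so adding a new enum constant without a matching case is caught by the test. The same "every enum value must be handled" check can be sketched without a switch by comparing the enum's values against the keys of a handler map. EventKind, handlers(), and missing() below are hypothetical names for illustration and do not correspond to anything in Tez.

```java
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.Map;
import java.util.function.Function;

// Hypothetical illustration of an exhaustive-coverage check over an enum; not Tez code.
class HandlerCoverageSketch {

    enum EventKind { STARTED, FINISHED, KILLED }

    // One converter per event kind; a real registry would hold JSON converters or similar.
    static Map<EventKind, Function<String, String>> handlers() {
        Map<EventKind, Function<String, String>> map = new EnumMap<>(EventKind.class);
        map.put(EventKind.STARTED, id -> "started:" + id);
        map.put(EventKind.FINISHED, id -> "finished:" + id);
        map.put(EventKind.KILLED, id -> "killed:" + id);
        return map;
    }

    // Returns the enum constants that have no registered handler.
    static EnumSet<EventKind> missing(Map<EventKind, ?> handlers) {
        EnumSet<EventKind> missing = EnumSet.allOf(EventKind.class);
        missing.removeAll(handlers.keySet());
        return missing;
    }

    public static void main(String[] args) {
        Map<EventKind, Function<String, String>> map = handlers();
        System.out.println("missing with full registry: " + missing(map));
        map.remove(EventKind.KILLED);
        System.out.println("missing after removing one: " + missing(map));
    }
}
```

A test built on this would assert that missing(...) is empty, which fails as soon as a new constant is added without a handler.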

Aggregations

HistoryEvent (org.apache.tez.dag.history.HistoryEvent) 23
Test (org.junit.Test) 10
Path (org.apache.hadoop.fs.Path) 6
TezCounters (org.apache.tez.common.counters.TezCounters) 6
DAG (org.apache.tez.dag.api.DAG) 6
DAGSubmittedEvent (org.apache.tez.dag.history.events.DAGSubmittedEvent) 6
Configuration (org.apache.hadoop.conf.Configuration) 5
TezConfiguration (org.apache.tez.dag.api.TezConfiguration) 5
HistoryEventType (org.apache.tez.dag.history.HistoryEventType) 5
TaskFinishedEvent (org.apache.tez.dag.history.events.TaskFinishedEvent) 5
ArrayList (java.util.ArrayList) 4
TezDAGID (org.apache.tez.dag.records.TezDAGID) 4
IOException (java.io.IOException) 3
FSDataInputStream (org.apache.hadoop.fs.FSDataInputStream) 3
FileSystem (org.apache.hadoop.fs.FileSystem) 3
DAGHistoryEvent (org.apache.tez.dag.history.DAGHistoryEvent) 3
AMLaunchedEvent (org.apache.tez.dag.history.events.AMLaunchedEvent) 3
AMStartedEvent (org.apache.tez.dag.history.events.AMStartedEvent) 3
TaskAttemptFinishedEvent (org.apache.tez.dag.history.events.TaskAttemptFinishedEvent) 3
TaskAttemptStartedEvent (org.apache.tez.dag.history.events.TaskAttemptStartedEvent) 3