Example 1 with DAGAppMasterEventUserServiceFatalError

Use of org.apache.tez.dag.app.dag.event.DAGAppMasterEventUserServiceFatalError in project tez by apache.

In the class TaskSchedulerManager, method setShouldUnregisterFlag:

public void setShouldUnregisterFlag() {
    LOG.info("TaskScheduler notified that it should unregister from RM");
    this.shouldUnregisterFlag.set(true);
    for (int i = 0; i < taskSchedulers.length; i++) {
        if (this.taskSchedulers[i] != null) {
            try {
                this.taskSchedulers[i].setShouldUnregister();
            } catch (Exception e) {
                String msg = "Error in TaskScheduler when setting Unregister Flag" + ", scheduler=" + Utils.getTaskSchedulerIdentifierString(i, appContext);
                LOG.error(msg, e);
                sendEvent(new DAGAppMasterEventUserServiceFatalError(DAGAppMasterEventType.TASK_SCHEDULER_SERVICE_FATAL_ERROR, msg, e));
            }
        }
    }
}
Also used : DAGAppMasterEventUserServiceFatalError(org.apache.tez.dag.app.dag.event.DAGAppMasterEventUserServiceFatalError) TaskLocationHint(org.apache.tez.dag.api.TaskLocationHint) TezUncheckedException(org.apache.tez.dag.api.TezUncheckedException) TezException(org.apache.tez.dag.api.TezException)

Example 2 with DAGAppMasterEventUserServiceFatalError

In the class TaskSchedulerManager, method handleTASucceeded:

private void handleTASucceeded(AMSchedulerEventTAEnded event) {
    TaskAttempt attempt = event.getAttempt();
    ContainerId usedContainerId = event.getUsedContainerId();
    // usedContainerId is null if the attempt ended before a container was assigned to it.
    if (event.getUsedContainerId() != null) {
        sendEvent(new AMContainerEventTASucceeded(usedContainerId, event.getAttemptID()));
        sendEvent(new AMNodeEventTaskAttemptSucceeded(appContext.getAllContainers().get(usedContainerId).getContainer().getNodeId(), event.getSchedulerId(), usedContainerId, event.getAttemptID()));
    }
    boolean wasContainerAllocated = false;
    try {
        wasContainerAllocated = taskSchedulers[event.getSchedulerId()].deallocateTask(attempt, true, null, event.getDiagnostics());
    } catch (Exception e) {
        String msg = "Error in TaskScheduler for handling Task De-allocation" + ", eventType=" + event.getType() + ", scheduler=" + Utils.getTaskSchedulerIdentifierString(event.getSchedulerId(), appContext) + ", taskAttemptId=" + attempt.getID();
        LOG.error(msg, e);
        sendEvent(new DAGAppMasterEventUserServiceFatalError(DAGAppMasterEventType.TASK_SCHEDULER_SERVICE_FATAL_ERROR, msg, e));
        return;
    }
    if (!wasContainerAllocated) {
        LOG.error("De-allocated successful task: " + attempt.getID() + ", but TaskScheduler reported no container assigned to task");
    }
}
Also used : DAGAppMasterEventUserServiceFatalError(org.apache.tez.dag.app.dag.event.DAGAppMasterEventUserServiceFatalError) AMContainerEventTASucceeded(org.apache.tez.dag.app.rm.container.AMContainerEventTASucceeded) ContainerId(org.apache.hadoop.yarn.api.records.ContainerId) AMNodeEventTaskAttemptSucceeded(org.apache.tez.dag.app.rm.node.AMNodeEventTaskAttemptSucceeded) TaskAttempt(org.apache.tez.dag.app.dag.TaskAttempt) TezUncheckedException(org.apache.tez.dag.api.TezUncheckedException) TezException(org.apache.tez.dag.api.TezException)

Example 3 with DAGAppMasterEventUserServiceFatalError

In the class TaskSchedulerManager, method handleContainerDeallocate:

private void handleContainerDeallocate(AMSchedulerEventDeallocateContainer event) {
    ContainerId containerId = event.getContainerId();
    try {
        taskSchedulers[event.getSchedulerId()].deallocateContainer(containerId);
    } catch (Exception e) {
        String msg = "Error in TaskScheduler for handling Container De-allocation" + ", eventType=" + event.getType() + ", scheduler=" + Utils.getTaskSchedulerIdentifierString(event.getSchedulerId(), appContext) + ", containerId=" + containerId;
        LOG.error(msg, e);
        sendEvent(new DAGAppMasterEventUserServiceFatalError(DAGAppMasterEventType.TASK_SCHEDULER_SERVICE_FATAL_ERROR, msg, e));
        return;
    }
    // TODO does this container need to be stopped via C_STOP_REQUEST
    sendEvent(new AMContainerEventStopRequest(containerId));
}
Also used : DAGAppMasterEventUserServiceFatalError(org.apache.tez.dag.app.dag.event.DAGAppMasterEventUserServiceFatalError) ContainerId(org.apache.hadoop.yarn.api.records.ContainerId) AMContainerEventStopRequest(org.apache.tez.dag.app.rm.container.AMContainerEventStopRequest) TezUncheckedException(org.apache.tez.dag.api.TezUncheckedException) TezException(org.apache.tez.dag.api.TezException)
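
Examples 1 to 3 share one pattern: a call into a pluggable scheduler is wrapped in a try/catch, the failure is turned into a diagnostic message that names the scheduler and the affected entity, a fatal-error event is sent, and the method returns before any follow-up work. The sketch below isolates that pattern with plain Java. It is not the Tez implementation: SchedulerPlugin, FatalErrorEvent and deallocate are hypothetical stand-ins, and logging is left out to keep the block self-contained.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

public class FatalErrorPatternSketch {

    /** Hypothetical stand-in for a pluggable scheduler whose calls may throw. */
    interface SchedulerPlugin {
        void deallocateContainer(String containerId) throws Exception;
    }

    /** Hypothetical stand-in for DAGAppMasterEventUserServiceFatalError. */
    static final class FatalErrorEvent {
        final String diagnostics;
        final Throwable cause;

        FatalErrorEvent(String diagnostics, Throwable cause) {
            this.diagnostics = diagnostics;
            this.cause = cause;
        }
    }

    /**
     * Calls the plugin. On failure, sends exactly one fatal-error event and
     * returns false so the caller can skip its follow-up work.
     */
    static boolean deallocate(SchedulerPlugin plugin, int schedulerId, String containerId,
                              Consumer<FatalErrorEvent> eventSink) {
        try {
            plugin.deallocateContainer(containerId);
            return true;
        } catch (Exception e) {
            // Include enough context to identify the failing plugin and entity.
            String msg = "Error in scheduler plugin during container de-allocation"
                + ", scheduler=" + schedulerId + ", containerId=" + containerId;
            eventSink.accept(new FatalErrorEvent(msg, e));
            return false;
        }
    }

    public static void main(String[] args) {
        List<FatalErrorEvent> sent = new ArrayList<>();
        SchedulerPlugin failing = id -> { throw new IllegalStateException("boom"); };
        boolean ok = deallocate(failing, 0, "container_1", sent::add);
        System.out.println(ok + " " + sent.size() + " " + sent.get(0).diagnostics);
    }
}
```

Returning a boolean is a choice made for this sketch. The methods above are void and simply return from the catch block.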

Example 4 with DAGAppMasterEventUserServiceFatalError

In the class TestContainerLauncherManager, method testReportFailureFromContainerLauncher:

@SuppressWarnings("unchecked")
@Test(timeout = 5000)
public void testReportFailureFromContainerLauncher() throws ServicePluginException, TezException {
    final String dagName = DAG_NAME;
    final int dagIndex = DAG_INDEX;
    TezDAGID dagId = TezDAGID.getInstance(ApplicationId.newInstance(0, 0), dagIndex);
    DAG dag = mock(DAG.class);
    doReturn(dagName).when(dag).getName();
    doReturn(dagId).when(dag).getID();
    EventHandler eventHandler = mock(EventHandler.class);
    AppContext appContext = mock(AppContext.class);
    doReturn(eventHandler).when(appContext).getEventHandler();
    doReturn(dag).when(appContext).getCurrentDAG();
    doReturn("testlauncher").when(appContext).getContainerLauncherName(0);
    NamedEntityDescriptor<TaskCommunicatorDescriptor> taskCommDescriptor = new NamedEntityDescriptor<>("testlauncher", ContainerLauncherForTest.class.getName());
    List<NamedEntityDescriptor> list = new LinkedList<>();
    list.add(taskCommDescriptor);
    ContainerLauncherManager containerLauncherManager = new ContainerLauncherManager(appContext, mock(TaskCommunicatorManagerInterface.class), "", list, false);
    try {
        ContainerLaunchContext clc1 = mock(ContainerLaunchContext.class);
        Container container1 = mock(Container.class);
        ContainerLauncherLaunchRequestEvent launchRequestEvent = new ContainerLauncherLaunchRequestEvent(clc1, container1, 0, 0, 0);
        containerLauncherManager.handle(launchRequestEvent);
        ArgumentCaptor<Event> argumentCaptor = ArgumentCaptor.forClass(Event.class);
        verify(eventHandler, times(1)).handle(argumentCaptor.capture());
        Event rawEvent = argumentCaptor.getValue();
        assertTrue(rawEvent instanceof DAGAppMasterEventUserServiceFatalError);
        DAGAppMasterEventUserServiceFatalError event = (DAGAppMasterEventUserServiceFatalError) rawEvent;
        assertEquals(DAGAppMasterEventType.CONTAINER_LAUNCHER_SERVICE_FATAL_ERROR, event.getType());
        assertTrue(event.getDiagnosticInfo().contains("ReportedFatalError"));
        assertTrue(event.getDiagnosticInfo().contains(ServicePluginErrorDefaults.INCONSISTENT_STATE.name()));
        assertTrue(event.getDiagnosticInfo().contains("[0:testlauncher]"));
        reset(eventHandler);
        // stop container
        ContainerId containerId2 = mock(ContainerId.class);
        NodeId nodeId2 = mock(NodeId.class);
        ContainerLauncherStopRequestEvent stopRequestEvent = new ContainerLauncherStopRequestEvent(containerId2, nodeId2, null, 0, 0, 0);
        argumentCaptor = ArgumentCaptor.forClass(Event.class);
        containerLauncherManager.handle(stopRequestEvent);
        verify(eventHandler, times(1)).handle(argumentCaptor.capture());
        rawEvent = argumentCaptor.getValue();
        assertTrue(rawEvent instanceof DAGEventTerminateDag);
        DAGEventTerminateDag killEvent = (DAGEventTerminateDag) rawEvent;
        assertTrue(killEvent.getDiagnosticInfo().contains("ReportError"));
        assertTrue(killEvent.getDiagnosticInfo().contains(ServicePluginErrorDefaults.SERVICE_UNAVAILABLE.name()));
        assertTrue(killEvent.getDiagnosticInfo().contains("[0:testlauncher]"));
    } finally {
        containerLauncherManager.stop();
    }
}
Also used : DAGAppMasterEventUserServiceFatalError(org.apache.tez.dag.app.dag.event.DAGAppMasterEventUserServiceFatalError) AppContext(org.apache.tez.dag.app.AppContext) EventHandler(org.apache.hadoop.yarn.event.EventHandler) DAG(org.apache.tez.dag.app.dag.DAG) ContainerLaunchContext(org.apache.hadoop.yarn.api.records.ContainerLaunchContext) TaskCommunicatorManagerInterface(org.apache.tez.dag.app.TaskCommunicatorManagerInterface) DAGEventTerminateDag(org.apache.tez.dag.app.dag.event.DAGEventTerminateDag) NamedEntityDescriptor(org.apache.tez.dag.api.NamedEntityDescriptor) LinkedList(java.util.LinkedList) ContainerLauncherLaunchRequestEvent(org.apache.tez.dag.app.rm.ContainerLauncherLaunchRequestEvent) TaskCommunicatorDescriptor(org.apache.tez.serviceplugins.api.TaskCommunicatorDescriptor) Container(org.apache.hadoop.yarn.api.records.Container) ContainerId(org.apache.hadoop.yarn.api.records.ContainerId) TezDAGID(org.apache.tez.dag.records.TezDAGID) NodeId(org.apache.hadoop.yarn.api.records.NodeId) Event(org.apache.hadoop.yarn.event.Event) ContainerLauncherStopRequestEvent(org.apache.tez.dag.app.rm.ContainerLauncherStopRequestEvent) DagInfoImplForTest(org.apache.tez.dag.helpers.DagInfoImplForTest) Test(org.junit.Test)

Example 5 with DAGAppMasterEventUserServiceFatalError

In the class TestTaskCommunicatorManager, method testReportFailureFromTaskCommunicator:

@SuppressWarnings("unchecked")
@Test(timeout = 5000)
public void testReportFailureFromTaskCommunicator() throws TezException {
    String dagName = DAG_NAME;
    EventHandler eventHandler = mock(EventHandler.class);
    AppContext appContext = mock(AppContext.class, RETURNS_DEEP_STUBS);
    doReturn("testTaskCommunicator").when(appContext).getTaskCommunicatorName(0);
    doReturn(eventHandler).when(appContext).getEventHandler();
    DAG dag = mock(DAG.class);
    TezDAGID dagId = TezDAGID.getInstance(ApplicationId.newInstance(1, 0), DAG_INDEX);
    doReturn(dagName).when(dag).getName();
    doReturn(dagId).when(dag).getID();
    doReturn(dag).when(appContext).getCurrentDAG();
    NamedEntityDescriptor<TaskCommunicatorDescriptor> namedEntityDescriptor = new NamedEntityDescriptor<>("testTaskCommunicator", TaskCommForFailureTest.class.getName());
    List<NamedEntityDescriptor> list = new LinkedList<>();
    list.add(namedEntityDescriptor);
    TaskCommunicatorManager taskCommManager = new TaskCommunicatorManager(appContext, mock(TaskHeartbeatHandler.class), mock(ContainerHeartbeatHandler.class), list);
    try {
        taskCommManager.init(new Configuration());
        taskCommManager.start();
        taskCommManager.registerRunningContainer(mock(ContainerId.class), 0);
        ArgumentCaptor<Event> argumentCaptor = ArgumentCaptor.forClass(Event.class);
        verify(eventHandler, times(1)).handle(argumentCaptor.capture());
        Event rawEvent = argumentCaptor.getValue();
        assertTrue(rawEvent instanceof DAGEventTerminateDag);
        DAGEventTerminateDag killEvent = (DAGEventTerminateDag) rawEvent;
        assertTrue(killEvent.getDiagnosticInfo().contains("ReportError"));
        assertTrue(killEvent.getDiagnosticInfo().contains(ServicePluginErrorDefaults.SERVICE_UNAVAILABLE.name()));
        assertTrue(killEvent.getDiagnosticInfo().contains("[0:testTaskCommunicator]"));
        reset(eventHandler);
        taskCommManager.dagComplete(dag);
        argumentCaptor = ArgumentCaptor.forClass(Event.class);
        verify(eventHandler, times(1)).handle(argumentCaptor.capture());
        rawEvent = argumentCaptor.getValue();
        assertTrue(rawEvent instanceof DAGAppMasterEventUserServiceFatalError);
        DAGAppMasterEventUserServiceFatalError event = (DAGAppMasterEventUserServiceFatalError) rawEvent;
        assertEquals(DAGAppMasterEventType.TASK_COMMUNICATOR_SERVICE_FATAL_ERROR, event.getType());
        assertTrue(event.getDiagnosticInfo().contains("ReportedFatalError"));
        assertTrue(event.getDiagnosticInfo().contains(ServicePluginErrorDefaults.INCONSISTENT_STATE.name()));
        assertTrue(event.getDiagnosticInfo().contains("[0:testTaskCommunicator]"));
    } finally {
        taskCommManager.stop();
    }
}
Also used : DAGAppMasterEventUserServiceFatalError(org.apache.tez.dag.app.dag.event.DAGAppMasterEventUserServiceFatalError) Configuration(org.apache.hadoop.conf.Configuration) EventHandler(org.apache.hadoop.yarn.event.EventHandler) DAG(org.apache.tez.dag.app.dag.DAG) DAGEventTerminateDag(org.apache.tez.dag.app.dag.event.DAGEventTerminateDag) NamedEntityDescriptor(org.apache.tez.dag.api.NamedEntityDescriptor) LinkedList(java.util.LinkedList) TaskCommunicatorDescriptor(org.apache.tez.serviceplugins.api.TaskCommunicatorDescriptor) ContainerId(org.apache.hadoop.yarn.api.records.ContainerId) TezDAGID(org.apache.tez.dag.records.TezDAGID) Event(org.apache.hadoop.yarn.event.Event) DagInfoImplForTest(org.apache.tez.dag.helpers.DagInfoImplForTest) Test(org.junit.Test)
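
Examples 4 and 5 assert on two distinct outcomes: a reported error (SERVICE_UNAVAILABLE, "ReportError") arrives as a DAGEventTerminateDag, while a fatal error (INCONSISTENT_STATE, "ReportedFatalError") arrives as a DAGAppMasterEventUserServiceFatalError, and both carry an "[index:name]" tag in their diagnostics. The sketch below imitates that routing and the capture-then-inspect style of the tests without Mockito. Every type in it (Event, TerminateDagEvent, FatalErrorEvent, RecordingHandler, PluginContext) is a hypothetical stand-in, and the routing is inferred from the assertions above rather than taken from the Tez sources.

```java
import java.util.ArrayList;
import java.util.List;

public class ErrorRoutingSketch {

    /** Hypothetical event base type carrying only a diagnostics string. */
    static class Event {
        final String diagnostics;

        Event(String diagnostics) {
            this.diagnostics = diagnostics;
        }
    }

    /** Stand-in for DAGEventTerminateDag: only the current DAG is affected. */
    static final class TerminateDagEvent extends Event {
        TerminateDagEvent(String diagnostics) {
            super(diagnostics);
        }
    }

    /** Stand-in for DAGAppMasterEventUserServiceFatalError. */
    static final class FatalErrorEvent extends Event {
        FatalErrorEvent(String diagnostics) {
            super(diagnostics);
        }
    }

    /** Records every event it receives, playing the role of the mocked EventHandler. */
    static final class RecordingHandler {
        final List<Event> events = new ArrayList<>();

        void handle(Event e) {
            events.add(e);
        }
    }

    /** Hypothetical plugin context that tags diagnostics with "[index:name]". */
    static final class PluginContext {
        private final RecordingHandler handler;
        private final String tag;

        PluginContext(RecordingHandler handler, int index, String name) {
            this.handler = handler;
            this.tag = "[" + index + ":" + name + "]";
        }

        // Reported error: routed to a DAG-level terminate event.
        void reportError(String errorName, String message) {
            handler.handle(new TerminateDagEvent(tag + " " + errorName + ": " + message));
        }

        // Fatal error: routed to an application-master-level fatal-error event.
        void fatalError(String errorName, String message) {
            handler.handle(new FatalErrorEvent(tag + " " + errorName + ": " + message));
        }
    }

    public static void main(String[] args) {
        RecordingHandler handler = new RecordingHandler();
        PluginContext ctx = new PluginContext(handler, 0, "testlauncher");
        ctx.reportError("SERVICE_UNAVAILABLE", "ReportError");
        ctx.fatalError("INCONSISTENT_STATE", "ReportedFatalError");
        for (Event e : handler.events) {
            System.out.println(e.getClass().getSimpleName() + " -> " + e.diagnostics);
        }
    }
}
```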

Aggregations

DAGAppMasterEventUserServiceFatalError (org.apache.tez.dag.app.dag.event.DAGAppMasterEventUserServiceFatalError) 21
TezUncheckedException (org.apache.tez.dag.api.TezUncheckedException) 12
TezException (org.apache.tez.dag.api.TezException) 11
ContainerId (org.apache.hadoop.yarn.api.records.ContainerId) 9
Event (org.apache.hadoop.yarn.event.Event) 6
EventHandler (org.apache.hadoop.yarn.event.EventHandler) 6
DagInfoImplForTest (org.apache.tez.dag.helpers.DagInfoImplForTest) 6
Test (org.junit.Test) 6
Configuration (org.apache.hadoop.conf.Configuration) 5
IOException (java.io.IOException) 4
NodeId (org.apache.hadoop.yarn.api.records.NodeId) 4
TaskLocationHint (org.apache.tez.dag.api.TaskLocationHint) 4
AppContext (org.apache.tez.dag.app.AppContext) 4
DAG (org.apache.tez.dag.app.dag.DAG) 4
TaskAttempt (org.apache.tez.dag.app.dag.TaskAttempt) 4
InvocationTargetException (java.lang.reflect.InvocationTargetException) 3
LinkedList (java.util.LinkedList) 3
NamedEntityDescriptor (org.apache.tez.dag.api.NamedEntityDescriptor) 3
DAGEventTerminateDag (org.apache.tez.dag.app.dag.event.DAGEventTerminateDag) 3
ContainerLauncherLaunchRequestEvent (org.apache.tez.dag.app.rm.ContainerLauncherLaunchRequestEvent) 3