Example 1 with RMAppEvent

Use of org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppEvent in the Apache Hadoop project.

In class RMAppManager, method recoverApplication:

protected void recoverApplication(ApplicationStateData appState, RMState rmState) throws Exception {
    ApplicationSubmissionContext appContext = appState.getApplicationSubmissionContext();
    ApplicationId appId = appContext.getApplicationId();
    // create and recover app.
    RMAppImpl application = createAndPopulateNewRMApp(appContext, appState.getSubmitTime(), appState.getUser(), true, appState.getStartTime());
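    // If node labels are disabled but the application specifies a node label
    // expression, reject the recovery and give a clear message so that the
    // user can react properly.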
    if (!appContext.getUnmanagedAM() && application.getAMResourceRequest() == null && !YarnConfiguration.areNodeLabelsEnabled(this.conf)) {
        // check application submission context and see if am resource request
        // or application itself contains any node label expression.
        ResourceRequest amReqFromAppContext = appContext.getAMContainerResourceRequest();
        String labelExp = (amReqFromAppContext != null) ? amReqFromAppContext.getNodeLabelExpression() : null;
        if (labelExp == null) {
            labelExp = appContext.getNodeLabelExpression();
        }
        if (labelExp != null && !labelExp.equals(RMNodeLabelsManager.NO_LABEL)) {
            String message = "Failed to recover application " + appId + ". NodeLabel is not enabled in cluster, but AM resource request " + "contains a label expression.";
            LOG.warn(message);
            application.handle(new RMAppEvent(appId, RMAppEventType.APP_REJECTED, message));
            return;
        }
    }
    application.handle(new RMAppRecoverEvent(appId, rmState));
}
Also used : RMAppImpl(org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl) RMAppEvent(org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppEvent) RMAppRecoverEvent(org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppRecoverEvent) ApplicationSubmissionContext(org.apache.hadoop.yarn.api.records.ApplicationSubmissionContext) ResourceRequest(org.apache.hadoop.yarn.api.records.ResourceRequest) ApplicationId(org.apache.hadoop.yarn.api.records.ApplicationId)
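The rejection branch in Example 1 follows a simple pattern: build a diagnostics message, wrap it in an event together with the application id and an event type, and hand the event to a handler. The sketch below illustrates that pattern without any Hadoop dependency. All types in it (EventType, AppEvent, Handler, RecordingHandler) and the method rejectIfLabelled are hypothetical stand-ins, not Hadoop classes, and it assumes that the "no label" value is the empty string.

```java
import java.util.ArrayList;
import java.util.List;

public class RMAppEventSketch {
    // Hypothetical stand-in for RMAppEventType; only the two values used in the examples.
    enum EventType { APP_REJECTED, KILL }

    // Hypothetical stand-in for RMAppEvent: application id, event type, diagnostics.
    static final class AppEvent {
        final String appId;
        final EventType type;
        final String diagnostics;

        AppEvent(String appId, EventType type, String diagnostics) {
            this.appId = appId;
            this.type = type;
            this.diagnostics = diagnostics;
        }
    }

    // Hypothetical stand-in for an event handler.
    interface Handler {
        void handle(AppEvent event);
    }

    // Test double that records every event it receives.
    static final class RecordingHandler implements Handler {
        final List<AppEvent> events = new ArrayList<>();

        @Override
        public void handle(AppEvent event) {
            events.add(event);
        }
    }

    // Sends APP_REJECTED when labels are disabled but a label expression is present.
    // Assumption: an empty string means "no label".
    static boolean rejectIfLabelled(String appId, String labelExp,
                                    boolean labelsEnabled, Handler handler) {
        if (!labelsEnabled && labelExp != null && !labelExp.isEmpty()) {
            String message = "Failed to recover application " + appId
                + ". NodeLabel is not enabled in cluster, but AM resource request"
                + " contains a label expression.";
            handler.handle(new AppEvent(appId, EventType.APP_REJECTED, message));
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        RecordingHandler handler = new RecordingHandler();
        boolean rejected = rejectIfLabelled("application_1_0001", "gpu", false, handler);
        System.out.println("rejected=" + rejected + ", events=" + handler.events.size());
    }
}
```

Passing the handler in as a parameter keeps the decision logic testable: a recording handler can stand in for the real dispatcher.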

Example 2 with RMAppEvent

Use of org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppEvent in the Apache Hadoop project.

In class CapacityScheduler, method addApplicationOnRecovery:

private void addApplicationOnRecovery(ApplicationId applicationId, String queueName, String user, Priority priority) {
    try {
        writeLock.lock();
        CSQueue queue = getQueue(queueName);
        if (queue == null) {
            // The queue no longer exists; removing a queue across a restart
            // is not presently supported.
            if (!YarnConfiguration.shouldRMFailFast(getConfig())) {
                this.rmContext.getDispatcher().getEventHandler().handle(new RMAppEvent(applicationId, RMAppEventType.KILL, "Application killed on recovery as it was submitted to queue " + queueName + " which no longer exists after restart."));
                return;
            } else {
                String queueErrorMsg = "Queue named " + queueName + " missing during application recovery." + " Queue removal during recovery is not presently " + "supported by the capacity scheduler, please " + "restart with all queues configured" + " which were present before shutdown/restart.";
                LOG.fatal(queueErrorMsg);
                throw new QueueInvalidException(queueErrorMsg);
            }
        }
        if (!(queue instanceof LeafQueue)) {
            // The queue is no longer a leaf queue; converting a leaf queue to a
            // parent queue is not supported for running apps.
            if (!YarnConfiguration.shouldRMFailFast(getConfig())) {
                this.rmContext.getDispatcher().getEventHandler().handle(new RMAppEvent(applicationId, RMAppEventType.KILL, "Application killed on recovery as it was submitted to queue " + queueName + " which is no longer a leaf queue after restart."));
                return;
            } else {
                String queueErrorMsg = "Queue named " + queueName + " is no longer a leaf queue during application recovery." + " Changing a leaf queue to a parent queue during recovery is" + " not presently supported by the capacity scheduler. Please" + " restart with leaf queues before shutdown/restart continuing" + " as leaf queues.";
                LOG.fatal(queueErrorMsg);
                throw new QueueInvalidException(queueErrorMsg);
            }
        }
        // Submit to the queue
        try {
            queue.submitApplication(applicationId, user, queueName);
        } catch (AccessControlException ace) {
            // Ignore the exception for recovered app as the app was previously
            // accepted.
        }
        queue.getMetrics().submitApp(user);
        SchedulerApplication<FiCaSchedulerApp> application = new SchedulerApplication<FiCaSchedulerApp>(queue, user, priority);
        applications.put(applicationId, application);
        LOG.info("Accepted application " + applicationId + " from user: " + user + ", in queue: " + queueName);
        if (LOG.isDebugEnabled()) {
            LOG.debug(applicationId + " is recovering. Skip notifying APP_ACCEPTED");
        }
    } finally {
        writeLock.unlock();
    }
}
Also used : RMAppEvent(org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppEvent) SchedulerApplication(org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplication) FiCaSchedulerApp(org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp) AccessControlException(org.apache.hadoop.security.AccessControlException) QueueInvalidException(org.apache.hadoop.yarn.server.resourcemanager.scheduler.QueueInvalidException)
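Example 2 calls writeLock.lock() inside the try block, as the quoted source does. The usage pattern shown in the java.util.concurrent.locks.Lock documentation acquires the lock immediately before try, so that the finally block only calls unlock() when the lock is actually held. The sketch below shows that variant; the class LockBeforeTrySketch and its methods are hypothetical and unrelated to the scheduler's real fields.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class LockBeforeTrySketch {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    // Hypothetical registry mapping an application id to the submitting user.
    private final Map<String, String> applications = new HashMap<>();

    void addApplication(String appId, String user) {
        // Acquire first: if lock() fails, finally is never entered.
        lock.writeLock().lock();
        try {
            applications.put(appId, user);
        } finally {
            lock.writeLock().unlock();
        }
    }

    String userOf(String appId) {
        lock.readLock().lock();
        try {
            return applications.get(appId);
        } finally {
            lock.readLock().unlock();
        }
    }

    // True while any thread holds the write lock.
    boolean isWriteLocked() {
        return lock.isWriteLocked();
    }

    public static void main(String[] args) {
        LockBeforeTrySketch registry = new LockBeforeTrySketch();
        registry.addApplication("application_1_0001", "alice");
        System.out.println(registry.userOf("application_1_0001"));
    }
}
```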

Example 3 with RMAppEvent

Use of org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppEvent in the Apache Hadoop project.

In class FairScheduler, method resolveReservationQueueName:

private String resolveReservationQueueName(String queueName, ApplicationId applicationId, ReservationId reservationID, boolean isRecovering) {
    try {
        readLock.lock();
        FSQueue queue = queueMgr.getQueue(queueName);
        if ((queue == null) || !allocConf.isReservable(queue.getQueueName())) {
            return queueName;
        }
        // Use fully specified name from now on (including root. prefix)
        queueName = queue.getQueueName();
        if (reservationID != null) {
            String resQName = queueName + "." + reservationID.toString();
            queue = queueMgr.getQueue(resQName);
            if (queue == null) {
                // reservation has terminated during failover
                if (isRecovering && allocConf.getMoveOnExpiry(queueName)) {
                    // move to the default child queue of the plan
                    return getDefaultQueueForPlanQueue(queueName);
                }
                String message = "Application " + applicationId + " submitted to a reservation which is not yet " + "currently active: " + resQName;
                this.rmContext.getDispatcher().getEventHandler().handle(new RMAppEvent(applicationId, RMAppEventType.APP_REJECTED, message));
                return null;
            }
            if (!queue.getParent().getQueueName().equals(queueName)) {
                String message = "Application: " + applicationId + " submitted to a reservation " + resQName + " which does not belong to the specified queue: " + queueName;
                this.rmContext.getDispatcher().getEventHandler().handle(new RMAppEvent(applicationId, RMAppEventType.APP_REJECTED, message));
                return null;
            }
            // use the reservation queue to run the app
            queueName = resQName;
        } else {
            // use the default child queue of the plan for unreserved apps
            queueName = getDefaultQueueForPlanQueue(queueName);
        }
        return queueName;
    } finally {
        readLock.unlock();
    }
}
Also used : RMAppEvent(org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppEvent)
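The branching in Example 3 can be easier to follow as a pure function. The sketch below is a simplified illustration only: it leaves out the reservable-queue check, the parent-queue check and the event dispatch, and returns null where the original would reject the application. The class ReservationQueueSketch, its methods, and the "-default" suffix are assumptions made for the illustration; the listing does not show how getDefaultQueueForPlanQueue builds its result.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class ReservationQueueSketch {
    // Assumed suffix for a plan queue's default child queue (not shown in the listing).
    static final String DEFAULT_SUFFIX = "-default";

    static String defaultQueueFor(String planQueue) {
        return planQueue + DEFAULT_SUFFIX;
    }

    // Returns the queue to run in, or null when the application would be rejected.
    static String resolve(String planQueue, String reservationId,
                          Set<String> existingQueues,
                          boolean isRecovering, boolean moveOnExpiry) {
        if (reservationId == null) {
            // Unreserved apps run in the plan's default child queue.
            return defaultQueueFor(planQueue);
        }
        String resQName = planQueue + "." + reservationId;
        if (!existingQueues.contains(resQName)) {
            // The reservation queue is gone, e.g. it ended during failover.
            if (isRecovering && moveOnExpiry) {
                return defaultQueueFor(planQueue);
            }
            return null;
        }
        // Use the reservation queue to run the app.
        return resQName;
    }

    public static void main(String[] args) {
        Set<String> queues = new HashSet<>(Arrays.asList("root.plan.r1"));
        System.out.println(resolve("root.plan", "r1", queues, false, false));
        System.out.println(resolve("root.plan", "r2", queues, true, true));
        System.out.println(resolve("root.plan", "r2", queues, false, true));
    }
}
```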

Example 4 with RMAppEvent

Use of org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppEvent in the Apache Hadoop project.

In class AbstractYarnScheduler, method killAllAppsInQueue:

@Override
public void killAllAppsInQueue(String queueName) throws YarnException {
    try {
        writeLock.lock();
        // check that the queue exists
        List<ApplicationAttemptId> apps = getAppsInQueue(queueName);
        if (apps == null) {
            String errMsg = "The specified Queue: " + queueName + " doesn't exist";
            LOG.warn(errMsg);
            throw new YarnException(errMsg);
        }
        // generate kill events for each pending/running app
        for (ApplicationAttemptId app : apps) {
            this.rmContext.getDispatcher().getEventHandler().handle(new RMAppEvent(app.getApplicationId(), RMAppEventType.KILL, "Application killed due to expiry of reservation queue " + queueName + "."));
        }
    } finally {
        writeLock.unlock();
    }
}
Also used : RMAppEvent(org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppEvent) ApplicationAttemptId(org.apache.hadoop.yarn.api.records.ApplicationAttemptId) YarnException(org.apache.hadoop.yarn.exceptions.YarnException)

Example 5 with RMAppEvent

Use of org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppEvent in the Apache Hadoop project.

In class RMAppLifetimeMonitor, method expire:

@SuppressWarnings("unchecked")
@Override
protected synchronized void expire(RMAppToMonitor monitoredAppKey) {
    ApplicationId appId = monitoredAppKey.getApplicationId();
    RMApp app = rmContext.getRMApps().get(appId);
    if (app == null) {
        return;
    }
    String diagnostics = "Application is killed by ResourceManager as it" + " has exceeded the lifetime period.";
    rmContext.getDispatcher().getEventHandler().handle(new RMAppEvent(appId, RMAppEventType.KILL, diagnostics));
}
Also used : RMAppEvent(org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppEvent) RMApp(org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMApp) ApplicationId(org.apache.hadoop.yarn.api.records.ApplicationId)

Aggregations

RMAppEvent (org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppEvent): 21
ApplicationId (org.apache.hadoop.yarn.api.records.ApplicationId): 8
Test (org.junit.Test): 7
RMApp (org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMApp): 6
YarnConfiguration (org.apache.hadoop.yarn.conf.YarnConfiguration): 5
RMAppImpl (org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl): 5
Event (org.apache.hadoop.yarn.event.Event): 4
SchedulerApplication (org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplication): 4
Configuration (org.apache.hadoop.conf.Configuration): 3
AccessControlException (org.apache.hadoop.security.AccessControlException): 3
ApplicationSubmissionContext (org.apache.hadoop.yarn.api.records.ApplicationSubmissionContext): 3
YarnException (org.apache.hadoop.yarn.exceptions.YarnException): 3
VisibleForTesting (com.google.common.annotations.VisibleForTesting): 2
IOException (java.io.IOException): 2
Credentials (org.apache.hadoop.security.Credentials): 2
SubmitApplicationRequest (org.apache.hadoop.yarn.api.protocolrecords.SubmitApplicationRequest): 2
ApplicationAttemptId (org.apache.hadoop.yarn.api.records.ApplicationAttemptId): 2
ContainerLaunchContext (org.apache.hadoop.yarn.api.records.ContainerLaunchContext): 2
NodeId (org.apache.hadoop.yarn.api.records.NodeId): 2
EventHandler (org.apache.hadoop.yarn.event.EventHandler): 2