Example 1 with ContainerLivenessContext

Use of org.apache.hadoop.yarn.server.nodemanager.executor.ContainerLivenessContext in the Apache Hadoop project.

From the class ContainerExecutor, method reacquireContainer.

/**
 * Recover an already existing container. This is a blocking call and returns
 * only when the container exits. Note that the container must have been
 * activated prior to this call.
 *
 * @param ctx encapsulates information necessary to reacquire container
 * @return The exit code of the pre-existing container
 * @throws IOException if there is a failure while reacquiring the container
 * @throws InterruptedException if interrupted while waiting to reacquire
 * the container
 */
public int reacquireContainer(ContainerReacquisitionContext ctx) throws IOException, InterruptedException {
    Container container = ctx.getContainer();
    String user = ctx.getUser();
    ContainerId containerId = ctx.getContainerId();
    Path pidPath = getPidFilePath(containerId);
    if (pidPath == null) {
        LOG.warn(containerId + " is not active, returning terminated error");
        return ExitCode.TERMINATED.getExitCode();
    }
    String pid = ProcessIdFileReader.getProcessId(pidPath);
    if (pid == null) {
        throw new IOException("Unable to determine pid for " + containerId);
    }
    LOG.info("Reacquiring " + containerId + " with pid " + pid);
    // Build the liveness context, then poll once per second until the container process exits
    ContainerLivenessContext livenessContext = new ContainerLivenessContext.Builder()
            .setContainer(container)
            .setUser(user)
            .setPid(pid)
            .build();
    while (isContainerAlive(livenessContext)) {
        Thread.sleep(1000);
    }
    // wait for exit code file to appear
    final int sleepMsec = 100;
    int msecLeft = 2000;
    String exitCodeFile = ContainerLaunch.getExitCodeFile(pidPath.toString());
    File file = new File(exitCodeFile);
    while (!file.exists() && msecLeft >= 0) {
        if (!isContainerActive(containerId)) {
            LOG.info(containerId + " was deactivated");
            return ExitCode.TERMINATED.getExitCode();
        }
        Thread.sleep(sleepMsec);
        msecLeft -= sleepMsec;
    }
    // Check the file itself: it may have appeared during the final sleep
    if (!file.exists()) {
        throw new IOException("Timeout while waiting for exit code from " + containerId);
    }
    try {
        return Integer.parseInt(FileUtils.readFileToString(file).trim());
    } catch (NumberFormatException e) {
        throw new IOException("Error parsing exit code from pid " + pid, e);
    }
}
Also used: Path (org.apache.hadoop.fs.Path), Container (org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container), ContainerId (org.apache.hadoop.yarn.api.records.ContainerId), IOException (java.io.IOException), File (java.io.File), ContainerLivenessContext (org.apache.hadoop.yarn.server.nodemanager.executor.ContainerLivenessContext)
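
The second half of reacquireContainer waits a bounded time for an exit code file to appear and then parses it. The sketch below isolates that pattern in plain Java so it can be run without Hadoop. The class ExitCodeFilePoller and the method waitForExitCode are hypothetical names that are not part of Hadoop. As a deliberate variation, the sketch uses a monotonic deadline from System.nanoTime() instead of the countdown counter, and java.nio.file.Files instead of FileUtils. It does not check whether the container was deactivated and does not guard against a partially written file.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper for illustration only; not a Hadoop class.
public class ExitCodeFilePoller {

    /**
     * Polls until exitCodeFile exists or timeoutMs has elapsed, then parses
     * the file contents as an integer exit code.
     */
    public static int waitForExitCode(Path exitCodeFile, long timeoutMs, long pollMs)
            throws IOException, InterruptedException {
        // Monotonic deadline, so wall-clock adjustments do not affect the wait
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        while (!Files.exists(exitCodeFile)) {
            if (System.nanoTime() - deadline >= 0) {
                throw new IOException("Timeout while waiting for exit code file " + exitCodeFile);
            }
            Thread.sleep(pollMs);
        }
        String text = new String(Files.readAllBytes(exitCodeFile), StandardCharsets.UTF_8).trim();
        try {
            return Integer.parseInt(text);
        } catch (NumberFormatException e) {
            // Wrap the parse failure, mirroring the error handling in the snippet above
            throw new IOException("Error parsing exit code from " + exitCodeFile, e);
        }
    }

    public static void main(String[] args) throws Exception {
        Path file = Files.createTempFile("container", ".exitcode");
        Files.write(file, "143\n".getBytes(StandardCharsets.UTF_8));
        System.out.println(waitForExitCode(file, 2000, 100)); // prints 143
        Files.delete(file);
    }
}
```

Because the timeout is only checked while the file is still missing, a file that shows up during the last sleep is read rather than reported as a timeout.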
