
Example 26 with JobTokenIdentifier

use of org.apache.tez.common.security.JobTokenIdentifier in project tez by apache.

the class TestShuffleHandler method testRecovery.

@Test
public void testRecovery() throws IOException {
    final String user = "someuser";
    final ApplicationId appId = ApplicationId.newInstance(12345, 1);
    final JobID jobId = JobID.downgrade(TypeConverter.fromYarn(appId));
    final File tmpDir = new File(System.getProperty("test.build.data", System.getProperty("java.io.tmpdir")), TestShuffleHandler.class.getName());
    Configuration conf = new Configuration();
    conf.setInt(ShuffleHandler.SHUFFLE_PORT_CONFIG_KEY, 0);
    conf.setInt(ShuffleHandler.MAX_SHUFFLE_CONNECTIONS, 3);
    ShuffleHandler shuffle = new ShuffleHandler();
    // emulate aux services startup with recovery enabled
    shuffle.setRecoveryPath(new Path(tmpDir.toString()));
    tmpDir.mkdirs();
    try {
        shuffle.init(conf);
        shuffle.start();
        // setup a shuffle token for an application
        DataOutputBuffer outputBuffer = new DataOutputBuffer();
        outputBuffer.reset();
        Token<JobTokenIdentifier> jt = new Token<JobTokenIdentifier>("identifier".getBytes(), "password".getBytes(), new Text(user), new Text("shuffleService"));
        jt.write(outputBuffer);
        shuffle.initializeApplication(new ApplicationInitializationContext(user, appId, ByteBuffer.wrap(outputBuffer.getData(), 0, outputBuffer.getLength())));
        // verify we are authorized to shuffle
        int rc = getShuffleResponseCode(shuffle, jt);
        Assert.assertEquals(HttpURLConnection.HTTP_OK, rc);
        // emulate shuffle handler restart
        shuffle.close();
        shuffle = new ShuffleHandler();
        shuffle.setRecoveryPath(new Path(tmpDir.toString()));
        shuffle.init(conf);
        shuffle.start();
        // verify we are still authorized to shuffle to the old application
        rc = getShuffleResponseCode(shuffle, jt);
        Assert.assertEquals(HttpURLConnection.HTTP_OK, rc);
        // shutdown app and verify access is lost
        shuffle.stopApplication(new ApplicationTerminationContext(appId));
        rc = getShuffleResponseCode(shuffle, jt);
        Assert.assertEquals(HttpURLConnection.HTTP_UNAUTHORIZED, rc);
        // emulate shuffle handler restart
        shuffle.close();
        shuffle = new ShuffleHandler();
        shuffle.setRecoveryPath(new Path(tmpDir.toString()));
        shuffle.init(conf);
        shuffle.start();
        // verify we still don't have access
        rc = getShuffleResponseCode(shuffle, jt);
        Assert.assertEquals(HttpURLConnection.HTTP_UNAUTHORIZED, rc);
    } finally {
        if (shuffle != null) {
            shuffle.close();
        }
        FileUtil.fullyDelete(tmpDir);
    }
}
Also used : Path(org.apache.hadoop.fs.Path) Configuration(org.apache.hadoop.conf.Configuration) YarnConfiguration(org.apache.hadoop.yarn.conf.YarnConfiguration) JobTokenIdentifier(org.apache.tez.common.security.JobTokenIdentifier) Token(org.apache.hadoop.security.token.Token) Text(org.apache.hadoop.io.Text) DataOutputBuffer(org.apache.hadoop.io.DataOutputBuffer) ApplicationId(org.apache.hadoop.yarn.api.records.ApplicationId) File(java.io.File) JobID(org.apache.hadoop.mapred.JobID) ApplicationInitializationContext(org.apache.hadoop.yarn.server.api.ApplicationInitializationContext) ApplicationTerminationContext(org.apache.hadoop.yarn.server.api.ApplicationTerminationContext) Test(org.junit.Test)
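The test above serializes the token with `jt.write(outputBuffer)` and then wraps only the first `getLength()` bytes of the buffer's backing array. A minimal sketch of why that length bound matters, using only the JDK: `TokenBytesSketch` and its length-prefixed layout are hypothetical stand-ins for illustration and are not the Hadoop `Token` wire format.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.util.Arrays;

// Hypothetical stand-in for a token; this is NOT the Hadoop wire format.
class TokenBytesSketch {
    final byte[] identifier;
    final byte[] password;

    TokenBytesSketch(byte[] identifier, byte[] password) {
        this.identifier = identifier;
        this.password = password;
    }

    // Serialize with length prefixes, then expose only the valid bytes.
    ByteBuffer toByteBuffer() {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(identifier.length);
            out.write(identifier);
            out.writeInt(password.length);
            out.write(password);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        int validLength = bytes.size();
        // Simulate a reusable buffer whose backing array is longer than the data.
        byte[] backing = Arrays.copyOf(bytes.toByteArray(), validLength + 32);
        // Bound the view to the valid region so trailing padding is never read.
        return ByteBuffer.wrap(backing, 0, validLength);
    }

    static TokenBytesSketch fromByteBuffer(ByteBuffer buffer) {
        byte[] raw = new byte[buffer.remaining()];
        // duplicate() leaves the caller's position and limit untouched.
        buffer.duplicate().get(raw);
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(raw))) {
            byte[] identifier = new byte[in.readInt()];
            in.readFully(identifier);
            byte[] password = new byte[in.readInt()];
            in.readFully(password);
            return new TokenBytesSketch(identifier, password);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Wrapping the whole backing array instead would hand the receiver 32 extra zero bytes, which a stricter parser could reject or misread.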

Example 27 with JobTokenIdentifier

use of org.apache.tez.common.security.JobTokenIdentifier in project tez by apache.

the class MRTask method configureMRTask.

private void configureMRTask() throws IOException, InterruptedException {
    Credentials credentials = UserGroupInformation.getCurrentUser().getCredentials();
    jobConf.setCredentials(credentials);
    // TODO Can this be avoided altogether? Have the MRTezOutputCommitter use
    // the Tez parameter.
    // TODO This could be fetched from the env if YARN is setting it for all
    // Containers.
    // Set it in conf, so that it can be used by the OutputCommitter.
    // Not needed. This is probably being set via the source/consumer meta
    Token<JobTokenIdentifier> jobToken = TokenCache.getSessionToken(credentials);
    if (jobToken != null) {
        // Will MR ever run without a job token?
        SecretKey sk = JobTokenSecretManager.createSecretKey(jobToken.getPassword());
        this.jobTokenSecret = sk;
    } else {
        LOG.warn("No job token set");
    }
    configureLocalDirs();
    // Set up the DistributedCache related configs
    setupDistributedCacheConfig(jobConf);
}
Also used : SecretKey(javax.crypto.SecretKey) JobTokenIdentifier(org.apache.tez.common.security.JobTokenIdentifier) Credentials(org.apache.hadoop.security.Credentials)
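The method above turns the job token's password bytes into a `SecretKey`. A minimal JDK-only sketch of that idea follows, together with how such a key could sign and verify a message. The class `JobTokenSecretSketch`, its method names, and the choice of `HmacSHA1` are assumptions made for illustration; they are not taken from `JobTokenSecretManager`.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import javax.crypto.Mac;
import javax.crypto.SecretKey;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical helper; names and algorithm are assumptions for illustration.
class JobTokenSecretSketch {
    // Assumption: an HMAC algorithm available in every JDK.
    static final String ALGORITHM = "HmacSHA1";

    // Wrap the raw password bytes as a key; no derivation or stretching happens here.
    static SecretKey keyFromPassword(byte[] tokenPassword) {
        return new SecretKeySpec(tokenPassword, ALGORITHM);
    }

    // Compute the HMAC of a message under the shared key.
    static byte[] sign(SecretKey key, String message) {
        try {
            Mac mac = Mac.getInstance(ALGORITHM);
            mac.init(key);
            return mac.doFinal(message.getBytes(StandardCharsets.UTF_8));
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("HMAC unavailable", e);
        }
    }

    // Recompute the HMAC and compare in constant time.
    static boolean verify(SecretKey key, String message, byte[] signature) {
        return MessageDigest.isEqual(sign(key, message), signature);
    }
}
```

Because both sides hold the same password bytes, either side can recompute the signature; nothing secret needs to travel with the message itself.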

Example 28 with JobTokenIdentifier

use of org.apache.tez.common.security.JobTokenIdentifier in project hive by apache.

the class LlapTaskSchedulerService method createAmsToken.

private static Token<JobTokenIdentifier> createAmsToken(ApplicationId id) {
    if (!UserGroupInformation.isSecurityEnabled()) {
        return null;
    }
    JobTokenIdentifier identifier = new JobTokenIdentifier(new Text(id.toString()));
    JobTokenSecretManager jobTokenManager = new JobTokenSecretManager();
    Token<JobTokenIdentifier> sessionToken = new Token<>(identifier, jobTokenManager);
    sessionToken.setService(identifier.getJobId());
    return sessionToken;
}
Also used : JobTokenSecretManager(org.apache.tez.common.security.JobTokenSecretManager) JobTokenIdentifier(org.apache.tez.common.security.JobTokenIdentifier) Text(org.apache.hadoop.io.Text) Token(org.apache.hadoop.security.token.Token)

Example 29 with JobTokenIdentifier

use of org.apache.tez.common.security.JobTokenIdentifier in project hive by apache.

the class QueryTracker method registerDag.

public void registerDag(String applicationId, int dagId, String user, Credentials credentials) {
    Token<JobTokenIdentifier> jobToken = TokenCache.getSessionToken(credentials);
    QueryIdentifier queryIdentifier = new QueryIdentifier(applicationId, dagId);
    ReadWriteLock dagLock = getDagLock(queryIdentifier);
    dagLock.readLock().lock();
    try {
        ShuffleHandler.get().registerDag(applicationId, dagId, jobToken, user, null);
    } finally {
        dagLock.readLock().unlock();
    }
}
Also used : ReentrantReadWriteLock(java.util.concurrent.locks.ReentrantReadWriteLock) ReadWriteLock(java.util.concurrent.locks.ReadWriteLock) JobTokenIdentifier(org.apache.tez.common.security.JobTokenIdentifier)
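`registerDag` above takes the read side of a per-query lock around the registration call. The following is a minimal sketch of one way such a per-key lock could be organized; `DagLockSketch`, its string key, and the `completeDag` method are hypothetical and do not reproduce Hive's `QueryTracker`. The assumption is that registrations may run concurrently with each other but must not overlap with cleanup, which takes the write side.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical per-DAG lock registry; not Hive's QueryTracker.
class DagLockSketch {
    private final ConcurrentMap<String, ReadWriteLock> locks = new ConcurrentHashMap<>();
    private final ConcurrentMap<String, String> registeredUsers = new ConcurrentHashMap<>();

    private static String key(String applicationId, int dagId) {
        return applicationId + ":" + dagId;
    }

    // computeIfAbsent guarantees one lock instance per key, even under contention.
    ReadWriteLock getDagLock(String applicationId, int dagId) {
        return locks.computeIfAbsent(key(applicationId, dagId), k -> new ReentrantReadWriteLock());
    }

    // Many registrations may hold the read lock at the same time.
    void registerDag(String applicationId, int dagId, String user) {
        ReadWriteLock lock = getDagLock(applicationId, dagId);
        lock.readLock().lock();
        try {
            registeredUsers.put(key(applicationId, dagId), user);
        } finally {
            lock.readLock().unlock();
        }
    }

    // Cleanup takes the write lock, so it waits for in-flight registrations.
    void completeDag(String applicationId, int dagId) {
        ReadWriteLock lock = getDagLock(applicationId, dagId);
        lock.writeLock().lock();
        try {
            registeredUsers.remove(key(applicationId, dagId));
        } finally {
            lock.writeLock().unlock();
        }
    }

    String registeredUser(String applicationId, int dagId) {
        return registeredUsers.get(key(applicationId, dagId));
    }
}
```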

Example 30 with JobTokenIdentifier

use of org.apache.tez.common.security.JobTokenIdentifier in project hive by apache.

the class ContainerRunnerImpl method submitWork.

@Override
public SubmitWorkResponseProto submitWork(SubmitWorkRequestProto request) throws IOException {
    LlapTokenInfo tokenInfo = null;
    try {
        tokenInfo = LlapTokenChecker.getTokenInfo(clusterId);
    } catch (SecurityException ex) {
        logSecurityErrorRarely(null);
        throw ex;
    }
    SignableVertexSpec vertex = extractVertexSpec(request, tokenInfo);
    TezEvent initialEvent = extractInitialEvent(request, tokenInfo);
    TezTaskAttemptID attemptId = Converters.createTaskAttemptId(vertex.getQueryIdentifier(), vertex.getVertexIndex(), request.getFragmentNumber(), request.getAttemptNumber());
    String fragmentIdString = attemptId.toString();
    QueryIdentifierProto qIdProto = vertex.getQueryIdentifier();
    verifyJwtForExternalClient(request, qIdProto.getApplicationIdString(), fragmentIdString);
    LOG.info("Queueing container for execution: fragmentId={}, {}", fragmentIdString, stringifySubmitRequest(request, vertex));
    HistoryLogger.logFragmentStart(qIdProto.getApplicationIdString(), request.getContainerIdString(), localAddress.get().getHostName(), constructUniqueQueryId(vertex.getHiveQueryId(), qIdProto.getDagIndex()), qIdProto.getDagIndex(), vertex.getVertexName(), request.getFragmentNumber(), request.getAttemptNumber());
    // This is the start of container-annotated logging.
    final String dagId = attemptId.getTaskID().getVertexID().getDAGId().toString();
    final String queryId = vertex.getHiveQueryId();
    final String fragmentId = LlapTezUtils.stripAttemptPrefix(fragmentIdString);
    MDC.put("dagId", dagId);
    MDC.put("queryId", queryId);
    MDC.put("fragmentId", fragmentId);
    // TODO: Ideally we want tez to use CallableWithMdc that retains the MDC for threads created in
    // thread pool. For now, we will push both dagId and queryId into NDC and the custom thread
    // pool that we use for task execution and llap io (StatsRecordingThreadPool) will pop them
    // using reflection and update the MDC.
    NDC.push(dagId);
    NDC.push(queryId);
    NDC.push(fragmentId);
    Scheduler.SubmissionState submissionState;
    SubmitWorkResponseProto.Builder responseBuilder = SubmitWorkResponseProto.newBuilder();
    try {
        Map<String, String> env = new HashMap<>();
        // TODO What else is required in this environment map.
        env.putAll(localEnv);
        env.put(ApplicationConstants.Environment.USER.name(), vertex.getUser());
        TezTaskAttemptID taskAttemptId = TezTaskAttemptID.fromString(fragmentIdString);
        int dagIdentifier = taskAttemptId.getTaskID().getVertexID().getDAGId().getId();
        QueryIdentifier queryIdentifier = new QueryIdentifier(qIdProto.getApplicationIdString(), dagIdentifier);
        Credentials credentials = LlapUtil.credentialsFromByteArray(request.getCredentialsBinary().toByteArray());
        Token<JobTokenIdentifier> jobToken = TokenCache.getSessionToken(credentials);
        LlapNodeId amNodeId = LlapNodeId.getInstance(request.getAmHost(), request.getAmPort());
        QueryFragmentInfo fragmentInfo = queryTracker.registerFragment(queryIdentifier, qIdProto.getApplicationIdString(), dagId, vertex.getDagName(), vertex.getHiveQueryId(), dagIdentifier, vertex.getVertexName(), request.getFragmentNumber(), request.getAttemptNumber(), vertex.getUser(), vertex, jobToken, fragmentIdString, tokenInfo, amNodeId, ugiPool);
        // May need to setup localDir for re-localization, which is usually setup as Environment.PWD.
        // Used for re-localization, to add the user specified configuration (conf_pb_binary_stream)
        // Lazy create conf object, as it gets expensive in this codepath.
        Supplier<Configuration> callableConf = () -> new Configuration(getConfig());
        UserGroupInformation fsTaskUgi = fsUgiFactory == null ? null : fsUgiFactory.createUgi();
        boolean isGuaranteed = request.hasIsGuaranteed() && request.getIsGuaranteed();
        // enable the printing of (per daemon) LLAP task queue/run times via LLAP_TASK_TIME_SUMMARY
        ConfVars tezSummary = ConfVars.TEZ_EXEC_SUMMARY;
        ConfVars llapTasks = ConfVars.LLAP_TASK_TIME_SUMMARY;
        boolean addTaskTimes = getConfig().getBoolean(tezSummary.varname, tezSummary.defaultBoolVal) && getConfig().getBoolean(llapTasks.varname, llapTasks.defaultBoolVal);
        final String llapHost;
        if (UserGroupInformation.isSecurityEnabled()) {
            // when kerberos is enabled always use FQDN
            llapHost = localAddress.get().getHostName();
        } else if (execUseFQDN) {
            // when FQDN is explicitly requested (default)
            llapHost = localAddress.get().getHostName();
        } else {
            // when FQDN is not requested, use ip address
            llapHost = localAddress.get().getAddress().getHostAddress();
        }
        LOG.info("Using llap host: {} for execution context. hostName: {} hostAddress: {}", llapHost, localAddress.get().getHostName(), localAddress.get().getAddress().getHostAddress());
        // TODO: ideally we'd register TezCounters here, but it seems impossible before registerTask.
        WmFragmentCounters wmCounters = new WmFragmentCounters(addTaskTimes);
        TaskRunnerCallable callable = new TaskRunnerCallable(request, fragmentInfo, callableConf, new ExecutionContextImpl(llapHost), env, credentials, memoryPerExecutor, amReporter, confParams, metrics, killedTaskHandler, this, tezHadoopShim, attemptId, vertex, initialEvent, fsTaskUgi, completionListener, socketFactory, isGuaranteed, wmCounters);
        submissionState = executorService.schedule(callable);
        LOG.info("SubmissionState for {} : {} ", fragmentIdString, submissionState);
        if (submissionState.equals(Scheduler.SubmissionState.REJECTED)) {
            // Stop tracking the fragment and report the REJECTED state to the caller.
            fragmentComplete(fragmentInfo);
            return responseBuilder.setSubmissionState(SubmissionStateProto.valueOf(submissionState.name())).build();
        }
        if (metrics != null) {
            metrics.incrExecutorTotalRequestsHandled();
        }
    } finally {
        MDC.clear();
        NDC.clear();
    }
    return responseBuilder.setUniqueNodeId(daemonId.getUniqueNodeIdInCluster()).setSubmissionState(SubmissionStateProto.valueOf(submissionState.name())).build();
}
Also used : LlapTokenInfo(org.apache.hadoop.hive.llap.daemon.impl.LlapTokenChecker.LlapTokenInfo) Configuration(org.apache.hadoop.conf.Configuration) TezConfiguration(org.apache.tez.dag.api.TezConfiguration) HashMap(java.util.HashMap) ByteString(com.google.protobuf.ByteString) UserGroupInformation(org.apache.hadoop.security.UserGroupInformation) WmFragmentCounters(org.apache.hadoop.hive.llap.counters.WmFragmentCounters) ExecutionContextImpl(org.apache.tez.runtime.api.impl.ExecutionContextImpl) JobTokenIdentifier(org.apache.tez.common.security.JobTokenIdentifier) ConfVars(org.apache.hadoop.hive.conf.HiveConf.ConfVars) LlapNodeId(org.apache.hadoop.hive.llap.LlapNodeId) SignableVertexSpec(org.apache.hadoop.hive.llap.daemon.rpc.LlapDaemonProtocolProtos.SignableVertexSpec) QueryIdentifierProto(org.apache.hadoop.hive.llap.daemon.rpc.LlapDaemonProtocolProtos.QueryIdentifierProto) SubmitWorkResponseProto(org.apache.hadoop.hive.llap.daemon.rpc.LlapDaemonProtocolProtos.SubmitWorkResponseProto) NotTezEvent(org.apache.hadoop.hive.llap.daemon.rpc.LlapDaemonProtocolProtos.NotTezEvent) TezEvent(org.apache.tez.runtime.api.impl.TezEvent) Credentials(org.apache.hadoop.security.Credentials) TezTaskAttemptID(org.apache.tez.dag.records.TezTaskAttemptID)
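The host-selection branch in `submitWork` picks the host name when security is enabled or an FQDN is requested, and the IP address otherwise. A minimal JDK-only sketch of that decision is shown here; `LlapHostSketch`, `chooseHost`, and `fixedAddress` are hypothetical names, and the two host-name branches of the original are merged into one condition.

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.UnknownHostException;

// Hypothetical helper mirroring the FQDN-versus-IP decision; not Hive code.
class LlapHostSketch {
    // Host name when security or FQDN is requested, otherwise the literal IP.
    static String chooseHost(InetSocketAddress local, boolean securityEnabled, boolean useFqdn) {
        if (securityEnabled || useFqdn) {
            return local.getHostName();
        }
        return local.getAddress().getHostAddress();
    }

    // Build an address from a fixed name and IPv4 bytes without any DNS lookup.
    static InetSocketAddress fixedAddress(String host, byte[] ipv4, int port) {
        try {
            return new InetSocketAddress(InetAddress.getByAddress(host, ipv4), port);
        } catch (UnknownHostException e) {
            throw new IllegalArgumentException("IPv4 address must be 4 bytes", e);
        }
    }

    public static void main(String[] args) {
        // Example values only; the host name and address are made up.
        InetSocketAddress local = fixedAddress("node1.example.com", new byte[] {10, 0, 0, 5}, 15001);
        System.out.println(chooseHost(local, true, false));
        System.out.println(chooseHost(local, false, false));
    }
}
```

Keeping the decision in a small pure function makes it easy to test without touching the network, since `InetAddress.getByAddress(String, byte[])` performs no name resolution.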

Aggregations

JobTokenIdentifier (org.apache.tez.common.security.JobTokenIdentifier) 31
Token (org.apache.hadoop.security.token.Token) 23
Text (org.apache.hadoop.io.Text) 16
JobTokenSecretManager (org.apache.tez.common.security.JobTokenSecretManager) 12
ApplicationId (org.apache.hadoop.yarn.api.records.ApplicationId) 11
Configuration (org.apache.hadoop.conf.Configuration) 10
Path (org.apache.hadoop.fs.Path) 10
ExecutionContextImpl (org.apache.tez.runtime.api.impl.ExecutionContextImpl) 8
Test (org.junit.Test) 8
IOException (java.io.IOException) 7
ByteBuffer (java.nio.ByteBuffer) 7
DataOutputBuffer (org.apache.hadoop.io.DataOutputBuffer) 7
File (java.io.File) 5
HashMap (java.util.HashMap) 5
DataInputByteBuffer (org.apache.hadoop.io.DataInputByteBuffer) 5
Credentials (org.apache.hadoop.security.Credentials) 5
TezConfiguration (org.apache.tez.dag.api.TezConfiguration) 5
TaskSpec (org.apache.tez.runtime.api.impl.TaskSpec) 4
ByteString (com.google.protobuf.ByteString) 3
JobConf (org.apache.hadoop.mapred.JobConf) 3