Example 1 with ExplainTask

Use of org.apache.hadoop.hive.ql.exec.ExplainTask in project hive by apache.

From the class TestUpdateDeleteSemanticAnalyzer, method explain:

private String explain(SemanticAnalyzer sem, QueryPlan plan) throws IOException {
    FileSystem fs = FileSystem.get(conf);
    // Write the explain output to a temp file that is cleaned up on JVM exit.
    File f = File.createTempFile("TestSemanticAnalyzer", "explain");
    Path tmp = new Path(f.getPath());
    // Close the stream returned by create(); ExplainTask reopens the file itself.
    fs.create(tmp).close();
    fs.deleteOnExit(tmp);
    ExplainConfiguration config = new ExplainConfiguration();
    config.setExtended(true);
    ExplainWork work = new ExplainWork(tmp, sem.getParseContext(), sem.getRootTasks(), sem.getFetchTask(), sem, config, null);
    ExplainTask task = new ExplainTask();
    task.setWork(work);
    task.initialize(queryState, plan, null, null);
    task.execute(null);
    FSDataInputStream in = fs.open(tmp);
    StringBuilder builder = new StringBuilder();
    final int bufSz = 4096;
    byte[] buf = new byte[bufSz];
    long pos = 0L;
    while (true) {
        int bytesRead = in.read(pos, buf, 0, bufSz);
        if (bytesRead > 0) {
            pos += bytesRead;
            builder.append(new String(buf, 0, bytesRead));
        } else {
            // Reached end of file
            in.close();
            break;
        }
    }
    return builder.toString()
            .replaceAll("pfile:/.*\n", "pfile:MASKED-OUT\n")
            .replaceAll("location file:/.*\n", "location file:MASKED-OUT\n")
            .replaceAll("file:/.*\n", "file:MASKED-OUT\n")
            .replaceAll("transient_lastDdlTime.*\n", "transient_lastDdlTime MASKED-OUT\n");
}
Also used: Path (org.apache.hadoop.fs.Path), ExplainTask (org.apache.hadoop.hive.ql.exec.ExplainTask), FileSystem (org.apache.hadoop.fs.FileSystem), ExplainWork (org.apache.hadoop.hive.ql.plan.ExplainWork), FSDataInputStream (org.apache.hadoop.fs.FSDataInputStream), File (java.io.File)
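
The replaceAll chain at the end is what makes this test's output diff-stable: temp-file paths and DDL timestamps vary between runs, so they are masked before comparison. A minimal standalone sketch of the same idiom (the sample input is hypothetical, not real Hive output):

public class MaskingDemo {
    public static void main(String[] args) {
        String explain = "location file:/tmp/warehouse/acid_table\n"
                + "transient_lastDdlTime 1487651234\n"
                + "Stage-0 is a root stage\n";
        // Normalize run-specific values exactly as the test above does.
        String masked = explain
                .replaceAll("location file:/.*\n", "location file:MASKED-OUT\n")
                .replaceAll("transient_lastDdlTime.*\n", "transient_lastDdlTime MASKED-OUT\n");
        // Prints the two masked lines followed by "Stage-0 is a root stage".
        System.out.print(masked);
    }
}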

Example 2 with ExplainTask

Use of org.apache.hadoop.hive.ql.exec.ExplainTask in project hive by apache.

From the class ExplainSemanticAnalyzer, method skipAuthorization:

@Override
public boolean skipAuthorization() {
    List<Task<? extends Serializable>> rootTasks = getRootTasks();
    assert rootTasks != null && rootTasks.size() == 1;
    Task task = rootTasks.get(0);
    return task instanceof ExplainTask && ((ExplainTask) task).getWork().isAuthorize();
}
Also used: Task (org.apache.hadoop.hive.ql.exec.Task), FetchTask (org.apache.hadoop.hive.ql.exec.FetchTask), StatsTask (org.apache.hadoop.hive.ql.exec.StatsTask), ExplainTask (org.apache.hadoop.hive.ql.exec.ExplainTask), Serializable (java.io.Serializable)
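
The isAuthorize() flag read here is set during parsing when the statement is EXPLAIN AUTHORIZATION; the KW_AUTHORIZATION branch in Example 4 below is the producer side. A short sketch of that hand-off, assuming ExplainWork surfaces the flag from its ExplainConfiguration:

    // Producer side (during parse; see Example 4): mark this explain as
    // EXPLAIN AUTHORIZATION.
    ExplainConfiguration config = new ExplainConfiguration();
    config.setAuthorize(true);
    // Consumer side (this example): skipAuthorization() returns true, so the
    // driver does not authorize the EXPLAIN statement itself.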

Example 3 with ExplainTask

Use of org.apache.hadoop.hive.ql.exec.ExplainTask in project hive by apache.

From the class Driver, method getExplainOutput:

/**
   * Returns EXPLAIN EXTENDED output for a semantically
   * analyzed query.
   *
   * @param sem semantic analyzer for the analyzed query
   * @param plan query plan
   * @param astTree AST tree dump
   * @return the explain output, or null if generating it failed
   * @throws java.io.IOException
   */
private String getExplainOutput(BaseSemanticAnalyzer sem, QueryPlan plan, ASTNode astTree) throws IOException {
    String ret = null;
    ExplainTask task = new ExplainTask();
    task.initialize(queryState, plan, null, ctx.getOpContext());
    ByteArrayOutputStream baos = new ByteArrayOutputStream();
    PrintStream ps = new PrintStream(baos);
    try {
        List<Task<?>> rootTasks = sem.getAllRootTasks();
        task.getJSONPlan(ps, rootTasks, sem.getFetchTask(), /* jsonOutput */ false, /* isExtended */ true, /* appendTaskType */ true);
        ret = baos.toString();
    } catch (Exception e) {
        LOG.warn("Exception generating explain output: " + e, e);
    }
    return ret;
}
Also used: PrintStream (java.io.PrintStream), ConditionalTask (org.apache.hadoop.hive.ql.exec.ConditionalTask), Task (org.apache.hadoop.hive.ql.exec.Task), FetchTask (org.apache.hadoop.hive.ql.exec.FetchTask), ExplainTask (org.apache.hadoop.hive.ql.exec.ExplainTask), ByteArrayOutputStream (java.io.ByteArrayOutputStream), LockException (org.apache.hadoop.hive.ql.lockmgr.LockException), IOException (java.io.IOException), HiveException (org.apache.hadoop.hive.ql.metadata.HiveException), AuthorizationException (org.apache.hadoop.hive.ql.metadata.AuthorizationException)
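
The ByteArrayOutputStream/PrintStream pair is a plain-JDK capture-to-string pattern that can be tried in isolation; the printed line below merely stands in for what getJSONPlan writes:

import java.io.ByteArrayOutputStream;
import java.io.PrintStream;

public class CaptureDemo {
    public static void main(String[] args) {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        PrintStream ps = new PrintStream(baos);
        // Stand-in for task.getJSONPlan(ps, ...), which prints the plan to ps.
        ps.println("STAGE DEPENDENCIES:");
        // flush() is a no-op for ByteArrayOutputStream, but good hygiene in
        // case the sink is ever a buffered stream.
        ps.flush();
        System.out.print(baos.toString());
    }
}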

Example 4 with ExplainTask

Use of org.apache.hadoop.hive.ql.exec.ExplainTask in project hive by apache.

From the class ExplainSemanticAnalyzer, method analyzeInternal:

@SuppressWarnings("unchecked")
@Override
public void analyzeInternal(ASTNode ast) throws SemanticException {
    final int childCount = ast.getChildCount();
    // Child 0 is the query itself; the remaining children are EXPLAIN options.
    int i = 1;
    while (i < childCount) {
        int explainOptions = ast.getChild(i).getType();
        if (explainOptions == HiveParser.KW_FORMATTED) {
            config.setFormatted(true);
        } else if (explainOptions == HiveParser.KW_EXTENDED) {
            config.setExtended(true);
        } else if (explainOptions == HiveParser.KW_DEPENDENCY) {
            config.setDependency(true);
        } else if (explainOptions == HiveParser.KW_LOGICAL) {
            config.setLogical(true);
        } else if (explainOptions == HiveParser.KW_AUTHORIZATION) {
            config.setAuthorize(true);
        } else if (explainOptions == HiveParser.KW_ANALYZE) {
            config.setAnalyze(AnalyzeState.RUNNING);
            config.setExplainRootPath(ctx.getMRTmpPath());
        } else if (explainOptions == HiveParser.KW_VECTORIZATION) {
            config.setVectorization(true);
            if (i + 1 < childCount) {
                int vectorizationOption = ast.getChild(i + 1).getType();
                // [ONLY]
                if (vectorizationOption == HiveParser.TOK_ONLY) {
                    config.setVectorizationOnly(true);
                    i++;
                    if (i + 1 >= childCount) {
                        break;
                    }
                    vectorizationOption = ast.getChild(i + 1).getType();
                }
                // [SUMMARY|OPERATOR|EXPRESSION|DETAIL]
                if (vectorizationOption == HiveParser.TOK_SUMMARY) {
                    config.setVectorizationDetailLevel(VectorizationDetailLevel.SUMMARY);
                    i++;
                } else if (vectorizationOption == HiveParser.TOK_OPERATOR) {
                    config.setVectorizationDetailLevel(VectorizationDetailLevel.OPERATOR);
                    i++;
                } else if (vectorizationOption == HiveParser.TOK_EXPRESSION) {
                    config.setVectorizationDetailLevel(VectorizationDetailLevel.EXPRESSION);
                    i++;
                } else if (vectorizationOption == HiveParser.TOK_DETAIL) {
                    config.setVectorizationDetailLevel(VectorizationDetailLevel.DETAIL);
                    i++;
                }
            }
        } else {
        // UNDONE: UNKNOWN OPTION?
        }
        i++;
    }
    ctx.setExplainConfig(config);
    ASTNode input = (ASTNode) ast.getChild(0);
    // step 2 (ANALYZE_STATE.ANALYZING), explain the query and provide the runtime #rows collected.
    if (config.getAnalyze() == AnalyzeState.RUNNING) {
        String query = ctx.getTokenRewriteStream().toString(input.getTokenStartIndex(), input.getTokenStopIndex());
        LOG.info("Explain analyze (running phase) for query " + query);
        Context runCtx = null;
        try {
            runCtx = new Context(conf);
            // runCtx and ctx share the configuration
            runCtx.setExplainConfig(config);
            Driver driver = new Driver(conf, runCtx);
            CommandProcessorResponse ret = driver.run(query);
            if (ret.getResponseCode() == 0) {
                // The query ran only to collect runtime stats; drain and discard its results.
                while (driver.getResults(new ArrayList<String>())) {
                }
            } else {
                throw new SemanticException(ret.getErrorMessage(), ret.getException());
            }
            config.setOpIdToRuntimeNumRows(aggregateStats(config.getExplainRootPath()));
        } catch (IOException e1) {
            throw new SemanticException(e1);
        } catch (CommandNeedRetryException e) {
            throw new SemanticException(e);
        }
        ctx.resetOpContext();
        ctx.resetStream();
        TaskFactory.resetId();
        LOG.info("Explain analyze (analyzing phase) for query " + query);
        config.setAnalyze(AnalyzeState.ANALYZING);
    }
    BaseSemanticAnalyzer sem = SemanticAnalyzerFactory.get(queryState, input);
    sem.analyze(input, ctx);
    sem.validate();
    ctx.setResFile(ctx.getLocalTmpPath());
    List<Task<? extends Serializable>> tasks = sem.getAllRootTasks();
    if (tasks == null) {
        tasks = Collections.emptyList();
    }
    FetchTask fetchTask = sem.getFetchTask();
    if (fetchTask != null) {
        // Initialize fetch work such that operator tree will be constructed.
        fetchTask.getWork().initializeForFetch(ctx.getOpContext());
    }
    ParseContext pCtx = null;
    if (sem instanceof SemanticAnalyzer) {
        pCtx = ((SemanticAnalyzer) sem).getParseContext();
    }
    config.setUserLevelExplain(!config.isExtended()
            && !config.isFormatted()
            && !config.isDependency()
            && !config.isLogical()
            && !config.isAuthorize()
            && HiveConf.getBoolVar(ctx.getConf(), HiveConf.ConfVars.HIVE_EXPLAIN_USER)
            && HiveConf.getVar(conf, HiveConf.ConfVars.HIVE_EXECUTION_ENGINE).equals("tez"));
    ExplainWork work = new ExplainWork(ctx.getResFile(), pCtx, tasks, fetchTask, sem, config, ctx.getCboInfo());
    work.setAppendTaskType(HiveConf.getBoolVar(conf, HiveConf.ConfVars.HIVEEXPLAINDEPENDENCYAPPENDTASKTYPES));
    ExplainTask explTask = (ExplainTask) TaskFactory.get(work, conf);
    fieldList = explTask.getResultSchema();
    rootTasks.add(explTask);
}
Also used: StatsCollectionContext (org.apache.hadoop.hive.ql.stats.StatsCollectionContext), Context (org.apache.hadoop.hive.ql.Context), Task (org.apache.hadoop.hive.ql.exec.Task), FetchTask (org.apache.hadoop.hive.ql.exec.FetchTask), StatsTask (org.apache.hadoop.hive.ql.exec.StatsTask), ExplainTask (org.apache.hadoop.hive.ql.exec.ExplainTask), Serializable (java.io.Serializable), CommandProcessorResponse (org.apache.hadoop.hive.ql.processors.CommandProcessorResponse), Driver (org.apache.hadoop.hive.ql.Driver), ExplainWork (org.apache.hadoop.hive.ql.plan.ExplainWork), IOException (java.io.IOException), CommandNeedRetryException (org.apache.hadoop.hive.ql.CommandNeedRetryException)
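
For orientation, each branch of the option loop above corresponds to a surface-syntax variant of EXPLAIN. A few illustrative statements (the table name t is hypothetical):

public class ExplainVariants {
    public static void main(String[] args) {
        String[] explains = {
            "EXPLAIN EXTENDED SELECT * FROM t",                   // setExtended(true)
            "EXPLAIN FORMATTED SELECT * FROM t",                  // setFormatted(true): JSON output
            "EXPLAIN DEPENDENCY SELECT * FROM t",                 // setDependency(true)
            "EXPLAIN AUTHORIZATION SELECT * FROM t",              // setAuthorize(true); see Example 2
            "EXPLAIN ANALYZE SELECT * FROM t",                    // runs the query, then re-explains with runtime row counts
            "EXPLAIN VECTORIZATION ONLY SUMMARY SELECT * FROM t"  // TOK_ONLY + TOK_SUMMARY
        };
        for (String e : explains) {
            System.out.println(e);
        }
    }
}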

Example 5 with ExplainTask

Use of org.apache.hadoop.hive.ql.exec.ExplainTask in project hive by apache.

From the class ATSHook, method run:

@Override
public void run(final HookContext hookContext) throws Exception {
    final long currentTime = System.currentTimeMillis();
    final HiveConf conf = new HiveConf(hookContext.getConf());
    final QueryState queryState = hookContext.getQueryState();
    final String queryId = queryState.getQueryId();
    final Map<String, Long> durations = new HashMap<String, Long>();
    for (String key : hookContext.getPerfLogger().getEndTimes().keySet()) {
        durations.put(key, hookContext.getPerfLogger().getDuration(key));
    }
    try {
        setupAtsExecutor(conf);
        final String domainId = createOrGetDomain(hookContext);
        executor.submit(new Runnable() {

            @Override
            public void run() {
                try {
                    QueryPlan plan = hookContext.getQueryPlan();
                    if (plan == null) {
                        return;
                    }
                    String queryId = plan.getQueryId();
                    String opId = hookContext.getOperationId();
                    long queryStartTime = plan.getQueryStartTime();
                    String user = hookContext.getUgi().getShortUserName();
                    String requestuser = hookContext.getUserName();
                    if (hookContext.getUserName() == null) {
                        requestuser = hookContext.getUgi().getUserName();
                    }
                    int numMrJobs = Utilities.getMRTasks(plan.getRootTasks()).size();
                    int numTezJobs = Utilities.getTezTasks(plan.getRootTasks()).size();
                    if (numMrJobs + numTezJobs <= 0) {
                        // ignore client only queries
                        return;
                    }
                    switch(hookContext.getHookType()) {
                        case PRE_EXEC_HOOK:
                            ExplainConfiguration config = new ExplainConfiguration();
                            config.setFormatted(true);
                            ExplainWork work = new ExplainWork(
                                    null,                  // resFile
                                    null,                  // pCtx
                                    plan.getRootTasks(),   // rootTasks
                                    plan.getFetchTask(),   // fetchTask
                                    null,                  // analyzer
                                    config,                // explainConfig
                                    null);                 // cboInfo
                            @SuppressWarnings("unchecked") ExplainTask explain = (ExplainTask) TaskFactory.get(work, conf);
                            explain.initialize(queryState, plan, null, null);
                            String query = plan.getQueryStr();
                            JSONObject explainPlan = explain.getJSONPlan(null, work);
                            String logID = conf.getLogIdVar(hookContext.getSessionId());
                            List<String> tablesRead = getTablesFromEntitySet(hookContext.getInputs());
                            List<String> tablesWritten = getTablesFromEntitySet(hookContext.getOutputs());
                            String executionMode = getExecutionMode(plan).name();
                            String hiveInstanceAddress = hookContext.getHiveInstanceAddress();
                            if (hiveInstanceAddress == null) {
                                hiveInstanceAddress = InetAddress.getLocalHost().getHostAddress();
                            }
                            String hiveInstanceType = hookContext.isHiveServerQuery() ? "HS2" : "CLI";
                            ApplicationId llapId = determineLlapId(conf, plan);
                            fireAndForget(createPreHookEvent(queryId, query, explainPlan, queryStartTime, user, requestuser, numMrJobs, numTezJobs, opId, hookContext.getIpAddress(), hiveInstanceAddress, hiveInstanceType, hookContext.getSessionId(), logID, hookContext.getThreadId(), executionMode, tablesRead, tablesWritten, conf, llapId, domainId));
                            break;
                        case POST_EXEC_HOOK:
                            fireAndForget(createPostHookEvent(queryId, currentTime, user, requestuser, true, opId, durations, domainId));
                            break;
                        case ON_FAILURE_HOOK:
                            fireAndForget(createPostHookEvent(queryId, currentTime, user, requestuser, false, opId, durations, domainId));
                            break;
                        default:
                            //ignore
                            break;
                    }
                } catch (Exception e) {
                    LOG.warn("Failed to submit plan to ATS for " + queryId, e);
                }
            }
        });
    } catch (Exception e) {
        LOG.warn("Failed to submit to ATS for " + queryId, e);
    }
}
Also used: ExplainConfiguration (org.apache.hadoop.hive.ql.parse.ExplainConfiguration), ExplainTask (org.apache.hadoop.hive.ql.exec.ExplainTask), HashMap (java.util.HashMap), LinkedHashMap (java.util.LinkedHashMap), ExplainWork (org.apache.hadoop.hive.ql.plan.ExplainWork), QueryState (org.apache.hadoop.hive.ql.QueryState), QueryPlan (org.apache.hadoop.hive.ql.QueryPlan), IOException (java.io.IOException), JSONObject (org.json.JSONObject), HiveConf (org.apache.hadoop.hive.conf.HiveConf), ArrayList (java.util.ArrayList), List (java.util.List), ApplicationId (org.apache.hadoop.yarn.api.records.ApplicationId)
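
The structural point to note is the doubled try/catch around an executor submission: the hook's work runs off the query thread, and both the submission and the submitted task swallow their own failures, so a broken ATS endpoint never fails the user's query. A stripped-down sketch of that fire-and-forget shape (the executor and event are stand-ins, not the hook's real ones):

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class FireAndForgetDemo {
    public static void main(String[] args) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            executor.submit(() -> {
                try {
                    System.out.println("timeline event sent"); // build and send the event here
                } catch (Exception e) {
                    System.err.println("Failed to submit plan: " + e); // log and drop
                }
            });
        } catch (Exception e) {
            System.err.println("Failed to submit to executor: " + e);
        }
        executor.shutdown();
    }
}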

Aggregations

ExplainTask (org.apache.hadoop.hive.ql.exec.ExplainTask) 5
IOException (java.io.IOException) 3
FetchTask (org.apache.hadoop.hive.ql.exec.FetchTask) 3
Task (org.apache.hadoop.hive.ql.exec.Task) 3
ExplainWork (org.apache.hadoop.hive.ql.plan.ExplainWork) 3
Serializable (java.io.Serializable) 2
StatsTask (org.apache.hadoop.hive.ql.exec.StatsTask) 2
ByteArrayOutputStream (java.io.ByteArrayOutputStream) 1
File (java.io.File) 1
PrintStream (java.io.PrintStream) 1
ArrayList (java.util.ArrayList) 1
HashMap (java.util.HashMap) 1
LinkedHashMap (java.util.LinkedHashMap) 1
List (java.util.List) 1
FSDataInputStream (org.apache.hadoop.fs.FSDataInputStream) 1
FileSystem (org.apache.hadoop.fs.FileSystem) 1
Path (org.apache.hadoop.fs.Path) 1
HiveConf (org.apache.hadoop.hive.conf.HiveConf) 1
CommandNeedRetryException (org.apache.hadoop.hive.ql.CommandNeedRetryException) 1
Context (org.apache.hadoop.hive.ql.Context) 1