
Example 1 with ExplainWork

Use of org.apache.hadoop.hive.ql.plan.ExplainWork in project hive by apache.

From class TestExplainTask, method explainToString:

private <K, V> String explainToString(Map<K, V> explainMap) throws Exception {
    ExplainWork work = new ExplainWork();
    ParseContext pCtx = new ParseContext();
    // Build a minimal parse context whose only top operator is a dummy table scan
    // wrapping the supplied explain map.
    HashMap<String, TableScanOperator> topOps = new HashMap<>();
    TableScanOperator scanOp = new DummyOperator(new DummyExplainDesc<K, V>(explainMap));
    topOps.put("sample", scanOp);
    pCtx.setTopOps(topOps);
    work.setParseContext(pCtx);
    // Render the logical plan as JSON into an in-memory stream and return it as a string.
    ByteArrayOutputStream baos = new ByteArrayOutputStream();
    work.setConfig(new ExplainConfiguration());
    new ExplainTask().getJSONLogicalPlan(new PrintStream(baos), work);
    baos.close();
    return baos.toString();
}
Also used : PrintStream(java.io.PrintStream) ExplainConfiguration(org.apache.hadoop.hive.ql.parse.ExplainConfiguration) HashMap(java.util.HashMap) LinkedHashMap(java.util.LinkedHashMap) ExplainWork(org.apache.hadoop.hive.ql.plan.ExplainWork) ByteArrayOutputStream(org.apache.commons.io.output.ByteArrayOutputStream) ParseContext(org.apache.hadoop.hive.ql.parse.ParseContext)
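
A hedged sketch of how a test might use this helper (the @Test method, assertions, and map contents are illustrative assumptions, not taken from the project; DummyOperator and DummyExplainDesc are the test's own private fixtures, and JUnit's org.junit.Test and org.junit.Assert.assertTrue are assumed on the classpath):

@Test
public void testExplainMapAppearsInLogicalPlanJson() throws Exception {
    // Hypothetical usage of explainToString: entries of the supplied map should
    // show up in the JSON rendering of the logical plan.
    Map<String, String> attrs = new LinkedHashMap<>();
    attrs.put("key1", "value1");
    String json = explainToString(attrs);
    assertTrue(json.contains("key1"));
    assertTrue(json.contains("value1"));
}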

Example 2 with ExplainWork

Use of org.apache.hadoop.hive.ql.plan.ExplainWork in project hive by apache.

From class TestUpdateDeleteSemanticAnalyzer, method explain:

private String explain(SemanticAnalyzer sem, QueryPlan plan) throws IOException {
    FileSystem fs = FileSystem.get(conf);
    File f = File.createTempFile("TestSemanticAnalyzer", "explain");
    Path tmp = new Path(f.getPath());
    fs.create(tmp);
    fs.deleteOnExit(tmp);
    // Run an extended EXPLAIN of the analyzed plan and write the result to the temp file.
    ExplainConfiguration config = new ExplainConfiguration();
    config.setExtended(true);
    ExplainWork work = new ExplainWork(tmp, sem.getParseContext(), sem.getRootTasks(), sem.getFetchTask(), sem, config, null);
    ExplainTask task = new ExplainTask();
    task.setWork(work);
    task.initialize(queryState, plan, null, null);
    task.execute(null);
    FSDataInputStream in = fs.open(tmp);
    StringBuilder builder = new StringBuilder();
    final int bufSz = 4096;
    byte[] buf = new byte[bufSz];
    long pos = 0L;
    while (true) {
        int bytesRead = in.read(pos, buf, 0, bufSz);
        if (bytesRead > 0) {
            pos += bytesRead;
            builder.append(new String(buf, 0, bytesRead));
        } else {
            // Reached end of file
            in.close();
            break;
        }
    }
    // Mask machine-specific paths and timestamps so the explain text is stable across runs.
    return builder.toString()
        .replaceAll("pfile:/.*\n", "pfile:MASKED-OUT\n")
        .replaceAll("location file:/.*\n", "location file:MASKED-OUT\n")
        .replaceAll("file:/.*\n", "file:MASKED-OUT\n")
        .replaceAll("transient_lastDdlTime.*\n", "transient_lastDdlTime MASKED-OUT\n");
}
Also used : Path(org.apache.hadoop.fs.Path) ExplainTask(org.apache.hadoop.hive.ql.exec.ExplainTask) FileSystem(org.apache.hadoop.fs.FileSystem) ExplainWork(org.apache.hadoop.hive.ql.plan.ExplainWork) FSDataInputStream(org.apache.hadoop.fs.FSDataInputStream) File(java.io.File)
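
The manual positional-read loop can also be condensed. A minimal alternative sketch using Hadoop's org.apache.hadoop.io.IOUtils (my assumption for illustration, not how the project's test reads the file) might look like this:

// Sketch only: copy the whole explain output into memory with IOUtils.copyBytes
// instead of the manual positional-read loop; fs and tmp are the variables above.
private String readExplainOutput(FileSystem fs, Path tmp) throws IOException {
    try (FSDataInputStream in = fs.open(tmp)) {
        java.io.ByteArrayOutputStream bytes = new java.io.ByteArrayOutputStream();
        org.apache.hadoop.io.IOUtils.copyBytes(in, bytes, 4096, false);
        // The caller would still apply the masking replaceAll chain before comparing output.
        return bytes.toString();
    }
}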

Example 3 with ExplainWork

Use of org.apache.hadoop.hive.ql.plan.ExplainWork in project hive by apache.

From class ExplainSemanticAnalyzer, method analyzeInternal:

@SuppressWarnings("unchecked")
@Override
public void analyzeInternal(ASTNode ast) throws SemanticException {
    final int childCount = ast.getChildCount();
    // Skip TOK_QUERY.
    int i = 1;
    while (i < childCount) {
        int explainOptions = ast.getChild(i).getType();
        if (explainOptions == HiveParser.KW_FORMATTED) {
            config.setFormatted(true);
        } else if (explainOptions == HiveParser.KW_EXTENDED) {
            config.setExtended(true);
        } else if (explainOptions == HiveParser.KW_DEPENDENCY) {
            config.setDependency(true);
        } else if (explainOptions == HiveParser.KW_LOGICAL) {
            config.setLogical(true);
        } else if (explainOptions == HiveParser.KW_AUTHORIZATION) {
            config.setAuthorize(true);
        } else if (explainOptions == HiveParser.KW_ANALYZE) {
            config.setAnalyze(AnalyzeState.RUNNING);
            config.setExplainRootPath(ctx.getMRTmpPath());
        } else if (explainOptions == HiveParser.KW_VECTORIZATION) {
            config.setVectorization(true);
            if (i + 1 < childCount) {
                int vectorizationOption = ast.getChild(i + 1).getType();
                // [ONLY]
                if (vectorizationOption == HiveParser.TOK_ONLY) {
                    config.setVectorizationOnly(true);
                    i++;
                    if (i + 1 >= childCount) {
                        break;
                    }
                    vectorizationOption = ast.getChild(i + 1).getType();
                }
                // [SUMMARY|OPERATOR|EXPRESSION|DETAIL]
                if (vectorizationOption == HiveParser.TOK_SUMMARY) {
                    config.setVectorizationDetailLevel(VectorizationDetailLevel.SUMMARY);
                    i++;
                } else if (vectorizationOption == HiveParser.TOK_OPERATOR) {
                    config.setVectorizationDetailLevel(VectorizationDetailLevel.OPERATOR);
                    i++;
                } else if (vectorizationOption == HiveParser.TOK_EXPRESSION) {
                    config.setVectorizationDetailLevel(VectorizationDetailLevel.EXPRESSION);
                    i++;
                } else if (vectorizationOption == HiveParser.TOK_DETAIL) {
                    config.setVectorizationDetailLevel(VectorizationDetailLevel.DETAIL);
                    i++;
                }
            }
        } else {
        // UNDONE: UNKNOWN OPTION?
        }
        i++;
    }
    ctx.setExplainConfig(config);
    ASTNode input = (ASTNode) ast.getChild(0);
    // Explain analyze runs in two steps:
    // step 1 (ANALYZE_STATE.RUNNING): run the query and collect the runtime #rows;
    // step 2 (ANALYZE_STATE.ANALYZING): explain the query and report the runtime #rows collected.
    if (config.getAnalyze() == AnalyzeState.RUNNING) {
        String query = ctx.getTokenRewriteStream().toString(input.getTokenStartIndex(), input.getTokenStopIndex());
        LOG.info("Explain analyze (running phase) for query " + query);
        Context runCtx = null;
        try {
            runCtx = new Context(conf);
            // runCtx and ctx share the configuration
            runCtx.setExplainConfig(config);
            Driver driver = new Driver(conf, runCtx);
            CommandProcessorResponse ret = driver.run(query);
            if (ret.getResponseCode() == 0) {
                // We still need to call getResults() to drive the fetch, but all results are discarded here.
                while (driver.getResults(new ArrayList<String>())) {
                }
            } else {
                throw new SemanticException(ret.getErrorMessage(), ret.getException());
            }
            config.setOpIdToRuntimeNumRows(aggregateStats(config.getExplainRootPath()));
        } catch (IOException e1) {
            throw new SemanticException(e1);
        } catch (CommandNeedRetryException e) {
            throw new SemanticException(e);
        }
        ctx.resetOpContext();
        ctx.resetStream();
        TaskFactory.resetId();
        LOG.info("Explain analyze (analyzing phase) for query " + query);
        config.setAnalyze(AnalyzeState.ANALYZING);
    }
    BaseSemanticAnalyzer sem = SemanticAnalyzerFactory.get(queryState, input);
    sem.analyze(input, ctx);
    sem.validate();
    ctx.setResFile(ctx.getLocalTmpPath());
    List<Task<? extends Serializable>> tasks = sem.getAllRootTasks();
    if (tasks == null) {
        tasks = Collections.emptyList();
    }
    FetchTask fetchTask = sem.getFetchTask();
    if (fetchTask != null) {
        // Initialize fetch work such that operator tree will be constructed.
        fetchTask.getWork().initializeForFetch(ctx.getOpContext());
    }
    ParseContext pCtx = null;
    if (sem instanceof SemanticAnalyzer) {
        pCtx = ((SemanticAnalyzer) sem).getParseContext();
    }
    config.setUserLevelExplain(!config.isExtended()
        && !config.isFormatted()
        && !config.isDependency()
        && !config.isLogical()
        && !config.isAuthorize()
        && HiveConf.getBoolVar(ctx.getConf(), HiveConf.ConfVars.HIVE_EXPLAIN_USER)
        && HiveConf.getVar(conf, HiveConf.ConfVars.HIVE_EXECUTION_ENGINE).equals("tez"));
    ExplainWork work = new ExplainWork(ctx.getResFile(), pCtx, tasks, fetchTask, sem, config, ctx.getCboInfo());
    work.setAppendTaskType(HiveConf.getBoolVar(conf, HiveConf.ConfVars.HIVEEXPLAINDEPENDENCYAPPENDTASKTYPES));
    ExplainTask explTask = (ExplainTask) TaskFactory.get(work, conf);
    fieldList = explTask.getResultSchema();
    rootTasks.add(explTask);
}
Also used : StatsCollectionContext(org.apache.hadoop.hive.ql.stats.StatsCollectionContext) Context(org.apache.hadoop.hive.ql.Context) Task(org.apache.hadoop.hive.ql.exec.Task) FetchTask(org.apache.hadoop.hive.ql.exec.FetchTask) StatsTask(org.apache.hadoop.hive.ql.exec.StatsTask) ExplainTask(org.apache.hadoop.hive.ql.exec.ExplainTask) Serializable(java.io.Serializable) CommandProcessorResponse(org.apache.hadoop.hive.ql.processors.CommandProcessorResponse) Driver(org.apache.hadoop.hive.ql.Driver) ExplainWork(org.apache.hadoop.hive.ql.plan.ExplainWork) IOException(java.io.IOException) CommandNeedRetryException(org.apache.hadoop.hive.ql.CommandNeedRetryException)
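
To make the keyword handling concrete, here is a rough sketch of EXPLAIN statements that exercise these branches, driven through the same Driver/CommandProcessorResponse API the ANALYZE branch above already uses (the table name src, the method wrapper, and the printed output are illustrative assumptions):

// Illustrative only: each statement maps to a branch of the option loop above.
private void runExplainVariants(HiveConf conf) throws Exception {
    Driver driver = new Driver(conf);
    String[] statements = {
        "EXPLAIN EXTENDED SELECT * FROM src",                   // KW_EXTENDED   -> setExtended(true)
        "EXPLAIN FORMATTED SELECT * FROM src",                  // KW_FORMATTED  -> setFormatted(true)
        "EXPLAIN VECTORIZATION ONLY SUMMARY SELECT * FROM src"  // KW_VECTORIZATION [ONLY] [SUMMARY]
    };
    for (String sql : statements) {
        CommandProcessorResponse ret = driver.run(sql);
        if (ret.getResponseCode() != 0) {
            throw new RuntimeException(ret.getErrorMessage(), ret.getException());
        }
        List<String> lines = new ArrayList<String>();
        while (driver.getResults(lines)) {
            for (String line : lines) {
                System.out.println(line);   // the rendered explain output
            }
            lines.clear();
        }
    }
}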

Example 4 with ExplainWork

Use of org.apache.hadoop.hive.ql.plan.ExplainWork in project hive by apache.

From class ATSHook, method run:

@Override
public void run(final HookContext hookContext) throws Exception {
    final long currentTime = System.currentTimeMillis();
    final HiveConf conf = new HiveConf(hookContext.getConf());
    final QueryState queryState = hookContext.getQueryState();
    final String queryId = queryState.getQueryId();
    final Map<String, Long> durations = new HashMap<String, Long>();
    for (String key : hookContext.getPerfLogger().getEndTimes().keySet()) {
        durations.put(key, hookContext.getPerfLogger().getDuration(key));
    }
    try {
        setupAtsExecutor(conf);
        final String domainId = createOrGetDomain(hookContext);
        executor.submit(new Runnable() {

            @Override
            public void run() {
                try {
                    QueryPlan plan = hookContext.getQueryPlan();
                    if (plan == null) {
                        return;
                    }
                    String queryId = plan.getQueryId();
                    String opId = hookContext.getOperationId();
                    long queryStartTime = plan.getQueryStartTime();
                    String user = hookContext.getUgi().getShortUserName();
                    String requestuser = hookContext.getUserName();
                    if (hookContext.getUserName() == null) {
                        requestuser = hookContext.getUgi().getUserName();
                    }
                    int numMrJobs = Utilities.getMRTasks(plan.getRootTasks()).size();
                    int numTezJobs = Utilities.getTezTasks(plan.getRootTasks()).size();
                    if (numMrJobs + numTezJobs <= 0) {
                        // ignore client only queries
                        return;
                    }
                    switch(hookContext.getHookType()) {
                        case PRE_EXEC_HOOK:
                            ExplainConfiguration config = new ExplainConfiguration();
                            config.setFormatted(true);
                            ExplainWork work = new ExplainWork(
                                null,                   // resFile
                                null,                   // pCtx
                                plan.getRootTasks(),    // rootTasks
                                plan.getFetchTask(),    // fetchTask
                                null,                   // analyzer
                                config,                 // explainConfig
                                null);                  // cboInfo
                            @SuppressWarnings("unchecked") ExplainTask explain = (ExplainTask) TaskFactory.get(work, conf);
                            explain.initialize(queryState, plan, null, null);
                            String query = plan.getQueryStr();
                            JSONObject explainPlan = explain.getJSONPlan(null, work);
                            String logID = conf.getLogIdVar(hookContext.getSessionId());
                            List<String> tablesRead = getTablesFromEntitySet(hookContext.getInputs());
                            List<String> tablesWritten = getTablesFromEntitySet(hookContext.getOutputs());
                            String executionMode = getExecutionMode(plan).name();
                            String hiveInstanceAddress = hookContext.getHiveInstanceAddress();
                            if (hiveInstanceAddress == null) {
                                hiveInstanceAddress = InetAddress.getLocalHost().getHostAddress();
                            }
                            String hiveInstanceType = hookContext.isHiveServerQuery() ? "HS2" : "CLI";
                            ApplicationId llapId = determineLlapId(conf, plan);
                            fireAndForget(createPreHookEvent(queryId, query, explainPlan, queryStartTime, user, requestuser, numMrJobs, numTezJobs, opId, hookContext.getIpAddress(), hiveInstanceAddress, hiveInstanceType, hookContext.getSessionId(), logID, hookContext.getThreadId(), executionMode, tablesRead, tablesWritten, conf, llapId, domainId));
                            break;
                        case POST_EXEC_HOOK:
                            fireAndForget(createPostHookEvent(queryId, currentTime, user, requestuser, true, opId, durations, domainId));
                            break;
                        case ON_FAILURE_HOOK:
                            fireAndForget(createPostHookEvent(queryId, currentTime, user, requestuser, false, opId, durations, domainId));
                            break;
                        default:
                            //ignore
                            break;
                    }
                } catch (Exception e) {
                    LOG.warn("Failed to submit plan to ATS for " + queryId, e);
                }
            }
        });
    } catch (Exception e) {
        LOG.warn("Failed to submit to ATS for " + queryId, e);
    }
}
Also used : ExplainConfiguration(org.apache.hadoop.hive.ql.parse.ExplainConfiguration) ExplainTask(org.apache.hadoop.hive.ql.exec.ExplainTask) HashMap(java.util.HashMap) LinkedHashMap(java.util.LinkedHashMap) ExplainWork(org.apache.hadoop.hive.ql.plan.ExplainWork) QueryState(org.apache.hadoop.hive.ql.QueryState) QueryPlan(org.apache.hadoop.hive.ql.QueryPlan) IOException(java.io.IOException) JSONObject(org.json.JSONObject) HiveConf(org.apache.hadoop.hive.conf.HiveConf) ArrayList(java.util.ArrayList) List(java.util.List) ApplicationId(org.apache.hadoop.yarn.api.records.ApplicationId)
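
Distilled from the PRE_EXEC_HOOK branch, a minimal sketch of the "render a QueryPlan as formatted JSON" pattern (the standalone method and its name are mine; the individual calls are exactly the ones the hook makes):

// Sketch: build a formatted ExplainWork for an existing QueryPlan and return its JSON plan,
// as the hook does before shipping it to ATS.
private JSONObject renderPlanAsJson(QueryState queryState, QueryPlan plan, HiveConf conf) throws Exception {
    ExplainConfiguration config = new ExplainConfiguration();
    config.setFormatted(true);
    ExplainWork work = new ExplainWork(
        null,                   // resFile
        null,                   // pCtx
        plan.getRootTasks(),    // rootTasks
        plan.getFetchTask(),    // fetchTask
        null,                   // analyzer
        config,                 // explainConfig
        null);                  // cboInfo
    ExplainTask explain = (ExplainTask) TaskFactory.get(work, conf);
    explain.initialize(queryState, plan, null, null);
    return explain.getJSONPlan(null, work);
}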

Aggregations

ExplainWork (org.apache.hadoop.hive.ql.plan.ExplainWork): 4
ExplainTask (org.apache.hadoop.hive.ql.exec.ExplainTask): 3
IOException (java.io.IOException): 2
HashMap (java.util.HashMap): 2
LinkedHashMap (java.util.LinkedHashMap): 2
ExplainConfiguration (org.apache.hadoop.hive.ql.parse.ExplainConfiguration): 2
File (java.io.File): 1
PrintStream (java.io.PrintStream): 1
Serializable (java.io.Serializable): 1
ArrayList (java.util.ArrayList): 1
List (java.util.List): 1
ByteArrayOutputStream (org.apache.commons.io.output.ByteArrayOutputStream): 1
FSDataInputStream (org.apache.hadoop.fs.FSDataInputStream): 1
FileSystem (org.apache.hadoop.fs.FileSystem): 1
Path (org.apache.hadoop.fs.Path): 1
HiveConf (org.apache.hadoop.hive.conf.HiveConf): 1
CommandNeedRetryException (org.apache.hadoop.hive.ql.CommandNeedRetryException): 1
Context (org.apache.hadoop.hive.ql.Context): 1
Driver (org.apache.hadoop.hive.ql.Driver): 1
QueryPlan (org.apache.hadoop.hive.ql.QueryPlan): 1