Example 11 with MapReduceApplicationData

Use of com.linkedin.drelephant.mapreduce.data.MapReduceApplicationData in project dr-elephant by linkedin.

The class InfoExtractor, method loadInfo.

/**
 * Loads result with the info depending on the application type
 *
 * @param result The AppResult to be loaded with the info
 * @param data The Hadoop application data
 */
public static void loadInfo(AppResult result, HadoopApplicationData data) {
    Properties properties = new Properties();
    if (data instanceof MapReduceApplicationData) {
        properties = retrieveMapreduceProperties((MapReduceApplicationData) data);
    } else if (data instanceof SparkApplicationData) {
        properties = retrieveSparkProperties((SparkApplicationData) data);
    } else if (data instanceof TezApplicationData) {
        properties = retrieveTezProperties((TezApplicationData) data);
    }
    Scheduler scheduler = getSchedulerInstance(data.getAppId(), properties);
    if (scheduler == null) {
        logger.info("No Scheduler found for appid: " + data.getAppId());
        loadNoSchedulerInfo(result);
    } else if (StringUtils.isEmpty(scheduler.getJobDefId()) || StringUtils.isEmpty(scheduler.getJobExecId()) || StringUtils.isEmpty(scheduler.getFlowDefId()) || StringUtils.isEmpty(scheduler.getFlowExecId())) {
        logger.warn("This job doesn't have the correct " + scheduler.getSchedulerName() + " integration support. I" + " will treat this as an adhoc job");
        logger.info("No Flow/job info found for appid: " + data.getAppId());
        loadNoSchedulerInfo(result);
    } else {
        loadSchedulerInfo(result, data, scheduler);
    }
}
Also used: MapReduceApplicationData(com.linkedin.drelephant.mapreduce.data.MapReduceApplicationData) Scheduler(com.linkedin.drelephant.schedulers.Scheduler) SparkApplicationData(com.linkedin.drelephant.spark.data.SparkApplicationData) TezApplicationData(com.linkedin.drelephant.tez.data.TezApplicationData) Properties(java.util.Properties)
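
A minimal usage sketch, not taken from the project, showing how MapReduce data might be fed through this method. The setAppId fluent setter and the azkaban.link.workflow.url property name are assumptions used only for illustration; scheduler detection is driven by whatever properties the configured schedulers recognize.

Properties jobConf = new Properties();
// Properties such as the Azkaban link URLs are what getSchedulerInstance matches on.
jobConf.setProperty("azkaban.link.workflow.url", "https://azkaban.example.com/executor?execid=123");
MapReduceApplicationData data = new MapReduceApplicationData()
    .setAppId("application_1500000000000_0001")
    .setJobConf(jobConf);
AppResult result = new AppResult();
// Fills result with flow/job identifiers when a scheduler matches the properties;
// otherwise loadInfo falls back to the adhoc ("no scheduler") path shown above.
InfoExtractor.loadInfo(result, data);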

Example 12 with MapReduceApplicationData

Use of com.linkedin.drelephant.mapreduce.data.MapReduceApplicationData in project dr-elephant by linkedin.

The class ShuffleSortHeuristicTest, method analyzeJob.

private Severity analyzeJob(long shuffleTimeMs, long sortTimeMs, long reduceTimeMs) throws IOException {
    MapReduceCounterData dummyCounter = new MapReduceCounterData();
    MapReduceTaskData[] reducers = new MapReduceTaskData[NUMTASKS + 1];
    int i = 0;
    for (; i < NUMTASKS; i++) {
        reducers[i] = new MapReduceTaskData("task-id-" + i, "task-attempt-id-" + i);
        // Total task time is shuffle + sort + reduce; shuffle and sort are reported separately.
        reducers[i].setTimeAndCounter(new long[] { shuffleTimeMs + sortTimeMs + reduceTimeMs, shuffleTimeMs, sortTimeMs, 0, 0 }, dummyCounter);
    }
    // Non-sampled task, which does not contain time and counter data
    reducers[i] = new MapReduceTaskData("task-id-" + i, "task-attempt-id-" + i);
    MapReduceApplicationData data = new MapReduceApplicationData().setCounters(dummyCounter).setReducerData(reducers);
    HeuristicResult result = _heuristic.apply(data);
    return result.getSeverity();
}
Also used: MapReduceApplicationData(com.linkedin.drelephant.mapreduce.data.MapReduceApplicationData) MapReduceCounterData(com.linkedin.drelephant.mapreduce.data.MapReduceCounterData) MapReduceTaskData(com.linkedin.drelephant.mapreduce.data.MapReduceTaskData) HeuristicResult(com.linkedin.drelephant.analysis.HeuristicResult)
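
The helper above is typically exercised from individual test methods. A hedged sketch of such callers follows; the timings, expected severities, and the MINUTE_IN_MS constant are illustrative assumptions, not the project's tuned thresholds, and assertEquals is the usual org.junit.Assert static import.

private static final long MINUTE_IN_MS = 60 * 1000L;

@Test
public void testShortShuffleAndSort() throws IOException {
    // Shuffle and sort are small relative to the reduce time, so a low severity is expected.
    assertEquals(Severity.NONE, analyzeJob(MINUTE_IN_MS, MINUTE_IN_MS, 30 * MINUTE_IN_MS));
}

@Test
public void testLongShuffle() throws IOException {
    // Shuffle time dominates the reduce phase, so a high severity is expected.
    assertEquals(Severity.CRITICAL, analyzeJob(60 * MINUTE_IN_MS, 0, 5 * MINUTE_IN_MS));
}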

Example 13 with MapReduceApplicationData

Use of com.linkedin.drelephant.mapreduce.data.MapReduceApplicationData in project dr-elephant by linkedin.

The class DistributedCacheLimitHeuristicTest, method testHeuristicResult.

/**
 * All cache file sizes are within the limit.
 */
@Test
public void testHeuristicResult() {
    jobConf.setProperty("mapreduce.job.cache.files.filesizes", "100,200,300");
    jobConf.setProperty("mapreduce.job.cache.archives.filesizes", "400,500,600");
    MapReduceApplicationData data = new MapReduceApplicationData().setJobConf(jobConf);
    HeuristicResult result = _heuristic.apply(data);
    assertTrue("Failed to match on expected severity", result.getSeverity() == Severity.NONE);
}
Also used: MapReduceApplicationData(com.linkedin.drelephant.mapreduce.data.MapReduceApplicationData) HeuristicResult(com.linkedin.drelephant.analysis.HeuristicResult) Test(org.junit.Test)
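
The tests in Examples 13-15 rely on a shared fixture that is not shown in these snippets. A hedged sketch of what it might look like follows; the field names, the HeuristicConfigurationData constructor arguments, and the cache paths are assumptions for illustration.

private static final Properties jobConf = new Properties();
private static final Heuristic _heuristic = new DistributedCacheLimitHeuristic(
    new HeuristicConfigurationData("test_heuristic", "test_class", "test_view",
        new ApplicationType("test_apptype"), new HashMap<String, String>()));

@Before
public void setup() {
    // Base configuration: three cache files and three archives, matched by the
    // *.filesizes properties that the individual tests set or remove.
    jobConf.setProperty("mapreduce.job.cache.files",
        "/path/to/cache/file1,/path/to/cache/file2,/path/to/cache/file3");
    jobConf.setProperty("mapreduce.job.cache.archives",
        "/path/to/cache/archive1,/path/to/cache/archive2,/path/to/cache/archive3");
}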

Example 14 with MapReduceApplicationData

Use of com.linkedin.drelephant.mapreduce.data.MapReduceApplicationData in project dr-elephant by linkedin.

The class DistributedCacheLimitHeuristicTest, method testHeuristicResultNoDistributedCacheFiles.

/**
 * Neither of the caches is used by the application.
 */
@Test
public void testHeuristicResultNoDistributedCacheFiles() {
    jobConf.remove("mapreduce.job.cache.files");
    jobConf.remove("mapreduce.job.cache.archives");
    MapReduceApplicationData data = new MapReduceApplicationData().setJobConf(jobConf);
    HeuristicResult result = _heuristic.apply(data);
    assertTrue("Failed to match on expected severity", result == null);
}
Also used: MapReduceApplicationData(com.linkedin.drelephant.mapreduce.data.MapReduceApplicationData) HeuristicResult(com.linkedin.drelephant.analysis.HeuristicResult) Test(org.junit.Test)
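
The null result asserted above suggests the heuristic treats a job with no distributed cache as not applicable. A minimal sketch of such a guard, with illustrative names only and not the project's actual implementation, could be:

// Returns true only if the job references at least one cache file or archive.
private static boolean usesDistributedCache(Properties conf) {
    return conf.getProperty("mapreduce.job.cache.files") != null
        || conf.getProperty("mapreduce.job.cache.archives") != null;
}

// Inside the heuristic's apply(...), a guard like the following would produce the
// null result this test expects:
//     if (!usesDistributedCache(jobConf)) {
//         return null;
//     }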

Example 15 with MapReduceApplicationData

Use of com.linkedin.drelephant.mapreduce.data.MapReduceApplicationData in project dr-elephant by linkedin.

The class DistributedCacheLimitHeuristicTest, method testHeuristicResultWithEmptyCacheFiles.

/**
 * Cache files are not used by the application.
 */
@Test
public void testHeuristicResultWithEmptyCacheFiles() {
    jobConf.remove("mapreduce.job.cache.files");
    jobConf.setProperty("mapreduce.job.cache.archives.filesizes", "400,500,600");
    MapReduceApplicationData data = new MapReduceApplicationData().setJobConf(jobConf);
    HeuristicResult result = _heuristic.apply(data);
    assertTrue("Failed to match on expected severity", result.getSeverity() == Severity.NONE);
}
Also used: MapReduceApplicationData(com.linkedin.drelephant.mapreduce.data.MapReduceApplicationData) HeuristicResult(com.linkedin.drelephant.analysis.HeuristicResult) Test(org.junit.Test)
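
For contrast, a hypothetical companion test, not taken from the project, could cover the over-limit path; the ~2 GB entry and the expectation of a raised severity are assumptions about the heuristic's size limit.

@Test
public void testHeuristicResultCacheFileOverLimit() {
    // One oversized (~2 GB) cache file alongside two small ones.
    jobConf.setProperty("mapreduce.job.cache.files.filesizes", "2147483648,200,300");
    jobConf.setProperty("mapreduce.job.cache.archives.filesizes", "400,500,600");
    MapReduceApplicationData data = new MapReduceApplicationData().setJobConf(jobConf);
    HeuristicResult result = _heuristic.apply(data);
    assertTrue("Expected a raised severity for an oversized cache file",
        result.getSeverity() != Severity.NONE);
}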

Aggregations

MapReduceApplicationData (com.linkedin.drelephant.mapreduce.data.MapReduceApplicationData) 28
HeuristicResult (com.linkedin.drelephant.analysis.HeuristicResult) 22
MapReduceTaskData (com.linkedin.drelephant.mapreduce.data.MapReduceTaskData) 17
MapReduceCounterData (com.linkedin.drelephant.mapreduce.data.MapReduceCounterData) 15
Test (org.junit.Test) 10
Properties (java.util.Properties) 8
IOException (java.io.IOException) 3
ArrayList (java.util.ArrayList) 2
AppResult (models.AppResult) 2
HadoopApplicationData (com.linkedin.drelephant.analysis.HadoopApplicationData) 1
MapperSkewHeuristic (com.linkedin.drelephant.mapreduce.heuristics.MapperSkewHeuristic) 1
Scheduler (com.linkedin.drelephant.schedulers.Scheduler) 1
SparkApplicationData (com.linkedin.drelephant.spark.data.SparkApplicationData) 1
TezApplicationData (com.linkedin.drelephant.tez.data.TezApplicationData) 1
MalformedURLException (java.net.MalformedURLException) 1
URL (java.net.URL) 1
List (java.util.List) 1
Map (java.util.Map) 1
Expectations (mockit.Expectations) 1
Configuration (org.apache.hadoop.conf.Configuration) 1