
Example 1 with HadoopApplicationData

Use of com.linkedin.drelephant.analysis.HadoopApplicationData in the dr-elephant project by LinkedIn.

From the class InfoExtractorTest, method testLoadInfoSpark.

@Test
public void testLoadInfoSpark() {
    final String JOB_DEF_URL = "https://grid.example.com:9000/manager?project=project-name&flow=flow-name&job=job-name";
    final String JOB_EXEC_URL = "https://grid.example.com:9000/executor?execid=123456&job=job-name&attempt=0";
    final String FLOW_DEF_URL = "https://grid.example.com:9000/manager?project=project-name&flow=flow-name";
    final String FLOW_EXEC_URL = "https://grid.example.com:9000/executor?execid=123456";
    final String JAVA_EXTRA_OPTIONS = "spark.driver.extraJavaOptions";
    Map<String, String> properties = new HashMap<String, String>();
    properties = properties.$plus(new Tuple2<String, String>(JAVA_EXTRA_OPTIONS, "-Dazkaban.link.workflow.url=" + FLOW_DEF_URL + " -Dazkaban.link.job.url=" + JOB_DEF_URL + " -Dazkaban.link.execution.url=" + FLOW_EXEC_URL + " -Dazkaban.link.attempt.url=" + JOB_EXEC_URL));
    AppResult result = new AppResult();
    HadoopApplicationData data = new SparkApplicationData("application_5678", properties, new ApplicationInfoImpl("", "", new Vector<ApplicationAttemptInfoImpl>(0, 1, 0)), new Vector<JobData>(0, 1, 0), new Vector<StageData>(0, 1, 0), new Vector<ExecutorSummary>(0, 1, 0));
    InfoExtractor.loadInfo(result, data);
    assertTrue(result.jobDefId.equals(JOB_DEF_URL));
    assertTrue(result.jobExecId.equals(JOB_EXEC_URL));
    assertTrue(result.flowDefId.equals(FLOW_DEF_URL));
    assertTrue(result.flowExecId.equals(FLOW_EXEC_URL));
}
Also used : HashMap(scala.collection.immutable.HashMap) HadoopApplicationData(com.linkedin.drelephant.analysis.HadoopApplicationData) SparkApplicationData(com.linkedin.drelephant.spark.data.SparkApplicationData) ApplicationInfoImpl(com.linkedin.drelephant.spark.fetchers.statusapiv1.ApplicationInfoImpl) AppResult(models.AppResult) StageData(com.linkedin.drelephant.spark.fetchers.statusapiv1.StageData) ExecutorSummary(com.linkedin.drelephant.spark.fetchers.statusapiv1.ExecutorSummary) Tuple2(scala.Tuple2) JobData(com.linkedin.drelephant.spark.fetchers.statusapiv1.JobData) Vector(scala.collection.immutable.Vector) Test(org.junit.Test)
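
The test above packs the four Azkaban URLs into the spark.driver.extraJavaOptions string as -Dkey=value flags and expects loadInfo to recover them. As a rough illustration of that idea only, the sketch below splits such a string into key/value pairs. The class JavaOptionsSketch and the method parseSystemProperties are hypothetical names invented for this example; this is not dr-elephant's actual parsing code, and it assumes values contain no whitespace.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class JavaOptionsSketch {

    // Hypothetical helper (not part of dr-elephant): collects every
    // -Dkey=value token from a whitespace-separated JVM options string.
    static Map<String, String> parseSystemProperties(String javaOptions) {
        Map<String, String> parsed = new LinkedHashMap<>();
        if (javaOptions == null || javaOptions.trim().isEmpty()) {
            return parsed;
        }
        for (String token : javaOptions.trim().split("\\s+")) {
            if (!token.startsWith("-D")) {
                continue; // not a system-property flag, e.g. -Xmx1g
            }
            // Split on the FIRST '=' only, since URL values contain '=' too.
            int eq = token.indexOf('=');
            if (eq <= 2) {
                continue; // missing key or missing '='
            }
            parsed.put(token.substring(2, eq), token.substring(eq + 1));
        }
        return parsed;
    }

    public static void main(String[] args) {
        String flowUrl = "https://grid.example.com:9000/manager?project=project-name&flow=flow-name";
        String execUrl = "https://grid.example.com:9000/executor?execid=123456";
        String options = "-Dazkaban.link.workflow.url=" + flowUrl
            + " -Dazkaban.link.execution.url=" + execUrl;

        Map<String, String> parsed = parseSystemProperties(options);
        System.out.println(parsed.get("azkaban.link.workflow.url"));
        System.out.println(parsed.get("azkaban.link.execution.url"));
    }
}
```

Splitting on the first '=' matters here because the URL values themselves carry query parameters such as project=project-name.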

Example 2 with HadoopApplicationData

Use of com.linkedin.drelephant.analysis.HadoopApplicationData in the dr-elephant project by LinkedIn.

From the class InfoExtractorTest, method testLoadInfoSparkNoConfig.

@Test
public void testLoadInfoSparkNoConfig() {
    Map<String, String> properties = new HashMap<String, String>();
    AppResult result = new AppResult();
    HadoopApplicationData data = new SparkApplicationData("application_5678", properties, new ApplicationInfoImpl("", "", new Vector<ApplicationAttemptInfoImpl>(0, 1, 0)), new Vector<JobData>(0, 1, 0), new Vector<StageData>(0, 1, 0), new Vector<ExecutorSummary>(0, 1, 0));
    // test to make sure loadInfo does not throw exception if properties are not defined
    InfoExtractor.loadInfo(result, data);
    assertTrue(result.jobDefId.isEmpty());
    assertTrue(result.jobExecId.isEmpty());
    assertTrue(result.flowDefId.isEmpty());
    assertTrue(result.flowExecId.isEmpty());
}
Also used : HashMap(scala.collection.immutable.HashMap) HadoopApplicationData(com.linkedin.drelephant.analysis.HadoopApplicationData) SparkApplicationData(com.linkedin.drelephant.spark.data.SparkApplicationData) ApplicationInfoImpl(com.linkedin.drelephant.spark.fetchers.statusapiv1.ApplicationInfoImpl) AppResult(models.AppResult) StageData(com.linkedin.drelephant.spark.fetchers.statusapiv1.StageData) ExecutorSummary(com.linkedin.drelephant.spark.fetchers.statusapiv1.ExecutorSummary) JobData(com.linkedin.drelephant.spark.fetchers.statusapiv1.JobData) Vector(scala.collection.immutable.Vector) Test(org.junit.Test)
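
The test above checks that missing configuration yields empty IDs rather than an exception. One plausible way to get that behavior with plain java.util.Properties is to read each key with an empty-string default, as sketched below. The class MissingPropertySketch and the method valueOrEmpty are hypothetical; how dr-elephant itself handles missing keys is not shown in this page.

```java
import java.util.Properties;

public class MissingPropertySketch {

    // Hypothetical helper (not part of dr-elephant): returns the configured
    // value, or "" when the key is absent, so callers never see null.
    static String valueOrEmpty(Properties conf, String key) {
        return conf.getProperty(key, "");
    }

    public static void main(String[] args) {
        Properties conf = new Properties();

        // Key is not defined: the lookup falls back to the empty string.
        System.out.println(valueOrEmpty(conf, "azkaban.link.job.url").isEmpty());

        // Key is defined: the lookup returns the stored value.
        conf.setProperty("azkaban.link.job.url", "https://grid.example.com:9000/manager?project=project-name");
        System.out.println(valueOrEmpty(conf, "azkaban.link.job.url"));
    }
}
```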

Example 3 with HadoopApplicationData

Use of com.linkedin.drelephant.analysis.HadoopApplicationData in the dr-elephant project by LinkedIn.

From the class InfoExtractorTest, method testLoadInfoMapReduce.

@Test
public void testLoadInfoMapReduce() {
    final String JOB_DEF_URL = "https://grid.example.com:9000/manager?project=project-name&flow=flow-name&job=job-name";
    final String JOB_EXEC_URL = "https://grid.example.com:9000/executor?execid=123456&job=job-name&attempt=0";
    final String FLOW_DEF_URL = "https://grid.example.com:9000/manager?project=project-name&flow=flow-name";
    final String FLOW_EXEC_URL = "https://grid.example.com:9000/executor?execid=123456";
    final String JOB_NAME = "job-name";
    Properties properties = new Properties();
    properties.put(AzkabanScheduler.AZKABAN_JOB_URL, JOB_DEF_URL);
    properties.put(AzkabanScheduler.AZKABAN_ATTEMPT_URL, JOB_EXEC_URL);
    properties.put(AzkabanScheduler.AZKABAN_WORKFLOW_URL, FLOW_DEF_URL);
    properties.put(AzkabanScheduler.AZKABAN_EXECUTION_URL, FLOW_EXEC_URL);
    properties.put(AzkabanScheduler.AZKABAN_JOB_NAME, JOB_NAME);
    AppResult result = new AppResult();
    HadoopApplicationData data = (new MapReduceApplicationData()).setAppId("application_5678").setJobConf(properties);
    InfoExtractor.loadInfo(result, data);
    assertTrue(result.jobDefId.equals(JOB_DEF_URL));
    assertTrue(result.jobExecId.equals(JOB_EXEC_URL));
    assertTrue(result.flowDefId.equals(FLOW_DEF_URL));
    assertTrue(result.flowExecId.equals(FLOW_EXEC_URL));
}
Also used : MapReduceApplicationData(com.linkedin.drelephant.mapreduce.data.MapReduceApplicationData) HadoopApplicationData(com.linkedin.drelephant.analysis.HadoopApplicationData) Properties(java.util.Properties) AppResult(models.AppResult) Test(org.junit.Test)

Example 4 with HadoopApplicationData

Use of com.linkedin.drelephant.analysis.HadoopApplicationData in the dr-elephant project by LinkedIn.

From the class InfoExtractorTest, method testLoadSchedulerInfo.

@Test
public void testLoadSchedulerInfo() {
    Properties properties = new Properties();
    properties.put(AzkabanScheduler.AZKABAN_JOB_URL, "https://grid.example.com:9000/manager?project=project-name&flow=flow-name&job=job-name");
    properties.put(AzkabanScheduler.AZKABAN_ATTEMPT_URL, "https://grid.example.com:9000/executor?execid=123456&job=job-name&attempt=0");
    properties.put(AzkabanScheduler.AZKABAN_WORKFLOW_URL, "https://grid.example.com:9000/manager?project=project-name&flow=flow-name");
    properties.put(AzkabanScheduler.AZKABAN_EXECUTION_URL, "https://grid.example.com:9000/executor?execid=123456");
    properties.put(AzkabanScheduler.AZKABAN_JOB_NAME, "job-name");
    SchedulerConfigurationData schedulerConfigurationData = new SchedulerConfigurationData("azkaban", null, null);
    Scheduler scheduler = new AzkabanScheduler("id", properties, schedulerConfigurationData);
    AppResult result = new AppResult();
    HadoopApplicationData data = new HadoopApplicationData() {

        String appId = "application_5678";

        Properties conf = new Properties();

        ApplicationType applicationType = new ApplicationType("foo");

        @Override
        public String getAppId() {
            return appId;
        }

        @Override
        public Properties getConf() {
            return conf;
        }

        @Override
        public ApplicationType getApplicationType() {
            return applicationType;
        }

        @Override
        public boolean isEmpty() {
            return false;
        }
    };
    InfoExtractor.loadSchedulerInfo(result, data, scheduler);
    // JUnit's assertEquals takes the expected value first, then the actual value
    assertEquals("azkaban", result.scheduler);
    assertFalse(StringUtils.isEmpty(result.getJobExecId()));
    assertFalse(StringUtils.isEmpty(result.getJobDefId()));
    assertFalse(StringUtils.isEmpty(result.getFlowExecId()));
    assertFalse(StringUtils.isEmpty(result.getFlowDefId()));
    assertFalse(StringUtils.isEmpty(result.getJobExecUrl()));
    assertFalse(StringUtils.isEmpty(result.getJobDefUrl()));
    assertFalse(StringUtils.isEmpty(result.getFlowExecUrl()));
    assertFalse(StringUtils.isEmpty(result.getFlowDefUrl()));
}
Also used : ApplicationType(com.linkedin.drelephant.analysis.ApplicationType) AzkabanScheduler(com.linkedin.drelephant.schedulers.AzkabanScheduler) Scheduler(com.linkedin.drelephant.schedulers.Scheduler) AirflowScheduler(com.linkedin.drelephant.schedulers.AirflowScheduler) OozieScheduler(com.linkedin.drelephant.schedulers.OozieScheduler) HadoopApplicationData(com.linkedin.drelephant.analysis.HadoopApplicationData) Properties(java.util.Properties) AppResult(models.AppResult) SchedulerConfigurationData(com.linkedin.drelephant.configurations.scheduler.SchedulerConfigurationData) Test(org.junit.Test)

Aggregations

HadoopApplicationData (com.linkedin.drelephant.analysis.HadoopApplicationData): 4
AppResult (models.AppResult): 4
Test (org.junit.Test): 4
SparkApplicationData (com.linkedin.drelephant.spark.data.SparkApplicationData): 2
ApplicationInfoImpl (com.linkedin.drelephant.spark.fetchers.statusapiv1.ApplicationInfoImpl): 2
ExecutorSummary (com.linkedin.drelephant.spark.fetchers.statusapiv1.ExecutorSummary): 2
JobData (com.linkedin.drelephant.spark.fetchers.statusapiv1.JobData): 2
StageData (com.linkedin.drelephant.spark.fetchers.statusapiv1.StageData): 2
Properties (java.util.Properties): 2
HashMap (scala.collection.immutable.HashMap): 2
Vector (scala.collection.immutable.Vector): 2
ApplicationType (com.linkedin.drelephant.analysis.ApplicationType): 1
SchedulerConfigurationData (com.linkedin.drelephant.configurations.scheduler.SchedulerConfigurationData): 1
MapReduceApplicationData (com.linkedin.drelephant.mapreduce.data.MapReduceApplicationData): 1
AirflowScheduler (com.linkedin.drelephant.schedulers.AirflowScheduler): 1
AzkabanScheduler (com.linkedin.drelephant.schedulers.AzkabanScheduler): 1
OozieScheduler (com.linkedin.drelephant.schedulers.OozieScheduler): 1
Scheduler (com.linkedin.drelephant.schedulers.Scheduler): 1
Tuple2 (scala.Tuple2): 1