
Example 1 with LineageEventBuilder

Use of org.apache.gobblin.metrics.event.lineage.LineageEventBuilder in the project incubator-gobblin by apache.

From the class ConvertibleHiveDatasetTest, method testLineageInfo.

@Test
public void testLineageInfo() throws Exception {
    String testConfFilePath = "convertibleHiveDatasetTest/flattenedAndNestedOrc.conf";
    Config config = ConfigFactory.parseResources(testConfFilePath).getConfig("hive.conversion.avro");
    WorkUnit workUnit = WorkUnit.createEmpty();
    Gson GSON = new Gson();
    HiveSource.setLineageInfo(createTestConvertibleDataset(config), workUnit, getSharedJobBroker());
    Properties props = workUnit.getSpecProperties();
    // Assert that lineage name is correct
    Assert.assertEquals(props.getProperty("gobblin.event.lineage.name"), "db1.tb1");
    // Assert that source is correct for lineage event
    Assert.assertTrue(props.containsKey("gobblin.event.lineage.source"));
    DatasetDescriptor sourceDD = GSON.fromJson(props.getProperty("gobblin.event.lineage.source"), DatasetDescriptor.class);
    Assert.assertEquals(sourceDD.getPlatform(), DatasetConstants.PLATFORM_HIVE);
    Assert.assertEquals(sourceDD.getName(), "db1.tb1");
    // Assert that first dest is correct for lineage event
    Assert.assertTrue(props.containsKey("gobblin.event.lineage.branch.1.destination"));
    DatasetDescriptor destDD1 = GSON.fromJson(props.getProperty("gobblin.event.lineage.branch.1.destination"), DatasetDescriptor.class);
    Assert.assertEquals(destDD1.getPlatform(), DatasetConstants.PLATFORM_HIVE);
    Assert.assertEquals(destDD1.getName(), "db1_nestedOrcDb.tb1_nestedOrc");
    // Assert that second dest is correct for lineage event
    Assert.assertTrue(props.containsKey("gobblin.event.lineage.branch.2.destination"));
    DatasetDescriptor destDD2 = GSON.fromJson(props.getProperty("gobblin.event.lineage.branch.2.destination"), DatasetDescriptor.class);
    Assert.assertEquals(destDD2.getPlatform(), DatasetConstants.PLATFORM_HIVE);
    Assert.assertEquals(destDD2.getName(), "db1_flattenedOrcDb.tb1_flattenedOrc");
    // Assert that there are two eventBuilders for nestedOrc and flattenedOrc
    Collection<LineageEventBuilder> lineageEventBuilders = LineageInfo.load(Collections.singleton(workUnit));
    Assert.assertEquals(lineageEventBuilders.size(), 2);
}
Also used: DatasetDescriptor (org.apache.gobblin.dataset.DatasetDescriptor), Config (com.typesafe.config.Config), ConversionConfig (org.apache.gobblin.data.management.conversion.hive.dataset.ConvertibleHiveDataset.ConversionConfig), Gson (com.google.gson.Gson), WorkUnit (org.apache.gobblin.source.workunit.WorkUnit), LineageEventBuilder (org.apache.gobblin.metrics.event.lineage.LineageEventBuilder), Properties (java.util.Properties), Test (org.testng.annotations.Test)
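
The test above checks one "gobblin.event.lineage.branch.N.destination" property per conversion branch and then expects two event builders. The following is a minimal sketch of that key layout using only java.util.Properties. It does not call Gobblin. The class name LineageKeySketch, the helper branchIds, and the placeholder property values are hypothetical and exist only for illustration; the key names are copied from the assertions in the test.

```java
import java.util.Properties;
import java.util.Set;
import java.util.TreeSet;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Gobblin: it only mirrors the property keys asserted in the test.
final class LineageKeySketch {
    // Prefix shared by all lineage keys that appear in testLineageInfo.
    static final String PREFIX = "gobblin.event.lineage.";

    // Matches keys of the form gobblin.event.lineage.branch.<id>.destination.
    static final Pattern BRANCH_DESTINATION =
        Pattern.compile("^gobblin\\.event\\.lineage\\.branch\\.(\\d+)\\.destination$");

    /** Returns the branch ids that carry a destination property, in ascending order. */
    static Set<Integer> branchIds(Properties props) {
        Set<Integer> ids = new TreeSet<>();
        for (String key : props.stringPropertyNames()) {
            Matcher m = BRANCH_DESTINATION.matcher(key);
            if (m.matches()) {
                ids.add(Integer.parseInt(m.group(1)));
            }
        }
        return ids;
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty(PREFIX + "name", "db1.tb1");
        // Placeholder values: the real properties hold serialized DatasetDescriptor JSON.
        props.setProperty(PREFIX + "source", "<source descriptor>");
        props.setProperty(PREFIX + "branch.1.destination", "<nestedOrc descriptor>");
        props.setProperty(PREFIX + "branch.2.destination", "<flattenedOrc descriptor>");

        // One id per destination branch; the name and source keys are ignored.
        System.out.println(branchIds(props)); // prints [1, 2]
    }
}
```

In this sketch the two branch ids correspond to the nestedOrc and flattenedOrc destinations, which is consistent with the test expecting a collection of size 2 from LineageInfo.load.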
