Example 76 with SourceState

use of org.apache.gobblin.configuration.SourceState in project incubator-gobblin by apache.

the class CopySourceTest method testPartitionableDataset.

@Test
public void testPartitionableDataset() throws Exception {
    SourceState state = new SourceState();
    state.setProp(ConfigurationKeys.SOURCE_FILEBASED_FS_URI, "file:///");
    state.setProp(ConfigurationKeys.WRITER_FILE_SYSTEM_URI, "file:///");
    state.setProp(ConfigurationKeys.DATA_PUBLISHER_FINAL_DIR, "/target/dir");
    state.setProp(DatasetUtils.DATASET_PROFILE_CLASS_KEY, TestCopyablePartitionableDatasedFinder.class.getCanonicalName());
    CopySource source = new CopySource();
    List<WorkUnit> workunits = source.getWorkunits(state);
    workunits = JobLauncherUtils.flattenWorkUnits(workunits);
    Assert.assertEquals(workunits.size(), TestCopyableDataset.FILE_COUNT);
    Extract extractAbove = null;
    Extract extractBelow = null;
    for (WorkUnit workUnit : workunits) {
        CopyableFile copyableFile = (CopyableFile) CopySource.deserializeCopyEntity(workUnit);
        Assert.assertTrue(copyableFile.getOrigin().getPath().toString().startsWith(TestCopyableDataset.ORIGIN_PREFIX));
        Assert.assertEquals(copyableFile.getDestinationOwnerAndPermission(), TestCopyableDataset.OWNER_AND_PERMISSION);
        if (Integer.parseInt(copyableFile.getOrigin().getPath().getName()) < TestCopyablePartitionableDataset.THRESHOLD) {
            // should be in extractBelow
            if (extractBelow == null) {
                extractBelow = workUnit.getExtract();
            }
            Assert.assertEquals(workUnit.getExtract(), extractBelow);
        } else {
            // should be in extractAbove
            if (extractAbove == null) {
                extractAbove = workUnit.getExtract();
            }
            Assert.assertEquals(workUnit.getExtract(), extractAbove);
        }
    }
    Assert.assertNotNull(extractAbove);
    Assert.assertNotNull(extractBelow);
}
Also used: SourceState (org.apache.gobblin.configuration.SourceState), Extract (org.apache.gobblin.source.workunit.Extract), WorkUnit (org.apache.gobblin.source.workunit.WorkUnit), Test (org.testng.annotations.Test)

Example 77 with SourceState

use of org.apache.gobblin.configuration.SourceState in project incubator-gobblin by apache.

the class BackfillHiveSourceTest method testNoWhitelist.

@Test
public void testNoWhitelist() throws Exception {
    BackfillHiveSource backfillHiveSource = new BackfillHiveSource();
    SourceState state = new SourceState();
    backfillHiveSource.initBackfillHiveSource(state);
    Partition sourcePartition = Mockito.mock(Partition.class, Mockito.RETURNS_SMART_NULLS);
    Assert.assertTrue(backfillHiveSource.shouldCreateWorkunit(sourcePartition, new LongWatermark(0)));
}
Also used: Partition (org.apache.hadoop.hive.ql.metadata.Partition), SourceState (org.apache.gobblin.configuration.SourceState), BackfillHiveSource (org.apache.gobblin.data.management.conversion.hive.source.BackfillHiveSource), LongWatermark (org.apache.gobblin.source.extractor.extract.LongWatermark), Test (org.testng.annotations.Test)

Example 78 with SourceState

use of org.apache.gobblin.configuration.SourceState in project incubator-gobblin by apache.

the class Kafka09JsonIntegrationTest method createSourceState.

private SourceState createSourceState(String topic) {
    SourceState state = new SourceState();
    state.setProp(ConfigurationKeys.KAFKA_BROKERS, "localhost:" + kafkaTestHelper.getKafkaServerPort());
    state.setProp(KafkaSource.TOPIC_WHITELIST, topic);
    state.setProp(KafkaSource.GOBBLIN_KAFKA_CONSUMER_CLIENT_FACTORY_CLASS, Kafka09ConsumerClient.Factory.class.getName());
    state.setProp(KafkaSource.BOOTSTRAP_WITH_OFFSET, "earliest");
    return state;
}
Also used: SourceState (org.apache.gobblin.configuration.SourceState), ManagementFactory (java.lang.management.ManagementFactory)

Example 79 with SourceState

use of org.apache.gobblin.configuration.SourceState in project incubator-gobblin by apache.

the class JsonElementConversionFactoryTest method setUp.

@BeforeClass
public static void setUp() {
    WorkUnit workUnit = new WorkUnit(new SourceState(), new Extract(new SourceState(), Extract.TableType.SNAPSHOT_ONLY, "namespace", "dummy_table"));
    state = new WorkUnitState(workUnit);
    Type listType = new TypeToken<JsonObject>() {}.getType();
    Gson gson = new Gson();
    testData = gson.fromJson(new InputStreamReader(JsonElementConversionFactoryTest.class.getResourceAsStream("/converter/JsonElementConversionFactoryTest.json")), listType);
}
Also used: SourceState (org.apache.gobblin.configuration.SourceState), Type (java.lang.reflect.Type), InputStreamReader (java.io.InputStreamReader), WorkUnitState (org.apache.gobblin.configuration.WorkUnitState), JsonObject (com.google.gson.JsonObject), Gson (com.google.gson.Gson), Extract (org.apache.gobblin.source.workunit.Extract), WorkUnit (org.apache.gobblin.source.workunit.WorkUnit), BeforeClass (org.testng.annotations.BeforeClass)

Example 80 with SourceState

use of org.apache.gobblin.configuration.SourceState in project incubator-gobblin by apache.

the class JsonIntermediateToAvroConverterTest method initResources.

private JsonObject initResources(String resourceFilePath) {
    Type listType = new TypeToken<JsonObject>() {}.getType();
    Gson gson = new Gson();
    JsonObject testData = gson.fromJson(new InputStreamReader(this.getClass().getResourceAsStream(resourceFilePath)), listType);
    jsonRecord = testData.get("record").getAsJsonObject();
    jsonSchema = testData.get("schema").getAsJsonArray();
    WorkUnit workUnit = new WorkUnit(new SourceState(), new Extract(new SourceState(), Extract.TableType.SNAPSHOT_ONLY, "namespace", "dummy_table"));
    state = new WorkUnitState(workUnit);
    state.setProp(ConfigurationKeys.CONVERTER_AVRO_TIME_FORMAT, "HH:mm:ss");
    state.setProp(ConfigurationKeys.CONVERTER_AVRO_DATE_TIMEZONE, "PST");
    return testData;
}
Also used: Type (java.lang.reflect.Type), SourceState (org.apache.gobblin.configuration.SourceState), InputStreamReader (java.io.InputStreamReader), WorkUnitState (org.apache.gobblin.configuration.WorkUnitState), JsonObject (com.google.gson.JsonObject), Gson (com.google.gson.Gson), Extract (org.apache.gobblin.source.workunit.Extract), WorkUnit (org.apache.gobblin.source.workunit.WorkUnit)

Aggregations

SourceState (org.apache.gobblin.configuration.SourceState) 90
Test (org.testng.annotations.Test) 76
WorkUnitState (org.apache.gobblin.configuration.WorkUnitState) 44
WorkUnit (org.apache.gobblin.source.workunit.WorkUnit) 38
State (org.apache.gobblin.configuration.State) 30
WorkingState (org.apache.gobblin.configuration.WorkUnitState.WorkingState) 11
Partition (org.apache.hadoop.hive.ql.metadata.Partition) 8
Table (org.apache.hadoop.hive.ql.metadata.Table) 8
IterableDatasetFinder (org.apache.gobblin.dataset.IterableDatasetFinder) 7
LongWatermark (org.apache.gobblin.source.extractor.extract.LongWatermark) 7
Extract (org.apache.gobblin.source.workunit.Extract) 7
DateTime (org.joda.time.DateTime) 7
Dataset (org.apache.gobblin.dataset.Dataset) 6
PartitionableDataset (org.apache.gobblin.dataset.PartitionableDataset) 6
MultiWorkUnit (org.apache.gobblin.source.workunit.MultiWorkUnit) 6
WorkUnitStream (org.apache.gobblin.source.workunit.WorkUnitStream) 6
IOException (java.io.IOException) 5
Path (org.apache.hadoop.fs.Path) 5
Gson (com.google.gson.Gson) 4
JsonObject (com.google.gson.JsonObject) 4