
Example 16 with Extract

Use of org.apache.gobblin.source.workunit.Extract in project incubator-gobblin by apache.

In class WriterUtilsTest, method testGetDefaultWriterFilePathWithWorkUnitState.

@Test
public void testGetDefaultWriterFilePathWithWorkUnitState() {
    String namespace = "gobblin.test";
    String tableName = "test-table";
    SourceState sourceState = new SourceState();
    WorkUnit workUnit = WorkUnit.create(new Extract(sourceState, TableType.APPEND_ONLY, namespace, tableName));
    WorkUnitState workUnitState = new WorkUnitState(workUnit);
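    // With 0 branches, the writer path is the extract's output file path itself.
    Assert.assertEquals(WriterUtils.getWriterFilePath(workUnitState, 0, 0), new Path(workUnitState.getExtract().getOutputFilePath()));
    // With 2 branches, branch 0 writes to a child path named DEFAULT_FORK_BRANCH_NAME + "0".
    Assert.assertEquals(WriterUtils.getWriterFilePath(workUnitState, 2, 0), new Path(workUnitState.getExtract().getOutputFilePath(), ConfigurationKeys.DEFAULT_FORK_BRANCH_NAME + "0"));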
}
Also used : Path(org.apache.hadoop.fs.Path) SourceState(org.apache.gobblin.configuration.SourceState) WorkUnitState(org.apache.gobblin.configuration.WorkUnitState) Extract(org.apache.gobblin.source.workunit.Extract) WorkUnit(org.apache.gobblin.source.workunit.WorkUnit) Test(org.testng.annotations.Test)

Example 17 with Extract

Use of org.apache.gobblin.source.workunit.Extract in project incubator-gobblin by apache.

In class WriterUtilsTest, method testGetDefaultWriterFilePath.

@Test
public void testGetDefaultWriterFilePath() {
    String namespace = "gobblin.test";
    String tableName = "test-table";
    SourceState sourceState = new SourceState();
    WorkUnit state = WorkUnit.create(new Extract(sourceState, TableType.APPEND_ONLY, namespace, tableName));
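    // With 0 branches, the writer path is the extract's output file path itself.
    Assert.assertEquals(WriterUtils.getWriterFilePath(state, 0, 0), new Path(state.getExtract().getOutputFilePath()));
    // With 2 branches, branch 0 writes to a child path named DEFAULT_FORK_BRANCH_NAME + "0".
    Assert.assertEquals(WriterUtils.getWriterFilePath(state, 2, 0), new Path(state.getExtract().getOutputFilePath(), ConfigurationKeys.DEFAULT_FORK_BRANCH_NAME + "0"));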
}
Also used : Path(org.apache.hadoop.fs.Path) SourceState(org.apache.gobblin.configuration.SourceState) Extract(org.apache.gobblin.source.workunit.Extract) WorkUnit(org.apache.gobblin.source.workunit.WorkUnit) Test(org.testng.annotations.Test)
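The two WriterUtilsTest methods above assert the same rule: with no fork branches the writer path equals the extract's output path, and with two branches the path gains a child named after the branch. The following is a minimal, dependency-free sketch of that rule, not Gobblin's implementation. The class ForkPathSketch, the method writerFilePath, the literal "fork_" standing in for ConfigurationKeys.DEFAULT_FORK_BRANCH_NAME, and the "more than one branch" threshold are all assumptions made for illustration; the tests above only pin down the cases of 0 and 2 branches.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

class ForkPathSketch {
    // Assumption: stands in for ConfigurationKeys.DEFAULT_FORK_BRANCH_NAME.
    public static final String FORK_BRANCH_NAME = "fork_";

    /**
     * Hypothetical helper: returns the extract output path unchanged when there
     * is at most one branch, otherwise appends a per-branch child directory.
     * The "numBranches > 1" threshold is an assumption; the tests only cover 0 and 2.
     */
    public static Path writerFilePath(String extractOutputPath, int numBranches, int branchId) {
        Path base = Paths.get(extractOutputPath);
        if (numBranches > 1) {
            return base.resolve(FORK_BRANCH_NAME + branchId);
        }
        return base;
    }

    public static void main(String[] args) {
        String out = "gobblin/test/test-table"; // illustrative path only
        System.out.println(writerFilePath(out, 0, 0)); // no forks: base path
        System.out.println(writerFilePath(out, 2, 0)); // two forks: base path plus branch child
    }
}
```

The sketch uses java.nio.file.Path in place of org.apache.hadoop.fs.Path so that it runs without Hadoop on the classpath.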

Example 18 with Extract

Use of org.apache.gobblin.source.workunit.Extract in project incubator-gobblin by apache.

In class WorstFitDecreasingBinPackingTest, method getWorkUnitWithWeight.

public WorkUnit getWorkUnitWithWeight(long weight) {
    WorkUnit workUnit = new WorkUnit(new Extract(Extract.TableType.APPEND_ONLY, "", ""));
    workUnit.setProp(WEIGHT, Long.toString(weight));
    return workUnit;
}
Also used : Extract(org.apache.gobblin.source.workunit.Extract) MultiWorkUnit(org.apache.gobblin.source.workunit.MultiWorkUnit) WorkUnit(org.apache.gobblin.source.workunit.WorkUnit)
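The helper above attaches a WEIGHT property so the surrounding test can exercise worst-fit decreasing bin packing. For context, here is a generic sketch of that heuristic over plain long weights and a fixed number of bins: sort the weights in decreasing order, then place each one into the currently least-loaded bin. This is an illustration under stated assumptions, not Gobblin's packer; the class WorstFitDecreasingSketch, the nested Bin type, and the fixed-bin-count variant are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

class WorstFitDecreasingSketch {
    /** Hypothetical bin: the weights placed in it and their running total. */
    public static final class Bin {
        public final List<Long> items = new ArrayList<>();
        public long load = 0;

        void add(long weight) {
            items.add(weight);
            load += weight;
        }
    }

    /** Packs the weights into exactly numBins bins using worst-fit decreasing. */
    public static List<Bin> pack(List<Long> weights, int numBins) {
        if (numBins <= 0) {
            throw new IllegalArgumentException("numBins must be positive");
        }
        // Decreasing order: the heaviest items are placed first.
        List<Long> sorted = new ArrayList<>(weights);
        sorted.sort(Comparator.reverseOrder());

        // Min-heap on load: the head is the bin with the most remaining room.
        PriorityQueue<Bin> byLoad = new PriorityQueue<>(Comparator.comparingLong((Bin b) -> b.load));
        List<Bin> bins = new ArrayList<>();
        for (int i = 0; i < numBins; i++) {
            Bin bin = new Bin();
            bins.add(bin);
            byLoad.add(bin);
        }

        for (long weight : sorted) {
            Bin emptiest = byLoad.poll(); // remove, mutate, reinsert to keep the heap ordered
            emptiest.add(weight);
            byLoad.add(emptiest);
        }
        return bins;
    }

    public static void main(String[] args) {
        for (Bin bin : pack(List.of(5L, 4L, 3L, 2L, 1L), 2)) {
            System.out.println(bin.load + " " + bin.items);
        }
    }
}
```

Removing a bin from the heap before changing its load, then reinserting it, matters here: mutating an element in place would leave the PriorityQueue ordering stale.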

Example 19 with Extract

Use of org.apache.gobblin.source.workunit.Extract in project incubator-gobblin by apache.

In class CopySourceTest, method testPartitionableDataset.

@Test
public void testPartitionableDataset() throws Exception {
    SourceState state = new SourceState();
    state.setProp(ConfigurationKeys.SOURCE_FILEBASED_FS_URI, "file:///");
    state.setProp(ConfigurationKeys.WRITER_FILE_SYSTEM_URI, "file:///");
    state.setProp(ConfigurationKeys.DATA_PUBLISHER_FINAL_DIR, "/target/dir");
    state.setProp(DatasetUtils.DATASET_PROFILE_CLASS_KEY, TestCopyablePartitionableDatasedFinder.class.getCanonicalName());
    CopySource source = new CopySource();
    List<WorkUnit> workunits = source.getWorkunits(state);
    workunits = JobLauncherUtils.flattenWorkUnits(workunits);
    Assert.assertEquals(workunits.size(), TestCopyableDataset.FILE_COUNT);
    Extract extractAbove = null;
    Extract extractBelow = null;
    for (WorkUnit workUnit : workunits) {
        CopyableFile copyableFile = (CopyableFile) CopySource.deserializeCopyEntity(workUnit);
        Assert.assertTrue(copyableFile.getOrigin().getPath().toString().startsWith(TestCopyableDataset.ORIGIN_PREFIX));
        Assert.assertEquals(copyableFile.getDestinationOwnerAndPermission(), TestCopyableDataset.OWNER_AND_PERMISSION);
        if (Integer.parseInt(copyableFile.getOrigin().getPath().getName()) < TestCopyablePartitionableDataset.THRESHOLD) {
            // should be in extractBelow
            if (extractBelow == null) {
                extractBelow = workUnit.getExtract();
            }
            Assert.assertEquals(workUnit.getExtract(), extractBelow);
        } else {
            // should be in extractAbove
            if (extractAbove == null) {
                extractAbove = workUnit.getExtract();
            }
            Assert.assertEquals(workUnit.getExtract(), extractAbove);
        }
    }
    Assert.assertNotNull(extractAbove);
    Assert.assertNotNull(extractBelow);
}
Also used : SourceState(org.apache.gobblin.configuration.SourceState) Extract(org.apache.gobblin.source.workunit.Extract) WorkUnit(org.apache.gobblin.source.workunit.WorkUnit) Test(org.testng.annotations.Test)

Example 20 with Extract

Use of org.apache.gobblin.source.workunit.Extract in project incubator-gobblin by apache.

In class TaskTest, method getEmptyTestTaskState.

TaskState getEmptyTestTaskState(String taskId) {
    // Create a TaskState
    WorkUnit workUnit = WorkUnit.create(new Extract(Extract.TableType.SNAPSHOT_ONLY, this.getClass().getName(), this.getClass().getSimpleName()));
    workUnit.setProp(ConfigurationKeys.TASK_KEY_KEY, "taskKey");
    TaskState taskState = new TaskState(new WorkUnitState(workUnit));
    taskState.setProp(ConfigurationKeys.METRICS_ENABLED_KEY, Boolean.toString(false));
    taskState.setTaskId(taskId);
    taskState.setJobId("1234");
    return taskState;
}
Also used : WorkUnitState(org.apache.gobblin.configuration.WorkUnitState) Extract(org.apache.gobblin.source.workunit.Extract) WorkUnit(org.apache.gobblin.source.workunit.WorkUnit)

Aggregations

Extract (org.apache.gobblin.source.workunit.Extract) 29
WorkUnit (org.apache.gobblin.source.workunit.WorkUnit) 24
WorkUnitState (org.apache.gobblin.configuration.WorkUnitState) 11
SourceState (org.apache.gobblin.configuration.SourceState) 8
Test (org.testng.annotations.Test) 7
Path (org.apache.hadoop.fs.Path) 6
MultiWorkUnit (org.apache.gobblin.source.workunit.MultiWorkUnit) 4
IOException (java.io.IOException) 3
ArrayList (java.util.ArrayList) 3
Configuration (org.apache.hadoop.conf.Configuration) 3
Gson (com.google.gson.Gson) 2
JsonObject (com.google.gson.JsonObject) 2
Config (com.typesafe.config.Config) 2
InputStreamReader (java.io.InputStreamReader) 2
Type (java.lang.reflect.Type) 2
Map (java.util.Map) 2
State (org.apache.gobblin.configuration.State) 2
WatermarkInterval (org.apache.gobblin.source.extractor.WatermarkInterval) 2
LongWatermark (org.apache.gobblin.source.extractor.extract.LongWatermark) 2
TableType (org.apache.gobblin.source.workunit.Extract.TableType) 2