Search in sources :

Example 26 with LongWatermark

use of org.apache.gobblin.source.extractor.extract.LongWatermark in project incubator-gobblin by apache.

the class TableLevelWatermarkerTest method testPreviousState.

@Test
public void testPreviousState() throws Exception {
    WorkUnitState previousWus = new WorkUnitState();
    previousWus.setProp(ConfigurationKeys.DATASET_URN_KEY, "test_table");
    previousWus.setActualHighWatermark(new LongWatermark(100l));
    // Watermark will be lowest of 100l and 101l
    WorkUnitState previousWus1 = new WorkUnitState();
    previousWus1.setProp(ConfigurationKeys.DATASET_URN_KEY, "test_table");
    previousWus1.setActualHighWatermark(new LongWatermark(101l));
    SourceState state = new SourceState(new State(), Lists.newArrayList(previousWus));
    TableLevelWatermarker watermarker = new TableLevelWatermarker(state);
    Assert.assertEquals(watermarker.getPreviousHighWatermark(mockTable("test_table")), new LongWatermark(100l));
}
Also used : SourceState(org.apache.gobblin.configuration.SourceState) WorkUnitState(org.apache.gobblin.configuration.WorkUnitState) WorkUnitState(org.apache.gobblin.configuration.WorkUnitState) State(org.apache.gobblin.configuration.State) SourceState(org.apache.gobblin.configuration.SourceState) LongWatermark(org.apache.gobblin.source.extractor.extract.LongWatermark) Test(org.testng.annotations.Test)

Example 27 with LongWatermark

use of org.apache.gobblin.source.extractor.extract.LongWatermark in project incubator-gobblin by apache.

the class TableLevelWatermarkerTest method testPreviousStateWithPartitionWatermark.

@Test
public void testPreviousStateWithPartitionWatermark() throws Exception {
    WorkUnitState previousWus = new WorkUnitState();
    previousWus.setProp(ConfigurationKeys.DATASET_URN_KEY, "test_table");
    previousWus.setActualHighWatermark(new LongWatermark(100l));
    // Watermark workunits created by PartitionLevelWatermarker need to be ignored.
    WorkUnitState previousWus1 = new WorkUnitState();
    previousWus1.setProp(ConfigurationKeys.DATASET_URN_KEY, "test_table");
    previousWus1.setProp(PartitionLevelWatermarker.IS_WATERMARK_WORKUNIT_KEY, true);
    previousWus1.setActualHighWatermark(new MultiKeyValueLongWatermark(ImmutableMap.of("part1", 200l)));
    SourceState state = new SourceState(new State(), Lists.newArrayList(previousWus));
    TableLevelWatermarker watermarker = new TableLevelWatermarker(state);
    Assert.assertEquals(watermarker.getPreviousHighWatermark(mockTable("test_table")), new LongWatermark(100l));
}
Also used : SourceState(org.apache.gobblin.configuration.SourceState) WorkUnitState(org.apache.gobblin.configuration.WorkUnitState) WorkUnitState(org.apache.gobblin.configuration.WorkUnitState) State(org.apache.gobblin.configuration.State) SourceState(org.apache.gobblin.configuration.SourceState) LongWatermark(org.apache.gobblin.source.extractor.extract.LongWatermark) Test(org.testng.annotations.Test)

Example 28 with LongWatermark

use of org.apache.gobblin.source.extractor.extract.LongWatermark in project incubator-gobblin by apache.

the class ConversionHiveTestUtils method createWus.

public static WorkUnitState createWus(String dbName, String tableName, long watermark) {
    WorkUnitState wus = new WorkUnitState();
    wus.setActualHighWatermark(new LongWatermark(watermark));
    wus.setProp(ConfigurationKeys.DATASET_URN_KEY, dbName + "@" + tableName);
    wus.setProp(ConfigurationKeys.JOB_ID_KEY, "jobId");
    return wus;
}
Also used : WorkUnitState(org.apache.gobblin.configuration.WorkUnitState) LongWatermark(org.apache.gobblin.source.extractor.extract.LongWatermark)

Example 29 with LongWatermark

use of org.apache.gobblin.source.extractor.extract.LongWatermark in project incubator-gobblin by apache.

the class BackfillHiveSourceTest method testNoWhitelist.

@Test
public void testNoWhitelist() throws Exception {
    BackfillHiveSource backfillHiveSource = new BackfillHiveSource();
    SourceState state = new SourceState();
    backfillHiveSource.initBackfillHiveSource(state);
    Partition sourcePartition = Mockito.mock(Partition.class, Mockito.RETURNS_SMART_NULLS);
    Assert.assertTrue(backfillHiveSource.shouldCreateWorkunit(sourcePartition, new LongWatermark(0)));
}
Also used : Partition(org.apache.hadoop.hive.ql.metadata.Partition) SourceState(org.apache.gobblin.configuration.SourceState) BackfillHiveSource(org.apache.gobblin.data.management.conversion.hive.source.BackfillHiveSource) LongWatermark(org.apache.gobblin.source.extractor.extract.LongWatermark) Test(org.testng.annotations.Test)

Example 30 with LongWatermark

use of org.apache.gobblin.source.extractor.extract.LongWatermark in project incubator-gobblin by apache.

the class GoogleWebmasterExtractorTest method getWorkUnitState1.

public static WorkUnitState getWorkUnitState1() {
    WorkUnit wu = new WorkUnit(new Extract(Extract.TableType.APPEND_ONLY, "namespace", "table"));
    wu.setWatermarkInterval(new WatermarkInterval(new LongWatermark(20160101235959L), new LongWatermark(20160102235959L)));
    State js = new State();
    return new WorkUnitState(wu, js);
}
Also used : WatermarkInterval(org.apache.gobblin.source.extractor.WatermarkInterval) WorkUnitState(org.apache.gobblin.configuration.WorkUnitState) State(org.apache.gobblin.configuration.State) WorkUnitState(org.apache.gobblin.configuration.WorkUnitState) Extract(org.apache.gobblin.source.workunit.Extract) WorkUnit(org.apache.gobblin.source.workunit.WorkUnit) LongWatermark(org.apache.gobblin.source.extractor.extract.LongWatermark)

Aggregations

LongWatermark (org.apache.gobblin.source.extractor.extract.LongWatermark)35 Test (org.testng.annotations.Test)16 DefaultCheckpointableWatermark (org.apache.gobblin.source.extractor.DefaultCheckpointableWatermark)12 WorkUnitState (org.apache.gobblin.configuration.WorkUnitState)10 CheckpointableWatermark (org.apache.gobblin.source.extractor.CheckpointableWatermark)9 SourceState (org.apache.gobblin.configuration.SourceState)7 State (org.apache.gobblin.configuration.State)7 WatermarkInterval (org.apache.gobblin.source.extractor.WatermarkInterval)6 IOException (java.io.IOException)5 RecordEnvelope (org.apache.gobblin.stream.RecordEnvelope)5 WorkUnit (org.apache.gobblin.source.workunit.WorkUnit)4 Partition (org.apache.hadoop.hive.ql.metadata.Partition)4 Random (java.util.Random)3 TreeSet (java.util.TreeSet)3 AtomicInteger (java.util.concurrent.atomic.AtomicInteger)3 ComparableWatermark (org.apache.gobblin.source.extractor.ComparableWatermark)3 Path (org.apache.hadoop.fs.Path)3 Benchmark (org.openjdk.jmh.annotations.Benchmark)3 Group (org.openjdk.jmh.annotations.Group)3 Config (com.typesafe.config.Config)2