Example 6 with DefaultCheckpointableWatermark

use of org.apache.gobblin.source.extractor.DefaultCheckpointableWatermark in project incubator-gobblin by apache.

the class StateStoreWatermarkStorageTest method testPersistWatermarkStateToZk.

@Test
public void testPersistWatermarkStateToZk() throws IOException {
    CheckpointableWatermark watermark = new DefaultCheckpointableWatermark("source", new LongWatermark(startTime));
    TaskState taskState = new TaskState();
    taskState.setJobId(TEST_JOB_ID);
    taskState.setProp(ConfigurationKeys.JOB_NAME_KEY, "JobName-" + startTime);
    // watermark storage configuration
    taskState.setProp(StateStoreBasedWatermarkStorage.WATERMARK_STORAGE_TYPE_KEY, "zk");
    taskState.setProp(StateStoreBasedWatermarkStorage.WATERMARK_STORAGE_CONFIG_PREFIX + ZkStateStoreConfigurationKeys.STATE_STORE_ZK_CONNECT_STRING_KEY, testingServer.getConnectString());
    StateStoreBasedWatermarkStorage watermarkStorage = new StateStoreBasedWatermarkStorage(taskState);
    watermarkStorage.commitWatermarks(ImmutableList.of(watermark));
    Map<String, CheckpointableWatermark> watermarkMap = watermarkStorage.getCommittedWatermarks(DefaultCheckpointableWatermark.class, ImmutableList.of("source"));
    Assert.assertEquals(watermarkMap.size(), 1);
    Assert.assertEquals(((LongWatermark) watermarkMap.get("source").getWatermark()).getValue(), startTime);
}
Also used : DefaultCheckpointableWatermark(org.apache.gobblin.source.extractor.DefaultCheckpointableWatermark) CheckpointableWatermark(org.apache.gobblin.source.extractor.CheckpointableWatermark) LongWatermark(org.apache.gobblin.source.extractor.extract.LongWatermark) Test(org.testng.annotations.Test)

Example 7 with DefaultCheckpointableWatermark

use of org.apache.gobblin.source.extractor.DefaultCheckpointableWatermark in project incubator-gobblin by apache.

the class ConsoleWriterTest method writeEnvelope.

private void writeEnvelope(ConsoleWriter consoleWriter, String content, String source, long value) throws IOException {
    CheckpointableWatermark watermark = new DefaultCheckpointableWatermark(source, new LongWatermark(value));
    AcknowledgableWatermark ackable = new AcknowledgableWatermark(watermark);
    RecordEnvelope<String> mockEnvelope = (RecordEnvelope<String>) new RecordEnvelope<>(content).addCallBack(ackable);
    consoleWriter.writeEnvelope(mockEnvelope);
    Assert.assertTrue(ackable.isAcked());
}
Also used : RecordEnvelope(org.apache.gobblin.stream.RecordEnvelope) DefaultCheckpointableWatermark(org.apache.gobblin.source.extractor.DefaultCheckpointableWatermark) CheckpointableWatermark(org.apache.gobblin.source.extractor.CheckpointableWatermark) LongWatermark(org.apache.gobblin.source.extractor.extract.LongWatermark)

Example 8 with DefaultCheckpointableWatermark

use of org.apache.gobblin.source.extractor.DefaultCheckpointableWatermark in project incubator-gobblin by apache.

the class FineGrainedWatermarkTrackerTest method testWatermarkTracker.

/**
 * Single threaded test that creates attempts, acknowledges a few at random
 * then checks if the getCommitables method is returning the right values.
 * Runs a few iterations.
 */
@Test
public static void testWatermarkTracker() {
    Random random = new Random();
    Config config = ConfigFactory.empty();
    for (int j = 0; j < 100; ++j) {
        FineGrainedWatermarkTracker tracker = new FineGrainedWatermarkTracker(config);
        int numWatermarks = 1 + random.nextInt(1000);
        AcknowledgableWatermark[] acknowledgableWatermarks = new AcknowledgableWatermark[numWatermarks];
        for (int i = 0; i < numWatermarks; ++i) {
            CheckpointableWatermark checkpointableWatermark = new DefaultCheckpointableWatermark("default", new LongWatermark(i));
            AcknowledgableWatermark ackable = new AcknowledgableWatermark(checkpointableWatermark);
            acknowledgableWatermarks[i] = ackable;
            tracker.track(ackable);
        }
        // Create some random holes. Don't fire acknowledgements for these messages.
        int numMissingAcks = random.nextInt(numWatermarks);
        SortedSet<Integer> holes = new TreeSet<>();
        for (int i = 0; i < numMissingAcks; ++i) {
            holes.add(random.nextInt(numWatermarks));
        }
        for (int i = 0; i < numWatermarks; ++i) {
            if (!holes.contains(i)) {
                acknowledgableWatermarks[i].ack();
            }
        }
        verifyCommitables(tracker, holes, numWatermarks - 1);
        // verify that sweeping doesn't have any side effects on correctness
        tracker.sweep();
        verifyCommitables(tracker, holes, numWatermarks - 1);
    }
}
Also used : AtomicInteger(java.util.concurrent.atomic.AtomicInteger) Random(java.util.Random) Config(com.typesafe.config.Config) TreeSet(java.util.TreeSet) DefaultCheckpointableWatermark(org.apache.gobblin.source.extractor.DefaultCheckpointableWatermark) CheckpointableWatermark(org.apache.gobblin.source.extractor.CheckpointableWatermark) LongWatermark(org.apache.gobblin.source.extractor.extract.LongWatermark) Test(org.testng.annotations.Test)
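
The helper verifyCommitables called in Example 8 is not shown on this page. The following is a minimal, Gobblin-free sketch of the property such a check would plausibly assert. It assumes the committable watermark for a source is the highest one below the first unacknowledged "hole". The class CommittableSketch and the method highestCommittable are hypothetical names used for illustration only and are not part of the Gobblin API.

```java
import java.util.OptionalLong;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical illustration, not Gobblin code.
public class CommittableSketch {

    /**
     * Watermarks are assumed to be numbered 0..highest, and every hole is
     * assumed to lie in that range. A hole is a watermark that was tracked
     * but never acknowledged.
     *
     * @return the highest watermark for which all earlier watermarks are
     *         acknowledged, or empty if watermark 0 itself is a hole
     */
    public static OptionalLong highestCommittable(SortedSet<Integer> holes, int highest) {
        if (holes.isEmpty()) {
            // Nothing is missing, so everything up to the last watermark can be committed.
            return OptionalLong.of(highest);
        }
        int firstHole = holes.first();
        // Only the contiguous acknowledged prefix before the first hole is safe to commit.
        return firstHole == 0 ? OptionalLong.empty() : OptionalLong.of(firstHole - 1);
    }

    public static void main(String[] args) {
        SortedSet<Integer> holes = new TreeSet<>();
        holes.add(7);
        holes.add(3);
        // Watermarks 0..9 were tracked; 3 and 7 were never acknowledged.
        System.out.println(highestCommittable(holes, 9));
    }
}
```

Under this assumption, acknowledgements that arrive after the first hole (4, 5, 6, 8 and 9 in the usage above) do not advance the committable watermark until the hole at 3 is filled.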

Example 9 with DefaultCheckpointableWatermark

use of org.apache.gobblin.source.extractor.DefaultCheckpointableWatermark in project incubator-gobblin by apache.

the class MultiWriterWatermarkManagerTest method testFailingWatermarkStorage.

/**
 * Test that when we have commits failing to watermark storage, the manager continues to try
 * at every interval and keeps track of the exception it is seeing.
 */
@Test
public void testFailingWatermarkStorage() throws IOException, InterruptedException {
    WatermarkStorage reallyBadWatermarkStorage = mock(WatermarkStorage.class);
    IOException exceptionToThrow = new IOException("Failed to write coz the programmer told me to");
    doThrow(exceptionToThrow).when(reallyBadWatermarkStorage).commitWatermarks(any(Iterable.class));
    long commitInterval = 1000;
    MultiWriterWatermarkManager watermarkManager = new MultiWriterWatermarkManager(reallyBadWatermarkStorage, commitInterval, Optional.<Logger>absent());
    WatermarkAwareWriter mockWriter = mock(WatermarkAwareWriter.class);
    CheckpointableWatermark watermark = new DefaultCheckpointableWatermark("default", new LongWatermark(0));
    when(mockWriter.getCommittableWatermark()).thenReturn(Collections.singletonMap("default", watermark));
    watermarkManager.registerWriter(mockWriter);
    try {
        watermarkManager.start();
    } catch (Exception e) {
        Assert.fail("Should not throw exception", e);
    }
    // sleep for 2.5 iterations
    Thread.sleep(commitInterval * 2 + (commitInterval / 2));
    watermarkManager.close();
    // 2 calls from iterations, 1 additional attempt due to close
    int expectedCalls = 3;
    verify(reallyBadWatermarkStorage, atLeast(expectedCalls)).commitWatermarks(any(Iterable.class));
    Assert.assertEquals(watermarkManager.getCommitStatus().getLastCommitException(), exceptionToThrow, "Testing tracking of failed exceptions");
}
Also used : IOException(java.io.IOException) DefaultCheckpointableWatermark(org.apache.gobblin.source.extractor.DefaultCheckpointableWatermark) CheckpointableWatermark(org.apache.gobblin.source.extractor.CheckpointableWatermark) LongWatermark(org.apache.gobblin.source.extractor.extract.LongWatermark) Test(org.testng.annotations.Test)
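
The behaviour that Example 9 tests (keep attempting commits even when storage fails, and remember the failure) can be sketched without Gobblin or Mockito. This is an illustrative sketch only. The class RetryingCommitSketch, its nested Storage interface, and its methods are hypothetical and do not reflect how MultiWriterWatermarkManager is implemented. The sketch keeps the most recent failure and does not clear it on a later success, which is an assumption of this example rather than documented Gobblin behaviour.

```java
import java.io.IOException;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical illustration, not Gobblin code.
public class RetryingCommitSketch {

    /** Stand-in for a watermark storage whose commit may fail. */
    public interface Storage {
        void commit() throws IOException;
    }

    private final Storage storage;
    private final AtomicInteger attempts = new AtomicInteger();
    private final AtomicReference<Exception> lastException = new AtomicReference<>();

    public RetryingCommitSketch(Storage storage) {
        this.storage = storage;
    }

    /**
     * One commit attempt, as a scheduler would trigger once per interval.
     * A failure is recorded instead of propagated, so the next interval
     * still gets its attempt.
     */
    public void commitOnce() {
        attempts.incrementAndGet();
        try {
            storage.commit();
        } catch (IOException e) {
            lastException.set(e);
        }
    }

    public int getAttempts() {
        return attempts.get();
    }

    public Exception getLastException() {
        return lastException.get();
    }

    public static void main(String[] args) {
        IOException failure = new IOException("storage unavailable");
        RetryingCommitSketch manager = new RetryingCommitSketch(() -> {
            throw failure;
        });
        // Two scheduled iterations plus one final attempt on close, as in the test above.
        manager.commitOnce();
        manager.commitOnce();
        manager.commitOnce();
        System.out.println(manager.getAttempts() + " attempts, last failure: "
            + manager.getLastException().getMessage());
    }
}
```

Calling commitOnce directly rather than from a timer keeps the sketch deterministic. The original test instead sleeps for 2.5 intervals and asserts atLeast(3) calls, which tolerates scheduling jitter.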

Example 10 with DefaultCheckpointableWatermark

use of org.apache.gobblin.source.extractor.DefaultCheckpointableWatermark in project incubator-gobblin by apache.

the class FineGrainedWatermarkTrackerBenchmark method scheduledDelayedAcks.

@Benchmark
@Group("scheduledDelayed")
public void scheduledDelayedAcks(Control control, TrackerState trackerState) throws Exception {
    if (!control.stopMeasurement) {
        final AcknowledgableWatermark wmark = new AcknowledgableWatermark(new DefaultCheckpointableWatermark("0", new LongWatermark(trackerState._index)));
        trackerState._index++;
        int delay = trackerState._random.nextInt(10);
        trackerState._executorService.schedule(new Runnable() {

            @Override
            public void run() {
                wmark.ack();
            }
        }, delay, TimeUnit.MILLISECONDS);
    }
}
Also used : DefaultCheckpointableWatermark(org.apache.gobblin.source.extractor.DefaultCheckpointableWatermark) LongWatermark(org.apache.gobblin.source.extractor.extract.LongWatermark) Group(org.openjdk.jmh.annotations.Group) Benchmark(org.openjdk.jmh.annotations.Benchmark)

Aggregations

DefaultCheckpointableWatermark (org.apache.gobblin.source.extractor.DefaultCheckpointableWatermark) 12
LongWatermark (org.apache.gobblin.source.extractor.extract.LongWatermark) 12
CheckpointableWatermark (org.apache.gobblin.source.extractor.CheckpointableWatermark) 8
Test (org.testng.annotations.Test) 6
Benchmark (org.openjdk.jmh.annotations.Benchmark) 4
Group (org.openjdk.jmh.annotations.Group) 4
IOException (java.io.IOException) 3
Random (java.util.Random) 3
TreeSet (java.util.TreeSet) 3
AtomicInteger (java.util.concurrent.atomic.AtomicInteger) 3
RecordEnvelope (org.apache.gobblin.stream.RecordEnvelope) 3
Config (com.typesafe.config.Config) 1
ArrayList (java.util.ArrayList) 1
List (java.util.List) 1
ScheduledExecutorService (java.util.concurrent.ScheduledExecutorService) 1
ScheduledThreadPoolExecutor (java.util.concurrent.ScheduledThreadPoolExecutor) 1
Schema (org.apache.avro.Schema) 1
GenericRecord (org.apache.avro.generic.GenericRecord) 1
State (org.apache.gobblin.configuration.State) 1
TestPartitioner (org.apache.gobblin.writer.test.TestPartitioner) 1