Search in sources :

Example 1 with IdentityForkOperator

use of org.apache.gobblin.fork.IdentityForkOperator in project incubator-gobblin by apache.

the class TaskTest method testForkCorrectnessIdentity.

/**
 * Test that forks work correctly when the operator picks all outgoing forks
 */
@Test(dataProvider = "stateOverrides")
public void testForkCorrectnessIdentity(State overrides) throws Exception {
    // Create a TaskState
    TaskState taskState = getEmptyTestTaskState("testForkTaskId");
    taskState.addAll(overrides);
    int numRecords = 100;
    int numForks = 5;
    // Identity Fork Operator looks for number of forks in work unit state.
    taskState.setProp(ConfigurationKeys.FORK_BRANCHES_KEY, "" + numForks);
    ForkOperator mockForkOperator = new IdentityForkOperator();
    ArrayList<ArrayList<Object>> recordCollectors = runTaskAndGetResults(taskState, numRecords, numForks, mockForkOperator);
    // Check that we got the right records in the collectors
    int recordsPerFork = numRecords;
    for (int forkNumber = 0; forkNumber < numForks; ++forkNumber) {
        ArrayList<Object> forkRecords = recordCollectors.get(forkNumber);
        Assert.assertEquals(forkRecords.size(), recordsPerFork);
        for (int j = 0; j < recordsPerFork; ++j) {
            Object forkRecord = forkRecords.get(j);
            Assert.assertEquals((String) forkRecord, "" + j);
        }
    }
}
Also used : IdentityForkOperator(org.apache.gobblin.fork.IdentityForkOperator) ForkOperator(org.apache.gobblin.fork.ForkOperator) IdentityForkOperator(org.apache.gobblin.fork.IdentityForkOperator) ArrayList(java.util.ArrayList) Test(org.testng.annotations.Test)

Example 2 with IdentityForkOperator

use of org.apache.gobblin.fork.IdentityForkOperator in project incubator-gobblin by apache.

the class TestRecordStream method setupTask.

private Task setupTask(Extractor extractor, DataWriterBuilder writer, List<Converter<?, ?, ?, ?>> converters, List<RecordStreamProcessor<?, ?, ?, ?>> recordStreamProcessors) throws Exception {
    // Create a TaskState
    TaskState taskState = getEmptyTestTaskState("testRetryTaskId");
    taskState.setProp(ConfigurationKeys.TASK_SYNCHRONOUS_EXECUTION_MODEL_KEY, false);
    // Create a mock TaskContext
    TaskContext mockTaskContext = mock(TaskContext.class);
    when(mockTaskContext.getExtractor()).thenReturn(extractor);
    when(mockTaskContext.getForkOperator()).thenReturn(new IdentityForkOperator());
    when(mockTaskContext.getTaskState()).thenReturn(taskState);
    when(mockTaskContext.getConverters()).thenReturn(converters);
    when(mockTaskContext.getRecordStreamProcessors()).thenReturn(recordStreamProcessors);
    when(mockTaskContext.getTaskLevelPolicyChecker(any(TaskState.class), anyInt())).thenReturn(mock(TaskLevelPolicyChecker.class));
    when(mockTaskContext.getRowLevelPolicyChecker()).thenReturn(new RowLevelPolicyChecker(Lists.newArrayList(), "ss", FileSystem.getLocal(new Configuration())));
    when(mockTaskContext.getRowLevelPolicyChecker(anyInt())).thenReturn(new RowLevelPolicyChecker(Lists.newArrayList(), "ss", FileSystem.getLocal(new Configuration())));
    when(mockTaskContext.getDataWriterBuilder(anyInt(), anyInt())).thenReturn(writer);
    // Create a mock TaskPublisher
    TaskPublisher mockTaskPublisher = mock(TaskPublisher.class);
    when(mockTaskPublisher.canPublish()).thenReturn(TaskPublisher.PublisherState.SUCCESS);
    when(mockTaskContext.getTaskPublisher(any(TaskState.class), any(TaskLevelPolicyCheckResults.class))).thenReturn(mockTaskPublisher);
    // Create a mock TaskStateTracker
    TaskStateTracker mockTaskStateTracker = mock(TaskStateTracker.class);
    // Create a TaskExecutor - a real TaskExecutor must be created so a Fork is run in a separate thread
    TaskExecutor taskExecutor = new TaskExecutor(new Properties());
    // Create the Task
    Task realTask = new Task(mockTaskContext, mockTaskStateTracker, taskExecutor, Optional.<CountDownLatch>absent());
    Task task = spy(realTask);
    doNothing().when(task).submitTaskCommittedEvent();
    return task;
}
Also used : IdentityForkOperator(org.apache.gobblin.fork.IdentityForkOperator) TaskPublisher(org.apache.gobblin.publisher.TaskPublisher) Configuration(org.apache.hadoop.conf.Configuration) TaskLevelPolicyChecker(org.apache.gobblin.qualitychecker.task.TaskLevelPolicyChecker) RowLevelPolicyChecker(org.apache.gobblin.qualitychecker.row.RowLevelPolicyChecker) TaskLevelPolicyCheckResults(org.apache.gobblin.qualitychecker.task.TaskLevelPolicyCheckResults) Properties(java.util.Properties)

Example 3 with IdentityForkOperator

use of org.apache.gobblin.fork.IdentityForkOperator in project incubator-gobblin by apache.

the class TaskContinuousTest method getMockTaskContext.

private TaskContext getMockTaskContext(ArrayList<Object> recordCollector, Extractor mockExtractor) throws Exception {
    TaskState taskState = getStreamingTaskState();
    // Create a mock RowLevelPolicyChecker
    RowLevelPolicyChecker mockRowLevelPolicyChecker = new RowLevelPolicyChecker(Lists.newArrayList(), "stateId", FileSystem.getLocal(new Configuration()));
    WatermarkStorage mockWatermarkStorage = new MockWatermarkStorage();
    // Create a mock TaskPublisher
    TaskPublisher mockTaskPublisher = mock(TaskPublisher.class);
    when(mockTaskPublisher.canPublish()).thenReturn(TaskPublisher.PublisherState.SUCCESS);
    // Create a mock TaskContext
    TaskContext mockTaskContext = mock(TaskContext.class);
    when(mockTaskContext.getTaskMetrics()).thenReturn(TaskMetrics.get(taskState));
    when(mockTaskContext.getExtractor()).thenReturn(mockExtractor);
    when(mockTaskContext.getRawSourceExtractor()).thenReturn(mockExtractor);
    when(mockTaskContext.getWatermarkStorage()).thenReturn(mockWatermarkStorage);
    when(mockTaskContext.getForkOperator()).thenReturn(new IdentityForkOperator());
    when(mockTaskContext.getTaskState()).thenReturn(taskState);
    when(mockTaskContext.getTaskPublisher(any(TaskState.class), any(TaskLevelPolicyCheckResults.class))).thenReturn(mockTaskPublisher);
    when(mockTaskContext.getRowLevelPolicyChecker()).thenReturn(mockRowLevelPolicyChecker);
    when(mockTaskContext.getRowLevelPolicyChecker(anyInt())).thenReturn(mockRowLevelPolicyChecker);
    when(mockTaskContext.getTaskLevelPolicyChecker(any(TaskState.class), anyInt())).thenReturn(mock(TaskLevelPolicyChecker.class));
    when(mockTaskContext.getDataWriterBuilder(anyInt(), anyInt())).thenReturn(new TestStreamingDataWriterBuilder(recordCollector));
    return mockTaskContext;
}
Also used : IdentityForkOperator(org.apache.gobblin.fork.IdentityForkOperator) WatermarkStorage(org.apache.gobblin.writer.WatermarkStorage) TaskPublisher(org.apache.gobblin.publisher.TaskPublisher) Configuration(org.apache.hadoop.conf.Configuration) TaskLevelPolicyChecker(org.apache.gobblin.qualitychecker.task.TaskLevelPolicyChecker) RowLevelPolicyChecker(org.apache.gobblin.qualitychecker.row.RowLevelPolicyChecker) TaskLevelPolicyCheckResults(org.apache.gobblin.qualitychecker.task.TaskLevelPolicyCheckResults)

Example 4 with IdentityForkOperator

use of org.apache.gobblin.fork.IdentityForkOperator in project incubator-gobblin by apache.

the class TaskTest method testRetryTask.

/**
 * Check if a {@link WorkUnitState.WorkingState} of a {@link Task} is set properly after a {@link Task} fails once,
 * but then is successful the next time.
 */
@Test(dataProvider = "stateOverrides")
public void testRetryTask(State overrides) throws Exception {
    // Create a TaskState
    TaskState taskState = getEmptyTestTaskState("testRetryTaskId");
    taskState.addAll(overrides);
    // Create a mock TaskContext
    TaskContext mockTaskContext = mock(TaskContext.class);
    when(mockTaskContext.getTaskMetrics()).thenReturn(TaskMetrics.get(taskState));
    when(mockTaskContext.getExtractor()).thenReturn(new FailOnceExtractor());
    when(mockTaskContext.getForkOperator()).thenReturn(new IdentityForkOperator());
    when(mockTaskContext.getTaskState()).thenReturn(taskState);
    when(mockTaskContext.getTaskLevelPolicyChecker(any(TaskState.class), anyInt())).thenReturn(mock(TaskLevelPolicyChecker.class));
    when(mockTaskContext.getRowLevelPolicyChecker()).thenReturn(new RowLevelPolicyChecker(Lists.newArrayList(), "ss", FileSystem.getLocal(new Configuration())));
    when(mockTaskContext.getRowLevelPolicyChecker(anyInt())).thenReturn(new RowLevelPolicyChecker(Lists.newArrayList(), "ss", FileSystem.getLocal(new Configuration())));
    // Create a mock TaskPublisher
    TaskPublisher mockTaskPublisher = mock(TaskPublisher.class);
    when(mockTaskPublisher.canPublish()).thenReturn(TaskPublisher.PublisherState.SUCCESS);
    when(mockTaskContext.getTaskPublisher(any(TaskState.class), any(TaskLevelPolicyCheckResults.class))).thenReturn(mockTaskPublisher);
    // Create a mock TaskStateTracker
    TaskStateTracker mockTaskStateTracker = mock(TaskStateTracker.class);
    // Create a TaskExecutor - a real TaskExecutor must be created so a Fork is run in a separate thread
    TaskExecutor taskExecutor = new TaskExecutor(new Properties());
    // Create the Task
    Task realTask = new Task(mockTaskContext, mockTaskStateTracker, taskExecutor, Optional.<CountDownLatch>absent());
    Task task = spy(realTask);
    doNothing().when(task).submitTaskCommittedEvent();
    // The first run of the Task should fail
    task.run();
    task.commit();
    Assert.assertEquals(task.getTaskState().getWorkingState(), WorkUnitState.WorkingState.FAILED);
    // The second run of the Task should succeed
    task.run();
    task.commit();
    Assert.assertEquals(task.getTaskState().getWorkingState(), WorkUnitState.WorkingState.SUCCESSFUL);
}
Also used : TaskPublisher(org.apache.gobblin.publisher.TaskPublisher) Configuration(org.apache.hadoop.conf.Configuration) TaskLevelPolicyChecker(org.apache.gobblin.qualitychecker.task.TaskLevelPolicyChecker) TaskLevelPolicyCheckResults(org.apache.gobblin.qualitychecker.task.TaskLevelPolicyCheckResults) Properties(java.util.Properties) IdentityForkOperator(org.apache.gobblin.fork.IdentityForkOperator) RowLevelPolicyChecker(org.apache.gobblin.qualitychecker.row.RowLevelPolicyChecker) Test(org.testng.annotations.Test)

Example 5 with IdentityForkOperator

use of org.apache.gobblin.fork.IdentityForkOperator in project incubator-gobblin by apache.

the class TestRecordStream method testAcks.

@Test
public void testAcks() throws Exception {
    StreamEntity[] entities = new StreamEntity[] { new RecordEnvelope<>("a"), new BasicTestControlMessage("1"), new RecordEnvelope<>("b"), new BasicTestControlMessage("2") };
    BasicAckableForTesting ackable = new BasicAckableForTesting();
    for (int i = 0; i < entities.length; i++) {
        entities[i].addCallBack(ackable);
    }
    MyExtractor extractor = new MyExtractor(entities);
    MyConverter converter = new MyConverter();
    MyDataWriter writer = new MyDataWriter();
    // Create a TaskState
    TaskState taskState = getEmptyTestTaskState("testRetryTaskId");
    taskState.setProp(ConfigurationKeys.TASK_SYNCHRONOUS_EXECUTION_MODEL_KEY, false);
    // Create a mock TaskContext
    TaskContext mockTaskContext = mock(TaskContext.class);
    when(mockTaskContext.getExtractor()).thenReturn(extractor);
    when(mockTaskContext.getForkOperator()).thenReturn(new IdentityForkOperator());
    when(mockTaskContext.getTaskState()).thenReturn(taskState);
    when(mockTaskContext.getConverters()).thenReturn(Lists.newArrayList(converter));
    when(mockTaskContext.getTaskLevelPolicyChecker(any(TaskState.class), anyInt())).thenReturn(mock(TaskLevelPolicyChecker.class));
    when(mockTaskContext.getRowLevelPolicyChecker()).thenReturn(new RowLevelPolicyChecker(Lists.newArrayList(), "ss", FileSystem.getLocal(new Configuration())));
    when(mockTaskContext.getRowLevelPolicyChecker(anyInt())).thenReturn(new RowLevelPolicyChecker(Lists.newArrayList(), "ss", FileSystem.getLocal(new Configuration())));
    when(mockTaskContext.getDataWriterBuilder(anyInt(), anyInt())).thenReturn(writer);
    // Create a mock TaskPublisher
    TaskPublisher mockTaskPublisher = mock(TaskPublisher.class);
    when(mockTaskPublisher.canPublish()).thenReturn(TaskPublisher.PublisherState.SUCCESS);
    when(mockTaskContext.getTaskPublisher(any(TaskState.class), any(TaskLevelPolicyCheckResults.class))).thenReturn(mockTaskPublisher);
    // Create a mock TaskStateTracker
    TaskStateTracker mockTaskStateTracker = mock(TaskStateTracker.class);
    // Create a TaskExecutor - a real TaskExecutor must be created so a Fork is run in a separate thread
    TaskExecutor taskExecutor = new TaskExecutor(new Properties());
    // Create the Task
    Task realTask = new Task(mockTaskContext, mockTaskStateTracker, taskExecutor, Optional.<CountDownLatch>absent());
    Task task = spy(realTask);
    doNothing().when(task).submitTaskCommittedEvent();
    task.run();
    task.commit();
    Assert.assertEquals(task.getTaskState().getWorkingState(), WorkUnitState.WorkingState.SUCCESSFUL);
    Assert.assertEquals(ackable.acked, 4);
}
Also used : TaskPublisher(org.apache.gobblin.publisher.TaskPublisher) Configuration(org.apache.hadoop.conf.Configuration) RecordEnvelope(org.apache.gobblin.stream.RecordEnvelope) TaskLevelPolicyChecker(org.apache.gobblin.qualitychecker.task.TaskLevelPolicyChecker) StreamEntity(org.apache.gobblin.stream.StreamEntity) TaskLevelPolicyCheckResults(org.apache.gobblin.qualitychecker.task.TaskLevelPolicyCheckResults) Properties(java.util.Properties) IdentityForkOperator(org.apache.gobblin.fork.IdentityForkOperator) RowLevelPolicyChecker(org.apache.gobblin.qualitychecker.row.RowLevelPolicyChecker) BasicAckableForTesting(org.apache.gobblin.ack.BasicAckableForTesting) Test(org.testng.annotations.Test)

Aggregations

IdentityForkOperator (org.apache.gobblin.fork.IdentityForkOperator)5 TaskPublisher (org.apache.gobblin.publisher.TaskPublisher)4 RowLevelPolicyChecker (org.apache.gobblin.qualitychecker.row.RowLevelPolicyChecker)4 TaskLevelPolicyCheckResults (org.apache.gobblin.qualitychecker.task.TaskLevelPolicyCheckResults)4 TaskLevelPolicyChecker (org.apache.gobblin.qualitychecker.task.TaskLevelPolicyChecker)4 Configuration (org.apache.hadoop.conf.Configuration)4 Properties (java.util.Properties)3 Test (org.testng.annotations.Test)3 ArrayList (java.util.ArrayList)1 BasicAckableForTesting (org.apache.gobblin.ack.BasicAckableForTesting)1 ForkOperator (org.apache.gobblin.fork.ForkOperator)1 RecordEnvelope (org.apache.gobblin.stream.RecordEnvelope)1 StreamEntity (org.apache.gobblin.stream.StreamEntity)1 WatermarkStorage (org.apache.gobblin.writer.WatermarkStorage)1