Search in sources :

Example 1 with RESTARTING

use of com.hazelcast.jet.core.JobStatus.RESTARTING in project hazelcast-jet by hazelcast.

the class MasterContext method invokeCompleteExecution.

private void invokeCompleteExecution(Throwable error) {
    JobStatus status = jobStatus();
    Throwable finalError;
    if (status == STARTING || status == RESTARTING || status == RUNNING) {
        logger.fine("Completing " + jobIdString());
        finalError = error;
    } else {
        if (error != null) {
            logger.severe("Cannot properly complete failed " + jobIdString() + ": status is " + status, error);
        } else {
            logger.severe("Cannot properly complete " + jobIdString() + ": status is " + status);
        }
        finalError = new IllegalStateException("Job coordination failed.");
    }
    Function<ExecutionPlan, Operation> operationCtor = plan -> new CompleteExecutionOperation(executionId, finalError);
    invoke(operationCtor, responses -> finalizeJob(error), null);
}
Also used : JobStatus(com.hazelcast.jet.core.JobStatus) NO_SNAPSHOT(com.hazelcast.jet.impl.execution.SnapshotContext.NO_SNAPSHOT) SnapshotRepository.snapshotDataMapName(com.hazelcast.jet.impl.SnapshotRepository.snapshotDataMapName) NonCompletableFuture(com.hazelcast.jet.impl.util.NonCompletableFuture) Address(com.hazelcast.nio.Address) Util.jobAndExecutionId(com.hazelcast.jet.impl.util.Util.jobAndExecutionId) Processors.mapP(com.hazelcast.jet.core.processor.Processors.mapP) SourceProcessors.readMapP(com.hazelcast.jet.core.processor.SourceProcessors.readMapP) CompletionToken(com.hazelcast.jet.impl.util.CompletionToken) Util.idToString(com.hazelcast.jet.impl.util.Util.idToString) AtomicInteger(java.util.concurrent.atomic.AtomicInteger) MemberInfo(com.hazelcast.internal.cluster.MemberInfo) Map(java.util.Map) STARTING(com.hazelcast.jet.core.JobStatus.STARTING) DAG(com.hazelcast.jet.core.DAG) JobStatus(com.hazelcast.jet.core.JobStatus) ExceptionUtil(com.hazelcast.jet.impl.util.ExceptionUtil) ExecutionService(com.hazelcast.spi.ExecutionService) Operation(com.hazelcast.spi.Operation) CancellationException(java.util.concurrent.CancellationException) Collections.emptyList(java.util.Collections.emptyList) Collection(java.util.Collection) Util.getJetInstance(com.hazelcast.jet.impl.util.Util.getJetInstance) JobConfig(com.hazelcast.jet.config.JobConfig) ConcurrentHashMap(java.util.concurrent.ConcurrentHashMap) Set(java.util.Set) Collectors(java.util.stream.Collectors) BroadcastKey(com.hazelcast.jet.core.BroadcastKey) List(java.util.List) ExecutionCallback(com.hazelcast.core.ExecutionCallback) ExecutionPlan(com.hazelcast.jet.impl.execution.init.ExecutionPlan) Entry(java.util.Map.Entry) CancelExecutionOperation(com.hazelcast.jet.impl.operation.CancelExecutionOperation) TopologyChangedException(com.hazelcast.jet.core.TopologyChangedException) COMPLETED(com.hazelcast.jet.core.JobStatus.COMPLETED) InternalCompletableFuture(com.hazelcast.spi.InternalCompletableFuture) ExecutionPlanBuilder.createExecutionPlans(com.hazelcast.jet.impl.execution.init.ExecutionPlanBuilder.createExecutionPlans) Collectors.partitioningBy(java.util.stream.Collectors.partitioningBy) AtomicBoolean(java.util.concurrent.atomic.AtomicBoolean) HashMap(java.util.HashMap) CompletableFuture(java.util.concurrent.CompletableFuture) StartExecutionOperation(com.hazelcast.jet.impl.operation.StartExecutionOperation) AtomicReference(java.util.concurrent.atomic.AtomicReference) Function(java.util.function.Function) HashSet(java.util.HashSet) InitExecutionOperation(com.hazelcast.jet.impl.operation.InitExecutionOperation) ILogger(com.hazelcast.logging.ILogger) ExceptionUtil.withTryCatch(com.hazelcast.jet.impl.util.ExceptionUtil.withTryCatch) NOT_STARTED(com.hazelcast.jet.core.JobStatus.NOT_STARTED) MembersView(com.hazelcast.internal.cluster.impl.MembersView) DistributedFunction(com.hazelcast.jet.function.DistributedFunction) Edge(com.hazelcast.jet.core.Edge) ClusterServiceImpl(com.hazelcast.internal.cluster.impl.ClusterServiceImpl) Nullable(javax.annotation.Nullable) NodeEngineImpl(com.hazelcast.spi.impl.NodeEngineImpl) RESTARTING(com.hazelcast.jet.core.JobStatus.RESTARTING) BroadcastEntry(com.hazelcast.jet.impl.execution.BroadcastEntry) ExceptionUtil.isTopologicalFailure(com.hazelcast.jet.impl.util.ExceptionUtil.isTopologicalFailure) DistributedFunctions.entryKey(com.hazelcast.jet.function.DistributedFunctions.entryKey) Consumer(java.util.function.Consumer) Vertex(com.hazelcast.jet.core.Vertex) Collectors.toList(java.util.stream.Collectors.toList) CustomClassLoadedObject.deserializeWithCustomClassLoader(com.hazelcast.jet.impl.execution.init.CustomClassLoadedObject.deserializeWithCustomClassLoader) CompleteExecutionOperation(com.hazelcast.jet.impl.operation.CompleteExecutionOperation) ExceptionUtil.peel(com.hazelcast.jet.impl.util.ExceptionUtil.peel) FAILED(com.hazelcast.jet.core.JobStatus.FAILED) RUNNING(com.hazelcast.jet.core.JobStatus.RUNNING) ProcessingGuarantee(com.hazelcast.jet.config.ProcessingGuarantee) JobRestartRequestedException(com.hazelcast.jet.impl.exception.JobRestartRequestedException) SnapshotOperation(com.hazelcast.jet.impl.operation.SnapshotOperation) Edge.between(com.hazelcast.jet.core.Edge.between) ExecutionPlan(com.hazelcast.jet.impl.execution.init.ExecutionPlan) Operation(com.hazelcast.spi.Operation) CancelExecutionOperation(com.hazelcast.jet.impl.operation.CancelExecutionOperation) StartExecutionOperation(com.hazelcast.jet.impl.operation.StartExecutionOperation) InitExecutionOperation(com.hazelcast.jet.impl.operation.InitExecutionOperation) CompleteExecutionOperation(com.hazelcast.jet.impl.operation.CompleteExecutionOperation) SnapshotOperation(com.hazelcast.jet.impl.operation.SnapshotOperation) CompleteExecutionOperation(com.hazelcast.jet.impl.operation.CompleteExecutionOperation)

Example 2 with RESTARTING

use of com.hazelcast.jet.core.JobStatus.RESTARTING in project hazelcast-jet by hazelcast.

the class SplitBrainTest method when_quorumIsLostOnBothSides_then_jobRestartsUntilMerge.

@Test
public void when_quorumIsLostOnBothSides_then_jobRestartsUntilMerge() {
    int firstSubClusterSize = 2;
    int secondSubClusterSize = 2;
    int clusterSize = firstSubClusterSize + secondSubClusterSize;
    StuckProcessor.executionStarted = new CountDownLatch(clusterSize * PARALLELISM);
    Job[] jobRef = new Job[1];
    Consumer<JetInstance[]> beforeSplit = instances -> {
        MockPS processorSupplier = new MockPS(StuckProcessor::new, clusterSize);
        DAG dag = new DAG().vertex(new Vertex("test", processorSupplier));
        jobRef[0] = instances[0].newJob(dag, new JobConfig().setSplitBrainProtection(true));
        assertOpenEventually(StuckProcessor.executionStarted);
    };
    BiConsumer<JetInstance[], JetInstance[]> onSplit = (firstSubCluster, secondSubCluster) -> {
        StuckProcessor.proceedLatch.countDown();
        long jobId = jobRef[0].getId();
        assertTrueEventually(() -> {
            JetService service1 = getJetService(firstSubCluster[0]);
            JetService service2 = getJetService(secondSubCluster[0]);
            assertEquals(RESTARTING, service1.getJobCoordinationService().getJobStatus(jobId));
            assertEquals(STARTING, service2.getJobCoordinationService().getJobStatus(jobId));
        });
        assertTrueAllTheTime(() -> {
            JetService service1 = getJetService(firstSubCluster[0]);
            JetService service2 = getJetService(secondSubCluster[0]);
            assertEquals(RESTARTING, service1.getJobCoordinationService().getJobStatus(jobId));
            assertEquals(STARTING, service2.getJobCoordinationService().getJobStatus(jobId));
        }, 20);
    };
    Consumer<JetInstance[]> afterMerge = instances -> {
        assertTrueEventually(() -> {
            assertEquals(clusterSize * 2, MockPS.initCount.get());
            assertEquals(clusterSize * 2, MockPS.closeCount.get());
        });
        assertEquals(clusterSize, MockPS.receivedCloseErrors.size());
        MockPS.receivedCloseErrors.forEach(t -> assertTrue(t instanceof TopologyChangedException));
    };
    testSplitBrain(firstSubClusterSize, secondSubClusterSize, beforeSplit, onSplit, afterMerge);
}
Also used : MasterContext(com.hazelcast.jet.impl.MasterContext) JetInstance(com.hazelcast.jet.JetInstance) RunWith(org.junit.runner.RunWith) HazelcastSerialClassRunner(com.hazelcast.test.HazelcastSerialClassRunner) Future(java.util.concurrent.Future) STARTING(com.hazelcast.jet.core.JobStatus.STARTING) BiConsumer(java.util.function.BiConsumer) Assert.fail(org.junit.Assert.fail) ExpectedException(org.junit.rules.ExpectedException) Job(com.hazelcast.jet.Job) JobRepository(com.hazelcast.jet.impl.JobRepository) JetConfig(com.hazelcast.jet.config.JetConfig) MAX_BACKUP_COUNT(com.hazelcast.spi.partition.IPartition.MAX_BACKUP_COUNT) MockPS(com.hazelcast.jet.core.TestProcessors.MockPS) CancellationException(java.util.concurrent.CancellationException) RESTARTING(com.hazelcast.jet.core.JobStatus.RESTARTING) Assert.assertNotNull(org.junit.Assert.assertNotNull) JobConfig(com.hazelcast.jet.config.JobConfig) StuckProcessor(com.hazelcast.jet.core.TestProcessors.StuckProcessor) Assert.assertTrue(org.junit.Assert.assertTrue) Test(org.junit.Test) TimeUnit(java.util.concurrent.TimeUnit) Consumer(java.util.function.Consumer) CountDownLatch(java.util.concurrent.CountDownLatch) Rule(org.junit.Rule) JetService(com.hazelcast.jet.impl.JetService) COMPLETED(com.hazelcast.jet.core.JobStatus.COMPLETED) JobRecord(com.hazelcast.jet.impl.JobRecord) Assert.assertEquals(org.junit.Assert.assertEquals) MockPS(com.hazelcast.jet.core.TestProcessors.MockPS) JetService(com.hazelcast.jet.impl.JetService) CountDownLatch(java.util.concurrent.CountDownLatch) JobConfig(com.hazelcast.jet.config.JobConfig) Job(com.hazelcast.jet.Job) Test(org.junit.Test)

Aggregations

JobConfig (com.hazelcast.jet.config.JobConfig)2 COMPLETED (com.hazelcast.jet.core.JobStatus.COMPLETED)2 RESTARTING (com.hazelcast.jet.core.JobStatus.RESTARTING)2 STARTING (com.hazelcast.jet.core.JobStatus.STARTING)2 CancellationException (java.util.concurrent.CancellationException)2 Consumer (java.util.function.Consumer)2 ExecutionCallback (com.hazelcast.core.ExecutionCallback)1 MemberInfo (com.hazelcast.internal.cluster.MemberInfo)1 ClusterServiceImpl (com.hazelcast.internal.cluster.impl.ClusterServiceImpl)1 MembersView (com.hazelcast.internal.cluster.impl.MembersView)1 JetInstance (com.hazelcast.jet.JetInstance)1 Job (com.hazelcast.jet.Job)1 JetConfig (com.hazelcast.jet.config.JetConfig)1 ProcessingGuarantee (com.hazelcast.jet.config.ProcessingGuarantee)1 BroadcastKey (com.hazelcast.jet.core.BroadcastKey)1 DAG (com.hazelcast.jet.core.DAG)1 Edge (com.hazelcast.jet.core.Edge)1 Edge.between (com.hazelcast.jet.core.Edge.between)1 JobStatus (com.hazelcast.jet.core.JobStatus)1 FAILED (com.hazelcast.jet.core.JobStatus.FAILED)1