Example 1 with JobResult

Use of org.apache.flink.runtime.jobmaster.JobResult in project flink by apache.

In the class DispatcherTest, method testThatDirtilyFinishedJobsNotBeingRetriggered.

@Test(expected = IllegalArgumentException.class)
public void testThatDirtilyFinishedJobsNotBeingRetriggered() throws Exception {
    final JobGraph jobGraph = JobGraphTestUtils.emptyJobGraph();
    final JobResult jobResult = TestingJobResultStore.createSuccessfulJobResult(jobGraph.getJobID());
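    // The same job is passed both as recovered and as dirty, so building the dispatcher is expected to throw.
    dispatcher = createTestingDispatcherBuilder()
            .setRecoveredJobs(Collections.singleton(jobGraph))
            .setRecoveredDirtyJobs(Collections.singleton(jobResult))
            .build();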
}
Also used : JobGraph(org.apache.flink.runtime.jobgraph.JobGraph) JobResult(org.apache.flink.runtime.jobmaster.JobResult) Test(org.junit.Test)

Example 2 with JobResult

Use of org.apache.flink.runtime.jobmaster.JobResult in project flink by apache.

In the class CheckpointResourcesCleanupRunnerTest, method testRequestJob_JobState.

@Test
public void testRequestJob_JobState() {
    final JobResult jobResult = createDummySuccessJobResult();
    testRequestJobExecutionGraph(
            jobResult,
            System.currentTimeMillis(),
            actualExecutionGraph -> actualExecutionGraph.getState().equals(jobResult.getApplicationStatus().deriveJobStatus()));
}
Also used : JobResult(org.apache.flink.runtime.jobmaster.JobResult) Test(org.junit.jupiter.api.Test)

Example 3 with JobResult

Use of org.apache.flink.runtime.jobmaster.JobResult in project flink by apache.

In the class JobDispatcherLeaderProcessFactoryFactory, method createFactory.

@Override
public JobDispatcherLeaderProcessFactory createFactory(
        JobPersistenceComponentFactory jobPersistenceComponentFactory,
        Executor ioExecutor,
        RpcService rpcService,
        PartialDispatcherServices partialDispatcherServices,
        FatalErrorHandler fatalErrorHandler) {
    final JobGraph jobGraph;
    try {
        jobGraph = Preconditions.checkNotNull(jobGraphRetriever.retrieveJobGraph(partialDispatcherServices.getConfiguration()));
    } catch (FlinkException e) {
        throw new FlinkRuntimeException("Could not retrieve the JobGraph.", e);
    }
    final JobResultStore jobResultStore = jobPersistenceComponentFactory.createJobResultStore();
    final Collection<JobResult> recoveredDirtyJobResults = getDirtyJobResults(jobResultStore);
    final Optional<JobResult> maybeRecoveredDirtyJobResult = extractDirtyJobResult(recoveredDirtyJobResults, jobGraph);
    final Optional<JobGraph> maybeJobGraph = getJobGraphBasedOnDirtyJobResults(jobGraph, recoveredDirtyJobResults);
    final DefaultDispatcherGatewayServiceFactory defaultDispatcherServiceFactory =
            new DefaultDispatcherGatewayServiceFactory(JobDispatcherFactory.INSTANCE, rpcService, partialDispatcherServices);
    return new JobDispatcherLeaderProcessFactory(
            defaultDispatcherServiceFactory,
            maybeJobGraph.orElse(null),
            maybeRecoveredDirtyJobResult.orElse(null),
            jobResultStore,
            fatalErrorHandler);
}
Also used : JobGraph(org.apache.flink.runtime.jobgraph.JobGraph) JobResult(org.apache.flink.runtime.jobmaster.JobResult) FlinkRuntimeException(org.apache.flink.util.FlinkRuntimeException) JobResultStore(org.apache.flink.runtime.highavailability.JobResultStore) FlinkException(org.apache.flink.util.FlinkException)
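
Example 3 calls two helpers, extractDirtyJobResult and getJobGraphBasedOnDirtyJobResults, whose bodies are not shown. The following is a rough sketch of what such helpers might do, assuming that a dirty result whose job ID matches the retrieved job means the job already terminated and should not be run again. The types DirtyResult and Job, the method names, and the matching logic are hypothetical stand-ins, not Flink's actual code.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.Optional;

class DirtyResultLookupSketch {
    // Hypothetical stand-ins for JobResult and JobGraph, reduced to their job ID.
    static final class DirtyResult {
        final String jobId;

        DirtyResult(String jobId) {
            this.jobId = jobId;
        }
    }

    static final class Job {
        final String jobId;

        Job(String jobId) {
            this.jobId = jobId;
        }
    }

    // Returns the dirty result that belongs to the given job, if there is one.
    static Optional<DirtyResult> extractDirtyResult(Collection<DirtyResult> dirtyResults, Job job) {
        return dirtyResults.stream()
                .filter(result -> result.jobId.equals(job.jobId))
                .findFirst();
    }

    // Assumption: a job with a matching dirty result already terminated, so it is not handed out again.
    static Optional<Job> jobToRun(Job job, Collection<DirtyResult> dirtyResults) {
        return extractDirtyResult(dirtyResults, job).isPresent() ? Optional.empty() : Optional.of(job);
    }

    public static void main(String[] args) {
        Job job = new Job("job-1");
        // No dirty results: the job is returned for execution.
        System.out.println(jobToRun(job, Collections.emptyList()).isPresent());
        // A matching dirty result: the job is withheld.
        System.out.println(jobToRun(job, Arrays.asList(new DirtyResult("job-1"))).isPresent());
    }
}
```

Under this assumption, exactly one of the two values that Example 3 unwraps with orElse(null) would be non-null when it reaches the JobDispatcherLeaderProcessFactory constructor.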

Example 4 with JobResult

Use of org.apache.flink.runtime.jobmaster.JobResult in project flink by apache.

In the class MiniDispatcher, method requestJobResult.

@Override
public CompletableFuture<JobResult> requestJobResult(JobID jobId, Time timeout) {
    final CompletableFuture<JobResult> jobResultFuture = super.requestJobResult(jobId, timeout);
    if (executionMode == ClusterEntrypoint.ExecutionMode.NORMAL) {
        // terminate the MiniDispatcher once we served the first JobResult successfully
        jobResultFuture.thenAccept((JobResult result) -> {
            ApplicationStatus status = result.getSerializedThrowable().isPresent() ? ApplicationStatus.FAILED : ApplicationStatus.SUCCEEDED;
            if (!ApplicationStatus.UNKNOWN.equals(result.getApplicationStatus())) {
                log.info("Shutting down cluster because someone retrieved the job result and the status is globally terminal.");
                shutDownFuture.complete(status);
            }
        });
    } else {
        log.info("Not shutting down cluster after someone retrieved the job result.");
    }
    return jobResultFuture;
}
Also used : JobResult(org.apache.flink.runtime.jobmaster.JobResult) ApplicationStatus(org.apache.flink.runtime.clusterframework.ApplicationStatus)
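
The callback pattern in Example 4 can be tried without a Flink cluster. The following is a minimal sketch, assuming hypothetical stand-ins (Status, Result, ShutdownOnResultSketch) for ApplicationStatus, JobResult, and the dispatcher. It is not Flink's implementation; it only shows how a thenAccept callback can complete a separate shutdown future once a result with a known status has been served.

```java
import java.util.Optional;
import java.util.concurrent.CompletableFuture;

class ShutdownOnResultSketch {
    // Hypothetical stand-in for Flink's ApplicationStatus (assumption, not Flink API).
    enum Status { SUCCEEDED, FAILED, UNKNOWN }

    // Hypothetical stand-in for JobResult: a reported status plus an optional failure cause.
    static final class Result {
        final Status status;
        final Optional<Throwable> failure;

        Result(Status status, Throwable failure) {
            this.status = status;
            this.failure = Optional.ofNullable(failure);
        }
    }

    // Completed once a result with a known (non-UNKNOWN) status has been served.
    final CompletableFuture<Status> shutDownFuture = new CompletableFuture<>();

    // Registers the shutdown callback and hands the same future back to the caller.
    CompletableFuture<Result> serve(CompletableFuture<Result> resultFuture) {
        resultFuture.thenAccept(result -> {
            Status status = result.failure.isPresent() ? Status.FAILED : Status.SUCCEEDED;
            if (result.status != Status.UNKNOWN) {
                shutDownFuture.complete(status);
            }
        });
        return resultFuture;
    }

    public static void main(String[] args) {
        ShutdownOnResultSketch sketch = new ShutdownOnResultSketch();
        CompletableFuture<Result> pending = new CompletableFuture<>();
        sketch.serve(pending);
        // Completing the result future runs the callback, which completes the shutdown future.
        pending.complete(new Result(Status.SUCCEEDED, null));
        System.out.println(sketch.shutDownFuture.join());
    }
}
```

As in Example 4, the caller still receives the original result future, and a result reported as UNKNOWN leaves the shutdown future untouched.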

Example 5 with JobResult

Use of org.apache.flink.runtime.jobmaster.JobResult in project flink by apache.

In the class JobDispatcherITCase, method testRecoverFromCheckpointAfterLosingAndRegainingLeadership.

@Test
public void testRecoverFromCheckpointAfterLosingAndRegainingLeadership(@TempDir Path tmpPath) throws Exception {
    final Deadline deadline = Deadline.fromNow(TIMEOUT);
    final Configuration configuration = new Configuration();
    configuration.set(HighAvailabilityOptions.HA_MODE, HighAvailabilityMode.ZOOKEEPER.name());
    final TestingMiniClusterConfiguration clusterConfiguration = TestingMiniClusterConfiguration.newBuilder().setConfiguration(configuration).build();
    final EmbeddedHaServicesWithLeadershipControl haServices = new EmbeddedHaServicesWithLeadershipControl(TestingUtils.defaultExecutor());
    final Configuration newConfiguration = new Configuration(clusterConfiguration.getConfiguration());
    final long checkpointInterval = 100;
    final JobID jobID = generateAndPersistJobGraph(newConfiguration, checkpointInterval, tmpPath);
    final TestingMiniCluster.Builder clusterBuilder = TestingMiniCluster.newBuilder(clusterConfiguration)
            .setHighAvailabilityServicesSupplier(() -> haServices)
            .setDispatcherResourceManagerComponentFactorySupplier(
                    createJobModeDispatcherResourceManagerComponentFactorySupplier(newConfiguration));
    AtLeastOneCheckpointInvokable.reset();
    try (final MiniCluster cluster = clusterBuilder.build()) {
        // start mini cluster and submit the job
        cluster.start();
        AtLeastOneCheckpointInvokable.atLeastOneCheckpointCompleted.await();
        final CompletableFuture<JobResult> firstJobResult = cluster.requestJobResult(jobID);
        haServices.revokeDispatcherLeadership();
        // make sure the leadership is revoked to avoid race conditions
        Assertions.assertEquals(ApplicationStatus.UNKNOWN, firstJobResult.get().getApplicationStatus());
        haServices.grantDispatcherLeadership();
        // job is suspended, wait until it's running
        awaitJobStatus(cluster, jobID, JobStatus.RUNNING, deadline);
        CommonTestUtils.waitUntilCondition(
                () -> cluster.getArchivedExecutionGraph(jobID).get().getCheckpointStatsSnapshot().getLatestRestoredCheckpoint() != null,
                deadline);
    }
}
Also used : TestingMiniCluster(org.apache.flink.runtime.minicluster.TestingMiniCluster) TestingMiniClusterConfiguration(org.apache.flink.runtime.minicluster.TestingMiniClusterConfiguration) CheckpointCoordinatorConfiguration(org.apache.flink.runtime.jobgraph.tasks.CheckpointCoordinatorConfiguration) Configuration(org.apache.flink.configuration.Configuration) JobResult(org.apache.flink.runtime.jobmaster.JobResult) Deadline(org.apache.flink.api.common.time.Deadline) EmbeddedHaServicesWithLeadershipControl(org.apache.flink.runtime.highavailability.nonha.embedded.EmbeddedHaServicesWithLeadershipControl) MiniCluster(org.apache.flink.runtime.minicluster.MiniCluster) JobID(org.apache.flink.api.common.JobID) Test(org.junit.jupiter.api.Test)

Aggregations

JobResult (org.apache.flink.runtime.jobmaster.JobResult): 58
Test (org.junit.Test): 28
JobGraph (org.apache.flink.runtime.jobgraph.JobGraph): 25
JobID (org.apache.flink.api.common.JobID): 15
Test (org.junit.jupiter.api.Test): 13
MiniCluster (org.apache.flink.runtime.minicluster.MiniCluster): 11
ExecutionException (java.util.concurrent.ExecutionException): 8
JobSubmissionResult (org.apache.flink.api.common.JobSubmissionResult): 7
Deadline (org.apache.flink.api.common.time.Deadline): 7
Configuration (org.apache.flink.configuration.Configuration): 7
File (java.io.File): 5
JobResultStore (org.apache.flink.runtime.highavailability.JobResultStore): 5
IOException (java.io.IOException): 4
CompletableFuture (java.util.concurrent.CompletableFuture): 4
ScheduledExecutorService (java.util.concurrent.ScheduledExecutorService): 4
JobVertex (org.apache.flink.runtime.jobgraph.JobVertex): 4
Duration (java.time.Duration): 3
List (java.util.List): 3
Time (org.apache.flink.api.common.time.Time): 3
MiniClusterClient (org.apache.flink.client.program.MiniClusterClient): 3