Example 6 with JobResult

use of org.apache.flink.runtime.jobmaster.JobResult in project flink by apache.

the class FileSystemJobResultStore method getDirtyResultsInternal.

@Override
public Set<JobResult> getDirtyResultsInternal() throws IOException {
    final Set<JobResult> dirtyResults = new HashSet<>();
    // every entry directly under basePath is a candidate
    FileStatus[] statuses = fileSystem.listStatus(this.basePath);
    for (FileStatus s : statuses) {
        // skip directories and files without the dirty-entry extension
        if (!s.isDir() && hasValidDirtyJobResultStoreEntryExtension(s.getPath().getName())) {
            // deserialize the JSON entry and keep only the wrapped JobResult
            JsonJobResultEntry jre =
                    mapper.readValue(fileSystem.open(s.getPath()), JsonJobResultEntry.class);
            dirtyResults.add(jre.getJobResult());
        }
    }
    return dirtyResults;
}
Also used : FileStatus(org.apache.flink.core.fs.FileStatus) JobResult(org.apache.flink.runtime.jobmaster.JobResult) HashSet(java.util.HashSet)
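
The scan-and-filter pattern in this example can be tried without a Flink dependency. The following is a minimal sketch using only java.nio.file; the class name DirtyEntryScan and the "_DIRTY.json" suffix are assumptions made for illustration, not names taken from the snippet above, and the sketch returns file names instead of deserialized results.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Set;
import java.util.TreeSet;

final class DirtyEntryScan {
    // Hypothetical suffix chosen for this sketch; the real store may use a different one.
    static final String DIRTY_SUFFIX = "_DIRTY.json";

    /** Returns the names of regular files directly under baseDir that end with DIRTY_SUFFIX. */
    static Set<String> listDirtyEntries(Path baseDir) throws IOException {
        final Set<String> dirty = new TreeSet<>();
        // try-with-resources closes the directory handle even if iteration fails
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(baseDir)) {
            for (Path entry : entries) {
                String name = entry.getFileName().toString();
                // directories are skipped even when their name carries the suffix
                if (Files.isRegularFile(entry) && name.endsWith(DIRTY_SUFFIX)) {
                    dirty.add(name);
                }
            }
        }
        return dirty;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("jrs-sketch");
        Files.createFile(dir.resolve("job-a_DIRTY.json"));
        Files.createFile(dir.resolve("job-b.json"));
        Files.createDirectory(dir.resolve("nested_DIRTY.json"));
        // only job-a_DIRTY.json qualifies
        System.out.println(listDirtyEntries(dir));
    }
}
```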

Example 7 with JobResult

use of org.apache.flink.runtime.jobmaster.JobResult in project flink by apache.

the class ApplicationDispatcherBootstrapITCase method testDispatcherRecoversAfterLosingAndRegainingLeadership.

@Test
public void testDispatcherRecoversAfterLosingAndRegainingLeadership() throws Exception {
    final String blockId = UUID.randomUUID().toString();
    final Deadline deadline = Deadline.fromNow(TIMEOUT);
    final Configuration configuration = new Configuration();
    configuration.set(HighAvailabilityOptions.HA_MODE, HighAvailabilityMode.ZOOKEEPER.name());
    configuration.set(DeploymentOptions.TARGET, EmbeddedExecutor.NAME);
    configuration.set(ClientOptions.CLIENT_RETRY_PERIOD, Duration.ofMillis(100));
    final TestingMiniClusterConfiguration clusterConfiguration =
            TestingMiniClusterConfiguration.newBuilder().setConfiguration(configuration).build();
    final EmbeddedHaServicesWithLeadershipControl haServices =
            new EmbeddedHaServicesWithLeadershipControl(TestingUtils.defaultExecutor());
    final TestingMiniCluster.Builder clusterBuilder =
            TestingMiniCluster.newBuilder(clusterConfiguration)
                    .setHighAvailabilityServicesSupplier(() -> haServices)
                    .setDispatcherResourceManagerComponentFactorySupplier(
                            createApplicationModeDispatcherResourceManagerComponentFactorySupplier(
                                    clusterConfiguration.getConfiguration(),
                                    BlockingJob.getProgram(blockId)));
    try (final MiniCluster cluster = clusterBuilder.build()) {
        // start mini cluster and submit the job
        cluster.start();
        // wait until job is running
        awaitJobStatus(cluster, ApplicationDispatcherBootstrap.ZERO_JOB_ID, JobStatus.RUNNING, deadline);
        // make sure the operator is actually running
        BlockingJob.awaitRunning(blockId);
        final CompletableFuture<JobResult> firstJobResult = cluster.requestJobResult(ApplicationDispatcherBootstrap.ZERO_JOB_ID);
        haServices.revokeDispatcherLeadership();
        // make sure the leadership is revoked to avoid race conditions
        assertThat(firstJobResult.get()).extracting(JobResult::getApplicationStatus).isEqualTo(ApplicationStatus.UNKNOWN);
        haServices.grantDispatcherLeadership();
        // job is suspended, wait until it's running
        awaitJobStatus(cluster, ApplicationDispatcherBootstrap.ZERO_JOB_ID, JobStatus.RUNNING, deadline);
        // unblock processing so the job can finish
        BlockingJob.unblock(blockId);
        // and wait for it to actually finish
        final JobResult secondJobResult = cluster.requestJobResult(ApplicationDispatcherBootstrap.ZERO_JOB_ID).get();
        assertThat(secondJobResult.isSuccess()).isTrue();
        assertThat(secondJobResult.getApplicationStatus()).isEqualTo(ApplicationStatus.SUCCEEDED);
        // the cluster should shut down automatically once the application completes
        awaitClusterStopped(cluster, deadline);
    } finally {
        BlockingJob.cleanUp(blockId);
    }
}
Also used : TestingMiniCluster(org.apache.flink.runtime.minicluster.TestingMiniCluster) TestingMiniClusterConfiguration(org.apache.flink.runtime.minicluster.TestingMiniClusterConfiguration) Configuration(org.apache.flink.configuration.Configuration) JobResult(org.apache.flink.runtime.jobmaster.JobResult) Deadline(org.apache.flink.api.common.time.Deadline) EmbeddedHaServicesWithLeadershipControl(org.apache.flink.runtime.highavailability.nonha.embedded.EmbeddedHaServicesWithLeadershipControl) MiniCluster(org.apache.flink.runtime.minicluster.MiniCluster) Test(org.junit.jupiter.api.Test)

Example 8 with JobResult

use of org.apache.flink.runtime.jobmaster.JobResult in project flink by apache.

the class ApplicationDispatcherGatewayServiceFactory method create.

@Override
public AbstractDispatcherLeaderProcess.DispatcherGatewayService create(
        DispatcherId fencingToken, Collection<JobGraph> recoveredJobs, Collection<JobResult> recoveredDirtyJobResults,
        JobGraphWriter jobGraphWriter, JobResultStore jobResultStore) {
    final List<JobID> recoveredJobIds = getRecoveredJobIds(recoveredJobs);
    final Dispatcher dispatcher;
    try {
        dispatcher = dispatcherFactory.createDispatcher(
                rpcService, fencingToken, recoveredJobs, recoveredDirtyJobResults,
                (dispatcherGateway, scheduledExecutor, errorHandler) ->
                        new ApplicationDispatcherBootstrap(
                                application, recoveredJobIds, configuration,
                                dispatcherGateway, scheduledExecutor, errorHandler),
                PartialDispatcherServicesWithJobPersistenceComponents.from(
                        partialDispatcherServices, jobGraphWriter, jobResultStore));
    } catch (Exception e) {
        throw new FlinkRuntimeException("Could not create the Dispatcher rpc endpoint.", e);
    }
    dispatcher.start();
    return DefaultDispatcherGatewayService.from(dispatcher);
}
Also used : DispatcherId(org.apache.flink.runtime.dispatcher.DispatcherId) Dispatcher(org.apache.flink.runtime.dispatcher.Dispatcher) PartialDispatcherServices(org.apache.flink.runtime.dispatcher.PartialDispatcherServices) FlinkRuntimeException(org.apache.flink.util.FlinkRuntimeException) PartialDispatcherServicesWithJobPersistenceComponents(org.apache.flink.runtime.dispatcher.PartialDispatcherServicesWithJobPersistenceComponents) Collection(java.util.Collection) Configuration(org.apache.flink.configuration.Configuration) JobGraph(org.apache.flink.runtime.jobgraph.JobGraph) AbstractDispatcherLeaderProcess(org.apache.flink.runtime.dispatcher.runner.AbstractDispatcherLeaderProcess) Collectors(java.util.stream.Collectors) JobResult(org.apache.flink.runtime.jobmaster.JobResult) List(java.util.List) JobID(org.apache.flink.api.common.JobID) RpcService(org.apache.flink.runtime.rpc.RpcService) Internal(org.apache.flink.annotation.Internal) PackagedProgram(org.apache.flink.client.program.PackagedProgram) DispatcherFactory(org.apache.flink.runtime.dispatcher.DispatcherFactory) JobResultStore(org.apache.flink.runtime.highavailability.JobResultStore) Preconditions.checkNotNull(org.apache.flink.util.Preconditions.checkNotNull) JobGraphWriter(org.apache.flink.runtime.jobmanager.JobGraphWriter) DefaultDispatcherGatewayService(org.apache.flink.runtime.dispatcher.runner.DefaultDispatcherGatewayService)

Example 9 with JobResult

use of org.apache.flink.runtime.jobmaster.JobResult in project flink by apache.

the class DistributedCacheDfsTest method testSubmittingJobViaRestClusterClient.

/**
 * Flink Standalone, YARN, and Kubernetes sessions all use {@link
 * RestClusterClient#submitJob(JobGraph)} to submit a job to an existing session. This test
 * covers that case.
 */
@Test(timeout = 30000)
public void testSubmittingJobViaRestClusterClient() throws Exception {
    RestClusterClient<String> restClusterClient =
            new RestClusterClient<>(
                    MINI_CLUSTER_RESOURCE.getClientConfiguration(),
                    "testSubmittingJobViaRestClusterClient");
    final JobGraph jobGraph =
            createJobWithRegisteredCachedFiles().getStreamGraph().getJobGraph();
    final JobResult jobResult =
            restClusterClient
                    .submitJob(jobGraph)
                    .thenCompose(restClusterClient::requestJobResult)
                    .get();
    final String messageInCaseOfFailure =
            jobResult.getSerializedThrowable().isPresent()
                    ? jobResult.getSerializedThrowable().get().getFullStringifiedStackTrace()
                    : "Job failed.";
    assertTrue(messageInCaseOfFailure, jobResult.isSuccess());
}
Also used : JobGraph(org.apache.flink.runtime.jobgraph.JobGraph) JobResult(org.apache.flink.runtime.jobmaster.JobResult) RestClusterClient(org.apache.flink.client.program.rest.RestClusterClient) Test(org.junit.Test)
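
The submit-then-request chain in this test relies on CompletableFuture#thenCompose to sequence two asynchronous calls. Below is a minimal sketch of that pattern in plain Java; Client, Result, submitAndWait and failureMessage are hypothetical stand-ins invented for illustration and are not Flink types or methods.

```java
import java.util.Optional;
import java.util.concurrent.CompletableFuture;

final class SubmitThenAwait {

    /** Hypothetical stand-in for a job result: a success flag plus an optional failure message. */
    static final class Result {
        final boolean success;
        final Optional<String> failure;

        Result(boolean success, Optional<String> failure) {
            this.success = success;
            this.failure = failure;
        }
    }

    /** Hypothetical client: submit yields a job id, requestResult resolves that id to a Result. */
    interface Client {
        CompletableFuture<String> submit(String job);

        CompletableFuture<Result> requestResult(String jobId);
    }

    /** Chains both asynchronous steps and blocks until the final result is available. */
    static Result submitAndWait(Client client, String job) {
        // thenCompose flattens what would otherwise be a future of a future into one future
        return client.submit(job).thenCompose(client::requestResult).join();
    }

    /** Builds the assertion message, falling back to a generic text when no cause was recorded. */
    static String failureMessage(Result result) {
        return result.failure.orElse("Job failed.");
    }

    public static void main(String[] args) {
        Client client = new Client() {
            @Override
            public CompletableFuture<String> submit(String job) {
                return CompletableFuture.completedFuture("id-of-" + job);
            }

            @Override
            public CompletableFuture<Result> requestResult(String jobId) {
                return CompletableFuture.completedFuture(new Result(true, Optional.empty()));
            }
        };
        Result result = submitAndWait(client, "wordcount");
        System.out.println(result.success ? "succeeded" : failureMessage(result));
    }
}
```

Using thenApply here instead would leave a nested future that still has to be unwrapped, which is why thenCompose is the usual choice when the second step is itself asynchronous.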

Example 10 with JobResult

use of org.apache.flink.runtime.jobmaster.JobResult in project flink by apache.

the class JobStatusPollingUtilsTest method testPolling.

@Test
public void testPolling() {
    final int maxAttemptCounter = 3;
    final ScheduledExecutorService executor = Executors.newSingleThreadScheduledExecutor();
    try {
        final ScheduledExecutor scheduledExecutor = new ScheduledExecutorServiceAdapter(executor);
        final CallCountingJobStatusSupplier jobStatusSupplier = new CallCountingJobStatusSupplier(maxAttemptCounter);
        final CompletableFuture<JobResult> result =
                JobStatusPollingUtils.pollJobResultAsync(
                        jobStatusSupplier,
                        () -> CompletableFuture.completedFuture(createSuccessfulJobResult(new JobID(0, 0))),
                        scheduledExecutor,
                        10);
        result.join();
        assertThat(jobStatusSupplier.getAttemptCounter(), is(equalTo(maxAttemptCounter)));
    } finally {
        ExecutorUtils.gracefulShutdown(5, TimeUnit.SECONDS, executor);
    }
}
Also used : ScheduledExecutorService(java.util.concurrent.ScheduledExecutorService) ScheduledExecutorServiceAdapter(org.apache.flink.util.concurrent.ScheduledExecutorServiceAdapter) JobResult(org.apache.flink.runtime.jobmaster.JobResult) JobID(org.apache.flink.api.common.JobID) ScheduledExecutor(org.apache.flink.util.concurrent.ScheduledExecutor) Test(org.junit.Test)
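
The test above checks that the status supplier is queried a fixed number of times before the result is fetched. The following sketch shows one way such a polling loop could be written with only java.util.concurrent. It is an assumption-laden illustration, not the implementation of JobStatusPollingUtils: the names PollUntilDone, pollAsync and pollOnce are invented, and a Boolean "done" flag stands in for a job status.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

final class PollUntilDone {

    /** Polls isDone every retryMillis until it reports true, then fetches the result once. */
    static <R> CompletableFuture<R> pollAsync(
            Supplier<CompletableFuture<Boolean>> isDone,
            Supplier<CompletableFuture<R>> resultSupplier,
            ScheduledExecutorService scheduler,
            long retryMillis) {
        CompletableFuture<R> out = new CompletableFuture<>();
        pollOnce(isDone, resultSupplier, scheduler, retryMillis, out);
        return out;
    }

    private static <R> void pollOnce(
            Supplier<CompletableFuture<Boolean>> isDone,
            Supplier<CompletableFuture<R>> resultSupplier,
            ScheduledExecutorService scheduler,
            long retryMillis,
            CompletableFuture<R> out) {
        isDone.get().whenComplete((done, failure) -> {
            if (failure != null) {
                // a failed status check ends the polling with that error
                out.completeExceptionally(failure);
            } else if (done) {
                // terminal state reached: forward the result (or its error) to the caller
                resultSupplier.get().whenComplete((value, error) -> {
                    if (error != null) {
                        out.completeExceptionally(error);
                    } else {
                        out.complete(value);
                    }
                });
            } else {
                // not finished yet: schedule the next status check
                scheduler.schedule(
                        () -> pollOnce(isDone, resultSupplier, scheduler, retryMillis, out),
                        retryMillis,
                        TimeUnit.MILLISECONDS);
            }
        });
    }

    public static void main(String[] args) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        try {
            AtomicInteger attempts = new AtomicInteger();
            // the status supplier reports "done" on its third call
            String result = pollAsync(
                    () -> CompletableFuture.completedFuture(attempts.incrementAndGet() >= 3),
                    () -> CompletableFuture.completedFuture("done"),
                    scheduler,
                    10).join();
            System.out.println(result + " after " + attempts.get() + " status checks");
        } finally {
            scheduler.shutdownNow();
        }
    }
}
```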

Aggregations

JobResult (org.apache.flink.runtime.jobmaster.JobResult) 58
Test (org.junit.Test) 28
JobGraph (org.apache.flink.runtime.jobgraph.JobGraph) 25
JobID (org.apache.flink.api.common.JobID) 15
Test (org.junit.jupiter.api.Test) 13
MiniCluster (org.apache.flink.runtime.minicluster.MiniCluster) 11
ExecutionException (java.util.concurrent.ExecutionException) 8
JobSubmissionResult (org.apache.flink.api.common.JobSubmissionResult) 7
Deadline (org.apache.flink.api.common.time.Deadline) 7
Configuration (org.apache.flink.configuration.Configuration) 7
File (java.io.File) 5
JobResultStore (org.apache.flink.runtime.highavailability.JobResultStore) 5
IOException (java.io.IOException) 4
CompletableFuture (java.util.concurrent.CompletableFuture) 4
ScheduledExecutorService (java.util.concurrent.ScheduledExecutorService) 4
JobVertex (org.apache.flink.runtime.jobgraph.JobVertex) 4
Duration (java.time.Duration) 3
List (java.util.List) 3
Time (org.apache.flink.api.common.time.Time) 3
MiniClusterClient (org.apache.flink.client.program.MiniClusterClient) 3