Search in sources:

Example 1 with SparkLogLine

use of com.microsoft.azure.hdinsight.spark.common.log.SparkLogLine in project azure-tools-for-java by Microsoft.

the class Session method createSessionRequest.

private Observable<com.microsoft.azure.hdinsight.sdk.rest.livy.interactive.Session> createSessionRequest() {
    final URI uri = baseUrl.resolve(REST_SEGMENT_SESSION);
    final PostSessions postBody = getCreateParameters().build();
    final String json = postBody.convertToJson().orElseThrow(() -> new IllegalArgumentException("Bad session arguments to post."));
    getCtrlSubject().onNext(new SparkLogLine(TOOL, Debug, "Create Livy Session by sending request to " + uri + " with body " + json));
    final StringEntity entity = new StringEntity(json, StandardCharsets.UTF_8);
    entity.setContentType("application/json");
    return getHttp()
            .setUserAgent(getUserAgent())
            .post(uri.toString(), entity, null, null,
                    com.microsoft.azure.hdinsight.sdk.rest.livy.interactive.Session.class);
}
Also used: StringEntity(org.apache.http.entity.StringEntity) PostSessions(com.microsoft.azure.hdinsight.sdk.rest.livy.interactive.api.PostSessions) URI(java.net.URI) SparkLogLine(com.microsoft.azure.hdinsight.spark.common.log.SparkLogLine)
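
The control subject pattern above recurs throughout these examples: components emit typed SparkLogLine records and observers decide how to render them. Below is a minimal sketch of emitting and consuming such records, assuming the TOOL source constant and the Debug message type are statically imported as in the snippet above (RxJava 1.x, matching the project's rx.* imports):

static void sketchCtrlSubject() {
    // PublishSubject is the concrete subject type the later examples cast to.
    final rx.subjects.PublishSubject<SparkLogLine> ctrlSubject =
            rx.subjects.PublishSubject.create();
    // Observers pick the typed record apart; getMessageInfoType() and getRawLog()
    // are the accessors used by the subscribers in Example 3.
    ctrlSubject.subscribe(logLine ->
            System.out.println(logLine.getMessageInfoType() + ": " + logLine.getRawLog()));
    // Emit a line the same way createSessionRequest() does.
    ctrlSubject.onNext(new SparkLogLine(TOOL, Debug, "Create Livy Session by sending request ..."));
}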

Example 2 with SparkLogLine

use of com.microsoft.azure.hdinsight.spark.common.log.SparkLogLine in project azure-tools-for-java by Microsoft.

the class ADLSGen2Deploy method deploy.

@Override
public Observable<String> deploy(File src, Observer<SparkLogLine> logSubject) {
    // four steps to upload via adls gen2 rest api
    // 1.put request to create new dir
    // 2.put request to create new file(artifact) which is empty
    // 3.patch request to append data to file
    // 4.patch request to flush data to file
    final URI destURI = getUploadDir();
    // strip a trailing '/' from the request path, otherwise the service returns an invalid-URL response
    final String destStr = destURI.toString();
    final String dirPath = destStr.endsWith("/") ? destStr.substring(0, destStr.length() - 1) : destStr;
    final String filePath = String.format("%s/%s", dirPath, src.getName());
    final ADLSGen2FSOperation op = new ADLSGen2FSOperation(this.http);
    return op.createDir(dirPath, "0755").onErrorReturn(err -> {
        if (err.getMessage() != null
                && (err.getMessage().contains(String.valueOf(HttpStatus.SC_FORBIDDEN))
                        || err.getMessage().contains(String.valueOf(HttpStatus.SC_NOT_FOUND)))) {
            // Sample destinationRootPath: https://accountName.dfs.core.windows.net/fsName/SparkSubmission/
            String fileSystemRootPath = UriUtil.normalizeWithSlashEnding(URI.create(destinationRootPath)).resolve("../").toString();
            String errorMessage = String.format(
                    "Failed to create folder %s when uploading Spark application artifacts with error: %s. %s",
                    dirPath, err.getMessage(), getForbiddenErrorHints(fileSystemRootPath));
            throw new IllegalArgumentException(errorMessage);
        } else {
            throw Exceptions.propagate(err);
        }
    }).doOnNext(ignore -> log().info(String.format("Create filesystem %s successfully.", dirPath)))
            .flatMap(ignore -> op.createFile(filePath, "0755"))
            .flatMap(ignore -> op.uploadData(filePath, src))
            .doOnNext(ignore -> log().info(String.format("Append data to file %s successfully.", filePath)))
            .map(ignored -> AbfsUri.parse(filePath).getUri().toString());
}
Also used: AuthMethodManager(com.microsoft.azuretools.authmanage.AuthMethodManager) NotNull(com.microsoft.azuretools.azurecommons.helpers.NotNull) JobUtils(com.microsoft.azure.hdinsight.spark.jobs.JobUtils) Exceptions(rx.exceptions.Exceptions) HttpStatus(org.apache.http.HttpStatus) Observer(rx.Observer) AbfsUri(com.microsoft.azure.hdinsight.common.AbfsUri) File(java.io.File) ILogger(com.microsoft.azure.hdinsight.common.logger.ILogger) HttpObservable(com.microsoft.azure.hdinsight.sdk.common.HttpObservable) ADLSGen2FSOperation(com.microsoft.azure.hdinsight.sdk.storage.adlsgen2.ADLSGen2FSOperation) Observable(rx.Observable) UriUtil(com.microsoft.azure.hdinsight.common.UriUtil) URI(java.net.URI) SparkLogLine(com.microsoft.azure.hdinsight.spark.common.log.SparkLogLine)
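
The returned Observable emits the ABFS URI of the uploaded artifact once the create/append/flush sequence succeeds. A hedged usage sketch follows; the deployer and logObserver names are illustrative, not from the source:

// Illustrative call site: deployer is an ADLSGen2Deploy instance and logObserver
// an rx.Observer<SparkLogLine> that renders progress, both assumed here.
deployer.deploy(new File("target/my-spark-job.jar"), logObserver)
        .subscribe(
                uploadedUri -> System.out.println("Artifact uploaded to " + uploadedUri),
                err -> System.err.println("Deploy failed: " + err.getMessage()));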

Example 3 with SparkLogLine

use of com.microsoft.azure.hdinsight.spark.common.log.SparkLogLine in project azure-tools-for-java by Microsoft.

the class SparkBatchJobDebuggerRunner method execute.

/**
 * Execute the Spark remote debugging action. Refer to the {@link GenericDebuggerRunner#execute(ExecutionEnvironment)}
 * implementation; some internal APIs are leveraged.
 *
 * @param environment the execution environment
 * @throws ExecutionException the exception thrown during execution
 */
@Override
public void execute(final ExecutionEnvironment environment) throws ExecutionException {
    final RunProfileState state = environment.getState();
    if (state == null) {
        return;
    }
    final Operation operation = environment.getUserData(TelemetryKeys.OPERATION);
    final AsyncPromise<ExecutionEnvironment> jobDriverEnvReady = new AsyncPromise<>();
    final SparkBatchRemoteDebugState submissionState = (SparkBatchRemoteDebugState) state;
    final SparkSubmitModel submitModel = submissionState.getSubmitModel();
    // Create the SSH debug session first
    final SparkBatchDebugSession session;
    try {
        session = SparkBatchDebugSession
                .factoryByAuth(getSparkJobUrl(submitModel), submitModel.getAdvancedConfigModel())
                .open()
                .verifyCertificate();
    } catch (final Exception e) {
        final ExecutionException exp = new ExecutionException("Failed to create SSH session for debugging. " + ExceptionUtils.getRootCauseMessage(e));
        EventUtil.logErrorClassNameOnlyWithComplete(operation, ErrorType.systemError, exp, null, null);
        throw exp;
    }
    final Project project = submitModel.getProject();
    final ExecutionManager executionManager = ExecutionManager.getInstance(project);
    final IdeaSchedulers schedulers = new IdeaSchedulers(project);
    final PublishSubject<SparkBatchJobSubmissionEvent> debugEventSubject = PublishSubject.create();
    final ISparkBatchDebugJob sparkDebugBatch = (ISparkBatchDebugJob) submissionState.getSparkBatch().clone();
    final PublishSubject<SparkLogLine> ctrlSubject = (PublishSubject<SparkLogLine>) sparkDebugBatch.getCtrlSubject();
    final SparkBatchJobRemoteDebugProcess driverDebugProcess = new SparkBatchJobRemoteDebugProcess(
            schedulers,
            session,
            sparkDebugBatch,
            submitModel.getArtifactPath().orElseThrow(() -> new ExecutionException("No artifact selected")),
            submitModel.getSubmissionParameter().getMainClassName(),
            submitModel.getAdvancedConfigModel(),
            ctrlSubject);
    final SparkBatchJobDebugProcessHandler driverDebugHandler = new SparkBatchJobDebugProcessHandler(project, driverDebugProcess, debugEventSubject);
    // Prepare an independent submission console
    final ConsoleViewImpl submissionConsole = new ConsoleViewImpl(project, true);
    final RunContentDescriptor submissionDesc = new RunContentDescriptor(
            submissionConsole,
            driverDebugHandler,
            submissionConsole.getComponent(),
            String.format("Submit %s to cluster %s",
                    submitModel.getSubmissionParameter().getMainClassName(),
                    submitModel.getSubmissionParameter().getClusterName()));
    // Show the submission console view
    ExecutionManager.getInstance(project).getContentManager().showRunContent(environment.getExecutor(), submissionDesc);
    // Use the submission console to display the deployment control messages
    final Subscription jobSubscription = ctrlSubject.subscribe(typedMessage -> {
        final String line = typedMessage.getRawLog() + "\n";
        switch(typedMessage.getMessageInfoType()) {
            case Error:
                submissionConsole.print(line, ConsoleViewContentType.ERROR_OUTPUT);
                break;
            case Info:
                submissionConsole.print(line, ConsoleViewContentType.NORMAL_OUTPUT);
                break;
            case Log:
                submissionConsole.print(line, ConsoleViewContentType.SYSTEM_OUTPUT);
                break;
            case Warning:
                submissionConsole.print(line, ConsoleViewContentType.LOG_WARNING_OUTPUT);
                break;
        }
    }, err -> {
        submissionConsole.print(ExceptionUtils.getRootCauseMessage(err), ConsoleViewContentType.ERROR_OUTPUT);
        final String errMsg = "The Spark job remote debug is cancelled due to " + ExceptionUtils.getRootCauseMessage(err);
        jobDriverEnvReady.setError(errMsg);
        EventUtil.logErrorClassNameOnlyWithComplete(operation, ErrorType.systemError, new UncheckedExecutionException(errMsg, err), null, null);
    }, () -> {
        if (Optional.ofNullable(driverDebugHandler.getUserData(ProcessHandler.TERMINATION_REQUESTED)).orElse(false)) {
            final String errMsg = "The Spark job remote debug is cancelled by user.";
            jobDriverEnvReady.setError(errMsg);
            final Map<String, String> props = ImmutableMap.of("isDebugCancelled", "true");
            EventUtil.logErrorClassNameOnlyWithComplete(operation, ErrorType.userError, new ExecutionException(errMsg), props, null);
        }
    });
    // Close the SSH session once the debug event stream terminates (completion or error)
    debugEventSubject.subscribeOn(Schedulers.io()).doAfterTerminate(session::close).subscribe(debugEvent -> {
        try {
            if (debugEvent instanceof SparkBatchRemoteDebugHandlerReadyEvent) {
                final SparkBatchRemoteDebugHandlerReadyEvent handlerReadyEvent = (SparkBatchRemoteDebugHandlerReadyEvent) debugEvent;
                final SparkBatchDebugJobJdbPortForwardedEvent jdbReadyEvent = handlerReadyEvent.getJdbPortForwardedEvent();
                if (!jdbReadyEvent.getLocalJdbForwardedPort().isPresent()) {
                    return;
                }
                final int localPort = jdbReadyEvent.getLocalJdbForwardedPort().get();
                final ExecutionEnvironment forkEnv = forkEnvironment(environment, jdbReadyEvent.getRemoteHost().orElse("unknown"), jdbReadyEvent.isDriver());
                final RunProfile runProfile = forkEnv.getRunProfile();
                if (!(runProfile instanceof LivySparkBatchJobRunConfiguration)) {
                    ctrlSubject.onError(new UnsupportedOperationException(
                            "Only supports LivySparkBatchJobRunConfiguration type, but got type "
                                    + runProfile.getClass().getCanonicalName()));
                    return;
                }
                // Reuse the driver's Spark batch job
                ((LivySparkBatchJobRunConfiguration) runProfile).setSparkRemoteBatch(sparkDebugBatch);
                final SparkBatchRemoteDebugState forkState = jdbReadyEvent.isDriver() ? submissionState : (SparkBatchRemoteDebugState) forkEnv.getState();
                if (forkState == null) {
                    return;
                }
                // Set the debug connection to localhost and local forwarded port to the state
                forkState.setRemoteConnection(new RemoteConnection(true, "localhost", Integer.toString(localPort), false));
                // Prepare the debug tab console view UI
                SparkJobLogConsoleView jobOutputView = new SparkJobLogConsoleView(project);
                // Get YARN container log URL port
                int containerLogUrlPort = ((SparkBatchRemoteDebugJob) driverDebugProcess.getSparkJob())
                        .getYarnContainerLogUrlPort()
                        .toBlocking()
                        .single();
                // Parse container ID and host URL from driver console view
                jobOutputView.getSecondaryConsoleView().addMessageFilter((line, entireLength) -> {
                    Matcher matcher = Pattern.compile("Launching container (\\w+).* on host ([a-zA-Z_0-9-.]+)", Pattern.CASE_INSENSITIVE).matcher(line);
                    while (matcher.find()) {
                        String containerId = matcher.group(1);
                        // TODO: get port from somewhere else rather than hard code here
                        URI hostUri = URI.create(String.format("http://%s:%d", matcher.group(2), containerLogUrlPort));
                        debugEventSubject.onNext(new SparkBatchJobExecutorCreatedEvent(hostUri, containerId));
                    }
                    return null;
                });
                jobOutputView.attachToProcess(handlerReadyEvent.getDebugProcessHandler());
                ExecutionResult result = new DefaultExecutionResult(jobOutputView, handlerReadyEvent.getDebugProcessHandler());
                forkState.setExecutionResult(result);
                forkState.setConsoleView(jobOutputView.getSecondaryConsoleView());
                forkState.setRemoteProcessCtrlLogHandler(handlerReadyEvent.getDebugProcessHandler());
                if (jdbReadyEvent.isDriver()) {
                    // Let the debug console view handle the control log
                    jobSubscription.unsubscribe();
                    // Resolve job driver promise, handle the driver VM attaching separately
                    jobDriverEnvReady.setResult(forkEnv);
                } else {
                    // Start Executor debugging
                    executionManager.startRunProfile(forkEnv, () -> toIdeaPromise(attachAndDebug(forkEnv, forkState)));
                }
            } else if (debugEvent instanceof SparkBatchJobExecutorCreatedEvent) {
                SparkBatchJobExecutorCreatedEvent executorCreatedEvent = (SparkBatchJobExecutorCreatedEvent) debugEvent;
                final String containerId = executorCreatedEvent.getContainerId();
                final SparkBatchRemoteDebugJob debugJob = (SparkBatchRemoteDebugJob) driverDebugProcess.getSparkJob();
                URI internalHostUri = executorCreatedEvent.getHostUri();
                URI executorLogUrl = debugJob.convertToPublicLogUri(internalHostUri)
                        .map(uri -> uri.resolve(String.format("node/containerlogs/%s/livy", containerId)))
                        .toBlocking()
                        .singleOrDefault(internalHostUri);
                // Create an Executor Debug Process
                SparkBatchJobRemoteDebugExecutorProcess executorDebugProcess = new SparkBatchJobRemoteDebugExecutorProcess(
                        schedulers,
                        debugJob,
                        internalHostUri.getHost(),
                        driverDebugProcess.getDebugSession(),
                        executorLogUrl.toString());
                SparkBatchJobDebugProcessHandler executorDebugHandler = new SparkBatchJobDebugProcessHandler(project, executorDebugProcess, debugEventSubject);
                executorDebugHandler.getRemoteDebugProcess().start();
            }
        } catch (final ExecutionException e) {
            EventUtil.logErrorClassNameOnlyWithComplete(operation, ErrorType.systemError, new UncheckedExecutionException(e), null, null);
            throw new UncheckedExecutionException(e);
        }
    });
    driverDebugHandler.getRemoteDebugProcess().start();
    // Driver-side execution: leverage IntelliJ's AsyncPromise to wait until the Spark app is deployed
    executionManager.startRunProfile(environment, () -> jobDriverEnvReady.thenAsync(driverEnv -> toIdeaPromise(attachAndDebug(driverEnv, state))));
}
Also used: ModalityState.any(com.intellij.openapi.application.ModalityState.any) ExecutionManager(com.intellij.execution.ExecutionManager) ModalityState(com.intellij.openapi.application.ModalityState) SparkJobLogConsoleView(com.microsoft.azure.hdinsight.spark.ui.SparkJobLogConsoleView) com.intellij.execution.configurations(com.intellij.execution.configurations) ExecutionEnvironment(com.intellij.execution.runners.ExecutionEnvironment) Matcher(java.util.regex.Matcher) ConsoleViewContentType(com.intellij.execution.ui.ConsoleViewContentType) ClusterManagerEx(com.microsoft.azure.hdinsight.common.ClusterManagerEx) Map(java.util.Map) Schedulers(rx.schedulers.Schedulers) ExecutionResult(com.intellij.execution.ExecutionResult) URI(java.net.URI) Nullable(com.microsoft.azuretools.azurecommons.helpers.Nullable) ImmutableMap(com.google.common.collect.ImmutableMap) WindowManager(com.intellij.openapi.wm.WindowManager) ErrorType(com.microsoft.azuretools.telemetrywrapper.ErrorType) Operation(com.microsoft.azuretools.telemetrywrapper.Operation) ExecutionEnvironmentBuilder(com.intellij.execution.runners.ExecutionEnvironmentBuilder) RunContentDescriptor(com.intellij.execution.ui.RunContentDescriptor) GenericDebuggerRunnerSettings(com.intellij.debugger.impl.GenericDebuggerRunnerSettings) RxJavaExtKt.toIdeaPromise(com.microsoft.intellij.rxjava.RxJavaExtKt.toIdeaPromise) Optional(java.util.Optional) EventUtil(com.microsoft.azuretools.telemetrywrapper.EventUtil) Pattern(java.util.regex.Pattern) Subscription(rx.Subscription) PublishSubject(rx.subjects.PublishSubject) com.microsoft.azure.hdinsight.spark.common(com.microsoft.azure.hdinsight.spark.common) GenericDebuggerRunner(com.intellij.debugger.impl.GenericDebuggerRunner) ExceptionUtils(org.apache.commons.lang3.exception.ExceptionUtils) AsyncPromise(org.jetbrains.concurrency.AsyncPromise) TelemetryKeys(com.microsoft.intellij.telemetry.TelemetryKeys) NotNull(com.microsoft.azuretools.azurecommons.helpers.NotNull) ExecutionException(com.intellij.execution.ExecutionException) LivyCluster(com.microsoft.azure.hdinsight.sdk.cluster.LivyCluster) IdeaSchedulers(com.microsoft.intellij.rxjava.IdeaSchedulers) Observable(rx.Observable) UncheckedExecutionException(com.google.common.util.concurrent.UncheckedExecutionException) SparkBatchSubmission.getClusterSubmission(com.microsoft.azure.hdinsight.spark.common.SparkBatchSubmission.getClusterSubmission) Project(com.intellij.openapi.project.Project) LivySparkBatchJobRunConfiguration(com.microsoft.azure.hdinsight.spark.run.configuration.LivySparkBatchJobRunConfiguration) DefaultExecutionResult(com.intellij.execution.DefaultExecutionResult) IClusterDetail(com.microsoft.azure.hdinsight.sdk.cluster.IClusterDetail) ConsoleViewImpl(com.intellij.execution.impl.ConsoleViewImpl) Key(com.intellij.openapi.util.Key) IOException(java.io.IOException) ProcessHandler(com.intellij.execution.process.ProcessHandler) ModalityState.stateForComponent(com.intellij.openapi.application.ModalityState.stateForComponent) SparkLogLine(com.microsoft.azure.hdinsight.spark.common.log.SparkLogLine) SparkSubmissionAdvancedConfigPanel(com.microsoft.azure.hdinsight.spark.ui.SparkSubmissionAdvancedConfigPanel) javax.swing(javax.swing)
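
The first subscriber in this example is essentially a mapping from SparkLogLine's MessageInfoType onto an IntelliJ ConsoleViewContentType. Extracted as a standalone helper it looks like the sketch below; the enum's package is an assumption, since the source only shows its values:

import com.intellij.execution.ui.ConsoleViewContentType;
import com.microsoft.azure.hdinsight.common.MessageInfoType; // assumed package; the source only uses the enum values

static ConsoleViewContentType contentTypeOf(final MessageInfoType type) {
    switch (type) {
        case Error:   return ConsoleViewContentType.ERROR_OUTPUT;
        case Info:    return ConsoleViewContentType.NORMAL_OUTPUT;
        case Log:     return ConsoleViewContentType.SYSTEM_OUTPUT;
        case Warning: return ConsoleViewContentType.LOG_WARNING_OUTPUT;
        default:      return ConsoleViewContentType.NORMAL_OUTPUT; // fallback for types the switch doesn't cover
    }
}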

Example 4 with SparkLogLine

use of com.microsoft.azure.hdinsight.spark.common.log.SparkLogLine in project azure-tools-for-java by Microsoft.

the class SparkBatchJobRunner method doExecute.

@Nullable
@Override
protected RunContentDescriptor doExecute(@NotNull RunProfileState state, @NotNull ExecutionEnvironment environment) throws ExecutionException {
    final SparkBatchRemoteRunProfileState submissionState = (SparkBatchRemoteRunProfileState) state;
    final SparkSubmitModel submitModel = submissionState.getSubmitModel();
    final Project project = submitModel.getProject();
    // Prepare the run table console view UI
    final SparkJobLogConsoleView jobOutputView = new SparkJobLogConsoleView(project);
    final String artifactPath = submitModel.getArtifactPath().orElse(null);
    assert artifactPath != null : "artifactPath should be checked in LivySparkBatchJobRunConfiguration::checkSubmissionConfigurationBeforeRun";
    // To address issue https://github.com/microsoft/azure-tools-for-java/issues/4021.
    // In this issue, when the user clicks the rerun button, we are still using the legacy ctrlSubject, which
    // already sent its "onComplete" message when the job finished the previous time. To avoid this issue, we
    // clone a new Spark batch job instance to re-initialize everything in the object.
    final ISparkBatchJob sparkBatch = submissionState.getSparkBatch().clone();
    final PublishSubject<SparkLogLine> ctrlSubject = (PublishSubject<SparkLogLine>) sparkBatch.getCtrlSubject();
    final SparkBatchJobRemoteProcess remoteProcess = new SparkBatchJobRemoteProcess(
            new IdeaSchedulers(project),
            sparkBatch,
            artifactPath,
            submitModel.getSubmissionParameter().getMainClassName(),
            ctrlSubject);
    final SparkBatchJobRunProcessHandler processHandler = new SparkBatchJobRunProcessHandler(remoteProcess, "Package and deploy the job to Spark cluster", null);
    // After attaching, the console view can read the process inputStreams and display them
    jobOutputView.attachToProcess(processHandler);
    remoteProcess.start();
    final Operation operation = environment.getUserData(TelemetryKeys.OPERATION);
    // After we define a new AnAction class, IntelliJ will construct a new AnAction instance for us.
    // Using one action instance keeps behaviors like isEnabled() consistent.
    final SparkBatchJobDisconnectAction disconnectAction = (SparkBatchJobDisconnectAction)
            ActionManager.getInstance().getAction("Actions.SparkJobDisconnect");
    disconnectAction.init(remoteProcess, operation);
    sendTelemetryForParameters(submitModel, operation);
    final ExecutionResult result = new DefaultExecutionResult(jobOutputView, processHandler, Separator.getInstance(), disconnectAction);
    submissionState.setExecutionResult(result);
    final ConsoleView consoleView = jobOutputView.getSecondaryConsoleView();
    submissionState.setConsoleView(consoleView);
    addConsoleViewFilter(remoteProcess.getSparkJob(), consoleView);
    submissionState.setRemoteProcessCtrlLogHandler(processHandler);
    // The subscription exists only to disable the disconnect action on stream error or completion.
    ctrlSubject.subscribe(messageWithType -> {
    }, err -> disconnectAction.setEnabled(false), () -> disconnectAction.setEnabled(false));
    return super.doExecute(state, environment);
}
Also used: DefaultExecutionResult(com.intellij.execution.DefaultExecutionResult) SparkJobLogConsoleView(com.microsoft.azure.hdinsight.spark.ui.SparkJobLogConsoleView) ConsoleView(com.intellij.execution.ui.ConsoleView) IdeaSchedulers(com.microsoft.intellij.rxjava.IdeaSchedulers) ExecutionResult(com.intellij.execution.ExecutionResult) Operation(com.microsoft.azuretools.telemetrywrapper.Operation) Project(com.intellij.openapi.project.Project) SparkBatchJobDisconnectAction(com.microsoft.azure.hdinsight.spark.run.action.SparkBatchJobDisconnectAction) PublishSubject(rx.subjects.PublishSubject) SparkLogLine(com.microsoft.azure.hdinsight.spark.common.log.SparkLogLine) Nullable(com.microsoft.azuretools.azurecommons.helpers.Nullable)
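
The clone-before-rerun comment in doExecute hinges on PublishSubject semantics: once a subject has received onCompleted it is terminal, and any later subscriber only sees the completion notification, never fresh events. A minimal RxJava 1.x sketch of that behavior (illustrative, not from the source):

final rx.subjects.PublishSubject<String> subject = rx.subjects.PublishSubject.create();
subject.onNext("first-run log line"); // delivered only to subscribers present now
subject.onCompleted();                // terminal: the subject is exhausted
// A rerun reusing this subject would observe completion immediately:
subject.subscribe(
        line -> System.out.println("got: " + line), // never called
        Throwable::printStackTrace,
        () -> System.out.println("completed immediately"));
// Hence the clone: a fresh ISparkBatchJob carries a fresh ctrlSubject.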

Example 5 with SparkLogLine

use of com.microsoft.azure.hdinsight.spark.common.log.SparkLogLine in project azure-tools-for-java by Microsoft.

the class SparkBatchJob method createBatchJob.

/**
 * Create a batch Spark job
 *
 * @return the current instance for chain calling
 * @throws IOException for network connection related issues
 */
private SparkBatchJob createBatchJob() throws IOException {
    if (getConnectUri() == null) {
        throw new SparkJobNotConfiguredException(
                "Can't get the Spark job connection URI; please configure the Spark cluster to which the Spark job will be submitted.");
    }
    // Submit the batch job
    final HttpResponse httpResponse = this.getSubmission().createBatchSparkJob(this.getConnectUri().toString(), this.getSubmissionParameter());
    // Get the batch ID from response and save it
    if (httpResponse.getCode() >= 200 && httpResponse.getCode() < 300) {
        final SparkSubmitResponse jobResp = ObjectConvertUtils
                .convertJsonToObject(httpResponse.getMessage(), SparkSubmitResponse.class)
                .orElseThrow(() -> new UnknownServiceException(
                        "Bad spark job response: " + httpResponse.getMessage()));
        this.setBatchId(jobResp.getId());
        getCtrlSubject().onNext(new SparkLogLine(TOOL, Info, "Spark Batch submission " + httpResponse.toString()));
        return this;
    }
    throw new UnknownServiceException(String.format(
            "Failed to submit Spark batch job. error code: %d, type: %s, reason: %s.",
            httpResponse.getCode(), httpResponse.getContent(), httpResponse.getMessage()));
}
Also used: UnknownServiceException(java.net.UnknownServiceException) HttpResponse(com.microsoft.azure.hdinsight.sdk.common.HttpResponse) SparkLogLine(com.microsoft.azure.hdinsight.spark.common.log.SparkLogLine)
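
The 2xx gate above decides whether the Livy response body is parsed as a SparkSubmitResponse. The same status handling against a plain HTTP client, as a hedged sketch (the endpoint URL and request body are placeholders; the project's own HttpResponse wrapper is not used here):

import java.io.IOException;
import java.net.URI;
import java.net.UnknownServiceException;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

static String submitBatch() throws IOException, InterruptedException {
    final HttpClient client = HttpClient.newHttpClient();
    final HttpRequest request = HttpRequest.newBuilder(
                    URI.create("https://example.com/livy/batches")) // placeholder endpoint
            .header("Content-Type", "application/json")
            .POST(HttpRequest.BodyPublishers.ofString("{\"file\":\"wasbs://...\"}")) // placeholder body
            .build();
    final HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
    if (response.statusCode() >= 200 && response.statusCode() < 300) {
        return response.body(); // real code would parse the batch id out of this JSON
    }
    throw new UnknownServiceException(
            "Failed to submit Spark batch job, status " + response.statusCode());
}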

Aggregations

SparkLogLine (com.microsoft.azure.hdinsight.spark.common.log.SparkLogLine): 10
URI (java.net.URI): 7
Nullable (com.microsoft.azuretools.azurecommons.helpers.Nullable): 6
Observable (rx.Observable): 6
NotNull (com.microsoft.azuretools.azurecommons.helpers.NotNull): 5
File (java.io.File): 5
ILogger (com.microsoft.azure.hdinsight.common.logger.ILogger): 4
HttpObservable (com.microsoft.azure.hdinsight.sdk.common.HttpObservable): 4
StringUtils (org.apache.commons.lang3.StringUtils): 4
ExceptionUtils (org.apache.commons.lang3.exception.ExceptionUtils): 4
IClusterDetail (com.microsoft.azure.hdinsight.sdk.cluster.IClusterDetail): 3
HttpResponse (com.microsoft.azure.hdinsight.sdk.common.HttpResponse): 3
JobUtils (com.microsoft.azure.hdinsight.spark.jobs.JobUtils): 3
IOException (java.io.IOException): 3
UnknownServiceException (java.net.UnknownServiceException): 3
Observer (rx.Observer): 3
PublishSubject (rx.subjects.PublishSubject): 3
ImmutableMap (com.google.common.collect.ImmutableMap): 2
DefaultExecutionResult (com.intellij.execution.DefaultExecutionResult): 2
ExecutionResult (com.intellij.execution.ExecutionResult): 2