
Example 1 with ClusterFileBase64BufferedOutputStream

Use of com.microsoft.azure.hdinsight.sdk.io.spark.ClusterFileBase64BufferedOutputStream in project azure-tools-for-java by Microsoft.

From the class JobUtils, method uploadFileToHDFSBase.

public static String uploadFileToHDFSBase(IClusterDetail selectedClusterDetail, String buildJarPath, @Nullable Observer<SimpleImmutableEntry<MessageInfoType, String>> legacyLogSubject, @Nullable Observer<SparkLogLine> newLogSubject) throws HDIException {
    ctrlInfo(legacyLogSubject, newLogSubject, String.format("Get target jar from %s.", buildJarPath));
    final File srcJarFile = new File(buildJarPath);
    final URI destUri = URI.create(String.format("/SparkSubmission/%s/%s", getFormatPathByDate(), srcJarFile.getName()));
    final String username = selectedClusterDetail.getHttpUserName();
    final String password = selectedClusterDetail.getHttpPassword();
    final String sessionName = "Helper session to upload " + destUri.toString();
    final URI livyUri = selectedClusterDetail instanceof LivyCluster ? URI.create(((LivyCluster) selectedClusterDetail).getLivyConnectionUrl()) : URI.create(selectedClusterDetail.getConnectionUrl());
    ctrlInfo(legacyLogSubject, newLogSubject, "Create Spark helper interactive session...");
    try {
        return Observable.using(() -> new SparkSession(sessionName, livyUri, username, password), SparkSession::create, SparkSession::close).map(sparkSession -> {
            sparkSession.getCtrlSubject().subscribe(logLine -> ctrlInfo(legacyLogSubject, newLogSubject, logLine.getRawLog()), err -> ctrlError(legacyLogSubject, newLogSubject, err), () -> {
            });
            // Bytes flow: local file -> Base64 encoder -> cluster file stream bound to the Spark session
            ClusterFileBase64BufferedOutputStream clusterFileBase64Out = new ClusterFileBase64BufferedOutputStream(sparkSession, destUri);
            // try-with-resources closes both streams even when the copy fails midway
            try (InputStream inFile = new BufferedInputStream(new FileInputStream(srcJarFile));
                 Base64OutputStream base64Enc = new Base64OutputStream(clusterFileBase64Out, true)) {
                ctrlInfo(legacyLogSubject, newLogSubject, String.format("Uploading %s...", srcJarFile));
                IOUtils.copy(inFile, base64Enc);
            } catch (FileNotFoundException fnfEx) {
                throw propagate(new HDIException(String.format("Source file %s not found.", srcJarFile), fnfEx));
            } catch (IOException ioEx) {
                throw propagate(new HDIException(String.format("Failed to upload file %s.", destUri), ioEx));
            }
            ctrlInfo(legacyLogSubject, newLogSubject, String.format("Uploaded to %s.", destUri));
            return destUri.toString();
        }).toBlocking().single();
    } catch (final NoSuchElementException ignored) {
        // Not expected: a failure inside the pipeline is rethrown with its own cause before this point
        throw new HDIException("Failed to upload file to HDFS (Should Not Reach).");
    }
}
Also used: ClusterFileBase64BufferedOutputStream (com.microsoft.azure.hdinsight.sdk.io.spark.ClusterFileBase64BufferedOutputStream), SparkSession (com.microsoft.azure.hdinsight.sdk.common.livy.interactive.SparkSession), LivyCluster (com.microsoft.azure.hdinsight.sdk.cluster.LivyCluster), HDIException (com.microsoft.azure.hdinsight.sdk.common.HDIException), Base64OutputStream (org.apache.commons.codec.binary.Base64OutputStream), URI (java.net.URI)
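
The core of the method above is a streaming copy through a Base64-encoding OutputStream wrapper. The sketch below isolates that pattern so it can be run without a cluster. It is not the project's code: it swaps in the JDK's java.util.Base64 encoder for Commons Codec's Base64OutputStream, a ByteArrayOutputStream stands in for ClusterFileBase64BufferedOutputStream, and the class name Base64StreamCopySketch and the helper copyAsBase64 are hypothetical names chosen for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical stand-alone sketch; not part of azure-tools-for-java.
class Base64StreamCopySketch {

    /** Copies all bytes from in to sink, Base64-encoding them on the fly. */
    static void copyAsBase64(InputStream in, OutputStream sink) throws IOException {
        // Closing the wrapper writes the final Base64 padding and closes sink,
        // so try-with-resources guarantees a complete, terminated encoding.
        try (OutputStream base64Enc = Base64.getEncoder().wrap(sink)) {
            byte[] buffer = new byte[8192];
            int read;
            while ((read = in.read(buffer)) != -1) {
                base64Enc.write(buffer, 0, read);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] payload = "hello spark".getBytes(StandardCharsets.UTF_8);
        // In-memory sink standing in for the cluster-side stream (assumption).
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (InputStream in = new ByteArrayInputStream(payload)) {
            copyAsBase64(in, sink);
        }
        System.out.println(new String(sink.toByteArray(), StandardCharsets.US_ASCII));
    }
}
```

Wrapping the encoder around the sink, rather than encoding the whole file into a String first, keeps memory use bounded by the buffer size, which is presumably why the original method streams the jar instead of loading it.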
