
Example 6 with S3ObjectInfo

Use of com.amazonaws.services.neptune.util.S3ObjectInfo in the project amazon-neptune-tools by awslabs.

From the class NeptuneMachineLearningExportEventHandlerV1, the method createTrainingJobConfigurationFile:

private void createTrainingJobConfigurationFile(TrainingDataWriterConfigV1 trainingJobWriterConfig, Path outputPath, GraphSchema graphSchema, PropertyName propertyName, TransferManagerWrapper transferManager) throws Exception {
    File outputDirectory = outputPath.toFile();
    String filename = String.format("%s.json", trainingJobWriterConfig.name());
    File trainingJobConfigurationFile = new File(outputPath.toFile(), filename);
    try (Writer writer = new PrintWriter(trainingJobConfigurationFile)) {
        new PropertyGraphTrainingDataConfigWriterV1(graphSchema, createJsonGenerator(writer), propertyName, printerOptions, trainingJobWriterConfig).write();
    }
    if (StringUtils.isNotEmpty(outputS3Path)) {
        Timer.timedActivity("uploading training job configuration file to S3", (CheckedActivity.Runnable) () -> {
            S3ObjectInfo outputS3ObjectInfo = calculateOutputS3Path(outputDirectory);
            uploadTrainingJobConfigurationFileToS3(filename, transferManager.get(), trainingJobConfigurationFile, outputS3ObjectInfo);
        });
    }
}
Also used: PropertyGraphTrainingDataConfigWriterV1(com.amazonaws.services.neptune.profiles.neptune_ml.v1.PropertyGraphTrainingDataConfigWriterV1) S3ObjectInfo(com.amazonaws.services.neptune.util.S3ObjectInfo) CheckedActivity(com.amazonaws.services.neptune.util.CheckedActivity)
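
The helper calculateOutputS3Path is not shown in these snippets. Assuming it simply appends the name of the local export directory to the configured outputS3Path (using the withNewKeySuffix method that appears in Example 9), a minimal sketch might look like the following; the behaviour is an assumption, not the project's actual implementation:

private S3ObjectInfo calculateOutputS3Path(File outputDirectory) {
    // Hypothetical sketch: build an S3ObjectInfo from the configured output path
    // and append the local export directory name as a key suffix.
    S3ObjectInfo baseS3ObjectInfo = new S3ObjectInfo(outputS3Path);
    return baseS3ObjectInfo.withNewKeySuffix(outputDirectory.getName());
}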

Example 7 with S3ObjectInfo

Use of com.amazonaws.services.neptune.util.S3ObjectInfo in the project amazon-neptune-tools by awslabs.

From the class NeptuneExportService, the method checkS3OutputIsEmpty:

private void checkS3OutputIsEmpty() {
    AmazonS3 s3 = AmazonS3ClientBuilder.defaultClient();
    S3ObjectInfo s3ObjectInfo = new S3ObjectInfo(outputS3Path);
    ObjectListing listing = s3.listObjects(new ListObjectsRequest(s3ObjectInfo.bucket(), s3ObjectInfo.key(), null, null, 1));
    if (!listing.getObjectSummaries().isEmpty()) {
        throw new IllegalStateException(String.format("S3 destination contains existing objects: %s. Set 'overwriteExisting' parameter to 'true' to allow overwriting existing objects.", outputS3Path));
    }
}
Also used: AmazonS3(com.amazonaws.services.s3.AmazonS3) ListObjectsRequest(com.amazonaws.services.s3.model.ListObjectsRequest) S3ObjectInfo(com.amazonaws.services.neptune.util.S3ObjectInfo) ObjectListing(com.amazonaws.services.s3.model.ObjectListing)
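
Note that the ListObjectsRequest passes 1 as maxKeys, so a single matching object under the prefix is enough to fail the check. The internals of S3ObjectInfo itself are not shown in these snippets; a minimal, self-contained sketch of the kind of s3:// URI parsing its bucket() and key() accessors presumably rely on might look like this (the class and field names are hypothetical):

public class S3UriSketch {
    private final String bucket;
    private final String key;

    public S3UriSketch(String s3Uri) {
        // Strip the s3:// scheme, then split bucket from key at the first slash.
        String withoutScheme = s3Uri.replaceFirst("^s3://", "");
        int firstSlash = withoutScheme.indexOf('/');
        this.bucket = firstSlash < 0 ? withoutScheme : withoutScheme.substring(0, firstSlash);
        this.key = firstSlash < 0 ? "" : withoutScheme.substring(firstSlash + 1);
    }

    public String bucket() { return bucket; }

    public String key() { return key; }
}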

Example 8 with S3ObjectInfo

Use of com.amazonaws.services.neptune.util.S3ObjectInfo in the project amazon-neptune-tools by awslabs.

From the class NeptuneMachineLearningExportEventHandlerV2, the method createTrainingJobConfigurationFile:

private void createTrainingJobConfigurationFile(TrainingDataWriterConfigV2 trainingDataWriterConfig, Path outputPath, GraphSchema graphSchema, PropertyName propertyName, TransferManagerWrapper transferManager) throws Exception {
    File outputDirectory = outputPath.toFile();
    String filename = String.format("%s.json", trainingDataWriterConfig.name());
    File trainingJobConfigurationFile = new File(outputPath.toFile(), filename);
    try (Writer writer = new PrintWriter(trainingJobConfigurationFile)) {
        if (dataModel == NeptuneMLSourceDataModel.RDF) {
            Collection<String> filenames = new ArrayList<>();
            File[] directories = outputDirectory.listFiles(File::isDirectory);
            for (File directory : directories) {
                File[] files = directory.listFiles(File::isFile);
                for (File file : files) {
                    filenames.add(outputDirectory.toPath().relativize(file.toPath()).toString());
                }
            }
            new RdfTrainingDataConfigWriter(filenames, createJsonGenerator(writer), trainingDataWriterConfig).write();
        } else {
            new PropertyGraphTrainingDataConfigWriterV2(graphSchema, createJsonGenerator(writer), propertyName, printerOptions, trainingDataWriterConfig).write(includeEdgeFeatures);
        }
    }
    if (StringUtils.isNotEmpty(outputS3Path)) {
        Timer.timedActivity("uploading training job configuration file to S3", (CheckedActivity.Runnable) () -> {
            S3ObjectInfo outputS3ObjectInfo = calculateOutputS3Path(outputDirectory);
            uploadTrainingJobConfigurationFileToS3(filename, transferManager.get(), trainingJobConfigurationFile, outputS3ObjectInfo);
        });
    }
}
Also used: RdfTrainingDataConfigWriter(com.amazonaws.services.neptune.profiles.neptune_ml.v2.RdfTrainingDataConfigWriter) PropertyGraphTrainingDataConfigWriterV2(com.amazonaws.services.neptune.profiles.neptune_ml.v2.PropertyGraphTrainingDataConfigWriterV2) S3ObjectInfo(com.amazonaws.services.neptune.util.S3ObjectInfo) ArrayList(java.util.ArrayList) CheckedActivity(com.amazonaws.services.neptune.util.CheckedActivity)
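
The createJsonGenerator helper used in Examples 6 and 8 is private and not shown here. Assuming it wraps the Writer with Jackson's streaming API (Jackson types already appear elsewhere in these handlers, and com.fasterxml.jackson.core.JsonFactory, JsonGenerator and com.fasterxml.jackson.core.util.DefaultPrettyPrinter would need to be imported), a sketch might look roughly like this; the pretty-printer choice is an assumption:

private JsonGenerator createJsonGenerator(Writer writer) throws IOException {
    // Hypothetical sketch: wrap the Writer in a Jackson streaming generator so the
    // training data config writers can emit JSON directly to the configuration file.
    JsonGenerator generator = new JsonFactory().createGenerator(writer);
    generator.setPrettyPrinter(new DefaultPrettyPrinter());
    return generator;
}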

Example 9 with S3ObjectInfo

Use of com.amazonaws.services.neptune.util.S3ObjectInfo in the project amazon-neptune-tools by awslabs.

From the class NeptuneMachineLearningExportEventHandlerV2, the method uploadTrainingJobConfigurationFileToS3:

private void uploadTrainingJobConfigurationFileToS3(String filename, TransferManager transferManager, File trainingJobConfigurationFile, S3ObjectInfo outputS3ObjectInfo) throws IOException {
    S3ObjectInfo s3ObjectInfo = outputS3ObjectInfo.withNewKeySuffix(filename);
    try (InputStream inputStream = new FileInputStream(trainingJobConfigurationFile)) {
        ObjectMetadata objectMetadata = new ObjectMetadata();
        objectMetadata.setContentLength(trainingJobConfigurationFile.length());
        objectMetadata.setSSEAlgorithm(ObjectMetadata.AES_256_SERVER_SIDE_ENCRYPTION);
        PutObjectRequest putObjectRequest = new PutObjectRequest(s3ObjectInfo.bucket(), s3ObjectInfo.key(), inputStream, objectMetadata).withTagging(ExportToS3NeptuneExportEventHandler.createObjectTags(profiles));
        Upload upload = transferManager.upload(putObjectRequest);
        upload.waitForUploadResult();
    } catch (InterruptedException e) {
        logger.warn(e.getMessage());
        Thread.currentThread().interrupt();
    }
}
Also used: S3ObjectInfo(com.amazonaws.services.neptune.util.S3ObjectInfo) Upload(com.amazonaws.services.s3.transfer.Upload) ObjectMetadata(com.amazonaws.services.s3.model.ObjectMetadata) PutObjectRequest(com.amazonaws.services.s3.model.PutObjectRequest)
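
The TransferManager received here comes from the project's TransferManagerWrapper (see transferManager.get() in Examples 6 and 8), whose construction is not shown in these snippets. Assuming it builds a standard AWS SDK for Java v1 TransferManager over the default S3 client, the setup might look roughly like this; the real wrapper may configure more:

// Hypothetical sketch of how the wrapped TransferManager might be created
// (AWS SDK for Java v1, com.amazonaws.services.s3.transfer.TransferManagerBuilder).
TransferManager transferManager = TransferManagerBuilder.standard()
        .withS3Client(AmazonS3ClientBuilder.defaultClient())
        .build();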

Example 10 with S3ObjectInfo

Use of com.amazonaws.services.neptune.util.S3ObjectInfo in the project amazon-neptune-tools by awslabs.

From the class ExportToS3NeptuneExportEventHandler, the method uploadExportFilesToS3:

private void uploadExportFilesToS3(TransferManager transferManager, File directory, S3ObjectInfo outputS3ObjectInfo) {
    if (directory == null || !directory.exists()) {
        logger.warn("Ignoring request to upload files to S3 because upload directory from which to upload files does not exist");
        return;
    }
    boolean allowRetry = true;
    int retryCount = 0;
    while (allowRetry) {
        try {
            // deleteS3Directories(directory, outputS3ObjectInfo);
            ObjectMetadataProvider metadataProvider = (file, objectMetadata) -> {
                objectMetadata.setContentLength(file.length());
                objectMetadata.setSSEAlgorithm(ObjectMetadata.AES_256_SERVER_SIDE_ENCRYPTION);
            };
            ObjectTaggingProvider taggingProvider = uploadContext -> createObjectTags(profiles);
            logger.info("Uploading export files to s3 bucket={} key={}", outputS3ObjectInfo.bucket(), outputS3ObjectInfo.key());
            MultipleFileUpload upload = transferManager.uploadDirectory(outputS3ObjectInfo.bucket(), outputS3ObjectInfo.key(), directory, true, metadataProvider, taggingProvider);
            AmazonClientException amazonClientException = upload.waitForException();
            if (amazonClientException != null) {
                String errorMessage = amazonClientException.getMessage();
                logger.error("Upload to S3 failed: {}", errorMessage);
                if (!amazonClientException.isRetryable() || retryCount > 2) {
                    allowRetry = false;
                    logger.warn("Cancelling upload to S3 [RetryCount: {}]", retryCount);
                    throw new RuntimeException(String.format("Upload to S3 failed [Directory: %s, S3 location: %s, Reason: %s, RetryCount: %s]", directory, outputS3ObjectInfo, errorMessage, retryCount));
                } else {
                    retryCount++;
                    logger.info("Retrying upload to S3 [RetryCount: {}]", retryCount);
                }
            } else {
                allowRetry = false;
            }
        } catch (InterruptedException e) {
            logger.warn(e.getMessage());
            Thread.currentThread().interrupt();
        }
    }
}
Also used: StringUtils(org.apache.commons.lang.StringUtils) Cluster(com.amazonaws.services.neptune.cluster.Cluster) S3ObjectInfo(com.amazonaws.services.neptune.util.S3ObjectInfo) LoggerFactory(org.slf4j.LoggerFactory) Directories(com.amazonaws.services.neptune.io.Directories) Timer(com.amazonaws.services.neptune.util.Timer) AtomicReference(java.util.concurrent.atomic.AtomicReference) ObjectNode(com.fasterxml.jackson.databind.node.ObjectNode) ProgressEvent(com.amazonaws.event.ProgressEvent) ArrayList(java.util.ArrayList) ProgressListener(com.amazonaws.event.ProgressListener) ObjectMetadata(com.amazonaws.services.s3.model.ObjectMetadata) TransferManagerWrapper(com.amazonaws.services.neptune.util.TransferManagerWrapper) com.amazonaws.services.s3.transfer(com.amazonaws.services.s3.transfer) ObjectTagging(com.amazonaws.services.s3.model.ObjectTagging) Path(java.nio.file.Path) ExportStats(com.amazonaws.services.neptune.propertygraph.ExportStats) ObjectWriter(com.fasterxml.jackson.databind.ObjectWriter) Files(java.nio.file.Files) UTF_8(java.nio.charset.StandardCharsets.UTF_8) Collection(java.util.Collection) ObjectMapper(com.fasterxml.jackson.databind.ObjectMapper) FileUtils(org.apache.commons.io.FileUtils) Tag(com.amazonaws.services.s3.model.Tag) UUID(java.util.UUID) PutObjectRequest(com.amazonaws.services.s3.model.PutObjectRequest) GraphSchema(com.amazonaws.services.neptune.propertygraph.schema.GraphSchema) List(java.util.List) java.io(java.io) NEPTUNE_EXPORT_TAGS(com.amazonaws.services.neptune.export.NeptuneExportService.NEPTUNE_EXPORT_TAGS) Paths(java.nio.file.Paths) JsonNodeFactory(com.fasterxml.jackson.databind.node.JsonNodeFactory) AmazonClientException(com.amazonaws.AmazonClientException) FilenameUtils(org.apache.commons.io.FilenameUtils) CheckedActivity(com.amazonaws.services.neptune.util.CheckedActivity)
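
The createObjectTags helper, used both here (via the ObjectTaggingProvider) and in Example 9, is not among these snippets. Assuming it turns the active export profile names into S3 object tags, a sketch might look like the following; the tag key is an illustrative placeholder, not necessarily the value of NEPTUNE_EXPORT_TAGS used by the real handler:

public static ObjectTagging createObjectTags(Collection<String> profiles) {
    // Hypothetical sketch: tag uploaded objects with the export profiles, if any.
    // The tag key below is illustrative only.
    List<Tag> tags = new ArrayList<>();
    if (!profiles.isEmpty()) {
        tags.add(new Tag("neptune-export-profiles", String.join(":", profiles)));
    }
    return new ObjectTagging(tags);
}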

Aggregations

S3ObjectInfo (com.amazonaws.services.neptune.util.S3ObjectInfo) 14
ObjectMetadata (com.amazonaws.services.s3.model.ObjectMetadata) 7
PutObjectRequest (com.amazonaws.services.s3.model.PutObjectRequest) 7
ObjectMapper (com.fasterxml.jackson.databind.ObjectMapper) 6
CheckedActivity (com.amazonaws.services.neptune.util.CheckedActivity) 5
ObjectNode (com.fasterxml.jackson.databind.node.ObjectNode) 5
ArrayList (java.util.ArrayList) 5
ObjectWriter (com.fasterxml.jackson.databind.ObjectWriter) 4
UTF_8 (java.nio.charset.StandardCharsets.UTF_8) 4
Path (java.nio.file.Path) 4
StringUtils (org.apache.commons.lang.StringUtils) 4
AmazonClientException (com.amazonaws.AmazonClientException) 3
ProgressEvent (com.amazonaws.event.ProgressEvent) 3
ProgressListener (com.amazonaws.event.ProgressListener) 3
Cluster (com.amazonaws.services.neptune.cluster.Cluster) 3
NEPTUNE_EXPORT_TAGS (com.amazonaws.services.neptune.export.NeptuneExportService.NEPTUNE_EXPORT_TAGS) 3
Directories (com.amazonaws.services.neptune.io.Directories) 3
ExportStats (com.amazonaws.services.neptune.propertygraph.ExportStats) 3
GraphSchema (com.amazonaws.services.neptune.propertygraph.schema.GraphSchema) 3
Timer (com.amazonaws.services.neptune.util.Timer) 3