
Example 1 with FileEncryptionProperties

use of org.apache.parquet.crypto.FileEncryptionProperties in project parquet-mr by apache.

the class TestBloomFiltering method createFiles.

@BeforeClass
public static void createFiles() throws IOException {
    writePhoneBookToFile(FILE_V1, ParquetProperties.WriterVersion.PARQUET_1_0, null);
    writePhoneBookToFile(FILE_V2, ParquetProperties.WriterVersion.PARQUET_2_0, null);
    FileEncryptionProperties encryptionProperties = getFileEncryptionProperties();
    writePhoneBookToFile(FILE_V1_E, ParquetProperties.WriterVersion.PARQUET_1_0, encryptionProperties);
    writePhoneBookToFile(FILE_V2_E, ParquetProperties.WriterVersion.PARQUET_2_0, encryptionProperties);
}
Also used : FileEncryptionProperties(org.apache.parquet.crypto.FileEncryptionProperties) BeforeClass(org.junit.BeforeClass)

Example 2 with FileEncryptionProperties

use of org.apache.parquet.crypto.FileEncryptionProperties in project parquet-mr by apache.

the class SchemaCryptoPropertiesFactory method getFileEncryptionProperties.

@Override
public FileEncryptionProperties getFileEncryptionProperties(Configuration conf, Path tempFilePath, WriteContext fileWriteContext) throws ParquetCryptoRuntimeException {
    MessageType schema = fileWriteContext.getSchema();
    List<String[]> paths = schema.getPaths();
    if (paths == null || paths.isEmpty()) {
        throw new ParquetCryptoRuntimeException("Null or empty fields is found");
    }
    Map<ColumnPath, ColumnEncryptionProperties> columnPropertyMap = new HashMap<>();
    for (String[] path : paths) {
        getColumnEncryptionProperties(path, columnPropertyMap, conf);
    }
    if (columnPropertyMap.size() == 0) {
        log.debug("No column is encrypted. Returning null so that Parquet can skip. Empty properties will cause Parquet exception");
        return null;
    }
    /*
     * Why is footerKeyMetadata still needed when withEncryptedFooter is false? According to the
     * 'Plaintext Footer' section of
     * https://github.com/apache/parquet-format/blob/encryption/Encryption.md, the plaintext footer
     * is signed to prevent tampering with the FileMetaData contents, so footerKeyMetadata is
     * always needed. The signature is verified if the parquet-mr code includes PARQUET-1178;
     * otherwise it is ignored.
     */
    boolean shouldEncryptFooter = getEncryptFooter(conf);
    FileEncryptionProperties.Builder encryptionPropertiesBuilder = FileEncryptionProperties.builder(FOOTER_KEY)
            .withFooterKeyMetadata(FOOTER_KEY_METADATA)
            .withAlgorithm(getParquetCipherOrDefault(conf))
            .withEncryptedColumns(columnPropertyMap);
    if (!shouldEncryptFooter) {
        encryptionPropertiesBuilder = encryptionPropertiesBuilder.withPlaintextFooter();
    }
    FileEncryptionProperties encryptionProperties = encryptionPropertiesBuilder.build();
    log.info("FileEncryptionProperties is built with, algorithm:{}, footerEncrypted:{}", encryptionProperties.getAlgorithm(), encryptionProperties.encryptedFooter());
    return encryptionProperties;
}
Also used : ParquetCryptoRuntimeException(org.apache.parquet.crypto.ParquetCryptoRuntimeException) HashMap(java.util.HashMap) FileEncryptionProperties(org.apache.parquet.crypto.FileEncryptionProperties) ColumnEncryptionProperties(org.apache.parquet.crypto.ColumnEncryptionProperties) ColumnPath(org.apache.parquet.hadoop.metadata.ColumnPath) MessageType(org.apache.parquet.schema.MessageType)

Example 3 with FileEncryptionProperties

use of org.apache.parquet.crypto.FileEncryptionProperties in project parquet-mr by apache.

the class TestFileBuilder method build.

public EncryptionTestFile build() throws IOException {
    String fileName = createTempFile("test");
    SimpleGroup[] fileContent = createFileContent(schema);
    FileEncryptionProperties encryptionProperties = EncDecProperties.getFileEncryptionProperties(encryptColumns, cipher, footerEncryption);
    ExampleParquetWriter.Builder builder = ExampleParquetWriter.builder(new Path(fileName))
            .withConf(conf)
            .withWriterVersion(writerVersion)
            .withExtraMetaData(extraMeta)
            .withValidation(true)
            .withPageSize(pageSize)
            .withEncryption(encryptionProperties)
            .withCompressionCodec(CompressionCodecName.valueOf(codec));
    try (ParquetWriter writer = builder.build()) {
        for (int i = 0; i < fileContent.length; i++) {
            writer.write(fileContent[i]);
        }
    }
    return new EncryptionTestFile(fileName, fileContent);
}
Also used : Path(org.apache.hadoop.fs.Path) ExampleParquetWriter(org.apache.parquet.hadoop.example.ExampleParquetWriter) ParquetWriter(org.apache.parquet.hadoop.ParquetWriter) FileEncryptionProperties(org.apache.parquet.crypto.FileEncryptionProperties) SimpleGroup(org.apache.parquet.example.data.simple.SimpleGroup)

Example 4 with FileEncryptionProperties

use of org.apache.parquet.crypto.FileEncryptionProperties in project parquet-mr by apache.

the class TestEncryptionOptions method testWriteEncryptedParquetFiles.

private void testWriteEncryptedParquetFiles(Path root, List<SingleRow> data) throws IOException {
    Configuration conf = new Configuration();
    // Ensure that several pages will be created
    int pageSize = data.size() / 10;
    // Ensure that there are more row-groups created
    int rowGroupSize = pageSize * 6 * 5;
    SimpleGroupFactory f = new SimpleGroupFactory(SCHEMA);
    EncryptionConfiguration[] encryptionConfigurations = EncryptionConfiguration.values();
    for (EncryptionConfiguration encryptionConfiguration : encryptionConfigurations) {
        Path file = new Path(root, getFileName(encryptionConfiguration));
        FileEncryptionProperties encryptionProperties = encryptionConfiguration.getEncryptionProperties();
        LOG.info("\nWrite " + file.toString());
        try (ParquetWriter<Group> writer = ExampleParquetWriter.builder(file)
                .withWriteMode(OVERWRITE)
                .withRowGroupSize(rowGroupSize)
                .withPageSize(pageSize)
                .withType(SCHEMA)
                .withConf(conf)
                .withEncryption(encryptionProperties)
                .build()) {
            for (SingleRow singleRow : data) {
                writer.write(f.newGroup()
                        .append(SingleRow.BOOLEAN_FIELD_NAME, singleRow.boolean_field)
                        .append(SingleRow.INT32_FIELD_NAME, singleRow.int32_field)
                        .append(SingleRow.FLOAT_FIELD_NAME, singleRow.float_field)
                        .append(SingleRow.DOUBLE_FIELD_NAME, singleRow.double_field)
                        .append(SingleRow.BINARY_FIELD_NAME, Binary.fromConstantByteArray(singleRow.ba_field))
                        .append(SingleRow.FIXED_LENGTH_BINARY_FIELD_NAME, Binary.fromConstantByteArray(singleRow.flba_field))
                        .append(SingleRow.PLAINTEXT_INT32_FIELD_NAME, singleRow.plaintext_int32_field));
            }
        }
    }
}
Also used : ColumnPath(org.apache.parquet.hadoop.metadata.ColumnPath) Path(org.apache.hadoop.fs.Path) Group(org.apache.parquet.example.data.Group) Configuration(org.apache.hadoop.conf.Configuration) FileEncryptionProperties(org.apache.parquet.crypto.FileEncryptionProperties) SimpleGroupFactory(org.apache.parquet.example.data.simple.SimpleGroupFactory) SingleRow(org.apache.parquet.crypto.SingleRow)
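
The sizing arithmetic in Example 4 can be checked in isolation. The following is a minimal sketch using only the JDK; the class WriterSizingSketch and its method names are hypothetical and not part of parquet-mr. It mirrors the two expressions from the test (data.size() / 10 and pageSize * 6 * 5) and shows that, because of integer division, fewer than 10 input rows yield a page size of 0.

```java
// Hypothetical helper, not part of parquet-mr: mirrors the sizing
// arithmetic used in testWriteEncryptedParquetFiles.
class WriterSizingSketch {

    // Same expression as "data.size() / 10" in the test (integer division).
    static int pageSize(int rowCount) {
        return rowCount / 10;
    }

    // Same expression as "pageSize * 6 * 5" in the test.
    static int rowGroupSize(int pageSize) {
        return pageSize * 6 * 5;
    }

    public static void main(String[] args) {
        for (int rows : new int[] {5, 1000}) {
            int page = pageSize(rows);
            System.out.println("rows=" + rows
                    + " pageSize=" + page
                    + " rowGroupSize=" + rowGroupSize(page));
        }
    }
}
```

Note that the values passed to withPageSize and withRowGroupSize are sizes in bytes, so deriving them from the row count is a heuristic of this test rather than an exact rows-per-page setting.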

Example 5 with FileEncryptionProperties

use of org.apache.parquet.crypto.FileEncryptionProperties in project parquet-mr by apache.

the class TestColumnIndexFiltering method getFileEncryptionProperties.

private static FileEncryptionProperties getFileEncryptionProperties() {
    ColumnEncryptionProperties columnProperties1 = ColumnEncryptionProperties.builder("id")
            .withKey(COLUMN_ENCRYPTION_KEY1)
            .withKeyID(COLUMN_ENCRYPTION_KEY1_ID)
            .build();
    ColumnEncryptionProperties columnProperties2 = ColumnEncryptionProperties.builder("name")
            .withKey(COLUMN_ENCRYPTION_KEY2)
            .withKeyID(COLUMN_ENCRYPTION_KEY2_ID)
            .build();
    Map<ColumnPath, ColumnEncryptionProperties> columnPropertiesMap = new HashMap<>();
    columnPropertiesMap.put(columnProperties1.getPath(), columnProperties1);
    columnPropertiesMap.put(columnProperties2.getPath(), columnProperties2);
    FileEncryptionProperties encryptionProperties = FileEncryptionProperties.builder(FOOTER_ENCRYPTION_KEY)
            .withFooterKeyID(FOOTER_ENCRYPTION_KEY_ID)
            .withEncryptedColumns(columnPropertiesMap)
            .build();
    return encryptionProperties;
}
Also used : HashMap(java.util.HashMap) FileEncryptionProperties(org.apache.parquet.crypto.FileEncryptionProperties) ColumnEncryptionProperties(org.apache.parquet.crypto.ColumnEncryptionProperties) ColumnPath(org.apache.parquet.hadoop.metadata.ColumnPath)
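
Example 5 refers to constants such as FOOTER_ENCRYPTION_KEY and COLUMN_ENCRYPTION_KEY1 whose definitions are not part of the excerpt. The following is a minimal sketch of how such key material might be produced, using only the JDK. It assumes the keys are raw AES keys of 16, 24, or 32 bytes (to my understanding, the parquet-mr builders reject other lengths, but verify this against the version in use). The class KeyMaterialSketch, its methods, and the key ID "kf" are hypothetical names chosen for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.security.SecureRandom;

// Hypothetical helper, not part of parquet-mr: produces byte arrays that
// could serve as footer or column keys in the builders shown above.
class KeyMaterialSketch {

    // AES key sizes in bytes (128, 192, and 256 bits).
    static boolean isValidAesKeyLength(int length) {
        return length == 16 || length == 24 || length == 32;
    }

    // Generates a random key of the requested length, rejecting
    // lengths that are not valid AES key sizes.
    static byte[] randomKey(int lengthInBytes) {
        if (!isValidAesKeyLength(lengthInBytes)) {
            throw new IllegalArgumentException("Unsupported key length: " + lengthInBytes);
        }
        byte[] key = new byte[lengthInBytes];
        new SecureRandom().nextBytes(key);
        return key;
    }

    public static void main(String[] args) {
        byte[] footerKey = randomKey(16);   // assumed stand-in for FOOTER_ENCRYPTION_KEY
        byte[] columnKey = randomKey(16);   // assumed stand-in for COLUMN_ENCRYPTION_KEY1
        // Illustrative key ID only; real IDs come from the key management setup.
        byte[] footerKeyId = "kf".getBytes(StandardCharsets.UTF_8);
        System.out.println("footer key bytes: " + footerKey.length);
        System.out.println("column key bytes: " + columnKey.length);
        System.out.println("footer key ID bytes: " + footerKeyId.length);
    }
}
```

Fixed, hard-coded keys are convenient in tests; outside of tests, keys would normally come from a key management service rather than from source code.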

Aggregations

FileEncryptionProperties (org.apache.parquet.crypto.FileEncryptionProperties): 10
ColumnPath (org.apache.parquet.hadoop.metadata.ColumnPath): 6
ColumnEncryptionProperties (org.apache.parquet.crypto.ColumnEncryptionProperties): 5
HashMap (java.util.HashMap): 4
Path (org.apache.hadoop.fs.Path): 3
IOException (java.io.IOException): 2
Configuration (org.apache.hadoop.conf.Configuration): 2
ParquetCryptoRuntimeException (org.apache.parquet.crypto.ParquetCryptoRuntimeException): 2
BeforeClass (org.junit.BeforeClass): 2
Objects (java.util.Objects): 1
JobConf (org.apache.hadoop.mapred.JobConf): 1
Job (org.apache.hadoop.mapreduce.Job): 1
JobContext (org.apache.hadoop.mapreduce.JobContext): 1
OutputCommitter (org.apache.hadoop.mapreduce.OutputCommitter): 1
RecordWriter (org.apache.hadoop.mapreduce.RecordWriter): 1
TaskAttemptContext (org.apache.hadoop.mapreduce.TaskAttemptContext): 1
FileOutputFormat (org.apache.hadoop.mapreduce.lib.output.FileOutputFormat): 1
ParquetProperties (org.apache.parquet.column.ParquetProperties): 1
DEFAULT_BLOOM_FILTER_ENABLED (org.apache.parquet.column.ParquetProperties.DEFAULT_BLOOM_FILTER_ENABLED): 1
WriterVersion (org.apache.parquet.column.ParquetProperties.WriterVersion): 1