Example 1 with TestUtils.getParquetStoreProperties

Use of uk.gov.gchq.gaffer.parquetstore.testutils.TestUtils.getParquetStoreProperties in project Gaffer by gchq.

From the class ParquetStoreTest, method shouldCorrectlyUseCompressionOption.

@Test
public void shouldCorrectlyUseCompressionOption(@TempDir java.nio.file.Path tempDir) throws Exception {
    for (final String compressionType : Sets.newHashSet("GZIP", "SNAPPY", "UNCOMPRESSED")) {
        // Given
        final Schema schema = new Schema.Builder()
                .type("int", new TypeDefinition.Builder()
                        .clazz(Integer.class).serialiser(new IntegerParquetSerialiser()).build())
                .type("string", new TypeDefinition.Builder()
                        .clazz(String.class).serialiser(new StringParquetSerialiser()).build())
                .type(DIRECTED_EITHER, Boolean.class)
                .entity("entity", new SchemaEntityDefinition.Builder()
                        .vertex("string").property("property1", "int")
                        .aggregate(false).build())
                .edge("edge", new SchemaEdgeDefinition.Builder()
                        .source("string").destination("string").property("property2", "int")
                        .directed(DIRECTED_EITHER).aggregate(false).build())
                .vertexSerialiser(new StringParquetSerialiser())
                .build();
        final ParquetStoreProperties parquetStoreProperties = TestUtils.getParquetStoreProperties(tempDir);
        parquetStoreProperties.setCompressionCodecName(compressionType);
        final ParquetStore parquetStore = (ParquetStore) ParquetStore.createStore("graphId", schema, parquetStoreProperties);
        final List<Element> elements = new ArrayList<>();
        elements.add(new Entity.Builder().group("entity").vertex("A").property("property1", 1).build());
        elements.add(new Edge.Builder().group("edge").source("B").dest("C").property("property2", 100).build());
        // When
        final AddElements add = new AddElements.Builder().input(elements).build();
        parquetStore.execute(add, new Context());
        // Then
        final List<Path> files = parquetStore.getFilesForGroup("entity");
        for (final Path path : files) {
            final ParquetMetadata parquetMetadata = ParquetFileReader.readFooter(new Configuration(), path, ParquetMetadataConverter.NO_FILTER);
            for (final BlockMetaData blockMetadata : parquetMetadata.getBlocks()) {
                blockMetadata.getColumns().forEach(c -> assertEquals(compressionType, c.getCodec().name()));
            }
        }
    }
}
Also used:
AddElements (uk.gov.gchq.gaffer.operation.impl.add.AddElements)
Entity (uk.gov.gchq.gaffer.data.element.Entity)
BlockMetaData (org.apache.parquet.hadoop.metadata.BlockMetaData)
Configuration (org.apache.hadoop.conf.Configuration)
ParquetMetadata (org.apache.parquet.hadoop.metadata.ParquetMetadata)
Schema (uk.gov.gchq.gaffer.store.schema.Schema)
Element (uk.gov.gchq.gaffer.data.element.Element)
ArrayList (java.util.ArrayList)
TypeDefinition (uk.gov.gchq.gaffer.store.schema.TypeDefinition)
IntegerParquetSerialiser (uk.gov.gchq.gaffer.parquetstore.serialisation.impl.IntegerParquetSerialiser)
Context (uk.gov.gchq.gaffer.store.Context)
Path (org.apache.hadoop.fs.Path)
StringParquetSerialiser (uk.gov.gchq.gaffer.parquetstore.serialisation.impl.StringParquetSerialiser)
TestUtils.getParquetStoreProperties (uk.gov.gchq.gaffer.parquetstore.testutils.TestUtils.getParquetStoreProperties)
SchemaEdgeDefinition (uk.gov.gchq.gaffer.store.schema.SchemaEdgeDefinition)
Test (org.junit.jupiter.api.Test)
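
The body of TestUtils.getParquetStoreProperties is not shown in this example. As a rough illustration only, a helper of this kind might point the store at fresh directories under the JUnit temporary directory so that each test run stays isolated. The sketch below uses plain java.util.Properties; the class name, method name, and property keys are all hypothetical and are not taken from Gaffer.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Properties;

// Hypothetical stand-in for a test helper; NOT Gaffer's implementation.
final class TempStorePropertiesSketch {

    // Made-up keys for this sketch only; they are not real Gaffer property names.
    static final String DATA_DIR_KEY = "sketch.data.dir";
    static final String TEMP_DIR_KEY = "sketch.temp.dir";

    private TempStorePropertiesSketch() {
    }

    // Creates two sub-directories under tempDir and records their paths.
    static Properties propertiesFor(final Path tempDir) throws IOException {
        final Path dataDir = Files.createDirectories(tempDir.resolve("data"));
        final Path tmpDir = Files.createDirectories(tempDir.resolve("tmpdata"));
        final Properties props = new Properties();
        props.setProperty(DATA_DIR_KEY, dataDir.toString());
        props.setProperty(TEMP_DIR_KEY, tmpDir.toString());
        return props;
    }
}
```

A test would call TempStorePropertiesSketch.propertiesFor(tempDir) with the path injected by @TempDir and then override individual settings, much as the example above calls setCompressionCodecName on the returned properties.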
