Examples with PartitionSpec - org.apache.iceberg.PartitionSpec

Example 51 with PartitionSpec

use of org.apache.iceberg.PartitionSpec in project hive by apache.

the class TestHiveIcebergPartitions method testMultilevelIdentityPartitionedWrite.

@Test
public void testMultilevelIdentityPartitionedWrite() throws IOException {
    PartitionSpec spec = PartitionSpec.builderFor(HiveIcebergStorageHandlerTestUtils.CUSTOMER_SCHEMA).identity("customer_id").identity("last_name").build();
    List<Record> records = TestHelper.generateRandomRecords(HiveIcebergStorageHandlerTestUtils.CUSTOMER_SCHEMA, 4, 0L);
    Table table = testTables.createTable(shell, "partitioned_customers", HiveIcebergStorageHandlerTestUtils.CUSTOMER_SCHEMA, spec, fileFormat, records);
    HiveIcebergTestUtils.validateData(table, records, 0);
}

Also used : Table(org.apache.iceberg.Table) Record(org.apache.iceberg.data.Record) PartitionSpec(org.apache.iceberg.PartitionSpec) Test(org.junit.Test)

Example 52 with PartitionSpec

use of org.apache.iceberg.PartitionSpec in project hive by apache.

the class TestHiveIcebergPartitions method testPartitionedWrite.

@Test
public void testPartitionedWrite() throws IOException {
    PartitionSpec spec = PartitionSpec.builderFor(HiveIcebergStorageHandlerTestUtils.CUSTOMER_SCHEMA).bucket("customer_id", 3).build();
    List<Record> records = TestHelper.generateRandomRecords(HiveIcebergStorageHandlerTestUtils.CUSTOMER_SCHEMA, 4, 0L);
    Table table = testTables.createTable(shell, "partitioned_customers", HiveIcebergStorageHandlerTestUtils.CUSTOMER_SCHEMA, spec, fileFormat, records);
    HiveIcebergTestUtils.validateData(table, records, 0);
}

Also used : Table(org.apache.iceberg.Table) Record(org.apache.iceberg.data.Record) PartitionSpec(org.apache.iceberg.PartitionSpec) Test(org.junit.Test)

Example 53 with PartitionSpec

use of org.apache.iceberg.PartitionSpec in project hive by apache.

the class TestHiveIcebergPartitions method testYearTransform.

@Test
public void testYearTransform() throws IOException {
    Schema schema = new Schema(optional(1, "id", Types.LongType.get()), optional(2, "part_field", Types.DateType.get()));
    PartitionSpec spec = PartitionSpec.builderFor(schema).year("part_field").build();
    List<Record> records = TestHelper.RecordsBuilder.newInstance(schema).add(1L, LocalDate.of(2020, 1, 21)).add(2L, LocalDate.of(2020, 1, 22)).add(3L, LocalDate.of(2019, 1, 21)).build();
    Table table = testTables.createTable(shell, "part_test", schema, spec, fileFormat, records);
    HiveIcebergTestUtils.validateData(table, records, 0);
    HiveIcebergTestUtils.validateDataWithSQL(shell, "part_test", records, "id");
}

Also used : Table(org.apache.iceberg.Table) Schema(org.apache.iceberg.Schema) Record(org.apache.iceberg.data.Record) PartitionSpec(org.apache.iceberg.PartitionSpec) Test(org.junit.Test)

Example 54 with PartitionSpec

use of org.apache.iceberg.PartitionSpec in project hive by apache.

the class TestHiveIcebergPartitions method testMonthTransform.

@Test
public void testMonthTransform() throws IOException {
    Assume.assumeTrue("ORC/TIMESTAMP_INSTANT is not a supported vectorized type for Hive", isVectorized && fileFormat == FileFormat.ORC);
    Schema schema = new Schema(optional(1, "id", Types.LongType.get()), optional(2, "part_field", Types.TimestampType.withZone()));
    PartitionSpec spec = PartitionSpec.builderFor(schema).month("part_field").build();
    List<Record> records = TestHelper.RecordsBuilder.newInstance(schema).add(1L, OffsetDateTime.of(2017, 11, 22, 11, 30, 7, 0, ZoneOffset.ofHours(1))).add(2L, OffsetDateTime.of(2017, 11, 22, 11, 30, 7, 0, ZoneOffset.ofHours(2))).add(3L, OffsetDateTime.of(2017, 11, 23, 11, 30, 7, 0, ZoneOffset.ofHours(3))).build();
    Table table = testTables.createTable(shell, "part_test", schema, spec, fileFormat, records);
    HiveIcebergTestUtils.validateData(table, records, 0);
    HiveIcebergTestUtils.validateDataWithSQL(shell, "part_test", records, "id");
}

Also used : Table(org.apache.iceberg.Table) Schema(org.apache.iceberg.Schema) Record(org.apache.iceberg.data.Record) PartitionSpec(org.apache.iceberg.PartitionSpec) Test(org.junit.Test)

Example 55 with PartitionSpec

use of org.apache.iceberg.PartitionSpec in project hive by apache.

the class TestHiveIcebergPartitions method testTruncateTransform.

@Test
public void testTruncateTransform() throws IOException {
    Schema schema = new Schema(optional(1, "id", Types.LongType.get()), optional(2, "part_field", Types.StringType.get()));
    PartitionSpec spec = PartitionSpec.builderFor(schema).truncate("part_field", 2).build();
    List<Record> records = TestHelper.RecordsBuilder.newInstance(schema).add(1L, "Part1").add(2L, "Part2").add(3L, "Art3").build();
    Table table = testTables.createTable(shell, "part_test", schema, spec, fileFormat, records);
    HiveIcebergTestUtils.validateData(table, records, 0);
    HiveIcebergTestUtils.validateDataWithSQL(shell, "part_test", records, "id");
}

Also used : Table(org.apache.iceberg.Table) Schema(org.apache.iceberg.Schema) Record(org.apache.iceberg.data.Record) PartitionSpec(org.apache.iceberg.PartitionSpec) Test(org.junit.Test)

Aggregations

PartitionSpec (org.apache.iceberg.PartitionSpec)59 Test (org.junit.Test)39 Table (org.apache.iceberg.Table)38 Schema (org.apache.iceberg.Schema)37 Record (org.apache.iceberg.data.Record)19 TableIdentifier (org.apache.iceberg.catalog.TableIdentifier)18 List (java.util.List)10 FileFormat (org.apache.iceberg.FileFormat)9 ArrayList (java.util.ArrayList)8 FieldSchema (org.apache.hadoop.hive.metastore.api.FieldSchema)8 ImmutableList (org.apache.iceberg.relocated.com.google.common.collect.ImmutableList)7 IOException (java.io.IOException)6 UpdateSchema (org.apache.iceberg.UpdateSchema)6 BaseTable (org.apache.iceberg.BaseTable)5 Path (org.apache.hadoop.fs.Path)4 PartitionField (org.apache.iceberg.PartitionField)4 Types (org.apache.iceberg.types.Types)4 HdfsContext (com.facebook.presto.hive.HdfsContext)3 PrestoException (com.facebook.presto.spi.PrestoException)3 Map (java.util.Map)3