Example 1 with HadoopTables

Use of org.apache.iceberg.hadoop.HadoopTables in project hive by apache.

From the class Catalogs, method createTable:

/**
 * Creates an Iceberg table using the catalog specified by the configuration.
 * <p>
 * The properties should contain the following values:
 * <ul>
 * <li>Table identifier ({@link Catalogs#NAME}) or table path ({@link Catalogs#LOCATION}) is required
 * <li>Table schema ({@link InputFormatConfig#TABLE_SCHEMA}) is required
 * <li>Partition specification ({@link InputFormatConfig#PARTITION_SPEC}) is optional. Table will be unpartitioned if
 *  not provided
 * </ul><p>
 * Other properties are handed over to the table creation. The controlling properties above will not be
 * propagated.
 * @param conf a Hadoop conf
 * @param props the controlling properties
 * @return the created Iceberg table
 */
public static Table createTable(Configuration conf, Properties props) {
    String schemaString = props.getProperty(InputFormatConfig.TABLE_SCHEMA);
    Preconditions.checkNotNull(schemaString, "Table schema not set");
    Schema schema = SchemaParser.fromJson(schemaString);
    String specString = props.getProperty(InputFormatConfig.PARTITION_SPEC);
    PartitionSpec spec = PartitionSpec.unpartitioned();
    if (specString != null) {
        spec = PartitionSpecParser.fromJson(schema, specString);
    }
    String location = props.getProperty(LOCATION);
    String catalogName = props.getProperty(InputFormatConfig.CATALOG_NAME);
    // Create a table property map without the controlling properties
    Map<String, String> map = Maps.newHashMapWithExpectedSize(props.size());
    for (Object key : props.keySet()) {
        if (!PROPERTIES_TO_REMOVE.contains(key)) {
            map.put(key.toString(), props.get(key).toString());
        }
    }
    Optional<Catalog> catalog = loadCatalog(conf, catalogName);
    if (catalog.isPresent()) {
        String name = props.getProperty(NAME);
        Preconditions.checkNotNull(name, "Table identifier not set");
        return catalog.get().createTable(TableIdentifier.parse(name), schema, spec, location, map);
    }
    Preconditions.checkNotNull(location, "Table location not set");
    return new HadoopTables(conf).create(schema, spec, map, location);
}
Also used: Schema (org.apache.iceberg.Schema), HadoopTables (org.apache.iceberg.hadoop.HadoopTables), PartitionSpec (org.apache.iceberg.PartitionSpec), Catalog (org.apache.iceberg.catalog.Catalog)
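The controlling-property filtering inside createTable can be sketched as a standalone snippet. The contents of PROPERTIES_TO_REMOVE below are assumed for illustration only; the real set is defined in Catalogs:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;
import java.util.Set;

public class ControllingProps {
    // Hypothetical stand-in for Catalogs.PROPERTIES_TO_REMOVE; the real keys
    // live in org.apache.iceberg.mr.Catalogs and InputFormatConfig.
    static final Set<String> PROPERTIES_TO_REMOVE =
        Set.of("name", "location", "iceberg.mr.table.schema", "iceberg.mr.table.partition.spec");

    // Copies props into a plain String map, dropping the controlling keys,
    // mirroring what createTable does before handing properties on.
    static Map<String, String> filter(Properties props) {
        Map<String, String> map = new HashMap<>(props.size());
        for (Object key : props.keySet()) {
            if (!PROPERTIES_TO_REMOVE.contains(key.toString())) {
                map.put(key.toString(), props.get(key).toString());
            }
        }
        return map;
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("name", "db.tbl");
        props.setProperty("write.format.default", "parquet");
        System.out.println(filter(props));
    }
}
```

This keeps the table-creation properties clean: only genuine table properties reach the catalog, while the keys used to steer Catalogs itself are stripped.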

Example 2 with HadoopTables

Use of org.apache.iceberg.hadoop.HadoopTables in project hive by apache.

From the class TestIcebergInputFormats, method before:

@Before
public void before() throws IOException {
    conf = new Configuration();
    conf.set(InputFormatConfig.CATALOG, Catalogs.LOCATION);
    HadoopTables tables = new HadoopTables(conf);
    File location = temp.newFolder(testInputFormat.name(), fileFormat.name());
    Assert.assertTrue(location.delete());
    helper = new TestHelper(conf, tables, location.toString(), SCHEMA, SPEC, fileFormat, temp);
    builder = new InputFormatConfig.ConfigBuilder(conf).readFrom(location.toString());
}
Also used: TestHelper (org.apache.iceberg.mr.TestHelper), Configuration (org.apache.hadoop.conf.Configuration), HadoopTables (org.apache.iceberg.hadoop.HadoopTables), DataFile (org.apache.iceberg.DataFile), File (java.io.File), Before (org.junit.Before)

Example 3 with HadoopTables

Use of org.apache.iceberg.hadoop.HadoopTables in project hive by apache.

From the class TestCatalogs, method testLoadTableFromLocation:

@Test
public void testLoadTableFromLocation() throws IOException {
    conf.set(InputFormatConfig.CATALOG, Catalogs.LOCATION);
    AssertHelpers.assertThrows("Should complain about table location not set", IllegalArgumentException.class, "location not set", () -> Catalogs.loadTable(conf));
    HadoopTables tables = new HadoopTables();
    Table hadoopTable = tables.create(SCHEMA, temp.newFolder("hadoop_tables").toString());
    conf.set(InputFormatConfig.TABLE_LOCATION, hadoopTable.location());
    Assert.assertEquals(hadoopTable.location(), Catalogs.loadTable(conf).location());
}
Also used: Table (org.apache.iceberg.Table), HadoopTables (org.apache.iceberg.hadoop.HadoopTables), Test (org.junit.Test)

Example 4 with HadoopTables

Use of org.apache.iceberg.hadoop.HadoopTables in project hive by apache.

From the class Catalogs, method dropTable:

/**
 * Drops an Iceberg table using the catalog specified by the configuration.
 * <p>
 * The table identifier ({@link Catalogs#NAME}) or table path ({@link Catalogs#LOCATION}) should be specified by
 * the controlling properties.
 * @param conf a Hadoop conf
 * @param props the controlling properties
 * @return {@code true} if the table was successfully dropped
 */
public static boolean dropTable(Configuration conf, Properties props) {
    String location = props.getProperty(LOCATION);
    String catalogName = props.getProperty(InputFormatConfig.CATALOG_NAME);
    Optional<Catalog> catalog = loadCatalog(conf, catalogName);
    if (catalog.isPresent()) {
        String name = props.getProperty(NAME);
        Preconditions.checkNotNull(name, "Table identifier not set");
        return catalog.get().dropTable(TableIdentifier.parse(name));
    }
    Preconditions.checkNotNull(location, "Table location not set");
    return new HadoopTables(conf).dropTable(location);
}
Also used: HadoopTables (org.apache.iceberg.hadoop.HadoopTables), Catalog (org.apache.iceberg.catalog.Catalog)

Example 5 with HadoopTables

Use of org.apache.iceberg.hadoop.HadoopTables in project hive by apache.

From the class Catalogs, method loadTable:

private static Table loadTable(Configuration conf, String tableIdentifier, String tableLocation, String catalogName) {
    Optional<Catalog> catalog = loadCatalog(conf, catalogName);
    if (catalog.isPresent()) {
        Preconditions.checkArgument(tableIdentifier != null, "Table identifier not set");
        return catalog.get().loadTable(TableIdentifier.parse(tableIdentifier));
    }
    Preconditions.checkArgument(tableLocation != null, "Table location not set");
    return new HadoopTables(conf).load(tableLocation);
}
Also used: HadoopTables (org.apache.iceberg.hadoop.HadoopTables), Catalog (org.apache.iceberg.catalog.Catalog)
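All three Catalogs methods above (createTable, dropTable, loadTable) follow the same dispatch pattern: if a catalog is configured, a table identifier is required; otherwise a location is required and HadoopTables is the fallback. A minimal, stdlib-only sketch of that pattern (the names and TableRef type are illustrative, not the Iceberg API):

```java
import java.util.Optional;
import java.util.function.Function;

public class CatalogDispatch {
    // Hypothetical result type standing in for org.apache.iceberg.Table.
    record TableRef(String source, String target) {}

    // Mirrors the shape of Catalogs.loadTable/createTable/dropTable:
    // the catalog path takes precedence; the location path is the fallback.
    static TableRef resolve(Optional<Function<String, TableRef>> catalog,
                            String identifier, String location) {
        if (catalog.isPresent()) {
            if (identifier == null) {
                throw new IllegalArgumentException("Table identifier not set");
            }
            return catalog.get().apply(identifier);
        }
        if (location == null) {
            throw new IllegalArgumentException("Table location not set");
        }
        return new TableRef("hadoop", location);
    }

    public static void main(String[] args) {
        // No catalog configured: falls back to the location-based path.
        System.out.println(resolve(Optional.empty(), null, "/warehouse/db/tbl"));
    }
}
```

Factoring the precedence this way is what lets the HadoopTables branch stay a one-liner in each Catalogs method: all validation of which inputs are mandatory happens before the branch is taken.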

Aggregations

HadoopTables (org.apache.iceberg.hadoop.HadoopTables): 9
Configuration (org.apache.hadoop.conf.Configuration): 3
Catalog (org.apache.iceberg.catalog.Catalog): 3
Test (org.junit.Test): 3
File (java.io.File): 2
Properties (java.util.Properties): 2
Schema (org.apache.iceberg.Schema): 2
Table (org.apache.iceberg.Table): 2
BigDecimal (java.math.BigDecimal): 1
HashMap (java.util.HashMap): 1
Map (java.util.Map): 1
LogicalExpression (org.apache.drill.common.expression.LogicalExpression): 1
FormatPluginConfig (org.apache.drill.common.logical.FormatPluginConfig): 1
StoragePluginRegistry (org.apache.drill.exec.store.StoragePluginRegistry): 1
FileSystemConfig (org.apache.drill.exec.store.dfs.FileSystemConfig): 1
IcebergFormatPluginConfig (org.apache.drill.exec.store.iceberg.format.IcebergFormatPluginConfig): 1
Snapshot (org.apache.drill.exec.store.iceberg.snapshot.Snapshot): 1
JsonStringHashMap (org.apache.drill.exec.util.JsonStringHashMap): 1
CatalogProperties (org.apache.iceberg.CatalogProperties): 1
DataFile (org.apache.iceberg.DataFile): 1