
Example 1 with MergeOnReadInputFormat

Use of org.apache.hudi.table.format.mor.MergeOnReadInputFormat in the Apache Hudi project.

From class TestStreamReadOperator, method createReader:

private OneInputStreamOperatorTestHarness<MergeOnReadInputSplit, RowData> createReader() throws Exception {
    final String basePath = tempFile.getAbsolutePath();
    final org.apache.hadoop.conf.Configuration hadoopConf = StreamerUtil.getHadoopConf();
    final HoodieTableMetaClient metaClient = HoodieTableMetaClient.builder()
        .setConf(hadoopConf)
        .setBasePath(basePath)
        .build();
    final List<String> partitionKeys = Collections.singletonList("partition");

    // Resolve the table's Avro schema; the input format built below uses it to open the emitted splits.
    TableSchemaResolver schemaResolver = new TableSchemaResolver(metaClient);
    final Schema tableAvroSchema;
    try {
        tableAvroSchema = schemaResolver.getTableAvroSchema();
    } catch (Exception e) {
        throw new HoodieException("Get table avro schema error", e);
    }
    final DataType rowDataType = AvroSchemaConverter.convertToDataType(tableAvroSchema);
    final RowType rowType = (RowType) rowDataType.getLogicalType();
    final MergeOnReadTableState hoodieTableState = new MergeOnReadTableState(
        rowType,
        TestConfigurations.ROW_TYPE,
        tableAvroSchema.toString(),
        AvroSchemaConverter.convertToSchema(TestConfigurations.ROW_TYPE).toString(),
        Collections.emptyList(),
        new String[0]);
    MergeOnReadInputFormat inputFormat = MergeOnReadInputFormat.builder()
        .config(conf)
        .tableState(hoodieTableState)
        .fieldTypes(rowDataType.getChildren())
        .defaultPartName("default")
        .limit(1000L)
        .emitDelete(true)
        .build();
    OneInputStreamOperatorFactory<MergeOnReadInputSplit, RowData> factory =
        StreamReadOperator.factory(inputFormat);
    OneInputStreamOperatorTestHarness<MergeOnReadInputSplit, RowData> harness =
        new OneInputStreamOperatorTestHarness<>(factory, 1, 1, 0);
    harness.getStreamConfig().setTimeCharacteristic(TimeCharacteristic.ProcessingTime);
    return harness;
}
Also used: Schema (org.apache.avro.Schema), OneInputStreamOperatorTestHarness (org.apache.flink.streaming.util.OneInputStreamOperatorTestHarness), RowData (org.apache.flink.table.data.RowData), DataType (org.apache.flink.table.types.DataType), RowType (org.apache.flink.table.types.logical.RowType), HoodieTableMetaClient (org.apache.hudi.common.table.HoodieTableMetaClient), TableSchemaResolver (org.apache.hudi.common.table.TableSchemaResolver), HoodieException (org.apache.hudi.exception.HoodieException), MergeOnReadInputFormat (org.apache.hudi.table.format.mor.MergeOnReadInputFormat), MergeOnReadInputSplit (org.apache.hudi.table.format.mor.MergeOnReadInputSplit), MergeOnReadTableState (org.apache.hudi.table.format.mor.MergeOnReadTableState)
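
For context, here is a minimal sketch of how a test might drive the harness that createReader returns. The conf referenced in the builder above is the Flink Configuration field of the test class (in Hudi's Flink tests it is typically built with TestConfigurations.getDefaultConf(basePath)); generateSplit() is a hypothetical helper standing in for whatever produces a MergeOnReadInputSplit in the real test. Note that StreamReadOperator enqueues splits on the task mailbox and reads them asynchronously, so the actual TestStreamReadOperator steps a mailbox processor before asserting on output.

import java.util.List;
import org.apache.flink.streaming.runtime.streamrecord.StreamRecord;
import org.apache.flink.table.data.RowData;
import org.apache.hudi.table.format.mor.MergeOnReadInputSplit;
import org.junit.jupiter.api.Test;

@Test
void testEmitRecordsFromSplit() throws Exception {
    // A sketch only: generateSplit() is a hypothetical helper, not part of the Hudi test.
    OneInputStreamOperatorTestHarness<MergeOnReadInputSplit, RowData> harness = createReader();
    harness.open();

    MergeOnReadInputSplit split = generateSplit(); // hypothetical: yields a split for the test table
    harness.processElement(new StreamRecord<>(split));

    // StreamReadOperator reads splits asynchronously via the task mailbox;
    // the real test steps a mailbox processor here before checking output.
    List<RowData> rows = harness.extractOutputValues();

    harness.close();
}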
