
Example 1 with HoodieAvroPayload

Use of org.apache.hudi.common.model.HoodieAvroPayload in the apache project rocketmq-externals.

The class Updater, method schemaEvolution:

private void schemaEvolution(Schema newSchema, Schema oldSchema) {
    // Nothing to do if the schema text is unchanged.
    if (null != oldSchema && oldSchema.toString().equals(newSchema.toString())) {
        return;
    }
    log.info("Schema changed. New schema is " + newSchema);
    // Rebuild the write config for the evolved schema.
    this.cfg = HoodieWriteConfig.newBuilder()
        .withPath(hudiConnectConfig.getTablePath())
        .withSchema(this.hudiConnectConfig.schema.toString())
        .withEngineType(EngineType.JAVA)
        .withParallelism(hudiConnectConfig.getInsertShuffleParallelism(),
                hudiConnectConfig.getUpsertShuffleParallelism())
        .withDeleteParallelism(hudiConnectConfig.getDeleteParallelism())
        .forTable(hudiConnectConfig.getTableName())
        .withIndexConfig(HoodieIndexConfig.newBuilder()
                .withIndexType(HoodieIndex.IndexType.INMEMORY).build())
        .withCompactionConfig(HoodieCompactionConfig.newBuilder()
                .archiveCommitsWith(20, 30).build())
        .build();
    // Close the old client and open a fresh one against the new config.
    this.hudiWriteClient.close();
    Configuration hadoopConf = new Configuration();
    hadoopConf.setBoolean(AvroReadSupport.AVRO_COMPATIBILITY, false);
    hadoopConf.set(AvroReadSupport.AVRO_DATA_SUPPLIER, GenericDataSupplier.class.getName());
    this.hudiWriteClient = new HoodieJavaWriteClient<HoodieAvroPayload>(
            new HoodieJavaEngineContext(hadoopConf), cfg);
}
Also used: Configuration (org.apache.hadoop.conf.Configuration), GenericDataSupplier (org.apache.parquet.avro.GenericDataSupplier), HoodieAvroPayload (org.apache.hudi.common.model.HoodieAvroPayload), HoodieJavaEngineContext (org.apache.hudi.client.common.HoodieJavaEngineContext)
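The core of schemaEvolution is a guard-and-rebuild pattern: compare the schema text, and only when it changed, tear down the old client and build a new one. A minimal sketch in plain Java, with no Hudi dependency — SchemaAwareWriter, onSchema, and rebuildCount are hypothetical names standing in for the real HoodieWriteConfig/HoodieJavaWriteClient rebuild:

```java
// Minimal sketch (not the Hudi API) of the guard-and-rebuild pattern
// that schemaEvolution() follows. SchemaAwareWriter is hypothetical.
public class SchemaAwareWriter {
    private String currentSchema;
    private int rebuilds = 0;

    public SchemaAwareWriter(String initialSchema) {
        this.currentSchema = initialSchema;
    }

    /** Rebuilds the underlying client only when the schema text changed. */
    public void onSchema(String newSchema) {
        if (currentSchema != null && currentSchema.equals(newSchema)) {
            return; // schema unchanged: keep the existing client
        }
        // In the real Updater this is where the old client is closed,
        // a new HoodieWriteConfig is built, and a fresh client is opened.
        currentSchema = newSchema;
        rebuilds++;
    }

    public int rebuildCount() {
        return rebuilds;
    }
}
```

Comparing schemas by their string form, as the original does, treats any textual difference as a change; it avoids the cost of a structural comparison at the price of rebuilding on cosmetic differences.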

Example 2 with HoodieAvroPayload

Use of org.apache.hudi.common.model.HoodieAvroPayload in the apache project rocketmq-externals.

The class Updater, method commit:

public void commit() {
    List<SinkDataEntry> commitList;
    if (inflightList.isEmpty()) {
        return;
    }
    // Swap the inflight buffer for a fresh list so producers can keep
    // appending while this batch is written out.
    synchronized (this.inflightList) {
        commitList = inflightList;
        inflightList = new ArrayList<>();
    }
    // Convert each entry to an Avro record wrapped in a HoodieAvroPayload.
    List<HoodieRecord> hoodieRecordsList = new ArrayList<>();
    for (SinkDataEntry record : commitList) {
        GenericRecord genericRecord = sinkDataEntry2GenericRecord(record);
        HoodieRecord<HoodieAvroPayload> hoodieRecord = new HoodieRecord<>(
                new HoodieKey(UUID.randomUUID().toString(), "shardingKey-" + record.getQueueName()),
                new HoodieAvroPayload(Option.of(genericRecord)));
        hoodieRecordsList.add(hoodieRecord);
    }
    try {
        List<WriteStatus> statuses = hudiWriteClient.upsert(hoodieRecordsList, hudiWriteClient.startCommit());
        log.info("Upserted data to hudi");
        long upserted = statuses.get(0).getStat().getNumInserts();
        if (upserted != commitList.size()) {
            log.warn("Upserted count does not equal input size");
        }
    } catch (Exception e) {
        log.error("Exception when upserting to Hudi", e);
    }
}
Also used: HoodieRecord (org.apache.hudi.common.model.HoodieRecord), ArrayList (java.util.ArrayList), IOException (java.io.IOException), SinkDataEntry (io.openmessaging.connector.api.data.SinkDataEntry), HoodieKey (org.apache.hudi.common.model.HoodieKey), GenericRecord (org.apache.avro.generic.GenericRecord), WriteStatus (org.apache.hudi.client.WriteStatus), HoodieAvroPayload (org.apache.hudi.common.model.HoodieAvroPayload)
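The opening of commit() uses a buffer-handover pattern: under a lock, the shared inflight list is swapped for a fresh one, and the drained batch is then processed without holding the lock. A self-contained sketch of that pattern in plain Java — InflightBuffer, add, and drain are hypothetical names, not part of the Updater class:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Minimal sketch of the buffer handover used by commit(): swap the shared
// inflight list for a fresh one under a lock, then let the caller process
// the drained batch lock-free. InflightBuffer is a hypothetical name.
public class InflightBuffer<T> {
    private final Object lock = new Object();
    private List<T> inflight = new ArrayList<>();

    public void add(T item) {
        synchronized (lock) {
            inflight.add(item);
        }
    }

    /** Atomically hands over the current batch and resets the buffer. */
    public List<T> drain() {
        synchronized (lock) {
            if (inflight.isEmpty()) {
                return Collections.emptyList();
            }
            List<T> batch = inflight;
            inflight = new ArrayList<>();
            return batch;
        }
    }
}
```

One deliberate difference from the original: the sketch synchronizes on a dedicated lock object rather than on the list field itself. The original locks `this.inflightList` and then reassigns that field inside the block, so a later caller would synchronize on a different object than a thread still holding the old list's monitor.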

Aggregations

HoodieAvroPayload (org.apache.hudi.common.model.HoodieAvroPayload): 2
SinkDataEntry (io.openmessaging.connector.api.data.SinkDataEntry): 1
IOException (java.io.IOException): 1
ArrayList (java.util.ArrayList): 1
GenericRecord (org.apache.avro.generic.GenericRecord): 1
Configuration (org.apache.hadoop.conf.Configuration): 1
WriteStatus (org.apache.hudi.client.WriteStatus): 1
HoodieJavaEngineContext (org.apache.hudi.client.common.HoodieJavaEngineContext): 1
HoodieKey (org.apache.hudi.common.model.HoodieKey): 1
HoodieRecord (org.apache.hudi.common.model.HoodieRecord): 1
GenericDataSupplier (org.apache.parquet.avro.GenericDataSupplier): 1