Example 1 with HoodieRealtimeFileSplit

Use of org.apache.hudi.hadoop.realtime.HoodieRealtimeFileSplit in project presto by prestodb.

From class TestCustomSplitConversionUtils, method testHudiRealtimeSplitConverterRoundTrip.

@Test
public void testHudiRealtimeSplitConverterRoundTrip() throws IOException {
    List<String> expectedDeltaLogPaths = Arrays.asList("test1", "test2", "test3");
    String expectedMaxCommitTime = "max_commit_time";
    FileSplit baseSplit = new FileSplit(FILE_PATH, SPLIT_START_POS, SPLIT_LENGTH, SPLIT_HOSTS);
    FileSplit hudiSplit = new HoodieRealtimeFileSplit(baseSplit, BASE_PATH, expectedDeltaLogPaths, expectedMaxCommitTime, Option.empty());
    // Test conversion of HudiSplit -> customSplitInfo
    Map<String, String> customSplitInfo = CustomSplitConversionUtils.extractCustomSplitInfo(hudiSplit);
    // Test conversion of (customSplitInfo + baseSplit) -> HudiSplit
    HoodieRealtimeFileSplit recreatedSplit = (HoodieRealtimeFileSplit) CustomSplitConversionUtils.recreateSplitWithCustomInfo(baseSplit, customSplitInfo);
    assertEquals(FILE_PATH, recreatedSplit.getPath());
    assertEquals(SPLIT_START_POS, recreatedSplit.getStart());
    assertEquals(SPLIT_LENGTH, recreatedSplit.getLength());
    assertEquals(SPLIT_HOSTS, recreatedSplit.getLocations());
    assertEquals(BASE_PATH, recreatedSplit.getBasePath());
    assertEquals(expectedDeltaLogPaths, recreatedSplit.getDeltaLogPaths());
    assertEquals(expectedMaxCommitTime, recreatedSplit.getMaxCommitTime());
}
Also used : HoodieRealtimeFileSplit(org.apache.hudi.hadoop.realtime.HoodieRealtimeFileSplit), FileSplit(org.apache.hadoop.mapred.FileSplit), RealtimeBootstrapBaseFileSplit(org.apache.hudi.hadoop.realtime.RealtimeBootstrapBaseFileSplit), BootstrapBaseFileSplit(org.apache.hudi.hadoop.BootstrapBaseFileSplit), Test(org.testng.annotations.Test)

Example 2 with HoodieRealtimeFileSplit

Use of org.apache.hudi.hadoop.realtime.HoodieRealtimeFileSplit in project presto by prestodb.

From class HudiRealtimeSplitConverter, method extractCustomSplitInfo.

@Override
public Optional<Map<String, String>> extractCustomSplitInfo(FileSplit split) {
    if (split instanceof HoodieRealtimeFileSplit) {
        HoodieRealtimeFileSplit hudiSplit = (HoodieRealtimeFileSplit) split;
        // Serialize the Hudi-specific fields into a flat String map; delta log paths are comma-joined.
        Map<String, String> customSplitInfo = ImmutableMap.<String, String>builder()
                .put(CUSTOM_FILE_SPLIT_CLASS_KEY, HoodieRealtimeFileSplit.class.getName())
                .put(HUDI_DELTA_FILEPATHS_KEY, String.join(",", hudiSplit.getDeltaLogPaths()))
                .put(HUDI_BASEPATH_KEY, hudiSplit.getBasePath())
                .put(HUDI_MAX_COMMIT_TIME_KEY, hudiSplit.getMaxCommitTime())
                .build();
        return Optional.of(customSplitInfo);
    }
    return Optional.empty();
}
Also used : HoodieRealtimeFileSplit(org.apache.hudi.hadoop.realtime.HoodieRealtimeFileSplit)

Example 3 with HoodieRealtimeFileSplit

Use of org.apache.hudi.hadoop.realtime.HoodieRealtimeFileSplit in project presto by prestodb.

From class HudiRealtimeSplitConverter, method recreateFileSplitWithCustomInfo.

@Override
public Optional<FileSplit> recreateFileSplitWithCustomInfo(FileSplit split, Map<String, String> customSplitInfo) throws IOException {
    String customSplitClass = customSplitInfo.get(CUSTOM_FILE_SPLIT_CLASS_KEY);
    if (HoodieRealtimeFileSplit.class.getName().equals(customSplitClass)) {
        requireNonNull(customSplitInfo.get(HUDI_DELTA_FILEPATHS_KEY), "HUDI_DELTA_FILEPATHS_KEY is missing");
        List<String> deltaLogPaths = Arrays.asList(customSplitInfo.get(HUDI_DELTA_FILEPATHS_KEY).split(","));
        // Rebuild the Hudi split from the base split plus the fields stored in the map.
        return Optional.of(new HoodieRealtimeFileSplit(
                split,
                requireNonNull(customSplitInfo.get(HUDI_BASEPATH_KEY), "HUDI_BASEPATH_KEY is missing"),
                deltaLogPaths,
                requireNonNull(customSplitInfo.get(HUDI_MAX_COMMIT_TIME_KEY), "HUDI_MAX_COMMIT_TIME_KEY is missing"),
                Option.empty()));
    }
    return Optional.empty();
}
Also used : HoodieRealtimeFileSplit(org.apache.hudi.hadoop.realtime.HoodieRealtimeFileSplit)
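
The two converter methods above pass the delta log paths through a flat String map by joining them with "," and splitting them again. The following is a minimal, dependency-free sketch of that string round trip only. It does not use Hudi or Presto classes, and the class name, method names, and key constants are hypothetical stand-ins chosen for illustration, not the values used in Presto.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

public class SplitInfoRoundTripSketch {
    // Hypothetical key names; the real constants live in the Presto converter.
    public static final String DELTA_FILEPATHS_KEY = "delta_filepaths";
    public static final String BASEPATH_KEY = "basepath";
    public static final String MAX_COMMIT_TIME_KEY = "max_commit_time";

    // Flatten the split fields into a String map, comma-joining the paths.
    public static Map<String, String> encode(String basePath, List<String> deltaLogPaths, String maxCommitTime) {
        Map<String, String> info = new LinkedHashMap<>();
        info.put(DELTA_FILEPATHS_KEY, String.join(",", deltaLogPaths));
        info.put(BASEPATH_KEY, basePath);
        info.put(MAX_COMMIT_TIME_KEY, maxCommitTime);
        return info;
    }

    // Recover the path list by splitting on the same delimiter.
    public static List<String> decodeDeltaLogPaths(Map<String, String> info) {
        String joined = Objects.requireNonNull(info.get(DELTA_FILEPATHS_KEY), "delta file paths are missing");
        return Arrays.asList(joined.split(","));
    }

    public static void main(String[] args) {
        List<String> paths = Arrays.asList("test1", "test2", "test3");
        Map<String, String> info = encode("/base", paths, "max_commit_time");
        // A list of comma-free paths survives the round trip.
        System.out.println(decodeDeltaLogPaths(info).equals(paths)); // prints "true"

        // An empty list is joined to "", and "".split(",") yields one empty string.
        Map<String, String> emptyInfo = encode("/base", Collections.<String>emptyList(), "t");
        System.out.println(decodeDeltaLogPaths(emptyInfo).size()); // prints "1"
    }
}
```

In this sketch the encoding is only lossless when no path contains a comma and the list is non-empty. Whether those cases can occur for real Hudi splits is not established by the examples above.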

Aggregations

HoodieRealtimeFileSplit (org.apache.hudi.hadoop.realtime.HoodieRealtimeFileSplit): 3
FileSplit (org.apache.hadoop.mapred.FileSplit): 1
BootstrapBaseFileSplit (org.apache.hudi.hadoop.BootstrapBaseFileSplit): 1
RealtimeBootstrapBaseFileSplit (org.apache.hudi.hadoop.realtime.RealtimeBootstrapBaseFileSplit): 1
Test (org.testng.annotations.Test): 1