
Example 1 with BootstrapBaseFileSplit

Use of org.apache.hudi.hadoop.BootstrapBaseFileSplit in project presto by prestodb.

From the class HudiBootstrapBaseFileSplitConverter, method extractCustomSplitInfo.

@Override
public Optional<Map<String, String>> extractCustomSplitInfo(FileSplit split) {
    if (split instanceof BootstrapBaseFileSplit) {
        ImmutableMap.Builder<String, String> customSplitInfo = new ImmutableMap.Builder<>();
        BootstrapBaseFileSplit hudiSplit = (BootstrapBaseFileSplit) split;
        customSplitInfo.put(CUSTOM_FILE_SPLIT_CLASS_KEY, BootstrapBaseFileSplit.class.getName());
        customSplitInfo.put(BOOTSTRAP_FILE_SPLIT_PATH_KEY, hudiSplit.getBootstrapFileSplit().getPath().toString());
        customSplitInfo.put(BOOTSTRAP_FILE_SPLIT_START_KEY, String.valueOf(hudiSplit.getBootstrapFileSplit().getStart()));
        customSplitInfo.put(BOOTSTRAP_FILE_SPLIT_LEN_KEY, String.valueOf(hudiSplit.getBootstrapFileSplit().getLength()));
        return Optional.of(customSplitInfo.build());
    }
    return Optional.empty();
}
Also used : BootstrapBaseFileSplit(org.apache.hudi.hadoop.BootstrapBaseFileSplit) ImmutableMap(com.google.common.collect.ImmutableMap)

Example 2 with BootstrapBaseFileSplit

Use of org.apache.hudi.hadoop.BootstrapBaseFileSplit in project presto by prestodb.

From the class TestCustomSplitConversionUtils, method testHudiBootstrapBaseFileSplitConverter.

@Test
public void testHudiBootstrapBaseFileSplitConverter() throws IOException {
    Path bootstrapSourceFilePath = new Path("/test/source/test.parquet");
    long bootstrapSourceSplitStartPos = 0L;
    long bootstrapSourceSplitLength = 200L;
    FileSplit baseSplit = new FileSplit(FILE_PATH, SPLIT_START_POS, SPLIT_LENGTH, SPLIT_HOSTS);
    FileSplit bootstrapSourceSplit = new FileSplit(bootstrapSourceFilePath, bootstrapSourceSplitStartPos, bootstrapSourceSplitLength, new String[0]);
    FileSplit hudiSplit = new BootstrapBaseFileSplit(baseSplit, bootstrapSourceSplit);
    // Test conversion of HudiSplit -> customSplitInfo
    Map<String, String> customSplitInfo = CustomSplitConversionUtils.extractCustomSplitInfo(hudiSplit);
    // Test conversion of (customSplitInfo + baseSplit) -> HudiSplit
    BootstrapBaseFileSplit recreatedSplit = (BootstrapBaseFileSplit) CustomSplitConversionUtils.recreateSplitWithCustomInfo(baseSplit, customSplitInfo);
    assertEquals(FILE_PATH, recreatedSplit.getPath());
    assertEquals(SPLIT_START_POS, recreatedSplit.getStart());
    assertEquals(SPLIT_LENGTH, recreatedSplit.getLength());
    assertEquals(SPLIT_HOSTS, recreatedSplit.getLocations());
    assertEquals(bootstrapSourceFilePath, recreatedSplit.getBootstrapFileSplit().getPath());
    assertEquals(bootstrapSourceSplitStartPos, recreatedSplit.getBootstrapFileSplit().getStart());
    assertEquals(bootstrapSourceSplitLength, recreatedSplit.getBootstrapFileSplit().getLength());
}
Also used : Path(org.apache.hadoop.fs.Path) RealtimeBootstrapBaseFileSplit(org.apache.hudi.hadoop.realtime.RealtimeBootstrapBaseFileSplit) BootstrapBaseFileSplit(org.apache.hudi.hadoop.BootstrapBaseFileSplit) FileSplit(org.apache.hadoop.mapred.FileSplit) HoodieRealtimeFileSplit(org.apache.hudi.hadoop.realtime.HoodieRealtimeFileSplit) Test(org.testng.annotations.Test)

Example 3 with BootstrapBaseFileSplit

Use of org.apache.hudi.hadoop.BootstrapBaseFileSplit in project presto by prestodb.

From the class HudiBootstrapBaseFileSplitConverter, method recreateFileSplitWithCustomInfo.

@Override
public Optional<FileSplit> recreateFileSplitWithCustomInfo(FileSplit split, Map<String, String> customSplitInfo) throws IOException {
    requireNonNull(customSplitInfo);
    String customFileSplitClass = customSplitInfo.get(CUSTOM_FILE_SPLIT_CLASS_KEY);
    if (!isNullOrEmpty(customFileSplitClass) && BootstrapBaseFileSplit.class.getName().equals(customFileSplitClass)) {
        FileSplit bootstrapFileSplit = new FileSplit(
                new Path(customSplitInfo.get(BOOTSTRAP_FILE_SPLIT_PATH_KEY)),
                parseLong(customSplitInfo.get(BOOTSTRAP_FILE_SPLIT_START_KEY)),
                parseLong(customSplitInfo.get(BOOTSTRAP_FILE_SPLIT_LEN_KEY)),
                (String[]) null);
        split = new BootstrapBaseFileSplit(split, bootstrapFileSplit);
        return Optional.of(split);
    }
    return Optional.empty();
}
Also used : Path(org.apache.hadoop.fs.Path) BootstrapBaseFileSplit(org.apache.hudi.hadoop.BootstrapBaseFileSplit) FileSplit(org.apache.hadoop.mapred.FileSplit)
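
The extract and recreate methods in these examples form a round trip: the extra state of a wrapped split is flattened into a string map, and later rebuilt from that map plus the base split. The sketch below illustrates that pattern using only the JDK, so it can be run without Hadoop, Hudi, or Presto on the classpath. Everything in it is a stand-in and an assumption: the classes Split and BootstrapSplit, the methods extract and recreate, and the key strings are hypothetical and are not the real FileSplit, BootstrapBaseFileSplit, or Presto constants.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;
import java.util.Optional;

/** Stand-in sketch: none of these types are the real Hadoop, Hudi, or Presto classes. */
public class SplitRoundTripSketch {
    // Hypothetical key names; the real constant values are not shown in the examples.
    static final String CLASS_KEY = "custom_split_class";
    static final String PATH_KEY = "bootstrap_split_path";
    static final String START_KEY = "bootstrap_split_start";
    static final String LEN_KEY = "bootstrap_split_len";

    /** Minimal stand-in for a file split: a path plus a byte range. */
    static class Split {
        final String path;
        final long start;
        final long length;

        Split(String path, long start, long length) {
            this.path = Objects.requireNonNull(path);
            this.start = start;
            this.length = length;
        }
    }

    /** Stand-in for a split that also carries a second, bootstrap-source split. */
    static class BootstrapSplit extends Split {
        final Split bootstrap;

        BootstrapSplit(Split base, Split bootstrap) {
            super(base.path, base.start, base.length);
            this.bootstrap = Objects.requireNonNull(bootstrap);
        }
    }

    /** Flattens the extra state into strings; empty if the split is not a BootstrapSplit. */
    static Optional<Map<String, String>> extract(Split split) {
        if (!(split instanceof BootstrapSplit)) {
            return Optional.empty();
        }
        Split bootstrap = ((BootstrapSplit) split).bootstrap;
        Map<String, String> info = new HashMap<>();
        info.put(CLASS_KEY, BootstrapSplit.class.getName());
        info.put(PATH_KEY, bootstrap.path);
        info.put(START_KEY, String.valueOf(bootstrap.start));
        info.put(LEN_KEY, String.valueOf(bootstrap.length));
        return Optional.of(info);
    }

    /** Rebuilds the wrapper from the base split and the map; empty if the class tag differs. */
    static Optional<Split> recreate(Split base, Map<String, String> info) {
        Objects.requireNonNull(info);
        if (!BootstrapSplit.class.getName().equals(info.get(CLASS_KEY))) {
            return Optional.empty();
        }
        Split bootstrap = new Split(
                info.get(PATH_KEY),
                Long.parseLong(info.get(START_KEY)),
                Long.parseLong(info.get(LEN_KEY)));
        return Optional.of(new BootstrapSplit(base, bootstrap));
    }

    public static void main(String[] args) {
        Split base = new Split("/test/base.parquet", 0L, 100L);
        Split source = new Split("/test/source/test.parquet", 0L, 200L);
        Map<String, String> info = extract(new BootstrapSplit(base, source)).get();
        BootstrapSplit recreated = (BootstrapSplit) recreate(base, info).get();
        System.out.println(recreated.bootstrap.path + " " + recreated.bootstrap.length);
    }
}
```

Storing the class name under its own key lets each converter recognise its own maps and return an empty Optional for anything else, which is the same check the recreate method in Example 3 performs before building the split.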

Aggregations

BootstrapBaseFileSplit (org.apache.hudi.hadoop.BootstrapBaseFileSplit): 3
Path (org.apache.hadoop.fs.Path): 2
FileSplit (org.apache.hadoop.mapred.FileSplit): 2
ImmutableMap (com.google.common.collect.ImmutableMap): 1
HoodieRealtimeFileSplit (org.apache.hudi.hadoop.realtime.HoodieRealtimeFileSplit): 1
RealtimeBootstrapBaseFileSplit (org.apache.hudi.hadoop.realtime.RealtimeBootstrapBaseFileSplit): 1
Test (org.testng.annotations.Test): 1