Example 1 with SparkSizeBasedClusteringPlanStrategy

Use of org.apache.hudi.client.clustering.plan.strategy.SparkSizeBasedClusteringPlanStrategy in the Apache Hudi project.

From the class TestSparkClusteringPlanPartitionFilter, method testFilterPartitionNoFilter.

@Test
public void testFilterPartitionNoFilter() {
    // NONE mode: no partition filter is applied.
    HoodieWriteConfig config = hoodieWriteConfigBuilder
        .withClusteringConfig(HoodieClusteringConfig.newBuilder()
            .withClusteringPlanPartitionFilterMode(ClusteringPlanPartitionFilterMode.NONE)
            .build())
        .build();
    PartitionAwareClusteringPlanStrategy sg = new SparkSizeBasedClusteringPlanStrategy(table, context, config);
    ArrayList<String> fakeTimeBasedPartitionsPath = new ArrayList<>();
    fakeTimeBasedPartitionsPath.add("20210718");
    fakeTimeBasedPartitionsPath.add("20210716");
    fakeTimeBasedPartitionsPath.add("20210719");
    List list = sg.filterPartitionPaths(fakeTimeBasedPartitionsPath);
    // All three partitions are expected to pass through unfiltered.
    assertEquals(3, list.size());
}
Also used: SparkSizeBasedClusteringPlanStrategy (org.apache.hudi.client.clustering.plan.strategy.SparkSizeBasedClusteringPlanStrategy), ArrayList (java.util.ArrayList), HoodieWriteConfig (org.apache.hudi.config.HoodieWriteConfig), List (java.util.List), Test (org.junit.jupiter.api.Test)

Example 2 with SparkSizeBasedClusteringPlanStrategy

From the class TestSparkClusteringPlanPartitionFilter, method testFilterPartitionRecentDays.

@Test
public void testFilterPartitionRecentDays() {
    // RECENT_DAYS mode: skip 1 partition from the latest, then target 1 partition.
    HoodieWriteConfig config = hoodieWriteConfigBuilder
        .withClusteringConfig(HoodieClusteringConfig.newBuilder()
            .withClusteringSkipPartitionsFromLatest(1)
            .withClusteringTargetPartitions(1)
            .withClusteringPlanPartitionFilterMode(ClusteringPlanPartitionFilterMode.RECENT_DAYS)
            .build())
        .build();
    PartitionAwareClusteringPlanStrategy sg = new SparkSizeBasedClusteringPlanStrategy(table, context, config);
    ArrayList<String> fakeTimeBasedPartitionsPath = new ArrayList<>();
    fakeTimeBasedPartitionsPath.add("20210718");
    fakeTimeBasedPartitionsPath.add("20210716");
    fakeTimeBasedPartitionsPath.add("20210719");
    List list = sg.filterPartitionPaths(fakeTimeBasedPartitionsPath);
    // Only 20210718 is expected to remain.
    assertEquals(1, list.size());
    // assertEquals compares by value; assertSame would depend on String identity.
    assertEquals("20210718", list.get(0));
}
Also used: SparkSizeBasedClusteringPlanStrategy (org.apache.hudi.client.clustering.plan.strategy.SparkSizeBasedClusteringPlanStrategy), ArrayList (java.util.ArrayList), HoodieWriteConfig (org.apache.hudi.config.HoodieWriteConfig), List (java.util.List), Test (org.junit.jupiter.api.Test)
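
The RECENT_DAYS test above expects that, given partitions 20210718, 20210716 and 20210719 with one partition skipped from the latest and one target partition, only 20210718 survives. A minimal standalone sketch of selection logic that would produce this result follows. RecentDaysFilterSketch and its filter method are hypothetical names, not Hudi classes, and the rule itself (sort descending, skip, then limit, with a non-positive target meaning "no limit") is an assumption inferred from the test, not a statement of how Hudi implements it.

```java
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper, NOT part of Hudi: reproduces the outcome the test expects.
final class RecentDaysFilterSketch {

    // Assumed rule: order partitions newest first, drop the newest
    // `skipFromLatest`, then keep at most `targetPartitions` of the rest.
    static List<String> filter(List<String> partitions, int skipFromLatest, int targetPartitions) {
        return partitions.stream()
                .sorted(Comparator.reverseOrder())
                .skip(Math.max(skipFromLatest, 0))
                // Assumption: a non-positive target means "keep everything left".
                .limit(targetPartitions > 0 ? targetPartitions : partitions.size())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> partitions = List.of("20210718", "20210716", "20210719");
        System.out.println(filter(partitions, 1, 1)); // prints [20210718]
    }
}
```

The sketch relies on yyyyMMdd partition names sorting lexicographically in date order, which holds for the fake partition paths used in the test.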

Example 3 with SparkSizeBasedClusteringPlanStrategy

From the class TestSparkClusteringPlanPartitionFilter, method testFilterPartitionSelectedPartitions.

@Test
public void testFilterPartitionSelectedPartitions() {
    // SELECTED_PARTITIONS mode: restrict clustering to the range 20211222..20211223.
    HoodieWriteConfig config = hoodieWriteConfigBuilder
        .withClusteringConfig(HoodieClusteringConfig.newBuilder()
            .withClusteringPartitionFilterBeginPartition("20211222")
            .withClusteringPartitionFilterEndPartition("20211223")
            .withClusteringPlanPartitionFilterMode(ClusteringPlanPartitionFilterMode.SELECTED_PARTITIONS)
            .build())
        .build();
    PartitionAwareClusteringPlanStrategy sg = new SparkSizeBasedClusteringPlanStrategy(table, context, config);
    ArrayList<String> fakeTimeBasedPartitionsPath = new ArrayList<>();
    fakeTimeBasedPartitionsPath.add("20211220");
    fakeTimeBasedPartitionsPath.add("20211221");
    fakeTimeBasedPartitionsPath.add("20211222");
    fakeTimeBasedPartitionsPath.add("20211224");
    List list = sg.filterPartitionPaths(fakeTimeBasedPartitionsPath);
    // Only 20211222 falls within the configured range.
    assertEquals(1, list.size());
    // assertEquals compares by value; assertSame would depend on String identity.
    assertEquals("20211222", list.get(0));
}
Also used: SparkSizeBasedClusteringPlanStrategy (org.apache.hudi.client.clustering.plan.strategy.SparkSizeBasedClusteringPlanStrategy), ArrayList (java.util.ArrayList), HoodieWriteConfig (org.apache.hudi.config.HoodieWriteConfig), List (java.util.List), Test (org.junit.jupiter.api.Test)
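
The SELECTED_PARTITIONS test above keeps only 20211222 out of four partitions when the begin partition is 20211222 and the end partition is 20211223. A small standalone sketch of a range filter consistent with that outcome is shown here. SelectedPartitionsFilterSketch is a hypothetical name, not a Hudi class, and treating both bounds as inclusive with plain lexicographic String comparison is an assumption; the test itself only demonstrates that the begin bound is inclusive.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper, NOT part of Hudi: reproduces the outcome the test expects.
final class SelectedPartitionsFilterSketch {

    // Assumed rule: keep partitions p with begin <= p <= end,
    // compared lexicographically as plain strings.
    static List<String> filter(List<String> partitions, String begin, String end) {
        return partitions.stream()
                .filter(p -> p.compareTo(begin) >= 0 && p.compareTo(end) <= 0)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> partitions = List.of("20211220", "20211221", "20211222", "20211224");
        System.out.println(filter(partitions, "20211222", "20211223")); // prints [20211222]
    }
}
```

Unlike the RECENT_DAYS sketch, this one preserves the input order, since it only filters and never sorts.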

Aggregations

ArrayList (java.util.ArrayList): 3
List (java.util.List): 3
SparkSizeBasedClusteringPlanStrategy (org.apache.hudi.client.clustering.plan.strategy.SparkSizeBasedClusteringPlanStrategy): 3
HoodieWriteConfig (org.apache.hudi.config.HoodieWriteConfig): 3
Test (org.junit.jupiter.api.Test): 3