
Example 1 with CompactionTask

Use of org.apache.druid.indexing.common.task.CompactionTask in project druid by druid-io.

From the class PartialCompactionTest, the method testPartialCompactRangeAndDynamicPartitionedSegments compacts a mix of range-partitioned and dynamically partitioned segments and verifies the result:

@Test
public void testPartialCompactRangeAndDynamicPartitionedSegments() {
    // Ingest the same data twice: once range-partitioned on "dim1" and once
    // dynamically partitioned, then group the resulting segments by interval.
    final Map<Interval, List<DataSegment>> rangePartitionedSegments = SegmentUtils.groupSegmentsByInterval(
        runTestTask(new SingleDimensionPartitionsSpec(10, null, "dim1", false), TaskState.SUCCESS, false)
    );
    final Map<Interval, List<DataSegment>> linearlyPartitionedSegments = SegmentUtils.groupSegmentsByInterval(
        runTestTask(new DynamicPartitionsSpec(10, null), TaskState.SUCCESS, true)
    );
    // Sort each interval's segments by partition number so the half-splits below are deterministic.
    rangePartitionedSegments.values().forEach(
        segmentsInInterval -> segmentsInInterval.sort(Comparator.comparing(segment -> segment.getShardSpec().getPartitionNum()))
    );
    linearlyPartitionedSegments.values().forEach(
        segmentsInInterval -> segmentsInInterval.sort(Comparator.comparing(segment -> segment.getShardSpec().getPartitionNum()))
    );
    // Pick half of each partition list to compact together: the upper half of the
    // range-partitioned segments and the lower half of the dynamically partitioned ones.
    final List<DataSegment> segmentsToCompact = new ArrayList<>();
    for (List<DataSegment> segmentsInInterval : rangePartitionedSegments.values()) {
        segmentsToCompact.addAll(segmentsInInterval.subList(segmentsInInterval.size() / 2, segmentsInInterval.size()));
    }
    for (List<DataSegment> segmentsInInterval : linearlyPartitionedSegments.values()) {
        segmentsToCompact.addAll(segmentsInInterval.subList(0, segmentsInInterval.size() / 2));
    }
    // Compact only the selected segments, writing dynamically partitioned output.
    final CompactionTask compactionTask = newCompactionTaskBuilder()
        .inputSpec(SpecificSegmentsSpec.fromSegments(segmentsToCompact))
        .tuningConfig(newTuningConfig(new DynamicPartitionsSpec(20, null), 2, false))
        .build();
    final Map<Interval, List<DataSegment>> compactedSegments = SegmentUtils.groupSegmentsByInterval(
        runTask(compactionTask, TaskState.SUCCESS)
    );
    // Every compacted segment in an interval must belong to one atomic update group
    // covering all segments in that interval.
    for (List<DataSegment> segmentsInInterval : compactedSegments.values()) {
        final int expectedAtomicUpdateGroupSize = segmentsInInterval.size();
        for (DataSegment segment : segmentsInInterval) {
            Assert.assertEquals(expectedAtomicUpdateGroupSize, segment.getShardSpec().getAtomicUpdateGroupSize());
        }
    }
}
Also used: ArrayList (java.util.ArrayList), List (java.util.List), DynamicPartitionsSpec (org.apache.druid.indexer.partitions.DynamicPartitionsSpec), SingleDimensionPartitionsSpec (org.apache.druid.indexer.partitions.SingleDimensionPartitionsSpec), CompactionTask (org.apache.druid.indexing.common.task.CompactionTask), DataSegment (org.apache.druid.timeline.DataSegment), Interval (org.joda.time.Interval), Test (org.junit.Test)
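
Both tests build their compaction input the same way: sort each interval's segments by partition number, then take the upper half of one ingestion's lists and the lower half of the other's, so the compacted set mixes two partitioning schemes. A minimal, JDK-only sketch of that selection pattern as a reusable helper; pickHalves and its parameter names are illustrative, not Druid APIs:

import java.util.ArrayList;
import java.util.Collection;
import java.util.Comparator;
import java.util.List;
import java.util.function.ToIntFunction;

class PartitionHalves {
    // Sorts each group in place by partition number, then returns either the lower
    // or the upper half of every group, concatenated into a single list.
    static <T> List<T> pickHalves(Collection<List<T>> groups, ToIntFunction<T> partitionNum, boolean upperHalf) {
        final List<T> picked = new ArrayList<>();
        for (List<T> group : groups) {
            group.sort(Comparator.comparingInt(partitionNum));
            final int mid = group.size() / 2;
            picked.addAll(upperHalf ? group.subList(mid, group.size()) : group.subList(0, mid));
        }
        return picked;
    }
}

With such a helper, the two selection loops above would collapse to pickHalves(rangePartitionedSegments.values(), s -> s.getShardSpec().getPartitionNum(), true) and pickHalves(linearlyPartitionedSegments.values(), s -> s.getShardSpec().getPartitionNum(), false).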

Example 2 with CompactionTask

Use of org.apache.druid.indexing.common.task.CompactionTask in project druid by druid-io.

From the class PartialCompactionTest, the method testPartialCompactHashAndDynamicPartitionedSegments applies the same partial-compaction pattern to hash-partitioned and dynamically partitioned segments:

@Test
public void testPartialCompactHashAndDynamicPartitionedSegments() {
    // Ingest the same data twice: once hash-partitioned (numShards = 3) and once
    // dynamically partitioned, then group the resulting segments by interval.
    final Map<Interval, List<DataSegment>> hashPartitionedSegments = SegmentUtils.groupSegmentsByInterval(
        runTestTask(new HashedPartitionsSpec(null, 3, null), TaskState.SUCCESS, false)
    );
    final Map<Interval, List<DataSegment>> linearlyPartitionedSegments = SegmentUtils.groupSegmentsByInterval(
        runTestTask(new DynamicPartitionsSpec(10, null), TaskState.SUCCESS, true)
    );
    // Sort each interval's segments by partition number so the half-splits below are deterministic.
    hashPartitionedSegments.values().forEach(
        segmentsInInterval -> segmentsInInterval.sort(Comparator.comparing(segment -> segment.getShardSpec().getPartitionNum()))
    );
    linearlyPartitionedSegments.values().forEach(
        segmentsInInterval -> segmentsInInterval.sort(Comparator.comparing(segment -> segment.getShardSpec().getPartitionNum()))
    );
    // Pick half of each partition list to compact together: the upper half of the
    // hash-partitioned segments and the lower half of the dynamically partitioned ones.
    final List<DataSegment> segmentsToCompact = new ArrayList<>();
    for (List<DataSegment> segmentsInInterval : hashPartitionedSegments.values()) {
        segmentsToCompact.addAll(segmentsInInterval.subList(segmentsInInterval.size() / 2, segmentsInInterval.size()));
    }
    for (List<DataSegment> segmentsInInterval : linearlyPartitionedSegments.values()) {
        segmentsToCompact.addAll(segmentsInInterval.subList(0, segmentsInInterval.size() / 2));
    }
    // Compact only the selected segments, writing dynamically partitioned output.
    final CompactionTask compactionTask = newCompactionTaskBuilder()
        .inputSpec(SpecificSegmentsSpec.fromSegments(segmentsToCompact))
        .tuningConfig(newTuningConfig(new DynamicPartitionsSpec(20, null), 2, false))
        .build();
    final Map<Interval, List<DataSegment>> compactedSegments = SegmentUtils.groupSegmentsByInterval(
        runTask(compactionTask, TaskState.SUCCESS)
    );
    // Every compacted segment in an interval must belong to one atomic update group
    // covering all segments in that interval.
    for (List<DataSegment> segmentsInInterval : compactedSegments.values()) {
        final int expectedAtomicUpdateGroupSize = segmentsInInterval.size();
        for (DataSegment segment : segmentsInInterval) {
            Assert.assertEquals(expectedAtomicUpdateGroupSize, segment.getShardSpec().getAtomicUpdateGroupSize());
        }
    }
}
Also used: ArrayList (java.util.ArrayList), List (java.util.List), DynamicPartitionsSpec (org.apache.druid.indexer.partitions.DynamicPartitionsSpec), HashedPartitionsSpec (org.apache.druid.indexer.partitions.HashedPartitionsSpec), CompactionTask (org.apache.druid.indexing.common.task.CompactionTask), DataSegment (org.apache.druid.timeline.DataSegment), Interval (org.joda.time.Interval), Test (org.junit.Test)
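
Both tests end with the same invariant check: when compaction overwrites only part of a time chunk, all replacement segments in an interval should report an atomic update group spanning that whole interval, so that they become visible to queries together rather than one at a time. That final loop could be factored into a shared helper; assertSingleAtomicUpdateGroup is an illustrative name, while Map, Interval, List, DataSegment, and Assert are the classes the tests already use:

// Asserts that each interval's segments form exactly one atomic update group,
// i.e. every segment reports a group size equal to the interval's segment count.
private static void assertSingleAtomicUpdateGroup(Map<Interval, List<DataSegment>> segmentsByInterval) {
    for (List<DataSegment> segmentsInInterval : segmentsByInterval.values()) {
        final int expectedAtomicUpdateGroupSize = segmentsInInterval.size();
        for (DataSegment segment : segmentsInInterval) {
            Assert.assertEquals(expectedAtomicUpdateGroupSize, segment.getShardSpec().getAtomicUpdateGroupSize());
        }
    }
}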

Aggregations

ArrayList (java.util.ArrayList): 2 uses
List (java.util.List): 2 uses
DynamicPartitionsSpec (org.apache.druid.indexer.partitions.DynamicPartitionsSpec): 2 uses
CompactionTask (org.apache.druid.indexing.common.task.CompactionTask): 2 uses
DataSegment (org.apache.druid.timeline.DataSegment): 2 uses
Interval (org.joda.time.Interval): 2 uses
Test (org.junit.Test): 2 uses
HashedPartitionsSpec (org.apache.druid.indexer.partitions.HashedPartitionsSpec): 1 use
SingleDimensionPartitionsSpec (org.apache.druid.indexer.partitions.SingleDimensionPartitionsSpec): 1 use