Example 26 with SegmentIdWithShardSpec

use of org.apache.druid.segment.realtime.appenderator.SegmentIdWithShardSpec in project druid by druid-io.

the class CachingLocalSegmentAllocator method allocate.

@Override
public SegmentIdWithShardSpec allocate(InputRow row, String sequenceName, String previousSegmentId, boolean skipSegmentLineageCheck) {
    return sequenceNameToSegmentId.computeIfAbsent(sequenceName, k -> {
        final Pair<Interval, BucketNumberedShardSpec> pair = Preconditions.checkNotNull(
            sequenceNameToBucket.get(sequenceName),
            "Missing bucket for sequence[%s]",
            sequenceName
        );
        final Interval interval = pair.lhs;
        // Determines the partitionId if this segment allocator is used by the single-threaded task.
        // In parallel ingestion, the partitionId is determined in the supervisor task.
        // See ParallelIndexSupervisorTask.groupGenericPartitionLocationsPerPartition().
        // This code... isn't pretty, but should be simple enough to understand.
        final ShardSpec shardSpec = isParallel
            ? pair.rhs
            : pair.rhs.convert(
                intervalToNextPartitionId.computeInt(
                    interval,
                    (i, nextPartitionId) -> nextPartitionId == null ? 0 : nextPartitionId + 1
                )
            );
        final String version = versionFinder.apply(interval);
        return new SegmentIdWithShardSpec(dataSource, interval, version, shardSpec);
    });
}
Also used : BucketNumberedShardSpec(org.apache.druid.timeline.partition.BucketNumberedShardSpec) SegmentIdWithShardSpec(org.apache.druid.segment.realtime.appenderator.SegmentIdWithShardSpec) ShardSpec(org.apache.druid.timeline.partition.ShardSpec) Interval(org.joda.time.Interval)
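
The comments in the allocate method above describe the single-threaded case: the allocator caches one segment ID per sequence name and hands out partition IDs per interval, starting at 0. The following is a minimal sketch of that pattern only, not Druid code. The class CachingAllocatorSketch, its nested SegmentId type, and the use of plain strings for intervals are all hypothetical, and HashMap.merge stands in for the computeInt call on the original map.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

/** Toy model of a caching allocator (hypothetical; not part of Druid). */
final class CachingAllocatorSketch {

    /** Hypothetical simplified segment id: an interval label plus a partition number. */
    static final class SegmentId {
        final String interval;
        final int partitionNum;

        SegmentId(String interval, int partitionNum) {
            this.interval = interval;
            this.partitionNum = partitionNum;
        }

        @Override
        public String toString() {
            return interval + "_" + partitionNum;
        }
    }

    // Assumption: every sequence name is mapped to its interval before allocation starts.
    private final Map<String, String> sequenceNameToInterval;
    private final Map<String, SegmentId> sequenceNameToSegmentId = new HashMap<>();
    private final Map<String, Integer> intervalToNextPartitionId = new HashMap<>();

    CachingAllocatorSketch(Map<String, String> sequenceNameToInterval) {
        this.sequenceNameToInterval = new HashMap<>(sequenceNameToInterval);
    }

    SegmentId allocate(String sequenceName) {
        // The first call for a sequence name creates the id; later calls return the cached one.
        return sequenceNameToSegmentId.computeIfAbsent(sequenceName, k -> {
            final String interval = Objects.requireNonNull(
                sequenceNameToInterval.get(k),
                "Missing bucket for sequence[" + k + "]"
            );
            // First partition in an interval is 0; each further allocation increments by one.
            final int partitionNum =
                intervalToNextPartitionId.merge(interval, 0, (prev, ignored) -> prev + 1);
            return new SegmentId(interval, partitionNum);
        });
    }

    public static void main(String[] args) {
        final Map<String, String> buckets = new HashMap<>();
        buckets.put("seqA", "hour-1");
        buckets.put("seqB", "hour-1");
        final CachingAllocatorSketch allocator = new CachingAllocatorSketch(buckets);
        System.out.println(allocator.allocate("seqA"));
        System.out.println(allocator.allocate("seqB"));
    }
}
```

Updating the partition counter from inside the computeIfAbsent callback is safe here because the counter lives in a different map from the one being computed on.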

Example 27 with SegmentIdWithShardSpec

use of org.apache.druid.segment.realtime.appenderator.SegmentIdWithShardSpec in project druid by druid-io.

the class SegmentAllocateActionTest method testCannotAddToExistingNumberedShardSpecsWithCoarserQueryGranularity.

@Test
public void testCannotAddToExistingNumberedShardSpecsWithCoarserQueryGranularity() throws Exception {
    final Task task = NoopTask.create();
    taskActionTestKit.getMetadataStorageCoordinator().announceHistoricalSegments(
        ImmutableSet.of(
            DataSegment.builder()
                .dataSource(DATA_SOURCE)
                .interval(Granularities.HOUR.bucket(PARTY_TIME))
                .version(PARTY_TIME.toString())
                .shardSpec(new NumberedShardSpec(0, 2))
                .size(0)
                .build(),
            DataSegment.builder()
                .dataSource(DATA_SOURCE)
                .interval(Granularities.HOUR.bucket(PARTY_TIME))
                .version(PARTY_TIME.toString())
                .shardSpec(new NumberedShardSpec(1, 2))
                .size(0)
                .build()
        )
    );
    taskActionTestKit.getTaskLockbox().add(task);
    final SegmentIdWithShardSpec id1 = allocate(task, PARTY_TIME, Granularities.DAY, Granularities.DAY, "s1", null);
    Assert.assertNull(id1);
}
Also used : Task(org.apache.druid.indexing.common.task.Task) NoopTask(org.apache.druid.indexing.common.task.NoopTask) SegmentIdWithShardSpec(org.apache.druid.segment.realtime.appenderator.SegmentIdWithShardSpec) HashBasedNumberedShardSpec(org.apache.druid.timeline.partition.HashBasedNumberedShardSpec) NumberedShardSpec(org.apache.druid.timeline.partition.NumberedShardSpec) Test(org.junit.Test)

Example 28 with SegmentIdWithShardSpec

use of org.apache.druid.segment.realtime.appenderator.SegmentIdWithShardSpec in project druid by druid-io.

the class SegmentAllocateActionTest method testCannotDoAnythingWithSillyQueryGranularity.

@Test
public void testCannotDoAnythingWithSillyQueryGranularity() {
    final Task task = NoopTask.create();
    taskActionTestKit.getTaskLockbox().add(task);
    final SegmentIdWithShardSpec id1 = allocate(task, PARTY_TIME, Granularities.DAY, Granularities.HOUR, "s1", null);
    Assert.assertNull(id1);
}
Also used : Task(org.apache.druid.indexing.common.task.Task) NoopTask(org.apache.druid.indexing.common.task.NoopTask) SegmentIdWithShardSpec(org.apache.druid.segment.realtime.appenderator.SegmentIdWithShardSpec) Test(org.junit.Test)

Example 29 with SegmentIdWithShardSpec

use of org.apache.druid.segment.realtime.appenderator.SegmentIdWithShardSpec in project druid by druid-io.

the class SegmentAllocateActionTest method testMultipleSequences.

@Test
public void testMultipleSequences() {
    final Task task = NoopTask.create();
    taskActionTestKit.getTaskLockbox().add(task);
    final SegmentIdWithShardSpec id1 = allocate(task, PARTY_TIME, Granularities.NONE, Granularities.HOUR, "s1", null);
    final SegmentIdWithShardSpec id2 = allocate(task, PARTY_TIME, Granularities.NONE, Granularities.HOUR, "s2", null);
    final SegmentIdWithShardSpec id3 = allocate(task, PARTY_TIME, Granularities.NONE, Granularities.HOUR, "s1", id1.toString());
    final SegmentIdWithShardSpec id4 = allocate(task, THE_DISTANT_FUTURE, Granularities.NONE, Granularities.HOUR, "s1", id3.toString());
    final SegmentIdWithShardSpec id5 = allocate(task, THE_DISTANT_FUTURE, Granularities.NONE, Granularities.HOUR, "s2", id2.toString());
    final SegmentIdWithShardSpec id6 = allocate(task, PARTY_TIME, Granularities.NONE, Granularities.HOUR, "s1", null);
    if (lockGranularity == LockGranularity.TIME_CHUNK) {
        final TaskLock partyLock = Iterables.getOnlyElement(FluentIterable.from(taskActionTestKit.getTaskLockbox().findLocksForTask(task)).filter(new Predicate<TaskLock>() {

            @Override
            public boolean apply(TaskLock input) {
                return input.getInterval().contains(PARTY_TIME);
            }
        }));
        final TaskLock futureLock = Iterables.getOnlyElement(FluentIterable.from(taskActionTestKit.getTaskLockbox().findLocksForTask(task)).filter(new Predicate<TaskLock>() {

            @Override
            public boolean apply(TaskLock input) {
                return input.getInterval().contains(THE_DISTANT_FUTURE);
            }
        }));
        assertSameIdentifier(id1, new SegmentIdWithShardSpec(DATA_SOURCE, Granularities.HOUR.bucket(PARTY_TIME), partyLock.getVersion(), new NumberedShardSpec(0, 0)));
        assertSameIdentifier(id2, new SegmentIdWithShardSpec(DATA_SOURCE, Granularities.HOUR.bucket(PARTY_TIME), partyLock.getVersion(), new NumberedShardSpec(1, 0)));
        assertSameIdentifier(id3, new SegmentIdWithShardSpec(DATA_SOURCE, Granularities.HOUR.bucket(PARTY_TIME), partyLock.getVersion(), new NumberedShardSpec(2, 0)));
        assertSameIdentifier(id4, new SegmentIdWithShardSpec(DATA_SOURCE, Granularities.HOUR.bucket(THE_DISTANT_FUTURE), futureLock.getVersion(), new NumberedShardSpec(0, 0)));
        assertSameIdentifier(id5, new SegmentIdWithShardSpec(DATA_SOURCE, Granularities.HOUR.bucket(THE_DISTANT_FUTURE), futureLock.getVersion(), new NumberedShardSpec(1, 0)));
    } else {
        final List<TaskLock> partyLocks = taskActionTestKit.getTaskLockbox().findLocksForTask(task).stream().filter(input -> input.getInterval().contains(PARTY_TIME)).collect(Collectors.toList());
        Assert.assertEquals(3, partyLocks.size());
        assertSameIdentifier(new SegmentIdWithShardSpec(DATA_SOURCE, Granularities.HOUR.bucket(PARTY_TIME), partyLocks.get(0).getVersion(), new NumberedShardSpec(0, 0)), id1);
        assertSameIdentifier(new SegmentIdWithShardSpec(DATA_SOURCE, Granularities.HOUR.bucket(PARTY_TIME), partyLocks.get(1).getVersion(), new NumberedShardSpec(1, 0)), id2);
        assertSameIdentifier(new SegmentIdWithShardSpec(DATA_SOURCE, Granularities.HOUR.bucket(PARTY_TIME), partyLocks.get(2).getVersion(), new NumberedShardSpec(2, 0)), id3);
        final List<TaskLock> futureLocks = taskActionTestKit.getTaskLockbox().findLocksForTask(task).stream().filter(input -> input.getInterval().contains(THE_DISTANT_FUTURE)).collect(Collectors.toList());
        Assert.assertEquals(2, futureLocks.size());
        assertSameIdentifier(new SegmentIdWithShardSpec(DATA_SOURCE, Granularities.HOUR.bucket(THE_DISTANT_FUTURE), futureLocks.get(0).getVersion(), new NumberedShardSpec(0, 0)), id4);
        assertSameIdentifier(new SegmentIdWithShardSpec(DATA_SOURCE, Granularities.HOUR.bucket(THE_DISTANT_FUTURE), futureLocks.get(1).getVersion(), new NumberedShardSpec(1, 0)), id5);
    }
    assertSameIdentifier(id1, id6);
}
Also used : NumberedPartialShardSpec(org.apache.druid.timeline.partition.NumberedPartialShardSpec) Iterables(com.google.common.collect.Iterables) Granularity(org.apache.druid.java.util.common.granularity.Granularity) Intervals(org.apache.druid.java.util.common.Intervals) HashBasedNumberedShardSpec(org.apache.druid.timeline.partition.HashBasedNumberedShardSpec) RunWith(org.junit.runner.RunWith) HashMap(java.util.HashMap) ImmutableList(com.google.common.collect.ImmutableList) FluentIterable(com.google.common.collect.FluentIterable) PeriodGranularity(org.apache.druid.java.util.common.granularity.PeriodGranularity) Task(org.apache.druid.indexing.common.task.Task) Map(java.util.Map) TaskLock(org.apache.druid.indexing.common.TaskLock) ExpectedException(org.junit.rules.ExpectedException) HashBasedNumberedPartialShardSpec(org.apache.druid.timeline.partition.HashBasedNumberedPartialShardSpec) Parameterized(org.junit.runners.Parameterized) Before(org.junit.Before) DateTimes(org.apache.druid.java.util.common.DateTimes) ShardSpec(org.apache.druid.timeline.partition.ShardSpec) Period(org.joda.time.Period) ImmutableSet(com.google.common.collect.ImmutableSet) EmittingLogger(org.apache.druid.java.util.emitter.EmittingLogger) NumberedShardSpec(org.apache.druid.timeline.partition.NumberedShardSpec) ObjectMapper(com.fasterxml.jackson.databind.ObjectMapper) DateTime(org.joda.time.DateTime) SegmentIdWithShardSpec(org.apache.druid.segment.realtime.appenderator.SegmentIdWithShardSpec) Test(org.junit.Test) IOException(java.io.IOException) EasyMock(org.easymock.EasyMock) Collectors(java.util.stream.Collectors) LockGranularity(org.apache.druid.indexing.common.LockGranularity) DefaultObjectMapper(org.apache.druid.jackson.DefaultObjectMapper) Granularities(org.apache.druid.java.util.common.granularity.Granularities) NoopTask(org.apache.druid.indexing.common.task.NoopTask) List(java.util.List) Rule(org.junit.Rule) Predicate(com.google.common.base.Predicate) 
SegmentLock(org.apache.druid.indexing.common.SegmentLock) LinearShardSpec(org.apache.druid.timeline.partition.LinearShardSpec) ServiceEmitter(org.apache.druid.java.util.emitter.service.ServiceEmitter) DataSegment(org.apache.druid.timeline.DataSegment) Entry(java.util.Map.Entry) LinearPartialShardSpec(org.apache.druid.timeline.partition.LinearPartialShardSpec) PartialShardSpec(org.apache.druid.timeline.partition.PartialShardSpec) Assert(org.junit.Assert)
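
The assertions in testMultipleSequences imply two behaviours: repeating an allocation with the same sequence name and previous segment ID returns the same identifier (id1 equals id6), while any new combination gets the next partition number within its hourly bucket. The sketch below is a toy model that reproduces only those observable results. It is not how SegmentAllocateAction is implemented; the class SequenceAllocationSketch, the string-based identifiers, and the choice to include the bucket in the lookup key are assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of sequence-based allocation (hypothetical; not part of Druid). */
final class SequenceAllocationSketch {

    // Assumed lookup key: sequence name, previous segment id, and time bucket.
    private final Map<String, String> allocated = new HashMap<>();
    private final Map<String, Integer> nextPartitionPerBucket = new HashMap<>();

    /** Returns an id of the form "bucket_partition"; previousSegmentId may be null. */
    String allocate(String bucket, String sequenceName, String previousSegmentId) {
        final String key = sequenceName + "|" + previousSegmentId + "|" + bucket;
        // A repeated request returns the cached id instead of consuming a new partition.
        return allocated.computeIfAbsent(key, k -> {
            final int partition =
                nextPartitionPerBucket.merge(bucket, 0, (prev, ignored) -> prev + 1);
            return bucket + "_" + partition;
        });
    }

    public static void main(String[] args) {
        final SequenceAllocationSketch sketch = new SequenceAllocationSketch();
        final String id1 = sketch.allocate("party", "s1", null);
        final String id2 = sketch.allocate("party", "s2", null);
        final String id3 = sketch.allocate("party", "s1", id1);
        final String id6 = sketch.allocate("party", "s1", null);
        System.out.println(id1 + " " + id2 + " " + id3 + " " + id6);
    }
}
```

Replaying the six calls from the test against this model yields partitions 0, 1, 2 in the PARTY_TIME bucket and 0, 1 in the THE_DISTANT_FUTURE bucket, with the sixth call returning the first identifier again.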

Example 30 with SegmentIdWithShardSpec

use of org.apache.druid.segment.realtime.appenderator.SegmentIdWithShardSpec in project druid by druid-io.

the class SegmentAllocateActionTest method testWithPartialShardSpecAndOvershadowingSegments.

@Test
public void testWithPartialShardSpecAndOvershadowingSegments() throws IOException {
    final Task task = NoopTask.create();
    taskActionTestKit.getTaskLockbox().add(task);
    final ObjectMapper objectMapper = new DefaultObjectMapper();
    taskActionTestKit.getMetadataStorageCoordinator().announceHistoricalSegments(
        ImmutableSet.of(
            DataSegment.builder()
                .dataSource(DATA_SOURCE)
                .interval(Granularities.HOUR.bucket(PARTY_TIME))
                .version(PARTY_TIME.toString())
                .shardSpec(new HashBasedNumberedShardSpec(0, 2, 0, 2, ImmutableList.of("dim1"), null, objectMapper))
                .size(0)
                .build(),
            DataSegment.builder()
                .dataSource(DATA_SOURCE)
                .interval(Granularities.HOUR.bucket(PARTY_TIME))
                .version(PARTY_TIME.toString())
                .shardSpec(new HashBasedNumberedShardSpec(1, 2, 1, 2, ImmutableList.of("dim1"), null, objectMapper))
                .size(0)
                .build()
        )
    );
    final SegmentAllocateAction action = new SegmentAllocateAction(DATA_SOURCE, PARTY_TIME, Granularities.MINUTE, Granularities.HOUR, "seq", null, true, new HashBasedNumberedPartialShardSpec(ImmutableList.of("dim1"), 1, 2, null), lockGranularity, null);
    final SegmentIdWithShardSpec segmentIdentifier = action.perform(task, taskActionTestKit.getTaskActionToolbox());
    Assert.assertNotNull(segmentIdentifier);
    final ShardSpec shardSpec = segmentIdentifier.getShardSpec();
    Assert.assertEquals(2, shardSpec.getPartitionNum());
    Assert.assertTrue(shardSpec instanceof HashBasedNumberedShardSpec);
    final HashBasedNumberedShardSpec hashBasedNumberedShardSpec = (HashBasedNumberedShardSpec) shardSpec;
    Assert.assertEquals(2, hashBasedNumberedShardSpec.getNumCorePartitions());
    Assert.assertEquals(ImmutableList.of("dim1"), hashBasedNumberedShardSpec.getPartitionDimensions());
}
Also used : HashBasedNumberedShardSpec(org.apache.druid.timeline.partition.HashBasedNumberedShardSpec) Task(org.apache.druid.indexing.common.task.Task) NoopTask(org.apache.druid.indexing.common.task.NoopTask) DefaultObjectMapper(org.apache.druid.jackson.DefaultObjectMapper) HashBasedNumberedPartialShardSpec(org.apache.druid.timeline.partition.HashBasedNumberedPartialShardSpec) SegmentIdWithShardSpec(org.apache.druid.segment.realtime.appenderator.SegmentIdWithShardSpec) ObjectMapper(com.fasterxml.jackson.databind.ObjectMapper) NumberedPartialShardSpec(org.apache.druid.timeline.partition.NumberedPartialShardSpec) ShardSpec(org.apache.druid.timeline.partition.ShardSpec) NumberedShardSpec(org.apache.druid.timeline.partition.NumberedShardSpec) LinearShardSpec(org.apache.druid.timeline.partition.LinearShardSpec) LinearPartialShardSpec(org.apache.druid.timeline.partition.LinearPartialShardSpec) PartialShardSpec(org.apache.druid.timeline.partition.PartialShardSpec) Test(org.junit.Test)

Aggregations

SegmentIdWithShardSpec (org.apache.druid.segment.realtime.appenderator.SegmentIdWithShardSpec) 36
Test (org.junit.Test) 23
DataSegment (org.apache.druid.timeline.DataSegment) 14
Interval (org.joda.time.Interval) 14
NoopTask (org.apache.druid.indexing.common.task.NoopTask) 12
Task (org.apache.druid.indexing.common.task.Task) 12
PartialShardSpec (org.apache.druid.timeline.partition.PartialShardSpec) 11
HashBasedNumberedPartialShardSpec (org.apache.druid.timeline.partition.HashBasedNumberedPartialShardSpec) 10
NumberedPartialShardSpec (org.apache.druid.timeline.partition.NumberedPartialShardSpec) 10
HashBasedNumberedShardSpec (org.apache.druid.timeline.partition.HashBasedNumberedShardSpec) 9
LinearShardSpec (org.apache.druid.timeline.partition.LinearShardSpec) 9
NumberedShardSpec (org.apache.druid.timeline.partition.NumberedShardSpec) 8
NumberedOverwritePartialShardSpec (org.apache.druid.timeline.partition.NumberedOverwritePartialShardSpec) 7
IOException (java.io.IOException) 6
HashSet (java.util.HashSet) 6
Map (java.util.Map) 6
DateTime (org.joda.time.DateTime) 6
ObjectMapper (com.fasterxml.jackson.databind.ObjectMapper) 5
Iterables (com.google.common.collect.Iterables) 5
List (java.util.List) 5