Search in sources :

Example 6 with SegmentDescriptor

use of org.apache.druid.query.SegmentDescriptor in project druid by druid-io.

the class StreamAppenderatorDriver method registerHandoff.

/**
 * Register the segments in the given {@link SegmentsAndCommitMetadata} to be handed off and execute a background task which
 * waits until the hand off completes.
 *
 * @param segmentsAndCommitMetadata the result segments and metadata of
 *                            {@link #publish(TransactionalSegmentPublisher, Committer, Collection)}
 *
 * @return null if the input segmentsAndMetadata is null. Otherwise, a {@link ListenableFuture} for the submitted task
 * which returns {@link SegmentsAndCommitMetadata} containing the segments successfully handed off and the metadata
 * of the caller of {@link AppenderatorDriverMetadata}
 */
public ListenableFuture<SegmentsAndCommitMetadata> registerHandoff(SegmentsAndCommitMetadata segmentsAndCommitMetadata) {
    if (segmentsAndCommitMetadata == null) {
        return Futures.immediateFuture(null);
    } else {
        final List<SegmentIdWithShardSpec> waitingSegmentIdList = segmentsAndCommitMetadata.getSegments().stream().map(SegmentIdWithShardSpec::fromDataSegment).collect(Collectors.toList());
        final Object metadata = Preconditions.checkNotNull(segmentsAndCommitMetadata.getCommitMetadata(), "commitMetadata");
        if (waitingSegmentIdList.isEmpty()) {
            return Futures.immediateFuture(new SegmentsAndCommitMetadata(segmentsAndCommitMetadata.getSegments(), ((AppenderatorDriverMetadata) metadata).getCallerMetadata()));
        }
        log.debug("Register handoff of segments: [%s]", waitingSegmentIdList);
        final SettableFuture<SegmentsAndCommitMetadata> resultFuture = SettableFuture.create();
        final AtomicInteger numRemainingHandoffSegments = new AtomicInteger(waitingSegmentIdList.size());
        for (final SegmentIdWithShardSpec segmentIdentifier : waitingSegmentIdList) {
            handoffNotifier.registerSegmentHandoffCallback(new SegmentDescriptor(segmentIdentifier.getInterval(), segmentIdentifier.getVersion(), segmentIdentifier.getShardSpec().getPartitionNum()), Execs.directExecutor(), () -> {
                log.debug("Segment[%s] successfully handed off, dropping.", segmentIdentifier);
                metrics.incrementHandOffCount();
                final ListenableFuture<?> dropFuture = appenderator.drop(segmentIdentifier);
                Futures.addCallback(dropFuture, new FutureCallback<Object>() {

                    @Override
                    public void onSuccess(Object result) {
                        if (numRemainingHandoffSegments.decrementAndGet() == 0) {
                            List<DataSegment> segments = segmentsAndCommitMetadata.getSegments();
                            log.debug("Successfully handed off [%d] segments.", segments.size());
                            resultFuture.set(new SegmentsAndCommitMetadata(segments, ((AppenderatorDriverMetadata) metadata).getCallerMetadata()));
                        }
                    }

                    @Override
                    public void onFailure(Throwable e) {
                        log.warn(e, "Failed to drop segment[%s]?!", segmentIdentifier);
                        numRemainingHandoffSegments.decrementAndGet();
                        resultFuture.setException(e);
                    }
                });
            });
        }
        return resultFuture;
    }
}
Also used : AtomicInteger(java.util.concurrent.atomic.AtomicInteger) SegmentDescriptor(org.apache.druid.query.SegmentDescriptor) ArrayList(java.util.ArrayList) List(java.util.List)

Example 7 with SegmentDescriptor

use of org.apache.druid.query.SegmentDescriptor in project druid by druid-io.

the class DumpSegment method runMetadata.

private void runMetadata(final Injector injector, final QueryableIndex index) throws IOException {
    final ObjectMapper objectMapper = injector.getInstance(Key.get(ObjectMapper.class, Json.class)).copy().configure(JsonGenerator.Feature.AUTO_CLOSE_TARGET, false);
    final SegmentMetadataQuery query = new SegmentMetadataQuery(new TableDataSource("dataSource"), new SpecificSegmentSpec(new SegmentDescriptor(index.getDataInterval(), "0", 0)), new ListColumnIncluderator(getColumnsToInclude(index)), false, null, EnumSet.allOf(SegmentMetadataQuery.AnalysisType.class), false, false);
    withOutputStream(new Function<OutputStream, Object>() {

        @Override
        public Object apply(final OutputStream out) {
            evaluateSequenceForSideEffects(Sequences.map(executeQuery(injector, index, query), new Function<SegmentAnalysis, Object>() {

                @Override
                public Object apply(SegmentAnalysis analysis) {
                    try {
                        objectMapper.writeValue(out, analysis);
                    } catch (IOException e) {
                        throw new RuntimeException(e);
                    }
                    return null;
                }
            }));
            return null;
        }
    });
}
Also used : ListColumnIncluderator(org.apache.druid.query.metadata.metadata.ListColumnIncluderator) OutputStream(java.io.OutputStream) FileOutputStream(java.io.FileOutputStream) Json(org.apache.druid.guice.annotations.Json) IOException(java.io.IOException) TableDataSource(org.apache.druid.query.TableDataSource) SpecificSegmentSpec(org.apache.druid.query.spec.SpecificSegmentSpec) SegmentMetadataQuery(org.apache.druid.query.metadata.metadata.SegmentMetadataQuery) SegmentDescriptor(org.apache.druid.query.SegmentDescriptor) SegmentAnalysis(org.apache.druid.query.metadata.metadata.SegmentAnalysis) ObjectMapper(com.fasterxml.jackson.databind.ObjectMapper)

Example 8 with SegmentDescriptor

use of org.apache.druid.query.SegmentDescriptor in project druid by druid-io.

the class CachingClusteredClientTest method testSingleDimensionPruning.

@Test
public void testSingleDimensionPruning() {
    DimFilter filter = new AndDimFilter(new OrDimFilter(new SelectorDimFilter("dim1", "a", null), new BoundDimFilter("dim1", "from", "to", false, false, false, null, StringComparators.LEXICOGRAPHIC)), new AndDimFilter(new InDimFilter("dim2", Arrays.asList("a", "c", "e", "g"), null), new BoundDimFilter("dim2", "aaa", "hi", false, false, false, null, StringComparators.LEXICOGRAPHIC), new BoundDimFilter("dim2", "e", "zzz", true, true, false, null, StringComparators.LEXICOGRAPHIC)));
    final Druids.TimeseriesQueryBuilder builder = Druids.newTimeseriesQueryBuilder().dataSource(DATA_SOURCE).filters(filter).granularity(GRANULARITY).intervals(SEG_SPEC).context(CONTEXT).intervals("2011-01-05/2011-01-10").aggregators(RENAMED_AGGS).postAggregators(RENAMED_POST_AGGS);
    TimeseriesQuery query = builder.randomQueryId().build();
    final Interval interval1 = Intervals.of("2011-01-06/2011-01-07");
    final Interval interval2 = Intervals.of("2011-01-07/2011-01-08");
    final Interval interval3 = Intervals.of("2011-01-08/2011-01-09");
    QueryRunner runner = new FinalizeResultsQueryRunner(getDefaultQueryRunner(), new TimeseriesQueryQueryToolChest());
    final DruidServer lastServer = servers[random.nextInt(servers.length)];
    ServerSelector selector1 = makeMockSingleDimensionSelector(lastServer, "dim1", null, "b", 0);
    ServerSelector selector2 = makeMockSingleDimensionSelector(lastServer, "dim1", "e", "f", 1);
    ServerSelector selector3 = makeMockSingleDimensionSelector(lastServer, "dim1", "hi", "zzz", 2);
    ServerSelector selector4 = makeMockSingleDimensionSelector(lastServer, "dim2", "a", "e", 0);
    ServerSelector selector5 = makeMockSingleDimensionSelector(lastServer, "dim2", null, null, 1);
    ServerSelector selector6 = makeMockSingleDimensionSelector(lastServer, "other", "b", null, 0);
    timeline.add(interval1, "v", new NumberedPartitionChunk<>(0, 3, selector1));
    timeline.add(interval1, "v", new NumberedPartitionChunk<>(1, 3, selector2));
    timeline.add(interval1, "v", new NumberedPartitionChunk<>(2, 3, selector3));
    timeline.add(interval2, "v", new NumberedPartitionChunk<>(0, 2, selector4));
    timeline.add(interval2, "v", new NumberedPartitionChunk<>(1, 2, selector5));
    timeline.add(interval3, "v", new NumberedPartitionChunk<>(0, 1, selector6));
    final Capture<QueryPlus> capture = Capture.newInstance();
    final Capture<ResponseContext> contextCap = Capture.newInstance();
    QueryRunner mockRunner = EasyMock.createNiceMock(QueryRunner.class);
    EasyMock.expect(mockRunner.run(EasyMock.capture(capture), EasyMock.capture(contextCap))).andReturn(Sequences.empty()).anyTimes();
    EasyMock.expect(serverView.getQueryRunner(lastServer)).andReturn(mockRunner).anyTimes();
    EasyMock.replay(serverView);
    EasyMock.replay(mockRunner);
    List<SegmentDescriptor> descriptors = new ArrayList<>();
    descriptors.add(new SegmentDescriptor(interval1, "v", 0));
    descriptors.add(new SegmentDescriptor(interval1, "v", 2));
    descriptors.add(new SegmentDescriptor(interval2, "v", 1));
    descriptors.add(new SegmentDescriptor(interval3, "v", 0));
    MultipleSpecificSegmentSpec expected = new MultipleSpecificSegmentSpec(descriptors);
    runner.run(QueryPlus.wrap(query)).toList();
    Assert.assertEquals(expected, ((TimeseriesQuery) capture.getValue().getQuery()).getQuerySegmentSpec());
}
Also used : MultipleSpecificSegmentSpec(org.apache.druid.query.spec.MultipleSpecificSegmentSpec) BoundDimFilter(org.apache.druid.query.filter.BoundDimFilter) AndDimFilter(org.apache.druid.query.filter.AndDimFilter) TimeseriesQuery(org.apache.druid.query.timeseries.TimeseriesQuery) ArrayList(java.util.ArrayList) QueryableDruidServer(org.apache.druid.client.selector.QueryableDruidServer) TimeseriesQueryQueryToolChest(org.apache.druid.query.timeseries.TimeseriesQueryQueryToolChest) FinalizeResultsQueryRunner(org.apache.druid.query.FinalizeResultsQueryRunner) QueryRunner(org.apache.druid.query.QueryRunner) ServerSelector(org.apache.druid.client.selector.ServerSelector) FinalizeResultsQueryRunner(org.apache.druid.query.FinalizeResultsQueryRunner) SelectorDimFilter(org.apache.druid.query.filter.SelectorDimFilter) SegmentDescriptor(org.apache.druid.query.SegmentDescriptor) Druids(org.apache.druid.query.Druids) ResponseContext(org.apache.druid.query.context.ResponseContext) OrDimFilter(org.apache.druid.query.filter.OrDimFilter) InDimFilter(org.apache.druid.query.filter.InDimFilter) AndDimFilter(org.apache.druid.query.filter.AndDimFilter) DimFilter(org.apache.druid.query.filter.DimFilter) InDimFilter(org.apache.druid.query.filter.InDimFilter) SelectorDimFilter(org.apache.druid.query.filter.SelectorDimFilter) BoundDimFilter(org.apache.druid.query.filter.BoundDimFilter) OrDimFilter(org.apache.druid.query.filter.OrDimFilter) Interval(org.joda.time.Interval) QueryPlus(org.apache.druid.query.QueryPlus) Test(org.junit.Test)

Example 9 with SegmentDescriptor

use of org.apache.druid.query.SegmentDescriptor in project druid by druid-io.

the class CachingClusteredClientTest method toFilteredQueryableTimeseriesResults.

private Sequence<Result<TimeseriesResultValue>> toFilteredQueryableTimeseriesResults(TimeseriesQuery query, List<SegmentId> segmentIds, List<Interval> queryIntervals, List<Iterable<Result<TimeseriesResultValue>>> results) {
    MultipleSpecificSegmentSpec spec = (MultipleSpecificSegmentSpec) query.getQuerySegmentSpec();
    List<Result<TimeseriesResultValue>> ret = new ArrayList<>();
    for (SegmentDescriptor descriptor : spec.getDescriptors()) {
        SegmentId id = SegmentId.dummy(StringUtils.format("%s_%s", queryIntervals.indexOf(descriptor.getInterval()), descriptor.getPartitionNumber()));
        int index = segmentIds.indexOf(id);
        if (index != -1) {
            Result result = new Result(results.get(index).iterator().next().getTimestamp(), new BySegmentResultValueClass(Lists.newArrayList(results.get(index)), id.toString(), descriptor.getInterval()));
            ret.add(result);
        } else {
            throw new ISE("Descriptor %s not found in server", id);
        }
    }
    return Sequences.simple(ret);
}
Also used : MultipleSpecificSegmentSpec(org.apache.druid.query.spec.MultipleSpecificSegmentSpec) SegmentId(org.apache.druid.timeline.SegmentId) SegmentDescriptor(org.apache.druid.query.SegmentDescriptor) ArrayList(java.util.ArrayList) BySegmentResultValueClass(org.apache.druid.query.BySegmentResultValueClass) ISE(org.apache.druid.java.util.common.ISE) Result(org.apache.druid.query.Result)

Example 10 with SegmentDescriptor

use of org.apache.druid.query.SegmentDescriptor in project druid by druid-io.

the class CachingClusteredClientPerfTest method testGetQueryRunnerForSegments_singleIntervalLargeSegments.

@Test(timeout = 10_000)
public void testGetQueryRunnerForSegments_singleIntervalLargeSegments() {
    final int segmentCount = 30_000;
    final Interval interval = Intervals.of("2021-02-13/2021-02-14");
    final List<SegmentDescriptor> segmentDescriptors = new ArrayList<>(segmentCount);
    final List<DataSegment> dataSegments = new ArrayList<>(segmentCount);
    final VersionedIntervalTimeline<String, ServerSelector> timeline = new VersionedIntervalTimeline<>(Ordering.natural());
    final DruidServer server = new DruidServer("server", "localhost:9000", null, Long.MAX_VALUE, ServerType.HISTORICAL, DruidServer.DEFAULT_TIER, DruidServer.DEFAULT_PRIORITY);
    for (int ii = 0; ii < segmentCount; ii++) {
        segmentDescriptors.add(new SegmentDescriptor(interval, "1", ii));
        DataSegment segment = makeDataSegment("test", interval, "1", ii);
        dataSegments.add(segment);
    }
    timeline.addAll(Iterators.transform(dataSegments.iterator(), segment -> {
        ServerSelector ss = new ServerSelector(segment, new HighestPriorityTierSelectorStrategy(new RandomServerSelectorStrategy()));
        ss.addServerAndUpdateSegment(new QueryableDruidServer(server, new MockQueryRunner()), segment);
        return new VersionedIntervalTimeline.PartitionChunkEntry<>(segment.getInterval(), segment.getVersion(), segment.getShardSpec().createChunk(ss));
    }));
    TimelineServerView serverView = Mockito.mock(TimelineServerView.class);
    QueryScheduler queryScheduler = Mockito.mock(QueryScheduler.class);
    // mock scheduler to return same sequence as argument
    Mockito.when(queryScheduler.run(any(), any())).thenAnswer(i -> i.getArgument(1));
    Mockito.when(queryScheduler.prioritizeAndLaneQuery(any(), any())).thenAnswer(i -> ((QueryPlus) i.getArgument(0)).getQuery());
    Mockito.doReturn(Optional.of(timeline)).when(serverView).getTimeline(any());
    Mockito.doReturn(new MockQueryRunner()).when(serverView).getQueryRunner(any());
    CachingClusteredClient cachingClusteredClient = new CachingClusteredClient(new MockQueryToolChestWareHouse(), serverView, MapCache.create(1024), TestHelper.makeJsonMapper(), Mockito.mock(CachePopulator.class), new CacheConfig(), Mockito.mock(DruidHttpClientConfig.class), Mockito.mock(DruidProcessingConfig.class), ForkJoinPool.commonPool(), queryScheduler, NoopJoinableFactory.INSTANCE, new NoopServiceEmitter());
    Query<SegmentDescriptor> fakeQuery = makeFakeQuery(interval);
    QueryRunner<SegmentDescriptor> queryRunner = cachingClusteredClient.getQueryRunnerForSegments(fakeQuery, segmentDescriptors);
    Sequence<SegmentDescriptor> sequence = queryRunner.run(QueryPlus.wrap(fakeQuery));
    Assert.assertEquals(segmentDescriptors, sequence.toList());
}
Also used : QueryPlus(org.apache.druid.query.QueryPlus) Map(java.util.Map) ServerType(org.apache.druid.server.coordination.ServerType) QueryRunner(org.apache.druid.query.QueryRunner) QueryToolChestWarehouse(org.apache.druid.query.QueryToolChestWarehouse) NoopJoinableFactory(org.apache.druid.segment.join.NoopJoinableFactory) Sequence(org.apache.druid.java.util.common.guava.Sequence) QueryScheduler(org.apache.druid.server.QueryScheduler) ImmutableMap(com.google.common.collect.ImmutableMap) DataSource(org.apache.druid.query.DataSource) CacheConfig(org.apache.druid.client.cache.CacheConfig) DruidProcessingConfig(org.apache.druid.query.DruidProcessingConfig) QuerySegmentSpec(org.apache.druid.query.spec.QuerySegmentSpec) DruidHttpClientConfig(org.apache.druid.guice.http.DruidHttpClientConfig) List(java.util.List) DimFilter(org.apache.druid.query.filter.DimFilter) LinearShardSpec(org.apache.druid.timeline.partition.LinearShardSpec) DataSegment(org.apache.druid.timeline.DataSegment) ServerManagerTest(org.apache.druid.server.coordination.ServerManagerTest) Optional(java.util.Optional) QueryableDruidServer(org.apache.druid.client.selector.QueryableDruidServer) MapCache(org.apache.druid.client.cache.MapCache) ArgumentMatchers.any(org.mockito.ArgumentMatchers.any) HighestPriorityTierSelectorStrategy(org.apache.druid.client.selector.HighestPriorityTierSelectorStrategy) Intervals(org.apache.druid.java.util.common.Intervals) TestSequence(org.apache.druid.java.util.common.guava.TestSequence) BaseQuery(org.apache.druid.query.BaseQuery) Iterators(com.google.common.collect.Iterators) ArrayList(java.util.ArrayList) MultipleSpecificSegmentSpec(org.apache.druid.query.spec.MultipleSpecificSegmentSpec) ServerSelector(org.apache.druid.client.selector.ServerSelector) Interval(org.joda.time.Interval) Query(org.apache.druid.query.Query) MultipleIntervalSegmentSpec(org.apache.druid.query.spec.MultipleIntervalSegmentSpec) CachePopulator(org.apache.druid.client.cache.CachePopulator) NoopServiceEmitter(org.apache.druid.server.metrics.NoopServiceEmitter) VersionedIntervalTimeline(org.apache.druid.timeline.VersionedIntervalTimeline) ResponseContext(org.apache.druid.query.context.ResponseContext) QueryToolChest(org.apache.druid.query.QueryToolChest) Test(org.junit.Test) TableDataSource(org.apache.druid.query.TableDataSource) Mockito(org.mockito.Mockito) TestHelper(org.apache.druid.segment.TestHelper) RandomServerSelectorStrategy(org.apache.druid.client.selector.RandomServerSelectorStrategy) Ordering(com.google.common.collect.Ordering) ForkJoinPool(java.util.concurrent.ForkJoinPool) SegmentDescriptor(org.apache.druid.query.SegmentDescriptor) Assert(org.junit.Assert) Collections(java.util.Collections) ArrayList(java.util.ArrayList) DataSegment(org.apache.druid.timeline.DataSegment) QueryableDruidServer(org.apache.druid.client.selector.QueryableDruidServer) DruidHttpClientConfig(org.apache.druid.guice.http.DruidHttpClientConfig) ServerSelector(org.apache.druid.client.selector.ServerSelector) SegmentDescriptor(org.apache.druid.query.SegmentDescriptor) HighestPriorityTierSelectorStrategy(org.apache.druid.client.selector.HighestPriorityTierSelectorStrategy) CacheConfig(org.apache.druid.client.cache.CacheConfig) QueryScheduler(org.apache.druid.server.QueryScheduler) QueryableDruidServer(org.apache.druid.client.selector.QueryableDruidServer) NoopServiceEmitter(org.apache.druid.server.metrics.NoopServiceEmitter) CachePopulator(org.apache.druid.client.cache.CachePopulator) VersionedIntervalTimeline(org.apache.druid.timeline.VersionedIntervalTimeline) DruidProcessingConfig(org.apache.druid.query.DruidProcessingConfig) RandomServerSelectorStrategy(org.apache.druid.client.selector.RandomServerSelectorStrategy) Interval(org.joda.time.Interval) ServerManagerTest(org.apache.druid.server.coordination.ServerManagerTest) Test(org.junit.Test)

Aggregations

SegmentDescriptor (org.apache.druid.query.SegmentDescriptor)71 Test (org.junit.Test)47 Interval (org.joda.time.Interval)26 TaskStatus (org.apache.druid.indexer.TaskStatus)21 DataSegment (org.apache.druid.timeline.DataSegment)20 Executor (java.util.concurrent.Executor)19 ArrayList (java.util.ArrayList)17 Result (org.apache.druid.query.Result)16 InitializedNullHandlingTest (org.apache.druid.testing.InitializedNullHandlingTest)16 List (java.util.List)15 Query (org.apache.druid.query.Query)14 QueryRunner (org.apache.druid.query.QueryRunner)14 ObjectMapper (com.fasterxml.jackson.databind.ObjectMapper)13 Map (java.util.Map)13 TimeseriesQuery (org.apache.druid.query.timeseries.TimeseriesQuery)13 ImmutableMap (com.google.common.collect.ImmutableMap)12 ListenableFuture (com.google.common.util.concurrent.ListenableFuture)11 QueryPlus (org.apache.druid.query.QueryPlus)11 ResponseContext (org.apache.druid.query.context.ResponseContext)11 ImmutableList (com.google.common.collect.ImmutableList)10