Example 61 with Period

Use of org.joda.time.Period in project druid by druid-io.

From the class AppenderatorPlumber, method startPersistThread.

private void startPersistThread() {
    final Granularity segmentGranularity = schema.getGranularitySpec().getSegmentGranularity();
    final Period windowPeriod = config.getWindowPeriod();
    final DateTime truncatedNow = segmentGranularity.bucketStart(new DateTime());
    final long windowMillis = windowPeriod.toStandardDuration().getMillis();
    log.info("Expect to run at [%s]", new DateTime().plus(new Duration(System.currentTimeMillis(), segmentGranularity.increment(truncatedNow).getMillis() + windowMillis)));
    ScheduledExecutors.scheduleAtFixedRate(scheduledExecutor, new Duration(System.currentTimeMillis(), segmentGranularity.increment(truncatedNow).getMillis() + windowMillis), new Duration(truncatedNow, segmentGranularity.increment(truncatedNow)), new ThreadRenamingCallable<ScheduledExecutors.Signal>(String.format("%s-overseer-%d", schema.getDataSource(), config.getShardSpec().getPartitionNum())) {

        @Override
        public ScheduledExecutors.Signal doCall() {
            if (stopped) {
                log.info("Stopping merge-n-push overseer thread");
                return ScheduledExecutors.Signal.STOP;
            }
            mergeAndPush();
            if (stopped) {
                log.info("Stopping merge-n-push overseer thread");
                return ScheduledExecutors.Signal.STOP;
            } else {
                return ScheduledExecutors.Signal.REPEAT;
            }
        }
    });
}
Also used: Period(org.joda.time.Period), Duration(org.joda.time.Duration), Granularity(io.druid.java.util.common.granularity.Granularity), DateTime(org.joda.time.DateTime)
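
A note on the scheduling arithmetic above: the initial delay passed to scheduleAtFixedRate is the time remaining until the end of the current segment bucket, plus the window period, and the repeat rate is one segment-granularity bucket. Below is a minimal stand-alone sketch of that arithmetic using plain Joda-Time; the hourly granularity and ten-minute window are hypothetical stand-ins for segmentGranularity and config.getWindowPeriod().

import org.joda.time.DateTime;
import org.joda.time.Duration;
import org.joda.time.Period;

public class PersistScheduleSketch {
    public static void main(String[] args) {
        // Hypothetical stand-ins: an hourly segment granularity and a ten-minute window.
        Period windowPeriod = new Period("PT10M");
        long windowMillis = windowPeriod.toStandardDuration().getMillis();

        DateTime now = new DateTime();
        DateTime bucketStart = now.hourOfDay().roundFloorCopy();  // like segmentGranularity.bucketStart(now)
        DateTime bucketEnd = bucketStart.plus(Period.hours(1));   // like segmentGranularity.increment(bucketStart)

        // First run: when the current bucket closes, plus the window period.
        Duration initialDelay = new Duration(System.currentTimeMillis(), bucketEnd.getMillis() + windowMillis);
        // Subsequent runs: once per segment-granularity bucket.
        Duration rate = new Duration(bucketStart, bucketEnd);

        System.out.println("initial delay = " + initialDelay + ", rate = " + rate);
    }
}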

Example 62 with Period

Use of org.joda.time.Period in project druid by druid-io.

From the class RealtimePlumber, method startPersistThread.

protected void startPersistThread() {
    final Granularity segmentGranularity = schema.getGranularitySpec().getSegmentGranularity();
    final Period windowPeriod = config.getWindowPeriod();
    final DateTime truncatedNow = segmentGranularity.bucketStart(new DateTime());
    final long windowMillis = windowPeriod.toStandardDuration().getMillis();
    log.info("Expect to run at [%s]", new DateTime().plus(new Duration(System.currentTimeMillis(), segmentGranularity.increment(truncatedNow).getMillis() + windowMillis)));
    ScheduledExecutors.scheduleAtFixedRate(scheduledExecutor, new Duration(System.currentTimeMillis(), segmentGranularity.increment(truncatedNow).getMillis() + windowMillis), new Duration(truncatedNow, segmentGranularity.increment(truncatedNow)), new ThreadRenamingCallable<ScheduledExecutors.Signal>(String.format("%s-overseer-%d", schema.getDataSource(), config.getShardSpec().getPartitionNum())) {

        @Override
        public ScheduledExecutors.Signal doCall() {
            if (stopped) {
                log.info("Stopping merge-n-push overseer thread");
                return ScheduledExecutors.Signal.STOP;
            }
            mergeAndPush();
            if (stopped) {
                log.info("Stopping merge-n-push overseer thread");
                return ScheduledExecutors.Signal.STOP;
            } else {
                return ScheduledExecutors.Signal.REPEAT;
            }
        }
    });
}
Also used: Period(org.joda.time.Period), Duration(org.joda.time.Duration), Granularity(io.druid.java.util.common.granularity.Granularity), DateTime(org.joda.time.DateTime)
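
The windowPeriod.toStandardDuration() call used in both plumbers is what turns the configured Period into a millisecond offset. It assumes standard lengths (24-hour days, 7-day weeks) and refuses periods containing months or years, which is why window periods are expressed in PT-style units. A small sketch of that behaviour, assuming nothing beyond Joda-Time itself:

import org.joda.time.Duration;
import org.joda.time.Period;

public class WindowPeriodSketch {
    public static void main(String[] args) {
        Duration tenMinutes = new Period("PT10M").toStandardDuration();  // 600000 ms
        Duration oneDay = new Period("P1D").toStandardDuration();        // 86400000 ms, assuming a 24-hour day
        System.out.println(tenMinutes.getMillis() + " / " + oneDay.getMillis());

        try {
            new Period("P1M").toStandardDuration();  // months have no fixed millisecond length
        } catch (UnsupportedOperationException expected) {
            System.out.println("P1M cannot be converted to a standard duration");
        }
    }
}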

Example 63 with Period

Use of org.joda.time.Period in project hive by apache.

From the class DruidQueryBasedInputFormat, method splitSelectQuery.

/* Method that splits the Select query according to the configured threshold so the
   * read can be parallelized. We only contact the Druid broker to obtain the results. */
private static HiveDruidSplit[] splitSelectQuery(Configuration conf, String address, SelectQuery query, Path dummyPath) throws IOException {
    final int selectThreshold = (int) HiveConf.getIntVar(conf, HiveConf.ConfVars.HIVE_DRUID_SELECT_THRESHOLD);
    final int numConnection = HiveConf.getIntVar(conf, HiveConf.ConfVars.HIVE_DRUID_NUM_HTTP_CONNECTION);
    final Period readTimeout = new Period(HiveConf.getVar(conf, HiveConf.ConfVars.HIVE_DRUID_HTTP_READ_TIMEOUT));
    final boolean isFetch = query.getContextBoolean(Constants.DRUID_QUERY_FETCH, false);
    if (isFetch) {
        // If it has a limit, we use it and we do not split the query
        return new HiveDruidSplit[] { new HiveDruidSplit(DruidStorageHandlerUtils.JSON_MAPPER.writeValueAsString(query), dummyPath, new String[] { address }) };
    }
    // We do not know the number of rows, so we execute a
    // Segment Metadata query to obtain it
    SegmentMetadataQueryBuilder metadataBuilder = new Druids.SegmentMetadataQueryBuilder();
    metadataBuilder.dataSource(query.getDataSource());
    metadataBuilder.intervals(query.getIntervals());
    metadataBuilder.merge(true);
    metadataBuilder.analysisTypes();
    SegmentMetadataQuery metadataQuery = metadataBuilder.build();
    Lifecycle lifecycle = new Lifecycle();
    HttpClient client = HttpClientInit.createClient(HttpClientConfig.builder().withNumConnections(numConnection).withReadTimeout(readTimeout.toStandardDuration()).build(), lifecycle);
    try {
        lifecycle.start();
    } catch (Exception e) {
        LOG.error("Lifecycle start issue");
        throw new IOException(org.apache.hadoop.util.StringUtils.stringifyException(e));
    }
    InputStream response;
    try {
        response = DruidStorageHandlerUtils.submitRequest(client, DruidStorageHandlerUtils.createRequest(address, metadataQuery));
    } catch (Exception e) {
        lifecycle.stop();
        throw new IOException(org.apache.hadoop.util.StringUtils.stringifyException(e));
    }
    // Retrieve results
    List<SegmentAnalysis> metadataList;
    try {
        metadataList = DruidStorageHandlerUtils.SMILE_MAPPER.readValue(response, new TypeReference<List<SegmentAnalysis>>() {
        });
    } catch (Exception e) {
        response.close();
        throw new IOException(org.apache.hadoop.util.StringUtils.stringifyException(e));
    } finally {
        lifecycle.stop();
    }
    if (metadataList == null) {
        throw new IOException("Connected to Druid but could not retrieve datasource information");
    }
    if (metadataList.isEmpty()) {
        // There are no rows in that time range, so we can submit the query as it is
        return new HiveDruidSplit[] { new HiveDruidSplit(DruidStorageHandlerUtils.JSON_MAPPER.writeValueAsString(query), dummyPath, new String[] { address }) };
    }
    if (metadataList.size() != 1) {
        throw new IOException("Information about segments should have been merged");
    }
    final long numRows = metadataList.get(0).getNumRows();
    query = query.withPagingSpec(PagingSpec.newSpec(Integer.MAX_VALUE));
    if (numRows <= selectThreshold) {
        // We are not going to split it
        return new HiveDruidSplit[] { new HiveDruidSplit(DruidStorageHandlerUtils.JSON_MAPPER.writeValueAsString(query), dummyPath, new String[] { address }) };
    }
    // If the query does not restrict the time range, we obtain the total range using
    // a Time Boundary query. Then we use that information to split the query
    // according to the Select threshold configuration property
    final List<Interval> intervals = new ArrayList<>();
    if (query.getIntervals().size() == 1 && query.getIntervals().get(0).withChronology(ISOChronology.getInstanceUTC()).equals(DruidTable.DEFAULT_INTERVAL)) {
        // Default max and min; execute a Time Boundary query to get a
        // more precise range
        TimeBoundaryQueryBuilder timeBuilder = new Druids.TimeBoundaryQueryBuilder();
        timeBuilder.dataSource(query.getDataSource());
        TimeBoundaryQuery timeQuery = timeBuilder.build();
        lifecycle = new Lifecycle();
        client = HttpClientInit.createClient(HttpClientConfig.builder().withNumConnections(numConnection).withReadTimeout(readTimeout.toStandardDuration()).build(), lifecycle);
        try {
            lifecycle.start();
        } catch (Exception e) {
            LOG.error("Lifecycle start issue");
            throw new IOException(org.apache.hadoop.util.StringUtils.stringifyException(e));
        }
        try {
            response = DruidStorageHandlerUtils.submitRequest(client, DruidStorageHandlerUtils.createRequest(address, timeQuery));
        } catch (Exception e) {
            lifecycle.stop();
            throw new IOException(org.apache.hadoop.util.StringUtils.stringifyException(e));
        }
        // Retrieve results
        List<Result<TimeBoundaryResultValue>> timeList;
        try {
            timeList = DruidStorageHandlerUtils.SMILE_MAPPER.readValue(response, new TypeReference<List<Result<TimeBoundaryResultValue>>>() {
            });
        } catch (Exception e) {
            response.close();
            throw new IOException(org.apache.hadoop.util.StringUtils.stringifyException(e));
        } finally {
            lifecycle.stop();
        }
        if (timeList == null || timeList.isEmpty()) {
            throw new IOException("Connected to Druid but could not retrieve time boundary information");
        }
        if (timeList.size() != 1) {
            throw new IOException("We should obtain a single time boundary");
        }
        intervals.add(new Interval(timeList.get(0).getValue().getMinTime().getMillis(), timeList.get(0).getValue().getMaxTime().getMillis(), ISOChronology.getInstanceUTC()));
    } else {
        intervals.addAll(query.getIntervals());
    }
    // Create ceil(numRows / selectThreshold) input splits
    int numSplits = (int) Math.ceil((double) numRows / selectThreshold);
    List<List<Interval>> newIntervals = createSplitsIntervals(intervals, numSplits);
    HiveDruidSplit[] splits = new HiveDruidSplit[numSplits];
    for (int i = 0; i < numSplits; i++) {
        // Create partial Select query
        final SelectQuery partialQuery = query.withQuerySegmentSpec(new MultipleIntervalSegmentSpec(newIntervals.get(i)));
        splits[i] = new HiveDruidSplit(DruidStorageHandlerUtils.JSON_MAPPER.writeValueAsString(partialQuery), dummyPath, new String[] { address });
    }
    return splits;
}
Also used: ArrayList(java.util.ArrayList), MultipleIntervalSegmentSpec(io.druid.query.spec.MultipleIntervalSegmentSpec), TimeBoundaryQuery(io.druid.query.timeboundary.TimeBoundaryQuery), Result(io.druid.query.Result), SegmentMetadataQuery(io.druid.query.metadata.metadata.SegmentMetadataQuery), SegmentMetadataQueryBuilder(io.druid.query.Druids.SegmentMetadataQueryBuilder), SegmentAnalysis(io.druid.query.metadata.metadata.SegmentAnalysis), TimeBoundaryQueryBuilder(io.druid.query.Druids.TimeBoundaryQueryBuilder), List(java.util.List), TypeReference(com.fasterxml.jackson.core.type.TypeReference), InputStream(java.io.InputStream), Lifecycle(com.metamx.common.lifecycle.Lifecycle), Period(org.joda.time.Period), IOException(java.io.IOException), JsonParseException(com.fasterxml.jackson.core.JsonParseException), JsonMappingException(com.fasterxml.jackson.databind.JsonMappingException), SelectQuery(io.druid.query.select.SelectQuery), HttpClient(com.metamx.http.client.HttpClient), Interval(org.joda.time.Interval)
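
The helper createSplitsIntervals is not shown in this example. As a rough, hypothetical illustration of the idea only (not the Hive implementation), the sketch below divides the overall time range into numSplits contiguous, equally sized slices and echoes the ceiling arithmetic used to pick the split count.

import java.util.ArrayList;
import java.util.List;
import org.joda.time.DateTime;
import org.joda.time.Interval;

public class SplitIntervalsSketch {
    // Divide the overall range covered by the intervals into numSplits equal slices.
    static List<List<Interval>> splitEvenly(List<Interval> intervals, int numSplits) {
        final long start = intervals.get(0).getStartMillis();
        final long end = intervals.get(intervals.size() - 1).getEndMillis();
        final long sliceMillis = (end - start) / numSplits;
        final List<List<Interval>> result = new ArrayList<>();
        for (int i = 0; i < numSplits; i++) {
            long sliceStart = start + i * sliceMillis;
            long sliceEnd = (i == numSplits - 1) ? end : sliceStart + sliceMillis;
            List<Interval> slice = new ArrayList<>();
            slice.add(new Interval(sliceStart, sliceEnd));
            result.add(slice);
        }
        return result;
    }

    public static void main(String[] args) {
        List<Interval> intervals = new ArrayList<>();
        intervals.add(new Interval(new DateTime("2012-01-01T00:00:00Z"), new DateTime("2012-01-02T00:00:00Z")));
        // e.g. numRows = 25000 with selectThreshold = 10000 gives ceil(2.5) = 3 splits
        int numSplits = (int) Math.ceil((double) 25000 / 10000);
        System.out.println(splitEvenly(intervals, numSplits));
    }
}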

Example 64 with Period

Use of org.joda.time.Period in project druid by druid-io.

From the class HadoopIngestionSpecTest, method testPeriodSegmentGranularitySpec.

@Test
public void testPeriodSegmentGranularitySpec() {
    final HadoopIngestionSpec schema;
    try {
        schema = jsonReadWriteRead(
            "{\n"
                + "    \"dataSchema\": {\n"
                + "     \"dataSource\": \"foo\",\n"
                + "     \"metricsSpec\": [],\n"
                + "        \"granularitySpec\": {\n"
                + "                \"type\": \"uniform\",\n"
                + "                \"segmentGranularity\": {\"type\": \"period\", \"period\":\"PT1H\", \"timeZone\":\"America/Los_Angeles\"},\n"
                + "                \"intervals\": [\"2012-01-01/P1D\"]\n"
                + "        }\n"
                + "    }\n"
                + "}",
            HadoopIngestionSpec.class
        );
    } catch (Exception e) {
        throw Throwables.propagate(e);
    }
    final UniformGranularitySpec granularitySpec = (UniformGranularitySpec) schema.getDataSchema().getGranularitySpec();
    Assert.assertEquals("getSegmentGranularity", new PeriodGranularity(new Period("PT1H"), null, DateTimeZone.forID("America/Los_Angeles")), granularitySpec.getSegmentGranularity());
}
Also used: UniformGranularitySpec(io.druid.segment.indexing.granularity.UniformGranularitySpec), PeriodGranularity(io.druid.java.util.common.granularity.PeriodGranularity), Period(org.joda.time.Period), JsonProcessingException(com.fasterxml.jackson.core.JsonProcessingException), Test(org.junit.Test)
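
The PeriodGranularity asserted above truncates timestamps to hourly buckets in the America/Los_Angeles zone, which is the same bucketStart behaviour the plumber examples rely on. A minimal sketch of using it directly (the timestamp is an arbitrary illustrative instant):

import io.druid.java.util.common.granularity.PeriodGranularity;
import org.joda.time.DateTime;
import org.joda.time.DateTimeZone;
import org.joda.time.Period;

public class PeriodGranularitySketch {
    public static void main(String[] args) {
        PeriodGranularity hourlyLosAngeles = new PeriodGranularity(
            new Period("PT1H"), null, DateTimeZone.forID("America/Los_Angeles"));
        DateTime timestamp = new DateTime("2012-01-01T10:37:05Z");  // arbitrary instant
        // Start of the hour containing the timestamp, evaluated in the Los Angeles zone.
        System.out.println(hourlyLosAngeles.bucketStart(timestamp));
    }
}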

Example 65 with Period

Use of org.joda.time.Period in project druid by druid-io.

From the class GranularityPathSpecTest, method testBackwardCompatiblePeriodSegmentGranularitySerialization.

@Test
public void testBackwardCompatiblePeriodSegmentGranularitySerialization() throws JsonProcessingException {
    final PeriodGranularity pt2S = new PeriodGranularity(new Period("PT2S"), null, DateTimeZone.UTC);
    Assert.assertNotEquals("\"SECOND\"", jsonMapper.writeValueAsString(pt2S));
    final Granularity pt1S = Granularities.SECOND;
    Assert.assertEquals("\"SECOND\"", jsonMapper.writeValueAsString(pt1S));
}
Also used: PeriodGranularity(io.druid.java.util.common.granularity.PeriodGranularity), Period(org.joda.time.Period), Granularity(io.druid.java.util.common.granularity.Granularity), Test(org.junit.Test)
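
The test above hinges on the fact that only "round" periods map onto named granularities such as SECOND; an arbitrary period like PT2S has no named equivalent, so it keeps its full period form when serialized. A small sketch of the distinction; the package of Granularities is assumed to match the Granularity classes listed above.

import io.druid.java.util.common.granularity.Granularities;  // package assumed from the classes listed above
import io.druid.java.util.common.granularity.Granularity;
import io.druid.java.util.common.granularity.PeriodGranularity;
import org.joda.time.DateTimeZone;
import org.joda.time.Period;

public class GranularityNamesSketch {
    public static void main(String[] args) {
        // Named standard granularity: the jsonMapper above serializes it compactly as "SECOND".
        Granularity named = Granularities.SECOND;
        // Arbitrary two-second period: no named equivalent, so it keeps its full period form.
        Granularity everyTwoSeconds = new PeriodGranularity(new Period("PT2S"), null, DateTimeZone.UTC);
        System.out.println(named + " vs " + everyTwoSeconds);
    }
}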

Aggregations

Period (org.joda.time.Period)273 Test (org.junit.Test)102 DateTime (org.joda.time.DateTime)54 PeriodGranularity (io.druid.java.util.common.granularity.PeriodGranularity)40 Interval (org.joda.time.Interval)30 LongSumAggregatorFactory (io.druid.query.aggregation.LongSumAggregatorFactory)29 DefaultDimensionSpec (io.druid.query.dimension.DefaultDimensionSpec)20 Row (io.druid.data.input.Row)19 File (java.io.File)15 DefaultObjectMapper (io.druid.jackson.DefaultObjectMapper)12 ObjectMapper (com.fasterxml.jackson.databind.ObjectMapper)10 Result (io.druid.query.Result)10 CountAggregatorFactory (io.druid.query.aggregation.CountAggregatorFactory)10 FinalizeResultsQueryRunner (io.druid.query.FinalizeResultsQueryRunner)8 QueryRunner (io.druid.query.QueryRunner)8 AggregatorFactory (io.druid.query.aggregation.AggregatorFactory)8 DimensionSpec (io.druid.query.dimension.DimensionSpec)8 MutablePeriod (org.joda.time.MutablePeriod)8 ExtractionDimensionSpec (io.druid.query.dimension.ExtractionDimensionSpec)7 DimFilterHavingSpec (io.druid.query.groupby.having.DimFilterHavingSpec)7