
Example 1 with DateHistogram

Use of org.talend.dataprep.api.dataset.statistics.date.DateHistogram in the project data-prep by Talend.

Class StatisticsAdapter, method injectNumberSummary.

/**
 * Injects numerical statistics (mean, variance, min and max) into the statistics of the specified column metadata.
 *
 * For columns of type date, the min and max values are retrieved from the date histogram as UTC epoch milliseconds.
 *
 * @param column the specified column metadata
 * @param result the analyzer result
 */
private void injectNumberSummary(final ColumnMetadata column, final Analyzers.Result result) {
    if (result.exist(SummaryStatistics.class)) {
        final Statistics statistics = column.getStatistics();
        final SummaryStatistics summaryStatistics = result.get(SummaryStatistics.class);
        statistics.setMean(summaryStatistics.getMean());
        statistics.setVariance(summaryStatistics.getVariance());
        // if the column is of type Date
        if (DATE.isAssignableFrom(column.getType()) && result.exist(StreamDateHistogramStatistics.class)) {
            final DateHistogram histogram = (DateHistogram) result.get(StreamDateHistogramStatistics.class).getHistogram();
            statistics.setMax(histogram.getMaxUTCEpochMilliseconds());
            statistics.setMin(histogram.getMinUTCEpochMilliseconds());
        } else {
            statistics.setMax(summaryStatistics.getMax());
            statistics.setMin(summaryStatistics.getMin());
        }
    }
}
Also used:
    StreamDateHistogramStatistics (org.talend.dataprep.api.dataset.statistics.date.StreamDateHistogramStatistics)
    DateHistogram (org.talend.dataprep.api.dataset.statistics.date.DateHistogram)
    SummaryStatistics (org.talend.dataquality.statistics.numeric.summary.SummaryStatistics)
    CardinalityStatistics (org.talend.dataquality.statistics.cardinality.CardinalityStatistics)
    DataTypeFrequencyStatistics (org.talend.dataquality.statistics.frequency.DataTypeFrequencyStatistics)
    StreamNumberHistogramStatistics (org.talend.dataprep.api.dataset.statistics.number.StreamNumberHistogramStatistics)
    ValueQualityStatistics (org.talend.dataquality.common.inference.ValueQualityStatistics)
    TextLengthStatistics (org.talend.dataquality.statistics.text.TextLengthStatistics)
    PatternFrequencyStatistics (org.talend.dataquality.statistics.frequency.pattern.PatternFrequencyStatistics)
    QuantileStatistics (org.talend.dataquality.statistics.numeric.quantile.QuantileStatistics)
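
The branching in injectNumberSummary can be tried in isolation with the sketch below. It is only an illustration under stated assumptions: DateAwareMinMax, Range, toUtcEpochMillis and resolve are hypothetical names, not part of data-prep or data-quality, and a plain list of dates stands in for the DateHistogram. It shows the same decision as the method above: for a date column, min and max come from the dates as UTC epoch milliseconds; otherwise the numeric summary values are used.

```java
import java.time.LocalDate;
import java.time.ZoneOffset;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the min/max selection; not the data-prep API.
final class DateAwareMinMax {

    /** Simple holder for a resolved min/max pair. */
    static final class Range {
        final double min;
        final double max;

        Range(double min, double max) {
            this.min = min;
            this.max = max;
        }
    }

    /** Converts a date to UTC epoch milliseconds at the start of that day. */
    static long toUtcEpochMillis(LocalDate date) {
        return date.atStartOfDay(ZoneOffset.UTC).toInstant().toEpochMilli();
    }

    /**
     * Picks min/max from the dates when the column is a date column and dates
     * are available; otherwise falls back to the numeric summary values.
     */
    static Range resolve(boolean isDateColumn, List<LocalDate> dates,
                         double numericMin, double numericMax) {
        if (isDateColumn && !dates.isEmpty()) {
            long min = Long.MAX_VALUE;
            long max = Long.MIN_VALUE;
            for (LocalDate date : dates) {
                long millis = toUtcEpochMillis(date);
                min = Math.min(min, millis);
                max = Math.max(max, millis);
            }
            return new Range(min, max);
        }
        return new Range(numericMin, numericMax);
    }

    public static void main(String[] args) {
        List<LocalDate> dates = Arrays.asList(
                LocalDate.of(2016, 3, 1), LocalDate.of(2015, 7, 14));
        Range dateRange = resolve(true, dates, 0.0, 0.0);
        System.out.println(dateRange.min + " .. " + dateRange.max);

        Range numericRange = resolve(false, dates, 1.5, 42.0);
        System.out.println(numericRange.min + " .. " + numericRange.max);
    }
}
```

Returning the date bounds as doubles keeps a single Range type for both branches, at the cost of mixing epoch milliseconds and plain numbers in the same fields; a caller has to know the column type to interpret them.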

Aggregations

DateHistogram (org.talend.dataprep.api.dataset.statistics.date.DateHistogram): 1
StreamDateHistogramStatistics (org.talend.dataprep.api.dataset.statistics.date.StreamDateHistogramStatistics): 1
StreamNumberHistogramStatistics (org.talend.dataprep.api.dataset.statistics.number.StreamNumberHistogramStatistics): 1
ValueQualityStatistics (org.talend.dataquality.common.inference.ValueQualityStatistics): 1
CardinalityStatistics (org.talend.dataquality.statistics.cardinality.CardinalityStatistics): 1
DataTypeFrequencyStatistics (org.talend.dataquality.statistics.frequency.DataTypeFrequencyStatistics): 1
PatternFrequencyStatistics (org.talend.dataquality.statistics.frequency.pattern.PatternFrequencyStatistics): 1
QuantileStatistics (org.talend.dataquality.statistics.numeric.quantile.QuantileStatistics): 1
SummaryStatistics (org.talend.dataquality.statistics.numeric.summary.SummaryStatistics): 1
TextLengthStatistics (org.talend.dataquality.statistics.text.TextLengthStatistics): 1