Example 1 with BinaryTruncator

Use of org.apache.parquet.internal.column.columnindex.BinaryTruncator in the parquet-mr project by Apache.

From the class ParquetMetadataConverter, method toParquetStatistics.

public static Statistics toParquetStatistics(org.apache.parquet.column.statistics.Statistics stats, int truncateLength) {
    Statistics formatStats = new Statistics();
    // Only write stats that fit within the limit: there is no way to mark that a
    // value has been truncated and is a lower bound and not in the page.
    if (!stats.isEmpty() && withinLimit(stats, truncateLength)) {
        formatStats.setNull_count(stats.getNumNulls());
        if (stats.hasNonNullValue()) {
            byte[] min;
            byte[] max;
            if (stats instanceof BinaryStatistics && truncateLength != Integer.MAX_VALUE) {
                BinaryTruncator truncator = BinaryTruncator.getTruncator(stats.type());
                min = tuncateMin(truncator, truncateLength, stats.getMinBytes());
                max = tuncateMax(truncator, truncateLength, stats.getMaxBytes());
            } else {
                min = stats.getMinBytes();
                max = stats.getMaxBytes();
            }
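            // Fill the older min/max fields only when the sort order is signed, or when
            // min equals max (ordering is trivially irrelevant for equal min-max values).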
            if (sortOrder(stats.type()) == SortOrder.SIGNED || Arrays.equals(min, max)) {
                formatStats.setMin(min);
                formatStats.setMax(max);
            }
            if (isMinMaxStatsSupported(stats.type()) || Arrays.equals(min, max)) {
                formatStats.setMin_value(min);
                formatStats.setMax_value(max);
            }
        }
    }
    return formatStats;
}
Also used : BinaryStatistics(org.apache.parquet.column.statistics.BinaryStatistics), BinaryTruncator(org.apache.parquet.internal.column.columnindex.BinaryTruncator), Statistics(org.apache.parquet.format.Statistics), CorruptStatistics(org.apache.parquet.CorruptStatistics)
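
The truncation step in the method above shortens binary min/max values while keeping them valid bounds. The following is a minimal, self-contained sketch of that idea on raw bytes; it is not the BinaryTruncator implementation. The class name TruncationSketch and its methods are hypothetical, and the sketch assumes unsigned lexicographic byte ordering and ignores character-encoding validity. The real truncator is selected per column type via getTruncator(stats.type()), so its behavior may differ.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical illustration only; not part of parquet-mr.
public class TruncationSketch {

    // A prefix never sorts after the full value, so it is a valid lower bound.
    public static byte[] truncateMin(byte[] value, int length) {
        return value.length <= length ? value : Arrays.copyOf(value, length);
    }

    // For an upper bound, take the prefix and increment its last byte that is
    // not 0xFF, dropping everything after that byte.
    public static byte[] truncateMax(byte[] value, int length) {
        if (value.length <= length) {
            return value;
        }
        byte[] prefix = Arrays.copyOf(value, length);
        for (int i = length - 1; i >= 0; i--) {
            if (prefix[i] != (byte) 0xFF) {
                prefix[i]++;
                return Arrays.copyOf(prefix, i + 1);
            }
        }
        // Every prefix byte is 0xFF: no shorter upper bound exists, keep the original.
        return value;
    }

    // Unsigned lexicographic comparison, written out so it also runs on Java 8.
    public static int compareUnsigned(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int cmp = Integer.compare(a[i] & 0xFF, b[i] & 0xFF);
            if (cmp != 0) {
                return cmp;
            }
        }
        return Integer.compare(a.length, b.length);
    }

    public static void main(String[] args) {
        byte[] value = "parquet".getBytes(StandardCharsets.UTF_8);
        byte[] min = truncateMin(value, 3);
        byte[] max = truncateMax(value, 3);
        System.out.println(new String(min, StandardCharsets.UTF_8)); // par
        System.out.println(new String(max, StandardCharsets.UTF_8)); // pas
        System.out.println(compareUnsigned(min, value) <= 0);        // true
        System.out.println(compareUnsigned(max, value) >= 0);        // true
    }
}
```

Because a truncated value is only a bound and no longer an actual value from the page, a reader cannot tell the two apart, which is the concern the comment at the top of toParquetStatistics raises.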
