Example 1 with IntBigArray

use of io.prestosql.array.IntBigArray in project hetu-core by openlookeng.

the class SliceDictionaryColumnWriter method finishRowGroup.

@Override
public Map<OrcColumnId, ColumnStatistics> finishRowGroup() {
    checkState(!closed);
    checkState(inRowGroup);
    inRowGroup = false;
    if (directEncoded) {
        return directColumnWriter.finishRowGroup();
    }
    ColumnStatistics statistics = statisticsBuilder.buildColumnStatistics();
    rowGroups.add(new DictionaryRowGroup(values, rowGroupValueCount, statistics));
    rowGroupValueCount = 0;
    statisticsBuilder = newStringStatisticsBuilder();
    values = new IntBigArray();
    return ImmutableMap.of(columnId, statistics);
}
Also used : ColumnStatistics(io.prestosql.orc.metadata.statistics.ColumnStatistics) IntBigArray(io.prestosql.array.IntBigArray)

Example 2 with IntBigArray

use of io.prestosql.array.IntBigArray in project hetu-core by openlookeng.

the class ValueStore method rehash.

@VisibleForTesting
void rehash() {
    ++rehashCount;
    long newBucketCountLong = bucketCount * 2L;
    if (newBucketCountLong > Integer.MAX_VALUE) {
        throw new PrestoException(GENERIC_INSUFFICIENT_RESOURCES, "Size of hash table cannot exceed " + Integer.MAX_VALUE + " entries (" + newBucketCountLong + ")");
    }
    int newBucketCount = (int) newBucketCountLong;
    int newMask = newBucketCount - 1;
    IntBigArray newBuckets = new IntBigArray(-1);
    newBuckets.ensureCapacity(newBucketCount);
    for (int i = 0; i < values.getPositionCount(); i++) {
        long valueHash = valueHashes.get(i);
        int bucketId = getBucketId(valueHash, newMask);
        int probeCount = 1;
        while (newBuckets.get(bucketId) != EMPTY_BUCKET) {
            int probe = nextProbe(probeCount);
            bucketId = nextBucketId(bucketId, newMask, probe);
            probeCount++;
        }
        // record the mapping
        newBuckets.set(bucketId, i);
    }
    buckets = newBuckets;
    // worst case is every bucket has a unique value, so pre-emptively keep this large enough to have a value for every bucket
    // TODO: could optimize the growth algorithm to resize this only when necessary; this wastes memory but guarantees that if every value has a distinct hash, we have space
    valueHashes.ensureCapacity(newBucketCount);
    bucketCount = newBucketCount;
    maxFill = calculateMaxFill(newBucketCount, MAX_FILL_RATIO);
    mask = newMask;
}
Also used : IntBigArray(io.prestosql.array.IntBigArray) PrestoException(io.prestosql.spi.PrestoException) VisibleForTesting(com.google.common.annotations.VisibleForTesting)

Example 3 with IntBigArray

use of io.prestosql.array.IntBigArray in project hetu-core by openlookeng.

the class GroupedTypedHistogram method rehash.

private void rehash() {
    long newBucketCountLong = bucketCount * 2L;
    if (newBucketCountLong > Integer.MAX_VALUE) {
        throw new PrestoException(GENERIC_INSUFFICIENT_RESOURCES, "Size of hash table cannot exceed " + Integer.MAX_VALUE + " entries (" + newBucketCountLong + ")");
    }
    int newBucketCount = computeBucketCount((int) newBucketCountLong, MAX_FILL_RATIO);
    int newMask = newBucketCount - 1;
    IntBigArray newBuckets = new IntBigArray(-1);
    newBuckets.ensureCapacity(newBucketCount);
    for (int i = 0; i < nextNodePointer; i++) {
        // find the old one
        int tmpBucketId = getBucketIdForNode(i, newMask);
        int probeCount = 1;
        int originalBucket = tmpBucketId;
        // find new one
        while (newBuckets.get(tmpBucketId) != -1) {
            int probe = nextProbe(probeCount);
            tmpBucketId = nextBucketId(originalBucket, newMask, probe);
            probeCount++;
        }
        // record the mapping
        newBuckets.set(tmpBucketId, i);
    }
    buckets = newBuckets;
    bucketCount = newBucketCount;
    maxFill = calculateMaxFill(newBucketCount, MAX_FILL_RATIO);
    mask = newMask;
    resizeNodeArrays(newBucketCount);
}
Also used : IntBigArray(io.prestosql.array.IntBigArray) PrestoException(io.prestosql.spi.PrestoException)

Example 4 with IntBigArray

use of io.prestosql.array.IntBigArray in project hetu-core by openlookeng.

the class SingleTypedHistogram method rehash.

private void rehash() {
    long newCapacityLong = hashCapacity * 2L;
    if (newCapacityLong > Integer.MAX_VALUE) {
        throw new PrestoException(GENERIC_INSUFFICIENT_RESOURCES, "Size of hash table cannot exceed 1 billion entries");
    }
    int newCapacity = (int) newCapacityLong;
    int newMask = newCapacity - 1;
    IntBigArray newHashPositions = new IntBigArray(-1);
    newHashPositions.ensureCapacity(newCapacity);
    for (int i = 0; i < values.getPositionCount(); i++) {
        // find an empty slot for the address
        int hashPosition = getBucketId(TypeUtils.hashPosition(type, values, i), newMask);
        while (newHashPositions.get(hashPosition) != -1) {
            hashPosition = (hashPosition + 1) & newMask;
        }
        // record the mapping
        newHashPositions.set(hashPosition, i);
    }
    hashCapacity = newCapacity;
    mask = newMask;
    maxFill = calculateMaxFill(newCapacity);
    hashPositions = newHashPositions;
    this.counts.ensureCapacity(maxFill);
}
Also used : IntBigArray(io.prestosql.array.IntBigArray) PrestoException(io.prestosql.spi.PrestoException)
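
The rehash in Example 4 doubles the capacity, fills a fresh position table with -1, and reinserts every value index using linear probing. The sketch below reproduces that pattern in isolation so it can be run without the project. It is a simplified illustration, not the project's code: a plain int[] stands in for IntBigArray, and the class name LinearProbeRehashSketch, the method names, and the trivial bucketFor masking are assumptions made for this example. It also assumes the old capacity is a power of two, as the mask arithmetic in the examples implies.

```java
import java.util.Arrays;

public class LinearProbeRehashSketch {
    // Marker for an unused slot, mirroring the -1 passed to the IntBigArray constructor.
    private static final int EMPTY = -1;

    // Hypothetical bucket function: keeps only the low bits selected by the mask.
    static int bucketFor(long hash, int mask) {
        return (int) (hash & mask);
    }

    // Builds a position table of twice the old capacity and reinserts
    // the indexes 0..valueCount-1 with linear probing.
    static int[] rehash(long[] valueHashes, int valueCount, int oldCapacity) {
        long newCapacityLong = oldCapacity * 2L;
        if (newCapacityLong > Integer.MAX_VALUE) {
            throw new IllegalStateException("Size of hash table cannot exceed " + Integer.MAX_VALUE + " entries");
        }
        int newCapacity = (int) newCapacityLong;
        if (valueCount >= newCapacity) {
            // Without a free slot the probe loop below would never terminate.
            throw new IllegalArgumentException("table must keep at least one empty slot");
        }
        int newMask = newCapacity - 1;

        int[] newPositions = new int[newCapacity];
        Arrays.fill(newPositions, EMPTY);

        for (int i = 0; i < valueCount; i++) {
            int position = bucketFor(valueHashes[i], newMask);
            // Linear probing: step to the next slot, wrapping around via the mask.
            while (newPositions[position] != EMPTY) {
                position = (position + 1) & newMask;
            }
            // Record the mapping from slot to value index.
            newPositions[position] = i;
        }
        return newPositions;
    }
}
```

For instance, rehashing the hashes 5, 13 and 21 from capacity 4 gives a table of 8 slots in which all three values initially map to slot 5, so they end up in slots 5, 6 and 7 in insertion order.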

Example 5 with IntBigArray

use of io.prestosql.array.IntBigArray in project hetu-core by openlookeng.

the class InMemoryHashAggregationBuilder method hashSortedGroupIds.

private IntIterator hashSortedGroupIds() {
    IntBigArray groupIds = new IntBigArray();
    groupIds.ensureCapacity(groupBy.getGroupCount());
    for (int i = 0; i < groupBy.getGroupCount(); i++) {
        groupIds.set(i, i);
    }
    groupIds.sort(0, groupBy.getGroupCount(), (leftGroupId, rightGroupId) -> Long.compare(groupBy.getRawHash(leftGroupId), groupBy.getRawHash(rightGroupId)));
    return new AbstractIntIterator() {

        private final int totalPositions = groupBy.getGroupCount();

        private int position;

        @Override
        public boolean hasNext() {
            return position < totalPositions;
        }

        @Override
        public int nextInt() {
            return groupIds.get(position++);
        }
    };
}
Also used : AbstractIntIterator(it.unimi.dsi.fastutil.ints.AbstractIntIterator) IntBigArray(io.prestosql.array.IntBigArray)
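
Example 5 fills an array with the identity mapping of group ids, sorts the ids by each group's raw hash, and hands them out through an int iterator. A minimal standalone sketch of the same idea, using only the JDK, might look as follows. The class name HashSortedIdsSketch and the rawHashes parameter are hypothetical stand-ins for the groupBy object in the example, and java.util streams replace IntBigArray.sort and fastutil's AbstractIntIterator.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.PrimitiveIterator;
import java.util.stream.IntStream;

public class HashSortedIdsSketch {
    // Returns the ids 0..rawHashes.length-1 ordered by ascending raw hash.
    static PrimitiveIterator.OfInt hashSortedIds(long[] rawHashes) {
        int[] ids = IntStream.range(0, rawHashes.length)
                .boxed()
                // Compare two ids by the hash each one refers to.
                .sorted(Comparator.comparingLong((Integer id) -> rawHashes[id]))
                .mapToInt(Integer::intValue)
                .toArray();
        return Arrays.stream(ids).iterator();
    }
}
```

Boxing the ids keeps the sketch short at the cost of extra allocation; the original sorts the primitive array in place, which is likely preferable for large group counts.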

Aggregations

IntBigArray (io.prestosql.array.IntBigArray): 7
PrestoException (io.prestosql.spi.PrestoException): 4
VisibleForTesting (com.google.common.annotations.VisibleForTesting): 1
Slice (io.airlift.slice.Slice): 1
LongBigArray (io.prestosql.array.LongBigArray): 1
BooleanStreamCheckpoint (io.prestosql.orc.checkpoint.BooleanStreamCheckpoint): 1
LongStreamCheckpoint (io.prestosql.orc.checkpoint.LongStreamCheckpoint): 1
ColumnEncoding (io.prestosql.orc.metadata.ColumnEncoding): 1
ColumnStatistics (io.prestosql.orc.metadata.statistics.ColumnStatistics): 1
Block (io.prestosql.spi.block.Block): 1
DictionaryBlock (io.prestosql.spi.block.DictionaryBlock): 1
AbstractIntIterator (it.unimi.dsi.fastutil.ints.AbstractIntIterator): 1