Examples with TuplePointer - org.apache.hyracks.dataflow.std.structures.TuplePointer

Example 1 with TuplePointer

use of org.apache.hyracks.dataflow.std.structures.TuplePointer in project asterixdb by apache.

the class VPartitionTupleBufferManagerTest method testInsertOnePartitionToFull.

private Map<TuplePointer, Integer> testInsertOnePartitionToFull(int idpart) throws HyracksDataException {
    Map<TuplePointer, Integer> tuplePointerIntegerMap = new HashMap<>();
    for (int i = 0; i < inFTA.getTupleCount(); i++) {
        TuplePointer tuplePointer = new TuplePointer();
        copyDataToTupleBuilder(inFTA, i, tupleBuilder);
        if (!bufferManager.insertTuple(idpart, tupleBuilder.getByteArray(), tupleBuilder.getFieldEndOffsets(), 0, tupleBuilder.getSize(), tuplePointer)) {
            assert false;
        }
        tuplePointerIntegerMap.put(tuplePointer, IntSerDeUtils.getInt(inFTA.getBuffer().array(), inFTA.getAbsoluteFieldStartOffset(i, 0)));
    }
    return tuplePointerIntegerMap;
}

Also used : HashMap(java.util.HashMap) TuplePointer(org.apache.hyracks.dataflow.std.structures.TuplePointer)

Example 2 with TuplePointer

use of org.apache.hyracks.dataflow.std.structures.TuplePointer in project asterixdb by apache.

the class VariableTupleMemoryManagerTest method testReOrganizeVariableSizeTuple.

@Test
public void testReOrganizeVariableSizeTuple() throws HyracksDataException {
    Map<Integer, Integer> map = prepareVariableSizeTuples();
    Map<TuplePointer, Integer> mapInserted = insertInFTAToBufferCouldFailForLargerTuples(map);
    Map<Integer, Integer> copyMap = new HashMap<>(map);
    ByteBuffer buffer = deleteRandomSelectedTuples(map, mapInserted, map.size() / 2);
    inFTA.reset(buffer);
    Map<TuplePointer, Integer> mapInserted2 = insertInFTAToBufferCouldFailForLargerTuples(copyMap);
    Map<TuplePointer, Integer> mergedMap = new HashMap<>(mapInserted);
    mergedMap.putAll(mapInserted2);
    assertEachTupleInFTAIsInBuffer(copyMap, mergedMap);
}

Also used : HashMap(java.util.HashMap) TuplePointer(org.apache.hyracks.dataflow.std.structures.TuplePointer) ByteBuffer(java.nio.ByteBuffer) Test(org.junit.Test)

Example 3 with TuplePointer

use of org.apache.hyracks.dataflow.std.structures.TuplePointer in project asterixdb by apache.

the class VariableTupleMemoryManagerTest method deleteRandomSelectedTuples.

private ByteBuffer deleteRandomSelectedTuples(Map<Integer, Integer> map, Map<TuplePointer, Integer> mapInserted, int minNumOfRecordTobeDeleted) throws HyracksDataException {
    ByteBuffer buffer = ByteBuffer.allocate(Common.BUDGET);
    FixedSizeFrame frame = new FixedSizeFrame(buffer);
    FrameTupleAppender appender = new FrameTupleAppender();
    appender.reset(frame, true);
    assert (minNumOfRecordTobeDeleted < mapInserted.size());
    int countDeleted = minNumOfRecordTobeDeleted + random.nextInt(mapInserted.size() - minNumOfRecordTobeDeleted);
    ITuplePointerAccessor accessor = tupleMemoryManager.createTuplePointerAccessor();
    for (int i = 0; i < countDeleted; i++) {
        Iterator<Map.Entry<TuplePointer, Integer>> iter = mapInserted.entrySet().iterator();
        assert (iter.hasNext());
        Map.Entry<TuplePointer, Integer> pair = iter.next();
        accessor.reset(pair.getKey());
        appender.append(accessor.getBuffer().array(), accessor.getTupleStartOffset(), accessor.getTupleLength());
        map.remove(pair.getValue());
        tupleMemoryManager.deleteTuple(pair.getKey());
        iter.remove();
    }
    return buffer;
}

Also used : FixedSizeFrame(org.apache.hyracks.api.comm.FixedSizeFrame) FrameTupleAppender(org.apache.hyracks.dataflow.common.comm.io.FrameTupleAppender) TuplePointer(org.apache.hyracks.dataflow.std.structures.TuplePointer) ByteBuffer(java.nio.ByteBuffer) Map(java.util.Map) HashMap(java.util.HashMap)

Example 4 with TuplePointer

use of org.apache.hyracks.dataflow.std.structures.TuplePointer in project asterixdb by apache.

the class HashSpillableTableFactory method buildSpillableTable.

@Override
public ISpillableTable buildSpillableTable(final IHyracksTaskContext ctx, int suggestTableSize, long inputDataBytesSize, final int[] keyFields, final IBinaryComparator[] comparators, final INormalizedKeyComputer firstKeyNormalizerFactory, IAggregatorDescriptorFactory aggregateFactory, RecordDescriptor inRecordDescriptor, RecordDescriptor outRecordDescriptor, final int framesLimit, final int seed) throws HyracksDataException {
    final int tableSize = suggestTableSize;
    // For the output, we need to have at least one frame.
    if (framesLimit < MIN_FRAME_LIMT) {
        throw new HyracksDataException("The given frame limit is too small to partition the data.");
    }
    final int[] intermediateResultKeys = new int[keyFields.length];
    for (int i = 0; i < keyFields.length; i++) {
        intermediateResultKeys[i] = i;
    }
    final FrameTuplePairComparator ftpcInputCompareToAggregate = new FrameTuplePairComparator(keyFields, intermediateResultKeys, comparators);
    final ITuplePartitionComputer tpc = new FieldHashPartitionComputerFamily(keyFields, hashFunctionFamilies).createPartitioner(seed);
    // For calculating hash value for the already aggregated tuples (not incoming tuples)
    // This computer is required to calculate the hash value of a aggregated tuple
    // while doing the garbage collection work on Hash Table.
    final ITuplePartitionComputer tpcIntermediate = new FieldHashPartitionComputerFamily(intermediateResultKeys, hashFunctionFamilies).createPartitioner(seed);
    final IAggregatorDescriptor aggregator = aggregateFactory.createAggregator(ctx, inRecordDescriptor, outRecordDescriptor, keyFields, intermediateResultKeys, null);
    final AggregateState aggregateState = aggregator.createAggregateStates();
    final ArrayTupleBuilder stateTupleBuilder = new ArrayTupleBuilder(outRecordDescriptor.getFields().length);
    //TODO(jf) research on the optimized partition size
    long memoryBudget = Math.max(MIN_DATA_TABLE_FRAME_LIMT + MIN_HASH_TABLE_FRAME_LIMT, framesLimit - OUTPUT_FRAME_LIMT - MIN_HASH_TABLE_FRAME_LIMT);
    final int numPartitions = getNumOfPartitions(inputDataBytesSize / ctx.getInitialFrameSize(), memoryBudget);
    final int entriesPerPartition = (int) Math.ceil(1.0 * tableSize / numPartitions);
    if (LOGGER.isLoggable(Level.FINE)) {
        LOGGER.fine("created hashtable, table size:" + tableSize + " file size:" + inputDataBytesSize + "  #partitions:" + numPartitions);
    }
    final ArrayTupleBuilder outputTupleBuilder = new ArrayTupleBuilder(outRecordDescriptor.getFields().length);
    return new ISpillableTable() {

        private final TuplePointer pointer = new TuplePointer();

        private final BitSet spilledSet = new BitSet(numPartitions);

        // This frame pool will be shared by both data table and hash table.
        private final IDeallocatableFramePool framePool = new DeallocatableFramePool(ctx, framesLimit * ctx.getInitialFrameSize());

        // buffer manager for hash table
        private final ISimpleFrameBufferManager bufferManagerForHashTable = new FramePoolBackedFrameBufferManager(framePool);

        private final ISerializableTable hashTableForTuplePointer = new SerializableHashTable(tableSize, ctx, bufferManagerForHashTable);

        // buffer manager for data table
        final IPartitionedTupleBufferManager bufferManager = new VPartitionTupleBufferManager(PreferToSpillFullyOccupiedFramePolicy.createAtMostOneFrameForSpilledPartitionConstrain(spilledSet), numPartitions, framePool);

        final ITuplePointerAccessor bufferAccessor = bufferManager.getTuplePointerAccessor(outRecordDescriptor);

        private final PreferToSpillFullyOccupiedFramePolicy spillPolicy = new PreferToSpillFullyOccupiedFramePolicy(bufferManager, spilledSet);

        private final FrameTupleAppender outputAppender = new FrameTupleAppender(new VSizeFrame(ctx));

        @Override
        public void close() throws HyracksDataException {
            hashTableForTuplePointer.close();
            aggregator.close();
        }

        @Override
        public void clear(int partition) throws HyracksDataException {
            for (int p = getFirstEntryInHashTable(partition); p < getLastEntryInHashTable(partition); p++) {
                hashTableForTuplePointer.delete(p);
            }
            // Checks whether the garbage collection is required and conducts a garbage collection if so.
            if (hashTableForTuplePointer.isGarbageCollectionNeeded()) {
                int numberOfFramesReclaimed = hashTableForTuplePointer.collectGarbage(bufferAccessor, tpcIntermediate);
                if (LOGGER.isLoggable(Level.FINE)) {
                    LOGGER.fine("Garbage Collection on Hash table is done. Deallocated frames:" + numberOfFramesReclaimed);
                }
            }
            bufferManager.clearPartition(partition);
        }

        private int getPartition(int entryInHashTable) {
            return entryInHashTable / entriesPerPartition;
        }

        private int getFirstEntryInHashTable(int partition) {
            return partition * entriesPerPartition;
        }

        private int getLastEntryInHashTable(int partition) {
            return Math.min(tableSize, (partition + 1) * entriesPerPartition);
        }

        @Override
        public boolean insert(IFrameTupleAccessor accessor, int tIndex) throws HyracksDataException {
            int entryInHashTable = tpc.partition(accessor, tIndex, tableSize);
            for (int i = 0; i < hashTableForTuplePointer.getTupleCount(entryInHashTable); i++) {
                hashTableForTuplePointer.getTuplePointer(entryInHashTable, i, pointer);
                bufferAccessor.reset(pointer);
                int c = ftpcInputCompareToAggregate.compare(accessor, tIndex, bufferAccessor);
                if (c == 0) {
                    aggregateExistingTuple(accessor, tIndex, bufferAccessor, pointer.getTupleIndex());
                    return true;
                }
            }
            return insertNewAggregateEntry(entryInHashTable, accessor, tIndex);
        }

        /**
             * Inserts a new aggregate entry into the data table and hash table.
             * This insertion must be an atomic operation. We cannot have a partial success or failure.
             * So, if an insertion succeeds on the data table and the same insertion on the hash table fails, then
             * we need to revert the effect of data table insertion.
             */
        private boolean insertNewAggregateEntry(int entryInHashTable, IFrameTupleAccessor accessor, int tIndex) throws HyracksDataException {
            initStateTupleBuilder(accessor, tIndex);
            int pid = getPartition(entryInHashTable);
            // Insertion to the data table
            if (!bufferManager.insertTuple(pid, stateTupleBuilder.getByteArray(), stateTupleBuilder.getFieldEndOffsets(), 0, stateTupleBuilder.getSize(), pointer)) {
                return false;
            }
            // Insertion to the hash table
            if (!hashTableForTuplePointer.insert(entryInHashTable, pointer)) {
                // To preserve the atomicity of this method, we need to undo the effect
                // of the above bufferManager.insertTuple() call since the given insertion has failed.
                bufferManager.cancelInsertTuple(pid);
                return false;
            }
            return true;
        }

        private void initStateTupleBuilder(IFrameTupleAccessor accessor, int tIndex) throws HyracksDataException {
            stateTupleBuilder.reset();
            for (int k = 0; k < keyFields.length; k++) {
                stateTupleBuilder.addField(accessor, tIndex, keyFields[k]);
            }
            aggregator.init(stateTupleBuilder, accessor, tIndex, aggregateState);
        }

        private void aggregateExistingTuple(IFrameTupleAccessor accessor, int tIndex, ITuplePointerAccessor bufferAccessor, int tupleIndex) throws HyracksDataException {
            aggregator.aggregate(accessor, tIndex, bufferAccessor, tupleIndex, aggregateState);
        }

        @Override
        public int flushFrames(int partition, IFrameWriter writer, AggregateType type) throws HyracksDataException {
            int count = 0;
            for (int hashEntryPid = getFirstEntryInHashTable(partition); hashEntryPid < getLastEntryInHashTable(partition); hashEntryPid++) {
                count += hashTableForTuplePointer.getTupleCount(hashEntryPid);
                for (int tid = 0; tid < hashTableForTuplePointer.getTupleCount(hashEntryPid); tid++) {
                    hashTableForTuplePointer.getTuplePointer(hashEntryPid, tid, pointer);
                    bufferAccessor.reset(pointer);
                    outputTupleBuilder.reset();
                    for (int k = 0; k < intermediateResultKeys.length; k++) {
                        outputTupleBuilder.addField(bufferAccessor.getBuffer().array(), bufferAccessor.getAbsFieldStartOffset(intermediateResultKeys[k]), bufferAccessor.getFieldLength(intermediateResultKeys[k]));
                    }
                    boolean hasOutput = false;
                    switch(type) {
                        case PARTIAL:
                            hasOutput = aggregator.outputPartialResult(outputTupleBuilder, bufferAccessor, pointer.getTupleIndex(), aggregateState);
                            break;
                        case FINAL:
                            hasOutput = aggregator.outputFinalResult(outputTupleBuilder, bufferAccessor, pointer.getTupleIndex(), aggregateState);
                            break;
                    }
                    if (hasOutput && !outputAppender.appendSkipEmptyField(outputTupleBuilder.getFieldEndOffsets(), outputTupleBuilder.getByteArray(), 0, outputTupleBuilder.getSize())) {
                        outputAppender.write(writer, true);
                        if (!outputAppender.appendSkipEmptyField(outputTupleBuilder.getFieldEndOffsets(), outputTupleBuilder.getByteArray(), 0, outputTupleBuilder.getSize())) {
                            throw new HyracksDataException("The output item is too large to be fit into a frame.");
                        }
                    }
                }
            }
            outputAppender.write(writer, true);
            spilledSet.set(partition);
            return count;
        }

        @Override
        public int getNumPartitions() {
            return bufferManager.getNumPartitions();
        }

        @Override
        public int findVictimPartition(IFrameTupleAccessor accessor, int tIndex) throws HyracksDataException {
            int entryInHashTable = tpc.partition(accessor, tIndex, tableSize);
            int partition = getPartition(entryInHashTable);
            return spillPolicy.selectVictimPartition(partition);
        }
    };
}

Also used : IFrameWriter(org.apache.hyracks.api.comm.IFrameWriter) FramePoolBackedFrameBufferManager(org.apache.hyracks.dataflow.std.buffermanager.FramePoolBackedFrameBufferManager) VPartitionTupleBufferManager(org.apache.hyracks.dataflow.std.buffermanager.VPartitionTupleBufferManager) ITuplePointerAccessor(org.apache.hyracks.dataflow.std.buffermanager.ITuplePointerAccessor) TuplePointer(org.apache.hyracks.dataflow.std.structures.TuplePointer) ISimpleFrameBufferManager(org.apache.hyracks.dataflow.std.buffermanager.ISimpleFrameBufferManager) ITuplePartitionComputer(org.apache.hyracks.api.dataflow.value.ITuplePartitionComputer) IDeallocatableFramePool(org.apache.hyracks.dataflow.std.buffermanager.IDeallocatableFramePool) FrameTupleAppender(org.apache.hyracks.dataflow.common.comm.io.FrameTupleAppender) IFrameTupleAccessor(org.apache.hyracks.api.comm.IFrameTupleAccessor) SerializableHashTable(org.apache.hyracks.dataflow.std.structures.SerializableHashTable) FrameTuplePairComparator(org.apache.hyracks.dataflow.std.util.FrameTuplePairComparator) BitSet(java.util.BitSet) FieldHashPartitionComputerFamily(org.apache.hyracks.dataflow.common.data.partition.FieldHashPartitionComputerFamily) ArrayTupleBuilder(org.apache.hyracks.dataflow.common.comm.io.ArrayTupleBuilder) HyracksDataException(org.apache.hyracks.api.exceptions.HyracksDataException) VSizeFrame(org.apache.hyracks.api.comm.VSizeFrame) IDeallocatableFramePool(org.apache.hyracks.dataflow.std.buffermanager.IDeallocatableFramePool) DeallocatableFramePool(org.apache.hyracks.dataflow.std.buffermanager.DeallocatableFramePool) PreferToSpillFullyOccupiedFramePolicy(org.apache.hyracks.dataflow.std.buffermanager.PreferToSpillFullyOccupiedFramePolicy) ISerializableTable(org.apache.hyracks.dataflow.std.structures.ISerializableTable) IPartitionedTupleBufferManager(org.apache.hyracks.dataflow.std.buffermanager.IPartitionedTupleBufferManager)

Example 5 with TuplePointer

use of org.apache.hyracks.dataflow.std.structures.TuplePointer in project asterixdb by apache.

the class VariableTupleMemoryManagerTest method testReOrganizeSpace.

@Test
public void testReOrganizeSpace() throws HyracksDataException {
    int iTuplePerFrame = 3;
    Map<Integer, Integer> mapPrepare = prepareFixedSizeTuples(iTuplePerFrame, EXTRA_BYTES_FOR_DELETABLE_FRAME, 0);
    Map<Integer, Integer> copyMap = new HashMap<>(mapPrepare);
    Map<TuplePointer, Integer> mapInserted = insertInFTAToBufferShouldAllSuccess();
    ByteBuffer buffer = deleteRandomSelectedTuples(mapPrepare, mapInserted, mapPrepare.size() / 2);
    inFTA.reset(buffer);
    //The deletable frame buffer will keep the deleted slot untouched, which will take more space.
    // the reason is to not reuse the same TuplePointer outside.
    Map<TuplePointer, Integer> mapInserted2 = insertInFTAToBufferMayNotAllSuccess();
    assertTrue(mapInserted2.size() > 0);
}

Also used : HashMap(java.util.HashMap) TuplePointer(org.apache.hyracks.dataflow.std.structures.TuplePointer) ByteBuffer(java.nio.ByteBuffer) Test(org.junit.Test)

Aggregations

TuplePointer (org.apache.hyracks.dataflow.std.structures.TuplePointer)9 HashMap (java.util.HashMap)7 ByteBuffer (java.nio.ByteBuffer)3 IFrameTupleAccessor (org.apache.hyracks.api.comm.IFrameTupleAccessor)2 FrameTupleAppender (org.apache.hyracks.dataflow.common.comm.io.FrameTupleAppender)2 Test (org.junit.Test)2 BitSet (java.util.BitSet)1 Map (java.util.Map)1 FixedSizeFrame (org.apache.hyracks.api.comm.FixedSizeFrame)1 IFrameWriter (org.apache.hyracks.api.comm.IFrameWriter)1 VSizeFrame (org.apache.hyracks.api.comm.VSizeFrame)1 ITuplePartitionComputer (org.apache.hyracks.api.dataflow.value.ITuplePartitionComputer)1 HyracksDataException (org.apache.hyracks.api.exceptions.HyracksDataException)1 ArrayTupleBuilder (org.apache.hyracks.dataflow.common.comm.io.ArrayTupleBuilder)1 FrameTupleAccessor (org.apache.hyracks.dataflow.common.comm.io.FrameTupleAccessor)1 FieldHashPartitionComputerFamily (org.apache.hyracks.dataflow.common.data.partition.FieldHashPartitionComputerFamily)1 DeallocatableFramePool (org.apache.hyracks.dataflow.std.buffermanager.DeallocatableFramePool)1 FramePoolBackedFrameBufferManager (org.apache.hyracks.dataflow.std.buffermanager.FramePoolBackedFrameBufferManager)1 IDeallocatableFramePool (org.apache.hyracks.dataflow.std.buffermanager.IDeallocatableFramePool)1 IPartitionedTupleBufferManager (org.apache.hyracks.dataflow.std.buffermanager.IPartitionedTupleBufferManager)1