
Example 6 with BinaryRowDataSerializer

Use of org.apache.flink.table.runtime.typeutils.BinaryRowDataSerializer in project flink by apache.

The class SortMergeJoinIteratorTest, method fullOuter.

public void fullOuter(Tuple2<MutableObjectIterator<BinaryRowData>, MutableObjectIterator<BinaryRowData>> data, List<Tuple2<BinaryRowData, BinaryRowData>> compare) throws Exception {
    MutableObjectIterator<BinaryRowData> input1 = data.f0;
    MutableObjectIterator<BinaryRowData> input2 = data.f1;
    try (SortMergeFullOuterJoinIterator iterator = new SortMergeFullOuterJoinIterator(
            new BinaryRowDataSerializer(1),
            new BinaryRowDataSerializer(1),
            new MyProjection(),
            new MyProjection(),
            new IntRecordComparator(),
            input1,
            input2,
            new ResettableExternalBuffer(ioManager, new LazyMemorySegmentPool(this, memManager, BUFFER_MEMORY), serializer, false),
            new ResettableExternalBuffer(ioManager, new LazyMemorySegmentPool(this, memManager, BUFFER_MEMORY), serializer, false),
            new boolean[] { true })) {
        int id = 0;
        while (iterator.nextOuterJoin()) {
            BinaryRowData matchKey = iterator.getMatchKey();
            ResettableExternalBuffer buffer1 = iterator.getBuffer1();
            ResettableExternalBuffer buffer2 = iterator.getBuffer2();
            if (matchKey == null && buffer1.size() > 0) {
                // left outer join.
                ResettableExternalBuffer.BufferIterator iter = buffer1.newIterator();
                while (iter.advanceNext()) {
                    RowData row = iter.getRow();
                    Tuple2<BinaryRowData, BinaryRowData> expected = compare.get(id++);
                    assertEquals(expected, new Tuple2<>(row, null));
                }
            } else if (matchKey == null && buffer2.size() > 0) {
                // right outer join.
                ResettableExternalBuffer.BufferIterator iter = buffer2.newIterator();
                while (iter.advanceNext()) {
                    RowData row = iter.getRow();
                    Tuple2<BinaryRowData, BinaryRowData> expected = compare.get(id++);
                    assertEquals(expected, new Tuple2<>(null, row));
                }
            } else if (matchKey != null) {
                // match join.
                ResettableExternalBuffer.BufferIterator iter1 = buffer1.newIterator();
                while (iter1.advanceNext()) {
                    RowData row1 = iter1.getRow();
                    ResettableExternalBuffer.BufferIterator iter2 = buffer2.newIterator();
                    while (iter2.advanceNext()) {
                        RowData row2 = iter2.getRow();
                        Tuple2<BinaryRowData, BinaryRowData> expected = compare.get(id++);
                        assertEquals(expected, new Tuple2<>(row1, row2));
                    }
                }
            } else {
                // Should be unreachable: no match key and both buffers are empty.
                throw new RuntimeException("There is a bug.");
            }
        }
        assertEquals(compare.size(), id);
    }
}
Also used : ResettableExternalBuffer(org.apache.flink.table.runtime.util.ResettableExternalBuffer) IntRecordComparator(org.apache.flink.table.runtime.operators.sort.IntRecordComparator) LazyMemorySegmentPool(org.apache.flink.table.runtime.util.LazyMemorySegmentPool) RowData(org.apache.flink.table.data.RowData) BinaryRowData(org.apache.flink.table.data.binary.BinaryRowData) MyProjection(org.apache.flink.table.runtime.operators.join.Int2HashJoinOperatorTest.MyProjection) Tuple2(org.apache.flink.api.java.tuple.Tuple2) BinaryRowDataSerializer(org.apache.flink.table.runtime.typeutils.BinaryRowDataSerializer)
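
The test above walks three cases: rows present only on the left, rows present only on the right, and the cross product of rows that share a key. A minimal sketch of that full outer merge-join logic, using plain sorted int arrays in place of Flink's iterators and buffers, might look like the following. The class FullOuterMergeJoinSketch and its method fullOuterJoin are hypothetical names used for illustration only; this is not Flink's SortMergeFullOuterJoinIterator.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration; not part of Flink.
class FullOuterMergeJoinSketch {

    /**
     * Joins two ascending-sorted key arrays. Each result is {left, right},
     * with null standing in for the side that has no matching key.
     */
    static List<Integer[]> fullOuterJoin(int[] left, int[] right) {
        List<Integer[]> out = new ArrayList<>();
        int i = 0;
        int j = 0;
        while (i < left.length || j < right.length) {
            if (j >= right.length || (i < left.length && left[i] < right[j])) {
                // Left key has no partner on the right: emit (left, null).
                out.add(new Integer[] { left[i++], null });
            } else if (i >= left.length || right[j] < left[i]) {
                // Right key has no partner on the left: emit (null, right).
                out.add(new Integer[] { null, right[j++] });
            } else {
                // Equal keys: find the run of this key on each side,
                // then emit the cross product of the two runs.
                int key = left[i];
                int iEnd = i;
                int jEnd = j;
                while (iEnd < left.length && left[iEnd] == key) {
                    iEnd++;
                }
                while (jEnd < right.length && right[jEnd] == key) {
                    jEnd++;
                }
                for (int a = i; a < iEnd; a++) {
                    for (int b = j; b < jEnd; b++) {
                        out.add(new Integer[] { left[a], right[b] });
                    }
                }
                i = iEnd;
                j = jEnd;
            }
        }
        return out;
    }
}
```

For example, joining {1, 2, 2} with {2, 3} yields four pairs: one left-only row for key 1, two matched rows for key 2, and one right-only row for key 3.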

Example 7 with BinaryRowDataSerializer

Use of org.apache.flink.table.runtime.typeutils.BinaryRowDataSerializer in project flink by apache.

The class SortOperator, method open.

@Override
public void open() throws Exception {
    super.open();
    LOG.info("Opening SortOperator");
    ClassLoader cl = getContainingTask().getUserCodeClassLoader();
    AbstractRowDataSerializer inputSerializer = (AbstractRowDataSerializer) getOperatorConfig().getTypeSerializerIn1(getUserCodeClassloader());
    this.binarySerializer = new BinaryRowDataSerializer(inputSerializer.getArity());
    NormalizedKeyComputer computer = gComputer.newInstance(cl);
    RecordComparator comparator = gComparator.newInstance(cl);
    gComputer = null;
    gComparator = null;
    MemoryManager memManager = getContainingTask().getEnvironment().getMemoryManager();
    this.sorter = new BinaryExternalSorter(
            this.getContainingTask(),
            memManager,
            computeMemorySize(),
            this.getContainingTask().getEnvironment().getIOManager(),
            inputSerializer,
            binarySerializer,
            computer,
            comparator,
            getContainingTask().getJobConfiguration());
    this.sorter.startThreads();
    collector = new StreamRecordCollector<>(output);
    // register the metrics.
    getMetricGroup().gauge("memoryUsedSizeInBytes", (Gauge<Long>) sorter::getUsedMemoryInBytes);
    getMetricGroup().gauge("numSpillFiles", (Gauge<Long>) sorter::getNumSpillFiles);
    getMetricGroup().gauge("spillInBytes", (Gauge<Long>) sorter::getSpillInBytes);
}
Also used : AbstractRowDataSerializer(org.apache.flink.table.runtime.typeutils.AbstractRowDataSerializer) NormalizedKeyComputer(org.apache.flink.table.runtime.generated.NormalizedKeyComputer) GeneratedNormalizedKeyComputer(org.apache.flink.table.runtime.generated.GeneratedNormalizedKeyComputer) MemoryManager(org.apache.flink.runtime.memory.MemoryManager) BinaryRowDataSerializer(org.apache.flink.table.runtime.typeutils.BinaryRowDataSerializer) RecordComparator(org.apache.flink.table.runtime.generated.RecordComparator) GeneratedRecordComparator(org.apache.flink.table.runtime.generated.GeneratedRecordComparator)

Example 8 with BinaryRowDataSerializer

Use of org.apache.flink.table.runtime.typeutils.BinaryRowDataSerializer in project flink by apache.

The class BinaryHashTableTest, method setup.

@Before
public void setup() {
    TypeInformation[] types = new TypeInformation[] { Types.INT, Types.INT };
    this.buildSideSerializer = new BinaryRowDataSerializer(types.length);
    this.probeSideSerializer = new BinaryRowDataSerializer(types.length);
    this.ioManager = new IOManagerAsync();
    conf = new Configuration();
    conf.setBoolean(ExecutionConfigOptions.TABLE_EXEC_SPILL_COMPRESSION_ENABLED, useCompress);
}
Also used : IOManagerAsync(org.apache.flink.runtime.io.disk.iomanager.IOManagerAsync) Configuration(org.apache.flink.configuration.Configuration) TypeInformation(org.apache.flink.api.common.typeinfo.TypeInformation) BinaryRowDataSerializer(org.apache.flink.table.runtime.typeutils.BinaryRowDataSerializer) Before(org.junit.Before)

Example 9 with BinaryRowDataSerializer

Use of org.apache.flink.table.runtime.typeutils.BinaryRowDataSerializer in project flink by apache.

The class LongHashTableTest, method init.

@Before
public void init() {
    TypeInformation[] types = new TypeInformation[] { Types.INT, Types.INT };
    this.buildSideSerializer = new BinaryRowDataSerializer(types.length);
    this.probeSideSerializer = new BinaryRowDataSerializer(types.length);
    this.ioManager = new IOManagerAsync();
    conf = new Configuration();
    conf.setBoolean(ExecutionConfigOptions.TABLE_EXEC_SPILL_COMPRESSION_ENABLED, useCompress);
}
Also used : IOManagerAsync(org.apache.flink.runtime.io.disk.iomanager.IOManagerAsync) Configuration(org.apache.flink.configuration.Configuration) TypeInformation(org.apache.flink.api.common.typeinfo.TypeInformation) BinaryRowDataSerializer(org.apache.flink.table.runtime.typeutils.BinaryRowDataSerializer) Before(org.junit.Before)

Example 10 with BinaryRowDataSerializer

Use of org.apache.flink.table.runtime.typeutils.BinaryRowDataSerializer in project flink by apache.

The class BinaryRowDataTest, method testHashAndCopy.

@Test
public void testHashAndCopy() throws IOException {
    MemorySegment[] segments = new MemorySegment[3];
    for (int i = 0; i < 3; i++) {
        segments[i] = MemorySegmentFactory.wrap(new byte[64]);
    }
    RandomAccessOutputView out = new RandomAccessOutputView(segments, 64);
    BinaryRowDataSerializer serializer = new BinaryRowDataSerializer(2);
    BinaryRowData row = new BinaryRowData(2);
    BinaryRowWriter writer = new BinaryRowWriter(row);
    writer.writeString(0, fromString("hahahahahahahahahahahahahahahahahahahhahahahahahahahahah"));
    writer.writeString(1, fromString("hahahahahahahahahahahahahahahahahahahhahahahahahahahahaa"));
    writer.complete();
    serializer.serializeToPages(row, out);
    ArrayList<MemorySegment> segmentList = new ArrayList<>(Arrays.asList(segments));
    RandomAccessInputView input = new RandomAccessInputView(segmentList, 64, 64);
    BinaryRowData mapRow = serializer.createInstance();
    mapRow = serializer.mapFromPages(mapRow, input);
    assertEquals(row, mapRow);
    assertEquals(row.getString(0), mapRow.getString(0));
    assertEquals(row.getString(1), mapRow.getString(1));
    assertNotEquals(row.getString(0), mapRow.getString(1));
    // test that the hash codes before and after serialization are the same
    assertEquals(row.hashCode(), mapRow.hashCode());
    assertEquals(row.getString(0).hashCode(), mapRow.getString(0).hashCode());
    assertEquals(row.getString(1).hashCode(), mapRow.getString(1).hashCode());
    // test that the copy method produces a row with the same contents
    assertEquals(row.copy(), mapRow.copy());
    assertEquals(((BinaryStringData) row.getString(0)).copy(), ((BinaryStringData) mapRow.getString(0)).copy());
    assertEquals(((BinaryStringData) row.getString(1)).copy(), ((BinaryStringData) mapRow.getString(1)).copy());
}
Also used : RandomAccessInputView(org.apache.flink.runtime.io.disk.RandomAccessInputView) BinaryRowData(org.apache.flink.table.data.binary.BinaryRowData) BinaryRowWriter(org.apache.flink.table.data.writer.BinaryRowWriter) ArrayList(java.util.ArrayList) RandomAccessOutputView(org.apache.flink.runtime.io.disk.RandomAccessOutputView) MemorySegment(org.apache.flink.core.memory.MemorySegment) BinaryRowDataSerializer(org.apache.flink.table.runtime.typeutils.BinaryRowDataSerializer) Test(org.junit.Test)

Aggregations

BinaryRowDataSerializer (org.apache.flink.table.runtime.typeutils.BinaryRowDataSerializer) 19
BinaryRowData (org.apache.flink.table.data.binary.BinaryRowData) 9
Before (org.junit.Before) 7
MemorySegment (org.apache.flink.core.memory.MemorySegment) 5
IOManagerAsync (org.apache.flink.runtime.io.disk.iomanager.IOManagerAsync) 5
RowData (org.apache.flink.table.data.RowData) 5
Test (org.junit.Test) 5
ArrayList (java.util.ArrayList) 4
BinaryRowWriter (org.apache.flink.table.data.writer.BinaryRowWriter) 4
IntRecordComparator (org.apache.flink.table.runtime.operators.sort.IntRecordComparator) 4
Configuration (org.apache.flink.configuration.Configuration) 3
RandomAccessInputView (org.apache.flink.runtime.io.disk.RandomAccessInputView) 3
RandomAccessOutputView (org.apache.flink.runtime.io.disk.RandomAccessOutputView) 3
GenericRowData (org.apache.flink.table.data.GenericRowData) 3
MyProjection (org.apache.flink.table.runtime.operators.join.Int2HashJoinOperatorTest.MyProjection) 3
LazyMemorySegmentPool (org.apache.flink.table.runtime.util.LazyMemorySegmentPool) 3
ResettableExternalBuffer (org.apache.flink.table.runtime.util.ResettableExternalBuffer) 3
Random (java.util.Random) 2
TypeInformation (org.apache.flink.api.common.typeinfo.TypeInformation) 2
JoinedRowData (org.apache.flink.table.data.utils.JoinedRowData) 2