Example 36 with KeySelector

Use of org.apache.flink.api.java.functions.KeySelector in project flink by apache.

From the class MultiInputSortingDataInputsTest, method twoInputOrderTest.

@SuppressWarnings("unchecked")
public void twoInputOrderTest(int preferredIndex, int sortedIndex) throws Exception {
    CollectingDataOutput<Object> collectingDataOutput = new CollectingDataOutput<>();
    List<StreamElement> sortedInputElements = Arrays.asList(new StreamRecord<>(1, 3), new StreamRecord<>(1, 1), new StreamRecord<>(2, 1), new StreamRecord<>(2, 3), new StreamRecord<>(1, 2), new StreamRecord<>(2, 2), Watermark.MAX_WATERMARK);
    CollectionDataInput<Integer> sortedInput = new CollectionDataInput<>(sortedInputElements, sortedIndex);
    List<StreamElement> preferredInputElements = Arrays.asList(new StreamRecord<>(99, 3), new StreamRecord<>(99, 1), new Watermark(99L));
    CollectionDataInput<Integer> preferredInput = new CollectionDataInput<>(preferredInputElements, preferredIndex);
    KeySelector<Integer, Integer> keySelector = value -> value;
    try (MockEnvironment environment = MockEnvironment.builder().build()) {
        SelectableSortingInputs selectableSortingInputs = MultiInputSortingDataInput.wrapInputs(new DummyInvokable(), new StreamTaskInput[] { sortedInput }, new KeySelector[] { keySelector }, new TypeSerializer[] { new IntSerializer() }, new IntSerializer(), new StreamTaskInput[] { preferredInput }, environment.getMemoryManager(), environment.getIOManager(), true, 1.0, new Configuration(), new ExecutionConfig());
        StreamTaskInput<?>[] sortingDataInputs = selectableSortingInputs.getSortedInputs();
        StreamTaskInput<?>[] preferredDataInputs = selectableSortingInputs.getPassThroughInputs();
        try (StreamTaskInput<Object> preferredTaskInput = (StreamTaskInput<Object>) preferredDataInputs[0];
            StreamTaskInput<Object> sortedTaskInput = (StreamTaskInput<Object>) sortingDataInputs[0]) {
            MultipleInputSelectionHandler selectionHandler = new MultipleInputSelectionHandler(selectableSortingInputs.getInputSelectable(), 2);
            @SuppressWarnings("rawtypes") StreamOneInputProcessor[] inputProcessors = new StreamOneInputProcessor[2];
            inputProcessors[preferredIndex] = new StreamOneInputProcessor<>(preferredTaskInput, collectingDataOutput, new DummyOperatorChain());
            inputProcessors[sortedIndex] = new StreamOneInputProcessor<>(sortedTaskInput, collectingDataOutput, new DummyOperatorChain());
            StreamMultipleInputProcessor processor = new StreamMultipleInputProcessor(selectionHandler, inputProcessors);
            DataInputStatus inputStatus;
            do {
                inputStatus = processor.processInput();
            } while (inputStatus != DataInputStatus.END_OF_INPUT);
        }
    }
    assertThat(collectingDataOutput.events, equalTo(Arrays.asList(
        new StreamRecord<>(99, 3),
        new StreamRecord<>(99, 1),
        // max watermark from the preferred input
        new Watermark(99L),
        new StreamRecord<>(1, 1),
        new StreamRecord<>(1, 2),
        new StreamRecord<>(1, 3),
        new StreamRecord<>(2, 1),
        new StreamRecord<>(2, 2),
        new StreamRecord<>(2, 3),
        // max watermark from the sorted input
        Watermark.MAX_WATERMARK)));
}
Also used : StreamTaskInput(org.apache.flink.streaming.runtime.io.StreamTaskInput) Arrays(java.util.Arrays) TypeSerializer(org.apache.flink.api.common.typeutils.TypeSerializer) KeySelector(org.apache.flink.api.java.functions.KeySelector) BoundedMultiInput(org.apache.flink.streaming.api.operators.BoundedMultiInput) StreamElement(org.apache.flink.streaming.runtime.streamrecord.StreamElement) CoreMatchers.equalTo(org.hamcrest.CoreMatchers.equalTo) Configuration(org.apache.flink.configuration.Configuration) SelectableSortingInputs(org.apache.flink.streaming.api.operators.sort.MultiInputSortingDataInput.SelectableSortingInputs) Watermark(org.apache.flink.streaming.api.watermark.Watermark) Test(org.junit.Test) StreamMultipleInputProcessor(org.apache.flink.streaming.runtime.io.StreamMultipleInputProcessor) Assert.assertThat(org.junit.Assert.assertThat) IntSerializer(org.apache.flink.api.common.typeutils.base.IntSerializer) DummyInvokable(org.apache.flink.runtime.operators.testutils.DummyInvokable) List(java.util.List) StreamRecord(org.apache.flink.streaming.runtime.streamrecord.StreamRecord) DataInputStatus(org.apache.flink.streaming.runtime.io.DataInputStatus) ExecutionConfig(org.apache.flink.api.common.ExecutionConfig) MultipleInputSelectionHandler(org.apache.flink.streaming.runtime.io.MultipleInputSelectionHandler) MockEnvironment(org.apache.flink.runtime.operators.testutils.MockEnvironment) StreamOneInputProcessor(org.apache.flink.streaming.runtime.io.StreamOneInputProcessor)

Example 37 with KeySelector

Use of org.apache.flink.api.java.functions.KeySelector in project flink by apache.

From the class SortingDataInputTest, method simpleVariableLengthKeySorting.

@Test
public void simpleVariableLengthKeySorting() throws Exception {
    CollectingDataOutput<Integer> collectingDataOutput = new CollectingDataOutput<>();
    CollectionDataInput<Integer> input = new CollectionDataInput<>(Arrays.asList(new StreamRecord<>(1, 3), new StreamRecord<>(1, 1), new StreamRecord<>(2, 1), new StreamRecord<>(2, 3), new StreamRecord<>(1, 2), new StreamRecord<>(2, 2)));
    MockEnvironment environment = MockEnvironment.builder().build();
    SortingDataInput<Integer, String> sortingDataInput = new SortingDataInput<>(input, new IntSerializer(), new StringSerializer(), (KeySelector<Integer, String>) value -> "" + value, environment.getMemoryManager(), environment.getIOManager(), true, 1.0, new Configuration(), new DummyInvokable(), new ExecutionConfig());
    DataInputStatus inputStatus;
    do {
        inputStatus = sortingDataInput.emitNext(collectingDataOutput);
    } while (inputStatus != DataInputStatus.END_OF_INPUT);
    assertThat(collectingDataOutput.events, equalTo(Arrays.asList(new StreamRecord<>(1, 1), new StreamRecord<>(1, 2), new StreamRecord<>(1, 3), new StreamRecord<>(2, 1), new StreamRecord<>(2, 2), new StreamRecord<>(2, 3))));
}
Also used : Arrays(java.util.Arrays) KeySelector(org.apache.flink.api.java.functions.KeySelector) CoreMatchers.equalTo(org.hamcrest.CoreMatchers.equalTo) Configuration(org.apache.flink.configuration.Configuration) Watermark(org.apache.flink.streaming.api.watermark.Watermark) Test(org.junit.Test) StringSerializer(org.apache.flink.api.common.typeutils.base.StringSerializer) Assert.assertThat(org.junit.Assert.assertThat) IntSerializer(org.apache.flink.api.common.typeutils.base.IntSerializer) DummyInvokable(org.apache.flink.runtime.operators.testutils.DummyInvokable) StreamRecord(org.apache.flink.streaming.runtime.streamrecord.StreamRecord) DataInputStatus(org.apache.flink.streaming.runtime.io.DataInputStatus) ExecutionConfig(org.apache.flink.api.common.ExecutionConfig) MockEnvironment(org.apache.flink.runtime.operators.testutils.MockEnvironment)
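
The ordering asserted in simpleVariableLengthKeySorting (records grouped by their extracted key, then ordered by timestamp within each key) can be illustrated without Flink. The following is a minimal plain-Java sketch, not Flink's implementation: the class KeySortSketch, the record type Rec, and the method sortByKeyThenTimestamp are hypothetical names introduced here, java.util.function.Function stands in for KeySelector, and natural String ordering of keys is an assumption (the test only shows that equal keys end up adjacent).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.function.Function;

public class KeySortSketch {

    /** Hypothetical stand-in for a timestamped record; this is not Flink's StreamRecord. */
    static final class Rec {
        final int value;
        final long timestamp;

        Rec(int value, long timestamp) {
            this.value = value;
            this.timestamp = timestamp;
        }

        @Override
        public String toString() {
            return "(" + value + ", " + timestamp + ")";
        }
    }

    /**
     * Returns a sorted copy of the input: first by the extracted key
     * (assumption: natural String order), then by timestamp within each key.
     */
    static List<Rec> sortByKeyThenTimestamp(List<Rec> input, Function<Rec, String> keySelector) {
        List<Rec> sorted = new ArrayList<>(input);
        sorted.sort(Comparator.comparing(keySelector).thenComparingLong(r -> r.timestamp));
        return sorted;
    }

    public static void main(String[] args) {
        // Same (value, timestamp) pairs as the input of the test above.
        List<Rec> input = Arrays.asList(
                new Rec(1, 3), new Rec(1, 1), new Rec(2, 1),
                new Rec(2, 3), new Rec(1, 2), new Rec(2, 2));
        // The key is the string form of the value, mirroring value -> "" + value.
        System.out.println(sortByKeyThenTimestamp(input, r -> "" + r.value));
        // prints [(1, 1), (1, 2), (1, 3), (2, 1), (2, 2), (2, 3)]
    }
}
```

The sketch sorts in memory with a comparator. The test above instead passes a memory manager and an IO manager to SortingDataInput and drains it through emitNext until END_OF_INPUT, so the in-memory sort here only reproduces the expected output order.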

Example 38 with KeySelector

Use of org.apache.flink.api.java.functions.KeySelector in project flink by apache.

From the class StateDescriptorPassingTest, method testReduceWindowState.

@Test
public void testReduceWindowState() {
    final StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
    env.registerTypeWithKryoSerializer(File.class, JavaSerializer.class);
    DataStream<File> src = env.fromElements(new File("/")).assignTimestampsAndWatermarks(WatermarkStrategy.<File>forMonotonousTimestamps().withTimestampAssigner((file, ts) -> System.currentTimeMillis()));
    SingleOutputStreamOperator<?> result = src.keyBy(new KeySelector<File, String>() {

        @Override
        public String getKey(File value) {
            return null;
        }
    }).window(TumblingEventTimeWindows.of(Time.milliseconds(1000))).reduce(new ReduceFunction<File>() {

        @Override
        public File reduce(File value1, File value2) {
            return null;
        }
    });
    validateStateDescriptorConfigured(result);
}
Also used : Kryo(com.esotericsoftware.kryo.Kryo) Collector(org.apache.flink.util.Collector) TimeWindow(org.apache.flink.streaming.api.windowing.windows.TimeWindow) ProcessAllWindowFunction(org.apache.flink.streaming.api.functions.windowing.ProcessAllWindowFunction) ListStateDescriptor(org.apache.flink.api.common.state.ListStateDescriptor) ReduceFunction(org.apache.flink.api.common.functions.ReduceFunction) JavaSerializer(com.esotericsoftware.kryo.serializers.JavaSerializer) Time(org.apache.flink.streaming.api.windowing.time.Time) TypeSerializer(org.apache.flink.api.common.typeutils.TypeSerializer) KeySelector(org.apache.flink.api.java.functions.KeySelector) StateDescriptor(org.apache.flink.api.common.state.StateDescriptor) KryoSerializer(org.apache.flink.api.java.typeutils.runtime.kryo.KryoSerializer) SingleOutputStreamOperator(org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator) WindowOperator(org.apache.flink.streaming.runtime.operators.windowing.WindowOperator) Assert.assertTrue(org.junit.Assert.assertTrue) WatermarkStrategy(org.apache.flink.api.common.eventtime.WatermarkStrategy) Test(org.junit.Test) ProcessWindowFunction(org.apache.flink.streaming.api.functions.windowing.ProcessWindowFunction) OneInputTransformation(org.apache.flink.streaming.api.transformations.OneInputTransformation) File(java.io.File) DataStream(org.apache.flink.streaming.api.datastream.DataStream) WindowFunction(org.apache.flink.streaming.api.functions.windowing.WindowFunction) TumblingEventTimeWindows(org.apache.flink.streaming.api.windowing.assigners.TumblingEventTimeWindows) AllWindowFunction(org.apache.flink.streaming.api.functions.windowing.AllWindowFunction) ListSerializer(org.apache.flink.api.common.typeutils.base.ListSerializer) StreamExecutionEnvironment(org.apache.flink.streaming.api.environment.StreamExecutionEnvironment)

Example 39 with KeySelector

Use of org.apache.flink.api.java.functions.KeySelector in project flink by apache.

From the class StateDescriptorPassingTest, method testApplyWindowState.

@Test
public void testApplyWindowState() {
    final StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
    env.registerTypeWithKryoSerializer(File.class, JavaSerializer.class);
    DataStream<File> src = env.fromElements(new File("/")).assignTimestampsAndWatermarks(WatermarkStrategy.<File>forMonotonousTimestamps().withTimestampAssigner((file, ts) -> System.currentTimeMillis()));
    SingleOutputStreamOperator<?> result = src.keyBy(new KeySelector<File, String>() {

        @Override
        public String getKey(File value) {
            return null;
        }
    }).window(TumblingEventTimeWindows.of(Time.milliseconds(1000))).apply(new WindowFunction<File, String, String, TimeWindow>() {

        @Override
        public void apply(String s, TimeWindow window, Iterable<File> input, Collector<String> out) {
        }
    });
    validateListStateDescriptorConfigured(result);
}
Also used : Kryo(com.esotericsoftware.kryo.Kryo) Collector(org.apache.flink.util.Collector) TimeWindow(org.apache.flink.streaming.api.windowing.windows.TimeWindow) ProcessAllWindowFunction(org.apache.flink.streaming.api.functions.windowing.ProcessAllWindowFunction) ListStateDescriptor(org.apache.flink.api.common.state.ListStateDescriptor) ReduceFunction(org.apache.flink.api.common.functions.ReduceFunction) JavaSerializer(com.esotericsoftware.kryo.serializers.JavaSerializer) Time(org.apache.flink.streaming.api.windowing.time.Time) TypeSerializer(org.apache.flink.api.common.typeutils.TypeSerializer) KeySelector(org.apache.flink.api.java.functions.KeySelector) StateDescriptor(org.apache.flink.api.common.state.StateDescriptor) KryoSerializer(org.apache.flink.api.java.typeutils.runtime.kryo.KryoSerializer) SingleOutputStreamOperator(org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator) WindowOperator(org.apache.flink.streaming.runtime.operators.windowing.WindowOperator) Assert.assertTrue(org.junit.Assert.assertTrue) WatermarkStrategy(org.apache.flink.api.common.eventtime.WatermarkStrategy) Test(org.junit.Test) ProcessWindowFunction(org.apache.flink.streaming.api.functions.windowing.ProcessWindowFunction) OneInputTransformation(org.apache.flink.streaming.api.transformations.OneInputTransformation) File(java.io.File) DataStream(org.apache.flink.streaming.api.datastream.DataStream) WindowFunction(org.apache.flink.streaming.api.functions.windowing.WindowFunction) TumblingEventTimeWindows(org.apache.flink.streaming.api.windowing.assigners.TumblingEventTimeWindows) AllWindowFunction(org.apache.flink.streaming.api.functions.windowing.AllWindowFunction) ListSerializer(org.apache.flink.api.common.typeutils.base.ListSerializer) StreamExecutionEnvironment(org.apache.flink.streaming.api.environment.StreamExecutionEnvironment)

Example 40 with KeySelector

Use of org.apache.flink.api.java.functions.KeySelector in project flink by apache.

From the class StateDescriptorPassingTest, method testProcessWindowState.

@Test
public void testProcessWindowState() {
    final StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
    env.registerTypeWithKryoSerializer(File.class, JavaSerializer.class);
    DataStream<File> src = env.fromElements(new File("/")).assignTimestampsAndWatermarks(WatermarkStrategy.<File>forMonotonousTimestamps().withTimestampAssigner((file, ts) -> System.currentTimeMillis()));
    SingleOutputStreamOperator<?> result = src.keyBy(new KeySelector<File, String>() {

        @Override
        public String getKey(File value) {
            return null;
        }
    }).window(TumblingEventTimeWindows.of(Time.milliseconds(1000))).process(new ProcessWindowFunction<File, String, String, TimeWindow>() {

        @Override
        public void process(String s, Context ctx, Iterable<File> input, Collector<String> out) {
        }
    });
    validateListStateDescriptorConfigured(result);
}
Also used : Kryo(com.esotericsoftware.kryo.Kryo) Collector(org.apache.flink.util.Collector) TimeWindow(org.apache.flink.streaming.api.windowing.windows.TimeWindow) ProcessAllWindowFunction(org.apache.flink.streaming.api.functions.windowing.ProcessAllWindowFunction) ListStateDescriptor(org.apache.flink.api.common.state.ListStateDescriptor) ReduceFunction(org.apache.flink.api.common.functions.ReduceFunction) JavaSerializer(com.esotericsoftware.kryo.serializers.JavaSerializer) Time(org.apache.flink.streaming.api.windowing.time.Time) TypeSerializer(org.apache.flink.api.common.typeutils.TypeSerializer) KeySelector(org.apache.flink.api.java.functions.KeySelector) StateDescriptor(org.apache.flink.api.common.state.StateDescriptor) KryoSerializer(org.apache.flink.api.java.typeutils.runtime.kryo.KryoSerializer) SingleOutputStreamOperator(org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator) WindowOperator(org.apache.flink.streaming.runtime.operators.windowing.WindowOperator) Assert.assertTrue(org.junit.Assert.assertTrue) WatermarkStrategy(org.apache.flink.api.common.eventtime.WatermarkStrategy) Test(org.junit.Test) ProcessWindowFunction(org.apache.flink.streaming.api.functions.windowing.ProcessWindowFunction) OneInputTransformation(org.apache.flink.streaming.api.transformations.OneInputTransformation) File(java.io.File) DataStream(org.apache.flink.streaming.api.datastream.DataStream) WindowFunction(org.apache.flink.streaming.api.functions.windowing.WindowFunction) TumblingEventTimeWindows(org.apache.flink.streaming.api.windowing.assigners.TumblingEventTimeWindows) AllWindowFunction(org.apache.flink.streaming.api.functions.windowing.AllWindowFunction) ListSerializer(org.apache.flink.api.common.typeutils.base.ListSerializer) StreamExecutionEnvironment(org.apache.flink.streaming.api.environment.StreamExecutionEnvironment)

Aggregations

KeySelector (org.apache.flink.api.java.functions.KeySelector): 120
Test (org.junit.Test): 113
Tuple2 (org.apache.flink.api.java.tuple.Tuple2): 45
StreamExecutionEnvironment (org.apache.flink.streaming.api.environment.StreamExecutionEnvironment): 44
ExecutionEnvironment (org.apache.flink.api.java.ExecutionEnvironment): 39
Watermark (org.apache.flink.streaming.api.watermark.Watermark): 30
List (java.util.List): 29
StreamRecord (org.apache.flink.streaming.runtime.streamrecord.StreamRecord): 28
InvalidProgramException (org.apache.flink.api.common.InvalidProgramException): 22
JobID (org.apache.flink.api.common.JobID): 22
JobGraph (org.apache.flink.runtime.jobgraph.JobGraph): 22
IOException (java.io.IOException): 21
Arrays (java.util.Arrays): 21
AtomicLong (java.util.concurrent.atomic.AtomicLong): 21
Configuration (org.apache.flink.configuration.Configuration): 21
KeyedOneInputStreamOperatorTestHarness (org.apache.flink.streaming.util.KeyedOneInputStreamOperatorTestHarness): 21
ArrayList (java.util.ArrayList): 18
Map (java.util.Map): 18
ExecutionConfig (org.apache.flink.api.common.ExecutionConfig): 18
ValueStateDescriptor (org.apache.flink.api.common.state.ValueStateDescriptor): 16