
Example 6 with KeySelector

Use of org.apache.flink.api.java.functions.KeySelector in project flink by apache.

From the class AbstractQueryableStateITCase, method testReducingState:

/**
 * Tests a simple queryable state instance backed by reducing state. Each
 * source emits (subtaskIndex, 0)..(subtaskIndex, numElements) tuples,
 * which are then queried. The reducing state instance sums these up, and
 * the test succeeds once every subtask index has been queried and has
 * returned the result n*(n+1)/2.
 */
@Test
public void testReducingState() throws Exception {
    // Config
    final Deadline deadline = TEST_TIMEOUT.fromNow();
    final int numElements = 1024;
    final QueryableStateClient client = new QueryableStateClient(cluster.configuration());
    JobID jobId = null;
    try {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.setStateBackend(stateBackend);
        env.setParallelism(NUM_SLOTS);
        // Very important, because cluster is shared between tests and we
        // don't explicitly check that all slots are available before
        // submitting.
        env.setRestartStrategy(RestartStrategies.fixedDelayRestart(Integer.MAX_VALUE, 1000));
        DataStream<Tuple2<Integer, Long>> source = env.addSource(new TestAscendingValueSource(numElements));
        // Reducing state
        ReducingStateDescriptor<Tuple2<Integer, Long>> reducingState = new ReducingStateDescriptor<>("any", new SumReduce(), source.getType());
        QueryableStateStream<Integer, Tuple2<Integer, Long>> queryableState = source.keyBy(new KeySelector<Tuple2<Integer, Long>, Integer>() {

            @Override
            public Integer getKey(Tuple2<Integer, Long> value) throws Exception {
                return value.f0;
            }
        }).asQueryableState("jungle", reducingState);
        // Submit the job graph
        JobGraph jobGraph = env.getStreamGraph().getJobGraph();
        jobId = jobGraph.getJobID();
        cluster.submitJobDetached(jobGraph);
        // Wait until the job is running, then query the state.
        long expected = numElements * (numElements + 1) / 2;
        executeValueQuery(deadline, client, jobId, queryableState, expected);
    } finally {
        // Free cluster resources
        if (jobId != null) {
            Future<CancellationSuccess> cancellation = cluster
                    .getLeaderGateway(deadline.timeLeft())
                    .ask(new JobManagerMessages.CancelJob(jobId), deadline.timeLeft())
                    .mapTo(ClassTag$.MODULE$.<CancellationSuccess>apply(CancellationSuccess.class));
            Await.ready(cancellation, deadline.timeLeft());
        }
        client.shutDown();
    }
}
Also used: ReducingStateDescriptor (org.apache.flink.api.common.state.ReducingStateDescriptor), Deadline (scala.concurrent.duration.Deadline), QueryableStateClient (org.apache.flink.runtime.query.QueryableStateClient), KeySelector (org.apache.flink.api.java.functions.KeySelector), JobGraph (org.apache.flink.runtime.jobgraph.JobGraph), Tuple2 (org.apache.flink.api.java.tuple.Tuple2), AtomicLong (java.util.concurrent.atomic.AtomicLong), CancellationSuccess (org.apache.flink.runtime.messages.JobManagerMessages.CancellationSuccess), StreamExecutionEnvironment (org.apache.flink.streaming.api.environment.StreamExecutionEnvironment), JobID (org.apache.flink.api.common.JobID), Test (org.junit.Test).
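
The SumReduce function passed to the ReducingStateDescriptor is a helper of the test class and is not shown in this snippet. A minimal sketch, assuming it keeps the key field and sums the Long values (which is consistent with the expected result n*(n+1)/2):

import org.apache.flink.api.common.functions.ReduceFunction;
import org.apache.flink.api.java.tuple.Tuple2;

// Hypothetical reconstruction of the SumReduce helper; the real test class may differ.
public class SumReduce implements ReduceFunction<Tuple2<Integer, Long>> {

    @Override
    public Tuple2<Integer, Long> reduce(Tuple2<Integer, Long> value1, Tuple2<Integer, Long> value2) {
        // Keep the subtask index (f0) and accumulate the emitted values (f1).
        return Tuple2.of(value1.f0, value1.f1 + value2.f1);
    }
}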

Example 7 with KeySelector

Use of org.apache.flink.api.java.functions.KeySelector in project flink by apache.

From the class StreamOperatorSnapshotRestoreTest, method testOperatorStatesSnapshotRestore:

@Test
public void testOperatorStatesSnapshotRestore() throws Exception {
    //-------------------------------------------------------------------------- snapshot
    TestOneInputStreamOperator op = new TestOneInputStreamOperator(false);
    KeyedOneInputStreamOperatorTestHarness<Integer, Integer, Integer> testHarness = new KeyedOneInputStreamOperatorTestHarness<>(op, new KeySelector<Integer, Integer>() {

        @Override
        public Integer getKey(Integer value) throws Exception {
            return value;
        }
    }, TypeInformation.of(Integer.class), MAX_PARALLELISM, 1, /* num subtasks */
    0 /* subtask index */);
    testHarness.open();
    for (int i = 0; i < 10; ++i) {
        testHarness.processElement(new StreamRecord<>(i));
    }
    OperatorStateHandles handles = testHarness.snapshot(1L, 1L);
    testHarness.close();
    //-------------------------------------------------------------------------- restore
    op = new TestOneInputStreamOperator(true);
    testHarness = new KeyedOneInputStreamOperatorTestHarness<>(op, new KeySelector<Integer, Integer>() {

        @Override
        public Integer getKey(Integer value) throws Exception {
            return value;
        }
    }, TypeInformation.of(Integer.class), MAX_PARALLELISM, 1, /* num subtasks */
    0 /* subtask index */);
    testHarness.initializeState(handles);
    testHarness.open();
    for (int i = 0; i < 10; ++i) {
        testHarness.processElement(new StreamRecord<>(i));
    }
    testHarness.close();
}
Also used: OperatorStateHandles (org.apache.flink.streaming.runtime.tasks.OperatorStateHandles), KeySelector (org.apache.flink.api.java.functions.KeySelector), KeyedOneInputStreamOperatorTestHarness (org.apache.flink.streaming.util.KeyedOneInputStreamOperatorTestHarness), Test (org.junit.Test).
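
Both harness constructions above repeat the same identity KeySelector. Because KeySelector is a single-method interface and the harness receives the key TypeInformation explicitly (so no type extraction from the lambda is needed), the selector can also be written as a lambda on Java 8+. A sketch of the first construction, reusing op and MAX_PARALLELISM from the snippet:

    KeyedOneInputStreamOperatorTestHarness<Integer, Integer, Integer> testHarness =
            new KeyedOneInputStreamOperatorTestHarness<>(
                    op,
                    value -> value,                    // identity key selector as a lambda
                    TypeInformation.of(Integer.class), // key type, supplied explicitly
                    MAX_PARALLELISM,
                    1,  // num subtasks
                    0); // subtask index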

Example 8 with KeySelector

Use of org.apache.flink.api.java.functions.KeySelector in project flink by apache.

From the class ReduceTranslationTests, method translateGroupedReduceWithkeyExtractor:

@Test
public void translateGroupedReduceWithkeyExtractor() {
    try {
        final int parallelism = 8;
        ExecutionEnvironment env = ExecutionEnvironment.createLocalEnvironment(parallelism);
        DataSet<Tuple3<Double, StringValue, LongValue>> initialData = getSourceDataSet(env);
        initialData.groupBy(new KeySelector<Tuple3<Double, StringValue, LongValue>, StringValue>() {

            @Override
            public StringValue getKey(Tuple3<Double, StringValue, LongValue> value) {
                return value.f1;
            }
        }).reduce(new RichReduceFunction<Tuple3<Double, StringValue, LongValue>>() {

            @Override
            public Tuple3<Double, StringValue, LongValue> reduce(Tuple3<Double, StringValue, LongValue> value1, Tuple3<Double, StringValue, LongValue> value2) {
                return value1;
            }
        }).setParallelism(4).output(new DiscardingOutputFormat<Tuple3<Double, StringValue, LongValue>>());
        Plan p = env.createProgramPlan();
        GenericDataSinkBase<?> sink = p.getDataSinks().iterator().next();
        MapOperatorBase<?, ?, ?> keyProjector = (MapOperatorBase<?, ?, ?>) sink.getInput();
        PlanUnwrappingReduceOperator<?, ?> reducer = (PlanUnwrappingReduceOperator<?, ?>) keyProjector.getInput();
        MapOperatorBase<?, ?, ?> keyExtractor = (MapOperatorBase<?, ?, ?>) reducer.getInput();
        // check the parallelisms
        assertEquals(1, keyExtractor.getParallelism());
        assertEquals(4, reducer.getParallelism());
        assertEquals(4, keyProjector.getParallelism());
        // check types
        TypeInformation<?> keyValueInfo = new TupleTypeInfo<Tuple2<StringValue, Tuple3<Double, StringValue, LongValue>>>(new ValueTypeInfo<StringValue>(StringValue.class), initialData.getType());
        assertEquals(initialData.getType(), keyExtractor.getOperatorInfo().getInputType());
        assertEquals(keyValueInfo, keyExtractor.getOperatorInfo().getOutputType());
        assertEquals(keyValueInfo, reducer.getOperatorInfo().getInputType());
        assertEquals(keyValueInfo, reducer.getOperatorInfo().getOutputType());
        assertEquals(keyValueInfo, keyProjector.getOperatorInfo().getInputType());
        assertEquals(initialData.getType(), keyProjector.getOperatorInfo().getOutputType());
        // check keys
        assertEquals(KeyExtractingMapper.class, keyExtractor.getUserCodeWrapper().getUserCodeClass());
        assertTrue(keyExtractor.getInput() instanceof GenericDataSourceBase<?, ?>);
    } catch (Exception e) {
        System.err.println(e.getMessage());
        e.printStackTrace();
        fail("Test caused an error: " + e.getMessage());
    }
}
Also used: ExecutionEnvironment (org.apache.flink.api.java.ExecutionEnvironment), KeySelector (org.apache.flink.api.java.functions.KeySelector), Plan (org.apache.flink.api.common.Plan), TupleTypeInfo (org.apache.flink.api.java.typeutils.TupleTypeInfo), MapOperatorBase (org.apache.flink.api.common.operators.base.MapOperatorBase), Tuple3 (org.apache.flink.api.java.tuple.Tuple3), LongValue (org.apache.flink.types.LongValue), StringValue (org.apache.flink.types.StringValue), Test (org.junit.Test).
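
The assertions above pin down how Flink translates a KeySelector-based grouping: a key-extracting map wraps each record into a Tuple2 of (key, record), the reduce runs on that wrapped type, and a final map projects the key away again, which is why keyValueInfo appears as the input and output type of the middle operators. A hypothetical sketch of the wrap step (illustrative only, not Flink's internal KeyExtractingMapper):

import org.apache.flink.api.java.functions.KeySelector;
import org.apache.flink.api.java.tuple.Tuple2;

// Mirrors what the plan's key-extracting mapper does with each record
// before the reduce runs: pair the extracted key with the record itself.
static <K, T> Tuple2<K, T> wrapWithKey(KeySelector<T, K> selector, T record) throws Exception {
    return Tuple2.of(selector.getKey(record), record);
}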

Example 9 with KeySelector

Use of org.apache.flink.api.java.functions.KeySelector in project flink by apache.

From the class PartitionOperatorTest, method testRangePartitionBySelectorComplexKeyWithOrders:

@Test
public void testRangePartitionBySelectorComplexKeyWithOrders() throws Exception {
    final ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
    final DataSet<NestedPojo> ds = getNestedPojoDataSet(env);
    ds.partitionByRange(new KeySelector<NestedPojo, CustomPojo>() {

        @Override
        public CustomPojo getKey(NestedPojo value) throws Exception {
            return value.getNested();
        }
    }).withOrders(Order.ASCENDING);
}
Also used: ExecutionEnvironment (org.apache.flink.api.java.ExecutionEnvironment), KeySelector (org.apache.flink.api.java.functions.KeySelector), Test (org.junit.Test).
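
NestedPojo and CustomPojo are fixtures of the test class and are not shown here. For an ordered range partitioning, Flink must be able to build a comparator for the extracted key, which works for POJOs whose fields are themselves comparable key types. A hypothetical reconstruction consistent with the call above (names and fields are assumptions, not the real test classes):

// Hypothetical fixtures; the real test classes in Flink may differ.
public static class CustomPojo {

    // Public fields and an implicit no-arg constructor make this a valid
    // Flink POJO; both fields are comparable, so the type can serve as an
    // ordered partitioning key.
    public int id;
    public String name;
}

public static class NestedPojo {

    public CustomPojo nested;

    public CustomPojo getNested() {
        return nested;
    }
}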

Example 10 with KeySelector

Use of org.apache.flink.api.java.functions.KeySelector in project flink by apache.

From the class SortPartitionTest, method testSortPartitionWithKeySelector5:

@Test(expected = InvalidProgramException.class)
public void testSortPartitionWithKeySelector5() {
    final ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
    DataSet<Tuple4<Integer, Long, CustomType, Long[]>> tupleDs = env.fromCollection(tupleWithCustomData, tupleWithCustomInfo);
    // must not work
    tupleDs.sortPartition(new KeySelector<Tuple4<Integer, Long, CustomType, Long[]>, CustomType>() {

        @Override
        public CustomType getKey(Tuple4<Integer, Long, CustomType, Long[]> value) throws Exception {
            return value.f2;
        }
    }, Order.ASCENDING).sortPartition("f1", Order.ASCENDING);
}
Also used: Tuple4 (org.apache.flink.api.java.tuple.Tuple4), ExecutionEnvironment (org.apache.flink.api.java.ExecutionEnvironment), KeySelector (org.apache.flink.api.java.functions.KeySelector), Test (org.junit.Test).
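
The test above expects an InvalidProgramException because a KeySelector-based sortPartition cannot be chained with a further field-expression sort. A sketch of forms that are accepted, reusing tupleDs and CustomType from the snippet:

    // Valid: a single KeySelector-based sort per partition.
    tupleDs.sortPartition(new KeySelector<Tuple4<Integer, Long, CustomType, Long[]>, Long>() {

        @Override
        public Long getKey(Tuple4<Integer, Long, CustomType, Long[]> value) throws Exception {
            return value.f1;
        }
    }, Order.ASCENDING);

    // Valid: chained sorts on field expressions, with no KeySelector involved.
    tupleDs.sortPartition("f0", Order.ASCENDING).sortPartition("f1", Order.DESCENDING);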

Aggregations

KeySelector (org.apache.flink.api.java.functions.KeySelector): 120
Test (org.junit.Test): 113
Tuple2 (org.apache.flink.api.java.tuple.Tuple2): 45
StreamExecutionEnvironment (org.apache.flink.streaming.api.environment.StreamExecutionEnvironment): 44
ExecutionEnvironment (org.apache.flink.api.java.ExecutionEnvironment): 39
Watermark (org.apache.flink.streaming.api.watermark.Watermark): 30
List (java.util.List): 29
StreamRecord (org.apache.flink.streaming.runtime.streamrecord.StreamRecord): 28
InvalidProgramException (org.apache.flink.api.common.InvalidProgramException): 22
JobID (org.apache.flink.api.common.JobID): 22
JobGraph (org.apache.flink.runtime.jobgraph.JobGraph): 22
IOException (java.io.IOException): 21
Arrays (java.util.Arrays): 21
AtomicLong (java.util.concurrent.atomic.AtomicLong): 21
Configuration (org.apache.flink.configuration.Configuration): 21
KeyedOneInputStreamOperatorTestHarness (org.apache.flink.streaming.util.KeyedOneInputStreamOperatorTestHarness): 21
ArrayList (java.util.ArrayList): 18
Map (java.util.Map): 18
ExecutionConfig (org.apache.flink.api.common.ExecutionConfig): 18
ValueStateDescriptor (org.apache.flink.api.common.state.ValueStateDescriptor): 16