Example 1 with Time

use of org.apache.flink.streaming.api.windowing.time.Time in project flink by apache.

the class NFACompiler method compileFactory.

/**
	 * Compiles the given pattern into a {@link NFAFactory}. The NFA factory can be used to create
	 * multiple NFAs.
	 *
	 * @param pattern Definition of sequence pattern
	 * @param inputTypeSerializer Serializer for the input type
	 * @param timeoutHandling True if the NFA shall return timed out event patterns
	 * @param <T> Type of the input events
	 * @return Factory for NFAs corresponding to the given pattern
	 */
@SuppressWarnings("unchecked")
public static <T> NFAFactory<T> compileFactory(Pattern<T, ?> pattern, TypeSerializer<T> inputTypeSerializer, boolean timeoutHandling) {
    if (pattern == null) {
        // return a factory for empty NFAs
        return new NFAFactoryImpl<T>(inputTypeSerializer, 0, Collections.<State<T>>emptyList(), timeoutHandling);
    } else {
        // set of all generated states
        Map<String, State<T>> states = new HashMap<>();
        long windowTime;
        // this is used to enforce pattern name uniqueness.
        Set<String> patternNames = new HashSet<>();
        Pattern<T, ?> succeedingPattern;
        State<T> succeedingState;
        Pattern<T, ?> currentPattern = pattern;
        // we're traversing the pattern from the end to the beginning --> the first state is the final state
        State<T> currentState = new State<>(currentPattern.getName(), State.StateType.Final);
        patternNames.add(currentPattern.getName());
        states.put(currentPattern.getName(), currentState);
        windowTime = currentPattern.getWindowTime() != null ? currentPattern.getWindowTime().toMilliseconds() : 0L;
        while (currentPattern.getPrevious() != null) {
            succeedingPattern = currentPattern;
            succeedingState = currentState;
            currentPattern = currentPattern.getPrevious();
            if (!patternNames.add(currentPattern.getName())) {
                throw new MalformedPatternException("Duplicate pattern name: " + currentPattern.getName() + ". " + "Pattern names must be unique.");
            }
            Time currentWindowTime = currentPattern.getWindowTime();
            if (currentWindowTime != null && currentWindowTime.toMilliseconds() < windowTime) {
                // the window time is the global minimum of all window times of each state
                windowTime = currentWindowTime.toMilliseconds();
            }
            if (states.containsKey(currentPattern.getName())) {
                currentState = states.get(currentPattern.getName());
            } else {
                currentState = new State<>(currentPattern.getName(), State.StateType.Normal);
                states.put(currentState.getName(), currentState);
            }
            currentState.addStateTransition(new StateTransition<T>(StateTransitionAction.TAKE, succeedingState, (FilterFunction<T>) succeedingPattern.getFilterFunction()));
            if (succeedingPattern instanceof FollowedByPattern) {
                // the followed by pattern entails a reflexive ignore transition
                currentState.addStateTransition(new StateTransition<T>(StateTransitionAction.IGNORE, currentState, null));
            }
        }
        // add the beginning state
        final State<T> beginningState;
        if (states.containsKey(BEGINNING_STATE_NAME)) {
            beginningState = states.get(BEGINNING_STATE_NAME);
        } else {
            beginningState = new State<>(BEGINNING_STATE_NAME, State.StateType.Start);
            states.put(BEGINNING_STATE_NAME, beginningState);
        }
        beginningState.addStateTransition(new StateTransition<T>(StateTransitionAction.TAKE, currentState, (FilterFunction<T>) currentPattern.getFilterFunction()));
        return new NFAFactoryImpl<T>(inputTypeSerializer, windowTime, new HashSet<>(states.values()), timeoutHandling);
    }
}
Also used : FilterFunction(org.apache.flink.api.common.functions.FilterFunction) HashMap(java.util.HashMap) Time(org.apache.flink.streaming.api.windowing.time.Time) FollowedByPattern(org.apache.flink.cep.pattern.FollowedByPattern) State(org.apache.flink.cep.nfa.State) HashSet(java.util.HashSet)
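
For orientation, here is a minimal sketch of how such a factory might be obtained from the classic FilterFunction-based Pattern API that this NFACompiler version compiles; Event and eventSerializer are illustrative placeholders, not names from the snippet above.

// Hedged sketch: the within() clause supplies the window time that compileFactory()
// folds into the NFA above (Event and eventSerializer are assumed placeholders).
Pattern<Event, ?> pattern = Pattern.<Event>begin("start")
        .followedBy("end")
        .within(Time.seconds(10));

// The resulting factory can then create one NFA per key, as described in the Javadoc.
NFACompiler.NFAFactory<Event> factory =
        NFACompiler.compileFactory(pattern, eventSerializer, false);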

Example 2 with Time

use of org.apache.flink.streaming.api.windowing.time.Time in project flink by apache.

the class StateDescriptorPassingTest method testApplyWindowAllState.

@Test
public void testApplyWindowAllState() {
    final StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
    env.registerTypeWithKryoSerializer(File.class, JavaSerializer.class);
    // simulate ingestion time
    DataStream<File> src = env.fromElements(new File("/")).assignTimestampsAndWatermarks(WatermarkStrategy.<File>forMonotonousTimestamps().withTimestampAssigner((file, ts) -> System.currentTimeMillis()));
    SingleOutputStreamOperator<?> result = src.windowAll(TumblingEventTimeWindows.of(Time.milliseconds(1000))).apply(new AllWindowFunction<File, String, TimeWindow>() {

        @Override
        public void apply(TimeWindow window, Iterable<File> input, Collector<String> out) {
        }
    });
    validateListStateDescriptorConfigured(result);
}
Also used : Kryo(com.esotericsoftware.kryo.Kryo) Collector(org.apache.flink.util.Collector) TimeWindow(org.apache.flink.streaming.api.windowing.windows.TimeWindow) ProcessAllWindowFunction(org.apache.flink.streaming.api.functions.windowing.ProcessAllWindowFunction) ListStateDescriptor(org.apache.flink.api.common.state.ListStateDescriptor) ReduceFunction(org.apache.flink.api.common.functions.ReduceFunction) JavaSerializer(com.esotericsoftware.kryo.serializers.JavaSerializer) Time(org.apache.flink.streaming.api.windowing.time.Time) TypeSerializer(org.apache.flink.api.common.typeutils.TypeSerializer) KeySelector(org.apache.flink.api.java.functions.KeySelector) StateDescriptor(org.apache.flink.api.common.state.StateDescriptor) KryoSerializer(org.apache.flink.api.java.typeutils.runtime.kryo.KryoSerializer) SingleOutputStreamOperator(org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator) WindowOperator(org.apache.flink.streaming.runtime.operators.windowing.WindowOperator) Assert.assertTrue(org.junit.Assert.assertTrue) WatermarkStrategy(org.apache.flink.api.common.eventtime.WatermarkStrategy) Test(org.junit.Test) ProcessWindowFunction(org.apache.flink.streaming.api.functions.windowing.ProcessWindowFunction) OneInputTransformation(org.apache.flink.streaming.api.transformations.OneInputTransformation) File(java.io.File) DataStream(org.apache.flink.streaming.api.datastream.DataStream) WindowFunction(org.apache.flink.streaming.api.functions.windowing.WindowFunction) TumblingEventTimeWindows(org.apache.flink.streaming.api.windowing.assigners.TumblingEventTimeWindows) AllWindowFunction(org.apache.flink.streaming.api.functions.windowing.AllWindowFunction) ListSerializer(org.apache.flink.api.common.typeutils.base.ListSerializer) StreamExecutionEnvironment(org.apache.flink.streaming.api.environment.StreamExecutionEnvironment) StreamExecutionEnvironment(org.apache.flink.streaming.api.environment.StreamExecutionEnvironment) File(java.io.File) TimeWindow(org.apache.flink.streaming.api.windowing.windows.TimeWindow) Test(org.junit.Test)
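
As a companion sketch (not part of the test above), the same Time-based tumbling window can also be expressed on a keyed stream and aggregated with a ReduceFunction; the mapping to a constant key and the Types.LONG hint are illustrative assumptions.

// Hedged sketch: the same Time.milliseconds(...) tumbling window on a keyed stream,
// aggregated with a ReduceFunction instead of an AllWindowFunction.
// Assumes org.apache.flink.api.common.typeinfo.Types is imported.
DataStream<Long> perWindowCounts = src
        .map(file -> 1L)
        .returns(Types.LONG)
        .keyBy(count -> 0L)
        .window(TumblingEventTimeWindows.of(Time.milliseconds(1000)))
        .reduce(Long::sum);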

Example 3 with Time

use of org.apache.flink.streaming.api.windowing.time.Time in project flink by apache.

the class DataStreamJavaITCase method testFromAndToChangelogStreamEventTime.

@Test
public void testFromAndToChangelogStreamEventTime() throws Exception {
    final StreamTableEnvironment tableEnv = StreamTableEnvironment.create(env);
    final DataStream<Tuple3<Long, Integer, String>> dataStream = getWatermarkedDataStream();
    final DataStream<Row> changelogStream = dataStream.map(t -> Row.ofKind(RowKind.INSERT, t.f1, t.f2)).returns(Types.ROW(Types.INT, Types.STRING));
    // derive physical columns and add a rowtime
    final Table table = tableEnv.fromChangelogStream(changelogStream, Schema.newBuilder().columnByMetadata("rowtime", TIMESTAMP_LTZ(3)).columnByExpression("computed", $("f1").upperCase()).watermark("rowtime", sourceWatermark()).build());
    tableEnv.createTemporaryView("t", table);
    // access and reorder columns
    final Table reordered = tableEnv.sqlQuery("SELECT computed, rowtime, f0 FROM t");
    // write out the rowtime column with fully declared schema
    final DataStream<Row> result = tableEnv.toChangelogStream(reordered, Schema.newBuilder().column("f1", STRING()).columnByMetadata("rowtime", TIMESTAMP_LTZ(3)).columnByExpression("ignored", $("f1").upperCase()).column("f0", INT()).build());
    // test event time window and field access
    testResult(result.keyBy(k -> k.getField("f1")).window(TumblingEventTimeWindows.of(Time.milliseconds(5))).<Row>apply((key, window, input, out) -> {
        int sum = 0;
        for (Row row : input) {
            sum += row.<Integer>getFieldAs("f0");
        }
        out.collect(Row.of(key, sum));
    }).returns(Types.ROW(Types.STRING, Types.INT)), Row.of("A", 47), Row.of("C", 1000), Row.of("C", 1000));
}
Also used : DataType(org.apache.flink.table.types.DataType) BIGINT(org.apache.flink.table.api.DataTypes.BIGINT) STRING(org.apache.flink.table.api.DataTypes.STRING) StreamTableEnvironment(org.apache.flink.table.api.bridge.java.StreamTableEnvironment) Arrays(java.util.Arrays) Schema(org.apache.flink.table.api.Schema) Tuple3(org.apache.flink.api.java.tuple.Tuple3) TableDescriptor(org.apache.flink.table.api.TableDescriptor) Tuple2(org.apache.flink.api.java.tuple.Tuple2) ResolvedSchema(org.apache.flink.table.catalog.ResolvedSchema) TupleTypeInfo(org.apache.flink.api.java.typeutils.TupleTypeInfo) TypeHint(org.apache.flink.api.common.typeinfo.TypeHint) TIMESTAMP_LTZ(org.apache.flink.table.api.DataTypes.TIMESTAMP_LTZ) RawType(org.apache.flink.table.types.logical.RawType) ZoneOffset(java.time.ZoneOffset) TypeInformation(org.apache.flink.api.common.typeinfo.TypeInformation) FIELD(org.apache.flink.table.api.DataTypes.FIELD) Parameterized(org.junit.runners.Parameterized) AbstractTestBase(org.apache.flink.test.util.AbstractTestBase) DOUBLE(org.apache.flink.table.api.DataTypes.DOUBLE) TableConfig(org.apache.flink.table.api.TableConfig) Expressions.$(org.apache.flink.table.api.Expressions.$) TestValuesTableFactory(org.apache.flink.table.planner.factories.TestValuesTableFactory) WatermarkStrategy(org.apache.flink.api.common.eventtime.WatermarkStrategy) Table(org.apache.flink.table.api.Table) ResolvedExpressionMock(org.apache.flink.table.expressions.utils.ResolvedExpressionMock) ZoneId(java.time.ZoneId) Objects(java.util.Objects) Matchers.instanceOf(org.hamcrest.Matchers.instanceOf) CloseableIterator(org.apache.flink.util.CloseableIterator) List(java.util.List) ValueState(org.apache.flink.api.common.state.ValueState) TumblingEventTimeWindows(org.apache.flink.streaming.api.windowing.assigners.TumblingEventTimeWindows) Matchers.containsInAnyOrder(org.hamcrest.Matchers.containsInAnyOrder) STRUCTURED(org.apache.flink.table.api.DataTypes.STRUCTURED) TableResult(org.apache.flink.table.api.TableResult) Row(org.apache.flink.types.Row) StreamExecutionEnvironment(org.apache.flink.streaming.api.environment.StreamExecutionEnvironment) MAP(org.apache.flink.table.api.DataTypes.MAP) BOOLEAN(org.apache.flink.table.api.DataTypes.BOOLEAN) Either(org.apache.flink.types.Either) ChangelogMode(org.apache.flink.table.connector.ChangelogMode) ROW(org.apache.flink.table.api.DataTypes.ROW) Column(org.apache.flink.table.catalog.Column) RunWith(org.junit.runner.RunWith) Parameters(org.junit.runners.Parameterized.Parameters) LocalDateTime(java.time.LocalDateTime) Expressions.sourceWatermark(org.apache.flink.table.api.Expressions.sourceWatermark) DataStreamSource(org.apache.flink.streaming.api.datastream.DataStreamSource) KeyedProcessFunction(org.apache.flink.streaming.api.functions.KeyedProcessFunction) ArrayList(java.util.ArrayList) Collector(org.apache.flink.util.Collector) ProcessFunction(org.apache.flink.streaming.api.functions.ProcessFunction) MatcherAssert.assertThat(org.hamcrest.MatcherAssert.assertThat) INT(org.apache.flink.table.api.DataTypes.INT) Before(org.junit.Before) Types(org.apache.flink.api.common.typeinfo.Types) Time(org.apache.flink.streaming.api.windowing.time.Time) WatermarkSpec(org.apache.flink.table.catalog.WatermarkSpec) GenericTypeInfo(org.apache.flink.api.java.typeutils.GenericTypeInfo) Parameter(org.junit.runners.Parameterized.Parameter) ValueStateDescriptor(org.apache.flink.api.common.state.ValueStateDescriptor) Configuration(org.apache.flink.configuration.Configuration) 
SingleOutputStreamOperator(org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator) DataTypes(org.apache.flink.table.api.DataTypes) Test(org.junit.Test) IOException(java.io.IOException) CollectionUtil(org.apache.flink.util.CollectionUtil) DataStream(org.apache.flink.streaming.api.datastream.DataStream) RowKind(org.apache.flink.types.RowKind) DayOfWeek(java.time.DayOfWeek) TIMESTAMP(org.apache.flink.table.api.DataTypes.TIMESTAMP) EnumTypeInfo(org.apache.flink.api.java.typeutils.EnumTypeInfo) RuntimeExecutionMode(org.apache.flink.api.common.RuntimeExecutionMode) Collections(java.util.Collections) Assert.assertEquals(org.junit.Assert.assertEquals) Table(org.apache.flink.table.api.Table) Tuple3(org.apache.flink.api.java.tuple.Tuple3) StreamTableEnvironment(org.apache.flink.table.api.bridge.java.StreamTableEnvironment) Row(org.apache.flink.types.Row) TypeHint(org.apache.flink.api.common.typeinfo.TypeHint) Test(org.junit.Test)
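
For reference, a hedged sketch of the bounded-out-of-orderness watermarking that getWatermarkedDataStream() presumably applies before the changelog conversion; rawStream, the 5 ms bound, and the use of f0 as the event timestamp are assumptions, not taken from the test.

// Hedged sketch (assumptions: rawStream is some DataStream<Tuple3<Long, Integer, String>>,
// f0 carries the event timestamp in milliseconds, and java.time.Duration is imported).
DataStream<Tuple3<Long, Integer, String>> watermarked = rawStream
        .assignTimestampsAndWatermarks(
                WatermarkStrategy.<Tuple3<Long, Integer, String>>forBoundedOutOfOrderness(Duration.ofMillis(5))
                        .withTimestampAssigner((t, previousTimestamp) -> t.f0));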

Example 4 with Time

use of org.apache.flink.streaming.api.windowing.time.Time in project flink by apache.

the class CoGroupedStreamsTest method testDelegateToCoGrouped.

@Test
public void testDelegateToCoGrouped() {
    Time lateness = Time.milliseconds(42L);
    CoGroupedStreams.WithWindow<String, String, String, TimeWindow> withLateness = dataStream1.coGroup(dataStream2).where(keySelector).equalTo(keySelector).window(tsAssigner).allowedLateness(lateness);
    withLateness.apply(coGroupFunction, BasicTypeInfo.STRING_TYPE_INFO);
    Assert.assertEquals(lateness.toMilliseconds(), withLateness.getWindowedStream().getAllowedLateness());
}
Also used : Time(org.apache.flink.streaming.api.windowing.time.Time) TimeWindow(org.apache.flink.streaming.api.windowing.windows.TimeWindow) Test(org.junit.Test)
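
The delegation being asserted can be seen more directly on a plain windowed stream; the following is a hedged sketch using the same lateness value, reusing the test's dataStream1, keySelector, and tsAssigner fields, which are assumed to be defined elsewhere in the class.

// Hedged sketch: allowedLateness(Time) on an ordinary keyed window, which is the
// WindowedStream setting that the co-grouped variant above delegates to.
dataStream1
        .keyBy(keySelector)
        .window(tsAssigner)
        .allowedLateness(Time.milliseconds(42L))
        .reduce((a, b) -> a);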

Example 5 with Time

use of org.apache.flink.streaming.api.windowing.time.Time in project flink by apache.

the class FlinkKinesisConsumerTest method testPeriodicWatermark.

@Test
public void testPeriodicWatermark() throws Exception {
    String streamName = "fakeStreamName";
    Time maxOutOfOrderness = Time.milliseconds(5);
    long autoWatermarkInterval = 1_000;
    HashMap<String, String> subscribedStreamsToLastDiscoveredShardIds = new HashMap<>();
    subscribedStreamsToLastDiscoveredShardIds.put(streamName, null);
    KinesisDeserializationSchema<String> deserializationSchema = new KinesisDeserializationSchemaWrapper<>(new SimpleStringSchema());
    Properties props = new Properties();
    props.setProperty(ConsumerConfigConstants.AWS_REGION, "us-east-1");
    props.setProperty(ConsumerConfigConstants.SHARD_GETRECORDS_INTERVAL_MILLIS, Long.toString(10L));
    BlockingQueue<String> shard1 = new LinkedBlockingQueue<>();
    BlockingQueue<String> shard2 = new LinkedBlockingQueue<>();
    Map<String, List<BlockingQueue<String>>> streamToQueueMap = new HashMap<>();
    streamToQueueMap.put(streamName, Arrays.asList(shard1, shard2));
    // override createFetcher to mock Kinesis
    FlinkKinesisConsumer<String> sourceFunc = new FlinkKinesisConsumer<String>(streamName, deserializationSchema, props) {

        @Override
        protected KinesisDataFetcher<String> createFetcher(List<String> streams, SourceContext<String> sourceContext, RuntimeContext runtimeContext, Properties configProps, KinesisDeserializationSchema<String> deserializationSchema) {
            KinesisDataFetcher<String> fetcher = new KinesisDataFetcher<String>(streams, sourceContext, sourceContext.getCheckpointLock(), runtimeContext, configProps, deserializationSchema, getShardAssigner(), getPeriodicWatermarkAssigner(), null, new AtomicReference<>(), new ArrayList<>(), subscribedStreamsToLastDiscoveredShardIds, (props) -> FakeKinesisBehavioursFactory.blockingQueueGetRecords(streamToQueueMap), null) {
            };
            return fetcher;
        }
    };
    sourceFunc.setShardAssigner((streamShardHandle, i) -> {
        // shardId-000000000000
        return Integer.parseInt(streamShardHandle.getShard().getShardId().substring("shardId-".length()));
    });
    sourceFunc.setPeriodicWatermarkAssigner(new TestTimestampExtractor(maxOutOfOrderness));
    // there is currently no test harness specifically for sources,
    // so we overlay the source thread here
    AbstractStreamOperatorTestHarness<Object> testHarness = new AbstractStreamOperatorTestHarness<Object>(new StreamSource(sourceFunc), 1, 1, 0);
    testHarness.setTimeCharacteristic(TimeCharacteristic.EventTime);
    testHarness.getExecutionConfig().setAutoWatermarkInterval(autoWatermarkInterval);
    testHarness.initializeEmptyState();
    testHarness.open();
    ConcurrentLinkedQueue<Watermark> watermarks = new ConcurrentLinkedQueue<>();
    @SuppressWarnings("unchecked") SourceFunction.SourceContext<String> sourceContext = new CollectingSourceContext(testHarness.getCheckpointLock(), testHarness.getOutput()) {

        @Override
        public void emitWatermark(Watermark mark) {
            watermarks.add(mark);
        }

        @Override
        public void markAsTemporarilyIdle() {
        }
    };
    new Thread(() -> {
        try {
            sourceFunc.run(sourceContext);
        } catch (InterruptedException e) {
            // expected on cancel
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }).start();
    shard1.put("1");
    shard1.put("2");
    shard2.put("10");
    int recordCount = 3;
    int watermarkCount = 0;
    awaitRecordCount(testHarness.getOutput(), recordCount);
    // Trigger watermark emit, first watermark is -3
    // - Shard-1 @2
    // - Shard-2 @10
    // - Watermark = min(2, 10) - maxOutOfOrderness = 2 - 5 = -3
    testHarness.setProcessingTime(testHarness.getProcessingTime() + autoWatermarkInterval);
    watermarkCount++;
    // advance watermark
    shard1.put("10");
    recordCount++;
    awaitRecordCount(testHarness.getOutput(), recordCount);
    // Trigger watermark emit, second watermark is 5
    // - Shard-1 @10
    // - Shard-2 @10
    // - Watermark = min(10, 10) - maxOutOfOrderness = 10 - 5 = 5
    testHarness.setProcessingTime(testHarness.getProcessingTime() + autoWatermarkInterval);
    watermarkCount++;
    sourceFunc.cancel();
    testHarness.close();
    assertEquals("record count", recordCount, testHarness.getOutput().size());
    assertThat(watermarks, org.hamcrest.Matchers.contains(new Watermark(-3), new Watermark(5)));
    assertEquals("watermark count", watermarkCount, watermarks.size());
}
Also used : HashMap(java.util.HashMap) Time(org.apache.flink.streaming.api.windowing.time.Time) Properties(java.util.Properties) LinkedBlockingQueue(java.util.concurrent.LinkedBlockingQueue) CollectingSourceContext(org.apache.flink.streaming.util.CollectingSourceContext) AbstractStreamOperatorTestHarness(org.apache.flink.streaming.util.AbstractStreamOperatorTestHarness) List(java.util.List) ArrayList(java.util.ArrayList) KinesisDeserializationSchema(org.apache.flink.streaming.connectors.kinesis.serialization.KinesisDeserializationSchema) SourceFunction(org.apache.flink.streaming.api.functions.source.SourceFunction) StreamSource(org.apache.flink.streaming.api.operators.StreamSource) KinesisDeserializationSchemaWrapper(org.apache.flink.streaming.connectors.kinesis.serialization.KinesisDeserializationSchemaWrapper) CollectingSourceContext(org.apache.flink.streaming.util.CollectingSourceContext) KinesisDataFetcher(org.apache.flink.streaming.connectors.kinesis.internals.KinesisDataFetcher) TestableFlinkKinesisConsumer(org.apache.flink.streaming.connectors.kinesis.testutils.TestableFlinkKinesisConsumer) SimpleStringSchema(org.apache.flink.api.common.serialization.SimpleStringSchema) RuntimeContext(org.apache.flink.api.common.functions.RuntimeContext) ConcurrentLinkedQueue(java.util.concurrent.ConcurrentLinkedQueue) Watermark(org.apache.flink.streaming.api.watermark.Watermark) PrepareForTest(org.powermock.core.classloader.annotations.PrepareForTest) Test(org.junit.Test)
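
TestTimestampExtractor is not shown above; a plausible minimal implementation, assuming it simply treats the string payload as the event timestamp (which matches the min(2, 10) - 5 = -3 arithmetic in the comments), would be the following sketch.

// Hedged sketch of TestTimestampExtractor (an assumption, not the test's actual class):
// a bounded-out-of-orderness extractor that parses each record as its event timestamp.
// Assumes org.apache.flink.streaming.api.functions.timestamps.BoundedOutOfOrdernessTimestampExtractor is imported.
private static class TestTimestampExtractor
        extends BoundedOutOfOrdernessTimestampExtractor<String> {

    TestTimestampExtractor(Time maxOutOfOrderness) {
        super(maxOutOfOrderness);
    }

    @Override
    public long extractTimestamp(String element) {
        // record "2" becomes timestamp 2, so the first emitted watermark is 2 - 5 = -3
        return Long.parseLong(element);
    }
}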

Aggregations

Time (org.apache.flink.streaming.api.windowing.time.Time) 13
Test (org.junit.Test) 11
TimeWindow (org.apache.flink.streaming.api.windowing.windows.TimeWindow) 7
WatermarkStrategy (org.apache.flink.api.common.eventtime.WatermarkStrategy) 4
DataStream (org.apache.flink.streaming.api.datastream.DataStream) 4
SingleOutputStreamOperator (org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator) 4
StreamExecutionEnvironment (org.apache.flink.streaming.api.environment.StreamExecutionEnvironment) 4
TumblingEventTimeWindows (org.apache.flink.streaming.api.windowing.assigners.TumblingEventTimeWindows) 4
Collector (org.apache.flink.util.Collector) 4
Kryo (com.esotericsoftware.kryo.Kryo) 3
JavaSerializer (com.esotericsoftware.kryo.serializers.JavaSerializer) 3
File (java.io.File) 3
ArrayList (java.util.ArrayList) 3
HashMap (java.util.HashMap) 3
List (java.util.List) 3
ReduceFunction (org.apache.flink.api.common.functions.ReduceFunction) 3
ListStateDescriptor (org.apache.flink.api.common.state.ListStateDescriptor) 3
StateDescriptor (org.apache.flink.api.common.state.StateDescriptor) 3
TypeSerializer (org.apache.flink.api.common.typeutils.TypeSerializer) 3
ListSerializer (org.apache.flink.api.common.typeutils.base.ListSerializer) 3