Search in sources :

Example 11 with IntegerSerde

use of org.apache.samza.serializers.IntegerSerde in project samza by apache.

the class TestLocalTableWithSideInputsEndToEnd method testLowLevelJoinWithSideInputsTable.

@Test
public void testLowLevelJoinWithSideInputsTable() throws InterruptedException {
    int partitionCount = 4;
    IntegerSerde integerSerde = new IntegerSerde();
    // for low-level, need to pre-partition the input in the same way that the profiles are partitioned
    Map<Integer, List<PageView>> pageViewsPartitionedByMemberId = TestTableData.generatePartitionedPageViews(20, partitionCount).values().stream().flatMap(List::stream).collect(Collectors.groupingBy(pageView -> Math.abs(Arrays.hashCode(integerSerde.toBytes(pageView.getMemberId()))) % partitionCount));
    runTest(new LowLevelPageViewProfileJoin(), pageViewsPartitionedByMemberId, TestTableData.generatePartitionedProfiles(10, partitionCount));
}
Also used : RocksDbTableDescriptor(org.apache.samza.storage.kv.descriptors.RocksDbTableDescriptor) Arrays(java.util.Arrays) InMemorySystemDescriptor(org.apache.samza.test.framework.system.descriptors.InMemorySystemDescriptor) TableDescriptor(org.apache.samza.table.descriptors.TableDescriptor) Profile(org.apache.samza.test.table.TestTableData.Profile) InMemoryInputDescriptor(org.apache.samza.test.framework.system.descriptors.InMemoryInputDescriptor) ImmutableList(com.google.common.collect.ImmutableList) InitableTask(org.apache.samza.task.InitableTask) MessageCollector(org.apache.samza.task.MessageCollector) SystemStream(org.apache.samza.system.SystemStream) DelegatingSystemDescriptor(org.apache.samza.system.descriptors.DelegatingSystemDescriptor) Duration(java.time.Duration) Map(java.util.Map) StreamTask(org.apache.samza.task.StreamTask) SamzaApplication(org.apache.samza.application.SamzaApplication) ApplicationDescriptor(org.apache.samza.application.descriptors.ApplicationDescriptor) KV(org.apache.samza.operators.KV) IntegerSerde(org.apache.samza.serializers.IntegerSerde) NoOpSerde(org.apache.samza.serializers.NoOpSerde) ProfileJsonSerde(org.apache.samza.test.table.TestTableData.ProfileJsonSerde) InMemoryTableDescriptor(org.apache.samza.storage.kv.inmemory.descriptors.InMemoryTableDescriptor) Table(org.apache.samza.table.Table) StreamAssert(org.apache.samza.test.framework.StreamAssert) TaskApplicationDescriptor(org.apache.samza.application.descriptors.TaskApplicationDescriptor) IncomingMessageEnvelope(org.apache.samza.system.IncomingMessageEnvelope) InMemoryOutputDescriptor(org.apache.samza.test.framework.system.descriptors.InMemoryOutputDescriptor) ImmutableMap(com.google.common.collect.ImmutableMap) TaskApplication(org.apache.samza.application.TaskApplication) StreamTaskFactory(org.apache.samza.task.StreamTaskFactory) EnrichedPageView(org.apache.samza.test.table.TestTableData.EnrichedPageView) Test(org.junit.Test) Collectors(java.util.stream.Collectors) TaskCoordinator(org.apache.samza.task.TaskCoordinator) Context(org.apache.samza.context.Context) TestRunner(org.apache.samza.test.framework.TestRunner) List(java.util.List) Entry(org.apache.samza.storage.kv.Entry) ReadWriteUpdateTable(org.apache.samza.table.ReadWriteUpdateTable) StreamApplicationDescriptor(org.apache.samza.application.descriptors.StreamApplicationDescriptor) Optional(java.util.Optional) OutgoingMessageEnvelope(org.apache.samza.system.OutgoingMessageEnvelope) KVSerde(org.apache.samza.serializers.KVSerde) PageView(org.apache.samza.test.table.TestTableData.PageView) StreamApplication(org.apache.samza.application.StreamApplication) ImmutableList(com.google.common.collect.ImmutableList) List(java.util.List) IntegerSerde(org.apache.samza.serializers.IntegerSerde) Test(org.junit.Test)

Example 12 with IntegerSerde

use of org.apache.samza.serializers.IntegerSerde in project samza by apache.

the class TestSchedulerFunction method testImmediateTimer.

@Test
public void testImmediateTimer() {
    final InMemorySystemDescriptor isd = new InMemorySystemDescriptor("test");
    final InMemoryInputDescriptor<Integer> imid = isd.getInputDescriptor("test-input", new IntegerSerde());
    StreamApplication app = new StreamApplication() {

        @Override
        public void describe(StreamApplicationDescriptor appDescriptor) {
            appDescriptor.getInputStream(imid).map(new TestFunction());
        }
    };
    TestRunner.of(app).addInputStream(imid, Arrays.asList(1, 2, 3, 4, 5)).run(Duration.ofSeconds(1));
    assertTrue(timerFired.get());
}
Also used : StreamApplicationDescriptor(org.apache.samza.application.descriptors.StreamApplicationDescriptor) StreamApplication(org.apache.samza.application.StreamApplication) InMemorySystemDescriptor(org.apache.samza.test.framework.system.descriptors.InMemorySystemDescriptor) IntegerSerde(org.apache.samza.serializers.IntegerSerde) Test(org.junit.Test)

Example 13 with IntegerSerde

use of org.apache.samza.serializers.IntegerSerde in project samza by apache.

the class SessionWindowApp method describe.

@Override
public void describe(StreamApplicationDescriptor appDescriptor) {
    JsonSerdeV2<PageView> inputSerde = new JsonSerdeV2<>(PageView.class);
    KVSerde<String, Integer> outputSerde = KVSerde.of(new StringSerde(), new IntegerSerde());
    KafkaSystemDescriptor ksd = new KafkaSystemDescriptor(SYSTEM);
    KafkaInputDescriptor<PageView> id = ksd.getInputDescriptor(INPUT_TOPIC, inputSerde);
    KafkaOutputDescriptor<KV<String, Integer>> od = ksd.getOutputDescriptor(OUTPUT_TOPIC, outputSerde);
    MessageStream<PageView> pageViews = appDescriptor.getInputStream(id);
    OutputStream<KV<String, Integer>> outputStream = appDescriptor.getOutputStream(od);
    pageViews.filter(m -> !FILTER_KEY.equals(m.getUserId())).window(Windows.keyedSessionWindow(PageView::getUserId, Duration.ofSeconds(3), new StringSerde(), new JsonSerdeV2<>(PageView.class)), "sessionWindow").map(m -> KV.of(m.getKey().getKey(), m.getMessage().size())).sendTo(outputStream);
}
Also used : ApplicationRunner(org.apache.samza.runtime.ApplicationRunner) Windows(org.apache.samza.operators.windows.Windows) KafkaInputDescriptor(org.apache.samza.system.kafka.descriptors.KafkaInputDescriptor) CommandLine(org.apache.samza.util.CommandLine) KafkaSystemDescriptor(org.apache.samza.system.kafka.descriptors.KafkaSystemDescriptor) PageView(org.apache.samza.test.operator.data.PageView) StringSerde(org.apache.samza.serializers.StringSerde) KafkaOutputDescriptor(org.apache.samza.system.kafka.descriptors.KafkaOutputDescriptor) StreamApplicationDescriptor(org.apache.samza.application.descriptors.StreamApplicationDescriptor) Duration(java.time.Duration) Config(org.apache.samza.config.Config) ApplicationRunners(org.apache.samza.runtime.ApplicationRunners) JsonSerdeV2(org.apache.samza.serializers.JsonSerdeV2) KVSerde(org.apache.samza.serializers.KVSerde) StreamApplication(org.apache.samza.application.StreamApplication) KV(org.apache.samza.operators.KV) OutputStream(org.apache.samza.operators.OutputStream) IntegerSerde(org.apache.samza.serializers.IntegerSerde) MessageStream(org.apache.samza.operators.MessageStream) PageView(org.apache.samza.test.operator.data.PageView) StringSerde(org.apache.samza.serializers.StringSerde) KV(org.apache.samza.operators.KV) KafkaSystemDescriptor(org.apache.samza.system.kafka.descriptors.KafkaSystemDescriptor) JsonSerdeV2(org.apache.samza.serializers.JsonSerdeV2) IntegerSerde(org.apache.samza.serializers.IntegerSerde)

Example 14 with IntegerSerde

use of org.apache.samza.serializers.IntegerSerde in project samza by apache.

the class TumblingWindowApp method describe.

@Override
public void describe(StreamApplicationDescriptor appDescriptor) {
    JsonSerdeV2<PageView> inputSerde = new JsonSerdeV2<>(PageView.class);
    KVSerde<String, Integer> outputSerde = KVSerde.of(new StringSerde(), new IntegerSerde());
    KafkaSystemDescriptor ksd = new KafkaSystemDescriptor(SYSTEM);
    KafkaInputDescriptor<PageView> id = ksd.getInputDescriptor(INPUT_TOPIC, inputSerde);
    KafkaOutputDescriptor<KV<String, Integer>> od = ksd.getOutputDescriptor(OUTPUT_TOPIC, outputSerde);
    MessageStream<PageView> pageViews = appDescriptor.getInputStream(id);
    OutputStream<KV<String, Integer>> outputStream = appDescriptor.getOutputStream(od);
    pageViews.filter(m -> !FILTER_KEY.equals(m.getUserId())).window(Windows.keyedTumblingWindow(PageView::getUserId, Duration.ofSeconds(3), new StringSerde(), new JsonSerdeV2<>(PageView.class)), "tumblingWindow").map(m -> KV.of(m.getKey().getKey(), m.getMessage().size())).sendTo(outputStream);
}
Also used : ApplicationRunner(org.apache.samza.runtime.ApplicationRunner) Windows(org.apache.samza.operators.windows.Windows) KafkaInputDescriptor(org.apache.samza.system.kafka.descriptors.KafkaInputDescriptor) CommandLine(org.apache.samza.util.CommandLine) KafkaSystemDescriptor(org.apache.samza.system.kafka.descriptors.KafkaSystemDescriptor) PageView(org.apache.samza.test.operator.data.PageView) StringSerde(org.apache.samza.serializers.StringSerde) KafkaOutputDescriptor(org.apache.samza.system.kafka.descriptors.KafkaOutputDescriptor) StreamApplicationDescriptor(org.apache.samza.application.descriptors.StreamApplicationDescriptor) Duration(java.time.Duration) Config(org.apache.samza.config.Config) ApplicationRunners(org.apache.samza.runtime.ApplicationRunners) JsonSerdeV2(org.apache.samza.serializers.JsonSerdeV2) KVSerde(org.apache.samza.serializers.KVSerde) StreamApplication(org.apache.samza.application.StreamApplication) KV(org.apache.samza.operators.KV) OutputStream(org.apache.samza.operators.OutputStream) IntegerSerde(org.apache.samza.serializers.IntegerSerde) MessageStream(org.apache.samza.operators.MessageStream) PageView(org.apache.samza.test.operator.data.PageView) StringSerde(org.apache.samza.serializers.StringSerde) KV(org.apache.samza.operators.KV) KafkaSystemDescriptor(org.apache.samza.system.kafka.descriptors.KafkaSystemDescriptor) JsonSerdeV2(org.apache.samza.serializers.JsonSerdeV2) IntegerSerde(org.apache.samza.serializers.IntegerSerde)

Example 15 with IntegerSerde

use of org.apache.samza.serializers.IntegerSerde in project samza by apache.

the class WindowExample method describe.

@Override
public void describe(StreamApplicationDescriptor appDescriptor) {
    KafkaSystemDescriptor trackingSystem = new KafkaSystemDescriptor("tracking");
    KafkaInputDescriptor<PageViewEvent> inputStreamDescriptor = trackingSystem.getInputDescriptor("pageViewEvent", new JsonSerdeV2<>(PageViewEvent.class));
    KafkaOutputDescriptor<Integer> outputStreamDescriptor = trackingSystem.getOutputDescriptor("pageViewEventPerMember", new IntegerSerde());
    SupplierFunction<Integer> initialValue = () -> 0;
    FoldLeftFunction<PageViewEvent, Integer> counter = (m, c) -> c == null ? 1 : c + 1;
    MessageStream<PageViewEvent> inputStream = appDescriptor.getInputStream(inputStreamDescriptor);
    OutputStream<Integer> outputStream = appDescriptor.getOutputStream(outputStreamDescriptor);
    // create a tumbling window that outputs the number of message collected every 10 minutes.
    // also emit early results if either the number of messages collected reaches 30000, or if no new messages arrive
    // for 1 minute.
    inputStream.window(Windows.tumblingWindow(Duration.ofMinutes(10), initialValue, counter, new IntegerSerde()).setLateTrigger(Triggers.any(Triggers.count(30000), Triggers.timeSinceLastMessage(Duration.ofMinutes(1)))), "window").map(WindowPane::getMessage).sendTo(outputStream);
}
Also used : ApplicationRunner(org.apache.samza.runtime.ApplicationRunner) Windows(org.apache.samza.operators.windows.Windows) KafkaInputDescriptor(org.apache.samza.system.kafka.descriptors.KafkaInputDescriptor) CommandLine(org.apache.samza.util.CommandLine) KafkaSystemDescriptor(org.apache.samza.system.kafka.descriptors.KafkaSystemDescriptor) PageViewEvent(org.apache.samza.example.models.PageViewEvent) Triggers(org.apache.samza.operators.triggers.Triggers) WindowPane(org.apache.samza.operators.windows.WindowPane) KafkaOutputDescriptor(org.apache.samza.system.kafka.descriptors.KafkaOutputDescriptor) StreamApplicationDescriptor(org.apache.samza.application.descriptors.StreamApplicationDescriptor) Duration(java.time.Duration) Config(org.apache.samza.config.Config) ApplicationRunners(org.apache.samza.runtime.ApplicationRunners) JsonSerdeV2(org.apache.samza.serializers.JsonSerdeV2) StreamApplication(org.apache.samza.application.StreamApplication) OutputStream(org.apache.samza.operators.OutputStream) FoldLeftFunction(org.apache.samza.operators.functions.FoldLeftFunction) IntegerSerde(org.apache.samza.serializers.IntegerSerde) SupplierFunction(org.apache.samza.operators.functions.SupplierFunction) MessageStream(org.apache.samza.operators.MessageStream) PageViewEvent(org.apache.samza.example.models.PageViewEvent) KafkaSystemDescriptor(org.apache.samza.system.kafka.descriptors.KafkaSystemDescriptor) IntegerSerde(org.apache.samza.serializers.IntegerSerde)

Aggregations

IntegerSerde (org.apache.samza.serializers.IntegerSerde)24 Test (org.junit.Test)17 StreamApplication (org.apache.samza.application.StreamApplication)9 KVSerde (org.apache.samza.serializers.KVSerde)9 Config (org.apache.samza.config.Config)8 Duration (java.time.Duration)7 HashMap (java.util.HashMap)7 StreamApplicationDescriptor (org.apache.samza.application.descriptors.StreamApplicationDescriptor)7 MapConfig (org.apache.samza.config.MapConfig)7 KV (org.apache.samza.operators.KV)6 List (java.util.List)5 MessageStream (org.apache.samza.operators.MessageStream)5 ArrayList (java.util.ArrayList)4 Map (java.util.Map)4 Context (org.apache.samza.context.Context)4 StringSerde (org.apache.samza.serializers.StringSerde)4 GenericInputDescriptor (org.apache.samza.system.descriptors.GenericInputDescriptor)4 InMemorySystemDescriptor (org.apache.samza.test.framework.system.descriptors.InMemorySystemDescriptor)4 ImmutableList (com.google.common.collect.ImmutableList)3 OutputStream (org.apache.samza.operators.OutputStream)3