Example 1 with TimestampedValue

use of org.apache.samza.util.TimestampedValue in project samza by apache.

the class TestTimeSeriesStoreImpl method testPutWithMultipleEntries.

@Test
public void testPutWithMultipleEntries() {
    TimeSeriesStore<String, byte[]> timeSeriesStore = newTimeSeriesStore(new StringSerde("UTF-8"), true);
    // insert 100 entries at each of the timestamps 1 and 2
    for (int i = 0; i < 100; i++) {
        timeSeriesStore.put("hello", "world-1".getBytes(), 1L);
        timeSeriesStore.put("hello", "world-2".getBytes(), 2L);
    }
    // read from time-range [0,2) should return 100 entries
    List<TimestampedValue<byte[]>> values = readStore(timeSeriesStore, "hello", 0L, 2L);
    Assert.assertEquals(100, values.size());
    values.forEach(timeSeriesValue -> {
        Assert.assertEquals("world-1", new String(timeSeriesValue.getValue()));
    });
    // read from time-range [2,4) should return 100 entries
    values = readStore(timeSeriesStore, "hello", 2L, 4L);
    Assert.assertEquals(100, values.size());
    values.forEach(timeSeriesValue -> {
        Assert.assertEquals("world-2", new String(timeSeriesValue.getValue()));
    });
    // read all entries in the store
    values = readStore(timeSeriesStore, "hello", 0L, Integer.MAX_VALUE);
    Assert.assertEquals(200, values.size());
}
Also used : StringSerde(org.apache.samza.serializers.StringSerde) TimestampedValue(org.apache.samza.util.TimestampedValue) Test(org.junit.Test)

Example 2 with TimestampedValue

use of org.apache.samza.util.TimestampedValue in project samza by apache.

the class TestTimeSeriesStoreImpl method testDeletesInOverwriteMode.

@Test
public void testDeletesInOverwriteMode() {
    // instantiate a store in overwrite mode
    TimeSeriesStore<String, byte[]> timeSeriesStore = newTimeSeriesStore(new StringSerde("UTF-8"), false);
    // insert entries with key "hello": one at timestamp 1 and two at timestamp 2 (the later write at timestamp 2 overwrites the earlier one)
    timeSeriesStore.put("hello", "world-1".getBytes(), 1L);
    timeSeriesStore.put("hello", "world-1".getBytes(), 2L);
    timeSeriesStore.put("hello", "world-2".getBytes(), 2L);
    List<TimestampedValue<byte[]>> values = readStore(timeSeriesStore, "hello", 1L, 3L);
    Assert.assertEquals(2, values.size());
    timeSeriesStore.remove("hello", 0L, 3L);
    values = readStore(timeSeriesStore, "hello", 1L, 3L);
    Assert.assertEquals(0, values.size());
}
Also used : StringSerde(org.apache.samza.serializers.StringSerde) TimestampedValue(org.apache.samza.util.TimestampedValue) Test(org.junit.Test)

Example 3 with TimestampedValue

use of org.apache.samza.util.TimestampedValue in project samza by apache.

the class TestTimeSeriesStoreImpl method testGetOnTimestampBoundaries.

@Test
public void testGetOnTimestampBoundaries() {
    TimeSeriesStore<String, byte[]> timeSeriesStore = newTimeSeriesStore(new StringSerde("UTF-8"), true);
    // insert entries with key "hello": one at timestamp 1 and two at timestamp 2
    timeSeriesStore.put("hello", "world-1".getBytes(), 1L);
    timeSeriesStore.put("hello", "world-1".getBytes(), 2L);
    timeSeriesStore.put("hello", "world-2".getBytes(), 2L);
    // read from time-range [0,1) should return no entries
    List<TimestampedValue<byte[]>> values = readStore(timeSeriesStore, "hello", 0L, 1L);
    Assert.assertEquals(0, values.size());
    // read from time-range [1,2) should return one entry
    values = readStore(timeSeriesStore, "hello", 1L, 2L);
    Assert.assertEquals(1, values.size());
    Assert.assertEquals("world-1", new String(values.get(0).getValue()));
    // read from time-range [2,3) should return two entries
    values = readStore(timeSeriesStore, "hello", 2L, 3L);
    Assert.assertEquals(2, values.size());
    Assert.assertEquals("world-1", new String(values.get(0).getValue()));
    Assert.assertEquals(2L, values.get(0).getTimestamp());
    // read from time-range [0,3) should return three entries
    values = readStore(timeSeriesStore, "hello", 0L, 3L);
    Assert.assertEquals(3, values.size());
    // read from time-range [2,999999) should return two entries
    values = readStore(timeSeriesStore, "hello", 2L, 999999L);
    Assert.assertEquals(2, values.size());
    // read from time-range [3,4) should return no entries
    values = readStore(timeSeriesStore, "hello", 3L, 4L);
    Assert.assertEquals(0, values.size());
}
Also used : StringSerde(org.apache.samza.serializers.StringSerde) TimestampedValue(org.apache.samza.util.TimestampedValue) Test(org.junit.Test)

Example 4 with TimestampedValue

use of org.apache.samza.util.TimestampedValue in project samza by apache.

the class TestTimeSeriesStoreImpl method testGetOnTimestampBoundariesWithOverwriteMode.

@Test
public void testGetOnTimestampBoundariesWithOverwriteMode() {
    // instantiate a store in overwrite mode
    TimeSeriesStore<String, byte[]> timeSeriesStore = newTimeSeriesStore(new StringSerde("UTF-8"), false);
    // insert entries with key "hello": one at timestamp 1 and two at timestamp 2 (the later write at timestamp 2 overwrites the earlier one)
    timeSeriesStore.put("hello", "world-1".getBytes(), 1L);
    timeSeriesStore.put("hello", "world-1".getBytes(), 2L);
    timeSeriesStore.put("hello", "world-2".getBytes(), 2L);
    // read from time-range [0,1) should return no entries
    List<TimestampedValue<byte[]>> values = readStore(timeSeriesStore, "hello", 0L, 1L);
    Assert.assertEquals(0, values.size());
    // read from time-range [1,2) should return one entry
    values = readStore(timeSeriesStore, "hello", 1L, 2L);
    Assert.assertEquals(1, values.size());
    Assert.assertEquals("world-1", new String(values.get(0).getValue()));
    // read from time-range [2,3) should return the most recent entry
    values = readStore(timeSeriesStore, "hello", 2L, 3L);
    Assert.assertEquals(1, values.size());
    Assert.assertEquals("world-2", new String(values.get(0).getValue()));
    Assert.assertEquals(2L, values.get(0).getTimestamp());
    // read from time-range [0,3) should return two entries
    values = readStore(timeSeriesStore, "hello", 0L, 3L);
    Assert.assertEquals(2, values.size());
    // read from time-range [2,999999) should return one entry
    values = readStore(timeSeriesStore, "hello", 2L, 999999L);
    Assert.assertEquals(1, values.size());
    // read from time-range [3,4) should return no entries
    values = readStore(timeSeriesStore, "hello", 3L, 4L);
    Assert.assertEquals(0, values.size());
}
Also used : StringSerde(org.apache.samza.serializers.StringSerde) TimestampedValue(org.apache.samza.util.TimestampedValue) Test(org.junit.Test)
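
The boundary tests above rely on two behaviours: reads cover a half-open range [start, end), and the store either keeps every value written at a timestamp or only the latest one. The following is a minimal in-memory sketch of those semantics. TimeRangeSketch and its single-key put/get methods are hypothetical names for illustration only; this is not Samza's TimeSeriesStore implementation, and the meaning of the boolean flag is inferred from the test comments.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

/** Hypothetical single-key stand-in for a time-series store; not Samza's implementation. */
final class TimeRangeSketch {
    // true: keep every value written at a timestamp; false: keep only the latest (overwrite mode)
    private final boolean appendMode;
    // timestamp -> values written at that timestamp, sorted by timestamp
    private final TreeMap<Long, List<String>> entries = new TreeMap<>();

    TimeRangeSketch(boolean appendMode) {
        this.appendMode = appendMode;
    }

    void put(String value, long timestamp) {
        List<String> slot = entries.computeIfAbsent(timestamp, t -> new ArrayList<>());
        if (!appendMode) {
            slot.clear(); // overwrite mode: drop the earlier value at this timestamp
        }
        slot.add(value);
    }

    /** Returns values whose timestamp satisfies startInclusive <= timestamp < endExclusive. */
    List<String> get(long startInclusive, long endExclusive) {
        List<String> result = new ArrayList<>();
        if (startInclusive >= endExclusive) {
            return result; // empty or inverted range
        }
        for (List<String> slot : entries.subMap(startInclusive, true, endExclusive, false).values()) {
            result.addAll(slot);
        }
        return result;
    }

    public static void main(String[] args) {
        for (boolean append : new boolean[] {true, false}) {
            TimeRangeSketch store = new TimeRangeSketch(append);
            store.put("world-1", 1L);
            store.put("world-1", 2L);
            store.put("world-2", 2L);
            System.out.println("append=" + append
                    + " [1,2)=" + store.get(1L, 2L).size()
                    + " [2,3)=" + store.get(2L, 3L).size()
                    + " [0,3)=" + store.get(0L, 3L).size());
        }
    }
}
```

With the same three writes as the tests, the sketch returns 1, 2 and 3 values for [1,2), [2,3) and [0,3) when every value is kept, and 1, 1 and 2 values in overwrite mode, matching the assertions in Examples 3 and 4.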

Example 5 with TimestampedValue

use of org.apache.samza.util.TimestampedValue in project samza by apache.

the class TestOperatorImplGraph method testJoinChain.

@Test
public void testJoinChain() {
    String inputStreamId1 = "input1";
    String inputStreamId2 = "input2";
    String inputSystem = "input-system";
    String inputPhysicalName1 = "input-stream1";
    String inputPhysicalName2 = "input-stream2";
    HashMap<String, String> configs = new HashMap<>();
    configs.put(JobConfig.JOB_NAME, "jobName");
    configs.put(JobConfig.JOB_ID, "jobId");
    StreamTestUtils.addStreamConfigs(configs, inputStreamId1, inputSystem, inputPhysicalName1);
    StreamTestUtils.addStreamConfigs(configs, inputStreamId2, inputSystem, inputPhysicalName2);
    Config config = new MapConfig(configs);
    when(this.context.getJobContext().getConfig()).thenReturn(config);
    Integer joinKey = Integer.valueOf(1);
    Function<Object, Integer> keyFn = (Function & Serializable) m -> joinKey;
    JoinFunction testJoinFunction = new TestJoinFunction("jobName-jobId-join-j1", (BiFunction & Serializable) (m1, m2) -> KV.of(m1, m2), keyFn, keyFn);
    StreamApplicationDescriptorImpl graphSpec = new StreamApplicationDescriptorImpl(appDesc -> {
        GenericSystemDescriptor sd = new GenericSystemDescriptor(inputSystem, "mockFactoryClass");
        GenericInputDescriptor inputDescriptor1 = sd.getInputDescriptor(inputStreamId1, mock(Serde.class));
        GenericInputDescriptor inputDescriptor2 = sd.getInputDescriptor(inputStreamId2, mock(Serde.class));
        MessageStream<Object> inputStream1 = appDesc.getInputStream(inputDescriptor1);
        MessageStream<Object> inputStream2 = appDesc.getInputStream(inputDescriptor2);
        inputStream1.join(inputStream2, testJoinFunction, mock(Serde.class), mock(Serde.class), mock(Serde.class), Duration.ofHours(1), "j1");
    }, config);
    TaskName mockTaskName = mock(TaskName.class);
    TaskModel taskModel = mock(TaskModel.class);
    when(taskModel.getTaskName()).thenReturn(mockTaskName);
    when(this.context.getTaskContext().getTaskModel()).thenReturn(taskModel);
    KeyValueStore mockLeftStore = mock(KeyValueStore.class);
    when(this.context.getTaskContext().getStore(eq("jobName-jobId-join-j1-L"))).thenReturn(mockLeftStore);
    KeyValueStore mockRightStore = mock(KeyValueStore.class);
    when(this.context.getTaskContext().getStore(eq("jobName-jobId-join-j1-R"))).thenReturn(mockRightStore);
    OperatorImplGraph opImplGraph = new OperatorImplGraph(graphSpec.getOperatorSpecGraph(), this.context, mock(Clock.class));
    // verify that join function is initialized once.
    assertEquals(1, TestJoinFunction.getInstanceByTaskName(mockTaskName, "jobName-jobId-join-j1").numInitCalled);
    InputOperatorImpl inputOpImpl1 = opImplGraph.getInputOperator(new SystemStream(inputSystem, inputPhysicalName1));
    InputOperatorImpl inputOpImpl2 = opImplGraph.getInputOperator(new SystemStream(inputSystem, inputPhysicalName2));
    PartialJoinOperatorImpl leftPartialJoinOpImpl = (PartialJoinOperatorImpl) inputOpImpl1.registeredOperators.iterator().next();
    PartialJoinOperatorImpl rightPartialJoinOpImpl = (PartialJoinOperatorImpl) inputOpImpl2.registeredOperators.iterator().next();
    assertEquals(leftPartialJoinOpImpl.getOperatorSpec(), rightPartialJoinOpImpl.getOperatorSpec());
    assertNotSame(leftPartialJoinOpImpl, rightPartialJoinOpImpl);
    // verify that left partial join operator calls getFirstKey
    Object mockLeftMessage = mock(Object.class);
    long currentTimeMillis = System.currentTimeMillis();
    when(mockLeftStore.get(eq(joinKey))).thenReturn(new TimestampedValue<>(mockLeftMessage, currentTimeMillis));
    IncomingMessageEnvelope leftMessage = new IncomingMessageEnvelope(mock(SystemStreamPartition.class), "", "", mockLeftMessage);
    inputOpImpl1.onMessage(leftMessage, mock(MessageCollector.class), mock(TaskCoordinator.class));
    // verify that right partial join operator calls getSecondKey
    Object mockRightMessage = mock(Object.class);
    when(mockRightStore.get(eq(joinKey))).thenReturn(new TimestampedValue<>(mockRightMessage, currentTimeMillis));
    IncomingMessageEnvelope rightMessage = new IncomingMessageEnvelope(mock(SystemStreamPartition.class), "", "", mockRightMessage);
    inputOpImpl2.onMessage(rightMessage, mock(MessageCollector.class), mock(TaskCoordinator.class));
    // verify that the join function apply is called with the correct messages on match
    assertEquals(1, ((TestJoinFunction) TestJoinFunction.getInstanceByTaskName(mockTaskName, "jobName-jobId-join-j1")).joinResults.size());
    KV joinResult = (KV) ((TestJoinFunction) TestJoinFunction.getInstanceByTaskName(mockTaskName, "jobName-jobId-join-j1")).joinResults.iterator().next();
    assertEquals(mockLeftMessage, joinResult.getKey());
    assertEquals(mockRightMessage, joinResult.getValue());
}
Also used : StreamApplicationDescriptorImpl(org.apache.samza.application.descriptors.StreamApplicationDescriptorImpl) BiFunction(java.util.function.BiFunction) Assert.assertNotSame(org.junit.Assert.assertNotSame) TaskModel(org.apache.samza.job.model.TaskModel) TimestampedValue(org.apache.samza.util.TimestampedValue) GenericInputDescriptor(org.apache.samza.system.descriptors.GenericInputDescriptor) StringSerde(org.apache.samza.serializers.StringSerde) HashMultimap(com.google.common.collect.HashMultimap) Matchers.eq(org.mockito.Matchers.eq) After(org.junit.After) Duration(java.time.Duration) Map(java.util.Map) MapConfig(org.apache.samza.config.MapConfig) KV(org.apache.samza.operators.KV) TaskName(org.apache.samza.container.TaskName) IncomingMessageEnvelope(org.apache.samza.system.IncomingMessageEnvelope) Collection(java.util.Collection) Set(java.util.Set) Serializable(java.io.Serializable) Context(org.apache.samza.context.Context) List(java.util.List) SystemClock(org.apache.samza.util.SystemClock) Config(org.apache.samza.config.Config) KVSerde(org.apache.samza.serializers.KVSerde) OutputStream(org.apache.samza.operators.OutputStream) MetricsRegistryMap(org.apache.samza.metrics.MetricsRegistryMap) Mockito.mock(org.mockito.Mockito.mock) GenericSystemDescriptor(org.apache.samza.system.descriptors.GenericSystemDescriptor) JobConfig(org.apache.samza.config.JobConfig) ClosableFunction(org.apache.samza.operators.functions.ClosableFunction) Serde(org.apache.samza.serializers.Serde) HashMap(java.util.HashMap) SystemStreamPartition(org.apache.samza.system.SystemStreamPartition) Multimap(com.google.common.collect.Multimap) Function(java.util.function.Function) StreamConfig(org.apache.samza.config.StreamConfig) MapFunction(org.apache.samza.operators.functions.MapFunction) ArrayList(java.util.ArrayList) HashSet(java.util.HashSet) StreamTestUtils(org.apache.samza.testUtils.StreamTestUtils) MessageCollector(org.apache.samza.task.MessageCollector) SystemStream(org.apache.samza.system.SystemStream) MockContext(org.apache.samza.context.MockContext) IntegerSerde(org.apache.samza.serializers.IntegerSerde) JobModel(org.apache.samza.job.model.JobModel) MessageStream(org.apache.samza.operators.MessageStream) Before(org.junit.Before) OpCode(org.apache.samza.operators.spec.OperatorSpec.OpCode) FilterFunction(org.apache.samza.operators.functions.FilterFunction) GenericOutputDescriptor(org.apache.samza.system.descriptors.GenericOutputDescriptor) Partition(org.apache.samza.Partition) Assert.assertTrue(org.junit.Assert.assertTrue) InitableFunction(org.apache.samza.operators.functions.InitableFunction) Clock(org.apache.samza.util.Clock) Test(org.junit.Test) Mockito.when(org.mockito.Mockito.when) JoinFunction(org.apache.samza.operators.functions.JoinFunction) TaskCoordinator(org.apache.samza.task.TaskCoordinator) ContainerModel(org.apache.samza.job.model.ContainerModel) TaskContextImpl(org.apache.samza.context.TaskContextImpl) KeyValueStore(org.apache.samza.storage.kv.KeyValueStore) Collections(java.util.Collections) Assert.assertEquals(org.junit.Assert.assertEquals)
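
Example 5 stubs both join stores to return a TimestampedValue built from a message and currentTimeMillis, and declares the join with Duration.ofHours(1). The sketch below shows how a value-plus-timestamp pair could support a time bound of that kind. Stamped and withinTtl are hypothetical names; only the getValue()/getTimestamp() accessors mirror what the tests call, and whether Samza's join operator performs exactly this check is an assumption, not something the test demonstrates.

```java
import java.time.Duration;

/** Hypothetical illustration of a value paired with its write time; not Samza code. */
final class StampedSketch {

    /** Immutable holder with the same accessor names the tests use on TimestampedValue. */
    static final class Stamped<V> {
        private final V value;
        private final long timestamp; // epoch milliseconds

        Stamped(V value, long timestamp) {
            this.value = value;
            this.timestamp = timestamp;
        }

        V getValue() {
            return value;
        }

        long getTimestamp() {
            return timestamp;
        }
    }

    /** Assumed rule: a stored entry is usable while its age does not exceed the TTL. */
    static boolean withinTtl(Stamped<?> stored, long nowMillis, Duration ttl) {
        return nowMillis - stored.getTimestamp() <= ttl.toMillis();
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis();
        Stamped<String> left = new Stamped<>("left-message", now);
        // An entry written "now" is well inside a one-hour bound.
        System.out.println(withinTtl(left, now, Duration.ofHours(1)));
    }
}
```

Keeping the timestamp next to the value lets the reader of the store decide about expiry without a second lookup, which is one reason a store might hold such pairs rather than bare messages.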

Aggregations

TimestampedValue (org.apache.samza.util.TimestampedValue): 9
Test (org.junit.Test): 7
StringSerde (org.apache.samza.serializers.StringSerde): 6
IntegerSerde (org.apache.samza.serializers.IntegerSerde): 2
HashMultimap (com.google.common.collect.HashMultimap): 1
Multimap (com.google.common.collect.Multimap): 1
Serializable (java.io.Serializable): 1
ByteBuffer (java.nio.ByteBuffer): 1
Duration (java.time.Duration): 1
ArrayList (java.util.ArrayList): 1
Collection (java.util.Collection): 1
Collections (java.util.Collections): 1
HashMap (java.util.HashMap): 1
HashSet (java.util.HashSet): 1
List (java.util.List): 1
Map (java.util.Map): 1
Set (java.util.Set): 1
BiFunction (java.util.function.BiFunction): 1
Function (java.util.function.Function): 1
Partition (org.apache.samza.Partition): 1