Search in sources :

Example 1 with StreamGraphImpl

use of org.apache.samza.operators.StreamGraphImpl in project samza by apache.

the class TestExecutionPlanner method testTriggerIntervalWithInvalidWindowMs.

@Test
public void testTriggerIntervalWithInvalidWindowMs() throws Exception {
    Map<String, String> map = new HashMap<>(config);
    map.put(TaskConfig.WINDOW_MS(), "-1");
    map.put(JobConfig.JOB_INTERMEDIATE_STREAM_PARTITIONS(), String.valueOf(DEFAULT_PARTITIONS));
    Config cfg = new MapConfig(map);
    ExecutionPlanner planner = new ExecutionPlanner(cfg, streamManager);
    StreamGraphImpl streamGraph = createStreamGraphWithJoinAndWindow();
    ExecutionPlan plan = planner.plan(streamGraph);
    List<JobConfig> jobConfigs = plan.getJobConfigs();
    assertEquals(jobConfigs.size(), 1);
    // GCD of 8, 16, 1600 and 252 is 4
    assertEquals(jobConfigs.get(0).get(TaskConfig.WINDOW_MS()), "4");
}
Also used : HashMap(java.util.HashMap) JobConfig(org.apache.samza.config.JobConfig) MapConfig(org.apache.samza.config.MapConfig) TaskConfig(org.apache.samza.config.TaskConfig) Config(org.apache.samza.config.Config) StreamGraphImpl(org.apache.samza.operators.StreamGraphImpl) MapConfig(org.apache.samza.config.MapConfig) JobConfig(org.apache.samza.config.JobConfig) Test(org.junit.Test)

Example 2 with StreamGraphImpl

use of org.apache.samza.operators.StreamGraphImpl in project samza by apache.

the class TestExecutionPlanner method createSimpleGraph.

private StreamGraphImpl createSimpleGraph() {
    /**
     * a simple graph of partitionBy and map
     *
     * input1 -> partitionBy -> map -> output1
     *
     */
    StreamGraphImpl streamGraph = new StreamGraphImpl(runner, config);
    Function mockFn = mock(Function.class);
    OutputStream<Object, Object, Object> output1 = streamGraph.getOutputStream("output1", mockFn, mockFn);
    BiFunction mockBuilder = mock(BiFunction.class);
    streamGraph.getInputStream("input1", mockBuilder).partitionBy(m -> "yes!!!").map(m -> m).sendTo(output1);
    return streamGraph;
}
Also used : BiFunction(java.util.function.BiFunction) JobConfig(org.apache.samza.config.JobConfig) HashMap(java.util.HashMap) SystemStreamPartition(org.apache.samza.system.SystemStreamPartition) SystemStreamMetadata(org.apache.samza.system.SystemStreamMetadata) Function(java.util.function.Function) ArrayList(java.util.ArrayList) Duration(java.time.Duration) Map(java.util.Map) MapConfig(org.apache.samza.config.MapConfig) MessageStream(org.apache.samza.operators.MessageStream) Before(org.junit.Before) ApplicationRunner(org.apache.samza.runtime.ApplicationRunner) Windows(org.apache.samza.operators.windows.Windows) TaskConfig(org.apache.samza.config.TaskConfig) Collection(java.util.Collection) Partition(org.apache.samza.Partition) Set(java.util.Set) Assert.assertTrue(org.junit.Assert.assertTrue) StreamSpec(org.apache.samza.system.StreamSpec) Test(org.junit.Test) Mockito.when(org.mockito.Mockito.when) JoinFunction(org.apache.samza.operators.functions.JoinFunction) StreamGraphImpl(org.apache.samza.operators.StreamGraphImpl) List(java.util.List) Assert.assertFalse(org.junit.Assert.assertFalse) SystemAdmin(org.apache.samza.system.SystemAdmin) Config(org.apache.samza.config.Config) OutputStream(org.apache.samza.operators.OutputStream) Collections(java.util.Collections) Assert.assertEquals(org.junit.Assert.assertEquals) Mockito.mock(org.mockito.Mockito.mock) BiFunction(java.util.function.BiFunction) Function(java.util.function.Function) JoinFunction(org.apache.samza.operators.functions.JoinFunction) BiFunction(java.util.function.BiFunction) StreamGraphImpl(org.apache.samza.operators.StreamGraphImpl)

Example 3 with StreamGraphImpl

use of org.apache.samza.operators.StreamGraphImpl in project samza by apache.

the class AbstractApplicationRunner method getExecutionPlan.

final ExecutionPlan getExecutionPlan(StreamApplication app) throws Exception {
    // build stream graph
    StreamGraphImpl streamGraph = new StreamGraphImpl(this, config);
    app.init(streamGraph, config);
    // create the physical execution plan
    return planner.plan(streamGraph);
}
Also used : StreamGraphImpl(org.apache.samza.operators.StreamGraphImpl)

Example 4 with StreamGraphImpl

use of org.apache.samza.operators.StreamGraphImpl in project samza by apache.

the class StreamOperatorTask method init.

/**
   * Initializes this task during startup.
   * <p>
   * Implementation: Initializes the user-implemented {@link StreamApplication}. The {@link StreamApplication} sets
   * the input and output streams and the task-wide context manager using the {@link StreamGraphImpl} APIs,
   * and the logical transforms using the {@link org.apache.samza.operators.MessageStream} APIs.
   *<p>
   * It then uses the {@link StreamGraphImpl} to create the {@link OperatorImplGraph} corresponding to the logical
   * DAG. It also saves the mapping between input {@link SystemStream}s and their corresponding
   * {@link InputStreamInternal}s for delivering incoming messages to the appropriate sub-DAG.
   *
   * @param config allows accessing of fields in the configuration files that this StreamTask is specified in
   * @param context allows initializing and accessing contextual data of this StreamTask
   * @throws Exception in case of initialization errors
   */
@Override
public final void init(Config config, TaskContext context) throws Exception {
    StreamGraphImpl streamGraph = new StreamGraphImpl(runner, config);
    // initialize the user-implemented stream application.
    this.streamApplication.init(streamGraph, config);
    // get the user-implemented context manager and initialize it
    this.contextManager = streamGraph.getContextManager();
    if (this.contextManager != null) {
        this.contextManager.init(config, context);
    }
    // create the operator impl DAG corresponding to the logical operator spec DAG
    OperatorImplGraph operatorImplGraph = new OperatorImplGraph(clock);
    operatorImplGraph.init(streamGraph, config, context);
    this.operatorImplGraph = operatorImplGraph;
    // TODO: SAMZA-1118 - Remove mapping after SystemConsumer starts returning logical streamId with incoming messages
    inputSystemStreamToInputStream = new HashMap<>();
    streamGraph.getInputStreams().forEach((streamSpec, inputStream) -> {
        SystemStream systemStream = new SystemStream(streamSpec.getSystemName(), streamSpec.getPhysicalName());
        inputSystemStreamToInputStream.put(systemStream, inputStream);
    });
}
Also used : OperatorImplGraph(org.apache.samza.operators.impl.OperatorImplGraph) SystemStream(org.apache.samza.system.SystemStream) StreamGraphImpl(org.apache.samza.operators.StreamGraphImpl)

Example 5 with StreamGraphImpl

use of org.apache.samza.operators.StreamGraphImpl in project samza by apache.

the class TestJobGraphJsonGenerator method test.

@Test
public void test() throws Exception {
    /**
     * the graph looks like the following. number of partitions in parentheses. quotes indicate expected value.
     *
     *                               input1 (64) -> map -> join -> output1 (8)
     *                                                       |
     *          input2 (16) -> partitionBy ("64") -> filter -|
     *                                                       |
     * input3 (32) -> filter -> partitionBy ("64") -> map -> join -> output2 (16)
     *
     */
    Map<String, String> configMap = new HashMap<>();
    configMap.put(JobConfig.JOB_NAME(), "test-app");
    configMap.put(JobConfig.JOB_DEFAULT_SYSTEM(), "test-system");
    Config config = new MapConfig(configMap);
    StreamSpec input1 = new StreamSpec("input1", "input1", "system1");
    StreamSpec input2 = new StreamSpec("input2", "input2", "system2");
    StreamSpec input3 = new StreamSpec("input3", "input3", "system2");
    StreamSpec output1 = new StreamSpec("output1", "output1", "system1");
    StreamSpec output2 = new StreamSpec("output2", "output2", "system2");
    ApplicationRunner runner = mock(ApplicationRunner.class);
    when(runner.getStreamSpec("input1")).thenReturn(input1);
    when(runner.getStreamSpec("input2")).thenReturn(input2);
    when(runner.getStreamSpec("input3")).thenReturn(input3);
    when(runner.getStreamSpec("output1")).thenReturn(output1);
    when(runner.getStreamSpec("output2")).thenReturn(output2);
    // intermediate streams used in tests
    when(runner.getStreamSpec("test-app-1-partition_by-0")).thenReturn(new StreamSpec("test-app-1-partition_by-0", "test-app-1-partition_by-0", "default-system"));
    when(runner.getStreamSpec("test-app-1-partition_by-1")).thenReturn(new StreamSpec("test-app-1-partition_by-1", "test-app-1-partition_by-1", "default-system"));
    when(runner.getStreamSpec("test-app-1-partition_by-4")).thenReturn(new StreamSpec("test-app-1-partition_by-4", "test-app-1-partition_by-4", "default-system"));
    // set up external partition count
    Map<String, Integer> system1Map = new HashMap<>();
    system1Map.put("input1", 64);
    system1Map.put("output1", 8);
    Map<String, Integer> system2Map = new HashMap<>();
    system2Map.put("input2", 16);
    system2Map.put("input3", 32);
    system2Map.put("output2", 16);
    Map<String, SystemAdmin> systemAdmins = new HashMap<>();
    SystemAdmin systemAdmin1 = createSystemAdmin(system1Map);
    SystemAdmin systemAdmin2 = createSystemAdmin(system2Map);
    systemAdmins.put("system1", systemAdmin1);
    systemAdmins.put("system2", systemAdmin2);
    StreamManager streamManager = new StreamManager(systemAdmins);
    StreamGraphImpl streamGraph = new StreamGraphImpl(runner, config);
    BiFunction mockBuilder = mock(BiFunction.class);
    MessageStream m1 = streamGraph.getInputStream("input1", mockBuilder).map(m -> m);
    MessageStream m2 = streamGraph.getInputStream("input2", mockBuilder).partitionBy(m -> "haha").filter(m -> true);
    MessageStream m3 = streamGraph.getInputStream("input3", mockBuilder).filter(m -> true).partitionBy(m -> "hehe").map(m -> m);
    Function mockFn = mock(Function.class);
    OutputStream<Object, Object, Object> outputStream1 = streamGraph.getOutputStream("output1", mockFn, mockFn);
    OutputStream<Object, Object, Object> outputStream2 = streamGraph.getOutputStream("output2", mockFn, mockFn);
    m1.join(m2, mock(JoinFunction.class), Duration.ofHours(2)).sendTo(outputStream1);
    m2.sink((message, collector, coordinator) -> {
    });
    m3.join(m2, mock(JoinFunction.class), Duration.ofHours(1)).sendTo(outputStream2);
    ExecutionPlanner planner = new ExecutionPlanner(config, streamManager);
    ExecutionPlan plan = planner.plan(streamGraph);
    String json = plan.getPlanAsJson();
    System.out.println(json);
    // deserialize
    ObjectMapper mapper = new ObjectMapper();
    JobGraphJsonGenerator.JobGraphJson nodes = mapper.readValue(json, JobGraphJsonGenerator.JobGraphJson.class);
    assertTrue(nodes.jobs.get(0).operatorGraph.inputStreams.size() == 5);
    assertTrue(nodes.jobs.get(0).operatorGraph.operators.size() == 13);
    assertTrue(nodes.sourceStreams.size() == 3);
    assertTrue(nodes.sinkStreams.size() == 2);
    assertTrue(nodes.intermediateStreams.size() == 2);
}
Also used : ApplicationRunner(org.apache.samza.runtime.ApplicationRunner) BiFunction(java.util.function.BiFunction) JobConfig(org.apache.samza.config.JobConfig) Assert.assertTrue(org.junit.Assert.assertTrue) HashMap(java.util.HashMap) StreamSpec(org.apache.samza.system.StreamSpec) Test(org.junit.Test) Mockito.when(org.mockito.Mockito.when) JoinFunction(org.apache.samza.operators.functions.JoinFunction) Function(java.util.function.Function) TestExecutionPlanner.createSystemAdmin(org.apache.samza.execution.TestExecutionPlanner.createSystemAdmin) StreamGraphImpl(org.apache.samza.operators.StreamGraphImpl) Duration(java.time.Duration) Map(java.util.Map) SystemAdmin(org.apache.samza.system.SystemAdmin) Config(org.apache.samza.config.Config) MapConfig(org.apache.samza.config.MapConfig) OutputStream(org.apache.samza.operators.OutputStream) ObjectMapper(org.codehaus.jackson.map.ObjectMapper) MessageStream(org.apache.samza.operators.MessageStream) Mockito.mock(org.mockito.Mockito.mock) StreamSpec(org.apache.samza.system.StreamSpec) HashMap(java.util.HashMap) JobConfig(org.apache.samza.config.JobConfig) Config(org.apache.samza.config.Config) MapConfig(org.apache.samza.config.MapConfig) BiFunction(java.util.function.BiFunction) JoinFunction(org.apache.samza.operators.functions.JoinFunction) Function(java.util.function.Function) ApplicationRunner(org.apache.samza.runtime.ApplicationRunner) BiFunction(java.util.function.BiFunction) MessageStream(org.apache.samza.operators.MessageStream) StreamGraphImpl(org.apache.samza.operators.StreamGraphImpl) MapConfig(org.apache.samza.config.MapConfig) TestExecutionPlanner.createSystemAdmin(org.apache.samza.execution.TestExecutionPlanner.createSystemAdmin) SystemAdmin(org.apache.samza.system.SystemAdmin) ObjectMapper(org.codehaus.jackson.map.ObjectMapper) Test(org.junit.Test)

Aggregations

StreamGraphImpl (org.apache.samza.operators.StreamGraphImpl)12 Test (org.junit.Test)10 Config (org.apache.samza.config.Config)8 JoinFunction (org.apache.samza.operators.functions.JoinFunction)7 Mockito.mock (org.mockito.Mockito.mock)7 Mockito.when (org.mockito.Mockito.when)7 Duration (java.time.Duration)6 ArrayList (java.util.ArrayList)6 Set (java.util.Set)6 Assert.assertEquals (org.junit.Assert.assertEquals)6 Assert.assertTrue (org.junit.Assert.assertTrue)6 HashMap (java.util.HashMap)5 Function (java.util.function.Function)5 JobConfig (org.apache.samza.config.JobConfig)5 MapConfig (org.apache.samza.config.MapConfig)5 TestMessageEnvelope (org.apache.samza.operators.data.TestMessageEnvelope)5 TestOutputMessageEnvelope (org.apache.samza.operators.data.TestOutputMessageEnvelope)5 Collection (java.util.Collection)4 List (java.util.List)4 Map (java.util.Map)4