Example 16 with TopologyBuilder

use of org.apache.kafka.streams.processor.TopologyBuilder in project incubator-rya by apache.

the class ProjectionProcessorIT method showProcessorWorks.

@Test
public void showProcessorWorks() throws Exception {
    // Enumerate some topics that will be re-used
    final String ryaInstance = UUID.randomUUID().toString();
    final UUID queryId = UUID.randomUUID();
    final String statementsTopic = KafkaTopics.statementsTopic(ryaInstance);
    final String resultsTopic = KafkaTopics.queryResultsTopic(ryaInstance, queryId);
    // Create a topology for the Query that will be tested.
    final String sparql = "SELECT (?person AS ?p) ?otherPerson " + "WHERE { " + "?person <urn:talksTo> ?otherPerson . " + "}";
    final TopologyBuilder builder = new TopologyFactory().build(sparql, statementsTopic, resultsTopic, new RandomUUIDFactory());
    // Load some data into the input topic.
    final ValueFactory vf = new ValueFactoryImpl();
    final List<VisibilityStatement> statements = new ArrayList<>();
    statements.add(new VisibilityStatement(vf.createStatement(vf.createURI("urn:Alice"), vf.createURI("urn:talksTo"), vf.createURI("urn:Bob")), "a"));
    // Show the correct binding set results from the job.
    final Set<VisibilityBindingSet> expected = new HashSet<>();
    final MapBindingSet expectedBs = new MapBindingSet();
    expectedBs.addBinding("p", vf.createURI("urn:Alice"));
    expectedBs.addBinding("otherPerson", vf.createURI("urn:Bob"));
    expected.add(new VisibilityBindingSet(expectedBs, "a"));
    RyaStreamsTestUtil.runStreamProcessingTest(kafka, statementsTopic, resultsTopic, builder, statements, Sets.newHashSet(expected), VisibilityBindingSetDeserializer.class);
}
Also used : VisibilityBindingSet(org.apache.rya.api.model.VisibilityBindingSet) TopologyBuilder(org.apache.kafka.streams.processor.TopologyBuilder) ValueFactoryImpl(org.openrdf.model.impl.ValueFactoryImpl) ArrayList(java.util.ArrayList) TopologyFactory(org.apache.rya.streams.kafka.topology.TopologyFactory) ValueFactory(org.openrdf.model.ValueFactory) VisibilityStatement(org.apache.rya.api.model.VisibilityStatement) RandomUUIDFactory(org.apache.rya.api.function.projection.RandomUUIDFactory) MapBindingSet(org.openrdf.query.impl.MapBindingSet) UUID(java.util.UUID) HashSet(java.util.HashSet) Test(org.junit.Test)

Example 17 with TopologyBuilder

use of org.apache.kafka.streams.processor.TopologyBuilder in project incubator-rya by apache.

the class SingleThreadKafkaStreamsFactory method make.

@Override
public KafkaStreams make(final String ryaInstance, final StreamsQuery query) throws KafkaStreamsFactoryException {
    requireNonNull(ryaInstance);
    requireNonNull(query);
    // Setup the Kafka Stream program.
    final Properties streamsProps = new Properties();
    // Configure the Kafka servers that will be talked to.
    streamsProps.setProperty(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, bootstrapServersConfig);
    // Use the Query ID as the Application ID to ensure we resume where we left off the last time this command was run.
    streamsProps.put(StreamsConfig.APPLICATION_ID_CONFIG, "RyaStreams-Query-" + query.getQueryId());
    // Always start at the beginning of the input topic.
    streamsProps.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
    // Setup the topology that processes the Query.
    final String statementsTopic = KafkaTopics.statementsTopic(ryaInstance);
    final String resultsTopic = KafkaTopics.queryResultsTopic(ryaInstance, query.getQueryId());
    try {
        final TopologyBuilder topologyBuilder = topologyFactory.build(query.getSparql(), statementsTopic, resultsTopic, new RandomUUIDFactory());
        return new KafkaStreams(topologyBuilder, new StreamsConfig(streamsProps));
    } catch (final MalformedQueryException | TopologyBuilderException e) {
        throw new KafkaStreamsFactoryException("Could not create a KafkaStreams processing topology for query " + query.getQueryId(), e);
    }
}
Also used : KafkaStreams(org.apache.kafka.streams.KafkaStreams) TopologyBuilderException(org.apache.rya.streams.kafka.topology.TopologyBuilderFactory.TopologyBuilderException) TopologyBuilder(org.apache.kafka.streams.processor.TopologyBuilder) RandomUUIDFactory(org.apache.rya.api.function.projection.RandomUUIDFactory) MalformedQueryException(org.openrdf.query.MalformedQueryException) Properties(java.util.Properties) StreamsConfig(org.apache.kafka.streams.StreamsConfig)
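
The configuration step in Example 17 can be tried without a Kafka dependency. The following is a minimal sketch, assuming the literal keys `bootstrap.servers`, `application.id` and `auto.offset.reset`, which are the string values behind the `StreamsConfig` and `ConsumerConfig` constants used above. The class name `StreamsPropsSketch`, the helper `build` and the sample host are hypothetical and not part of incubator-rya.

```java
import java.util.Properties;
import java.util.UUID;

final class StreamsPropsSketch {

    // Hypothetical helper that mirrors the property setup in make(...).
    static Properties build(final String bootstrapServers, final UUID queryId) {
        final Properties props = new Properties();
        // Kafka brokers that the streams program will talk to.
        props.setProperty("bootstrap.servers", bootstrapServers);
        // A stable application ID per query lets a restart resume from committed offsets.
        props.setProperty("application.id", "RyaStreams-Query-" + queryId);
        // With no committed offset, start at the beginning of the input topic.
        props.setProperty("auto.offset.reset", "earliest");
        return props;
    }

    public static void main(final String[] args) {
        // "localhost:9092" is an assumed sample address.
        final Properties props = build("localhost:9092", UUID.randomUUID());
        props.forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```

Deriving the application ID from the query ID means that two runs of the same query share consumer group state, while different queries stay isolated from each other.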

Example 18 with TopologyBuilder

use of org.apache.kafka.streams.processor.TopologyBuilder in project incubator-rya by apache.

the class KafkaRunQuery method run.

@Override
public void run(final UUID queryId) throws RyaStreamsException {
    requireNonNull(queryId);
    // Fetch the query from the repository. Throw an exception if it isn't present.
    final Optional<StreamsQuery> query = queryRepo.get(queryId);
    if (!query.isPresent()) {
        throw new RyaStreamsException("Could not run the Query with ID " + queryId + " because no such query " + "is currently registered.");
    }
    // Build a processing topology using the SPARQL, provided statements topic, and provided results topic.
    final String sparql = query.get().getSparql();
    final TopologyBuilder topologyBuilder;
    try {
        topologyBuilder = topologyFactory.build(sparql, statementsTopic, resultsTopic, new RandomUUIDFactory());
    } catch (final Exception e) {
        throw new RyaStreamsException("Could not run the Query with ID " + queryId + " because a processing " + "topology could not be built for the SPARQL " + sparql, e);
    }
    // Setup the Kafka Stream program.
    final Properties streamsProps = new Properties();
    streamsProps.setProperty(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, kafkaHostname + ":" + kafkaPort);
    // Use the Query ID as the Application ID to ensure we resume where we left off the last time this command was run.
    streamsProps.put(StreamsConfig.APPLICATION_ID_CONFIG, "KafkaRunQuery-" + queryId);
    // Always start at the beginning of the input topic.
    streamsProps.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
    final KafkaStreams streams = new KafkaStreams(topologyBuilder, new StreamsConfig(streamsProps));
    // If an unhandled exception is thrown, log it and shut the program down.
    streams.setUncaughtExceptionHandler((t, e) -> {
        // Log the problem and kill the program.
        log.error("Unhandled exception while processing the Rya Streams query. Shutting down.", e);
        System.exit(1);
    });
    // Setup a shutdown hook that kills the streams program at shutdown.
    final CountDownLatch awaitTermination = new CountDownLatch(1);
    Runtime.getRuntime().addShutdownHook(new Thread() {

        @Override
        public void run() {
            awaitTermination.countDown();
        }
    });
    // Run the streams program and wait for termination.
    streams.start();
    try {
        awaitTermination.await();
    } catch (final InterruptedException e) {
        log.warn("Interrupted while waiting for termination. Shutting down.");
    }
    streams.close();
}
Also used : KafkaStreams(org.apache.kafka.streams.KafkaStreams) StreamsQuery(org.apache.rya.streams.api.entity.StreamsQuery) TopologyBuilder(org.apache.kafka.streams.processor.TopologyBuilder) Properties(java.util.Properties) CountDownLatch(java.util.concurrent.CountDownLatch) RyaStreamsException(org.apache.rya.streams.api.exception.RyaStreamsException) RandomUUIDFactory(org.apache.rya.api.function.projection.RandomUUIDFactory) StreamsConfig(org.apache.kafka.streams.StreamsConfig)
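
The start, wait, close sequence in Example 18 does not depend on Kafka and can be sketched with the standard library alone. This is a minimal sketch under assumptions: `AwaitShutdownSketch` and `runUntilReleased` are hypothetical names, the two `Runnable` arguments stand in for `streams.start()` and `streams.close()`, and, unlike the example above, the sketch restores the interrupt flag when the wait is interrupted.

```java
import java.util.concurrent.CountDownLatch;

final class AwaitShutdownSketch {

    // Starts a service, blocks until the latch is released, then closes the service.
    static void runUntilReleased(final Runnable start, final Runnable close, final CountDownLatch latch) {
        start.run();
        try {
            latch.await();
        } catch (final InterruptedException e) {
            // Restore the interrupt flag so callers can still observe the interruption.
            Thread.currentThread().interrupt();
        }
        close.run();
    }

    public static void main(final String[] args) {
        final CountDownLatch awaitTermination = new CountDownLatch(1);
        // As in Example 18, a shutdown hook releases the latch when the JVM shuts down.
        Runtime.getRuntime().addShutdownHook(new Thread(awaitTermination::countDown));
        // Stand-in for an external shutdown signal so that this demo ends on its own.
        new Thread(awaitTermination::countDown).start();
        runUntilReleased(
                () -> System.out.println("started"),
                () -> System.out.println("closed"),
                awaitTermination);
    }
}
```

Passing the latch in as a parameter keeps the waiting logic testable: a caller can release the latch up front and check that start and close both ran, in that order.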

Example 19 with TopologyBuilder

use of org.apache.kafka.streams.processor.TopologyBuilder in project kafka by apache.

the class StreamThreadTest method testHandingOverTaskFromOneToAnotherThread.

@Test
public void testHandingOverTaskFromOneToAnotherThread() throws Exception {
    final TopologyBuilder builder = new TopologyBuilder();
    builder.addStateStore(Stores.create("store").withByteArrayKeys().withByteArrayValues().persistent().build());
    final StreamsConfig config = new StreamsConfig(configProps());
    final MockClientSupplier mockClientSupplier = new MockClientSupplier();
    mockClientSupplier.consumer.assign(Arrays.asList(new TopicPartition(TOPIC, 0), new TopicPartition(TOPIC, 1)));
    final StreamThread thread1 = new StreamThread(builder, config, mockClientSupplier, applicationId, clientId + 1, processId, new Metrics(), Time.SYSTEM, new StreamsMetadataState(builder, StreamsMetadataState.UNKNOWN_HOST), 0);
    final StreamThread thread2 = new StreamThread(builder, config, mockClientSupplier, applicationId, clientId + 2, processId, new Metrics(), Time.SYSTEM, new StreamsMetadataState(builder, StreamsMetadataState.UNKNOWN_HOST), 0);
    final Map<TaskId, Set<TopicPartition>> task0 = Collections.singletonMap(new TaskId(0, 0), task0Assignment);
    final Map<TaskId, Set<TopicPartition>> task1 = Collections.singletonMap(new TaskId(0, 1), task1Assignment);
    final Map<TaskId, Set<TopicPartition>> thread1Assignment = new HashMap<>(task0);
    final Map<TaskId, Set<TopicPartition>> thread2Assignment = new HashMap<>(task1);
    thread1.partitionAssignor(new MockStreamsPartitionAssignor(thread1Assignment));
    thread2.partitionAssignor(new MockStreamsPartitionAssignor(thread2Assignment));
    // revoke (to get threads in correct state)
    thread1.rebalanceListener.onPartitionsRevoked(Collections.<TopicPartition>emptySet());
    thread2.rebalanceListener.onPartitionsRevoked(Collections.<TopicPartition>emptySet());
    // assign
    thread1.rebalanceListener.onPartitionsAssigned(task0Assignment);
    thread2.rebalanceListener.onPartitionsAssigned(task1Assignment);
    // Snapshot the task IDs each thread owns before the handover.
    final Set<TaskId> originalTaskAssignmentThread1 = new HashSet<>(thread1.tasks().keySet());
    final Set<TaskId> originalTaskAssignmentThread2 = new HashSet<>(thread2.tasks().keySet());
    // revoke (task will be suspended)
    thread1.rebalanceListener.onPartitionsRevoked(task0Assignment);
    thread2.rebalanceListener.onPartitionsRevoked(task1Assignment);
    // assign swapped: each thread now takes over the other thread's task
    thread1Assignment.clear();
    thread1Assignment.putAll(task1);
    thread2Assignment.clear();
    thread2Assignment.putAll(task0);
    Thread runIt = new Thread(new Runnable() {

        @Override
        public void run() {
            thread1.rebalanceListener.onPartitionsAssigned(task1Assignment);
        }
    });
    runIt.start();
    thread2.rebalanceListener.onPartitionsAssigned(task0Assignment);
    runIt.join();
    assertThat(thread1.tasks().keySet(), equalTo(originalTaskAssignmentThread2));
    assertThat(thread2.tasks().keySet(), equalTo(originalTaskAssignmentThread1));
    assertThat(thread1.prevActiveTasks(), equalTo(originalTaskAssignmentThread1));
    assertThat(thread2.prevActiveTasks(), equalTo(originalTaskAssignmentThread2));
}
Also used : TaskId(org.apache.kafka.streams.processor.TaskId) Set(java.util.Set) HashSet(java.util.HashSet) TopologyBuilder(org.apache.kafka.streams.processor.TopologyBuilder) HashMap(java.util.HashMap) Metrics(org.apache.kafka.common.metrics.Metrics) StreamsMetrics(org.apache.kafka.streams.StreamsMetrics) MockClientSupplier(org.apache.kafka.test.MockClientSupplier) TopicPartition(org.apache.kafka.common.TopicPartition) StreamsConfig(org.apache.kafka.streams.StreamsConfig) Test(org.junit.Test)

Example 20 with TopologyBuilder

use of org.apache.kafka.streams.processor.TopologyBuilder in project kafka by apache.

the class StreamThreadTest method testMetrics.

@Test
public void testMetrics() throws Exception {
    TopologyBuilder builder = new TopologyBuilder().setApplicationId("MetricsApp");
    StreamsConfig config = new StreamsConfig(configProps());
    MockClientSupplier clientSupplier = new MockClientSupplier();
    Metrics metrics = new Metrics();
    StreamThread thread = new StreamThread(builder, config, clientSupplier, applicationId, clientId, processId, metrics, new MockTime(), new StreamsMetadataState(builder, StreamsMetadataState.UNKNOWN_HOST), 0);
    String defaultGroupName = "stream-metrics";
    String defaultPrefix = "thread." + thread.threadClientId();
    Map<String, String> defaultTags = Collections.singletonMap("client-id", thread.threadClientId());
    assertNotNull(metrics.getSensor(defaultPrefix + ".commit-latency"));
    assertNotNull(metrics.getSensor(defaultPrefix + ".poll-latency"));
    assertNotNull(metrics.getSensor(defaultPrefix + ".process-latency"));
    assertNotNull(metrics.getSensor(defaultPrefix + ".punctuate-latency"));
    assertNotNull(metrics.getSensor(defaultPrefix + ".task-created"));
    assertNotNull(metrics.getSensor(defaultPrefix + ".task-closed"));
    assertNotNull(metrics.getSensor(defaultPrefix + ".skipped-records"));
    assertNotNull(metrics.metrics().get(metrics.metricName("commit-latency-avg", defaultGroupName, "The average commit time in ms", defaultTags)));
    assertNotNull(metrics.metrics().get(metrics.metricName("commit-latency-max", defaultGroupName, "The maximum commit time in ms", defaultTags)));
    assertNotNull(metrics.metrics().get(metrics.metricName("commit-rate", defaultGroupName, "The average per-second number of commit calls", defaultTags)));
    assertNotNull(metrics.metrics().get(metrics.metricName("poll-latency-avg", defaultGroupName, "The average poll time in ms", defaultTags)));
    assertNotNull(metrics.metrics().get(metrics.metricName("poll-latency-max", defaultGroupName, "The maximum poll time in ms", defaultTags)));
    assertNotNull(metrics.metrics().get(metrics.metricName("poll-rate", defaultGroupName, "The average per-second number of record-poll calls", defaultTags)));
    assertNotNull(metrics.metrics().get(metrics.metricName("process-latency-avg", defaultGroupName, "The average process time in ms", defaultTags)));
    assertNotNull(metrics.metrics().get(metrics.metricName("process-latency-max", defaultGroupName, "The maximum process time in ms", defaultTags)));
    assertNotNull(metrics.metrics().get(metrics.metricName("process-rate", defaultGroupName, "The average per-second number of process calls", defaultTags)));
    assertNotNull(metrics.metrics().get(metrics.metricName("punctuate-latency-avg", defaultGroupName, "The average punctuate time in ms", defaultTags)));
    assertNotNull(metrics.metrics().get(metrics.metricName("punctuate-latency-max", defaultGroupName, "The maximum punctuate time in ms", defaultTags)));
    assertNotNull(metrics.metrics().get(metrics.metricName("punctuate-rate", defaultGroupName, "The average per-second number of punctuate calls", defaultTags)));
    assertNotNull(metrics.metrics().get(metrics.metricName("task-created-rate", defaultGroupName, "The average per-second number of newly created tasks", defaultTags)));
    assertNotNull(metrics.metrics().get(metrics.metricName("task-closed-rate", defaultGroupName, "The average per-second number of closed tasks", defaultTags)));
    assertNotNull(metrics.metrics().get(metrics.metricName("skipped-records-rate", defaultGroupName, "The average per-second number of skipped records.", defaultTags)));
}
Also used : Metrics(org.apache.kafka.common.metrics.Metrics) StreamsMetrics(org.apache.kafka.streams.StreamsMetrics) TopologyBuilder(org.apache.kafka.streams.processor.TopologyBuilder) MockClientSupplier(org.apache.kafka.test.MockClientSupplier) MockTime(org.apache.kafka.common.utils.MockTime) StreamsConfig(org.apache.kafka.streams.StreamsConfig) Test(org.junit.Test)
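
The sensor lookups in Example 20 all follow one naming scheme: `"thread."`, then the thread client ID, then a dot and the sensor suffix. The following minimal sketch reproduces only that string composition, using the suffixes asserted in the test. `SensorNamesSketch`, `sensorNames` and the sample client ID are hypothetical and not part of Kafka.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class SensorNamesSketch {

    // Sensor suffixes asserted in testMetrics above.
    static final List<String> SUFFIXES = Arrays.asList(
            "commit-latency", "poll-latency", "process-latency", "punctuate-latency",
            "task-created", "task-closed", "skipped-records");

    // Builds the full sensor names for one stream thread.
    static List<String> sensorNames(final String threadClientId) {
        final String prefix = "thread." + threadClientId;
        final List<String> names = new ArrayList<>();
        for (final String suffix : SUFFIXES) {
            names.add(prefix + "." + suffix);
        }
        return names;
    }

    public static void main(final String[] args) {
        // "client-1" is an assumed sample value for thread.threadClientId().
        sensorNames("client-1").forEach(System.out::println);
    }
}
```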

Aggregations

TopologyBuilder (org.apache.kafka.streams.processor.TopologyBuilder) 38
Test (org.junit.Test) 34
HashSet (java.util.HashSet) 27
UUID (java.util.UUID) 25
RandomUUIDFactory (org.apache.rya.api.function.projection.RandomUUIDFactory) 25
VisibilityStatement (org.apache.rya.api.model.VisibilityStatement) 24
TopologyFactory (org.apache.rya.streams.kafka.topology.TopologyFactory) 24
ValueFactory (org.openrdf.model.ValueFactory) 24
ValueFactoryImpl (org.openrdf.model.impl.ValueFactoryImpl) 24
VisibilityBindingSet (org.apache.rya.api.model.VisibilityBindingSet) 23
ArrayList (java.util.ArrayList) 20
MapBindingSet (org.openrdf.query.impl.MapBindingSet) 19
StreamsConfig (org.apache.kafka.streams.StreamsConfig) 10
Metrics (org.apache.kafka.common.metrics.Metrics) 8
MockClientSupplier (org.apache.kafka.test.MockClientSupplier) 8
StreamsMetrics (org.apache.kafka.streams.StreamsMetrics) 7
TaskId (org.apache.kafka.streams.processor.TaskId) 7
TopicPartition (org.apache.kafka.common.TopicPartition) 6
MockTime (org.apache.kafka.common.utils.MockTime) 6
Properties (java.util.Properties) 5