Example 1 with KafkaSpoutConfig

use of org.apache.storm.kafka.spout.KafkaSpoutConfig in project metron by apache.

the class ParserTopologyBuilder method build.

/**
 * Builds a Storm topology that parses telemetry data received from an external sensor.
 *
 * @param zookeeperUrl                     Zookeeper URL
 * @param brokerUrl                        Kafka Broker URL
 * @param sensorType                       Type of sensor
 * @param spoutParallelismSupplier         Supplier for the parallelism hint for the spout
 * @param spoutNumTasksSupplier            Supplier for the number of tasks for the spout
 * @param parserParallelismSupplier        Supplier for the parallelism hint for the parser bolt
 * @param parserNumTasksSupplier           Supplier for the number of tasks for the parser bolt
 * @param errorWriterParallelismSupplier   Supplier for the parallelism hint for the bolt that handles errors
 * @param errorWriterNumTasksSupplier      Supplier for the number of tasks for the bolt that handles errors
 * @param kafkaSpoutConfigSupplier         Supplier for the configuration options for the kafka spout
 * @param securityProtocolSupplier         Supplier for the security protocol
 * @param outputTopic                      The output kafka topic
 * @param stormConfigSupplier              Supplier for the storm config
 * @return A Storm topology that parses telemetry data received from an external sensor
 * @throws Exception
 */
public static ParserTopology build(String zookeeperUrl, Optional<String> brokerUrl, String sensorType, ValueSupplier<Integer> spoutParallelismSupplier, ValueSupplier<Integer> spoutNumTasksSupplier, ValueSupplier<Integer> parserParallelismSupplier, ValueSupplier<Integer> parserNumTasksSupplier, ValueSupplier<Integer> errorWriterParallelismSupplier, ValueSupplier<Integer> errorWriterNumTasksSupplier, ValueSupplier<Map> kafkaSpoutConfigSupplier, ValueSupplier<String> securityProtocolSupplier, Optional<String> outputTopic, ValueSupplier<Config> stormConfigSupplier) throws Exception {
    // fetch configuration from zookeeper
    ParserConfigurations configs = new ParserConfigurations();
    SensorParserConfig parserConfig = getSensorParserConfig(zookeeperUrl, sensorType, configs);
    int spoutParallelism = spoutParallelismSupplier.get(parserConfig, Integer.class);
    int spoutNumTasks = spoutNumTasksSupplier.get(parserConfig, Integer.class);
    int parserParallelism = parserParallelismSupplier.get(parserConfig, Integer.class);
    int parserNumTasks = parserNumTasksSupplier.get(parserConfig, Integer.class);
    int errorWriterParallelism = errorWriterParallelismSupplier.get(parserConfig, Integer.class);
    int errorWriterNumTasks = errorWriterNumTasksSupplier.get(parserConfig, Integer.class);
    Map<String, Object> kafkaSpoutConfig = kafkaSpoutConfigSupplier.get(parserConfig, Map.class);
    Optional<String> securityProtocol = Optional.ofNullable(securityProtocolSupplier.get(parserConfig, String.class));
    // create the spout
    TopologyBuilder builder = new TopologyBuilder();
    KafkaSpout kafkaSpout = createKafkaSpout(zookeeperUrl, sensorType, securityProtocol, Optional.ofNullable(kafkaSpoutConfig), parserConfig);
    builder.setSpout("kafkaSpout", kafkaSpout, spoutParallelism).setNumTasks(spoutNumTasks);
    // create the parser bolt
    ParserBolt parserBolt = createParserBolt(zookeeperUrl, brokerUrl, sensorType, securityProtocol, configs, parserConfig, outputTopic);
    builder.setBolt("parserBolt", parserBolt, parserParallelism).setNumTasks(parserNumTasks).localOrShuffleGrouping("kafkaSpout");
    // create the error bolt, if needed
    if (errorWriterNumTasks > 0) {
        WriterBolt errorBolt = createErrorBolt(zookeeperUrl, brokerUrl, sensorType, securityProtocol, configs, parserConfig);
        builder.setBolt("errorMessageWriter", errorBolt, errorWriterParallelism).setNumTasks(errorWriterNumTasks).localOrShuffleGrouping("parserBolt", Constants.ERROR_STREAM);
    }
    return new ParserTopology(builder, stormConfigSupplier.get(parserConfig, Config.class));
}
Also used : TopologyBuilder(org.apache.storm.topology.TopologyBuilder) SensorParserConfig(org.apache.metron.common.configuration.SensorParserConfig) KafkaSpoutConfig(org.apache.storm.kafka.spout.KafkaSpoutConfig) ConsumerConfig(org.apache.kafka.clients.consumer.ConsumerConfig) Config(org.apache.storm.Config) ParserBolt(org.apache.metron.parsers.bolt.ParserBolt) WriterBolt(org.apache.metron.parsers.bolt.WriterBolt) ParserConfigurations(org.apache.metron.common.configuration.ParserConfigurations) JSONObject(org.json.simple.JSONObject) KafkaSpout(org.apache.storm.kafka.spout.KafkaSpout) StormKafkaSpout(org.apache.metron.storm.kafka.flux.StormKafkaSpout)

Example 2 with KafkaSpoutConfig

use of org.apache.storm.kafka.spout.KafkaSpoutConfig in project metron by apache.

the class ParserTopologyBuilder method build.

/**
 * Builds a Storm topology that parses telemetry data received from an external sensor.
 *
 * @param zookeeperUrl                     Zookeeper URL
 * @param brokerUrl                        Kafka Broker URL
 * @param sensorTypes                      Types of the sensors to parse
 * @param spoutParallelismSupplier         Supplier for the parallelism hints for the spouts, one per sensor
 * @param spoutNumTasksSupplier            Supplier for the number of tasks for the spouts, one per sensor
 * @param parserParallelismSupplier        Supplier for the parallelism hint for the parser bolt
 * @param parserNumTasksSupplier           Supplier for the number of tasks for the parser bolt
 * @param errorWriterParallelismSupplier   Supplier for the parallelism hint for the bolt that handles errors
 * @param errorWriterNumTasksSupplier      Supplier for the number of tasks for the bolt that handles errors
 * @param kafkaSpoutConfigSupplier         Supplier for the configuration options for the kafka spouts, one per sensor
 * @param securityProtocolSupplier         Supplier for the security protocol
 * @param outputTopicSupplier              Supplier for the output kafka topic
 * @param errorTopicSupplier               Supplier for the error kafka topic
 * @param stormConfigSupplier              Supplier for the storm config
 * @return A Storm topology that parses telemetry data received from an external sensor
 * @throws Exception
 */
public static ParserTopology build(String zookeeperUrl, Optional<String> brokerUrl, List<String> sensorTypes, ValueSupplier<List> spoutParallelismSupplier, ValueSupplier<List> spoutNumTasksSupplier, ValueSupplier<Integer> parserParallelismSupplier, ValueSupplier<Integer> parserNumTasksSupplier, ValueSupplier<Integer> errorWriterParallelismSupplier, ValueSupplier<Integer> errorWriterNumTasksSupplier, ValueSupplier<List> kafkaSpoutConfigSupplier, ValueSupplier<String> securityProtocolSupplier, ValueSupplier<String> outputTopicSupplier, ValueSupplier<String> errorTopicSupplier, ValueSupplier<Config> stormConfigSupplier) throws Exception {
    // fetch configuration from zookeeper
    ParserConfigurations configs = new ParserConfigurations();
    Map<String, SensorParserConfig> sensorToParserConfigs = getSensorParserConfig(zookeeperUrl, sensorTypes, configs);
    Collection<SensorParserConfig> parserConfigs = sensorToParserConfigs.values();
    @SuppressWarnings("unchecked") List<Integer> spoutParallelism = (List<Integer>) spoutParallelismSupplier.get(parserConfigs, List.class);
    @SuppressWarnings("unchecked") List<Integer> spoutNumTasks = (List<Integer>) spoutNumTasksSupplier.get(parserConfigs, List.class);
    int parserParallelism = parserParallelismSupplier.get(parserConfigs, Integer.class);
    int parserNumTasks = parserNumTasksSupplier.get(parserConfigs, Integer.class);
    int errorWriterParallelism = errorWriterParallelismSupplier.get(parserConfigs, Integer.class);
    int errorWriterNumTasks = errorWriterNumTasksSupplier.get(parserConfigs, Integer.class);
    String outputTopic = outputTopicSupplier.get(parserConfigs, String.class);
    List<Map<String, Object>> kafkaSpoutConfig = kafkaSpoutConfigSupplier.get(parserConfigs, List.class);
    Optional<String> securityProtocol = Optional.ofNullable(securityProtocolSupplier.get(parserConfigs, String.class));
    // create the spout
    TopologyBuilder builder = new TopologyBuilder();
    int i = 0;
    List<String> spoutIds = new ArrayList<>();
    for (Entry<String, SensorParserConfig> entry : sensorToParserConfigs.entrySet()) {
        KafkaSpout kafkaSpout = createKafkaSpout(zookeeperUrl, entry.getKey(), securityProtocol, Optional.ofNullable(kafkaSpoutConfig.get(i)), entry.getValue());
        String spoutId = sensorToParserConfigs.size() > 1 ? "kafkaSpout-" + entry.getKey() : "kafkaSpout";
        builder.setSpout(spoutId, kafkaSpout, spoutParallelism.get(i)).setNumTasks(spoutNumTasks.get(i));
        spoutIds.add(spoutId);
        ++i;
    }
    // create the parser bolt
    ParserBolt parserBolt = createParserBolt(zookeeperUrl, brokerUrl, sensorToParserConfigs, securityProtocol, configs, Optional.ofNullable(outputTopic));
    BoltDeclarer boltDeclarer = builder.setBolt("parserBolt", parserBolt, parserParallelism).setNumTasks(parserNumTasks);
    for (String spoutId : spoutIds) {
        boltDeclarer.localOrShuffleGrouping(spoutId);
    }
    // create the error bolt, if needed
    if (errorWriterNumTasks > 0) {
        String errorTopic = errorTopicSupplier.get(parserConfigs, String.class);
        WriterBolt errorBolt = createErrorBolt(zookeeperUrl, brokerUrl, sensorTypes.get(0), securityProtocol, configs, parserConfigs.iterator().next(), errorTopic);
        builder.setBolt("errorMessageWriter", errorBolt, errorWriterParallelism).setNumTasks(errorWriterNumTasks).localOrShuffleGrouping("parserBolt", Constants.ERROR_STREAM);
    }
    return new ParserTopology(builder, stormConfigSupplier.get(parserConfigs, Config.class));
}
Also used : TopologyBuilder(org.apache.storm.topology.TopologyBuilder) SensorParserConfig(org.apache.metron.common.configuration.SensorParserConfig) KafkaSpoutConfig(org.apache.storm.kafka.spout.KafkaSpoutConfig) ConsumerConfig(org.apache.kafka.clients.consumer.ConsumerConfig) Config(org.apache.storm.Config) ParserBolt(org.apache.metron.parsers.bolt.ParserBolt) ArrayList(java.util.ArrayList) WriterBolt(org.apache.metron.parsers.bolt.WriterBolt) BoltDeclarer(org.apache.storm.topology.BoltDeclarer) ParserConfigurations(org.apache.metron.common.configuration.ParserConfigurations) List(java.util.List) KafkaSpout(org.apache.storm.kafka.spout.KafkaSpout) StormKafkaSpout(org.apache.metron.storm.kafka.flux.StormKafkaSpout) HashMap(java.util.HashMap) Map(java.util.Map)
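
The spout naming rule used in the loop of Example 2 can be tried without a Storm dependency. The following is a minimal sketch, not Metron code: the class `SpoutIdSketch` and the method `spoutIds` are hypothetical names introduced here, and only the rule itself (a plain "kafkaSpout" ID for a single sensor, a "kafkaSpout-" prefix plus the sensor name otherwise) is taken from the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical helper that mirrors the spout ID rule from Example 2.
final class SpoutIdSketch {

    // Returns one spout ID per sensor, in the same order as the input.
    static List<String> spoutIds(List<String> sensorTypes) {
        List<String> ids = new ArrayList<>();
        for (String sensor : sensorTypes) {
            // A single sensor keeps the legacy ID; several sensors get a suffix each.
            ids.add(sensorTypes.size() > 1 ? "kafkaSpout-" + sensor : "kafkaSpout");
        }
        return ids;
    }

    public static void main(String[] args) {
        System.out.println(spoutIds(Collections.singletonList("bro")));
        System.out.println(spoutIds(Arrays.asList("bro", "snort")));
    }
}
```

Each ID returned this way would be passed both to `setSpout` and to `localOrShuffleGrouping`, which is why Example 2 collects the IDs in a list before declaring the parser bolt.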

Example 3 with KafkaSpoutConfig

use of org.apache.storm.kafka.spout.KafkaSpoutConfig in project storm by apache.

the class KafkaDataSourcesProvider method constructStreams.

@Override
public ISqlStreamsDataSource constructStreams(URI uri, String inputFormatClass, String outputFormatClass, Properties properties, List<FieldInfo> fields) {
    List<String> fieldNames = new ArrayList<>();
    int primaryIndex = -1;
    for (int i = 0; i < fields.size(); ++i) {
        FieldInfo f = fields.get(i);
        fieldNames.add(f.name());
        if (f.isPrimary()) {
            primaryIndex = i;
        }
    }
    Preconditions.checkState(primaryIndex != -1, "Kafka stream table must have a primary key");
    Scheme scheme = SerdeUtils.getScheme(inputFormatClass, properties, fieldNames);
    Map<String, String> values = parseUriParams(uri.getQuery());
    String bootstrapServers = values.get(URI_PARAMS_BOOTSTRAP_SERVERS);
    Preconditions.checkNotNull(bootstrapServers, "bootstrap-servers must be specified");
    String topic = uri.getHost();
    KafkaSpoutConfig<ByteBuffer, ByteBuffer> kafkaSpoutConfig = new KafkaSpoutConfig.Builder<ByteBuffer, ByteBuffer>(bootstrapServers, topic)
            .setProp(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, ByteBufferDeserializer.class)
            .setProp(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, ByteBufferDeserializer.class)
            .setProp(ConsumerConfig.GROUP_ID_CONFIG, "storm-sql-kafka-" + UUID.randomUUID().toString())
            .setRecordTranslator(new RecordTranslatorSchemeAdapter(scheme))
            .build();
    IOutputSerializer serializer = SerdeUtils.getSerializer(outputFormatClass, properties, fieldNames);
    return new KafkaStreamsDataSource(kafkaSpoutConfig, bootstrapServers, topic, properties, serializer);
}
Also used : Scheme(org.apache.storm.spout.Scheme) ByteBufferDeserializer(org.apache.kafka.common.serialization.ByteBufferDeserializer) ArrayList(java.util.ArrayList) KafkaSpoutConfig(org.apache.storm.kafka.spout.KafkaSpoutConfig) ByteBuffer(java.nio.ByteBuffer) IOutputSerializer(org.apache.storm.sql.runtime.IOutputSerializer) FieldInfo(org.apache.storm.sql.runtime.FieldInfo)
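
Example 3 reads the topic from the URI host and the broker list from a query parameter. The body of `parseUriParams` is not shown, so the following is only a sketch of how such parsing might work using `java.net.URI`. The class `KafkaUriSketch`, the method `parseQuery`, the `kafka://` scheme, and the literal key "bootstrap-servers" (guessed from the error message in the example) are all assumptions.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the URI handling in Example 3.
final class KafkaUriSketch {

    // Splits "k1=v1&k2=v2" into a map; pairs without a key or '=' are skipped.
    static Map<String, String> parseQuery(String query) {
        Map<String, String> params = new HashMap<>();
        if (query == null || query.isEmpty()) {
            return params;
        }
        for (String pair : query.split("&")) {
            int eq = pair.indexOf('=');
            if (eq > 0) {
                params.put(pair.substring(0, eq), pair.substring(eq + 1));
            }
        }
        return params;
    }

    public static void main(String[] args) {
        // Assumed URI shape: the host part names the topic.
        URI uri = URI.create("kafka://orders?bootstrap-servers=localhost:9092");
        String topic = uri.getHost();
        String servers = parseQuery(uri.getQuery()).get("bootstrap-servers");
        System.out.println(topic + " @ " + servers); // prints "orders @ localhost:9092"
    }
}
```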

Example 4 with KafkaSpoutConfig

use of org.apache.storm.kafka.spout.KafkaSpoutConfig in project metron by apache.

the class HDFSWriterCallback method initialize.

@Override
public void initialize(EmitContext context) {
    this.context = context;
    KafkaSpoutConfig spoutConfig = context.get(EmitContext.Type.SPOUT_CONFIG);
    if (spoutConfig != null && spoutConfig.getSubscription() != null) {
        this.topic = spoutConfig.getSubscription().getTopicsString();
        if (this.topic.length() > 0) {
            int len = this.topic.length();
            if (this.topic.charAt(0) == '[' && this.topic.charAt(len - 1) == ']') {
                this.topic = this.topic.substring(1, len - 1);
            }
        }
    } else {
        throw new IllegalStateException("Unable to initialize, because spout config is not correctly specified");
    }
}
Also used : KafkaSpoutConfig(org.apache.storm.kafka.spout.KafkaSpoutConfig)
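
The bracket handling in Example 4 can be isolated into a small helper. This is a minimal sketch under the assumption that the topics string may arrive rendered like a Java collection (for example "[enrichments]"); the class `TopicStringSketch` and the method `stripBrackets` are hypothetical names, not part of Metron or Storm.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper that mirrors the bracket stripping in Example 4.
final class TopicStringSketch {

    // Removes one pair of surrounding square brackets, if both are present.
    static String stripBrackets(String topics) {
        int len = topics.length();
        if (len >= 2 && topics.charAt(0) == '[' && topics.charAt(len - 1) == ']') {
            return topics.substring(1, len - 1);
        }
        return topics;
    }

    public static void main(String[] args) {
        List<String> topics = Arrays.asList("enrichments");
        String rendered = topics.toString(); // "[enrichments]"
        System.out.println(stripBrackets(rendered)); // prints "enrichments"
    }
}
```

A string without surrounding brackets is returned unchanged, which matches the behavior of the original `initialize` method.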

Example 5 with KafkaSpoutConfig

use of org.apache.storm.kafka.spout.KafkaSpoutConfig in project metron by apache.

the class SpoutConfigurationTest method testBuilderCreation.

@Test
public void testBuilderCreation() {
    Map<String, Object> config = new HashMap<String, Object>() {

        {
            put(SpoutConfiguration.OFFSET_COMMIT_PERIOD_MS.key, "1000");
            put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "foo:1234");
            put("group.id", "foobar");
        }
    };
    Map<String, Object> spoutConfig = SpoutConfiguration.separate(config);
    KafkaSpoutConfig.Builder<Object, Object> builder = new SimpleStormKafkaBuilder(config, "topic", null);
    SpoutConfiguration.configure(builder, spoutConfig);
    KafkaSpoutConfig c = builder.build();
    assertEquals(1000, c.getOffsetsCommitPeriodMs());
}
Also used : HashMap(java.util.HashMap) KafkaSpoutConfig(org.apache.storm.kafka.spout.KafkaSpoutConfig) Test(org.junit.jupiter.api.Test)
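
The test in Example 5 first separates spout-specific settings from the Kafka consumer settings and then applies them to the builder. The implementation of `SpoutConfiguration.separate` is not shown, so the sketch below only illustrates one way such a split might work. The class `SeparateConfigSketch`, the "spout." key prefix, and the choice to remove the matching keys from the input map are assumptions, not Metron's documented behavior.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

// Hypothetical illustration of splitting one config map into two.
final class SeparateConfigSketch {

    // Assumed prefix marking spout-specific keys.
    static final String SPOUT_PREFIX = "spout.";

    // Moves every key starting with SPOUT_PREFIX out of config into the returned map.
    static Map<String, Object> separate(Map<String, Object> config) {
        Map<String, Object> spoutOnly = new HashMap<>();
        Iterator<Map.Entry<String, Object>> it = config.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Object> entry = it.next();
            if (entry.getKey().startsWith(SPOUT_PREFIX)) {
                spoutOnly.put(entry.getKey(), entry.getValue());
                it.remove();
            }
        }
        return spoutOnly;
    }

    public static void main(String[] args) {
        Map<String, Object> config = new HashMap<>();
        config.put("spout.offsetCommitPeriodMs", "1000"); // hypothetical key name
        config.put("bootstrap.servers", "foo:1234");
        config.put("group.id", "foobar");

        Map<String, Object> spoutConfig = separate(config);
        System.out.println(spoutConfig.size()); // prints "1"
        System.out.println(config.size());      // prints "2"
    }
}
```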

Aggregations

KafkaSpoutConfig (org.apache.storm.kafka.spout.KafkaSpoutConfig) 5
ArrayList (java.util.ArrayList) 2
HashMap (java.util.HashMap) 2
ConsumerConfig (org.apache.kafka.clients.consumer.ConsumerConfig) 2
ParserConfigurations (org.apache.metron.common.configuration.ParserConfigurations) 2
SensorParserConfig (org.apache.metron.common.configuration.SensorParserConfig) 2
ParserBolt (org.apache.metron.parsers.bolt.ParserBolt) 2
WriterBolt (org.apache.metron.parsers.bolt.WriterBolt) 2
StormKafkaSpout (org.apache.metron.storm.kafka.flux.StormKafkaSpout) 2
Config (org.apache.storm.Config) 2
KafkaSpout (org.apache.storm.kafka.spout.KafkaSpout) 2
TopologyBuilder (org.apache.storm.topology.TopologyBuilder) 2
ByteBuffer (java.nio.ByteBuffer) 1
List (java.util.List) 1
Map (java.util.Map) 1
ByteBufferDeserializer (org.apache.kafka.common.serialization.ByteBufferDeserializer) 1
Scheme (org.apache.storm.spout.Scheme) 1
FieldInfo (org.apache.storm.sql.runtime.FieldInfo) 1
IOutputSerializer (org.apache.storm.sql.runtime.IOutputSerializer) 1
BoltDeclarer (org.apache.storm.topology.BoltDeclarer) 1