Example 1 with LocalTableDescriptor

Use of org.apache.samza.table.descriptors.LocalTableDescriptor in the Apache Samza project.

Class StreamApplicationDescriptorImpl, method getTable.

@Override
public <K, V> Table<KV<K, V>> getTable(TableDescriptor<K, V, ?> tableDescriptor) {
    addTableDescriptor(tableDescriptor);
    if (tableDescriptor instanceof LocalTableDescriptor) {
        LocalTableDescriptor localTableDescriptor = (LocalTableDescriptor) tableDescriptor;
        getOrCreateTableSerdes(localTableDescriptor.getTableId(), localTableDescriptor.getSerde());
    }
    return new TableImpl(tableDescriptor);
}
Also used : LocalTableDescriptor(org.apache.samza.table.descriptors.LocalTableDescriptor) TableImpl(org.apache.samza.operators.TableImpl)

Example 2 with LocalTableDescriptor

Use of org.apache.samza.table.descriptors.LocalTableDescriptor in the Apache Samza project.

Class JobNodeConfigurationGenerator, method configureTables.

private void configureTables(Map<String, String> generatedConfig, Config originalConfig, Map<String, TableDescriptor> tables, Set<String> inputs) {
    generatedConfig.putAll(TableConfigGenerator.generate(new MapConfig(generatedConfig), new ArrayList<>(tables.values())));
    // Add side inputs to the inputs and mark the stream as bootstrap
    tables.values().forEach(tableDescriptor -> {
        if (tableDescriptor instanceof LocalTableDescriptor) {
            LocalTableDescriptor localTableDescriptor = (LocalTableDescriptor) tableDescriptor;
            List<String> sideInputs = localTableDescriptor.getSideInputs();
            if (sideInputs != null && !sideInputs.isEmpty()) {
                sideInputs.stream().map(sideInput -> StreamUtil.getSystemStreamFromNameOrId(originalConfig, sideInput)).forEach(systemStream -> {
                    inputs.add(StreamUtil.getNameFromSystemStream(systemStream));
                    generatedConfig.put(String.format(StreamConfig.STREAM_PREFIX + StreamConfig.BOOTSTRAP, systemStream.getSystem(), systemStream.getStream()), "true");
                });
            }
        }
    });
}
Also used : ConfigUtil(org.apache.samza.util.ConfigUtil) TableDescriptor(org.apache.samza.table.descriptors.TableDescriptor) LoggerFactory(org.slf4j.LoggerFactory) JobConfig(org.apache.samza.config.JobConfig) HashMap(java.util.HashMap) JoinOperatorSpec(org.apache.samza.operators.spec.JoinOperatorSpec) Serde(org.apache.samza.serializers.Serde) LocalTableDescriptor(org.apache.samza.table.descriptors.LocalTableDescriptor) StringUtils(org.apache.commons.lang3.StringUtils) StreamConfig(org.apache.samza.config.StreamConfig) ArrayList(java.util.ArrayList) HashSet(java.util.HashSet) OperatorSpec(org.apache.samza.operators.spec.OperatorSpec) Map(java.util.Map) ApplicationConfig(org.apache.samza.config.ApplicationConfig) StreamUtil(org.apache.samza.util.StreamUtil) MapConfig(org.apache.samza.config.MapConfig) KV(org.apache.samza.operators.KV) NoOpSerde(org.apache.samza.serializers.NoOpSerde) SerializableSerde(org.apache.samza.serializers.SerializableSerde) StorageConfig(org.apache.samza.config.StorageConfig) Logger(org.slf4j.Logger) SerializerConfig(org.apache.samza.config.SerializerConfig) TaskConfig(org.apache.samza.config.TaskConfig) Collection(java.util.Collection) Set(java.util.Set) WindowOperatorSpec(org.apache.samza.operators.spec.WindowOperatorSpec) UUID(java.util.UUID) JavaTableConfig(org.apache.samza.config.JavaTableConfig) Collectors(java.util.stream.Collectors) SamzaException(org.apache.samza.SamzaException) StoreDescriptor(org.apache.samza.operators.spec.StoreDescriptor) Objects(java.util.Objects) Base64(java.util.Base64) List(java.util.List) StatefulOperatorSpec(org.apache.samza.operators.spec.StatefulOperatorSpec) Config(org.apache.samza.config.Config) MathUtil(org.apache.samza.util.MathUtil) TableConfigGenerator(org.apache.samza.table.TableConfigGenerator) Joiner(com.google.common.base.Joiner)
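
The side-input handling in configureTables builds one bootstrap config key per side-input stream by filling a format string with the system and stream names. The following is a minimal, dependency-free sketch of that step only. The class name, the method name, the two constant values, and the "system.stream" input convention are assumptions made for illustration; they are not taken from StreamConfig or StreamUtil.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class BootstrapConfigSketch {
    // Hypothetical stand-ins for StreamConfig.STREAM_PREFIX and StreamConfig.BOOTSTRAP.
    static final String STREAM_PREFIX = "systems.%s.streams.%s.";
    static final String BOOTSTRAP = "samza.bootstrap";

    // Builds a "<prefix><bootstrap> = true" entry for each side input.
    // Assumes each side input is written as "system.stream" (an illustration-only convention).
    public static Map<String, String> bootstrapConfig(List<String> sideInputs) {
        Map<String, String> config = new LinkedHashMap<>();
        for (String sideInput : sideInputs) {
            int dot = sideInput.indexOf('.');
            if (dot <= 0 || dot == sideInput.length() - 1) {
                throw new IllegalArgumentException("Expected system.stream but got: " + sideInput);
            }
            String system = sideInput.substring(0, dot);
            String stream = sideInput.substring(dot + 1);
            // Same pattern as the original: format the concatenated key, then store "true".
            config.put(String.format(STREAM_PREFIX + BOOTSTRAP, system, stream), "true");
        }
        return config;
    }

    public static void main(String[] args) {
        System.out.println(bootstrapConfig(Arrays.asList("kafka.profiles", "kafka.accounts")));
    }
}
```

In the original method the system and stream names come from a SystemStream resolved against the job config, so the string splitting here only replaces that lookup for the sake of a runnable example.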

Example 3 with LocalTableDescriptor

Use of org.apache.samza.table.descriptors.LocalTableDescriptor in the Apache Samza project.

Class ExecutionPlanner, method createJobGraph.

/**
 * Creates the physical graph from {@link ApplicationDescriptorImpl}
 */
/* package private */
JobGraph createJobGraph(ApplicationDescriptorImpl<? extends ApplicationDescriptor> appDesc) {
    JobGraph jobGraph = new JobGraph(config, appDesc);
    // Source streams contain both input and intermediate streams.
    Set<StreamSpec> sourceStreams = getStreamSpecs(appDesc.getInputStreamIds(), streamConfig);
    // Sink streams contain both output and intermediate streams.
    Set<StreamSpec> sinkStreams = getStreamSpecs(appDesc.getOutputStreamIds(), streamConfig);
    Set<StreamSpec> intermediateStreams = Sets.intersection(sourceStreams, sinkStreams);
    Set<StreamSpec> inputStreams = Sets.difference(sourceStreams, intermediateStreams);
    Set<StreamSpec> outputStreams = Sets.difference(sinkStreams, intermediateStreams);
    Set<TableDescriptor> tables = appDesc.getTableDescriptors();
    // Generate job.id and job.name configs from app.id and app.name if defined
    MapConfig generatedJobConfigs = JobPlanner.generateSingleJobConfig(config);
    String jobName = generatedJobConfigs.get(JobConfig.JOB_NAME);
    String jobId = generatedJobConfigs.get(JobConfig.JOB_ID, "1");
    // For this phase, we have a single job node for the whole DAG
    JobNode node = jobGraph.getOrCreateJobNode(jobName, jobId);
    // Add input streams
    inputStreams.forEach(spec -> jobGraph.addInputStream(spec, node));
    // Add output streams
    outputStreams.forEach(spec -> jobGraph.addOutputStream(spec, node));
    // Add intermediate streams
    intermediateStreams.forEach(spec -> jobGraph.addIntermediateStream(spec, node, node));
    // Add tables
    for (TableDescriptor table : tables) {
        jobGraph.addTable(table, node);
        // Add side-input streams (if any)
        if (table instanceof LocalTableDescriptor) {
            LocalTableDescriptor localTable = (LocalTableDescriptor) table;
            Iterable<String> sideInputs = ListUtils.emptyIfNull(localTable.getSideInputs());
            for (String sideInput : sideInputs) {
                jobGraph.addSideInputStream(getStreamSpec(sideInput, streamConfig));
            }
        }
    }
    if (!LegacyTaskApplication.class.isAssignableFrom(appDesc.getAppClass())) {
        // Validation is skipped only for LegacyTaskApplication, the one case where input streamIds may be empty.
        jobGraph.validate();
    }
    return jobGraph;
}
Also used : StreamUtil.getStreamSpec(org.apache.samza.util.StreamUtil.getStreamSpec) StreamSpec(org.apache.samza.system.StreamSpec) LocalTableDescriptor(org.apache.samza.table.descriptors.LocalTableDescriptor) LegacyTaskApplication(org.apache.samza.application.LegacyTaskApplication) MapConfig(org.apache.samza.config.MapConfig) TableDescriptor(org.apache.samza.table.descriptors.TableDescriptor)
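
createJobGraph classifies streams with set algebra: a stream that is both a source and a sink is intermediate, and the pure inputs and outputs are whatever remains after removing the intermediates. The sketch below reproduces that classification with plain java.util sets and stream names instead of StreamSpec objects. The class and method names are hypothetical, and the copy-then-retainAll/removeAll approach is a stand-in for the Guava Sets views used in the original.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;

public class StreamClassificationSketch {
    // Elements present in both sets (copy first so the inputs are not modified).
    public static <T> Set<T> intersection(Set<T> a, Set<T> b) {
        Set<T> result = new LinkedHashSet<>(a);
        result.retainAll(b);
        return result;
    }

    // Elements of 'all' that are not in 'excluded'.
    public static <T> Set<T> difference(Set<T> all, Set<T> excluded) {
        Set<T> result = new LinkedHashSet<>(all);
        result.removeAll(excluded);
        return result;
    }

    public static void main(String[] args) {
        // Source streams contain both input and intermediate streams.
        Set<String> sources = new LinkedHashSet<>(Arrays.asList("orders", "repartitioned"));
        // Sink streams contain both output and intermediate streams.
        Set<String> sinks = new LinkedHashSet<>(Arrays.asList("repartitioned", "results"));

        Set<String> intermediate = intersection(sources, sinks);
        Set<String> inputs = difference(sources, intermediate);
        Set<String> outputs = difference(sinks, intermediate);

        System.out.println("intermediate=" + intermediate + " inputs=" + inputs + " outputs=" + outputs);
    }
}
```

Unlike the Guava methods, which return live views over the underlying sets, these helpers return independent copies; for a one-shot classification like this the difference should not matter.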

Example 4 with LocalTableDescriptor

Use of org.apache.samza.table.descriptors.LocalTableDescriptor in the Apache Samza project.

Class TableConfigGenerator, method generateSerdeConfig.

/**
 * Generate serde configuration for provided tables
 *
 * @param tableDescriptors table descriptors for which serde configuration is to be generated
 * @return serde configuration for tables
 */
public static Map<String, String> generateSerdeConfig(List<TableDescriptor> tableDescriptors) {
    Map<String, String> serdeConfigs = new HashMap<>();
    // Collect key and msg serde instances for all the tables
    Map<String, Serde> tableKeySerdes = new HashMap<>();
    Map<String, Serde> tableValueSerdes = new HashMap<>();
    HashSet<Serde> serdes = new HashSet<>();
    tableDescriptors.stream().filter(d -> d instanceof LocalTableDescriptor).forEach(d -> {
        LocalTableDescriptor ld = (LocalTableDescriptor) d;
        tableKeySerdes.put(ld.getTableId(), ld.getSerde().getKeySerde());
        tableValueSerdes.put(ld.getTableId(), ld.getSerde().getValueSerde());
    });
    serdes.addAll(tableKeySerdes.values());
    serdes.addAll(tableValueSerdes.values());
    // Generate serde names
    SerializableSerde<Serde> serializableSerde = new SerializableSerde<>();
    Base64.Encoder base64Encoder = Base64.getEncoder();
    Map<Serde, String> serdeUUIDs = new HashMap<>();
    serdes.forEach(serde -> {
        String serdeName = serdeUUIDs.computeIfAbsent(serde, s -> serde.getClass().getSimpleName() + "-" + UUID.randomUUID().toString());
        serdeConfigs.putIfAbsent(String.format(SerializerConfig.SERDE_SERIALIZED_INSTANCE, serdeName), base64Encoder.encodeToString(serializableSerde.toBytes(serde)));
    });
    // Set key and msg serdes for tables to the serde names generated above
    tableKeySerdes.forEach((tableId, serde) -> {
        String keySerdeConfigKey = String.format(JavaTableConfig.STORE_KEY_SERDE, tableId);
        serdeConfigs.put(keySerdeConfigKey, serdeUUIDs.get(serde));
    });
    tableValueSerdes.forEach((tableId, serde) -> {
        String valueSerdeConfigKey = String.format(JavaTableConfig.STORE_MSG_SERDE, tableId);
        serdeConfigs.put(valueSerdeConfigKey, serdeUUIDs.get(serde));
    });
    return serdeConfigs;
}
Also used : Serde(org.apache.samza.serializers.Serde) SerializableSerde(org.apache.samza.serializers.SerializableSerde) Logger(org.slf4j.Logger) SerializerConfig(org.apache.samza.config.SerializerConfig) TableDescriptor(org.apache.samza.table.descriptors.TableDescriptor) LoggerFactory(org.slf4j.LoggerFactory) HashMap(java.util.HashMap) LocalTableDescriptor(org.apache.samza.table.descriptors.LocalTableDescriptor) UUID(java.util.UUID) JavaTableConfig(org.apache.samza.config.JavaTableConfig) HashSet(java.util.HashSet) Base64(java.util.Base64) List(java.util.List) Map(java.util.Map) Config(org.apache.samza.config.Config)
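
The core idea in generateSerdeConfig is that each distinct serde instance gets exactly one generated name, and every table that uses that serde points at the same name. A minimal sketch of just that naming step follows, with plain objects standing in for Serde instances. The class name, method name, and KEY_SERDE_FORMAT value are hypothetical, and the Base64 serialization of the serde itself is left out.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.UUID;

public class SerdeNamingSketch {
    // Hypothetical stand-in for JavaTableConfig.STORE_KEY_SERDE.
    static final String KEY_SERDE_FORMAT = "stores.%s.key.serde";

    // Maps each table's key-serde config key to a generated serde name.
    // Tables that share an equal serde object share one generated name.
    public static Map<String, String> keySerdeConfig(Map<String, Object> tableKeySerdes) {
        Map<Object, String> serdeNames = new HashMap<>();
        Map<String, String> config = new LinkedHashMap<>();
        tableKeySerdes.forEach((tableId, serde) -> {
            // computeIfAbsent generates the name only the first time a serde is seen.
            String name = serdeNames.computeIfAbsent(serde,
                s -> s.getClass().getSimpleName() + "-" + UUID.randomUUID());
            config.put(String.format(KEY_SERDE_FORMAT, tableId), name);
        });
        return config;
    }

    public static void main(String[] args) {
        Object shared = "string-serde-stand-in"; // plain objects stand in for Serde instances
        Map<String, Object> tables = new LinkedHashMap<>();
        tables.put("t1", shared);
        tables.put("t2", shared);
        tables.put("t3", Integer.valueOf(42));
        keySerdeConfig(tables).forEach((key, value) -> System.out.println(key + " = " + value));
    }
}
```

Because the lookup map is keyed by the serde object, sharing depends on that object's equals and hashCode; two separately constructed serdes without value equality would receive two different names.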

Example 5 with LocalTableDescriptor

Use of org.apache.samza.table.descriptors.LocalTableDescriptor in the Apache Samza project.

Class TaskApplicationDescriptorImpl, method withTable.

@Override
public TaskApplicationDescriptor withTable(TableDescriptor tableDescriptor) {
    addTableDescriptor(tableDescriptor);
    if (tableDescriptor instanceof LocalTableDescriptor) {
        LocalTableDescriptor localTableDescriptor = (LocalTableDescriptor) tableDescriptor;
        getOrCreateTableSerdes(localTableDescriptor.getTableId(), localTableDescriptor.getSerde());
    }
    return this;
}
Also used : LocalTableDescriptor(org.apache.samza.table.descriptors.LocalTableDescriptor)

Aggregations

LocalTableDescriptor (org.apache.samza.table.descriptors.LocalTableDescriptor): 6
TableDescriptor (org.apache.samza.table.descriptors.TableDescriptor): 4
ArrayList (java.util.ArrayList): 2
Base64 (java.util.Base64): 2
HashMap (java.util.HashMap): 2
HashSet (java.util.HashSet): 2
List (java.util.List): 2
Map (java.util.Map): 2
UUID (java.util.UUID): 2
Config (org.apache.samza.config.Config): 2
JavaTableConfig (org.apache.samza.config.JavaTableConfig): 2
MapConfig (org.apache.samza.config.MapConfig): 2
SerializerConfig (org.apache.samza.config.SerializerConfig): 2
OperatorSpec (org.apache.samza.operators.spec.OperatorSpec): 2
Serde (org.apache.samza.serializers.Serde): 2
SerializableSerde (org.apache.samza.serializers.SerializableSerde): 2
Joiner (com.google.common.base.Joiner): 1
Collection (java.util.Collection): 1
Objects (java.util.Objects): 1
Set (java.util.Set): 1