
Example 31 with StreamConfig

Use of co.cask.cdap.data2.transaction.stream.StreamConfig in project cdap by caskdata.

The class HiveStreamInputFormat, method getSplitFinder:

private StreamInputSplitFinder<InputSplit> getSplitFinder(JobConf conf) throws IOException {
    // first get the context we are in
    ContextManager.Context context = ContextManager.getContext(conf);
    Preconditions.checkNotNull(context);
    StreamConfig streamConfig = context.getStreamConfig(getStreamId(conf));
    // make sure we get the current generation so we don't read events that occurred before a truncate.
    Location streamPath = StreamUtils.createGenerationLocation(streamConfig.getLocation(), StreamUtils.getGeneration(streamConfig));
    StreamInputSplitFinder.Builder builder = StreamInputSplitFinder.builder(streamPath.toURI());
    // Get the Hive table path for the InputSplit created. It exists only to satisfy Hive; the InputFormat never uses it.
    JobContext jobContext = ShimLoader.getHadoopShims().newJobContext(Job.getInstance(conf));
    final Path[] tablePaths = FileInputFormat.getInputPaths(jobContext);
    return setupBuilder(conf, streamConfig, builder).build(new StreamInputSplitFactory<InputSplit>() {

        @Override
        public InputSplit createSplit(Path eventPath, Path indexPath, long startTime, long endTime, long start, long length, @Nullable String[] locations) {
            return new StreamInputSplit(tablePaths[0], eventPath, indexPath, startTime, endTime, start, length, locations);
        }
    });
}
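The call to StreamUtils.createGenerationLocation above resolves the stream's current generation directory so that events written before a truncate are not read. A minimal, self-contained sketch of that layout idea (the helper name, directory layout, and paths are illustrative assumptions, not CDAP's actual implementation):

```java
import java.net.URI;

public class GenerationPathSketch {

    // Hypothetical helper: model a generation as a numbered subdirectory
    // under the stream's base location. Truncating a stream bumps the
    // generation number, so readers of the current generation directory
    // never see pre-truncate events.
    static URI generationLocation(URI streamBase, int generation) {
        String base = streamBase.toString();
        if (!base.endsWith("/")) {
            base += "/";
        }
        return URI.create(base + generation);
    }

    public static void main(String[] args) {
        URI base = URI.create("hdfs://ns/streams/purchases");
        // Initial generation directory
        System.out.println(generationLocation(base, 0));
        // Directory after three truncates
        System.out.println(generationLocation(base, 3));
    }
}
```

The real StreamUtils also consults stream metadata to find the current generation number; this sketch only shows the path convention.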
Also used: Path (org.apache.hadoop.fs.Path), StreamConfig (co.cask.cdap.data2.transaction.stream.StreamConfig), StreamInputSplitFinder (co.cask.cdap.data.stream.StreamInputSplitFinder), ContextManager (co.cask.cdap.hive.context.ContextManager), JobContext (org.apache.hadoop.mapreduce.JobContext), InputSplit (org.apache.hadoop.mapred.InputSplit), Location (org.apache.twill.filesystem.Location)

Example 32 with StreamConfig

Use of co.cask.cdap.data2.transaction.stream.StreamConfig in project cdap by caskdata.

The class StreamInputFormatProvider, method getInputFormatConfiguration:

@Override
public Map<String, String> getInputFormatConfiguration() {
    try {
        StreamConfig streamConfig = streamAdmin.getConfig(streamId);
        Location streamPath = StreamUtils.createGenerationLocation(streamConfig.getLocation(), StreamUtils.getGeneration(streamConfig));
        Configuration hConf = new Configuration();
        hConf.clear();
        AbstractStreamInputFormat.setStreamId(hConf, streamId);
        AbstractStreamInputFormat.setTTL(hConf, streamConfig.getTTL());
        AbstractStreamInputFormat.setStreamPath(hConf, streamPath.toURI());
        AbstractStreamInputFormat.setTimeRange(hConf, streamInput.getStartTime(), streamInput.getEndTime());
        FormatSpecification formatSpec = streamInput.getBodyFormatSpec();
        if (formatSpec != null) {
            AbstractStreamInputFormat.setBodyFormatSpecification(hConf, formatSpec);
        } else {
            String decoderType = streamInput.getDecoderType();
            if (decoderType != null) {
                AbstractStreamInputFormat.setDecoderClassName(hConf, decoderType);
            }
        }
        return ConfigurationUtil.toMap(hConf);
    } catch (IOException e) {
        throw Throwables.propagate(e);
    }
}
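getInputFormatConfiguration builds a fresh, cleared Configuration, writes only the keys the stream input format needs, and converts it to a plain Map, preferring an explicit body FormatSpecification over a decoder class name. A self-contained sketch of that prefer-spec-else-decoder pattern using a plain map (the key names and helper here are illustrative assumptions, not CDAP's configuration keys):

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class InputFormatConfigSketch {

    // Hypothetical stand-in for the hConf-to-Map pattern above: start from
    // an empty map (mirroring hConf.clear()), set only the required keys,
    // and let an explicit body format spec take precedence over a decoder.
    static Map<String, String> buildConfig(String streamId, long ttlMillis,
                                           String bodyFormatSpec, String decoderType) {
        Map<String, String> conf = new LinkedHashMap<>();
        conf.put("input.streamId", streamId);          // illustrative key names
        conf.put("input.ttl", Long.toString(ttlMillis));
        if (bodyFormatSpec != null) {
            conf.put("input.bodyFormat", bodyFormatSpec);
        } else if (decoderType != null) {
            conf.put("input.decoderClass", decoderType);
        }
        return conf;
    }

    public static void main(String[] args) {
        // A format spec is present, so the decoder type is ignored.
        System.out.println(buildConfig("purchases", 3_600_000L, "csv", "MyDecoder"));
        // No format spec, so the decoder class name is used instead.
        System.out.println(buildConfig("purchases", 3_600_000L, null, "MyDecoder"));
    }
}
```

Returning a plain Map rather than the Configuration itself keeps the provider's output serializable and independent of Hadoop at the call site, which is what ConfigurationUtil.toMap accomplishes in the original.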
Also used: Configuration (org.apache.hadoop.conf.Configuration), FormatSpecification (co.cask.cdap.api.data.format.FormatSpecification), StreamConfig (co.cask.cdap.data2.transaction.stream.StreamConfig), IOException (java.io.IOException), Location (org.apache.twill.filesystem.Location)

Aggregations

Classes most frequently used together with StreamConfig across these examples:

StreamConfig (co.cask.cdap.data2.transaction.stream.StreamConfig): 18
StreamId (co.cask.cdap.proto.id.StreamId): 18
StreamEvent (co.cask.cdap.api.flow.flowlet.StreamEvent): 15
Test (org.junit.Test): 14
Location (org.apache.twill.filesystem.Location): 10
IOException (java.io.IOException): 7
ConsumerConfig (co.cask.cdap.data2.queue.ConsumerConfig): 6
NamespaceId (co.cask.cdap.proto.id.NamespaceId): 6
StreamAdmin (co.cask.cdap.data2.transaction.stream.StreamAdmin): 5
TableId (co.cask.cdap.data2.util.TableId): 5
TransactionContext (org.apache.tephra.TransactionContext): 5
NotificationFeedException (co.cask.cdap.notifications.feeds.NotificationFeedException): 3
FileNotFoundException (java.io.FileNotFoundException): 3
Properties (java.util.Properties): 3
StreamSpecification (co.cask.cdap.api.data.stream.StreamSpecification): 2
FileWriter (co.cask.cdap.data.file.FileWriter): 2
LevelDBTableCore (co.cask.cdap.data2.dataset2.lib.table.leveldb.LevelDBTableCore): 2
ColumnFamilyDescriptorBuilder (co.cask.cdap.data2.util.hbase.ColumnFamilyDescriptorBuilder): 2
TableDescriptorBuilder (co.cask.cdap.data2.util.hbase.TableDescriptorBuilder): 2
HBaseDDLExecutor (co.cask.cdap.spi.hbase.HBaseDDLExecutor): 2