Example 1 with FileNameFormat

Use of org.apache.storm.hdfs.trident.format.FileNameFormat in project storm by apache.

Class TridentFileTopology, method buildTopology.

public static StormTopology buildTopology(String hdfsUrl) {
    // Test spout that cycles through five fixed (sentence, key) tuples; 1000 is the maximum batch size.
    FixedBatchSpout spout = new FixedBatchSpout(new Fields("sentence", "key"), 1000,
            new Values("the cow jumped over the moon", 1L),
            new Values("the man went to the store and bought some candy", 2L),
            new Values("four score and seven years ago", 3L),
            new Values("how many apples can you eat", 4L),
            new Values("to be or not to be the person", 5L));
    spout.setCycle(true);
    TridentTopology topology = new TridentTopology();
    Stream stream = topology.newStream("spout1", spout);
    Fields hdfsFields = new Fields("sentence", "key");
    FileNameFormat fileNameFormat = new DefaultFileNameFormat()
            .withPath("/tmp/trident").withPrefix("trident").withExtension(".txt");
    RecordFormat recordFormat = new DelimitedRecordFormat().withFields(hdfsFields);
    // Rotate to a new file once the current one reaches 5 MB.
    FileRotationPolicy rotationPolicy = new FileSizeRotationPolicy(5.0f, FileSizeRotationPolicy.Units.MB);
    HdfsState.Options options = new HdfsState.HdfsFileOptions()
            .withFileNameFormat(fileNameFormat)
            .withRecordFormat(recordFormat)
            .withRotationPolicy(rotationPolicy)
            .withFsUrl(hdfsUrl).withConfigKey("hdfs.config");
    StateFactory factory = new HdfsStateFactory().withOptions(options);
    TridentState state = stream.partitionPersist(factory, hdfsFields, new HdfsUpdater(), new Fields());
    return topology.build();
}
Also used : DelimitedRecordFormat(org.apache.storm.hdfs.trident.format.DelimitedRecordFormat), TridentState(org.apache.storm.trident.TridentState), RecordFormat(org.apache.storm.hdfs.trident.format.RecordFormat), Values(org.apache.storm.tuple.Values), FileNameFormat(org.apache.storm.hdfs.trident.format.FileNameFormat), DefaultFileNameFormat(org.apache.storm.hdfs.trident.format.DefaultFileNameFormat), FileRotationPolicy(org.apache.storm.hdfs.trident.rotation.FileRotationPolicy), Fields(org.apache.storm.tuple.Fields), StateFactory(org.apache.storm.trident.state.StateFactory), TridentTopology(org.apache.storm.trident.TridentTopology), FileInputStream(java.io.FileInputStream), Stream(org.apache.storm.trident.Stream), InputStream(java.io.InputStream), FileSizeRotationPolicy(org.apache.storm.hdfs.trident.rotation.FileSizeRotationPolicy)
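
Example 1 configures a path, prefix and extension but does not show the resulting file names. The sketch below illustrates how such a name could be composed, assuming the pattern {prefix}-{partitionId}-{rotationNum}-{timestamp}{extension} that the Trident DefaultFileNameFormat is understood to follow; check the Javadoc of the Storm version in use. FileNameSketch, its constructor and the sample partition, rotation and timestamp values are hypothetical stand-ins, not part of Storm.

```java
// Hypothetical stand-in for illustration only; this is NOT the Storm class.
final class FileNameSketch {
    private final String path;
    private final String prefix;
    private final String extension;
    private final int partitionIndex;

    FileNameSketch(String path, String prefix, String extension, int partitionIndex) {
        this.path = path;
        this.prefix = prefix;
        this.extension = extension;
        this.partitionIndex = partitionIndex;
    }

    // Assumed pattern: {prefix}-{partitionId}-{rotationNum}-{timestamp}{extension}
    String getName(long rotation, long timeStamp) {
        return prefix + "-" + partitionIndex + "-" + rotation + "-" + timeStamp + extension;
    }

    // Directory the file is written to.
    String getPath() {
        return path;
    }

    public static void main(String[] args) {
        // Same path, prefix and extension as in Example 1; partition 0 and rotation 3 are made up.
        FileNameSketch format = new FileNameSketch("/tmp/trident", "trident", ".txt", 0);
        System.out.println(format.getPath() + "/" + format.getName(3, System.currentTimeMillis()));
    }
}
```

Under this assumption each partition writes to its own series of files, and the rotation counter plus timestamp keep successive files from colliding.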

Example 2 with FileNameFormat

Use of org.apache.storm.hdfs.trident.format.FileNameFormat in project storm by apache.

Class TridentSequenceTopology, method buildTopology.

public static StormTopology buildTopology(String hdfsUrl) {
    // Test spout that cycles through five fixed (sentence, key) tuples; 1000 is the maximum batch size.
    FixedBatchSpout spout = new FixedBatchSpout(new Fields("sentence", "key"), 1000,
            new Values("the cow jumped over the moon", 1L),
            new Values("the man went to the store and bought some candy", 2L),
            new Values("four score and seven years ago", 3L),
            new Values("how many apples can you eat", 4L),
            new Values("to be or not to be the person", 5L));
    spout.setCycle(true);
    TridentTopology topology = new TridentTopology();
    Stream stream = topology.newStream("spout1", spout);
    Fields hdfsFields = new Fields("sentence", "key");
    FileNameFormat fileNameFormat = new DefaultFileNameFormat()
            .withPath("/tmp/trident").withPrefix("trident").withExtension(".seq");
    FileRotationPolicy rotationPolicy = new FileSizeRotationPolicy(5.0f, FileSizeRotationPolicy.Units.MB);
    // Write sequence files with "key" as the key and "sentence" as the value; move each rotated file to /tmp/dest2/.
    HdfsState.Options seqOpts = new HdfsState.SequenceFileOptions()
            .withFileNameFormat(fileNameFormat)
            .withSequenceFormat(new DefaultSequenceFormat("key", "sentence"))
            .withRotationPolicy(rotationPolicy)
            .withFsUrl(hdfsUrl).withConfigKey("hdfs.config")
            .addRotationAction(new MoveFileAction().toDestination("/tmp/dest2/"));
    StateFactory factory = new HdfsStateFactory().withOptions(seqOpts);
    TridentState state = stream.partitionPersist(factory, hdfsFields, new HdfsUpdater(), new Fields());
    return topology.build();
}
Also used : TridentState(org.apache.storm.trident.TridentState), Values(org.apache.storm.tuple.Values), FileNameFormat(org.apache.storm.hdfs.trident.format.FileNameFormat), DefaultFileNameFormat(org.apache.storm.hdfs.trident.format.DefaultFileNameFormat), FileRotationPolicy(org.apache.storm.hdfs.trident.rotation.FileRotationPolicy), DefaultSequenceFormat(org.apache.storm.hdfs.trident.format.DefaultSequenceFormat), MoveFileAction(org.apache.storm.hdfs.common.rotation.MoveFileAction), Fields(org.apache.storm.tuple.Fields), StateFactory(org.apache.storm.trident.state.StateFactory), TridentTopology(org.apache.storm.trident.TridentTopology), FileInputStream(java.io.FileInputStream), Stream(org.apache.storm.trident.Stream), InputStream(java.io.InputStream), FileSizeRotationPolicy(org.apache.storm.hdfs.trident.rotation.FileSizeRotationPolicy)
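
Both examples rotate files with FileSizeRotationPolicy(5.0f, Units.MB). The sketch below shows one plausible way a size-based policy can decide when to rotate: it accumulates the bytes written since the last rotation and signals once a threshold is reached. SizeRotationSketch is a hypothetical class, and the assumptions that one MB equals 2^20 bytes and that the policy is driven by the current file offset are illustrative, not confirmed by this page.

```java
// Hypothetical stand-in for illustration only; this is NOT the Storm class.
final class SizeRotationSketch {
    static final long MB = 1L << 20; // assumed: 1 MB = 2^20 bytes

    private final long maxBytes;
    private long lastOffset = 0;
    private long currentBytesWritten = 0;

    SizeRotationSketch(float count, long unitBytes) {
        this.maxBytes = (long) (count * unitBytes);
    }

    // Called after each write with the current file offset; returns true when the file should rotate.
    boolean mark(long offset) {
        currentBytesWritten += offset - lastOffset;
        lastOffset = offset;
        return currentBytesWritten >= maxBytes;
    }

    // Called after a rotation so that counting starts again for the new file.
    void reset() {
        currentBytesWritten = 0;
        lastOffset = 0;
    }

    public static void main(String[] args) {
        SizeRotationSketch policy = new SizeRotationSketch(5.0f, MB);
        System.out.println(policy.mark(1 * MB)); // below the 5 MB threshold
        System.out.println(policy.mark(5 * MB)); // threshold reached
    }
}
```

In Example 2, a rotation signal of this kind would be the point at which the configured MoveFileAction relocates the finished file to /tmp/dest2/.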

Aggregations

FileInputStream (java.io.FileInputStream): 2
InputStream (java.io.InputStream): 2
DefaultFileNameFormat (org.apache.storm.hdfs.trident.format.DefaultFileNameFormat): 2
FileNameFormat (org.apache.storm.hdfs.trident.format.FileNameFormat): 2
FileRotationPolicy (org.apache.storm.hdfs.trident.rotation.FileRotationPolicy): 2
FileSizeRotationPolicy (org.apache.storm.hdfs.trident.rotation.FileSizeRotationPolicy): 2
Stream (org.apache.storm.trident.Stream): 2
TridentState (org.apache.storm.trident.TridentState): 2
TridentTopology (org.apache.storm.trident.TridentTopology): 2
StateFactory (org.apache.storm.trident.state.StateFactory): 2
Fields (org.apache.storm.tuple.Fields): 2
Values (org.apache.storm.tuple.Values): 2
MoveFileAction (org.apache.storm.hdfs.common.rotation.MoveFileAction): 1
DefaultSequenceFormat (org.apache.storm.hdfs.trident.format.DefaultSequenceFormat): 1
DelimitedRecordFormat (org.apache.storm.hdfs.trident.format.DelimitedRecordFormat): 1
RecordFormat (org.apache.storm.hdfs.trident.format.RecordFormat): 1