Example 11 with CompressionType

Use of org.apache.hadoop.io.SequenceFile.CompressionType in project hive by apache.

From the class Utilities, method createSequenceWriter.

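/**
 * Create a SequenceFile writer based on the job configuration. Whether to compress is decided by
 * the user-supplied flag (rather than by obtaining that flag from the job configuration).
 *
 * @param jc
 *          Job configuration
 * @param fs
 *          File System to create file in
 * @param file
 *          Path to be created
 * @param keyClass
 *          Java Class for key
 * @param valClass
 *          Java Class for value
 * @param isCompressed
 *          Whether to compress the output; the compression type and codec are read from the
 *          job configuration only when this is true
 * @param progressable
 *          Progress reporter passed through to the writer
 * @return writer over the created SequenceFile
 */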
public static SequenceFile.Writer createSequenceWriter(JobConf jc, FileSystem fs, Path file, Class<?> keyClass, Class<?> valClass, boolean isCompressed, Progressable progressable) throws IOException {
    CompressionCodec codec = null;
    CompressionType compressionType = CompressionType.NONE;
    Class<? extends CompressionCodec> codecClass = null;
    if (isCompressed) {
        compressionType = SequenceFileOutputFormat.getOutputCompressionType(jc);
        codecClass = FileOutputFormat.getOutputCompressorClass(jc, DefaultCodec.class);
        codec = ReflectionUtil.newInstance(codecClass, jc);
    }
    return SequenceFile.createWriter(fs, jc, file, keyClass, valClass, compressionType, codec, progressable);
}
Also used: DefaultCodec (org.apache.hadoop.io.compress.DefaultCodec), CompressionCodec (org.apache.hadoop.io.compress.CompressionCodec), CompressionType (org.apache.hadoop.io.SequenceFile.CompressionType)
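
The decision logic in createSequenceWriter can be looked at in isolation: the compression type and codec are taken from the configuration only when the caller's flag is true, otherwise the defaults (NONE and no codec) are used. Below is a minimal sketch of that branching in plain Java, with no Hadoop dependency. Everything in it is a stand-in invented for illustration: the CompressionChoice class, its nested CompressionType enum, the Map used in place of JobConf, the two config keys, and the fallback values are all assumptions, not Hadoop or Hive APIs.

```java
import java.util.Map;

public class CompressionChoice {
    // Stand-in for SequenceFile.CompressionType; not the Hadoop enum itself.
    public enum CompressionType { NONE, RECORD, BLOCK }

    // Hypothetical config keys, used only by this sketch.
    public static final String TYPE_KEY = "sketch.output.compression.type";
    public static final String CODEC_KEY = "sketch.output.compression.codec";

    // Holds the two values that would be handed to the writer.
    public static final class Choice {
        public final CompressionType type;
        public final String codec; // codec name, or null when uncompressed

        Choice(CompressionType type, String codec) {
            this.type = type;
            this.codec = codec;
        }
    }

    // Mirrors the branching in createSequenceWriter: the configuration is
    // consulted only when the caller asks for compression.
    public static Choice resolve(Map<String, String> conf, boolean isCompressed) {
        if (!isCompressed) {
            // Flag is off: keep the defaults and ignore the configuration.
            return new Choice(CompressionType.NONE, null);
        }
        // Fallback values here are assumptions of this sketch.
        CompressionType type =
            CompressionType.valueOf(conf.getOrDefault(TYPE_KEY, "RECORD"));
        String codec = conf.getOrDefault(CODEC_KEY, "DefaultCodec");
        return new Choice(type, codec);
    }

    public static void main(String[] args) {
        Map<String, String> conf = Map.of(TYPE_KEY, "BLOCK");
        Choice off = resolve(conf, false);
        Choice on = resolve(conf, true);
        System.out.println(off.type + " " + off.codec);
        System.out.println(on.type + " " + on.codec);
    }
}
```

The point of the sketch is that a configuration requesting BLOCK compression has no effect unless the caller passes true, which matches the Javadoc's note that the flag is user-supplied rather than read from the job configuration.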
