
Example 1 with Writer

Use of org.apache.accumulo.core.file.rfile.RFile.Writer in project accumulo by apache.

From the class SplitLarge, method main.

public static void main(String[] args) throws Exception {
    Configuration conf = CachedConfiguration.getInstance();
    FileSystem fs = FileSystem.get(conf);
    Opts opts = new Opts();
    opts.parseArgs(SplitLarge.class.getName(), args);
    for (String file : opts.files) {
        // Validate the extension before opening any file handles.
        if (!file.endsWith(".rf")) {
            throw new IllegalArgumentException("File must end with .rf");
        }
        AccumuloConfiguration aconf = DefaultConfiguration.getInstance();
        Path path = new Path(file);
        CachableBlockFile.Reader rdr = new CachableBlockFile.Reader(fs, path, conf, null, null, aconf);
        try (Reader iter = new RFile.Reader(rdr)) {
            // Output names replace the trailing ".rf" with "_small.rf" and "_large.rf".
            String smallName = file.substring(0, file.length() - 3) + "_small.rf";
            String largeName = file.substring(0, file.length() - 3) + "_large.rf";
            int blockSize = (int) aconf.getAsBytes(Property.TABLE_FILE_BLOCK_SIZE);
            try (Writer small = new RFile.Writer(new CachableBlockFile.Writer(fs, new Path(smallName), "gz", null, conf, aconf), blockSize);
                Writer large = new RFile.Writer(new CachableBlockFile.Writer(fs, new Path(largeName), "gz", null, conf, aconf), blockSize)) {
                small.startDefaultLocalityGroup();
                large.startDefaultLocalityGroup();
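                // Seek over the full key range with no column-family filter.
                iter.seek(new Range(), new ArrayList<>(), false);
                while (iter.hasTop()) {
                    Key key = iter.getTopKey();
                    Value value = iter.getTopValue();
                    // Entries strictly below opts.maxSize go to the small file; all others go to the large file.
                    if (key.getSize() + value.getSize() < opts.maxSize) {
                        small.append(key, value);
                    } else {
                        large.append(key, value);
                    }
                    iter.next();
                }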
            }
        }
    }
}
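
The routing rule in the method above can be tried without an Accumulo or Hadoop installation. What follows is a minimal sketch, not the project's code: the class SplitBySizeSketch, its Entry type, and the helpers outputName and split are hypothetical names introduced here for illustration, and in-memory lists stand in for the two RFile writers. It mirrors only the ".rf" name derivation and the strict less-than comparison against the size threshold.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, dependency-free sketch of the split rule; not part of Accumulo.
class SplitBySizeSketch {

    /** Hypothetical stand-in for a Key/Value pair; only the byte sizes matter here. */
    static final class Entry {
        final byte[] key;
        final byte[] value;

        Entry(byte[] key, byte[] value) {
            this.key = key;
            this.value = value;
        }

        int size() {
            return key.length + value.length;
        }
    }

    /** Replaces the trailing ".rf" with suffix + ".rf", rejecting other extensions. */
    static String outputName(String file, String suffix) {
        if (!file.endsWith(".rf")) {
            throw new IllegalArgumentException("File must end with .rf");
        }
        return file.substring(0, file.length() - 3) + suffix + ".rf";
    }

    /** Appends entries strictly smaller than maxSize to small, all others to large. */
    static void split(List<Entry> entries, long maxSize, List<Entry> small, List<Entry> large) {
        for (Entry e : entries) {
            if (e.size() < maxSize) {
                small.add(e);
            } else {
                large.add(e);
            }
        }
    }

    public static void main(String[] args) {
        List<Entry> entries = new ArrayList<>();
        entries.add(new Entry(new byte[4], new byte[3]));   // 7 bytes
        entries.add(new Entry(new byte[4], new byte[6]));   // 10 bytes, equal to the threshold
        entries.add(new Entry(new byte[8], new byte[32]));  // 40 bytes

        List<Entry> small = new ArrayList<>();
        List<Entry> large = new ArrayList<>();
        split(entries, 10, small, large);

        System.out.println(outputName("data.rf", "_small") + ": " + small.size() + " entries");
        System.out.println(outputName("data.rf", "_large") + ": " + large.size() + " entries");
    }
}
```

Because the comparison is strict, an entry whose combined size equals the threshold lands in the large output, which matches the `<` test in the original loop.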
Also used: Path (org.apache.hadoop.fs.Path), DefaultConfiguration (org.apache.accumulo.core.conf.DefaultConfiguration), AccumuloConfiguration (org.apache.accumulo.core.conf.AccumuloConfiguration), Property (org.apache.accumulo.core.conf.Property), CachedConfiguration (org.apache.accumulo.core.util.CachedConfiguration), Configuration (org.apache.hadoop.conf.Configuration), Reader (org.apache.accumulo.core.file.rfile.RFile.Reader), Range (org.apache.accumulo.core.data.Range), FileSystem (org.apache.hadoop.fs.FileSystem), Value (org.apache.accumulo.core.data.Value), CachableBlockFile (org.apache.accumulo.core.file.blockfile.impl.CachableBlockFile), Writer (org.apache.accumulo.core.file.rfile.RFile.Writer), Key (org.apache.accumulo.core.data.Key), ArrayList (java.util.ArrayList)
