Example 1 with GridFS

use of com.mongodb.gridfs.GridFS in project mongo-hadoop by mongodb.

the class GridFSInputFormat method getSplits.

@Override
public List<InputSplit> getSplits(final JobContext context) throws IOException, InterruptedException {
    Configuration conf = context.getConfiguration();
    DBCollection inputCollection = MongoConfigUtil.getInputCollection(conf);
    MongoClientURI inputURI = MongoConfigUtil.getInputURI(conf);
    GridFS gridFS = new GridFS(inputCollection.getDB(), inputCollection.getName());
    DBObject query = MongoConfigUtil.getQuery(conf);
    List<InputSplit> splits = new LinkedList<InputSplit>();
    for (GridFSDBFile file : gridFS.find(query)) {
        if (MongoConfigUtil.isGridFSWholeFileSplit(conf)) {
            // One split per file.
            splits.add(new GridFSSplit(inputURI, (ObjectId) file.getId(), (int) file.getChunkSize(), file.getLength()));
        } else {
            // One split per file chunk.
            for (int chunk = 0; chunk < file.numChunks(); ++chunk) {
                splits.add(new GridFSSplit(inputURI, (ObjectId) file.getId(), (int) file.getChunkSize(), file.getLength(), chunk));
            }
        }
    }
    LOG.debug("Found GridFS splits: " + splits);
    return splits;
}
Also used : DBCollection(com.mongodb.DBCollection) GridFSSplit(com.mongodb.hadoop.input.GridFSSplit) Configuration(org.apache.hadoop.conf.Configuration) ObjectId(org.bson.types.ObjectId) GridFSDBFile(com.mongodb.gridfs.GridFSDBFile) MongoClientURI(com.mongodb.MongoClientURI) GridFS(com.mongodb.gridfs.GridFS) DBObject(com.mongodb.DBObject) InputSplit(org.apache.hadoop.mapreduce.InputSplit) LinkedList(java.util.LinkedList)
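
The two split modes in getSplits can be illustrated without a MongoDB connection. The following is a minimal sketch, not mongo-hadoop code: the class ChunkSplitSketch and its methods are hypothetical names, and the mapping from a chunk index to a byte range is an assumption about how a per-chunk split would be read. It relies only on the GridFS property that a file is stored as consecutive chunks of chunkSize bytes, with a possibly shorter final chunk.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration of whole-file versus per-chunk splits. */
final class ChunkSplitSketch {

    /** Number of chunks needed to hold {@code length} bytes (ceiling division). */
    static int numChunks(long length, long chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        return (int) ((length + chunkSize - 1) / chunkSize);
    }

    /** Returns one {start, endExclusive} byte range per split. */
    static List<long[]> splitRanges(long length, long chunkSize, boolean wholeFile) {
        List<long[]> ranges = new ArrayList<>();
        if (wholeFile) {
            // One split covering the entire file.
            ranges.add(new long[] {0L, length});
            return ranges;
        }
        // One split per chunk; the last chunk may be shorter than chunkSize.
        int chunks = numChunks(length, chunkSize);
        for (int chunk = 0; chunk < chunks; ++chunk) {
            long start = chunk * chunkSize;
            long end = Math.min(start + chunkSize, length);
            ranges.add(new long[] {start, end});
        }
        return ranges;
    }
}
```

For a 1000-byte file with 256-byte chunks, the per-chunk mode yields four ranges, the last one covering bytes 768 to 1000, while the whole-file mode yields a single range.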

Example 2 with GridFS

use of com.mongodb.gridfs.GridFS in project mongo-hadoop by mongodb.

the class GridFSSplit method getGridFS.

private GridFS getGridFS() {
    if (null == gridFS) {
        DBCollection rootCollection = MongoConfigUtil.getCollection(inputURI);
        gridFS = new GridFS(rootCollection.getDB(), rootCollection.getName());
    }
    return gridFS;
}
Also used : DBCollection(com.mongodb.DBCollection) GridFS(com.mongodb.gridfs.GridFS)

Example 3 with GridFS

use of com.mongodb.gridfs.GridFS in project play-cookbook by spinscale.

the class GridFsHelper method getGridFS.

private static GridFS getGridFS() {
    String collection = Play.configuration.getProperty("morphia.db.collection.upload", "uploads");
    GridFS fs = new GridFS(MorphiaPlugin.ds().getDB(), collection);
    return fs;
}
Also used : GridFS(com.mongodb.gridfs.GridFS)

Example 4 with GridFS

use of com.mongodb.gridfs.GridFS in project beam by apache.

the class MongoDBGridFSIOTest method setup.

@BeforeClass
public static void setup() throws Exception {
    try (ServerSocket serverSocket = new ServerSocket(0)) {
        port = serverSocket.getLocalPort();
    }
    LOG.info("Starting MongoDB embedded instance on {}", port);
    try {
        Files.forceDelete(new File(MONGODB_LOCATION));
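    } catch (Exception e) {
        // Best-effort cleanup: a failure here (for example, no directory left
        // over from a previous run) is deliberately ignored.
    }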
    new File(MONGODB_LOCATION).mkdirs();
    IMongodConfig mongodConfig = new MongodConfigBuilder().version(Version.Main.PRODUCTION).configServer(false).replication(new Storage(MONGODB_LOCATION, null, 0)).net(new Net("localhost", port, Network.localhostIsIPv6())).cmdOptions(new MongoCmdOptionsBuilder().syncDelay(10).useNoPrealloc(true).useSmallFiles(true).useNoJournal(true).build()).build();
    mongodExecutable = mongodStarter.prepare(mongodConfig);
    mongodProcess = mongodExecutable.start();
    LOG.info("Insert test data");
    Mongo client = new Mongo("localhost", port);
    DB database = client.getDB(DATABASE);
    GridFS gridfs = new GridFS(database);
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    for (int x = 0; x < 100; x++) {
        out.write(("Einstein\nDarwin\nCopernicus\nPasteur\n" + "Curie\nFaraday\nNewton\nBohr\nGalilei\nMaxwell\n").getBytes());
    }
    for (int x = 0; x < 5; x++) {
        gridfs.createFile(new ByteArrayInputStream(out.toByteArray()), "file" + x).save();
    }
    gridfs = new GridFS(database, "mapBucket");
    long now = System.currentTimeMillis();
    Random random = new Random();
    String[] scientists = { "Einstein", "Darwin", "Copernicus", "Pasteur", "Curie", "Faraday", "Newton", "Bohr", "Galilei", "Maxwell" };
    for (int x = 0; x < 10; x++) {
        GridFSInputFile file = gridfs.createFile("file_" + x);
        OutputStream outf = file.getOutputStream();
        OutputStreamWriter writer = new OutputStreamWriter(outf);
        for (int y = 0; y < 5000; y++) {
            long time = now - random.nextInt(3600000);
            String name = scientists[y % scientists.length];
            writer.write(Long.toString(time) + "\t");
            writer.write(name + "\t");
            writer.write(Integer.toString(random.nextInt(100)));
            writer.write("\n");
        }
        for (int y = 0; y < scientists.length; y++) {
            String name = scientists[y % scientists.length];
            writer.write(Long.toString(now) + "\t");
            writer.write(name + "\t");
            writer.write("101");
            writer.write("\n");
        }
        writer.flush();
        writer.close();
    }
    client.close();
}
Also used : Mongo(com.mongodb.Mongo) ByteArrayOutputStream(java.io.ByteArrayOutputStream) OutputStream(java.io.OutputStream) ServerSocket(java.net.ServerSocket) GridFS(com.mongodb.gridfs.GridFS) IOException(java.io.IOException) GridFSInputFile(com.mongodb.gridfs.GridFSInputFile) Storage(de.flapdoodle.embed.mongo.config.Storage) Random(java.util.Random) ByteArrayInputStream(java.io.ByteArrayInputStream) IMongodConfig(de.flapdoodle.embed.mongo.config.IMongodConfig) OutputStreamWriter(java.io.OutputStreamWriter) Net(de.flapdoodle.embed.mongo.config.Net) GridFSDBFile(com.mongodb.gridfs.GridFSDBFile) File(java.io.File) MongodConfigBuilder(de.flapdoodle.embed.mongo.config.MongodConfigBuilder) MongoCmdOptionsBuilder(de.flapdoodle.embed.mongo.config.MongoCmdOptionsBuilder) DB(com.mongodb.DB) BeforeClass(org.junit.BeforeClass)
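
The setup method above picks a port for the embedded server by opening a ServerSocket on port 0 and closing it again. A minimal standalone sketch of that idiom follows; the class FreePortSketch and the method findFreePort are hypothetical names, and wrapping the IOException in an UncheckedIOException is a choice made here for brevity, not something the test does.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.ServerSocket;

/** Hypothetical helper showing the "bind to port 0" idiom. */
final class FreePortSketch {

    /** Asks the operating system for a currently unused TCP port. */
    static int findFreePort() {
        // Port 0 tells the OS to choose any free ephemeral port.
        try (ServerSocket socket = new ServerSocket(0)) {
            return socket.getLocalPort();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("Free port: " + findFreePort());
    }
}
```

The socket is closed before the port is used, so another process could claim the port in the meantime. For a test fixture this small race is usually acceptable.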

Example 5 with GridFS

use of com.mongodb.gridfs.GridFS in project camel by apache.

the class GridFsEndpoint method initializeConnection.

@SuppressWarnings("deprecation")
public void initializeConnection() throws Exception {
    LOG.info("Initialize GridFS endpoint: {}", this.toString());
    if (database == null) {
        throw new IllegalStateException("Missing required endpoint configuration: database");
    }
    db = mongoConnection.getDB(database);
    if (db == null) {
        throw new IllegalStateException("Could not initialize GridFsComponent. Database " + database + " does not exist.");
    }
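    // The anonymous subclass exists so that its instance initializer can call
    // getFilesCollection() (presumably not public in this driver version) and
    // store the result in the endpoint's filesCollection field.
    gridFs = new GridFS(db, bucket == null ? GridFS.DEFAULT_BUCKET : bucket) {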

        {
            filesCollection = getFilesCollection();
        }
    };
}
Also used : GridFS(com.mongodb.gridfs.GridFS)

Aggregations

GridFS (com.mongodb.gridfs.GridFS): 9
DB (com.mongodb.DB): 3
GridFSDBFile (com.mongodb.gridfs.GridFSDBFile): 3
GridFSInputFile (com.mongodb.gridfs.GridFSInputFile): 3
OutputStream (java.io.OutputStream): 3
DBCollection (com.mongodb.DBCollection): 2
Mongo (com.mongodb.Mongo): 2
MongoClient (com.mongodb.MongoClient): 2
MongoClientURI (com.mongodb.MongoClientURI): 2
ByteArrayInputStream (java.io.ByteArrayInputStream): 2
ByteArrayOutputStream (java.io.ByteArrayOutputStream): 2
File (java.io.File): 2
IOException (java.io.IOException): 2
DBObject (com.mongodb.DBObject): 1
GridFSSplit (com.mongodb.hadoop.input.GridFSSplit): 1
IMongodConfig (de.flapdoodle.embed.mongo.config.IMongodConfig): 1
MongoCmdOptionsBuilder (de.flapdoodle.embed.mongo.config.MongoCmdOptionsBuilder): 1
MongodConfigBuilder (de.flapdoodle.embed.mongo.config.MongodConfigBuilder): 1
Net (de.flapdoodle.embed.mongo.config.Net): 1
Storage (de.flapdoodle.embed.mongo.config.Storage): 1