Example 1 with MongoClientURI

use of com.mongodb.MongoClientURI in project immutables by immutables.

the class RepositorySetup method forUri.

/**
   * Create setup using MongoDB client uri.
   * <ul>
   * <li>URI should contain database path segment</li>
   * <li>New internal {@link MongoClient} will be created</li>
   * <li>New internal executor will be created (with shutdown on jvm exit)</li>
   * <li>New {@link Gson} instance will be created configured with type adapter factory providers</li>
   * </ul>
   * <p>
   * Setup created by this factory method should be reused to configure collection repositories for
   * the same MongoDB database.
   * <p>
   * This factory method is designed for ease of use in sample scenarios. For more flexibility consider
   * using {@link #builder()} with custom constructed {@link ListeningExecutorService
   * executor} and {@link DB database} handle.
   * @param uri string that will be parsed as {@link MongoClientURI}.
   * @see MongoClientURI
   * @return repository setup instance.
   */
public static RepositorySetup forUri(String uri) {
    MongoClientURI clientUri = new MongoClientURI(uri);
    @Nullable String databaseName = clientUri.getDatabase();
    checkArgument(databaseName != null, "URI should contain database path segment");
    return builder().database(newMongoClient(clientUri).getDB(databaseName)).executor(newExecutor()).gson(createGson()).build();
}
Also used : MongoClientURI(com.mongodb.MongoClientURI) Nullable(javax.annotation.Nullable)

Example 2 with MongoClientURI

use of com.mongodb.MongoClientURI in project mongo-hadoop by mongodb.

the class GridFSInputFormat method getSplits.

@Override
public List<InputSplit> getSplits(final JobContext context) throws IOException, InterruptedException {
    Configuration conf = context.getConfiguration();
    DBCollection inputCollection = MongoConfigUtil.getInputCollection(conf);
    MongoClientURI inputURI = MongoConfigUtil.getInputURI(conf);
    GridFS gridFS = new GridFS(inputCollection.getDB(), inputCollection.getName());
    DBObject query = MongoConfigUtil.getQuery(conf);
    List<InputSplit> splits = new LinkedList<InputSplit>();
    for (GridFSDBFile file : gridFS.find(query)) {
        // One split per file.
        if (MongoConfigUtil.isGridFSWholeFileSplit(conf)) {
            splits.add(new GridFSSplit(inputURI, (ObjectId) file.getId(), (int) file.getChunkSize(), file.getLength()));
        } else {
            // One split per file chunk.
            for (int chunk = 0; chunk < file.numChunks(); ++chunk) {
                splits.add(new GridFSSplit(inputURI, (ObjectId) file.getId(), (int) file.getChunkSize(), file.getLength(), chunk));
            }
        }
    }
    LOG.debug("Found GridFS splits: " + splits);
    return splits;
}
Also used : DBCollection(com.mongodb.DBCollection) GridFSSplit(com.mongodb.hadoop.input.GridFSSplit) Configuration(org.apache.hadoop.conf.Configuration) ObjectId(org.bson.types.ObjectId) GridFSDBFile(com.mongodb.gridfs.GridFSDBFile) MongoClientURI(com.mongodb.MongoClientURI) GridFS(com.mongodb.gridfs.GridFS) DBObject(com.mongodb.DBObject) InputSplit(org.apache.hadoop.mapreduce.InputSplit) LinkedList(java.util.LinkedList)
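
The two branches in getSplits differ only in how many splits each file contributes: one per file, or one per chunk. The following is a minimal sketch of that counting, not code from mongo-hadoop. The class GridFsSplitCount and its methods are hypothetical, and it assumes a file occupies ceil(length / chunkSize) chunks.

```java
// Hypothetical sketch, not mongo-hadoop code: counts the splits that the
// whole-file and per-chunk modes would produce for a set of file lengths.
final class GridFsSplitCount {

    // Assumption: a file occupies ceil(length / chunkSize) chunks.
    static int numChunks(long length, long chunkSize) {
        return (int) ((length + chunkSize - 1) / chunkSize);
    }

    // One split per file in whole-file mode, otherwise one split per chunk.
    static int countSplits(long[] fileLengths, long chunkSize, boolean wholeFileSplit) {
        int splits = 0;
        for (long length : fileLengths) {
            splits += wholeFileSplit ? 1 : numChunks(length, chunkSize);
        }
        return splits;
    }

    public static void main(String[] args) {
        long[] lengths = {100L, 600L, 512L};
        System.out.println(countSplits(lengths, 256L, true));
        System.out.println(countSplits(lengths, 256L, false));
    }
}
```

Under that assumption an empty file yields no splits in per-chunk mode, because the chunk loop never runs for it.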

Example 3 with MongoClientURI

use of com.mongodb.MongoClientURI in project mongo-hadoop by mongodb.

the class SingleMongoSplitter method calculateSplits.

@Override
public List<InputSplit> calculateSplits() {
    if (LOG.isDebugEnabled()) {
        MongoClientURI inputURI = MongoConfigUtil.getInputURI(getConfiguration());
        LOG.debug(format("SingleMongoSplitter calculating splits for namespace: %s.%s; hosts: %s", inputURI.getDatabase(), inputURI.getCollection(), inputURI.getHosts()));
    }
    return Collections.singletonList((InputSplit) new MongoInputSplit(getConfiguration()));
}
Also used : MongoInputSplit(com.mongodb.hadoop.input.MongoInputSplit) MongoClientURI(com.mongodb.MongoClientURI)

Example 4 with MongoClientURI

use of com.mongodb.MongoClientURI in project mongo-hadoop by mongodb.

the class StandaloneMongoSplitter method calculateSplits.

@Override
public List<InputSplit> calculateSplits() throws SplitFailedException {
    final DBObject splitKey = MongoConfigUtil.getInputSplitKey(getConfiguration());
    final DBObject splitKeyMax = MongoConfigUtil.getMaxSplitKey(getConfiguration());
    final DBObject splitKeyMin = MongoConfigUtil.getMinSplitKey(getConfiguration());
    final int splitSize = MongoConfigUtil.getSplitSize(getConfiguration());
    final MongoClientURI inputURI;
    DBCollection inputCollection = null;
    final ArrayList<InputSplit> returnVal;
    try {
        inputURI = MongoConfigUtil.getInputURI(getConfiguration());
        MongoClientURI authURI = MongoConfigUtil.getAuthURI(getConfiguration());
        if (authURI != null) {
            inputCollection = MongoConfigUtil.getCollectionWithAuth(inputURI, authURI);
        } else {
            inputCollection = MongoConfigUtil.getCollection(inputURI);
        }
        returnVal = new ArrayList<InputSplit>();
        final String ns = inputCollection.getFullName();
        if (LOG.isDebugEnabled()) {
            LOG.debug(String.format("Running splitVector on namespace: %s.%s; hosts: %s", inputURI.getDatabase(), inputURI.getCollection(), inputURI.getHosts()));
        }
        final DBObject cmd = BasicDBObjectBuilder.start("splitVector", ns).add("keyPattern", splitKey).add("min", splitKeyMin).add("max", splitKeyMax).add("force", false).add("maxChunkSize", splitSize).get();
        CommandResult data;
        boolean ok = true;
        try {
            data = inputCollection.getDB().getSisterDB(inputURI.getDatabase()).command(cmd, ReadPreference.primary());
        } catch (final MongoException e) {
            // 2.0 servers throw exceptions rather than info in a CommandResult
            data = null;
            LOG.info(e.getMessage(), e);
            if (e.getMessage().contains("unrecognized command: splitVector")) {
                ok = false;
            } else {
                throw e;
            }
        }
        if (data != null) {
            if (data.containsField("$err")) {
                throw new SplitFailedException("Error calculating splits: " + data);
            } else if (!data.get("ok").equals(1.0)) {
                ok = false;
            }
        }
        if (!ok) {
            final CommandResult stats = inputCollection.getStats();
            if (stats.containsField("primary")) {
                final DBCursor shards = inputCollection.getDB().getSisterDB("config").getCollection("shards").find(new BasicDBObject("_id", stats.getString("primary")));
                try {
                    if (shards.hasNext()) {
                        final DBObject shard = shards.next();
                        final String host = ((String) shard.get("host")).replace(shard.get("_id") + "/", "");
                        final MongoClientURI shardHost;
                        if (authURI != null) {
                            shardHost = new MongoClientURIBuilder(authURI).host(host).build();
                        } else {
                            shardHost = new MongoClientURIBuilder(inputURI).host(host).build();
                        }
                        MongoClient shardClient = null;
                        try {
                            shardClient = new MongoClient(shardHost);
                            data = shardClient.getDB(shardHost.getDatabase()).command(cmd, ReadPreference.primary());
                        } catch (final Exception e) {
                            LOG.error(e.getMessage(), e);
                        } finally {
                            if (shardClient != null) {
                                shardClient.close();
                            }
                        }
                    }
                } finally {
                    shards.close();
                }
            }
            if (data != null && !data.get("ok").equals(1.0)) {
                throw new SplitFailedException("Unable to calculate input splits: " + data.get("errmsg"));
            }
        }
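        // Comes in a format where "min" and "max" are implicit
        // and each entry is just a boundary key; not ranged.
        // Note: data can still be null here if the first splitVector call threw
        // and the shard fallback did not produce a result.
        final BasicDBList splitData = (BasicDBList) data.get("splitKeys");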
        if (splitData.size() == 0) {
            LOG.warn("WARNING: No Input Splits were calculated by the split code. Proceeding with a *single* split. Data may be too" + " small, try lowering 'mongo.input.split_size' if this is undesirable.");
        }
        // Lower boundary of the first min split
        BasicDBObject lastKey = null;
        // If splitKeyMin was given, use it as first boundary.
        if (!splitKeyMin.toMap().isEmpty()) {
            lastKey = new BasicDBObject(splitKeyMin.toMap());
        }
        for (final Object aSplitData : splitData) {
            final BasicDBObject currentKey = (BasicDBObject) aSplitData;
            returnVal.add(createSplitFromBounds(lastKey, currentKey));
            lastKey = currentKey;
        }
        BasicDBObject maxKey = null;
        // If splitKeyMax was given, use it as last boundary.
        if (!splitKeyMax.toMap().isEmpty()) {
            maxKey = new BasicDBObject(splitKeyMax.toMap());
        }
        // Last max split
        final MongoInputSplit lastSplit = createSplitFromBounds(lastKey, maxKey);
        returnVal.add(lastSplit);
    } finally {
        if (inputCollection != null) {
            MongoConfigUtil.close(inputCollection.getDB().getMongo());
        }
    }
    if (MongoConfigUtil.isFilterEmptySplitsEnabled(getConfiguration())) {
        return filterEmptySplits(returnVal);
    }
    return returnVal;
}
Also used : MongoException(com.mongodb.MongoException) MongoInputSplit(com.mongodb.hadoop.input.MongoInputSplit) MongoClientURI(com.mongodb.MongoClientURI) BasicDBObject(com.mongodb.BasicDBObject) DBObject(com.mongodb.DBObject) CommandResult(com.mongodb.CommandResult) DBCollection(com.mongodb.DBCollection) MongoClient(com.mongodb.MongoClient) BasicDBList(com.mongodb.BasicDBList) DBCursor(com.mongodb.DBCursor) MongoClientURIBuilder(com.mongodb.hadoop.util.MongoClientURIBuilder) InputSplit(org.apache.hadoop.mapreduce.InputSplit)
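
The tail of calculateSplits turns a list of boundary keys into ranges: each key closes the previous range and opens the next, and the optional min and max keys bound the first and last range. A minimal sketch of that step follows. The class SplitBounds is hypothetical, Integer keys stand in for BasicDBObject boundary keys, and null stands for a bound that was not given.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration, not mongo-hadoop code. Integer keys stand in for
// BasicDBObject boundary keys; null marks a bound that was not given.
final class SplitBounds {

    /** A range between two boundary keys; either end may be null (open). */
    static final class Range {
        final Integer min;
        final Integer max;

        Range(Integer min, Integer max) {
            this.min = min;
            this.max = max;
        }

        @Override
        public String toString() {
            return "(" + min + ", " + max + ")";
        }
    }

    // Each boundary key ends the current range and starts the next one.
    static List<Range> toRanges(Integer min, List<Integer> splitKeys, Integer max) {
        List<Range> ranges = new ArrayList<>();
        Integer lastKey = min;
        for (Integer key : splitKeys) {
            ranges.add(new Range(lastKey, key));
            lastKey = key;
        }
        // The final range runs from the last boundary key to the optional max.
        ranges.add(new Range(lastKey, max));
        return ranges;
    }

    public static void main(String[] args) {
        System.out.println(toRanges(null, Arrays.asList(10, 20), null));
    }
}
```

N boundary keys therefore produce N + 1 ranges, and an empty key list produces exactly one range, which matches the single-split warning logged in the method above.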

Example 5 with MongoClientURI

use of com.mongodb.MongoClientURI in project mongo-hadoop by mongodb.

the class MongoConfigUtil method getMongoURIs.

public static List<MongoClientURI> getMongoURIs(final Configuration conf, final String key) {
    String raw = conf.get(key);
    List<MongoClientURI> result = new LinkedList<MongoClientURI>();
    if (raw != null && !raw.trim().isEmpty()) {
        for (String connectionString : raw.split("mongodb://")) {
            // Try to be forgiving with formatting.
            connectionString = StringUtils.strip(connectionString, ", ");
            if (!connectionString.isEmpty()) {
                result.add(new MongoClientURI("mongodb://" + connectionString));
            }
        }
    }
    return result;
}
Also used : MongoClientURI(com.mongodb.MongoClientURI) LinkedList(java.util.LinkedList)
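
The string handling in getMongoURIs can be tried without the MongoDB driver. The sketch below is an illustration under stated assumptions, not the project's code: the class ConnectionStringSplitter is hypothetical, it returns plain strings instead of MongoClientURI objects, and stripSeparators is a hand-written stand-in for StringUtils.strip(connectionString, ", ").

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of mongo-hadoop: mirrors the string handling in
// getMongoURIs but returns plain strings instead of MongoClientURI objects.
final class ConnectionStringSplitter {
    private static final String PREFIX = "mongodb://";

    static List<String> split(String raw) {
        List<String> result = new ArrayList<>();
        if (raw == null || raw.trim().isEmpty()) {
            return result;
        }
        // Splitting on the scheme prefix removes it from every piece,
        // so it is added back once the piece has been cleaned.
        for (String piece : raw.split(PREFIX)) {
            String cleaned = stripSeparators(piece);
            if (!cleaned.isEmpty()) {
                result.add(PREFIX + cleaned);
            }
        }
        return result;
    }

    // Stand-in for StringUtils.strip(s, ", "): trims commas and spaces from both ends.
    private static String stripSeparators(String s) {
        int start = 0;
        int end = s.length();
        while (start < end && isSeparator(s.charAt(start))) {
            start++;
        }
        while (end > start && isSeparator(s.charAt(end - 1))) {
            end--;
        }
        return s.substring(start, end);
    }

    private static boolean isSeparator(char c) {
        return c == ',' || c == ' ';
    }

    public static void main(String[] args) {
        System.out.println(split("mongodb://host1:27017/db.a, mongodb://host2:27017/db.b"));
    }
}
```

Splitting on the prefix rather than on commas keeps a connection string with several comma-separated hosts in one piece, at the cost of an empty leading piece that the isEmpty check discards.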

Aggregations

MongoClientURI (com.mongodb.MongoClientURI): 64
MongoClient (com.mongodb.MongoClient): 27
DBCollection (com.mongodb.DBCollection): 12
BasicDBObject (com.mongodb.BasicDBObject): 9
Test (org.junit.Test): 9
MongoClientURIBuilder (com.mongodb.hadoop.util.MongoClientURIBuilder): 8
Configuration (org.apache.hadoop.conf.Configuration): 8
DBObject (com.mongodb.DBObject): 7
ArrayList (java.util.ArrayList): 7
List (java.util.List): 7
InputSplit (org.apache.hadoop.mapreduce.InputSplit): 6
DB (com.mongodb.DB): 5
MongoDatabase (com.mongodb.client.MongoDatabase): 5
MongoInputSplit (com.mongodb.hadoop.input.MongoInputSplit): 5
IOException (java.io.IOException): 5
MongoConnection (org.apache.jackrabbit.oak.plugins.document.util.MongoConnection): 5
Document (org.bson.Document): 5
OptionParser (joptsimple.OptionParser): 4
OptionSet (joptsimple.OptionSet): 4
DocumentNodeStore (org.apache.jackrabbit.oak.plugins.document.DocumentNodeStore): 4