Example 16 with MongoInputSplit

Use of com.mongodb.hadoop.input.MongoInputSplit in project mongo-hadoop by mongodb.

Class StandaloneMongoSplitterTest, method testFilterEmptySplitsNoQuery.

@Test
public void testFilterEmptySplitsNoQuery() throws SplitFailedException {
    Configuration config = new Configuration();
    MongoConfigUtil.setInputURI(config, uri);
    MongoConfigUtil.setEnableFilterEmptySplits(config, true);
    MongoConfigUtil.setSplitSize(config, 1);
    StandaloneMongoSplitter splitter = new StandaloneMongoSplitter(config);
    List<InputSplit> splits = splitter.calculateSplits();
    // No splits should be elided, because there's no query.
    for (InputSplit split : splits) {
        assertNotEquals(0, (((MongoInputSplit) split).getCursor().itcount()));
    }
    assertSplitsCount(collection.count(), splits);
}
Also used : Configuration(org.apache.hadoop.conf.Configuration) InputSplit(org.apache.hadoop.mapreduce.InputSplit) MongoInputSplit(com.mongodb.hadoop.input.MongoInputSplit) Test(org.junit.Test)

Example 17 with MongoInputSplit

Use of com.mongodb.hadoop.input.MongoInputSplit in project mongo-hadoop by mongodb.

Class StandaloneMongoSplitterTest, method testNullUpperBound.

@Test
public void testNullUpperBound() throws Exception {
    Configuration config = new Configuration();
    StandaloneMongoSplitter splitter = new StandaloneMongoSplitter(config);
    BasicDBObject lowerBound = new BasicDBObject("a", 10);
    MongoInputSplit split = splitter.createSplitFromBounds(lowerBound, null);
    assertEquals(10, split.getMin().get("a"));
    assertEquals(new BasicDBObject(), split.getMax());
}
Also used : BasicDBObject(com.mongodb.BasicDBObject) MongoInputSplit(com.mongodb.hadoop.input.MongoInputSplit) Configuration(org.apache.hadoop.conf.Configuration) Test(org.junit.Test)

Example 18 with MongoInputSplit

Use of com.mongodb.hadoop.input.MongoInputSplit in project mongo-hadoop by mongodb.

Class StandaloneMongoSplitterTest, method testLowerUpperBounds.

@Test
public void testLowerUpperBounds() throws Exception {
    Configuration config = new Configuration();
    StandaloneMongoSplitter splitter = new StandaloneMongoSplitter(config);
    BasicDBObject lowerBound = new BasicDBObject("a", 0);
    BasicDBObject upperBound = new BasicDBObject("a", 10);
    MongoInputSplit split = splitter.createSplitFromBounds(lowerBound, upperBound);
    assertEquals(0, split.getMin().get("a"));
    assertEquals(10, split.getMax().get("a"));
}
Also used : BasicDBObject(com.mongodb.BasicDBObject) MongoInputSplit(com.mongodb.hadoop.input.MongoInputSplit) Configuration(org.apache.hadoop.conf.Configuration) Test(org.junit.Test)

Example 19 with MongoInputSplit

Use of com.mongodb.hadoop.input.MongoInputSplit in project mongo-hadoop by mongodb.

Class MongoInputSplitTest, method testConstructor.

@Test
public void testConstructor() {
    Configuration conf = new Configuration();
    MongoConfigUtil.setFields(conf, "{\"field\": 1}");
    MongoConfigUtil.setAuthURI(conf, "mongodb://auth");
    MongoConfigUtil.setInputURI(conf, "mongodb://input");
    MongoConfigUtil.setInputKey(conf, "field");
    MongoConfigUtil.setMaxSplitKey(conf, "{\"field\": 1e9}");
    MongoConfigUtil.setMinSplitKey(conf, "{\"field\": -1e9}");
    MongoConfigUtil.setNoTimeout(conf, true);
    MongoConfigUtil.setQuery(conf, "{\"foo\": 42}");
    MongoConfigUtil.setSort(conf, "{\"foo\": -1}");
    MongoConfigUtil.setSkip(conf, 10);
    MongoInputSplit mis = new MongoInputSplit(conf);
    assertEquals(MongoConfigUtil.getFields(conf), mis.getFields());
    assertEquals(MongoConfigUtil.getAuthURI(conf), mis.getAuthURI());
    assertEquals(MongoConfigUtil.getInputURI(conf), mis.getInputURI());
    assertEquals(MongoConfigUtil.getInputKey(conf), mis.getKeyField());
    assertEquals(MongoConfigUtil.getMaxSplitKey(conf), mis.getMax());
    assertEquals(MongoConfigUtil.getMinSplitKey(conf), mis.getMin());
    assertEquals(MongoConfigUtil.isNoTimeout(conf), mis.getNoTimeout());
    assertEquals(MongoConfigUtil.getQuery(conf), mis.getQuery());
    assertEquals(MongoConfigUtil.getSort(conf), mis.getSort());
    assertEquals(MongoConfigUtil.getLimit(conf), (int) mis.getLimit());
    assertEquals(MongoConfigUtil.getSkip(conf), (int) mis.getSkip());
    MongoInputSplit mis2 = new MongoInputSplit(mis);
    assertEquals(mis, mis2);
}
Also used : MongoInputSplit(com.mongodb.hadoop.input.MongoInputSplit) Configuration(org.apache.hadoop.conf.Configuration) Test(org.junit.Test)

Example 20 with MongoInputSplit

Use of com.mongodb.hadoop.input.MongoInputSplit in project mongo-hadoop by mongodb.

Class MongoCollectionSplitter, method createSplitFromBounds.

/**
 * Create an instance of MongoInputSplit that represents a view of this
 * splitter's input URI between the given lower/upper bounds. If this
 * splitter has range queries enabled, it will attempt to use $gte/$lt
 * clauses in the query construct to create the split, otherwise it will use
 * min/max index boundaries (default behavior).
 *
 * @param lowerBound the lower bound of the collection
 * @param upperBound the upper bound of the collection
 * @return a MongoInputSplit in the given bounds
 * @throws SplitFailedException if the split could not be created
 */
public MongoInputSplit createSplitFromBounds(final BasicDBObject lowerBound, final BasicDBObject upperBound) throws SplitFailedException {
    LOG.info("Created split: min=" + (lowerBound != null ? lowerBound.toString() : "null") + ", max= " + (upperBound != null ? upperBound.toString() : "null"));
    //Objects to contain upper/lower bounds for each split
    DBObject splitMin = new BasicDBObject();
    DBObject splitMax = new BasicDBObject();
    if (lowerBound != null) {
        for (Entry<String, Object> entry : lowerBound.entrySet()) {
            String key = entry.getKey();
            Object val = entry.getValue();
            if (!val.equals(MIN_KEY_TYPE)) {
                splitMin.put(key, val);
            }
        }
    }
    if (upperBound != null) {
        for (Entry<String, Object> entry : upperBound.entrySet()) {
            String key = entry.getKey();
            Object val = entry.getValue();
            if (!val.equals(MAX_KEY_TYPE)) {
                splitMax.put(key, val);
            }
        }
    }
    MongoInputSplit split = null;
    // If enabled, attempt to build the split using $gte/$lt.
    if (MongoConfigUtil.isRangeQueryEnabled(getConfiguration())) {
        try {
            DBObject query = MongoConfigUtil.getQuery(getConfiguration());
            split = createRangeQuerySplit(lowerBound, upperBound, query);
        } catch (Exception e) {
            throw new SplitFailedException("Couldn't use range query to create split: " + e.getMessage());
        }
    }
    if (split == null) {
        split = new MongoInputSplit(getConfiguration());
        split.setMin(splitMin);
        split.setMax(splitMax);
    }
    return split;
}
Also used : BasicDBObject(com.mongodb.BasicDBObject) MongoInputSplit(com.mongodb.hadoop.input.MongoInputSplit) BSONObject(org.bson.BSONObject) DBObject(com.mongodb.DBObject)
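
The Javadoc for createSplitFromBounds describes two ways of expressing a split: a $gte/$lt clause merged into the query, or plain min/max boundaries with the MinKey/MaxKey sentinels stripped out. The following is a minimal sketch of both ideas using only java.util maps in place of BasicDBObject. It is not the mongo-hadoop implementation: the class SplitBoundsSketch, the methods stripSentinel and rangeQuery, and the two sentinel constants are hypothetical names introduced for illustration, and the real createRangeQuerySplit may handle compound keys and empty bounds differently.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative stand-in only; none of these names exist in mongo-hadoop.
final class SplitBoundsSketch {

    // Hypothetical sentinels standing in for MIN_KEY_TYPE / MAX_KEY_TYPE.
    public static final Object MIN_SENTINEL = "MinKey-sentinel";
    public static final Object MAX_SENTINEL = "MaxKey-sentinel";

    // Copies a bound, dropping entries whose value equals the sentinel.
    // A null bound yields an empty map, mirroring the empty max in testNullUpperBound.
    public static Map<String, Object> stripSentinel(Map<String, Object> bound, Object sentinel) {
        Map<String, Object> out = new LinkedHashMap<>();
        if (bound != null) {
            for (Map.Entry<String, Object> entry : bound.entrySet()) {
                if (!sentinel.equals(entry.getValue())) {
                    out.put(entry.getKey(), entry.getValue());
                }
            }
        }
        return out;
    }

    // Builds {key: {$gte: lower, $lt: upper}} and merges it into a copy of baseQuery.
    // A null lower or upper value leaves that side of the range open.
    public static Map<String, Object> rangeQuery(String key, Object lower, Object upper,
                                                 Map<String, Object> baseQuery) {
        Map<String, Object> range = new LinkedHashMap<>();
        if (lower != null) {
            range.put("$gte", lower);
        }
        if (upper != null) {
            range.put("$lt", upper);
        }
        Map<String, Object> query = new LinkedHashMap<>(baseQuery);
        if (!range.isEmpty()) {
            query.put(key, range);
        }
        return query;
    }

    public static void main(String[] args) {
        Map<String, Object> lower = new LinkedHashMap<>();
        lower.put("a", 0);
        Map<String, Object> baseQuery = new LinkedHashMap<>();
        baseQuery.put("foo", 42);

        // Min/max style: bounds kept as separate objects.
        System.out.println("min=" + stripSentinel(lower, MIN_SENTINEL)
                + ", max=" + stripSentinel(null, MAX_SENTINEL));
        // Range-query style: bounds folded into the query itself.
        System.out.println(rangeQuery("a", 0, 10, baseQuery));
    }
}
```

In this sketch the min/max style keeps the user's query untouched and carries the bounds separately, while the range-query style folds the bounds into the query, which is why it only makes sense for a single split key.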

Aggregations

MongoInputSplit (com.mongodb.hadoop.input.MongoInputSplit) 21
Test (org.junit.Test) 13
BasicDBObject (com.mongodb.BasicDBObject) 12
Configuration (org.apache.hadoop.conf.Configuration) 12
InputSplit (org.apache.hadoop.mapreduce.InputSplit) 11
DBObject (com.mongodb.DBObject) 7
MongoClientURI (com.mongodb.MongoClientURI) 5
BasicDBList (com.mongodb.BasicDBList) 3
BaseHadoopTest (com.mongodb.hadoop.testutils.BaseHadoopTest) 3
DBCollection (com.mongodb.DBCollection) 2
MongoClient (com.mongodb.MongoClient) 2
MongoClientURIBuilder (com.mongodb.hadoop.util.MongoClientURIBuilder) 2
ArrayList (java.util.ArrayList) 2
List (java.util.List) 2
BSONObject (org.bson.BSONObject) 2
CommandResult (com.mongodb.CommandResult) 1
DBCursor (com.mongodb.DBCursor) 1
MongoException (com.mongodb.MongoException) 1
MongoRecordReader (com.mongodb.hadoop.input.MongoRecordReader) 1
MongoRecordReader (com.mongodb.hadoop.mapred.input.MongoRecordReader) 1