Example 6 with MongoInputSplit

Use of com.mongodb.hadoop.input.MongoInputSplit in project mongo-hadoop by mongodb.

Class SampleSplitterTest, method testAllOnOneSplit.

@Test
public void testAllOnOneSplit() throws SplitFailedException {
    assumeTrue(isSampleOperatorSupported(uri));
    Configuration conf = new Configuration();
    MongoConfigUtil.setInputURI(conf, uri.getURI());
    // Split size is enough to encapsulate all documents.
    MongoConfigUtil.setSplitSize(conf, 12);
    splitter.setConfiguration(conf);
    List<InputSplit> splits = splitter.calculateSplits();
    assertEquals(1, splits.size());
    MongoInputSplit firstSplit = (MongoInputSplit) splits.get(0);
    assertTrue(firstSplit.getMin().toMap().isEmpty());
    assertTrue(firstSplit.getMax().toMap().isEmpty());
}
Also used : MongoInputSplit(com.mongodb.hadoop.input.MongoInputSplit) Configuration(org.apache.hadoop.conf.Configuration) InputSplit(org.apache.hadoop.mapreduce.InputSplit) Test(org.junit.Test) BaseHadoopTest(com.mongodb.hadoop.testutils.BaseHadoopTest)

Example 7 with MongoInputSplit

Use of com.mongodb.hadoop.input.MongoInputSplit in project mongo-hadoop by mongodb.

Class SampleSplitterTest, method testAlternateSplitKey.

@Test
public void testAlternateSplitKey() throws SplitFailedException {
    assumeTrue(isSampleOperatorSupported(uri));
    Configuration conf = new Configuration();
    MongoConfigUtil.setInputURI(conf, uri.getURI());
    MongoConfigUtil.setSplitSize(conf, 1);
    MongoConfigUtil.setInputSplitKeyPattern(conf, "{\"i\": 1}");
    splitter.setConfiguration(conf);
    List<InputSplit> splits = splitter.calculateSplits();
    assertEquals(12, splits.size());
    MongoInputSplit firstSplit = (MongoInputSplit) splits.get(0);
    assertTrue(firstSplit.getMin().toMap().isEmpty());
    MongoInputSplit lastSplit = (MongoInputSplit) splits.get(11);
    assertTrue(lastSplit.getMax().toMap().isEmpty());
    // Ranges for splits are ascending.
    int lastKey = (Integer) firstSplit.getMax().get("i");
    for (int i = 1; i < splits.size() - 1; i++) {
        MongoInputSplit split = (MongoInputSplit) splits.get(i);
        int currentKey = (Integer) split.getMax().get("i");
        assertTrue(currentKey > lastKey);
        lastKey = currentKey;
    }
}
Also used : MongoInputSplit(com.mongodb.hadoop.input.MongoInputSplit) Configuration(org.apache.hadoop.conf.Configuration) InputSplit(org.apache.hadoop.mapreduce.InputSplit) Test(org.junit.Test) BaseHadoopTest(com.mongodb.hadoop.testutils.BaseHadoopTest)
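The assertions in testAlternateSplitKey describe a chain of contiguous ranges: the first split has no lower bound, the last has no upper bound, and the upper bounds in between ascend. The following is a minimal sketch of that shape using plain Java types. RangeSketch, Range, and rangesFromBoundaries are hypothetical names for illustration and are not part of mongo-hadoop; how SampleSplitter actually chooses its boundary keys is not shown in this listing.

```java
import java.util.ArrayList;
import java.util.List;

class RangeSketch {
    // Hypothetical value type: a null min or max means "unbounded" on that side.
    static final class Range {
        final Integer min;
        final Integer max;

        Range(Integer min, Integer max) {
            this.min = min;
            this.max = max;
        }
    }

    // Turns n ascending boundary keys into n + 1 contiguous ranges.
    // Assumes the caller passes the boundaries already sorted.
    static List<Range> rangesFromBoundaries(List<Integer> boundaries) {
        List<Range> ranges = new ArrayList<>();
        Integer previous = null; // the first range has no lower bound
        for (Integer boundary : boundaries) {
            ranges.add(new Range(previous, boundary));
            previous = boundary;
        }
        ranges.add(new Range(previous, null)); // the last range has no upper bound
        return ranges;
    }

    public static void main(String[] args) {
        List<Integer> boundaries = new ArrayList<>();
        for (int i = 1; i <= 11; i++) {
            boundaries.add(i * 10);
        }
        List<Range> ranges = rangesFromBoundaries(boundaries);
        System.out.println(ranges.size()); // 11 boundaries give 12 ranges
    }
}
```

With 11 boundary keys the sketch yields 12 ranges, the same count the test asserts for a split size of 1.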

Example 8 with MongoInputSplit

Use of com.mongodb.hadoop.input.MongoInputSplit in project mongo-hadoop by mongodb.

Class StandaloneMongoSplitterTest, method testNullLowerBound.

@Test
public void testNullLowerBound() throws Exception {
    Configuration config = new Configuration();
    StandaloneMongoSplitter splitter = new StandaloneMongoSplitter(config);
    BasicDBObject upperBound = new BasicDBObject("a", 10);
    MongoInputSplit split = splitter.createSplitFromBounds(null, upperBound);
    assertEquals(new BasicDBObject(), split.getMin());
    assertEquals(10, split.getMax().get("a"));
}
Also used : BasicDBObject(com.mongodb.BasicDBObject) MongoInputSplit(com.mongodb.hadoop.input.MongoInputSplit) Configuration(org.apache.hadoop.conf.Configuration) Test(org.junit.Test)

Example 9 with MongoInputSplit

Use of com.mongodb.hadoop.input.MongoInputSplit in project mongo-hadoop by mongodb.

Class StandaloneMongoSplitterTest, method testNullBounds.

@Test
public void testNullBounds() throws Exception {
    Configuration config = new Configuration();
    StandaloneMongoSplitter splitter = new StandaloneMongoSplitter(config);
    MongoInputSplit split = splitter.createSplitFromBounds(null, null);
    assertEquals(new BasicDBObject(), split.getMin());
    assertEquals(new BasicDBObject(), split.getMax());
}
Also used : BasicDBObject(com.mongodb.BasicDBObject) MongoInputSplit(com.mongodb.hadoop.input.MongoInputSplit) Configuration(org.apache.hadoop.conf.Configuration) Test(org.junit.Test)
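The two StandaloneMongoSplitterTest cases above expect a null bound to come back as an empty document from getMin() or getMax(). A minimal sketch of that convention follows, using java.util maps in place of BasicDBObject so it runs without the MongoDB driver. BoundsSketch and normalize are hypothetical names and are not the library's implementation of createSplitFromBounds.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class BoundsSketch {
    // Hypothetical helper: a null bound is treated as "unbounded" and is
    // represented by an empty map, mirroring the empty BasicDBObject that
    // the tests above expect. A non-null bound is copied unchanged.
    static Map<String, Object> normalize(Map<String, Object> bound) {
        if (bound == null) {
            return new LinkedHashMap<>();
        }
        return new LinkedHashMap<>(bound);
    }

    public static void main(String[] args) {
        Map<String, Object> upper = new LinkedHashMap<>();
        upper.put("a", 10);
        System.out.println(normalize(null).isEmpty());
        System.out.println(normalize(upper).get("a"));
    }
}
```

Returning an empty map rather than null lets callers compare bounds with equals() without a null check, which is what the assertEquals(new BasicDBObject(), ...) assertions rely on.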

Example 10 with MongoInputSplit

Use of com.mongodb.hadoop.input.MongoInputSplit in project mongo-hadoop by mongodb.

Class StandaloneMongoSplitterTest, method testFilterEmptySplits.

@Test
public void testFilterEmptySplits() throws SplitFailedException {
    Configuration config = new Configuration();
    DBObject query = new BasicDBObject("$or", new BasicDBObject[] {
        new BasicDBObject("value", new BasicDBObject("$lt", 20000)),
        new BasicDBObject("value", new BasicDBObject("$gt", 35000))
    });
    MongoConfigUtil.setInputURI(config, uri);
    MongoConfigUtil.setEnableFilterEmptySplits(config, true);
    MongoConfigUtil.setQuery(config, query);
    // 1 MB per document results in 4 splits; the 3rd one is empty per
    // the above query.
    MongoConfigUtil.setSplitSize(config, 1);
    StandaloneMongoSplitter splitter = new StandaloneMongoSplitter(config);
    List<InputSplit> splits = splitter.calculateSplits();
    // No splits are empty.
    for (InputSplit split : splits) {
        // Cursor is closed on the split, so copy it to create a new one.
        MongoInputSplit mis = new MongoInputSplit((MongoInputSplit) split);
        assertNotEquals(0, mis.getCursor().itcount());
    }
    assertSplitsCount(collection.count(query), splits);
}
Also used : BasicDBObject(com.mongodb.BasicDBObject) MongoInputSplit(com.mongodb.hadoop.input.MongoInputSplit) Configuration(org.apache.hadoop.conf.Configuration) DBObject(com.mongodb.DBObject) InputSplit(org.apache.hadoop.mapreduce.InputSplit) Test(org.junit.Test)

Aggregations

MongoInputSplit (com.mongodb.hadoop.input.MongoInputSplit): 21
Test (org.junit.Test): 13
BasicDBObject (com.mongodb.BasicDBObject): 12
Configuration (org.apache.hadoop.conf.Configuration): 12
InputSplit (org.apache.hadoop.mapreduce.InputSplit): 11
DBObject (com.mongodb.DBObject): 7
MongoClientURI (com.mongodb.MongoClientURI): 5
BasicDBList (com.mongodb.BasicDBList): 3
BaseHadoopTest (com.mongodb.hadoop.testutils.BaseHadoopTest): 3
DBCollection (com.mongodb.DBCollection): 2
MongoClient (com.mongodb.MongoClient): 2
MongoClientURIBuilder (com.mongodb.hadoop.util.MongoClientURIBuilder): 2
ArrayList (java.util.ArrayList): 2
List (java.util.List): 2
BSONObject (org.bson.BSONObject): 2
CommandResult (com.mongodb.CommandResult): 1
DBCursor (com.mongodb.DBCursor): 1
MongoException (com.mongodb.MongoException): 1
MongoRecordReader (com.mongodb.hadoop.input.MongoRecordReader): 1
MongoRecordReader (com.mongodb.hadoop.mapred.input.MongoRecordReader): 1