use of org.apache.crunch.impl.mr.MRPipeline in project crunch by cloudera.
the class PCollectionGetSizeTest method testGetSizeOfEmptyIntermediatePCollection_NoSave_MRPipeline.
@Test
@Ignore("GetSize of a DoCollection is only an estimate based on scale factor, so we can't count on it being reported as 0")
public void testGetSizeOfEmptyIntermediatePCollection_NoSave_MRPipeline() throws IOException {
    PCollection<String> data = new MRPipeline(this.getClass()).readTextFile(nonEmptyInputPath);
    PCollection<String> emptyPCollection = data.filter(new FalseFilterFn());
    assertThat(emptyPCollection.getSize(), is(0L));
}
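The @Ignore message above says the reported size of an intermediate collection is an estimate derived from a scale factor, not a count of actual records. The following is a minimal standalone sketch of that idea, not Crunch's own implementation: the class name SizeEstimateSketch, the helper estimateSize, and the values 1000 and 0.5 are all hypothetical and chosen only for illustration.

```java
public class SizeEstimateSketch {

    // Hypothetical helper (not part of Crunch): estimates an output size
    // by multiplying the input size by a scale factor.
    static long estimateSize(long inputSize, float scaleFactor) {
        return (long) (inputSize * scaleFactor);
    }

    public static void main(String[] args) {
        long inputSize = 1000L;    // assumed size of a non-empty input
        float filterScale = 0.5f;  // assumed scale factor for a filter step

        // Even if the filter rejects every record at run time, an estimate
        // computed this way is non-zero whenever the input is non-empty.
        System.out.println(estimateSize(inputSize, filterScale)); // prints 500
    }
}
```

Under these assumptions, an assertion that the size equals 0 can only hold when the input itself is empty, which is consistent with the test being ignored.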
use of org.apache.crunch.impl.mr.MRPipeline in project crunch by cloudera.
the class PCollectionGetSizeTest method testMaterializeOfEmptyIntermediatePCollection_MRPipeline.
@Test
public void testMaterializeOfEmptyIntermediatePCollection_MRPipeline() throws IOException {
    PCollection<String> emptyIntermediate = createPesistentEmptyIntermediate(new MRPipeline(this.getClass()));
    assertThat(newArrayList(emptyIntermediate.materialize()).size(), is(0));
}
use of org.apache.crunch.impl.mr.MRPipeline in project crunch by cloudera.
the class PTableKeyValueTest method setUp.
@Before
public void setUp() throws IOException {
    pipeline = new MRPipeline(PTableKeyValueTest.class);
    inputFile = FileHelper.createTempCopyOf("set1.txt");
}
use of org.apache.crunch.impl.mr.MRPipeline in project crunch by cloudera.
the class WordCountTest method runWithTop.
public static void runWithTop(PTypeFamily tf) throws IOException {
    Pipeline pipeline = new MRPipeline(WordCountTest.class);
    String inputPath = FileHelper.createTempCopyOf("shakes.txt");
    PCollection<String> shakespeare = pipeline.read(At.textFile(inputPath, tf.strings()));
    PTable<String, Long> wordCount = wordCount(shakespeare, tf);
    List<Pair<String, Long>> top5 = Lists.newArrayList(Aggregate.top(wordCount, 5, true).materialize());
    assertEquals(
        ImmutableList.of(
            Pair.of("", 1470L),
            Pair.of("the", 620L),
            Pair.of("and", 427L),
            Pair.of("of", 396L),
            Pair.of("to", 367L)),
        top5);
}
use of org.apache.crunch.impl.mr.MRPipeline in project crunch by cloudera.
the class WordCountTest method testWritablesWithSecond.
@Test
public void testWritablesWithSecond() throws IOException {
    runSecond = true;
    run(new MRPipeline(WordCountTest.class), WritableTypeFamily.getInstance());
}