Example 1 with SelectNameFn

Use of org.apache.beam.sdk.io.common.TestRow.SelectNameFn in the Apache Beam project.

From the class S3FileSystemIT, method testWriteThenRead.

@Test
public void testWriteThenRead() {
    int rows = env.options().getNumberOfRows();
    // Write test dataset to S3.
    pipelineWrite
        .apply("Generate Sequence", GenerateSequence.from(0).to(rows))
        .apply("Prepare TestRows", ParDo.of(new DeterministicallyConstructTestRowFn()))
        .apply("Prepare file rows", ParDo.of(new SelectNameFn()))
        .apply("Write to S3 file", TextIO.write().to("s3://" + s3Bucket.name + "/test"));
    pipelineWrite.run().waitUntilFinish();
    // Read test dataset from S3.
    PCollection<String> output =
        pipelineRead.apply(TextIO.read().from("s3://" + s3Bucket.name + "/test*"));
    PAssert.thatSingleton(output.apply("Count All", Count.globally())).isEqualTo((long) rows);
    PAssert.that(output.apply(Combine.globally(new HashingFn()).withoutDefaults()))
        .containsInAnyOrder(getExpectedHashForRowCount(rows));
    pipelineRead.run().waitUntilFinish();
}
Also used: DeterministicallyConstructTestRowFn (org.apache.beam.sdk.io.common.TestRow.DeterministicallyConstructTestRowFn), SelectNameFn (org.apache.beam.sdk.io.common.TestRow.SelectNameFn), HashingFn (org.apache.beam.sdk.io.common.HashingFn), Test (org.junit.Test)
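
The write half of the test above chains three steps: generate a sequence of numbers, build a deterministic row from each number, then keep only the row's name as the line to write. The following is a minimal plain-Java sketch of that data flow, without Beam, to show the role SelectNameFn appears to play. The class SelectNameSketch, the Row type, the helper methods, and the "name-" naming scheme are all hypothetical stand-ins and are not taken from the Beam source.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.LongStream;

public class SelectNameSketch {

    // Hypothetical stand-in for TestRow: an id plus a name.
    static final class Row {
        final int id;
        final String name;

        Row(int id, String name) {
            this.id = id;
            this.name = name;
        }
    }

    // Stand-in for DeterministicallyConstructTestRowFn:
    // the same seed always produces the same row.
    // The "name-" prefix is an assumption made for this sketch.
    static Row rowFromSeed(long seed) {
        return new Row((int) seed, "name-" + seed);
    }

    // Stand-in for SelectNameFn: project a row down to its name,
    // which becomes one line of the text file.
    static String selectName(Row row) {
        return row.name;
    }

    // Mirrors the chained applies: sequence -> rows -> names.
    static List<String> fileRows(int rows) {
        return LongStream.range(0, rows)
            .mapToObj(SelectNameSketch::rowFromSeed)
            .map(SelectNameSketch::selectName)
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // prints [name-0, name-1, name-2]
        System.out.println(fileRows(3));
    }
}
```

Because every stage is deterministic, the read half of the test can check the result by count and by a combined hash instead of comparing the file contents line by line.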

Example 2 with SelectNameFn

Use of org.apache.beam.sdk.io.common.TestRow.SelectNameFn in the Apache Beam project.

From the class S3FileSystemIT, method testWriteThenRead.

@Test
public void testWriteThenRead() {
    int rows = env.options().getNumberOfRows();
    // Write test dataset to S3.
    pipelineWrite
        .apply("Generate Sequence", GenerateSequence.from(0).to(rows))
        .apply("Prepare TestRows", ParDo.of(new DeterministicallyConstructTestRowFn()))
        .apply("Prepare file rows", ParDo.of(new SelectNameFn()))
        .apply("Write to S3 file", TextIO.write().to("s3://" + s3Bucket.name + "/test"));
    pipelineWrite.run().waitUntilFinish();
    // Read test dataset from S3.
    PCollection<String> output =
        pipelineRead.apply(TextIO.read().from("s3://" + s3Bucket.name + "/test*"));
    PAssert.thatSingleton(output.apply(Count.globally())).isEqualTo((long) rows);
    PAssert.that(output.apply(Combine.globally(new HashingFn()).withoutDefaults()))
        .containsInAnyOrder(getExpectedHashForRowCount(rows));
    pipelineRead.run().waitUntilFinish();
}
Also used: DeterministicallyConstructTestRowFn (org.apache.beam.sdk.io.common.TestRow.DeterministicallyConstructTestRowFn), SelectNameFn (org.apache.beam.sdk.io.common.TestRow.SelectNameFn), HashingFn (org.apache.beam.sdk.io.common.HashingFn), Test (org.junit.Test)

Aggregations

HashingFn (org.apache.beam.sdk.io.common.HashingFn): 2
DeterministicallyConstructTestRowFn (org.apache.beam.sdk.io.common.TestRow.DeterministicallyConstructTestRowFn): 2
SelectNameFn (org.apache.beam.sdk.io.common.TestRow.SelectNameFn): 2
Test (org.junit.Test): 2