Example 1 with CompressedReader

Use of org.apache.beam.sdk.io.CompressedSource.CompressedReader in the Apache Beam project.

From class CompressedSourceTest, method testEmptyGzipProgress.

@Test
public void testEmptyGzipProgress() throws IOException {
    File tmpFile = tmpFolder.newFile("empty.gz");
    String filename = tmpFile.toPath().toString();
    writeFile(tmpFile, new byte[0], CompressionMode.GZIP);
    PipelineOptions options = PipelineOptionsFactory.create();
    CompressedSource<Byte> source = CompressedSource.from(new ByteSource(filename, 1));
    try (BoundedReader<Byte> readerOrig = source.createReader(options)) {
        assertThat(readerOrig, instanceOf(CompressedReader.class));
        CompressedReader<Byte> reader = (CompressedReader<Byte>) readerOrig;
        // before starting
        assertEquals(0.0, reader.getFractionConsumed(), 1e-6);
        assertEquals(0, reader.getSplitPointsConsumed());
        assertEquals(1, reader.getSplitPointsRemaining());
        // confirm empty
        assertFalse(reader.start());
        // after reading empty source
        assertEquals(1.0, reader.getFractionConsumed(), 1e-6);
        assertEquals(0, reader.getSplitPointsConsumed());
        assertEquals(0, reader.getSplitPointsRemaining());
    }
}
Also used: PipelineOptions (org.apache.beam.sdk.options.PipelineOptions), CompressedReader (org.apache.beam.sdk.io.CompressedSource.CompressedReader), Matchers.containsString (org.hamcrest.Matchers.containsString), File (java.io.File), Test (org.junit.Test)

Example 2 with CompressedReader

Use of org.apache.beam.sdk.io.CompressedSource.CompressedReader in the Apache Beam project.

From class CompressedSourceTest, method testGzipProgress.

@Test
public void testGzipProgress() throws IOException {
    int numRecords = 3;
    File tmpFile = tmpFolder.newFile("nonempty.gz");
    String filename = tmpFile.toPath().toString();
    writeFile(tmpFile, new byte[numRecords], CompressionMode.GZIP);
    PipelineOptions options = PipelineOptionsFactory.create();
    CompressedSource<Byte> source = CompressedSource.from(new ByteSource(filename, 1));
    try (BoundedReader<Byte> readerOrig = source.createReader(options)) {
        assertThat(readerOrig, instanceOf(CompressedReader.class));
        CompressedReader<Byte> reader = (CompressedReader<Byte>) readerOrig;
        // before starting
        assertEquals(0.0, reader.getFractionConsumed(), 1e-6);
        assertEquals(0, reader.getSplitPointsConsumed());
        assertEquals(1, reader.getSplitPointsRemaining());
        // confirm has three records
        for (int i = 0; i < numRecords; ++i) {
            if (i == 0) {
                assertTrue(reader.start());
            } else {
                assertTrue(reader.advance());
            }
            assertEquals(0, reader.getSplitPointsConsumed());
            assertEquals(1, reader.getSplitPointsRemaining());
        }
        assertFalse(reader.advance());
        // after reading the whole (non-empty) source
        assertEquals(1.0, reader.getFractionConsumed(), 1e-6);
        assertEquals(1, reader.getSplitPointsConsumed());
        assertEquals(0, reader.getSplitPointsRemaining());
    }
}
Also used: PipelineOptions (org.apache.beam.sdk.options.PipelineOptions), CompressedReader (org.apache.beam.sdk.io.CompressedSource.CompressedReader), Matchers.containsString (org.hamcrest.Matchers.containsString), File (java.io.File), Test (org.junit.Test)
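
Both tests call a writeFile helper that is not shown in these excerpts. The sketch below is an assumption about what such a helper plausibly does for CompressionMode.GZIP, written with only java.util.zip from the JDK and no Beam classes. The class name GzipRoundTripSketch and its methods are hypothetical and are not part of CompressedSourceTest. It illustrates why an empty payload still produces a valid gzip stream that yields zero bytes on read, while a 3-byte payload yields three bytes, which appears to correspond to the three records the second test expects.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical stand-in for the unseen writeFile helper; not Beam code.
public class GzipRoundTripSketch {

    // Compress a payload into an in-memory gzip stream.
    public static byte[] gzip(byte[] payload) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(sink)) {
            out.write(payload);
        }
        return sink.toByteArray();
    }

    // Decompress and count the bytes, one byte at a time, mirroring a
    // reader that treats each decompressed byte as one record.
    public static int countDecompressedBytes(byte[] compressed) throws IOException {
        int count = 0;
        try (GZIPInputStream in =
                new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            while (in.read() != -1) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        byte[] empty = gzip(new byte[0]);
        byte[] three = gzip(new byte[3]);
        // The empty payload still has a gzip header and trailer,
        // so the compressed form is not zero-length.
        System.out.println("empty payload, compressed is non-empty: " + (empty.length > 0));
        System.out.println("empty payload, decompressed bytes: " + countDecompressedBytes(empty));
        System.out.println("3-byte payload, decompressed bytes: " + countDecompressedBytes(three));
    }
}
```

Under this assumption, the empty case matches testEmptyGzipProgress, where reader.start() returns false, and the 3-byte case matches the loop in testGzipProgress, where start() and two advance() calls succeed before advance() returns false.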

Aggregations

File (java.io.File): 2
CompressedReader (org.apache.beam.sdk.io.CompressedSource.CompressedReader): 2
PipelineOptions (org.apache.beam.sdk.options.PipelineOptions): 2
Matchers.containsString (org.hamcrest.Matchers.containsString): 2
Test (org.junit.Test): 2