Example 51 with SeekableByteChannel

use of java.nio.channels.SeekableByteChannel in project lucene-solr by apache.

the class LineFileDocs method open.

private synchronized void open(Random random) throws IOException {
    InputStream is = getClass().getResourceAsStream(path);
    boolean needSkip = true;
    long size = 0L, seekTo = 0L;
    if (is == null) {
        // if it's not in classpath, we load it as absolute filesystem path (e.g. Hudson's home dir)
        Path file = Paths.get(path);
        size = Files.size(file);
        if (path.endsWith(".gz")) {
            // if it is a gzip file, we need to use InputStream and slowly skipTo:
            is = Files.newInputStream(file);
        } else {
            // optimized seek using SeekableByteChannel
            seekTo = randomSeekPos(random, size);
            final SeekableByteChannel channel = Files.newByteChannel(file);
            if (LuceneTestCase.VERBOSE) {
                System.out.println("TEST: LineFileDocs: file seek to fp=" + seekTo + " on open");
            }
            channel.position(seekTo);
            is = Channels.newInputStream(channel);
            needSkip = false;
        }
    } else {
        // if the file comes from Classpath:
        size = is.available();
    }
    if (path.endsWith(".gz")) {
        is = new GZIPInputStream(is);
        // guesstimate of the uncompressed size:
        size *= 2.8;
    }
    // but this seek is a scan, so very inefficient!!!
    if (needSkip) {
        seekTo = randomSeekPos(random, size);
        if (LuceneTestCase.VERBOSE) {
            System.out.println("TEST: LineFileDocs: stream skip to fp=" + seekTo + " on open");
        }
        is.skip(seekTo);
    }
    // if we sought to a nonzero position, read until a newline char
    if (seekTo > 0L) {
        int b;
        do {
            b = is.read();
        } while (b >= 0 && b != 13 && b != 10);
    }
    CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder().onMalformedInput(CodingErrorAction.REPORT).onUnmappableCharacter(CodingErrorAction.REPORT);
    reader = new BufferedReader(new InputStreamReader(is, decoder), BUFFER_SIZE);
    if (seekTo > 0L) {
        // read one more line, to make sure we are not inside a Windows linebreak (\r\n):
        reader.readLine();
    }
}
Also used : Path(java.nio.file.Path) SeekableByteChannel(java.nio.channels.SeekableByteChannel) GZIPInputStream(java.util.zip.GZIPInputStream) CharsetDecoder(java.nio.charset.CharsetDecoder) InputStreamReader(java.io.InputStreamReader) InputStream(java.io.InputStream) BufferedReader(java.io.BufferedReader) IntPoint(org.apache.lucene.document.IntPoint)
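
The non-gzip branch of the method above seeks with a SeekableByteChannel and then discards the partial line it landed in. The following is a minimal standalone sketch of that idea, not the Lucene code itself: the class name SeekToLineSketch and the method firstLineAfter are hypothetical, and the sketch treats only '\n' as a line terminator, whereas the original also handles '\r' and reads one extra line.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.channels.Channels;
import java.nio.channels.SeekableByteChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, for illustration only.
class SeekToLineSketch {

    /**
     * Positions the channel at byte offset seekTo, skips the rest of the
     * line it landed in, and returns the next complete line (or null if
     * the end of the file is reached first).
     */
    static String firstLineAfter(Path file, long seekTo) throws IOException {
        try (SeekableByteChannel channel = Files.newByteChannel(file)) {
            // Constant-time seek; no bytes before seekTo are read.
            channel.position(seekTo);
            InputStream is = Channels.newInputStream(channel);
            if (seekTo > 0L) {
                // We are probably mid-line: consume bytes up to and including '\n'.
                int b;
                do {
                    b = is.read();
                } while (b >= 0 && b != '\n');
            }
            BufferedReader reader =
                new BufferedReader(new InputStreamReader(is, StandardCharsets.UTF_8));
            return reader.readLine();
        }
    }
}
```

Skipping raw bytes before wrapping the stream in a decoder is what makes an arbitrary offset safe here: UTF-8 continuation bytes never equal '\n', so decoding starts on a character boundary.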

Example 52 with SeekableByteChannel

use of java.nio.channels.SeekableByteChannel in project lucene-solr by apache.

the class HandleTrackingFS method newByteChannel.

@Override
public SeekableByteChannel newByteChannel(Path path, Set<? extends OpenOption> options, FileAttribute<?>... attrs) throws IOException {
    SeekableByteChannel channel = new FilterSeekableByteChannel(super.newByteChannel(path, options, attrs)) {

        boolean closed;

        @Override
        public void close() throws IOException {
            try {
                if (!closed) {
                    closed = true;
                    onClose(path, this);
                }
            } finally {
                super.close();
            }
        }

        @Override
        public String toString() {
            return "SeekableByteChannel(" + path.toString() + ")";
        }

        @Override
        public int hashCode() {
            return System.identityHashCode(this);
        }

        @Override
        public boolean equals(Object obj) {
            return this == obj;
        }
    };
    callOpenHook(path, channel);
    return channel;
}
Also used : SeekableByteChannel(java.nio.channels.SeekableByteChannel)
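
FilterSeekableByteChannel in the example above is a Lucene test-framework class. Outside that framework, the same close-tracking pattern can be sketched with a plain delegating wrapper. The class name CloseTrackingChannel and its Runnable callback are assumptions made for this illustration; they do not come from lucene-solr.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.SeekableByteChannel;

// Hypothetical wrapper, for illustration only: forwards every call to a
// delegate and runs a callback the first time close() is invoked.
class CloseTrackingChannel implements SeekableByteChannel {
    private final SeekableByteChannel delegate;
    private final Runnable onClose;
    private boolean closed;

    CloseTrackingChannel(SeekableByteChannel delegate, Runnable onClose) {
        this.delegate = delegate;
        this.onClose = onClose;
    }

    @Override public int read(ByteBuffer dst) throws IOException { return delegate.read(dst); }
    @Override public int write(ByteBuffer src) throws IOException { return delegate.write(src); }
    @Override public long position() throws IOException { return delegate.position(); }
    @Override public long size() throws IOException { return delegate.size(); }
    @Override public boolean isOpen() { return delegate.isOpen(); }

    @Override
    public SeekableByteChannel position(long newPosition) throws IOException {
        delegate.position(newPosition);
        return this; // return the wrapper so callers keep going through it
    }

    @Override
    public SeekableByteChannel truncate(long size) throws IOException {
        delegate.truncate(size);
        return this;
    }

    @Override
    public void close() throws IOException {
        try {
            if (!closed) {
                closed = true;
                onClose.run(); // fires once, even if close() is called again
            }
        } finally {
            delegate.close(); // always release the underlying channel
        }
    }
}
```

As in the original, the callback runs inside try and the delegate is closed in finally, so a failing callback cannot leak the underlying handle.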

Example 53 with SeekableByteChannel

use of java.nio.channels.SeekableByteChannel in project gatk by broadinstitute.

the class ReadsDataSourceUnitTest method testCloudBamWithCustomReaderFactoryAndWrappers.

@Test(dataProvider = "cloudXorTestData", groups = { "bucket" })
public void testCloudBamWithCustomReaderFactoryAndWrappers(final List<Path> bams, final List<Path> indices) {
    final SamReaderFactory customFactory = SamReaderFactory.makeDefault().validationStringency(ValidationStringency.STRICT);
    // The input files are XOR'd with a constant. We use a wrapper to XOR it back.
    // If the code uses the wrong wrapper, or omits one, then the test will fail.
    Function<SeekableByteChannel, SeekableByteChannel> xorData = XorWrapper.forKey((byte) 74);
    Function<SeekableByteChannel, SeekableByteChannel> xorIndex = XorWrapper.forKey((byte) 80);
    try (final ReadsDataSource readsSource = new ReadsDataSource(bams, indices, customFactory, xorData, xorIndex)) {
        Assert.assertTrue(readsSource.indicesAvailable(), "Explicitly-provided indices not detected for bams: " + bams);
        final Iterator<GATKRead> queryReads = readsSource.query(new SimpleInterval("1", 1, 300));
        int queryCount = 0;
        while (queryReads.hasNext()) {
            ++queryCount;
            queryReads.next();
        }
        Assert.assertEquals(queryCount, 2, "Wrong number of reads returned in query");
    }
}
Also used : SeekableByteChannel(java.nio.channels.SeekableByteChannel) GATKRead(org.broadinstitute.hellbender.utils.read.GATKRead) SimpleInterval(org.broadinstitute.hellbender.utils.SimpleInterval) BaseTest(org.broadinstitute.hellbender.utils.test.BaseTest) Test(org.testng.annotations.Test)

Example 54 with SeekableByteChannel

use of java.nio.channels.SeekableByteChannel in project gatk by broadinstitute.

the class ParallelCopyGCSDirectoryIntoHDFSSpark method readChunkToHdfs.

private static final Tuple2<Integer, String> readChunkToHdfs(final String inputGCSPathFinal, final long chunkSize, final Integer chunkNum, final String outputDirectory) {
    final Path gcsPath = IOUtils.getPath(inputGCSPathFinal);
    final String basename = gcsPath.getName(gcsPath.getNameCount() - 1).toString();
    org.apache.hadoop.fs.Path outputPath = new org.apache.hadoop.fs.Path(outputDirectory);
    final String chunkPath = outputPath + "/" + basename + ".chunk." + chunkNum;
    try (SeekableByteChannel channel = Files.newByteChannel(gcsPath);
        final OutputStream outputStream = new BufferedOutputStream(BucketUtils.createFile(chunkPath))) {
        final long start = chunkSize * (long) chunkNum;
        channel.position(start);
        ByteBuffer byteBuffer = ByteBuffer.allocateDirect((int) Math.min(SIXTY_FOUR_MIB, chunkSize));
        long bytesRead = 0;
        while (channel.read(byteBuffer) > 0) {
            byteBuffer.flip();
            while (byteBuffer.hasRemaining() && bytesRead < chunkSize) {
                byte b = byteBuffer.get();
                outputStream.write(b);
                bytesRead++;
            }
            if (bytesRead == chunkSize) {
                break;
            }
            if (bytesRead > chunkSize) {
                throw new GATKException("Encountered an unknown error condition and read too many bytes; output file may be corrupt");
            }
            byteBuffer.clear();
        }
    } catch (IOException e) {
        throw new GATKException(e.getMessage() + "; inputGCSPathFinal = " + inputGCSPathFinal, e);
    }
    return new Tuple2<>(chunkNum, chunkPath);
}
Also used : Path(java.nio.file.Path) ByteBuffer(java.nio.ByteBuffer) SeekableByteChannel(java.nio.channels.SeekableByteChannel) Tuple2(scala.Tuple2) GATKException(org.broadinstitute.hellbender.exceptions.GATKException)
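
The method above copies one chunk byte by byte and checks the count afterwards. A possible alternative, shown here only as a sketch against a local file, caps each read with ByteBuffer.limit so the loop can never read past the chunk boundary. The class name ChunkReadSketch, the method readChunk, and the 8192-byte buffer size are assumptions for this illustration and are not part of GATK.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.SeekableByteChannel;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, for illustration only.
class ChunkReadSketch {

    /** Returns the bytes of chunk number chunkNum, where chunks are chunkSize bytes long. */
    static byte[] readChunk(Path file, long chunkSize, int chunkNum) throws IOException {
        try (SeekableByteChannel channel = Files.newByteChannel(file)) {
            // Jump straight to the start of the requested chunk.
            channel.position(chunkSize * (long) chunkNum);
            // Heap buffer (not direct) so that buffer.array() is available.
            ByteBuffer buffer = ByteBuffer.allocate((int) Math.min(8192, chunkSize));
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            long remaining = chunkSize;
            while (remaining > 0) {
                buffer.clear();
                // Never ask for more bytes than the chunk still needs.
                buffer.limit((int) Math.min(buffer.capacity(), remaining));
                int n = channel.read(buffer);
                if (n < 0) {
                    break; // end of file: the last chunk may be short
                }
                out.write(buffer.array(), 0, n);
                remaining -= n;
            }
            return out.toByteArray();
        }
    }
}
```

For a 10-byte file and a chunk size of 4, chunks 0 and 1 hold 4 bytes each, chunk 2 holds the final 2 bytes, and chunk 3 is empty.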

Example 55 with SeekableByteChannel

use of java.nio.channels.SeekableByteChannel in project gatk by broadinstitute.

the class GcsNioIntegrationTest method testCloseWhilePrefetching.

@Test(groups = { "cloud" })
public void testCloseWhilePrefetching() throws Exception {
    final String large = getGCPTestInputPath() + largeFilePath;
    SeekableByteChannel chan = new SeekableByteChannelPrefetcher(Files.newByteChannel(Paths.get(URI.create(large))), 10 * 1024 * 1024);
    // read just 1 byte, get the prefetching going
    ByteBuffer one = ByteBuffer.allocate(1);
    chan.read(one);
    // closing must not throw an exception, even if the prefetching
    // thread is active.
    chan.close();
}
Also used : SeekableByteChannel(java.nio.channels.SeekableByteChannel) ByteBuffer(java.nio.ByteBuffer) BaseTest(org.broadinstitute.hellbender.utils.test.BaseTest) Test(org.testng.annotations.Test)

Aggregations

SeekableByteChannel (java.nio.channels.SeekableByteChannel) 58
Path (java.nio.file.Path) 33
Test (org.junit.Test) 20
ByteBuffer (java.nio.ByteBuffer) 16
IOException (java.io.IOException) 10
Test (org.testng.annotations.Test) 9
File (java.io.File) 6
CloudStorageFileSystem (com.google.cloud.storage.contrib.nio.CloudStorageFileSystem) 5
GcsPath (org.apache.beam.sdk.util.gcsfs.GcsPath) 4
InvocationOnMock (org.mockito.invocation.InvocationOnMock) 4
ImmutableList (com.google.common.collect.ImmutableList) 3
InputStream (java.io.InputStream) 3
FileSystem (java.nio.file.FileSystem) 3
java.util (java.util) 3
Function (java.util.function.Function) 3
CharStream (org.antlr.v4.runtime.CharStream) 3
CodePointCharStream (org.antlr.v4.runtime.CodePointCharStream) 3
UserException (org.broadinstitute.hellbender.exceptions.UserException) 3
IOUtils (org.broadinstitute.hellbender.utils.io.IOUtils) 3
VisibleForTesting (com.google.common.annotations.VisibleForTesting) 2