use of java.nio.channels.SeekableByteChannel in project beam by apache.
the class GcsUtil method open.
/**
 * Opens an object in GCS.
 *
 * <p>Returns a SeekableByteChannel that provides access to data in the bucket.
 *
 * @param path the GCS filename to read from
 * @param readOptions fine-grained options for retry, buffering, and related behaviors
 * @return a SeekableByteChannel that can read the object data
 */
SeekableByteChannel open(GcsPath path, GoogleCloudStorageReadOptions readOptions) throws IOException {
  // Labels attached to the API request count metric for this call.
  HashMap<String, String> baseLabels = new HashMap<>();
  baseLabels.put(MonitoringInfoConstants.Labels.PTRANSFORM, "");
  baseLabels.put(MonitoringInfoConstants.Labels.SERVICE, "Storage");
  baseLabels.put(MonitoringInfoConstants.Labels.METHOD, "GcsGet");
  baseLabels.put(MonitoringInfoConstants.Labels.RESOURCE, GcpResourceIdentifiers.cloudStorageBucket(path.getBucket()));
  baseLabels.put(MonitoringInfoConstants.Labels.GCS_PROJECT_ID, googleCloudStorageOptions.getProjectId());
  baseLabels.put(MonitoringInfoConstants.Labels.GCS_BUCKET, path.getBucket());
  ServiceCallMetric serviceCallMetric = new ServiceCallMetric(MonitoringInfoConstants.Urns.API_REQUEST_COUNT, baseLabels);
  try {
    // Open the object, then record the call as successful.
    SeekableByteChannel channel = googleCloudStorage.open(new StorageResourceId(path.getBucket(), path.getObject()), readOptions);
    serviceCallMetric.call("ok");
    return channel;
  } catch (IOException e) {
    // Record the status code when the failure came from a GCS JSON API response.
    if (e.getCause() instanceof GoogleJsonResponseException) {
      serviceCallMetric.call(((GoogleJsonResponseException) e.getCause()).getDetails().getCode());
    }
    throw e;
  }
}
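The pattern in this method (open a channel, record the outcome, rethrow on failure) can be tried without any GCS dependency. The following is a minimal sketch under stated assumptions: a local temporary file stands in for the GCS object, and the `CALLS` map is a hypothetical stand-in for `ServiceCallMetric`. The class and helper names are illustrative and not part of Beam.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.SeekableByteChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Map;

public class CountingOpenSketch {
  // Hypothetical stand-in for a service-call metric: counts calls per status string.
  static final Map<String, Integer> CALLS = new HashMap<>();

  // Opens a read-only channel and records "ok" or the exception's simple class name.
  static SeekableByteChannel open(Path path) throws IOException {
    try {
      SeekableByteChannel channel = Files.newByteChannel(path);
      CALLS.merge("ok", 1, Integer::sum);
      return channel;
    } catch (IOException e) {
      CALLS.merge(e.getClass().getSimpleName(), 1, Integer::sum);
      throw e;
    }
  }

  // Seeks to the given position and reads up to length bytes as UTF-8 text.
  static String readAt(SeekableByteChannel channel, long position, int length) throws IOException {
    channel.position(position);
    ByteBuffer buffer = ByteBuffer.allocate(length);
    while (buffer.hasRemaining()) {
      if (channel.read(buffer) < 0) {
        break; // End of channel reached before the buffer was filled.
      }
    }
    return new String(buffer.array(), 0, buffer.position(), StandardCharsets.UTF_8);
  }

  // Exercises one successful open and one failing open; returns the text read.
  static String demo() {
    try {
      Path file = Files.createTempFile("channel-sketch", ".txt");
      Files.write(file, "hello world".getBytes(StandardCharsets.UTF_8));
      String tail;
      try (SeekableByteChannel channel = open(file)) {
        tail = readAt(channel, 6, 5);
      }
      Files.delete(file);
      try {
        open(file).close(); // The file no longer exists, so this open fails.
      } catch (IOException expected) {
        // The failure has already been recorded in CALLS by open().
      }
      return tail;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println(demo());
    System.out.println(CALLS);
  }
}
```

Recording the outcome inside the same `try`/`catch` that performs the open keeps the success and failure counts tied to the call itself, while rethrowing leaves error handling to the caller.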
use of java.nio.channels.SeekableByteChannel in project beam by apache.
the class IsmReaderImpl method overKeyComponents.
public IsmPrefixReaderIterator overKeyComponents(List<?> keyComponents, int shardId, RandomAccessData keyBytes) throws IOException {
  SideInputReadCounter readCounter = IsmReader.getCurrentSideInputCounter();
  if (keyComponents.isEmpty()) {
    checkArgument(shardId == 0 && keyBytes.size() == 0, "Expected shard id to be 0 and key bytes to be empty but got shard id %s and key bytes of length %s", shardId, keyBytes.size());
  }
  checkArgument(keyComponents.size() <= coder.getKeyComponentCoders().size(), "Expected at most %s key component(s) but received %s.", coder.getKeyComponentCoders().size(), keyComponents);
  Optional<SeekableByteChannel> inChannel = initializeFooterAndShardIndex(Optional.<SeekableByteChannel>absent(), readCounter);
  // If this file is empty, we can return an empty iterator.
  if (footer.getNumberOfKeys() == 0) {
    return new EmptyIsmPrefixReaderIterator(keyComponents);
  }
  // If fewer key components were supplied than there are shard key coders,
  // return an iterator over all the keys.
  if (keyComponents.size() < coder.getNumberOfShardKeyCoders(keyComponents)) {
    return new ShardAwareIsmPrefixReaderIterator(keyComponents, openIfNeeded(inChannel), readCounter);
  }
  // If the requested shard is not in this file, we know that we can return an
  // empty reader iterator.
  if (!shardIdToShardMap.containsKey(shardId)) {
    return new EmptyIsmPrefixReaderIterator(keyComponents);
  }
  inChannel = initializeForKeyedRead(shardId, inChannel, readCounter);
  // If the Bloom filter rules the key out, return an empty reader iterator.
  if (!bloomFilterMightContain(keyBytes)) {
    return new EmptyIsmPrefixReaderIterator(keyComponents);
  }
  // Otherwise we may actually contain the key, so construct a reader iterator
  // which will fetch the data blocks containing the requested key prefix.
  // We find the first key in the index which may contain our prefix.
  RandomAccessData floorKey = indexPerShard.get(shardId).floorKey(keyBytes);
  // We compute an upper bound on the key prefix by incrementing the prefix.
  RandomAccessData keyBytesUpperBound = keyBytes.increment();
  // Compute the sub-range of the index map that we want to iterate over, since
  // any of these blocks may contain the key prefix.
  Iterator<IsmShardKey> blockEntries = indexPerShard.get(shardId).subMap(floorKey, keyBytesUpperBound).values().iterator();
  return new WithinShardIsmPrefixReaderIterator(keyComponents, keyBytes, keyBytesUpperBound, blockEntries, readCounter);
}
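The last steps of this method (take the floor key, increment the prefix to get an upper bound, then iterate the sub-map) can be illustrated on a plain `TreeMap`. This is a minimal sketch, not Beam's implementation: `byte[]` keys ordered by `Arrays.compareUnsigned` (Java 9 or later) stand in for `RandomAccessData`, block names stand in for `IsmShardKey`, and the `increment` helper here is a hypothetical variant that truncates trailing `0xFF` bytes and returns `null` when no finite upper bound exists.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

public class PrefixRangeSketch {
  // Unsigned lexicographic ordering over byte arrays (requires Java 9+).
  static final Comparator<byte[]> UNSIGNED = Arrays::compareUnsigned;

  // Returns the smallest byte string greater than every string that starts with
  // prefix, or null if the prefix is empty or consists only of 0xFF bytes.
  static byte[] increment(byte[] prefix) {
    byte[] copy = prefix.clone();
    for (int i = copy.length - 1; i >= 0; i--) {
      if (copy[i] != (byte) 0xFF) {
        copy[i]++;
        return Arrays.copyOf(copy, i + 1); // Drop the trailing 0xFF bytes.
      }
    }
    return null;
  }

  // The index maps the first key of each block to that block. Returns every
  // block that may hold a key starting with prefix.
  static List<String> blocksForPrefix(NavigableMap<byte[], String> index, byte[] prefix) {
    byte[] floor = index.floorKey(prefix);
    byte[] from = (floor != null) ? floor : prefix;
    byte[] upper = increment(prefix);
    NavigableMap<byte[], String> range =
        (upper == null) ? index.tailMap(from, true) : index.subMap(from, true, upper, false);
    return new ArrayList<>(range.values());
  }

  static byte[] bytes(String s) {
    return s.getBytes(StandardCharsets.UTF_8);
  }

  // Hypothetical index with three blocks starting at keys "a", "c", and "f".
  static TreeMap<byte[], String> sampleIndex() {
    TreeMap<byte[], String> index = new TreeMap<>(UNSIGNED);
    index.put(bytes("a"), "block-0");
    index.put(bytes("c"), "block-1");
    index.put(bytes("f"), "block-2");
    return index;
  }

  public static void main(String[] args) {
    TreeMap<byte[], String> index = sampleIndex();
    System.out.println(blocksForPrefix(index, bytes("d")));
    System.out.println(blocksForPrefix(index, bytes("")));
  }
}
```

Starting the range at the floor key matters because a key with the requested prefix can live in a block whose first key sorts before the prefix. In the sample index, keys beginning with "d" fall in the block that starts at "c".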
use of java.nio.channels.SeekableByteChannel in project beam by apache.
the class IsmReaderImpl method initializeBloomFilterAndIndexPerShard.
/**
 * Initializes the Bloom filter and index per shard. We prepopulate empty indices for shards where
 * the index offset matches the following shard block offset. Re-uses the provided channel,
 * returning it or a new one if this method was required to open one.
 */
private synchronized Optional<SeekableByteChannel> initializeBloomFilterAndIndexPerShard(Optional<SeekableByteChannel> inChannel) throws IOException {
  if (indexPerShard != null) {
    checkState(bloomFilter != null, "Expected Bloom filter to have been initialized.");
    return inChannel;
  }
  SeekableByteChannel rawChannel = openIfNeeded(inChannel);
  // Set the position to where the Bloom filter is and read it in.
  position(rawChannel, footer.getBloomFilterPosition());
  bloomFilter = ScalableBloomFilterCoder.of().decode(Channels.newInputStream(rawChannel));
  indexPerShard = new HashMap<>();
  // If a shard is small, it may not contain an index. We can detect this and
  // prepopulate the shard index map with an empty entry if the start of the index
  // and the start of the next block are equal.
  Iterator<IsmShard> shardIterator = shardOffsetToShardMap.values().iterator();
  // If the file is empty we just return here.
  if (!shardIterator.hasNext()) {
    return Optional.of(rawChannel);
  }
  // If the current shard's index position is equal to the next shard's block offset,
  // then we know that the index contains no data and we can prepopulate it with
  // the empty map.
  IsmShard currentShard = shardIterator.next();
  while (shardIterator.hasNext()) {
    IsmShard nextShard = shardIterator.next();
    if (currentShard.getIndexOffset() == nextShard.getBlockOffset()) {
      indexPerShard.put(
          currentShard.getId(),
          ImmutableSortedMap.<RandomAccessData, IsmShardKey>orderedBy(RandomAccessData.UNSIGNED_LEXICOGRAPHICAL_COMPARATOR)
              .put(new RandomAccessData(0), new IsmShardKey(IsmReaderImpl.this.resourceId.toString(), new RandomAccessData(0), currentShard.getBlockOffset(), currentShard.getIndexOffset()))
              .build());
    }
    currentShard = nextShard;
  }
  // If the last shard's index position is equal to the start of the Bloom filter,
  // then we know that the index is empty.
  if (currentShard.getIndexOffset() == footer.getBloomFilterPosition()) {
    indexPerShard.put(
        currentShard.getId(),
        ImmutableSortedMap.<RandomAccessData, IsmShardKey>orderedBy(RandomAccessData.UNSIGNED_LEXICOGRAPHICAL_COMPARATOR)
            .put(new RandomAccessData(0), new IsmShardKey(IsmReaderImpl.this.resourceId.toString(), new RandomAccessData(0), currentShard.getBlockOffset(), currentShard.getIndexOffset()))
            .build());
  }
  return Optional.of(rawChannel);
}
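The empty-index detection above compares each shard's index offset with the offset of whatever follows it in the file: the next shard's block offset, or the Bloom filter position for the last shard. A minimal sketch of just that comparison, assuming a hypothetical `Shard` class with three fields in place of `IsmShard` and a list already sorted by file offset:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class EmptyShardIndexSketch {
  // Hypothetical stand-in for IsmShard: an id plus the file offsets at which
  // the shard's data blocks and its index begin.
  static final class Shard {
    final int id;
    final long blockOffset;
    final long indexOffset;

    Shard(int id, long blockOffset, long indexOffset) {
      this.id = id;
      this.blockOffset = blockOffset;
      this.indexOffset = indexOffset;
    }
  }

  // Returns the ids of shards whose index occupies zero bytes. The input must
  // be ordered by file offset; bloomFilterPosition marks the end of the last
  // shard's index.
  static List<Integer> shardsWithEmptyIndex(List<Shard> shardsByOffset, long bloomFilterPosition) {
    List<Integer> empty = new ArrayList<>();
    Iterator<Shard> iterator = shardsByOffset.iterator();
    if (!iterator.hasNext()) {
      return empty; // Empty file: nothing to report.
    }
    Shard current = iterator.next();
    while (iterator.hasNext()) {
      Shard next = iterator.next();
      // The index ends where the next shard's blocks begin.
      if (current.indexOffset == next.blockOffset) {
        empty.add(current.id);
      }
      current = next;
    }
    // The last shard's index ends where the Bloom filter begins.
    if (current.indexOffset == bloomFilterPosition) {
      empty.add(current.id);
    }
    return empty;
  }

  // Hypothetical layout: shards 0 and 2 have no index bytes, shard 1 does.
  static List<Shard> sampleShards() {
    return Arrays.asList(
        new Shard(0, 0L, 100L),
        new Shard(1, 100L, 250L),
        new Shard(2, 300L, 400L));
  }

  public static void main(String[] args) {
    System.out.println(shardsWithEmptyIndex(sampleShards(), 400L));
  }
}
```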
use of java.nio.channels.SeekableByteChannel in project beam by apache.
the class IsmReaderImpl method open.
/** Opens a new channel. */
private SeekableByteChannel open() throws IOException {
  ReadableByteChannel channel = FileSystems.open(resourceId);
  // Random access is required, so reject channels that cannot seek.
  Preconditions.checkArgument(channel instanceof SeekableByteChannel, "IsmReaderImpl requires a SeekableByteChannel for path %s but received %s.", resourceId, channel);
  return (SeekableByteChannel) channel;
}
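The check-then-cast in this method can be reproduced with JDK classes alone. The sketch below is an illustration under assumptions, not Beam code: `requireSeekable` is a hypothetical helper, a channel from `Files.newByteChannel` plays the seekable case, and a channel wrapped around a `ByteArrayInputStream` via `Channels.newChannel` plays the non-seekable case.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.Channels;
import java.nio.channels.ReadableByteChannel;
import java.nio.channels.SeekableByteChannel;
import java.nio.file.Files;
import java.nio.file.Path;

public class SeekableCheckSketch {
  // Hypothetical helper: narrows a readable channel to a seekable one, or
  // fails with a descriptive message.
  static SeekableByteChannel requireSeekable(ReadableByteChannel channel) {
    if (!(channel instanceof SeekableByteChannel)) {
      throw new IllegalArgumentException(
          "A SeekableByteChannel is required but received " + channel);
    }
    return (SeekableByteChannel) channel;
  }

  // True if the helper rejects the given channel.
  static boolean isRejected(ReadableByteChannel channel) {
    try {
      requireSeekable(channel);
      return false;
    } catch (IllegalArgumentException e) {
      return true;
    }
  }

  // True if a channel opened on a local temporary file passes the check.
  static boolean acceptsFileChannel() {
    try {
      Path file = Files.createTempFile("seekable-sketch", ".bin");
      try (ReadableByteChannel channel = Files.newByteChannel(file)) {
        return !isRejected(channel);
      } finally {
        Files.delete(file);
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    ReadableByteChannel streamBacked = Channels.newChannel(new ByteArrayInputStream(new byte[0]));
    System.out.println("file channel accepted: " + acceptsFileChannel());
    System.out.println("stream-backed channel rejected: " + isRejected(streamBacked));
  }
}
```

Failing at open time gives a clear message naming the offending channel, rather than a `ClassCastException` at the first seek.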
use of java.nio.channels.SeekableByteChannel in project beam by apache.
the class IsmReaderTest method testCachedTailSeekableByteChannelThrowsOnTruncate.
public void testCachedTailSeekableByteChannelThrowsOnTruncate() throws Exception {
try (SeekableByteChannel channel = new CachedTailSeekableByteChannel(0, new byte[0])) {