
Example 1 with ShortCircuitConf

Use of org.apache.hadoop.hdfs.client.impl.DfsClientConf.ShortCircuitConf in project hadoop by apache.

From class BlockReaderFactory, the build method:

/**
   * Build a BlockReader with the given options.
   *
   * This function will do the best it can to create a block reader that meets
   * all of our requirements.  We prefer short-circuit block readers
   * (BlockReaderLocal and BlockReaderLocalLegacy) over remote ones, since the
   * former avoid the overhead of socket communication.  If short-circuit is
   * unavailable, our next fallback is data transfer over UNIX domain sockets,
   * if dfs.client.domain.socket.data.traffic has been enabled.  If that doesn't
   * work, we will try to create a remote block reader that operates over TCP
   * sockets.
   *
   * There are a few caches that are important here.
   *
   * The ShortCircuitCache stores file descriptor objects which have been passed
   * from the DataNode.
   *
   * The DomainSocketFactory stores information about UNIX domain socket paths
   * that we haven't been able to use in the past, so that we don't waste time
   * retrying them over and over.  (Like all the caches, it does have a timeout,
   * though.)
   *
   * The PeerCache stores peers that we have used in the past.  If we can reuse
   * one of these peers, we avoid the overhead of re-opening a socket.  However,
   * if the socket has been timed out on the remote end, our attempt to reuse
   * the socket may end with an IOException.  For that reason, we limit our
   * attempts at socket reuse to dfs.client.cached.conn.retry times.  After
   * that, we create new sockets.  This avoids the problem where a thread tries
   * to talk to a peer that it hasn't talked to in a while, and has to clean out
   * every entry in a socket cache full of stale entries.
   *
   * @return The new BlockReader.  We will not return null.
   *
   * @throws InvalidToken
   *             If the block token was invalid.
   *         InvalidEncryptionKeyException
   *             If the encryption key was invalid.
   *         Other IOException
   *             If there was another problem.
   */
public BlockReader build() throws IOException {
    Preconditions.checkNotNull(configuration);
    Preconditions.checkState(length >= 0, "Length must be set to a non-negative value");
    BlockReader reader = tryToCreateExternalBlockReader();
    if (reader != null) {
        return reader;
    }
    final ShortCircuitConf scConf = conf.getShortCircuitConf();
    if (scConf.isShortCircuitLocalReads() && allowShortCircuitLocalReads) {
        if (clientContext.getUseLegacyBlockReaderLocal()) {
            reader = getLegacyBlockReaderLocal();
            if (reader != null) {
                LOG.trace("{}: returning new legacy block reader local.", this);
                return reader;
            }
        } else {
            reader = getBlockReaderLocal();
            if (reader != null) {
                LOG.trace("{}: returning new block reader local.", this);
                return reader;
            }
        }
    }
    if (scConf.isDomainSocketDataTraffic()) {
        reader = getRemoteBlockReaderFromDomain();
        if (reader != null) {
            LOG.trace("{}: returning new remote block reader using UNIX domain " + "socket on {}", this, pathInfo.getPath());
            return reader;
        }
    }
    Preconditions.checkState(!DFSInputStream.tcpReadsDisabledForTesting, "TCP reads were disabled for testing, but we failed to " + "do a non-TCP read.");
    return getRemoteBlockReaderFromTcp();
}
Also used : ShortCircuitConf(org.apache.hadoop.hdfs.client.impl.DfsClientConf.ShortCircuitConf) BlockReader(org.apache.hadoop.hdfs.BlockReader)
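
The Javadoc above describes a fallback chain (short-circuit, then UNIX domain sockets, then TCP) that is driven entirely by client configuration. Below is a minimal, illustrative sketch of a client that enables those options before opening a file; the domain-socket path, the HDFS file path, and the retry count are assumptions for the example rather than values taken from the listing, and fs.defaultFS is assumed to point at a running cluster.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ShortCircuitReadExample {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Prefer short-circuit local reads; BlockReaderFactory#build() falls back
        // to a domain-socket reader and finally a TCP reader if this cannot be used.
        conf.setBoolean("dfs.client.read.shortcircuit", true);
        // Illustrative socket path; must match the DataNode's dfs.domain.socket.path.
        conf.set("dfs.domain.socket.path", "/var/lib/hadoop-hdfs/dn_socket");
        // Allow block data over UNIX domain sockets as the second fallback.
        conf.setBoolean("dfs.client.domain.socket.data.traffic", true);
        // Bound the cached-peer reuse attempts mentioned in the Javadoc
        // (dfs.client.cached.conn.retry); 3 is an arbitrary example value.
        conf.setInt("dfs.client.cached.conn.retry", 3);

        try (FileSystem fs = FileSystem.get(conf);
             FSDataInputStream in = fs.open(new Path("/tmp/example.txt"))) {
            byte[] buf = new byte[4096];
            int n = in.read(buf);
            System.out.println("Read " + n + " bytes");
        }
    }
}

Whether the short-circuit path is actually taken still depends on the DataNode being local and configured for it; otherwise build() falls through to the next reader type, exactly as the method above shows.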

Example 2 with ShortCircuitConf

Use of org.apache.hadoop.hdfs.client.impl.DfsClientConf.ShortCircuitConf in project hadoop by apache.

From class BlockReaderLocalLegacy, the newBlockReader method:

/**
   * The only way this object can be instantiated.
   */
static BlockReaderLocalLegacy newBlockReader(DfsClientConf conf, UserGroupInformation userGroupInformation, Configuration configuration, String file, ExtendedBlock blk, Token<BlockTokenIdentifier> token, DatanodeInfo node, long startOffset, long length, StorageType storageType, Tracer tracer) throws IOException {
    final ShortCircuitConf scConf = conf.getShortCircuitConf();
    LocalDatanodeInfo localDatanodeInfo = getLocalDatanodeInfo(node.getIpcPort());
    // check the cache first
    BlockLocalPathInfo pathinfo = localDatanodeInfo.getBlockLocalPathInfo(blk);
    if (pathinfo == null) {
        if (userGroupInformation == null) {
            userGroupInformation = UserGroupInformation.getCurrentUser();
        }
        pathinfo = getBlockPathInfo(userGroupInformation, blk, node, configuration, conf.getSocketTimeout(), token, conf.isConnectToDnViaHostname(), storageType);
    }
    // check to see if the file exists. It may so happen that the
    // HDFS file has been deleted and this block-lookup is occurring
    // on behalf of a new HDFS file. This time, the block file could
    // be residing in a different portion of the fs.data.dir directory.
    // In this case, we remove this entry from the cache. The next
    // call to this method will re-populate the cache.
    FileInputStream dataIn = null;
    FileInputStream checksumIn = null;
    BlockReaderLocalLegacy localBlockReader = null;
    final boolean skipChecksumCheck = scConf.isSkipShortCircuitChecksums() || storageType.isTransient();
    try {
        // get a local file system
        File blkfile = new File(pathinfo.getBlockPath());
        dataIn = new FileInputStream(blkfile);
        LOG.debug("New BlockReaderLocalLegacy for file {} of size {} startOffset " + "{} length {} short circuit checksum {}", blkfile, blkfile.length(), startOffset, length, !skipChecksumCheck);
        if (!skipChecksumCheck) {
            // get the metadata file
            File metafile = new File(pathinfo.getMetaPath());
            checksumIn = new FileInputStream(metafile);
            final DataChecksum checksum = BlockMetadataHeader.readDataChecksum(new DataInputStream(checksumIn), blk);
            long firstChunkOffset = startOffset - (startOffset % checksum.getBytesPerChecksum());
            localBlockReader = new BlockReaderLocalLegacy(scConf, file, blk, startOffset, checksum, true, dataIn, firstChunkOffset, checksumIn, tracer);
        } else {
            localBlockReader = new BlockReaderLocalLegacy(scConf, file, blk, startOffset, dataIn, tracer);
        }
    } catch (IOException e) {
        // remove from cache
        localDatanodeInfo.removeBlockLocalPathInfo(blk);
        LOG.warn("BlockReaderLocalLegacy: Removing " + blk + " from cache because local file " + pathinfo.getBlockPath() + " could not be opened.");
        throw e;
    } finally {
        if (localBlockReader == null) {
            if (dataIn != null) {
                dataIn.close();
            }
            if (checksumIn != null) {
                checksumIn.close();
            }
        }
    }
    return localBlockReader;
}
Also used : ShortCircuitConf(org.apache.hadoop.hdfs.client.impl.DfsClientConf.ShortCircuitConf) BlockLocalPathInfo(org.apache.hadoop.hdfs.protocol.BlockLocalPathInfo) IOException(java.io.IOException) DataInputStream(java.io.DataInputStream) File(java.io.File) FileInputStream(java.io.FileInputStream) DataChecksum(org.apache.hadoop.util.DataChecksum)
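
Example 2 is the legacy short-circuit path, which is only taken when the client explicitly opts into the old RPC-based local reader. A minimal, illustrative sketch of such a client follows; the property values and file path are assumptions for the example, and the DataNode must additionally list the client user in dfs.block.local-path-access.user for the block-path lookup to succeed.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class LegacyShortCircuitReadExample {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Short-circuit reads must be enabled, and the legacy local reader
        // selected; this is the route that ends in
        // BlockReaderLocalLegacy#newBlockReader above.
        conf.setBoolean("dfs.client.read.shortcircuit", true);
        conf.setBoolean("dfs.client.use.legacy.blockreader.local", true);
        // Corresponds to scConf.isSkipShortCircuitChecksums() in the snippet;
        // false keeps the checksum (metadata) stream open during reads.
        conf.setBoolean("dfs.client.read.shortcircuit.skip.checksum", false);

        try (FileSystem fs = FileSystem.get(conf);
             FSDataInputStream in = fs.open(new Path("/tmp/example.txt"))) {
            byte[] buf = new byte[8192];
            System.out.println("Read " + in.read(buf) + " bytes");
        }
    }
}

If opening the local block file fails, the catch block in the snippet evicts the stale BlockLocalPathInfo from the per-DataNode cache so that a later call repopulates it; the client-side sketch does not need to handle that itself.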

Aggregations

ShortCircuitConf (org.apache.hadoop.hdfs.client.impl.DfsClientConf.ShortCircuitConf): 2 uses
DataInputStream (java.io.DataInputStream): 1 use
File (java.io.File): 1 use
FileInputStream (java.io.FileInputStream): 1 use
IOException (java.io.IOException): 1 use
BlockReader (org.apache.hadoop.hdfs.BlockReader): 1 use
BlockLocalPathInfo (org.apache.hadoop.hdfs.protocol.BlockLocalPathInfo): 1 use
DataChecksum (org.apache.hadoop.util.DataChecksum): 1 use