Example 71 with InterruptedIOException

Use of java.io.InterruptedIOException in project hbase by apache.

From class HRegionFileSystem, method rename.

/**
   * Renames a directory. Assumes the caller has already checked that the source exists.
   * @param srcpath the source path to rename
   * @param dstPath the destination path
   * @return true if the rename succeeded.
   * @throws IOException if the rename fails after all retries.
   */
boolean rename(Path srcpath, Path dstPath) throws IOException {
    IOException lastIOE = null;
    int i = 0;
    do {
        try {
            return fs.rename(srcpath, dstPath);
        } catch (IOException ioe) {
            lastIOE = ioe;
            // src gone and dst present: the rename took effect despite the exception
            if (!fs.exists(srcpath) && fs.exists(dstPath))
                return true;
            // otherwise the rename did not take effect; pause and retry.
            try {
                sleepBeforeRetry("Rename Directory", i + 1);
            } catch (InterruptedException e) {
                throw (InterruptedIOException) new InterruptedIOException().initCause(e);
            }
        }
    } while (++i <= hdfsClientRetriesNumber);
    throw new IOException("Exception in rename", lastIOE);
}
Also used : InterruptedIOException(java.io.InterruptedIOException) IOException(java.io.IOException)
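
The shape of this method recurs throughout HRegionFileSystem: attempt the operation, and if it throws, check whether the operation actually took effect before pausing and retrying. Below is a minimal sketch of that pattern as a standalone helper; FsRetry, FsOp, and withRetries are hypothetical names for illustration, not HBase API.

import java.io.IOException;
import java.io.InterruptedIOException;

// Hypothetical helper distilling the retry pattern from rename() above: run an
// operation, and on IOException check whether it succeeded anyway before backing off.
final class FsRetry {

    interface FsOp {
        boolean run() throws IOException;          // the operation itself
        boolean alreadyDone() throws IOException;  // post-check: did it succeed despite the exception?
    }

    static boolean withRetries(FsOp op, int maxRetries, long basePauseMs) throws IOException {
        IOException lastIOE = null;
        for (int i = 0; i <= maxRetries; i++) {
            try {
                return op.run();
            } catch (IOException ioe) {
                lastIOE = ioe;
                if (op.alreadyDone()) {
                    return true; // the exception raced with a successful operation
                }
                try {
                    Thread.sleep(basePauseMs * (i + 1)); // linear backoff between attempts
                } catch (InterruptedException e) {
                    throw (InterruptedIOException) new InterruptedIOException().initCause(e);
                }
            }
        }
        throw new IOException("Operation failed after " + (maxRetries + 1) + " attempts", lastIOE);
    }
}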

Example 72 with InterruptedIOException

Use of java.io.InterruptedIOException in project hbase by apache.

From class HRegionFileSystem, method createDir.

/**
   * Creates a directory. Assumes the caller has already checked that the directory does not
   * yet exist.
   * @param dir the directory to create
   * @return the result of fs.mkdirs(). If the underlying fs throws an IOException, the method
   *         checks whether the directory exists and returns true if it does.
   * @throws IOException if the directory cannot be created after all retries.
   */
boolean createDir(Path dir) throws IOException {
    int i = 0;
    IOException lastIOE = null;
    do {
        try {
            return mkdirs(fs, conf, dir);
        } catch (IOException ioe) {
            lastIOE = ioe;
            // the directory exists despite the exception: treat as success
            if (fs.exists(dir))
                return true;
            try {
                sleepBeforeRetry("Create Directory", i + 1);
            } catch (InterruptedException e) {
                throw (InterruptedIOException) new InterruptedIOException().initCause(e);
            }
        }
    } while (++i <= hdfsClientRetriesNumber);
    throw new IOException("Exception in createDir", lastIOE);
}
Also used : InterruptedIOException(java.io.InterruptedIOException) IOException(java.io.IOException)
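
Both of these methods call sleepBeforeRetry, which the excerpt does not show. Here is a plausible standalone sketch, assuming a fixed base delay scaled linearly by the attempt number; BASE_SLEEP_MS and the class name are assumptions, not the HBase implementation.

// A plausible stand-in for sleepBeforeRetry; the real HBase method is not shown
// in the excerpt. BASE_SLEEP_MS is an assumed constant, not an HBase default.
final class RetryPause {

    private static final long BASE_SLEEP_MS = 1000L;

    // Sleep proportionally to the attempt number: 1s before retry 1, 2s before retry 2, ...
    static void sleepBeforeRetry(String operation, int sleepMultiplier) throws InterruptedException {
        System.out.println(operation + ": pausing " + (BASE_SLEEP_MS * sleepMultiplier) + " ms before retry");
        Thread.sleep(BASE_SLEEP_MS * sleepMultiplier);
    }
}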

Example 73 with InterruptedIOException

Use of java.io.InterruptedIOException in project hbase by apache.

From class FSHDFSUtils, method recoverDFSFileLease.

/*
   * Run the dfs recover lease. recoverLease is asynchronous. It returns:
   *    - false when it starts the lease recovery (i.e. lease recovery not *yet* done)
   *    - true when the lease recovery has succeeded or the file is closed.
   * But, we have to be careful.  Each time we call recoverLease, it starts the recover lease
   * process over from the beginning.  We could put ourselves in a situation where we are
   * doing nothing but starting a recovery, interrupting it to start again, and so on.
   * The findings over in HBASE-8354 have it that the namenode will try to recover the lease
   * on the file's primary node.  If all is well, it should return near immediately.  But,
   * as is common, it is the very primary node that has crashed and so the namenode will be
   * stuck waiting on a socket timeout before it will ask another datanode to start the
   * recovery. It does not help if we call recoverLease in the meantime and in particular,
   * subsequent to the socket timeout, a recoverLease invocation will cause us to start
   * over from square one (possibly waiting on socket timeout against primary node).  So,
   * in the below, we do the following:
   * 1. Call recoverLease.
   * 2. If it returns true, break.
   * 3. If it returns false, wait a few seconds and then call it again.
   * 4. If it returns true, break.
   * 5. If it returns false, wait for what we think the datanode socket timeout is
   * (configurable) and then try again.
   * 6. If it returns true, break.
   * 7. If it returns false, repeat starting at step 5. above.
   *
   * If HDFS-4525 is available, call it every second and we might be able to exit early.
   */
boolean recoverDFSFileLease(final DistributedFileSystem dfs, final Path p, final Configuration conf, final CancelableProgressable reporter) throws IOException {
    LOG.info("Recover lease on dfs file " + p);
    long startWaiting = EnvironmentEdgeManager.currentTime();
    // Default is 15 minutes. It's huge, but the idea is that if we have a major issue, HDFS
    // usually needs 10 minutes before marking the nodes as dead. So we're putting ourselves
    // beyond that limit 'to be safe'.
    long recoveryTimeout = conf.getInt("hbase.lease.recovery.timeout", 900000) + startWaiting;
    // This setting should be a little bit above what the cluster dfs heartbeat is set to.
    long firstPause = conf.getInt("hbase.lease.recovery.first.pause", 4000);
    // This should be set to how long it'll take for us to time out against the primary
    // datanode if it is dead. We set it to 61 seconds, 1 second more than the default
    // READ_TIMEOUT in HDFS, the default value for DFS_CLIENT_SOCKET_TIMEOUT_KEY. If recovery
    // is still failing after this timeout, further retries will use linear backoff with this
    // base, to avoid endless preemptions when this value is not properly configured.
    long subsequentPauseBase = conf.getLong("hbase.lease.recovery.dfs.timeout", 61 * 1000);
    Method isFileClosedMeth = null;
    // whether we need to look for isFileClosed method
    boolean findIsFileClosedMeth = true;
    boolean recovered = false;
    // We break the loop if we succeed the lease recovery, timeout, or we throw an exception.
    for (int nbAttempt = 0; !recovered; nbAttempt++) {
        recovered = recoverLease(dfs, nbAttempt, p, startWaiting);
        if (recovered)
            break;
        checkIfCancelled(reporter);
        if (checkIfTimedout(conf, recoveryTimeout, nbAttempt, p, startWaiting))
            break;
        try {
            // On the first time through wait the short 'firstPause'.
            if (nbAttempt == 0) {
                Thread.sleep(firstPause);
            } else {
                // Cycle here until (subsequentPauseBase * nbAttempt) elapses.  While spinning,
                // check isFileClosed if available (should be in hadoop 2.0.5, though not in hadoop 1).
                long localStartWaiting = EnvironmentEdgeManager.currentTime();
                while ((EnvironmentEdgeManager.currentTime() - localStartWaiting) < subsequentPauseBase * nbAttempt) {
                    Thread.sleep(conf.getInt("hbase.lease.recovery.pause", 1000));
                    if (findIsFileClosedMeth) {
                        try {
                            isFileClosedMeth = dfs.getClass().getMethod("isFileClosed", new Class[] { Path.class });
                        } catch (NoSuchMethodException nsme) {
                            LOG.debug("isFileClosed not available");
                        } finally {
                            findIsFileClosedMeth = false;
                        }
                    }
                    if (isFileClosedMeth != null && isFileClosed(dfs, isFileClosedMeth, p)) {
                        recovered = true;
                        break;
                    }
                    checkIfCancelled(reporter);
                }
            }
        } catch (InterruptedException ie) {
            InterruptedIOException iioe = new InterruptedIOException();
            iioe.initCause(ie);
            throw iioe;
        }
    }
    return recovered;
}
Also used : Path(org.apache.hadoop.fs.Path) InterruptedIOException(java.io.InterruptedIOException) Method(java.lang.reflect.Method)
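
The reflection block above guards against Hadoop versions that lack DistributedFileSystem.isFileClosed: it probes for the method exactly once and caches the outcome. Here is a minimal sketch of that probe-once pattern in isolation; the class and method names are illustrative, not HBase API.

import java.lang.reflect.Method;

// Probe once for an optional method and cache the result, mirroring the
// findIsFileClosedMeth / isFileClosedMeth logic in recoverDFSFileLease above.
final class OptionalMethodProbe {

    private Method method;  // resolved method, or null if unavailable
    private boolean probed; // true once we have looked, successfully or not

    Object invokeIfPresent(Object target, String name, Class<?>[] signature, Object... args) {
        if (!probed) {
            probed = true; // never probe twice, even when the lookup fails
            try {
                method = target.getClass().getMethod(name, signature);
            } catch (NoSuchMethodException nsme) {
                method = null; // method not available in this version
            }
        }
        if (method == null) return null;
        try {
            return method.invoke(target, args);
        } catch (ReflectiveOperationException e) {
            return null; // treat reflective failure as "feature unavailable"
        }
    }
}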

Example 74 with InterruptedIOException

Use of java.io.InterruptedIOException in project hbase by apache.

From class JVMClusterUtil, method startup.

/**
   * Start the cluster. Waits until there is an active master that is initialized and returns
   * its address.
   * @param masters the master threads to start
   * @param regionservers the region server threads to start
   * @return the address to use to contact the active master.
   */
public static String startup(final List<JVMClusterUtil.MasterThread> masters, final List<JVMClusterUtil.RegionServerThread> regionservers) throws IOException {
    Configuration configuration = null;
    if (masters == null || masters.isEmpty()) {
        return null;
    }
    for (JVMClusterUtil.MasterThread t : masters) {
        configuration = t.getMaster().getConfiguration();
        t.start();
    }
    // Wait for an active master
    //  having an active master before starting the region server threads allows
    //  them to succeed when connecting to the master
    long startTime = System.currentTimeMillis();
    while (findActiveMaster(masters) == null) {
        try {
            Thread.sleep(100);
        } catch (InterruptedException e) {
            throw (InterruptedIOException) new InterruptedIOException().initCause(e);
        }
        int startTimeout = configuration != null ? Integer.parseInt(configuration.get("hbase.master.start.timeout.localHBaseCluster", "30000")) : 30000;
        if (System.currentTimeMillis() > startTime + startTimeout) {
            throw new RuntimeException(String.format("Master not active after %s seconds", startTimeout));
        }
    }
    if (regionservers != null) {
        for (JVMClusterUtil.RegionServerThread t : regionservers) {
            t.start();
        }
    }
    // Wait for an active master to be initialized (only the active master initializes);
    //  with this, when we return, the cluster is completely up
    startTime = System.currentTimeMillis();
    final int maxwait = 200000;
    while (true) {
        JVMClusterUtil.MasterThread t = findActiveMaster(masters);
        if (t != null && t.master.isInitialized()) {
            return t.master.getServerName().toString();
        }
        // after the first 10 seconds, slow the polling down with an extra one-second sleep
        if (System.currentTimeMillis() > startTime + 10000) {
            try {
                Thread.sleep(1000);
            } catch (InterruptedException e) {
                throw (InterruptedIOException) new InterruptedIOException().initCause(e);
            }
        }
        if (System.currentTimeMillis() > startTime + maxwait) {
            String msg = "Master not initialized after " + maxwait + "ms seconds";
            Threads.printThreadInfo(System.out, "Thread dump because: " + msg);
            throw new RuntimeException(msg);
        }
        try {
            Thread.sleep(100);
        } catch (InterruptedException e) {
            throw (InterruptedIOException) new InterruptedIOException().initCause(e);
        }
    }
}
Also used : InterruptedIOException(java.io.InterruptedIOException) Configuration(org.apache.hadoop.conf.Configuration)
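
The wait loops in startup() share one shape: poll a condition, sleep briefly, convert interrupts to InterruptedIOException, and fail once a deadline passes. Below is a sketch of that shape as a reusable helper; Poll and until are hypothetical names for illustration.

import java.io.InterruptedIOException;
import java.util.function.BooleanSupplier;

// Hypothetical helper capturing the wait loops in startup(): poll until a condition
// holds, sleeping between checks, converting interrupts, failing at a deadline.
final class Poll {

    static void until(BooleanSupplier condition, long pollMs, long timeoutMs, String what)
            throws InterruptedIOException {
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (!condition.getAsBoolean()) {
            if (System.currentTimeMillis() > deadline) {
                throw new RuntimeException(what + " not true after " + timeoutMs + " ms");
            }
            try {
                Thread.sleep(pollMs);
            } catch (InterruptedException e) {
                throw (InterruptedIOException) new InterruptedIOException().initCause(e);
            }
        }
    }
}

With such a helper, the first loop would reduce to something like Poll.until(() -> findActiveMaster(masters) != null, 100, startTimeout, "active master").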

Example 75 with InterruptedIOException

Use of java.io.InterruptedIOException in project hbase by apache.

From class ModifyRegionUtils, method assignRegions.

/**
   * Triggers a bulk assignment of the specified regions
   *
   * @param assignmentManager the AssignmentManager
   * @param regionInfos the list of regions to assign
   * @throws IOException if an error occurred during the assignment
   */
public static void assignRegions(final AssignmentManager assignmentManager, final List<HRegionInfo> regionInfos) throws IOException {
    try {
        assignmentManager.getRegionStates().createRegionStates(regionInfos);
        assignmentManager.assign(regionInfos);
    } catch (InterruptedException e) {
        LOG.error("Caught " + e + " during round-robin assignment");
        InterruptedIOException ie = new InterruptedIOException(e.getMessage());
        ie.initCause(e);
        throw ie;
    }
}
Also used : InterruptedIOException(java.io.InterruptedIOException)
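
All five examples rethrow InterruptedException as InterruptedIOException, but none restores the thread's interrupt flag, which Thread.sleep() clears when it throws. A common refinement (a sketch, not the HBase code) re-interrupts the current thread before converting:

import java.io.InterruptedIOException;

final class Interrupts {

    // Convert an InterruptedException into an InterruptedIOException while
    // preserving the interrupt status for callers further up the stack.
    static InterruptedIOException asInterruptedIOException(InterruptedException e) {
        Thread.currentThread().interrupt(); // restore the flag that sleep() cleared
        InterruptedIOException iioe = new InterruptedIOException(e.getMessage());
        iioe.initCause(e);
        return iioe;
    }
}

// Usage: catch (InterruptedException e) { throw Interrupts.asInterruptedIOException(e); }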

Aggregations

InterruptedIOException (java.io.InterruptedIOException): 286
IOException (java.io.IOException): 195
Test (org.junit.Test): 40
Socket (java.net.Socket): 28
ArrayList (java.util.ArrayList): 27
InputStream (java.io.InputStream): 23
ExecutionException (java.util.concurrent.ExecutionException): 23
ConnectException (java.net.ConnectException): 22
InetSocketAddress (java.net.InetSocketAddress): 21
ByteBuffer (java.nio.ByteBuffer): 21
Path (org.apache.hadoop.fs.Path): 20
NoRouteToHostException (java.net.NoRouteToHostException): 19
EOFException (java.io.EOFException): 17
OutputStream (java.io.OutputStream): 17
SocketTimeoutException (java.net.SocketTimeoutException): 17
ServletException (javax.servlet.ServletException): 17
CountDownLatch (java.util.concurrent.CountDownLatch): 16
SocketException (java.net.SocketException): 15
HttpServletRequest (javax.servlet.http.HttpServletRequest): 15
HttpServletResponse (javax.servlet.http.HttpServletResponse): 15