
Example 36 with KeeperException

use of org.apache.zookeeper.KeeperException in project hbase by apache.

the class ServerManager method letRegionServersShutdown.

void letRegionServersShutdown() {
    long previousLogTime = 0;
    ServerName sn = master.getServerName();
    ZooKeeperWatcher zkw = master.getZooKeeper();
    int onlineServersCt;
    while ((onlineServersCt = onlineServers.size()) > 0) {
        if (System.currentTimeMillis() > (previousLogTime + 1000)) {
            Set<ServerName> remainingServers = onlineServers.keySet();
            synchronized (onlineServers) {
                if (remainingServers.size() == 1 && remainingServers.contains(sn)) {
                    // Master will delete itself later.
                    return;
                }
            }
            StringBuilder sb = new StringBuilder();
            // It's ok here to not sync on onlineServers - merely logging
            for (ServerName key : remainingServers) {
                if (sb.length() > 0) {
                    sb.append(", ");
                }
                sb.append(key);
            }
            LOG.info("Waiting on regionserver(s) to go down " + sb.toString());
            previousLogTime = System.currentTimeMillis();
        }
        try {
            List<String> servers = getRegionServersInZK(zkw);
            if (servers == null || servers.isEmpty() || (servers.size() == 1 && servers.contains(sn.toString()))) {
                LOG.info("ZK shows there is only the master self online, exiting now");
                // Master could have lost some ZK events, no need to wait more.
                break;
            }
        } catch (KeeperException ke) {
            LOG.warn("Failed to list regionservers", ke);
            // ZK is malfunctioning, don't hang here
            break;
        }
        synchronized (onlineServers) {
            try {
                if (onlineServersCt == onlineServers.size())
                    onlineServers.wait(100);
            } catch (InterruptedException ignored) {
            // continue
            }
        }
    }
}
Also used : ZooKeeperWatcher(org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher) ServerName(org.apache.hadoop.hbase.ServerName) KeeperException(org.apache.zookeeper.KeeperException)
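The loop above waits on the shared onlineServers map with a bounded timeout and logs the remaining servers at most once per second. A minimal standalone sketch of that same wait-and-notify pattern is below; ShutdownTracker is a hypothetical class used only for illustration, not HBase's actual ServerManager.

import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch, not HBase code: wait for a tracked server map to drain,
// logging the remaining servers at most once per second.
public class ShutdownTracker {
    private final Map<String, Long> onlineServers = new HashMap<>();

    public void markOnline(String server) {
        synchronized (onlineServers) {
            onlineServers.put(server, System.currentTimeMillis());
        }
    }

    public void markOffline(String server) {
        synchronized (onlineServers) {
            onlineServers.remove(server);
            // Wake up any thread blocked in awaitShutdown().
            onlineServers.notifyAll();
        }
    }

    public void awaitShutdown() throws InterruptedException {
        long previousLogTime = 0;
        synchronized (onlineServers) {
            while (!onlineServers.isEmpty()) {
                if (System.currentTimeMillis() > previousLogTime + 1000) {
                    System.out.println("Waiting on server(s) to go down " + onlineServers.keySet());
                    previousLogTime = System.currentTimeMillis();
                }
                // Bounded wait so the loop re-checks even if a notification is missed,
                // which is the same reason letRegionServersShutdown uses wait(100).
                onlineServers.wait(100);
            }
        }
    }
}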

Example 37 with KeeperException

use of org.apache.zookeeper.KeeperException in project hbase by apache.

the class ReplicationZKNodeCleaner method getUnDeletedQueues.

/**
   * @return a map from each replicator to the queue ids it still holds for peers that have been
   *         removed
   * @throws IOException if the replication queues cannot be read from ZooKeeper
   */
public Map<String, List<String>> getUnDeletedQueues() throws IOException {
    Map<String, List<String>> undeletedQueues = new HashMap<>();
    Set<String> peerIds = new HashSet<>(this.replicationPeers.getAllPeerIds());
    try {
        List<String> replicators = this.queuesClient.getListOfReplicators();
        for (String replicator : replicators) {
            List<String> queueIds = this.queuesClient.getAllQueues(replicator);
            for (String queueId : queueIds) {
                ReplicationQueueInfo queueInfo = new ReplicationQueueInfo(queueId);
                if (!peerIds.contains(queueInfo.getPeerId())) {
                    undeletedQueues.computeIfAbsent(replicator, (key) -> new ArrayList<>()).add(queueId);
                    if (LOG.isDebugEnabled()) {
                        LOG.debug("Undeleted replication queue for removed peer found: " + String.format("[removedPeerId=%s, replicator=%s, queueId=%s]", queueInfo.getPeerId(), replicator, queueId));
                    }
                }
            }
        }
    } catch (KeeperException ke) {
        throw new IOException("Failed to get the replication queues of all replicators", ke);
    }
    return undeletedQueues;
}
Also used : KeeperException(org.apache.zookeeper.KeeperException) ZKUtil(org.apache.hadoop.hbase.zookeeper.ZKUtil) Abortable(org.apache.hadoop.hbase.Abortable) ReplicationQueuesClientArguments(org.apache.hadoop.hbase.replication.ReplicationQueuesClientArguments) Set(java.util.Set) ReplicationFactory(org.apache.hadoop.hbase.replication.ReplicationFactory) IOException(java.io.IOException) HashMap(java.util.HashMap) ReplicationPeers(org.apache.hadoop.hbase.replication.ReplicationPeers) ReplicationStateZKBase(org.apache.hadoop.hbase.replication.ReplicationStateZKBase) ZooKeeperWatcher(org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher) ArrayList(java.util.ArrayList) HashSet(java.util.HashSet) List(java.util.List) ReplicationQueueInfo(org.apache.hadoop.hbase.replication.ReplicationQueueInfo) ReplicationQueuesClient(org.apache.hadoop.hbase.replication.ReplicationQueuesClient) Map(java.util.Map) Configuration(org.apache.hadoop.conf.Configuration) Entry(java.util.Map.Entry) Log(org.apache.commons.logging.Log) LogFactory(org.apache.commons.logging.LogFactory) InterfaceAudience(org.apache.hadoop.hbase.classification.InterfaceAudience)
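The key idiom in getUnDeletedQueues is grouping queue ids per replicator with computeIfAbsent. A minimal sketch of that grouping step in isolation follows; groupStaleQueues is a hypothetical helper, and it makes the simplifying assumption that a queue id starts with its peer id followed by a dash, whereas the real method delegates that parsing to ReplicationQueueInfo.

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of the grouping idiom used above: collect queue ids per
// replicator, keeping only those whose peer id is no longer configured.
public class QueueGrouping {
    public static Map<String, List<String>> groupStaleQueues(
            Map<String, List<String>> queuesByReplicator, Set<String> knownPeerIds) {
        Map<String, List<String>> stale = new HashMap<>();
        for (Map.Entry<String, List<String>> e : queuesByReplicator.entrySet()) {
            for (String queueId : e.getValue()) {
                // Assumption for this sketch: the queue id begins with "<peerId>-".
                String peerId = queueId.split("-", 2)[0];
                if (!knownPeerIds.contains(peerId)) {
                    stale.computeIfAbsent(e.getKey(), key -> new ArrayList<>()).add(queueId);
                }
            }
        }
        return stale;
    }
}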

Example 38 with KeeperException

use of org.apache.zookeeper.KeeperException in project hbase by apache.

the class ReplicationZKNodeCleaner method getUnDeletedHFileRefsQueues.

/**
   * @return the set of peer ids that still have entries under the hfile-refs znode but are no
   *         longer configured as replication peers, or null if the hfile-refs znode does not exist
   * @throws IOException if the list of peers cannot be read from the hfile-refs znode
   */
public Set<String> getUnDeletedHFileRefsQueues() throws IOException {
    Set<String> undeletedHFileRefsQueue = new HashSet<>();
    Set<String> peerIds = new HashSet<>(this.replicationPeers.getAllPeerIds());
    String hfileRefsZNode = queueDeletor.getHfileRefsZNode();
    try {
        if (-1 == ZKUtil.checkExists(zkw, hfileRefsZNode)) {
            return null;
        }
        List<String> listOfPeers = this.queuesClient.getAllPeersFromHFileRefsQueue();
        Set<String> peers = new HashSet<>(listOfPeers);
        peers.removeAll(peerIds);
        if (!peers.isEmpty()) {
            undeletedHFileRefsQueue.addAll(peers);
        }
    } catch (KeeperException e) {
        throw new IOException("Failed to get list of all peers from hfile-refs znode " + hfileRefsZNode, e);
    }
    return undeletedHFileRefsQueue;
}
Also used : IOException(java.io.IOException) KeeperException(org.apache.zookeeper.KeeperException) HashSet(java.util.HashSet)
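The early return above hinges on ZKUtil.checkExists returning -1 when the hfile-refs znode is absent. With the raw ZooKeeper client the equivalent check is exists(), which returns null for a missing znode. A minimal sketch, using a hypothetical ZNodeExistence helper:

import org.apache.zookeeper.KeeperException;
import org.apache.zookeeper.ZooKeeper;
import org.apache.zookeeper.data.Stat;

// Hypothetical sketch of the existence check that ZKUtil.checkExists wraps:
// exists() returns null when the znode is absent, which corresponds to the
// -1 return value tested in getUnDeletedHFileRefsQueues.
public class ZNodeExistence {
    public static boolean znodeExists(ZooKeeper zk, String path)
            throws KeeperException, InterruptedException {
        Stat stat = zk.exists(path, false); // no watch is set
        return stat != null;
    }
}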

Example 39 with KeeperException

use of org.apache.zookeeper.KeeperException in project hbase by apache.

the class TestZooKeeper method testCreateSilentIsReallySilent.

/**
   * A test for HBASE-3238
   * @throws IOException A connection attempt to zk failed
   * @throws InterruptedException One of the non ZKUtil actions was interrupted
   * @throws KeeperException Any of the zookeeper connections had a
   * KeeperException
   */
@Test
public void testCreateSilentIsReallySilent() throws InterruptedException, KeeperException, IOException {
    Configuration c = TEST_UTIL.getConfiguration();
    String aclZnode = "/aclRoot";
    String quorumServers = ZKConfig.getZKQuorumServersString(c);
    // 5 seconds
    int sessionTimeout = 5 * 1000;
    ZooKeeper zk = new ZooKeeper(quorumServers, sessionTimeout, EmptyWatcher.instance);
    zk.addAuthInfo("digest", "hbase:rox".getBytes());
// Assumes the root of the ZooKeeper space is writable as it creates a node
    // wherever the cluster home is defined.
    ZooKeeperWatcher zk2 = new ZooKeeperWatcher(TEST_UTIL.getConfiguration(), "testCreateSilentIsReallySilent", null);
    // Save the previous ACL
    Stat s = null;
    List<ACL> oldACL = null;
    while (true) {
        try {
            s = new Stat();
            oldACL = zk.getACL("/", s);
            break;
        } catch (KeeperException e) {
            switch(e.code()) {
                case CONNECTIONLOSS:
                case SESSIONEXPIRED:
                case OPERATIONTIMEOUT:
                    LOG.warn("Possibly transient ZooKeeper exception", e);
                    Threads.sleep(100);
                    break;
                default:
                    throw e;
            }
        }
    }
    // Add retries in case of retryable zk exceptions.
    while (true) {
        try {
            zk.setACL("/", ZooDefs.Ids.CREATOR_ALL_ACL, -1);
            break;
        } catch (KeeperException e) {
            switch(e.code()) {
                case CONNECTIONLOSS:
                case SESSIONEXPIRED:
                case OPERATIONTIMEOUT:
                    LOG.warn("Possibly transient ZooKeeper exception: " + e);
                    Threads.sleep(100);
                    break;
                default:
                    throw e;
            }
        }
    }
    while (true) {
        try {
            zk.create(aclZnode, null, ZooDefs.Ids.CREATOR_ALL_ACL, CreateMode.PERSISTENT);
            break;
        } catch (KeeperException e) {
            switch(e.code()) {
                case CONNECTIONLOSS:
                case SESSIONEXPIRED:
                case OPERATIONTIMEOUT:
                    LOG.warn("Possibly transient ZooKeeper exception: " + e);
                    Threads.sleep(100);
                    break;
                default:
                    throw e;
            }
        }
    }
    zk.close();
    ZKUtil.createAndFailSilent(zk2, aclZnode);
    // Restore the ACL
    ZooKeeper zk3 = new ZooKeeper(quorumServers, sessionTimeout, EmptyWatcher.instance);
    zk3.addAuthInfo("digest", "hbase:rox".getBytes());
    try {
        zk3.setACL("/", oldACL, -1);
    } finally {
        zk3.close();
    }
}
Also used : ZooKeeper(org.apache.zookeeper.ZooKeeper) Stat(org.apache.zookeeper.data.Stat) Configuration(org.apache.hadoop.conf.Configuration) ZooKeeperWatcher(org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher) ACL(org.apache.zookeeper.data.ACL) KeeperException(org.apache.zookeeper.KeeperException) Test(org.junit.Test)
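The test repeats the same retry-on-transient-KeeperException switch three times, for getACL, setACL, and create. A minimal sketch of how that loop could be factored into a reusable helper is below; ZKRetry and ZKCall are hypothetical names, not part of the test or of HBase.

import org.apache.zookeeper.KeeperException;

// Hypothetical helper that retries a ZooKeeper call on transient failures,
// mirroring the switch statements in the test above.
public final class ZKRetry {
    public interface ZKCall<T> {
        T call() throws KeeperException, InterruptedException;
    }

    public static <T> T withRetries(ZKCall<T> call) throws KeeperException, InterruptedException {
        while (true) {
            try {
                return call.call();
            } catch (KeeperException e) {
                switch (e.code()) {
                    case CONNECTIONLOSS:
                    case SESSIONEXPIRED:
                    case OPERATIONTIMEOUT:
                        // Possibly transient; back off briefly and try again.
                        Thread.sleep(100);
                        break;
                    default:
                        throw e;
                }
            }
        }
    }
}

With such a helper, the ACL read in the test would reduce to something like oldACL = ZKRetry.withRetries(() -> zk.getACL("/", new Stat())).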

Example 40 with KeeperException

use of org.apache.zookeeper.KeeperException in project hbase by apache.

the class HBaseAdmin method getMasterInfoPort.

@Override
public int getMasterInfoPort() throws IOException {
    // TODO: Fix!  Reaching into internal implementation!!!!
    ConnectionImplementation connection = (ConnectionImplementation) this.connection;
    ZooKeeperKeepAliveConnection zkw = connection.getKeepAliveZooKeeperWatcher();
    try {
        return MasterAddressTracker.getMasterInfoPort(zkw);
    } catch (KeeperException e) {
        throw new IOException("Failed to get master info port from MasterAddressTracker", e);
    }
}
Also used : InterruptedIOException(java.io.InterruptedIOException) IOException(java.io.IOException) DoNotRetryIOException(org.apache.hadoop.hbase.DoNotRetryIOException) TimeoutIOException(org.apache.hadoop.hbase.exceptions.TimeoutIOException) KeeperException(org.apache.zookeeper.KeeperException)
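getMasterInfoPort illustrates a common boundary pattern: the public API only declares IOException, so the KeeperException is wrapped with its cause preserved. A minimal standalone sketch of the same idiom against the raw ZooKeeper client; ZKFacade and readZNode are hypothetical names used only for illustration.

import java.io.IOException;
import org.apache.zookeeper.KeeperException;
import org.apache.zookeeper.ZooKeeper;
import org.apache.zookeeper.data.Stat;

// Hypothetical sketch of the wrap-and-rethrow idiom above: callers that only
// declare IOException convert KeeperException at the boundary, keeping the cause.
public class ZKFacade {
    private final ZooKeeper zk;

    public ZKFacade(ZooKeeper zk) {
        this.zk = zk;
    }

    public byte[] readZNode(String path) throws IOException {
        try {
            return zk.getData(path, false, new Stat());
        } catch (KeeperException e) {
            throw new IOException("Failed to read znode " + path, e);
        } catch (InterruptedException e) {
            // Restore the interrupt flag before translating the exception.
            Thread.currentThread().interrupt();
            throw new IOException("Interrupted while reading znode " + path, e);
        }
    }
}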

Aggregations

KeeperException (org.apache.zookeeper.KeeperException): 345
IOException (java.io.IOException): 114
Stat (org.apache.zookeeper.data.Stat): 79
ZooKeeper (org.apache.zookeeper.ZooKeeper): 54
Test (org.junit.Test): 37
NoNodeException (org.apache.zookeeper.KeeperException.NoNodeException): 36
ArrayList (java.util.ArrayList): 30
SolrException (org.apache.solr.common.SolrException): 30
HeliosRuntimeException (com.spotify.helios.common.HeliosRuntimeException): 24
HashMap (java.util.HashMap): 21
WatchedEvent (org.apache.zookeeper.WatchedEvent): 20
Watcher (org.apache.zookeeper.Watcher): 20
InterruptedIOException (java.io.InterruptedIOException): 19
Map (java.util.Map): 19
ZooKeeperClient (com.spotify.helios.servicescommon.coordination.ZooKeeperClient): 17
ServerName (org.apache.hadoop.hbase.ServerName): 15
ACL (org.apache.zookeeper.data.ACL): 15
List (java.util.List): 14
CountDownLatch (java.util.concurrent.CountDownLatch): 14
RetryCounter (org.apache.hadoop.hbase.util.RetryCounter): 13