Example 1 with AdminClient

use of voldemort.client.protocol.admin.AdminClient in project voldemort by voldemort.

the class AdminStoreSwapper method invokeFetch.

public Map<Node, Response> invokeFetch(final String storeName, final String basePath, final long pushVersion) {
    // do fetch
    final Map<Integer, Future<String>> fetchDirs = new HashMap<Integer, Future<String>>();
    for (final Node node : cluster.getNodes()) {
        fetchDirs.put(node.getId(), executor.submit(new Callable<String>() {

            public String call() throws Exception {
                String response = null;
                if (buildPrimaryReplicasOnly) {
                    // Then we give the root directory to the server and let it decide what to fetch
                    response = fetch(basePath);
                } else {
                    // Old behavior: fetch the node directory only
                    String storeDir = basePath + "/" + ReadOnlyUtils.NODE_DIRECTORY_PREFIX + node.getId();
                    response = fetch(storeDir);
                }
                if (response == null)
                    throw new VoldemortException("Fetch request on " + node.briefToString() + " failed");
                logger.info("Fetch succeeded on " + node.briefToString());
                return response.trim();
            }

            private String fetch(String hadoopStoreDirToFetch) {
                // We need to keep the AdminClient instance separate in each Callable, so that a refresh of
                // the client in one callable does not refresh the AdminClient used by another callable.
                AdminClient currentAdminClient = AdminStoreSwapper.this.adminClient;
                int attempt = 1;
                while (attempt <= MAX_FETCH_ATTEMPTS) {
                    if (attempt > 1) {
                        logger.info("Fetch attempt " + attempt + "/" + MAX_FETCH_ATTEMPTS + " for " + node.briefToString() + ". Will wait " + WAIT_TIME_BETWEEN_FETCH_ATTEMPTS + " ms before going ahead.");
                        try {
                            Thread.sleep(WAIT_TIME_BETWEEN_FETCH_ATTEMPTS);
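                        } catch (InterruptedException e) {
                            // Restore the interrupt flag before converting to an unchecked exception
                            Thread.currentThread().interrupt();
                            throw new VoldemortException(e);
                        }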
                    }
                    logger.info("Invoking fetch for " + node.briefToString() + " for " + hadoopStoreDirToFetch);
                    try {
                        return currentAdminClient.readonlyOps.fetchStore(node.getId(), storeName, hadoopStoreDirToFetch, pushVersion, timeoutMs);
                    } catch (AsyncOperationTimeoutException e) {
                        throw e;
                    } catch (VoldemortException ve) {
                        if (attempt >= MAX_FETCH_ATTEMPTS) {
                            throw ve;
                        }
                        if (ExceptionUtils.recursiveClassEquals(ve, ExceptionUtils.BNP_SOFT_ERRORS)) {
                            String logMessage = "Got a " + ve.getClass().getSimpleName() + " from " + node.briefToString() + " while trying to fetch store '" + storeName + "'" + " (attempt " + attempt + "/" + MAX_FETCH_ATTEMPTS + ").";
                            if (currentAdminClient.isClusterModified()) {
                                logMessage += " It seems like the cluster.xml state has changed since this" + " AdminClient was constructed. Therefore, we will attempt constructing" + " a fresh AdminClient and retrying the fetch operation.";
                                currentAdminClient = currentAdminClient.getFreshClient();
                            } else {
                                logMessage += " The cluster.xml is up to date. We will retry with the same AdminClient.";
                            }
                            logger.info(logMessage, ve);
                            attempt++;
                        } else {
                            throw ve;
                        }
                    }
                }
                // Defensive coding
                throw new IllegalStateException("Code should never reach here!");
            }
        }));
    }
    Map<Node, Response> fetchResponseMap = Maps.newTreeMap();
    boolean fetchErrors = false;
    /*
     * We wait for every fetch to either complete successfully or throw an
     * exception. QuotaExceededException is not handled in a special way
     * here. The idea is to protect the disk. It is okay to let the Bnp job
     * run to completion. We still want to delete the data of a failed fetch
     * on all nodes that fetched it successfully. After deleting the failed
     * fetch data, we bubble up the QuotaExceededException as needed.
     *
     * The alternative is to cancel all future tasks as soon as we detect a
     * QuotaExceededException. This would save time (fail faster) and
     * protect disk usage, but it does not guarantee a clean state on all
     * nodes with respect to data from the failed fetch: someone would have
     * to clean up that data manually. Instead, we try to clean up as much
     * data as we can before we fail the job.
     *
     * In iteration 2 we can try to improve this to fail faster, by adding
     * either or both of:
     *
     * 1. Client-side checks.
     * 2. Server-side fail-fast as soon as a QuotaExceededException is
     *    detected on one of the servers. Note that this needs a careful
     *    decision on how to handle fetches that have already started on
     *    other nodes, and how and when to clean them up.
     */
    ArrayList<Node> failedNodes = new ArrayList<Node>();
    for (final Node node : cluster.getNodes()) {
        Future<String> val = fetchDirs.get(node.getId());
        try {
            String response = val.get();
            fetchResponseMap.put(node, new Response(response));
        } catch (Exception e) {
            if (e.getCause() instanceof UnauthorizedStoreException) {
                throw (UnauthorizedStoreException) e.getCause();
            } else {
                fetchErrors = true;
                fetchResponseMap.put(node, new Response(e));
                failedNodes.add(node);
            }
        }
    }
    if (fetchErrors) {
        // Log all the errors for the user
        for (Map.Entry<Node, Response> entry : fetchResponseMap.entrySet()) {
            if (!entry.getValue().isSuccessful()) {
                logger.error("Error on " + entry.getKey().briefToString() + " during push : ", entry.getValue().getException());
            }
        }
        Iterator<FailedFetchStrategy> strategyIterator = failedFetchStrategyList.iterator();
        boolean swapIsPossible = false;
        FailedFetchStrategy strategy = null;
        while (strategyIterator.hasNext() && !swapIsPossible) {
            strategy = strategyIterator.next();
            try {
                logger.info("About to attempt: " + strategy.toString());
                swapIsPossible = strategy.dealWithIt(storeName, pushVersion, fetchResponseMap);
                logger.info("Finished executing: " + strategy.toString() + "; swapIsPossible: " + swapIsPossible);
            } catch (Exception e) {
                if (strategyIterator.hasNext()) {
                    logger.error("Got an exception while trying to execute: " + strategy.toString() + ". Continuing with next strategy.", e);
                } else {
                    logger.error("Got an exception while trying to execute the last remaining strategy: " + strategy.toString() + ". Swap will be aborted.", e);
                }
            }
        }
        if (!swapIsPossible) {
            throw new VoldemortException("Exception during push. Swap will be aborted", fetchResponseMap.get(failedNodes.get(0)).getException());
        }
    }
    return fetchResponseMap;
}
Also used : UnauthorizedStoreException(voldemort.store.readonly.UnauthorizedStoreException) HashMap(java.util.HashMap) Node(voldemort.cluster.Node) ArrayList(java.util.ArrayList) VoldemortException(voldemort.VoldemortException) Callable(java.util.concurrent.Callable) AsyncOperationTimeoutException(voldemort.client.protocol.admin.AsyncOperationTimeoutException) Future(java.util.concurrent.Future) Map(java.util.Map) AdminClient(voldemort.client.protocol.admin.AdminClient)
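
The bounded retry loop inside fetch can be read independently of the Voldemort types. The following is a minimal sketch of that pattern only, not the project's implementation: RetrySketch, SoftError and callWithRetry are hypothetical names introduced here, and SoftError merely stands in for the errors that Example 1 classifies as retryable. As in the original, a retryable error on the last attempt is rethrown, and any other exception propagates immediately.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Hypothetical stand-alone sketch; none of these names exist in Voldemort.
class RetrySketch {

    /** Stand-in for the "soft" errors that Example 1 treats as retryable. */
    static class SoftError extends RuntimeException {
        SoftError(String message) {
            super(message);
        }
    }

    /**
     * Runs the task up to maxAttempts times, sleeping waitMs before each
     * attempt after the first. Only SoftError triggers a retry.
     */
    static <T> T callWithRetry(Supplier<T> task, int maxAttempts, long waitMs) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        int attempt = 1;
        while (true) {
            if (attempt > 1) {
                try {
                    Thread.sleep(waitMs);
                } catch (InterruptedException e) {
                    // Keep the interrupt visible to callers further up the stack.
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("Interrupted while waiting to retry", e);
                }
            }
            try {
                return task.get();
            } catch (SoftError e) {
                if (attempt >= maxAttempts) {
                    throw e; // out of attempts: surface the last error
                }
                attempt++;
            }
        }
    }

    public static void main(String[] args) {
        AtomicInteger calls = new AtomicInteger();
        // Fails twice with a retryable error, then succeeds on the third call.
        String result = callWithRetry(() -> {
            if (calls.incrementAndGet() < 3) {
                throw new SoftError("transient failure");
            }
            return "ok";
        }, 5, 10L);
        System.out.println(result + " after " + calls.get() + " calls");
    }
}
```

The sketch leaves out the part of Example 1 that swaps in a fresh AdminClient when the cluster metadata has changed; a caller could do that inside the supplier.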

Example 2 with AdminClient

use of voldemort.client.protocol.admin.AdminClient in project voldemort by voldemort.

the class RebootstrappingStore method reinit.

private void reinit() {
    AdminClient adminClient = AdminClient.createTempAdminClient(voldemortConfig, metadata.getCluster(), voldemortConfig.getClientMaxConnectionsPerNode());
    try {
        Versioned<Cluster> latestCluster = adminClient.rebalanceOps.getLatestCluster(new ArrayList<Integer>());
        metadata.put(MetadataStore.CLUSTER_KEY, latestCluster.getValue());
        checkAndAddNodeStore();
        routedStore.updateRoutingStrategy(metadata.getRoutingStrategy(getName()));
    } finally {
        adminClient.close();
    }
}
Also used : Cluster(voldemort.cluster.Cluster) AdminClient(voldemort.client.protocol.admin.AdminClient)
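
Example 2 creates a temporary client and releases it in a finally block so that it is closed even if the metadata update throws. The same guarantee can be sketched with try-with-resources, assuming the client type implements AutoCloseable. That is an assumption for illustration only: the source does not show whether AdminClient does, and TempClient and CleanupSketch below are hypothetical stand-ins rather than Voldemort classes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-alone sketch; TempClient is not a Voldemort class.
class CleanupSketch {

    /** Stand-in for a short-lived client that holds connections. */
    static class TempClient implements AutoCloseable {
        private final List<String> events;

        TempClient(List<String> events) {
            this.events = events;
            events.add("open");
        }

        /** Pretends to fetch cluster metadata; fails on demand. */
        String latestCluster(boolean fail) {
            if (fail) {
                throw new IllegalStateException("remote call failed");
            }
            events.add("fetched");
            return "cluster-v2";
        }

        @Override
        public void close() {
            events.add("close");
        }
    }

    /** Opens a client, uses it once, and always closes it. */
    static List<String> reinit(boolean fail) {
        List<String> events = new ArrayList<>();
        try (TempClient client = new TempClient(events)) {
            client.latestCluster(fail);
        } catch (IllegalStateException e) {
            // By the time this catch block runs, close() has already been called.
            events.add("failed");
        }
        return events;
    }

    public static void main(String[] args) {
        System.out.println(reinit(false)); // [open, fetched, close]
        System.out.println(reinit(true));  // [open, close, failed]
    }
}
```

The recorded event order shows that the resource is closed on both the success path and the failure path, which is the property the finally block in reinit provides.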

Example 3 with AdminClient

use of voldemort.client.protocol.admin.AdminClient in project voldemort by voldemort.

the class RebootstrappingStoreTest method rebalance.

public void rebalance() {
    assert servers != null && servers.size() > 1;
    VoldemortConfig config = servers.get(0).getVoldemortConfig();
    AdminClient adminClient = AdminClient.createTempAdminClient(config, cluster, 4);
    List<Integer> partitionIds = ImmutableList.of(0, 1);
    int req = adminClient.storeMntOps.migratePartitions(0, 1, STORE_NAME, partitionIds, null, null);
    adminClient.rpcOps.waitForCompletion(1, req, 5, TimeUnit.SECONDS);
    Versioned<Cluster> versionedCluster = adminClient.metadataMgmtOps.getRemoteCluster(0);
    Node node0 = versionedCluster.getValue().getNodeById(0);
    Node node1 = versionedCluster.getValue().getNodeById(1);
    Node newNode0 = new Node(node0.getId(), node0.getHost(), node0.getHttpPort(), node0.getSocketPort(), node0.getAdminPort(), ImmutableList.<Integer>of());
    Node newNode1 = new Node(node1.getId(), node1.getHost(), node1.getHttpPort(), node1.getSocketPort(), node1.getAdminPort(), ImmutableList.of(0, 1));
    long deleted = adminClient.storeMntOps.deletePartitions(0, STORE_NAME, ImmutableList.of(0, 1), null);
    assert deleted > 0;
    Cluster newCluster = new Cluster(cluster.getName(), ImmutableList.of(newNode0, newNode1), Lists.newArrayList(cluster.getZones()));
    for (Node node : cluster.getNodes()) {
        VectorClock clock = (VectorClock) versionedCluster.getVersion();
        clock.incrementVersion(node.getId(), System.currentTimeMillis());
        adminClient.metadataMgmtOps.updateRemoteCluster(node.getId(), newCluster, clock);
    }
}
Also used : Node(voldemort.cluster.Node) VectorClock(voldemort.versioning.VectorClock) Cluster(voldemort.cluster.Cluster) VoldemortConfig(voldemort.server.VoldemortConfig) AdminClient(voldemort.client.protocol.admin.AdminClient)

Example 4 with AdminClient

use of voldemort.client.protocol.admin.AdminClient in project voldemort by voldemort.

the class ReplaceNodeCLI method init.

private void init() {
    this.adminClient = new AdminClient(this.url);
    this.newAdminClient = new AdminClient(this.newUrl);
    this.cluster = adminClient.getAdminClientCluster();
    // Validate node exists in the old cluster
    this.cluster.getNodeById(nodeId);
    this.newCluster = newAdminClient.getAdminClientCluster();
    if (newCluster.getNumberOfNodes() > 1) {
        newNodeId = nodeId;
    } else {
        newNodeId = newCluster.getNodeIds().iterator().next().intValue();
    }
    this.clusterXml = getClusterXML();
    // Update your cluster XML based on the consensus
    this.cluster = new ClusterMapper().readCluster(new StringReader(clusterXml));
    this.storesXml = getStoresXML();
    this.storeDefinitions = new StoreDefinitionsMapper().readStoreList(new StringReader(storesXml), false);
}
Also used : StringReader(java.io.StringReader) StoreDefinitionsMapper(voldemort.xml.StoreDefinitionsMapper) ClusterMapper(voldemort.xml.ClusterMapper) AdminClient(voldemort.client.protocol.admin.AdminClient)

Example 5 with AdminClient

use of voldemort.client.protocol.admin.AdminClient in project voldemort by voldemort.

the class AdminToolUtils method getAdminClient.

/**
 * Utility function that constructs AdminClient.
 *
 * @param url URL pointing to the bootstrap node
 * @return Newly constructed AdminClient
 */
public static AdminClient getAdminClient(String url) {
    ClientConfig config = new ClientConfig().setBootstrapUrls(url).setConnectionTimeout(5, TimeUnit.SECONDS);
    AdminClientConfig adminConfig = new AdminClientConfig().setAdminSocketTimeoutSec(5);
    return new AdminClient(adminConfig, config);
}
Also used : AdminClientConfig(voldemort.client.protocol.admin.AdminClientConfig) ClientConfig(voldemort.client.ClientConfig) AdminClient(voldemort.client.protocol.admin.AdminClient)

Aggregations

AdminClient (voldemort.client.protocol.admin.AdminClient): 80
Test (org.junit.Test): 35
Cluster (voldemort.cluster.Cluster): 26
Node (voldemort.cluster.Node): 26
Properties (java.util.Properties): 19
StoreDefinition (voldemort.store.StoreDefinition): 19
ArrayList (java.util.ArrayList): 18
AdminClientConfig (voldemort.client.protocol.admin.AdminClientConfig): 18
VoldemortException (voldemort.VoldemortException): 17
IOException (java.io.IOException): 14
Before (org.junit.Before): 14
ByteArray (voldemort.utils.ByteArray): 14
HashMap (java.util.HashMap): 13
StoreDefinitionsMapper (voldemort.xml.StoreDefinitionsMapper): 13
File (java.io.File): 11
VoldemortServer (voldemort.server.VoldemortServer): 11
ClientConfig (voldemort.client.ClientConfig): 10
VectorClock (voldemort.versioning.VectorClock): 10
Versioned (voldemort.versioning.Versioned): 9
ClusterMapper (voldemort.xml.ClusterMapper): 9