Example 1 with RebalanceTask

Use of voldemort.client.rebalance.task.RebalanceTask in project voldemort by voldemort.

In class RebalanceController, method executeSubBatch.

// TODO: (refactor) Break this state-machine like method into multiple "sub"
// methods. AFAIK, this method either does the RO stores or the RW stores in
// a batch. I.e., there are at most 2 sub-batches for any given batch. And,
// in practice, there is one sub-batch that is either RO or RW.
// TODO: Fix the javadoc comment to be more easily understood.
/**
     * The smallest granularity of rebalancing, in which we move partitions for
     * a subset of stores. At the end of the movement, the node is taken out of
     * rebalance state.
     * 
     * <br>
     * 
     * Errors and rollback procedures are also handled at this level. The table
     * lists the rollback action taken when a rebalance task fails:
     * 
     * <pre>
     * | Case | hasRO | hasRW | finishedRO | Action |
     * | 0 | t | t | t | rollback cluster change + swap |
     * | 1 | t | t | f | nothing to do since "rebalance state change" should have removed everything |
     * | 2 | t | f | t | won't be triggered since hasRW is false |
     * | 3 | t | f | f | nothing to do since "rebalance state change" should have removed everything |
     * | 4 | f | t | t | rollback cluster change |
     * | 5 | f | t | f | won't be triggered |
     * | 6 | f | f | t | won't be triggered |
     * | 7 | f | f | f | won't be triggered |
     * </pre>
     * 
     * @param batchId Rebalance batch id
     * @param progressBar Progress bar for this batch
     * @param batchRollbackCluster Cluster to roll back to if we have a problem
     * @param batchRollbackStoreDefs Store definitions to roll back to if we
     *        have a problem
     * @param rebalanceTaskPlanList The list of rebalance partition plans
     * @param hasReadOnlyStores Are we rebalancing any read-only stores?
     * @param hasReadWriteStores Are we rebalancing any read-write stores?
     * @param finishedReadOnlyStores Have we finished rebalancing the read-only
     *        stores?
     */
private void executeSubBatch(final int batchId, RebalanceBatchPlanProgressBar progressBar, final Cluster batchRollbackCluster, final List<StoreDefinition> batchRollbackStoreDefs, final List<RebalanceTaskInfo> rebalanceTaskPlanList, boolean hasReadOnlyStores, boolean hasReadWriteStores, boolean finishedReadOnlyStores) {
    RebalanceUtils.printBatchLog(batchId, logger, "Submitting rebalance tasks ");
    // Get an ExecutorService in place used for submitting our tasks
    ExecutorService service = RebalanceUtils.createExecutors(maxParallelRebalancing);
    // Tasks that failed or did not complete; filled in after the executor
    // shuts down
    final List<RebalanceTask> failedTasks = Lists.newArrayList();
    final List<RebalanceTask> incompleteTasks = Lists.newArrayList();
    // Semaphores for donor nodes - To avoid multiple disk sweeps
    Map<Integer, Semaphore> donorPermits = new HashMap<Integer, Semaphore>();
    for (Node node : batchRollbackCluster.getNodes()) {
        donorPermits.put(node.getId(), new Semaphore(1));
    }
    try {
        // List of tasks which will run asynchronously
        List<RebalanceTask> allTasks = executeTasks(batchId, progressBar, service, rebalanceTaskPlanList, donorPermits);
        RebalanceUtils.printBatchLog(batchId, logger, "All rebalance tasks submitted");
        // Wait and shutdown after (infinite) timeout
        RebalanceUtils.executorShutDown(service, Long.MAX_VALUE);
        RebalanceUtils.printBatchLog(batchId, logger, "Finished waiting for executors");
        // Collects all failures + incomplete tasks from the rebalance
        // tasks.
        List<Exception> failures = Lists.newArrayList();
        for (RebalanceTask task : allTasks) {
            if (task.hasException()) {
                failedTasks.add(task);
                failures.add(task.getError());
            } else if (!task.isComplete()) {
                incompleteTasks.add(task);
            }
        }
        if (failedTasks.size() > 0) {
            throw new VoldemortRebalancingException("Rebalance task terminated unsuccessfully on tasks " + failedTasks, failures);
        }
        // No task failed outright, but some never completed.
        if (incompleteTasks.size() > 0) {
            throw new VoldemortException("Rebalance tasks are still incomplete / running " + incompleteTasks);
        }
    } catch (VoldemortRebalancingException e) {
        logger.error("Failure while migrating partitions for rebalance task " + batchId);
        if (hasReadOnlyStores && hasReadWriteStores && finishedReadOnlyStores) {
            // Case 0
            adminClient.rebalanceOps.rebalanceStateChange(null, batchRollbackCluster, null, batchRollbackStoreDefs, null, true, true, false, false, false);
        } else if (hasReadWriteStores && finishedReadOnlyStores) {
            // Case 4
            adminClient.rebalanceOps.rebalanceStateChange(null, batchRollbackCluster, null, batchRollbackStoreDefs, null, false, true, false, false, false);
        }
        throw e;
    } finally {
        if (!service.isShutdown()) {
            RebalanceUtils.printErrorLog(batchId, logger, "Could not shutdown service cleanly for rebalance task " + batchId, null);
            service.shutdownNow();
        }
    }
}
Also used: VoldemortRebalancingException (voldemort.server.rebalance.VoldemortRebalancingException), HashMap (java.util.HashMap), Node (voldemort.cluster.Node), ExecutorService (java.util.concurrent.ExecutorService), Semaphore (java.util.concurrent.Semaphore), StealerBasedRebalanceTask (voldemort.client.rebalance.task.StealerBasedRebalanceTask), RebalanceTask (voldemort.client.rebalance.task.RebalanceTask), VoldemortException (voldemort.VoldemortException)
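
The case table in the Javadoc of executeSubBatch can be read as a small decision function. The sketch below is only an illustration of that table and of the two branches in the catch block; the class name RollbackDecisionSketch, the enum RollbackAction and the method decide are hypothetical and do not exist in Voldemort, and no admin call is made.

```java
// Hypothetical sketch: not part of Voldemort. It mirrors the if / else-if
// in the catch block of executeSubBatch without calling rebalanceStateChange.
class RollbackDecisionSketch {

    // Hypothetical labels for the "Action" column of the case table.
    enum RollbackAction {
        ROLLBACK_CLUSTER_AND_SWAP, // Case 0
        ROLLBACK_CLUSTER,          // Case 4
        NONE                       // every other case
    }

    static RollbackAction decide(boolean hasReadOnlyStores,
                                 boolean hasReadWriteStores,
                                 boolean finishedReadOnlyStores) {
        if (hasReadOnlyStores && hasReadWriteStores && finishedReadOnlyStores) {
            return RollbackAction.ROLLBACK_CLUSTER_AND_SWAP;
        } else if (hasReadWriteStores && finishedReadOnlyStores) {
            // Reached only when hasReadOnlyStores is false.
            return RollbackAction.ROLLBACK_CLUSTER;
        }
        return RollbackAction.NONE;
    }

    public static void main(String[] args) {
        boolean[] flags = { true, false };
        // Enumerate the eight cases in the same order as the table (0..7).
        for (boolean hasRO : flags) {
            for (boolean hasRW : flags) {
                for (boolean finishedRO : flags) {
                    System.out.println(hasRO + " " + hasRW + " " + finishedRO
                                       + " -> " + decide(hasRO, hasRW, finishedRO));
                }
            }
        }
    }
}
```

Only cases 0 and 4 lead to a rollback call; every other combination falls through and the exception is simply rethrown.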

Example 2 with RebalanceTask

Use of voldemort.client.rebalance.task.RebalanceTask in project voldemort by voldemort.

In class RebalanceController, method executeTasks.

private List<RebalanceTask> executeTasks(final int batchId, RebalanceBatchPlanProgressBar progressBar, final ExecutorService service, List<RebalanceTaskInfo> rebalanceTaskPlanList, Map<Integer, Semaphore> donorPermits) {
    List<RebalanceTask> taskList = Lists.newArrayList();
    int taskId = 0;
    RebalanceScheduler scheduler = new RebalanceScheduler(service, maxParallelRebalancing);
    List<StealerBasedRebalanceTask> sbTaskList = Lists.newArrayList();
    for (RebalanceTaskInfo taskInfo : rebalanceTaskPlanList) {
        StealerBasedRebalanceTask rebalanceTask = new StealerBasedRebalanceTask(batchId, taskId, taskInfo, donorPermits.get(taskInfo.getDonorId()), adminClient, progressBar, scheduler);
        taskList.add(rebalanceTask);
        sbTaskList.add(rebalanceTask);
        // service.execute(rebalanceTask);
        taskId++;
    }
    scheduler.run(sbTaskList);
    return taskList;
}
Also used: StealerBasedRebalanceTask (voldemort.client.rebalance.task.StealerBasedRebalanceTask), RebalanceTask (voldemort.client.rebalance.task.RebalanceTask)
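
Both examples pass one Semaphore per donor node into the tasks, with a single permit each. The following is a minimal, standalone sketch of that general pattern using only java.util.concurrent. It is an assumption about how such permits are typically used, not a reproduction of StealerBasedRebalanceTask or RebalanceScheduler; DonorPermitSketch, SketchTask and runBatch are hypothetical names, and the sleep stands in for the actual partition movement.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: not Voldemort code.
class DonorPermitSketch {

    // Stand-in for a rebalance task; tracks completion and error like the
    // hasException() / isComplete() checks in executeSubBatch.
    static class SketchTask implements Runnable {
        final Semaphore donorPermit;
        final AtomicInteger activeOnDonor; // tasks currently working on this donor
        final AtomicInteger peak;          // highest per-donor concurrency seen
        volatile boolean complete = false;
        volatile Exception error = null;

        SketchTask(Semaphore donorPermit, AtomicInteger activeOnDonor, AtomicInteger peak) {
            this.donorPermit = donorPermit;
            this.activeOnDonor = activeOnDonor;
            this.peak = peak;
        }

        @Override
        public void run() {
            try {
                donorPermit.acquire(); // wait until the donor is free
                try {
                    peak.accumulateAndGet(activeOnDonor.incrementAndGet(), Math::max);
                    Thread.sleep(10);  // simulated partition movement
                    activeOnDonor.decrementAndGet();
                    complete = true;
                } finally {
                    donorPermit.release();
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                error = e;
            }
        }
    }

    // Runs tasksPerDonor tasks against each donor and returns the peak number
    // of tasks that worked on any single donor at the same time.
    static int runBatch(int donors, int tasksPerDonor, int parallelism) {
        Map<Integer, Semaphore> donorPermits = new HashMap<>();
        Map<Integer, AtomicInteger> active = new HashMap<>();
        for (int d = 0; d < donors; d++) {
            donorPermits.put(d, new Semaphore(1));
            active.put(d, new AtomicInteger());
        }
        AtomicInteger peak = new AtomicInteger();
        ExecutorService service = Executors.newFixedThreadPool(parallelism);
        List<SketchTask> tasks = new ArrayList<>();
        for (int d = 0; d < donors; d++) {
            for (int i = 0; i < tasksPerDonor; i++) {
                SketchTask task = new SketchTask(donorPermits.get(d), active.get(d), peak);
                tasks.add(task);
                service.execute(task);
            }
        }
        service.shutdown();
        try {
            if (!service.awaitTermination(30, TimeUnit.SECONDS)) {
                service.shutdownNow();
                throw new IllegalStateException("tasks still running");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        for (SketchTask task : tasks) {
            if (task.error != null || !task.complete) {
                throw new IllegalStateException("task failed or incomplete");
            }
        }
        return peak.get();
    }

    public static void main(String[] args) {
        System.out.println("peak per-donor concurrency: " + runBatch(2, 4, 4));
    }
}
```

With one permit per donor, the reported peak should be 1 regardless of the pool size, which matches the "avoid multiple disk sweeps" intent stated in the comment next to donorPermits.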

Aggregations

RebalanceTask (voldemort.client.rebalance.task.RebalanceTask): 2
StealerBasedRebalanceTask (voldemort.client.rebalance.task.StealerBasedRebalanceTask): 2
HashMap (java.util.HashMap): 1
ExecutorService (java.util.concurrent.ExecutorService): 1
Semaphore (java.util.concurrent.Semaphore): 1
VoldemortException (voldemort.VoldemortException): 1
Node (voldemort.cluster.Node): 1
VoldemortRebalancingException (voldemort.server.rebalance.VoldemortRebalancingException): 1