Example 1 with DDatanode

Use of org.apache.hadoop.hdfs.server.balancer.Dispatcher.DDatanode in the Apache Hadoop project.

Method init of class Balancer.

/**
   * Given a datanode storage set, build a network topology and decide
   * over-utilized storages, above average utilized storages, 
   * below average utilized storages, and underutilized storages. 
   * The input datanode storage set is shuffled in order to randomize
   * the storage matching later on.
   *
   * @return the number of bytes needed to move in order to balance the cluster.
   */
private long init(List<DatanodeStorageReport> reports) {
    // compute average utilization
    for (DatanodeStorageReport r : reports) {
        policy.accumulateSpaces(r);
    }
    policy.initAvgUtilization();
    // create network topology and classify utilization collections: 
    //   over-utilized, above-average, below-average and under-utilized.
    long overLoadedBytes = 0L, underLoadedBytes = 0L;
    for (DatanodeStorageReport r : reports) {
        final DDatanode dn = dispatcher.newDatanode(r.getDatanodeInfo());
        final boolean isSource = Util.isIncluded(sourceNodes, dn.getDatanodeInfo());
        for (StorageType t : StorageType.getMovableTypes()) {
            final Double utilization = policy.getUtilization(r, t);
            if (utilization == null) {
                // datanode does not have such storage type 
                continue;
            }
            final double average = policy.getAvgUtilization(t);
            if (utilization >= average && !isSource) {
                LOG.info(dn + "[" + t + "] has utilization=" + utilization + " >= average=" + average + " but it is not specified as a source; skipping it.");
                continue;
            }
            final double utilizationDiff = utilization - average;
            final long capacity = getCapacity(r, t);
            final double thresholdDiff = Math.abs(utilizationDiff) - threshold;
            final long maxSize2Move = computeMaxSize2Move(capacity, getRemaining(r, t), utilizationDiff, maxSizeToMove);
            final StorageGroup g;
            if (utilizationDiff > 0) {
                final Source s = dn.addSource(t, maxSize2Move, dispatcher);
                if (thresholdDiff <= 0) {
                    // within threshold
                    aboveAvgUtilized.add(s);
                } else {
                    overLoadedBytes += percentage2bytes(thresholdDiff, capacity);
                    overUtilized.add(s);
                }
                g = s;
            } else {
                g = dn.addTarget(t, maxSize2Move);
                if (thresholdDiff <= 0) {
                    // within threshold
                    belowAvgUtilized.add(g);
                } else {
                    underLoadedBytes += percentage2bytes(thresholdDiff, capacity);
                    underUtilized.add(g);
                }
            }
            dispatcher.getStorageGroupMap().put(g);
        }
    }
    logUtilizationCollections();
    Preconditions.checkState(dispatcher.getStorageGroupMap().size() == overUtilized.size() + underUtilized.size() + aboveAvgUtilized.size() + belowAvgUtilized.size(), "Mismatched number of storage groups");
    // return number of bytes to be moved in order to make the cluster balanced
    return Math.max(overLoadedBytes, underLoadedBytes);
}
Also used: StorageGroup (org.apache.hadoop.hdfs.server.balancer.Dispatcher.DDatanode.StorageGroup), StorageType (org.apache.hadoop.fs.StorageType), DatanodeStorageReport (org.apache.hadoop.hdfs.server.protocol.DatanodeStorageReport), DDatanode (org.apache.hadoop.hdfs.server.balancer.Dispatcher.DDatanode), Source (org.apache.hadoop.hdfs.server.balancer.Dispatcher.Source)
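
The classification logic in init can be hard to follow with the Hadoop types around it. The following is a minimal standalone sketch of the same decision rule, not the Balancer's actual code. The class UtilizationSketch, the enum Group, and the methods classify, percentageToBytes and bytesToMove are hypothetical names introduced for illustration. The body of percentageToBytes is an assumption about what percentage2bytes does (a percentage of capacity converted to bytes), since that helper is not shown in the snippet. The sketch also leaves out the source-node filter (isSource) and the per-storage-type loop.

```java
/** Hypothetical stand-in for the classification step of Balancer.init. */
final class UtilizationSketch {

    /** The four collections that init fills. */
    enum Group { OVER_UTILIZED, ABOVE_AVERAGE, BELOW_AVERAGE, UNDER_UTILIZED }

    /**
     * Classifies one storage. All arguments are percentages.
     * Mirrors the branches in init: the sign of (utilization - average)
     * picks source vs. target, and the threshold picks the inner group.
     */
    static Group classify(double utilization, double average, double threshold) {
        final double utilizationDiff = utilization - average;
        final double thresholdDiff = Math.abs(utilizationDiff) - threshold;
        if (utilizationDiff > 0) {
            return thresholdDiff <= 0 ? Group.ABOVE_AVERAGE : Group.OVER_UTILIZED;
        }
        return thresholdDiff <= 0 ? Group.BELOW_AVERAGE : Group.UNDER_UTILIZED;
    }

    /** Assumed behavior of percentage2bytes: a percentage of capacity, in bytes. */
    static long percentageToBytes(double percentage, long capacity) {
        return (long) (percentage * capacity / 100.0);
    }

    /**
     * Sums the bytes beyond the threshold on each side and returns the larger
     * total, as init does with overLoadedBytes and underLoadedBytes.
     */
    static long bytesToMove(double[] utilizations, long[] capacities,
                            double average, double threshold) {
        long overLoadedBytes = 0L;
        long underLoadedBytes = 0L;
        for (int i = 0; i < utilizations.length; i++) {
            final double utilizationDiff = utilizations[i] - average;
            final double thresholdDiff = Math.abs(utilizationDiff) - threshold;
            if (thresholdDiff <= 0) {
                continue; // within threshold: contributes no bytes
            }
            final long bytes = percentageToBytes(thresholdDiff, capacities[i]);
            if (utilizationDiff > 0) {
                overLoadedBytes += bytes;
            } else {
                underLoadedBytes += bytes;
            }
        }
        return Math.max(overLoadedBytes, underLoadedBytes);
    }

    public static void main(String[] args) {
        // Made-up sample values: average 50%, threshold 10%.
        System.out.println(classify(90.0, 50.0, 10.0));
        System.out.println(classify(55.0, 50.0, 10.0));
        System.out.println(classify(20.0, 50.0, 10.0));
        System.out.println(bytesToMove(
            new double[] {90.0, 55.0, 20.0},
            new long[] {1000L, 1000L, 1000L},
            50.0, 10.0));
    }
}
```

A storage exactly at the average takes the else branch (utilizationDiff is not greater than 0), so in this sketch, as in the snippet, it is treated as a below-average target rather than a source.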
