
Example 1 with PartitionToInputs

Use of org.apache.tez.runtime.library.common.shuffle.InputHost.PartitionToInputs in the Apache Tez project.

From the class ShuffleManager, method constructFetcherForHost.

@VisibleForTesting
Fetcher constructFetcherForHost(InputHost inputHost, Configuration conf) {
    Path lockDisk = null;
    if (sharedFetchEnabled) {
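        // pick a single lock disk from the edge name's hashcode + host hashcode
        final int h = Objects.hashCode(this.srcNameTrimmed, inputHost.getHost());
        // floorMod keeps the index non-negative even when h is Integer.MIN_VALUE,
        // where Math.abs(h) would still be negative
        lockDisk = new Path(this.localDisks[Math.floorMod(h, this.localDisks.length)], "locks");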
    }
    FetcherBuilder fetcherBuilder = new FetcherBuilder(
            ShuffleManager.this, httpConnectionParams, inputManager,
            inputContext.getApplicationId(), inputContext.getDagIdentifier(),
            jobTokenSecretMgr, srcNameTrimmed, conf, localFs, localDirAllocator,
            lockDisk, localDiskFetchEnabled, sharedFetchEnabled, localhostName,
            shufflePort, asyncHttp, verifyDiskChecksum, compositeFetch);
    if (codec != null) {
        fetcherBuilder.setCompressionParameters(codec);
    }
    fetcherBuilder.setIFileParams(ifileReadAhead, ifileReadAheadLength);
    // Remove obsolete inputs from the list being given to the fetcher. Also
    // remove from the obsolete list.
    PartitionToInputs pendingInputsOfOnePartitionRange = inputHost.clearAndGetOnePartitionRange();
    int includedMaps = 0;
    for (Iterator<InputAttemptIdentifier> inputIter = pendingInputsOfOnePartitionRange.getInputs().iterator(); inputIter.hasNext(); ) {
        InputAttemptIdentifier input = inputIter.next();
        // For pipelined shuffle.
        if (!validateInputAttemptForPipelinedShuffle(input)) {
            continue;
        }
        // Avoid adding attempts which have already completed.
        boolean alreadyCompleted;
        if (input instanceof CompositeInputAttemptIdentifier) {
            CompositeInputAttemptIdentifier compositeInput = (CompositeInputAttemptIdentifier) input;
            int nextClearBit = completedInputSet.nextClearBit(compositeInput.getInputIdentifier());
            int maxClearBit = compositeInput.getInputIdentifier() + compositeInput.getInputIdentifierCount();
            alreadyCompleted = nextClearBit > maxClearBit;
        } else {
            alreadyCompleted = completedInputSet.get(input.getInputIdentifier());
        }
        // Avoid adding attempts which have already completed or have been marked as OBSOLETE
        if (alreadyCompleted || obsoletedInputs.contains(input)) {
            inputIter.remove();
            continue;
        }
        // Cap the number of inputs handed to a single fetcher
        if (includedMaps >= maxTaskOutputAtOnce) {
            inputIter.remove();
            // Over the cap: hand the input back to inputHost so a later fetcher picks it up
            inputHost.addKnownInput(pendingInputsOfOnePartitionRange.getPartition(), pendingInputsOfOnePartitionRange.getPartitionCount(), input);
        } else {
            includedMaps++;
        }
    }
    if (inputHost.getNumPendingPartitions() > 0) {
        // add it to queue
        pendingHosts.add(inputHost);
    }
    for (InputAttemptIdentifier input : pendingInputsOfOnePartitionRange.getInputs()) {
        ShuffleEventInfo eventInfo = shuffleInfoEventsMap.get(input.getInputIdentifier());
        if (eventInfo != null) {
            eventInfo.scheduledForDownload = true;
        }
    }
    fetcherBuilder.assignWork(inputHost.getHost(), inputHost.getPort(), pendingInputsOfOnePartitionRange.getPartition(), pendingInputsOfOnePartitionRange.getPartitionCount(), pendingInputsOfOnePartitionRange.getInputs());
    if (LOG.isDebugEnabled()) {
        LOG.debug("Created Fetcher for host: " + inputHost.getHost() + ", info: " + inputHost.getAdditionalInfo() + ", with inputs: " + pendingInputsOfOnePartitionRange);
    }
    return fetcherBuilder.build();
}
Also used: Path (org.apache.hadoop.fs.Path), CompositeInputAttemptIdentifier (org.apache.tez.runtime.library.common.CompositeInputAttemptIdentifier), InputAttemptIdentifier (org.apache.tez.runtime.library.common.InputAttemptIdentifier), FetcherBuilder (org.apache.tez.runtime.library.common.shuffle.Fetcher.FetcherBuilder), PartitionToInputs (org.apache.tez.runtime.library.common.shuffle.InputHost.PartitionToInputs), VisibleForTesting (com.google.common.annotations.VisibleForTesting)
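
The composite-input branch of the method above asks whether a whole run of input identifiers is already marked in a bit set. A minimal standalone sketch of that idea follows, using only java.util.BitSet. The class and method names (CompletedRangeCheck, rangeCompleted) are made up for illustration and are not part of Tez. The sketch treats the range as half-open, [start, start + count), and therefore compares with >=. The quoted method compares with >, which is stricter: a fully set range whose next index is clear is not reported as completed there. The sketch makes no claim about which behavior the project intends.

```java
import java.util.BitSet;

// Hypothetical helper, not part of Tez: checks a half-open range of bits.
class CompletedRangeCheck {

    // Returns true only when every bit in [start, start + count) is set.
    static boolean rangeCompleted(BitSet completed, int start, int count) {
        // nextClearBit returns the first unset index at or after start,
        // so the range is fully set when that index lies at or past the end.
        return completed.nextClearBit(start) >= start + count;
    }

    public static void main(String[] args) {
        BitSet completed = new BitSet();
        completed.set(4, 7); // marks indices 4, 5 and 6

        System.out.println(rangeCompleted(completed, 4, 3)); // prints "true"
        System.out.println(rangeCompleted(completed, 4, 4)); // prints "false", index 7 is clear
    }
}
```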

Aggregations

VisibleForTesting (com.google.common.annotations.VisibleForTesting): 1
Path (org.apache.hadoop.fs.Path): 1
CompositeInputAttemptIdentifier (org.apache.tez.runtime.library.common.CompositeInputAttemptIdentifier): 1
InputAttemptIdentifier (org.apache.tez.runtime.library.common.InputAttemptIdentifier): 1
FetcherBuilder (org.apache.tez.runtime.library.common.shuffle.Fetcher.FetcherBuilder): 1
PartitionToInputs (org.apache.tez.runtime.library.common.shuffle.InputHost.PartitionToInputs): 1
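
The example method derives a lock directory from a hash of the edge name and the host, so that the same pair always maps to the same local disk. A rough standalone sketch of that selection step is given here. It is an illustration under assumptions: it uses the JDK's java.util.Objects.hash rather than the Guava helper in the quoted code, and the class name, method names, and disk paths (LockDiskPicker, indexFor, pickIndex, /data1 and so on) are invented. One detail worth care when turning a hash into an array index is that Math.abs(Integer.MIN_VALUE) is still negative, so the sketch uses Math.floorMod, which returns a non-negative result for any positive divisor.

```java
import java.util.Objects;

// Hypothetical helper, not part of Tez: maps (edge name, host) to a disk index.
class LockDiskPicker {

    // Maps any int hash, including Integer.MIN_VALUE, to [0, diskCount).
    static int indexFor(int hash, int diskCount) {
        return Math.floorMod(hash, diskCount);
    }

    // Same inputs always give the same index, so cooperating tasks agree on one disk.
    static int pickIndex(String edgeName, String host, int diskCount) {
        return indexFor(Objects.hash(edgeName, host), diskCount);
    }

    public static void main(String[] args) {
        String[] disks = {"/data1", "/data2", "/data3"}; // made-up local directories
        int i = pickIndex("Map 1", "node-07", disks.length);
        System.out.println(disks[i] + "/locks");
    }
}
```

Applying the modulo before any sign handling is the design choice here: the index is in range for every possible hash value, whereas taking the absolute value first leaves one input that still yields a negative index.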