Example 46 with SlotSharingGroup

use of org.apache.flink.runtime.jobmanager.scheduler.SlotSharingGroup in project flink by apache.

the class SsgNetworkMemoryCalculationUtilsTest method testGenerateEnrichedResourceProfile.

@Test
public void testGenerateEnrichedResourceProfile() throws Exception {
    SlotSharingGroup slotSharingGroup0 = new SlotSharingGroup();
    slotSharingGroup0.setResourceProfile(DEFAULT_RESOURCE);
    SlotSharingGroup slotSharingGroup1 = new SlotSharingGroup();
    slotSharingGroup1.setResourceProfile(DEFAULT_RESOURCE);
    createExecutionGraphAndEnrichNetworkMemory(Arrays.asList(slotSharingGroup0, slotSharingGroup0, slotSharingGroup1));
    assertEquals(new MemorySize(TestShuffleMaster.computeRequiredShuffleMemoryBytes(0, 2) + TestShuffleMaster.computeRequiredShuffleMemoryBytes(1, 6)), slotSharingGroup0.getResourceProfile().getNetworkMemory());
    assertEquals(new MemorySize(TestShuffleMaster.computeRequiredShuffleMemoryBytes(5, 0)), slotSharingGroup1.getResourceProfile().getNetworkMemory());
}
Also used : MemorySize(org.apache.flink.configuration.MemorySize) SlotSharingGroup(org.apache.flink.runtime.jobmanager.scheduler.SlotSharingGroup) IntermediateResultPartitionTest(org.apache.flink.runtime.executiongraph.IntermediateResultPartitionTest) Test(org.junit.Test)

Example 47 with SlotSharingGroup

use of org.apache.flink.runtime.jobmanager.scheduler.SlotSharingGroup in project flink by apache.

the class PartialConsumePipelinedResultTest method testPartialConsumePipelinedResultReceiver.

/**
 * Tests a fix for FLINK-1930.
 *
 * <p>When consuming a pipelined result only partially, it is possible that local channels
 * release the buffer pool, which is associated with the result partition, too early. If the
 * producer is still producing data when this happens, it runs into an IllegalStateException,
 * because of the destroyed buffer pool.
 *
 * @see <a href="https://issues.apache.org/jira/browse/FLINK-1930">FLINK-1930</a>
 */
@Test
public void testPartialConsumePipelinedResultReceiver() throws Exception {
    final JobVertex sender = new JobVertex("Sender");
    sender.setInvokableClass(SlowBufferSender.class);
    sender.setParallelism(PARALLELISM);
    final JobVertex receiver = new JobVertex("Receiver");
    receiver.setInvokableClass(SingleBufferReceiver.class);
    receiver.setParallelism(PARALLELISM);
    // The partition needs to be pipelined, otherwise the original issue does not occur, because
    // the sender and receiver are not online at the same time.
    receiver.connectNewDataSetAsInput(sender, DistributionPattern.POINTWISE, ResultPartitionType.PIPELINED);
    final JobGraph jobGraph = JobGraphTestUtils.streamingJobGraph(sender, receiver);
    final SlotSharingGroup slotSharingGroup = new SlotSharingGroup();
    sender.setSlotSharingGroup(slotSharingGroup);
    receiver.setSlotSharingGroup(slotSharingGroup);
    MINI_CLUSTER_RESOURCE.getMiniCluster().executeJobBlocking(jobGraph);
}
Also used : JobGraph(org.apache.flink.runtime.jobgraph.JobGraph) JobVertex(org.apache.flink.runtime.jobgraph.JobVertex) SlotSharingGroup(org.apache.flink.runtime.jobmanager.scheduler.SlotSharingGroup) Test(org.junit.Test)

Example 48 with SlotSharingGroup

use of org.apache.flink.runtime.jobmanager.scheduler.SlotSharingGroup in project flink by apache.

the class StreamingJobGraphGenerator method buildVertexRegionSlotSharingGroups.

/**
 * Maps a vertex to its region slot sharing group. If {@link
 * StreamGraph#isAllVerticesInSameSlotSharingGroupByDefault()} returns true, all regions will be
 * in the same slot sharing group.
 */
private Map<JobVertexID, SlotSharingGroup> buildVertexRegionSlotSharingGroups() {
    final Map<JobVertexID, SlotSharingGroup> vertexRegionSlotSharingGroups = new HashMap<>();
    final SlotSharingGroup defaultSlotSharingGroup = new SlotSharingGroup();
    streamGraph.getSlotSharingGroupResource(StreamGraphGenerator.DEFAULT_SLOT_SHARING_GROUP).ifPresent(defaultSlotSharingGroup::setResourceProfile);
    final boolean allRegionsInSameSlotSharingGroup = streamGraph.isAllVerticesInSameSlotSharingGroupByDefault();
    final Iterable<DefaultLogicalPipelinedRegion> regions = DefaultLogicalTopology.fromJobGraph(jobGraph).getAllPipelinedRegions();
    for (DefaultLogicalPipelinedRegion region : regions) {
        final SlotSharingGroup regionSlotSharingGroup;
        if (allRegionsInSameSlotSharingGroup) {
            regionSlotSharingGroup = defaultSlotSharingGroup;
        } else {
            regionSlotSharingGroup = new SlotSharingGroup();
            streamGraph.getSlotSharingGroupResource(StreamGraphGenerator.DEFAULT_SLOT_SHARING_GROUP).ifPresent(regionSlotSharingGroup::setResourceProfile);
        }
        for (LogicalVertex vertex : region.getVertices()) {
            vertexRegionSlotSharingGroups.put(vertex.getId(), regionSlotSharingGroup);
        }
    }
    return vertexRegionSlotSharingGroups;
}
Also used : LogicalVertex(org.apache.flink.runtime.jobgraph.topology.LogicalVertex) IdentityHashMap(java.util.IdentityHashMap) HashMap(java.util.HashMap) DefaultLogicalPipelinedRegion(org.apache.flink.runtime.jobgraph.topology.DefaultLogicalPipelinedRegion) JobVertexID(org.apache.flink.runtime.jobgraph.JobVertexID) SlotSharingGroup(org.apache.flink.runtime.jobmanager.scheduler.SlotSharingGroup)
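
The branching in buildVertexRegionSlotSharingGroups can be hard to follow with the Flink types in the way. The following is a minimal, dependency-free sketch of the same idea, not Flink's implementation: `Group` is a hypothetical stand-in for SlotSharingGroup, regions are plain lists of vertex names, and the resource-profile handling is omitted. It only illustrates that all vertices of a region get the same group object, and that regions share one group only when the flag is set.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class RegionSlotSharingSketch {

    /** Hypothetical stand-in for SlotSharingGroup; only object identity matters here. */
    static final class Group {}

    /** Maps each vertex name to the group of its region (assumed simplification). */
    static Map<String, Group> assign(List<List<String>> regions, boolean allInSameGroup) {
        final Map<String, Group> vertexToGroup = new LinkedHashMap<>();
        final Group defaultGroup = new Group();
        for (List<String> region : regions) {
            // Either reuse the single default group or create one group per region.
            final Group regionGroup = allInSameGroup ? defaultGroup : new Group();
            for (String vertex : region) {
                vertexToGroup.put(vertex, regionGroup);
            }
        }
        return vertexToGroup;
    }

    public static void main(String[] args) {
        final List<List<String>> regions = List.of(List.of("source", "map"), List.of("sink"));
        final Map<String, Group> shared = assign(regions, true);
        final Map<String, Group> perRegion = assign(regions, false);
        System.out.println(shared.get("source") == shared.get("sink"));       // true
        System.out.println(perRegion.get("source") == perRegion.get("map"));  // true
        System.out.println(perRegion.get("source") == perRegion.get("sink")); // false
    }
}
```

Comparing with `==` is deliberate: the sketch checks whether two vertices hold the very same group instance, which is what the original map expresses.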

Example 49 with SlotSharingGroup

use of org.apache.flink.runtime.jobmanager.scheduler.SlotSharingGroup in project flink by apache.

the class StreamingJobGraphGenerator method setManagedMemoryFractionForSlotSharingGroup.

private static void setManagedMemoryFractionForSlotSharingGroup(final SlotSharingGroup slotSharingGroup, final Map<JobVertexID, Integer> vertexHeadOperators, final Map<JobVertexID, Set<Integer>> vertexOperators, final Map<Integer, StreamConfig> operatorConfigs, final Map<Integer, Map<Integer, StreamConfig>> vertexChainedConfigs, final java.util.function.Function<Integer, Map<ManagedMemoryUseCase, Integer>> operatorScopeManagedMemoryUseCaseWeightsRetriever, final java.util.function.Function<Integer, Set<ManagedMemoryUseCase>> slotScopeManagedMemoryUseCasesRetriever) {
    final Set<Integer> groupOperatorIds = slotSharingGroup.getJobVertexIds().stream().flatMap((vid) -> vertexOperators.get(vid).stream()).collect(Collectors.toSet());
    final Map<ManagedMemoryUseCase, Integer> groupOperatorScopeUseCaseWeights = groupOperatorIds.stream().flatMap((oid) -> operatorScopeManagedMemoryUseCaseWeightsRetriever.apply(oid).entrySet().stream()).collect(Collectors.groupingBy(Map.Entry::getKey, Collectors.summingInt(Map.Entry::getValue)));
    final Set<ManagedMemoryUseCase> groupSlotScopeUseCases = groupOperatorIds.stream().flatMap((oid) -> slotScopeManagedMemoryUseCasesRetriever.apply(oid).stream()).collect(Collectors.toSet());
    for (JobVertexID jobVertexID : slotSharingGroup.getJobVertexIds()) {
        for (int operatorNodeId : vertexOperators.get(jobVertexID)) {
            final StreamConfig operatorConfig = operatorConfigs.get(operatorNodeId);
            final Map<ManagedMemoryUseCase, Integer> operatorScopeUseCaseWeights = operatorScopeManagedMemoryUseCaseWeightsRetriever.apply(operatorNodeId);
            final Set<ManagedMemoryUseCase> slotScopeUseCases = slotScopeManagedMemoryUseCasesRetriever.apply(operatorNodeId);
            setManagedMemoryFractionForOperator(operatorScopeUseCaseWeights, slotScopeUseCases, groupOperatorScopeUseCaseWeights, groupSlotScopeUseCases, operatorConfig);
        }
        // need to refresh the chained task configs because they are serialized
        final int headOperatorNodeId = vertexHeadOperators.get(jobVertexID);
        final StreamConfig vertexConfig = operatorConfigs.get(headOperatorNodeId);
        vertexConfig.setTransitiveChainedTaskConfigs(vertexChainedConfigs.get(headOperatorNodeId));
    }
}
Also used : AtomicInteger(java.util.concurrent.atomic.AtomicInteger) Arrays(java.util.Arrays) DefaultLogicalPipelinedRegion(org.apache.flink.runtime.jobgraph.topology.DefaultLogicalPipelinedRegion) InputSelectable(org.apache.flink.streaming.api.operators.InputSelectable) Tuple2(org.apache.flink.api.java.tuple.Tuple2) JobGraph(org.apache.flink.runtime.jobgraph.JobGraph) CheckpointingMode(org.apache.flink.streaming.api.CheckpointingMode) YieldingOperatorFactory(org.apache.flink.streaming.api.operators.YieldingOperatorFactory) LoggerFactory(org.slf4j.LoggerFactory) CheckpointCoordinatorConfiguration(org.apache.flink.runtime.jobgraph.tasks.CheckpointCoordinatorConfiguration) CheckpointStorage(org.apache.flink.runtime.state.CheckpointStorage) CoLocationGroupImpl(org.apache.flink.runtime.jobmanager.scheduler.CoLocationGroupImpl) StringUtils(org.apache.commons.lang3.StringUtils) StateBackend(org.apache.flink.runtime.state.StateBackend) ChainingStrategy(org.apache.flink.streaming.api.operators.ChainingStrategy) ResourceSpec(org.apache.flink.api.common.operators.ResourceSpec) ManagedMemoryUseCase(org.apache.flink.core.memory.ManagedMemoryUseCase) CustomPartitionerWrapper(org.apache.flink.streaming.runtime.partitioner.CustomPartitionerWrapper) Map(java.util.Map) Function(org.apache.flink.api.common.functions.Function) WithMasterCheckpointHook(org.apache.flink.streaming.api.checkpoint.WithMasterCheckpointHook) Preconditions.checkNotNull(org.apache.flink.util.Preconditions.checkNotNull) ExecutionOptions(org.apache.flink.configuration.ExecutionOptions) JobCheckpointingSettings(org.apache.flink.runtime.jobgraph.tasks.JobCheckpointingSettings) MINIMAL_CHECKPOINT_TIME(org.apache.flink.runtime.jobgraph.tasks.CheckpointCoordinatorConfiguration.MINIMAL_CHECKPOINT_TIME) TypeSerializer(org.apache.flink.api.common.typeutils.TypeSerializer) ForwardPartitioner(org.apache.flink.streaming.runtime.partitioner.ForwardPartitioner) 
IdentityHashMap(java.util.IdentityHashMap) TaskConfig(org.apache.flink.runtime.operators.util.TaskConfig) Collection(java.util.Collection) Set(java.util.Set) DistributedCache(org.apache.flink.api.common.cache.DistributedCache) Collectors(java.util.stream.Collectors) List(java.util.List) SerializedValue(org.apache.flink.util.SerializedValue) Preconditions.checkArgument(org.apache.flink.util.Preconditions.checkArgument) UdfStreamOperatorFactory(org.apache.flink.streaming.api.operators.UdfStreamOperatorFactory) OperatorID(org.apache.flink.runtime.jobgraph.OperatorID) Optional(java.util.Optional) CheckpointConfig(org.apache.flink.streaming.api.environment.CheckpointConfig) StreamIterationTail(org.apache.flink.streaming.runtime.tasks.StreamIterationTail) IllegalConfigurationException(org.apache.flink.configuration.IllegalConfigurationException) JobEdge(org.apache.flink.runtime.jobgraph.JobEdge) JobVertex(org.apache.flink.runtime.jobgraph.JobVertex) TaskInvokable(org.apache.flink.runtime.jobgraph.tasks.TaskInvokable) SlotSharingGroup(org.apache.flink.runtime.jobmanager.scheduler.SlotSharingGroup) LogicalVertex(org.apache.flink.runtime.jobgraph.topology.LogicalVertex) ForwardForConsecutiveHashPartitioner(org.apache.flink.streaming.runtime.partitioner.ForwardForConsecutiveHashPartitioner) ManagedMemoryUtils(org.apache.flink.runtime.util.config.memory.ManagedMemoryUtils) StreamOperatorFactory(org.apache.flink.streaming.api.operators.StreamOperatorFactory) InputOutputFormatVertex(org.apache.flink.runtime.jobgraph.InputOutputFormatVertex) ResultPartitionType(org.apache.flink.runtime.io.network.partition.ResultPartitionType) HashMap(java.util.HashMap) SourceOperatorFactory(org.apache.flink.streaming.api.operators.SourceOperatorFactory) ArrayList(java.util.ArrayList) JobVertexID(org.apache.flink.runtime.jobgraph.JobVertexID) HashSet(java.util.HashSet) StreamPartitioner(org.apache.flink.streaming.runtime.partitioner.StreamPartitioner) 
JobGraphUtils(org.apache.flink.runtime.jobgraph.JobGraphUtils) ExecutionCheckpointingOptions(org.apache.flink.streaming.api.environment.ExecutionCheckpointingOptions) StreamIterationHead(org.apache.flink.streaming.runtime.tasks.StreamIterationHead) LinkedList(java.util.LinkedList) DistributionPattern(org.apache.flink.runtime.jobgraph.DistributionPattern) Nullable(javax.annotation.Nullable) Preconditions.checkState(org.apache.flink.util.Preconditions.checkState) Logger(org.slf4j.Logger) FlinkRuntimeException(org.apache.flink.util.FlinkRuntimeException) Configuration(org.apache.flink.configuration.Configuration) IOException(java.io.IOException) OperatorIDPair(org.apache.flink.runtime.OperatorIDPair) ForwardForUnspecifiedPartitioner(org.apache.flink.streaming.runtime.partitioner.ForwardForUnspecifiedPartitioner) VisibleForTesting(org.apache.flink.annotation.VisibleForTesting) MasterTriggerRestoreHook(org.apache.flink.runtime.checkpoint.MasterTriggerRestoreHook) RescalePartitioner(org.apache.flink.streaming.runtime.partitioner.RescalePartitioner) StreamExchangeMode(org.apache.flink.streaming.api.transformations.StreamExchangeMode) JobID(org.apache.flink.api.common.JobID) DefaultLogicalTopology(org.apache.flink.runtime.jobgraph.topology.DefaultLogicalTopology) OperatorCoordinator(org.apache.flink.runtime.operators.coordination.OperatorCoordinator) InputOutputFormatContainer(org.apache.flink.runtime.jobgraph.InputOutputFormatContainer) Internal(org.apache.flink.annotation.Internal) Comparator(java.util.Comparator) Collections(java.util.Collections) CheckpointRetentionPolicy(org.apache.flink.runtime.checkpoint.CheckpointRetentionPolicy)
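
The group-level weight map in setManagedMemoryFractionForSlotSharingGroup is built with a flatMap over per-operator maps followed by Collectors.groupingBy and Collectors.summingInt. A small standalone sketch of that aggregation step, under stated assumptions: `UseCase` is a hypothetical enum standing in for ManagedMemoryUseCase, and `share` is an illustrative weight-to-total ratio, not the calculation performed by setManagedMemoryFractionForOperator (which is not shown in the source above).

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

final class UseCaseWeightSketch {

    /** Hypothetical stand-in for ManagedMemoryUseCase. */
    enum UseCase { OPERATOR, STATE_BACKEND, PYTHON }

    /** Sums the weights of all operators per use case, as the group-level map does. */
    static Map<UseCase, Integer> sumWeights(List<Map<UseCase, Integer>> perOperatorWeights) {
        return perOperatorWeights.stream()
                .flatMap(weights -> weights.entrySet().stream())
                .collect(Collectors.groupingBy(
                        Map.Entry::getKey, Collectors.summingInt(Map.Entry::getValue)));
    }

    /** Illustrative assumption: an operator's share is its weight over the group total. */
    static double share(int operatorWeight, int groupWeight) {
        return groupWeight == 0 ? 0.0 : (double) operatorWeight / groupWeight;
    }

    public static void main(String[] args) {
        final List<Map<UseCase, Integer>> operators = List.of(
                Map.of(UseCase.OPERATOR, 1),
                Map.of(UseCase.OPERATOR, 3, UseCase.PYTHON, 2));
        final Map<UseCase, Integer> group = sumWeights(operators);
        System.out.println(group.get(UseCase.OPERATOR));           // 4
        System.out.println(group.get(UseCase.PYTHON));             // 2
        System.out.println(share(1, group.get(UseCase.OPERATOR))); // 0.25
    }
}
```

Use cases that no operator declares simply do not appear in the summed map, so callers should expect `get` to return null for them.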

Example 50 with SlotSharingGroup

use of org.apache.flink.runtime.jobmanager.scheduler.SlotSharingGroup in project flink by apache.

the class JobRecoveryITCase method createjobGraph.

private JobGraph createjobGraph(boolean slotSharingEnabled) throws IOException {
    final JobVertex sender = new JobVertex("Sender");
    sender.setParallelism(PARALLELISM);
    sender.setInvokableClass(TestingAbstractInvokables.Sender.class);
    final JobVertex receiver = new JobVertex("Receiver");
    receiver.setParallelism(PARALLELISM);
    receiver.setInvokableClass(FailingOnceReceiver.class);
    FailingOnceReceiver.reset();
    if (slotSharingEnabled) {
        final SlotSharingGroup slotSharingGroup = new SlotSharingGroup();
        receiver.setSlotSharingGroup(slotSharingGroup);
        sender.setSlotSharingGroup(slotSharingGroup);
    }
    receiver.connectNewDataSetAsInput(sender, DistributionPattern.POINTWISE, ResultPartitionType.PIPELINED);
    final ExecutionConfig executionConfig = new ExecutionConfig();
    executionConfig.setRestartStrategy(RestartStrategies.fixedDelayRestart(1, 0L));
    return JobGraphBuilder.newStreamingJobGraphBuilder().addJobVertices(Arrays.asList(sender, receiver)).setJobName(getClass().getSimpleName()).setExecutionConfig(executionConfig).build();
}
Also used : JobVertex(org.apache.flink.runtime.jobgraph.JobVertex) ExecutionConfig(org.apache.flink.api.common.ExecutionConfig) SlotSharingGroup(org.apache.flink.runtime.jobmanager.scheduler.SlotSharingGroup)

Aggregations

SlotSharingGroup (org.apache.flink.runtime.jobmanager.scheduler.SlotSharingGroup) 53
JobVertex (org.apache.flink.runtime.jobgraph.JobVertex) 35
Test (org.junit.Test) 30
JobGraph (org.apache.flink.runtime.jobgraph.JobGraph) 18
JobVertexID (org.apache.flink.runtime.jobgraph.JobVertexID) 14
JobID (org.apache.flink.api.common.JobID) 11
HashMap (java.util.HashMap) 8
Configuration (org.apache.flink.configuration.Configuration) 8
ArrayList (java.util.ArrayList) 7
HashSet (java.util.HashSet) 6
Map (java.util.Map) 6
Set (java.util.Set) 6
ExecutionConfig (org.apache.flink.api.common.ExecutionConfig) 6
ResultPartitionType (org.apache.flink.runtime.io.network.partition.ResultPartitionType) 6
CoLocationGroup (org.apache.flink.runtime.jobmanager.scheduler.CoLocationGroup) 6
IOException (java.io.IOException) 5
Arrays (java.util.Arrays) 5
IdentityHashMap (java.util.IdentityHashMap) 5
Collections (java.util.Collections) 4
Comparator (java.util.Comparator) 4