Example 31 with KeyGroupsStateHandle

use of org.apache.flink.runtime.state.KeyGroupsStateHandle in project flink by apache.

the class AbstractStreamOperatorTestHarness method repackageState.

/**
 * Takes the different {@link OperatorStateHandles} created by calling {@link #snapshot(long, long)}
 * on different instances of {@link AbstractStreamOperatorTestHarness} (each one representing one subtask)
 * and repacks them into a single {@link OperatorStateHandles} so that the parallelism of the test
 * can change arbitrarily (i.e. be able to scale both up and down).
 *
 * <p>
 * After repacking the partial states, use {@link #initializeState(OperatorStateHandles)} to initialize
 * a new instance with the resulting state. Bear in mind that for parallelism greater than one, you
 * have to use the constructor {@link #AbstractStreamOperatorTestHarness(StreamOperator, int, int, int)}.
 *
 * <p>
 * <b>NOTE: </b> each of the {@code handles} in the argument list is assumed to be from a single task of a single
 * operator (i.e. chain length of one).
 *
 * <p>
 * For an example of how to use it, have a look at
 * {@link AbstractStreamOperatorTest#testStateAndTimerStateShufflingScalingDown()}.
 *
 * @param handles the different states to be merged.
 * @return the resulting state, or {@code null} if no partial states are specified.
 */
public static OperatorStateHandles repackageState(OperatorStateHandles... handles) throws Exception {
    if (handles.length < 1) {
        return null;
    } else if (handles.length == 1) {
        return handles[0];
    }
    List<OperatorStateHandle> mergedManagedOperatorState = new ArrayList<>(handles.length);
    List<OperatorStateHandle> mergedRawOperatorState = new ArrayList<>(handles.length);
    List<KeyGroupsStateHandle> mergedManagedKeyedState = new ArrayList<>(handles.length);
    List<KeyGroupsStateHandle> mergedRawKeyedState = new ArrayList<>(handles.length);
    for (OperatorStateHandles handle : handles) {
        Collection<OperatorStateHandle> managedOperatorState = handle.getManagedOperatorState();
        Collection<OperatorStateHandle> rawOperatorState = handle.getRawOperatorState();
        Collection<KeyGroupsStateHandle> managedKeyedState = handle.getManagedKeyedState();
        Collection<KeyGroupsStateHandle> rawKeyedState = handle.getRawKeyedState();
        if (managedOperatorState != null) {
            mergedManagedOperatorState.addAll(managedOperatorState);
        }
        if (rawOperatorState != null) {
            mergedRawOperatorState.addAll(rawOperatorState);
        }
        if (managedKeyedState != null) {
            mergedManagedKeyedState.addAll(managedKeyedState);
        }
        if (rawKeyedState != null) {
            mergedRawKeyedState.addAll(rawKeyedState);
        }
    }
    return new OperatorStateHandles(0, null, mergedManagedKeyedState, mergedRawKeyedState, mergedManagedOperatorState, mergedRawOperatorState);
}
Also used : OperatorStateHandles(org.apache.flink.streaming.runtime.tasks.OperatorStateHandles) ArrayList(java.util.ArrayList) OperatorStateHandle(org.apache.flink.runtime.state.OperatorStateHandle) KeyGroupsStateHandle(org.apache.flink.runtime.state.KeyGroupsStateHandle)
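
The merge loop in repackageState is a null-tolerant concatenation of per-subtask collections. The following is a minimal sketch of that idea in plain Java, with strings standing in for state handles. The class RepackSketch and the method mergeAll are hypothetical names used only for illustration; they are not part of Flink.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

/** Hypothetical stand-in for the merge loop; strings play the role of state handles. */
final class RepackSketch {

    /** Concatenates the per-subtask collections, skipping subtasks that reported no state (null). */
    static <T> List<T> mergeAll(List<Collection<T>> perSubtask) {
        List<T> merged = new ArrayList<>(perSubtask.size());
        for (Collection<T> part : perSubtask) {
            if (part != null) {
                merged.addAll(part);
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        List<Collection<String>> keyed = new ArrayList<>();
        keyed.add(Arrays.asList("handle-from-subtask-0"));
        keyed.add(null); // a subtask without keyed state
        keyed.add(Arrays.asList("handle-from-subtask-2a", "handle-from-subtask-2b"));
        System.out.println(mergeAll(keyed));
    }
}
```

The merged list deliberately ignores which subtask a handle came from; redistribution to the new parallelism happens later, when the state is initialized.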

Example 32 with KeyGroupsStateHandle

use of org.apache.flink.runtime.state.KeyGroupsStateHandle in project flink by apache.

the class KeyedOneInputStreamOperatorTestHarness method initializeState.

@Override
public void initializeState(OperatorStateHandles operatorStateHandles) throws Exception {
    if (operatorStateHandles != null) {
        int numKeyGroups = getEnvironment().getTaskInfo().getMaxNumberOfParallelSubtasks();
        int numSubtasks = getEnvironment().getTaskInfo().getNumberOfParallelSubtasks();
        int subtaskIndex = getEnvironment().getTaskInfo().getIndexOfThisSubtask();
        // create a new OperatorStateHandles that only contains the state for our key-groups
        List<KeyGroupRange> keyGroupPartitions = StateAssignmentOperation.createKeyGroupPartitions(numKeyGroups, numSubtasks);
        KeyGroupRange localKeyGroupRange = keyGroupPartitions.get(subtaskIndex);
        restoredKeyedState = null;
        Collection<KeyGroupsStateHandle> managedKeyedState = operatorStateHandles.getManagedKeyedState();
        if (managedKeyedState != null) {
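            // migration handles are passed through unfiltered; otherwise keep only
            // the handles that intersect the local key-group range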
            if (hasMigrationHandles(managedKeyedState)) {
                List<KeyGroupsStateHandle> result = new ArrayList<>(managedKeyedState.size());
                result.addAll(managedKeyedState);
                restoredKeyedState = result;
            } else {
                restoredKeyedState = StateAssignmentOperation.getKeyGroupsStateHandles(managedKeyedState, localKeyGroupRange);
            }
        }
    }
    super.initializeState(operatorStateHandles);
}
Also used : KeyGroupRange(org.apache.flink.runtime.state.KeyGroupRange) ArrayList(java.util.ArrayList) KeyGroupsStateHandle(org.apache.flink.runtime.state.KeyGroupsStateHandle)
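
The initializeState listing first splits the key groups into one contiguous range per subtask and then keeps only the state that belongs to the local range. The sketch below shows that idea with plain integers. It assumes a simple near-equal contiguous split; the exact formula used by StateAssignmentOperation.createKeyGroupPartitions may differ, and KeyGroupPartitionSketch, partition and restrictTo are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical illustration of key-group partitioning; not Flink's implementation. */
final class KeyGroupPartitionSketch {

    /** Returns {start, end} (both inclusive) per subtask, using a simplified contiguous split. */
    static List<int[]> partition(int numKeyGroups, int numSubtasks) {
        if (numSubtasks < 1 || numKeyGroups < numSubtasks) {
            throw new IllegalArgumentException("need 1 <= numSubtasks <= numKeyGroups");
        }
        List<int[]> ranges = new ArrayList<>(numSubtasks);
        for (int i = 0; i < numSubtasks; i++) {
            int start = i * numKeyGroups / numSubtasks;
            int end = (i + 1) * numKeyGroups / numSubtasks - 1;
            ranges.add(new int[] {start, end});
        }
        return ranges;
    }

    /** Keeps only the key groups that fall inside the local range. */
    static List<Integer> restrictTo(List<Integer> keyGroups, int[] localRange) {
        List<Integer> local = new ArrayList<>();
        for (int keyGroup : keyGroups) {
            if (keyGroup >= localRange[0] && keyGroup <= localRange[1]) {
                local.add(keyGroup);
            }
        }
        return local;
    }

    public static void main(String[] args) {
        List<int[]> ranges = partition(10, 3);
        for (int[] range : ranges) {
            System.out.println(Arrays.toString(range));
        }
        // Subtask 1 keeps only the key groups inside its own range.
        System.out.println(restrictTo(Arrays.asList(0, 2, 4, 6, 8), ranges.get(1)));
    }
}
```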

Example 33 with KeyGroupsStateHandle

use of org.apache.flink.runtime.state.KeyGroupsStateHandle in project flink by apache.

the class HeapKeyedStateBackend method restorePartitionedState.

@SuppressWarnings({ "unchecked" })
private void restorePartitionedState(Collection<KeyGroupsStateHandle> state) throws Exception {
    final Map<Integer, String> kvStatesById = new HashMap<>();
    int numRegisteredKvStates = 0;
    stateTables.clear();
    for (KeyGroupsStateHandle keyGroupsHandle : state) {
        if (keyGroupsHandle == null) {
            continue;
        }
        FSDataInputStream fsDataInputStream = keyGroupsHandle.openInputStream();
        cancelStreamRegistry.registerClosable(fsDataInputStream);
        try {
            DataInputViewStreamWrapper inView = new DataInputViewStreamWrapper(fsDataInputStream);
            KeyedBackendSerializationProxy serializationProxy = new KeyedBackendSerializationProxy(userCodeClassLoader);
            serializationProxy.read(inView);
            List<KeyedBackendSerializationProxy.StateMetaInfo<?, ?>> metaInfoList = serializationProxy.getNamedStateSerializationProxies();
            for (KeyedBackendSerializationProxy.StateMetaInfo<?, ?> metaInfoSerializationProxy : metaInfoList) {
                StateTable<K, ?, ?> stateTable = stateTables.get(metaInfoSerializationProxy.getStateName());
                // important: only create a new table if we did not already create one previously
                if (null == stateTable) {
                    RegisteredBackendStateMetaInfo<?, ?> registeredBackendStateMetaInfo = new RegisteredBackendStateMetaInfo<>(metaInfoSerializationProxy);
                    stateTable = newStateTable(registeredBackendStateMetaInfo);
                    stateTables.put(metaInfoSerializationProxy.getStateName(), stateTable);
                    kvStatesById.put(numRegisteredKvStates, metaInfoSerializationProxy.getStateName());
                    ++numRegisteredKvStates;
                }
            }
            for (Tuple2<Integer, Long> groupOffset : keyGroupsHandle.getGroupRangeOffsets()) {
                int keyGroupIndex = groupOffset.f0;
                long offset = groupOffset.f1;
                fsDataInputStream.seek(offset);
                int writtenKeyGroupIndex = inView.readInt();
                Preconditions.checkState(writtenKeyGroupIndex == keyGroupIndex, "Unexpected key-group in restore.");
                for (int i = 0; i < metaInfoList.size(); i++) {
                    int kvStateId = inView.readShort();
                    StateTable<K, ?, ?> stateTable = stateTables.get(kvStatesById.get(kvStateId));
                    StateTableByKeyGroupReader keyGroupReader = StateTableByKeyGroupReaders.readerForVersion(stateTable, serializationProxy.getRestoredVersion());
                    keyGroupReader.readMappingsInKeyGroup(inView, keyGroupIndex);
                }
            }
        } finally {
            cancelStreamRegistry.unregisterClosable(fsDataInputStream);
            IOUtils.closeQuietly(fsDataInputStream);
        }
    }
}
Also used : RegisteredBackendStateMetaInfo(org.apache.flink.runtime.state.RegisteredBackendStateMetaInfo) HashMap(java.util.HashMap) KeyedBackendSerializationProxy(org.apache.flink.runtime.state.KeyedBackendSerializationProxy) DataInputViewStreamWrapper(org.apache.flink.core.memory.DataInputViewStreamWrapper) KeyGroupsStateHandle(org.apache.flink.runtime.state.KeyGroupsStateHandle) FSDataInputStream(org.apache.flink.core.fs.FSDataInputStream)
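
The restore loop relies on a simple layout: every key group starts at a recorded offset, and the first thing written there is the key-group index, which the reader verifies after seeking. The sketch below reproduces that layout with an in-memory byte array. KeyGroupOffsetsSketch, its nested Snapshot class, and the "header" string that stands in for the serialization proxy are assumptions made for illustration only.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;

/** Hypothetical illustration of an offset-indexed key-group stream; not Flink's format. */
final class KeyGroupOffsetsSketch {

    /** Serialized bytes plus the start offset of every key group, in key-group order. */
    static final class Snapshot {
        final byte[] bytes;
        final int firstKeyGroup;
        final long[] offsets;

        Snapshot(byte[] bytes, int firstKeyGroup, long[] offsets) {
            this.bytes = bytes;
            this.firstKeyGroup = firstKeyGroup;
            this.offsets = offsets;
        }
    }

    /** Writes a header, then one section per key group: the group id followed by its value. */
    static Snapshot write(int firstKeyGroup, long[] valuePerGroup) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buffer);
        long[] offsets = new long[valuePerGroup.length];
        try {
            out.writeUTF("header"); // stands in for the serialized meta information
            for (int pos = 0; pos < valuePerGroup.length; pos++) {
                offsets[pos] = out.size(); // bytes written so far = start of this key group
                out.writeInt(firstKeyGroup + pos);
                out.writeLong(valuePerGroup[pos]);
            }
            out.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new Snapshot(buffer.toByteArray(), firstKeyGroup, offsets);
    }

    /** Seeks to one key group's offset, checks the written id, and returns its value. */
    static long readGroup(Snapshot snapshot, int keyGroup) {
        int pos = keyGroup - snapshot.firstKeyGroup;
        ByteBuffer in = ByteBuffer.wrap(snapshot.bytes);
        in.position((int) snapshot.offsets[pos]);
        int writtenKeyGroup = in.getInt();
        if (writtenKeyGroup != keyGroup) {
            throw new IllegalStateException("Unexpected key-group in restore.");
        }
        return in.getLong();
    }

    public static void main(String[] args) {
        Snapshot snapshot = write(4, new long[] {10L, 20L, 30L});
        // Any key group can be read on its own without scanning the ones before it.
        System.out.println(readGroup(snapshot, 5));
    }
}
```

Because each section is addressable by offset, a restoring subtask can read only the key groups in its own range, which is what makes rescaling cheap.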

Example 34 with KeyGroupsStateHandle

use of org.apache.flink.runtime.state.KeyGroupsStateHandle in project flink by apache.

the class HeapKeyedStateBackend method snapshot.

@Override
@SuppressWarnings("unchecked")
public RunnableFuture<KeyGroupsStateHandle> snapshot(final long checkpointId, final long timestamp, final CheckpointStreamFactory streamFactory, CheckpointOptions checkpointOptions) throws Exception {
    if (!hasRegisteredState()) {
        return DoneFuture.nullValue();
    }
    long syncStartTime = System.currentTimeMillis();
    Preconditions.checkState(stateTables.size() <= Short.MAX_VALUE, "Too many KV-States: " + stateTables.size() + ". Currently at most " + Short.MAX_VALUE + " states are supported");
    List<KeyedBackendSerializationProxy.StateMetaInfo<?, ?>> metaInfoProxyList = new ArrayList<>(stateTables.size());
    final Map<String, Integer> kVStateToId = new HashMap<>(stateTables.size());
    final Map<StateTable<K, ?, ?>, StateTableSnapshot> cowStateStableSnapshots = new HashedMap(stateTables.size());
    for (Map.Entry<String, StateTable<K, ?, ?>> kvState : stateTables.entrySet()) {
        RegisteredBackendStateMetaInfo<?, ?> metaInfo = kvState.getValue().getMetaInfo();
        KeyedBackendSerializationProxy.StateMetaInfo<?, ?> metaInfoProxy = new KeyedBackendSerializationProxy.StateMetaInfo(metaInfo.getStateType(), metaInfo.getName(), metaInfo.getNamespaceSerializer(), metaInfo.getStateSerializer());
        metaInfoProxyList.add(metaInfoProxy);
        kVStateToId.put(kvState.getKey(), kVStateToId.size());
        StateTable<K, ?, ?> stateTable = kvState.getValue();
        if (null != stateTable) {
            cowStateStableSnapshots.put(stateTable, stateTable.createSnapshot());
        }
    }
    final KeyedBackendSerializationProxy serializationProxy = new KeyedBackendSerializationProxy(keySerializer, metaInfoProxyList);
    //--------------------------------------------------- this becomes the end of sync part
    // implementation of the async IO operation, based on FutureTask
    final AbstractAsyncIOCallable<KeyGroupsStateHandle, CheckpointStreamFactory.CheckpointStateOutputStream> ioCallable = new AbstractAsyncIOCallable<KeyGroupsStateHandle, CheckpointStreamFactory.CheckpointStateOutputStream>() {

        AtomicBoolean open = new AtomicBoolean(false);

        @Override
        public CheckpointStreamFactory.CheckpointStateOutputStream openIOHandle() throws Exception {
            if (open.compareAndSet(false, true)) {
                CheckpointStreamFactory.CheckpointStateOutputStream stream = streamFactory.createCheckpointStateOutputStream(checkpointId, timestamp);
                try {
                    cancelStreamRegistry.registerClosable(stream);
                    return stream;
                } catch (Exception ex) {
                    open.set(false);
                    throw ex;
                }
            } else {
                throw new IOException("Operation already opened.");
            }
        }

        @Override
        public KeyGroupsStateHandle performOperation() throws Exception {
            long asyncStartTime = System.currentTimeMillis();
            CheckpointStreamFactory.CheckpointStateOutputStream stream = getIoHandle();
            DataOutputViewStreamWrapper outView = new DataOutputViewStreamWrapper(stream);
            serializationProxy.write(outView);
            long[] keyGroupRangeOffsets = new long[keyGroupRange.getNumberOfKeyGroups()];
            for (int keyGroupPos = 0; keyGroupPos < keyGroupRange.getNumberOfKeyGroups(); ++keyGroupPos) {
                int keyGroupId = keyGroupRange.getKeyGroupId(keyGroupPos);
                keyGroupRangeOffsets[keyGroupPos] = stream.getPos();
                outView.writeInt(keyGroupId);
                for (Map.Entry<String, StateTable<K, ?, ?>> kvState : stateTables.entrySet()) {
                    outView.writeShort(kVStateToId.get(kvState.getKey()));
                    cowStateStableSnapshots.get(kvState.getValue()).writeMappingsInKeyGroup(outView, keyGroupId);
                }
            }
            if (open.compareAndSet(true, false)) {
                StreamStateHandle streamStateHandle = stream.closeAndGetHandle();
                KeyGroupRangeOffsets offsets = new KeyGroupRangeOffsets(keyGroupRange, keyGroupRangeOffsets);
                final KeyGroupsStateHandle keyGroupsStateHandle = new KeyGroupsStateHandle(offsets, streamStateHandle);
                if (asynchronousSnapshots) {
                    LOG.info("Heap backend snapshot ({}, asynchronous part) in thread {} took {} ms.", streamFactory, Thread.currentThread(), (System.currentTimeMillis() - asyncStartTime));
                }
                return keyGroupsStateHandle;
            } else {
                throw new IOException("Checkpoint stream already closed.");
            }
        }

        @Override
        public void done(boolean canceled) {
            if (open.compareAndSet(true, false)) {
                CheckpointStreamFactory.CheckpointStateOutputStream stream = getIoHandle();
                if (null != stream) {
                    cancelStreamRegistry.unregisterClosable(stream);
                    IOUtils.closeQuietly(stream);
                }
            }
            for (StateTableSnapshot snapshot : cowStateStableSnapshots.values()) {
                snapshot.release();
            }
        }
    };
    AsyncStoppableTaskWithCallback<KeyGroupsStateHandle> task = AsyncStoppableTaskWithCallback.from(ioCallable);
    if (!asynchronousSnapshots) {
        task.run();
    }
    LOG.info("Heap backend snapshot (" + streamFactory + ", synchronous part) in thread " + Thread.currentThread() + " took " + (System.currentTimeMillis() - syncStartTime) + " ms.");
    return task;
}
Also used : RegisteredBackendStateMetaInfo(org.apache.flink.runtime.state.RegisteredBackendStateMetaInfo) HashMap(java.util.HashMap) KeyGroupRangeOffsets(org.apache.flink.runtime.state.KeyGroupRangeOffsets) ArrayList(java.util.ArrayList) KeyedBackendSerializationProxy(org.apache.flink.runtime.state.KeyedBackendSerializationProxy) KeyGroupsStateHandle(org.apache.flink.runtime.state.KeyGroupsStateHandle) StreamStateHandle(org.apache.flink.runtime.state.StreamStateHandle) CheckpointStreamFactory(org.apache.flink.runtime.state.CheckpointStreamFactory) IOException(java.io.IOException) AbstractAsyncIOCallable(org.apache.flink.runtime.io.async.AbstractAsyncIOCallable) AtomicBoolean(java.util.concurrent.atomic.AtomicBoolean) DataOutputViewStreamWrapper(org.apache.flink.core.memory.DataOutputViewStreamWrapper) HashedMap(org.apache.commons.collections.map.HashedMap) Map(java.util.Map)
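
The snapshot method above has two phases: a synchronous part that captures a stable view of the state, and an I/O part wrapped in a task that is either run inline or handed back to the caller to run later. The sketch below mirrors that structure with java.util.concurrent.FutureTask. It assumes a plain map copy as the stable view (the listing uses stateTable.createSnapshot() instead), and SyncAsyncSnapshotSketch and getUnchecked are hypothetical names.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.FutureTask;
import java.util.concurrent.RunnableFuture;

/** Hypothetical illustration of a snapshot split into a sync and an async part. */
final class SyncAsyncSnapshotSketch {

    private final Map<String, Integer> state = new HashMap<>();
    private final boolean asynchronous;

    SyncAsyncSnapshotSketch(boolean asynchronous) {
        this.asynchronous = asynchronous;
    }

    void put(String key, int value) {
        state.put(key, value);
    }

    RunnableFuture<Map<String, Integer>> snapshot() {
        // Synchronous part: freeze a point-in-time view (a full copy in this sketch).
        final Map<String, Integer> frozen = new HashMap<>(state);
        // Deferred part: works only on the frozen view, so later updates cannot leak in.
        FutureTask<Map<String, Integer>> task = new FutureTask<>(new Callable<Map<String, Integer>>() {
            @Override
            public Map<String, Integer> call() {
                // A real backend would serialize the view to a checkpoint stream here.
                return Collections.unmodifiableMap(frozen);
            }
        });
        if (!asynchronous) {
            task.run(); // finish inline so the returned future is already done
        }
        return task;
    }

    /** Waits for a future and rethrows checked failures as unchecked ones. */
    static <T> T getUnchecked(Future<T> future) {
        try {
            return future.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        }
    }

    public static void main(String[] args) {
        SyncAsyncSnapshotSketch backend = new SyncAsyncSnapshotSketch(true);
        backend.put("a", 1);
        RunnableFuture<Map<String, Integer>> pending = backend.snapshot();
        backend.put("a", 2); // update after the synchronous part
        pending.run();       // the caller decides when and where the deferred part runs
        System.out.println(getUnchecked(pending));
    }
}
```

Returning a RunnableFuture in both modes keeps the caller's code path identical: in synchronous mode the future is simply complete on return.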

Example 35 with KeyGroupsStateHandle

use of org.apache.flink.runtime.state.KeyGroupsStateHandle in project flink by apache.

the class InterruptSensitiveRestoreTest method createTask.

// ------------------------------------------------------------------------
//  Utilities
// ------------------------------------------------------------------------
private static Task createTask(Configuration taskConfig, StreamStateHandle state, int mode) throws IOException {
    NetworkEnvironment networkEnvironment = mock(NetworkEnvironment.class);
    when(networkEnvironment.createKvStateTaskRegistry(any(JobID.class), any(JobVertexID.class))).thenReturn(mock(TaskKvStateRegistry.class));
    ChainedStateHandle<StreamStateHandle> operatorState = null;
    List<KeyGroupsStateHandle> keyGroupStateFromBackend = Collections.emptyList();
    List<KeyGroupsStateHandle> keyGroupStateFromStream = Collections.emptyList();
    List<Collection<OperatorStateHandle>> operatorStateBackend = Collections.emptyList();
    List<Collection<OperatorStateHandle>> operatorStateStream = Collections.emptyList();
    Map<String, OperatorStateHandle.StateMetaInfo> operatorStateMetadata = new HashMap<>(1);
    OperatorStateHandle.StateMetaInfo metaInfo = new OperatorStateHandle.StateMetaInfo(new long[] { 0 }, OperatorStateHandle.Mode.SPLIT_DISTRIBUTE);
    operatorStateMetadata.put(DefaultOperatorStateBackend.DEFAULT_OPERATOR_STATE_NAME, metaInfo);
    KeyGroupRangeOffsets keyGroupRangeOffsets = new KeyGroupRangeOffsets(new KeyGroupRange(0, 0));
    Collection<OperatorStateHandle> operatorStateHandles = Collections.singletonList(new OperatorStateHandle(operatorStateMetadata, state));
    List<KeyGroupsStateHandle> keyGroupsStateHandles = Collections.singletonList(new KeyGroupsStateHandle(keyGroupRangeOffsets, state));
    switch(mode) {
        case OPERATOR_MANAGED:
            operatorStateBackend = Collections.singletonList(operatorStateHandles);
            break;
        case OPERATOR_RAW:
            operatorStateStream = Collections.singletonList(operatorStateHandles);
            break;
        case KEYED_MANAGED:
            keyGroupStateFromBackend = keyGroupsStateHandles;
            break;
        case KEYED_RAW:
            keyGroupStateFromStream = keyGroupsStateHandles;
            break;
        case LEGACY:
            operatorState = new ChainedStateHandle<>(Collections.singletonList(state));
            break;
        default:
            throw new IllegalArgumentException();
    }
    TaskStateHandles taskStateHandles = new TaskStateHandles(operatorState, operatorStateBackend, operatorStateStream, keyGroupStateFromBackend, keyGroupStateFromStream);
    JobInformation jobInformation = new JobInformation(new JobID(), "test job name", new SerializedValue<>(new ExecutionConfig()), new Configuration(), Collections.<BlobKey>emptyList(), Collections.<URL>emptyList());
    TaskInformation taskInformation = new TaskInformation(new JobVertexID(), "test task name", 1, 1, SourceStreamTask.class.getName(), taskConfig);
    return new Task(jobInformation, taskInformation, new ExecutionAttemptID(), new AllocationID(), 0, 0, Collections.<ResultPartitionDeploymentDescriptor>emptyList(), Collections.<InputGateDeploymentDescriptor>emptyList(), 0, taskStateHandles, mock(MemoryManager.class), mock(IOManager.class), networkEnvironment, mock(BroadcastVariableManager.class), mock(TaskManagerActions.class), mock(InputSplitProvider.class), mock(CheckpointResponder.class), new FallbackLibraryCacheManager(), new FileCache(new String[] { EnvironmentInformation.getTemporaryFileDirectory() }), new TestingTaskManagerRuntimeInfo(), new UnregisteredTaskMetricsGroup(), mock(ResultPartitionConsumableNotifier.class), mock(PartitionProducerStateChecker.class), mock(Executor.class));
}
Also used : Task(org.apache.flink.runtime.taskmanager.Task) Configuration(org.apache.flink.configuration.Configuration) HashMap(java.util.HashMap) KeyGroupRangeOffsets(org.apache.flink.runtime.state.KeyGroupRangeOffsets) JobVertexID(org.apache.flink.runtime.jobgraph.JobVertexID) KeyGroupRange(org.apache.flink.runtime.state.KeyGroupRange) TaskKvStateRegistry(org.apache.flink.runtime.query.TaskKvStateRegistry) ExecutionConfig(org.apache.flink.api.common.ExecutionConfig) KeyGroupsStateHandle(org.apache.flink.runtime.state.KeyGroupsStateHandle) TaskManagerActions(org.apache.flink.runtime.taskmanager.TaskManagerActions) StreamStateHandle(org.apache.flink.runtime.state.StreamStateHandle) TestingTaskManagerRuntimeInfo(org.apache.flink.runtime.util.TestingTaskManagerRuntimeInfo) Executor(java.util.concurrent.Executor) BroadcastVariableManager(org.apache.flink.runtime.broadcast.BroadcastVariableManager) PartitionProducerStateChecker(org.apache.flink.runtime.io.network.netty.PartitionProducerStateChecker) InputSplitProvider(org.apache.flink.runtime.jobgraph.tasks.InputSplitProvider) ResultPartitionConsumableNotifier(org.apache.flink.runtime.io.network.partition.ResultPartitionConsumableNotifier) UnregisteredTaskMetricsGroup(org.apache.flink.runtime.operators.testutils.UnregisteredTaskMetricsGroup) JobInformation(org.apache.flink.runtime.executiongraph.JobInformation) TaskInformation(org.apache.flink.runtime.executiongraph.TaskInformation) ExecutionAttemptID(org.apache.flink.runtime.executiongraph.ExecutionAttemptID) IOManager(org.apache.flink.runtime.io.disk.iomanager.IOManager) CheckpointResponder(org.apache.flink.runtime.taskmanager.CheckpointResponder) AllocationID(org.apache.flink.runtime.clusterframework.types.AllocationID) FallbackLibraryCacheManager(org.apache.flink.runtime.execution.librarycache.FallbackLibraryCacheManager) MemoryManager(org.apache.flink.runtime.memory.MemoryManager) FileCache(org.apache.flink.runtime.filecache.FileCache) TaskStateHandles(org.apache.flink.runtime.state.TaskStateHandles) NetworkEnvironment(org.apache.flink.runtime.io.network.NetworkEnvironment) Collection(java.util.Collection) OperatorStateHandle(org.apache.flink.runtime.state.OperatorStateHandle) JobID(org.apache.flink.api.common.JobID)

Aggregations

KeyGroupsStateHandle (org.apache.flink.runtime.state.KeyGroupsStateHandle) 35
OperatorStateHandle (org.apache.flink.runtime.state.OperatorStateHandle) 20
StreamStateHandle (org.apache.flink.runtime.state.StreamStateHandle) 17
ArrayList (java.util.ArrayList) 14
Test (org.junit.Test) 14
JobID (org.apache.flink.api.common.JobID) 11
KeyGroupRange (org.apache.flink.runtime.state.KeyGroupRange) 11
HashMap (java.util.HashMap) 10
ByteStreamStateHandle (org.apache.flink.runtime.state.memory.ByteStreamStateHandle) 10
JobVertexID (org.apache.flink.runtime.jobgraph.JobVertexID) 7
AcknowledgeCheckpoint (org.apache.flink.runtime.messages.checkpoint.AcknowledgeCheckpoint) 7
CheckpointStreamFactory (org.apache.flink.runtime.state.CheckpointStreamFactory) 7
DeclineCheckpoint (org.apache.flink.runtime.messages.checkpoint.DeclineCheckpoint) 6
IOException (java.io.IOException) 5
ExecutionJobVertex (org.apache.flink.runtime.executiongraph.ExecutionJobVertex) 5
ExecutionVertex (org.apache.flink.runtime.executiongraph.ExecutionVertex) 5
ChainedStateHandle (org.apache.flink.runtime.state.ChainedStateHandle) 5
KeyGroupRangeOffsets (org.apache.flink.runtime.state.KeyGroupRangeOffsets) 5
TaskStateHandles (org.apache.flink.runtime.state.TaskStateHandles) 5
Collection (java.util.Collection) 4