Search in sources :

Example 1 with TriggerMemberListPublishOp

use of com.hazelcast.internal.cluster.impl.operations.TriggerMemberListPublishOp in project hazelcast-jet by hazelcast.

the class JobExecutionService method verifyClusterInformation.

private void verifyClusterInformation(long jobId, long executionId, Address coordinator, int coordinatorMemberListVersion, Set<MemberInfo> participants) {
    Address masterAddress = nodeEngine.getMasterAddress();
    if (!coordinator.equals(masterAddress)) {
        failIfNotRunning();
        throw new IllegalStateException(String.format("Coordinator %s cannot initialize %s. Reason: it is not the master, the master is %s", coordinator, jobAndExecutionId(jobId, executionId), masterAddress));
    }
    ClusterServiceImpl clusterService = (ClusterServiceImpl) nodeEngine.getClusterService();
    MembershipManager membershipManager = clusterService.getMembershipManager();
    int localMemberListVersion = membershipManager.getMemberListVersion();
    Address thisAddress = nodeEngine.getThisAddress();
    if (coordinatorMemberListVersion > localMemberListVersion) {
        assert !masterAddress.equals(thisAddress) : String.format("Local node: %s is master but InitOperation has coordinator member list version: %s larger than " + " local member list version: %s", thisAddress, coordinatorMemberListVersion, localMemberListVersion);
        nodeEngine.getOperationService().send(new TriggerMemberListPublishOp(), masterAddress);
        throw new RetryableHazelcastException(String.format("Cannot initialize %s for coordinator %s, local member list version %s," + " coordinator member list version %s", jobAndExecutionId(jobId, executionId), coordinator, localMemberListVersion, coordinatorMemberListVersion));
    }
    boolean isLocalMemberParticipant = false;
    for (MemberInfo participant : participants) {
        if (participant.getAddress().equals(thisAddress)) {
            isLocalMemberParticipant = true;
        }
        if (membershipManager.getMember(participant.getAddress(), participant.getUuid()) == null) {
            throw new TopologyChangedException(String.format("Cannot initialize %s for coordinator %s: participant %s not found in local member list." + " Local member list version: %s, coordinator member list version: %s", jobAndExecutionId(jobId, executionId), coordinator, participant, localMemberListVersion, coordinatorMemberListVersion));
        }
    }
    if (!isLocalMemberParticipant) {
        throw new IllegalArgumentException(String.format("Cannot initialize %s since member %s is not in participants: %s", jobAndExecutionId(jobId, executionId), thisAddress, participants));
    }
}
Also used : Address(com.hazelcast.nio.Address) RetryableHazelcastException(com.hazelcast.spi.exception.RetryableHazelcastException) MemberInfo(com.hazelcast.internal.cluster.MemberInfo) ClusterServiceImpl(com.hazelcast.internal.cluster.impl.ClusterServiceImpl) MembershipManager(com.hazelcast.internal.cluster.impl.MembershipManager) TriggerMemberListPublishOp(com.hazelcast.internal.cluster.impl.operations.TriggerMemberListPublishOp) TopologyChangedException(com.hazelcast.jet.core.TopologyChangedException)

Example 2 with TriggerMemberListPublishOp

use of com.hazelcast.internal.cluster.impl.operations.TriggerMemberListPublishOp in project hazelcast by hazelcast.

the class ClusterDataSerializerHook method createFactory.

@Override
public DataSerializableFactory createFactory() {
    ConstructorFunction<Integer, IdentifiedDataSerializable>[] constructors = new ConstructorFunction[LEN];
    constructors[AUTH_FAILURE] = arg -> new AuthenticationFailureOp();
    constructors[ADDRESS] = arg -> new Address();
    constructors[MEMBER] = arg -> new MemberImpl();
    constructors[HEARTBEAT] = arg -> new HeartbeatOp();
    constructors[CONFIG_CHECK] = arg -> new ConfigCheck();
    constructors[MEMBER_HANDSHAKE] = arg -> new MemberHandshake();
    constructors[MEMBER_INFO_UPDATE] = arg -> new MembersUpdateOp();
    constructors[FINALIZE_JOIN] = arg -> new FinalizeJoinOp();
    constructors[BEFORE_JOIN_CHECK_FAILURE] = arg -> new BeforeJoinCheckFailureOp();
    constructors[CHANGE_CLUSTER_STATE] = arg -> new CommitClusterStateOp();
    constructors[CONFIG_MISMATCH] = arg -> new ConfigMismatchOp();
    constructors[CLUSTER_MISMATCH] = arg -> new ClusterMismatchOp();
    constructors[SPLIT_BRAIN_MERGE_VALIDATION] = arg -> new SplitBrainMergeValidationOp();
    constructors[JOIN_REQUEST_OP] = arg -> new JoinRequestOp();
    constructors[LOCK_CLUSTER_STATE] = arg -> new LockClusterStateOp();
    constructors[MASTER_CLAIM] = arg -> new JoinMastershipClaimOp();
    constructors[WHOIS_MASTER] = arg -> new WhoisMasterOp();
    constructors[MERGE_CLUSTERS] = arg -> new MergeClustersOp();
    constructors[POST_JOIN] = arg -> new OnJoinOp();
    constructors[ROLLBACK_CLUSTER_STATE] = arg -> new RollbackClusterStateOp();
    constructors[MASTER_RESPONSE] = arg -> new MasterResponseOp();
    constructors[SHUTDOWN_NODE] = arg -> new ShutdownNodeOp();
    constructors[TRIGGER_MEMBER_LIST_PUBLISH] = arg -> new TriggerMemberListPublishOp();
    constructors[CLUSTER_STATE_TRANSACTION_LOG_RECORD] = arg -> new ClusterStateTransactionLogRecord();
    constructors[MEMBER_INFO] = arg -> new MemberInfo();
    constructors[JOIN_MESSAGE] = arg -> new JoinMessage();
    constructors[JOIN_REQUEST] = arg -> new JoinRequest();
    constructors[MIGRATION_INFO] = arg -> new MigrationInfo();
    constructors[MEMBER_VERSION] = arg -> new MemberVersion();
    constructors[CLUSTER_STATE_CHANGE] = arg -> new ClusterStateChange();
    constructors[SPLIT_BRAIN_JOIN_MESSAGE] = arg -> new SplitBrainJoinMessage();
    constructors[VERSION] = arg -> new Version();
    constructors[FETCH_MEMBER_LIST_STATE] = arg -> new FetchMembersViewOp();
    constructors[EXPLICIT_SUSPICION] = arg -> new ExplicitSuspicionOp();
    constructors[MEMBERS_VIEW] = arg -> new MembersView();
    constructors[TRIGGER_EXPLICIT_SUSPICION] = arg -> new TriggerExplicitSuspicionOp();
    constructors[MEMBERS_VIEW_METADATA] = arg -> new MembersViewMetadata();
    constructors[HEARTBEAT_COMPLAINT] = arg -> new HeartbeatComplaintOp();
    constructors[PROMOTE_LITE_MEMBER] = arg -> new PromoteLiteMemberOp();
    constructors[VECTOR_CLOCK] = arg -> new VectorClock();
    constructors[ENDPOINT_QUALIFIER] = arg -> new EndpointQualifier();
    return new ArrayDataSerializableFactory(constructors);
}
Also used : MigrationInfo(com.hazelcast.internal.partition.MigrationInfo) MergeClustersOp(com.hazelcast.internal.cluster.impl.operations.MergeClustersOp) Address(com.hazelcast.cluster.Address) MembersUpdateOp(com.hazelcast.internal.cluster.impl.operations.MembersUpdateOp) SplitBrainMergeValidationOp(com.hazelcast.internal.cluster.impl.operations.SplitBrainMergeValidationOp) RollbackClusterStateOp(com.hazelcast.internal.cluster.impl.operations.RollbackClusterStateOp) EndpointQualifier(com.hazelcast.instance.EndpointQualifier) TriggerMemberListPublishOp(com.hazelcast.internal.cluster.impl.operations.TriggerMemberListPublishOp) MemberVersion(com.hazelcast.version.MemberVersion) HeartbeatComplaintOp(com.hazelcast.internal.cluster.impl.operations.HeartbeatComplaintOp) MemberInfo(com.hazelcast.internal.cluster.MemberInfo) Version(com.hazelcast.version.Version) MemberVersion(com.hazelcast.version.MemberVersion) ShutdownNodeOp(com.hazelcast.internal.cluster.impl.operations.ShutdownNodeOp) HeartbeatOp(com.hazelcast.internal.cluster.impl.operations.HeartbeatOp) TriggerExplicitSuspicionOp(com.hazelcast.internal.cluster.impl.operations.TriggerExplicitSuspicionOp) BeforeJoinCheckFailureOp(com.hazelcast.internal.cluster.impl.operations.BeforeJoinCheckFailureOp) MasterResponseOp(com.hazelcast.internal.cluster.impl.operations.MasterResponseOp) MemberImpl(com.hazelcast.cluster.impl.MemberImpl) LockClusterStateOp(com.hazelcast.internal.cluster.impl.operations.LockClusterStateOp) ExplicitSuspicionOp(com.hazelcast.internal.cluster.impl.operations.ExplicitSuspicionOp) TriggerExplicitSuspicionOp(com.hazelcast.internal.cluster.impl.operations.TriggerExplicitSuspicionOp) VectorClock(com.hazelcast.cluster.impl.VectorClock) AuthenticationFailureOp(com.hazelcast.internal.cluster.impl.operations.AuthenticationFailureOp) ClusterMismatchOp(com.hazelcast.internal.cluster.impl.operations.ClusterMismatchOp) PromoteLiteMemberOp(com.hazelcast.internal.cluster.impl.operations.PromoteLiteMemberOp) FinalizeJoinOp(com.hazelcast.internal.cluster.impl.operations.FinalizeJoinOp) ConstructorFunction(com.hazelcast.internal.util.ConstructorFunction) ConfigMismatchOp(com.hazelcast.internal.cluster.impl.operations.ConfigMismatchOp) CommitClusterStateOp(com.hazelcast.internal.cluster.impl.operations.CommitClusterStateOp) WhoisMasterOp(com.hazelcast.internal.cluster.impl.operations.WhoisMasterOp) OnJoinOp(com.hazelcast.internal.cluster.impl.operations.OnJoinOp) JoinRequestOp(com.hazelcast.internal.cluster.impl.operations.JoinRequestOp) FetchMembersViewOp(com.hazelcast.internal.cluster.impl.operations.FetchMembersViewOp) ArrayDataSerializableFactory(com.hazelcast.internal.serialization.impl.ArrayDataSerializableFactory) JoinMastershipClaimOp(com.hazelcast.internal.cluster.impl.operations.JoinMastershipClaimOp)

Example 3 with TriggerMemberListPublishOp

use of com.hazelcast.internal.cluster.impl.operations.TriggerMemberListPublishOp in project hazelcast by hazelcast.

the class InternalPartitionServiceImpl method requestMemberListUpdateIfUnknownMembersFound.

private void requestMemberListUpdateIfUnknownMembersFound(Address sender, InternalPartition[] partitions) {
    ClusterServiceImpl clusterService = node.clusterService;
    ClusterState clusterState = clusterService.getClusterState();
    Set<PartitionReplica> unknownReplicas = new HashSet<>();
    for (InternalPartition partition : partitions) {
        for (int index = 0; index < InternalPartition.MAX_REPLICA_COUNT; index++) {
            PartitionReplica replica = partition.getReplica(index);
            if (replica == null) {
                continue;
            }
            if (node.clusterService.getMember(replica.address(), replica.uuid()) == null && (clusterState.isJoinAllowed() || !clusterService.isMissingMember(replica.address(), replica.uuid()))) {
                unknownReplicas.add(replica);
            }
        }
    }
    if (!unknownReplicas.isEmpty()) {
        if (logger.isWarningEnabled()) {
            StringBuilder s = new StringBuilder("Following unknown addresses are found in partition table").append(" sent from master[").append(sender).append("].").append(" (Probably they have recently joined or left the cluster.)").append(" {");
            for (PartitionReplica replica : unknownReplicas) {
                s.append("\n\t").append(replica);
            }
            s.append("\n}");
            logger.warning(s.toString());
        }
        Address masterAddress = node.getClusterService().getMasterAddress();
        // If node is shutting down, master can be null.
        if (masterAddress != null && !masterAddress.equals(node.getThisAddress())) {
            // unknown addresses found in partition table, request a new member-list from master
            nodeEngine.getOperationService().send(new TriggerMemberListPublishOp(), masterAddress);
        }
    }
}
Also used : ClusterState(com.hazelcast.cluster.ClusterState) Address(com.hazelcast.cluster.Address) PartitionReplica(com.hazelcast.internal.partition.PartitionReplica) ClusterServiceImpl(com.hazelcast.internal.cluster.impl.ClusterServiceImpl) InternalPartition(com.hazelcast.internal.partition.InternalPartition) ReadonlyInternalPartition(com.hazelcast.internal.partition.ReadonlyInternalPartition) TriggerMemberListPublishOp(com.hazelcast.internal.cluster.impl.operations.TriggerMemberListPublishOp) HashSet(java.util.HashSet)

Example 4 with TriggerMemberListPublishOp

use of com.hazelcast.internal.cluster.impl.operations.TriggerMemberListPublishOp in project hazelcast by hazelcast.

the class JobExecutionService method verifyClusterInformation.

private void verifyClusterInformation(long jobId, long executionId, Address coordinator, int coordinatorMemberListVersion, Set<MemberInfo> participants) {
    Address masterAddress = nodeEngine.getMasterAddress();
    ClusterServiceImpl clusterService = (ClusterServiceImpl) nodeEngine.getClusterService();
    MembershipManager membershipManager = clusterService.getMembershipManager();
    int localMemberListVersion = membershipManager.getMemberListVersion();
    Address thisAddress = nodeEngine.getThisAddress();
    if (coordinatorMemberListVersion > localMemberListVersion) {
        if (masterAddress == null) {
            // elected or split brain merge will happen).
            throw new RetryableHazelcastException(String.format("Cannot initialize %s for coordinator %s, local member list version %s," + " coordinator member list version %s. And also, since the master address" + " is not known to this member, cannot request a new member list from master.", jobIdAndExecutionId(jobId, executionId), coordinator, localMemberListVersion, coordinatorMemberListVersion));
        }
        assert !masterAddress.equals(thisAddress) : String.format("Local node: %s is master but InitOperation has coordinator member list version: %s larger than " + " local member list version: %s", thisAddress, coordinatorMemberListVersion, localMemberListVersion);
        nodeEngine.getOperationService().send(new TriggerMemberListPublishOp(), masterAddress);
        throw new RetryableHazelcastException(String.format("Cannot initialize %s for coordinator %s, local member list version %s," + " coordinator member list version %s", jobIdAndExecutionId(jobId, executionId), coordinator, localMemberListVersion, coordinatorMemberListVersion));
    }
    // If the participant members can receive the new member list before the
    // coordinator, and we can also get into the
    // "coordinatorMemberListVersion < localMemberListVersion" case. If this
    // situation occurs when a job participant leaves, then the job start will
    // fail. Since the unknown participating member situation couldn't
    // be resolved with retrying the InitExecutionOperation for this
    // case, we do nothing here and let it fail below if some participant
    // isn't found.
    // The job start won't fail if this situation occurs when a new member
    // is added to the cluster, because all job participants are known to the
    // other participating members. The only disadvantage of this is that a
    // newly added member will not be a job participant and partition mapping
    // may not be completely proper in this case.
    boolean isLocalMemberParticipant = false;
    for (MemberInfo participant : participants) {
        if (participant.getAddress().equals(thisAddress)) {
            isLocalMemberParticipant = true;
        }
        if (membershipManager.getMember(participant.getAddress(), participant.getUuid()) == null) {
            throw new TopologyChangedException(String.format("Cannot initialize %s for coordinator %s: participant %s not found in local member list." + " Local member list version: %s, coordinator member list version: %s", jobIdAndExecutionId(jobId, executionId), coordinator, participant, localMemberListVersion, coordinatorMemberListVersion));
        }
    }
    if (!isLocalMemberParticipant) {
        throw new IllegalArgumentException(String.format("Cannot initialize %s since member %s is not in participants: %s", jobIdAndExecutionId(jobId, executionId), thisAddress, participants));
    }
}
Also used : Address(com.hazelcast.cluster.Address) RetryableHazelcastException(com.hazelcast.spi.exception.RetryableHazelcastException) MemberInfo(com.hazelcast.internal.cluster.MemberInfo) ClusterServiceImpl(com.hazelcast.internal.cluster.impl.ClusterServiceImpl) MembershipManager(com.hazelcast.internal.cluster.impl.MembershipManager) TriggerMemberListPublishOp(com.hazelcast.internal.cluster.impl.operations.TriggerMemberListPublishOp) TopologyChangedException(com.hazelcast.jet.core.TopologyChangedException)

Aggregations

TriggerMemberListPublishOp (com.hazelcast.internal.cluster.impl.operations.TriggerMemberListPublishOp)4 Address (com.hazelcast.cluster.Address)3 MemberInfo (com.hazelcast.internal.cluster.MemberInfo)3 ClusterServiceImpl (com.hazelcast.internal.cluster.impl.ClusterServiceImpl)3 MembershipManager (com.hazelcast.internal.cluster.impl.MembershipManager)2 TopologyChangedException (com.hazelcast.jet.core.TopologyChangedException)2 ClusterState (com.hazelcast.cluster.ClusterState)1 MemberImpl (com.hazelcast.cluster.impl.MemberImpl)1 VectorClock (com.hazelcast.cluster.impl.VectorClock)1 EndpointQualifier (com.hazelcast.instance.EndpointQualifier)1 AuthenticationFailureOp (com.hazelcast.internal.cluster.impl.operations.AuthenticationFailureOp)1 BeforeJoinCheckFailureOp (com.hazelcast.internal.cluster.impl.operations.BeforeJoinCheckFailureOp)1 ClusterMismatchOp (com.hazelcast.internal.cluster.impl.operations.ClusterMismatchOp)1 CommitClusterStateOp (com.hazelcast.internal.cluster.impl.operations.CommitClusterStateOp)1 ConfigMismatchOp (com.hazelcast.internal.cluster.impl.operations.ConfigMismatchOp)1 ExplicitSuspicionOp (com.hazelcast.internal.cluster.impl.operations.ExplicitSuspicionOp)1 FetchMembersViewOp (com.hazelcast.internal.cluster.impl.operations.FetchMembersViewOp)1 FinalizeJoinOp (com.hazelcast.internal.cluster.impl.operations.FinalizeJoinOp)1 HeartbeatComplaintOp (com.hazelcast.internal.cluster.impl.operations.HeartbeatComplaintOp)1 HeartbeatOp (com.hazelcast.internal.cluster.impl.operations.HeartbeatOp)1