
Example 1 with ReplicatedContent

Use of org.neo4j.causalclustering.core.replication.ReplicatedContent in project neo4j by neo4j.

From the class Appending, method appendNewEntries.

static void appendNewEntries(ReadableRaftState ctx, Outcome outcome, List<ReplicatedContent> contents) throws IOException {
    long prevLogIndex = ctx.entryLog().appendIndex();
    long prevLogTerm = prevLogIndex == -1 ? -1 : prevLogIndex > ctx.lastLogIndexBeforeWeBecameLeader() ? ctx.term() : ctx.entryLog().readEntryTerm(prevLogIndex);
    RaftLogEntry[] raftLogEntries = contents.stream().map(content -> new RaftLogEntry(ctx.term(), content)).toArray(RaftLogEntry[]::new);
    outcome.addShipCommand(new ShipCommand.NewEntries(prevLogIndex, prevLogTerm, raftLogEntries));
    outcome.addLogCommand(new BatchAppendLogEntries(prevLogIndex + 1, 0, raftLogEntries));
}
Also used : Log(org.neo4j.logging.Log) Outcome(org.neo4j.causalclustering.core.consensus.outcome.Outcome) IOException(java.io.IOException) ReadableRaftState(org.neo4j.causalclustering.core.consensus.state.ReadableRaftState) String.format(java.lang.String.format) ShipCommand(org.neo4j.causalclustering.core.consensus.outcome.ShipCommand) List(java.util.List) RaftLogEntry(org.neo4j.causalclustering.core.consensus.log.RaftLogEntry) AppendLogEntry(org.neo4j.causalclustering.core.consensus.outcome.AppendLogEntry) ReplicatedContent(org.neo4j.causalclustering.core.replication.ReplicatedContent) BatchAppendLogEntries(org.neo4j.causalclustering.core.consensus.outcome.BatchAppendLogEntries) RaftMessages(org.neo4j.causalclustering.core.consensus.RaftMessages) TruncateLogCommand(org.neo4j.causalclustering.core.consensus.outcome.TruncateLogCommand)
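
The nested ternary that computes prevLogTerm in appendNewEntries packs three cases into one line. A minimal sketch that spells them out, assuming the term lookup is passed in as a function; PrevLogTermSketch and its parameter names are hypothetical and are not part of the neo4j code base:

```java
import java.util.function.LongUnaryOperator;

final class PrevLogTermSketch {
    /**
     * Mirrors the three cases of the ternary in appendNewEntries.
     * readEntryTerm stands in for ctx.entryLog().readEntryTerm(index).
     */
    static long prevLogTerm(long prevLogIndex,
                            long lastLogIndexBeforeWeBecameLeader,
                            long currentTerm,
                            LongUnaryOperator readEntryTerm) {
        if (prevLogIndex == -1) {
            // Empty log: there is no previous entry, so there is no previous term.
            return -1;
        }
        if (prevLogIndex > lastLogIndexBeforeWeBecameLeader) {
            // The previous entry was appended during the current leadership,
            // so its term is the current term and no log read is needed.
            return currentTerm;
        }
        // The previous entry predates the current leadership: look its term up.
        return readEntryTerm.applyAsLong(prevLogIndex);
    }
}
```

The middle case only saves a log read; the result is the same as looking the term up, provided every entry appended since the leadership change carries the current term.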

Example 2 with ReplicatedContent

Use of org.neo4j.causalclustering.core.replication.ReplicatedContent in project neo4j by neo4j.

From the class Leader, method handle.

@Override
public Outcome handle(RaftMessages.RaftMessage message, ReadableRaftState ctx, Log log) throws IOException {
    Outcome outcome = new Outcome(LEADER, ctx);
    switch(message.type()) {
        case HEARTBEAT:
            {
                Heartbeat req = (Heartbeat) message;
                if (req.leaderTerm() < ctx.term()) {
                    break;
                }
                stepDownToFollower(outcome);
                log.info("Moving to FOLLOWER state after receiving heartbeat at term %d (my term is " + "%d) from %s", req.leaderTerm(), ctx.term(), req.from());
                Heart.beat(ctx, outcome, (Heartbeat) message, log);
                break;
            }
        case HEARTBEAT_TIMEOUT:
            {
                sendHeartbeats(ctx, outcome);
                break;
            }
        case HEARTBEAT_RESPONSE:
            {
                outcome.addHeartbeatResponse(message.from());
                break;
            }
        case ELECTION_TIMEOUT:
            {
                if (!isQuorum(ctx.votingMembers().size(), ctx.heartbeatResponses().size())) {
                    stepDownToFollower(outcome);
                    log.info("Moving to FOLLOWER state after not receiving heartbeat responses in this election timeout " + "period. Heartbeats received: %s", ctx.heartbeatResponses());
                }
                outcome.getHeartbeatResponses().clear();
                break;
            }
        case APPEND_ENTRIES_REQUEST:
            {
                RaftMessages.AppendEntries.Request req = (RaftMessages.AppendEntries.Request) message;
                if (req.leaderTerm() < ctx.term()) {
                    RaftMessages.AppendEntries.Response appendResponse = new RaftMessages.AppendEntries.Response(ctx.myself(), ctx.term(), false, -1, ctx.entryLog().appendIndex());
                    outcome.addOutgoingMessage(new RaftMessages.Directed(req.from(), appendResponse));
                    break;
                } else if (req.leaderTerm() == ctx.term()) {
                    throw new IllegalStateException("Two leaders in the same term.");
                } else {
                    // There is a new leader in a later term, we should revert to follower. (§5.1)
                    stepDownToFollower(outcome);
                    log.info("Moving to FOLLOWER state after receiving append request at term %d (my term is " + "%d) from %s", req.leaderTerm(), ctx.term(), req.from());
                    Appending.handleAppendEntriesRequest(ctx, outcome, req, log);
                    break;
                }
            }
        case APPEND_ENTRIES_RESPONSE:
            {
                RaftMessages.AppendEntries.Response response = (RaftMessages.AppendEntries.Response) message;
                if (response.term() < ctx.term()) {
                    /* Ignore responses from old terms! */
                    break;
                } else if (response.term() > ctx.term()) {
                    outcome.setNextTerm(response.term());
                    stepDownToFollower(outcome);
                    log.info("Moving to FOLLOWER state after receiving append response at term %d (my term is " + "%d) from %s", response.term(), ctx.term(), response.from());
                    outcome.replaceFollowerStates(new FollowerStates<>());
                    break;
                }
                FollowerState follower = ctx.followerStates().get(response.from());
                if (response.success()) {
                    assert response.matchIndex() <= ctx.entryLog().appendIndex();
                    boolean followerProgressed = response.matchIndex() > follower.getMatchIndex();
                    outcome.replaceFollowerStates(outcome.getFollowerStates().onSuccessResponse(response.from(), max(response.matchIndex(), follower.getMatchIndex())));
                    outcome.addShipCommand(new ShipCommand.Match(response.matchIndex(), response.from()));
                    /*
                     * Matches from older terms can in complicated leadership change / log truncation scenarios
                     * be overwritten, even if they were replicated to a majority of instances. Thus we must only
                     * consider matches from this leader's term when figuring out which have been safely replicated
                     * and are ready for commit.
                     * This is explained nicely in Figure 3.7 of the thesis
                     */
                    boolean matchInCurrentTerm = ctx.entryLog().readEntryTerm(response.matchIndex()) == ctx.term();
                    /*
                     * The quorum situation may have changed only if the follower actually progressed.
                     */
                    if (followerProgressed && matchInCurrentTerm) {
                        // TODO: Test that mismatch between voting and participating members affects commit outcome
                        long quorumAppendIndex = Followers.quorumAppendIndex(ctx.votingMembers(), outcome.getFollowerStates());
                        if (quorumAppendIndex > ctx.commitIndex()) {
                            outcome.setLeaderCommit(quorumAppendIndex);
                            outcome.setCommitIndex(quorumAppendIndex);
                            outcome.addShipCommand(new ShipCommand.CommitUpdate());
                        }
                    }
                } else {
                    // Response indicated failure.
                    if (response.appendIndex() > -1 && response.appendIndex() >= ctx.entryLog().prevIndex()) {
                        // Signal a mismatch to the log shipper, which will serve an earlier entry.
                        outcome.addShipCommand(new ShipCommand.Mismatch(response.appendIndex(), response.from()));
                    } else {
                        // There are no earlier entries, message the follower that we have compacted so that
                        // it can take appropriate action.
                        LogCompactionInfo compactionInfo = new LogCompactionInfo(ctx.myself(), ctx.term(), ctx.entryLog().prevIndex());
                        RaftMessages.Directed directedCompactionInfo = new RaftMessages.Directed(response.from(), compactionInfo);
                        outcome.addOutgoingMessage(directedCompactionInfo);
                    }
                }
                break;
            }
        case VOTE_REQUEST:
            {
                RaftMessages.Vote.Request req = (RaftMessages.Vote.Request) message;
                if (req.term() > ctx.term()) {
                    stepDownToFollower(outcome);
                    log.info("Moving to FOLLOWER state after receiving vote request at term %d (my term is " + "%d) from %s", req.term(), ctx.term(), req.from());
                    Voting.handleVoteRequest(ctx, outcome, req);
                    break;
                }
                outcome.addOutgoingMessage(new RaftMessages.Directed(req.from(), new RaftMessages.Vote.Response(ctx.myself(), ctx.term(), false)));
                break;
            }
        case NEW_ENTRY_REQUEST:
            {
                RaftMessages.NewEntry.Request req = (RaftMessages.NewEntry.Request) message;
                ReplicatedContent content = req.content();
                Appending.appendNewEntry(ctx, outcome, content);
                break;
            }
        case NEW_BATCH_REQUEST:
            {
                RaftMessages.NewEntry.BatchRequest req = (RaftMessages.NewEntry.BatchRequest) message;
                List<ReplicatedContent> contents = req.contents();
                Appending.appendNewEntries(ctx, outcome, contents);
                break;
            }
        case PRUNE_REQUEST:
            {
                Pruning.handlePruneRequest(outcome, (RaftMessages.PruneRequest) message);
                break;
            }
        default:
            break;
    }
    return outcome;
}
Also used : LogCompactionInfo(org.neo4j.causalclustering.core.consensus.RaftMessages.LogCompactionInfo) RaftMessages(org.neo4j.causalclustering.core.consensus.RaftMessages) Outcome(org.neo4j.causalclustering.core.consensus.outcome.Outcome) Heartbeat(org.neo4j.causalclustering.core.consensus.RaftMessages.Heartbeat) ReplicatedContent(org.neo4j.causalclustering.core.replication.ReplicatedContent) List(java.util.List) FollowerState(org.neo4j.causalclustering.core.consensus.roles.follower.FollowerState)
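
The APPEND_ENTRIES_RESPONSE branch moves the commit index forward to the highest index that a quorum of voting members has matched. A minimal sketch of such a calculation, assuming the caller supplies exactly one match index per voting member (the leader's own included); QuorumIndexSketch and quorumIndex are hypothetical names, and this is not the implementation of Followers.quorumAppendIndex:

```java
import java.util.Arrays;

final class QuorumIndexSketch {
    /**
     * Returns the highest log index that a strict majority of the given
     * members have matched, or -1 if there are no members.
     */
    static long quorumIndex(long[] matchIndexes) {
        if (matchIndexes.length == 0) {
            return -1;
        }
        long[] sorted = matchIndexes.clone();
        Arrays.sort(sorted); // ascending order
        int quorum = sorted.length / 2 + 1; // strict majority
        // Exactly 'quorum' members have a match index at or above this element.
        return sorted[sorted.length - quorum];
    }
}
```

As the comment in the handler notes, the real code additionally requires the matched entry to belong to the leader's current term before committing; the sketch leaves that check to the caller.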

Example 3 with ReplicatedContent

Use of org.neo4j.causalclustering.core.replication.ReplicatedContent in project neo4j by neo4j.

From the class EntryRecord, method read.

public static EntryRecord read(ReadableChannel channel, ChannelMarshal<ReplicatedContent> contentMarshal) throws IOException, EndOfStreamException {
    try {
        long appendIndex = channel.getLong();
        long term = channel.getLong();
        ReplicatedContent content = contentMarshal.unmarshal(channel);
        return new EntryRecord(appendIndex, new RaftLogEntry(term, content));
    } catch (ReadPastEndException e) {
        throw new EndOfStreamException(e);
    }
}
Also used : EndOfStreamException(org.neo4j.causalclustering.messaging.EndOfStreamException) ReplicatedContent(org.neo4j.causalclustering.core.replication.ReplicatedContent) ReadPastEndException(org.neo4j.storageengine.api.ReadPastEndException)
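
EntryRecord.read decodes a fixed field order (append index, term, content) and translates a low-level read-past-end failure into a domain exception. A minimal sketch of the same pattern on a java.nio.ByteBuffer, assuming the content is a single int; EntryRecordSketch and TruncatedRecordException are hypothetical, and the exception is unchecked here only to keep the sketch short:

```java
import java.nio.BufferUnderflowException;
import java.nio.ByteBuffer;

final class EntryRecordSketch {
    /** Hypothetical stand-in for EndOfStreamException. */
    static final class TruncatedRecordException extends RuntimeException {
        TruncatedRecordException(Throwable cause) {
            super(cause);
        }
    }

    /** Writes the fields in the same order that read() expects them. */
    static void write(ByteBuffer buffer, long appendIndex, long term, int content) {
        buffer.putLong(appendIndex).putLong(term).putInt(content);
    }

    /** Reads one record and returns {appendIndex, term, content}. */
    static long[] read(ByteBuffer buffer) {
        try {
            long appendIndex = buffer.getLong();
            long term = buffer.getLong();
            int content = buffer.getInt();
            return new long[] {appendIndex, term, content};
        } catch (BufferUnderflowException e) {
            // Translate the buffer-level failure into a domain-level one,
            // as the original does with ReadPastEndException.
            throw new TruncatedRecordException(e);
        }
    }
}
```

Wrapping the low-level exception keeps callers independent of the channel or buffer type that backs the record.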

Example 4 with ReplicatedContent

Use of org.neo4j.causalclustering.core.replication.ReplicatedContent in project neo4j by neo4j.

From the class RaftContentByteBufferMarshalTest, method shouldSerializeIdRangeRequest.

@Test
public void shouldSerializeIdRangeRequest() throws Exception {
    // given
    CoreReplicatedContentMarshal serializer = new CoreReplicatedContentMarshal();
    ReplicatedContent in = new ReplicatedIdAllocationRequest(memberId, IdType.NODE, 100, 200);
    // when
    ByteBuf buf = Unpooled.buffer();
    assertMarshalingEquality(serializer, buf, in);
}
Also used : CoreReplicatedContentMarshal(org.neo4j.causalclustering.messaging.CoreReplicatedContentMarshal) ReplicatedIdAllocationRequest(org.neo4j.causalclustering.core.state.machines.id.ReplicatedIdAllocationRequest) ReplicatedContent(org.neo4j.causalclustering.core.replication.ReplicatedContent) ByteBuf(io.netty.buffer.ByteBuf) NetworkFlushableByteBuf(org.neo4j.causalclustering.messaging.NetworkFlushableByteBuf) Test(org.junit.Test)

Example 5 with ReplicatedContent

Use of org.neo4j.causalclustering.core.replication.ReplicatedContent in project neo4j by neo4j.

From the class CatchUpTest, method integerValues.

private List<Integer> integerValues(ReadableRaftLog log) throws IOException {
    List<Integer> actual = new ArrayList<>();
    for (long logIndex = 0; logIndex <= log.appendIndex(); logIndex++) {
        ReplicatedContent content = readLogEntry(log, logIndex).content();
        if (content instanceof ReplicatedInteger) {
            ReplicatedInteger integer = (ReplicatedInteger) content;
            actual.add(integer.get());
        }
    }
    return actual;
}
Also used : ReplicatedContent(org.neo4j.causalclustering.core.replication.ReplicatedContent) ArrayList(java.util.ArrayList)

Aggregations

ReplicatedContent (org.neo4j.causalclustering.core.replication.ReplicatedContent) 9
RaftMessages (org.neo4j.causalclustering.core.consensus.RaftMessages) 3
RaftLogEntry (org.neo4j.causalclustering.core.consensus.log.RaftLogEntry) 3
List (java.util.List) 2
Test (org.junit.Test) 2
Outcome (org.neo4j.causalclustering.core.consensus.outcome.Outcome) 2
CoreReplicatedContentMarshal (org.neo4j.causalclustering.messaging.CoreReplicatedContentMarshal) 2
ByteBuf (io.netty.buffer.ByteBuf) 1
File (java.io.File) 1
IOException (java.io.IOException) 1
String.format (java.lang.String.format) 1
ArrayList (java.util.ArrayList) 1
NewLeaderBarrier (org.neo4j.causalclustering.core.consensus.NewLeaderBarrier) 1
Heartbeat (org.neo4j.causalclustering.core.consensus.RaftMessages.Heartbeat) 1
LogCompactionInfo (org.neo4j.causalclustering.core.consensus.RaftMessages.LogCompactionInfo) 1
CoreLogPruningStrategy (org.neo4j.causalclustering.core.consensus.log.segmented.CoreLogPruningStrategy) 1
CoreLogPruningStrategyFactory (org.neo4j.causalclustering.core.consensus.log.segmented.CoreLogPruningStrategyFactory) 1
SegmentedRaftLog (org.neo4j.causalclustering.core.consensus.log.segmented.SegmentedRaftLog) 1
AppendLogEntry (org.neo4j.causalclustering.core.consensus.outcome.AppendLogEntry) 1
BatchAppendLogEntries (org.neo4j.causalclustering.core.consensus.outcome.BatchAppendLogEntries) 1