2020 6.824 的 Raft Lab 2C
前言
做2020的MIT6.824,完成了实验Raft Lab2C,通过了测试,对于之前的两个实验请参考2020 6.824 的 Raft Lab 2A 以及2020 6.824 的 Raft Lab 2B
这个实验的重点在于实现persist()以及做优化。总的来说代码量比Lab2B要少,又沿用Lab2A已经搭好的框架,所以整体会比前两个实验简单一些
这个实验的坑是TestFigure8Unreliable2C,这里有相关的讨论可以根据自己的情况参考一下,总结有两点如下
- AppendEntries实现有误
- Time interval 的问题
下面有这个链接对我的实验测试很有帮助,主要是为了多测测试,保证没有因为概率通过而miss掉的一些测试用例
3. 并行运行测试的shellscript
##每20个test并行运行,运行100次2B的test
bash test_many.sh 100 20 2C
一、Raft2C
实验可以分为两个部分,首先是实现persist()以及做优化
二、Persist()
2.1 实现
Lab已经有部分参考的实验代码了,结合paper的Figure 2就可以写出来了
最终实现如下
func (rf *Raft) persist() {
// Your code here (2C).
w := new(bytes.Buffer)
e := labgob.NewEncoder(w)
e.Encode(rf.currentTerm)
e.Encode(rf.votedFor)
e.Encode(rf.log)
data := w.Bytes()
rf.persister.SaveRaftState(data)
}
func (rf *Raft) readPersist(data []byte) {
if data == nil || len(data) < 1 { // bootstrap without any state?
return
}
// Your code here (2C).
r := bytes.NewBuffer(data)
d := labgob.NewDecoder(r)
rf.mu.Lock()
defer rf.mu.Unlock()
d.Decode(&rf.currentTerm)
d.Decode(&rf.votedFor)
d.Decode(&rf.log)
}
2.2 调用
接下来就要分析什么时候需要调用persist函数了,其实就是一旦修改了上面持久化的三个值currentTerm, votedFor, log 就需要调用一次persist函数
那么对于我的implementation来说就是在convertToFollower,convertToCandidate以及convertToLeader的时候
func (rf *Raft) convertToFollower(newTerm int) {
defer rf.persist()
...
}
func (rf *Raft) convertToCandidate() {
defer rf.persist()
...
}
func (rf *Raft) convertToLeader() {
defer rf.persist()
...
}
在完成保存日志的时候
// 保存日志
for i, logEntry := range args.Entries {
...
}
rf.persist()
在调用RequestVote的修改votedFor时候
if (rf.votedFor == -1 || rf.votedFor == args.CandidateId) && rf.isLogMoreUpToDate(args.LastLogIndex, args.LastLogTerm) {
rf.votedFor = args.CandidateId
reply.VoteGranted = true
rf.persist()
}
三、优化
一共有三点 对应实现就ok了
If a follower does not have prevLogIndex in its log, it should return with conflictIndex = len(log) and conflictTerm = None.
这一点Lab2B已经实现了
if len(rf.log)-1 < args.PrevLogIndex {
conflictIndex = len(rf.log)
goto label1
}
If a follower does have prevLogIndex in its log, but the term does not match, it should return conflictTerm = log[prevLogIndex].Term, and then search its log for the first index whose entry has term equal to conflictTerm.
翻译一下就是如果有index一直但是term不一致的情况,让conflictTerm等于follower对应prevLogIndex的term,然后找出conflictTerm第一次出现的位置(看不懂的话可以看看下面我举的例子)
if args.PrevLogIndex > 0 && rf.log[args.PrevLogIndex].Term != args.PrevLogTerm {
conflictTerm = rf.log[args.PrevLogIndex].Term
for i := 1; i <= args.PrevLogIndex; i++ {
if rf.log[i].Term == conflictTerm {
conflictIndex = i
break
}
}
goto label1
}
Upon receiving a conflict response, the leader should first search its log for conflictTerm. If it finds an entry in its log with that term, it should set nextIndex to be the one beyond the index of the last entry in that term in its log. If it does not find an entry with that term, it should set nextIndex = conflictIndex.
翻译一下就是当Leader收到Reply=false的时候
- Leader找出conflictTerm对应最后出现的位置+1
- 如果找不到,把nextIndex设为conflictIndex
(看不懂的话可以看看下面我举的例子)
Example
Follower e:Leader跟e在AppendEntries通讯之后,f会把Reply设为false,ConflictTerm设为4,ConflictIndex设为4(conflictTerm第一次出现的位置)。那么对于Lerder收到Reply=false之后,会找Term为4的最后一个index,也就是5,对应1中conflictTerm对应最后出现的位置+1,于是把nextIndex[e]设为5+1=6,再发送AppendEntries
Follower f:Leader跟f在AppendEntries通讯之后,f会把Reply设为false,ConflictTerm设为3,ConflictIndex设为7(conflictTerm第一次出现的位置)。那么对于Lerder收到Reply=false之后,会找Term为3的最后一个Index,没有找到于是把nextIndex[f]设为7,再发送AppendEntries
实现如下
if reply.Success == true {
...
} else {
if reply.ConflictTerm < 0 {
rf.nextIndex[p] = reply.ConflictIndex
} else {
//Upon receiving a conflict response, the leader should first search its log for conflictTerm.
lastEntryOfConflictTerm := -1
for entryIndex := args.PrevLogIndex; entryIndex > 0; entryIndex-- {
if rf.log[entryIndex].Term == reply.ConflictTerm {
lastEntryOfConflictTerm = entryIndex
break
}
}
//If it does not find an entry with that term, it should set nextIndex = conflictIndex
if lastEntryOfConflictTerm == -1 {
rf.nextIndex[p] = reply.ConflictIndex
} else {
//If it finds an entry in its log with that term, it should set nextIndex to be the one beyond the index of the last entry in that term in its log.
rf.nextIndex[p] = lastEntryOfConflictTerm + 1
}
}
}
总结
至此,Raft的Lab2就全部实现完了,回过头来思考就是要紧跟Paper实现,多使用DPrintf进行调试,AppendEntries有很多需要调试的地方,包括优化的部分,包括log保存的部分。实现一次还是很enjoy的,have fun