📜  Distributed Consensus in Distributed Systems (1)

📅  Last modified: 2023-12-03 14:50:11.394000             🧑  Author: Mango

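Distributed Consensus in Distributed Systems

Distributed consensus refers to the protocols and algorithms by which the nodes of a distributed system agree on a value or state, so that data stays consistent across nodes. Network latency, message loss, and node failures make this consistency hard to guarantee, and consensus mechanisms exist to provide it despite those faults.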

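Common Distributed Consensus Algorithms

Commonly used distributed consensus algorithms include Paxos, Raft, and PBFT. Paxos is widely regarded as the classic consensus algorithm. Raft was designed to be easier to understand and implement, and has seen wide attention and adoption. PBFT targets systems that must tolerate Byzantine (arbitrary or malicious) node behavior, whereas Paxos and Raft assume that nodes fail only by crashing.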

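Basic Principle of Distributed Consensus Algorithms

The basic principle is that nodes communicate and cooperate until the system as a whole agrees on a single value or state. In practice, one or more nodes take on a special role (for example a leader, proposer, or primary) and coordinate the other nodes toward agreement, which preserves the correctness and availability of the system.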
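
Crash-fault-tolerant algorithms such as Paxos and Raft typically define "agreement" in terms of majority quorums. The sketch below is illustrative only; the helper names `majority` and `quorums_always_intersect` are made up for this example. It checks by brute force the property these algorithms rely on: any two majorities of the same node set share at least one node.

```python
from itertools import combinations


def majority(n: int) -> int:
    """Smallest number of nodes that forms a majority of n nodes."""
    return n // 2 + 1


def quorums_always_intersect(n: int) -> bool:
    """Brute-force check that any two majority quorums share a node."""
    nodes = range(n)
    q = majority(n)
    return all(
        set(a) & set(b)              # a non-empty intersection is truthy
        for a in combinations(nodes, q)
        for b in combinations(nodes, q)
    )
```

Because two majorities always overlap, a decision accepted by one majority is visible to at least one member of any later majority, which is what keeps later rounds from contradicting earlier ones.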

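Paxos Algorithm

Paxos is a classic distributed consensus algorithm. Basic (single-decree) Paxos runs in two phases, each a request/response pair: Prepare/Promise and Accept/Accepted. In phase 1, a proposer sends a proposal number to the acceptors, and each acceptor that has not seen a higher number promises to ignore lower-numbered proposals and reports any proposal it has already accepted. In phase 2, the proposer asks the acceptors to accept a value; the value is chosen once a majority of acceptors have accepted it, and learners are then informed of the outcome.

Code example: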

## Phase 1: Prepare / Promise
1. proposer -> acceptors: prepare (n), where n is higher than any proposal number this proposer has used before
2. acceptor -> proposer: if n is higher than every prepare it has already answered, promise (n, m, w), where (m, w) is the highest-numbered proposal the acceptor has accepted so far (or none); the acceptor will now ignore proposals numbered below n
   otherwise the acceptor rejects or ignores the prepare

## Phase 2: Accept / Accepted
3. once a majority of acceptors have promised, proposer -> acceptors: accept (n, v), where v is the value w from the highest-numbered promise, or the proposer's own value if no acceptor reported one
4. if the acceptor has not promised a number higher than n, it accepts (n, v) and sends an accepted message to all learners
   else it rejects the proposal, and the proposer may retry with a higher n

## Learn
5. a learner learns that v was chosen once a majority of acceptors have accepted (n, v)
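
The steps above can be sketched in Python as follows. This is a minimal, single-process illustration of one Paxos round, not a production implementation: messages are plain method calls, there is no networking or persistence, and the names `Acceptor`, `on_prepare`, `on_accept`, and `propose` are chosen for this example only.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Acceptor:
    promised_n: int = -1               # highest proposal number promised
    accepted_n: int = -1               # number of the accepted proposal, if any
    accepted_v: Optional[str] = None   # value of the accepted proposal, if any

    def on_prepare(self, n: int) -> Optional[Tuple[int, Optional[str]]]:
        """Phase 1: promise to ignore proposals numbered below n."""
        if n > self.promised_n:
            self.promised_n = n
            return (self.accepted_n, self.accepted_v)
        return None  # reject: an equal or higher number was already promised

    def on_accept(self, n: int, v: str) -> bool:
        """Phase 2: accept unless a higher-numbered promise was made."""
        if n >= self.promised_n:
            self.promised_n = n
            self.accepted_n = n
            self.accepted_v = v
            return True
        return False


def propose(acceptors: List[Acceptor], n: int, value: str) -> Optional[str]:
    """Run one round; return the chosen value, or None if the round failed."""
    quorum = len(acceptors) // 2 + 1
    promises = [p for p in (a.on_prepare(n) for a in acceptors) if p is not None]
    if len(promises) < quorum:
        return None
    # Adopt the value of the highest-numbered proposal already accepted, if any.
    prev_n, prev_v = max(promises, key=lambda p: p[0])
    chosen = prev_v if prev_n >= 0 else value
    acks = sum(a.on_accept(n, chosen) for a in acceptors)
    return chosen if acks >= quorum else None
```

Running `propose` twice against the same acceptors with increasing numbers shows the key safety rule: the second proposer ends up re-proposing the value that was already chosen rather than its own.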
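
Raft Algorithm

Raft is a distributed consensus algorithm designed to be easy to understand and implement. It decomposes consensus into three parts: leader election, log replication, and safety. In leader election, nodes vote for one another to choose a leader for a given term. In log replication, the leader copies its log entries to the other nodes so that all logs converge. Safety is not a separate runtime phase but a set of rules (restrictions on who may be elected and on when an entry counts as committed) which ensure that every node applies the same commands to its state machine in the same order.

Code example: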

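## Leader election
1. each follower runs a randomized election timer and resets it whenever it hears from the current leader
2. if the timer expires, the follower increments its term, becomes a candidate, votes for itself, and requests votes from the other nodes
3. a node grants its vote if it has not yet voted in that term and the candidate's log is at least as up-to-date as its own; a candidate that collects votes from a majority becomes leader

## Log replication
4. the leader appends new log entries to its own log and sends AppendEntries messages to each follower
5. a follower accepts the entries only if its log contains the entry that precedes them (same index and term); it then deletes any conflicting entries and appends the ones it is missing

## Safety
6. if a follower rejects the entries, the leader retries with earlier entries until the two logs match
7. an entry is committed once the leader has replicated it on a majority of nodes in its current term
8. when a node learns that a log entry is committed, it applies the command to its own state machine, in log order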
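
The follower side of these steps can be sketched as below. This is a simplified illustration under stated assumptions: there are no timers, networking, commit index, or persistence, log indices are 1-based with 0 meaning "empty", and the class and method names (`Follower`, `on_request_vote`, `on_append_entries`) are invented for this example rather than taken from any Raft library.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

Entry = Tuple[int, str]  # (term, command)


@dataclass
class Follower:
    current_term: int = 0
    voted_for: Optional[str] = None
    log: List[Entry] = field(default_factory=list)

    def last_log(self) -> Tuple[int, int]:
        """Return (term, index) of the last entry, or (0, 0) if the log is empty."""
        return (self.log[-1][0], len(self.log)) if self.log else (0, 0)

    def on_request_vote(self, term: int, candidate: str,
                        last_term: int, last_index: int) -> bool:
        if term < self.current_term:
            return False
        if term > self.current_term:
            self.current_term, self.voted_for = term, None
        # Grant only if the candidate's log is at least as up-to-date as ours.
        up_to_date = (last_term, last_index) >= self.last_log()
        if self.voted_for in (None, candidate) and up_to_date:
            self.voted_for = candidate
            return True
        return False

    def on_append_entries(self, term: int, prev_index: int, prev_term: int,
                          entries: List[Entry]) -> bool:
        if term < self.current_term:
            return False
        if term > self.current_term:
            self.current_term, self.voted_for = term, None
        # Consistency check: we must hold the entry preceding the new ones.
        if prev_index > len(self.log):
            return False
        if prev_index > 0 and self.log[prev_index - 1][0] != prev_term:
            return False
        # Drop a conflicting suffix, then append whatever is missing.
        for offset, entry in enumerate(entries):
            idx = prev_index + offset  # 0-based slot for this entry
            if idx < len(self.log) and self.log[idx][0] != entry[0]:
                del self.log[idx:]
            if idx >= len(self.log):
                self.log.append(entry)
        return True
```

Comparing `(last_term, last_index)` as a tuple encodes the "at least as up-to-date" rule: a higher last term wins, and for equal terms the longer log wins.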
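
PBFT Algorithm

PBFT (Practical Byzantine Fault Tolerance) is a distributed consensus algorithm that tolerates Byzantine faults: with 3f + 1 replicas it stays correct even if up to f of them behave arbitrarily. Its normal-case operation has three phases (pre-prepare, prepare, and commit), supplemented by a periodic checkpoint mechanism. The primary is not elected during pre-prepare; it is determined by the current view number and replaced through a view change if it appears faulty. In the pre-prepare phase, the primary assigns a sequence number to a client request and multicasts it. In the prepare phase, replicas exchange messages to agree on that ordering within the view. In the commit phase, replicas confirm that enough of them are prepared, then execute the request, so that non-faulty nodes reach the same result. Checkpoints let replicas agree on a stable state and discard old log entries, so that a crashed or lagging node cannot leave the system inconsistent.

Code example: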

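## Pre-prepare
1. client sends a request to the primary
2. primary assigns a sequence number and multicasts a signed pre-prepare message (view number, sequence number, request digest) to all backups

## Prepare
3. each backup that accepts the pre-prepare multicasts a prepare message with the same view number, sequence number, and digest to all other replicas
4. a replica is prepared once it holds the pre-prepare and 2f matching prepare messages from different replicas

## Commit
5. each prepared replica multicasts a commit message, then waits for 2f + 1 matching commit messages (its own may count)
6. replicas compare the digest in each received message against the digest from the pre-prepare and discard mismatches
7. replicas execute the request in sequence-number order and reply to the client, which waits for f + 1 matching replies

## Checkpoint
8. periodically, every replica multicasts a checkpoint message that includes a state digest, a sequence number, and a signature
9. replicas check the state digest and sequence number; a checkpoint becomes stable once 2f + 1 matching checkpoint messages have been collected
10. replicas keep the stable checkpoint and discard log entries with sequence numbers at or below it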
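
The quorum thresholds in the steps above can be sketched as follows. This is a bookkeeping-only illustration that assumes messages have already been authenticated and digest-checked; it omits view changes, checkpoints, and execution, and the names `Replica`, `max_faulty`, `prepared`, and `committed_local` are chosen for this example.

```python
from collections import defaultdict
from typing import Dict, Set, Tuple

Key = Tuple[int, int, str]  # (view, sequence number, request digest)


def max_faulty(n: int) -> int:
    """Largest f such that n >= 3f + 1."""
    return (n - 1) // 3


class Replica:
    def __init__(self, n: int) -> None:
        self.f = max_faulty(n)
        self.pre_prepared: Set[Key] = set()
        self.prepares: Dict[Key, Set[int]] = defaultdict(set)
        self.commits: Dict[Key, Set[int]] = defaultdict(set)

    def on_pre_prepare(self, key: Key) -> None:
        self.pre_prepared.add(key)

    def on_prepare(self, key: Key, sender: int) -> None:
        self.prepares[key].add(sender)   # a set ignores duplicate senders

    def on_commit(self, key: Key, sender: int) -> None:
        self.commits[key].add(sender)

    def prepared(self, key: Key) -> bool:
        """Pre-prepare plus 2f matching prepares from distinct replicas."""
        return key in self.pre_prepared and len(self.prepares[key]) >= 2 * self.f

    def committed_local(self, key: Key) -> bool:
        """Prepared plus 2f + 1 matching commits from distinct replicas."""
        return self.prepared(key) and len(self.commits[key]) >= 2 * self.f + 1
```

With n = 4 replicas, `max_faulty` gives f = 1, so a request needs 2 matching prepares to become prepared and 3 matching commits before it may be executed.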