On February 28, Ultrain Key Account Manager Zhou Ye invited Daniel Wang, Ultrain’s technical director for consensus, to join a live broadcast and explain Ultrain’s cutting-edge consensus protocol in detail.

1. Briefly introduce the Ultrain consensus

The Ultrain consensus is based on the BFT (Byzantine Fault Tolerance) algorithm. It adds an independently developed random number contract and a series of engineering optimizations to form a unique RPoS algorithm. Nodes must stake tokens and pass identity authentication before joining the consensus network. Under normal circumstances, consensus completes in two rounds of voting, BA0 and BA1. In each round, more than 2/3 of the verification nodes must agree before a block can be produced. If a network anomaly or malicious behavior prevents agreement after two rounds, the protocol enters the BAX stage, which adjusts the voting process and the block production conditions to speed up recovery.

To suit different application scenarios, the voting time can be configured according to network conditions. Two configurations are currently deployed on the main network: one produces a block every 10 seconds and the other every 2 seconds.

2. The main innovations of the Ultrain consensus

• An independently developed random number contract. In each round it randomly selects the block proposal nodes (proposers) and the verification nodes (voters), which addresses the problems of proposers refusing to submit and of selections being predicted in advance. Compared with VDF-based selection, where the number of proposal nodes cannot be controlled, the random number contract fixes the number of proposals and so reduces the bandwidth used by consensus messages.

• BLS aggregate signatures, which reduce network bandwidth and storage requirements, improve system security, and effectively address long-range attacks and block header forgery.

• Support for multiple network access methods (static links, NAT traversal, etc.), so that home nodes can participate in consensus.

• An intelligent timing synchronization mechanism: each node aligns its timing automatically, so hardware with different computing performance can join the network.
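To make the idea of a fixed number of proposers concrete, here is a minimal sketch. It assumes the random number contract yields a per-round seed and that members are ranked by hashing that seed with their identifier. The function name `select_proposers`, the use of SHA-256, and the ranking rule are illustrative assumptions, not Ultrain’s actual selection procedure.

```python
import hashlib
from typing import List, Tuple


def select_proposers(seed: bytes, committee: List[str], count: int = 2) -> Tuple[List[str], List[str]]:
    """Split the committee into (proposers, voters) for one round.

    Hypothetical rule: rank members by SHA-256(seed || member id) and take
    the `count` lowest digests as proposers; everyone else is a voter.
    """
    if count > len(committee):
        raise ValueError("committee is smaller than the number of proposers")
    ranked = sorted(
        committee,
        key=lambda member: hashlib.sha256(seed + member.encode("utf-8")).digest(),
    )
    return ranked[:count], ranked[count:]
```

Because the ranking depends only on the seed and the member list, every node that sees the same seed derives the same proposers, and the number of proposers never varies from round to round.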

3. What is the current operating status of the consensus?

Since the main net launched in early 2019, 1 main chain and 8 side chains have been deployed, with more than 700 network nodes in total. The network has run smoothly, with an effective block rate above 99.9%.

The degree of decentralization is high; the nodes are mainly distributed across East China, Beijing, Shanghai, Guangzhou, Taizhou, Hangzhou, and Nanjing.

4. Can you briefly introduce the Ultrain consensus architecture?

Consensus has a two-tier structure:

• Network layer (layer0): Gossip network, responsible for automatic node discovery, automatic chain building, identity verification, dynamic topology maintenance, etc.

• Consensus layer (layer1): implementation of the Ultrain consensus algorithm.

[Figure: consensus module diagram]

5. What are the functions and advantages of the Ultrain network layer?

In a decentralized network, node-to-node connectivity is critical: it determines whether consensus messages and synchronized data reach every part of the system promptly and accurately. Because nodes join and leave unpredictably, the system must also handle a changing network topology, connecting to other nodes promptly when a neighbor disconnects, to keep the system robust.

The Ultrain network layer starts from helper (seed) nodes, discovers other peers in the network through a series of point-to-point message exchanges, and establishes connections with randomly chosen peers. When a link goes down or the network becomes unstable, it promptly re-selects neighbor nodes, so the topology adapts dynamically.

The main functions of the Ultrain network layer are node discovery, identity authentication, and dynamic topology management.

The Ultrain network layer has the following advantages:

• Convenient automatic node discovery, with random and limited interconnection between nodes, which preserves security while remaining sufficiently decentralized.

• Reverse-link technology lets nodes with public IP addresses actively connect to private-network nodes.

• NAT traversal lets the network reach LAN nodes behind a NAT gateway, building a more random and decentralized network and making home consensus nodes possible.

• Support for multiple access methods, such as TCP, KCP, static links, and dynamic p2p links.

6. What roles do consensus nodes have, and what are the responsibilities of each role?

After a node stakes, it automatically becomes a committee member. The committee consists of proposer nodes and voter nodes. Nodes that have not staked are listen nodes.

• Proposer node: a block proposer randomly selected from the committee members using a random number, responsible for proposing the block and sending the propose message in this round.

• Voter nodes: committee members, which must have staked. A committee member not selected as a proposer in this round automatically becomes a voter. A voter verifies the block (propose message) put forward by the proposer and, if verification passes, signs and sends an echo message (its vote). If verification fails, it discards the propose message; because the proposer may be malicious, the voter can initiate a penalty against it.

• Listen nodes: unstaked nodes with no right to propose or verify blocks; they still advance block by block along with the rest of the system.

7. What are the main message types of the consensus protocol?

• Propose message: the block proposal sent by a proposer node, which voter nodes receive and verify. Its content mainly includes the block data (including transactions) and the proposer’s signature, which confirms its identity.

• Echo message: after a voter verifies and accepts a propose message, it sends an echo message to notify other nodes. Its content mainly includes the block id, the voter’s own signature, the BLS signature, round information, etc.

• Sync message: the block synchronization message, used to transfer block data between two nodes. Its content mainly includes the block id, start block, end block, message sequence number, etc.
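The three message types can be pictured as simple records. The sketch below only mirrors the fields listed above; the class names, field names, and field types are hypothetical and do not reflect Ultrain’s wire format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProposeMsg:
    block_data: bytes          # serialized block, including its transactions
    proposer_signature: bytes  # lets voters confirm the proposer's identity


@dataclass(frozen=True)
class EchoMsg:
    block_id: str
    voter_signature: bytes     # the voter's own signature
    bls_signature: bytes       # share that can later be aggregated
    round_info: str            # assumed encoding, e.g. "ba0" or "ba1"


@dataclass(frozen=True)
class SyncMsg:
    block_id: str
    start_block: int
    end_block: int
    sequence_number: int
```

Marking the records as frozen keeps a received message immutable, which is a convenient property when the same message is counted in a tally and later kept as evidence.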

8. What is the node voting process?

Ultrain adopts the BFT-based RPoS algorithm. Normally, consensus is reached through 2 rounds of voting.

• First round (ba0 stage): proposer nodes propose blocks, voter nodes verify them, and the first vote is cast on the blocks proposed in this round.

• Second round (ba1 stage): based on the results of the first round, all nodes vote again to confirm. Only when more than 2/3 of the voters agree on this round’s block can the block be produced and consensus move on to the next block. Each round lasts 5 seconds, so a block is normally produced in 2 rounds, or 10 seconds.

• Bax round: when an abnormal situation such as a network storm prevents consensus after 2 rounds of voting, the protocol enters the bax round. Block production time is then indeterminate, and producing a block still requires more than 2/3 of the nodes to agree on this round’s block.

9. Can you walk through each of these rounds in detail?

ba0 stage:

Block verification and first voting stage

  1. Using the true random number provided by the random number contract, the system selects two proposer nodes from the whole committee to propose blocks in this round. The two act as main and backup proposers, and the main proposer’s block has the higher priority. Two proposers are selected instead of one so that, if one selected node fails abnormally and cannot send its propose message, the round can still produce a block.

  2. The two proposer nodes each broadcast their propose message to the voter nodes across the network. Each voter executes the transactions in the proposed block to verify it, and checks the proposer’s identity through the signature in the message. After verification passes, the voter broadcasts an echo message carrying its own signature (the first vote) to the entire network.

  3. After the 5 seconds of the ba0 stage, each node counts the first-round votes (echo messages) it received during this period. There are three cases:

• Only one propose message has voter signatures from more than 2/3 of the committee members: the block in that propose message becomes the candidate block, that is, the voting target of the ba1 stage.

• Both propose messages have voter signatures from more than 2/3 of the committee members: the block in the higher-priority propose message (the one issued by the main proposer) is selected as the candidate block, that is, the voting target of the ba1 stage.

• Neither propose message has voter signatures from more than 2/3 of the committee members: the ba1 stage votes on an empty block, that is, this round produces an empty block.
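The three cases above can be sketched as a single tally function. This is a minimal illustration, assuming each node keeps a map from block id to the set of voters whose echo it received and a priority rank per block (0 for the main proposer). The names `ba0_candidate`, `has_quorum`, and `EMPTY_BLOCK` are hypothetical.

```python
from typing import Dict, Set

EMPTY_BLOCK = "EMPTY"  # placeholder id for the empty block (assumption)


def has_quorum(votes: int, committee_size: int) -> bool:
    """True when votes are strictly more than 2/3 of the committee."""
    return 3 * votes > 2 * committee_size


def ba0_candidate(echoes: Dict[str, Set[str]], priority: Dict[str, int], committee_size: int) -> str:
    """Pick the ba1 voting target from the first-round echoes.

    echoes:   block id -> set of voter ids that echoed it
    priority: block id -> rank, where a lower rank means higher priority
    """
    qualified = [
        block_id
        for block_id, voters in echoes.items()
        if has_quorum(len(voters), committee_size)
    ]
    if not qualified:
        return EMPTY_BLOCK
    return min(qualified, key=lambda block_id: priority[block_id])
```

Using integer arithmetic (`3 * votes > 2 * committee_size`) avoids floating-point rounding at the boundary: with 9 committee members, 6 votes do not qualify and 7 do.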

[Figure: BA0 stage flow chart]

ba1 stage:

Second voting confirmation phase

  1. Each committee member casts its second vote according to the candidate selected at the end of the ba0 stage, and broadcasts the echo message with its signature to the entire network.

  2. After 5 seconds the ba1 stage ends, and each node counts the second-round votes it received. There are two cases:

• The number of voter signatures for a block (including the empty block) exceeds 2/3 of the committee members: the node takes this block as the confirmed block of the round, adds it to the chain, and updates the world state.

• No block has voter signatures from more than 2/3 of the committee members: the system has failed to produce a block within 10 seconds and enters the bax round.

[Figure: BA1 stage flow chart]

bax stage:

Various abnormal conditions can push the consensus system into the bax stage, such as the failure of more than 1/3 of the committee members or an unreachable network.

  1. The node repeats the ba1 operation in rounds of 5 seconds, sending an echo message every 5 seconds to vote. It is not allowed to change its vote.

  2. Votes are counted at the end of each 5-second round. If the number of voter signatures for a block exceeds 2/3 of the committee members, the bax stage ends, and the node takes this block as the confirmed block of the round and updates the world state.

  3. In the gap between votes, the local node asks its neighbors whether they already hold a higher block than it does. If a neighbor is ahead, the node concludes that it has fallen behind the consensus network: it abandons the current consensus, starts block synchronization, ends the bax stage by pulling the missing blocks, and enters the next round of consensus. If no neighbor is ahead, the bax stage can only be ended by continuing consensus.

  4. If the total number of rounds reaches 20 and the node still cannot end the bax stage, the ba1 votes of the committee are considered split (typically, half support the main proposer and the other half support the backup proposer), so the current votes can never reach a 2/3 consensus. From the 21st round on, every node votes for the empty block regardless of what it voted for in the ba1 stage, and only an empty block can be produced in this round.
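A compact way to read the four bax rules is as a per-round decision. The sketch below is illustrative only: it assumes votes are kept as a map from voter id to block id, that neighbor heights are already known, and that rounds are numbered so the empty-block fallback starts at round 21. The names `bax_vote`, `bax_step`, and `SPLIT_ROUND_LIMIT` are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Optional, Tuple

EMPTY_BLOCK = "EMPTY"   # placeholder id for the empty block (assumption)
SPLIT_ROUND_LIMIT = 20  # from round 21 on, everyone votes for the empty block


def bax_vote(round_no: int, ba1_vote: str) -> str:
    """Vote to send in a bax round: repeat the ba1 vote until the limit."""
    return ba1_vote if round_no <= SPLIT_ROUND_LIMIT else EMPTY_BLOCK


def bax_step(
    votes: Dict[str, str],
    committee_size: int,
    own_height: int,
    neighbor_heights: List[int],
) -> Tuple[str, Optional[str]]:
    """Outcome of one bax round: commit a block, start syncing, or continue."""
    tally = Counter(votes.values())
    for block_id, count in tally.items():
        if 3 * count > 2 * committee_size:  # strictly more than 2/3
            return ("commit", block_id)
    if any(height > own_height for height in neighbor_heights):
        return ("sync", None)  # a neighbor is ahead, so pull blocks instead
    return ("continue", None)
```

With a 5/4 split among 9 voters, `bax_step` keeps returning `("continue", None)` until either a neighbor reports a higher block or the votes switch to the empty block after the round limit.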

[Figure: BAX stage flow chart]

10. How does the state of a single node change during the consensus process?

Consensus nodes have 5 states:

• init: the node is in its initial state, before it starts consensus

• ba0: the node is in the consensus ba0 stage

• ba1: the node is in the consensus ba1 stage

• bax: the node is in the consensus bax stage

• sync: the node is in the block synchronization state

State transition path

  1. Consensus starts after creation: init → ba0

  2. ba0 ends after 5 seconds: ba0 → ba1

  3. ba1 ends after 5 seconds and more than 2/3 of the committee members agree; the block is produced normally and the node enters the next round of consensus: ba1 → ba0

  4. ba1 ends after 5 seconds and the committee members have not reached agreement: ba1 → bax

  5. A 5-second bax round ends and the committee members have reached agreement: bax → ba0

  6. A 5-second bax round ends and the committee members have not reached agreement: the node stays in the bax stage and increments the round counter (round++).

  7. A 5-second bax round ends, the committee members have not reached agreement, and a neighbor can provide block synchronization: bax → sync

  8. After synchronization completes, the node re-enters consensus: sync → ba0
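The eight transitions can be written down as a small state machine. This is a sketch under assumptions: the `State` enum, the `next_state` function, and its boolean flags (`agreed`, `neighbor_ahead`, `sync_done`) are hypothetical names chosen to mirror the list above, and the 5-second timers are left out.

```python
from enum import Enum


class State(Enum):
    INIT = "init"
    BA0 = "ba0"
    BA1 = "ba1"
    BAX = "bax"
    SYNC = "sync"


def next_state(state: State, agreed: bool = False,
               neighbor_ahead: bool = False, sync_done: bool = False) -> State:
    """Return the state a node enters when its current stage ends."""
    if state is State.INIT:
        return State.BA0                        # path 1
    if state is State.BA0:
        return State.BA1                        # path 2
    if state is State.BA1:
        return State.BA0 if agreed else State.BAX   # paths 3 and 4
    if state is State.BAX:
        if agreed:
            return State.BA0                    # path 5
        # path 7 if a neighbor can serve blocks, otherwise path 6 (stay)
        return State.SYNC if neighbor_ahead else State.BAX
    if state is State.SYNC:
        return State.BA0 if sync_done else State.SYNC   # path 8
    raise ValueError("unknown state: %r" % (state,))
```

For example, `next_state(State.BAX, agreed=False, neighbor_ahead=True)` yields `State.SYNC`, while the same call without a neighbor ahead keeps the node in `State.BAX` for another round.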

11. What are the security considerations?

Blockchains, especially public chains, differ from traditional software systems: first, the code is open source, and second, anyone can join the system. Malicious behavior is therefore unavoidable. Our RPoS consensus is a PoS consensus mechanism: to participate in the core system, a node must stake tokens, so damaging the system harms the node’s own interest as a token staker. From a rational actor’s perspective, the vast majority of nodes will not harm their own interests by acting maliciously. This is the core foundation of PoS system stability, and it keeps most nodes honest. However, we cannot rule out a few nodes acting maliciously, so our consensus layer must be able to identify the offenders, punish them, and reward the honest nodes for block production.

Here are some examples of how security problems are addressed:

  1. Identity authentication: signatures are verified when a network connection is established. Only after its identity is verified can a node connect to the system and participate in consensus.

  2. Multi-signature attack: verification nodes can detect whether a node proposes multiple blocks or casts multiple votes. On detecting this, they upload the evidence and initiate a punishment vote, and the offender is removed from the committee as a malicious node.

  3. Long-range attack: with aggregate signature technology, the aggregated signatures of all the verifiers of a block are stored in its block header. A small number of malicious nodes cannot forge the aggregated signatures of more than two-thirds of the committee members, which rules out long-range attacks.

  4. Double-spend attack: the consensus protocol is based on the BFT algorithm and does not allow forks, which in theory rules out double-spend attacks.
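As an illustration of how a verification node might notice a node signing conflicting messages, the sketch below remembers the first block id each signer committed to for a given height and message kind, and reports a conflict when a different block id shows up. The class `DoubleSignDetector`, its key layout, and the choice of which message kinds to track are assumptions made for the example; the source does not describe the actual detection or evidence format.

```python
from typing import Dict, Optional, Tuple


class DoubleSignDetector:
    """Flags a signer that signs two different blocks for the same slot."""

    def __init__(self) -> None:
        # (signer, block height, message kind) -> first block id seen
        self._seen: Dict[Tuple[str, int, str], str] = {}

    def observe(self, signer: str, height: int, kind: str,
                block_id: str) -> Optional[Tuple[str, str]]:
        """Record a signed message; return the conflicting pair, if any."""
        key = (signer, height, kind)
        first = self._seen.setdefault(key, block_id)
        if first != block_id:
            return (first, block_id)  # evidence: two block ids, one slot
        return None
```

A repeated message for the same block is harmless and returns `None`; only a second, different block id for the same signer, height, and kind produces a pair that could be submitted as evidence.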