The scope of Eth1-Eth2 merger

https://ethresear.ch/t/the-scope-of-eth1-eth2-merger/7362

@mkalinin wrote:

Thanks to @timbeiko, @djrtwo, @gballet for discussions and comments.

This write-up is a follow-up to @djrtwo’s Eth1+eth2 client relationship. Taking the separation of duties outlined in that document as its foundation, it aims to define the scope of work required to deliver the merger.

Alongside the scope, it provides thoughts on the implementation of particular components of eth1-engine and eth2-client in the Eth1 shard context. The level of detail varies from section to section. This information is intended as a starting point for further discussions around specifications and implementation.

NOTE: Eth1x outcomes are not strictly required for the technical side of the merger. Despite this, the prerequisites include Eth1x to highlight how substantial its efforts are for validator adoption and user experience after the merger.

Consensus

  • Prerequisites
    • Phase 1
      • State transition
      • Fork choice
      • Validator
    • Eth1x
      • Stateless execution
  • Scope
    • Eth1 engine
      • Block processing
      • Block production
      • External fork choice
    • Eth1 shard
      • State transition function
      • Validator duties
    • Beacon chain
      • Eth1 shard as a source of deposits
      • Validator capabilities
      • Eth1 shard committee selection
    • Eth1 rewards

Eth1 engine

To become compatible with Eth1 shard, eth1-engine will have to expose block processing and block production via its API. Currently, the eth_getWork endpoint produces a block but doesn’t return its full structure; block processing seems trivial to expose as well (see InsertChain). The scope of work on these two also includes getting rid of Ethash and handling the concurrency case where block production depends on block processing.

The fork choice of Eth1 shard will be entirely up to eth2-client. Since advance knowledge of the head of the chain is essential for transaction pool optimizations, eth1-engine must be aware of reorgs that occur on the eth2-client side. There could be a setHead API endpoint leveraging the SetHead function to expose the fork choice. This change also implies that reorgs are removed from the block processing flow of eth1-engine.
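The three entry points discussed above could be sketched roughly as follows. This is a toy model, not the API of any existing client: the class `Eth1Engine`, its methods `produce_block`, `insert_block` and `set_head`, and the sha256-based block hash are all hypothetical stand-ins chosen for illustration.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Eth1Block:
    parent_hash: str
    transactions: Tuple[str, ...]

    @property
    def block_hash(self) -> str:
        # Toy hash for illustration; a real Eth1 block hash is computed differently.
        payload = self.parent_hash + "|" + ",".join(self.transactions)
        return hashlib.sha256(payload.encode()).hexdigest()


class Eth1Engine:
    """Hypothetical eth1-engine facade driven by eth2-client."""

    def __init__(self, genesis: Eth1Block) -> None:
        self.blocks: Dict[str, Eth1Block] = {genesis.block_hash: genesis}
        self.head: str = genesis.block_hash
        self.tx_pool: List[str] = []

    def produce_block(self, parent_hash: str) -> Optional[Eth1Block]:
        """Block production on top of a parent chosen by the caller."""
        if parent_hash not in self.blocks:
            return None
        return Eth1Block(parent_hash, tuple(self.tx_pool))

    def insert_block(self, block: Eth1Block) -> bool:
        """Block processing only: no fork choice, the head does not move."""
        if block.parent_hash not in self.blocks:
            return False
        self.blocks[block.block_hash] = block
        return True

    def set_head(self, block_hash: str) -> bool:
        """External fork choice: eth2-client reports the new head."""
        if block_hash not in self.blocks:
            return False
        self.head = block_hash
        return True
```

Keeping `insert_block` free of any head update mirrors the idea that reorgs are decided outside of the engine and only communicated through `set_head`.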

Eth1 shard

Eth1 shard state transition will invoke eth1-engine to process a block and verify the result. To create a block, Eth1 shard block producers will have to call eth1-engine as well. Attesters assigned to Eth1 shard will have to execute the Eth1 block and verify the result of the execution. These extensions to the shard transition are subjects for the Eth1 shard specification.

Considering potential usage, a body of Eth1 shard block should at least contain the following data:

  • Eth1-styled block hash. It will likely be involved in block production to point eth1-engine to the parent block.
  • Eth1 state root. With this data, building proofs against Eth1 state becomes possible, enabling various use cases that cross the eth1-eth2 border.
  • Transaction receipts. The other way to establish eth1-eth2 communication is by utilizing the receipts.
  • RLP encoded Eth1 block. Primary payload of a block of Eth1 shard.

The ShardState object will contain a hash tree root of this data in its data field, supplying the beacon chain with all of the Eth1 block essentials through Merkle proofs.

Note: exposing transaction receipts means additional work for the Eth1 shard block proposer. It will have to parse the RLP-encoded receipts returned by eth1-engine and publish them in a format suitable for the beacon chain.
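To illustrate how a root over these four fields lets the beacon chain check a single field, here is a minimal sketch. It uses a plain binary sha256 Merkle tree rather than the actual SSZ hash tree root, and the names `Eth1ShardBlockBody`, `state_root_proof` and `verify_state_root` are hypothetical.

```python
import hashlib
from dataclasses import dataclass
from typing import List


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


@dataclass(frozen=True)
class Eth1ShardBlockBody:
    eth1_block_hash: bytes
    eth1_state_root: bytes
    receipts: bytes    # receipts in a beacon-chain-friendly encoding (assumed)
    rlp_block: bytes   # RLP encoded Eth1 block

    def leaves(self) -> List[bytes]:
        return [h(self.eth1_block_hash), h(self.eth1_state_root),
                h(self.receipts), h(self.rlp_block)]

    def root(self) -> bytes:
        a, b, c, d = self.leaves()
        return h(h(a + b) + h(c + d))

    def state_root_proof(self) -> List[bytes]:
        # Sibling leaf of the state root, then the sibling of their parent node.
        a, _, c, d = self.leaves()
        return [a, h(c + d)]


def verify_state_root(state_root: bytes, proof: List[bytes], root: bytes) -> bool:
    sibling_leaf, sibling_node = proof
    node = h(sibling_leaf + h(state_root))  # the state root is the right child
    return h(node + sibling_node) == root
```

A verifier holding only `root()` can thus accept an Eth1 state root without seeing the RLP payload or the receipts.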

Beacon chain

The Eth1 data voting mechanism currently used by Phase 0 may improve significantly after the merger. Moreover, it can become the first use case leveraging the new consensus. Once Eth1 shard kicks off, the deposit root can be updated as frequently as cross-linking of the Eth1 state root happens. Supplemented by a Merkle proof, the deposit root becomes verifiable by any beacon chain client, allowing for instant inclusion of new deposits. Note that to verify Merkle proofs created against the Eth1 state root or, e.g., polynomial commitments that could replace them, the beacon client will have to be aware of the verification algorithm. For example, if a beacon client had to verify the Eth1 state root today, it would have to support Merkle Patricia Trie and the keccak256 hash function.

There is a more convenient way of deposit processing that utilizes Eth1 transaction receipts. Beacon chain proposers will be able to parse receipts corresponding to the DepositEvent and induce new deposits upon cross-linking of Eth1 shard.

According to the current state of the art, a validator attesting to or producing on Eth1 shard will have to maintain the full Eth1 state. Alleviating this requirement is one of the objectives of Eth1x research efforts. Once stateless execution is in place, attesters will be freed from maintaining the state, but producers will not. This kind of separation creates a demand for validator capability management. It could be implemented by adding capability flags as yet another field of the validator record. Such a change implies a new operation on the beacon chain to update capability flags. The selection of the Eth1 shard committee will have to keep track of validator capabilities, making committee selection yet another subject for the Eth1 shard specification.
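A minimal sketch of what capability flags could look like, assuming a bit field on the validator record. The flag names, the `Validator` shape and the helper functions are hypothetical and not part of any specification.

```python
from dataclasses import dataclass
from enum import IntFlag
from typing import Iterable, List


class Capability(IntFlag):
    NONE = 0
    ETH1_ATTESTER = 1  # can execute Eth1 blocks (statelessly, once available)
    ETH1_PRODUCER = 2  # maintains the full Eth1 state and can build blocks


@dataclass
class Validator:
    index: int
    capabilities: Capability = Capability.NONE


def update_capabilities(validator: Validator, flags: Capability) -> None:
    """Stand-in for the new beacon chain operation that updates the flags."""
    validator.capabilities = flags


def eligible(validators: Iterable[Validator], required: Capability) -> List[int]:
    """Indices of validators that have every required capability bit set."""
    return [v.index for v in validators
            if (v.capabilities & required) == required]
```

Committee selection for Eth1 shard would then sample only from `eligible(...)` for the duty in question.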

Eth1 rewards

After the merger takes place and PoW no longer exists on the mainnet, there will be a significant issuance cut from getting rid of block and uncle rewards.

Shard block proposer rewards in Phase 1 are described in this issue and consist of the following parts:

  • block proposer reward
  • EIP-1559 fee per byte in the block data
    • transaction fees

Beacon chain state transition is responsible for processing the first and the second parts, i.e., the core protocol pays the Eth1 block proposer for the proposal and subtracts the EIP-1559 fee from the proposer’s balance. Charging of transaction fees (if any) is assumed to be outside of the core protocol.
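The core-protocol part of this accounting is simple arithmetic. The sketch below assumes integer amounts in some base unit; the function name and the example numbers are hypothetical.

```python
def proposer_balance_delta(proposer_reward: int,
                           block_size_bytes: int,
                           fee_per_byte: int) -> int:
    """Net core-protocol balance change for a shard block proposer.

    Transaction fees are deliberately absent: they are assumed to be
    handled outside of the core protocol.
    """
    eip1559_fee = block_size_bytes * fee_per_byte
    return proposer_reward - eip1559_fee


# A reward of 1000 units and a 100-byte block at 3 units per byte.
delta = proposer_balance_delta(1000, 100, 3)
```

Note that the delta can be negative when the per-byte fee outweighs the proposer reward.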

In Phase 2, when a shard block becomes executable, gas used by a block is going to replace the size in the scheme above. This retains the EIP-1559 logic in the core protocol. Following the least invasive strategy for the merger, we might want to keep eth1-engine responsible for maintaining the EIP-1559 fees, with follow-up changes in Phase 2. In this case, the gas limit calculation could be moved to the core protocol and explicitly passed to eth1-engine during the block creation process.

Eth1 shard proposers could simply utilize the coinbase for the transaction fee payments. Another option would be to include transaction fees in a shard block explicitly and make the core protocol responsible for depositing these fees to the validator account. This trade-off between a validator maintaining several identities and the core protocol complication is yet to be resolved.

Yet another expenditure that Eth1 shard validators will have to take into account is the storage fee required to maintain Eth1 state. Fortunately, if an Eth1-capable validator is not currently participating in the Eth1 shard committee, it can drop the state, diminishing the storage fees, and sync it back when needed.

Network

  • Prerequisites
    • Phase 1
      • Subnet topics
      • Shard chain sync
    • Eth1x
      • State sync
      • History data storage
  • Scope
    • Discovery service
    • Network reputation
    • Eth1 shard sync

Discovery service

In Eth1+eth2 client relationship, Danny reasonably proposes to keep the p2p interfaces of eth2-client and eth1-engine separate. This pair of interfaces should operate under the same network identity. Otherwise, it would be difficult to justify the reputation of peers constituting the Eth1 shard network.

Working under a single identity implies sharing the discovery service between eth2-client and eth1-engine. It would make sense if eth2-client took responsibility for network discovery, providing the discovery service to eth1-engine. This approach to sharing discovery duties likely makes communication in the client-engine pair bi-directional, because of the pattern widely used in eth1 clients of requesting discovered peers on demand.

The other approach to the discovery service of the client-engine pair would be to retain eth1-engine discovery and bootstrap it with the node id used by eth2-client. This way is less invasive to the eth1-engine codebase and still allows for maintaining a shared network identity without requiring client-engine communication to be bi-directional.

Network reputation

Operating under the same network id, the client-engine pair becomes a single subject of reputation. However, managing the reputation of other peers on the network requires additional communication between eth2-client and eth1-engine, because they still split network responsibilities: transaction gossiping and state sync are handled by eth1-engine, while block gossiping is handled by eth2-client. For instance, both parts of the pair will have to notify each other about bad behavior observed on their side of the network, resulting in disconnects and further bans of malicious peers.

Network specifications do not cover reputation heuristics, making them implementation-specific. Thus, they might differ not only between eth1 and eth2 clients but also across various eth2 client implementations. This discrepancy leaves little room for a shared reputation management heuristic, which would probably be stripped down to disconnect signals only. This part of the protocol also requires bi-directional communication in the client-engine pair.

Eth1 shard sync

eth1-engine is going to be responsible for the state sync, using either the currently existing algorithm or whatever other solution Eth1x comes up with. But the orchestration of the sync process should be done by eth2-client, which will initially browse the network to find the most recent finalized checkpoint and then send a command to eth1-engine to start syncing with a particular state root.

The other part of what is currently called Fast Sync in Eth1 is linking the downloaded state to the genesis block by an Ethash-valid chain, verifying that the state belongs to the canonical chain. In the Eth2 world, this kind of verification will look different due to weak subjectivity. Thus, eth2-client will finalize Eth1 shard sync by checking that the state is canonical. It’s highly likely that both parts of the sync process could be done simultaneously.

The other responsibility worth noting in this section is access to Eth1 history data. In general, ancient data will be entirely handled by eth1-engine. Upon finishing the sync process, eth2-client may send a command to eth1-engine to obtain the whole chain of blocks and receipts. However, dealing with the history represented by the chain of blocks and their receipts is rather burdensome. One of the objectives of Eth1x research is to improve history data management, which could significantly impact the UX of Eth1 shard.

Old-fashioned regular block sync could still be utilized for quick state catch-ups after relatively short periods of downtime. However, it would require additional information obtained from the Eth1 shard network, like the number of the Eth1 block corresponding to the head of the chain. If this number is too far from the client’s current state, then it is worth falling back to the state sync or using other techniques, probably borrowed from beam sync. Either way, from the eth2-client perspective, it would be a single command to sync with a particular state root, and the heuristic of that sync should be entirely handled by eth1-engine.
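The division of labor described in this section could look roughly like the sketch below: eth2-client picks the target and issues one command, while the engine decides how to get there. All names and the `MAX_BLOCK_SYNC_GAP` value are hypothetical; the real heuristic is left to eth1-engine implementations.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class FinalizedCheckpoint:
    epoch: int
    eth1_state_root: str
    eth1_block_number: int


def pick_sync_target(checkpoints: Iterable[FinalizedCheckpoint]) -> FinalizedCheckpoint:
    """eth2-client side: the most recent finalized checkpoint seen on the network."""
    return max(checkpoints, key=lambda c: c.epoch)


class Eth1EngineSync:
    """eth1-engine side: a single sync command with an internal heuristic."""

    MAX_BLOCK_SYNC_GAP = 1024  # hypothetical cut-off, in blocks

    def __init__(self, local_block_number: int) -> None:
        self.local_block_number = local_block_number

    def sync(self, target: FinalizedCheckpoint) -> str:
        gap = target.eth1_block_number - self.local_block_number
        # Short gaps are caught up block by block; anything else uses state sync.
        mode = "block_sync" if 0 <= gap <= self.MAX_BLOCK_SYNC_GAP else "state_sync"
        self.local_block_number = target.eth1_block_number
        return mode
```

From the eth2-client point of view the call is always `engine.sync(pick_sync_target(...))`, regardless of which mode the engine ends up using.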

Client

  • Scope
    • Code updates
      • Getting rid of PoW
        • validity conditions
        • block production
        • fork choice
      • Network stack
        • discovery
        • reputation
      • State trie pruning
    • Eth1-Eth2 communication protocol
      • Consensus
        • set head
        • create block
        • insert block
        • finalize block
      • Network
        • discovery
        • reputation
        • sync
    • UX
      • API facade
      • Light client
      • Client tooling

State trie pruning

Cleaning up outdated versions of the state trie is essential to keep the Eth1 state at a sane size during the whole lifetime of the client run. Currently, this mechanism uses a follow distance to decide which state versions are outdated and should undergo pruning. The distance parameter should be big enough to protect the client from an accidental reorg that requires proceeding from an old version of the state that has already been pruned. For example, the geth client uses 128 blocks as the default follow distance on the mainnet.

For the Eth1 shard, a more organic trigger for trie pruning would be checkpoint finalization. Once eth2-client gets a new finalized checkpoint, it could call eth1-engine with the hash of the finalized Eth1 block and thereby trigger pruning of the state trie up to that block.
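A toy sketch of finalization-driven pruning, assuming the engine keeps one state version per block. The `StateTrieStore` class and its methods are hypothetical, and the sketch ignores non-canonical branches at or above the finalized height.

```python
from typing import Dict, List


class StateTrieStore:
    """Toy store of state versions keyed by Eth1 block hash."""

    def __init__(self) -> None:
        self.block_numbers: Dict[str, int] = {}  # block hash -> block number
        self.state_roots: Dict[str, str] = {}    # block hash -> state root

    def add_version(self, block_hash: str, number: int, state_root: str) -> None:
        self.block_numbers[block_hash] = number
        self.state_roots[block_hash] = state_root

    def on_finalized(self, finalized_block_hash: str) -> List[str]:
        """Drop every state version older than the finalized block."""
        finalized_number = self.block_numbers[finalized_block_hash]
        stale = [block_hash for block_hash, number in self.block_numbers.items()
                 if number < finalized_number]
        for block_hash in stale:
            del self.block_numbers[block_hash]
            del self.state_roots[block_hash]
        return stale
```

Compared with a fixed follow distance, the amount of retained state here depends on how far finality lags behind the head.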

Tooling

From a user perspective, switching from Eth1 to Eth1 shard would mean maintaining one more client and setting up the connection between the two parts of the new-fashioned client. This sounds like a significant UX flaw. Client tooling could be a remedy for this. This kind of tooling should support all major eth1-engines and eth2-clients, be able to download and set them up, and establish a communication channel for the client-engine pair.

The other part of UX work would be designing a unified API facade providing access to the API interfaces of both eth2-client and eth1-engine. This facade should expose Eth1 JSON-RPC with as few changes as possible, considering the major shift in consensus and the updates to data structures.

Light client

The beacon state is very small and takes up to a hundred megabytes. However, the network layer of the beacon node could be quite intensive due to attestation dissemination happening each slot.

Using a beacon chain light client instead of a full node would save the user of Eth1 shard a lot of network traffic. The size of the state required by the light client is less than a megabyte and could be kept in memory at all times. When eth2-light-client starts, it downloads the most recent light client state within a single roundtrip and gives a command to eth1-engine to catch up with the most recent finalized Eth1 state. In online mode, the light client would need to listen to beacon and shard blocks and execute Eth1 blocks to follow the tip of the Eth1 shard chain with a pretty high level of security.

Note: a light client is highly desirable but not a prerequisite to the merger. There is another way of reducing beacon chain traffic: turning on a light mode for the full beacon chain client. In this mode, the client will listen to the beacon blocks channel without participating in attestation subnets. This approach retains the security of individual clients but could affect the network itself; hence, it is a subject for further investigation.

The next step

We propose a core consensus implementation as the next step towards the merger. The scope of the PoC doesn’t include network and major client-related work. It reduces the consensus requirements to the following subset:

  • Prerequisites
    • Phase 1
      • State transition
      • Fork choice
      • Validator
  • Scope
    • Eth1 engine
      • Block processing
      • Block production
      • External fork choice
    • Eth1 shard
      • State transition function
      • Validator duties
    • Eth1-Eth2 communication protocol
      • Consensus

The communication protocol is going to be as minimal as the core consensus functionality requires. Any beacon chain changes, including new deposit processing, are out of scope; client updates are stripped down to the minimum subset needed for delivery.

The product of the PoC should be a client-engine pair capable of producing and importing shard blocks with Eth1 blocks as the executable payload.


Miners vote back

https://ethresear.ch/t/miners-vote-back/7129

@mkalinin wrote:

This write-up focuses on one of the approaches to a two-way bridge. Vitalik mentions that a replica of Eth1 data voting maintained by miners could be a source of eth2_data for Eth1.

This article investigates the major properties of a two-way bridged system driven by miners’ voting, rather than in-depth technical details like a particular approach to Eth1-side storage of eth2_data. It also assumes Eth1 clients use a full version of the beacon chain client.

For the sake of simplicity, beacon chain data placed on Eth1 is denoted as eth2_data whatever structure and fields it would have.

Prerequisites

Copy of Eth1Data poll

A miner maintains a beacon chain client and queries it for eth2_data upon block creation. A reference point for the query could be the timestamp of the block at the beginning of the voting period. An eth2_data sample that collects BLOCKS_PER_ETH2_VOTING_PERIOD * VOTING_MULTIPLIER votes gets published on Eth1.

Miners are allowed to vote for eth2_data published previously to cope with failures of various kinds, whether they are caused by malicious behavior or by software errors.
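The tally rule can be sketched as follows, with the two parameters passed in as arguments. The function names are hypothetical, eth2_data is reduced to an opaque string, and rounding the threshold up when the product is not an integer is an assumption.

```python
from collections import Counter
from fractions import Fraction
from math import ceil
from typing import Iterable, Optional


def voting_threshold(blocks_per_period: int, multiplier: Fraction) -> int:
    """Number of votes an eth2_data sample needs within one voting period."""
    return ceil(blocks_per_period * multiplier)


def winning_eth2_data(votes: Iterable[str],
                      blocks_per_period: int,
                      multiplier: Fraction) -> Optional[str]:
    """Return the eth2_data that reached the threshold, or None if none did."""
    tally = Counter(votes)
    if not tally:
        return None
    data, count = tally.most_common(1)[0]
    if count >= voting_threshold(blocks_per_period, multiplier):
        return data
    return None
```

With a multiplier above 1/2, at most one sample can reach the threshold in a period, so checking only the most common vote is sufficient.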

Picking up parameter values

A voting threshold set to BLOCKS_PER_ETH2_VOTING_PERIOD/2 + 1 wouldn’t work for a period of any length, as it would reduce the limit on adversary power by up to 10% and would make the bridge vulnerable to 41% attacks (the exact value depends on the length of the period but never reaches the desirable 51%).

To make this solution viable, we have to favor safety over liveness and pick a suitable VOTING_MULTIPLIER > 1/2. The parameter calculation comes up with the following values (link to the calculation sheet).

Parameter                        Value
BLOCKS_PER_ETH2_VOTING_PERIOD    1024
VOTING_MULTIPLIER                5/8
VOTING_THRESHOLD                 640

Given these parameter values, we can outline the values of the properties that the Eth2 voting process would have if it followed this scheme.

Property                        Value
Safety Fault Tolerance          50%
Liveness Fault Tolerance        30%
Chance of Safety Violation      5.73 * 10^-16
Chance of Liveness Violation    1.19 * 10^-7
Blocks to finality              1069..2953 blocks* (4.2 to 11.5 hours)

* Blocks to finality is the sum of BLOCKS_PER_ETH2_VOTING_PERIOD and the values from the previous write-up, where 45 and 1929 are the estimates for the new proposal and the status quo, respectively.
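The figures in the table can be approximated with a short calculation. The sketch below assumes that each block in the period is an independent vote and that violation chances are binomial tail probabilities at the stated fault tolerances; whether the calculation sheet uses exactly this model is an assumption. The 14-second block time used for the conversion to hours is likewise an assumption.

```python
from fractions import Fraction
from math import comb

BLOCKS_PER_ETH2_VOTING_PERIOD = 1024
VOTING_THRESHOLD = 640


def binomial_tail_at_least(n: int, k: int, p: Fraction) -> float:
    """P(X >= k) for X ~ Binomial(n, p), computed with exact integers."""
    a, b = p.numerator, p.denominator
    total = sum(comb(n, i) * a**i * (b - a)**(n - i) for i in range(k, n + 1))
    return total / b**n


# Safety: miners holding 50% of the hash rate collect the threshold on their own.
safety_violation = binomial_tail_at_least(
    BLOCKS_PER_ETH2_VOTING_PERIOD, VOTING_THRESHOLD, Fraction(1, 2))

# Liveness: 30% faulty miners produce so many blocks that the honest
# ones end up with fewer than VOTING_THRESHOLD votes.
max_honest_shortfall = BLOCKS_PER_ETH2_VOTING_PERIOD - VOTING_THRESHOLD + 1
liveness_violation = binomial_tail_at_least(
    BLOCKS_PER_ETH2_VOTING_PERIOD, max_honest_shortfall, Fraction(3, 10))

# Blocks to finality: voting period plus 45 (new proposal) or 1929 (status quo).
blocks_to_finality = (BLOCKS_PER_ETH2_VOTING_PERIOD + 45,
                      BLOCKS_PER_ETH2_VOTING_PERIOD + 1929)

ASSUMED_BLOCK_TIME_SECONDS = 14  # assumption, not taken from the write-up
hours_to_finality = tuple(round(blocks * ASSUMED_BLOCK_TIME_SECONDS / 3600, 1)
                          for blocks in blocks_to_finality)
```

Using exact integer arithmetic avoids the underflow that plain floating point would hit for terms like 0.5 to the power of 1024.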

To reduce risks, Vitalik suggests setting the length of the voting period to one week. The table below shows the property values for a one-week voting period.

Property                        Value*
Safety Fault Tolerance          50%
Liveness Fault Tolerance        45%
Chance of Safety Violation      ~0.0
Chance of Liveness Violation    ~0.0
Blocks to finality              1 week

* Values are based on 1 week tab of the calculation sheet.

Drawbacks

  • Time to finality limitation. A thousand blocks to finality seems to be the best that this solution can give. Reducing this number either reduces safety and liveness fault tolerance or increases the chance of failure of one or the other. You may check the calculation sheet for details.
  • Liveness weakening. Liveness fault tolerance of the voting process is lower than Eth1 fault tolerance.

Instant finality

Initially proposed by @nrryuya here.

Instead of passing through a voting process, Eth1 starts to invalidate blocks containing invalid eth2_data. Validity conditions for the data are as follows:

  1. The checkpoint must be finalized and belong to the canonical view of beacon chain.
  2. Eth1 block must belong to the canonical view of Eth1 chain.
  3. The checkpoint must be either equal to or a descendant of the previous one.
  4. Eth1 block must be either equal to or a descendant of the previous one.
  5. Beacon chain validators must vote for Eth1 block only if it contains eth2_data satisfying condition 1.

It is worth noting that condition 1 leaves room for a race in which an Eth1 block could be deemed invalid due to Eth2 network latency. This will likely mean that Eth2 is experiencing networking issues. A possible mitigation would be to keep a distance from the beacon chain head.
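Conditions 1 to 4 are all ancestry checks and can be sketched on toy chains represented as child-to-parent maps. The `Eth2Data` shape and the function names are hypothetical; condition 5 concerns beacon chain validators and is not covered here.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Eth2Data:
    checkpoint: str  # finalized beacon chain checkpoint
    eth1_block: str  # Eth1 block referenced by the beacon chain


def is_ancestor_or_equal(parents: Dict[str, str], ancestor: str, block: str) -> bool:
    """Walk the child-to-parent map from block back towards genesis."""
    current: Optional[str] = block
    while current is not None:
        if current == ancestor:
            return True
        current = parents.get(current)
    return False


def is_valid_eth2_data(new: Eth2Data, prev: Eth2Data,
                       beacon_parents: Dict[str, str], latest_finalized: str,
                       eth1_parents: Dict[str, str], eth1_head: str) -> bool:
    return (
        # 1. checkpoint is finalized in the canonical beacon chain view
        is_ancestor_or_equal(beacon_parents, new.checkpoint, latest_finalized)
        # 2. Eth1 block belongs to the canonical Eth1 chain view
        and is_ancestor_or_equal(eth1_parents, new.eth1_block, eth1_head)
        # 3. checkpoint equals or descends from the previous one
        and is_ancestor_or_equal(beacon_parents, prev.checkpoint, new.checkpoint)
        # 4. Eth1 block equals or descends from the previous one
        and is_ancestor_or_equal(eth1_parents, prev.eth1_block, new.eth1_block)
    )
```

The race mentioned above corresponds to `latest_finalized` lagging behind on a node that has not yet seen the checkpoint referenced by a freshly mined block.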

Safety and Liveness

For the rest of this section, we assume that safety and liveness of Eth1Data voting are tied to the corresponding beacon chain properties. As we’ve discussed previously, the status quo doesn’t satisfy this. But there is a solution addressing that problem.

If a client follows the protocol supporting the above conditions, then the following statements are true:

  1. Neither a finalized checkpoint nor a finalized block can be reverted on Eth1 chain.
  2. There is no possibility to finalize two conflicting Eth1 blocks as long as beacon chain has no safety failure.
  3. With an honest majority of miners, it is possible to keep eth2_data up-to-date as long as beacon chain has no liveness failure.

Statement 1 directly follows from conditions 3 and 4. To prove statement 2, suppose two conflicting Eth1 blocks were finalized. Two cases make this possible:

  • beacon chain validators polled a pair of inconsistent Eth1 blocks (Figure 1a)
  • blocks are finalized by conflicting beacon chain forks (Figure 1b)

The first case means that the Eth1Data voting process is broken. The second implies that the canonical view of beacon chain contains two conflicting finalized checkpoints. Either of these two is a safety failure in beacon chain and contradicts the second part of statement 2.

While the first case can be remedied by condition 4, the second one looks more severe since it could cause a split in Eth1.

Liveness statement (statement 3) can be proved in a similar way to statement 2. If eth2_data hadn’t been updated for a significant amount of time, it would mean that either Eth1Data voting was stuck or Eth1 liveness was violated. Both the former and the latter contradict statement 3.

Instant finality approach has two options affecting the security of the bridged system:

  • Only miners run beacon chain clients. In this case, regular users can’t check conditions 1 and 3, and security is shared between the honest majority of miners and beacon chain. This makes issuance reduction unattainable.
  • All Eth1 users run beacon chain clients. Users running beacon chain clients can’t be forced onto incorrect beacon chain forks if they check the conditions. Accordingly, beacon chain becomes the only supplier of Eth1 security.

These two options are not mutually exclusive and could become a pair of subsequent milestones within the same roadmap.

Inconsistent reads

Suppose there is a split in beacon chain, and there are two forks in Eth1 supporting a pair of corresponding beacon chain forks. Then the following scenario is possible:

  1. Beacon chain fork BC_1 processes a portion of deposits from Eth1_1.
  2. BC_1 switches to Eth1_2.
  3. One may do a sequence of 1 ETH deposits on Eth1_2 to increment the deposit counter.
  4. After that, deposits from step 1 can be replayed on BC_1 via Eth1_2 fork.

Conditions 2 and 5 establish a one-to-one relationship between particular Eth1 and beacon chain forks, preventing inconsistent read and bounce attack scenarios.

Software errors

One of the potential problems with this scheme lies in the field of software errors. Suppose there is a consensus break between two major beacon chain clients, which means that the canonical view of beacon chain differs for those two. Then miners relying on this pair of clients would disagree about their view of Eth1 chain, which, in turn, reduces the pace of honest blocks in Eth1 and increases its vulnerability to 51% attacks.

One could describe a simple model representing the influence of beacon chain software errors on Eth1 safety and liveness. Eth1 guarantees safety and liveness if (1 - γ)(1 - δ)α > β, where α and β are the total mining rates of honest and malicious miners, respectively, 0 < δ < 1 is a parameter proportional to network latency, and 0 < γ < 1 is the probability of invalidating an Eth1 block due to a failure of beacon chain software. γ is proportional to the number of different clients used by beacon chain validators and inversely proportional to their maturity.
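The inequality is easy to evaluate numerically. In the sketch below the function name and all example values are hypothetical; they only illustrate how a growing γ can flip the outcome for fixed mining rates.

```python
def eth1_is_secure(alpha: float, beta: float, delta: float, gamma: float) -> bool:
    """Evaluate (1 - gamma) * (1 - delta) * alpha > beta.

    alpha, beta: total mining rates of honest and malicious miners
    delta:       parameter proportional to network latency, 0 < delta < 1
    gamma:       probability of invalidating an Eth1 block due to a
                 beacon chain software failure, 0 < gamma < 1
    """
    if not (0 < delta < 1 and 0 < gamma < 1):
        raise ValueError("delta and gamma must lie strictly between 0 and 1")
    return (1 - gamma) * (1 - delta) * alpha > beta


# 60% honest vs 40% malicious mining power, moderate latency penalty.
secure_with_mature_clients = eth1_is_secure(0.6, 0.4, delta=0.1, gamma=0.1)
secure_with_buggy_clients = eth1_is_secure(0.6, 0.4, delta=0.1, gamma=0.3)
```

In this example the effective honest rate drops from 0.486 to 0.378 as γ rises from 0.1 to 0.3, which is enough to fall below the malicious rate of 0.4.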

One of the approaches to reducing γ is to set up a cluster of beacon chain clients and follow a conservative strategy of updating eth2_data only when the majority of cluster participants agree on the new portion of data. This approach will work with honest miners. But in a more realistic rational majority model, there is a centralization risk caused by replacing a locally maintained cluster with a public provider of eth2_data.
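The conservative update rule for such a cluster can be sketched as a strict-majority check over the responses of the individual clients. The function name is hypothetical, eth2_data is reduced to an opaque string, and a client that fails to respond is represented by None.

```python
from collections import Counter
from typing import List, Optional


def cluster_eth2_data(responses: List[Optional[str]]) -> Optional[str]:
    """Return the eth2_data reported by a strict majority of the cluster.

    Returns None when no value has a strict majority, in which case the
    miner keeps the previously published eth2_data.
    """
    tally = Counter(r for r in responses if r is not None)
    if not tally:
        return None
    data, count = tally.most_common(1)[0]
    # Non-responding clients still count towards the cluster size.
    return data if 2 * count > len(responses) else None
```

Counting silent clients against the majority keeps the rule conservative: a partially failed cluster stops updating rather than following a minority view.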

Drawbacks

  • Cost of issuance reduction. The cost of the desired issuance reduction is obligating regular users to run beacon chain client software.
  • Centralization risk. Rational miners and regular users tend to use public services to query beacon chain data.
