The Modular Thesis: Scaling Web3 with Rollups
Castle Capital: Modular Series
The modular thesis proposes that we will collectively change how we build and use blockchains. A modular design enables scalable & secure execution layers just as we move into the hype and heightened activity of a bull run!
So what is modular blockchain architecture?
Data Availability: The concept in which any data that is published to a network is accessible and retrievable by all network participants (at least for a certain time).
Execution: Defines how nodes on the blockchain process transactions, transitioning them between states.
Settlement: Finality (probabilistic or deterministic) is a guarantee that a transaction committed to the chain is irreversible. This only happens when the chain is convinced of transaction validity. Hence, settlement means to validate transactions, verify proofs & arbitrate disputes.
Consensus: The mechanism by which nodes agree to what data on the blockchain can be verified as true & accurate.
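One way to picture the separation of these four functions is as independent interfaces that a chain can either implement itself or delegate to a specialized layer. The sketch below is purely illustrative: the class and method names are hypothetical and don't correspond to any real framework.

```python
from abc import ABC, abstractmethod

# Hypothetical interfaces illustrating the four core blockchain functions.
# All names are illustrative only; no real framework is assumed.

class DataAvailability(ABC):
    @abstractmethod
    def publish(self, data: bytes) -> str:
        """Publish data so all participants can retrieve it (for some time)."""

class Execution(ABC):
    @abstractmethod
    def apply(self, state: dict, tx: dict) -> dict:
        """Process a transaction, transitioning the chain between states."""

class Settlement(ABC):
    @abstractmethod
    def verify(self, proof: bytes) -> bool:
        """Validate transactions, verify proofs & arbitrate disputes."""

class Consensus(ABC):
    @abstractmethod
    def agree(self, block: dict) -> bool:
        """Nodes agree on which data is verified as true & accurate."""

# A monolithic chain implements all four functions itself; a modular
# stack delegates each one to a separate, specialized layer.
```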
Monolithic Blockchain Architecture (Source: Celestia)
While the monolithic design approach has some advantages of its own (e.g. reduced complexity & improved composability), it doesn't necessarily scale well. This is why modular designs strip these functions apart, having them performed on separate, specialized layers.
Consequently, the modular design space consists of specialized layers for each of these core functions:
Modular Blockchain Architecture (Source: Celestia)
More broadly, the modular landscape also includes:
Projects focused on order flow abstraction
Various infrastructure providers (rollup frameworks, rollup-as-a-service solutions & other tooling)
In this brief introductory piece, the focus lies on how we got to rollup-based (a.k.a. modular) scaling solutions, before we dive deep into the nuances of modular blockchain systems over the coming weeks in this new series.
The History of Scaling
Scaling the throughput of blockchains has been a main focus of research and development in the space since its inception. It is indisputable that to reach true “mass adoption”, blockchains must be able to scale. Simply defined, scalability is the ability of a network to process a large number of transactions quickly and at low cost. This means that as more use cases arise and network adoption accelerates, the performance of the blockchain doesn’t suffer. Based on this definition, Ethereum lacks scalability.
With increasing network usage, gas prices on Ethereum have skyrocketed to unsustainably high levels, pricing many smaller users out of interacting with decentralized applications entirely. Examples include the BAYC land mint (which pushed gas fees up to 8000 gwei) or the Art Blocks NFT drop (which pushed gas fees to over 1000 gwei) - as a reference, gas sits at 6 gwei at the time of writing. Instances such as these gave alternative, more “scalable” L1 blockchains (e.g. Solana) a chance to eat into Ethereum’s market share. However, this also spurred innovation around increasing the throughput of the Ethereum network.
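To put those gwei figures in perspective, the fee for a simple ETH transfer (21,000 gas) can be computed directly from the gas price. The snippet below is just arithmetic restating the numbers from the text; it assumes a plain transfer, not a contract interaction.

```python
GWEI = 10**-9  # 1 gwei = 1e-9 ETH
SIMPLE_TRANSFER_GAS = 21_000  # gas used by a plain ETH transfer

def transfer_fee_eth(gas_price_gwei: float) -> float:
    """Fee in ETH for a simple transfer at a given gas price."""
    return SIMPLE_TRANSFER_GAS * gas_price_gwei * GWEI

print(transfer_fee_eth(8000))  # BAYC land mint peak: ~0.168 ETH per transfer
print(transfer_fee_eth(6))     # calm network: ~0.000126 ETH per transfer
```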
The scaling approaches these Alt-L1s take, however, often come at the cost of decentralization and security. Chains like Solana, for example, have chosen a smaller validator set with increased hardware requirements for validators. While this boosts the network’s throughput, it reduces how many people can verify the chain themselves and raises the barrier to entry for network participation. This conflict is referred to as the blockchain trilemma (visualized below): the idea that a blockchain cannot achieve all three core qualities any blockchain network should strive for (scalability, security & decentralization) at once.
The blockchain trilemma (Source: SEBA Research)
This becomes clear when we think about the aforementioned increase in hardware requirements. To scale throughput, an Alt-L1 chain must utilize a more centralized network structure, where users have to trust a smaller number of validators with high-spec machines. This sacrifices two arms of the blockchain trilemma, decentralization & security, for scalability. Additionally, with the need for more powerful hardware, running a node also becomes more expensive (not only hardware itself but also bandwidth & storage). This drastically impairs the decentralization of the network as the barrier to entry for running a node increases dramatically, thus fewer people can participate in validating the network.
Since decentralization and inclusion are two core values of the Ethereum community, it is not surprising that running the chain with a small set of high-spec nodes was not a suitable path forward. Vitalik Buterin even argued that it is “crucial for blockchain decentralization for regular users to be able to run a node”. Hence, other scaling approaches gained traction.
Homogeneous Execution Sharding
The Ethereum community has experimented with side chains, plasma, and state channels to solve the scalability problem, all of which have certain drawbacks that render them sub-optimal solutions. A scaling approach many alternative L1 blockchains have chosen is what is referred to as homogeneous execution sharding. For quite some time, this also seemed like the most promising solution for Ethereum (in the context of the old ETH 2.0 roadmap).
Homogeneous execution sharding is a scaling approach that seeks to increase the throughput and capacity of a blockchain network by splitting its transaction processing workload among multiple, smaller units (validator sub-sets) called shards. Each shard operates independently and concurrently, processing its own set of transactions and maintaining a separate state. The goal is to enable parallel execution of transactions, thus increasing the overall network capacity and speed. Harmony and Ethereum 2.0 (old roadmap only!) are two examples of scaling initiatives that have adopted or at least considered homogeneous execution sharding as part of their scaling strategy.
Simplified Visualization of Execution Sharding
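The idea of splitting the workload across shards can be sketched in a few lines: transactions are deterministically assigned to a shard, and each shard maintains its own separate state. This is a hedged toy model; the function names are made up, the shards are looped over rather than run concurrently on disjoint validator sub-sets, and cross-shard communication is omitted entirely.

```python
import hashlib
from collections import defaultdict

NUM_SHARDS = 4  # illustrative; the old ETH 2.0 roadmap envisioned 64 shards

def shard_of(account: str) -> int:
    """Deterministically assign an account to a shard."""
    digest = hashlib.sha256(account.encode()).digest()
    return digest[0] % NUM_SHARDS

def process_in_shards(txs: list) -> dict:
    """Each shard processes only its own transactions and keeps its own state.
    In a real network the shards execute in parallel; here they are simply
    partitioned and looped over."""
    states = defaultdict(dict)
    for tx in txs:
        s = shard_of(tx["to"])
        state = states[s]
        state[tx["to"]] = state.get(tx["to"], 0) + tx["amount"]
    return dict(states)

txs = [{"to": "alice", "amount": 5}, {"to": "bob", "amount": 3}]
print(process_in_shards(txs))
```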
Harmony is an alternative L1 blockchain platform that aims to provide a scalable, secure, and energy-efficient infrastructure for decentralized applications (dApps). It uses a sharding-based approach in which the network is divided into multiple shards, each with its own set of validators who are responsible for processing transactions and maintaining a local state. Validators are randomly assigned to shards, ensuring a fair and balanced distribution of resources.
Cross-shard communication is facilitated through a mechanism called "receipts," which allows shards to send information about the state changes resulting from a transaction to other shards. This enables seamless interactions between dApps and smart contracts residing on different shards, without compromising the security and integrity of the network.
Ethereum 2.0 is an ongoing upgrade to the Ethereum network aiming to address the scalability, security, and sustainability issues of the original Proof-of-Work (PoW) based Ethereum. The old Ethereum 2.0 roadmap proposed a multi-phase rollout, transitioning the network to a Proof-of-Stake (PoS) consensus mechanism (which we finally saw happen last fall) and introducing execution sharding to improve scalability. Under this original plan, Ethereum 2.0 would have consisted of a Beacon Chain and 64 shard chains, with the Beacon Chain managing the PoS protocol, validator registration, and cross-shard communication.
The shard chains, on the other hand, were to be individual chains, responsible for processing transactions and maintaining separate states in parallel. Validators would have been assigned to a shard, rotating periodically to maintain the security and decentralization of the network. The Beacon Chain would have kept track of validator assignments and managed the process of finalizing shard chain data. Cross-shard communication was planned to be facilitated through a mechanism called "crosslinks," which would periodically bundle shard chain data into the Beacon Chain, allowing state changes to be propagated across the network.
But while homogeneous execution sharding promises great scalability, it comes with security trade-offs: the validator set is split into smaller subsets, impairing the network’s decentralization, and the value at stake providing crypto-economic security on each shard is reduced.
However, the Ethereum 2.0 roadmap has since evolved, and execution sharding has been replaced by an approach referred to as data sharding that aims to provide the scalable basis for a more complex scaling technology known as rollups (more on this soon!).
Heterogeneous Execution Sharding
Heterogeneous execution sharding is a scaling approach that connects multiple, independent blockchains with different consensus mechanisms, state models, and functionality into a single, interoperable network. This approach allows each connected blockchain to maintain its unique characteristics while benefiting from the security and scalability of the entire ecosystem. Two prominent examples of projects that employ heterogeneous execution sharding are Polkadot and Cosmos.
Polkadot is a decentralized platform designed to enable cross-chain communication and interoperability among multiple blockchains. Its architecture consists of a central Relay Chain, multiple Parachains, and Bridges.
Simplified Visualization of Polkadot’s network Architecture (Source: Polkadot Docs)
Relay Chain: The main chain in the Polkadot ecosystem, responsible for providing security, consensus, and cross-chain communication. Validators on the Relay Chain are in charge of validating transactions and producing new blocks.
Parachains: Independent blockchains that connect to the Relay Chain to benefit from its shared security and consensus mechanisms, as well as enable interoperability with other chains in the network. Each parachain can have its own state model, consensus mechanism, and specialized functionality tailored to specific use cases.
Bridges: Components that link Polkadot to external blockchains (like Ethereum) and enable communication and asset transfers between these networks and the Polkadot ecosystem.
Polkadot uses a hybrid consensus mechanism called Nominated Proof-of-Stake (NPoS) to secure its network. Validators on the Relay Chain are nominated by the community to validate transactions and produce blocks. Parachains, by contrast, can use different consensus mechanisms depending on their requirements. An important feature of Polkadot’s network architecture is that, by design, all Parachains share security with the Relay Chain and hence inherit its security guarantees.
Cosmos is another decentralized platform that aims to create an "Internet of Blockchains", facilitating seamless communication and interoperability between different blockchain networks. Its architecture is similar to Polkadot’s, being composed of a central Hub, multiple Zones, and Bridges.
Simplified Visualization of Cosmos’ Network Architecture (Source: Cosmos Docs)
Hub: The central blockchain in the Cosmos ecosystem, which enables cross-chain communication and soon inter-chain security (shared security similar to Polkadot). Cosmos Hub uses a Proof-of-Stake (PoS) consensus mechanism called Tendermint, which offers fast finality and high throughput. Theoretically, there can be multiple hubs. However, with ATOM 2.0 and inter-chain security coming up, the Cosmos Hub will likely remain the center of the Cosmos-enabled “Internet of Blockchains.”
Zones: Independent blockchains connected to the Hub, each with its own consensus mechanism, state model, functionality, and validator set (typically). Zones can communicate with each other through the Hub using a standardized protocol called Inter-Blockchain Communication (IBC).
Bridges: Components that link the Cosmos ecosystem to external blockchains, allowing asset transfers and communication between Cosmos Zones and other networks.
Both Polkadot and Cosmos are examples of heterogeneous execution sharding, as they connect multiple, independent blockchains with diverse functionality, consensus mechanisms, and state models into a single, interoperable ecosystem. This approach allows each connected chain to maintain its unique characteristics and to scale by separating application-specific execution layers from each other, while still benefiting from the cross-chain communication and security capabilities of the entire network.
The main difference between the Cosmos and Polkadot approaches is the security model. While Cosmos goes for an approach in which the app-specific chains (heterogeneous shards) have to spin up and maintain their own validator sets, Polkadot opts for a shared security model, under which the app-chains inherit security from the Relay Chain at the center of the ecosystem. The latter is much closer to the rollup-based scaling approach that Ethereum wants to take.
Scaling Ethereum with Rollups
A rollup-centric Ethereum roadmap isn’t exactly a new phenomenon, but its uptake and adoption have accelerated rapidly. Vitalik first wrote about this roadmap pivot back in Oct 2020.
Rollups take sharding within a shared security paradigm to the next level. It’s a scaling solution in which transactions are processed off-chain in the rollup’s execution environment and, as the name suggests, rolled up into batches. Sequencers collect transactions from the users and submit the transaction batches to a smart contract on Ethereum L1 that enforces correct transaction execution on L2. Subsequently, the transaction data is stored on L1, which enables rollups to inherit the security of the battle-tested Ethereum base layer.
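The batching flow just described can be sketched as a toy sequencer: collect user transactions off-chain, execute them, then post the batch data together with the new state root to L1. This is a hedged illustration; the class, the dict-based state, and the hash-based "state root" are stand-ins (a real rollup commits to state with a Merkle root and posts to an actual L1 contract).

```python
import hashlib
import json

def state_root(state: dict) -> str:
    """Toy state commitment (a real rollup uses a Merkle root)."""
    return hashlib.sha256(json.dumps(state, sort_keys=True).encode()).hexdigest()

class Sequencer:
    """Collects user transactions off-chain and rolls them into batches."""
    def __init__(self):
        self.mempool = []
        self.state = {}

    def receive(self, tx: dict):
        self.mempool.append(tx)

    def make_batch(self) -> dict:
        # Execute all pending transactions off-chain...
        for tx in self.mempool:
            self.state[tx["to"]] = self.state.get(tx["to"], 0) + tx["amount"]
        batch, self.mempool = self.mempool, []
        # ...then post the batch data plus the new state root to L1, so the
        # data is available there and correct execution can be enforced.
        return {"txs": batch, "new_state_root": state_root(self.state)}

seq = Sequencer()
seq.receive({"to": "alice", "amount": 10})
seq.receive({"to": "bob", "amount": 2})
batch = seq.make_batch()
print(len(batch["txs"]), batch["new_state_root"][:8])
```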
So now, what were essentially shards in the old Ethereum 2.0 roadmap are completely decoupled from the base layer, and developers have a wide open space to customize their L2 however they want (similar to Polkadot’s parachains or Cosmos’ zones). However, thanks to settlement and data availability (DA) on Ethereum, rollups are still able to rely on L1 security guarantees. Another key advantage compared to side-chains (e.g. Polygon) is that rollups do not need a validator set and consensus mechanism of their own.
A rollup system only needs a set of sequencers (collecting and ordering transactions), with only one sequencer needing to be live at any given time. With liveness assumptions this weak, rollups can actually run on a small set of high-spec server-grade machines or even a single sequencer, allowing for great scalability. As this comes at a trade-off with decentralization, most rollups try to design their systems to be as decentralized as possible (including the sequencer). While rollups don’t explicitly need consensus mechanisms (finality comes from L1 consensus), they can have coordination mechanisms with rotation schedules for sequencers, or even fully-fledged PoS mechanisms in which a set of sequencers reach consensus on transaction batching/ordering. These approaches can increase security & improve decentralization.
Generally, there are two types of rollup systems: optimistic rollups and validity (zero-knowledge) rollups.
Optimistic Rollups
Optimistic rollups are characterized by a sequencer node that collects transaction data on L2 and subsequently submits this data to the Ethereum base layer alongside the new L2 state root. To ensure that the state root submitted to Ethereum L1 is correct, verifier nodes compare it against the state root they compute themselves. If there is a discrepancy, they begin what’s called a fraud proof process. If the fraud proof yields a state root different from the one submitted, the sequencer’s initial deposit (a.k.a. bond) is slashed, the state roots from that transaction onward are erased, and the sequencer has to recompute them.
Rollup mechanism (Source: Panther Academy)
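The fraud proof logic can be modeled in a few lines: a verifier re-executes the batch from the last agreed state, and a mismatching root slashes the sequencer's bond and triggers a revert. This is a heavily simplified sketch with made-up function names; real fraud proofs are interactive on-chain disputes, not a single re-execution.

```python
import hashlib
import json

def state_root(state: dict) -> str:
    """Toy state commitment (stand-in for a Merkle root)."""
    return hashlib.sha256(json.dumps(state, sort_keys=True).encode()).hexdigest()

def execute(state: dict, txs: list) -> dict:
    """Re-run a batch of toy balance-credit transactions."""
    new = dict(state)
    for tx in txs:
        new[tx["to"]] = new.get(tx["to"], 0) + tx["amount"]
    return new

def challenge(pre_state: dict, txs: list, claimed_root: str, sequencer_bond: int) -> dict:
    """Verifier re-executes the batch; a mismatching root slashes the bond
    and reverts to the last agreed state (toy model of a fraud proof)."""
    honest_root = state_root(execute(pre_state, txs))
    if honest_root != claimed_root:
        return {"fraud": True, "bond_slashed": sequencer_bond, "revert_to": pre_state}
    return {"fraud": False}

txs = [{"to": "alice", "amount": 1}]
good_root = state_root(execute({}, txs))
print(challenge({}, txs, good_root, 100)["fraud"])     # honest claim: False
print(challenge({}, txs, "0xdeadbeef", 100)["fraud"])  # fraudulent claim: True
```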
Validity (Zero-Knowledge) Rollups
Validity rollups on the other hand rely on validity proofs in the form of zero-knowledge proofs (e.g. SNARKs or STARKs) instead of fraud proving mechanisms. Similar to optimistic rollup systems, a sequencer collects transactions from users and is responsible for submitting (and sometimes also generating) the zero-knowledge proof to the L1 alongside the corresponding transaction data. The sequencer’s stake can be slashed if they act maliciously, which incentivizes them to post valid blocks (or proofs of batches). Validity rollups introduce a new role to the system that is not needed in the optimistic setup. The prover is the actor that generates unforgeable zk proofs of transaction execution, proving that the proposed state transitions are valid.
The sequencer subsequently submits these proofs to the verifier contract on the Ethereum mainnet. Technically, the responsibilities of sequencer and prover can be combined into one role. However, because proof generation and transaction ordering each require highly specialized capabilities to perform well, splitting these responsibilities prevents unnecessary centralization in a rollup’s design. The zero-knowledge proof submitted to L1 reports only the changes to the L2 state, delivered to the verifier smart contract as a verifiable hash.
Simplified Visualization of a zk-Rollup (Source: Chainlink)
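The prover/verifier split can be sketched as an interface: the prover executes the batch and emits a proof binding the transactions to the new state root, and the verifier contract accepts the update only if the proof checks out. Be warned that the hash commitment below only mimics the interface; it is not zero-knowledge, not succinct, and does not by itself enforce correct execution the way a real SNARK/STARK does. All names here are hypothetical.

```python
import hashlib
import json

def commit(obj) -> str:
    """Toy hash commitment over a JSON-serializable object."""
    return hashlib.sha256(json.dumps(obj, sort_keys=True).encode()).hexdigest()

class Prover:
    """Executes the batch and emits a 'proof' binding txs to the new root.
    A real prover emits a SNARK/STARK; this hash stands in for the
    interface only and proves nothing about execution."""
    def prove(self, pre_state: dict, txs: list):
        state = dict(pre_state)
        for tx in txs:
            state[tx["to"]] = state.get(tx["to"], 0) + tx["amount"]
        new_root = commit(state)
        proof = commit({"txs": txs, "pre": commit(pre_state), "post": new_root})
        return new_root, proof

class VerifierContract:
    """On-chain verifier stub: accepts a state update only with a matching
    proof. A real verifier checks the zk proof cheaply without re-executing."""
    def verify(self, pre_state: dict, txs: list, new_root: str, proof: str) -> bool:
        return proof == commit({"txs": txs, "pre": commit(pre_state), "post": new_root})

prover, contract = Prover(), VerifierContract()
root, proof = prover.prove({}, [{"to": "alice", "amount": 4}])
print(contract.verify({}, [{"to": "alice", "amount": 4}], root, proof))  # True
```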
Determining which approach is superior is challenging, but let's briefly explore some key differences. Firstly, because validity proofs can be verified mathematically, the Ethereum network can trustlessly confirm the legitimacy of batched transactions. This differs from optimistic rollups, where Ethereum relies on verifier nodes to validate transactions and execute fraud proofs if necessary. Hence, some argue that zk-rollups are more secure. Furthermore, validity proofs enable instant confirmation of rollup transactions on the main chain.
Consequently, users can transfer funds seamlessly between the rollup and the base blockchain (as well as other zk-rollups) without experiencing friction or delays. In contrast, optimistic rollups (such as Optimism and Arbitrum) impose a waiting period before users can withdraw funds to L1 (7 days in the case of Optimism & Arbitrum) as the verifiers need to be able to verify the transactions and initiate the fraud proving mechanism if necessary. This limits the efficiency of rollups and reduces the value for users. While there are ways to enable fast withdrawals, it is generally not a native feature.
However, validity proofs are computationally expensive to generate and often costly to verify on-chain (depending on the proof size). By forgoing proof generation and on-chain verification altogether, optimistic rollups gain a cost edge over validity rollups.
Both optimistic and validity rollups play a key role in the context of Ethereum’s rollup-centric roadmap. Transforming the Ethereum base layer into a major data availability/settlement layer for an almost infinite number of highly scalable, rollup-based execution layers will enable the overall Ethereum network and its rollup ecosystems to reach an enormous scale.
As we have seen, building decentralized applications that are sovereign & unconstrained by the limitations of base layers is a complex endeavor. It requires coordinating hundreds of node operators, which is both difficult & costly. Moreover, it is hard to scale monolithic blockchains without making significant tradeoffs on security and/or decentralization.
While frameworks such as the Cosmos SDK and Polkadot’s Substrate make it easier to abstract certain software components, they don’t allow for a seamless transition from code to an actual physical network of p2p hardware. Additionally, heterogeneous sharding approaches may fragment ecosystem security, introducing additional friction & risk.
Rollups, the next-gen scaling solution, not only remove the difficulty of coordinating hundreds or even thousands of individuals to operate a decentralized network, but are also a major stepping stone towards significantly reducing the cost & time developers need to turn their ideas & concepts into reality.
The concept of modular chains further simplifies this. Modular blockchain design is a broad approach that separates a blockchain’s core functions into distinct, interchangeable components. Within these functional areas, specialized providers arise that jointly facilitate building scalable and secure rollup execution layers, broad app design flexibility, and enhanced adaptability for evolving technological demands.
Despite this, rollup-based scaling is still a nascent technology, and there are obstacles to overcome. The main scalability bottleneck for (Ethereum-based) rollups is currently limited data availability (DA) capacity. However, innovation driven by the modular thesis has several approaches in store to address this. To learn more about the DA problem and potential solutions, stay tuned for our deep-dive report, which will be published next week as we continue this series!
Brought to you by zerokn0wledge!