Handling the complexities of decentralisation at Vega

Over the past decade, decentralised computer systems have changed how we interact with the financial and economic system. The Bitcoin network is owned by no one, but trusted by its participants as a means to store and transfer value. Ethereum offers automatic contract execution that doesn’t rely on any one centralised server or authority.

These are the beginnings of a movement. It seems probable to me that other elements of the global financial and economic systems will go the same route. Insurance, capital markets, banking, and payment systems all spring to mind. Despite all of their undoubted problems, Bitcoin and Ethereum point the way towards an open source economy. Things that we formerly thought of as being the sole preserve of governments or large companies become programmable.

Our project, Vega, will offer an open, blockchain-backed network for the safe, end-to-end trading and execution of financial products.

Until now, trading derivatives, and products like them, has taken place on specialised, high-speed centralised exchanges. They are typically hosted by a single organisation — think Eurex or the NYSE. Each exchange provides a market for a known set of financial products, keeps trading closed to relatively small groups of participants, and enforces security controls on the exchange. The exchange acts as a central authority. It guarantees availability and ensures that trading takes place according to well-understood rules.

When we considered the possibilities for a fully decentralised trading platform, several questions immediately surfaced. Can we build a decentralised exchange that provides a safe venue for trading? Could it be fast enough to be an economically viable proposition for traders and market makers? Is it possible to do the entire trade lifecycle, including order placement, determination of price levels, risk management calculations, and settlement without a central authority?

Not only do we believe it’s all possible, we’re building it. How can we do it?

For a system like this to work, we need a Byzantine Fault Tolerant, Proof of Stake blockchain platform that will allow us to rapidly calculate price levels and risk, and allow automatic settlement of funds. It should be resistant to Sybil attacks, and be considered a reasonably reliable store of value.

I’m going to break down each one of these elements, and look at why they’re important.

Byzantine Fault Tolerant

To build a decentralised environment, the system needs to be composed of multiple cooperating nodes, each of which can accept transactions simultaneously with all other nodes.

Input transaction types include orders for a given financial product in a given market, messages about financial collateral, market creation or closure, and funds withdrawal requests.

In order to function correctly, each participating node must see identical transactions in identical order, and then act upon those transactions deterministically (so each node reaches the same conclusions as every other node). Nodes must be able to reach global consensus to determine an ordered list of transactions without a central authority.
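To make the "identical transactions in identical order" requirement concrete, here is a minimal sketch of deterministic replay. The transaction shape (`from`, `to`, `amount`), the balance-transfer rule, and the function names are all assumptions made up for illustration; they are not Vega's actual transaction format or state model.

```python
import hashlib
import json


def apply_transaction(balances, tx):
    """Apply one transfer deterministically and return a new balances dict."""
    sender, receiver, amount = tx["from"], tx["to"], tx["amount"]
    updated = dict(balances)
    # Every node must reject an overdraft in exactly the same way:
    # here the transaction is simply skipped.
    if updated.get(sender, 0) < amount:
        return updated
    updated[sender] = updated.get(sender, 0) - amount
    updated[receiver] = updated.get(receiver, 0) + amount
    return updated


def replay(initial_balances, ordered_txs):
    """Fold an ordered list of transactions into a final state."""
    state = dict(initial_balances)
    for tx in ordered_txs:
        state = apply_transaction(state, tx)
    return state


def state_hash(balances):
    """Canonical hash of the state, so nodes can compare results cheaply."""
    encoded = json.dumps(balances, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


genesis = {"alice": 10, "bob": 0}
txs = [
    {"from": "alice", "to": "bob", "amount": 10},
    {"from": "bob", "to": "alice", "amount": 5},
]

node_a = replay(genesis, txs)
node_b = replay(genesis, txs)
node_c = replay(genesis, list(reversed(txs)))  # same transactions, different order

print(state_hash(node_a) == state_hash(node_b))  # same order, same state
print(state_hash(node_a) == state_hash(node_c))  # different order, different state
```

Nodes A and B agree because they saw the same list in the same order. Node C saw the same transactions reversed, so the second transfer fails its balance check and C ends up with a different state. That divergence is what global consensus on ordering is there to prevent.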

Nodes may be run by people whose real-world identities are unknown. Traders and market makers may be similarly pseudonymous. Some percentage of the participants in the system’s infrastructure may therefore be attackers who are actively attempting to subvert the system. To account for that, it must be Byzantine Fault Tolerant (BFT). BFT systems can come to the correct global consensus about an ordered list of transactions, as long as fewer than one in three nodes are unavailable or acting as attackers.
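The one-in-three bound is usually written as n ≥ 3f + 1, where n is the total number of nodes and f is the number that may be faulty. The sketch below works through that arithmetic. The "more than two thirds" quorum rule is the one commonly used by PBFT-style protocols and is an assumption here, not something this article specifies for Vega; the function names are made up for illustration.

```python
def max_faulty_nodes(total_nodes):
    """Largest f such that total_nodes >= 3 * f + 1."""
    return (total_nodes - 1) // 3


def quorum_size(total_nodes):
    """Smallest number of votes that is strictly more than two thirds of the nodes."""
    return (2 * total_nodes) // 3 + 1


for n in (3, 4, 7, 10, 100):
    print(n, "nodes tolerate", max_faulty_nodes(n), "faulty; quorum is", quorum_size(n))
```

Note that three nodes tolerate no Byzantine faults at all. Four is the smallest network that survives one attacker.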

To provide an ordered list of transactions and BFT guarantees, the system creates high-integrity blocks of data. This presupposes some decision-making mechanism to determine which blocks and which global transaction order becomes the agreed-upon history for all nodes. While there’s a lot of work going on in these areas, broadly speaking there are only a few recognised ways to do it: Satoshi consensus, or some variant of the Practical Byzantine Fault Tolerance family of algorithms.

But transaction ordering is only one part of the problem.

The Sybil attack

In an open and unpermissioned system, another problem, called the Sybil attack, then arises. Let’s assume we have a simple democracy of nodes, with each node getting one vote. Given a proposed block of transactions, each node could cast its vote as to whether it thought the block was correct, and a majority decision would win the game. But what if an attacker can multiply node identities very cheaply?

In the now-ancient classic movie The Matrix, the villain Agent Smith is just such an attacker. He can multiply himself at will to win any fight. If Agent Smith takes part in majority decision making, he can multiply himself a trillion times, statistically win every block proposal, and control history within the system. So Vega’s underlying platform must be Sybil resistant, able to defeat Agent Smith’s replicative powers.
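A toy version of the one-node-one-vote democracy shows how little the attack costs. The vote counts and the function name are invented for illustration only.

```python
def majority_accepts(votes):
    """Simple democracy: a block is accepted if more than half the identities vote yes."""
    yes_votes = sum(1 for vote in votes if vote)
    return yes_votes * 2 > len(votes)


honest_votes = [False] * 100  # 100 honest nodes reject a bad block
sybil_votes = [True] * 101    # the attacker creates 101 free identities that approve it

print(majority_accepts(honest_votes))                # honest nodes alone reject the block
print(majority_accepts(honest_votes + sybil_votes))  # the Sybil identities outvote them
```

Because identities are free, the attacker only ever needs one more than the honest headcount. A Sybil-resistant design has to tie voting power to something that cannot be copied for free.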

Speed and automation

Lastly, if we want traders to find the Vega platform appealing enough to compete with centralised platforms, it needs to be fast.

Throughput, measured in transactions per second, must be high enough to make the system financially viable for node operators. Also, latency within the system needs to be low. A placed order needs speedy acknowledgement to satisfy the needs of market makers and traders who are used to latency typically measured in low single-digit milliseconds.

It also needs to work autonomously, without human intervention. If the system is to automatically execute a wide range of products, it needs a way to define the characteristics of arbitrary financial products. Settlement and funds withdrawal need to be fully automated, too.

Finally, the system must be performant enough to do real-time margining and risk management calculations in a decentralised way. Otherwise, financial risks in the system will multiply and the whole thing implodes.

Which base platform?

There is no blockchain platform in production today that meets all of Vega’s needs.

Immediately, we ruled out Bitcoin. It isn’t programmable enough to do what we need.

Ethereum is a contender that meets many of the requirements. It can execute arbitrary code, act as a store of value, and do settlement operations between users. However, there are a few problems that mean Ethereum by itself is not the answer.

Like Bitcoin, Ethereum uses a Proof of Work (PoW) algorithm for consensus on the ordering of transactions, and as a Sybil resistance mechanism. Each participating node burns electricity in order to prove that it has participated in transaction ordering, and to deter Agent Smith from replicating endlessly and overrunning the system. Attackers are free to replicate wildly and try to win the game, but they need to pay the electricity bill at the end of the month, and that’s expensive.

But there’s an inherent performance problem: due to their low throughput and high finality latency, PoW blockchains aren’t fast enough for the core operations of the Vega platform. Ethereum, for instance, manages a throughput of only about 25 transactions per second, which is far too slow for our needs.

The latency story is in some ways even worse for an application like Vega. Blocks are produced by Ethereum every 15 seconds or so. But to be sure you’re in the longest chain you might wait 6 blocks (for a 99% probability that the transaction is included) or 13 blocks for a one-in-a-million chance of block reversion. If your aim is to process millions of high-value transactions every day, you’ll have several angry users in each 24-hour period, even if you wait three minutes for each transaction to clear.
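The arithmetic behind those figures is easy to check. The 15-second block time, the 6 and 13 confirmation counts, and the one-in-a-million reversion chance come from the paragraph above. The figure of five million transactions per day is an assumption picked only to stand in for "millions".

```python
BLOCK_TIME_SECONDS = 15  # "every 15 seconds or so", per the text


def confirmation_wait_seconds(confirmations, block_time=BLOCK_TIME_SECONDS):
    """Time spent waiting for a given number of blocks to be produced."""
    return confirmations * block_time


def expected_reversions_per_day(transactions_per_day, one_in):
    """Expected number of reverted transactions if each reverts with chance 1/one_in."""
    return transactions_per_day / one_in


wait_6 = confirmation_wait_seconds(6)
wait_13 = confirmation_wait_seconds(13)
unlucky = expected_reversions_per_day(5_000_000, 1_000_000)  # 5M/day is an assumption

print(wait_6, "seconds for 6 blocks")
print(wait_13, "seconds for 13 blocks, or", wait_13 / 60, "minutes")
print(unlucky, "expected reverted transactions per day")
```

Thirteen blocks comes to 195 seconds, a little over three minutes. At the assumed volume, that wait would still leave about five reverted transactions a day.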

Luckily, a new generation of Proof of Stake (PoS) blockchain platforms is now becoming available. Rather than burning energy to present Agent Smith with a big electricity bill that deters him from replicating, PoS blockchains rely on the staking of cryptocurrency resources. Nodes put cryptocurrency into escrow with the network, proving that they have access to a scarce resource. This limits the scope for unlimited replication by attackers. Any nodes that misbehave have their stakes automatically taken away by the system, providing a disincentive for bad behaviour.
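Here is a minimal sketch of why staking blunts replication, assuming voting power is proportional to bonded stake and that a misbehaving node loses its whole stake. Both rules, along with the `StakeRegistry` class and its method names, are hypothetical simplifications; real PoS platforms differ in how they weight votes and how much they confiscate.

```python
from fractions import Fraction


class StakeRegistry:
    """Toy escrow: tracks bonded stake per node identity."""

    def __init__(self):
        self.stakes = {}

    def bond(self, node_id, amount):
        """Lock up stake for a node identity."""
        if amount <= 0:
            raise ValueError("stake must be positive")
        self.stakes[node_id] = self.stakes.get(node_id, 0) + amount

    def voting_power(self, node_ids):
        """Share of total stake held by the given identities."""
        total = sum(self.stakes.values())
        if total == 0:
            return Fraction(0)
        held = sum(self.stakes.get(node_id, 0) for node_id in node_ids)
        return Fraction(held, total)

    def slash(self, node_id):
        """Confiscate a misbehaving node's entire stake and return the amount."""
        return self.stakes.pop(node_id, 0)


registry = StakeRegistry()
for name in ("honest-1", "honest-2", "honest-3"):
    registry.bond(name, 300)

# Agent Smith owns 100 units and splits them across ten identities.
smiths = [f"smith-{i}" for i in range(10)]
for name in smiths:
    registry.bond(name, 10)

power_before = registry.voting_power(smiths)
confiscated = registry.slash("smith-0")  # one copy is caught misbehaving
power_after = registry.voting_power(smiths)

print(power_before, confiscated, power_after)
```

Splitting the stake across ten identities gives Smith exactly the share he would have had with one. Each copy that is caught also costs him real money.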

PoS systems offer several other advantages, including:

fast finality without block reversions
higher throughput
lower latency

PoS systems are faster because they don’t need to waste electricity and time to solve a hard but useless cryptographic problem.

Despite being much faster than PoW, PoS systems still have their limits. A globally distributed system will not be able to match the speeds of a centralised exchange. On centralised exchanges, a transaction is typically inserted into the order book and a new price level determined in very short order, measured in thousandths or even millionths of a second.

Trading companies move to buildings closer to exchanges in order to shave tiny amounts of time off their network connections to the exchange, increasing their speed and shortening feedback loops when calculating what trades should be made next.

Network latency between e.g. London and Melbourne is about 300 milliseconds. So far we’ve been unable to code around that, because it’s determined by the speed of light in fibre as it travels tens of thousands of kilometres.
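A rough lower bound can be worked out from physics alone. Both inputs below are assumptions not taken from this article: a great-circle distance of about 16,900 km between London and Melbourne, and a fibre refractive index of about 1.47. Real cables follow longer routes and pass through switching equipment, so measured latency is higher.

```python
SPEED_OF_LIGHT_KM_PER_S = 299_792.458
FIBRE_REFRACTIVE_INDEX = 1.47        # assumption: typical value for optical fibre
LONDON_MELBOURNE_KM = 16_900         # assumption: approximate great-circle distance


def one_way_latency_ms(distance_km, refractive_index=FIBRE_REFRACTIVE_INDEX):
    """Minimum one-way travel time for light in fibre over a straight path."""
    speed_in_fibre = SPEED_OF_LIGHT_KM_PER_S / refractive_index
    return distance_km / speed_in_fibre * 1000


one_way = one_way_latency_ms(LONDON_MELBOURNE_KM)
round_trip = 2 * one_way

print(round(one_way), "ms one way,", round(round_trip), "ms round trip")
```

Under these assumptions the floor is roughly 83 milliseconds one way and 166 milliseconds for a round trip. That is the same order of magnitude as the observed figure of about 300 milliseconds, and no amount of software optimisation can get below it.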

But we aim to get as close as the universe will allow.

How far along is Vega?

Having tested our ideas through an initial proof of concept in April 2018, we’re now creating a full production build of Vega. Keep an eye out for more posts about how we’re tackling these challenges. Follow along on Medium, or sign up to our newsletter.