Analysis of hashrate-based double-spending
Latest version: July 14, 2019
Bitcoin ([Bitcoin]) is the world’s first decentralized digital currency. Its main technical innovation is the use of a blockchain and hash-based proof of work to synchronize transactions and prevent double-spending the currency. While the qualitative nature of this system is well understood, there is widespread confusion about its quantitative aspects and how they relate to attack vectors and their countermeasures. In this paper we take a look at the stochastic processes underlying typical attacks and their resulting probabilities of success.
The Bitcoin system revolves around the concept of transactions, digitally signed announcements that the owner of some coins agrees to transfer them to a different owner. The sender of coins will typically expect some product or service in return.
This concept is undermined if the sender is able, after receiving the product, to broadcast a conflicting transaction sending the same coins back to himself. As long as the recipient cannot be sure that the coins are his to keep and cannot be redirected to another party without his consent, it is not safe for him to deliver a product in exchange. A double-spending attack is a successful attempt to first convince a merchant that a transaction has been confirmed, and then convince the entire network to accept some other transaction; the merchant is left with neither product nor coins, and the attacker keeps both.
This is a problem of synchronization – there needs to be some universally accepted signal indicating that some transaction is final and that no conflicting transaction can ever be accepted. Given two conflicting transactions, it does not really matter which of them will be accepted, as long as there is a way to know that one transaction has been accepted and can no longer be reversed.
Bitcoin solves this with a proof-of-work system: computational effort (in the form of hash calculations) is spent on acknowledging groups of transactions, called blocks, and a transaction is considered final once sufficient work has gone into acknowledging the block that contains it. Because the blocks are linked to form a chain, the total work backing any transaction keeps increasing, making it difficult to elevate any conflicting transaction to the same confirmation status without a prohibitive computational effort.
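As a rough illustration of this idea, and not of Bitcoin's actual block format, the following Python sketch searches for a nonce whose hash falls below a target and links two blocks by feeding the first block's hash into the second. The names `block_hash`, `mine`, `is_valid` and `difficulty_bits`, as well as the text-based header encoding, are assumptions made for this example; real Bitcoin hashes a binary 80-byte header and encodes its target differently.

```python
import hashlib
from typing import Tuple


def block_hash(prev_hash: str, payload: str, nonce: int) -> str:
    """Double SHA-256 of a toy header (hypothetical text encoding)."""
    header = f"{prev_hash}|{payload}|{nonce}".encode("utf-8")
    return hashlib.sha256(hashlib.sha256(header).digest()).hexdigest()


def mine(prev_hash: str, payload: str, difficulty_bits: int) -> Tuple[int, str]:
    """Try nonces 0, 1, 2, ... until the hash is below the target.

    On average this takes about 2**difficulty_bits attempts.
    """
    target = 1 << (256 - difficulty_bits)
    nonce = 0
    while True:
        digest = block_hash(prev_hash, payload, nonce)
        if int(digest, 16) < target:
            return nonce, digest
        nonce += 1


def is_valid(prev_hash: str, payload: str, nonce: int, difficulty_bits: int) -> bool:
    """Checking a block costs one hash evaluation, however long mining took."""
    target = 1 << (256 - difficulty_bits)
    return int(block_hash(prev_hash, payload, nonce), 16) < target


if __name__ == "__main__":
    genesis = "00" * 32  # placeholder hash for a block with no parent
    nonce1, hash1 = mine(genesis, "alice pays bob", difficulty_bits=12)
    # The second block commits to the first by including its hash.
    nonce2, hash2 = mine(hash1, "bob pays carol", difficulty_bits=12)
    print(nonce1, hash1)
    print(nonce2, hash2)
```

Because the second block's hash depends on the first block's hash, changing the first payload would invalidate both proofs, so an attacker would have to redo the work for every block from the altered one onwards.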
However, if the attacker is in fact in control of substantial computational power, he may succeed in doing just that. Satoshi Nakamoto’s original Bitcoin whitepaper ([Satoshi]) contains a discussion of the statistical aspects of this problem; in this paper we clarify and expand on this work.
2 The blockchain and branch selection
Bitcoin transactions are grouped into blocks. Every block references an earlier block by including the uniquely identifying hash of this earlier block in its header. The one exception is the first ever block, known as the genesis block, which of course cannot reference an earlier block.
The blocks hence form a tree, with the genesis block as the root, and each block being a child of the block it references. A branch in this tree is a path from a leaf block to the genesis block; each such branch represents one version of the history of Bitcoin transactions. Each branch must be internally consistent and can never include two conflicting transactions; however, the branches need not be consistent with one another, and one branch can include a transaction which contradicts a transaction in another branch.
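The tree structure can be sketched in a few lines of Python. This is a simplified model under stated assumptions: the `Block` class, the `spent_coins` field and the helper names are hypothetical, and a conflict is reduced to the same coin identifier being spent in two blocks of one branch.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Optional


@dataclass(frozen=True)
class Block:
    block_id: str                              # stands in for the block's hash
    parent_id: Optional[str]                   # None only for the genesis block
    spent_coins: FrozenSet[str] = frozenset()  # coins spent by this block (toy model)


def branch(blocks: Dict[str, Block], leaf_id: str) -> List[Block]:
    """Follow parent references from a leaf back to the genesis block."""
    path: List[Block] = []
    current: Optional[str] = leaf_id
    while current is not None:
        block = blocks[current]
        path.append(block)
        current = block.parent_id
    return path


def is_consistent(path: List[Block]) -> bool:
    """A branch is consistent if no coin is spent more than once along it."""
    seen = set()
    for block in path:
        if seen & block.spent_coins:
            return False
        seen |= block.spent_coins
    return True


def leaves(blocks: Dict[str, Block]) -> List[str]:
    """Blocks that no other block references as its parent."""
    parents = {b.parent_id for b in blocks.values()}
    return [block_id for block_id in blocks if block_id not in parents]


if __name__ == "__main__":
    tree = {
        "G": Block("G", None),
        "A": Block("A", "G", frozenset({"coin1"})),
        "B": Block("B", "G", frozenset({"coin1"})),  # conflicts with A, but on another branch
        "C": Block("C", "A", frozenset({"coin2"})),
    }
    for leaf in leaves(tree):
        path = branch(tree, leaf)
        print(leaf, [b.block_id for b in path], is_consistent(path))
```

In this example the branches ending at `B` and `C` are each internally consistent, even though they spend `coin1` in contradictory ways.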
Since a single coherent history of transactions is desired, at every point one branch in the tree is considered the valid blockchain. By convention, every node treats the longest branch it is aware of as the valid chain (more precisely, the branch representing the most proof of work). If multiple branches are tied, a node considers valid the one it learnt of first, until the tie is broken. When nodes create new blocks, they are expected to reference the leaf of the valid branch, thereby extending the chain.
Different nodes may be in temporary disagreement about the valid blockchain if they learnt of tied branches at different times; this is known as a blockchain fork. However, the disagreement is quickly resolved when a new block is found, as including it makes one branch longer than the other, a fact on which all nodes can agree. This is illustrated in Figure [fig:ETree].
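The selection rule and the resolution of a fork can be sketched as follows. This is a minimal model under stated assumptions: the `Block` and `NodeView` classes are hypothetical, every block carries one unit of work by default, and a node is assumed to receive a block only after it has received that block's parent.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Block:
    block_id: str
    parent_id: Optional[str]
    work: int = 1  # proof of work contained in this block, in arbitrary units


class NodeView:
    """One node's knowledge of the block tree and its choice of valid branch."""

    def __init__(self, genesis: Block) -> None:
        # Cumulative work of the branch ending at each known block.
        self.total_work: Dict[str, int] = {genesis.block_id: genesis.work}
        self.best_tip = genesis.block_id

    def receive(self, block: Block) -> str:
        """Record a new block and return the leaf of the branch now considered valid."""
        if block.block_id not in self.total_work:
            # Assumes the parent is already known to this node.
            self.total_work[block.block_id] = self.total_work[block.parent_id] + block.work
            # Strict inequality: on a tie, the branch learnt of first stays valid.
            if self.total_work[block.block_id] > self.total_work[self.best_tip]:
                self.best_tip = block.block_id
        return self.best_tip


if __name__ == "__main__":
    genesis = Block("G", None)
    a, b = Block("A", "G"), Block("B", "G")  # two tied branches
    c = Block("C", "B")                      # a new block extending B

    node1, node2 = NodeView(genesis), NodeView(genesis)
    node1.receive(a); node1.receive(b)       # node1 hears of A first
    node2.receive(b); node2.receive(a)       # node2 hears of B first
    print(node1.best_tip, node2.best_tip)    # the nodes disagree: a fork

    node1.receive(c); node2.receive(c)
    print(node1.best_tip, node2.best_tip)    # both now consider C's branch valid
```

The strict comparison in `receive` is what implements the first-seen tie-break: a tied branch never displaces the current one, while any branch with strictly more work does.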