Faster Bulletproofs with Ristretto & AVX2

Faster Bulletproofs with Ristretto & AVX2

Zero-knowledge range proofs are a key building block for confidential transaction systems, such as Confidential Transactions for Bitcoin, Chain’s Confidential Assets, and many other protocols. Range proofs allow a verifier to ensure that secret values, such as asset amounts, are nonnegative. This prevents a user from forging value by secretly using a negative amount. Since every transaction involves one or more range proofs, their efficiency, both in terms of proof size and verification time, is key to transaction performance.

A few months ago, Bünz, Bootle, Boneh, Poelstra, Wuille, and Maxwell published Bulletproofs, which dramatically improves proof performance both in terms of proof size and verification time. In addition, it allows proving a much wider class of statements than just range proofs.

At Chain, we (Henry de Valence, Cathie Yun and Oleg Andreev) have been working on a pure-Rust Bulletproofs implementation, whose initial version we are publishing today, together with a set of notes.

Our implementation uses Ristretto, a variant of Mike Hamburg’s Decaf scheme, to implement a prime-order group using Curve25519, and accelerates curve operations with an AVX2 implementation of the parallel Edwards formulas published by Hisil, Wong, Carter, and Dawson in 2008. It is still a work in progress, but it can verify a 64-bit range proof in only 1.15ms, on an Intel i7–7800X. Our experiments suggest that the existing libsecp256k1 implementation takes approximately 2.18ms on the same hardware. We intend to extend our implementation with further optimizations, such as precomputation, and generalize it to support proofs for arbitrary circuits.

Our implementation

Our Bulletproofs implementation is written entirely in Rust, a powerful systems programming language made by Mozilla. Rust provides strong type-safety, prevents accidental sharing of mutable data, and eliminates an entire class of memory-safety bugs which are endemic to C and C++.

In addition, it provides much better ergonomics, without adding runtime cost. For example, the Bulletproofs paper notes that the verification checks for the range proof protocol can be combined into a single equation. We construct all the necessary scalars using Rust iterators, and use iterator combinators to combine the verification checks. Because Rust iterators are lazy, each coefficient’s computation ends up inlined into the multiscalar multiplication routine, which avoids managing temporaries.

Bulletproofs also allow proof aggregation, where multiple provers engage in a multi-party computation (MPC) to produce a joint proof that each of their values are in range. We have a work-in-progress implementation of the MPC protocol which can produce and verify aggregated proofs, and which encodes the protocol state machine into Rust’s type system, so that typechecking ensures that protocol messages are handled correctly.


Bulletproofs require a prime-order group. The implementation by the authors uses the Bitcoin curve secp256k1, a Weierstrass curve which has prime order. This is a natural choice for adding confidential transactions to Bitcoin, but what about other systems?

Non-prime-order curves have significant advantages, such as complete addition laws and faster and simpler formulas. However, these advantages come at a cost: the order is not a prime q, but h·q for some small cofactor h. The effects of the cofactor can be subtle, and the task of managing them is pushed upwards into the protocol, say, by multiplying by h in an appropriate place.

Even for simple protocols, like signatures, this can cause undesirable effects. For instance, Ed25519 signature verification is not well-defined, so standards-conforming implementations may not agree on which signatures are valid. For more complex protocols, the cofactor is a recurring source of vulnerabilities and design complications, making direct use of a non-prime-order curve a non-starter.

A clean solution to this problem is Mike Hamburg’s Decaf proposal, which shows how to use isogenies and quotient groups to implement a prime-order group using a non-prime-order curve — with no additional cost. The original Decaf proposal is designed for curves such as Ed448 with h = 4, but a variant of Decaf, called Ristretto, works for curves with h = 8, such as Curve25519.

This gives the best of both worlds: the correct abstraction required to implement this and other protocols, and the simplicity, efficiency, and speed of a non-prime-order curve.

The Ristretto implementation we use is provided by Isis Lovecruft and Henry de Valence’s curve25519-dalek, which was designed to provide a clean and safe API for implementing complex protocols such as range proofs and anonymous credentials. To further accelerate group operations, curve25519-dalek uses AVX2 intrinsics to implement parallel formulas for point operations on Edwards curves. These formulas were published in 2008 by Hisil, Wong, Carter, and Dawson in Twisted Edwards Curves Revisited but don’t seem to have been implemented in software before.

More details on the parallel formulas and their implementation can be found in the notes for the curve25519-dalek avx2 backend. More details on the Ristretto group can be found on the website.

Future Plans

Our implementation is still at an early stage and will continue in a public Github repository. We also have working implementations of the MPC protocol for proof aggregation and of batch verification, which are currently being polished and will be merged in the main codebase soon. We plan to generalize the implementation to support arithmetic circuits, and optimize curve operations for larger problem sizes.

Thanks to Benedikt Bünz for answering our questions about the paper and helping us understand it, to Mike Hamburg for designing Decaf and Ristretto, to Isis Lovecruft for their work on the upstream Dalek libraries.