FROST Performance

by Conrado Gouvêa

In this post, we briefly describe the threshold signature scheme FROST, and present benchmark results for our FROST implementation. We also describe one optimization that greatly improves its speed, especially for scenarios with a large number of signers (e.g. 12 times faster in a t=667 out of n=1000 scenario).

What is FROST?

FROST is an efficient threshold Schnorr signature scheme invented by Chelsea Komlo (at the University of Waterloo and Chief Scientist at the Zcash Foundation) and Ian Goldberg (at the University of Waterloo). FROST is in the process of becoming an IETF RFC.

Threshold signatures make use of a single private key that is split into shares that are held by multiple participants. A subset of the participants (e.g. 3 out of 5, or whatever threshold is specified at key generation) can generate a signature that can be verified by the group public key, as if it were signed by the original unsplit private key. The key can be generated centrally before being split and distributed to the participants (referred to as “trusted dealer”), or a decentralized key generation (DKG) protocol can be used, which means that no single party ever possesses the entire private key. It has many applications, such as allowing multiple entities to manage a cryptocurrency wallet in a safer and more resilient manner. FROST defines a two-round signing protocol, and is consequently the most efficient Schnorr threshold protocol in the literature today.

Schnorr has several advantages over the more traditional ECDSA signature: it’s much simpler; easier to implement in a secure manner; and is compatible with modern signatures schemes such as EdDSA. A significant upside is that it is also compatible with Zcash spend signatures (i.e. the signature that authorizes a Zcash shielded transaction), and this is one of our main use cases for FROST.

Currently, the RFC is close to completion. We have also completed a Rust implementation of all ciphersuites specified in the RFC, and are now doing final cleanups and improvements prior to the first release of the crates (which will then be audited). A ZIP has been drafted, which describes how FROST can be used to create Zcash spend authorization signatures, and Chelsea is working on an associated security proof (for the re-randomizable version of FROST).

Benchmarking and Investigating the Aggregate step

When we presented FROST at Zcon3 (check out the research and implementation talks), we were asked how FROST performed in larger settings, such as a 667-of-1000 signers. (This is motivated by a mechanism proposed by Christopher Goes for bridging Zcash with other ecosystems using FROST.) We set out to benchmark our Rust implementation, and I was a bit surprised about one particular step, “Aggregate”.

The FROST scheme can be split into steps: Key Generation, Round 1, Round 2, and Aggregate. Key Generation only needs to be done once, while the rest are carried out each time the group wishes to generate a new signature. In the Round 1 step, the participant generates commitments which are broadcast to all other participants via a Coordinator. In the Round 2 step, using these commitments and their respective key shares generated during Key Generation, they produce a signature share which is sent to the Coordinator. Finally, the Coordinator carries out the final step, Aggregate, which produces the final signatures from all the signatures shares received.

Initial Benchmarks

The initial benchmark for the Ristretto255 suite looked like the following. (Benchmarks were run on an AMD Ryzen 9 5900X 3.7GHZ, Ubuntu 22.04, Rust 1.66.0.)

Graph plotting time in milliseconds for each FROST step, for 3 different scenarios: 7 of 10, 67 of 100, and 667 of 100. Key Generation takes 0.71, 7.52 and 175.56ms respectively; Round 1 takes 0.08 for all scenarios; Round 2 takes 0.42, 3.79 and 37.64 respectively; and Aggregate takes 1.63, 16.61 and 404.27ms respectively. It’s notable how large is the Aggregate timing compared to the others, particularly in the 667 of 1000 scenario.

(Note that Round 1 and 2 timings in this post refer to per-signer timings, while Key Generation and Aggregate are performed by the Coordinator.)

It was expected that the timings would increase with the larger number of participants (with the exception of Round 1, which does not depend on that number), but the Aggregate timings appeared too high, surpassing 400ms for the 667-of-1000 case (which may not seem much but it’s unusual for a signing procedure).

Optimized Benchmarks

I intended to investigate this slow code, but I didn’t even need to. Coincidentally, while the RFC was in the last call for feedback, Tim Ruffing pointed out that Aggregate can be sped up significantly. Originally, it was specified that each share received from the participants should be verified (each signature share can be verified separately to ensure it is correct) and then aggregated. Tim’s observation is that the shares can be simply aggregated and the final signature verified with the group public key. If the verification fails, then it’s possible to find which participant generated an incorrect share by verifying them one by one (if desired). This greatly speeds up the case where all shares are correct, which should be the most common.

Aggregate is now more than 3 times faster for the 7 of 10 scenario, more than 4 times faster for the 67 of 100 scenario, and more than 12 times faster for the 667 of 1000 scenario! The Aggregate performance is now very similar to the Round 2 step, which makes sense since they have a very similar structure.

Ciphersuite Benchmarks

Here’s the Aggregate performance comparison for all ciphersuites, in three different scenarios:

Same as the previous graph, but in the 67 of 100 scenario. Ed25519: 16.4ms then 3.3ms; Ristretto255: 16.6ms then 3.4ms; secp256k1: 17.6ms then 3.8ms; P-256: 38.8ms then 9.4ms; Ed448: 101.7ms then 20.4ms.
Same as the previous graph, but in the 667 of 1000 scenario. Ed25519: 394ms then 32ms; Ristretto255: 404ms then 33ms; secp256k1: 347ms then 37ms; P-256: 590ms then 91ms; Ed448: 1529ms then 197ms.

Examining Overall Performance

With the benchmark machinery in place (we used criterion.rs) we can provide benchmark results for all supported ciphersuites in different scenarios. These all use the optimization described above.

Graph of times in ms for each FROST step, for each ciphersuite, in the 7 of 10 scenario. Data is available in table format below.
Graph of times in ms for each FROST step, for each ciphersuite, in the 67 of 100 scenario. Data is available in table format below.

The same data in table format:

ed25519

  Key Generation
with Dealer
Round 1 Round 2 Aggregate
2-of-3 0.24 0.08 0.12 0.22
7-of-10 0.73 0.08 0.39 0.45
67-of-100 7.7 0.08 3.64 3.28
667-of-1000 181.45 0.08 36.92 31.88

ristretto255

  Key Generation
with Dealer
Round 1 Round 2 Aggregate
2-of-3 0.24 0.08 0.13 0.22
7-of-10 0.71 0.08 0.42 0.47
67-of-100 7.61 0.08 3.77 3.40
667-of-1000 179.43 0.08 38.32 32.54

secp256k1

  Key Generation
with Dealer
Round 1 Round 2 Aggregate
2-of-3 0.26 0.09 0.15 0.25
7-of-10 0.78 0.09 0.48 0.52
67-of-100 7.50 0.09 4.41 3.82
667-of-1000 123.74 0.09 46.11 37.48

p256

  Key Generation
with Dealer
Round 1 Round 2 Aggregate
2-of-3 0.56 0.18 0.33 0.58
7-of-10 1.71 0.19 1.08 1.24
67-of-100 16.51 0.18 10.03 9.38
667-of-1000 206.85 0.19 97.49 90.82

ed448

  Key Generation
with Dealer
Round 1 Round 2 Aggregate
2-of-3 1.56 0.51 0.75 1.39
7-of-10 4.56 0.53 2.36 2.88
67-of-100 46.05 0.52 21.04 20.37
667-of-1000 693.45 0.53 211.68 197.00

Conclusion

FROST is a highly performant threshold signature scheme. We described an optimization for its “Aggregate” step, pointed out by Tim Ruffing in his review of the RFC draft, which made it 12 times faster in the 667 of 1000 scenario. With that, the Round 2 and Aggregate steps (recall that Key Generation only runs once, and Round 1 is almost negligible in all scenarios) each take less than 100ms for all ciphersuites, except Ed448, in the 667 of 1000 scenario (which is an uncommonly large one).

Appendix: Point Multiplications in FROST Steps

The time-consuming part of each step is elliptic curve point multiplication. It can be classified into three categories: random point multiplication, where the point being multiplied is unknown in advance; and fixed point multiplication, where the point is known in advance (usually, the “generator”).

Here’s a breakdown of what each step requires:

  • Key Generation with Trusted Dealer
    • One fixed point multiplication to derive the group public key from the group private key;
    • One fixed point multiplication per MIN_PARTICIPANTS to derive a commitment for each polynomial coefficient;
    • One fixed point multiplication per MAX_PARTICIPANTS to derive their individual public keys.
    • Total: (1 + MIN_PARTICIPANTS + MAX_PARTICIPANTS) fixed point multiplications
  • Round 1
    • Two fixed point multiplications to generate commitments to the pair of nonces.
    • Total: 2 fixed point multiplications
  • Round 2
    • One random point multiplication per NUM_PARTICIPANTS to compute the group commitment.
    • Total: NUM_PARTICIPANTS random point multiplications
  • Aggregate
    • One random point multiplication per NUM_PARTICIPANTS to compute the group commitment. If the Coordinator is also a participant, they could reuse the value from Round 2, but we didn’t assume that in our benchmark (and our implementation does not support this for now);
    • One random point multiplication and one fixed point multiplication to verify the aggregated signature;
    • Verifying all shares (i.e. in our original approach, or to find a corrupt signer if the aggregated signature failed) additionally requires one random point multiplication per NUM_PARTICIPANTS to compute each commitment share and then one random and one fixed point multiplication per NUM_PARTICIPANTS to actually verify each share.
    • Total (with optimization): (NUM_PARTICIPANTS + 1) random point multiplications and 1 fixed point multiplication. To identify a corrupt signer: up to 2*NUM_PARTICIPANTS random point multiplications and NUM_PARTICIPANTS fixed point multiplications.
    • Total (before optimization): 3*NUM_PARTICIPANTS random point multiplications and NUM_PARTICIPANTS fixed point multiplications.