Oblivious Message Retrieval

by Deirdre Connolly

The Zcash Foundation has been looking into Oblivious Message Retrieval (OMR) to determine whether it offers a potential solution to the recent performance problems that have affected Zcash wallet users, and whether there are any advantages to be gained from implementing this in a Zcash context. For example, can OMR be used to address the privacy leaks of using lightwalletd? Light wallets like Nighthawk Wallet, Zecwallet Lite, and YWallet use lightwalletd to fetch and send Zcash transactions, without the overhead of running a full node locally. Would running OMR on a full node provide us with any performance advantages when fetching transactions? 

The Current Lightwalletd Architecture

Light wallets are typically used to download shielded transactions via a lightwalletd server. To preserve privacy about which transactions a wallet is interested in, one can download all the transactions in full. However, this is expensive and nearly amounts to the full blockchain in size/bandwidth, which is not great for lighter devices or those with low bandwidth/network speed, etc. This is why the lightwallet protocol exists.

Most light wallets fetch a limited amount of data about every shielded transaction (i.e. a `CompactTx`) from a lightwalletd proxy, just enough to allow trial decryption. The wallet trial-decrypts every `CompactTx` to see if that transaction has been sent to them. This is because all shielded transactions on the Zcash blockchain are encrypted and indistinguishable from the others such that no one can tell which transactions are addressed to whom, unless you have the incoming viewing key to decrypt them. Trial decryption takes more computational effort than transparent syncing, proportional to how many shielded transactions are on on-chain, but advances in syncing algorithms have made this cheaper. However, to construct new transactions, a wallet needs data that’s not included in CompactTxs or CompactBlocks, so after detecting a transaction, the wallet has to fetch the full transaction via lightwalletd. Everything stays encrypted for shielded transactions (the ‘data’), but now the lightwalletd proxy knows that a wallet with a certain IP address has requested a certain subset of Zcash shielded transactions, at a specific time (‘metadata’). This is an unfortunate and undesirable privacy leak.

It would be better if we can trial-decrypt and fetch transactions privately, and also not have to download all transactions in full to do so.

What If We Can Do Better?

Oblivious Message Retrieval (OMR) is a construction from Zeyu (Thomas) Liu and Eran Tromer that uses lattice-based (LWE) fully-homomorphic encryption. It allows a full node to scan the Zcash blockchain for shielded transactions encrypted for a particular recipient and return just the transactions they need without knowing the full query or the results in plaintext. This would offload more computational effort and bandwidth requirements to a full node while also protecting the metadata privacy of a user or wallet, hiding which Zcash transactions the recipient cares about. The only thing the OMR-compatible full node learns about the user is that they are interested in OMR-compatible transactions, but not which ones.


At a high level, here’s how a version of Zcash that supports Oblivious Message Retrieval could work:

A user generates an updated unified address that includes a new clue key (this also supports diversified addresses). When sending money to an OMR-supporting address, the regular Zcash shielded transaction will occur, but the transaction will also include a clue ciphertext. This clue ciphertext is about 1KB of additional data per shielded output.

The user then generates a detection key and registers that with an OMR-supporting Zcash full node. The node uses that detection key to scan all the shielded transactions on the chain and their attached clue ciphertexts. The scanning involves taking all the transactions, including the clue ciphertexts, and fully-homomorphically, trying to use the detection key to decrypt the clue ciphertexts associated with the shielded transactions.

This is the magic of fully-homomorphic encryption: the full node cannot distinguish the (encrypted) secret key that is trying to decrypt the clue ciphertexts, as it is homomorphically encrypted, but it can still do the computation and return the encrypted results of the computation to the holder of the detection secret key.

As the OMR-compatible full node scans all the transactions, it homomorphically accumulates all the transactions whose clue ciphertexts decrypt under the clue secret key, and returns the digest of all the transactions as a new homomorphically-encrypted result. The user can then decrypt those results with the detection secret key, and then have all the Zcash transactions relevant to their wallet ready to go. The full node does not know which transactions are relevant to the wallet, as it processes all of them and computes the matching under fully-homomorphic encrypted computation.

What might this feel like? (beyond lightwalletd flow)

OMR supports two kinds of querying for results that support multiple ways for a wallet developer or other client to interact with an OMR-supporting full node:

“Sync” querying – In the single-shot model, the recipient makes a stateless query to the full node: it provides a detection key, a range of blocks to scan, and a bound k on the number of pertinent messages, and asks the node to digest those blocks with respect to that key. The detector runs all of the `Retrieve` algorithm. Response latency is high: about 0.145 sec per transaction.

“Async” querying – The subscribe and finalize model utilizes the streaming variant. The user provides a detection key and asks to subscribe to ongoing (and perhaps some past) transactions. The node starts processing these transactions, doing most of the computation (i.e. homomorphically computing the PV ciphertexts). Later, the user shows up and asks the server to finalize the results and pack them into a digest, with respect to some k. Neither the finalization time nor message bound k need be known in advance. This reduces finalization to 0.35 milliseconds/tx per core. 

The latter approach fits in nicely with the steady-state wallet UX: open up wallet app every once in a while, get an up-to-date view of wallet state ~immediately, then take an action.

This is attractive to wallet developers and users who value the privacy gains of OMR, but hopefully would not sacrifice usability and experience to achieve them.

This Sounds Great! Are We Doing This?

Well, we’re trying to figure that out. Zcash has changed a bit since OMR was first published (we have Orchard Actions now, for example). There are design decisions we would have to make about the Zcash protocol in order to support OMR, and other compatibility, design, and UX questions we’d have to address. Such as:

  • How backwards compatible is this?
    • Requires an updated key/address (see below), but also new changes to shielded transactions, specifically the inclusion of 956 bytes of clue data per shielded transaction (or specifically per description: Sapling Output, may need to be updated for Orchard Actions). This is okay but requires a network upgrade and coordination between node and wallet developers. It also increases the transaction size, in addition to the larger Orchard Actions from NU5.
    • “It would be the cleanest to extend the transaction format with a dedicated clue field,” according to the OMR paper. The clue is 1KB per spend.
  • Do I need to generate new keys and a new address?
    • Well, kind of. You need to generate a new clue key to be part of the address, so with Unified Addresses, this can be a new extension with existing Sapling/Orchard key material, but the Unified Address will look different. This clue key uses a different type of cryptography from the Sapling/Orchard keys so we can’t reuse the other keys to support Oblivious Message Retrieval.
    • Per the constructions proposed in the OMR paper, the clue key is 133KB. That’s right, kilobytes (thanks, lattices!) This is too big for a regular QR code encoding, or hex/ascii string, but, an alternative that should work in the Unified Address framework, and instead of having the full clue key inlined in the address, we encode and inline a link to and hash of the clue key, and fetch it from somewhere else. The clue key is a ~public key.
    • How can we maintain trust and anonymity if we host OMR keys on another platform? Key transparency?
    • When you set up a full node to do OMR detection for you, you generate and upload a detection key. This key is not part of your address. It is even larger than the clue key, at ~129MB (yes, megabytes). This is a one-time upload for recipients, but full nodes will have to keep them per-recipient as long as they are privately scanning for that user. The Zcash mainnet chain is coming up on 100GB in size, and storage is relatively cheap, but this does mean adding another recipient client as a full node has non-negligible storage costs as well as the computation costs, and the clue key size may restrict the number of addresses that can be detected live together, as some use cases involve hundreds or thousands of keys.
  • What about diversified addresses?
    • OMR clue keys are diversifiable, so that property of unlinkable keys is maintained; incoming payments sent to any of a user’s (multiple) diversified addresses can be retrieved and spent using a single key tuple.
  • How much compute does this require from full node operators?
    • “∼$1.02 per million payments scanned (for each recipient served), using commodity cloud computing,” according to the OMR paper. This is about the whole current Zcash mainnet chain right now so it sounds scalable and affordable to serve multiple recipients.
  • If I serve a lot of OMR recipients on one full node, will query latency suffer?
    • Hard to say before there is a full implementation to test. Using the subscribe and finalize query model for the general wallet case would indicate it could be quite fast and scale well in general in an async-supporting full node like zebrad.
  • This is based on lattices, does that mean it’s post-quantum secure?
    • It looks like it, yes!
  • How is this different from viewing keys?
    • Registering viewing keys (incoming, outgoing, or full) with a full node and having them scan the chain on a user’s behalf gets close to the data flow of OMR, but with one crucial difference: the node can see which shielded transactions you are interested in and addressed to you, and can read the memo field, transaction value, and target address. OMR does not allow this.

What's Next?

If it were possible to deploy OMR without the need for a network update and/or other significant changes (such as addresses), then we would be looking very closely at doing so. However, given the effort involved, we don’t think it’s a priority at this time, and we would want to consider alternative performance improvement approaches (like detection keys).