A Zero Knowledge Paradigm: Part 1 - What is a zk-VM?

Published on:
Jun 12, 2024
Written by:
Lita Team
Read time:
20 min
Category:
Education

Lita Team

“Within the next 5 years, we will be talking about applications of zero-knowledge protocols the way we talk about applications of blockchain protocols. The potential unlocked by the breakthroughs of the last few years is going to take the mainstream by storm”

- Jill Gunter from Espresso Systems, May 2021

Since 2021, the zero knowledge (ZK) landscape has evolved into a diverse ecosystem of primitives, networks, and applications spanning multiple sectors. However, most of ZK remains a mystery to its users and the crypto space at large, despite its development history and recent progress marked by the launches of ZK powered rollups such as Starknet and zkSync Era.

But times are changing. It is our view that zero knowledge cryptography is a powerful, universally adoptable tool for scaling and securing any software. Simply put, ZK is the bridge to mass adoption of cryptography. Quoting Jill once more, an enormous amount of value (both fundamental and speculative) will be created around anything that touches zero knowledge proofs (ZKPs), both in web2 and web3. The best minds in crypto are diligently iterating to make ZK economically feasible and production ready. Even so, the industry admittedly has some progress to make before our envisioned paradigm becomes a reality.

To draw parallels between ZK and Bitcoin adoption, one of the reasons Bitcoin has evolved from a fringe hobbyist forum internet currency to BlackRock approved “digital gold” is the proliferation of developer and community generated literature that cultivated interest. As it stands, ZK exists in a bubble within a bubble. Information is scattered and polarized; articles are either littered with jargon far too difficult to understand, or too layman to convey anything meaningful beyond recycled examples (e.g. Where’s Waldo). It seems as though everyone (experts and laymen) knows what zero knowledge is, but no one can describe how it actually works.

As one of the teams contributing to the zero knowledge paradigm, we want to demystify our work and help a broader audience build a canonical foundation for understanding and analyzing ZK systems and applications, in order to facilitate education and discussions between interested parties, and enable the proliferation of relevant information.

tldr; You can’t grow zero knowledge with zero knowledge.

In this article, we will cover the basics of a zero knowledge and zero knowledge virtual machines, provide a high level summary of the processes in a zkVM, and finish with a set of criteria for evaluating zkVMs.

This is the part one of our three part series, outlined as follows:

  • Part 1: What are zkVMs
  • Part 2: An In-depth look into zkVM Processes
  • Part 3: zkVM Design Tradeoffs

1. Zero Knowledge Proofs: A Primer

What are zero knowledge proofs (ZKPs)?

If you have no prior knowledge of zero knowledge proofs (ZKP), this video from Wired explains the concept at five difficulty levels in an interactive manner with easily understandable examples and demonstrations. Highly recommended.

In simplest terms, ZKPs enable one party (prover) to prove to another party (the verifier) that they know something without revealing what that thing is or any additional information. More specifically, ZKPs prove knowledge of a piece of data, or knowledge of the result of a computation, without revealing the data or inputs. The process of creating a zero knowledge proof involves a series of mathematical models to transform the results of a computation into a piece of otherwise meaningless information that proves successful execution of code, which can later be verified.

In some cases, it takes less work to verify the proof, which is constructed after multiple rounds of algebraic conversions and cryptography, than it would take to run the computation. This unique combination of security and scalability is what makes zero knowledge cryptography such a powerful tool.

zkSNARKs: Zero Knowledge Succinct Non-Interactive Argument of Knowledge

  • Relies on an initial (trusted or non-trusted) setup process to establish parameters for verification
  • Requires at least one interaction between prover and verifier
  • Proof sizes are small and easy to verify
  • SNARK based proofs are used by rollups like zkSync, Scroll, and Linea

zkSTARKs: Zero Knowledge Scalable Transparent Argument of Knowledge

  • No trusted setup required
  • Offer high transparency by using publicly verifiable randomness to create trustless verifiable systems, i.e. generating provably random parameters for proving and verifying
  • Highly scalable as they can generate and verify proofs fast (not always), even when the size of the underlying witness (data) is large
  • Requires no interaction between prover and verifier
  • Comes at the cost that STARKs generate larger proofs, which can be harder to verify than SNARKs
  • Proofs are harder to verify than some zkSNARK proofs but not as hard as some others to verify
  • STARKs are used by Starknet, as well as zkVMs like Lita, Risc Zero, and Succinct Labs

(Note: Succinct’s bridge uses SNARKs but SP1 is a STARK based protocol)

It is worth noting that all STARKs are SNARKs, but not all SNARKs are STARKs.

For a better general understanding of SNARKs and STARKs, we recommend reading this article series written by Yan Zhang and Yi Sun of Axiom, and this collection of articles in Ventali Tan’s github.

2. What is a zkVM?

A virtual machine (VM) is a program that runs programs. In context, a zkVM is a virtual computer that is implemented as a system for generating zero knowledge proofs, or a universal circuit, or tool, for generating ZKPs for any program or computation.

zkVMs remove the need to learn complicated mathematics and cryptography for designing and coding ZK, and enables any developer to execute programs written in their preferred languages and generate ZKPs, making it far easier to integrate and interact with zero knowledge. Broadly speaking, most references to zkVMs implicitly include the compiler toolchains and proof systems appended to the virtual machine that executes the programs, and not just the virtual machine itself. Below, we summarize the main components of a zkVM and their functions:

The main components of a zkVM
The main components of a zkVM

The design and implementation of each component is governed by the choice of proof (SNARKs or STARKs) and the instruction set architecture (ISA) of the zkVM. Traditionally, an ISA specifies what a CPU is capable of (data types, registers, memory etc.) and the sequence of actions the CPU performs when it executes a program. In context, the ISA determines the machine code that is interpretable and executable by the VM. Choosing an ISA can yield radical differences in the accessibility and usability of the zkVM, as well as the speed and efficiency of the proof generation processes, and underpins the construction of any zkVM.

Below are some examples of zkVMs and their components for your reference.

zkVMs and their components

For now, we will focus on the interactions between each component at a high level to provide a framework for understanding the algebraic and cryptographic processes as well as the design tradeoffs of a zkVM in a later article.

3. Abstracted zkVM Process Flow

The following diagram is an abstracted, generalized process flowchart of a zkVM, split and categorized between the format (inputs / outputs) of a program as it moves through the components of a zkVM. We will examine each process in depth in subsequent articles.

General flow for a zkVM

The process flow of a zkVM is generally as follows:

  • Compiler Stage

1. The compiler first compiles programs written in conventional languages (C, C++, Rust, Solidity) into machine code. The format of the machine code is dictated by the choice of the ISA.

  • VM Stage

2. The VM executes the machine code and generates an execution trace, which is the series of steps of the underlying program. The format of this is predetermined by the choice of arithmetization as well as the set of polynomial constraints. Common arithmetization schemes include R1CS as in Groth16, PLONKish arithmetization as in halo2, and AIR as in plonky2 and plonky3.

  • Prover Stage

3. The prover receives the trace and represents it as a set of polynomials bound by a set of constraints, essentially translating the computation into algebra by mapping out the facts mathematically.

4. The prover commits to these polynomials using a Polynomial Commitment Scheme (PCS). A commitment scheme is a protocol which allows the prover to create a fingerprint of some data X, which is called a commitment to X, and later prove facts about X without revealing X, using the commitment to X. The PCS is the fingerprint; a “pre-processed'' succinct version of the constraints on the computation. This allows the prover to prove facts about the computation, now expressed in a polynomial equation, using random values proposed by the verifier in the following steps.

5. The prover runs a Polynomial Interactive Oracle Proof (PIOP) to show that the committed polynomials represent an execution trace which satisfies the given constraints. A PIOP is an interactive proof protocol consisting of a series of rounds in which the prover sends commitments to polynomials, the verifier responds with random field values, and the prover provides evaluations of the polynomial at these random values, akin to “solving” the polynomial equation using random values to probabilistically persuade the verifier.

6. Applying the Fiat-Shamir heuristic; the prover runs the PIOP in a non-interactive mode, where the verifier’s behavior is limited to sending pseudo-random challenge points. In cryptography, the Fiat–Shamir heuristic converts an interactive proof of knowledge into a digital signature for verification. This step encrypts the proof and makes it zero knowledge.  

7. The prover must convince the verifier that the claimed polynomial evaluations are correct, with respect to the polynomial commitments it sent to the verifier. To do this, the prover produces an “evaluation” or “opening” proof, which is provided by the polynomial commitment scheme (fingerprint).

  • Verifier Stage

8. The verifier checks the proof by following the proof system’s verification protocol, either using the constraints or the commitment. The verifier either accepts or rejects the result according to the validity of the proof.

Summarily, a zkVM proof proves, for a given program, a given result, and given initial conditions, that there is some input which causes the program to produce the given result when executed from the given initial conditions. We can combine this statement with the process flow to arrive at the following description of a zkVM.

A zkVM proof proves, for a given VM program and a given output, that there is some input which causes the given program to produce the given output when executed on the VM.

4. Evaluating zkVMs

What are the criteria by which we should evaluate zkVMs? In other words, when should we say that one zkVM is better than another? Truthfully, the answer depends on the use case.

Lita’s market research suggests that for most commercial use cases, the most important properties, out of speed, efficiency, and succinctness, are either speed or core-time efficiency, depending on the application. Some applications are price sensitive and want to optimize for low energy consumption and low use of capital in proving; for these, core-time efficiency is probably the most important metric to optimize. Other applications, particularly finance or trading related applications, are latency sensitive and want to optimize for speed.

Most publicized comparisons of performance focus exclusively on speed, which is important but not a holistic measurement of performance. There are also a few important properties that measure the reliability of a zkVM; most of which are not up to production ready standards, even for market leading incumbents.

We hereby propose that zkVMs should be evaluated on the following criteria, separated into two subsets:

Main criteria for evaluating zk-VMs

Baseline: Measures the reliability of zkVMs

  • Correctness
  • Security
  • Trust Assumptions

Performance: Measures the capabilities of a zkVM

  • Efficiency
  • Speed
  • Succinctness

4.1 Baseline: Correctness, Security, and Trust Assumptions

When evaluating zkVMs for mission critical applications, correctness and security should be considered as the baseline. There needs to be sufficient reason to be confident about the correctness, and sufficiently strong claimed security. Also, the trust assumptions need to be sufficiently weak for the application.

Without these properties, the zkVM is potentially worse than useless for the application, as it may fail to perform as specified and expose users to hacking and exploits.

i. Correctness

  • The VM must execute the computation as intended
  • The proof system must satisfy its claimed security properties

Correctness is comprised of three properties:

  • Soundness: The proof system is truthful and therefore everything it proves is true. The verifier rejects proofs of false statements; it only accepts a computation's result if the inputs actually produce that result.
  • Completeness: The proof system is complete and can prove all true statements. If the prover claims it can prove a computation's result, it must be able to produce a proof that the verifier accepts.
  • Zero Knowledge: Possessing a proof doesn't reveal any more about the computation's inputs than knowing the result itself would

You can have completeness without soundness; if the proof system proves everything including falsity, obviously it is complete but not sound. Inversely, you can have soundness without completeness; if the proof system proves a program existed but does not cannot prove the computations, obviously it is sound (after all, it never proves any falsity), but not complete.

ii. Security

  • Relates to the tolerances of soundness, completeness, and zero knowledge

In practice, all three correctness properties have non-zero tolerances. This implies all proofs are statistical probabilities of correctness, and not absolute certainties. A tolerance refers to the maximum tolerable probability that one property will fail. Zero tolerances are of course the ideal, but zkVMs do not achieve zero tolerances on all of these properties in practice. Perfect soundness and completeness appear to be incompatible with succinctness, and there are no known methods for achieving perfect zero knowledge. A common way of measuring security is in bits of security, where a tolerance of 1 / (2^n) is referred to as n bits of security. More bits of security is better.

If a zkVM is perfectly correct, that does not necessarily imply that it is reliable. Correctness only implies that the zkVM satisfies its security properties up to the claimed tolerances. It does not imply that the claimed tolerances are low enough to be market ready. Also, if a zkVM is sufficiently secure, that does not imply that it is correct; security refers to the claimed tolerances, not the tolerances that are actually achieved. Only when a zkVM is both perfectly correct and sufficiently secure can it be said that the zkVM is reliable up to the claimed tolerances.

iii. Trust Assumptions

  • Assumptions about the honesty of the prover and verifier to reach the conclusion the zkVM functions reliably

When zkVMs have trust assumptions, this usually takes the form of a trusted setup process. A setup process for a ZK proof system runs once, before the first proof is generated using the proof system, in order to generate some information called “the setup data.” In a trusted setup process, one or more individuals generate some randomness which gets incorporated into the setup data, and it needs to be assumed that at least one of those individuals deleted the randomness which they incorporated into the setup data.

There are two common trust assumption models in practice.

An honest majority trust assumption states that out of some set of N individuals, more than N/2 of those individuals exhibited integrity in some particular interaction(s) with the system, which is commonly used by blockchains

A “1 out of N” trust assumption states that out of some set of N individuals, at least one of those individuals exhibited integrity in some particular interaction(s) with the system, which is commonly used by MPC based tools and applications.

It is generally agreed that all else being equal, zkVMs with no trust assumptions are more secure than zkVMs which require trust assumptions.

4.2 The zkVM Trilemma: Balancing Speed, Efficiency, and Succinctness in zkVMs

The zkVM trilemma: speed, efficiency, and succinctness

Speed, efficiency, and succinctness are all sliding scale properties. All of these factors contribute to the end user cost of zkVM. How they should be weighted in an evaluation depends on the application. Often, the fastest solution is not the most efficient or the most succinct; the most succinct solution is not the fastest or the most efficient; and so on and so forth. Let’s first define each property before explaining their relationship

i. Speed

  • How quickly can the prover generate proofs
  • Measured in wall clock time, which is the time elapsed from start to finish of computation

Speed should be defined and measured relative to specific test programs, inputs, and systems to ensure it can be quantitatively assessed. This metric is critical for latency-sensitive applications where prompt availability of proofs is essential, but comes with higher overhead and larger proof sizes

ii. Efficiency

  • The resources consumed by the prover, with less being preferable
  • Can be approximated by user time, which is the amount of computer time expended by the program code

The prover consumes two kinds of resources: core-time and space. Efficiency can therefore be subdivided into core-time efficiency and space efficiency.

Core-Time Efficiency: Average amount of time the prover operates across all cores multiplied by number of cores running the prover. For a single-core prover, the core-time consumption and the speed are the same thing. For a multi-core capable prover running in multi-core mode on a multi-core system, core-time consumption and speed are not the same thing. If a program fully utilizes 5 cores or threads for 5 seconds, that would be 25 seconds of user time and 5 seconds of wall clock time.

Space Efficiency: Refers to the storage capacity used, such as RAM

User time is interesting as a proxy for the energy consumed by running a computation. In a situation where all cores are fully utilized almost all the time, the energy consumption of a CPU should remain relatively constant. In this situation, the user time spent by a CPU-bound, predominantly user-mode code execution should be roughly linearly proportional to the number of watt-hours (i.e., energy) consumed by that code execution.

Economizing on energy consumption, or the usage of computing resources, should be interesting from the point of view of any proving operations which have enough scale that their energy bill (or their cloud compute bill) for proving is a considerable operational cost. For these reasons, user time is an interesting metric. Lower costs of proving allow service providers to pass on lower proving prices to cost-sensitive customers.

Both kinds of efficiency are related to the energy consumption of the proving process and the amount of capital utilized by the proving process, which relates to the financial cost of proving. To operationalize the definition of efficiency for measurement, the definition has to be made relative to one or more test programs, one or more test inputs to each of those programs, and one or more test systems.

iii. Succinctness

  • The size of proofs generated and the complexity of verifying them

Succinctness is a composite of three distinct metrics, with the complexity of proof verification further subdivided:

  • Proof Size: The physical size of the proofs, typically measured in kilobytes
  • Proof Verification Time: Duration required to verify proofs.
  • Proof Verification Space: Memory usage during proof verification

Verification is typically a single core operation, thus speed and core-time efficiency are generally equivalent in this context. As with speed and efficiency, operationalizing the definition of succinctness requires specifying sets of test programs, test inputs, and test systems.

With each performance property defined, we now illustrate the dimunitive effects of optimizing one property over the others.

  • Speed: Fast proof generation leads to larger proof sizes, but proof verification is slow. More resources are consumed to generate proofs, which reduces efficiency
  • Succinctness: Prover needs more time to compress proofs. But proof verification is fast. The more succinct the proof, the higher the computation overhead
  • Efficiency: Minimizing resource usage reduces speed of proof generation and the succinctness of the proof

In general, optimizing for one quality means not optimizing for another quality, and therefore a multi-dimensional analysis is needed to pick an optimal solution on a case by case basis.

A good way of weighting these properties in an evaluation may be to define acceptable levels for each property and then determine which properties are most important. The most important properties should be optimized, subject to maintaining good enough levels on all other properties.

Below we summarize each property and their key considerations:

Evaluation properties of zkVMs

5. What’s Next?

With the above table, we hereby conclude the first article of our series. In the next articles, we will build on the flow chart shown above to explain common arithmetic and cryptographic processes in zkVMs.

If you found this helpful, visit our website and gitbook to learn more about what we are building at Lita.

Also, follow us on X and Discord to stay updated so you don’t miss the rest of the series

________________________________________

Disclaimer:

The content of this blog, including text, graphics, and images, is protected by copyright © 2024 Lita.Foundation. All rights reserved. Unauthorized use and/or duplication of this material without express and written permission from this site’s author and/or owner is strictly prohibited. Excerpts and links may be used, provided that full and clear credit is given to Lita.Foundation with appropriate and specific direction to the original content.

For any questions or permissions, please contact Lita Foundation.

You may also like

A Zero Knowledge Paradigm: Part 2- Exploring zk-VM Design Trade-offs

June 12, 2024

A Zero Knowledge Paradigm: Part 2- Exploring zk-VM Design Trade-offs

Announcing Lita's Valida zkVM & C Compiler

May 27, 2024

Announcing Lita's Valida zkVM & C Compiler

Valida March Review Notes

March 8, 2024

Valida March Review Notes