Introducing the Valida Compiler Toolchain

Published on:
Jun 25, 2024
Written by:
Morgan Thomas
Read time:
3 min
Category:
Technical

Morgan Thomas

Research

In this article, we delve into the cutting-edge Valida Compiler Toolchain, which marks a significant leap forward in compiling C code into Valida machine code. We explore the intricacies of the Valida Instruction Set Architecture (ISA), designed to enhance the creation of SNARK proofs within zero knowledge virtual machines (zk-VMs). As we navigate through the unique characteristics of Valida's ISA, the strategic goals of Lita's compiler initiative, and the integration with the LLVM Compiler Infrastructure, we confront the challenges posed by adapting LLVM to Valida's novel environment. The article further discusses the potential advantages and transformative applications of the Valida zk-VM once the compiler toolchain is completed, and concludes with a glimpse into its current availability and developmental progress.


Overview of Valida Compiler Toolchain

Lita Foundation has developed the first compiler toolchain capable of compiling C code to Valida machine code. This work builds on prior efforts of open source contributors and Delendum Research.

The Valida Instruction Set Architecture (ISA)

Valida is a novel instruction set architecture (ISA). Valida is designed to optimize for efficient creation of SNARK proofs of program executions. Traditional ISAs are designed for efficient implementation in silicon. In contrast, Valida is designed to be implemented only in virtual machines, and specifically it’s designed for efficient implementation of zero knowledge virtual machines (zk-VMs). By a zk-VM, we mean a virtual machine implementation which, as a side effect of program execution, optionally outputs a SNARK proof of the program execution.

Distinct Features of Valida ISA

Because Valida is optimized for efficient zk-VM implementation, it is significantly different from most ISAs targeted by compilers. Unlike in silicon, where accessing RAM is significantly more expensive (takes more time) compared to accessing registers, there is no difference in cost between accessing registers and accessing RAM in Valida’s zk-VM prover design. For this reason, Valida lacks any general-purpose registers; instead, instructions directly access RAM, using a stack machine architecture. This is one example of a significant difference between Valida and most ISAs targeted by compilers.

Lita's Compiler Objectives

The goal of Lita’s compiler work is to let programs written in mainstream languages be compiled to run in the Valida zk-VM. This work leverages the LLVM compiler infrastructure. The LLVM compiler infrastructure consists of a collection of compiler frontends, backends, and associated tooling, all based around the Low Level Virtual Machine (LLVM).

Leveraging LLVM Compiler Infrastructure

LLVM is a virtual machine ISA which is designed specifically for use in the LLVM compiler infrastructure. LLVM is designed to be a good target language for compiler frontends and a good source language for compiler backends. For each supported source language, there is a frontend which compiles the source language to LLVM. For each supported target ISA, there is a backend which compiles LLVM to the target ISA. All of these frontends and backends are created using abstractions which the LLVM compiler infrastructure provides to make these jobs easier.

Challenges in Implementing Valida LLVM Backend

Implementing a Valida LLVM backend poses some challenges, due to the peculiarities of Valida. Because LLVM is designed with assumptions in mind, and those assumptions are not all true of Valida, there is some impedance mismatch between Valida and LLVM that needs to be worked around. Some of these challenges have been surmounted, and some remain to be solved. For instance, LLVM assumes that there are general purpose registers, but Valida has none. For another example, Valida has separate address spaces for RAM and program ROM, and the program ROM address space can only be read by the instruction decoder and not in any other way. LLVM assumes that it can emit static data into the object code which the program can then read. That is not  the case in Valida.

Advantages and Future Applications of Valida zk-VM

Once completed, this work will let people compile programs written in supported languages (initially, C/C++ and Rust) to Valida machine code which can be run on the Valida zk-VM. This will make it practical to use the Valida zk-VM for applications which require succinct and/or zero knowledge proving. We expect this to provide significant performance advantages, leading to cost savings for large scale applications of (zk-)SNARKs, and opening up use cases which are not currently viable due to the computational complexity of creating (zk-)SNARKs.

Availability and Current Status of the Valida Compiler Toolchain

Currently Lita’s Valida compiler toolchain is not publicly available, but we are open to providing early access to select partners (see this article). The status of this work currently is that it works for some very basic programs, but there are quite a few open issues that need to be addressed before it will be viable for applications.

You may also like

Custom ISA: Achieving Ultimate Efficiency in zk-Proving

June 19, 2024

Custom ISA: Achieving Ultimate Efficiency in zk-Proving

Exploring zk-VM Design Trade-offs

June 12, 2024

Exploring zk-VM Design Trade-offs

Announcing Lita's Valida zkVM & C Compiler

May 27, 2024

Announcing Lita's Valida zkVM & C Compiler