Introducing the Valida Compiler Toolchain

Published on:
Nov 30, 2023
Written by:
Morgan Thomas
Read time:
3 min
Category:
Technical

Morgan Thomas

Research

In this article, we delve into the cutting-edge Valida Compiler Toolchain, which marks a significant leap forward in compiling C code into Valida machine code. We explore the intricacies of the Valida Instruction Set Architecture (ISA), designed to enhance the creation of SNARK proofs within zero knowledge virtual machines (zk-VMs). As we navigate through the unique characteristics of Valida's ISA, the strategic goals of Lita's compiler initiative, and the integration with the LLVM Compiler Infrastructure, we confront the challenges posed by adapting LLVM to Valida's novel environment. The article further discusses the potential advantages and transformative applications of the Valida zk-VM once the compiler toolchain is completed, and concludes with a glimpse into its current availability and developmental progress.


Table of Contents

  1. Overview of Valida Compiler Toolchain
  2. The Valida Instruction Set Architecture (ISA)
  3. Distinct Features of Valida ISA
  4. Lita's Compiler Objectives
  5. Leveraging LLVM Compiler Infrastructure
  6. Challenges in Implementing Valida LLVM Backend
  7. Advantages and Future Applications of Valida zk-VM
  8. Availability and Current Status of the Valida Compiler Toolchain


Lita Foundation has developed the first compiler toolchain capable of compiling C code to Valida machine code. This work builds on prior efforts of open source contributors and Delendum Research.

Valida is a novel instruction set architecture (ISA). Valida is designed to optimize for efficient creation of SNARK proofs of program executions. Traditional ISAs are designed for efficient implementation in silicon. In contrast, Valida is designed to be implemented only in virtual machines, and specifically it’s designed for efficient implementation of zero knowledge virtual machines (zk-VMs). By a zk-VM, we mean a virtual machine implementation which, as a side effect of program execution, optionally outputs a SNARK proof of the program execution.

Because Valida is optimized for efficient zk-VM implementation, it is significantly different from most ISAs targeted by compilers. Unlike in silicon, where accessing RAM is significantly more expensive (takes more time) compared to accessing registers, there is no difference in cost between accessing registers and accessing RAM in Valida’s zk-VM prover design. For this reason, Valida lacks any general-purpose registers; instead, instructions directly access RAM, using a stack machine architecture. This is one example of a significant difference between Valida and most ISAs targeted by compilers.

The goal of Lita’s compiler work is to let programs written in mainstream languages be compiled to run in the Valida zk-VM. This work leverages the LLVM compiler infrastructure. The LLVM compiler infrastructure consists of a collection of compiler frontends, backends, and associated tooling, all based around the Low Level Virtual Machine (LLVM).

LLVM is a virtual machine ISA which is designed specifically for use in the LLVM compiler infrastructure. LLVM is designed to be a good target language for compiler frontends and a good source language for compiler backends. For each supported source language, there is a frontend which compiles the source language to LLVM. For each supported target ISA, there is a backend which compiles LLVM to the target ISA. All of these frontends and backends are created using abstractions which the LLVM compiler infrastructure provides to make these jobs easier.

Implementing a Valida LLVM backend poses some challenges, due to the peculiarities of Valida. Because LLVM is designed with assumptions in mind, and those assumptions are not all true of Valida, there is some impedance mismatch between Valida and LLVM that needs to be worked around. Some of these challenges have been surmounted, and some remain to be solved. For instance, LLVM assumes that there are general purpose registers, but Valida has none. For another example, Valida has separate address spaces for RAM and program ROM, and the program ROM address space can only be read by the instruction decoder and not in any other way. LLVM assumes that it can emit static data into the object code which the program can then read. That is not  the case in Valida.

Once completed, this work will let people compile programs written in supported languages (initially, C/C++ and Rust) to Valida machine code which can be run on the Valida zk-VM. This will make it practical to use the Valida zk-VM for applications which require succinct and/or zero knowledge proving. We expect this to provide significant performance advantages, leading to cost savings for large scale applications of (zk-)SNARKs, and opening up use cases which are not currently viable due to the computational complexity of creating (zk-)SNARKs.

Currently Lita’s Valida compiler toolchain is not publicly available, but we are open to providing early access to select partners (see this article). The status of this work currently is that it works for some very basic programs, but there are quite a few open issues that need to be addressed before it will be viable for applications.

You may also like

Valida March Review Notes

March 8, 2024

Valida March Review Notes

Account System SDK

November 8, 2023

Account System SDK

Early Access Partnership Program

November 5, 2023

Early Access Partnership Program