Introducing the SIEVE Circuit-IR: Functions

Written: 2023-07-10, Published: 2023-08-30, Author: Kimberlee Model, Tags: SIEVE IR, Circuit-IR, v2.1.0,

The SIEVE IR’s function gates will encapsulate a subcircuit so it may be reused many times over. Doing this can drastically reduce the size of a circuit. This may seem unimportant, considering how an interactive proof is likely to have communication proportional to the number of gates, however IO of the relation has proved to be a bottleneck in the past, and some relations have taken terabytes of space. So, its worth it to attempt to compress the circuit as much as possible.

Another aspect of increasing the IR’s scalability is reducing memory consumption. This installment will show off the @new(/* … /); and @delete(/ … */); directives and show how their use impacts function gates.

Function Gates

Lets start with an example: the equals function has two input wires and one output wire. If the two inputs have the same value it outputs 1, otherwise it outputs 0.

A function declaration always begins with a signature listing its output and input parameters. Each parameter has a type index and a length. Within the function’s body wire numbers restart at 0. Within each type, output wires are assigned to the parameters, left to right — so the first output wire is 0 — and incrementing from there.

@version 2.0.0;
circuit;
@type field 127;
@begin

  @function(equals, @out: 0:1, @in: 0:1, 0:1)
    // Output wire 0:$0
    // Input wires 0:$1 and 0:$2

The body of the function is simply a list of gates, one of which assigns to the output wire. For equality we’re going to use Fermat’s Little Theorem for reducing a non-zero value to 1. When raising a number to p-1 mod p, any non-zero value will yield 1. Thus, we can subtract and exponentiate to get inequality, and flip to get equality.

    // Subtract
    $3 <- @mulc($2, <126>);
    $4 <- @add($1, $3);

    // Exponentiation by squaring, iterative method, unrolled.
    $5 <- @mul($4, $4);
    $6 <- @mulc($5, <1>);
    $7 <- @mul($5, $5);
    $8 <- @mul($7, $6);
    $9 <- @mul($7, $7);
    $10 <- @mul($9, $8);
    $11 <- @mul($9, $9);
    $12 <- @mul($11, $10);
    $13 <- @mul($11, $11);
    $14 <- @mul($13, $12);
    $15 <- @mul($13, $13);
    $16 <- @mul($14, $15);
    // Did I mention the IR is intended to be computer generated?

    // Flip from 0|1 to 1|0
    $17 <- @mulc($16, <126>); // 0 or -1
    $0 <- @addc($17, <1>); // assigns to the output wire
 @end

Using an equality check is now as simple as calling this function. The $outputs… ← @call(function_name, $inputs…); syntax lets us do just that. Each wire or wire range specified in the call gate will be passed to the function. Input wires must have been previously assigned in the circuit, while output wires will be assigned by the function.

  $0 <- @private();
  $1 <- @private();

  // call the function
  $2 <- @call(equals, $0, $1);

  @assert_zero($2);
@end

When using functions with multiple types, the call gate doesn’t actually need type indexes for each parameter, in fact the syntax does not even allow for this. Instead, the backend can look at the function’s signature and each parameter is retrieved using the type from the signature.

Here’s a quick example. Notice that the commented types of the @call match those of the function definition

@function(foo, @out: 0:1, 1:1, @in 1:1, 0:1, 1:1)
  /* ... */
@end

$0 <- @private(0);
$0 <- @private(1);
$1 <- @private(1);

/* 0 */ $2, /* 1 */ $2 <- @call(foo, /* 1 */ $0, /* 0 */ $0, /* 1 */ $1);

@assert_zero(0: $2);
@assert_zero(1: $2);

Memory Management

Memory management in the IR began with the addition of a @delete($first … $last); gate which hinted to a backend that a range of wires would not be used further, and that memory for them could be freed. This was beneficial as we were considering how to stream very long circuits, before we created IR functions.

As we added functions we we found it difficult for the backend to actually delete wires while maintaining adequate performance. No longer was the IR’s implementation a single append-only list, now we wanted delete segments from it and pass segments to sub-functions. So we put in restrictions to make memory management more manageable.

Consider a memory manager which allocates chucks of 8 wires at a time. We have nice little 8-wire chunks, not too big not too small, and we can allocate as needed and delete on command. What happens when we call a function that requires inputs in chunks of 10? Now we have a problem: we could copy our 8-wire chunks in to new 10-wire chunks, but that’s too slow; we could make "super-chunks" which are composed of smaller chunks and pass the super-chunks, but that’s too complex. What we came up with was to pass the problem on to the frontend. While that seems like its just kicking the can to someone else, the frontend has more information about the problem they’re trying to solve. For example, if the frontend is using some range of wires to represent an array, then they probably already know the array’s length and could have encoded that into the IR. So we paired @delete with a @new($first … $last); directive. Now the frontend can request memory allocations of a particular size and delete them when finished.

// Allocate some wires
@new($0 ... $5);

/* assign and use them... */
$0 <- /* ... */;
$1 <- /* ... */;
/* ... */

// Now we're done, delete them.
@delete($0 ... $5);

In a function call, any wire range must come from the same allocation. Output wire ranges may be implicitly allocated, but if prior allocations conflict, then a problem would arise. Let’s take a look at a few examples using the mul4 and dot4 functions which will multiply 4 pairs of wires and return the products and the sum-of-products respectively.

Here are the functions.

@function(mul4, @out: 0:4, @in: 0:4, 0:4)
  $0 <- @mul($4, $8);
  $1 <- @mul($5, $9);
  $2 <- @mul($6, $10);
  $3 <- @mul($7, $11);
@end

@function(dot4, @out: 0:1, @in: 0:4, 0:4)
  $9 <- @mul($1, $5);
  $10 <- @mul($2, $6);
  $11 <- @mul($3, $7);
  $12 <- @mul($4, $8);
  $13 <- @add($9, $10);
  $14 <- @add($11, $12);
  $0 <- @add($13, $14);
@end

This example would cause a problem because all the inputs are allocated singly.

$0 <- @private();
$1 <- @private();
$2 <- @private();
$3 <- @private();
$4 <- @private();
$5 <- @private();
$6 <- @private();
$7 <- @private();

$8 <- @call(dot4, $0 ... $3, $4 ... $7); // ERROR

This example fixes it by allocating two ranges to pass to the function.

@new($0 ... $3);
$0 <- @private();
$1 <- @private();
$2 <- @private();
$3 <- @private();
@new($4 ... $7);
$4 <- @private();
$5 <- @private();
$6 <- @private();
$7 <- @private();

$8 <- @call(dot4, $0 ... $3, $4 ... $7); // Okay

This example fixes it by allocating a single range and subseting the range as the function’s arguments.

@new($0 ... $7);
$0 <- @private();
$1 <- @private();
$2 <- @private();
$3 <- @private();
$4 <- @private();
$5 <- @private();
$6 <- @private();
$7 <- @private();

$8 <- @call(dot4, $0 ... $3, $4 ... $7); // Okay

In this example, the function’s output may even be a part of the original allocation.

@new($0 ... $8);
$0 <- @private();
$1 <- @private();
$2 <- @private();
$3 <- @private();
$4 <- @private();
$5 <- @private();
$6 <- @private();
$7 <- @private();

$8 <- @call(dot4, $0 ... $3, $4 ... $7); // Okay

Now lets try with mul4. This example works just fine, because $8 … $11 can be allocated implicitly.

@new($0 ... $3);
$0 <- @private();
$1 <- @private();
$2 <- @private();
$3 <- @private();
@new($4 ... $7);
$4 <- @private();
$5 <- @private();
$6 <- @private();
$7 <- @private();

$8 ... $11 <- @call(mul4, $0 ... $3, $4 ... $7); // Okay

This one fails because the over allocation of the original range conflicts with implicit allocation of the call’s output.

@new($0 ... $8);
$0 <- @private();
$1 <- @private();
$2 <- @private();
$3 <- @private();
$4 <- @private();
$5 <- @private();
$6 <- @private();
$7 <- @private();

$8 ... $11 <- @call(mul4, $0 ... $3, $4 ... $7); // ERROR

But extending the allocation to include the call’s output fixes it.

@new($0 ... $11);
$0 <- @private();
$1 <- @private();
$2 <- @private();
$3 <- @private();
$4 <- @private();
$5 <- @private();
$6 <- @private();
$7 <- @private();

$8 ... $11 <- @call(mul4, $0 ... $3, $4 ... $7); // Okay

Conclusion

Today we saw how functions can be used to encapsulate sub-circuits and be called repeatedly, which can drastically reduce the size of a circuit. We also took a look at the IR’s memory management, enabling unused memory to be reclaimed mid-proof. Both of these enable backends to run circuits with billions or trillions of gates, sometimes taking hours or days to complete.

This research was developed with funding from the Defense Advanced Research Projects Agency (DARPA) under Contract No. HR001120C0087. The views, opinions, and/or findings expressed are those of the author(s) and should not be interpreted as representing the official views or policies of the Department of Defense or the U.S. Government.

Distribution Statement "A": Approved for Public Release, Distribution Unlimited