Overview
The Polynomial Multiproof (PMP) scheme is a polynomial commitment scheme that allows for efficiently creating/verifying opening proofs for multiple polynomials at multiple points.
Here's why this project is cool: Opening gets faster the more points we open a polynomial at.
This is a huge deal for data availability systems, which are bottlenecked by opening time.
For some sample numbers, here are times for opening/verifying degree $32768 - 1$ polynomials using BLS12-381 on a single core of an M1 Macbook Pro:
This specification outlines implementation requirements for the polynomial multiproof (PMP) scheme. This builds on previous methods such as KZG10, and is derived entirely from BDFG21. It should be seen as choosing a special case of BDFG21 which allows for significant optimizations that make the protocol more viable for use in applications like Data Availability.
The scheme consists of four methods:

- `Setup`, which sets up a structured reference string for the curve and does some precomputation
- `Commit`, which commits to a polynomial
- `M1Open`/`M2Open`, which compute a single opening proof that a set of polynomials are equal to specific values at a set of points
- `M1Verify`/`M2Verify`, which verify an opening proof against commitments and evaluations
API
The API is designed around two traits. The first looks roughly like
```rust
/// A curve-agnostic trait for a BDFG commitment scheme *without precomputation*
pub trait PolyMultiProofNoPrecomp<E: Pairing>: Sized {
    /// The output proof type
    type Proof: Clone;

    /// Creates a proof of the given polynomials and evals at the given points
    fn open(
        &self,
        transcript: &mut Transcript,
        evals: &[impl AsRef<[E::ScalarField]>],
        polys: &[impl AsRef<[E::ScalarField]>],
        points: &[E::ScalarField],
    ) -> Result<Self::Proof, Error>;

    /// Verifies a proof against the given set of commitments and points
    fn verify(
        &self,
        transcript: &mut Transcript,
        commits: &[Commitment<E>],
        points: &[E::ScalarField],
        evals: &[impl AsRef<[E::ScalarField]>],
        proof: &Self::Proof,
    ) -> Result<bool, Error>;
}
```
This takes a Merlin transcript for the Fiat-Shamir transform, some points $z_1, \dots, z_t$, some polynomials $f_1, \dots, f_n$ given by their coefficients, and the evaluations $y_{i,j} = f_i(z_j)$ of each polynomial at each point.
This API is the most flexible, but not the most efficient.
The second trait is more efficient, but requires knowledge of which points are being opened to beforehand.
```rust
/// A curve-agnostic trait for a BDFG commitment scheme *with precomputation*
pub trait PolyMultiProof<E: Pairing>: Sized {
    /// The output proof type
    type Proof: Clone;

    /// Creates a proof of the given polynomials at the given point set index
    fn open(
        &self,
        transcript: &mut Transcript,
        evals: &[impl AsRef<[E::ScalarField]>],
        polys: &[impl AsRef<[E::ScalarField]>],
        point_set_index: usize,
    ) -> Result<Self::Proof, Error>;

    /// Verifies a proof against the given set of commitments and points
    fn verify(
        &self,
        transcript: &mut Transcript,
        commits: &[Commitment<E>],
        point_set_index: usize,
        evals: &[impl AsRef<[E::ScalarField]>],
        proof: &Self::Proof,
    ) -> Result<bool, Error>;
}
```
When creating this object, there will be some way to specify the points, or they will be predetermined by the implementation. The API is largely the same.
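As an illustration only, a caller of the precomputed-points trait might look like the sketch below. The helper function and transcript label are hypothetical; `PolyMultiProof`, `Commitment`, and `Error` are the items referenced above, and `Pairing`/`Transcript` come from the arkworks and merlin crates the API is written against.

```rust
use ark_ec::pairing::Pairing;
use merlin::Transcript;

// A hypothetical caller driving the precomputed-points trait shown above. It assumes
// `PolyMultiProof`, `Commitment`, and `Error` from this crate are in scope; the function
// name and transcript label are illustrative only.
fn open_and_verify<E: Pairing, P: PolyMultiProof<E>>(
    pmp: &P,
    commits: &[Commitment<E>],
    polys: &[Vec<E::ScalarField>],
    evals: &[Vec<E::ScalarField>],
    point_set_index: usize,
) -> Result<bool, Error> {
    // The prover and verifier must build identical transcripts for Fiat-Shamir to agree.
    let mut prover_transcript = Transcript::new(b"pmp-example");
    let proof = pmp.open(&mut prover_transcript, evals, polys, point_set_index)?;

    let mut verifier_transcript = Transcript::new(b"pmp-example");
    pmp.verify(&mut verifier_transcript, commits, point_set_index, evals, &proof)
}
```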
Applications to Data Availability
Data availability systems can benefit greatly from PMP, since polynomial commitment-based DA systems are tremendously constrained by opening time. Even with very fast KZG implementations, opening is too slow and constrains the amount of data that can be processed through the system.
Opening a degree 255 polynomial over the BLS12-381 scalar field at a single point takes at best ~3-6ms, which means opening every cell of 256 such polynomials will take >200s.
Because PMP allows many cells to be opened at once, the data availability grid can be chunked into a smaller grid, allowing for a faster, more secure system with far higher throughput. This allows polynomial commitment-based DA systems to scale up in size without taking more compute.
Curve selection
The protocol depends on the selection of a pairing based curve. Attributes of an ideal curve for this protocol are, in order of importance,
- Security
- Fast G1 scalar multiplication
- Large scalar field size
- Fast G2 scalar multiplication
- Small compressed G1 point size
BLS12-381 satisfies the first two criteria well, and while other curves likely could satisfy more requirements, they are currently not as well audited. Hence BLS12-381 is a reasonable choice to start with.
The scheme given in BDFG21 is interactive, so Merlin transcripts are used to do the Fiat-Shamir transform.
Optimizations made from BDFG21 are outlined in A. Optimizations.
All times are with a Xeon E5-2676 v3 @ 2.40GHz vCPU on AWS
More FFTs are required as data size increases, but these are dwarfed by opening times. Additionally, exactly how the chunking is done has nuanced security implications in different scenarios.
2. Methods
Here we detail the Setup and Commit methods.
Setup
Setup defines the shared parameters for all other methods
```
Setup(max_coeffs: usize, point_sets: Vec<Vec<Scalar>>) -> Parameters
```
- `max_coeffs` is the maximum number of coefficients we will commit/open to at a time. For most curve choices, this should be a power of 2.
- `point_sets` lists the different sets of points where we can open a grid.
  - This is done so we can do ahead-of-time computation that only depends on which set of points we are opening at, and not on the actual contents of the data at those indices.
  - Ex: `point_sets = [[0, 4], [1, 2]]` means we can open any rows of the grid at `[0, 4]` or `[1, 2]`. We can only open at both points of `[0, 4]` at the same time, not at `0` or `4` individually.
`Setup` does the following:

- Samples a random secret $x$ from the scalar field
- Computes the G1 powers $[x^i]_1 = x^i \cdot g_1$ for each $i$ from $0$ to `max_coeffs` $- 1$
- Computes the G2 powers $[x^i]_2 = x^i \cdot g_2$ (see the note below on how many are needed)
- For the $i$-th set of points in `point_sets`
  - Enumerates the points at the given indices from the FFT domain. Call them $z_1, \dots, z_t$.
  - Constructs the unique degree $t - 1$ Lagrange polynomials $L_j(X)$ for each $z_j$, i.e. $L_j(z_j) = 1$ and $L_j(z_k) = 0$ for $k \neq j$
    - If a denominator is zero due to a repeated point, setup errors
  - Constructs the vanishing polynomial $Z_T(X) = \prod_{j=1}^{t} (X - z_j)$
  - Computes $[Z_T(x)]_2$

Note: for method 1, we only need as many G2 powers of $x$ as points we open to. For method 2, we only need $[x]_2$.
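The per-point-set part of this precomputation might look like the following sketch, assuming the arkworks `ark_ff`/`ark_poly` crates; `PointSetPrecomp` and `precompute_point_set` are illustrative names rather than this crate's actual API.

```rust
use ark_ff::{Field, One};
use ark_poly::univariate::DensePolynomial;
use ark_poly::DenseUVPolynomial;

struct PointSetPrecomp<F: Field> {
    lagranges: Vec<DensePolynomial<F>>, // L_j(X), each of degree t - 1
    vanishing: DensePolynomial<F>,      // Z_T(X) = prod_j (X - z_j)
}

fn precompute_point_set<F: Field>(points: &[F]) -> Option<PointSetPrecomp<F>> {
    // Z_T(X) = prod_j (X - z_j)
    let mut vanishing = DensePolynomial::from_coefficients_vec(vec![F::one()]);
    for &z in points {
        let linear = DensePolynomial::from_coefficients_vec(vec![-z, F::one()]);
        vanishing = vanishing.naive_mul(&linear);
    }
    // L_j(X) = prod_{k != j} (X - z_k) / (z_j - z_k)
    let mut lagranges = Vec::with_capacity(points.len());
    for (j, &zj) in points.iter().enumerate() {
        let mut num = DensePolynomial::from_coefficients_vec(vec![F::one()]);
        let mut denom = F::one();
        for (k, &zk) in points.iter().enumerate() {
            if k != j {
                let linear = DensePolynomial::from_coefficients_vec(vec![-zk, F::one()]);
                num = num.naive_mul(&linear);
                denom *= zj - zk;
            }
        }
        // A repeated point makes a denominator zero; mirror Setup's error by returning None.
        let denom_inv = denom.inverse()?;
        lagranges.push(DensePolynomial::from_coefficients_vec(
            num.coeffs.iter().map(|c| *c * denom_inv).collect(),
        ));
    }
    Some(PointSetPrecomp { lagranges, vanishing })
}
```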
Commit
Committing is exactly the same as in KZG10.
```
Commit(p: Parameters, poly: Polynomial) -> Commitment
```
Takes the polynomial $f(X) = \sum_i c_i X^i$ and computes the commitment $C = [f(x)]_1 = \sum_i c_i \cdot [x^i]_1$.
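A minimal sketch of this computation, under the same arkworks assumptions; `powers_of_g1` stands for the $[x^i]_1$ values produced by `Setup`, and the function name is illustrative.

```rust
use ark_ec::pairing::Pairing;
use ark_ff::Zero;

// C = [f(x)]_1 = sum_i c_i * [x^i]_1 (assumes coeffs.len() <= powers_of_g1.len())
fn commit<E: Pairing>(powers_of_g1: &[E::G1], coeffs: &[E::ScalarField]) -> E::G1 {
    coeffs
        .iter()
        .zip(powers_of_g1)
        .map(|(c, p)| *p * *c)
        .fold(E::G1::zero(), |acc, term| acc + term)
}
```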
3. Method 1
This method biases towards a faster opening time: opening requires roughly half the compute of `M2Open`.
See A. Optimizations for more details.
For the following, the notation of BDFG21 is followed, namely

- $[a]_1$ denotes $a \cdot g_1$ and $[a]_2$ denotes $a \cdot g_2$, where $g_1$ and $g_2$ generate G1 and G2
- For a set of points $T = \{z_1, \dots, z_t\}$ and a polynomial $f_i$, $r_i$ is the unique polynomial of degree at most $t - 1$ such that $r_i(z_j) = f_i(z_j)$ for all $z_j \in T$
M1Open
```
M1Open(transcript: Transcript, evals: Vec<Vec<Scalar>>, polys: Vec<Polynomial>, point_set_index: usize) -> Proof
```
Let the polynomials be $f_1, \dots, f_n$ and the points be $z_1, \dots, z_t$.
- Checks that `evals` has length $n$ and each element of `evals` has length $t$
- Transcribes the following to the Merlin transcript
  - each of the evals row by row with the utf-8 message `open evals`
  - each of the points in order with the utf-8 message `open points`
- Constructs $\gamma$ by
  - Reading `SCALAR_SIZE` bytes from `transcript` with utf-8 message `open gamma`
  - Interpreting the bytes as a big-endian integer, reduced modulo the modulus of the scalar field
- Computes $g(X) = \sum_{i=1}^{n} \gamma^{i-1} f_i(X)$
- Polynomial divides to get $h(X) = g(X) / Z_T(X)$, discarding the remainder
- Computes $W = [h(x)]_1$ and serializes it in compressed form; this is the proof
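A sketch of the core of `M1Open` after $\gamma$ has been read from the transcript, under the same arkworks assumptions; `vanishing` and `powers_of_g1` stand for the `Setup` outputs $Z_T(X)$ and $[x^i]_1$, and this is an illustration rather than the crate's implementation.

```rust
use ark_ec::pairing::Pairing;
use ark_ff::{One, Zero};
use ark_poly::univariate::{DenseOrSparsePolynomial, DensePolynomial};
use ark_poly::DenseUVPolynomial;

// Returns W = [h(x)]_1 where h = (sum_i gamma^(i-1) f_i) / Z_T, remainder discarded.
fn m1_open_core<E: Pairing>(
    gamma: E::ScalarField,
    polys: &[DensePolynomial<E::ScalarField>],
    vanishing: &DensePolynomial<E::ScalarField>,
    powers_of_g1: &[E::G1],
) -> Option<E::G1> {
    // g(X) = sum_i gamma^(i-1) f_i(X)
    let mut g = DensePolynomial::zero();
    let mut coeff = E::ScalarField::one();
    for f in polys {
        let scaled = DensePolynomial::from_coefficients_vec(
            f.coeffs.iter().map(|c| *c * coeff).collect(),
        );
        g = &g + &scaled;
        coeff *= gamma;
    }
    // h(X) = g(X) / Z_T(X); the remainder is thrown away
    let (h, _remainder) = DenseOrSparsePolynomial::from(g)
        .divide_with_q_and_r(&DenseOrSparsePolynomial::from(vanishing))?;
    // W = [h(x)]_1 (assumes h has no more coefficients than the SRS provides)
    Some(
        h.coeffs
            .iter()
            .zip(powers_of_g1)
            .map(|(c, p)| *p * *c)
            .fold(E::G1::zero(), |acc, term| acc + term),
    )
}
```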
M1Verify
```
M1Verify(transcript: Transcript, commits: Vec<Commitment>, evals: Vec<Vec<Scalar>>, proof: Proof, point_set_index: usize) -> bool
```
Let the evals be $y_{i,j}$, the commits be $C_1, \dots, C_n$, and the points be $z_1, \dots, z_t$. Commit $C_i$ must be for the polynomial $f_i$ that evaluates to $y_{i,j}$ at point $z_j$.
This method

- Checks that each element has the correct length
  - `commits` has length $n$
  - Each element of `evals` has length $t$
- Transcribes the points/evals the same as in the opening
- Reads $\gamma$ the same as in the opening
- For each $j$, computes $q_j = \sum_{i=1}^{n} \gamma^{i-1} y_{i,j}$, which represents the value of $\sum_{i} \gamma^{i-1} f_i(z_j)$
- Uses the $q_j$ to interpolate, using the previously computed Lagrange polynomials, $r(X) = \sum_{j=1}^{t} q_j L_j(X)$
- Computes $[r(x)]_1$ from the G1 reference string
- Computes $F = \sum_{i=1}^{n} \gamma^{i-1} C_i - [r(x)]_1$
- Returns `true` if $e(F, [1]_2) = e(W, [Z_T(x)]_2)$
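A sketch of the final combination and pairing check, assuming arkworks types; `r_commit`, `w`, `g2_gen`, and `z_t_g2` stand for $[r(x)]_1$, the proof $W$, $[1]_2$, and $[Z_T(x)]_2$, all assumed to have been computed or loaded as described above.

```rust
use ark_ec::pairing::Pairing;
use ark_ff::{One, Zero};

// Accept iff e(F, [1]_2) == e(W, [Z_T(x)]_2) where F = sum_i gamma^(i-1) C_i - [r(x)]_1.
fn m1_check<E: Pairing>(
    gamma: E::ScalarField,
    commits: &[E::G1],
    r_commit: E::G1, // [r(x)]_1
    w: E::G1,        // the opening proof W
    g2_gen: E::G2,   // [1]_2
    z_t_g2: E::G2,   // [Z_T(x)]_2
) -> bool {
    let mut f = E::G1::zero();
    let mut coeff = E::ScalarField::one();
    for c in commits {
        f += *c * coeff;
        coeff *= gamma;
    }
    f -= r_commit;
    E::pairing(f, g2_gen) == E::pairing(w, z_t_g2)
}
```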
4. Method 2
This method biases towards a faster verifying time, and is roughly twice as slow to open on. The API is exactly the same as in method 1. See A. Optimizations for more details.
M2Open
```
M2Open(transcript: Transcript, evals: Vec<Vec<Scalar>>, polys: Vec<Polynomial>, point_set_index: usize) -> Proof
```
This method

- Transcribes the following (all points serialized compressed if possible)
  - each of the evals row by row with the utf-8 message `open evals`
  - each of the points in order with the utf-8 message `open points`
- Constructs $\gamma$ by
  - Reading `SCALAR_SIZE` bytes from `transcript` with utf-8 message `open gamma`
  - Interpreting the bytes as a big-endian integer, reduced modulo the modulus of the scalar field
- Computes, for each polynomial $f_i$, the scaled polynomial $\gamma^{i-1} f_i(X)$, and sums them into $g(X) = \sum_{i=1}^{n} \gamma^{i-1} f_i(X)$
- Polynomial divides to get $h(X) = g(X) / Z_T(X)$, with remainder $\hat{r}(X)$
- Computes $W_1 = [h(x)]_1$
- Serializes $W_1$ in compressed form and transcribes it with the message `open W1`
- Constructs $z$ in the same way as we did $\gamma$, reading with message `open z`
- Computes $\sum_{i} \gamma^{i-1} r_i(z)$ by computing $\hat{r}(z)$ (the remainder $\hat{r}(X)$ equals $\sum_i \gamma^{i-1} r_i(X)$; see A. Optimizations)
- Computes $Z_T(z)$
- Computes $L(X) = g(X) - \hat{r}(z) - Z_T(z)\,h(X)$
- Computes $q(X) = L(X) / (X - z)$
- Computes $W_2 = [q(x)]_1$
- Returns $(W_1, W_2)$, serialized in compressed form
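A sketch of the second half of `M2Open` (after $W_1$ has been transcribed and $z$ has been read), under the same arkworks assumptions; `g`, `h`, and `rem` are the combined polynomial, quotient, and remainder from the earlier division, and `vanishing`/`powers_of_g1` again stand for $Z_T(X)$ and $[x^i]_1$ from `Setup`.

```rust
use ark_ec::pairing::Pairing;
use ark_ff::{One, Zero};
use ark_poly::univariate::{DenseOrSparsePolynomial, DensePolynomial};
use ark_poly::{DenseUVPolynomial, Polynomial};

// Returns W_2 = [L(x) / (x - z)]_1 where L(X) = g(X) - r_hat(z) - Z_T(z) * h(X).
fn m2_open_second<E: Pairing>(
    z: E::ScalarField,
    g: &DensePolynomial<E::ScalarField>,
    h: &DensePolynomial<E::ScalarField>,
    rem: &DensePolynomial<E::ScalarField>,
    vanishing: &DensePolynomial<E::ScalarField>,
    powers_of_g1: &[E::G1],
) -> Option<E::G1> {
    // L(X) = g(X) - r_hat(z) - Z_T(z) * h(X), which has a root at z
    let r_hat_z = rem.evaluate(&z);
    let z_t_z = vanishing.evaluate(&z);
    let scaled_h = DensePolynomial::from_coefficients_vec(
        h.coeffs.iter().map(|c| *c * z_t_z).collect(),
    );
    let l = &(g - &scaled_h) - &DensePolynomial::from_coefficients_vec(vec![r_hat_z]);
    // q(X) = L(X) / (X - z); the remainder of this division is zero
    let divisor = DensePolynomial::from_coefficients_vec(vec![-z, E::ScalarField::one()]);
    let (q, _zero_rem) = DenseOrSparsePolynomial::from(l)
        .divide_with_q_and_r(&DenseOrSparsePolynomial::from(divisor))?;
    // W_2 = [q(x)]_1
    Some(
        q.coeffs
            .iter()
            .zip(powers_of_g1)
            .map(|(c, p)| *p * *c)
            .fold(E::G1::zero(), |acc, term| acc + term),
    )
}
```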
M2Verify
```
M2Verify(transcript: Transcript, commits: Vec<Commitment>, evals: Vec<Vec<Scalar>>, (w_1, w_2): Proof, point_set_index: usize) -> bool
```
Let the evals be $y_{i,j}$, the commits be $C_1, \dots, C_n$, and the points be $z_1, \dots, z_t$. Commit $C_i$ must be for the polynomial $f_i$ that evaluates to $y_{i,j}$ at point $z_j$.
This method

- Transcribes the points/evals the same as in the opening
- Reads $\gamma$ the same as in the opening
- Transcribes $W_1$
- Reads $z$ the same as in the opening
- Lagrange interpolates $r(X) = \sum_{j=1}^{t} \left( \sum_{i=1}^{n} \gamma^{i-1} y_{i,j} \right) L_j(X)$ using the given evaluations, same as in method 1
- Computes $r(z)$ and $Z_T(z)$
- Computes $F = \sum_{i=1}^{n} \gamma^{i-1} C_i - [r(z)]_1 - Z_T(z)\,W_1$
- Computes $F + z\,W_2$
- Returns `true` if $e(F + z\,W_2, [1]_2) = e(W_2, [x]_2)$
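A sketch of the final computation and pairing check of `M2Verify`, assuming arkworks types; the scalars $r(z)$ and $Z_T(z)$, the commitments, and the proof parts are assumed to have been derived as in the steps above, and `g1_gen`/`g2_gen`/`x_g2` stand for $[1]_1$, $[1]_2$, and $[x]_2$ from `Setup`.

```rust
use ark_ec::pairing::Pairing;
use ark_ff::{One, Zero};

// Accept iff e(F + z*W_2, [1]_2) == e(W_2, [x]_2)
// where F = sum_i gamma^(i-1) C_i - [r(z)]_1 - Z_T(z) * W_1.
fn m2_check<E: Pairing>(
    gamma: E::ScalarField,
    z: E::ScalarField,
    r_at_z: E::ScalarField,   // r(z)
    z_t_at_z: E::ScalarField, // Z_T(z)
    commits: &[E::G1],
    w1: E::G1,
    w2: E::G1,
    g1_gen: E::G1, // [1]_1
    g2_gen: E::G2, // [1]_2
    x_g2: E::G2,   // [x]_2
) -> bool {
    let mut f = E::G1::zero();
    let mut coeff = E::ScalarField::one();
    for c in commits {
        f += *c * coeff;
        coeff *= gamma;
    }
    f -= g1_gen * r_at_z;
    f -= w1 * z_t_at_z;
    E::pairing(f + w2 * z, g2_gen) == E::pairing(w2, x_g2)
}
```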
For why this works, see A. Optimizations.
A. Optimizations
This contains all the optimizations made from the methods of BDFG21 and why they are correct. Please read the paper first 😘.
Method 1
This method is the fastest for opening, and slightly slower for verification than Method 2. But, as written in the paper, verification is impractically slow for any appreciable number of polynomials/points. So here we make an assumption that makes the computation reasonable: each polynomial $f_i$ is opened at all the same points, that is (using the notation from the paper) $S_i = T$ for all $i$. We also assume each polynomial has the same degree $d$.
Method 1, Opening
Let $T = \{z_1, \dots, z_t\}$ be the point set, $Z_T(X) = \prod_{j=1}^{t}(X - z_j)$ its vanishing polynomial, and $n$ the number of polynomials. For brevity, assume sums over $i$ are done as $\sum_{i=1}^{n}$.
The prover has to compute $W = [h(x)]_1$, which means they must compute the coefficients of $h(X)$ and then do scalar multiplications and some additions. $h(X)$ is defined as

$$h(X) = \sum_i \gamma^{i-1} \frac{f_i(X) - r_i(X)}{Z_T(X)}$$

Here, $r_i$ is the unique polynomial of degree at most $t - 1$ such that $r_i(z_j) = f_i(z_j)$ for all $z_j \in T$. Note that the polynomial fraction divides cleanly, because $f_i(z_j) - r_i(z_j) = 0$ for all $z_j \in T$, meaning the numerator has factors of $(X - z_j)$ for every $z_j$, and so does $Z_T(X)$.
The term $r_i(X)$ is just the unique minimal degree polynomial you need to subtract from $f_i(X)$ to allow $Z_T(X)$ to cleanly divide it. But this also means that $r_i(X)$ is simply the remainder of dividing $f_i(X)$ by $Z_T(X)$. And the same applies to the combination: $\sum_i \gamma^{i-1} r_i(X)$ is just the remainder you get when you divide $\sum_i \gamma^{i-1} f_i(X)$ by $Z_T(X)$! So all they need to do in order to compute $h(X)$ is to

- Compute $g(X) = \sum_i \gamma^{i-1} f_i(X)$
- Use a polynomial division algorithm to divide $g(X)$ by $Z_T(X)$
- Take the resulting quotient and throw away the remainder
This makes opening very fast, because no interpolation is needed.
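This remainder identity is easy to sanity-check in isolation. The snippet below is a standalone check (not part of the spec), assuming the `ark_bls12_381`, `ark_poly`, `ark_ff`, and `ark_std` crates; the divisor is an arbitrary random polynomial standing in for $Z_T(X)$, since the identity holds for any divisor.

```rust
use ark_bls12_381::Fr;
use ark_ff::UniformRand;
use ark_poly::univariate::{DenseOrSparsePolynomial, DensePolynomial};
use ark_poly::DenseUVPolynomial;

// Remainder of p(X) divided by d(X).
fn remainder(p: &DensePolynomial<Fr>, d: &DensePolynomial<Fr>) -> DensePolynomial<Fr> {
    DenseOrSparsePolynomial::from(p.clone())
        .divide_with_q_and_r(&DenseOrSparsePolynomial::from(d.clone()))
        .expect("nonzero divisor")
        .1
}

// Multiply every coefficient of p(X) by s.
fn scale(p: &DensePolynomial<Fr>, s: Fr) -> DensePolynomial<Fr> {
    DensePolynomial::from_coefficients_vec(p.coeffs.iter().map(|c| *c * s).collect())
}

fn main() {
    let mut rng = ark_std::test_rng();
    let z_t = DensePolynomial::rand(4, &mut rng); // stands in for Z_T(X)
    let f1 = DensePolynomial::rand(10, &mut rng);
    let f2 = DensePolynomial::rand(10, &mut rng);
    let gamma = Fr::rand(&mut rng);

    // gamma-combination of the individual remainders...
    let lhs = &remainder(&f1, &z_t) + &scale(&remainder(&f2, &z_t), gamma);
    // ...equals the remainder of the gamma-combined polynomial.
    let rhs = remainder(&(&f1 + &scale(&f2, gamma)), &z_t);
    assert_eq!(lhs, rhs);
}
```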
Method 1, Verification
For verification, the verifier needs to do a little more legwork than the prover. Let the commitment to the $i$-th polynomial be $C_i = [f_i(x)]_1$.
First, the verifier needs to compute the combined evaluations $q_j = \sum_i \gamma^{i-1} y_{i,j}$. That's simple enough; now they have to check

$$e\left(\sum_i \gamma^{i-1} C_i - [r(x)]_1,\ [1]_2\right) = e\left(W,\ [Z_T(x)]_2\right)$$

Constraining every polynomial to the same point set lets them compute a single pairing on the left instead of $n$! Now let's look at each side of the pairing. To compute the LHS, they need to compute $[r(x)]_1$, where $r(X) = \sum_i \gamma^{i-1} r_i(X)$. Sadly, there's no nice way around this here: it must be interpolated. But they can interpolate $r$ once rather than interpolate each $r_i$. That leaves us with

- $t$ G1 scalar mults to compute $[r(x)]_1$ from the reference string
- $n$ G1 scalar mults to compute $\sum_i \gamma^{i-1} C_i$

Then all they're left with is the single pairing. Next they have to compute the RHS, $e(W, [Z_T(x)]_2)$. Computing $Z_T(X)$ is pretty straightforward, and all they are left to do is the G2 scalar multiplications for $[Z_T(x)]_2$. If you already know your point set $T$ ahead of time, the $[Z_T(x)]_2$ term can be cached.
This leaves us with the following number of operations (other operations impact runtime fairly little)

| Operation | Open quantity | Verify quantity |
|---|---|---|
| Polynomial Interpolation | 0 | 1 |
| G1 Scalar Multiplication | $d - t$ | $n + t$ |
| G2 Scalar Multiplication | 0 | $t$ (can be cached) |
| Pairing | 0 | 2 |
Interestingly, the opening time gets smaller as the size of $T$ grows, due to the growing degree of the vanishing polynomial.
Method 2
Method 2 is an equally secure way to generate blocked openings, which biases computation slightly more to the opener than the verifier. It uses the same assumptions as Method 1.
Method 2, Opening
The prover computes $h(X) = \sum_i \gamma^{i-1} \frac{f_i(X) - r_i(X)}{Z_T(X)}$ for $T$, since $S_i = T$ and so the paper's $Z_{T \setminus S_i}$ factors are all 1. This looks very familiar from method 1! Here they just compute $g(X) = \sum_i \gamma^{i-1} f_i(X)$, then compute the quotient $h(X) = g(X)/Z_T(X)$ and, for now, ignore the remainder $\hat{r}(X)$. Then they compute $W_1 = [h(x)]_1$. This is roughly $d - t$ scalar multiplications.
This method has an additional challenge other than $\gamma$, called $z$, which must take $W_1$ into account. After seeing $W_1$, the verifier sends a random $z$. In reality this is abstracted away using the Fiat-Shamir transform.
The prover then computes $W_2 = \left[\frac{L(x)}{x - z}\right]_1$ for

$$L(X) = \sum_i \gamma^{i-1} \left(f_i(X) - r_i(z)\right) - Z_T(z)\,h(X)$$

Here we notice that we have already computed $\sum_i \gamma^{i-1} r_i(X)$ when we computed the remainder $\hat{r}(X)$ of $g(X)/Z_T(X)$. All they have to do is take this remainder, evaluate it at $z$, and subtract it from $g(X)$. They then evaluate $Z_T(z)$, compute $Z_T(z)\,h(X)$, then compute the polynomial $L(X) = g(X) - \hat{r}(z) - Z_T(z)\,h(X)$ and divide it by $(X - z)$ to get $q(X)$.

Here, $q(X)$ is of degree $d - 1$, so computing $W_2 = [q(x)]_1$ involves roughly $d$ scalar multiplications.
The proof then consists of $(W_1, W_2)$.
Method 2, Verification
The verifier receives $(W_1, W_2)$, the commitments $C_i$, and the evaluations $y_{i,j}$ of the $f_i$ at each $z_j$, and computes

$$F = \sum_i \gamma^{i-1} C_i - [r(z)]_1 - Z_T(z)\,W_1$$

To do this, they do Lagrange interpolation to get $r(X)$, evaluate it at $z$, then do the single scalar multiplication $[r(z)]_1 = r(z) \cdot g_1$. Computing $\sum_i \gamma^{i-1} C_i$ is $n$ scalar multiplications, and one more for computing $Z_T(z)\,W_1$. Next the verifier checks

$$e(F + z\,W_2, [1]_2) = e(W_2, [x]_2)$$

which involves 2 pairings and 2 scalar multiplications, yielding
| Operation | Open quantity | Verify quantity |
|---|---|---|
| Polynomial Interpolation | 0 | 1 |
| G1 Scalar Multiplication | $2d - t$ | $n + 4$ |
| G2 Scalar Multiplication | 0 | 0 |
| Pairing | 0 | 2 |
Comparing the two methods, we get
| Operation | Method 1 Open | Method 2 Open | Method 1 Verify | Method 2 Verify |
|---|---|---|---|---|
| Polynomial Interpolation | 0 | 0 | 1 | 1 |
| G1 Scalar Multiplication | $d - t$ | $2d - t$ | $n + t$ | $n + 4$ |
| G2 Scalar Multiplication | 0 | 0 | $t$ (can be cached) | 0 |
| Pairing | 0 | 0 | 2 | 2 |