Overview
The Polynomial Multiproof (PMP) scheme is a polynomial commitment scheme that allows for efficiently creating/verifying opening proofs for multiple polynomials at multiple points.
Here's why this project is cool: Opening gets faster the more points we open a polynomial at.
This is a huge deal for data availability systems, which are bottlenecked by opening time.
For some sample numbers, here are times for opening/verifying degree $32768 - 1$ polynomials using BLS12-381 on a single core of an M1 Macbook Pro:
This specification outlines implementation requirements for the polynomial multiproof (PMP) scheme. This builds on previous methods such as KZG10, and is derived entirely from BDFG21. It should be seen as choosing a special case of BDFG21 which allows for significant optimizations that make the protocol more viable for use in applications like Data Availability.
The scheme consists of four methods:

- `Setup`, which sets up a structured reference string for the curve and does some precomputation
- `Commit`, which commits to a polynomial
- `M1Open`/`M2Open`, which compute a single opening proof that a set of polynomials are equal to specific values at a set of points
- `M1Verify`/`M2Verify`, which verify an opening proof against commitments and evaluations
API
The API is designed around two traits. The first looks roughly like
```rust
/// A curve-agnostic trait for a BDFG commitment scheme *without precomputation*
pub trait PolyMultiProofNoPrecomp<E: Pairing>: Sized {
    /// The output proof type
    type Proof: Clone;

    /// Creates a proof of the given polynomials and evals at the given points
    fn open(
        &self,
        transcript: &mut Transcript,
        evals: &[impl AsRef<[E::ScalarField]>],
        polys: &[impl AsRef<[E::ScalarField]>],
        points: &[E::ScalarField],
    ) -> Result<Self::Proof, Error>;

    /// Verifies a proof against the given set of commitments and points
    fn verify(
        &self,
        transcript: &mut Transcript,
        commits: &[Commitment<E>],
        points: &[E::ScalarField],
        evals: &[impl AsRef<[E::ScalarField]>],
        proof: &Self::Proof,
    ) -> Result<bool, Error>;
}
```
This takes a Merlin transcript for the Fiat-Shamir transform, some points $z_1, \dots, z_t$, some polynomials $f_1, \dots, f_n$ given by their coefficients, and the evaluations $y_{i,j} = f_i(z_j)$ of each polynomial at each point.
This API is the most flexible, but not the most efficient.
The second trait is more efficient, but requires knowledge of which points are being opened to beforehand.
```rust
/// A curve-agnostic trait for a BDFG commitment scheme *with precomputation*
pub trait PolyMultiProof<E: Pairing>: Sized {
    /// The output proof type
    type Proof: Clone;

    /// Creates a proof of the given polynomials at the given point set index
    fn open(
        &self,
        transcript: &mut Transcript,
        evals: &[impl AsRef<[E::ScalarField]>],
        polys: &[impl AsRef<[E::ScalarField]>],
        point_set_index: usize,
    ) -> Result<Self::Proof, Error>;

    /// Verifies a proof against the given set of commitments and points
    fn verify(
        &self,
        transcript: &mut Transcript,
        commits: &[Commitment<E>],
        point_set_index: usize,
        evals: &[impl AsRef<[E::ScalarField]>],
        proof: &Self::Proof,
    ) -> Result<bool, Error>;
}
```
When creating this object, there will be some way to specify the points, or they will be predetermined by the implementation. The API is largely the same.
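As an illustration only, a caller of the precomputed-points trait might look like the sketch below. The helper function and transcript label are hypothetical; `PolyMultiProof`, `Commitment`, and `Error` are the items referenced above, and `Pairing`/`Transcript` come from the arkworks and merlin crates the API is written against.

```rust
use ark_ec::pairing::Pairing;
use merlin::Transcript;

// A hypothetical caller driving the precomputed-points trait shown above. It assumes
// `PolyMultiProof`, `Commitment`, and `Error` from this crate are in scope; the function
// name and transcript label are illustrative only.
fn open_and_verify<E: Pairing, P: PolyMultiProof<E>>(
    pmp: &P,
    commits: &[Commitment<E>],
    polys: &[Vec<E::ScalarField>],
    evals: &[Vec<E::ScalarField>],
    point_set_index: usize,
) -> Result<bool, Error> {
    // The prover and verifier must build identical transcripts for Fiat-Shamir to agree.
    let mut prover_transcript = Transcript::new(b"pmp-example");
    let proof = pmp.open(&mut prover_transcript, evals, polys, point_set_index)?;

    let mut verifier_transcript = Transcript::new(b"pmp-example");
    pmp.verify(&mut verifier_transcript, commits, point_set_index, evals, &proof)
}
```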
Applications to Data Availability
Data availability systems can benefit greatly from PMP, since polynomial commitment-based DA systems are tremendously constrained by opening time. Even with very fast KZG implementations, opening is too slow and constrains the amount of data that can be processed through the system.
Opening a degree 255 polynomial over the BLS12-381 scalar field at a single point takes at best ~3-6ms, which means opening every cell of 256 such polynomials will take >200s.
Because PMP allows many cells to be opened at once, the data availability grid can be chunked into a smaller grid, allowing for a faster, more secure system with far higher throughput. This allows polynomial commitment-based DA systems to scale up in size without taking more compute.
Curve selection
The protocol depends on the selection of a pairing based curve. Attributes of an ideal curve for this protocol are, in order of importance,
- Security
- Fast G1 scalar multiplication
- Large scalar field size
- Fast G2 scalar multiplication
- Small compressed G1 point size
BLS12-381 satisfies the first two criteria well, and while other curves likely could satisfy more requirements, they are currently not as well audited. Hence BLS12-381 is a reasonable choice to start with.
The scheme given in BDFG21 is interactive, so Merlin transcripts are used to do the Fiat-Shamir transform.
Optimizations made from BDFG21 are outlined in A. Optimizations.
All times are with a Xeon E5-2676 v3 @ 2.40GHz vCPU on AWS
More FFTs are required as data size increases, but these are dwarfed by opening times. Additionally, exactly how the chunking is done has nuanced security implications in different scenarios.
2. Methods
Here we detail the Setup and Commit methods.
Setup
Setup defines the shared parameters for all other methods
```
Setup(max_coeffs: usize, point_sets: Vec<Vec<Scalar>>) -> Parameters
```
- `max_coeffs` is the maximum number of coefficients we will commit/open to at a time. For most curve choices, this should be a power of 2.
- `point_sets` lists the different sets of points where we can open a grid.
  - This is done so we can do ahead-of-time computation that only depends on which set of points we are opening at, and not on the actual contents of the data at those indices.
  - Ex: `point_sets = [[0, 4], [1, 2]]` means we can open any rows of the grid at `[0, 4]` or `[1, 2]`. We can only open at both points of `[0, 4]` at the same time, not at `0` or `4` individually.
`Setup` does the following:

- Samples a random secret $x$ from the scalar field
- Computes the G1 powers $[x^i]_1 = x^i \cdot g_1$ for each $i$ from $0$ to `max_coeffs` $- 1$
- Computes the G2 powers $[x^i]_2 = x^i \cdot g_2$ (see the note below on how many are needed)
- For the $i$-th set of points in `point_sets`
  - Enumerates the points at the given indices from the FFT domain. Call them $z_1, \dots, z_t$.
  - Constructs the unique degree $t - 1$ Lagrange polynomials $L_j(X)$ for each $z_j$, i.e. $L_j(z_j) = 1$ and $L_j(z_k) = 0$ for $k \neq j$
    - If a denominator is zero due to a repeated point, setup errors
  - Constructs the vanishing polynomial $Z_T(X) = \prod_{j=1}^{t} (X - z_j)$
  - Computes $[Z_T(x)]_2$

Note: for method 1, we only need as many G2 powers of $x$ as points we open to. For method 2, we only need $[x]_2$.
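The per-point-set part of this precomputation might look like the following sketch, assuming the arkworks `ark_ff`/`ark_poly` crates; `PointSetPrecomp` and `precompute_point_set` are illustrative names rather than this crate's actual API.

```rust
use ark_ff::{Field, One};
use ark_poly::univariate::DensePolynomial;
use ark_poly::DenseUVPolynomial;

struct PointSetPrecomp<F: Field> {
    lagranges: Vec<DensePolynomial<F>>, // L_j(X), each of degree t - 1
    vanishing: DensePolynomial<F>,      // Z_T(X) = prod_j (X - z_j)
}

fn precompute_point_set<F: Field>(points: &[F]) -> Option<PointSetPrecomp<F>> {
    // Z_T(X) = prod_j (X - z_j)
    let mut vanishing = DensePolynomial::from_coefficients_vec(vec![F::one()]);
    for &z in points {
        let linear = DensePolynomial::from_coefficients_vec(vec![-z, F::one()]);
        vanishing = vanishing.naive_mul(&linear);
    }
    // L_j(X) = prod_{k != j} (X - z_k) / (z_j - z_k)
    let mut lagranges = Vec::with_capacity(points.len());
    for (j, &zj) in points.iter().enumerate() {
        let mut num = DensePolynomial::from_coefficients_vec(vec![F::one()]);
        let mut denom = F::one();
        for (k, &zk) in points.iter().enumerate() {
            if k != j {
                let linear = DensePolynomial::from_coefficients_vec(vec![-zk, F::one()]);
                num = num.naive_mul(&linear);
                denom *= zj - zk;
            }
        }
        // A repeated point makes a denominator zero; mirror Setup's error by returning None.
        let denom_inv = denom.inverse()?;
        lagranges.push(DensePolynomial::from_coefficients_vec(
            num.coeffs.iter().map(|c| *c * denom_inv).collect(),
        ));
    }
    Some(PointSetPrecomp { lagranges, vanishing })
}
```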
Commit
Committing is exactly the same as in KZG10.
```
Commit(p: Parameters, poly: Polynomial) -> Commitment
```
Takes the polynomial $f(X) = \sum_i c_i X^i$ and computes the commitment $C = [f(x)]_1 = \sum_i c_i \cdot [x^i]_1$.
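A minimal sketch of this computation, under the same arkworks assumptions; `powers_of_g1` stands for the $[x^i]_1$ values produced by `Setup`, and the function name is illustrative.

```rust
use ark_ec::pairing::Pairing;
use ark_ff::Zero;

// C = [f(x)]_1 = sum_i c_i * [x^i]_1 (assumes coeffs.len() <= powers_of_g1.len())
fn commit<E: Pairing>(powers_of_g1: &[E::G1], coeffs: &[E::ScalarField]) -> E::G1 {
    coeffs
        .iter()
        .zip(powers_of_g1)
        .map(|(c, p)| *p * *c)
        .fold(E::G1::zero(), |acc, term| acc + term)
}
```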
3. Method 1
This method biases towards a faster opening time: opening requires roughly half the compute of `M2Open`.
See A. Optimizations for more details.
For the following, the notation of BDFG21 is followed, namely

- $[a]_1$ denotes $a \cdot g_1$ and $[a]_2$ denotes $a \cdot g_2$, where $g_1$ and $g_2$ generate G1 and G2
- For a set of points $T = \{z_1, \dots, z_t\}$ and a polynomial $f_i$, $r_i$ is the unique polynomial of degree at most $t - 1$ such that $r_i(z_j) = f_i(z_j)$ for all $z_j \in T$
M1Open
```
M1Open(transcript: Transcript, evals: Vec<Vec<Scalar>>, polys: Vec<Polynomial>, point_set_index: usize) -> Proof
```
Let the polynomials be $f_1, \dots, f_n$ and the points be $z_1, \dots, z_t$.
- Checks that `evals` has length $n$ and each element of `evals` has length $t$
- Transcribes the following to the Merlin transcript
  - each of the evals row by row with the utf-8 message `open evals`
  - each of the points in order with the utf-8 message `open points`
- Constructs $\gamma$ by
  - Reading `SCALAR_SIZE` bytes from `transcript` with utf-8 message `open gamma`
  - Interpreting the bytes as a big-endian integer, reduced modulo the modulus of the scalar field
- Computes $g(X) = \sum_{i=1}^{n} \gamma^{i-1} f_i(X)$
- Polynomial divides to get $h(X) = g(X) / Z_T(X)$, discarding the remainder
- Computes $W = [h(x)]_1$ and serializes it in compressed form; this is the proof
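A sketch of the core of `M1Open` after $\gamma$ has been read from the transcript, under the same arkworks assumptions; `vanishing` and `powers_of_g1` stand for the `Setup` outputs $Z_T(X)$ and $[x^i]_1$, and this is an illustration rather than the crate's implementation.

```rust
use ark_ec::pairing::Pairing;
use ark_ff::{One, Zero};
use ark_poly::univariate::{DenseOrSparsePolynomial, DensePolynomial};
use ark_poly::DenseUVPolynomial;

// Returns W = [h(x)]_1 where h = (sum_i gamma^(i-1) f_i) / Z_T, remainder discarded.
fn m1_open_core<E: Pairing>(
    gamma: E::ScalarField,
    polys: &[DensePolynomial<E::ScalarField>],
    vanishing: &DensePolynomial<E::ScalarField>,
    powers_of_g1: &[E::G1],
) -> Option<E::G1> {
    // g(X) = sum_i gamma^(i-1) f_i(X)
    let mut g = DensePolynomial::zero();
    let mut coeff = E::ScalarField::one();
    for f in polys {
        let scaled = DensePolynomial::from_coefficients_vec(
            f.coeffs.iter().map(|c| *c * coeff).collect(),
        );
        g = &g + &scaled;
        coeff *= gamma;
    }
    // h(X) = g(X) / Z_T(X); the remainder is thrown away
    let (h, _remainder) = DenseOrSparsePolynomial::from(g)
        .divide_with_q_and_r(&DenseOrSparsePolynomial::from(vanishing))?;
    // W = [h(x)]_1 (assumes h has no more coefficients than the SRS provides)
    Some(
        h.coeffs
            .iter()
            .zip(powers_of_g1)
            .map(|(c, p)| *p * *c)
            .fold(E::G1::zero(), |acc, term| acc + term),
    )
}
```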
M1Verify
```
M1Verify(transcript: Transcript, commits: Vec<Commitment>, evals: Vec<Vec<Scalar>>, proof: Proof, point_set_index: usize) -> bool
```
Let the evals be $y_{i,j}$, the commits be $C_1, \dots, C_n$, and the points be $z_1, \dots, z_t$. Commit $C_i$ must be for the polynomial $f_i$ that evaluates to $y_{i,j}$ at point $z_j$.
This method

- Checks that each element has the correct length
  - `commits` has length $n$
  - Each element of `evals` has length $t$
- Transcribes the points/evals the same as in the opening
- Reads $\gamma$ the same as in the opening
- For each $j$, computes $q_j = \sum_{i=1}^{n} \gamma^{i-1} y_{i,j}$, which represents the value of $\sum_{i} \gamma^{i-1} f_i(z_j)$
- Uses the $q_j$ to interpolate, using the previously computed Lagrange polynomials, $r(X) = \sum_{j=1}^{t} q_j L_j(X)$
- Computes $[r(x)]_1$ from the G1 reference string
- Computes $F = \sum_{i=1}^{n} \gamma^{i-1} C_i - [r(x)]_1$
- Returns `true` if $e(F, [1]_2) = e(W, [Z_T(x)]_2)$
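A sketch of the final combination and pairing check, assuming arkworks types; `r_commit`, `w`, `g2_gen`, and `z_t_g2` stand for $[r(x)]_1$, the proof $W$, $[1]_2$, and $[Z_T(x)]_2$, all assumed to have been computed or loaded as described above.

```rust
use ark_ec::pairing::Pairing;
use ark_ff::{One, Zero};

// Accept iff e(F, [1]_2) == e(W, [Z_T(x)]_2) where F = sum_i gamma^(i-1) C_i - [r(x)]_1.
fn m1_check<E: Pairing>(
    gamma: E::ScalarField,
    commits: &[E::G1],
    r_commit: E::G1, // [r(x)]_1
    w: E::G1,        // the opening proof W
    g2_gen: E::G2,   // [1]_2
    z_t_g2: E::G2,   // [Z_T(x)]_2
) -> bool {
    let mut f = E::G1::zero();
    let mut coeff = E::ScalarField::one();
    for c in commits {
        f += *c * coeff;
        coeff *= gamma;
    }
    f -= r_commit;
    E::pairing(f, g2_gen) == E::pairing(w, z_t_g2)
}
```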
4. Method 2
This method biases towards a faster verifying time, and is roughly twice as slow to open on. The API is exactly the same as in method 1. See A. Optimizations for more details.
M2Open
```
M2Open(transcript: Transcript, evals: Vec<Vec<Scalar>>, polys: Vec<Polynomial>, point_set_index: usize) -> Proof
```
This method

- Transcribes the following (all points serialized compressed if possible)
  - each of the evals row by row with the utf-8 message `open evals`
  - each of the points in order with the utf-8 message `open points`
- Constructs $\gamma$ by
  - Reading `SCALAR_SIZE` bytes from `transcript` with utf-8 message `open gamma`
  - Interpreting the bytes as a big-endian integer, reduced modulo the modulus of the scalar field
- Computes, for each polynomial $f_i$, the scaled polynomial $\gamma^{i-1} f_i(X)$, and sums them into $g(X) = \sum_{i=1}^{n} \gamma^{i-1} f_i(X)$
- Polynomial divides to get $h(X) = g(X) / Z_T(X)$, with remainder $\hat{r}(X)$
- Computes $W_1 = [h(x)]_1$
- Serializes $W_1$ in compressed form and transcribes it with the message `open W1`
- Constructs $z$ in the same way as we did $\gamma$, reading with message `open z`
- Computes $\sum_{i} \gamma^{i-1} r_i(z)$ by computing $\hat{r}(z)$ (the remainder $\hat{r}(X)$ equals $\sum_i \gamma^{i-1} r_i(X)$; see A. Optimizations)
- Computes $Z_T(z)$
- Computes $L(X) = g(X) - \hat{r}(z) - Z_T(z)\,h(X)$
- Computes $q(X) = L(X) / (X - z)$
- Computes $W_2 = [q(x)]_1$
- Returns $(W_1, W_2)$, serialized in compressed form
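A sketch of the second half of `M2Open` (after $W_1$ has been transcribed and $z$ has been read), under the same arkworks assumptions; `g`, `h`, and `rem` are the combined polynomial, quotient, and remainder from the earlier division, and `vanishing`/`powers_of_g1` again stand for $Z_T(X)$ and $[x^i]_1$ from `Setup`.

```rust
use ark_ec::pairing::Pairing;
use ark_ff::{One, Zero};
use ark_poly::univariate::{DenseOrSparsePolynomial, DensePolynomial};
use ark_poly::{DenseUVPolynomial, Polynomial};

// Returns W_2 = [L(x) / (x - z)]_1 where L(X) = g(X) - r_hat(z) - Z_T(z) * h(X).
fn m2_open_second<E: Pairing>(
    z: E::ScalarField,
    g: &DensePolynomial<E::ScalarField>,
    h: &DensePolynomial<E::ScalarField>,
    rem: &DensePolynomial<E::ScalarField>,
    vanishing: &DensePolynomial<E::ScalarField>,
    powers_of_g1: &[E::G1],
) -> Option<E::G1> {
    // L(X) = g(X) - r_hat(z) - Z_T(z) * h(X), which has a root at z
    let r_hat_z = rem.evaluate(&z);
    let z_t_z = vanishing.evaluate(&z);
    let scaled_h = DensePolynomial::from_coefficients_vec(
        h.coeffs.iter().map(|c| *c * z_t_z).collect(),
    );
    let l = &(g - &scaled_h) - &DensePolynomial::from_coefficients_vec(vec![r_hat_z]);
    // q(X) = L(X) / (X - z); the remainder of this division is zero
    let divisor = DensePolynomial::from_coefficients_vec(vec![-z, E::ScalarField::one()]);
    let (q, _zero_rem) = DenseOrSparsePolynomial::from(l)
        .divide_with_q_and_r(&DenseOrSparsePolynomial::from(divisor))?;
    // W_2 = [q(x)]_1
    Some(
        q.coeffs
            .iter()
            .zip(powers_of_g1)
            .map(|(c, p)| *p * *c)
            .fold(E::G1::zero(), |acc, term| acc + term),
    )
}
```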
M2Verify
```
M2Verify(transcript: Transcript, commits: Vec<Commitment>, evals: Vec<Vec<Scalar>>, (w_1, w_2): Proof, point_set_index: usize) -> bool
```
Let the evals be $y_{i,j}$, the commits be $C_1, \dots, C_n$, and the points be $z_1, \dots, z_t$. Commit $C_i$ must be for the polynomial $f_i$ that evaluates to $y_{i,j}$ at point $z_j$.
This method

- Transcribes the points/evals the same as in the opening
- Reads $\gamma$ the same as in the opening
- Transcribes $W_1$
- Reads $z$ the same as in the opening
- Lagrange interpolates $r(X) = \sum_{j=1}^{t} \left( \sum_{i=1}^{n} \gamma^{i-1} y_{i,j} \right) L_j(X)$ using the given evaluations, same as in method 1
- Computes $r(z)$ and $Z_T(z)$
- Computes $F = \sum_{i=1}^{n} \gamma^{i-1} C_i - [r(z)]_1 - Z_T(z)\,W_1$
- Computes $F + z\,W_2$
- Returns `true` if $e(F + z\,W_2, [1]_2) = e(W_2, [x]_2)$
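A sketch of the final computation and pairing check of `M2Verify`, assuming arkworks types; the scalars $r(z)$ and $Z_T(z)$, the commitments, and the proof parts are assumed to have been derived as in the steps above, and `g1_gen`/`g2_gen`/`x_g2` stand for $[1]_1$, $[1]_2$, and $[x]_2$ from `Setup`.

```rust
use ark_ec::pairing::Pairing;
use ark_ff::{One, Zero};

// Accept iff e(F + z*W_2, [1]_2) == e(W_2, [x]_2)
// where F = sum_i gamma^(i-1) C_i - [r(z)]_1 - Z_T(z) * W_1.
fn m2_check<E: Pairing>(
    gamma: E::ScalarField,
    z: E::ScalarField,
    r_at_z: E::ScalarField,   // r(z)
    z_t_at_z: E::ScalarField, // Z_T(z)
    commits: &[E::G1],
    w1: E::G1,
    w2: E::G1,
    g1_gen: E::G1, // [1]_1
    g2_gen: E::G2, // [1]_2
    x_g2: E::G2,   // [x]_2
) -> bool {
    let mut f = E::G1::zero();
    let mut coeff = E::ScalarField::one();
    for c in commits {
        f += *c * coeff;
        coeff *= gamma;
    }
    f -= g1_gen * r_at_z;
    f -= w1 * z_t_at_z;
    E::pairing(f + w2 * z, g2_gen) == E::pairing(w2, x_g2)
}
```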
For why this works, see A. Optimizations.
A. Optimizations
This contains all the optimizations made from the methods of BDFG21 and why they are correct. Please read the paper first 😘.
Method 1
This method is the fastest for opening, and slightly slower for verification than Method 2. But, as written in the paper, verification is impractically slow for any appreciable number of polynomials/points. So here we make an assumption that makes the computation reasonable: each polynomial $f_i$ is opened at all the same points, that is (using the notation from the paper) $S_i = T$ for all $i$. We also assume each polynomial has the same degree $d$.
Method 1, Opening
Let $T = \{z_1, \dots, z_t\}$ be the point set, $Z_T(X) = \prod_{j=1}^{t}(X - z_j)$ its vanishing polynomial, and $n$ the number of polynomials. For brevity, assume sums over $i$ are done as $\sum_{i=1}^{n}$.
The prover has to compute $W = [h(x)]_1$, which means they must compute the coefficients of $h(X)$ and then do scalar multiplications and some additions. $h(X)$ is defined as

$$h(X) = \sum_i \gamma^{i-1} \frac{f_i(X) - r_i(X)}{Z_T(X)}$$

Here, $r_i$ is the unique polynomial of degree at most $t - 1$ such that $r_i(z_j) = f_i(z_j)$ for all $z_j \in T$. Note that the polynomial fraction divides cleanly, because $f_i(z_j) - r_i(z_j) = 0$ for all $z_j \in T$, meaning the numerator has factors of $(X - z_j)$ for every $z_j$, and so does $Z_T(X)$.
The term $r_i(X)$ is just the unique minimal degree polynomial you need to subtract from $f_i(X)$ to allow $Z_T(X)$ to cleanly divide it. But this also means that $r_i(X)$ is simply the remainder of dividing $f_i(X)$ by $Z_T(X)$. And the same applies to the combination: $\sum_i \gamma^{i-1} r_i(X)$ is just the remainder you get when you divide $\sum_i \gamma^{i-1} f_i(X)$ by $Z_T(X)$! So all they need to do in order to compute $h(X)$ is to

- Compute $g(X) = \sum_i \gamma^{i-1} f_i(X)$
- Use a polynomial division algorithm to divide $g(X)$ by $Z_T(X)$
- Take the resulting quotient and throw away the remainder
This makes opening very fast, because no interpolation is needed.
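This remainder identity is easy to sanity-check in isolation. The snippet below is a standalone check (not part of the spec), assuming the `ark_bls12_381`, `ark_poly`, `ark_ff`, and `ark_std` crates; the divisor is an arbitrary random polynomial standing in for $Z_T(X)$, since the identity holds for any divisor.

```rust
use ark_bls12_381::Fr;
use ark_ff::UniformRand;
use ark_poly::univariate::{DenseOrSparsePolynomial, DensePolynomial};
use ark_poly::DenseUVPolynomial;

// Remainder of p(X) divided by d(X).
fn remainder(p: &DensePolynomial<Fr>, d: &DensePolynomial<Fr>) -> DensePolynomial<Fr> {
    DenseOrSparsePolynomial::from(p.clone())
        .divide_with_q_and_r(&DenseOrSparsePolynomial::from(d.clone()))
        .expect("nonzero divisor")
        .1
}

// Multiply every coefficient of p(X) by s.
fn scale(p: &DensePolynomial<Fr>, s: Fr) -> DensePolynomial<Fr> {
    DensePolynomial::from_coefficients_vec(p.coeffs.iter().map(|c| *c * s).collect())
}

fn main() {
    let mut rng = ark_std::test_rng();
    let z_t = DensePolynomial::rand(4, &mut rng); // stands in for Z_T(X)
    let f1 = DensePolynomial::rand(10, &mut rng);
    let f2 = DensePolynomial::rand(10, &mut rng);
    let gamma = Fr::rand(&mut rng);

    // gamma-combination of the individual remainders...
    let lhs = &remainder(&f1, &z_t) + &scale(&remainder(&f2, &z_t), gamma);
    // ...equals the remainder of the gamma-combined polynomial.
    let rhs = remainder(&(&f1 + &scale(&f2, gamma)), &z_t);
    assert_eq!(lhs, rhs);
}
```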
Method 1, Verification
For verification, the verifier needs to do a little more legwork than the prover. Let the commitment to the $i$-th polynomial be $C_i = [f_i(x)]_1$.
First, the verifier needs to compute the combined evaluations $q_j = \sum_i \gamma^{i-1} y_{i,j}$. That's simple enough; now they have to check

$$e\left(\sum_i \gamma^{i-1} C_i - [r(x)]_1,\ [1]_2\right) = e\left(W,\ [Z_T(x)]_2\right)$$

Constraining every polynomial to the same point set lets them compute a single pairing on the left instead of $n$! Now let's look at each side of the pairing. To compute the LHS, they need to compute $[r(x)]_1$, where $r(X) = \sum_i \gamma^{i-1} r_i(X)$. Sadly, there's no nice way around this here: it must be interpolated. But they can interpolate $r$ once rather than interpolate each $r_i$. That leaves us with

- $t$ G1 scalar mults to compute $[r(x)]_1$ from the reference string
- $n$ G1 scalar mults to compute $\sum_i \gamma^{i-1} C_i$

Then all they're left with is the single pairing. Next they have to compute the RHS, $e(W, [Z_T(x)]_2)$. Computing $Z_T(X)$ is pretty straightforward, and all they are left to do is the G2 scalar multiplications for $[Z_T(x)]_2$. If you already know your point set $T$ ahead of time, the $[Z_T(x)]_2$ term can be cached.
This leaves us with the following number of operations (other operations impact runtime fairly little)

| Operation | Open quantity | Verify quantity |
|---|---|---|
| Polynomial Interpolation | 0 | 1 |
| G1 Scalar Multiplication | $d - t$ | $n + t$ |
| G2 Scalar Multiplication | 0 | $t$ (can be cached) |
| Pairing | 0 | 2 |
Interestingly, the opening time gets smaller as the size of $T$ grows, due to the growing degree of the vanishing polynomial.
Method 2
Method 2 is an equally secure way to generate blocked openings, which biases computation slightly more to the opener than the verifier. It uses the same assumptions as Method 1.
Method 2, Opening
The prover computes $h(X) = \sum_i \gamma^{i-1} \frac{f_i(X) - r_i(X)}{Z_T(X)}$ for $T$, since $S_i = T$ and so the paper's $Z_{T \setminus S_i}$ factors are all 1. This looks very familiar from method 1! Here they just compute $g(X) = \sum_i \gamma^{i-1} f_i(X)$, then compute the quotient $h(X) = g(X)/Z_T(X)$ and, for now, ignore the remainder $\hat{r}(X)$. Then they compute $W_1 = [h(x)]_1$. This is roughly $d - t$ scalar multiplications.
This method has an additional challenge other than $\gamma$, called $z$, which must take $W_1$ into account. After seeing $W_1$, the verifier sends a random $z$. In reality this is abstracted away using the Fiat-Shamir transform.
The prover then computes $W_2 = \left[\frac{L(x)}{x - z}\right]_1$ for

$$L(X) = \sum_i \gamma^{i-1} \left(f_i(X) - r_i(z)\right) - Z_T(z)\,h(X)$$

Here we notice that we have already computed $\sum_i \gamma^{i-1} r_i(X)$ when we computed the remainder $\hat{r}(X)$ of $g(X)/Z_T(X)$. All they have to do is take this remainder, evaluate it at $z$, and subtract it from $g(X)$. They then evaluate $Z_T(z)$, compute $Z_T(z)\,h(X)$, then compute the polynomial $L(X) = g(X) - \hat{r}(z) - Z_T(z)\,h(X)$ and divide it by $(X - z)$ to get $q(X)$.

Here, $q(X)$ is of degree $d - 1$, so computing $W_2 = [q(x)]_1$ involves roughly $d$ scalar multiplications.
The proof then consists of $(W_1, W_2)$.
Method 2, Verification
The verifier receives $(W_1, W_2)$, the commitments $C_i$, and the evaluations $y_{i,j}$ of the $f_i$ at each $z_j$, and computes

$$F = \sum_i \gamma^{i-1} C_i - [r(z)]_1 - Z_T(z)\,W_1$$

To do this, they do Lagrange interpolation to get $r(X)$, evaluate it at $z$, then do the single scalar multiplication $[r(z)]_1 = r(z) \cdot g_1$. Computing $\sum_i \gamma^{i-1} C_i$ is $n$ scalar multiplications, and one more for computing $Z_T(z)\,W_1$. Next the verifier checks

$$e(F + z\,W_2, [1]_2) = e(W_2, [x]_2)$$

which involves 2 pairings and 2 scalar multiplications, yielding
| Operation | Open quantity | Verify quantity |
|---|---|---|
| Polynomial Interpolation | 0 | 1 |
| G1 Scalar Multiplication | $2d - t$ | $n + 4$ |
| G2 Scalar Multiplication | 0 | 0 |
| Pairing | 0 | 2 |
Comparing the two methods, we get
| Operation | Method 1 Open | Method 2 Open | Method 1 Verify | Method 2 Verify |
|---|---|---|---|---|
| Polynomial Interpolation | 0 | 0 | 1 | 1 |
| G1 Scalar Multiplication | $d - t$ | $2d - t$ | $n + t$ | $n + 4$ |
| G2 Scalar Multiplication | 0 | 0 | $t$ (can be cached) | 0 |
| Pairing | 0 | 0 | 2 | 2 |