Technology attributes
Other attributes
zk-SNARK stands for Zero-Knowledge Succinct Non-interactive ARgument of Knowledge. The acronym further breaks down to:
- ZK (zero-knowledge): someone who verifies a statement cannot gain any new information through the process, or they only learn whether a statement is true or false.
- S (succinct): a succinct zk-SNARK proof is short and easy to verify, offering milliseconds of computation time, compared with regular zero-knowledge proofs, which can be long and complex.
- N (non-interactive): it does not require interaction between the prover and verifier, which is in contrast to an interactive proof, in which a prover and verifier must interact to complete the proof.
- AR (argument): a reason or set of reasons given to support a point, which in the context of zk-SNARKs is a series of mathematical operations to prove the truth of a statement.
- K (knowledge): the idea that the prover knows certain information required to complete the proof, such as the private key of an account.
zk-SNARK is the technology behind Zcash and Manta. It is a form of zero-knowledge cryptography that makes all the transactions in Zcash fully encrypted on the blockchain to ensure that the confidentiality of transaction metadata is fully preserved. It allows Zcash users to prove possession of certain information like having a secret key, without the need to reveal that information and without any interaction between the prover and verifier.
In other terms, a zk-SNARK offers a way to prove the correct execution of a defined computation without disclosing the values when performing the computation. To do this, zk-SNARKs are often implemented as a sequence of five steps:
- Representation of the computation to be proved, as a set of constraints among variables
- Reduction of the above set of constraints to polynomial equations
- Choice of a random value and one-way mapping of it and of all the equations to a "homomorphic" space, or a space where linear combinations still hold, so the equations defined in the first two points are still true in the new space; it's a one-way mapping, based on elliptic curve math, as an encryption technique to make it computationally unfeasible to calculate the original value only if you know the mapped encrypted point
- Proof creation, by mapping the equations, calculated in the chosen and secret value, to the homomorphic space; random values are then used to further hide the equations in this phase to preserve the equation structure
- Proof verification, by calculating the equation; the verification is not performed in the mapped space, but values are once more mapped to a new space by a pairing function that preserves and allows to check multiplication between the original values
zk-SNARKs have various applications, which can make use of their abilities to perform verification without revealing sensitive information. They can be used to make sidechains that are more scalable, or related blockchain-infrastructure projects with privacy at its center; to make supply chains visible without revealing sensitive or proprietary information; and for self-sovereign identity use cases, such as reducing the need for site-specific identity verification for various websites and applications.
The zk-SNARK with blockchains offers two general applications in terms of scalability and privacy. In scalability, a zk-SNARK allows users to verify the proof instead of taking the time to verify the statement, and the proof can be verified faster. And in privacy, the user can prove they have the right to transfer an asset without revealing the link to the asset and ensures security without leaking information about who is transacting with whom to the public blockchain.
Further, for blockchain-specific applications, zk-SNARKs can be used to develop privacy-preserving sidechains, cross-chain transfers, and privacy-enabled audit solutions. They can be employed in SDKs and allow users to develop blockchain applications with fully scalable and private ecosystems to maximize throughput and maintain low transaction fees. While for sidechains, parallel blockchains can be developed to communicate with the mainchain through a bridge while maintaining the privacy of users and allowing those users to make verifiable cross-chain transfers from blockchains in all ecosystems without relying on trusted validators.
In an enterprise blockchain network, zk-SNARKs could be used in to enable traceability and transaction verification for visibility into supply chains that can be used for enterprises but can also be used for consumers and regulatory agencies to ensure products have been treated properly throughout the supply chain. And for industries, such as pharmaceutical, food, and healthcare, where products require specialized supply chains to maintain the integrity of the product. For example, in 2013, the Drug Supply Chain Security Act (DSCSA), Title II of the Drug Quality and Security Act, required the pharmaceutical industry to adopt a digital system to track and trace prescription drugs in the United States.
Another application of zk-SNARKs can be in self-sovereign identity, where using zero-knowledge proofs to prove claims about aspects of an individual's identity. This is considered a major case for zk-SNARKs, where personal questions can be verified without revealing personally identifiable information. For example, when asked if an individual is "over 21," "live in the U.S.," or "employed," the information can be answered and verified through zk-SNARKs without giving specific details of the individual's identity or otherwise sensitive information to the party asking the question.
Further, when signing on to a website or application, a user is often required to verify their identity and create a profile, with that data being saved on a central database and requiring the user to remember in order to retain their profile and access to the website. However, with zk-SNARKs, users can generate a single sign-on that can be verified through the zk-SNARK proof, which does not need to share the personally identifiable information while verifying the truth of those statements. This can allow users to not need to remember passwords or sign-in usernames and instead use the zk-SNARK proof for their identity and get access to the website or application.
Zero-knowledge proofs are a cryptographic concept used in various applications, such as in privacy and blockchain technology. These proofs involve two parties: a prover and a verifier. In this scenario, the prover makes an assertion that their proof is valid, and the verifier must approve, without the prover leaking "knowledge" other than the assertion itself. And his asks the question of how a verifier can validate an assertion without any knowledge of the content and minimal interaction with the prover, which is where zero-knowledge proofs are used to prove the correctness of a statement without revealing any information. The concept, developed in part by Shafi Goldwasser, Silvio Micali, and Charles Rackoff, introduced the zero-knowledge concept with three main essential properties:
- Completeness: If the statement is true, the verifier has to be convinced of the truth of the statement by the prover.
- Soundness: The prover can only convince the verifier of the truth of the statement if that statement itself is actually true, except with some very small probability.
- Zero-knowledge: The verifier does not know any knowledge of the prover's statement or solution other than it is true.
One of the first valid general zero-knowledge proof systems was proposed by Goldreich, Micali, and Wigderson, which was specific to verify that a prover knows the 3-coloring of a graph. The 3-coloring problem is an NP-complete problem, stated as: given a graph G, can you color the nodes with <3 colors such that for every edge {u, v} to get f(u) =/= f(v). In this proof system, the prover wants to prove the verifier they know the 3-coloring of the given graph, without revealing the actual 3-coloring solution. Thus, the zero-knowledge proof works in the following way:
- The prover covers each vertex of a 3-coloring solution of a graph not made observable to the verifier.
- The verifier randomly chooses an edge of the graph, and the prover reveals the two vertices of the chosen edge, where the prover shows the two vertices are of a different color. If the two vertices are of the same color, the verifier knows the prover is dishonest. If the two vertices are of a different color, the verifier has some confidence that the prover is telling the truth. In this case, the prover has a (E-1)/E probability of cheating, with E representing the number of edges in the graph.
- The prover now covers all vertices of the graph again, and randomly switches the ordering of the three colors in the prover's solution, the verifier again chooses a random edge to check the edge is valid. The prover could cheat, but the probability of cheating in both rounds is ((E - 1)/E) * ((E - 1)/E) = ((E - 1)/E)2 which is lower than the previous round.
- After repeating the process for multiple rounds, the probability the prover is cheating the verifier can be reduced to a negligible value.
One of the important aspects of this protocol is that they have to show that the verifier cannot identify the actual solution to the 3-coloring solution. This is where the randomness of the coloring at each round comes in, with each round ordering the coloring differently means the verifier cannot link the edge revealed between subsequent rounds to construct a valid solution to the 3-coloring graph problem. And the 3-coloring graph problem is shown to be NP-complete, and any problem in the class NP can be reduced into an instance of that problem, which means, in essence, that all statements in NP can be verified through the zero-knowledge protocol. However, developing the protocol for efficient useful practice in everyday applications leads to the development of zk-SNARKs to make it more efficient and more applicable in practice.
The first zero-knowledge proofs, as shown above, were introduced in the 1980s by Goldwasser, Micali, and Rackoff, but the development of zk-SNARKs was largely introduced in the 2010s. zkSNARK was named in a 2011 paper by Nir Bitansky, Ran Canetti, Alessandro Chiesa, and Eran Tromer, with the term describing a new zero-knowledge protocol that would not, like prior methods, require interaction between the prover and verifier outside of a single message. Their work also introduced the idea of an extractable collision hash function (ECRH) and proved that the existence of ECRH implies a modified version of Di Crecsenzo and Lipmaa's protocol is a succinct non-interactive argument for NP, which was introduced as SNARK, with the zero-knowledge version being zk-SNARK.
When a prover wants to prove to a verifier some truth, such as knowing an input w such as z=f(x,w), where x is publicly known input to the function f. The verifier wants to be assured that the z provided by the prover is correct, and the provers want to be assured the verifier gains no knowledge about the input w. This is similar to more general zero-knowledge proofs, providing the completeness, soundness, and zero-knowledge attributes of those proofs, with the zk-SNARK building on these with the non-interactive and succinct requirements.
For non-interactive, unlike back-and-forth interactions between the prover and verifier—such as in the 3-coloring graph example—in zk-SNARKs, the prover provides on z along with a string p, which proves to the verifier that z is the correct output of f. The efficiency comes out of the one-way interaction, as z and p are sufficient to the verifier. While to be succinct, given a l, which is a security parameter in zk-SNARK, the p provided by the prover must have size Ol(1), and the verification should be able to finish, according to Bitansky, Canetti, Chiesa, and Tromer in the runtime: Ol(|f| + |x| + |z|).
To successfully construct such zk-SNARKs, there are four main ingredients: encoding into polynomial problem, succinctness through random samples, homomorphic encryption, and zero knowledge. A paper by Gennaro, Gentry, Parno, and Raykova stated that reducing original problems to Quadratic Span Programs (QSP's) was helpful in constructing zk-SNARKs. A QSP consists of multiple polynomials (v0...vi, wo...wj) over a field F and a target polynomial t. The QSP is accepted for an input and a witness if, and only if, t divides va * wb, where va and wb is constructed from the witness and the original polynomials.
Thus, the prover shows that t * k = va * wb for some other polynomial k. However, due to the complexity of large polynomials and the runtimes of multiplying and dividing polynomials, the QSP proves difficult to verify in practice. In other words, the verifier chooses a secret point s, such that t(s) * k(s) = va(s) * wb(s). Verifying the QSP at one point, rather than for all points, reduces security, since there are relatively few zeroes to the equation satisfactorily, but is relatively safe in real applications.
One of the more common constructions of zk-SNARKs involves a Common Reference String (CRS) and a setup of initial parameters. This includes a group and a generator (g), with an encryption scheme (E) where E(x) = gx. The verifier then secretly chooses s as well as another value (z) and posts as part of the CRS as: E(s0), E(s0),..., E(sd) and E(zs0), E(zs1),..., E(zsd) where d represents the maximum degree of all polynomials in the program. Once the values are calculated and posted, the verifier must discard the secret point (s) to ensure the prover cannot obtain it to create false proofs. The prover is then required to publish values above to prove that he can compute a polynomial function (f). The prover can then compute m = E(f(s)) for any function without knowing the verifier's secret value.
As the verifier discards s to ensure it is not discovered, there is no way to check that the prover correctly solved the polynomial for s. The way around this is that the verifier receives new values (m and n), the verifier is required to check that m and nmatch through a pairing function P. This function is chosen in the CRS setup phase, such that it holds for all input values x and y: P(gx, gy) = P(g, g)xy where a pairing function P is a computable bijection. The pairing function P further becomes useful as plugging in pairs n, g and m, gz into the pairing function, and if the results match, then it can be known that the prover solved the polynomial correctly at s.
The verifier can therefore verify the prover's solution without actually having to replicate the prover's computation. This shows that the protocol is non-interactive, such that the prover only needs to send a sequence of values to the verifier once and in one direction, while the verifier does not need to ask additional questions to verify the correctness of the prover's statement. However, this example is not completely zero knowledge. In this calculation, the verifier does not know what s or f(s) are but can calculate back through information by solving for correctness and the public value can find s. This requires a zero-knowledge element to be added to the SNARK protocol.
To add zero knowledge, the above example is modified with the prover choosing a random value to "shift" the value of f(s) before encryption. This means, instead of computing E(f(s)) and E(z * f(s)), the prover computes m' = E(p + f(s)) and n' = E(z * (p+f(s))) and sends it to the verifier: E(p + f(s)) = gp + f(s) = gp * gf(s) = E(f(s)) * E(p). However, in these, the prover can still compute m' and n' from the public parameters in the CRS. So, once the verifier receives the m' and n' the values can be put into the pairing function P in a fashion similar to above: P(m', gz) = P(gp + f(s), gz) = P(g, g)z*(p+f(s)) and P(n', g) = P(gz(p+f(s)), g) = P(g, g)z*(p+f(s)).
The above equations show the verification process as continuing to work properly, with the verifier's computation still limited to the pairing function. An added benefit is the zero knowledge and the prover no longer sends the value to the verifier for validation and E(f(s)) is no longer leaked. And a malicious verifier can extract from values m' and n' is p + f(s). Since p is random and only known to the prover, it is now apparent the malicious verifier can no longer deduce the value of f(s) and the modified protocol is verifiably zero knowledge.
The above example is simplified for a single polynomial f, while most problems zk-SNARKs are used for involve multiple polynomials derived from the original program to be proved. Further, in the setup phase of the CRS, there are additional parameters to show the values the prover sends to the verifier to ensure that the computation satisfies the assignment of the QSP and the initial polynomials, as a prover can theoretically find the values that make the statement p(m, gz) = p(n, g) still true without performing the necessary computation. Therefore, the prover sends more polynomials to the verifier, who uses a pairing function to match inputs, and the prover continues to use a random value to "shift" the values sent to the verifier to maintain zero knowledge.
The succinctness of zk-SNARKs is with the verification process and does not refer to the prover side. With more complex problems, the prover ends up with more complex polynomials in the Quadratic Span Program (QSP). To achieve succinctness in the verification process, there is no need to do polynomial multiplication for verification but rather to check polynomial identity for a single point. Although it leads to a reduction in security, the reduction is negligible and maintains the soundness of the proof. That leaves the efficiency of the verification process to be only affected by the inputs to the original program, and the group size chosen in the CRS setup.
The succinctness and its maintenance regardless of the problem complexity, and its non-interactive nature is the largest added benefit in the recent development of zk-SNARKs, as opposed to the zero-knowledge proofs introduced in the 1980s, and is largely the reason zk-SNARKs can be used in real-life applications.
One of the engineering challenges of zk-SNARKs in applications is the translation of the computational problem to the QSP or Quadratic Arithmetic Programs (QAP). For the zk-SNARKs to be applied to a problem a prover wants to verify, there tend to be three main preparatory procedures: a "flattening procedure" or a translation into gates, conversion from gates to R1CS, and conversion from R1CS to QAP form. The first step translates complex statements into a series of simple equations, or logic gates, of the form "x = y" or "x = y (operation) z" where the operation can be addition, subtraction, multiplication, or division, and y and z can be variables, numbers, or sub-expressions.
These simple "logic gates" are then converted into a form called the Rank-1 Constraint System (R1CS), which involves a sequence of three vectors a, b, and c, and the solution s to the R1CS, or the witness that the prover provides, is another vector. The transformation into the RICS is done for each of the logic gates in the previous step. Then the series of vectors in R1CS form are converted to polynomial form, using Lagrange Interpolation to create a polynomial, such that evaluating each polynomial at various coordinates, which creates the values of vectors that were created in R1CS before the series of polynomials from the QAP problem, to which zk-SNARKs can be applied.
There are various challenges which zk-SNARKs and zero-knowledge proofs face that prevent widespread application. One is the scalability and another is the security of the zk-SNARKs and the zero-knowledge proofs and their implementation. zk-SNARKs offer greater efficiency than other zero-knowledge proofs, but one issue with the implementation of zk-SNARKs is the cost to the prover in generating the actual proof that is to be sent to the verifier. Implementations of the zk-SNARKs have been shown to generate expensive CPU costs, and the pure memory required to store the transcript of the proof is a further cost and bottleneck. This means zk-SNARKs in many cases can only be applied to small-scale computations, while any overly complex program can be impractical and difficult to prove, let alone to scale for millions of users.
Further, depending on the implementation of the zk-SNARK, if a malicious actor is able to access a private key used to create the paramaters of the given proof protocol, they could create false proofs that look valid to verifiers. In the case in which the zk-SNARK is used to mint tokens, this could be used to counterfeit coins. Further, if these proofs can be manipulated through the acquisition of the private key, the different transactions using zk-SNARKs can be put at risk, but the trust created by the zero-knowledge proof can mean these false transactions can go without detection for a while.
The CRS phase of the zk-SNARK that creates private randomness further decreases the security of zero-knowledge systems. This is, in part, what can lead to the generation of fake proofs and fraudulently create verifiable transactions. This has led to an emerging solution known as zk-STARKs. zk-STARKs stands for zero-knowledge, scalable, and transparent argument of knowledge. The zk-STARKs work to provide scalability and transparency to zero-knowledge proofs. While "transparency" refers to the lack of a trusted set-up process, the proofs in zk-STARKs only use public parameters, and thus a malicious party would not have an unfair advantage or a way to generate a fake proof.
Additionally, zk-STARKs are developed to provide scalability that other zero-knowledge proofs and solutions do not have. This is based on the proofs provided in zk-STARKs systems that can be verified faster than zk-SNARKs, with zk-STARKs providing exponentially decreasing verification time, and a node in a blockchain network can produce proofs that can convince other nodes without requiring the nodes to store the entire blockchain's state or re-execute the computation. However, the downside of zk-STARKs is its long proofs, with proofs roughly 1000 times longer than zk-SNARK proofs.