Memory checking in a zkVM is used to allow a prover to demonstrate to a verifier that read/write memory operations are performed correctly. In such a memory system, a value v can be written to an address a, and later retrieved from address a by the program. This technique enables the verifier to efficiently confirm that the prover has followed memory access rules—specifically, that any read returns the latest value written to that memory address.
The term offline memory checking refers to a technique in zkVM proofs where the correctness of all read operations is checked at once, after all read values have been committed. Offline checking does not immediately determine whether each read is correct at the moment it happens. Instead, it verifies the correctness of all reads in a single batch after all operations are done. This contrasts with online memory checking techniques, such as Merkle hashing, which confirm read correctness immediately by requiring an authentication path for every read. Merkle hashing imposes higher proof costs on the prover for each read operation, whereas offline memory checking is more suitable for zkVM scenarios.
We construct two sets, the read set RS and the write set WS, where each element is a tuple Esto es un ejemplo: \( (a, v, c) \) representing the address, value, and instruction count. They are constructed as follows:
Initialization:
Memory Read and Write Operations:
Post-processing:
If the following conditions are met, then the prover has followed the memory rules and read the correct values from memory:
1. The read and write sets are correctly initialized;
2. For each address \(a_i\), the instruction counts added to \(WS\) strictly increase over time;
3. After post-processing, \(RS = WS.\).
Brief Proof: Consider the first faulty memory read operation. Suppose a read operation is expected to return the tuple \((a, v, c)\) but instead erroneously returns \((a, v' \neq v, c')\), which is added to \(RS\). Note that all tuples in \(WS\) are unique. After adding \((a, v', c_{\text{now}})\) to \(WS\), both \((a, v, c)\) and \((a, v', c_{\text{now}})\) are not in \(RS\). According to condition 2, after each read/write operation, there are always at least two tuples in \(WS\) that are not in \(RS\), which means post-processing cannot make \(RS = WS\).
There are two methods for enforcing the final equality of \(WS\) and \(RS\): multiset hashing and lookup tables (e.g., LogUp).
Multiset hashing allows a (multi)set to be hashed into a short string such that it is computationally difficult to find two different sets with the same hash. The hash is computed incrementally—one element at a time. A key property is that the final hash value is independent of the order in which elements are hashed. This lets us compare sets constructed in different orders.
Let the group \(G\) be the set of points on an elliptic curve defined by \(y^2 = x^3 + Ax + B\) (including the point at infinity). We can implement multiset hashing using group hashing. To hash a set element into a point on the elliptic curve, we first map the element to an \(x\)-coordinate. Since this may not yield a valid curve point, we add an 8-bit adjustment value \(t\) and use hash operations to avoid collisions. Additionally, we apply constraints (e.g., ensuring the \(y\)-coordinate is a quadratic residue or adding range checks) to prevent flipping the sign of the y-coordinate.
For example, during zkVM execution, we can construct the following columns and corresponding constraints:
Columns:
Constraints:
To prove that two multisets \(A = a_i\) and \(B = b_i\) are equal, it is sufficient to prove that for a randomly chosen challenge \(\gamma\), the following holds:
\[\sum_{i} \frac{1}{a_i + \gamma} = \sum_{i} \frac{1}{b_i + \gamma}\]
We can construct the following columns and constraints:
Columns:
Memory checking in a zkVM is used to allow a prover to demonstrate to a verifier that read/write memory operations are performed correctly. In such a memory system, a value v can be written to an address a, and later retrieved from address a by the program. This technique enables the verifier to efficiently confirm that the prover has followed memory access rules—specifically, that any read returns the latest value written to that memory address.
The term offline memory checking refers to a technique in zkVM proofs where the correctness of all read operations is checked at once, after all read values have been committed. Offline checking does not immediately determine whether each read is correct at the moment it happens. Instead, it verifies the correctness of all reads in a single batch after all operations are done. This contrasts with online memory checking techniques, such as Merkle hashing, which confirm read correctness immediately by requiring an authentication path for every read. Merkle hashing imposes higher proof costs on the prover for each read operation, whereas offline memory checking is more suitable for zkVM scenarios.
We construct two sets, the read set RS and the write set WS, where each element is a tuple Esto es un ejemplo: \( (a, v, c) \) representing the address, value, and instruction count. They are constructed as follows:
Initialization:
Memory Read and Write Operations:
Post-processing:
If the following conditions are met, then the prover has followed the memory rules and read the correct values from memory:
1. The read and write sets are correctly initialized;
2. For each address \(a_i\), the instruction counts added to \(WS\) strictly increase over time;
3. After post-processing, \(RS = WS.\).
Brief Proof: Consider the first faulty memory read operation. Suppose a read operation is expected to return the tuple \((a, v, c)\) but instead erroneously returns \((a, v' \neq v, c')\), which is added to \(RS\). Note that all tuples in \(WS\) are unique. After adding \((a, v', c_{\text{now}})\) to \(WS\), both \((a, v, c)\) and \((a, v', c_{\text{now}})\) are not in \(RS\). According to condition 2, after each read/write operation, there are always at least two tuples in \(WS\) that are not in \(RS\), which means post-processing cannot make \(RS = WS\).
There are two methods for enforcing the final equality of \(WS\) and \(RS\): multiset hashing and lookup tables (e.g., LogUp).
Multiset hashing allows a (multi)set to be hashed into a short string such that it is computationally difficult to find two different sets with the same hash. The hash is computed incrementally—one element at a time. A key property is that the final hash value is independent of the order in which elements are hashed. This lets us compare sets constructed in different orders.
Let the group \(G\) be the set of points on an elliptic curve defined by \(y^2 = x^3 + Ax + B\) (including the point at infinity). We can implement multiset hashing using group hashing. To hash a set element into a point on the elliptic curve, we first map the element to an \(x\)-coordinate. Since this may not yield a valid curve point, we add an 8-bit adjustment value \(t\) and use hash operations to avoid collisions. Additionally, we apply constraints (e.g., ensuring the \(y\)-coordinate is a quadratic residue or adding range checks) to prevent flipping the sign of the y-coordinate.
For example, during zkVM execution, we can construct the following columns and corresponding constraints:
Columns:
Constraints:
To prove that two multisets \(A = a_i\) and \(B = b_i\) are equal, it is sufficient to prove that for a randomly chosen challenge \(\gamma\), the following holds:
\[\sum_{i} \frac{1}{a_i + \gamma} = \sum_{i} \frac{1}{b_i + \gamma}\]
We can construct the following columns and constraints:
Columns: