Offline memory checking in zkVM
Share on

Memory checking in a zkVM is used to allow a prover to demonstrate to a verifier that read/write memory operations are performed correctly. In such a memory system, a value v can be written to an address a, and later retrieved from address a by the program. This technique enables the verifier to efficiently confirm that the prover has followed memory access rules—specifically, that any read returns the latest value written to that memory address.

The term offline memory checking refers to a technique in zkVM proofs where the correctness of all read operations is checked at once, after all read values have been committed. Offline checking does not immediately determine whether each read is correct at the moment it happens. Instead, it verifies the correctness of all reads in a single batch after all operations are done. This contrasts with online memory checking techniques, such as Merkle hashing, which confirm read correctness immediately by requiring an authentication path for every read. Merkle hashing imposes higher proof costs on the prover for each read operation, whereas offline memory checking is more suitable for zkVM scenarios.

1. Constructing the Read Set RS and Write Set WS

We construct two sets, the read set RS and the write set WS, where each element is a tuple Esto es un ejemplo: \( (a, v, c) \) representing the address, value, and instruction count. They are constructed as follows:

Initialization:

  • \(RS = WS = \emptyset ;\)
  • If the memory \(a_i\) is to be initialized with value \(v_i\), then the initial tuple is added to the write set: \(WS = WSU(a_i, v_i, 0)\).

Memory Read and Write Operations:

  • When reading from address \(a\), find the last tuple \((a, v, c)\) in \(WS\) for that address (i.e., with the largest instruction count \(c\)). Update: \(RS = RSU(a, v, c)\) and \(WS = WSU(a, v, c_{\text{now}})\), where \(c_{\text{now}}\) is the current instruction count.
  • When writing value \(v'\) to address \(a\), find the last tuple \((a, v, c)\) in \(WS\) for address \(a\). Update: \(RS = RSU(a, v, c)\) and \(WS = WSU(a, v', c_{\text{now}})\).

Post-processing:

  • For each memory address \(a_i\), find the last tuple \((a_i, v_i, c_i)\) in \(WS\), and update: \(RS = RSU(a_i, v_i, c_i)\).

2. Core Argument

If the following conditions are met, then the prover has followed the memory rules and read the correct values from memory:

1. The read and write sets are correctly initialized;

2. For each address \(a_i\), the instruction counts added to \(WS\) strictly increase over time;

  • For read operations, the tuples added to \(RS\) and \(WS\) must have the same value;
  • For read-write operations, the tuple added to \(RS\) must have a lower instruction count than the one added to \(WS\);

3. After post-processing, \(RS = WS.\).

Brief Proof: Consider the first faulty memory read operation. Suppose a read operation is expected to return the tuple \((a, v, c)\) but instead erroneously returns \((a, v' \neq v, c')\), which is added to \(RS\). Note that all tuples in \(WS\) are unique. After adding \((a, v', c_{\text{now}})\) to \(WS\), both \((a, v, c)\) and \((a, v', c_{\text{now}})\) are not in \(RS\). According to condition 2, after each read/write operation, there are always at least two tuples in \(WS\) that are not in \(RS\), which means post-processing cannot make \(RS = WS\).

There are two methods for enforcing the final equality of \(WS\) and \(RS\): multiset hashing and lookup tables (e.g., LogUp).

3. Multiset Hashing Method

Multiset hashing allows a (multi)set to be hashed into a short string such that it is computationally difficult to find two different sets with the same hash. The hash is computed incrementally—one element at a time. A key property is that the final hash value is independent of the order in which elements are hashed. This lets us compare sets constructed in different orders.

Let the group \(G\) be the set of points on an elliptic curve defined by \(y^2 = x^3 + Ax + B\) (including the point at infinity). We can implement multiset hashing using group hashing. To hash a set element into a point on the elliptic curve, we first map the element to an \(x\)-coordinate. Since this may not yield a valid curve point, we add an 8-bit adjustment value \(t\) and use hash operations to avoid collisions. Additionally, we apply constraints (e.g., ensuring the \(y\)-coordinate is a quadratic residue or adding range checks) to prevent flipping the sign of the y-coordinate.

For example, during zkVM execution, we can construct the following columns and corresponding constraints:

Columns:

  • \(a\): operation address;
  • \(v_r, v_w\): values added to the read/write set;
  • \(c_r, c_w\): program counters added to the read/write set;
  • \(t_r, t_w\): 8-bit adjustments;
  • \(x_r, x_w\): x-coordinates, satisfying \(x = \text{hash}(a | |v| | |c| | t);\)
  • \(y_r, y_w\): y-coordinates;
  • \(z_r, z_w\): satisfying \(y = z^2\), used to ensure that \(y\) is a quadratic residue;
  • \(h_r, h_w\): current hash values of the read/write sets.

Constraints:

  1. Perform range checks on \(a, v, c, t\);
  2. \(c_{\text{now}} = c_w > c_r\) (you can introduce \(d = c_w - c_r\) and range-check \(d\));
  3. \(x = \text{hash}(a||v||c||t)\), \(y^2 = x^3 + Ax + B\), \(y = z^2\);
  4. \(h = h_{\text{old}} + (x, y)\);
  5. At the end, ensure \(h_r = h_w\).

4. LogUp Method

To prove that two multisets \(A = a_i\) and \(B = b_i\) are equal, it is sufficient to prove that for a randomly chosen challenge \(\gamma\), the following holds:

\[\sum_{i} \frac{1}{a_i + \gamma} = \sum_{i} \frac{1}{b_i + \gamma}\]

We can construct the following columns and constraints:

Columns:

  • \(a\);
  • \(v_r, v_w\);
  • \(c_r, c_w\);
More articles
ZKM 公布了我们的早期贡献者计划的第二阶段:社区演变
在 ZKM,我们的使命是通过将隐私、安全和效率放在首位,彻底改变数字世界。随着我们的早期贡献者计划(ECP)第二阶段:社区演进的推出,我们在实现这一目标方面向前迈出了一大步。该计划在培养具有前瞻性思维的开发人员社区方面发挥了重要作用,他们积极参与塑造开源零知识技术的未来。在这篇文章中,我们很高兴向大家介绍我们的ECP在这个新阶段为我们不断壮大的社区带来的新功能和机会。
ZKM 的投稿者专区
贡献者专区是一个旨在扩大 ZKM 社区的声音和专业知识的平台,邀请专家和发烧友分享他们的见解,加深对 ZK 技术及其应用的集体理解。贡献者专区的主要目标是创建一个充满活力的社区,让成员积极参与知识的共享和创造。该举措旨在:
Offline memory checking in zkVM

Memory checking in a zkVM is used to allow a prover to demonstrate to a verifier that read/write memory operations are performed correctly. In such a memory system, a value v can be written to an address a, and later retrieved from address a by the program. This technique enables the verifier to efficiently confirm that the prover has followed memory access rules—specifically, that any read returns the latest value written to that memory address.

The term offline memory checking refers to a technique in zkVM proofs where the correctness of all read operations is checked at once, after all read values have been committed. Offline checking does not immediately determine whether each read is correct at the moment it happens. Instead, it verifies the correctness of all reads in a single batch after all operations are done. This contrasts with online memory checking techniques, such as Merkle hashing, which confirm read correctness immediately by requiring an authentication path for every read. Merkle hashing imposes higher proof costs on the prover for each read operation, whereas offline memory checking is more suitable for zkVM scenarios.

1. Constructing the Read Set RS and Write Set WS

We construct two sets, the read set RS and the write set WS, where each element is a tuple Esto es un ejemplo: \( (a, v, c) \) representing the address, value, and instruction count. They are constructed as follows:

Initialization:

  • \(RS = WS = \emptyset ;\)
  • If the memory \(a_i\) is to be initialized with value \(v_i\), then the initial tuple is added to the write set: \(WS = WSU(a_i, v_i, 0)\).

Memory Read and Write Operations:

  • When reading from address \(a\), find the last tuple \((a, v, c)\) in \(WS\) for that address (i.e., with the largest instruction count \(c\)). Update: \(RS = RSU(a, v, c)\) and \(WS = WSU(a, v, c_{\text{now}})\), where \(c_{\text{now}}\) is the current instruction count.
  • When writing value \(v'\) to address \(a\), find the last tuple \((a, v, c)\) in \(WS\) for address \(a\). Update: \(RS = RSU(a, v, c)\) and \(WS = WSU(a, v', c_{\text{now}})\).

Post-processing:

  • For each memory address \(a_i\), find the last tuple \((a_i, v_i, c_i)\) in \(WS\), and update: \(RS = RSU(a_i, v_i, c_i)\).

2. Core Argument

If the following conditions are met, then the prover has followed the memory rules and read the correct values from memory:

1. The read and write sets are correctly initialized;

2. For each address \(a_i\), the instruction counts added to \(WS\) strictly increase over time;

  • For read operations, the tuples added to \(RS\) and \(WS\) must have the same value;
  • For read-write operations, the tuple added to \(RS\) must have a lower instruction count than the one added to \(WS\);

3. After post-processing, \(RS = WS.\).

Brief Proof: Consider the first faulty memory read operation. Suppose a read operation is expected to return the tuple \((a, v, c)\) but instead erroneously returns \((a, v' \neq v, c')\), which is added to \(RS\). Note that all tuples in \(WS\) are unique. After adding \((a, v', c_{\text{now}})\) to \(WS\), both \((a, v, c)\) and \((a, v', c_{\text{now}})\) are not in \(RS\). According to condition 2, after each read/write operation, there are always at least two tuples in \(WS\) that are not in \(RS\), which means post-processing cannot make \(RS = WS\).

There are two methods for enforcing the final equality of \(WS\) and \(RS\): multiset hashing and lookup tables (e.g., LogUp).

3. Multiset Hashing Method

Multiset hashing allows a (multi)set to be hashed into a short string such that it is computationally difficult to find two different sets with the same hash. The hash is computed incrementally—one element at a time. A key property is that the final hash value is independent of the order in which elements are hashed. This lets us compare sets constructed in different orders.

Let the group \(G\) be the set of points on an elliptic curve defined by \(y^2 = x^3 + Ax + B\) (including the point at infinity). We can implement multiset hashing using group hashing. To hash a set element into a point on the elliptic curve, we first map the element to an \(x\)-coordinate. Since this may not yield a valid curve point, we add an 8-bit adjustment value \(t\) and use hash operations to avoid collisions. Additionally, we apply constraints (e.g., ensuring the \(y\)-coordinate is a quadratic residue or adding range checks) to prevent flipping the sign of the y-coordinate.

For example, during zkVM execution, we can construct the following columns and corresponding constraints:

Columns:

  • \(a\): operation address;
  • \(v_r, v_w\): values added to the read/write set;
  • \(c_r, c_w\): program counters added to the read/write set;
  • \(t_r, t_w\): 8-bit adjustments;
  • \(x_r, x_w\): x-coordinates, satisfying \(x = \text{hash}(a | |v| | |c| | t);\)
  • \(y_r, y_w\): y-coordinates;
  • \(z_r, z_w\): satisfying \(y = z^2\), used to ensure that \(y\) is a quadratic residue;
  • \(h_r, h_w\): current hash values of the read/write sets.

Constraints:

  1. Perform range checks on \(a, v, c, t\);
  2. \(c_{\text{now}} = c_w > c_r\) (you can introduce \(d = c_w - c_r\) and range-check \(d\));
  3. \(x = \text{hash}(a||v||c||t)\), \(y^2 = x^3 + Ax + B\), \(y = z^2\);
  4. \(h = h_{\text{old}} + (x, y)\);
  5. At the end, ensure \(h_r = h_w\).

4. LogUp Method

To prove that two multisets \(A = a_i\) and \(B = b_i\) are equal, it is sufficient to prove that for a randomly chosen challenge \(\gamma\), the following holds:

\[\sum_{i} \frac{1}{a_i + \gamma} = \sum_{i} \frac{1}{b_i + \gamma}\]

We can construct the following columns and constraints:

Columns:

  • \(a\);
  • \(v_r, v_w\);
  • \(c_r, c_w\);