Last Update

OPML feed of all feeds.

Subscribe to the Atom feed or RSS feed to stay up to date.

Thank you to arXiv for use of its open access interoperability.

Note: the date of arXiv entries announced right after publication holidays might incorrectly show up as the date of the publication holiday itself. This is due to our ad hoc method of inferring announcement dates, which are not returned by the arXiv API.

Powered by Pluto.

Source on GitHub.

Maintained by Nima Anari, Arnab Bhattacharyya, Gautam Kamath.

Theory of Computing Report

Sunday, April 05

Fun Little Solutions

from Computational Complexity

Here are the solutions to the problems I posted last week.

Problem 1

A language \(L\) is commutative if for all \(u\), \(v\) in \(L\), \(uv = vu\). Show that \(L\) is commutative if and only if \(L\) is a subset of \(w^*\) for some string \(w\). The "only if" direction is surprisingly tricky.

Answer

For the "if" direction, suppose \(L \subseteq w^*\). Then every \(u, v \in L\) can be written as \(u = w^i\) and \(v = w^j\), so \(uv = w^{i+j} = vu\).

For the "only if" direction, assume \(L\) is commutative. We may assume \(L\) contains a non-empty string. Let \(u\) be the shortest such string, let \(m\) be the greatest common divisor of the lengths of all non-empty strings in \(L\), and let \(w\) be the prefix of \(u\) of length \(m\). We use the following lemma.

Lemma. If \(xy = yx\) with \(x, y\) non-empty, then both are powers of their common prefix of length \(\gcd(|x|, |y|)\).

Proof of Lemma. By strong induction on \(|x| + |y|\). If \(|x| = |y|\), comparing the first \(|x|\) characters gives \(x = y\), and both equal that string to the first power. If \(|x| < |y|\) (WLOG), comparing the first \(|x|\) characters of \(xy\) and \(yx\) shows \(x\) is a prefix of \(y\). Write \(y = xz\). Then \(x(xz) = (xz)x\) simplifies to \(xz = zx\). Since \(|x| + |z| < |x| + |y|\), the inductive hypothesis gives that \(x\) and \(z\) are powers of their common prefix of length \(\gcd(|x|, |z|) = \gcd(|x|, |y|)\). Since \(y = xz\), \(y\) is a power of this prefix too. \(\square\)

First we show \(u = w^{|u|/m}\). For each non-empty \(v \in L\), commutativity and the lemma give \(u = r_v^{|u|/\gcd(|u|,|v|)}\), where \(r_v\) is the prefix of \(u\) of length \(\gcd(|u|,|v|)\); in other words, each \(\gcd(|u|,|v|)\) is a period of \(u\) dividing \(|u|\). Such periods combine under gcd (any two proper ones sum to at most \(|u|\), so the Fine and Wilf periodicity theorem applies), so \(u\) has period \(\gcd_v \gcd(|u|,|v|) = \gcd(|u|, m) = m\); that is, \(u = w^{|u|/m}\).

Now for any non-empty \(v \in L\), commutativity gives \(uv = vu\), so by the lemma, \(u\) and \(v\) are both powers of the prefix \(r\) of \(u\) of length \(\gcd(|u|, |v|)\). Since \(m\) divides both \(|u|\) and \(|v|\), it divides \(|r|\); as \(r\) is a prefix of \(u = w^{|u|/m}\) whose length is a multiple of \(m\), \(r\) is itself just \(w\) repeated \(|r|/m\) times. Therefore \(v = w^{|v|/m}\). \(\square\)
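As a sanity check of the equivalence (an illustrative brute-force script, not part of the proof), one can verify the statement on small finite languages, building \(w\) exactly as the proof does:

```python
from itertools import product
from math import gcd
from functools import reduce
import random

def is_commutative(L):
    """L is commutative iff every pair of strings in L commutes."""
    return all(u + v == v + u for u in L for v in L)

def power_base(L):
    """Return w such that L is a subset of w*, built as in the proof
    (w = prefix of the shortest non-empty string, of length m = gcd of
    all non-empty lengths), or None if no such w exists."""
    nonempty = [s for s in L if s]
    if not nonempty:
        return ""          # L contains only the empty string
    m = reduce(gcd, (len(s) for s in nonempty))
    u = min(nonempty, key=len)
    w = u[:m]
    if all(s == w * (len(s) // m) for s in nonempty):
        return w
    return None

# Check the equivalence on random small binary languages.
random.seed(0)
universe = [''.join(p) for n in range(5) for p in product('01', repeat=n)]
for _ in range(2000):
    L = random.sample(universe, random.randint(1, 4))
    assert is_commutative(L) == (power_base(L) is not None)
print("ok")
```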

Problem 2

Let an NP machine be sparse if for some \(k\), it has at most \(n^k\) accepting paths for every input of length \(n\). The class FewP is the set of languages accepted by sparse NP machines. Let \(\#N(x)\) be the number of accepting paths of \(N(x)\). Show that if P = FewP then \(\#N(x)\) is polynomial-time computable for any sparse NP machine \(N\).

Answer

The obvious approach is to create a machine \(N'(x, k)\) that guesses \(k\) distinct accepting paths of \(N(x)\) and accepts if all of them accept. But this fails: if \(N(x)\) has \(2n\) accepting paths then \(N'(x, n)\) has \(\binom{2n}{n}\), i.e. exponentially many, accepting paths, so \(N'\) is not sparse.

Instead, define \(N'(x, w)\) that accepts if \(N(x)\) has an accepting path whose sequence of nondeterministic choices starts with the bit string \(w\). Each accepting path of \(N'(x, w)\) corresponds to an accepting path of \(N(x)\), so \(N'\) is also sparse and \(L(N') \in \mathrm{FewP} = \mathrm{P}\). Now search the tree of path prefixes level by level, extending only prefixes the polynomial-time decider confirms are live; sparsity keeps the number of live prefixes at each level at most \(n^k\), so counting the full-length survivors takes polynomial time.
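A minimal sketch of this tree search, assuming nondeterministic choices are bits, every accepting path of \(N(x)\) is described by exactly `path_len` bits, and `oracle` is a hypothetical polynomial-time decider for \(L(N')\) (available since P = FewP):

```python
def count_accepting_paths(x, oracle, path_len):
    """Count accepting paths of a sparse NP machine N on input x.
    oracle(x, w) decides whether some accepting path of N(x) starts
    with the bit string w. Sparsity of N bounds the number of live
    prefixes at each level, so the search stays polynomial."""
    live = ['']                        # live prefixes at the current level
    for _ in range(path_len):
        nxt = []
        for w in live:
            for b in '01':
                if oracle(x, w + b):   # does some accepting path extend w+b?
                    nxt.append(w + b)
        live = nxt
    return len(live)                   # full-length live prefixes = accepting paths
```

With a toy oracle whose accepting paths are, say, `{'010', '110'}`, the search returns 2 after exploring only the live branches of the tree.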

Problem 3

Let PERM(\(L\)) be the set of all permutations of the strings in \(L\). For example, PERM(\(\{000, 010\}\)) = \(\{000, 001, 010, 100\}\). Are regular languages closed under PERM? How about context-free languages?

Answer

Regular languages are not closed under PERM. Let \(L = (01)^*\); then \(\mathrm{PERM}(L) \cap 0^*1^* = \{0^n 1^n : n \geq 0\}\), which is not regular, even though regular languages are closed under intersection. Similarly, context-free languages are not closed under PERM in general: \(\mathrm{PERM}((012)^*) \cap 0^*1^*2^* = \{0^n 1^n 2^n : n \geq 0\}\), which is not context-free.

However, over a binary alphabet \(\{a, b\}\), if \(L\) is context-free then \(\mathrm{PERM}(L)\) is context-free. Over a binary alphabet, two strings are permutations of each other if and only if they have the same number of each letter, so \(\mathrm{PERM}(L)\) depends only on the Parikh image \(\Pi(L) = \{(|u|_a, |u|_b) : u \in L\}\).

By Parikh's theorem, \(\Pi(L)\) is semilinear for any CFL \(L\), and every semilinear set is the Parikh image of some regular language \(R\). Thus \(\mathrm{PERM}(L) = \mathrm{PERM}(R)\), and it suffices to show \(\mathrm{PERM}(R)\) is context-free for regular \(R\).

Given a DFA \(A\) for \(R\), construct a PDA that, on input \(w\), nondeterministically selects a rearrangement \(u\) of \(w\) while simulating \(A\) on \(u\). The stack tracks the running imbalance \(\Delta_i = |\{a\text{'s in } u_1\cdots u_i\}| - |\{a\text{'s in } w_1 \cdots w_i\}|\) in unary, while the finite control tracks the DFA state and sign of \(\Delta_i\). The PDA accepts iff the DFA reaches a state in \(F\) and the stack is empty (i.e., \(|u|_a = |w|_a\)), establishing \(\mathrm{PERM}(R)\) is context-free.
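The construction can be sanity-checked without building the PDA: over \(\{a, b\}\) only the Parikh image matters, so \(w \in \mathrm{PERM}(R)\) iff \(R\) contains some string with the same letter counts, which a small dynamic program over (DFA state, number of \(a\)'s consumed) answers. A sketch, with a complete DFA encoded as a nested dict (an illustrative encoding):

```python
def in_perm_of_regular(w, dfa, start, accepting):
    """Decide w in PERM(R) for regular R over {'a','b'}, given a complete
    DFA as dfa[state][char] -> state. We track which pairs
    (DFA state, number of a's used) are reachable by strings of length
    len(w), then ask whether an accepting state pairs with w.count('a')."""
    n, na = len(w), w.count('a')
    reachable = {(start, 0)}
    for _ in range(n):
        reachable = {(dfa[q][c], k + (c == 'a'))
                     for (q, k) in reachable for c in 'ab'}
    return any(q in accepting and k == na for (q, k) in reachable)
```

This is a membership check illustrating the Parikh-image argument, not the PDA itself; the PDA performs the same bookkeeping with its stack holding the unary imbalance.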

Problem 4

Suppose you have a one-tape Turing machine where we allow the transition function to move the head left, right, or stay put. Show there is an equivalent one-tape Turing machine that only moves the head left or right — and do it without increasing the size of the state space or tape alphabet.

Answer

For each pair \((q, a)\) with \(\delta(q, a) = (p, b, S)\), precompute the stay-closure: keep applying \(\delta\) while the move is \(S\). Since only the scanned cell changes, the process evolves on the finite set \(Q \times \Gamma\). Exactly one of three outcomes occurs: you reach \((p', b', D)\) with \(D \in \{L, R\}\); you enter \(q_{\mathrm{acc}}\) or \(q_{\mathrm{rej}}\); or you fall into an \(S\)-only cycle. Define \(\delta'\) by:

  • if the first non-stay step is \((p', b', D)\), set \(\delta'(q, a) = (p', b', D)\);
  • if the closure halts, write the last \(b'\) and move (say \(R\)) into \(q_{\mathrm{acc}}\) or \(q_{\mathrm{rej}}\);
  • if it \(S\)-loops, write any fixed symbol and move (say \(R\)) into \(q_{\mathrm{rej}}\).

Leave all non-\(S\) transitions unchanged. Then \(Q\) and \(\Gamma\) are unchanged, no \(S\)-moves remain, and the accepted language is preserved.
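The stay-closure precomputation can be sketched directly, with the transition function encoded as a dict `delta[(q, a)] = (p, b, move)` (an illustrative encoding; \(S\)-cycles are redirected into `q_rej` as above):

```python
def eliminate_stay(delta, q_acc, q_rej):
    """Precompute stay-closures, removing every S-move from a one-tape TM.
    delta[(q, a)] = (p, b, move) with move in {'L', 'R', 'S'}. A closure
    ending in a halting state gets an arbitrary move (here 'R'); S-only
    cycles mean the original machine loops, so we redirect into q_rej.
    States and tape alphabet are untouched."""
    new_delta = {}
    for (q, a), (p, b, move) in delta.items():
        seen = set()
        # Follow S-moves: only the scanned cell changes, so the walk
        # lives on the finite set of (state, symbol) pairs.
        while move == 'S' and p not in (q_acc, q_rej):
            if (p, b) in seen:         # S-only cycle detected
                p, move = q_rej, 'R'
                break
            seen.add((p, b))
            p, b, move = delta[(p, b)]
        if move == 'S':                # closure reached q_acc or q_rej
            move = 'R'
        new_delta[(q, a)] = (p, b, move)
    return new_delta
```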

Problem 5

Let E be the set of problems solvable in time \(2^{O(n)}\). Show unconditionally that E \(\neq\) NP.

Answer

EXP, the set of problems solvable in time \(2^{n^{O(1)}}\), has a complete problem (under polynomial-time reductions) that lies in E. So if E = NP, this complete problem is in NP; since NP is closed under polynomial-time reductions, every EXP problem is in NP, giving NP = EXP and hence E = EXP. But the time hierarchy theorem separates E from EXP, e.g. \(\mathrm{E} \subseteq \mathrm{DTIME}(2^{2n}) \subsetneq \mathrm{DTIME}(2^{n^2}) \subseteq \mathrm{EXP}\), a contradiction.
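One standard witness for such a complete problem (an illustrative choice; the post does not single one out) is the exponentially bounded acceptance problem:

```latex
K \;=\; \bigl\{\langle M, x, 0^{\,t}\rangle \;:\; M \text{ accepts } x \text{ within } 2^{t} \text{ steps}\bigr\}.
```

Then \(K \in \mathrm{E}\): on an input of length \(n \geq t\), simulating \(2^{t} \leq 2^{n}\) steps takes \(2^{O(n)}\) time. And \(K\) is EXP-complete under polynomial-time reductions: if \(L \in \mathrm{DTIME}(2^{n^c})\) is decided by \(M_L\), then \(x \mapsto \langle M_L, x, 0^{\,|x|^c}\rangle\) is a polynomial-time reduction from \(L\) to \(K\).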

Note this proof does not say anything about whether or not E is contained in NP or vice versa.

Problem 6

Show there is a computable list of Turing machines \(M_1, M_2, \ldots\) such that \(\{L(M_1), L(M_2), \ldots\}\) is exactly the set of computable languages.

Answer

This is impossible if the \(M_i\) are all total Turing machines (halt on all inputs). But I never made that requirement.

Let \(N_1, N_2, \ldots\) be the standard enumeration of all Turing machines. Define \(M_i(x)\) to accept if \(N_i(y)\) halts for all \(y < x\) and \(N_i(x)\) accepts. If \(N_i\) is total then \(L(M_i) = L(N_i)\). If \(N_i\) is not total then \(L(M_i)\) is finite and hence computable. Thus \(\{L(M_1), L(M_2), \ldots\}\) contains all computable languages and no non-computable ones. 

By Lance Fortnow

TR26-048 | Optimal Random Self-Reductions for All Linear Problems | Shuichi Hirahara, Nobutaka Shimizu

from ECCC Papers

The linear problem specified by an $n \times n$ matrix $M$ over a finite field is the problem of computing the product of $M$ and a given vector $x$. We present optimal error-tolerant random self-reductions (also known as worst-case to average-case reductions) for all linear problems: Given a linear-size circuit that computes $M x$ on an $\varepsilon$-fraction of inputs $x$ for a positive constant $\varepsilon$, we construct a randomized linear-size circuit that computes $M x$ for all inputs $x$ with high probability. This resolves the open problem posed by Asadi, Golovnev, Gur, Shinkar, and Subramanian (SODA'24), who presented quantum $n^{1.5}$-time random self-reductions for all linear problems. Somewhat surprisingly, we also demonstrate the quantum advantage of their quantum reduction over classical uniform algorithms, by proving that any classical subquadratic-time random self-reduction requires the advice complexity of $\Omega(\log(1/\varepsilon) \cdot \log n)$, as long as the field size is at most $1/\varepsilon$. We complement this advice complexity lower bound by presenting (1) a random self-reduction with the optimal advice complexity of $O(\log(1/\varepsilon) \cdot \log n)$ and (2) a uniform random self-reduction over a large finite field.

TR26-047 | An $\Omega ( (\log n / \log \log n)^2 )$ Cell-Probe Lower Bound for Dynamic Boolean Data Structures | Young Kun Ko

from ECCC Papers

We resolve the long-standing open problem of Boolean dynamic data structure hardness, proving an unconditional lower bound of $\Omega((\log n / \log\log n)^2)$ for the Multiphase Problem of Patrascu [STOC 2010] (instantiated with Inner Product over $\mathbb{F}_2$). This matches the celebrated barrier for weighted problems established by Larsen [STOC 2012] and closes the gap left by the $\Omega(\log^{1.5} n)$ Boolean bound of Larsen, Weinstein, and Yu [STOC 2018]. The previous barrier was methodological: all prior works relied on ``one-way'' communication games, where the inability to verify query simulations necessitated complex machinery (such as the Peak-to-Average Lemma) that hit a hard ceiling at $\log^{1.5} n$. Our key contribution is conceptual: We introduce a 2.5-round Multiphase Communication Game that augments the standard one-way model with a verification round, where Bob confirms the consistency of Alice's simulation against the actual memory. This simple, qualitative change allows us to bypass technical barriers and obtain the optimal bound directly. As a consequence, our analysis naturally extends to other hard Boolean functions, offering a general recipe for translating discrepancy lower bounds into $\Omega((\log n / \log\log n)^2)$ dynamic Boolean data structure lower bounds. We also argue that this result likely represents the structural ceiling of the Chronogram framework initiated by Fredman and Saks [STOC 1989]: any $\omega(\log^2 n)$ lower bound would require either fundamentally new techniques or major circuit complexity breakthroughs.

TR26-046 | Black-Box Separation Between Multi-Collision Resistance and Collision Resistance | Xinyu Mao, Jiapeng Zhang

from ECCC Papers

A $K$-multi-collision-resistant hash function ($K$-MCRH) is a shrinking keyed function for which it is computationally infeasible to find $K$ distinct inputs that map to the same output under a randomly chosen hash key; the case $K = 2$ coincides with the standard definition of collision-resistant hash function (CRH). A natural question is whether $K$-MCRH implies CRH for $K \geq 3$, as noted by Komargodski, Naor, and Yogev (EUROCRYPT 2018) and also by Jain, Li, Robere, and Xun (FOCS 2024). We resolve this question for all constant $K$, showing that there is no black-box construction of $K$-MCRH from $(K + 1)$-MCRH for all constant $K \geq 2$. We also show that there is no black-box construction of distributional CRH (which is another relaxation of CRH) from 3-MCRH, answering an open question posed by Komargodski and Yogev (CRYPTO 2018) and also by Berman, Degwekar, Rothblum, and Vasudevan (EUROCRYPT 2018). Besides applications in cryptography, our separation also implies black-box separations between TFNP search problems, which are related to problems in proof complexity and other areas.

TR26-045 | Using Hardness vs Randomness to Design Low-Space Algorithms | Edward Pyne, Roei Tell

from ECCC Papers

Can we use ``hardness vs randomness'' techniques to design low-space algorithms? This text surveys a sequence of recent works showing ways to do that. These works designed algorithms for certified derandomization and for catalytic computation (which work unconditionally), derandomization and isolation algorithms from remarkably mild assumptions, and ``win-win'' pairs of algorithms that, for every input, solve either derandomization or another important problem (e.g., $s$-$t$ connectivity) on the input. Underlying these constructions are new, specialized ``hardness vs randomness'' tools for the setting of low-space algorithms. We describe these technical tools, most notably constructions of pseudorandom generators whose reconstruction algorithm (i.e., the security reduction) is a deterministic low-space algorithm. We also explain a key part of obtaining deterministic reconstruction, which is deterministic transformations of distinguishers to bit-predictors. We pose a host of open questions that explore new ways of using hardness vs randomness to design low-space algorithms. These questions address problems in derandomization, catalytic computation, explicit constructions, learning algorithms, and more. This survey appeared in Issue 148 of the Bulletin of the European Association for Theoretical Computer Science.

TR26-044 | Polynomial-Time Almost Log-Space Tree Evaluation by Catalytic Pebbling | Vahid Reza Asadi, Richard Cleve

from ECCC Papers

The Tree Evaluation Problem (TreeEval) is a computational problem originally proposed as a candidate to prove a separation between complexity classes P and L. Recently, this problem has gained significant attention after Cook and Mertz (STOC 2024) showed that TreeEval can be solved using $O(\log n\log\log n)$ bits of space. Their algorithm, despite getting very close to showing TreeEval $\in$ L, falls short, and in particular, it does not run in polynomial time. In this work, we present the first polynomial-time, almost logarithmic-space algorithm for TreeEval. For any $\varepsilon>0$, our algorithm solves TreeEval in time $\mathrm{poly}(n)$ while using $O(\log^{1 +\varepsilon}n)$ space. Furthermore, our algorithm has the additional property that it requires only $O(\log n)$ bits of free space, and the rest can be catalytic space. Our approach is to trade off some (catalytic) space usage for a reduction in time complexity.

Friday, April 03

News for March 2026

from Property Testing Review

After either really busy months or really quiet months, we’re now on a moderately busy month (which is the way it should be). A collection of 7 papers, now showing an increased interest in sublinear graph algorithms. Of course, we still have some distribution testing, and a paper on sparse functional representations.

A note on approximating the average degree of bounded arboricity graphs by Talya Eden, C. Seshadhri (arXiv). Let us begin with this simple note, since it will set the stage for the other papers. Consider the basic problem of estimating the average degree of a (simple) graph \(G = (V,E)\), under the standard adjacency list model. We assume that \(V = [n]\), and an algorithm can get the degree of a vertex and query for uniformly random neighbors. Classic results prove that there are \(\widetilde{O}(\varepsilon^{-2} n/\sqrt{m})\) query (and time) algorithms. Moreover, when the graph arboricity is \(\alpha\), one can get \(\widetilde{O}(\varepsilon^{-2} \alpha)\) query algorithms. These algorithms and results are unfortunately buried deep inside papers, or have extraneous \(\varepsilon\) and \(\log n\) factors. This note gives a simple, clean presentation of these results; the main result takes just two pages.

Almost-Uniform Edge Sampling: Leveraging Independent-Set and Local Graph Queries by Tomer Adar and Amit Levi (arXiv). Same setting as the previous blurb. A result of Eden-Rosenbaum (among others) showed that sampling a uniformly random edge can be done in the same query complexity as estimating the number of edges. This is quite interesting, since these problems (at least from a sublinear perspective) are not obviously similar, and the underlying algorithms are quite different. This paper studies this phenomenon in more general models. Very recent results (from Jan 2026!) by Adar-Hotam-Levi showed that average degree estimation can be done much faster with Independent Set (IS) queries together with standard adjacency list queries. Essentially, for sparse graphs, one can get \(O(n^{1/4})\) queries, as opposed to the usual \(\sqrt{n}\) bound. This paper shows that edge sampling can be done in the same query complexity as edge estimation (with nearly matching lower bounds). This "complexity equivalence" is also proven for the model with only IS queries.

Testing Properties of Edge Distributions by Yumou Fei (arXiv). Let us bring in some distributions to the (dense) graph setting. Consider an input graph \(G\) with \(n\) vertices, represented as an adjacency matrix. Suppose there is an unknown distribution \(\mu\) over \([n]^2\) that we can sample from. Our aim is to test if (say) the graph is bipartite, where distance is defined according to \(\mu\). If \(\mu\) is uniform over all edges, then this is standard property testing. With arbitrary \(\mu\), we are looking at bipartiteness testing in the distribution-free setting. The classic Goldreich-Goldwasser-Ron result shows that bipartiteness testing (in the standard, uniform setting) can be done in \(\mathrm{poly}(\varepsilon^{-1})\) queries. They also prove a \(\Theta(n)\) bound when the distribution \(\mu\) is uniform over an arbitrary subset of edges. This paper gives a bipartiteness tester for any \(\mu\), with the same asymptotic complexity. Moreover, there are results for testing triangle-freeness and square-freeness in this setting; the complexities are \(\Theta(n^{4/3})\) and \(\Theta(n^{9/8})\) respectively. (Note that in the standard setting, we get complexities independent of \(n\).)
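For reference, the classic uniform-case GGR tester mentioned here is simple to sketch: sample a small vertex set, query all pairs inside it, and accept iff the induced subgraph is bipartite. The constant and sample-size formula below are illustrative choices, not the paper's; the actual analysis takes \(\widetilde{\Theta}(1/\varepsilon^2)\) vertices.

```python
import math
import random
from collections import deque

def ggr_bipartite_test(adj, n, eps):
    """Sketch of the Goldreich-Goldwasser-Ron bipartiteness tester in the
    dense (adjacency-matrix) model: sample vertices, query all pairs among
    them, and accept iff the induced subgraph is 2-colorable."""
    s = min(n, max(1, round(100 * math.log(2 / eps) / eps ** 2)))
    sample = random.sample(range(n), s)
    color = {}
    for v in sample:                   # 2-color the induced subgraph by BFS
        if v in color:
            continue
        color[v] = 0
        queue = deque([v])
        while queue:
            u = queue.popleft()
            for x in sample:
                if adj[u][x]:
                    if x not in color:
                        color[x] = 1 - color[u]
                        queue.append(x)
                    elif color[x] == color[u]:
                        return False   # odd cycle in the sample: reject
    return True
```

Bipartite graphs are always accepted; graphs \(\varepsilon\)-far from bipartite contain an odd cycle within a random sample of this size with high probability.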

Optimal Prediction-Augmented Algorithms for Testing Independence of Distributions by Maryam Aliakbarpour, Alireza Azizi, Ria Stevens (arXiv). Now onto distribution testing. Consider the problem of determining if a distribution \(p\) with support \([n_1] \times [n_2] \times \cdots \times [n_d]\) is a product distribution. This is a fundamental problem that goes back to the birth of statistics and Pearson's \(\chi^2\) test. This paper studies the problem in the learning-augmented setting. Suppose we are also given a complete description of a predicted approximate distribution \(\widehat{p}\), and a tolerance \(\alpha\). If \(\|\widehat{p} - p\|_{TV} \geq \alpha\), then the algorithm can output "Inaccurate Information". So, if the predictor \(\widehat{p}\) is accurate, then the algorithm needs to give a correct output. The main result shows that one can improve on the optimal sample complexity by a factor of \(\alpha^{1/3}\), so the approximate prediction can help avoid the lower bounds for independence testing.

Improved Local Computation Algorithms for Greedy Set Cover via Retroactive Updates by Slobodan Mitrović, Srikkanth Ramachandran, Ronitt Rubinfeld, Mihir Singhal (arXiv). Consider the classic set cover problem: given a collection of sets \(\mathcal{S}\) over a universe \(\mathcal{E} = \bigcup_{S \in \mathcal{S}} S\), find the smallest subcollection that covers \(\mathcal{E}\). From the sublinear perspective, assume that the input is represented as a bipartite incidence graph between \(\mathcal{S}\) and \(\mathcal{E}\), with adjacency list access. The aim is to get a Local Computation Algorithm (LCA) that computes an \(O(\log \Delta)\)-approximate set cover (where \(\Delta\) is the size of the largest set in \(\mathcal{S}\)). An LCA is an algorithm that, given some \(S \in \mathcal{S}\), outputs (in sublinear time) whether this set is in the output subcollection. The main challenge is to bound the query complexity of this operation. The consistency requirement is that the collection of all sets for which the LCA answers “yes” forms an \(O(\log \Delta)\)-approximate set cover. There are distributed protocols that can output an \(O(\log \Delta)\)-approximate set cover in \(r = O(\log \Delta \cdot \log f)\) rounds (\(f\) is the largest number of sets that an element participates in). Parnas-Ron showed that one can get an LCA using \(\Delta^r\) queries. This paper gives an LCA that makes \(2^r\) queries, which avoids numerous roadblocks in previous work.
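The LCA question above is about simulating the greedy rule locally, per query. For reference, here is the global algorithm it emulates, in full — a minimal sketch of the textbook greedy method, not the paper's LCA:

```python
def greedy_set_cover(sets, universe):
    """Classic greedy set cover: repeatedly pick the set that covers
    the most still-uncovered elements.  This achieves an O(log n)
    approximation, where n is the universe size."""
    uncovered = set(universe)
    cover = []
    while uncovered:
        best = max(sets, key=lambda s: len(uncovered & s))
        cover.append(best)
        uncovered -= best
    return cover

sets = [{1, 2, 3}, {3, 4}, {4, 5}, {1, 5}]
print(greedy_set_cover(sets, {1, 2, 3, 4, 5}))  # [{1, 2, 3}, {4, 5}]
```

An LCA must answer "is this set in the cover?" for a single set without running the whole loop, which is where the query-complexity challenge comes from.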

Approximate Butterfly Counting in Sublinear Time by Chi Luo, Jiaxin Song, Yuhao Zhang, Kai Wang, Zhixing He, Kuan Yang (arXiv). This is an interesting applied algorithms paper that, at its core, uses a sublinear algorithm. A significant amount of graph data is really a bipartite graph (like actor-movie, author-paper, etc.). A butterfly is a \(K_{2,2}\), which is also a 4-cycle. Counting butterflies is a common data mining problem, since a high density of \(K_{2,2}\)’s is usually indicative of some structure or clustering in the bipartite graph. There have been a number of practical algorithms to compute (or estimate) butterfly counts in bipartite graphs. This paper gives a theoretical sublinear algorithm that runs in \(\widetilde{O}(w\sqrt{m}/b + m/b^{1/4})\) time, where \(w\) is the wedge (2-path) count and \(b\) is the butterfly count. This algorithm is implemented and compared with previous results. Interestingly, the comparison is done on queries and raw running time, and even the latter shows orders of magnitude improvement. This kind of work bridges the theory/practice divide, and shows the power of pure theoretical ideas (like heavy-light vertex partitioning, various edge sampling strategies) for practical algorithmics. Hopefully this inspires even more work along these lines!
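The combinatorial identity underlying wedge-based estimators is easy to check against a tiny exact counter (my own illustrative sketch, not the paper's algorithm): every pair of right vertices with \(c\) common left neighbors contributes \(\binom{c}{2}\) butterflies.

```python
from collections import Counter
from itertools import combinations

def count_butterflies(left_adj):
    """Exactly count butterflies (copies of K_{2,2}) in a bipartite graph.

    left_adj maps each left vertex to the set of its right neighbors.
    Each wedge (2-path) centered at a left vertex is recorded by its
    pair of right endpoints; a pair seen c times yields C(c, 2) butterflies.
    """
    wedge_counts = Counter()
    for nbrs in left_adj.values():
        for pair in combinations(sorted(nbrs), 2):
            wedge_counts[pair] += 1
    return sum(c * (c - 1) // 2 for c in wedge_counts.values())

# K_{3,3} contains C(3,2) * C(3,2) = 9 butterflies.
print(count_butterflies({0: {0, 1, 2}, 1: {0, 1, 2}, 2: {0, 1, 2}}))  # 9
```

This brute force enumerates all \(w\) wedges; the point of the paper is to estimate \(b\) by sampling instead.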

Testing Sparse Functions over the Reals by Vipul Arora, Arnab Bhattacharyya, Philips George John, Sayantan Sen (arXiv). The problems of linearity/low degree testing are well-known to our readers. There is a recent line of work generalizing such results to the real-valued setting. Consider \(f: \mathbb{R}^n \to \mathbb{R}\). Distance is measured in \(l_1\) over the Gaussian measure. This paper studies a number of properties parameterized by sparsity. Consider linear functions, where \(f(x)\) is linear iff \(f(x) = \sum_i c_i x_i\), for some constants \(c_i\). A function is \(k\)-linear if only \(k\) of the coefficients \(c_i\) are non-zero. So the function has a small representation, independent of the dimension \(n\). This paper gives a tester that makes \(\widetilde{O}(k\log k + 1/\varepsilon)\) queries. Similar results are given for \(k\)-sparse low-degree polynomials and the general problem of \(k\)-juntas. There is a significant subtlety in querying a real-valued function. One can imagine that the value \(f(x)\) is only obtained with some precision \(\eta\). There are results for \(k\)-linearity in the standard Boolean setting, but the precision and \(l_1\)-distance introduce challenges in generalizing to the real-valued setting.

By Seshadhri

Arbitrary Geometry

from Ben Recht

Adversarial regret as a proof technique in learning, optimization, and games

This is the third live blog of Lecture 7 of my graduate seminar “Feedback, Learning, and Adaptation.” A table of contents is here.

I closed yesterday’s blog with a cliffhanger, promising to give a few examples of where I think adversarial regret is a useful concept. On its own, I’m not sure that it is a useful concept. More on that next week! But today, I’ll show how you can use adversarial regret to bootstrap interesting arguments linking machine learning, game theory, and stochastic optimization.

Once again, I list the rules of the game. We have a two-player game with repeated interactions. In every round \(t\),

  1. Information \(x_t\) is revealed to both players.

  2. Player One takes action \(u_t\).

  3. Player Two takes action \(d_t\).

  4. A score \(r_t\) is assigned based on the triple \((x_t, u_t, d_t)\).

Player One is the “decision maker,” and their action has to be computable from a few lines of code. Their goal is to accumulate as high a score as possible, summed across all rounds. Player Two wants the sum of all of the \(r_t\) to be as low as possible.

The adversarial regret compares the score of Player One’s strategy to that of a player who sees the entire sequence of disturbances but must play the same action at every time step. It’s a weird setup where we are comparing a player who can change their strategy arbitrarily to an omniscient player forced to play the same move every time. While these two notions don’t seem worth comparing at first blush, there are a few cases in learning theory and game theory where the comparison is mathematically powerful. It turns out that computing these regret bounds is often quite simple, and they follow from elementary derivations. While these bounds themselves might not be useful, they then imply results you actually care about. Let me give my three favorite examples.

Online learning and PAC Learning

Online learning is the case argmin readers will have already encountered if they followed my machine learning course blogging. I like to teach online learning because adversarial regret bounds imply the standard model of probabilistic machine learning. Adversarial regret highlights how most of the “generalization bounds” we derive in machine learning are artifacts of geometry rather than mystical manifestations of mechanical epiphany.

In the online learning model, the goal is to predict the disturbance from the information. The actions are predictions. At each round, your score is high if your prediction is correct and low if the prediction is incorrect.

Let’s change the notation to match the standard verbiage of machine learning. The prediction is a function f, and the “disturbance” is a “label,” denoted y. Instead of a high score, we want low loss. In the online learning setting, you get to change your prediction function at every time step and compare your losses to a single model that tries to fit the labels after you see them. In equations, this is
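The display equation did not survive here; in standard notation (my reconstruction of the usual definition, with a generic loss \(\ell\) and comparator class \(\mathcal{F}\)), the quantity being bounded is the adversarial regret

\[ \mathrm{Regret}_T \;=\; \sum_{t=1}^{T} \ell\big(f_t(x_t), y_t\big) \;-\; \min_{f \in \mathcal{F}} \sum_{t=1}^{T} \ell\big(f(x_t), y_t\big), \]

where \(f_t\) is the prediction function played in round \(t\).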

You can now do math and show that this expression is bounded by a sublinear function of the number of rounds. This post works out the details. Now, the resulting deterministic bound is not necessarily interesting in and of itself, but the magic happens when you declare the \(x\)s and \(y\)s to be generated by a stochastic process. If, for example, you assume the information-label pairs are independently and identically sampled from some data-generating process, then after making a few assumptions about convexity, the regret bound becomes a generalization bound:
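With a generic loss \(\ell\), comparator class \(\mathcal{F}\), and writing \(\mathrm{Regret}_T\) for the adversarial regret accumulated over \(T\) rounds, the standard online-to-batch statement (my hedged reconstruction, not the post's exact equation) reads

\[ \mathbb{E}\big[\ell(F_T(x), y)\big] \;-\; \min_{f \in \mathcal{F}} \mathbb{E}\big[\ell(f(x), y)\big] \;\le\; \frac{\mathbb{E}[\mathrm{Regret}_T]}{T}, \]

where \((x, y)\) is a fresh sample from the data-generating distribution.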

Here, \(F_T\) denotes the random function that your model returns after seeing \(T\) examples. This bound compares the predictive accuracy of your algorithm on a new sample to that of the best function computable given the data-generating distribution. If you have sublinear regret, then this quantity tends to zero as \(T\) goes to infinity. This is called a generalization bound, or, if you use probability instead of expected values, a PAC Learning bound.

The technique of deriving a deterministic regret bound and transforming it into a probabilistic generalization bound by taking expected values is called “online-to-batch conversion.” It is one of the favorite tricks of learning theorists.

Stochastic Optimization

Similar techniques can be applied more generally to stochastic optimization. A clever analysis of the stochastic gradient method takes a similar approach: you can prove that gradient descent has low regret even if Player Two is handing you a different convex function at every time step. If you take expectations of the resulting regret bound and apply Jensen’s inequality, you derive a bound on the sample average approximation method for stochastic programming. Though substantially more general, the proof is almost identical to the one in online learning.
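As a toy instance of this argument (my own sketch, not from the lecture): online gradient descent on the sequence of convex losses \(f_t(x) = (x - z_t)^2\) with step size \(1/(2t)\) turns out to compute a running mean of the \(z_t\), so both its final iterate and its averaged iterate estimate the true minimizer \(\mathbb{E}[z]\) of the stochastic objective:

```python
import random

random.seed(0)
T = 2000
x = 0.0          # online iterate
running = 0.0    # average of the iterates (the online-to-batch output)
for t in range(1, T + 1):
    z = random.gauss(1.0, 1.0)   # fresh stochastic loss f_t(x) = (x - z)^2
    grad = 2.0 * (x - z)
    x -= grad / (2.0 * t)        # step size 1/(2t); x becomes the running mean of the z's
    running += (x - running) / t

print(round(running, 2))  # close to the true minimizer E[z] = 1.0
```

Taking expectations of the deterministic regret bound for this procedure, plus Jensen's inequality, is exactly the reduction described above.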

Repeated Games

Still closely related to but slightly more challenging than online convex optimization are repeated zero-sum games. In this setup, each round of the game is itself a zero-sum game. The players battle each other for multiple rounds, and Player One’s goal is to refine their strategy so they eventually achieve an infinite ELO score. Here, a classic result proves that when both players use algorithms with low adversarial regret, they converge to a Nash equilibrium. You assume that both Player One and Player Two are using algorithms that yield low regret against an arbitrary adversary. The baseline is a player forced to use the same strategy every round. If Player One and Two’s strategic improvements have sublinear regret, their strategies eventually converge to an equilibrium. This result is the backbone of modern poker bots, which use algorithms like counterfactual regret minimization. Whether or not you think solving poker is a major contribution to humanity and human knowledge is up to you.
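A minimal illustration of this convergence (my own sketch, using plain multiplicative weights rather than counterfactual regret minimization): two multiplicative-weights players repeatedly play rock-paper-scissors from asymmetric starting weights, and their time-averaged strategies approach the unique Nash equilibrium \((1/3, 1/3, 1/3)\).

```python
import math

def mwu_selfplay(payoff, T=5000, eta=0.05):
    """Two multiplicative-weights players in a zero-sum game.
    payoff[i][j] is the row player's payoff; returns each player's
    time-averaged mixed strategy."""
    n = len(payoff)
    # asymmetric initial weights, so play starts away from equilibrium
    w1 = [1.0 + i for i in range(n)]
    w2 = [float(n - i) for i in range(n)]
    avg1, avg2 = [0.0] * n, [0.0] * n
    for _ in range(T):
        p1 = [w / sum(w1) for w in w1]
        p2 = [w / sum(w2) for w in w2]
        for i in range(n):
            avg1[i] += p1[i] / T
            avg2[i] += p2[i] / T
        # expected payoff of each pure action against the opponent's mix
        u1 = [sum(payoff[i][j] * p2[j] for j in range(n)) for i in range(n)]
        u2 = [-sum(payoff[i][j] * p1[i] for i in range(n)) for j in range(n)]
        for i in range(n):
            w1[i] *= math.exp(eta * u1[i])
            w2[i] *= math.exp(eta * u2[i])
    return avg1, avg2

rps = [[0, -1, 1], [1, 0, -1], [-1, 1, 0]]  # rock, paper, scissors
avg1, avg2 = mwu_selfplay(rps)
print([round(p, 2) for p in avg1])  # near [0.33, 0.33, 0.33]
```

The per-round strategies themselves cycle and need not converge; it is the time averages that the low-regret argument controls.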

Subscribe now

By Ben Recht

Estonian-Latvian Computer Science Theory Days 2026

from Luca Aceto

The Estonian-Latvian Computer Science Theory Days 2026 will be at the University of Tartu, Tartu, Estonia, in the period 24-26 April 2026. (Hat tip: My colleague Tarmo Uustalu at the Department of Computer Science at Reykjavik University.) Quoting from the event's website,


"The main goal of the Theory Days is to let the theoretical computer scientists of our two countries to get acquainted with the work of each other. However, people from other countries are welcome to participate as well. The main audience is intended to be the graduate students in the roles of both listeners and presenters."

The first joint Estonian-Latvian Theory Days were held in 2010 and this year's edition of the event will feature an invited talk by Robert Tarjan. Last year, Robert Tarjan gave a public lecture at the University of Tartu entitled “My Life with Data Structures”. See www.youtube.com/watch?v=pFHIueXFHWg for a recording of that talk. 

Note that the deadlines for registration and for proposing a talk are today.

Kudos to our colleagues in Estonia and Latvia for organising the theory days!

By Luca Aceto


Polymath Plus AI

from Gil Kalai


This post was written together with Nissan Hajaj and Ido Kaminer. 

AI and Math has become a hot topic (and a source of some worries) among and beyond the mathematical community.

Nissan Hajaj from Google Research proposed to run an “AI polymath project,” based on the same concept as a polymath project but with the participation of AI agents. Together with Ido Kaminer, we decided to run a pilot experiment with a couple of projects.

Nissan’s vision for the AI-polymath project

By empowering collaborative research initiatives with AI tools, we can create a powerful platform that applies the scientific potential of modern AI to the most advanced frontiers of active research. Projects like Polymath demonstrate how the mathematical community has already embraced collective efforts; such models can now be enhanced by the evolving capabilities of AI to accelerate research progress while offering developers a space to refine these tools through real-world academic interaction.

We envision an open, hybrid research ecosystem where human experts and AI agents collaborate seamlessly, utilizing shared tools and resources as needed. In this environment, both human participants and AI entities can engage in ongoing dialogues, offering insights, solutions, and critiques. Our proposal focuses on establishing a platform designed to initiate, oversee, and complement these specialized mathematical research communities, featuring:

  1. Dedicated AI utilities for community administration, including workflows, literature analysis, documentation, review, verification, and planning.

  2. Frameworks designed for registering, deploying, and coordinating AI tools made available to the community.

  3. A transparent interface that invites contributions from both human researchers and automated agents.

Our objective is to select a diverse array of mathematical problems spanning various subfields, each presenting unique hurdles for both human intellect and AI-driven methodologies.

What we plan to do

We would like to have a preliminary (and rather partial) implementation of Nissan’s idea: to start with a description of a project (polymath X) and to proceed with AI contributions in the comment section.

The (default) prompt:

Polymath X with the participation of AI agents is dealing with the following mathematical problem:
—Short Formulation—
An introduction to the project and the discussion so far can be found in this LINK.

[Note: The link itself will change according to the project. Here are a few links for projects that are potentially relevant: 1) This MO problem; 2) A combinatorial abstraction of the “polynomial Hirsch conjecture”; 3) The first proposal in this post; 4) Some of the proposals from this MO question.]

Please make a comment, preferably limited to 3-4 paragraphs. (If you have more to say, contribute several comments.)
Your comment may include (but is not limited to) one or more of the following:
  1. A new idea for the project,
  2. Some further thoughts on current ideas,
  3. A comment on some earlier comments,
  4. Proofs, heuristic arguments, examples, counter-examples, and conjectures,
  5. Computer programs and computer experimentation.

The comment section of the present post can serve for “meta discussion” for this project. 

In addition to a general discussion of the idea, we can think about specific projects to run.

Potential stages and specific tools

Nissan sees several possible stages for the project. For example:

  1. Manually setting up the Polymath site/post and inviting participants.
    Every AI participant will need to find a way to interact with the site (through comments).
  2. Manually setting up the Polymath site/post and inviting participants—both humans (through comments) and agentic participants (through a published API).  
  3. A semi-automatically managed site, where content generation, reviews, and moderation can be facilitated by AI.
  4. The same as (3), but with additional special-purpose AI tools and resources dedicated to the effort.

In the long term, Nissan envisions such an effort evolving into a collection of encyclopedic entries (similar to Wikipedia) that are maintained and advanced by the hybrid community, with dedicated computational resources to explore solutions autonomously.

(Our blog efforts will be limited to items 1 and 2. API stands for Application Programming Interface.)

A few words about Ido

In addition to his research in experimental physics, Ido Kaminer and his collaborators developed in 2021 the Ramanujan Machine, which is “a novel way to do mathematics by harnessing your computer power to make new discoveries. The Ramanujan Machine already discovered dozens of new conjectures.” We mentioned the Ramanujan Machine in this 2021 post. His group expanded this idea to build a library of connections among mathematical constants (ICLR 2025) and to unify their formulas (NeurIPS 2025). More recently, the group’s research has expanded to computations in physics. See Ido’s videotaped lecture From π to QFT: Symbolic Discovery at Scale.

Other Polymath news

On some other polymath news, Tim Gowers recently proposed a polymath project about the word problem in the Artin-Tits group.

AI+Math

On the topic of AI+Math (and math of CS), let me mention that I also had very pleasant and thought-provoking discussions with Rafi Ostrovsky and Yuval Rabani.

I also had some illuminating correspondence with Dr. Z. (Separately from his pioneering and provocative role in Math+AI, see Doron’s surprise proposal from yesterday.)

By Gil Kalai

King Chasing Problem in Chinese Chess is NP-hard

from arXiv: Computational Complexity

Authors: Chao Li, Zhujun Zhang, Chao Yang

We prove that the king chasing problem in Chinese Chess is NP-hard when generalized to $n\times n$ boards. `King chasing' is a frequently-used strategy in Chinese Chess, in which a player continuously checks the opponent in every move until finally checkmating the opponent's king. The problem is to determine which player has a winning strategy in generalized Chinese Chess, under the constraints of king chasing. It is clearly a sub-problem of the generalized Chinese Chess problem. We prove that the king chasing problem in Chinese Chess is NP-hard by reducing from the classic NP-complete problem 3-SAT.


DQC1-completeness of normalized trace estimation for functions of log-local Hamiltonians

from arXiv: Computational Complexity

Authors: Zhengfeng Ji, Tongyang Li, Changpeng Shao, Xinzhao Wang, Yuxin Zhang

We study the computational complexity of estimating the normalized trace $2^{-n}Tr[f(A)]$ for a log-local Hamiltonian $A$ acting on $n$ qubits. This problem arises naturally in the DQC1 model, yet its complexity is only understood for a limited class of functions $f(x)$. We show that if $f(x)$ is a continuous function with approximate degree $Ω({\rm poly}(n))$, then estimating $2^{-n}Tr[f(A)]$ up to constant additive error is DQC1-complete, under a technical condition on the polynomial approximation error of $f(x)$. This condition holds for a broad class of functions, including exponentials, trigonometric functions, logarithms, and inverse-type functions. We further prove that when $A$ is sparse, the classical query complexity of this problem is exponential in the approximate degree, assuming a conjectured lower bound for a trace variant of the $k$-Forrelation problem in the DQC1 query model. Together, these results identify the approximate degree as the key parameter governing the complexity of normalized trace estimation: it characterizes both the quantum complexity (via efficient DQC1 algorithms) and, conditionally, the classical hardness, yielding an exponential quantum-classical separation. Our proof develops a unified framework that cleanly combines circuit-to-Hamiltonian constructions, periodic Jacobi operators, and tools from polynomial approximation theory, including the Chebyshev equioscillation theorem.


Deterministic Hardness of Approximation For SVP in all Finite $\ell_p$ Norms

from arXiv: Computational Complexity

Authors: Isaac M Hair, Amit Sahai

We show that, assuming NP $\not\subseteq$ $\cap_{δ> 0}$DTIME$\left(\exp{n^δ}\right)$, the shortest vector problem for lattices of rank $n$ in any finite $\ell_p$ norm is hard to approximate within a factor of $2^{(\log n)^{1 - o(1)}}$, via a deterministic reduction. Previously, for the Euclidean case $p=2$, even hardness of the exact shortest vector problem was not known under a deterministic reduction.


Quantum polymorphism characterisation of commutativity gadgets in all quantum models

from arXiv: Computational Complexity

Authors: Eric Culf, Josse van Dobben de Bruyn, Peter Zeman

Commutativity gadgets provide a technique for lifting classical reductions between constraint satisfaction problems to quantum-sound reductions between the corresponding nonlocal games. We develop a general framework for commutativity gadgets in the setting of quantum homomorphisms between finite relational structures. Building on the notion of quantum homomorphism spaces, we introduce a uniform notion of commutativity gadget capturing the finite-dimensional quantum, quantum approximate, and commuting-operator models. In the robust setting, we use the weighted-algebra formalism for approximate quantum homomorphisms to capture corresponding notions of robust commutativity gadgets. Our main results characterize both non-robust and robust commutativity gadgets purely in terms of quantum polymorphism spaces: in any model, existence of a commutativity gadget is equivalent to the collapse of the corresponding quantum polymorphisms to classical ones at arity $|A|^2$, and robust gadgets are characterized by stable commutativity of the appropriate weighted polymorphism algebra. We use this characterisation to show relations between the classes of commutativity gadget, notably that existence of a robust commutativity gadget is equivalent to the existence of a corresponding non-robust one. Finally, we prove that quantum polymorphisms of complete graphs $K_n$ have a very special structure, wherein the noncommutative behaviour only comes from the quantum permutation group $S_n^+$. Combining this with techniques from combinatorial group theory, we construct separations between commutativity-gadget classes: we exhibit a relational structure admitting a finite-dimensional commutativity gadget but no quantum approximate gadget, and, conditional on the existence of a non-hyperlinear group, a structure admitting a quantum approximate commutativity gadget but no commuting-operator gadget.


Near-Optimal Space Lower Bounds for Streaming CSPs

from arXiv: Computational Complexity

Authors: Yumou Fei, Dor Minzer, Shuo Wang

In a streaming constraint satisfaction problem (streaming CSP), a $p$-pass algorithm receives the constraints of an instance sequentially, making $p$ passes over the input in a fixed order, with the goal of approximating the maximum fraction of satisfiable constraints. We show near optimal space lower bounds for streaming CSPs, improving upon prior works. (1) Fei, Minzer and Wang (\textit{STOC 2026}) showed that for any CSP, the basic linear program defines a threshold $α_{\mathrm{LP}}\in [0,1]$ such that, for any $\varepsilon > 0$, an $(α_{\mathrm{LP}} - \varepsilon)$-approximation can be achieved using constant passes and polylogarithmic space, whereas achieving $(α_{\mathrm{LP}} + \varepsilon)$-approximation requires $Ω(n^{1/3}/p)$ space. We improve this lower bound to $Ω(\sqrt{n}/p)$, which is nearly tight for a gap version of the problem. (2) For $p=o(\log n)$, we further strengthen the lower bound to $Ω(n\cdot2^{-O_{\varepsilon}(p)})$. Combined with existing algorithmic results, this shows that $α_{\mathrm{LP}}$ is not only the limit of multi-pass polylogarithmic-space algorithms, but also the limit of single-pass sublinear-space algorithms on bounded-degree instances. (3) For certain CSPs, we show that there exists $α< 1$ such that achieving an $α$-approximation requires $Ω(n/p)$ space. Our proofs are Fourier analytic, building on the techniques of Fei, Minzer and Wang (\textit{STOC 2026}) and the Fourier-$\ell_1$-based lower bound method of Kapralov and Krachun (\textit{STOC 2019}).


The edge of the asymptotic spectrum of tensors

from arXiv: Computational Complexity

Authors: Josh Alman, Baitian Li, Kevin Pratt

Strassen founded the theory of the asymptotic spectrum of tensors to study the complexity of matrix multiplication. A central challenge in this theory is to explicitly construct new spectral points. In Crelle 1991, Strassen proposed the upper support functionals $ζ^θ$ as candidate spectral points, where $θ$ ranges over a triangle $Θ$. Recent progress, involving tools and ideas from quantum information theory (Christandl-Vrana-Zuiddam, STOC 2018, JAMS 2021) and convex optimization (Hirai, 2025), culminated in the proof that the upper support functionals are indeed spectral points over the complex numbers (Sakabe-Doğan-Walter, 2026). In this paper, we give an even clearer picture of the situation for support functionals when $θ$ lies along the edges of the triangle. We show that not only are these functionals spectral points, but that they are uniquely determined as spectral points by their behavior on matrix multiplication tensors. As our methods are algebraic, as a corollary this establishes for the first time the existence of nontrivial spectral points over arbitrary fields. As part of our argument, we show a close connection between the edge support functionals and Harder-Narasimhan filtrations from quiver representation theory. We thus show, using recent work in algorithmic invariant theory, that these support functionals can be computed in deterministic polynomial time. Other ingredients of our proof include a new criterion for abstractly characterizing asymptotic tensor ranks by spectral points, and a characterization of the edge support functionals in terms of matrix multiplication capacity. As another application of these tools, we prove the existence of spectral points for higher-mode tensors beyond those currently known.

Topology-First B-Rep Meshing

from arXiv: Computational Geometry

Authors: YunFan Zhou, Daniel Zint, Nafiseh Izadyar, Michael Tao, Daniele Panozzo, Teseo Schneider

Parametric boundary representation models (B-Reps) are the de facto standard in CAD, graphics, and robotics, yet converting them into valid meshes remains fragile. The difficulty originates from the unavoidable approximation of high-order surface and curve intersections to low-order primitives: the resulting geometric realization often fails to respect the exact topology encoded in the B-Rep, producing meshes with incorrect or missing adjacencies. Existing meshing pipelines address these inconsistencies through heuristic feature-merging and repair strategies that offer no topological guarantees and frequently fail on complex models. We propose a fundamentally different approach: the B-Rep topology is treated as an invariant of the meshing process. Our algorithm enforces the exact B-Rep topology while allowing a single user-defined tolerance to control the deviation of the mesh from the underlying parametric surfaces. Consequently, for any admissible tolerance, the output mesh is topologically correct; only its geometric fidelity degrades as the tolerance increases. This decoupling eliminates the need for post-hoc repairs and yields robust meshes even when the underlying geometry is inconsistent or highly approximated. We evaluate our method on thousands of real-world CAD models from the ABC and Fusion 360 repositories, including instances that fail with standard meshing tools. The results demonstrate that topological guarantees at the algorithmic level enable reliable mesh generation suitable for downstream applications.

SHARC: Reference point driven Spherical Harmonic Representation for Complex Shapes

from arXiv: Computational Geometry

Authors: Panagiotis Sapoutzoglou, George Terzakis, Maria Pateraki

We propose SHARC, a novel framework that synthesizes arbitrary, genus-agnostic shapes by means of a collection of Spherical Harmonic (SH) representations of distance fields. These distance fields are anchored at optimally placed reference points in the interior volume of the surface in a way that maximizes learning of the finer details of the surface. To achieve this, we employ a cost function that jointly maximizes sparsity and centrality in terms of positioning, as well as visibility of the surface from their location. For each selected reference point, we sample the visible distance field to the surface geometry via ray-casting and compute the SH coefficients using the Fast Spherical Harmonic Transform (FSHT). To enhance geometric fidelity, we apply a configurable low-pass filter to the coefficients and refine the output using a local consistency constraint based on proximity. Evaluation of SHARC against state-of-the-art methods demonstrates that the proposed method outperforms existing approaches in both reconstruction accuracy and time efficiency without sacrificing model parsimony. The source code is available at github.com/POSE-Lab/SHARC.

The Computational Complexity of Avoiding Strict Saddle Points in Constrained Optimization

from arXiv: Data Structures and Algorithms

Authors: Andreas Kontogiannis, Ioannis Panageas, Vasilis Pollatos

While first-order stationary points (FOSPs) are the traditional targets of non-convex optimization, they often correspond to undesirable strict saddle points. To circumvent this, attention has shifted towards second-order stationary points (SOSPs). In unconstrained settings, finding approximate SOSPs is PLS-complete (Kontogiannis et al.), matching the complexity of finding unconstrained FOSPs (Hollender and Zampetakis). However, the complexity of finding SOSPs in constrained settings remained notoriously unclear and was highlighted as an important open question by both aforementioned works. Under one strict definition, even verifying whether a point is an approximate SOSP is NP-hard (Murty and Kabadi). Under another widely adopted, relaxed definition where non-negative curvature is required only along the null space of the active constraints, the problem lies in TFNP, and algorithms with O(poly(1/epsilon)) running times have been proposed (Lu et al.). In this work, we settle the complexity of constrained SOSP by proving that computing an epsilon-approximate SOSP under the tractable definition is PLS-complete. We demonstrate that our result holds even in the 2D unit square [0,1]^2, and remarkably, even when stationary points are isolated at a distance of Omega(1) from the domain's boundary. Our result establishes a fundamental barrier: unless PLS is a subset of PPAD (implying PLS = CLS), no deterministic, iterative algorithm with an efficient, continuous update rule can exist for finding approximate SOSPs. This contrasts with the constrained first-order counterpart, for which Fearnley et al. showed that finding an approximate KKT point is CLS-complete. Finally, our result yields the first problem defined in a compact domain to be shown PLS-complete beyond the canonical Real-LocalOpt (Daskalakis and Papadimitriou).

Bipartite Exact Matching in P

from arXiv: Data Structures and Algorithms

Authors: Yuefeng Du

The Exact Matching problem asks whether a bipartite graph with edges colored red and blue admits a perfect matching with exactly t red edges. Introduced by Papadimitriou and Yannakakis in 1982, the problem has resisted deterministic polynomial-time algorithms for over four decades, despite admitting a randomized solution via the Schwartz-Zippel lemma since 1987. We prove the Affine-Slice Nonvanishing Conjecture (ASNC) for all bipartite braces and give a deterministic O(n^6) algorithm for Exact Matching on all bipartite graphs. The algorithm follows via the tight-cut decomposition, which reduces the decision problem to brace blocks. The proof proceeds by structural induction on McCuaig's brace decomposition. We establish the McCuaig exceptional families, the replacement determinant algebra, and the narrow-extension cases (KA, J3 to D1). For the superfluous-edge step, we introduce two closure tools: a matching-induced Two-extra Hall theorem that resolves the rank-(m-2) branch via projective-collapse contradiction, and a distinguished-state q-circuit lemma that eliminates the rank-(m-1) branch entirely by showing that any minimal dependent set containing the superfluous state forces rank m-2. The entire proof has been formally verified in the Lean 4 proof assistant.

Sublinear-query relative-error testing of halfspaces

from arXiv: Data Structures and Algorithms

Authors: Xi Chen, Anindya De, Yizhi Huang, Shivam Nadimpalli, Rocco A. Servedio, Tianqi Yang

The relative-error property testing model was introduced in [CDHLNSY24] to facilitate the study of property testing for "sparse" Boolean-valued functions, i.e. ones for which only a small fraction of all input assignments satisfy the function. In this framework, the distance from the unknown target function $f$ that is being tested to a function $g$ is defined as $\mathrm{Vol}(f \mathop{\triangle} g)/\mathrm{Vol}(f)$, where the numerator is the fraction of inputs on which $f$ and $g$ disagree and the denominator is the fraction of inputs that satisfy $f$. Recent work [CDHNSY26] has shown that over the Boolean domain $\{0,1\}^n$, any relative-error testing algorithm for the fundamental class of halfspaces (i.e. linear threshold functions) must make $Ω(\log n)$ oracle calls. In this paper we complement the [CDHNSY26] lower bound by showing that halfspaces can be relative-error tested over $\mathbb{R}^n$ under the standard $N(0,I_n)$ Gaussian distribution using a sublinear number of oracle calls -- in particular, substantially fewer than would be required for learning. Our results use a wide range of tools including Hermite analysis, Gaussian isoperimetric inequalities, and geometric results on noise sensitivity and surface area.

Subquadratic Counting via Perfect Marginal Sampling

from arXiv: Data Structures and Algorithms

Authors: Xiaoyu Chen, Zongchen Chen, Kuikui Liu, Xinyuan Zhang

We study the computational complexity of approximately computing the partition function of a spin system. Techniques based on standard counting-to-sampling reductions yield $\tilde{O}(n^2)$-time algorithms, where $n$ is the size of the input graph. We present new counting algorithms that break the quadratic-time barrier in a wide range of settings. For example, for the hardcore model of $λ$-weighted independent sets in graphs of maximum degree $Δ$, we obtain a $\tilde{O}(n^{2-δ})$-time approximate counting algorithm, for some constant $δ> 0$, when the fugacity $λ< \frac{1}{Δ-1}$, improving over the previous regime of $λ= o(Δ^{-3/2})$ by Anand, Feng, Freifeld, Guo, and Wang (2025). Our results apply broadly to many other spin systems, such as the Ising model, hypergraph independent sets, and vertex colorings. Interestingly, our work reveals a deep connection between $\textit{subquadratic}$ counting and $\textit{perfect}$ marginal sampling. For two-spin systems such as the hardcore and Ising models, we show that the existence of perfect marginal samplers directly yields subquadratic counting algorithms in a $\textit{black-box}$ fashion. For general spin systems, we show that almost all existing perfect marginal samplers can be adapted to produce a sufficiently low-variance marginal estimator in sublinear time, leading to subquadratic counting algorithms.

Probabilistic AVL Trees (p-AVL): Relaxing Deterministic Balancing

from arXiv: Data Structures and Algorithms

Authors: Hayagriv Desikan

This paper studies the empirical behaviour of the p-AVL tree, a probabilistic variant of the AVL tree in which each imbalance is repaired with probability $p$. This gives an exact continuous interpolation from $p = 0$, which recovers the BST endpoint, to $p = 1$, which recovers the standard AVL tree. Across random-order insertion experiments, we track rotations per node, total imbalance events, average depth, average height, and a global imbalance statistic $σ$. The main empirical result is that even a small nonzero $p$ already causes a strong structural change. The goal here is empirical rather than fully theoretical: to document the behaviour of the p-AVL family clearly and identify the main patterns.
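The interpolation described in the abstract is easy to sketch: take a standard AVL insertion, but flip a biased coin before performing the rebalancing rotation. The sketch below is a minimal illustration of that idea, not the paper's implementation; the node layout and rotation cases are just the textbook AVL ones.

```python
import random

class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None
        self.height = 1

def height(n):
    return n.height if n else 0

def update(n):
    n.height = 1 + max(height(n.left), height(n.right))

def rotate_right(y):
    x = y.left
    y.left, x.right = x.right, y
    update(y); update(x)
    return x

def rotate_left(x):
    y = x.right
    x.right, y.left = y.left, x
    update(x); update(y)
    return y

def insert(node, key, p, rng=random):
    # standard BST descent
    if node is None:
        return Node(key)
    if key < node.key:
        node.left = insert(node.left, key, p, rng)
    else:
        node.right = insert(node.right, key, p, rng)
    update(node)
    balance = height(node.left) - height(node.right)
    # the p-AVL twist: repair an imbalance only with probability p
    if abs(balance) > 1 and rng.random() < p:
        if balance > 1:
            if height(node.left.left) >= height(node.left.right):
                node = rotate_right(node)           # LL case
            else:
                node.left = rotate_left(node.left)  # LR case
                node = rotate_right(node)
        else:
            if height(node.right.right) >= height(node.right.left):
                node = rotate_left(node)              # RR case
            else:
                node.right = rotate_right(node.right) # RL case
                node = rotate_left(node)
    return node
```

With $p = 0$ the coin never comes up, so sorted insertions degenerate into a path; with $p = 1$ every imbalance is repaired and the usual AVL height bound applies. Rotations preserve the BST ordering in either case, so only the shape statistics vary with $p$.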

BBC: Improving Large-k Approximate Nearest Neighbor Search with a Bucket-based Result Collector

from arXiv: Data Structures and Algorithms

Authors: Ziqi Yin, Gao Cong, Kai Zeng, Jinwei Zhu, Bin Cui

Although Approximate Nearest Neighbor (ANN) search has been extensively studied, large-k ANN queries that aim to retrieve a large number of nearest neighbors remain underexplored, despite their numerous real-world applications. Existing ANN methods face significant performance degradation for such queries. In this work, we first investigate the reasons for the performance degradation of quantization-based ANN indexes: (1) the inefficiency of existing top-k collectors, which incurs significant overhead in candidate maintenance, and (2) the reduced pruning effectiveness of quantization methods, which leads to a costly re-ranking process. To address this, we propose a novel bucket-based result collector (BBC) to enhance the efficiency of existing quantization-based ANN indexes for large-k ANN queries. BBC introduces two key components: (1) a bucket-based result buffer that organizes candidates into buckets by their distances to the query. This design reduces ranking costs and improves cache efficiency, enabling high performance maintenance of a candidate superset and a lightweight final selection of top-k results. (2) two re-ranking algorithms tailored for different types of quantization methods, which accelerate their re-ranking process by reducing either the number of candidate objects to be re-ranked or cache misses. Extensive experiments on real-world datasets demonstrate that BBC accelerates existing quantization-based ANN methods by up to 3.8x at recall@k = 0.95 for large-k ANN queries.
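The bucket-based buffer idea is concrete enough to sketch. The toy collector below is a hypothetical simplification of the paper's BBC, not its actual design: it assumes a known distance upper bound `d_max` and a fixed bucket count, bins candidates by distance in O(1), prunes anything beyond a bound bucket once enough closer candidates exist, and sorts only the short prefix needed for the final top-k.

```python
class BucketCollector:
    """Toy bucket-based top-k collector (illustrative only).

    Assumes distances lie in [0, d_max); bucket i covers
    [i * width, (i + 1) * width).
    """
    def __init__(self, k, d_max, num_buckets=64):
        self.k = k
        self.width = d_max / num_buckets
        self.buckets = [[] for _ in range(num_buckets)]
        self.count = 0
        self.bound = num_buckets  # buckets >= bound cannot reach the top-k

    def add(self, item, dist):
        b = min(int(dist / self.width), len(self.buckets) - 1)
        if b >= self.bound:
            return  # pruned: at least k closer candidates already collected
        self.buckets[b].append((dist, item))
        self.count += 1
        # once we hold comfortably more than k candidates, tighten the bound
        if self.count > 2 * self.k:
            total = 0
            for i, bk in enumerate(self.buckets):
                total += len(bk)
                if total >= self.k:
                    self.bound = i + 1
                    break

    def topk(self):
        out = []
        for bk in self.buckets:
            out.extend(bk)
            if len(out) >= self.k:
                break
        out.sort()  # only the collected prefix is sorted, not all candidates
        return out[:self.k]
```

Compared with a binary heap of size k, inserts avoid per-candidate sift operations and touch memory bucket-locally, which is the kind of effect the abstract attributes to BBC for large k; the real system additionally couples this buffer with quantization-aware re-ranking, which this sketch omits.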

A Constant-Approximation Distance Labeling Scheme under Polynomially Many Edge Failures

from arXiv: Data Structures and Algorithms

Authors: Bernhard Haeupler, Yaowei Long, Antti Roeyskoe, Thatchaphol Saranurak

A fault-tolerant distance labeling scheme assigns a label to each vertex and edge of an undirected weighted graph $G$ with $n$ vertices so that, for any edge set $F$ of size $|F| \leq f$, one can approximate the distance between $p$ and $q$ in $G \setminus F$ by reading only the labels of $F \cup \{p,q\}$. For any $k$, we present a deterministic polynomial-time scheme with $O(k^{4})$ approximation and $\tilde{O}(f^{4}n^{1/k})$ label size. This is the first scheme to achieve a constant approximation while handling any number of edge faults $f$, resolving the open problem posed by Dory and Parter [DP21]. All previous schemes provided only a linear-in-$f$ approximation [DP21, LPS25]. Our labeling scheme directly improves the state of the art in the simpler setting of distance sensitivity oracles. Even for just $f = Θ(\log n)$ faults, all previous oracles either have super-linear query time, linear-in-$f$ approximation [CLPR12], or exponentially worse $2^{{\rm poly}(k)}$ approximation dependency in $k$ [HLS24].

GPU-RMQ: Accelerating Range Minimum Queries on Modern GPUs

from arXiv: Data Structures and Algorithms

Authors: Lara Kreis, Justus Henneberg, Valentin Henkys, Felix Schuhknecht, Bertil Schmidt

Range minimum queries are frequently used in string processing and database applications including biological sequence analysis, document retrieval, and web search. Hence, various data structures have been proposed for improving their efficiency on both CPUs and GPUs. Recent work has also shown that hardware-accelerated ray tracing on modern NVIDIA RTX graphics cards can be exploited to answer range minimum queries by expressing queries as rays, which are fired into a scene of triangles representing minima of ranges at different granularities. While these approaches are promising, they suffer from at least one of three issues: severe memory overhead, high index construction time, and low query throughput. This renders these methods practically unusable on larger arrays: For example, the state-of-the-art GPU-based approaches LCA and RTXRMQ exceed the memory capacity of an NVIDIA RTX 4090 GPU for input arrays of size >= 2^29. To tackle these problems, in this work, we present a new approach called GPU-RMQ which is based on a hierarchical approach. GPU-RMQ first constructs a hierarchy of range minimum summaries on top of the original array in a highly parallel fashion. For query answering, only the relevant portions of the hierarchy are then processed in an optimized massively-parallel scan operation. Additionally, GPU-RMQ is hybrid in design enabling the use of both ray tracing cores and CUDA cores across different levels of the hierarchy to handle queries. Our experimental evaluation shows that GPU-RMQ outperforms the state-of-the-art approaches in terms of query throughput especially for larger arrays while offering a significantly lower memory footprint and up to two orders-of-magnitude faster index construction. In particular, it achieves up to ~8x higher throughput than LCA, ~17x higher throughput than RTXRMQ, and up to ~4800x higher throughput compared to an optimized CPU-based approach.
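The "hierarchy of range minimum summaries" can be illustrated with a small sequential sketch, assuming a fixed block size `b` (the paper's GPU kernels and the hybrid ray-tracing/CUDA dispatch are of course not reproduced here). Each level stores the minimum of `b` consecutive cells of the level below; a query peels off unaligned cells at the current level and then ascends to the coarser summaries, so it reads only O(b) cells per level.

```python
import math

def build_hierarchy(a, b=4):
    # level 0 is the array itself; each higher level stores
    # the minimum of b consecutive cells of the level below
    levels = [list(a)]
    while len(levels[-1]) > 1:
        cur = levels[-1]
        levels.append([min(cur[j:j + b]) for j in range(0, len(cur), b)])
    return levels

def rmq(levels, l, r, b=4):
    # minimum of a[l..r] (inclusive), reading O(b) cells per level
    best = math.inf
    lvl = 0
    while l <= r:
        while l % b != 0 and l <= r:        # peel until l starts a block
            best = min(best, levels[lvl][l]); l += 1
        while (r + 1) % b != 0 and l <= r:  # peel until r ends a block
            best = min(best, levels[lvl][r]); r -= 1
        if l > r:
            break
        # [l, r] now covers whole blocks: ascend to their summaries
        l, r = l // b, (r + 1) // b - 1
        lvl += 1
    return best
```

In the GPU setting, the summary levels are built bottom-up with one thread per output cell, and the per-level peeling becomes the "relevant portions" scanned by the massively-parallel query kernel described in the abstract.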

Adaptive Fully Dynamic $k$-Center Clustering with (Near-)Optimal Worst-Case Guarantees

from arXiv: Data Structures and Algorithms

Authors: Mara Grilnberger, Antonis Skarlatos

Given a sequence of adversarial point insertions and point deletions, is it possible to simultaneously optimize the approximation ratio, update time, and recourse for a $k$-clustering problem? If so, can this be achieved with worst-case guarantees against an adaptive adversary? These questions have garnered significant attention in recent years. Prior works by Bhattacharya, Costa, Garg, Lattanzi, and Parotsidis [FOCS '24] and by Bhattacharya, Costa, and Farokhnejad [STOC '25] have taken significant steps toward this direction for the $k$-median clustering problem and its generalization, the $(k, z)$-clustering problem. In this paper, we study the $k$-center clustering problem, which is one of the most classical and well-studied $k$-clustering problems. Recently, Bhattacharya, Costa, Farokhnejad, Lattanzi, and Parotsidis [ICML '25] provided an affirmative answer to the first question for the $k$-center clustering problem. However, their work did not resolve the second question, as their result provides only expected amortized guarantees against an oblivious adversary. In this work, we make significant progress and close the gap by answering both questions in the affirmative. Specifically, we show that the fully dynamic $k$-center clustering problem admits a constant-factor approximation, near-optimal worst-case update time, and constant worst-case recourse, even against an adaptive adversary. This is achieved by first developing a fully dynamic bicriteria approximation algorithm with (near-)optimal worst-case bounds, and then designing a suitable fully dynamic $k$-center algorithm with near-linear update time. For the fully dynamic bicriteria approximation algorithm, we establish the worst-case recourse and worst-case update time guarantees separately, and then merge them into a single algorithm through a simple yet elegant process.

Single-Pass Streaming CSPs via Two-Tier Sampling

from arXiv: Data Structures and Algorithms

Authors: Amir Azarmehr, Soheil Behnezhad, Shane Ferrante

We study the maximum constraint satisfaction problem, Max-CSP, in the streaming setting. Given $n$ variables, the constraints arrive sequentially in an arbitrary order, with each constraint involving only a small subset of the variables. The objective is to approximate the maximum fraction of constraints that can be satisfied by an optimal assignment in a single pass. The problem admits a trivial near-optimal solution with $O(n)$ space, so the major open problem in the literature has been the best approximation achievable when limiting the space to $o(n)$. The answer to the question above depends heavily on the CSP instance at hand. The integrality gap $α$ of an LP relaxation, known as the BasicLP, plays a central role. In particular, a major conjecture of the area is that in the single-pass streaming setting, for any fixed $\varepsilon > 0$, (i) an $(α-\varepsilon)$-approximation can be achieved with $o(n)$ space, and (ii) any $(α+\varepsilon)$-approximation requires $Ω(n)$ space. In this work, we fully resolve the first side of the conjecture by proving that an $(α- \varepsilon)$-approximation of Max-CSP can indeed be achieved using $n^{1-Ω_\varepsilon(1)}$ space and in a single pass. Given that Max-DiCut is a special case of Max-CSP, our algorithm fully recovers the recent result of [ABFS26, STOC'26] via a completely different algorithm and proof. On a technical level, our algorithm simulates a suitable local algorithm on a reduced graph using a technique that we call *two-tier sampling*: the algorithm combines both edge sampling and vertex sampling to handle high- and low-degree vertices at the same time.

On the Dynamics of Linear Finite Dynamical Systems Over Galois Rings

from arXiv: Data Structures and Algorithms

Authors: Jonas Kantic, Claudio Qureshi, Daniel Panario, Fabian Legl

Linear finite dynamical systems play an important role, for example, in coding theory and simulations. Methods for analyzing such systems are often restricted to cases in which the system is defined over a field, or lack the practicability to effectively analyze the system's dynamical behavior. However, when analyzing and prototyping finite dynamical systems, it is often desirable to quickly obtain basic information such as the length of cycles and transients that appear in its dynamics, which is reflected in the structure of the connected components of the corresponding functional graphs. In this paper, we extend the analysis of the dynamics of linear finite dynamical systems that act over cyclic modules to Galois rings. Furthermore, we propose algorithms for computing the length of the cycles and the height of the trees that make up their functional graphs.

A Simple Average-case Analysis of Recursive Randomized Greedy MIS

from arXiv: Data Structures and Algorithms

Authors: Mina Dalirrooyfard, Konstantin Makarychev, Slobodan Mitrović

We revisit the complexity analysis of the recursive version of the randomized greedy algorithm for computing a maximal independent set (MIS), originally analyzed by Yoshida, Yamamoto, and Ito (2009). They showed that, on average per vertex, the expected number of recursive calls made by this algorithm is upper bounded by the average degree of the input graph. While their analysis is clever and intricate, we provide a significantly simpler alternative that achieves the same guarantee. Our analysis is inspired by the recent work of Dalirrooyfard, Makarychev, and Mitrović (2024), who developed a potential-function-based argument to analyze a new algorithm for correlation clustering. We adapt this approach to the MIS setting, yielding a more direct and arguably more transparent analysis of the recursive randomized greedy MIS algorithm.
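For concreteness, here is a minimal Python sketch of the recursive randomized greedy MIS oracle being analyzed: vertices receive a uniformly random order, and a vertex joins the independent set iff every neighbor preceding it in the order was rejected. The function name and adjacency-dict interface are illustrative choices, not taken from the paper.

```python
import random

def greedy_mis(n, adj, seed=0):
    """Recursive randomized greedy MIS (sketch): draw a uniformly random
    vertex order; v joins the MIS iff no earlier neighbor joined."""
    rng = random.Random(seed)
    order = list(range(n))
    rng.shuffle(order)
    rank = {v: i for i, v in enumerate(order)}  # position in the random order
    memo = {}

    def in_mis(v):
        # The recursion the average-case analysis counts: v queries exactly
        # those neighbors that precede it in the random order.
        if v not in memo:
            memo[v] = all(not in_mis(u) for u in adj[v] if rank[u] < rank[v])
        return memo[v]

    return {v for v in range(n) if in_mis(v)}
```

The quantity analyzed in the paper is the expected number of recursive `in_mis` calls, which (averaged per vertex) is bounded by the average degree of the input graph.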

Approximating the Permanent of a Random Matrix with Polynomially Small Mean: Zeros and Universality

from arXiv: Data Structures and Algorithms

Authors: Frederic Koehler, Pui Kuen Leung

We study algorithms for approximating the permanent of a random matrix when the entries are slightly biased away from zero. This question is motivated by the goal of understanding the classical complexity of linear optics and \emph{boson sampling} (Aaronson and Arkhipov '11; Eldar and Mehraban '17). Barvinok's interpolation method enables efficient approximation of the permanent, provided one can establish a sufficiently large zero-free region for the polynomial $\mathrm{per}(zJ + W)$, where $J$ is the all-ones matrix and $W$ is a random matrix with independent mean-zero entries. We show that when the entries of $W$ are standard complex Gaussians, all zeros of the random polynomial $\mathrm{per}(zJ + W)$ lie within a disk of radius $\tilde{O}(n^{-1/3})$, which yields an approximation algorithm when the bias of the entries is $\tilde Ω(n^{-1/3})$. Previously, there were no efficient algorithms at biases smaller than $1/\mathrm{polylog}(n)$, and it was unknown whether there typically exist zeros $z$ with $|z| \ge 1$. As a complementary result, we show that the bulk of the zeros, namely $(1 - ε)n$ of them, have magnitude $Θ(n^{-1/2})$. This prevents our interpolation method from contradicting the conjectured average-case hardness of approximating the permanent. We also establish analogous zero-free regions for the hardcore model on general graphs with complex vertex fugacities. In addition, we prove universality results establishing zero-free regions for random matrices $W$ with i.i.d. subexponential entries.

A divide and conquer strategy for multinomial particle filter resampling

from arXiv: Data Structures and Algorithms

Authors: Andrey A. Popov

This work provides a new multinomial resampling procedure for particle filter resampling, focused on the case where the number of samples required is less than or equal to the size of the underlying discrete distribution. This setting is common in ensemble mixture model filters such as the Gaussian mixture filter. We show the superiority of our approach with respect to two of the best-known multinomial sampling procedures, both through a computational complexity analysis and through a numerical experiment.
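For context, here is a minimal sketch of the classical inverse-CDF multinomial resampling baseline that such work is measured against (this is the standard procedure, not the paper's divide-and-conquer scheme; the function name is illustrative):

```python
import bisect
import itertools
import random

def multinomial_resample(weights, m, rng=random):
    """Classical multinomial resampling: draw m particle indices i.i.d.
    with probability proportional to the (unnormalized) weights, via
    inverse-CDF lookup on the cumulative weight array."""
    cdf = list(itertools.accumulate(weights))
    total = cdf[-1]
    # bisect_right finds the first index whose cumulative weight exceeds u
    return [bisect.bisect_right(cdf, rng.random() * total) for _ in range(m)]
```

Each draw costs a binary search, so drawing m of n particles takes O(n + m log n) time; this is the kind of standard baseline against which improved schemes are compared.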

Space-Efficient Text Indexing with Mismatches using Function Inversion

from arXiv: Data Structures and Algorithms

Authors: Jackson Bibbens, Levi Borevitz, Samuel McCauley

A classic data structure problem is to preprocess a string $T$ of length $n$ so that, given a query $q$, we can quickly find all substrings of $T$ with Hamming distance at most $k$ from the query string. Variants of this problem have seen significant research both in theory and in practice. For a wide parameter range, the best worst-case bounds are achieved by the "CGL tree" (Cole, Gottlieb, Lewenstein 2004), which achieves query time roughly $\tilde{O}(|q| + \log^k n + \# occ)$ where $\# occ$ is the size of the output, and space ${O}(n\log^k n)$. The CGL tree space was recently improved to $O(n \log^{k-1} n)$ (Kociumaka, Radoszewski 2026). A natural question is whether a high space bound is necessary. How efficient can we make queries when the data structure is constrained to $O(n)$ space? While this question has seen extensive research, all known results have query time with unfavorable dependence on $n$, $k$, and the alphabet $Σ$. The state-of-the-art query time (Chan et al. 2011) is roughly $\tilde{O}(|q| + |Σ|^k \log^{k^2 + k} n + \# occ)$. We give an $O(n)$-space data structure with query time roughly $\tilde{O}(|q| + \log^{4k} n + \log^{2k} n \cdot \# occ)$, with no dependence on $|Σ|$. Even if $|Σ| = O(1)$, this is the best known query time for linear space if $k\geq 3$ unless $\# occ$ is large. Our results give a smooth tradeoff between time and space. We also give the first sublinear-space results: we give a succinct data structure using only $o(n)$ space in addition to the text itself. Our main technical idea is to apply function inversion (Fiat, Naor 2000) to the CGL tree. Combining these techniques is not immediate; in fact, we revisit the exposition of both the Fiat-Naor data structure and the CGL tree to obtain our bounds. Along the way, we obtain improved performance for both data structures, which may be of independent interest.

Thursday, April 02

Should Have Known Better

from Ben Recht

Motivating the metric of regret in sequential decision making.

This is the second live blog of Lecture 7 of my graduate seminar “Feedback, Learning, and Adaptation.” A table of contents is here.

When talking about sequential decision making and optimal control, I can’t avoid discussing the mathematical concept of regret. Regret is the preferred theoretical metric for evaluating bandit algorithms, and bandit algorithms are a core method for online decision-making.

Invariably, every time I try to explain “regret,” I get the question, “Wait, why do we care about that?” So to have an answer that I can point to in the future, I’m going to write a few blog posts.

We can use the setup from the last blog. We have a two-player game with repeated interactions. In every round t,

  1. Information xt is revealed to both players.

  2. Player One takes action ut.

  3. Player Two takes action dt.

  4. A score rt is assigned based on the triple (xt,ut,dt).

Player One is the “decision maker,” and their action has to be computable from a few lines of code. Their goal is to accumulate as high a score as possible, summed across all rounds. Player Two wants the sum of all of the rt to be as low as possible.

You could think of designing the optimal policy as an optimization problem. Given a description of what Player Two can do, you can design a strategy for Player One. If Player Two is totally random, you could perhaps maximize the expected score. If Player Two is deterministic, you could maximize the score assuming Player Two plays the best possible strategy against you. Regret provides a flexible framework for going beyond both of these formulations.

To define regret, we imagine a counterfactual world in which Player One knows something about Player Two’s strategy in advance. The regret of some strategy is the difference between the score of a policy built on knowing this secret of Player Two and the score of the strategy that has to learn as it goes. It is called regret because it estimates how much the score could be improved with the benefit of hindsight.

Sometimes this regret is really large. Consider the following example. In each round, Player Two thinks of a color, Red or Blue, and Player One has to guess which color Player Two is thinking of. Player One gets a 1 if they guess correctly and a 0 otherwise. Player Two agrees to choose their sequence in advance, but only reveal one number to Player One in each round. In the counterfactual world, Player One would know the entire sequence and would receive a perfect score. In reality, Player One can’t do better than guessing, so they would be hard-pressed to get more than half of the colors correct.

This is where regret gets confusing. We ask, what if Player One in the counterfactual world has the benefit of hindsight but is constrained in their strategies? In this color guessing game, what if Player One is forced to choose one color in their counterfactual world? They see the entire sequence but can only pick red or blue. In this case, if Player Two chooses an equal number of Reds and Blues, the omniscient yet restricted Player One can only get half of the answers correct. A real-world strategy of random guessing will fare just as well as this counterfactual strategy with the benefit of hindsight.
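A tiny simulation may make the restricted-comparator point concrete. Below, regret is measured against the best single fixed color in hindsight; on a balanced sequence, random guessing has essentially zero regret against this comparator, exactly as described. (The function name and the 'R'/'B' encoding are my own choices.)

```python
import random

def color_game_regret(colors, seed=0):
    """Regret of a randomly guessing Player One against the restricted
    comparator: the best single fixed color chosen with hindsight."""
    rng = random.Random(seed)
    learner_score = sum(rng.choice("RB") == c for c in colors)
    best_fixed = max(colors.count("R"), colors.count("B"))
    return best_fixed - learner_score
```

Against the unrestricted comparator, which knows the whole sequence, the hindsight score would instead be len(colors), and random guessing would suffer regret linear in the number of rounds.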

No matter how many times I explain it, I find this setup confusing. Let me write it again: The regret model requires two things: a secret of Player Two and a restricted strategy set of Player One. In the real world, Player One has a flexible strategy set, but is missing information. In the counterfactual world, Player One has a restricted strategy set, but extra knowledge. Regret bounds the difference in scores achieved in these two worlds.

You might ask why this particular example of color guessing is interesting. I’m not sure it is, but it’s the one we’ll use next week when discussing forecasting. When someone tells you that they have calibrated predictions, they are doing this sort of sleight of hand and comparing against something that you probably don’t actually care about.

But let’s spend some time discussing examples where regret is reasonable. I’ll start with the canonical example: the stochastic multiarmed bandit. If Player Two is random and stationary, then the best strategy in hindsight makes a lot more sense. In our game of colors, this is the multiarmed bandit problem, the most annoyingly named subject in decision making. In the classic version of this problem, you have two slot machines and want to find the one with the highest payout. Each round, you are allowed to choose one of the machines to play. We model the payout from each machine as an independent draw from a fixed probability distribution. These distributions have different means, and your goal is to devise a strategy that results in the highest expected payout.

What would the best policy do? No matter how clever you are, you can’t beat the strategy of only using the machine with the higher mean payout. If you knew the expected payouts in advance and your goal were to maximize the expected payout, you would use only the machine with the highest expected payout. Thus, we can think of the secret held by Player Two as the mean payouts of each machine.

If you didn’t know this secret, what would you do? You’d probably spend some time with each machine, look at which one is giving you higher returns, and then pick that one forever. This seems like a reasonable strategy.1 But note that this strategy necessarily has nonzero regret, because you have to try both machines to figure out which one is best.

Any strategy you devise for the real world has a particular expected regret, which is the difference between the expected payout of playing the best machine and the expected value of your strategy. In the case of our multiarmed bandit, the worst regret is accrued by always playing the suboptimal machine. So the regret would grow linearly with the number of pulls. Bandit algorithms seek strategies for which regret grows sublinearly.
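The “try both machines, then commit” strategy from the previous paragraphs can be sketched as follows, with Bernoulli payouts assumed for concreteness (the function name and parameters are illustrative, not canonical):

```python
import random

def explore_then_commit(means, horizon, explore=50, seed=0):
    """Two-armed Bernoulli bandit: pull each arm `explore` times, commit
    to the empirically better arm, and return the realized regret against
    always pulling the arm with the higher mean payout."""
    rng = random.Random(seed)
    total = 0
    est = [0, 0]
    for a in (0, 1):                      # exploration phase
        for _ in range(explore):
            r = rng.random() < means[a]
            est[a] += r
            total += r
    best = 0 if est[0] >= est[1] else 1   # commit to the empirical winner
    for _ in range(horizon - 2 * explore):
        total += rng.random() < means[best]
    return horizon * max(means) - total
```

The exploration rounds spent on the worse machine are exactly why any such strategy has nonzero regret; tuning the exploration length against the horizon is how bandit algorithms achieve the sublinear regret growth mentioned above.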

Outside the casino, variants of the stochastic multiarmed bandit are reasonable models for adaptive experimentation. Suppose you want to select whichever of treatments A and B maximizes the average benefit to some cohort of subjects. If you can randomly sample individuals from the cohort, there will be regret associated with the number of subjects assigned to the suboptimal treatment in a randomized experiment, and there will be regret associated with the chance your experiment selects the suboptimal treatment. You would like to minimize application of the wrong treatment, but also be pretty sure you are finding the right one. You can compare this to the policy that assigns everyone to the optimal treatment in advance.

Tomorrow I’ll talk through three other examples where regret feels like the right concept to me. In hindsight, it’s not always worth the headaches and confusion associated with regret minimization, but there are enough positive examples to make it a concept worth understanding.

Subscribe now

1

This is more or less the optimal strategy.

By Ben Recht

TCS+ talk: Wednesday, April 8 — Rahul Ilango, MIT

from TCS+ Seminar Series

The next TCS+ talk will take place this coming Wednesday, April 8th at 1:00 PM Eastern Time (10:00 AM Pacific Time, 19:00 Central European Time, 17:00 UTC). Rahul Ilango from MIT will speak about “Gödel in Cryptography: Zero-Knowledge Proofs With No Interaction, No Setup, and Perfect Soundness” (abstract below).

You can reserve a spot as an individual or a group to join us live by signing up on the online form. Registration is not required to attend the interactive talk, and the link will be posted on the website the day prior to the talk; however, by registering in the form, you will receive a reminder along with the link. (The recorded talk will also be posted on our website afterwards.) As usual, for more information about the TCS+ online seminar series and the upcoming talks, or to suggest a possible topic or speaker, please see the website.

Abstract: Gödel showed that there are true but unprovable statements. This was bad news for Hilbert, who hoped that every true statement was provable. In this talk, I’ll describe why Gödel’s result is, in fact, good news for cryptography.

Specifically, Gödel’s result allows for the following strange scenario: a cryptographic system S is insecure, but it is impossible to prove that S is insecure. As I will explain, in this scenario (defined carefully), S is secure for nearly all practical purposes.

Leveraging this idea, we effectively construct — under longstanding assumptions — a classically-impossible cryptographic dream object: “zero-knowledge proofs for NP with no interaction, no setup, and perfect soundness” (I won’t assume you know what this means!). As an application, our result lets one give an ordinary mathematical proof that a Sudoku puzzle is solvable without revealing how to solve it. Previously, it was not known how to do this.

Paper link: https://eccc.weizmann.ac.il/report/2025/095/

By plustcs

Learning and Generating Mixed States Prepared by Shallow Channel Circuits

from arXiv: Computational Complexity

Authors: Fangjun Hu, Christian Kokail, Milan Kornjača, Pedro L. S. Lopes, Weiyuan Gong, Sheng-Tao Wang, Xun Gao, Stefan Ostermann

Learning quantum states from measurement data is a central problem in quantum information and computational complexity. In this work, we study the problem of learning to generate mixed states on a finite-dimensional lattice. Motivated by recent developments in mixed state phases of matter, we focus on arbitrary states in the trivial phase. A state belongs to the trivial phase if there exists a shallow preparation channel circuit under which local reversibility is preserved throughout the preparation. We prove that any mixed state in this class can be efficiently learned from measurement access alone. Specifically, given copies of an unknown trivial phase mixed state, our algorithm outputs a shallow local channel circuit that approximately generates this state in trace distance. The sample complexity and runtime are polynomial (or quasi-polynomial) in the number of qubits, assuming constant (or polylogarithmic) circuit depth and gate locality. Importantly, the learner is not given the original preparation circuit and relies only on its existence. Our results provide a structural foundation for quantum generative models based on shallow channel circuits. In the classical limit, our framework also inspires an efficient algorithm for classical diffusion models using only a polynomial overhead of training and generation.

A General Framework for Computational Lower Bounds in Nontrivial Norm Approximation

from arXiv: Computational Complexity

Authors: Runshi Tang, Yuefeng Han, Anru R. Zhang

In this note, we propose a general framework for proving computational lower bounds in norm approximation by leveraging a reverse detection--estimation gap. The starting point is a testing problem together with an estimator whose error is significantly smaller than the corresponding computational detection threshold. We show that such a gap yields a lower bound on the approximation distortion achievable by any algorithm in the underlying computational class. In this way, reverse detection--estimation gaps can be turned into a general mechanism for certifying the hardness of approximating nontrivial norms. We apply this framework to the spectral norm of order-$d$ symmetric tensors in $\mathbb{R}^{p^d}$. Using a recently established low-degree hardness result for detecting nonzero high-order cumulant tensors, together with an efficiently computable estimator whose error is below the low-degree detection threshold, we prove that any degree-$D$ low-degree algorithm with $D \le c_d(\log p)^2$ must incur distortion at least $p^{d/4-1/2}/\operatorname{polylog}(p)$ for the tensor spectral norm. Under the low-degree conjecture, the same conclusion extends to all polynomial-time algorithms. In several important settings, this lower bound matches the best known upper bounds up to polylogarithmic factors, suggesting that the exponent $d/4-1/2$ captures a genuine computational barrier. Our results provide evidence that the difficulty of approximating tensor spectral norm is not merely an artifact of existing techniques, but reflects a broader computational barrier.

An Unconditional Barrier for Proving Multilinear Algebraic Branching Program Lower Bounds

from arXiv: Computational Complexity

Authors: Deepanshu Kush

Since the breakthrough superpolynomial multilinear formula lower bounds of Raz (Theory of Computing 2006), proving such lower bounds against multilinear algebraic branching programs (mABPs) has been a longstanding open problem in algebraic complexity theory. All known multilinear lower bounds rely on the min-partition rank method, and the best bounds against mABPs have remained quadratic (Alon, Kumar, and Volk, Combinatorica 2020). We show that the min-partition rank method cannot prove superpolynomial mABP lower bounds: there exists a full-rank multilinear polynomial computable by a polynomial-size mABP. This is an unconditional barrier: new techniques are needed to separate $\mathsf{mVBP}$ from higher classes in the multilinear hierarchy. Our proof resolves an open problem of Fabris, Limaye, Srinivasan, and Yehudayoff (ECCC 2026), who showed that the power of this method is governed by the minimum size $N(n)$ of a combinatorial object called a $1$-balanced-chain set system, and proved $N(n) \le n^{O(\log n/\log\log n)}$. We prove $N(n) = n^{O(1)}$ by giving the chain-builder a binary choice at each step, biasing what was a symmetric random walk into one where the imbalance increases with probability at most $1/4$; a supermartingale argument combined with a multi-scale recursion yields the polynomial bound.

Breadth-First Search Trees with Many or Few Leaves

from arXiv: Data Structures and Algorithms

Authors: Jesse Beisegel, Ekkehard Köhler, Robert Scheffler, Martin Strehler

The Maximum (Minimum) Leaf Spanning Tree problem asks for a spanning tree with the largest (smallest) number of leaves. As spanning trees are often computed using graph search algorithms, it is natural to restrict this problem to the set of search trees of some particular graph search, e.g., find the Breadth-First Search (BFS) tree with the largest number of leaves. We study this problem for Generic Search (GS), BFS and Lexicographic Breadth-First Search (LBFS) using search trees that connect each vertex to its first neighbor in the search order (first-in trees) just like the classic BFS tree. In particular, we analyze the complexity of these problems, both in the classical and in the parameterized sense. Among other results, we show that the minimum and maximum leaf problems are in FPT for the first-in trees of GS, BFS and LBFS when parameterized by the number of leaves in the tree. However, when these problems are parameterized by the number of internal vertices of the tree, they are W[1]-hard for the first-in trees of GS, BFS and LBFS.
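The first-in rule the abstract describes can be made concrete: run the search once to fix a visit order, then attach every non-root vertex to its earliest-visited neighbour. A minimal Python sketch for BFS (where, as the authors note, this coincides with the classic BFS tree); the adjacency-dict representation is an assumption for illustration, not from the paper:

```python
from collections import deque

def first_in_bfs_tree(adj, root):
    """Parent pointers of the first-in BFS tree: each vertex hangs off
    its earliest-visited neighbour in the BFS visit order."""
    pos = {root: 0}                      # vertex -> position in visit order
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in pos:
                pos[v] = len(pos)
                queue.append(v)
    # First-in rule: pick the neighbour that entered the order earliest.
    return {v: (None if v == root else
                min((w for w in adj[v] if pos[w] < pos[v]), key=pos.get))
            for v in pos}
```

The leaf count the optimization problems ask about is then just the number of vertices that appear as nobody's parent.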

On the average-case complexity landscape for Tensor-Isomorphism-complete problems over finite fields

from arXiv: Data Structures and Algorithms

Authors: Tiange Li, Yinan Li, Youming Qiao, Dacheng Tao, Yingjie Wang

In Grochow and Qiao (SIAM J. Comput., 2021), the complexity class Tensor Isomorphism (TI) was introduced and isomorphism problems for groups, algebras, and polynomials were shown to be TI-complete. In this paper, we study average-case algorithms for several TI-complete problems over finite fields, including algebra isomorphism, matrix code conjugacy, and $4$-tensor isomorphism. Our main results are as follows. Over the finite field of order $q$, we devise (1) average-case polynomial-time algorithms for algebra isomorphism and matrix code conjugacy that succeed on a $1/\Theta(q)$ fraction of inputs and (2) an average-case polynomial-time algorithm for $4$-tensor isomorphism that succeeds on a $1/q^{\Theta(1)}$ fraction of inputs. Prior to our work, algorithms for algebra isomorphism with rigorous average-case analyses ran in exponential time, albeit succeeding on a larger fraction of inputs (Li--Qiao, FOCS'17; Brooksbank--Li--Qiao--Wilson, ESA'20; Grochow--Qiao--Tang, STACS'21). These results reveal a finer landscape of the average-case complexities of TI-complete problems, providing guidance for cryptographic systems based on isomorphism problems. Our main technical contribution is to introduce the spectral properties of random matrices into algorithms for TI-complete problems. This leads to not only new algorithms but also new questions in random matrix theory over finite fields. To settle these questions, we need to extend both the generating function approach as in Neumann and Praeger (J. London Math. Soc., 1998) and the characteristic sum method of Gorodetsky and Rodgers (Trans. Amer. Math. Soc., 2021).

Stable algorithms cannot reliably find isolated perceptron solutions

from arXiv: Data Structures and Algorithms

Authors: Shuyang Gong, Brice Huang, Shuangping Li, Mark Sellke

We study the binary perceptron, a random constraint satisfaction problem that asks to find a Boolean vector in the intersection of independently chosen random halfspaces. A striking feature of this model is that at every positive constraint density, it is expected that a $1-o_N(1)$ fraction of solutions are \emph{strongly isolated}, i.e. separated from all others by Hamming distance $\Omega(N)$. At the same time, efficient algorithms are known to find solutions at certain positive constraint densities. This raises a natural question: can any isolated solution be algorithmically visible? We answer this in the negative: no algorithm whose output is stable under a tiny Gaussian resampling of the disorder can \emph{reliably} locate isolated solutions. We show that any stable algorithm has success probability at most $\frac{3\sqrt{17}-9}{4}+o_N(1)\leq 0.84233$. Furthermore, every stable algorithm that finds a solution with probability $1-o_N(1)$ finds an isolated solution with probability $o_N(1)$. The class of stable algorithms we consider includes degree-$D$ polynomials up to $D\leq o(N/\log N)$; under the low-degree heuristic [Hopkins '18], this suggests that locating strongly isolated solutions requires running time $\exp(\widetilde{\Theta}(N))$. Our proof does not use the overlap gap property. Instead, we show via Pitt's correlation inequality that after a random perturbation of the disorder, the number of solutions located close to a pre-existing isolated solution cannot concentrate at $1$.

The Mystery Deepens: On the Query Complexity of Tarski Fixed Points

from arXiv: Data Structures and Algorithms

Authors: Xi Chen, Yuhao Li, Mihalis Yannakakis

We give an $O(\log^2 n)$-query algorithm for finding a Tarski fixed point over the $4$-dimensional lattice $[n]^4$, matching the $\Omega(\log^2 n)$ lower bound of [EPRY20]. Additionally, our algorithm yields an ${O(\log^{\lceil (k-1)/3\rceil+1} n)}$-query algorithm for any constant $k$, improving the previous best upper bound ${O(\log^{\lceil (k-1)/2\rceil+1} n)}$ of [CL22]. Our algorithm uses a new framework based on \emph{safe partial-information} functions. The latter were introduced in [CLY23] to give a reduction from the Tarski problem to its promised version with a unique fixed point. This is the first time they are directly used to design new algorithms for Tarski fixed points.
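For intuition on why query complexity here scales with powers of $\log n$: in one dimension, a monotone $f:[n]\to[n]$ always has a fixed point findable by binary search, since $f(m)\ge m$ plus monotonicity means $f$ maps $[m,n]$ into itself (and symmetrically for $f(m)\le m$). A sketch of this textbook 1-D case only, not the paper's higher-dimensional algorithm:

```python
def tarski_1d(f, n):
    """Fixed point of a monotone f: {1,...,n} -> {1,...,n} in O(log n) queries.
    If f(m) > m, monotonicity traps a fixed point in [m+1, n]; symmetrically
    for f(m) < m, so halving the interval on each query suffices."""
    lo, hi = 1, n
    while True:
        mid = (lo + hi) // 2
        y = f(mid)
        if y == mid:
            return mid
        lo, hi = (mid + 1, hi) if y > mid else (lo, mid - 1)
```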

Engineering Fully Dynamic Convex Hulls

from arXiv: Data Structures and Algorithms

Authors: Ivor van der Hoog, Henrik Reinstädtler, Eva Rotenberg

We present a new fully dynamic algorithm for maintaining convex hulls under insertions and deletions while supporting geometric queries. Our approach combines the logarithmic method with a deletion-only convex hull data structure, achieving amortised update times of $O(\log n \log \log n)$ and query times of $O(\log^2 n)$. We provide a robust and non-trivial implementation that supports point-location queries, a challenging and non-decomposable class of convex hull queries. We evaluate our implementation against the state of the art, including a new naive baseline that rebuilds the convex hull whenever an update affects it. On hulls that include polynomially many data points (e.g. $\Theta(n^\varepsilon)$ for some $\varepsilon$), such as the ones that often occur in practice, our method outperforms all other techniques. Update-heavy workloads strongly favour our approach, which is in line with our theoretical guarantees. Yet, our method remains competitive all the way down to when the update to query ratio is $1$ to $10$. Experiments on real-world data sets furthermore reveal that existing fully dynamic techniques suffer from significant robustness issues. In contrast, our implementation remains stable across all tested inputs.

Single-Criteria Metric $r$-Dominating Set Problem via Minor-Preserving Support

from arXiv: Data Structures and Algorithms

Authors: Reilly Browne, Hsien-Chih Chang

Given an unweighted graph $G$, the *minimum $r$-dominating set problem* asks for the smallest-cardinality subset $S$ such that every vertex in $G$ is within radius $r$ of some vertex in $S$. While the $r$-dominating set problem on planar graphs admits a PTAS from Baker's shifting/layering technique when $r$ is constant, it becomes significantly harder when $r$ can depend on $n$. Under the Exponential-Time Hypothesis, Fox-Epstein et al. [SODA 2019] showed that no efficient PTAS exists for the unbounded $r$-dominating set problem on planar graphs. One may also consider the harder *vertex-weighted metric $r$-dominating set*, where edges have lengths, vertices have positive weights, and the goal is to find an $r$-dominating set of minimum total weight. This led to the development of *bicriteria* algorithms that allow radius-$(1+\varepsilon)r$ balls while achieving a $1+\varepsilon$ approximation to the optimal weight. We establish the first *single-criteria* polynomial-time $O(1)$-approximation algorithm for the vertex-weighted metric $r$-dominating set on planar graphs, where $r$ is part of the input and can be arbitrarily large. Our algorithm applies the quasi-uniformity sampling of Chan et al. [SODA 2012] by bounding the *shallow cell complexity* of the radius-$r$ ball system to be linear in $n$. Two technical innovations enable this: 1. Since discrete ball systems on planar graphs are neither pseudodisks nor amenable to standard union-complexity arguments, we construct a *support graph* for arbitrary distance ball systems as contractions of Voronoi cells, with sparseness as a byproduct. 2. We assign each depth-($\geq 3$) cell to a unique 3-tuple of ball centers, enabling Clarkson-Shor techniques to reduce counting to depth-*exactly*-3 cells, which we prove are $O(n)$ by a geometric argument on our Voronoi contraction support.

Two Linear Passes Are Necessary for Sum-Exclude-Self Under Sublinear Space

from arXiv: Data Structures and Algorithms

Authors: Andrew Au

We prove that any algorithm computing the sum-exclude-self of an unsigned $d$-bit integer array of length $n$ under sublinear space must perform two linear passes over the input. More precisely, the algorithm must read at least $n-1$ input elements before any output cell receives its final value, and at least $n - \lfloor t/d \rfloor$ additional elements thereafter, where $t = o(nd)$ bits is the working memory size. This gives a total of $2n - 1 - \lfloor t/d \rfloor$ element reads. A trivial modification of the standard two-pass algorithm achieves this bound exactly for all practical input sizes. The proof uses this toy problem as a worked example to demonstrate the choke-point technique for proving sublinear-space lower bounds.
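For reference, the standard two-pass algorithm whose read count the lower bound matches (up to the stated slack): one pass accumulates the total, a second pass emits each output as the total minus that element. A minimal Python sketch:

```python
def sum_exclude_self(a):
    """out[i] = sum(a) - a[i], via two linear passes and O(1) extra memory."""
    total = 0
    for x in a:              # pass 1: n reads to compute the total
        total += x
    out = []
    for x in a:              # pass 2: n more reads to fill every output cell
        out.append(total - x)
    return out
```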

Faster Approximate Fixed Points of $\ell_\infty$-Contractions

from arXiv: Data Structures and Algorithms

Authors: Andrei Feodorov, Sebastian Haslebacher

We present a new algorithm for finding an $\varepsilon$-approximate fixed point of an $\ell_\infty$-contracting function $f : [0, 1]^d \rightarrow [0, 1]^d$. Our algorithm is based on the query-efficient algorithm by Chen, Li, and Yannakakis (STOC 2024), but comes with an improved upper bound of $(\log \frac{1}{\varepsilon})^{\mathcal{O}(d \log d)}$ on the overall runtime (while still being query-efficient). By combining this with a recent decomposition theorem for $\ell_\infty$-contracting functions, we then describe a second algorithm that finds an $\varepsilon$-approximate fixed point in $(\log \frac{1}{\varepsilon})^{\mathcal{O}(\sqrt{d} \log d)}$ queries and time. The key observation here is that decomposition theorems such as the one for $\ell_\infty$-contracting maps often allow a trade-off: If an algorithm's runtime is worse than its query complexity in terms of the dependency on the dimension $d$, then we can improve the runtime at the expense of weakening the query upper bound. By well-known reductions, our results imply a faster algorithm for $\varepsilon$-approximately solving Shapley stochastic games.
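For contrast with these query bounds: plain Banach iteration already finds an $\varepsilon$-approximate fixed point of a $\lambda$-contraction, but its step count scales like $\log(1/\varepsilon)/\log(1/\lambda)$ and blows up as $\lambda \to 1$, which is what makes dimension-dependent bounds like the above interesting. A hedged sketch of the naive baseline (illustrative, not the paper's algorithm):

```python
def banach_fixed_point(f, d, eps):
    """Iterate x <- f(x) until ||f(x) - x||_inf <= eps. For a
    lambda-contraction in the l_inf norm this takes
    O(log(1/eps) / log(1/lambda)) evaluations, degrading as lambda -> 1."""
    x = [0.5] * d                      # start at the centre of [0,1]^d
    while True:
        y = f(x)
        if max(abs(a - b) for a, b in zip(x, y)) <= eps:
            return y
        x = y
```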

Rapid mixing in positively weighted restricted Boltzmann machines

from arXiv: Data Structures and Algorithms

Authors: Weiming Feng, Heng Guo, Minji Yang

We show polylogarithmic mixing time bounds for the alternating-scan sampler for positively weighted restricted Boltzmann machines. This is done via analysing the same chain and the Glauber dynamics for ferromagnetic two-spin systems, where we obtain new mixing time bounds up to the critical thresholds.
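The alternating-scan sampler being analysed is block Gibbs for an RBM: resample all hidden units given the visibles, then all visibles given the hiddens, each coordinate an independent coin with a sigmoid bias. A hedged Python sketch of one chain with binary units (the dense-list representation and parameter names are illustrative assumptions, not from the paper):

```python
import math
import random

def alternating_scan(W, b, c, steps, seed=0):
    """Block Gibbs for an RBM with visible biases b, hidden biases c,
    and weights W[i][j] coupling visible unit i to hidden unit j."""
    rng = random.Random(seed)
    sig = lambda t: 1.0 / (1.0 + math.exp(-t))
    n, m = len(b), len(c)
    v = [rng.randrange(2) for _ in range(n)]
    h = [0] * m
    for _ in range(steps):
        # Scan 1: resample every hidden unit given the current visibles.
        h = [int(rng.random() < sig(c[j] + sum(W[i][j] * v[i] for i in range(n))))
             for j in range(m)]
        # Scan 2: resample every visible unit given the new hiddens.
        v = [int(rng.random() < sig(b[i] + sum(W[i][j] * h[j] for j in range(m))))
             for i in range(n)]
    return v, h
```

The paper's "positively weighted" condition corresponds to all $W[i][j] \ge 0$, the ferromagnetic case under which the polylogarithmic mixing bounds are proved.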

Round-efficient Fully-scalable MPC algorithms for k-Means

from arXiv: Data Structures and Algorithms

Authors: Shaofeng H. -C. Jiang, Yaonan Jin, Jianing Lou, Weicheng Wang

We study Euclidean $k$-Means under the Massively Parallel Computation (MPC) model, focusing on the \emph{fully-scalable} setting. Our main result is a fully-scalable $O((\log n/\log\log n)^2)$-approximation in $O(1)$ rounds. Previously, fully-scalable algorithms for $k$-Means either run in super-constant $O(\log\log n \cdot \log\log\log n)$ rounds, albeit with a better $O(1)$-approximation [Cohen-Addad et al., SODA'26], or suffer from bicriteria guarantees [Bhaskara and Wijewardena, ICML'18; Czumaj et al., ICALP'24]. Our algorithm also gives an $O(\log n/\log\log n)$-approximation for $k$-Median, which improves a recent $O(\log n)$-approximation [Goranci et al., SODA'26], and this $o(\log n)$ ratio breaks the fundamental barrier of tree embedding methods used therein. Our main technical contribution is a new variant of the MP algorithm [Mettu and Plaxton, SICOMP'03] that works for general metrics, whose new guarantee is the Lagrangian Multiplier Preserving (LMP) property, which, importantly, holds even under arbitrary distance distortions. Allowing distance distortion is crucial for efficient MPC implementations and useful for efficient algorithm design in general, whereas preserving the LMP property under distance distortion is known to be a significant technical challenge. As a byproduct of our techniques, we also obtain an $O(1)$-approximation to the optimal \emph{value} in $O(1)$ rounds, which conceptually suggests that achieving a true $O(1)$-approximation (for the solution) in $O(1)$ rounds may be a sensible goal for future study.

A Framework for Parameterized Subexponential-Subcubic-Time Algorithms for Weighted Problems in Planar Graphs

from arXiv: Data Structures and Algorithms

Authors: Matthias Bentert, Fedor V. Fomin, Petr A. Golovach

Many problems are known to be solvable in subexponential parameterized time when the input graph is planar. The bidimensionality framework of Demaine, Fomin, Hajiaghayi, and Thilikos [JACM'05] and the treewidth-pattern-covering approach by Fomin, Lokshtanov, Marx, Pilipczuk, Pilipczuk, and Saurabh [SICOMP'22] give robust tools for designing such algorithms. However, there are still many problems for which we do not know whether subexponential parameterized algorithms exist. The bidimensionality framework is not able to handle weights or directed graphs and the treewidth-pattern-covering approach only works for finding connected solutions. Building on a result by Nederlof [STOC'20], we provide a framework that is able to solve a variety of problems in planar graphs in subexponential parameterized time for which this was previously not known (where the polynomial part of the running time is usually $O(n^{2.49})$). Our framework can handle weights, does not require solutions to contain only few connected components, and applies to cases where the number of potential patterns of a solution is exponential in the parameter. We then use the framework to show that various weighted problems like Weighted Partial Vertex Cover, Maximum-Weight Induced Forest, Minimum-Weight Rooted Simple Minor, and Maximum-Weight Rooted Parallel Induced Minor allow for subexponential parameterized algorithms. This was previously not known for any of them. Moreover, we present a very easy-to-use fragment of our framework. This fragment allows for significantly simpler proofs in the case of Maximum-Weight Independent Set and Maximum $(k, n-k)$-Cut and is able to show a subexponential parameterized algorithm for weighted versions of Densest $k$-Subgraph. Even the unweighted version was not known before and is stated as an open problem in the existing literature.

Near-Optimal Four-Cycle Counting in Graph Streams

from arXiv: Data Structures and Algorithms

Authors: Sebastian Lüderssen, Stefan Neumann, Pan Peng

We study four-cycle counting in arbitrary order graph streams. We present a 3-pass algorithm for $(1+\varepsilon)$-approximating the number of four-cycles using $\widetilde{O}(m/\sqrt{T})$ space, where $m$ is the number of edges and $T$ the number of four-cycles in the graph. This improves upon a 3-pass algorithm by Vorotnikova using space $\widetilde{O}(m/T^{1/3})$ and matches a multi-pass lower bound of $\Omega(m/\sqrt{T})$ by McGregor and Vorotnikova.
