
Thank you to arXiv for use of its open access interoperability.

Note: the date of arXiv entries announced right after publication holidays might incorrectly show up as the date of the publication holiday itself. This is due to our ad hoc method of inferring announcement dates, which are not returned by the arXiv API.

Powered by Pluto.

Source on GitHub.

Maintained by Nima Anari, Arnab Bhattacharyya, Gautam Kamath.

Theory of Computing Report

Tuesday, March 10

Intrinsic Sequentiality in P: Causal Limits of Parallel Computation

from arXiv: Computational Complexity

Authors: Jing-Yuan Wei

We study a polynomial-time decision problem in which \emph{causal execution} is part of the instance specification. Each input describes a depth-$N$ process in which a single non-duplicable token must traverse an ordered sequence of steps, revealing only $O(1)$ bits of routing information at each step. The accept/reject outcome is defined solely by completion of this prescribed execution, rather than by order-free evaluation of the input. A deterministic Turing machine executes the process in $Θ(N)$ time. Using standard information-theoretic tools - specifically cut-set bounds for relay channels and Fano's inequality - we show that any execution respecting the causal constraints requires $Ω(N)$ units of causal time. Information about the delivery path can advance by at most one hop per unit of causal time, so the process admits no asymptotic parallel speedup. We further show that no classical $\mathbf{NC}$ circuit family can implement the process when circuit depth is interpreted as realizable parallel time. This identifies a class of polynomial-time problems with intrinsic causal structure and highlights a gap between logical parallelism and causal executability.

The Unit Gap: How Sharing Works in Boolean Circuits

from arXiv: Computational Complexity

Authors: Kirill Krinkin

We study the gap between the minimum size of a Boolean circuit (DAG) and the minimum size of a formula (tree circuit) over the And-Inverter Graph (AIG) basis {AND, NOT} with free inversions. We prove that this gap is always 0 or 1 (Unit Gap Theorem), that sharing requires opt(f) >= n essential variables (Threshold Theorem), and that no sharing is needed when opt(f) <= 3 (Tree Theorem). Gate counts in optimal circuits satisfy an exact decomposition formula with a binary sharing term. When the gap equals 1, it arises from exactly one gate with fan-out 2, employing either dual-polarity or same-polarity reuse; we prove that no other sharing structure can produce a unit gap.

Quantum information advantage based on Bell inequalities

from arXiv: Computational Complexity

Authors: Rahul Jain, Srijita Kundu

Recently, Kretschmer et al. [KGD+25] presented an experimental demonstration of a proposed quantum information advantage protocol. We present an alternate proposal based on a relation derived from parallel-repeated CHSH games. Our memory measure is based on an information measure and differs from that of [KGD+25], which counts the number of qubits. Our proposal has an efficient verifier and a noise-robust quantum prover that is arguably much more efficient than that of [KGD+25].

On Factorization of Sparse Polynomials of Bounded Individual Degree

from arXiv: Computational Complexity

Authors: Aminadav Chuyoon, Amir Shpilka

We study sparse polynomials with bounded individual degree and their factors, obtaining the following structural and algorithmic results. 1. A deterministic polynomial-time algorithm to find all sparse divisors of a sparse polynomial of bounded individual degree, together with the first upper bound on the number of non-monomial irreducible factors of such polynomials. 2. A $\mathrm{poly}(n,s^{d\log \ell})$-time algorithm that recovers $\ell$ irreducible $s$-sparse polynomials of individual degree at most $d$ from blackbox access to their (not necessarily sparse) product. This partially resolves a question of Dutta-Sinhababu-Thierauf (RANDOM 2024). In particular, if $\ell=O(1)$ the algorithm runs in polynomial time. 3. Deterministic algorithms for factoring a product of $s$-sparse polynomials of individual degree $d$ from blackbox access. Over fields of characteristic zero or sufficiently large characteristic the runtime is $\mathrm{poly}(n,s^{d^3\log n})$; over arbitrary fields it is $\mathrm{poly}(n,(d^2)!,s^{d^5\log n})$. This improves Bhargava-Saraf-Volkovich (JACM 2020), which gives $\mathrm{poly}(n,s^{d^7\log n})$ time for a single sparse polynomial. For a single sparse input we obtain $\mathrm{poly}(n,s^{d^2\log n})$ time. 4. Given blackbox access to a product of factors of sparse polynomials of bounded individual degree, we give a deterministic polynomial-time algorithm to find all irreducible sparse multiquadratic factors with multiplicities. This generalizes the algorithms of Volkovich (RANDOM 2015, 2017) and extends the complete-power test of Bisht-Volkovich (CC 2025). To handle arbitrary fields we introduce a notion of primitive divisors that removes characteristic assumptions from most of our algorithms.

A base change framework for tensor functions

from arXiv: Computational Complexity

Authors: Qiyuan Chen

The main contribution of this note is to establish a framework for extending results on tensor functions over a specific field to a general field. As a consequence of this framework, we extend existing work to more general settings: \emph{(1)} slice rank is linearly bounded by geometric rank for all 3-tensors over any field; \emph{(2)} the slice rank of any 3-tensor is quasi-supermultiplicative. As a consequence, the asymptotic slice rank exists for any 3-tensor.

The Complexity of Extending Storylines with Minimum Local Crossing Number

from arXiv: Computational Geometry

Authors: Alexander Dobler, Siddharth Gupta, Philipp Kindermann, Fabrizio Montecchiani, Martin Nöllenburg

Storyline layouts visualize temporal interactions by drawing each character as an $x$-monotone curve and enforcing that the participants of every meeting form a contiguous vertical group. We study a drawing extension variant in which a layout of a sub-storyline is fixed and has to be extended by inserting missing characters while preserving all meeting constraints. We minimize the local crossing number $χ$, i.e., the maximum number of crossings along any single character. We prove that the problem is W[1]-hard parameterized by the number $k$ of inserted characters plus the maximum number $σ$ of active characters, in XP parameterized by $σ$, and in FPT parameterized by $σ+χ$.

Topologically Stable Hough Transform

from arXiv: Computational Geometry

Authors: Stefan Huber, Kristóf Huszár, Michael Kerber, Martin Uray

We propose an alternative formulation of the well-known Hough transform to detect lines in point clouds. Replacing the discretized voting scheme of the classical Hough transform by a continuous score function, its persistent features in the sense of persistent homology give a set of candidate lines. We also devise and implement an algorithm to efficiently compute these candidate lines.
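For contrast, the classical discretized voting scheme that the paper replaces can be sketched in a few lines (a toy rendering under textbook assumptions — the line parametrization $x\cos θ + y\sin θ = ρ$ with 1° angle bins and integer-rounded $ρ$ bins; none of this is taken from the paper):

```python
import math
from collections import defaultdict

def hough_lines(points, n_theta=180):
    """Classical discretized Hough voting: each point votes for every
    (theta, rho) bin consistent with a line x*cos(theta)+y*sin(theta)=rho
    passing through it. Returns the accumulator {(theta_deg, rho): votes};
    peaks in the accumulator are the candidate lines."""
    acc = defaultdict(int)
    for x, y in points:
        for t in range(n_theta):
            th = math.radians(t)
            rho = x * math.cos(th) + y * math.sin(th)
            acc[(t, round(rho))] += 1
    return acc

# Ten points on the line y = x; its normal direction is theta = 135 degrees
# with rho = 0, so that bin collects a vote from every point.
pts = [(t, t) for t in range(10)]
acc = hough_lines(pts)
```

The paper's point of departure is exactly the fragility of this binning: persistent features of a continuous score function replace accumulator peaks, making the detected lines stable under perturbation.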

Geometric Give and Take

from arXiv: Computational Geometry

Authors: Oswin Aichholzer, Katharina Klost, Kristin Knorr, Viola Mészáros, Josef Tkadlec

We consider a special, geometric case of a balancing game introduced by Spencer in 1977. Consider any arrangement $\mathcal{L}$ of $n$ lines in the plane, and assume that each cell of the arrangement contains a box. Alice initially places pebbles in each box. In each subsequent step, Bob picks a line, and Alice must choose a side of that line, remove one pebble from each box on that side, and add one pebble to each box on the other side. Bob wins if any box ever becomes empty. We determine the minimum number $f(\mathcal L)$ of pebbles, computable in polynomial time, for which Alice can prevent Bob from ever winning, and we show that $f(\mathcal L)=Θ(n^3)$ for any arrangement $\mathcal{L}$ of $n$ lines in general position.

Which Vertical Graphs are Non VPHT Reconstructible?

from arXiv: Computational Geometry

Authors: Jette Gutzeit, Kalani Kistler, Tim Ophelders, Anna Schenfisch

The verbose persistent homology transform (VPHT) is a topological summary of shapes in Euclidean space. Assuming general position, the VPHT is injective, meaning shapes can be reconstructed using only the VPHT. In this work, we investigate cases in which the VPHT is not injective, focusing on a simple setting of degeneracy: graphs whose vertices are all collinear. We identify both necessary properties and sufficient properties for non-reconstructibility of such graphs, bringing us closer to a complete classification.

Geometry and design of popup structures

from arXiv: Computational Geometry

Authors: Jay Jayeshbhai Chavda, S Ganga Prasath

Origami and Kirigami, the famous Japanese art forms of paper folding and cutting, have inspired the design of novel materials and structures utilizing their geometry. In this article, we explore the geometry of the lesser-known popup art, which uses the facilities of both origami and kirigami via appropriately positioned folds and cuts. The simplest popup-unit resembles a four-bar mechanism, whose cut-fold pattern can be arranged on a sheet of paper to produce different shapes upon deployment. Each unit has three parameters associated with the length and height of the cut, as well as the width of the fold. We define the mean and Gaussian curvature of the popup structure via the discrete surface connecting the fold vertices and develop a geometric description of the structure. Using these definitions, we arrive at a design pipeline that identifies the cut-fold pattern required to create a popup structure of a prescribed shape, which we test in experiments. By introducing splay to the rectangular unit-cell, a single cut-fold pattern is shown to take multiple shapes along the trajectory of deployment, making possible transitions from negative to positive curvature surfaces in a single structure. We demonstrate application directions for these structures in drag-reduction, packaging, and architectural facades.

Learning Functions of Halfspaces

from arXiv: Data Structures and Algorithms

Authors: Josh Alman, Shyamal Patel, Rocco A. Servedio

We give an algorithm that learns arbitrary Boolean functions of $k$ arbitrary halfspaces over $\mathbb{R}^n$, in the challenging distribution-free Probably Approximately Correct (PAC) learning model, running in time $2^{\sqrt{n} \cdot (\log n)^{O(k)}}$. This is the first algorithm that can PAC learn even intersections of two halfspaces in time $2^{o(n)}.$

A note on approximating the average degree of bounded arboricity graphs

from arXiv: Data Structures and Algorithms

Authors: Talya Eden, C. Seshadhri

Estimating the average degree of a graph is a classic problem in sublinear graph algorithms. Eden, Ron, and Seshadhri (ICALP 2017, SIDMA 2019) gave a simple algorithm for this problem whose running time depended on the graph arboricity, but the underlying simplicity and associated analysis were buried inside the main result. Moreover, the description there loses logarithmic factors because of parameter search. The aim of this note is to give a full presentation of this algorithm, without these losses. Consider standard access (vertex samples, degree queries, and neighbor queries) to a graph $G = (V,E)$ of arboricity at most $α$. Let $d$ denote the average degree of $G$. We describe an algorithm that gives a $(1+\varepsilon)$-approximation to $d$ using $O(\varepsilon^{-2}α/d)$ queries. For completeness, we modify the algorithm to get a $O(\varepsilon^{-2} \sqrt{n/d})$ query bound.
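The degree-ordering idea at the heart of such estimators can be illustrated with a toy sketch (our paraphrase of the standard trick, not the authors' algorithm): orient each edge toward its higher-degree endpoint, so that the oriented out-degrees sum to exactly $m$, and estimate their mean by uniform vertex sampling.

```python
import random

def approx_avg_degree(adj, num_samples, rng):
    """Toy sketch: sample vertices uniformly and count only the neighbors
    that are 'higher' in the degree ordering (ties broken by vertex id).
    Each edge is then counted from exactly one endpoint, so twice the
    sample mean of these counts estimates the average degree d = 2m/n."""
    n = len(adj)
    total = 0
    for _ in range(num_samples):
        v = rng.randrange(n)
        total += sum(1 for u in adj[v]
                     if (len(adj[u]), u) > (len(adj[v]), v))
    return 2.0 * total / num_samples

# Star K_{1,9}: n = 10, m = 9, so the average degree is 2*9/10 = 1.8.
adj = {0: list(range(1, 10)), **{i: [0] for i in range(1, 10)}}
est = approx_avg_degree(adj, 5000, random.Random(0))
```

In a bounded-arboricity graph, most vertices have few "higher" neighbors, which is what keeps the variance (and hence the query count) small in the $O(\varepsilon^{-2}α/d)$ analysis.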

Permutation Match Puzzles: How Young Tanvi Learned About Computational Complexity

from arXiv: Data Structures and Algorithms

Authors: Kshitij Gajjar, Neeldhara Misra

We study a family of sorting match puzzles on grids, which we call permutation match puzzles. In these puzzles, each row and column of an $n \times n$ grid is labeled with an ordering constraint -- ascending (A) or descending (D) -- and the goal is to fill the grid with the numbers 1 through $n^2$ such that each row and column respects its constraint. We provide a complete characterization of solvable puzzles: a puzzle admits a solution if and only if its associated constraint graph is acyclic, which translates to a simple "at most one switch" condition on the A/D labels. When solutions exist, we show that their count is given by a hook length formula. For unsolvable puzzles, we present an $O(n)$ algorithm to compute the minimum number of label flips required to reach a solvable configuration. Finally, we consider a generalization where rows and columns may specify arbitrary permutations rather than simple orderings, and establish that finding minimal repairs in this setting is NP-complete by a reduction from feedback arc set.
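The puzzle itself is easy to state in code. A brute-force solver over tiny grids (our own sketch from the problem statement, not the paper's acyclicity characterization) makes the solvable/unsolvable distinction concrete:

```python
from itertools import permutations

def solvable(row_labels, col_labels):
    """Brute-force check whether an n x n permutation match puzzle
    (each row/column labeled 'A' for ascending or 'D' for descending)
    admits a filling with 1..n^2. Exponential -- only for tiny n."""
    n = len(row_labels)

    def monotone(seq, lab):
        if lab == 'A':
            return all(a < b for a, b in zip(seq, seq[1:]))
        return all(a > b for a, b in zip(seq, seq[1:]))

    for p in permutations(range(1, n * n + 1)):
        grid = [p[i * n:(i + 1) * n] for i in range(n)]
        if all(monotone(grid[i], row_labels[i]) for i in range(n)) and \
           all(monotone([grid[i][j] for i in range(n)], col_labels[j])
               for j in range(n)):
            return True
    return False
```

For example, the all-ascending $2 \times 2$ puzzle is solvable (fill 1,2 / 3,4), while rows A,D with columns D,A force the cyclic chain $a<b<d<c<a$ and so admit no filling.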

The Theory and Practice of Computing the Bus-Factor

from arXiv: Data Structures and Algorithms

Authors: Sebastiano A. Piccolo, Pasquale De Meo, Giorgio Terracina, Gianluigi Greco

The bus-factor is a measure of project risk with respect to personnel availability, informally defined as the number of people whose sudden unavailability would cause a project to stall or experience severe delays. Despite its intuitive appeal, existing bus-factor measures rely on heterogeneous modeling assumptions, ambiguous definitions of failure, and domain-specific artifacts, limiting their generality, comparability, and ability to capture project fragmentation. In this paper, we develop a unified, domain-agnostic framework for bus-factor estimation by modeling projects as bipartite graphs of people and tasks and casting the computation of the bus-factor as a family of combinatorial optimization problems. Within this framework, we formalize and reconcile two complementary interpretations of the bus-factor, redundancy and criticality, corresponding to the Maximum Redundant Set and the Minimum Critical Set, respectively, and prove that both formulations are NP-hard. Building on this theoretical foundation, we introduce a novel bus-factor measure inspired by network robustness. Unlike prior approaches, the proposed measure captures both loss of coverage and increasing project fragmentation by tracking the largest connected set of tasks under progressive contributor removal. The resulting measure is normalized, threshold-free, and applicable across domains; we show that its exact computation is NP-hard as well. We further propose efficient linear-time approximation algorithms for all considered measures. Finally, we evaluate their behavior through a sensitivity analysis based on controlled perturbations of project structures, guided by expectations derived from project management theory. Our results show that the robustness-based measure behaves consistently with these expectations and provides a more informative and stable assessment of project risk than existing alternatives.

Complexity Lower Bounds of Small Matrix Multiplication over Finite Fields via Backtracking and Substitution

from arXiv: Data Structures and Algorithms

Authors: Chengu Wang

We introduce a new method for proving bilinear complexity lower bounds for matrix multiplication over finite fields. The approach combines the substitution method with a systematic backtracking search over linear restrictions on the first matrix $A$ in the product $AB = C^T$. We enumerate restriction classes up to symmetry; for each class we either obtain a rank lower bound by classical arguments or branch further via the substitution method. The search is organized by dynamic programming on the restricted matrix $A$. As an application we prove that the bilinear complexity of multiplying two $3 \times 3$ matrices over $\mathbb{F}_2$ is at least $20$, improving the longstanding lower bound of $19$ (Bläser 2003). The proof is found automatically within 1.5 hours on a laptop and verified in seconds.

Sliding Cubes in Parallel

from arXiv: Data Structures and Algorithms

Authors: Hugo A. Akitaya, Joseph Dorfer, Peter Kramer, Christian Rieck, Gabriel Shahrouzi, Frederick Stock

We study the classic sliding cube model for programmable matter under parallel reconfiguration in three dimensions, providing novel algorithmic and surprising complexity results in addition to generalizing the best known bounds from two to three dimensions. In general, the problem asks for reconfiguration sequences between two connected configurations of $n$ indistinguishable unit cube modules under connectivity constraints; a connected backbone must exist at all times. The makespan of a reconfiguration sequence is the number of parallel moves performed. We show that deciding the existence of such a sequence is NP-hard, even for constant makespan and if the two input configurations have constant-size symmetric difference, solving an open question in [Akitaya et al., ESA 25]. In particular, deciding whether the optimal makespan is 1 or 2 is NP-hard. We also show log-APX-hardness of the problem in sequential and parallel models, strengthening the APX-hardness claim in [Akitaya et al., SWAT 22]. Finally, we outline an asymptotically worst-case optimal input-sensitive algorithm for reconfiguration. The produced sequence has length that depends on the bounding box of the input configurations which, in the worst case, results in a $O(n)$ makespan.

The Li-Chao Tree: Algorithm Specification and Analysis

from arXiv: Data Structures and Algorithms

Authors: Chao Li

The Li-Chao tree (LICT) was first introduced in a lecture as an efficient data structure for dynamic lower envelope maintenance. In the years since, it has achieved widespread adoption within the competitive programming community, yet no formal specification has appeared in the peer-reviewed literature. This paper provides the definitive formalization of the Li-Chao tree, serving as both the official specification and an expansion of the original lecture. We present complete algorithmic specifications, establish formal correctness proofs, analyze theoretical complexity, and provide empirical performance characterization. The LICT offers distinct advantages in implementation simplicity, numerical stability, and extensibility to advanced variants such as persistence and line segments.
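For readers unfamiliar with the structure, a minimal Python sketch may help; the class and method names below are our own illustration, not the paper's specification. The tree maintains the lower envelope of lines $y = ax + b$ over an integer domain $[lo, hi)$, with logarithmic insertion and query:

```python
# Illustrative Li-Chao tree sketch (hypothetical API, not the paper's spec):
# minimum queries over lines y = a*x + b on integer x in [lo, hi).

class LiChaoTree:
    def __init__(self, lo, hi):
        self.lo, self.hi = lo, hi
        self.line = None                 # (a, b) stored at this node
        self.left = self.right = None

    @staticmethod
    def _eval(line, x):
        a, b = line
        return a * x + b

    def insert(self, line):
        lo, hi = self.lo, self.hi
        if self.line is None:
            self.line = line
            return
        mid = (lo + hi) // 2
        # Keep the line that wins at the midpoint; recurse with the loser.
        if self._eval(line, mid) < self._eval(self.line, mid):
            self.line, line = line, self.line
        if hi - lo == 1:
            return
        if self._eval(line, lo) < self._eval(self.line, lo):
            if self.left is None:
                self.left = LiChaoTree(lo, mid)
            self.left.insert(line)
        elif self._eval(line, hi - 1) < self._eval(self.line, hi - 1):
            if self.right is None:
                self.right = LiChaoTree(mid, hi)
            self.right.insert(line)

    def query(self, x):
        best = self._eval(self.line, x) if self.line else float("inf")
        mid = (self.lo + self.hi) // 2
        child = self.left if x < mid else self.right
        if child is not None:
            best = min(best, child.query(x))
        return best
```

The key invariant: each node stores the line that wins at its midpoint, so the displaced line can only win on one half of the interval, and insertion recurses into a single child.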

Improved Certificates for Independence Number in Semirandom Hypergraphs

from arXiv: Data Structures and Algorithms

Authors: Pravesh Kothari, Anand Louis, Rameesh Paul, Prasad Raghavendra

We study the problem of efficiently certifying upper bounds on the independence number of $\ell$-uniform hypergraphs. This is a notoriously hard problem, with efficient algorithms failing to approximate the independence number within $n^{1-ε}$ factor in the worst case [Has99, Zuc07]. We study the problem in random and semirandom hypergraphs. There is a folklore reduction to the graph case, achieving a certifiable bound of $O(\sqrt{n/p})$. More recently, the work [GKM22] improved this by constructing spectral certificates that yield a bound of $O(\sqrt{n}\cdot\mathrm{polylog}(n)/p^{1/(\ell/2)})$. We make two key improvements: first, we prove sharper bounds that remove the logarithmic factors in $n$ and nearly attain the conjectured optimal (in both $n$ and $p$) computational threshold of $O(\sqrt{n}/p^{1/\ell})$, and second, we design robust Sum-of-Squares (SoS) certificates, proving our bounds in the more challenging semirandom hypergraph model. Our analysis employs the proofs-to-algorithms paradigm [BS16, FKP19] in showing an upper bound for the pseudo-expectation of the degree-$2\ell$ SoS relaxation of the natural polynomial system for maximum independent set. The challenging case is odd-arity hypergraphs, where we employ a tensor-based analysis that reduces the problem to proving bounds on a natural class of random chaos matrices associated with $\ell$-uniform hypergraphs. Previous bounds [AMP21, RT23] have a logarithmic dependence, which we remove by leveraging recent progress on matrix concentration inequalities [BBvH23, BLNvH25]; we believe these may be useful in other hypergraph problems. As an application, we show our improved certificates can be combined with an SoS relaxation of a natural $r$-coloring polynomial system to recover an arbitrary planted $r$-colorable subhypergraph in a semirandom model along the lines of [LPR25], which allows for strong adversaries.

Distributed Algorithms for Euclidean Clustering

from arXiv: Data Structures and Algorithms

Authors: Vincent Cohen-Addad, Liudeng Wang, David P. Woodruff, Samson Zhou

We study the problem of constructing $(1+\varepsilon)$-coresets for Euclidean $(k,z)$-clustering in the distributed setting, where $n$ data points are partitioned across $s$ sites. We focus on two prominent communication models: the coordinator model and the blackboard model. In the coordinator model, we design a protocol that achieves a $(1+\varepsilon)$-strong coreset with total communication complexity $\tilde{O}\left(sk + \frac{dk}{\min(\varepsilon^4,\varepsilon^{2+z})} + dk\log(nΔ)\right)$ bits, improving upon prior work (Chen et al., NeurIPS 2016) by eliminating the need to communicate explicit point coordinates in-the-clear across all servers. In the blackboard model, we further reduce the communication complexity to $\tilde{O}\left(s\log(nΔ) + dk\log(nΔ) + \frac{dk}{\min(\varepsilon^4,\varepsilon^{2+z})}\right)$ bits, achieving better bounds than previous approaches while upgrading from constant-factor to $(1+\varepsilon)$-approximation guarantees. Our techniques combine new strategies for constant-factor approximation with efficient coreset constructions and compact encoding schemes, leading to optimal protocols that match both the communication costs of the best-known offline coreset constructions and existing lower bounds (Chen et al., NeurIPS 2016, Huang et al., STOC 2024), up to polylogarithmic factors.

Bayesian inference of planted matchings: Local posterior approximation and infinite-volume limit

from arXiv: Data Structures and Algorithms

Authors: Zhou Fan, Timothy L. H. Wee, Kaylee Y. Yang

We study Bayesian inference of an unknown matching $π^*$ between two correlated random point sets $\{X_i\}_{i=1}^n$ and $\{Y_i\}_{i=1}^n$ in $[0,1]^d$, under a critical scaling $\|X_i-Y_{π^*(i)}\|_2 \asymp n^{-1/d}$, in both an exact matching model where all points are observed and a partial matching model where a fraction of points may be missing. Restricting to the simplest setting of $d=1$, in this work, we address the questions of (1) whether the posterior distribution over matchings is approximable by a local algorithm, and (2) whether marginal statistics of this posterior have a well-defined limit as $n \to \infty$. We answer both questions affirmatively for partial matching, where a decay-of-correlations arises for large $n$. For exact matching, we show that the posterior is approximable locally only after a global sorting of the points, and that defining a large-$n$ limit of marginal statistics requires a careful indexing of points in the Poisson point process limit of the data, based on a notion of flow. We leave as an open question the extensions of such results to dimensions $d \geq 2$.

Sampling Colorings with Fixed Color Class Sizes

from arXiv: Data Structures and Algorithms

Authors: Aiya Kuchukova, Will Perkins, Xavier Povill

In 1970, Hajnal and Szemerédi proved a conjecture of Erdős that every graph with maximum degree $Δ$ has an equitable $(Δ+1)$-coloring; that is, a coloring whose color class sizes differ by at most $1$. In 2007, Kierstead and Kostochka reproved their result and provided a polynomial-time algorithm that produces such a coloring. In this paper we study the problem of approximately sampling uniformly random equitable colorings. A series of works gives polynomial-time sampling algorithms for colorings without the color class constraint, the latest improvement being by Carlson and Vigoda for $q\geq 1.809 Δ$. In this paper we give a polynomial-time sampling algorithm for equitable colorings when $q> 2Δ$. Moreover, our results extend to colorings with small deviations from equitable (and, as a corollary, establish their existence). The proof uses the framework of the geometry of polynomials for multivariate polynomials and, as a consequence, establishes a multivariate local central limit theorem for the color class sizes of uniformly random colorings.

Structured Gossip: A Partition-Resilient DNS for Internet-Scale Dynamic Networks

from arXiv: Data Structures and Algorithms

Authors: Priyanka Sinha, Dilys Thomas

Network partitions pose fundamental challenges to distributed name resolution in mobile ad-hoc networks (MANETs) and edge computing. Existing solutions either require active coordination that fails to scale, or use unstructured gossip with excessive overhead. We present \textit{Structured Gossip DNS}, exploiting DHT finger tables to achieve partition resilience through \textbf{passive stabilization}. Our approach reduces message complexity from $O(n)$ to $O(n/\log n)$ while maintaining $O(\log^2 n)$ convergence. Unlike active protocols requiring synchronous agreement, our passive approach guarantees eventual consistency through commutative operations that converge regardless of message ordering. The system handles arbitrary concurrent partitions via version vectors, eliminating global coordination and enabling billion-node deployments.
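To see why commutative merge operations give order-independent convergence, here is a small sketch of our own (not the paper's protocol): merging version vectors by pointwise maximum is commutative, associative, and idempotent, so replicas reach the same state regardless of message ordering or duplication.

```python
# Illustrative version-vector merge (our sketch, not the paper's protocol).
# A version vector maps replica id -> highest update counter seen from it.

def merge(vv_a, vv_b):
    """Pointwise maximum: commutative, associative, and idempotent."""
    keys = set(vv_a) | set(vv_b)
    return {k: max(vv_a.get(k, 0), vv_b.get(k, 0)) for k in keys}
```

Because `merge(a, b) == merge(b, a)` and `merge(a, a) == a`, gossiped updates can arrive in any order, any number of times, and all replicas still converge to the same vector; this is the essence of the eventual-consistency guarantee claimed above.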

Succinct QUBO formulations for permutation problems by sorting networks

from arXiv: Data Structures and Algorithms

Authors: Katalin Friedl, Levente Gegő, László Kabódi, Viktória Nemkin

Quadratic Unconstrained Binary Optimization (QUBO) is a standard NP-hard optimization problem. Recently, it has gained renewed interest through quantum computing, as QUBOs directly reduce to the Ising model, on which quantum annealing devices are based. We introduce a QUBO formulation for permutations using compare-exchange networks, with only $O(n \log^2 n)$ binary variables. This is a substantial improvement over the standard permutation matrix encoding, which requires $n^2$ variables and has a much denser interaction graph. A central feature of our approach is uniformity: each permutation corresponds to a unique variable assignment, enabling unbiased sampling. Our construction also allows additional constraints, including fixed points and parity. Moreover, it provides a representation of permutations that supports the operations multiplication and inversion, and also makes it possible to check the order of a permutation. This can be used to uniformly generate permutations of a given order or, for example, permutations that commute with a specified permutation. To our knowledge, this is the first result linking oblivious compare-exchange networks with QUBO encodings. While similar functionality can be achieved using permutation matrices, our method yields QUBOs that are both smaller and sparser. We expect this method to be practically useful in areas where unbiased sampling of constrained permutations is important, including cryptography and combinatorial design.
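The encoding idea can be illustrated with a toy example of ours (not the paper's construction, which additionally achieves uniqueness and uniform sampling via suitable $O(n \log^2 n)$-comparator networks): assign one binary variable per comparator of a compare-exchange network, and let each assignment apply or skip the corresponding swap, yielding a permutation.

```python
from itertools import product

# Toy illustration (not the paper's construction): one binary variable per
# comparator; each assignment applies or skips that swap.

def apply_network(n, comparators, bits):
    perm = list(range(n))
    for (i, j), b in zip(comparators, bits):
        if b:
            perm[i], perm[j] = perm[j], perm[i]
    return tuple(perm)

# Bubble-sort network on 3 wires: 3 comparators reach all 3! = 6 permutations
# (this naive network is neither unique nor uniform, unlike the paper's).
net = [(0, 1), (1, 2), (0, 1)]
perms = {apply_network(3, net, bits)
         for bits in product([0, 1], repeat=len(net))}
```

Here the 8 bit assignments cover all 6 permutations of 3 elements, but some permutations have two preimages; the paper's contribution is a network whose assignments are in bijection with permutations, enabling unbiased sampling.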

A symmetric recursive algorithm for mean-payoff games

from arXiv: Data Structures and Algorithms

Authors: Pierre Ohlmann

We propose a new deterministic symmetric recursive algorithm for solving mean-payoff games.

Approximating Tensor Network Contraction with Sketches

from arXiv: Data Structures and Algorithms

Authors: Mike Heddes, Igor Nunes, Tony Givargis, Alex Nicolau

Tensor network contraction is a fundamental mathematical operation that generalizes the dot product and matrix multiplication. It finds applications in numerous domains, such as database systems, graph theory, machine learning, probability theory, and quantum mechanics. Tensor network contractions are computationally expensive, in general requiring exponential time and space. Sketching methods include a number of dimensionality reduction techniques that are widely used in the design of approximation algorithms. The existing sketching methods for tensor network contraction, however, only support acyclic tensor networks. We present the first method capable of approximating arbitrary tensor network contractions, including those of cyclic tensor networks. Additionally, we show that the existing sketching methods require a computational complexity that grows exponentially with the number of contractions. We present a second method, for acyclic tensor networks, whose space and time complexity depends only polynomially on the number of contractions.
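To make the "generalizes the dot product and matrix multiplication" point concrete, here is a pure-Python sketch of ours (illustrative only, not one of the paper's algorithms): contracting the chain $A_{ij} B_{jk} C_{kl}$ over the shared indices $j$ and $k$ is just a nested sum, and the naive cost of such sums grows exponentially with the number of contracted indices.

```python
# Illustrative tensor network contraction (our sketch, not the paper's method):
# contract A_{ij} B_{jk} C_{kl} over shared indices j, k to get a matrix D_{il}.

def contract_chain(A, B, C):
    n, m = len(A), len(C[0])
    p, q = len(B), len(C)
    return [[sum(A[i][j] * B[j][k] * C[k][l]
                 for j in range(p) for k in range(q))
             for l in range(m)]
            for i in range(n)]
```

With both outer tensors set to the identity, the contraction returns the middle factor, matching the matrix-product special case.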

A Class of Unrooted Phylogenetic Networks Inspired by the Properties of Rooted Tree-Child Networks

from arXiv: Data Structures and Algorithms

Authors: Leo van Iersel, Mark Jones, Simone Linz, Norbert Zeh

A directed phylogenetic network is tree-child if every non-leaf vertex has a child that is not a reticulation. As a class of directed phylogenetic networks, tree-child networks are very useful from a computational perspective. For example, several computationally difficult problems in phylogenetics become tractable when restricted to tree-child networks. At the same time, the class itself is rich enough to contain quite complex networks. Furthermore, checking whether a directed network is tree-child can be done in polynomial time. In this paper, we seek a class of undirected phylogenetic networks that is rich and computationally useful in a similar way to the class of tree-child directed networks. A natural class to consider for this role is the class of tree-child-orientable networks, which contains all those undirected phylogenetic networks whose edges can be oriented to create a tree-child network. However, we show here that recognizing such networks is NP-hard, even for binary networks, and as such this class is inappropriate for this role. Towards finding a class of undirected networks that fills a similar role to directed tree-child networks, we propose new classes called $q$-cuttable networks, for any integer $q\geq 1$. We show that these classes have many of the desirable properties, similar to tree-child networks in the rooted case, including being recognizable in polynomial time, for all $q\geq 1$. Towards showing the computational usefulness of the class, we show that the NP-hard problem Tree Containment is polynomial-time solvable when restricted to $q$-cuttable networks with $q\geq 3$.

Multi-Agent Reinforcement Learning with Submodular Reward

from arXiv: Data Structures and Algorithms

Authors: Wenjing Chen, Chengyuan Qian, Shuo Xing, Yi Zhou, Victoria Crawford

In this paper, we study cooperative multi-agent reinforcement learning (MARL) where the joint reward exhibits submodularity, which is a natural property capturing diminishing marginal returns when adding agents to a team. Unlike standard MARL with additive rewards, submodular rewards model realistic scenarios where agent contributions overlap (e.g., multi-drone surveillance, collaborative exploration). We provide the first formal framework for this setting and develop algorithms with provable guarantees on sample efficiency and regret bound. For known dynamics, our greedy policy optimization achieves a $1/2$-approximation with polynomial complexity in the number of agents $K$, overcoming the exponential curse of dimensionality inherent in joint policy optimization. For unknown dynamics, we propose a UCB-based learning algorithm achieving a $1/2$-regret of $O(H^2KS\sqrt{AT})$ over $T$ episodes.

Monday, March 09

TR26-037 | A Note on the Equivalence Between Zero-knowledge and Quantum CSS Codes | Noga Ron-Zewi, Mor Weiss

from ECCC Papers

Zero-knowledge codes, introduced by Decatur, Goldreich, and Ron (ePrint 1997), are error-correcting codes in which few codeword symbols reveal no information about the encoded message, and have been extensively used in cryptographic constructions. Quantum CSS codes, introduced by Calderbank and Shor (Phys. Rev. A 1996) and Steane (Royal Society A 1996), are error-correcting codes that allow for quantum error correction, and are also useful for applications in quantum complexity theory. In this short note, we show that (linear, perfect) zero-knowledge codes and quantum CSS codes are equivalent. We demonstrate the potential of this equivalence by using it to obtain explicit asymptotically-good zero-knowledge locally-testable codes.

Minimal Updates

from Ben Recht

A few argmin announcements from the road

This week, I’m attending a workshop at NYU organized by Tyler Shoemaker and Leif Weatherby called “Cultural AI: An Emerging Field.” What is Cultural AI? I’m not sure yet, but this workshop is going to find out. At this point in my career, the invite list is far more important than the planned topic, and this list is top notch. My brief talk will be about the culture of benchmarking in machine learning and how weird it is now that we’re trying to benchmark culture. I’ll let you know what the group synthesizes later this week.

In other exciting news, The Irrational Decision comes out tomorrow! Get it wherever you buy books. You can even get it in one of those atavistic bindings of printed pages wrapped in a pretty dust cover. I hear they look good in the background of your podcast.

I should say something longer about the book, but I want to see how release day moves me. So until tomorrow, here’s what I wrote about it last fall. Technology Review wrote a nice review that paired it with two other great books, The Means of Prediction: How AI Really Works (and Who Benefits) by Max Kasy and Prophecy: Prediction, Power, and the Fight for the Future, from Ancient Oracles to AI by Carissa Véliz. I need to read both of these. In the interdisciplinary journal Critical AI, Jathan Sadowski further contextualized the book within the broader sphere of science studies.

Finally, for my friends and readers in the Bay Area, the folks at Bloomberg Beta and Reboot are sponsoring a launch event for The Irrational Decision on March 17 in San Francisco. If you’re in the area and interested, you can find more information and register here. Space is limited, but it would be great to meet you in person.

By Ben Recht

Intrinsic Information Flow in Structureless NP Search

from arXiv: Computational Complexity

Authors: Jing-Yuan Wei

We reinterpret NP witness discovery through an information-theoretic lens. Rather than measuring search solely by Turing-machine time, we treat recovery as an information-acquisition process: the hidden witness is the sole source of uncertainty, and identification requires reducing this uncertainty through a rate-limited access interface in the sense of Shannon. To make this perspective explicit, we analyze an extreme regime, the \emph{psocid model}, in which the witness is accessible only via equality probes $[π= w^\star]$ under a uniform, structureless prior. Each probe reveals at most $O(N/2^N)$ bits of mutual information, so polynomially many probes accumulate only $o(1)$ total information. By Fano's inequality, reliable recovery requires $Ω(N)$ bits, creating a fundamental mismatch between required and obtainable information. The psocid setting thus isolates a fully symmetric search regime in which no intermediate computation yields global eliminative leverage, thereby exposing an informational origin of exponential search complexity.

Classical simulability of quantum circuits followed by sparse classical post-processing

from arXiv: Computational Complexity

Authors: Yasuhiro Takahashi, Masayuki Miyamoto, Noboru Kunihiro

We study the classical simulability of a polynomial-size quantum circuit $C_n$ on $n$ qubits followed by sparse classical post-processing (SCP) on $m$ bits, where $m \leq n \leq {\rm poly}(m)$. The SCP is described by a non-zero Boolean function $f_m$ that is classically computable in polynomial time and is sparse, i.e., has a peaked Fourier spectrum. First, we provide a necessary and sufficient condition on $C_n$ such that, for any SCP $f_m$, $C_n$ followed by $f_m$ is classically simulable. This characterization extends the result of Van den Nest and implies that various quantum circuits followed by SCP are classically simulable. Examples include IQP circuits, Clifford Magic circuits, and the quantum part of Simon's algorithm, even though these circuits alone are hard to simulate classically. Then, we consider the case where $C_n$ has constant depth $d$. While it is unlikely that, for any SCP $f_m$, $C_n$ followed by $f_m$ is classically simulable, we show that it is simulable by a polynomial-time probabilistic algorithm with access to commuting quantum circuits on $n+1$ qubits. Each such circuit consists of at most deg($f_m$) commuting gates and each commuting gate acts on at most $2^d+1$ qubits, where deg($f_m$) is the Fourier degree of $f_m$. This provides a better understanding of the hardness of simulating constant-depth quantum circuits followed by SCP.
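The sparsity notion in the abstract concerns the Fourier spectrum of $f_m$. As a toy illustration (not the paper's construction), the brute-force sketch below computes the Fourier expansion of a small Boolean function over the characters $\chi_S(x) = (-1)^{|S \cap x|}$ and reads off its Fourier degree; the 2-bit AND example is an arbitrary choice.

```python
from itertools import product

def fourier_coefficients(f, m):
    """Fourier expansion of f: {0,1}^m -> R over the characters
    chi_S(x) = (-1)^{|S ∩ x|}; returns a dict {S: f_hat(S)}."""
    coeffs = {}
    for S in product((0, 1), repeat=m):
        total = 0.0
        for x in product((0, 1), repeat=m):
            sign = (-1) ** sum(s & b for s, b in zip(S, x))
            total += f(x) * sign
        coeffs[S] = total / 2 ** m
    return coeffs

def fourier_degree(coeffs, tol=1e-12):
    """Largest |S| carrying a non-negligible Fourier coefficient."""
    return max((sum(S) for S, c in coeffs.items() if abs(c) > tol), default=0)

and2 = lambda x: x[0] & x[1]              # 2-bit AND as a toy f_m
coeffs_and = fourier_coefficients(and2, 2)
deg = fourier_degree(coeffs_and)          # Fourier degree of AND on 2 bits is 2
```

The quantity `deg` plays the role of deg($f_m$) in the abstract's bound on the number of commuting gates.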

Three Fixed-Dimension Satisfiability Semantics for Quantum Logic: Implications and an Explicit Separator

from arXiv: Computational Complexity

Authors: Joaquim Reizi Higuchi

We compare three satisfiability notions for propositional formulas in the language {not, and, or} over a fixed finite-dimensional Hilbert space H=F^d with F in {R, C}. The first is the standard Hilbert-lattice semantics on the subspace lattice L(H), where meet and join are total operations. The second is a global commuting-projector semantics, where all atoms occurring in the formula are interpreted by a single pairwise-commuting projector family. The third is a local partial-Boolean semantics, where binary connectives are defined only on commeasurable pairs and definedness is checked nodewise along the parse tree. We prove, for every fixed d >= 1, Sat_COM^d(phi) implies Sat_PBA^d(phi) implies Sat_STD^d(phi) for every formula phi. We then exhibit the explicit formula SEP-1 := (p and (q or r)) and not((p and q) or (p and r)) which is satisfiable in the standard semantics for every d >= 2, but unsatisfiable under both the global commuting and the partial-Boolean semantics. Consequently, for every d >= 2, the satisfiability classes satisfy SAT_COM^d subseteq SAT_PBA^d subset SAT_STD^d and SAT_COM^d subset SAT_STD^d, while the exact relation between SAT_COM^d and SAT_PBA^d remains open. The point of the paper is semantic comparison, not a new feasibility reduction or a generic translation theorem.
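One can check numerically that SEP-1 is satisfiable in the standard subspace-lattice semantics already at d = 2. The sketch below (an illustrative verification of my own, not code from the paper) interprets p, q, r as three distinct lines in R^2, implements join as span of unions, negation as orthogonal complement, and meet via De Morgan; the concrete line directions are arbitrary.

```python
import numpy as np

def basis(M, tol=1e-9):
    """Orthonormal basis (columns) for the column space of M."""
    if M.shape[1] == 0:
        return M
    U, s, _ = np.linalg.svd(M, full_matrices=False)
    return U[:, s > tol]

def join(A, B):
    """Lattice join: smallest subspace containing both column spans."""
    return basis(np.hstack([A, B]))

def comp(A, d):
    """Orthogonal complement of span(A) in R^d."""
    if A.shape[1] == 0:
        return np.eye(d)
    U, s, _ = np.linalg.svd(A, full_matrices=True)
    return U[:, int(np.sum(s > 1e-9)):]

def meet(A, B, d):
    """Lattice meet via De Morgan: U ∧ V = ¬(¬U ∨ ¬V)."""
    return comp(join(comp(A, d), comp(B, d)), d)

d = 2
p = np.array([[1.0], [0.0]])                 # three distinct lines in R^2
q = np.array([[0.0], [1.0]])
r = np.array([[1.0], [1.0]]) / np.sqrt(2)

left = meet(p, join(q, r), d)                # p ∧ (q ∨ r) = p, dimension 1
right = join(meet(p, q, d), meet(p, r, d))   # (p∧q) ∨ (p∧r) = zero subspace
sep1 = meet(left, comp(right, d), d)         # SEP-1 is a nonzero subspace
```

Since `sep1` is one-dimensional rather than the zero subspace, SEP-1 is satisfied by this assignment, matching the abstract's claim for every d >= 2.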

Recognizing Subgraphs of Regular Tilings

from arXiv: Computational Geometry

Authors: Eliel Ingervo, Sándor Kisfaludi-Bak

For $p,q\ge2$ the $\{p,q\}$-tiling graph is the (finite or infinite) planar graph $T_{p,q}$ where all faces are cycles of length $p$ and all vertices have degree $q$. We give algorithms for the problem of recognizing (induced) subgraphs of these graphs, as follows. - For $1/p+1/q>1/2$, these graphs correspond to regular tilings of the sphere. These graphs are finite, thus recognizing their (induced) subgraphs can be done in constant time. - For $1/p+1/q=1/2$, these graphs correspond to regular tilings of the Euclidean plane. For the Euclidean square grid $T_{4,4}$ Bhatt and Cosmadakis (IPL'87) showed that recognizing subgraphs is NP-hard, even if the input graph is a tree. We show that a simple divide-and-conquer algorithm achieves a subexponential running time in all Euclidean tilings, and we observe that there is an almost matching lower bound in $T_{4,4}$ under the Exponential Time Hypothesis via known reductions. - For $1/p+1/q<1/2$, these graphs correspond to regular tilings of the hyperbolic plane. As our main contribution, we show that deciding if an $n$-vertex graph is isomorphic to a subgraph of the tiling $T_{p,q}$ can be done in quasi-polynomial ($n^{O(\log n)}$) time for any fixed $q$. Our results for the hyperbolic case show that it has significantly lower complexity than the Euclidean variant, and it is unlikely to be NP-hard. The Euclidean results also suggest that the problem can be maximally hard even if the graph in question is a tree. Consequently, the known treewidth bounds for subgraphs of hyperbolic tilings do not lead to an efficient algorithm by themselves. Instead, we use convex hulls within the tiling graph, which have several desirable properties in hyperbolic tilings. Our key technical insight is that planar subgraph isomorphism can be computed via a dynamic program that builds a sphere cut decomposition of a solution subgraph's convex hull.
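The three regimes in the abstract are determined by the sign of $1/p + 1/q - 1/2$. A minimal classifier, using exact rational arithmetic (the function name is my own):

```python
from fractions import Fraction

def tiling_geometry(p, q):
    """Classify the {p,q} tiling by comparing 1/p + 1/q with 1/2."""
    s = Fraction(1, p) + Fraction(1, q)
    if s > Fraction(1, 2):
        return "spherical"    # finite graph: recognition in constant time
    if s == Fraction(1, 2):
        return "euclidean"    # e.g. the square grid T_{4,4}
    return "hyperbolic"       # the quasi-polynomial-time regime
```

For example, `{3,5}` (the icosahedron) is spherical, `{4,4}` is Euclidean, and `{7,3}` is hyperbolic.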

Efficient Neighbourhood Search in 3D Point Clouds Through Space-Filling Curves and Linear Octrees

from arXiv: Computational Geometry

Authors: Pablo D. Viñambres, Miguel Yermo, Silvia R. Alcaraz, Oscar G. Lorenzo, Francisco F. Rivera, José C. Cabaleiro

This work presents an efficient approach for neighbourhood searching in 3D point clouds by combining spatial reordering leveraging Space-Filling Curves (SFC), specifically Morton and Hilbert curves, with a linear Octree implementation. We also propose specialised search algorithms for fixed-radius and kNN queries, based on our linear Octree structures. Additionally, we introduce the novel concept of the kNN locality histogram, which can be easily computed to characterise locality in data accesses and which we found to be directly related to cache misses and search performance. Our experiments reveal that SFC reordering significantly improves access to spatial data, reducing the number of cache misses by 25% to 75% and runtime by up to 50%. Moreover, we compare our proposal with several widely used Octree and KDTree implementations. Our method achieves a significant reduction in search time, up to 10$\times$ faster than existing solutions. Additionally, we analysed the performance of our neighbourhood searches (parallelised using OpenMP), demonstrating high scalability with the number of cores and the problem size. Notably, we observed a speedup of up to $36\times$ when executing fixed-radius searches in a system with 40 cores. The results obtained indicate that our methods provide a robust and efficient solution for applications that require fast access to large-scale 3D point neighbour sets.
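Morton (Z-order) reordering, one of the two SFCs used, interleaves the bits of the three coordinates so that points close in space tend to land close together in the sorted order. A standard 10-bits-per-axis encoder (a generic sketch of the technique, not the paper's implementation):

```python
def part1by2(x):
    """Spread the low 10 bits of x so consecutive bits land 3 positions apart."""
    x &= 0x3FF
    x = (x | (x << 16)) & 0x030000FF
    x = (x | (x << 8))  & 0x0300F00F
    x = (x | (x << 4))  & 0x030C30C3
    x = (x | (x << 2))  & 0x09249249
    return x

def morton3(x, y, z):
    """Interleave three 10-bit coordinates into one 30-bit Z-order key."""
    return part1by2(x) | (part1by2(y) << 1) | (part1by2(z) << 2)

# Spatial reordering: sort points by their Morton key.
points = [(3, 1, 0), (0, 0, 1), (1, 1, 1)]
reordered = sorted(points, key=lambda p: morton3(*p))
```

Sorting a point cloud by this key is the cache-friendly reordering step; a linear octree can then address nodes directly by key prefixes.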

Transversal Rank, Conformality and Enumeration

from arXiv: Data Structures and Algorithms

Authors: Martin Schirneck

The transversal rank of a hypergraph is the maximum size of its minimal hitting sets. Deciding, for an $n$-vertex, $m$-edge hypergraph and an integer $k$, whether the transversal rank is at least $k$ takes time $O(m^{k+1} n)$ with an algorithm that has been known since the 70s. It essentially matches an $(m+n)^{Ω(k)}$ ETH-lower bound by Araújo, Bougeret, Campos, and Sau [Algorithmica 2023] and Dublois, Lampis, and Paschos [TCS 2022]. Many hypergraphs seen in practice have many more edges than vertices, $m \gg n$. This raises the question of whether an improvement of the run-time dependency on $m$ can be traded for an increase in the dependency on $n$. Our first result is an algorithm to recognize hypergraphs with transversal rank at least $k$ in time $O(Δ^{k-2} mn^{k-1})$, where $Δ\le m$ is the maximum degree. Our main technical contribution is a ``look-ahead'' method that allows us to find higher-order extensions, minimal hitting sets that augment a given set with at least two more vertices. We show that this method can also be used to enumerate all minimal hitting sets of a hypergraph with transversal rank $k^*$ with delay $O(Δ^{k^*-1} mn^2)$. We then explore the possibility of further reducing the running time for computing the transversal rank to $\textsf{poly}(m) \cdot n^{k+O(1)}$. This turns out to be equivalent to several breakthroughs in combinatorial algorithms and enumeration. Among other things, such an improvement is possible if and only if $k$-conformal hypergraphs can also be recognized in time $\textsf{poly}(m) \cdot n^{k+O(1)}$, and iff the maximal hypercliques/independent sets of a uniform hypergraph can be enumerated with incremental delay.
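For intuition about the objects involved, here is a brute-force reference implementation (exponential time, for tiny instances only; not the paper's algorithm) that enumerates inclusion-minimal hitting sets and returns the transversal rank:

```python
from itertools import combinations

def is_hitting(H, edges):
    """H hits every edge iff it intersects each one."""
    return all(H & e for e in edges)

def minimal_hitting_sets(vertices, edges):
    """Enumerate inclusion-minimal hitting sets by increasing size;
    a hitting set found now is minimal iff no smaller one is contained in it."""
    out = []
    for k in range(len(vertices) + 1):
        for combo in combinations(vertices, k):
            H = set(combo)
            if is_hitting(H, edges) and not any(s <= H for s in out):
                out.append(H)
    return out

def transversal_rank(vertices, edges):
    """Maximum size of a minimal hitting set."""
    return max(len(H) for H in minimal_hitting_sets(vertices, edges))
```

For a star with edges {a,b}, {a,c}, {a,d}, the minimal hitting sets are {a} and {b,c,d}, so the transversal rank is 3; for a triangle it is 2.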

Forwarding Packets Greedily

from arXiv: Data Structures and Algorithms

Authors: Joan Boyar, Lene M. Favrholdt, Kim S. Larsen, Kevin Schewior, Rob van Stee

We consider the problem of forwarding packets arriving online with their destinations in a line network. In each time step, each router can forward one packet along the edge to its right. Each packet that is forwarded arrives at the next router one time step later. Packets are forwarded until they reach their destination. The flow time of a packet is the difference between its release time and the time of its arrival at its destination. The goal is to minimize the maximum flow time. This problem was introduced by Antoniadis et al.~in 2014. They propose a collection of natural algorithms and prove for one, and claim for others, that none of them are $O(1)$-competitive. It was posed as an open problem whether such an algorithm exists. We make the first progress on answering this question. We consider the special case where each packet needs to be forwarded by exactly one or two routers. We show that a greedy algorithm, which was not previously considered for this problem, achieves a competitive ratio of exactly $2-2^{1-k}$, where $k$ is the number of active routers in the network. We also give a general lower bound of $4/3$, even for randomized algorithms.
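The model in the abstract is easy to simulate. The sketch below uses FIFO at each router as a simple stand-in scheduling rule (the abstract does not specify the paper's greedy rule) and reports the maximum flow time; the router numbering and packet format are my own conventions.

```python
from collections import deque

def max_flow_time(packets, num_routers):
    """Simulate a line of routers 0..num_routers: in each time step every
    router forwards one queued packet to its right neighbour, and the packet
    arrives there one step later. packets: list of (release, source, dest)
    with source < dest <= num_routers. Returns the maximum flow time."""
    queues = [deque() for _ in range(num_routers + 1)]
    pending = sorted(packets)
    worst, t, done = 0, 0, 0
    while done < len(packets):
        while pending and pending[0][0] <= t:     # release newly arrived packets
            rel, src, dst = pending.pop(0)
            queues[src].append((rel, dst))
        arrivals = []
        for i in range(num_routers):              # each router forwards one packet
            if queues[i]:
                arrivals.append((i + 1, queues[i].popleft()))
        for j, (rel, dst) in arrivals:            # packets land one step later
            if j == dst:
                worst = max(worst, t + 1 - rel)
                done += 1
            else:
                queues[j].append((rel, dst))
        t += 1
    return worst
```

Two packets released simultaneously at the same router with the same next-hop destination already force a maximum flow time of 2, since only one can move per step.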

Agnostic learning in (almost) optimal time via Gaussian surface area

from arXiv: Data Structures and Algorithms

Authors: Lucas Pesenti, Lucas Slot, Manuel Wiedmer

The complexity of learning a concept class under Gaussian marginals in the difficult agnostic model is closely related to its $L_1$-approximability by low-degree polynomials. For any concept class with Gaussian surface area at most $Γ$, Klivans et al. (2008) show that degree $d = O(Γ^2 / \varepsilon^4)$ suffices to achieve an $\varepsilon$-approximation. This leads to the best-known bounds on the complexity of learning a variety of concept classes. In this note, we improve their analysis by showing that degree $d = \tilde O (Γ^2 / \varepsilon^2)$ is enough. In light of lower bounds due to Diakonikolas et al. (2021), this yields (near) optimal bounds on the complexity of agnostically learning polynomial threshold functions in the statistical query model. Our proof relies on a direct analogue of a construction of Feldman et al. (2020), who considered $L_1$-approximation on the Boolean hypercube.
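Spelled out, the improvement is in the exponent of the $\varepsilon$-dependence. In the abstract's notation:

```latex
d_{\mathrm{KOS}} = O\!\left(\frac{\Gamma^{2}}{\varepsilon^{4}}\right)
\qquad\longrightarrow\qquad
d_{\mathrm{new}} = \tilde{O}\!\left(\frac{\Gamma^{2}}{\varepsilon^{2}}\right).
```

So for a class with $\Gamma = O(1)$, such as a single halfspace (whose Gaussian surface area is a constant), the required degree drops from $O(1/\varepsilon^{4})$ to $\tilde{O}(1/\varepsilon^{2})$; since the standard $L_1$ polynomial-regression approach to agnostic learning runs in time $n^{O(d)}$ (background not stated in the abstract), the exponent of the running time improves accordingly.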

How to Sort in a Refrigerator: Simple Entropy-Sensitive Strictly In-Place Sorting Algorithms

from arXiv: Data Structures and Algorithms

Authors: Ofek Gila, Michael T. Goodrich, Vinesh Sridhar

While modern general-purpose computing systems have ample amounts of memory, it is still the case that embedded computer systems, such as in a refrigerator, are memory limited; hence, such embedded systems motivate the need for strictly in-place algorithms, which use only O(1) additional memory besides that used for the input. In this paper, we provide the first comparison-based sorting algorithms that are strictly in-place and have a running time that is optimal in terms of the run-based entropy, H(A), of an input array, A, of size n. In particular, we describe two remarkably simple paradigms for implementing stack-based natural mergesort algorithms to be strictly in-place in O(n(1 + H(A))) time.
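The run-based entropy in question is computed from the lengths $r_1, \dots, r_k$ of the maximal non-decreasing runs of $A$ as $H(A) = \sum_i (r_i/n)\log_2(n/r_i)$; this is the standard definition used in run-adaptive sorting, though the paper's exact notation may differ. A minimal computation:

```python
import math

def runs(a):
    """Lengths of the maximal non-decreasing runs of a."""
    if not a:
        return []
    lengths, cur = [], 1
    for x, y in zip(a, a[1:]):
        if y >= x:
            cur += 1
        else:
            lengths.append(cur)
            cur = 1
    lengths.append(cur)
    return lengths

def run_entropy(a):
    """H(A) = sum over run lengths r of (r/n) * log2(n/r)."""
    n = len(a)
    return sum(r / n * math.log2(n / r) for r in runs(a))
```

An already-sorted array is a single run with $H(A) = 0$, so the $O(n(1+H(A)))$ bound degrades gracefully from linear time on sorted input to $O(n \log n)$ in the worst case.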

Space-efficient B-tree Implementation for Memory-Constrained Flash Embedded Devices

from arXiv: Data Structures and Algorithms

Authors: Nadir Ould-Khessal, Scott Fazackerley, Ramon Lawrence

Small devices collecting data for agricultural, environmental, and industrial monitoring enable Internet of Things (IoT) applications. Given their critical role in data collection, there is a need for optimizations to improve on-device data processing. Edge device computing allows processing of the data closer to where it is collected and reduces the amount of network transmissions. The B-tree has been optimized for flash storage on servers and solid-state drives, but these optimizations often require hardware and memory resources not available on embedded devices. The contribution of this work is the development and experimental evaluation of multiple variants for B-trees on memory-constrained embedded devices. Experimental results demonstrate that even the smallest devices can perform efficient B-tree indexing, and there is a significant performance advantage for using storage-specific optimizations.

Sunday, March 08

How does AI do on Baseball-Brothers-Pitchers

from Computational Complexity

In my graduate Ramsey Theory class I taught Kruskal's tree theorem (KTT) which was proven by Joe Kruskal in his PhD thesis in 1960.  (Should that be in a graduate Ramsey Theory class? There are not enough people teaching such a course to get a debate going.) A simpler proof was discovered (invented?) by Nash-Williams in 1963.

The theorem is that the set of trees under the homeomorphism ordering is a well quasi order.

But this blog post is not about well quasi orderings. It's about baseball brothers and AI.

The Kruskals are one of the best math families of all time. See my post on math families. The Bernoullis are another great math family. What makes both families so great is that they had at least THREE great math people. Most have two.

Having taught the KTT and talked briefly about math families, I was curious how ChatGPT would do on the better-defined question of the largest number of wins by a pair of brothers in baseball. So I asked my students to look that up and include which tools they used, as a HW problem (worth 0 points, but they had to do it).

I wrote up 9 pages on what the answer is (there are some issues) and what the students' answers were. See my write up.

In case those 9 pages are tl;dr, here are the main takeaways:

1) 7 of the answers given were just WRONG no matter how you look at it.

2) 7 gave answers involving the Clarksons, who played in the 1880's (so they don't count as modern baseball), or involving three brothers rather than a pair, or both. Even so, one could argue these answers are correct.

3) 1 got it right. (That's   one got it right  not  I, bill g, got it right. It can be hard to distinguish the numeral for one, the letter capital i, and the letter small L. I blogged about L vs I here in the context of Weird AI vs Weird AL).


4) There were 13 different answers, which I find amazing.

As usual, when I study some odd issue, I learn a few other things of interest, at least to me, which may also have life lessons, though YMMV. 

a) Around 85% of pitchers in the Hall of Fame have won over 200 games. Dizzy Dean (his brother was also a pitcher, which is why I was looking at this) got into the Hall of Fame with only 150 wins. Why? For 6 years he was the most dominant pitcher in the game. In addition, (1) there was some sympathy for him since his career was cut short by an injury he got in an All-Star game, and (2) he had a colorful personality and people liked him. The four lowest win totals for Hall of Fame pitchers are: Candy Cummings (W-L 145-94) [he is said to have invented the curveball], Dizzy Dean (W-L 150-83), Addie Joss (W-L 160-97) [he only played 9 years, which is less than the 10 needed for eligibility in the HOF, but he died so he was given an exception], and Sandy Koufax (W-L 165-87) [he was dominant, like Dean, for a short time]. Sandy is the only one who is still alive. (W-L means Win-Loss. For example, Candy won 145 games and lost 94.)


b) Modern baseball starts in 1900. But this is arbitrary. In 1904 Jack Chesbro won 41 games, which would never happen in modern baseball. But you have to draw the line someplace. History is hard because there are fuzzy boundaries.

c) I had not known about the Clarkson brothers. 

d) My interest in this subject goes back to 1973. In that year I heard the following during a baseball game and, ever since I began blogging, I wondered how I could fit it into a blog post:


An old baseball trick question is now gone!  Just last week [in 1973] if you asked

What pair of brothers in baseball won the most games?

the answer was Christy and Henry Mathewson. Christy played for 16 seasons and had a W-L record of 373-188, while his brother Henry played in 3 games and had a W-L record of 0-1. So their total number of wins is 373 which was the record.

But this last week [in 1973] the Perry brothers, Gaylord and Jim, got 374 wins between them.  Hence the question

What pair of brothers in baseball won the most games?

is no longer a trick question since both Perrys are fine pitchers [Gaylord made it into the Hall of Fame; Jim didn't, but Jim was still quite good.]

As you will see in my writeup, the Perrys' record was broken by the Niekros.

e) Christy and Henry are a trick answer because Henry wasn't much of a player with his 0-1 W-L record. Are there other pairs like that? Greg (355 wins) and Mike (39 wins) Maddux might seem that way but they are not. While Mike only had 39 wins he had a 14-year career as a relief pitcher. Such pitchers can be valuable to the team, but because of their role they do not get many wins.



So the usual question: Why did AI get this one wrong? Lance will say that the paid version wouldn't have, and he may be right. David in Tokyo might say that ChatGPT is a random word generator. Others tell me that ChatGPT does not understand the questions it is asked (my proofreader thinks this is obvious). I'll add that the pitcher-brother question has more ambiguity than I thought: do you count pitchers before 1900? Do you count a brother who won 0 games? What about 3 brothers (not the restaurant)?

By gasarch

In my graduate Ramsey Theory class I taught Kruskal's tree theorem (KTT) which was proven by Joe Kruskal in his PhD thesis in 1960.  (Should that be in a graduate Ramsey Theory class? There are not enough people teaching such a course to get a debate going.) A simpler proof was discovered (invented?) by Nash-Williams in 1963.

The theorem is that the set of trees under the homeomorphism ordering is a well quasi order.

But this blog post is not about well quasi orderings. It's about baseball brothers and AI.

The Kruskals are one of the best math families of all time. See my post on math families.The Bernoullis are another great math family.  What makes both families so great is that they had at least THREE great math people. Most have two.

Having taught the KTT and talked briefly about math families, I was curious how ChatGPT would do on the better-defined question of largest number of wins by a pair of brothers in baseball.  So I asked my students to look that up and include which tools they used,  as a HW problem  (worth 0 points but they had to do it). 

I wrote up 9 pages on what the answer is (there are some issues) and what the students' answers were. See my write up.

In case those 9 pages are tl;dr, here are the main takeaways

1) 7 of the answers given were just WRONG no matter how you look at it.

2) 7 had either the Clarksons who played in the 1880's, so don't count as modern baseball, or there were three of them, or both.  Even so, one could argue these are correct

3) 1 got it right. (That's   one got it right  not  I, bill g, got it right. It can be hard to distinguish the numeral for one, the letter capital i, and the letter small L. I blogged about L vs I here in the context of Weird AI vs Weird AL).


4) There were 13 different answers, which I find amazing.

As usual, when I study some odd issue, I learn a few other things of interest, at least to me, which may also have life lessons, though YMMV. 

a) Around 85% of pitchers in the Hall of Fame have won over 200 games. Dizzy Dean (his brother was also a pitcher which is why I was looking at this) got into the Hall of Fame with only 150 wins. Why? For 6 years he was the most dominant pitcher in the game. In addition (1) there was some sympathy for him since his career was cut short by an injury he got in an All-star game, and (2) he had a colorful personality and people liked him. The four least-wins for a HOF are: Candy Cummings (W-L 145-94).  [he is said to have invented the curveball], Dizzy Dean (W-L 150-83) , Addie Joss (W-L 160-97 )[he only played 9 years, which is less than the 10 needed for eligibility in the HOF, but he died so he was given an exception], Sandy Koufax (W-L 165-87) [he was dominant, like Dean, for a short time]. Sandy is the only one who is still alive. (W-L means Win-Loss. For example, Candy won 145 games and lost 94.)


b) Modern baseball starts in 1900. But this is arbitrary. In 1904 Jack Chesbro won 40 games which would never happen in modern baseball. But you have to draw the line someplace.  History is hard because there are fuzzy boundaries.

c) I had not known about the Clarkson brothers. 

d) My interest in this subject goes back to 1973. In that year I heard the following during a baseball game and, ever since I began blogging, I wondered how I could fit it into a blog post:


An old baseball trick question is now gone!  Just last week [in 1973] if you asked

What pair of brothers in baseball won the most games?

the answer was Christy and Henry Mathewson. Christy played for 16 seasons and had a W-L record of 373-188, while his brother Henry played in 3 games and had a W-L record of 0-1. So their total number of wins is 373 which was the record.

But this last week [in 1973] the Perry brothers, Gaylord and Jim, got 374 wins between them.  Hence the question

What pair of brothers in baseball won the most games?

is no longer a trick question since both Perrys are fine pitchers [Gaylord made it into the Hall of Fame; Jim didn't, but Jim was still quite good.]

As you will see in my writeup, the Perrys' record was broken by the Niekros.

e) Christy and Henry are a trick answer because Henry wasn't much of a player with his 0-1 W-L record. Are there other pairs like that? Greg (355 wins) and Mike (39 wins) Maddux might seem that way but they are not. While Mike only had 39 wins he had a 14-year career as a relief pitcher. Such pitchers can be valuable to the team, but because of their role they do not get many wins.



So, the usual question: Why did AI get this one wrong? Lance will say that the paid version wouldn't have, and he may be right. David in Tokyo might say that ChatGPT is a random word generator. Others tell me that ChatGPT does not understand the questions it is asked (my proofreader thinks this is obvious). I'll add that the pitcher-brother question has more ambiguity than I thought: do you count pitchers from before 1900? Do you count a brother who won 0 games? What about 3 brothers (not the restaurant)?

By gasarch

28th Estonian Winter School in Computer Science

from Luca Aceto

The 28th edition of the Estonian Winter School in Computer Science was held in the period 2-5 March in Viinistu, a small fishing village located on the coast of the Gulf of Finland. This edition of the school was organised by Cybernetica AS and the programme/organising committee included 

  • Peeter Laud (Cybernetica AS), 
  • Monika Perkmann (Tallinn University of Technology),
  • Pille Pullonen-Raudvere (Cybernetica AS),
  • Ago-Erik Riet (University of Tartu) and
  • Tarmo Uustalu (Reykjavik University and Tallinn University of Technology). 
The main objective of this series of winter schools is to expose Estonian, Baltic, and Nordic graduate students in computer science (but also interested students from elsewhere) to research topics that are usually not covered within the regular curricula. 
This year's edition of the school featured four courses, each consisting of four hours of lectures and three hours of exercises. Renato Neves (University of Minho, Braga, Portugal) delivered a course on "Reasoning precisely about imprecisions (metric program equivalence)". Matej Pavlović (Near One) gave a course on "Distributed consensus, state machine replication and Byzantine fault tolerance". Miklós Simonovits (Alfréd Rényi Institute of Mathematics, Hungary) spoke about "Extremal graph theory and related areas". Finally, I contributed some lectures on "The equational logic of concurrent processes: Results and proof techniques".
I thoroughly enjoyed taking part in this winter school. It was a great pleasure to learn from the other lecturers and to interact with the young researchers and students, including several BSc ones, who attended the event in a relaxed and idyllic setting. The organisation of the event was perfect and allowed all participants plenty of opportunities to discuss topics related to research, academic life and their studies with the lecturers and the organisers. We also had a lot of fun listening to the humorous presentations delivered at CRAPcon'26, including talks on pets in the wild, the Fibonacci properties of the broccolo romanesco, node equality in graphs, and reasons one should not enrich (in category theory and beyond). 
The organisers of the Estonian Winter School deserve the appreciation of the computer science community for their efforts over about 30 years. To my mind, events of this type have a huge positive influence on young computer scientists, some of whom then decide to pursue PhD studies and a research career. Organising the editions of the school for such a long period of time requires a lot of work and this type of contribution to the community is often not sufficiently recognised when evaluating an academic for positions or promotions. 
For the little that it might be worth, let me thank the organisers, the other lecturers and all the attendees for a lovely event, which rekindled a little, much-needed optimism in me in very troubled times. 

By Luca Aceto


TR26-036 | On Factorization of Sparse Polynomials of Bounded Individual Degree | Aminadav Chuyoon, Amir Shpilka

from ECCC Papers

We study sparse polynomials with bounded individual degree and the class of their factors. In particular we obtain the following algorithmic and structural results:
1. A deterministic polynomial-time algorithm for finding all the sparse divisors of a sparse polynomial with bounded individual degree. As part of this, we establish the first upper bound on the number of non-monomial irreducible factors of such polynomials.
2. A $\mathrm{poly}(n,s^{d\log \ell})$-time algorithm for recovering $\ell$ irreducible $s$-sparse polynomials of bounded individual degree $d$ from blackbox access to their product (which is not necessarily sparse). This partially resolves a question posed in Dutta-Sinhababu-Thierauf (Random 2024). In particular, when $\ell=O(1)$, the algorithm runs in polynomial time.
3. Deterministic algorithms for factoring a product of $s$-sparse polynomials of bounded individual degree $d$ from blackbox access. Over fields of characteristic zero or sufficiently large, the algorithm runs in $\mathrm{poly}(n,s^{d^3\log n})$-time; over arbitrary fields it runs in $\mathrm{poly}(n,{(d^2)!},s^{d^5\log n})$-time. This improves upon the algorithm of Bhargava-Saraf-Volkovich (JACM 2020), which runs in $\mathrm{poly}(n,s^{d^7\log n})$-time and applies only to a single sparse polynomial of bounded individual degree. In the case where the input is a single sparse polynomial, we give an algorithm that runs in $\mathrm{poly}(n,s^{d^2\log n})$-time.
4. Given blackbox access to a product of (not necessarily sparse or irreducible) factors of sparse polynomials of bounded individual degree, we give a deterministic polynomial-time algorithm for finding all irreducible sparse multiquadratic factors of it (along with their multiplicities). This generalizes the algorithms of Volkovich (Random 2015, Random 2017).
We also show how to decide whether such a product is a complete power (in case it is defined over a field of zero or large enough characteristic), extending the algorithm of Bisht-Volkovich (CC 2025). Our algorithms most naturally apply over fields of zero or sufficiently large characteristic. To handle arbitrary fields, we introduce the notion of primitive divisors for a class of polynomials, which may be of independent interest. This notion enables us to adapt ideas of Bisht-Volkovich (CC 2025) and remove characteristic assumptions from most of our algorithms.


TR26-035 | Learning Read-Once Determinants and the Principal Minor Assignment Problem | Abhiram Aravind, Abhranil Chatterjee, Sumanta Ghosh, Rohit Gurjar, Roshan Raj, Chandan Saha

from ECCC Papers

A symbolic determinant under rank-one restriction computes a polynomial of the form $\det(A_0 + A_1y_1 + \ldots + A_ny_n)$, where $A_0, A_1, \ldots, A_n$ are square matrices over a field $\mathbb{F}$ and $\mathrm{rank}(A_i) = 1$ for each $i \in [n]$. This class of polynomials has been studied extensively, since the work of Edmonds (1967), in the context of linear matroids, matching, matrix completion and polynomial identity testing. We study the following learning problem for this class: Given black-box access to an $n$-variate polynomial $f = \det(A_0 + A_1y_1 + \ldots + A_ny_n)$, where $A_0, A_1, \ldots, A_n$ are unknown square matrices over $\mathbb{F}$ and $\mathrm{rank}(A_i) = 1$ for each $i \in [n]$, find a square matrix $B_0$ and rank-one square matrices $B_1, \ldots, B_n$ over $\mathbb{F}$ such that $f = \det(B_0 + B_1y_1 + \ldots + B_ny_n)$. In this work, we give a randomized poly$(n)$ time algorithm to solve this problem; the algorithm can be derandomized in quasi-polynomial time. To our knowledge, this is the first efficient learning algorithm for this class. As the above-mentioned class is known to be equivalent to the class of read-once determinants (RODs), we will refer to the problem as learning RODs. An ROD computes the determinant of a matrix whose entries are field constants or variables and every variable appears at most once in the matrix. Thus, the class of RODs is a rare example of a well-studied class of polynomials that admits efficient proper learning. The algorithm for learning RODs is obtained by connecting with a well-known open problem in linear algebra, namely the Principal Minor Assignment Problem (PMAP), which asks to find (if possible) a matrix having prescribed principal minors. PMAP has also been studied in machine learning to learn the kernel matrix of a determinantal point process. 
Here, we study a natural black-box version of PMAP: Given black-box access to an $n$-variate polynomial $f = \det(A + Y)$, where $A \in \mathbb{F}^{n \times n}$ is unknown and $Y = \mathrm{diag}(y_1, \ldots, y_n)$, find a $B \in \mathbb{F}^{n\times n}$ such that $f = \det(B + Y)$. We show that black-box PMAP can be solved in randomized poly$(n)$ time, and further, it is randomized polynomial-time equivalent to learning RODs. The algorithm and the reduction between the two problems can be derandomized in quasi-polynomial time. To our knowledge, no efficient algorithm to solve this black-box version of PMAP was known before. We resolve black-box PMAP by investigating a crucial property of dense matrices that we call the rank-one extension property. Understanding ``cuts'' of matrices with this property and designing a black-box cut-finding algorithm to solve PMAP for such matrices (using only principal minors of order $4$ or less) constitute the technical core of this work. The insights developed along the way also help us give the first NC algorithm for the Principal Minor Equivalence problem, which asks to check if two given matrices have equal corresponding principal minors.


The “JVG algorithm” is crap

from Scott Aaronson


Sorry to interrupt your regular programming about the AI apocalypse, etc., and return to the traditional beat of this blog’s very earliest years … but I’ve now gotten multiple messages asking me to comment on something called the “JVG (Jesse–Victor–Gharabaghi) algorithm” (yes, the authors named it after themselves). This is presented as a massive improvement over Shor’s factoring algorithm, which could (according to popular articles) allow RSA-2048 to be broken using only 5,000 physical qubits.

On inspection, the paper’s big new idea is that, in the key step of Shor’s algorithm where you compute x^r mod N in a superposition over all r’s, you instead precompute the x^r mod N’s on a classical computer and then load them all into the quantum state.

Alright kids, why does this not work? Shall we call on someone in the back of the class—like, any undergrad quantum computing class in the world? Yes class, that’s right! There are exponentially many r’s. Computing them all takes exponential time, and loading them into the quantum computer also takes exponential time. We’re out of the n^2-time frying pan but into the 2^n-time fire. This can only look like it wins on tiny numbers; on large numbers it’s hopeless.
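To make the blowup concrete, here is a tiny Python sketch (my own illustration, not from the paper or the post): computing any single x^r mod N is cheap via repeated squaring, but the precompute-everything approach must tabulate a number of values exponential in the bit-length of N.

```python
# Hypothetical illustration: one value of x^r mod N is cheap, but Shor's
# algorithm superposes over roughly 2^(2n) exponents r for an n-bit N,
# so precomputing and loading all of them is exponential-time.

def one_value(x, r, N):
    # Python's built-in three-argument pow does fast modular
    # exponentiation: O(log r) multiplications via repeated squaring.
    return pow(x, r, N)

def values_to_tabulate(n_bits):
    # number of exponents r in the superposition (~N^2 for an n-bit N)
    return 2 ** (2 * n_bits)

print(one_value(7, 4, 15))                     # a single value: instant
print(values_to_tabulate(2048) > 10 ** 1000)   # RSA-2048 table: astronomical
```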

If you want to see people explaining the same point more politely and at greater length, try this from Hacker News or this from Postquantum.com.

Even for those who know nothing about quantum algorithms, is there anything that could’ve raised suspicion here?

  1. The paper didn’t appear on the arXiv, but someplace called “Preprints.org.” Come to think of it, I should add this to my famous Ten Signs a Claimed Mathematical Breakthrough is Wrong! It’s not that there isn’t tons of crap on the arXiv as well, but so far I’ve seen pretty much only crap on preprint repositories other than arXiv, ECCC, and IACR.
  2. Judging from a Google search, the claim seems to have gotten endlessly amplified on clickbait link-farming news sites, but ignored by reputable science news outlets—yes, even the usual quantum hypesters weren’t touching this one!

Often, when something is this bad, the merciful answer is to let it die in obscurity. In this case, I feel like there was a sufficient level of intellectual hooliganism, just total lack of concern for what’s true, that those involved deserve to have this Shtetl-Optimized post as a tiny bit of egg on their faces forever.

By Scott

Saturday, March 07

Postdoc and PhD Positions at LMU Munich (apply by May 1, 2026)

from CCI: jobs


The Quantum Information Theory group at LMU Munich is inviting applications for several postdoctoral & PhD positions. We are looking for excellent, highly motivated candidates in all areas of the field, including: quantum {algorithms & complexity, information, cryptography, formal methods} and their mathematical and CS foundations and connections. Applications will be reviewed on a rolling basis.

Website: https://michaelwalter.info/jobs
Email: michael.walter@lmu.de

By shacharlovett

TR26-034 | Limit on the computational power of $\mathrm{C}$-random strings | Alexey Milovanov

from ECCC Papers

We construct a universal decompressor $U$ for plain Kolmogorov complexity $\mathrm{C}_U$ such that the Halting Problem cannot be decided by any polynomial-time oracle machine with access to the set of random strings $R_{\mathrm{C}_U} = \{x : \mathrm{C}_U(x) \ge |x|\}$. This result resolves a problem posed by Eric Allender regarding the computational power of Kolmogorov complexity-based oracles.


Friday, March 06

For All The Cows

from Ben Recht

Steampunk Data Science: An afterword and some thoughts on Cyberpunk Data Science.

This is the final part of a blogged essay “Steampunk Data Science.” A table of contents is here.

To be a practicing scientist, you need to maintain a state of vigilant cognitive dissonance. If you get too deep into the history and philosophy of your discipline, it becomes very hard to keep writing papers. A little perspective reveals the absurdity and irrationality of the system. And once you see it, you can’t unsee it. Participating requires a lot more effort.

Of course, I speak from experience. Here’s my super brief history. My research from 2015-2018 convinced me that machine learning was lying to itself about its foundations. I went to other disciplines for answers. I first looked to statistics and disappointingly found its epistemology even more confused than machine learning. David Blei sent me David Freedman’s Statistical Models and Shoe Leather, and I became a Freedman completionist. The last chapter of Freedman’s posthumously collected works, Statistical Models and Causal Inference, is a sequel to this famous Shoe Leather essay. It details several examples of “scientific shoe leather,” and devotes a page to the history of vitamins. I was a bit surprised that he wrote a lot about Eijkman, but didn’t mention Wisconsin. I wondered what else was missing in this summary. And here we are, six years later.

Though I wrote the first draft of this essay four years ago, I have never been able to figure out what to do with it. Initially, I thought it would be part of The Irrational Decision, but that project took me in a very different direction. The Irrational Decision is about science, engineering, and decision-making after the computer. This story was about how we did things before the standardizing forces of the computer and statistics. It didn’t cleanly fit, and I ended up jettisoning this chapter.

As a standalone piece, Henry Farrell warned me that it lived in an uncanny valley. The writing wasn’t academic enough to get published in a journal or pop enough to appear in a magazine. But you know where that sort of stuff can go? Directly to you, my argmin readers. Maybe this blog is the uncanny valley between NeurIPS and The New Yorker. I should make that the masthead.

Whatever the case, this piece not finding an external home is fine. Writing through a project is an exercise in itself. I finished this essay before I started substacking, and the recurring themes on here are built upon my research and writing about vitamins. This unpublished project laid the foundation of a research program. We’ll see where it ends up.

Now, for all of you scientists reading this, I’m not suggesting you follow me down this path. They’re uncommon, but scientific breakthroughs still happen, and looking for them can be thrilling. Keep looking.

The most dramatic plot in this essay is F. Gowland Hopkins’ crossover curves, in which he showed that some factor in milk was needed to sustain the growth of rats.

You might look at that plot and say, “Gee, it would have been nice to be a scientist back in 1900, as all of those low-hanging fruit are gone.” You might think that we never find such clear success stories in our modern, complex world. This couldn’t be further from the truth. I mean, this plot is from 2021:

This plot graphs the effect of starting and stopping semaglutide (Ozempic). The evidence of a revolutionary breakthrough in the management of weight loss tracks average growth with a visualization scheme identical to F. Gowland Hopkins’ plots from a century earlier.

Yes, interventions like GLP-1 agonists come along rarely. But they do come along!

And for all of our results that are not robust and large? We should accept that, though they are likely illusory, the many small results help us build a scientific record. They form a scattered pile of puzzle pieces that we can all try to assemble. It’s only from the incremental pieces that aren’t precisely reproducible or clean that we can see the big picture. Don’t worry about the distribution of p-values. Do think hard about how to produce reproducible data and coding artifacts.

I’ll repeat what I said yesterday. The most important thing we can learn from the discovery of vitamins is that discovery starts with a mess. It’s in learning from the mess that we find the undeniable effects that completely transform our understanding.


Oh, one more thing. You might be wondering what happened to those cows in the Single-Grain Experiment. The cows were fed their monotonous diets for years, and the research team closely monitored the cows’ growth and health. They weighed each animal monthly and took a photograph once every six months. The cows all grew at similar rates, but there were noticeable differences in appearance, offspring, and milk production.

The corn-fed animals looked smooth of coat, fuller through the barrel; and as expressed by experienced feeders and judges of domestic animals, they were in a better state of nutrition. On the other extreme stood the wheat-fed group with rough coats, gaunt and thin in appearance, small of girth and barrel, and to the practiced eye, in rather a lower state of nutrition.

While those fed on corn gave birth to healthy calves, the wheat-fed cows’ offspring all died within a day. The milk of the cows had different fat contents depending on the diet. The corn-fed had somewhat less fat than the oat-fed, but the wheat-fed had almost no fat content whatsoever. Wheat alone was a perplexingly poor diet for cows.

Elmer McCollum was no dummy. He and Marguerite Davis began publishing papers about their rat colony well before the Single-Grain experiment had finished. Hart eventually terminated the Single-Grain experiment in 1911, and the team published their results in the Proceedings of the National Academy of Sciences, several years after McCollum and Davis announced their discovery of Vitamin A and confirmed the existence of Vitamin B.1

The Wisconsin researchers never figured out what was wrong with the wheat. Following McCollum and Davis’ discovery of Vitamin A, they tried adding butter to the rations, but this didn’t seem to improve the cows’ health. Their most likely hypothesis was that there was something toxic in their wheat ration. In his autobiography, McCollum later reflected that the harvested wheat itself was just of poor quality. Due to the way they grew, processed, and stored the wheat on the Madison campus, the cows ended up being fed only wheat grain and straw.

Had the cows eaten their full quota of leaf, as the corn- and oat-fed animals did, they would not have been in such poor nutritive condition. Through four years we had been inexcusably uncritical of some important details.

The experiment, though wildly influential, yielded nothing but inconclusive results.


I’d like to thank Mihaela Curmei, Jessica Dai, Shamik Dasgupta, Henry Farrell, Sara Fridovich-Keil, Paula Gradu, Chris Harshaw, Lauren Kroiz, Kevin Munger, Deb Raji, Philippe Rigollet, Scott Shenker, and Chris Wiggins for many helpful comments and suggestions. Special thanks to the students in the 2024 Spring seminar “The Philosophy and History of Automated Decision Making,” who participated in a lively discussion about an earlier draft of this essay.

1

Even back then, PNAS was the journal for articles “Previously rejected from Nature And Science.”

By Ben Recht

The Fully Depolarizing Noise Conjecture for Physical Cat States is Twenty Years Old!

from Gil Kalai


Amidst the war, I am holding onto hope for a future of peace and tolerance. At the end of the post I include a few pictures from the safe room (closet). Writing this also made me reflect on related posts from 2017 and 2024.

 

The Fully Depolarizing Noise Conjecture (2006)

Conjecture (Fully Depolarizing Noise for Bell States).
For every physical implementation of a quantum computer, whenever a Bell state on a pair of physical qubits is created, there exists a small probability t>0 that both qubits are replaced by the maximally mixed state.

Equivalently: the preparation of a Bell state (i.e., a two-qubit cat state) on two physical qubits necessarily carries a non-negligible probability that both qubits undergo fully depolarizing noise.

Twenty years ago I proposed that this phenomenon cannot be avoided by any method of preparing a Bell state on a pair of physical qubits. In particular, the conjecture applies to any pair of transmons in a Bell state in superconducting architectures. As far as I know, the conjecture remains open.

What is a Bell state?

A Bell state (also called a two-qubit cat state) is a maximally entangled state of two qubits. A canonical example is

\frac{1}{\sqrt{2}}(|00\rangle + |11\rangle)

The other Bell states are obtained from this one by local unitary transformations (for example, by applying Pauli operations to one of the qubits).
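
Concretely, the four Bell states can be written as

\frac{1}{\sqrt{2}}(|00\rangle \pm |11\rangle), \qquad \frac{1}{\sqrt{2}}(|01\rangle \pm |10\rangle),

where applying Z to one qubit of \frac{1}{\sqrt{2}}(|00\rangle + |11\rangle) produces the sign flip and applying X produces the bit flip.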

Bell states represent the simplest and most fundamental form of quantum entanglement. They are the basic resource behind quantum teleportation, entanglement swapping, and many constructions in quantum error correction.

Remark: In its original formulation, the conjecture was stated more broadly: it applied to every entangled pair of physical qubits, not necessarily maximally entangled ones. Moreover, the strength of the correlated depolarizing component was conjectured to depend linearly on the amount of entanglement between the qubits. The discussion below implicitly relies on this more general formulation.

Direct versus indirect creation of entanglement

For directly gated qubits, this behavior is already built into standard stochastic noise models. The novelty of the conjecture is the claim that this remains true, with a similar error level, even when the cat state is created indirectly. Quantitatively, I conjecture that the value of t in the conjecture is in the ballpark of the 2-gate fully depolarizing error rate.

For example, one may create a cat state on qubits 1 and 3 by:

  • applying a CNOT on (1,2),
  • then a CNOT on (2,3),
  • and measuring qubit 2 (or uncomputing it).

In standard gate-level noise models, if each two-qubit gate independently depolarizes its participating pair with probability t, then:

  • qubit 1 is replaced with probability t,
  • qubit 3 is replaced with probability t,
  • both are replaced only with probability t^2.

Thus, in such models, the probability that both entangled qubits are hit drops from order t (direct gate) to order t^2 (indirect construction).
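
This t-versus-t^2 counting can be checked with a minimal Monte Carlo sketch of the gate-level model (a purely classical counting model, not a quantum simulation; all names and parameters are illustrative):

```python
import random

def sample_round(t, rng):
    # Each two-qubit gate independently fully depolarizes its pair with
    # probability t.  Gate (1,2) is the only gate touching qubit 1, and
    # gate (2,3) is the only gate touching qubit 3.
    gate12_fires = rng.random() < t
    gate23_fires = rng.random() < t
    return gate12_fires, gate23_fires  # (qubit 1 hit, qubit 3 hit)

rng = random.Random(0)
t, trials = 0.05, 200_000
hits1 = hits3 = hits_both = 0
for _ in range(trials):
    h1, h3 = sample_round(t, rng)
    hits1 += h1
    hits3 += h3
    hits_both += h1 and h3

# Expect roughly t, t, and t**2 = 0.0025 respectively.
print(hits1 / trials, hits3 / trials, hits_both / trials)
```

The joint rate comes out near t^2, quadratically suppressed relative to the single-qubit rates, which is exactly the suppression the conjecture asserts nature does not permit.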

My conjecture asserts that nature does not permit such quadratic suppression. Even when entanglement is generated indirectly—through mediating qubits, measurement, feedforward, or clever circuit identities—the resulting Bell state on the two physical qubits still carries an intrinsic order-t probability that both qubits are replaced by the maximally mixed state.

Formulating the conjecture for state-of-the-art superconducting devices makes it more concrete. For superconducting quantum computers, the best 2-gate error rates are around 0.005, and we can assume that a large fraction of this noise is a depolarizing channel hitting both qubits. If the conjecture is correct for indirectly entangled qubits, we should be able to identify fully depolarizing errors for pairs of entangled qubits at a similar rate, rather than at a rate two orders of magnitude smaller, as the computation above would suggest.

Physical intuition

The conjecture expresses, in a mathematically sharp way, a common physical intuition:

Entanglement between two physical systems requires a genuine physical interaction, and such interaction inevitably exposes both systems to correlated noise.

For example, there is no indirect method of performing an entangling gate between two transmons (say) that is much more reliable than directly interacting them. Standard gate-level noise modeling does not enforce this. Indeed, in simple circuit models (as we computed above) indirect constructions reduce fully depolarizing errors from order t to order t^2. Thus the conjecture postulates an additional, more subtle “nasty” property of physical noise—one that goes beyond standard local stochastic models. (I was myself recently confused about whether this stronger behavior follows from standard simulations; it does not.)

Localizable entanglement

Given a quantum state, the localizable entanglement (Verstraete–Popp–Cirac, 2004) of a pair of qubits is defined as the maximum amount of entanglement that can be obtained between them when single-qubit measurements are performed on all remaining qubits and the outcomes are conditioned upon.

The conjecture extends to such localizable entangled pairs: whenever a pair of qubits exhibits non-zero localizable entanglement, there exists a small probability t>0 (depending linearly on the value of the localizable entanglement) that both qubits are replaced by the maximally mixed (maximum-entropy) state.
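
A standard worked example of localizable entanglement (my illustration, not from the original post): for the three-qubit GHZ state, measuring the middle qubit in the X basis localizes a full Bell pair on the outer qubits. Writing qubit 2 in the X basis,

\frac{1}{\sqrt{2}}(|000\rangle+|111\rangle) = \frac{1}{2}\Big(|+\rangle_2\otimes(|00\rangle+|11\rangle)_{13} + |-\rangle_2\otimes(|00\rangle-|11\rangle)_{13}\Big),

so either measurement outcome leaves qubits 1 and 3 in a Bell state, and the localizable entanglement of the pair (1,3) is maximal. Under the extended conjecture, this pair would then carry a fully depolarizing component of order t.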

Implications for larger entangled states

For more complicated entangled states—surface-code states, GHZ states, random-circuit sampling states, and cluster states—the extended conjecture applies to every pair of physical qubits that can exhibit localizable entanglement. In several canonical families (such as GHZ states, cluster states, and surface-code states), this includes essentially every pair of qubits.

If true, this would have severe consequences for quantum error correction: correlated depolarizing noise on pairs of qubits is far more damaging than the quasi-independent noise assumed in threshold theorems. The reason is that a noise channel in which the events “qubit i is depolarized” and “qubit j is depolarized” are positively correlated for all pairs (i,j) necessarily leads to large-scale error synchronization.

The state of the conjecture

As far as I know, the conjecture remains open.

Current NISQ-scale devices could in principle test it experimentally—even at noise levels above the fault-tolerance threshold. The challenge is not the scale, but the identification of the specific correlated fully-depolarizing component in the noise.

Recent demonstrations of distance-3, distance-5, and even distance-7 surface codes appear, at least superficially, to be in tension with the conjecture. Whether this tension is genuine or only apparent deserves careful examination. It will also be interesting to examine the situation for experimentally realized “large” GHZ states.

I look forward to the conjecture being carefully tested—and, of course, possibly refuted.

Motivation

The original motivation for the conjecture was to “reverse engineer” natural structural conditions on noise that would cause quantum fault tolerance to fail.

For this reason, I did not view the conjecture itself as a reason for people to revise their a priori beliefs about quantum computers. Rather, I regarded it as a concrete and testable benchmark for quantum devices — one that is meaningful both for small systems with only a handful of qubits and for larger systems. The conjecture is relevant even to systems operating at noise levels above the threshold required for fault tolerance.

Sources

In March 2006 I raised an early version of the conjecture on Dave Bacon’s blog and discussed it with Dave Bacon, John Sidles, and others. The conjecture later appeared as Postulate 1 in my 2006 paper How quantum computers can fail and in several subsequent works. The related “noise synchronization” consequence for highly entangled states appeared as Postulate 2 and it connects back to my 2005 paper.

These postulates became Conjecture 3 and Conjecture 4 in my 2012 debate with Aram Harrow, where they played a central role. Conjectures 3 and 4 from the debate, together with a further consequence for the error rate (in terms of qubit errors), are Predictions 1, 2, and 3 in my 2016 paper “The quantum computer Puzzle”.

Remark: In my papers I struggled to formulate the conjecture precisely and to identify the most appropriate notion of entanglement. Formulating the conjecture for localized entanglement (which I referred to as emergent entanglement) was proposed in my 2009 paper Quantum computers: Noise propagation and adversarial noise models.

I also considered a weaker notion where the maximum amount of entanglement over single-qubit measurements is replaced by the average entanglement obtained from such measurements.

In some early papers I also formulated related conjectures for correlated classical noisy systems, asserting that correlation in the “signal” implies correlation in the noise. In this generality classical computation provides simple counterexamples. Nevertheless, suitably adjusted versions of the conjecture appear to apply to several natural physical systems.

Some counter arguments (and responses)

1) “This may be true for physical qubits, but it is false for logical qubits.”

Yes — the conjecture refers to physical qubits.

If the conjecture holds, it lowers the prospects for achieving reliable logical qubits. In fact, good-quality logical qubits would violate the Fully Depolarizing Noise Conjecture at the physical level. Thus the conjecture asserts that sufficiently good logical qubits are out of reach because the necessary physical behavior is unavailable.

2) “The conjecture does not describe a legitimate quantum channel (namely a channel supported by quantum mechanics).”

The conjecture does not specify a complete noise model. It imposes constraints on whatever the correct physical noise model is.

Even if universally true, the physical mechanisms leading to the conjectured behavior may differ between implementations. The conjecture is structural, not mechanistic.

3) “The conjecture violates linearity of QM. It is possible that it will apply to one initial state of your quantum computer but not to another one.”

This is incorrect (though an interesting point). As I wrote above, the conjecture proposes constraints on the noise model rather than a complete description. If, for the same circuit, the conjecture applies under one initial condition but not under another, this does not violate linearity. Rather, it implies that if preparing these initial conditions makes no difference to the noise, then correlated depolarizing errors will appear in both cases.

4) The noise (or Nature) cannot “know” if the state is entangled or not. Entanglement cannot cause correlations for the noise.

The conjecture does not assume that noise “detects” entanglement or that entanglement directly “causes” correlation. It asserts that the physical processes required to generate entanglement inevitably produce correlated noise.

5) “The conjectured noise resembles nonphysical random-unitary models.”

Even if certain effective noise models implied by the conjecture appear unphysical, that does not refute the conjecture. It may instead indicate that certain large-scale entangled states are themselves physically unrealizable.

If creating a distance-15 surface code necessarily produces nonphysical types of noise, that would suggest that building such a code is itself nonphysical. The figure in my 2009 paper When Noise Accumulates was meant to illustrate precisely this point.

6) “The threshold theorem extends to general long-range noise.”

Yes — but these extensions still violate the conjecture. Mathematically speaking, many small long-range interaction terms are not the same as imposing substantial correlated depolarization.

7) “Entanglement is basis dependent.”

The conjecture refers to correlations in a fixed computational basis. There is no ambiguity here.

8) “Empirical independence in random circuit sampling shows that there are no unexpected nasty aspects of quantum noise.”

No.

Fidelity measures (including XEB) primarily estimate the probability that no error occurred. They are not sensitive to correlated structure among the error events that do occur.

The conjecture concerns correlations between qubits conditioned on noise events, not the global fidelity of outputs.

9) “If the error rate is p, the conjecture implies \Omega(n^2 p) errors per round. (We expect \Omega(np) errors per round.)”

This observation captures an important feature of the conjecture.

In trace-distance terms, the total noise per round may scale linearly.
But in terms of qubit-error counts, error synchronization can produce quadratic scaling \Omega(n^2 p) in the number of correlated qubit hits.

This quadratic synchronization is precisely the phenomenon the conjecture predicts.
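
The linear-versus-quadratic counting in item 9 can be made concrete with a small sketch comparing the expected number of pair coincidences under independent versus fully synchronized noise with the same per-qubit marginal p (my illustration; the function name is hypothetical):

```python
from math import comb

def expected_hit_pairs(n, p, synchronized):
    # Expected number of pairs (i, j) of qubits hit in the same round,
    # with identical per-qubit marginal Pr(qubit i is hit) = p in both models.
    if synchronized:
        # With probability p, all n qubits are hit together:
        # E[#pairs] = p * C(n, 2) ~ n^2 p / 2.
        return p * comb(n, 2)
    # Independent hits: each pair coincides with probability p^2:
    # E[#pairs] = p^2 * C(n, 2) ~ n^2 p^2 / 2.
    return p * p * comb(n, 2)

n, p = 100, 0.01
print(expected_hit_pairs(n, p, synchronized=False))  # ~0.495
print(expected_hit_pairs(n, p, synchronized=True))   # ~49.5
```

With the same per-qubit rate, synchronization inflates the count of simultaneously hit pairs by a factor of 1/p, which is the quadratic scaling \Omega(n^2 p) discussed above.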

10) The conjecture is too vague; it does not explicitly describe the noise channel. It also does not describe the physical source of the noise and its exact modeling.

This is partially true. The conjecture is structural and not mechanistic (see further explanation below).

Testing the conjecture experimentally would require identifying in experimental data specific correlated fully-depolarizing components. Supporting it theoretically would require fine-grained modeling of concrete physical systems.

11) As long as t > 0 is unspecified, the conjecture might always remain open.

I conjecture that t is in the ballpark of the 2-gate fully depolarizing error rate.

In contrast (to the best of my memory), for fault-tolerant quantum computers with n physical qubits, for most pairs of qubits i,j, the correlation between the events “qubit i is hit by a depolarizing error” and “qubit j is hit by a depolarizing error” tends to zero as n tends to infinity.

12) The correlation conjecture (and my earlier line of research from 2005) has no direct bearing on topological quantum computing.

Right.

13) The conjecture represents the “physical-noise-is-conspiring position”. However, Nature is not malicious.

We will see about that 🙂 .

Remarks:

(to items 3,4) The possibility of a systematic relation between the noiseless state and the noise was raised and discussed in my 2005 paper, and over the years it has led to interesting heated discussions.

(to item 5) Noise models that resemble the behavior of random unitary operators were offered by early skeptical views. They do not violate the postulates of quantum mechanics but are considered unphysical: they violate how quantum physics is believed to be mapped into quantum mechanics.

(to item 6) Preskill’s 2012 paper (partially triggered by our 2011 Caltech discussion) presented general conditions for the threshold theorem to hold and cites earlier papers in this direction. Robert Alicki opined that the conditions proposed by Preskill are violated for open quantum systems, and I explained in this comment a specific feature of Preskill’s very general noise models that I find physically questionable.

Gated qubits and “purifying” gate errors

The assertion of the conjecture for gated qubits is a (rather vanilla) part of the standard model of noise for noisy quantum circuits, and it is also clearly manifested in experimental data. The negation of the conjecture would mean the ability to create entangled pairs of transmons (or other physical qubits) without any fully depolarizing errors. In effect, this would amount to “purifying” the two-qubit gate used to generate entanglement: starting with two-qubit gates that include fully depolarizing noise at rate t, one could nevertheless produce physical entangled pairs with no fully depolarizing component on the pair itself. The conjecture asserts that such purification is not physically possible.

The conjecture is structural and not mechanistic

Some of the objections listed above (especially 2, 3, 4, and 10) treat the conjecture as if it were proposing a specific mechanistic claim of the form: “Here is the physical source of the noise — e.g., microscopic Hamiltonian couplings, thermal photons, leakage, cross-talk — and here is the dynamical derivation showing why fully depolarizing correlations appear.” Such a claim would specify the environment, the interaction model, the time evolution, and the exact channel arising from tracing out the bath. My conjecture is instead a structural claim of the form:

Whenever two physical qubits can be prepared (or projected) into a Bell state, there is an intrinsic order-t fully depolarizing component acting jointly on them.

This is a statement about the form of the effective channel, not about the physical process generating it. Of course, mechanistic explanations supporting the conjecture (which may differ across implementations) would be valuable.

A Simple Probabilistic Model: Tail Bounds from Pairwise Marginals

To illustrate how strong pairwise correlations can force even large-scale error synchronization, consider the following simple probabilistic model. There is a probability distribution W on \{0,1\}^n. We generate a random bitstring X=(x_1,\dots,x_n)\in\{0,1\}^n according to W, and whenever x_k=1 we fully depolarize qubit k.

Write S=\sum_{i=1}^n x_i for the number of depolarized qubits. We assume that the joint distribution of every pair (x_i,x_j) (for i\neq j) is fixed. The goal is to minimize the upper-tail probability \Pr(S\ge \lceil sn\rceil).


Theorem (symmetric “one-parameter” pair law): Assume that for every i\neq j,

\displaystyle \Pr(x_i=0,x_j=0)=1-t,\qquad \Pr(x_i=0,x_j=1)=\Pr(x_i=1,x_j=0)=\Pr(x_i=1,x_j=1)=\frac{t}{3},

where t\in[0,1]. Let S=\sum_{i=1}^n x_i and K=\lceil sn\rceil. Then the minimum possible value of \Pr(S\ge K) over all such distributions equals

\displaystyle \min \Pr(S\ge K)= \begin{cases} \frac{4t}{3}\cdot\frac{n}{n+1}, & \text{if } K \le \frac{n+1}{2},\\[6pt] 0, & \text{if } K > \frac{n+1}{2}. \end{cases}

(In particular, for large n this is asymptotically \frac{4t}{3} for s<\tfrac12 and 0 for s>\tfrac12.)

A proof and discussion of this implication will be given separately.
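
To see where the value \frac{4t}{3}\cdot\frac{n}{n+1} can come from, here is a sketch (my illustration, not from the post) of a W-distribution that satisfies the pair law exactly: with probability p = \frac{4t}{3}\cdot\frac{n}{n+1}, fully depolarize a uniformly random set of exactly (n+1)/2 qubits; otherwise depolarize nothing. The exact-arithmetic check below verifies the pairwise marginals for odd n; the function name is hypothetical.

```python
from fractions import Fraction

def pair_law_marginals(n, t):
    # Hypothetical W-noise model: with probability p, fully depolarize a
    # uniformly random subset of exactly m = (n+1)/2 qubits; otherwise hit
    # no qubit.  p is chosen to equal the theorem's value 4t/3 * n/(n+1).
    assert n % 2 == 1, "m = (n+1)/2 must be an integer"
    m = (n + 1) // 2
    p = Fraction(4, 3) * t * Fraction(n, n + 1)
    pr_one = p * Fraction(m, n)                       # Pr(x_i = 1)
    pr_both = p * Fraction(m * (m - 1), n * (n - 1))  # Pr(x_i = 1, x_j = 1)
    return pr_one, pr_both

t = Fraction(1, 200)  # e.g. t = 0.005, the 2-gate error ballpark
for n in (3, 5, 11, 101):
    pr_one, pr_both = pair_law_marginals(n, t)
    # The pair law of the theorem: Pr(x_i=1) = 2t/3 and Pr(x_i=1, x_j=1) = t/3.
    assert pr_one == Fraction(2, 3) * t
    assert pr_both == Fraction(1, 3) * t
```

Since every noise event in this W hits exactly (n+1)/2 qubits at once, this distribution has \Pr(S\ge K) = \frac{4t}{3}\cdot\frac{n}{n+1} for every K\le (n+1)/2, matching the first case of the theorem.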

Time-parametrization and smooth Lindblad evolutions

My conjecture asserts that for noisy quantum states and evolutions — when the noise level is low — there are systematic relations between the noise and the corresponding “ideal” state. (Such systematic relations were already explored in my 2005 paper.) For example, the noise in a quantum computation may reflect symmetries and statistical features of the underlying signal.

Over the years, I made several attempts to place these ideas into a broader mathematical framework, extending beyond the specific (and hypothetical) setting of quantum computers. One direction I proposed was the study of certain subclasses of Lindblad evolutions, which I referred to as smoothed Lindblad evolutions. Another direction involved introducing an intrinsic time-parametrization for noisy quantum evolutions. These ideas appear as Predictions 6 and 7 in my 2016 Notices of the AMS paper.

Both directions aim at understanding noise not as arbitrary perturbation, but as a structured phenomenon constrained by the evolving quantum state.

My paper and discussion with Greg Kuperberg

One offshoot of these conjectures — particularly the notion of smoothed Lindblad evolutions — and long email discussions with Greg Kuperberg led to a joint paper: Contagious error sources would need time travel to prevent quantum computation. Following ideas of Emanuel Knill, we proved that fault tolerance is possible even under very general forms of “disease-like” spreading noise.

Greg viewed this paper as a successful response to my skepticism. I did not quite see it that way. I saw our paper as clarifying an important point: if time-smoothing in my proposed class of Lindblad evolutions applies only forward in time, then fault tolerance can still succeed. To obstruct fault tolerance, the smoothing would need to extend to both past and future — effectively requiring a kind of “time-travel” correlation structure.

Recently, Greg jokingly proposed a joint mathematical collaboration (not necessarily on quantum topics) involving the two of us together with Gal Kronenberg and Guy Kindler. I think this is a wonderful idea.

My argument against QC (2013-2020), and other related directions

Between 2013 and 2020, I pursued a different skeptical direction regarding quantum computation. Together with Guy Kindler (initially in the context of Boson Sampling), I developed an argument based on computational complexity and the notion of noise sensitivity. This line of work suggests that NISQ devices cannot achieve “quantum supremacy.”

Unlike the Fully Depolarizing Noise Conjecture — which posits a structurally “nasty” type of correlated noise beyond standard modeling — this later argument relies entirely on standard noise assumptions. It explains why the extremely low error rates required for quantum supremacy and scalable fault tolerance may be beyond reach.

Naturally, the quantum-supremacy claims of the past six years directly challenge this position. The central issue is careful evaluation of experimental details. As far as I know, those claims have no direct bearing on the Fully Depolarizing Noise Conjecture.

Both skeptical directions generate near-term experimental predictions. Both are discussed in my 2016 paper “The quantum computer Puzzle”. While experimental validation is the primary testing ground, another possible avenue is to derive broader physical consequences beyond the specific framework of quantum computing.

Let me also note that the question of whether “NISQ computers” could deliver “quantum supremacy” arose in my debate with Aram Harrow — and in other discussions — before the terms “NISQ” and “quantum supremacy” were coined.

Statistical testing

Since 2019, I have also been working (with Yosi Rinott, Tomer Shoham, and Carsten Voelkmann) on developing statistical tools for analyzing samples produced by NISQ experiments. Engaging directly with experimental data and statistical methodology may prove useful for testing the Fully Depolarizing Noise Conjecture as well.

One possible approach is to begin with a standard noise model, augment it with the hypothetical two-qubit depolarizing component (DEPOLARIZE2) predicted by the conjecture at some rate p, and then determine the value of p that best matches empirical data. Here DEPOLARIZE2 refers to a two-qubit depolarizing channel acting jointly on the pair.
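
As a sketch, the augmented model might be written in Stim's circuit notation roughly as follows (assuming Stim's DEPOLARIZE2 instruction, which applies a joint two-qubit depolarizing channel; the rate 0.005 is only the ballpark figure mentioned earlier):

```
# Bell-pair preparation with an added correlated depolarizing term
# (hypothetical augmentation; rates are illustrative)
H 0
CNOT 0 1
DEPOLARIZE2(0.005) 0 1   # conjectured joint component at rate ~p
M 0 1
```

Fitting would then vary the DEPOLARIZE2 rate on top of a standard per-gate noise model and compare the resulting samples with experimental data.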

I recently had useful discussions with Craig Gidney about this type of simulation. Craig is optimistic that the skeptical position will be decisively refuted in the coming years. Statistical fitting of experimental samples to classes of W-noise models (discussed earlier) may also provide empirical tests of the conjecture.

Conclusion

The Fully Depolarizing Noise Conjecture was proposed twenty years ago as a structural stress test for quantum computing — a condition that scalable quantum devices must overcome. It does not attempt to describe a specific microscopic mechanism. Rather, it imposes a constraint on the effective noise channel: whenever physical qubits can generate entanglement — or localizable entanglement — correlated fully depolarizing noise must appear at linear order.

Whether this structural constraint reflects a fundamental limitation of quantum devices, or whether it will ultimately be refuted by experiment, remains an open question. The answer lies mainly in precise experimental scrutiny together with careful theoretical modeling and analysis.

 

A few pictures from the safe room (closet).

Late remark: There is a lovely interview of Scott Aaronson by Yuval Boger of QuEra. Yuval asked: “I’ve heard you refer many times in your talks to an argument with Gil Kalai about whether large-scale quantum computing is even possible. Do you think he still has a path to being vindicated, or is it over?” Scott gave a detailed and nice answer presenting his view of my position. (See also this SO post.) The discussion nicely illustrates how this debate continues to evolve twenty years after the Fully Depolarizing Noise Conjecture was first proposed.

 

By Gil Kalai

Mysticeti: Revolutionizing Consensus on Sui

from Decentralized Thoughts

TL;DR Mysticeti is a novel Byzantine consensus protocol deployed on the Sui blockchain that revolutionizes transaction processing:

  • Eliminates explicit certification, reducing good-case latency to the theoretical minimum.
  • Crash-fault masking, avoiding head-of-line blocking for pipelined rounds.
  • 40% CPU usage reduction for consensus in production.
  • 80% latency reduction compared to Bullshark on Sui Mainnet (from ~1.9s to ~400ms).

Mysticeti is a novel Byzantine consensus protocol recently deployed on the Sui blockchain. It...

By Lefteris Kokoris Kogias, Alberto Sonnino, George Danezis


Recurrent Graph Neural Networks and Arithmetic Circuits

from arXiv: Computational Complexity

Authors: Timon Barlag, Vivian Holzapfel, Laura Strieker, Jonni Virtema, Heribert Vollmer

We characterise the computational power of recurrent graph neural networks (GNNs) in terms of arithmetic circuits over the real numbers. Our networks are not restricted to aggregate-combine GNNs or other particular types. Generalizing similar notions from the literature, we introduce the model of recurrent arithmetic circuits, which can be seen as arithmetic analogues of sequential or logical circuits. These circuits utilise so-called memory gates which are used to store data between iterations of the recurrent circuit. While (recurrent) GNNs work on labelled graphs, we construct arithmetic circuits that obtain encoded labelled graphs as real valued tuples and then compute the same function. For the other direction we construct recurrent GNNs which are able to simulate the computations of recurrent circuits. These GNNs are given the circuit-input as initial feature vectors and then, after the GNN-computation, have the circuit-output among the feature vectors of its nodes. In this way we establish an exact correspondence between the expressivity of recurrent GNNs and recurrent arithmetic circuits operating over real numbers.
