
Thank you to arXiv for use of its open access interoperability.

Note: the date of arXiv entries announced right after publication holidays might incorrectly show up as the date of the publication holiday itself. This is due to our ad hoc method of inferring announcement dates, which are not returned by the arXiv API.

Powered by Pluto.

Source on GitHub.

Maintained by Nima Anari, Arnab Bhattacharyya, Gautam Kamath.

Theory of Computing Report

Tuesday, March 03

Completing the Complexity Classification of 2-Solo Chess: Knights and Kings are Hard

from arXiv: Computational Complexity

Authors: Kolja Kühn, Wendy Yi

We extend the study of the 2-Solo Chess problem, first introduced by Aravind, Misra, and Mittal in 2022. 2-Solo Chess is a single-player variant of chess in which the player must clear the board via captures such that only one piece remains, with each piece capturing at most twice. It is known that the problem is solvable in polynomial time for instances containing only pawns, while it becomes NP-complete for instances restricted to rooks, bishops, or queens. In this work, we complete the complexity classification by proving that 2-Solo Chess is NP-complete if the instance contains only knights or only kings.
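
The capture rule is concrete enough to check by exhaustive search; as a small illustration (not the paper's hardness reduction), a brute-force solver for knight-only instances might look like:

```python
# Knight move offsets.
KNIGHT_MOVES = [(1, 2), (2, 1), (2, -1), (1, -2),
                (-1, -2), (-2, -1), (-2, 1), (-1, 2)]

def solvable(pieces):
    """pieces maps board square -> number of captures made so far.
    Return True if some capture sequence leaves exactly one knight,
    with no piece capturing more than twice."""
    if len(pieces) == 1:
        return True
    for (r, c), caps in list(pieces.items()):
        if caps >= 2:
            continue  # this piece has used up both captures
        for dr, dc in KNIGHT_MOVES:
            tgt = (r + dr, c + dc)
            if tgt in pieces:
                nxt = dict(pieces)
                del nxt[(r, c)]
                nxt[tgt] = caps + 1  # the capturer lands on tgt's square
                if solvable(nxt):
                    return True
    return False

# Three knights, each a knight-move from the next: clearable.
print(solvable({(0, 0): 0, (1, 2): 0, (2, 4): 0}))  # True
# Two adjacent squares are not a knight move apart: stuck.
print(solvable({(0, 0): 0, (0, 1): 0}))  # False
```

The state space explodes quickly, which is consistent with the NP-completeness result; the sketch is only meant for tiny instances.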

NP-Completeness and Physical Zero-Knowledge Proof of Hotaru Beam

from arXiv: Computational Complexity

Authors: Taisei Otsuji, Peter Fulla, Takuro Fukunaga

Hotaru Beam is a logic puzzle whose objective is to connect circles placed on a grid by drawing only lines with specified starting points and numbers of bends. A zero-knowledge proof is a communication protocol that allows one player to persuade the other that they are in possession of a certain piece of information without actually revealing it. We show that Hotaru Beam is NP-complete and present a physical zero-knowledge proof (i.e., one implementable using physical items) for proving that one knows a solution to the puzzle.

Hexasort -- The Complexity of Stacking Colors on Graphs

from arXiv: Computational Complexity

Authors: Linus Klocker, Simon D. Fink

Many popular puzzle and matching games have been analyzed through the lens of computational complexity. Prominent examples include Sudoku, Candy Crush, and Flood-It. A common theme among these widely played games is that their generalized decision versions are NP-hard, which is often thought of as a source of their inherent difficulty and addictive appeal to human players. In this paper, we study a popular single-player stacking game commonly known as Hexasort. The game can be modelled as placing colored stacks onto the vertices of a graph, where adjacent stacks of the same color merge and vanish according to deterministic rules. We prove that Hexasort is NP-hard, even when restricted to single-color stacks and progressively more constrained classes of graphs, culminating in strong NP-hardness on trees of either bounded height or degree. Towards fixed-parameter tractable algorithms, we identify settings in which the problem becomes polynomial-time solvable and present dynamic programming algorithms.

PARWiS: Winner determination under shoestring budgets using active pairwise comparisons

from arXiv: Computational Complexity

Authors: Shailendra Bhandari

Determining a winner among a set of items using active pairwise comparisons under a limited budget is a challenging problem in preference-based learning. The goal of this study is to implement and evaluate the PARWiS algorithm, which uses spectral ranking and disruptive pair selection to identify the best item under shoestring budgets. This work extends PARWiS with a contextual variant (Contextual PARWiS) and a reinforcement learning-based variant (RL PARWiS), comparing them against baselines including Double Thompson Sampling and a random selection strategy. The evaluation spans synthetic and real-world datasets (Jester and MovieLens), using budgets of 40, 60, and 80 comparisons for 20 items. Performance is measured through recovery fraction, true rank of the reported winner, reported rank of the true winner, and cumulative regret, alongside the separation metric \(Δ_{1,2}\). Results show that PARWiS and RL PARWiS outperform the baselines across all datasets, particularly on the Jester dataset with its higher \(Δ_{1,2}\), while performance gaps narrow on the more challenging MovieLens dataset with its smaller \(Δ_{1,2}\). Contextual PARWiS performs comparably to PARWiS, indicating that contextual features may require further tuning to provide significant benefits.
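
The abstract does not spell out the spectral step; as an illustrative stand-in, Rank Centrality is a standard spectral ranking method for pairwise-comparison data: build a random walk that drifts toward items that win often, and rank by stationary mass. A minimal sketch (an assumption on our part, not necessarily PARWiS's exact procedure):

```python
def rank_centrality(wins, iters=1000):
    """wins[i][j]: number of times item i beat item j.
    Transition i -> j with probability proportional to the empirical
    rate at which j beats i; the stationary distribution scores items."""
    n = len(wins)
    P = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                tot = wins[i][j] + wins[j][i]
                if tot:
                    P[i][j] = wins[j][i] / (tot * n)  # 1/n keeps rows substochastic
        P[i][i] = 1.0 - sum(P[i])  # lazy self-loop absorbs the slack
    # Power iteration for the stationary distribution pi = pi P.
    pi = [1.0 / n] * n
    for _ in range(iters):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi

# Item 0 beats 1 and 2 decisively; item 1 beats 2.
wins = [[0, 5, 5], [1, 0, 4], [0, 1, 0]]
scores = rank_centrality(wins)
print(max(range(3), key=scores.__getitem__))  # prints 0
```

A budgeted winner-determination loop would alternate this scoring step with an active choice of the next pair to compare.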

A note on Jerabek's paper "A simplified lower bound for implicational logic"

from arXiv: Computational Complexity

Authors: Lev Gordeev, Edward Hermann Haeusler

In our previous papers we sketched proofs of the equality NP = coNP = PSPACE. These results were obtained by proof-theoretic tree-to-dag compression techniques adapted to Prawitz's Natural Deduction (ND) for implicational minimal logic, with references to Hudelmaier's cut-free sequent calculus. In this note we comment on Jeřábek's approach that claimed to refute our results by providing exponential lower bounds for implicational minimal logic. This claim is wrong and misleading, which is briefly demonstrated by a basic example below.

Compatible Triangulations of Simple Polygons

from arXiv: Computational Geometry

Authors: Peyman Afshani, Boris Aronov, Kevin Buchin, Maike Buchin, Otfried Cheong, Katharina Klost, Carolin Rehs, Günter Rote

Let $P$ and $Q$ be simple polygons with $n$ vertices each. We wish to compute triangulations of $P$ and $Q$ that are combinatorially equivalent, if they exist. We consider two versions of the problem: if a triangulation of $P$ is given, we can decide in $O(n\log n + nr)$ time if $Q$ has a compatible triangulation, where $r$ is the number of reflex vertices of $Q$. If we are already given the correspondence between vertices of $P$ and $Q$ (but no triangulation), we can find compatible triangulations of $P$ and $Q$ in time $O(M(n))$, where $M(n)$ is the running time for multiplying two $n\times n$ matrices.

Towards Computing Average Merge Tree Based on the Interleaving Distance

from arXiv: Computational Geometry

Authors: Elena Farahbakhsh Touli, Ingrid Hotz, Talha Bin Masood

The interleaving distance is a key tool for comparing merge trees, which provide topological summaries of scalar functions. In this work, we define an average merge tree for a pair of merge trees using the interleaving distance. Since such an average is not unique, we propose a method to construct a representative average merge tree. We further prove that the resulting merge tree indeed satisfies a natural notion of averaging for the two given merge trees. To demonstrate the structure of the average merge tree, we include illustrative examples.

MergeDJD: A Fast Constructive Algorithm with Piece Merging for the Two-Dimensional Irregular Bin Packing Problem

from arXiv: Computational Geometry

Authors: Yi Zhou, Haocheng Fu, Yiping Liu, Jian Mao, Zhang-Hua Fu, Yuyi Wang

The two-dimensional irregular bin packing problem (2DIBPP) aims to pack a given set of irregular polygons, referred to as pieces, into fixed-size rectangular bins without overlap, while maximizing bin utilization. Although numerous metaheuristic algorithms have been proposed for the 2DIBPP, many industrial applications favor simpler constructive heuristics due to their deterministic behavior and low computational overhead. Among such methods, the DJD algorithm proposed by López-Camacho et al. is one of the most competitive constructive heuristics for the 2DIBPP. However, DJD is less effective for cutting instances, in which many pieces can be seamlessly combined into larger polygons. To address this issue, we propose MergeDJD, a novel constructive algorithm that integrates and extends the DJD framework. MergeDJD first preprocesses the instance by iteratively identifying groups of pieces that can be combined into larger and more regular pieces. It then employs an improved version of DJD, in which the placement strategy is enhanced to better handle non-convex and combined shapes, to pack all resulting pieces into bins. Computational experiments on 1,089 well-known benchmark instances show that MergeDJD consistently outperforms DJD on 1,083 instances while maintaining short runtimes. Notably, MergeDJD attains new best known values on 515 instances. Ablation studies further confirm the effectiveness of the proposed components. To facilitate reproducibility and future research, we have open-sourced the complete implementation and provided interfaces for visualizing packing results.

A Unified Approach to Memory-Sample Tradeoffs for Detecting Planted Structures

from arXiv: Data Structures and Algorithms

Authors: Sumegha Garg, Jabari Hastings, Chirag Pabbaraju, Vatsal Sharan

We present a unified framework for proving memory lower bounds for multi-pass streaming algorithms that detect planted structures. Planted structures -- such as cliques or bicliques in graphs, and sparse signals in high-dimensional data -- arise in numerous applications, and our framework yields multi-pass memory lower bounds for many such fundamental settings. We show memory lower bounds for the planted $k$-biclique detection problem in random bipartite graphs and for detecting sparse Gaussian means. We also show the first memory-sample tradeoffs for the sparse principal component analysis (PCA) problem in the spiked covariance model. For all these problems to which we apply our unified framework, we obtain bounds which are nearly tight in the low, $O(\log n)$ memory regime. We also leverage our bounds to establish new multi-pass streaming lower bounds, in the vertex arrival model, for two well-studied graph streaming problems: approximating the size of the largest biclique and approximating the maximum density of bounded-size subgraphs. To show these bounds, we study a general distinguishing problem over matrices, where the goal is to distinguish a null distribution from one that plants an outlier distribution over a random submatrix. Our analysis builds on a new distributed data processing inequality that provides sufficient conditions for memory hardness in terms of the likelihood ratio between the averaged planted and null distributions. This result generalizes the inequality of [Braverman et al., STOC 2016] and may be of independent interest. The inequality enables us to measure information cost under the null distribution -- a key step for applying subsequent direct-sum-type arguments and incorporating the multi-pass information cost framework of [Braverman et al., STOC 2024].

Achievability of Heterogeneous Hypergraph Recovery from its Graph Projection

from arXiv: Data Structures and Algorithms

Authors: Alexander Morgan, Chenghao Guo

We formulate and analyze a heterogeneous random hypergraph model, and we provide an achievability result for recovery of hyperedges from the observed projected graph. We observe a projected graph which combines random hyperedges across all degrees, where a projected edge appears if and only if both vertices appear in at least one hyperedge. Our goal is to reconstruct the original set of hyperedges of degree $d_j$ for some $j$. Our achievability result is based on the idea of selecting maximal cliques of size $d_j$ in the projected graph, and we show that this algorithm succeeds under a natural condition on the densities. This achievability condition generalizes a known threshold for $d$-uniform hypergraphs with noiseless and noisy projections. We conjecture the threshold to be optimal for recovering hyperedges with the largest degree.
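
The clique-selection idea can be sketched directly; the brute-force version below (illustrative only, with none of the paper's probabilistic analysis or efficiency) reports the size-$d$ maximal cliques of a projection:

```python
from itertools import combinations

def recover_hyperedges(vertices, proj_edges, d):
    """Report every size-d vertex set that forms a clique in the
    projected graph and is maximal, i.e. no outside vertex is adjacent
    to all of its members."""
    vs = list(vertices)
    adj = {v: set() for v in vs}
    for u, v in proj_edges:
        adj[u].add(v)
        adj[v].add(u)
    result = []
    for S in combinations(vs, d):
        # S must be a clique in the projection...
        if all(b in adj[a] for a, b in combinations(S, 2)):
            # ...and no vertex outside S may extend it to a larger clique.
            others = set(vs) - set(S)
            if not any(all(u in adj[w] for u in S) for w in others):
                result.append(S)
    return result

# Project two disjoint 3-hyperedges, then recover them as maximal triangles.
hyperedges = [(0, 1, 2), (3, 4, 5)]
proj = [e for h in hyperedges for e in combinations(h, 2)]
res = recover_hyperedges(range(6), proj, 3)
print(res)  # [(0, 1, 2), (3, 4, 5)]
```

In the random model, overlapping hyperedges can create spurious or non-maximal cliques; the paper's density condition is what controls how often that happens.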

Partition-based Simple Heaps

from arXiv: Data Structures and Algorithms

Authors: Gerth Stølting Brodal, John Iacono, Casper Moldrup Rysgaard, Sebastian Wild

We introduce a new family of priority-queue data structures: partition-based simple heaps. The structures consist of $O(\log n)$ doubly-linked lists; order is enforced among data in different lists, but the individual lists are unordered. Our structures have amortized $O(\log n)$ time extract-min and amortized $O(\log \log n)$ time insert and decrease-key. The structures require nothing beyond binary search over $O(\log n)$ elements, as well as binary partitions and concatenations of linked lists in natural ways as the linked lists get too big or small. We present three different ways that these lists can be maintained in order to obtain the stated amortized running times.
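
A toy rendition of the layout may help: unordered lists with ordered value ranges, a binary search over the range boundaries on insert, and a binary split when a list grows too large. This sketch sorts on splits, so it does not attain the paper's amortized bounds; it only illustrates the invariant that lets extract-min scan a single list:

```python
import random
from bisect import bisect_left

class PartitionHeap:
    """Simplified sketch, not the paper's structure: doubly-linked lists
    are modeled as Python lists and splits do a full sort."""
    def __init__(self, cap=8):
        self.cap = cap        # split a list when it exceeds this size
        self.bounds = []      # bounds[i] = current max value of lists[i]
        self.lists = []       # unordered lists; only their ranges are ordered
    def insert(self, x):
        if not self.lists:
            self.lists, self.bounds = [[x]], [x]
            return
        i = bisect_left(self.bounds, x)    # binary search over few boundary keys
        if i == len(self.lists):           # x exceeds every bound
            i -= 1
            self.bounds[i] = x
        self.lists[i].append(x)
        if len(self.lists[i]) > self.cap:  # binary partition of an oversized list
            lst = sorted(self.lists[i])
            mid = len(lst) // 2
            self.lists[i:i + 1] = [lst[:mid], lst[mid:]]
            self.bounds[i:i + 1] = [lst[mid - 1], self.bounds[i]]
    def extract_min(self):
        lst = self.lists[0]                # global min lives in the first list
        m = min(lst)                       # scan one small unordered list
        lst.remove(m)
        if not lst:
            del self.lists[0]
            del self.bounds[0]
        return m

# Heapsort via the structure as a sanity check.
rng = random.Random(1)
xs = [rng.randrange(100) for _ in range(50)]
h = PartitionHeap()
for x in xs:
    h.insert(x)
out = [h.extract_min() for _ in xs]
print(out == sorted(xs))  # True
```

The paper's actual contribution is maintaining such lists so that insert and decrease-key cost only amortized $O(\log \log n)$; the sketch above makes no such claim.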

High Probability Work Efficient Parallel Algorithms

from arXiv: Data Structures and Algorithms

Authors: Chase Hutton, Adam Melrod

Randomized parallel algorithms for many fundamental problems achieve optimal linear work in expectation, but upgrading this guarantee to hold with high probability (whp) remains a recurring theoretical challenge. In this paper, we address this gap for several core parallel primitives. First, we present the first parallel semisort algorithm achieving $O(n)$ work and $O(\text{polylog } n)$ depth whp, improving upon the $O(n)$ expected work bound of Gu et al. [SPAA 2015]. Our analysis introduces new concentration arguments based on simple tabulation hashing and tail bounds for weighted sums of geometric random variables. As a corollary, we obtain an integer sorting algorithm for keys in $[n]$ matching the same bounds. Second, we introduce a framework for boosting randomized parallel graph algorithms from expected to high probability linear work. The framework applies to \emph{locally extendable} problems -- those admitting a deterministic procedure that extends a solution across a graph cut in work proportional to the cut size. We combine this with a \emph{culled balanced partition} scheme: an iterative culling phase removes a polylogarithmic number of high-degree vertices, after which the remaining graph admits a balanced random vertex partition whp via a bounded-differences argument. Applying work-inefficient whp subroutines to the small pieces and deterministic extension across cuts yields overall linear work whp. We instantiate this framework to obtain $O(m)$ work and polylogarithmic depth whp algorithms for $(Δ+1)$-vertex coloring and maximal independent set.

Burning rooted graph products

from arXiv: Data Structures and Algorithms

Authors: John Peca-Medlin

The burning number $b(G)$ of a graph $G$ is the minimum number of rounds required to burn all vertices when, at each discrete step, existing fires spread to neighboring vertices and one new fire may be ignited at an unburned vertex. This parameter measures the speed of influence propagation in a network and has been studied as a model for information diffusion and resource allocation in distributed systems. A central open problem, the Burning Number Conjecture (BNC), asserts that every graph on $n$ vertices can be burned in at most $\lceil \sqrt n\rceil$ rounds, a bound known to be sharp for paths and verified for several structured families of trees. We investigate rooted graph products, focusing on comb graphs obtained by attaching a path (a ``tooth'') to each vertex of a path (the ``spine''). Unlike classical symmetric graph products, rooted products introduce hierarchical bottlenecks: communication between local subnetworks must pass through designated root vertices, providing a natural model for hub-and-spoke or chain-of-command architectures. We prove that the BNC holds for all comb graphs and determine the precise asymptotic order of their burning number in every parameter regime, including exact formulas in the spine-dominant case that generalize the known formula for paths. Our approach is constructive, based on an explicit greedy algorithm that is optimal or near-optimal depending on the regime.
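
The round structure admits a simple certificate check: a sequence $x_1,\dots,x_k$ burns $G$ iff every vertex lies within distance $k-i$ of some $x_i$, since the fire lit in round $i$ spreads for $k-i$ further rounds. A small verifier along those lines (illustrative; not the paper's greedy algorithm):

```python
from collections import deque

def bfs_dist(adj, src):
    """Unweighted shortest-path distances from src."""
    dist = {src: 0}
    q = deque([src])
    while q:
        u = q.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                q.append(v)
    return dist

def burns(adj, sources):
    """True iff igniting sources[i-1] in round i burns every vertex
    within len(sources) rounds."""
    k = len(sources)
    covered = set()
    for i, x in enumerate(sources, start=1):
        d = bfs_dist(adj, x)
        covered |= {v for v, dv in d.items() if dv <= k - i}
    return covered == set(adj)

# The path on 4 vertices has burning number 2 = ceil(sqrt(4)).
path4 = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
print(burns(path4, [1, 3]))  # True
print(burns(path4, [0, 3]))  # False: vertex 2 is never reached in time
```

For comb graphs one would feed in the spine-plus-teeth adjacency; the hard part, which the paper addresses, is choosing the ignition sequence, not verifying it.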

Consistent Low-Rank Approximation

from arXiv: Data Structures and Algorithms

Authors: David P. Woodruff, Samson Zhou

We introduce and study the problem of consistent low-rank approximation, in which rows of an input matrix $\mathbf{A}\in\mathbb{R}^{n\times d}$ arrive sequentially and the goal is to provide a sequence of subspaces that well-approximate the optimal rank-$k$ approximation to the submatrix $\mathbf{A}^{(t)}$ that has arrived at each time $t$, while minimizing the recourse, i.e., the overall change in the sequence of solutions. We first show that when the goal is to achieve a low-rank cost within an additive $\varepsilon\cdot||\mathbf{A}^{(t)}||_F^2$ factor of the optimal cost, roughly $\mathcal{O}\left(\frac{k}{\varepsilon}\log(nd)\right)$ recourse is feasible. For the more challenging goal of achieving a relative $(1+\varepsilon)$-multiplicative approximation of the optimal rank-$k$ cost, we show that a simple upper bound in this setting is $\frac{k^2}{\varepsilon^2}\cdot\text{poly}\log(nd)$ recourse, which we further improve to $\frac{k^{3/2}}{\varepsilon^2}\cdot\text{poly}\log(nd)$ for integer-bounded matrices and $\frac{k}{\varepsilon^2}\cdot\text{poly}\log(nd)$ for data streams with polynomial online condition number. We also show that $Ω\left(\frac{k}{\varepsilon}\log\frac{n}{k}\right)$ recourse is necessary for any algorithm that maintains a multiplicative $(1+\varepsilon)$-approximation to the optimal low-rank cost, even if the full input is known in advance. Finally, we perform a number of empirical evaluations to complement our theoretical guarantees, demonstrating the efficacy of our algorithms in practice.

Kruskal-EDS: Edge Dynamic Stratification

from arXiv: Data Structures and Algorithms

Authors: Yves Mercadier

We introduce \textbf{Kruskal-EDS} (\emph{Edge Dynamic Stratification}), a distribution-adaptive variant of Kruskal's minimum spanning tree (MST) algorithm that replaces the mandatory $Θ(m\log m)$ global sort with a three-phase procedure inspired by Birkhoff's ergodic theorem. In Phase 1, a sample of $\sqrt{m}$ edges estimates the weight distribution in $Θ(\sqrt{m}\log m)$ time. In Phase 2, all $m$ edges are assigned to $k$ strata in $Θ(m\log k)$ time via binary search on quantile boundaries -- no global sort. In Phase 3, strata are sorted and processed in order; execution halts as soon as $n{-}1$ MST edges are accepted. We prove an effective complexity of $Θ(m + p\cdot(m/k)\log(m/k))$, where $p \leq k$ is the number of strata actually processed. On sparse graphs or heavy-tailed weight distributions, $p \ll k$ and the algorithm achieves near-$Θ(m)$ behaviour. We further derive the optimal strata count $k^* = \lceil\sqrt{m/\ln(m+1)}\,\rceil$, balancing partition overhead against intra-stratum sort cost. An extensive benchmark on 14 graph families demonstrates correctness on 12 test cases and practical speedups reaching $\mathbf{10\times}$ in wall-clock time and $\mathbf{33\times}$ in sort operations over standard Kruskal. A 3-dimensional TikZ visualisation of the complexity landscape illustrates the algorithm's adaptive behaviour as a function of graph density and weight distribution skewness.
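A rough, self-contained sketch of the three-phase idea as I read it from the abstract (names and details are mine, not the authors' reference code). Because the quantile boundaries partition the weight range, processing strata in increasing order preserves the global sorted order, so the result is still an exact MST:

```python
import bisect
import math
import random

def _find(parent, x):
    while parent[x] != x:
        parent[x] = parent[parent[x]]  # path halving
        x = parent[x]
    return x

def kruskal_eds(n, edges, k=None, seed=0):
    """edges: list of (weight, u, v). Returns (total_weight, mst_edges)."""
    m = len(edges)
    rnd = random.Random(seed)
    if k is None:  # the abstract's k* = ceil(sqrt(m / ln(m+1)))
        k = max(1, math.ceil(math.sqrt(m / math.log(m + 1))))
    # Phase 1: sample ~sqrt(m) weights and take quantile boundaries.
    sample = sorted(rnd.sample([w for w, _, _ in edges],
                               min(m, math.isqrt(m) + 1)))
    bounds = [sample[(i * len(sample)) // k] for i in range(1, k)]
    # Phase 2: bucket every edge by binary search -- no global sort.
    strata = [[] for _ in range(k)]
    for e in edges:
        strata[bisect.bisect_right(bounds, e[0])].append(e)
    # Phase 3: sort strata lazily in increasing weight order; stop early.
    parent = list(range(n))
    total, mst = 0.0, []
    for stratum in strata:
        stratum.sort()
        for w, u, v in stratum:
            ru, rv = _find(parent, u), _find(parent, v)
            if ru != rv:
                parent[ru] = rv
                total += w
                mst.append((w, u, v))
                if len(mst) == n - 1:
                    return total, mst  # later strata are never sorted
    return total, mst
```

The adaptivity shows up in Phase 3: on favorable weight distributions only the first few strata are ever sorted.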

Monday, March 02

At least it's an ethos?

from Ben Recht

The limits of optimal control, from the maximalist and minimalist perspectives

This is a live blog of Lecture 4 of my graduate seminar “Feedback, Learning, and Adaptation.” A table of contents is here. The dramatic conclusion of Steampunk Data Science will appear later this week.

Though ostensibly chasing similar questions, the fields of artificial intelligence and control theory couldn’t be more temperamentally different. In AI, much like the rest of computer science, there’s a mindset to “just do stuff” and patch bugs as they come. There’s a spirit of tinkering and benchmaxxing, and the expectation that systems will get more robust as they get more capable. I mean, the AI Safety people seem to think that the way to get safe AI is to spend billions of dollars building a superintelligent AI. They say this out loud to anyone willing to listen.

Control theory, by contrast, wants you to learn non-self-adjoint operator theory before you think about building anything. If you’re designing an airplane or any other system where failure can lead to catastrophe, safety has to come first. And the path to safety is always more math.

Finding a nice middle ground between these two mindsets has been much harder than I’d like. And yet, the two fields connect at optimal control. The foundation of reinforcement learning (and its successor, recursive self-improvement) is optimal control. Modern control theory emerged in the Cold War planning of the 50s, when aerospace engineers developed optimal control to plot rocket trajectories. In class today, we’ll look at both perspectives and see why they not only fail to meet in the middle, but also leave us with rather unsatisfactory views of the role of optimization in sequential decision-making.

In its most grandiose formulation, one often adopted by rationalist AI maximalists, optimal control claims a universal model for making decisions in the face of uncertainty. You just need the right cost function and proper Bayesian updating.

This view falls apart once you move beyond the toy problems people nerd out about in online rationalist forums. Most problems have more stages than the Monty Hall problem. And once you have a modestly long horizon, exact solutions of optimal control problems are beyond impossible. We all quickly learn that with any reasonable complexity, you have to move from dynamic programming to approximate dynamic programming, meaning you are never sure if you are actually solving your original problem. Moreover, as soon as you have to incorporate measurement uncertainty and those clever Bayesian updates, the associated optimization problems are at best PSPACE-complete or at worst undecidable. You could try to argue to me that your POMDP problem isn’t PSPACE-complete, and enough iterations of GFYPO will find the answer. But this means that your superintelligent optimizer isn’t actually solving optimization problems. What it’s actually doing is anyone’s guess.

Perhaps we can retreat to the more modest view promoted in contemporary control classes. Optimal control is a nice scaffolding for engineering design. You get a framework to locally make sense of a huge parameter space. If you’re trying to tune multiple PID loops at once, you’d probably rather link things together with a Q and an R than a few dozen controller gains without a clear relation between them. But it’s a local framework (a small world framework, if you will), not a global system.

And the optimization framework often gives you some robustness. In the LQR problem, we know that if we find a control policy, it is stabilizing. We know that LQR gives us a way to bound the amount of error we’ll accumulate if unmodeled noise perturbs the system. And we know that near-optimal solutions are often pretty good. Some early exciting work showed that LQR was robust to misspecification of the system model, guaranteeing large gain and phase margins.
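For concreteness, here is my own toy illustration (not from the lecture): a discrete-time LQR gain computed by plain Riccati value iteration, with the stabilization claim checked numerically on a double integrator.

```python
import numpy as np

def dlqr(A, B, Q, R, iters=500):
    # Riccati value iteration for the infinite-horizon discrete LQR:
    #   K = (R + B'PB)^{-1} B'PA,   P <- Q + A'P(A - BK)
    P = Q.copy()
    for _ in range(iters):
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Q + A.T @ P @ (A - B @ K)
    return K, P

# Double integrator: marginally unstable open loop (repeated eigenvalue 1),
# stabilized by the LQR gain.
dt = 0.1
A = np.array([[1.0, dt], [0.0, 1.0]])
B = np.array([[0.0], [dt]])
K, P = dlqr(A, B, np.eye(2), np.array([[1.0]]))
print(np.max(np.abs(np.linalg.eigvals(A - B @ K))) < 1.0)  # prints True
```

The point of the framing in the post: you tune the two matrices Q and R rather than the raw entries of K.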

The gain margins, sadly, turned out to be a bit of a mathematical illusion. As soon as you incorporate the Bayesian reasoning that rationally summarizes the uncertainty in measurement, the policy becomes exceptionally fragile to model misspecification. This is Doyle’s famous “There Are None” example, and we’ll work through the details today.

You can incorporate robustness directly into the formulation, but this comes at its own computational and conceptual costs. Robust control is not easy and is seldom taught. My colleagues can correct me, but I’m not sure we ever teach it at Berkeley. Why that is will be the topic of a future blog post.

So from both the maximalist RL perspective and the minimalist control perspective, we’re stuck…with LQR. Any other optimal control problem of reasonable complexity is intractable and inherently fragile to model mismatch and measurement uncertainty. This feels like a funny place to lay the foundation of an engineering philosophy!

So why would we build an entire system around optimization? I suppose I understand the promising allure of just writing down cost functions and having robots come out. But even people who don’t work on optimization know that there are always tradeoffs in designing systems and making decisions.1 If the goal is literally just “minimize cost,” we get cheap, fragile garbage out. We have to account for all the things we might care about, and write these explicitly into the cost, the constraints, or the solution heuristics.

Ultimately, real robustness gets added in with engineering expertise rather than mathematical rigor. There’s no getting around the years of hard work in simulation and pilot programs before you can convince everyone your system is actually ready for prime time. Optimal control gives you a false sense of interpretability and rigor, and it’s important to be periodically reminded of its illusoriness.

1

You could argue people who don’t work on optimization probably have a better perspective on this than those of us in the field.

By Ben Recht

TR26-033 | Simple XOR lemma | Emanuele Viola

from ECCC Papers

I give an alternative proof of the XOR lemma which may provide a simple explanation of why XOR-ing decreases correlation.

Goodhart's law: Ken Jennings and Types of Knowledge

from Computational Complexity

Goodhart's law: When a measure becomes a target, it stops being a measure.  

I was watching the show Master Minds, where Ken Jennings is one of the masterminds. Here is what happened:

Brooke Burns (the host): The only vice president in the 20th century with initials H.H. was Hubert Humphrey. Who was the only vice president in the 19th century with initials H.H.?

Bill Gasarch shouting at the screen: Hannibal Hamlin, Lincoln's VP for his first term! This is really obscure so this may be a rare case where I get a question right that Ken Jennings does not know!

Ken Jennings: Hannibal Hamlin. 

Bill Gasarch: Darn! However, I suspect he studied lists of presidents and vice presidents for the purpose of doing well on Jeopardy and now other shows. My knowledge is more natural in that I read books on presidents and vice presidents (best book, maybe the only book,  on Vice Presidents: Bland Ambition).  My knowledge of it is different from his. I tend to think my knowledge is more legitimate, though it would be hard to make that statement rigorous. If on quiz shows they asked follow-up questions, that might help alleviate this problem, if it is indeed a problem.

Misc: 

Ken Jennings is a Mormon so he does not drink or know about drinks. However, before going on Jeopardy he studied drinks for the show.

Ken Jennings does know novelty songs, and that is legit since novelty songs come up rarely on Jeopardy, so I doubt he studied them in his preparation for going on the show.

a) He knew that Shel Silverstein wrote A Boy Named Sue, which was sung by Johnny Cash.

b) He knew that Johnny Cash did a cover of Shel Silverstein's I'm Being Swallowed by a Boa Constrictor.

c) During his streak there was a category on novelty songs. He got 4 out of the 5 correct, and the one he got wrong I could tell he really knew. I got them all right, and faster, but Ken Jennings gets my respect for his legit knowledge of novelty songs. See here for the questions and answers. 

Goodhart's law (maybe): Jeopardy is supposed to be about what people know. Is it instead about what people study? Does studying for it help you with anything other than quiz shows?

When I watch a quiz show and get a question right there are levels of legitimacy:

1) I know the area naturally. 

2) I saw the question and answer on a different show, and I then looked it up and learned more about it. So this is now legit.

3) I saw the question and answer on a different show and just know it as stimulus-response with very little understanding. Example: Kelly Clarkson was the first winner on American Idol, but I have never seen the show and only vaguely know how it works.

4) I saw the question and answer on the show I am watching, the exact episode, and I am watching a repeat. Sometimes I know stuff I don't normally know and then say OH, I've seen this episode before. 

Back to my point:

Is Ken Jennings's memorizing of lists a less legit form of knowledge? If so, how can that statement be made rigorous?

Is this what AI does? See also the Chinese Room, since that's also about a device that gets the right answers but perhaps for the "wrong" reason.

By gasarch

Sandwiching Polynomials for Geometric Concepts with Low Intrinsic Dimension

from arXiv: Computational Complexity

Authors: Adam R. Klivans, Konstantinos Stavropoulos, Arsen Vasilyan

Recent work has shown the surprising power of low-degree sandwiching polynomial approximators in the context of challenging learning settings such as learning with distribution shift, testable learning, and learning with contamination. A pair of sandwiching polynomials approximate a target function in expectation while also providing pointwise upper and lower bounds on the function's values. In this paper, we give a new method for constructing low-degree sandwiching polynomials that yield greatly improved degree bounds for several fundamental function classes and marginal distributions. In particular, we obtain degree $\mathrm{poly}(k)$ sandwiching polynomials for functions of $k$ halfspaces under the Gaussian distribution, improving exponentially over the prior $2^{O(k)}$ bound. More broadly, our approach applies to function classes that are low-dimensional and have smooth boundary. In contrast to prior work, our proof is relatively simple and directly uses the smoothness of the target function's boundary to construct sandwiching Lipschitz functions, which are amenable to results from high-dimensional approximation theory. For low-dimensional polynomial threshold functions (PTFs) with respect to Gaussians, we obtain doubly exponential improvements without applying the FT-mollification method of Kane used in the best previous result.

Black-Box PWPP Is Not Turing-Closed

from arXiv: Computational Complexity

Authors: Pavel Hubáček

We establish that adaptive collision-finding queries are strictly more powerful than non-adaptive ones by proving that the complexity class PWPP (Polynomial Weak Pigeonhole Principle) is not closed under adaptive Turing reductions relative to a random oracle. Previously, PWPP was known to be closed under non-adaptive Turing reductions (Jeřábek 2016). We demonstrate this black-box separation by introducing the NESTED-COLLISION problem, a natural collision-finding problem defined on a pair of shrinking functions. We show that while this problem is solvable via two adaptive calls to a PWPP oracle, its random instances cannot be solved via a black-box non-adaptive reduction to the canonical PWPP-complete problem COLLISION.

On the Need for (Quantum) Memory with Short Outputs

from arXiv: Computational Complexity

Authors: Zihan Hao, Zikuan Huang, Qipeng Liu

In this work, we establish the first separation between computation with bounded and unbounded space, for problems with short outputs (i.e., working memory can be exponentially larger than output size), both in the classical and the quantum setting. Towards that, we introduce a problem called nested collision finding, and show that optimal query complexity can not be achieved without exponential memory. Our result is based on a novel ``two-oracle recording'' technique, where one oracle ``records'' the computation's long outputs under the other oracle, effectively reducing the time-space trade-off for short-output problems to that of long-output problems. We believe this technique will be of independent interest for establishing time-space tradeoffs in other short-output settings.

Spiky Rank and Its Applications to Rigidity and Circuits

from arXiv: Computational Complexity

Authors: Lianna Hambardzumyan, Konstantin Myasnikov, Artur Riazanov, Morgan Shirley, Adi Shraibman

We introduce spiky rank, a new matrix parameter that enhances blocky rank by combining the combinatorial structure of the latter with linear-algebraic flexibility. A spiky matrix is block-structured with diagonal blocks that are arbitrary rank-one matrices, and the spiky rank of a matrix is the minimum number of such matrices required to express it as a sum. This measure extends blocky rank to real matrices and is more robust for problems with both combinatorial and algebraic character. Our conceptual contribution is as follows: we propose spiky rank as a well-behaved candidate matrix complexity measure and demonstrate its potential through applications. We show that large spiky rank implies high matrix rigidity, and that spiky rank lower bounds yield lower bounds for depth-2 ReLU circuits, the basic building blocks of neural networks. On the technical side, we establish tight bounds for random matrices and develop a framework for explicit lower bounds, applying it to Hamming distance matrices and spectral expanders. Finally, we relate spiky rank to other matrix parameters, including blocky rank, sparsity, and the $γ_2$-norm.

Universal NP-Hardness of Clustering under General Utilities

from arXiv: Computational Complexity

Authors: Angshul Majumdar

Clustering is a central primitive in unsupervised learning, yet practice is dominated by heuristics whose outputs can be unstable and highly sensitive to representations, hyperparameters, and initialisation. Existing theoretical results are largely objective-specific and do not explain these behaviours at a unifying level. We formalise the common optimisation core underlying diverse clustering paradigms by defining the Universal Clustering Problem (UCP): the maximisation of a polynomial-time computable partition utility over a finite metric space. We prove the NP-hardness of UCP via two independent polynomial-time reductions from graph colouring and from exact cover by 3-sets (X3C). By mapping ten major paradigms -- including k-means, GMMs, DBSCAN, spectral clustering, and affinity propagation -- to the UCP framework, we demonstrate that each inherits this fundamental intractability. Our results provide a unified explanation for characteristic failure modes, such as local optima in alternating methods and greedy merge-order traps in hierarchical clustering. Finally, we show that clustering limitations reflect interacting computational and epistemic constraints, motivating a shift toward stability-aware objectives and interaction-driven formulations with explicit guarantees.

Better Learning-Augmented Spanning Tree Algorithms via Metric Forest Completion

from arXiv: Data Structures and Algorithms

Authors: Nate Veldt, Thomas Stanley, Benjamin W. Priest, Trevor Steil, Keita Iwabuchi, T. S. Jayram, Grace J. Li, Geoffrey Sanders

We present improved learning-augmented algorithms for finding an approximate minimum spanning tree (MST) for points in an arbitrary metric space. Our work follows a recent framework called metric forest completion (MFC), where the learned input is a forest that must be given additional edges to form a full spanning tree. Veldt et al. (2025) showed that optimally completing the forest takes $Ω(n^2)$ time, but designed a 2.62-approximation for MFC with subquadratic complexity. The same method is a $(2γ+ 1)$-approximation for the original MST problem, where $γ\geq 1$ is a quality parameter for the initial forest. We introduce a generalized method that interpolates between this prior algorithm and an optimal $Ω(n^2)$-time MFC algorithm. Our approach considers only edges incident to a growing number of strategically chosen ``representative'' points. One corollary of our analysis is to improve the approximation factor of the previous algorithm from 2.62 for MFC and $(2γ+1)$ for metric MST to 2 and $2γ$ respectively. We prove this is tight for worst-case instances, but we still obtain better instance-specific approximations using our generalized method. We complement our theoretical results with a thorough experimental evaluation.

GPU-Native Approximate Nearest Neighbor Search with IVF-RaBitQ: Fast Index Build and Search

from arXiv: Data Structures and Algorithms

Authors: Jifan Shi, Jianyang Gao, James Xia, Tamás Béla Fehér, Cheng Long

Approximate nearest neighbor search (ANNS) on GPUs is gaining increasing popularity for modern retrieval and recommendation workloads that operate over massive high-dimensional vectors. Graph-based indexes deliver high recall and throughput but incur heavy build-time and storage costs. In contrast, cluster-based methods build and scale efficiently yet often need many probes for high recall, straining memory bandwidth and compute. Aiming to simultaneously achieve fast index build, high-throughput search, high recall, and low storage requirement for GPUs, we present IVF-RaBitQ (GPU), a GPU-native ANNS solution that integrates the cluster-based method IVF with RaBitQ quantization into an efficient GPU index build/search pipeline. Specifically, for index build, we develop a scalable GPU-native RaBitQ quantization method that enables fast and accurate low-bit encoding at scale. For search, we develop GPU-native distance computation schemes for RaBitQ codes and a fused search kernel to achieve high throughput with high recall. With IVF-RaBitQ implemented and integrated into the NVIDIA cuVS Library, experiments on cuVS Bench across multiple datasets show that IVF-RaBitQ offers a strong performance frontier in recall, throughput, index build time, and storage footprint. For Recall approximately equal to 0.95, IVF-RaBitQ achieves 2.2x higher QPS than the state-of-the-art graph-based method CAGRA, while also constructing indices 7.7x faster on average. Compared to the cluster-based method IVF-PQ, IVF-RaBitQ delivers on average over 2.7x higher throughput while avoiding accessing the raw vectors for reranking.
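To make the cluster-probing idea concrete, here is a plain CPU sketch of generic IVF search (my illustration only; it includes neither RaBitQ quantization nor any of the paper's GPU kernels):

```python
import numpy as np

def build_ivf(X, n_clusters, iters=10, seed=0):
    # Cluster the base vectors with plain Lloyd's k-means and build
    # one inverted list of vector ids per cluster.
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), n_clusters, replace=False)]
    for _ in range(iters):
        assign = np.argmin(((X[:, None] - centroids[None]) ** 2).sum(-1), axis=1)
        for c in range(n_clusters):
            pts = X[assign == c]
            if len(pts):
                centroids[c] = pts.mean(0)
    lists = [np.where(assign == c)[0] for c in range(n_clusters)]
    return centroids, lists

def ivf_search(X, centroids, lists, q, k=5, nprobe=2):
    # Scan only the nprobe clusters whose centroids are closest to q;
    # larger nprobe trades throughput for recall.
    order = np.argsort(((centroids - q) ** 2).sum(-1))[:nprobe]
    cand = np.concatenate([lists[c] for c in order])
    d = ((X[cand] - q) ** 2).sum(-1)
    return cand[np.argsort(d)[:k]]
```

With nprobe equal to the number of clusters the search degenerates to exact brute force; the paper's contribution is keeping recall high at small nprobe by scanning compact RaBitQ codes instead of raw vectors.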


Variants of Merge-Width and Applications

from arXiv: Data Structures and Algorithms

Authors: Karolina Drabik, Maël Dumas, Colin Geniet, Jakub Nowakowski, Michał Pilipczuk, Szymon Toruńczyk

Merge-width is a recently introduced family of graph parameters that unifies treewidth, clique-width, twin-width, and generalised colouring numbers. We prove the equivalence of several alternative definitions of merge-width, thus demonstrating the robustness of the notion. Our characterisation via definable merge-width uses vertex orderings inspired by generalised colouring numbers from sparsity theory, and enables us to obtain the first non-trivial approximation algorithm for merge-width, running in time $n^{O(1)} \cdot 2^n$. We also obtain a new characterisation of bounded clique-width in terms of vertex orderings, and establish that graphs of bounded merge-width admit sparse quotients with bounded strong colouring numbers, are quasi-isometric to graphs of bounded expansion, and admit neighbourhood covers with constant overlap. We also discuss several other variants of merge-width and connections to adjacency labelling schemes.


An Improved Lower Bound for Local Failover in Directed Networks via Binary Covering Arrays

from arXiv: Data Structures and Algorithms

Authors: Erik van den Akker, Klaus-Tycho Foerster

Communication networks often rely on some form of local failover rules for fast forwarding decisions upon link failures. While on undirected networks up to two failures can be tolerated when matching only packet origin and destination, on directed networks tolerance to even a single failure cannot be guaranteed. Previous results have shown a lower bound of at least $\lceil\log(k+1)\rceil$ rewritable bits to tolerate $k$ failures. We improve on this lower bound for $k\geq 2$ by constructing a network in which successful routing is linked to the \textit{Covering Array Problem} on a binary alphabet, leading to a lower bound of $\Omega(k + \lceil\log\log(\lceil\frac{n}{4}\rceil-k)\rceil)$ for $k$ failures in an $n$-node network.


Solving No-wait Scheduling for Time-Sensitive Networks with Daisy-Chain Topology

from arXiv: Data Structures and Algorithms

Authors: Qian Li, Henan Liu, Heng Liu, Yuyi Wang

Time-Sensitive Networking (TSN) is a set of standards aiming to enable deterministic and predictable communication over Ethernet networks. However, as the standards of TSN do not specify how to schedule the data streams, the main open problem around TSN is how to compute schedules efficiently and effectively. In this paper, we solve this open problem for no-wait schedules on the daisy-chain topology, one of the most commonly used topologies. Precisely, we develop an efficient algorithm that optimally computes no-wait schedules for the daisy-chain topology, with a time complexity that scales polynomially in both the number of streams and the network size. The basic idea is to recast the no-wait scheduling problem as a variant of a graph coloring problem where some restrictions are imposed on the colors available for every vertex, and where the underlying graph is an interval graph. Our main technical part is to show that this variant of graph coloring problem can be solved in polynomial time for interval graphs, though it is NP-hard for general graphs. Evaluations based on real-life TSN systems demonstrate its optimality and its ability to scale with up to tens of thousands of streams.


2G2T: Constant-Size, Statistically Sound MSM Outsourcing

from arXiv: Data Structures and Algorithms

Authors: Majid Khabbazian

Multi-scalar multiplication (MSM), defined as MSM(P, x) = sum_{i=1}^n x_i P_i, is a dominant computational kernel in discrete-logarithm-based cryptography and often becomes a bottleneck for verifiers and other resource-constrained clients. We present 2G2T, a simple protocol for verifiably outsourcing MSM to an untrusted server. After a one-time keyed setup for fixed bases P = (P1, ..., Pn) that produces a public merged-bases vector T and client secret state, the server answers each query x = (x1, ..., xn) with only two group elements: A claimed to equal MSM(P, x) and an auxiliary value B claimed to equal MSM(T, x). Verification requires a single length-n field inner product and a constant number of group operations (two scalar multiplications and one addition), while the server performs two MSMs. In our Ristretto255 implementation, verification is up to ~300x faster than computing the MSM locally using a highly optimized MSM routine for n up to 2^18, and the server-to-client response is constant-size (two compressed group elements, 64 bytes on Ristretto255). Despite its simplicity and efficiency, 2G2T achieves statistical soundness: for any (even computationally unbounded) adversarial server, the probability of accepting an incorrect result is at most 1/q per query, and at most e/q over e adaptive executions, in a prime-order group of size q.


Additive One Approximation for Minimum Degree Spanning Tree: Breaking the $O(mn)$ Time Barrier

from arXiv: Data Structures and Algorithms

Authors: Sayan Bhattacharya, Ermiya Farokhnejad, Haoze Wang

We consider the ``minimum degree spanning tree'' problem. As input, we receive an undirected, connected graph $G=(V, E)$ with $n$ nodes and $m$ edges, and our task is to find a spanning tree $T$ of $G$ that minimizes $\max_{u \in V} \text{deg}_T(u)$, where $\text{deg}_T(u)$ denotes the degree of $u \in V$ in $T$. The problem is known to be NP-hard. In the early 1990s, an influential work by Fürer and Raghavachari presented a local search algorithm that runs in $\tilde{O}(mn)$ time and returns a spanning tree with maximum degree at most $\Delta^\star+1$, where $\Delta^\star$ is the optimal objective. This remained the state-of-the-art runtime bound for computing an additive one approximation, until now. We break this $O(mn)$ runtime barrier, dating back three decades, by providing a deterministic algorithm that returns an additive one approximate optimal spanning tree in $\tilde{O}(mn^{3/4})$ time. This constitutes substantive progress towards answering an open question that has been repeatedly posed in the literature [Pettie'2016, Duan and Pettie'2020, Saranurak'2024]. Our algorithm is based on a novel application of the blocking flow paradigm.


Sunday, March 01

Making accessible LaTeX talk slides with ltx-talk

from David Eppstein

US-based academics are being required by the government and by their universities to make all online content accessible by April 24, and I think many of us have been running around trying to figure out what that means and how to do it. My university has been especially unhelpful, demanding compliance in vague terms but not telling us what standard they’re using and pointing only to Microsoft and Google’s web sites for guidance on how to improve accessibility for Microsoft and Google products. The relevant standard appears from the government links to be WCAG 2.1 AA. For those of us who have been using the LaTeX beamer package to create mathematical course lecture notes as pdf files, beamer cannot do this. The accessibility standards require tagged pdf and beamer does not have code to generate the tags correctly. But there is a solution: a replacement package, ltx-talk, that is mostly compatible with beamer. After some effort, I have succeeded in using ltx-talk to create slide decks that closely resemble my old beamer decks both in appearance and coding and that pass all automated accessibility checks in Acrobat. That makes now a good time to record what I have learned about going from beamer to accessible ltx-talk, before I forget it all again.


Online guides

I have found three helpful guides to accessible LaTeX (not specifically about ltx-talk): “Using LaTeX to produce accessible PDF” from the LaTeX tagging project, “A quick primer on modifying existing LaTeX for digital accessibility” by Richard Wong at Rice University, and “Creating accessible PDFs in LaTeX” from Overleaf. If you are using some unusual LaTeX packages or document classes, the LaTeX tagging project also maintains a useful list of the tagging status of LaTeX packages and classes. This includes the information that beamer cannot generate accessible tagged pdf but ltx-talk can. These links got me from a state of having no idea how to generate accessible pdf files from my slides to a state where I thought it might be possible, and helped me start setting up my files. Most of the rest was from carefully reading log files and accessibility reports and then experimenting to figure out what to change in order to make the errors and warnings go away.

This posting is not intended as a substitute for these guides, but rather as a collection of tips and tricks for converting beamer slide decks into accessible ltx-talk slide decks.

Compilation

To compile ltx-talk files and produce accessible tagged pdf files, you need to run lualatex instead of pdflatex. Sometimes you need as many as four runs: one run of lualatex to get an aux file with bibliography items, a run of bibtex, and then three more runs of lualatex before it stabilizes and stops telling you to run again because the labels have changed or because it miscalculated page numbers and added a dummy page. Check the log files for these things.
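If you use latexmk — an assumption on my part; the workflow above invokes lualatex and bibtex by hand — the rerun dance can be automated with a latexmkrc along these lines (a sketch, not something the guides above prescribe):

```perl
# latexmkrc — assumption: latexmk is installed; the post itself runs lualatex/bibtex manually
$pdf_mode   = 4;   # 4 selects lualatex for PDF generation
$bibtex_use = 2;   # run bibtex automatically when citations change
```

With this in place, `latexmk mydeck.tex` reruns lualatex until labels and page numbers stabilize, which covers the up-to-four-run cycle described above.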

You also need to be running a very recent version of LaTeX, dated November 2025 or later. TeX Live 2025, released in March 2025, will not work; the new TeX Live 2026 pretest will. I use macOS and have not figured out whether there is any way to access the TeX Live pretest on a Mac. Instead I have been using either Overleaf (through its Overleaf Labs feature) or a Linux install on some departmental machines accessible to me via ssh. This setup already includes the ltx-talk package; you do not need to install it separately.

There are apparently incompatibilities between recent releases of LaTeX and the cleveref package. Fortunately, my slide decks do not use cleveref.

Preamble

The magic incantation at the start of your file is different, of course, because it’s a different package but also because tagged pdf needs a metadata command before the document class to set it up. The old incantation (for, say, a \(16\times 9\) target aspect ratio) was:

\documentclass[aspectratio=169]{beamer}

Instead, now it is:

\DocumentMetadata{
  lang          = en,
  pdfstandard   = {ua-2,a-4f},
  tagging       = on,
  tagging-setup = {math/setup=mathml-SE} 
}
\documentclass[aspect-ratio=16:9]{ltx-talk}

Change the lang = en line if you are writing in a different language than English. Wong suggests instead tagging-setup={math/setup=mathml-SE,math/alt/use}; I have no idea whether that would be an improvement.

The other important change in the LaTeX preamble works around what I think is a bug in ltx-talk. By default, it tags frame titles with an h4-level header tag. But there are no h1, h2, or h3-level header tags that it produces. This causes Acrobat’s accessibility checker to complain about header nesting. Maybe the right thing to do is to add some h1, h2, and h3-level tags, for instance on the title page, but I haven’t figured out how to do that. Instead, I changed the frame titles to h1, with

\tagpdfsetup{
  role / new-tag = frametitle / H1
}

The default appearance produced by ltx-talk is close to beamer with the structure skin, but not quite the same. You can change it in your LaTeX preamble but the documentation for how to change it is somewhat lacking. What worked for me is to look for code like DeclareTemplateInterface in ltx-talk.cls and to use \EditInstance to change it. For instance I use the code

\definecolor{ksblue}{RGB}{0,129,205}
\EditInstance{header}{std}{
  color            = ksblue,
  left-hspace      = 0cm plus 1fil,
  right-hspace     = 0cm plus 1fil
}

to change the color of frame titles and center them. I also use

\renewcommand{\labelitemi}{ {\footnotesize\color{ksblue}$\blacktriangleright$}}
\renewcommand{\labelitemii}{ {\scriptsize\color{ksblue}$\blacktriangleright$}}

to make big triangular itemize bullets like beamer, instead of the default little circular ones. (So far I have only converted slides with two levels of itemize nesting.)

Figures

In the slide content, the easiest change to explain, but the most time-consuming one for me, is adding alt-text to all images. The syntax is straightforward: \includegraphics[alt={Alt text goes here}]{filename}. I have seen big online debates on what alt-text should describe and how detailed it should be. The important thing to remember is that you’re not trying to describe the image in vivid-enough detail that an AI image generator could make a copy of it. The purpose of these things is to substitute for the image when people use a screenreader to convert your slides into spoken text. So it should be concise enough that it doesn’t interrupt the flow of the text when spoken, but informative enough that people using a screenreader don’t miss out on the meaning. For complicated mathematical examples that can be a big ask.
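For instance, an image inclusion with alt-text might look like this (the filename and the description are both placeholders of my own invention):

```latex
\includegraphics[alt={Diagram of a planar graph with its dual graph overlaid}]{planar-dual}
```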

I had one figure, created as a pdf file by Adobe Illustrator, that triggered a flate decode error in lualatex. The same image had worked correctly in pdflatex and beamer. The compilation continued with a warning but the figure did not render correctly in the compiled file. I could not figure out why. The only workaround I could find was to save it as a different pdf version in Illustrator.

Environments

Getting ltx-talk to run required a few changes elsewhere in my LaTeX files. I was using \begin{frame}{Title of frame}, and ltx-talk has an option to make that work, but by default it doesn’t. Instead you need to use \begin{frame}\frametitle{Title of frame}. Using unicode characters such as an en-dash in a frame title leads to an “unreferenced destination” error; code them in ASCII (in this case a double hyphen) instead.
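To make the change concrete, a minimal sketch of the frame syntax described above (the title text is a placeholder; note the en-dash coded as an ASCII double hyphen):

```latex
% beamer form, which does not work in ltx-talk's default configuration:
% \begin{frame}{Puzzle pieces -- an overview}
% ltx-talk form:
\begin{frame}
\frametitle{Puzzle pieces -- an overview}
% slide content goes here
\end{frame}
```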

Columns need to be delimited by \begin{column}{size}...\end{column} instead of being started by \column. (Also, the spacing between columns differs between beamer and ltx-talk, so you may need some reformatting to make them look good.) The same point about using an environment rather than a command applies to some formatting commands like \flushright: it won’t cause a LaTeX error, but it will cause the tagging to get mismatched, with a warning message at the end of the log. Then you have to work through your file trying to figure out which slide caused the mismatch.
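A sketch of a two-column slide body in the environment-based form, assuming (as in beamer) that the column environments sit inside a surrounding columns environment; the widths are placeholders:

```latex
\begin{columns}
  \begin{column}{0.55\textwidth}
    Left column content.
  \end{column}
  \begin{column}{0.4\textwidth}
    Right column content.
  \end{column}
\end{columns}
```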

If you use verbatim environments, beamer needed to mark this by \begin{frame}[fragile]. This still works in ltx-talk but with a different syntax, \begin{frame*}. If you use tabular environments, you need to tell LaTeX which rows of your table are header rows. See the accessibility guides linked above.
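A minimal sketch of a verbatim slide under the ltx-talk syntax noted above (in beamer this would instead open with \begin{frame}[fragile]):

```latex
\begin{frame*}
\frametitle{Code sample}
\begin{verbatim}
printf("hello\n");
\end{verbatim}
\end{frame*}
```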

Formulas

There is Lua code somewhere that generates mathml tags for the mathematical formulas (I think this is the main reason you need to use lualatex). It is not as robust as LaTeX itself. Using code like \bigl and \bigr will cause a tagging mismatch; use \left and \right instead and let LaTeX choose for itself how big to make the parentheses. Using \dots inside math often does not work at all, and in one case using $\dots$ inside a tabular environment caused lualatex to crash hard. Use \ldots or \cdots instead. Also, I think using mathematics inside \text inside more mathematics caused a tagging mismatch; don’t nest them like that. This code will also generate files with names of the form *-luamml-mathml.html; add that pattern to your .gitignore file.
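Applying the substitutions above, a formula might be rewritten like this (the formula itself is just an illustration of mine):

```latex
% Before: delimiters and dots that break the mathml tagging
% \[ f(x) = \bigl( x_1 + \dots + x_n \bigr)^2 \]
% After: let LaTeX size the parentheses, and use \cdots
\[ f(x) = \left( x_1 + \cdots + x_n \right)^2 \]
```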

I had one deck (discussing the Ackermann function) containing the expression \(\approx 2\times 10^{19728}\). This caused lualatex to freeze. The only hypothesis I can find is that it looks like a formula whose numerical value can be calculated and that the mathml conversion code was trying to calculate it, using a slow exponentiation algorithm.

Sections

If your deck has 21 or more slides, Acrobat will complain if it doesn’t also have bookmarks for navigation within sections of the deck. I did this using \pdfbookmark[1]{Bookmark name}{Bookmark name} at the start of each section. Maybe there is a better way. This would be a good place to put more higher-level header tags than the H4 frame titles, if I knew how.

Many of my slide decks have bibtex bibliography sections. I don’t generally show these when speaking but I want them to be part of the pdf files of my slide decks that I distribute. I like to use natbib for this (plus the doi package to make the external links and dois work properly). But natbib never worked correctly in beamer; I needed to add the code \newcommand{\newblock}{} to the preamble to make it work. In ltx-talk, I also needed to add \newcommand{\thebibliography}{\relax} before loading natbib. Then you can just put your bibliography within one of the frames. A problem I have not found a good workaround for is that beamer allows multi-frame bibliographies with \begin{frame}[allowframebreaks]. However, ltx-talk does not and reading its documentation reveals that its developer is very hostile to long bibliographies (such as would arise in a talk incorporating a literature review). The only workaround I have found is to use a small-enough font size (in some cases \tiny) to allow the whole bibliography to fit on one slide. This is mildly problematic with respect to accessibility but better than truncating the bibliography because it overflows the slide.
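Collecting the bibliography workarounds from the paragraph above into one preamble sketch, with the load order as described there:

```latex
\newcommand{\newblock}{}              % needed to make natbib work, even under beamer
\newcommand{\thebibliography}{\relax} % additionally needed before loading natbib in ltx-talk
\usepackage{natbib}
\usepackage{doi}                      % makes external links and dois work properly
```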

(Discuss on Mastodon)

By David Eppstein

Saturday, February 28

Linkage

from David Eppstein

Joe Halpern (1953–2026) (\(\mathbb{M}\)). A leader in the mathematical reasoning about knowledge, founder of the Computing Research Repository (later the CS branch of arXiv), and recipient of the Gödel Prize and Dijkstra Prize.

By David Eppstein

TR26-032 | Advances in List Decoding of Polynomial Codes | Mrinal Kumar, Noga Ron-Zewi

from ECCC Papers

Error-correcting codes are a method for representing data, so that one can recover the original information even if some parts of it were corrupted. The basic idea, which dates back to the revolutionary work of Shannon and Hamming about a century ago, is to encode the data into a redundant form, so that the original information can be decoded from the redundant encoding even in the presence of some noise or corruption. One prominent family of error-correcting codes are Reed-Solomon Codes which encode the data using evaluations of low-degree polynomials. Nearly six decades after they were introduced, Reed-Solomon Codes, as well as some related families of polynomial-based codes, continue to be widely studied, both from a theoretical perspective and from the point of view of applications. Besides their obvious use in communication, error-correcting codes such as Reed-Solomon Codes are also useful for various applications in theoretical computer science. These applications often require the ability to cope with many errors, much more than what is possible information-theoretically. List-decodable codes are a special class of error-correcting codes that enable correction from more errors than is traditionally possible by allowing a small list of candidate decodings. These codes have turned out to be extremely useful in various applications across theoretical computer science and coding theory. In recent years, there have been significant advances in list decoding of Reed-Solomon Codes and related families of polynomial-based codes. This includes efficient list decoding of such codes up to the information-theoretic capacity, with optimal list-size, and using fast nearly-linear time, and even sublinear-time, algorithms. In this book, we survey these developments.


Friday, February 27

Anthropic: Stay strong!

from Scott Aaronson


I don’t have time to write a full post right now, but hopefully this is self-explanatory.

Regardless of their broader views on the AI industry, the eventual risks from AI, or American politics, right now every person of conscience needs to stand behind Anthropic, as they stand up for their right to [checks notes] not be effectively nationalized by the Trump administration and forced to build murderbots and to help surveil American citizens. No, I wouldn’t have believed this either in a science-fiction movie, but it’s now just the straightforward reality of our world, years ahead of schedule. In particular, I call on all other AI companies, in the strongest possible terms, to do the right thing and stand behind Anthropic, in this make-or-break moment for the AI industry and the entire world.

By Scott

Defining Normal to See Abnormal

from Ben Recht

The data scientific techniques used to find dietary allowances.

This is Part 4 of a blogged essay “Steampunk Data Science.” A table of contents is here.

Lafayette Mendel, Elmer McCollum’s post-doc mentor at Yale, was intrigued by his mentee’s foray into the chemistry of nutrition. Even if the results were puzzling, Mendel saw great potential in the controlled synthetic food experiments in rodents. Mendel recruited Thomas Osborne and Edna Ferry to synthesize dozens of diets for rats and compile their results in a two-volume monograph. They focused on providing the bare backbone macronutrients of carbohydrates, fats, and proteins, but purified the food to be as basic as possible.

One of the Yale team’s key insights was comparing the relative value of diets by tracking rat growth against a baseline of “normal” growth. They drew on the rat studies of Henry Donaldson, Elizabeth Dunn, and John Watson, who used rats to understand the development of the nervous system. Donaldson et al. had weighed dozens of rats at different developmental stages and recorded the average and extreme weights of their samples. This is the average weight of Donaldson, Dunn, and Watson’s rats over time:

This normal growth curve eliminated the need for control groups when studying rat diets: a good diet would hit the curve, a bad diet would miss it. Osborne, Mendel, and Ferry first used it to disprove McCollum’s palatability hypothesis. They found a diet that the rats readily devoured but that could not sustain normal growth. It consisted of hordein (a type of gluten), starch, agar, lard, and “protein-free milk.” Delicious. The milk was boiled and processed to remove fats and proteins, leaving behind milk sugars and other substances that were unknown at the time.

Even though the rats ate their fill of this mixture, their growth was terribly stunted.

The Yale team plotted the growth of individual rats, rather than the averages of multiple rats. With such individualized studies, they could also demonstrate the effectiveness of dietary changes. Here is another prototypical plot from their research monograph.

The rat was fed multiple diets over its life. The experiment begins with a mixed food diet and then switches to a purified food. Soon after the dietary switch, the rat’s growth reverses, and it rapidly loses weight. Clearly, there is something deficient in the purified food. This sort of single-subject temporal experimentation let them home in on which of the many food substances we eat are necessary for survival. What would we call this today? A “within-subject crossover design?” What’s the p-value? Whatever the case, other chemists and biologists were convinced there was something important going on in these studies, and the work won Osborne and Mendel a Science paper. They would be credited with the discovery of essential amino acids.

F. Gowland Hopkins, a nutrition researcher at Cambridge who had discovered the amino acid tryptophan, happened across their work in those illustrious pages and realized that he had already conducted similar experiments years earlier. Hopkins hadn’t written up his rat diet findings because he suffered an illness that had taken him out of the lab. Likely, he also hadn’t realized the significance of his findings until seeing Osborne and Mendel’s write-up. Wanting to join in on the party, Hopkins rushed to write up his earlier experiments and sent them to The Journal of Physiology. Figure 2 in that paper arguably won him the Nobel Prize.

In this experiment, there were two groups, each with eight rats. Each dot in the plot is the average weight of the group on a particular day (the x-axis is days, the y-axis is grams). The white dots correspond to rats fed only bread. The black dots correspond to rats fed bread and a tiny amount of milk. The rats with the milk supplement grew dramatically faster than those fed bread alone. Just like the Yale team, Hopkins had performed a crossover study. At day 18, he swapped the diets of the two groups. Subsequently, the milk-fed rats stopped growing on their new milk-free diet. In sharp contrast, the rats that had been fed only bread started to grow once Hopkins gave them some milk.

Hopkins’ paper includes a detailed appendix that shares much of the raw data plotted in the paper’s figures. For the plot above, he does not share the complete growth series for each rat, but he does list the rat weights at days 0, 18, and 50. Using the weights at days 0 and 18, I can create a more modern display of the data, revealing the variation within the groups rather than just representing the averages. In Figure 8, I plot the percentage growth from day zero for each rat, splitting them into the groups fed milk and those not fed milk.

Each dot here corresponds to the percentage growth of an individual rat. The shaded region corresponds to the middle half of the data. The line in the shaded region is the median of the data. The lines outside the box denote the extremes of the data. Without milk, all of the rats grew less than 20% over the 18 days. With milk, every rat grew more than 60%. Scientists didn’t need a hypothesis test to see that something incredible was in the milk.
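The summary display described above can be reproduced in a few lines. The weights below are made-up placeholders standing in for the day-0 and day-18 values in Hopkins' appendix, not the actual measurements.

```python
# Percentage growth and box-plot summary statistics for two groups of rats.
# The weights below are illustrative placeholders, NOT Hopkins' data.
import statistics

day0_milk  = [40, 42, 38, 41, 39, 43, 40, 42]   # grams at day 0
day18_milk = [66, 70, 62, 69, 64, 72, 67, 71]   # grams at day 18
day0_none  = [41, 40, 39, 42, 38, 43, 41, 40]
day18_none = [46, 44, 42, 48, 41, 49, 45, 44]

def pct_growth(start, end):
    """Percentage growth of each individual from its own starting weight."""
    return [100.0 * (e - s) / s for s, e in zip(start, end)]

def box_summary(values):
    """Box-plot ingredients: shaded region is (q1, q3), the line is the
    median, and the lines outside the box are the extremes."""
    q1, med, q3 = statistics.quantiles(values, n=4)
    return {"min": min(values), "q1": q1, "median": med,
            "q3": q3, "max": max(values)}

milk = box_summary(pct_growth(day0_milk, day18_milk))
no_milk = box_summary(pct_growth(day0_none, day18_none))
```

With these placeholder numbers the two boxes do not even overlap, which is the point of the display: no hypothesis test is needed to see the effect.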


Meanwhile, back in Madison, Elmer McCollum was stressed out. Not only was he confused by his early experiments, but he was embarrassed by the refutation of his work by Mendel’s team. He was distracted by the oppressive obligations of faculty life, found teaching an unwelcome burden, and resented being tasked with working on the cow experiment. Ah, the professor life. On top of all this, he would admit later that he was not naturally gifted at caring for his rats. The combination left his rat studies far less productive than he had hoped.

The missing catalyst arrived in 1909. Marguerite Davis had recently moved back to Madison to care for her father after the death of her mother. Davis had completed her bachelor’s degree at Berkeley and hoped to continue some form of graduate study while also looking after her dad. She told McCollum she’d be willing to do freelance research with him so she could learn biochemistry. McCollum agreed, and Davis began to learn the ways of the lab. Davis quickly noted that McCollum, though proficient at catching rats, was terrible at keeping them. His colony was struggling and in general disarray. She offered to care for them, and McCollum was overjoyed to pass off the rat-keeping responsibility. Davis not only enjoyed the work but also had a talent for running the experiments. Within a year, McCollum and Davis would publish revolutionary findings in the biochemical foundations of nutrition.

McCollum and Davis tried to figure out what exactly was missing from the protein-free milk of the Yale experiments. They devised a procedure to isolate the critical component by centrifuging egg yolks in an ether bath to separate a key fat-soluble material. They found that this isolated substance was itself sufficient to trigger rat growth.

Here’s a typical example of their experimental results, which they presented in the same style as Mendel and collaborators.

For the first period, when the rat was young, McCollum and Davis fed the rat salt, protein, fats, and a little bit of carbohydrates. For the second period, they removed the fat. In this second period, the rat’s weight eventually plateaus. Then, in period three, they added the fat-soluble compounds from the egg yolk. Whatever this stuff was, it sufficed to reignite the rat’s growth.

They found this pattern remarkably repeatable. On the simple synthetic diets of only pure protein (casein), sugar (lactose), carbohydrates (starch and dextrine), and fats (lard), the rats displayed striking symptoms of malnourishment. They would languish and develop crusty buildup in their eyes. They wouldn’t mate. Adding the ether extract of egg yolk consistently cured the poor growth in these rats and fully restored them to normal.

McCollum and Davis considered five case studies of rats and their associated growth curves sufficient evidence of a major discovery. They wrote:

We have seen this prompt resumption of growth after a period of suspension result from the introduction of ether extract of butter or of egg in about thirty animals and are convinced that these extracts contain some organic complex without which the animals cannot make further increase in body weight, but may maintain themselves in a fairly good nutritive state for a prolonged period.

This fat-soluble compound turned out to be Vitamin A.

Subscribe now

By Ben Recht

TR26-031 | On the Need for (Quantum) Memory with Short Outputs | Zihan Hao, Zikuan Huang, Qipeng Liu

from ECCC Papers

In this work, we establish the first separation between computation with bounded and unbounded space, for problems with short outputs (i.e., working memory can be exponentially larger than output size), both in the classical and the quantum setting. Towards that, we introduce a problem called nested collision finding, and show that optimal query complexity cannot be achieved without exponential memory. Our result is based on a novel ``two-oracle recording'' technique, where one oracle ``records'' the computation's long outputs under the other oracle, effectively reducing the time-space trade-off for short-output problems to that of long-output problems. We believe this technique will be of independent interest for establishing time-space tradeoffs in other short-output settings.

Faster algorithms for graph homomorphism via tractable constraint satisfaction

from arXiv: Computational Complexity

Authors: Clément Carbonnel

We show that the existence of a homomorphism from an $n$-vertex graph $G$ to an $h$-vertex graph $H$ can be decided in time $2^{O(n)}h^{O(1)}$ and polynomial space if $H$ comes from a family of graphs that excludes a topological minor. The algorithm is based on a reduction to a single-exponential number of constraint satisfaction problems over tractable languages and can handle cost minimization. We also present an improved randomized algorithm for the special case where the graph $H$ is an odd cycle.

Results on three problems on isolation of graphs

from arXiv: Computational Complexity

Authors: Peter Borg, Yair Caro

The graph isolation problem was introduced by Caro and Hansberg in 2015. It is a vast generalization of the classical graph domination problem and its study is expanding rapidly. In this paper, we address a number of questions that arise naturally. Let $F$ be a graph. We show that the $F$-isolating set problem is NP-complete if $F$ is connected. We investigate how the $F$-isolation number $\iota(G,F)$ of a graph $G$ is affected by the minimum degree $d$ of $G$, establishing a bounded range, in terms of $d$ and the orders of $F$ and $G$, for the largest possible value of $\iota(G,F)$ with $d$ sufficiently large. We also investigate how close $\iota(G,tF)$ is to $\iota(G,F)$, using domination and, in suitable cases, the Erdős-Pósa property.

FLIGHT: Fibonacci Lattice-based Inference for Geometric Heading in real-Time

from arXiv: Computational Geometry

Authors: David Dirnfeld, Fabien Delattre, Pedro Miraldo, Erik Learned-Miller

Estimating camera motion from monocular video is a fundamental problem in computer vision, central to tasks such as SLAM, visual odometry, and structure-from-motion. Existing methods that recover the camera's heading under known rotation, whether from an IMU or an optimization algorithm, tend to perform well in low-noise, low-outlier conditions, but often decrease in accuracy or become computationally expensive as noise and outlier levels increase. To address these limitations, we propose a novel generalization of the Hough transform on the unit sphere (S(2)) to estimate the camera's heading. First, the method extracts correspondences between two frames and generates a great circle of directions compatible with each pair of correspondences. Then, by discretizing the unit sphere using a Fibonacci lattice as bin centers, each great circle casts votes for a range of directions, ensuring that features unaffected by noise or dynamic objects vote consistently for the correct motion direction. Experimental results on three datasets demonstrate that the proposed method is on the Pareto frontier of accuracy versus efficiency. Additionally, experiments on SLAM show that the proposed method reduces RMSE by correcting the heading during camera pose initialization.
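The bin centers the abstract describes can be generated directly: the Fibonacci lattice places n near-uniform points on the unit sphere. This is a minimal sketch of that discretization only; the paper's great-circle voting machinery is not reproduced here.

```python
# Fibonacci lattice on the unit sphere S^2: n near-uniform points,
# usable as bin centers for discretizing heading directions.
import math

def fibonacci_sphere(n):
    """Return n points (x, y, z) on the unit sphere via the Fibonacci
    lattice: z is sampled uniformly, and the azimuth advances by the
    golden angle so consecutive points spiral around the sphere."""
    golden = math.pi * (3.0 - math.sqrt(5.0))    # golden angle in radians
    points = []
    for i in range(n):
        z = 1.0 - 2.0 * (i + 0.5) / n            # z uniform in (-1, 1)
        r = math.sqrt(max(0.0, 1.0 - z * z))     # radius of the z-slice
        theta = golden * i                       # azimuth
        points.append((r * math.cos(theta), r * math.sin(theta), z))
    return points

centers = fibonacci_sphere(1000)
```

Each great circle of candidate directions would then cast votes for the nearest of these centers.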

Learning Tangent Bundles and Characteristic Classes with Autoencoder Atlases

from arXiv: Computational Geometry

Authors: Eduardo Paluzo-Hidalgo, Yuichi Ike

We introduce a theoretical framework that connects multi-chart autoencoders in manifold learning with the classical theory of vector bundles and characteristic classes. Rather than viewing autoencoders as producing a single global Euclidean embedding, we treat a collection of locally trained encoder-decoder pairs as a learned atlas on a manifold. We show that any reconstruction-consistent autoencoder atlas canonically defines transition maps satisfying the cocycle condition, and that linearising these transition maps yields a vector bundle coinciding with the tangent bundle when the latent dimension matches the intrinsic dimension of the manifold. This construction provides direct access to differential-topological invariants of the data. In particular, we show that the first Stiefel-Whitney class can be computed from the signs of the Jacobians of learned transition maps, yielding an algorithmic criterion for detecting orientability. We also show that non-trivial characteristic classes provide obstructions to single-chart representations, and that the minimum number of autoencoder charts is determined by the good cover structure of the manifold. Finally, we apply our methodology to low-dimensional orientable and non-orientable manifolds, as well as to a non-orientable high-dimensional image dataset.

Equivalent Dichotomies for Triangle Detection in Subgraph, Induced, and Colored H-Free Graphs

from arXiv: Data Structures and Algorithms

Authors: Amir Abboud, Ron Safier, Nathan Wallheimer

A recent paper by the authors (ITCS'26) initiates the study of the Triangle Detection problem in graphs avoiding a fixed pattern $H$ as a subgraph and proposes a \emph{dichotomy hypothesis} characterizing which patterns $H$ make the Triangle Detection problem easier in $H$-free graphs than in general graphs. In this work, we demonstrate that this hypothesis is, in fact, equivalent to analogous hypotheses in two broader settings that a priori seem significantly more challenging: \emph{induced} $H$-free graphs and \emph{colored} $H$-free graphs. Our main contribution is a reduction from the induced $H$-free case to the non-induced $H^+$-free case, where $H^+$ preserves the structural properties of $H$ that are relevant for the dichotomy, namely $3$-colorability and triangle count. A similar reduction is given for the colored case. A key technical ingredient is a self-reduction to Unique Triangle Detection that preserves the induced $H$-freeness property, via a new color-coding-like reduction.

Dequantization Barriers for Guided Stoquastic Hamiltonians

from arXiv: Data Structures and Algorithms

Authors: Yassine Hamoudi, Yvan Le Borgne, Shrinidhi Teganahally Sridhara

We construct a probability distribution, induced by the Perron--Frobenius eigenvector of an exponentially large graph, which cannot be efficiently sampled by any classical algorithm, even when provided with the best-possible warm-start distribution. In the quantum setting, this problem can be viewed as preparing the ground state of a stoquastic Hamiltonian given a guiding state as input, and is known to be efficiently solvable on a quantum computer. Our result suggests that no efficient classical algorithm can solve a broad class of stoquastic ground-state problems. Our graph is constructed from a class of high-degree, high-girth spectral expanders to which self-similar trees are attached. This builds on and extends prior work of Gilyén, Hastings, and Vazirani [Quantum 2021, STOC 2021], which ruled out dequantization for a specific stoquastic adiabatic path algorithm. We strengthen their result by ruling out any classical algorithm for guided ground-state preparation.

Flip Distance of Triangulations of Convex Polygons / Rotation Distance of Binary Trees is NP-complete

from arXiv: Data Structures and Algorithms

Authors: Joseph Dorfer

Flips in triangulations of convex polygons arise in many different settings. They are isomorphic to rotations in binary trees, define edges in the 1-skeleton of the Associahedron and cover relations in the Tamari Lattice. The complexity of determining the minimum number of flips that transform one triangulation of a convex point set into another remained a tantalizing open question for many decades. We settle this question by proving that computing shortest flip sequences between triangulations of convex polygons, and therefore also computing the rotation distance of binary trees, is NP-hard. For our proof we develop techniques for flip sequences of triangulations whose counterparts were introduced for the study of flip sequences of non-crossing spanning trees by Bjerkevik, Kleist, Ueckerdt, and Vogtenhuber~[SODA25] and Bjerkevik, Dorfer, Kleist, Ueckerdt, and Vogtenhuber~[SoCG26].

Mean Estimation from Coarse Data: Characterizations and Efficient Algorithms

from arXiv: Data Structures and Algorithms

Authors: Alkis Kalavasis, Anay Mehrotra, Manolis Zampetakis, Felix Zhou, Ziyu Zhu

Coarse data arise when learners observe only partial information about samples; namely, a set containing the sample rather than its exact value. This occurs naturally through measurement rounding, sensor limitations, and lag in economic systems. We study Gaussian mean estimation from coarse data, where each true sample $x$ is drawn from a $d$-dimensional Gaussian distribution with identity covariance, but is revealed only through the set of a partition containing $x$. When the coarse samples, roughly speaking, have ``low'' information, the mean cannot be uniquely recovered from observed samples (i.e., the problem is not identifiable). Recent work by Fotakis, Kalavasis, Kontonis, and Tzamos [FKKT21] established that sample-efficient mean estimation is possible when the unknown mean is identifiable and the partition consists of only convex sets. Moreover, they showed that without convexity, mean estimation becomes NP-hard. However, two fundamental questions remained open: (1) When is the mean identifiable under convex partitions? (2) Is computationally efficient estimation possible under identifiability and convex partitions? This work resolves both questions. [...]

Isolation critical graphs under multiple edge subdivision

from arXiv: Data Structures and Algorithms

Authors: Karl Bartolo, Peter Borg, Magda Dettlaff, Magdalena Lemańska, Paweł Żyliński

This paper introduces the notion of $(\iota,q)$-critical graphs. The isolation number of a graph $G$, denoted by $\iota(G)$ and also known as the vertex-edge domination number, is the minimum number of vertices in a set $D$ such that the subgraph induced by the vertices not in the closed neighbourhood of $D$ has no edges. A graph $G$ is $(\iota,q)$-critical, $q \ge 1$, if the subdivision of any $q$ edges in $G$ gives a graph with isolation number greater than $\iota(G)$ and there exists a set of $q-1$ edges such that subdividing them gives a graph with isolation number equal to $\iota(G)$. We prove that for each integer $q \ge 1$ there exists an $(\iota,q)$-critical graph, while for a given graph $G$, the admissible values of $q$ satisfy $1 \le q \le |E(G)| - 1$. In addition, we provide a general characterisation of $(\iota,1)$-critical graphs as well as a constructive characterisation of $(\iota,1)$-critical trees.

Efficient Parallel Algorithms for Hypergraph Matching

from arXiv: Data Structures and Algorithms

Authors: Henrik Reinstädtler, Christian Schulz, Nodari Sitchinava, Fabian Walliser

We present efficient parallel algorithms for computing maximal matchings in hypergraphs. Our algorithm finds locally maximal edges in the hypergraph and adds them in parallel to the matching. In the CRCW PRAM model, our algorithms achieve $O(\log m)$ time with $O((\kappa + n) \log m)$ work w.h.p., where $m$ is the number of hyperedges and $\kappa$ is the sum of all vertex degrees. The CREW PRAM model algorithm has a running time of $O((\log \Delta + \log d)\log m)$ and requires $O((\kappa + n) \log m)$ work w.h.p. It can be implemented work-optimally with $O(\kappa + n)$ work in $O((\log m + \log n)\log m)$ time. We prove a $1/d$-approximation guarantee for our algorithms. We evaluate our algorithms experimentally by implementing and running the proposed algorithms on the GPU using CUDA and Kokkos. Our experimental evaluation demonstrates the practical efficiency of our approach on real-world hypergraph instances, yielding a speedup of up to 76 times compared to a single-core CPU algorithm.
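The invariant the parallel algorithm maintains, a set of pairwise vertex-disjoint hyperedges that cannot be extended, can be illustrated by its sequential analogue. This is a hedged sketch of greedy maximal hypergraph matching, not the paper's parallel algorithm.

```python
# Greedy maximal matching in a hypergraph: repeatedly add an edge whose
# vertices are all still unmatched. A sequential sketch of the invariant
# the parallel "locally maximal edge" selection maintains.

def maximal_hypergraph_matching(edges):
    """edges: iterable of vertex tuples. Returns a maximal set of
    pairwise vertex-disjoint hyperedges (order-dependent, not maximum)."""
    matched_vertices = set()
    matching = []
    for edge in edges:
        if all(v not in matched_vertices for v in edge):
            matching.append(edge)
            matched_vertices.update(edge)
    return matching

# A 3-uniform example: (1,2,3) blocks (3,4,5), which shares vertex 3,
# so the next disjoint edges (4,5,6) and (7,8,9) are added instead.
m = maximal_hypergraph_matching([(1, 2, 3), (3, 4, 5), (4, 5, 6), (7, 8, 9)])
```

The parallel versions select many such locally maximal edges per round instead of one at a time.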

A Simple Distributed Deterministic Planar Separator

from arXiv: Data Structures and Algorithms

Authors: Yaseen Abd-Elhaleem, Michal Dory, Oren Weimann

A balanced separator of a graph $G$ is a set of vertices whose removal disconnects the graph into connected components that are a constant factor smaller than $G$. Lipton and Tarjan [FOCS'77] famously proved that every planar graph admits a balanced separator of size $O(\sqrt{n})$, as well as a balanced separator of size $O(D)$ that is a simple path (where $D$ is $G$'s diameter). In the centralized setting, both separators can be found in linear time. In the distributed setting, $D$ is a universal lower bound for the round complexity of solving many optimization problems, so separators of size $O(D)$ are preferable. It was not until [DISC'17] that a distributed algorithm was devised by Ghaffari and Parter to compute such an $O(D)$-size separator in $\tilde O(D)$ rounds, by adapting the Lipton-Tarjan algorithm to the distributed model. Since then, this algorithm was used in several distributed algorithms for planar graphs, e.g., [GP, DISC'17], [LP, STOC'19], [AEDPW, PODC'25]. However, the algorithm is randomized, rendering the algorithms that use it randomized as well. Obtaining a deterministic algorithm remained an interesting open question until [PODC'25], when a (complex) deterministic separator algorithm was given by Jauregui, Montealegre and Rapaport. We present a much simpler deterministic separator algorithm with the same (near-optimal) $\tilde O(D)$-round complexity. While previous works devised either complicated or randomized ways of transferring weights from vertices to faces of $G$, we show that a straightforward way also works: Each vertex simply transfers its weight to one arbitrary face it lies on. That's it! We note that a deterministic separator algorithm directly derandomizes the state-of-the-art distributed algorithms for classical problems on planar graphs such as single-source shortest-paths, maximum-flow, directed global min-cut, and reachability.
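The weight-transfer step the abstract highlights is simple enough to sketch centrally: each vertex hands its weight to one arbitrary face it lies on. The graph, weights, and face representation below are illustrative placeholders, not the paper's distributed implementation.

```python
# Vertex-to-face weight transfer: each vertex moves its weight to one
# arbitrary incident face of a planar embedding. Illustrative sketch.

def transfer_weights_to_faces(vertex_weight, faces):
    """vertex_weight: dict vertex -> weight.
    faces: list of faces, each a list of its vertices.
    Returns the accumulated weight per face."""
    face_weight = [0] * len(faces)
    for v, w in vertex_weight.items():
        # pick the first face containing v (any incident face works)
        for i, face in enumerate(faces):
            if v in face:
                face_weight[i] += w
                break
    return face_weight

# K4 drawn in the plane: 4 unit-weight vertices, 4 triangular faces.
weights = {0: 1, 1: 1, 2: 1, 3: 1}
faces = [[0, 1, 2], [0, 1, 3], [0, 2, 3], [1, 2, 3]]
fw = transfer_weights_to_faces(weights, faces)
```

Total weight is conserved, so a face-balanced separator computed on the transferred weights is also vertex-balanced up to constants.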

SYK thermal expectations are classically easy at any temperature

from arXiv: Data Structures and Algorithms

Authors: Alexander Zlokapa, Bobak T. Kiani

Estimating thermal expectations of local observables is a natural target for quantum advantage. We give a simple classical algorithm that approximates thermal expectations, and we show it has quasi-polynomial cost $n^{O(\log n/\varepsilon)}$ for all temperatures above a phase transition in the free energy. For many natural models, this coincides with the entire fast-mixing, quantumly easy phase. Our results apply to the Sachdev-Ye-Kitaev (SYK) model at any constant temperature -- including when the thermal state is highly entangled and satisfies polynomial quantum circuit lower bounds, a sign problem, and nontrivial instance-to-instance fluctuations. Our analysis of the SYK model relies on the replica trick to control the complex zeros of the partition function.

static_maps: consteval std::map and std::unordered_map Implementations in C++23

from arXiv: Data Structures and Algorithms

Authors: Isaac D. Myhal, Oliver Serang

Using consteval from C++23, we implement efficient new versions of std::map and std::unordered_map for use when the keys are known at compile time. We demonstrate superior performance of our unordered_map on three demonstration use cases: lookup of elemental mass from atomic symbol, lookup of amino acid from codon, and modification of stock prices from S&P 500 ticker symbols produced runtimes <40%, <35%, and <73% of the respective runtimes of the std implementations. Our library's runtimes were <80%, <45%, and <97% of the lookup times of Frozen, an alternative perfect hashing implementation in C++ for problems also using constexpr keys. To our knowledge, this makes our library the overall fastest drop-in (i.e., with a similar API) alternative to std::unordered_map. On one arbitrarily chosen demo, we demonstrate runtimes <35% of PTHash and <89% of gperf, state-of-the-art hashing libraries that are not drop-in replacements but rely on external tools.

Microscopic Structure of Random 3-SAT: A Discrete Geometric Approach to Phase Transitions and Algorithmic Complexity

from arXiv: Data Structures and Algorithms

Authors: Yongjian Zhan

The structural phase transitions and computational complexity of random 3-SAT instances are traditionally described using thermodynamic analogies from statistical physics, such as Replica Symmetry Breaking and energy landscapes. While providing profound macroscopic insights, these theories lack a discrete microscopic structure. In this paper, we propose a complementary, strictly discrete geometric model that maps these phenomena directly to the combinatorial topology of an $N$-dimensional Boolean hypercube. By defining the problem space purely through valid solutions rather than abstract energy states, we establish deterministic mechanics for clustering and freezing, driven by the progressive elimination of vertices and Hamming distance bridges. Furthermore, we derive absolute structural boundaries for 3-SAT, identifying a minimal unsatisfiability limit at constraint density $\alpha = \frac{8}{N}$ populated by at least $\frac{N(N-1)(N-2)}{6}$ distinct unsatisfiable cores, and a maximal satisfiability limit at $\alpha = \frac{7}{6}(N-1)(N-2)$ populated by $2^N$ maximal satisfiable instances. These combinatorial extremes mathematically elucidate why the average-case Satisfiability Threshold Conjecture holds only ``almost surely.'' Finally, we apply this topological framework to explain the ``easy-hard-easy'' algorithmic complexity curve. We demonstrate that the efficiency of Depth-First Search is governed by the geometric transition from an abundance of valid search paths (the under-constrained easy phase) to a high density of structurally ``removed variables'' that force immediate contradictions (the over-constrained easy phase). This microscopic perspective bridges theoretical phase transitions with the concrete mechanics of complete search algorithms.
