
Maintained by Nima Anari, Arnab Bhattacharyya, Gautam Kamath.

Theory of Computing Report

Tuesday, March 17

Smaller Depth-2 Linear Circuits for Disjointness Matrices

from arXiv: Computational Complexity

Authors: Lixi Ye

We prove two new upper bounds for depth-2 linear circuits computing the $N$th disjointness matrix $D^{\otimes N}$. First, we obtain a circuit of size $O\big(2^{1.24485N}\big)$ over $\{0,1\}$. Second, we obtain a circuit of degree $O\big(2^{0.3199N}\big)$ over $\{0,\pm 1\}$. These improve the previous bounds of Alman and Li, namely size $O\big(2^{1.249424N}\big)$ and degree $O\big(2^{N/3}\big)$. Our starting point is the rebalancing framework developed in a line of works by Jukna and Sergeev, Alman, Sergeev, and Alman-Guan-Padaki, culminating in Alman and Li. We sharpen that framework in two ways. First, we replace the earlier "wild" rebalancing process by a tame, discretized process whose geometric-average behavior is governed by the quenched top Lyapunov exponent of a random matrix product. This allows us to invoke the convex-optimization upper bound of Gharavi and Anantharam. Second, for the degree bound we work explicitly with a cost landscape on the $(p,q)$-plane and show that different circuit families are dominant on different regions, so that the global maximum remains below $0.3199$.
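For readers unfamiliar with the object: the disjointness matrix is commonly defined as the $N$-fold Kronecker power of the $2\times 2$ base matrix with $D[x][y]=1$ iff $x\wedge y=0$. A minimal illustrative sketch of this construction (not from the paper):

```python
def kron(A, B):
    """Kronecker product of two dense 0/1 matrices given as lists of lists."""
    return [[a * b for a in rowA for b in rowB]
            for rowA in A for rowB in B]

def disjointness(N):
    """N-th Kronecker power D^{otimes N}; rows/columns are indexed by N-bit masks."""
    D = [[1, 1], [1, 0]]   # base case: D[x][y] = 1 iff bits x and y are disjoint
    M = [[1]]
    for _ in range(N):
        M = kron(M, D)
    return M

M = disjointness(3)
# Entry (S, T) is 1 exactly when the 3-bit masks S and T share no bit.
assert all(M[s][t] == (1 if s & t == 0 else 0)
           for s in range(8) for t in range(8))
```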

Lost in Aggregation: On a Fundamental Expressivity Limit of Message-Passing Graph Neural Networks

from arXiv: Computational Complexity

Authors: Eran Rosenbluth

We define a generic class of functions that captures most conceivable aggregations for Message-Passing Graph Neural Networks (MP-GNNs), and prove that any MP-GNN model with such aggregations induces only a polynomial number of equivalence classes on all graphs, while the number of non-isomorphic graphs is doubly exponential (in the number of vertices). Adding a familiar perspective, we observe that merely two iterations of Color Refinement (CR) induce at least an exponential number of equivalence classes, making the aforementioned MP-GNNs relatively infinitely weaker. Previous results state that MP-GNNs match full CR; however, they concern a weak, 'non-uniform', notion of distinguishing power, where each graph size may require a different MP-GNN to distinguish graphs up to that size. Our results concern both distinguishing between non-equivariant vertices and distinguishing between non-isomorphic graphs.
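Color Refinement (also known as 1-WL) iteratively recolors each vertex by its current color together with the multiset of its neighbors' colors. A small illustrative sketch of that iteration, assuming an adjacency-dict graph representation:

```python
def color_refinement(adj, rounds):
    """adj: {vertex: [neighbors]}. Returns vertex colors after `rounds` refinements."""
    color = {v: 0 for v in adj}                       # start monochromatic
    for _ in range(rounds):
        # New signature = old color + sorted multiset of neighbors' old colors.
        sig = {v: (color[v], tuple(sorted(color[u] for u in adj[v])))
               for v in adj}
        # Canonically rename signatures to small integers.
        palette = {s: i for i, s in enumerate(sorted(set(sig.values())))}
        color = {v: palette[sig[v]] for v in adj}
    return color

# On a 3-vertex path, CR separates the endpoints from the middle vertex.
path = {0: [1], 1: [0, 2], 2: [1]}
colors = color_refinement(path, 2)
assert colors[0] == colors[2] and colors[0] != colors[1]
```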

Towards Parameterized Hardness on Maintaining Conjunctive Queries

from arXiv: Computational Complexity

Authors: Qichen Wang

We investigate the fine-grained complexity of dynamically maintaining the result of fixed self-join-free conjunctive queries under single-tuple updates. Prior work shows that free-connex queries can be maintained in update time $O(|D|^δ)$ for some $δ\in [0.5, 1]$, where $|D|$ is the size of the current database. However, a gap remains between the best known upper bound of $O(|D|)$ and lower bounds of $Ω(|D|^{0.5-ε})$ for any $ε> 0$. We narrow this gap by introducing two structural parameters to quantify the dynamic complexity of a conjunctive query: the height $k$ and the dimension $d$. We establish new fine-grained lower bounds showing that any algorithm maintaining a query with these parameters must incur update time $Ω(|D|^{1-1/\max(k,d)-ε})$, unless widely believed conjectures fail. These yield the first super-$\sqrt{|D|}$ lower bounds for maintaining free-connex queries, and suggest the tightness of current algorithms when considering arbitrarily large $k$ and~$d$. Complementing our lower bounds, we identify a data-dependent parameter, the generalized $H$-index $h(D)$, which is upper bounded by $|D|^{1/d}$, and design an efficient algorithm for maintaining star queries, a common class of height-2 free-connex queries. The algorithm achieves an instance-specific update time $O(h(D)^{d-1})$ with linear space $O(|D|)$. This matches our parameterized lower bound and provides instance-specific performance in favorable cases.

The Counting General Dominating Set Framework

from arXiv: Computational Complexity

Authors: Jiayi Zheng, Boning Meng

We introduce a new framework of counting problems called #GDS that encompasses #$(σ, ρ)$-Set, a class of domination-type problems that includes counting dominating sets and counting total dominating sets. We explore the intricate relation between #GDS and the well-known Holant. We propose the technique of gadget construction under the #GDS framework; using this technique, we prove the #P-completeness of counting dominating sets for 3-regular planar bipartite simple graphs. Through a generalization of a Holant dichotomy, and a special reduction method via symmetric bipartite graphs, we also prove the #P-completeness of counting total dominating sets for the same graph class.

Towards Exponential Quantum Improvements in Solving Cardinality-Constrained Binary Optimization

from arXiv: Computational Complexity

Authors: Haomu Yuan, Hanqing Wu, Kuan-Cheng Chen, Bin Cheng, Crispin H. W. Barnes

Cardinality-constrained binary optimization is a fundamental computational primitive with broad applications in machine learning, finance, and scientific computing. In this work, we introduce a Grover-based quantum algorithm that exploits the structure of the fixed-cardinality feasible subspace under a natural promise on solution existence. For quadratic objectives, our approach achieves ${O}\left(\sqrt{\frac{\binom{n}{k}}{M}}\right)$ Grover rotations for any fixed cardinality $k$ and degeneracy of the optima $M$, yielding an exponential reduction in the number of Grover iterations compared with unstructured search over $\{0,1\}^n$. Building on this result, we develop a hybrid classical--quantum framework based on the alternating direction method of multipliers (ADMM) algorithm. The proposed framework is guaranteed to output an $ε$-approximate solution with a consistency tolerance $ε+ δ$ using at most $ {O}\left(\sqrt{\binom{n}{k}}\frac{n^{6}k^{3/2} }{ \sqrt{M}ε^2 δ}\right)$ queries to a quadratic oracle, together with ${O}\left(\frac{n^{6}k^{3/2}}{ε^2δ}\right)$ classical overhead. Overall, our method suggests a practical use of quantum resources and demonstrates an exponential improvement over existing Grover-based approaches in certain parameter regimes, thereby paving the way toward quantum advantage in constrained binary optimization.
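As a rough sanity check on the headline bound (an illustration, not the paper's algorithm): the standard Grover rotation count is about $(π/4)\sqrt{\text{search space}/M}$, so restricting the search to the size-$k$ slice of the cube shrinks the count by a factor of $\sqrt{2^n/\binom{n}{k}}$:

```python
from math import comb, pi, sqrt

def grover_iters(space_size, M):
    # Standard Grover rotation count: about (pi/4) * sqrt(space / #marked states).
    return (pi / 4) * sqrt(space_size / M)

n, k, M = 40, 5, 1
structured = grover_iters(comb(n, k), M)    # search only the size-k slice
unstructured = grover_iters(2 ** n, M)      # unstructured search over {0,1}^n
assert structured < unstructured
# The saving factor is sqrt(2**n / comb(n, k)), roughly 1300x for these parameters.
```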

Decision Quotient: A Regime-Sensitive Complexity Theory of Exact Relevance Certification

from arXiv: Computational Complexity

Authors: Tristan Simas

Which coordinates of a decision problem can be hidden without changing the decision, and what is the coarsest exact abstraction that preserves all decision-relevant distinctions? We study this as an exact relevance-certification problem organized around the optimizer quotient. We classify how hard it is to certify this structure across three settings: static (counterexample exclusion), stochastic (conditioning and expectation), and sequential (temporal structure). In the static regime, sufficiency collapses to relevance containment, so minimum sufficiency is coNP-complete. In the stochastic regime, preservation and decisiveness separate: preservation is polynomial-time under explicit-state encoding with bridge theorems to static sufficiency and the optimizer quotient, while decisiveness is PP-hard under succinct encoding with anchor and minimum variants in $\textsf{NP}^{\textsf{PP}}$. In the sequential regime, all queries are PSPACE-complete. We also prove an encoding-sensitive contrast between explicit-state tractability and succinct-encoding hardness, derive an integrity-competence trilemma, and isolate twelve tractable subcases. A Lean 4 artifact mechanically verifies the optimizer-quotient universal property, main reductions, and finite decider core.

Jaguar: A Primal Algorithm for Conjunctive Query Evaluation in Submodular-Width Time

from arXiv: Computational Complexity

Authors: Mahmoud Abo Khamis, Hubie Chen

The submodular width is a complexity measure of conjunctive queries (CQs), which assigns a nonnegative real number, subw(Q), to each CQ Q. An existing algorithm, called PANDA, performs CQ evaluation in polynomial time where the exponent is essentially subw(Q). Formally, for every Boolean CQ Q, PANDA evaluates Q in time $O(N^{\mathsf{subw}(Q)} \cdot \mathsf{polylog}(N))$, where N denotes the input size; moreover, there is complexity-theoretic evidence that, for a number of Boolean CQs, no exponent strictly below subw(Q) can be achieved by combinatorial algorithms. At a high level, the submodular width of a CQ Q can be described as the maximum over all polymatroids, which are set functions on the variables of Q that satisfy Shannon inequalities. The PANDA algorithm in a sense works in the dual space of this maximization problem, makes use of information theory, and transforms a CQ into a set of disjunctive datalog programs which are individually solved. In this article, we introduce a new algorithm for CQ evaluation which achieves, for each Boolean CQ Q and for all $ε> 0$, a running time of $O(N^{\mathsf{subw}(Q)+ε})$. This new algorithm's description and analysis are, in our view, significantly simpler than those of PANDA. We refer to it as a "primal" algorithm as it operates in the primal space of the described maximization problem, by maintaining a feasible primal solution, namely, a polymatroid. Indeed, this algorithm deals directly with the input CQ and adaptively computes a sequence of joins, in a guided fashion, so that the cost of these join computations is bounded. Additionally, this algorithm can achieve the stated runtime for the generalization of the submodular width incorporating degree constraints. We dub our algorithm Jaguar, as it is a join-adaptive guided algorithm.

Minimal enclosing balls via geodesics

from arXiv: Computational Geometry

Authors: Ariel Goodwin, Adrian S. Lewis

Algorithms for minimal enclosing ball problems are often geometric in nature. To highlight the metric ingredients underlying their efficiency, we focus here on a particularly simple geodesic-based method. A recent subgradient-based study proved a complexity result for this method in the broad setting of geodesic spaces of nonpositive curvature. We present a simpler, intuitive and self-contained complexity analysis in that setting, which also improves the convergence rate. We furthermore derive the first complexity result for the algorithm on geodesic spaces with curvature bounded above.
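In the Euclidean special case (a geodesic space of nonpositive curvature), the simple geodesic method amounts to stepping from the current center along the straight segment toward the farthest point, as in the classical Badoiu-Clarkson iteration. An illustrative sketch under that Euclidean assumption (the paper treats general geodesic spaces):

```python
def farthest(center, pts):
    """Point of pts farthest from center (squared Euclidean distance)."""
    return max(pts, key=lambda p: sum((c - x) ** 2 for c, x in zip(center, p)))

def meb_center(pts, iters=2000):
    """Badoiu-Clarkson style iteration: step toward the farthest point with
    shrinking step 1/(k+1). In Euclidean space the geodesic is the straight
    segment, so the update is a convex combination of center and farthest point."""
    c = list(pts[0])
    for k in range(1, iters + 1):
        f = farthest(c, pts)
        t = 1.0 / (k + 1)
        c = [(1 - t) * ci + t * fi for ci, fi in zip(c, f)]
    return c

pts = [(0.0, 0.0), (2.0, 0.0), (1.0, 1.0)]
c = meb_center(pts)
# For this instance the minimal enclosing ball is centered at (1, 0) with radius 1,
# and the iterates converge toward that center.
```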

Improved Online Hitting Set Algorithms for Structured and Geometric Set Systems

from arXiv: Data Structures and Algorithms

Authors: Sujoy Bhore, Anupam Gupta, Amit Kumar

In the online hitting set problem, sets arrive over time, and the algorithm has to maintain a subset of elements that hit all the sets seen so far. Alon, Awerbuch, Azar, Buchbinder, and Naor (SICOMP 2009) gave an algorithm with competitive ratio $O(\log n \log m)$ for the (general) online hitting set and set cover problems for $m$ sets and $n$ elements; this is known to be tight for efficient online algorithms. Given this barrier for general set systems, we ask: can we break this double-logarithmic phenomenon for online hitting set/set cover on structured and geometric set systems? We provide an $O(\log n \log\log n)$-competitive algorithm for the weighted online hitting set problem on set systems with linear shallow-cell complexity, replacing the double-logarithmic factor in the general result by effectively a single logarithmic term. As a consequence of our results we obtain the first bounds for weighted online hitting set for natural geometric set families, thereby answering open questions regarding the gap between general and geometric weighted online hitting set problems.

The Compilability Thresholds of 2-CNF to OBDD

from arXiv: Data Structures and Algorithms

Authors: Alexis de Colnet, Alfons Laarman, Joon Hyung Lee

We prove the existence of two thresholds regarding the compilability of random 2-CNF formulas to OBDDs. The formulas are drawn from $\mathcal{F}_2(n,δn)$, the uniform distribution over all 2-CNFs with $δn$ clauses and $n$ variables, with $δ\geq 0$ a constant. We show that, with high probability, the random 2-CNF admits OBDDs of size polynomial in $n$ if $0 \leq δ< 1/2$ or if $δ> 1$. On the other hand, for $1/2 < δ< 1$, with high probability, the random $2$-CNF admits only OBDDs of size exponential in $n$. It is no coincidence that the two ``compilability thresholds'' are $δ= 1/2$ and $δ= 1$. Both are known thresholds for other CNF properties, namely, $δ= 1$ is the satisfiability threshold for 2-CNF while $δ= 1/2$ is the treewidth threshold, i.e., the point where the treewidth of the primal graph jumps from constant to linear in $n$ with high probability.
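The satisfiability threshold at $δ= 1$ is easy to observe empirically. A brute-force illustration, sampling formulas in the spirit of $\mathcal{F}_2(n, δn)$ (the sampler below is a simplified stand-in for the exact uniform distribution) and checking satisfiability by enumeration, so it is only feasible for small $n$:

```python
import random
from itertools import product

def sample_2cnf(n, m, rng):
    # m clauses, each on two distinct variables with random signs
    # (a simplified stand-in for the uniform distribution F_2(n, m)).
    return [tuple((v, rng.random() < 0.5) for v in rng.sample(range(n), 2))
            for _ in range(m)]

def satisfiable(n, clauses):
    # Brute force over all 2^n assignments; fine for small n.
    return any(all(a[v] == s or a[u] == t for (v, s), (u, t) in clauses)
               for a in product([False, True], repeat=n))

def sat_fraction(n, delta, trials, rng):
    m = int(delta * n)
    return sum(satisfiable(n, sample_2cnf(n, m, rng))
               for _ in range(trials)) / trials

rng = random.Random(0)
below = sat_fraction(12, 0.5, 40, rng)   # delta < 1: almost always satisfiable
above = sat_fraction(12, 3.0, 40, rng)   # delta > 1: almost always unsatisfiable
assert below > above
```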

Hecate: A Modular Genomic Compressor

from arXiv: Data Structures and Algorithms

Authors: Kamila Szewczyk, Sven Rahmann

We present Hecate, a modular lossless genomic compression framework. It is designed around uncommon but practical source-coding choices. Unlike many single-method compressors, Hecate treats compression as a conditional coding problem over coupled FASTA/FASTQ streams (control, headers, nucleotides, case, quality, extras). It uses per-stream codecs under a shared indexed block container. Codecs include alphabet-aware packing with an explicit side channel for out-of-alphabet residues, an auxiliary-index Burrows-Wheeler pipeline with custom arithmetic coding, and a blockwise Markov mixture coder with explicit model-competition signaling. This architecture yields high throughput, exact random-access slicing, and a referential mode through streamwise binary differencing. In a comprehensive benchmark suite, Hecate provides the best compression vs. speed trade-offs against state-of-the-art established tools (MFCompress, NAF, bzip3, AGC), with notably stronger behaviour on large genomes and high-similarity referential settings. For the same compression ratio, Hecate is 2 to 10 times faster. When given the same time budget as other algorithms, Hecate achieves up to 5% to 10% better compression.
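The "alphabet-aware packing with an explicit side channel" idea can be illustrated in a few lines (a toy sketch, not Hecate's actual codec): ACGT gets 2 bits per base, and any out-of-alphabet residue is recorded as a (position, character) pair on the side:

```python
def pack(seq):
    """Pack ACGT into 2 bits each; out-of-alphabet residues (e.g. 'N') go to a
    side channel as (position, char) pairs and are coded as 'A' in the main stream."""
    code = {'A': 0, 'C': 1, 'G': 2, 'T': 3}
    side = [(i, ch) for i, ch in enumerate(seq) if ch not in code]
    bits = 0
    for ch in reversed(seq):                 # first base ends up in the low bits
        bits = (bits << 2) | code.get(ch, 0)
    return bits, len(seq), side

def unpack(bits, n, side):
    out = ['ACGT'[(bits >> (2 * i)) & 3] for i in range(n)]
    for i, ch in side:                       # restore out-of-alphabet residues
        out[i] = ch
    return ''.join(out)

s = "ACGTNNACGT"
assert unpack(*pack(s)) == s                 # lossless round trip
```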

The Price of Universal Temporal Reachability

from arXiv: Data Structures and Algorithms

Authors: Binh-Minh Bui-Xuan, Nhat-Minh Nguyen, Sébastien Tixeuil, Yukiko Yamauchi

Dynamic networks are graphs in which edges are available only at specific time instants, modeling connections that change over time. The dynamic network creation game studies this setting as a strategic interaction where each vertex represents a player. Players can add or remove time-labeled edges in order to minimize their personal cost. This cost has two components: a construction cost, calculated as the number of time instants during which a player maintains edges multiplied by a constant $α$, and a communication cost, defined as the average distance to all other vertices in the network. Communication occurs through temporal paths, which are sequences of adjacent edges with strictly increasing time labels and no repeated vertices. We show for the shortest distance (minimizing the number of edges) that the price of anarchy can be proportional to the number of vertices, contrasting the constant price conjectured for static networks.
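A temporal path must use strictly increasing time labels, which is what makes reachability here different from static reachability. A small illustrative checker, assuming edges are given as (u, v, time) triples:

```python
def temporal_reachable(edges, s, t):
    """edges: list of (u, v, time). Is there a temporal path s -> t, i.e. a
    sequence of adjacent edges with strictly increasing time labels and no
    repeated vertices? Plain DFS; fine for small instances."""
    def dfs(u, last_time, visited):
        if u == t:
            return True
        for (a, b, tm) in edges:
            if a == u and tm > last_time and b not in visited:
                if dfs(b, tm, visited | {b}):
                    return True
        return False
    return dfs(s, -1, {s})

# Static reachability 1 -> 2 -> 3 exists in both, but the temporal path needs
# increasing labels along the walk.
edges_bad = [(1, 2, 5), (2, 3, 3)]     # times 5 then 3: not increasing
edges_ok = [(1, 2, 1), (2, 3, 4)]
assert not temporal_reachable(edges_bad, 1, 3)
assert temporal_reachable(edges_ok, 1, 3)
```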

A Single-Sample Polylogarithmic Regret Bound for Nonstationary Online Linear Programming

from arXiv: Data Structures and Algorithms

Authors: Haoran Xu, Owen Shen, Peter Glynn, Yinyu Ye, Patrick Jaillet

We study nonstationary Online Linear Programming (OLP), where $n$ orders arrive sequentially with reward-resource consumption pairs that form a sequence of independent, but not necessarily identically distributed, random vectors. At the beginning of the planning horizon, the decision-maker is provided with a resource endowment that is sufficient to fulfill a significant portion of the requests. The decision-maker seeks to maximize the expected total reward by making immediate and irrevocable acceptance or rejection decisions for each order, subject to this resource endowment. We focus on the challenging single-sample setting, where only one sample from each of the $n$ distributions is available at the start of the planning horizon. We propose a novel re-solving algorithm that integrates a dynamic programming perspective with the dual-based frameworks traditionally employed in stationary environments. In the large-resource regime, where the resource endowment scales linearly with the number of orders, we prove that our algorithm achieves $O((\log n)^2)$ regret across a broad class of nonstationary distribution sequences. Our results demonstrate that polylogarithmic regret is attainable even under significant environmental shifts and minimal data availability, bridging the gap between stationary OLP and more volatile real-world resource allocation problems.

Rooting Out Entropy: Optimal Tree Extraction for Ultra-Succinct Graphs

from arXiv: Data Structures and Algorithms

Authors: Ziad Ismaili Alaoui, Tamio-Vesa Nakajima, Namrata, Sebastian Wild

We combine two methods for the lossless compression of unlabeled graphs - entropy compressing adjacency lists and computing canonical names for vertices - and solve an ensuing novel optimisation problem: Minimum-Entropy Tree-Extraction (MINETREX). MINETREX asks to determine a spanning forest $F$ to remove from a graph $G$ so that the remaining graph $G-F$ has minimal indegree entropy $H(d_1,\ldots,d_n) = \sum_{v\in V} d_v \log_2(m/d_v)$ among all choices for $F$. (Here $d_v$ is the indegree of vertex $v$ in $G-F$; $m$ is the number of edges.) We show that MINETREX is NP-hard to approximate with additive error better than $\delta n$ (for some constant $\delta>0$), and provide a simple greedy algorithm that achieves additive error at most $n / \ln 2$. By storing the extracted spanning forest and the remaining edges separately, we obtain a degree-entropy compressed ("ultrasuccinct") data structure for representing an arbitrary (static) unlabeled graph that supports navigational graph queries in logarithmic time. It serves as a drop-in replacement for adjacency-list representations using substantially less space for most graphs; we precisely quantify these savings in terms of the maximal subgraph density. Our inapproximability result uses an approximate variant of the hitting set problem on biregular instances whose hardness proof is contained implicitly in a reduction by Guruswami and Trevisan (APPROX/RANDOM 2005); we consider the unearthing of this reduction partner of independent interest with further likely uses in hardness of approximation.
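The indegree-entropy objective is simple to evaluate directly. The toy Python sketch below (an illustration, not code from the paper) computes $H$ for a small digraph before and after extracting a spanning forest, showing how removing a forest can lower the entropy.

```python
import math

def indegree_entropy(indegrees):
    """H(d_1,...,d_n) = sum_v d_v * log2(m / d_v), where m = total edges.

    Vertices with indegree 0 contribute nothing (their term vanishes).
    """
    m = sum(indegrees)
    return sum(d * math.log2(m / d) for d in indegrees if d > 0)

# Toy 4-vertex digraph with indegree sequence (3, 1, 1, 1), so m = 6:
full = indegree_entropy([3, 1, 1, 1])

# A spanning forest absorbing one edge into each of vertices 1, 2, 3
# leaves indegrees (3, 0, 0, 0) with m = 3 -- entropy drops to zero:
after = indegree_entropy([3, 0, 0, 0])
```

Here the remaining graph concentrates all edges on one vertex, the best case for the entropy objective; MINETREX asks for the forest that minimizes this quantity over all choices.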

Better approximation guarantee for Asymmetric TSP

from arXiv: Data Structures and Algorithms

Authors: Jens Vygen

We improve the approximation ratio for the Asymmetric TSP to less than 15. We also obtain improved ratios for the special case of unweighted digraphs and the generalization where we ask for a minimum-cost tour with given (distinct) endpoints. Moreover, we prove better upper bounds on the integrality ratios of the natural LP relaxations.

Almost-Uniform Edge Sampling: Leveraging Independent-Set and Local Graph Queries

from arXiv: Data Structures and Algorithms

Authors: Tomer Adar, Amit Levi

A central theme in sublinear graph algorithms is the relationship between counting and sampling: can the ability to approximately count a combinatorial structure be leveraged to sample it nearly uniformly at essentially the same cost? We study (i) independent-set (IS) queries, which return whether a vertex set $S$ is edge-free, and (ii) two standard local queries: degree and neighbor queries. Eden and Rosenbaum (SOSA '18) proved that in the local-query model, uniform edge sampling is no harder than approximate edge counting. We extend this phenomenon to new settings. We establish sampling-counting equivalence for the hybrid model that combines IS and local queries, matching the complexity of edge-count estimation achieved by Adar, Hotam and Levi (2026), and an analogous equivalence for IS queries, matching the complexity of edge-count estimation achieved by Xi, Levi and Waingarten (SODA '20). For each query model, we show lower bounds for uniform edge sampling that essentially coincide with the known bounds for approximate edge counting.
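For intuition about what local queries buy you, the classical rejection-sampling trick (illustrative only; not one of the paper's algorithms) already produces uniform edge samples from degree and neighbor queries alone, given an upper bound d_max on the maximum degree:

```python
import random

def sample_edge(adj, d_max, rng=random):
    """Sample an edge of an undirected graph uniformly at random.

    Repeat: pick a uniform vertex v; accept with probability deg(v)/d_max;
    on acceptance return (v, uniform neighbor of v). Each ordered incidence
    (v, u) is hit with probability 1/(n*d_max), so every undirected edge
    is equally likely regardless of the degree distribution.
    """
    vertices = list(adj)
    while True:
        v = rng.choice(vertices)
        deg = len(adj[v])                       # degree query
        if deg and rng.random() < deg / d_max:  # accept w.p. deg(v)/d_max
            u = adj[v][rng.randrange(deg)]      # neighbor query
            return tuple(sorted((v, u)))

# Skewed degrees, yet the four edges come out uniformly:
adj = {0: [1, 2, 3], 1: [0, 2], 2: [0, 1], 3: [0]}
rng = random.Random(0)
counts = {}
for _ in range(20000):
    e = sample_edge(adj, d_max=3, rng=rng)
    counts[e] = counts.get(e, 0) + 1
```

The catch is the running time: many rejections on sparse graphs. The results above concern query models (IS queries, hybrids) where tighter counting-to-sampling reductions are possible.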

Sublime: Sublinear Error & Space for Unbounded Skewed Streams

from arXiv: Data Structures and Algorithms

Authors: Navid Eslami, Ioana O. Bercea, Rasmus Pagh, Niv Dayan

Modern stream processing systems must often track the frequency of distinct keys in a data stream in real time. Since monitoring the exact counts often entails a prohibitive memory footprint, many applications rely on compact, probabilistic data structures called frequency estimation sketches to approximate them. However, mainstream frequency estimation sketches fall short in two critical aspects: (1) They are memory-inefficient under data skew. This is because they use uniformly-sized counters to track the key counts and thus waste memory on storing the leading zeros of many small counter values. (2) Their estimation error deteriorates at least linearly with the stream's length, which may grow indefinitely over time. This is because they count the keys using a fixed number of counters. We present Sublime, a framework that generalizes frequency estimation sketches to address these problems by dynamically adapting to the stream's skew and length. To save memory under skew, Sublime uses short counters upfront and elongates them with extensions stored within the same cache line as they overflow. It leverages novel bit manipulation routines to quickly access a counter's extension. It also controls the scaling of its error rate by expanding its number of approximate counters as the stream grows. We apply Sublime to Count-Min Sketch and Count Sketch. We show, theoretically and empirically, that Sublime significantly improves accuracy and memory over the state of the art while maintaining competitive or superior performance.
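For context, here is a textbook Count-Min Sketch, the fixed-width, fixed-size baseline that Sublime generalizes (this sketch is illustrative, not Sublime itself). Its counters are uniformly sized, and with a fixed width its overcount grows with the stream's length, which is exactly the limitation described above:

```python
import random

class CountMin:
    """Textbook Count-Min Sketch: d rows of w uniform-width counters.

    Each row overcounts a key by at most (stream length)/w in expectation,
    so with fixed w the error grows linearly in the stream's length.
    Estimates never undercount.
    """
    def __init__(self, width, depth, seed=0):
        rng = random.Random(seed)
        self.width = width
        self.salts = [rng.getrandbits(64) for _ in range(depth)]
        self.rows = [[0] * width for _ in range(depth)]

    def add(self, key, count=1):
        for salt, row in zip(self.salts, self.rows):
            row[hash((salt, key)) % self.width] += count

    def estimate(self, key):
        return min(row[hash((salt, key)) % self.width]
                   for salt, row in zip(self.salts, self.rows))

cm = CountMin(width=256, depth=4)
for i in range(10000):
    cm.add(i % 100)   # 100 distinct keys, 100 occurrences each
```

Sublime's two moves address the two weaknesses: variable-length counters (short upfront, extended in-cache-line on overflow) for skew, and a growing counter array to keep the error rate from scaling with stream length.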

Machine-Verifying Toom-Cook Multiplication with Integer Evaluation Points

from arXiv: Data Structures and Algorithms

Authors: Srihari Nanniyur, Siddhartha Jayanti

We present a machine-verified proof of the correctness of Toom-Cook multiplication with generalized integer evaluation points. Toom-Cook is a class of fast multiplication algorithms parameterized by a triple $(k_x, k_y, \vec v)$ consisting of two positive integer split sizes $k_x, k_y$ and a vector $\vec v$ of distinct evaluation points. As part of our proof, we verify that for any selection of $k_x+k_y-1$ distinct integer evaluation points, we can compute a threshold function $\theta(k_x, k_y, \vec v)$ such that, if the algorithm's base-case problem size is set above this threshold, then the algorithm's termination is guaranteed regardless of the values of the operands. The threshold formula, which we derive by obtaining upper bounds on the subproblem sizes produced by the Toom-Cook recurrence, does not depend on the operands; it depends only on $k_x$, $k_y$, $\vec v$, and the base $b$ in which we operate. We write the proof in Lean 4, making use of the Mathlib library. We formalize the algorithm, our base-case threshold formula, and our key lemma statements in Lean. We then use the AI theorem prover Aristotle to assist in completing the machine verification of the algorithm's correctness. This proof, through its synthesis of human input and AI assistance, demonstrates the considerable power of AI to automate the machine verification process.
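As a concrete small instance of the $(k_x, k_y, \vec v)$ family, the sketch below (illustrative, not the verified Lean development) multiplies integers with $k_x = k_y = 2$ and the integer evaluation points $(0, 1, -1)$: split each operand in two, evaluate the linear polynomials at the three points, multiply pointwise, and interpolate the quadratic product.

```python
def toom2(x, y, base=2**16):
    """Toom-Cook multiplication with k_x = k_y = 2 and integer
    evaluation points (0, 1, -1) (the classical Toom-2/Karatsuba
    scheme uses the point at infinity instead of -1)."""
    if x < base or y < base:            # base case: single-digit operand
        return x * y
    x0, x1 = x % base, x // base        # x = x1*base + x0
    y0, y1 = y % base, y // base
    # Evaluate p(v) = x1*v + x0 and q(v) = y1*v + y0 at v in {0, 1, -1},
    # multiplying pointwise via recursion.
    r0 = toom2(x0, y0, base)                        # r(0)
    r1 = toom2(x0 + x1, y0 + y1, base)              # r(1)
    rm = toom2(abs(x0 - x1), abs(y0 - y1), base)    # |r(-1)|
    if (x0 < x1) != (y0 < y1):
        rm = -rm                                    # restore the sign
    # Interpolate r(v) = z2*v^2 + z1*v + z0 from the three samples:
    z0 = r0
    z2 = (r1 + rm) // 2 - r0
    z1 = (r1 - rm) // 2
    return z2 * base**2 + z1 * base + z0
```

The divisions by 2 are exact (the numerators are always even), and termination holds because `x // base < x`; the paper's threshold function $\theta$ makes exactly this kind of termination argument precise for arbitrary integer point sets.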

Approximation Algorithms for Action-Reward Query-Commit Matching

from arXiv: Data Structures and Algorithms

Authors: Mahsa Derakhshan, Andisheh Ghasemi, Calum MacRury

Matching problems under uncertainty arise in applications such as kidney exchange, hiring, and online marketplaces. A decision-maker must sequentially explore potential matches under local exploration constraints, while committing irrevocably to successful matches as they are revealed. The query-commit matching problem captures these challenges by modeling edges that succeed independently with known probabilities and must be accepted upon success, subject to vertex patience (time-out) constraints limiting the number of incident queries. In this work, we introduce the action-reward query-commit matching problem, a strict generalization of query-commit matching in which each query selects an action from a known action space, determining both the success probability and the reward of the queried edge. If an edge is queried using a chosen action and succeeds, it is irrevocably added to the matching, and the corresponding reward is obtained; otherwise, the edge is permanently discarded. We study the design of approximation algorithms for this problem on bipartite graphs. This model captures a broad class of stochastic matching problems, including the sequential pricing problem introduced by Pollner, Roghani, Saberi, and Wajc (EC 2022). On the positive side, Pollner et al. designed a polynomial-time approximation algorithm achieving a ratio of $0.426$ in the one-sided patience setting, which degrades to $0.395$ when both sides have bounded patience. In this work, we design computationally efficient algorithms for the action-reward query-commit problem in the one-sided and two-sided patience settings, achieving approximation ratios of $1-1/e \approx 0.63$ and $\frac{1}{27}\!\left(19-67/e^3\right) \approx 0.58$, respectively. These results improve the state of the art for the sequential pricing problem, surpassing the previous guarantees of $0.426$ and $0.395$.

Monday, March 16

Small World Models

from Ben Recht

How much do you need to know about a system to control it?

This is a live blog of Lecture 6 of my graduate seminar “Feedback, Learning, and Adaptation.” A table of contents is here.

The backbone of control engineering is the assumption of a reasonably reliable, reasonably simple dynamical system with inputs and outputs. We have to believe that the behavior of the thing we want to steer is consistent enough so that whatever we design in the lab will work on the road.

Now, what exactly do I need to know about this reliably consistent dynamical system to get it to do what I want to do? You want a model of the system that is rich enough to describe everything you might ever see, but small enough so that you can computationally derive control policies and, perhaps, performance guarantees. Simpler system descriptions yield simpler state-estimation and control-design algorithms, both online and offline. What’s the right balance between modeling precision and simplicity? This is the question of system identification.

System identification is the natural place where machine learning and statistics meet control engineering. You need to either estimate parameters of models you believe are true or build predictions of how the system will respond to a string of inputs. What sort of statistical infrastructure you need varies with your control engineering task.

Sometimes you really believe that for all intents and purposes, the system behavior is captured by simple differential equations. An object falling in space without friction will obey Newton’s laws. To identify this system, you just need to measure the object’s mass. Easy peasy.

For more complicated mechanical systems, like quadrotors or simple, slow-moving wheeled vehicles, you can still get away with relatively simple modeling, where the geometry of the problem gives you a nice differential equation with a few parameters determined by the build of your drone. In these cases, where you don’t need particularly high performance, you need only break out a ruler and a scale to guesstimate all of the parameters.

Sometimes you can’t determine the parameters from simple measurements, as environmental conditions dictate their values. For example, coefficients of friction might depend on temperature and the particulars of the flooring. Now you can find the parameters by repeatedly testing your system and running a nonlinear regression to minimize the input-output error.

As your problem gets even more complicated, maybe you don’t want to bother building a sophisticated simulator and would be perfectly fine with a “black-box” prediction of outputs from inputs. We’ve developed a zoo of methods to do this sort of prediction. The simplest are the “ARMAX” models, which predict outputs as a linear combination of the past inputs and outputs. You can fit these using least squares. If you want to be fancy, you can even compute nice “state-space” models from these linear ARMAX models, using a family of methods that are called subspace identification. This will yield smaller models and simplify your control synthesis problem.
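The ARX flavor of this is literally one least-squares solve. A minimal sketch (with made-up coefficients, no moving-average term, and data simulated from the model itself):

```python
import numpy as np

# Fit an ARX model  y[t] = a1*y[t-1] + a2*y[t-2] + b1*u[t-1]
# by least squares, on data generated from those same dynamics.
rng = np.random.default_rng(0)
a1, a2, b1 = 1.2, -0.4, 0.5           # assumed "true" parameters
T = 500
u = rng.standard_normal(T)            # excitation input
y = np.zeros(T)
for t in range(2, T):
    y[t] = a1*y[t-1] + a2*y[t-2] + b1*u[t-1] + 0.01*rng.standard_normal()

# One regression row per time step: lagged outputs and inputs.
X = np.column_stack([y[1:-1], y[:-2], u[1:-1]])
theta, *_ = np.linalg.lstsq(X, y[2:], rcond=None)
# theta recovers (a1, a2, b1) up to noise.
```

With enough excitation in the input, the recovered coefficients land within noise of the true ones; subspace methods then turn such input-output fits into compact state-space models.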

On the other hand, you can go in a completely different direction and make your time-series predictor nonlinear. You can use a neural network to predict the next output from your history. If you want to get extra fancy, throw a transformer at the problem. I’m sure this will work great and build the best simulator without knowing anything about the problem at all.

So what’s the right level of modeling granularity for your problem? I don’t have a clear answer. In optimal control, the better your estimate, the better your performance. But maybe you care about the minimal amount of information you need to control something. How much is it?

You might think none. We’ve seen in class already that two systems that look completely different in open loop can look the same in closed loop. Feedback can correct modeling errors. The simplest example is a pair of scalar linear systems, one unstable and one stable in open loop. Feed in the input u[0]=1, and the “x” variable goes to infinity while the “z” variable goes to zero. However, under the negative state feedback rule “u[t]=-x[t]”, the two systems behave identically, and both quickly converge to zero.
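The equations from the original post don’t survive in this text version, so here is a sketch with assumed scalar dynamics (x[t+1] = 1.5 x[t] + u[t], unstable, and z[t+1] = 0.5 z[t] + u[t], stable) that reproduces the phenomenon: wildly different open-loop responses, matching closed-loop decay under the same feedback rule.

```python
def simulate(a, policy, x0=1.0, T=20):
    """Roll out x[t+1] = a*x[t] + u[t] under the given input policy."""
    x, traj = x0, []
    for t in range(T):
        x = a * x + policy(t, x)
        traj.append(x)
    return traj

# Open loop: impulse input u[0]=1, then zero.
impulse = lambda t, x: 1.0 if t == 0 else 0.0
x_open = simulate(1.5, impulse, x0=0.0)   # ~1.5**t, blows up
z_open = simulate(0.5, impulse, x0=0.0)   # ~0.5**t, dies out

# Closed loop: the same feedback u[t] = -x[t] applied to both.
feedback = lambda t, x: -x
x_cl = simulate(1.5, feedback)   # closed loop: x[t+1] = 0.5*x[t]
z_cl = simulate(0.5, feedback)   # closed loop: z[t+1] = -0.5*z[t]
```

Both closed-loop trajectories shrink by a factor of two per step, so the feedback has erased a huge open-loop modeling difference.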

Negative feedback is powerful and can drive solid performance in the face of huge model uncertainties. If you simply care about robust tracking or homeostatic behavior, perhaps you can get away with the most minimal system identification. Unfortunately, it’s not quite that easy. You can have two systems that look the same in open loop but have completely different closed-loop behavior. Karl Astrom has a relatively simple example that I described in an earlier post. There, one system has a filter between the controller and the state that slightly attenuates the frequencies needed to stabilize the system.

Now the question is whether Astrom’s pathological counterexample, where two systems look similar in open-loop simulation but are catastrophically different under feedback, is indicative of widespread problems. Probably not. I’m not convinced that you have to learn sophisticated robust control for most small-scale robotics demos. (Sorry, John, though complex aerospace systems are certainly another story.) I think the takeaway from Astrom’s examples is that your model should represent the sorts of disturbances and signals you will see out in the world. And it should be cognizant of the fact that you are going to use the model in closed loop, so you have to understand whether there are delays and noise between the actuation signal and the actuation action.

Of course, this makes sense to any graduate student who has worked on a real robot. Every robotics grad student I’ve spoken to has told me that investing the time in system identification makes the robotic performance infinitely better. Sometimes we have to sit with our dynamical systems for a long time before we know what we need to control them. Understanding what it means for our models to be good enough is the tricky part.

By Ben Recht

Dynamic direct (ranked) access of MSO query evaluation over SLP-compressed strings

from arXiv: Data Structures and Algorithms

Authors: Martín Muñoz

We present an algorithm that, given an index $t$, produces the $t$-th (lexicographically ordered) answer of an MSO query over a string. The algorithm requires linear-time preprocessing and builds a data structure that answers each of these calls in logarithmic time. We then show how to extend this algorithm to a string that is compressed by a straight-line program (SLP), again with linear-time preprocessing in the (compressed encoding of the) string, while maintaining logarithmic-time direct access to the original string. Lastly, we extend the algorithm to allow complex edits on the SLP after the direct-access data structure has been processed, which are translated into the data structure in logarithmic time. We do this by adapting a document editing framework introduced by Schmid and Schweikardt (PODS 2022). This work improves on a recent result on dynamic direct access of MSO queries over strings (Bourhis et al., ICDT 2025) by a log factor on the access procedure, and by extending the results to SLPs.
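To make the setting concrete, here is a toy illustration (not the paper's data structure) of direct access on an SLP: a linear-time pass computes the expansion length of every nonterminal, after which the $t$-th character of the string is retrieved by descending the grammar, in time proportional to the derivation depth.

```python
def slp_lengths(rules):
    """rules maps each nonterminal to a pair (left, right); any symbol
    not in rules is a terminal character of length 1."""
    length = {}
    def size(sym):
        if sym not in rules:
            return 1
        if sym not in length:
            l, r = rules[sym]
            length[sym] = size(l) + size(r)
        return length[sym]
    for sym in rules:
        size(sym)
    return length

def access(rules, length, sym, t):
    """Return the t-th character (0-based) of the expansion of sym,
    descending left or right according to the stored lengths."""
    while sym in rules:
        l, r = rules[sym]
        llen = length[l] if l in rules else 1
        if t < llen:
            sym = l
        else:
            sym, t = r, t - llen
    return sym

# S -> AB, A -> ab, B -> Aa: S expands to "ababa" (length 5).
rules = {'S': ('A', 'B'), 'A': ('a', 'b'), 'B': ('A', 'a')}
length = slp_lengths(rules)
```

The paper's contribution layers MSO query answers, ranked access, and edit support on top of this basic compressed-access idea, keeping all operations logarithmic.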

Tight (S)ETH-based Lower Bounds for Pseudopolynomial Algorithms for Bin Packing and Multi-Machine Scheduling

from arXiv: Data Structures and Algorithms

Authors: Karl Bringmann, Anita Dürr, Karol Węgrzycki

Bin Packing with $k$ bins is a fundamental optimisation problem in which we are given a set of $n$ integers and a capacity $T$ and the goal is to partition the set into $k$ subsets, each of total sum at most $T$. Bin Packing is NP-hard already for $k=2$ and a textbook dynamic programming algorithm solves it in pseudopolynomial time $\mathcal O(n T^{k-1})$. Jansen, Kratsch, Marx, and Schlotter [JCSS'13] proved that this time cannot be improved to $(nT)^{o(k / \log k)}$ assuming the Exponential Time Hypothesis (ETH). Their result has become an important building block, explaining the hardness of many problems in parameterised complexity. Note that their result is one log-factor short of being tight. In this paper, we prove a tight ETH-based lower bound for Bin Packing, ruling out time $2^{o(n)} T^{o(k)}$. This answers an open problem of Jansen et al. and yields improved lower bounds for many applications in parameterised complexity. Since Bin Packing is an example of multi-machine scheduling, it is natural to next study other scheduling problems. We prove tight lower bounds based on the Strong Exponential Time Hypothesis (SETH) for several classic $k$-machine scheduling problems, including makespan minimisation with release dates ($P_k|r_j|C_{\max}$), minimizing the number of tardy jobs ($P_k||\sum U_j$), and minimizing the weighted sum of completion times ($P_k||\sum w_j C_j$). For all these problems, we rule out time $2^{o(n)} T^{k-1-\varepsilon}$ for any $\varepsilon > 0$ assuming SETH, where $T$ is the total processing time; this matches classic $n^{\mathcal O(1)} T^{k-1}$-time algorithms from the 60s and 70s. Moreover, we rule out time $2^{o(n)} T^{k-\varepsilon}$ for minimizing the total processing time of tardy jobs ($P_k||\sum p_j U_j$), which matches a classic $\mathcal O(n T^{k})$-time algorithm and answers an open problem of Fischer and Wennmann [TheoretiCS'25].
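For the smallest case $k = 2$, the textbook pseudopolynomial dynamic program mentioned above reduces to a subset-sum table; a sketch (with $\mathcal O(nT)$ time, matching $\mathcal O(n T^{k-1})$):

```python
def packs_into_two_bins(items, T):
    """Textbook O(n*T) dynamic program for Bin Packing with k = 2 bins:
    the items fit into two bins of capacity T iff some subset has sum s
    with s <= T and (total - s) <= T."""
    total = sum(items)
    if total > 2 * T:
        return False
    # reachable[s]: some subset of the items seen so far sums to s (<= T).
    reachable = [True] + [False] * T
    for a in items:
        for s in range(T, a - 1, -1):   # descending: each item used once
            if reachable[s - a]:
                reachable[s] = True
    return any(reachable[s] and total - s <= T for s in range(T + 1))
```

For general $k$ the DP state records the loads of $k-1$ bins, giving the $T^{k-1}$ factor; the lower bounds above show that, up to the $2^{o(n)}$ term, this exponent cannot be improved under ETH.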

Extending Exact Integrality Gap Computations for the Metric TSP

from arXiv: Data Structures and Algorithms

Authors: William Cook, Stefan Hougardy, Moritz Petrich

The subtour relaxation of the traveling salesman problem (TSP) plays a central role in approximation algorithms and polyhedral studies of the TSP. A long-standing conjecture asserts that the integrality gap of the subtour relaxation for the metric TSP is exactly 4/3. In this paper, we extend the exact verification of this conjecture for small numbers of vertices. Using the framework introduced by Benoit and Boyd in 2008, we confirm their results up to n=10. We further show that for n=11 and n=12, the published lists of extreme points of the subtour polytope are incomplete: one extreme point is missing for n=11 and twenty-two extreme points are missing for n=12. We extend the enumeration of the extreme points of the subtour polytope to instances with up to 14 vertices in the general case. Restricted to half-integral vertices, we extend the enumeration of extreme points up to n=17. Our results provide additional support for the 4/3-Conjecture.

Optimal Enumeration of Eulerian Trails in Directed Graphs

from arXiv: Data Structures and Algorithms

Authors: Ben Bals, Solon P. Pissis, Matei Tinca

The BEST theorem, due to de Bruijn, van Aardenne-Ehrenfest, Smith, and Tutte, is a classical tool from graph theory that links the Eulerian trails in a directed graph $G=(V,E)$ with the arborescences in $G$. In particular, one can use the BEST theorem to count the Eulerian trails in $G$ in polynomial time. For enumerating the Eulerian trails in $G$, one could naturally resort to first enumerating the arborescences in $G$ and then exploiting the insight of the BEST theorem to enumerate the Eulerian trails in $G$: every arborescence in $G$ corresponds to at least one Eulerian trail in $G$. Instead, we take a simple and direct approach. Our central contribution is a remarkably simple algorithm to directly enumerate the $z_T$ Eulerian trails in $G$ in the \emph{optimal} $O(m + z_T)$ time. As a consequence, our result improves on an implementation of the BEST theorem for counting Eulerian trails in $G$ when $z_T=o(n^2)$, and, in addition, it unconditionally improves the combinatorial $O(m\cdot z_T)$-time algorithm of Conte et al. [FCT 2021] for the same task. Moreover, we show that, with some care, our algorithm can be extended to enumerate Eulerian trails in directed multigraphs in optimal time, enabling applications in bioinformatics and data privacy.
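As a baseline for comparison (a brute-force sketch, far from the paper's optimal $O(m + z_T)$ bound), one can enumerate Eulerian trails by backtracking over unused arcs:

```python
from collections import defaultdict

def eulerian_trails(edges):
    """Enumerate all Eulerian trails of a directed graph by backtracking.
    A trail is reported as its sequence of visited vertices. This is a
    naive baseline, not the optimal-delay algorithm of the paper."""
    out = defaultdict(list)
    indeg = defaultdict(int)
    index = defaultdict(list)
    for i, (u, v) in enumerate(edges):
        out[u].append(v)
        indeg[v] += 1
        index[u].append(i)
    # A trail must start where outdeg - indeg = 1; for an Eulerian
    # circuit, any vertex with outgoing arcs can start.
    starts = [u for u in out if len(out[u]) - indeg[u] == 1]
    if not starts:
        starts = list(out)
    used = [False] * len(edges)

    def extend(v, path):
        moved = False
        for i in index[v]:
            if not used[i]:
                moved = True
                used[i] = True
                yield from extend(edges[i][1], path + [edges[i][1]])
                used[i] = False
        if not moved and all(used):   # dead end with every arc used
            yield path

    for s in starts:
        yield from extend(s, [s])
```

Backtracking can waste exponential time on dead ends between outputs; the point of the result above is that all $z_T$ trails can be listed with only $O(m + z_T)$ total work.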

ExpanderGraph-128: A Novel Graph-Theoretic Block Cipher with Formal Security Analysis and Hardware Implementation

from arXiv: Data Structures and Algorithms

Authors: W. A. Susantha Wijesinghe

Lightweight block cipher design has largely focused on incremental optimization of established paradigms such as substitution--permutation networks, Feistel structures, and ARX constructions, where security derives from the algebraic complexity of individual components. We propose a different approach based on \emph{expander-graph interaction networks}, where diffusion and security arise from sparse structural connectivity rather than component sophistication. We present \textbf{ExpanderGraph-128 (EGC128)}, a 128-bit block cipher constructed as a 20-round balanced Feistel network. Each round applies a 64-bit nonlinear transformation governed by a 3-regular expander graph whose vertices execute identical 4-input Boolean functions on local neighborhoods. Security analysis combines MILP-based differential bounds, proven optimal through 10 rounds via SCIP, establishing 147.3-bit differential security and conservatively extrapolating to 413 bits for the full cipher. Linear analysis provides MILP bounds of $\geq 2^{145}$, while related-key evaluation shows no free rounds for any nonzero key difference. Additional tests confirm rapid algebraic degree growth and the absence of invariant affine subspaces. Implementation results demonstrate practical efficiency. FPGA synthesis on Xilinx Artix-7 achieves 261~Mbps at 100~MHz using only 380 LUTs, while ARM Cortex-M4F software requires 25.8~KB Flash and 1.66~ms per encryption. These results show that expander-graph-driven diffusion provides a promising design methodology for lightweight cryptography.
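The round structure described above is easy to sketch. Everything concrete below is hypothetical: a circulant 3-regular graph stands in for the expander, the 4-input Boolean function is made up, and there is no key schedule. It illustrates the balanced-Feistel mechanics, not the actual EGC128 specification.

```python
MASK64 = (1 << 64) - 1

def neighbors(v):
    """Stand-in 3-regular graph on 64 vertices (circulant with offsets 1, 8, 27).
    A real instance would use an expander; this is only illustrative."""
    return [(v + 1) % 64, (v + 8) % 64, (v + 27) % 64]

def round_function(r, subkey):
    """Each output bit is a 4-input Boolean function of bit v and its 3 neighbors.
    The specific function below is hypothetical, chosen only to be nonlinear."""
    x = (r ^ subkey) & MASK64
    out = 0
    for v in range(64):
        a = (x >> v) & 1
        b, c, d = ((x >> u) & 1 for u in neighbors(v))
        bit = a ^ (b & c) ^ d
        out |= bit << v
    return out

def feistel_encrypt(block, subkeys):
    """Balanced Feistel network over a 128-bit block split into 64-bit halves."""
    left, right = block >> 64, block & MASK64
    for k in subkeys:
        left, right = right, left ^ round_function(right, k)
    return (left << 64) | right

def feistel_decrypt(block, subkeys):
    """Inverse network: undo the rounds in reverse key order."""
    left, right = block >> 64, block & MASK64
    for k in reversed(subkeys):
        left, right = right ^ round_function(left, k), left
    return (left << 64) | right
```

Note that decryption works for any round function: invertibility of a Feistel network never depends on the round function being invertible, which is what lets the design put all its effort into the graph-driven diffusion.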

Early Pruning for Public Transport Routing

from arXiv: Data Structures and Algorithms

Authors: Andrii Rohovyi, Abdallah Abuaisha, Toby Walsh

Routing algorithms for public transport, particularly the widely used RAPTOR and its variants, often face performance bottlenecks during the transfer relaxation phase, especially on dense transfer graphs, when supporting unlimited transfers. This inefficiency arises from iterating over many potential inter-stop connections (walks, bikes, e-scooters, etc.). To maintain acceptable performance, practitioners often limit transfer distances or exclude certain transfer options, which can reduce path optimality and restrict the multimodal options presented to travellers. This paper introduces Early Pruning, a low-overhead technique that accelerates routing algorithms without compromising optimality. By pre-sorting transfer connections by duration and applying a pruning rule within the transfer loop, the method discards longer transfers at a stop once they cannot yield an earlier arrival than the current best solution. Early Pruning can be integrated with minimal changes to existing codebases and requires only a one-time preprocessing step. Across multiple state-of-the-art RAPTOR-based solutions, including RAPTOR, ULTRA-RAPTOR, McRAPTOR, BM-RAPTOR, ULTRA-McRAPTOR, and UBM-RAPTOR and tested on the Switzerland and London transit networks, we achieved query time reductions of up to 57%. This approach provides a generalizable improvement to the efficiency of transit pathfinding algorithms. Beyond algorithmic performance, Early Pruning has practical implications for transport planning. By reducing computational costs, it enables transit agencies to expand transfer radii and incorporate additional mobility modes into journey planners without requiring extra server infrastructure. This is particularly relevant for passengers in areas with sparse direct transit coverage, such as outer suburbs and smaller towns, where richer multimodal routing can reveal viable alternatives to private car use.
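The pruning rule itself can be sketched directly. The data layout below is hypothetical: `transfers_sorted` stands in for the pre-sorted footpath lists produced by the one-time preprocessing step, and `best_arrival` for the current best known arrival at the target.

```python
def relax_transfers(arrival, transfers_sorted, best_arrival):
    """Transfer relaxation with early pruning (sketch, hypothetical layout).

    arrival[u] is the tentative arrival time at stop u;
    transfers_sorted[u] lists (duration, v) footpaths out of u, pre-sorted
    by duration. As soon as a transfer from u cannot beat best_arrival,
    all longer transfers from u are skipped."""
    improved = {}
    for u, t_u in arrival.items():
        for duration, v in transfers_sorted[u]:
            t = t_u + duration
            if t >= best_arrival:
                break  # every remaining transfer from u is at least as long
            if t < arrival.get(v, float("inf")) and t < improved.get(v, float("inf")):
                improved[v] = t
    return improved
```

The `break` is the whole trick: because the lists are pre-sorted, one failed comparison prunes every remaining transfer at that stop, without affecting which labels are ultimately settled, so optimality is preserved.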

Weighted Set Multi-Cover on Bounded Universe and Applications in Package Recommendation

from arXiv: Data Structures and Algorithms

Authors: Nima Shahbazi, Aryan Esmailpour, Stavros Sintos

The weighted set multi-cover problem is a fundamental generalization of set cover that arises in data-driven applications where one must select a small, low-cost subset from a large collection of candidates under coverage constraints. In data management settings, such problems arise naturally either as expressive database queries or as post-processing steps over query results, for example, when selecting representative or diverse subsets from large relations returned by database queries for decision support, recommendation, fairness-aware data selection, or crowd-sourcing. While the general weighted set multi-cover problem is NP-complete, many practical workloads involve a \emph{bounded universe} of items that must be covered, leading to the Weighted Set Multi-Cover with Bounded Universe (WSMC-BU) problem, where the universe size is constant. In this paper, we develop exact and approximation algorithms for WSMC-BU. We first discuss a dynamic programming algorithm that solves WSMC-BU exactly in $O(n^{\ell+1})$ time, where $n$ is the number of input sets and $\ell=O(1)$ is the universe size. We then present a $2$-approximation algorithm based on linear programming and rounding, running in $O(\mathcal{L}(n))$ time, where $\mathcal{L}(n)$ denotes the complexity of solving a linear program with $O(n)$ variables. To further improve efficiency for large datasets, we propose a faster $(2+\varepsilon)$-approximation algorithm with running time $O(n \log n + \mathcal{L}(\log W))$, where $W$ is the ratio of the total weight to the minimum weight, and $\varepsilon$ is an arbitrary constant specified by the user. Extensive experiments on real and synthetic datasets demonstrate that our methods consistently outperform greedy and standard LP-rounding baselines in both solution quality and runtime, making them suitable for data-intensive selection tasks over large query outputs.
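To make the problem concrete, here is a small exact solver by memoized search over residual demand vectors. This is a sketch of the flavor of the approach, not the paper's $O(n^{\ell+1})$ dynamic program, and it assumes each set may be chosen at most once.

```python
from functools import lru_cache

def wsmc_exact(sets, weights, demand):
    """Exact weighted set multi-cover over a tiny universe: minimize the total
    weight of chosen sets (each usable at most once) so that element e is
    covered at least demand[e] times. Returns float('inf') if infeasible."""
    n = len(sets)

    @lru_cache(maxsize=None)
    def solve(i, residual):
        if not any(residual):
            return 0                       # all demands satisfied
        if i == n:
            return float("inf")            # out of sets, demand unmet
        best = solve(i + 1, residual)      # option 1: skip set i
        covered = tuple(max(0, r - (1 if e in sets[i] else 0))
                        for e, r in enumerate(residual))
        best = min(best, weights[i] + solve(i + 1, covered))  # option 2: take it
        return best

    return solve(0, tuple(demand))
```

Because the universe is bounded, the number of distinct residual vectors is small, which is exactly what the bounded-universe assumption buys; the paper's DP exploits the same structure with a sharper state space.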

Pairwise Exchanges of Freely Replicable Goods with Negative Externalities

from arXiv: Data Structures and Algorithms

Authors: Shangyuan Yang, Kirthevasan Kandasamy

We study a setting where a set of agents engage in pairwise exchanges of freely replicable goods (e.g., digital goods such as data), where two agents grant each other a copy of a good they possess in exchange for a good they lack. Such exchanges introduce a fundamental tension: while agents benefit from acquiring additional goods, they incur negative externalities when others do the same. This dynamic typically arises in real-world scenarios where competing entities may benefit from selective collaboration. For example, in a data sharing consortium, pharmaceutical companies might share (copies of) drug discovery data, when the value of accessing a competitor's data outweighs the risk of revealing their own. In our model, an altruistic central planner wishes to design an exchange protocol (without money), to structure such exchanges between agents. The protocol operates over multiple rounds, proposing sets of pairwise exchanges in each round, which agents may accept or reject. We formulate three key desiderata for such a protocol: (i) individual rationality: agents should not be worse off by participating in the protocol; (ii) incentive-compatibility: agents should be incentivized to share as much as possible by accepting all exchange proposals by the planner; (iii) stability: there should be no further mutually beneficial exchanges upon termination. We design an exchange protocol for the planner that satisfies all three desiderata. While the above desiderata are inspired by classical models for exchange, free-replicability and negative externalities necessitate novel and nontrivial reformalizations of these goals. We also argue that achieving Pareto-efficient agent utilities -- often a central goal in exchange models without externalities -- may be ill-suited in this setting.
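To make the tension concrete, here is a minimal sketch of the stability check under a hypothetical linear utility model (not the paper's formalization): agent $i$ agrees to give a copy of good $a$ for good $b$ iff its value for acquiring $b$ exceeds the externality it suffers from the counterparty copying $a$.

```python
def beneficial_exchanges(holdings, value, ext):
    """Enumerate mutually beneficial pairwise exchanges under a hypothetical
    linear model: agent i gives a copy of good a for good b iff
    value[i][b] (benefit of acquiring b) > ext[i][a] (externality i suffers
    when the counterparty copies a). An empty result means the current
    allocation is stable in this toy sense."""
    n = len(holdings)
    proposals = []
    for i in range(n):
        for j in range(i + 1, n):
            for a in sorted(holdings[i] - holdings[j]):   # i has a, j lacks it
                for b in sorted(holdings[j] - holdings[i]):  # j has b, i lacks it
                    if (value[i].get(b, 0) > ext[i].get(a, 0)
                            and value[j].get(a, 0) > ext[j].get(b, 0)):
                        proposals.append((i, j, a, b))
    return proposals
```

Running this to a fixpoint (execute proposals, recompute) mimics the stability desideratum: termination exactly when no further mutually beneficial exchange exists.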

Sunday, March 15

For \(R^3\) the problem is open. That's too bad. We live in \(R^3\)

from Computational Complexity

(If you live in Montgomery County Maryland OR if you care about Education, you MUST read this guest blog by Daniel Gottesman on Scott Aaronson's blog HERE.) 

(This post is a sequel to a prior post on this topic that was here. However, this post is self-contained---you don't need to have read the prior post.)  

(Later in the post I point to my open problems column that does what is in this post rigorously. However, that link might be hard to find, so here it is: HERE)



BILL: I have a nice problem to tell you about. First, the setup.

Say you have a finite coloring of \(R^n\).

A mono unit square is a set of four points that are

(a) all the same color, and

(b) form a square of side 1. The square does not need to be parallel to any of the axes.
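For concreteness, here is a quick checker (a sketch I am adding, not from the article) that four points form a unit square. Note a subtlety in dimension at least 3: the multiset of pairwise distances alone does not pin down a square, so the checker also requires the two diagonal pairs to be vertex-disjoint.

```python
import math
from itertools import combinations

def is_unit_square(points, tol=1e-9):
    """Do these four points (coordinate tuples, any dimension) form a square
    of side 1? Not necessarily axis-parallel. We check that four pairwise
    squared distances are 1 (sides), two are 2 (diagonals), and the two
    diagonal pairs share no endpoint. The disjointness condition matters:
    in R^3 there are non-square configurations with the same distance
    multiset {1, 1, 1, 1, sqrt(2), sqrt(2)}."""
    if len(points) != 4:
        return False
    pairs = list(combinations(range(4), 2))
    d2 = [sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
          for i, j in pairs]
    sides = [p for p, d in zip(pairs, d2) if math.isclose(d, 1, abs_tol=tol)]
    diags = [p for p, d in zip(pairs, d2) if math.isclose(d, 2, abs_tol=tol)]
    return (len(sides) == 4 and len(diags) == 2
            and not set(diags[0]) & set(diags[1]))
```

For example, the tilted square (0,0), (0.6,0.8), (-0.2,1.4), (-0.8,0.6) passes, while the 3D configuration (0,0,0), (0,0,1), (1,0,0), (1/2, √3/2, 0) has the right distance multiset but fails the diagonal test.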

DARLING: Okay. What is the problem?

BILL:  It is known that for all  2-colorings of \(R^6\) there is a mono unit square.

DARLING: \(R^6\)? Really! That's hilarious! Surely, better is known.

BILL: Yes, better is known. And stop calling me Shirley.

DARLING: Okay, so what else is known?

BILL: An observation about the \(R^6\) result gives us the result for \(R^5\). (The \(R^5\) result also follows from a different technique.) Then a much harder proof gives us the result for \(R^4\). It is easy to construct a coloring of \(R^2\) without a mono unit square. The problem for \(R^3\) is open.

DARLING: That's too bad. We live in \(R^3\).

DARLING: Someone should write an article about all this including proofs of all the known results, open problems,  and maybe a few new things.

BILL: By someone you mean Auguste Gezalyan (got his PhD in CS, topic Comp Geom, at UMCP), Ryan Parker (ugrad working on Comp Geom at UMCP), and Bill Gasarch (that's me!). Good idea!

A FEW WEEKS LATER

BILL: Done! See here. And I call the problem about \(R^3\) The Darling Problem.

DARLING: Great! Now that you have an in-depth knowledge of the problem---

BILL: Auguste and Ryan have an in-depth knowledge. Frankly, I'm out of my depth.

DARLING: Okay, then I'll ask them: What do you think happens in \(R^3\), and when do you think it will be proved?

AUGUSTE: I think there is a 2-coloring of \(R^3\) with no mono unit square.

RYAN: I think that for every 2-coloring of \(R^3\) there is a mono unit square.

BILL: I have no conjecture; however, I think this is the kind of problem that really could be solved. It has not been worked on that much, and it might be just one key idea away from being solved. It is my hope that this article and blog post inspire someone to work on it and solve it.

OBLIGATORY AI COMMENT

Auguste asked ChatGPT (or some AI) about the problem. It replied that the problem is open and is known as The Darling's Problem. This is rather surprising---Auguste asked the AI about this before I had submitted the article (it has since appeared) and before this blog post. So how did the AI know about it? It was on my website. I conjecture that Auguste used some of the same language we used in the paper, so the AI found our paper. The oddest thing about this is that I don't find this odd anymore.

 COLOR COMMENTARY  

The article appeared as a SIGACT News Open Problems Column. Are you better off reading it there or on my website, which is pointed to above? The SIGACT News version is (a) behind a paywall and (b) in black and white. The version on my website is (a) free access and (b) in color. You decide.

By gasarch


Linkage with plum blossoms

from David Eppstein

Remembering Joe Halpern (\(\mathbb{M}\)) focuses on Joe’s pivotal role in founding and guiding the CS section of the arXiv.

By David Eppstein

On Montgomery County public magnet schools: a guest post by Daniel Gottesman

from Scott Aaronson

Scott’s foreword: I’ve known fellow quantum computing theorist Daniel Gottesman, now at the University of Maryland, for a quarter-century at this point. Daniel has been a friend, colleague, coauthor, and one of the people from whom I’ve learned the most in my career. Today he writes about a topic close to my heart, and one to which I’ve regularly lent this blog over the decades: namely, the struggle to protect enrichment and acceleration in the United States (in this case, the public magnet programs in Montgomery County, Maryland) from the constant attempts to weaken or dismantle them. Thanks so much to Daniel for doing this, and please help out if you can!


Without further ado, Daniel Gottesman:

Scott has kindly let me write this guest post because I’d like to ask the readers of Shtetl-Optimized for help.  I live in Montgomery County, Maryland, and the county is getting ready to replace our current handful of great magnet programs with a plethora of mediocre ones.

Montgomery County has a generally quite good school system, but its gifted education programs are really inadequate at the elementary and middle school level.  Montgomery County Public Schools (MCPS) offers nothing at all for gifted children until 4th grade.  Starting in 4th grade, magnet programs are available, but there are not enough spaces for everyone who meets the minimum qualifications.  A few years ago, the elementary and middle school magnets were switched to a lottery system, meaning the highest-achieving students, who most need special programming, might or might not get in, based purely on luck of the draw.

The remaining bright spot has been the high school magnets.  Montgomery County has two well-known and high-performing magnets, a STEM magnet at Montgomery Blair high school and an International Baccalaureate (IB) program at Richard Montgomery.  The Richard Montgomery IB program draws students from the whole county and the Blair Magnet draws from 2/3 of the county (with the remaining 1/3 eligible to go to another successful but less well-known magnet at Poolesville).  And these programs have so far resisted the lottery: They pick the best students from the application pool.

So with inadequate magnets in the lower grades and stellar magnets in high school, you can guess which one is up for a change.

MCPS now wants to reconfigure the high school magnet programs by splitting the county up into 6 regions.  Students will only be allowed to apply to programs in their home region.  Each region will have its own STEM magnet and its own IB program, as well as programs in the arts, medicine, and leadership.  And actually there are multiple program strands in each of these subjects, sometimes in different schools.  The whole plan is big and complicated, with close to 100 different programs around the county, more than half of them new.

The stated purpose of this plan is to expand access to these programs by admitting more students and reducing travel times to the programs.  And who could object to that?  There are definitely places in the county that are far from the current magnets and there are certainly more students that can benefit from high-quality magnets than there is currently space for.

The problem is that making high-quality magnets has not been a priority in the design process.  The last time MCPS tried adding regional magnets was about 7 years ago, when they added 3 regional IB programs while keeping Richard Montgomery available to students all over the county.  It was a failure: Test scores at the regional IB programs are far below those at Richard Montgomery (the worst-performing regional IB had only 24% getting a passing grade in even one subject in 2024, compared to 99% at Richard Montgomery) and all 3 are underenrolled.  Now MCPS has decided they can solve this problem by preventing students from going to Richard Montgomery to try to force them to go to the regional IBs.  In addition, they want to repeat the same mistakes with the STEM and other magnets.  The best programs in the county will shrink and only be accessible to a small fraction of students, leaving everyone else with new programs of likely highly-varying quality.

And if that were not enough, they want to do this revamp on a ridiculously short timeline.  The new programs are supposed to start in the 2027-8 school year, and between now and then, they need to recruit and train teachers for these 100 programs, create all the curricula for the first year of the programs (they are only planning to do one year at a time), and much much more.  The probability of a train wreck in the early years of the new system seems high.

Equity is certainly a concern driving this change.  And let me be clear: I am totally in favor of improving equity in the school system.  But I agree with Scott on this point: strong magnet programs in the public schools are pro-equity and weakening magnet programs is anti-equity.  Magnet programs are pro-equity even if the magnets are disproportionally populated by more affluent students, which is admittedly the case in MCPS: Affluent students will always have access to enrichment outside school and to private schools for the most affluent, whereas the public magnet programs are the only source of enrichment for those without those resources.

If MCPS really wants to address the difference in achievement between richer and poorer students, the way to do that is to create gifted programming starting from kindergarten.  If you wait until high school, it is unreasonable to expect even brilliant students to catch up to their also highly-capable peers who have been doing math and science camps and extracurriculars and contests and whatnot since they were little.  Some can manage it, but it is certainly not easy.  Unfortunately, MCPS’s notion of equity seems more focused on optimizing the demographic breakdown of magnet programs, which is most easily achieved by techniques which don’t improve — and usually degrade — the quality of the education provided.

So how can you help?  The Board of Education (BOE) is supposed to vote on this plan on Mar. 26.  Those of us opposed to it are hoping to sway enough members to vote to tell MCPS to investigate alternatives.  For instance, I have proposed a model with only 3 regions, which could also substantially improve access while preserving the strong existing magnets.

If you live in Montgomery County, write to BOE members telling them you oppose this change.  You can also sign a petition — there are many, but my favorite is here.

If you are an alumnus of one of the MCPS magnets, write to the BOE telling them how your education there was valuable to you and how a smaller program would not have served you as well.

If you are unconnected to Montgomery County, you can still spread the word.  If the BOE gets enough press inquiries asking about the many things that don’t add up in the MCPS proposal, perhaps they will recognize that this is a bad idea.

If you are really really interested in this topic and want to learn more: Last fall, I put together a long analysis of some of the flaws in MCPS’s plan and their claims, and of the alternative 3-region model.  You can find it here.

By Scott

Saturday, March 14

TR26-039 | Super-quadratic Lower Bounds for Depth-2 Linear Threshold Circuits | Lijie Chen, Avishay Tal, Yichuan Wang

from ECCC Papers

Proving lower bounds against depth-$2$ linear threshold circuits (a.k.a. $THR \circ THR$) is one of the frontier questions in complexity theory. Despite tremendous effort, our best lower bounds for $THR \circ THR$ hold only for a sub-quadratic number of gates; these were proven a decade ago by Tamaki (ECCC TR16) and Alman, Chan, and Williams (FOCS 2016) for a hard function in $E^{NP}$. In this work, we prove that there is a function $f \in E^{NP}$ that requires $n^{2.5-\varepsilon}$-size $THR \circ THR$ circuits for any $\varepsilon > 0$. We obtain our new results by designing a new $2^{n - n^{\Omega(\varepsilon)}}$-time algorithm for estimating the acceptance probability of an XOR of two $n^{2.5-\varepsilon}$-size $THR \circ THR$ circuits, and apply Williams' algorithmic method to obtain the desired lower bound.
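For readers meeting the class for the first time, a $THR \circ THR$ circuit is a linear threshold gate whose inputs are themselves linear threshold gates. A tiny evaluator (purely illustrative, with made-up weights; nothing here is from the paper):

```python
def thr(weights, theta, x):
    """A linear threshold gate: fires iff the weighted sum reaches theta."""
    return int(sum(w * xi for w, xi in zip(weights, x)) >= theta)

def thr_thr(bottom_gates, top_weights, top_theta, x):
    """Evaluate a depth-2 linear threshold (THR o THR) circuit: feed the input
    to every bottom gate, then threshold the vector of gate outputs."""
    ys = [thr(w, t, x) for (w, t) in bottom_gates]
    return thr(top_weights, top_theta, ys)
```

The lower-bound question is about how many bottom gates such circuits need to compute an explicit hard function; the paper's algorithmic route estimates the acceptance probability of such circuits faster than brute force.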


News for February 2026

from Property Testing Review


Apologies for the very late post! Last month was a bit calmer on the property testing front, with “merely” 3 papers we found. (Of course, if we missed any… let us know in the comments!)

Testing Monotonicity of Real-Valued Functions on DAGs, by Yuichi Yoshida (arXiv). Monotonicity of functions is a fundamental, well-studied property in the literature, and testing monotonicity on the line, the reals, the Boolean hypercube, and the hypergrid (among others) has been studied at great length (and yet is still not fully understood!). This paper considers a new twist on the question, where the object of study is a real-valued function defined on an \(n\)-vertex directed acyclic graph (DAG) provided to the algorithm. The key contribution of this work is showing that, on this type of structured poset, testing monotonicity requires \(\Omega(n^{1/2-\delta}/\sqrt{\varepsilon})\) non-adaptive queries for any constant \(\delta>0\), nearly matching the general-poset non-adaptive upper bound of Fischer, Lehman, Newman, Raskhodnikova, Rubinfeld, and Samorodnitsky (2002). The paper also provides a similar adaptive lower bound for one-sided testers. The author also establishes more fine-grained results (both upper and lower bounds), leveraging assumptions on either the range of the function or the sparsity of the DAG.
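The generic baseline behind such lower bounds is the pair tester: sample comparable pairs and reject upon seeing a violation. A sketch of that standard one-sided baseline — not Yoshida's construction, and the edge-sampling strategy is an assumption for illustration:

```python
import random

def pair_tester(edges, f, samples=200, seed=0):
    """One-sided monotonicity tester baseline on a DAG: sample comparable pairs
    (here, directed edges u -> v, meaning u precedes v in the partial order)
    and reject iff a violated pair with f(u) > f(v) is found.
    Always accepts monotone functions; a rejection comes with a witness."""
    rng = random.Random(seed)
    for _ in range(samples):
        u, v = rng.choice(edges)
        if f[u] > f[v]:
            return False  # witnessed a violation: f is not monotone
    return True  # no violation found (may wrongly accept far-from-monotone f)
```

The lower bounds in the paper say, roughly, that no non-adaptive strategy for choosing which pairs to query can do much better than this on general DAGs.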

The Power of Two Bases: Robust and copy-optimal certification of nearly all quantum states with few-qubit measurements, by Andrea Coladangelo, Jerry Li, Joseph Slote, and Ellen Wu (arXiv). Following recent works by Huang, Preskill, and Soleimanifar, and then Gupta, He, and O’Donnell, this paper considers the task of state certification (“is this unknown quantum state, which I am given copies of, equal, or very different from, the reference quantum state I want?”), which can be seen as the quantum analogue of identity testing in the classical distribution testing case, for pure reference target states. The key aspect of these works is that one requires the testing algorithm to make very “simple” (ideally single-qubit ones) on the copies of the unknwon \(n\)-qubit state: the underlying idea being that certifying a state given to you should be, in a very quantifiable sense, much “simpler” than preparing the reference state from scratch, otherwise the whole endeavor is sort of useless. Long story short, in this paper, the authors obtain both a very long title and a much more robust algorithm to perform this task, allowing to do tolerant state certification, with constant tolerance parameter. Only slight wrinkles: the algorithm requires one final measurement on logarithmically many qubits (not a single qubit, which would be the Holy Qugrail), and only works for “nearly all” reference states.

Instance-optimal estimation of \(L_2\)-norm, by Tomer Adar (arXiv). Given i.i.d. samples from a probability distribution \(p\) over an arbitrary discrete domain, estimate its collision probability \(\|p\|^2_2\) (equivalently, its \(\ell_2\)-norm) to a multiplicative \(1\pm \varepsilon\) factor. How hard can this be? Quite surprisingly, this question had not, in fact, been fully solved, and it exhibits a much more complex landscape than expected, in that the right answer is not the obvious guess. An algorithm matching a (known, yet unpublished) lower bound of Tugkan Batu and myself was posed as an open problem by Tugkan at WoLA 2025: in this work, the author solves the problem, showing that the lower bound is indeed tight, by providing an algorithm achieving the right sample complexity, \(O\left(\frac{1}{\varepsilon \|p\|_2}+\frac{\|p\|_3^3-\|p\|_2^4}{ \varepsilon^2\|p\|_2^4}\right)\). It feels good to see this very simple-looking (but not simple, it turns out!), fundamental question solved.
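For context, the textbook starting point here is the unbiased collision estimator, whose sample complexity the instance-optimal algorithm improves upon; a sketch of that baseline, not of the paper's algorithm:

```python
import random
from collections import Counter
from math import comb

def collision_estimate(samples):
    """Unbiased estimate of the collision probability ||p||_2^2:
    the fraction of colliding pairs among all C(m, 2) pairs of samples."""
    m = len(samples)
    colliding = sum(comb(c, 2) for c in Counter(samples).values())
    return colliding / comb(m, 2)

# Sanity check: the uniform distribution over k elements has ||p||_2^2 = 1/k.
rng = random.Random(0)
est = collision_estimate([rng.randrange(10) for _ in range(20_000)])
```

With 20,000 samples from the uniform distribution over 10 elements, `est` lands very close to 0.1.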

By Clement Canonne

Friday, March 13

Visibly Recursive Automata

from arXiv: Computational Complexity

Authors: Kévin Dubrulle, Véronique Bruyère, Guillermo A. Pérez, Gaëtan Staquet

As an alternative to visibly pushdown automata, we introduce visibly recursive automata (VRAs), composed of a set of classical automata that can call each other. VRAs are a strict extension of so-called systems of procedural automata, a model proposed by Frohme and Steffen. We study the complexity of standard language-theoretic operations and classical decision problems for VRAs. Since the class of deterministic VRAs forms a strict subclass in terms of expressiveness, we propose a (weaker) notion that does not restrict expressive power and which we call codeterminism. Codeterminism comes with many desirable algorithmic properties that we demonstrate by using it, e.g., as a stepping stone towards implementing complementation of VRAs.


On the Computational Hardness of Transformers

from arXiv: Computational Complexity

Authors: Barna Saha, Yinzhan Xu, Christopher Ye, Hantao Yu

The transformer has revolutionized modern AI across language, vision, and beyond. It consists of $L$ layers, each running $H$ attention heads in parallel and feeding the combined output to the subsequent layer. In attention, the input consists of $N$ tokens, each a vector of dimension $m$. The attention mechanism involves multiplying three $N \times m$ matrices, applying softmax to an intermediate product. Several recent works have advanced our understanding of the complexity of attention. Known algorithms for transformers compute each attention head independently. This raises a fundamental question that has recurred throughout TCS under the guise of ``direct sum'' problems: can multiple instances of the same problem be solved more efficiently than solving each instance separately? Many answers to this question, both positive and negative, have arisen in fields spanning communication complexity and algorithm design. Thus, we ask whether transformers can be computed more efficiently than $LH$ independent evaluations of attention. In this paper, we resolve this question in the negative, and give the first non-trivial computational lower bounds for multi-head multi-layer transformers. In the small embedding regime ($m = N^{o(1)}$), computing $LH$ attention heads separately takes $LHN^{2 + o(1)}$ time. We establish that this is essentially optimal under SETH. In the large embedding regime ($m = N$), one can compute $LH$ attention heads separately using $LHN^{ω+ o(1)}$ arithmetic operations (plus exponents), where $ω$ is the matrix multiplication exponent. We establish that this is optimal, by showing that $LHN^{ω- o(1)}$ arithmetic operations are necessary when $ω> 2$. Our lower bound in the large embedding regime relies on a novel application of the Baur-Strassen theorem, a powerful algorithmic tool underpinning the famous backpropagation algorithm.
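For reference, the "$LH$ independent evaluations" baseline that the lower bound is measured against can be sketched in a few lines. This is a toy sketch with invented weight shapes; real transformers add output projections, masking, feed-forward blocks, and so on:

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)  # subtract row max for stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def attention_head(Q, K, V):
    """One head on N tokens of dimension m: softmax(Q K^T / sqrt(m)) V."""
    return softmax(Q @ K.T / np.sqrt(Q.shape[1])) @ V

def heads_independently(X, heads):
    """The baseline the lower bound targets: evaluate every head separately
    (here the H heads of one layer, concatenated; a layer's output would
    then feed the next layer, repeated L times)."""
    return np.concatenate(
        [attention_head(X @ Wq, X @ Wk, X @ Wv) for (Wq, Wk, Wv) in heads],
        axis=1)
```

The paper's result says, roughly, that in both embedding regimes this head-by-head evaluation cannot be sped up in the aggregate.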


Space-Efficient Approximate Spherical Range Counting in High Dimensions

from arXiv: Computational Geometry

Authors: Andreas Kalavas, Ioannis Psarros

We study the following range searching problem in high-dimensional Euclidean spaces: given a finite set $P\subset \mathbb{R}^d$, where each $p\in P$ is assigned a weight $w_p$, and radius $r>0$, we need to preprocess $P$ into a data structure such that when a new query point $q\in \mathbb{R}^d$ arrives, the data structure reports the cumulative weight of points of $P$ within Euclidean distance $r$ from $q$. Solving the problem exactly seems to require space usage that is exponential in the dimension, a phenomenon known as the curse of dimensionality. Thus, we focus on approximate solutions where points up to $(1+\varepsilon)r$ away from $q$ may be taken into account, where $\varepsilon>0$ is an input parameter known during preprocessing. We build a data structure with near-linear space usage, and query time in $n^{1-Θ(\varepsilon^4/\log(1/\varepsilon))}+t_q^{\varrho}\cdot n^{1-\varrho}$, for some $\varrho=Θ(\varepsilon^2)$, where $t_q$ is the number of points of $P$ in the ambiguity zone, i.e., at distance between $r$ and $(1+\varepsilon)r$ from the query $q$. To the best of our knowledge, this is the first data structure with efficient space usage (subquadratic or near-linear for any $\varepsilon>0$) and query time that remains sublinear for any sublinear $t_q$. We supplement our worst-case bounds with a query-driven preprocessing algorithm to build data structures that are well-adapted to the query distribution.


On strictly output sensitive color frequency reporting

from arXiv: Computational Geometry

Authors: Erwin Glazenburg, Frank Staals

Given a set of $n$ colored points $P \subset \mathbb{R}^d$ we wish to store $P$ such that, given some query region $Q$, we can efficiently report the colors of the points appearing in the query region, along with their frequencies. This is the \emph{color frequency reporting} problem. We study the case where query regions $Q$ are axis-aligned boxes or dominance ranges. If $Q$ contains $k$ colors, the main goal is to achieve ``strictly output sensitive'' query time $O(f(n) + k)$. Firstly, we show that, for every $s \in \{ 2, \dots, n \}$, there exists a simple $O(ns\log_s n)$ size data structure for points in $\mathbb{R}^2$ that allows frequency reporting queries in $O(\log n + k\log_s n)$ time. Secondly, we give a lower bound for the weighted version of the problem in the arithmetic model of computation, proving that with $O(m)$ space one cannot achieve query times better than $Ω\left(φ\frac{\log (n / φ)}{\log (m / n)}\right)$, where $φ$ is the number of possible colors. This means that our data structure is near-optimal. We extend these results to higher dimensions as well. Thirdly, we present a transformation that allows us to reduce the space usage of the aforementioned data structure to $O(n(s φ)^\varepsilon \log_s n)$. Finally, we give an $O(n^{1+\varepsilon} + m \log n + K)$-time algorithm that can answer $m$ dominance queries in $\mathbb{R}^2$ with total output complexity $K$, while using only linear working space.
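For orientation, the brute-force baseline these data structures beat spends $O(n)$ time per query; a sketch (the names and point representation are my own):

```python
from collections import Counter

def color_frequencies(points, lo, hi):
    """Brute-force color frequency reporting for the axis-aligned box
    [lo, hi]: scan all points and tally colors of those inside.
    `points` is a list of (coordinates, color) pairs."""
    return Counter(color for p, color in points
                   if all(l <= x <= h for x, l, h in zip(p, lo, hi)))
```

The point of the paper's structures is to replace this linear scan with query time close to $O(\log n + k)$, where $k$ is the number of distinct colors actually reported.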


On the maximum number of tangencies among $1$-intersecting curves

from arXiv: Computational Geometry

Authors: Eyal Ackerman, Balázs Keszegh

According to a conjecture of Pach, there are $O(n)$ tangent pairs among any family of $n$ Jordan arcs in which every pair of arcs has precisely one common point and no three arcs share a common point. This conjecture was proved in two special cases; however, for the general case the currently best upper bound is only $O(n^{7/4})$. This is also the best known bound on the number of tangencies in the relaxed case where every pair of arcs has \emph{at most} one common point. We improve the bounds for the latter and former cases to $O(n^{5/3})$ and $O(n^{3/2})$, respectively. We also consider a few other variants of these questions; for example, we show that if the arcs are \emph{$x$-monotone}, each pair intersects at most once, and their left endpoints lie on a common vertical line, then the maximum number of tangencies is $Θ(n^{4/3})$. Without this last condition the number of tangencies is $O(n^{4/3}(\log n)^{1/3})$, improving a previous bound of Pach and Sharir. Along the way we prove a graph-theoretic theorem which extends a result of Erdős and Simonovits and may be of independent interest.


Fast and exact visibility on digitized shapes and application to saliency-aware normal estimation

from arXiv: Computational Geometry

Authors: Romain Negro, Jacques-Olivier Lachaud

Computing visibility on a geometric object requires heavy computations, since it requires identifying pairs of points that are visible to each other, i.e. there is a straight segment joining them that stays in the close vicinity of the object boundary. We propose to exploit a specific representation of digital sets based on lists of integral intervals in order to efficiently compute the complete visibility graph between lattice points of the digital shape. As a quite direct application, we then show how we can use visibility to estimate the normal vector field of a digital shape in an accurate and convergent manner while staying aware of the salient and sharp features of the shape.


Approximate Dynamic Nearest Neighbor Searching in a Polygonal Domain

from arXiv: Computational Geometry

Authors: Joost van der Laan, Frank Staals, Lorenzo Theunissen

We present efficient data structures for approximate nearest neighbor searching and approximate 2-point shortest path queries in a two-dimensional polygonal domain $P$ with $n$ vertices. Our goal is to store a dynamic set of $m$ point sites $S$ in $P$ so that we can efficiently find a site $s \in S$ closest to an arbitrary query point $q$. We will allow both insertions and deletions in the set of sites $S$. However, as even just computing the distance between an arbitrary pair of points $q,s \in P$ requires a substantial amount of space, we allow for approximating the distances. Given a parameter $\varepsilon > 0$, we build an $O(\frac{n}{\varepsilon}\log n)$ space data structure that can compute a $1+\varepsilon$-approximation of the distance between $q$ and $s$ in $O(\frac{1}{\varepsilon^2}\log n)$ time. Building on this, we then obtain an $O(\frac{n+m}{\varepsilon}\log n + \frac{m}{\varepsilon}\log m)$ space data structure that allows us to report a site $s \in S$ so that the distance between query point $q$ and $s$ is at most $(1+\varepsilon)$-times the distance between $q$ and its true nearest neighbor in $O(\frac{1}{\varepsilon^2}\log n + \frac{1}{\varepsilon}\log n \log m + \frac{1}{\varepsilon}\log^2 m)$ time. Our data structure supports updates in $O(\frac{1}{\varepsilon^2}\log n + \frac{1}{\varepsilon}\log n \log m + \frac{1}{\varepsilon}\log^2 m)$ amortized time.


Bounding the Fragmentation of B-Trees Subject to Batched Insertions

from arXiv: Data Structures and Algorithms

Authors: Michael A. Bender, Aaron Bernstein, Nairen Cao, Alex Conway, Martín Farach-Colton, Hanna Komlós, Yarin Shechter, Nicole Wein

The issue of internal fragmentation in data structures is a fundamental challenge in database design. A seminal result of Yao in this field shows that evenly splitting the leaves of a B-tree against a workload of uniformly random insertions achieves space utilization of around 69%. However, many database applications perform batched insertions, where a small run of consecutive keys is inserted at a single position. We develop a generalization of Yao's analysis to provide rigorous treatment of such batched workloads. Our approach revisits and reformulates the analytical structure underlying Yao's result in a way that enables generalization and is used to argue that even splitting works well for many workloads in our extended class. For the remaining workloads, we develop simple alternative strategies that provably maintain good space utilization.
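Yao's ~69% figure (it is $\ln 2 \approx 0.693$ in the limit) is easy to reproduce empirically. A simulation sketch under the abstract's uniform-random-insertion model, with leaf capacity and sample size chosen by me:

```python
import random
from bisect import bisect_left, insort

def simulate_even_splits(n=100_000, cap=64, seed=1):
    """Insert n uniformly random keys into sorted leaves of capacity `cap`;
    a leaf that fills up splits evenly into two half-full leaves.
    Under Yao's analysis, utilization tends to ln 2 ~ 0.693."""
    rng = random.Random(seed)
    leaves = [[]]   # leaves[i] holds its keys in sorted order
    bounds = []     # bounds[i] = smallest key stored in leaves[i + 1]
    for _ in range(n):
        x = rng.random()
        i = bisect_left(bounds, x)          # leaf whose key range contains x
        insort(leaves[i], x)
        if len(leaves[i]) == cap:           # full leaf: even split
            mid = cap // 2
            leaves.insert(i + 1, leaves[i][mid:])
            leaves[i] = leaves[i][:mid]
            bounds.insert(i, leaves[i + 1][0])
    return n / (len(leaves) * cap)          # overall space utilization
```

With larger `cap` and `n` the measured utilization concentrates around $\ln 2$; the paper's question is what happens to this constant when keys arrive in consecutive batches instead of one at a time.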


Time, Message and Memory-Optimal Distributed Minimum Spanning Tree and Partwise Aggregation

from arXiv: Data Structures and Algorithms

Authors: Michael Elkin, Tanya Goldenfeld

Memory-(in)efficiency is a crucial consideration that oftentimes prevents deployment of state-of-the-art distributed algorithms in real-life modern networks. In the context of the MST problem, roughly speaking, there are three types of algorithms. The algorithm of Gallager-Humblet-Spira and its versions are memory- and message-efficient, but their running time is at least linear in the number of vertices $n$, even when the unweighted diameter $D$ is much smaller than $n$. The algorithm of Garay-Kutten-Peleg and its versions are time-efficient, but not message- or memory-efficient. More recent algorithms are time- and message-efficient, but are not memory-efficient. As a result, GHS-type algorithms are much more prominent in real-life applications than time-efficient ones. In this paper we develop a deterministic time-, message- and memory-efficient algorithm for the MST problem. It is also applicable to the more general partwise aggregation problem. We believe that our techniques will be useful for devising memory-efficient versions of many other distributed problems.


Pivot based correlation clustering in the presence of good clusters

from arXiv: Data Structures and Algorithms

Authors: David Rasmussen Lolck, Mikkel Thorup, Shuyi Yan

The classic pivot-based clustering algorithm of Ailon, Charikar and Chawla [JACM'08] achieves approximation factor $3$, but all concrete examples showing that it is no better than $3$ are based on some very good clusters, e.g., a complete graph minus a matching. We show that removing all good clusters before each pivot step improves the approximation ratio to $2.9991$. We also evaluate the proposed algorithm on synthetic datasets, where it performs remarkably well and improves over both the algorithm for locating good clusters and the classic pivot algorithm.
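The pivot algorithm the paper builds on is short enough to state in full. A sketch of the original Ailon-Charikar-Chawla procedure, not of the new variant with good-cluster removal:

```python
import random

def cc_pivot(nodes, positive, rng):
    """Ailon-Charikar-Chawla pivot for correlation clustering: repeatedly
    pick a uniformly random pivot, cluster it with its remaining '+'
    neighbours, and recurse on the rest.  `positive` holds the '+' pairs
    as frozensets; every other pair is labelled '-'."""
    nodes = list(nodes)
    clusters = []
    while nodes:
        v = rng.choice(nodes)
        cluster = [u for u in nodes if u == v or frozenset((u, v)) in positive]
        clusters.append(cluster)
        nodes = [u for u in nodes if u not in cluster]
    return clusters
```

On a union of disjoint '+' cliques the algorithm recovers the cliques exactly, whatever pivots are drawn; the bad instances from the abstract, such as a complete graph minus a matching, are the ones where a pivot step commits to a slightly wrong near-complete cluster.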


Enumerating All Directed Spanning Trees in Optimal Time

from arXiv: Data Structures and Algorithms

Authors: Paweł Gawrychowski, Marcin Knapik

We consider the problem of enumerating, for a given directed graph $G=(V,E)$ and a node $r\in V$, all directed spanning trees of $G$ rooted at $r$. For undirected graphs, the corresponding problem of enumerating all spanning trees has received considerable attention, culminating in the algorithm of Kapoor and Ramesh [SICOMP 1995] working in $\mathcal{O}(n+m+N)$ time, where $N, n, m$ denote the number of spanning trees, vertices, and edges of $G$, respectively. In the area of enumeration algorithms, this is known as Constant Amortised Time, or CAT. To achieve only constant time per spanning tree, the algorithm outputs the relative change between the subsequent spanning trees instead of the whole spanning trees themselves. The natural generalization to enumerating all directed spanning trees has been already considered by Gabow and Myers [SICOMP 1978], who provided an $\mathcal{O}(n+m+Nm)$ time algorithm. This time complexity has been improved upon a couple of times, and in 1998 Uno introduced the framework of trimming and balancing that allowed him to obtain an $\mathcal{O}(n+m\log n+N\log^{2}n)$ time algorithm for this problem. By plugging in later results it is immediate to improve the time complexity to $\mathcal{O}(n+m+N\log n)$, but achieving the optimal bound of $\mathcal{O}(n+m+N)$ seems problematic within this framework. In this paper, we show how to enumerate all directed spanning trees in $\mathcal{O}(n+m+N)$ time and $\mathcal{O}(n+m)$ space, matching the time bound for undirected graphs. Our improvement is obtained by designing a purely graph-theoretical characterization of graphs with very few directed spanning trees, and using their structure to speed up the algorithm.
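On tiny graphs, the count $N$ of directed spanning trees (arborescences) can be sanity-checked by brute force against the directed Matrix-Tree (Tutte) theorem. A sketch of that baseline, useful only for checking and nothing like the paper's optimal enumeration:

```python
from itertools import combinations

def det(M):
    """Cofactor-expansion determinant (fine for the tiny minors used below)."""
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))

def is_arborescence(nodes, tree_edges, r):
    """n-1 edges, in-degree 1 everywhere except the root r, all reachable from r."""
    indeg = {v: 0 for v in nodes}
    adj = {v: [] for v in nodes}
    for u, v in tree_edges:
        indeg[v] += 1
        adj[u].append(v)
    if indeg[r] != 0 or any(indeg[v] != 1 for v in nodes if v != r):
        return False
    seen, stack = {r}, [r]
    while stack:
        for w in adj[stack.pop()]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return len(seen) == len(nodes)

def brute_force_arborescences(nodes, edges, r):
    return [T for T in combinations(edges, len(nodes) - 1)
            if is_arborescence(nodes, T, r)]

def tutte_count(nodes, edges, r):
    """Directed Matrix-Tree theorem: the number of arborescences rooted at r
    equals det of the in-degree Laplacian with row and column r deleted."""
    idx = {v: i for i, v in enumerate(nodes)}
    L = [[0] * len(nodes) for _ in nodes]
    for u, v in edges:
        L[idx[v]][idx[v]] += 1
        L[idx[u]][idx[v]] -= 1
    keep = [i for i in range(len(nodes)) if i != idx[r]]
    return det([[L[i][j] for j in keep] for i in keep])
```

The brute force takes $\binom{m}{n-1}$ subsets per query; the whole point of the CAT line of work is to output each of the $N$ trees with only constant amortised overhead.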


Adapting Dijkstra for Buffers and Unlimited Transfers

from arXiv: Data Structures and Algorithms

Authors: Denys Katkalo, Andrii Rohovyi, Toby Walsh

In recent years, RAPTOR-based algorithms have been considered the state of the art for path-finding with unlimited transfers without preprocessing. However, this status largely stems from the evolution of routing research, where Dijkstra-based solutions were superseded by timetable-based algorithms without a systematic comparison. In this work, we revisit classical Dijkstra-based approaches for public transit routing with unlimited transfers and demonstrate that Time-Dependent Dijkstra (TD-Dijkstra) outperforms MR. However, efficient TD-Dijkstra implementations rely on filtering dominated connections during preprocessing, which assumes passengers can always switch to a faster connection. We show that this filtering is unsound when stops have buffer times, as it cannot distinguish between seated passengers who may continue without waiting and transferring passengers who must respect the buffer. To address this limitation, we introduce Transfer Aware Dijkstra (TAD), a modification that scans entire trip sequences rather than individual edges, correctly handling buffer times while maintaining performance advantages over MR. Our experiments on London and Switzerland networks show that we can achieve a more than two-fold speed-up over MR while producing optimal results on both networks with and without buffer times.
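For readers unfamiliar with the time-dependent variant, here is a minimal earliest-arrival Dijkstra where each edge carries an arrival-time function of the departure time (assumed FIFO, i.e. non-overtaking, which keeps label-setting correct). This is a hypothetical minimal model, not the paper's TAD algorithm with buffer times.

```python
import heapq

def td_dijkstra(graph, source, t0):
    """Earliest-arrival search on a time-dependent graph.

    graph[u] is a list of (v, f) pairs, where f(t) gives the arrival
    time at v when departing u at time t. FIFO edges assumed.
    """
    arrival = {source: t0}
    heap = [(t0, source)]
    while heap:
        t, u = heapq.heappop(heap)
        if t > arrival.get(u, float("inf")):
            continue  # stale heap entry
        for v, f in graph.get(u, []):
            tv = f(t)
            if tv < arrival.get(v, float("inf")):
                arrival[v] = tv
                heapq.heappush(heap, (tv, v))
    return arrival

# Toy network with fixed travel times encoded as closures:
g = {
    "A": [("B", lambda t: t + 5), ("C", lambda t: t + 2)],
    "C": [("B", lambda t: t + 2)],
}
times = td_dijkstra(g, "A", 0)
# earliest arrivals: A at 0, C at 2, B at 4 (via C, beating the direct 5)
```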


Beyond BFS: A Comparative Study of Rooted Spanning Tree Algorithms on GPUs

from arXiv: Data Structures and Algorithms

Authors: Abhijeet Sahu, Srikar Vilas Donur

Rooted spanning trees (RSTs) are a core primitive in parallel graph analytics, underpinning algorithms such as biconnected components and planarity testing. On GPUs, RST construction has traditionally relied on breadth-first search (BFS) due to its simplicity and work efficiency. However, BFS incurs an O(D) step complexity, which severely limits parallelism on high-diameter and power-law graphs. We present a comparative study of alternative RST construction strategies on modern GPUs. We introduce a GPU adaptation of the Path Reversal RST (PR-RST) algorithm, optimizing its pointer-jumping and broadcast operations for modern GPU architectures. In addition, we evaluate an integrated approach that combines a state-of-the-art connectivity framework (GConn) with Eulerian tour-based rooting. Across more than 10 real-world graphs, our results show that the GConn-based approach achieves up to 300x speedup over optimized BFS on high-diameter graphs. These findings indicate that the O(log n) step complexity of connectivity-based methods can outweigh their structural overhead on modern hardware, motivating a rethinking of RST construction in GPU graph analytics.
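The pointer-jumping primitive behind the O(log n) step complexity can be sketched sequentially: each round replaces `parent[v]` with `parent[parent[v]]` for every vertex, halving remaining depths, and a GPU implementation would run each round as one data-parallel kernel. An illustration of the primitive only, not the PR-RST algorithm itself.

```python
def pointer_jump(parent):
    """Collapse a parent forest to depth 1 by repeated jumping.

    parent[v] is v's parent; roots point to themselves. Each round
    halves the remaining depth, so O(log n) rounds suffice.
    """
    n = len(parent)
    rounds = 0
    while any(parent[v] != parent[parent[v]] for v in range(n)):
        parent = [parent[parent[v]] for v in range(n)]
        rounds += 1
    return parent, rounds

# Path 4 -> 3 -> 2 -> 1 -> 0, with root 0 pointing to itself:
roots, rounds = pointer_jump([0, 0, 1, 2, 3])
# after 2 rounds every vertex points directly at root 0
```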


Graph Generation Methods under Partial Information

from arXiv: Data Structures and Algorithms

Authors: Tong Sun, Jianshu Hao, Michael C. Fu, Guangxin Jiang

We study the problem of generating graphs with prescribed degree sequences for bipartite, directed, and undirected networks. We first propose a sequential method for bipartite graph generation and establish a necessary and sufficient interval condition that characterizes the admissible number of connections at each step, thereby guaranteeing global feasibility. Based on this result, we develop bipartite graph enumeration and sampling algorithms suitable for different problem sizes. We then extend these bipartite graph algorithms to the directed and undirected cases by incorporating additional connection constraints, as well as feasibility verification and symmetric connection steps, while preserving the same algorithmic principles. Finally, numerical experiments demonstrate the performance of the proposed algorithms, particularly their scalability to large instances where existing methods become computationally prohibitive.
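As background on feasibility conditions for degree sequences, here is the classical Gale–Ryser test for bipartite realizability; it is shown only as context for sequential generation and is not the paper's interval condition.

```python
def gale_ryser(a, b):
    """Gale-Ryser feasibility test for a bipartite degree-sequence pair.

    A simple bipartite graph with left degrees a and right degrees b
    exists iff sum(a) == sum(b) and, for every k, the sum of the k
    largest values of a is at most sum_j min(b_j, k).
    """
    if sum(a) != sum(b):
        return False
    a = sorted(a, reverse=True)
    for k in range(1, len(a) + 1):
        if sum(a[:k]) > sum(min(bj, k) for bj in b):
            return False
    return True

feasible = gale_ryser([2, 2, 2], [3, 2, 1])    # realizable
infeasible = gale_ryser([3, 3], [3, 3])        # fails at k = 2
```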


Faster Relational Algorithms Using Geometric Data Structures

from arXiv: Data Structures and Algorithms

Authors: Aryan Esmailpour, Stavros Sintos

Optimization tasks over relational data, such as clustering, often suffer from the prohibitive cost of join operations, which are necessary to access the full dataset. While geometric data structures like BBD trees yield fast approximation algorithms in the standard computational setting, their application to relational data remains unclear due to the size of the join output. In this paper, we introduce a framework that leverages geometric insights to design faster algorithms when the data is stored as the results of a join query in a relational database. Our core contribution is the development of the RBBD tree, a randomized variant of the BBD tree tailored for relational settings. Instead of constructing the RBBD tree in full, we leverage efficient sampling and counting techniques over relational joins to expand the tree on the fly, maintaining only the necessary parts. This allows us to simulate geometric query procedures without materializing the join result. As an application, we present algorithms that improve the state-of-the-art for relational $k$-center/means/median clustering by a factor of $k$ in running time while maintaining the same approximation guarantees. Our method is general and can be applied to various optimization problems in the relational setting.
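For reference, the $k$-center objective the paper accelerates admits a classical greedy 2-approximation (Gonzalez's farthest-point heuristic) in the standard, non-relational setting. The sketch below uses neither join sampling nor BBD trees; it only fixes the objective.

```python
import math

def gonzalez_k_center(points, k):
    """Greedy farthest-point k-center (Gonzalez), a 2-approximation.

    Repeatedly adds the point farthest from the current centers;
    returns the centers and the resulting covering radius.
    """
    centers = [points[0]]
    dist = [math.dist(p, centers[0]) for p in points]
    for _ in range(k - 1):
        i = max(range(len(points)), key=dist.__getitem__)
        centers.append(points[i])
        dist = [min(d, math.dist(p, points[i])) for d, p in zip(dist, points)]
    return centers, max(dist)

# Two tight pairs far apart; k = 2 should pick one center per pair:
pts = [(0, 0), (0, 1), (10, 0), (10, 1)]
centers, radius = gonzalez_k_center(pts, 2)
# covering radius 1.0
```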


Induced Minors and Coarse Tree Decompositions

from arXiv: Data Structures and Algorithms

Authors: Maria Chudnovsky, Julien Codsi, Ajaykrishnan E S, Daniel Lokshtanov

Let $G$ be a graph, $S \subseteq V(G)$ be a vertex set in $G$ and $r$ be a positive integer. The distance $r$-independence number of $S$ is the size of the largest subset $I \subseteq S$ such that no pair $u$, $v$ of vertices in $I$ have a path on at most $r$ edges between them in $G$. It has been conjectured [Chudnovsky et al., arXiv, 2025] that for every positive integer $t$ there exist positive integers $c$, $d$ such that every graph $G$ that excludes both the complete bipartite graph $K_{t,t}$ and the grid $\boxplus_t$ as an induced minor has a tree decomposition in which every bag has (distance $1$) independence number at most $c(\log n)^d$. We prove a weaker version of this conjecture where every bag of the tree decomposition has distance $16(\log n + 1)$-independence number at most $c(\log n)^d$. On the way we also prove a version of the conjecture where every bag of the decomposition has distance $8$-independence number at most $2^{c (\log n)^{1-(1/d)}}$.
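The distance-$r$ independence number defined above can be made concrete with a brute-force computation on a small graph (BFS distances plus subset enumeration); exponential, for illustration only.

```python
from itertools import combinations

def distance_r_independence(adj, S, r):
    """Distance-r independence number of a vertex set S, brute force.

    Returns the size of the largest subset of S whose vertices have
    pairwise hop distance strictly greater than r.
    """
    n = len(adj)

    def bfs(s):
        dist, frontier = {s: 0}, [s]
        while frontier:
            nxt = []
            for u in frontier:
                for v in adj[u]:
                    if v not in dist:
                        dist[v] = dist[u] + 1
                        nxt.append(v)
            frontier = nxt
        return dist

    d = {s: bfs(s) for s in S}
    for size in range(len(S), 0, -1):
        for I in combinations(S, size):
            if all(d[u].get(v, n + 1) > r for u, v in combinations(I, 2)):
                return size
    return 0

# Path 0-1-2-3-4 with S = V: at distance parameter r = 2, any two
# chosen vertices must be >= 3 hops apart, e.g. {0, 3} or {0, 4}.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
alpha2 = distance_r_independence(adj, [0, 1, 2, 3, 4], 2)
```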


On the PLS-Completeness of $k$-Opt Local Search for the Traveling Salesman Problem

from arXiv: Data Structures and Algorithms

Authors: Sophia Heimann, Hung P. Hoang, Stefan Hougardy

The $k$-Opt algorithm is a local search algorithm for the traveling salesman problem. Starting with an initial tour, it iteratively replaces at most $k$ edges in the tour with the same number of edges to obtain a better tour. Krentel (FOCS 1989) showed that the traveling salesman problem with the $k$-Opt neighborhood is complete for the class PLS (polynomial time local search). However, his proof requires $k \gg 1000$ and has a substantial gap. We provide the first rigorous proof of PLS-completeness and at the same time drastically lower the value of $k$ to $k \geq 15$, addressing an open question by Monien, Dumrauf, and Tscheuschner (ICALP 2010). Our result holds for both the general and the metric traveling salesman problem.
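The simplest member of the $k$-Opt family is 2-Opt, which repeatedly reverses a tour segment while that improves the tour length. A textbook sketch, not the paper's PLS construction:

```python
def two_opt(tour, dist):
    """2-Opt local search for TSP: reverse segments while improving.

    dist is a symmetric distance matrix; tour is a permutation of
    city indices, modified in place until 2-Opt-locally optimal.
    """
    n = len(tour)
    improved = True
    while improved:
        improved = False
        for i in range(n - 1):
            for j in range(i + 2, n):
                a, b = tour[i], tour[i + 1]
                c, d = tour[j], tour[(j + 1) % n]
                if a == d:
                    continue  # reversal would flip the whole tour
                if dist[a][c] + dist[b][d] < dist[a][b] + dist[c][d]:
                    tour[i + 1:j + 1] = reversed(tour[i + 1:j + 1])
                    improved = True
    return tour

# Four cities on a square (side 1, diagonal 2 in this toy metric);
# the crossing tour 0-2-1-3 (length 6) uncrosses to length 4:
D = [[0, 1, 2, 1],
     [1, 0, 1, 2],
     [2, 1, 0, 1],
     [1, 2, 1, 0]]
best = two_opt([0, 2, 1, 3], D)
```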
