We definitely are training on our test sets. This is fine.
I appreciate all of your feedback about why machine learning practice doesn’t seem to adaptively overfit to static test sets. One commenter proposed that Kaggle competitions wouldn’t tolerate people spamming the leaderboard. I agree that active Kaggle competitions wouldn’t accept querying a test set one million times, but on machine learning benchmarks, Google is just fine with it. Here’s a fun paper that evaluated the CIFAR10 test error three hundred thousand times for a single figure. I’m not exaggerating that number.
Since we query the test sets to death, the overwhelming answer to why running RL on the test set isn’t a problem seems to be that hyperparameter tuning isn’t powerful enough to overfit. I don’t buy this for so many reasons. First, hyperparameters are just vibes. What are the hyperparameters in a neural network?
The number of units in each layer
The number of layers
The architecture
The regularization
The weight initialization
The ADAM parameters
The input encoding
The length of the hamburger train
The random seed
The bugs in the code
As far as I can tell, the only parts of the neural network that are not hyperparameters are the weights after initialization.1 We can tell ourselves stories that this list of ten things is somehow too weak to overfit. But if you let me run RL directly on the neural network weights, I can get the test error to zero. Why is it that if we optimize everything else, we don’t get the error to zero?
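To see why that claim isn’t crazy, here is a toy sketch of my own (not from anything above): a linear model on a small synthetic test set, where the only feedback is the 0/1 test error and the “training algorithm” is blind random hill climbing on the weights. The sizes, step size, and query budget are arbitrary choices; since the number of test points is smaller than the dimension, a perfect separator exists, and the search typically finds it.

```python
# Toy sketch: optimizing weights against nothing but the test error.
# All sizes and constants here are illustrative, not from the post.
import numpy as np

rng = np.random.default_rng(0)
n_test, d = 20, 50
X_test = rng.standard_normal((n_test, d))
y_test = rng.choice([-1, 1], size=n_test)        # the "hidden" test labels

def test_error(w):
    """The only signal we use: 0/1 error on the test set."""
    return np.mean(np.sign(X_test @ w) != y_test)

w = rng.standard_normal(d)
best = test_error(w)
for t in range(20000):                           # ~20k test-set queries
    w_new = w + 0.3 * rng.standard_normal(d)     # random perturbation ("RL", loosely)
    err = test_error(w_new)
    if err <= best:                              # keep anything that isn't worse
        w, best = w_new, err
    if best == 0:
        break

print(f"test error after {t + 1} queries: {best:.2f}")
```

No gradients, no training set, no insight into the features: querying the test set enough times is all it takes in this toy setup.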
The Google Research paper I cited above, the one that queried the test set 300K times, was only varying the architecture. The research community at large runs a giant reinforcement learning algorithm in parallel: thousands of teams compare what everyone else has done in papers and drive their leaderboard errors toward zero. In Kaggle competitions, competitors don’t communicate directly, but they all surf the web looking for strategies. On ML benchmarks, everyone tries to replicate everyone else’s tricks and then hunts for novel innovations to hill climb from there.
And what happens? The test error gets really small over time! It goes to zero. Let’s stick with CIFAR10 for a bit because it’s not only a data set that has been queried to death and has absurdly low test error, but it’s one where my group did a replication study.
Did this massive frenzy of test set optimization adapt to the labels and yield poor out-of-sample behavior? It turns out it didn’t. The models with the best error on the public test set ended up being the best ones on the private test set.
The top model in gold was a huge convolutional neural network. It was one of the most recent models we tested. Not only did it have the lowest error on the CIFAR10 data set (2.9% error), but it had the best test error on the replicated CIFAR10.1 data set. That is, it generalized the best. Community reinforcement learning found a model with low test error and low generalization error. The bias and variance were both minimized.
In 2021, a Google team reported a vision transformer model that achieved 0.5% error on CIFAR10. If anyone knows how it fared on CIFAR10.1, please let me know in the comments.
This clustering of models “on the line,” where benchmark test error predicts out-of-sample performance, is not just a CIFAR10 artifact. It is a robust, well-replicated phenomenon, and we should try to understand it better to make sense of benchmarks and evaluations.
My guess for why the gap between theory and practice is so stark is pretty mundane: it’s that most learning theory tells us nothing about data.
In the test-set attack from last time, I only needed to reason about the labels. I said nothing about the features at all. If the test labels are a fixed hidden bit string of length n, then with n queries of the test error I can recover them exactly. It’s not hard to verify that you can get much closer to the labels than chance just by boosting random guesses. As Moritz put it, this is “Competing in a data science contest without reading the data.” You can hill climb without looking at the features. You can hill climb without looking at the training labels!
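Here is a small simulation of that boosting trick, in the spirit of the attack in Moritz’s post; the sizes are mine and the hidden labels are synthetic. The attacker never looks at features or training labels, only at the accuracy the leaderboard reports for each submission, and a majority vote over the better-than-chance random guesses lands well above chance.

```python
# Boosting random guesses against a hidden test set (no features needed).
# Sizes are illustrative; the hidden labels stand in for a leaderboard's secret.
import numpy as np

rng = np.random.default_rng(1)
n, k = 2000, 1000                       # test-set size, number of random queries
y = rng.choice([-1, 1], size=n)         # hidden test labels

def leaderboard_accuracy(guess):
    return np.mean(guess == y)          # the only thing the attacker observes

guesses = rng.choice([-1, 1], size=(k, n))
scores = np.array([leaderboard_accuracy(g) for g in guesses])

# Keep every guess that beat a coin flip, then take a coordinate-wise majority vote.
winners = guesses[scores > 0.5]
submission = np.sign(winners.sum(axis=0))
submission[submission == 0] = 1         # break ties arbitrarily

print(f"best single random guess: {scores.max():.3f}")
print(f"boosted majority vote:    {leaderboard_accuracy(submission):.3f}")
```

Each kept guess is only slightly correlated with the hidden labels, but averaging a few hundred of them amplifies that correlation. That is exactly why leaderboard queries leak information.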
However, this is only part of the story. Sadly, most learning theory bounds are derived without reading the data. The standard Hoeffding bound, which argues that if you evaluate K predictors on n test points the generalization gap will scale no worse than

$$\sqrt{\frac{\log K}{n}},$$

also never looks at the data. If you sample a random predictor, the mean test error is 1/2 and the variance is on the order of 1/n, where n is the number of test data points. Hence, if you sample K random predictors and don’t boost, your minimum test error will be roughly

$$\frac{1}{2} - \sqrt{\frac{\log K}{n}}.$$
Now, let’s say you have a set of functions that can effectively be represented by K archetypical functions. For example, a class with VC dimension d can be represented by about 2^d functions. Plug this into my expression and, lo and behold, you have a learning theory bound. This bound holds regardless of whether the VC class is any good at classifying your data. Our large deviation inequalities only care about the number of things you test, not what you are actually testing.
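To spell out that plug-in step (constants dropped; this is a heuristic calculation rather than a theorem):

$$
\min_{1 \le j \le K} \widehat{\mathrm{err}}(f_j) \;\approx\; \frac{1}{2} - \sqrt{\frac{\log K}{n}},
\qquad
K \approx 2^d \;\Longrightarrow\; \frac{1}{2} - \sqrt{\frac{d \log 2}{n}} \;\approx\; \frac{1}{2} - \sqrt{\frac{d}{n}},
$$

which is the familiar VC-style rate, obtained without ever looking at the features.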
Learning theory almost never says anything about data. This is because if we knew something about the data, we probably wouldn’t be using nonparametric prediction algorithms. But avoiding thinking about data is problematic: the intuitions we derive from learning theory end up wrong and misleading. Can we build new theories that reflect the last four decades of fruitful training on the test set?
1. It’s wild that this is the only thing you’re not allowed to optimize.