Theory of Computing Report

Thursday, February 19

Nash-convergence of Game Dynamics and Complexity

from arXiv: Computational Complexity

Authors: Oliver Biggar, Christos Papadimitriou, Georgios Piliouras

Does the failure of learning dynamics to converge globally to Nash equilibria stem from the geometry of the game or the complexity of computation? Previous impossibility results relied on game degeneracy, leaving open the case for generic, nondegenerate games. We resolve this by proving that while Nash-convergent dynamics theoretically exist for all nondegenerate games, computing them is likely intractable. We formulate the Impossibility Conjecture: if a locally efficient Nash-convergent dynamic exists for nondegenerate games, then $P=PPAD$. We validate this for three specific families of dynamics, showing their tractability would imply collapses such as $NP=RP$ or $CLS=PPAD$. En route, we settle the complexity of finding Nash equilibria of a given game that lie on a given affine subspace. Finally, we explain why the general conjecture remains open: we introduce a Proving Game to demonstrate that black-box reductions cannot distinguish between convergent and non-convergent dynamics in polynomial time. Our results suggest the barrier to Nash learning is not the non-existence of a vector field, but the intractability of computing it.

Improved Bounds for Discrete Voronoi Games

from arXiv: Computational Geometry

Authors: Mark de Berg, Geert van Wordragen

In the planar one-round discrete Voronoi game, two players $\mathcal{P}$ and $\mathcal{Q}$ compete over a set $V$ of $n$ voters represented by points in $\mathbb{R}^2$. First, $\mathcal{P}$ places a set $P$ of $k$ points, then $\mathcal{Q}$ places a set $Q$ of $\ell$ points, and then each voter $v\in V$ is won by the player who has placed a point closest to $v$. It is well known that if $k=\ell=1$, then $\mathcal{P}$ can always win $n/3$ voters and that this is worst-case optimal. We study the setting where $k>1$ and $\ell=1$. We present lower bounds on the number of voters that $\mathcal{P}$ can always win, which improve the existing bounds for all $k\geq 4$. As a by-product, we obtain improved bounds on small $\varepsilon$-nets for convex ranges. These results are for the $L_2$ metric. We also obtain lower bounds on the number of voters that $\mathcal{P}$ can always win when distances are measured in the $L_1$ metric.
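
To make the game concrete, here is a minimal Python sketch (not from the paper) that evaluates a single play of the one-round game: given voter points $V$ and the placements $P$ and $Q$, it counts the voters won by $\mathcal{P}$ under the $L_2$ metric. How ties are broken is not specified in the abstract, so it is exposed as a flag.

```python
import math

def voters_won_by_P(V, P, Q, tie_to_P=False):
    """Count voters won by player P in the one-round discrete Voronoi game.

    V, P, Q are lists of (x, y) points in the plane (L2 metric).
    A voter is won by the player owning the closest point; tie-breaking is
    not fixed by the abstract, so it is exposed as a flag here.
    """
    def dist(a, b):
        return math.hypot(a[0] - b[0], a[1] - b[1])

    won = 0
    for v in V:
        d_p = min(dist(v, p) for p in P)
        d_q = min(dist(v, q) for q in Q)
        if d_p < d_q or (tie_to_P and d_p == d_q):
            won += 1
    return won

# Toy instance with k = 2, l = 1: P wins the two voters near its points.
V = [(0, 0), (0, 1), (10, 0)]
P = [(0, 0.4), (0, 0.6)]
Q = [(10, 0)]
print(voters_won_by_P(V, P, Q))  # -> 2
```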

Ratio Covers of Convex Sets and Optimal Mixture Density Estimation

from arXiv: Computational Geometry

Authors: Spencer Compton, Gábor Lugosi, Jaouad Mourtada, Jian Qian, Nikita Zhivotovskiy

We study density estimation in Kullback-Leibler divergence: given an i.i.d. sample from an unknown density $p$, the goal is to construct an estimator $\widehat p$ such that $\mathrm{KL}(p,\widehat p)$ is small with high probability. We consider two settings involving a finite dictionary of $M$ densities: (i) model aggregation, where $p$ belongs to the dictionary, and (ii) convex aggregation (mixture density estimation), where $p$ is a mixture of densities from the dictionary. Crucially, we make no assumption on the base densities: their ratios may be unbounded and their supports may differ. For both problems, we identify the best possible high-probability guarantees in terms of the dictionary size, sample size, and confidence level. These optimal rates are higher than those achievable when density ratios are bounded by absolute constants; for mixture density estimation, they match existing lower bounds in the special case of discrete distributions. Our analysis of the mixture case hinges on two new covering results. First, we provide a sharp, distribution-free upper bound on the local Hellinger entropy of the class of mixtures of $M$ distributions. Second, we prove an optimal ratio covering theorem for convex sets: for every convex compact set $K\subset \mathbb{R}_+^d$, there exists a subset $A\subset K$ with at most $2^{8d}$ elements such that each element of $K$ is coordinate-wise dominated by an element of $A$ up to a universal constant factor. This geometric result is of independent interest; notably, it yields new cardinality estimates for $\varepsilon$-approximate Pareto sets in multi-objective optimization when the attainable set of objective vectors is convex.

On the Hardness of Approximation of the Fair k-Center Problem

from arXiv: Data Structures and Algorithms

Authors: Suhas Thejaswi

In this work, we study the hardness of approximation of the fair $k$-center problem. Here the data points are partitioned into groups and the task is to choose a prescribed number of data points from each group, called centers, while minimizing the maximum distance from any point to its closest center. Although a polynomial-time $3$-approximation is known for this problem in general metrics, it has remained open whether this approximation guarantee is tight or could be further improved, especially since the unconstrained $k$-center problem admits a polynomial-time factor-$2$ approximation. We resolve this open question by proving that, for every $\varepsilon>0$, achieving a $(3-\varepsilon)$-approximation is NP-hard, assuming $\text{P} \neq \text{NP}$. Our inapproximability results hold even when only two disjoint groups are present and at least one center must be chosen from each group. Further, they extend to the canonical one-per-group setting with $k$ groups (for arbitrary $k$), where exactly one center must be selected from each group. Consequently, the factor-$3$ barrier for fair $k$-center in general metric spaces is inherent, and existing $3$-approximation algorithms are optimal up to lower-order terms even in these restricted regimes. This result stands in sharp contrast to the $k$-supplier formulation, where both the unconstrained and fair variants admit factor-$3$ approximation in polynomial time.
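
For context on the factor-$2$ baseline mentioned above for the unconstrained problem, here is a short sketch of Gonzalez's classical farthest-first traversal (a standard algorithm, not part of this paper), which gives a $2$-approximation for unconstrained $k$-center and ignores any group constraints.

```python
def gonzalez_k_center(points, k, dist):
    """Gonzalez's farthest-first traversal: a classical 2-approximation
    for unconstrained k-center (no fairness/group constraints).

    points: list of data points; dist: metric on them; returns k centers.
    """
    centers = [points[0]]                       # arbitrary first center
    d = [dist(p, centers[0]) for p in points]   # distance to nearest chosen center
    while len(centers) < k:
        i = max(range(len(points)), key=lambda j: d[j])  # farthest point so far
        centers.append(points[i])
        d = [min(d[j], dist(points[j], points[i])) for j in range(len(points))]
    return centers

# Example with points on a line and Euclidean distance.
pts = [0.0, 1.0, 2.0, 10.0, 11.0]
print(gonzalez_k_center(pts, 2, lambda a, b: abs(a - b)))  # -> [0.0, 11.0]
```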

Submodular Maximization under Supermodular Constraint: Greedy Guarantees

from arXiv: Data Structures and Algorithms

Authors: Ajitesh Srivastava, Shanghua Teng

Motivated by a wide range of applications in data mining and machine learning, we consider the problem of maximizing a submodular function subject to supermodular cost constraints. In contrast to the well-understood setting of cardinality and matroid constraints, where greedy algorithms admit strong guarantees, the supermodular constraint regime remains poorly understood -- guarantees for greedy methods and other efficient algorithmic paradigms are largely open. We study this family of fundamental optimization problems under an upper-bound constraint on a supermodular cost function with curvature parameter $\gamma$. Our notion of supermodular curvature is less restrictive than prior definitions, substantially expanding the class of admissible cost functions. We show that our greedy algorithm, which iteratively includes elements maximizing the ratio of the objective and constraint functions, achieves a $\left(1 - e^{-(1-\gamma)}\right)$-approximation before stopping. We prove that this approximation is indeed tight for this algorithm. Further, if the objective function has a submodular curvature $c$, then we show that the bound improves to $\left(1 - (1- (1-c)(1-\gamma))^{1/(1-c)}\right)$, which can be improved further by continuing to violate the constraint. Finally, we show that the Greedy-Ratio-Marginal algorithm in conjunction with binary search leads to a bicriteria approximation for the dual problem -- minimizing a supermodular function under a lower-bound constraint on a submodular function. We conduct a number of experiments on a simulation of LLM agents debating over multiple rounds -- the task is to select a subset of agents to maximize correctly answered questions. Our algorithm outperforms all other greedy heuristics, and on smaller problems, it achieves the same performance as the optimal set found by exhaustive search.
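
The following Python sketch illustrates a ratio-greedy rule in the spirit of the one described (the paper's exact Greedy-Ratio-Marginal procedure and stopping rule may differ): elements are added in order of the ratio of marginal objective gain to marginal cost increase, stopping before the budget on the supermodular cost is exceeded. The value oracles f and g and the budget are illustrative.

```python
def ratio_greedy(ground_set, f, g, budget):
    """Greedy sketch: repeatedly add the element maximizing the ratio of the
    objective's marginal gain to the cost's marginal increase, stopping
    before the cost g exceeds the budget.

    f, g are set functions given as value oracles (f submodular, g supermodular
    in the intended use); this is an illustrative sketch, not the paper's
    exact Greedy-Ratio-Marginal procedure.
    """
    S = set()
    remaining = set(ground_set)
    while remaining:
        def ratio(e):
            gain = f(S | {e}) - f(S)
            cost = g(S | {e}) - g(S)
            return gain / cost if cost > 0 else float("inf")
        e = max(remaining, key=ratio)
        if g(S | {e}) > budget:
            break
        S.add(e)
        remaining.remove(e)
    return S

# Toy example: modular objective and a (slightly) supermodular cost.
f = lambda S: sum(S)                         # coverage-style value
g = lambda S: sum(S) + 0.1 * len(S) ** 2     # cost grows superadditively
print(ratio_greedy({1, 2, 3, 4}, f, g, budget=6.0))
```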

Dynamic and Streaming Algorithms for Union Volume Estimation

from arXiv: Data Structures and Algorithms

Authors: Sujoy Bhore, Karl Bringmann, Timothy M. Chan, Yanheng Wang

The union volume estimation problem asks to $(1\pm\varepsilon)$-approximate the volume of the union of $n$ given objects $X_1,\ldots,X_n \subset \mathbb{R}^d$. In their seminal work in 1989, Karp, Luby, and Madras solved this problem in time $O(n/\varepsilon^2)$ in an oracle model where each object $X_i$ can be accessed via three types of queries: obtain the volume of $X_i$, sample a random point from $X_i$, and test whether $X_i$ contains a given point $x$. This running time was recently shown to be optimal [Bringmann, Larsen, Nusser, Rotenberg, and Wang, SoCG'25]. In another line of work, Meel, Vinodchandran, and Chakraborty [PODS'21] designed algorithms that read the objects in one pass using polylogarithmic time per object and polylogarithmic space; this can be phrased as a dynamic algorithm supporting insertions of objects for union volume estimation in the oracle model. In this paper, we study algorithms for union volume estimation in the oracle model that support both insertions and deletions of objects. We obtain the following results:
- an algorithm supporting insertions and deletions in polylogarithmic update and query time and linear space (this is the first such dynamic algorithm, even for 2D triangles);
- an algorithm supporting insertions and suffix queries (which generalizes the sliding window setting) in polylogarithmic update and query time and space;
- an algorithm supporting insertions and deletions of convex bodies of constant dimension in polylogarithmic update and query time and space.
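
For intuition about the oracle model, here is a minimal sketch of the basic importance-sampling estimator that underlies the Karp-Luby-Madras approach (their algorithm refines the coverage-counting step to obtain the stated $O(n/\varepsilon^2)$ bound; the dynamic data structures of this paper are not shown).

```python
import random

def union_volume_estimate(objects, num_samples, rng=random):
    """Basic importance-sampling estimator for the volume of a union, in the
    oracle model from the abstract: each object supports .volume(), .sample(),
    and .contains(x).  Simple unbiased version only; Karp-Luby-Madras refine
    the coverage-counting step to reach their stated query bound.
    """
    volumes = [obj.volume() for obj in objects]
    total = sum(volumes)
    acc = 0.0
    for _ in range(num_samples):
        # Pick an object proportionally to its volume, then a point inside it.
        i = rng.choices(range(len(objects)), weights=volumes)[0]
        x = objects[i].sample()
        # 1 / (number of objects containing x) corrects for multiple coverage.
        c = sum(obj.contains(x) for obj in objects)
        acc += 1.0 / c
    return total * acc / num_samples

# Tiny 1D example: two overlapping intervals, true union volume = 1.5.
class Interval:
    def __init__(self, lo, hi): self.lo, self.hi = lo, hi
    def volume(self): return self.hi - self.lo
    def sample(self): return random.uniform(self.lo, self.hi)
    def contains(self, x): return self.lo <= x <= self.hi

print(union_volume_estimate([Interval(0, 1), Interval(0.5, 1.5)], 20000))
```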

Protecting the Undeleted in Machine Unlearning

from arXiv: Data Structures and Algorithms

Authors: Aloni Cohen, Refael Kohen, Kobbi Nissim, Uri Stemmer

Machine unlearning aims to remove specific data points from a trained model, often striving to emulate "perfect retraining", i.e., producing the model that would have been obtained had the deleted data never been included. We demonstrate that this approach, and security definitions that enable it, carry significant privacy risks for the remaining (undeleted) data points. We present a reconstruction attack showing that for certain tasks, which can be computed securely without deletions, a mechanism adhering to perfect retraining allows an adversary controlling merely $\omega(1)$ data points to reconstruct almost the entire dataset simply by issuing deletion requests. We survey existing definitions for machine unlearning, showing they are either susceptible to such attacks or too restrictive to support basic functionalities like exact summation. To address this problem, we propose a new security definition that specifically safeguards undeleted data against leakage caused by the deletion of other points. We show that our definition permits several essential functionalities, such as bulletin boards, summations, and statistical learning.

An $n^{2+o(1)}$ Time Algorithm for Single-Source Negative Weight Shortest Paths

from arXiv: Data Structures and Algorithms

Authors: Sanjeev Khanna, Junkai Song

We present a randomized algorithm for the single-source shortest paths (SSSP) problem on directed graphs with arbitrary real-valued edge weights that runs in $n^{2+o(1)}$ time with high probability. This result yields the first almost linear-time algorithm for the problem on dense graphs ($m = \Theta(n^2)$) and improves upon the best previously known bounds for moderately dense graphs ($m = \omega(n^{1.306})$). Our approach builds on the hop-reduction via shortcutting framework introduced by Li, Li, Rao, and Zhang (2025), which iteratively augments the graph with shortcut edges to reduce the negative hop count of shortest paths. The central computational bottleneck in prior work is the cost of explicitly constructing these shortcuts in dense regions. We overcome this by introducing a new compression technique using auxiliary Steiner vertices. Specifically, we construct these vertices to represent large neighborhoods compactly in a structured manner, allowing us to efficiently generate and propagate shortcuts while strictly controlling the growth of vertex degrees and graph size.

Fast Shortest Path in Graphs With Sparse Signed Tree Models and Applications

from arXiv: Data Structures and Algorithms

Authors: Édouard Bonnet, Colin Geniet, Eun Jung Kim, Sungmin Moon

A signed tree model of a graph $G$ is a compact binary structure consisting of a rooted binary tree whose leaves are bijectively mapped to the vertices of $G$, together with 2-colored edges $xy$, called transversal pairs, interpreted as bicliques or anti-bicliques whose sides are the leaves of the subtrees rooted at $x$ and at $y$. We design an algorithm that, given such a representation of an $n$-vertex graph $G$ with $p$ transversal pairs and a source $v \in V(G)$, computes a shortest-path tree rooted at $v$ in $G$ in time $O(p \log n)$. A wide variety of graph classes are such that for all $n$, their $n$-vertex graphs admit signed tree models with $O(n)$ transversal pairs: for instance, those of bounded symmetric difference, more generally of bounded sd-degeneracy, as well as interval graphs. As applications of our Single-Source Shortest Path algorithm and new techniques, we
- improve the runtime of the fixed-parameter algorithm for first-order model checking on graphs given with a witness of low merge-width from cubic [Dreier and Toruńczyk, STOC '25] to quadratic;
- give an $O(n^2 \log n)$-time algorithm for All-Pairs Shortest Path (APSP) on graphs given with a witness of low merge-width, generalizing a result known on twin-width [Twin-Width III, SICOMP '24];
- extend and simplify an $O(n^2 \log n)$-time algorithm for multiplying two $n \times n$ matrices $A, B$ of bounded twin-width in [Twin-Width V, STACS '23]: now $A$ solely has to be an adjacency matrix of a graph of bounded twin-width and $B$ can be arbitrary;
- give an $O(n^2 \log^2 n)$-time algorithm for APSP on graphs of bounded twin-width, bypassing the need for contraction sequences in [Twin-Width III, SICOMP '24; Bannach et al. STACS '24];
- give an $O(n^{7/3} \log^2 n)$-time algorithm for APSP on graphs of symmetric difference $O(n^{1/3})$.

Steering diffusion models with quadratic rewards: a fine-grained analysis

from arXiv: Data Structures and Algorithms

Authors: Ankur Moitra, Andrej Risteski, Dhruv Rohatgi

Inference-time algorithms are an emerging paradigm in which pre-trained models are used as subroutines to solve downstream tasks. Such algorithms have been proposed for tasks ranging from inverse problems and guided image generation to reasoning. However, the methods currently deployed in practice are heuristics with a variety of failure modes -- and we have very little understanding of when these heuristics can be efficiently improved. In this paper, we consider the task of sampling from a reward-tilted diffusion model -- that is, sampling from $p^{\star}(x) \propto p(x) \exp(r(x))$ -- given a reward function $r$ and pre-trained diffusion oracle for $p$. We provide a fine-grained analysis of the computational tractability of this task for quadratic rewards $r(x) = x^\top A x + b^\top x$. We show that linear-reward tilts are always efficiently sampleable -- a simple result that seems to have gone unnoticed in the literature. We use this as a building block, along with a conceptually new ingredient -- the Hubbard-Stratonovich transform -- to provide an efficient algorithm for sampling from low-rank positive-definite quadratic tilts, i.e. $r(x) = x^\top A x$ where $A$ is positive-definite and of rank $O(1)$. For negative-definite tilts, i.e. $r(x) = - x^\top A x$ where $A$ is positive-definite, we prove that the problem is intractable even if $A$ is of rank 1 (albeit with exponentially-large entries).
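
The claim that linear-reward tilts are efficiently sampleable is easiest to see in the Gaussian special case (an illustration only; the paper's argument applies to general diffusion oracles): if $p = \mathcal{N}(\mu, \Sigma)$ and $r(x) = b^\top x$, then $p^{\star}(x) \propto \exp\big(-\tfrac12 (x-\mu)^\top \Sigma^{-1}(x-\mu) + b^\top x\big) \propto \exp\big(-\tfrac12 (x-\mu-\Sigma b)^\top \Sigma^{-1}(x-\mu-\Sigma b)\big)$, i.e. $p^{\star} = \mathcal{N}(\mu + \Sigma b, \Sigma)$. The linear tilt merely shifts the mean, so sampling from the tilted density is no harder than sampling from $p$.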

Separating Oblivious and Adaptive Models of Variable Selection

from arXiv: Data Structures and Algorithms

Authors: Ziyun Chen, Jerry Li, Kevin Tian, Yusong Zhu

Sparse recovery is among the most well-studied problems in learning theory and high-dimensional statistics. In this work, we investigate the statistical and computational landscapes of sparse recovery with $\ell_\infty$ error guarantees. This variant of the problem is motivated by \emph{variable selection} tasks, where the goal is to estimate the support of a $k$-sparse signal in $\mathbb{R}^d$. Our main contribution is a provable separation between the \emph{oblivious} (``for each'') and \emph{adaptive} (``for all'') models of $\ell_\infty$ sparse recovery. We show that under an oblivious model, the optimal $\ell_\infty$ error is attainable in near-linear time with $\approx k\log d$ samples, whereas in an adaptive model, $\gtrsim k^2$ samples are necessary for any algorithm to achieve this bound. This establishes a surprising contrast with the standard $\ell_2$ setting, where $\approx k \log d$ samples suffice even for adaptive sparse recovery. We conclude with a preliminary examination of a \emph{partially-adaptive} model, where we show nontrivial variable selection guarantees are possible with $\approx k\log d$ measurements.

The S-Hamiltonian Cycle Problem

from arXiv: Data Structures and Algorithms

Authors: Antoine Amarilli, Arthur Lombardo, Mikaël Monet

Determining if an input undirected graph is Hamiltonian, i.e., if it has a cycle that visits every vertex exactly once, is one of the most famous NP-complete problems. We consider the following generalization of Hamiltonian cycles: for a fixed set $S$ of natural numbers, we want to visit each vertex of a graph $G$ exactly once and ensure that any two consecutive vertices can be joined in $k$ hops for some choice of $k \in S$. Formally, an $S$-Hamiltonian cycle is a permutation $(v_0,\ldots,v_{n-1})$ of the vertices of $G$ such that, for $0 \leq i \leq n-1$, there exists a walk between $v_i$ and $v_{i+1 \bmod n}$ whose length is in $S$. (We do not impose any constraints on how many times vertices can be visited as intermediate vertices of walks.) Of course Hamiltonian cycles in the standard sense correspond to $S=\{1\}$. We study the $S$-Hamiltonian cycle problem of deciding whether an input graph $G$ has an $S$-Hamiltonian cycle. Our goal is to determine the complexity of this problem depending on the fixed set $S$. It is already known that the problem remains NP-complete for $S=\{1,2\}$, whereas it is trivial for $S=\{1,2,3\}$ because any connected graph contains a $\{1,2,3\}$-Hamiltonian cycle. Our work classifies the complexity of this problem for most kinds of sets $S$, with the key new results being the following: we have NP-completeness for $S = \{2\}$ and for $S = \{2, 4\}$, but tractability for $S = \{1, 2, 4\}$, for $S = \{2, 4, 6\}$, for any superset of these two tractable cases, and for $S$ the infinite set of all odd integers. The remaining open cases are the non-singleton finite sets of odd integers, in particular $S = \{1, 3\}$. Beyond cycles, we also discuss the complexity of finding $S$-Hamiltonian paths, and show that our problems are all tractable on graphs of bounded cliquewidth.
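
As a small illustration of the definition (not of the paper's algorithms), the following brute-force Python checker decides, for a tiny graph and a finite set $S$, whether an $S$-Hamiltonian cycle exists; walks of a given length are found by repeatedly taking neighborhoods.

```python
from itertools import permutations

def has_walk_of_length_in_S(adj, u, v, S):
    """True if there is a walk from u to v whose length lies in the finite set S.
    adj: adjacency lists of an undirected graph; walks may revisit vertices."""
    reachable = {u}                      # endpoints of walks of length t from u
    for t in range(1, max(S) + 1):
        reachable = {w for x in reachable for w in adj[x]}
        if t in S and v in reachable:
            return True
    return False

def has_S_hamiltonian_cycle(adj, S):
    """Brute-force check (tiny graphs only): is there a cyclic ordering of all
    vertices in which every consecutive pair is joined by a walk of length in S?"""
    vertices = sorted(adj)
    for perm in permutations(vertices[1:]):      # fix the first vertex
        order = [vertices[0], *perm]
        if all(has_walk_of_length_in_S(adj, order[i], order[(i + 1) % len(order)], S)
               for i in range(len(order))):
            return True
    return False

# A path on 4 vertices has no ordinary (S = {1}) Hamiltonian cycle,
# but it does have a {1,2}-Hamiltonian cycle, e.g. 0, 2, 3, 1.
path4 = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
print(has_S_hamiltonian_cycle(path4, {1}))     # False
print(has_S_hamiltonian_cycle(path4, {1, 2}))  # True
```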

The Complexity Landscape of Two-Stage Robust Selection Problems with Budgeted Uncertainty

from arXiv: Data Structures and Algorithms

Authors: Marc Goerigk, Dorothee Henke, Lasse Wulf

A standard type of uncertainty set in robust optimization is budgeted uncertainty, where an interval of possible values for each parameter is given and the total deviation from their lower bounds is bounded. In the two-stage setting, discrete and continuous budgeted uncertainty have to be distinguished. The complexity of such problems is largely unexplored, in particular if the underlying nominal optimization problem is simple, such as for selection problems. In this paper, we give a comprehensive answer to long-standing open complexity questions for three types of selection problems and three types of budgeted uncertainty sets. In particular, we demonstrate that the two-stage selection problem with continuous budgeted uncertainty is NP-hard, while the corresponding two-stage representative selection problem is solvable in polynomial time. Our hardness result implies that also the two-stage assignment problem with continuous budgeted uncertainty is NP-hard.
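
For readers unfamiliar with the terminology, one common formalization (notation illustrative, and possibly differing from the paper's) is the following: with nominal costs $\underline{c}_i$, deviations $d_i \geq 0$, and budget $\Gamma$, continuous budgeted uncertainty uses the set $U = \{c : c_i = \underline{c}_i + \delta_i d_i,\ \delta \in [0,1]^n,\ \sum_i \delta_i \leq \Gamma\}$, while discrete budgeted uncertainty restricts $\delta \in \{0,1\}^n$, so at most $\Gamma$ parameters deviate and each deviating parameter jumps to its upper bound.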

Computing Tarski Fixed Points in Financial Networks

from arXiv: Data Structures and Algorithms

Authors: Leander Besting, Martin Hoefer, Lars Huth

Modern financial networks are highly connected and result in complex interdependencies of the involved institutions. In the prominent Eisenberg-Noe model, a fundamental aspect is clearing -- to determine the amount of assets available to each financial institution in the presence of potential defaults and bankruptcy. A clearing state represents a fixed point that satisfies a set of natural axioms. Existence can be established (even in broad generalizations of the model) using Tarski's theorem. While a maximal fixed point can be computed in polynomial time, the complexity of computing other fixed points is open. In this paper, we provide an efficient algorithm to compute a minimal fixed point that runs in strongly polynomial time. It applies in a broad generalization of the Eisenberg-Noe model with any monotone, piecewise-linear payment functions and default costs. Moreover, in this scenario we provide a polynomial-time algorithm to compute a maximal fixed point. For networks without default costs, we can efficiently decide the existence of fixed points in a given range. We also study claims trading, a local network adjustment to improve clearing, when networks are evaluated with minimal clearing. We provide an efficient algorithm to decide existence of Pareto-improving trades and compute optimal ones if they exist.
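
For background, the following sketch implements the classical Eisenberg-Noe clearing iteration without default costs (a standard construction, not this paper's new algorithm): starting from full payments and iterating the monotone clearing map converges to the maximal clearing vector.

```python
def eisenberg_noe_clearing(liabilities, external_assets, iters=1000, tol=1e-12):
    """Classical Eisenberg-Noe fixed-point iteration (no default costs).

    liabilities[i][j] = amount bank i owes bank j; external_assets[i] = e_i.
    Starting from full payments and iterating the monotone clearing map
    converges to the *maximal* clearing vector; the paper's contribution
    concerns minimal fixed points and richer payment functions.
    """
    n = len(liabilities)
    p_bar = [sum(liabilities[i]) for i in range(n)]            # total obligations
    pi = [[liabilities[i][j] / p_bar[i] if p_bar[i] > 0 else 0.0
           for j in range(n)] for i in range(n)]               # relative liabilities
    p = p_bar[:]                                               # start from full payment
    for _ in range(iters):
        incoming = [external_assets[i] + sum(pi[j][i] * p[j] for j in range(n))
                    for i in range(n)]
        new_p = [min(p_bar[i], incoming[i]) for i in range(n)]
        if max(abs(new_p[i] - p[i]) for i in range(n)) < tol:
            break
        p = new_p
    return p

# Bank 0 owes 2.0 but can only pay 1.0, so it defaults; bank 1 pays in full.
print(eisenberg_noe_clearing([[0.0, 2.0], [1.0, 0.0]], [0.0, 0.5]))  # [1.0, 1.0]
```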

When to Identify Is to Control: On the Controllability of Combinatorial Optimization Problems

from arXiv: Data Structures and Algorithms

Authors: Max Klimm, Jannik Matuschke

Consider a finite ground set $E$, a set of feasible solutions $X \subseteq \mathbb{R}^{E}$, and a class of objective functions $\mathcal{C}$ defined on $X$. We are interested in subsets $S$ of $E$ that control $X$ in the sense that we can induce any given solution $x \in X$ as an optimum for any given objective function $c \in \mathcal{C}$ by adding linear terms to $c$ on the coordinates corresponding to $S$. This problem has many applications, e.g., when $X$ corresponds to the set of all traffic flows, the ability to control implies that one is able to induce all target flows by imposing tolls on the edges in $S$. Our first result shows the equivalence between controllability and identifiability. If $X$ is convex, or if $X$ consists of binary vectors, then $S$ controls $X$ if and only if the restriction of $x$ to $S$ uniquely determines $x$ among all solutions in $X$. In the convex case, we further prove that the family of controlling sets forms a matroid. This structural insight yields an efficient algorithm for computing minimum-weight controlling sets from a description of the affine hull of $X$. While the equivalence extends to matroid base families, the picture changes sharply for other discrete domains. We show that when $X$ is equal to the set of $s$-$t$-paths in a directed graph, deciding whether an identifying set of a given cardinality exists is $\Sigma_2^{\mathsf{P}}$-complete. The problem remains $\mathsf{NP}$-hard even on acyclic graphs. For acyclic instances, however, we obtain an approximation guarantee by proving a tight bound on the gap between the smallest identifying sets for $X$ and its convex hull, where the latter corresponds to the $s$-$t$-flow polyhedron.

Condorcet Dimension and Pareto Optimality for Matchings and Beyond

from arXiv: Data Structures and Algorithms

Authors: Telikepalli Kavitha, Jannik Matuschke, Ulrike Schmidt-Kraepelin

We study matching problems in which agents form one side of a bipartite graph and have preferences over objects on the other side. A central solution concept in this setting is popularity: a matching is popular if it is a (weak) Condorcet winner, meaning that no other matching is preferred by a strict majority of agents. It is well known, however, that Condorcet winners need not exist. We therefore turn to a natural and prominent relaxation. A set of matchings is a Condorcet-winning set if, for every competing matching, a majority of agents prefers their favorite matching in the set over the competitor. The Condorcet dimension is the smallest cardinality of a Condorcet-winning set. Our main results reveal a connection between Condorcet-winning sets and Pareto optimality. We show that any Pareto-optimal set of two matchings is, in particular, a Condorcet-winning set. This implication continues to hold when we impose matroid constraints on the set of matched objects, and even when agents' valuations are given as partial orders. The existence picture, however, changes sharply with partial orders. While for weak orders a Pareto-optimal set of two matchings always exists, this is -- surprisingly -- not the case under partial orders. Consequently, although the Condorcet dimension for matchings is 2 under weak orders (even under matroid constraints), this guarantee fails for partial orders: we prove that the Condorcet dimension is $\Theta(\sqrt{n})$, and rises further to $\Theta(n)$ when matroid constraints are added. On the computational side, we show that, under partial orders, deciding whether there exists a Condorcet-winning set of a given fixed size is NP-hard. The same holds for deciding the existence of a Pareto-optimal matching, which we believe to be of independent interest. Finally, we show that the Condorcet dimension for a related problem on arborescences is also 2.
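
As a small illustration of the popularity notion the abstract starts from (restricted to strict preferences and perfect matchings, a simplification of the paper's setting), the following brute-force Python check tests whether a matching is popular.

```python
from itertools import permutations

def prefers(ranking, a, b):
    """True if an agent with strict preference list `ranking` prefers object a to b."""
    return ranking.index(a) < ranking.index(b)

def is_popular(matching, rankings, objects):
    """Brute-force popularity check on a complete bipartite instance with strict
    preferences, comparing only against perfect matchings.  A competitor beats
    `matching` if more agents prefer the competitor than prefer `matching`
    (one standard reading of "preferred by a strict majority")."""
    n = len(rankings)
    for other in permutations(objects):
        if other == matching:
            continue
        for_other = sum(prefers(rankings[i], other[i], matching[i]) for i in range(n))
        for_matching = sum(prefers(rankings[i], matching[i], other[i]) for i in range(n))
        if for_other > for_matching:
            return False
    return True

# Three agents, three objects, identical preferences a > b > c: no perfect
# matching is popular, since some competitor always makes two agents happier.
rankings = [["a", "b", "c"]] * 3
print(any(is_popular(m, rankings, ("a", "b", "c"))
          for m in permutations(("a", "b", "c"))))  # False
```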

Near-optimal population protocols on bounded-degree trees

from arXiv: Data Structures and Algorithms

Authors: Joel Rybicki, Jakob Solnerzik, Robin Vacus

We investigate space-time trade-offs for population protocols in sparse interaction graphs. In complete interaction graphs, optimal space-time trade-offs are known for the leader election and exact majority problems. However, it has remained open if other graph families exhibit similar space-time complexity trade-offs, as existing lower bound techniques do not extend beyond highly dense graphs. In this work, we show that -- unlike in complete graphs -- population protocols on bounded-degree trees do not exhibit significant asymptotic space-time trade-offs for leader election and exact majority. For these problems, we give constant-space protocols that have near-optimal worst-case expected stabilisation time. These new protocols achieve a linear speed-up compared to the state-of-the-art. Our results are based on two novel protocols, which we believe are of independent interest. First, we give a new fast self-stabilising 2-hop colouring protocol for general interaction graphs, whose stabilisation time we bound using a stochastic drift argument. Second, we give a self-stabilising tree orientation algorithm that builds a rooted tree in optimal time on any tree. As a consequence, we can use simple constant-state protocols designed for directed trees to solve leader election and exact majority fast. For example, we show that ``directed'' annihilation dynamics solve exact majority in $O(n^2 \log n)$ steps on directed trees.
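
For intuition about annihilation dynamics, here is a toy simulation of the classical four-state exact-majority protocol under a uniformly random scheduler on the complete interaction graph; the paper's contribution is achieving near-optimal time with constant states on bounded-degree trees (via a tree-orientation phase), which this sketch does not capture.

```python
import random

def exact_majority_annihilation(n_A, n_B, steps=200000, seed=0):
    """Classical four-state annihilation dynamics for exact majority, simulated
    on the complete interaction graph with a uniformly random scheduler.
    States: 'A', 'B' (strong), 'a', 'b' (weak).  Rules:
      A+B -> a+b   (strong opinions annihilate; #A - #B is invariant)
      A+b -> A+a,  B+a -> B+b   (strong agents convert weak ones)
    Each agent outputs its letter, ignoring case.
    """
    rng = random.Random(seed)
    pop = ['A'] * n_A + ['B'] * n_B
    n = len(pop)
    for _ in range(steps):
        i, j = rng.sample(range(n), 2)
        u, v = pop[i], pop[j]
        if {u, v} == {'A', 'B'}:
            pop[i] = 'a' if u == 'A' else 'b'
            pop[j] = 'a' if v == 'A' else 'b'
        elif u == 'A' and v == 'b':
            pop[j] = 'a'
        elif u == 'B' and v == 'a':
            pop[j] = 'b'
        elif v == 'A' and u == 'b':
            pop[i] = 'a'
        elif v == 'B' and u == 'a':
            pop[i] = 'b'
    return {s.lower() for s in pop}   # with n_A != n_B, a single letter remains

print(exact_majority_annihilation(6, 4))  # expected: {'a'}
```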

Bellman-Ford in Almost-Linear Time for Dense Graphs

from arXiv: Data Structures and Algorithms

Authors: George Z. Li, Jason Li, Junkai Zhang

We consider the single-source shortest paths problem on a directed graph with real-valued (possibly negative) edge weights and solve this problem in $n^{2+o(1)}$ time by refining the shortcutting procedure introduced in Li, Li, Rao, and Zhang (2026).
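
For reference, the classical $O(nm)$ baseline that these $n^{2+o(1)}$-time results improve upon for dense graphs is textbook Bellman-Ford; a minimal implementation follows.

```python
def bellman_ford(n, edges, source):
    """Textbook Bellman-Ford: single-source shortest paths with real-valued
    (possibly negative) edge weights in O(nm) time -- the classical baseline
    that the results above improve upon for dense graphs.

    edges: list of (u, v, w) directed edges; returns (dist, has_negative_cycle).
    """
    INF = float("inf")
    dist = [INF] * n
    dist[source] = 0.0
    for _ in range(n - 1):                      # n-1 rounds of relaxation
        changed = False
        for u, v, w in edges:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
                changed = True
        if not changed:
            break
    # One more pass: any further improvement witnesses a negative cycle.
    has_neg_cycle = any(dist[u] + w < dist[v] for u, v, w in edges)
    return dist, has_neg_cycle

edges = [(0, 1, 4.0), (0, 2, 2.0), (2, 1, -3.0), (1, 3, 1.0)]
print(bellman_ford(4, edges, 0))  # ([0.0, -1.0, 2.0, 0.0], False)
```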

Markov Chains with Rewinding

from arXiv: Data Structures and Algorithms

Authors: Amir Azarmehr, Soheil Behnezhad, Alma Ghafari, Madhu Sudan

Motivated by techniques developed in recent progress on lower bounds for sublinear time algorithms (Behnezhad, Roghani and Rubinstein, STOC 2023, FOCS 2023, and STOC 2024), we introduce and study a new class of randomized algorithmic processes that we call Markov Chains with Rewinding. In this setting, an algorithm interacts with a (partially observable) Markovian random evolution by strategically rewinding the Markov chain to previous states. Depending on the application, this may lead the evolution to desired states faster, or allow the agent to efficiently learn or test properties of the underlying Markov chain that may be infeasible or inefficient with passive observation. We study the task of identifying the initial state in a given partially observable Markov chain. Analysis of this question in specific Markov chains is the central ingredient in the above-cited works and we aim to systematize the analysis in our work. Our first result is that any pair of states distinguishable with any rewinding strategy can also be distinguished with a non-adaptive rewinding strategy (one whose rewinding choices are determined before observing any outcomes of the chain). Therefore, while rewinding strategies can be shown to be strictly more powerful than passive strategies (those that do not rewind back to previous states), adaptivity does not give additional power to a rewinding strategy in the absence of efficiency considerations. The difference becomes apparent however when we introduce a natural efficiency measure, namely the query complexity (i.e., the number of observations they need to identify distinguishable states). Our second main contribution is to quantify this efficiency gap. We present a non-adaptive rewinding strategy whose query complexity is within a polynomial of that of the optimal (adaptive) strategy, and show that such a polynomial loss is necessary in general.
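
The interaction model can be pictured with a small toy interface (illustrative only; the paper's formal model may differ): a hidden chain is advanced step by step, each step yields a partial observation, and the algorithm may rewind to any previously visited time.

```python
import random

class RewindableChain:
    """Toy model of the interaction described above: a hidden Markov chain that
    an algorithm observes partially, with the extra ability to rewind to the
    state it had at an earlier time step."""
    def __init__(self, transition, observe, start_state, seed=0):
        self.transition = transition      # state, rng -> random next state
        self.observe = observe            # state -> (partial) observation
        self.rng = random.Random(seed)
        self.history = [start_state]      # states visited so far

    def step(self):
        """Advance the chain one step and return the new observation."""
        nxt = self.transition(self.history[-1], self.rng)
        self.history.append(nxt)
        return self.observe(nxt)

    def rewind(self, t):
        """Rewind to the state at time t; the evolution from time t is then
        re-randomized on subsequent steps."""
        self.history = self.history[: t + 1]

# A lazy random walk on {0, 1, 2} where only the parity of the state is observed.
def transition(s, rng):
    return max(0, min(2, s + rng.choice([-1, 0, 1])))

chain = RewindableChain(transition, observe=lambda s: s % 2, start_state=1)
run1 = [chain.step() for _ in range(4)]
chain.rewind(0)                 # rewind to time 0 and re-run the evolution
run2 = [chain.step() for _ in range(4)]
print(run1, run2)               # two (generally different) runs from the same start
```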

Universally Optimal Decremental Tree Minima

from arXiv: Data Structures and Algorithms

Authors: Benjamin Aram Berendsohn

An algorithm on weighted graphs is called universally optimal if it is optimal for every input graph, in the worst case taken over all weight assignments. Informally, this means the algorithm is competitive even with algorithms that are optimized for only one specific input graph. Universal optimality was recently introduced [Haeupler et al. 2024] as an alternative to the stronger, but often unachievable instance optimality. In this paper, we extend the concept of universal optimality to data structures. In particular, we investigate the following dynamic graph problem: Given a vertex-weighted forest, maintain the minimum-weight vertex of every tree under edge deletions. The problem requires $Θ(\log n)$ amortized time per operation in general, but only $O(1)$ time if the initial forest is a path. We present a data structure that has optimal total running time for every fixed initial forest and every fixed number of operations/queries $m$, when taking the worst case over all weight assignments and operation sequences of length $m$. This definition of universal optimality is easily adapted to other data structure problems. Our result combines two techniques: (1) A decomposition of the input into paths, to take advantage of the $O(1)$-time path-specific data structure; and (2) splay trees [Sleator and Tarjan 1985], which, informally speaking, are used to optimally handle a certain sorting-related subproblem. We apply our data structure to solve problems related to Cartesian trees, path minimum queries, and bottleneck vertex/edge queries, each with a certain universal optimality guarantee. Our data structure also can be modified to support edge weights instead of vertex weights. Further, it generalizes to support semigroup sum queries instead of minimum queries, in universally optimal time.

Computing Approximate Pareto Frontiers for Submodular Utility and Cost Tradeoffs

from arXiv: Data Structures and Algorithms

Authors: Karan Vombatkere, Evimaria Terzi

In many data-mining applications, including recommender systems, influence maximization, and team formation, the goal is to pick a subset of elements (e.g., items, nodes in a network, experts to perform a task) to maximize a monotone submodular utility function while simultaneously minimizing a cost function. Classical formulations model this tradeoff via cardinality or knapsack constraints, or by combining utility and cost into a single weighted objective. However, such approaches require committing to a specific tradeoff in advance and return only a single solution, offering limited insight into the space of viable utility-cost tradeoffs. In this paper, we depart from the single-solution paradigm and examine the problem of computing representative sets of high-quality solutions that expose different tradeoffs between submodular utility and cost. For this, we introduce $(α_1,α_2)$-approximate Pareto frontiers that provably approximate the achievable tradeoffs between submodular utility and cost. Specifically, we formalize the Pareto-$\langle f,c \rangle$ problem and develop efficient algorithms for multiple instantiations arising from different combinations of submodular utility $f$ and cost functions $c$. Our results offer a principled and practical framework for understanding and exploiting utility-cost tradeoffs in submodular optimization. Experiments on datasets from diverse application domains demonstrate that our algorithms efficiently compute approximate Pareto frontiers in practice.

Wednesday, February 18

Devezer's Urn

from Ben Recht

LLMs make metascience easier, but that doesn't increase metascientific validity.

In a week of incredibly annoying and inaccurate AI discourse, I’m instead drawn to write about metascientific squabbles. Engaging with AI discourse means ignoring ignorant bait articles that falsely cast academia as a unipolar axis defined by a professor in the Pacific Northwest and looking at how AI is shaping or being shaped by our social systems.

So yes, I’m intrigued by this new AI-driven political science paper by Ryan Briggs, Jonathan Mellon, and Vincent Arel-Bundock. They use large language models to scour over one hundred thousand political science papers published after 2010 and find that only 2% report null findings in the abstract. They use a statistical model to argue that this number should be 80%. As Briggs said on social media, the authors think this finding is “disastrous” for political science.

This is the latest paper that invariably appears on my timeline claiming to have scientific proof that all science is wrong. Longtime readers know I’m no fan of quantitative social science. However, I’m even less of a fan of quantitative meta-science. I’ve written multiple posts about this bizarre practice. There’s no shorter path to a paper than data mining existing papers and tut-tutting a scientific community for its questionable research practices that pad CVs while failing science. People keep writing these metascience papers, and LLMs make writing them infinitely easier. I guess that means I have to keep writing these blogposts.

So let me ask you, argmin reader: what should the statistical summarization of political science abstracts look like? What is the stochastic process that generates pseudorandom numbers and produces arXiv PDFs? What are the sufficient statistics for this distribution? The authors don’t say. Instead, their paper posits a simplistic statistical model that Berna Devezer pejoratively calls the “urn model of science.” Political scientists reach into a magical Polya urn to find a hypothesis. The urn contains 75% true hypotheses and 25% false hypotheses. Then they test the hypothesis using a perfectly normally distributed experiment. The null hypotheses are rejected based on the nebulous holy parameters statisticians call “power” and “size.” Under this model, where the authors set the power to 25% (a number plucked from a previous meta-scientific screed by one of the coauthors), they compute that eighty percent number I mentioned above.
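For concreteness, here is a back-of-the-envelope reconstruction of that urn arithmetic (mine, not necessarily the paper's exact derivation), assuming the conventional 0.05 significance level on top of the stated 75/25 split and 25% power. A true hypothesis tested at 25% power yields a null finding with probability 1 - 0.25 = 0.75; a false hypothesis tested at level 0.05 yields a null finding with probability 1 - 0.05 = 0.95. The expected fraction of null findings is then 0.75 × 0.75 + 0.25 × 0.95 = 0.5625 + 0.2375 = 0.80. Eighty percent is a fact about the urn, computed before reading a single abstract.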

The authors think the gap between 80% and 2% is large and evidence of widespread research malpractice. They conclude, “The publication filter that we document here is not a benign pathology. It is a structural feature of our reporting practices that threatens the credibility and accumulation of knowledge in political science.”

Now, here’s the thing. I understand that metascientists think they can just appeal to a caricatured version of Karl Popper’s philosophy of science and think that when you get a finding like this, you have refuted the idea that people are practicing good research. But the one thing every scientist should learn is the immediate counterargument to Popperian falsification called the Quine-Duhem problem. When formulating a prediction about an experimental outcome, a hypothesis never stands alone. You need to append a long chain of auxiliary hypotheses about your mathematical models, your theoretical constructs, your conditions of ceteris paribus, your measurement devices, and so on, to make any prediction. When you see the opposite of what your hypothesis predicts, that means either the hypothesis is wrong or one of your auxiliary hypotheses is wrong. You need a logical conjunction of hypotheses to make a prediction. When an observation contradicts a prediction, any of the clauses in that conjunction could be to blame.

In the case of statistical meta-science, if your toy statistical model predicts a certain curve under “ideal” research practices, and you find a different curve, it’s possible that the curve derived from undergraduate probability has nothing to do with scientific practice.

I mean, come on, the urn model of scientific practice is more insulting than the stochastic parrots model of LLMs. We don’t do research by picking random experiments independently. Test choice is informed by past practice, advisors’ tastes, measurement constraints, and the whims of journal reviewers. We certainly don’t write papers about random tests.

And we should ask ourselves why these significance tests litter social science papers. It’s an unfortunate convention that everyone knows is harmful. To first order, everyone hates null hypothesis significance tests. Most people realize that there’s no faster way to strangle a discipline than with the logically incoherent garrote of the significance test.

Unfortunately, some die-hard proponents still believe that null-hypothesis significance testing will prevent people from being fooled by things that are too good to be true. Bizarrely, 100 years of significance testing has not yet convinced them that the promise of significance testing was always too good to be true.

Indeed, we’ve known it’s too good to be true since Neyman proposed the potential outcomes model. Even in the social sciences, we’ve known the statistical testing framework is useless for over fifty years. Significance testing "is never a sufficient condition for claiming that a theory has been usefully corroborated, that a meaningful empirical fact has been established, or that an experimental report ought to be published." The null ritual of power analyses and 0.05 rejections is incoherent, and it’s just a game evolutionarily designed to grease the publication market. As Mark Copelovitch perfectly put it, "the entire edifice causal identification champions have built over the last decade is mainly barriers to entry and primarily about methodological tastes."

The hardcore statistical wing of metascience has strong, peculiar normative beliefs about what science should be. Somehow, we fix all of science if we preregister every possible significance test and publish all observed outcomes. This view is not scientific, of course. There is no evidence whatsoever that ornate causal identification strategies, complex regressions, and dozens of pages of robustness checks “fix” quantitative social science. And yet, the proponents disguise their irrational normative claims about mathematical statistics in a language of rationality. They claim all knowledge apparently hinges on significance tests and complex causal inference machinery.

However, as Copelovitch put it, the credibility revolution’s central tenet that the only meaningful test of causation is a randomized clinical trial, whether real or imagined, is a matter of taste. There are many confusing mathematical tools you can learn to play this charade of credibility, but it’s just a framework of expertise that crowds out alternative explanation, interpretation, and intervention.

There is an allure to the quantitative end of metascience. Scientists feel they are best equipped to clean their own house. They think the historians, qualitative sociologists, and theorists who have described the nuances, idiosyncrasies, and power structures of contemporary science are all postmodernist weirdos ready to eat up the next Sokal Hoax. And yet, maybe we need those historians, philosophers, and sociologists and their disciplinary norms to understand more clearly the normative world that mainstream metascience wants.

By Ben Recht

Joe Halpern (1953-2026)

from Computational Complexity

Computer Science Professor Joseph Halpern passed away on Friday after a long battle with cancer. He was a leader in mathematical reasoning about knowledge. His paper with Yoram Moses, Knowledge and Common Knowledge in a Distributed Environment, received both the 1997 Gödel Prize and the 2009 Dijkstra Prize. Halpern also co-authored a comprehensive book on the topic.

Halpern helped create a model of knowledge representation consisting of a set of states of the world, where each person has a partition of the states into cells, and two states lie in the same cell if that person can't distinguish between them. You can use this system to define knowledge and common knowledge, and to model problems like the muddy children puzzle. It also serves as a great framework for temporal logic.
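To make the partition model concrete, here is a minimal toy sketch (my own illustration, not code from Halpern's work): an agent knows a fact at a state exactly when the fact holds throughout the agent's cell containing that state.

    # Minimal sketch of the partition model of knowledge (illustrative only).
    # A state is a possible world; an agent's partition groups together the
    # states that agent cannot tell apart.

    def cell(partition, state):
        # The block of the partition containing the given state.
        return next(block for block in partition if state in block)

    def knows(partition, fact, state):
        # The agent knows `fact` at `state` iff `fact` holds at every state
        # the agent cannot distinguish from `state`.
        return all(fact(s) for s in cell(partition, state))

    # Two muddy children: a state (a, b) records whether child 1 and child 2
    # are muddy. Each child sees the other's forehead but not its own.
    partition1 = [{(0, b), (1, b)} for b in (0, 1)]  # child 1's information
    partition2 = [{(a, 0), (a, 1)} for a in (0, 1)]  # child 2's information

    state = (1, 1)  # both children are muddy
    print(knows(partition1, lambda s: s[1] == 1, state))  # True: sees child 2
    print(knows(partition1, lambda s: s[0] == 1, state))  # False: own forehead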

Halpern led the creation of the Computing Research Repository (CoRR), now the computer science section of arXiv, and would later moderate CS papers for arXiv. 

Joe Halpern was the driving force behind the Theoretical Aspects of Rationality and Knowledge (TARK) conference, which attracts philosophers, economists, computer scientists and others to discuss what it means to know stuff. I had two papers in TARK 2009 in Stanford. But my favorite TARK memory came from a debate at the 1998 TARK conference at Northwestern. 

Consider the centipede game, in which two players alternate turns; on each turn the player to move can either play to the right (R/r) or defect (D/d) to end the game immediately, with payouts as in the diagram below.

[Figure: payoff diagram for the centipede game]

The game is solved by backward induction, working out that in each subgame the player to move does better by defecting.
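For readers who have not seen it carried out, backward induction here is mechanical; below is a small sketch with hypothetical payoffs (the actual numbers were in the diagram and are not reproduced in this text), just to show how defection propagates back to the very first move.

    # Backward induction on a toy centipede game with hypothetical payoffs.
    # At node i, player i % 2 moves: defecting ends the game with payoffs
    # payoffs[i]; playing right moves on to node i + 1. If nobody defects,
    # the game ends with payoffs `final`.

    def backward_induction(payoffs, final):
        outcome = final          # payoffs if play continues past the last node
        plan = []
        for i in reversed(range(len(payoffs))):
            mover = i % 2
            # The mover compares defecting now with the outcome of continuing.
            if payoffs[i][mover] >= outcome[mover]:
                outcome, move = payoffs[i], "defect"
            else:
                move = "right"
            plan.append((i, move))
        return outcome, list(reversed(plan))

    # Hypothetical payoffs with the classic centipede pattern: each mover's
    # defection payoff slightly beats what they would get one step later.
    payoffs = [(1, 0), (0, 2), (3, 1), (2, 4)]
    print(backward_induction(payoffs, final=(5, 3)))
    # ((1, 0), [(0, 'defect'), (1, 'defect'), (2, 'defect'), (3, 'defect')])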

The debate asked the following: Player 1 needs to think about the backward induction of the future moves, considering the case where player 2 played right in its first move. But this is an irrational move, so why should you assume player 2 is being rational when playing its second move later on?

Someone said such reasoning is fine, like when we assume that the square root of two is rational in order to get a contradiction. The counterargument: the square root of two does not "choose" to be irrational.

Thank you, Joe, for helping us think about knowledge and for giving us the forums to do so.

By Lance Fortnow

postdoc at ENS Lyon (apply by April 3, 2026)

from CCI: jobs

The LIP laboratory is opening a one-year post-doctoral fellowship in Computer Science at Ecole Normale Supérieure de Lyon, France. All research themes of the laboratory are eligible.

Website: https://www.ens-lyon.fr/LIP/index.php/latest-news/725-offre-post-doc-lip-2026-red
Email: [isabelle.guerin-lassous,loris.marchal]@ens-lyon.fr

By shacharlovett

TR26-023 | Separations above TFNP from Sherali-Adams Lower Bounds | Anna Gal, Noah Fleming, Deniz Imrek, Christophe Marciot

from ECCC Papers

Unlike in TFNP, for which there is an abundance of problems capturing natural existence principles which are incomparable (in the black-box setting), Kleinberg et al. [KKMP21] observed that many of the natural problems considered so far in the second level of the total function polynomial hierarchy (TF$\Sigma_2$) reduce to the Strong Avoid problem. In this work, we prove that the Linear Ordering Principle does not reduce to Strong Avoid in the black-box setting, exhibiting the first TF$\Sigma_2$ problem that lies outside of the class of problems reducible to Strong Avoid. The proof of our separation exploits a connection between total search problems in the polynomial hierarchy and proof complexity, recently developed by Fleming, Imrek, and Marciot [FIM25]. In particular, this implies that to show our separation, it suffices to show that there is no small proof of the Linear Ordering Principle in a $\Sigma_2$-variant of the Sherali-Adams proof system. To do so, we extend the classical pseudo-expectation method to the $\Sigma_2$ setting, showing that the existence of a $\Sigma_2$ pseudo-expectation precludes a $\Sigma_2$ Sherali-Adams proof. The main technical challenge is in proving the existence of such a pseudo-expectation; we manage to do so by solving a combinatorial covering problem about permutations. We also show that the extended pseudo-expectation bound implies that the Linear Ordering Principle cannot be reduced to any problem admitting a low-degree Sherali-Adams refutation.

Polynomial-time isomorphism test for $k$-generated extensions of abelian groups

from arXiv: Computational Complexity

Authors: Saveliy V. Skresanov

The group isomorphism problem asks whether two finite groups given by their Cayley tables are isomorphic or not. Although there are polynomial-time algorithms for some specific group classes, the best known algorithm for testing isomorphism of arbitrary groups of order $ n $ has time complexity $ n^{O(\log n)} $. We consider the group isomorphism problem for some extensions of abelian groups by $ k $-generated groups for bounded $ k $. In particular, we prove that one can decide isomorphism of abelian-by-cyclic extensions in polynomial time, generalizing a 2009 result of Le Gall for coprime extensions. As another application, we give a polynomial-time isomorphism test for abelian-by-simple group extensions, generalizing a 2017 result of Grochow and Qiao for central extensions. The main novelty of the proof is a polynomial-time algorithm for computing the unit group of a finite ring, which might be of independent interest.

Efficient quantum circuits for high-dimensional representations of SU(n) and Ramanujan quantum expanders

from arXiv: Computational Complexity

Authors: Vishnu Iyer, Siddhartha Jain, Stephen Jordan, Rolando Somma

We present efficient quantum circuits that implement high-dimensional unitary irreducible representations (irreps) of $SU(n)$, where $n \ge 2$ is constant. For dimension $N$ and error $ε$, the number of quantum gates in our circuits is polynomial in $\log(N)$ and $\log(1/ε)$. Our construction relies on the Jordan-Schwinger representation, which allows us to realize irreps of $SU(n)$ in the Hilbert space of $n$ quantum harmonic oscillators. Together with a recent efficient quantum Hermite transform, which allows us to map the computational basis states to the eigenstates of the quantum harmonic oscillator, this allows us to implement these irreps efficiently. Our quantum circuits can be used to construct explicit Ramanujan quantum expanders, a longstanding open problem. They can also be used to fast-forward the evolution of certain quantum systems.

Natural Privacy Filters Are Not Always Free: A Characterization of Free Natural Filters

from arXiv: Data Structures and Algorithms

Authors: Matthew Regehr, Bingshan Hu, Ethan Leeman, Pasin Manurangsi, Pierre Tholoniat, Mathias Lécuyer

We study natural privacy filters, which enable the exact composition of differentially private (DP) mechanisms with adaptively chosen privacy characteristics. Earlier privacy filters consider only simple privacy parameters such as Rényi-DP or Gaussian DP parameters. Natural filters account for the entire privacy profile of every query, promising greater utility for a given privacy budget. We show that, contrary to other forms of DP, natural privacy filters are not free in general. Indeed, we show that only families of privacy mechanisms that are well-ordered when composed admit free natural privacy filters.

Local Node Differential Privacy

from arXiv: Data Structures and Algorithms

Authors: Sofya Raskhodnikova, Adam Smith, Connor Wagaman, Anatoly Zavyalov

We initiate an investigation of node differential privacy for graphs in the local model of private data analysis. In our model, dubbed LNDP, each node sees its own edge list and releases the output of a local randomizer on this input. These outputs are aggregated by an untrusted server to obtain a final output. We develop a novel algorithmic framework for this setting that allows us to accurately answer arbitrary linear queries on a blurry approximation of the input graph's degree distribution. For some natural problems, the resulting algorithms match the accuracy achievable with node privacy in the central model, where data are held and processed by a trusted server. We also prove lower bounds on the error required by LNDP that imply the optimality of our algorithms for several fundamental graph statistics. We then lift these lower bounds to the interactive LNDP setting, demonstrating the optimality of our algorithms even when constantly many rounds of interaction are permitted. Obtaining our lower bounds requires new approaches, since those developed for the usual local model do not apply to the inherently overlapping inputs that arise from graphs. Finally, we prove structural results that reveal qualitative differences between local node privacy and the standard local model for tabular data.

A Weighted-to-Unweighted Reduction for Matroid Intersection

from arXiv: Data Structures and Algorithms

Authors: Aditi Dudeja, Mara Grilnberger

Given two matroids $\mathcal{M}_1$ and $\mathcal{M}_2$ over the same ground set, the matroid intersection problem is to find the maximum cardinality common independent set. In the weighted version of the problem, the goal is to find a maximum weight common independent set. It has been a matter of interest to find efficient approximation algorithms for this problem in various settings. In many of these models, there is a gap between the best known results for the unweighted and weighted versions. In this work, we address the question of closing this gap. Our main result is a reduction which converts any $α$-approximate unweighted matroid intersection algorithm into an $α(1-\varepsilon)$-approximate weighted matroid intersection algorithm, while increasing the runtime of the algorithm by a $\log W$ factor, where $W$ is the aspect ratio. Our framework is versatile and translates to settings such as streaming and one-way communication complexity where matroid intersection is well-studied. As a by-product of our techniques, we derive new results for weighted matroid intersection in these models.

Fair Correlation Clustering Meets Graph Parameters

from arXiv: Data Structures and Algorithms

Authors: Johannes Blaha, Robert Ganian, Katharina Gillig, Jonathan S. Højlev, Simon Wietheger

We study the generalization of Correlation Clustering which incorporates fairness constraints via the notion of fairlets. The corresponding Fair Correlation Clustering problem has been studied from several perspectives to date, but has so far lacked a detailed analysis from the parameterized complexity paradigm. We close this gap by providing tractability results for the problem under a variety of structural graph parameterizations, including treewidth, treedepth and the vertex cover number; our results lie at the very edge of tractability given the known NP-hardness of the problem on severely restricted inputs.

Memory Reallocation with Polylogarithmic Overhead

from arXiv: Data Structures and Algorithms

Authors: Ce Jin

The Memory Reallocation problem asks to dynamically maintain an assignment of given objects of various sizes to non-overlapping contiguous chunks of memory, while supporting updates (insertions/deletions) in an online fashion. The total size of live objects at any time is guaranteed to be at most a $1-ε$ fraction of the total memory. To handle an online update, the allocator may rearrange the objects in memory to make space, and the overhead for this update is defined as the total size of moved objects divided by the size of the object being inserted/deleted. Our main result is an allocator with worst-case expected overhead $\mathrm{polylog}(ε^{-1})$. This exponentially improves the previous worst-case expected overhead $\tilde O(ε^{-1/2})$ achieved by Farach-Colton, Kuszmaul, Sheffield, and Westover (2024), narrowing the gap towards the $Ω(\logε^{-1})$ lower bound. Our improvement is based on an application of the sunflower lemma previously used by Erdős and Sárközy (1992) in the context of subset sums. Our allocator achieves polylogarithmic overhead only in expectation, and sometimes performs expensive rebuilds. Our second technical result shows that this is necessary: it is impossible to achieve subpolynomial overhead with high probability.
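As a quick illustration of the overhead measure with made-up numbers: if inserting an object of size 4 forces the allocator to move objects of total size 20 to open a contiguous gap, that update has overhead 20/4 = 5. The result above says such costs can be kept at $\mathrm{polylog}(ε^{-1})$ in expectation, compared to the earlier $\tilde O(ε^{-1/2})$ bound.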

Self-dual Stacked Quantum Low-Density Parity-Check Codes

from arXiv: Data Structures and Algorithms

Authors: Ze-Chuan Liu, Chong-Yuan Xu, Yong Xu

Quantum low-density parity-check (qLDPC) codes are promising candidates for fault-tolerant quantum computation due to their high encoding rates and distances. However, implementing logical operations using qLDPC codes presents significant challenges. Previous research has demonstrated that self-dual qLDPC codes facilitate the implementation of transversal Clifford gates. Here we introduce a method for constructing self-dual qLDPC codes by stacking non-self-dual qLDPC codes. Leveraging this methodology, we develop double-chain bicycle codes, double-layer bivariate bicycle (BB) codes, double-layer twisted BB codes, and double-layer reflection codes, many of which exhibit favorable code parameters. Additionally, we conduct numerical calculations to assess the performance of these codes as quantum memory under the circuit-level noise model, revealing that the logical failure rate can be significantly reduced with high pseudo-thresholds.

Testing Monotonicity of Real-Valued Functions on DAGs

from arXiv: Data Structures and Algorithms

Authors: Yuichi Yoshida

We study monotonicity testing of real-valued functions on directed acyclic graphs (DAGs) with $n$ vertices. For every constant $δ>0$, we prove a $Ω(n^{1/2-δ}/\sqrt{\varepsilon})$ lower bound against non-adaptive two-sided testers on DAGs, nearly matching the classical $O(\sqrt{n/\varepsilon})$-query upper bound. For constant $\varepsilon$, we also prove an $Ω(\sqrt n)$ lower bound for randomized adaptive one-sided testers on explicit bipartite DAGs, whereas previously only an $Ω(\log n)$ lower bound was known. A key technical ingredient in both lower bounds is positive-matching Ruzsa--Szemerédi families. On the algorithmic side, we give simple non-adaptive one-sided testers with query complexity $O(\sqrt{m\,\ell}/(\varepsilon n))$ and $O(m^{1/3}/\varepsilon^{2/3})$, where $m$ is the number of edges in the transitive reduction and $\ell$ is the number of edges in the transitive closure. For constant $\varepsilon>0$, these improve over the previous $O(\sqrt{n/\varepsilon})$ bound when $m\ell=o(n^3)$ and $m=o(n^{3/2})$, respectively.

Revisiting the Sparse Matrix Compression Problem

from arXiv: Data Structures and Algorithms

Authors: Vincent Jugé, Dominik Köppl, Vincent Limouzy, Andrea Marino, Jannik Olblich, Giulia Punzi, Takeaki Uno

The sparse matrix compression problem asks for a one-dimensional representation of a binary $n \times \ell$ matrix, formed by an integer array of row indices and a shift function for each row, such that accessing a matrix entry is possible in constant time by consulting this representation. It has been shown that the decision problem for finding an integer array of length $\ell+ρ$ or restricting the shift function up to values of $ρ$ is NP-complete (cf. the textbook of Garey and Johnson). As a practical heuristic, a greedy algorithm has been proposed to shift the $i$-th row until it forms a solution with its predecessor rows. Although this greedy algorithm is cherished for its good approximation in practice, we show that it actually exhibits an approximation ratio of $Θ(\sqrt{\ell+ρ})$. We give further hardness results for parameterizations such as the number of distinct rows or the maximum number of non-zero entries per row. Finally, we devise a DP-algorithm that solves the problem for double-logarithmic matrix widths or logarithmic widths for further restrictions. We also study all these findings from a new perspective by introducing a variant of the problem, where we wish to minimize the length of the resulting integer array by trimming the non-zero borders, which has not been studied in the literature before but has practical motivations.
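For intuition, the representation described here is essentially the classical row-displacement idea; the sketch below, including the greedy shifting heuristic mentioned above, is illustrative only and not the paper's algorithm.

    # Illustrative row-displacement representation of a binary matrix,
    # in the spirit of the problem above (not the paper's algorithms).
    # row_of[p] records which row owns position p of the merged array
    # (-1 if none), and shift[i] is the offset applied to row i.

    def build(matrix):
        width = len(matrix[0])
        shift, row_of = [0] * len(matrix), []
        for i, row in enumerate(matrix):
            s = 0
            # Greedy heuristic: shift row i until its 1-entries land on
            # positions not yet owned by earlier rows.
            while any(j + s < len(row_of) and row_of[j + s] != -1
                      for j, v in enumerate(row) if v):
                s += 1
            shift[i] = s
            row_of.extend([-1] * max(0, s + width - len(row_of)))
            for j, v in enumerate(row):
                if v:
                    row_of[j + s] = i
        return shift, row_of

    def entry(shift, row_of, i, j):
        # Constant-time access: entry (i, j) is 1 iff position shift[i] + j
        # of the merged array is owned by row i.
        p = shift[i] + j
        return 1 if p < len(row_of) and row_of[p] == i else 0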

Near-real-time Solutions for Online String Problems

from arXiv: Data Structures and Algorithms

Authors: Dominik Köppl, Gregory Kucherov

Based on the Breslauer-Italiano online suffix tree construction algorithm (2013) with double logarithmic worst-case guarantees on the update time per letter, we develop near-real-time algorithms for several classical problems on strings, including the computation of the longest repeating suffix array, the (reversed) Lempel-Ziv 77 factorization, and the maintenance of minimal unique substrings, all in an online manner. Our solutions improve over the best known running times for these problems in terms of the worst-case time per letter, for which we achieve a poly-log-logarithmic time complexity, within a linear space. Best known results for these problems require a poly-logarithmic time complexity per letter or only provide amortized complexity bounds. As a result of independent interest, we give conversions between the longest previous factor array and the longest repeating suffix array in space and time bounds based on their irreducible representations, which can have sizes sublinear in the length of the input string.

Tensor Decomposition for Non-Clifford Gate Minimization

from arXiv: Data Structures and Algorithms

Authors: Kirill Khoruzhii, Patrick Gelß, Sebastian Pokutta

Fault-tolerant quantum computation requires minimizing non-Clifford gates, whose implementation via magic state distillation dominates the resource costs. While $T$-count minimization is well-studied, dedicated $CCZ$ factories shift the natural target to direct Toffoli minimization. We develop algebraic methods for this problem, building on a connection between Toffoli count and tensor decomposition over $\mathbb{F}_2$. On standard benchmarks, these methods match or improve all reported results for both Toffoli and $T$-count, with most circuits completing in under a minute on a single CPU instead of thousands of TPUs used by prior work.

Latent Objective Induction and Diversity-Constrained Selection: Algorithms for Multi-Locale Retrieval Pipelines

from arXiv: Data Structures and Algorithms

Authors: Faruk Alpay, Levent Sarioglu

We present three algorithms with formal correctness guarantees and complexity bounds for the problem of selecting a diverse, multi-locale set of sources from ranked search results. First, we formulate weighted locale allocation as a constrained integer partition problem and give an $O(n \log n)$ algorithm that simultaneously satisfies minimum-representation, budget-exhaustion, and proportionality-bound constraints; we prove all three hold with a tight deviation bound of $< 1$. Second, we define a cascaded country-code inference function as a deterministic priority chain over heterogeneous signals (TLD structure, model-inferred metadata, language fallback) and prove it satisfies both determinism and graceful degradation. Third, we introduce a $κ$-domain diversity constraint for source selection and give an $O(|K| \cdot R)$ algorithm that maintains the invariant via hash-map lookup, eliminating the aggregator monopolization pathology present in URL-level deduplication. We further formalize Latent Objective Induction (LOI), an environment-shaping operator over prompt spaces that steers downstream model behavior without restricting the feasible output set, and prove its convergence under mild assumptions. Applied to a multi-locale retrieval pipeline, these algorithms yield 62% improvement in first-party source ratio and 89% reduction in same-domain duplication across 120 multilingual queries.
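As a rough illustration of what a $κ$-per-domain cap with hash-map bookkeeping can look like (function names, parameters, and structure are assumptions, not the authors' code):

    # Sketch of a kappa-per-domain cap over ranked results (illustrative only).
    from collections import defaultdict
    from urllib.parse import urlparse

    def select_diverse(ranked_urls, budget, kappa):
        # Walk results in rank order, keeping at most `kappa` per domain.
        per_domain = defaultdict(int)
        chosen = []
        for url in ranked_urls:
            domain = urlparse(url).netloc
            if per_domain[domain] < kappa:      # O(1) hash-map lookup
                chosen.append(url)
                per_domain[domain] += 1
            if len(chosen) == budget:
                break
        return chosen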

Tuesday, February 17

The antiferromagnetic Ising model beyond line graphs

from arXiv: Computational Complexity

Authors: Mark Jerrum

Both the antiferromagnetic Ising model and the hard-core model could be said to be tractable on line graphs of bounded degree. For example, Glauber dynamics is rapidly mixing in both cases. In the case of the hard-core model, we know that tractability extends further, to claw-free graphs and somewhat beyond. In contrast, it is shown here that the corresponding extensions are not possible in the case of the antiferromagnetic Ising model.

Geometric Characterization of Context-Free Intersections via the Inner Segment Dichotomy

from arXiv: Computational Complexity

Authors: Jorge Miguel Silva

The intersection of two context-free languages is not generally context-free, but no geometric criterion has characterized when it remains so. The crossing gap (max(i'-i, j'-j) for two crossing push-pop arcs) is the natural candidate. We refute this: we exhibit CFLs whose intersection is CFL despite unbounded-gap crossings. The governing quantity is the inner segment measure: for crossing arcs inducing a decomposition w = P1 P2 P3 P4, it is max(|P2|,|P3|), the length of the longer inner segment between interleaved crossing endpoints. We prove a dichotomy for this measure: bounded inner segments imply context-freeness via a finite buffer construction; growing inner segments with pump-sensitive linkages imply non-context-freeness. The inner segment concept applies to all CFL intersections; the strictness of the resulting characterization depends on the language class. For block-counting CFLs (languages requiring equality among designated pairs of block lengths), the dichotomy is complete: the intersection is CFL if and only if the combined arcs are jointly well-nested. For general CFLs, the CFL direction is unconditional; the non-CFL direction requires pump-sensitive linkages whose necessity is the main open problem, reducing the general CFL intersection problem to a specific property of pump-sensitive decompositions.

Graph Homomorphisms and Universal Algebra

from arXiv: Computational Complexity

Authors: Manuel Bodirsky

Constraint satisfaction problems are computational problems that naturally appear in many areas of theoretical computer science. One of the central themes is their computational complexity, and in particular the border between polynomial-time tractability and NP-hardness. In this course we introduce the universal-algebraic approach to study the computational complexity of finite-domain CSPs. The course covers in particular the cyclic terms and bounded width theorems. To keep the presentation accessible, we start the course in the tangible setting of directed graphs and graph homomorphism problems.

Affine Rank Minimization is ER Complete

from arXiv: Computational Complexity

Authors: Angshul Majumdar

We study the decision problem Affine Rank Minimization, denoted ARM(k). The input consists of rational matrices A_1,...,A_q in Q^{m x n} and rational scalars b_1,...,b_q in Q. The question is whether there exists a real matrix X in R^{m x n} such that trace(A_l^T X) = b_l for all l in {1,...,q} and rank(X) <= k. We first prove membership: for every fixed k >= 1, ARM(k) lies in the existential theory of the reals by giving an explicit existential encoding of the rank constraint using a constant-size factorization witness. We then prove existential-theory-of-reals hardness via a polynomial-time many-one reduction from ETR to ARM(k), where the target instance uses only affine equalities together with a single global constraint rank(X) <= k. The reduction compiles an ETR formula into an arithmetic circuit in gate-equality normal form and assigns each circuit quantity to a designated entry of X. Affine semantics (constants, copies, addition, and negation) are enforced by linear constraints, while multiplicative semantics are enforced by constant-size rank-forcing gadgets. Soundness is certified by a fixed-rank gauge submatrix that removes factorization ambiguity. We prove a composition lemma showing that gadgets can be embedded without unintended interactions, yielding global soundness and completeness while preserving polynomial bounds on dimension and bit-length. Consequently, ARM(k) is complete for the existential theory of the reals; in particular, ARM(3) is complete. This shows that feasibility of purely affine constraints under a fixed constant rank bound captures the full expressive power of real algebraic feasibility.
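
To make the problem statement concrete (an illustrative NumPy sketch, not part of the paper's reduction), the following verifier checks whether a candidate matrix $X$ satisfies a given ARM(k) instance; the paper's result concerns the complexity of deciding whether any such $X$ exists at all.

    import numpy as np

    def satisfies_arm(X, As, bs, k, tol=1e-9):
        """Check trace(A_l^T X) = b_l for all l and rank(X) <= k."""
        affine_ok = all(abs(np.trace(A.T @ X) - b) <= tol for A, b in zip(As, bs))
        return affine_ok and np.linalg.matrix_rank(X, tol=tol) <= k

    # Toy instance over 3x3 matrices: one constraint "trace(X) = 2", rank bound k = 1.
    A1 = np.eye(3)
    X = np.zeros((3, 3)); X[0, 0] = 2.0        # a rank-1 feasible point
    print(satisfies_arm(X, [A1], [2.0], k=1))  # True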

An Algebraic Rigidity Framework for Order-Oblivious Deterministic Black-Box PIT of ROABPs

from arXiv: Computational Complexity

Authors: Shalender Singh, Vishnupriya Singh

Deterministic black-box polynomial identity testing (PIT) for read-once oblivious algebraic branching programs (ROABPs) is a central open problem in algebraic complexity, particularly in the absence of variable ordering. Prior deterministic algorithms either rely on order information or incur significant overhead through combinatorial isolation techniques. In this paper, we introduce an algebraic rigidity framework for ROABPs based on the internal structure of their associated matrix word algebras. We show that nonzero width-$w$ ROABPs induce word algebras whose effective algebraic degrees of freedom collapse to dimension at most $w^2$, independent of the number of variables. This rigidity enables deterministic witness construction via intrinsic algebraic invariants, bypassing rank concentration, isolation lemmas, and probabilistic tools used in previous work. Thus, we obtain the first order-oblivious deterministic black-box PIT algorithm for ROABPs, running in quasi-polynomial time $n\cdot(wd)^{O(w^2)}$. This establishes that algebraic rigidity alone suffices to derandomize PIT in this model, without assuming ordering information. The framework further isolates a single remaining obstacle to full polynomial-time complexity. We formulate a Modular Stability Conjecture, asserting that width-$w$ ROABPs are stable under hashing into cyclic quotient rings $\mathbb{K}[\lambda]/\langle \lambda^r-1 \rangle$ once the modulus exceeds a polynomial threshold in $w$ and the individual degree. This conjecture arises naturally from the low-dimensional coefficient structure revealed by rigidity and is supported by extensive empirical evidence. Assuming the conjecture, our methods yield a fully polynomial-time deterministic black-box PIT algorithm for ROABPs, matching the complexity of the best-known white-box algorithms and reducing the black-box problem to a concrete algebraic stability question.
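
For orientation (an editorial sketch, not the paper's algorithm), recall the standard ROABP form $f(x) = u^\top M_1(x_1)\cdots M_n(x_n)\, v$, where each $M_i$ is a $w\times w$ matrix with univariate polynomial entries; a black-box PIT algorithm may only evaluate $f$ at points of its choosing. The sketch below evaluates a tiny width-2 ROABP.

    import numpy as np

    def eval_roabp(u, layers, v, point):
        """Evaluate an ROABP  f(x) = u^T M_1(x_1) ... M_n(x_n) v, where
        layers[i] maps the value of x_{i+1} to a w x w matrix (w = width)."""
        M = np.eye(len(u))
        for layer, x in zip(layers, point):
            M = M @ layer(x)
        return float(u @ M @ v)

    # Toy width-2 ROABP computing 1 + x1*x2 in the variable order (x1, x2).
    u = np.array([1.0, 0.0])
    v = np.array([1.0, 1.0])
    layers = [lambda x: np.array([[1.0, x], [0.0, 0.0]]),
              lambda x: np.array([[1.0, 0.0], [0.0, x]])]
    print(eval_roabp(u, layers, v, point=(2.0, 3.0)))   # 7.0 = 1 + 2*3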

Morphing of and writing with a scissor linkage mechanism

from arXiv: Computational Geometry

Authors: Mohanraj A, S Ganga Prasath

Kinematics of mechanisms is intricately coupled to their geometry, and their utility often arises from the ability to perform reproducible motion with fewer actuating degrees of freedom. In this article, we explore assemblies of scissor units, each made of two rigid linear members connected by a pin joint. The assembly has a single degree of freedom, where actuating any single unit results in a shape change of the entire assembly. We derive expressions for the effective curvature of a unit and the trajectory of the mechanism's tip as functions of the geometric variables, which we then use as the basis to program two tasks in the mechanism: shape morphing and writing. By phrasing these tasks as optimization problems and utilizing a differentiable simulation framework, we arrive at solutions that are then tested in table-top experiments. Our results show that, combined with the optimization framework, the geometry of scissor assemblies can be leveraged for automated navigation and inspection in complex domains. However, we highlight that challenges remain for rapid programming and error-free experimental implementation without feedback.
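
As a toy version of the kinematics-as-optimization idea (a heavily simplified straight chain of identical scissor units pinned at their midpoints, not the paper's curvature or writing model), one can solve for the opening angle that places the tip of the chain at a target distance:

    import math

    def tip_distance(theta, n_units=5, half_link=0.5):
        """Straight, symmetric scissor chain: n identical units, links of
        half-length `half_link` pinned at their midpoints, each link making
        angle `theta` with the chain axis; each unit then spans
        2 * half_link * cos(theta) along the axis."""
        return 2 * n_units * half_link * math.cos(theta)

    def solve_angle(target, n_units=5, half_link=0.5):
        """Bisection on theta in (0, pi/2): tip_distance is decreasing in theta."""
        lo, hi = 1e-9, math.pi / 2 - 1e-9
        for _ in range(60):
            mid = 0.5 * (lo + hi)
            if tip_distance(mid, n_units, half_link) > target:
                lo = mid
            else:
                hi = mid
        return 0.5 * (lo + hi)

    theta = solve_angle(target=3.0)          # reach a tip distance of 3.0
    print(theta, tip_distance(theta))        # ~0.927 rad, ~3.0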

Lower Estimates for $L_1$-Distortion of Transportation Cost Spaces

from arXiv: Computational Geometry

Authors: Chris Gartland, Mikhail Ostrovskii

Quantifying the degree of dissimilarity between two probability distributions on a finite metric space is a fundamental task in Computer Science and Computer Vision. A natural dissimilarity measure based on optimal transport is the Earth Mover's Distance (EMD). A key technique for analyzing this metric, pioneered by Charikar (2002) and Indyk and Thaper (2003), involves constructing low-distortion embeddings of EMD(X) into the Lebesgue space $L_1$. It became a key problem to investigate whether the upper bound of $O(\log n)$ can be improved for important classes of metric spaces known to admit low-distortion embeddings into $L_1$. In the context of Computer Vision, grid graphs, especially planar grids, are among the most fundamental. Indyk posed the related problem of estimating the $L_1$-distortion of the space of uniform distributions on $n$-point subsets of $\mathbb{R}^2$. The Progress Report, last updated in August 2011, highlighted two key results: first, the work of Khot and Naor (2006) on Hamming cubes, which showed that the $L_1$-distortion for Hamming cubes matches the upper estimate described above, and second, the result of Naor and Schechtman (2007) for planar grids, which established that the $L_1$-distortion of a planar $n \times n$ grid is $Ω(\sqrt{\log n})$. Our first result improves the lower bound on the $L_1$-distortion for grids to $Ω(\log n)$, matching the universal upper bound up to multiplicative constants. The key ingredient allowing us to obtain these sharp estimates is a new Sobolev-type inequality for scalar-valued functions on grid graphs. Our method is also applicable to many recursive families of graphs, such as diamond and Laakso graphs. We obtain sharp distortion estimates of order $\log n$ in these cases as well.
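
As background on the metric itself (an illustrative sketch, unrelated to the paper's lower-bound machinery), on the real line with unit-spaced support points the EMD between two distributions equals the $L_1$ distance between their cumulative distribution functions:

    import numpy as np

    def emd_1d(p, q):
        """EMD between probability vectors p, q on unit-spaced points of a line:
        on the line this equals the L1 norm of the CDF difference."""
        diff = np.asarray(p, float) - np.asarray(q, float)
        return float(np.abs(np.cumsum(diff)).sum())

    print(emd_1d([1, 0, 0], [0, 1, 0]))   # 1.0 -- move all mass one cell
    print(emd_1d([1, 0, 0], [0, 0, 1]))   # 2.0 -- move all mass two cells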

Fine-Grained Complexity for Quantum Problems from Size-Preserving Circuit-to-Hamiltonian Constructions

from arXiv: Data Structures and Algorithms

Authors: Nai-Hui Chia, Atsuya Hasegawa, François Le Gall, Yu-Ching Shen

The local Hamiltonian (LH) problem is the canonical $\mathsf{QMA}$-complete problem introduced by Kitaev. In this paper, we show its hardness in a very strong sense: we show that the 3-local Hamiltonian problem on $n$ qubits cannot be solved classically in time $O(2^{(1-\varepsilon)n})$ for any $\varepsilon>0$ under the Strong Exponential-Time Hypothesis (SETH), and cannot be solved quantumly in time $O(2^{(1-\varepsilon)n/2})$ for any $\varepsilon>0$ under the Quantum Strong Exponential-Time Hypothesis (QSETH). These lower bounds give evidence that the currently known classical and quantum algorithms for LH cannot be significantly improved. Furthermore, we demonstrate fine-grained complexity lower bounds for approximating the quantum partition function (QPF) with an arbitrary constant relative error. Approximating QPF with relative error is known to be equivalent to approximately counting the dimension of the solution subspace of $\mathsf{QMA}$ problems. We show SETH and QSETH hardness for estimating QPF with constant relative error. We then provide a quantum algorithm that runs in $O(\sqrt{2^n})$ time for an arbitrary $1/\mathrm{poly}(n)$ relative error, matching our lower bounds and improving the state-of-the-art algorithm by Bravyi, Chowdhury, Gosset, and Wocjan (Nature Physics 2022) in the low-temperature regime. To prove our fine-grained lower bounds, we introduce the first size-preserving circuit-to-Hamiltonian construction that encodes the computation of a $T$-time quantum circuit acting on $N$ qubits into a $(d+1)$-local Hamiltonian acting on $N+O(T^{1/d})$ qubits. This improves the standard construction based on the unary clock, which uses $N+O(T)$ qubits.
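
To see where the exponential cost being matched comes from (an illustrative NumPy sketch, not the paper's construction), a $k$-local Hamiltonian on $n$ qubits can be assembled from Kronecker products of single-qubit operators, and its ground energy computed by dense diagonalization of a $2^n \times 2^n$ matrix:

    import numpy as np

    X = np.array([[0., 1.], [1., 0.]])
    Z = np.array([[1., 0.], [0., -1.]])
    I = np.eye(2)

    def local_term(n, ops):
        """Embed a few-qubit term into n qubits; `ops` maps qubit index -> 2x2 matrix."""
        out = np.array([[1.0]])
        for i in range(n):
            out = np.kron(out, ops.get(i, I))
        return out

    # A toy 3-local Hamiltonian on n = 4 qubits.
    n = 4
    H = sum(local_term(n, {i: Z, i + 1: Z, i + 2: Z}) for i in range(n - 2))
    H = H + 0.5 * sum(local_term(n, {i: X}) for i in range(n))

    # Brute force over the full 2^n-dimensional space -- the exponential cost
    # that the SETH/QSETH lower bounds say cannot be substantially beaten.
    print(np.linalg.eigvalsh(H).min())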

Expander Decomposition with Almost Optimal Overhead

from arXiv: Data Structures and Algorithms

Authors: Nikhil Bansal, Arun Jambulapati, Thatchaphol Saranurak

We present the first polynomial-time algorithm for computing a near-optimal \emph{flow}-expander decomposition. Given a graph $G$ and a parameter $φ$, our algorithm removes at most a $φ\log^{1+o(1)}n$ fraction of edges so that every remaining connected component is a $φ$-\emph{flow}-expander (a stronger guarantee than being a $φ$-\emph{cut}-expander). This achieves overhead $\log^{1+o(1)}n$, nearly matching the $Ω(\log n)$ graph-theoretic lower bound that already holds for cut-expander decompositions, up to a $\log^{o(1)}n$ factor. Prior polynomial-time algorithms required removing $O(φ\log^{1.5}n)$ and $O(φ\log^{2}n)$ fractions of edges to guarantee $φ$-cut-expander and $φ$-flow-expander components, respectively.
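
For reference (an illustrative sketch of the weaker cut/conductance notion, not the paper's flow-expansion guarantee or its algorithm), a graph is a $φ$-cut-expander when every cut has conductance at least $φ$; on tiny graphs this can be checked by brute force:

    from itertools import combinations

    def conductance(edges, S, V):
        """Conductance of a cut S: crossing edges / min(vol(S), vol(V \\ S)),
        where vol counts edge endpoints (degrees)."""
        S = set(S)
        crossing = sum(1 for u, v in edges if (u in S) != (v in S))
        deg = {v: 0 for v in V}
        for u, v in edges:
            deg[u] += 1; deg[v] += 1
        vol_S = sum(deg[v] for v in S)
        vol_rest = sum(deg[v] for v in V if v not in S)
        return crossing / min(vol_S, vol_rest)

    def is_cut_expander(edges, V, phi):
        """Brute force over all nontrivial cuts (only viable for tiny graphs)."""
        V = list(V)
        cuts = (S for r in range(1, len(V)) for S in combinations(V, r))
        return all(conductance(edges, S, V) >= phi for S in cuts)

    K4 = [(a, b) for a, b in combinations(range(4), 2)]   # complete graph: a good expander
    P4 = [(0, 1), (1, 2), (2, 3)]                          # path: a poor expander
    print(is_cut_expander(K4, range(4), phi=0.5))   # True
    print(is_cut_expander(P4, range(4), phi=0.5))   # False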

Robust Value Maximization in Challenge the Champ Tournaments with Probabilistic Outcomes

from arXiv: Data Structures and Algorithms

Authors: Umang Bhaskar, Juhi Chaudhary, Sushmita Gupta, Pallavi Jain, Sanjay Seetharaman

Challenge the Champ is a simple tournament format, where an ordering of the players -- called a seeding -- is decided. The first player in this order is the initial champ, and faces the next player. The outcome of each match decides the current champion, who faces the next player in the order. Each player also has a popularity, and the value of each match is the popularity of the winner. Value maximization in tournaments has been previously studied when each match has a deterministic outcome. However, match outcomes are often probabilistic, rather than deterministic. We study robust value maximization in Challenge the Champ tournaments, when the winner of a match may be probabilistic. That is, we seek to maximize the total value that is obtained, irrespective of the outcome of probabilistic matches. We show that even in simple binary settings, for non-adaptive algorithms, the optimal robust value -- which we term the \textsc{VnaR}, or the value not at risk -- is hard to approximate. However, if we allow adaptive algorithms that determine the order of challengers based on the outcomes of previous matches, or restrict the matches with probabilistic outcomes, we can obtain good approximations to the optimal \textsc{VnaR}.
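
To pin down the quantity being optimized (an editorial sketch; the paper's hardness concerns choosing the seeding, not evaluating one), the following brute-force routine computes the robust value of a fixed seeding by resolving every probabilistic match adversarially:

    def robust_value(seeding, popularity, outcome):
        """Worst-case total value of a fixed Challenge-the-Champ seeding.
        outcome(champ, challenger) returns the winner, or None if the match is
        probabilistic (either player may win); each match contributes the
        popularity of its winner.  Exponential in the number of probabilistic
        matches -- fine for a toy."""
        def rec(champ, rest):
            if not rest:
                return 0
            challenger, tail = rest[0], rest[1:]
            winner = outcome(champ, challenger)
            options = [winner] if winner is not None else [champ, challenger]
            return min(popularity[w] + rec(w, tail) for w in options)
        return rec(seeding[0], seeding[1:])

    pop = {"a": 5, "b": 3, "c": 4}
    def outcome(x, y):            # 'a' always beats 'b'; every other match is a toss-up
        return "a" if {x, y} == {"a", "b"} else None

    print(robust_value(("a", "b", "c"), pop, outcome))   # 9 = 5 (a beats b) + 4 (worst case vs c)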

On the Parameterized Tractability of Packing Vertex-Disjoint A-Paths with Length Constraints

from arXiv: Data Structures and Algorithms

Authors: Susobhan Bandopadhyay, Aritra Banik, Diptapriyo Majumdar, Abhishek Sahu

Given an undirected graph G and a set A \subseteq V(G), an A-path is a path in G that starts and ends at two distinct vertices of A with intermediate vertices in V(G) \setminus A. An A-path is called an (A,\ell)-path if the length of the path is exactly \ell. In the {\sc (A, \ell)-Path Packing} problem (ALPP), we seek to determine whether there exist k vertex-disjoint (A, \ell)-paths in G or not. We pursue this problem with respect to structural parameters. We prove that ALPP is W[1]-hard when it is parameterized by the combined parameter distance to path (dtp) and |A|. In addition, we consider the combined parameters distance to cluster (cvd) + |A| and distance to cluster (cvd) + \ell. For both these combined parameters, we provide FPT algorithms. Finally, we consider the vertex cover number (vc) as the parameter and provide a kernel with O(vc^2) vertices.
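
To make the definition concrete (an illustrative brute-force checker for tiny instances, far from the paper's FPT algorithms), the sketch below enumerates all (A, \ell)-paths, i.e. simple paths with exactly \ell edges whose two endpoints are distinct vertices of A and whose interior avoids A, and then tests whether k of them can be chosen pairwise vertex-disjoint:

    from itertools import combinations

    def a_ell_paths(adj, A, ell):
        """All simple paths with exactly `ell` edges, both endpoints in A,
        and all interior vertices outside A."""
        found = []
        def extend(path):
            if len(path) - 1 == ell:
                if path[-1] in A:
                    found.append(tuple(path))
                return
            for w in adj[path[-1]]:
                if w not in path and (w not in A or len(path) == ell):
                    path.append(w); extend(path); path.pop()
        for a in A:
            extend([a])
        return found

    def has_k_disjoint(adj, A, ell, k):
        """Brute force: do k pairwise vertex-disjoint (A, ell)-paths exist?"""
        cand = a_ell_paths(adj, A, ell)
        return any(len(set().union(*c)) == sum(map(len, c))
                   for c in combinations(cand, k))

    # Toy: an 8-cycle with A = {0, 2, 4, 6}; the paths 0-1-2 and 4-5-6 are
    # vertex-disjoint (A, 2)-paths, but no three such paths are disjoint.
    adj = {i: [(i - 1) % 8, (i + 1) % 8] for i in range(8)}
    print(has_k_disjoint(adj, A={0, 2, 4, 6}, ell=2, k=2))   # True
    print(has_k_disjoint(adj, A={0, 2, 4, 6}, ell=2, k=3))   # False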

Constant-Time Dynamic Enumeration of Word Infixes in a Regular Language

from arXiv: Data Structures and Algorithms

Authors: Antoine Amarilli, Sven Dziadek, Luc Segoufin

For a fixed regular language $L$, the enumeration of $L$-infixes is the following task: we are given an input word $w = a_1 \cdots a_n$ and we must enumerate the infixes of $w$ that belong to $L$, i.e., the pairs $i \leq j$ such that $a_i \cdots a_j \in L$. We are interested in dynamic enumeration of $L$-infixes, where we must additionally support letter substitution updates on $w$ (e.g., "replace the $i$-th letter of $w$ by a letter $a$"). Each update changes the set of infixes to enumerate, and resets the enumeration state. We study for which regular languages $L$ we can perform dynamic enumeration of $L$-infixes in constant delay (i.e., the next infix is always produced in constant time) and constant additional memory throughout the enumeration, while supporting each update in constant time. We show that, for languages $L$ with a neutral letter, if the language $L$ belongs to the class ZG and is extensible (i.e., if $u \in L$ and $u$ is a factor of $v$ then $v \in L$), then dynamic enumeration of $L$-infixes can be achieved with a simple algorithm that ensures constant-time updates and constant delay, but not constant additional memory. Our main contribution is then to show an algorithm that additionally uses only constant additional memory, and applies to a more general class of semi-extensible ZG languages for which we give several equivalent characterizations. We further discuss whether our results can be generalized to larger language classes and show some (conditional) lower bounds.
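
As the baseline to contrast with (an editorial sketch, not the paper's data structure), the naive approach tests each of the $Θ(n^2)$ infixes from scratch with a regular-expression matcher and simply recomputes everything after an update; the paper's algorithms replace this with constant delay, constant additional memory, and constant-time updates:

    import re

    def enumerate_infixes(w, pattern):
        """Yield all pairs (i, j), 1-indexed and inclusive, such that the
        infix w[i..j] belongs to the regular language given by `pattern`."""
        L = re.compile(pattern)
        for i in range(len(w)):
            for j in range(i, len(w)):
                if L.fullmatch(w, i, j + 1):
                    yield (i + 1, j + 1)

    print(list(enumerate_infixes("abbab", "ab+")))   # [(1, 2), (1, 3), (4, 5)]
    # After a letter-substitution update, this baseline recomputes from scratch.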
