Thank you to arXiv for use of its open access interoperability.

Note: the date of arXiv entries announced right after publication holidays might incorrectly show up as the date of the publication holiday itself. This is due to our ad hoc method of inferring announcement dates, which are not returned by the arXiv API.

Maintained by Nima Anari, Arnab Bhattacharyya, Gautam Kamath.

Theory of Computing Report

Tuesday, May 12

Average-Case Hardness of Binary-Encoded Clique in Proof and Communication Complexity

from arXiv: Computational Complexity

Authors: Susanna F. de Rezende, David Engström, Yassine Ghannane, Duri Andrea Janett, Artur Riazanov

We study the average-case hardness of establishing that a graph does not have a large clique in both proof and communication complexity. We show exponential lower bounds on the length of cutting planes and bounded-depth resolution over parities refutations of the binary encoding of clique formulas on randomly sampled dense graphs. Moreover, we show that the randomized communication complexity of finding a falsified clause in these formulas is polynomial.
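
For readers unfamiliar with the object in question: the binary-encoded clique formula asserts that the graph contains a $k$-clique, with the identity of each clique member written in binary. The following Python sketch builds one standard version of this CNF (non-edge and distinctness clauses over DIMACS-style signed literals); the exact formulas studied in the paper may differ in details, so treat this purely as an illustration.

```python
from itertools import combinations

def binary_clique_cnf(n, k, edges):
    """One standard binary encoding of "the graph has a k-clique" as a CNF.

    Each of the k clique slots gets B = ceil(log2 n) Boolean variables encoding
    the index of the vertex placed in that slot.  For every pair of slots and
    every ordered pair (u, v) that is not an edge (including u == v), we add a
    clause forbidding the two slots from encoding u and v simultaneously.
    Literals are DIMACS-style signed integers; details vary between papers.
    """
    B = max(1, (n - 1).bit_length())

    def var(slot, bit):
        return slot * B + bit + 1

    edge_set = {frozenset(e) for e in edges}
    clauses = []
    for i, j in combinations(range(k), 2):
        for u in range(n):
            for v in range(n):
                if u == v or frozenset((u, v)) not in edge_set:
                    clause = []
                    for b in range(B):
                        clause.append(-var(i, b) if (u >> b) & 1 else var(i, b))
                        clause.append(-var(j, b) if (v >> b) & 1 else var(j, b))
                    clauses.append(clause)
    return clauses

# A triangle on vertices {0, 1, 2} plus an isolated vertex 3; k = 3.
print(len(binary_clique_cnf(4, 3, [(0, 1), (1, 2), (0, 2)])))
```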

Constant Inapproximability for Fisher Markets

from arXiv: Computational Complexity

Authors: Argyrios Deligkas, John Fearnley, Alexandros Hollender, Themistoklis Melissourgos

We study the problem of computing approximate market equilibria in Fisher markets with separable piecewise-linear concave (SPLC) utility functions. In this setting, the problem was only known to be PPAD-complete for inverse-polynomial approximations. We strengthen this result by showing PPAD-hardness for constant approximations. This means that the problem does not admit a polynomial time approximation scheme (PTAS) unless PPAD$=$P. In fact, we prove that computing any approximation better than $1/11$ is PPAD-complete. As a direct byproduct of our main result, we get the same inapproximability bound for Arrow-Debreu exchange markets with SPLC utility functions.

When Does Sparsity Help for k-Independent Set in Hypergraphs and Other Boolean CSPs?

from arXiv: Computational Complexity

Authors: Timo Fritsch, Marvin Künnemann, Mirza Redzic, Julian Stieß

Consider the fundamental task of finding independent sets of (constant) size $k$ in a given $n$-node hypergraph. How is the time complexity affected by the sparsity of the input, i.e., the number of hyperedges $m$? Turán's theorem implies that the problem is trivial if $m=O(n^{2-ε})$ for some $ε> 0$. Above that threshold (i.e., if $m=Θ(n^γ)$ for some $γ\ge 2$), we give a perhaps surprising algorithm with running time $O\left(\min\left\{n^{\frac{ω}{3}k} + m^{k/3}, n^k\right\}\right)$ (for $k$ divisible by 3), which is essentially conditionally optimal for all $γ\ge 2$, assuming the $k$-clique and 3-uniform hyperclique hypotheses (here, $ω<2.372$ denotes the matrix multiplication exponent). In fact, we obtain a more detailed time complexity, sensitive to the arity distribution of the hyperedges. To study such phenomena in more generality, we study the time complexity of finding solutions of (constant) size $k$ in sparse instances of Boolean constraint satisfaction problems, where $n$ and $m$ denote the number of variables and constraints. Our results include an essentially full classification of the influence of sparsity for Boolean constraint families of binary arity. Of particular technical interest is a conditionally tight algorithm for the family consisting of the binary NAND and Implication constraints, with a running time of $Θ(m^{ωk/6 \pm c})$. Further, we identify a large class of constraint families $F$ that exhibits a sharp phase transition: there is a threshold $γ_F$ such that the problem is trivial for $m=O(n^{γ_F-ε})$, but requires essentially brute-force running time $Θ(n^{k\pm c})$ for $m=Ω(n^{γ_F})$, assuming the 3-uniform hyperclique hypothesis. Notably, in many cases the combination of constraints displays higher time complexity than either constraint alone.

Continuous Defensive Domination Problems

from arXiv: Computational Complexity

Authors: Christoph Grüne, Tom Janßen

The problem Defensive $δ$-Covering, for some covering range $δ> 0$, is a continuous facility location problem on undirected graphs where all edges have unit length. It is a generalization of Defensive Dominating Set and $δ$-Covering. An attack and a defense are sets of points, each lying on a vertex or in the interior of an edge. A defense counters an attack if there is a matching of the points in the defense to the points in the attack such that any matched points have distance at most $δ$ and every point in the attack is matched. The task is, given a graph $G$ and numbers $\ell, k \in \mathbb N$, to find a defense of size at most $\ell$ that counters every possible attack of size at most $k$. We study the complexity of this problem in various settings. We show that if the attack is restricted to vertices, the problem is $Σ^P_2$-complete for large $δ$, but if the attack may consist of any points on the graph, it is NP-complete. Additionally, we analyze how the complexity changes if the attacks or defenses may be multisets. If the defense is allowed to be a multiset, the complexity does not change in any case we consider, while if the attack is allowed to be a multiset, the problem often becomes easier. To show containment in the various complexity classes, we introduce a number of discretization arguments, which show that solutions with a regular structure must always exist.
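
To make the countering condition concrete, here is a minimal sketch (restricted, for simplicity, to attacks and defenses that sit on vertices) that checks whether a given defense counters a given attack by looking for a matching within distance $δ$, via shortest-path distances and bipartite matching in networkx. The function name and the toy instance are illustrative only.

```python
import networkx as nx
from networkx.algorithms import bipartite

def counters(G, defense, attack, delta):
    """Does `defense` counter `attack`?  Both are lists of vertices of G; every
    attack point must be matched to a distinct defense point at hop-distance
    at most delta (all edges have unit length)."""
    dist = dict(nx.all_pairs_shortest_path_length(G))
    B = nx.Graph()
    attackers = [("a", i) for i in range(len(attack))]
    defenders = [("d", j) for j in range(len(defense))]
    B.add_nodes_from(attackers)
    B.add_nodes_from(defenders)
    for i, a in enumerate(attack):
        for j, d in enumerate(defense):
            if dist[a].get(d, float("inf")) <= delta:
                B.add_edge(("a", i), ("d", j))
    matching = bipartite.maximum_matching(B, top_nodes=attackers)
    return all(node in matching for node in attackers)

# Path 0-1-2-3 with unit edges: defenders on 1 and 2 counter attacks on 0 and 3.
G = nx.path_graph(4)
print(counters(G, defense=[1, 2], attack=[0, 3], delta=1))  # True
```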

Parameterized Complexity of Stationarity Testing for Piecewise-Affine Functions and Shallow CNN Losses

from arXiv: Computational Complexity

Authors: Yuhan Ye

We study the parameterized complexity of testing approximate first-order stationarity at a prescribed point for continuous piecewise-affine (PA) functions, a basic task in nonsmooth optimization. PA functions form a canonical model for nonsmooth stationarity testing and capture the local polyhedral geometry that appears in ReLU-type training losses. Recent work by Tian and So (SODA 2025) shows that testing approximate stationarity notions for PA functions is computationally intractable in the worst case, and identifies fixed-dimensional tractability as an open direction. We address this direction from the viewpoint of parameterized complexity, with the ambient dimension $d$ as the parameter. In this paper, we give XP algorithms in fixed dimension for the tractable sides, and prove W[1]-hardness for the complementary sides. Moreover, lower bounds under the Exponential Time Hypothesis rule out algorithms running in time $ρ(d)\cdot\mathrm{size}^{o(d)}$ for any computable function $ρ$, where $\mathrm{size}$ denotes the total binary encoding length of the stationarity-testing instance. As a further consequence, our results yield the corresponding parameterized complexity picture for testing local minimality of continuous PA functions. We further extend our hardness results to a family of shallow ReLU CNN training losses, with stationarity tested in the trainable weight space. Thus, the same parameterized-complexity picture also appears for simple CNN training losses.

Hardness Amplification for (Sparse) LPN

from arXiv: Computational Complexity

Authors: Divesh Aggarwal, Rishav Gupta, Li Zeyong

We prove new hardness amplification results for Learning Parity with Noise ($\mathsf{LPN}$) and its sparse variants. In $\mathsf{LPN}_{η,n,m}$, the goal is to recover a secret $\vec s\in\mathbb{F}_2^n$ from $m$ noisy linear samples $(\vec a,b)$, where $\vec a\leftarrow \mathbb{F}_2^n$ is uniform and $b=\langle \vec a,\vec s\rangle + e$ with $e\leftarrow \mathrm{Ber}(η)$. Building on the direct-product framework introduced by Hirahara and Shimizu [HS23], we show an 'instance-fraction amplification' theorem: for any $\varepsilon,δ>0$, any algorithm that solves $\mathsf{LPN}_{η,n,m}$ with success probability $\varepsilon$ can be transformed into an algorithm that succeeds with probability $1-δ$ on a related \textsf{LPN} distribution with scaled parameters $\mathsf{LPN}_{η/k,\;n/k,\;m}$, where $ k=Θ\!\left(\frac{1}δ\log\frac{1}{\varepsilon}\right). $ Equivalently, an algorithm that solves $\mathsf{LPN}$ on a 'small fraction of instances' can be converted into an algorithm that solves $\mathsf{LPN}$ on 'almost all instances', yielding a self-amplification for a wide range of parameters. We extend the same amplification approach to $\mathsf{LPN}$ over $\mathbb{F}_q$ and to Sparse-$\mathsf{LPN}$, where each query vector $\vec a$ has exactly $σ$ nonzero entries. Together, these results establish hardness self-amplification for a broad family of $\mathsf{LPN}$-type problems, strengthening the foundations for assuming the average-case hardness of $\mathsf{LPN}$ and its sparse variants.
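
As a concrete illustration of the problem whose hardness is being amplified, the sketch below samples an $\mathsf{LPN}_{η,n,m}$ instance exactly as defined above: a uniform secret, uniform query vectors, and labels $\langle \vec a,\vec s\rangle + e$ over $\mathbb{F}_2$ with Bernoulli($η$) noise. The parameter values are arbitrary.

```python
import numpy as np

def lpn_instance(n, m, eta, seed=0):
    """Sample an LPN_{eta,n,m} instance: a uniform secret s in F_2^n, m uniform
    query vectors a, and labels b = <a, s> + e over F_2 with e ~ Ber(eta)."""
    rng = np.random.default_rng(seed)
    s = rng.integers(0, 2, size=n)            # secret
    A = rng.integers(0, 2, size=(m, n))       # uniform query vectors (rows)
    e = (rng.random(m) < eta).astype(int)     # Bernoulli(eta) noise
    b = (A @ s + e) % 2                       # noisy inner products
    return s, A, b

s, A, b = lpn_instance(n=32, m=256, eta=0.125)
print(float(((A @ s) % 2 != b).mean()))       # empirical noise rate, roughly eta
```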

The two clocks and the innovation window: When and how generative models learn rules

from arXiv: Computational Complexity

Authors: Binxu Wang, Emma Lucia Byrnes Finn, Bingbin Liu

Generative models trained on finite data face a fundamental tension: their score-matching or next-token objective converges to the empirical training distribution rather than the population distribution we seek to learn. Using rule-valid synthetic tasks, we trace this tension across two training timescales: $τ_{\mathrm{rule}}$, the step at which generations first become rule-valid, and $τ_{\mathrm{mem}}$, the step at which models begin reproducing training samples. Focusing on parity and extending to other binary rules and combinatorial puzzles, we characterize how these two clocks, $τ_{\mathrm{rule}}$ and $τ_{\mathrm{mem}}$, depend on key aspects of the learning setup. Specifically, we show that $τ_{\mathrm{rule}}$ increases with rule complexity and decreases with model capacity, while $τ_{\mathrm{mem}}$ is approximately invariant to the rule and scales nearly linearly with dataset size $N$. We define the \emph{innovation window} as the interval $[τ_{\mathrm{rule}}, τ_{\mathrm{mem}}]$. This window widens with increasing $N$ and narrows with rule complexity, and may vanish entirely when $τ_{\mathrm{rule}} \geq τ_{\mathrm{mem}}$. The same two-clock structure arises in both diffusion (DiT) and autoregressive (GPT) models, with architecture-dependent offsets. Dissecting the learned score of DiT models reveals a corresponding evolution of the optimization landscapes, where rule-valid samples' basins expand substantially around $τ_{\mathrm{rule}}$, while training samples' basins begin to dominate around $τ_{\mathrm{mem}}$. Together, these results yield a unified and predictive account of when and how generative models exhibit genuine innovation.
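
For intuition about how the two clocks could be read off in practice, here is a minimal sketch of the two measured quantities, assuming (for illustration) that the parity rule means the bits of a sample sum to 0 mod 2: the rule-valid fraction of a batch of generations, whose onset defines $τ_{\mathrm{rule}}$, and the memorized fraction, whose onset defines $τ_{\mathrm{mem}}$. The helper names and the random stand-in data are assumptions.

```python
import numpy as np

def rule_valid_fraction(samples):
    """Fraction of generated binary samples satisfying the (even) parity rule."""
    return float((samples.sum(axis=1) % 2 == 0).mean())

def memorized_fraction(samples, train_set):
    """Fraction of generated samples that exactly reproduce a training sample."""
    train = {tuple(row) for row in train_set}
    return float(np.mean([tuple(row) in train for row in samples]))

# Random stand-ins for a training set and a batch of model generations.
rng = np.random.default_rng(0)
train_set = rng.integers(0, 2, size=(100, 16))
generations = rng.integers(0, 2, size=(64, 16))
print(rule_valid_fraction(generations), memorized_fraction(generations, train_set))
```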

Optimal Inapproximability of Generalized Linear Equations over a Finite Group

from arXiv: Computational Complexity

Authors: Amey Bhangale, Yezhou Zhang

Constraint satisfaction problems (CSPs) consist of a set of variables taking values from some finite domain and a set of local constraints on these variables. The objective is to find an assignment to the variables that maximizes the fraction of satisfied constraints. In this work, we study the CSP where the constraints are generalized linear equations over a finite group $G$. More specifically, for a given $S \subseteq G$, each constraint requires that the sum of the values assigned to its variables (or their product, for non-abelian groups) belongs to the set $S$. We give an approximation algorithm for this problem on satisfiable instances and show that it is optimal for certain $S$ assuming $P\neq NP$. This natural predicate is one of the very few known predicates that are approximation resistant on almost satisfiable instances, assuming $P\neq NP$, but admits a non-trivial approximation algorithm on satisfiable instances.
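
As a small illustration of the predicate (not of the approximation algorithm), the sketch below evaluates the satisfied fraction of such constraints in the abelian case $G=\mathbb{Z}_q$: a constraint lists the variables whose summed values must land in $S$. The group, the set $S$, and the toy instance are arbitrary choices for illustration.

```python
def satisfied_fraction(assignment, constraints, S, q):
    """Fraction of constraints satisfied over G = Z_q: a constraint is a tuple
    of variables and holds when the sum of their assigned values mod q is in S."""
    good = sum(1 for scope in constraints
               if sum(assignment[v] for v in scope) % q in S)
    return good / len(constraints)

# Toy instance over Z_3 with S = {0, 1}.
q, S = 3, {0, 1}
constraints = [(0, 1), (1, 2), (0, 2)]
assignment = {0: 1, 1: 2, 2: 0}
print(satisfied_fraction(assignment, constraints, S, q))  # 2/3 of constraints hold
```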

Multi-Prover Interactive Proof Systems with Leakage

from arXiv: Computational Complexity

Authors: Vahid R. Asadi, Atsuya Hasegawa, François Le Gall

It is known that there exist multi-prover interactive protocols ($\mathsf{MIP}$ protocols) for the complexity class $\mathsf{NEXP}$, succinct $\mathsf{MIP}$ protocols for $\mathsf{NP}$ and multi-prover interactive protocols with shared entanglement ($\mathsf{MIP}^\ast$ protocols) for $\mathsf{RE}$. This extraordinary power of multi-prover interactive proof systems comes from the assumption that provers do not communicate with each other during the protocols. If they are allowed to communicate freely, the setting is the same as in the single-prover case, and the computational power of the system becomes significantly weaker. In this paper, we investigate for the first time the setting where communication (i.e., leakage of information) between provers is allowed but bounded. We introduce two techniques to approach this question and show that multi-prover interactive proof systems are robust against some amount of leakage. Our first technique is based on parallel repetition theorems. We apply it to show that for any polynomial $p$, we can construct two-prover one-round $\mathsf{MIP}$ and $\mathsf{MIP}^\ast$ protocols for $\mathsf{NEXP}$ and $\mathsf{RE}$, respectively, that are robust against $p(n)$ bits of leakage. We further derive our second technique to convert any low-soundness PCP construction to a two-prover one-round $\mathsf{MIP}$ protocol for $\mathsf{NP}$ robust against leakage. We also discuss the relation between robustness against leakage in multi-prover interactive proof systems and the Sliding Scale Conjecture in the PCP literature.

Towards infinite PCSP: a dichotomy for monochromatic cliques

from arXiv: Computational Complexity

Authors: Demian Banakh, Alexey Barsukov, Tamio-Vesa Nakajima

The logic MMSNP is a well-studied fragment of Existential Second-Order logic that, from a computational perspective, captures finite-domain Constraint Satisfaction Problems (CSPs) modulo polynomial-time reductions. At the same time, MMSNP contains many problems that are expressible as $ω$-categorical CSPs but not as finite-domain ones. We initiate the study of Promise MMSNP (PMMSNP), a promise analogue of MMSNP. We show that every PMMSNP problem is poly-time equivalent to a (finite-domain) Promise CSP (PCSP), thereby extending the classical MMSNP-CSP correspondence to the promise setting. We then investigate the complexity of PMMSNPs arising from forbidding monochromatic cliques, a class encompassing promise graph colouring problems. For this class, we obtain a full complexity classification conditional on the Rich 2-to-1 Conjecture, a recently proposed perfect-completeness surrogate of the Unique Games Conjecture. As a key intermediate step which may be of independent interest, we prove that it is NP-hard, under the Rich 2-to-1 Conjecture, to properly colour a uniform hypergraph even if it is promised to admit a colouring satisfying a certain technical condition called reconfigurability. This proof is an extension of the recent work of Braverman, Khot, Lifshitz and Minzer (Adv. Math. 2025). To illustrate the broad applicability of this theorem, we show that it implies most of the linearly-ordered colouring conjecture of Barto, Battistelli, and Berg (STACS 2021).

Understanding Robust Catalytic Computing

from arXiv: Computational Complexity

Authors: Michal Koucký, Ian Mertz, Sasha Sami

Catalytic computing concerns space-bounded computation which starts with memory full of data that have to be restored by the end of the computation. Lossy catalytic computing, defined by Gupta et al. (2024) and fully characterized by Folkertsma et al. (ITCS 2025), is the study of allowing a small number of errors when resetting the catalytic tape at the end of a computation. Such a notion is useful when considering the robust use of catalytic techniques in the study of ordinary space-bounded algorithms. To that end, however, defining and characterizing less strict notions of error was left open by Folkertsma et al. (ITCS 2025) and other works such as Mertz (B. EATCS, 2023). We expand the definition of possible resetting error in three natural ways: (1) randomized catalytic computation which can completely destroy the catalytic tape with some probability over the randomness; (2) randomized catalytic computation which makes a bounded number of errors in expectation over the randomness; and (3) deterministic catalytic computation which makes a bounded number of errors in expectation over the initial catalytic tape itself. We show a near-complete characterization of the above models, both in the general case and in the logspace polynomial-time regime, by showing equivalences to one another, to errorless catalytic space models, or to standard time or space complexity classes. Under a derandomization assumption, we show a near-full collapse of all existing catalytic classes in the logspace regime.

VP, VNP and Algebraic Branching Programs over Min-Plus Semirings

from arXiv: Computational Complexity

Authors: Balagopal Komarath, Harshil Mittal, Jayalal Sarma

Arithmetic circuit complexity studies the complexity of computing polynomials using only arithmetic operations such as addition, multiplication, subtraction, and division. Polynomials over rings of integers model counting problems. Similarly, polynomials over semirings such as tropical semirings model optimization problems. Circuits over semirings then model so-called pure algorithms, algorithms that only use the operations in the semiring. In this paper, we do a complexity-theoretic study of the power and limitations of circuits (which represent dynamic programs) over semirings: i) We define $\mathsf{VNP}$ over min-plus semirings, which can faithfully represent problems such as computing min-weight perfect matchings and min-weight Hamiltonian cycles where we have efficiently verifiable certificates. Unlike over rings, we complement the values in the certificate for free, as complementation is impossible over min-plus semirings. We prove a dichotomy theorem that states that if we only complement logarithmically many values, this class is the same as $\mathsf{VP}$ over min-plus semirings. If we complement super-logarithmically many values, then $\mathsf{VNP} \neq \mathsf{VP}$. ii) We consider constant-width ABPs (which are also called incremental dynamic programs that are restricted to use only a constant number of registers) and show that even simple problems like computing the min-weight $2$-edge-matching are impossible with width $2$ (or $2$ registers). However, with width $3$ (or $3$ registers), such programs can compute everything. More generally, we show that constant-depth formulas are efficiently simulated by constant-width ABPs. iii) We show that an exponential hypercube sum (min in the semiring) over even provably weak models such as width-$2$ ABPs and products of linear forms is the same as $\mathsf{VNP}$.
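
For readers less used to the tropical setting, the sketch below evaluates a layered algebraic branching program over the min-plus semiring: along a path the edge weights add (the semiring "product"), and over parallel paths we take a minimum (the semiring "sum"), so the program computes a minimum-weight source-to-sink path, i.e., exactly the kind of incremental dynamic program discussed above. The layered representation and the example weights are illustrative assumptions.

```python
import math

def eval_min_plus_abp(layers):
    """Evaluate a layered ABP over the (min, +) semiring.  layers[t][i][j] is
    the weight of the edge from node i in layer t to node j in layer t+1
    (math.inf if absent).  Weights add along a path; parallel paths combine by
    taking a minimum, so the result is the min-weight source-sink path."""
    values = [0.0]                              # a single source of weight 0
    for W in layers:
        values = [min(values[i] + W[i][j] for i in range(len(values)))
                  for j in range(len(W[0]))]
    return values

# A width-2 program: source -> 2 nodes -> 2 nodes -> sink.
layers = [[[1.0, 4.0]],
          [[0.0, 2.0], [1.0, math.inf]],
          [[3.0], [0.0]]]
print(eval_min_plus_abp(layers))                # [3.0], the cheapest source-sink path
```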

Quantum algorithms for path and cycle containment problems

from arXiv: Computational Complexity

Authors: Arjan Cornelissen, Amin Shiraz Gilani, Subhasree Patro

The quantum query complexity of subgraph-containment problems, which ask whether a given subgraph $H$ is present in an input graph $G$, has been the subject of considerable study. However, even for relatively simple subgraphs, such as paths and cycles, a complete understanding of their query complexities remains elusive. In this work, we consider several variants of path- and cycle-containment problems in the adjacency matrix model, where we search for paths or cycles of constant length $k$. We compare the settings where the graphs are directed or undirected, where the goal is to detect or find the existence of a path/cycle, and where the path/cycle we are looking for has length exactly $k$, or at most $k$. We also consider several promise versions of these problems, where we suppose that the input graph has a certain structure. We characterize the relative difficulty of these variants of the path/cycle-containment problems, by relating them to one another using randomized reductions, and grouping them into equivalence classes. When we restrict our attention to path-containment problems, we get a dichotomy result. Some of the path-containment problems can be solved using a linear number of queries, and all the others are equivalent to one another (and additionally to several cycle-containment problems) under randomized reductions, up to constant overhead. For the latter equivalence class, we present a novel quantum-walk-based algorithm that achieves query complexity $\widetilde{O}(n^{3/2-α_k})$, where $α_k \in Θ(c^{-k})$ and $c = \sqrt{3+\sqrt{17}}/2 \approx 1.33$, beating the previous best upper bound $O(n^{3/2})$ on its query complexity. We also provide a conditional lower bound based on the graph-collision problem, which implies that this equivalence class does not admit linear-query quantum algorithms unless graph collision admits an $O(\sqrt{n})$-query algorithm.

Tight Lower Bound for Approximating Parametrized Maximum Likelihood Decoding under ETH

from arXiv: Computational Complexity

Authors: Rishav Gupta, Bingkai Lin, Xin Zheng

We present a simple deterministic reduction which, assuming the Exponential Time Hypothesis ($\mathsf{ETH}$), yields tight lower bounds for approximating the parameterized Maximum Likelihood Decoding problem ($\mathsf{MLD}$) and the parameterized Nearest Codeword Problem ($\mathsf{NCP}$) within some fixed constant factor. Our starting point is the ETH-based exponential-time hardness of $(c,s)$-Gap-$\mathsf{MAXLIN}$ established in [BHI+24]. We transform a $(c,s)$-Gap-$\mathsf{MAXLIN}$ instance into an instance of $γ$-Gap $k$-$\mathsf{MLD}$ via a novel combinatorial object that we call a cover family. We provide both a randomized construction of the required cover families and a subsequent derandomization. Prior to our work, $n^{Ω(k)}$ hardness for constant-factor approximation was only shown under the randomized Gap Exponential Time Hypothesis (Gap-$\mathsf{ETH}$) [Man20], which is a much stronger assumption than $\mathsf{ETH}$. Under $\mathsf{ETH}$, the strongest known lower bound was $n^{Ω(k/\operatorname{poly} \log k)}$ due to [BKM25]. Unlike previous approaches that rely on reductions from the hardness of approximating $2$-$\mathsf{CSP}$, our reduction provides a more direct and conceptually simpler route to achieving the optimal lower bounds.

Entropy of pebble automata and space complexity

from arXiv: Computational Complexity

Authors: J. Andres Montoya

Let L denote the class Logspace and NL the class NLogspace. We use logCFL to denote the closure under logspace reductions of the set of context-free languages. We prove that NL is different from logCFL. This result implies L different from Ptime and the stronger separation NL different from Ptime.

Charting the Diameter Computation Landscape on Intersection Graphs in the Plane

from arXiv: Computational Geometry

Authors: Timothy M. Chan, Hsien-Chih Chang, Jie Gao, Sándor Kisfaludi-Bak, Hung Le, Da Wei Zheng

Computing the diameter of the intersection graphs of objects is a basic problem in computational geometry. Previous works showed that the complexity of computing the diameter mainly depends on the object types: for unit disks and squares in 2D, the problem is solvable in truly subquadratic time, while for other objects, including unit segments and equilateral triangles in 2D or unit balls and axis-parallel unit cubes in 3D, there is no truly subquadratic time algorithm under the Orthogonal Vector (OV) hypothesis. We undertake a comprehensive study of computing the diameter of geometric intersection graphs for various types of objects. We discover many new irregularities, showing that the landscape is extremely nuanced: the source of hardness is a combination of the object type, the true diameter value, and how the objects intersect with each other. Our highlighted results for the 2D case include: 1. The diameter of intersection graphs of non-degenerate, axis-aligned line segments can be computed in truly subquadratic time. The previous hardness result for line segments applies only to degenerate instances. On the other hand, for the degenerate case, we show that a truly subquadratic time algorithm exists when the true diameter is constant. 2. An almost-linear-time algorithm for unit-square graphs of constant diameter. Previous algorithms rely on succinct representation assuming bounded VC-dimension; for such a strategy $Ω(n^{7/4})$ time is an inherent barrier. 3. An $\tilde{O}(n^{4/3})$-time algorithm to decide if the diameter of a unit-disk graph is at most 2. This improves upon the recent algorithm with running time $\tilde{O}(n^{2-1/9})$. 4. Deciding if the diameter of intersection graphs of fat triangles or line segments is at most 2 is truly subquadratic-hard under fine-grained complexity assumptions. Previous lower bounds only hold when deciding if the diameter is at most 3.
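
For concreteness, here is the naive baseline the paper improves upon for unit-disk graphs: build the intersection graph (adjacent iff the Euclidean distance is at most 1, one common convention) and run BFS from every vertex. This is only meant to pin down the problem; it is quadratic just to form the graph and is not one of the paper's algorithms.

```python
import math
from collections import deque

def unit_disk_diameter(points):
    """Hop diameter of the unit-disk graph on `points` (adjacent iff Euclidean
    distance <= 1).  Brute force: O(n^2) to build the graph, then BFS from
    every vertex; assumes the graph is connected."""
    n = len(points)
    adj = [[] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(points[i], points[j]) <= 1.0:
                adj[i].append(j)
                adj[j].append(i)
    diameter = 0
    for s in range(n):
        dist = [-1] * n
        dist[s] = 0
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if dist[v] == -1:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        diameter = max(diameter, max(dist))
    return diameter

print(unit_disk_diameter([(0, 0), (0.8, 0), (1.6, 0), (2.4, 0)]))  # 3
```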

Higher-order Persistence Diagrams

from arXiv: Computational Geometry

Authors: Charles Fanning, Mehmet Aktas

Many topological data analysis (TDA) pipelines compute large collections of persistence diagrams, yet vectorizations and kernel methods discard the rank-induced implication relations among persistence intervals that are essential for faithful structural comparison and interpretability. We introduce higher-order persistence diagrams, a recursive construction in which containment relations among persistence intervals define higher-order persistence intervals. This construction performs comparison and aggregation directly on persistence diagrams and preserves interval-level structure. We use harmonic analysis to reduce frequency-space evaluations of aggregated diagrams to zeta transforms. This reduction avoids explicit construction of higher-order diagrams and replaces quadratic pair enumeration with nearly linear-time evaluation. Experiments on random network models show substantial speedups over explicit aggregation. Anonymized code is available at https://anonymous.4open.science/r/higher-order-persistence-8201.
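
As a small illustration of the raw relation the construction starts from, the sketch below enumerates containment pairs of persistence intervals, where $(b_1,d_1)$ contains $(b_2,d_2)$ iff $b_1 \le b_2$ and $d_2 \le d_1$; this naive quadratic pair enumeration is precisely what the paper's zeta-transform evaluation avoids, and the recursive higher-order construction itself is more involved than this.

```python
def containment_pairs(diagram):
    """All ordered pairs (I, J) of persistence intervals with I containing J,
    where (b1, d1) contains (b2, d2) iff b1 <= b2 and d2 <= d1."""
    return [((b1, d1), (b2, d2))
            for (b1, d1) in diagram
            for (b2, d2) in diagram
            if (b1, d1) != (b2, d2) and b1 <= b2 and d2 <= d1]

# The long interval (0, 5) contains both shorter intervals.
print(containment_pairs([(0, 5), (1, 3), (2, 4)]))
```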

Nearly-Tight Bounds for Vertical Decomposition in Three and Four Dimensions

from arXiv: Computational Geometry

Authors: Pankaj K. Agarwal, Esther Ezra, Micha Sharir

Vertical decomposition is a widely used general technique for decomposing the cells of arrangements of semi-algebraic sets in ${\mathbb R}^d$ into constant-complexity subcells. In this paper, we settle in the affirmative a few long-standing open problems involving the vertical decomposition of substructures of arrangements for $d = 3, 4$. For example, we obtain sharp bounds on the complexity of the vertical decomposition of the complement of the union of a set of semi-algebraic regions of constant complexity in ${\mathbb R}^3$, and of the minimization diagram of a set of trivariate functions. These results lead to efficient algorithms for a variety of problems involving vertical decompositions, including algorithms for constructing the decompositions themselves and for constructing $(1/r)$-cuttings of substructures of arrangements. They also lead to a data structure for answering point-enclosure queries amid semi-algebraic sets in ${\mathbb R}^3$ and ${\mathbb R}^4$.

The stochastic block model has the overlap graph property for modularity

from arXiv: Data Structures and Algorithms

Authors: Shankar Bhamidi, David Gamarnik, Remco van der Hofstad, Nelly Litvak, Pawel Pralat, Fiona Skerman, Yasmin Tousinejad

The overlap gap property (OGP) is a statement about the geometry of near-optimal solutions. Exhibiting OGP implies failure of a class of local algorithms, and has been observed to coincide with conjectured algorithmic limits in problems with a statistical-computational gap. We consider the Stochastic Block Model (SBM), where the graph has a planted partition with $k$ equal-size blocks which form the 'communities', and where, for parameters $p>q$, vertices within the same community connect with probability $p$, while vertices in different communities connect with probability $q$, independently across pairs of vertices. Modularity-based clustering algorithms have become ubiquitous in applications. This article studies theoretical limits of local algorithms based on the modularity score on the SBM. We establish that modularity exhibits OGP on the SBM. This rules out a class of local algorithms based on modularity for recovery in the SBM, and shows slow mixing time for a related Markov chain. Theoretically this is one of the few instances where OGP has been established for a 'planted' model, as most such analyses to date consider the 'null' model. As part of our analysis, we extend a result by Bickel and Chen (2009), who established that with high probability, the modularity-optimal partition of the SBM is $o(n)$ local moves away from the planted partition, where $n$ is the graph size. We show that, with high probability, any partition with modularity score sufficiently near the optimal value is close to the planted partition.
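
As a quick illustration of the objects involved (not of the OGP argument), the sketch below samples a small SBM with networkx and evaluates the modularity score of the planted partition; the block sizes and probabilities are arbitrary.

```python
import networkx as nx
from networkx.algorithms.community import modularity

# k = 2 equal blocks; within-block probability p exceeds cross-block q.
sizes = [50, 50]
p, q = 0.30, 0.05
G = nx.stochastic_block_model(sizes, [[p, q], [q, p]], seed=0)

planted = [set(range(0, 50)), set(range(50, 100))]
print(modularity(G, planted))  # modularity score of the planted partition
```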

Streaming Complexity Separations for Dense and Sparse Graphs

from arXiv: Data Structures and Algorithms

Authors: Yang P. Liu, Hoai-An Nguyen, Noah G. Singer, David P. Woodruff

We identify a sharp separation in the streaming space complexity of Maximum Cut when the algorithm must output an approximate cut (rather than only the approximate value). For dense graphs, we show that $O(n/\varepsilon^2)$ space is sufficient and that $Ω(n)$ space is necessary. In contrast, for graphs with $Θ(n/\varepsilon^2)$ edges, the situation is markedly different: we show that the problem requires $Ω(n \log(\varepsilon^2 n)/\varepsilon^2)$ space for any $\varepsilon=ω(1/\sqrt{n})$, which is tight for the full range of $\varepsilon$. We also give an $Ω(n \log n/\varepsilon^2)$-space lower bound against deterministic algorithms for outputting a $(1-\varepsilon)$-approximation to the value of the maximum cut. Using similar techniques, we prove an analogous sharp separation in the streaming space complexity of Densest Subgraph, and show that, for every constant-arity CSP over a constant-size alphabet and for the Similarity problem, the space complexity in dense streams can be improved by shaving a logarithmic factor.

FPT Approximation Schemes for Min-Sum Radii and Min-Sum Diameters Clustering

from arXiv: Data Structures and Algorithms

Authors: Fabrizio Grandoni, Anupam Gupta, Jatin Yadav

In the classical Min-Sum Radii problem (MSR) we are given a set $X$ of $n$ points in a metric space and a positive integer $k\in [n]$. Our goal is to partition $X$ into $k$ subsets (the clusters) so as to minimize the sum of the radii of these clusters. The Min-Sum Diameters problem (MSD) is defined analogously, where instead of the radii of the clusters we consider their diameters. For both problems we present FPT approximation schemes for the natural parameter $k$. Specifically, given $\varepsilon>0$, we show how to compute $(1+\varepsilon)$-approximations for MSD and MSR in time $(1/\varepsilon)^{k}n^{O(1)}$ and $(1/\varepsilon)^{O((k/\varepsilon)\log(1/\varepsilon))}n^{\mathrm{poly}(1/\varepsilon)}$, respectively. The previous best FPT approximation algorithms for these problems have approximation factors $4+\varepsilon$ and $2+\varepsilon$, respectively, and the existence of FPT approximation schemes for both problems was an outstanding open problem.
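
To make the MSD objective concrete, the following minimal sketch evaluates the sum-of-diameters cost of a candidate $k$-partition under an arbitrary metric; the brute-force evaluation and all names are illustrative only and unrelated to the paper's algorithms.

```python
from itertools import combinations

def sum_of_diameters(points, clusters, dist):
    """MSD objective: sum over clusters of their maximum pairwise distance.

    points   : list of points
    clusters : list of lists of indices into `points`, forming a partition
    dist     : metric, dist(p, q) -> float
    """
    total = 0.0
    for cluster in clusters:
        if len(cluster) >= 2:
            total += max(dist(points[i], points[j])
                         for i, j in combinations(cluster, 2))
    return total

# Toy example in the plane with Euclidean distance.
pts = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10)]
euclid = lambda p, q: ((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2) ** 0.5
print(sum_of_diameters(pts, [[0, 1, 2], [3, 4]], euclid))  # ~2.414
```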

The Impossibility of Simultaneous Time and I/O Optimality for The Planar Maxima and Convex Hull Problems

from arXiv: Data Structures and Algorithms

Authors: Peyman Afshani, Gerth Stølting Brodal, Nodari Sitchinava

We prove that no deterministic output-sensitive algorithm for the planar convex hull and maxima problems can obtain both optimal time and I/O complexity, where the optimality is defined with respect to both the input and output sizes. This explains why the best previous algorithms achieved an optimal I/O bound at the cost of sub-optimal running time (Goodrich et al. [FOCS, 1993]). To the best of our knowledge, the impossibility of simultaneous optimality was only shown previously for the permutation problem by Brodal and Fagerberg [STOC, 2003]. Our results imply that no optimal deterministic output-sensitive cache-oblivious algorithm exists for either problem. In addition, we present simple deterministic algorithms that match our lower bounds and that provide a trade-off between time and I/Os. On the other hand, a simple modification of our deterministic algorithm results in a randomized algorithm that simultaneously achieves optimal (worst-case) time and optimal expected I/O bounds.

Chasing Small Sets Optimally Against Adaptive Adversaries

from arXiv: Data Structures and Algorithms

Authors: Christian Coester, Alexa Tudose

We study deterministic online algorithms for the problem of chasing sets of cardinality at most $k$ in a metric space, also known as metrical service systems and equivalent to width-$k$ layered graph traversal. We resolve the 30-year-old gap of $\Omega(2^k)\cap O(k2^k)$ on the competitive ratio of this problem by giving an $O(2^k)$-competitive deterministic algorithm. This bound is optimal even among randomized algorithms against adaptive adversaries. We also (slightly) improve the deterministic lower bound to $D_k$, defined recursively by $D_1=1$ and $D_{k+1}=2D_k+\sqrt{8+8D_k}+3$, which we conjecture to be exactly tight. For $k=3$, we provide a matching upper bound of $D_3$. Our results imply slightly improved upper and lower bounds for distributed asynchronous collective tree exploration and for the $k$-taxi problem, respectively. Our algorithm generalizes the classical doubling strategy, previously known to be optimal for $k=2$. The previous best bound for general $k$ was achieved by the generalized work function algorithm (WFA), and was known to be tight for WFA. Our improved bound therefore implies that WFA is sub-optimal for chasing small sets.
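
The recursive lower bound $D_k$ is straightforward to evaluate numerically; the short sketch below simply iterates the recursion stated above (the printed values beyond $D_1=1$ are just this evaluation, not figures from the paper).

```python
import math

def chasing_lower_bounds(k_max):
    """Evaluate D_1 = 1 and D_{k+1} = 2*D_k + sqrt(8 + 8*D_k) + 3."""
    d = [None, 1.0]  # 1-indexed: d[1] = 1
    for k in range(1, k_max):
        d.append(2 * d[k] + math.sqrt(8 + 8 * d[k]) + 3)
    return d[1:]

for k, dk in enumerate(chasing_lower_bounds(6), start=1):
    print(f"D_{k} = {dk:.3f}")  # D_1 = 1.000, D_2 = 9.000, D_3 = 29.944, ...
```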

Mistake-Bounded Language Generation

from arXiv: Data Structures and Algorithms

Authors: Jon Kleinberg, Charlotte Peale, Omer Reingold

We investigate the learning task of language generation in the limit, but shift focus from the traditional time-of-last-mistake metric of a generator's success to a new notion of "mistake-bounded generation." While existing results for language generation in the limit focus on guaranteeing eventual consistency, they are blind to the cumulative error incurred during the learning process. We address this by shifting the goal to minimizing the total number of invalid elements output by a generation algorithm. We establish a formal reduction to the Learning from Correct Demonstrations framework of Joshi et al. (2025), enabling a general recipe for deriving mistake bounds via weighted update rules. For finite classes, we provide an algorithm that simultaneously achieves an optimal last-mistake time of $\mathsf{Cdim}(L)$ and a mistake bound of $\lfloor \log_2 |L| \rfloor$, whereas for the non-uniform setting of countably infinite streams of languages, we prove a fundamental trade-off: achieving logarithmic mistakes $O(\log i)$ necessarily precludes convergence guarantees established in prior work. Finally, we show that our framework can be extended to accommodate noisy adversaries and guarantee mistake bounds that scale with the adversary's suboptimality.

Handicap reduction for linear complementarity problems

from arXiv: Data Structures and Algorithms

Authors: Marianna E. -Nagy, László A. Végh

Linear Complementarity Problems (LCPs) with sufficient matrices form an important subclass of LCPs, and it remains a significant open question whether problems in this class can be solved in polynomial time. Kojima, Megiddo, Noma, and Yoshise gave an Interior Point Algorithm (IPA) in 1991 that can solve LCPs with sufficient matrices in time bounded polynomially in the input size and the so-called handicap number $\hat{\kappa}(M)$ of the coefficient matrix $M$. However, this value can be exponentially large in the bit encoding length. In fact, no upper bounds were previously known on $\hat{\kappa}(M)$. Settling an open question raised in de Klerk and E.-Nagy (Math Programming, 2011), we give an exponential upper bound on $\hat{\kappa}(M)$ in the bit-complexity of $M$. This is based on a new characterization of sufficient matrices. The new characterization also leads to a simple new proof of Väliaho's theorem on the equivalence of sufficient and $\mathcal{P}^*$-matrices (Linear Algebra and its Applications, 1996). Noting that one can obtain an equivalent LCP by rescaling the rows and columns by a positive diagonal matrix, we define $\hat{\kappa}^\star(M)$ as the best possible handicap number achievable under such rescalings. Our second main result is an algorithm for LCPs with sufficient matrices, where the running time is polynomially bounded in the input size and in the optimized value $\hat{\kappa}^\star(M)$. This algorithm is based on the observation that the set of near-optimal row rescalings forms a convex set. Our algorithm combines the Ellipsoid Method over the set of row rescalings and an IPA with running time dependent on the handicap number of the matrix. If the IPA fails to solve the LCP in the desired running time, it provides a separation oracle to the Ellipsoid Method to find a better rescaling.

FractalSortCPU: Bandwidth-Efficient Compressed Radix Sort on CPU

from arXiv: Data Structures and Algorithms

Authors: Michael Dang'ana

Cloud database systems, particularly their middleware and query execution layers, use sorting as a core operation in query processing, indexing, and join execution. Distribution dependence and limited parallelism are key issues inherent in state-of-the-art radix sort, which is preferred for large datasets due to its performance advantages over comparison-based algorithms. Multi-pass bucketing, stochastic sampling, and dependence-graph structures are common solutions to these problems, but they incur the cost of data pre-processing and an increased memory footprint, so they are less appropriate for the large-scale workloads common in cloud environments. In-place radix sort schemes increase the number of passes as precision increases, which negatively impacts latency. Our work addresses these problems by introducing a CPU-adapted histogram compression scheme for radix sorting of arbitrary-precision keys, implemented on the CPU for increased accessibility, providing state-of-the-art execution time while limiting histogram growth. Fully parallel key-based histogram updates eliminate the need for input bucketing and data pre-processing, further lowering latency, mitigating distribution dependence, and reducing complexity. With a parallelized sorting architecture utilizing SIMD-accelerated operations for low latency, the algorithm demonstrates improvements over the state of the art on the CPU, GPU, and FPGA of 6x, 3x, and 2.5x in bandwidth efficiency on 512MB to 32GB data sets at 16-bit precision.
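
For background only, the sketch below shows the per-pass histogram (counting-sort) step of a textbook LSD radix sort, i.e., the baseline whose histogram growth the paper's compression scheme targets; it is single-threaded and is not the paper's method.

```python
def lsd_radix_sort(keys, key_bits=32, radix_bits=8):
    """Textbook least-significant-digit radix sort with a per-pass histogram."""
    mask = (1 << radix_bits) - 1
    for shift in range(0, key_bits, radix_bits):
        # Build the histogram for the current digit.
        counts = [0] * (1 << radix_bits)
        for k in keys:
            counts[(k >> shift) & mask] += 1
        # Exclusive prefix sums give the starting offset of each bucket.
        offsets, running = [0] * len(counts), 0
        for d, c in enumerate(counts):
            offsets[d], running = running, running + c
        # Stable scatter into the output buffer.
        out = [0] * len(keys)
        for k in keys:
            d = (k >> shift) & mask
            out[offsets[d]] = k
            offsets[d] += 1
        keys = out
    return keys

print(lsd_radix_sort([170, 45, 75, 90, 802, 24, 2, 66], key_bits=16))
```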

Convex Optimization with Local Label Differential Privacy: Tight Bounds in All Privacy Regimes

from arXiv: Data Structures and Algorithms

Authors: Lynn Chua, Badih Ghazi, Ravi Kumar, Pasin Manurangsi, Ziteng Sun, Chiyuan Zhang

We study the problem of Stochastic Convex Optimization (SCO) under the constraint of local Label Differential Privacy (L-LDP). In this setting, the features are considered public, but the corresponding labels are sensitive and must be randomized by each user locally before being sent to an untrusted analyzer. Prior work for SCO under L-LDP (Ghazi et al., 2021) established an excess population risk bound with a \emph{linear} dependence on the size of the label space, $K$: $O\left(\frac{K}{\varepsilon\sqrt{n}}\right)$ in the high-privacy regime ($\varepsilon\leq 1$) and $O\left(\frac{K}{e^{\varepsilon} \sqrt{n}}\right)$ in the medium-privacy regime ($1 \leq \varepsilon\leq \ln K$). This left open whether this linear cost is fundamental to the L-LDP model. In this note, we resolve this question. First, we present a novel and efficient non-interactive L-LDP algorithm that achieves an excess risk of $O\left(\sqrt{\frac{K}{\varepsilon n}}\right)$ in the high-privacy regime ($\varepsilon\leq 1$) and $O\left(\sqrt{\frac{K}{e^{\varepsilon} n}}\right)$ in the medium-privacy regime ($1 \leq \varepsilon\leq \ln K$). This quadratically improves the dependency on the label space size from $O(K)$ to $O(\sqrt{K})$. Second, we prove a matching information-theoretic lower bound across all privacy regimes for any sufficiently large $n$.
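
To illustrate what a local label randomizer looks like, the sketch below implements $K$-ary randomized response, a standard $\varepsilon$-LDP mechanism for a label in $\{0,\dots,K-1\}$; we do not claim this is the mechanism used in the paper, it only shows the kind of local randomization L-LDP requires.

```python
import math, random

def k_randomized_response(label, K, eps):
    """Standard K-ary randomized response: report the true label with
    probability e^eps / (e^eps + K - 1), otherwise a uniformly random
    other label. Satisfies eps-local differential privacy for the label."""
    p_true = math.exp(eps) / (math.exp(eps) + K - 1)
    if random.random() < p_true:
        return label
    other = random.randrange(K - 1)
    return other if other < label else other + 1

# Example: privatize a label from a 10-class problem at eps = 1.
print(k_randomized_response(label=3, K=10, eps=1.0))
```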

An Approximation Algorithm for 2-Vertex-Connectivity via Cycle-Restricted 2-Edge-Covers

from arXiv: Data Structures and Algorithms

Authors: Yusuke Kobayashi, Takashi Noguchi

In the 2-Vertex-Connected Spanning Subgraph problem (2-VCSS), we are given an undirected graph $G$, and the objective is to find a 2-vertex-connected spanning subgraph $S$ of $G$ with the minimum number of edges. In the context of survivable network design, 2-VCSS is one of the most fundamental and well-studied problems. There has been active research on improving the approximation ratio of algorithms, and the current best ratio is $\frac{4}{3}$, achieved by Bosch-Calvo, Grandoni, and Jabal Ameli. In this paper, we improve the approximation ratio to $\frac{95}{72}+\varepsilon$ ($<1.32$). The key idea in our algorithm is to introduce a 2-edge-cover without certain cycle components, and use it as an initial solution.

A 4.509-Approximation Algorithm for Generalized Min Sum Set Cover

from arXiv: Data Structures and Algorithms

Authors: Amey Bhangale, Yezhou Zhang

We study the \emph{generalized min-sum set cover} (GMSSC) problem, where given a collection of hyperedges $E$ with arbitrary covering requirements $\{k_e \in \mathbb{Z}^+ : e \in E\}$, the objective is to find an ordering of the vertices that minimizes the total cover time of the hyperedges. A hyperedge $e$ is considered covered at the first time when $k_e$ of its vertices appear in the ordering. We present a $4.509$-approximation algorithm for GMSSC, improving upon the previous best-known guarantee of $4.642$~\cite[SODA'21]{BansalBFT21}. Our approach retains the general LP-based framework of Bansal, Batra, Farhadi, and Tetali~\cite{BansalBFT21} but provides an improved analysis that narrows the gap toward the lower bound of $4$-approximation assuming P$\neq$NP. Our analysis takes advantage of the constraints of the linear program in a nontrivial way, along with new lower-tail bounds for the sums of independent Bernoulli random variables, which could be of independent interest.
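
To pin down the objective, the sketch below evaluates the total cover time of a vertex ordering for a toy GMSSC instance exactly as defined above; the instance and names are illustrative, and this is the objective being approximated, not the paper's algorithm.

```python
def gmssc_cost(order, hyperedges):
    """Total cover time of `order` for a GMSSC instance.

    order      : list of vertices (a permutation; positions are 1-indexed)
    hyperedges : list of (vertices, k_e) pairs; edge e is covered at the
                 first time k_e of its vertices have appeared in `order`
    """
    position = {v: t for t, v in enumerate(order, start=1)}
    total = 0
    for vertices, k_e in hyperedges:
        times = sorted(position[v] for v in vertices)
        total += times[k_e - 1]  # time at which the k_e-th vertex appears
    return total

# Toy instance: 4 vertices, two hyperedges with covering requirements 1 and 2.
edges = [({1, 2, 3}, 1), ({2, 3, 4}, 2)]
print(gmssc_cost([2, 3, 1, 4], edges))  # covered at times 1 and 2 -> cost 3
```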

Dynamic Rank, Basis, and Matching

from arXiv: Data Structures and Algorithms

Authors: Jan van den Brand, Vishal Kumar, Daniel J. Zhang

We study dynamic algorithms for maintaining fundamental algebraic properties of matrices, specifically, rank, basis, and full-rank submatrices, with applications to maximum matching on dynamic graphs. Prior dynamic algorithms for rank achieve subquadratic update times but scale with the matrix dimension $n$, and could not always maintain the corresponding objects such as a basis or maximum full-rank submatrix. We present the first dynamic rank algorithms whose update time scales with the matrix rank $r$, achieving $\tilde O(r^{1.405})$ time per entry-update and $\tilde O(r^{1.528}+ z)$ per column-update, where $z$ is the number of changed entries. This extends to $\tilde O(|M|^{1.405})$ edge-update time to maintain the size $|M|$ of a maximum matching. We also give dynamic algorithms for maintaining a column-basis subject to column-updates and a maximum full-rank submatrix subject to entry-updates.

Online Steiner Forest with Recourse

from arXiv: Data Structures and Algorithms

Authors: Yaowei Long, Sepideh Mahabadi, Sherry Sarkar, Jakub Tarnawski

In the online Steiner forest problem we are given a graph $G$, and a sequence of terminal pairs $(u_i,v_i)$ which arrive in an online fashion. We are asked to maintain a low-cost subgraph in which each $u_i$ is connected to $v_i$ for all the pairs that have arrived so far. If we are not allowed to delete edges from our solution, then the best possible competitive ratio is $\Theta(\log n)$. In this work, we initiate the study of low-recourse algorithms for online Steiner forest. We give an algorithm that maintains a constant-competitive solution and has an amortized recourse of $O(\log n)$, i.e., inserts and deletes $O(\log n)$ edges per demand on average.

Near-Linear Time Generalized Sinkhorn Algorithms for Bounded Genus Graphs

from arXiv: Data Structures and Algorithms

Authors: Krzysztof Choromanski, Derek Long, Ananya Parashar, Dwaipayan Saha

We present GenusSink, a new class of approximate generalized Sinkhorn algorithms with shortest-path-distance costs for bounded-genus (e.g., planar) graphs, providing near-linear time for (1) pre-processing, (2) each iteration step, and (3) final transport-plan matrix querying, together with near-linear memory. Graphs handled by GenusSink include in particular planar graphs and bounded-genus meshes approximating 3D objects. GenusSink addresses the quadratic total time complexity of its brute-force counterpart by leveraging separator-based graph decompositions, computational geometry techniques, and new results on fast matrix-vector multiplication with generalized distance matrices, using, in particular, Fourier analysis and low displacement rank theory. It is inspired by recent breakthroughs in graph theory on approximating bounded-genus metrics with small-treewidth metrics \citep{minor-free-paper}. The graph-centric approach enables us to target the optimal transport problem with distributions defined on manifolds approximated by weighted graphs and with cost functions given by geodesic distances. We conduct a rigorous theoretical analysis of GenusSink, provide practical implementations leveraging the \textit{separation graph field integrator} (S-GFI) data structures newly introduced in this paper, and present empirical verification. GenusSink provides orders of magnitude more accurate computations than other efficient Sinkhorn algorithms, while still guaranteeing significant computational improvements compared to the baseline. As a by-product of the developed methods, we show that GenusSink is \textbf{numerically equivalent} to the brute-force geodesic Sinkhorn algorithm on $n$-vertex graphs with treewidth $O(\log \log n)$ (e.g., on trees).
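
For readers who have not seen it, the sketch below is the textbook entropic-regularized Sinkhorn iteration that such algorithms accelerate; the dense kernel matrix it forms is precisely the quadratic-time and quadratic-memory bottleneck GenusSink is designed to avoid (this is the standard baseline, not the paper's method).

```python
import numpy as np

def sinkhorn(mu, nu, C, reg=0.1, iters=500):
    """Textbook Sinkhorn iterations for entropic-regularized optimal transport.

    mu, nu : source and target distributions (1-D arrays summing to 1)
    C      : cost matrix, e.g. pairwise shortest-path distances
    Returns a transport plan P with row sums mu and column sums ~nu.
    """
    K = np.exp(-C / reg)  # dense kernel: the quadratic bottleneck
    u = np.ones_like(mu)
    for _ in range(iters):
        v = nu / (K.T @ u)
        u = mu / (K @ v)
    return np.diag(u) @ K @ np.diag(v)

# Toy example on 3 points with a symmetric cost matrix.
C = np.array([[0.0, 1.0, 2.0], [1.0, 0.0, 1.0], [2.0, 1.0, 0.0]])
P = sinkhorn(np.array([0.5, 0.3, 0.2]), np.array([0.2, 0.3, 0.5]), C)
print(P.round(3), P.sum(axis=1).round(3))  # plan and its row marginals
```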

Accelerating Power Method with Fast Sketching for Stronger Low-Rank Approximation

from arXiv: Data Structures and Algorithms

Authors: Shabarish Chenakkod, Michał Dereziński

The power method is one of the most fundamental tools for extracting top principal components from data through low-rank matrix approximation. Yet, when the target rank is large, the cost of matrix multiplication associated with this procedure becomes a major bottleneck. We develop an algorithmic and theoretical framework for accelerating the power method using fast sketching, which is a popular paradigm in randomized linear algebra. Our framework leads to simple and provably efficient methods for singular value decomposition, low-rank factorization, and Nyström approximation, which attain strong numerical performance on benchmark problems. The key novelty in our analysis is the use of regularized spectral approximation, a property of fast sketching methods which proves more flexible in generalizing power method guarantees than traditional arguments.
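
As background, the sketch below is the textbook randomized power (subspace) iteration for low-rank approximation, using a plain Gaussian sketch; it is the baseline flavor of method being accelerated, not the paper's fast-sketching algorithm.

```python
import numpy as np

def randomized_low_rank(A, rank, power_iters=2, seed=0):
    """Textbook randomized power iteration for a rank-`rank` approximation.

    Draw a Gaussian sketch, run a few power iterations to sharpen the
    spectrum, orthonormalize, and project A onto the captured subspace."""
    rng = np.random.default_rng(seed)
    Y = A @ rng.standard_normal((A.shape[1], rank))
    for _ in range(power_iters):
        Y = A @ (A.T @ Y)
    Q, _ = np.linalg.qr(Y)          # orthonormal basis for the sketch
    B = Q.T @ A                     # small projected matrix
    U_small, S, Vt = np.linalg.svd(B, full_matrices=False)
    return Q @ U_small, S, Vt       # approximate truncated SVD of A

# Toy example: an exactly rank-10 matrix is recovered to machine precision.
rng = np.random.default_rng(1)
A = rng.standard_normal((200, 10)) @ rng.standard_normal((10, 120))
U, S, Vt = randomized_low_rank(A, rank=10)
print(np.linalg.norm(A - U @ np.diag(S) @ Vt) / np.linalg.norm(A))
```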

TreeWidzard: An Engine for Width-Based Dynamic Programming and Automated Theorem Proving

from arXiv: Data Structures and Algorithms

Authors: Mateus de Oliveira Oliveira, Sam Urmian

In this work, we introduce TreeWidzard, an engine for developing dynamic programming algorithms that decide graph-theoretic properties parameterized by treewidth and pathwidth. Besides providing a unified framework for algorithms deciding atomic graph-theoretic properties, our engine allows one to combine such algorithms for two purposes: to obtain dynamic programming algorithms for more complex graph properties, and to support treewidth-based automated theorem proving. Within this context, given the specification of a Boolean combination \(P\) of graph properties \(P_1, P_2, \ldots, P_r\), and a positive integer \(k\), our engine can be used to determine whether all graphs of treewidth at most \(k\) satisfy \(P\). The main goal of the present work is to provide a system description of TreeWidzard. In particular, we provide a step-by-step account of how to implement dynamic programming algorithms in our framework and how to combine these algorithms for model checking and automated theorem proving.

Dynamic Edge Coloring of Forests

from arXiv: Data Structures and Algorithms

Authors: Haim Kaplan, David Naori, Yaniv Sadeh

In the \emph{dynamic edge coloring} problem, one has to maintain an edge coloring with at most $\Delta+c$ colors of a graph of maximum degree $\Delta$, given updates to the edges of the graph. An important objective is to minimize the \emph{recourse}, which is the number of edges being recolored. We study this problem on forests, which is a natural yet nontrivial restriction of the problem. We consider the problem in both the \emph{incremental} (edges are only inserted) and \emph{fully dynamic} (edges may be deleted) models. In the deterministic setting, we show that the natural greedy algorithm achieves $O(\frac{1}{c + \sqrt{\Delta}})$ amortized recourse in the incremental model, and this is tight up to tie-breaking. In contrast, in a fully dynamic forest, greedy can be forced to have $\Omega(\log_{\Delta} n)$ amortized recourse. To partially alleviate this limitation of greedy, we show an optimal non-greedy algorithm with $O(1)$ amortized recourse for \emph{rooted} fully dynamic forests and $c = \Delta - 2$. In the randomized setting, we give a natural distribution-maintaining algorithm that achieves $\Theta(\frac{1}{\Delta})$ expected amortized recourse in the incremental model and $\Theta(\min\{\frac{\Delta}{c}, \log_{\Delta} n\})$ expected recourse in the dynamic model. These randomized results are optimal for $c=0$.

A Scalable and Unified Framework to Weighted Rank Aggregation

from arXiv: Data Structures and Algorithms

Authors: Amir Carmel, Debarati Das, Tien-Long Nguyen

The rank aggregation problem seeks to combine multiple rank orderings of the same set of candidates into a single consensus ordering. Such problems arise in diverse domains, including web search, employment, college admissions, and voting. In this work we focus on the 1-median objective: given a set of $m$ rankings over $[n]$, the goal is to compute a ranking that minimizes the sum of its distances to all input rankings. We study rank aggregation under several classical distance metrics: Ulam distance, Spearman's footrule, Hamming distance, and Kendall-tau, as well as their weighted variants. Our contributions begin with a novel unified framework that identifies a key structural property: it suffices to focus on a small subset of rankings, where the corresponding local one-median provides a good approximation to the global median. This principle extends across these distance measures, yielding a general algorithmic framework for weighted rank aggregation. Building on this, we present a new approximation algorithm for rank aggregation under the Ulam distance that scales in the Massively Parallel Computation (MPC) model. Our algorithm computes a $(2-\alpha)$-approximation, for a constant $\alpha>0$, to the 1-median in a constant number of rounds, using local memory sublinear in $n$ and total memory near-linear in $n$. We further design new MPC approximation algorithms for Spearman's footrule and for the element-weighted variants of Hamming and Kendall-tau distances. For each metric, we obtain a $(2-\zeta)$-approximation, for a constant $\zeta>0$, to the 1-median in a constant number of rounds, using local memory sublinear in $n$ and total memory linear or near-linear in $n$. Moreover, for the Ulam distance, we simplify and strengthen the analysis of Chakraborty et al., obtaining an improved 1.968-approximation that further extends to the weighted setting.
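
For concreteness, the sketch below computes two of the distances above, Spearman's footrule and Kendall-tau, between two rankings of $[n]$ using their standard definitions; it is unrelated to the paper's algorithms.

```python
from itertools import combinations

def spearman_footrule(sigma, tau):
    """Sum over items of |position in sigma - position in tau|."""
    pos_s = {item: i for i, item in enumerate(sigma)}
    pos_t = {item: i for i, item in enumerate(tau)}
    return sum(abs(pos_s[x] - pos_t[x]) for x in sigma)

def kendall_tau(sigma, tau):
    """Number of item pairs ordered differently by the two rankings."""
    pos_t = {item: i for i, item in enumerate(tau)}
    return sum(1 for i, j in combinations(range(len(sigma)), 2)
               if pos_t[sigma[i]] > pos_t[sigma[j]])

a, b = [1, 2, 3, 4], [2, 1, 4, 3]
print(spearman_footrule(a, b), kendall_tau(a, b))  # 4 and 2
```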

Deterministically finding an element of large order in $\mathbb{Z}_N^*$

from arXiv: Data Structures and Algorithms

Authors: Itamar Nir

In this paper, we present an improvement for the problem of deterministically finding an element of large multiplicative order modulo some integer $N$. This problem arises as a key subroutine in current deterministic factoring algorithms, such as those proposed by Harvey and Hittmeir [Mathematics of Computation, 2021]. Specifically, let $D > \exp\left(\sqrt{2\log N \log \log N}\right)$. We give a deterministic algorithm that does one of the following: returns an element $a \in \mathbb{Z}_N^*$ with $\operatorname{ord}_N(a) > D$; returns a non-trivial factor of $N$; or reports that $N$ is prime. The running time of our algorithm is $O(D^{1/2 + o(1)})$. Similar results were independently and concurrently obtained by Harvey and Hittmeir [arXiv:2601.11131, 2026] in work that appeared while this manuscript was in preparation. Prior to these works, the best known algorithm for finding an element with order larger than $D$ was given by Oznovich and Volk [SODA 2026], requiring $D > N^{\frac{1}{6}}$. We also present a simpler algorithm that applies for any $D < N$ and runs in $O(D^{2.5+o(1)}\operatorname{polylog}(N))$ time.
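
As a reminder of the quantity involved, the sketch below computes the multiplicative order $\operatorname{ord}_N(a)$ by naive iteration; this takes exponential time in general and only illustrates the object that the deterministic algorithm certifies to exceed $D$.

```python
from math import gcd

def multiplicative_order(a, N):
    """Smallest t >= 1 with a^t = 1 (mod N); requires gcd(a, N) = 1."""
    if gcd(a, N) != 1:
        raise ValueError("a must be a unit modulo N")
    t, x = 1, a % N
    while x != 1:
        x = (x * a) % N
        t += 1
    return t

print(multiplicative_order(2, 101))  # 100: 2 is a primitive root mod 101
```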

Computing Flows in Subquadratic Space

from arXiv: Data Structures and Algorithms

Authors: Jan van den Brand, Zhao Song, Albert Weng

Space complexity is a critical factor in various computational models, including streaming, parallel/distributed computing, and communication complexity. We study the space complexity of the minimum-cost flow problem, a generalization of the $st$-max flow problem, focusing on computing flows in subquadratic space. In the general case with arbitrary capacities, minimum-cost and $st$-maximum flows can use up to $\Omega(n^2)$ edges, so computing the flow on each edge (rather than just the size/cost) seems impossible in subquadratic space. Indeed, there are lower bounds proving quadratic space is needed to store the flow on every edge, which has been used to prove lower bounds on streaming algorithms. However, we show that these lower bounds can be circumvented, opening up improvements for streaming and communication complexity. For a directed graph with integer capacities and costs bounded by $W$, we provide a $\tilde O(n^{1.5}\log (W/\varepsilon))$-space, $\tilde O(\sqrt{n} \log(W/\varepsilon))$-pass streaming algorithm, which during the last pass returns the flow on each edge up to an additive error of $\varepsilon$. Crucially, the algorithm does not return the flow at the end of the last pass but returns the flow on an edge as the edge is read in the stream. This allows us to circumvent existing $\Omega(n^2)$ space lower bounds. In the 2-party communication model, our algorithm implies $\tilde O(n^{1.5}\log^2 W)$ bits of communication.

Positional LSH: Binary Block Matrix Approximation for Attention with Linear Biases

from arXiv: Data Structures and Algorithms

Authors: Daniel Wolfson, Tal Wagner

Positional encoding in transformers is commonly implemented through positional embeddings, attention masks, or bias terms, but formal connections between these mechanisms remain limited. We study attention with positional bias through the lens of locality-sensitive hashing (LSH), focusing on Attention with Linear Biases (ALiBi). We show that the ALiBi bias matrix is the expectation of contiguous block-diagonal binary masks induced by a ``positional LSH'' scheme. The empirical mean of masks sampled from this scheme yields spectral norm and max-norm approximation guarantees with bounded block sizes with high probability. This structural theorem implies a uniform approximation theorem for ALiBi-biased attention: with high probability over the sampled masks, the approximate attention output is accurate simultaneously for all query-key-value inputs and can be computed in near-linear time in the context length, reducing long-context ALiBi to a collection of randomized short-context regular (positionally unbiased) attention operations. Conceptually, this connects positional bias, masks, and positional embeddings in a single formal framework and suggests an approach to efficient ALiBi-biased attention. Experiments on large language models validate our theoretical findings.
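
To fix notation, ALiBi adds to each causal attention score a penalty proportional to the query-key distance; the sketch below constructs that bias matrix for a single head (the slope and sizes are illustrative, and the block-mask sampling analyzed in the paper is not shown).

```python
import numpy as np

def alibi_bias(context_len, slope):
    """Causal ALiBi bias: entry (i, j) is -slope * (i - j) for j <= i,
    and -inf above the diagonal so future positions are masked out."""
    i = np.arange(context_len)[:, None]
    j = np.arange(context_len)[None, :]
    bias = -slope * (i - j).astype(float)
    bias[j > i] = -np.inf
    return bias  # added to the attention score matrix before the softmax

print(alibi_bias(5, slope=0.5))
```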

Learning-Augmented Scalable Linear Assignment Problem Optimization via Neural Dual Warm-Starts

from arXiv: Data Structures and Algorithms

Authors: Ilay Yavlovich, Jad Agbaria, Muhamed Mhamed, Jose Yallouz, Nir Weinberger

The Linear Assignment Problem (LAP) is a fundamental combinatorial optimization task with applications ranging from computer vision to logistics. Classical exact solvers such as the Hungarian and Jonker-Volgenant (LAPJV) algorithms guarantee optimality, but their cubic time complexity $\mathcal{O}(N^{3})$ becomes a bottleneck for large-scale instances. Recent learning-based approaches aim to replace these solvers with neural models, often sacrificing exactness or failing to scale due to memory constraints. We propose a learning-augmented framework that accelerates exact assignment solvers while maintaining optimality and worst-case guarantees. Our method predicts dual variables to warm-start a classical solver, with a fallback that prevents asymptotic runtime degradation when the learned advice is unreliable. We introduce RowDualNet, a lightweight row-independent architecture that avoids the $\mathcal{O}(N^{2})$ memory bottleneck of graph-based models, enabling neural warm-starting at large scale ($N=16{,}384$). Feasibility is ensured via a constructive mechanism based on LP duality (namely, the Min-Trick), eliminating costly iterative projection. Empirically, our approach reduces the search effort of LAPJV and achieves over $2{\times}$ speedups on challenging synthetic distributions, in addition to improving over $1.25{\times}$ and $1.5{\times}$ on real-world tracking (MOT) and transportation (LPT) datasets, respectively, while strictly maintaining full optimality, effectively yielding a robust zero-shot generalization to real-world tasks.

Equitable Colorings of Vertex-Weighted Graphs

from arXiv: Data Structures and Algorithms

Authors: Siddharth Barman, Vignesh Viswanathan

We study a generalization of the classical Hajnal-Szemerédi theorem to vertex-weighted graphs. Given a graph with nonnegative vertex weights, a coloring is called $\alpha$-approximately equitable up to one vertex ($\alpha$-EQ1) if, for each color class, the total weight remaining after removing its maximum-weight vertex is at most $\alpha \geq 1$ times the weight of any other color class. For vertex-weighted graphs with maximum degree $\Delta$, we show that there exist instances for which no $k$-coloring is $\alpha$-EQ1 for any $k < \frac{3\Delta}{2}$ and $\alpha < \sqrt{2}$. In light of this impossibility, we relax these parameters and establish the following results for any vertex-weighted graph $G$ with maximum degree $\Delta$: (1) for any $\varepsilon \in (0,1)$ and all $k \geq (\frac{c}{\varepsilon^2}\ln{\frac{1}{\varepsilon}}) \Delta$, there exists a $(1 + \varepsilon)$-EQ1 $k$-coloring of $G$, where $c$ is a fixed constant; and (2) for all $k \ge \Delta + 1$, there exists a $2$-EQ1 $k$-coloring of $G$. Furthermore, such equitable colorings can be computed in polynomial time. En route to our results on equitability under vertex weights, we establish sufficient conditions for the existence of $k$-colorings that are equitable with respect to any given partition of the vertex set. Our coloring results correspond to fairness guarantees in a constrained fair division setting and lead to concentration inequalities for partly dependent random variables.
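
The $\alpha$-EQ1 condition is purely combinatorial and can be checked directly from the definition above; a minimal sketch of such a check, with all names illustrative:

```python
def is_alpha_eq1(weights, coloring, alpha):
    """Check alpha-EQ1: for every color class, its total weight minus its
    heaviest vertex is at most alpha times every other class's total weight.

    weights  : dict vertex -> nonnegative weight
    coloring : dict vertex -> color
    """
    classes = {}
    for v, c in coloring.items():
        classes.setdefault(c, []).append(weights[v])
    totals = {c: sum(ws) for c, ws in classes.items()}
    for c, ws in classes.items():
        reduced = totals[c] - max(ws)  # drop the maximum-weight vertex
        if any(reduced > alpha * totals[d] for d in classes if d != c):
            return False
    return True

w = {0: 3.0, 1: 1.0, 2: 2.0, 3: 2.0}
print(is_alpha_eq1(w, {0: "red", 1: "red", 2: "blue", 3: "blue"}, alpha=1.0))
```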

Witness-Sensitive Detection of Induced Diamonds

from arXiv: Data Structures and Algorithms

Authors: Keren Censor-Hillel, Tomer Even, Virginia Vassilevska Williams, Nathan Wallheimer

We provide a fast \emph{witness-sensitive} algorithm for detecting an induced diamond (a $K_4$ minus an edge) in an $n$-vertex graph containing $t$ induced diamonds. Our algorithm runs in time $\tilde{O}(\min(n^{2.425}/t^{0.25}+n^2, n^{\omega}))$ with high probability, improving upon the prior state-of-the-art (witness-oblivious) algorithm that runs in time $O(n^{\omega}\log{n})$ [Vassilevska Williams, Wang, Williams, Yu, SODA 2014] whenever $t \geq n^{(3-\omega)/3}$, where $\omega < 2.372$ is the matrix multiplication exponent. Our key insight is that the size of a clique containing one of the triangles of an induced diamond plays a crucial role in detecting such a diamond. We say that a diamond is $r$-heavy if this size is at least $r$, and we provide a fast detection algorithm for $r$-heavy diamonds running in $\tilde{O}(r \cdot (n/r)^{\omega} + (n/r)^3 + nr)$ time. When there are no $r$-heavy diamonds, we provide a different fast detection algorithm running in $\tilde{O}(\mathsf{MM}(n,n,n\sqrt{r/t}))$ time, where $\mathsf{MM}(a,b,c)$ denotes the time to multiply an $a \times b$ matrix by a $b \times c$ matrix, which is conditionally optimal for $r=\tilde{O}(1)$. Our main technical contribution is in designing a refinement framework for sampling vectors, which allows sampling vertices for detecting diamonds in a manner that is adaptive to the structure of graphs with no $r$-heavy diamonds. We establish that our technique is widely applicable by showing how it also yields faster witness-sensitive algorithms for $4$-SUM and for a special case of $4$-cycles.
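
For reference, an induced diamond is a set of four vertices spanning exactly five edges ($K_4$ minus one edge); the brute-force detector below follows that definition and is purely illustrative, unrelated to the paper's fast algorithm.

```python
from itertools import combinations

def has_induced_diamond(n, edges):
    """Return True iff some 4 vertices induce exactly 5 edges (K_4 minus an edge)."""
    adj = [[False] * n for _ in range(n)]
    for u, v in edges:
        adj[u][v] = adj[v][u] = True
    for quad in combinations(range(n), 4):
        induced = sum(adj[a][b] for a, b in combinations(quad, 2))
        if induced == 5:
            return True
    return False

# K_4 minus the edge {0, 3}: vertices 0..3 induce exactly 5 edges.
print(has_induced_diamond(4, [(0, 1), (0, 2), (1, 2), (1, 3), (2, 3)]))  # True
```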

Node-Weighted Triangles: Faster and Simpler

from arXiv: Data Structures and Algorithms

Authors: Shyan Akmal, Nick Fischer

Weighted variants of triangle detection are an important object of study because of their prominence in fine-grained complexity. We revisit the Node-Weighted Triangle problem, where the goal is to decide if a vertex-weighted graph contains a triangle whose node weights sum to zero. This problem has been the focus of a celebrated line of work, beginning with a subcubic-time algorithm [Vassilevska, Williams; STOC '06], and culminating in algorithms running almost in matrix multiplication time, $O(\textsf{MM}(n) + n^2\cdot 2^{O(\sqrt{\log n})})$ [Czumaj, Lingas; SODA '07], [Vassilevska W., Williams; STOC '09]. This runtime is almost-optimal, since even detecting an unweighted triangle is conjectured to require matrix multiplication time $\textsf{MM}(n)$. However, the superpolylogarithmic $2^{Ω(\sqrt{\log n})}$ overhead persists in a world where near-optimal matrix multiplication is possible (i.e., $\textsf{MM}(n) \leq n^2\text{poly}(\log n)$). In this paper, we present a new algorithm solving Node-Weighted Triangle in $O(\textsf{MM}(n))$ time, closing the gap to unweighted triangle detection completely. Remarkably, our algorithm is much simpler than previous approaches, which use involved recursion schemes and communication protocols.
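
For context, here is a small sketch of ours (not from the paper) contrasting the matrix-multiplication baseline that the result matches with a naive statement of the node-weighted problem: an unweighted triangle exists iff $A^2$ and $A$ share a nonzero entry, while the weighted question additionally asks that the three node weights sum to zero.

```python
import numpy as np

def has_triangle_mm(A):
    """Unweighted triangle detection via one matrix product:
    a triangle exists iff (A @ A) and A have a common nonzero entry."""
    A = np.asarray(A, dtype=np.int64)
    return bool(((A @ A) * A).any())

def zero_weight_triangle_naive(A, w):
    """Brute-force O(n^3) search for a triangle i<j<k with w[i]+w[j]+w[k] == 0.
    Only illustrates the problem statement; the paper solves it in O(MM(n)) time."""
    n = len(w)
    for i in range(n):
        for j in range(i + 1, n):
            if A[i][j]:
                for k in range(j + 1, n):
                    if A[i][k] and A[j][k] and w[i] + w[j] + w[k] == 0:
                        return (i, j, k)
    return None
```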

Online Matrix Factorization, Online Private Query Release, and Online Discrepancy Minimization

from arXiv: Data Structures and Algorithms

Authors: Aleksandar Nikolov, Haohua Tang, Jonathan Ullman

In this paper we consider several related online computation problems. First, we study answering sequences of statistical queries that arrive online and must be answered, with differential privacy, immediately upon arrival. Known matrix factorization mechanisms can answer a set of statistical queries with error bounded by the $γ_2$ norm of their query matrix, but require that all queries are known in advance. We show that nearly the same error bounds can be achieved in the online setting for non-adaptively chosen queries. To do so, we give an online factorization algorithm that competitively matches the best offline factorization up to logarithmic factors. In the online matrix factorization problem, a new row $q_t$ of a matrix arrives at each time step $t$, and the algorithm needs to maintain a factorization $L_tR_t=Q_t$: at each time it appends some rows to $R_t$ and outputs a new row $\ell_t$ such that $\ell_tR_t=q_t$. Our algorithm maintains competitiveness throughout this online process, even if the number of rows to arrive is unknown. As another application, we give an online discrepancy minimization algorithm whose discrepancy is competitive against the $γ_2$ norm (and also against hereditary discrepancy) up to logarithmic factors.
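
To make the online interface concrete, here is a deliberately trivial sketch of ours (not the paper's algorithm): appending $q_t$ itself to $R_t$ and outputting an indicator row always maintains $\ell_t R_t = q_t$, but it gives no control over the norms that the $γ_2$-competitive analysis cares about. The paper's contribution is achieving competitiveness with the best offline factorization, even without knowing the number of rows in advance.

```python
import numpy as np

class TrivialOnlineFactorization:
    """Maintains ell_t @ R_t == q_t as rows arrive, by the trivial rule
    "append q_t to R_t and output the matching indicator row".
    Illustrates only the online interface, not a competitive factorization."""
    def __init__(self, d):
        self.R = np.empty((0, d))          # rows appended over time

    def process(self, q_t):
        self.R = np.vstack([self.R, q_t])  # append one new row to R_t
        ell_t = np.zeros(self.R.shape[0])
        ell_t[-1] = 1.0                    # ell_t @ R_t reproduces q_t exactly
        return ell_t

# Earlier outputs stay valid if zero-padded to the current number of rows of R_t.
```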

Search and evacuation with a near majority of faulty agents

from arXiv: Data Structures and Algorithms

Authors: J. Czyzowicz, R. Killick, E. Kranakis, G. Stachowiak

There are $n\geq 3$ unit-speed mobile agents placed at the origin of the infinite line. In as little time as possible, the agents must find and evacuate from an exit placed at an initially unknown location on the line. The agents can communicate wirelessly to facilitate the evacuation (i.e., by announcing the target's location when it is found). However, among the agents is a subset of at most $f$ crash-faulty agents who may fail to announce the target when they visit its location. In this paper we study this problem for the specific case that $n=2f+1$. We introduce a novel type of search algorithm and analyze its competitive ratio -- the supremum, over all possible target locations, of the ratio of the time the agents take to evacuate to the initial distance between the agents and the target. In particular, we demonstrate that the competitive ratio of evacuation is at most $7.437011$ for $(n,f)=(3,1)$; at most $7.253767$ for $(n,f)=(5,2)$ and $(7,3)$; and at most $7.147026$ for $(n,f)=(9,4)$. For larger values of $n=2f+1$ we prove an asymptotic upper bound of $4+2\sqrt{2}$. We also adapt our evacuation algorithm for $(n,f)=(3,1)$ to the problem of search by three agents with one Byzantine fault, i.e., the faulty agent may also lie about finding the target. In doing so we improve the best known upper bound on this search problem from 8.653055 to 7.437011.
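
For orientation (our arithmetic, not stated in the abstract): the asymptotic bound evaluates to $4+2\sqrt{2} \approx 6.8284$, so the finite-case guarantees $7.437011 > 7.253767 > 7.147026$ decrease toward it as $f$ grows.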

Monday, May 11

TR26-074 | Improved analysis of list-decodability of random linear codes: It’s all about counting constraints | Rohan Goyal, Venkatesan Guruswami

from ECCC Papers

List-decoding and list recovery ask how much corruption or uncertainty a code can tolerate while still keeping the number of plausible codewords small. For large-alphabet codes, the ultimate benchmark for list-decoding is the ($\epsilon$-relaxed) generalized Singleton bound, which, at rate $R$, targets list-of-$L$ decoding up to radius $\frac{L}{L+1}(1-R-\epsilon)$. We prove improved alphabet-size bounds for random linear and additive (folded) codes in this regime. Specifically, we show that random $s$-folded codes over any finite field $\mathbb F_q$ with $s=\Omega(1/\epsilon)$ meet the $\epsilon$-relaxed generalized Singleton bound for all list sizes $L$, matching the optimal $\exp(\Theta(1/\epsilon))$ dependence on the alphabet size. For random linear codes, we show that alphabet size $\exp(O(\log L/\epsilon))$ suffices, improving the previous $\exp(O(L/\epsilon))$ bound. In the important regime of $L=\Theta(1/\epsilon)$, where one list-decodes up to radius $(1-R-\epsilon)$, this improves the alphabet size from $\exp(O(1/\epsilon^2))$ to $\exp(\widetilde O(1/\epsilon))$ for random linear codes. For list recovery, we close the gap between the two best previous tradeoffs: prior work achieved either polynomial alphabet size in $\ell$ or near-optimal output list size, but not both simultaneously. We show that random linear codes achieve near-optimal output list size $(\ell/(R+\epsilon))^{O(R/\epsilon+1)}$ over alphabet size $(\ell/(R+\epsilon))^{O((R+\epsilon)/\epsilon^2)}$, which is polynomial in $\ell$. Our gains stem from isolating the right combinatorial tools to count constraints, and identifying canonical configurations, avoiding which suffices for list-decoding or list-recovery. For list-decoding, we combine tools from weakly-partition-connected agreement hypergraphs with the partition structure implicit in recent subspace-design arguments to count only partition-induced local profiles, capturing the genuinely new linear constraints in a bad witness. For list recovery, we pair a reworked local coordinate-wise linear framework with discrete Brascamp--Lieb inequalities to quotient arbitrary bad configurations to minimal profiles. Together, our methods yield modular techniques and a general framework for improving the analysis of random linear codes across a broad range of settings, instantiated concretely here for list-decoding and list-recovery. Additionally, our presentation is self-contained and fully develops and proves all necessary ingredients.
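
As a sanity check on the regime $L=\Theta(1/\epsilon)$ mentioned above (our arithmetic, not from the paper), any list size with $L+1 \ge 1/\epsilon$ gives

$$\frac{L}{L+1}(1-R-\epsilon) = \Bigl(1-\tfrac{1}{L+1}\Bigr)(1-R-\epsilon) \;\ge\; (1-\epsilon)(1-R-\epsilon) \;\ge\; 1-R-2\epsilon,$$

so the $\epsilon$-relaxed generalized Singleton radius is already $1-R-O(\epsilon)$ at list sizes of order $1/\epsilon$, which is the radius quoted for random linear codes in that regime.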

Searches Are Weird! No they're not! Bad coding style?

from Computational Complexity

In David Marcus's guest post on good coding style (see here), he reviewed a book from 1986 called "Professional Pascal."

I wondered if it was still in print and could be bought:

1) I went to Amazon and searched all products for Professional Pascal. I got this, which is not that book.

2) I then restricted to books, and I got the same, though later on the page I got a relevant book, here.

3) I then searched for Professional Pascal on Google, and got the Amazon site for the book here.

David thought this was weird. I did not. As I put it:

A computer does something which makes no sense. This is common, hence it's not weird.

Why did the search from outside of Amazon do better than the search inside of Amazon?

Speculation

1) Search is just a really hard problem.

2) The coders at Amazon did not use good coding style. They should read the book. If they can find it. 

By gasarch

The Quantification Trap

from Ben Recht

A computational paradox of the postmodern condition.

If we want to make decisions in a complex society, we need a shared language. Experts on the ground must summarize complex situations in their communication with decision makers decoupled from the field. They need to make their experiences legible to those they report to.

The easiest way to make situations legible is to quantify them. To count things, record figures in tables, compute statistics, and make charts. Quantification sorts complexity into simple bins, simplifying communication both up and down the chain.

When we speak in such quantified numerical summaries, our statements feel objective. We believe that appropriate quantification isn't subject to the whims and opinions of an individual field worker. Once standards are agreed upon, quantified measurements become scientific facts.

Once we have objectivity, we have authority. Making decisions based on objective facts is obviously in the best interest of everyone else, and we impose threats of chastisement, ostracism, or violence upon those irrational individuals who disagree.

And once these numerical summaries that we made out of whole cloth to simplify communication become authoritative, they become real. They become things we should strive to maximize.

This is the quantification trap.

The quantification trap is social-scientific canon. You could build this story entirely out of texts written before the year 2000. The role of quantification, measurement, and legibility in statecraft is laid out by James C. Scott's Seeing Like a State (1998) and Alain Desrosières's The Politics of Large Numbers (1993). Theodore Porter's Trust in Numbers (1995) highlights the turn to quantification in pursuit of standardization and objectivity. The blind optimization of decontextualized metrics is core to Jean-François Lyotard's characterization of The Postmodern Condition (1979).

Twenty-five years into the twenty-first century, I don’t think you should have to run a Science and Technology Studies sidequest to recognize the quantification trap. It’s obvious and almost trite when we say it out loud. It’s trendy to talk about how metrics and benchmarks are bad and to prattle on endlessly about Goodhart’s, Campbell’s, or Murphy’s Laws. And yet, we continue to organize ourselves around statistical summaries. Is the quantification trap an inevitable part of scale? Is it an inevitable part of efficiency? Is it an inevitable part of the dismal hierarchy of bureaucratic power? The great puzzle of our contemporary condition is why it’s so hard to escape.

Part of the puzzle is that making society computable has dramatic benefits paired with every cost. The constant tension in mathematical rationality lies in the interplay between its sweet spots and its limitations. The quantification trap creates an intersubjectivity for collective action. Mathematically rational governance lets systems and hierarchies see, but also makes it easy to maintain their control. It facilitates posing clear questions and objectives, though it crowds out nuance and multiplicity. It creates a shared understanding of standardization but removes the discretion of experts. It lets us speak about maximizing the average welfare of populations, but erases individuals.

If there are such clear trade-offs with quantification, why do we always tend to side with "the data"? The acceleration of computation has made the quantification trap exponentially more contagious. As computers became ubiquitous, the quantification process became inevitable and invisible. We don't think about how we are tethered to unfathomable computing machines. They're just part of who we are now. Our devices measure us all the time, recording time-on-site and click-throughs. Everything has a like button. All of these measurements are churned upon by data scientists hoping to hit their personal promotion metrics, regardless of whether the instrumentation means anything. The quantification trap is built out of an invisible fabric of computation.

I feel like I say this in the book, but never say it in the Irrational Decision. The book articulates the role of mathematical computation, optimization, and statistics as scaffolding in the elaborate quantification trap. To understand why we optimize what we optimize, it’s helpful to look at the history of computational methods and language boxing us in. The path from legibility to authority goes straight through computation and computerization. Quantification transforms experience into machine-readable data and a small number of interventions and outcomes. Decisions can only be automated once we throw away the messy, uncomputable parts. We maximize averages because it’s a convenient way to model uncertainty.

Now, I am far from the only person to talk about the quantification trap. I wrote about it today because I felt I needed this placeholder after the last few weeks of talking about my book. However, if you want a reading list from the past 25 years, I can write us an impossibly long bibliography. Even in the past year, crossover books like Healy and Fourcade's The Ordinal Society and Nguyen's The Score have articulated the same conundrum.

It’s good that more people are talking about this. What we count, compute, and optimize is a political decision. Counting flattens complexity, and the choice of what is left is a question of power. The virality of the quantification trap forecloses better futures. We can’t strive for them if we can’t see the gilded cage we’re in.

By Ben Recht

TR26-073 | An Algorithmic Proof of Kruskal’s Tensor Decomposition Theorem | Vishwas Bhargava, Leonard Schulman, Shiri Sivan

from ECCC Papers

A famous theorem of Kruskal gives the simplest and arguably most fundamental criterion under which a tensor is guaranteed a unique minimum-rank decomposition. Kruskal's condition requires that the sum of the Kruskal ranks $\{k_i\}_{i=1}^m$ of the components satisfies $\sum_{i \in [m]} k_i \ge 2r + m - 1$, where $r$ denotes the rank and $m$ the order of the tensor. However, Kruskal's original proof and subsequent simplifications/generalizations have remained non-constructive. With the sole exception of the case $(k_1=r,k_2=r,k_3=2)$, attributed to Jennrich, no algorithm has been established for decomposing tensors under the Kruskal condition without additional assumptions. In fact, whether there exists an efficient algorithm for decomposing a tensor under the Kruskal condition was explicitly posed as an open problem in the work of Bhaskara et al. (COLT 2014). Even slight variations of the Jennrich special case, such as the $(r, r-1, 3)$ case, have remained algorithmically open; specifically, no sub-exponential time bound was known. In this work, we make progress on this problem by giving an elementary, constructive proof of Kruskal's Theorem for general $m$-way tensors. Concretely, we present a randomized algorithm that decomposes any tensor satisfying the Kruskal condition by utilizing random projections to map the problem into a geometry of intersecting hyperplanes via a MinRank instance. Specifically, for $3$-way tensors satisfying $k_1+k_2+k_3=2r+2$, the algorithm achieves a runtime of $n^{O(k)}$, where $k = \min(k_1,k_2,k_3)$. Thus, we extend smoothly beyond the Jennrich special case, achieving polynomial-time complexity for any family of tensors that satisfies the Kruskal condition, provided the least Kruskal rank is bounded.
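
To connect the statements above (our observation, spelled out from the abstract's own quantities): for a $3$-way tensor ($m=3$) the Kruskal condition reads $k_1+k_2+k_3 \ge 2r+2$, and both the Jennrich setting $(k_1,k_2,k_3)=(r,r,2)$ and the previously open $(r,r-1,3)$ setting sum to exactly $2r+2$. The runtime statement for $k_1+k_2+k_3=2r+2$ therefore covers precisely these tight instances of the condition.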

On the Complexity of Discounted Robust MDPs with $L_p$ Uncertainty Sets

from arXiv: Computational Complexity

Authors: Ali Asadi, Krishnendu Chatterjee, Alipasha Montaseri, Ali Shafiee

A basic model in sequential decision making is the Markov decision process (MDP), which is extended to Robust MDPs (RMDPs) by allowing uncertainty in transition probabilities and optimizing against the worst-case transition probabilities from the uncertainty sets. The class of $(s, a)$-rectangular RMDPs with $L_p$ uncertainty sets provides a flexible and expressive model for such problems. We study this class of RMDPs with a discounted-sum cost criterion and a constant discount factor. The existence of an efficient algorithm for this class is a fundamental theoretical question in optimization and sequential decision making. Previous results only establish a strongly polynomial-time algorithm for $L_\infty$ uncertainty sets. In this work, our main results are as follows: (a)~we show that for any compact uncertainty set, the policy iteration algorithm for RMDPs is strongly polynomial with oracle access to solutions of Robust Markov chains (RMCs); (b)~we present strongly polynomial-time bounds on the policy iteration algorithm for RMCs with $L_1$ and $L_\infty$ uncertainty sets; and (c)~we establish hardness results for RMCs with $L_p$ uncertainty sets for integer $p$ satisfying $1
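
To illustrate the $(s,a)$-rectangular model, here is a minimal sketch of ours, assuming an $L_\infty$ ball of radius $\beta$ around a nominal kernel (this is not the paper's policy-iteration analysis, and the names `worst_case_expectation_linf`, `robust_bellman_backup`, `beta`, and `gamma` are our own). Rectangularity means the adversary's inner problem decouples across state-action pairs into a small linear program over the simplex intersected with a box, which can be solved by greedily shifting probability mass from low-value to high-value successors.

```python
import numpy as np

def worst_case_expectation_linf(p_hat, v, beta):
    """Adversary's inner problem for one (s,a) pair with an L_inf ball:
    maximize p . v over distributions p with |p - p_hat|_inf <= beta.
    Solved greedily by moving mass from low-value to high-value states."""
    p = np.asarray(p_hat, dtype=float).copy()
    order = np.argsort(-np.asarray(v))        # successor states by value, descending
    recv = np.minimum(beta, 1.0 - p[order])   # how much each state may gain
    give = np.minimum(beta, p[order])         # how much each state may lose
    i, j = 0, len(order) - 1
    while i < j:
        delta = min(recv[i], give[j])
        p[order[i]] += delta
        p[order[j]] -= delta
        recv[i] -= delta
        give[j] -= delta
        if recv[i] == 0:
            i += 1
        if give[j] == 0:
            j -= 1
    return p @ v

def robust_bellman_backup(P_hat, C, V, beta, gamma):
    """One robust Bellman update for a cost-minimizing agent:
    Q(s,a) = C[s,a] + gamma * max_{p in U(s,a)} p . V, then minimize over a."""
    S, A, _ = P_hat.shape
    Q = np.empty((S, A))
    for s in range(S):
        for a in range(A):
            Q[s, a] = C[s, a] + gamma * worst_case_expectation_linf(P_hat[s, a], V, beta)
    return Q.min(axis=1)
```

Iterating this backup is robust value iteration; the paper's results concern policy iteration for such RMDPs, given oracle access to the corresponding robust Markov chain evaluations.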
