
Thank you to arXiv for use of its open access interoperability.

Note: the date of arXiv entries announced right after publication holidays might incorrectly show up as the date of the publication holiday itself. This is due to our ad hoc method of inferring announcement dates, which are not returned by the arXiv API.

Powered by Pluto.

Source on GitHub.

Maintained by Nima Anari, Arnab Bhattacharyya, Gautam Kamath.

Theory of Computing Report

Friday, May 01

Computing Equilibrium beyond Unilateral Deviation

from arXiv: Computational Complexity

Authors: Mingyang Liu, Gabriele Farina, Asuman Ozdaglar

Most familiar equilibrium concepts, such as Nash and correlated equilibrium, guarantee only that no single player can improve their utility by deviating unilaterally. They offer no guarantees against profitable coordinated deviations by coalitions. Although the literature proposes solution concepts that provide stability against multilateral deviations (\emph{e.g.}, strong Nash and coalition-proof equilibrium), these generally fail to exist. In this paper, we study an alternative solution concept that minimizes coalitional deviation incentives, rather than requiring them to vanish, and is therefore guaranteed to exist. Specifically, we focus on minimizing the average gain of a deviating coalition, and extend the framework to weighted-average and maximum-within-coalition gains. In contrast, the minimum-gain analogue is shown to be computationally intractable. For the average-gain and maximum-gain objectives, we prove a lower bound on the complexity of computing such an equilibrium and present an algorithm that matches this bound. Finally, we use our framework to solve the \emph{Exploitability Welfare Frontier} (EWF), the maximum attainable social welfare subject to a given exploitability (the maximum gain over all unilateral deviations).
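
Exploitability, the quantity that parameterizes the EWF, can be made concrete on a toy game. The sketch below is our own illustration (not the paper's algorithm): it computes the maximum unilateral-deviation gain of a mixed-strategy profile in matching pennies.

```python
# Exploitability of a strategy profile in a 2x2 zero-sum game
# (matching pennies). Toy example only; the paper's framework
# concerns coalitional deviation gains built on this notion.

A = [[1.0, -1.0], [-1.0, 1.0]]   # row player's payoffs; column gets -A

def expected(M, x, y):
    return sum(x[i] * M[i][j] * y[j] for i in range(2) for j in range(2))

def exploitability(x, y):
    """Maximum gain any single player obtains by best-responding."""
    row_val = expected(A, x, y)
    col_val = -row_val                                # zero-sum
    row_br = max(sum(A[i][j] * y[j] for j in range(2)) for i in range(2))
    col_br = max(sum(-A[i][j] * x[i] for i in range(2)) for j in range(2))
    return max(row_br - row_val, col_br - col_val)

uniform = [0.5, 0.5]
print(exploitability(uniform, uniform))        # 0.0: uniform is a Nash eq.
print(exploitability([1.0, 0.0], uniform))     # 1.0: pure play is exploitable
```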

Superpolynomial Length Lower Bounds for Tree-Like Semantic Proof Systems with Bounded Line Size

from arXiv: Computational Complexity

Authors: Susanna F. de Rezende, David Engström, Yassine Ghannane, Kilian Risse

We prove superpolynomial length lower bounds for the semantic tree-like Frege refutation system with bounded line size. Concretely, for any function $n^{2-\varepsilon} \leq s(n) \leq 2^{n^{1-\varepsilon}}$ we exhibit an explicit family $\mathcal{A}$ of $n$-variate CNF formulas $A$, each of size $|A| \le s(n)^{1+\varepsilon}$, such that if $A$ is chosen uniformly from $\mathcal{A}$, then asymptotically almost surely any tree-like Frege refutation of $A$ in line-size $s(n)$ is of length super-polynomial in $|A|$. Our lower bounds apply also to tree-like degree-$d$ threshold systems, for $d \approx \log\bigl(s(n)\bigr)$, that is, for $d$ up to $n^{1-\varepsilon}$. More generally, our lower bounds apply to the semantic version of these systems and to any semantic tree-like proof system where the number of distinct lines is bounded by $\exp\bigl(s(n)\bigr)$.

On the Principal Minor Expansion and Complexity of the Symmetrized Determinant

from arXiv: Computational Complexity

Authors: Sanyam Agarwal, Markus Bläser, Mridul Gupta

Barvinok introduced the symmetrized determinant ($\sdet$) as a \emph{non-commutative} analogue of the determinant. Intuitively, given a square matrix over an associative algebra, we can obtain the symmetrized determinant by averaging over all possible multiplication orders in the Leibniz formula for the determinant. He used the symmetrized determinant to design algorithms estimating the permanent of a matrix. To this end, he showed that there is an $O(n^{r+3})$-time algorithm computing $\sdet$, where $r$ is the dimension of the algebra; hence $\sdet$ is polynomial-time computable for fixed $r$. In this work, we study the algebraic properties and complexity of $\sdet$. While most of the properties of the ordinary determinant don't generalize to $\sdet$ defined on non-commutative algebras, we show that the principal minor expansion of the $\sdet$ is analogous to the ordinary determinant. Second, we prove that there exists a polynomial-sized algebra such that computing the symmetrized determinant is $\sharpP$-hard. Third, we show that the associated polynomial family is $\VNP$-complete over a suitable polynomial-dimensional algebra in the non-commutative setting. Further, when seen as a family of polynomials over the matrix algebra, it is also $\VNP$-complete in the commutative setting. This places the symmetrized determinant among the natural complete families arising from algebraic computation.
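
One natural reading of the averaging definition (our own toy rendering, with the normalization we assume; the precise definition is Barvinok's) averages the Leibniz product over all $n!$ orderings of its factors. For commutative entries this reduces to the ordinary determinant:

```python
# Brute-force symmetrized determinant: average the Leibniz product over
# all n! multiplication orders. Entries only need *, +, and scalar *.
from itertools import permutations
from math import factorial

def sgn(p):
    # sign of a permutation via inversion count
    inv = sum(1 for i in range(len(p)) for j in range(i + 1, len(p))
              if p[i] > p[j])
    return -1 if inv % 2 else 1

def sdet(A):
    n = len(A)
    total = None
    for sigma in permutations(range(n)):
        s = sgn(sigma)
        for tau in permutations(range(n)):        # order of the n factors
            prod = A[tau[0]][sigma[tau[0]]]
            for k in tau[1:]:
                prod = prod * A[k][sigma[k]]
            term = s * prod
            total = term if total is None else total + term
    return (1.0 / factorial(n)) * total

# For commutative (scalar) entries, sdet coincides with det:
print(sdet([[1.0, 2.0], [3.0, 4.0]]))   # -2.0
```

Feeding entries that carry their own `*` and `+` (e.g. small matrices) exercises the genuinely non-commutative case, where the ordering average matters.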

Unentangled stoquastic Merlin-Arthur proof systems: the power of unentanglement without destructive interference

from arXiv: Computational Complexity

Authors: Yupan Liu, Pei Wu

Stoquasticity, originating in sign-problem-free physical systems, gives rise to $\sf StoqMA$, introduced by Bravyi, Bessen, and Terhal (2006), a quantum-inspired intermediate class between $\sf MA$ and $\sf AM$. Unentanglement similarly gives rise to ${\sf QMA}(2)$, introduced by Kobayashi, Matsumoto, and Yamakami (CJTCS 2009), which generalizes $\sf QMA$ to two unentangled proofs and still has only the trivial $\sf NEXP$ upper bound. In this work, we initiate a systematic study of the power of unentanglement without destructive interference via ${\sf StoqMA}(2)$, the class of unentangled stoquastic Merlin-Arthur proof systems. Although $\sf StoqMA$ is semi-quantum and may collapse to $\sf MA$, ${\sf StoqMA}(2)$ turns out to be surprisingly powerful. We establish the following results: - ${\sf NP} \subseteq {\sf StoqMA}(2)$ with $\widetilde{O}(\sqrt{n})$-qubit proofs and completeness error $2^{-{\rm polylog}(n)}$. Conversely, ${\sf StoqMA}(2) \subseteq {\sf EXP}$ via the Sum-of-Squares algorithm of Barak, Kelner, and Steurer (STOC 2014); with our lower bound, our refined analysis yields the optimality of this algorithm under ETH. - ${\sf StoqMA}(2)_1 \subseteq {\sf PSPACE}$, and the containment holds with completeness error $2^{-2^{{\rm poly}(n)}}$. - ${\sf PreciseStoqMA}(2)$, a variant of ${\sf StoqMA}(2)$ with exponentially small promise gap, cannot achieve perfect completeness unless ${\sf EXP}={\sf NEXP}$. In contrast, ${\sf PreciseStoqMA}$ achieves perfect completeness, since ${\sf PSPACE} \subseteq {\sf PreciseStoqMA}_1$. - When the completeness error is negligible, ${\sf StoqMA}(k) = {\sf StoqMA}(2)$ for $k\geq 2$. Our lower bounds are obtained by stoquastizing the short-proof ${\sf QMA}(2)$ protocols via distribution testing techniques. Our upper bounds for the nearly perfect completeness case are proved via our new rectangular closure testing framework.

Toward a Characterization of Simulation Between Arithmetic Theories

from arXiv: Computational Complexity

Authors: Hunter Monroe

We study when a sound arithmetic theory $\mathcal S{\supseteq}S^1_2$ with polynomial-time decidable axioms efficiently proves the bounded consistency statements $Con_{\mathcal S{+}φ}(n)$ for a true sentence $φ$. Equivalently, we ask when $\mathcal S$, viewed as a proof system, simulates $\mathcal S{+}φ$. The paper's two unconditional contributions constrain possible characterizations. First, for finitely axiomatized sequential $\mathcal S$, if $EA{\vdash}Con_{\mathcal S}{\rightarrow}Con_{\mathcal S{+}φ}$, then $\mathcal S$ interprets $\mathcal S{+}φ$, implying ${\mathcal S}{\vdash^{n^{O(1)}}}Con_{\mathcal S}(p(n)){\rightarrow}Con_{\mathcal S{+}φ}(n)$ for some polynomial $p$, and hence ${\mathcal S}{\vdash^{n^{O(1)}}}Con_{\mathcal S{+}φ}(n)$. Second, if $\mathcal S$ fails to simulate $\mathcal S{+}φ$ for some true $φ$, then for all sufficiently large $k$ it also fails for $φ_{BB}(k)$ asserting the exact value of the $k$-state Busy Beaver function. Informally, any argument showing that $\mathcal S$ fails to simulate some $\mathcal S{+}φ$ also yields unprovable $φ_{BB}(k)$ witnessing the same obstruction. These results suggest that relative consistency strength is a serious candidate for governing when simulation is possible, while leaving open whether it is the correct criterion. The paper's central conjectural proposal is that the above sufficient condition is also necessary: if $EA{\not\vdash}Con_{\mathcal S}{\rightarrow}Con_{\mathcal S{+}φ}$, then for every constant $c{>}0$, ${\mathcal S}{\not\vdash^{n^c}}Con_{\mathcal S{+}φ}(n)$. Under this proposal, hardness follows in canonical cases where $φ$ is $Con_{\mathcal S}$ or a Kolmogorov-randomness axiom. The latter yields further conjectural consequences and extensions.

Strongly Refuting Random CSP without Literals

from arXiv: Computational Complexity

Authors: Siu On Chan, Tommaso d'Orsi, Jeff Xu

Under what condition is a random constraint satisfaction problem hard to refute by the sum-of-squares (SoS) algorithm? A sufficient condition is t-wise uniformity, that is, each constraint has a t-wise uniform distribution of satisfying assignments, as shown by the lower bounds of Kothari, Mori, O'Donnell, and Witmer (STOC 2017). This condition is also necessary for random CSPs given by a predicate and uniformly random literals, due to the constant-degree SoS refutation of Allen, O'Donnell, and Witmer (FOCS 2015). For higher degree, Raghavendra, Rao, and Schramm (STOC 2017) gave a refutation for Boolean random CSPs with uniformly random literals, matching the lower bounds optimally in terms of the three-way tradeoff between constraint density, SoS degree, and strength of refutation. Two long-standing open problems are to find a more general sufficient condition for SoS lower bounds, and to refute similar random CSPs not involving literals. We show that for a general random k-CSP, the necessary and sufficient hardness condition is not t-wise uniformity, but t-wise independence. We generalize the optimal three-way tradeoff to any random k-CSP, without assuming a Boolean domain or uniformly random literals. Our analysis involves new Kikuchi matrices for odd order and for asymmetric tensors. It also uses the global correlation rounding technique of Barak, Raghavendra, and Steurer (FOCS 2011). To avoid the running-time penalty of this technique, we also give a spectral refutation algorithm.
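
The hardness condition can be checked by brute force for a small predicate. The toy check below (our illustration, not the paper's machinery) verifies that the satisfying assignments of 3-XOR are pairwise (2-wise) uniform but not 3-wise uniform; t-wise independence is the weaker condition that also admits non-uniform product marginals.

```python
# Check t-wise uniformity of the satisfying assignments of a predicate:
# every projection onto t coordinates must be uniform over {0,1}^t.
from itertools import combinations, product

def satisfying(pred, k):
    return [a for a in product([0, 1], repeat=k) if pred(a)]

def t_wise_uniform(assigns, t):
    k, n = len(assigns[0]), len(assigns)
    for coords in combinations(range(k), t):
        counts = {}
        for a in assigns:
            key = tuple(a[i] for i in coords)
            counts[key] = counts.get(key, 0) + 1
        if any(counts.get(v, 0) * 2 ** t != n
               for v in product([0, 1], repeat=t)):
            return False
    return True

xor3 = satisfying(lambda a: sum(a) % 2 == 0, 3)
print(t_wise_uniform(xor3, 2))   # True: 3-XOR is pairwise uniform
print(t_wise_uniform(xor3, 3))   # False: not 3-wise uniform
```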

Fisher Markets with Approximately Optimal Bundles and the Need for a PCP Theorem for PPAD

from arXiv: Computational Complexity

Authors: Argyrios Deligkas, John Fearnley, Alexandros Hollender, Themistoklis Melissourgos

We study the problem of computing a competitive equilibrium with approximately optimal bundles in Fisher markets with separable piecewise-linear concave (SPLC) utility functions, meaning that every buyer receives a $(1-δ)$-optimal bundle, instead of a perfectly optimal one. We establish the first intractability result for the problem by showing that it is PPAD-hard for some constant $δ> 0$, assuming the PCP-for-PPAD conjecture. This hardness result holds even if all buyers have identical budgets (competitive equilibrium with equal incomes), linear capped utilities, and even if we also allow $\varepsilon$-approximate clearing instead of perfect clearing, for any constant $\varepsilon < 1/9$. Importantly, we show that the PCP-for-PPAD conjecture is in fact required to show hardness for constant $δ$: showing PPAD-hardness for finding such approximate market equilibria in a broad class of markets encompassing those generated by our hardness result would prove the conjecture. This is the first natural problem where the conjecture is provably required to establish hardness for it.

An Exact 56-Addition, Rank-23 Scheme for General 3×3 Matrix Multiplication

from arXiv: Data Structures and Algorithms

Authors: Yinqi Sun

We present a rank-$23$ algorithm for general $3\times3$ matrix multiplication that uses $56$ additions/subtractions and $23$ multiplications, for a total of $79$ scalar operations in the standard bilinear straight-line model. This improves the recent sequence of $60$-, $59$-, and $58$-addition rank-$23$ schemes. The algorithm works over arbitrary associative, possibly noncommutative, coefficient rings. Its tensor coefficients are ternary, meaning that every coefficient lies in $\{-1,0,1\}$. Correctness is certified by the $729$ Brent equations over $\mathbb{Z}$, and the verifier also expands the straight-line program and performs additional finite-field and noncommutative implementation tests.
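
The rank-23 scheme's coefficients are given in the paper itself; as a familiar analogue of such a bilinear scheme with coefficients in $\{-1,0,1\}$, the sketch below implements Strassen's classical rank-7 scheme for $2\times2$ matrices and checks it against the naive product.

```python
# Strassen's rank-7 bilinear scheme for 2x2 matrix multiplication:
# 7 multiplications, all tensor coefficients in {-1, 0, 1},
# valid over any (possibly noncommutative) ring.
def strassen2(A, B):
    (a, b), (c, d) = A
    (e, f), (g, h) = B
    p1 = a * (f - h)
    p2 = (a + b) * h
    p3 = (c + d) * e
    p4 = d * (g - e)
    p5 = (a + d) * (e + h)
    p6 = (b - d) * (g + h)
    p7 = (a - c) * (e + f)
    return [[p5 + p4 - p2 + p6, p1 + p2],
            [p3 + p4, p1 + p5 - p3 - p7]]

def naive2(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

A, B = [[1, 2], [3, 4]], [[5, 6], [7, 8]]
print(strassen2(A, B) == naive2(A, B))   # True
```

Verifying a scheme symbolically against the Brent equations, as the paper's verifier does, amounts to checking these identities with indeterminate entries rather than sample values.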

Succinct Graph Representations and Algorithmic Applications

from arXiv: Data Structures and Algorithms

Authors: Ahammed Ullah, Alex Pothen

We propose new graph representations that exploit dense local structure to improve time and space simultaneously. Given an undirected graph $G$, we define a dual clique cover (DCC) representation of $G$ to be the pair $(C, L)$, where $C$ is a collection of cliques that covers the edges of $G$ and $L$ is the incidence dual of $C$. We identify classes of polynomial-time constructible DCC representations that are compact and call them succinct DCC representations. We then develop representation-aware algorithms for several fundamental graph problems. We show that graph primitives such as connected components, breadth-first search forests, depth-first search forests, and maximal matchings can be computed in time proportional to the size of a DCC representation rather than the number of edges. Combined with our succinct DCC representations, these results give a class of algorithms that either match or improve the time and space bounds of their counterparts on standard graph representations. Furthermore, we design several algorithms for constructing succinct DCC representations and establish provable guarantees on their efficiency. We evaluate several graph algorithms on DCC representations against adjacency-list-based implementations on a large collection of real-world and synthetic graphs. All evaluated applications show substantial execution memory savings and total-time speedups; for example, the connected components algorithm achieves about $9\times$ execution memory savings on average, with a maximum of $35\times$, and about $6.5\times$ total-time speedups on average, with a maximum of $35\times$. We also evaluate several DCC construction algorithms and find that the succinctness property plays a key role in making DCC representations effective for algorithmic applications.
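
To illustrate why primitives can run in time proportional to the representation size, the sketch below (our own toy version; the paper's DCC also carries the incidence dual $L$, omitted here) computes connected components from an edge-covering collection of cliques, touching each clique member once instead of each covered edge.

```python
# Connected components from a clique cover: union consecutive members of
# each clique, so the work is the sum of clique sizes, not the edge count
# (a clique on k vertices covers k*(k-1)/2 edges with only k entries).
def components_from_cliques(n, cliques):
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    for clq in cliques:
        for u, v in zip(clq, clq[1:]):
            ru, rv = find(u), find(v)
            if ru != rv:
                parent[ru] = rv

    return [find(v) for v in range(n)]

# A triangle {0,1,2}, an edge {2,3}, and an edge {4,5}: two components.
roots = components_from_cliques(6, [[0, 1, 2], [2, 3], [4, 5]])
print(len(set(roots)))   # 2
```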

Distributed Santa Claus via Global Rounding

from arXiv: Data Structures and Algorithms

Authors: Tijn de Vos, Leo Wennmann, Malte Baumecker, Yannic Maus, Florian Schager

In this paper, we consider the Santa Claus problem in the CONGEST model. This NP-hard problem can be modeled as a bipartite graph of children and gifts where an edge indicates that a child desires a gift. Notably, each gift can have a different value. The goal is to assign the gifts to the children such that the least happy child is as happy as possible. Even though this is a well-studied problem in the sequential setting, we obtain the first results in the distributed setting. In particular, we show that the complexity of computing an $\mathcal{O}(\log n/\log \log n)$-approximation is $\hat{\Theta}(\sqrt n+D)$ rounds, where our $\widetilde{\Omega}(\sqrt n+D)$-round lower bound is even stronger and holds for any approximation.
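
On tiny instances the max-min objective can be computed exactly. The brute-force sketch below is illustrative only (the paper's contribution is the distributed algorithm and lower bound, not this enumeration): it maximizes the minimum happiness over all assignments of gifts to desiring children.

```python
# Brute-force Santa Claus: assign each gift to some child desiring it,
# maximizing the minimum total value any child receives.
from itertools import product

def santa_max_min(values, desires):
    """values[g] = value of gift g; desires[c] = set of gifts child c wants."""
    children = range(len(desires))
    choices = [[c for c in children if g in desires[c]]
               for g in range(len(values))]
    best = None
    for assign in product(*choices):           # assign[g] = receiving child
        happy = [0] * len(desires)
        for g, c in enumerate(assign):
            happy[c] += values[g]
        best = min(happy) if best is None else max(best, min(happy))
    return best

# Child 0 desires gifts {0,1}; child 1 desires {1,2,3}.
# Giving gift 1 to child 1 balances them: happiness (4, 6), minimum 4.
print(santa_max_min([4, 3, 2, 1], [{0, 1}, {1, 2, 3}]))   # 4
```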

Simpler and Improved Replacement Path Coverings

from arXiv: Data Structures and Algorithms

Authors: Davide Bilò, Shiri Chechik, Keerti Choudhary, Sarel Cohen, Martin Schirneck

An important tool in the design of fault-tolerant graph data structures is the $(L,f)$-replacement path covering (RPC). An RPC is a family $\mathcal{G}$ of subgraphs of a given graph $G$ such that, for every set $F$ of at most $f$ edges, there is a subfamily $\mathcal{G}_F \,{\subseteq}\, \mathcal{G}$ with the following properties. (1) No subgraph in $\mathcal{G}_F$ contains an edge of $F$. (2) For each pair of vertices $s,t$ that have a shortest path in $G-F$ with at most $L$ edges, one such path also exists in some subgraph in $\mathcal{G}_F$. The covering value of the RPC is the total number $|\mathcal{G}|$ of subgraphs. The query time is the time needed to compute the subfamily $\mathcal{G}_F$ given the set $F$. Weimann and Yuster [TALG'13] devised a randomized RPC with covering value $\widetilde{O}(fL^f)$ and query time $\widetilde{O}(f^2 L^f)$. This was derandomized by Karthik and Parter [TALG'24], who also reduced the query time to $\widetilde{O}(f^2 L)$. Their approach uses some heavy algebraic machinery involving error-correcting codes and an increased covering value of $O((cfL \log n)^{f+1})$ for some constant $c > 1$. We instead devise a much simpler derandomization via conditional expectations that lowers the covering value back to $\widetilde{O}(fL^{f+o(1)})$ and decreases the query time to $\widetilde{O}(f^{5/2}L^{o(1)})$, assuming $f = o(\log L)$. We also investigate the optimal covering value of any $(L,f)$-replacement path covering (deterministic or randomized) for different parameter ranges. We provide a new randomized construction and improve a known lower bound, also due to Karthik and Parter. For example, for $f = o(\log L)$, we give an RPC with $\widetilde{O}( (L/f)^f L^{o(1)})$ subgraphs and show that this is tight up to the $L^{o(1)}$ term.
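
The RPC definition can be verified by brute force on small graphs. The sketch below is our own checker (not from the paper): it tests properties (1) and (2) for every fault set of size at most $f$ using BFS hop distances, and confirms that the family $\{G - e : e \in E\}$ is an RPC for $f = 1$ on a 5-cycle.

```python
# Brute-force check of the (L,f)-replacement path covering definition.
from collections import deque
from itertools import combinations

def bfs(n, edges, s):
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    dist = [None] * n
    dist[s] = 0
    q = deque([s])
    while q:
        u = q.popleft()
        for w in adj[u]:
            if dist[w] is None:
                dist[w] = dist[u] + 1
                q.append(w)
    return dist

def is_rpc(n, edges, family, L, f):
    for r in range(f + 1):
        for F in combinations(edges, r):
            fset = set(F)
            g_f = [H for H in family if not set(H) & fset]   # property (1)
            surviving = [e for e in edges if e not in fset]
            for s in range(n):
                d = bfs(n, surviving, s)
                dh = [bfs(n, H, s) for H in g_f]
                for t in range(n):
                    if d[t] is not None and d[t] <= L:        # property (2)
                        if not any(x[t] == d[t] for x in dh):
                            return False
    return True

cycle = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
family = [[e for e in cycle if e != removed] for removed in cycle]
print(is_rpc(5, cycle, family, 4, 1))   # True
```

Dropping any single subgraph from this family breaks the covering for the corresponding fault set, which the same checker detects.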

Heisenberg-limited Hamiltonian learning without short-time control

from arXiv: Data Structures and Algorithms

Authors: Myeongjin Shin, Junseo Lee, Changhun Oh

Characterizing quantum systems by learning their underlying Hamiltonians is a central task in quantum information science. While recent algorithmic advances have achieved near-optimal efficiency in this task, they critically rely on accessing arbitrarily short-time dynamics. This reliance poses severe experimental challenges due to finite control bandwidth and transient pulse errors. In this work, we demonstrate that Heisenberg-limited Hamiltonian learning can be achieved without short-time control. We introduce a framework in which every query to the unknown dynamics has duration at least a prescribed minimum time $T$, and show that this restriction does not preclude Heisenberg-limited scaling. The key ingredient is a method for emulating the continuous quantum control required by iterative learning algorithms using only such lower-bounded evolution times. This reduces the learning task to sparse pure-state tomography. Notably, for logarithmically sparse Hamiltonians, our algorithm achieves the information-theoretically optimal $1/\varepsilon$ scaling in total evolution time for any constant minimum evolution time $T$. For many-body (polynomially sparse) systems, we uncover a rigorous quantitative tradeoff, showing that the minimum required evolution time can be significantly relaxed from the standard limit at a polynomial cost in total evolution time. Our results affirmatively resolve a prominent open problem in the field and reveal that high-bandwidth, ultra-short pulses are not fundamentally necessary for optimal quantum learning.

Separating Feasibility and Movement in Solution Discovery: The Case of Path Discovery

from arXiv: Data Structures and Algorithms

Authors: Hanno von Bergen, Larissa Fastenau, Enna Gerhard, Nicola Lorenz, Stephanie Maaz, Amer E. Mouawad, Roman Rabinovich, Nicole Schirrmacher, Daniel Schmand, Sebastian Siebertz, Mai Trinh

We study solution discovery, where the goal is to obtain a feasible solution to a problem from an initial configuration by a bounded sequence of local moves. In many applications, however, the graph that defines which vertex sets are feasible is not the same as the graph that governs how tokens, agents, or resources may move. Existing models such as token sliding and token jumping typically do not distinguish the problem graph and the movement graph. Motivated by this mismatch, we introduce a directed weighted two-graph model that cleanly separates feasibility from movement. A problem graph specifies the desired combinatorial objects, while a movement graph specifies admissible relocations and their costs. This yields a flexible framework that captures asymmetry, heterogeneous movement constraints, and weighted transitions, while subsuming classical discovery models as special cases. We investigate this model through \textsc{Path Discovery} and \textsc{Shortest Path Discovery}, where the task is to realize a vertex set containing an $s$-$t$-path or a shortest $s$-$t$-path in the problem graph. These problems are particularly natural in applications, since directed and weighted shortest paths are among the most fundamental algorithmic primitives. At the same time, previous work has already shown that discovery can be computationally hard even when the underlying optimization problem is easy. Our results show that this phenomenon persists, and becomes especially rich, in the two-graph setting. We obtain a detailed complexity picture, identifying tractable cases as well as strong hardness results.

Variational and Majorization Principles in Lattice Reduction

from arXiv: Data Structures and Algorithms

Authors: Javier Blanco-Romero, Florina Almenares Mendoza

Lattice reduction smooths the Gram-Schmidt profile, and we use majorization to describe the local swap mechanism behind that smoothing. In this language, each non-degenerate Lovász swap acts as a T-transform on the log-norm profile. As a consequence, every strictly Schur-convex measure of profile spread decreases at such a swap. Two structural consequences follow. First, the worst-case GSA envelope admits a variational interpretation. It is the unique minimum-variance profile compatible with the Lovász gap geometry, so its slope is determined by the LLL parameter alone. Second, the realized swap trajectory satisfies an exact telescoping identity for variance dissipation. The same viewpoint also helps organize deep-insertion heuristics. It suggests a thermal family of Schur-convex scoring rules, motivates adaptive selection within that family, and leads to two concrete selectors: Thermal-Adaptive, which reduces operation counts relative to SS-GG on flat profiles in our benchmarks while recovering SS-GG on $q$-ary inputs, and Geodesic Deep-LLL, which reduces equivalent-swap counts on structured lattices in our benchmarks at higher wall-clock cost.
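
The majorization mechanism described above can be checked numerically: a T-transform mixes two coordinates of a vector toward each other, and any strictly Schur-convex spread measure, such as the variance, decreases. A minimal sketch on a stand-in log-norm profile (the function names are mine, not the paper's):

```python
# Minimal numeric illustration (not the paper's algorithm): a T-transform
# x -> (1-t)*x + t*Qx, where Q swaps coordinates i and j, mixes the two
# coordinates; variance, a strictly Schur-convex spread measure, decreases.

def t_transform(x, i, j, t):
    """Apply a T-transform with mixing parameter t in (0, 1) to coords i, j."""
    y = list(x)
    y[i] = (1 - t) * x[i] + t * x[j]
    y[j] = (1 - t) * x[j] + t * x[i]
    return y

def variance(x):
    m = sum(x) / len(x)
    return sum((v - m) ** 2 for v in x) / len(x)

profile = [3.0, 1.0, 0.5, -0.5]          # a stand-in log-norm profile
mixed = t_transform(profile, 0, 1, 0.4)  # smooth the two leading entries
assert variance(mixed) < variance(profile)
print(variance(profile), variance(mixed))
```

In the abstract's language, a non-degenerate Lovász swap plays the role of `t_transform` on the log-norm profile, which is why every strictly Schur-convex spread measure decreases at such a swap.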

Temporal Routing in Static Networks: The Schedule Completion Problem

from arXiv: Data Structures and Algorithms

Authors: Michelle Döring, Niklas Mohrin, George Skretas

We introduce the Temporally Edge-Disjoint Schedule Completion (TEDSC) problem, in which we need to cover a set of temporal edge demands $D$ by routing $k$ temporal walks through a directed static graph while remaining temporally edge-disjoint. This problem combines the temporal aspects of train routing and passenger demands with the static nature of real-world rail networks. We present a polynomial time algorithm for TEDSC. Motivated by real-world constraints, we next investigate two restricted variants of TEDSC in which each walk can only travel for some bounded distance or time $h$. We show that both are tractable when parameterized by $k + h$, but hard for $h$ and $k + |D|$. If we fix the underlying network, the two problems exhibit distinct complexities: The distance variant remains $W[1]$-hard parameterized by $k$ even on a path of three vertices, whereas the time variant admits an FPT algorithm on any fixed star. Finally, we show how to approximate the number of required walks up to a factor of $(2-h^{-1})$.
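
As a small illustration of the disjointness condition (my reading of the model, not code from the paper): representing a temporal walk as a sequence of (edge, time) steps, a set of walks is temporally edge-disjoint when no two walks traverse the same edge at the same time step.

```python
# Hypothetical encoding of temporal walks as lists of (edge, time) steps;
# temporally edge-disjoint means no (edge, time) pair is used twice.

def temporally_edge_disjoint(walks):
    seen = set()
    for walk in walks:
        for edge, t in walk:
            if (edge, t) in seen:
                return False
            seen.add((edge, t))
    return True

w1 = [(("a", "b"), 1), (("b", "c"), 2)]
w2 = [(("b", "c"), 3)]                    # reuses edge b-c, but later
w3 = [(("b", "c"), 2)]                    # clashes with w1 at time 2
print(temporally_edge_disjoint([w1, w2]))      # → True
print(temporally_edge_disjoint([w1, w2, w3]))  # → False
```

Reusing the same static edge at different times is allowed, which is exactly what makes routing $k$ walks through a static network a scheduling problem rather than a plain disjoint-paths problem.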

Average-Tree Phylogenetic Diversity Parameterized by Scanwidth and Invisibility

from arXiv: Data Structures and Algorithms

Authors: Leo van Iersel, Mark Jones, Jannik Schestag, Celine Scornavacca, Mathias Weller

We investigate parameterized algorithms for computing the average-tree phylogenetic diversity (APD) in rooted phylogenetic networks, studying the problem under different structural parameters that capture the deviation of a network from a tree. Our primary parameter is the scanwidth, a measure of the tree-likeness of a given directed acyclic graph. We show that a subset of taxa with maximum APD can be found in polynomial time in phylogenetic networks of scanwidth at most 2, but becomes NP-hard in networks of scanwidth 3. Further, we design an algorithm that computes the APD of a given set of taxa in $O(2^{\mathrm{sw}} \cdot n)$ time, where $\mathrm{sw}$ denotes the scanwidth and $n$ the number of taxa in the input network. Finally, we give a linear-time algorithm for computing the APD of a given set of taxa if the network induced by these taxa is reticulation-visible. We generalize this algorithm to still run in polynomial time if each biconnected component of the induced network has only constantly many invisible reticulations.

Online Coloring for Graphs of Large Odd Girth

from arXiv: Data Structures and Algorithms

Authors: Hirotaka Yoneda, Masataka Yoneda

We study the problem of online coloring for graphs with large odd girth. The best previously known algorithm uses $O(n^{1/2})$ colors, which was discovered by Kierstead in 1998. This algorithm works when the odd girth is 7 or more. In this paper, we provide the following: for every $\varepsilon > 0$, there exists a constant $g' \in \{3, 5, 7, \dots\}$ such that graphs with odd girth at least $g'$ can be deterministically colored online using $O(n^{\varepsilon})$ colors.
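
For context on the online model (this is the textbook baseline, not the algorithm of the paper): the simplest online strategy is First-Fit, which irrevocably assigns each arriving vertex the smallest color absent among its already-revealed neighbors.

```python
# First-Fit online coloring: vertices arrive one by one together with their
# edges to previously revealed vertices; each receives the smallest color
# unused by an earlier neighbor, and the choice is irrevocable.

def first_fit(arrivals):
    """arrivals: list of (vertex, neighbors among earlier vertices)."""
    color = {}
    for v, back_edges in arrivals:
        used = {color[u] for u in back_edges}
        c = 0
        while c in used:
            c += 1
        color[v] = c
    return color

# A 5-cycle (odd girth 5) revealed in order 0..4; First-Fit uses 3 colors.
arrivals = [(0, []), (1, [0]), (2, [1]), (3, [2]), (4, [3, 0])]
print(first_fit(arrivals))  # → {0: 0, 1: 1, 2: 0, 3: 1, 4: 2}
```

The difficulty the paper addresses is that such greedy strategies can be forced to use many colors in the worst case; exploiting large odd girth to get $O(n^{\varepsilon})$ colors requires a much more careful deterministic strategy.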

Solving Hypergraph Laplacian Systems in Almost-Linear Time

from arXiv: Data Structures and Algorithms

Authors: Yuichi Yoshida

For a connected weighted hypergraph, we give a randomized almost-linear-time solver for the Poisson problem for the cut-based hypergraph Laplacian in the natural input size $P=\sum_{e\in E}|e|$, the sum of hyperedge sizes. For every fixed constant $C>0$, our randomized algorithm runs in $P^{1+o(1)}$ time and, with high probability over its internal randomness, returns a primal point and a dual certificate, with additive optimality gap at most $\exp(-\log^C P)$. A key step is to rewrite the Fenchel dual as a convex-flow problem on an auxiliary $O(P)$-arc graph, yielding a near-optimal dual flow. The main difficulty is primal recovery, because this flow does not by itself determine a primal potential. Our main new ingredient is a recovery theorem showing that, for primal recovery, the detailed routing of the dual flow inside each hyperedge gadget can be discarded: one nonnegative scalar per hyperedge is enough. After the necessary finite-precision rounding, these scalars define a linear-cost min-cost-flow instance on the auxiliary graph, and solving it exactly recovers a primal potential. Finally, a ground-vertex reduction from regularized objectives to the Poisson solver gives randomized almost-linear-time resolvent/proximal primitives for the same cut-based hypergraph Laplacian.

Smallest suffixient set maintenance in near-real-time

from arXiv: Data Structures and Algorithms

Authors: Dominik Köppl, Gregory Kucherov

The size of the \textit{smallest suffixient set} of positions of a string recently emerged as a new measure of string \textit{repetitiveness} -- a measure reflecting how much repetitive content the string contains. We study how to maintain the smallest suffixient set online in near-real-time, that is, with small (in our case, polyloglog) worst-case time per processed letter. Two settings are considered: the text is revealed letter-by-letter either right-to-left or left-to-right. Our central algorithmic tool is Weiner's suffix tree algorithm and the associated algorithmic primitives for its efficient implementation.

Computing the (k+2)-Edge-Connected Components in k-Edge-Connected Digraphs in Subquadratic Time

from arXiv: Data Structures and Algorithms

Authors: Loukas Georgiadis, Evangelos Kipouridis, Evangelos Kosinas, Charis Papadopoulos, Nikos Parotsidis

Computing edge-connected components in directed and undirected graphs is a fundamental and well-studied problem in graph algorithms. In a very recent breakthrough, Korhonen [STOC 2025] showed that for any fixed $k$, the $k$-edge connected components of an undirected graph can be computed in linear time. In contrast, the directed case remains significantly more challenging: linear-time algorithms are only known for $k \le 3$, and for any fixed $k > 3$, the best known bound for sparse or moderately dense graphs is still the $O(mn)$-time algorithm of Nagamochi and Watanabe (1993). In this paper, we break the $O(mn)$ barrier for all $k = o(n^{1/4}/\sqrt{\log{n}})$. We present a randomized algorithm that computes the $(k+2)$-edge-connected components of a $k$-edge-connected directed graph in $O(k^2 m \sqrt{n} \log n)$ time, for any~$k$. This constitutes the first improvement over the classic Nagamochi--Watanabe bound for any constant $k > 3$. Our approach introduces new structural insights into directed edge-cuts and combines these with both new and existing techniques. A central contribution of our work is a substantial simplification and generalization of the framework introduced in~\cite{GKPP:3ECC}, which achieved an $\widetilde{O}(m\sqrt{m})$ bound for computing the $3$-edge-connected components of a digraph. In addition, we develop a variant of our algorithm that achieves the same $O(m \sqrt{n} \log n)$ running time for computing the $4$-edge-connected components of a \emph{general} directed graph.

A note on the parameter $\ell$ in Buchbinder--Feldman's deterministic submodular matroid algorithm

from arXiv: Data Structures and Algorithms

Authors: Shisheng Li

Buchbinder and Feldman recently gave a deterministic $(1-1/e-\varepsilon)$-approximation for maximizing a non-negative monotone submodular function subject to a matroid constraint, with query complexity $\widetilde{O}_\varepsilon(nr)$. Their algorithm uses an integer parameter $\ell$, which Buchbinder and Feldman fix to $\ell = 1 + \lceil 1/\varepsilon \rceil$ via a loose bound on $(1+1/\ell)^{-\ell}$. We point out two purely elementary refinements. First, the classical Pólya--Szegő inequality $(1+1/\ell)^{-\ell} \le e^{-1}(1+1/(2\ell))$ replaces the loose step in their proof and permits $\ell = \lceil 1/(2e\varepsilon) \rceil$, shrinking the hidden constant in $\widetilde{O}_\varepsilon(nr)$ by a factor $\approx 2^{0.816/\varepsilon}$. Second, an alternating-series tail bound for $\log(1+t)$ yields the asymptotically sharp inequality $(1+1/\ell)^{-\ell} \le e^{-1}\exp(1/(2\ell) - 1/(3\ell^2) + 1/(4\ell^3))$, matching the true expansion of $(1+1/\ell)^{-\ell}$ through order $\ell^{-3}$ and translating into $\ell_\star = 1/(2e\varepsilon) - 5/12 + O(\varepsilon)$. The asymptotic class $\widetilde{O}_\varepsilon(nr)$ of the query complexity is unchanged in either case; only the implicit constant in $\varepsilon$ is improved. All inequalities in this note are formalized and machine-checked in Lean 4 against Mathlib.
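
Both inequalities quoted in the note are easy to sanity-check numerically; the following sketch verifies them for a range of $\ell$ (it is a numeric spot check, not a substitute for the Lean formalization mentioned above).

```python
# Numeric spot check of the two bounds from the note:
#   (1 + 1/l)^(-l) <= e^(-1) * (1 + 1/(2l))                       (Polya--Szego)
#   (1 + 1/l)^(-l) <= e^(-1) * exp(1/(2l) - 1/(3l^2) + 1/(4l^3))  (tail bound)
import math

for l in range(1, 200):
    lhs = (1 + 1 / l) ** (-l)
    polya_szego = math.exp(-1) * (1 + 1 / (2 * l))
    tail = math.exp(-1) * math.exp(1 / (2 * l) - 1 / (3 * l**2) + 1 / (4 * l**3))
    assert lhs <= polya_szego and lhs <= tail, l
print("both bounds hold for l = 1..199")
```

The second bound is the sharper one asymptotically: expanding $-\ell \log(1+1/\ell)$ gives $-1 + 1/(2\ell) - 1/(3\ell^2) + 1/(4\ell^3) - \cdots$, an alternating series, so truncating after the positive $1/(4\ell^3)$ term yields an upper bound.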

Designing sparse temporal graphs satisfying connectivity requirements

from arXiv: Data Structures and Algorithms

Authors: Thomas Bellitto, Jules Bouton Popper, Justine Cauvi, Bruno Escoffier, Raphaëlle Maistre-Matus

Connectivity of temporal graphs has been widely studied both as graph theory and as gossip theory. In particular, it is well known that in order to connect every vertex to every other, a temporal graph needs to have at least $2n-4$ edges, where $n$ is the number of vertices. This paper investigates the optimal number of edges required to satisfy partial connectivity requirements. We introduce the problem of Connectivity Request Satisfaction, where we are given a directed graph that we call the request graph, in which an arc from $u$ to $v$ means that we need to be able to go from $u$ to $v$. Our goal is to build a temporal graph on the same vertex set with as few temporal edges as possible that satisfies all the requests. When the graph we build is directed, we prove that the number of temporal arcs required is $n-\mathrm{cc}+\mathrm{dfvs}$, where $\mathrm{cc}$ is the number of connected components of the request graph and $\mathrm{dfvs}$ is the size of its smallest directed feedback vertex set. It follows that the problem is NP-complete but inherits the fixed-parameter tractability properties of Directed Feedback Vertex Set. When the graph we build is undirected, we establish a characterization of strongly connected request graphs that admit a solution with $n-1$ edges: it is possible if and only if any set of pairwise non-vertex-disjoint closed walks all share a common vertex. We prove that this criterion can be tested in polynomial time.
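
The directed bound $n-\mathrm{cc}+\mathrm{dfvs}$ can be evaluated on a toy request graph with brute force; the sketch below (illustration of the formula only, not an algorithm from the paper) computes both quantities exhaustively.

```python
# Evaluate n - cc + dfvs on a small request graph: cc = weakly connected
# components, dfvs = minimum directed feedback vertex set, both by brute force.
from itertools import combinations

def has_cycle(nodes, arcs):
    adj = {v: [w for u, w in arcs if u == v and w in nodes] for v in nodes}
    state = {v: 0 for v in nodes}            # 0 = new, 1 = on stack, 2 = done
    def dfs(v):
        state[v] = 1
        for w in adj[v]:
            if state[w] == 1 or (state[w] == 0 and dfs(w)):
                return True
        state[v] = 2
        return False
    return any(state[v] == 0 and dfs(v) for v in nodes)

def min_dfvs(nodes, arcs):
    for k in range(len(nodes) + 1):          # smallest k whose removal
        for removed in combinations(nodes, k):   # leaves the graph acyclic
            if not has_cycle(set(nodes) - set(removed), arcs):
                return k

def weak_components(nodes, arcs):
    parent = {v: v for v in nodes}
    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v
    for u, w in arcs:
        parent[find(u)] = find(w)
    return len({find(v) for v in nodes})

# Request graph: a directed triangle plus one isolated vertex.
nodes = [0, 1, 2, 3]
arcs = [(0, 1), (1, 2), (2, 0)]
n, cc, dfvs = len(nodes), weak_components(nodes, arcs), min_dfvs(nodes, arcs)
print(n - cc + dfvs)  # → 3, i.e. 4 - 2 + 1, the bound from the paper
```

By the paper's theorem, three temporal arcs are necessary and sufficient to satisfy the three cyclic requests of this instance.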

New Diameter Approximations via Distance Oracle Techniques

from arXiv: Data Structures and Algorithms

Authors: Yael Kirkpatrick, Liam Roditty, Richard Qi, Virginia Vassilevska Williams

Computing the diameter of a graph is a problem of great interest both in general algorithms research and specifically within fine-grained complexity, where it is a cornerstone hard problem. Recent work has achieved a full conditional lower bound tradeoff curve for both directed and undirected graphs. However, the best known upper bounds do not match the lower bounds. In particular, the best known approximation scheme for undirected graph diameter has not been improved. Moreover, this scheme is randomized and no similar deterministic scheme is known. Another fundamental field of research in shortest paths computation is the construction of approximate distance oracles. Thorup and Zwick [JACM'05] provided the first such distance oracle with constant query time and (conditionally) optimal space, and in the years since many advances have led to a vast toolbox of techniques and data structures. These two areas of research seem natural to combine since they both concern approximating shortest paths. However, the known diameter approximation algorithms only use a small subset of the techniques used in distance oracles research. In this work we show that in fact approximate diameter and distance oracles are intricately connected. We first demonstrate a strong connection between the current best known diameter approximation scheme of Cairo, Grossi and Rizzi ("CGR") and the $(2k-1)$-approximate distance oracle of Thorup and Zwick. This allows us to derandomize the CGR algorithm and obtain the first deterministic diameter approximation tradeoff. We further derandomize other central techniques in the field of distance oracles and use them to achieve new deterministic diameter approximation algorithms. Finally, we show how these new techniques can be used to derandomize many current best known results in various fields of shortest paths approximations.
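
As a point of reference for the approximation schemes discussed (a folklore warm-up, not the paper's method): in a connected undirected graph, one BFS from any vertex $s$ yields its eccentricity, and the triangle inequality gives $\mathrm{ecc}(s) \le D \le 2\,\mathrm{ecc}(s)$, so a single linear-time BFS already 2-approximates the diameter.

```python
# Folklore diameter baseline: BFS from an arbitrary vertex s; the true
# diameter D satisfies ecc(s) <= D <= 2 * ecc(s) in a connected graph.
from collections import deque

def eccentricity(adj, s):
    dist = {s: 0}
    q = deque([s])
    while q:
        v = q.popleft()
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                q.append(w)
    return max(dist.values())

# Path on 5 vertices: true diameter 4; BFS from the middle reports 2.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
ecc = eccentricity(adj, 2)
print(ecc, 2 * ecc)  # → 2 4 (the true diameter 4 lies in [ecc, 2*ecc])
```

Improving on this trivial factor-2 guarantee, and doing so deterministically, is precisely where the distance-oracle machinery of the paper enters.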

Improved Approximation Algorithm for Maximum Balanced Biclique

from arXiv: Data Structures and Algorithms

Authors: Pasin Manurangsi

We study the Maximum Balanced Biclique (MBB) problem: Given a bipartite graph $G$ with $n$ vertices on each side, find a balanced biclique in $G$ with maximum size. We give a polynomial-time $\left(\frac{n}{\widetilde{Ω}\left((\log n)^3\right)}\right)$-approximation algorithm for the problem, which improves upon an $\left(\frac{n}{Ω\left((\log n)^2\right)}\right)$-approximation by Chalermsook et al. (2020) and answers their open question. Furthermore, our approximation ratio matches that of the maximum clique problem by Feige (2004) up to an $O(\log \log n)$ factor.
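To make the objective concrete, here is a minimal exhaustive search for the largest $k$ such that a $K_{k,k}$ biclique exists. This is only an illustration of the problem definition, not the paper's approximation algorithm; the function name `max_balanced_biclique` and the adjacency representation are choices made here.

```python
from itertools import combinations

def max_balanced_biclique(adj):
    """Largest k such that the bipartite graph contains a K_{k,k} biclique.

    adj[u] is the set of right-side neighbors of left vertex u.
    Exponential time: illustrates the objective only.
    """
    n_left = len(adj)
    best = 0
    for k in range(1, n_left + 1):
        for left_set in combinations(range(n_left), k):
            # Right vertices adjacent to *every* chosen left vertex.
            common = set.intersection(*(adj[u] for u in left_set))
            # A K_{k,k} needs k left vertices with >= k common right neighbors,
            # so taking the max of min(k, |common|) over all choices is correct.
            best = max(best, min(k, len(common)))
    return best

# Complete bipartite graph K_{3,3}: the whole graph is a balanced biclique.
print(max_balanced_biclique([{0, 1, 2}, {0, 1, 2}, {0, 1, 2}]))  # 3
```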

Online Monotone Metric Embeddings

from arXiv: Data Structures and Algorithms

Authors: Christian Coester, Yichen Huang

Metric embeddings into structured spaces, particularly hierarchically well-separated trees (HSTs), are a fundamental tool in the design of online algorithms. In the classical online embedding setting, points arrive sequentially and must be embedded irrevocably upon arrival, resulting in strong distortion lower bounds of $Ω(\min(n, \log n\log Δ))$, where $n$ is the number of points and $Δ$ their aspect ratio. We propose a novel relaxation, \emph{online monotone metric embeddings}, which allows distances between embedded points in the target space to decrease monotonically over time. Such relaxed embeddings remain compatible with many online algorithms. Moreover, this relaxation breaks existing lower bound barriers, enabling embeddings into HSTs with distortion $O(\log^2 n)$. We also study a dynamic variant, where points may both arrive and depart, seeking distortion guarantees in terms of the maximum number $l$ of simultaneously present points. For traditional embeddings, such bounds are impossible, and this limitation persists even for deterministic monotone embeddings. Surprisingly, probabilistic monotone embeddings allow for $O(l \log l)$ distortion, which is nearly optimal given an $Ω(l)$ lower bound.

Thursday, April 30

STOC 2026 Student Travel Grants

from Theory Dish: Stanford Blog

Hi all! An announcement on behalf of the STOC/TheoryFest organizers: ACM SIGACT is pleased to provide funds to support student attendance at STOC 2026. These travel awards are for students who are short on funds and are intended to help cover their registration, travel, and accommodation expenses. An advisor may support the applications of at most two students for a travel award. This support is funded by the US National Science Foundation and is only available for students attending US universities. The deadline to apply is May 13, 2026. More information here: acm-stoc.org/stoc2026/travel-support.html.

By mkwoot

TR26-064 | Toward Improving Nisan’s PRG via Deweightization | ben chen, Gil Cohen, Dean Doron, Yuval Khaskelberg, Amnon Ta-Shma

from ECCC Papers

Breaking the log-squared barrier in pseudorandom generator constructions for read-once branching programs, namely, achieving seed length $o(\log^2 n)$ for length-$n$ programs, has remained a longstanding open problem since Nisan's seminal construction. We show that breaking this barrier, even achieving seed length $O(\log^{3/2} n)$ (for, say, constant width), would follow from simultaneously improving the dependence of PRGs on two seemingly secondary parameters: the error $\varepsilon$ and the program's arity $|\Sigma|$. Such improved dependence is already achieved by certain weighted PRGs (WPRGs). This reduces the problem of breaking the log-squared barrier to deweightization: closing the gap between PRG and WPRG constructions. While the importance of the error parameter has been recognized over the past decade, the role of the arity $|\Sigma|$ has largely been overlooked. By inspection, several existing WPRG constructions achieve optimal dependence on $\Sigma$, though the state-of-the-art constructions do not. As our second result, we construct WPRGs that attain optimal dependence on $\Sigma$ while matching the best known bounds in all other parameters.

Linkage

from David Eppstein

Retraction Watch reports on the mass resignation of the editorial board of Elsevier’s Journal of Approximation Theory (\(\mathbb{M}\)). This follows in close succession from the mass resignation from Taylor & Francis’s Communications in Algebra.

By David Eppstein

TR26-063 | When Majority Fails: Tight Bounds for Correlation Distillation Conjectures | Pritish Kamath, Ravi Kumar, Pasin Manurangsi

from ECCC Papers

We study two conjectures posed in the analysis of Boolean functions $f : \{-1, 1\}^n \to \{-1, 1\}$, in both of which the Majority function plays a central role: "Majority is Least Stable" (Benjamini et al., 1999) and "Non-Interactive Correlation Distillation for Erasures" (Yang, 2004; O'Donnell and Wright, 2012). While both conjectures have been refuted in their originally stated form, we obtain a nearly tight characterization of the noise parameter regime in which each conjecture holds, for all $n \ge 5$; for $n = 3$, both conjectures hold in all noise parameter regimes. We state refined versions of both conjectures that we believe capture the spirit of the originals.

TR26-062 | Toward a Characterization of Simulation Between Arithmetic Theories | Hunter Monroe

from ECCC Papers

We study when a sound arithmetic theory $\mathcal S{\supseteq}S^1_2$ with polynomial-time decidable axioms efficiently proves the bounded consistency statements $Con_{\mathcal S{+}\phi}(n)$ for a true sentence $\phi$. Equivalently, we ask when $\mathcal S$, viewed as a proof system, simulates $\mathcal S{+}\phi$. The paper's two unconditional contributions constrain possible characterizations. First, for finitely axiomatized sequential $\mathcal S$, if $EA{\vdash}Con_{\mathcal S}{\rightarrow}Con_{\mathcal S{+}\phi}$, then $\mathcal S$ interprets $\mathcal S{+}\phi$, implying $\mathcal S{\vdash^{n^{O(1)}}}Con_{\mathcal S}(p(n)){\rightarrow}Con_{\mathcal S{+}\phi}(n)$ for some polynomial $p$, and hence $\mathcal S{\vdash^{n^{O(1)}}}Con_{\mathcal S{+}\phi}(n)$. Second, if $\mathcal S$ fails to simulate $\mathcal S{+}\phi$ for some true $\phi$, then for all sufficiently large $k$ it also fails for $\phi_{BB}(k)$ asserting the exact value of the $k$-state Busy Beaver function. Informally, any argument showing that $\mathcal S$ fails to simulate some $\mathcal S{+}\phi$ also yields unprovable $\phi_{BB}(k)$ witnessing the same obstruction. These results suggest that relative consistency strength is a serious candidate for governing when simulation is possible, while leaving open whether it is the correct criterion. The paper's central conjectural proposal is that the above sufficient condition is also necessary: if $EA{\not\vdash}Con_{\mathcal S}{\rightarrow}Con_{\mathcal S{+}\phi}$, then for every constant $c{>}0$, $\mathcal S\not{\vdash^{n^c}}Con_{\mathcal S{+}\phi}(n)$. Under this proposal, hardness follows in canonical cases where $\phi$ is $Con_{\mathcal S}$ or a Kolmogorov-randomness axiom. The latter yields further conjectural consequences and extensions.

En Route to a Standard QMA1 vs. QCMA Oracle Separation

from arXiv: Computational Complexity

Authors: David Miloschewsky, Supartha Podder, Dorian Rudolph

We study the power of quantum witnesses under perfect completeness. We construct a classical oracle relative to which a language lies in $\mathsf{QMA}_1$ but not in $\mathsf{QCMA}$ when the $\mathsf{QCMA}$ verifier is only allowed polynomially many adaptive rounds and exponentially many parallel queries per round. Additionally, we derandomize the permutation-oracle separation of Fefferman and Kimmel, obtaining an in-place oracle separation between $\mathsf{QMA}_1$ and $\mathsf{QCMA}$. Furthermore, we focus on $\mathsf{QCMA}$ and $\mathsf{QMA}$ with an exponentially small gap, where we show a separation assuming the gap is fixed, but not when it may be arbitrarily small. Finally, we derive consequences for approximate ground-state preparation from sparse Hamiltonian oracle access, including a bounded-adaptivity frustration-free variant.

Hard-to-Sample Distributions from Robust Extractors

from arXiv: Computational Complexity

Authors: Farzan Byramji, Daniel M. Kane, Jackson Morris, Anthony Ostuni

We provide a unified method for constructing explicit distributions which are difficult for restricted models of computation to generate. Our constructions are based on a new notion of robust extractors, which are extractors that remain sound even when a small number of points violate the min-entropy constraint. Using such objects, we show that for a broad range of sampling models (e.g., low-depth circuits, small-space sources, etc.), every output of the model has distance $1 - o(1)$ from our target distribution, qualitatively recovering essentially all previously known hardness results. Our work extends that of Viola (SICOMP '14), who developed an earlier unified framework based on traditional extractors to rule out sampling with very small error. As a further application of our technique, we leverage a recent extractor construction of Chattopadhyay, Goodman, and Gurumukhani (ITCS '24) to present the first explicit distribution with distance $1 - o(1)$ from the output of any low-degree $\mathbb{F}_2$-polynomial source. We also describe a potential avenue toward proving a similar hardness result for $\mathsf{AC^0}[\oplus]$ circuits.

A proof of Jordan curve theorem based on the sweepline algorithm for trapezoidal decomposition of a polygon

from arXiv: Computational Geometry

Authors: Apurva Mudgal

We prove the Jordan curve theorem by generalizing the sweepline algorithm for trapezoidal decomposition of a polygon. Our proof uses Zorn's lemma (or, equivalently, the axiom of choice). Though several proofs of the Jordan curve theorem have been given by various authors, ours is the {\bf first algorithmic proof} of the theorem using computational geometry. Further, with some preparation, the proof can be taught as part of an undergraduate discrete mathematics course, where until now the proof of this theorem was considered inaccessible.

The Nesting Bird Box Problem is ER-complete: Sharp Hardness Results for the Hidden Set Problem

from arXiv: Computational Geometry

Authors: Lucas Meijer, Till Miltzow, Johanna Ockenfels, Miloš Stojaković

In the (Nesting) Bird Box Problem we are given a polygonal domain P and a number k, and we want to know if there is a set B of k points inside P such that no two points in B can see each other. The underlying idea is that each point represents a birdhouse, and many birds only use a birdhouse if there is no other occupied birdhouse in its vicinity. We say two points a, b see each other if the open segment ab intersects neither the exterior of P nor any vertex of P. We show that the Nesting Bird Box Problem is ER-complete. The complexity class ER can be defined as the set of problems that are polynomial-time equivalent to finding a solution to the equation $p(x) = 0$, with $x \in \mathbb{R}^n$ and $p \in \mathbb{Z}[X_1, \ldots, X_n]$. The proof builds on the techniques developed in the original ER-completeness proof of the Art Gallery problem. However, our proof is significantly shorter for two reasons. First, we can use recently developed tools that were not available at the time. Second, we consider polygonal domains with holes instead of simple polygons.

A stellated tetrahedron that is probably not Rupert

from arXiv: Computational Geometry

Authors: Tony Zeng

A convex polyhedron is Rupert if a hole can be cut into it (making its genus $1$) such that an identical copy of the polyhedron can pass through the hole. Resolving a conjecture of Jerrard-Wetzel-Yuan, Steininger and Yurkevich recently constructed a convex polyhedron which is not Rupert. We propose a search for the simplest possible non-Rupert polyhedron and provide numerical evidence suggesting that a particular stellated tetrahedron is not Rupert. The computational techniques utilize linear program solvers to compute the largest possible scalings of polygons that can be translated to fit in other polygons. The relative simplicity of the stellated tetrahedron as compared to other polyhedra allows this more rudimentary check to be computationally tractable. In particular, we show that over 88% of a particular encoding of $\text{SO}(3) \times \text{SO}(3)$ equipped with the standard measure does not yield a Rupert passage.

Calibrated Persistent Homology Tests for High-dimensional Collapse Detection

from arXiv: Computational Geometry

Authors: Alexander Kalinowski

We study detection of collapse in high-dimensional point clouds, where mass concentrates near a lower-dimensional set relative to a non-collapsed geometry. We propose persistent homology-based test statistics under two well-studied filtrations, with cutoffs calibrated under a broad set of non-collapsed reference models. We benchmark power across three alternative collapse mechanisms (linear/spectral, nonlinear-support, and contamination/heterogeneity) and distill the results into a mechanism map guiding the choice of filtration and statistic.

Strict Hierarchy for Quantum Channel Certification to Unitary

from arXiv: Data Structures and Algorithms

Authors: Kean Chen, Qisheng Wang, Zhicheng Zhang

We consider the problem of quantum channel certification to unitary, where one is given access to an unknown $d$-dimensional channel $\mathcal{E}$ and wants to test whether $\mathcal{E}$ is equal to a target unitary channel or is $\varepsilon$-far from it in the diamond norm. We present optimal quantum algorithms for this problem, settling the query complexities in three access models of increasing power. Specifically, we show that: (i) $Θ(d/\varepsilon^2)$ queries suffice for the incoherent access model, matching the lower bound due to Fawzi, Flammarion, Garivier, and Oufkir (COLT 2023). (ii) $Θ(d/\varepsilon)$ queries suffice for the coherent access model, matching the lower bound due to Regev and Schiff (ICALP 2008). (iii) $Θ(\sqrt{d}/\varepsilon)$ queries suffice for the source-code access model, matching the lower bound due to Jeon and Oh (npj Quantum Inf. 2026). This demonstrates a strict hierarchy of complexities for quantum channel certification to unitary across these access models.

Exact Dynamic Programming for Solow--Polasky Diversity Subset Selection on Lines and Staircases

from arXiv: Data Structures and Algorithms

Authors: Michael T. M. Emmerich

We study exact fixed-cardinality Solow--Polasky diversity subset selection on ordered finite $\ell_1$ sets, with monotone biobjective Pareto fronts and their higher-dimensional staircase analogues as central applications. Solow--Polasky diversity was introduced in biodiversity conservation, whereas the same inverse-matrix expression appears in metric geometry as magnitude: for a finite metric space $(X,d)$ with exponential similarity matrix $Z_{ij}=e^{-q d(x_i,x_j)}$, the quantity $\mathbf{1}^\top Z^{-1}\mathbf{1}$ is the magnitude of the scaled finite metric space $(X,qd)$ whenever the weighting is defined by the inverse matrix. Thus, in this finite exponential-kernel setting, Solow--Polasky diversity and magnitude are mathematically the same object viewed through different motivations. Building on the linear-chain magnitude formula of Leinster and Willerton, we provide a detailed proof of the scaled consecutive-gap identity $\mathrm{SP}(X) = 1 + \sum_r \tanh(q g_r / 2)$, where the $g_r$ are the gaps between consecutive selected points. We then prove an exact Bellman-recursion theorem for maximizing this value over all subsets of a prescribed cardinality, yielding an $O(kn^2)$ dynamic program for an ordered $n$-point candidate set and subset size $k$. Finally, we prove ordered $\ell_1$ reductions showing that the same algorithm applies to monotone biobjective Pareto-front approximations and, more generally, to finite coordinatewise monotone $\ell_1$ staircases in $\mathbb{R}^d$. These are precisely the ordered $\ell_1$ chains for which the Manhattan metric becomes a line metric along the chosen order, so the one-dimensional dynamic program applies without modification. Keywords: Dynamic Programming, Solow--Polasky Diversity, Complexity Theory, Multiobjective Optimization, Pareto front, Magnitude
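The abstract's two quantitative ingredients can be checked numerically for points on a line: the inverse-matrix diversity $\mathbf{1}^\top Z^{-1}\mathbf{1}$ agrees with the gap formula $1 + \sum_r \tanh(q g_r/2)$, and a Bellman recursion over sorted candidates maximizes that value over size-$k$ subsets. The following is a hedged sketch; the function names are mine, and only the formulas quoted in the abstract are assumed.

```python
import math

def _solve(A, b):
    """Gaussian elimination with partial pivoting, for small dense systems."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for j in range(c, n + 1):
                M[r][j] -= f * M[c][j]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][j] * x[j] for j in range(r + 1, n))) / M[r][r]
    return x

def sp_diversity(points, q=1.0):
    """Solow--Polasky diversity 1^T Z^{-1} 1 with Z_ij = exp(-q |x_i - x_j|)."""
    xs = [float(p) for p in points]
    Z = [[math.exp(-q * abs(a - b)) for b in xs] for a in xs]
    return sum(_solve(Z, [1.0] * len(xs)))

def sp_via_gaps(points, q=1.0):
    """Consecutive-gap identity on a line metric: 1 + sum_r tanh(q*g_r/2)."""
    xs = sorted(points)
    return 1.0 + sum(math.tanh(q * (b - a) / 2) for a, b in zip(xs, xs[1:]))

def best_subset_value(points, k, q=1.0):
    """O(k n^2) dynamic program: maximize 1 + sum of tanh(q*gap/2) over
    size-k subsets of the sorted candidates (a sketch of the Bellman recursion)."""
    xs = sorted(points)
    n = len(xs)
    NEG = float("-inf")
    # f[m][j] = best gap-sum using m selected points, the last being xs[j].
    f = [[NEG] * n for _ in range(k + 1)]
    for j in range(n):
        f[1][j] = 0.0
    for m in range(2, k + 1):
        for j in range(n):
            f[m][j] = max(
                (f[m - 1][i] + math.tanh(q * (xs[j] - xs[i]) / 2) for i in range(j)),
                default=NEG,
            )
    return 1.0 + max(f[k])

pts = [0.0, 1.0, 2.5, 4.0]
assert abs(sp_diversity(pts) - sp_via_gaps(pts)) < 1e-9  # identity check
```

Since tanh is increasing, the best 2-subset here picks the two extreme points, so `best_subset_value(pts, 2)` equals `1 + tanh(q * (4.0 - 0.0) / 2)`.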

Small Independent Sets versus Small Separator in Geometric Intersection Graphs

from arXiv: Data Structures and Algorithms

Authors: Malory Marin, Rémi Watrigant

While most classical NP-hard graph problems cannot be solved in time $2^{o(n)}$ on general graphs under the Exponential Time Hypothesis (ETH), many exhibit the square-root phenomenon and admit optimal algorithms running in time $2^{O(\sqrt{n})}$ on certain geometric intersection graphs, such as planar graphs or unit disk graphs. In 2018, de Berg et al. developed a general algorithmic framework for such problems on intersection graphs of similarly sized fat objects in $\mathbb{R}^d$, achieving running times of the form $2^{O(n^{1-1/d})}$, along with matching lower bounds under ETH. In this paper, we identify problems that do not exhibit the square-root phenomenon, yet still admit subexponential algorithms on intersection graphs of similarly sized fat objects in $\mathbb{R}^d$, for every fixed dimension $d \geqslant 2$. We introduce the notion of a weak square-root phenomenon: problems that can be solved in time $2^{\tilde{O}(n^{1-1/(d+1)})}$, and for which matching lower bounds hold under ETH. We develop both an algorithmic framework and a corresponding lower bound framework. As concrete examples, we show that the problems 2-Subcoloring and Two Sets Cut-Uncut exhibit this behavior. Our algorithms rely on a new win-win structural theorem, which can be informally stated as follows: every such graph admits a sublinear separator whose removal leaves connected components with sublinear independence number. To facilitate the design of these algorithms, we introduce a new graph parameter, the $α$-modulator number, which generalizes both the independence number and the vertex cover number.

On the Learning Curves of Revenue Maximization

from arXiv: Data Structures and Algorithms

Authors: Steve Hanneke, Alkis Kalavasis, Shay Moran, Grigoris Velegkas

Learning curves are a fundamental primitive in supervised learning, describing how an algorithm's performance improves with more data and providing a quantitative measure of its generalization ability. Formally, a learning curve plots the decay of an algorithm's error for a fixed underlying distribution as a function of the number of training samples. Prior work on revenue-maximizing learning algorithms, starting with the seminal work of Cole and Roughgarden [STOC, 2014], adopts a distribution-free perspective, which parallels the PAC learning framework in learning theory. This approach evaluates performance against the hardest possible sequence of valuation distributions, one for each sample size, effectively defining the upper envelope of learning curves over all possible distributions, thus leading to error bounds that do not capture the shape of the learning curves. In this work we initiate the study of learning curves for revenue maximization and provide a near-complete characterization of their rate of decay in the basic setting of a single item and a single buyer. In the absence of any restriction on the valuation distribution, we show that there exists a Bayes-consistent algorithm, meaning that its learning curve converges to zero for any arbitrary valuation distribution as the number of samples $n \to \infty$. However, this convergence must be arbitrarily slow, even if the optimal revenue is finite. In contrast, if the optimal revenue is achieved by a finite price, then the optimal rate of decay is roughly $1/\sqrt{n}$. Finally, for distributions supported on discrete sets of values, we show that learning curves decay almost exponentially fast, a rate unattainable under the PAC framework.

Solving Positive Linear Programs with Differential Privacy

from arXiv: Data Structures and Algorithms

Authors: Alina Ene, Huy Le Nguyen, Ta Duy Nguyen, Adrian Vladu

We study differentially private approximation algorithms for positive linear programs (LPs with nonnegative coefficients and variables), focusing on the fundamental families of packing, covering, and mixed packing-covering formulations. We focus on the high-sensitivity, constraint-private regime of Hsu-Roth-Roughgarden-Ullman (ICALP 2014), where neighboring instances may differ by an arbitrary single constraint, so one cannot hope to approximately satisfy every constraint under privacy. We give private solvers that return approximate solutions while violating only a controlled number of constraints. Our algorithms improve the prior instance-dependent guarantees, and also yield new data-independent bounds that depend only on the dimension. Our techniques involve a dense multiplicative weights update method developed from a regularized dual viewpoint, which we analyze in a way that exploits structure specific to positive LPs.

Weighted Emulators with Local Heaviest Edges Stretch for Undirected Graphs

from arXiv: Data Structures and Algorithms

Authors: Liam Roditty, Ariel Sapir

We introduce a generalized family of $\left( 2\cdot \left\lfloor \frac{k}{2} \right\rfloor-1, 2\cdot \left\lceil \frac{k}{2} \right\rceil \cdot W_{1} +\max\left\{0,2\cdot\left(\left\lceil\frac{k}{2}\right\rceil-2\right)\right\}\cdot W_{2} \right)$-emulators with $\tilde O \left(n^{1+\frac{1}{k}}\right)$ edges, for any $k\in\mathbb{N}$, where $W_{i}$ is the $i$th heaviest edge on a shortest path between two vertices. Our construction generalizes the $+2W_{1}$-spanner of size $\tilde O\left(n^{\frac{3}{2}}\right)$ and the $+4W_{1}$-emulator of size $\tilde O \left(n^{\frac{4}{3}}\right)$, both by Elkin, Gitlitz and Neiman [DISC'21 and DICO'23]. When $k$ is even, these are $\left(k-1,k\cdot W_{1} + \left(k-4\right)\cdot W_{2}\right)$-emulators and when $k$ is odd, these are $\left(k-2,\left(k+1\right)\cdot W_{1} + \left(k-3\right) \cdot W_{2}\right)$-emulators. Our framework not only expands known constructions for weighted graphs but also yields an improved stretch over state-of-the-art emulators and spanners for unweighted graphs within a specific distance regime. In particular, for all vertex pairs separated by a distance of $\delta\leq O\left(3^{k^{2}}\right)$, our construction improves upon the seminal additive $+\tilde O\left(\delta^{1-\frac{1}{k}}\right)$-emulator of size $\tilde O\left(n^{1+\frac{1}{2^{k+1}-1}}\right)$ by Thorup and Zwick [SODA'06].

On (In)approximability of MaxMin Independent Set Reconfiguration

from arXiv: Data Structures and Algorithms

Authors: Hung P. Hoang, Naoto Ohsaka, Rin Saito, Yuma Tamura

In the Independent Set Reconfiguration problem under the Token Addition/Removal rule, given a graph $G$ and two independent sets $I$ and $J$ of $G$, we want to transform $I$ into $J$ by adding and removing vertices, such that all the sets throughout the process are independent sets. Its approximate version called MaxMin Independent Set Reconfiguration aims to maximise the minimum size of the independent sets in the process above. We study the (in)approximability of this problem for general graphs as well as restricted graph classes. Firstly, on general graphs, we obtain a polynomial-time $(n / \log n)$-factor approximation algorithm, complementing the $\mathsf{PSPACE}$-hardness of $n^{\Omega(1)}$-factor approximation due to Hirahara and Ohsaka [STOC 2024, ICALP 2024] and the $\mathsf{NP}$-hardness of $n^{1-\varepsilon}$-factor approximation due to Ito, Demaine, Harvey, Papadimitriou, Sideri, Uehara, and Uno [TCS 2011]. Secondly, we present a polynomial-time approximation algorithm for degenerate graphs as well as $\mathsf{FPT}$-approximation schemes for bounded-treewidth graphs and $H$-minor-free graphs. Lastly, we extend the above inapproximability results to bounded-degree graphs, graphs of bandwidth $n^{\frac{1}{2}+\Theta(1)}$, and bipartite graphs.

Path-Reporting Distance Oracles for Vertex-Labeled Graphs

from arXiv: Data Structures and Algorithms

Authors: Ofer Neiman, Alon Spector

Let $G=(V,E)$ be a weighted undirected graph, with $n$ vertices. A distance oracle is a data structure that can quickly answer distance queries, with some stretch factor. A seminal work of \cite{TZ01}, given an integer $k\ge 1$, provides such an oracle with stretch $2k-1$, query time $O(k)$, and size $O(k\cdot n^{1+1/k})$. Furthermore, this oracle can also report a path in $G$ corresponding to the returned distance. In this paper we focus on vertex-labeled graphs, in which each vertex is given a label from a set $L$ of size $\ell$. A {\em vertex-label distance oracle} answers queries of the form $(v,\lambda)$, where $v\in V$ and $\lambda\in L$, by reporting (an approximation to) the distance from $v$ to the closest vertex of label $\lambda$. Following \cite{HLWY11}, it was shown in \cite{C12} that for any integer $k> 1$, there exists a vertex-label distance oracle with stretch $4k-5$, query time $O(k)$, and size $O(k\cdot n\cdot \ell^{1/k})$. This state-of-the-art result suffers from two main drawbacks: The stretch is roughly a factor of 2 larger than in \cite{TZ01}, and it is not path-reporting. We address these concerns in this work, and provide the following results: First, we devise a {\em path-reporting} vertex-label distance oracle, at the cost of a slight increase in stretch and size. For any constant $0<\varepsilon<1$, our oracle has stretch $(4k-5)\cdot(1+\varepsilon)$, query time $O(k)$, and size $O(n^{1+o(1)}\cdot \ell^{1/k})$. Second, we show how to improve the stretch to the optimal $2k-1$, at the cost of mildly increasing the query time. Specifically, we devise a vertex-label distance oracle with stretch $2k-1$, query time $O(\ell^{1/k}\cdot\log n)$, and size $O(k\cdot n\cdot \ell^{1/k})$.

Asymptotically Robust Learning-Augmented Algorithms for Preemptive FIFO Buffer Management

from arXiv: Data Structures and Algorithms

Authors: Wen-Han Hsieh, Ya-Chun Liang

We present a learning-augmented online algorithm for the preemptive FIFO buffer management problem, where packets arrive online to a finite-capacity buffer, must be transmitted in FIFO order, and the algorithm may preemptively discard buffered packets to accommodate future arrivals. Our algorithm simultaneously achieves 1-consistency, $\eta$-smoothness, and asymptotic $\sqrt{3}$-robustness, where $\eta$ denotes the prediction error. Specifically, it attains an optimal competitive ratio of 1 under perfect predictions, degrades smoothly as the prediction error increases, and maintains an asymptotic competitive ratio of $\sqrt{3}$ under arbitrarily inaccurate predictions, matching the best-known worst-case guarantee for the classical online problem, established by Englert and Westermann in 2009 [Algorithmica 53(4): 523-548]. A key technical contribution of our work is the introduction of an \emph{output-based prediction error metric}. Because capacity constraints dictate that only a strictly bounded subset of arriving packets is ultimately transmitted, our metric assesses prediction quality over the resulting optimal schedules rather than the raw input sequences, avoiding artificial error penalties. To guarantee robustness, our algorithm dynamically monitors predictions and executes a \emph{buffer-clearing strategy} upon transitioning to a worst-case fallback mechanism. We prove that the competitive loss incurred by this clearing operation is bounded by an additive capacity constant that vanishes asymptotically. Finally, we show that our algorithm provides a generalized framework for learning-augmented buffer management: substituting the fallback module with any $\beta$-competitive online algorithm immediately yields asymptotic $\beta$-robustness.
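To fix intuition for the problem setting (not the paper's learning-augmented algorithm), here is a toy greedy simulation of a preemptive FIFO buffer: arriving packets carry values, the buffer has fixed capacity, transmissions go out in FIFO order, and when the buffer is full a new packet may preempt the lowest-value buffered packet. All names and the event format are our own illustrative choices.

```python
from collections import deque

def run_buffer(events, capacity):
    """Toy preemptive FIFO buffer (illustrative only): ('arrive', v)
    enqueues a packet of value v, preempting the lowest-value buffered
    packet when the buffer is full; 'send' transmits the head packet."""
    buf = deque()
    sent = []
    for ev in events:
        if ev == "send":
            if buf:
                sent.append(buf.popleft())  # FIFO transmission
        else:
            _, v = ev
            if len(buf) < capacity:
                buf.append(v)
            else:
                # preempt the cheapest buffered packet, but only if the
                # newcomer is strictly more valuable
                worst = min(range(len(buf)), key=lambda i: buf[i])
                if buf[worst] < v:
                    del buf[worst]
                    buf.append(v)
    return sent, list(buf)
```

The greedy drop rule here is a simple stand-in; the paper's contribution is precisely how to do better than such fixed rules by consulting (possibly erroneous) predictions.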

Flashback: A Reversible Bilateral Run-Peeling Decomposition of Strings

from arXiv: Data Structures and Algorithms

Authors: Thomas Konstantinovsky, Gur Yaari

We introduce Flashback, a reversible string decomposition that repeatedly peels the maximal leading and trailing character runs from a sentinel-wrapped input, recording each pair as one bilateral token. Decomposition and reconstruction both run in O(n) time and space. Our central result is a run-pairing theorem: Flashback is equivalent to pairing the first run of the string with the last, the second with the second-to-last, and so on. This gives an exact token count of $1+\lfloor r/2 \rfloor$ for a string with r maximal runs, and matches a lower bound that holds for any admissible bilateral run-peeling scheme. From the run-pairing theorem the main structural properties follow as corollaries: the irreducible peeling kernel uses at most two symbols; palindromes are precisely the strings whose run-length encoding is symmetric with an odd number of runs; the image of the decomposition admits an explicit finite-state characterisation; and changing one run length rewrites exactly one content token.
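The run-pairing view is easy to sketch. The snippet below (our own illustration, omitting the sentinel wrapping and exact token accounting of the paper) computes the maximal runs of a string and pairs the first with the last, the second with the second-to-last, and so on, with an odd middle run reported separately.

```python
from itertools import groupby

def runs(s):
    """Maximal character runs of s as (char, length) pairs."""
    return [(c, sum(1 for _ in g)) for c, g in groupby(s)]

def bilateral_pairs(s):
    """Run-pairing view of the abstract: pair run i with run r-1-i.
    Returns (pairs, middle), where middle is the unpaired central
    run for odd r, else None."""
    r = runs(s)
    pairs = [(r[i], r[-1 - i]) for i in range(len(r) // 2)]
    middle = r[len(r) // 2] if len(r) % 2 else None
    return pairs, middle
```

The palindrome corollary is visible here too: a palindrome's run-length encoding reads the same in both directions and has an odd number of runs.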

Incremental Strongly Connected Components with Predictions

from arXiv: Data Structures and Algorithms

Authors: Ronald Deng, Samuel McCauley, Aidin Niaparast, Helia Niaparast, Bennett Ptak, Shirel Quintanilla, Shikha Singh, Nathan Vosburg

Algorithms with predictions is a growing area that aims to leverage machine-learned predictions to design faster beyond-worst-case algorithms. In this paper, we use this framework to design a learned data structure for the incremental strongly connected components (SCC) problem. In this problem, the $n$ vertices of a graph are known a priori and the $m$ directed edges arrive over time. The goal is to efficiently maintain the strongly connected components of the graph after each insert. Our algorithm receives a possibly erroneous prediction of the edge sequence and uses it to precompute partial solutions to support fast inserts. We show that our algorithm achieves nearly optimal bounds with good predictions and its performance smoothly degrades with the prediction error. We also implement our data structure and perform experiments on real datasets. Our empirical results show that the theory is predictive of practical runtime improvements.
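For context, the naive baseline that a learned incremental data structure must beat is simply recomputing the SCCs from scratch after every insert, at O(n+m) per update. The sketch below (our own illustration, not the paper's data structure) does exactly that with Tarjan's algorithm.

```python
from collections import defaultdict

def sccs(n, edges):
    """Strongly connected components of a digraph on vertices 0..n-1
    via Tarjan's algorithm, returned as a set of frozensets."""
    graph = defaultdict(list)
    for u, v in edges:
        graph[u].append(v)
    index, low, on_stack = {}, {}, set()
    stack, comps, counter = [], [], [0]

    def strongconnect(v):
        index[v] = low[v] = counter[0]
        counter[0] += 1
        stack.append(v)
        on_stack.add(v)
        for w in graph[v]:
            if w not in index:
                strongconnect(w)
                low[v] = min(low[v], low[w])
            elif w in on_stack:
                low[v] = min(low[v], index[w])
        if low[v] == index[v]:  # v is the root of an SCC
            comp = set()
            while True:
                w = stack.pop()
                on_stack.discard(w)
                comp.add(w)
                if w == v:
                    break
            comps.append(frozenset(comp))

    for v in range(n):
        if v not in index:
            strongconnect(v)
    return set(comps)

def incremental_sccs(n, edge_stream):
    """Naive incremental baseline: recompute all SCCs after each insert."""
    seen = []
    for e in edge_stream:
        seen.append(e)
        yield sccs(n, seen)
```

The paper's contribution is to precompute partial solutions from a predicted edge sequence so that inserts are far cheaper than this recompute-everything baseline when predictions are good.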

Converting an Integer to a Decimal String in Under Two Nanoseconds

from arXiv: Data Structures and Algorithms

Authors: Jaël Champagne Gareau, Daniel Lemire

Converting binary integers to variable-length decimal strings is a fundamental operation in computing. Conventional fast approaches rely on recursive division and small lookup tables. We propose a SIMD-based algorithm that leverages integer multiply-add instructions available on recent AMD and Intel processors. Our method eliminates lookup tables entirely and computes multiple quotients and remainders in parallel. Additionally, we introduce a dual-variant design with dynamic selection that adapts to input characteristics: a branch-heavy variant optimized for homogeneous digit-length distributions and a branch-light variant for heterogeneous datasets. Our single-core algorithm consistently outperforms all competing methods across the full range of integer sizes, running 1.4-2x faster than the closest competitor and 2-4x faster than the C++ standard library function std::to_chars across tested workloads.
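The "conventional fast approach" the abstract contrasts against can be sketched as follows: divide by 100 per step and emit two digits at a time from a 200-byte lookup table. This is a generic scalar baseline of our own, not the paper's table-free SIMD algorithm.

```c
#include <stdint.h>
#include <string.h>

/* Two digits per entry: "00", "01", ..., "99". */
static const char DIGITS[201] =
    "0001020304050607080910111213141516171819"
    "2021222324252627282930313233343536373839"
    "4041424344454647484950515253545556575859"
    "6061626364656667686970717273747576777879"
    "8081828384858687888990919293949596979899";

/* Write the decimal form of v into out (NUL-terminated);
   return the number of digits written. */
size_t u64_to_dec(uint64_t v, char *out) {
    char tmp[20];                 /* 2^64 - 1 has 20 digits */
    size_t i = sizeof tmp;
    while (v >= 100) {            /* peel two digits per division */
        unsigned r = (unsigned)(v % 100) * 2;
        v /= 100;
        tmp[--i] = DIGITS[r + 1];
        tmp[--i] = DIGITS[r];
    }
    if (v >= 10) {                /* final two digits */
        unsigned r = (unsigned)v * 2;
        tmp[--i] = DIGITS[r + 1];
        tmp[--i] = DIGITS[r];
    } else {                      /* final single digit */
        tmp[--i] = (char)('0' + v);
    }
    size_t n = sizeof tmp - i;
    memcpy(out, tmp + i, n);
    out[n] = '\0';
    return n;
}
```

The paper's method replaces both the table and the per-step division with SIMD multiply-add instructions that compute many quotient/remainder pairs in parallel.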

Fast and Faithful Edge Bundling using Spectral Sparsification

from arXiv: Data Structures and Algorithms

Authors: Xingjue Jiang, Seok-Hee Hong, Amyra Meidiana, Xianyuan Zeng

Edge bundling reduces the visual complexity of drawings of large and complex graphs by clustering "compatible" edges. However, it often introduces distortion by bundling "unrelated" edges, resulting in misleading, ambiguous drawings. Moreover, existing edge bundling methods often have high computational complexity. We present new edge bundling methods and faithfulness metrics for edge bundling using spectral sparsification, which sparsifies a graph G into a subgraph G' with O(n log n) edges, based on the effective resistance values of edges, preserving the spectrum of G, closely related to important structural properties of G, such as connectivity, clustering, and the commute distance. We first present a new edge bundling method SEB (Spectral Edge Bundling), introducing effective resistance-based compatibility for spectral-faithful bundling, improving distortion and ambiguity. Then, we present a general framework FEB (Fast Edge Bundling) utilizing spectral sparsification to improve the efficiency of existing bundling methods while maintaining a similar quality. We also present FBQ (Fast Bundling Quality), a framework of proxy bundle-faithfulness metrics measuring how faithfully FEB preserves the ground-truth structure of the original edge bundling, with two variants, FBQ_JS (utilizing Jaccard Similarity) and FBQ_SQ (utilizing sampling quality metrics). Extensive experiments using various real-world and synthetic graphs demonstrate the effectiveness of SEB for edge bundling, outperforming state-of-the-art bundling methods on quality metrics, with 46% and 17% average improvement in distortion and ambiguity respectively for SEB2. Furthermore, experiments successfully demonstrate the efficiency of the FEB framework, with 61% runtime improvement over the original edge bundling methods without sparsification, while maintaining a similar quality, with 74% similarity based on FBQ_SQ.

Wednesday, April 29

Because It Doesn't Have To

from Computational Complexity

My favorite quote about networking came from Jim Kurose.

The Internet works so well because it doesn't have to.

The IP and lower layers of the internet stack make no promises of delivery. Complete failure fulfills the protocol. This allows for simpler and more powerful protocols without the extra complexity needed to guarantee success. TCP aims for delivery basically by restarting the IP communication when it fails, and even TCP can report failure to the layers above.

We can say the same about modern artificial intelligence.

Machine learning works so well because it doesn't have to.

With the softmax function that neural nets use to turn scores into output probabilities, a model never completely rules out a possibility; every outcome gets at least some tiny probability. When a problem is just too hard, the net spreads nontrivial probability across several possibilities, as I described in my recent post, where a machine learning model would generate a uniform distribution to capture the output of a pseudorandom generator. Instead of rigidly forcing the model to give us a specific answer, by looking at distributions we allow the models to make mistakes.
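You can see the "never rules anything out" point in a few lines of Python. This is a generic numerically stable softmax, not tied to any particular model: because every exponential is strictly positive, every outcome receives strictly positive probability.

```python
import math

def softmax(logits):
    """Numerically stable softmax: subtract the max before
    exponentiating, then normalize. Every output is strictly
    positive, so no outcome is ever assigned probability zero."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]
```

Even a logit 20 units below the leader still gets a sliver of probability mass, which is exactly the flexibility the post is describing.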

Thus a machine learning model can be correct when it makes probabilistic guesses in situations too complicated to solve directly, which allows it to achieve its best possible performance. Because we allow the models to make mistakes, they have the flexibility to solve complex problems far more frequently.

By Lance Fortnow
