Last Update

OPML feed of all feeds.

Subscribe to the Atom feed, RSS feed to stay up to date.

Thank you to arXiv for use of its open access interoperability.

Note: the date of arXiv entries announced right after publication holidays might incorrectly show up as the date of the publication holiday itself. This is due to our ad hoc method of inferring announcement dates, which are not returned by the arXiv API.

Powered by Pluto.

Source on GitHub.

Maintained by Nima Anari, Arnab Bhattacharyya, Gautam Kamath.

Theory of Computing Report

Thursday, July 02

Conference announcement: Celebrating 100 Years: Avi 70 + CSDM 30

from Theory Matters

Celebrating 100 Years: Avi 70 + CSDM 30Dates: June 14-18, 2027 The Institute for Advanced Study’s School of Mathematics is pleased to announce a conference celebrating Avi Wigderson’s 70th birthday and retirement, together with 30 years of the Computer Science and Discrete Mathematics program (CSDM) at IAS. The conference will honor both Avi’s extraordinary scientific […]

Celebrating 100 Years: Avi 70 + CSDM 30
Dates: June 14-18, 2027

The Institute for Advanced Study’s School of Mathematics is pleased to announce a conference celebrating Avi Wigderson’s 70th birthday and retirement, together with 30 years of the Computer Science and Discrete Mathematics program (CSDM) at IAS. The conference will honor both Avi’s extraordinary scientific legacy and his transformative role in shaping the intellectual community around theoretical computer science and discrete mathematics at IAS.

Avi Wigderson has made foundational contributions across an exceptional range of areas in theoretical computer science and mathematics. His work has had a profound impact on complexity theory broadly, and especially on cryptography, circuit and proof complexity, communication complexity, pseudorandomness, and expander graphs; in recent years, he has also opened deep and unexpected connections to invariant theory. These contributions have shaped major research directions and influenced generations of mathematicians and theoretical computer scientists.

Posted on behalf of the organizers, Irit Dinur and Amir Shpilka. Additional information is available at https://www.ias.edu/math/events/celebrating-100-years-avi-70-csdm-30.

By shuchic

Feasibilism, Explication, and the Cobham-Edmonds Thesis

from arXiv: Computational Complexity

Authors: Abrahim Ladha, Yiran Luo, Alan Tian

While the Church-Turing thesis asserts that effective calculability explicates to sets decidable by a Turing machine, the Cobham-Edmonds thesis asserts that feasible computation explicates to the complexity class $\mathsf{P}$, those decidable by a polynomial-time bounded Turing machine. The Church-Turing thesis has been placed under rigorous scrutiny and has several convincing arguments in its favor, but the Cobham-Edmonds thesis has not undergone a similar examination. Many of the arguments in its favor simply suggest that $\mathsf{P}$ is a useful assumption, rather than a necessary target. This paper presents analogous arguments in favor of the Cobham-Edmonds thesis.

Authors: Abrahim Ladha, Yiran Luo, Alan Tian

While the Church-Turing thesis asserts that effective calculability explicates to sets decidable by a Turing machine, the Cobham-Edmonds thesis asserts that feasible computation explicates to the complexity class $\mathsf{P}$, those decidable by a polynomial-time bounded Turing machine. The Church-Turing thesis has been placed under rigorous scrutiny and has several convincing arguments in its favor, but the Cobham-Edmonds thesis has not undergone a similar examination. Many of the arguments in its favor simply suggest that $\mathsf{P}$ is a useful assumption, rather than a necessary target. This paper presents analogous arguments in favor of the Cobham-Edmonds thesis.

Fast Deterministic Normal Bases and Circulant Polynomial Determinants

from arXiv: Computational Complexity

Authors: Mark Giesbrecht, Armin Jamshidpey, Éric Schost

Let $\mathsf{E}=\mathbb F_q[x]/(Γ)$ be an algebraic extension of degree $n$ over the finite field $\mathbb F_q$, given by a $Γ\in\mathbb F_q[x]$ monic and irreducible. It is classical that any such $\mathsf{E}$ contains an element $β\in\mathsf{E}$ that is normal over $\mathbb F_q$, i.e., the conjugates $β,β^q,\ldots,β^{q^{n-1}}$ form an $\mathbb F_q$-basis of $\mathsf{E}$. In this paper we give a deterministic algorithm which finds such a normal element using $O_ε((n^2\log q)^{1+ε})+O\,\tilde{}\,(n\log^2 q)$ bit operations, for any $ε>0$. The algorithm works by showing that, for a parameter $t\in\mathbb F_q$, the element $β_t=(θ-t)^{-1}$ is normal except for at most $n(n-1)$ values of $t$. This is established by constructing a "cleared Moore" circulant matrix over $\mathbb F_{q^n}[\mathcal T]$, whose determinant degree at most $n(n-1)$, such that $β_t$ is normal if and only the determinant is non-zero at $t\in\mathbb F_q$. For faster computation over the base field, we replace this by an equivalent trace Gram circulant matrix over $\mathbb F_q[\mathcal T]$. A main algorithmic contribution is a fast determinant algorithm for circulant matrices of polynomials, which uses triangular set projection and modular composition techniques to achieve a near-linear cost. Given an $n\times n$ circulant matrix over $\mathbb F_q[t]$ whose entries have degree at most $m>0$, we show how to compute its determinant deterministically with $O_ε((nm\log q)^{1+ε})$ bit operations. We complete the solution by showing how to extend this to finite fields of size less than $n(n-1)$, through an embedding in a low-degree extension field, at poly-logarithmic additional cost.

Authors: Mark Giesbrecht, Armin Jamshidpey, Éric Schost

Let $\mathsf{E}=\mathbb F_q[x]/(Γ)$ be an algebraic extension of degree $n$ over the finite field $\mathbb F_q$, given by a $Γ\in\mathbb F_q[x]$ monic and irreducible. It is classical that any such $\mathsf{E}$ contains an element $β\in\mathsf{E}$ that is normal over $\mathbb F_q$, i.e., the conjugates $β,β^q,\ldots,β^{q^{n-1}}$ form an $\mathbb F_q$-basis of $\mathsf{E}$. In this paper we give a deterministic algorithm which finds such a normal element using $O_ε((n^2\log q)^{1+ε})+O\,\tilde{}\,(n\log^2 q)$ bit operations, for any $ε>0$. The algorithm works by showing that, for a parameter $t\in\mathbb F_q$, the element $β_t=(θ-t)^{-1}$ is normal except for at most $n(n-1)$ values of $t$. This is established by constructing a "cleared Moore" circulant matrix over $\mathbb F_{q^n}[\mathcal T]$, whose determinant degree at most $n(n-1)$, such that $β_t$ is normal if and only the determinant is non-zero at $t\in\mathbb F_q$. For faster computation over the base field, we replace this by an equivalent trace Gram circulant matrix over $\mathbb F_q[\mathcal T]$. A main algorithmic contribution is a fast determinant algorithm for circulant matrices of polynomials, which uses triangular set projection and modular composition techniques to achieve a near-linear cost. Given an $n\times n$ circulant matrix over $\mathbb F_q[t]$ whose entries have degree at most $m>0$, we show how to compute its determinant deterministically with $O_ε((nm\log q)^{1+ε})$ bit operations. We complete the solution by showing how to extend this to finite fields of size less than $n(n-1)$, through an embedding in a low-degree extension field, at poly-logarithmic additional cost.

The Decode-Work Law: Margin-Governed, Provably-Exact Spatial Joins over Compressed Geometry

from arXiv: Computational Geometry

Authors: Madhulatha Mandarapu, Sandeep Kunkunuru

Filter-and-refine spatial joins have always avoided touching exact geometry for certified candidate pairs, but the field never modeled the decompression cost of the pairs that survive the filter. When geometry is stored in a compressed, progressively-decodable multiresolution codec, the join's true cost is bytes decoded. We study provably-exact polygon intersection joins over a Douglas-Peucker level-of-detail (LOD) ladder, certified by a two-sided Hausdorff-margin test, and make two contributions. First, a reproducible mechanism and harness: on real U.S. Census TIGER water polygons, our progressive certificate join returns the exact join result while decoding 3.4-16.8x (median 5.9x) fewer vertices than naive decompress-then-refine, and about 4.9x fewer than the single-approximation multi-step baseline of Brinkhoff et al. (1994), with zero correctness violations (set-equality against a full-precision oracle) across 31 workloads. Second, a characterization we call the decode-work law: decode work is governed by each pair's signed-clearance margin -- how close it is to the predicate-flip boundary -- independent of object size, because the certificate descends the ladder only until its resolution beats the margin. The law is clean on controlled geometry (held-out R2=0.87, size-independent) and directional on real data (R2 ~= 0.55). We are explicit about what does not hold: a near-boundary-vertex predictor is the wrong model (we pre-registered one and rejected it), a selectivity regime forecaster did not materialize, and the worst case is the trivial Omega(v) read bound on adversarially interleaved boundaries. We contribute the mechanism, budget-honest decode accounting, and an open harness; we do not claim a new index.

Authors: Madhulatha Mandarapu, Sandeep Kunkunuru

Filter-and-refine spatial joins have always avoided touching exact geometry for certified candidate pairs, but the field never modeled the decompression cost of the pairs that survive the filter. When geometry is stored in a compressed, progressively-decodable multiresolution codec, the join's true cost is bytes decoded. We study provably-exact polygon intersection joins over a Douglas-Peucker level-of-detail (LOD) ladder, certified by a two-sided Hausdorff-margin test, and make two contributions. First, a reproducible mechanism and harness: on real U.S. Census TIGER water polygons, our progressive certificate join returns the exact join result while decoding 3.4-16.8x (median 5.9x) fewer vertices than naive decompress-then-refine, and about 4.9x fewer than the single-approximation multi-step baseline of Brinkhoff et al. (1994), with zero correctness violations (set-equality against a full-precision oracle) across 31 workloads. Second, a characterization we call the decode-work law: decode work is governed by each pair's signed-clearance margin -- how close it is to the predicate-flip boundary -- independent of object size, because the certificate descends the ladder only until its resolution beats the margin. The law is clean on controlled geometry (held-out R2=0.87, size-independent) and directional on real data (R2 ~= 0.55). We are explicit about what does not hold: a near-boundary-vertex predictor is the wrong model (we pre-registered one and rejected it), a selectivity regime forecaster did not materialize, and the worst case is the trivial Omega(v) read bound on adversarially interleaved boundaries. We contribute the mechanism, budget-honest decode accounting, and an open harness; we do not claim a new index.

The Singular Source of Vineyard Monodromy

from arXiv: Computational Geometry

Authors: Erin W. Chambers, Christopher Fillmore, Shankha Shubhra Mukherjee, Rohit Roy, Elizabeth Stephenson, Mathijs Wintraecken

Vineyards, or time-varying families of persistence diagrams, are widely used in topological data analysis (TDA) pipelines to track how topological features change and evolve as a parameter varies. When the parameter traces a closed loop, a vineyard can exhibit monodromy: diagram points permute over the course of a full traversal, which obstructs feature tracking and can complicate downstream analysis of such data. Chambers et al. considered the periodic vineyards that arise from the radial persistence transform, which maps the manifold to a family of persistence diagrams, where each diagram fixes a base point and considers the filtration that is based on Euclidean distance to that point, and showed that monodromy and knotting can occur. Other recent work by Arya et al. considers geometric conditions that exclude monodromy in two dimensions, in an effort to better understand when this effect happens. That said, understanding when and why monodromy occurs is a fundamental open problem with direct practical consequences for many data analysis pipelines. In this work, we study this question for 1-manifolds in $\mathbb{R}^2$, using a surprising connection with tools from singularity theory, and provide a classification for the causes of monodromy in vineyards. More precisely, we prove that the vineyard of a sufficiently small loop $γ$ cannot exhibit monodromy unless it contains a specific singularity of the distance function. The central geometric object in our analysis is the symmetry set, which is the locus of centers of spheres tangent in more than one point to the manifold; this object classifies singularities of the distance function, and in our setting, dictates precisely when monodromy occurs. This characterization opens the door to the development of algorithmic criteria for detecting and utilizing (or avoiding) monodromy in TDA pipelines.

Authors: Erin W. Chambers, Christopher Fillmore, Shankha Shubhra Mukherjee, Rohit Roy, Elizabeth Stephenson, Mathijs Wintraecken

Vineyards, or time-varying families of persistence diagrams, are widely used in topological data analysis (TDA) pipelines to track how topological features change and evolve as a parameter varies. When the parameter traces a closed loop, a vineyard can exhibit monodromy: diagram points permute over the course of a full traversal, which obstructs feature tracking and can complicate downstream analysis of such data. Chambers et al. considered the periodic vineyards that arise from the radial persistence transform, which maps the manifold to a family of persistence diagrams, where each diagram fixes a base point and considers the filtration that is based on Euclidean distance to that point, and showed that monodromy and knotting can occur. Other recent work by Arya et al. considers geometric conditions that exclude monodromy in two dimensions, in an effort to better understand when this effect happens. That said, understanding when and why monodromy occurs is a fundamental open problem with direct practical consequences for many data analysis pipelines. In this work, we study this question for 1-manifolds in $\mathbb{R}^2$, using a surprising connection with tools from singularity theory, and provide a classification for the causes of monodromy in vineyards. More precisely, we prove that the vineyard of a sufficiently small loop $γ$ cannot exhibit monodromy unless it contains a specific singularity of the distance function. The central geometric object in our analysis is the symmetry set, which is the locus of centers of spheres tangent in more than one point to the manifold; this object classifies singularities of the distance function, and in our setting, dictates precisely when monodromy occurs. This characterization opens the door to the development of algorithmic criteria for detecting and utilizing (or avoiding) monodromy in TDA pipelines.

A Geometric View of Combinatorial Fiedler Theory

from arXiv: Computational Geometry

Authors: José Fernández Goycoolea, Andrea de las Heras-Parrilla, Luis H. Herrera, Clemens Huemer, Carlos Seara

Recently, Andrade and Dahl introduced combinatorial Fiedler theory by studying a parameter $b(G)$ defined as the $\ell_1$-analog of the Rayleigh quotient minimization characterization of the algebraic connectivity of a graph $G=(V,E)$. In this work, we study the corresponding maximization problem, which plays the role of the $\ell_1$-analog of the largest Laplacian eigenvalue. We show that the new parameter $B(G)$ associated with this maximization problem admits a simple exact description: it is the average of the two largest vertex degrees of $G$. A unified combinatorial treatment of the minimization and maximization problems is presented first. Later, both optimization problems are reinterpreted in a geometrical setting. The feasible set is identified with a $(n-2)$-dimensional cuboctahedron shell where $n=|V|$. Additional structure is presented for this polyhedron, including the fact that maximizing solutions arise at its vertices and minimizing solutions arise at the centers of its facets. Finally, we analyze the number of optimal vectors for $b(G)$ and $B(G)$ for several graph families. Although the value of $B(G)$ is determined by the two largest degrees, we prove that counting the vectors that attain this value is actually $\#\mathrm{P}$-complete.

Authors: José Fernández Goycoolea, Andrea de las Heras-Parrilla, Luis H. Herrera, Clemens Huemer, Carlos Seara

Recently, Andrade and Dahl introduced combinatorial Fiedler theory by studying a parameter $b(G)$ defined as the $\ell_1$-analog of the Rayleigh quotient minimization characterization of the algebraic connectivity of a graph $G=(V,E)$. In this work, we study the corresponding maximization problem, which plays the role of the $\ell_1$-analog of the largest Laplacian eigenvalue. We show that the new parameter $B(G)$ associated with this maximization problem admits a simple exact description: it is the average of the two largest vertex degrees of $G$. A unified combinatorial treatment of the minimization and maximization problems is presented first. Later, both optimization problems are reinterpreted in a geometrical setting. The feasible set is identified with a $(n-2)$-dimensional cuboctahedron shell where $n=|V|$. Additional structure is presented for this polyhedron, including the fact that maximizing solutions arise at its vertices and minimizing solutions arise at the centers of its facets. Finally, we analyze the number of optimal vectors for $b(G)$ and $B(G)$ for several graph families. Although the value of $B(G)$ is determined by the two largest degrees, we prove that counting the vectors that attain this value is actually $\#\mathrm{P}$-complete.

Guaranteed Escape for a Bouncing Robot in Pipe Chains

from arXiv: Computational Geometry

Authors: Ahmad Kamaludeen, Somnath Kundu, Yeganeh Bahoo

We study the symmetric bouncing of a point robot within orthogonally-joined rectangles with equal width, which we refer to as pipes. We provide an exhaustive case analysis of every trajectory pattern inside a single rectangular pipe segment, identifying the conditions under which the robot exits. We then extend the analysis to L-shaped pipes and, more generally, to linear chains of $k$ orthogonally connected pipe segments. We prove exit guarantees for the special angle $α= π/4$. Furthermore, these results extend to pipes with curved joints.

Authors: Ahmad Kamaludeen, Somnath Kundu, Yeganeh Bahoo

We study the symmetric bouncing of a point robot within orthogonally-joined rectangles with equal width, which we refer to as pipes. We provide an exhaustive case analysis of every trajectory pattern inside a single rectangular pipe segment, identifying the conditions under which the robot exits. We then extend the analysis to L-shaped pipes and, more generally, to linear chains of $k$ orthogonally connected pipe segments. We prove exit guarantees for the special angle $α= π/4$. Furthermore, these results extend to pipes with curved joints.

Determining the Complexity of Chromatic Sum in Classes Defined by a Set of Forbidden Graphs

from arXiv: Data Structures and Algorithms

Authors: Clément Dallard, Daniël Paulusma, Erik Jan van Leeuwen

The Chromatic Sum problem asks, given a graph $G$ and an integer $k$, whether $G$ admits a colouring $c$ with sum $\sum_{v\in V}c(v) \leq k$. We study the complexity of Chromatic Sum on graph classes defined by some set of forbidden graphs. First, we show that three known frameworks fully classify the complexity of Chromatic Sum on $HH$-minor-free graphs and $HH$-topological-minor-free graphs for any set of graphs $HH$, and on $HH$-subgraph-free graphs for any finite set of graphs $HH$. To show this, we prove a new NP-completeness result for Chromatic Sum on certain subdivisions of planar subcubic graphs. Next, we consider other containment relations. We formalise a novel framework of problems that are NP-complete for planar graphs as well as for graphs of bounded independence number. For every problem in this framework, we obtain an almost complete complexity classification on $H$-induced-minor-free graphs, $H$-induced-topological-minor-free graphs, and $H$-free graphs for every graph $H$. We show that Chromatic Sum belongs to this framework, as do several other problems. We also define a more fine-grained framework for the induced subgraph relation. We apply this to obtain a complete complexity classification for Chromatic Sum on $H$-free graphs, as well as for several other problems. We justify the choice of this framework by proving that Chromatic Sum is NP-complete for graphs of clique-width at most $3$. This result complements a known polynomial-time result for graphs of clique-width at most $2$.

Authors: Clément Dallard, Daniël Paulusma, Erik Jan van Leeuwen

The Chromatic Sum problem asks, given a graph $G$ and an integer $k$, whether $G$ admits a colouring $c$ with sum $\sum_{v\in V}c(v) \leq k$. We study the complexity of Chromatic Sum on graph classes defined by some set of forbidden graphs. First, we show that three known frameworks fully classify the complexity of Chromatic Sum on $HH$-minor-free graphs and $HH$-topological-minor-free graphs for any set of graphs $HH$, and on $HH$-subgraph-free graphs for any finite set of graphs $HH$. To show this, we prove a new NP-completeness result for Chromatic Sum on certain subdivisions of planar subcubic graphs. Next, we consider other containment relations. We formalise a novel framework of problems that are NP-complete for planar graphs as well as for graphs of bounded independence number. For every problem in this framework, we obtain an almost complete complexity classification on $H$-induced-minor-free graphs, $H$-induced-topological-minor-free graphs, and $H$-free graphs for every graph $H$. We show that Chromatic Sum belongs to this framework, as do several other problems. We also define a more fine-grained framework for the induced subgraph relation. We apply this to obtain a complete complexity classification for Chromatic Sum on $H$-free graphs, as well as for several other problems. We justify the choice of this framework by proving that Chromatic Sum is NP-complete for graphs of clique-width at most $3$. This result complements a known polynomial-time result for graphs of clique-width at most $2$.

Independent Set Hardness in Graphs of Bounded Twin-Width and Low-Radius Merge-Width

from arXiv: Data Structures and Algorithms

Authors: Édouard Bonnet, Maël Dumas, Julien Duron

For every $\varepsilon > 0$, Max Independent Set admits a polynomial-time $n^\varepsilon$-approximation algorithm on $n$-vertex graphs of effectively bounded twin-width [Bergé et al., STACS '23]. The approximation factor actually obtained is more precisely $n^{O(1/ \log \log n)}$. Prior to the current paper, no approximation hardness was known for this problem, and the existence of a polynomial-time approximation scheme (PTAS) was repeatedly raised as an open question. We answer this question in a strong sense: We show that there is a constant $γ> 0$ such that a polynomial-time $n^{γ/ (\log \log n)^2}$-approximation algorithm for Max Independent Set on graphs of twin-width at most 4 would refute the Exponential-Time Hypothesis (ETH). This lower bound further holds if a 4-sequence is provided as part of the input. We show the same hardness of approximation for Min Coloring, which also has a nearly matching $n^{O(1/ \log \log n)}$-approximation algorithm on graphs of effectively bounded twin-width. We also clarify the parameterized complexity of $k$-Independent Set on graphs of bounded radius-$r$ merge-width when the range of $r$ is limited. There is a fixed-parameter tractable algorithm for $k$-Independent Set on graphs given with radius-$2^{O(k^2)}$ merge sequences of bounded width [Dreier and Toruńczyk, STOC '25]. We complement this result by showing that $k$-Independent Set is W[1]-hard on graphs given with radius-$o(k)$ merge sequences of bounded width. We further show that this result also holds for $k$-Dominating Set.

Authors: Édouard Bonnet, Maël Dumas, Julien Duron

For every $\varepsilon > 0$, Max Independent Set admits a polynomial-time $n^\varepsilon$-approximation algorithm on $n$-vertex graphs of effectively bounded twin-width [Bergé et al., STACS '23]. The approximation factor actually obtained is more precisely $n^{O(1/ \log \log n)}$. Prior to the current paper, no approximation hardness was known for this problem, and the existence of a polynomial-time approximation scheme (PTAS) was repeatedly raised as an open question. We answer this question in a strong sense: We show that there is a constant $γ> 0$ such that a polynomial-time $n^{γ/ (\log \log n)^2}$-approximation algorithm for Max Independent Set on graphs of twin-width at most 4 would refute the Exponential-Time Hypothesis (ETH). This lower bound further holds if a 4-sequence is provided as part of the input. We show the same hardness of approximation for Min Coloring, which also has a nearly matching $n^{O(1/ \log \log n)}$-approximation algorithm on graphs of effectively bounded twin-width. We also clarify the parameterized complexity of $k$-Independent Set on graphs of bounded radius-$r$ merge-width when the range of $r$ is limited. There is a fixed-parameter tractable algorithm for $k$-Independent Set on graphs given with radius-$2^{O(k^2)}$ merge sequences of bounded width [Dreier and Toruńczyk, STOC '25]. We complement this result by showing that $k$-Independent Set is W[1]-hard on graphs given with radius-$o(k)$ merge sequences of bounded width. We further show that this result also holds for $k$-Dominating Set.

Query Complexity of Hypergraph Connectivity and Learnability using CUT Oracles

from arXiv: Data Structures and Algorithms

Authors: Deeparnab Chakrabarty, Hang Liao

We investigate the power of CUT queries to reveal the structure of unknown hypergraphs. While simple graphs allow for optimal $O(n)$-query connectivity algorithms, hypergraphs face a fundamental identifiability barrier in that distinct hypergraphs can share identical cut-profiles, making exact edge learning impossible in general, a primitive crucial in the graph connectivity algorithms. We first present a zero-error randomized algorithm that identifies the connected components of any weighted hypergraph using $O(n)$ expected queries, matching the $Ω(n)$ lower bound. This approach bypasses the reconstruction barrier by introducing the notion of ``independent families'' -- vertex subpartitions that do not share hyperedges -- and iteratively coarsening them using auxiliary weighted graph connectivity techniques [Liao-Chakrabarty, 2024]. Second, we demonstrate that the impossibility of exact learning depends on hyperedge parity. For even-parity hypergraphs, we show that the structure is reconstructible using a Möbius transform on the CUT function to implement binary-search-style vertex identification. This yields deterministic algorithms for obtaining $k$-connectivity certificates for $r$-bounded even hypergraphs in $\tilde{O}_r(kn)$ queries. Finally, we bypass parity and rank constraints for linear hypergraphs, achieving a subquadratic $\tilde{O}(kn^{1.5})$ query complexity for $k$-connectivity. This significantly improves upon the general $\tilde{O}(n^2)$ bound derived via symmetric submodular function minimization.

Authors: Deeparnab Chakrabarty, Hang Liao

We investigate the power of CUT queries to reveal the structure of unknown hypergraphs. While simple graphs allow for optimal $O(n)$-query connectivity algorithms, hypergraphs face a fundamental identifiability barrier in that distinct hypergraphs can share identical cut-profiles, making exact edge learning impossible in general, a primitive crucial in the graph connectivity algorithms. We first present a zero-error randomized algorithm that identifies the connected components of any weighted hypergraph using $O(n)$ expected queries, matching the $Ω(n)$ lower bound. This approach bypasses the reconstruction barrier by introducing the notion of ``independent families'' -- vertex subpartitions that do not share hyperedges -- and iteratively coarsening them using auxiliary weighted graph connectivity techniques [Liao-Chakrabarty, 2024]. Second, we demonstrate that the impossibility of exact learning depends on hyperedge parity. For even-parity hypergraphs, we show that the structure is reconstructible using a Möbius transform on the CUT function to implement binary-search-style vertex identification. This yields deterministic algorithms for obtaining $k$-connectivity certificates for $r$-bounded even hypergraphs in $\tilde{O}_r(kn)$ queries. Finally, we bypass parity and rank constraints for linear hypergraphs, achieving a subquadratic $\tilde{O}(kn^{1.5})$ query complexity for $k$-connectivity. This significantly improves upon the general $\tilde{O}(n^2)$ bound derived via symmetric submodular function minimization.

Online Fair Division Meets Reordering Buffers

from arXiv: Data Structures and Algorithms

Authors: Georgios Amanatidis, Giulio Giaconi, Evangelos Markakis, Nicos Protopapas

We study the online fair division of indivisible mixed manna among agents with additive valuation functions. Under the standard online model, at each time step an indivisible item arrives; each agent may assign it a positive, negative, or zero value, and it must be irrevocably allocated, before the arrival of the next item. At the same time, we also wish to maintain some fairness guarantee, and in this work we focus on envy-freeness (EF) and one of its most prominent relaxations, envy-freeness up to one item (EF1). Given the strong negative and the scarce positive results for this problem without additional assumptions, we augment our algorithms with buffers that can store and rearrange a limited number of items. This setting interpolates naturally between the fully online case (no buffer) and the fully offline case (a buffer large enough to hold all items). We show that algorithms equipped with reasonably sized buffers can achieve strong guarantees for personalized $k$-value instances, i.e., instances in which each agent assigns at most $k$ distinct values to items. In particular, we construct allocations that are EF1 at every time step and EF at most time steps, using a buffer of size linear in $k$ and in the number of agents. Our approach relies on novel combinatorial arguments and on constructing a sequence of envy-free matchings that allocates most items. Finally, we extend our results to general additive valuation functions, with a dependence on the largest per-agent ratio between two values of the same sign, and we also identify limitations of our approach via impossibility results on the use of buffers with smaller size.

Authors: Georgios Amanatidis, Giulio Giaconi, Evangelos Markakis, Nicos Protopapas

We study the online fair division of indivisible mixed manna among agents with additive valuation functions. Under the standard online model, at each time step an indivisible item arrives; each agent may assign it a positive, negative, or zero value, and it must be irrevocably allocated, before the arrival of the next item. At the same time, we also wish to maintain some fairness guarantee, and in this work we focus on envy-freeness (EF) and one of its most prominent relaxations, envy-freeness up to one item (EF1). Given the strong negative and the scarce positive results for this problem without additional assumptions, we augment our algorithms with buffers that can store and rearrange a limited number of items. This setting interpolates naturally between the fully online case (no buffer) and the fully offline case (a buffer large enough to hold all items). We show that algorithms equipped with reasonably sized buffers can achieve strong guarantees for personalized $k$-value instances, i.e., instances in which each agent assigns at most $k$ distinct values to items. In particular, we construct allocations that are EF1 at every time step and EF at most time steps, using a buffer of size linear in $k$ and in the number of agents. Our approach relies on novel combinatorial arguments and on constructing a sequence of envy-free matchings that allocates most items. Finally, we extend our results to general additive valuation functions, with a dependence on the largest per-agent ratio between two values of the same sign, and we also identify limitations of our approach via impossibility results on the use of buffers with smaller size.

Fair Allocation under Conflict Constraints via Strong Colorability

from arXiv: Data Structures and Algorithms

Authors: Ishay Haviv

In the fair allocation problem under conflict constraints, the goal is to partition the vertices of a graph among agents in a fair manner, such that no two adjacent vertices are assigned to the same agent. We study this problem for agents with common preferences through the lens of three fairness criteria: stochastic-dominance envy-freeness up to one item for preference orders (SD-EF1), envy-freeness up to one item for monotone additive valuations (EF1), and envy-freeness up to one item from each side for general additive valuations (EF[1,1]). To do so, we introduce a hierarchy of variants of the strong chromatic number, a graph quantity introduced independently by Alon and Fellows in the early nineties. Our results reveal a close connection between fair allocation under conflict constraints and the first two levels of this hierarchy, providing a unified route to both existential and algorithmic results. For SD-EF1, we fully characterize the number of agents needed to guarantee a fair allocation of a given graph for every common preference order. For EF1 and EF[1,1], we provide analogous sufficient conditions, extending a result on path graphs due to Equbal, Gurjar, Igarashi, Kumar, Manurangsi, Nath, Saxena, Vaish, and Yoneda. We also show that, unlike in the SD-EF1 setting, the sufficient conditions for EF1 and EF[1,1] are not necessary in general. Our framework yields existential and algorithmic consequences in terms of the maximum degree. We obtain that every graph with maximum degree $Δ$ admits SD-EF1, EF1, and EF[1,1] allocations for common preferences whenever the number of agents is at least $3Δ-1$. We further provide, for any $\varepsilon>0$, deterministic polynomial-time algorithms that find such allocations whenever the number of agents is at least $(3+\varepsilon)Δ$. These guarantees strengthen earlier work by Barman and Viswanathan on equitable colorings.

Authors: Ishay Haviv

In the fair allocation problem under conflict constraints, the goal is to partition the vertices of a graph among agents in a fair manner, such that no two adjacent vertices are assigned to the same agent. We study this problem for agents with common preferences through the lens of three fairness criteria: stochastic-dominance envy-freeness up to one item for preference orders (SD-EF1), envy-freeness up to one item for monotone additive valuations (EF1), and envy-freeness up to one item from each side for general additive valuations (EF[1,1]). To do so, we introduce a hierarchy of variants of the strong chromatic number, a graph quantity introduced independently by Alon and Fellows in the early nineties. Our results reveal a close connection between fair allocation under conflict constraints and the first two levels of this hierarchy, providing a unified route to both existential and algorithmic results. For SD-EF1, we fully characterize the number of agents needed to guarantee a fair allocation of a given graph for every common preference order. For EF1 and EF[1,1], we provide analogous sufficient conditions, extending a result on path graphs due to Equbal, Gurjar, Igarashi, Kumar, Manurangsi, Nath, Saxena, Vaish, and Yoneda. We also show that, unlike in the SD-EF1 setting, the sufficient conditions for EF1 and EF[1,1] are not necessary in general. Our framework yields existential and algorithmic consequences in terms of the maximum degree. We obtain that every graph with maximum degree $Δ$ admits SD-EF1, EF1, and EF[1,1] allocations for common preferences whenever the number of agents is at least $3Δ-1$. We further provide, for any $\varepsilon>0$, deterministic polynomial-time algorithms that find such allocations whenever the number of agents is at least $(3+\varepsilon)Δ$. These guarantees strengthen earlier work by Barman and Viswanathan on equitable colorings.

Tighter Bounds for Wheeler Determinization

from arXiv: Data Structures and Algorithms

Authors: Philip Bille, Inge Li Gørtz, Máximo Pérez-López, Simon R. Tarnow

Given a Wheeler NFA $\mathcal{A}$, the Wheeler determinization problem is to construct a Wheeler DFA $\mathcal{D}$ that accepts the same language as $\mathcal{A}$. We use the notation $n_{\mathcal{A}},m_{\mathcal{A}}$ for the number of vertices and edges of $\mathcal{A}$, and equivalently $n_{\mathcal{D}},m_{\mathcal{D}}$ for $\mathcal{D}$. Alanko et al. [SODA 2020, Inf. Comp. 2021] show that we can solve this problem in $O(n_{\mathcal{A}}^3)$ time. In this paper, we show how to improve the running time to $O(n_{\mathcal{A}} + m_{\mathcal{A}} + n_{\mathcal{D}} + m_{\mathcal{D}})$ when given the Wheeler order of $\mathcal{A}$ (which can be computed in $O(m_{\mathcal{A}}\log n_{\mathcal{A}})$ with an algorithm by Becker et al. [ESA 2023]). Our running time is a factor $n_{\mathcal{A}}^2/σ$ faster than the state of the art, where $σ$ is the size of the alphabet. Furthermore, for $σ=O(1)$ we have the first linear time algorithm for this problem. We show that our bound is tight for sorted inputs with any combination of $n$ and $σ$, by giving a family of inputs for which our output $\mathcal{D}$ is minimum, and of maximum size $Θ(nσ)$.

Authors: Philip Bille, Inge Li Gørtz, Máximo Pérez-López, Simon R. Tarnow

Given a Wheeler NFA $\mathcal{A}$, the Wheeler determinization problem is to construct a Wheeler DFA $\mathcal{D}$ that accepts the same language as $\mathcal{A}$. We use the notation $n_{\mathcal{A}},m_{\mathcal{A}}$ for the number of vertices and edges of $\mathcal{A}$, and equivalently $n_{\mathcal{D}},m_{\mathcal{D}}$ for $\mathcal{D}$. Alanko et al. [SODA 2020, Inf. Comp. 2021] show that we can solve this problem in $O(n_{\mathcal{A}}^3)$ time. In this paper, we show how to improve the running time to $O(n_{\mathcal{A}} + m_{\mathcal{A}} + n_{\mathcal{D}} + m_{\mathcal{D}})$ when given the Wheeler order of $\mathcal{A}$ (which can be computed in $O(m_{\mathcal{A}}\log n_{\mathcal{A}})$ with an algorithm by Becker et al. [ESA 2023]). Our running time is a factor $n_{\mathcal{A}}^2/σ$ faster than the state of the art, where $σ$ is the size of the alphabet. Furthermore, for $σ=O(1)$ we have the first linear time algorithm for this problem. We show that our bound is tight for sorted inputs with any combination of $n$ and $σ$, by giving a family of inputs for which our output $\mathcal{D}$ is minimum, and of maximum size $Θ(nσ)$.

Sharp Bounds for Dynamic Averaging on Cycles

from arXiv: Data Structures and Algorithms

Authors: Dean Kraizberg, Ron Peretz

We study a dynamic averaging process on the cycle \(C_n\). At each discrete time, an edge is chosen uniformly at random, one unit of load is introduced, and the two endpoint loads are replaced by their common average after the new unit has been added. Starting from the zero configuration, we prove that the expected gap between the largest and smallest loads is \(O(\sqrt n)\), uniformly in time. Building on the lower-bound argument of Alistarh, Nadiradze, and Sabour for the expected square of the gap, we further show that the expected gap is \(Ω(\sqrt n)\) in the long run. This confirms their conjecture that the expected gap is of order \(\sqrt n\).

Authors: Dean Kraizberg, Ron Peretz

We study a dynamic averaging process on the cycle \(C_n\). At each discrete time, an edge is chosen uniformly at random, one unit of load is introduced, and the two endpoint loads are replaced by their common average after the new unit has been added. Starting from the zero configuration, we prove that the expected gap between the largest and smallest loads is \(O(\sqrt n)\), uniformly in time. Building on the lower-bound argument of Alistarh, Nadiradze, and Sabour for the expected square of the gap, we further show that the expected gap is \(Ω(\sqrt n)\) in the long run. This confirms their conjecture that the expected gap is of order \(\sqrt n\).

Tighter bounds for weighted and unweighted shortest cycle approximation

from arXiv: Data Structures and Algorithms

Authors: Avi Kadria, Liam Roditty, Virginia Vassilevska Williams

We study the problem of approximating the length of a shortest cycle in a given graph, known as the girth of the graph. The state-of-the-art approximation algorithms for unweighted graphs by Kadria et al. [SODA'22] and Roditty and Trabelsi [arXiv'25] achieve the following trade-off: for every integer $k\geq 2$, there is an $\tilde{O}(n^{1+2/k})$ time algorithm that achieves a $(2k/3)$-approximation for the girth in unweighted $n$-node graphs. The first result of this paper is to achieve the same trade-off for $m$-edge, $n$-node graphs with non-negative real edge weights: a $2k/3$-approximation algorithm running in $\tilde{O}(m+n^{1+2/k})$ time. The dependence on $m$ is unavoidable in weighted graphs. Our result improves on the work of Kadria et al.~[SODA'23] and Ducoffe [ICALP'19 and SIDMA'21], who were only able to achieve such a trade-off for some values of $k$. We also prove new fine-grained lower bounds for girth approximation and related problems in unweighted graphs.

Authors: Avi Kadria, Liam Roditty, Virginia Vassilevska Williams

We study the problem of approximating the length of a shortest cycle in a given graph, known as the girth of the graph. The state-of-the-art approximation algorithms for unweighted graphs by Kadria et al. [SODA'22] and Roditty and Trabelsi [arXiv'25] achieve the following trade-off: for every integer $k\geq 2$, there is an $\tilde{O}(n^{1+2/k})$ time algorithm that achieves a $(2k/3)$-approximation for the girth in unweighted $n$-node graphs. The first result of this paper is to achieve the same trade-off for $m$-edge, $n$-node graphs with non-negative real edge weights: a $2k/3$-approximation algorithm running in $\tilde{O}(m+n^{1+2/k})$ time. The dependence on $m$ is unavoidable in weighted graphs. Our result improves on the work of Kadria et al.~[SODA'23] and Ducoffe [ICALP'19 and SIDMA'21], who were only able to achieve such a trade-off for some values of $k$. We also prove new fine-grained lower bounds for girth approximation and related problems in unweighted graphs.

Space-Optimal Sensitivity Oracles for Single-Source Mincuts

from arXiv: Data Structures and Algorithms

Authors: Koustav Bhanja, Merav Parter, Asaf Petruschka

We study Single-Source Mincut Sensitivity Oracles: compact data structures that, when queried with an edge e, report those affected vertices whose mincut value to source $s$ changes upon the insertion or failure of e. Insertion queries were treated by Baswana, Gupta, and Knollmann [Algorithmica '22], who showed an extremely compact oracle with only O(n) space. In this work, we consider edge failure queries, which are of even greater interest, but far more challenging. The current-best approaches give O(n^2) space: either using n-1 fixed-pair oracles of O(n) space each, based on the Picard-Queyranne representation [MPS '80], or using the O(n^2) space all-pairs oracle by Baswana and Pandey [SODA '22]. -Our key result is an optimal O(n) space single-source mincut sensitivity oracle for edge failure queries. It reports the set of affected vertices in O(n) time, thus matching the state-of-the-art bounds for the insertion case. -Additionally, we provide oracles with near-optimal query times at the cost of increasing the space to O(n^{1.5}). They can determine if any given vertex is affected by an insertion/failure of an edge in O(log n) time, or reports all affected vertices in amortized O(\log^3 n) time per vertex. Such oracles of subquadratic space were previously unknown, even for insertion. Our main technical contribution is in establishing novel and intricate connections between two seemingly distant objects, representing two different families of mincuts. The first is the DAG representation of farthest mincuts to the source, which was the central tool introduced by Baswana, Gupta, and Knollmann. The second is the Connectivity Carcass for Steiner mincuts of Dinitz and Vainshtein [STOC '94], which generalizes well-known cactus representations of global mincuts. Our work demonstrates the relatively unexplored potential of the carcass beyond its obvious Steiner mincuts scope.

Authors: Koustav Bhanja, Merav Parter, Asaf Petruschka

We study Single-Source Mincut Sensitivity Oracles: compact data structures that, when queried with an edge e, report those affected vertices whose mincut value to source $s$ changes upon the insertion or failure of e. Insertion queries were treated by Baswana, Gupta, and Knollmann [Algorithmica '22], who showed an extremely compact oracle with only O(n) space. In this work, we consider edge failure queries, which are of even greater interest, but far more challenging. The current-best approaches give O(n^2) space: either using n-1 fixed-pair oracles of O(n) space each, based on the Picard-Queyranne representation [MPS '80], or using the O(n^2) space all-pairs oracle by Baswana and Pandey [SODA '22]. -Our key result is an optimal O(n) space single-source mincut sensitivity oracle for edge failure queries. It reports the set of affected vertices in O(n) time, thus matching the state-of-the-art bounds for the insertion case. -Additionally, we provide oracles with near-optimal query times at the cost of increasing the space to O(n^{1.5}). They can determine if any given vertex is affected by an insertion/failure of an edge in O(log n) time, or reports all affected vertices in amortized O(\log^3 n) time per vertex. Such oracles of subquadratic space were previously unknown, even for insertion. Our main technical contribution is in establishing novel and intricate connections between two seemingly distant objects, representing two different families of mincuts. The first is the DAG representation of farthest mincuts to the source, which was the central tool introduced by Baswana, Gupta, and Knollmann. The second is the Connectivity Carcass for Steiner mincuts of Dinitz and Vainshtein [STOC '94], which generalizes well-known cactus representations of global mincuts. Our work demonstrates the relatively unexplored potential of the carcass beyond its obvious Steiner mincuts scope.

Improved Approximation Algorithms for Parallel Task Scheduling and Multiple Cluster Scheduling

from arXiv: Data Structures and Algorithms

Authors: Bennet Edler, Klaus Jansen, Felix Ohnesorge, Lis Pirotton

In the problem of Parallel Task Scheduling (PTS), we are asked to schedule $n$ jobs, each with a fixed processing time and machine requirement, such that the completion time of the last job is minimized. Jansen and Rau (2019) presented an algorithm for PTS that achieves an approximation ratio of $(3/2)\text{OPT} + p_{\max}$. They additionally posed the open question whether an approximation ratio of $(4/3)\text{OPT} + p_{\max}$ is possible. In this work, we present such an algorithm with a running time of $O(n\log n)$. The problem of Multiple Cluster Scheduling (MCS) is a natural extension of PTS where we are given $N$ clusters each of $m$ machines to schedule jobs. Jansen and Rau (2019) adapted their PTS algorithm to MCS with the following results: (1) a 2 approximation, and (2) a near-linear 9/4 approximation if $N$ is divisible by 3. We improve the running time of their 2-approximation and generalize the 9/4 approximation to the general case. The 2-approximation for MCS is tight, since one cannot hope for an approximation ratio better than 2, unless P=NP [Zhuk, 2006]. In addition to our theoretical results, we implement our algorithm and show its practical applicability.

Authors: Bennet Edler, Klaus Jansen, Felix Ohnesorge, Lis Pirotton

In the problem of Parallel Task Scheduling (PTS), we are asked to schedule $n$ jobs, each with a fixed processing time and machine requirement, such that the completion time of the last job is minimized. Jansen and Rau (2019) presented an algorithm for PTS that achieves an approximation ratio of $(3/2)\text{OPT} + p_{\max}$. They additionally posed the open question whether an approximation ratio of $(4/3)\text{OPT} + p_{\max}$ is possible. In this work, we present such an algorithm with a running time of $O(n\log n)$. The problem of Multiple Cluster Scheduling (MCS) is a natural extension of PTS where we are given $N$ clusters each of $m$ machines to schedule jobs. Jansen and Rau (2019) adapted their PTS algorithm to MCS with the following results: (1) a 2 approximation, and (2) a near-linear 9/4 approximation if $N$ is divisible by 3. We improve the running time of their 2-approximation and generalize the 9/4 approximation to the general case. The 2-approximation for MCS is tight, since one cannot hope for an approximation ratio better than 2, unless P=NP [Zhuk, 2006]. In addition to our theoretical results, we implement our algorithm and show its practical applicability.

The Binary Tree Mechanism is Optimal for Approximate Differentially Private Continual Counting

from arXiv: Data Structures and Algorithms

Authors: Konstantina Bairaktari, Kasper Green Larsen

Private continual counting is a fundamental problem in differential privacy: given a binary stream of length $n$, where each $1$ corresponds to the contribution of one individual, the goal is to release all running counts while protecting the privacy of each individual. The standard algorithm is the binary tree mechanism, whose Gaussian-noise variant achieves expected $\ell_\infty$ error proportional to $\log^{3/2} n$ for approximate differential privacy. Whether this dependence on the stream length is necessary has remained a central open problem. In this work, we resolve the dependence on $n$ by proving that every differentially private mechanism for continual counting must incur expected $\ell_\infty$ error $Ω(\log^{3/2} n)$. This shows that the binary tree mechanism is asymptotically optimal in the approximate-DP setting. As a consequence, we also obtain a largest-possible separation between hereditary discrepancy and private $\ell_\infty$ error for linear queries, showing that the known general upper bound in terms of hereditary discrepancy has the optimal dependence on the number of queries.

Authors: Konstantina Bairaktari, Kasper Green Larsen

Private continual counting is a fundamental problem in differential privacy: given a binary stream of length $n$, where each $1$ corresponds to the contribution of one individual, the goal is to release all running counts while protecting the privacy of each individual. The standard algorithm is the binary tree mechanism, whose Gaussian-noise variant achieves expected $\ell_\infty$ error proportional to $\log^{3/2} n$ for approximate differential privacy. Whether this dependence on the stream length is necessary has remained a central open problem. In this work, we resolve the dependence on $n$ by proving that every differentially private mechanism for continual counting must incur expected $\ell_\infty$ error $Ω(\log^{3/2} n)$. This shows that the binary tree mechanism is asymptotically optimal in the approximate-DP setting. As a consequence, we also obtain a largest-possible separation between hereditary discrepancy and private $\ell_\infty$ error for linear queries, showing that the known general upper bound in terms of hereditary discrepancy has the optimal dependence on the number of queries.

Warm-Starting All-Pairs Shortest Paths with Predictions

from arXiv: Data Structures and Algorithms

Authors: Adam Polak, Jonas Schmidt

One of the three key hypotheses of fine-grained complexity asserts that computing All-Pairs Shortest Paths (APSP) requires cubic time, up to subpolynomial factors, in the worst case. We initiate the study of APSP in the paradigm of algorithms with predictions, also known as learning-augmented algorithms. We propose an APSP algorithm that takes as additional input a \emph{prediction} (e.g., given by a model learned from similar instances seen in the past) consisting of sets of vertices causing the shortest \emph{detour} for each pair of vertices. The algorithm runs in time $\mathcal{O}(n^{2.83} + ηn)$, where $η$ denotes the \emph{prediction error} defined as the number of pairs of vertices for which, informally speaking, the prediction was not sufficient to compute and certify optimality of the shortest path length. This is already subcubic when the prediction error is (polynomially) smaller than its maximum possible values $n^2$, i.e., whenever the prediction is at least slightly better than terrible. We build on the co-nondeterministic algorithm for the Exact Triangle problem by Chan, Vassilevska Williams, and Xu (STOC 2023), essentially enabling this algorithm to detect mistakes in the nondeterministic certificate and recover from them. Our result constitutes the first necessary step towards designing learning-augmented algorithms for problems with known fine-grained lower bounds conditioned on the APSP Hypothesis.

Authors: Adam Polak, Jonas Schmidt

One of the three key hypotheses of fine-grained complexity asserts that computing All-Pairs Shortest Paths (APSP) requires cubic time, up to subpolynomial factors, in the worst case. We initiate the study of APSP in the paradigm of algorithms with predictions, also known as learning-augmented algorithms. We propose an APSP algorithm that takes as additional input a \emph{prediction} (e.g., given by a model learned from similar instances seen in the past) consisting of sets of vertices causing the shortest \emph{detour} for each pair of vertices. The algorithm runs in time $\mathcal{O}(n^{2.83} + ηn)$, where $η$ denotes the \emph{prediction error} defined as the number of pairs of vertices for which, informally speaking, the prediction was not sufficient to compute and certify optimality of the shortest path length. This is already subcubic when the prediction error is (polynomially) smaller than its maximum possible values $n^2$, i.e., whenever the prediction is at least slightly better than terrible. We build on the co-nondeterministic algorithm for the Exact Triangle problem by Chan, Vassilevska Williams, and Xu (STOC 2023), essentially enabling this algorithm to detect mistakes in the nondeterministic certificate and recover from them. Our result constitutes the first necessary step towards designing learning-augmented algorithms for problems with known fine-grained lower bounds conditioned on the APSP Hypothesis.

Submodular Maximization over Many Matroids via Ordered Local Search

from arXiv: Data Structures and Algorithms

Authors: Neta Singer, Theophile Thiery

Given a monotone submodular function, we consider the problem of finding a maximum-valued set in the intersection of $k$ matroids. Our main result is a polynomial time local search based algorithm achieving a $\frac{k}{2} + o(k)$ approximation guarantee. This asymptotically matches the best-known guarantee of $\frac{k}{2} + ε$ in the unweighted setting by Lee, Sviridenko, and Vondrák (2009). Prior to this work, the state-of-the-art was a $\frac{\ln(4)k}{1+\ln(2)} + o(k)$-approximation algorithm obtained by Feldman and Ward (2026). Our approach extends to Matroid $k$-Parity yielding the same approximation guarantee. In contrast to the weight bucketing approach underlying the recent advances of Singer and Thiery (2025) and Feldman and Ward (2026), our algorithm processes elements greedily in decreasing order of marginal value and searches for sufficiently profitable swaps, whose gain exceeds a parameter $α$ given as a function of $k$. We further combine this idea with the weight bucketing approach to obtain improved guarantees for weighted $k$-Set Packing. Our second main result is a $\frac{\ln(4)k}{3} + o(k)$-approximation algorithm for weighted $k$-Set Packing, improving on the state of the art $\frac{k}{2.00561} + O(1)$-approximation by Neuwohner (2023).

Authors: Neta Singer, Theophile Thiery

Given a monotone submodular function, we consider the problem of finding a maximum-valued set in the intersection of $k$ matroids. Our main result is a polynomial time local search based algorithm achieving a $\frac{k}{2} + o(k)$ approximation guarantee. This asymptotically matches the best-known guarantee of $\frac{k}{2} + ε$ in the unweighted setting by Lee, Sviridenko, and Vondrák (2009). Prior to this work, the state-of-the-art was a $\frac{\ln(4)k}{1+\ln(2)} + o(k)$-approximation algorithm obtained by Feldman and Ward (2026). Our approach extends to Matroid $k$-Parity yielding the same approximation guarantee. In contrast to the weight bucketing approach underlying the recent advances of Singer and Thiery (2025) and Feldman and Ward (2026), our algorithm processes elements greedily in decreasing order of marginal value and searches for sufficiently profitable swaps, whose gain exceeds a parameter $α$ given as a function of $k$. We further combine this idea with the weight bucketing approach to obtain improved guarantees for weighted $k$-Set Packing. Our second main result is a $\frac{\ln(4)k}{3} + o(k)$-approximation algorithm for weighted $k$-Set Packing, improving on the state of the art $\frac{k}{2.00561} + O(1)$-approximation by Neuwohner (2023).

Accelerating Discrete Diffusion Models with Parallel-In-Time Sampling

from arXiv: Data Structures and Algorithms

Authors: Yu Yao, Huanjian Zhou, Andi Han, Wei Huang, Masashi Sugiyama

Discrete diffusion models are widely used for learning and generating discrete distributions. As the generation process is inherently sequential, the acceleration of sampling is of significant importance. In this work, we parallelize the mainstream $τ$-leaping algorithm for absorbing discrete diffusion in a Continuous-Time Markov Chain (CTMC) framework. By leveraging the continuous-time stochastic integral form of the $τ$-leaping algorithm and the Picard iteration method, we achieve parallel-in-time sampling acceleration and provide a proof of exponential-factorial convergence for our algorithm. We improve the overall time complexity of $τ$-leaping under absorbing settings from ${\mathcal{O}}(d \log S)$ to ${\mathcal{O}}(\log (d\log S)\cdot \log d)$ with respect to NFE. Empirically, our method shows consistent acceleration across synthetic and real-data settings. The new sampler achieves at most $7$--$9\times$ runtime speedup for synthetic distribution, and maintains the same quality with $50\%$ fewer NFE and $1.45$--$1.86\times$ runtime speedups in image/text tasks on a single GPU. Our research expands the potential of discrete diffusion models for efficient parallel inference, with broader implications for applications such as molecular structure and language generation.

Authors: Yu Yao, Huanjian Zhou, Andi Han, Wei Huang, Masashi Sugiyama

Discrete diffusion models are widely used for learning and generating discrete distributions. As the generation process is inherently sequential, the acceleration of sampling is of significant importance. In this work, we parallelize the mainstream $τ$-leaping algorithm for absorbing discrete diffusion in a Continuous-Time Markov Chain (CTMC) framework. By leveraging the continuous-time stochastic integral form of the $τ$-leaping algorithm and the Picard iteration method, we achieve parallel-in-time sampling acceleration and provide a proof of exponential-factorial convergence for our algorithm. We improve the overall time complexity of $τ$-leaping under absorbing settings from ${\mathcal{O}}(d \log S)$ to ${\mathcal{O}}(\log (d\log S)\cdot \log d)$ with respect to NFE. Empirically, our method shows consistent acceleration across synthetic and real-data settings. The new sampler achieves at most $7$--$9\times$ runtime speedup for synthetic distribution, and maintains the same quality with $50\%$ fewer NFE and $1.45$--$1.86\times$ runtime speedups in image/text tasks on a single GPU. Our research expands the potential of discrete diffusion models for efficient parallel inference, with broader implications for applications such as molecular structure and language generation.

Online computation of maximal closed substrings

from arXiv: Data Structures and Algorithms

Authors: Hiroki Shibata, Haruki Umezaki, Takuya Mieno, Yuto Nakashima, Shunsuke Inenaga

A non-empty string is closed if its length is one or its longest border appears exactly twice in the string. An occurrence of a closed substring is a maximal closed substring (MCS) if it cannot be extended to the left or to the right while preserving closedness. MCSs can be regarded as a general class of maximal repetitive structures including runs. In this paper, we study the computation of MCSs of a string given in an online manner, where one character is appended to the string at a time. Our algorithm detects newly formed MCSs after each append operation by using the rightmost previous occurrences of suffixes. To support this efficiently, we introduce the link-cut suffix tree (LCST), a novel data structure combining an online suffix tree with a link-cut tree. The LCST maintains rightmost occurrence information for substrings represented in the suffix tree in $O(n \log n)$ total time and $O(n)$ space, where $n$ is the length of the input string. Using the LCST, we obtain an $O(n \log n)$-time online algorithm for computing all MCSs, which is worst-case optimal. As further direct applications of the LCST, we obtain online algorithms for rightmost LZ77 factorizations and most recent match queries.

Authors: Hiroki Shibata, Haruki Umezaki, Takuya Mieno, Yuto Nakashima, Shunsuke Inenaga

A non-empty string is closed if its length is one or its longest border appears exactly twice in the string. An occurrence of a closed substring is a maximal closed substring (MCS) if it cannot be extended to the left or to the right while preserving closedness. MCSs can be regarded as a general class of maximal repetitive structures including runs. In this paper, we study the computation of MCSs of a string given in an online manner, where one character is appended to the string at a time. Our algorithm detects newly formed MCSs after each append operation by using the rightmost previous occurrences of suffixes. To support this efficiently, we introduce the link-cut suffix tree (LCST), a novel data structure combining an online suffix tree with a link-cut tree. The LCST maintains rightmost occurrence information for substrings represented in the suffix tree in $O(n \log n)$ total time and $O(n)$ space, where $n$ is the length of the input string. Using the LCST, we obtain an $O(n \log n)$-time online algorithm for computing all MCSs, which is worst-case optimal. As further direct applications of the LCST, we obtain online algorithms for rightmost LZ77 factorizations and most recent match queries.

Online Matching with Size-Based and Convex Delays

from arXiv: Data Structures and Algorithms

Authors: Junhao Gan, Xiao Sun, Seeun William Umboh

We study the online min-cost perfect matching with delay (MPMD) problem where $m$ requests arrive in a metric space of $n$ points. In MPMD, an algorithm can choose to match a request or to delay, and the objective is to minimise the sum of connection and delay costs. The connection cost of a match is the distance between the locations of two matched requests in the metric, and the increase of the delay cost is a function of the set of unmatched requests at every moment. In this paper, we study two different types of delay functions, size-based (MPMD-Size) and convex delays (MPMD-Convex). The study of MPMD-Size was initiated by Deryckere and Umboh (APPROX/RANDOM 2023) where the instantaneous delay increment is a non-negative monotone function of the number of unmatched requests. Our bounds are in terms of $n$, as opposed to Deryckere and Umboh's bounds that depend on $m$. Our results settle the deterministic competitive ratio (up to constants). At the heart of these results is a succinct encoding scheme of MPMD-Size on a given $n$-point metric as a metrical task system problem on a $2^{n-1}$-point metric. We also consider MPMD-Convex proposed by Liu et al. (ISAAC 2018) where the delay cost incurred by each request is a uniform convex delay function of the time difference between its arrival time and the moment that it is matched by the algorithm. They focused on delay functions $f$ that are unbounded, non-decreasing, continuous, and satisfy $f(0)=f'(0)=0$, and showed that the deterministic competitive ratio is $Ω(n)$ for $n$-point uniform metrics. We show that, surprisingly, when $f$ is a non-negative, monotone polynomial with $f'(0)>0$, there is an $O(1)$-competitive deterministic algorithm for uniform metrics. Our result completes our understanding of MPMD-Convex on uniform metrics for a broad class of functions.

Authors: Junhao Gan, Xiao Sun, Seeun William Umboh

We study the online min-cost perfect matching with delay (MPMD) problem where $m$ requests arrive in a metric space of $n$ points. In MPMD, an algorithm can choose to match a request or to delay, and the objective is to minimise the sum of connection and delay costs. The connection cost of a match is the distance between the locations of two matched requests in the metric, and the increase of the delay cost is a function of the set of unmatched requests at every moment. In this paper, we study two different types of delay functions, size-based (MPMD-Size) and convex delays (MPMD-Convex). The study of MPMD-Size was initiated by Deryckere and Umboh (APPROX/RANDOM 2023) where the instantaneous delay increment is a non-negative monotone function of the number of unmatched requests. Our bounds are in terms of $n$, as opposed to Deryckere and Umboh's bounds that depend on $m$. Our results settle the deterministic competitive ratio (up to constants). At the heart of these results is a succinct encoding scheme of MPMD-Size on a given $n$-point metric as a metrical task system problem on a $2^{n-1}$-point metric. We also consider MPMD-Convex proposed by Liu et al. (ISAAC 2018) where the delay cost incurred by each request is a uniform convex delay function of the time difference between its arrival time and the moment that it is matched by the algorithm. They focused on delay functions $f$ that are unbounded, non-decreasing, continuous, and satisfy $f(0)=f'(0)=0$, and showed that the deterministic competitive ratio is $Ω(n)$ for $n$-point uniform metrics. We show that, surprisingly, when $f$ is a non-negative, monotone polynomial with $f'(0)>0$, there is an $O(1)$-competitive deterministic algorithm for uniform metrics. Our result completes our understanding of MPMD-Convex on uniform metrics for a broad class of functions.

Efficient LCE Queries and Lexicographic Minimizers on Sliding Suffix Trees

from arXiv: Data Structures and Algorithms

Authors: Toshiharu Minematsu, Shunsuke Inenaga

We study longest-common-extension (LCE) queries and lexicographic minimizer maintenance on the suffix tree of a sliding window. The main difficulty is that a sliding suffix tree is maintained in an implicit Ukkonen-style form: some suffixes of the current window are not represented by leaves. We show that the longest implicit (i.e. non-leaf) suffix induces a periodic representative map that folds every implicit suffix to an explicit suffix leaf in constant time. Combined with leaf pointers [Leonard et al., PSC 2026] and a dynamic LCA data structure [Cole & Hariharan, SICOMP 2005], this yields a linear-space data structure with amortized constant-time window shifts and worst-case constant-time LCE queries over a constant-size alphabet. For minimizers, the LCE structure gives a direct exact solution, but it uses more machinery than fixed-depth comparisons require. We therefore give an alternative LCE-free algorithm that reports minimizers in constant time per window shift, which is built on BP-linked suffix trees [Sumiyoshi et al, SPIRE 2024] and a standard order maintenance data structure (e.g. [Bender et al., ESA 2002]).

Authors: Toshiharu Minematsu, Shunsuke Inenaga

We study longest-common-extension (LCE) queries and lexicographic minimizer maintenance on the suffix tree of a sliding window. The main difficulty is that a sliding suffix tree is maintained in an implicit Ukkonen-style form: some suffixes of the current window are not represented by leaves. We show that the longest implicit (i.e. non-leaf) suffix induces a periodic representative map that folds every implicit suffix to an explicit suffix leaf in constant time. Combined with leaf pointers [Leonard et al., PSC 2026] and a dynamic LCA data structure [Cole & Hariharan, SICOMP 2005], this yields a linear-space data structure with amortized constant-time window shifts and worst-case constant-time LCE queries over a constant-size alphabet. For minimizers, the LCE structure gives a direct exact solution, but it uses more machinery than fixed-depth comparisons require. We therefore give an alternative LCE-free algorithm that reports minimizers in constant time per window shift, which is built on BP-linked suffix trees [Sumiyoshi et al, SPIRE 2024] and a standard order maintenance data structure (e.g. [Bender et al., ESA 2002]).

Killing the Case for Randomization in Dynamic Assortment Optimization

from arXiv: Data Structures and Algorithms

Authors: Mikhail Fadin, Huseyin Topaloglu

One of the traditional approaches for constructing approximate policies for dynamic assortment optimization problems is to use sampling-based inventory-agnostic policies. Such policies are called sampling-based, as they sample an assortment of products from a fixed distribution at each time period to offer to a customer of each type. Such policies are called inventory-agnostic, as the sampled assortments may include products without remaining inventories, so if a customer chooses a product without remaining inventories, then she leaves without a purchase. Inventory-agnostic nature of a policy is not a concern, because it is known that if the policy samples an assortment that includes products without remaining inventories, then dropping the products without remaining inventories does not degrade the performance. However, sampling-based nature of a policy is a concern, because sampling brings another source of uncertainty in the performance. In this paper, we give an algorithm to de-randomize any sampling-based inventory-agnostic policy, so the de-randomized policy offers a deterministic sequence of assortments within the support of the original policy without degrading the performance. Furthermore, we give a variation of our de-randomization algorithm that searches for a deterministic sequence of assortments beyond the support of the original policy. We show that we can implement the latter variation efficiently as long as we can solve the static assortment optimization problem under the choice model governing the choice process of the customers. As our crowning technical contribution, we study locally-optimal deterministic policies, where changing any single one of the assortments in the policy does not improve the total expected revenue. We show that any locally-optimal policy has a performance guarantee of 1/2 - epsilon when compared with the best sampling-based policy.

Authors: Mikhail Fadin, Huseyin Topaloglu

One of the traditional approaches for constructing approximate policies for dynamic assortment optimization problems is to use sampling-based inventory-agnostic policies. Such policies are called sampling-based, as they sample an assortment of products from a fixed distribution at each time period to offer to a customer of each type. Such policies are called inventory-agnostic, as the sampled assortments may include products without remaining inventories, so if a customer chooses a product without remaining inventories, then she leaves without a purchase. Inventory-agnostic nature of a policy is not a concern, because it is known that if the policy samples an assortment that includes products without remaining inventories, then dropping the products without remaining inventories does not degrade the performance. However, sampling-based nature of a policy is a concern, because sampling brings another source of uncertainty in the performance. In this paper, we give an algorithm to de-randomize any sampling-based inventory-agnostic policy, so the de-randomized policy offers a deterministic sequence of assortments within the support of the original policy without degrading the performance. Furthermore, we give a variation of our de-randomization algorithm that searches for a deterministic sequence of assortments beyond the support of the original policy. We show that we can implement the latter variation efficiently as long as we can solve the static assortment optimization problem under the choice model governing the choice process of the customers. As our crowning technical contribution, we study locally-optimal deterministic policies, where changing any single one of the assortments in the policy does not improve the total expected revenue. We show that any locally-optimal policy has a performance guarantee of 1/2 - epsilon when compared with the best sampling-based policy.

Distributionally Robust Linear Regression With Block Lewis Weights

from arXiv: Data Structures and Algorithms

Authors: Naren Sarayu Manoj, Kumar Kshitij Patel

We present an algorithm for the group distributionally robust (GDR) least squares problem. Given $m$ groups, a parameter vector in $\mathbb{R}^d$, and stacked design matrices and responses $\mathbf{A}$ and $\mathbf{b}$, our algorithm obtains a $(1+\varepsilon)$-multiplicative optimal solution using $\widetilde{O}(\min\{\mathsf{rank}(\mathbf{A}),m\}^{1/3}\varepsilon^{-2/3})$ linear-system-solves of matrices of the form $\mathbf{A}^{\top}\mathbf{B}\mathbf{A}$ for block-diagonal $\mathbf{B}$. Our technical methods follow from a recent geometric construction, block Lewis weights, that relates the empirical GDR problem to a carefully chosen least squares problem and an application of accelerated proximal methods. Our algorithm improves over known interior point methods for moderate accuracy regimes and matches the state-of-the-art guarantees for the special case of $\ell_{\infty}$ regression. We also give algorithms that smoothly interpolate between minimizing the average least squares loss and the distributionally robust loss.

Authors: Naren Sarayu Manoj, Kumar Kshitij Patel

We present an algorithm for the group distributionally robust (GDR) least squares problem. Given $m$ groups, a parameter vector in $\mathbb{R}^d$, and stacked design matrices and responses $\mathbf{A}$ and $\mathbf{b}$, our algorithm obtains a $(1+\varepsilon)$-multiplicative optimal solution using $\widetilde{O}(\min\{\mathsf{rank}(\mathbf{A}),m\}^{1/3}\varepsilon^{-2/3})$ linear-system-solves of matrices of the form $\mathbf{A}^{\top}\mathbf{B}\mathbf{A}$ for block-diagonal $\mathbf{B}$. Our technical methods follow from a recent geometric construction, block Lewis weights, that relates the empirical GDR problem to a carefully chosen least squares problem and an application of accelerated proximal methods. Our algorithm improves over known interior point methods for moderate accuracy regimes and matches the state-of-the-art guarantees for the special case of $\ell_{\infty}$ regression. We also give algorithms that smoothly interpolate between minimizing the average least squares loss and the distributionally robust loss.

Computing Smallest Suffixient Arrays in Sublinear Time

from arXiv: Data Structures and Algorithms

Authors: Hiroto Fujimaru, Gonzalo Navarro, Francisco Olivares, Jakub Radoszewski, Giuseppe Romana, Cristian Urbina

A suffixient array is a novel data structure that, when combined with an index providing direct access on a text $T$, allows us to answer a variety of pattern matching queries. In this work, we show how to compute a smallest suffixient array for $T[1\dots n]$ in $O(\frac{n\log σ}{\sqrt{\log n}}+\min(r,\bar{r})\log^εn)$ time for any $ε> 0$, where $σ$ is the alphabet size of $T$ and $r$ and $\bar{r}$ are the numbers of equal-letter runs of the Burrows-Wheeler transforms of $T$ and its reverse $\overline{T}$, respectively. This time complexity becomes sublinear when $σ$ is small enough and $\min(r,\bar{r})=o(\frac{n}{\log^εn})$, yielding an asymptotic improvement over state-of-the-art algorithms. We also present a series of connected algorithmic results.

Authors: Hiroto Fujimaru, Gonzalo Navarro, Francisco Olivares, Jakub Radoszewski, Giuseppe Romana, Cristian Urbina

A suffixient array is a novel data structure that, when combined with an index providing direct access on a text $T$, allows us to answer a variety of pattern matching queries. In this work, we show how to compute a smallest suffixient array for $T[1\dots n]$ in $O(\frac{n\log σ}{\sqrt{\log n}}+\min(r,\bar{r})\log^εn)$ time for any $ε> 0$, where $σ$ is the alphabet size of $T$ and $r$ and $\bar{r}$ are the numbers of equal-letter runs of the Burrows-Wheeler transforms of $T$ and its reverse $\overline{T}$, respectively. This time complexity becomes sublinear when $σ$ is small enough and $\min(r,\bar{r})=o(\frac{n}{\log^εn})$, yielding an asymptotic improvement over state-of-the-art algorithms. We also present a series of connected algorithmic results.

Temporal Path Covers: Dilworth Properties and Parameterized Complexity

from arXiv: Data Structures and Algorithms

Authors: Lapo Cioni, Sotiris Kanellopoulos, Edouard Nemery, Aris Pagourtzis, Christos Pergaminelis, Manolis Vasilakis

The Minimum Temporal Path Cover (TPC) and Minimum Temporally Disjoint Path Cover (TDPC) problems were introduced by [Chakraborty, Dailly, Foucaud, Klasing, MFCS '24]. Both were shown to be NP-hard on temporal DAGs, while the latter is also NP-hard on temporal oriented trees. All tractable cases for T(D)PC established in that paper satisfy a temporal Dilworth property, namely that the size of the minimum T(D)PC is equal to the size of the maximum antichain. This raises a natural question: is T(D)PC polynomial-time solvable under the promise that the respective Dilworth property holds? In this work, we answer this question in the affirmative for both problems, proving in fact that, under the respective promise, the size of the minimum T(D)PC is exactly equal to the Lovász number of the connectivity graph. In another direction, we establish parameterized algorithms and hardness results for TPC and TDPC. Our main result is that TPC is W[1]-hard parameterized by the deletion distance to linear forest even for temporal graphs with two time-steps, answering in the negative an open question by Chakraborty et al. about whether an XP algorithm parameterized by treewidth plus number of time-steps can be improved to FPT. On the other hand, we prove that an FPT algorithm does exist if the vertex cover number is used as parameter instead of the treewidth in the above parameterization. We complement this with a proof that including the number of time-steps in the parameter is necessary to yield tractability, as, otherwise, both TPC and TDPC remain NP-hard even for constant vertex cover size. Along the way, we establish various other para-NP-hardness results involving structural parameters such as the pathwidth and the maximum degree of the underlying graph.

Authors: Lapo Cioni, Sotiris Kanellopoulos, Edouard Nemery, Aris Pagourtzis, Christos Pergaminelis, Manolis Vasilakis

The Minimum Temporal Path Cover (TPC) and Minimum Temporally Disjoint Path Cover (TDPC) problems were introduced by [Chakraborty, Dailly, Foucaud, Klasing, MFCS '24]. Both were shown to be NP-hard on temporal DAGs, while the latter is also NP-hard on temporal oriented trees. All tractable cases for T(D)PC established in that paper satisfy a temporal Dilworth property, namely that the size of the minimum T(D)PC is equal to the size of the maximum antichain. This raises a natural question: is T(D)PC polynomial-time solvable under the promise that the respective Dilworth property holds? In this work, we answer this question in the affirmative for both problems, proving in fact that, under the respective promise, the size of the minimum T(D)PC is exactly equal to the Lovász number of the connectivity graph. In another direction, we establish parameterized algorithms and hardness results for TPC and TDPC. Our main result is that TPC is W[1]-hard parameterized by the deletion distance to linear forest even for temporal graphs with two time-steps, answering in the negative an open question by Chakraborty et al. about whether an XP algorithm parameterized by treewidth plus number of time-steps can be improved to FPT. On the other hand, we prove that an FPT algorithm does exist if the vertex cover number is used as parameter instead of the treewidth in the above parameterization. We complement this with a proof that including the number of time-steps in the parameter is necessary to yield tractability, as, otherwise, both TPC and TDPC remain NP-hard even for constant vertex cover size. Along the way, we establish various other para-NP-hardness results involving structural parameters such as the pathwidth and the maximum degree of the underlying graph.

Wednesday, July 01

The True Method

from Computational Complexity

Harry Lewis pointed Bill and me to Gottfried Leibniz's 1677 treatise The True Method (translated from the original French). I highly recommend taking the time to read this three page document where he talks about formalizing all human knowledge.

The second paragraph has some of the intuition for P v NP three centuries before Cook, Levin and Gödel. 

But knowledge depends on proof, and discovering proofs requires a certain method that is not known to everyone. Every person is capable of judging a proof, since it would not deserve to be called a proof unless everyone who considered it carefully found it convincing. Nevertheless, not everyone is capable of discovering proofs independently or of presenting them clearly once they have been found, whether for lack of time or for lack of method. 

Though by "not everyone" he might have excluded himself.

Every investigation that depends on reasoning would be carried out by manipulating these symbols, through a kind of calculation. This would make the discovery of important results entirely straightforward. We would no longer have to rack our brains as much as we do today, while still being certain that we could accomplish everything that was possible from the information given.

Though not clear if "that was possible" took computational time into account. Is he claiming P = NP?

The article as a whole talks about a project to develop a fully logical language to cover not just mathematics, but the sciences like physics and medicine and even morality, politics and law. Here he was just being too ambitious and missing complexity issues, for example that you can't create laws that fully cover all potential futures.

Leibniz does give himself an out, a nod to the scientific method that was being developed during that time.

Some experiences will always be needed as the foundation for reasoning. But once those experiences have been supplied, we could derive from them everything that anyone could ever derive. We could even determine which further experiments still needed to be performed in order to clear up all remaining doubts.

He even gives a nod to probability. 

This would be of extraordinary assistance even in politics and medicine, where we must reason consistently and correctly from given symptoms and circumstances. Even when there are not enough facts to form an infallible judgment, it would still be possible to determine what is most probable from the information available. That is everything reason can do.

Sort of how machine learning works now.

By Lance Fortnow

Harry Lewis pointed Bill and me to Gottfried Leibniz's 1677 treatise The True Method (translated from the original French). I highly recommend taking the time to read this three page document where he talks about formalizing all human knowledge.

The second paragraph has some of the intuition for P v NP three centuries before Cook, Levin and Gödel

But knowledge depends on proof, and discovering proofs requires a certain method that is not known to everyone. Every person is capable of judging a proof, since it would not deserve to be called a proof unless everyone who considered it carefully found it convincing. Nevertheless, not everyone is capable of discovering proofs independently or of presenting them clearly once they have been found, whether for lack of time or for lack of method. 

Though by "not everyone" he might have excluded himself.

Every investigation that depends on reasoning would be carried out by manipulating these symbols, through a kind of calculation. This would make the discovery of important results entirely straightforward. We would no longer have to rack our brains as much as we do today, while still being certain that we could accomplish everything that was possible from the information given.

Though not clear if "that was possible" took computational time into account. Is he claiming P = NP?

The article as a whole talks about a project to develop a fully logical language to cover not just mathematics, but the sciences like physics and medicine and even morality, politics and law. Here he was just being too ambitious and missing complexity issues, for example that you can't create laws that fully cover all potential futures.

Leibniz does give himself an out, a nod to the scientific method that was being developed during that time.

Some experiences will always be needed as the foundation for reasoning. But once those experiences have been supplied, we could derive from them everything that anyone could ever derive. We could even determine which further experiments still needed to be performed in order to clear up all remaining doubts.

He even gives a nod to probability. 

This would be of extraordinary assistance even in politics and medicine, where we must reason consistently and correctly from given symptoms and circumstances. Even when there are not enough facts to form an infallible judgment, it would still be possible to determine what is most probable from the information available. That is everything reason can do.

Sort of how machine learning works now.

By Lance Fortnow

Three Years of Argmin

from Ben Recht

My annual navel-gazing post.

My argmin.md file compels me to write a blog about blogging every July 1st. It’s my official Substack anniversary, and we’ve made it through year three. The process here is, for better or worse, set in stone and could be written as a skill file for an LLM agent. I listed the rules last year, and they haven’t changed since. I don’t expect them to change this year either.

In fact, going back over my posts, it seems like the most novel change was adding a new kind of content that got decidedly mixed reviews: football blogging. Of all the topics I’ve dabbled with on this Substack, there is none more polarizing. Some people really liked it! Others, let’s say, did not. I’m waiting until training camp gets going before I decide whether I’ll commit to another season of yelling about the absurdity of win probability.

The football blogging is related to one of the trickier things I haven’t been able to figure out in three years writing this blog. Argmin is a periodical with different authors. Those authors just all happen to be me. I realize you receive a notification from me whether I’m writing about a topic you care about or not. This is a problem I don’t really know how to address, but I suppose it is an issue with all Substack subscriptions. I don’t want to take on the bother of splitting this into four different Substacks, which I think would annoy you even more than it would annoy me. I hope the disclaimers at the top of the posts help you decide if you want to keep reading. I would love feedback on how to make that signposting clearer.

Relatedly, commenters are increasingly asking questions about things I’ve already talked about. This is totally expected, as I know I write more than anyone likely cares to read, but I worry about repeating myself—thus alienating a different collection of readers—when addressing them. The balance between repetition and clarity is delicate. Apoorva Lal said my blog has its own private dialect. If I don’t repeat myself somewhat, filling in the context of what my brain thinks obvious, the content will be even less approachable, and no one wants that.

Moreover, sometimes the only way to iron out a nonobvious point is by repeating it until it’s clear. I wrote in the afterword of The Irrational Decision that I used this blog “as a sketch pad, where I first workshopped much of the text that you read here.” This newsletter remains a place for me to play and write out first drafts. I get a lot out of putting ideas out there, even when they are way underbaked. “Talks and blogs are ephemeral, but interactions with the audience helped to shape and hone my arguments. All of those early interlocutors, in person and online, helped shape this final archival form. I’m thankful to all of you.”

In the spirit of learning from the audience, we can use this year’s reflection post as a chance for you to tell me what you’d change about the argmin magazine. Do you want more of any topic? Would you prefer a different format or cadence of posts? You can email me if you’d prefer not to comment (my berkeley.edu address is fine for that).

In the meantime, I can give you a preview of what to expect over the next year. There will definitely be more live course blogging. In the fall, I am planning an unhinged new class on forecasting, the tools we use to do it, and why we’re so obsessed with it. That should be a fun one. In the Spring, I’m apparently signed up to teach undergrad probability. This is for engineering students and based on the excellent text of Bertsekas and Tsitsiklis. Since it’s about engineering, we’ll have to grapple with some of the complicated metaphysics of what we think these probabilities mean in the context of building things. You, me, and the undergrads will all get confused together.

As for the remainder of this summer, I want to finish this series Against Optimaxxing. I’ve been talking about writing this forever, and I need to put in the effort to finish it. It would be an attempt at writing what Henry Farrell calls speculative nonfiction, seeking a reasonable language to imagine a future free from the vulgar positivism that runs our culture, politics, and industry. This blog series will try to establish a necessary vocabulary. My draft writing on this particular topic might lead nowhere. It might remain a private dialect. But I want to try to write it down.

Subscribe now

By Ben Recht

Witness Complexity of Short Descriptions: A Cryptographic Perspective

from arXiv: Computational Complexity

Authors: Fabio F. G. Buono

In cryptographic practice, where protocols impose strict time bounds, implementations demand predictable resource usage, and real-world systems require immediate verification for security and usability, a short key or certificate is useful only if it can be expanded or verified within a bounded time; otherwise a compact representation that requires superpolynomial work to expand offers no operational guarantee within a bounded-time protocol. This paper formalises that gap by introducing \emph{witness complexity} \(\gam(x)\), the minimum running time over near-shortest descriptions of a string on a universal Turing machine. \(\gam\) differs from Shannon entropy and Kolmogorov complexity \(\KC\): low \(\KC\) can coexist with high \(\gam\). We prove invariance up to polynomial factors; a conditional separation (assuming \(\PneqNP\)). An unconditional lower bound from incomputability of \(\KC\); a biconditional characterisation of \(\PeqNP\) via the class-relative variant \(\gP\); and polynomial-time tractability for structured \(\classNP\) families. Part II develops companion measures and shows an unconditional gap between grammar size and derivation cost, positioning \(\gam\) as a metric for the usability of keys and certificates.

Authors: Fabio F. G. Buono

In cryptographic practice, where protocols impose strict time bounds, implementations demand predictable resource usage, and real-world systems require immediate verification for security and usability, a short key or certificate is useful only if it can be expanded or verified within a bounded time; otherwise a compact representation that requires superpolynomial work to expand offers no operational guarantee within a bounded-time protocol. This paper formalises that gap by introducing \emph{witness complexity} \(\gam(x)\), the minimum running time over near-shortest descriptions of a string on a universal Turing machine. \(\gam\) differs from Shannon entropy and Kolmogorov complexity \(\KC\): low \(\KC\) can coexist with high \(\gam\). We prove invariance up to polynomial factors; a conditional separation (assuming \(\PneqNP\)). An unconditional lower bound from incomputability of \(\KC\); a biconditional characterisation of \(\PeqNP\) via the class-relative variant \(\gP\); and polynomial-time tractability for structured \(\classNP\) families. Part II develops companion measures and shows an unconditional gap between grammar size and derivation cost, positioning \(\gam\) as a metric for the usability of keys and certificates.

The Fourth-Root Complexity of Data Movement

from arXiv: Computational Complexity

Authors: Chen Ding

Time complexity typically assumes $O(1)$ cost per data access. This paper presents an analysis based on an abstract memory hierarchy. For a common class of applications, it shows that the data-access cost scales with the fourth root of data size, that is, as data size $N$ increases, the cost of each access increases at the rate of $N^\frac{1}{4}$. While the analysis does not predict performance, it predicts scalability. Specifically, the paper provides a precise analysis that shows the constant-factor difference between cases where the miss ratio follows a power law versus an exponential decay.

Authors: Chen Ding

Time complexity typically assumes $O(1)$ cost per data access. This paper presents an analysis based on an abstract memory hierarchy. For a common class of applications, it shows that the data-access cost scales with the fourth root of data size, that is, as data size $N$ increases, the cost of each access increases at the rate of $N^\frac{1}{4}$. While the analysis does not predict performance, it predicts scalability. Specifically, the paper provides a precise analysis that shows the constant-factor difference between cases where the miss ratio follows a power law versus an exponential decay.

Complexity of Universality and Related Decision Problems for Unary Two-Dimensional Automata

from arXiv: Computational Complexity

Authors: Taylor J. Smith

A two-dimensional automaton is able to move its input head through its input word in four directions: upward, downward, leftward, and rightward. If we prevent the input head from moving upward, then we obtain a three-way two-dimensional automaton; preventing both upward and leftward movements results in a two-way two-dimensional automaton. While much is known about the decidability and complexity properties of the two-dimensional automaton model, the unary variant of this model is less studied. We show that the universality, equivalence, and inclusion problems for unary three-way deterministic two-dimensional automata are coNP-hard, while for the corresponding two-way model, the universality, equivalence, inclusion, and disjointness problems are in P. We further show that the universality, equivalence, and inclusion problems for unary two-way nondeterministic two-dimensional automata are coNP-hard and in ELEMENTARY; and the disjointness problem for the same model is NL-hard and in ELEMENTARY. Finally, we establish the decidability of a bounded variant of the universality problem for unary three-way nondeterministic two-dimensional automata, and show that this variant problem is coNP-complete.

Authors: Taylor J. Smith

A two-dimensional automaton is able to move its input head through its input word in four directions: upward, downward, leftward, and rightward. If we prevent the input head from moving upward, then we obtain a three-way two-dimensional automaton; preventing both upward and leftward movements results in a two-way two-dimensional automaton. While much is known about the decidability and complexity properties of the two-dimensional automaton model, the unary variant of this model is less studied. We show that the universality, equivalence, and inclusion problems for unary three-way deterministic two-dimensional automata are coNP-hard, while for the corresponding two-way model, the universality, equivalence, inclusion, and disjointness problems are in P. We further show that the universality, equivalence, and inclusion problems for unary two-way nondeterministic two-dimensional automata are coNP-hard and in ELEMENTARY; and the disjointness problem for the same model is NL-hard and in ELEMENTARY. Finally, we establish the decidability of a bounded variant of the universality problem for unary three-way nondeterministic two-dimensional automata, and show that this variant problem is coNP-complete.

Taming Complexity in Intuitionistic Modal Logic: The Case of FIK and Its Shallow Calculus

from arXiv: Computational Complexity

Authors: Han Gao, Nicola Olivetti

Intuitionistic modal logics (IMLs) comprise many systems: from constructive modal logics such as CK and Wijesekera's CCDL to Fischer Servi/Simpson's IK, as well as some recently introduced variants. All of them are characterized by bi-relational semantics and have complete axiomatisations. However, from the perspective of proof theory and complexity, there are strong differences: while for constructive modal logics simple Gentzen calculi suffice, for IK more complex calculi, based on nested or labelled sequents, are needed. As a consequence, the decision problem for constructive modal logics has a PSPACE upper bound, whereas for IK is not known and it is even conjectured to be non-elementary. We study here the proof theory and complexity of FIK, a natural intuitionistic modal logic recently introduced. FIK is strictly in between CCDL and IK, yet it has the same forcing conditions as IK. We define a "shallow" sequent calculus for FIK which is a nested sequent calculus where sequents have at most one level of nesting. We prove its syntactic completeness by showing the admissibility of cut. By means of this calculus we show that decision problem for FIK is in EXPSPACE, whence significantly lower than the complexity conjectured for IK.

Authors: Han Gao, Nicola Olivetti

Intuitionistic modal logics (IMLs) comprise many systems: from constructive modal logics such as CK and Wijesekera's CCDL to Fischer Servi/Simpson's IK, as well as some recently introduced variants. All of them are characterized by bi-relational semantics and have complete axiomatisations. However, from the perspective of proof theory and complexity, there are strong differences: while for constructive modal logics simple Gentzen calculi suffice, for IK more complex calculi, based on nested or labelled sequents, are needed. As a consequence, the decision problem for constructive modal logics has a PSPACE upper bound, whereas for IK is not known and it is even conjectured to be non-elementary. We study here the proof theory and complexity of FIK, a natural intuitionistic modal logic recently introduced. FIK is strictly in between CCDL and IK, yet it has the same forcing conditions as IK. We define a "shallow" sequent calculus for FIK which is a nested sequent calculus where sequents have at most one level of nesting. We prove its syntactic completeness by showing the admissibility of cut. By means of this calculus we show that decision problem for FIK is in EXPSPACE, whence significantly lower than the complexity conjectured for IK.

Counting Small Induced Subgraphs: Hardness of Symmetry-Based Properties

from arXiv: Computational Complexity

Authors: Radu Curticapean, Mingjun Liu

Jerrum and Meeks (TOCT, JCSS 2015) introduced the counting problems $\text{IndSub}(Φ)$ for fixed graph properties $Φ$: Given an input graph $G$ and $k\in\mathbb N$, count the $k$-vertex subsets $S \subseteq V(G)$ such that the induced subgraph $G[S]$ satisfies $Φ$. For recursively enumerable $Φ$, it is known that $\text{IndSub}(Φ)$ is either #W[1]-hard or fixed-parameter tractable. A direct classification depending on $Φ$ however still remains open. In particular, the status was open for the property of graphs without nontrivial automorphisms, also mentioned in a very recent survey on parameterized counting by Roth (Comput.~Sci.~Rev.~2026). This is a natural property that evades all currently known techniques for proving #W[1]-hardness, including a general toolkit based on Fourier analysis that was very recently introduced by Curticapean and Neuen (SODA~2025). In this paper, we show that counting induced $k$-vertex graphs without nontrivial automorphisms is #W[1]-hard by constructing ``clique scaffolds'', i.e., problem-specific restrictions of the property that enable a reduction from the $k$-clique problem. More generally, we show that for every finite group $Q$, counting $k$-vertex induced subgraphs with automorphism group $Q$ is #W[1]-hard.

Authors: Radu Curticapean, Mingjun Liu

Jerrum and Meeks (TOCT, JCSS 2015) introduced the counting problems $\text{IndSub}(Φ)$ for fixed graph properties $Φ$: Given an input graph $G$ and $k\in\mathbb N$, count the $k$-vertex subsets $S \subseteq V(G)$ such that the induced subgraph $G[S]$ satisfies $Φ$. For recursively enumerable $Φ$, it is known that $\text{IndSub}(Φ)$ is either #W[1]-hard or fixed-parameter tractable. A direct classification depending on $Φ$ however still remains open. In particular, the status was open for the property of graphs without nontrivial automorphisms, also mentioned in a very recent survey on parameterized counting by Roth (Comput.~Sci.~Rev.~2026). This is a natural property that evades all currently known techniques for proving #W[1]-hardness, including a general toolkit based on Fourier analysis that was very recently introduced by Curticapean and Neuen (SODA~2025). In this paper, we show that counting induced $k$-vertex graphs without nontrivial automorphisms is #W[1]-hard by constructing ``clique scaffolds'', i.e., problem-specific restrictions of the property that enable a reduction from the $k$-clique problem. More generally, we show that for every finite group $Q$, counting $k$-vertex induced subgraphs with automorphism group $Q$ is #W[1]-hard.

Sensitivity Lower Bounds via Locally Testable Codes

from arXiv: Data Structures and Algorithms

Authors: Yuichi Yoshida, Zihan Zhang

Sensitivity quantifies how far an algorithm's output can move in Hamming distance when a single input element is perturbed. We present a general scheme turning any locally testable code (LTC) into a sensitivity lower bound for the CSP encoded by the code. Instantiating it with the $c^3$-LTC yields a constant $\varepsilon > 0$ for which every $(1-\varepsilon)$-approximation algorithm for satisfiable Max E3LIN2 has sensitivity $Ω(n)$, strengthening the previous $Ω(n^δ)$ lower bound known only for general instances. Standard reductions give an $n^{1-O(\varepsilon)}$ lower bound for $n^{-\varepsilon}$-approximation algorithms for maximum clique, and an $Ω(k)$ bound for $(1-\varepsilon)$-approximation algorithms for maximum $k$-coverage, among others. These lower bounds match trivial upper bounds and imply that even slightly stable algorithms cannot achieve these approximations. A second instantiation, using a simple repetition code, shows that any $(1-\varepsilon)$-approximation algorithm for Max Cut on bipartite graphs has sensitivity $Ω(1/\varepsilon)$ as long as $\varepsilon = O(1/\sqrt{n})$, extending the prior result for exact algorithms (the regime $\varepsilon < 1/n$). Thus even on bipartite graphs, where a perfect cut always exists, near-optimal solutions cannot be maintained stably. Our sensitivity lower bounds have two applications. First, they yield averaged sensitivity lower bounds, implying that any nearly optimal randomized algorithm for Max E3SAT has linearly many output bits that, after fixing the random seed, are Boolean functions with large influence. Second, via the sensitivity-to-locality connection, they imply locality lower bounds in the non-signaling model, which generalizes $\mathsf{LOCAL}$ and quantum-$\mathsf{LOCAL}$.

Authors: Yuichi Yoshida, Zihan Zhang

Sensitivity quantifies how far an algorithm's output can move in Hamming distance when a single input element is perturbed. We present a general scheme turning any locally testable code (LTC) into a sensitivity lower bound for the CSP encoded by the code. Instantiating it with the $c^3$-LTC yields a constant $\varepsilon > 0$ for which every $(1-\varepsilon)$-approximation algorithm for satisfiable Max E3LIN2 has sensitivity $Ω(n)$, strengthening the previous $Ω(n^δ)$ lower bound known only for general instances. Standard reductions give an $n^{1-O(\varepsilon)}$ lower bound for $n^{-\varepsilon}$-approximation algorithms for maximum clique, and an $Ω(k)$ bound for $(1-\varepsilon)$-approximation algorithms for maximum $k$-coverage, among others. These lower bounds match trivial upper bounds and imply that even slightly stable algorithms cannot achieve these approximations. A second instantiation, using a simple repetition code, shows that any $(1-\varepsilon)$-approximation algorithm for Max Cut on bipartite graphs has sensitivity $Ω(1/\varepsilon)$ as long as $\varepsilon = O(1/\sqrt{n})$, extending the prior result for exact algorithms (the regime $\varepsilon < 1/n$). Thus even on bipartite graphs, where a perfect cut always exists, near-optimal solutions cannot be maintained stably. Our sensitivity lower bounds have two applications. First, they yield averaged sensitivity lower bounds, implying that any nearly optimal randomized algorithm for Max E3SAT has linearly many output bits that, after fixing the random seed, are Boolean functions with large influence. Second, via the sensitivity-to-locality connection, they imply locality lower bounds in the non-signaling model, which generalizes $\mathsf{LOCAL}$ and quantum-$\mathsf{LOCAL}$.

Orienting Unrooted Binary Networks Faster: Focus on the Generator

from arXiv: Data Structures and Algorithms

Authors: Jannik Schestag, Norbert Zeh

The problem of orienting an unrooted network to obtain a specific class of rooted phylogenetic networks is known to be NP-hard in many cases. In this paper, we introduce two algorithmic frameworks that yield significantly improved fixed-parameter tractable (FPT) algorithms parameterized by the network level $\ell$. Our first main contribution shows that for several prominent network classes, the core algorithmic difficulty lies in finding a directed spanning tree on the network's undirected generator. By enumerating these spanning trees in $O(5.3334^\ell + \ell)$ time and orienting all remaining edges in polynomial time, we solve the orientation problem in $O(5.3334^\ell \cdot n)$ time for tree-based networks and in $O(5.3334^\ell \cdot n^2)$ time for orchards, where $n$ is the number of vertices of the graph. Extending this approach with further branching yields $O(10.6667^\ell \cdot n^2)$-time algorithms for tree-child and normal networks. Our second technique bypasses spanning trees by directly guessing the placement of reticulations on the generator. This framework provides $O(12.2071^\ell \cdot n^2)$-time algorithms for temporal, reticulation-visible, and tree-sibling networks. Finally, we demonstrate the versatility of the reticulation-guessing framework by showing that even computing an orientation with minimum scanwidth is single-exponential FPT with respect to the level. Together, these results significantly improve the best-known running times for phylogenetic network orientation.

Authors: Jannik Schestag, Norbert Zeh

The problem of orienting an unrooted network to obtain a specific class of rooted phylogenetic networks is known to be NP-hard in many cases. In this paper, we introduce two algorithmic frameworks that yield significantly improved fixed-parameter tractable (FPT) algorithms parameterized by the network level $\ell$. Our first main contribution shows that for several prominent network classes, the core algorithmic difficulty lies in finding a directed spanning tree on the network's undirected generator. By enumerating these spanning trees in $O(5.3334^\ell + \ell)$ time and orienting all remaining edges in polynomial time, we solve the orientation problem in $O(5.3334^\ell \cdot n)$ time for tree-based networks and in $O(5.3334^\ell \cdot n^2)$ time for orchards, where $n$ is the number of vertices of the graph. Extending this approach with further branching yields $O(10.6667^\ell \cdot n^2)$-time algorithms for tree-child and normal networks. Our second technique bypasses spanning trees by directly guessing the placement of reticulations on the generator. This framework provides $O(12.2071^\ell \cdot n^2)$-time algorithms for temporal, reticulation-visible, and tree-sibling networks. Finally, we demonstrate the versatility of the reticulation-guessing framework by showing that even computing an orientation with minimum scanwidth is single-exponential FPT with respect to the level. Together, these results significantly improve the best-known running times for phylogenetic network orientation.

Directed Low Diameter Decomposition for Structured Digraphs

from arXiv: Data Structures and Algorithms

Authors: Shinwoo An, Arnold Filtser

Low diameter decompositions, or LDDs for short, are a fundamental primitive in the design of efficient graph algorithms. Roughly speaking, an LDD is a distribution over partitions of the vertices into bounded-diameter clusters such that nearby vertices are likely to be clustered together. Recently, there has been growing interest in lifting the notion of LDDs into \emph{directed graphs}. In particular, there are two natural directed analogues. The first is a directed LDD, where after removing a random subset of edges, every strongly connected component has a small diameter. The second is a quasipartition, which imposes the stronger requirement that whenever one vertex can still reach another after the edge removal, the two vertices must be close in the original directed metric. Every quasipartition yields an LDD, but the converse does not necessarily hold. In this work, we initiate the systematic study of LDDs in structured directed graphs. As our first main result, we show that any directed graph with pathwidth $\mathsf{pw}$ admits an $(O(\mathsf{pw}), Δ)$-LDD. This improves upon the previous best-known $(2^{O(\mathsf{pw}^2)}, Δ)$-LDD construction, which was implicitly derived from the quasipartition result of Salmasi, Sidiropoulos, and Sridhar [SODA'19]. As our second result, we show that the integrality gap of the Directed Non-Bipartite Sparsest-Cut LP relaxation on an $n$-vertex graph with treewidth $\mathsf{tw}$ is $O(\mathsf{tw} \log n)$. This improves upon the $O(\mathsf{tw}\log^2 n)$ bound of Mémoli, Sidiropoulos, and Sridhar [ICALP'16, Algorithmica'18]. We obtain this result through the refined analysis of the quasipartition construction of Mémoli et al. for bounded treewidth graphs.

Authors: Shinwoo An, Arnold Filtser

Low diameter decompositions, or LDDs for short, are a fundamental primitive in the design of efficient graph algorithms. Roughly speaking, an LDD is a distribution over partitions of the vertices into bounded-diameter clusters such that nearby vertices are likely to be clustered together. Recently, there has been growing interest in lifting the notion of LDDs into \emph{directed graphs}. In particular, there are two natural directed analogues. The first is a directed LDD, where after removing a random subset of edges, every strongly connected component has a small diameter. The second is a quasipartition, which imposes the stronger requirement that whenever one vertex can still reach another after the edge removal, the two vertices must be close in the original directed metric. Every quasipartition yields an LDD, but the converse does not necessarily hold. In this work, we initiate the systematic study of LDDs in structured directed graphs. As our first main result, we show that any directed graph with pathwidth $\mathsf{pw}$ admits an $(O(\mathsf{pw}), Δ)$-LDD. This improves upon the previous best-known $(2^{O(\mathsf{pw}^2)}, Δ)$-LDD construction, which was implicitly derived from the quasipartition result of Salmasi, Sidiropoulos, and Sridhar [SODA'19]. As our second result, we show that the integrality gap of the Directed Non-Bipartite Sparsest-Cut LP relaxation on an $n$-vertex graph with treewidth $\mathsf{tw}$ is $O(\mathsf{tw} \log n)$. This improves upon the $O(\mathsf{tw}\log^2 n)$ bound of Mémoli, Sidiropoulos, and Sridhar [ICALP'16, Algorithmica'18]. We obtain this result through the refined analysis of the quasipartition construction of Mémoli et al. for bounded treewidth graphs.

Graph Scheduling with Group Completion Times

from arXiv: Data Structures and Algorithms

Authors: Lars Rohwedder, Leander Schnaars

In the Graph Scheduling problem we schedule a given multiset of edges on discrete time steps, such that at each step the set of edges forms a matching. The goal is to minimize the sum of weighted group completion times, where a group is a set of edges and it completes when the last edge has been scheduled. Two popular variants of this problem are Coflow Scheduling and Data Migration. Our main result is extending a recent iterated rounding approach from Coflow Scheduling, roughly corresponding to the bipartite case, to the general Graph Scheduling problem. This yields an essentially tight $(2+ε)$-approximation for the asymptotic setting where OPT is assumed to be large. For this we rely on polyhedral techniques from general matching, namely odd-set inequalities, and graph theoretical results on edge colorings in multigraphs. The state-of-the-art approximation algorithm for Data Migration is a $(1 + φ)$-approximation that improves when OPT is small. Taking the best of this and our main result, we obtain an improvement of the approximation rate for Data Migration in any regime.

Authors: Lars Rohwedder, Leander Schnaars

In the Graph Scheduling problem we schedule a given multiset of edges on discrete time steps, such that at each step the set of edges forms a matching. The goal is to minimize the sum of weighted group completion times, where a group is a set of edges and it completes when the last edge has been scheduled. Two popular variants of this problem are Coflow Scheduling and Data Migration. Our main result is extending a recent iterated rounding approach from Coflow Scheduling, roughly corresponding to the bipartite case, to the general Graph Scheduling problem. This yields an essentially tight $(2+ε)$-approximation for the asymptotic setting where OPT is assumed to be large. For this we rely on polyhedral techniques from general matching, namely odd-set inequalities, and graph theoretical results on edge colorings in multigraphs. The state-of-the-art approximation algorithm for Data Migration is a $(1 + φ)$-approximation that improves when OPT is small. Taking the best of this and our main result, we obtain an improvement of the approximation rate for Data Migration in any regime.

Constant-factor approximation of maximum distance-2 independent set in graphs of bounded merge-width

from arXiv: Data Structures and Algorithms

Authors: Maël Dumas

We give a constant-factor approximation algorithm for Max Dist-2 Independent Set in graphs of bounded radius-2 merge-width. The same result holds for Min Dominating Set from [Bonamy and Geniet, 2025], [Chan et al., SODA '12]. Both approximation algorithms are LP-based, showing that the domination-to-2-independence ratio is bounded in graphs of bounded radius-2 merge-width. Moreover, this result is tight in the sense that the ratio can be unbounded in graphs of bounded radius-1 merge-width.

Authors: Maël Dumas

We give a constant-factor approximation algorithm for Max Dist-2 Independent Set in graphs of bounded radius-2 merge-width. The same result holds for Min Dominating Set from [Bonamy and Geniet, 2025], [Chan et al., SODA '12]. Both approximation algorithms are LP-based, showing that the domination-to-2-independence ratio is bounded in graphs of bounded radius-2 merge-width. Moreover, this result is tight in the sense that the ratio can be unbounded in graphs of bounded radius-1 merge-width.

Planar Embedding of Okamura-Seymour Quasimetrics in Polynomial Time with an Application to Distributed SSSP

from arXiv: Data Structures and Algorithms

Authors: Hung Le, Hector Tierno, Shuang Yang

A quasi-metric $(T,δ_T)$ is an Okamura-Seymour quasimetric if there exists an edge-weighted planar embedded directed graph $G = (V,E,w)$ such that $T$ is a set of terminals on the outerface of $G$ and $δ_G(t,t') = δ_T(t,t')$ for every pair $(t,t')\in T\times T$. If $(T,δ_T)$ is an Okamura-Seymour quasimetric, then $G$ is a planar embedding of $(T,δ_T)$. In a recent pioneering work, Chen and Tan gave a polynomial-time algorithm to test if a given quasi-metric $(T,δ_T)$ is an Okamura-Seymour quasimetric. A key step in their proof is existential, which suffices for an efficient testing algorithm but does not imply an efficient embedding algorithm. Our paper closes this gap by giving an algorithmic implementation of their existential step via linear programming. As a result, we obtain the first polynomial-time algorithm for finding a planar embedding of any given Okamura-Seymour quasimetric $(T,δ_T)$. As an application, we show how to use our planar embedding of Okamura-Seymour quasimetrics to compute a $(1+ε)$-approximate single-source shortest path (SSSP) in planar directed graphs in the distributed CONGEST model in $\widetilde{O}(D)$ rounds for any fixed $ε\in (0,1)$, nearly matching a simple lower bound of $Ω(D)$ and resolving a fundamental problem in this area. The best-known algorithm for this problem has round complexity $\widetilde{O}(D^2)$.

Authors: Hung Le, Hector Tierno, Shuang Yang

A quasi-metric $(T,δ_T)$ is an Okamura-Seymour quasimetric if there exists an edge-weighted planar embedded directed graph $G = (V,E,w)$ such that $T$ is a set of terminals on the outerface of $G$ and $δ_G(t,t') = δ_T(t,t')$ for every pair $(t,t')\in T\times T$. If $(T,δ_T)$ is an Okamura-Seymour quasimetric, then $G$ is a planar embedding of $(T,δ_T)$. In a recent pioneering work, Chen and Tan gave a polynomial-time algorithm to test if a given quasi-metric $(T,δ_T)$ is an Okamura-Seymour quasimetric. A key step in their proof is existential, which suffices for an efficient testing algorithm but does not imply an efficient embedding algorithm. Our paper closes this gap by giving an algorithmic implementation of their existential step via linear programming. As a result, we obtain the first polynomial-time algorithm for finding a planar embedding of any given Okamura-Seymour quasimetric $(T,δ_T)$. As an application, we show how to use our planar embedding of Okamura-Seymour quasimetrics to compute a $(1+ε)$-approximate single-source shortest path (SSSP) in planar directed graphs in the distributed CONGEST model in $\widetilde{O}(D)$ rounds for any fixed $ε\in (0,1)$, nearly matching a simple lower bound of $Ω(D)$ and resolving a fundamental problem in this area. The best-known algorithm for this problem has round complexity $\widetilde{O}(D^2)$.

Practical Linear-Time Computation of Smallest Suffixient Sets

from arXiv: Data Structures and Algorithms

Authors: Francisco Olivares, Gonzalo Navarro

Suffixient arrays are recent structures that have attracted attention because they offer relevant pattern matching functionality in less asymptotic space than the Run-Length BWT, the de-facto standard to index highly repetitive string collections. Various algorithms exist for building them from the suffix array data structures. We present the first construction algorithm that is (i) linear-time, (ii) one-pass over the structures, and (iii) implemented and practical. This makes the construction particularly useful on large text collections, which we demonstrate empirically by showing that it dominates the space/time tradeoff map of the implemented constructions.

Authors: Francisco Olivares, Gonzalo Navarro

Suffixient arrays are recent structures that have attracted attention because they offer relevant pattern matching functionality in less asymptotic space than the Run-Length BWT, the de-facto standard to index highly repetitive string collections. Various algorithms exist for building them from the suffix array data structures. We present the first construction algorithm that is (i) linear-time, (ii) one-pass over the structures, and (iii) implemented and practical. This makes the construction particularly useful on large text collections, which we demonstrate empirically by showing that it dominates the space/time tradeoff map of the implemented constructions.

Optimal-Time Contextual Pattern Matching in Compressed Space

from arXiv: Data Structures and Algorithms

Authors: Gonzalo Navarro, Francisco Olivares

Contextual pattern matching is the task of, given a pattern $P[1,m]$, a context length $λ$, and a text $T[1,n]$, find all the $occ$ distinct contexts in which $P$ occurs in $T$, the context being the $λ$ symbols preceding and the $λ$ symbols following the occurrence; a text position where each context occurs must be output. While the problem can be solved in optimal time $O(m+occ)$ using $O(n)$-space precomputed data structures on $T$, this type of search is particularly relevant on large repetitive text collections, where $O(n)$ space can be prohibitive. We present the first optimal-time solution that runs in compressed space, namely that of a symmetric CDAWG (SCDAWG) of $T$. Further, we show how the set of $occ$ solutions can be enumerated with $O(\log\logλ)$ delay after $O(m)$-time preprocessing of $P$. To achieve this, we develop an improved linear-space distance-sensitive weighted ancestor data structure.

Authors: Gonzalo Navarro, Francisco Olivares

Contextual pattern matching is the task of, given a pattern $P[1,m]$, a context length $λ$, and a text $T[1,n]$, find all the $occ$ distinct contexts in which $P$ occurs in $T$, the context being the $λ$ symbols preceding and the $λ$ symbols following the occurrence; a text position where each context occurs must be output. While the problem can be solved in optimal time $O(m+occ)$ using $O(n)$-space precomputed data structures on $T$, this type of search is particularly relevant on large repetitive text collections, where $O(n)$ space can be prohibitive. We present the first optimal-time solution that runs in compressed space, namely that of a symmetric CDAWG (SCDAWG) of $T$. Further, we show how the set of $occ$ solutions can be enumerated with $O(\log\logλ)$ delay after $O(m)$-time preprocessing of $P$. To achieve this, we develop an improved linear-space distance-sensitive weighted ancestor data structure.

Algorithms and complexity for geodetic sets on interval and chordal graphs

from arXiv: Data Structures and Algorithms

Authors: Dibyayan Chakraborty, Sandip Das, Florent Foucaud, Harmender Gahlawat, Dimitri Lajou

We study the computational complexity of finding the geodetic number of a graph on chordal graphs and interval graphs. A set $S$ of vertices of a graph $G$ is a \textit{geodetic set} if every vertex of $G$ lies in a shortest path between some pair of vertices of $S$. The \textsc{Minimum Geodetic Set (MGS)} problem is to find a geodetic set with minimum cardinality of a given graph. We show that \textsc{Minimum Geodetic Set} is fixed parameter tractable for chordal graphs when parameterized by its \emph{tree-width} (which equals its clique number). This implies a polynomial-time algorithm for $k$-trees, for fixed $k$. Then, we show that \textsc{Minimum Geodetic Set} is NP-hard on interval graphs, thereby answering a question of Ekim et al. (LATIN, 2012), who showed that \textsc{Minimum Geodetic Set} is polynomial-time solvable on proper interval graphs. As interval graphs are very constrained, to prove the latter result, we design a rather sophisticated reduction technique to work around their inherent linear structure.

Authors: Dibyayan Chakraborty, Sandip Das, Florent Foucaud, Harmender Gahlawat, Dimitri Lajou

We study the computational complexity of finding the geodetic number of a graph on chordal graphs and interval graphs. A set $S$ of vertices of a graph $G$ is a \textit{geodetic set} if every vertex of $G$ lies in a shortest path between some pair of vertices of $S$. The \textsc{Minimum Geodetic Set (MGS)} problem is to find a geodetic set with minimum cardinality of a given graph. We show that \textsc{Minimum Geodetic Set} is fixed parameter tractable for chordal graphs when parameterized by its \emph{tree-width} (which equals its clique number). This implies a polynomial-time algorithm for $k$-trees, for fixed $k$. Then, we show that \textsc{Minimum Geodetic Set} is NP-hard on interval graphs, thereby answering a question of Ekim et al. (LATIN, 2012), who showed that \textsc{Minimum Geodetic Set} is polynomial-time solvable on proper interval graphs. As interval graphs are very constrained, to prove the latter result, we design a rather sophisticated reduction technique to work around their inherent linear structure.

$\texttt{bucket-graph-spprc}$: an extensible C++ library for the shortest path problem with resource constraints

from arXiv: Data Structures and Algorithms

Authors: Simon Spoorendonk

We present $\texttt{bucket-graph-spprc}$ ($\texttt{bgspprc}$ for short), an open-source, header-only C++23 library for the shortest path problem with resource constraints (SPPRC), the pricing subproblem at the heart of branch-cut-and-price for vehicle routing and related problems. The library implements the bucket-graph labelling algorithm of Sadykov, Uchoa and Pessoa (2021), with bidirectional labelling, across-arc concatenation, bucket fixing and arc elimination, and a structure-of-arrays label store with SIMD-accelerated dominance. Its central design feature is a compile-time resource concept: a new SPPRC variant is added by implementing a fixed seven-function interface, and resources compose into a label state with no runtime dispatch, the state layout fixed at compile time. Five resources ship built in: time/capacity, ng-path elementarity relaxation, rank-1 cuts, cumulative cost, and pickup-and-delivery. In a reproducible, head-to-head comparison on shared public instances at an identical bound, $\texttt{bgspprc}$ outperforms PathWyse (Salani, Basso and Giuffrida, 2024), the main open-source comparator, by $1.3\times$--$2.35\times$ in shifted geometric mean (and by $1.3\times$--$2.3\times$ even when itself run single-threaded), and runs within $1.9\times$--$2.4\times$ of parallel pull labelling (Petersen and Spoorendonk, 2025), a different labelling technique for the same problem. The library, benchmark scripts, and pinned instances are publicly available.

Authors: Simon Spoorendonk

We present $\texttt{bucket-graph-spprc}$ ($\texttt{bgspprc}$ for short), an open-source, header-only C++23 library for the shortest path problem with resource constraints (SPPRC), the pricing subproblem at the heart of branch-cut-and-price for vehicle routing and related problems. The library implements the bucket-graph labelling algorithm of Sadykov, Uchoa and Pessoa (2021), with bidirectional labelling, across-arc concatenation, bucket fixing and arc elimination, and a structure-of-arrays label store with SIMD-accelerated dominance. Its central design feature is a compile-time resource concept: a new SPPRC variant is added by implementing a fixed seven-function interface, and resources compose into a label state with no runtime dispatch, the state layout fixed at compile time. Five resources ship built in: time/capacity, ng-path elementarity relaxation, rank-1 cuts, cumulative cost, and pickup-and-delivery. In a reproducible, head-to-head comparison on shared public instances at an identical bound, $\texttt{bgspprc}$ outperforms PathWyse (Salani, Basso and Giuffrida, 2024), the main open-source comparator, by $1.3\times$--$2.35\times$ in shifted geometric mean (and by $1.3\times$--$2.3\times$ even when itself run single-threaded), and runs within $1.9\times$--$2.4\times$ of parallel pull labelling (Petersen and Spoorendonk, 2025), a different labelling technique for the same problem. The library, benchmark scripts, and pinned instances are publicly available.

The online monotone array completion problem

from arXiv: Data Structures and Algorithms

Authors: Vishesh Jain, Dylan King, Clayton Mizgerd

Consider the following online filling game. An array of length $n$ is initially empty. At each time step one observes an independent sample from $\mathrm{Unif}[0,1]$ and must either discard it or place it irrevocably into an empty position of the array, while preserving the constraint that the occupied entries are non-decreasing from left to right. Among all possible strategies, what is the optimal expected time required to fill the array? Let $v_n$ denote this optimal expected completion time. Our main result determines $v_n$ up to lower-order terms: \[ v_n=\left(\frac12+o(1)\right)n\log n. \] More precisely, no strategy, even if randomized and adaptive, can have expected completion time below $\left(\frac12-o(1)\right)n\log n$, while we provide an explicit deterministic strategy whose expected completion time is at most $\left(\frac12+o(1)\right)n\log n$. For comparison, the natural coupon-collector strategy, which partitions $[0,1]$ into $n$ equal intervals and reserves one array position for each interval, has expected completion time $(1+o(1))n\log n$. We also consider a with-replacement version of the game, in which previously placed entries may be overwritten. For this variant, we give a deterministic strategy with expected completion time $O(n\sqrt{\log n})$, thereby establishing a separation between the two models.

Authors: Vishesh Jain, Dylan King, Clayton Mizgerd

Consider the following online filling game. An array of length $n$ is initially empty. At each time step one observes an independent sample from $\mathrm{Unif}[0,1]$ and must either discard it or place it irrevocably into an empty position of the array, while preserving the constraint that the occupied entries are non-decreasing from left to right. Among all possible strategies, what is the optimal expected time required to fill the array? Let $v_n$ denote this optimal expected completion time. Our main result determines $v_n$ up to lower-order terms: \[ v_n=\left(\frac12+o(1)\right)n\log n. \] More precisely, no strategy, even if randomized and adaptive, can have expected completion time below $\left(\frac12-o(1)\right)n\log n$, while we provide an explicit deterministic strategy whose expected completion time is at most $\left(\frac12+o(1)\right)n\log n$. For comparison, the natural coupon-collector strategy, which partitions $[0,1]$ into $n$ equal intervals and reserves one array position for each interval, has expected completion time $(1+o(1))n\log n$. We also consider a with-replacement version of the game, in which previously placed entries may be overwritten. For this variant, we give a deterministic strategy with expected completion time $O(n\sqrt{\log n})$, thereby establishing a separation between the two models.

Improved Algorithms for Bounded-Degree (Subset) Traveling Salesman Problems

from arXiv: Data Structures and Algorithms

Authors: Jongseo Lee, Jaehyeok Kwak, Hyung-Chan An

We present improved algorithms for several bounded-degree traveling salesman problems. In the bounded-degree traveling salesman path problem (BDTSPP), given a weighted graph G=(V,E), two endpoints s and t, and degree bounds b_v for all v, the goal is to find a minimum-cost subgraph of G that admits an Eulerian s-t path and in which each vertex v has degree at most b_v. Since deciding feasibility is already NP-hard for this problem, previous work gave a bicriteria approximation algorithm. However, that algorithm provides only a multiplicative guarantee on the degree violation, and it was left open whether additive violation is possible. We answer this open question affirmatively by giving a new bicriteria approximation algorithm with additive degree violation. The cost approximation ratio is improved as well, now matching that of Hoogeveen's analysis of the Christofides-Serdyukov algorithm. This improvement relies on a new lemma that enables the use of a bounded-degree minimum spanning tree, rather than a bounded-degree Steiner tree, as a starting point for the algorithm. The lemma compares the cost and degrees of the tree against those of an integral optimum for the bounded-degree TSP at hand, rather than those of a fractional optimum. Our lemma brings improvement to the circuit version (BDTSP) as well: we give a bicriteria algorithm that matches the previous cost approximation ratio while reducing the additive degree violation to +2, which is best possible. Subset TSP is a generalization of the standard "all-vertices" TSP, in which only a specified subset of vertices is required to be visited. We present improvements for both the circuit and the path versions. For the subset path problem (BDSTSPP), we present the first bicriteria approximation algorithm with additive degree violation; for the subset circuit problem (BDSTSP), we give an improved cost approximation ratio.

Authors: Jongseo Lee, Jaehyeok Kwak, Hyung-Chan An

We present improved algorithms for several bounded-degree traveling salesman problems. In the bounded-degree traveling salesman path problem (BDTSPP), given a weighted graph G=(V,E), two endpoints s and t, and degree bounds b_v for all v, the goal is to find a minimum-cost subgraph of G that admits an Eulerian s-t path and in which each vertex v has degree at most b_v. Since deciding feasibility is already NP-hard for this problem, previous work gave a bicriteria approximation algorithm. However, that algorithm provides only a multiplicative guarantee on the degree violation, and it was left open whether additive violation is possible. We answer this open question affirmatively by giving a new bicriteria approximation algorithm with additive degree violation. The cost approximation ratio is improved as well, now matching that of Hoogeveen's analysis of the Christofides-Serdyukov algorithm. This improvement relies on a new lemma that enables the use of a bounded-degree minimum spanning tree, rather than a bounded-degree Steiner tree, as a starting point for the algorithm. The lemma compares the cost and degrees of the tree against those of an integral optimum for the bounded-degree TSP at hand, rather than those of a fractional optimum. Our lemma brings improvement to the circuit version (BDTSP) as well: we give a bicriteria algorithm that matches the previous cost approximation ratio while reducing the additive degree violation to +2, which is best possible. Subset TSP is a generalization of the standard "all-vertices" TSP, in which only a specified subset of vertices is required to be visited. We present improvements for both the circuit and the path versions. For the subset path problem (BDSTSPP), we present the first bicriteria approximation algorithm with additive degree violation; for the subset circuit problem (BDSTSP), we give an improved cost approximation ratio.

Tight bounds for clique-packing parameterized by clique-width

from arXiv: Data Structures and Algorithms

Authors: Narek Bojikian, Stefan Kratsch

In the $d$-Clique Packing problem, given a graph $G$ and an integer $t$, we need to decide whether $G$ contains a set of $t$ pairwise vertex-disjoint cliques of size $d$ each. This generalizes Triangle Packing and it is NP-complete for all $d\geq 3$. For each such $d$, we show how to solve the problem in $n^{O(k^{d-1})}$ time where $k$ is the clique-width of the graph (with a $k$-expression of $G$ given in the input). We complement this by showing that, assuming the Exponential-Time Hypothesis (ETH), there is no algorithm that solves the problem in $n^{o(k^{d-1})}$ time for any fixed $d\geq 3$, already for the special case of seeking a partition into cliques of size $d$. Our proof also entails W[1]-hardness of $d$-Clique Packing (and $d$-Clique Partition) parameterized by clique-width for each $d\geq 3$. Our work continues a series of results on ETH-tight bounds for fundamental graph problems started by Fomin et al.\ (SICOMP 2010+2014) who obtained tight bounds for Max-Cut and Edge Dominating Set.

Authors: Narek Bojikian, Stefan Kratsch

In the $d$-Clique Packing problem, given a graph $G$ and an integer $t$, we need to decide whether $G$ contains a set of $t$ pairwise vertex-disjoint cliques of size $d$ each. This generalizes Triangle Packing and it is NP-complete for all $d\geq 3$. For each such $d$, we show how to solve the problem in $n^{O(k^{d-1})}$ time where $k$ is the clique-width of the graph (with a $k$-expression of $G$ given in the input). We complement this by showing that, assuming the Exponential-Time Hypothesis (ETH), there is no algorithm that solves the problem in $n^{o(k^{d-1})}$ time for any fixed $d\geq 3$, already for the special case of seeking a partition into cliques of size $d$. Our proof also entails W[1]-hardness of $d$-Clique Packing (and $d$-Clique Partition) parameterized by clique-width for each $d\geq 3$. Our work continues a series of results on ETH-tight bounds for fundamental graph problems started by Fomin et al.\ (SICOMP 2010+2014) who obtained tight bounds for Max-Cut and Edge Dominating Set.

Token sliding independent set reconfiguration on graphs with few $P_4$'s

from arXiv: Data Structures and Algorithms

Authors: Lucia Busolini, Mario Valencia-Pabon

We consider the INDEPENDENT SET RECONFIGURATION problem under the Token Sliding rule. Let $I$ be an independent set of a simple undirected graph $G$. Suppose that each vertex of $I$ has a token placed on it. The tokens are allowed to be moved, one at a time, by sliding along the edges of $G$, so that after each move, the vertices having tokens always form an independent set of $G$. The problem we deal is to decide if we can transform $I$ into $I'$ through a sequence of steps, each of which involves substituting a vertex in the current independent set with one of its neighbours to obtain another independent set. This problem of determining if one independent set of a graph "is reachable" from another independent set of it is known to be PSPACE-hard even for split graphs, planar graphs, and graphs of bounded treewidth. Polynomial time algorithms have been obtained for certain graph classes like trees, interval graphs, claw-free graphs, bipartite permutation graphs, block graphs, and cographs. We present a polynomial time algorithm for the problem on $P_4$-tidy graphs and $(q,q-4)$-graphs, both families of graphs generalizing cographs.

Authors: Lucia Busolini, Mario Valencia-Pabon

We consider the INDEPENDENT SET RECONFIGURATION problem under the Token Sliding rule. Let $I$ be an independent set of a simple undirected graph $G$. Suppose that each vertex of $I$ has a token placed on it. The tokens are allowed to be moved, one at a time, by sliding along the edges of $G$, so that after each move, the vertices having tokens always form an independent set of $G$. The problem we deal is to decide if we can transform $I$ into $I'$ through a sequence of steps, each of which involves substituting a vertex in the current independent set with one of its neighbours to obtain another independent set. This problem of determining if one independent set of a graph "is reachable" from another independent set of it is known to be PSPACE-hard even for split graphs, planar graphs, and graphs of bounded treewidth. Polynomial time algorithms have been obtained for certain graph classes like trees, interval graphs, claw-free graphs, bipartite permutation graphs, block graphs, and cographs. We present a polynomial time algorithm for the problem on $P_4$-tidy graphs and $(q,q-4)$-graphs, both families of graphs generalizing cographs.

Distributed Property Testing with (Quantum) Carrier Pigeons: Tight Bounds on State Certification

from arXiv: Data Structures and Algorithms

Authors: Kenny Chen

Recently, Doosti et al. introduced the problem of distributed quantum state verification, where $m$ distributed nodes are given a copy of an unknown state $ρ$, and can send limited one way communication to a central node, who has a complete description of a known state $σ$. They ask how many distributed nodes $m$ are required, before the central node can succeed at distinguishing whether $ρ=σ$ or $\|ρ-σ\|_1\geq\varepsilon$ with high probability. In the setting where only quantum communication is allowed, Doosti et al. exhibit conditional lower bounds in both the public and private-coin settings, and a matching upper bound in the public-coin setting. We extend these results, and show unconditional lower bounds for when both classical and quantum communication are permitted. We show the public-coin lower bound is tight by giving an algorithm with a matching upper bound. We also show an almost tight upper bound in the private-coin setting when only quantum communication is permitted.

Authors: Kenny Chen

Recently, Doosti et al. introduced the problem of distributed quantum state verification, where $m$ distributed nodes are given a copy of an unknown state $ρ$, and can send limited one way communication to a central node, who has a complete description of a known state $σ$. They ask how many distributed nodes $m$ are required, before the central node can succeed at distinguishing whether $ρ=σ$ or $\|ρ-σ\|_1\geq\varepsilon$ with high probability. In the setting where only quantum communication is allowed, Doosti et al. exhibit conditional lower bounds in both the public and private-coin settings, and a matching upper bound in the public-coin setting. We extend these results, and show unconditional lower bounds for when both classical and quantum communication are permitted. We show the public-coin lower bound is tight by giving an algorithm with a matching upper bound. We also show an almost tight upper bound in the private-coin setting when only quantum communication is permitted.