Last Update

OPML feed of all feeds.

Subscribe to the Atom feed, RSS feed to stay up to date.

Thank you to arXiv for use of its open access interoperability.

Note: the date of arXiv entries announced right after publication holidays might incorrectly show up as the date of the publication holiday itself. This is due to our ad hoc method of inferring announcement dates, which are not returned by the arXiv API.

Powered by Pluto.

Source on GitHub.

Maintained by Nima Anari, Arnab Bhattacharyya, Gautam Kamath.

Theory of Computing Report

Wednesday, July 01

Three Years of Argmin

from Ben Recht

My annual navel-gazing post.

My argmin.md file compels me to write a blog about blogging every July 1st. It’s my official Substack anniversary, and we’ve made it through year three. The process here is, for better or worse, set in stone and could be written as a skill file for an LLM agent. I listed the rules last year, and they haven’t changed since. I don’t expect them to change this year either.

In fact, going back over my posts, it seems like the most novel change was adding a new kind of content that got decidedly mixed reviews: football blogging. Of all the topics I’ve dabbled with on this Substack, there is none more polarizing. Some people really liked it! Others, let’s say, did not. I’m waiting until training camp gets going before I decide whether I’ll commit to another season of yelling about the absurdity of win probability.

The football blogging is related to one of the trickier things I haven’t been able to figure out in three years writing this blog. Argmin is a periodical with different authors. Those authors just all happen to be me. I realize you receive a notification from me whether I’m writing about a topic you care about or not. This is a problem I don’t really know how to address, but I suppose it is an issue with all Substack subscriptions. I don’t want to take on the bother of splitting this into four different Substacks, which I think would annoy you even more than it would annoy me. I hope the disclaimers at the top of the posts help you decide if you want to keep reading. I would love feedback on how to make that signposting clearer.

Relatedly, commenters are increasingly asking questions about things I’ve already talked about. This is totally expected, as I know I write more than anyone likely cares to read, but I worry about repeating myself—thus alienating a different collection of readers—when addressing them. The balance between repetition and clarity is delicate. Apoorva Lal said my blog has its own private dialect. If I don’t repeat myself somewhat, filling in the context of what my brain thinks obvious, the content will be even less approachable, and no one wants that.

Moreover, sometimes the only way to iron out a nonobvious point is by repeating it until it’s clear. I wrote in the afterword of The Irrational Decision that I used this blog “as a sketch pad, where I first workshopped much of the text that you read here.” This newsletter remains a place for me to play and write out first drafts. I get a lot out of putting ideas out there, even when they are way underbaked. “Talks and blogs are ephemeral, but interactions with the audience helped to shape and hone my arguments. All of those early interlocutors, in person and online, helped shape this final archival form. I’m thankful to all of you.”

In the spirit of learning from the audience, we can use this year’s reflection post as a chance for you to tell me what you’d change about the argmin magazine. Do you want more of any topic? Would you prefer a different format or cadence of posts? You can email me if you’d prefer not to comment (my berkeley.edu address is fine for that).

In the meantime, I can give you a preview of what to expect over the next year. There will definitely be more live course blogging. In the fall, I am planning an unhinged new class on forecasting, the tools we use to do it, and why we’re so obsessed with it. That should be a fun one. In the Spring, I’m apparently signed up to teach undergrad probability. This is for engineering students and based on the excellent text of Bertsekas and Tsitsiklis. Since it’s about engineering, we’ll have to grapple with some of the complicated metaphysics of what we think these probabilities mean in the context of building things. You, me, and the undergrads will all get confused together.

As for the remainder of this summer, I want to finish this series Against Optimaxxing. I’ve been talking about writing this forever, and I need to put in the effort to finish it. It would be an attempt at writing what Henry Farrell calls speculative nonfiction, seeking a reasonable language to imagine a future free from the vulgar positivism that runs our culture, politics, and industry. This blog series will try to establish a necessary vocabulary. My draft writing on this particular topic might lead nowhere. It might remain a private dialect. But I want to try to write it down.

Subscribe now

By Ben Recht

Witness Complexity of Short Descriptions: A Cryptographic Perspective

from arXiv: Computational Complexity

Authors: Fabio F. G. Buono

In cryptographic practice, where protocols impose strict time bounds, implementations demand predictable resource usage, and real-world systems require immediate verification for security and usability, a short key or certificate is useful only if it can be expanded or verified within a bounded time; otherwise a compact representation that requires superpolynomial work to expand offers no operational guarantee within a bounded-time protocol. This paper formalises that gap by introducing \emph{witness complexity} \(\gam(x)\), the minimum running time over near-shortest descriptions of a string on a universal Turing machine. \(\gam\) differs from Shannon entropy and Kolmogorov complexity \(\KC\): low \(\KC\) can coexist with high \(\gam\). We prove invariance up to polynomial factors; a conditional separation (assuming \(\PneqNP\)). An unconditional lower bound from incomputability of \(\KC\); a biconditional characterisation of \(\PeqNP\) via the class-relative variant \(\gP\); and polynomial-time tractability for structured \(\classNP\) families. Part II develops companion measures and shows an unconditional gap between grammar size and derivation cost, positioning \(\gam\) as a metric for the usability of keys and certificates.

Authors: Fabio F. G. Buono

In cryptographic practice, where protocols impose strict time bounds, implementations demand predictable resource usage, and real-world systems require immediate verification for security and usability, a short key or certificate is useful only if it can be expanded or verified within a bounded time; otherwise a compact representation that requires superpolynomial work to expand offers no operational guarantee within a bounded-time protocol. This paper formalises that gap by introducing \emph{witness complexity} \(\gam(x)\), the minimum running time over near-shortest descriptions of a string on a universal Turing machine. \(\gam\) differs from Shannon entropy and Kolmogorov complexity \(\KC\): low \(\KC\) can coexist with high \(\gam\). We prove invariance up to polynomial factors; a conditional separation (assuming \(\PneqNP\)). An unconditional lower bound from incomputability of \(\KC\); a biconditional characterisation of \(\PeqNP\) via the class-relative variant \(\gP\); and polynomial-time tractability for structured \(\classNP\) families. Part II develops companion measures and shows an unconditional gap between grammar size and derivation cost, positioning \(\gam\) as a metric for the usability of keys and certificates.

The Fourth-Root Complexity of Data Movement

from arXiv: Computational Complexity

Authors: Chen Ding

Time complexity typically assumes $O(1)$ cost per data access. This paper presents an analysis based on an abstract memory hierarchy. For a common class of applications, it shows that the data-access cost scales with the fourth root of data size, that is, as data size $N$ increases, the cost of each access increases at the rate of $N^\frac{1}{4}$. While the analysis does not predict performance, it predicts scalability. Specifically, the paper provides a precise analysis that shows the constant-factor difference between cases where the miss ratio follows a power law versus an exponential decay.

Authors: Chen Ding

Time complexity typically assumes $O(1)$ cost per data access. This paper presents an analysis based on an abstract memory hierarchy. For a common class of applications, it shows that the data-access cost scales with the fourth root of data size, that is, as data size $N$ increases, the cost of each access increases at the rate of $N^\frac{1}{4}$. While the analysis does not predict performance, it predicts scalability. Specifically, the paper provides a precise analysis that shows the constant-factor difference between cases where the miss ratio follows a power law versus an exponential decay.

Complexity of Universality and Related Decision Problems for Unary Two-Dimensional Automata

from arXiv: Computational Complexity

Authors: Taylor J. Smith

A two-dimensional automaton is able to move its input head through its input word in four directions: upward, downward, leftward, and rightward. If we prevent the input head from moving upward, then we obtain a three-way two-dimensional automaton; preventing both upward and leftward movements results in a two-way two-dimensional automaton. While much is known about the decidability and complexity properties of the two-dimensional automaton model, the unary variant of this model is less studied. We show that the universality, equivalence, and inclusion problems for unary three-way deterministic two-dimensional automata are coNP-hard, while for the corresponding two-way model, the universality, equivalence, inclusion, and disjointness problems are in P. We further show that the universality, equivalence, and inclusion problems for unary two-way nondeterministic two-dimensional automata are coNP-hard and in ELEMENTARY; and the disjointness problem for the same model is NL-hard and in ELEMENTARY. Finally, we establish the decidability of a bounded variant of the universality problem for unary three-way nondeterministic two-dimensional automata, and show that this variant problem is coNP-complete.

Authors: Taylor J. Smith

A two-dimensional automaton is able to move its input head through its input word in four directions: upward, downward, leftward, and rightward. If we prevent the input head from moving upward, then we obtain a three-way two-dimensional automaton; preventing both upward and leftward movements results in a two-way two-dimensional automaton. While much is known about the decidability and complexity properties of the two-dimensional automaton model, the unary variant of this model is less studied. We show that the universality, equivalence, and inclusion problems for unary three-way deterministic two-dimensional automata are coNP-hard, while for the corresponding two-way model, the universality, equivalence, inclusion, and disjointness problems are in P. We further show that the universality, equivalence, and inclusion problems for unary two-way nondeterministic two-dimensional automata are coNP-hard and in ELEMENTARY; and the disjointness problem for the same model is NL-hard and in ELEMENTARY. Finally, we establish the decidability of a bounded variant of the universality problem for unary three-way nondeterministic two-dimensional automata, and show that this variant problem is coNP-complete.

Taming Complexity in Intuitionistic Modal Logic: The Case of FIK and Its Shallow Calculus

from arXiv: Computational Complexity

Authors: Han Gao, Nicola Olivetti

Intuitionistic modal logics (IMLs) comprise many systems: from constructive modal logics such as CK and Wijesekera's CCDL to Fischer Servi/Simpson's IK, as well as some recently introduced variants. All of them are characterized by bi-relational semantics and have complete axiomatisations. However, from the perspective of proof theory and complexity, there are strong differences: while for constructive modal logics simple Gentzen calculi suffice, for IK more complex calculi, based on nested or labelled sequents, are needed. As a consequence, the decision problem for constructive modal logics has a PSPACE upper bound, whereas for IK is not known and it is even conjectured to be non-elementary. We study here the proof theory and complexity of FIK, a natural intuitionistic modal logic recently introduced. FIK is strictly in between CCDL and IK, yet it has the same forcing conditions as IK. We define a "shallow" sequent calculus for FIK which is a nested sequent calculus where sequents have at most one level of nesting. We prove its syntactic completeness by showing the admissibility of cut. By means of this calculus we show that decision problem for FIK is in EXPSPACE, whence significantly lower than the complexity conjectured for IK.

Authors: Han Gao, Nicola Olivetti

Intuitionistic modal logics (IMLs) comprise many systems: from constructive modal logics such as CK and Wijesekera's CCDL to Fischer Servi/Simpson's IK, as well as some recently introduced variants. All of them are characterized by bi-relational semantics and have complete axiomatisations. However, from the perspective of proof theory and complexity, there are strong differences: while for constructive modal logics simple Gentzen calculi suffice, for IK more complex calculi, based on nested or labelled sequents, are needed. As a consequence, the decision problem for constructive modal logics has a PSPACE upper bound, whereas for IK is not known and it is even conjectured to be non-elementary. We study here the proof theory and complexity of FIK, a natural intuitionistic modal logic recently introduced. FIK is strictly in between CCDL and IK, yet it has the same forcing conditions as IK. We define a "shallow" sequent calculus for FIK which is a nested sequent calculus where sequents have at most one level of nesting. We prove its syntactic completeness by showing the admissibility of cut. By means of this calculus we show that decision problem for FIK is in EXPSPACE, whence significantly lower than the complexity conjectured for IK.

Counting Small Induced Subgraphs: Hardness of Symmetry-Based Properties

from arXiv: Computational Complexity

Authors: Radu Curticapean, Mingjun Liu

Jerrum and Meeks (TOCT, JCSS 2015) introduced the counting problems $\text{IndSub}(Φ)$ for fixed graph properties $Φ$: Given an input graph $G$ and $k\in\mathbb N$, count the $k$-vertex subsets $S \subseteq V(G)$ such that the induced subgraph $G[S]$ satisfies $Φ$. For recursively enumerable $Φ$, it is known that $\text{IndSub}(Φ)$ is either #W[1]-hard or fixed-parameter tractable. A direct classification depending on $Φ$ however still remains open. In particular, the status was open for the property of graphs without nontrivial automorphisms, also mentioned in a very recent survey on parameterized counting by Roth (Comput.~Sci.~Rev.~2026). This is a natural property that evades all currently known techniques for proving #W[1]-hardness, including a general toolkit based on Fourier analysis that was very recently introduced by Curticapean and Neuen (SODA~2025). In this paper, we show that counting induced $k$-vertex graphs without nontrivial automorphisms is #W[1]-hard by constructing ``clique scaffolds'', i.e., problem-specific restrictions of the property that enable a reduction from the $k$-clique problem. More generally, we show that for every finite group $Q$, counting $k$-vertex induced subgraphs with automorphism group $Q$ is #W[1]-hard.

Authors: Radu Curticapean, Mingjun Liu

Jerrum and Meeks (TOCT, JCSS 2015) introduced the counting problems $\text{IndSub}(Φ)$ for fixed graph properties $Φ$: Given an input graph $G$ and $k\in\mathbb N$, count the $k$-vertex subsets $S \subseteq V(G)$ such that the induced subgraph $G[S]$ satisfies $Φ$. For recursively enumerable $Φ$, it is known that $\text{IndSub}(Φ)$ is either #W[1]-hard or fixed-parameter tractable. A direct classification depending on $Φ$ however still remains open. In particular, the status was open for the property of graphs without nontrivial automorphisms, also mentioned in a very recent survey on parameterized counting by Roth (Comput.~Sci.~Rev.~2026). This is a natural property that evades all currently known techniques for proving #W[1]-hardness, including a general toolkit based on Fourier analysis that was very recently introduced by Curticapean and Neuen (SODA~2025). In this paper, we show that counting induced $k$-vertex graphs without nontrivial automorphisms is #W[1]-hard by constructing ``clique scaffolds'', i.e., problem-specific restrictions of the property that enable a reduction from the $k$-clique problem. More generally, we show that for every finite group $Q$, counting $k$-vertex induced subgraphs with automorphism group $Q$ is #W[1]-hard.

Sensitivity Lower Bounds via Locally Testable Codes

from arXiv: Data Structures and Algorithms

Authors: Yuichi Yoshida, Zihan Zhang

Sensitivity quantifies how far an algorithm's output can move in Hamming distance when a single input element is perturbed. We present a general scheme turning any locally testable code (LTC) into a sensitivity lower bound for the CSP encoded by the code. Instantiating it with the $c^3$-LTC yields a constant $\varepsilon > 0$ for which every $(1-\varepsilon)$-approximation algorithm for satisfiable Max E3LIN2 has sensitivity $Ω(n)$, strengthening the previous $Ω(n^δ)$ lower bound known only for general instances. Standard reductions give an $n^{1-O(\varepsilon)}$ lower bound for $n^{-\varepsilon}$-approximation algorithms for maximum clique, and an $Ω(k)$ bound for $(1-\varepsilon)$-approximation algorithms for maximum $k$-coverage, among others. These lower bounds match trivial upper bounds and imply that even slightly stable algorithms cannot achieve these approximations. A second instantiation, using a simple repetition code, shows that any $(1-\varepsilon)$-approximation algorithm for Max Cut on bipartite graphs has sensitivity $Ω(1/\varepsilon)$ as long as $\varepsilon = O(1/\sqrt{n})$, extending the prior result for exact algorithms (the regime $\varepsilon < 1/n$). Thus even on bipartite graphs, where a perfect cut always exists, near-optimal solutions cannot be maintained stably. Our sensitivity lower bounds have two applications. First, they yield averaged sensitivity lower bounds, implying that any nearly optimal randomized algorithm for Max E3SAT has linearly many output bits that, after fixing the random seed, are Boolean functions with large influence. Second, via the sensitivity-to-locality connection, they imply locality lower bounds in the non-signaling model, which generalizes $\mathsf{LOCAL}$ and quantum-$\mathsf{LOCAL}$.

Authors: Yuichi Yoshida, Zihan Zhang

Sensitivity quantifies how far an algorithm's output can move in Hamming distance when a single input element is perturbed. We present a general scheme turning any locally testable code (LTC) into a sensitivity lower bound for the CSP encoded by the code. Instantiating it with the $c^3$-LTC yields a constant $\varepsilon > 0$ for which every $(1-\varepsilon)$-approximation algorithm for satisfiable Max E3LIN2 has sensitivity $Ω(n)$, strengthening the previous $Ω(n^δ)$ lower bound known only for general instances. Standard reductions give an $n^{1-O(\varepsilon)}$ lower bound for $n^{-\varepsilon}$-approximation algorithms for maximum clique, and an $Ω(k)$ bound for $(1-\varepsilon)$-approximation algorithms for maximum $k$-coverage, among others. These lower bounds match trivial upper bounds and imply that even slightly stable algorithms cannot achieve these approximations. A second instantiation, using a simple repetition code, shows that any $(1-\varepsilon)$-approximation algorithm for Max Cut on bipartite graphs has sensitivity $Ω(1/\varepsilon)$ as long as $\varepsilon = O(1/\sqrt{n})$, extending the prior result for exact algorithms (the regime $\varepsilon < 1/n$). Thus even on bipartite graphs, where a perfect cut always exists, near-optimal solutions cannot be maintained stably. Our sensitivity lower bounds have two applications. First, they yield averaged sensitivity lower bounds, implying that any nearly optimal randomized algorithm for Max E3SAT has linearly many output bits that, after fixing the random seed, are Boolean functions with large influence. Second, via the sensitivity-to-locality connection, they imply locality lower bounds in the non-signaling model, which generalizes $\mathsf{LOCAL}$ and quantum-$\mathsf{LOCAL}$.

Orienting Unrooted Binary Networks Faster: Focus on the Generator

from arXiv: Data Structures and Algorithms

Authors: Jannik Schestag, Norbert Zeh

The problem of orienting an unrooted network to obtain a specific class of rooted phylogenetic networks is known to be NP-hard in many cases. In this paper, we introduce two algorithmic frameworks that yield significantly improved fixed-parameter tractable (FPT) algorithms parameterized by the network level $\ell$. Our first main contribution shows that for several prominent network classes, the core algorithmic difficulty lies in finding a directed spanning tree on the network's undirected generator. By enumerating these spanning trees in $O(5.3334^\ell + \ell)$ time and orienting all remaining edges in polynomial time, we solve the orientation problem in $O(5.3334^\ell \cdot n)$ time for tree-based networks and in $O(5.3334^\ell \cdot n^2)$ time for orchards, where $n$ is the number of vertices of the graph. Extending this approach with further branching yields $O(10.6667^\ell \cdot n^2)$-time algorithms for tree-child and normal networks. Our second technique bypasses spanning trees by directly guessing the placement of reticulations on the generator. This framework provides $O(12.2071^\ell \cdot n^2)$-time algorithms for temporal, reticulation-visible, and tree-sibling networks. Finally, we demonstrate the versatility of the reticulation-guessing framework by showing that even computing an orientation with minimum scanwidth is single-exponential FPT with respect to the level. Together, these results significantly improve the best-known running times for phylogenetic network orientation.

Authors: Jannik Schestag, Norbert Zeh

The problem of orienting an unrooted network to obtain a specific class of rooted phylogenetic networks is known to be NP-hard in many cases. In this paper, we introduce two algorithmic frameworks that yield significantly improved fixed-parameter tractable (FPT) algorithms parameterized by the network level $\ell$. Our first main contribution shows that for several prominent network classes, the core algorithmic difficulty lies in finding a directed spanning tree on the network's undirected generator. By enumerating these spanning trees in $O(5.3334^\ell + \ell)$ time and orienting all remaining edges in polynomial time, we solve the orientation problem in $O(5.3334^\ell \cdot n)$ time for tree-based networks and in $O(5.3334^\ell \cdot n^2)$ time for orchards, where $n$ is the number of vertices of the graph. Extending this approach with further branching yields $O(10.6667^\ell \cdot n^2)$-time algorithms for tree-child and normal networks. Our second technique bypasses spanning trees by directly guessing the placement of reticulations on the generator. This framework provides $O(12.2071^\ell \cdot n^2)$-time algorithms for temporal, reticulation-visible, and tree-sibling networks. Finally, we demonstrate the versatility of the reticulation-guessing framework by showing that even computing an orientation with minimum scanwidth is single-exponential FPT with respect to the level. Together, these results significantly improve the best-known running times for phylogenetic network orientation.

Directed Low Diameter Decomposition for Structured Digraphs

from arXiv: Data Structures and Algorithms

Authors: Shinwoo An, Arnold Filtser

Low diameter decompositions, or LDDs for short, are a fundamental primitive in the design of efficient graph algorithms. Roughly speaking, an LDD is a distribution over partitions of the vertices into bounded-diameter clusters such that nearby vertices are likely to be clustered together. Recently, there has been growing interest in lifting the notion of LDDs into \emph{directed graphs}. In particular, there are two natural directed analogues. The first is a directed LDD, where after removing a random subset of edges, every strongly connected component has a small diameter. The second is a quasipartition, which imposes the stronger requirement that whenever one vertex can still reach another after the edge removal, the two vertices must be close in the original directed metric. Every quasipartition yields an LDD, but the converse does not necessarily hold. In this work, we initiate the systematic study of LDDs in structured directed graphs. As our first main result, we show that any directed graph with pathwidth $\mathsf{pw}$ admits an $(O(\mathsf{pw}), Δ)$-LDD. This improves upon the previous best-known $(2^{O(\mathsf{pw}^2)}, Δ)$-LDD construction, which was implicitly derived from the quasipartition result of Salmasi, Sidiropoulos, and Sridhar [SODA'19]. As our second result, we show that the integrality gap of the Directed Non-Bipartite Sparsest-Cut LP relaxation on an $n$-vertex graph with treewidth $\mathsf{tw}$ is $O(\mathsf{tw} \log n)$. This improves upon the $O(\mathsf{tw}\log^2 n)$ bound of Mémoli, Sidiropoulos, and Sridhar [ICALP'16, Algorithmica'18]. We obtain this result through the refined analysis of the quasipartition construction of Mémoli et al. for bounded treewidth graphs.

Authors: Shinwoo An, Arnold Filtser

Low diameter decompositions, or LDDs for short, are a fundamental primitive in the design of efficient graph algorithms. Roughly speaking, an LDD is a distribution over partitions of the vertices into bounded-diameter clusters such that nearby vertices are likely to be clustered together. Recently, there has been growing interest in lifting the notion of LDDs into \emph{directed graphs}. In particular, there are two natural directed analogues. The first is a directed LDD, where after removing a random subset of edges, every strongly connected component has a small diameter. The second is a quasipartition, which imposes the stronger requirement that whenever one vertex can still reach another after the edge removal, the two vertices must be close in the original directed metric. Every quasipartition yields an LDD, but the converse does not necessarily hold. In this work, we initiate the systematic study of LDDs in structured directed graphs. As our first main result, we show that any directed graph with pathwidth $\mathsf{pw}$ admits an $(O(\mathsf{pw}), Δ)$-LDD. This improves upon the previous best-known $(2^{O(\mathsf{pw}^2)}, Δ)$-LDD construction, which was implicitly derived from the quasipartition result of Salmasi, Sidiropoulos, and Sridhar [SODA'19]. As our second result, we show that the integrality gap of the Directed Non-Bipartite Sparsest-Cut LP relaxation on an $n$-vertex graph with treewidth $\mathsf{tw}$ is $O(\mathsf{tw} \log n)$. This improves upon the $O(\mathsf{tw}\log^2 n)$ bound of Mémoli, Sidiropoulos, and Sridhar [ICALP'16, Algorithmica'18]. We obtain this result through the refined analysis of the quasipartition construction of Mémoli et al. for bounded treewidth graphs.

Graph Scheduling with Group Completion Times

from arXiv: Data Structures and Algorithms

Authors: Lars Rohwedder, Leander Schnaars

In the Graph Scheduling problem we schedule a given multiset of edges on discrete time steps, such that at each step the set of edges forms a matching. The goal is to minimize the sum of weighted group completion times, where a group is a set of edges and it completes when the last edge has been scheduled. Two popular variants of this problem are Coflow Scheduling and Data Migration. Our main result is extending a recent iterated rounding approach from Coflow Scheduling, roughly corresponding to the bipartite case, to the general Graph Scheduling problem. This yields an essentially tight $(2+ε)$-approximation for the asymptotic setting where OPT is assumed to be large. For this we rely on polyhedral techniques from general matching, namely odd-set inequalities, and graph theoretical results on edge colorings in multigraphs. The state-of-the-art approximation algorithm for Data Migration is a $(1 + φ)$-approximation that improves when OPT is small. Taking the best of this and our main result, we obtain an improvement of the approximation rate for Data Migration in any regime.

Authors: Lars Rohwedder, Leander Schnaars

In the Graph Scheduling problem we schedule a given multiset of edges on discrete time steps, such that at each step the set of edges forms a matching. The goal is to minimize the sum of weighted group completion times, where a group is a set of edges and it completes when the last edge has been scheduled. Two popular variants of this problem are Coflow Scheduling and Data Migration. Our main result is extending a recent iterated rounding approach from Coflow Scheduling, roughly corresponding to the bipartite case, to the general Graph Scheduling problem. This yields an essentially tight $(2+ε)$-approximation for the asymptotic setting where OPT is assumed to be large. For this we rely on polyhedral techniques from general matching, namely odd-set inequalities, and graph theoretical results on edge colorings in multigraphs. The state-of-the-art approximation algorithm for Data Migration is a $(1 + φ)$-approximation that improves when OPT is small. Taking the best of this and our main result, we obtain an improvement of the approximation rate for Data Migration in any regime.

Constant-factor approximation of maximum distance-2 independent set in graphs of bounded merge-width

from arXiv: Data Structures and Algorithms

Authors: Maël Dumas

We give a constant-factor approximation algorithm for Max Dist-2 Independent Set in graphs of bounded radius-2 merge-width. The same result holds for Min Dominating Set from [Bonamy and Geniet, 2025], [Chan et al., SODA '12]. Both approximation algorithms are LP-based, showing that the domination-to-2-independence ratio is bounded in graphs of bounded radius-2 merge-width. Moreover, this result is tight in the sense that the ratio can be unbounded in graphs of bounded radius-1 merge-width.

Authors: Maël Dumas

We give a constant-factor approximation algorithm for Max Dist-2 Independent Set in graphs of bounded radius-2 merge-width. The same result holds for Min Dominating Set from [Bonamy and Geniet, 2025], [Chan et al., SODA '12]. Both approximation algorithms are LP-based, showing that the domination-to-2-independence ratio is bounded in graphs of bounded radius-2 merge-width. Moreover, this result is tight in the sense that the ratio can be unbounded in graphs of bounded radius-1 merge-width.

Planar Embedding of Okamura-Seymour Quasimetrics in Polynomial Time with an Application to Distributed SSSP

from arXiv: Data Structures and Algorithms

Authors: Hung Le, Hector Tierno, Shuang Yang

A quasi-metric $(T,δ_T)$ is an Okamura-Seymour quasimetric if there exists an edge-weighted planar embedded directed graph $G = (V,E,w)$ such that $T$ is a set of terminals on the outerface of $G$ and $δ_G(t,t') = δ_T(t,t')$ for every pair $(t,t')\in T\times T$. If $(T,δ_T)$ is an Okamura-Seymour quasimetric, then $G$ is a planar embedding of $(T,δ_T)$. In a recent pioneering work, Chen and Tan gave a polynomial-time algorithm to test if a given quasi-metric $(T,δ_T)$ is an Okamura-Seymour quasimetric. A key step in their proof is existential, which suffices for an efficient testing algorithm but does not imply an efficient embedding algorithm. Our paper closes this gap by giving an algorithmic implementation of their existential step via linear programming. As a result, we obtain the first polynomial-time algorithm for finding a planar embedding of any given Okamura-Seymour quasimetric $(T,δ_T)$. As an application, we show how to use our planar embedding of Okamura-Seymour quasimetrics to compute a $(1+ε)$-approximate single-source shortest path (SSSP) in planar directed graphs in the distributed CONGEST model in $\widetilde{O}(D)$ rounds for any fixed $ε\in (0,1)$, nearly matching a simple lower bound of $Ω(D)$ and resolving a fundamental problem in this area. The best-known algorithm for this problem has round complexity $\widetilde{O}(D^2)$.

Authors: Hung Le, Hector Tierno, Shuang Yang

A quasi-metric $(T,δ_T)$ is an Okamura-Seymour quasimetric if there exists an edge-weighted planar embedded directed graph $G = (V,E,w)$ such that $T$ is a set of terminals on the outerface of $G$ and $δ_G(t,t') = δ_T(t,t')$ for every pair $(t,t')\in T\times T$. If $(T,δ_T)$ is an Okamura-Seymour quasimetric, then $G$ is a planar embedding of $(T,δ_T)$. In a recent pioneering work, Chen and Tan gave a polynomial-time algorithm to test if a given quasi-metric $(T,δ_T)$ is an Okamura-Seymour quasimetric. A key step in their proof is existential, which suffices for an efficient testing algorithm but does not imply an efficient embedding algorithm. Our paper closes this gap by giving an algorithmic implementation of their existential step via linear programming. As a result, we obtain the first polynomial-time algorithm for finding a planar embedding of any given Okamura-Seymour quasimetric $(T,δ_T)$. As an application, we show how to use our planar embedding of Okamura-Seymour quasimetrics to compute a $(1+ε)$-approximate single-source shortest path (SSSP) in planar directed graphs in the distributed CONGEST model in $\widetilde{O}(D)$ rounds for any fixed $ε\in (0,1)$, nearly matching a simple lower bound of $Ω(D)$ and resolving a fundamental problem in this area. The best-known algorithm for this problem has round complexity $\widetilde{O}(D^2)$.

Practical Linear-Time Computation of Smallest Suffixient Sets

from arXiv: Data Structures and Algorithms

Authors: Francisco Olivares, Gonzalo Navarro

Suffixient arrays are recent structures that have attracted attention because they offer relevant pattern matching functionality in less asymptotic space than the Run-Length BWT, the de-facto standard to index highly repetitive string collections. Various algorithms exist for building them from the suffix array data structures. We present the first construction algorithm that is (i) linear-time, (ii) one-pass over the structures, and (iii) implemented and practical. This makes the construction particularly useful on large text collections, which we demonstrate empirically by showing that it dominates the space/time tradeoff map of the implemented constructions.

Authors: Francisco Olivares, Gonzalo Navarro

Suffixient arrays are recent structures that have attracted attention because they offer relevant pattern matching functionality in less asymptotic space than the Run-Length BWT, the de-facto standard to index highly repetitive string collections. Various algorithms exist for building them from the suffix array data structures. We present the first construction algorithm that is (i) linear-time, (ii) one-pass over the structures, and (iii) implemented and practical. This makes the construction particularly useful on large text collections, which we demonstrate empirically by showing that it dominates the space/time tradeoff map of the implemented constructions.

Optimal-Time Contextual Pattern Matching in Compressed Space

from arXiv: Data Structures and Algorithms

Authors: Gonzalo Navarro, Francisco Olivares

Contextual pattern matching is the task of, given a pattern $P[1,m]$, a context length $λ$, and a text $T[1,n]$, find all the $occ$ distinct contexts in which $P$ occurs in $T$, the context being the $λ$ symbols preceding and the $λ$ symbols following the occurrence; a text position where each context occurs must be output. While the problem can be solved in optimal time $O(m+occ)$ using $O(n)$-space precomputed data structures on $T$, this type of search is particularly relevant on large repetitive text collections, where $O(n)$ space can be prohibitive. We present the first optimal-time solution that runs in compressed space, namely that of a symmetric CDAWG (SCDAWG) of $T$. Further, we show how the set of $occ$ solutions can be enumerated with $O(\log\logλ)$ delay after $O(m)$-time preprocessing of $P$. To achieve this, we develop an improved linear-space distance-sensitive weighted ancestor data structure.

Authors: Gonzalo Navarro, Francisco Olivares

Contextual pattern matching is the task of, given a pattern $P[1,m]$, a context length $λ$, and a text $T[1,n]$, find all the $occ$ distinct contexts in which $P$ occurs in $T$, the context being the $λ$ symbols preceding and the $λ$ symbols following the occurrence; a text position where each context occurs must be output. While the problem can be solved in optimal time $O(m+occ)$ using $O(n)$-space precomputed data structures on $T$, this type of search is particularly relevant on large repetitive text collections, where $O(n)$ space can be prohibitive. We present the first optimal-time solution that runs in compressed space, namely that of a symmetric CDAWG (SCDAWG) of $T$. Further, we show how the set of $occ$ solutions can be enumerated with $O(\log\logλ)$ delay after $O(m)$-time preprocessing of $P$. To achieve this, we develop an improved linear-space distance-sensitive weighted ancestor data structure.

Algorithms and complexity for geodetic sets on interval and chordal graphs

from arXiv: Data Structures and Algorithms

Authors: Dibyayan Chakraborty, Sandip Das, Florent Foucaud, Harmender Gahlawat, Dimitri Lajou

We study the computational complexity of finding the geodetic number of a graph on chordal graphs and interval graphs. A set $S$ of vertices of a graph $G$ is a \textit{geodetic set} if every vertex of $G$ lies in a shortest path between some pair of vertices of $S$. The \textsc{Minimum Geodetic Set (MGS)} problem is to find a geodetic set with minimum cardinality of a given graph. We show that \textsc{Minimum Geodetic Set} is fixed parameter tractable for chordal graphs when parameterized by its \emph{tree-width} (which equals its clique number). This implies a polynomial-time algorithm for $k$-trees, for fixed $k$. Then, we show that \textsc{Minimum Geodetic Set} is NP-hard on interval graphs, thereby answering a question of Ekim et al. (LATIN, 2012), who showed that \textsc{Minimum Geodetic Set} is polynomial-time solvable on proper interval graphs. As interval graphs are very constrained, to prove the latter result, we design a rather sophisticated reduction technique to work around their inherent linear structure.

Authors: Dibyayan Chakraborty, Sandip Das, Florent Foucaud, Harmender Gahlawat, Dimitri Lajou

We study the computational complexity of finding the geodetic number of a graph on chordal graphs and interval graphs. A set $S$ of vertices of a graph $G$ is a \textit{geodetic set} if every vertex of $G$ lies in a shortest path between some pair of vertices of $S$. The \textsc{Minimum Geodetic Set (MGS)} problem is to find a geodetic set with minimum cardinality of a given graph. We show that \textsc{Minimum Geodetic Set} is fixed parameter tractable for chordal graphs when parameterized by its \emph{tree-width} (which equals its clique number). This implies a polynomial-time algorithm for $k$-trees, for fixed $k$. Then, we show that \textsc{Minimum Geodetic Set} is NP-hard on interval graphs, thereby answering a question of Ekim et al. (LATIN, 2012), who showed that \textsc{Minimum Geodetic Set} is polynomial-time solvable on proper interval graphs. As interval graphs are very constrained, to prove the latter result, we design a rather sophisticated reduction technique to work around their inherent linear structure.

$\texttt{bucket-graph-spprc}$: an extensible C++ library for the shortest path problem with resource constraints

from arXiv: Data Structures and Algorithms

Authors: Simon Spoorendonk

We present $\texttt{bucket-graph-spprc}$ ($\texttt{bgspprc}$ for short), an open-source, header-only C++23 library for the shortest path problem with resource constraints (SPPRC), the pricing subproblem at the heart of branch-cut-and-price for vehicle routing and related problems. The library implements the bucket-graph labelling algorithm of Sadykov, Uchoa and Pessoa (2021), with bidirectional labelling, across-arc concatenation, bucket fixing and arc elimination, and a structure-of-arrays label store with SIMD-accelerated dominance. Its central design feature is a compile-time resource concept: a new SPPRC variant is added by implementing a fixed seven-function interface, and resources compose into a label state with no runtime dispatch, the state layout fixed at compile time. Five resources ship built in: time/capacity, ng-path elementarity relaxation, rank-1 cuts, cumulative cost, and pickup-and-delivery. In a reproducible, head-to-head comparison on shared public instances at an identical bound, $\texttt{bgspprc}$ outperforms PathWyse (Salani, Basso and Giuffrida, 2024), the main open-source comparator, by $1.3\times$--$2.35\times$ in shifted geometric mean (and by $1.3\times$--$2.3\times$ even when itself run single-threaded), and runs within $1.9\times$--$2.4\times$ of parallel pull labelling (Petersen and Spoorendonk, 2025), a different labelling technique for the same problem. The library, benchmark scripts, and pinned instances are publicly available.

Authors: Simon Spoorendonk

We present $\texttt{bucket-graph-spprc}$ ($\texttt{bgspprc}$ for short), an open-source, header-only C++23 library for the shortest path problem with resource constraints (SPPRC), the pricing subproblem at the heart of branch-cut-and-price for vehicle routing and related problems. The library implements the bucket-graph labelling algorithm of Sadykov, Uchoa and Pessoa (2021), with bidirectional labelling, across-arc concatenation, bucket fixing and arc elimination, and a structure-of-arrays label store with SIMD-accelerated dominance. Its central design feature is a compile-time resource concept: a new SPPRC variant is added by implementing a fixed seven-function interface, and resources compose into a label state with no runtime dispatch, the state layout fixed at compile time. Five resources ship built in: time/capacity, ng-path elementarity relaxation, rank-1 cuts, cumulative cost, and pickup-and-delivery. In a reproducible, head-to-head comparison on shared public instances at an identical bound, $\texttt{bgspprc}$ outperforms PathWyse (Salani, Basso and Giuffrida, 2024), the main open-source comparator, by $1.3\times$--$2.35\times$ in shifted geometric mean (and by $1.3\times$--$2.3\times$ even when itself run single-threaded), and runs within $1.9\times$--$2.4\times$ of parallel pull labelling (Petersen and Spoorendonk, 2025), a different labelling technique for the same problem. The library, benchmark scripts, and pinned instances are publicly available.

The online monotone array completion problem

from arXiv: Data Structures and Algorithms

Authors: Vishesh Jain, Dylan King, Clayton Mizgerd

Consider the following online filling game. An array of length $n$ is initially empty. At each time step one observes an independent sample from $\mathrm{Unif}[0,1]$ and must either discard it or place it irrevocably into an empty position of the array, while preserving the constraint that the occupied entries are non-decreasing from left to right. Among all possible strategies, what is the optimal expected time required to fill the array? Let $v_n$ denote this optimal expected completion time. Our main result determines $v_n$ up to lower-order terms: \[ v_n=\left(\frac12+o(1)\right)n\log n. \] More precisely, no strategy, even if randomized and adaptive, can have expected completion time below $\left(\frac12-o(1)\right)n\log n$, while we provide an explicit deterministic strategy whose expected completion time is at most $\left(\frac12+o(1)\right)n\log n$. For comparison, the natural coupon-collector strategy, which partitions $[0,1]$ into $n$ equal intervals and reserves one array position for each interval, has expected completion time $(1+o(1))n\log n$. We also consider a with-replacement version of the game, in which previously placed entries may be overwritten. For this variant, we give a deterministic strategy with expected completion time $O(n\sqrt{\log n})$, thereby establishing a separation between the two models.

Authors: Vishesh Jain, Dylan King, Clayton Mizgerd

Consider the following online filling game. An array of length $n$ is initially empty. At each time step one observes an independent sample from $\mathrm{Unif}[0,1]$ and must either discard it or place it irrevocably into an empty position of the array, while preserving the constraint that the occupied entries are non-decreasing from left to right. Among all possible strategies, what is the optimal expected time required to fill the array? Let $v_n$ denote this optimal expected completion time. Our main result determines $v_n$ up to lower-order terms: \[ v_n=\left(\frac12+o(1)\right)n\log n. \] More precisely, no strategy, even if randomized and adaptive, can have expected completion time below $\left(\frac12-o(1)\right)n\log n$, while we provide an explicit deterministic strategy whose expected completion time is at most $\left(\frac12+o(1)\right)n\log n$. For comparison, the natural coupon-collector strategy, which partitions $[0,1]$ into $n$ equal intervals and reserves one array position for each interval, has expected completion time $(1+o(1))n\log n$. We also consider a with-replacement version of the game, in which previously placed entries may be overwritten. For this variant, we give a deterministic strategy with expected completion time $O(n\sqrt{\log n})$, thereby establishing a separation between the two models.

Improved Algorithms for Bounded-Degree (Subset) Traveling Salesman Problems

from arXiv: Data Structures and Algorithms

Authors: Jongseo Lee, Jaehyeok Kwak, Hyung-Chan An

We present improved algorithms for several bounded-degree traveling salesman problems. In the bounded-degree traveling salesman path problem (BDTSPP), given a weighted graph G=(V,E), two endpoints s and t, and degree bounds b_v for all v, the goal is to find a minimum-cost subgraph of G that admits an Eulerian s-t path and in which each vertex v has degree at most b_v. Since deciding feasibility is already NP-hard for this problem, previous work gave a bicriteria approximation algorithm. However, that algorithm provides only a multiplicative guarantee on the degree violation, and it was left open whether additive violation is possible. We answer this open question affirmatively by giving a new bicriteria approximation algorithm with additive degree violation. The cost approximation ratio is improved as well, now matching that of Hoogeveen's analysis of the Christofides-Serdyukov algorithm. This improvement relies on a new lemma that enables the use of a bounded-degree minimum spanning tree, rather than a bounded-degree Steiner tree, as a starting point for the algorithm. The lemma compares the cost and degrees of the tree against those of an integral optimum for the bounded-degree TSP at hand, rather than those of a fractional optimum. Our lemma brings improvement to the circuit version (BDTSP) as well: we give a bicriteria algorithm that matches the previous cost approximation ratio while reducing the additive degree violation to +2, which is best possible. Subset TSP is a generalization of the standard "all-vertices" TSP, in which only a specified subset of vertices is required to be visited. We present improvements for both the circuit and the path versions. For the subset path problem (BDSTSPP), we present the first bicriteria approximation algorithm with additive degree violation; for the subset circuit problem (BDSTSP), we give an improved cost approximation ratio.

Authors: Jongseo Lee, Jaehyeok Kwak, Hyung-Chan An

We present improved algorithms for several bounded-degree traveling salesman problems. In the bounded-degree traveling salesman path problem (BDTSPP), given a weighted graph G=(V,E), two endpoints s and t, and degree bounds b_v for all v, the goal is to find a minimum-cost subgraph of G that admits an Eulerian s-t path and in which each vertex v has degree at most b_v. Since deciding feasibility is already NP-hard for this problem, previous work gave a bicriteria approximation algorithm. However, that algorithm provides only a multiplicative guarantee on the degree violation, and it was left open whether additive violation is possible. We answer this open question affirmatively by giving a new bicriteria approximation algorithm with additive degree violation. The cost approximation ratio is improved as well, now matching that of Hoogeveen's analysis of the Christofides-Serdyukov algorithm. This improvement relies on a new lemma that enables the use of a bounded-degree minimum spanning tree, rather than a bounded-degree Steiner tree, as a starting point for the algorithm. The lemma compares the cost and degrees of the tree against those of an integral optimum for the bounded-degree TSP at hand, rather than those of a fractional optimum. Our lemma brings improvement to the circuit version (BDTSP) as well: we give a bicriteria algorithm that matches the previous cost approximation ratio while reducing the additive degree violation to +2, which is best possible. Subset TSP is a generalization of the standard "all-vertices" TSP, in which only a specified subset of vertices is required to be visited. We present improvements for both the circuit and the path versions. For the subset path problem (BDSTSPP), we present the first bicriteria approximation algorithm with additive degree violation; for the subset circuit problem (BDSTSP), we give an improved cost approximation ratio.

Tight bounds for clique-packing parameterized by clique-width

from arXiv: Data Structures and Algorithms

Authors: Narek Bojikian, Stefan Kratsch

In the $d$-Clique Packing problem, given a graph $G$ and an integer $t$, we need to decide whether $G$ contains a set of $t$ pairwise vertex-disjoint cliques of size $d$ each. This generalizes Triangle Packing and it is NP-complete for all $d\geq 3$. For each such $d$, we show how to solve the problem in $n^{O(k^{d-1})}$ time where $k$ is the clique-width of the graph (with a $k$-expression of $G$ given in the input). We complement this by showing that, assuming the Exponential-Time Hypothesis (ETH), there is no algorithm that solves the problem in $n^{o(k^{d-1})}$ time for any fixed $d\geq 3$, already for the special case of seeking a partition into cliques of size $d$. Our proof also entails W[1]-hardness of $d$-Clique Packing (and $d$-Clique Partition) parameterized by clique-width for each $d\geq 3$. Our work continues a series of results on ETH-tight bounds for fundamental graph problems started by Fomin et al.\ (SICOMP 2010+2014) who obtained tight bounds for Max-Cut and Edge Dominating Set.

Authors: Narek Bojikian, Stefan Kratsch

In the $d$-Clique Packing problem, given a graph $G$ and an integer $t$, we need to decide whether $G$ contains a set of $t$ pairwise vertex-disjoint cliques of size $d$ each. This generalizes Triangle Packing and it is NP-complete for all $d\geq 3$. For each such $d$, we show how to solve the problem in $n^{O(k^{d-1})}$ time where $k$ is the clique-width of the graph (with a $k$-expression of $G$ given in the input). We complement this by showing that, assuming the Exponential-Time Hypothesis (ETH), there is no algorithm that solves the problem in $n^{o(k^{d-1})}$ time for any fixed $d\geq 3$, already for the special case of seeking a partition into cliques of size $d$. Our proof also entails W[1]-hardness of $d$-Clique Packing (and $d$-Clique Partition) parameterized by clique-width for each $d\geq 3$. Our work continues a series of results on ETH-tight bounds for fundamental graph problems started by Fomin et al.\ (SICOMP 2010+2014) who obtained tight bounds for Max-Cut and Edge Dominating Set.

Token sliding independent set reconfiguration on graphs with few $P_4$'s

from arXiv: Data Structures and Algorithms

Authors: Lucia Busolini, Mario Valencia-Pabon

We consider the INDEPENDENT SET RECONFIGURATION problem under the Token Sliding rule. Let $I$ be an independent set of a simple undirected graph $G$. Suppose that each vertex of $I$ has a token placed on it. The tokens are allowed to be moved, one at a time, by sliding along the edges of $G$, so that after each move, the vertices having tokens always form an independent set of $G$. The problem we deal is to decide if we can transform $I$ into $I'$ through a sequence of steps, each of which involves substituting a vertex in the current independent set with one of its neighbours to obtain another independent set. This problem of determining if one independent set of a graph "is reachable" from another independent set of it is known to be PSPACE-hard even for split graphs, planar graphs, and graphs of bounded treewidth. Polynomial time algorithms have been obtained for certain graph classes like trees, interval graphs, claw-free graphs, bipartite permutation graphs, block graphs, and cographs. We present a polynomial time algorithm for the problem on $P_4$-tidy graphs and $(q,q-4)$-graphs, both families of graphs generalizing cographs.

Authors: Lucia Busolini, Mario Valencia-Pabon

We consider the INDEPENDENT SET RECONFIGURATION problem under the Token Sliding rule. Let $I$ be an independent set of a simple undirected graph $G$. Suppose that each vertex of $I$ has a token placed on it. The tokens are allowed to be moved, one at a time, by sliding along the edges of $G$, so that after each move, the vertices having tokens always form an independent set of $G$. The problem we deal is to decide if we can transform $I$ into $I'$ through a sequence of steps, each of which involves substituting a vertex in the current independent set with one of its neighbours to obtain another independent set. This problem of determining if one independent set of a graph "is reachable" from another independent set of it is known to be PSPACE-hard even for split graphs, planar graphs, and graphs of bounded treewidth. Polynomial time algorithms have been obtained for certain graph classes like trees, interval graphs, claw-free graphs, bipartite permutation graphs, block graphs, and cographs. We present a polynomial time algorithm for the problem on $P_4$-tidy graphs and $(q,q-4)$-graphs, both families of graphs generalizing cographs.

Decoupling Trust in Byzantine CRDTs: Fine-grained Post-Compromise Handling without Breaking Causality

from arXiv: Data Structures and Algorithms

Authors: Amos Brocco

Conflict-free Replicated Data Types (CRDTs) provide strong eventual consistency without coordination, but classical approaches assume benign participants. In Byzantine settings, convergence is typically enforced through agreement on update validity, often relying on identity-based filtering. However, such approaches struggle in post-compromise scenarios, where a previously correct participant becomes malicious: retroactive exclusion of its updates may break causal dependencies and invalidate subsequent computations. In this paper, we decouple identity-based trust from content-based trust and introduce a fine-grained trust model that combines both dimensions. Building on deterministic reconstruction, our approach allows replicas to preserve previously accepted updates while enabling selective inclusion or exclusion based on both the originating identity (e.g., public keys) and the semantics of individual updates. Trust decisions can incorporate application-level policies, enabling precise control over the impact of each update on the system state. Our approach preserves causal consistency and enables robust and flexible handling of both Byzantine and faulty behavior in decentralized CRDT systems.

Authors: Amos Brocco

Conflict-free Replicated Data Types (CRDTs) provide strong eventual consistency without coordination, but classical approaches assume benign participants. In Byzantine settings, convergence is typically enforced through agreement on update validity, often relying on identity-based filtering. However, such approaches struggle in post-compromise scenarios, where a previously correct participant becomes malicious: retroactive exclusion of its updates may break causal dependencies and invalidate subsequent computations. In this paper, we decouple identity-based trust from content-based trust and introduce a fine-grained trust model that combines both dimensions. Building on deterministic reconstruction, our approach allows replicas to preserve previously accepted updates while enabling selective inclusion or exclusion based on both the originating identity (e.g., public keys) and the semantics of individual updates. Trust decisions can incorporate application-level policies, enabling precise control over the impact of each update on the system state. Our approach preserves causal consistency and enables robust and flexible handling of both Byzantine and faulty behavior in decentralized CRDT systems.

Distributed Property Testing with (Quantum) Carrier Pigeons: Tight Bounds on State Certification

from arXiv: Data Structures and Algorithms

Authors: Kenny Chen

Recently, Doosti et al. introduced the problem of distributed quantum state verification, where $m$ distributed nodes are given a copy of an unknown state $ρ$, and can send limited one way communication to a central node, who has a complete description of a known state $σ$. They ask how many distributed nodes $m$ are required, before the central node can succeed at distinguishing whether $ρ=σ$ or $\|ρ-σ\|_1\geq\varepsilon$ with high probability. In the setting where only quantum communication is allowed, Doosti et al. exhibit conditional lower bounds in both the public and private-coin settings, and a matching upper bound in the public-coin setting. We extend these results, and show unconditional lower bounds for when both classical and quantum communication are permitted. We show the public-coin lower bound is tight by giving an algorithm with a matching upper bound. We also show an almost tight upper bound in the private-coin setting when only quantum communication is permitted.

Authors: Kenny Chen

Recently, Doosti et al. introduced the problem of distributed quantum state verification, where $m$ distributed nodes are given a copy of an unknown state $ρ$, and can send limited one way communication to a central node, who has a complete description of a known state $σ$. They ask how many distributed nodes $m$ are required, before the central node can succeed at distinguishing whether $ρ=σ$ or $\|ρ-σ\|_1\geq\varepsilon$ with high probability. In the setting where only quantum communication is allowed, Doosti et al. exhibit conditional lower bounds in both the public and private-coin settings, and a matching upper bound in the public-coin setting. We extend these results, and show unconditional lower bounds for when both classical and quantum communication are permitted. We show the public-coin lower bound is tight by giving an algorithm with a matching upper bound. We also show an almost tight upper bound in the private-coin setting when only quantum communication is permitted.

Tuesday, June 30

Beachhenge linkage

from David Eppstein

On the scale of stupid things the US government is doing this is pretty small, but they appear to be banning the census bureau from any effective methods of privacy-preserving information release, and in particular from adding noise to their data to help create differential privacy (\(\mathbb{M}\), via). Sadly, taking away valuable disclosure avoidance tools doesn’t make fundamental trade-offs go away.

By David Eppstein

CJ2K battles ALS

from Ben Recht

Chris Johnson's ALS diagnosis and the indefatigable search for treatments.

Yesterday on Good Morning America, NFL Hall of Famer Chris Johnson announced that he had ALS. Johnson, nicknamed CJ2K for his 2006 rushing yards in 2009, was one of the most electrifying players I’ve ever seen. He was impossibly fast. Every one of his carries was must-watch football, as you’d never know when he’d break a run for a touchdown. Johnson still holds the record for the most yards from scrimmage in a season, with his total of 2509 a comfortable 80 yards in front of second place.

Johnson retired in 2018, and, like most football players, disappeared from the national spotlight. So it was beyond heartbreaking to see him yesterday, at 40 years old, speaking to Michael Strahan with an eye-tracking device. ALS attacks the body’s nerve cells, progressively inhibiting a person’s ability to move their limbs, speak, and breathe. Though highly variable, the progression can be shockingly rapid. Johnson told GMA that he had been diagnosed less than a year ago. He wanted to share his story to bring more attention to ALS and the search for radical new treatments.

In 2023, Gideon Lewis-Kraus wrote a moving investigation into this search for The New Yorker. People with ALS and their families are, for quite obvious reasons, passionately dedicated to finding any reasonable treatments for the disease. The hope is to find therapies that at least slow disease progression and improve the patient’s quality of life. The challenge is this tends to require drugs, and ALS lies outside the sweet spot of modern drug discovery.

ALS is a rare but devastating condition, with around 5,000 new cases diagnosed in the United States every year. There is no clear established cause for ALS. Though it typically manifests in people’s late fifties or early sixties, Johnson’s case exemplifies how unpredictable its onset can be. Sometimes ALS progression takes years; sometimes it rapidly advances in months. There are no clear genetic signals for ALS, and family history isn’t predictive. There is a biomarker present in the nerve cells of almost all ALS patients, but current technology only lets us detect this biomarker after death.

Without access to a marker, measuring progression or reversal is a subjective challenge. Current ALS surrogates numerically assess bodily functions like speech, swallowing, breathing, and motor skills, and combine these different tests into a single number. For example, the ALSFRS-R score is the sum of 12 different assessments, each scored on a scale from 0 to 4, where 0 indicates total loss of function, and 4 indicates no loss of function.

All of this adds up to make drug testing a major challenge. It’s hard to define reasonable endpoints, and most trials aim to find a statistically significant difference between treatment and control in ordinal assessments made by clinicians. Though patients and their families urgently seek new possible therapies, there are not enough ALS patients to meet the harsh statistical standards of clinical trials. Moreover, drugs require trials, and the ethics of potentially denying the control group something beneficial are complex and fraught.1

Lewis-Kraus draws connections to the disease that set the standard for patient advocacy against institutionalized medicine: AIDS. AIDS activists in the 1980s were not only vocal in spreading awareness of the disease and its various treatments, but they actively engaged in scientific research and drug policy. It’s a remarkable example of a citizen group not only challenging, but shaping scientific expertise. To this day, AIDS activists remain go-to consultants for helping people and families affected by rare diseases to organize and advocate for the afflicted.

In hindsight, HIV treatment is similar to the treatment of other progressive diseases like ALS. None of the treatments cured an HIV infection, but they allowed patients to manage their infection. Today, HIV can be treated pharmaceutically with an excellent long-term prognosis. Though initially a terrifying death sentence, people with HIV infections can live a normal life with access to the proper medication.

The differences between AIDS and ALS highlight what makes a progressive, terminal condition treatable. AIDS was caused by an infection with the virus HIV. The first effective HIV treatment, AZT, targeted the virus by fooling it to bind to a byproduct of the drug rather than the person’s DNA manufacturing system. This eliminated HIV’s ability to replicate. Moreover, AIDS had an easily measurable surrogate marker to monitor progress. HIV killed a particular type of white blood cell, and counting these cells was a reasonable way to test if the drug was having an effect.

Sadly, while researchers are hopeful that we’re making rapid progress, we’re still missing many crucial causal pieces to the ALS story that might illuminate an appropriate target for the pharmaceutical industry. Johnson hopes that raising awareness of ALS might help us find the missing link in understanding and treating the disease. He wants to keep fighting. My heart goes out to CJ2K, his family, and everyone else suffering from this terrible condition. And I hope his inspiration will help us find the missing gap for a breakthrough treatment.

Subscribe now

1

Gideon also discusses another problem I have to flag: drug companies targeting rare conditions must set exorbitant prices to generate a profit. Let’s pin this dark consequence of capitalism for later.

By Ben Recht

Celebrating 100 Years: Avi 70 + CSDM 30

from CS Theory Events

June 14-18, 2027 Institute for Advanced Study, Princeton, NJ, USA www.ias.edu/math/events/celebrating-100-years-avi-70-csdm-30 The Institute for Advanced Study’s School of Mathematics is pleased to announce a conference celebrating Avi Wigderson’s 70th birthday and retirement, together with 30 years of the Computer Science and Discrete Mathematics program (CSDM) at IAS. The conference will honor both Avi’s extraordinary scientific … Continue reading Celebrating 100 Years: Avi 70 + CSDM 30

By shacharlovett

June 14-18, 2027 Institute for Advanced Study, Princeton, NJ, USA https://www.ias.edu/math/events/celebrating-100-years-avi-70-csdm-30 The Institute for Advanced Study’s School of Mathematics is pleased to announce a conference celebrating Avi Wigderson’s 70th birthday and retirement, together with 30 years of the Computer Science and Discrete Mathematics program (CSDM) at IAS. The conference will honor both Avi’s extraordinary scientific … Continue reading Celebrating 100 Years: Avi 70 + CSDM 30

By shacharlovett

Celebrating 100 Years: Avi 70 + CSDM 30 (June 14-18, 2027)

from Windows on Theory

[Guest post by Amir Shpilka and Irit Dinur] www.ias.edu/math/events/celebrating-100-years-avi-70-csdm-30 The Institute for Advanced Study’s School of Mathematics is pleased to announce a conference celebrating Avi Wigderson’s 70th birthday and retirement, together with 30 years of the Computer Science and Discrete Mathematics program (CSDM) at IAS. The conference will honor both Avi’s extraordinary scientific legacy and … Continue reading Celebrating 100 Years: Avi 70 + CSDM 30 (June 14-18, 2027)

[Guest post by Amir Shpilka and Irit Dinur]

https://www.ias.edu/math/events/celebrating-100-years-avi-70-csdm-30

The Institute for Advanced Study’s School of Mathematics is pleased to announce a conference celebrating Avi Wigderson’s 70th birthday and retirement, together with 30 years of the Computer Science and Discrete Mathematics program (CSDM) at IAS. The conference will honor both Avi’s extraordinary scientific legacy and his transformative role in shaping the intellectual community around theoretical computer science and discrete mathematics at IAS.

Avi Wigderson has made foundational contributions across an exceptional range of areas in theoretical computer science and mathematics. His work has had a profound impact on complexity theory broadly, and especially on cryptography, circuit and proof complexity, communication complexity, pseudorandomness, and expander graphs; in recent years, he has also opened deep and unexpected connections to invariant theory. These contributions have shaped major research directions and influenced generations of mathematicians and theoretical computer scientists.

Confirmed Speakers (all alumni of CSDM):

Boaz Barak, Harvard University

Maria Chudnovsky, Princeton University

Julia Chuzhoy, Toyota Technological Institute at Chicago

Gil Cohen, Tel Aviv University

Yotam Dikstein, Tel Aviv University

Pavel Hrubes, Charles University

Valentine Kabanets, Simon Fraser University

Tali Kaufman, Bar Ilan University

Subhash Khot, NYU

Swastik Kopparty, University of Toronto

Parvesh Kothari, Princeton University

Dor Minzer, MIT

Ankur Moitra, MIT

Shay Moran, Technion Israel Institute of Technology

Ryan O’Donnell, Carnegie Mellon University

Jinyoung Park, NYU

Noga Ren-Zewi, University of Haifa

Shubhangi Saraf, University of Toronto

Benjamin Sudakov, ETH Zurich

Avishay Tal, University of California, Berkeley

Salil Vadhan, Harvard University

Emanuele Viola, Northeastern University

R. Ryan Williams, MIT

Amir Yehudayoff, University of Copenhagen

Or Zamir, Tel Aviv University

Jeroen Zuiddam, University of Amsterdam

By Boaz Barak

TR26-110 | Security Amplification via Robust Indistinguishability Combiners | Benny Applebaum, Nir Bitansky, Nathan Geier

from ECCC Papers

A robust combiner for a cryptographic primitive $P$ takes multiple candidate constructions of $P$ and produces a secure construction of $P$ provided that sufficiently many of the candidates are secure. A closely related notion is that of a security amplifier, where given a weakly secure construction of $P$, we aim to obtain a (strongly) secure one. Intuitively, one may expect that any robust combiner should act as an amplifier by thinking of "good randomness" as inducing secure instances, and of "bad randomness" as inducing insecure instances. Formalizing this intuition, however, has turned out to be challenging. Despite significant progress, general results remain limited and confined either to specific primitives or only to the statistical setting. We establish a new framework of robust indistinguishability combiners, which greatly extends the class of combiners covered by prior work, and prove that they inherently act as security amplifiers. Our results extend to the computational setting, provided that the combiner makes a single query to each candidate. The new framework allows us to rederive previously known amplification results in a simplified manner, as well as prove new amplification results that have so far been out of reach. As our main application, we present the first security amplifier for functional encryption, resolving an open question that first arose in constructions of indistinguishability obfuscation, and for which a gap was discovered in previous proofs. Our amplifier transforms a weak scheme with any constant indistinguishability error into one with full negligible security.
A robust combiner for a cryptographic primitive $P$ takes multiple candidate constructions of $P$ and produces a secure construction of $P$ provided that sufficiently many of the candidates are secure. A closely related notion is that of a security amplifier, where given a weakly secure construction of $P$, we aim to obtain a (strongly) secure one. Intuitively, one may expect that any robust combiner should act as an amplifier by thinking of "good randomness" as inducing secure instances, and of "bad randomness" as inducing insecure instances. Formalizing this intuition, however, has turned out to be challenging. Despite significant progress, general results remain limited and confined either to specific primitives or only to the statistical setting. We establish a new framework of robust indistinguishability combiners, which greatly extends the class of combiners covered by prior work, and prove that they inherently act as security amplifiers. Our results extend to the computational setting, provided that the combiner makes a single query to each candidate. The new framework allows us to rederive previously known amplification results in a simplified manner, as well as prove new amplification results that have so far been out of reach. As our main application, we present the first security amplifier for functional encryption, resolving an open question that first arose in constructions of indistinguishability obfuscation, and for which a gap was discovered in previous proofs. Our amplifier transforms a weak scheme with any constant indistinguishability error into one with full negligible security.

Quantum Lazy Sampling and Path Recording for Any Group

from arXiv: Computational Complexity

Authors: Ben Foxman, Alex Lombardi, Fermi Ma, Barak Nehoran, John Wright

A central challenge in quantum algorithms and cryptography is reasoning about algorithms with oracle access to a random group element (e.g. a random function, permutation, or unitary). Can we efficiently simulate such algorithms? Can we determine what they know after t queries? A classical tool for this is lazy sampling: the oracle does not commit to the full group element upfront, but rather samples partial information about it on the fly. We study a quantum analog of lazy sampling: compressed oracles (or recording oracles). These are quantum data structures that allow on-the-fly simulation for quantum queries, originally introduced by Zhandry (CRYPTO '19) for random functions, and generalized to unitaries by Ma-Huang (STOC '25) and permutations by Carolan (STOC '26), and used to great effect in security proofs and lower bounds due to their interpretability. We define and analyze a general-purpose and interpretable path-recording oracle, derived from first principles, that perfectly simulates random elements of any closed subgroup of $U(N)$. Our oracle stores, in superposition, t input-output pairs, with updates described in terms of the commutant of the group's tensor power representation. This transparently records the information the algorithm has learned. Our oracle builds on recent work of Grinko-Yoshida (QIP '26), who gave a different general-purpose compressed oracle without clear interpretability. One interesting application of our path-recording is allowing direct comparisons between compressed oracles of different groups, giving a new technique for proving pseudorandomness results. For example, comparing $S_N$ and $U(N)$ yields what is arguably the simplest construction to date of pseudorandom unitaries: the product PC of a pseudorandom permutation and a random Clifford, improving on the prior PFC construction (Metger-Poremba-Sinha-Yuen, FOCS '24; Ma-Huang, STOC '25).

Authors: Ben Foxman, Alex Lombardi, Fermi Ma, Barak Nehoran, John Wright

A central challenge in quantum algorithms and cryptography is reasoning about algorithms with oracle access to a random group element (e.g. a random function, permutation, or unitary). Can we efficiently simulate such algorithms? Can we determine what they know after t queries? A classical tool for this is lazy sampling: the oracle does not commit to the full group element upfront, but rather samples partial information about it on the fly. We study a quantum analog of lazy sampling: compressed oracles (or recording oracles). These are quantum data structures that allow on-the-fly simulation for quantum queries, originally introduced by Zhandry (CRYPTO '19) for random functions, and generalized to unitaries by Ma-Huang (STOC '25) and permutations by Carolan (STOC '26), and used to great effect in security proofs and lower bounds due to their interpretability. We define and analyze a general-purpose and interpretable path-recording oracle, derived from first principles, that perfectly simulates random elements of any closed subgroup of $U(N)$. Our oracle stores, in superposition, t input-output pairs, with updates described in terms of the commutant of the group's tensor power representation. This transparently records the information the algorithm has learned. Our oracle builds on recent work of Grinko-Yoshida (QIP '26), who gave a different general-purpose compressed oracle without clear interpretability. One interesting application of our path-recording is allowing direct comparisons between compressed oracles of different groups, giving a new technique for proving pseudorandomness results. For example, comparing $S_N$ and $U(N)$ yields what is arguably the simplest construction to date of pseudorandom unitaries: the product PC of a pseudorandom permutation and a random Clifford, improving on the prior PFC construction (Metger-Poremba-Sinha-Yuen, FOCS '24; Ma-Huang, STOC '25).

Cyclic Attractor Detection in Boolean Network Dynamics under Local Logical Constraints

from arXiv: Computational Complexity

Authors: Alexander Drobyshev, Grigoriy Bokov

Boolean networks are finite discrete nonlinear systems whose long-term behaviour is organised by fixed-point and cyclic attractors. Detecting such recurrent states is important in applications ranging from gene regulation and neural computation to complex-network models, but the computational boundary between tractable and intractable attractor analysis is still not fully understood. We study that boundary from the perspective of local logical rules. We consider Boolean networks under parallel update whose coordinate functions are given by circuits over a fixed finite basis of a closed Boolean-function class, and ask whether the network has a cyclic attractor of prescribed exact period $k$. For every fixed $k\ge 2$, we obtain a complete complexity dichotomy over Post's lattice. The problem is $\mathrm{NP}$-complete whenever the local rule class contains majority-like self-dual rules or one of the two mixed conjunctive-disjunctive monotone families. In all remaining Post classes it is polynomial-time solvable, with affine rules and pure conjunctive or pure disjunctive rules with constants providing the boundary tractable cases. The results show that exact attractor detection is governed not only by the network architecture but also by the logical mechanism of local update: affine and one-sided rules preserve algebraic or order structure, whereas majority-like and mixed monotone rules can encode global Boolean consistency constraints.

Authors: Alexander Drobyshev, Grigoriy Bokov

Boolean networks are finite discrete nonlinear systems whose long-term behaviour is organised by fixed-point and cyclic attractors. Detecting such recurrent states is important in applications ranging from gene regulation and neural computation to complex-network models, but the computational boundary between tractable and intractable attractor analysis is still not fully understood. We study that boundary from the perspective of local logical rules. We consider Boolean networks under parallel update whose coordinate functions are given by circuits over a fixed finite basis of a closed Boolean-function class, and ask whether the network has a cyclic attractor of prescribed exact period $k$. For every fixed $k\ge 2$, we obtain a complete complexity dichotomy over Post's lattice. The problem is $\mathrm{NP}$-complete whenever the local rule class contains majority-like self-dual rules or one of the two mixed conjunctive-disjunctive monotone families. In all remaining Post classes it is polynomial-time solvable, with affine rules and pure conjunctive or pure disjunctive rules with constants providing the boundary tractable cases. The results show that exact attractor detection is governed not only by the network architecture but also by the logical mechanism of local update: affine and one-sided rules preserve algebraic or order structure, whereas majority-like and mixed monotone rules can encode global Boolean consistency constraints.

Reachability in Fixed-Dimensional Continuous VASS

from arXiv: Computational Complexity

Authors: Michal Ajdarów, A. R. Balasubramanian, Łukasz Orlikowski

Vector Addition System with States (VASS) are a ubiquitous model of infinite-state systems consisting of a set of non-negative counters which can be incremented and decremented. It is known that the reachability problem for VASS is Ackermann-complete. Because of this huge complexity, various over-approximations of VASS have been studied in the literature. One such over-approximation is continuous VASS (CVASS), in which the counters are (non-negative) rational numbers and whenever a vector is added to the current counter values, it is first scaled with an arbitrarily chosen rational factor between zero and one. It is known that the reachability problem for CVASS is $\mathsf{NP}$-complete. In this paper, we initiate the study of fixed-dimensional CVASS, i.e., CVASS with a fixed number of counters. We study both the reachability and coverability problems, under both unary and binary encodings as well as over both the non-negative and the rational semantics. This gives rise to a collection of eight different problems. As our main result, we prove a complexity dichotomy for all of these eight problems when the transition vectors are over the rationals: For dimension 1, all of the eight problems are in $\mathsf{AC}^1$, whereas for any dimension at least 2, all of the eight problems are $\mathsf{NP}$-complete. Furthermore, the hardness holds even when the underlying automaton is acyclic. To achieve this result, we present a new technique called the Egyptian prime fractions technique. Finally, we also study these problems when the transition vectors are over the integers. Except for dimension 2, we classify the complexity of these problems over the non-negative semantics: For dimension 1, all of the problems are in $\mathsf{AC}^1$, whereas for dimensions 3 and above, all of the problems are $\mathsf{NP}$-complete.

Authors: Michal Ajdarów, A. R. Balasubramanian, Łukasz Orlikowski

Vector Addition System with States (VASS) are a ubiquitous model of infinite-state systems consisting of a set of non-negative counters which can be incremented and decremented. It is known that the reachability problem for VASS is Ackermann-complete. Because of this huge complexity, various over-approximations of VASS have been studied in the literature. One such over-approximation is continuous VASS (CVASS), in which the counters are (non-negative) rational numbers and whenever a vector is added to the current counter values, it is first scaled with an arbitrarily chosen rational factor between zero and one. It is known that the reachability problem for CVASS is $\mathsf{NP}$-complete. In this paper, we initiate the study of fixed-dimensional CVASS, i.e., CVASS with a fixed number of counters. We study both the reachability and coverability problems, under both unary and binary encodings as well as over both the non-negative and the rational semantics. This gives rise to a collection of eight different problems. As our main result, we prove a complexity dichotomy for all of these eight problems when the transition vectors are over the rationals: For dimension 1, all of the eight problems are in $\mathsf{AC}^1$, whereas for any dimension at least 2, all of the eight problems are $\mathsf{NP}$-complete. Furthermore, the hardness holds even when the underlying automaton is acyclic. To achieve this result, we present a new technique called the Egyptian prime fractions technique. Finally, we also study these problems when the transition vectors are over the integers. Except for dimension 2, we classify the complexity of these problems over the non-negative semantics: For dimension 1, all of the problems are in $\mathsf{AC}^1$, whereas for dimensions 3 and above, all of the problems are $\mathsf{NP}$-complete.

The Simple Strategy-Iteration Method is Strongly Polynomial for the Turn-Based Deterministic Forward Game

from arXiv: Computational Complexity

Authors: Sanyou Mei, Chunlin Sun, Yinyu Ye

We study Turn-Based Deterministic Forward Games (TBDFGs), the subclass of turn-based deterministic zero-sum games in which no directed cycle contains actions controlled by both players. This forward condition is strictly weaker than acyclicity: recurrent behavior may be arbitrarily rich within one player's states, while mixed-player feedback cycles are excluded. Our main contribution separates two algorithmic consequences of this structure. First, we analyze the simple strategy-iteration method of [11,14], a generic method for TBSGs whose execution neither tests for nor uses the TBDFG property. We prove that this structure-oblivious algorithm nevertheless has a strongly polynomial guarantee on every TBDFG. In particular, it terminates after at most $O(n^6m^4\log^4 n)$ simplex pivot steps. Thus, the forward property acts as a structural certificate for convergence even when the algorithm is not informed that the input has this property. Second, when the TBDFG structure is known in advance, a backward SCC propagation algorithm is proposed that solves a sequence of deterministic-MDP subproblems and improves the bound to $O(n^3m^2\log^2 n)$ simplex pivot steps. Together, these results show that forward structure both regularizes the convergence of a general strategy-iteration method and supports a sharper structure-aware algorithm.

Authors: Sanyou Mei, Chunlin Sun, Yinyu Ye

We study Turn-Based Deterministic Forward Games (TBDFGs), the subclass of turn-based deterministic zero-sum games in which no directed cycle contains actions controlled by both players. This forward condition is strictly weaker than acyclicity: recurrent behavior may be arbitrarily rich within one player's states, while mixed-player feedback cycles are excluded. Our main contribution separates two algorithmic consequences of this structure. First, we analyze the simple strategy-iteration method of [11,14], a generic method for TBSGs whose execution neither tests for nor uses the TBDFG property. We prove that this structure-oblivious algorithm nevertheless has a strongly polynomial guarantee on every TBDFG. In particular, it terminates after at most $O(n^6m^4\log^4 n)$ simplex pivot steps. Thus, the forward property acts as a structural certificate for convergence even when the algorithm is not informed that the input has this property. Second, when the TBDFG structure is known in advance, a backward SCC propagation algorithm is proposed that solves a sequence of deterministic-MDP subproblems and improves the bound to $O(n^3m^2\log^2 n)$ simplex pivot steps. Together, these results show that forward structure both regularizes the convergence of a general strategy-iteration method and supports a sharper structure-aware algorithm.

Toward a KKL Theorem for any HDX

from arXiv: Computational Complexity

Authors: Max Hopkins

The KKL Theorem, a seminal result in boolean function analysis, characterizes the structure of low-influence (non-expanding) functions on the hypercube. While recent years have seen breakthrough results across a variety of areas relying on analogs of the KKL Theorem beyond the cube (e.g., on product spaces, Grassmann graphs), further progress has been inhibited by our poor understanding of the phenomenon across more general domains. Motivated in this context, Bafna, Hopkins, Kaufman, and Lovett (STOC 2022) and Gur, Lifshitz, and Liu (STOC 2022) proved a generalized KKL-type Theorem for spectral high dimensional expanders (HDX). Their results, however, remain highly restricted due to strong quantitative expansion requirements on the underlying complex. In this work, we introduce a simple local-to-global method for analyzing low influence functions on simplicial complexes. Using this method we prove a local-to-global KKL-type Theorem: any simplicial complex whose links satisfy a KKL-Theorem also satisfies such a result globally. Building on Gotlib and Kaufman (RANDOM 2023), we also prove a weaker dimension-dependent KKL-type Theorem for simplicial complexes with any non-trivial (two-sided) expansion. As concrete applications of our framework, we give the first characterization of non-expanding functions on `combinatorial' HDX such as dense clique complexes and a corresponding Kruskal-Katona Theorem, as well as a small-set expansion theorem for the Ramanujan Complexes of Lubotzky, Samuels, and Vishne (EJC '05).

Authors: Max Hopkins

The KKL Theorem, a seminal result in boolean function analysis, characterizes the structure of low-influence (non-expanding) functions on the hypercube. While recent years have seen breakthrough results across a variety of areas relying on analogs of the KKL Theorem beyond the cube (e.g., on product spaces, Grassmann graphs), further progress has been inhibited by our poor understanding of the phenomenon across more general domains. Motivated in this context, Bafna, Hopkins, Kaufman, and Lovett (STOC 2022) and Gur, Lifshitz, and Liu (STOC 2022) proved a generalized KKL-type Theorem for spectral high dimensional expanders (HDX). Their results, however, remain highly restricted due to strong quantitative expansion requirements on the underlying complex. In this work, we introduce a simple local-to-global method for analyzing low influence functions on simplicial complexes. Using this method we prove a local-to-global KKL-type Theorem: any simplicial complex whose links satisfy a KKL-Theorem also satisfies such a result globally. Building on Gotlib and Kaufman (RANDOM 2023), we also prove a weaker dimension-dependent KKL-type Theorem for simplicial complexes with any non-trivial (two-sided) expansion. As concrete applications of our framework, we give the first characterization of non-expanding functions on `combinatorial' HDX such as dense clique complexes and a corresponding Kruskal-Katona Theorem, as well as a small-set expansion theorem for the Ramanujan Complexes of Lubotzky, Samuels, and Vishne (EJC '05).

On the Complexity of Counting Orderings in Graphs

from arXiv: Computational Complexity

Authors: Marcelo Arenas, María Alejandra Schild, Bernardo Subercaseaux

We study the computational complexity of several counting problems on graphs. Each of these problems consists of counting orderings of the vertices or edges with adjacency constraints. We show $\#P$-completeness for all of them via a common new technique. Given a counting function $C$ of interest, we define a parameterized family of instances $G_q$, where the parameter $q$ controls the amplification of a simple gadget. After multiplying by an explicit factor $f(q)$, we show that the values of $f(q) \cdot C(G_q)$, for positive integers $q$, agree with a rational function in $q$ whose numerator and denominator can be interpolated in polynomial time. We then recover a $\#P$-hard function by evaluating this rational function symbolically at a limiting value $L \in \mathbb{Q} \cup \{\infty, -\infty\}$. With this methodology, we show $\#P$-completeness for the following counting problems: (a) successive vertex orderings of bipartite graphs, (b) st-numberings of graphs, (c) shellings of bipartite graphs, (d) linear extensions of N-free posets of height $3$, and (e) linear extensions of posets of height $2$. Result (d) settles a conjecture of Felsner and Manneville (2015). Although result (e) was first proved by Dittmer and Pak (2018), we include an alternative proof, using our technique, that does not rely on the result of Brightwell and Winkler (1991) about the hardness of counting linear extensions for general posets.

Authors: Marcelo Arenas, María Alejandra Schild, Bernardo Subercaseaux

We study the computational complexity of several counting problems on graphs. Each of these problems consists of counting orderings of the vertices or edges with adjacency constraints. We show $\#P$-completeness for all of them via a common new technique. Given a counting function $C$ of interest, we define a parameterized family of instances $G_q$, where the parameter $q$ controls the amplification of a simple gadget. After multiplying by an explicit factor $f(q)$, we show that the values of $f(q) \cdot C(G_q)$, for positive integers $q$, agree with a rational function in $q$ whose numerator and denominator can be interpolated in polynomial time. We then recover a $\#P$-hard function by evaluating this rational function symbolically at a limiting value $L \in \mathbb{Q} \cup \{\infty, -\infty\}$. With this methodology, we show $\#P$-completeness for the following counting problems: (a) successive vertex orderings of bipartite graphs, (b) st-numberings of graphs, (c) shellings of bipartite graphs, (d) linear extensions of N-free posets of height $3$, and (e) linear extensions of posets of height $2$. Result (d) settles a conjecture of Felsner and Manneville (2015). Although result (e) was first proved by Dittmer and Pak (2018), we include an alternative proof, using our technique, that does not rely on the result of Brightwell and Winkler (1991) about the hardness of counting linear extensions for general posets.

The Optimal Knight Exchange Puzzle is NP-Hard

from arXiv: Computational Complexity

Authors: Henry Siegel

This paper explores the hardness of two popular recreational chess puzzles: The Knight's Tour and the Knight Exchange (Swap). The problem of finding a Knight's Tour is known to be NP-hard for any chessboard with holes and constant-time decidable for rectangular chessboards, so a natural direction is to explore the hardness of the problem for intermediate chessboard restrictions. In this paper, we show that Knight's Tour is NP-hard for connected boards. We also give a short polynomial-time reduction between the two problems, showing that the optimality version of Knight Exchange is NP-hard.

Authors: Henry Siegel

This paper explores the hardness of two popular recreational chess puzzles: The Knight's Tour and the Knight Exchange (Swap). The problem of finding a Knight's Tour is known to be NP-hard for any chessboard with holes and constant-time decidable for rectangular chessboards, so a natural direction is to explore the hardness of the problem for intermediate chessboard restrictions. In this paper, we show that Knight's Tour is NP-hard for connected boards. We also give a short polynomial-time reduction between the two problems, showing that the optimality version of Knight Exchange is NP-hard.

One Hex reduction to rule them all: Quoridor, Maze Attack, Pinko Pallino and Blockade are PSPACE-complete

from arXiv: Computational Complexity

Authors: Francesco Carboni, Daniele Muscillo

Quoridor is a popular award-winning board game whose computational complexity, listed among the open problems of the Demaine-Hearn survey, remained open for nearly two decades. It was settled only recently, via a reduction from the formula game $G_{pos}$ tailored to Quoridor. We give a shorter and more general proof: a single reduction from Reisch's planar graph-Hex, in which wall placement encodes the path-connection structure of Hex. The same construction settles three closely related games -- Maze Attack and Pinko Pallino with no change, and Blockade with only minor adaptations -- showing that all four are PSPACE-complete, the latter three for the first time. More generally, our reduction shows that any race-and-wall game is PSPACE-complete.

Authors: Francesco Carboni, Daniele Muscillo

Quoridor is a popular award-winning board game whose computational complexity, listed among the open problems of the Demaine-Hearn survey, remained open for nearly two decades. It was settled only recently, via a reduction from the formula game $G_{pos}$ tailored to Quoridor. We give a shorter and more general proof: a single reduction from Reisch's planar graph-Hex, in which wall placement encodes the path-connection structure of Hex. The same construction settles three closely related games -- Maze Attack and Pinko Pallino with no change, and Blockade with only minor adaptations -- showing that all four are PSPACE-complete, the latter three for the first time. More generally, our reduction shows that any race-and-wall game is PSPACE-complete.

The Undecidability of Artificial General Intelligence (AGI) Alignment

from arXiv: Computational Complexity

Authors: Jose Pascual Gumbau Mezquita

This article establishes the foundational mathematical limits of Artificial General Intelligence (AGI) safety, proving that the core barrier is not the impossibility of an aligned state, but its structural unverifiability. We formalize this boundary through two central impossibility results: the Unverifiability Theorem of Alignment and the Theorem of Finite Structural Unverifiability of AGI Alignment. We ground this boundary at Trakhtenbrot's Wall, demonstrating that contemporary engineering defenses relying on finite hardware or halting architectures fail to escape logical obstructions. This failure manifests as an inescapable triad of containment failures: open domains yield fundamental undecidability (Rice and Gödel); universal finite verification collapses into algorithmic incomputability (Trakhtenbrot); and particular bounded environments trap the supervisor within intractable bounds in the worst case. As a direct structural corollary of these results, we derive the Soundness--Completeness--Tractability Trilemma, establishing that the mutual incompatibility of these three properties is a necessary consequence of descriptive complexity rather than an empirical anomaly. Finally, we map these theoretical bounds onto practical AI engineering, demonstrating that modern containment strategies are not temporary patches, but mandatory sacrifices of logical expressivity required to secure decidable fragments of safety.

Authors: Jose Pascual Gumbau Mezquita

This article establishes the foundational mathematical limits of Artificial General Intelligence (AGI) safety, proving that the core barrier is not the impossibility of an aligned state, but its structural unverifiability. We formalize this boundary through two central impossibility results: the Unverifiability Theorem of Alignment and the Theorem of Finite Structural Unverifiability of AGI Alignment. We ground this boundary at Trakhtenbrot's Wall, demonstrating that contemporary engineering defenses relying on finite hardware or halting architectures fail to escape logical obstructions. This failure manifests as an inescapable triad of containment failures: open domains yield fundamental undecidability (Rice and Gödel); universal finite verification collapses into algorithmic incomputability (Trakhtenbrot); and particular bounded environments trap the supervisor within intractable bounds in the worst case. As a direct structural corollary of these results, we derive the Soundness--Completeness--Tractability Trilemma, establishing that the mutual incompatibility of these three properties is a necessary consequence of descriptive complexity rather than an empirical anomaly. Finally, we map these theoretical bounds onto practical AI engineering, demonstrating that modern containment strategies are not temporary patches, but mandatory sacrifices of logical expressivity required to secure decidable fragments of safety.

Computing the Integral R2 Indicator by Perspective Mapping and Box Decomposition

from arXiv: Computational Geometry

Authors: Michael T. M. Emmerich

The continuous integral R2 indicator is a Pareto-compliant refinement of the classical finite-weight-vector R2 indicator, used in performance assessment, bounded archiving for a-posteriori multi-objective optimization, and skyline selection in databases. This work introduces a bidirectional perspective mapping between continuous integral R2 computation and integration over unions of anchored axis-aligned boxes. After translating the ideal point of a minimization problem to the origin, approximation points become strictly positive loss vectors, and the subgraph of the lower weighted Tchebycheff envelope over the weight simplex maps to the complement of an anchored-box union in reciprocal objective space. The Jacobian gives an absolute R2 formula as a weighted complement volume with density $(x_1+\cdots+x_N)^{-(N+1)}$, while differences of R2 values become finite weighted hypervolume differences. Hence, hypervolume algorithms that emit box decompositions can be reused by replacing ordinary box volumes with closed-form weighted box integrals. For $N$ objectives, this gives an output-sensitive overhead $O(2^N M)$ for an $M$-box decomposition, or $O(M)$ for fixed $N$. Using existing box-decomposition approaches, the integral R2 can be computed in $O(n \log n)$ for $N=2,3$, in $O(n^2)$ for $N=4$, and in $O\left(n^{\lfloor (N-1)/2\rfloor+1}\right)$ for $N\geq4$, with $n$ denoting the size of the approximation set. On the lower-bound side, exact value computation has an $Ω(n\log n)$ lower bound in the algebraic decision-tree model already in two objectives, this bound lifts to every fixed $N\geq2$, and exact computation is $\#P$-hard when $N$ is part of the input. Together, the proposed perspective mapping provides a powerful tool for transferring algorithmic and structural results between anchored-box union and hypervolume theory and integral R2 computation.

Authors: Michael T. M. Emmerich

The continuous integral R2 indicator is a Pareto-compliant refinement of the classical finite-weight-vector R2 indicator, used in performance assessment, bounded archiving for a-posteriori multi-objective optimization, and skyline selection in databases. This work introduces a bidirectional perspective mapping between continuous integral R2 computation and integration over unions of anchored axis-aligned boxes. After translating the ideal point of a minimization problem to the origin, approximation points become strictly positive loss vectors, and the subgraph of the lower weighted Tchebycheff envelope over the weight simplex maps to the complement of an anchored-box union in reciprocal objective space. The Jacobian gives an absolute R2 formula as a weighted complement volume with density $(x_1+\cdots+x_N)^{-(N+1)}$, while differences of R2 values become finite weighted hypervolume differences. Hence, hypervolume algorithms that emit box decompositions can be reused by replacing ordinary box volumes with closed-form weighted box integrals. For $N$ objectives, this gives an output-sensitive overhead $O(2^N M)$ for an $M$-box decomposition, or $O(M)$ for fixed $N$. Using existing box-decomposition approaches, the integral R2 can be computed in $O(n \log n)$ for $N=2,3$, in $O(n^2)$ for $N=4$, and in $O\left(n^{\lfloor (N-1)/2\rfloor+1}\right)$ for $N\geq4$, with $n$ denoting the size of the approximation set. On the lower-bound side, exact value computation has an $Ω(n\log n)$ lower bound in the algebraic decision-tree model already in two objectives, this bound lifts to every fixed $N\geq2$, and exact computation is $\#P$-hard when $N$ is part of the input. Together, the proposed perspective mapping provides a powerful tool for transferring algorithmic and structural results between anchored-box union and hypervolume theory and integral R2 computation.

Informational Frustration in Neural Manifolds: Shannon Bottlenecks and the Limits of Learnability

from arXiv: Computational Geometry

Authors: Srinivasa Rao P., Vangmayi P Reddy

Why overparameterised deep networks generalise so remarkably well remains one of the most stubborn open questions in machine learning theory. Classical frameworks like VC dimension and Rademacher complexity predict catastrophic overfitting in modern models, leaving a massive theoretical gap between theory and reality. In this paper, we bridge this divide by introducing a unified framework that links information theory, topology, and statistical mechanics to map the hard limits of deep learning. Central to our approach is the Entropic Learnability Horizon (ELH): a fundamental law stating that a network can only truly learn a target function if the Shannon entropy of the data manifold outpaces the topological entropy of the function's decision boundary, balanced by the von Neumann entropy of the network's weight space. We establish the Shannon-Topological Bottleneck Theorem, proving that when a target boundary's geometric complexity exceeds this informational horizon, the system undergoes a sudden entropic phase transition. It falls into a state of Informational Frustration - a glassy, rigid memorization phase where generalization becomes thermodynamically impossible. Using this lens, we show that the enigmatic phenomenon of "grokking" is actually an Entropic Release, where weights abruptly reorganise to unlock the bottleneck. Finally, we translate this theory into practice with Entropic Gradient Descent (EGD), an optimization algorithm that dynamically manages weight entropy to keep learning on track. Ultimately, this work repositions entropy not just as a tool for tracking uncertainty but as the fundamental physical currency that dictates whether a machine can learn.

Authors: Srinivasa Rao P., Vangmayi P Reddy

Why overparameterised deep networks generalise so remarkably well remains one of the most stubborn open questions in machine learning theory. Classical frameworks like VC dimension and Rademacher complexity predict catastrophic overfitting in modern models, leaving a massive theoretical gap between theory and reality. In this paper, we bridge this divide by introducing a unified framework that links information theory, topology, and statistical mechanics to map the hard limits of deep learning. Central to our approach is the Entropic Learnability Horizon (ELH): a fundamental law stating that a network can only truly learn a target function if the Shannon entropy of the data manifold outpaces the topological entropy of the function's decision boundary, balanced by the von Neumann entropy of the network's weight space. We establish the Shannon-Topological Bottleneck Theorem, proving that when a target boundary's geometric complexity exceeds this informational horizon, the system undergoes a sudden entropic phase transition. It falls into a state of Informational Frustration - a glassy, rigid memorization phase where generalization becomes thermodynamically impossible. Using this lens, we show that the enigmatic phenomenon of "grokking" is actually an Entropic Release, where weights abruptly reorganise to unlock the bottleneck. Finally, we translate this theory into practice with Entropic Gradient Descent (EGD), an optimization algorithm that dynamically manages weight entropy to keep learning on track. Ultimately, this work repositions entropy not just as a tool for tracking uncertainty but as the fundamental physical currency that dictates whether a machine can learn.

Trajectory Optimization for Collision-Aware Redundant Robotic Multi-Axis Additive Manufacturing by Constrained Gradient Projection

from arXiv: Computational Geometry

Authors: Zhikai Shen, Jiasheng Qu, Chenyu Xu, Zhuo Huang, Chengkai Dai, Yongzhe Li, Guoxin Fang

Redundant robotic multi-axis additive manufacturing (MAAM) enables support-free and conformal fabrication, but trajectory optimization for long-horizon paths remains challenging under strict deposition-position constraints and time-varying collision constraints. This work proposes a computational framework for collision-aware trajectory optimization in redundant robotic MAAM. We first formulate nozzle-workpiece relative kinematics using a relative Jacobian, and develop a differentiable SDF-based collision model that captures fabrication-induced geometry evolution and provides optimization gradients. The deposition position is then enforced as a hard waypoint-wise equality constraint through iterative projection onto the self-motion manifold, with the loss gradient restricted to the corresponding tangent space. Experiments on an 8-DOF robotic MAAM platform with diverse long-horizon support-free and conformal toolpaths show that our method maintains a mean nozzle-position error below 10μm, reduces maximum joint jerk by up to $77.6\%$, and eliminates all sampled collision and orientation violations. Compared with the SQP-based baseline, it achieves up to a 10.2x speedup and improved convergence. Physical fabrication experiments further verify that the resulting smooth, collision-free trajectories enable successful printing of complex geometries with fewer visible deposition artifacts.

Authors: Zhikai Shen, Jiasheng Qu, Chenyu Xu, Zhuo Huang, Chengkai Dai, Yongzhe Li, Guoxin Fang

Redundant robotic multi-axis additive manufacturing (MAAM) enables support-free and conformal fabrication, but trajectory optimization for long-horizon paths remains challenging under strict deposition-position constraints and time-varying collision constraints. This work proposes a computational framework for collision-aware trajectory optimization in redundant robotic MAAM. We first formulate nozzle-workpiece relative kinematics using a relative Jacobian, and develop a differentiable SDF-based collision model that captures fabrication-induced geometry evolution and provides optimization gradients. The deposition position is then enforced as a hard waypoint-wise equality constraint through iterative projection onto the self-motion manifold, with the loss gradient restricted to the corresponding tangent space. Experiments on an 8-DOF robotic MAAM platform with diverse long-horizon support-free and conformal toolpaths show that our method maintains a mean nozzle-position error below 10μm, reduces maximum joint jerk by up to $77.6\%$, and eliminates all sampled collision and orientation violations. Compared with the SQP-based baseline, it achieves up to a 10.2x speedup and improved convergence. Physical fabrication experiments further verify that the resulting smooth, collision-free trajectories enable successful printing of complex geometries with fewer visible deposition artifacts.

EASE: Parametric garment design with explicit and local ease control

from arXiv: Computational Geometry

Authors: Kristijan Bartol, Frieda Hentschel, Nataliya Sadretdinova, Benjamin Russig, Melinos Averkiou, Yordan Kyosev, Stefan Gumhold

Garment fit and comfort depend critically on ease, the local allowance of excess material relative to the body. In existing design pipelines, ease is typically a byproduct of geometry or simulation rather than an independent design variable, making it difficult to specify, edit, transfer, or redistribute without re-running simulation or optimization. We propose a garment representation that embeds meshes directly on the surface of a parametric human body model and represents ease explicitly as spatially varying, anisotropic per-triangle scales. These scales act as primary design variables, decoupling the specification of material allowance from its physical deformation. Given a design specified by parametric and user-defined surface cuts together with local scale fields, we optimize sewing patterns that enforce the prescribed ease distribution while satisfying geometric and seam constraints. The representation enables three capabilities that are unavailable without explicit ease control: (1) direct specification and editing of local material allowance on the body surface; (2) intent-preserving transfer to new body shapes that reproduces the specified ease distribution without re-running simulation; and (3) intent-modifying pose adaptation that redistributes ease to relieve strain in high-stretch regions. We verify each of these experimentally: ease is closely retained after optimization, excessive strain is significantly mitigated for target poses, and the ease distribution is accurately transferred to target shapes. The approach is implemented as a virtual try-on framework, with physics-based cloth simulation used for final garment visualization. We will publicly release our framework and detailed documentation.

Authors: Kristijan Bartol, Frieda Hentschel, Nataliya Sadretdinova, Benjamin Russig, Melinos Averkiou, Yordan Kyosev, Stefan Gumhold

Garment fit and comfort depend critically on ease, the local allowance of excess material relative to the body. In existing design pipelines, ease is typically a byproduct of geometry or simulation rather than an independent design variable, making it difficult to specify, edit, transfer, or redistribute without re-running simulation or optimization. We propose a garment representation that embeds meshes directly on the surface of a parametric human body model and represents ease explicitly as spatially varying, anisotropic per-triangle scales. These scales act as primary design variables, decoupling the specification of material allowance from its physical deformation. Given a design specified by parametric and user-defined surface cuts together with local scale fields, we optimize sewing patterns that enforce the prescribed ease distribution while satisfying geometric and seam constraints. The representation enables three capabilities that are unavailable without explicit ease control: (1) direct specification and editing of local material allowance on the body surface; (2) intent-preserving transfer to new body shapes that reproduces the specified ease distribution without re-running simulation; and (3) intent-modifying pose adaptation that redistributes ease to relieve strain in high-stretch regions. We verify each of these experimentally: ease is closely retained after optimization, excessive strain is significantly mitigated for target poses, and the ease distribution is accurately transferred to target shapes. The approach is implemented as a virtual try-on framework, with physics-based cloth simulation used for final garment visualization. We will publicly release our framework and detailed documentation.

Algorithmic exploration of the unit distance problem in the rational plane

from arXiv: Computational Geometry

Authors: Panteleimon Rodis

This paper presents reproducible experimental evidence on unit-distance graph density that surpasses recent theoretical lower bounds. Our approach is based on a novel algorithmic exploration of the rational plane for the generation of unit-distance graphs. An efficient algorithm for this utility must perform a local-breadth search on a bounded and finite set of elements and generate a graph that potentially encompasses the general properties of a unit-distance graph, not affected by restrictions on its generation. To this end, we show that our approach accomplishes this purpose by overcoming the limitations of grid-based structures used in the literature for generating unit-distance graphs. Furthermore, the scaling exponent of the generated graph surpasses recent results.

Authors: Panteleimon Rodis

This paper presents reproducible experimental evidence on unit-distance graph density that surpasses recent theoretical lower bounds. Our approach is based on a novel algorithmic exploration of the rational plane for the generation of unit-distance graphs. An efficient algorithm for this utility must perform a local-breadth search on a bounded and finite set of elements and generate a graph that potentially encompasses the general properties of a unit-distance graph, not affected by restrictions on its generation. To this end, we show that our approach accomplishes this purpose by overcoming the limitations of grid-based structures used in the literature for generating unit-distance graphs. Furthermore, the scaling exponent of the generated graph surpasses recent results.

Invariant Reasoning Directions in Latent Trajectories of Language Models

from arXiv: Computational Geometry

Authors: Arun Vignesh Malarkkan, Manan Roy Choudhury, Utkarsh Byahut, Yash Ravindra Charde, Vivek Gupta, Yanjie Fu

Latent reasoning models perform multi-step inference directly in hidden-state space, yet the structure of these latent reasoning trajectories remains poorly understood. We show that contrastive refinement signals between stronger and weaker reasoning trajectories exhibit a highly concentrated low-rank structure, while unconstrained latent updates remain sensitive to paraphrases, checkpoint choice, and trajectory perturbations. These observations suggest that latent reasoning trajectories contain stable invariant directions mixed with unstable instance-specific variation. We introduce \textbf{Trajectory-Invariant Latent Refinement (TILR)}, a training-free intervention framework for identifying and manipulating stable reasoning directions in latent space. TILR first learns a low-rank invariant subspace from contrastive trajectory differences across inputs, then constrains latent interventions to this subspace while suppressing poorly aligned updates through an adaptive alignment gate. Across six reasoning benchmarks, we find that a small number of latent directions explain most variation between strong and weak reasoning trajectories. Interventions on these directions causally improve reasoning consistency and reduce trajectory instability under paraphrases and perturbations. TILR improves answer consistency under paraphrase by ~10% and reduces latent trajectory variance by up to $50\%$ while preserving reasoning accuracy. These results support a geometric view of latent reasoning in which transferable reasoning behavior emerges from stable low-dimensional structure within hidden-state trajectories.

Authors: Arun Vignesh Malarkkan, Manan Roy Choudhury, Utkarsh Byahut, Yash Ravindra Charde, Vivek Gupta, Yanjie Fu

Latent reasoning models perform multi-step inference directly in hidden-state space, yet the structure of these latent reasoning trajectories remains poorly understood. We show that contrastive refinement signals between stronger and weaker reasoning trajectories exhibit a highly concentrated low-rank structure, while unconstrained latent updates remain sensitive to paraphrases, checkpoint choice, and trajectory perturbations. These observations suggest that latent reasoning trajectories contain stable invariant directions mixed with unstable instance-specific variation. We introduce \textbf{Trajectory-Invariant Latent Refinement (TILR)}, a training-free intervention framework for identifying and manipulating stable reasoning directions in latent space. TILR first learns a low-rank invariant subspace from contrastive trajectory differences across inputs, then constrains latent interventions to this subspace while suppressing poorly aligned updates through an adaptive alignment gate. Across six reasoning benchmarks, we find that a small number of latent directions explain most variation between strong and weak reasoning trajectories. Interventions on these directions causally improve reasoning consistency and reduce trajectory instability under paraphrases and perturbations. TILR improves answer consistency under paraphrase by ~10% and reduces latent trajectory variance by up to $50\%$ while preserving reasoning accuracy. These results support a geometric view of latent reasoning in which transferable reasoning behavior emerges from stable low-dimensional structure within hidden-state trajectories.

A reduced planar body with area greater than $πΔ^2/4$

from arXiv: Computational Geometry

Authors: Scott Duke Kominers

We construct a reduced planar convex body $R$ with thickness $Δ(R)=1$ and \[\operatorname{area}(R)=0.786215\ldots>0.785398\ldots=\fracπ{4}.\] Thus $R$ is a counterexample to Lassak's conjectured upper bound $\operatorname{area}\le(π/4)Δ^2$ for planar reduced bodies. The construction is given by an explicit support function, and the proofs use only elementary support-function, width, area, and contact-point computations.

Authors: Scott Duke Kominers

We construct a reduced planar convex body $R$ with thickness $Δ(R)=1$ and \[\operatorname{area}(R)=0.786215\ldots>0.785398\ldots=\fracπ{4}.\] Thus $R$ is a counterexample to Lassak's conjectured upper bound $\operatorname{area}\le(π/4)Δ^2$ for planar reduced bodies. The construction is given by an explicit support function, and the proofs use only elementary support-function, width, area, and contact-point computations.

Working with measurement-based computations on qudits

from arXiv: Data Structures and Algorithms

Authors: Piotr Mitosek, Miriam Backens

Measurement-based quantum computing is a universal model of quantum computation in which successive product measurements of an entangled resource state drive the computation. The non-deterministic nature of measurements necessitates adaptivity to ensure an overall deterministic computation. Flow structures characterise cases in which such an adaptive correction procedure is possible. Recently, flow has been defined in a setting where the resource states are prime-dimensional qudit graph states rather than the usual qubit graph states. Yet, this qudit flow definition is more burdensome to work with than analogous definitions for qubits. Here, we give a simpler definition of qudit flow and consider various useful properties of this flow, drawing on results for the qubit case. In particular, we show how to focus qudit flow and argue that focused flow is canonical. We improve the previous algebraic formulation to capture focused flow and use it to obtain an $O(n^3)$ flow-finding algorithm (where $n$ is the number of qudits), matching the best known complexity for qubit flows and improving on the previous $O(n^4)$ result for qudits. Furthermore, we explore multiple flow-preserving transformations, thus opening a pathway to using flow for optimisation. These transformations include pivoting, removal and insertion of certain types of vertices, and reversibility of flow. Lastly, we propose an algorithmic approach to generating large qudit computations with flow, for testing or machine learning.

Authors: Piotr Mitosek, Miriam Backens

Measurement-based quantum computing is a universal model of quantum computation in which successive product measurements of an entangled resource state drive the computation. The non-deterministic nature of measurements necessitates adaptivity to ensure an overall deterministic computation. Flow structures characterise cases in which such an adaptive correction procedure is possible. Recently, flow has been defined in a setting where the resource states are prime-dimensional qudit graph states rather than the usual qubit graph states. Yet, this qudit flow definition is more burdensome to work with than analogous definitions for qubits. Here, we give a simpler definition of qudit flow and consider various useful properties of this flow, drawing on results for the qubit case. In particular, we show how to focus qudit flow and argue that focused flow is canonical. We improve the previous algebraic formulation to capture focused flow and use it to obtain an $O(n^3)$ flow-finding algorithm (where $n$ is the number of qudits), matching the best known complexity for qubit flows and improving on the previous $O(n^4)$ result for qudits. Furthermore, we explore multiple flow-preserving transformations, thus opening a pathway to using flow for optimisation. These transformations include pivoting, removal and insertion of certain types of vertices, and reversibility of flow. Lastly, we propose an algorithmic approach to generating large qudit computations with flow, for testing or machine learning.

Testing k-submodularity

from arXiv: Data Structures and Algorithms

Authors: Themistoklis Haris, Diptaksho Palit

We initiate the study of property testing for $k$-submodular functions, a higher-dimensional analogue of submodular functions defined on partial partitions of a ground set. While $k$-submodularity retains the diminishing-returns flavor of ordinary submodularity, it also introduces a pairwise monotonicity constraint comparing competing assignments of the same element. This additional local structure makes the testing problem qualitatively different from the classical case. Our results show a sharp contrast between distance regimes. In the $\ell_p$ regime for $p \geq 1$, we prove that every bounded $k$-submodular function is close to a junta on the hypergrid. Combined with an implicit-learning tester for hypergrid domains, this yields a constant-query tester for $k$-submodularity. In the Hamming distance regime, $k$-submodularity admits two qualitatively different local witnesses -- violated squares for diminishing marginal gains, and violated triangles for pairwise-monotonicity failures -- and the latter has no counterpart at $k=1$. We prove density theorems for both witness types via repair on filters and ideals of partial partitions, yielding non-adaptive, one-sided sub-exponential-query testers for the two component properties of $k$-submodularity. We then exhibit a configuration in which the two repair directions are forced into opposition on a shared vertex, identifying a structural barrier to combining these into a tester for the full property. Finally, for bounded-range functions, we give an adaptive tester for monotone $k$-submodularity via a pseudo-DNF representation and learning on the hypergrid. Several of the structural and learning tools developed here may be useful for testing other properties over product domains.

Authors: Themistoklis Haris, Diptaksho Palit

We initiate the study of property testing for $k$-submodular functions, a higher-dimensional analogue of submodular functions defined on partial partitions of a ground set. While $k$-submodularity retains the diminishing-returns flavor of ordinary submodularity, it also introduces a pairwise monotonicity constraint comparing competing assignments of the same element. This additional local structure makes the testing problem qualitatively different from the classical case. Our results show a sharp contrast between distance regimes. In the $\ell_p$ regime for $p \geq 1$, we prove that every bounded $k$-submodular function is close to a junta on the hypergrid. Combined with an implicit-learning tester for hypergrid domains, this yields a constant-query tester for $k$-submodularity. In the Hamming distance regime, $k$-submodularity admits two qualitatively different local witnesses -- violated squares for diminishing marginal gains, and violated triangles for pairwise-monotonicity failures -- and the latter has no counterpart at $k=1$. We prove density theorems for both witness types via repair on filters and ideals of partial partitions, yielding non-adaptive, one-sided sub-exponential-query testers for the two component properties of $k$-submodularity. We then exhibit a configuration in which the two repair directions are forced into opposition on a shared vertex, identifying a structural barrier to combining these into a tester for the full property. Finally, for bounded-range functions, we give an adaptive tester for monotone $k$-submodularity via a pseudo-DNF representation and learning on the hypergrid. Several of the structural and learning tools developed here may be useful for testing other properties over product domains.

Learning the structure of open quantum systems

from arXiv: Data Structures and Algorithms

Authors: Laura Lewis, Ewin Tang, John Wright

We design an algorithm for learning the coefficients of an $n$-qubit constant-local Lindbladian to $\varepsilon$ error with $O(g d^2 \log(n) / \varepsilon^2)$ total evolution time, where $g$ is the single-site energy and $d$ is the (approximate) degree of the interaction graph. Though Lindbladians present new challenges not present in the special case of Hamiltonians, our algorithm achieves the suite of desiderata attained by state-of-the-art Hamiltonian learning algorithms: (1) it uses non-adaptive, ancilla-free randomized Pauli measurement circuits with a time resolution of only $Θ(1/g)$; (2) it works without knowledge of the structure of the unknown Lindbladian; (3) it depends on a smooth form of degree, thereby supporting the learning of quasi-local and power-law Lindbladians. Our algorithm is a simple iterative method, where the objective function consists of Fourier coefficients of the Lindbladian restricted to few-site regions. Its analysis identifies the difficulty unique to open systems, which we call "confusing" terms. For settings where the "confusion" is limited, the performance of the algorithm improves. We demonstrate this for the case of structure learning of Hamiltonians from access to real-time evolution, where we obtain a new algorithm that is significantly simpler than previous work. In addition, using the same iterative method, we design the first efficient algorithm for structure learning Hamiltonians from high-temperature Gibbs states.

Authors: Laura Lewis, Ewin Tang, John Wright

We design an algorithm for learning the coefficients of an $n$-qubit constant-local Lindbladian to $\varepsilon$ error with $O(g d^2 \log(n) / \varepsilon^2)$ total evolution time, where $g$ is the single-site energy and $d$ is the (approximate) degree of the interaction graph. Though Lindbladians present new challenges not present in the special case of Hamiltonians, our algorithm achieves the suite of desiderata attained by state-of-the-art Hamiltonian learning algorithms: (1) it uses non-adaptive, ancilla-free randomized Pauli measurement circuits with a time resolution of only $Θ(1/g)$; (2) it works without knowledge of the structure of the unknown Lindbladian; (3) it depends on a smooth form of degree, thereby supporting the learning of quasi-local and power-law Lindbladians. Our algorithm is a simple iterative method, where the objective function consists of Fourier coefficients of the Lindbladian restricted to few-site regions. Its analysis identifies the difficulty unique to open systems, which we call "confusing" terms. For settings where the "confusion" is limited, the performance of the algorithm improves. We demonstrate this for the case of structure learning of Hamiltonians from access to real-time evolution, where we obtain a new algorithm that is significantly simpler than previous work. In addition, using the same iterative method, we design the first efficient algorithm for structure learning Hamiltonians from high-temperature Gibbs states.

Optimal Stable Coresets for Geometric Median via Uniform Sampling

from arXiv: Data Structures and Algorithms

Authors: Amir Carmel, Robert Krauthgamer, Nir Petruschka

The geometric median problem asks to find a point in $\mathbb{R}^d$ that minimizes the sum of Euclidean distances to an input set. It is a classical problem in computational geometry and appears as a subroutine in numerous optimization tasks, many of which require the solution to satisfy additional structural constraints. A common approach to reduce the input size is to construct a coreset, which is a small weighted subset that faithfully represents the input for a specific optimization problem. Strong coresets preserve the cost of every candidate solution but require linear time to construct; weak coresets admit sublinear construction, in fact by uniform sampling, but only preserve near-optimal solutions, which is insufficient when the solution is constrained. To address this, we focus instead on the recently introduced intermediate notion of a \emph{stable coreset}, which simultaneously handles all constrained variants. Currently, there is a large gap between the known sample sizes for stable and weak coresets. Our main result is that a uniform sample of size $O(ε^{-2} \log \tfrac{1}ε)$ is a stable $(ε, O(ε))$-coreset for the geometric median, with high constant probability, and this bound is tight up to the logarithmic factor. Our analysis adapts recent machinery of Carmel and Krauthgamer (ICLR 2026) for constructing stable coresets, which incurs an $O(\log d)$ factor. We show an iterative argument that progressively reduces the sample size, and eliminates this dependence on the dimension $d$. At a high level, this approach resembles the technique of iterative size reduction, which is applicable for strong coresets but not for weak coresets.

Authors: Amir Carmel, Robert Krauthgamer, Nir Petruschka

The geometric median problem asks to find a point in $\mathbb{R}^d$ that minimizes the sum of Euclidean distances to an input set. It is a classical problem in computational geometry and appears as a subroutine in numerous optimization tasks, many of which require the solution to satisfy additional structural constraints. A common approach to reduce the input size is to construct a coreset, which is a small weighted subset that faithfully represents the input for a specific optimization problem. Strong coresets preserve the cost of every candidate solution but require linear time to construct; weak coresets admit sublinear construction, in fact by uniform sampling, but only preserve near-optimal solutions, which is insufficient when the solution is constrained. To address this, we focus instead on the recently introduced intermediate notion of a \emph{stable coreset}, which simultaneously handles all constrained variants. Currently, there is a large gap between the known sample sizes for stable and weak coresets. Our main result is that a uniform sample of size $O(ε^{-2} \log \tfrac{1}ε)$ is a stable $(ε, O(ε))$-coreset for the geometric median, with high constant probability, and this bound is tight up to the logarithmic factor. Our analysis adapts recent machinery of Carmel and Krauthgamer (ICLR 2026) for constructing stable coresets, which incurs an $O(\log d)$ factor. We show an iterative argument that progressively reduces the sample size, and eliminates this dependence on the dimension $d$. At a high level, this approach resembles the technique of iterative size reduction, which is applicable for strong coresets but not for weak coresets.

I.i.d. Prophet Inequalities with Discounted Rewards: As Hard as the Non-i.i.d. Case

from arXiv: Data Structures and Algorithms

Authors: Jung-hun Kim, Vianney Perchet

We study prophet inequalities with discounted rewards, where i.i.d. base rewards are multiplicatively discounted over time. Our main message is that even this structured and arbitrarily weak form of nonstationarity can erase the classical advantage of the stationary i.i.d. setting. Focusing on single-quantile threshold policies, we show that the competitive ratio transitions from the classical $1-1/e$ guarantee to a fundamental $1/2$ barrier as discounting accumulates over many phases in a canonical regime with a common-decay factor and equal-length phases. We further show that, in the same regime, the $1/2$ barrier persists even for arbitrary stopping rules. Consequently, i.i.d. base rewards under discounting can be as hard as the fully non-i.i.d. case. On the algorithmic side, we design single-quantile threshold rules that attain the tight bounds by calibrating acceptance decisions to an effective horizon induced by discounting, and we extend this calibration to heterogeneous decay factors and unequal phase lengths. We further show that a similar discontinuous breakdown persists in an infinite-horizon continuous-decay benchmark, where arbitrarily weak decay collapses the stationary benchmark from $1$ to $1/2$.

Authors: Jung-hun Kim, Vianney Perchet

We study prophet inequalities with discounted rewards, where i.i.d. base rewards are multiplicatively discounted over time. Our main message is that even this structured and arbitrarily weak form of nonstationarity can erase the classical advantage of the stationary i.i.d. setting. Focusing on single-quantile threshold policies, we show that the competitive ratio transitions from the classical $1-1/e$ guarantee to a fundamental $1/2$ barrier as discounting accumulates over many phases in a canonical regime with a common-decay factor and equal-length phases. We further show that, in the same regime, the $1/2$ barrier persists even for arbitrary stopping rules. Consequently, i.i.d. base rewards under discounting can be as hard as the fully non-i.i.d. case. On the algorithmic side, we design single-quantile threshold rules that attain the tight bounds by calibrating acceptance decisions to an effective horizon induced by discounting, and we extend this calibration to heterogeneous decay factors and unequal phase lengths. We further show that a similar discontinuous breakdown persists in an infinite-horizon continuous-decay benchmark, where arbitrarily weak decay collapses the stationary benchmark from $1$ to $1/2$.

A Sieve-Accelerated Quadrature Method for Exact Privacy Accounting in the 2020 U.S. Decennial Census

from arXiv: Data Structures and Algorithms

Authors: Buxin Su, Weijie Su, Chendi Wang

In 2020, the U.S. Census Bureau adopted differential privacy for the Decennial Census by injecting integer-valued Gaussian noise into published census tabulations. Exactly evaluating the privacy guarantees of these data releases would enable the Bureau to determine the absolute minimum noise required to satisfy a given privacy budget, preventing the injection of unnecessary excess noise and thereby substantially enhancing the statistical utility of the data for downstream applications such as federal funding allocation and political redistricting. In this paper, we introduce a computationally efficient and mathematically rigorous quadrature method to evaluate the exact privacy profile of practical, large-scale census releases under the composition of heterogeneous discrete Gaussian mechanisms. Mathematically, this problem reduces to evaluating the tail probabilities of high-dimensional convolutions of integer-valued random variables sampled from heterogeneous discrete Gaussian distributions under exceptionally stringent numerical error tolerances (e.g., $10^{-35}$). By recasting the exact privacy accounting as a numerical integration problem via the discrete Fourier transform, we explicitly exploit the exponential convergence of the trapezoidal rule for complex analytic, periodic characteristic functions. Furthermore, to overcome the computational bottleneck of evaluating highly oscillatory integrands in high dimensions, we develop a sieve algorithm that identifies and prunes negligible quadrature nodes, accelerating the computation by three orders of magnitude. Taken together, these numerical innovations enable the first exact, assumption-free privacy accounting for the 2020 Census Demographic and Housing Characteristics File, achieving a 1,824-fold speedup over prior methods while maintaining census-mandated error tolerances.

Authors: Buxin Su, Weijie Su, Chendi Wang

In 2020, the U.S. Census Bureau adopted differential privacy for the Decennial Census by injecting integer-valued Gaussian noise into published census tabulations. Exactly evaluating the privacy guarantees of these data releases would enable the Bureau to determine the absolute minimum noise required to satisfy a given privacy budget, preventing the injection of unnecessary excess noise and thereby substantially enhancing the statistical utility of the data for downstream applications such as federal funding allocation and political redistricting. In this paper, we introduce a computationally efficient and mathematically rigorous quadrature method to evaluate the exact privacy profile of practical, large-scale census releases under the composition of heterogeneous discrete Gaussian mechanisms. Mathematically, this problem reduces to evaluating the tail probabilities of high-dimensional convolutions of integer-valued random variables sampled from heterogeneous discrete Gaussian distributions under exceptionally stringent numerical error tolerances (e.g., $10^{-35}$). By recasting the exact privacy accounting as a numerical integration problem via the discrete Fourier transform, we explicitly exploit the exponential convergence of the trapezoidal rule for complex analytic, periodic characteristic functions. Furthermore, to overcome the computational bottleneck of evaluating highly oscillatory integrands in high dimensions, we develop a sieve algorithm that identifies and prunes negligible quadrature nodes, accelerating the computation by three orders of magnitude. Taken together, these numerical innovations enable the first exact, assumption-free privacy accounting for the 2020 Census Demographic and Housing Characteristics File, achieving a 1,824-fold speedup over prior methods while maintaining census-mandated error tolerances.

An FPT algorithm for cycle rank on semi-complete digraphs

from arXiv: Data Structures and Algorithms

Authors: Seokbeom Kim, O-joung Kwon, Myounghwan Lee

Cycle rank is a depth parameter for digraphs introduced by Eggan in 1963. Gruber (DMTCS 2012) and Giannopoulou, Hunter, and Thilikos (DAM 2012) asked whether the problem of determining if a given digraph has cycle rank at most $w$ is fixed-parameter tractable parameterized by $w$. We provide such algorithms for semi-complete digraphs, and for digraphs of bounded directed clique-width. Specifically, we show that given an $n$-vertex semi-complete digraph $G$ and an integer $w$, one can in time $\mathcal{O}(9^{(w+1)4^{w+2}} \cdot n^2)$ determine whether $G$ has cycle rank at most $w$. The proof is reduced to the case of bounded directed clique-width, and we then show that given an $n$-vertex digraph $G$ with a directed clique-width $k$-expression and an integer $w$, one can in time $\mathcal{O}(9^{(w+1) 4^k} \cdot n)$ determine whether $G$ has cycle rank at most $w$. Additionally, we consider the \textsc{Minimum Feedback Arc Set} problem on semi-complete digraphs, and show that it can be solved in time $n^{\mathcal{O}(w)}$, where $w$ is the cycle rank of the given semi-complete digraph.

Authors: Seokbeom Kim, O-joung Kwon, Myounghwan Lee

Cycle rank is a depth parameter for digraphs introduced by Eggan in 1963. Gruber (DMTCS 2012) and Giannopoulou, Hunter, and Thilikos (DAM 2012) asked whether the problem of determining if a given digraph has cycle rank at most $w$ is fixed-parameter tractable parameterized by $w$. We provide such algorithms for semi-complete digraphs, and for digraphs of bounded directed clique-width. Specifically, we show that given an $n$-vertex semi-complete digraph $G$ and an integer $w$, one can in time $\mathcal{O}(9^{(w+1)4^{w+2}} \cdot n^2)$ determine whether $G$ has cycle rank at most $w$. The proof is reduced to the case of bounded directed clique-width, and we then show that given an $n$-vertex digraph $G$ with a directed clique-width $k$-expression and an integer $w$, one can in time $\mathcal{O}(9^{(w+1) 4^k} \cdot n)$ determine whether $G$ has cycle rank at most $w$. Additionally, we consider the \textsc{Minimum Feedback Arc Set} problem on semi-complete digraphs, and show that it can be solved in time $n^{\mathcal{O}(w)}$, where $w$ is the cycle rank of the given semi-complete digraph.