Thank you to arXiv for use of its open access interoperability.
Note: the date of arXiv entries announced right after publication holidays might incorrectly show up as the date of the publication holiday itself. This is due to our ad hoc method of inferring announcement dates, which are not returned by the arXiv API.
Enterprise security was built to govern data boundaries: the protected surface was data at rest and in transit, and the controls -- access control, data-loss prevention, perimeter inspection -- governed crossings of that boundary. Production AI agents dissolve this assumption. An agent reads context, calls tools, invokes connectors, and modifies systems of record on an enterprise's behalf, so risk moves inside the workflow, into sequences of individually-permitted actions that may transform a business process no one authorized. Existing policy engines do not extend to this regime: they evaluate request-time decisions against atomic principals, where agentic systems require stateful evaluation against composite principals whose authority attenuates through delegation chains.
We present a reference architecture for the runtime governance of production agents, built from four composable primitives: a five-plane decomposition (a reasoning plane that adjudicates intent, and four enforcement planes -- network, identity, endpoint, data -- that realize the decision), stop-anywhere mediation, composite principals with capability attenuation, and audit as a structured evidence substrate. We define a taxonomy of six interruption primitives that generalize allow and deny, state and argue for four correctness invariants, and demonstrate the foreclosure of seven production-agent threats across five concrete workflows. A reference implementation of the policy-engine core supplies measured evidence: attenuation correctness and evidence reconstructability hold on every trial, adjudication runs in single-digit microseconds, and the audit substrate's tamper-evidence behaves exactly as designed. We are explicit about scope: the architecture governs delegated action, not model behavior, and a full-system evaluation against a live agent benchmark is the invited next step.
Enterprise security was built to govern data boundaries: the protected surface was data at rest and in transit, and the controls -- access control, data-loss prevention, perimeter inspection -- governed crossings of that boundary. Production AI agents dissolve this assumption. An agent reads context, calls tools, invokes connectors, and modifies systems of record on an enterprise's behalf, so risk moves inside the workflow, into sequences of individually-permitted actions that may transform a business process no one authorized. Existing policy engines do not extend to this regime: they evaluate request-time decisions against atomic principals, where agentic systems require stateful evaluation against composite principals whose authority attenuates through delegation chains.
We present a reference architecture for the runtime governance of production agents, built from four composable primitives: a five-plane decomposition (a reasoning plane that adjudicates intent, and four enforcement planes -- network, identity, endpoint, data -- that realize the decision), stop-anywhere mediation, composite principals with capability attenuation, and audit as a structured evidence substrate. We define a taxonomy of six interruption primitives that generalize allow and deny, state and argue for four correctness invariants, and demonstrate the foreclosure of seven production-agent threats across five concrete workflows. A reference implementation of the policy-engine core supplies measured evidence: attenuation correctness and evidence reconstructability hold on every trial, adjudication runs in single-digit microseconds, and the audit substrate's tamper-evidence behaves exactly as designed. We are explicit about scope: the architecture governs delegated action, not model behavior, and a full-system evaluation against a live agent benchmark is the invited next step.
In this paper, we prove that output-sensitive sparse polynomial GCD computation over finite fields is NP-hard under BPP many-one reduction. More precisely, for two sparse univariate polynomials $f,g$ with finite field coefficients, there exists no randomized algorithm to compute $\mathrm{gcd}(f,g)$, which is polynomial-time in the sizes of $f,g,\gcd(f,g)$ under the standard complexity assumption $\mathrm{NP}\nsubseteq\mathrm{BPP}$. This settles the open problem posed as Challenge 5 in The Sparsity Challenges in the finite field setting. Furthermore, we show that the Roots of Unity Detection problem over finite fields is NP-hard; that is, determining whether the GCD of a sparse univariate polynomial and $x^n - 1$ has nonzero degree is NP-hard.
In this paper, we prove that output-sensitive sparse polynomial GCD computation over finite fields is NP-hard under BPP many-one reduction. More precisely, for two sparse univariate polynomials $f,g$ with finite field coefficients, there exists no randomized algorithm to compute $\mathrm{gcd}(f,g)$, which is polynomial-time in the sizes of $f,g,\gcd(f,g)$ under the standard complexity assumption $\mathrm{NP}\nsubseteq\mathrm{BPP}$. This settles the open problem posed as Challenge 5 in The Sparsity Challenges in the finite field setting. Furthermore, we show that the Roots of Unity Detection problem over finite fields is NP-hard; that is, determining whether the GCD of a sparse univariate polynomial and $x^n - 1$ has nonzero degree is NP-hard.
In this paper, we show that deciding whether a sparse polynomial does not divide another sparse polynomial exactly over finite fields is NP-hard under BPP many-one reductions. Equivalently, the sparse polynomial divisibility test over finite fields is CoNP-hard. This resolves the long-standing open problem concerning the computational complexity of the divisibility test for sparse polynomials in the setting of finite fields.
In this paper, we show that deciding whether a sparse polynomial does not divide another sparse polynomial exactly over finite fields is NP-hard under BPP many-one reductions. Equivalently, the sparse polynomial divisibility test over finite fields is CoNP-hard. This resolves the long-standing open problem concerning the computational complexity of the divisibility test for sparse polynomials in the setting of finite fields.
Sparse polynomial multiplication is a fundamental problem in computer algebra and the theory of computation, and the development of a quasi-linear time output-sensitive multiplication algorithm has been posed as an open challenge. In this paper, a counterexample is provided to a previously claimed solution to this open problem for integer coefficients. By employing the existing quasi-linear modular-black-box interpolation algorithm, we are able to provide an algorithm with quasi-linear bit complexity for the integer coefficients setting. Furthermore, in the case of coefficients over a finite field, we obtain an algorithm whose bit complexity is linear in the number of terms, the logarithm of the degree, and the logarithm of the size of the finite field.
Sparse polynomial multiplication is a fundamental problem in computer algebra and the theory of computation, and the development of a quasi-linear time output-sensitive multiplication algorithm has been posed as an open challenge. In this paper, a counterexample is provided to a previously claimed solution to this open problem for integer coefficients. By employing the existing quasi-linear modular-black-box interpolation algorithm, we are able to provide an algorithm with quasi-linear bit complexity for the integer coefficients setting. Furthermore, in the case of coefficients over a finite field, we obtain an algorithm whose bit complexity is linear in the number of terms, the logarithm of the degree, and the logarithm of the size of the finite field.
Authors: Arie Soeteman, Balder ten Cate, Maurice Funk, Benny Kimelfeld, Carsten Lutz, Moritz Schönherr
The conventional approach to deep learning over relational databases applies neural models, such as Graph Neural Networks (GNNs), to a graph representation of the database. Recent approaches instead operate on databases directly, associating tuples with embeddings and extending query mechanisms to jointly process embeddings and relational content. Inspired by these developments, we introduce Neuro-Relational Programs (NRPs), a declarative query language for relational databases whose facts carry numeric vector embeddings. NRPs extend Datalog-style rules with operations that combine, aggregate, and transform embeddings, thereby interleaving relational reasoning and learnable neural components within a single formalism. This yields a general approach to neural computation over relational data: an NRP can be read both as a query plan with trainable components and as a neural architecture with relational structure built in.
Natural syntactic fragments of NRPs recover existing architectures and query formalisms. Zero-ary NRPs correspond to non-adaptive query algorithms; monadic NRPs generalize GNN-style message passing and precisely capture Deep Homomorphism Networks, a connection that we extend to frontier-guarded NRPs over databases with row-ids. We characterize the expressive power of unrestricted NRPs with ReLU-FFN transformations by FOCQ, an extension of first-order logic with counting interpreted over real-weighted structures, yielding a precise connection with uniform TC$^0$ over ordered databases. Together, these results establish NRPs as a broad declarative framework for querying and neural computation over relational data.
The conventional approach to deep learning over relational databases applies neural models, such as Graph Neural Networks (GNNs), to a graph representation of the database. Recent approaches instead operate on databases directly, associating tuples with embeddings and extending query mechanisms to jointly process embeddings and relational content. Inspired by these developments, we introduce Neuro-Relational Programs (NRPs), a declarative query language for relational databases whose facts carry numeric vector embeddings. NRPs extend Datalog-style rules with operations that combine, aggregate, and transform embeddings, thereby interleaving relational reasoning and learnable neural components within a single formalism. This yields a general approach to neural computation over relational data: an NRP can be read both as a query plan with trainable components and as a neural architecture with relational structure built in.
Natural syntactic fragments of NRPs recover existing architectures and query formalisms. Zero-ary NRPs correspond to non-adaptive query algorithms; monadic NRPs generalize GNN-style message passing and precisely capture Deep Homomorphism Networks, a connection that we extend to frontier-guarded NRPs over databases with row-ids. We characterize the expressive power of unrestricted NRPs with ReLU-FFN transformations by FOCQ, an extension of first-order logic with counting interpreted over real-weighted structures, yielding a precise connection with uniform TC$^0$ over ordered databases. Together, these results establish NRPs as a broad declarative framework for querying and neural computation over relational data.
We study the undirected three-terminal reachability-preserving minimum edge cut problem. The input is an undirected graph $G=(V,E)$ with nonnegative edge costs, two protected terminals $s_1,s_2$, and a target terminal $t$. The goal is to remove a minimum-cost edge set so that $t$ is disconnected from the protected terminals while $s_1$ and $s_2$ remain connected. This problem captures a basic tension between separation and connectivity preservation. Prior work on connectivity-preserving cuts established polynomial-time solvability for some special cases, such as planar edge-cut instances, and strong hardness for node-cut variants, but a general-graph approximation guarantee for the undirected three-terminal edge-cut version does not appear to have been known. We give a polynomial-time $O(\sqrt n)$-approximation algorithm in this paper. This is the first known approximation algorithm for the problem
We study the undirected three-terminal reachability-preserving minimum edge cut problem. The input is an undirected graph $G=(V,E)$ with nonnegative edge costs, two protected terminals $s_1,s_2$, and a target terminal $t$. The goal is to remove a minimum-cost edge set so that $t$ is disconnected from the protected terminals while $s_1$ and $s_2$ remain connected. This problem captures a basic tension between separation and connectivity preservation. Prior work on connectivity-preserving cuts established polynomial-time solvability for some special cases, such as planar edge-cut instances, and strong hardness for node-cut variants, but a general-graph approximation guarantee for the undirected three-terminal edge-cut version does not appear to have been known. We give a polynomial-time $O(\sqrt n)$-approximation algorithm in this paper. This is the first known approximation algorithm for the problem
Authors: Arjen Simons, Sarah Schöttler, Wouter Meulemans, Kevin Verbeek, Bettina Speckmann
Thematic maps visually communicate statistical information about spatial units such as countries or states. They must balance the individual readability of those map elements that carry the statistical information and the overall cartographic context. Nowadays, most maps are not static images, but must flexibly respond to a range of device types and display sizes. Current approaches to responsive thematic mapping are limited: they are labor-intensive for practitioners and often rely on combining disjointed visual encodings to cover different device types. In this paper we introduce the first algorithmic framework to efficiently compute responsive thematic maps that smoothly adapt to different display sizes. A key component of our framework is the layout guide: a combinatorial structure which encodes the two essential aspects of a thematic map. The first aspect are the visual requirements of each statistical map element (at least their desired width and height), the second aspect is the cartographic context in the form of relative positions of map elements. Our main algorithmic contribution is the map arranger which takes a visual container as input and returns a suitable layout guide. The map arranger does so in a stable and consistent manner: if the container changes only a little, then so does the layout guide, and the same input container always results in the same layout guide. To use our framework, one needs three ingredients: $(1)$ a reference layout, which corresponds to the ``ideal'' thematic map, $(2)$ a total vertical and horizontal order for all map elements (the desired layouts for containers with extreme aspect ratios), and $(3)$ a thematic mapping algorithm that can construct a thematic map from a layout guide. We demonstrate our framework on two types of thematic maps, namely rectangular and Demers cartograms.
Thematic maps visually communicate statistical information about spatial units such as countries or states. They must balance the individual readability of those map elements that carry the statistical information and the overall cartographic context. Nowadays, most maps are not static images, but must flexibly respond to a range of device types and display sizes. Current approaches to responsive thematic mapping are limited: they are labor-intensive for practitioners and often rely on combining disjointed visual encodings to cover different device types. In this paper we introduce the first algorithmic framework to efficiently compute responsive thematic maps that smoothly adapt to different display sizes. A key component of our framework is the layout guide: a combinatorial structure which encodes the two essential aspects of a thematic map. The first aspect are the visual requirements of each statistical map element (at least their desired width and height), the second aspect is the cartographic context in the form of relative positions of map elements. Our main algorithmic contribution is the map arranger which takes a visual container as input and returns a suitable layout guide. The map arranger does so in a stable and consistent manner: if the container changes only a little, then so does the layout guide, and the same input container always results in the same layout guide. To use our framework, one needs three ingredients: $(1)$ a reference layout, which corresponds to the ``ideal'' thematic map, $(2)$ a total vertical and horizontal order for all map elements (the desired layouts for containers with extreme aspect ratios), and $(3)$ a thematic mapping algorithm that can construct a thematic map from a layout guide. We demonstrate our framework on two types of thematic maps, namely rectangular and Demers cartograms.
We study the query complexity of Boolean functions $f: \{0, 1\}^n \rightarrow \{0, 1\}$ in the noisy query model introduced by Feige, Raghavan, Peleg and Upfal [SICOMP 1994]. In this model, an algorithm can adaptively query the bits of an input vector, but each query result is independently flipped with constant probability $p \in (0, 1/2)$; repeated queries are allowed. The noisy query complexity $\mathsf{N}_p(f)$ of a function $f$ is defined as the minimum expected number of queries needed to compute $f(x)$ with error probability at most $1/3$, for the worst case input $x$.
We prove a general lower bound on $\mathsf{N}_p(f)$ based on degree statistics of certain subgraphs of the Boolean hypercube. This is the first general lower bound beyond those implied by the simple observation that $\mathsf{N}_p(f)$ is lower bounded by the randomized query complexity. We show that this recovers (up to a constant factor) most previously known lower bounds on the noisy query complexity of Boolean functions, providing a unified framework for understanding these results and simplifying the proofs in several cases. Furthermore, this resolves in the affirmative an open problem of Gu, Li and Xu [COLT 2025] that $\mathsf{N}_p(f) = Ω(\mathsf{I}(f) \log \mathsf{I}(f))$, where $\mathsf{I}(f)$ denotes the total influence of $f$. We also apply our general lower bound to obtain tight bounds on the noisy query complexity for several new functions.
We study the query complexity of Boolean functions $f: \{0, 1\}^n \rightarrow \{0, 1\}$ in the noisy query model introduced by Feige, Raghavan, Peleg and Upfal [SICOMP 1994]. In this model, an algorithm can adaptively query the bits of an input vector, but each query result is independently flipped with constant probability $p \in (0, 1/2)$; repeated queries are allowed. The noisy query complexity $\mathsf{N}_p(f)$ of a function $f$ is defined as the minimum expected number of queries needed to compute $f(x)$ with error probability at most $1/3$, for the worst case input $x$.
We prove a general lower bound on $\mathsf{N}_p(f)$ based on degree statistics of certain subgraphs of the Boolean hypercube. This is the first general lower bound beyond those implied by the simple observation that $\mathsf{N}_p(f)$ is lower bounded by the randomized query complexity. We show that this recovers (up to a constant factor) most previously known lower bounds on the noisy query complexity of Boolean functions, providing a unified framework for understanding these results and simplifying the proofs in several cases. Furthermore, this resolves in the affirmative an open problem of Gu, Li and Xu [COLT 2025] that $\mathsf{N}_p(f) = Ω(\mathsf{I}(f) \log \mathsf{I}(f))$, where $\mathsf{I}(f)$ denotes the total influence of $f$. We also apply our general lower bound to obtain tight bounds on the noisy query complexity for several new functions.
A large body of work studies the problem of learning an approximation to an implicit matrix $A\in \mathbb{R}^{m\times n}$ that is only accessible implicitly via matrix-vector product queries (matvec queries) of the form ${x} \rightarrow {A}{x}$ or ${x} \rightarrow {A}^T{x}$. Of particular interest are methods that learn a near-optimal approximation with a fixed sparsity pattern. For example, we might want to learn a near-optimal diagonal, banded, or arrow-head approximation to an implicit matrix $A$.
Naturally, the number of matvec queries required to solve this problem depends on the sparsity pattern, which can be encoded as a binary matrix ${S}\in \{0,1\}^{m\times n}$. The query complexity of previous algorithms scales with quantities like the total number of ones in ${S}$, its maximum column/row sparsity, or the chromatic number of a its "conflict graph". These quantities are incomparable: for a given ${S}$, parameterizing by one might yield lower query complexity than another.
In this work, we unify and tighten these prior results by providing a nearly sharp characterization of the matvec query complexity of sparse matrix approximation. Generalizing a definition from graph algorithms, let the degeneracy, ${degen}({S})$, denote the smallest number $k$ so that, if we iteratively delete all rows and columns of ${S}$ with $\leq k$ ones, we are left with an empty matrix. We show that a near-optimal approximation to $A$ with sparsity pattern $S$ can be learned with $\tilde{O}({degen}({S}))$ matrix-vector product queries, and $Ω({degen}({S}))$ queries are necessary, for any sparsity pattern ${S}$. Moreover, unlike prior work based on graph coloring, all of our methods run in polynomial time.
A large body of work studies the problem of learning an approximation to an implicit matrix $A\in \mathbb{R}^{m\times n}$ that is only accessible implicitly via matrix-vector product queries (matvec queries) of the form ${x} \rightarrow {A}{x}$ or ${x} \rightarrow {A}^T{x}$. Of particular interest are methods that learn a near-optimal approximation with a fixed sparsity pattern. For example, we might want to learn a near-optimal diagonal, banded, or arrow-head approximation to an implicit matrix $A$.
Naturally, the number of matvec queries required to solve this problem depends on the sparsity pattern, which can be encoded as a binary matrix ${S}\in \{0,1\}^{m\times n}$. The query complexity of previous algorithms scales with quantities like the total number of ones in ${S}$, its maximum column/row sparsity, or the chromatic number of a its "conflict graph". These quantities are incomparable: for a given ${S}$, parameterizing by one might yield lower query complexity than another.
In this work, we unify and tighten these prior results by providing a nearly sharp characterization of the matvec query complexity of sparse matrix approximation. Generalizing a definition from graph algorithms, let the degeneracy, ${degen}({S})$, denote the smallest number $k$ so that, if we iteratively delete all rows and columns of ${S}$ with $\leq k$ ones, we are left with an empty matrix. We show that a near-optimal approximation to $A$ with sparsity pattern $S$ can be learned with $\tilde{O}({degen}({S}))$ matrix-vector product queries, and $Ω({degen}({S}))$ queries are necessary, for any sparsity pattern ${S}$. Moreover, unlike prior work based on graph coloring, all of our methods run in polynomial time.
Authors: Malte Baumecker, Rustam Latypov, Yannic Maus, Jara Uitto
Given a graph $G=(V,E)$, a $β$-ruling set is a subset of nodes $S\subseteq V$ that is independent, and each node in $V$ is at distance at most $β$ from some node in $S$. In this paper, we present almost optimal distributed algorithms for finding $2$-ruling sets in the classical \LOCAL model. Our main contribution is a randomized algorithm that w.h.p.\ computes a $2$-ruling set on any $n$-node graph with bounded arboricity in $O(\log \log n)$ rounds. In fact, the algorithm works up to arboricity $O(\log\log n)$, improves exponentially over the prior state of the art that can be achieved by combining [Barenboim, Elkin, Pettie, Schneider; JACM'16], [Ghaffari; SODA'16], and [Bisht, Kothapalli and Pemmaraju; PODC'14], and nearly matches the lower bound of $Ω(\log \log n / \log \log \log n)$ [Balliu, Brandt, Kuhn, Olivetti; FOCS'20]. The domination parameter $β=2$ is optimal for algorithms with runtime $\log^{o(1)}n$: on graphs with arboricity $2$, there is a lower bound of $Ω(\sqrt{\log n})$ rounds for MIS (i.e., $β= 1$) [Khoury, Schild; FOCS'25].
Additionally, we obtain improved algorithms for larger arboricity. For general graphs with arboricity $α$, we present a randomized algorithm that computes a $2$-ruling set in $\widetilde{O}(\log^{5/8} α+\log^{5/3} \log n)$ rounds. This improves exponentially over the state of the art for a large range of non-constant arboricity.
Our techniques extend beyond distributed computing. We present an $O(\log \log \log n)$-round algorithm in the low-space Massively Parallel Computation (\mpc) model that w.h.p.\ computes a $2$-ruling set on any graph with arboricity up to $2^{poly (\log \log n)}$, improving exponentially over the state of the art from [Kothapalli, Pai, Pemmaraju; FSTTCS'20] combined with [Fischer, Giliberti, Grunau; SPAA'23].
Given a graph $G=(V,E)$, a $β$-ruling set is a subset of nodes $S\subseteq V$ that is independent, and each node in $V$ is at distance at most $β$ from some node in $S$. In this paper, we present almost optimal distributed algorithms for finding $2$-ruling sets in the classical \LOCAL model. Our main contribution is a randomized algorithm that w.h.p.\ computes a $2$-ruling set on any $n$-node graph with bounded arboricity in $O(\log \log n)$ rounds. In fact, the algorithm works up to arboricity $O(\log\log n)$, improves exponentially over the prior state of the art that can be achieved by combining [Barenboim, Elkin, Pettie, Schneider; JACM'16], [Ghaffari; SODA'16], and [Bisht, Kothapalli and Pemmaraju; PODC'14], and nearly matches the lower bound of $Ω(\log \log n / \log \log \log n)$ [Balliu, Brandt, Kuhn, Olivetti; FOCS'20]. The domination parameter $β=2$ is optimal for algorithms with runtime $\log^{o(1)}n$: on graphs with arboricity $2$, there is a lower bound of $Ω(\sqrt{\log n})$ rounds for MIS (i.e., $β= 1$) [Khoury, Schild; FOCS'25].
Additionally, we obtain improved algorithms for larger arboricity. For general graphs with arboricity $α$, we present a randomized algorithm that computes a $2$-ruling set in $\widetilde{O}(\log^{5/8} α+\log^{5/3} \log n)$ rounds. This improves exponentially over the state of the art for a large range of non-constant arboricity.
Our techniques extend beyond distributed computing. We present an $O(\log \log \log n)$-round algorithm in the low-space Massively Parallel Computation (\mpc) model that w.h.p.\ computes a $2$-ruling set on any graph with arboricity up to $2^{poly (\log \log n)}$, improving exponentially over the state of the art from [Kothapalli, Pai, Pemmaraju; FSTTCS'20] combined with [Fischer, Giliberti, Grunau; SPAA'23].
Authors: Daniel Dadush, László A. Végh, Giacomo Zambelli
We consider linear programming in the oracle model: $\max\{c^\top x \,:\, x\in P\}$, where the polyhedron $P=\{x\in\mathbb{R}^n\,:\, Ax\le b\}$ is given by a separation oracle. We present an algorithm that finds exact primal and dual solutions using $O(n^2\log(n/δ))$ oracle calls and $O(n^4\log(n/δ)+n^5\log\log(1/δ))$ arithmetic operations, where $δ$ is a geometric condition number associated with the system $(A,b)$. These bounds do not depend on the cost vector $c$ and do not require a priori knowledge of $δ$. For rational data, $\log(1/δ)$ is polynomially bounded in the encoding size of $(A,b)$, thus providing a polynomial-time algorithm.
The algorithm works in a black box manner, requiring a subroutine for approximate primal and dual solutions; the above running times are achieved when using the cutting plane method of Jiang, Lee, Song, and Wong (STOC 2020) for this subroutine. Whereas approximate solvers may return primal solutions only, we develop a general framework for extracting dual certificates based on the work of Burrell and Todd (Math. Oper. Res. 1985).
Our algorithm strengthens results by Grötschel, Lovász, and Schrijver (Prog. Comb. Opt. 1984), and by Frank and Tardos (Combinatorica 1987) that rely on bit-complexity arguments. Our algorithm avoids rounding-based arguments such as simultaneous Diophantine approximation and uses geometric arguments instead.
We consider linear programming in the oracle model: $\max\{c^\top x \,:\, x\in P\}$, where the polyhedron $P=\{x\in\mathbb{R}^n\,:\, Ax\le b\}$ is given by a separation oracle. We present an algorithm that finds exact primal and dual solutions using $O(n^2\log(n/δ))$ oracle calls and $O(n^4\log(n/δ)+n^5\log\log(1/δ))$ arithmetic operations, where $δ$ is a geometric condition number associated with the system $(A,b)$. These bounds do not depend on the cost vector $c$ and do not require a priori knowledge of $δ$. For rational data, $\log(1/δ)$ is polynomially bounded in the encoding size of $(A,b)$, thus providing a polynomial-time algorithm.
The algorithm works in a black box manner, requiring a subroutine for approximate primal and dual solutions; the above running times are achieved when using the cutting plane method of Jiang, Lee, Song, and Wong (STOC 2020) for this subroutine. Whereas approximate solvers may return primal solutions only, we develop a general framework for extracting dual certificates based on the work of Burrell and Todd (Math. Oper. Res. 1985).
Our algorithm strengthens results by Grötschel, Lovász, and Schrijver (Prog. Comb. Opt. 1984), and by Frank and Tardos (Combinatorica 1987) that rely on bit-complexity arguments. Our algorithm avoids rounding-based arguments such as simultaneous Diophantine approximation and uses geometric arguments instead.
We consider the problem of privately releasing a $k$-dimensional vector under updates: Starting with a zero vector, at times $t_1, t_2,\dots$ the vector is updated by adding $x^{(1)}, x^{(2)},\dots$, respectively. For positive integers $T$, $k$ we model the updates as a data set $\{(t_i, x^{(i)})\}_i$, where $t_i \in [T]$ and $x^{(i)} \in B_k$ (the $k$-dimensional unit ball). Two such data sets are said to be neighboring if their symmetric difference has size at most $1$. The continual release consists of the sum $A^{(t)} = \sum_{i \; : \; t_i \leq t} x^{(i)}$ for each time step $t=1,\dots,T$. Classical continual release techniques allow us to release an approximation of $A^{(1)},\dots,A^{(T)}$ with additive noise of magnitude $\text{polylog}(T)$, computed in time $O(kT)$, even in the on-line, adaptive case where data is continually revealed for the current time step.
Motivated by private sketching techniques, we consider the setting where only a \emph{subset} of entries in $A^{(t)}$ need to be released at time step $t$. Our new result is that it is possible to sample any desired entry in a given noise vector in \emph{constant time} while reproducing exactly the distribution of the binary tree mechanism with Gaussian noise. The improvement on the known time bound of $O(\log T)$ comes from a new data structure that allows us to sample a new noise value with the correct correlations in constant time using Brownian bridges. We present two data management applications, of independent interest, that use our technique in conjunction with differentially private CountSketches: 1) A dynamic data structure for orthogonal range counting queries with a better privacy/accuracy/space trade-off than previous data structures, and 2) Join size estimation, where in addition we show improved high-probability bounds.
We consider the problem of privately releasing a $k$-dimensional vector under updates: Starting with a zero vector, at times $t_1, t_2,\dots$ the vector is updated by adding $x^{(1)}, x^{(2)},\dots$, respectively. For positive integers $T$, $k$ we model the updates as a data set $\{(t_i, x^{(i)})\}_i$, where $t_i \in [T]$ and $x^{(i)} \in B_k$ (the $k$-dimensional unit ball). Two such data sets are said to be neighboring if their symmetric difference has size at most $1$. The continual release consists of the sum $A^{(t)} = \sum_{i \; : \; t_i \leq t} x^{(i)}$ for each time step $t=1,\dots,T$. Classical continual release techniques allow us to release an approximation of $A^{(1)},\dots,A^{(T)}$ with additive noise of magnitude $\text{polylog}(T)$, computed in time $O(kT)$, even in the on-line, adaptive case where data is continually revealed for the current time step.
Motivated by private sketching techniques, we consider the setting where only a \emph{subset} of entries in $A^{(t)}$ need to be released at time step $t$. Our new result is that it is possible to sample any desired entry in a given noise vector in \emph{constant time} while reproducing exactly the distribution of the binary tree mechanism with Gaussian noise. The improvement on the known time bound of $O(\log T)$ comes from a new data structure that allows us to sample a new noise value with the correct correlations in constant time using Brownian bridges. We present two data management applications, of independent interest, that use our technique in conjunction with differentially private CountSketches: 1) A dynamic data structure for orthogonal range counting queries with a better privacy/accuracy/space trade-off than previous data structures, and 2) Join size estimation, where in addition we show improved high-probability bounds.
Multireference alignment (MRA) is the task of recovering a hidden "signal" vector, given many noisy copies that have been cyclically shifted by unknown offsets. This task belongs to the class of orbit recovery problems, in which the observed samples are affected by some group action. These problems have a variety of practical motivations, including the reconstruction of 3-dimensional molecular structure from cryogenic electron microscopy (cryo-EM) images. We consider two variants of MRA: dihedral MRA, where the cyclic group is replaced by the dihedral group, allowing for reversals of the vector in addition to shifts; and projected MRA, where the observations are passed through a projection operator akin to the tomographic projection present in cryo-EM. We apply the method of moments and aim to recover the signal from the third moment tensor of the samples. This inverse problem is well understood for basic MRA, but for the variants we consider there is no polynomial-time algorithm known to succeed for generic signals. We give the first such algorithm for both of these variants. Our method requires the signal length to be a power of two, and recursively subdivides the problem into smaller problems of half the size. The algorithm's success for generic signals is proven, conditional on a conjecture about the rank of a certain symbolic matrix of polynomials. For any given problem size, this conjecture can be verified on a computer.
Multireference alignment (MRA) is the task of recovering a hidden "signal" vector, given many noisy copies that have been cyclically shifted by unknown offsets. This task belongs to the class of orbit recovery problems, in which the observed samples are affected by some group action. These problems have a variety of practical motivations, including the reconstruction of 3-dimensional molecular structure from cryogenic electron microscopy (cryo-EM) images. We consider two variants of MRA: dihedral MRA, where the cyclic group is replaced by the dihedral group, allowing for reversals of the vector in addition to shifts; and projected MRA, where the observations are passed through a projection operator akin to the tomographic projection present in cryo-EM. We apply the method of moments and aim to recover the signal from the third moment tensor of the samples. This inverse problem is well understood for basic MRA, but for the variants we consider there is no polynomial-time algorithm known to succeed for generic signals. We give the first such algorithm for both of these variants. Our method requires the signal length to be a power of two, and recursively subdivides the problem into smaller problems of half the size. The algorithm's success for generic signals is proven, conditional on a conjecture about the rank of a certain symbolic matrix of polynomials. For any given problem size, this conjecture can be verified on a computer.
We study the task of density estimation, where we hope to accurately estimate a probability density from $n$ samples. A textbook method for density estimation in total variation distance is the minimum-distance estimator approach, where we conclude both the algorithm and the analysis merely from bounding the VC dimension of a particular concept class (the so-called Yatracos class).
While this technique has originally yielded sharp guarantees primarily for total variation distance, in this work we extend the minimum-distance estimator approach for learning within Hellinger distance. Our main observation is that we may produce an analogous recipe for Hellinger (where we only require bounding the VC dimension of a related concept class) by drawing connections to recent results yielding reverse data processing inequalities.
This recipe is flexible enough to accommodate fast algorithms originally designed for total variation distance; by modifying the approach of Acharya et al. (2017) we conclude the first near-linear time algorithm for learning classes including univariate mixtures of log-concave densities and mixtures of Gaussians (with arbitrary variances), with near-optimal sample complexity.
We study the task of density estimation, where we hope to accurately estimate a probability density from $n$ samples. A textbook method for density estimation in total variation distance is the minimum-distance estimator approach, where we conclude both the algorithm and the analysis merely from bounding the VC dimension of a particular concept class (the so-called Yatracos class).
While this technique has originally yielded sharp guarantees primarily for total variation distance, in this work we extend the minimum-distance estimator approach for learning within Hellinger distance. Our main observation is that we may produce an analogous recipe for Hellinger (where we only require bounding the VC dimension of a related concept class) by drawing connections to recent results yielding reverse data processing inequalities.
This recipe is flexible enough to accommodate fast algorithms originally designed for total variation distance; by modifying the approach of Acharya et al. (2017) we conclude the first near-linear time algorithm for learning classes including univariate mixtures of log-concave densities and mixtures of Gaussians (with arbitrary variances), with near-optimal sample complexity.
Efficiently sampling from a complex probability distribution is a fundamental problem which has become increasingly pertinent in recent years with the rise of generative AI, as sophisticated sampling procedures from LLMs have been proposed to solve challenging reasoning problems. The efficacy of such sampling algorithms is limited, however, by the relationship between the LLM and the particular sampling task at hand, which has motivated the framework of test-time training (TTT). TTT works by updating a model's weights in response to partial generations and reward feedback received at inference time, thus adapting to the particular problem. In this work, we propose a formalization for TTT as the problem of producing a sample from a given probability measure $μ^\star$ belonging to a known class ${F}$ of distributions, given an oracle $\hat μ$ which yields approximate density estimates for $μ^\star$. This is closely related to the problem of reducing sampling to approximate counting studied in seminal works of Jerrum, Valiant & Vazirani (1986) and Jerrum & Sinclair (1989): namely, when ${F}$ is the class of all distributions, it coincides exactly with the aforementioned counting-to-sampling reduction.
In this paper, we first show a quadratic lower bound on the query complexity of sampling from $μ^\star$ given query access to $\hat μ$ (for sufficiently large classes ${F}$), thus showing that the random walk approach proposed by Jerrum & Sinclair (1989) and refined by Hayes & Sinclair (2010), is optimal. This answers an open question posed by Hayes & Sinclair. We then show that this lower bound can be circumvented if the size of ${F}$ is bounded appropriately. As we discuss, this latter result can be viewed as an abstraction of TTT, and thus represents a starting point for the development of a principled theoretical framework for TTT.
Efficiently sampling from a complex probability distribution is a fundamental problem which has become increasingly pertinent in recent years with the rise of generative AI, as sophisticated sampling procedures from LLMs have been proposed to solve challenging reasoning problems. The efficacy of such sampling algorithms is limited, however, by the relationship between the LLM and the particular sampling task at hand, which has motivated the framework of test-time training (TTT). TTT works by updating a model's weights in response to partial generations and reward feedback received at inference time, thus adapting to the particular problem. In this work, we propose a formalization for TTT as the problem of producing a sample from a given probability measure $μ^\star$ belonging to a known class ${F}$ of distributions, given an oracle $\hat μ$ which yields approximate density estimates for $μ^\star$. This is closely related to the problem of reducing sampling to approximate counting studied in seminal works of Jerrum, Valiant & Vazirani (1986) and Jerrum & Sinclair (1989): namely, when ${F}$ is the class of all distributions, it coincides exactly with the aforementioned counting-to-sampling reduction.
In this paper, we first show a quadratic lower bound on the query complexity of sampling from $μ^\star$ given query access to $\hat μ$ (for sufficiently large classes ${F}$), thus showing that the random walk approach proposed by Jerrum & Sinclair (1989) and refined by Hayes & Sinclair (2010), is optimal. This answers an open question posed by Hayes & Sinclair. We then show that this lower bound can be circumvented if the size of ${F}$ is bounded appropriately. As we discuss, this latter result can be viewed as an abstraction of TTT, and thus represents a starting point for the development of a principled theoretical framework for TTT.
There are two ways to look at the P v NP problem, as a formal mathematically defined conjecture as a Clay Millennium Prize Problem, and as the more intuitive notion that everything efficiently verifiable is efficiently computable and the implications that has on our ability to compute.
I've written considerably about how artificial intelligence has affected the latter. In particular, how AI and other advances in computing have brought us to this Optiland of getting most of the good implications of P=NP while our cryptographic codes remain unbreakable.
But now with the recent advances in AI-created and assisted proofs, will AI change what we know about the formal mathematical statement? Is an AI-generated proof of P ≠ NP around the corner?
No, it isn't. I do not believe we will see a P v NP proof in my lifetime proven by man or machine, separately or working together.
While the disproof of the Erdős unit distance problem is an impressive AI achievement, keep in mind that for every AI math proof there are hundreds of problems that we have tried to solve with AI where we haven't seen progress. And there is a huge chasm between Erdős combinatorial conjectures and the Clay Millennium problems. AI will continue to improve, but there are limits.
People, particularly those outside of computational complexity, don't realize how difficult a mathematical challenge this is. Polynomial-time algorithms can work in strange and mysterious ways. They don't have to respect the semantics of an NP-search problem or do any searching at all. Bill gave me the following "algorithm" for clique: Take the eigenvalues of the adjacency matrix. For all we know, if there are two primes p and q such that the pth eigenvalue and the qth eigenvalue differ by more than 1/k then the graph has a k-clique. Of course this doesn't work. But to prove P ≠ NP, you need to prove not only that this algorithm doesn't work, but neither do any of the infinitely other potential algorithms for solving NP-complete problems.
We simply know of no way to manage general polynomial-time algorithms other than by simulating them. We know by relativization that simulation and diagonalization will not work to settle P v NP. Other attempts to understand polynomial time, like circuit complexity, proof complexity and algebraic geometry have gotten bogged down well below the full power of polynomial-time. At this time we don't even have a viable approach to settling the P v NP problem.
Don't waste your time trying a formal approach via Lean. (I'm looking at you Dmitry Khanukov) Computational complexity is very messy to formulate technically. I can't get an AI willing to give me a full Lean-verified proof of something trivial like P closed under complement, forget the PCP theorem. If someone or something does come up with a P ≠ NP, it'll be following the right intuitive approach, not a formalistic one.
At least start with something simpler, like showing BPP is in subexponential time, or SAT doesn't have quadratic algorithms. You won't succeed there either, even though these questions should be galactically simpler than P ≠ NP.
By Lance Fortnow
There are two ways to look at the P v NP problem, as a formal mathematically defined conjecture as a Clay Millennium Prize Problem, and as the more intuitive notion that everything efficiently verifiable is efficiently computable and the implications that has on our ability to compute.
I've written considerably about how artificial intelligence has affected the latter. In particular, how AI and other advances in computing have brought us to this Optiland of getting most of the good implications of P=NP while our cryptographic codes remain unbreakable.
But now with the recent advances in AI-created and assisted proofs, will AI change what we know about the formal mathematical statement? Is an AI-generated proof of P ≠ NP around the corner?
No, it isn't. I do not believe we will see a P v NP proof in my lifetime proven by man or machine, separately or working together.
While the disproof of the Erdős unit distance problem is an impressive AI achievement, keep in mind that for every AI math proof there are hundreds of problems that we have tried to solve with AI where we haven't seen progress. And there is a huge chasm between Erdős combinatorial conjectures and the Clay Millennium problems. AI will continue to improve, but there are limits.
People, particularly those outside of computational complexity, don't realize how difficult a mathematical challenge this is. Polynomial-time algorithms can work in strange and mysterious ways. They don't have to respect the semantics of an NP-search problem or do any searching at all. Bill gave me the following "algorithm" for clique: Take the eigenvalues of the adjacency matrix. For all we know, if there are two primes p and q such that the pth eigenvalue and the qth eigenvalue differ by more than 1/k then the graph has a k-clique. Of course this doesn't work. But to prove P ≠ NP, you need to prove not only that this algorithm doesn't work, but neither do any of the infinitely other potential algorithms for solving NP-complete problems.
We simply know of no way to manage general polynomial-time algorithms other than by simulating them. We know by relativization that simulation and diagonalization will not work to settle P v NP. Other attempts to understand polynomial time, like circuit complexity, proof complexity and algebraic geometry have gotten bogged down well below the full power of polynomial-time. At this time we don't even have a viable approach to settling the P v NP problem.
Don't waste your time trying a formal approach via Lean. (I'm looking at you Dmitry Khanukov) Computational complexity is very messy to formulate technically. I can't get an AI willing to give me a full Lean-verified proof of something trivial like P closed under complement, forget the PCP theorem. If someone or something does come up with a P ≠ NP, it'll be following the right intuitive approach, not a formalistic one.
At least start with something simpler, like showing BPP is in subexponential time, or SAT doesn't have quadratic algorithms. You won't succeed there either, even though these questions should be galactically simpler than P ≠ NP.
The symmetric determinantal complexity sdc(f) of a polynomial f is the least m such that f = det(M) for an m x m symmetric matrix M of affine-linear forms. We prove, over the complex numbers, that sdc(sum_{i=1}^n x_i^n) >= (1/(2e) - o(1)) n^2. This is a symmetric companion to the author's non-symmetric polar-degree preprint (arXiv:7680505); the method parallels that work, but the proof here is self-contained and redoes the load-bearing local incidence analysis in the symmetric setting. The general theorem: if X = V(f) in P^{N-1} is a smooth degree-d hypersurface, N >= 3, and f = det(A_0 + sum x_i A_i) with all A_i symmetric of size m, then the top polar degree d(d-1)^{N-2} is at most 2^{N-2} C(m, N-1). The proof uses the symmetric rank-one kernel incidence M(z,x) u = 0. At a genuine polar point M has rank m-1, and a symmetric Schur-complement normal form eliminates the unique kernel line scheme-theoretically; on the resulting local graph the lifted conormal forms u^T A_i u are a common unit multiple of the partials d_i f, so the lifted polar equations cut the ordinary polar slice up to units and each genuine lifted polar point is a zero-dimensional isolated solution. Multihomogeneous Bezout on P^N x P^{m-1} then yields the bound 2^{N-2} C(m, N-1). For F_n = sum x_i^n this gives the constant 1/(2e). More generally, for F_{N,d} = sum_{i=1}^N x_i^d the same theorem gives sdc(F_{N,d}) >= (1/(2e) - o_N(1)) N(d-1) as N -> infinity. We give an explicit symmetric representation of F_{N,d} of size 2N(d+1)+1, so the diagonal bounds are non-vacuous and tight up to a constant. The result is for exact symmetric determinantal complexity in characteristic zero; it is not a border statement and not a uniform positive-characteristic theorem.
The symmetric determinantal complexity sdc(f) of a polynomial f is the least m such that f = det(M) for an m x m symmetric matrix M of affine-linear forms. We prove, over the complex numbers, that sdc(sum_{i=1}^n x_i^n) >= (1/(2e) - o(1)) n^2. This is a symmetric companion to the author's non-symmetric polar-degree preprint (arXiv:7680505); the method parallels that work, but the proof here is self-contained and redoes the load-bearing local incidence analysis in the symmetric setting. The general theorem: if X = V(f) in P^{N-1} is a smooth degree-d hypersurface, N >= 3, and f = det(A_0 + sum x_i A_i) with all A_i symmetric of size m, then the top polar degree d(d-1)^{N-2} is at most 2^{N-2} C(m, N-1). The proof uses the symmetric rank-one kernel incidence M(z,x) u = 0. At a genuine polar point M has rank m-1, and a symmetric Schur-complement normal form eliminates the unique kernel line scheme-theoretically; on the resulting local graph the lifted conormal forms u^T A_i u are a common unit multiple of the partials d_i f, so the lifted polar equations cut the ordinary polar slice up to units and each genuine lifted polar point is a zero-dimensional isolated solution. Multihomogeneous Bezout on P^N x P^{m-1} then yields the bound 2^{N-2} C(m, N-1). For F_n = sum x_i^n this gives the constant 1/(2e). More generally, for F_{N,d} = sum_{i=1}^N x_i^d the same theorem gives sdc(F_{N,d}) >= (1/(2e) - o_N(1)) N(d-1) as N -> infinity. We give an explicit symmetric representation of F_{N,d} of size 2N(d+1)+1, so the diagonal bounds are non-vacuous and tight up to a constant. The result is for exact symmetric determinantal complexity in characteristic zero; it is not a border statement and not a uniform positive-characteristic theorem.
The last decade of research on the LOCAL model has seen tremendous progress in understanding locally checkable labeling (LCL) problems, culminating in an almost complete classification of the possible complexities LCL problems can exhibit. In particular, on undirected trees, Chang and Pettie showed that there is no LCL problem with complexity between $ω(\log n)$ and $n^{o(1)}$ and Chang showed that, for every positive integer $k$, there is no LCL problem with complexity between $ω(n^{1/(k+1)})$ and $o(n^{1/k})$; additionally, which side of each gap a problem is found on is decidable.
While the class of LCL problems - which, roughly speaking, consists of problems for which the correctness of a solution can be described by a finite set of allowed node configurations, which in turn can be locally verified by a constant-time algorithm - includes many important problems, it has one major restriction: problems can be defined only on bounded degree graphs, which consequently restricts all the classification and gap results mentioned above.
In this work, we propose a generalization of LCL problems to unbounded degree using Presburger monadic second-order (PMSO) formulas; more specifically, we consider what we call Local PMSO (LPMSO) problems, i.e., problems whose correct solutions are both finitely described by a PMSO formula and locally verifiable by a LOCAL algorithm in constant time - this class contains many of the important problems studied in the LOCAL model but defines them on unbounded degree graphs. As our main result we prove that, on unbounded degree rooted trees, the aforementioned $ω(\log n)$ - $n^{o(1)}$ and $ω(n^{1/(k+1)})$ - $o(n^{1/k})$ complexity gaps (and their decidability) extend to the class of LPMSO problems.
The last decade of research on the LOCAL model has seen tremendous progress in understanding locally checkable labeling (LCL) problems, culminating in an almost complete classification of the possible complexities LCL problems can exhibit. In particular, on undirected trees, Chang and Pettie showed that there is no LCL problem with complexity between $ω(\log n)$ and $n^{o(1)}$ and Chang showed that, for every positive integer $k$, there is no LCL problem with complexity between $ω(n^{1/(k+1)})$ and $o(n^{1/k})$; additionally, which side of each gap a problem is found on is decidable.
While the class of LCL problems - which, roughly speaking, consists of problems for which the correctness of a solution can be described by a finite set of allowed node configurations, which in turn can be locally verified by a constant-time algorithm - includes many important problems, it has one major restriction: problems can be defined only on bounded degree graphs, which consequently restricts all the classification and gap results mentioned above.
In this work, we propose a generalization of LCL problems to unbounded degree using Presburger monadic second-order (PMSO) formulas; more specifically, we consider what we call Local PMSO (LPMSO) problems, i.e., problems whose correct solutions are both finitely described by a PMSO formula and locally verifiable by a LOCAL algorithm in constant time - this class contains many of the important problems studied in the LOCAL model but defines them on unbounded degree graphs. As our main result we prove that, on unbounded degree rooted trees, the aforementioned $ω(\log n)$ - $n^{o(1)}$ and $ω(n^{1/(k+1)})$ - $o(n^{1/k})$ complexity gaps (and their decidability) extend to the class of LPMSO problems.
In the implicit version of a propositional proof system Q, we work with Q-proofs that are not written down directly, but are succinctly encoded by circuits. Thus implicit Q-proofs are potentially exponentially shorter than usual Q-proofs. We study narrow implicit proofs, a restricted version of this notion, in which lines in the encoded proof can only have polynomial size. We use a cut-elimination construction to show that G_{i+1} is equivalent to narrow implicit G_i, for i >= 1, where G_i is the extension of Frege allowing reasoning with Sigma^q_i quantified propositional formulas. We show that G_1 is equivalent to implicit resolution.
In the implicit version of a propositional proof system Q, we work with Q-proofs that are not written down directly, but are succinctly encoded by circuits. Thus implicit Q-proofs are potentially exponentially shorter than usual Q-proofs. We study narrow implicit proofs, a restricted version of this notion, in which lines in the encoded proof can only have polynomial size. We use a cut-elimination construction to show that G_{i+1} is equivalent to narrow implicit G_i, for i >= 1, where G_i is the extension of Frege allowing reasoning with Sigma^q_i quantified propositional formulas. We show that G_1 is equivalent to implicit resolution.
Authors: Mehmet Emre, Daniel C. Jerison, Ellen Veomett
In this article, we prove irreducibility results for a family of Markov chains arising in the study of redistricting and detecting gerrymandering. These chains use ReCom moves as their transition mechanism and are commonly employed in Markov chain Monte Carlo methods to generate ensembles of districting plans. Such ensembles are frequently used for outlier analysis, in which a proposed districting map is compared against the ensemble to determine whether it behaves atypically; this methodology often appears in expert testimony in redistricting litigation.
We show that when the underlying dual graph is a triangular subset of the triangular lattice and each district consists of two merged geographic regions, the associated ReCom chain is irreducible. This provides another entry in the very small list of known classes of ReCom chains for which irreducibility has been established.
We also demonstrate the fragility of this phenomenon by constructing an infinite family of maps for which the corresponding ReCom chain is not irreducible. Indeed, we produce a districting map that, after implementing a single ReCom move, always yields the same original map. These examples remain structurally close to the triangular lattice: they arise as subdivisions of the triangular lattice, and the resulting graphs have maximum degree at most 8.
Finally, we prove irreducibility for a further special case: the ReCom chain on a 3 x n grid graph partitioned into three districts of size n.
In this article, we prove irreducibility results for a family of Markov chains arising in the study of redistricting and detecting gerrymandering. These chains use ReCom moves as their transition mechanism and are commonly employed in Markov chain Monte Carlo methods to generate ensembles of districting plans. Such ensembles are frequently used for outlier analysis, in which a proposed districting map is compared against the ensemble to determine whether it behaves atypically; this methodology often appears in expert testimony in redistricting litigation.
We show that when the underlying dual graph is a triangular subset of the triangular lattice and each district consists of two merged geographic regions, the associated ReCom chain is irreducible. This provides another entry in the very small list of known classes of ReCom chains for which irreducibility has been established.
We also demonstrate the fragility of this phenomenon by constructing an infinite family of maps for which the corresponding ReCom chain is not irreducible. Indeed, we produce a districting map that, after implementing a single ReCom move, always yields the same original map. These examples remain structurally close to the triangular lattice: they arise as subdivisions of the triangular lattice, and the resulting graphs have maximum degree at most 8.
Finally, we prove irreducibility for a further special case: the ReCom chain on a 3 x n grid graph partitioned into three districts of size n.
Given a graph $G = (V, E)$, a signed dominating function is a function $f: V \rightarrow \{-1, 1\}$ such that for every vertex $u \in V$, $\sum\limits_{v \in N[u]} f(v) \geq 1$. The weight of $f$ is defined as $\sum\limits_{u \in V} f(u)$. The objective of the \sd{} problem is to compute a signed dominating function $f$ of minimum weight. The problem is known to be NP-complete even when restricted to bipartite, chordal, and planar graphs. In this paper, we extend the known complexity results for the \sd{} problem. Since the problem is NP-complete on chordal graphs, we study its complexity on split graphs, a subclass of chordal graphs, and show that it remains NP-complete. Moreover, as the problem is W[2]-hard parameterized by weight, we investigate its parameterized complexity with respect to structural parameters. We prove that the problem is W[1]-hard when parameterized by feedback vertex set number (and hence by treewidth and clique-width). Motivated by this hardness result, we consider more restrictive parameters, neighbourhood diversity and twin cover number, and present FPT algorithms.
Given a graph $G = (V, E)$, a signed dominating function is a function $f: V \rightarrow \{-1, 1\}$ such that for every vertex $u \in V$, $\sum\limits_{v \in N[u]} f(v) \geq 1$. The weight of $f$ is defined as $\sum\limits_{u \in V} f(u)$. The objective of the \sd{} problem is to compute a signed dominating function $f$ of minimum weight. The problem is known to be NP-complete even when restricted to bipartite, chordal, and planar graphs. In this paper, we extend the known complexity results for the \sd{} problem. Since the problem is NP-complete on chordal graphs, we study its complexity on split graphs, a subclass of chordal graphs, and show that it remains NP-complete. Moreover, as the problem is W[2]-hard parameterized by weight, we investigate its parameterized complexity with respect to structural parameters. We prove that the problem is W[1]-hard when parameterized by feedback vertex set number (and hence by treewidth and clique-width). Motivated by this hardness result, we consider more restrictive parameters, neighbourhood diversity and twin cover number, and present FPT algorithms.
Authors: Brian Bemman, Maximilien Gadouleau, Oliver W. Gnilke, George B. Mertzios
We present a simple $\mathcal{O}\left( n^2 \frac{ \log N }{ \log \log N } + N \right)$ enumeration algorithm for solving a problem from mathematical and computational music analysis where, given a strictly increasing integer sequence, $S$, with $n$ entries and maximum value $N$, the task is to enumerate all $m$ $\textit{inclusion-maximal arithmetic progressions (IMAPs)}$ in this sequence. An IMAP is a subsequence, $S' \subseteq S$ with $k>2$ integers, in which (i) the difference between any two consecutive integers is the same number, $d$ (i.e., $S'$ is an $\textit{arithmetic progression}$), (ii) $S'$ cannot be further extended to the left or to the right with any additional integers from $S$ while still remaining an arithmetic progression (i.e., $S'$ is a $\textit{maximal}$ arithmetic progression), and (iii) there is no other maximal arithmetic progression, $S'' \subseteq S$, which $\textit{properly}$ contains $S'$ (i.e., $S'$ is an $\textit{inclusion-maximal}$ arithmetic progression). We further provide proofs for the expected number of IMAPs in random integer sequences, $S$, and a bound on their order of growth. Finally, we provide empirical experiments comparing both (a) the practical running time performance of the proposed algorithm against that of a previously known algorithm which has higher time complexity $\mathcal{O}(N^{2+o(1)}n)$, and (b) the actual enumerated number of IMAPs to that of their mathematically expected number. Notably, the proposed algorithm demonstrates a significant improvement in running time over the previously known algorithm, and in immediate practical applications, will allow for more efficient analysis of large and rhythmically complex musical pieces.
We present a simple $\mathcal{O}\left( n^2 \frac{ \log N }{ \log \log N } + N \right)$ enumeration algorithm for solving a problem from mathematical and computational music analysis where, given a strictly increasing integer sequence, $S$, with $n$ entries and maximum value $N$, the task is to enumerate all $m$ $\textit{inclusion-maximal arithmetic progressions (IMAPs)}$ in this sequence. An IMAP is a subsequence, $S' \subseteq S$ with $k>2$ integers, in which (i) the difference between any two consecutive integers is the same number, $d$ (i.e., $S'$ is an $\textit{arithmetic progression}$), (ii) $S'$ cannot be further extended to the left or to the right with any additional integers from $S$ while still remaining an arithmetic progression (i.e., $S'$ is a $\textit{maximal}$ arithmetic progression), and (iii) there is no other maximal arithmetic progression, $S'' \subseteq S$, which $\textit{properly}$ contains $S'$ (i.e., $S'$ is an $\textit{inclusion-maximal}$ arithmetic progression). We further provide proofs for the expected number of IMAPs in random integer sequences, $S$, and a bound on their order of growth. Finally, we provide empirical experiments comparing both (a) the practical running time performance of the proposed algorithm against that of a previously known algorithm which has higher time complexity $\mathcal{O}(N^{2+o(1)}n)$, and (b) the actual enumerated number of IMAPs to that of their mathematically expected number. Notably, the proposed algorithm demonstrates a significant improvement in running time over the previously known algorithm, and in immediate practical applications, will allow for more efficient analysis of large and rhythmically complex musical pieces.
Authors: Albert Gong, Annabelle Michael Carrell, Raaz Dwivedi, Lester Mackey
We introduce a new tool, Express, for converting a non-causal attention approximation into a causal approximation with matching approximation guarantees. When combined with the state-of-the-art Thinformer approximation, Express improves upon the best known causal attention guarantees, delivering $\log^{3/2}(n)/s$ approximation error with only $O(s)$ memory and $O(s^2 \log^2(n))$ compression overhead for a sequence of length $n$. We pair these developments with an efficient I/O-aware Triton implementation, demonstrate substantial speedups over FlashAttention 2, and use Express to overcome four resource bottlenecks in the language modeling pipeline: long-context prefill, KV cache compression, long-form memory-constrained decoding, and long-form compute-constrained decoding.
We introduce a new tool, Express, for converting a non-causal attention approximation into a causal approximation with matching approximation guarantees. When combined with the state-of-the-art Thinformer approximation, Express improves upon the best known causal attention guarantees, delivering $\log^{3/2}(n)/s$ approximation error with only $O(s)$ memory and $O(s^2 \log^2(n))$ compression overhead for a sequence of length $n$. We pair these developments with an efficient I/O-aware Triton implementation, demonstrate substantial speedups over FlashAttention 2, and use Express to overcome four resource bottlenecks in the language modeling pipeline: long-context prefill, KV cache compression, long-form memory-constrained decoding, and long-form compute-constrained decoding.
How much voter input is necessary in order to ensure representation in multiwinner elections? If voters are randomly selected from an underlying population, how many draws are necessary to find a proportional committee of $k$ candidates, with high probability?
Sample-based adaptations of standard multiwinner voting rules that satisfy the justified representation (JR) proportionality axiom use $\tilde O(k^5 \log \frac{m}δ)$ sampled approval ballots over $m$ candidates, where $δ$ is a probability of failure and $\tilde O$ suppresses $\mathrm{polylog}(k)$ factors. We present a rule for which the sample complexity of JR-family proportional committee selection is $\tilde O(k^{4}\log \frac{m}δ)$. This separates the sample complexity of JR from that of the natural corresponding additive approximation to the voter coverage (Chamberlin-Courant) objective, which we show requires $Θ(k^5\log \frac{m}δ)$ samples.
For lower bounds, we present a family of instances with $m, \frac{1}δ \in \mathrm{poly}(k)$ for which $Ω(k^3)$ sampled ballots are necessary in order to identify a JR committee. We also show a dependence on $\log m$ is necessary. This lower bound is versatile, and also applies to Hare proportionality for solid coalitions (PSC) for ranked ballots. Unfortunately, no number of sampled ballots suffices to satisfy the slightly stronger Droop JR and Droop PSC axioms with high probability. But mild relaxations of JR require fewer samples, as do the beyond-worst-case domains and actual approval preferences we evaluate.
How much voter input is necessary in order to ensure representation in multiwinner elections? If voters are randomly selected from an underlying population, how many draws are necessary to find a proportional committee of $k$ candidates, with high probability?
Sample-based adaptations of standard multiwinner voting rules that satisfy the justified representation (JR) proportionality axiom use $\tilde O(k^5 \log \frac{m}δ)$ sampled approval ballots over $m$ candidates, where $δ$ is a probability of failure and $\tilde O$ suppresses $\mathrm{polylog}(k)$ factors. We present a rule for which the sample complexity of JR-family proportional committee selection is $\tilde O(k^{4}\log \frac{m}δ)$. This separates the sample complexity of JR from that of the natural corresponding additive approximation to the voter coverage (Chamberlin-Courant) objective, which we show requires $Θ(k^5\log \frac{m}δ)$ samples.
For lower bounds, we present a family of instances with $m, \frac{1}δ \in \mathrm{poly}(k)$ for which $Ω(k^3)$ sampled ballots are necessary in order to identify a JR committee. We also show a dependence on $\log m$ is necessary. This lower bound is versatile, and also applies to Hare proportionality for solid coalitions (PSC) for ranked ballots. Unfortunately, no number of sampled ballots suffices to satisfy the slightly stronger Droop JR and Droop PSC axioms with high probability. But mild relaxations of JR require fewer samples, as do the beyond-worst-case domains and actual approval preferences we evaluate.
Minimum-weight decoding for two-dimensional color codes is NP-hard (Walters and Turner 2026), motivating the search for approximation guarantees beyond worst-case exact decoding. We study a block-based decoder for triangular color-code lattices. The decoder satisfies the deterministic additive guarantee \(\lvert E_{\mathrm{alg}}\rvert \leq \operatorname{OPT}(S)+O(n/τ)\), where \(n\) is the number of vertices and \(τ\) is the wall spacing. We show that this additive guarantee becomes a near-optimal multiplicative guarantee under natural noise models. For constant-rate i.i.d. face noise and constant local degree, choosing \(τ=Θ(ε^{-1})\) gives a \((1+ε)\)-approximation with probability \(1-\exp(-Ω(n))\), in time \(n2^{O(ε^{-1})}\). We also prove a smoothed analogue: the same near-optimality guarantee holds when an arbitrary adversarial error pattern is perturbed by independent constant-rate noise. Finally, in the low-probability regime \(p=o(1/\log^2 n)\), the syndrome decomposes into small active regions with high probability, allowing independent component-wise decoding and yielding an exact minimum-weight correction in time \(n2^{O((\log n)^{3/2})}\). These results show that, despite worst-case hardness, color-code decoding admits strong average-case, smoothed, and sparse-regime guarantees.
Minimum-weight decoding for two-dimensional color codes is NP-hard (Walters and Turner 2026), motivating the search for approximation guarantees beyond worst-case exact decoding. We study a block-based decoder for triangular color-code lattices. The decoder satisfies the deterministic additive guarantee \(\lvert E_{\mathrm{alg}}\rvert \leq \operatorname{OPT}(S)+O(n/τ)\), where \(n\) is the number of vertices and \(τ\) is the wall spacing. We show that this additive guarantee becomes a near-optimal multiplicative guarantee under natural noise models. For constant-rate i.i.d. face noise and constant local degree, choosing \(τ=Θ(ε^{-1})\) gives a \((1+ε)\)-approximation with probability \(1-\exp(-Ω(n))\), in time \(n2^{O(ε^{-1})}\). We also prove a smoothed analogue: the same near-optimality guarantee holds when an arbitrary adversarial error pattern is perturbed by independent constant-rate noise. Finally, in the low-probability regime \(p=o(1/\log^2 n)\), the syndrome decomposes into small active regions with high probability, allowing independent component-wise decoding and yielding an exact minimum-weight correction in time \(n2^{O((\log n)^{3/2})}\). These results show that, despite worst-case hardness, color-code decoding admits strong average-case, smoothed, and sparse-regime guarantees.
Authors: Badih Ghazi, Cristóbal Guzmán, Pritish Kamath, Alexander Knop, Ravi Kumar, Pasin Manurangsi
We study the problem of generating synthetic data under differential privacy. We establish fixed-parameter tractability (FPT) for this problem where the parameter is the treewidth of the query family's incidence graph. Our algorithms attain optimal error rates across all regimes and are realized by two different approaches: the first is based on linear programming (LP) and the FPT of the separation problem for the LP dual; the second is based on a subsampled private multiplicative weights method, where we obtain FPT for sampling from Gibbs distributions. Both approaches are unified by a dynamic programming framework over a tree decomposition.
We study the problem of generating synthetic data under differential privacy. We establish fixed-parameter tractability (FPT) for this problem where the parameter is the treewidth of the query family's incidence graph. Our algorithms attain optimal error rates across all regimes and are realized by two different approaches: the first is based on linear programming (LP) and the FPT of the separation problem for the LP dual; the second is based on a subsampled private multiplicative weights method, where we obtain FPT for sampling from Gibbs distributions. Both approaches are unified by a dynamic programming framework over a tree decomposition.
The symmetric determinantal complexity $\sdc(f)$ of a polynomial $f$ is the
least $m$ such that $f=\Det(M)$ for an $m\times m$ symmetric matrix $M$ of
affine-linear forms. We prove, over $\CC$, that
\[
\sdc\!\left(\sum_{i=1}^n x_i^n\right)
\ge \left(\frac{1}{2e}-o(1)\right)n^2 .
\]
The result is a symmetric companion to the author's non-symmetric polar-degree
preprint~\cite{SheshadriArxiv}. The method parallels that work, but the proof
below is self-contained and redoes the load-bearing local incidence analysis in
the symmetric setting. The general theorem is the following. If
$X=V(f)\subset\PP^{N-1}$ is a smooth degree-$d$ hypersurface, $N\ge3$, and
$f=\Det(A_0+\sum_{i=1}^N x_iA_i)$ with all $A_i$ symmetric of size $m$, then
\[
\pdeg_{\mathrm{top}}(X)=d(d-1)^{N-2}
\le 2^{N-2}\binom{m}{N-1}.
\]
The proof uses the symmetric rank-one kernel incidence
$\Mcal(z,x)u=0$, where $\Mcal=zA_0+\sum_i x_iA_i$. At a genuine polar point,
$\Mcal$ has rank $m-1$, and the symmetric local normal form
\[
\Mcal=\begin{pmatrix}B&c\\ c^{\mathsf T}&s\end{pmatrix},\qquad
\det B\in\OO^\times,
\]
eliminates the unique projective kernel line scheme-theoretically:
$u=(-B^{-1}c,1)$ and $\det\Mcal=(\det B)(s-c^{\mathsf T}B^{-1}c)$. On this
local graph, $\adj(\Mcal)=(\det B)uu^{\mathsf T}$ along the determinant
hypersurface, so the lifted conormal forms $u^{\mathsf T}A_i u$ are a common
unit multiple of the ordinary partial derivatives $\partial_i f$. Hence the
lifted polar equations cut the ordinary polar slice, up to units, and every
genuine lifted polar point is a zero-dimensional scheme-theoretic isolated
solution. Multihomogeneous Bezout on
$\PP^N_{[z:x]}\times\PP^{m-1}_{[u]}$ then gives
\[
[H^N U^{m-1}]\,H(H+U)^m(2U)^{N-2}
=2^{N-2}\binom{m}{N-1}.
\]
For $F_n=\sum_i x_i^n$ this bounds $n(n-1)^{n-2}$ and yields the stated
constant $1/(2e)$. More generally, for
$F_{N,d}=\sum_{i=1}^N x_i^d$ the same theorem gives
$\sdc(F_{N,d})\ge(1/(2e)-o_N(1))N(d-1)$ as $N\to\infty$, uniformly for
$d\ge2$. We also give an explicit symmetric determinantal representation of
$F_{N,d}$ of size $2N(d+1)+1$, showing that the diagonal lower bounds are
non-vacuous and tight up to a constant factor. The result is for exact
symmetric determinantal complexity in characteristic zero; it is not a
border-complexity statement and it is not a uniform positive-characteristic
theorem.
The symmetric determinantal complexity $\sdc(f)$ of a polynomial $f$ is the
least $m$ such that $f=\Det(M)$ for an $m\times m$ symmetric matrix $M$ of
affine-linear forms. We prove, over $\CC$, that
\[
\sdc\!\left(\sum_{i=1}^n x_i^n\right)
\ge \left(\frac{1}{2e}-o(1)\right)n^2 .
\]
The result is a symmetric companion to the author's non-symmetric polar-degree
preprint~\cite{SheshadriArxiv}. The method parallels that work, but the proof
below is self-contained and redoes the load-bearing local incidence analysis in
the symmetric setting. The general theorem is the following. If
$X=V(f)\subset\PP^{N-1}$ is a smooth degree-$d$ hypersurface, $N\ge3$, and
$f=\Det(A_0+\sum_{i=1}^N x_iA_i)$ with all $A_i$ symmetric of size $m$, then
\[
\pdeg_{\mathrm{top}}(X)=d(d-1)^{N-2}
\le 2^{N-2}\binom{m}{N-1}.
\]
The proof uses the symmetric rank-one kernel incidence
$\Mcal(z,x)u=0$, where $\Mcal=zA_0+\sum_i x_iA_i$. At a genuine polar point,
$\Mcal$ has rank $m-1$, and the symmetric local normal form
\[
\Mcal=\begin{pmatrix}B&c\\ c^{\mathsf T}&s\end{pmatrix},\qquad
\det B\in\OO^\times,
\]
eliminates the unique projective kernel line scheme-theoretically:
$u=(-B^{-1}c,1)$ and $\det\Mcal=(\det B)(s-c^{\mathsf T}B^{-1}c)$. On this
local graph, $\adj(\Mcal)=(\det B)uu^{\mathsf T}$ along the determinant
hypersurface, so the lifted conormal forms $u^{\mathsf T}A_i u$ are a common
unit multiple of the ordinary partial derivatives $\partial_i f$. Hence the
lifted polar equations cut the ordinary polar slice, up to units, and every
genuine lifted polar point is a zero-dimensional scheme-theoretic isolated
solution. Multihomogeneous Bezout on
$\PP^N_{[z:x]}\times\PP^{m-1}_{[u]}$ then gives
\[
[H^N U^{m-1}]\,H(H+U)^m(2U)^{N-2}
=2^{N-2}\binom{m}{N-1}.
\]
For $F_n=\sum_i x_i^n$ this bounds $n(n-1)^{n-2}$ and yields the stated
constant $1/(2e)$. More generally, for
$F_{N,d}=\sum_{i=1}^N x_i^d$ the same theorem gives
$\sdc(F_{N,d})\ge(1/(2e)-o_N(1))N(d-1)$ as $N\to\infty$, uniformly for
$d\ge2$. We also give an explicit symmetric determinantal representation of
$F_{N,d}$ of size $2N(d+1)+1$, showing that the diagonal lower bounds are
non-vacuous and tight up to a constant factor. The result is for exact
symmetric determinantal complexity in characteristic zero; it is not a
border-complexity statement and it is not a uniform positive-characteristic
theorem.
The model of interactive oracle proofs (IOP) generalizes the notion of probabilistically checkable proof (PCP), in which a static proof is verified probabilistically by querying a small number of bits, to the interactive setting: a polynomial-time verifier interacts with an unbounded prover, but is restricted to only reading a small number of bits, in total, from the messages sent by the prover. IOPs provide a relaxed setting in which to study local probabilistic verification. They have proved instrumental in devising efficient methods for verification through subsequent compilation into non-interactive or succinct protocols.
We study a quantum analogue of interactive oracle proofs (qIOP) in which the verifier and communication are both allowed to be quantum; yet the verifier is restricted to perform measurements only on a small number of qubits received from the prover. Our main result is a qIOP for any language in QMA, in which the total communication is polynomial but the verifier only reads a polylogarithmic number of qubits in total. The protocol has completeness parameter exponentially close to $1$ and soundness bounded away from $1$ by a constant. In the absence of a quantum PCP theorem, this provides the first information-theoretically sound local and robust characterization of QMA, albeit interactive.
Our protocol combines the use of a quantum locally testable code (LTC) with classical techniques, notably probabilistically checkable proofs of proximity (PCPP). We avoid the necessity for complex multi-qubit tests employed in other settings by leveraging the local indistinguishability property of the quantum LTC.
The model of interactive oracle proofs (IOP) generalizes the notion of probabilistically checkable proof (PCP), in which a static proof is verified probabilistically by querying a small number of bits, to the interactive setting: a polynomial-time verifier interacts with an unbounded prover, but is restricted to only reading a small number of bits, in total, from the messages sent by the prover. IOPs provide a relaxed setting in which to study local probabilistic verification. They have proved instrumental in devising efficient methods for verification through subsequent compilation into non-interactive or succinct protocols.
We study a quantum analogue of interactive oracle proofs (qIOP) in which the verifier and communication are both allowed to be quantum; yet the verifier is restricted to perform measurements only on a small number of qubits received from the prover. Our main result is a qIOP for any language in QMA, in which the total communication is polynomial but the verifier only reads a polylogarithmic number of qubits in total. The protocol has completeness parameter exponentially close to $1$ and soundness bounded away from $1$ by a constant. In the absence of a quantum PCP theorem, this provides the first information-theoretically sound local and robust characterization of QMA, albeit interactive.
Our protocol combines the use of a quantum locally testable code (LTC) with classical techniques, notably probabilistically checkable proofs of proximity (PCPP). We avoid the necessity for complex multi-qubit tests employed in other settings by leveraging the local indistinguishability property of the quantum LTC.
We study the $t$-uniform hypergraphicality problem under a compressed representation of the degree sequence. Instead of listing all vertex degrees explicitly, the input consists of pairs $$ (δ_1,n_1),\dots,(δ_k,n_k), $$ meaning that exactly $n_i$ vertices have degree $δ_i$. Thus the parameter $k$ denotes the number of distinct degrees.
Although deciding $t$-hypergraphicality is NP-complete for every fixed $t>2$, we prove that the problem is fixed-parameter tractable parameterized by $(k,t)$. Our result shows that tractability extends substantially beyond previously known bounded-range regimes: even degree sequences with large overall degree spread can be handled efficiently when the number of distinct degrees is bounded.
Our approach decomposes hyperedges according to their types with respect to the degree classes, yielding a bounded-dimension spectrum representation. Using balancing hinge-flips, we show that every feasible spectrum can be transformed into a realization of the prescribed degree sequence. This leads to an integer programming feasibility formulation with $$ \binom{t+k-1}{k-1} $$ variables. Applying Lenstra's theorem yields an FPT algorithm running in time $$ f(k,t)\cdot \mathrm{poly}(L), $$ where $L$ denotes the encoding length of the compressed input.
We study the $t$-uniform hypergraphicality problem under a compressed representation of the degree sequence. Instead of listing all vertex degrees explicitly, the input consists of pairs $$ (δ_1,n_1),\dots,(δ_k,n_k), $$ meaning that exactly $n_i$ vertices have degree $δ_i$. Thus the parameter $k$ denotes the number of distinct degrees.
Although deciding $t$-hypergraphicality is NP-complete for every fixed $t>2$, we prove that the problem is fixed-parameter tractable parameterized by $(k,t)$. Our result shows that tractability extends substantially beyond previously known bounded-range regimes: even degree sequences with large overall degree spread can be handled efficiently when the number of distinct degrees is bounded.
Our approach decomposes hyperedges according to their types with respect to the degree classes, yielding a bounded-dimension spectrum representation. Using balancing hinge-flips, we show that every feasible spectrum can be transformed into a realization of the prescribed degree sequence. This leads to an integer programming feasibility formulation with $$ \binom{t+k-1}{k-1} $$ variables. Applying Lenstra's theorem yields an FPT algorithm running in time $$ f(k,t)\cdot \mathrm{poly}(L), $$ where $L$ denotes the encoding length of the compressed input.
We present a quantum algorithm for the Kravchuk transform that scales logarithmically in both the dimension and the inverse of the error parameter. The quantum Kravchuk transform maps computational basis states to states with amplitudes proportional to Kravchuk functions. We achieve this by combining two key techniques: the structural relationship between the Kravchuk transform and the Lie algebras $\mathfrak{su}(2)$, and a recent fast-forwarding simulation method for $\mathfrak{su}(2)$ operators in the oscillator representation. More precisely, we first establish the map from Kravchuk transform in computational basis to $\mathfrak{su}(2)$ in Fock basis. Then built on this connection, we apply the fast-forwarding to achieve an efficient quantum Kravchuk transform.
We present a quantum algorithm for the Kravchuk transform that scales logarithmically in both the dimension and the inverse of the error parameter. The quantum Kravchuk transform maps computational basis states to states with amplitudes proportional to Kravchuk functions. We achieve this by combining two key techniques: the structural relationship between the Kravchuk transform and the Lie algebras $\mathfrak{su}(2)$, and a recent fast-forwarding simulation method for $\mathfrak{su}(2)$ operators in the oscillator representation. More precisely, we first establish the map from Kravchuk transform in computational basis to $\mathfrak{su}(2)$ in Fock basis. Then built on this connection, we apply the fast-forwarding to achieve an efficient quantum Kravchuk transform.
We observe that the Kronecker product of tensors is the operation that converts the determinant polynomial into Cayley's first hyperdeterminant. We apply the Kronecker product to iterated matrix multiplication, which results in the hypercomputant, a VNP-complete and VW[1]-complete polynomial whose hardness we prove via the equivariance of the Kronecker product. The construction works over arbitrary commutative semirings and also for the tensor algebra and the exterior algebra. For the tensor algebra this gives a version of "noncommutative VNP", and for polynomials over the nonnegative real numbers this gives a version of "monotone VNP", each with the hypercomputant as the complete object. We take a parameterized complexity viewpoint and compare the noncommutative setting and the monotone setting. Using standard techniques we obtain optimal algebraic branching program width lower bounds in both settings, and these are notably not always the same. We also prove the polystability of the hypercomputant and that its isotypic components are characterized by their stabilizer.
We observe that the Kronecker product of tensors is the operation that converts the determinant polynomial into Cayley's first hyperdeterminant. We apply the Kronecker product to iterated matrix multiplication, which results in the hypercomputant, a VNP-complete and VW[1]-complete polynomial whose hardness we prove via the equivariance of the Kronecker product. The construction works over arbitrary commutative semirings and also for the tensor algebra and the exterior algebra. For the tensor algebra this gives a version of "noncommutative VNP", and for polynomials over the nonnegative real numbers this gives a version of "monotone VNP", each with the hypercomputant as the complete object. We take a parameterized complexity viewpoint and compare the noncommutative setting and the monotone setting. Using standard techniques we obtain optimal algebraic branching program width lower bounds in both settings, and these are notably not always the same. We also prove the polystability of the hypercomputant and that its isotypic components are characterized by their stabilizer.
Order types are an equivalence relation between point configurations that capture their combinatorial and convexity properties. Let $P$ be a $κ$-colored sequence of $n \ge d+1$ points in general position in $\mathbb{R}^d$. Let $ρ$ be a $κ$-colored order type on $k \le d+1$ points that has positive density on $P$; that is, for some constant $δ>0$, there are $δ\cdot \binom{n}{k}$ $k$-point subsequences of $P$ that have the same order type as $ρ$ and the same color pattern. In this paper we show that there exists a constant $c >0$ (depending only on $d, δ$, $k$ and $κ$) and disjoint subsets $X_1,\dots,X_k$ of $P$, each with at least $c \cdot n$ points, such that for every choice of $k$ points $x_i \in X_i$, $(x_1,\dots,x_k)$ has the same order type and color pattern as $ρ$.
Order types are an equivalence relation between point configurations that capture their combinatorial and convexity properties. Let $P$ be a $κ$-colored sequence of $n \ge d+1$ points in general position in $\mathbb{R}^d$. Let $ρ$ be a $κ$-colored order type on $k \le d+1$ points that has positive density on $P$; that is, for some constant $δ>0$, there are $δ\cdot \binom{n}{k}$ $k$-point subsequences of $P$ that have the same order type as $ρ$ and the same color pattern. In this paper we show that there exists a constant $c >0$ (depending only on $d, δ$, $k$ and $κ$) and disjoint subsets $X_1,\dots,X_k$ of $P$, each with at least $c \cdot n$ points, such that for every choice of $k$ points $x_i \in X_i$, $(x_1,\dots,x_k)$ has the same order type and color pattern as $ρ$.
Authors: Maria Constantin, Adrian Miclăuş, Alexandru Popa, Andrei Popa
Given a finite set of integers $A$, a \emph{unary translocation} produces a new set $A' = A \cup \{u,v\}$, where $u$ and $v$ are nonnegative integers satisfying $x+y=u+v$ for some $x,y\in A$. For an input set $A$ and a target set $B$, the \emph{unary translocation distance} is the minimum number of unary translocations required to obtain a superset containing $B$. In this paper, we study this problem from both theoretical and computational perspectives. We prove that computing the unary translocation distance is strongly NP-hard, thereby answering an open question raised by \citet{ConstantinMiclausPopa2026UnaryTranslocation}. On the positive side, we give an exact pseudo-polynomial algorithm for every fixed constant value of $|B|$, extending our previous results for $|B|\leq 2$. For arbitrary target sets, we present a $2$-approximation algorithm, an additive $(|B|-1)$-approximation algorithm, and show that the additive algorithm also yields a $3$-approximation. We also propose parameterized algorithms, including algorithms parameterized by the maximum value in the input set together with the optimum distance, and by the maximum value in the target set together with $|B|$. In addition, we propose an integer linear programming formulation that gives an exact mathematical model for the problem, analyze its size, and show that the LP relaxation has integrality gap at least $\frac{4}{3}$. Finally, we report computational experiments comparing the $2$-approximation algorithm, beam search, and simulated annealing. The results show that the approximation algorithm is highly effective in practice and often outperforms the heuristic baselines.
Given a finite set of integers $A$, a \emph{unary translocation} produces a new set $A' = A \cup \{u,v\}$, where $u$ and $v$ are nonnegative integers satisfying $x+y=u+v$ for some $x,y\in A$. For an input set $A$ and a target set $B$, the \emph{unary translocation distance} is the minimum number of unary translocations required to obtain a superset containing $B$. In this paper, we study this problem from both theoretical and computational perspectives. We prove that computing the unary translocation distance is strongly NP-hard, thereby answering an open question raised by \citet{ConstantinMiclausPopa2026UnaryTranslocation}. On the positive side, we give an exact pseudo-polynomial algorithm for every fixed constant value of $|B|$, extending our previous results for $|B|\leq 2$. For arbitrary target sets, we present a $2$-approximation algorithm, an additive $(|B|-1)$-approximation algorithm, and show that the additive algorithm also yields a $3$-approximation. We also propose parameterized algorithms, including algorithms parameterized by the maximum value in the input set together with the optimum distance, and by the maximum value in the target set together with $|B|$. In addition, we propose an integer linear programming formulation that gives an exact mathematical model for the problem, analyze its size, and show that the LP relaxation has integrality gap at least $\frac{4}{3}$. Finally, we report computational experiments comparing the $2$-approximation algorithm, beam search, and simulated annealing. The results show that the approximation algorithm is highly effective in practice and often outperforms the heuristic baselines.
The $k$-center problem is one of the best-studied and most intuitive clustering formulations. It asks, given a set of $n$ points in a metric space, for $k$ of the points to be designated as cluster centers, so that the maximum distance of an input point to its nearest center is minimized. Gonzalez's greedy algorithm from 1985 is a simple and efficient way to find a $2$-approximate solution. The algorithm has the attractive feature of \emph{incrementality}: it outputs the centers one by one, with a guaranteed $2$-approximation for every prefix of the obtained sequence of centers.
Incrementality imposes a geometric constraint on how solutions can be built, and it is natural to ask whether this comes at a price in the quality of the solution. It is known that in polynomial time, the approximation ratio of $2$ is best possible, assuming $P \neq NP$. In this paper we show that even with \emph{unlimited} computational power, the factor $2$ cannot be improved, if the solution is required to be built incrementally. The lower bound construction imposes a tradeoff between all $n$ levels of the clustering simultaneously; it was obtained with the help of ChatGPT, an aspect we discuss in Section 3 of the paper.
The $k$-center problem is one of the best-studied and most intuitive clustering formulations. It asks, given a set of $n$ points in a metric space, for $k$ of the points to be designated as cluster centers, so that the maximum distance of an input point to its nearest center is minimized. Gonzalez's greedy algorithm from 1985 is a simple and efficient way to find a $2$-approximate solution. The algorithm has the attractive feature of \emph{incrementality}: it outputs the centers one by one, with a guaranteed $2$-approximation for every prefix of the obtained sequence of centers.
Incrementality imposes a geometric constraint on how solutions can be built, and it is natural to ask whether this comes at a price in the quality of the solution. It is known that in polynomial time, the approximation ratio of $2$ is best possible, assuming $P \neq NP$. In this paper we show that even with \emph{unlimited} computational power, the factor $2$ cannot be improved, if the solution is required to be built incrementally. The lower bound construction imposes a tradeoff between all $n$ levels of the clustering simultaneously; it was obtained with the help of ChatGPT, an aspect we discuss in Section 3 of the paper.
Authors: Anupam Gupta, Benjamin Moseley, Rudy Zhou
We introduce a stochastic probing problem with correlated items. In our model, which we call Bayesian Probing, the correlations are modeled by an underlying graph $G$. Each vertex is independently active with a known probability. Each item corresponds to an edge in the graph. Probing an edge has some cost, gives some reward if both endpoints are active, and also reveals the state of its endpoints. Hence a probe induces a Bayesian update on the remaining edges. The goal is to adaptively probe items/edges subject to a knapsack constraint to maximize the expected total reward obtained from the probed edges.
Bayesian Probing generalizes stochastic knapsack and stochastic probing by allowing correlations between items. Moreover, it gives a tractable model for the Bayesian Active Search problem, a popular problem considered in the machine learning community. In Bayesian Active Search, the goal is to find items in a particular class by adaptively probing at most, say $k$, items. Given a prior distribution over items, we want to compute a Bayesian policy to maximize the number of such items found. For this general problem with arbitrary priors, there are strong lower bounds on efficiently computing good policies.
In this paper, we design efficient approximation algorithms for Bayesian Probing. These results give the first efficient approximation algorithms for Bayesian Active Search, for a class of practically-relevant prior distributions.
We introduce a stochastic probing problem with correlated items. In our model, which we call Bayesian Probing, the correlations are modeled by an underlying graph $G$. Each vertex is independently active with a known probability. Each item corresponds to an edge in the graph. Probing an edge has some cost, gives some reward if both endpoints are active, and also reveals the state of its endpoints. Hence a probe induces a Bayesian update on the remaining edges. The goal is to adaptively probe items/edges subject to a knapsack constraint to maximize the expected total reward obtained from the probed edges.
Bayesian Probing generalizes stochastic knapsack and stochastic probing by allowing correlations between items. Moreover, it gives a tractable model for the Bayesian Active Search problem, a popular problem considered in the machine learning community. In Bayesian Active Search, the goal is to find items in a particular class by adaptively probing at most, say $k$, items. Given a prior distribution over items, we want to compute a Bayesian policy to maximize the number of such items found. For this general problem with arbitrary priors, there are strong lower bounds on efficiently computing good policies.
In this paper, we design efficient approximation algorithms for Bayesian Probing. These results give the first efficient approximation algorithms for Bayesian Active Search, for a class of practically-relevant prior distributions.
Authors: Arpon Basu, Joshua Brakensiek, Pravesh K. Kothari, Aaron Putterman
In this paper, we continue a line of research initiated by Basu, Brakensiek, and Putterman [2026] studying the sparsifiability of Hamiltonians. We focus particularly on the sparsifiability of the widely-studied Quantum Cut (QC) Hamiltonians. Our main result is that in an $n$-qubit system, any $n$-qubit QC Hamiltonian can be sparsified to $\widetilde{O}(n /\varepsilon^2)$ many terms while preserving the energy of every state up to a factor of $1 \pm \varepsilon$. Our result can be interpreted as giving an importance sampling scheme for the edges of an arbitrary graph $G$ such that the \emph{Kikuchi} graph at level $\ell$ of the sampled graph is a spectral approximation to the Kikuchi graph of $G$. Importantly, the \emph{same} sampling scheme works simultaneously for all $\ell$.
The natural approach of leverage score sampling, analyzed via matrix concentration inequalities, yields a polynomially worse bound in our setting because the underlying matrices have dimension $\sim 2^n$. Instead, our approach relies on decomposing the action of these matrices into invariant subspaces. Then, by using an operator-valued inequality of Alon and Kozma [Ann. Henri Poincaré, 2020], itself building on an \emph{octopus inequality} of Caputo, Liggett, and Richthammer [J. AMS, 2010], we extend our sparsification technique to all expander graphs. We then invoke expander decomposition to extend our sparsifier to all graphs.
In this paper, we continue a line of research initiated by Basu, Brakensiek, and Putterman [2026] studying the sparsifiability of Hamiltonians. We focus particularly on the sparsifiability of the widely-studied Quantum Cut (QC) Hamiltonians. Our main result is that in an $n$-qubit system, any $n$-qubit QC Hamiltonian can be sparsified to $\widetilde{O}(n /\varepsilon^2)$ many terms while preserving the energy of every state up to a factor of $1 \pm \varepsilon$. Our result can be interpreted as giving an importance sampling scheme for the edges of an arbitrary graph $G$ such that the \emph{Kikuchi} graph at level $\ell$ of the sampled graph is a spectral approximation to the Kikuchi graph of $G$. Importantly, the \emph{same} sampling scheme works simultaneously for all $\ell$.
The natural approach of leverage score sampling, analyzed via matrix concentration inequalities, yields a polynomially worse bound in our setting because the underlying matrices have dimension $\sim 2^n$. Instead, our approach relies on decomposing the action of these matrices into invariant subspaces. Then, by using an operator-valued inequality of Alon and Kozma [Ann. Henri Poincaré, 2020], itself building on an \emph{octopus inequality} of Caputo, Liggett, and Richthammer [J. AMS, 2010], we extend our sparsification technique to all expander graphs. We then invoke expander decomposition to extend our sparsifier to all graphs.
Authors: Peter Sanders, Matthias Schimek, Tim Niklas Uhl, Thomas Weidmann
The list ranking problem is one of the classical problems of parallel computing, with nontrivial algorithms and many applications as a subroutine for solving other problems. While it has been intensively studied in the early days of parallel computing, few things happened in the last 20 years. In particular, there is little work on scaling list ranking to large machines and input sizes. We reconsider list ranking starting from the ground-breaking results of Sibeyn a quarter century ago. We employ algorithm and performance engineering to improve his sparse ruling-set algorithm, making it capable of scaling to many processors, and provide a more detailed analysis of the impact of the algorithm's parameters, further guiding our practical implementation.
We perform an extensive experimental study across a variety of input instances with different structural properties. We demonstrate that indirect communication, exploiting input locality, and message coalescing allows scaling to billions of elements on up to 24,576 cores.
The list ranking problem is one of the classical problems of parallel computing, with nontrivial algorithms and many applications as a subroutine for solving other problems. While it has been intensively studied in the early days of parallel computing, few things happened in the last 20 years. In particular, there is little work on scaling list ranking to large machines and input sizes. We reconsider list ranking starting from the ground-breaking results of Sibeyn a quarter century ago. We employ algorithm and performance engineering to improve his sparse ruling-set algorithm, making it capable of scaling to many processors, and provide a more detailed analysis of the impact of the algorithm's parameters, further guiding our practical implementation.
We perform an extensive experimental study across a variety of input instances with different structural properties. We demonstrate that indirect communication, exploiting input locality, and message coalescing allows scaling to billions of elements on up to 24,576 cores.
Authors: Amir Tonta, Bernhard Seeger, Eljas Soisalon-Soininen
Multiversion concurrency control (MVCC) enables scans to read data from a committed snapshot (version), reducing conflicts with write operations compared to traditional concurrency approaches. Currently, versioned records are often managed in a B$^+$-tree using version chains. However, version chains introduce overhead during scans and can still lead to conflicts between scans and writers. The multiversion B-tree (MVBT) was designed for optimal range scan performance on arbitrary versions, but has been considered impractical due to its structural complexity and, until recently, the lack of effective concurrency control. In this paper, we present the concurrent MVBT (cMVBT), a redesign of the MVBT featuring a novel concurrency control protocol that uses optimistic latches for write operations and requires no latches for range scans, while preserving all the optimality guarantees of the original MVBT. Additionally, cMVBT supports continuous garbage collection without activity spikes, seamlessly integrating free-space management. Experiments with mixed workloads derived from a standard benchmark show that the cMVBT achieves low overhead, high write throughput, and excellent range scan performance, outperforming state-of-the-art methods based on version chains.
Multiversion concurrency control (MVCC) enables scans to read data from a committed snapshot (version), reducing conflicts with write operations compared to traditional concurrency approaches. Currently, versioned records are often managed in a B$^+$-tree using version chains. However, version chains introduce overhead during scans and can still lead to conflicts between scans and writers. The multiversion B-tree (MVBT) was designed for optimal range scan performance on arbitrary versions, but has been considered impractical due to its structural complexity and, until recently, the lack of effective concurrency control. In this paper, we present the concurrent MVBT (cMVBT), a redesign of the MVBT featuring a novel concurrency control protocol that uses optimistic latches for write operations and requires no latches for range scans, while preserving all the optimality guarantees of the original MVBT. Additionally, cMVBT supports continuous garbage collection without activity spikes, seamlessly integrating free-space management. Experiments with mixed workloads derived from a standard benchmark show that the cMVBT achieves low overhead, high write throughput, and excellent range scan performance, outperforming state-of-the-art methods based on version chains.
Authors: Vladimir Braverman, Chen Wang, Liudeng Wang, Samson Zhou
Motivated by the recency effect in online learning, we study algorithms for single-pass *sliding-window streaming multi-armed bandits (MABs)* in this paper. In this setting, we are given $n$ arms with unknown sub-Gaussian reward distributions and a parameter $W$. The arms arrive in a single-pass stream, and only the most recent $W$ arms are considered valid. The algorithm is required to perform pure exploration and regret minimization with limited memory, defined as the number of stored arms. The model is a natural extension of the streaming multi-armed bandits model (without the sliding window) that has been extensively studied in recent years. We provide a comprehensive analysis of both the pure exploration and regret minimization problems with the model. For pure exploration, we prove that finding the best arm is hard with sublinear memory while finding an approximate best arm admits an efficient algorithm. For regret minimization, we explore a new notion of regret and give sharp memory-regret trade-offs for any single-pass algorithm. We complement our theoretical results with experiments, demonstrating the trade-offs between sample, regret, and memory.
Motivated by the recency effect in online learning, we study algorithms for single-pass *sliding-window streaming multi-armed bandits (MABs)* in this paper. In this setting, we are given $n$ arms with unknown sub-Gaussian reward distributions and a parameter $W$. The arms arrive in a single-pass stream, and only the most recent $W$ arms are considered valid. The algorithm is required to perform pure exploration and regret minimization with limited memory, defined as the number of stored arms. The model is a natural extension of the streaming multi-armed bandits model (without the sliding window) that has been extensively studied in recent years. We provide a comprehensive analysis of both the pure exploration and regret minimization problems with the model. For pure exploration, we prove that finding the best arm is hard with sublinear memory while finding an approximate best arm admits an efficient algorithm. For regret minimization, we explore a new notion of regret and give sharp memory-regret trade-offs for any single-pass algorithm. We complement our theoretical results with experiments, demonstrating the trade-offs between sample, regret, and memory.
We formulate the quotient admission problem for finite graph-window rows. The input is a finite row set, an admissible evidence map, semantic labels, witness-support hypergraphs, and atom-level admissibility predicates. The output is a quotient decision on evidence atoms, with possible decisions certificate, residual, low-confidence, or blocked. The problem asks for the maximal guard-respecting atom-level decision map that uses no refinement beyond the admissible evidence partition. We prove an atom-union characterization of identifiable classes, give a witness-support hypergraph guard for certificate admission, characterize projected-label conflicts as blocked atoms, and present quotient admission algorithms with correctness, maximality, and complexity guarantees. With explicit evidence vectors and hyperedges, the algorithms run in expected O(B + I + n) time and space by hashing and deterministic O(B + I + n log n) time by sorting under a key-linear comparison model, where n is the number of rows, B is the total evidence encoding length, and I is the total hyperedge incidence size. We also prove a magnitude-only indistinguishability lower bound: any evaluator that observes only residual magnitudes fails on instances whose evidence atoms require different residual decisions after the magnitudes collapse them.
We formulate the quotient admission problem for finite graph-window rows. The input is a finite row set, an admissible evidence map, semantic labels, witness-support hypergraphs, and atom-level admissibility predicates. The output is a quotient decision on evidence atoms, with possible decisions certificate, residual, low-confidence, or blocked. The problem asks for the maximal guard-respecting atom-level decision map that uses no refinement beyond the admissible evidence partition. We prove an atom-union characterization of identifiable classes, give a witness-support hypergraph guard for certificate admission, characterize projected-label conflicts as blocked atoms, and present quotient admission algorithms with correctness, maximality, and complexity guarantees. With explicit evidence vectors and hyperedges, the algorithms run in expected O(B + I + n) time and space by hashing and deterministic O(B + I + n log n) time by sorting under a key-linear comparison model, where n is the number of rows, B is the total evidence encoding length, and I is the total hyperedge incidence size. We also prove a magnitude-only indistinguishability lower bound: any evaluator that observes only residual magnitudes fails on instances whose evidence atoms require different residual decisions after the magnitudes collapse them.
Motivated by polynomial identity testing with exponentials (Li and Wu, ITCS'26), we study uncertainty principles for the number-theoretic transform (NTT). We show that the NTT satisfies strong sparsity tradeoffs: For every fixed prime $q$ and for all but finitely many primes $p \equiv 1 \pmod q$ every nonzero $f\in \mathbb F_p^{\mathbb Z_q}$ and its number-theoretic transform $\hat f$ satisfy \[ |\mathrm{Supp}(f)| + |\mathrm{Supp}(\hat f)| \ge q+1. \] Thus, a $k$-sparse function has transform support at least $q-k+1$. As our main technical contribution, we prove a probabilistic version of the above uncertainty principle, averaged over primes $p$, in the regime $p=q^{O(1)}$.
As an application, we obtain a black-box identity test for $k$-sparse exponential polynomials of degree at most $d$ with vanishing soundness error, for $q$ moderately larger than $k$.
Motivated by polynomial identity testing with exponentials (Li and Wu, ITCS'26), we study uncertainty principles for the number-theoretic transform (NTT). We show that the NTT satisfies strong sparsity tradeoffs: For every fixed prime $q$ and for all but finitely many primes $p \equiv 1 \pmod q$ every nonzero $f\in \mathbb F_p^{\mathbb Z_q}$ and its number-theoretic transform $\hat f$ satisfy \[ |\mathrm{Supp}(f)| + |\mathrm{Supp}(\hat f)| \ge q+1. \] Thus, a $k$-sparse function has transform support at least $q-k+1$. As our main technical contribution, we prove a probabilistic version of the above uncertainty principle, averaged over primes $p$, in the regime $p=q^{O(1)}$.
As an application, we obtain a black-box identity test for $k$-sparse exponential polynomials of degree at most $d$ with vanishing soundness error, for $q$ moderately larger than $k$.
A multivariate polynomial on $n$ variables $x_1,\ldots,x_n$ of total degree $n$ over $\mathbf{Z}_2$ containing the multilinear monomial $\prod_{i=1}^n x_i$ is by the combinatorial nullstellensatz [Alon, Comb. Probab. Comput., 1999] known to always have a nonroot. We show that there cannot be a randomised polynomial time algorithm that given an arithmetic circuit of polynomial size formally computing such a polynomial, locates a nonroot with constant nonzero probability unless RP=NP. The result holds even when the individual degree of every variable in the input polynomial is at most two.
A multivariate polynomial on $n$ variables $x_1,\ldots,x_n$ of total degree $n$ over $\mathbf{Z}_2$ containing the multilinear monomial $\prod_{i=1}^n x_i$ is by the combinatorial nullstellensatz [Alon, Comb. Probab. Comput., 1999] known to always have a nonroot. We show that there cannot be a randomised polynomial time algorithm that given an arithmetic circuit of polynomial size formally computing such a polynomial, locates a nonroot with constant nonzero probability unless RP=NP. The result holds even when the individual degree of every variable in the input polynomial is at most two.
We prove that level-$\ell$ Kikuchi graphs of random $2r$-uniform hypergraphs spectrally approximate the Kikuchi graph of the complete $2r$-uniform hypergraph at a sampling rate that is sharp up to a logarithmic factor, in the regime $r\leq \ell \leq n/2$. Our proof is based on the matrix Bernstein inequality, but, unlike prior works, we apply it to an appropriate collection of blocks of Johnson eigenspaces. Our analysis relies on a new, simple band-locality property for arbitrary Kikuchi graphs. As an application, we prove that the natural degree-$2\ell$ sum-of-squares relaxation for the Max $2r$-XOR problem is ``integral'' when the input is a planted noisy $2r$-XOR instance on a random hypergraph with $\gtrsim n \cdot (n/\ell)^{r-1} \log n$ hyperedges.
We prove that level-$\ell$ Kikuchi graphs of random $2r$-uniform hypergraphs spectrally approximate the Kikuchi graph of the complete $2r$-uniform hypergraph at a sampling rate that is sharp up to a logarithmic factor, in the regime $r\leq \ell \leq n/2$. Our proof is based on the matrix Bernstein inequality, but, unlike prior works, we apply it to an appropriate collection of blocks of Johnson eigenspaces. Our analysis relies on a new, simple band-locality property for arbitrary Kikuchi graphs. As an application, we prove that the natural degree-$2\ell$ sum-of-squares relaxation for the Max $2r$-XOR problem is ``integral'' when the input is a planted noisy $2r$-XOR instance on a random hypergraph with $\gtrsim n \cdot (n/\ell)^{r-1} \log n$ hyperedges.
In this work, we study Restricted Assignment scheduling on multiple machines, where each job can be processed only on a specified subset of machines and the objective is to minimize the makespan. We introduce a learning-augmented setting in which a possibly infeasible predicted assignment is provided. The prediction error (moved-load) is measured by the total processing volume that must be reassigned in order to obtain an optimal feasible schedule.
Using a single prediction, we obtain two types of guarantees. First, we design an algorithm whose approximation ratio degrades smoothly with the prediction error while retaining a worst-case guarantee independent of the prediction quality. More precisely, for any fixed constant, we can make the additive dependence on the prediction error arbitrarily small, at the cost of increasing the polynomial running time. This guarantee can also be combined with any approximation algorithm for the problem without predictions to obtain robustness.
Second, given a makespan estimate, we provide a repair procedure that returns a schedule matching this estimate in time parameterized by the prediction error. This allows the algorithm to exploit the separation between estimation and approximation algorithms for Restricted Assignment. Finally, we complement the repair algorithm with a parameterized hardness result, showing that exact moved-load repair with a given target makespan is W[1]-hard when parameterized by the amount of moved-load.
In this work, we study Restricted Assignment scheduling on multiple machines, where each job can be processed only on a specified subset of machines and the objective is to minimize the makespan. We introduce a learning-augmented setting in which a possibly infeasible predicted assignment is provided. The prediction error (moved-load) is measured by the total processing volume that must be reassigned in order to obtain an optimal feasible schedule.
Using a single prediction, we obtain two types of guarantees. First, we design an algorithm whose approximation ratio degrades smoothly with the prediction error while retaining a worst-case guarantee independent of the prediction quality. More precisely, for any fixed constant, we can make the additive dependence on the prediction error arbitrarily small, at the cost of increasing the polynomial running time. This guarantee can also be combined with any approximation algorithm for the problem without predictions to obtain robustness.
Second, given a makespan estimate, we provide a repair procedure that returns a schedule matching this estimate in time parameterized by the prediction error. This allows the algorithm to exploit the separation between estimation and approximation algorithms for Restricted Assignment. Finally, we complement the repair algorithm with a parameterized hardness result, showing that exact moved-load repair with a given target makespan is W[1]-hard when parameterized by the amount of moved-load.
Equitable allocation of indivisible goods to agents in online settings is an algorithmic primitive with applications for load balancing, network routing, online marketplaces, and multi-agent systems. We consider a general setting in which allocations are constrained to be bases of discrete polymatroids that arrive online.
Our work demonstrates that a simple, myopic algorithm called Brick-Laying, which greedily minimizes the sum of squared loads on agents, achieves a universal and objective-free notion of optimality called majorization minimax-optimality [BDK26] for this setting. As a consequence, Brick-Laying simultaneously guarantees minimax optimal competitive ratios and regret for all Schur-concave and Schur-convex objectives, and for any number of agents and resources (despite being agnostic to problem scale).
Departing from popular primal-dual analysis, we employ majorization to compare allocations. We leverage the conjugates of integer partitions -- which act as a discrete dual to majorization -- to characterize worst-case instances for the Brick-Laying algorithm. Our approach reveals a novel structural connection between the geometry of partitions and online equitable allocation.
Equitable allocation of indivisible goods to agents in online settings is an algorithmic primitive with applications for load balancing, network routing, online marketplaces, and multi-agent systems. We consider a general setting in which allocations are constrained to be bases of discrete polymatroids that arrive online.
Our work demonstrates that a simple, myopic algorithm called Brick-Laying, which greedily minimizes the sum of squared loads on agents, achieves a universal and objective-free notion of optimality called majorization minimax-optimality [BDK26] for this setting. As a consequence, Brick-Laying simultaneously guarantees minimax optimal competitive ratios and regret for all Schur-concave and Schur-convex objectives, and for any number of agents and resources (despite being agnostic to problem scale).
Departing from popular primal-dual analysis, we employ majorization to compare allocations. We leverage the conjugates of integer partitions -- which act as a discrete dual to majorization -- to characterize worst-case instances for the Brick-Laying algorithm. Our approach reveals a novel structural connection between the geometry of partitions and online equitable allocation.
Authors: Meng Li, Xiaohua Yang, Jie Liu, Shiyu Yan
This paper asks when MR-subset selection is a real mutant-level requirement for minimum complete evidence in metamorphic testing rather than a coarse fault-class counting artifact. We define a layer-relative completeness criterion over an admitted mutant--draw coverage universe. The central result is a support-set domination boundary: it states when class-level abstraction is safe and when mutant-level MR minimization is necessary. The boundary is governed by kill-signature heterogeneity, which yields a scoped fault-signature kernel and separates the MR-specific question from ordinary fault-class counting. The resulting Min-MR-Complete problem is Set-Cover-equivalent over the selected coverage universe, giving NP-hardness, the classical logarithmic approximation boundary, a greedy approximation, an exact ILP formulation, and an SMS-rank upper bound that is not a lower bound or tight predictor. Artifact lanes provide lane-local minimization and audit evidence; separately, route witnesses instantiate both collapse and non-collapse regimes for the boundary theorem and are not pooled as population-level experiments. Other MR-class-proxy rows remain intermediate signals rather than route-admitted witness evidence.
This paper asks when MR-subset selection is a real mutant-level requirement for minimum complete evidence in metamorphic testing rather than a coarse fault-class counting artifact. We define a layer-relative completeness criterion over an admitted mutant--draw coverage universe. The central result is a support-set domination boundary: it states when class-level abstraction is safe and when mutant-level MR minimization is necessary. The boundary is governed by kill-signature heterogeneity, which yields a scoped fault-signature kernel and separates the MR-specific question from ordinary fault-class counting. The resulting Min-MR-Complete problem is Set-Cover-equivalent over the selected coverage universe, giving NP-hardness, the classical logarithmic approximation boundary, a greedy approximation, an exact ILP formulation, and an SMS-rank upper bound that is not a lower bound or tight predictor. Artifact lanes provide lane-local minimization and audit evidence; separately, route witnesses instantiate both collapse and non-collapse regimes for the boundary theorem and are not pooled as population-level experiments. Other MR-class-proxy rows remain intermediate signals rather than route-admitted witness evidence.
Authors: Ben Bals, Joakim Blikstad, Daniel Dadush, Yasamin Nazari, Jonas Schmidt
The reachability diameter ($\mathrm{ReachDiam}$) of a directed graph is the maximum distance over all pairs $u,v$ where $v$ is reachable from $u$. This notion is present in the definition of shortcut sets, and the name was recently coined in that context by Haeupler, Jiang, and Saranurak [SOSA 2026]. While this is a very natural notion of diameter in directed graphs, and especially DAGs, it is so far not computationally explored. Other definitions of diameter in directed graphs are either trivial (infinite) in graphs that are not strongly connected (e.g., the classical definition) or are non-trivial only in highly restrictive graph classes (e.g., Min-Diameter).
We initiate the problem of computing the (approximate) reachability diameter from a fine-grained complexity point of view. Under certain fine-grained assumptions, we prove that there is no algorithm in time $\mathcal{O}(n^{ω- \varepsilon}$) that gives any approximation of $\mathrm{ReachDiam}$ in weighted graphs. Similarly, there is no algorithm with better than $2$-approximation for unweighted graphs in this time. To supplement this, we provide algorithmic upper bounds that lead to additive approximation of $\mathrm{ReachDiam}$ for unweighted graphs. Hence, we establish a strong separation between the weighted and unweighted cases, which makes this type of diameter different in nature than other known notions.
Considering the hardness in general weighted graphs, we also study special graph classes and get small constant approximations for DAGs with bounded width or graphs with bounded treewidth. Interestingly, our techniques also lead to exact hopsets with hopbound $2$ for bounded treewidth graphs. This and some of our upper bounds for general graphs show technical connections between approximating $\mathrm{ReachDiam}$ and computing shortcut sets and hopsets.
The reachability diameter ($\mathrm{ReachDiam}$) of a directed graph is the maximum distance over all pairs $u,v$ where $v$ is reachable from $u$. This notion is present in the definition of shortcut sets, and the name was recently coined in that context by Haeupler, Jiang, and Saranurak [SOSA 2026]. While this is a very natural notion of diameter in directed graphs, and especially DAGs, it is so far not computationally explored. Other definitions of diameter in directed graphs are either trivial (infinite) in graphs that are not strongly connected (e.g., the classical definition) or are non-trivial only in highly restrictive graph classes (e.g., Min-Diameter).
We initiate the problem of computing the (approximate) reachability diameter from a fine-grained complexity point of view. Under certain fine-grained assumptions, we prove that there is no algorithm in time $\mathcal{O}(n^{ω- \varepsilon}$) that gives any approximation of $\mathrm{ReachDiam}$ in weighted graphs. Similarly, there is no algorithm with better than $2$-approximation for unweighted graphs in this time. To supplement this, we provide algorithmic upper bounds that lead to additive approximation of $\mathrm{ReachDiam}$ for unweighted graphs. Hence, we establish a strong separation between the weighted and unweighted cases, which makes this type of diameter different in nature than other known notions.
Considering the hardness in general weighted graphs, we also study special graph classes and get small constant approximations for DAGs with bounded width or graphs with bounded treewidth. Interestingly, our techniques also lead to exact hopsets with hopbound $2$ for bounded treewidth graphs. This and some of our upper bounds for general graphs show technical connections between approximating $\mathrm{ReachDiam}$ and computing shortcut sets and hopsets.
Subgraph counting is a fundamental problem in graph analysis. Motivated by practical scenarios where graph analytics are performed on subgraphs induced by selected vertices -- rather than on the entire graph -- and by growing privacy concerns, we initiate the study of differentially private range subgraph counting (DPRSC). The goal is to privately count occurrences of a fixed pattern graph within induced subgraphs defined by multi-dimensional attribute ranges. Unlike classical point counting, subgraph counting is inherently nonlinear and exhibits high sensitivity: a single edge modification can affect many subgraph occurrences. We present the first efficient algorithms for DPRSC with small additive error. Our approach introduces a subgraph projection that reduces DPRSC to weighted orthogonal range counting, enabling the use of range trees and local sensitivity estimation to achieve accurate private query answering. We complement our algorithms with matching lower bounds, obtained by reducing reconstruction attacks to DPRSC and leveraging discrepancy theory. In particular, we show that any differentially private algorithm for DPRSC must incur additive error exponential in the dimension. Empirical evaluations demonstrate that our algorithms significantly outperform baseline methods in accuracy and runtime while maintaining strong privacy guarantees.
Subgraph counting is a fundamental problem in graph analysis. Motivated by practical scenarios where graph analytics are performed on subgraphs induced by selected vertices -- rather than on the entire graph -- and by growing privacy concerns, we initiate the study of differentially private range subgraph counting (DPRSC). The goal is to privately count occurrences of a fixed pattern graph within induced subgraphs defined by multi-dimensional attribute ranges. Unlike classical point counting, subgraph counting is inherently nonlinear and exhibits high sensitivity: a single edge modification can affect many subgraph occurrences. We present the first efficient algorithms for DPRSC with small additive error. Our approach introduces a subgraph projection that reduces DPRSC to weighted orthogonal range counting, enabling the use of range trees and local sensitivity estimation to achieve accurate private query answering. We complement our algorithms with matching lower bounds, obtained by reducing reconstruction attacks to DPRSC and leveraging discrepancy theory. In particular, we show that any differentially private algorithm for DPRSC must incur additive error exponential in the dimension. Empirical evaluations demonstrate that our algorithms significantly outperform baseline methods in accuracy and runtime while maintaining strong privacy guarantees.
We introduce two infinite families of 3-regular planar graphs. Both families are conceptual adversaries to the Pohl-Warnsdorf algorithm for finding Hamiltonians. We provide a closed form calculation of the number of Hamiltonians.
We introduce two infinite families of 3-regular planar graphs. Both families are conceptual adversaries to the Pohl-Warnsdorf algorithm for finding Hamiltonians. We provide a closed form calculation of the number of Hamiltonians.
We describe new dependent-rounding algorithms for bipartite graphs. Given a fractional matching $x$ of graph $G = (U \cup V, E)$, the algorithms return an integral solution $X$ such that each right-node $v \in V$ has at most one edge, and where the variables $X_e$ also satisfy broad non-positive correlation properties. In particular, for any edges $e_1, e_2$ sharing a left-node $u \in U$, the variables $X_{e_1}, X_{e_2}$ have \emph{strong} negative-correlation, i.e. the expectation of $X_{e_1} X_{e_2}$ is significantly below $x_{e_1} x_{e_2}$.
Dependent rounding schemes with these properties have been used for a approximation algorithms for job-scheduling on unrelated machines to minimize weighted completion times, among other applications. Our new algorithm achieves simpler and qualitatively stronger bounds compared to prior algorithms. In particular, we achieve a negative-correlation property $$ \E[X_{e_1} X_{e_2}] \leq 0.79751 \ x_{e_1} x_{e_2}, $$ which is a significant constant-factor improvement over Baveja, Qu & Srinivasan (2023).
We describe new dependent-rounding algorithms for bipartite graphs. Given a fractional matching $x$ of graph $G = (U \cup V, E)$, the algorithms return an integral solution $X$ such that each right-node $v \in V$ has at most one edge, and where the variables $X_e$ also satisfy broad non-positive correlation properties. In particular, for any edges $e_1, e_2$ sharing a left-node $u \in U$, the variables $X_{e_1}, X_{e_2}$ have \emph{strong} negative-correlation, i.e. the expectation of $X_{e_1} X_{e_2}$ is significantly below $x_{e_1} x_{e_2}$.
Dependent rounding schemes with these properties have been used for a approximation algorithms for job-scheduling on unrelated machines to minimize weighted completion times, among other applications. Our new algorithm achieves simpler and qualitatively stronger bounds compared to prior algorithms. In particular, we achieve a negative-correlation property $$ \E[X_{e_1} X_{e_2}] \leq 0.79751 \ x_{e_1} x_{e_2}, $$ which is a significant constant-factor improvement over Baveja, Qu & Srinivasan (2023).