Last Update

OPML feed of all feeds.

Subscribe to the Atom feed or RSS feed to stay up to date.

Thank you to arXiv for use of its open access interoperability.

Note: the date of arXiv entries announced right after publication holidays might incorrectly show up as the date of the publication holiday itself. This is due to our ad hoc method of inferring announcement dates, which are not returned by the arXiv API.

Powered by Pluto.

Source on GitHub.

Maintained by Nima Anari, Arnab Bhattacharyya, Gautam Kamath.

Theory of Computing Report

Thursday, February 05

Trevisan Award for Expository Work

from Windows on Theory

[Guest post by Salil Vadhan; Boaz’s note: I am thrilled to see this award. Luca’s expository writing on his blog, surveys, and lecture notes was and is an amazing resource. Luca had a knack for explaining some of the most difficult topics in intuitive ways that cut to their heart. I hope it will inspire more people to follow in his footsteps.]

The Trevisan Award for Expository Work is a new SIGACT award created in memory of Luca Trevisan (1971-2024), with a nomination deadline of April 10, 2026.

The award is intended to promote and recognize high-impact work expositing ideas and results from the Theory of Computation. The exposition can have various target audiences, e.g., people in this field, people in adjacent or remote academic fields, as well as the general public. The form of exposition can vary, and can include books, surveys, lectures, course materials, video, audio (e.g., podcasts), blogs and other media products. The award may be given to a single piece of work or a series produced over time. The award may be given to an individual or to a small group who together produced this expository work.

The awardee will receive USD 2000 (to be divided among the awardees if multiple), as well as travel support if needed to attend STOC, where the award will be presented. STOC 2026 is June 22-26 in Salt Lake City, Utah.

The endowment for this prize was initiated by a gift from Avi Wigderson, drawing on his Turing Award, and has been subsequently augmented by other individuals.

For more details see https://sigact.org/prizes/trevisan.html.

By Boaz Barak

Lust for life

from Emanuele Viola

I just finished reading the book. I savored it over months, reading just a few pages each day. It was fun to read about something I knew nothing about: painting, coal miners, impressionism, and the life of Vincent van Gogh. To think that I missed the exhibition at the MFA right in front of my office, as well as the museum in Amsterdam…

I came to know about this book after someone pointed me to this memorial about Maryam Mirzakhani, according to which the book “exemplified her excitement about work and life in general.” I wish I had talked more to her when we crossed paths at Harvard.

By Manu

Postdoc at West Virginia University (apply by May 1, 2026)

from CCI: jobs

The Lane Department of Computer Science and Electrical Engineering at West Virginia University invites applications for a Postdoctoral Fellow (funded by Algorithmic Foundations, NSF) in the general areas of theoretical computer science and algorithmic operations research, with an emphasis on computational complexity and game theory. The position is funded for two years, starting August 1, 2026.

Website: https://wvu.taleo.net/careersection/faculty/jobdetail.ftl?job=28500&tz=GMT-05:00&tzname=America/New_York
Email: k.subramani@mail.wvu.edu

By shacharlovett

Quantum Advantage in Decision Trees: A Weighted Graph and $L_1$ Norm Approach

from arXiv: Computational Complexity

Authors: Sebastian Alberto Grillo, Bernardo Daniel Dávalos, Rodney Fabian Franco Torres, Franklin de Lima Marquezino, Edgar López Pezoa

The analysis of the computational power of single-query quantum algorithms is important because they must extract maximal information from one oracle call, revealing fundamental limits of quantum advantage and enabling optimal, resource-efficient quantum computation. This paper proposes a formulation of single-query quantum decision trees as weighted graphs. This formulation has the advantage that it facilitates the analysis of the $L_1$ spectral norm of the algorithm output. This advantage is based on the fact that a high $L_1$ spectral norm of the output of a quantum decision tree is a necessary condition to outperform its classical counterpart. We propose heuristics for maximizing the $L_{1}$ spectral norm, show how to combine weighted graphs to generate sequences with strictly increasing norm, and present functions exhibiting exponential quantum advantage. Finally, we establish a necessary condition linking single-query quantum advantage to the asymptotic growth of measurement projector dimensions.

The Complexity of Min-Max Optimization with Product Constraints

from arXiv: Computational Complexity

Authors: Martino Bernasconi, Matteo Castiglioni

We study the computational complexity of the problem of computing local min-max equilibria of games with a nonconvex-nonconcave utility function $f$. From the work of Daskalakis, Skoulakis, and Zampetakis [DSZ21], this problem was known to be hard in the restrictive case in which players are required to play strategies that are jointly constrained, leaving open the question of its complexity under more natural constraints. In this paper, we settle the question and show that the problem is PPAD-hard even under product constraints and, in particular, over the hypercube.

On the Complexity of Vertex-Splitting Into an Interval Graph

from arXiv: Computational Complexity

Authors: Faisal N. Abu-Khzam, Dipayan Chakraborty, Lucas Isenmann, Nacim Oijid

Vertex splitting is a graph modification operation in which a vertex is replaced by multiple vertices such that the union of their neighborhoods equals the neighborhood of the original vertex. We introduce and study vertex splitting as a graph modification operation for transforming graphs into interval graphs. Given a graph $G$ and an integer $k$, we consider the problem of deciding whether $G$ can be transformed into an interval graph using at most $k$ vertex splits. We prove that this problem is NP-hard, even when the input is restricted to subcubic planar bipartite graphs. We further observe that vertex splitting differs fundamentally from vertex and edge deletions as graph modification operations when the objective is to obtain a chordal graph, even for graphs with maximum independent set size at most two. On the positive side, we give a polynomial-time algorithm for transforming, via a minimum number of vertex splits, a given graph into a disjoint union of paths, and show that splitting triangle-free graphs into unit interval graphs is also solvable in polynomial time.

The matrix-vector complexity of $Ax=b$

from arXiv: Data Structures and Algorithms

Authors: Michał Dereziński, Ethan N. Epperly, Raphael A. Meyer

Matrix-vector algorithms, particularly Krylov subspace methods, are widely viewed as the most effective algorithms for solving large systems of linear equations. This paper establishes lower bounds on the worst-case number of matrix-vector products needed by such an algorithm to approximately solve a general linear system. The first main result is that, for a matrix-vector algorithm which can perform products with both a matrix and its transpose, $\Omega(\kappa\log(1/\varepsilon))$ matrix-vector products are necessary to solve a linear system with condition number $\kappa$ to accuracy $\varepsilon$, matching an upper bound for conjugate gradient on the normal equations. The second main result is that one-sided algorithms, which lack access to the transpose, must use $n$ matrix-vector products to solve an $n \times n$ linear system, even when the problem is perfectly conditioned. Both main results include explicit constants that match known upper bounds up to a factor of four. These results rigorously demonstrate the limitations of matrix-vector algorithms and confirm the optimality of widely used Krylov subspace algorithms.
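
The matching upper bound mentioned above is conjugate gradient applied to the normal equations, which issues one product with $A$ and one with $A^T$ per iteration. The NumPy sketch of that method (CGNR) below is included only to illustrate the two-sided access model; it is not taken from the paper.

```python
import numpy as np

def cgnr(A, b, tol=1e-10, max_iters=None):
    """Conjugate gradient on the normal equations A^T A x = A^T b (CGNR).
    Each iteration performs one product with A and one with A^T, i.e. two
    matrix-vector queries in the two-sided access model."""
    n = A.shape[1]
    max_iters = max_iters or 10 * n
    x = np.zeros(n)
    r = b - A @ x              # residual of the original system
    s = A.T @ r                # residual of the normal equations
    p = s.copy()
    snorm = s @ s
    for _ in range(max_iters):
        Ap = A @ p
        alpha = snorm / (Ap @ Ap)
        x += alpha * p
        r -= alpha * Ap
        s = A.T @ r
        snorm_new = s @ s
        if np.sqrt(snorm_new) < tol:
            break
        p = s + (snorm_new / snorm) * p
        snorm = snorm_new
    return x
```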

The Needle is a Thread: Finding Planted Paths in Noisy Process Trees

from arXiv: Data Structures and Algorithms

Authors: Maya Le, Paweł Prałat, Aaron Smith, François Théberge

Motivated by applications in cybersecurity such as finding meaningful sequences of malware-related events buried inside large amounts of computer log data, we introduce the "planted path" problem and propose an algorithm to find fuzzy matchings between two trees. This algorithm can be used as a "building block" for more complicated workflows. We demonstrate usefulness of a few of such workflows in mining synthetically generated data as well as real-world ACME cybersecurity datasets.

Incongruity-sensitive access to highly compressed strings

from arXiv: Data Structures and Algorithms

Authors: Ferdinando Cicalese, Zsuzsanna Lipták, Travis Gagie, Gonzalo Navarro, Nicola Prezza, Cristian Urbina

Random access to highly compressed strings -- represented by straight-line programs or Lempel-Ziv parses, for example -- is a well-studied topic. Random access to such strings in strongly sublogarithmic time is impossible in the worst case, but previous authors have shown how to support faster access to specific characters and their neighbourhoods. In this paper we explore whether, since better compression can impede access, we can support faster access to relatively incompressible substrings of highly compressed strings. We first show how, given a run-length compressed straight-line program (RLSLP) of size $g_{rl}$ or a block tree of size $L$, we can build an $O (g_{rl})$-space or an $O (L)$-space data structure, respectively, that supports access to any character in time logarithmic in the length of the longest repeated substring containing that character. That is, the more incongruous a character is with respect to the characters around it in a certain sense, the faster we can support access to it. We then prove a similar but more powerful and sophisticated result for parsings in which phrases' sources do not overlap much larger phrases, with the query time depending also on the number of phrases we must copy from their sources to obtain the queried character.

Simple 2-approximations for bad triangle transversals and some hardness results for related problems

from arXiv: Data Structures and Algorithms

Authors: Florian Adriaens, Nikolaj Tatti

Given a signed graph, the bad triangle transversal (BTT) problem asks to find the smallest number of edges that need to be removed such that the remaining graph does not have a triangle with exactly one negative edge (a bad triangle). We propose novel 2-approximations for this problem, which are much simpler and faster than a folklore adaptation of the 2-approximation by Krivelevich for finding a minimum triangle transversal in unsigned graphs. One of our algorithms also works for weighted BTT and for approximately optimal feasible solutions to the bad triangle cover LP. Using a recent result on approximating the bad triangle cover LP, we obtain a $(2+\varepsilon)$ approximation in time almost equal to the time needed to find a maximal set of edge-disjoint bad triangles (which would give a standard 3-approximation). Additionally, several inapproximability results are provided. For complete signed graphs, we show that BTT is NP-hard to approximate with factor better than $\frac{2137}{2136}$. Our reduction also implies the same hardness result for related problems such as correlation clustering (cluster editing), cluster deletion and the minimum strong triadic closure problem. On complete signed graphs, BTT is closely related to correlation clustering. We show that the correlation clustering optimum is at most $3/2$ times the BTT optimum, by describing a pivot procedure that transforms BTT solutions into clusters. This improves a result by Veldt, which states that their ratio is at most two.
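
For context, the standard 3-approximation mentioned in the abstract can be stated in a few lines: greedily collect a maximal set of edge-disjoint bad triangles and delete all of their edges. The sketch below implements that folklore baseline, not the paper's 2-approximations; the encoding of the signed graph as a dictionary of edge signs is an assumption made for illustration.

```python
from itertools import combinations

def btt_3_approx(vertices, sign):
    """Folklore 3-approximation for bad triangle transversal: greedily pick
    a maximal set of edge-disjoint bad triangles (exactly one negative edge)
    and delete all of their edges. Every bad triangle must lose at least one
    edge and the picked triangles are edge-disjoint, so OPT >= #picked,
    while at most 3 * #picked edges are deleted.
    sign: dict frozenset({u, v}) -> +1 or -1 for each edge of the graph."""
    removed = set()
    for u, v, w in combinations(vertices, 3):
        es = [frozenset((u, v)), frozenset((v, w)), frozenset((u, w))]
        if any(e not in sign for e in es):
            continue                       # u, v, w do not form a triangle
        if sum(sign[e] == -1 for e in es) != 1:
            continue                       # triangle is not bad
        if any(e in removed for e in es):
            continue                       # keep picked triangles edge-disjoint
        removed.update(es)
    return removed

# Complete graph on {1, 2, 3} with one negative edge is a single bad triangle.
sign = {frozenset((1, 2)): 1, frozenset((2, 3)): 1, frozenset((1, 3)): -1}
assert len(btt_3_approx([1, 2, 3], sign)) == 3
```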

Improved Sparse Recovery for Approximate Matrix Multiplication

from arXiv: Data Structures and Algorithms

Authors: Yahel Uffenheimer, Omri Weinstein

We present a simple randomized algorithm for approximate matrix multiplication (AMM) whose error scales with the *output* norm $\|AB\|_F$. Given any $n\times n$ matrices $A,B$ and a runtime parameter $r\leq n$, the algorithm produces, in $O(n^2(r+\log n))$ time, a matrix $C$ with total squared error $\mathbb{E}[\|C-AB\|_F^2]\le (1-\frac{r}{n})\|AB\|_F^2$, per-entry variance $\|AB\|_F^2/n^2$ and bias $\mathbb{E}[C]=\frac{r}{n}AB$. Alternatively, the algorithm can compute an *unbiased* estimate with expected total squared error $\frac{n}{r}\|{AB}\|_{F}^2$, recovering the state-of-the-art AMM error obtained by Pagh's TensorSketch algorithm (Pagh, 2013). Our algorithm is a logarithmic factor faster. The key insight in the algorithm is a new variation of pseudo-random rotation of the input matrices (a Fast Hadamard Transform with asymmetric diagonal scaling), which redistributes the Frobenius norm of the *output* $AB$ uniformly across its entries.
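
To make the AMM setting concrete, here is a short NumPy sketch of the classical sampling-based estimator, which draws outer products with probability proportional to column/row norms. Its error scales with $\|A\|_F\|B\|_F$ rather than with the output norm $\|AB\|_F$, so it only illustrates the model and is not the paper's Hadamard-rotation method.

```python
import numpy as np

def sampled_amm(A, B, r, seed=0):
    """Classical unbiased AMM baseline: sample r column/row outer products
    with probabilities proportional to ||A[:, k]|| * ||B[k, :]|| and average
    the importance-weighted terms, so E[C] = A @ B. Its error scales like
    ||A||_F * ||B||_F / sqrt(r), unlike the output-norm bound in the paper."""
    rng = np.random.default_rng(seed)
    n = A.shape[1]
    w = np.linalg.norm(A, axis=0) * np.linalg.norm(B, axis=1)
    p = w / w.sum()
    idx = rng.choice(n, size=r, p=p)
    C = np.zeros((A.shape[0], B.shape[1]))
    for k in idx:
        C += np.outer(A[:, k], B[k, :]) / (r * p[k])
    return C
```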

QuadRank: Engineering a High Throughput Rank

from arXiv: Data Structures and Algorithms

Authors: R. Groot Koerkamp

Given a text, a query $\mathsf{rank}(q, c)$ counts the number of occurrences of character $c$ among the first $q$ characters of the text. Space-efficient methods to answer these rank queries form an important building block in many succinct data structures. For example, the FM-index is a widely used data structure that uses rank queries to locate all occurrences of a pattern in a text. In bioinformatics applications, the goal is usually to process a given input as fast as possible. Thus, data structures should have high throughput when used with many threads.

Contributions. For the binary alphabet, we develop BiRank with 3.28% space overhead. It merges the central ideas of two recent papers: (1) we interleave (inline) offsets in each cache line of the underlying bit vector [Laws et al., 2024], reducing cache misses, and (2) these offsets are to the middle of each block so that only half of them need popcounting [Gottlieb and Reinert, 2025]. In QuadRank (14.4% space overhead), we extend these techniques to the $\sigma=4$ (DNA) alphabet. Both data structures require only a single cache miss per query, making them highly suitable for high-throughput and memory-bound settings. To enable efficient batch-processing, we support prefetching the cache lines required to answer upcoming queries.

Results. BiRank and QuadRank are around $1.5\times$ and $2\times$ faster than similar-overhead methods that do not use inlining. Prefetching gives an additional $2\times$ speedup, at which point the dual-channel DDR4 RAM bandwidth becomes a hard limit on the total throughput. With prefetching, both methods outperform all other methods apart from SPIDER [Laws et al., 2024] by $2\times$. When using QuadRank with prefetching in a toy count-only FM-index, QuadFm, this results in a smaller size and up to $4\times$ speedup over Genedex, a state-of-the-art batching FM-index implementation.
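
As a much simplified picture of what such a rank structure does, the plain-Python sketch below packs the bit vector into 64-bit words, groups words into 512-bit blocks, and stores one absolute count per block, so a query is one counter lookup plus a few popcounts. The real BiRank/QuadRank layouts differ (counters are inlined into the block's cache line and anchored at its midpoint); this only illustrates the block-counter idea.

```python
class BitRank:
    """Toy binary rank structure: bits are packed into 64-bit words, words
    are grouped into blocks of 8 (512 bits), and the number of 1s before
    each block is stored, so a query is one counter lookup plus at most
    eight popcounts inside the block."""
    WORDS_PER_BLOCK = 8

    def __init__(self, bits):
        self.n = len(bits)
        self.words = []
        for start in range(0, self.n, 64):      # pack bits, LSB first
            w = 0
            for i, b in enumerate(bits[start:start + 64]):
                w |= b << i
            self.words.append(w)
        self.block_rank, ones = [0], 0          # number of 1s before each block
        for wi, w in enumerate(self.words):
            ones += bin(w).count("1")
            if (wi + 1) % self.WORDS_PER_BLOCK == 0:
                self.block_rank.append(ones)

    def rank1(self, q):
        """Number of 1 bits among the first q bits (0 <= q <= n)."""
        word, bit = divmod(q, 64)
        block = word // self.WORDS_PER_BLOCK
        r = self.block_rank[block]
        for wi in range(block * self.WORDS_PER_BLOCK, word):
            r += bin(self.words[wi]).count("1")
        if bit:
            r += bin(self.words[word] & ((1 << bit) - 1)).count("1")
        return r

# Small example: 1s at positions 3, 64 and 100 of a 128-bit vector.
bits = [0] * 128
for i in (3, 64, 100):
    bits[i] = 1
br = BitRank(bits)
assert br.rank1(64) == 1 and br.rank1(101) == 3
```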

Minimizing Makespan in Sublinear Time via Weighted Random Sampling

from arXiv: Data Structures and Algorithms

Authors: Bin Fu, Yumei Huo, Hairong Zhao

We consider the classical makespan minimization scheduling problem where $n$ jobs must be scheduled on $m$ identical machines. Using weighted random sampling, we develop two sublinear-time approximation schemes: one for the case where $n$ is known and the other for the case where $n$ is unknown. Both algorithms not only give a $(1+3\varepsilon)$-approximation to the optimal makespan but also generate a sketch schedule. Our first algorithm, which targets the case where $n$ is known and draws samples in a single round under weighted random sampling, has a running time of $\tilde{O}(\tfrac{m^5}{\varepsilon^4} \sqrt{n}+A(\lceil \tfrac{m}{\varepsilon} \rceil, \varepsilon))$, where $A(\mathcal{N}, \alpha)$ is the time complexity of any $(1+\alpha)$-approximation scheme for the makespan minimization of $\mathcal{N}$ jobs. The second algorithm addresses the case where $n$ is unknown. It uses adaptive weighted random sampling, that is, it draws samples in several rounds, adjusting the number of samples after each round, and runs in sublinear time $\tilde{O}(\tfrac{m^5}{\varepsilon^4} \sqrt{n} + A(\lceil \tfrac{m}{\varepsilon} \rceil, \varepsilon))$. We also provide an implementation that generates a weighted random sample using $O(\log n)$ uniform random samples.
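
For reference, the textbook way to draw a job with probability proportional to its processing time is inverse-transform sampling over prefix sums, sketched below under the assumption that the weights are the processing times. It reads all $n$ sizes, which a sublinear-time algorithm cannot afford; the paper's routine instead produces a weighted sample from $O(\log n)$ uniform samples.

```python
import bisect, random

def weighted_sample(sizes, k, rng=random.Random(0)):
    """Draw k job indices, each with probability proportional to its
    processing time, by inverse-transform sampling over the prefix sums."""
    prefix, total = [], 0.0
    for s in sizes:
        total += s
        prefix.append(total)
    return [bisect.bisect_right(prefix, rng.random() * total) for _ in range(k)]

# Example: job 2 is ten times as likely to be drawn as job 0 or job 1.
print(weighted_sample([1.0, 1.0, 10.0], k=5))
```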

Functional Stochastic Localization

from arXiv: Data Structures and Algorithms

Authors: Anming Gu, Bobby Shi, Kevin Tian

Eldan's stochastic localization is a probabilistic construction that has proved instrumental to modern breakthroughs in high-dimensional geometry and the design of sampling algorithms. Motivated by sampling under non-Euclidean geometries and the mirror descent algorithm in optimization, we develop a functional generalization of Eldan's process that replaces Gaussian regularization with regularization by any positive integer multiple of a log-Laplace transform. We further give a mixing time bound on the Markov chain induced by our localization process, which holds if our target distribution satisfies a functional Poincaré inequality. Finally, we apply our framework to differentially private convex optimization in $\ell_p$ norms for $p \in [1, 2)$, where we improve state-of-the-art query complexities in a zeroth-order model.

Approximately Partitioning Vertices into Short Paths

from arXiv: Data Structures and Algorithms

Authors: Mingyang Gong, Zhi-Zhong Chen, Brendan Mumey

Given a fixed positive integer $k$ and a simple undirected graph $G = (V, E)$, the *$k^-$-path partition* problem, denoted by $k$PP for short, aims to find a minimum collection $\cal{P}$ of vertex-disjoint paths in $G$ such that each path in $\cal{P}$ has at most $k$ vertices and each vertex of $G$ appears in one path in $\cal{P}$. In this paper, we present a $\frac {k+4}5$-approximation algorithm for $k$PP when $k\in\{9,10\}$ and an improved $(\frac{\sqrt{11}-2}7 k + \frac {9-\sqrt{11}}7)$-approximation algorithm when $k \ge 11$. Our algorithms achieve the current best approximation ratios for $k \in \{ 9, 10, \ldots, 18 \}$. Our algorithms start with a maximum triangle-free path-cycle cover $\cal{F}$, which may not be feasible because of the existence of cycles or paths with more than $k$ vertices. We connect as many cycles in $\cal{F}$ with $4$ or $5$ vertices as possible by computing another maximum-weight path-cycle cover in a suitably constructed graph so that $\cal{F}$ can be transformed into a $k^-$-path partition of $G$ without losing too many edges. Keywords: $k^-$-path partition; Triangle-free path-cycle cover; $[f, g]$-factor; Approximation algorithm

Wednesday, February 04

Sampling the Oxford CS Library

from Computational Complexity

Wandering around the maze known as the Computer Science building at Oxford, I found the computer science library. Rarely these days do you see a library (and a librarian) devoted to computer science. The librarian found their copy of The Golden Ticket and asked me to inscribe and sign it, just like at Dagstuhl, perhaps the only other active CS library I know of.

It brought back memories of the early 90s when I would often head to the Math/CS library at the University of Chicago to track down some conference or journal paper. Now we just click and download, but you miss finding something else interesting in the proceedings or the stacks in general.

I had time to kill so I wandered around the library finding memories in the stacks including the 1987 STOC Proceedings, home to my first conference paper, The complexity of perfect zero-knowledge. The paper might be best known for my upper bound protocol which is republished here in its entirety. 

That's how I wrote it nearly four decades ago, without proof, just an intuition for why it works. Those were the days. I did work out the full covariance argument in the journal version, though I missed other bugs in the proof.

The upper bound requires the verifier to have a random sample of the distribution unknown to the prover. Rahul Santhanam, who is hosting my visit to Oxford, asked if the converse was known. Goldreich, Vadhan and Wigderson, in the appendix of their Laconic Prover paper, show a sampling protocol based on the upper bound on the size of a set, though the sample is not completely unknown to the prover. Neat to revisit questions from my first conference paper. 

Oxford CS Librarian Aza Ballard-Whyte

By Lance Fortnow

Fully Funded PhD Position in Algorithms & Complexity at University of Birmingham (apply by February 28, 2026)

from CCI: jobs

Fully funded PhD (3.5 years) in Algorithms & Complexity at the University of Birmingham, School of Computer Science. Tuition fees covered, stipend and travel support included. Applicants should have a strong background in theoretical computer science (algorithms, complexity). Strong master's or outstanding bachelor's candidates are welcome. Start date: September 2026.

Website: https://www.birmingham.ac.uk/study/postgraduate/subjects/computer-science-and-data-science-courses/computer-science-phd
Email: s.mukhopadhyay@bham.ac.uk

By shacharlovett

Game-Theoretic and Algorithmic Analyses of Multi-Agent Routing under Crossing Costs

from arXiv: Computational Complexity

Authors: Tesshu Hanaka, Nikolaos Melissinos, Hirotaka Ono

Coordinating the movement of multiple autonomous agents over a shared network is a fundamental challenge in algorithmic robotics, intelligent transportation, and distributed systems. The dominant approach, Multi-Agent Path Finding, relies on centralized control and synchronous collision avoidance, which often requires strict synchronization and guarantees of globally conflict-free execution. This paper introduces the Multi-Agent Routing under Crossing Cost model on mixed graphs, a novel framework tailored to asynchronous settings. In our model, instead of treating conflicts as hard constraints, each agent is assigned a path, and the system is evaluated through a cost function that measures potential head-on encounters. This "crossing cost", which is defined as the product of the numbers of agents traversing an edge in opposite directions, quantifies the risk of congestion and delay in decentralized execution. Our contributions are both game-theoretic and algorithmic. We model the setting as a congestion game with a non-standard cost function, prove the existence of pure Nash equilibria, and analyze the dynamics leading to them. Equilibria can be found in polynomial time under mild conditions, while the general case is PLS-complete. From an optimization perspective, minimizing the total crossing cost is NP-hard, as the problem generalizes Steiner Orientation. To address this hardness barrier, we design a suite of parameterized algorithms for minimizing crossing cost, with parameters including the number of arcs, edges, agents, and structural graph measures. These yield XP or FPT results depending on the parameter, offering algorithmic strategies for structurally restricted instances. Our framework provides a new theoretical foundation for decentralized multi-agent routing, bridging equilibrium analysis and parameterized complexity to support scalable and risk-aware coordination.
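
The crossing cost of a given assignment follows directly from the definition: for each undirected edge, multiply the number of agents traversing it in one direction by the number traversing it in the other, and sum over edges. A small sketch, with the path encoding chosen for illustration:

```python
from collections import Counter

def crossing_cost(paths):
    """Total crossing cost of a path assignment: for every undirected edge,
    multiply the number of agents traversing it in one direction by the
    number traversing it in the opposite direction, and sum over edges.
    paths: one list of vertices per agent."""
    flow = Counter()                     # directed edge -> number of agents
    for path in paths:
        for u, v in zip(path, path[1:]):
            flow[(u, v)] += 1
    seen, cost = set(), 0
    for (u, v), f in flow.items():
        e = frozenset((u, v))
        if e in seen:
            continue
        seen.add(e)
        cost += f * flow.get((v, u), 0)
    return cost

# Two agents traversing the same three-edge path in opposite directions
# meet head-on on every edge, so the cost of this pair is 3.
assert crossing_cost([["a", "b", "c", "d"], ["d", "c", "b", "a"]]) == 3
```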

An Algorithm for Monitoring Edge-geodetic Sets in Chordal Graphs

from arXiv: Computational Complexity

Authors: Nacim Oijid, Clara Marcille

A monitoring edge-geodetic set (or meg-set for short) of a graph is a set of vertices $M$ such that if any edge is removed, then the distance between some two vertices of $M$ increases. This notion was introduced by Foucaud et al. in 2023 as a way to monitor networks for communication failures. As computing a minimal meg-set is hard in general, recent works aimed to find polynomial-time algorithms to compute minimal meg-sets when the input belongs to a restricted class of graphs. Most of these results are based on the property of some classes of graphs to admit a unique minimum meg-set, which is then easy to compute. In this work, we prove that chordal graphs also admit a unique minimal meg-set, answering a standing open question of Foucaud et al.
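
The definition can be verified directly in polynomial time: delete each edge in turn and check whether some pair of vertices of $M$ ends up strictly farther apart. The brute-force sketch below (for simple undirected graphs, treating disconnection as an increased distance) is only a verifier, not the algorithm of the paper.

```python
from collections import deque

def bfs_dist(adj, src):
    """Single-source BFS distances; adj: dict vertex -> set of neighbours."""
    dist = {src: 0}
    q = deque([src])
    while q:
        u = q.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                q.append(v)
    return dist

def is_meg_set(adj, M):
    """Check the definition directly: M is a monitoring edge-geodetic set iff
    deleting any single edge strictly increases the distance between some
    pair of vertices of M. Runs in O(|E| * |M| * (|V| + |E|)) time."""
    inf = float("inf")
    base = {u: bfs_dist(adj, u) for u in M}
    edges = {frozenset((u, v)) for u in adj for v in adj[u]}
    for e in edges:
        u, v = tuple(e)
        adj[u].discard(v); adj[v].discard(u)            # delete the edge
        monitored = False
        for a in M:
            da = bfs_dist(adj, a)
            if any(da.get(b, inf) > base[a].get(b, inf) for b in M if b != a):
                monitored = True
                break
        adj[u].add(v); adj[v].add(u)                     # restore the edge
        if not monitored:
            return False
    return True
```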

Obstruction theory and the complexity of counting group homomorphisms

from arXiv: Computational Complexity

Authors: Eric Samperton, Armin Weiß

Fix a finite group $G$. We study the computational complexity of counting problems of the following flavor: given a group $\Gamma$, count the number of homomorphisms $\Gamma \to G$. Our first result establishes that this problem is $\#\mathsf{P}$-hard whenever $G$ is a non-abelian group and $\Gamma$ is provided via a finite presentation. We give several improvements showing that this hardness conclusion continues to hold for restricted $\Gamma$ satisfying various promises. Our second result, in contrast, shows that if $G$ is class 2 nilpotent and $\Gamma = \pi_1(M^3)$ for some input 3-manifold triangulation $M^3$, then there is a polynomial time algorithm. The difference in complexity is explained by the fact that 3-manifolds are close enough to being Eilenberg-MacLane spaces for us to be able to solve the necessary group cohomological obstruction problems efficiently using the given triangulation. A similar polynomial time algorithm for counting maps to finite, class 2 nilpotent $G$ exists when $\Gamma$ is itself a finite group encoded via a multiplication table.
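
To make the counting problem concrete: a homomorphism from a finitely presented group $\langle x_1,\dots,x_k \mid r_1,\dots,r_m \rangle$ to $G$ is a choice of images for the generators under which every relator maps to the identity. The brute-force sketch below, exponential in $k$ and therefore nothing like the paper's algorithms, counts homomorphisms into a finite permutation group; the encoding of relators as signed index words is an illustrative assumption.

```python
from itertools import permutations, product

def compose(p, q):
    """(p*q)(x) = p(q(x)); permutations are tuples over range(n)."""
    return tuple(p[q[x]] for x in range(len(p)))

def inverse(p):
    inv = [0] * len(p)
    for x, px in enumerate(p):
        inv[px] = x
    return tuple(inv)

def count_homs(num_gens, relators, group):
    """Brute-force count of homomorphisms from the finitely presented group
    <x_1,...,x_{num_gens} | relators> into the finite permutation group
    `group` (a list of permutation tuples closed under composition).
    A relator is a word of generator indices, negative for inverses:
    [1, 2, -1, -2] encodes the commutator x_1 x_2 x_1^{-1} x_2^{-1}.
    Runs in time |group|^num_gens; illustration only."""
    identity = tuple(range(len(group[0])))
    count = 0
    for images in product(group, repeat=num_gens):
        ok = True
        for rel in relators:
            w = identity
            for s in rel:
                g = images[abs(s) - 1]
                w = compose(w, g if s > 0 else inverse(g))
            if w != identity:
                ok = False
                break
        if ok:
            count += 1
    return count

# Homomorphisms Z^2 -> S_3 correspond to commuting pairs; there are 18.
S3 = [tuple(p) for p in permutations(range(3))]
assert count_homs(2, [[1, 2, -1, -2]], S3) == 18
```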

Point Vortex Dynamics on Closed Surfaces

from arXiv: Computational Geometry

Authors: Marcel Padilla

The theory of point vortex dynamics has existed since Kirchhoff's proposal in 1891 and is still under development, with connections to many fields in mathematics. As a strong simplification of the concept of vorticity, it excels in computational speed for vorticity-based fluid simulations at the cost of accuracy. Recent findings by Stefanella Boatto and Jair Koiller allowed the extension of this theory onto closed surfaces. A comprehensive guide to point vortex dynamics on closed surfaces with genus zero and vanishing total vorticity is presented here. Additionally, fundamental knowledge of fluid dynamics and surfaces is explained in a way that unifies the theory of point vortex dynamics of the plane, the sphere and closed surfaces, together with implementation details and supplementary material.
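
For orientation, the planar case that the paper generalises is governed by Kirchhoff's equations: each vortex is advected by the velocity field induced by all the others. A minimal NumPy sketch with a classical Runge-Kutta step, for the plane rather than a closed surface:

```python
import numpy as np

def vortex_rhs(x, y, gamma):
    """Kirchhoff's planar point-vortex equations: the velocity of vortex i is
    the sum of the rotational fields induced by the other vortices."""
    n = len(gamma)
    dx, dy = np.zeros(n), np.zeros(n)
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            rx, ry = x[i] - x[j], y[i] - y[j]
            r2 = rx * rx + ry * ry
            dx[i] += -gamma[j] * ry / (2 * np.pi * r2)
            dy[i] += gamma[j] * rx / (2 * np.pi * r2)
    return dx, dy

def step_rk4(x, y, gamma, dt):
    """One classical fourth-order Runge-Kutta step of the vortex positions."""
    k1x, k1y = vortex_rhs(x, y, gamma)
    k2x, k2y = vortex_rhs(x + dt / 2 * k1x, y + dt / 2 * k1y, gamma)
    k3x, k3y = vortex_rhs(x + dt / 2 * k2x, y + dt / 2 * k2y, gamma)
    k4x, k4y = vortex_rhs(x + dt * k3x, y + dt * k3y, gamma)
    return (x + dt / 6 * (k1x + 2 * k2x + 2 * k3x + k4x),
            y + dt / 6 * (k1y + 2 * k2y + 2 * k3y + k4y))

# Example: two equal vortices orbit their common centre.
x, y, gamma = np.array([-1.0, 1.0]), np.array([0.0, 0.0]), np.array([1.0, 1.0])
for _ in range(100):
    x, y = step_rk4(x, y, gamma, dt=0.01)
```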

Perfect Network Resilience in Polynomial Time

from arXiv: Data Structures and Algorithms

Authors: Matthias Bentert, Stefan Schmid

Modern communication networks support local fast rerouting mechanisms to quickly react to link failures: nodes store a set of conditional rerouting rules which define how to forward an incoming packet in case of incident link failures. The rerouting decisions at any node $v$ must rely solely on local information available at $v$: the link from which a packet arrived at $v$, the target of the packet, and the incident link failures at $v$. Ideally, such rerouting mechanisms provide perfect resilience: any packet is routed from its source to its target as long as the two are connected in the underlying graph after the link failures. Already in their seminal paper at ACM PODC '12, Feigenbaum, Godfrey, Panda, Schapira, Shenker, and Singla showed that perfect resilience cannot always be achieved. While the design of local rerouting algorithms has received much attention since then, we still lack a detailed understanding of when perfect resilience is achievable. This paper closes this gap and presents a complete characterization of when perfect resilience can be achieved. This characterization also allows us to design an $O(n)$-time algorithm to decide whether a given instance is perfectly resilient and an $O(nm)$-time algorithm to compute perfectly resilient rerouting rules whenever it is. Our algorithm is also attractive for the simple structure of the rerouting rules it uses, known as skipping in the literature: alternative links are chosen according to an ordered priority list (per in-port), where failed links are simply skipped. Intriguingly, our result also implies that in the context of perfect resilience, skipping rerouting rules are as powerful as more general rerouting rules. This partially answers a long-standing open question by Chiesa, Nikolaevskiy, Mitrovic, Gurtov, Madry, Schapira, and Shenker [IEEE/ACM Transactions on Networking, 2017] in the affirmative.
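
The "skipping" rule family highlighted above is easy to simulate: at each node the packet tries the out-links in a fixed priority order associated with its in-port and simply skips failed links. The sketch below uses an illustrative priority-list encoding, not the paper's data structures.

```python
def forward(prio, failed, src, dst, max_hops=10_000):
    """Simulate skipping-style local rerouting: at each node, out-neighbours
    are tried in a fixed priority order that depends only on the in-port
    (the node the packet came from, or None at the source); links in the
    failed set are simply skipped.
    prio: dict (node, in_port) -> list of neighbours in priority order
    failed: set of frozenset({u, v}) failed links
    Returns the node sequence if dst is reached, else None."""
    cur, inport, path = src, None, [src]
    for _ in range(max_hops):
        if cur == dst:
            return path
        nxt = None
        for cand in prio.get((cur, inport), []):
            if frozenset((cur, cand)) not in failed:
                nxt = cand
                break
        if nxt is None:
            return None           # every allowed out-link has failed
        inport, cur = cur, nxt
        path.append(cur)
    return None                    # hop budget exhausted (forwarding loop)

# Triangle a-b-c: the packet from a to c prefers the direct link but, if
# {a, c} fails, the priority list makes it detour via b.
prio = {("a", None): ["c", "b"], ("b", "a"): ["c", "a"]}
assert forward(prio, {frozenset(("a", "c"))}, "a", "c") == ["a", "b", "c"]
```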

Quantum Speedups for Derivative Pricing Beyond Black-Scholes

from arXiv: Data Structures and Algorithms

Authors: Dylan Herman, Yue Sun, Jin-Peng Liu, Marco Pistoia, Charlie Che, Rob Otter, Shouvanik Chakrabarti, Aram Harrow

This paper explores advancements in quantum algorithms for derivative pricing of exotics, a computational pipeline of fundamental importance in quantitative finance. For such cases, the classical Monte Carlo integration procedure provides the state-of-the-art provable, asymptotic performance: polynomial in problem dimension and quadratic in inverse-precision. While quantum algorithms are known to offer quadratic speedups over classical Monte Carlo methods, end-to-end speedups have been proven only in the simplified setting over the Black-Scholes geometric Brownian motion (GBM) model. This paper extends existing frameworks to demonstrate novel quadratic speedups for more practical models, such as the Cox-Ingersoll-Ross (CIR) model and a variant of Heston's stochastic volatility model, utilizing a characteristic of the underlying SDEs which we term fast-forwardability. Additionally, for general models that do not possess the fast-forwardable property, we introduce a quantum Milstein sampler, based on a novel quantum algorithm for sampling Lévy areas, which enables quantum multi-level Monte Carlo to achieve quadratic speedups for multi-dimensional stochastic processes exhibiting certain correlation types. We also present an improved analysis of numerical integration for derivative pricing, leading to substantial reductions in the resource requirements for pricing GBM and CIR models. Furthermore, we investigate the potential for additional reductions using arithmetic-free quantum procedures. Finally, we critique quantum partial differential equation (PDE) solvers as a method for derivative pricing based on amplitude estimation, identifying theoretical barriers that obstruct achieving a quantum speedup through this approach. Our findings significantly advance the understanding of quantum algorithms in derivative pricing, addressing key challenges and open questions in the field.
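
For contrast with the quantum methods discussed above, the classical Monte Carlo baseline for the simplest setting, a European call under Black-Scholes GBM, takes a few lines of NumPy; its statistical error shrinks like $1/\sqrt{N}$ in the number of paths, whereas amplitude estimation targets the quadratically better $1/N$ rate. The parameters in the usage note are arbitrary.

```python
import numpy as np

def mc_european_call(s0, strike, r, sigma, T, n_paths=100_000, seed=0):
    """Classical Monte Carlo pricing of a European call under Black-Scholes
    GBM: simulate terminal prices and average the discounted payoffs."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(n_paths)
    sT = s0 * np.exp((r - 0.5 * sigma ** 2) * T + sigma * np.sqrt(T) * z)
    payoff = np.maximum(sT - strike, 0.0)
    return np.exp(-r * T) * payoff.mean()

# mc_european_call(100, 105, 0.02, 0.2, 1.0) is close to the Black-Scholes
# closed-form value (roughly 6.7 for these parameters).
```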

ZOR filters: fast and smaller than fuse filters

from arXiv: Data Structures and Algorithms

Authors: Antoine Limasset

Probabilistic membership filters support fast approximate membership queries with a controlled false-positive probability $\varepsilon$ and are widely used across storage, analytics, networking, and bioinformatics [chang2008bigtable, dayan2018optimalbloom, broder2004network, harris2020improved, marchet2023scalable, chikhi2025logan, hernandez2025reindeer2]. In the static setting, state-of-the-art designs such as XOR and fuse filters achieve low overhead and very fast queries, but their peeling-based construction succeeds only with high probability, which complicates deterministic builds [graf2020xor, graf2022binary, ulrich2023taxor]. We introduce *ZOR filters*, a deterministic continuation of XOR/fuse filters that guarantees construction termination while preserving the same XOR-based query mechanism. ZOR replaces restart-on-failure with deterministic peeling that abandons a small fraction of keys, and restores false-positive-only semantics by storing the remainder in a compact auxiliary structure. In our experiments, the abandoned fraction drops below $1\%$ for moderate arity (e.g., $N\ge 5$), so the auxiliary handles a negligible fraction of keys. As a result, ZOR filters can achieve overhead within $1\%$ of the information-theoretic lower bound $\log_2(1/\varepsilon)$ while retaining fuse-like query performance; the additional cost is concentrated on negative queries due to the auxiliary check. Our current prototype builds several-fold slower than highly optimized fuse builders because it maintains explicit incidence information during deterministic peeling; closing this optimisation gap is an engineering target.
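
A toy sketch of the construction idea: standard XOR-filter peeling, except that keys whose slots never become singletons are abandoned into an exact auxiliary set instead of triggering a rebuild, so construction always terminates and only non-members can yield false positives. The hash and fingerprint choices are illustrative, not those of the paper's implementation.

```python
import hashlib

def _hashes(key, seed, m):
    """Three slot positions (one per segment) and a nonzero 8-bit fingerprint,
    all derived from one keyed blake2b digest (an illustrative choice)."""
    h = hashlib.blake2b(key.encode(), key=seed.to_bytes(8, "little")).digest()
    seg = m // 3
    p = (int.from_bytes(h[0:8], "little") % seg,
         seg + int.from_bytes(h[8:16], "little") % seg,
         2 * seg + int.from_bytes(h[16:24], "little") % seg)
    return p, h[24] | 1

def build(keys, m, seed=0):
    """Peeling with deterministic abandonment: keys are peeled whenever one of
    their slots becomes a singleton; keys that never do go into an exact
    auxiliary set instead of forcing a rebuild. Use m around 1.25 * len(keys)."""
    slots = [[] for _ in range(m)]
    info = []
    for i, k in enumerate(keys):
        p, fp = _hashes(k, seed, m)
        info.append((p, fp))
        for j in p:
            slots[j].append(i)
    stack, alive = [], set(range(len(keys)))
    queue = [j for j in range(m) if len(slots[j]) == 1]
    while queue:
        j = queue.pop()
        if len(slots[j]) != 1:
            continue
        i = slots[j][0]
        stack.append((i, j))
        alive.discard(i)
        for jj in info[i][0]:
            slots[jj].remove(i)
            if len(slots[jj]) == 1:
                queue.append(jj)
    B = [0] * m
    for i, j in reversed(stack):                 # assign fingerprints backwards
        p, fp = info[i]
        B[j] = fp ^ B[p[0]] ^ B[p[1]] ^ B[p[2]]  # B[j] is still 0 at this point
    aux = {keys[i] for i in alive}               # abandoned keys, stored exactly
    return B, aux, seed

def query(key, B, aux, seed):
    """Membership test: exact for abandoned keys, XOR probe otherwise."""
    if key in aux:
        return True
    p, fp = _hashes(key, seed, len(B))
    return (B[p[0]] ^ B[p[1]] ^ B[p[2]]) == fp

keys = [f"key{i}" for i in range(1000)]
B, aux, seed = build(keys, m=1300)
assert all(query(k, B, aux, seed) for k in keys)   # no false negatives
```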

Exploiting Multi-Core Parallelism in Blockchain Validation and Construction

from arXiv: Data Structures and Algorithms

Authors: Arivarasan Karmegam, Lucianna Kiffer, Antonio Fernández Anta

Blockchain validators can reduce block processing time by exploiting multi-core CPUs, but deterministic execution must preserve a given total order while respecting transaction conflicts and per-block runtime limits. This paper systematically examines how validators can exploit multi-core parallelism during both block construction and execution without violating blockchain semantics. We formalize two validator-side optimization problems: (i) executing an already ordered block on \(p\) cores to minimize makespan while ensuring equivalence to sequential execution; and (ii) selecting and scheduling a subset of mempool transactions under a runtime limit \(B\) to maximize validator reward. For both, we develop exact Mixed-Integer Linear Programming (MILP) formulations that capture conflict, order, and capacity constraints, and propose fast deterministic heuristics that scale to realistic workloads. Using Ethereum mainnet traces and including a Solana-inspired declared-access baseline (Sol) for ordered-block scheduling and a simple reward-greedy baseline (RG) for block construction, we empirically quantify the trade-offs between optimality and runtime.
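
As a rough illustration of problem (i), the sketch below (my own simplification, not the paper's MILP or heuristics) list-schedules an already ordered block on $p$ cores: each transaction may start only after every earlier transaction it conflicts with has finished, on the earliest-available core, which keeps the outcome equivalent to sequential execution.

```python
import heapq

def schedule_ordered_block(durations, conflicts, p):
    """Greedy list scheduling of an ordered block on p cores.

    durations: execution times of the transactions, in the block's total order.
    conflicts: set of pairs (i, j) with i < j that touch common state and
               therefore must be executed in order.
    Returns (makespan, start_times).
    """
    n = len(durations)
    finish = [0.0] * n
    start = [0.0] * n
    cores = [0.0] * p          # next free time of each core (min-heap)
    heapq.heapify(cores)
    for j in range(n):
        # Earliest admissible start: after every earlier conflicting transaction.
        ready = max((finish[i] for i in range(j) if (i, j) in conflicts), default=0.0)
        core_free = heapq.heappop(cores)
        start[j] = max(ready, core_free)
        finish[j] = start[j] + durations[j]
        heapq.heappush(cores, finish[j])
    return max(finish, default=0.0), start

# Tiny example: transactions 0 and 2 conflict, so 2 waits for 0 to finish.
makespan, starts = schedule_ordered_block([3, 2, 4], {(0, 2)}, p=2)
print(makespan, starts)
```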

Vigemers: on the number of $k$-mers sharing the same XOR-based minimizer

from arXiv: Data Structures and Algorithms

Authors: Florian Ingels, Antoine Limasset, Camille Marchet, Mikaël Salson

In bioinformatics, minimizers have become an inescapable method for handling $k$-mers (words of fixed size $k$) extracted from DNA or RNA sequencing, whether for sampling, storage, querying or partitioning. According to some fixed order on $m$-mers ($m
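
For readers unfamiliar with minimizers, here is a small illustrative sketch of my own; the splitmix64-style mixing hash is only a stand-in for the paper's XOR-based order on $m$-mers. For each $k$-mer it returns the $m$-mer with the smallest hash value among its $k-m+1$ windows.

```python
def mix64(x: int) -> int:
    """A 64-bit mixing hash (splitmix64 finalizer); a stand-in order on m-mers."""
    x = (x + 0x9E3779B97F4A7C15) & 0xFFFFFFFFFFFFFFFF
    x = ((x ^ (x >> 30)) * 0xBF58476D1CE4E5B9) & 0xFFFFFFFFFFFFFFFF
    x = ((x ^ (x >> 27)) * 0x94D049BB133111EB) & 0xFFFFFFFFFFFFFFFF
    return x ^ (x >> 31)

ENC = {"A": 0, "C": 1, "G": 2, "T": 3}

def encode(s: str) -> int:
    """Pack a DNA string into an integer, two bits per base."""
    v = 0
    for c in s:
        v = (v << 2) | ENC[c]
    return v

def minimizer(kmer: str, m: int) -> str:
    """Return the m-mer of the k-mer with the smallest hash value."""
    return min((kmer[i:i + m] for i in range(len(kmer) - m + 1)),
               key=lambda mm: mix64(encode(mm)))

seq = "ACGTACGGTACC"
k, m = 8, 3
for i in range(len(seq) - k + 1):
    km = seq[i:i + k]
    print(km, minimizer(km, m))
```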

Fast-MWEM: Private Data Release in Sublinear Time

from arXiv: Data Structures and Algorithms

Authors: Themistoklis Haris, Steve Choi, Mutiraj Laksanawisit

The Multiplicative Weights Exponential Mechanism (MWEM) is a fundamental iterative framework for private data analysis, with broad applications such as answering $m$ linear queries, or privately solving systems of $m$ linear constraints. However, a critical bottleneck hindering its scalability is the $Θ(m)$ time complexity required to execute the exponential mechanism in each iteration. We introduce a modification to the MWEM framework that improves the per-iteration runtime dependency to $Θ(\sqrt{m})$ in expectation. This is done via a lazy sampling approach to the Report-Noisy-Max mechanism, which we implement efficiently using Gumbel noise and a $k$-Nearest Neighbor data structure. This allows for the rapid selection of the approximate score in the exponential mechanism without an exhaustive linear scan. We apply our accelerated framework to the problems of private linear query release and solving Linear Programs (LPs) under neighboring constraint conditions and low-sensitivity assumptions. Experimental evaluation confirms that our method provides a substantial runtime improvement over classic MWEM.
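
The Gumbel-noise view of selection used above is the standard Gumbel-max trick: adding i.i.d. Gumbel noise of scale $2\Delta/\varepsilon$ to the scores and reporting the argmax samples exactly from the exponential mechanism's distribution. The sketch below shows only this selection step; the paper's $k$-nearest-neighbor lazy sampling, which avoids touching all $m$ scores, is not reproduced here.

```python
import numpy as np

def exponential_mechanism_gumbel(scores, epsilon, sensitivity=1.0, rng=None):
    """Sample an index via the exponential mechanism using the Gumbel-max trick.

    Adding i.i.d. Gumbel noise of scale 2*sensitivity/epsilon to each score and
    taking the argmax selects index i with probability proportional to
    exp(epsilon * scores[i] / (2 * sensitivity)).
    """
    rng = rng or np.random.default_rng()
    scale = 2.0 * sensitivity / epsilon
    noisy = np.asarray(scores, dtype=float) + rng.gumbel(loc=0.0, scale=scale, size=len(scores))
    return int(np.argmax(noisy))

# Tiny demo: index 2 has the largest score and is selected most often.
rng = np.random.default_rng(0)
scores = [0.1, 0.5, 3.0, 1.0]
picks = [exponential_mechanism_gumbel(scores, epsilon=1.0, rng=rng) for _ in range(10_000)]
print(np.bincount(picks, minlength=len(scores)) / len(picks))
```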

Tuesday, February 03

Updates and Plans V: From Boise to Tel Aviv, Ceasefire, My 70th Birthday, Nostalgia, Problems, Outrageous Conjectures, Quantum, and AI

from Gil Kalai

This is the fifth post of this type (I (2008); II(2011); III(2015); IV(2024)). Between Boise and Tel Aviv During the summer we spent two months in the lovely city of Boise, Idaho. We stayed with my son Hagai and his husband Felix, … Continue reading →

This is the fifth post of this type (I (2008); II(2011); III(2015); IV(2024)).

Between Boise and Tel Aviv

During the summer we spent two months in the lovely city of Boise, Idaho. We stayed with my son Hagai and his husband Felix, their one-and-a-half-year-old son Yonatan, and—one week after our arrival—their second son, Rafael, was born. I was visiting Boise State University and was hosted by Zach Teitler, whom I first met many years ago at Texas A&M.

Boise is a beautiful city with wonderful parks, and Mazi and I also devoted a week to visiting Yellowstone for the first time.

On the flight back with Hagai, Rafael, Mazi, and Yonatan

Ceasefire in Gaza

On September 29, 2025, US president Donald Trump put forward a 21-point plan for a ceasefire in Gaza, as part of a broader initiative toward peace in the Middle East. The plan was endorsed by many world leaders, including Arab and Muslim leaders, as well as by the Israeli government headed by Benjamin Netanyahu. On October 9, an agreement was reached between Israel and Hamas on a ceasefire, a partial withdrawal of Israeli forces, and the release of kidnapped Israelis. On October 13, all living kidnapped Israelis were released. By January 27, 2026, all bodies of Israelis were returned.

The end of this terrible war is certainly a source of relief and hope. Still, we are living in dangerous and tragic times, with much uncertainty.

My 70th Birthday

We landed back in Tel Aviv on October 1, the eve of Yom Kippur. The following day—somewhat to my surprise—was my 70th birthday. Two weeks later, on Simchat Torah (October 14), we gathered to celebrate the holiday, the birthday, the new family member, and the end of the war. Being 70 years old feels sort of strange. 

Nostalgia Corner and Congratulations to Benjy Weiss

“secretly jubilant” 

I recently found, in my sister’s home (where we both lived as children), a Jerusalem Post article from 1970 about the Israeli Mathematical Olympiad. In that competition I received a consolation prize, while my friend Ron Donagi received the first prize. Here is a quote from the article: “ ‘The prize is much more than I expected,’ stated an apparently indifferent yet secretly jubilant Gil.”

The reporter, Mark Daniel Sacks, also expressed his wish for similar encouragement for “those of us who are interested in literature, poetry, philosophy, and art.” I fully agree!

A few months earlier, in the fall of 1969, I began attending Benjy Weiss’s year-long mathematics course for high-school students, together with 20–25 other students, including Yehuda Agnon, Michael Ben-Or, Rami Grossberg (see comment below), Ehud Lehrer, Uzi Segal, Yonatan Stern, and Mike Werman. The course was an eye-opener for all of us.

It has just been announced that our teacher, Benjy Weiss, has won the 2026 Israel Prize in Mathematics. Heartfelt congratulations, Benjy!

Problems, Problems 

Over the years I have devoted quite a few posts—here, on other blogs, and on MathOverflow—to open problems. In 2013, at the Erdős Centennial conference, I gave a lecture on old and new problems, mainly in combinatorics and geometry (here are the slides), where I presented twenty problems that are also listed in this post. Since then, there has been substantial progress, and in some cases full solutions, for roughly 30% of them.  

I gradually plan, somewhat in Erdős’ tradition, to upgrade my problem posts and lectures into papers.

So far, in 2015 I wrote a paper around Borsuk’s problem. (Some of the problems appeared in these posts.) In 2022, Imre Barany and I wrote a survey article on Helly-type problems, which was nicely accepted.  I am currently writing a paper about the diameter problem for graphs of polytopes. We devoted many posts—and Polymath 3—to this problem, and I plan to submit the paper to the new and splendid journal JOMP: the Journal of Open Math Problems.

Geometric Combinatorics

There are many open problems that I like—and quite a few that I myself posed—concerning the combinatorial theory of convex polytopes, face numbers of polytopes, and related cellular objects. This post from 2008 lists five elementary problems, and since then one of them has been solved. One outstanding problem in the field that I like is whether triangulated spheres are determined by their dual graphs. This is known for simplicial polytopes (see this post from 2009) and was recently proved for all shellable simplicial spheres by Yirong Yang in her paper Reconstructing a Shellable Sphere from its Facet-Ridge Graph.

Let me mention two problems from other areas of combinatorial geometry.

Two triangles are called almost disjoint if they are either disjoint or their intersection consists of one common vertex. Let f(n) denote the maximum number of pairwise almost disjoint triangles that can be found on some vertex set of n points in 3-space. How large can f(n) be? It is easy to see that f(n) is at most quadratic in n, and the best lower bound, from 2002 by Károlyi and Solymosi, is f(n)=\Omega (n^{3/2}). There is related work from 2017 by Solymosi and Wong.

In 1995, Nati Linial and I conjectured that the kissing number for lattice sphere packings in \mathbb R^n is subexponential in n. The highest known kissing number behaves like n^{\log n}. Our problem was related to the question of finding upper bounds for the density of sphere packings in high dimension. Recent celebrated work of Klartag shows an intriguing connection between kissing numbers and lower bounds on sphere-packing density.

Analysis of Boolean Functions and Probabilistic Combinatorics

In a draft paper from 2000 (which I mostly distributed privately), I listed 18 interesting phenomena and 23 problems around these phenomena related to Boolean functions and their Fourier expansion. Since then there have been many developments in the analysis of Boolean functions. Here is a comprehensive list of open problems from 2014. One problem in the list was recently solved by GPT5. I myself posed quite a few problems in this area, but let me mention today the still open Aaronson-Ambainis conjecture from 2008: for every function f:\{-1,1\}^n\to [-1,1] of degree at most k, there exists a variable with influence at least (V(f)/k)^C, for some constant C. (Here V(f) stands for the variance of f.)

In probabilistic combinatorics, the “Kahn-Kalai conjecture” from our 2006 paper has been famously solved by Park and Pham, and a second conjecture about graphs was settled – up to a \log^2 n factor – by Dubroff, Kahn, and Park.

Jeff Kahn and I regarded the conjecture as outrageous—and likely false—but in that paper we formulated several specific conjectures (in the area of discrete isoperimetric inequalities) as part of a broader program for proving it. In spite of some substantial progress, these conjectures remain largely open, although a few have been refuted. One of those conjectures is presented in this MO post. In principle, the Kruskal-Katona theorem should suffice to settle this problem, but we still cannot solve it.

Extremal Combinatorics

One question I asked—independently also posed by Karen Meagher—concerned the independence numbers of intersection graphs of triangulations. This conjecture is still open and it admits a lovely generalization for a large class of polytopes. Recently, Anton Molnar, Cosmin Pohoata, Michael Zheng, and Daniel G. Zhu raised the question of finding the chromatic number of the intersection graphs of triangulations—and solved it! They showed that the Kneser graph of triangulations of a convex n-gon has chromatic number n-2. 

Computational Complexity and Number Theory

Around 2010, I formulated several conjectures relating computational complexity and number theory, which led to some very nice developments. Together with Mrinal Kumar and Ben Lee Volk, I plan to write a paper with further problems connecting algebraic circuit complexity and number theory.

Two Outrageous Conjectures

Here are two very outrageous conjectures that may well admit simple refutations. (Comments are welcome; the right thing to do would be to devote a separate post to each of them; stay tuned.)

The first outrageous conjecture is presented in this slide from a 2024 lecture. 

See also this MO question and this one.

The second vague and outrageous conjecture (already mentioned earlier in this post) is about computational complexity, and more precisely about Papadimitriou’s computational hierarchy for mathematical proofs. It asserts that theorems guaranteeing the existence (for sure, not just with high probability) of combinatorial structures, whose proofs are based on the probabilistic method, are accompanied by an efficient algorithm (possibly randomized) for finding these structures. (In other words, the probabilistic method does not lead to a new Papadimitriou class beyond P.)

Quantum Information and Quantum Physics

It is likely that the proportion of posts dealing with quantum computing and quantum physics will increase. So far, they account for about 8% of all posts since I began blogging. My interest in this area has branched into several related directions.

The Argument Against Quantum Computing

The direction closest to my heart is the argument against quantum computing. I have invested considerable effort in explaining and discussing my theory for why quantum computers are inherently impossible—through papers, lectures, debates, and blog posts. I try not to oversell the case, and I think that ultimately, experiments are likely to provide the clearest way to decide the matter.

Correlated Errors 

A related but distinct issue concerns the modeling of correlated errors, which was central in my research between 2005 and 2012, and more generally the behavior (and modeling) of noisy quantum systems that do not exhibit quantum fault tolerance. Here too, experiments and simulations can provide significant insight, and my (admittedly bold) conjectures about error correlations could be tested directly.

Statistical Analysis of Experimental Data

Another topic is the statistical analysis of current experimental data. My coauthors and I devoted substantial effort to analyzing Google’s 2019 experiment, and I believe more can be done to clarify and explain the findings of our papers. Our long-running project is closely related to developing statistical tools for analyzing quantum measurements and modeling noise. A recent paper on this topic by another group is: How much can we learn from quantum random circuit sampling? by Manole et al.

Quantum Puzzles

I also plan a series of posts devoted to quantum puzzles related to quantum information and computation. The first post concerned Majorana zero modes. Whether Majorana zero modes can in fact be created remains a major mystery in physics, and I personally suspect the answer may be negative. (As with “quantum supremacy,” their realization has been claimed by several research groups.) Planned follow-up posts will address quantum cryptography and the time–energy uncertainty principle.

Free Will

I plan to return to the fascinating connections between quantum physics, computation, and free will. I wrote a paper on this topic in 2021, and we discussed it in this blog post. Since then, I have participated in two conferences in Nazareth, in 2022 and 2024, devoted to free will (here are the videotaped lectures – in Hebrew). Following these conferences and my paper, I have had many stimulating discussions with colleagues from a wide variety of disciplines.

Is Quantum Computational Advantage Manifested by Nature? Has it been Achieved by Experiments?

This question lies at the heart of the matter and connects to all the topics above. In a recent lecture, Yosi Avron mentioned an argument—possibly going back to Feynman—that quantum physics in Nature already exhibits “quantum supremacy”: computing the magnetic moments of the proton or neutron from first principles is extraordinarily difficult and yields estimates far from experimental values, yet protons and neutrons “compute” their magnetic moments effortlessly. In the same lecture, delivered at a celebratory meeting for the 100th anniversary of quantum mechanics at the Open University in Ra’anana, Yosi also argued that no country can afford to lag behind in quantum computation, drawing an analogy with nuclear capabilities.

Computers, AI and Mathematics

Like many others, I plan to experiment with modern AI tools in the hope of using them for meaningful mathematical research. I am cautiously optimistic—perhaps naïve. Let’s see how it goes.

Pictures

Top row: Boise with Zach Teitler, Alexander Woo and Bruce Sagan’s classical book, and with local convex polytopes. Second row: sightseeing near Boise. Third, fourth, and fifth rows: Yellowstone. Sixth row: Yonatan in Boise. Seventh row: Mazi and I with Ilan and Yoav in Tel Aviv.

By Gil Kalai

All Downhill From Here

from Ben Recht

Lyapunov's two methods of stability analysis

This is a live blog of Lecture 2 of my graduate seminar “Feedback, Learning, and Adaptation.” A table of contents is here.

The analysis in the last post hinted that we can use calculus to analyze the local behavior of a homeostatic system around its setpoint. I wrote up the details in these lecture notes. As long as the first terms of the Taylor series provide a reasonable approximation of the equations that define the dynamical system, we can use linear algebra to reason about how a homeostatic system maintains its steady state.

The problem with this analysis by linear proxy is that we need to somehow account for the error in the approximation. Such bookkeeping tends to be much more annoying. Determining the region of state space under which a Taylor series is accurate always amounts to frustrating calculations. These calculations also tend to be highly tuned to the particulars of the differential structure of the model. If the model slightly changes, you have to start all over again and rederive new error bounds.

To get around this sort of linear proxy analysis, Lyapunov invented an alternative method, called his second method or his direct method (I got the direct and indirect methods confused yesterday). To avoid having to create a mnemonic for what direct and indirect mean, I’m going to switch to descriptive terms for Lyapunov’s two methods: the method of linearization and the method of potential functions.

The method of potential functions is inspired by physics. The goal is to define a notion of “energy” for any possible state, and then show that energy dissipates as the dynamics unravel into the future. Mathematically, the method seeks a function that maps states to positive scalars. This function should be large far from the fixed point. It should equal zero at the fixed point and only at the fixed point. And the function should decrease along the trajectories of the dynamical system. In other words, the function must take on a lower value at time t+1 than it held at time t. Such a function is called a potential function (also often called a Lyapunov function).

You can already see that this construction should verify convergence to the fixed point. If potential decreases at every time step but is always positive, it eventually has to get to zero. The only place where the potential is zero is the fixed point. Therefore, the system has to converge to the fixed point. You can make this as rigorous as you’d like, but I find the intuition here easier than thinking about linearizations.

Proofs using potential functions are easy. Finding potential functions is hard. It’s an interesting mathematical maneuver: we have a proof technique that always works as long as you produce a particular certificate (the potential function). We thus shift the burden of proof to finding and verifying that the certificate satisfies a list of desiderata. This turns proof into a constraint satisfaction problem, one that is amenable to computer search.

Let me give a simple case in linear systems that demonstrates how this logical transformation works. We’ll do much more interesting nonlinear cases in the next class.

Suppose we’d like to show all trajectories of a linear dynamical system

x_{t+1} = A x_t

converge to zero. From your first class on controls, you know that you can just compute the eigenvalues of A and make sure their magnitudes are all less than one. But let’s find a potential function that also certifies convergence. I need a family of functions that are positive everywhere except at the origin, where they are equal to zero. One simple family would be the strongly convex quadratics,

V(x) = x^T P x,

where P is a positive definite matrix with all eigenvalues greater than zero. If I want to show that the potential decreases along trajectories, I need

V(A x) = x^T A^T P A x < x^T P x = V(x)

for all x. This is equivalent to the matrix inequality

A^T P A - P ≺ 0.

I have reduced stability analysis to solving a system of linear matrix inequalities. The set of Lyapunov functions of this form is convex. And you can use techniques from convex optimization to search for the potential function.
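
Here is a minimal numerical sketch of such a certificate (my illustration, assuming NumPy and SciPy; not code from the lecture). For a fixed stable A, the discrete Lyapunov equation A^T P A - P = -Q can be solved directly, and checking that the resulting P is positive definite certifies convergence; searching over the whole convex set of valid P's, as described above, would instead use a semidefinite programming solver.

```python
import numpy as np
from scipy.linalg import solve_discrete_lyapunov

# A stable linear system x_{t+1} = A x_t (spectral radius < 1).
A = np.array([[0.5, 0.2],
              [-0.1, 0.8]])

# Solve A^T P A - P = -Q with Q = I. SciPy solves a X a^T - X + q = 0,
# so we pass A transposed to obtain the equation in the form we want.
Q = np.eye(2)
P = solve_discrete_lyapunov(A.T, Q)

print("P =\n", P)
print("P positive definite:", np.all(np.linalg.eigvalsh(P) > 0))
print("A^T P A - P negative definite:",
      np.all(np.linalg.eigvalsh(A.T @ P @ A - P) < 0))
```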

Now, as written so far, this seems to have turned an annoying linear algebra problem (computing eigenvalues) into an annoying convex optimization problem (semidefinite programming). Fair! But the potential function method is far more extensible. For example, suppose the system were uncertain and could evolve according to either A1 or A2 at any given time. Then you can try to find a potential function that certifies both matrices. If one exists, then the global system will be stable, even if it’s switching. The appeal of the potential function method is this sort of robustness. It lets us handle inaccurate or uncertain dynamics in ways that linearization doesn’t. In the next lecture, we’ll apply these ideas to PID controllers and draw some interesting connections between analyzing the most ubiquitous control policies and the most ubiquitous optimization methods.

By Ben Recht

TR26-012 | Perfectly Satisfiable Systems of Linear Equations and Fixed Weight Solutions | Johan Håstad

from ECCC Papers

We study systems of linear equations modulo two in $n$ variables with three variables in each equation. We assume that the system has a solution with $pn$ variables taking the value 1 for some value of $p$, and we show that for every $\delta>0$ it is hard to find a solution of the same weight that satisfies at least a fraction $c_p +\delta$ of the equations. The constant $c_p$ is upper bounded by $.9$ for any value of $p$.

A Provable Expressiveness Hierarchy in Hybrid Linear-Full Attention

from arXiv: Computational Complexity

Authors: Xiaowei Ye, Xiaoyu He, Chao Liao, Chen Wu, Pinyan Lu

Transformers serve as the foundation of most modern large language models. To mitigate the quadratic complexity of standard full attention, various efficient attention mechanisms, such as linear and hybrid attention, have been developed. A fundamental gap remains: their expressive power relative to full attention lacks a rigorous theoretical characterization. In this work, we theoretically characterize the performance differences among these attention mechanisms. Our theory applies to all linear attention variants that can be formulated as a recurrence, including Mamba, DeltaNet, etc. Specifically, we establish an expressiveness hierarchy: for the sequential function composition-a multi-step reasoning task that must occur within a model's forward pass, an ($L+1$)-layer full attention network is sufficient, whereas any hybrid network interleaving $L-1$ layers of full attention with a substantially larger number ($2^{3L^2}$) of linear attention layers cannot solve it. This result demonstrates a clear separation in expressive power between the two types of attention. Our work provides the first provable separation between hybrid attention and standard full attention, offering a theoretical perspective for understanding the fundamental capabilities and limitations of different attention mechanisms.
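
To make the separating task concrete, here is a tiny sketch of one natural reading of sequential function composition (my own illustration, not necessarily the paper's exact construction): given $L$ functions on a small domain and a start element, the answer is the $L$-fold composition applied to the start, which forces $L$ dependent lookups.

```python
import random

def make_composition_instance(domain_size: int, num_steps: int, seed: int = 0):
    """Sample L random functions f_1..f_L on {0, ..., domain_size-1} and a start point."""
    rng = random.Random(seed)
    fs = [[rng.randrange(domain_size) for _ in range(domain_size)]
          for _ in range(num_steps)]
    start = rng.randrange(domain_size)
    return fs, start

def compose(fs, start):
    """Answer f_L(...f_2(f_1(start))...), one sequential lookup per step."""
    x = start
    for f in fs:
        x = f[x]
    return x

fs, start = make_composition_instance(domain_size=8, num_steps=5)
print("target:", compose(fs, start))
```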

Breaking the Temporal Complexity Barrier: Bucket Calculus for Parallel Machine Scheduling

from arXiv: Computational Complexity

Authors: Noor Islam S. Mohammad

This paper introduces bucket calculus, a novel mathematical framework that fundamentally transforms the computational complexity landscape of parallel machine scheduling optimization. We address the strongly NP-hard problem $P2|r_j|C_{\max}$ through an innovative adaptive temporal discretization methodology that achieves exponential complexity reduction from $O(T^n)$ to $O(B^n)$ where $B \ll T$, while maintaining near-optimal solution quality. Our bucket-indexed mixed-integer linear programming (MILP) formulation exploits dimensional complexity heterogeneity through precision-aware discretization, reducing decision variables by 94.4\% and achieving a theoretical speedup factor $2.75 \times 10^{37}$ for 20-job instances. Theoretical contributions include partial discretization theory, fractional bucket calculus operators, and quantum-inspired mechanisms for temporal constraint modeling. Empirical validation on instances with 20--400 jobs demonstrates 97.6\% resource utilization, near-perfect load balancing ($σ/μ= 0.006$), and sustained performance across problem scales with optimality gaps below 5.1\%. This work represents a paradigm shift from fine-grained temporal discretization to multi-resolution precision allocation, bridging the fundamental gap between exact optimization and computational tractability for industrial-scale NP-hard scheduling problems.

On Condensation of Block Sensitivity, Certificate Complexity and the $\mathsf{AND}$ (and $\mathsf{OR}$) Decision Tree Complexity

from arXiv: Computational Complexity

Authors: Sai Soumya Nalli, Karthikeya Polisetty, Jayalal Sarma

Given an $n$-bit Boolean function with a complexity measure (such as block sensitivity, query complexity, etc.) $M(f) = k$, the hardness condensation question asks whether $f$ can be restricted to $O(k)$ variables such that the complexity measure is $Ω(k)$? In this work, we study the condensability of block sensitivity, certificate complexity, AND (and OR) query complexity and Fourier sparsity. We show that block sensitivity does not condense under restrictions, unlike sensitivity: there exists a Boolean function $f$ with query complexity $k$ such that any restriction of $f$ to $O(k)$ variables has block sensitivity $O(k^{\frac{2}{3}})$. This answers an open question in Göös, Newman, Riazanov, and Sokolov (2024) in the negative. The same function yields an analogous incondensable result for certificate complexity. We further show that $\mathsf{AND}$(and $\mathsf{OR}$) decision trees are also incondensable. In contrast, we prove that Fourier sparsity admits a weak form of condensation.
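
For readers less familiar with these measures, the brute-force sketch below (illustrative only, exponential time) computes block sensitivity directly from its definition: at an input $x$, the maximum number of pairwise disjoint blocks of coordinates whose flipping changes the function value, maximized over all inputs.

```python
from itertools import combinations, product

def flip(x, block):
    """Flip the coordinates of x indexed by `block`."""
    return tuple(1 - b if i in block else b for i, b in enumerate(x))

def bs_at(f, x, avail):
    """Max number of pairwise disjoint sensitive blocks inside `avail` (brute force)."""
    best = 0
    avail = tuple(avail)
    for size in range(1, len(avail) + 1):
        for block in combinations(avail, size):
            if f(flip(x, set(block))) != f(x):
                rest = [i for i in avail if i not in block]
                best = max(best, 1 + bs_at(f, x, rest))
    return best

def block_sensitivity(f, n):
    return max(bs_at(f, x, range(n)) for x in product((0, 1), repeat=n))

# Toy example: OR on 4 bits has block sensitivity 4 (at the all-zero input,
# every singleton coordinate is a sensitive block).
print(block_sensitivity(lambda x: int(any(x)), 4))
```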

The complexity of finding coset-generating polymorphisms and the promise metaproblem

from arXiv: Computational Complexity

Authors: Manuel Bodirsky, Armin Weiß

We show that the metaproblem for coset-generating polymorphisms is NP-complete, answering a question of Chen and Larose: given a finite structure, the computational question is whether this structure has a polymorphism of the form $(x,y,z) \mapsto x y^{-1} z$ with respect to some group; such operations are also called coset-generating, or heaps. Furthermore, we introduce a promise version of the metaproblem, parametrised by two polymorphism conditions $Σ_1$ and $Σ_2$ and defined analogously to the promise constraint satisfaction problem. We give sufficient conditions under which the promise metaproblem for $(Σ_1,Σ_2)$ is in P and under which it is NP-hard. In particular, the promise metaproblem is in P if $Σ_1$ states the existence of a Maltsev polymorphism and $Σ_2$ states the existence of an abelian heap polymorphism -- despite the fact that neither the metaproblem for $Σ_1$ nor the metaproblem for $Σ_2$ is known to be in P. We also show that the creation-metaproblem for Maltsev polymorphisms, under the promise that a heap polymorphism exists, is in P if and only if there is a uniform polynomial-time algorithm for CSPs with a heap polymorphism.
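
As a concrete illustration of the notion being decided (not the paper's algorithm), the sketch below checks by brute force whether the coset-generating operation $(x,y,z) \mapsto x - y + z$ of the abelian group $\mathbb{Z}_k$ preserves a given relation, i.e. whether applying it coordinate-wise to tuples of the relation always yields another tuple of the relation.

```python
from itertools import product

def is_heap_polymorphism(relation, k):
    """Check whether p(x, y, z) = x - y + z over Z_k preserves `relation`.

    relation: a set of equal-length tuples with entries in {0, ..., k-1}.
    """
    arity = len(next(iter(relation)))
    for a, b, c in product(relation, repeat=3):
        image = tuple((a[i] - b[i] + c[i]) % k for i in range(arity))
        if image not in relation:
            return False
    return True

# The solution set of x + y = 1 (mod 3) is preserved by x - y + z over Z_3 ...
affine = {(x, y) for x in range(3) for y in range(3) if (x + y) % 3 == 1}
print(is_heap_polymorphism(affine, 3))   # True

# ... while an arbitrary non-affine relation typically is not.
print(is_heap_polymorphism({(0, 0), (1, 2), (2, 2)}, 3))  # False for this instance
```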

Hardness Condensation for Decision Tree Measures by Restrictions

from arXiv: Computational Complexity

Authors: Chandrima Kayal, Rajat Mittal, Manaswi Paraashar, Nitin Saurabh

For any Boolean function $f:\{0,1\}^n \to \{0,1\}$ with a complexity measure having value $k \ll n$, is it possible to restrict the function $f$ to $Θ(k)$ variables while keeping the complexity preserved at $Θ(k)$? This question, in the context of query complexity, was recently studied by G{ö}{ö}s, Newman, Riazanov and Sokolov (STOC 2024). They showed, among other results, that query complexity can not be condensed losslessly. They asked if complexity measures like block sensitivity or unambiguous certificate complexity can be condensed losslessly? In this work, we show that decision tree measures like block sensitivity and certificate complexity, cannot be condensed losslessly. That is, there exists a Boolean function $f$ such that any restriction of $f$ to $O(\mathcal{M}(f))$ variables has $\mathcal{M}(\cdot)$-complexity at most $\tilde{O}(\mathcal{M}(f)^{2/3})$, where $\mathcal{M} \in \{\mathsf{bs},\mathsf{fbs},\mathsf{C},\mathsf{D}\}$. This also improves upon a result of G{ö}{ö}s, Newman, Riazanov and Sokolov (STOC 2024). We also complement the negative results on lossless condensation with positive results about lossy condensation. In particular, we show that for every Boolean function $f$ there exists a restriction of $f$ to $O(\mathcal{M}(f))$ variables such that its $\mathcal{M}(\cdot)$-complexity is at least $Ω(\mathcal{M}(f)^{1/2})$, where $\mathcal{M} \in \{\mathsf{bs},\mathsf{fbs},\mathsf{C},\mathsf{UC}_{min},\mathsf{UC}_1,\mathsf{UC},\mathsf{D},\widetilde{\mathsf{deg}},λ\}$. We also show a slightly weaker positive result for randomized and quantum query complexity.

Entanglement-Dependent Error Bounds for Hamiltonian Simulation

from arXiv: Computational Complexity

Authors: Prateek P. Kulkarni

We establish tight connections between entanglement entropy and the approximation error in Trotter-Suzuki product formulas for Hamiltonian simulation. Product formulas remain the workhorse of quantum simulation on near-term devices, yet standard error analyses yield worst-case bounds that can vastly overestimate the resources required for structured problems. For systems governed by geometrically local Hamiltonians with maximum entanglement entropy $S_\text{max}$ across all bipartitions, we prove that the first-order Trotter error scales as $\mathcal{O}(t^2 S_\text{max} \operatorname{polylog}(n)/r)$ rather than the worst-case $\mathcal{O}(t^2 n/r)$, where $n$ is the system size and $r$ is the number of Trotter steps. This yields improvements of $\tildeΩ(n^2)$ for one-dimensional area-law systems and $\tildeΩ(n^{3/2})$ for two-dimensional systems. We extend these bounds to higher-order Suzuki formulas, where the improvement factor involves $2^{pS^*/2}$ for the $p$-th order formula. We further establish a separation result demonstrating that volume-law entangled systems fundamentally require $\tildeΩ(n)$ more Trotter steps than area-law systems to achieve the same precision. This separation is tight up to logarithmic factors. Our analysis combines Lieb-Robinson bounds for locality, tensor network representations for entanglement structure, and novel commutator-entropy inequalities that bound the expectation value of nested commutators by the Schmidt rank of the state. These results have immediate applications to quantum chemistry, condensed matter simulation, and resource estimation for fault-tolerant quantum computing.
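
As a quick numerical illustration of the quantity being bounded (generic, not the paper's entanglement-dependent setup), the sketch below measures the first-order Trotter error $\|e^{-i(A+B)t} - (e^{-iAt/r}e^{-iBt/r})^r\|$ for two random Hermitian matrices and increasing step counts $r$; the error should shrink roughly like $1/r$.

```python
import numpy as np
from scipy.linalg import expm
from numpy.linalg import matrix_power, norm

rng = np.random.default_rng(1)

def random_hermitian(d):
    m = rng.standard_normal((d, d)) + 1j * rng.standard_normal((d, d))
    return (m + m.conj().T) / 2

d, t = 8, 1.0
A, B = random_hermitian(d), random_hermitian(d)
exact = expm(-1j * (A + B) * t)

# First-order Trotter: alternate short evolutions under A and B, r times.
for r in (1, 2, 4, 8, 16, 32):
    step = expm(-1j * A * t / r) @ expm(-1j * B * t / r)
    trotter = matrix_power(step, r)
    print(f"r = {r:3d}   spectral-norm error = {norm(exact - trotter, 2):.3e}")
```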

The SPARSE-Relativization Framework and Applications to Optimal Proof Systems

from arXiv: Computational Complexity

Authors: Fabian Egidy

We investigate the following longstanding open questions raised by Krajíček and Pudlák (J. Symb. L. 1989), Sadowski (FCT 1997), Köbler and Messner (CCC 1998) and Messner (PhD 2000). Q1: Does TAUT have (p-)optimal proof systems? Q2: Does QBF have (p-)optimal proof systems? Q3: Are there arbitrarily complex sets with (p-)optimal proof systems? Recently, Egidy and Glaßer (STOC 2025) contributed to these questions by constructing oracles that show that there are no relativizable proofs for positive answers of these questions, even when assuming well-established conjectures about the separation of complexity classes. We continue this line of research by providing the same proof barrier for negative answers of these questions. For this, we introduce the SPARSE-relativization framework, which is an application of the notion of bounded relativization by Hirahara, Lu, and Ren (CCC 2023). This framework allows the construction of sparse oracles for statements such that additional useful properties (like an infinite polynomial-time hierarchy) hold. By applying the SPARSE-relativization framework, we show that the oracle construction of Egidy and Glaßer also yields the following new oracle. O1: No set in PSPACE\NP has optimal proof systems, $\mathrm{NP} \subsetneq \mathrm{PH} \subsetneq \mathrm{PSPACE}$, and PH collapses We use techniques of Cook and Krajíček (J. Symb. L. 2007) and Beyersdorff, Köbler, and Müller (Inf. Comp. 2011) and apply our SPARSE-relativization framework to obtain the following new oracle. O2: All sets in PSPACE have p-optimal proof systems, there are arbitrarily complex sets with p-optimal proof systems, and PH is infinite Together with previous results, our oracles show that questions Q1 and Q2 are independent of an infinite or collapsing polynomial-time hierarchy.

$m$-Eternal Dominating Set Problem on Subclasses of Chordal Graphs

from arXiv: Computational Complexity

Authors: Ashutosh Rai, Soumyashree Rana

A dominating set of a graph G(V, E) is a set of vertices D\subseteq V such that every vertex in V\D has a neighbor in D. An eternal dominating set extends this concept by placing mobile guards on the vertices of D. In response to an infinite sequence of attacks on unoccupied vertices, a guard can move to the attacked vertex from an adjacent position, ensuring that the new guards configuration remains a dominating set. In the one (all) guard(s) move model, only one (multiple) guard(s) moves(may move) per attack. The set of vertices representing the initial configuration of guards in one(all) guard move model is the eternal dominating set (m-eternal dominating set) of G. The minimum size of such a set in one(all) guard move model is called the eternal domination number (m-eternal domination number) of G, respectively. Given a graph G and an integer k, the m-Eternal Dominating Set asks whether G has an m-eternal dominating set of size at most k. In this work, we focus mainly on the computational complexity of m-Eternal Dominating Set in subclasses of chordal graphs. For split graphs, we show a dichotomy result by first designing a polynomial-time algorithm for K1,t-free split graphs with t\le 4, and then proving that the problem becomes NP-complete for t\ge 5. We showed that the problem is NP-hard on undirected path graphs. Moreover, we exhibit the computational complexity difference between the variants by showing the existence of two graph classes such that, in one, both Dominating Set and m-Eternal Dominating Set are solvable in polynomial time while Eternal Dominating Set is NP-hard, whereas in the other, Eternal Dominating Set is solvable in polynomial time and both Dominating Set and m-Eternal Dominating Set are NP-hard. Finally, we present a graph class where Dominating Set is NP-hard, but m-Eternal Dominating Set is efficiently solvable.

A two-player version of the assignment problem

from arXiv: Computational Complexity

Authors: Florian Galliot, Nacim Oijid, Jonas Sénizergues

We introduce the competitive assignment problem, a two-player version of the well-known assignment problem. Given a set of tasks and a set of agents with different efficiencies for different tasks, Alice and Bob take turns picking agents one by one. Once all agents have been picked, Alice and Bob compute the optimal values $s_A$ and $s_B$ for the assignment problem on their respective sets of agents, i.e. they assign their own agents to tasks (with at most one agent per task and at most one task per agent) so as to maximize the sum of the efficiencies. The score of the game is then defined as $s_A-s_B$. Alice aims at maximizing the score, while Bob aims at minimizing it. This problem can model drafts in sports and card games, or more generally situations where two entities fight for the same resources and then use them to compete against each other. We show that the problem is PSPACE-complete, even restricted to agents that have at most two nonzero efficiencies. On the other hand, in the case of agents having at most one nonzero efficiency, the problem lies in XP parameterized by the number of tasks, and the optimal score can be computed in linear time when there are only two tasks.
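
To make the game concrete, here is a tiny brute-force sketch (mine, feasible only for very small instances, consistent with the PSPACE-hardness above): players alternately pick agents, each side's final value is an optimal assignment computed with scipy's linear_sum_assignment, and the recursion returns the game value $s_A - s_B$ under optimal play. The efficiency matrix is illustrative.

```python
from functools import lru_cache
import numpy as np
from scipy.optimize import linear_sum_assignment

# eff[agent][task]: efficiency of each agent on each task (illustrative numbers).
eff = np.array([[3, 0, 1],
                [2, 2, 0],
                [0, 1, 4],
                [1, 3, 2]])

def assignment_value(agents):
    """Optimal assignment value for a set of agents (at most one agent per task)."""
    if not agents:
        return 0
    rows = eff[list(agents), :]
    r, c = linear_sum_assignment(rows, maximize=True)
    return int(rows[r, c].sum())

@lru_cache(maxsize=None)
def game_value(remaining, alice_set, bob_set, alices_turn):
    """Value s_A - s_B under optimal alternating picks (brute-force minimax)."""
    if not remaining:
        return assignment_value(alice_set) - assignment_value(bob_set)
    values = []
    for a in remaining:
        rest = tuple(x for x in remaining if x != a)
        if alices_turn:
            values.append(game_value(rest, tuple(sorted(alice_set + (a,))), bob_set, False))
        else:
            values.append(game_value(rest, alice_set, tuple(sorted(bob_set + (a,))), True))
    return max(values) if alices_turn else min(values)

print(game_value(tuple(range(len(eff))), (), (), True))
```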

An Improved Quasi-Physical Dynamic Algorithm for Efficient Circular Coverage in Arbitrary Convex

from arXiv: Computational Geometry

Authors: Zeping Yi, Yongjun Wang, Baoshan Wang, Songyi Liu

The optimal circle coverage problem aims to find a configuration of circles that maximizes the covered area within a given region. Although theoretical optimal solutions exist for simple cases, the problem's NP-hard characteristic makes the problem computationally intractable for complex polygons with numerous circles. Prevailing methods are largely confined to regular domains, while the few algorithms designed for irregular polygons suffer from poor initialization, unmanaged boundary effects, and excessive overlap among circles, resulting in low coverage efficiency. Consequently, we propose an Improved Quasi-Physical Dynamic(IQPD) algorithm for arbitrary convex polygons. Our core contributions are threefold: (1) proposing a structure-preserving initialization strategy that maps a hexagonal close-packing of circles into the target polygon via scaling and affine transformation; (2) constructing a virtual force field incorporating friction and a radius-expansion optimization iteration model; (3) designing a boundary-surrounding strategy based on normal and tangential gradients to retrieve overflowing circles. Experimental results demonstrate that our algorithm significantly outperforms four state-of-the-art methods on seven metrics across a variety of convex polygons. This work could provide a more efficient solution for operational optimization or resource allocation in practical applications.
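
To illustrate contribution (1) in isolation, here is a small simplified sketch (my own; it skips the paper's scaling and affine fitting step and just works inside the bounding box): seed circle centers of radius $r$ on a hexagonal close-packing lattice and keep those whose circles lie entirely inside a given convex polygon.

```python
import math

def inside_convex(poly, p, margin=0.0):
    """True if point p lies at distance >= margin inside the convex polygon
    (vertices given in counter-clockwise order)."""
    n = len(poly)
    for i in range(n):
        ax, ay = poly[i]
        bx, by = poly[(i + 1) % n]
        ex, ey = bx - ax, by - ay
        # Signed distance of p to the edge line (positive on the interior side).
        dist = (ex * (p[1] - ay) - ey * (p[0] - ax)) / math.hypot(ex, ey)
        if dist < margin:
            return False
    return True

def hex_seed(poly, r):
    """Hexagonal close-packing centers of radius-r circles fully inside poly."""
    xs, ys = zip(*poly)
    centers = []
    row, y = 0, min(ys) + r
    while y <= max(ys) - r:
        x = min(xs) + r + (r if row % 2 else 0)
        while x <= max(xs) - r:
            if inside_convex(poly, (x, y), margin=r):
                centers.append((x, y))
            x += 2 * r
        y += math.sqrt(3) * r
        row += 1
    return centers

square = [(0, 0), (10, 0), (10, 8), (0, 8)]
print(len(hex_seed(square, r=0.7)), "circles seeded")
```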

Non-Clashing Teaching in Graphs: Algorithms, Complexity, and Bounds

from arXiv: Data Structures and Algorithms

Authors: Sujoy Bhore, Liana Khazaliya, Fionn Mc Inerney

Kirkpatrick et al. [ALT 2019] and Fallat et al. [JMLR 2023] introduced non-clashing teaching and proved that it is the most efficient batch machine teaching model satisfying the collusion-avoidance benchmark established in the seminal work of Goldman and Mathias [COLT 1993]. Recently, (positive) non-clashing teaching was thoroughly studied for balls in graphs, yielding numerous algorithmic and combinatorial results. In particular, Chalopin et al. [COLT 2024] and Ganian et al. [ICLR 2025] gave an almost complete picture of the complexity landscape of the positive variant, showing that it is tractable only for restricted graph classes due to the non-trivial nature of the problem and concept class. In this work, we consider (positive) non-clashing teaching for closed neighborhoods in graphs. This concept class is not only extensively studied in various related contexts, but it also exhibits broad generality, as any finite binary concept class can be equivalently represented by a set of closed neighborhoods in a graph. In comparison to the works on balls in graphs, we provide improved algorithmic results, notably including FPT algorithms for more general classes of parameters, and we complement these results by deriving stronger lower bounds. Lastly, we obtain combinatorial upper bounds for wider classes of graphs.

Counting Unit Circular Arc Intersections

from arXiv: Data Structures and Algorithms

Authors: Haitao Wang

Given a set of $n$ circular arcs of the same radius in the plane, we consider the problem of computing the number of intersections among the arcs. The problem has been studied before, and the previously best known algorithm runs in $O(n^{4/3+\varepsilon})$ time [Agarwal, Pellegrini, and Sharir, SIAM J. Comput., 1993], for any constant $\varepsilon>0$. No progress has been made on the problem for more than 30 years. We present a new algorithm that runs in $O(n^{4/3}\log^{16/3}n)$ time and improve it to $O(n^{1+\varepsilon}+K^{1/3}n^{2/3}(\frac{n^2}{n+K})^{\varepsilon}\log^{16/3}n)$ time for small $K$, where $K$ is the total number of intersections among the arcs.

A polynomial-time algorithm for recognizing high-bandwidth graphs

from arXiv: Data Structures and Algorithms

Authors: Luis M. B. Varona

An unweighted, undirected graph $G$ on $n$ nodes is said to have \emph{bandwidth} at most $k$ if its nodes can be labelled from $0$ to $n - 1$ such that no two adjacent nodes have labels that differ by more than $k$. It is known that one can decide whether the bandwidth of $G$ is at most $k$ in $O(n^k)$ time and $O(n^k)$ space using dynamic programming techniques. For small $k$ close to $0$, this approach is effectively polynomial, but as $k$ scales with $n$, it becomes superexponential, requiring up to $O(n^{n - 1})$ time (where $n - 1$ is the maximum possible bandwidth). In this paper, we reformulate the problem in terms of bipartite matching for sufficiently large $k \ge \lfloor (n - 1)/2 \rfloor$, allowing us to use Hall's marriage theorem to develop an algorithm that runs in $O(n^{n - k + 1})$ time and $O(n)$ auxiliary space (beyond storage of the input graph). This yields polynomial complexity for large $k$ close to $n - 1$, demonstrating that the bandwidth recognition problem is solvable in polynomial time whenever either $k$ or $n - k$ remains small.
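
To make the definition concrete, here is a brute-force check (trying all $n!$ labellings) of whether a tiny graph has bandwidth at most $k$. It is unrelated to the paper's matching-based algorithm, and the 5-cycle is a made-up example.

```python
from itertools import permutations

def bandwidth_at_most(edges, k):
    """Brute-force O(n!) test of whether the graph has bandwidth <= k.

    `edges` is a list of (u, v) pairs on nodes 0..n-1; meant only to make the
    definition concrete on tiny graphs.
    """
    n = 1 + max(max(u, v) for u, v in edges)
    for labelling in permutations(range(n)):
        if all(abs(labelling[u] - labelling[v]) <= k for u, v in edges):
            return True
    return False

# A 5-cycle has bandwidth 2 (labels 0,2,4,3,1 around the cycle) but not 1.
cycle = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
print(bandwidth_at_most(cycle, 1), bandwidth_at_most(cycle, 2))  # False True
```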

Finite and Corruption-Robust Regret Bounds in Online Inverse Linear Optimization under M-Convex Action Sets

from arXiv: Data Structures and Algorithms

Authors: Taihei Oki, Shinsaku Sakaue

We study online inverse linear optimization, also known as contextual recommendation, where a learner sequentially infers an agent's hidden objective vector from observed optimal actions over feasible sets that change over time. The learner aims to recommend actions that perform well under the agent's true objective, and the performance is measured by the regret, defined as the cumulative gap between the agent's optimal values and those achieved by the learner's recommended actions. Prior work has established a regret bound of $O(d\log T)$, as well as a finite but exponentially large bound of $\exp(O(d\log d))$, where $d$ is the dimension of the optimization problem and $T$ is the time horizon, while a regret lower bound of $Ω(d)$ is known (Gollapudi et al. 2021; Sakaue et al. 2025). Whether a finite regret bound polynomial in $d$ is achievable or not has remained an open question. We partially resolve this by showing that when the feasible sets are M-convex -- a broad class that includes matroids -- a finite regret bound of $O(d\log d)$ is possible. We achieve this by combining a structural characterization of optimal solutions on M-convex sets with a geometric volume argument. Moreover, we extend our approach to adversarially corrupted feedback in up to $C$ rounds. We obtain a regret bound of $O((C+1)d\log d)$ without prior knowledge of $C$, by monitoring directed graphs induced by the observed feedback to detect corruptions adaptively.

A $5$-Approximation Analysis for the Cover Small Cuts Problem

from arXiv: Data Structures and Algorithms

Authors: Miles Simmons, Ishan Bansal, Joe Cheriyan

In the Cover Small Cuts problem, we are given a capacitated (undirected) graph $G=(V,E,u)$ and a threshold value $λ$, as well as a set of links $L$ with end-nodes in $V$ and a non-negative cost for each link $\ell\in L$; the goal is to find a minimum-cost set of links such that each non-trivial cut of capacity less than $λ$ is covered by a link. Bansal, Cheriyan, Grout, and Ibrahimpur (arXiv:2209.11209, Algorithmica 2024) showed that the WGMV primal-dual algorithm, due to Williamson, Goemans, Mihail, and Vazirani (Combinatorica, 1995), achieves approximation ratio $16$ for the Cover Small Cuts problem; their analysis uses the notion of a pliable family of sets that satisfies a combinatorial property. Later, Bansal (arXiv:2308.15714v2, IPCO 2025) and then Nutov (arXiv:2504.03910, MFCS 2025) proved that the same algorithm achieves approximation ratio $6$. We show that the same algorithm achieves approximation ratio $5$, by using a stronger notion, namely, a pliable family of sets that satisfies symmetry and structural submodularity.

Benchmarking of algorithms for set partitions

from arXiv: Data Structures and Algorithms

Authors: Arnav Khinvasara, Alexander Pikovski

Set partitions are arrangements of distinct objects into groups. The problem of listing all set partitions arises in a variety of settings, in particular in combinatorial optimization tasks. After a brief review, we give practical approximate formulas for determining the number of set partitions, both for small and large set sizes. Several algorithms for enumerating all set partitions are reviewed, and benchmarking tests are conducted. The algorithm of Djokic et al. is recommended for practical use.
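
For concreteness, the sketch below computes Bell numbers (the exact counts of set partitions) via the Bell triangle and enumerates all partitions of a small set recursively. It is a generic baseline, not one of the benchmarked algorithms, and it does not reproduce the paper's approximate counting formulas.

```python
def bell_numbers(n_max):
    """Bell numbers B_0..B_{n_max} via the Bell triangle; B_n counts the set partitions of an n-set."""
    row, bells = [1], [1]
    for _ in range(n_max):
        new_row = [row[-1]]
        for x in row:
            new_row.append(new_row[-1] + x)
        row = new_row
        bells.append(row[0])
    return bells

def set_partitions(items):
    """Enumerate all set partitions of `items` by placing the first element into an
    existing block or a new block of a partition of the rest."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for partition in set_partitions(rest):
        for i in range(len(partition)):
            yield partition[:i] + [[first] + partition[i]] + partition[i + 1:]
        yield [[first]] + partition

print(bell_numbers(5))                                # [1, 1, 2, 5, 15, 52]
print(sum(1 for _ in set_partitions([1, 2, 3, 4])))   # 15
```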

Profit Maximization in Closed Social Networks

from arXiv: Data Structures and Algorithms

Authors: Poonam Sharma, Suman Banerjee

Diffusion of information, innovation, and ideas is an important phenomenon in social networks. Information propagates through the network, spreading from one person to the next. In many settings, it is meaningful to restrict diffusion so that each node can spread information to only a limited number of its neighbors rather than to all of them. Such social networks are called closed social networks. In recent years, social media platforms have emerged as an effective medium for commercial entities, where the objective is to maximize profit. In this paper, we study the Profit Maximization in Closed Social Networks (PMCSN) problem in the context of viral marketing. The input to the problem is a closed social network and two positive integers $\ell$ and $B$. The problem asks to select seed nodes within a given budget $B$; during the diffusion process, each node is restricted to choose at most $\ell$ outgoing links for information diffusion; and the objective is to maximize the profit earned by the seed set. The PMCSN problem generalizes the Influence Maximization problem, which is NP-hard. We propose two solution approaches for PMCSN: a sampling-based approximate solution and a marginal-gain-based heuristic solution. We analyze the sample complexity, running time, and space requirements of the proposed approaches. We conduct experiments on real-world, publicly available social network datasets. The results show that the seed sets and diffusion links chosen by our methods yield higher profit than baseline methods. The implementation and data are available at \texttt{https://github.com/PoonamSharma-PY/ClosedNetwork}.
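
As a loose illustration of marginal-gain seed selection, here is a sketch that greedily adds seeds by estimated profit gain under a crude cascade in which each active node contacts at most $\ell$ random neighbours. The spread probability, costs, profits, and toy graph are all made-up assumptions and do not reflect the paper's diffusion model or guarantees.

```python
import random

def simulate_spread(graph, seeds, ell, p=0.3, rng=random):
    """One cascade where each newly active node tries to activate at most `ell`
    randomly chosen out-neighbours (a crude stand-in for a closed network)."""
    active, frontier = set(seeds), list(seeds)
    while frontier:
        nxt = []
        for u in frontier:
            nbrs = [v for v in graph.get(u, []) if v not in active]
            rng.shuffle(nbrs)
            for v in nbrs[:ell]:
                if rng.random() < p:
                    active.add(v)
                    nxt.append(v)
        frontier = nxt
    return active

def greedy_seeds(graph, budget, ell, cost, profit, runs=200):
    """Marginal-gain greedy: repeatedly add the affordable node whose estimated
    profit increase (Monte Carlo average) is largest."""
    def est(seeds):
        return sum(sum(profit[v] for v in simulate_spread(graph, seeds, ell))
                   for _ in range(runs)) / runs
    seeds, spent, base = set(), 0, 0.0
    while True:
        best, best_gain = None, 0.0
        for u in graph:
            if u in seeds or spent + cost[u] > budget:
                continue
            gain = est(seeds | {u}) - base
            if gain > best_gain:
                best, best_gain = u, gain
        if best is None:
            return seeds
        seeds.add(best)
        spent += cost[best]
        base += best_gain

# Tiny made-up instance: unit costs, unit profits, budget of two seeds.
g = {0: [1, 2, 3], 1: [2], 2: [3, 4], 3: [4], 4: []}
print(greedy_seeds(g, budget=2, ell=2, cost={u: 1 for u in g}, profit={u: 1.0 for u in g}))
```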

Fast $k$-means Seeding Under The Manifold Hypothesis

from arXiv: Data Structures and Algorithms

Authors: Poojan Shah, Shashwat Agrawal, Ragesh Jaiswal

We study beyond-worst-case analysis for the $k$-means problem, where the goal is to model typical instances of $k$-means arising in practice. Existing theoretical approaches provide guarantees under certain assumptions on the optimal solutions to $k$-means, making them difficult to validate in practice. We propose the manifold hypothesis, where data obtained in ambient dimension $D$ concentrates around a low-dimensional manifold of intrinsic dimension $d$, as a reasonable assumption for modeling real-world clustering instances. We identify key geometric properties of datasets that have theoretically predictable scaling laws depending on the quantization exponent $\varepsilon = 2/d$, using techniques from optimum quantization theory. We show how to exploit these regularities to design a fast seeding method called $\operatorname{Qkmeans}$ which provides $O(ρ^{-2} \log k)$-approximate solutions to the $k$-means problem in time $O(nD) + \widetilde{O}(\varepsilon^{1+ρ}ρ^{-1}k^{1+γ})$, where the exponent $γ = \varepsilon + ρ$ for an input parameter $ρ < 1$. This yields new runtime-quality tradeoffs. We perform a large-scale empirical study across various domains to validate our theoretical predictions and algorithm performance, bridging theory and practice for beyond-worst-case data clustering.
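
For comparison only, here is the classic $D^2$ (k-means++) seeding baseline on a toy dataset concentrated near a one-dimensional curve embedded in $D=10$ dimensions. This is the standard $O(nkD)$ method, not the paper's $\operatorname{Qkmeans}$ algorithm, and the data generation is a made-up stand-in for the manifold hypothesis.

```python
import numpy as np

def dsquared_seeding(X, k, rng=np.random.default_rng(0)):
    """Classic k-means++ (D^2) seeding: each new center is sampled with probability
    proportional to squared distance from the nearest existing center."""
    n = X.shape[0]
    centers = [X[rng.integers(n)]]
    for _ in range(k - 1):
        d2 = np.min([np.sum((X - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(X[rng.choice(n, p=d2 / d2.sum())])
    return np.array(centers)

# Toy "manifold" data: points near a circle (intrinsic dimension 1) in ambient dimension 10.
t = np.random.default_rng(1).uniform(0, 1, size=(1000, 1))
X = np.hstack([np.cos(2 * np.pi * t), np.sin(2 * np.pi * t), np.zeros((1000, 8))])
X += 0.01 * np.random.default_rng(2).normal(size=X.shape)
print(dsquared_seeding(X, k=5).shape)  # (5, 10)
```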

Sublinear Time Quantum Algorithm for Attention Approximation

from arXiv: Data Structures and Algorithms

Authors: Zhao Song, Jianfei Xue, Jiahao Zhang, Lichen Zhang

Given the query, key and value matrices $Q, K, V\in \mathbb{R}^{n\times d}$, the attention module is defined as $\mathrm{Att}(Q, K, V)=D^{-1}AV$ where $A=\exp(QK^\top/\sqrt{d})$ with $\exp(\cdot)$ applied entrywise, $D=\mathrm{diag}(A{\bf 1}_n)$. The attention module is the backbone of modern transformers and large language models, but explicitly forming the softmax matrix $D^{-1}A$ incurs $Ω(n^2)$ time, motivating numerous approximation schemes that reduce runtime to $\widetilde O(nd)$ via sparsity or low-rank factorization. We propose a quantum data structure that approximates any row of $\mathrm{Att}(Q, K, V)$ using only row queries to $Q, K, V$. Our algorithm preprocesses these matrices in $\widetilde{O}\left( ε^{-1} n^{0.5} \left( s_λ^{2.5} + s_λ^{1.5} d + α^{0.5} d \right) \right)$ time, where $ε$ is the target accuracy, $s_λ$ is the $λ$-statistical dimension of the exponential kernel defined by $Q$ and $K$, and $α$ measures the row distortion of $V$ that is at most $d/{\rm srank}(V)$, the stable rank of $V$. Each row query can be answered in $\widetilde{O}(s_λ^2 + s_λd)$ time. To our knowledge, this is the first quantum data structure that approximates rows of the attention matrix in sublinear time with respect to $n$. Our approach relies on a quantum Nyström approximation of the exponential kernel, quantum multivariate mean estimation for computing $D$, and quantum leverage score sampling for the multiplication with $V$.
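
For reference, the exact $O(n^2 d)$ computation that the quantum data structure approximates row by row can be written directly from the definition above; the random matrices are placeholder inputs.

```python
import numpy as np

def attention(Q, K, V):
    """Exact Att(Q, K, V) = D^{-1} A V with A = exp(Q K^T / sqrt(d)) entrywise
    and D = diag(A 1_n), i.e. the quadratic-time computation being approximated."""
    n, d = Q.shape
    A = np.exp(Q @ K.T / np.sqrt(d))       # n x n attention scores
    D_inv = 1.0 / A.sum(axis=1)            # diagonal of D^{-1}
    return (D_inv[:, None] * A) @ V        # n x d output

rng = np.random.default_rng(0)
n, d = 512, 64
Q, K, V = rng.normal(size=(n, d)), rng.normal(size=(n, d)), rng.normal(size=(n, d))
print(attention(Q, K, V).shape)  # (512, 64)
```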

Fanciful Figurines flip Free Flood-It -- Polynomial-Time Miniature Painting on Co-gem-free Graphs

from arXiv: Data Structures and Algorithms

Authors: Christian Rosenke, Mark Scheibner

Inspired by the eponymous hobby, we introduce Miniature Painting as the computational problem of painting a given graph $G=(V,E)$ according to a prescribed template $t \colon V \rightarrow C$, which assigns colors $C$ to the vertices of $G$. In this setting, the goal is to realize the template using a shortest possible sequence of brush strokes, where each stroke overwrites a connected vertex subset with a color in $C$. We show that this problem is equivalent to a reversal of the well-studied Free Flood-It game, in which a colored graph is flooded to a single color using as few moves as possible. This equivalence allows known complexity results for Free Flood-It to be transferred directly to Miniature Painting, including NP-hardness under severe structural restrictions, such as when $G$ is a grid, a tree, or a split graph. Our main contribution is a polynomial-time algorithm for Miniature Painting on graphs that are free of induced co-gems, a graph class that strictly generalizes cographs. As a direct consequence, Free Flood-It is also polynomial-time solvable on co-gem-free graphs, independent of the initial coloring.
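
To make a "brush stroke" concrete, the sketch below applies one move, overwriting a connected vertex subset with a color (and rejecting disconnected strokes). The path graph, template, and two-stroke sequence are made-up examples, not part of the paper's algorithm.

```python
from collections import deque

def brush_stroke(adj, coloring, subset, color):
    """Apply one Miniature Painting move: overwrite a connected vertex subset with `color`.

    `adj` maps each vertex to its neighbours; raises if `subset` is not connected,
    since only connected strokes are legal.
    """
    subset = set(subset)
    start = next(iter(subset))
    seen, queue = {start}, deque([start])
    while queue:                              # BFS restricted to the stroked subset
        u = queue.popleft()
        for v in adj[u]:
            if v in subset and v not in seen:
                seen.add(v)
                queue.append(v)
    if seen != subset:
        raise ValueError("stroke must cover a connected vertex subset")
    return {v: (color if v in subset else c) for v, c in coloring.items()}

# Path on 4 vertices, painted towards the template {0: 'red', 1: 'red', 2: 'blue', 3: 'blue'}.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
coloring = {v: 'white' for v in path}
coloring = brush_stroke(path, coloring, {0, 1, 2, 3}, 'blue')   # stroke 1
coloring = brush_stroke(path, coloring, {0, 1}, 'red')          # stroke 2
print(coloring)  # {0: 'red', 1: 'red', 2: 'blue', 3: 'blue'}
```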
