Last Update

OPML feed of all feeds.

Subscribe to the Atom feed or RSS feed to stay up to date.

Thank you to arXiv for use of its open access interoperability.

Note: the date of arXiv entries announced right after publication holidays might incorrectly show up as the date of the publication holiday itself. This is due to our ad hoc method of inferring announcement dates, which are not returned by the arXiv API.

Powered by Pluto.

Source on GitHub.

Maintained by Nima Anari, Arnab Bhattacharyya, Gautam Kamath.

Theory of Computing Report

Monday, February 09

Advanced Simplicity

from Ben Recht

Building up a theory of feedback from proportional-integral-derivative building blocks

This is a live blog of Lecture 3 of my graduate seminar “Feedback, Learning, and Adaptation.” A table of contents is here.

A theme I’ve been emphasizing so far in this class is how simple controllers can tame uncertain, complex components. In the feedback amplifier example, we showed how feeding back a signal proportional to the deviation from a reference level could yield stable behavior. This sort of proportional control seems intuitively sensible. The controller pulls the system down when it is above the desired level and pushes it up when it is below that level. The control signal is proportional to how far the system deviates from its desired setpoint.

The earliest control designs implemented precisely such proportional controllers to control beastly machines like steam engines. One issue 19th-century engineers soon discovered is that proportional control systems often don’t settle at the desired steady state of zero error. Studying such systems, James Clerk Maxwell—you may remember him from those equations about electricity—wrote arguably the first modern control theory paper in 1868. “On Governors” describes the mathematical properties of the mechanical control systems of the time and shows that to actually maintain fixed points, some sort of integral control was needed. I discussed this necessity last week in the context of homeostasis. Feeding back the running average of a signal implies that the signal must be zero at steady state.
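To see the difference concretely, here is a toy simulation (a sketch with made-up gains, a made-up first-order plant, and a made-up load, none of it from Maxwell's paper) comparing proportional feedback against integral feedback when a constant disturbance pushes on the system:

```python
# Proportional vs. integral control on the toy plant x' = -x + u + load.
# All gains, the plant, and the load value are illustrative assumptions.

def simulate(controller, steps=20000, dt=0.01, setpoint=1.0, load=-0.5):
    """Run the closed loop and return the final state."""
    x = 0.0
    for _ in range(steps):
        u = controller(setpoint - x, dt)
        x += dt * (-x + u + load)  # Euler step of the plant
    return x

def proportional(kp=5.0):
    return lambda e, dt: kp * e

def integral(ki=1.0):
    acc = [0.0]  # running integral of the error
    def ctrl(e, dt):
        acc[0] += e * dt
        return ki * acc[0]
    return ctrl

x_p = simulate(proportional())  # settles near 0.75: a persistent offset
x_i = simulate(integral())      # settles at the setpoint 1.0
```

The proportional loop balances where -x + 5(1 - x) - 0.5 = 0, i.e. x = 0.75, short of the setpoint; the integrator keeps accumulating error until the error is exactly zero.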

You can, of course, combine the two types of feedback and get a “proportional-integral” (PI) controller. Any feedback system with an integrator can only converge to fixed points with zero steady-state error, and adding a proportional term often brings the system to steady state more quickly and more robustly.

We could add one more term to this setup, feeding back the derivative of the error signal as well. The derivative gives the control signal a bit of momentum. It determines whether the error is increasing or decreasing and, hence, how the corresponding control action should change in the future. By taking a weighted sum of the error signal, its integral, and its derivative, we obtain the PID controller.
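A minimal discrete-time PID loop might look like this; the gains and the first-order plant are hypothetical choices for illustration, not numbers from the lecture:

```python
# A discrete-time PID controller sketch driving the toy plant x' = -x + u.

def make_pid(kp, ki, kd, dt):
    """Return a stateful PID controller mapping error -> control signal."""
    state = {"integral": 0.0, "prev_error": None}

    def pid(error):
        state["integral"] += error * dt  # running integral of the error
        if state["prev_error"] is None:
            derivative = 0.0
        else:
            derivative = (error - state["prev_error"]) / dt  # error slope
        state["prev_error"] = error
        return kp * error + ki * state["integral"] + kd * derivative

    return pid

dt = 0.01
pid = make_pid(kp=2.0, ki=1.0, kd=0.1, dt=dt)
x = 0.0
for _ in range(5000):
    u = pid(1.0 - x)    # error = setpoint - state
    x += dt * (-x + u)  # Euler step of the plant
```

After the loop, x sits at the setpoint 1.0: the integral term removes the steady-state error, the proportional term speeds up convergence, and the derivative term damps the response.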

PID controllers lie at the core of any contemporary control system. PID often suffices to control massive systems all on its own. Even in the most complex control systems, you will always find PID controllers sitting at the bottom of the architecture. For example, if you have a complex robotic control system, you can use some fancy new learning-control technique to tell the robot to move, which requires sending some sort of electronic message to mechanical actuators. At those actuators, PID controllers ensure that electronic signals are properly translated into joint torques.

One of the weirder things about how we typically teach control theory is that we begin with an abstract notion of controllers, prove things in full generality, and perhaps remind ourselves that a lot can be done with PID control in a lab or homework assignment.

Though it feels a lot less interesting, I always wonder why we don’t start with PID control as the foundation of all feedback systems. That is, we could begin the class by analyzing the simplest system and then use this as a building block for more complex control models later in the semester.

Some classes do this without fanfare, but they are usually not listed as control courses. In Nonlinear Programming (aka Continuous Optimization), we start with gradient descent and show how to build up a complex set of algorithms around it. Gradient descent is literally integral control! The “plant” takes as input a vector and outputs the gradient of some function evaluated at the input. The “controller” integrates the plant outputs and sends in a new vector proportional to the negative of that integral.
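That translation can be written out in a few lines; the quadratic objective below is my own toy example:

```python
# Gradient descent as integral control. The "plant" returns the gradient
# of f at its input; the "controller" integrates those outputs and feeds
# back the negative of the integral, scaled by the step size.
# f(x) = (x - 3)^2 is an illustrative objective, minimized at x = 3.

def plant(x):
    return 2.0 * (x - 3.0)  # gradient oracle

step = 0.1    # integral gain, a.k.a. the learning rate
x0 = 0.0
outputs = []  # every plant output observed so far
x = x0
for _ in range(100):
    outputs.append(plant(x))
    x = x0 - step * sum(outputs)  # iterate = negative integral of outputs

# Each round x changes by -step * plant(x): plain gradient descent.
# The loop is at a fixed point exactly when plant(x) = 0.
```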

In our nonlinear programming classes, especially those that focus on the theory of convex optimization methods, the analysis is also control analysis. We analyze gradient descent by constructing a Lyapunov function. This Lyapunov function could be the function value itself. Or it could be the distance to the optimal solution. We proceed to build more sophisticated methods, like the proximal gradient method and Nesterov’s accelerated method. It turns out that these too are PID controllers, and we analyze their convergence behavior using Lyapunov functions as well. What if you want to see how a method works in the presence of uncertainty in the gradient oracle? We can then discuss stochastic gradient methods. And for nonstochastic plants with more complex dynamics, we can apply techniques from online learning and study online convex optimization. At the heart of all these methods remains gradient descent/integral control, even as we make the plant models and analyses more sophisticated.
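As a small sanity check of the Lyapunov view, on a toy convex quadratic (again my own illustrative example) both candidate Lyapunov functions decrease monotonically along the gradient descent iterates:

```python
# Tracking two candidate Lyapunov functions for gradient descent:
# the function value f(x_k) and the squared distance to the optimum.

def f(x):
    return (x - 3.0) ** 2  # toy objective, minimized at x_star = 3

def grad_f(x):
    return 2.0 * (x - 3.0)

x, step, x_star = 0.0, 0.1, 3.0
values, dists = [], []
for _ in range(50):
    values.append(f(x))
    dists.append((x - x_star) ** 2)
    x -= step * grad_f(x)

monotone_value = all(a >= b for a, b in zip(values, values[1:]))
monotone_dist = all(a >= b for a, b in zip(dists, dists[1:]))
# Both flags come out True for this step size: either quantity certifies
# convergence, mirroring the two Lyapunov choices mentioned above.
```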

It’s funny because our understanding of first-order optimization methods and PID controllers is basically… the same? The analyses were often derived by the same people, just in different contexts. Many control theorists became optimization theorists and vice versa. There is a clean dictionary between topics:1

For more details on these equivalences, check out this post from 2018. It builds on an excellent paper by Bin Hu and Laurent Lessard. And if you want a glimpse of how far you can take PID controllers in feedback systems, you should check out the book Advanced PID Control by Karl Åström and Tore Hägglund.

Given this strong parallel, this lecture will treat elementary optimization and control on the same footing. We’ll apply ideas from stability analysis to PID control and optimization methods. I’ll draw connections between the actual techniques from control theory and optimization theory. And from this, we’ll get a more general sense of how simple, structured components can tame complex uncertainties in feedback loops.

1. It is odd that you can’t make tables in Substack.

By Ben Recht

Doctoral student at West Virginia University (apply by March 15, 2026)

from CCI: jobs

The Lane Department of Computer Science and Electrical Engineering in the Benjamin M. Statler College of Engineering and Mineral Resources at West Virginia University invites applications for a Ph.D. position in the general area of algorithmic operations research with an emphasis on computational complexity and game theory. The position is funded by NSF (Algorithmic Foundations).

Website: https://community.wvu.edu/~krsubramani/
Email: k.subramani@mail.wvu.edu

By shacharlovett

TR26-014 | A Fourier-Analytic Switching Lemma over F_p and the AC^0 Lower Bound for Generalized Parity | Yipin Wang

from ECCC Papers

We prove a switching lemma for constant-depth circuits over the alphabet $F_p$ with generalized AND/OR gates, extending Tal's Fourier-analytic approach from the Boolean setting. The key new ingredient is a direct computation of the $L_1$ Fourier mass of AND/OR gates over $F_p$, which yields an exact closed-form expression for the expected high-degree Fourier mass after a random restriction. Combined with a Markov inequality argument, this gives a switching lemma with an explicit, prime-independent structure: $Pr[DT(f|\rho) \geq s] \leq (ep \cdot qK / ((p-1)s))^s$. As a consequence, we obtain that for any prime $p$, constant-depth circuits of sub-exponential size over $F_p$ cannot compute $1[\sum_i x_i \equiv 0 \pmod{p}]$.

I used to think historians in the future will have too much to work with. I could be wrong

from Computational Complexity

(I thought I had already posted this, but the blogger system we use says I didn't. Apologies if I did. Most likely I posted something similar. When you blog for X years, you forget what you've already blogged about.)

Historians who study ancient Greece often have to work with fragments of text or just a few pottery shards. Nowadays we preserve so much that historians 1000 years from now will have an easier time. Indeed, they may have too much to look at and will have to sort through news, fake news, opinions, and satire to figure out what was true.

The above is what I used to think. But I could be wrong. 

1) When technology changes, stuff is lost. E.g., floppy disks.

2) (This is the inspiration for this blog post) Harry Lewis gave a talk in Zurich on 

The Birth of Binary: Leibniz and the origins of computer arithmetic

on Dec. 8, 2022, 1:15PM-3:30PM Zurich time. I didn't watch it live (too early in the morning, East Coast time), but it was taped and I watched a recording later. Yeah!

His blog about it (see here) had a pointer to the video, and my blog about it (see here) had a pointer to both the video and to his blog.

A while back I was writing a blog post where I wanted to point to the video. My link didn't work. His link didn't work. I emailed him asking where it was. IT IS LOST FOREVER! Future historians will not know about Leibniz and binary! Or they might: he has a book on the topic that I reviewed here. But what if the book goes out of print and the only information on this topic is my review of his book?

3) Entire journals can vanish. I blogged about that here.

4) I am happy that the link to the Wikipedia entry on Link Rot (see here) has not rotted.

5) I did a post on what tends to NOT be recorded and hence may be lost forever here.

6) (This is a bigger topic than my one point here.) People tend to OWN less than they used to.


DVDs: don't bother buying! Whatever you want is on streaming. (I recently watched, for the first time, Buffy the Vampire Slayer, one episode a day, on the treadmill, and it was great!)

CDs: don't bother buying! Use Spotify. I do that and it's awesome: I have found novelty songs I didn't know about! Including a song by The Doubleclicks which I thought was about Buffy: here. I emailed them about it and they responded with: Hello! Buffy, Hunger Games, Divergent, Harry Potter, you name it.

JOURNALS: don't bother buying them; it's all on arXiv. (Very true in TCS, might be less true in other fields.)

CONFERENCES: Not sure. I think very few have paper proceedings. At one time they gave out memory sticks with all the papers on them, so that IS ownership, though it depends on technology that might go away. Not sure what they do now.

This may make it easier to lose things since nobody has a physical copy. 

7) Counterargument: Even given the points above, far more today IS being preserved than used to be. See my blog post on that here. But will that be true in the long run? 

8) I began by saying that I used to think future historians will have too much to look at and will have to sort through lots of stuff (using quicksort?) to figure out what's true. Then I said they may lose a lot. Oddly enough, both might be true: of the stuff they DO have, they will have a hard time figuring out what's true. (E.g., was Pope Leo's ugrad thesis on Rado's Theorem for Non-Linear Equations? No. See my blog about that falsehood getting out to the world here. Spoiler alert: it was my fault.)

QUESTIONS:

1) Am I right: will the future lose lots of stuff?

2) If so, what can we do about this? (Not clear who "we" is in that sentence.)



By gasarch

The First Known Problem That Is FPT with Respect to Node Scanwidth but Not Treewidth

from arXiv: Computational Complexity

Authors: Jannik Schestag, Norbert Zeh

Structural parameters of graphs, such as treewidth, play a central role in the study of the parameterized complexity of graph problems. Motivated by the study of parameterized algorithms on phylogenetic networks, scanwidth was introduced recently as a new treewidth-like structural parameter for directed acyclic graphs (DAGs) that respects the edge directions in the DAG. The utility of this width measure has been demonstrated by results showing that a number of problems that are fixed-parameter tractable (FPT) with respect to both treewidth and scanwidth admit algorithms with a better dependence on scanwidth than on treewidth. More importantly, these scanwidth-based algorithms are often much simpler than their treewidth-based counterparts: the name "scanwidth" reflects that traversing a tree extension (the scanwidth equivalent of a tree decomposition) of a DAG amounts to "scanning" the DAG according to a well-chosen topological ordering. While these results show that scanwidth is useful especially for solving problems on phylogenetic networks, all problems studied through the lens of scanwidth so far are either FPT with respect to both scanwidth and treewidth, or W[$\ell$]-hard, for some $\ell \ge 1$, with respect to both. In this paper, we show that scanwidth is not just a proxy for treewidth and provides information about the structure of the input graph not provided by treewidth, by proving a fairly stark complexity-theoretic separation between these two width measures. Specifically, we prove that Weighted Phylogenetic Diversity with Dependencies is FPT with respect to the scanwidth of the food web but W[$\ell$]-hard with respect to its treewidth, for all $\ell \ge 1$. To the best of our knowledge, no such separation between these two width measures has been shown for any problem before.

Makespan Minimization in Split Learning: From Theory to Practice

from arXiv: Computational Complexity

Authors: Robert Ganian, Fionn Mc Inerney, Dimitra Tsigkari

Split learning recently emerged as a solution for distributed machine learning with heterogeneous IoT devices, where clients can offload part of their training to computationally-powerful helpers. The core challenge in split learning is to minimize the training time by jointly devising the client-helper assignment and the schedule of tasks at the helpers. We first study the model where each helper has a memory cardinality constraint on how many clients it may be assigned, which represents the case of homogeneous tasks. Through complexity theory, we rule out exact polynomial-time algorithms and approximation schemes even for highly restricted instances of this problem. We complement these negative results with a non-trivial polynomial-time 5-approximation algorithm. Building on this, we then focus on the more general heterogeneous task setting considered by Tirana et al. [INFOCOM 2024], where helpers have memory capacity constraints and clients have variable memory costs. In this case, we prove that, unless P=NP, the problem cannot admit a polynomial-time approximation algorithm for any approximation factor. However, by adapting our aforementioned 5-approximation algorithm, we develop a novel heuristic for the heterogeneous task setting and show that it outperforms heuristics from prior works through extensive experiments.

Gromov-Wasserstein at Scale, Beyond Squared Norms

from arXiv: Computational Geometry

Authors: Guillaume Houry, Jean Feydy, François-Xavier Vialard

A fundamental challenge in data science is to match disparate point sets with each other. While optimal transport efficiently minimizes point displacements under a bijectivity constraint, it is inherently sensitive to rotations. Conversely, minimizing distortions via the Gromov-Wasserstein (GW) framework addresses this limitation but introduces a non-convex, computationally demanding optimization problem. In this work, we identify a broad class of distortion penalties that reduce to a simple alignment problem within a lifted feature space. Leveraging this insight, we introduce an iterative GW solver with a linear memory footprint and quadratic (rather than cubic) time complexity. Our method is differentiable, comes with strong theoretical guarantees, and scales to hundreds of thousands of points in minutes. This efficiency unlocks a wide range of geometric applications and enables the exploration of the GW energy landscape, whose local minima encode the symmetries of the matching problem.

Graph-Based Nearest-Neighbor Search without the Spread

from arXiv: Computational Geometry

Authors: Jeff Giliberti, Sariel Har-Peled, Jonas Sauer, Ali Vakilian

Recent work showed how to construct nearest-neighbor graphs of linear size, on a given set $P$ of $n$ points in $\mathbb{R}^d$, such that one can answer approximate nearest-neighbor queries in time logarithmic in the spread. Unfortunately, the spread might be unbounded in $n$, and an interesting theoretical question is how to remove the dependency on the spread. Here, we show how to construct an external linear-size data structure that, combined with the linear-size graph, allows us to answer ANN queries in time logarithmic in $n$.

Revisiting the Sliced Wasserstein Kernel for persistence diagrams: a Figalli-Gigli approach

from arXiv: Computational Geometry

Authors: Marc Janthial, Théo Lacombe

The Sliced Wasserstein Kernel (SWK) for persistence diagrams was introduced in (Carrière et al. 2017) as a powerful tool to implicitly embed persistence diagrams in a Hilbert space with reasonable distortion. This kernel is built on the intuition that the Figalli-Gigli distance (that is, the partial matching distance routinely used to compare persistence diagrams) resembles the Wasserstein distance used in the optimal transport literature, and that the latter could be sliced to define a positive definite kernel on the space of persistence diagrams. This efficient construction nonetheless relies on ad-hoc tweaks to the Wasserstein distance to account for the peculiar geometry of the space of persistence diagrams. In this work, we propose to revisit this idea by directly using the Figalli-Gigli distance instead of the Wasserstein one as the building block of our kernel. On the theoretical side, our sliced Figalli-Gigli kernel (SFGK) shares most of the important properties of the SWK of Carrière et al., including distortion results on the induced embedding and its ease of computation, while being more faithful to the natural geometry of persistence diagrams. In particular, it can be directly used to handle infinite persistence diagrams and persistence measures. On the numerical side, we show that the SFGK performs as well as the SWK on benchmark applications.

Circuit Diameter of Polyhedra is Strongly Polynomial

from arXiv: Data Structures and Algorithms

Authors: Bento Natura

We prove a strongly polynomial bound on the circuit diameter of polyhedra, resolving the circuit analogue of the polynomial Hirsch conjecture. Specifically, we show that the circuit diameter of a polyhedron $P = \{x\in \mathbb{R}^n:\, A x = b, \, x \ge 0\}$ with $A\in\mathbb{R}^{m\times n}$ is $O(m^2 \log m)$. Our construction yields monotone circuit walks, giving the same bound for the monotone circuit diameter. The circuit diameter, introduced by Borgwardt, Finhold, and Hemmecke (SIDMA 2015), is a natural relaxation of the combinatorial diameter that allows steps along circuit directions rather than only along edges. All prior upper bounds on the circuit diameter were only weakly polynomial. Finding a circuit augmentation algorithm that matches this bound would yield a strongly polynomial time algorithm for linear programming, resolving Smale's 9th problem.

Induced Cycles of Many Lengths

from arXiv: Data Structures and Algorithms

Authors: Maria Chudnovsky, Ilya Maier

Let $G$ be a graph and let $\mathrm{cl}(G)$ be the number of distinct induced cycle lengths in $G$. We show that for $c,t\in \mathbb N$, every graph $G$ that does not contain an induced subgraph isomorphic to $K_{t+1}$ or $K_{t,t}$ and satisfies $\mathrm{cl}(G) \le c$ has bounded treewidth. As a consequence, we obtain a polynomial-time algorithm for deciding whether a graph $G$ contains induced cycles of at least three distinct lengths.


Towards Efficient Data Structures for Approximate Search with Range Queries

from arXiv: Data Structures and Algorithms

Authors: Ladan Kian, Dariusz R. Kowalski

Range queries are simple and popular types of queries used in data retrieval. However, extracting exact and complete information using range queries is costly. As a remedy, some previous work proposed a faster principle, {\em approximate} search with range queries, also called single range cover (SRC) search. It can, however, produce some false positives. In this work we introduce a new SRC search structure, a $c$-DAG (Directed Acyclic Graph), which provably decreases the average number of false positives by a logarithmic factor while keeping asymptotically the same time and memory complexities as a classic tree structure. A $c$-DAG is a tunable augmentation of the 1D-Tree with denser overlapping branches ($c \geq 3$ children per node). We perform a competitive analysis of a $c$-DAG with respect to the 1D-Tree and derive an additive constant time overhead and a multiplicative logarithmic improvement of the false positives ratio, on average. We also provide a generic framework to extend our results to empirical distributions of queries, and demonstrate its effectiveness on the Gowalla dataset. Finally, we quantify and discuss security and privacy aspects of SRC search on $c$-DAG vs 1D-Tree, mainly mitigation of structural leakage, which makes $c$-DAG a good data structure candidate for deployment in privacy-preserving systems (e.g., searchable encryption) and multimedia retrieval.


Algebraic Reduction to Improve an Optimally Bounded Quantum State Preparation Algorithm

from arXiv: Data Structures and Algorithms

Authors: Giacomo Belli, Michele Amoretti

The preparation of $n$-qubit quantum states is a cross-cutting subroutine for many quantum algorithms, and the effort to reduce its circuit complexity is a significant challenge. In the literature, the quantum state preparation algorithm by Sun et al. is known to be optimally bounded, defining the asymptotically optimal width-depth trade-off bounds with and without ancillary qubits. In this work, a simpler algebraic decomposition is proposed to separate the preparation of the real part of the desired state from the complex one, resulting in a reduction in terms of circuit depth, total gates, and CNOT count when $m$ ancillary qubits are available. The reduction in complexity is due to the use of a single operator $Λ$ for each uniformly controlled gate, instead of the three in the original decomposition. Using the PennyLane library, this new algorithm for state preparation has been implemented and tested in a simulated environment for both dense and sparse quantum states, including those that are random and of physical interest. Furthermore, its performance has been compared with that of Möttönen et al.'s algorithm, which is a de facto standard for preparing quantum states in cases where no ancillary qubits are used, highlighting interesting lines of development.


Fast Makespan Minimization via Short ILPs

from arXiv: Data Structures and Algorithms

Authors: Danny Hermelin, Dvir Shabtay

Short integer linear programs are programs with a relatively small number of constraints. We show how recent improvements on the running-times of solvers for such programs can be used to obtain fast pseudo-polynomial time algorithms for makespan minimization on a fixed number of parallel machines, and other related variants. The running times of our algorithms are all of the form $\widetilde{O}(p^{O(1)}_{\max}+n)$ or $\widetilde{O}(p^{O(1)}_{\max} \cdot n)$, where $p_{\max}$ is the maximum processing time in the input. These improve upon the time complexity of previously known algorithms for moderate values of $p_{\max}$.


Sunday, February 08

Danish Data Science Academy postdoc positions at University of Copenhagen (apply by March 4, 2026)

from CCI: jobs


The Danish Data Science Academy invites applications for postdoc fellowships for visionary and ambitious young data scientists. The call is at https://ddsa.dk/postdocfellowshipcall2026/. Inquiries about opportunities in the Algorithms and Complexity Section at the University of Copenhagen are welcome and can be directed to Jakob Nordstrom or other faculty in the section.

Website: https://ddsa.dk/postdocfellowshipcall2026/
Email: jn@di.ku.dk

By shacharlovett

TR26-013 | Quantum–Classical Equivalence for AND-Functions | Sreejata Bhattacharya, Farzan Byramji, Arkadev Chattopadhyay, Yogesh Dahiya, Shachar Lovett

from ECCC Papers

A major open problem at the interface of quantum computing and communication complexity is whether quantum protocols can be exponentially more efficient than classical protocols for computing total Boolean functions; the prevailing conjecture is that they are not. In a seminal work, Razborov (2002) resolved this question for AND-functions of the form $$ F(x,y) = f(x_1 \land y_1, \ldots, x_n \land y_n), $$ when the outer function $f$ is symmetric, by proving that their bounded-error quantum and classical communication complexities are polynomially related. Since then, extending this result to all AND-functions has remained open and has been posed by several authors. In this work, we settle this problem. We show that for every Boolean function $f$, the bounded-error quantum and classical communication complexities of the AND-function $f \circ \mathrm{AND}_2$ are polynomially related, up to polylogarithmic factors in $n$. Moreover, modulo such polylogarithmic factors, we prove that the bounded-error quantum communication complexity of $f \circ \mathrm{AND}_2$ is polynomially equivalent to its deterministic communication complexity, and that both are characterized—up to polynomial loss—by the logarithm of the De Morgan sparsity of $f$. Our results build on the recent work of Chattopadhyay, Dahiya, and Lovett (2025) on structural characterizations of non-sparse Boolean functions, which we extend to resolve the conjecture for general AND-functions.


Trevisan Award for Expository Work

from Theory Matters


Posted on behalf of Salil Vadhan, the award committee chair:

The Trevisan Award for Expository Work is a new SIGACT award created in memory of Luca Trevisan (1971-2024), with a nomination deadline of April 10, 2026.  

The award is intended to promote and recognize high-impact work expositing ideas and results from the Theory of Computation. The exposition can have various target audiences, e.g. people in this field, people in adjacent or remote academic fields, as well as the general public. The form of exposition can vary, and can include books, surveys, lectures, course materials, video, audio (e.g. podcasts), blogs and other media products. The award may be given to a single piece of work or a series produced over time. The award may be given to an individual, or a small group who together produced this expository work.

The awardee will receive USD 2000 (to be divided among the awardees if multiple), as well as travel support if needed to attend STOC, where the award will be presented. STOC 2026 is June 22-26 in Salt Lake City, Utah.

The endowment for this prize was initiated by a gift from Avi Wigderson, drawing on his Turing Award, and has been subsequently augmented by other individuals.

For more details see https://sigact.org/prizes/trevisan.html.

By shuchic

Saturday, February 07

Luca Trevisan Award for Expository Work

from Scott Aaronson


Friend-of-the-blog Salil Vadhan has asked me to share the following.


The Trevisan Award for Expository Work is a new SIGACT award created in memory of Luca Trevisan (1971-2024), with a nomination deadline of April 10, 2026.

The award is intended to promote and recognize high-impact work expositing ideas and results from the Theory of Computation. The exposition can have various target audiences, e.g. people in this field, people in adjacent or remote academic fields, as well as the general public. The form of exposition can vary, and can include books, surveys, lectures, course materials, video, audio (e.g. podcasts), blogs and other media products. The award may be given to a single piece of work or a series produced over time. The award may be given to an individual, or a small group who together produced this expository work.

The awardee will receive USD 2000 (to be divided among the awardees if multiple), as well as travel support if needed to attend STOC, where the award will be presented. STOC 2026 is June 22-26 in Salt Lake City, Utah.

The endowment for this prize was initiated by a gift from Avi Wigderson, drawing on his Turing Award, and has been subsequently augmented by other individuals.

For more details see here.

By Scott

Friday, February 06

Secrets of Intelligence Services

from Ben Recht

I asked Claude code to build Claude code. You won’t believe what happened next.

It is simultaneously true that coding agents are by far my favorite generative AI innovation and the most annoying to read about online. Agents profoundly democratize computing in ways as impactful as the personal computer and the graphical user interface were. Unfortunately, they are also sold by AI companies and thus come shrouded in animistic mysticism. This leads to dumb one-day stories like Moltbook or endless annoying Twitter threads by economists on the future of labor.

I know that any sufficiently advanced technology is indistinguishable from magic, and agents do feel magical in a way distinct from chatbots. They let you produce working software for almost any idea you have in your head, conceptually a major technological breakthrough beyond the language machine. But the funny thing about agents is that they are not particularly advanced technology in the grand scheme of things. Agents are far far simpler than LLMs.

I was trying to explain to a friend how agents worked, and I realized I didn’t have a good vocabulary for the job. Reading a bunch of tutorials and looking at a bunch of open source repositories didn’t help. On a whim, I tried something silly that turned out to be far more illuminating than I expected.

I asked Claude Code to build itself.

“Build a minimalist coding agent that exposes step by step how the agent works.”

The resulting 400 lines of overly commented Python were instructive. My “microagent” generates perfectly reasonable code, and, because Claude is sentient and clearly knows everything about me,1 the design uncannily emphasizes the power of feedback that I’ve been blogging about. The agent is simple, ordinary software in feedback with a complex, extraordinary language model. Agents are a perfect example of how the feedback interconnection of a basic, precise component with an uncertain, generalist component achieves outcomes neither component can enact on its own.

For the purposes of this post, the human (aka me) has one job: setting the initial conditions of the feedback loop with a natural language prompt. With this prompt, the agent kicks off a sequence of interactions with the LLM. At every turn, the agent receives a message from the LLM, parses it to identify a valid action, takes the action, and then sends the LLM a message describing the outcome. It also has the option to stop acting and end the loop.
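The turn-taking just described fits in a dozen lines of Python. Here is a minimal sketch, with hypothetical `call_llm` and `run_tool` callables standing in for the real API call and tool dispatch; the message schema mirrors the simplified JSON used in this post, not any vendor's actual format:

```python
def agent_loop(prompt, call_llm, run_tool):
    """Drive the feedback loop: prompt -> LLM -> tool -> LLM -> ... -> done.

    call_llm and run_tool are hypothetical stand-ins; this is a sketch of
    the loop described above, not the real microagent.
    """
    messages = [{"role": "user", "content": prompt}]
    while True:
        reply = call_llm(messages)          # send the full history, get a reply
        if reply.get("is_final"):           # the LLM signals it is finished
            return reply.get("content", "")
        # record the assistant turn, then execute each requested tool
        messages.append({"role": "assistant",
                         "content": reply.get("content", ""),
                         "tool_calls": reply.get("tool_calls", [])})
        for call in reply.get("tool_calls", []):
            result = run_tool(call["name"], call.get("parameters", {}))
            messages.append({"role": "tool",
                             "content": result,
                             "tool_use_id": call.get("id")})
```

Note that the loop itself contains no intelligence at all: it only shuttles messages back and forth and runs whatever tool the reply names.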

The set of valid agent actions, called “tools” in agent land, is small, but they take you well beyond chatting. The tools execute actual actions on your computer. Software can do a surprising amount with three simple actions: read_file, write_file, and shell. The first two are self-explanatory. read_file takes as input a filename and outputs the contents. write_file takes as input a filename and content and writes the content to disk under that filename. shell executes a command-line script and returns its output. These three operations let you copy a piece of software onto your computer, compile it, and run it. That’s enough power to do basically anything. These plugs into the world outside the chat window are what make agents so impressive.
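The three tools are themselves ordinary Python. A minimal sketch, deliberately omitting the path sandboxing and output limits a real agent would want:

```python
import subprocess

# Deliberately minimal sketches of the three tools described above;
# a real agent would sandbox paths and cap shell output.

def read_file(path):
    """Return the full contents of a file."""
    with open(path, "r", encoding="utf-8") as f:
        return f.read()

def write_file(path, content):
    """Write content to disk and report how much was written."""
    with open(path, "w", encoding="utf-8") as f:
        f.write(content)
    return f"Wrote {len(content.encode('utf-8'))} bytes to {path}"

def shell(command):
    """Run a command line and return whatever it printed."""
    proc = subprocess.run(command, shell=True,
                          capture_output=True, text=True)
    return proc.stdout + proc.stderr
```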

Here’s an architecture diagram of the simple agent, with labels denoting the order of the data flow. Steps 1-4 are repeated until the response tells the agent to be done.

Let me show you the gory details of how it all works in a simple example. I type the following text into the microagent software:

Create a file called test.txt with ‘hello world’

The agent now needs to send this command to the LLM. It uses JSON, not English, to do so. JSON, short for JavaScript Object Notation, is one of the core data formats on the internet. It consists of long spaghetti lists of “human readable” name-value pairs that encode data in ways that make it relatively simple for different applications to talk to each other. Here’s the agent’s message in JSON:

{
  "messages": [
    {
      "role": "Ben",
      "content": "Create a file called test.txt with 'hello world'"
    }
  ],
  "tools": [
    "read_file",
    "write_file",
    "shell"
  ]
}

This isn’t too bad to read, right? There are some curly brackets and straight brackets for grouping things. The first grouping encodes my prompt. The second grouping lists the agent’s available tools.

The agent sends this message to Anthropic via standard web protocols. All of the major AI players provide Python interfaces and JSON schemas for interacting with their language models. You send them a properly formatted package of data over an internet tube, and they will send you something back.
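Concretely, the “properly formatted package of data” is an HTTP POST. Here is a sketch using only Python’s standard library; the URL, header names, and payload schema are illustrative placeholders, not Anthropic’s actual API:

```python
import json
import urllib.request

def post_to_provider(url, api_key, messages, tools):
    """Ship one round of the conversation to an LLM provider as JSON.

    url, api_key, and the payload schema are placeholders mirroring the
    simplified format in this post; consult the provider's documentation
    for the real interface.
    """
    payload = json.dumps({"messages": messages, "tools": tools}).encode("utf-8")
    req = urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json",
                 "Authorization": f"Bearer {api_key}"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))
```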

In this example, Anthropic processes the JSON and returns the following:

{
  "content": "I'll create a file called test.txt with the content 'hello world'.",
  "tool_calls": [
    {
      "name": "write_file",
      "parameters": {
        "path": "test.txt",
        "content": "hello world"
      },
      "id": "toolu_01..."
    }
  ]
}

Here, the LLM’s reply includes English text designed to make you think it is conscious, which the agent dutifully prints to the screen. The LLM also sends actual code for the agent to run.

The agent expects the LLM to return JSON that indicates a command to execute one of its tools. I don’t know what actually happens on the Anthropic side of the line, but it doesn’t really matter. Maybe Anthropic just takes the raw JSON and feeds it as raw tokens into a language model. Maybe Anthropic has its own simple agent that reads the JSON and dispatches prompts to the LLMs accordingly. Maybe it’s actual magic of a mystical elf using dark sorcery.

Whatever is happening behind the curtain doesn’t matter to the agent. The agent expects a particular kind of answer back. If it’s not in a valid format, the agent can flag an error and quit. At the expense of a little more software complexity, you could add simple rules to “try again,” hoping that the internal stochasticity of the LLM might yield a valid JSON message after enough attempts. Regardless, the agent side is all just simple rules that you would use to build any internet application.
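The “try again” rule is itself only a few lines. A sketch, with a hypothetical `ask_again` callback standing in for a fresh query to the LLM:

```python
import json

def parse_reply(raw, retries=3, ask_again=None):
    """Try to parse the LLM reply as JSON, optionally re-asking on failure.

    ask_again is a hypothetical callback that re-queries the LLM and
    returns a fresh raw string; with None, invalid JSON fails immediately.
    """
    for attempt in range(retries):
        try:
            return json.loads(raw)
        except json.JSONDecodeError:
            if ask_again is None:
                break
            # lean on the LLM's internal stochasticity for a new sample
            raw = ask_again()
    raise ValueError("LLM never returned valid JSON")
```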

In the simple running example here, the JSON sent back from Anthropic is perfectly fine. After displaying the English to the screen to placate me, the agent, explicitly programmed to read JSON, skips to the field “tool_calls.” It then tries to execute whatever is there. It sees the tool “write_file” and the parameters to pass as arguments to the tool’s path and content. You would not be surprised to learn that applying write_file to that content creates a file “test.txt” containing the words “hello world.”
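The dispatch step just described amounts to a dictionary lookup. A minimal sketch, with the tool registry and message shapes mirroring the simplified format used in this post:

```python
def dispatch(reply, tools):
    """Execute every tool call in an LLM reply and collect result messages.

    tools maps tool names to Python callables; an unknown name produces
    an error message for the LLM rather than crashing the loop.
    """
    results = []
    for call in reply.get("tool_calls", []):
        fn = tools.get(call["name"])
        if fn is None:
            outcome = f"Error: unknown tool {call['name']}"
        else:
            outcome = fn(**call.get("parameters", {}))
        results.append({"role": "tool",
                        "content": str(outcome),
                        "tool_use_id": call.get("id")})
    return results
```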

Once this action is successfully executed, the agent sends the following message back to the LLM:

{
  "messages": [
    {
      "role": "Ben",
      "content": "Create a file called test.txt with 'hello world'"
    },
    {
      "role": "assistant",
      "content": "I'll create a file called test.txt with the content 'hello world'.",
      "tool_calls": [
        {
          "name": "write_file",
          "id": "toolu_01..."
        }
      ]
    },
    {
      "role": "tool",
      "content": "\u2713 Wrote 11 bytes to test.txt",
      "tool_use_id": "toolu_01..."
    }
  ],
  "tools": [
    "read_file",
    "write_file",
    "shell"
  ]
}

You now see why I put scare quotes around human readable. Even this simple example is quickly getting out of hand. However, this message reveals something surprising: in my trivial implementation, the agent and LLM do not share state. All interaction occurs through a stateless interface (a REST API), and the agent sends the entire interaction history to the LLM in every round. There are likely smarter ways to save tokens here, but those details are for another day. Agent software can create an illusion of state for the human, while actually just logging a long, unsummarized memory.

Whatever the case, Anthropic munges the long message and sends back the curt reply.

{
  "content": "The file test.txt has been successfully created with the content 'hello world'.",
  "is_final": true
}

The agent sees the LLM saying “is_final” is true. This is the signal to end the interaction. The agent prints the content to the screen and closes up shop.

The file test.txt has been successfully created with the content ‘hello world’.

The simple microagent can of course do far more complex tasks than this. You can try it out if you want. It’s not going to be as powerful or as fun as Claude Code, but it shows how far you can get with a hardcoded loop with access to a minimal set of tools. There is no machine learning or fancy “AI” in the agent. It’s boring, standard software. Even though people like to speak breathlessly about the power of agents and their apparent agency, agents are simple, rule-based systems that use a known, fixed JSON schema to communicate with LLMs.

LLMs, by contrast, remain deeply weird and mysterious. I don’t care how many pictures someone draws to explain transformers. It’s been a decade, and not a single explainer of LLM architecture has helped me understand what they actually do or why.

And yet, I don’t have to understand human language, machine language, or the training set of language models to use them in an agent loop. The agent has no idea how the LLM works, and neither do I. It is totally fine for me to think of the LLM as a stochastic parrot. I can also think of it as an omniscient god. The agent merely needs enough logic so that, when the LLM gives an unacceptable answer, the agent software can recover.

The microagent feedback loop parallels the simple amplifier story I told a couple of weeks ago. There we had an unpredictable but powerful amplifier, made precise by an accurate but small attenuator. In coding agents, feedback with the agent makes LLMs more predictable and verifiable. It also provides a path to translate the text output of LLMs into real, impactful actions. Agents are boring, precise, and logical. LLMs are mysterious, nondeterministic, and contain an unfathomable subset of human knowledge. In feedback, they are the most useful and engaging generative AI product yet.

Subscribe now

1

Goddamn it, I’m joking.

By Ben Recht

News for January 2026

from Property Testing Review


After some insanely busy months, we have a merely busy month. A couple of neat results on sublinear graph algorithms (a subject dear to me!), DNF testing, and two expositions that should be of interest to our readers.

When Local and Non-Local Meet: Quadratic Improvement for Edge Estimation with Independent Set Queries by Tomer Adar, Yahel Hotam, Amit Levi (arXiv). We’ve seen a lot of recent activity on the fundamental problem of edge estimation in graphs. Given a simple, connected, undirected graph \(G = (V,E)\) with \(n\) vertices, we want to get a \((1+\varepsilon)\)-approximation to the number of edges \(m\). In the standard adjacency list access model, a classic result of Goldreich-Ron proves that the sample complexity of this problem is \(\Theta(n/\sqrt{m})\) (ignoring \(poly(\varepsilon^{-1} \log n)\) factors). A completely different access model is the Independent Set (IS) model. For any set \(S\), an oracle outputs a single bit indicating whether an edge is present in \(S\). Again, in this model, the optimal bound is \(\Theta(n/\sqrt{m})\). This paper shows that with access to both oracles (standard adjacency list and IS queries), one gets a quadratic improvement and the complexity is \(\Theta(\sqrt{n/\sqrt{m}})\).

Spectral Clustering in Birthday Paradox Time by Michael Kapralov, Ekaterina Kochetkova, Weronika Wrzos-Kaminska (arXiv). Consider a (bounded degree) graph \(G\) that can be partitioned into \(k\) roughly equal “expanding clusters”. This means that one can remove an \(\varepsilon\)-fraction of the edges to get \(k\) connected components of size around \(n/k\), each of which has expansion at least \(\phi\). The aim is to get a sublinear oracle that determines the cluster that a vertex belongs to. The origins of this problem go back to problems in expansion testing and local expansion reconstruction. From a spectral perspective, the ideal “cluster classifier” is to simply take the bottom \(k\) eigenvectors of the Laplacian, and project every vertex onto this vector space. The corresponding embeddings of vertices in the same cluster will be close. This paper effectively implements this procedure in sublinear time, specifically \(\widetilde{O}(\sqrt{n})\) time, using a birthday paradox type collision argument to estimate the embedding similarities.

DNF formulas are efficiently testable with relative error by Xi Chen, William Pires, Toniann Pitassi, Rocco A. Servedio (arXiv). The relative error model for Boolean functions was recently proposed as an analogy to sparse graph testing. The standard notion of distance between two functions \(f, g: \{0,1\}^n \to \{0,1\}\) is just \(\|f - g\|_0/2^n\). But this is not meaningful when \(|f^{-1}(1)| \ll 2^n\). A natural notion of distance is to consider the sets \(f^{-1}(1)\) and \(g^{-1}(1)\) and measure their symmetric difference. One can now define distances to properties analogously. There have been numerous property testing results with this notion of distance, called the “relative error” setting. This paper considers the property of being a (small) DNF. Learning an \(s\)-term DNF requires \(n^{O(\log(s/\varepsilon))}\) queries. This paper shows that the (relative error) testing of \(s\)-term DNFs can be done in \(poly(s/\varepsilon)\) queries.

A short note on (distribution) testing lower bounds via polynomials by Clément Canonne (ECCC). As the title says, this is a short note on a fundamental question of proving lower bounds for distribution testing. For your favorite distribution testing problem, to prove a lower bound, you start with a “Yes” distribution \(\mathcal{Y}\) and a “No” distribution \(\mathcal{N}\). So \(\mathcal{Y}\) would satisfy the desired property (say, uniformity) and \(\mathcal{N}\) would not. Then, you need to argue some sort of “statistical indistinguishability” of samples: the distribution of a set of \(s\) samples from \(\mathcal{Y}\) should be almost the same as that from \(\mathcal{N}\). How does one prove the latter? A convenient lemma shows that if the first \(\ell\) moments of the distributions are the same, then we get an \(\Omega(k^{1-1/\ell})\) sample lower bound (\(k\) is the support size). The key insight is that such “matching moment” distributions/random variables can be generated by looking at specific univariate polynomials. It’s a really clever trick that looks at the powers of roots scaled by the derivative at those points. A really nice read overall!

Mathematical and computational perspectives on the Boolean and binary rank and their relation to the real rank by Michal Parnas (arXiv). This is a long survey on the notion of Boolean and binary rank, and an overview of their use in TCS as well as methods to bound this rank. Section 7.3 gives an excellent discussion of property testing problems on Boolean rank. It also gives some of the main open questions regarding property testing of low Boolean rank.

By Seshadhri

Reducing the Complexity of Matrix Multiplication to $O(N^2log_2N)$ by an Asymptotically Optimal Quantum Algorithm

from arXiv: Computational Complexity

Authors: Jiaqi Yao, Ding Liu

Matrix multiplication is a fundamental classical computing operation whose efficiency becomes a major challenge at scale, especially for machine learning applications. Quantum computing, with its inherent parallelism and exponential storage capacity, offers a potential solution to these limitations. This work presents a quantum kernel-based matrix multiplication algorithm (QKMM) that achieves an asymptotically optimal computational complexity of $ O(N^2 \log_2 N) $, outperforming the classical optimal complexity of $ O(N^{2.371552}) $, where $N$ denotes the matrix dimension. Through noiseless and noisy quantum simulation experiments, we demonstrate that the proposed algorithm not only exhibits superior theoretical efficiency but also shows practical advantages in runtime performance and stability.


Challenges in Solving Sequence-to-Graph Alignment with Co-Linear Structure

from arXiv: Computational Complexity

Authors: Xingfu Li

Sequence alignment is a cornerstone technique in computational biology for assessing similarities and differences among biological sequences. A key variant, sequence-to-graph alignment, plays a crucial role in effectively capturing genetic variations. In this work, we introduce two novel formulations within this framework: the Gap-Sensitive Co-Linear Chaining (Gap-CLC) problem and the Co-Linear Chaining with Errors based on Edit Distance (Edit-CLC) problem, and we investigate their computational complexity. We show that solving the Gap-CLC problem in sub-quadratic time is highly unlikely unless the Strong Exponential Time Hypothesis (SETH) fails -- even when restricted to binary alphabets. Furthermore, we establish that the Edit-CLC problem is NP-hard in the presence of errors within the graph. These findings emphasize that incorporating co-linear structures into sequence-to-graph alignment models fails to reduce computational complexity, highlighting that these models remain at least as computationally challenging to solve as those lacking such prior information.

Certifiable Boolean Reasoning Is Universal

from arXiv: Computational Complexity

Authors: Wenhao Li, Anastasis Kratsios, Hrad Ghoukasian, Dennis Zvigelsky

The proliferation of agentic systems has thrust the reasoning capabilities of AI into the forefront of contemporary machine learning. While it is known that there \emph{exist} neural networks which can reason through any Boolean task $f:\{0,1\}^B\to\{0,1\}$, in the sense that they emulate Boolean circuits with fan-in $2$ and fan-out $1$ gates, trained models have been repeatedly demonstrated to fall short of these theoretical ideals. This raises the question: \textit{Can one exhibit a deep learning model which \textbf{certifiably} always reasons and can \textbf{universally} reason through any Boolean task?} Moreover, such a model should ideally require few parameters to solve simple Boolean tasks. We answer this question affirmatively by exhibiting a deep learning architecture which parameterizes distributions over Boolean circuits with the guarantee that, for every parameter configuration, a sample is almost surely a valid Boolean circuit (and hence admits an intrinsic circuit-level certificate). We then prove a universality theorem: for any Boolean $f:\{0,1\}^B\to\{0,1\}$, there exists a parameter configuration under which the sampled circuit computes $f$ with arbitrarily high probability. When $f$ is an $\mathcal{O}(\log B)$-junta, the required number of parameters scales linearly with the input dimension $B$. Empirically, on a controlled truth-table completion benchmark aligned with our setting, the proposed architecture trains reliably and achieves high exact-match accuracy while preserving the predicted structure: every internal unit is Boolean-valued on $\{0,1\}^B$. Matched MLP baselines reach comparable accuracy, but only about $10\%$ of hidden units admit a Boolean representation; i.e.\ are two-valued over the Boolean cube.

Computing Diffusion Geometry

from arXiv: Computational Geometry

Authors: Iolo Jones, David Lanners

Calculus and geometry are ubiquitous in the theoretical modelling of scientific phenomena, but have historically been very challenging to apply directly to real data as statistics. Diffusion geometry is a new theory that reformulates classical calculus and geometry in terms of a diffusion process, allowing these theories to generalise beyond manifolds and be computed from data. This work introduces a new computational framework for diffusion geometry that substantially broadens its practical scope and improves its precision, robustness to noise, and computational complexity. We present a range of new computational methods, including all the standard objects from vector calculus and Riemannian geometry, and apply them to solve spatial PDEs and vector field flows, find geodesic (intrinsic) distances, curvature, and several new topological tools like de Rham cohomology, circular coordinates, and Morse theory. These methods are data-driven, scalable, and can exploit highly optimised numerical tools for linear algebra.

Competitive Analysis of Online Facility Assignment Algorithms on Discrete Grid Graphs: Performance Bounds and Remediation Strategies

from arXiv: Data Structures and Algorithms

Authors: Lamya Alif, Raian Tasnim Saoda, Sumaiya Afrin, Md. Rawha Siddiqi Riad, Md. Tanzeem Rahat, Md Manzurul Hasan

We study the \emph{Online Facility Assignment} (OFA) problem on a discrete $r\times c$ grid graph under the standard model of Ahmed, Rahman, and Kobourov: a fixed set of facilities is given, each with limited capacity, and an online sequence of unit-demand requests must be irrevocably assigned upon arrival to an available facility, incurring Manhattan ($L_1$) distance cost. We investigate how the discrete geometry of grids interacts with capacity depletion by analyzing two natural baselines and one capacity-aware heuristic. First, we give explicit adversarial sequences on grid instances showing that purely local rules can be forced into large competitive ratios: (i) a capacity-sensitive weighted-Voronoi heuristic (\textsc{CS-Voronoi}) can suffer cascading \emph{region-collapse} effects when nearby capacity is exhausted; and (ii) nearest-available \textsc{Greedy} (with randomized tie-breaking) can be driven into repeated long reassignments via an \emph{oscillation} construction. These results formalize geometric failure modes that are specific to discrete $L_1$ metrics with hard capacities. Motivated by these lower bounds, we then discuss a semi-online extension in which the algorithm may delay assignment for up to $τ$ time steps and solve each batch optimally via a min-cost flow computation. We present this batching framework as a remediation strategy and delineate the parameters that govern its performance, while leaving sharp competitive guarantees for this semi-online variant as an open direction.
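As context, the nearest-available Greedy baseline analysed above is easy to state in code. The sketch below is a minimal illustration of that rule under the paper's cost model (Manhattan distance, hard capacities, irrevocable assignment); the `(x, y, capacity)` facility format and deterministic tie-breaking are our own illustration choices, not the paper's.

```python
def greedy_assign(facilities, requests):
    """Nearest-available Greedy for online facility assignment on a grid:
    each request is irrevocably assigned to the closest facility (L1
    distance) that still has spare capacity. Returns the total cost."""
    cap = {(x, y): c for x, y, c in facilities}
    total = 0
    for rx, ry in requests:
        # Pick the open facility minimising Manhattan distance.
        fx, fy = min((f for f in cap if cap[f] > 0),
                     key=lambda f: abs(f[0] - rx) + abs(f[1] - ry))
        cap[(fx, fy)] -= 1
        total += abs(fx - rx) + abs(fy - ry)
    return total
```

The paper's oscillation lower bound exploits exactly this rule: an adversary can exhaust nearby capacity so that later requests pay long detours.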

Location-Aware Dispersion on Anonymous Graphs

from arXiv: Data Structures and Algorithms

Authors: Himani, Supantha Pandit, Gokarna Sharma

The well-studied DISPERSION problem is a fundamental coordination problem in distributed robotics, where a set of mobile robots must relocate so that each occupies a distinct node of a network. DISPERSION assumes that a robot can settle at any node as long as no other robot settles on that node. In this work, we introduce LOCATION-AWARE DISPERSION, a novel generalization of DISPERSION that incorporates location awareness: Let $G = (V, E)$ be an anonymous, connected, undirected graph with $n = |V|$ nodes, each labeled with a color $\sf{col}(v) \in C = \{c_1, \dots, c_t\}, t\leq n$. A set $R = \{r_1, \dots, r_k\}$ of $k \leq n$ mobile robots is given, where each robot $r_i$ has an associated color $\mathsf{col}(r_i) \in C$. Initially placed arbitrarily on the graph, the goal is to relocate the robots so that each occupies a distinct node of the same color. When $|C|=1$, LOCATION-AWARE DISPERSION reduces to DISPERSION. There is a solution to DISPERSION in graphs with any $k\leq n$ without knowing $k,n$. Like DISPERSION, the goal is to solve LOCATION-AWARE DISPERSION minimizing both time and memory requirement at each agent. We develop several deterministic algorithms with guaranteed bounds on both time and memory requirement. We also give an impossibility and a lower bound for any deterministic algorithm for LOCATION-AWARE DISPERSION. To the best of our knowledge, the presented results collectively establish the algorithmic feasibility of LOCATION-AWARE DISPERSION in anonymous networks and also highlight the challenges on getting an efficient solution compared to the solutions for DISPERSION.

Adaptive Hashing: Faster Hash Functions with Fewer Collisions

from arXiv: Data Structures and Algorithms

Authors: Gábor Melis

Hash tables are ubiquitous, and the choice of hash function, which maps a key to a bucket, is key to their performance. We argue that the predominant approach of fixing the hash function for the lifetime of the hash table is suboptimal and propose adapting it to the current set of keys. In the prevailing view, good hash functions spread the keys ``randomly'' and are fast to evaluate. General-purpose ones (e.g. Murmur) are designed to do both while remaining agnostic to the distribution of the keys, which limits their bucketing ability and wastes computation. When these shortcomings are recognized, one may specify a hash function more tailored to some assumed key distribution, but doing so almost always introduces an unbounded risk in case this assumption does not bear out in practice. At the other, fully key-aware end of the spectrum, Perfect Hashing algorithms can discover hash functions to bucket a given set of keys optimally, but they are costly to run and require the keys to be known and fixed ahead of time. Our main conceptual contribution is that adapting the hash table's hash function to the keys online is necessary for the best performance, as adaptivity allows for better bucketing of keys \emph{and} faster hash functions. We instantiate the idea of online adaptation with minimal overhead and no change to the hash table API. The experiments show that the adaptive approach marries the common-case performance of weak hash functions with the robustness of general-purpose ones.
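To make the adaptation idea concrete, here is a toy sketch of a hash set that re-picks its (seeded) hash function when the resident keys collide too badly. This is a minimal illustration of the general principle of adapting the hash function to the current key set, not the algorithm from the paper; the seed search, thresholds, and multiplicative hash are all our own assumptions.

```python
class AdaptiveHashSet:
    """Toy bucketed hash set that switches to a different seeded hash
    function when one bucket grows too long (illustration only)."""

    def __init__(self, nbuckets=16):
        self.nbuckets = nbuckets
        self.seed = 0
        self.buckets = [[] for _ in range(nbuckets)]
        self.size = 0

    def _hash(self, key):
        # Seeded multiplicative hash; changing the seed changes the function.
        a = (2 * self.seed + 1) * 0x9E3779B97F4A7C15
        return ((key * a) >> 40) % self.nbuckets

    def _max_load(self):
        return max(len(b) for b in self.buckets)

    def _rehash(self, new_seed):
        keys = [k for b in self.buckets for k in b]
        self.seed = new_seed
        self.buckets = [[] for _ in range(self.nbuckets)]
        for k in keys:
            self.buckets[self._hash(k)].append(k)

    def add(self, key):
        b = self.buckets[self._hash(key)]
        if key in b:
            return
        b.append(key)
        self.size += 1
        # Adapt: if a bucket gets too long, try a few alternative seeds
        # and keep the one with the smallest maximum bucket load.
        if len(b) > 4 and self.size <= self.nbuckets * 4:
            best = (self._max_load(), self.seed)
            for s in range(1, 8):
                self._rehash(s)
                best = min(best, (self._max_load(), s))
            self._rehash(best[1])

    def __contains__(self, key):
        return key in self.buckets[self._hash(key)]
```

The paper's point is that this kind of online adaptation can be done with minimal overhead and without changing the hash table API, which the sketch above mimics by keeping `add`/`in` semantics unchanged.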

Improved SDP-Based Algorithm for Coloring 3-Colorable Graphs

from arXiv: Data Structures and Algorithms

Authors: Nikhil Bansal, Neng Huang, Euiwoong Lee

We present a polynomial-time algorithm that colors any 3-colorable $n$-vertex graph using $O(n^{0.19539})$ colors, improving upon the previous best bound of $\widetilde{O}(n^{0.19747})$ by Kawarabayashi, Thorup, and Yoneda [STOC 2024]. Our result constitutes the first progress in nearly two decades on SDP-based approaches to this problem. The earlier SDP-based algorithms of Arora, Chlamtáč, and Charikar [STOC 2006] and Chlamtáč [FOCS 2007] rely on extracting a large independent set from a suitably "random-looking" second-level neighborhood, under the assumption that the KMS algorithm [Karger, Motwani, and Sudan, JACM 1998] fails to find one globally. We extend their analysis to third-level neighborhoods. We then come up with a new vector $5/2$-coloring, which allows us to extract a large independent set from some third-level neighborhood. The new vector coloring construction may be of independent interest.

The Quantum Message Complexity of Distributed Wake-Up with Advice

from arXiv: Data Structures and Algorithms

Authors: Peter Robinson, Ming Ming Tan

We consider the distributed wake-up problem with advice, where nodes are equipped with initial knowledge about the network at large. After the adversary awakens a subset of nodes, an oracle computes a bit string (``the advice'') for each node, and the goal is to wake up all sleeping nodes efficiently. We present the first upper and lower bounds on the message complexity for wake-up in the quantum routing model, introduced by Dufoulon, Magniez, and Pandurangan (PODC 2025). In more detail, we give a distributed advising scheme that, given $α$ bits of advice per node, wakes up all nodes with a message complexity of $O( \sqrt{\frac{n^3}{2^{\max\{\lfloor (α-1)/2 \rfloor, 0\}}}}\cdot\log n )$ with high probability. Our result breaks the $Ω( \frac{n^2}{2^α} )$ barrier known for the classical port numbering model in sufficiently dense graphs. To complement our algorithm, we give a lower bound on the message complexity for distributed quantum algorithms: By leveraging a lower bound result for the single-bit descriptor problem in the query complexity model, we show that wake-up has a quantum message complexity of $Ω( n^{3/2} )$ without advice, which holds independently of how much time we allow. In the setting where an adversary decides which nodes start the algorithm, most graph problems of interest implicitly require solving wake-up, and thus the same lower bound also holds for other fundamental problems such as single-source broadcast and spanning tree construction.

A Structural Equivalence of Symmetric TSP to a Constrained Group Steiner Tree Problem

from arXiv: Data Structures and Algorithms

Authors: Yılmaz Arslanoğlu

We present a brief structural equivalence between the symmetric TSP and a constrained Group Steiner Tree Problem (cGSTP) defined on a simplicial incidence graph. Given the complete weighted graph on the city set V, we form the bipartite incidence graph between triangles and edges. Selecting an admissible, disk-like set of triangles induces a unique boundary cycle. With global connectivity and local regularity constraints, maximizing net weight in the cGSTP is exactly equivalent to minimizing the TSP tour length.

Tight FPT Approximations for Fair $k$-center with Outliers

from arXiv: Data Structures and Algorithms

Authors: Ameet Gadekar

The $k$-center problem is a fundamental clustering objective that has been extensively studied in approximation algorithms. Recent work has sought to incorporate modern constraints such as fairness and robustness, motivated by biased and noisy data. In this paper, we study fair $k$-center with outliers, where centers must respect group-based representation constraints while up to $z$ points may be discarded. While a bi-criteria FPT approximation was previously known, no true approximation algorithm was available for this problem. We present the first deterministic $3$-approximation algorithm running in fixed-parameter tractable time parameterized by $k$. Our approach departs from projection-based methods and instead directly constructs a fair solution using a novel iterative ball-finding framework, based on a structural trichotomy that enables fixed-parameter approximation for the problem. We further extend our algorithm to fair $k$-supplier with outliers and to the more general fair-range setting with both lower and upper bounds. Finally, we show that improving the approximation factor below $3$ is $\mathrm{W[2]}$-hard, establishing the optimality of our results.
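For readers new to the objective, the classic farthest-first traversal gives a 2-approximation for plain $k$-center (no fairness constraints, no outliers). The sketch below shows only that background baseline, to fix what "centers" and "covering radius" mean; the paper's fair/outlier algorithm is substantially more involved and is not reproduced here.

```python
import math

def gonzalez_k_center(points, k):
    """Farthest-first traversal (Gonzalez): repeatedly pick the point
    farthest from the current centers. A 2-approximation for vanilla
    k-center; shown as background only."""
    centers = [points[0]]                 # arbitrary first center
    d = [math.dist(p, centers[0]) for p in points]
    while len(centers) < k:
        i = max(range(len(points)), key=lambda j: d[j])  # farthest point
        centers.append(points[i])
        d = [min(d[j], math.dist(points[j], points[i]))
             for j in range(len(points))]
    return centers, max(d)                # centers and covering radius
```

The fair variant additionally constrains how many centers each demographic group contributes, and the outlier variant lets up to $z$ points be excluded from the radius; the paper shows a 3-approximation in FPT time in $k$ is possible and optimal under $\mathrm{W[2]}$-hardness.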

Linear Systems and Eigenvalue Problems: Open Questions from a Simons Workshop

from arXiv: Data Structures and Algorithms

Authors: Noah Amsel, Yves Baumann, Paul Beckman, Peter Bürgisser, Chris Camaño, Tyler Chen, Edmond Chow, Anil Damle, Michal Derezinski, Mark Embree, Ethan N. Epperly, Robert Falgout, Mark Fornace, Anne Greenbaum, Chen Greif, Diana Halikias, Zhen Huang, Elias Jarlebring, Yiannis Koutis, Daniel Kressner, Rasmus Kyng, Jörg Liesen, Jackie Lok, Raphael A. Meyer, Yuji Nakatsukasa, Kate Pearce, Richard Peng, David Persson, Eliza Rebrova, Ryan Schneider, Rikhav Shah, Edgar Solomonik, Nikhil Srivastava, Alex Townsend, Robert J. Webber, Jess Williams

This document presents a series of open questions arising in matrix computations, i.e., the numerical solution of linear algebra problems. It is a result of working groups at the workshop \emph{Linear Systems and Eigenvalue Problems}, which was organized at the Simons Institute for the Theory of Computing program on \emph{Complexity and Linear Algebra} in Fall 2025. The complexity and numerical solution of linear algebra problems is a crosscutting area between theoretical computer science and numerical analysis. The value of the particular problem formulations here is that they were produced via discussions between researchers from both groups. The open questions are organized in five categories: iterative solvers for linear systems, eigenvalue computation, low-rank approximation, randomized sketching, and other areas including tensors, quantum systems, and matrix functions.

Polynomial-Time Solutions for Longest Common Subsequence Related Problems Between a Sequence and a Pangenome Graph

from arXiv: Data Structures and Algorithms

Authors: Xingfu Li, Yongping Wang

A pangenome captures the genetic diversity across multiple individuals simultaneously, providing a more comprehensive reference for genome analysis than a single linear genome, which may introduce allele bias. A widely adopted pangenome representation is a node-labeled directed graph, wherein the paths correspond to plausible genomic sequences within a species. Consequently, evaluating sequence-to-pangenome graph similarity constitutes a fundamental task in pangenome construction and analysis. This study explores the Longest Common Subsequence (LCS) problem and three of its variants involving a sequence and a pangenome graph. We present four polynomial-time reductions that transform these LCS-related problems into the longest path problem in a directed acyclic graph (DAG). These reductions demonstrate that all four problems can be solved in polynomial time, establishing their membership in the complexity class P.
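The textbook special case of this reduction idea is worth recalling: ordinary LCS of two strings is already a longest-path problem on the DAG whose nodes are prefix pairs $(i, j)$, with weight-1 edges on character matches. The sketch below shows only this classic two-string case, processed in topological order; the paper's reductions handle a sequence against a full pangenome graph and are more general.

```python
def lcs_via_dag(s, t):
    """LCS of two strings phrased as longest path in the prefix-pair DAG:
    node (i, j) for prefixes s[:i], t[:j]; skip edges have weight 0 and
    match edges weight 1. longest[i][j] is the best path into (i, j),
    filled in topological (row-major) order."""
    n, m = len(s), len(t)
    longest = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n):
        for j in range(m):
            best = max(longest[i][j + 1], longest[i + 1][j])  # skip edges
            if s[i] == t[j]:
                best = max(best, longest[i][j] + 1)           # match edge
            longest[i + 1][j + 1] = best
    return longest[n][m]
```

Because the grid DAG has $O(nm)$ edges and longest path in a DAG is linear-time, the whole computation is polynomial, which is the same mechanism behind the paper's membership-in-P results.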

Learning fermionic linear optics with Heisenberg scaling and physical operations

from arXiv: Data Structures and Algorithms

Authors: Aria Christensen, Andrew Zhao

We revisit the problem of learning fermionic linear optics (FLO), also known as fermionic Gaussian unitaries. Given black-box query access to an unknown FLO, previous proposals required $\widetilde{\mathcal{O}}(n^5 / \varepsilon^2)$ queries, where $n$ is the system size and $\varepsilon$ is the error in diamond distance. These algorithms also use unphysical operations (i.e., violating fermionic superselection rules) and/or $n$ auxiliary modes to prepare Choi states of the FLO. In this work, we establish efficient and experimentally friendly protocols that obey superselection, use minimal ancilla (at most $1$ extra mode), and exhibit improved dependence on both parameters $n$ and $\varepsilon$. For arbitrary (active) FLOs this algorithm makes at most $\widetilde{\mathcal{O}}(n^4 / \varepsilon)$ queries, while for number-conserving (passive) FLOs we show that $\mathcal{O}(n^3 / \varepsilon)$ queries suffice. The complexity of the active case can be further reduced to $\widetilde{\mathcal{O}}(n^3 / \varepsilon)$ at the cost of using $n$ ancilla. This marks the first FLO learning algorithm that attains Heisenberg scaling in precision. As a side result, we also demonstrate an improved copy complexity of $\widetilde{\mathcal{O}}(n η^2 / \varepsilon^2)$ for time-efficient state tomography of $η$-particle Slater determinants in $\varepsilon$ trace distance, which may be of independent interest.

Parameterized Algorithms for the Drone Delivery Problem

from arXiv: Data Structures and Algorithms

Authors: Simon Bartlmae, Andreas Hene, Joshua Könen, Heiko Röglin

Timely delivery and optimal routing remain fundamental challenges in the modern logistics industry. Building on prior work that considers single-package delivery across networks using multiple types of collaborative agents with restricted movement areas (e.g., drones or trucks), we examine the complexity of the problem under structural and operational constraints. Our focus is on minimizing total delivery time by coordinating agents that differ in speed and movement range across a graph. This problem formulation aligns with the recently proposed Drone Delivery Problem with respect to delivery time (DDT), introduced by Erlebach et al. [ISAAC 2022]. We first resolve an open question posed by Erlebach et al. [ISAAC 2022] by showing that even when the delivery network is a path graph, DDT admits no polynomial-time approximation within any polynomially encodable factor $a(n)$, unless P=NP. Additionally, we identify the intersection graph of the agents, where nodes represent agents and edges indicate an overlap of the movement areas of two agents, as an important structural concept. For path graphs, we show that DDT becomes tractable when parameterized by the treewidth $w$ of the intersection graph, and we present an exact FPT algorithm with running time $f(w)\cdot\text{poly}(n,k)$, for some computable function $f$. For general graphs, we give an FPT algorithm with running time $f(Δ,w)\cdot\text{poly}(n,k)$, where $Δ$ is the maximum degree of the intersection graph. In the special case where the intersection graph is a tree, we provide a simple polynomial-time algorithm.

Fellow at UC Berkeley (apply by April 1, 2026)

from CCI: jobs

The Simons Institute for the Theory of Computing invites applications for Simons-Berkeley Research Fellowships for the Spring 2027 semester. Fellowships are intended for exceptional scientists to participate in at least one of the semester-long programs at the institute. The Institute will host a program on “Symmetry in Efficient Computation with Local Constraints” in Spring 2027.

Website: https://simons.berkeley.edu/research-fellowship-call-applications
Email: cawinter@berkeley.edu

By shacharlovett

Thursday, February 05

Trevisan Award for Expository Work

from Windows on Theory

[Guest post by Salil Vadhan; Boaz’s note: I am thrilled to see this award. Luca’s expository writing on his blog, surveys, and lecture notes was and is an amazing resource. Luca had a knack for explaining some of the most difficult topics in intuitive ways that cut to their heart. I hope it will inspire more people to follow in his footsteps.]

The Trevisan Award for Expository Work is a new SIGACT award created in memory of Luca Trevisan (1971-2024), with a nomination deadline of April 10, 2026.

The award is intended to promote and recognize high-impact work expositing ideas and results from the Theory of Computation. The exposition can have various target audiences, e.g. people in this field, people in adjacent or remote academic fields, as well as the general public. The form of exposition can vary, and can include books, surveys, lectures, course materials, video, audio (e.g. podcasts), blogs and other media products. The award may be given to a single piece of work or a series produced over time. The award may be given to an individual, or a small group who together produced this expository work.

The awardee will receive USD 2000 (to be divided among the awardees if multiple), as well as travel support if needed to attend STOC, where the award will be presented. STOC 2026 is June 22-26 in Salt Lake City, Utah.

The endowment for this prize was initiated by a gift from Avi Wigderson, drawing on his Turing Award, and has been subsequently augmented by other individuals.

For more details see https://sigact.org/prizes/trevisan.html.

By Boaz Barak

Lust for life

from Emanuele Viola

I just finished reading the book. I savored it over months, reading just a few pages each day. It was fun to read about something I knew nothing about: painting, coal miners, impressionism, and the life of Vincent van Gogh. To think that I missed the exhibition at the MFA right in front of my office, as well as the museum in Amsterdam…

I came to know about this book after someone pointed me to this memorial about Maryam Mirzakhani, according to which the book “exemplified her excitement about work and life in general.” I wish I had talked more to her when we crossed paths at Harvard.

By Manu

Postdoc at West Virginia University (apply by May 1, 2026)

from CCI: jobs

The Lane Department of Computer Science and Electrical Engineering at West Virginia University invites applications for a Postdoctoral Fellow (funded by Algorithmic Foundations, NSF) in the general areas of theoretical computer science and algorithmic operations research, with an emphasis on computational complexity and game theory. The position is funded for two years, starting August 1, 2026.

Website: https://wvu.taleo.net/careersection/faculty/jobdetail.ftl?job=28500&tz=GMT-05:00&tzname=America/New_York
Email: k.subramani@mail.wvu.edu

By shacharlovett

Quantum Advantage in Decision Trees: A Weighted Graph and $L_1$ Norm Approach

from arXiv: Computational Complexity

Authors: Sebastian Alberto Grillo, Bernardo Daniel Dávalos, Rodney Fabian Franco Torres, Franklin de Lima Marquezino, Edgar López Pezoa

The analysis of the computational power of single-query quantum algorithms is important because they must extract maximal information from one oracle call, revealing fundamental limits of quantum advantage and enabling optimal, resource-efficient quantum computation. This paper proposes a formulation of single-query quantum decision trees as weighted graphs. This formulation has the advantage that it facilitates the analysis of the $L_1$ spectral norm of the algorithm output. This advantage is based on the fact that a high $L_1$ spectral norm of the output of a quantum decision tree is a necessary condition to outperform its classical counterpart. We propose heuristics for maximizing the $L_{1}$ spectral norm, show how to combine weighted graphs to generate sequences with strictly increasing norm, and present functions exhibiting exponential quantum advantage. Finally, we establish a necessary condition linking single-query quantum advantage to the asymptotic growth of measurement projector dimensions.

The Complexity of Min-Max Optimization with Product Constraints

from arXiv: Computational Complexity

Authors: Martino Bernasconi, Matteo Castiglioni

We study the computational complexity of the problem of computing local min-max equilibria of games with a nonconvex-nonconcave utility function $f$. From the work of Daskalakis, Skoulakis, and Zampetakis [DSZ21], this problem was known to be hard in the restrictive case in which players are required to play strategies that are jointly constrained, leaving open the question of its complexity under more natural constraints. In this paper, we settle the question and show that the problem is PPAD-hard even under product constraints and, in particular, over the hypercube.

On the Complexity of Vertex-Splitting Into an Interval Graph

from arXiv: Computational Complexity

Authors: Faisal N. Abu-Khzam, Dipayan Chakraborty, Lucas Isenmann, Nacim Oijid

Vertex splitting is a graph modification operation in which a vertex is replaced by multiple vertices such that the union of their neighborhoods equals the neighborhood of the original vertex. We introduce and study vertex splitting as a graph modification operation for transforming graphs into interval graphs. Given a graph $G$ and an integer $k$, we consider the problem of deciding whether $G$ can be transformed into an interval graph using at most $k$ vertex splits. We prove that this problem is NP-hard, even when the input is restricted to subcubic planar bipartite graphs. We further observe that vertex splitting differs fundamentally from vertex and edge deletions as graph modification operations when the objective is to obtain a chordal graph, even for graphs with maximum independent set size at most two. On the positive side, we give a polynomial-time algorithm for transforming, via a minimum number of vertex splits, a given graph into a disjoint union of paths, and we show that splitting triangle-free graphs into unit interval graphs is also solvable in polynomial time.

Graph-Theoretic Analysis of Phase Optimization Complexity in Variational Wave Functions for Heisenberg Antiferromagnets

from arXiv: Computational Complexity

Authors: Mahmud Ashraf Shamim, Moshiur Rahman, Mohamed Hibat-Allah, Paulo T Araujo

Despite extensive study, the phase structure of the wavefunctions in frustrated Heisenberg antiferromagnets (HAF) is not yet systematically characterized. In this work, we represent the Hilbert space of an HAF as a weighted graph, which we term the Hilbert graph (HG), whose vertices are spin configurations and whose edges are generated by off-diagonal spin-flip terms of the Heisenberg Hamiltonian, with weights set by products of wavefunction amplitudes. Holding the amplitudes fixed and restricting phases to $\mathbb{Z}_2$ values, the phase-dependent variational energy can be recast as a classical Ising antiferromagnet on the HG, so that phase reconstruction of the ground state reduces to a weighted Max-Cut instance. This shows that phase reconstruction for HAF is worst-case NP-hard and provides a direct link between wavefunction sign structure and combinatorial optimization.
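The Ising-to-Max-Cut step rests on a standard identity: for $\pm 1$ spins, the antiferromagnetic energy $\sum_{(i,j)} w_{ij} s_i s_j$ equals the total edge weight minus twice the cut weight, so minimizing the energy is the same as maximizing the cut. A small exhaustive numerical check of this identity on a toy weighted graph (illustrative only, not the paper's Hilbert graph construction):

```python
import itertools

def ising_energy(weights, spins):
    """Antiferromagnetic Ising energy: sum of w_ij * s_i * s_j over edges."""
    return sum(w * spins[i] * spins[j] for (i, j), w in weights.items())

def cut_weight(weights, spins):
    """Total weight of edges crossing the partition induced by spin signs."""
    return sum(w for (i, j), w in weights.items() if spins[i] != spins[j])

# Verify energy = total_weight - 2 * cut for every Z_2 phase assignment
# on a small weighted graph, so argmin(energy) = argmax(cut).
weights = {(0, 1): 1.5, (1, 2): 2.0, (0, 2): 0.5, (2, 3): 1.0}
total = sum(weights.values())
for spins in itertools.product([-1, 1], repeat=4):
    assert abs(ising_energy(weights, spins) - (total - 2 * cut_weight(weights, spins))) < 1e-12
```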

The matrix-vector complexity of $Ax=b$

from arXiv: Data Structures and Algorithms

Authors: Michał Dereziński, Ethan N. Epperly, Raphael A. Meyer

Matrix-vector algorithms, particularly Krylov subspace methods, are widely viewed as the most effective algorithms for solving large systems of linear equations. This paper establishes lower bounds on the worst-case number of matrix-vector products needed by such an algorithm to approximately solve a general linear system. The first main result is that, for a matrix-vector algorithm which can perform products with both a matrix and its transpose, $Ω(κ\log(1/\varepsilon))$ matrix-vector products are necessary to solve a linear system with condition number $κ$ to accuracy $\varepsilon$, matching an upper bound for conjugate gradient on the normal equations. The second main result is that one-sided algorithms, which lack access to the transpose, must use $n$ matrix-vector products to solve an $n \times n$ linear system, even when the problem is perfectly conditioned. Both main results include explicit constants that match known upper bounds up to a factor of four. These results rigorously demonstrate the limitations of matrix-vector algorithms and confirm the optimality of widely used Krylov subspace algorithms.
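The upper bound matched by the first result is conjugate gradient applied to the normal equations $A^\top A x = A^\top b$, which spends exactly one product with $A$ and one with $A^\top$ per iteration. A minimal pure-Python sketch of that classical method (CGNR; illustrative only, with no preconditioning and dense list-of-lists matrices):

```python
def matvec(M, v):
    return [sum(Mi[j] * v[j] for j in range(len(v))) for Mi in M]

def transpose(M):
    return [list(col) for col in zip(*M)]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cgnr(A, b, iters=50, tol=1e-12):
    """Conjugate gradient on the normal equations A^T A x = A^T b (CGNR).
    Each iteration performs one product with A and one with A^T."""
    At = transpose(A)
    x = [0.0] * len(A[0])
    r = list(b)                    # residual b - A x
    z = matvec(At, r)              # normal-equations residual A^T r
    p = list(z)
    zz = dot(z, z)
    for _ in range(iters):
        if zz < tol:
            break
        Ap = matvec(A, p)          # the one product with A
        alpha = zz / dot(Ap, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * qi for ri, qi in zip(r, Ap)]
        z = matvec(At, r)          # the one product with A^T
        zz_new = dot(z, z)
        p = [zi + (zz_new / zz) * pi for zi, pi in zip(z, p)]
        zz = zz_new
    return x
```

In exact arithmetic this converges in at most $n$ iterations, and the classical analysis gives the $O(\kappa \log(1/\varepsilon))$ iteration bound that the paper's lower bound matches.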

The Needle is a Thread: Finding Planted Paths in Noisy Process Trees

from arXiv: Data Structures and Algorithms

Authors: Maya Le, Paweł Prałat, Aaron Smith, François Théberge

Motivated by applications in cybersecurity, such as finding meaningful sequences of malware-related events buried inside large amounts of computer log data, we introduce the "planted path" problem and propose an algorithm to find fuzzy matchings between two trees. This algorithm can be used as a "building block" for more complicated workflows. We demonstrate the usefulness of a few such workflows by mining synthetically generated data as well as real-world ACME cybersecurity datasets.

Incongruity-sensitive access to highly compressed strings

from arXiv: Data Structures and Algorithms

Authors: Ferdinando Cicalese, Zsuzsanna Lipták, Travis Gagie, Gonzalo Navarro, Nicola Prezza, Cristian Urbina

Random access to highly compressed strings -- represented by straight-line programs or Lempel-Ziv parses, for example -- is a well-studied topic. Random access to such strings in strongly sublogarithmic time is impossible in the worst case, but previous authors have shown how to support faster access to specific characters and their neighbourhoods. In this paper we explore whether, since better compression can impede access, we can support faster access to relatively incompressible substrings of highly compressed strings. We first show how, given a run-length compressed straight-line program (RLSLP) of size $g_{rl}$ or a block tree of size $L$, we can build an $O (g_{rl})$-space or an $O (L)$-space data structure, respectively, that supports access to any character in time logarithmic in the length of the longest repeated substring containing that character. That is, the more incongruous a character is with respect to the characters around it in a certain sense, the faster we can support access to it. We then prove a similar but more powerful and sophisticated result for parsings in which phrases' sources do not overlap much larger phrases, with the query time depending also on the number of phrases we must copy from their sources to obtain the queried character.

Simple 2-approximations for bad triangle transversals and some hardness results for related problems

from arXiv: Data Structures and Algorithms

Authors: Florian Adriaens, Nikolaj Tatti

Given a signed graph, the bad triangle transversal (BTT) problem asks to find the smallest number of edges that need to be removed such that the remaining graph does not have a triangle with exactly one negative edge (a bad triangle). We propose novel 2-approximations for this problem, which are much simpler and faster than a folklore adaptation of the 2-approximation by Krivelevich for finding a minimum triangle transversal in unsigned graphs. One of our algorithms also works for weighted BTT and for approximately optimal feasible solutions to the bad triangle cover LP. Using a recent result on approximating the bad triangle cover LP, we obtain a $(2+ε)$ approximation in time almost equal to the time needed to find a maximal set of edge-disjoint bad triangles (which would give a standard 3-approximation). Additionally, several inapproximability results are provided. For complete signed graphs, we show that BTT is NP-hard to approximate within a factor better than $\frac{2137}{2136}$. Our reduction also implies the same hardness result for related problems such as correlation clustering (cluster editing), cluster deletion and the minimum strong triadic closure problem. On complete signed graphs, BTT is closely related to correlation clustering. We show that the correlation clustering optimum is at most $3/2$ times the BTT optimum, by describing a pivot procedure that transforms BTT solutions into clusters. This improves a result by Veldt, which states that their ratio is at most two.
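For concreteness, a bad triangle is one whose three edge signs contain exactly one $-1$. A brute-force enumerator for small complete signed graphs (illustrative only; the `sign` map keyed by node pairs is an assumption of this sketch, not notation from the paper):

```python
from itertools import combinations

def bad_triangles(nodes, sign):
    """Enumerate triangles with exactly one negative edge in a complete
    signed graph. sign maps node pairs (u, v) with u < v to +1 or -1."""
    def s(u, v):
        return sign[(u, v)] if (u, v) in sign else sign[(v, u)]
    bad = []
    for a, b, c in combinations(nodes, 3):
        if [s(a, b), s(b, c), s(a, c)].count(-1) == 1:
            bad.append((a, b, c))
    return bad
```

A maximal set of edge-disjoint triangles from this list yields the standard 3-approximation mentioned in the abstract, since any transversal must delete at least one edge per disjoint bad triangle.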

Improved Sparse Recovery for Approximate Matrix Multiplication

from arXiv: Data Structures and Algorithms

Authors: Yahel Uffenheimer, Omri Weinstein

We present a simple randomized algorithm for approximate matrix multiplication (AMM) whose error scales with the *output* norm $\|AB\|_F$. Given any $n\times n$ matrices $A,B$ and a runtime parameter $r\leq n$, the algorithm produces in $O(n^2(r+\log n))$ time, a matrix $C$ with total squared error $\mathbb{E}[\|C-AB\|_F^2]\le (1-\frac{r}{n})\|AB\|_F^2$, per-entry variance $\|AB\|_F^2/n^2$ and bias $\mathbb{E}[C]=\frac{r}{n}AB$. Alternatively, the algorithm can compute an *unbiased* estimate with expected total squared error $\frac{n}{r}\|{AB}\|_{F}^2$, recovering the state-of-the-art AMM error obtained by Pagh's TensorSketch algorithm (Pagh, 2013). Our algorithm is a log-factor faster. The key insight in the algorithm is a new variation of pseudo-random rotation of the input matrices (a Fast Hadamard Transform with asymmetric diagonal scaling), which redistributes the Frobenius norm of the *output* $AB$ uniformly across its entries.
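For reference, the classical baseline for AMM is inner-index sampling in the style of Drineas and Kannan: draw $r$ inner indices, rescale each rank-one term $A_{:,k} B_{k,:}$ by $n/r$, and sum, which gives an unbiased estimator of $AB$. A pure-Python sketch of that baseline (not the rotation-based algorithm of this paper):

```python
import random

def amm_index_sample(A, B, r, seed=0):
    """Unbiased sampled approximation of the product A @ B: draw r inner
    indices uniformly at random, rescale each rank-one term A[:, k] * B[k, :]
    by n/r, and sum. Matrices are lists of row lists."""
    rng = random.Random(seed)
    n = len(B)                          # inner dimension
    rows, cols = len(A), len(B[0])
    C = [[0.0] * cols for _ in range(rows)]
    scale = n / r
    for _ in range(r):
        k = rng.randrange(n)
        for i in range(rows):
            aik = A[i][k] * scale
            if aik:
                for j in range(cols):
                    C[i][j] += aik * B[k][j]
    return C
```

Uniform sampling has variance depending on the *input* norms; the paper's pseudo-random rotation is what lets the error be controlled by the output norm $\|AB\|_F$ instead.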

QuadRank: Engineering a High Throughput Rank

from arXiv: Data Structures and Algorithms

Authors: R. Groot Koerkamp

Given a text, a query $\mathsf{rank}(q, c)$ counts the number of occurrences of character $c$ among the first $q$ characters of the text. Space-efficient methods to answer these rank queries form an important building block in many succinct data structures. For example, the FM-index is a widely used data structure that uses rank queries to locate all occurrences of a pattern in a text. In bioinformatics applications, the goal is usually to process a given input as fast as possible. Thus, data structures should have high throughput when used with many threads. Contributions. For the binary alphabet, we develop BiRank with 3.28% space overhead. It merges the central ideas of two recent papers: (1) we interleave (inline) offsets in each cache line of the underlying bit vector [Laws et al., 2024], reducing cache-misses, and (2) these offsets are to the middle of each block so that only half of them need popcounting [Gottlieb and Reinert, 2025]. In QuadRank (14.4% space overhead), we extend these techniques to the $σ=4$ (DNA) alphabet. Both data structures require only a single cache miss per query, making them highly suitable for high-throughput and memory-bound settings. To enable efficient batch-processing, we support prefetching the cache lines required to answer upcoming queries. Results. BiRank and QuadRank are around $1.5\times$ and $2\times$ faster than similar-overhead methods that do not use inlining. Prefetching gives an additional $2\times$ speedup, at which point the dual-channel DDR4 RAM bandwidth becomes a hard limit on the total throughput. With prefetching, both methods outperform all other methods apart from SPIDER [Laws et al., 2024] by $2\times$. When using QuadRank with prefetching in a toy count-only FM-index, QuadFm, this results in a smaller size and up to $4\times$ speedup over Genedex, a state-of-the-art batching FM-index implementation.
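The underlying idea of such rank structures can be sketched as a two-level scheme: precompute a cumulative popcount per block of the bit vector, then answer a query with one table lookup plus one in-block popcount. A textbook Python baseline over a binary string (not BiRank's cache-line-inlined layout, which additionally interleaves the offsets with the bits to guarantee a single cache miss):

```python
class BlockRank:
    """Two-level rank over a bit string: cumulative popcount per block,
    plus one in-block popcount per query (textbook baseline)."""
    def __init__(self, bits, block=64):
        self.bits = bits
        self.block = block
        self.cum = [0]                  # cum[i] = number of ones before block i
        for i in range(0, len(bits), block):
            self.cum.append(self.cum[-1] + bits[i:i + block].count("1"))

    def rank1(self, q):
        """Number of '1' characters among the first q positions."""
        blk, off = divmod(q, self.block)
        start = blk * self.block
        return self.cum[blk] + self.bits[start:start + off].count("1")
```

The space overhead is one counter per block; shrinking the block to one cache line and storing the counter inside it is what removes the second memory access in designs like BiRank.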
Minimizing Makespan in Sublinear Time via Weighted Random Sampling

from arXiv: Data Structures and Algorithms

Authors: Bin Fu, Yumei Huo, Hairong Zhao

We consider the classical makespan minimization scheduling problem where $n$ jobs must be scheduled on $m$ identical machines. Using weighted random sampling, we develop two sublinear time approximation schemes: one for the case where $n$ is known and the other for the case where $n$ is unknown. Both algorithms not only give a $(1+3ε)$-approximation to the optimal makespan but also generate a sketch schedule. Our first algorithm, which targets the case where $n$ is known and draws samples in a single round under weighted random sampling, has a running time of $\tilde{O}(\tfrac{m^5}{ε^4} \sqrt{n}+A(\lceil \tfrac{m}{ε} \rceil, ε))$, where $A(\mathcal{N}, α)$ is the time complexity of any $(1+α)$-approximation scheme for the makespan minimization of $\mathcal{N}$ jobs. The second algorithm addresses the case where $n$ is unknown. It uses adaptive weighted random sampling, that is, it draws samples in several rounds, adjusting the number of samples after each round, and runs in sublinear time $\tilde{O}\left( \tfrac{m^5}{ε^4} \sqrt{n} + A(\lceil \tfrac{m}{ε} \rceil, ε )\right)$. We also provide an implementation that generates a weighted random sample using $O(\log n)$ uniform random samples.
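To make "weighted random sampling" concrete: the idea is to draw each job with probability proportional to its processing time, so long jobs (which dominate the makespan) are over-represented in the sample. The sketch below uses the textbook prefix-sum plus binary-search method, which is *not* the paper's construction (the paper builds each weighted sample from $O(\log n)$ uniform random samples); the function name is ours.

```python
# Hedged sketch: sample job indices with probability proportional to
# processing time, via cumulative sums and binary search. Each draw
# costs O(log n) after an O(n) preprocessing pass.
import bisect

def weighted_sample(times, rng, k):
    """Draw k indices, each independently with prob. times[i] / sum(times)."""
    prefix = []
    total = 0.0
    for t in times:
        total += t
        prefix.append(total)
    # rng.random() * total lands in [0, total); bisect_left finds the
    # first prefix entry exceeding it, i.e. the job owning that interval.
    return [bisect.bisect_left(prefix, rng.random() * total) for _ in range(k)]
```

A scheme along these lines then runs an off-the-shelf $(1+α)$-approximation on the (reweighted) sample to produce the sketch schedule.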