Last Update

OPML feed of all feeds.

Subscribe to the Atom feed or the RSS feed to stay up to date.

Thank you to arXiv for use of its open access interoperability.

Note: the date of arXiv entries announced right after publication holidays might incorrectly show up as the date of the publication holiday itself. This is due to our ad hoc method of inferring announcement dates, which are not returned by the arXiv API.

Powered by Pluto.

Source on GitHub.

Maintained by Nima Anari, Arnab Bhattacharyya, Gautam Kamath.

Theory of Computing Report

Friday, April 03

Arbitrary Geometry

from Ben Recht

Adversarial regret as a proof technique in learning, optimization, and games

This is the third live blog of Lecture 7 of my graduate seminar “Feedback, Learning, and Adaptation.” A table of contents is here.

I closed yesterday’s blog with a cliffhanger, promising to give a few examples of where I think adversarial regret is a useful concept. On its own, I’m not sure that it is a useful concept. More on that next week! But today, I’ll show how you can use adversarial regret to bootstrap interesting arguments linking machine learning, game theory, and stochastic optimization.

Once again, I list the rules of the game. We have a two-player game with repeated interactions. In every round t,

  1. Information xt is revealed to both players.

  2. Player One takes action ut.

  3. Player Two takes action dt.

  4. A score rt is assigned based on the triple (xt,ut,dt).

Player One is the “decision maker,” and their action has to be computable from a few lines of code. Their goal is to accumulate as high a score as possible, summed across all rounds. Player Two wants the sum of all of the rt to be as low as possible.

The adversarial regret compares the score of Player One’s strategy to that of a player who sees the entire sequence of disturbances but must play the same action at every time step. It’s a weird setup where we are comparing a player who can change their strategy arbitrarily to an omniscient player forced to play the same move every time. While these two notions don’t seem worth comparing at first blush, there are a few cases in learning theory and game theory where the comparison is mathematically powerful. It turns out that computing these regret bounds is often quite simple, and they follow from elementary derivations. While these bounds themselves might not be useful, they then imply results you actually care about. Let me give my three favorite examples.
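Since the setup is procedural, a toy simulation may help. The following Python sketch is my own illustration, not from the lecture: the two-action set and the score rt = ut·dt are assumptions chosen for concreteness. Player One plays follow-the-leader, and we then compute the adversarial regret against the best fixed action in hindsight.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 1000
actions = np.array([-1.0, 1.0])          # Player One's (assumed) action set

# Player Two's disturbances: an arbitrary +/-1 sequence (here just random)
d = rng.choice([-1.0, 1.0], size=T)

# Score r_t = u_t * d_t: Player One wants the sum high, Player Two wants it low.
hindsight = np.zeros(2)                  # cumulative score of each fixed action
total = 0.0
for t in range(T):
    u = actions[np.argmax(hindsight)]    # follow-the-leader: best action so far
    total += u * d[t]
    hindsight += actions * d[t]

# The comparator: an omniscient player forced to repeat one action every round.
best_fixed = hindsight.max()
regret = best_fixed - total
print(regret)
```

Sublinear growth of `regret` in T is exactly the adversarial regret guarantee being discussed; follow-the-leader is used here only as a simple concrete strategy.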

Online learning and PAC Learning

Online learning is the case argmin readers will have already encountered if they followed my machine learning course blogging. I like to teach online learning because adversarial regret bounds imply the standard model of probabilistic machine learning. Adversarial regret highlights how most of the “generalization bounds” we derive in machine learning are artifacts of geometry rather than mystical manifestations of mechanical epiphany.

In the online learning model, the goal is to predict the disturbance from the information. The actions are predictions. At each round, your score is high if your prediction is correct and low if the prediction is incorrect.

Let’s change the notation to match the standard verbiage of machine learning. The prediction is a function f, and the “disturbance” is a “label,” denoted y. Instead of a high score, we want low loss. In the online learning setting, you get to change your prediction function at every time step and compare your losses to a single model that tries to fit the labels after you see them. In equations, this is

$\sum_{t=1}^{T} \mathrm{loss}(f_t(x_t), y_t) - \min_{f} \sum_{t=1}^{T} \mathrm{loss}(f(x_t), y_t)$

You can now do math and show that this expression is bounded by a sublinear function of the number of rounds. This post works out the details. Now, the resulting deterministic bound is not necessarily interesting in and of itself, but the magic happens when you declare the xs and ys to be generated by a stochastic process. If, for example, you assume the information-label pairs are sampled independently and identically distributed from some data-generating process, then after making a few assumptions about convexity, the regret bound becomes a generalization bound:

$\mathbb{E}[\mathrm{loss}(F_T(x), y)] - \min_{f} \mathbb{E}[\mathrm{loss}(f(x), y)] \le \frac{\mathrm{Regret}_T}{T}$

Here, $F_T$ denotes the random function that your model returns after seeing T examples. This bound compares the predictive accuracy of your algorithm on a new sample to that of the best function computable given the data-generating distribution. If you have sublinear regret, then this quantity tends to zero as T goes to infinity. This is called a generalization bound, or, if you use probability instead of expected values, a PAC Learning bound.

The technique of deriving a deterministic regret bound and transforming it into a probabilistic generalization bound by taking expected values is called “online-to-batch conversion.” It is one of the favorite tricks of learning theorists.
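As a minimal sketch of online-to-batch conversion (my own toy instance: the squared loss, the synthetic linear data-generating process, and the step-size schedule 1/(t+3) are all assumptions), run an online gradient method on i.i.d. samples and output the averaged iterate as the batch predictor:

```python
import numpy as np

rng = np.random.default_rng(0)
T, dim = 2000, 3
w_star = np.array([1.0, -2.0, 0.5])        # hypothetical data-generating weights

w = np.zeros(dim)
w_sum = np.zeros(dim)
for t in range(1, T + 1):
    x = rng.normal(size=dim)               # i.i.d. information
    y = x @ w_star + 0.1 * rng.normal()    # noisy label
    grad = (w @ x - y) * x                 # gradient of the squared loss at w
    w -= grad / (t + 3.0)                  # decaying online gradient step
    w_sum += w

w_bar = w_sum / T                          # online-to-batch: average the iterates
print(np.linalg.norm(w_bar - w_star))
```

The averaging step is the "batch" half of the trick: by convexity, Jensen's inequality transfers the online regret bound to the expected risk of `w_bar`.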

Stochastic Optimization

Similar techniques can be applied more generally to stochastic optimization. A clever analysis of the stochastic gradient method takes a similar approach: you can prove that gradient descent has low regret even if Player Two is handing you a different convex function at every time step. If you take expectations of the resulting regret bound and apply Jensen’s inequality, you derive a bound on the sample average approximation method for stochastic programming. Though substantially more general, the proof is almost identical to the one in online learning.
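Here is a small sketch of the first half of that claim (my own; the per-round losses f_t(w) = (w − d_t)² and the step-size schedule are assumed for illustration): Player Two hands gradient descent a different convex function at every step, and the regret against the best fixed point in hindsight still grows sublinearly.

```python
import numpy as np

rng = np.random.default_rng(2)
T = 5000
# Player Two's choices: each round the convex loss is f_t(w) = (w - d_t)^2
d = rng.choice([0.0, 1.0], size=T)

w, losses = 0.0, 0.0
for t in range(1, T + 1):
    losses += (w - d[t - 1]) ** 2        # pay the loss at the current iterate
    grad = 2.0 * (w - d[t - 1])
    w -= grad / (2.0 * np.sqrt(t))       # online gradient step, eta_t = 1/(2*sqrt(t))

# Best fixed comparator in hindsight: the minimizer is the mean of the d_t
w_star = d.mean()
best = ((w_star - d) ** 2).sum()
regret = losses - best
print(regret / T)                        # average regret; tends to 0 as T grows
```

Taking expectations of this regret bound (with Jensen's inequality) is what yields the sample average approximation guarantee mentioned above.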

Repeated Games

Still closely related to but slightly more challenging than online convex optimization are repeated zero-sum games. In this setup, each round of the game is itself a zero-sum game. The players battle each other for multiple rounds, and Player One’s goal is to refine their strategy so they eventually achieve an infinite ELO score. Here, a classic result proves that when both players use algorithms with low adversarial regret, they converge to a Nash equilibrium. You assume that both Player One and Player Two are using algorithms that yield low regret against an arbitrary adversary. The baseline is a player forced to use the same strategy every round. If Player One and Two’s strategic improvements have sublinear regret, their strategies eventually converge to an equilibrium. This result is the backbone of modern poker bots, which use algorithms like counterfactual regret minimization. Whether or not you think solving poker is a major contribution to humanity and human knowledge is up to you.
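A toy version of this classic result (my own sketch; matching pennies, the starting strategies, and the step size are assumptions): both players run multiplicative weights in self-play, and the time-averaged strategies have low exploitability, i.e., they approximate the Nash equilibrium of the zero-sum game.

```python
import numpy as np

# Matching pennies: zero-sum payoff matrix for Player One (the row player).
A = np.array([[1.0, -1.0],
              [-1.0, 1.0]])
T, eta = 20000, 0.05

p = np.array([0.9, 0.1])                 # deliberately start away from equilibrium
q = np.array([0.3, 0.7])
avg_p = np.zeros(2)
avg_q = np.zeros(2)
for _ in range(T):
    avg_p += p
    avg_q += q
    # Multiplicative weights: reweight each pure action by its exponentiated payoff.
    new_p = p * np.exp(eta * (A @ q))
    new_q = q * np.exp(-eta * (A.T @ p))
    p, q = new_p / new_p.sum(), new_q / new_q.sum()
avg_p /= T
avg_q /= T

# Exploitability of the averaged strategies; near zero means near-Nash.
exploitability = (A @ avg_q).max() - (avg_p @ A).min()
print(avg_p, avg_q, exploitability)
```

The exploitability is bounded by the sum of the two players' average regrets, which is exactly how low adversarial regret for both players forces convergence of the averaged strategies to equilibrium.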

By Ben Recht

Estonian-Latvian Computer Science Theory Days 2026

from Luca Aceto

The Estonian-Latvian Computer Science Theory Days 2026 will take place at the University of Tartu, Tartu, Estonia, on 24-26 April 2026. (Hat tip: my colleague Tarmo Uustalu at the Department of Computer Science at Reykjavik University.) Quoting from the event's website,


"The main goal of the Theory Days is to let the theoretical computer scientists of our two countries to get acquainted with the work of each other. However, people from other countries are welcome to participate as well. The main audience is intended to be the graduate students in the roles of both listeners and presenters."

The first joint Estonian-Latvian Theory Days were held in 2010 and this year's edition of the event will feature an invited talk by Robert Tarjan. Last year, Robert Tarjan gave a public lecture at the University of Tartu entitled “My Life with Data Structures”. See www.youtube.com/watch?v=pFHIueXFHWg for a recording of that talk. 

Note that the deadlines for registration and for proposing a talk are today.

Kudos to our colleagues in Estonia and Latvia for organising the theory days!

By Luca Aceto

Polymath Plus AI

from Gil Kalai

This post was written together with Nissan Hajaj and Ido Kaminer. 

AI and Math has become a hot topic (and a source of some worries) among and beyond the mathematical community.

Nissan Hajaj from Google Research proposed to run an “AI polymath project,” based on the concept of a polymath project but with the participation of AI agents. Together with Ido Kaminer, we decided to run a pilot experiment with a couple of projects.

Nissan’s vision for the AI-polymath project

By empowering collaborative research initiatives with AI tools, we can create a powerful platform that applies the scientific potential of modern AI to the most advanced frontiers of active research. Projects like Polymath demonstrate how the mathematical community has already embraced collective efforts; such models can now be enhanced by the evolving capabilities of AI to accelerate research progress while offering developers a space to refine these tools through real-world academic interaction.

We envision an open, hybrid research ecosystem where human experts and AI agents collaborate seamlessly, utilizing shared tools and resources as needed. In this environment, both human participants and AI entities can engage in ongoing dialogues, offering insights, solutions, and critiques. Our proposal focuses on establishing a platform designed to initiate, oversee, and complement these specialized mathematical research communities, featuring:

  1. Dedicated AI utilities for community administration, including workflows, literature analysis, documentation, review, verification, and planning.

  2. Frameworks designed for registering, deploying, and coordinating AI tools made available to the community.

  3. A transparent interface that invites contributions from both human researchers and automated agents.

Our objective is to select a diverse array of mathematical problems spanning various subfields, each presenting unique hurdles for both human intellect and AI-driven methodologies.

What we plan to do

We would like to have a preliminary (and rather partial) implementation of Nissan’s idea: to start with a description of a project (polymath X) and to proceed with AI contributions in the comment section.

The (default) prompt:

Polymath X with the participation of AI agents is dealing with the following mathematical problem:
—Short Formulation—
An introduction to the project and the discussion so far can be found in this LINK.
Please make a comment, preferably limited to 3-4 paragraphs. (If you have more to say, contribute several comments.)
Your comment may include (but is not limited to) one or more of the following:
  1. A new idea for the project,
  2. Some further thoughts on current ideas,
  3. A comment on some earlier comments,
  4. Proofs, heuristic arguments, examples, counter-examples, and conjectures,
  5. Computer programs and computer experimentation.

The comment section of the present post can serve for “meta discussion” for this project. 

In addition to a general discussion of the idea, we can think about specific projects to run.

Potential stages and specific tools

Nissan sees several possible stages for the project. For example:

  1. Manually setting up the Polymath site/post and inviting participants.
    Every AI participant will need to find a way to interact with the site (through comments).
  2. Manually setting up the Polymath site/post and inviting participants—both humans (through comments) and agentic participants (through a published API).  
  3. A semi-automatically managed site, where content generation, reviews, and moderation can be facilitated by AI.
  4. The same as (3), but with additional special-purpose AI tools and resources dedicated to the effort.

In the long term, Nissan envisions such an effort evolving into a collection of encyclopedic entries (similar to Wikipedia) that are maintained and advanced by the hybrid community, with dedicated computational resources to explore solutions autonomously.

(Our blog efforts will be limited to items 1 and 2. API stands for Application Programming Interface.)

A few words about Ido

In addition to his research in experimental physics, Ido Kaminer and his collaborators developed the Ramanujan Machine in 2021, which is “a novel way to do mathematics by harnessing your computer power to make new discoveries. The Ramanujan Machine already discovered dozens of new conjectures.” We mentioned the Ramanujan Machine in this 2021 post. His group expanded this idea to build a library of connections among mathematical constants (ICLR 2025) and unify their formulas (NeurIPS 2025). More recently, the research of his group has also expanded to computations in physics. See Ido’s videotaped lecture From π to QFT: Symbolic Discovery at Scale.

Other Polymath news

On some other polymath news, Tim Gowers recently proposed a polymath project about the word problem in the Artin-Tits group.

AI+Math

On the topic of AI+Math (and math of CS), let me mention that I also had very pleasant and thought-provoking discussions with Rafi Ostrovsky and Yuval Rabani.

I also had some illuminating correspondence with Dr. Z. (Separate from his pioneering and provocative role in Math+AI, see Doron’s surprise proposal from yesterday.)

By Gil Kalai

King Chasing Problem in Chinese Chess is NP-hard

from arXiv: Computational Complexity

Authors: Chao Li, Zhujun Zhang, Chao Yang

We prove that the king chasing problem in Chinese Chess is NP-hard when generalized to $n\times n$ boards. `King chasing' is a frequently-used strategy in Chinese Chess, in which a player continuously checks the opponent in every move until finally checkmating the opponent's king. The problem is to determine which player has a winning strategy in generalized Chinese Chess, under the constraints of king chasing. Obviously, it is a sub-problem of the generalized Chinese Chess problem. We prove that the king chasing problem in Chinese Chess is NP-hard by a reduction from the classic NP-complete problem 3-SAT.

DQC1-completeness of normalized trace estimation for functions of log-local Hamiltonians

from arXiv: Computational Complexity

Authors: Zhengfeng Ji, Tongyang Li, Changpeng Shao, Xinzhao Wang, Yuxin Zhang

We study the computational complexity of estimating the normalized trace $2^{-n}Tr[f(A)]$ for a log-local Hamiltonian $A$ acting on $n$ qubits. This problem arises naturally in the DQC1 model, yet its complexity is only understood for a limited class of functions $f(x)$. We show that if $f(x)$ is a continuous function with approximate degree $Ω({\rm poly}(n))$, then estimating $2^{-n}Tr[f(A)]$ up to constant additive error is DQC1-complete, under a technical condition on the polynomial approximation error of $f(x)$. This condition holds for a broad class of functions, including exponentials, trigonometric functions, logarithms, and inverse-type functions. We further prove that when $A$ is sparse, the classical query complexity of this problem is exponential in the approximate degree, assuming a conjectured lower bound for a trace variant of the $k$-Forrelation problem in the DQC1 query model. Together, these results identify the approximate degree as the key parameter governing the complexity of normalized trace estimation: it characterizes both the quantum complexity (via efficient DQC1 algorithms) and, conditionally, the classical hardness, yielding an exponential quantum-classical separation. Our proof develops a unified framework that cleanly combines circuit-to-Hamiltonian constructions, periodic Jacobi operators, and tools from polynomial approximation theory, including the Chebyshev equioscillation theorem.

Deterministic Hardness of Approximation For SVP in all Finite $\ell_p$ Norms

from arXiv: Computational Complexity

Authors: Isaac M Hair, Amit Sahai

We show that, assuming NP $\not\subseteq$ $\cap_{δ> 0}$DTIME$\left(\exp{n^δ}\right)$, the shortest vector problem for lattices of rank $n$ in any finite $\ell_p$ norm is hard to approximate within a factor of $2^{(\log n)^{1 - o(1)}}$, via a deterministic reduction. Previously, for the Euclidean case $p=2$, even hardness of the exact shortest vector problem was not known under a deterministic reduction.

Quantum polymorphism characterisation of commutativity gadgets in all quantum models

from arXiv: Computational Complexity

Authors: Eric Culf, Josse van Dobben de Bruyn, Peter Zeman

Commutativity gadgets provide a technique for lifting classical reductions between constraint satisfaction problems to quantum-sound reductions between the corresponding nonlocal games. We develop a general framework for commutativity gadgets in the setting of quantum homomorphisms between finite relational structures. Building on the notion of quantum homomorphism spaces, we introduce a uniform notion of commutativity gadget capturing the finite-dimensional quantum, quantum approximate, and commuting-operator models. In the robust setting, we use the weighted-algebra formalism for approximate quantum homomorphisms to capture corresponding notions of robust commutativity gadgets. Our main results characterize both non-robust and robust commutativity gadgets purely in terms of quantum polymorphism spaces: in any model, existence of a commutativity gadget is equivalent to the collapse of the corresponding quantum polymorphisms to classical ones at arity $|A|^2$, and robust gadgets are characterized by stable commutativity of the appropriate weighted polymorphism algebra. We use this characterisation to show relations between the classes of commutativity gadget, notably that existence of a robust commutativity gadget is equivalent to the existence of a corresponding non-robust one. Finally, we prove that quantum polymorphisms of complete graphs $K_n$ have a very special structure, wherein the noncommutative behaviour only comes from the quantum permutation group $S_n^+$. Combining this with techniques from combinatorial group theory, we construct separations between commutativity-gadget classes: we exhibit a relational structure admitting a finite-dimensional commutativity gadget but no quantum approximate gadget, and, conditional on the existence of a non-hyperlinear group, a structure admitting a quantum approximate commutativity gadget but no commuting-operator gadget.

Near-Optimal Space Lower Bounds for Streaming CSPs

from arXiv: Computational Complexity

Authors: Yumou Fei, Dor Minzer, Shuo Wang

In a streaming constraint satisfaction problem (streaming CSP), a $p$-pass algorithm receives the constraints of an instance sequentially, making $p$ passes over the input in a fixed order, with the goal of approximating the maximum fraction of satisfiable constraints. We show near optimal space lower bounds for streaming CSPs, improving upon prior works. (1) Fei, Minzer and Wang (\textit{STOC 2026}) showed that for any CSP, the basic linear program defines a threshold $α_{\mathrm{LP}}\in [0,1]$ such that, for any $\varepsilon > 0$, an $(α_{\mathrm{LP}} - \varepsilon)$-approximation can be achieved using constant passes and polylogarithmic space, whereas achieving $(α_{\mathrm{LP}} + \varepsilon)$-approximation requires $Ω(n^{1/3}/p)$ space. We improve this lower bound to $Ω(\sqrt{n}/p)$, which is nearly tight for a gap version of the problem. (2) For $p=o(\log n)$, we further strengthen the lower bound to $Ω(n\cdot2^{-O_{\varepsilon}(p)})$. Combined with existing algorithmic results, this shows that $α_{\mathrm{LP}}$ is not only the limit of multi-pass polylogarithmic-space algorithms, but also the limit of single-pass sublinear-space algorithms on bounded-degree instances. (3) For certain CSPs, we show that there exists $α< 1$ such that achieving an $α$-approximation requires $Ω(n/p)$ space. Our proofs are Fourier analytic, building on the techniques of Fei, Minzer and Wang (\textit{STOC 2026}) and the Fourier-$\ell_1$-based lower bound method of Kapralov and Krachun (\textit{STOC 2019}).

The edge of the asymptotic spectrum of tensors

from arXiv: Computational Complexity

Authors: Josh Alman, Baitian Li, Kevin Pratt

Strassen founded the theory of the asymptotic spectrum of tensors to study the complexity of matrix multiplication. A central challenge in this theory is to explicitly construct new spectral points. In Crelle 1991, Strassen proposed the upper support functionals $ζ^θ$ as candidate spectral points, where $θ$ ranges over a triangle $Θ$. Recent progress, involving tools and ideas from quantum information theory (Christandl-Vrana-Zuiddam, STOC 2018, JAMS 2021) and convex optimization (Hirai, 2025), culminated in the proof that the upper support functionals are indeed spectral points over the complex numbers (Sakabe-Doğan-Walter, 2026). In this paper, we give an even clearer picture of the situation for support functionals when $θ$ lies along the edges of the triangle. We show that not only are these functionals spectral points, but that they are uniquely determined as spectral points by their behavior on matrix multiplication tensors. As our methods are algebraic, as a corollary this establishes for the first time the existence of nontrivial spectral points over arbitrary fields. As part of our argument, we show a close connection between the edge support functionals and Harder-Narasimhan filtrations from quiver representation theory. We thus show, using recent work in algorithmic invariant theory, that these support functionals can be computed in deterministic polynomial time. Other ingredients of our proof include a new criterion for abstractly characterizing asymptotic tensor ranks by spectral points, and a characterization of the edge support functionals in terms of matrix multiplication capacity. As another application of these tools, we prove the existence of spectral points for higher-mode tensors beyond those currently known.

Topology-First B-Rep Meshing

from arXiv: Computational Geometry

Authors: YunFan Zhou, Daniel Zint, Nafiseh Izadyar, Michael Tao, Daniele Panozzo, Teseo Schneider

Parametric boundary representation models (B-Reps) are the de facto standard in CAD, graphics, and robotics, yet converting them into valid meshes remains fragile. The difficulty originates from the unavoidable approximation of high-order surface and curve intersections to low-order primitives: the resulting geometric realization often fails to respect the exact topology encoded in the B-Rep, producing meshes with incorrect or missing adjacencies. Existing meshing pipelines address these inconsistencies through heuristic feature-merging and repair strategies that offer no topological guarantees and frequently fail on complex models. We propose a fundamentally different approach: the B-Rep topology is treated as an invariant of the meshing process. Our algorithm enforces the exact B-Rep topology while allowing a single user-defined tolerance to control the deviation of the mesh from the underlying parametric surfaces. Consequently, for any admissible tolerance, the output mesh is topologically correct; only its geometric fidelity degrades as the tolerance increases. This decoupling eliminates the need for post-hoc repairs and yields robust meshes even when the underlying geometry is inconsistent or highly approximated. We evaluate our method on thousands of real-world CAD models from the ABC and Fusion 360 repositories, including instances that fail with standard meshing tools. The results demonstrate that topological guarantees at the algorithmic level enable reliable mesh generation suitable for downstream applications.

SHARC: Reference point driven Spherical Harmonic Representation for Complex Shapes

from arXiv: Computational Geometry

Authors: Panagiotis Sapoutzoglou, George Terzakis, Maria Pateraki

We propose SHARC, a novel framework that synthesizes arbitrary, genus-agnostic shapes by means of a collection of Spherical Harmonic (SH) representations of distance fields. These distance fields are anchored at optimally placed reference points in the interior volume of the surface in a way that maximizes learning of the finer details of the surface. To achieve this, we employ a cost function that jointly maximizes sparsity and centrality in terms of positioning, as well as visibility of the surface from their location. For each selected reference point, we sample the visible distance field to the surface geometry via ray-casting and compute the SH coefficients using the Fast Spherical Harmonic Transform (FSHT). To enhance geometric fidelity, we apply a configurable low-pass filter to the coefficients and refine the output using a local consistency constraint based on proximity. Evaluation of SHARC against state-of-the-art methods demonstrates that the proposed method outperforms existing approaches in both reconstruction accuracy and time efficiency without sacrificing model parsimony. The source code is available at github.com/POSE-Lab/SHARC.

The Computational Complexity of Avoiding Strict Saddle Points in Constrained Optimization

from arXiv: Data Structures and Algorithms

Authors: Andreas Kontogiannis, Ioannis Panageas, Vasilis Pollatos

While first-order stationary points (FOSPs) are the traditional targets of non-convex optimization, they often correspond to undesirable strict saddle points. To circumvent this, attention has shifted towards second-order stationary points (SOSPs). In unconstrained settings, finding approximate SOSPs is PLS-complete (Kontogiannis et al.), matching the complexity of finding unconstrained FOSPs (Hollender and Zampetakis). However, the complexity of finding SOSPs in constrained settings remained notoriously unclear and was highlighted as an important open question by both aforementioned works. Under one strict definition, even verifying whether a point is an approximate SOSP is NP-hard (Murty and Kabadi). Under another widely adopted, relaxed definition where non-negative curvature is required only along the null space of the active constraints, the problem lies in TFNP, and algorithms with O(poly(1/epsilon)) running times have been proposed (Lu et al.). In this work, we settle the complexity of constrained SOSP by proving that computing an epsilon-approximate SOSP under the tractable definition is PLS-complete. We demonstrate that our result holds even in the 2D unit square [0,1]^2, and remarkably, even when stationary points are isolated at a distance of Omega(1) from the domain's boundary. Our result establishes a fundamental barrier: unless PLS is a subset of PPAD (implying PLS = CLS), no deterministic, iterative algorithm with an efficient, continuous update rule can exist for finding approximate SOSPs. This contrasts with the constrained first-order counterpart, for which Fearnley et al. showed that finding an approximate KKT point is CLS-complete. Finally, our result yields the first problem defined in a compact domain to be shown PLS-complete beyond the canonical Real-LocalOpt (Daskalakis and Papadimitriou).

Bipartite Exact Matching in P

from arXiv: Data Structures and Algorithms

Authors: Yuefeng Du

The Exact Matching problem asks whether a bipartite graph with edges colored red and blue admits a perfect matching with exactly t red edges. Introduced by Papadimitriou and Yannakakis in 1982, the problem has resisted deterministic polynomial-time algorithms for over four decades, despite admitting a randomized solution via the Schwartz-Zippel lemma since 1987. We prove the Affine-Slice Nonvanishing Conjecture (ASNC) for all bipartite braces and give a deterministic O(n^6) algorithm for Exact Matching on all bipartite graphs. The algorithm follows via the tight-cut decomposition, which reduces the decision problem to brace blocks. The proof proceeds by structural induction on McCuaig's brace decomposition. We establish the McCuaig exceptional families, the replacement determinant algebra, and the narrow-extension cases (KA, J3 to D1). For the superfluous-edge step, we introduce two closure tools: a matching-induced Two-extra Hall theorem that resolves the rank-(m-2) branch via projective-collapse contradiction, and a distinguished-state q-circuit lemma that eliminates the rank-(m-1) branch entirely by showing that any minimal dependent set containing the superfluous state forces rank m-2. The entire proof has been formally verified in the Lean 4 proof assistant.
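For context, the randomized algorithm the abstract contrasts against is short enough to sketch. The Python toy below (our naming, not the paper's deterministic method) builds the bipartite edge matrix with red edges weighted by a formal variable y, evaluates its determinant at several points over a prime field, and interpolates the coefficient of y^t, as in the classical Schwartz-Zippel approach: a "True" answer is always correct, while "False" fails only with small probability.

```python
import random

P = (1 << 31) - 1  # prime modulus for the random evaluation

def det_mod(a, p=P):
    """Determinant over F_p by Gaussian elimination."""
    a = [row[:] for row in a]
    n, det = len(a), 1
    for c in range(n):
        piv = next((r for r in range(c, n) if a[r][c] % p), None)
        if piv is None:
            return 0
        if piv != c:
            a[c], a[piv] = a[piv], a[c]
            det = -det
        det = det * a[c][c] % p
        inv = pow(a[c][c], p - 2, p)
        for r in range(c + 1, n):
            f = a[r][c] * inv % p
            for k in range(c, n):
                a[r][k] = (a[r][k] - f * a[c][k]) % p
    return det % p

def poly_coeffs(ys, vals, p=P):
    """Interpolate: solve the Vandermonde system sum_k c[k]*y^k = val (mod p)."""
    m = len(ys)
    A = [[pow(y, k, p) for k in range(m)] + [v] for y, v in zip(ys, vals)]
    for c in range(m):
        piv = next(r for r in range(c, m) if A[r][c] % p)
        A[c], A[piv] = A[piv], A[c]
        inv = pow(A[c][c], p - 2, p)
        A[c] = [x * inv % p for x in A[c]]
        for r in range(m):
            if r != c and A[r][c]:
                f = A[r][c]
                A[r] = [(x - f * z) % p for x, z in zip(A[r], A[c])]
    return [row[m] for row in A]

def has_exact_matching(n, blue, red, t, trials=3):
    """True iff (whp) some perfect matching of the bipartite graph on
    [n] x [n] uses exactly t red edges. 'True' answers are always correct;
    'False' answers fail only with small probability (Schwartz-Zippel)."""
    for _ in range(trials):
        x = {e: random.randrange(1, P) for e in blue + red}
        ys = list(range(1, n + 2))
        vals = []
        for y in ys:
            a = [[0] * n for _ in range(n)]
            for (i, j) in blue:
                a[i][j] = x[(i, j)]
            for (i, j) in red:
                a[i][j] = x[(i, j)] * y % P
            vals.append(det_mod(a))  # det is a degree-<=n polynomial in y
        if poly_coeffs(ys, vals)[t]:
            return True
    return False
```

The point of the paper above is precisely that this coin-flipping can, after four decades, be removed.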

Sublinear-query relative-error testing of halfspaces

from arXiv: Data Structures and Algorithms

Authors: Xi Chen, Anindya De, Yizhi Huang, Shivam Nadimpalli, Rocco A. Servedio, Tianqi Yang

The relative-error property testing model was introduced in [CDHLNSY24] to facilitate the study of property testing for "sparse" Boolean-valued functions, i.e. ones for which only a small fraction of all input assignments satisfy the function. In this framework, the distance from the unknown target function $f$ that is being tested to a function $g$ is defined as $\mathrm{Vol}(f \mathop{\triangle} g)/\mathrm{Vol}(f)$, where the numerator is the fraction of inputs on which $f$ and $g$ disagree and the denominator is the fraction of inputs that satisfy $f$. Recent work [CDHNSY26] has shown that over the Boolean domain $\{0,1\}^n$, any relative-error testing algorithm for the fundamental class of halfspaces (i.e. linear threshold functions) must make $Ω(\log n)$ oracle calls. In this paper we complement the [CDHNSY26] lower bound by showing that halfspaces can be relative-error tested over $\mathbb{R}^n$ under the standard $N(0,I_n)$ Gaussian distribution using a sublinear number of oracle calls -- in particular, substantially fewer than would be required for learning. Our results use a wide range of tools including Hermite analysis, Gaussian isoperimetric inequalities, and geometric results on noise sensitivity and surface area.
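The relative-error distance in this framework is easy to estimate empirically, even though designing a sublinear tester is not. A minimal Monte Carlo sketch (illustrative only; this is the distance being tested, not the paper's algorithm, and the two halfspaces are our toy example):

```python
import random

def rel_error_distance(f, g, n, samples=200_000, seed=0):
    """Monte Carlo estimate of Vol(f triangle g) / Vol(f) under N(0, I_n):
    the fraction of Gaussian points where f and g disagree, divided by the
    fraction of points satisfying f."""
    rng = random.Random(seed)
    vol_f = vol_sym = 0
    for _ in range(samples):
        x = [rng.gauss(0.0, 1.0) for _ in range(n)]
        fx = f(x)
        vol_f += fx
        vol_sym += fx != g(x)
    return vol_sym / vol_f

# two toy halfspaces over R^2 whose boundaries differ by a small angle
f = lambda x: x[0] >= 0
g = lambda x: x[0] + 0.1 * x[1] >= 0
```

For these two halfspaces the true value is 2·arctan(0.1)/π ≈ 0.063: the disagreement wedge has Gaussian mass arctan(0.1)/π, and Vol(f) = 1/2.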

Subquadratic Counting via Perfect Marginal Sampling

from arXiv: Data Structures and Algorithms

Authors: Xiaoyu Chen, Zongchen Chen, Kuikui Liu, Xinyuan Zhang

We study the computational complexity of approximately computing the partition function of a spin system. Techniques based on standard counting-to-sampling reductions yield $\tilde{O}(n^2)$-time algorithms, where $n$ is the size of the input graph. We present new counting algorithms that break the quadratic-time barrier in a wide range of settings. For example, for the hardcore model of $λ$-weighted independent sets in graphs of maximum degree $Δ$, we obtain a $\tilde{O}(n^{2-δ})$-time approximate counting algorithm, for some constant $δ> 0$, when the fugacity $λ< \frac{1}{Δ-1}$, improving over the previous regime of $λ= o(Δ^{-3/2})$ by Anand, Feng, Freifeld, Guo, and Wang (2025). Our results apply broadly to many other spin systems, such as the Ising model, hypergraph independent sets, and vertex colorings. Interestingly, our work reveals a deep connection between $\textit{subquadratic}$ counting and $\textit{perfect}$ marginal sampling. For two-spin systems such as the hardcore and Ising models, we show that the existence of perfect marginal samplers directly yields subquadratic counting algorithms in a $\textit{black-box}$ fashion. For general spin systems, we show that almost all existing perfect marginal samplers can be adapted to produce a sufficiently low-variance marginal estimator in sublinear time, leading to subquadratic counting algorithms.
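The "standard counting-to-sampling reduction" the abstract starts from is worth making concrete. In the sketch below (our toy code), the partition function of the hardcore model is recovered as a telescoping product of inverse marginals, peeling off one vertex at a time; here each marginal is computed exactly by brute force, whereas a real reduction estimates each one with a sampler, which is where the quadratic cost comes from.

```python
from itertools import combinations

def Z(verts, edges, lam):
    """Brute-force hardcore partition function: sum of lam^|I| over
    independent sets I of the induced subgraph on `verts`."""
    vs = list(verts)
    total = 0.0
    for r in range(len(vs) + 1):
        for S in combinations(vs, r):
            Sset = set(S)
            if all(not (u in Sset and w in Sset) for u, w in edges):
                total += lam ** r
    return total

def Z_via_marginals(n, edges, lam):
    """Counting-to-sampling reduction: peel off one vertex at a time and
    multiply by 1 / Pr[v unoccupied] in the current subgraph G_i.
    Here each marginal is exact; a sampler would estimate it instead."""
    verts = list(range(n))
    est = 1.0
    for i, v in enumerate(verts):
        rem = verts[i:]  # current subgraph G_i, with v = rem[0]
        E = [e for e in edges if e[0] in rem and e[1] in rem]
        p_unocc = Z(rem[1:], E, lam) / Z(rem, E, lam)
        est /= p_unocc
    return est
```

On the 4-cycle with fugacity 1, both routes give Z = 7 (the empty set, four singletons, and two diagonal pairs).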

Probabilistic AVL Trees (p-AVL): Relaxing Deterministic Balancing

from arXiv: Data Structures and Algorithms

Authors: Hayagriv Desikan

This paper studies the empirical behaviour of the p-AVL tree, a probabilistic variant of the AVL tree in which each imbalance is repaired with probability $p$. This gives an exact continuous interpolation from $p = 0$, which recovers the BST endpoint, to $p = 1$, which recovers the standard AVL tree. Across random-order insertion experiments, we track rotations per node, total imbalance events, average depth, average height, and a global imbalance statistic $σ$. The main empirical result is that even a small nonzero $p$ already causes a strong structural change. The goal here is empirical rather than fully theoretical: to document the behaviour of the p-AVL family clearly and identify the main patterns.
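The p-AVL rule itself fits in a few lines. A sketch (our code, following the abstract's description: each imbalance detected on the way back up from an insertion is repaired with probability p):

```python
import random

class Node:
    __slots__ = ("key", "left", "right", "h")
    def __init__(self, key):
        self.key, self.left, self.right, self.h = key, None, None, 1

def height(n):
    return n.h if n else 0

def update(n):
    n.h = 1 + max(height(n.left), height(n.right))

def rot_right(y):
    x = y.left
    y.left, x.right = x.right, y
    update(y); update(x)
    return x

def rot_left(x):
    y = x.right
    x.right, y.left = y.left, x
    update(x); update(y)
    return y

def insert(node, key, p, rng):
    """BST insert; each imbalance met on the way back up is repaired
    with probability p (p=1: standard AVL, p=0: plain BST)."""
    if node is None:
        return Node(key)
    if key < node.key:
        node.left = insert(node.left, key, p, rng)
    else:
        node.right = insert(node.right, key, p, rng)
    update(node)
    b = height(node.left) - height(node.right)
    if abs(b) > 1 and rng.random() < p:
        if b > 1:
            if height(node.left.left) < height(node.left.right):
                node.left = rot_left(node.left)     # left-right case
            node = rot_right(node)
        else:
            if height(node.right.right) < height(node.right.left):
                node.right = rot_right(node.right)  # right-left case
            node = rot_left(node)
    return node

def build(keys, p, seed=0):
    rng, root = random.Random(seed), None
    for k in keys:
        root = insert(root, k, p, rng)
    return root
```

The two endpoints are easy to sanity-check on a sorted insertion of 127 keys: p=1 yields the perfectly balanced AVL tree of height 7, while p=0 degenerates into a chain of height 127.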

BBC: Improving Large-k Approximate Nearest Neighbor Search with a Bucket-based Result Collector

from arXiv: Data Structures and Algorithms

Authors: Ziqi Yin, Gao Cong, Kai Zeng, Jinwei Zhu, Bin Cui

Although Approximate Nearest Neighbor (ANN) search has been extensively studied, large-k ANN queries that aim to retrieve a large number of nearest neighbors remain underexplored, despite their numerous real-world applications. Existing ANN methods face significant performance degradation for such queries. In this work, we first investigate the reasons for the performance degradation of quantization-based ANN indexes: (1) the inefficiency of existing top-k collectors, which incurs significant overhead in candidate maintenance, and (2) the reduced pruning effectiveness of quantization methods, which leads to a costly re-ranking process. To address this, we propose a novel bucket-based result collector (BBC) to enhance the efficiency of existing quantization-based ANN indexes for large-k ANN queries. BBC introduces two key components: (1) a bucket-based result buffer that organizes candidates into buckets by their distances to the query. This design reduces ranking costs and improves cache efficiency, enabling high performance maintenance of a candidate superset and a lightweight final selection of top-k results. (2) two re-ranking algorithms tailored for different types of quantization methods, which accelerate their re-ranking process by reducing either the number of candidate objects to be re-ranked or cache misses. Extensive experiments on real-world datasets demonstrate that BBC accelerates existing quantization-based ANN methods by up to 3.8x at recall@k = 0.95 for large-k ANN queries.
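The core idea of a bucket-based collector can be sketched independently of the ANN index. A toy Python version (our simplification of the design described above, not the authors' implementation): candidates are binned by distance, whole buckets beyond the current top-k superset are dropped without any per-item ranking, and one lightweight sort at the end produces the final top-k.

```python
def bucket_topk(candidates, k, n_buckets=64, d_max=1.0):
    """Collect the k smallest-distance candidates with buckets instead of
    a heap. candidates: iterable of (distance, id), distances in [0, d_max)."""
    buckets = [[] for _ in range(n_buckets)]
    width = d_max / n_buckets
    cutoff = n_buckets - 1   # highest bucket still in the top-k superset
    count = 0                # candidates currently kept in buckets[0..cutoff]
    for d, i in candidates:
        b = min(int(d / width), n_buckets - 1)
        if b > cutoff:
            continue         # provably worse than >= k kept candidates
        buckets[b].append((d, i))
        count += 1
        # drop the top bucket while the rest still holds at least k items
        while count - len(buckets[cutoff]) >= k:
            count -= len(buckets[cutoff])
            buckets[cutoff].clear()
            cutoff -= 1
    pool = [c for b in buckets[: cutoff + 1] for c in b]
    return sorted(pool)[:k]  # lightweight final selection
```

Because a bucket is only dropped when at least k strictly closer candidates remain, the result is exact; the saving is that most candidates are appended or skipped in O(1) with no heap maintenance.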

A Constant-Approximation Distance Labeling Scheme under Polynomially Many Edge Failures

from arXiv: Data Structures and Algorithms

Authors: Bernhard Haeupler, Yaowei Long, Antti Roeyskoe, Thatchaphol Saranurak

A fault-tolerant distance labeling scheme assigns a label to each vertex and edge of an undirected weighted graph $G$ with $n$ vertices so that, for any edge set $F$ of size $|F| \leq f$, one can approximate the distance between $p$ and $q$ in $G \setminus F$ by reading only the labels of $F \cup \{p,q\}$. For any $k$, we present a deterministic polynomial-time scheme with $O(k^{4})$ approximation and $\tilde{O}(f^{4}n^{1/k})$ label size. This is the first scheme to achieve a constant approximation while handling any number of edge faults $f$, resolving the open problem posed by Dory and Parter [DP21]. All previous schemes provided only a linear-in-$f$ approximation [DP21, LPS25]. Our labeling scheme directly improves the state of the art in the simpler setting of distance sensitivity oracles. Even for just $f = Θ(\log n)$ faults, all previous oracles either have super-linear query time, linear-in-$f$ approximation [CLPR12], or exponentially worse $2^{{\rm poly}(k)}$ approximation dependency in $k$ [HLS24].

GPU-RMQ: Accelerating Range Minimum Queries on Modern GPUs

from arXiv: Data Structures and Algorithms

Authors: Lara Kreis, Justus Henneberg, Valentin Henkys, Felix Schuhknecht, Bertil Schmidt

Range minimum queries are frequently used in string processing and database applications including biological sequence analysis, document retrieval, and web search. Hence, various data structures have been proposed for improving their efficiency on both CPUs and GPUs. Recent work has also shown that hardware-accelerated ray tracing on modern NVIDIA RTX graphics cards can be exploited to answer range minimum queries by expressing queries as rays, which are fired into a scene of triangles representing minima of ranges at different granularities. While these approaches are promising, they suffer from at least one of three issues: severe memory overhead, high index construction time, and low query throughput. This renders these methods practically unusable on larger arrays: for example, the state-of-the-art GPU-based approaches LCA and RTXRMQ exceed the memory capacity of an NVIDIA RTX 4090 GPU for input arrays of size >= 2^29. To tackle these problems, in this work, we present a new hierarchical approach called GPU-RMQ. GPU-RMQ first constructs a hierarchy of range minimum summaries on top of the original array in a highly parallel fashion. For query answering, only the relevant portions of the hierarchy are then processed in an optimized massively-parallel scan operation. Additionally, GPU-RMQ is hybrid in design, enabling the use of both ray tracing cores and CUDA cores across different levels of the hierarchy to handle queries. Our experimental evaluation shows that GPU-RMQ outperforms the state-of-the-art approaches in terms of query throughput, especially for larger arrays, while offering a significantly lower memory footprint and up to two orders-of-magnitude faster index construction. In particular, it achieves up to ~8x higher throughput than LCA, ~17x higher throughput than RTXRMQ, and up to ~4800x higher throughput compared to an optimized CPU-based approach.
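For readers who want the baseline concept, a "hierarchy of range minimum summaries" is closely related to the classic sequential sparse table, which stores the minimum of every power-of-two window. A minimal CPU sketch (ours, not the paper's GPU algorithm):

```python
def build_sparse_table(a):
    """Hierarchy of range-minimum summaries: level j holds the minimum of
    every window of length 2^j. O(n log n) space, O(1) queries."""
    n = len(a)
    levels = [list(a)]
    j = 1
    while (1 << j) <= n:
        prev, half = levels[-1], 1 << (j - 1)
        levels.append([min(prev[i], prev[i + half])
                       for i in range(n - (1 << j) + 1)])
        j += 1
    return levels

def rmq(levels, l, r):
    """Minimum of a[l..r] (inclusive) via two overlapping windows."""
    j = (r - l + 1).bit_length() - 1
    return min(levels[j][l], levels[j][r - (1 << j) + 1])
```

Each query covers [l, r] with two windows of the largest power-of-two length that fits; their overlap is harmless because min is idempotent.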

Adaptive Fully Dynamic $k$-Center Clustering with (Near-)Optimal Worst-Case Guarantees

from arXiv: Data Structures and Algorithms

Authors: Mara Grilnberger, Antonis Skarlatos

Given a sequence of adversarial point insertions and point deletions, is it possible to simultaneously optimize the approximation ratio, update time, and recourse for a $k$-clustering problem? If so, can this be achieved with worst-case guarantees against an adaptive adversary? These questions have garnered significant attention in recent years. Prior works by Bhattacharya, Costa, Garg, Lattanzi, and Parotsidis [FOCS '24] and by Bhattacharya, Costa, and Farokhnejad [STOC '25] have taken significant steps toward this direction for the $k$-median clustering problem and its generalization, the $(k, z)$-clustering problem. In this paper, we study the $k$-center clustering problem, which is one of the most classical and well-studied $k$-clustering problems. Recently, Bhattacharya, Costa, Farokhnejad, Lattanzi, and Parotsidis [ICML '25] provided an affirmative answer to the first question for the $k$-center clustering problem. However, their work did not resolve the second question, as their result provides only expected amortized guarantees against an oblivious adversary. In this work, we make significant progress and close the gap by answering both questions in the affirmative. Specifically, we show that the fully dynamic $k$-center clustering problem admits a constant-factor approximation, near-optimal worst-case update time, and constant worst-case recourse, even against an adaptive adversary. This is achieved by first developing a fully dynamic bicriteria approximation algorithm with (near-)optimal worst-case bounds, and then designing a suitable fully dynamic $k$-center algorithm with near-linear update time. For the fully dynamic bicriteria approximation algorithm, we establish the worst-case recourse and worst-case update time guarantees separately, and then merge them into a single algorithm through a simple yet elegant process.

Single-Pass Streaming CSPs via Two-Tier Sampling

from arXiv: Data Structures and Algorithms

Authors: Amir Azarmehr, Soheil Behnezhad, Shane Ferrante

We study the maximum constraint satisfaction problem, Max-CSP, in the streaming setting. Given $n$ variables, the constraints arrive sequentially in an arbitrary order, with each constraint involving only a small subset of the variables. The objective is to approximate the maximum fraction of constraints that can be satisfied by an optimal assignment in a single pass. The problem admits a trivial near-optimal solution with $O(n)$ space, so the major open problem in the literature has been the best approximation achievable when limiting the space to $o(n)$. The answer to the question above depends heavily on the CSP instance at hand. The integrality gap $α$ of an LP relaxation, known as the BasicLP, plays a central role. In particular, a major conjecture of the area is that in the single-pass streaming setting, for any fixed $\varepsilon > 0$, (i) an $(α-\varepsilon)$-approximation can be achieved with $o(n)$ space, and (ii) any $(α+\varepsilon)$-approximation requires $Ω(n)$ space. In this work, we fully resolve the first side of the conjecture by proving that an $(α- \varepsilon)$-approximation of Max-CSP can indeed be achieved using $n^{1-Ω_\varepsilon(1)}$ space and in a single pass. Given that Max-DiCut is a special case of Max-CSP, our algorithm fully recovers the recent result of [ABFS26, STOC'26] via a completely different algorithm and proof. On a technical level, our algorithm simulates a suitable local algorithm on a reduced graph using a technique that we call *two-tier sampling*: the algorithm combines both edge sampling and vertex sampling to handle high- and low-degree vertices at the same time.

On the Dynamics of Linear Finite Dynamical Systems Over Galois Rings

from arXiv: Data Structures and Algorithms

Authors: Jonas Kantic, Claudio Qureshi, Daniel Panario, Fabian Legl

Linear finite dynamical systems play an important role, for example, in coding theory and simulations. Methods for analyzing such systems are often restricted to cases in which the system is defined over a field, or lack the practicability to effectively analyze the system's dynamical behavior. However, when analyzing and prototyping finite dynamical systems, it is often desirable to quickly obtain basic information such as the length of cycles and transients that appear in its dynamics, which is reflected in the structure of the connected components of the corresponding functional graphs. In this paper, we extend the analysis of the dynamics of linear finite dynamical systems that act over cyclic modules to Galois rings. Furthermore, we propose algorithms for computing the length of the cycles and the height of the trees that make up their functional graphs.

A Simple Average-case Analysis of Recursive Randomized Greedy MIS

from arXiv: Data Structures and Algorithms

Authors: Mina Dalirrooyfard, Konstantin Makarychev, Slobodan Mitrović

We revisit the complexity analysis of the recursive version of the randomized greedy algorithm for computing a maximal independent set (MIS), originally analyzed by Yoshida, Yamamoto, and Ito (2009). They showed that, on average per vertex, the expected number of recursive calls made by this algorithm is upper bounded by the average degree of the input graph. While their analysis is clever and intricate, we provide a significantly simpler alternative that achieves the same guarantee. Our analysis is inspired by the recent work of Dalirrooyfard, Makarychev, and Mitrović (2024), who developed a potential-function-based argument to analyze a new algorithm for correlation clustering. We adapt this approach to the MIS setting, yielding a more direct and arguably more transparent analysis of the recursive randomized greedy MIS algorithm.

Approximating the Permanent of a Random Matrix with Polynomially Small Mean: Zeros and Universality

from arXiv: Data Structures and Algorithms

Authors: Frederic Koehler, Pui Kuen Leung

We study algorithms for approximating the permanent of a random matrix when the entries are slightly biased away from zero. This question is motivated by the goal of understanding the classical complexity of linear optics and \emph{boson sampling} (Aaronson and Arkhipov '11; Eldar and Mehraban '17). Barvinok's interpolation method enables efficient approximation of the permanent, provided one can establish a sufficiently large zero-free region for the polynomial $\mathrm{per}(zJ + W)$, where $J$ is the all-ones matrix and $W$ is a random matrix with independent mean-zero entries. We show that when the entries of $W$ are standard complex Gaussians, all zeros of the random polynomial $\mathrm{per}(zJ + W)$ lie within a disk of radius $\tilde{O}(n^{-1/3})$, which yields an approximation algorithm when the bias of the entries is $\tilde{Ω}(n^{-1/3})$. Previously, there were no efficient algorithms at biases smaller than $1/\mathrm{polylog}(n)$, and it was unknown whether there typically exist zeros $z$ with $|z| \ge 1$. As a complementary result, we show that the bulk of the zeros, namely $(1 - ε)n$ of them, have magnitude $Θ(n^{-1/2})$. This prevents our interpolation method from contradicting the conjectured average-case hardness of approximating the permanent. We also establish analogous zero-free regions for the hardcore model on general graphs with complex vertex fugacities. In addition, we prove universality results establishing zero-free regions for random matrices $W$ with i.i.d. subexponential entries.

A divide and conquer strategy for multinomial particle filter resampling

from arXiv: Data Structures and Algorithms

Authors: Andrey A. Popov

This work provides a new multinomial resampling procedure for particle filter resampling, focused on the case where the number of samples required is less than or equal to the size of the underlying discrete distribution. This setting is common in ensemble mixture model filters such as the Gaussian mixture filter. We show the superiority of our approach with respect to two of the best-known multinomial sampling procedures, both through a computational complexity analysis and through a numerical experiment.

Space-Efficient Text Indexing with Mismatches using Function Inversion

from arXiv: Data Structures and Algorithms

Authors: Jackson Bibbens, Levi Borevitz, Samuel McCauley

A classic data structure problem is to preprocess a string T of length $n$ so that, given a query $q$, we can quickly find all substrings of T with Hamming distance at most $k$ from the query string. Variants of this problem have seen significant research both in theory and in practice. For a wide parameter range, the best worst-case bounds are achieved by the "CGL tree" (Cole, Gottlieb, Lewenstein 2004), which achieves query time roughly $\tilde{O}(|q| + \log^k n + \# occ)$ where $\# occ$ is the size of the output, and space ${O}(n\log^k n)$. The CGL Tree space was recently improved to $O(n \log^{k-1} n)$ (Kociumaka, Radoszewski 2026). A natural question is whether a high space bound is necessary. How efficient can we make queries when the data structure is constrained to $O(n)$ space? While this question has seen extensive research, all known results have query time with unfavorable dependence on $n$, $k$, and the alphabet $Σ$. The state of the art query time (Chan et al. 2011) is roughly $\tilde{O}(|q| + |Σ|^k \log^{k^2 + k} n + \# occ)$. We give an $O(n)$-space data structure with query time roughly $\tilde{O}(|q| + \log^{4k} n + \log^{2k} n \# occ)$, with no dependence on $|Σ|$. Even if $|Σ| = O(1)$, this is the best known query time for linear space if $k\geq 3$ unless $\# occ$ is large. Our results give a smooth tradeoff between time and space. We also give the first sublinear-space results: we give a succinct data structure using only $o(n)$ space in addition to the text itself. Our main technical idea is to apply function inversion (Fiat, Naor 2000) to the CGL tree. Combining these techniques is not immediate; in fact, we revisit the exposition of both the Fiat-Naor data structure and the CGL tree to obtain our bounds. Along the way, we obtain improved performance for both data structures, which may be of independent interest.

Thursday, April 02

Should Have Known Better

from Ben Recht

Motivating the metric of regret in sequential decision making.

This is the second live blog of Lecture 7 of my graduate seminar “Feedback, Learning, and Adaptation.” A table of contents is here.

When talking about sequential decision making and optimal control, I can’t avoid discussing the mathematical concept of regret. Regret is the preferred theoretical metric for evaluating bandit algorithms, and bandit algorithms are a core method for online decision-making.

Invariably, every time I try to explain “regret,” I get the question, “Wait, why do we care about that?” So to have an answer that I can point to in the future, I’m going to write a few blog posts.

We can use the setup from the last blog. We have a two-player game with repeated interactions. In every round t,

  1. Information xt is revealed to both players.

  2. Player One takes action ut

  3. Player Two takes action dt

  4. A score rt is assigned based on the triple (xt,ut,dt).

Player One is the “decision maker,” and their action has to be computable from a few lines of code. Their goal is to accumulate as high a score as possible, summed across all rounds. Player Two wants the sum of all of the rt to be as low as possible.
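The round-by-round protocol above can be sketched as a short loop. This is just an illustrative skeleton, not anything from the lecture notes: the callables are hypothetical stand-ins for the context source, the two players' strategies, and the scoring rule.

```python
import random

def play(T, reveal_context, player_one, player_two, score):
    """Run T rounds of the repeated game; return Player One's total score."""
    history, total = [], 0.0
    for t in range(T):
        x = reveal_context(t)          # 1. information revealed to both players
        u = player_one(x, history)     # 2. Player One's action
        d = player_two(x, history)     # 3. Player Two's action
        r = score(x, u, d)             # 4. score assigned to the triple
        history.append((x, u, d, r))
        total += r
    return total

# Example instance: the color game, with Player One guessing at random
# and Player Two alternating colors (both choices are illustrative).
total = play(
    100,
    reveal_context=lambda t: t,
    player_one=lambda x, h: random.choice(["Red", "Blue"]),
    player_two=lambda x, h: "Red" if x % 2 == 0 else "Blue",
    score=lambda x, u, d: 1.0 if u == d else 0.0,
)
```

Passing strategies as functions of the history makes it easy to swap in adaptive players later, which is all the regret framework asks of Player One.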

You could think of designing the optimal policy as an optimization problem. Given a description of what Player Two can do, you can design a strategy for Player One. If Player Two is totally random, you could perhaps maximize the expected score. If Player Two is deterministic, you could maximize the score assuming Player Two plays the best possible strategy against you. Regret provides a flexible framework for going beyond both of these formulations.

To define regret, we imagine a counterfactual world in which Player One knows something about Player Two’s strategy in advance. The regret of some strategy is the difference between the score of a policy built on knowing this secret of Player Two and the score of the strategy that has to learn as it goes. It is called regret because it estimates how much the score could be improved with the benefit of hindsight.

Sometimes this regret is really large. Consider the following example. In each round, Player Two thinks of a color, Red or Blue, and Player One has to guess which color Player Two is thinking of. Player One gets a 1 if they guess correctly and a 0 otherwise. Player Two agrees to choose their sequence in advance, but to reveal only one color to Player One in each round. In the counterfactual world, Player One would know the entire sequence and would receive a perfect score. In reality, Player One can’t do better than guessing, so they would be hard-pressed to get more than half of the colors correct.

This is where regret gets confusing. We ask, what if Player One in the counterfactual world has the benefit of hindsight but is constrained in their strategies? In this color-guessing game, what if Player One is forced to choose a single color in their counterfactual world? They see the entire sequence but can only pick Red or Blue. In this case, if Player Two chooses an equal number of Reds and Blues, the omniscient yet restricted Player One can only get half of the answers correct. A real-world strategy of random guessing will fare just as well as this counterfactual strategy with the benefit of hindsight.
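You can check this claim numerically in a few lines. The sketch below (my own illustration, with an arbitrary horizon and seed) compares random guessing against the best fixed color chosen in hindsight on a balanced sequence.

```python
import random

random.seed(0)
T = 10_000

# Player Two commits to a balanced sequence of colors in advance.
sequence = ["Red"] * (T // 2) + ["Blue"] * (T // 2)
random.shuffle(sequence)

# Real-world Player One: guess uniformly at random each round.
random_score = sum(random.choice(["Red", "Blue"]) == c for c in sequence)

# Counterfactual Player One: sees the whole sequence but must commit
# to one color for every round.
best_fixed_score = max(sequence.count("Red"), sequence.count("Blue"))

print(random_score / T)      # ≈ 0.5
print(best_fixed_score / T)  # exactly 0.5 on a balanced sequence
```

Both scores hover around half, so the regret of random guessing against the best fixed action is small even though the sequence was adversarially committed in advance.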

No matter how many times I explain it, I find this setup confusing. Let me write it again: The regret model requires two things: a secret of Player Two and a restricted strategy set of Player One. In the real world, Player One has a flexible strategy set, but is missing information. In the counterfactual world, Player One has a restricted strategy set, but extra knowledge. Regret bounds the difference in scores achieved in these two worlds.

You might ask why this particular example of color guessing is interesting. I’m not sure it is, but it’s the one we’ll use next week when discussing forecasting. When someone tells you that they have calibrated predictions, they are doing this sort of sleight of hand and comparing against something that you probably don’t actually care about.

But let’s spend some time discussing examples where regret is reasonable. I’ll start with the canonical example: the stochastic multiarmed bandit. If Player Two is random and stationary, then the best strategy in hindsight makes a lot more sense. In our game of colors, this is the multiarmed bandit problem, the most annoyingly named subject in decision making. In the classic version of this problem, you have two slot machines and want to find the one with the highest payout. Each round, you are allowed to choose one of the machines to play. We model the payout from each machine as an independent draw from a fixed probability distribution. These distributions have different means, and your goal is to devise a strategy that results in the highest expected payout.

What would the best policy do? No matter how clever you are, you can’t beat the strategy of only using the machine with the higher mean payout. If you knew the expected payouts in advance and wanted to maximize your expected total, you would play only the machine with the highest expected payout. Thus, we can think of the secret held by Player Two as the mean payouts of each machine.

If you didn’t know this secret, what would you do? You’d probably spend some time with each machine, look at which one is giving you higher returns, and then pick that one forever. This seems like a reasonable strategy.1 But note that this strategy necessarily has nonzero regret, because you necessarily have to try both machines to figure out which one is best.

Any strategy you devise for the real world has a particular expected regret, which is the difference between the expected payout of playing the best machine and the expected value of your strategy. In the case of our multiarmed bandit, the worst regret is accrued by always playing the suboptimal machine. So the regret would grow linearly with the number of pulls. Bandit algorithms seek strategies for which regret grows sublinearly.
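The “try each machine, then commit” strategy from the previous paragraph (often called explore-then-commit) is easy to simulate. This is a minimal sketch under assumed parameters: two Bernoulli arms with illustrative means 0.5 and 0.6, a fixed exploration budget, and a fixed seed — none of these numbers come from the lecture.

```python
import random

def explore_then_commit(T, explore=200, means=(0.5, 0.6), seed=1):
    """Play T rounds; return realized regret against the best arm's mean."""
    rng = random.Random(seed)
    pulls, rewards = [0, 0], [0.0, 0.0]
    total = 0.0
    commit = 0
    for t in range(T):
        if t < 2 * explore:
            arm = t % 2                    # explore both arms equally
        else:
            if t == 2 * explore:
                # commit to the empirically better arm for the rest of the run
                commit = 0 if rewards[0] / pulls[0] >= rewards[1] / pulls[1] else 1
            arm = commit
        r = 1.0 if rng.random() < means[arm] else 0.0
        pulls[arm] += 1
        rewards[arm] += r
        total += r
    # regret: payout of always playing the best arm, minus realized payout
    return max(means) * T - total

print(explore_then_commit(10_000))
```

The exploration phase contributes regret proportional to the (fixed) exploration budget, not to T, so once the commitment is correct the regret stops growing — the sublinear behavior that bandit algorithms aim for.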

Outside the casino, variants of the stochastic multiarmed bandit are reasonable models for adaptive experimentation. Suppose you want to select whichever of treatments A and B maximizes the average benefit to some cohort of subjects. If you can randomly sample individuals from the cohort, there will be regret associated with the number of subjects assigned to the suboptimal treatment in a randomized experiment, and there will be regret associated with the chance your experiment selects the suboptimal treatment. You would like to minimize application of the wrong treatment, but also be pretty sure you are finding the right one. You can compare this to the policy that assigned everyone to the optimal treatment in advance.

Tomorrow I’ll talk through three other examples where regret feels like the right concept to me. In hindsight, it’s not always worth the headaches and confusion associated with regret minimization, but there are enough positive examples to make it a concept worth understanding.

Subscribe now

1

This is more or less the optimal strategy.

By Ben Recht

TCS+ talk: Wednesday, April 8 — Rahul Ilango, MIT

from TCS+ Seminar Series

The next TCS+ talk will take place this coming Wednesday, April 8th at 1:00 PM Eastern Time (10:00 AM Pacific Time, 19:00 Central European Time, 17:00 UTC). Rahul Ilango from MIT will speak about “Gödel in Cryptography: Zero-Knowledge Proofs With No Interaction, No Setup, and Perfect Soundness” (abstract below).

You can reserve a spot as an individual or a group to join us live by signing up on the online form. Registration is not required to attend the interactive talk, and the link will be posted on the website the day prior to the talk; however, by registering in the form, you will receive a reminder, along with the link. (The recorded talk will also be posted on our website afterwards.) As usual, for more information about the TCS+ online seminar series and the upcoming talks, or to suggest a possible topic or speaker, please see the website.

Abstract: Gödel showed that there are true but unprovable statements. This was bad news for Hilbert, who hoped that every true statement was provable. In this talk, I’ll describe why Gödel’s result is, in fact, good news for cryptography.

Specifically, Gödel’s result allows for the following strange scenario: a cryptographic system S is insecure, but it is impossible to prove that S is insecure. As I will explain, in this scenario (defined carefully), S is secure for nearly all practical purposes.

Leveraging this idea, we effectively construct — under longstanding assumptions — a classically-impossible cryptographic dream object: “zero-knowledge proofs for NP with no interaction, no setup, and perfect soundness” (I won’t assume you know what this means!). As an application, our result lets one give an ordinary mathematical proof that a Sudoku puzzle is solvable without revealing how to solve it. Previously, it was not known how to do this.

Paper link: https://eccc.weizmann.ac.il/report/2025/095/

By plustcs

Learning and Generating Mixed States Prepared by Shallow Channel Circuits

from arXiv: Computational Complexity

Authors: Fangjun Hu, Christian Kokail, Milan Kornjača, Pedro L. S. Lopes, Weiyuan Gong, Sheng-Tao Wang, Xun Gao, Stefan Ostermann

Learning quantum states from measurement data is a central problem in quantum information and computational complexity. In this work, we study the problem of learning to generate mixed states on a finite-dimensional lattice. Motivated by recent developments in mixed state phases of matter, we focus on arbitrary states in the trivial phase. A state belongs to the trivial phase if there exists a shallow preparation channel circuit under which local reversibility is preserved throughout the preparation. We prove that any mixed state in this class can be efficiently learned from measurement access alone. Specifically, given copies of an unknown trivial phase mixed state, our algorithm outputs a shallow local channel circuit that approximately generates this state in trace distance. The sample complexity and runtime are polynomial (or quasi-polynomial) in the number of qubits, assuming constant (or polylogarithmic) circuit depth and gate locality. Importantly, the learner is not given the original preparation circuit and relies only on its existence. Our results provide a structural foundation for quantum generative models based on shallow channel circuits. In the classical limit, our framework also inspires an efficient algorithm for classical diffusion models using only a polynomial overhead of training and generation.

A General Framework for Computational Lower Bounds in Nontrivial Norm Approximation

from arXiv: Computational Complexity

Authors: Runshi Tang, Yuefeng Han, Anru R. Zhang

In this note, we propose a general framework for proving computational lower bounds in norm approximation by leveraging a reverse detection--estimation gap. The starting point is a testing problem together with an estimator whose error is significantly smaller than the corresponding computational detection threshold. We show that such a gap yields a lower bound on the approximation distortion achievable by any algorithm in the underlying computational class. In this way, reverse detection--estimation gaps can be turned into a general mechanism for certifying the hardness of approximating nontrivial norms. We apply this framework to the spectral norm of order-$d$ symmetric tensors in $\mathbb{R}^{p^d}$. Using a recently established low-degree hardness result for detecting nonzero high-order cumulant tensors, together with an efficiently computable estimator whose error is below the low-degree detection threshold, we prove that any degree-$D$ low-degree algorithm with $D \le c_d(\log p)^2$ must incur distortion at least $p^{d/4-1/2}/\operatorname{polylog}(p)$ for the tensor spectral norm. Under the low-degree conjecture, the same conclusion extends to all polynomial-time algorithms. In several important settings, this lower bound matches the best known upper bounds up to polylogarithmic factors, suggesting that the exponent $d/4-1/2$ captures a genuine computational barrier. Our results provide evidence that the difficulty of approximating tensor spectral norm is not merely an artifact of existing techniques, but reflects a broader computational barrier.

An Unconditional Barrier for Proving Multilinear Algebraic Branching Program Lower Bounds

from arXiv: Computational Complexity

Authors: Deepanshu Kush

Since the breakthrough superpolynomial multilinear formula lower bounds of Raz (Theory of Computing 2006), proving such lower bounds against multilinear algebraic branching programs (mABPs) has been a longstanding open problem in algebraic complexity theory. All known multilinear lower bounds rely on the min-partition rank method, and the best bounds against mABPs have remained quadratic (Alon, Kumar, and Volk, Combinatorica 2020). We show that the min-partition rank method cannot prove superpolynomial mABP lower bounds: there exists a full-rank multilinear polynomial computable by a polynomial-size mABP. This is an unconditional barrier: new techniques are needed to separate $\mathsf{mVBP}$ from higher classes in the multilinear hierarchy. Our proof resolves an open problem of Fabris, Limaye, Srinivasan, and Yehudayoff (ECCC 2026), who showed that the power of this method is governed by the minimum size $N(n)$ of a combinatorial object called a $1$-balanced-chain set system, and proved $N(n) \le n^{O(\log n/\log\log n)}$. We prove $N(n) = n^{O(1)}$ by giving the chain-builder a binary choice at each step, biasing what was a symmetric random walk into one where the imbalance increases with probability at most $1/4$; a supermartingale argument combined with a multi-scale recursion yields the polynomial bound.


Breadth-First Search Trees with Many or Few Leaves

from arXiv: Data Structures and Algorithms

Authors: Jesse Beisegel, Ekkehard Köhler, Robert Scheffler, Martin Strehler

The Maximum (Minimum) Leaf Spanning Tree problem asks for a spanning tree with the largest (smallest) number of leaves. As spanning trees are often computed using graph search algorithms, it is natural to restrict this problem to the set of search trees of some particular graph search, e.g., find the Breadth-First Search (BFS) tree with the largest number of leaves. We study this problem for Generic Search (GS), BFS and Lexicographic Breadth-First Search (LBFS) using search trees that connect each vertex to its first neighbor in the search order (first-in trees) just like the classic BFS tree. In particular, we analyze the complexity of these problems, both in the classical and in the parameterized sense. Among other results, we show that the minimum and maximum leaf problems are in FPT for the first-in trees of GS, BFS and LBFS when parameterized by the number of leaves in the tree. However, when these problems are parameterized by the number of internal vertices of the tree, they are W[1]-hard for the first-in trees of GS, BFS and LBFS.

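As background for the terminology above, a first-in BFS tree can be built by the standard queue-based search: each vertex's parent is the neighbor that discovered it, which in BFS is also its first neighbor in the visit order. A minimal sketch (the example graph and leaf computation are illustrative, not from the paper):

```python
from collections import deque

def bfs_first_in_tree(adj, root):
    """BFS from root; each vertex's tree parent is its first neighbor
    in the visit order (the classic BFS tree). adj: vertex -> neighbor list."""
    order = {root: 0}
    parent = {root: None}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in order:
                order[v] = len(order)
                parent[v] = u  # u is v's earliest-visited neighbor
                queue.append(v)
    return parent

# Tree leaves are the vertices that are no one's parent.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
parent = bfs_first_in_tree(adj, 0)
leaves = [v for v in adj if v not in set(parent.values())]
```

The maximization/minimization problems in the abstract ask which visit orders make the number of such leaves as large or as small as possible.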

On the average-case complexity landscape for Tensor-Isomorphism-complete problems over finite fields

from arXiv: Data Structures and Algorithms

Authors: Tiange Li, Yinan Li, Youming Qiao, Dacheng Tao, Yingjie Wang

In Grochow and Qiao (SIAM J. Comput., 2021), the complexity class Tensor Isomorphism (TI) was introduced and isomorphism problems for groups, algebras, and polynomials were shown to be TI-complete. In this paper, we study average-case algorithms for several TI-complete problems over finite fields, including algebra isomorphism, matrix code conjugacy, and $4$-tensor isomorphism. Our main results are as follows. Over the finite field of order $q$, we devise (1) average-case polynomial-time algorithms for algebra isomorphism and matrix code conjugacy that succeed in a $1/Θ(q)$ fraction of inputs and (2) an average-case polynomial-time algorithm for the $4$-tensor isomorphism that succeeds in a $1/q^{Θ(1)}$ fraction of inputs. Prior to our work, algorithms for algebra isomorphism with rigorous average-case analyses ran in exponential time, albeit succeeding on a larger fraction of inputs (Li--Qiao, FOCS'17; Brooksbank--Li--Qiao--Wilson, ESA'20; Grochow--Qiao--Tang, STACS'21). These results reveal a finer landscape of the average-case complexities of TI-complete problems, providing guidance for cryptographic systems based on isomorphism problems. Our main technical contribution is to introduce the spectral properties of random matrices into algorithms for TI-complete problems. This leads to not only new algorithms but also new questions in random matrix theory over finite fields. To settle these questions, we need to extend both the generating function approach as in Neumann and Praeger (J. London Math. Soc., 1998) and the characteristic sum method of Gorodetsky and Rodgers (Trans. Amer. Math. Soc., 2021).


Stable algorithms cannot reliably find isolated perceptron solutions

from arXiv: Data Structures and Algorithms

Authors: Shuyang Gong, Brice Huang, Shuangping Li, Mark Sellke

We study the binary perceptron, a random constraint satisfaction problem that asks to find a Boolean vector in the intersection of independently chosen random halfspaces. A striking feature of this model is that at every positive constraint density, it is expected that a $1-o_N(1)$ fraction of solutions are \emph{strongly isolated}, i.e. separated from all others by Hamming distance $Ω(N)$. At the same time, efficient algorithms are known to find solutions at certain positive constraint densities. This raises a natural question: can any isolated solution be algorithmically visible? We answer this in the negative: no algorithm whose output is stable under a tiny Gaussian resampling of the disorder can \emph{reliably} locate isolated solutions. We show that any stable algorithm has success probability at most $\frac{3\sqrt{17}-9}{4}+o_N(1)\leq 0.84233$. Furthermore, every stable algorithm that finds a solution with probability $1-o_N(1)$ finds an isolated solution with probability $o_N(1)$. The class of stable algorithms we consider includes degree-$D$ polynomials up to $D\leq o(N/\log N)$; under the low-degree heuristic \cite{hopkins2018statistical}, this suggests that locating strongly isolated solutions requires running time $\exp(\widetilde{Θ}(N))$. Our proof does not use the overlap gap property. Instead, we show via Pitt's correlation inequality that after a random perturbation of the disorder, the number of solutions located close to a pre-existing isolated solution cannot concentrate at $1$.


The Mystery Deepens: On the Query Complexity of Tarski Fixed Points

from arXiv: Data Structures and Algorithms

Authors: Xi Chen, Yuhao Li, Mihalis Yannakakis

We give an $O(\log^2 n)$-query algorithm for finding a Tarski fixed point over the $4$-dimensional lattice $[n]^4$, matching the $Ω(\log^2 n)$ lower bound of [EPRY20]. Additionally, our algorithm yields an ${O(\log^{\lceil (k-1)/3\rceil+1} n)}$-query algorithm for any constant $k$, improving the previous best upper bound ${O(\log^{\lceil (k-1)/2\rceil+1} n)}$ of [CL22]. Our algorithm uses a new framework based on \emph{safe partial-information} functions. The latter were introduced in [CLY23] to give a reduction from the Tarski problem to its promised version with a unique fixed point. This is the first time they are directly used to design new algorithms for Tarski fixed points.

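For intuition about why polylogarithmic query bounds are possible here, the one-dimensional case already admits an $O(\log n)$-query binary search: a monotone $f:[n]\to[n]$ has a fixed point by Knaster-Tarski, and comparing $f(\mathrm{mid})$ with $\mathrm{mid}$ tells you which half-interval $f$ maps into itself. A sketch of this standard base case (not the paper's higher-dimensional algorithm):

```python
def tarski_1d(f, n):
    """Find x in {1..n} with f(x) == x for monotone f: [n] -> [n] using
    O(log n) queries. Invariant: f maps [lo, hi] into itself, so a fixed
    point always survives in the current interval (Knaster-Tarski)."""
    lo, hi = 1, n
    while lo < hi:
        mid = (lo + hi) // 2
        y = f(mid)
        if y == mid:
            return mid
        elif y > mid:
            lo = mid + 1  # f maps [mid+1, hi] into itself
        else:
            hi = mid - 1  # f maps [lo, mid-1] into itself
    return lo
```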

Engineering Fully Dynamic Convex Hulls

from arXiv: Data Structures and Algorithms

Authors: Ivor van der Hoog, Henrik Reinstädtler, Eva Rotenberg

We present a new fully dynamic algorithm for maintaining convex hulls under insertions and deletions while supporting geometric queries. Our approach combines the logarithmic method with a deletion-only convex hull data structure, achieving amortised update times of $O(\log n \log \log n)$ and query times of $O(\log^2 n)$. We provide a robust and non-trivial implementation that supports point-location queries, a challenging and non-decomposable class of convex hull queries. We evaluate our implementation against the state of the art, including a new naive baseline that rebuilds the convex hull whenever an update affects it. On hulls that include polynomially many data points (e.g. $Θ(n^\varepsilon)$ for some $\varepsilon$), such as the ones that often occur in practice, our method outperforms all other techniques. Update-heavy workloads strongly favour our approach, which is in line with our theoretical guarantees. Yet, our method remains competitive all the way down to when the update to query ratio is $1$ to $10$. Experiments on real-world data sets furthermore reveal that existing fully dynamic techniques suffer from significant robustness issues. In contrast, our implementation remains stable across all tested inputs.

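For context, the logarithmic method (Bentley-Saxe) referenced above turns a static structure into an insertion-only one by keeping buckets of sizes $2^i$ and merging on insert, answering queries as a combination over buckets; this works directly only for decomposable queries (handling non-decomposable point location, plus deletions, is the paper's contribution). A generic sketch using a sorted array as the static structure and predecessor as the decomposable query (names are illustrative):

```python
import bisect

class LogMethod:
    """Bentley-Saxe logarithmic method, insertions only: static structures
    of size 2^i are merged like binary addition, so each element is
    rebuilt O(log n) times."""
    def __init__(self):
        self.buckets = []  # buckets[i] is None or a sorted tuple of size 2^i

    def insert(self, x):
        carry = [x]
        i = 0
        while True:
            if i == len(self.buckets):
                self.buckets.append(None)
            if self.buckets[i] is None:
                self.buckets[i] = tuple(sorted(carry))  # rebuild a static structure
                return
            carry += list(self.buckets[i])  # cascade: merge equal-size buckets
            self.buckets[i] = None
            i += 1

    def predecessor(self, x):
        # Decomposable query: combine per-bucket answers with max.
        best = None
        for b in self.buckets:
            if b:
                j = bisect.bisect_right(b, x)
                if j and (best is None or b[j - 1] > best):
                    best = b[j - 1]
        return best
```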

Single-Criteria Metric $r$-Dominating Set Problem via Minor-Preserving Support

from arXiv: Data Structures and Algorithms

Authors: Reilly Browne, Hsien-Chih Chang

Given an unweighted graph $G$, the *minimum $r$-dominating set problem* asks for the smallest-cardinality subset $S$ such that every vertex in $G$ is within radius $r$ of some vertex in $S$. While the $r$-dominating set problem on planar graphs admits a PTAS from Baker's shifting/layering technique when $r$ is constant, it becomes significantly harder when $r$ can depend on $n$. Under the Exponential-Time Hypothesis, Fox-Epstein et al. [SODA 2019] showed that no efficient PTAS exists for the unbounded $r$-dominating set problem on planar graphs. One may also consider the harder *vertex-weighted metric $r$-dominating set*, where edges have lengths, vertices have positive weights, and the goal is to find an $r$-dominating set of minimum total weight. This led to the development of *bicriteria* algorithms that allow radius-$(1+\varepsilon)r$ balls while achieving a $1+\varepsilon$ approximation to the optimal weight. We establish the first *single-criteria* polynomial-time $O(1)$-approximation algorithm for the vertex-weighted metric $r$-dominating set on planar graphs, where $r$ is part of the input and can be arbitrarily large. Our algorithm applies the quasi-uniformity sampling of Chan et al. [SODA 2012] by bounding the *shallow cell complexity* of the radius-$r$ ball system to be linear in $n$. Two technical innovations enable this: 1. Since discrete ball systems on planar graphs are neither pseudodisks nor amenable to standard union-complexity arguments, we construct a *support graph* for arbitrary distance ball systems as contractions of Voronoi cells, with sparseness as a byproduct. 2. We assign each depth-($\geq 3$) cell to a unique 3-tuple of ball centers, enabling Clarkson-Shor techniques to reduce counting to depth-*exactly*-3 cells, which we prove are $O(n)$ by a geometric argument on our Voronoi contraction support.


Two Linear Passes Are Necessary for Sum-Exclude-Self Under Sublinear Space

from arXiv: Data Structures and Algorithms

Authors: Andrew Au

We prove that any algorithm computing the sum-exclude-self of an unsigned $d$-bit integer array of length $n$ under sublinear space must perform two linear passes over the input. More precisely, the algorithm must read at least $n-1$ input elements before any output cell receives its final value, and at least $n - \lfloor t/d \rfloor$ additional elements thereafter, where $t = o(nd)$ bits is the working memory size. This gives a total of $2n - 1 - \lfloor t/d \rfloor$ element reads. A trivial modification of the standard two-pass algorithm achieves this bound exactly for all practical input sizes. The proof uses this toy problem as a worked example to demonstrate the choke-point technique for proving sublinear-space lower bounds.

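The two-pass algorithm the bound refers to is the obvious one: accumulate the total, then subtract each element. A sketch (in Python rather than a word-RAM model, so it only illustrates the read pattern):

```python
def sum_exclude_self(a):
    """Pass 1 reads all n elements to form the total; pass 2 reads each
    element again to emit total - a[i]. Working memory is a single
    accumulator, far below the n*d bits of input."""
    total = 0
    for x in a:                      # pass 1: n reads
        total += x
    return [total - x for x in a]    # pass 2: n more reads

# e.g. sum_exclude_self([3, 1, 4]) -> [5, 7, 4]
```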

Faster Approximate Fixed Points of $\ell_\infty$-Contractions

from arXiv: Data Structures and Algorithms

Authors: Andrei Feodorov, Sebastian Haslebacher

We present a new algorithm for finding an $ε$-approximate fixed point of an $\ell_\infty$-contracting function $f : [0, 1]^d \rightarrow [0, 1]^d$. Our algorithm is based on the query-efficient algorithm by Chen, Li, and Yannakakis (STOC 2024), but comes with an improved upper bound of $(\log \frac{1}{ε})^{\mathcal{O}(d \log d)}$ on the overall runtime (while still being query-efficient). By combining this with a recent decomposition theorem for $\ell_\infty$-contracting functions, we then describe a second algorithm that finds an $ε$-approximate fixed point in $(\log \frac{1}{ε})^{\mathcal{O}(\sqrt{d} \log d)}$ queries and time. The key observation here is that decomposition theorems such as the one for $\ell_\infty$-contracting maps often allow a trade-off: If an algorithm's runtime is worse than its query complexity in terms of the dependency on the dimension $d$, then we can improve the runtime at the expense of weakening the query upper bound. By well-known reductions, our results imply a faster algorithm for $ε$-approximately solving Shapley stochastic games.

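For contrast, the naive baseline for a $λ$-contraction with $λ<1$ is Banach iteration, whose query count grows like $\log(1/ε)/\log(1/λ)$ and so blows up as $λ\to 1$; the query-efficient algorithms above avoid that dependence. A sketch of the baseline (illustrative, not the paper's algorithm):

```python
def approx_fixed_point(f, x0, eps):
    """Banach iteration: for a lam-contraction in the sup norm, the
    residual ||f(x) - x||_inf shrinks by a factor lam per step, so the
    loop ends after O(log(res0/eps) / log(1/lam)) queries to f."""
    x = list(x0)
    while True:
        fx = f(x)
        if max(abs(a - b) for a, b in zip(fx, x)) <= eps:
            return x  # eps-approximate fixed point
        x = fx
```

Near the fixed point, a residual of $ε$ still only certifies distance $ε/(1-λ)$ to the true fixed point, another reason the $λ\to 1$ regime is delicate.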

Rapid mixing in positively weighted restricted Boltzmann machines

from arXiv: Data Structures and Algorithms

Authors: Weiming Feng, Heng Guo, Minji Yang

We show polylogarithmic mixing time bounds for the alternating-scan sampler for positively weighted restricted Boltzmann machines. This is done via analysing the same chain and the Glauber dynamics for ferromagnetic two-spin systems, where we obtain new mixing time bounds up to the critical thresholds.

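In the usual reading (which may differ in details from the paper's setup), the alternating-scan sampler on a restricted Boltzmann machine is block Gibbs sampling: resample every hidden unit given the visibles, then every visible unit given the hiddens; "positively weighted" restricts to $W_{ij}\ge 0$, the ferromagnetic-like regime. A sketch for binary units:

```python
import math
import random

def alternating_scan(W, b, c, steps, seed=0):
    """One step = a hidden scan then a visible scan; each unit is a
    Bernoulli draw whose logit is its bias plus the weighted sum of the
    other layer's current states. W: n x m nonneg weights, b/c: biases."""
    rng = random.Random(seed)
    n, m = len(b), len(c)
    v = [rng.randint(0, 1) for _ in range(n)]
    h = [0] * m
    sigmoid = lambda z: 1.0 / (1.0 + math.exp(-z))
    for _ in range(steps):
        for j in range(m):  # hidden scan: P(h_j=1 | v) = sigmoid(c_j + sum_i W_ij v_i)
            h[j] = int(rng.random() < sigmoid(c[j] + sum(W[i][j] * v[i] for i in range(n))))
        for i in range(n):  # visible scan: P(v_i=1 | h) = sigmoid(b_i + sum_j W_ij h_j)
            v[i] = int(rng.random() < sigmoid(b[i] + sum(W[i][j] * h[j] for j in range(m))))
    return v, h
```

The mixing-time question is how many such scans are needed before $(v, h)$ is close to the RBM's stationary distribution.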

Round-efficient Fully-scalable MPC algorithms for k-Means

from arXiv: Data Structures and Algorithms

Authors: Shaofeng H. -C. Jiang, Yaonan Jin, Jianing Lou, Weicheng Wang

We study Euclidean $k$-Means under the Massively Parallel Computation (MPC) model, focusing on the \emph{fully-scalable} setting. Our main result is a fully-scalable $O((\log n/\log\log n)^2)$-approximation in $O(1)$ rounds. Previously, fully-scalable algorithms for $k$-Means either run in super-constant $O(\log\log n \cdot \log\log\log n)$ rounds, albeit with a better $O(1)$-approximation [Cohen-Addad et al., SODA'26], or suffer from bicriteria guarantees [Bhaskara and Wijewardena, ICML'18; Czumaj et al., ICALP'24]. Our algorithm also gives an $O(\log n/\log\log n)$-approximation for $k$-Median, which improves a recent $O(\log n)$-approximation [Goranci et al., SODA'26], and this $o(\log n)$ ratio breaks the fundamental barrier of tree embedding methods used therein. Our main technical contribution is a new variant of the MP algorithm [Mettu and Plaxton, SICOMP'03] that works for general metrics, whose new guarantee is the Lagrangian Multiplier Preserving (LMP) property, which, importantly, holds even under arbitrary distance distortions. Allowing distance distortion is crucial for efficient MPC implementations and useful for efficient algorithm design in general, whereas preserving the LMP property under distance distortion is known to be a significant technical challenge. As a byproduct of our techniques, we also obtain an $O(1)$-approximation to the optimal \emph{value} in $O(1)$ rounds, which conceptually suggests that achieving a true $O(1)$-approximation (for the solution) in $O(1)$ rounds may be a sensible goal for future study.


A Framework for Parameterized Subexponential-Subcubic-Time Algorithms for Weighted Problems in Planar Graphs

from arXiv: Data Structures and Algorithms

Authors: Matthias Bentert, Fedor V. Fomin, Petr A. Golovach

Many problems are known to be solvable in subexponential parameterized time when the input graph is planar. The bidimensionality framework of Demaine, Fomin, Hajiaghayi, and Thilikos [JACM'05] and the treewidth-pattern-covering approach by Fomin, Lokshtanov, Marx, Pilipczuk, Pilipczuk, and Saurabh [SICOMP'22] give robust tools for designing such algorithms. However, there are still many problems for which we do not know whether subexponential parameterized algorithms exist. The bidimensionality framework is not able to handle weights or directed graphs and the treewidth-pattern-covering approach only works for finding connected solutions. Building on a result by Nederlof [STOC'20], we provide a framework that is able to solve a variety of problems in planar graphs in subexponential parameterized time for which this was previously not known (where the polynomial part of the running time is usually $O(n^{2.49})$). Our framework can handle weights, does not require solutions to contain only few connected components, and applies to cases where the number of potential patterns of a solution is exponential in the parameter. We then use the framework to show that various weighted problems like Weighted Partial Vertex Cover, Maximum-Weight Induced Forest, Minimum-Weight Rooted Simple Minor, and Maximum-Weight Rooted Parallel Induced Minor allow for subexponential parameterized algorithms. This was previously not known for any of them. Moreover, we present a very easy-to-use fragment of our framework. This fragment allows for significantly simpler proofs in the case of Maximum-Weight Independent Set and Maximum $(k, n-k)$-Cut and is able to show a subexponential parameterized algorithm for weighted versions of Densest $k$-Subgraph. Even the unweighted version was not known before and is stated as an open problem in the existing literature.


Near-Optimal Four-Cycle Counting in Graph Streams

from arXiv: Data Structures and Algorithms

Authors: Sebastian Lüderssen, Stefan Neumann, Pan Peng

We study four-cycle counting in arbitrary order graph streams. We present a 3-pass algorithm for $(1+\varepsilon)$-approximating the number of four-cycles using $\widetilde{O}(m/\sqrt{T})$ space, where $m$ is the number of edges and $T$ the number of four-cycles in the graph. This improves upon a 3-pass algorithm by Vorotnikova using space $\widetilde{O}(m/T^{1/3})$ and matches a multi-pass lower bound of $\Omega(m/\sqrt{T})$ by McGregor and Vorotnikova.

Approximation Algorithms for Budget Splitting in Multi-Channel Influence Maximization

from arXiv: Data Structures and Algorithms

Authors: Dildar Ali, Ansh Jasrotia, Abishek Salaria, Suman Banerjee

How to utilize an allocated budget effectively for branding and promotion of a commercial house is an important problem, particularly when multiple advertising media are available. Among the many such media, two popular ones are billboards and social media advertisements. In this context, the question naturally arises: how should a budget be allocated across media to maximize total influence? Although there is significant literature on the effective use of budgets in individual advertising media, there are hardly any studies examining budget allocation across multiple advertising media. To bridge this gap, this paper introduces the \textsc{Budget Splitting Problem in Billboard and Social Network Advertisement}. We introduce the notion of \emph{interaction effect} to capture the additional influence due to triggers from multiple advertising media. Using this notion, we propose a novel influence function $\Phi(\cdot,\cdot)$ that captures the total influence, and show that this function is non-negative, monotone, and non-bisubmodular. We introduce the \emph{bi-submodularity ratio $(\gamma)$} and \emph{generalized curvature $(\alpha)$} to measure how close a function is to being bi-submodular and how far a function is from being modular, respectively. We propose Randomized Greedy and Two-Phase Adaptive Greedy approaches for the setting where the influence function is non-bisubmodular, achieving an approximation guarantee of $\frac{1}{\alpha}\left(1-e^{-\gamma\alpha}\right)$. We conducted several experiments using real-world datasets and observed that the budget splitting produced by the proposed approaches leads to greater influence than existing approaches.

A column generation algorithm for finding co-3-plexes in chordal graphs

from arXiv: Data Structures and Algorithms

Authors: Alexandre Dupont-Bouillard

In this study, we tackle the problem of finding a maximum \emph{co-3-plex}, that is, a subset of vertices of an input graph inducing a subgraph of maximum degree at most 2. We focus on the class of chordal graphs. By observing that the graph induced by a co-3-plex in a chordal graph is a set of isolated triangles and induced paths, we reduce the problem of finding a maximum-weight co-3-plex in a graph $G$ to that of finding a maximum stable set in an auxiliary graph $\mathcal{A}(G)$ of exponential size. This reduction allows us to derive a linear programming formulation for the maximum-weight co-3-plex problem with exponentially many variables. We show that the pricing subproblem of this formulation reduces to finding a maximum-weight induced path with both vertex and edge weights. Such a problem is solvable in polynomial time; this therefore yields a polynomial-time column generation algorithm for the maximum co-3-plex problem on chordal graphs. Moreover, this machinery provides a new application for the maximum vertex- and edge-weighted induced path problem.
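The defining property of a co-3-plex (the chosen vertices induce a subgraph of maximum degree at most 2) is easy to check directly. A minimal sketch, with a graph representation and function name of my own choosing, not taken from the paper:

```python
def is_co_3plex(adj, subset):
    """Check whether `subset` induces a subgraph of maximum degree at most 2.

    adj: dict mapping each vertex to the set of its neighbors.
    subset: iterable of vertices of the graph.
    """
    s = set(subset)
    # Each vertex may keep at most 2 neighbors inside the chosen subset.
    return all(len(adj[v] & s) <= 2 for v in s)

# Example: a 4-cycle 0-1-2-3 with chord 0-2 (a chordal graph).
adj = {
    0: {1, 2, 3},
    1: {0, 2},
    2: {0, 1, 3},
    3: {0, 2},
}
```

Here the full vertex set is not a co-3-plex (vertex 0 has induced degree 3), while the induced path {1, 2, 3} is.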

No quantum advantage implies improved bounds and classical algorithms for the binary paint shop problem

from arXiv: Data Structures and Algorithms

Authors: Mark Goh, Lara Caroline Pereira dos Santos, Matthias Sperl

The binary paint shop problem (BPSP) is an APX-hard optimization problem in which, given n car models that occur twice in a sequence of length 2n, the goal is to find a colouring sequence such that the two occurrences of each model are painted differently, while minimizing the number of times the paint is swapped along the sequence. A recent classical heuristic, known as the recursive star greedy (RSG) algorithm, is conjectured to achieve an expected paint swap ratio of 0.361, thereby outperforming the QAOA with circuit depth p = 7. Since the performance of the QAOA with logarithmic circuit depth is instance independent, the average paint swap ratio is upper-bounded by the QAOA, which numerical evidence suggests is approximately between 0.265 and 0.282. To provide hardware-relevant comparisons, we additionally implement the BPSP on a D-Wave Quantum Annealer Advantage 2, obtaining a minimum paint swap ratio of 0.320. Given that the QAOA with logarithmic circuit depth does not exhibit quantum advantage for sparse optimization problems such as the BPSP, this implies the existence of a classical algorithm that surpasses both the RSG algorithm and logarithmic depth QAOA. We provide numerical evidence that the Mean-Field Approximate Optimization Algorithm (MF-AOA) is one such algorithm that beats all known classical heuristics and quantum algorithms to date with a paint swap ratio of approximately 0.2799.
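To make the BPSP objective concrete, here is a minimal sketch of the two ingredients the abstract describes: checking that a coloring paints the two occurrences of each model differently, and counting paint swaps along the sequence. The representation and names are mine, not the paper's:

```python
def paint_swaps(colors):
    """Number of adjacent positions where the paint color changes."""
    return sum(1 for a, b in zip(colors, colors[1:]) if a != b)

def is_valid(sequence, colors):
    """Each of the n car models occurs twice in `sequence`; a valid
    coloring paints the two occurrences of every model differently."""
    seen = {}
    for model, c in zip(sequence, colors):
        if model in seen and seen[model] == c:
            return False
        seen[model] = c
    return True

# A sequence of length 2n for n = 3 models, with a binary coloring.
seq = ['a', 'b', 'a', 'c', 'b', 'c']
cols = [0, 0, 1, 1, 1, 0]  # valid: each model gets both colors; 2 swaps
```

The heuristics compared in the paper all minimize `paint_swaps` over valid colorings; the paint swap ratio normalizes this count by the number of models.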

Secretary, Prophet, and Stochastic Probing via Big-Decisions-First

from arXiv: Data Structures and Algorithms

Authors: Aviad Rubinstein, Sahil Singla

We revisit three fundamental problems in algorithms under uncertainty: the Secretary Problem, Prophet Inequality, and Stochastic Probing, each subject to general downward-closed constraints. When elements have binary values, all three problems admit a tight $\tilde{\Theta}(\log n)$-factor approximation guarantee. For general (non-binary) values, however, the best known algorithms lose an additional $\log n$ factor when discretizing to binary values, leaving a quadratic gap of $\tilde{\Theta}(\log n)$ vs. $\tilde{\Theta}(\log^2 n)$. We resolve this quadratic gap for all three problems, showing $\tilde{\Omega}(\log^2 n)$-hardness for two of them and an $O(\log n)$-approximation algorithm for the third. While the technical details differ across settings, and between algorithmic and hardness proofs, all our results stem from a single core observation, which we call the Big-Decisions-First Principle: under uncertainty, it is better to resolve high-stakes (large-value) decisions early.

A Unified Framework for Analysis of Randomized Greedy Matching Algorithms

from arXiv: Data Structures and Algorithms

Authors: Mahsa Derakhshan, Tao Yu

Randomized greedy algorithms form one of the simplest yet most effective approaches for computing approximate matchings in graphs. In this paper, we focus on the class of vertex-iterative (VI) randomized greedy matching algorithms, which process the vertices of a graph $G=(V,E)$ in some order $π$ and, for each vertex $v$, greedily match it to the first available neighbor according to a preference order $σ(v)$. Various VI algorithms have been studied, each corresponding to a different distribution over $π$ and $σ(v)$. We develop a unified framework for analyzing this family of algorithms and use it to obtain improved approximation ratios for Ranking and FRanking, the state-of-the-art randomized greedy algorithms for the random-order and adversarial-order settings, respectively. In Ranking, the decision order is drawn uniformly at random and used as the common preference order, whereas FRanking uses an adversarial decision order and a uniformly random preference order shared by all vertices. We obtain an approximation ratio of $0.560$ for Ranking, improving on the $0.5469$ bound of Derakhshan et al. [SODA 2026]. For FRanking, we obtain a ratio of $0.539$, improving on the $0.521$ bound of Huang et al. [JACM 2020]. These results also imply state-of-the-art approximation ratios for oblivious matching and fully online matching problems on general graphs. Our analysis framework also enables us to prove improved approximation ratios for graphs with no short odd cycles. Such graphs form an intermediate class between general graphs and bipartite graphs. In particular, we show that Ranking is at least $0.570$-competitive for graphs that are both triangle-free and pentagon-free. For graphs whose shortest odd cycle has length at least $129$, we prove that Ranking is at least $0.615$-competitive.
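The vertex-iterative template described above is short enough to state in code. Below is a sketch of Ranking as the abstract describes it: one uniformly random order over the vertices serves as both the decision order and the common preference order. The adjacency representation and function name are my own:

```python
import random

def ranking(adj, rng=random):
    """Vertex-iterative greedy matching (Ranking): draw one uniform order
    over the vertices, process them in that order, and match each unmatched
    vertex to its unmatched neighbor of best (smallest) rank."""
    order = list(adj)
    rng.shuffle(order)
    rank = {v: i for i, v in enumerate(order)}
    matched = {}
    for v in order:
        if v in matched:
            continue
        free = [u for u in adj[v] if u not in matched]
        if free:
            u = min(free, key=rank.__getitem__)  # preferred = earliest in order
            matched[v] = u
            matched[u] = v
    # Deduplicate the two directions of each matched pair.
    return {frozenset((v, u)) for v, u in matched.items()}

# Example: a triangle 0-1-2 with pendant vertex 3 attached to 2.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
```

FRanking differs only in that the decision order is adversarial while the shared preference order is drawn uniformly at random.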

Near-Optimal Parallel Approximate Counting via Sampling

from arXiv: Data Structures and Algorithms

Authors: David G. Harris, Vladimir Kolmogorov, Hongyang Liu, Yitong Yin, Yiyao Zhang

The computational equivalence between approximate counting and sampling is well established for polynomial-time algorithms. The most efficient general reduction from counting to sampling is achieved via simulated annealing, where the counting problem is formulated in terms of estimating the ratio $Q={Z(β_{\max})}/{Z(β_{\min})}$ between partition functions $Z(β)=\sum_{x\in Ω} \exp(βH(x))$ of Gibbs distributions $μ_β$ over $Ω$ with Hamiltonian $H$, given access to a sampling oracle that produces samples from $μ_β$ for $β\in [β_{\min}, β_{\max}]$. The best bound achieved by known annealing algorithms with relative error $\varepsilon$ is $O(q \log h / \varepsilon^2)$, where $q, h$ are parameters which respectively bound $\ln Q$ and $H$. However, all known algorithms attaining this near-optimal complexity are inherently sequential, or *adaptive*: the queried parameters $β$ depend on previous samples. We develop a simple non-adaptive algorithm for approximate counting using $O(q \log^2 h / \varepsilon^2)$ samples, as well as an algorithm that achieves $O(q \log h / \varepsilon^2)$ samples with just two rounds of adaptivity, matching the best sample complexity of sequential algorithms. These algorithms naturally give rise to work-efficient parallel (RNC) counting algorithms. We discuss applications to RNC counting algorithms for several classic models, including the anti-ferromagnetic 2-spin, monomer-dimer and ferromagnetic Ising models.
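The annealing reduction described above telescopes the ratio $Q$ across a schedule of inverse temperatures, since each consecutive ratio satisfies $Z(\beta')/Z(\beta) = \mathbb{E}_{x \sim \mu_\beta}[\exp((\beta'-\beta)H(x))]$. A toy sketch of this estimator with a perfect sampler over a tiny explicit state space (the schedule, Hamiltonian, and function names here are illustrative, not the paper's):

```python
import math
import random

def estimate_Q(H, omega, betas, samples_per_step=20000, rng=random):
    """Telescoping estimate of Q = Z(beta_max) / Z(beta_min), where
    Z(beta) = sum_{x in omega} exp(beta * H(x)), using the identity
    Z(b') / Z(b) = E_{x ~ mu_b}[ exp((b' - b) * H(x)) ].
    `omega` is a small explicit state space, so mu_b can be sampled exactly;
    the real reduction only needs a sampling oracle for each mu_b."""
    def sample(beta):
        weights = [math.exp(beta * H(x)) for x in omega]
        return rng.choices(omega, weights=weights, k=samples_per_step)

    q = 1.0
    for b, b2 in zip(betas, betas[1:]):
        xs = sample(b)
        q *= sum(math.exp((b2 - b) * H(x)) for x in xs) / len(xs)
    return q

# Toy Hamiltonian H(x) = x on omega = {0, 1, 2, 3}.
H = lambda x: x
omega = [0, 1, 2, 3]
betas = [0.0, 0.25, 0.5]
est = estimate_Q(H, omega, betas)
true_Q = sum(math.exp(0.5 * x) for x in omega) / len(omega)  # Z(0.5) / Z(0)
```

This sketch queries the temperatures adaptively only in the trivial sense of a fixed schedule; the point of the paper is that a non-adaptive (or two-round-adaptive) choice of schedule suffices, which is what enables parallelization.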

Wednesday, April 01

Quantum computing bombshells that are not April Fools

from Scott Aaronson

For those of you who haven’t seen, there were actually two “bombshell” QC announcements this week. One, from Caltech, including friend-of-the-blog John Preskill, showed how to do quantum fault-tolerance with lower overhead than was previously known, by using high-rate codes, which could work for example in neutral-atom architectures (or possibly other architectures that allow nonlocal operations, like trapped ions). The second bombshell, from Google, gave a lower-overhead implementation of Shor’s algorithm to break 256-bit elliptic curve cryptography.

Notably, out of an abundance of caution, the Google team chose to “publish” its result via a cryptographic zero-knowledge proof that their circuit exists (so, without revealing the details to attackers). This is the first time I’ve ever seen a new mathematical result actually announced that way, although I understand that there’s precedent in the 1500s, when mathematicians would (for example) prove their ability to solve quartic equations by challenging their rivals to duels. I’m not sure how much it will actually help, as once other groups know that a smaller circuit exists, it might be only a short time until they’re able to find it as well.

Neither of these results change the basic principles of QC that we’ve known for decades, but they do change the numbers.

When you put both of them together, Bitcoin signatures for example certainly look vulnerable to quantum attack earlier than was previously known!  In particular, the Caltech group estimates that a mere 25,000 physical qubits might suffice for this, where a year ago the best estimates were in the millions. How much time will this save — maybe a year?  Subtracting, of course, off a number of years that no one knows.

In any case, these results provide an even stronger impetus for people to upgrade now to quantum-resistant cryptography.  They—meaning you, if relevant—should really get on that!

When I got an early heads-up about these results—especially the Google team’s choice to “publish” via a zero-knowledge proof—I thought of Frisch and Peierls, calculating how much U-235 was needed for a chain reaction in 1940, but not publishing it, even though the latest results on nuclear fission had been openly published just the year prior. Will we, in quantum computing, also soon cross that threshold? But I got strong pushback on that analogy from the cryptography and cybersecurity people who I most respect. They said: we have decades of experience with this, and the answer is that you publish. And, they said, if publishing causes people still using quantum-vulnerable systems to crap their pants … well, maybe that’s what needs to happen right now.

Naturally, journalists have been hounding me for comments, though it was the worst possible week, when I needed to host like four separate visitors in Austin. I hope this post helps! Please feel free to ask questions or post further details in the comments.

And now, with no time for this blog post to leaven and rise, I need to go home for my family’s Seder. Happy Passover!

By Scott