Last Update

OPML feed of all feeds.

Subscribe to the Atom or RSS feed to stay up to date.

Thank you to arXiv for use of its open access interoperability.

Note: the date of arXiv entries announced right after publication holidays might incorrectly show up as the date of the publication holiday itself. This is due to our ad hoc method of inferring announcement dates, which are not returned by the arXiv API.

Powered by Pluto.

Source on GitHub.

Maintained by Nima Anari, Arnab Bhattacharyya, Gautam Kamath.

Theory of Computing Report

Monday, February 23

Maximum Principles

from Ben Recht

A crash course on optimal control

This is a live blog of Lecture 4 of my graduate seminar “Feedback, Learning, and Adaptation.” A table of contents is here.

In machine learning and artificial intelligence classes, everyone treats control as if it’s a special kind of optimization problem. You write down a model for the system, you write down your goals for the system, and then you run PPO to find the optimal feedback controller.

This sounds pretty easy, to be honest. Why would we need all of these complex textbooks and Laplace transforms if we can just write down the cost function and gradient descent our way to optimal actions? I’m going to try to answer that question in the next lecture. But today we’ll review all of optimal control.

To make the leap from feedback design to policy optimization, you have to believe in state. In dynamical systems, the state is some vector that completely determines the fate of the system to be steered. If we know this vector and our input, we completely specify what happens in the next time period. More precisely, we assume that the next state of the system depends only on the current state, the current input, and some noise:

state_next = dynamics_model(state, input, noise)

This seems perfectly reasonable, up to the part where we assume that we can measure this state vector. Let’s just assume we can for now.
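To make that one-liner concrete, here's a minimal, entirely hypothetical instance in Python: the state is a position-velocity pair and the dynamics are linear. None of the specific numbers matter; the point is only that the next state is a function of the current state, the input, and noise.

```python
import numpy as np

rng = np.random.default_rng(0)

def dynamics_model(state, u, noise):
    """One step of a hypothetical discrete-time linear system:
    next state = A @ state + B @ u + noise."""
    A = np.array([[1.0, 0.1],
                  [0.0, 1.0]])   # position integrates velocity
    B = np.array([[0.0],
                  [0.1]])        # input pushes on the velocity
    return A @ state + B @ u + noise

state = np.array([0.0, 0.0])
u = np.array([1.0])
noise = 0.01 * rng.standard_normal(2)
state_next = dynamics_model(state, u, noise)
```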

Next, we need to make our state do something. We have some goal in mind for the system: “go over there and make me coffee, but don’t run over my dog.” We have to write this goal as a numerical function. Convention decomposes this cost function into a sum of costs incurred at each point in time. In my coffee example, the cost is high every time I don’t have a coffee in my hand, even higher if the dog is killed, and zero once I am sipping a well-brewed cup. You can imagine how to generalize this. Perhaps you penalize the amount of control effort the feedback is exerting. Or you could penalize the distance between the current state and a desired state. There are lots of costs you can write down.
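A minimal sketch of that additive-cost convention, with a made-up stage cost combining the last two examples above (distance to a desired state plus a penalty on control effort):

```python
import numpy as np

def stage_cost(state, u, target):
    # cost incurred at one point in time: squared distance to the
    # desired state, plus a small penalty on control effort
    return np.sum((state - target) ** 2) + 0.1 * np.sum(u ** 2)

def total_cost(states, inputs, target):
    # convention: the cost of a trajectory is the sum of stage costs
    return sum(stage_cost(x, u, target) for x, u in zip(states, inputs))

target = np.array([1.0, 0.0])
states = [np.array([0.0, 0.0]), np.array([0.5, 0.0])]
inputs = [np.array([1.0]), np.array([1.0])]
J = total_cost(states, inputs, target)
```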

If you can pose your cost this way, there’s a generic solution for the optimal feedback policy: dynamic programming. Bellman’s masterful observation is that optimization problems of this form can be solved by running backward in time from the end of time to the present. The classic example everyone gives is shortest paths. If we have an optimal path from A to B, and Albuquerque is on the optimal path, then you have also found the optimal path from Albuquerque to B. You can work your way backwards from the last part of your path to the beginning, doing clever bookkeeping in what we call “cost-to-go” functions.
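Here is the backward recursion in miniature, on a made-up problem with four discrete states and two actions. The array `V` is exactly the "cost-to-go" bookkeeping: at each step backward in time, the best action from a state is the one minimizing stage cost plus the cost-to-go of wherever it lands.

```python
import numpy as np

n_states, n_actions, horizon = 4, 2, 5
# hypothetical deterministic transitions: next_state[s, a]
next_state = np.array([[1, 2], [3, 2], [3, 0], [3, 3]])
# hypothetical stage costs: cost[s, a]; state 3 is a free "goal" state
cost = np.array([[1.0, 4.0], [2.0, 1.0], [1.0, 6.0], [0.0, 0.0]])

V = np.zeros(n_states)                 # cost-to-go at the final time
policy = np.zeros((horizon, n_states), dtype=int)
for t in reversed(range(horizon)):     # run backward in time
    Q = cost + V[next_state]           # Q[s, a] = stage cost + cost-to-go
    policy[t] = np.argmin(Q, axis=1)   # best action from each state
    V = Q.min(axis=1)                  # updated cost-to-go
```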

Dynamic programming is beautiful. It admits an amazingly elegant solution for linear systems with quadratic costs, with the feedback rule a simple linear function of the current state. It has a reasonably simple implementation when you have a few discrete states and actions. For everything else, dynamic programming is effectively not implementable. It becomes a meta-algorithm we all aspire to, and we end up throwing messy heuristics at the problem, be they gridded policy iteration or deep reinforcement learning, to approximate the optimal control.
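In that linear-quadratic case, the dynamic program collapses to the Riccati recursion, and the resulting policy really is linear feedback. A minimal sketch, on the same hypothetical double-integrator dynamics as before (the matrices and iteration count are illustrative, not canonical):

```python
import numpy as np

A = np.array([[1.0, 0.1], [0.0, 1.0]])   # hypothetical double integrator
B = np.array([[0.0], [0.1]])
Q = np.eye(2)                             # quadratic state cost x' Q x
R = np.array([[0.1]])                     # quadratic control cost u' R u

P = Q.copy()                              # cost-to-go is quadratic: x' P x
for _ in range(50):                       # backward Riccati recursion
    K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
    P = Q + A.T @ P @ (A - B @ K)
# near-stationary feedback gain: the optimal input is u = -K @ x
```

The loop is just backward dynamic programming specialized to quadratics: the cost-to-go stays quadratic at every step, so all the bookkeeping lives in the matrix `P`.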

Perhaps the most important thing about optimal control is how it changes control design. Whereas PID control fixes the form of the controller and asks the designer to tune the parameters, optimal control fixes the types of optimization problems you can pose and asks the designer to think about how to model their problem so that one of the tractable models applies. The control engineer shoehorns their system into a corner where the optimization problem is solvable using dynamic programming, a Hamiltonian method, or some other numerical solver. In the language used by reinforcement learning people, optimal control design is “reward shaping.” You find an optimization problem that’s close enough to what you really want.

Still, optimal control feels both easier to understand and more widely applicable than the sorts of simple controllers like PID we’ve been covering so far this semester. In optimal control, you specify what you want the controller to do and let computers do the hard part of finding the parameters for you. Indeed, we’ll see today that you can often pose PID control problems as optimal control problems. If you find it more intuitive to tune cost functions than control gains, this is a powerful restatement of the PID control problem.

But optimal control buries a lot of what happens when we close the loop. It doesn’t make it clear why feedback induces the behaviors we see in the world. Though there are people who work on “inverse optimal control,” that problem is too ill-posed to provide insights about natural systems. More often than not, no single optimization problem describes any interesting system.

It’s also very hard to incorporate uncertainty into optimal control. We showed last time that PID controllers worked for rather underspecified systems. By contrast, optimal control wants precisely specified systems. But what happens when the dynamics models are wrong? What if you have multiple design goals and not just a single cost to optimize? What if you can’t actually solve the optimization problem? What if you can’t measure the state? We’ll talk about these various fun scenarios next time. But the spoiler alert is that we get stuck. While optimal control is accessible and powerful, it is merely one tool in the toolbox for designing and understanding feedback systems.

The Complexity of Sparse Win-Lose Bimatrix Games

from arXiv: Computational Complexity

Authors: Eleni Batziou, John Fearnley, Abheek Ghosh, Rahul Savani

We prove that computing an $ε$-approximate Nash equilibrium of a win-lose bimatrix game with constant sparsity is PPAD-hard for inverse-polynomial $ε$. Our result holds for 3-sparse games, which is tight given that 2-sparse win-lose bimatrix games can be solved in polynomial time.

Complexity lower bounds for succinct binary structures of bounded clique-width with restrictions

from arXiv: Computational Complexity

Authors: Colin Geniet, Aliénor Goubault-Larrecq, Kévin Perrot

We present a Rice-like complexity lower bound for any MSO-definable problem on binary structures succinctly encoded by circuits. This work extends the framework recently developed as a counterpoint to Courcelle's theorem for graphs encoded by circuits, in two interplaying directions: (1) by allowing multiple binary relations, and (2) by restricting the interpretation of new symbols. Depending on the pair of an MSO problem $ψ$ and an MSO restriction $χ$, the problem is proven to be NP-hard or coNP-hard or P-hard, as long as $ψ$ is non-trivial on structures satisfying $χ$ with bounded clique-width. Indeed, there are P-complete problems (for logspace reductions) in our extended context. Finally, we strengthen a previous result on the necessity to parameterize the notion of non-triviality, hence supporting the choice of clique-width.

Policy Gradient Algorithms in Average-Reward Multichain MDPs

from arXiv: Computational Complexity

Authors: Jongmin Lee, Ernest K. Ryu

While there is an extensive body of research analyzing policy gradient methods for discounted cumulative-reward MDPs, prior work on policy gradient methods for average-reward MDPs has been limited, with most existing results restricted to ergodic or unichain settings. In this work, we first establish a policy gradient theorem for average-reward multichain MDPs based on the invariance of the classification of recurrent and transient states. Building on this foundation, we develop refined analyses and obtain a collection of convergence and sample-complexity results that advance the understanding of this setting. In particular, we show that the proposed $α$-clipped policy mirror ascent algorithm attains an $ε$-optimal policy with respect to positive policies.

Convergent Gate Elimination and Constructive Circuit Lower Bounds

from arXiv: Computational Complexity

Authors: Marco Carmosino, Ngu Dang, Tim Jackman

Towards better understanding of gate elimination, the only method known that can prove complexity lower bounds for explicit functions against unrestricted Boolean circuits, this work contributes: (1) formalizing circuit simplifications as a convergent term graph rewriting system and (2) giving a simple and constructive proof of a classical lower bound using this system. First, we show that circuit simplification is a convergent term graph rewriting system over the DeMorgan and $\{\land, \lor, \oplus\}$ bases. We define local rewriting rules from Boolean identities such that every simplification sequence yields an identical final result (up to circuit isomorphism or bisimulation). Convergence enables rigorous reasoning about structural properties of simplified circuits without dependence on the order of simplification. Then, we show that there is \emph{no similar} convergent formalization of circuit simplification over the $U_2$ and $B_2$ bases. Then, we use our simplification system to give a constructive circuit lower bound, generalizing Schnorr's classical result that the XOR function requires $3(n - 1)$ gates to compute in the DeMorgan basis. A constructive lower bound $f \not\in C$ gives an algorithm (called a "refuter") that efficiently finds counter-examples for every $C$-circuit trying to compute the function $f$. Chen, Jin, Santhanam, and Williams showed that constructivity plays a central role in many longstanding open problems about complexity theory (FOCS 2021), so it is natural to ask for constructive circuit lower bounds from gate elimination arguments. This demonstrates how using convergent simplification can lead to shorter and more modular proofs of circuit lower bounds. Furthermore, until this work, no constructive lower bound had been proved via gate elimination.

Hilbert's Nullstellensatz is in the Counting Hierarchy

from arXiv: Computational Complexity

Authors: Robert Andrews, Abhibhav Garg, Éric Schost

We show that Hilbert's Nullstellensatz, the problem of deciding if a system of multivariate polynomial equations has a solution in the algebraic closure of the underlying field, lies in the counting hierarchy. More generally, we show that the number of solutions to a system of equations can be computed in polynomial time with oracle access to the counting hierarchy. Our results hold in particular for polynomials with coefficients in either the rational numbers or a finite field. Previously, the best-known bounds on the complexities of these problems were PSPACE and FPSPACE, respectively. Our main technical contribution is the construction of a uniform family of constant-depth arithmetic circuits that compute the multivariate resultant.

Euclidean Noncrossing Steiner Spanners of Nearly Optimal Sparsity

from arXiv: Computational Geometry

Authors: Sujoy Bhore, Sándor Kisfaludi-Bak, Lazar Milenković, Csaba D. Tóth, Karol Węgrzycki, Sampson Wong

A Euclidean noncrossing Steiner $(1+ε)$-spanner for a point set $P\subset\mathbb{R}^2$ is a planar straight-line graph that, for any two points $a, b \in P$, contains a path whose length is at most $1+ε$ times the Euclidean distance between $a$ and $b$. We construct a Euclidean noncrossing Steiner $(1+ε)$-spanner with $O(n/ε^{3/2})$ edges for any set of $n$ points in the plane. This result improves upon the previous best upper bound of $O(n/ε^{4})$ obtained nearly three decades ago. We also establish an almost matching lower bound: There exist $n$ points in the plane for which any Euclidean noncrossing Steiner $(1+ε)$-spanner has $Ω_μ(n/ε^{3/2-μ})$ edges for any $μ>0$. Our lower bound uses recent generalizations of the Szemerédi-Trotter theorem to disk-tube incidences in geometric measure theory.

Unifying Formal Explanations: A Complexity-Theoretic Perspective

from arXiv: Data Structures and Algorithms

Authors: Shahaf Bassan, Xuanxiang Huang, Guy Katz

Previous work has explored the computational complexity of deriving two fundamental types of explanations for ML model predictions: (1) *sufficient reasons*, which are subsets of input features that, when fixed, determine a prediction, and (2) *contrastive reasons*, which are subsets of input features that, when modified, alter a prediction. Prior studies have examined these explanations in different contexts, such as non-probabilistic versus probabilistic frameworks and local versus global settings. In this study, we introduce a unified framework for analyzing these explanations, demonstrating that they can all be characterized through the minimization of a unified probabilistic value function. We then prove that the complexity of these computations is influenced by three key properties of the value function: (1) *monotonicity*, (2) *submodularity*, and (3) *supermodularity* - which are three fundamental properties in *combinatorial optimization*. Our findings uncover some counterintuitive results regarding the nature of these properties within the explanation settings examined. For instance, although the *local* value functions do not exhibit monotonicity or submodularity/supermodularity whatsoever, we demonstrate that the *global* value functions do possess these properties. This distinction enables us to prove a series of novel polynomial-time results for computing various explanations with provable guarantees in the global explainability setting, across a range of ML models that span the interpretability spectrum, such as neural networks, decision trees, and tree ensembles. In contrast, we show that even highly simplified versions of these explanations become NP-hard to compute in the corresponding local explainability setting.

Improved Algorithms for Clustering with Noisy Distance Oracles

from arXiv: Data Structures and Algorithms

Authors: Pinki Pradhan, Anup Bhattacharya, Ragesh Jaiswal

Bateni et al. have recently introduced the weak-strong distance oracle model to study clustering problems in settings with limited distance information. Given query access to the strong-oracle and weak-oracle in the weak-strong oracle model, the authors design approximation algorithms for $k$-means and $k$-center clustering problems. In this work, we design algorithms with improved guarantees for $k$-means and $k$-center clustering problems in the weak-strong oracle model. The $k$-means++ algorithm is routinely used to solve $k$-means in settings where complete distance information is available. One of the main contributions of this work is to show that the $k$-means++ algorithm can be adapted to work in the weak-strong oracle model using only a small number of strong-oracle queries, which is the critical resource in this model. In particular, our $k$-means++ based algorithm gives a constant approximation for $k$-means and uses $O(k^2 \log^2{n})$ strong-oracle queries. This improves on the algorithm of Bateni et al. that uses $O(k^2 \log^4n \log^2 \log n)$ strong-oracle queries for a constant factor approximation of $k$-means. For the $k$-center problem, we give a simple ball-carving based $6(1 + ε)$-approximation algorithm that uses $O(k^3 \log^2{n} \log{\frac{\log{n}}ε})$ strong-oracle queries. This is an improvement over the $14(1 + ε)$-approximation algorithm of Bateni et al. that uses $O(k^2 \log^4{n} \log^2{\frac{\log{n}}ε})$ strong-oracle queries. To show the effectiveness of our algorithms, we perform empirical evaluations on real-world datasets and show that our algorithms significantly outperform the algorithms of Bateni et al.

Generating minimal redundant and maximal irredundant sets in incidence graphs

from arXiv: Data Structures and Algorithms

Authors: Emanuel Castelo, Jérémie Chalopin, Oscar Defrain, Simon Vilmin

It has been proved by Boros and Makino that there is no output-polynomial-time algorithm enumerating the minimal redundant sets or the maximal irredundant sets of a hypergraph, unless P=NP. The same question was left open for graphs, with only a few tractable cases known to date. In this paper, we focus on graph classes that capture incidence relations such as bipartite, co-bipartite, and split graphs. Concerning maximal irredundant sets, we show that the problem on co-bipartite graphs is as hard as in general graphs and tractable in split and strongly orderable graphs, the latter being a generalization of chordal bipartite graphs. As for minimal redundant sets enumeration, we first show that the problem is intractable in split and co-bipartite graphs, answering the aforementioned open question, and that it is tractable on $(C_3,C_5,C_6,C_8)$-free graphs, a class of graphs incomparable to strongly orderable graphs, and which also generalizes chordal bipartite graphs.

QPTAS for MWIS and finding large sparse induced subgraphs in graphs with few independent long holes

from arXiv: Data Structures and Algorithms

Authors: Édouard Bonnet, Jadwiga Czyżewska, Tomáš Masařík, Marcin Pilipczuk, Paweł Rzążewski

We present a quasipolynomial-time approximation scheme (QPTAS) for the Maximum Weight Independent Set (\textsc{MWIS}) problem in graphs with a bounded number of pairwise vertex-disjoint and non-adjacent long induced cycles. More formally, for every fixed $s$ and $t$, we show a QPTAS for \textsc{MWIS} in graphs that exclude $sC_t$ as an induced minor. Combining this with known results, we obtain a QPTAS for the problem of finding a largest induced subgraph of bounded treewidth with given hereditary property definable in Counting Monadic Second Order Logic, in the same classes of graphs. This is a step towards a conjecture of Gartland and Lokshtanov which asserts that for any planar graph $H$, graphs that exclude $H$ as an induced minor admit a polynomial-time algorithm for the latter problem. This conjecture is notoriously open and even its weaker variants are confirmed only for very restricted graphs $H$.

Non-Stationary Online Resource Allocation: Learning from a Single Sample

from arXiv: Data Structures and Algorithms

Authors: Yiding Feng, Jiashuo Jiang, Yige Wang

We study online resource allocation under non-stationary demand with a minimum offline data requirement. In this problem, a decision-maker must allocate multiple types of resources to sequentially arriving queries over a finite horizon. Each query belongs to a finite set of types with fixed resource consumption and a stochastic reward drawn from an unknown, type-specific distribution. Critically, the environment exhibits arbitrary non-stationarity -- arrival distributions may shift unpredictably -- while the algorithm requires only one historical sample per period to operate effectively. We distinguish two settings based on sample informativeness: (i) reward-observed samples containing both query type and reward realization, and (ii) the more challenging type-only samples revealing only query type information. We propose a novel type-dependent quantile-based meta-policy that decouples the problem into modular components: reward distribution estimation, optimization of target service probabilities via fluid relaxation, and real-time decisions through dynamic acceptance thresholds. For reward-observed samples, our static threshold policy achieves $\tilde{O}(\sqrt{T})$ regret. For type-only samples, we first establish that sublinear regret is impossible without additional structure; under a mild minimum-arrival-probability assumption, we design both a partially adaptive policy attaining the same $\tilde{O}(\sqrt{T})$ bound and, more significantly, a fully adaptive resolving policy with careful rounding that achieves the first poly-logarithmic regret guarantee of $O((\log T)^3)$ for non-stationary multi-resource allocation. Our framework advances prior work by operating with minimal offline data (one sample per period), handling arbitrary non-stationarity without variation-budget assumptions, and supporting multiple resource constraints.

Optimal Competitive Ratio of Two-sided Online Bipartite Matching

from arXiv: Data Structures and Algorithms

Authors: Zhihao Gavin Tang

We establish an optimal upper bound (negative result) of $\sim 0.526$ on the competitive ratio of the fractional version of online bipartite matching with two-sided vertex arrivals, matching the lower bound (positive result) achieved by Wang and Wong (ICALP 2015), and Tang and Zhang (EC 2024).

Faster Parallel Batch-Dynamic Algorithms for Low Out-Degree Orientation

from arXiv: Data Structures and Algorithms

Authors: Guy Blelloch, Andrew Brady, Laxman Dhulipala, Jeremy Fineman, Kishen Gowda, Chase Hutton

A low out-degree orientation directs each edge of an undirected graph with the goal of minimizing the maximum out-degree of a vertex. In the parallel batch-dynamic setting, one can insert or delete batches of edges, and the goal is to process the entire batch in parallel with work per edge similar to that of a single sequential update and with span (or depth) for the entire batch that is polylogarithmic. In this paper we present faster parallel batch-dynamic algorithms for maintaining a low out-degree orientation of an undirected graph. All results herein achieve polylogarithmic depth, with high probability (whp); the focus of this paper is on minimizing the work, which varies across results. Our first result is the first parallel batch-dynamic algorithm to maintain an asymptotically optimal orientation with asymptotically optimal expected work bounds, in an amortized sense, improving over the prior best work bounds of Liu et al.~[SPAA~'22] by a logarithmic factor. Our second result is a $O(c \log n)$ orientation algorithm with expected worst-case $O(\sqrt{\log n})$ work per edge update, where $c$ is a known upper-bound on the arboricity of the graph. This matches the best-known sequential worst-case $O(c \log n)$ orientation algorithm given by Berglin and Brodal ~[Algorithmica~'18], albeit in expectation. Our final result is a $O(c + \log n)$-orientation algorithm with $O(\log^2 n)$ expected worst-case work per edge update. This algorithm significantly improves upon the recent result of Ghaffari and Koo~[SPAA~'25], which maintains a $O(c)$-orientation with $O(\log^9 n)$ worst-case work per edge whp.

Sunday, February 22

ChatGPT gets an easy math problem wrong (I got it right). How is that possible?

from Computational Complexity

A commenter on this post asked for me (or anyone) to solve the problem without AI:


A,B,C,D,E are digits (the poster said A could be 0 but I took A to be nonzero)  such that
ABCDE + BCDE + CDE + DE + E = 20320.
I solved it completely by hand. You can try it yourself or look at my solution which is here. I found seven solutions. 
I THEN asked ChatGPT to give me all solutions to see if I missed any.    I had it backwards. ChatGPT missed some solutions.  The entire exchange between chatty and me is here.
I asked it how it could get it wrong and how I can trust it. It responded to that and follow-up questions intelligently. 
Note that the problem is NOT a Putnam problem or anything of the sort. But I've read that AI can solve Putnam problems. SO, without an ax to grind I am curious: how come ChatGPT got the ABCDE problem wrong?
Speculative answers
1) The statement AI has solved IMO problems refers to an AI that was trained for Putnam problems, not the free ChatGPT. For more issues with the AI-IMO results see Terry Tao's comments here
2) ChatGPT is really good when the answer to the question is on the web someplace or can even be reconstructed from what's on the web. But if a problem, even an easy one, is new to the web, it can hallucinate (it didn't do that on my problem but it did on muffin problems) or miss some cases (it did that on my problem). 
3) It's only human. (It pretty much says this.)
4) The next version or even the paid version is better!  Lance ran it on his paid-for chatGPT and it wrote a program to brute force it and got all 7 solutions. 
5) I said ChatGPT got the problem wrong. If a student had submitted the solution it would get lots of partial credit since the solution took the right approach and only missed a few cases. So should I judge ChatGPT more harshly than a student? Yes. 
The question still stands: how come ChatGPT could not do this well-defined, simple math problem?
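For the curious, the brute-force check mentioned in point 4 takes only a few lines of Python. (This sketch is my own, not the program from the exchange.)

```python
# Brute-force search for the puzzle: digits A..E, with A nonzero,
# such that ABCDE + BCDE + CDE + DE + E = 20320.
from itertools import product

solutions = []
for A, B, C, D, E in product(range(10), repeat=5):
    if A == 0:
        continue  # the post takes A to be nonzero
    total = (10000*A + 1000*B + 100*C + 10*D + E) \
          + (1000*B + 100*C + 10*D + E) \
          + (100*C + 10*D + E) \
          + (10*D + E) \
          + E
    if total == 20320:
        solutions.append((A, B, C, D, E))

print(len(solutions))  # 7, matching the hand count
for sol in solutions:
    print(sol)
```

(Note that the five-term sum collapses to 10000A + 2000B + 300C + 40D + 5E, which is how the by-hand case analysis proceeds.)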



By gasarch

TR26-028 | Weak Zero-Knowledge and One-Way Functions | Rohit Chatterjee, Yunqi Li, Prashant Nalini Vasudevan

from ECCC Papers

We study the implications of the existence of weak Zero-Knowledge (ZK) protocols for worst-case hard languages. These are protocols that have completeness, soundness, and zero-knowledge errors (denoted $\epsilon_c$, $\epsilon_s$, and $\epsilon_z$, respectively) that might not be negligible. Under the assumption that there are worst-case hard languages in NP, we show the following: 1. If all languages in NP have NIZK proofs or arguments satisfying $ \epsilon_c+\epsilon_s+\epsilon_z < 1 $, then One-Way Functions (OWFs) exist. This covers all possible non-trivial values for these error rates. It additionally implies that if all languages in NP have such NIZK proofs and $\epsilon_c$ is negligible, then they also have NIZK proofs where all errors are negligible. Previously, these results were known under the more restrictive condition $ \epsilon_c+\sqrt{\epsilon_s}+\epsilon_z < 1 $ [Chakraborty et al., CRYPTO 2025]. 2. If all languages in NP have $k$-round public-coin ZK proofs or arguments satisfying $ \epsilon_c+\epsilon_s+(2k-1)\cdot\epsilon_z < 1 $, then OWFs exist. 3. If, for some constant $k$, all languages in NP have $k$-round public-coin ZK proofs or arguments satisfying $ \epsilon_c+\epsilon_s+k\cdot\epsilon_z < 1 $, then infinitely-often OWFs exist.

TR26-027 | Efficient quantum circuits for high-dimensional representations of SU(n) and Ramanujan quantum expanders | Siddhartha Jain, Vishnu Iyer, Stephen Jordan, Rolando Somma

from ECCC Papers

We present efficient quantum circuits that implement high-dimensional unitary irreducible representations (irreps) of SU(n), where n>=2 is constant. For dimension N and error ε, the number of quantum gates in our circuits is polynomial in log(N) and log(1/ε). Our construction relies on the Jordan-Schwinger representation, which allows us to realize irreps of SU(n) in the Hilbert space of n quantum harmonic oscillators. Together with a recent efficient quantum Hermite transform, which allows us to map the computational basis states to the eigenstates of the quantum harmonic oscillator, this allows us to implement these irreps efficiently. Our quantum circuits can be used to construct explicit Ramanujan quantum expanders, a longstanding open problem. They can also be used to fast-forward the evolution of certain quantum systems.

Friday, February 20

Updatez!

from Scott Aaronson

  1. The STOC’2026 accepted papers list is out. It seems to me that there’s an emperor’s bounty of amazing stuff this year. I felt especially gratified to see the paper on the determination of BusyBeaver(5) on the list, reflecting a broad view of what theory of computing is about.
  2. There’s a phenomenal profile of Henry Yuen in Quanta magazine. Henry is now one of the world leaders of quantum complexity theory, involved in breakthroughs like MIP*=RE and now pioneering the complexity theory of quantum states and unitary transformations (the main focus of this interview). I’m proud that Henry tells Quanta that he learned about the field in 2007 or 2008 from a blog called … what was it again? … Shtetl-Optimized? I’m also proud that I got to help mentor Henry when he was a PhD student of my wife Dana Moshkovitz at MIT. Before I read this Quanta profile, I didn’t even know the backstory about Henry’s parents surviving and fleeing the Cambodian genocide, or about Henry growing up working in his parents’ restaurant. Henry never brought any of that up!
  3. See Lance’s blog for an obituary of Joe Halpern, a pioneer of the branch of theoretical computer science that deals with reasoning about knowledge (e.g., the muddy children puzzle), who sadly passed away last week. I knew Prof. Halpern a bit when I was an undergrad at Cornell. He was a huge presence in the Cornell CS department who’ll be sorely missed.
  4. UT Austin has announced the formation of a School of Computing, which will bring together the CS department (where I work) with statistics, data science, and several other departments. Many of UT’s peer institutions have recently done the same. Naturally, I’m excited for what this says about the expanded role of computing at UT going forward. We’ll be looking to hire even more new faculty than we were before!
  5. When I glanced at the Chronicle of Higher Education to see what was new, I learned that researchers at OpenAI had proposed a technical solution, called “watermarking,” that might help tackle the crisis of students relying on AI to write all their papers … but that OpenAI had declined to deploy that solution. The piece strongly advocates a legislative mandate in favor of watermarking LLM outputs, and addresses some of the main counterarguments to that position.
  6. For those who can’t get enough podcasts of me, here are the ones I’ve done recently. Quantum: Science vs. Mythology on the Peggy Smedley Show. AI Alignment, Complexity Theory, and the Computability of Physics, on Alexander Chin’s Philosophy Podcast. And last but not least, What Is Quantum Computing? on the Robinson Erhardt Podcast.
  7. Also, here’s an article that quotes me entitled “Bitcoin needs a quantum upgrade. So why isn’t it happening?” Also, here’s a piece that interviews me in Investor’s Business Daily, entitled “Is quantum computing the next big tech shift?” (I have no say over these titles.)

By Scott

TR26-026 | When Hilbert approximates: A Strong Nullstellensatz for Approximate Polynomial Satisfiability | Sanyam Agarwal, Sagnik Dutta, Anurag Pandey, Himanshu Shukla

from ECCC Papers

Guo, Saxena, and Sinhababu (TOC'18, CCC'18) defined a natural, approximative analog of the polynomial system satisfiability problem, which they called approximate polynomial satisfiability (APS). They proved algebraic and geometric properties of it and showed an NP-hardness lower bound and a PSPACE upper bound for it. They further established how the problem naturally occurs in border complexity and Geometric complexity theory (GCT) and used the problem to construct hitting sets for $\overline{VP}$ in PSPACE, hence greatly mitigating the GCT chasm. The starting point of this work is the observation that Guo, Saxena, and Sinhababu's criterion for non-existence of approximative solution can be interpreted as an analog of Weak Hilbert's Nullstellensatz in the approximative setting. We extend their work by proving an analog of Strong Hilbert's Nullstellensatz in the approximative setting. Concretely, we give an algebraic criterion for containment between approximative solution sets defined by systems of polynomials. In fact, this characterization turns out to be equivalent to membership in the integral closure over a maximal ideal of a local subring of $\mathbb{C}(x_1,\ldots,x_n)$ determined by the given polynomials. In addition, we use our proof to provide a PSPACE algorithm for testing this containment, exponentially better than the EXPSPACE bounds for polynomial subalgebra membership testing and the polynomial integral closure membership testing, hence matching the complexity bound of Guo, Saxena, and Sinhababu's Weak Approximative Nullstellensatz.

TR26-025 | Beyond Bilinear Complexity: What Works and What Breaks with Many Modes? | Cornelius Brand, Radu Curticapean, Petteri Kaski, Baitian Li, Ian Orzel, Tim Seppelt, Jiaheng Wang

from ECCC Papers

The complexity of bilinear maps (equivalently, of $3$-mode tensors) has been studied extensively, most notably in the context of matrix multiplication. While circuit complexity and tensor rank coincide asymptotically for $3$-mode tensors, this correspondence breaks down for $d \geq 4$ modes. As a result, the complexity of $d$-mode tensors for larger fixed $d$ remains poorly understood, despite its relevance, e.g., in fine-grained complexity. Our paper explores this intermediate regime. First, we give a ``graph-theoretic'' proof of Strassen's $2\omega/3$ bound on the asymptotic rank exponent of $3$-mode tensors. Our proof directly generalizes to an upper bound of $(d-1)\omega/3$ for $d$-mode tensors. Using refined techniques available only for $d\geq 4$ modes, we improve this bound beyond the current state of the art for $\omega$. We also obtain a bound of $d/2+1$ on the asymptotic exponent of \emph{circuit complexity} of generic $d$-mode tensors and optimized bounds for $d \in \{4,5\}$. To the best of our knowledge, asymptotic circuit complexity (rather than rank) of tensors has not been studied before. To obtain a robust theory, we first ask whether low complexity of $T$ and $U$ implies low complexity of their Kronecker product $T \otimes U$. While this crucially holds for rank (and thus for circuit complexity in $3$ modes), we show that assumptions from fine-grained complexity rule out such a \emph{submultiplicativity} for the circuit complexity of tensors with many modes. In particular, assuming the Hyperclique Conjecture, this failure occurs already for $d=8$ modes. Nevertheless, we can salvage a restricted notion of submultiplicativity. From a technical perspective, our proofs heavily make use of the \emph{graph tensors} $T_H$, as employed by Christandl and Zuiddam ({\em Comput.~Complexity}~28~(2019)~27--56) and Christandl, Vrana and Zuiddam ({\em Comput.~Complexity}~28~(2019)~57--111), whose modes correspond to the vertices of undirected graphs $H$. 
We make the simple but conceptually crucial observation that Kronecker products $T_G \otimes T_H$ are isomorphic to $T_{G+H}$, and that $G$ and $H$ may also be \emph{fractional} graphs. By asymptotically converting generic tensors to specific graph tensors, we can use nontrivial results from algorithmic graph theory to study the rank and complexity of $d$-mode tensors for fixed $d$.

TR26-024 | Hilbert’s Nullstellensatz is in the Counting Hierarchy | Robert Andrews, Abhibhav Garg, Éric Schost

from ECCC Papers

We show that Hilbert's Nullstellensatz, the problem of deciding if a system of multivariate polynomial equations has a solution in the algebraic closure of the underlying field, lies in the counting hierarchy. More generally, we show that the number of solutions to a system of equations can be computed in polynomial time with oracle access to the counting hierarchy. Our results hold in particular for polynomials with coefficients in either the rational numbers or a finite field. Previously, the best-known bounds on the complexities of these problems were PSPACE and FPSPACE, respectively. Our main technical contribution is the construction of a uniform family of constant-depth arithmetic circuits that compute the multivariate resultant.

Pseudo-deterministic Quantum Algorithms

from arXiv: Computational Complexity

Authors: Hugo Aaronson, Tom Gur, Jiawei Li

We initiate a systematic study of pseudo-deterministic quantum algorithms. These are quantum algorithms that, for any input, output a canonical solution with high probability. Focusing on the query complexity model, our main contributions include the following complexity separations, which require new lower bound techniques specifically tailored to pseudo-determinism: - We exhibit a problem, Avoid One Encrypted String (AOES), whose classical randomized query complexity is $O(1)$ but is maximally hard for pseudo-deterministic quantum algorithms ($Ω(N)$ query complexity). - We exhibit a problem, Quantum-Locked Estimation (QL-Estimation), for which pseudo-deterministic quantum algorithms admit an exponential speed-up over classical pseudo-deterministic algorithms ($O(\log(N))$ vs. $Θ(\sqrt{N})$), while the randomized query complexity is $O(1)$. Complementing these separations, we show that for any total problem $R$, pseudo-deterministic quantum algorithms admit at most a quintic advantage over deterministic algorithms, i.e., $D(R) = \tilde O(psQ(R)^5)$. On the algorithmic side, we identify a class of quantum search problems that can be made pseudo-deterministic with small overhead, including Grover search, element distinctness, triangle finding, $k$-sum, and graph collision.

Provably Explaining Neural Additive Models

from arXiv: Computational Complexity

Authors: Shahaf Bassan, Yizhak Yisrael Elboher, Tobias Ladner, Volkan Şahin, Jan Kretinsky, Matthias Althoff, Guy Katz

Despite significant progress in post-hoc explanation methods for neural networks, many remain heuristic and lack provable guarantees. A key approach for obtaining explanations with provable guarantees is by identifying a cardinally-minimal subset of input features which by itself is provably sufficient to determine the prediction. However, for standard neural networks, this task is often computationally infeasible, as it demands a worst-case exponential number of verification queries in the number of input features, each of which is NP-hard. In this work, we show that for Neural Additive Models (NAMs), a recent and more interpretable neural network family, we can efficiently generate explanations with such guarantees. We present a new model-specific algorithm for NAMs that generates provably cardinally-minimal explanations using only a logarithmic number of verification queries in the number of input features, after a parallelized preprocessing step with logarithmic runtime in the required precision is applied to each small univariate NAM component. Our algorithm not only makes the task of obtaining cardinally-minimal explanations feasible, but even outperforms existing algorithms designed to find the relaxed variant of subset-minimal explanations - which may be larger and less informative but easier to compute - despite our algorithm solving a much more difficult task. Our experiments demonstrate that, compared to previous algorithms, our approach provides provably smaller explanations than existing works and substantially reduces the computation time. Moreover, we show that our generated provable explanations offer benefits that are unattainable by standard sampling-based techniques typically used to interpret NAMs.

Some Remarks on Marginal Code Languages

from arXiv: Computational Complexity

Authors: Stavros Konstantinidis

A prefix code L satisfies the condition that no word of L is a proper prefix of another word of L. Recently, Ko, Han and Salomaa relaxed this condition by allowing a word of L to be a proper prefix of at most k words of L, for some `margin' k, thus introducing the class of k-prefix-free languages, as well as the similar classes of k-suffix-free and k-infix-free languages. Here we unify the definitions of these three classes of languages into one uniform definition in two ways: via the method of partial orders and via the method of transducers. Thus, for any known class of code-related languages definable via the transducer method, one gets a marginal version of that class. Building on the techniques of Ko, Han and Salomaa, we discuss the \emph{uniform} satisfaction and maximality problems for marginal classes of languages.
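
As a concrete illustration of the relaxed condition (a hypothetical helper, not from the paper), a finite language is k-prefix-free when each word is a proper prefix of at most k other words, with k = 0 recovering classic prefix codes:

```python
def is_k_prefix_free(words, k):
    """True iff every word of the (finite) language is a proper prefix
    of at most k other words in it; k = 0 gives classic prefix codes."""
    ws = set(words)
    for w in ws:
        # v != w within a set already implies the prefix is proper
        count = sum(1 for v in ws if v != w and v.startswith(w))
        if count > k:
            return False
    return True
```

For example, {0, 01, 011} fails for k = 1 (the word 0 has two extensions) but passes for k = 2.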

Separations above TFNP from Sherali-Adams Lower Bounds

from arXiv: Computational Complexity

Authors: Anna Gal, Noah Fleming, Deniz Imrek, Christophe Marciot

Unlike in TFNP, for which there is an abundance of problems capturing natural existence principles which are incomparable (in the black-box setting), Kleinberg et al. [KKMP21] observed that many of the natural problems considered so far in the second level of the total function polynomial hierarchy (TF$Σ_2$) reduce to the Strong Avoid problem. In this work, we prove that the Linear Ordering Principle does not reduce to Strong Avoid in the black-box setting, exhibiting the first TF$Σ_2$ problem that lies outside of the class of problems reducible to Strong Avoid. The proof of our separation exploits a connection between total search problems in the polynomial hierarchy and proof complexity, recently developed by Fleming, Imrek, and Marciot [FIM25]. In particular, this implies that to show our separation, it suffices to show that there is no small proof of the Linear Ordering Principle in a $Σ_2$-variant of the Sherali-Adams proof system. To do so, we extend the classical pseudo-expectation method to the $Σ_2$ setting, showing that the existence of a $Σ_2$ pseudo-expectation precludes a $Σ_2$ Sherali-Adams proof. The main technical challenge is in proving the existence of such a pseudo-expectation; we manage to do so by solving a combinatorial covering problem about permutations. We also show that the extended pseudo-expectation bound implies that the Linear Ordering Principle cannot be reduced to any problem admitting a low-degree Sherali-Adams refutation.

On the complexity of covering points by disjoint segments and by guillotine cuts

from arXiv: Computational Geometry

Authors: Delia Garijo, Alberto Márquez, Rodrigo I. Silveira

We show that two geometric cover problems in the plane, related to covering points with disjoint line segments, are NP-complete. Given $n$ points in the plane and a value $k$, the first problem asks if all points can be covered by $k$ disjoint line segments; the second problem treats the analogous question for $k$ guillotine cuts.

Computational Hardness of Private Coreset

from arXiv: Data Structures and Algorithms

Authors: Badih Ghazi, Cristóbal Guzmán, Pritish Kamath, Alexander Knop, Ravi Kumar, Pasin Manurangsi

We study the problem of differentially private (DP) computation of coresets for the $k$-means objective. For a given input set of points, a coreset is another set of points such that the $k$-means objective for any candidate solution is preserved up to a multiplicative $(1 \pm α)$ factor (and some additive factor). We prove the first computational lower bounds for this problem. Specifically, assuming the existence of one-way functions, we show that no polynomial-time $(ε, 1/n^{ω(1)})$-DP algorithm can compute a coreset for $k$-means in the $\ell_\infty$-metric for some constant $α> 0$ (and some constant additive factor), even for $k=3$. For $k$-means in the Euclidean metric, we show a similar result but only for $α= Θ\left(1/d^2\right)$, where $d$ is the dimension.

Simultaneous Blackwell Approachability and Applications to Multiclass Omniprediction

from arXiv: Data Structures and Algorithms

Authors: Lunjia Hu, Kevin Tian, Chutong Yang

Omniprediction is a learning problem that requires suboptimality bounds for each of a family of losses $\mathcal{L}$ against a family of comparator predictors $\mathcal{C}$. We initiate the study of omniprediction in a multiclass setting, where the comparator family $\mathcal{C}$ may be infinite. Our main result is an extension of the recent binary omniprediction algorithm of [OKK25] to the multiclass setting, with sample complexity (in statistical settings) or regret horizon (in online settings) $\approx \varepsilon^{-(k+1)}$, for $\varepsilon$-omniprediction in a $k$-class prediction problem. En route to proving this result, we design a framework of potentially broader interest for solving Blackwell approachability problems where multiple sets must simultaneously be approached via coupled actions.

Informative Trains: A Memory-Efficient Journey to a Self-Stabilizing Leader Election Algorithm in Anonymous Graphs

from arXiv: Data Structures and Algorithms

Authors: Lelia Blin, Sylvain Gay, Isabella Ziccardi

We study the self-stabilizing leader election problem in anonymous $n$-node networks. Achieving self-stabilization with low space complexity is particularly challenging, and designing space-optimal leader election algorithms remains an open problem for general graphs. In deterministic settings, it is known that $Ω(\log \log n)$ bits of memory per node are necessary [Blin et al., Disc. Math. \& Theor. Comput. Sci., 2023], while in probabilistic settings the same lower bound holds for some values of $n$, but only for an unfair scheduler [Beauquier et al., PODC 1999]. Several deterministic and probabilistic protocols have been proposed in models ranging from the state model to population protocols. However, to the best of our knowledge, existing solutions either require $Ω(\log n)$ bits of memory per node for general worst-case graphs, or achieve low state complexity only under restricted network topologies such as rings, trees, or bounded-degree graphs. In this paper, we present a probabilistic self-stabilizing leader election algorithm for arbitrary anonymous networks that uses $O(\log \log n)$ bits of memory per node. Our algorithm operates in the state model under a synchronous scheduler and assumes knowledge of a global parameter $N = Θ(\log n)$. We show that, under our protocol, the system converges almost surely to a stable configuration with a unique leader and stabilizes within $O(\mathrm{poly}(n))$ rounds with high probability. To achieve $O(\log \log n)$ bits of memory, our algorithm keeps transmitting information after convergence, i.e. it does not satisfy the silence property. Moreover, like most works in the field, our algorithm does not provide explicit termination detection (i.e., nodes do not detect when the algorithm has converged).

Convergence Analysis of Two-Layer Neural Networks under Gaussian Input Masking

from arXiv: Data Structures and Algorithms

Authors: Afroditi Kolomvaki, Fangshuo Liao, Evan Dramko, Ziyun Guang, Anastasios Kyrillidis

We investigate the convergence guarantee of two-layer neural network training with Gaussian randomly masked inputs. This scenario corresponds to Gaussian dropout at the input level, or noisy input training common in sensor networks, privacy-preserving training, and federated learning, where each user may have access to partial or corrupted features. Using a Neural Tangent Kernel (NTK) analysis, we demonstrate that training a two-layer ReLU network with Gaussian randomly masked inputs achieves linear convergence up to an error region proportional to the mask's variance. A key technical contribution is resolving the randomness within the non-linear activation, a problem of independent interest.
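
The training setup can be mimicked in a few lines of NumPy: a two-layer ReLU network with fixed random output weights, trained by gradient descent on inputs multiplied elementwise by Gaussian noise. This is only a toy illustration of the setting (all sizes and the noise scale are arbitrary choices), not the paper's NTK analysis.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, m = 32, 5, 64                       # samples, input dim, hidden width
X = rng.normal(size=(n, d))
y = np.sin(X @ rng.normal(size=d))        # toy regression targets

W = rng.normal(size=(m, d)) / np.sqrt(d)          # trained hidden weights
a = rng.choice([-1.0, 1.0], size=m) / np.sqrt(m)  # fixed output weights

def forward(W, Xb):
    return np.maximum(Xb @ W.T, 0.0) @ a  # two-layer ReLU network

loss0 = np.mean((forward(W, X) - y) ** 2)
lr = 0.05
for _ in range(200):
    mask = 1.0 + 0.1 * rng.normal(size=X.shape)     # Gaussian input masking
    Xm = X * mask
    H = Xm @ W.T
    err = np.maximum(H, 0.0) @ a - y
    grad = (err[:, None] * (H > 0) * a).T @ Xm / n  # dL/dW up to a factor 2
    W -= lr * grad
loss = np.mean((forward(W, X) - y) ** 2)
```

In line with the abstract, one would expect the clean-input loss to decrease to an error floor governed by the mask's variance.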

Partial Optimality in the Preordering Problem

from arXiv: Data Structures and Algorithms

Authors: David Stein, Jannik Irmai, Bjoern Andres

Preordering is a generalization of clustering and partial ordering with applications in bioinformatics and social network analysis. Given a finite set $V$ and a value $c_{ab} \in \mathbb{R}$ for every ordered pair $ab$ of elements of $V$, the preordering problem asks for a preorder $\lesssim$ on $V$ that maximizes the sum of the values of those pairs $ab$ for which $a \lesssim b$. Building on the state of the art in solving this NP-hard problem partially, we contribute new partial optimality conditions and efficient algorithms for deciding these conditions. In experiments with real and synthetic data, these new conditions increase, in particular, the fraction of pairs $ab$ for which it is decided efficiently that $a \not\lesssim b$ in an optimal preorder.
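
On tiny instances the objective can be checked exactly by enumerating all preorders (reflexive, transitive relations); the brute force below is a hypothetical baseline for intuition, feasible only for three or so elements, not one of the paper's algorithms.

```python
from itertools import product

def best_preorder(c):
    """Brute-force the preordering objective on V = {0..n-1}:
    maximize the sum of c[a][b] over distinct pairs with a <= b
    in the preorder (the reflexive diagonal contributes nothing)."""
    n = len(c)
    off = [(a, b) for a in range(n) for b in range(n) if a != b]
    best, best_rel = float("-inf"), None
    for bits in product([False, True], repeat=len(off)):
        rel = [[a == b for b in range(n)] for a in range(n)]  # reflexive
        for (a, b), on in zip(off, bits):
            rel[a][b] = on
        if all(not (rel[a][b] and rel[b][d]) or rel[a][d]     # transitive
               for a in range(n) for b in range(n) for d in range(n)):
            val = sum(c[a][b] for a, b in off if rel[a][b])
            if val > best:
                best, best_rel = val, rel
    return best, best_rel
```

On two elements with values c[0][1] = 5 and c[1][0] = 1, the optimum is the full equivalence relation with value 6.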

Adaptive encodings for small and fast compressed suffix arrays

from arXiv: Data Structures and Algorithms

Authors: Diego Díaz-Domínguez, Veli Mäkinen

Compressed suffix arrays (CSAs) index large repetitive collections and are key in many text applications. The r-index and its derivatives combine the run-length Burrows-Wheeler Transform (BWT) with suffix array sampling to achieve space proportional to the number of equal-symbol runs in the BWT. While effective for near-identical strings, their size grows quickly as variation increases, since the number of BWT runs is sensitive to edits. Existing approaches typically trade space for query speed, or vice versa, limiting their practicality at large scale. We introduce variable-length blocking (VLB), an encoding technique for BWT-based CSAs that adapts the amount of indexing information to local compressibility. The BWT is recursively divided into blocks of at most w runs (a parameter) and organized into a tree. Compressible regions appear near the root and store little auxiliary data, while incompressible regions lie deeper and retain additional information to speed up access. Queries traverse a short root-to-leaf path followed by a small run scan. This strategy balances space and query speed by transferring bits saved in compressible areas to accelerate access in incompressible ones. Backward search relies on rank and successor queries over the BWT. We introduce a sampling technique that guarantees correctness only along valid backward-search states, reducing space without affecting query performance. We extend VLB to encode the subsampled r-index (sr-index). Experiments show that VLB-based techniques outperform the r-index and sr-index in query time, while retaining space close to that of the sr-index. Compared to the move data structure, VLB offers a more favorable space-time tradeoff.
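
The quantity driving the r-index's size, the number of equal-symbol BWT runs, is easy to compute for small strings. The naive rotation-sorting construction below is for illustration only, not how a production index is built:

```python
def bwt(s):
    """Burrows-Wheeler Transform via sorted rotations ($ as sentinel)."""
    s += "$"
    rots = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(r[-1] for r in rots)

def runs(t):
    """Number of maximal equal-symbol runs (the r in the r-index)."""
    return sum(1 for i, ch in enumerate(t) if i == 0 or ch != t[i - 1])
```

For example, bwt("banana") is "annb$aa", which has 5 runs; on highly repetitive collections the run count grows far more slowly than the text length.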

Offline green bin packing and its constrained variant

from arXiv: Data Structures and Algorithms

Authors: Mingyang Gong, Brendan Mumey

In this paper, we study the {\em green bin packing} (GBP) problem where $β\ge 0$ and $G \in [0, 1]$ are two given values as part of the input. The energy consumed by a bin is $\max \{0, β(x-G) \}$ where $x$ is the total size of the items packed into the bin. The GBP aims to pack all items into a set of unit-capacity bins so that the number of bins used plus the total energy consumption is minimized. When $β= 0$ or $G = 1$, GBP is reduced to the classic bin packing (BP) problem. In the {\em constrained green bin packing} (CGBP) problem, the objective is to minimize the number of bins used to pack all items while the total energy consumption does not exceed a given upper bound $U$. We present an APTAS and a $\frac 32$-approximation algorithm for both GBP and CGBP, where the ratio $\frac 32$ matches the lower bound of BP. Keywords: Green bin packing; constrained green bin packing; approximation scheme; offline algorithms
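
The objective is simple to state in code; the sketch below pairs the cost with a plain first-fit heuristic for illustration only (the paper's algorithms are an APTAS and a 3/2-approximation, not first fit):

```python
def gbp_cost(bins, beta, G):
    """Green bin packing objective: bins used plus total energy, where a
    bin with load x consumes max(0, beta * (x - G))."""
    return len(bins) + sum(max(0.0, beta * (sum(b) - G)) for b in bins)

def first_fit(items):
    """Simple heuristic: place each item in the first unit-capacity bin
    where it fits, opening a new bin otherwise."""
    bins = []
    for it in items:
        for b in bins:
            if sum(b) + it <= 1.0:
                b.append(it)
                break
        else:
            bins.append([it])
    return bins
```

With items [0.6, 0.6, 0.3], beta = 1 and G = 0.5, first fit opens two bins with loads 0.9 and 0.6, for cost 2 + 0.4 + 0.1 = 2.5; setting beta = 0 recovers the classic bin packing count.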

Thursday, February 19

Nash-convergence of Game Dynamics and Complexity

from arXiv: Computational Complexity

Authors: Oliver Biggar, Christos Papadimitriou, Georgios Piliouras

Does the failure of learning dynamics to converge globally to Nash equilibria stem from the geometry of the game or the complexity of computation? Previous impossibility results relied on game degeneracy, leaving open the case for generic, nondegenerate games. We resolve this by proving that while Nash-convergent dynamics theoretically exist for all nondegenerate games, computing them is likely intractable. We formulate the Impossibility Conjecture: if a locally efficient Nash-convergent dynamic exists for nondegenerate games, then $P=PPAD$. We validate this for three specific families of dynamics, showing their tractability would imply collapses such as $NP=RP$ or $CLS=PPAD$. En route, we settle the complexity of finding Nash equilibria of a given game that lie on a given affine subspace. Finally, we explain why the general conjecture remains open: we introduce a Proving Game to demonstrate that black-box reductions cannot distinguish between convergent and non-convergent dynamics in polynomial time. Our results suggest the barrier to Nash learning is not the non-existence of a vector field, but the intractability of computing it.

Improved Bounds for Discrete Voronoi Games

from arXiv: Computational Geometry

Authors: Mark de Berg, Geert van Wordragen

In the planar one-round discrete Voronoi game, two players $\mathcal{P}$ and $\mathcal{Q}$ compete over a set $V$ of $n$ voters represented by points in $\mathbb{R}^2$. First, $\mathcal{P}$ places a set $P$ of $k$ points, then $\mathcal{Q}$ places a set $Q$ of $\ell$ points, and then each voter $v\in V$ is won by the player who has placed a point closest to $v$. It is well known that if $k=\ell=1$, then $\mathcal{P}$ can always win $n/3$ voters and that this is worst-case optimal. We study the setting where $k>1$ and $\ell=1$. We present lower bounds on the number of voters that $\mathcal{P}$ can always win, which improve the existing bounds for all $k\geq 4$. As a by-product, we obtain improved bounds on small $\varepsilon$-nets for convex ranges. These results are for the $L_2$ metric. We also obtain lower bounds on the number of voters that $\mathcal{P}$ can always win when distances are measured in the $L_1$ metric.
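
Evaluating who wins which voters, given both placements, is a one-liner; the tie-breaking convention below (ties go to $\mathcal{Q}$) is an assumption for the sketch, as conventions vary:

```python
def voters_won_by_P(voters, P, Q):
    """Count voters strictly closer to some point of P than to every
    point of Q (ties awarded to Q in this sketch)."""
    def d2(u, v):  # squared Euclidean distance, avoids sqrt
        return (u[0] - v[0]) ** 2 + (u[1] - v[1]) ** 2
    return sum(1 for v in voters
               if min(d2(v, p) for p in P) < min(d2(v, q) for q in Q))
```

With voters at (0,0), (1,0), (2,0), P at the origin and Q at (2,0), P wins only the first voter (the middle one is tied).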

Ratio Covers of Convex Sets and Optimal Mixture Density Estimation

from arXiv: Computational Geometry

Authors: Spencer Compton, Gábor Lugosi, Jaouad Mourtada, Jian Qian, Nikita Zhivotovskiy

We study density estimation in Kullback-Leibler divergence: given an i.i.d. sample from an unknown density $p$, the goal is to construct an estimator $\widehat p$ such that $\mathrm{KL}(p,\widehat p)$ is small with high probability. We consider two settings involving a finite dictionary of $M$ densities: (i) model aggregation, where $p$ belongs to the dictionary, and (ii) convex aggregation (mixture density estimation), where $p$ is a mixture of densities from the dictionary. Crucially, we make no assumption on the base densities: their ratios may be unbounded and their supports may differ. For both problems, we identify the best possible high-probability guarantees in terms of the dictionary size, sample size, and confidence level. These optimal rates are higher than those achievable when density ratios are bounded by absolute constants; for mixture density estimation, they match existing lower bounds in the special case of discrete distributions. Our analysis of the mixture case hinges on two new covering results. First, we provide a sharp, distribution-free upper bound on the local Hellinger entropy of the class of mixtures of $M$ distributions. Second, we prove an optimal ratio covering theorem for convex sets: for every convex compact set $K\subset \mathbb{R}_+^d$, there exists a subset $A\subset K$ with at most $2^{8d}$ elements such that each element of $K$ is coordinate-wise dominated by an element of $A$ up to a universal constant factor. This geometric result is of independent interest; notably, it yields new cardinality estimates for $\varepsilon$-approximate Pareto sets in multi-objective optimization when the attainable set of objective vectors is convex.

On the Hardness of Approximation of the Fair k-Center Problem

from arXiv: Data Structures and Algorithms

Authors: Suhas Thejaswi

In this work, we study the hardness of approximation of the fair $k$-center problem. Here the data points are partitioned into groups and the task is to choose a prescribed number of data points from each group, called centers, while minimizing the maximum distance from any point to its closest center. Although a polynomial-time $3$-approximation is known for this problem in general metrics, it has remained open whether this approximation guarantee is tight or could be further improved, especially since the unconstrained $k$-center problem admits a polynomial-time factor-$2$ approximation. We resolve this open question by proving that, for every $ε>0$, achieving a $(3-ε)$-approximation is NP-hard, assuming $\text{P} \neq \text{NP}$. Our inapproximability results hold even when only two disjoint groups are present and at least one center must be chosen from each group. Further, they extend to the canonical one-per-group setting with $k$ groups (for arbitrary $k$), where exactly one center must be selected from each group. Consequently, the factor-$3$ barrier for fair $k$-center in general metric spaces is inherent, and existing $3$-approximation algorithms are optimal up to lower-order terms even in these restricted regimes. This result stands in sharp contrast to the $k$-supplier formulation, where both the unconstrained and fair variants admit factor-$3$ approximation in polynomial time.
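
For contrast, the factor-2 algorithm for the unconstrained problem mentioned in the abstract is the classic farthest-point greedy (Gonzalez); a minimal sketch, with no fairness constraint:

```python
import math

def gonzalez_k_center(points, k):
    """Farthest-point greedy: the classic 2-approximation for the
    unconstrained k-center problem (no group constraints)."""
    centers = [points[0]]  # arbitrary first center
    while len(centers) < k:
        # add the point farthest from all current centers
        far = max(points,
                  key=lambda p: min(math.dist(p, c) for c in centers))
        centers.append(far)
    return centers
```

On two well-separated pairs with k = 2, the greedy picks one center per cluster, giving radius 1.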

Submodular Maximization under Supermodular Constraint: Greedy Guarantees

from arXiv: Data Structures and Algorithms

Authors: Ajitesh Srivastava, Shanghua Teng

Motivated by a wide range of applications in data mining and machine learning, we consider the problem of maximizing a submodular function subject to supermodular cost constraints. In contrast to the well-understood setting of cardinality and matroid constraints, where greedy algorithms admit strong guarantees, the supermodular constraint regime remains poorly understood -- guarantees for greedy methods and other efficient algorithmic paradigms are largely open. We study this family of fundamental optimization problems under an upper-bound constraint on a supermodular cost function with curvature parameter $\gamma$. Our notion of supermodular curvature is less restrictive than prior definitions, substantially expanding the class of admissible cost functions. We show that our greedy algorithm, which iteratively includes elements maximizing the ratio of the objective and constraint functions, achieves a $\left(1 - e^{-(1-\gamma)}\right)$-approximation before stopping. We prove that this approximation is indeed tight for this algorithm. Further, if the objective function has submodular curvature $c$, then the bound improves to $\left(1 - (1- (1-c)(1-\gamma))^{1/(1-c)}\right)$, and it can be improved further by continuing to violate the constraint. Finally, we show that the Greedy-Ratio-Marginal algorithm in conjunction with binary search leads to a bicriteria approximation for the dual problem -- minimizing a supermodular function under a lower-bound constraint on a submodular function. We conduct a number of experiments on a simulation of LLM agents debating over multiple rounds -- the task is to select a subset of agents to maximize correctly answered questions. Our algorithm outperforms all other greedy heuristics, and on smaller problems, it achieves the same performance as the optimal set found by exhaustive search.
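A toy sketch of the greedy-ratio selection rule described in the abstract, in our own notation: here $f$ is set coverage (submodular) and the cost is $|S|^2$ (supermodular). This illustrates the rule only; it is not the paper's exact algorithm, and it carries none of its guarantees:

```python
def greedy_ratio(sets, cost, budget):
    """sets: dict name -> frozenset (objective f = coverage);
    cost: function of the selected list (assumed supermodular);
    budget: upper bound on cost. Repeatedly add the element with the
    best (marginal gain) / (marginal cost) ratio, stop at the budget."""
    chosen, covered = [], set()
    remaining = dict(sets)
    while remaining:
        def score(name):
            gain = len(sets[name] - covered)
            dcost = cost(chosen + [name]) - cost(chosen)
            return gain / dcost if dcost > 0 else float("inf")
        best = max(remaining, key=score)
        # stop before exceeding the budget or when no element helps
        if cost(chosen + [best]) > budget or not (sets[best] - covered):
            break
        chosen.append(best)
        covered |= sets[best]
        del remaining[best]
    return chosen, covered
```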

Dynamic and Streaming Algorithms for Union Volume Estimation

from arXiv: Data Structures and Algorithms

Authors: Sujoy Bhore, Karl Bringmann, Timothy M. Chan, Yanheng Wang

The union volume estimation problem asks to $(1\pm\varepsilon)$-approximate the volume of the union of $n$ given objects $X_1,\ldots,X_n \subset \mathbb{R}^d$. In their seminal work in 1989, Karp, Luby, and Madras solved this problem in time $O(n/\varepsilon^2)$ in an oracle model where each object $X_i$ can be accessed via three types of queries: obtain the volume of $X_i$, sample a random point from $X_i$, and test whether $X_i$ contains a given point $x$. This running time was recently shown to be optimal [Bringmann, Larsen, Nusser, Rotenberg, and Wang, SoCG'25]. In another line of work, Meel, Vinodchandran, and Chakraborty [PODS'21] designed algorithms that read the objects in one pass using polylogarithmic time per object and polylogarithmic space; this can be phrased as a dynamic algorithm supporting insertions of objects for union volume estimation in the oracle model. In this paper, we study algorithms for union volume estimation in the oracle model that support both insertions and deletions of objects. We obtain the following results:

- an algorithm supporting insertions and deletions in polylogarithmic update and query time and linear space (this is the first such dynamic algorithm, even for 2D triangles);
- an algorithm supporting insertions and suffix queries (which generalizes the sliding window setting) in polylogarithmic update and query time and space;
- an algorithm supporting insertions and deletions of convex bodies of constant dimension in polylogarithmic update and query time and space.
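The Karp-Luby-Madras estimator referenced above is easy to state concretely: pick an object with probability proportional to its volume, sample a point from it, and count the point only when the chosen object is the canonical (say, first) one containing it; the hit rate times the total volume estimates the union's volume. A sketch specialized to 1-D intervals, where the three oracle queries (volume, sample, membership) are trivial:

```python
import random

def klm_union_volume(intervals, samples=200_000, seed=0):
    """Karp-Luby-Madras union volume estimation for 1-D intervals."""
    rng = random.Random(seed)
    lengths = [b - a for a, b in intervals]
    total = sum(lengths)
    hits = 0
    for _ in range(samples):
        # volume-weighted choice of an object, then a uniform point in it
        i = rng.choices(range(len(intervals)), weights=lengths)[0]
        a, b = intervals[i]
        x = rng.uniform(a, b)
        # count x only if i is the first interval containing it
        first = next(j for j, (c, d) in enumerate(intervals) if c <= x <= d)
        hits += (first == i)
    return total * hits / samples
```

With `[(0, 2), (1, 3)]` the true union volume is 3, and the estimate concentrates around it as the sample count grows.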

Protecting the Undeleted in Machine Unlearning

from arXiv: Data Structures and Algorithms

Authors: Aloni Cohen, Refael Kohen, Kobbi Nissim, Uri Stemmer

Machine unlearning aims to remove specific data points from a trained model, often striving to emulate "perfect retraining", i.e., producing the model that would have been obtained had the deleted data never been included. We demonstrate that this approach, and security definitions that enable it, carry significant privacy risks for the remaining (undeleted) data points. We present a reconstruction attack showing that for certain tasks, which can be computed securely without deletions, a mechanism adhering to perfect retraining allows an adversary controlling merely $\omega(1)$ data points to reconstruct almost the entire dataset simply by issuing deletion requests. We survey existing definitions for machine unlearning, showing they are either susceptible to such attacks or too restrictive to support basic functionalities like exact summation. To address this problem, we propose a new security definition that specifically safeguards undeleted data against leakage caused by the deletion of other points. We show that our definition permits several essential functionalities, such as bulletin boards, summations, and statistical learning.

An $n^{2+o(1)}$ Time Algorithm for Single-Source Negative Weight Shortest Paths

from arXiv: Data Structures and Algorithms

Authors: Sanjeev Khanna, Junkai Song

We present a randomized algorithm for the single-source shortest paths (SSSP) problem on directed graphs with arbitrary real-valued edge weights that runs in $n^{2+o(1)}$ time with high probability. This result yields the first almost linear-time algorithm for the problem on dense graphs ($m = \Theta(n^2)$) and improves upon the best previously known bounds for moderately dense graphs ($m = \omega(n^{1.306})$). Our approach builds on the hop-reduction via shortcutting framework introduced by Li, Li, Rao, and Zhang (2025), which iteratively augments the graph with shortcut edges to reduce the negative hop count of shortest paths. The central computational bottleneck in prior work is the cost of explicitly constructing these shortcuts in dense regions. We overcome this by introducing a new compression technique using auxiliary Steiner vertices. Specifically, we construct these vertices to represent large neighborhoods compactly in a structured manner, allowing us to efficiently generate and propagate shortcuts while strictly controlling the growth of vertex degrees and graph size.

Fast Shortest Path in Graphs With Sparse Signed Tree Models and Applications

from arXiv: Data Structures and Algorithms

Authors: Édouard Bonnet, Colin Geniet, Eun Jung Kim, Sungmin Moon

A signed tree model of a graph $G$ is a compact binary structure consisting of a rooted binary tree whose leaves are bijectively mapped to the vertices of $G$, together with 2-colored edges $xy$, called transversal pairs, interpreted as bicliques or anti-bicliques whose sides are the leaves of the subtrees rooted at $x$ and at $y$. We design an algorithm that, given such a representation of an $n$-vertex graph $G$ with $p$ transversal pairs and a source $v \in V(G)$, computes a shortest-path tree rooted at $v$ in $G$ in time $O(p \log n)$. A wide variety of graph classes are such that for all $n$, their $n$-vertex graphs admit signed tree models with $O(n)$ transversal pairs: for instance, those of bounded symmetric difference, more generally of bounded sd-degeneracy, as well as interval graphs. As applications of our Single-Source Shortest Path algorithm and new techniques, we

- improve the runtime of the fixed-parameter algorithm for first-order model checking on graphs given with a witness of low merge-width from cubic [Dreier and Toruńczyk, STOC '25] to quadratic;
- give an $O(n^2 \log n)$-time algorithm for All-Pairs Shortest Path (APSP) on graphs given with a witness of low merge-width, generalizing a result known on twin-width [Twin-Width III, SICOMP '24];
- extend and simplify an $O(n^2 \log n)$-time algorithm for multiplying two $n \times n$ matrices $A, B$ of bounded twin-width in [Twin-Width V, STACS '23]: now $A$ solely has to be an adjacency matrix of a graph of bounded twin-width and $B$ can be arbitrary;
- give an $O(n^2 \log^2 n)$-time algorithm for APSP on graphs of bounded twin-width, bypassing the need for contraction sequences in [Twin-Width III, SICOMP '24; Bannach et al. STACS '24];
- give an $O(n^{7/3} \log^2 n)$-time algorithm for APSP on graphs of symmetric difference $O(n^{1/3})$.

Steering diffusion models with quadratic rewards: a fine-grained analysis

from arXiv: Data Structures and Algorithms

Authors: Ankur Moitra, Andrej Risteski, Dhruv Rohatgi

Inference-time algorithms are an emerging paradigm in which pre-trained models are used as subroutines to solve downstream tasks. Such algorithms have been proposed for tasks ranging from inverse problems and guided image generation to reasoning. However, the methods currently deployed in practice are heuristics with a variety of failure modes -- and we have very little understanding of when these heuristics can be efficiently improved. In this paper, we consider the task of sampling from a reward-tilted diffusion model -- that is, sampling from $p^{\star}(x) \propto p(x) \exp(r(x))$ -- given a reward function $r$ and pre-trained diffusion oracle for $p$. We provide a fine-grained analysis of the computational tractability of this task for quadratic rewards $r(x) = x^\top A x + b^\top x$. We show that linear-reward tilts are always efficiently sampleable -- a simple result that seems to have gone unnoticed in the literature. We use this as a building block, along with a conceptually new ingredient -- the Hubbard-Stratonovich transform -- to provide an efficient algorithm for sampling from low-rank positive-definite quadratic tilts, i.e. $r(x) = x^\top A x$ where $A$ is positive-definite and of rank $O(1)$. For negative-definite tilts, i.e. $r(x) = - x^\top A x$ where $A$ is positive-definite, we prove that the problem is intractable even if $A$ is of rank 1 (albeit with exponentially-large entries).
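The claim that linear-reward tilts are easy has a closed-form instance for Gaussians: tilting $p = N(\mu, \sigma^2)$ by $\exp(bx)$ yields $N(\mu + \sigma^2 b, \sigma^2)$ (complete the square in the exponent). A sketch checking this numerically; it is a special case for intuition only, since the paper's result concerns arbitrary pre-trained diffusion models:

```python
import math

def gaussian_linear_tilt(mu, sigma, b):
    """Closed form: p*(x) ∝ N(mu, sigma^2)(x) * exp(b*x) is again
    Gaussian, with mean mu + sigma^2 * b and unchanged variance."""
    return mu + sigma ** 2 * b, sigma

def tilted_mean_numeric(mu, sigma, b, grid=20_000, lo=-20.0, hi=20.0):
    """Brute-force check of the same fact on a uniform grid."""
    xs = [lo + (hi - lo) * i / grid for i in range(grid + 1)]
    w = [math.exp(-((x - mu) ** 2) / (2 * sigma ** 2) + b * x) for x in xs]
    z = sum(w)
    return sum(x * wi for x, wi in zip(xs, w)) / z
```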

Separating Oblivious and Adaptive Models of Variable Selection

from arXiv: Data Structures and Algorithms

Authors: Ziyun Chen, Jerry Li, Kevin Tian, Yusong Zhu

Sparse recovery is among the most well-studied problems in learning theory and high-dimensional statistics. In this work, we investigate the statistical and computational landscapes of sparse recovery with $\ell_\infty$ error guarantees. This variant of the problem is motivated by \emph{variable selection} tasks, where the goal is to estimate the support of a $k$-sparse signal in $\mathbb{R}^d$. Our main contribution is a provable separation between the \emph{oblivious} (``for each'') and \emph{adaptive} (``for all'') models of $\ell_\infty$ sparse recovery. We show that under an oblivious model, the optimal $\ell_\infty$ error is attainable in near-linear time with $\approx k\log d$ samples, whereas in an adaptive model, $\gtrsim k^2$ samples are necessary for any algorithm to achieve this bound. This establishes a surprising contrast with the standard $\ell_2$ setting, where $\approx k \log d$ samples suffice even for adaptive sparse recovery. We conclude with a preliminary examination of a \emph{partially-adaptive} model, where we show nontrivial variable selection guarantees are possible with $\approx k\log d$ measurements.

The S-Hamiltonian Cycle Problem

from arXiv: Data Structures and Algorithms

Authors: Antoine Amarilli, Arthur Lombardo, Mikaël Monet

Determining if an input undirected graph is Hamiltonian, i.e., if it has a cycle that visits every vertex exactly once, is one of the most famous NP-complete problems. We consider the following generalization of Hamiltonian cycles: for a fixed set $S$ of natural numbers, we want to visit each vertex of a graph $G$ exactly once and ensure that any two consecutive vertices can be joined in $k$ hops for some choice of $k \in S$. Formally, an $S$-Hamiltonian cycle is a permutation $(v_0,\ldots,v_{n-1})$ of the vertices of $G$ such that, for $0 \leq i \leq n-1$, there exists a walk between $v_i$ and $v_{i+1 \bmod n}$ whose length is in $S$. (We do not impose any constraints on how many times vertices can be visited as intermediate vertices of walks.) Of course Hamiltonian cycles in the standard sense correspond to $S=\{1\}$. We study the $S$-Hamiltonian cycle problem of deciding whether an input graph $G$ has an $S$-Hamiltonian cycle. Our goal is to determine the complexity of this problem depending on the fixed set $S$. It is already known that the problem remains NP-complete for $S=\{1,2\}$, whereas it is trivial for $S=\{1,2,3\}$ because any connected graph contains a $\{1,2,3\}$-Hamiltonian cycle. Our work classifies the complexity of this problem for most kinds of sets $S$, with the key new results being the following: we have NP-completeness for $S = \{2\}$ and for $S = \{2, 4\}$, but tractability for $S = \{1, 2, 4\}$, for $S = \{2, 4, 6\}$, for any superset of these two tractable cases, and for $S$ the infinite set of all odd integers. The remaining open cases are the non-singleton finite sets of odd integers, in particular $S = \{1, 3\}$. Beyond cycles, we also discuss the complexity of finding $S$-Hamiltonian paths, and show that our problems are all tractable on graphs of bounded cliquewidth.
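The definition is straightforward to verify for a given cyclic order: whether a walk of length exactly $k$ joins two vertices reduces to $k$ rounds of neighborhood expansion. A minimal sketch of this verifier (illustrative, with our own function names; the paper's hardness results concern deciding whether such a cycle *exists*):

```python
def has_walk_of_length(adj, u, v, k):
    """adj: dict vertex -> set of neighbors. True iff some walk
    (vertices may repeat) of length exactly k goes from u to v."""
    reachable = {u}
    for _ in range(k):
        reachable = {w for x in reachable for w in adj[x]}
    return v in reachable

def is_S_hamiltonian_cycle(adj, order, S):
    """Check the abstract's definition: consecutive vertices in the
    cyclic order must be joined by a walk whose length lies in S."""
    n = len(order)
    return all(
        any(has_walk_of_length(adj, order[i], order[(i + 1) % n], k) for k in S)
        for i in range(n)
    )
```

On the path $0 - 1 - 2$, the order $(0, 1, 2)$ is a $\{1,2\}$-Hamiltonian cycle (the pair $(2, 0)$ is joined by the walk $2 \to 1 \to 0$) but not a $\{1\}$-Hamiltonian one.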

The Complexity Landscape of Two-Stage Robust Selection Problems with Budgeted Uncertainty

from arXiv: Data Structures and Algorithms

Authors: Marc Goerigk, Dorothee Henke, Lasse Wulf

A standard type of uncertainty set in robust optimization is budgeted uncertainty, where an interval of possible values for each parameter is given and the total deviation from their lower bounds is bounded. In the two-stage setting, discrete and continuous budgeted uncertainty have to be distinguished. The complexity of such problems is largely unexplored, in particular if the underlying nominal optimization problem is simple, such as for selection problems. In this paper, we give a comprehensive answer to long-standing open complexity questions for three types of selection problems and three types of budgeted uncertainty sets. In particular, we demonstrate that the two-stage selection problem with continuous budgeted uncertainty is NP-hard, while the corresponding two-stage representative selection problem is solvable in polynomial time. Our hardness result implies that also the two-stage assignment problem with continuous budgeted uncertainty is NP-hard.

Computing Tarski Fixed Points in Financial Networks

from arXiv: Data Structures and Algorithms

Authors: Leander Besting, Martin Hoefer, Lars Huth

Modern financial networks are highly connected and result in complex interdependencies of the involved institutions. In the prominent Eisenberg-Noe model, a fundamental aspect is clearing -- to determine the amount of assets available to each financial institution in the presence of potential defaults and bankruptcy. A clearing state represents a fixed point that satisfies a set of natural axioms. Existence can be established (even in broad generalizations of the model) using Tarski's theorem. While a maximal fixed point can be computed in polynomial time, the complexity of computing other fixed points is open. In this paper, we provide an efficient algorithm to compute a minimal fixed point that runs in strongly polynomial time. It applies in a broad generalization of the Eisenberg-Noe model with any monotone, piecewise-linear payment functions and default costs. Moreover, in this scenario we provide a polynomial-time algorithm to compute a maximal fixed point. For networks without default costs, we can efficiently decide the existence of fixed points in a given range. We also study claims trading, a local network adjustment to improve clearing, when networks are evaluated with minimal clearing. We provide an efficient algorithm to decide existence of Pareto-improving trades and compute optimal ones if they exist.
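For context, the maximal clearing fixed point in the basic Eisenberg-Noe model (without default costs) can be reached by Picard iteration from the vector of total liabilities, each step applying $p_i \mapsto \min(\bar p_i, e_i + \sum_j \Pi_{ji}\, p_j)$, where $\bar p_i$ is bank $i$'s total liability, $e_i$ its external assets, and $\Pi$ the relative liability matrix. A sketch under these standard definitions; the paper's contributions (minimal fixed points, default costs, claims trading) require more than this:

```python
def clearing_vector(liabilities, external_assets, iters=100):
    """Picard iteration for the maximal Eisenberg-Noe clearing vector.
    liabilities[i][j] = nominal amount bank i owes bank j.
    Starting from the total-liability vector, iterates decrease
    monotonically to the maximal fixed point (Tarski)."""
    n = len(external_assets)
    pbar = [sum(row) for row in liabilities]
    # relative liabilities: share of i's payments that go to j
    Pi = [[liabilities[i][j] / pbar[i] if pbar[i] > 0 else 0.0
           for j in range(n)] for i in range(n)]
    p = pbar[:]
    for _ in range(iters):
        p = [min(pbar[i],
                 external_assets[i] + sum(Pi[j][i] * p[j] for j in range(n)))
             for i in range(n)]
    return p
```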

When to Identify Is to Control: On the Controllability of Combinatorial Optimization Problems

from arXiv: Data Structures and Algorithms

Authors: Max Klimm, Jannik Matuschke

Consider a finite ground set $E$, a set of feasible solutions $X \subseteq \mathbb{R}^{E}$, and a class of objective functions $\mathcal{C}$ defined on $X$. We are interested in subsets $S$ of $E$ that control $X$ in the sense that we can induce any given solution $x \in X$ as an optimum for any given objective function $c \in \mathcal{C}$ by adding linear terms to $c$ on the coordinates corresponding to $S$. This problem has many applications, e.g., when $X$ corresponds to the set of all traffic flows, the ability to control implies that one is able to induce all target flows by imposing tolls on the edges in $S$. Our first result shows the equivalence between controllability and identifiability. If $X$ is convex, or if $X$ consists of binary vectors, then $S$ controls $X$ if and only if the restriction of $x$ to $S$ uniquely determines $x$ among all solutions in $X$. In the convex case, we further prove that the family of controlling sets forms a matroid. This structural insight yields an efficient algorithm for computing minimum-weight controlling sets from a description of the affine hull of $X$. While the equivalence extends to matroid base families, the picture changes sharply for other discrete domains. We show that when $X$ is equal to the set of $s$-$t$-paths in a directed graph, deciding whether an identifying set of a given cardinality exists is $\mathsf{\Sigma_2^P}$-complete. The problem remains $\mathsf{NP}$-hard even on acyclic graphs. For acyclic instances, however, we obtain an approximation guarantee by proving a tight bound on the gap between the smallest identifying sets for $X$ and its convex hull, where the latter corresponds to the $s$-$t$-flow polyhedron.
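For finite binary solution sets, the identifiability condition is easy to state operationally: $S$ identifies $X$ if no two distinct solutions agree on the coordinates in $S$. A brute-force sketch (a hypothetical helper for intuition, not the paper's matroid-based algorithm):

```python
def is_identifying(solutions, S):
    """Return True iff the restriction to coordinates S uniquely
    determines each solution in the finite set `solutions`."""
    seen = {}
    for x in solutions:
        key = tuple(x[i] for i in S)         # restriction of x to S
        if key in seen and seen[key] != x:
            return False                     # two solutions collide on S
        seen[key] = x
    return True
```

As a toy instance, take two s-t paths encoded as edge-incidence vectors (1, 0, 1) and (0, 1, 1): observing edge 0 distinguishes them, while observing only the shared edge 2 does not.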

Condorcet Dimension and Pareto Optimality for Matchings and Beyond

from arXiv: Data Structures and Algorithms

Authors: Telikepalli Kavitha, Jannik Matuschke, Ulrike Schmidt-Kraepelin

We study matching problems in which agents form one side of a bipartite graph and have preferences over objects on the other side. A central solution concept in this setting is popularity: a matching is popular if it is a (weak) Condorcet winner, meaning that no other matching is preferred by a strict majority of agents. It is well known, however, that Condorcet winners need not exist. We therefore turn to a natural and prominent relaxation. A set of matchings is a Condorcet-winning set if, for every competing matching, a majority of agents prefers their favorite matching in the set over the competitor. The Condorcet dimension is the smallest cardinality of a Condorcet-winning set. Our main results reveal a connection between Condorcet-winning sets and Pareto optimality. We show that any Pareto-optimal set of two matchings is, in particular, a Condorcet-winning set. This implication continues to hold when we impose matroid constraints on the set of matched objects, and even when agents' valuations are given as partial orders. The existence picture, however, changes sharply with partial orders. While for weak orders a Pareto-optimal set of two matchings always exists, this is -- surprisingly -- not the case under partial orders. Consequently, although the Condorcet dimension for matchings is 2 under weak orders (even under matroid constraints), this guarantee fails for partial orders: we prove that the Condorcet dimension is $\Theta(\sqrt{n})$, and rises further to $\Theta(n)$ when matroid constraints are added. On the computational side, we show that, under partial orders, deciding whether there exists a Condorcet-winning set of a given fixed size is NP-hard. The same holds for deciding the existence of a Pareto-optimal matching, which we believe to be of independent interest. Finally, we show that the Condorcet dimension for a related problem on arborescences is also 2.
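The definition of a Condorcet-winning set can be checked by brute force over competitors. A small illustrative sketch, assuming strict rankings encoded as rank dictionaries and reading "a majority" as at least half the agents (the encoding and helper name are mine, not the paper's):

```python
def is_condorcet_winning(cand_set, alternatives, prefs):
    """prefs[a][m]: rank agent a assigns alternative m (lower = better).
    The set wins if, against every competitor outside it, at least half
    the agents prefer their favorite member of the set to the competitor."""
    n = len(prefs)
    for comp in alternatives:
        if comp in cand_set:
            continue
        supporters = sum(1 for r in prefs
                         if min(r[m] for m in cand_set) < r[comp])
        if 2 * supporters < n:
            return False
    return True
```

On a classic three-alternative Condorcet cycle no singleton wins, but a two-element set does, illustrating why Condorcet dimension 2 is the natural target.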

Near-optimal population protocols on bounded-degree trees

from arXiv: Data Structures and Algorithms

Authors: Joel Rybicki, Jakob Solnerzik, Robin Vacus

We investigate space-time trade-offs for population protocols in sparse interaction graphs. In complete interaction graphs, optimal space-time trade-offs are known for the leader election and exact majority problems. However, it has remained open if other graph families exhibit similar space-time complexity trade-offs, as existing lower bound techniques do not extend beyond highly dense graphs. In this work, we show that -- unlike in complete graphs -- population protocols on bounded-degree trees do not exhibit significant asymptotic space-time trade-offs for leader election and exact majority. For these problems, we give constant-space protocols that have near-optimal worst-case expected stabilisation time. These new protocols achieve a linear speed-up compared to the state-of-the-art. Our results are based on two novel protocols, which we believe are of independent interest. First, we give a new fast self-stabilising 2-hop colouring protocol for general interaction graphs, whose stabilisation time we bound using a stochastic drift argument. Second, we give a self-stabilising tree orientation algorithm that builds a rooted tree in optimal time on any tree. As a consequence, we can use simple constant-state protocols designed for directed trees to solve leader election and exact majority fast. For example, we show that "directed" annihilation dynamics solve exact majority in $O(n^2 \log n)$ steps on directed trees.
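To build intuition for annihilation dynamics, here is a toy undirected simulation on a path graph (a cartoon only, not the paper's directed-tree protocol): '+' and '-' tokens random-walk over blank agents and annihilate on contact. Both rules preserve the difference between '+' and '-' counts, so the surviving sign encodes the exact majority.

```python
import random

def annihilate(opinions, steps=50000, seed=1):
    """Toy annihilation dynamics on a path: a random scheduler picks an
    edge; opposite tokens annihilate to blanks, and a token swaps with
    an adjacent blank (a random walk). The count invariant
    (#'+' - #'-') is preserved by both rules."""
    rng = random.Random(seed)
    a = list(opinions)
    n = len(a)
    for _ in range(steps):
        i = rng.randrange(n - 1)               # pick a random adjacent pair
        u, v = a[i], a[i + 1]
        if {u, v} == {'+', '-'}:
            a[i] = a[i + 1] = '0'              # opposite tokens annihilate
        elif '0' in (u, v) and (u, v) != ('0', '0'):
            a[i], a[i + 1] = v, u              # token walks over a blank
    return a
```

Run on a configuration with one more '+' than '-', the invariant guarantees a '+' surplus of exactly one at every step, whatever the random schedule.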
