Thank you to arXiv for use of its open access interoperability.

Note: the date of arXiv entries announced right after publication holidays might incorrectly show up as the date of the publication holiday itself. This is due to our ad hoc method of inferring announcement dates, which are not returned by the arXiv API.

Maintained by Nima Anari, Arnab Bhattacharyya, Gautam Kamath.

Theory of Computing Report

Wednesday, April 15

Reticulating Splines

from Ben Recht

The long legacy of simulation-based control.

This is a live blog of Lecture 9 of my graduate seminar “Feedback, Learning, and Adaptation.” A table of contents is here.

As I mentioned Monday, one of the big paradigms in modern robotics and control is the “sim2real” pipeline. People invest in complex computer simulators to test their robotic policies. The simulators have detailed dynamic and kinematic models of the robot and how it moves in contact with varied terrain and obstacles. The hope is that by burning through infinite GPU credits to troubleshoot every possibility in simulation, they can deploy code to their actual robot and need no troubleshooting once it’s unleashed in the real world.

While the young folks like to make this paradigm sound like a novel research program, all of optimal control rests on the sim2real pipeline. Think about the core problem of optimal control: the linear quadratic regulator. This problem looks for a control sequence that minimizes a quadratic cost subject to the world evolving according to a linear dynamical system. Control theorists banged their heads against this problem for decades, and we are now taught the beautiful dynamic programming derivations that reduce this problem to solving a compact equation. However, we can also solve it using gradient descent. The gradient computation amounts to simulating the system with the current control policy, computing the sensitivity of the cost trajectory to each control decision, and then adding this information up to compute the gradient.

The lovely thing about gradient descent is that it gives you a solution technique for general optimal control problems with nonquadratic costs or nonlinear dynamics. You evaluate your policy under the current control, run a dynamical system backward in time to compute how sensitive the trajectory was to your control decisions, and then add up the contributions of each time point to get the full gradient. Arthur Bryson invented this method to compute gradients of general optimal control problems in 1962. Today, we call his algorithm backpropagation. This simulation-based gradient method provides incremental improvement of policies for any differentiable dynamical model and any differentiable cost function.
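The forward-simulate, backward-sweep recipe fits in a few lines. Here is a minimal sketch for a toy finite-horizon LQR instance; the matrices, horizon, and step size are all made up for illustration, and the finite-difference check at the end confirms the adjoint pass computes the true gradient.

```python
import numpy as np

# Toy finite-horizon LQR: x[t+1] = A x[t] + B u[t], J = sum_t x'Qx + u'Ru.
# All numbers below are illustrative, not from any real system.
A = np.array([[0.9, 0.2], [0.0, 0.8]])
B = np.array([[0.0], [1.0]])
Q, R = np.eye(2), 0.1 * np.eye(1)
x0, N = np.array([1.0, -1.0]), 15

def rollout(U):
    """Simulate the dynamics forward under the control sequence U."""
    X = [x0]
    for t in range(N):
        X.append(A @ X[-1] + B @ U[t])
    return X

def cost(U):
    X = rollout(U)
    return sum(x @ Q @ x for x in X) + sum(u @ R @ u for u in U)

def grad(U):
    """Adjoint pass: one forward simulation, one backward sweep of sensitivities."""
    X = rollout(U)
    lam = 2 * Q @ X[N]                   # costate at the final state
    G = np.zeros_like(U)
    for t in reversed(range(N)):
        G[t] = 2 * R @ U[t] + B.T @ lam  # dJ/du[t]
        lam = 2 * Q @ X[t] + A.T @ lam   # backward recursion for dJ/dx[t]
    return G

U = np.zeros((N, 1))
J0 = cost(U)
for _ in range(1000):
    U -= 0.003 * grad(U)                 # simulation-based gradient descent
J1 = cost(U)

# Check the adjoint gradient against a central finite difference.
eps = 1e-6
Up, Um = U.copy(), U.copy()
Up[5, 0] += eps
Um[5, 0] -= eps
fd = (cost(Up) - cost(Um)) / (2 * eps)
assert abs(fd - grad(U)[5, 0]) < 1e-4
assert J1 < J0
```

Swapping in a nonlinear simulator and a nonquadratic cost only changes `rollout` and the local sensitivities inside `grad`; the forward-then-backward structure, which is exactly backpropagation through time, stays the same.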

Now, if your simulation isn’t differentiable, maybe you’ll use a different sort of policy optimization method to solve your optimal control problem. However, reinforcement learning for robotics is still optimal control. RL for robotics minimizes a designed cost function subject to dynamics. The modern departure is that no one bothers to write down the equations of motion anymore. They just assume the simulator will compute them.

This belief pushes a lot of work onto the simulator. GPU cycles are sadly neither free nor abundant. It would be nice to minimize the simulation time and cost required to find a good control policy. It would be particularly nice because many people would like to have a simulator on board the actual robot to compute policies with methods like model predictive control. This raises the question of how accurate your simulation needs to be.

Unfortunately, no one knows. We all think that if you can act quickly enough with enough control authority, then a really simple model should work. But it’s impossible to quantify “enough” in that sentence. You have to try things out because dynamical processes are always surprising.

While it feels like increasing the fidelity of a simulator to the minute details of physical law always improves performance, this is not remotely the case. In class on Monday, Spencer Schutz presented a paper on autonomous driving showing that a simple, inaccurate kinematic model with a low sampling rate performed just as well as a more accurate dynamic model. Anyone who’s spent time with dynamic models knows that very high-dimensional complex systems often look simple when you have limited controllability and observability. This is the basis of thermodynamics, where infinitely many bodies colliding collectively produce fairly boring dissipative behavior. Many complex-looking circuits have the input-output behavior of resistors.

On the other side of the coin, safe execution demands identification of subtle aspects of input-output relationships. You can have two dynamical systems with nearly identical behavior perform completely differently once in a closed loop circuit. You can also have systems with completely different behavior look the same in closed loop. I worked through a few examples of this phenomenon in a blog post a couple of years ago. Your model needs to be perfect in exactly the right places. But it’s usually impossible to know those places in advance.

To make matters worse, you can’t really identify the parameters of a robot in open loop. An expensive robot is always going to be running with its low-level controllers on both for its safety and yours. The actual parameters of closed-loop systems can’t be identified.1 So you’re stuck with guesses in your simulator, and you have to hope that your plausible parameters are good enough for your sim2real application.

The most popular solution to this identification problem is domain randomization. Since you can only find a range of parameters that describe reality, you build a control policy that works for randomly sampled parameters. By constantly sampling different parameters in each run, you build a policy that performs well on average across all possible parameters.

Finding controllers that work for the average model isn’t new. Indeed, this is just a variant of optimal control called dual control, which has seen bursts of interest since the 1960s. Dual control is literally the problem of minimizing an expected control performance over a distribution of parameters. Like dual control, domain randomization needs a good prior model for how the environment “samples” parameters. But you can also just YOLO and hope that as long as you include all the edge cases, you’ll never crash. That’s the machine learning mindset, after all.
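As a sketch of what randomization looks like in code, here is a hypothetical one-dimensional cart with an unknown mass: each rollout draws a fresh mass from an assumed log-uniform prior, and a crude grid search picks the gains with the best average cost. Everything here (the dynamics, the prior, the cost weights, the search) is illustrative, not anyone’s actual pipeline.

```python
import numpy as np

rng = np.random.default_rng(1)

def rollout_cost(k, mass, T=50, dt=0.1):
    """Cost of a proportional controller u = -k0*pos - k1*vel on a double integrator."""
    pos, vel, J = 1.0, 0.0, 0.0
    for _ in range(T):
        u = -k[0] * pos - k[1] * vel
        acc = u / mass                    # the unknown physical parameter enters here
        pos, vel = pos + dt * vel, vel + dt * acc
        J += pos**2 + 0.01 * u**2
    return J

def randomized_cost(k, n_samples=64):
    """Average cost over masses drawn from an assumed log-uniform prior on [0.5, 2]."""
    masses = np.exp(rng.uniform(np.log(0.5), np.log(2.0), size=n_samples))
    return np.mean([rollout_cost(k, m) for m in masses])

# Crude zeroth-order search over gains; the point is the objective, not the optimizer.
best_k, best_J = None, np.inf
for k0 in np.linspace(0.5, 5.0, 10):
    for k1 in np.linspace(0.5, 5.0, 10):
        J = randomized_cost(np.array([k0, k1]))
        if J < best_J:
            best_k, best_J = np.array([k0, k1]), J
```

The resulting gain is tuned to the prior, which is exactly where the next question bites: the answer depends entirely on what distribution you assumed for the unknown parameter.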

But what does it mean to sample the coefficient of friction of a surface? What’s the right distribution of coefficients of friction? This is again a tricky question.

One approach to modeling the distribution of parameters is to add an element of adversarial behavior to the system. We can adapt the simulations to find hard parameter settings and train more on those. We can have the simulator learn to trip up the robot. Rather than minimizing expected cost, we are working to minimize a worst-case cost, where the supremum is taken over a set of parameters or disturbances. The dual control people were really into this sort of minimax robustness in the 60s. But practice in aerospace applications ultimately pushed the community to robust control.
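A toy version of the adversarial variant replaces the average over sampled parameters with a maximum over a parameter set. In the sketch below, a double integrator with uncertain mass stands in for the simulator; the parameter set, cost weights, and grid search are all assumptions for illustration.

```python
import numpy as np

def rollout_cost(k, mass, T=50, dt=0.1):
    """Cost of a proportional controller on a double integrator with uncertain mass."""
    pos, vel, J = 1.0, 0.0, 0.0
    for _ in range(T):
        u = -k[0] * pos - k[1] * vel
        pos, vel = pos + dt * vel, vel + dt * u / mass
        J += pos**2 + 0.01 * u**2
    return J

MASSES = [0.5, 1.0, 2.0]          # the adversary's parameter set (assumed known bounds)

def worst_case_cost(k):
    return max(rollout_cost(k, m) for m in MASSES)   # supremum over parameters

# Minimize the worst case by grid search, and compare to the best average-case gain.
grid = [np.array([a, b]) for a in np.linspace(0.5, 5, 10)
                         for b in np.linspace(0.5, 5, 10)]
k_robust = min(grid, key=worst_case_cost)
k_average = min(grid, key=lambda k: np.mean([rollout_cost(k, m) for m in MASSES]))

# By construction the minimax gain is at least as good under the worst parameter.
assert worst_case_cost(k_robust) <= worst_case_cost(k_average)
```

What the minimax gain gives up on the average case is the conservatism that robust control gets blamed for.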

But people hate robust control because it gives them conservative policies. Computer scientists love to hack and ship. Look how productive they’ve been! You only need to write a few tests and make sure your simulator passes those. No bugs detected, LGTM! What could go wrong, right?

Is that last paragraph about coding agents? It might be.

But regardless, robust control pointed out that unmodeled uncertainties are everywhere, and they are out there waiting to bite you if you’re not careful. For its entire history, robust control advocates have been haranguing people about the limits of simulators. They note a couple of significant problems: first, training on a simulator often means fitting to quirks of the simulator that don’t appear in the real world. This is a major danger, even in linear systems. Second, many apparent parametric robustness properties of optimal controllers break down under scrutiny.

In class, I introduced the structured singular value to motivate this issue. The structured singular value showed that when you had a system with many inputs and outputs, and you only considered independent perturbations, you’d convince yourself that a system was stable when it was not remotely stable. Guaranteeing stable behavior required understanding the dependencies between different errors. But how you test stability in simulation is not clear.

We are thus left considering a strategy beyond sim2real: sim2real2sim2real. Or sim2real2sim2real2sim2real. You deploy the system and find out what didn’t work in reality. And then you go back to your simulator, add a few thousand lines of code to account for the mistake, and try again. The software state of mind is that we can always patch mistakes. You can have an all-hands, blameless post-mortem and say it won’t happen again. This drives the old control theorists mad, but it’s been working great so far, so why change course?

1. In case you haven’t encountered this before, suppose you are trying to model a closed-loop system x[t+1] = A x[t] + B u[t] with u[t] = K x[t]. Then for an arbitrary matrix E_B,

A + BK = (A - E_B K) + (B + E_B) K

Hence, you can only identify a subspace of possible dynamical systems describing your data.
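The cancellation in the footnote is easy to verify numerically. A quick check with arbitrary random matrices (the sizes and seed are incidental):

```python
import numpy as np

# With u = K x, the data only pin down A + B K: the pair (A, B) and the
# perturbed pair (A - E K, B + E) generate identical closed-loop behavior
# for any perturbation matrix E.
rng = np.random.default_rng(0)
n, m = 3, 2
A = rng.normal(size=(n, n))
B = rng.normal(size=(n, m))
K = rng.normal(size=(m, n))
E = rng.normal(size=(n, m))      # arbitrary perturbation matrix

A2, B2 = A - E @ K, B + E        # a different model with the same closed loop
assert np.allclose(A + B @ K, A2 + B2 @ K)

# Both models produce identical closed-loop trajectories from any start state.
x = rng.normal(size=n)
for _ in range(10):
    u = K @ x
    assert np.allclose(A @ x + B @ u, A2 @ x + B2 @ u)
    x = A @ x + B @ u
```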

By Ben Recht

PhD student at University of Salzburg (apply by May 6, 2026)

from CCI: jobs

The University of Salzburg is currently seeking a Predoctoral University Assistant (“PhD student”) in the Big Data Algorithms Group headed by Sebastian Forster. The goal is to develop graph algorithms for solving clustering, distance, flow, or cut problems that are as well-suited as possible to dynamic, parallel, or distributed computing models. Please consult the linked website for details.

Website: https://karriere.plus.ac.at/en/jobs/6e6aa798-0429-01e8-cabb-69c12151a2bd
Email: forster@cs.sbg.ac.at

By shacharlovett

A complexity phase transition at the EPR Hamiltonian

from arXiv: Computational Complexity

Authors: Kunal Marwaha, James Sud

We study the computational complexity of 2-local Hamiltonian problems generated by a positive-weight symmetric interaction term, encompassing many canonical problems in statistical mechanics and optimization. We show these problems belong to one of three complexity phases: QMA-complete, StoqMA-complete, and reducible to a new problem we call EPR*. The phases are physically interpretable, corresponding to the energy level ordering of the local term. The EPR* problem is a simple generalization of the EPR problem of King. Inspired by empirically efficient algorithms for EPR, we conjecture that EPR* is in BPP. If true, this would complete the complexity classification of these problems, and imply EPR* is the transition point between easy and hard local Hamiltonians. Our proofs rely on perturbative gadgets. One simple gadget, when recursed, induces a renormalization-group-like flow on the space of local interaction terms. This gives the correct complexity picture, but does not run in polynomial time. To overcome this, we design a gadget based on a large spin chain, which we analyze via the Jordan-Wigner transformation.

Symmetric subrank and its border analogue

from arXiv: Computational Complexity

Authors: Benjamin Biaggi, Jan Draisma, Koen de Nooij, Immanuel van Santen

The symmetric subrank of a homogeneous polynomial is the largest number of terms in a diagonal form to which it can be specialized by a (typically non-invertible) linear variable substitution. Building on earlier work by Derksen-Makam-Zuiddam and Biaggi-Chang-Draisma-Rupniewski for ordinary tensors, we determine the asymptotic behavior of symmetric subrank and symmetric border subrank of degree-d forms as the number of variables tends to infinity. Furthermore, by using results from geometric invariant theory we show that for cubic (resp. quartic) forms the symmetric subrank and symmetric border subrank coincide if the latter is at most three (resp. two).

From Witness-Space Sharpness To Family-Pointwise Exactness For The Solvability Complexity Index

from arXiv: Computational Complexity

Authors: Christopher Sorg

We study how exact Solvability Complexity Index (SCI) statements should be formulated for families of computational problems rather than for single problems. While the equality \(\operatorname{SCI}_G (\mathcal P)=k\) is unambiguous for an individual computational problem \(\mathcal P\), the family setting requires one to distinguish family-pointwise exactness, witness-space sharpness, and worst-case exactness. We formalize this trichotomy, prove that witness-space sharpness coincides with worst-case exactness but is, in general, strictly weaker than family-pointwise exactness, and show that certain Koopman-operator classification results are sharp only in this worst-case sense. We then establish two positive upgrade theorems: an abstract pullback principle and a concrete finite-query criterion guaranteeing that witness-space sharpness upgrades to family-pointwise exactness. Next, we introduce a decoder-regular finite-query transport preorder on SCI computational problems, prove that it is a preorder, derive a transport-saturation sufficient criterion extending the principal-source package, and show that the associated transport degrees need not form a lattice in full generality. We analyze the natural decoder classes \(\mathscr R_{\mathrm{cont}}\) and \(\mathscr R_{\mathrm{Bor}}\): on the full class the corresponding quotients are not upper semilattices, while on the nondegenerate subclass the preorder is upward and downward directed. Finally, we exhibit two natural positive families realizing the principal transport mechanism: exact integration on compact intervals and a fixed-window spectral decision family obtained by block-diagonal stabilization.

A Relativizing MIP for BQP

from arXiv: Computational Complexity

Authors: Scott Aaronson, Anand Natarajan, Avishay Tal, Agi Villanyi

Complexity class containments involving interactive proof classes are famously nonrelativizing: although $\mathsf{IP} = \mathsf{PSPACE}$, Fortnow and Sipser showed that there exists an oracle relative to which $\mathsf{coNP} \not\subseteq \mathsf{IP}$. In contrast, the question of whether the containment $\mathsf{BQP} \subseteq \mathsf{IP}$ is relativizing remains wide open. In this work we make progress towards resolving this question by showing that the containment $\mathsf{BQP} \subseteq \mathsf{MIP}$ holds with respect to any classical oracle. We obtain this result by constructing, for any classical oracle $O$, a $\mathsf{PCP}$ proof system for $\mathsf{BQP}^{O}$ where the verifier makes polynomially many classical queries to an exponentially-long proof, and to the oracle $O$. Our construction is inspired by the state synthesis algorithm of Grover and Rudolph, and serves as a complement to the "exponential PCP" constructed by Aharonov, Arad, and Vidick, which achieves similar parameters but which is based on different ideas and does not relativize. We propose relativization as a proxy for prover efficiency, and hope that progress towards an $\mathsf{IP}$ for $\mathsf{BQP}$ in the oracle world will lead to a non-cryptographic interactive protocol for proving any quantum computation to a classical skeptic in the unrelativized world, which is a longstanding open problem in quantum complexity theory.

Topology Understanding of B-Spline Surface/Surface Intersection with Mapper

from arXiv: Computational Geometry

Authors: Chenming Gao, Hongwei Lin, Gengchen Li

In the realm of computer-aided design (CAD) software, the intersection of B-spline surfaces stands as a fundamental operation. Despite the extensive history of surface intersection algorithms, the challenge of handling complex intersection topologies persists. While subdivision algorithms have demonstrated strong robustness in computing surface/surface intersections and are capable of addressing singular cases, determining the topology of the intersection obtained through these methods is a key factor for computing correct intersections, and it remains a difficult issue. To address this challenge, we propose a Mapper-based method for determining the topology of the intersection between two B-spline surfaces. Our algorithm is designed to efficiently handle various common and complex intersection topologies. Experimental results verify the robustness and topological correctness of this method.

The Parameterized Complexity of Vertex-Coloring Edge-Weighting

from arXiv: Data Structures and Algorithms

Authors: Shubhada Aute, Fahad Panolan, Geevarghese Philip

Motivated by the landmark resolution of the 1-2-3 Conjecture, we initiate the study of the parameterized complexity of the Vertex-Coloring {0,1}-Edge-Weighting problem and its generalization, Vertex-Coloring Pre-edge-Weighting, under various structural parameters. The base problem, Vertex-Coloring {0,1}-Edge-Weighting, asks whether we can assign a weight from {0,1} to each edge of a graph. The goal is to ensure that for every pair of adjacent vertices, the sums of their incident edge weights are distinct. In the Vertex-Coloring Pre-edge-Weighting variant, we are given a graph where a subset of edges is already assigned fixed weights from {0,1}. The goal is to determine if this partial weighting can be extended to all remaining edges such that the final, complete assignment satisfies the proper vertex coloring property. While the existence of such weightings is well-understood for specific graph classes, their algorithmic complexity under structural parameterization has remained unexplored. We prove both hardness and tractability for the problem, across a hierarchy of structural parameters. We show that both the base problem and the Pre-edge-Weighting variant are W[1]-hard when parameterized by the size of a feedback vertex set of the input graph. On the positive side, we establish that the base problem and a restricted Pre-edge-Weighting variant where the pre-assigned weights are all 1, become FPT when parameterized by the size of a vertex cover of the input graph. Further, we show that both the base problem and the Pre-edge-Weighting variant have XP algorithms when parameterized by the treewidth of the input graph.

Dequantizing Short-Path Quantum Algorithms

from arXiv: Data Structures and Algorithms

Authors: François Le Gall, Suguru Tamaki

The short-path quantum algorithm introduced by Hastings (Quantum 2018, 2019) is a variant of adiabatic quantum algorithms that enables an easier worst-case analysis by avoiding the need to control the spectral gap along a long adiabatic path. Dalzell, Pancotti, Campbell, and Brandão (STOC 2023) recently revisited this framework and obtained a clear analysis of the complexity of the short-path algorithm for several classes of constraint satisfaction problems (MAX-$k$-CSPs), leading to quantum algorithms with complexity $2^{(1-c)n/2}$ for some constant $c>0$. This suggested a super-quadratic quantum advantage over classical algorithms. In this work, we identify an explicit classical mechanism underlying a substantial part of this line of work, and show that it leads to clean dequantizations. As a consequence, we obtain classical algorithms that run in time $2^{(1-c')n}$, for some constant $c'>c$, for the same classes of constraint satisfaction problems. This shows that current short-path quantum algorithms for these problems do not achieve a super-quadratic advantage. On the positive side, our results provide a new ``quantum-inspired'' approach to designing classical algorithms for important classes of constraint satisfaction problems.

Asymptotically faster algorithms for recognizing $(k,\ell)$-sparse graphs

from arXiv: Data Structures and Algorithms

Authors: Bence Deák, Péter Madarasi

The family of $(k,\ell)$-sparse graphs, introduced by Lorea, plays a central role in combinatorial optimization and has a wide range of applications, particularly in rigidity theory. A key algorithmic problem is to decide whether a given graph is $(k,\ell)$-sparse and, if not, to produce a vertex set certifying the failure of sparsity. While pebble game algorithms have long yielded $O(n^2)$-time recognition throughout the classical range $0 \leq \ell < 2k$, and $O(n^3)$-time algorithms in the extended range $2k \leq \ell < 3k$, substantially faster bounds were previously known only in a few special cases. We present new recognition algorithms for the parameter ranges $0 \le \ell \le k$, $k < \ell < 2k$, and $2k \leq \ell < 3k$. Our approach combines bounded-indegree orientations, reductions to rooted arc-connectivity, augmenting-path techniques, and a divide-and-conquer method based on centroid decomposition. This yields the first subquadratic, and in fact near-linear-time, recognition algorithms throughout the classical range when instantiated with the fastest currently available subroutines. Under purely combinatorial implementations, the running times become $O(n\sqrt n)$ for $0 \leq \ell \leq k$ and $O(n\sqrt{n\log n})$ for $k< \ell <2k$. For $2k \leq \ell < 3k$, we obtain an $O(n^2)$-time algorithm when $\ell \leq 2k+1$ and an $O(n^2\log n)$-time algorithm otherwise. In each case, the algorithm can also return an explicit violating set certifying that the input graph is not $(k,\ell)$-sparse.

Efficiency of Proportional Mechanisms in Online Auto-Bidding Advertising

from arXiv: Data Structures and Algorithms

Authors: Nguyen Kim Thang

The rise of automated bidding strategies in online advertising presents new challenges in designing and analyzing efficient auction mechanisms. In this paper, we focus on proportional mechanisms within the context of auto-bidding and study the efficiency of pure Nash equilibria, specifically the price of anarchy (PoA), under the liquid welfare objective. We first establish a tight PoA bound of 2 for the standard proportional mechanism. Next, we introduce a modified version with an alternative payment scheme that achieves a PoA bound of $1 + \frac{O(1)}{n-1}$ where $n \geq 2$ denotes the number of bidding agents. This improvement surpasses the existing PoA barrier of 2 and approaches full efficiency as the number of agents increases. Our methodology leverages duality and the Karush-Kuhn-Tucker (KKT) conditions from linear and convex programming. Despite its conceptual simplicity, our approach proves powerful and may offer broader applications for establishing PoA bounds.
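For readers unfamiliar with the mechanism, a minimal sketch of a standard proportional mechanism for one divisible item follows. Agents receive a share proportional to their bids and, in the pay-your-bid variant shown here, pay what they bid; the paper's modified payment scheme is not reproduced, and all names are ours:

```python
def proportional_allocation(bids):
    """Toy proportional mechanism for a single divisible item:
    agent i receives fraction b_i / sum(b) of the item and pays
    its own bid (standard variant). Illustrative sketch only."""
    total = sum(bids)
    if total == 0:
        return [0.0] * len(bids), [0.0] * len(bids)
    shares = [b / total for b in bids]
    payments = list(bids)  # pay-your-bid payment scheme
    return shares, payments

# Two agents bidding 1.0 and 3.0 split the item 25% / 75%.
shares, payments = proportional_allocation([1.0, 3.0])
```

The price of anarchy studied in the paper compares the liquid welfare of an equilibrium of such a mechanism against the optimal allocation.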

Longest Common Extension of a Dynamic String in Parallel Constant Time

from arXiv: Data Structures and Algorithms

Authors: Daniel Albert

A longest common extension (LCE) query on a string computes the length of the longest common suffix or prefix at two given positions. A dynamic LCE algorithm maintains a data structure that allows efficient LCE queries on a string that can change via character insertions and deletions. A dynamic parallel constant-time algorithm is presented that can maintain LCE queries on a common CRCW PRAM with $\mathcal{O}(n^ε)$ work, for any $ε> 0$. The algorithm maintains a string synchronizing sets hierarchy, which it uses to answer substring equality queries, which it in turn uses to answer LCE queries. To achieve constant runtime, the algorithm allows parts of its information to become outdated by up to $\log n \log^* n$ updates. It answers queries by combining this slightly outdated information with a list of the recent changes. Two applications of this dynamic LCE algorithm are shown. Firstly, a dynamic parallel constant-time algorithm can maintain membership in a Dyck language $D_k, k > 0$ with $\mathcal{O}(n^ε)$ work for any $ε> 0$. Secondly, a dynamic parallel constant-time algorithm can maintain squares with $\mathcal{O}(n^ε)$ work for any $ε> 0$.
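A static LCE query is easy to state; the following naive linear scan (our own sketch, making no attempt at the paper's dynamic parallel constant-time maintenance) shows what the query computes:

```python
def lce(s, i, j):
    """Longest common extension: length of the longest common prefix
    of the suffixes s[i:] and s[j:]. Naive O(n) scan; the paper
    maintains such queries under insertions and deletions instead."""
    n = len(s)
    length = 0
    while i + length < n and j + length < n and s[i + length] == s[j + length]:
        length += 1
    return length

# In "abcabcx", the suffixes at positions 0 and 3 share "abc".
extension = lce("abcabcx", 0, 3)
```

The dynamic algorithm in the paper answers the same query via substring equality tests over a string synchronizing sets hierarchy.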

Sorting under Partial Information with Optimal Preprocessing Time via Unified Bound Heaps

from arXiv: Data Structures and Algorithms

Authors: Daniel Rutschmann

In 1972, Fredman proposes the problem of sorting under partial information: preprocess a directed acyclic graph $G$ with vertex set $X$ so that you can sort $X$ in $O(\log e(G))$ time, where $e(G)$ is the number of sorted orders compatible with $G$. Cardinal, Fiorini, Joret, Jungers and Munro [STOC'10] show that you can preprocess $G$ in $O(n^{2.5})$ time and then sort $X$ in $O(\log e(G) + n)$ time and $O(\log e(G))$ comparisons. Recent work of van der Hoog and Rutschmann [FOCS'24] implies an algorithm with $O(n^ω)$ preprocessing time where $ω< 2.372$ and $O(\log e(G))$ sorting time. Haeupler, Hladík, Iacono, Rozhoň, Tarjan and Tětek [SODA'25] achieve an overall running time of $O(\log e(G) + m)$. In this paper, we achieve tight bounds for this problem: $O(m)$ preprocessing time and $O(\log e(G))$ sorting time. As a key ingredient, we design a new fast heap data structure that might be of independent theoretical interest.
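To make the quantity $e(G)$ concrete, here is a brute-force count of the sorted orders compatible with a DAG (our illustrative sketch; the paper's algorithms never enumerate these, since $e(G)$ can be exponential):

```python
from itertools import permutations

def linear_extensions(n, edges):
    """Count e(G): the number of orderings of 0..n-1 in which u
    precedes v for every DAG edge (u, v). Brute force over all n!
    permutations, so only usable for tiny n."""
    count = 0
    for order in permutations(range(n)):
        pos = {v: i for i, v in enumerate(order)}
        if all(pos[u] < pos[v] for u, v in edges):
            count += 1
    return count

# One known relation 0 < 1 on three elements leaves 3 of the 6 orders.
compatible = linear_extensions(3, [(0, 1)])
```

The sorting bound $O(\log e(G))$ is then the information-theoretic number of comparisons needed to single out one of these compatible orders.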

Robust Graph Isomorphism, Quadratic Assignment and VC Dimension

from arXiv: Data Structures and Algorithms

Authors: Anatole Dahan, Martin Grohe, Daniel Neuen, Tomáš Novotný

We present an additive $\varepsilon n^{2}$-approximation algorithm for the Graph Edit Distance problem (GED) on graphs of VC dimension $d$ running in time $n^{O(d/\varepsilon^{2})}$. In particular, this recovers a previous result by Arora, Frieze, and Kaplan [Math. Program. 2002] who gave an $\varepsilon n^{2}$-approximation running in time $n^{O(\log n/\varepsilon^{2})}$. Similar to the work of Arora et al., we extend our results to arbitrary Quadratic Assignment problems (QAPs) by introducing a notion of VC dimension for QAP instances, and giving an $\varepsilon n^{2}$-approximation for QAPs with bounded weights running in time $n^{O(\varepsilon^{-2}(d + \log\varepsilon^{-1}))}$. As a particularly interesting special case, we further study the problem $\varepsilon$-$\mathsf{GI}$, which entails determining if two graphs $G,H$ over $n$ vertices are isomorphic, when promised that if they are not, their graph edit distance is at least $\varepsilon n^{2}$. We show that the standard Weisfeiler--Leman algorithm of dimension $O(\varepsilon^{-1}d\log(\varepsilon^{-1}))$ solves this problem on graphs of VC dimension $d$. We also show that dimension $O(\varepsilon^{-1}\log n)$ suffices on arbitrary $n$-vertex graphs, while $k$-WL fails on instances at distance $Ω(n^{2}/k)$.
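The abstract's results build on the Weisfeiler--Leman algorithm; a minimal sketch of its 1-dimensional version (color refinement) is below. This is the classical routine, not the higher-dimensional $k$-WL analyzed in the paper, and the names are ours:

```python
def wl_colors(adj, rounds=3):
    """1-dimensional Weisfeiler--Leman (color refinement) sketch.
    adj maps each vertex to its neighbor list. Colors start uniform
    and are refined by the multiset of neighbor colors. Different
    final color histograms certify non-isomorphism; equal histograms
    are inconclusive."""
    colors = {v: 0 for v in adj}
    for _ in range(rounds):
        signatures = {v: (colors[v], tuple(sorted(colors[u] for u in adj[v])))
                      for v in adj}
        # Relabel distinct signatures with small integers for the next round.
        palette = {sig: i for i, sig in enumerate(sorted(set(signatures.values())))}
        colors = {v: palette[signatures[v]] for v in adj}
    return sorted(colors.values())

path = {0: [1], 1: [0, 2], 2: [1]}             # path on 3 vertices
triangle = {0: [1, 2], 1: [0, 2], 2: [0, 1]}   # cycle on 3 vertices
# The two histograms differ, so the graphs are not isomorphic.
```

The paper shows that on graphs of bounded VC dimension, a suitable dimension of this refinement already solves the robust isomorphism promise problem.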

Submodular Max-Min Allocation under Identical Valuations

from arXiv: Data Structures and Algorithms

Authors: Kimon Boehmer

In the problem of Submodular Max-Min Allocation, we are given a set of items, a set of players, and monotone submodular valuation functions that represent the satisfaction of a player with a certain subset of items. The goal is to find an allocation of the items to the players that maximizes the lowest satisfaction among all players. We study this problem in the special case where all players have the same valuation function. We devise a greedy algorithm which gives a $0.4$-approximation, improving the previously best factor of $\frac{10}{27} \approx 0.37$ by Uziahu and Feige. Furthermore, we study the integrality gap of the \emph{configuration LP} when players have identical valuations. By constructing a variable assignment to the dual from a primal integral solution, we give the first constant upper bound on the integrality gap for submodular valuations. Generalizing the result to the case where players' allocations must be independent in $k$ given matroids, we derive a $\mathcal{O}(k)$-estimation algorithm for max-min allocation subject to $k$ matroid constraints under identical valuations.
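A naive baseline for max-min allocation under one shared valuation is to repeatedly hand the next item to the currently poorest player. This is our illustrative sketch, not the paper's 0.4-approximation greedy, and the valuation shown is additive (a special case of monotone submodular):

```python
def greedy_max_min(items, players, value):
    """Greedy baseline for max-min allocation with an identical
    monotone valuation value(bundle): give each item, in order, to
    the player whose current bundle is worth the least."""
    bundles = [set() for _ in range(players)]
    for item in items:
        poorest = min(range(players), key=lambda p: value(bundles[p]))
        bundles[poorest].add(item)
    return bundles

# Identical additive valuation on item weights 5, 4, 3, 2, 1.
value = lambda bundle: sum(bundle)
bundles = greedy_max_min([5, 4, 3, 2, 1], 2, value)
```

Even this baseline already illustrates the objective: the quality of an allocation is the value of its poorest bundle.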

Fully Dynamic Breadth First Search and Spanning Trees in Directed Graphs

from arXiv: Data Structures and Algorithms

Authors: Gregory Morse, Tamás Kozsik

We study the problem of maintaining a breadth-first spanning tree and the induced BFS ordering in a directed graph under edge updates. While semi-dynamic algorithms are known, maintaining the spanning tree, level information, and numbering together in the fully dynamic setting is less developed. This preprint presents a framework for fully dynamic BFS in directed graphs together with supporting data structures for maintaining the BFS tree, single-source shortest paths, and single-source reachability under both insertions and deletions.
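The recompute-from-scratch baseline that any fully dynamic BFS algorithm must beat is a single directed BFS after each update; a minimal sketch (our own, not the paper's data structures):

```python
from collections import deque

def bfs_levels(adj, source):
    """Recompute BFS levels in a directed graph from scratch.
    A fully dynamic algorithm aims to update this level information
    (and the induced spanning tree) faster than rerunning this
    after every edge insertion or deletion."""
    level = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for u in adj.get(v, []):
            if u not in level:
                level[u] = level[v] + 1
                queue.append(u)
    return level

# Diamond digraph: 0 -> {1, 2} -> 3.
levels = bfs_levels({0: [1, 2], 1: [3], 2: [3], 3: []}, 0)
```

The vertices reachable from the source are exactly the keys of the returned map, which is why the same structure also supports single-source reachability queries.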

A Residual-Shell-Based Lower Bound for Ollivier-Ricci Curvature

from arXiv: Data Structures and Algorithms

Authors: Xiang Gu, Huichun Zhang, Jian Sun

Ollivier-Ricci curvature (ORC), defined via the Wasserstein distance that captures rich geometric information, has received growing attention in both theory and applications. However, the high computational cost of Wasserstein distance evaluation has significantly limited the broader practical use of ORC. To alleviate this issue, previous work introduced a computationally efficient lower bound as a proxy for ORC based on 1-hop random walks, but this approach empirically exhibits large gaps from the exact ORC. In this paper, we establish a substantially tighter lower bound for ORC than the existing lower bound, while retaining much lower computational cost than exact ORC computation, with practical speedups of tens of times. Moreover, our bound is not restricted to 1-hop random walks, but also applies to k-hop random walks (k > 1). Experiments on several fundamental graph structures demonstrate the effectiveness of our bound in terms of both approximation accuracy and computational efficiency.

Phylogenetic Inference under the Balanced Minimum Evolution Criterion via Semidefinite Programming

from arXiv: Data Structures and Algorithms

Authors: P. Skums

In this study, we investigate the application of Semidefinite Programming (SDP) to phylogenetics. SDP is a powerful optimization framework that seeks to optimize a linear objective function over the cone of positive semidefinite matrices. As a convex optimization problem, SDP generalizes linear programming and provides tight relaxations for many combinatorial optimization problems. However, despite its many applications, SDP remains largely unused in computational biology. We argue that SDP relaxations are particularly well suited for phylogenetic inference. As a proof of concept, we focus on the Balanced Minimum Evolution (BME) problem, a widely used model in distance-based phylogenetics. We propose an algorithm combining an SDP relaxation with a rounding scheme that iteratively converts relaxed solutions into valid tree topologies. Experiments on simulated and empirical datasets show that the method enables accurate phylogenetic reconstruction. The approach is sufficiently general to be extendable to other phylogenetic problems.

Constant-Factor Approximation for the Uniform Decision Tree

from arXiv: Data Structures and Algorithms

Authors: Michał Szyfelbein

We resolve a long-standing open question, about the existence of a constant-factor approximation algorithm for the average-case \textsc{Decision Tree} problem with uniform probability distribution over the hypotheses. We answer the question in the affirmative by providing a simple polynomial-time algorithm with approximation ratio of $\frac{2}{1-\sqrt{(e+1)/(2e)}}+ε<11.57$. This improves upon the currently best-known, greedy algorithm which achieves $O(\log n/{\log\log n})$-approximation. The first key ingredient in our analysis is the usage of a decomposition technique known from problems related to \textsc{Hierarchical Clustering} [SODA '17, WALCOM '26], which allows us to decompose the optimal decision tree into a series of objects called separating subfamilies. The second crucial idea is to reduce the subproblem of finding a \textsc{Separating Subfamily} to an instance of the \textsc{Maximum Coverage} problem. To do so, we analyze the properties of cutting cliques into small pieces, which represent pairs of hypotheses to be separated. This allows us to obtain a good approximation for the \textsc{Separating Subfamily} problem, which then enables the design of the approximation algorithm for the original problem.

Sampling Colorings Close to the Maximum Degree: Non-Markovian Coupling and Local Uniformity

from arXiv: Data Structures and Algorithms

Authors: Vishesh Jain, Clayton Mizgerd, Eric Vigoda

Sampling graph colorings via local Markov chains is a central problem in approximate counting and Markov chain Monte Carlo (MCMC). We address the problem of sampling a random $k$-coloring of a graph with maximum degree $Δ$. The simplest algorithmic approach is to establish rapid mixing of the single-site update chain known as the Metropolis Glauber dynamics, which at each step chooses a random vertex $v$ and proposes a random color $c$, recoloring $v$ to $c$ if the resulting coloring remains proper. It is a long-standing open problem to prove that the Glauber dynamics has polynomial mixing time on all graphs whenever $k\geqΔ+2$. We prove that for every $δ>0$ and all $Δ\geq Δ_0(δ)$, if $k\ge (1+δ)Δ$ then the Glauber dynamics has optimal mixing time of $O_δ(|V| \log |V|)$ on any graph of girth $\geq 11$ and maximum degree $Δ$. Our approach builds on a non-Markovian coupling introduced by Hayes and Vigoda (2003) for the large-degree regime $Δ=Ω(\log n)$, in which updates at time $t$ may depend on and modify proposed updates at future times. A complete analysis of this framework requires resolving substantial technical obstacles that remain in the original argument, and extending it to the constant-degree regime introduces further difficulties, since non-Markovian updates may fail with constant probability. We overcome these obstacles by developing and analyzing a refined local non-Markovian coupling, and by establishing new local-uniformity results for the Metropolis dynamics, extending prior results for the heat-bath chain due to Hayes (2013). Together, these ingredients provide a complete analysis of the non-Markovian coupling framework in the large-degree regime, while simultaneously strengthening it substantially to obtain optimal mixing all the way down to the constant-degree setting.
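The single-site chain described in the abstract is short enough to write down directly. The following sketch implements exactly the step the abstract describes (random vertex, random color, recolor if still proper); the graph, seed, and parameter choices are ours:

```python
import random

def metropolis_glauber_step(adj, coloring, k, rng):
    """One step of the Metropolis Glauber dynamics from the abstract:
    choose a uniformly random vertex v and color c in {0,...,k-1},
    and recolor v to c iff no neighbor of v already has color c."""
    v = rng.choice(list(adj))
    c = rng.randrange(k)
    if all(coloring[u] != c for u in adj[v]):
        coloring[v] = c
    return coloring

def is_proper(adj, coloring):
    return all(coloring[u] != coloring[v] for v in adj for u in adj[v])

rng = random.Random(0)
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1]}  # triangle, max degree 2
coloring = {0: 0, 1: 1, 2: 2}            # k = 4 colors available
for _ in range(100):
    metropolis_glauber_step(adj, coloring, 4, rng)
```

By construction the chain never leaves the set of proper colorings; the paper's contribution is bounding how quickly it mixes when $k \ge (1+\delta)\Delta$.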

Tuesday, April 14

TCS+ talk: Wednesday, April 22 — Yichuan Wang, UC Berkeley

from TCS+ Seminar Series

The next TCS+ talk will take place this coming Wednesday, April 22nd at 1:00 PM Eastern Time (10:00 AM Pacific Time, 19:00 Central European Time, 17:00 UTC). Yichuan Wang from UC Berkeley will speak about “Superquadratic Lower Bounds for Depth-2 Linear Threshold Circuits” (abstract below).

You can reserve a spot as an individual or a group to join us live by signing up on the online form. Registration is not required to attend the interactive talk, and the link will be posted on the website the day prior to the talk; however, by registering in the form, you will receive a reminder along with the link. (The recorded talk will also be posted on our website afterwards.) As usual, for more information about the TCS+ online seminar series and the upcoming talks, or to suggest a possible topic or speaker, please see the website.

Abstract: Proving lower bounds against depth-2 linear threshold circuits (a.k.a. THR ◦ THR) is one of the frontier questions in complexity theory. Despite tremendous effort, our best lower bounds for THR ◦ THR only hold for sub-quadratic number of gates, which was proven a decade ago by Tamaki (ECCC TR16) and Alman, Chan, and Williams (FOCS 2016) for a hard function in E^NP.

In this work, we prove that there is a function f in E^NP that requires n^{2.5-\varepsilon}-size THR ◦ THR circuits for any \varepsilon > 0. We obtain our new results by designing a new non-trivial algorithm for estimating the acceptance probability of an XOR of two n^{2.5-\varepsilon}-size THR ◦ THR circuits, and apply Williams’ algorithmic method to obtain the desired lower bound.

By plustcs

Guest Post from Peter Brass, Former NSF Theory Director, on the NSF budget.

from Computational Complexity

Guest post from Peter Brass, former NSF Theory Director (though not affiliated with the NSF now), on the White House NSF budget for FY 2027.

---------------------------------------------

Dear Colleagues

A week ago the White House released the NSF budget request for FY 2027 (here), so I want to provide an update. As always, this is just my interpretation; I am no longer connected to the NSF.

The general news is bad: we get a replay of FY 2026 (FY 2026 runs from October 2025 to September 2026).

For context: the NSF enacted budget in FY 2023 was $9.5B, of which the CISE budget was $0.9B. Budgets start with a request by the White House and are then negotiated with Congress.

A year ago, the White House made an FY 2026 budget request of $3.9B, a roughly 60% cut from the previous level. After negotiation with Congress, the enacted budget became $8.7B.

However, a long stretch of FY 2025 was funded by continuing resolutions, and continuing resolutions are based on the White House budget request, so for the first half of FY 2025 the 60% cut was in effect, and awards are only now picking up. In addition, there was the government shutdown, and the removal of NSF from the Eisenhower Ave building in Alexandria, so processing will be delayed.

The cancellation of awards, although it caused a huge media echo, had a very small impact outside the EDU and SBE directorates: about 2% of awards were affected, and perhaps a third of those were restored.

The FY 2027 budget request asks for $4.8B, but after subtracting the $0.9B reserved for a new Antarctic research vessel, the request is the same $3.9B. We hope that Congress will again restore much of it, but clearly the White House remains unsupportive of basic research. The proposed cuts for the NSF are a larger percentage than those for NASA, NIH, NIST, or DoE research. The request reduces the CISE budget from $0.9B to $0.3B.

The NSF projects a slight decrease in the number of proposals and a huge decrease in the funding rate: across all research proposals, a decline from 18% to 7%. The average award size and award duration are projected to increase slightly, giving far fewer people a bit more resources.

So far the NSF has not done any hiring since the start of the current administration, and although Congress restored much of the research funding, it did not restore the cuts in staffing. Rotators continue to end their rotations and are not replaced; the current plan seems to be to discontinue the rotator concept entirely. With the reduction in program staff, the remaining program directors are overworked and are put in charge of programs where they have no previous connection to the research community, without time to establish one. The programs themselves remain unchanged, as they are governed by the solicitations, but the people managing them can give less time to each individual program. And the money available depends on the outcome of the budget process.

That is the current situation, or at least the plan of the White House; we will see what Congress makes of it.

To do: tell your member of Congress that you thank them for their previous support of basic research at the NSF and hope they continue it.


By gasarch

 Guest post from Peter Brass, Former NSF Theory director (though not affiliated with the NSF now) on the White House NSF budget for FY 2027.

---------------------------------------------

Dear Colleagues

A week ago the White House released the NSF budget request for FY 2027( here) so I want to provide you an update. As always, this is just my interpretation, I am not connected to the NSF any more

The general news is bad, we get a replay of FY 2026 (FY 2026 is October 2025 to September 2026)

For context: the NSF enacted budget in FY 2023 was $9.5B. The CISE budget was $0.9B. Budgets start with a request by the White House, and then get negotiated with Congress.

A year ago, the  White House made a FY 2026 budget request of $3.9B, which was a ~60% cut from the previous level. After negotiation with Congress, the enacted budget became $8.7B

However, a long period of FY 2025 were funded by continuing resolutions, and continuing  resolutions are based on the White House budget request, so for the first half of FY 2025 the  60% cut was in effect, and awards only now pick up. In addition, there was the government shutdown, and the removal of NSF from the Eisenhower Ave building in Alexandria, so processing will be delayed.

The cancellation of awards, although it caused a huge media echo, had a very small impact outside the EDU and SBE directorates, about 2% of awards were affected, and perhaps 1/3 of them were restored.

The FY 2027 budget request asks for $4.8B, but subtracting $0.9B reserved for a new antarctic research vessel, the request is the same $3.9B.  We hope that Congress will again restore much, but clearly the White House continues not supportive of basic research. The proposed cuts for NSF are a larger percentage than for NASA, NIH, NIST, or the DoE Research.  The request reduces the CISE budget from $0.9B to $0.3B.

The NSF projects a slight decrease in the number of proposals, and a huge decrease in the funding rate, across all research proposals a decline from 18% to 7%. The average award size and award duration are projected to increase slightly, giving much fewer people a bit more resources.

So far the NSF has done no hiring since the start of the current administration, and although Congress restored much of the research funding, it did not restore the cuts in staffing levels. Rotators continue to end their rotations and are not replaced; the current plan seems to be to discontinue the rotator concept entirely.
With the reduction in program staff, the remaining program directors are overworked and are put in charge of programs where they have no previous connection to the research community, with no time to establish one. The programs themselves remain unchanged, as they are governed by the solicitations, but the people managing them can devote less time to each individual program. And the money available depends on the outcome of the budget process.

That is the current situation, or at least the plan of the White House; we will see what Congress makes of it.

TO DO: Tell your member of Congress that you thank them for their previous support of basic research at the NSF and hope they will continue it.



By gasarch

Complexity Theory meets Ordinary Differential Equations

from arXiv: Computational Complexity

Authors: Adalbert Fono, Noah Wedlich, Holger Boche, Gitta Kutyniok

This contribution investigates the computational complexity of simulating linear ordinary differential equations (ODEs) on digital computers. We provide an exact characterization of the complexity blowup for a class of ODEs of arbitrary order based on their algebraic properties, extending a previous characterization of first-order ODEs. Complexity blowup indeed arises in most ODEs (except for certain degenerate cases) and means that there exists a low-complexity input signal, which can be generated on a Turing machine in polynomial time, leading to a corresponding high-complexity output signal of the system, in the sense that the computation time for determining an approximation up to $n$ significant digits grows faster than any polynomial in $n$. Similarly, we derive an analogous blowup criterion for a subclass of first-order systems of linear ODEs. Finally, we discuss the implications for the simulation of analog systems governed by ODEs and exemplarily apply our framework to a simple model of neuronal dynamics -- the leaky integrate-and-fire neuron -- heavily employed in neuroscience.
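
The leaky integrate-and-fire neuron mentioned in the abstract is easy to discretize. As a purely illustrative sketch (parameter values are made up, and forward Euler is just the simplest choice, not the paper's method), here is what a digital simulation of it looks like:

```python
def simulate_lif(i_input, v_rest=0.0, v_thresh=1.0, tau=20.0, dt=0.1, steps=1000):
    """Forward-Euler simulation of a leaky integrate-and-fire neuron.

    dV/dt = (-(V - v_rest) + i_input) / tau; V resets to v_rest at threshold.
    Returns the membrane trace and the spike times (in steps).
    """
    v = v_rest
    trace, spikes = [], []
    for t in range(steps):
        v += dt * (-(v - v_rest) + i_input) / tau
        if v >= v_thresh:
            spikes.append(t)
            v = v_rest  # reset after a spike
        trace.append(v)
    return trace, spikes

# Super-threshold constant input produces periodic spiking.
trace, spikes = simulate_lif(i_input=1.5)
```

The paper's point is that even for such simple linear dynamics, approximating the output to $n$ significant digits can be superpolynomially expensive for some polynomial-time-computable inputs.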

The Borsuk number of a graph

from arXiv: Computational Geometry

Authors: José Cáceres, Delia Garijo, Alberto Márquez, Rodrigo I. Silveira

The Borsuk problem asks for the smallest number of subsets with strictly smaller diameters into which any bounded set in the $d$-dimensional space can be decomposed. It is a classical problem in combinatorial geometry that has been subject of much attention over the years, and research on variants of the problem continues nowadays in a plethora of directions. In this work, we propose a formulation of the problem in the context of graphs. Depending on how the graph is partitioned, we consider two different settings dealing either with the usual notion of diameter in abstract graphs, or with the diameter in the context of continuous graphs, where all points along the edges, instead of only the vertices, must be taken into account when computing distances. We present complexity results, exact computations and upper bounds on the parameters associated to the problem.
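
For intuition on the graph-theoretic setting, the diameter that the partition criterion compares against can be computed by BFS from every vertex. A minimal sketch (not taken from the paper) for abstract unweighted graphs:

```python
from collections import deque

def graph_diameter(adj):
    """Diameter of a connected unweighted graph given as {v: set(neighbors)}."""
    def ecc(src):
        dist = {src: 0}
        q = deque([src])
        while q:
            u = q.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    q.append(w)
        return max(dist.values())
    return max(ecc(v) for v in adj)

# A 5-cycle has diameter 2; any single part of a Borsuk-style partition
# would need diameter strictly less than this.
cycle5 = {i: {(i - 1) % 5, (i + 1) % 5} for i in range(5)}
```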

Any 3D Scene is Worth 1K Tokens: 3D-Grounded Representation for Scene Generation at Scale

from arXiv: Computational Geometry

Authors: Dongxu Wei, Qi Xu, Zhiqi Li, Hangning Zhou, Cong Qiu, Hailong Qin, Mu Yang, Zhaopeng Cui, Peidong Liu

3D scene generation has long been dominated by 2D multi-view or video diffusion models. This is due not only to the lack of scene-level 3D latent representation, but also to the fact that most scene-level 3D visual data exists in the form of multi-view images or videos, which are naturally compatible with 2D diffusion architectures. Typically, these 2D-based approaches degrade 3D spatial extrapolation to 2D temporal extension, which introduces two fundamental issues: (i) representing 3D scenes via 2D views leads to significant representation redundancy, and (ii) latent space rooted in 2D inherently limits the spatial consistency of the generated 3D scenes. In this paper, we propose, for the first time, to perform 3D scene generation directly within an implicit 3D latent space to address these limitations. First, we repurpose frozen 2D representation encoders to construct our 3D Representation Autoencoder (3DRAE), which grounds view-coupled 2D semantic representations into a view-decoupled 3D latent representation. This enables representing 3D scenes observed from arbitrary numbers of views--at any resolution and aspect ratio--with fixed complexity and rich semantics. Then we introduce 3D Diffusion Transformer (3DDiT), which performs diffusion modeling in this 3D latent space, achieving remarkably efficient and spatially consistent 3D scene generation while supporting diverse conditioning configurations. Moreover, since our approach directly generates a 3D scene representation, it can be decoded to images and optional point maps along arbitrary camera trajectories without requiring per-trajectory diffusion sampling pass, which is common in 2D-based approaches.

A Ray Intersection Algorithm for Fast Growth Distance Computation Between Convex Sets

from arXiv: Computational Geometry

Authors: Akshay Thirugnanam, Koushil Sreenath

In this paper, we discuss an efficient algorithm for computing the growth distance between two compact convex sets with representable support functions. The growth distance between two sets is the minimum scaling factor such that the sets intersect when scaled about some center points. Unlike the minimum distance between sets, the growth distance provides a unified measure for set intersection and separation. We first reduce the growth distance problem to an equivalent ray intersection problem on the Minkowski difference set. Then, we propose an algorithm to solve the ray intersection problem by iteratively constructing inner and outer polyhedral approximations of the Minkowski difference set. We show that our algorithm satisfies several key properties, such as primal and dual feasibility and monotone convergence. We provide extensive benchmark results for our algorithm and show that our open-source implementation achieves state-of-the-art performance across a wide variety of convex sets. Finally, we demonstrate robotics applications of our algorithm in motion planning and rigid-body simulation.
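
For two disks scaled about their own centers, the growth distance has a closed form, which makes it a handy sanity check for the general definition. The toy bisection below (my own illustration, not the paper's ray-intersection algorithm) finds the minimum scaling at which the scaled sets meet:

```python
import math

def disks_intersect(c1, r1, c2, r2, sigma):
    """Do the disks, scaled by sigma about their own centers, intersect?"""
    return sigma * (r1 + r2) >= math.dist(c1, c2)

def growth_distance(c1, r1, c2, r2, lo=0.0, hi=100.0, iters=60):
    """Bisection for the minimum scaling at which the scaled disks meet.

    Toy illustration of the growth-distance definition for two disks, whose
    answer is d / (r1 + r2); the paper's algorithm handles general convex
    sets via their support functions.
    """
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if disks_intersect(c1, r1, c2, r2, mid):
            hi = mid
        else:
            lo = mid
    return hi

# Unit disks with centers 4 apart: growth distance 4 / (1 + 1) = 2 > 1 (separated).
sigma = growth_distance((0.0, 0.0), 1.0, (4.0, 0.0), 1.0)
```

A value below 1 indicates the original sets already intersect, which is the unified intersection/separation measure the abstract refers to.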

Topo-ADV: Generating Topology-Driven Imperceptible Adversarial Point Clouds

from arXiv: Computational Geometry

Authors: Gayathry Chandramana Krishnan Nampoothiry, Raghuram Venkatapuram, Anirban Ghosh, Ayan Dutta

Deep neural networks for 3D point cloud understanding have achieved remarkable success in object classification and recognition, yet recent work shows that these models remain highly vulnerable to adversarial perturbations. Existing 3D attacks predominantly manipulate geometric properties such as point locations, curvature, or surface structure, implicitly assuming that preserving global shape fidelity preserves semantic content. In this work, we challenge this assumption and introduce the first topology-driven adversarial attack for point cloud deep learning. Our key insight is that the homological structure of a 3D object constitutes a previously unexplored vulnerability surface. We propose Topo-ADV, an end-to-end differentiable framework that incorporates persistent homology as an explicit optimization objective, enabling gradient-based manipulation of topological features during adversarial example generation. By embedding persistence diagrams through differentiable topological representations, our method jointly optimizes (i) a topology divergence loss that alters persistence, (ii) a misclassification objective, and (iii) geometric imperceptibility constraints that preserve visual plausibility. Experiments demonstrate that subtle topology-driven perturbations consistently achieve up to 100% attack success rates on benchmark datasets such as ModelNet40, ShapeNet Part, and ScanObjectNN using PointNet and DGCNN classifiers, while remaining geometrically indistinguishable from the original point clouds, beating state-of-the-art methods on various perceptibility metrics.

Differentially Private Verification of Distribution Properties

from arXiv: Data Structures and Algorithms

Authors: Elbert Du, Cynthia Dwork, Pranay Tankala, Linjun Zhang

A recent line of work initiated by Chiesa and Gur and further developed by Herman and Rothblum investigates the sample and communication complexity of verifying properties of distributions with the assistance of a powerful, knowledgeable, but untrusted prover. In this work, we initiate the study of differentially private (DP) distribution property testing. After all, if we do not trust the prover to help us with verification, why should we trust it with our sensitive sample? We map a landscape of DP prover-aided proofs of properties of distributions. In the non-private case it is known that one-round (two message) private-coin protocols can have substantially lower complexity than public-coin AM protocols, but in the private case, the possibility for improvement depends on the parameter regime and privacy model. Drawing on connections to replicability and techniques for amplification, we show: (1) There exists a reduction from any one-round $(\varepsilon,δ)$-DP private-coin interactive proof to a one-round public-coin DP interactive proof with the same privacy parameters, for the parameter regime $\varepsilon=O(1/\sqrt{n})$ and $δ=O(1/n^{5/2})$, and with the same sample and communication complexities. (2) If the verifier's message in the private-coin interactive proof is $O(1/\sqrt{\log n})$ locally DP -- a far more relaxed privacy parameter regime in a different model -- then applying one additional transformation again yields a one-round public-coin protocol with the same privacy bound and the same sample and computational complexities. (3) However, when the privacy guarantee is very relaxed ($\varepsilon\inΩ(\log n)$), private coins indeed reduce complexity. We also obtain a Merlin-Arthur (one-message) proof for privately testing whether samples are drawn from a product distribution, and prove that its sample complexity is optimal.

Near Optimal Algorithms for Noisy $k$-XOR under Low-Degree Heuristic

from arXiv: Data Structures and Algorithms

Authors: Songtao Mao

Noisy $k$-XOR is a basic average-case inference problem in which one observes random noisy $k$-ary parity constraints and seeks to recover, or more weakly, detect, a hidden Boolean assignment. A central question is to characterize the tradeoff among sample complexity, noise level, and running time. We give a recovery algorithm, and hence also a detection algorithm, for noisy $k$-XOR in the high-noise regime. For every parameter $D$, our algorithm runs in time $n^{D+O(1)}$ and succeeds whenever $$ m \ge C_k \frac{n^{k/2}}{D^{\,k/2-1}δ^2}, $$ where $C_k$ is an explicit constant depending only on $k$, and $δ$ is the noise bias. Our result matches the best previously known time--sample tradeoff for detection, while simultaneously yielding recovery guarantees. In addition, the dependence on the noise bias $δ$ is optimal up to constant factors, matching the information-theoretic scaling. We also prove matching low-degree lower bounds. In particular, we show that the degree-$D$ low-degree likelihood ratio has bounded $L^2$-norm below the same threshold, up to the same factor $D^{k/2-1}$. Under the low-degree heuristic, this implies that our algorithm is near-optimal over a broad range of parameters. Our approach combines a refined second-moment analysis with color coding and dynamic programming for structured hypergraph embedding statistics. These techniques may be of independent interest for other average-case inference problems.
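
To see the role of the bias $δ$ concretely, here is a hedged sketch (function and parameter names are mine) of sampling a noisy $k$-XOR instance: each constraint's label equals the true parity of the hidden $\pm 1$ assignment with probability $(1+δ)/2$:

```python
import random

def noisy_kxor_instance(n, m, k, delta, seed=0):
    """Sample m noisy k-XOR constraints over a hidden assignment x in {-1,+1}^n.

    Each constraint is a random k-subset S labeled with prod(x[i] for i in S),
    flipped with probability (1 - delta) / 2, so labels agree with the true
    parity with probability (1 + delta) / 2.
    """
    rng = random.Random(seed)
    x = [rng.choice((-1, 1)) for _ in range(n)]
    constraints = []
    for _ in range(m):
        S = rng.sample(range(n), k)
        parity = 1
        for i in S:
            parity *= x[i]
        noise = 1 if rng.random() < (1 + delta) / 2 else -1
        constraints.append((S, parity * noise))
    return x, constraints

x, cons = noisy_kxor_instance(n=60, m=2000, k=3, delta=0.4)
```

Such a generator is useful for checking the $δ^2$ scaling in the sample-complexity bound empirically.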

Algorithms for Standard-form ILP Problems via Komlós' Discrepancy Setting

from arXiv: Data Structures and Algorithms

Authors: Dmitry Gribanov, Tagir Khayaleev, Mikhail Cherniavskii, Maxim Klimenko, Dmitry Malyshev, Stanislav Moiseev

We study the standard-form ILP problem $\max\{ c^\top x \colon A x = b,\; x \in Z_{\geq 0}^n \}$, where $A\in Z^{k\times n}$ has full row rank. We obtain refined FPT algorithms parameterized by $k$ and $Δ$, the maximum absolute value of a $k\times k$ minor of $A$. Our approach combines discrepancy-based dynamic programming with matrix discrepancy bounds in Komlós' setting. Let $κ_k$ denote the maximum discrepancy over all matrices with $k$ columns whose columns have Euclidean norm at most $1$. Up to polynomial factors in the input size, the optimization problem can be solved in time $O(κ_k)^{2k}Δ^2$, and the corresponding feasibility problem in time $O(κ_k)^kΔ$. Using the best currently known bound $κ_k=\widetilde O(\log^{1/4}k)$, this yields running times $O(\log k)^{\frac{k}{2}(1+o(1))}Δ^2$ and $O(\log k)^{\frac{k}{4}(1+o(1))}Δ$, respectively. Under the Komlós conjecture, the dependence on $k$ in both running times reduces to $2^{O(k)}$.

Scalable Exact Hierarchical Agglomerative Clustering via Sparse Geographic Distance Graphs

from arXiv: Data Structures and Algorithms

Authors: Victor Maus, Vinicius Pozzobon Borin

Exact hierarchical agglomerative clustering (HAC) of large spatial datasets is limited in practice by the $\mathcal{O}(n^2)$ time and memory required for the full pairwise distance matrix. We present GSHAC (Geographically Sparse Hierarchical Agglomerative Clustering), a system that makes exact HAC feasible at scales of millions of geographic features on a commodity workstation. GSHAC replaces the distance matrix with a sparse geographic distance graph containing only pairs within a user-specified geodesic bound~$h_{\max}$, constructed in $\mathcal{O}(n \cdot k)$ time via spatial indexing, where~$k$ is the mean number of neighbors within~$h_{\max}$. Connected components of this graph define independent subproblems, and we prove that the resulting assignments are exact for all standard linkage methods at any cut height $h \le h_{\max}$. For single linkage, an MST-based path keeps memory at $\mathcal{O}(n_k + m_k)$ per component. Applied to a global mining inventory ($n = 261{,}073$), the system completes in 12\,s (109\,MiB peak HAC memory) versus $\approx 545$\,GiB for the dense baseline. On a 2-million-point GeoNames sample, all tested thresholds completed in under 3\,minutes with peak memory under 3\,GiB. We provide a scikit-learn-compatible implementation for direct integration into GIS workflows.
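
The key decomposition, that connected components of the threshold graph can be clustered independently for any cut height $h \le h_{\max}$, can be illustrated with a toy union-find. This sketch checks all $O(n^2)$ pairs, whereas GSHAC uses a spatial index to avoid that cost:

```python
import math

def distance_components(points, h_max):
    """Group points into connected components of the h_max-threshold graph.

    Each component is an independent clustering subproblem, exact for any
    linkage cut height h <= h_max (the paper's decomposition guarantee).
    Toy version: brute-force O(n^2) pair checks with Euclidean distance,
    whereas GSHAC uses a spatial index and geodesic distances.
    """
    parent = list(range(len(points)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if math.dist(points[i], points[j]) <= h_max:
                parent[find(i)] = find(j)

    comps = {}
    for i in range(len(points)):
        comps.setdefault(find(i), []).append(i)
    return list(comps.values())

# Two well-separated clumps split into two independent subproblems.
pts = [(0, 0), (0.5, 0), (0, 0.5), (10, 10), (10.5, 10)]
comps = distance_components(pts, h_max=1.0)
```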

Maximum Independent Sets in Disk Graphs with Disks in Convex Position

from arXiv: Data Structures and Algorithms

Authors: Anastasiia Tkachenko, Haitao Wang

For a set $\mathcal{D}$ of disks in the plane, its disk graph $G(\mathcal{D})$ is the graph with vertex set $\mathcal{D}$, where two vertices are adjacent if and only if the corresponding disks intersect. Given a set $\mathcal{D}$ of $n$ weighted disks, computing a maximum independent set of $G(\mathcal{D})$ is NP-hard. In this paper, we present an $O(n^3\log n)$-time algorithm for this problem in a special setting in which the disks are in convex position, meaning that every disk appears on the convex hull of $\mathcal{D}$. This setting has been studied previously for disks of equal radius, for which an $O(n^{37/11})$-time algorithm was known. Our algorithm also works in the weighted case where disks have weights and the goal is to compute a maximum-weight independent set. As an application of our result, we obtain an $O(n^3\log^2 n)$-time algorithm for the dispersion problem on a set of $n$ disks in convex position: given an integer $k$, compute a subset of $k$ disks that maximizes the minimum pairwise distance among all disks in the subset.
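
As a correctness reference for tiny inputs (and nothing more; the paper's contribution is the polynomial-time algorithm for disks in convex position), a brute-force maximum-weight independent set can be written directly from the adjacency rule: two disks intersect iff their center distance is at most the sum of their radii.

```python
import math
from itertools import combinations

def max_weight_independent_set(disks, weights):
    """Brute-force maximum-weight independent set in a disk graph.

    disks: list of (center, radius). Exponential time -- only a reference
    implementation for checking small instances.
    """
    n = len(disks)

    def independent(S):
        return all(
            math.dist(disks[i][0], disks[j][0]) > disks[i][1] + disks[j][1]
            for i, j in combinations(S, 2)
        )

    best, best_w = (), 0.0
    for r in range(n + 1):
        for S in combinations(range(n), r):
            w = sum(weights[i] for i in S)
            if w > best_w and independent(S):
                best, best_w = S, w
    return best, best_w

# Three unit disks on a line: the outer two are independent, the middle hits both.
disks = [((0, 0), 1), ((1.5, 0), 1), ((3, 0), 1)]
sel, w = max_weight_independent_set(disks, [1.0, 1.0, 1.0])
```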

Universality of first-order methods on random and deterministic matrices

from arXiv: Data Structures and Algorithms

Authors: Nicola Gorini, Chris Jones, Dmitriy Kunisky, Lucas Pesenti

General first-order methods (GFOM) are a flexible class of iterative algorithms which update a state vector by matrix-vector multiplications and entrywise nonlinearities. A long line of work has sought to understand the large-n dynamics of GFOM, mostly focusing on "very random" input matrices and the approximate message passing (AMP) special case of GFOM whose state is asymptotically Gaussian. Yet, it has long remained unknown how to construct iterative algorithms that retain this Gaussianity for more structured inputs, or why existing AMP algorithms can be as effective for some deterministic matrices as they are for random matrices. We analyze diagrammatic expansions of GFOM via the limiting traffic distribution of the input matrix, the collection of all limiting values of permutation-invariant polynomials in the matrix entries, to obtain the following results: 1. We calculate the traffic distribution for the first non-trivial deterministic matrices, including (minor variants of) the Walsh-Hadamard and discrete sine and cosine transform matrices. This determines the limiting dynamics of GFOM on these inputs, resolving parts of longstanding conjectures of Marinari, Parisi, and Ritort (1994). 2. We design a new AMP iteration which unifies several previous AMP variants and generalizes to new input types, whose limiting dynamics are Gaussian conditional on some latent random variables. The asymptotic dynamics hold for a large and natural class of traffic distributions (encompassing both random and deterministic input matrices) and the algorithm's analysis gives a simple combinatorial interpretation of the Onsager correction, answering questions posed recently by Wang, Zhong, and Fan (2022).
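
The GFOM iteration class itself is simple to state in code: each step is a matrix-vector product followed by an entrywise nonlinearity. The toy template below captures only the bare iteration (the algorithms in the paper also carry memory of past iterates and Onsager-style corrections):

```python
import math

def gfom(A, x0, nonlinearities, iters):
    """Generic first-order method skeleton: x_{t+1} = f_t(A x_t), f_t entrywise.

    Pure-Python toy of the iteration class; A is a square list-of-lists matrix
    and nonlinearities is a list of scalar functions cycled over the steps.
    """
    x = list(x0)
    n = len(x)
    for t in range(iters):
        y = [sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
        f = nonlinearities[t % len(nonlinearities)]
        x = [f(v) for v in y]
    return x

# A symmetric 2x2 matrix with a tanh nonlinearity: iterates contract toward 0.
A = [[0.0, 1.0], [1.0, 0.0]]
x = gfom(A, [1.0, 0.5], [math.tanh], iters=50)
```

The paper's question is when the large-$n$ dynamics of such iterates remain (conditionally) Gaussian for structured, even deterministic, choices of $A$.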

Faster Approximate Linear Matroid Intersection

from arXiv: Data Structures and Algorithms

Authors: Tatsuya Terao

We consider a fast approximation algorithm for the linear matroid intersection problem. In this problem, we are given two $r \times n$ matrices $M_1$ and $M_2$, and the objective is to find a largest set of columns that are linearly independent in both $M_1$ and $M_2$. We design a $(1 - \varepsilon)$-approximation algorithm with time complexity $\tilde{O}_{\varepsilon}(\mathrm{nnz}(M_1) + \mathrm{nnz}(M_2) + r_{*}^ω)$, where $\mathrm{nnz}(M_i)$ denotes the number of nonzero entries in $M_i$ for $i = 1, 2$, $r_{*}$ denotes the maximum size of a common independent set, and $ω< 2.372$ denotes the matrix multiplication exponent. Our approximation algorithm is faster than the exact algorithm by Harvey [FOCS'06 & SICOMP'09] and Cheung--Kwok--Lau [STOC'12 & JACM'13], which runs in $\tilde{O}(\mathrm{nnz}(M_1) + \mathrm{nnz}(M_2) + n r_{*}^{ω- 1})$ time. We also develop a fast $(1 - \varepsilon)$-approximation algorithm for the weighted version of the linear matroid intersection problem. In fact, we design a $(1 - \varepsilon)$-approximation algorithm for weighted linear matroid intersection with time complexity $\tilde{O}_{\varepsilon}(\mathrm{nnz}(M_1) + \mathrm{nnz}(M_2) + r_{*}^ω)$. Our algorithm improves upon the $(1 - \varepsilon)$-approximation algorithm by Huang--Kakimura--Kamiyama [SODA'16 & Math. Program.'19], which runs in $\tilde{O}_{\varepsilon}(\mathrm{nnz}(M_1) + \mathrm{nnz}(M_2) + nr_{*}^{ω- 1})$ time. To obtain these results, we combine Quanrud's adaptive sparsification framework [ICALP'24] with a simple yet effective method for efficiently checking whether a given vector lies in the linear span of a subset of vectors, which is of independent interest.
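The subroutine named at the end of the abstract, checking whether a given vector lies in the linear span of a subset of vectors, can be illustrated with incremental Gaussian elimination over the rationals. This is only a toy sketch under my own naming (`make_span_checker` is hypothetical), not the paper's method, which is engineered to achieve the stated running times:

```python
from fractions import Fraction

def make_span_checker():
    """Maintain a row-reduced rational basis; test membership incrementally."""
    basis = []  # reduced vectors, each with a distinct leading (pivot) index

    def _reduce(v):
        v = [Fraction(x) for x in v]
        for b in basis:
            p = next(i for i, x in enumerate(b) if x != 0)  # pivot of b
            if v[p] != 0:
                c = v[p] / b[p]
                v = [vi - c * bi for vi, bi in zip(v, b)]
        return v

    def in_span(v):
        """Is v in the span of the vectors added so far?"""
        return all(x == 0 for x in _reduce(v))

    def add(v):
        """Add v to the spanning set; return True iff it was independent."""
        r = _reduce(v)
        if any(x != 0 for x in r):
            basis.append(r)
            return True
        return False

    return in_span, add

in_span, add = make_span_checker()
add([1, 0, 1])
add([0, 1, 1])
print(in_span([1, 1, 2]))  # True: the sum of the two added vectors
print(in_span([0, 0, 1]))  # False
```

Each added vector is reduced against all previous pivots before joining the basis, so a later membership query reduces to the zero vector exactly when the query lies in the span.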

GPU Acceleration of Sparse Fully Homomorphic Encrypted DNNs

from arXiv: Data Structures and Algorithms

Authors: Lara D'Agata, Carlos Agulló-Domingo, Óscar Vera-López, Kaustubh Shivdikar, Ardhi W. B. Yudha, Ferhat Yaman, David Kaeli, José L. Abellán, Ian Colbert, José Cano

Fully homomorphic encryption (FHE) has recently attracted significant attention as both a cryptographic primitive and a systems challenge. Given the latest advances in accelerated computing, FHE presents a promising opportunity for progress, with applications ranging from machine learning to information security. We target the most computationally intensive operation in deep neural networks from a hardware perspective, matrix multiplication (matmul), and adapt it for execution on AMD GPUs. We propose a new optimized method that improves the runtime and complexity of ciphertext matmul by using FIDESlib, a recent open-source FHE library designed specifically for GPUs. By exploiting sparsity in both operands, our sparse matmul implementation outperforms its CPU counterpart by up to $3.0\times$ and reduces the time complexity from cubic to semi-linear, demonstrating an improvement over existing FHE matmul implementations.
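To make the sparsity argument concrete, here is a toy plaintext sketch of a matrix product that skips zero entries in both operands, so work scales with matching nonzeros rather than cubically. This is illustrative only (plain Python on unencrypted data, helper name mine); the paper operates on ciphertexts via FIDESlib on AMD GPUs:

```python
def sparse_matmul(A, B):
    """Multiply sparse matrices stored as {row: {col: value}} dicts.
    Work is proportional to pairs of matching nonzeros, not n^3."""
    C = {}
    for i, row in A.items():
        for k, a in row.items():          # nonzeros of A only
            brow = B.get(k)
            if not brow:
                continue
            out = C.setdefault(i, {})
            for j, b in brow.items():     # nonzeros of B only
                out[j] = out.get(j, 0) + a * b
    return C

A = {0: {1: 2}, 2: {0: 3}}  # nonzeros of a 3x3 matrix
B = {0: {2: 5}, 1: {0: 7}}
print(sparse_matmul(A, B))  # {0: {0: 14}, 2: {2: 15}}
```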

Limited Perfect Monotonical Surrogates constructed using low-cost recursive linkage discovery with guaranteed output

from arXiv: Data Structures and Algorithms

Authors: M. W. Przewozniczek, F. Chicano, R. Tinós, M. M. Komarnicki

Surrogates provide a cheap solution evaluation and offer significant leverage for optimizing computationally expensive problems. Usually, surrogates only approximate the original function. Recently, perfect linear surrogates were proposed that ideally represent the original function. These surrogates do not mimic the original function. In fact, they are another (correct) representation of it and enable a wide range of possibilities, e.g., discovering the optimized function for problems where the direct transformation of the encoded solution into its evaluation is not available. However, many real-world problems cannot be represented by linear models, making the aforementioned surrogates inapplicable. Therefore, we propose the Limited Monotonical Perfect Surrogate (LyMPuS), which overcomes this difficulty and enables the comparison of two solutions that differ by a single variable. Our proposition is suitable for limiting the cost of expensive local search procedures. The proposed surrogate is parameterless and can be trained on the fly without any separate surrogate-building step. It uses only the necessary fitness evaluations, and the already-paid costs are not wasted when the model is updated. Finally, it offers low-cost missing-linkage detection and low-cost linkage discovery, guaranteed to find a missing dependency in no more than $2\lceil\log_2(n)\rceil$ steps.

Min-Sum Set Cover on Parallel Machines

from arXiv: Data Structures and Algorithms

Authors: Michał Szyfelbein

Consider the classical \textsc{Min-Sum Set Cover} problem: We are given a universe $\mathcal{U}$ of $n$ elements and a collection $\mathcal{S}$ of $k$ subsets of $\mathcal{U}$. Moreover, a cost function is associated with each set. The goal is to find a subsequence of sets in $\mathcal{S}$ that covers all elements in $\mathcal{U}$, such that the sum of the covering times of the elements is minimized. The covering time of an element $u$ is the cost of all sets that appear in the sequence before $u$ is first covered. This problem can be seen as a scheduling problem on a single machine, where each job represents a set and elements are represented by some kind of utility that is required to be provided by at least one of the jobs. The goal is to schedule the jobs to minimize the sum of provision times of the utilities. In this paper we consider a generalization of this problem to the case of $m$ machines, processing the jobs in parallel. We call this problem \textsc{Parallel Min-Sum Set Cover}. To obtain approximation algorithms for both related and unrelated machines, we use a crucial subproblem which we call \textsc{Parallel Maximum Coverage}. We give a randomized bicriteria $(1-1/e-ε, O(\log m/\log\log m))$-approximation algorithm for this problem based on a natural LP relaxation. This can then be used to obtain an $O(\log m/\log\log m)$-approximation algorithm for the \textsc{Min-Sum Set Cover} problem on unrelated machines. For related machines, we allow the aforementioned bicriteria approximation algorithm to run in FPT time, and apply a technique that transforms a related-machines instance into one consisting of $O(\log m)$ unrelated machines, to obtain an $\frac{8e}{e+1}+ε<12.66$-approximation algorithm for this case. We also show a greedy algorithm for unit cost sets, subject to precedence constraints, with an approximation ratio of $O(k^{2/3})$.
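For intuition about the objective, here is a minimal sketch (toy code of mine, not the paper's LP-based parallel algorithm) of the classical single-machine, unit-cost greedy: repeatedly pick the set covering the most uncovered elements, charging each element the time at which it is first covered:

```python
def min_sum_set_cover_greedy(universe, sets):
    """Greedy for single-machine, unit-cost Min-Sum Set Cover:
    repeatedly pick the set covering the most uncovered elements.
    Returns the picked order and the sum of covering times."""
    uncovered = set(universe)
    order, total, time = [], 0, 0
    while uncovered:
        best = max(range(len(sets)), key=lambda i: len(sets[i] & uncovered))
        gained = sets[best] & uncovered
        if not gained:
            raise ValueError("sets do not cover the universe")
        time += 1                    # unit cost per set
        total += time * len(gained)  # covering time of each new element
        uncovered -= gained
        order.append(best)
    return order, total

sets = [{1, 2, 3}, {3, 4}, {4, 5}]
order, cost = min_sum_set_cover_greedy({1, 2, 3, 4, 5}, sets)
print(order, cost)  # [0, 2] 7: elements 1,2,3 covered at time 1; 4,5 at time 2
```

This greedy is a classical constant-factor approximation in the unit-cost single-machine setting; the parallel and weighted variants in the paper require the LP machinery described above.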

Wavelet Forests Revisited

from arXiv: Data Structures and Algorithms

Authors: Eric Chiu, Dominik Kempa

Rank and select queries are basic operations on sequences, with applications in compressed text indexes and other space-efficient data structures. One of the standard data structures supporting these queries is the wavelet tree. In this paper, we study wavelet forests, that is, wavelet-tree structures based on the fixed-block compression boosting technique. Such structures partition the input sequence into fixed-size blocks and build a separate wavelet tree for each block. Previous work showed that this approach yields strong practical performance for rank queries. We extend wavelet forests to support select queries. We show that select support can be added with little additional space overhead and that the resulting structures remain practically efficient. In experiments on a range of non-repetitive and repetitive inputs, wavelet forests are competitive with, and in most cases outperform, standalone wavelet-tree implementations. We also study the effect of internal parameters, including superblock size and navigational data, on select-query performance.
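As a rough illustration of the underlying structure, here is a minimal pointer-based wavelet tree supporting rank queries. This is a toy sketch in plain Python: it stores bit lists and scans them, whereas practical implementations (including the wavelet forests studied here, which build one tree per fixed-size block) use succinct bit vectors with constant-time rank/select support:

```python
class WaveletTree:
    """Minimal pointer-based wavelet tree answering rank(symbol, i):
    the number of occurrences of `symbol` in seq[:i]."""

    def __init__(self, seq, alphabet=None):
        self.alpha = sorted(set(seq)) if alphabet is None else alphabet
        if len(self.alpha) <= 1:
            self.bits = None  # leaf: a run of one symbol
            return
        mid = len(self.alpha) // 2
        left_syms = set(self.alpha[:mid])
        # One bit per position: 0 = symbol in left half of alphabet
        self.bits = [0 if c in left_syms else 1 for c in seq]
        self.left = WaveletTree([c for c in seq if c in left_syms],
                                self.alpha[:mid])
        self.right = WaveletTree([c for c in seq if c not in left_syms],
                                 self.alpha[mid:])

    def rank(self, symbol, i):
        if self.bits is None:
            return i  # every position in a leaf holds the leaf's symbol
        mid = len(self.alpha) // 2
        bit = 0 if symbol in self.alpha[:mid] else 1
        # Rank of `bit` in bits[:i] (O(i) scan here; succinct structures do O(1))
        r = sum(1 for b in self.bits[:i] if b == bit)
        child = self.left if bit == 0 else self.right
        return child.rank(symbol, r)

wt = WaveletTree("abracadabra")
print(wt.rank("a", 11))  # 5: 'a' occurs five times in "abracadabra"
```

Each query walks one root-to-leaf path, translating the position through per-level bit vectors, which is what makes block-local trees easy to combine into a forest.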

Above-Guarantee Algorithm for Properly Colored Spanning Trees

from arXiv: Data Structures and Algorithms

Authors: Yuhang Bai, Kristóf Bérczi

In the Properly Colored Spanning Tree problem, we are given an edge-colored undirected graph and the goal is to find a spanning tree in which any two adjacent edges have distinct colors. Since finding such a tree is NP-hard in general, previous work often relied on minimum color degree conditions to guarantee the existence of properly colored spanning trees. While it is known that every connected edge-colored graph $G$ contains a properly colored tree of order at least $\min\{|V(G)|, 2δ^c(G)\}$, where $δ^c(G)$ denotes the minimum number of colors incident to a vertex, we study the algorithmic above-guarantee problem for properly colored trees. We provide a polynomial-time algorithm that constructs a properly colored tree of order at least $\min\{|V(G)|, 2δ^c(G)+1\}$ in a connected edge-colored graph $G$, whenever such a tree exists.
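The minimum color degree $δ^c(G)$ driving the guarantee is straightforward to compute; the sketch below (a hypothetical helper of mine, not from the paper) evaluates it on a small example:

```python
def min_color_degree(n, colored_edges):
    """δ^c(G): the minimum over vertices of the number of distinct colors
    on incident edges. The cited bound guarantees a properly colored tree
    of order at least min(n, 2 * δ^c(G))."""
    colors_at = [set() for _ in range(n)]
    for u, v, c in colored_edges:
        colors_at[u].add(c)
        colors_at[v].add(c)
    return min(len(s) for s in colors_at)

# Monochromatic triangle: δ^c = 1, so only a properly colored tree of
# order ≥ 2 (a single edge) is guaranteed, matching what is achievable.
print(min_color_degree(3, [(0, 1, "red"), (1, 2, "red"), (0, 2, "red")]))  # 1
```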

Coarse Balanced Separators in Fat-Minor-Free Graphs

from arXiv: Data Structures and Algorithms

Authors: Édouard Bonnet, Hung Le, Marcin Pilipczuk, Michał Pilipczuk

Fat minors are a coarse analogue of graph minors where the subgraphs modeling vertices and edges of the embedded graph are required to be distant from each other, instead of just being disjoint. In this paper, we give a coarse analogue of the classic theorem that an $n$-vertex graph excluding a fixed minor admits a balanced separator of size $O(\sqrt{n})$. Specifically, we prove that for every integer $d$, real $\varepsilon>0$, and graph $H$, there exist constants $c$ and $r$ such that every $n$-vertex graph $G$ excluding $H$ as a $d$-fat minor admits a set $S \subseteq V(G)$ that is a balanced separator of $G$ and can be covered by $c n^{\frac{1}{2}+\varepsilon}$ balls of radius $r$ in $G$. Our proof also works in the weighted setting where the balance of the separator is measured with respect to any weight function on the vertices, and is effective: we obtain a randomized polynomial-time algorithm to compute either such a balanced separator, or a $d$-fat model of $H$ in $G$.

Computational Generation of Substrate-Specific Molecular Cages

from arXiv: Data Structures and Algorithms

Authors: Noé Demange, Yann Strozecki, Sandrine Vial

In this paper, we propose a method to build molecular cages designed to capture a specific substrate. We model a cage as a graph of atoms with coordinates in space, and several constraints on their edges (degree, length and angle). We use a simple method to place binding patterns which are able to interact with certain parts of the substrate. We then propose an algorithm which considers all possible ways of connecting these binding patterns and tries to construct the smallest possible molecular paths realizing these connections. We investigate many variants of our method in order to obtain the most efficient algorithm, able to build cages of more than a hundred atoms.

Entropic independence via sparse localization

from arXiv: Data Structures and Algorithms

Authors: Vishesh Jain, Huy Tuan Pham, Thuy-Duong Vuong

Entropic independence is a structural property of measures that underlies modern proofs of functional inequalities, notably (modified) log-Sobolev inequalities, via ``annealing'' or local-to-global schemes. Existing sufficient criteria for entropic independence typically require spectral independence and/or uniform bounds on marginals under \emph{all} pinnings, which can fail in natural canonical-ensemble models even when strong mixing properties are expected. We introduce \emph{sparse localization}: a restricted localization framework, in the spirit of Chen--Eldan, in which one assumes $\ell_2$-independence only for a sparse family of pinnings (those fixing at most $cn$ coordinates for any $c > 0$), yet still deduces quadratic entropic stability and entropic independence with an explicit multiplicative loss of order $c^{-1}$. As an application, we give a rigorous proof of approximate conservation of entropy for the uniform distribution on independent sets of a given size in bounded degree graphs.

Query Lower Bounds for Diffusion Sampling

from arXiv: Data Structures and Algorithms

Authors: Zhiyang Xun, Eric Price

Diffusion models generate samples by iteratively querying learned score estimates. A rapidly growing literature focuses on accelerating sampling by minimizing the number of score evaluations, yet the information-theoretic limits of such acceleration remain unclear. In this work, we establish the first score query lower bounds for diffusion sampling. We prove that for $d$-dimensional distributions, given access to score estimates with polynomial accuracy $\varepsilon=d^{-O(1)}$ (in any $L^p$ sense), any sampling algorithm requires $\widetilde{Ω}(\sqrt{d})$ adaptive score queries. In particular, our proof shows that any sampler must search over $\widetilde{Ω}(\sqrt{d})$ distinct noise levels, providing a formal explanation for why multiscale noise schedules are necessary in practice.

New Approximations for Temporal Vertex Cover on Always Star Temporal Graphs

from arXiv: Data Structures and Algorithms

Authors: Sophia Heck, Eleni Akrida

Modern networks are highly dynamic, and temporal graphs capture these changes through discrete edge appearances on a fixed vertex set, known in advance up to the graph's lifetime. The Vertex Cover problem extends to the temporal setting as Temporal Vertex Cover (TVC) and Sliding Window Temporal Vertex Cover (SW-TVC). In TVC, each edge is covered by one endpoint over the lifetime, while in SW-TVC, edges are covered within every $Δ$-step window. In always star temporal graphs, each snapshot is a star with a center that may change at each time step. TVC is NP-complete on always star temporal graphs, but an FPT algorithm parameterized by $Δ$ solves it optimally in $O(TΔ(n+m)\cdot 2^Δ)$ time. This paper presents two polynomial-time approximation algorithms for SW-TVC on always star temporal graphs, achieving $2Δ-1$ and $Δ-1$ approximation ratios with running times $O(T)$ and $O(TmΔ^2)$, respectively. These algorithms provide exact solutions for $Δ=1$ and $Δ\leq 2$, respectively. Additionally, we offer the first implementation and experimental evaluation of state-of-the-art approximation algorithms with $d$ and $d-1$ approximation ratios, where $d$ is the maximum degree of any snapshot. Our experiments on artificially generated always star temporal graphs show that the new approximation algorithms outperform the known $d-1$ approximation in running time, even in some cases where $Δ>d$. We test state-of-the-art algorithms on real-world data and observe that the $d-1$ approximation algorithm outperforms the analytically better $d$ approximation algorithm in running time when implemented as described in the original paper. However, a novel implementation of the $d$ approximation algorithm significantly improves its runtime, surpassing $d-1$ in practice. Nonetheless, the $d-1$ approximation consistently computes smaller solutions.

Optimized Customizable Route Planning in Large Road Networks with Batch Processing

from arXiv: Data Structures and Algorithms

Authors: Muhammad Farhan, Henning Koehler

Modern route planners such as Google Maps and Apple Maps serve millions of users worldwide, optimizing routes in large-scale road networks where fast responses are required under diverse cost metrics including travel time, fuel consumption, and toll costs. Classical algorithms like Dijkstra or A$^*$ are too slow at this scale, and while index-based techniques achieve fast queries, they are often tied to fixed metrics, making them unsuitable for dynamic conditions or user-specific metrics. Customizable approaches address this limitation by separating metric-independent preprocessing and metric-dependent customization, but they remain limited by slower query performance. Notably, Customizable Tree Labeling (CTL) was recently introduced as a promising framework that combines tree labelings with shortcut graphs. The shortcut graph enables efficient customization to different cost metrics, while tree labeling, supported by path arrays, provides fast query answering. Although CTL enables optimizing routes under different cost metrics, it still faces challenges in storing and reconstructing path information efficiently, which hinders its scalability for answering millions of queries. In this article, we build on the Customizable Tree Labeling framework to introduce new optimizations for the storage and reconstruction of path information. We develop several algorithmic variants that differ in the information retained within shortcut graphs and path arrays, offering a spectrum of trade-offs between memory usage and query performance. To further enhance scalability, we propose a batch processing strategy that shares path information across queries to eliminate redundant computation. Empirically, we have evaluated the performance of our algorithms on 13 real-world road networks. The results show that they significantly outperform state-of-the-art methods.

Edge-Tilting Field Dynamics: Rapid Mixing at the Uniqueness Threshold and Optimal Mixing for Swendsen-Wang Dynamics

from arXiv: Data Structures and Algorithms

Authors: Xiaoyu Chen, Zhe Ju, Tianshun Miao, Yitong Yin, Xinyuan Zhang

We prove two results on the mixing times of Markov chains for two-spin systems. First, we show that the Glauber dynamics mixes in polynomial time for the Gibbs distributions of antiferromagnetic two-spin systems at the critical threshold of the uniqueness phase transition of the Gibbs measure on infinite regular trees. This completes the computational phase transition picture for antiferromagnetic two-spin systems, which includes near-linear-time optimal mixing in the uniqueness regime [Chen--Liu--Vigoda, STOC '21; Chen--Feng--Yin--Zhang, FOCS '22], NP-hardness of approximate sampling in the non-uniqueness regime [Sly--Sun, FOCS '12], and polynomial-time mixing at criticality (this work). Second, we prove an optimal $O(\log n)$ mixing time bound as well as an optimal $\Omega(1)$ spectral gap for the Swendsen--Wang dynamics for the ferromagnetic Ising model with an external field on bounded-degree graphs. To the best of our knowledge, these are the first sharp bounds on the mixing rate of this classical global Markov chain beyond mean-field or strong spatial mixing (SSM) regimes, and resolve a conjecture of [Feng--Guo--Wang, IANDC '23]. A key ingredient in both proofs is a new family of localization schemes that extends the field dynamics of [Chen--Feng--Yin--Zhang, FOCS '21] by tilting general edge (or hyperedge) weights rather than vertex fields. This framework, which subsumes the classical Swendsen--Wang dynamics as a special case, extends the localization framework of [Chen--Eldan, FOCS '22] beyond stochastic and field localizations, and enables controlled tilting of interaction strengths while preserving external fields.
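For readers unfamiliar with the single-site chain in the first result: Glauber dynamics resamples one uniformly random vertex from its conditional Gibbs distribution. A minimal sketch for the ferromagnetic Ising model (illustrative only; the function name and graph encoding are ours, and the paper's second result concerns the global Swendsen--Wang chain, not this local one):

```python
import math
import random

def glauber_step(spins, neighbors, beta, field=0.0):
    """One Glauber update for the Ising model.
    spins: list of +/-1 values; neighbors: adjacency lists;
    beta: inverse temperature; field: external field."""
    v = random.randrange(len(spins))
    # local field at v: sum of neighboring spins plus external field
    h = sum(spins[u] for u in neighbors[v]) + field
    # conditional probability that spin_v = +1 given the rest
    p_plus = 1.0 / (1.0 + math.exp(-2.0 * beta * h))
    spins[v] = 1 if random.random() < p_plus else -1
    return spins
```

The mixing time of this chain is the number of such steps needed to approach the Gibbs distribution from a worst-case start; the paper pins this down at the uniqueness threshold.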

Tradeoffs in Privacy, Welfare, and Fairness for Facility Location

from arXiv: Data Structures and Algorithms

Authors: Sara Fish, Yannai A. Gonczarowski, Jason Z. Tang, Salil Vadhan

The differentially private (DP) facility location problem seeks to determine a socially optimal placement for a public facility while ensuring that each participating agent's location remains private. To privatize its input data, a DP mechanism must inject noise into its output distribution, producing a placement that will have lower expected social welfare than the optimal spot for the facility. The privacy-induced welfare loss can be viewed as the "cost of privacy," illustrating a tradeoff between social welfare and privacy that has been the focus of prior work. Yet, the imposition of privacy also induces a third consideration that has not been similarly studied: fairness in how the "cost of privacy" is distributed across individuals. For instance, a mechanism may satisfy DP with minimal social welfare loss, yet still be undesirable if that loss falls entirely on one individual. In this paper, we quantify this new notion of unfairness and design mechanisms for facility location that attempt to simultaneously optimize across privacy, social welfare, and fairness. We first derive an impossibility result, showing that privacy and fairness cannot be simultaneously guaranteed over all possible datasets that could represent the locations of individuals in a population. We then consider a relaxation that still requires worst-case DP, but only seeks fairness and social welfare over smaller, more "realistic-looking" families of datasets. For this relaxation, we construct a DP mechanism and demonstrate that it is simultaneously optimal (or, for a harder family of datasets, near-optimal up to small factors) on fairness and social welfare. This suggests that while there is a tradeoff between privacy and each of social welfare and fairness, there is no additional tradeoff when we consider all three objectives simultaneously, provided that the population data is sufficiently natural.
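The noise injection that creates the "cost of privacy" can be illustrated with the generic Laplace mechanism on a line (our sketch under simplifying assumptions, not the paper's mechanism, and it says nothing about how the welfare loss is distributed across agents):

```python
import random

def dp_facility_1d(locations, epsilon):
    """Place one facility on [0, 1] by releasing the average agent
    location plus Laplace noise calibrated to its sensitivity 1/n.
    Illustrates the welfare/privacy tradeoff only; fairness in who
    bears the resulting loss is the paper's third axis."""
    n = len(locations)
    avg = sum(locations) / n
    scale = 1.0 / (n * epsilon)  # sensitivity of the mean is 1/n
    # difference of two Exp(1) draws is a standard Laplace sample
    noise = scale * (random.expovariate(1.0) - random.expovariate(1.0))
    return min(1.0, max(0.0, avg + noise))
```

Smaller $\varepsilon$ means more noise and hence lower expected social welfare; the paper's point is that this loss can also fall very unevenly on individuals.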

Replicable Composition

from arXiv: Data Structures and Algorithms

Authors: Kiarash Banihashem, MohammadHossein Bateni, Hossein Esfandiari, Samira Goudarzi, MohammadTaghi Hajiaghayi

Replicability requires that algorithmic conclusions remain consistent when rerun on independently drawn data. A central structural question is composition: given $k$ problems each admitting a $\rho$-replicable algorithm with sample complexity $n$, how many samples are needed to solve all jointly while preserving replicability? The naive analysis yields $\widetilde{O}(nk^2)$ samples, and Bun et al. (STOC'23) observed that reductions through differential privacy give an alternative $\widetilde{O}(n^2k)$ bound, leaving open whether the optimal $\widetilde{O}(nk)$ scaling is achievable. We resolve this open problem and, more generally, show that problems with sample complexities $n_1,\ldots,n_k$ can be jointly solved with $\widetilde{O}(\sum_i n_i)$ samples while preserving constant replicability. Our approach converts each replicable algorithm into a perfectly generalizing one, composes them via a privacy-style analysis, and maps back via correlated sampling. This yields the first advanced composition theorem for replicability. En route, we obtain new bounds for the composition of perfectly generalizing algorithms with heterogeneous parameters. As part of our results, we provide a boosting theorem for the success probability of replicable algorithms. For a broad class of problems, the failure probability appears as a separate additive term independent of $\rho$, immediately yielding improved sample complexity bounds for several problems. Finally, we prove an $\Omega(nk^2)$ lower bound for adaptive composition, establishing a quadratic separation from the non-adaptive setting. The key technique, which we call the phantom run, yields structural results of independent interest.
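The notion of a $\rho$-replicable algorithm can be made concrete with the standard shared-randomness rounding trick (our illustration of the definition, not a construction from the paper): an estimator rounded to a randomly shifted grid returns the identical answer on two independent samples whenever both empirical estimates land in the same shifted cell.

```python
import math

def replicable_round(value, grid, shared_offset):
    """Round an estimate down to a grid of width `grid` shifted by
    `shared_offset` (the algorithm's shared internal randomness).
    Two runs with the same offset agree exactly whenever their
    estimates fall in the same shifted cell."""
    return grid * math.floor((value - shared_offset) / grid) + shared_offset
```

Composing $k$ such algorithms naively degrades the replicability parameter, which is why the joint sample complexity question above is subtle.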

Optimal FPT-Approximability for Modular Linear Equations

from arXiv: Data Structures and Algorithms

Authors: Konrad K. Dabrowski, Peter Jonsson, Sebastian Ordyniak, George Osipov, Magnus Wahlström

We show optimal FPT-approximability results for solving almost satisfiable systems of modular linear equations, completing the picture of the parameterized complexity and FPT-approximability landscape for the Min-$r$-Lin$(\mathbb{Z}_m)$ problem for every $r$ and $m$. In Min-$r$-Lin$(\mathbb{Z}_m)$, we are given a system $S$ of linear equations modulo $m$, each on at most $r$ variables, and the goal is to find a subset $Z \subseteq S$ of minimum cardinality such that $S - Z$ is satisfiable. The problem is UGC-hard to approximate within any constant factor for every $r \geq 2$ and $m \geq 2$, which motivates studying it through the lens of parameterized complexity with solution size as the parameter. From previous work (Dabrowski et al. SODA'23/TALG and ESA'25) we know that Min-$r$-Lin$(\mathbb{Z}_m)$ is W[1]-hard to FPT-approximate within any constant factor when $r \geq 3$, and that Min-$2$-Lin$(\mathbb{Z}_m)$ is in FPT when $m$ is prime and W[1]-hard when $m$ has at least two distinct prime factors. The case when $m = p^d$ for some prime $p$ and $d \geq 2$ has remained an open problem. We resolve this problem in this paper and prove the following: (1) We prove that Min-$2$-Lin$(\mathbb{Z}_{p^d})$ is in FPT for every prime $p$ and $d \geq 1$. This implies that Min-$2$-Lin$(\mathbb{Z}_{m})$ can be FPT-approximated within a factor of $\omega(m)$, where $\omega$ is the number of distinct prime factors of $m$. (2) We show that, under the ETH, Min-$2$-Lin$(\mathbb{Z}_m)$ cannot be FPT-approximated within $\omega(m) - \varepsilon$ for any $\varepsilon > 0$. Our main algorithmic contribution is a new technique coined balanced subgraph covering, which generalizes important balanced subgraphs of Dabrowski et al. (SODA'23/TALG) and shadow removal of Marx and Razgon (STOC'11/SICOMP). For the lower bounds, we develop a framework for proving optimality of FPT-approximation factors under the ETH.
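To pin down the Min-$2$-Lin$(\mathbb{Z}_m)$ problem itself, here is an exhaustive baseline (our toy encoding, exponential time; the paper's contribution is an FPT algorithm, which this does not attempt to reproduce):

```python
from itertools import combinations, product

def min_2lin_bruteforce(equations, m, variables):
    """Smallest number of equations to delete so the rest is
    satisfiable over Z_m.  Each equation is a tuple (a, x, b, y, c)
    encoding a*x + b*y = c (mod m)."""
    def satisfiable(eqs):
        for assign in product(range(m), repeat=len(variables)):
            val = dict(zip(variables, assign))
            if all((a * val[x] + b * val[y]) % m == c % m
                   for a, x, b, y, c in eqs):
                return True
        return False
    for k in range(len(equations) + 1):
        for removed in combinations(range(len(equations)), k):
            kept = [e for i, e in enumerate(equations) if i not in removed]
            if satisfiable(kept):
                return k
    return len(equations)
```

The parameterized question is whether this minimum deletion size $k$ can be found (or approximated) in time $f(k) \cdot \mathrm{poly}(|S|)$, which is where the prime-power structure of $m$ becomes decisive.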

On the Approximability of Max-Cut on 3-Colorable Graphs and Graphs with Large Independent Sets

from arXiv: Data Structures and Algorithms

Authors: Suprovat Ghoshal, Neng Huang, Euiwoong Lee, Konstantin Makarychev, Yury Makarychev

Max-Cut is a classical graph-partitioning problem in which, given a graph $G = (V,E)$, the objective is to find a cut $(S,S^c)$ that maximizes the number of edges crossing the cut. In a seminal work, Goemans and Williamson gave an $α_{GW} \approx 0.87856$-factor approximation algorithm for the problem, which was later shown to be tight by the work of Khot, Kindler, Mossel, and O'Donnell. Since then, there has been steady progress in understanding the approximability at even finer levels, and a fundamental goal in this context is to understand how the structure of the underlying graph affects the approximability of the Max-Cut problem. In this work, we investigate this question by exploring how the chromatic structure of a graph affects the Max-Cut problem. While it is well-known that Max-Cut can be solved perfectly and near-perfectly in $2$-colorable and almost $2$-colorable graphs in polynomial time, here we explore its approximability under much weaker structural conditions such as when the graph is $3$-colorable or contains a large independent set. Our main contributions in this context are as follows: 1. We show Max-Cut is $α_{GW}$-hard to approximate for $3$-colorable graphs. 2. We identify a natural threshold $α^*$ such that the following holds. Firstly, for graphs which contain an independent set of size up to $α^*$, Max-Cut continues to be $α_{GW}$-factor hard to approximate. Furthermore, for any graph that contains an independent set of size $> α^*$, there exists an efficient $>α_{GW}$-approximation algorithm for Max-Cut. Our hardness results are derived using various analytical tools and novel variants of the Majority-Is-Stablest theorem, which might be of independent interest. Our algorithmic results are based on a novel SDP relaxation, which is then rounded and analyzed using interval arithmetic.
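The Goemans--Williamson algorithm referenced above solves an SDP to obtain one unit vector per vertex, then cuts with a random hyperplane. The rounding step can be sketched as follows (toy vectors stand in for an SDP solution; this is the classical rounding, not the paper's new relaxation or its interval-arithmetic analysis):

```python
import random

def hyperplane_round(vectors):
    """Goemans--Williamson rounding step: partition vertices by the
    sign of <x_v, r> for a random Gaussian direction r."""
    dim = len(next(iter(vectors.values())))
    r = [random.gauss(0.0, 1.0) for _ in range(dim)]
    return {v: sum(a * b for a, b in zip(x, r)) >= 0.0
            for v, x in vectors.items()}

def cut_value(edges, side):
    """Number of edges crossing the cut (S, S^c)."""
    return sum(1 for u, v in edges if side[u] != side[v])
```

An edge whose endpoint vectors are at angle $\theta$ is cut with probability $\theta/\pi$, which is where the $\alpha_{GW} \approx 0.87856$ guarantee comes from; the paper asks how much better one can do when the graph is promised to be $3$-colorable or to contain a large independent set.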