Last Update

OPML feed of all feeds.

Subscribe to the Atom feed or the RSS feed to stay up to date.

Thank you to arXiv for use of its open access interoperability.

Note: the date of arXiv entries announced right after publication holidays might incorrectly show up as the date of the publication holiday itself. This is due to our ad hoc method of inferring announcement dates, which are not returned by the arXiv API.

Powered by Pluto.

Source on GitHub.

Maintained by Nima Anari, Arnab Bhattacharyya, Gautam Kamath.

Theory of Computing Report

Friday, April 17

Faculty & Research Positions in Artificial Intelligence at Capital Normal University (apply by April 24, 2028)

from CCI: jobs

Located in the center of Beijing, the College of Artificial Intelligence at Capital Normal University invites applications for full-time faculty positions and research positions at all ranks. Submit a single PDF to t116@cnu.edu.cn containing:
1. Curriculum vitae
2. 3–5 publications
3. Research statement
4. Three letters of recommendation (sent directly by recommenders to t116@cnu.edu.cn)

Website: https://aic.cnu.edu.cn/
Email: t116@cnu.edu.cn

By shacharlovett

Structured Uncertainties

from Ben Recht

A brief introduction to the structured singular value and what it teaches us about uncertainty quantification.

This is a live blog of Lecture 9 of my graduate seminar “Feedback, Learning, and Adaptation.” A table of contents is here.

The problem with a fast-paced course is that I keep hitting topics I want to dig into but am forced to move on. Simulation is fascinating, and I need to spend more time with its history and nuance. I guess that just means I’m going to add it to the syllabus of my next graduate course.1

But I wanted to post about one fun thing I learned this week that, while not directly related to simulation, does seem to have some broader lessons. I came to better appreciate John Doyle’s structured singular value, often called by its Greek name “𝜇” or “mu.” Though 𝜇 itself is not always efficiently computable, the lessons it teaches about interconnection and uncertainty are quite general and important.

Let’s go back to the recurring simple feedback loop.

Here, C is the controller we’re designing, and P is the plant we’re trying to steer. I’ve added a new block, Δ, to represent some uncertain system in our feedback loop. When Δ equals zero, we have our nominal system model. The structured singular value asks the following question: if I have a design that works without uncertainty, how much margin for error do I have? How large can Δ be before the system goes unstable?

A standard problem in linear control models Δ as a simple scalar. In that case, because of how the equations work out, the uncertainty question is about the gain margin of the linear system. How much can you amplify or attenuate the plant before the closed-loop system goes unstable? Asked another way, how well do you need to know the amplification factor to guarantee stable operation? Or, let’s say you have a bunch of plants that all have reasonably similar dynamics but different gains. Is your controller good enough for all of them? The uncertainty could model the mass of a flying vehicle or the insulin sensitivity of a person with diabetes. Steady-state control of both of these systems relies on some robustness to uncertainty.

One of the classic results about the linear quadratic regulator is that its stability is maintained for any Δ between -½ and infinity. That’s a good gain margin! Many other control design techniques from classical control using Nyquist plots can also guarantee large gain margins for single-input, single-output systems.
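This classic margin is easy to check numerically in a scalar example. The sketch below is my own illustration, not from the lecture: it solves the scalar Riccati equation for a hypothetical plant xdot = a*x + b*u and verifies that the LQR closed loop stays stable whenever the loop gain is scaled by 1 + Δ with Δ > -½.

```python
import math

# Hypothetical scalar plant xdot = a*x + b*u with cost integral of q*x^2 + r*u^2.
a, b, q, r = 1.0, 1.0, 1.0, 1.0

# Scalar algebraic Riccati equation: (b^2/r)*p^2 - 2*a*p - q = 0, positive root.
p = r * (a + math.sqrt(a**2 + b**2 * q / r)) / b**2
k = b * p / r  # LQR feedback u = -k*x

# Scale the loop gain by (1 + delta); the closed loop xdot = (a - (1+delta)*b*k)*x
# is stable iff a - (1+delta)*b*k < 0. The LQR guarantee covers all delta > -1/2.
for delta in (-0.49, 0.0, 10.0, 1e6):
    assert a - (1 + delta) * b * k < 0

# The guarantee is conservative: for these numbers the loop actually survives
# down to delta = a/(b*k) - 1, which is about -0.59.
assert a - (1 + (-0.55)) * b * k < 0
```

With a = b = q = r = 1, the Riccati root is p = 1 + √2, so k ≈ 2.414 and even halving the gain (Δ = -½) leaves the closed loop stable.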

However, the problem becomes a lot trickier when you need to control a plant with many inputs and outputs. Most control systems are networks of interconnected feedback loops, not just simple single-input, single-output systems. In my favorite control system, the espresso machine, you might have a PID controller for water temperature and another for water pressure. You could calibrate these by tuning each PID parameter, one at a time. But obviously, these two loops interact with each other. They also interact with your grind and your tamping.

An industrial process, a chemical plant, or a robot has a networked control system of far greater complexity. You might be able to write out performance guarantees for each loop in the system, and that might look fine on its face. But if these loops are coupled, your margin calculations might be misleading.

To see why, we can look at a static example, like we did with the feedback amplifier. Imagine we’re trying to get a plant to track a constant reference signal. The controller compares the reference signal with the plant’s output and applies a new input if the difference is large. This signal sets a different set point for each loop. We can compute the steady state of our system by looking at a matrix equation. Indeed, slightly abusing notation, we can think of the steady-state maps C, P, and Δ as matrices (these are the DC responses of each system). The map from the error signal input of the controller, e, to the output of the uncertainty, y, is a system of equations:

y = (I + Δ)PC e

Using the fact that the error is the difference between the reference and the output, e = r - y, we can compute the steady-state output as a function of the reference signal. It’s not the prettiest formula, but you can write it out in closed form and stare at it:

y = (I + (I + Δ)PC)^{-1} (I + Δ)PC r

If the open-loop map from the controller input to plant output — the matrix (I+Δ) PC — is sufficiently large, the output of the plant will be approximately equal to the reference input. However, there’s a catch. We need to know that matrix never has an eigenvalue of -1 for any instantiation of the uncertainty. If it does, then the inverse in the above matrix expression isn’t defined, and the expression blows up in unpleasant ways. We’d say the closed-loop system was unstable.
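As a quick numerical sanity check, the sketch below uses made-up DC matrices (nothing from the lecture): writing T for the open-loop map (I+Δ)PC, it solves the steady-state relation (I + T) y = T r and confirms that a high-gain loop tracks the reference closely, so long as I + T stays invertible.

```python
import numpy as np

# Hypothetical DC (steady-state) maps, chosen only for illustration.
P = np.array([[1.0, 0.2],
              [0.1, 1.0]])          # plant DC gain, lightly coupled channels
C = 100.0 * np.eye(2)               # high-gain controller
Delta = np.array([[0.05, 0.02],
                  [0.00, -0.03]])   # small uncertainty block

T = (np.eye(2) + Delta) @ P @ C     # open-loop map from error e to output y

# From y = T e and e = r - y: (I + T) y = T r.
r = np.array([1.0, -2.0])
y = np.linalg.solve(np.eye(2) + T, T @ r)

# High loop gain => the output approximately tracks the reference.
assert np.allclose(y, r, atol=0.05)

# If I + T were singular (an eigenvalue of T at -1), the solve would blow up;
# that is exactly the instability the structured singular value probes.
assert min(abs(np.linalg.eigvals(T) + 1)) > 1e-6
```

Doubling the controller gain roughly halves the tracking error; shrinking it toward zero makes the output drift away from the reference.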

Hence, we can capture a notion of multivariate robustness by finding the smallest perturbation that makes that matrix singular. The tricky part is that you get different answers based on what sorts of uncertainties you believe are plausible.

Consider the classic gain margin question. For simplicity, define the matrix

T = PC(I + PC)^{-1}

which is the complementary sensitivity of the nominal loop. When Δ is a scalar, you are just looking for the smallest number δ such that

det(I + δT) = 0

This number is precisely equal to the inverse of the magnitude of the largest eigenvalue of T. By contrast, if you think that you can have uncertainty that couples channels of your system together, the size of the uncertainty you can handle is much smaller. Indeed, you can check that if you allow for the uncertainty to be an arbitrary matrix, the norm of the uncertainty has to be smaller than the inverse of the magnitude of the largest singular value of T.

Singular values are always at least as large as the eigenvalue magnitudes. Sometimes, they can be much larger. For instance, if T is upper triangular with both diagonal entries equal to ½ and an off-diagonal entry of 250, then the eigenvalues of T are ½, and the maximum singular value of T is approximately 250. If the uncertainties were just a multiple of the identity, the system would appear very robust to perturbations, handling disturbances with gains up to a magnitude of 2. However, if general matrix uncertainty were allowed, you could only handle disturbances with gains of magnitude at most 0.004.
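The post displays its matrix as an image, but any upper-triangular T with ½ on the diagonal and a large off-diagonal coupling reproduces the stated numbers; taking the off-diagonal entry to be 250 is my assumption. The sketch below checks both margins and exhibits a rank-one perturbation of norm 1/σ_max that actually makes I + ΔT singular.

```python
import numpy as np

# Upper-triangular matrix consistent with the example's numbers (assumed entries).
T = np.array([[0.5, 250.0],
              [0.0, 0.5]])

rho = max(abs(np.linalg.eigvals(T)))   # spectral radius: 0.5
U, s, Vt = np.linalg.svd(T)
sigma = s[0]                           # largest singular value: about 250

assert abs(rho - 0.5) < 1e-4
assert abs(sigma - 250.0) < 1.0

# Scalar uncertainty: tolerate |delta| up to 1/rho = 2.
# Arbitrary matrix uncertainty: only up to 1/sigma, about 0.004.
assert abs(1.0 / rho - 2.0) < 1e-3
assert 1.0 / sigma < 0.005

# A rank-one perturbation of norm exactly 1/sigma destabilizes: with v1, u1 the
# top right/left singular vectors, Delta*T maps v1 to -v1, so I + Delta*T is singular.
Delta = -(1.0 / sigma) * np.outer(Vt[0], U[:, 0])
assert abs(np.linalg.det(np.eye(2) + Delta @ T)) < 1e-9
```

The gap between 1/rho and 1/sigma is exactly the gap between the optimistic scalar margin and the pessimistic full-matrix margin that the structured singular value interpolates between.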

The structured singular value lets you figure out what this magnitude is for whatever plausible model of uncertainty you can construct. Maybe only a subset of the loops is coupled. Maybe Δ has block structure. Each structure gives you a different number in between the spectral radius and the norm of your system’s complementary sensitivity, T. The structured singular value generalizes beyond this simple matrix example to general linear systems. It lets you compute bounds even when the uncertain blocks are themselves structured dynamical systems.

If you’re designing mission-critical linear feedback systems, you should learn all of the details. For everyone else who is stuck with nonlinear systems, there are still lessons to take away. Nonlinearity seldom makes our lives easier! If a robustness problem presents itself when we look at simple linear instances, we shouldn’t just hope that it’s absent from hard nonlinear ones. This is one of the reasons that, in our post-math age of YOLO scaling, it’s useful to learn a little bit of math to be a little bit chastened. Though I suppose if you do it that way, you’ll never make a dime.

1

It is indeed already on there, my friends.

By Ben Recht

The Parameterized Complexity of Coloring Mixed Graphs

from arXiv: Computational Complexity

Authors: Antonio Lauerbach, Konstanty Junosza-Szaniawski, Marie Diana Sieper, Alexander Wolff

A mixed graph contains (undirected) edges as well as (directed) arcs, thus generalizing undirected and directed graphs. A proper coloring c of a mixed graph G assigns a positive integer to each vertex such that c(u) != c(v) for every edge {u,v} and c(u) < c(v) for every arc (u,v).

IQP circuits for 2-Forrelation

from arXiv: Computational Complexity

Authors: Quentin Buzet, André Chailloux

The 2-Forrelation problem provides an optimal separation between classical and quantum query complexity and is also the problem used for separating $\mathsf{BQP}$ and $\mathsf{PH}$ relative to an oracle. A natural question is therefore to ask what are the minimal quantum resources needed to solve this problem. We show that 2-Forrelation can be solved using Instantaneous Quantum Polynomial-time ($\mathsf{IQP}$) circuits, a restricted model of quantum computation in which all gates commute. Concretely, two $\mathsf{IQP}$ circuits with two quantum queries and efficient classical processing suffice. For the signed variant of 2-Forrelation, even a single $\mathsf{IQP}$ circuit and query suffices. This answers a recent open question of Girish (arXiv:2510.06385) on the power of commuting quantum computations. We use this to show that $(\mathsf{BPP}^{\mathsf{IQP}})^O \not\subseteq \mathsf{PH}^O$ relative to an oracle $O$, strengthening the result of Raz and Tal (STOC 2019). Our results show that $\mathsf{IQP}$ circuits can be used for classically hard decision problems, thus providing a new route for showing quantum advantage with $\mathsf{IQP}$ circuits, avoiding the verification difficulties associated with sampling tasks. We also prove Fourier growth bounds for $\mathsf{IQP}$ circuits in terms of the size of their accepting set. The key ingredient is an algebraic identity of the quadratic function $Q(x) = \sum_{i < j} x_ix_j$ that allows extracting inner-product phases within an $\mathsf{IQP}$ circuit.

Explicit Constant-Alphabet Subspace Design Codes

from arXiv: Computational Complexity

Authors: Rohan Goyal, Venkatesan Guruswami, Jun-Ting Hsieh

The subspace design property for additive codes is a higher-dimensional generalization of the minimum distance property. As shown recently by Brakensiek, Chen, Dhar and Zhang, it implies that the code has similar performance as random linear codes with respect to all "local properties". Explicit algebraic codes, such as folded Reed-Solomon and multiplicity codes, are known to have the subspace design property, but they need alphabet sizes that grow as a large polynomial in the block length. Constructing explicit constant-alphabet subspace design codes was subsequently posed as an open question in Brakensiek, Chen, Dhar and Zhang. In this work, we answer their question and give explicit constructions of subspace design codes over constant-sized alphabets, using the expander-based Alon-Edmonds-Luby (AEL) framework. This generalizes the recent work of Jeronimo and Shagrithaya, which showed that such codes share local properties of random linear codes. Our work obtains this consequence in a unified manner via the subspace design property. In addition, our approach yields some improvements in parameters for list-recovery.

Complexity of Fungal Automaton Prediction

from arXiv: Computational Complexity

Authors: Enrico Formenti, Eric Goles, Kévin Perrot, Martín Ríos-Wilson, Domingo Ruiz-Tala

Fungal automata are a nature-inspired computational model in which a rule is applied alternately in the vertical and horizontal directions. In this work we study the computational complexity of predicting the dynamics of all fungal freezing totalistic one-dimensional rules of radius $1$, which exhibit various behaviors. Although most cases are efficiently predictable (with non-deterministic logspace algorithms), a non-linear rule is left open to characterize. We further explore the freezing majority rule (which is totalistic), and prove that at radius $1.5$ it becomes $\mathbf{P}$-complete to predict.

A Hypergraph Container Method on Spread SAT: Approximation and Speedup

from arXiv: Computational Complexity

Authors: Zicheng Han, Yupeng Lin, Jie Ma, Xiande Zhang

We develop a hypergraph container method for the Boolean Satisfiability Problem (SAT) via the newly developed container results [Campos and Samotij (2024)]. This provides an explicit connection between the extent of spread of clauses and the efficiency of container-based algorithms. Informally, the more evenly the clauses are distributed, the stronger the shrinking effect of the containers, which leads to faster algorithms for SAT. To quantify the extent of spread, we use a weighted point of view, in which a clause of size $s$ receives weight $p^s$ for some $0 < p < 1$.

On the Expressive Power and Limitations of Multi-Layer SSMs

from arXiv: Computational Complexity

Authors: Nikola Zubić, Qian Li, Yuyi Wang, Davide Scaramuzza

We study the expressive power and limitations of multi-layer state-space models (SSMs). First, we show that multi-layer SSMs face fundamental limitations in compositional tasks, revealing an inherent gap between SSMs and streaming models. Then, we examine the role of chain-of-thought (CoT), showing that offline CoT does not fundamentally increase the expressiveness, while online CoT can substantially increase its power. Indeed, with online CoT, multi-layer SSMs become equivalent in power to streaming algorithms. Finally, we investigate the tradeoff between width and precision, showing that these resources are not interchangeable in the base model, but admit a clean equivalence once online CoT is allowed. Overall, our results offer a unified perspective on how depth, finite precision, and CoT shape the power and limits of SSMs.

Reverse-Robust Computation with Chemical Reaction Networks

from arXiv: Computational Complexity

Authors: Ravi Kini, David Doty

Chemical reaction networks, or CRNs, are known to stably compute semilinear Boolean-valued predicates and functions, provided that all reactions are irreversible. However, this property does not hold for wet-lab implementations, as all chemical reactions are reversible, even at very slow rates. We study the computational power of CRNs under the reverse-robust computation model, where reactions are permitted to occur either in forward or in reverse up to a cutoff point, after which they may only occur in forward. Our main results show that all semilinear predicates and all semilinear functions can be computed reverse-robustly, and in fact, that existing constructions continue to hold under the reverse-robust computational model. A key tool used to prove correctness under the reverse-robust computation model is invariants: linear (or linear modulo some $m$) combinations of the counts of the species that are preserved by all reactions.

Orthogonal Strip Partitioning of Polygons: Lattice-Theoretic Algorithms and Lower Bounds

from arXiv: Computational Geometry

Authors: Jaehoon Chung

We study a variant of a polygon partition problem, introduced by Chung, Iwama, Liao, and Ahn [ISAAC'25]. Given orthogonal unit vectors $\mathbf{u},\mathbf{v}\in \mathbb{R}^2$ and a polygon $P$ with $n$ vertices, we partition $P$ into connected pieces by cuts parallel to $\mathbf{v}$ such that each resulting subpolygon has width at most one in direction $\mathbf{u}$. We consider the value version, which asks for the minimum number of strips, and the reporting version, which outputs a compact encoding of the cuts in an optimal strip partition. We give efficient algorithms and lower bounds for both versions on three classes of polygons of increasing generality: convex, simple, and self-overlapping. For convex polygons, we solve the value version in $O(\log n)$ time and the reporting version in $O\!\left(h \log\left(1 + \frac{n}{h}\right)\right)$ time, where $h$ is the width of $P$ in direction $\mathbf{u}$. We prove matching lower bounds in the decision-tree model, showing that the reporting algorithm is input-sensitive optimal with respect to $h$. For simple polygons, we present $O(n \log n)$-time, $O(n)$-space algorithms for both versions and prove an $Ω(n)$ lower bound. For self-overlapping polygons, we extend the approach for simple polygons to obtain $O(n \log n)$-time, $O(n)$-space algorithms for both versions, and we prove a matching $Ω(n \log n)$ lower bound in the algebraic computation-tree model via a reduction from the $δ$-closeness problem. Our approach relies on a lattice-theoretic formulation of the problem. We represent strip partitions as antichains of intervals in the Clarke--Cormack--Burkowski lattice, originally developed for minimal-interval semantics in information retrieval. Within this lattice framework, we design a dynamic programming algorithm that uses the lattice operations of meet and join.

On the Doubling Dimension and the Perimeter of Geodesically Convex Sets in Fat Polygons

from arXiv: Computational Geometry

Authors: Mark de Berg, Prosenjit Bose, Leonidas Theocharous

Many algorithmic problems can be solved (almost) as efficiently in metric spaces of bounded doubling dimension as in Euclidean space. Unfortunately, the metric space defined by points in a simple polygon equipped with the geodesic distance does not necessarily have bounded doubling dimension. We therefore study the doubling dimension of fat polygons, for two well-known fatness definitions. We prove that locally-fat simple polygons do not always have bounded doubling dimension, while any $(α,β)$-covered polygon does have bounded doubling dimension (even if it has holes). We also study the perimeter of geodesically convex sets in $(α,β)$-covered polygons (possibly with holes), and show that this perimeter is at most a constant times the Euclidean diameter of the set. Using these two results, we obtain new results for several problems on $(α,β)$-covered polygons, including an algorithm that computes the closest pair of a set of $m$ points in an $(α,β)$-covered polygon with $n$ vertices that runs in $O(n + m\log{n})$ expected time.

Bivariate range functions with superior convergence order

from arXiv: Computational Geometry

Authors: Bingwei Zhang, Thomas Chen, Kai Hormann, Chee Yap

Range functions are a fundamental tool for certified computations in geometric modeling, computer graphics, and robotics, but traditional range functions have only quadratic convergence order ($m=2$). For ``superior'' convergence order (i.e., $m>2$), we exploit the Cornelius--Lohner framework in order to introduce new bivariate range functions based on Taylor, Lagrange, and Hermite interpolation. In particular, we focus on practical range functions with cubic and quartic convergence order. We implemented them in Julia and provide experimental validation of their performance in terms of efficiency and efficacy.

Interactive Exploration of Large-scale Streamlines of Vector Fields via a Curve Segment Neighborhood Graph

from arXiv: Computational Geometry

Authors: Nguyen Phan, Brian Kim, Adeel Zafar, Guoning Chen

Streamlines have been widely used to represent and analyze various steady vector fields. To sufficiently represent important features in complex vector fields (like flow), a large number of streamlines are required. Due to the lack of a rigorous definition of features or patterns in streamlines, user interaction and exploration are required to achieve effective interpretation. Existing approaches based on clustering or pattern search, while valuable for specific analysis tasks, often face challenges in supporting interactive and level-of-detail exploration of large-scale curve-based data, particularly when real-time parameter adjustment and iterative refinement are needed. To address this, we design and implement an interactive web-based system. Our system utilizes a Curve Segment Neighborhood Graph (CSNG) to encode the neighboring relationships between curve segments. CSNG enables us to adapt a fast community detection algorithm to identify coherent flow structures and spatial groupings in the streamlines interactively. CSNG also supports a multi-level exploration through an enhanced force-directed layout. Furthermore, our system integrates an adjacency matrix representation to reveal detailed inter-relations among segments. To achieve real-time performance within a web browser, our system employs matrix compression for memory-efficient CSNG storage and parallel processing. We have applied our system to analyze and interpret complex patterns in several streamline datasets. Our experiments show that we achieve real-time performance on datasets with hundreds of thousands of segments.

Online Algorithms for Geometric Independent Set

from arXiv: Data Structures and Algorithms

Authors: Minati De, Satyam Singh

In the classical online model, the maximum independent set problem admits an $Ω(n)$ lower bound on the competitive ratio even for interval graphs, motivating the study of the problem under additional assumptions. We first study the problem on graphs with a bounded independent kissing number $ζ$, defined as the size of the largest induced star in the graph minus one. We show that a simple greedy algorithm, requiring no geometric representation, achieves a competitive ratio of $ζ$. Moreover, this bound is optimal for deterministic online algorithms and asymptotically optimal for randomized ones. This extends previous results from specific geometric graph families to more general graph classes. Since this bound rules out further improvements through randomization alone, we investigate the power of randomization with access to geometric representation. When the geometric representation of the objects is known, we present randomized online algorithms with improved guarantees. For unit ball graphs in $\mathbb{R}^3$, we present an algorithm whose expected competitive ratio is strictly smaller than the deterministic lower bound implied by the independent kissing number. For $α$-fat objects and for axis-aligned hyper-rectangles in $\mathbb{R}^d$ with bounded diameters, we obtain algorithms with expected competitive ratios that depend polylogarithmically on the ratio between the maximum and minimum object diameters. In both cases, the randomized lower bound implied by the independent kissing number grows polynomially with the ratio between the maximum and minimum object diameters, implying substantial performance guarantees for our algorithms.
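The greedy algorithm from the abstract can be sketched in a few lines; the `adjacent` oracle and the vertex arrival list below are hypothetical interfaces for illustration, not from the paper.

```python
def greedy_online_mis(vertices, adjacent):
    """Online greedy independent set: accept each arriving vertex iff it
    conflicts with no previously accepted vertex. Only an adjacency test
    is needed; no geometric representation of the objects is used."""
    accepted = []
    for v in vertices:
        if all(not adjacent(v, u) for u in accepted):
            accepted.append(v)
    return accepted

# Toy run on the path 0-1-2-3: greedy accepts 0 and 2.
edges = {(0, 1), (1, 2), (2, 3)}
adjacent = lambda u, v: (u, v) in edges or (v, u) in edges
assert greedy_online_mis([0, 1, 2, 3], adjacent) == [0, 2]
```

On a graph with independent kissing number $ζ$, every accepted vertex can block at most $ζ$ optimal vertices, which is where the competitive ratio in the abstract comes from.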

Super-Constant Weight Dicke States in Constant Depth Without Fanout

from arXiv: Data Structures and Algorithms

Authors: Lucas Gretta, Meghal Gupta, Malvika Raj Joshi

An $n$-qubit Dicke state of weight $k$ is the uniform superposition over all $n$-bit strings of Hamming weight $k$. Dicke states are an entanglement resource with important practical applications in the NISQ era and, for instance, play a central role in Decoded Quantum Interferometry (DQI). Furthermore, any symmetric state can be expressed as a superposition of Dicke states. First, we give explicit constant-depth circuits that prepare $n$-qubit Dicke states for all $k \leq \text{polylog}(n)$, using only multi-qubit Toffoli gates and single-qubit unitaries. This gives the first $\text{QAC}^0$ construction of super-constant weight Dicke states. Previous constant-depth constructions for any super-constant $k$ required the FANOUT$_n$ gate, while $\text{QAC}^0$ is only known to implement FANOUT$_k$ for $k$ up to $\text{polylog}(n)$. Moreover, we show that any weight-$k$ Dicke state can be constructed with access to FANOUT$_{\min(k,n-k)}$, rather than FANOUT$_n$. Combined with recent hardness results, this yields a tight characterization: for $k \leq n/2$, weight-$k$ Dicke states can be prepared in $\text{QAC}^0$ if and only if FANOUT$_k \in \text{QAC}^0$. We further extend our techniques to show that, in fact, \emph{any} superposition of $n$-qubit Dicke states of weight at most $k$ can be prepared in $\text{QAC}^0$ with access to FANOUT$_k$. Taking $k = n$, we obtain the first $O(1)$-depth unitary construction for arbitrary symmetric states. In particular, any symmetric state can be prepared in constant depth on quantum hardware architectures that support FANOUT$_n$, such as trapped ions with native global entangling operations.
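To make the definition concrete, here is a classical illustration of the Dicke state's amplitude vector (uniform over weight-$k$ strings); this is just the definition written out, not the paper's constant-depth circuit.

```python
from itertools import combinations
from math import comb, sqrt

def dicke_amplitudes(n, k):
    """The 2^n-entry amplitude vector of the weight-k Dicke state |D^n_k>:
    amplitude 1/sqrt(C(n,k)) on every n-bit string of Hamming weight k,
    and 0 elsewhere."""
    amp = [0.0] * (1 << n)
    a = 1.0 / sqrt(comb(n, k))
    for ones in combinations(range(n), k):
        amp[sum(1 << i for i in ones)] = a  # index of the weight-k string
    return amp

# n=2, k=1 gives (|01> + |10>)/sqrt(2): equal amplitudes at indices 1 and 2.
```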

Efficient calculation of available space for multi-NUMA virtual machines

from arXiv: Data Structures and Algorithms

Authors: Andrei Gudkov, Elizaveta Ponomareva, Alexis Pospelov

Increasing demand for computational power has led cloud providers to employ multi-NUMA servers and offer multi-NUMA virtual machines to their customers. However, multi-NUMA VMs introduce additional complexity to scheduling algorithms. Beyond merely selecting a host for a VM, the scheduler has to map virtual NUMA topology onto the physical NUMA topology of the server to ensure optimal VM performance and minimize interference with co-located VMs. Under these constraints, maximizing the number of allocated multi-NUMA VMs on a host becomes a combinatorial optimization problem. In this paper, we derive closed-form expressions to compute the maximum number of VMs for a given flavor that can be additionally allocated onto a physical server. We consider nontrivial scenarios of mapping 2- and 4-NUMA symmetric VMs to 4- and 8-NUMA physical topologies. Our results have broad applicability, ranging from real-time dashboards (displaying available cluster capacity per VM flavor) to optimization tools for large-scale cloud resource reorganization.

Sublinear Spectral Clustering Oracle with Little Memory

from arXiv: Data Structures and Algorithms

Authors: Ranran Shen, Xiaoyi Zhu, Pan Peng, Zengfeng Huang

We study the problem of designing \emph{sublinear spectral clustering oracles} for well-clusterable graphs. Such an oracle is an algorithm that, given query access to the adjacency list of a graph $G$, first constructs a compact data structure $\mathcal{D}$ that captures the clustering structure of $G$. Once built, $\mathcal{D}$ enables sublinear time responses to \textsc{WhichCluster}$(G,x)$ queries for any vertex $x$. A major limitation of existing oracles is that constructing $\mathcal{D}$ requires $Ω(\sqrt{n})$ memory, which becomes a bottleneck for massive graphs and memory-limited settings. In this paper, we break this barrier and establish a memory-time trade-off for sublinear spectral clustering oracles. Specifically, for well-clusterable graphs, we present oracles that construct $\mathcal{D}$ using much smaller than $O(\sqrt{n})$ memory (e.g., $O(n^{0.01})$) while still answering membership queries in sublinear time. We also characterize the trade-off frontier between memory usage $S$ and query time $T$, showing, for example, that $S\cdot T=\widetilde{O}(n)$ for clusterable graphs with a logarithmic conductance gap, and we show that this trade-off is nearly optimal (up to logarithmic factors) for a natural class of approaches. Finally, to complement our theory, we validate the performance of our oracles through experiments on synthetic networks.

PlanB: Efficient Software IPv6 Lookup with Linearized $B^+$-Tree

from arXiv: Data Structures and Algorithms

Authors: Zhihao Zhang, Lanzheng Liu, Chen Chen, Huiba Li, Jiwu Shu, Windsor Hsu, Yiming Zhang

IP lookup via Longest Prefix Match (LPM) is critical for packet forwarding. Unfortunately, conventional lookup algorithms are inefficient for IPv6 Forwarding Information Bases (FIBs), which are characterized by a set of long prefixes with diverse lengths. We observe that LPM inherently represents a two-dimensional (2D) search problem over both prefix values and prefix lengths, but existing algorithms mostly treat LPM as two separate levels of one-dimensional (1D) searches, causing poor lookup performance and high memory overhead. This paper presents PlanB, a novel scheme for high-speed IPv6 lookup. We transform the 2D LPM into an equivalent 1D search problem over elementary intervals, thereby unifying the search across prefix values and lengths. We then adapt a flat-array-based B-tree structure to the needs of LPM to propose the linearized $B^+$-tree, based on which we introduce an efficient search algorithm tailored to the properties of the transformed space. To maximize performance, we integrate PlanB with vectorization, batching, branch-free logic, and loop unrolling to fully exploit CPU parallelism. Extensive evaluation shows that PlanB achieves single-core performance of 390 Million Lookups Per Sec (MLPS) with real-world IPv6 FIBs on an AMD processor, and scales to full-12-core performance of 3.4 Billion Lookups Per Sec (BLPS). This is 1.6$\times$$\sim$14$\times$ higher than state-of-the-art software-based schemes (PopTrie, CP-Trie, Neurotrie and HBS).
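The core reduction (2D LPM to a 1D search over elementary intervals) can be sketched as follows. This is a toy over an 8-bit address space with a linear preprocessing scan; the paper's linearized $B^+$-tree and the function names here are not from the source.

```python
from bisect import bisect_right

def build_interval_table(prefixes, width=8):
    """Flatten 2D LPM (value x length) into 1D: each prefix (value, length)
    covers addresses [value, value + 2^(width-length) - 1]. Cutting the
    address space at all interval endpoints yields elementary intervals,
    each of which inherits its longest covering prefix. width=8 keeps the
    toy small; real IPv6 uses width 128 and a tree, not a linear scan."""
    points = {0}
    for v, l in prefixes:
        points.add(v)
        points.add(v + (1 << (width - l)))
    starts = sorted(p for p in points if p < (1 << width))
    best = []
    for s in starts:
        covering = [(v, l) for v, l in prefixes
                    if v <= s < v + (1 << (width - l))]
        best.append(max(covering, key=lambda p: p[1]) if covering else None)
    return starts, best

def lookup(starts, best, addr):
    # A single binary search over interval starts replaces the 2D search.
    return best[bisect_right(starts, addr) - 1]
```

For example, with prefixes `10*` and `101*`, an address starting `1011…` falls in the elementary interval owned by the longer prefix `101*`.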

Balancing Weights, Directed Sparsification, and Augmenting Paths

from arXiv: Data Structures and Algorithms

Authors: Jason Li

We present a randomized augmenting paths-based algorithm to compute the maximum flow in a directed, uncapacitated graph in almost $m+nF$ time, matching the algorithm of Karger and Levine for undirected graphs (SICOMP 2015). Combined with an initial $\sqrt n$ rounds of blocking flow to reduce the value of $F$, we obtain a maximum flow algorithm with running time $mn^{1/2+o(1)}$. For combinatorial, augmenting paths-based algorithms, this is the first improvement over Dinic's algorithm for moderately sparse graphs. To obtain our algorithm, we introduce a new technique to re-weight the edges of a strongly connected directed graph so that each cut is approximately balanced: the total weight of edges in one direction is within a constant factor of the total weight in the other direction. We then adapt Karger and Levine's technique of sampling edges from the newly weighted residual graph, ensuring that an augmenting path exists in the sampled graph with high probability. One technical difficulty is that our balancing weights have to be dynamically maintained upon changes to the residual graph. Surprisingly, we can black box the dynamic data structure from the recent interior point method-based flow algorithm of van den Brand et al. (FOCS 2024).

Tight Bounds for Learning Polyhedra with a Margin

from arXiv: Data Structures and Algorithms

Authors: Shyamal Patel, Santosh Vempala

We give an algorithm for PAC learning intersections of $k$ halfspaces with a $ρ$ margin to within error $\varepsilon$ that runs in time $\textsf{poly}(k, \varepsilon^{-1}, ρ^{-1}) \cdot \exp \left(O(\sqrt{n \log(1/ρ) \log k})\right)$. Notably, this improves on prior work which had an exponential dependence on either $k$ or $ρ^{-1}$ and matches known cryptographic and Statistical Query lower bounds up to the logarithmic factors in $k$ and $ρ$ in the exponent. Our learning algorithm extends to the more general setting when we are only promised that most points have distance at least $ρ$ from the boundary of the polyhedron, making it applicable to continuous distributions as well.

Fast Concurrent Primitives Despite Contention

from arXiv: Data Structures and Algorithms

Authors: Michael A. Bender, Guy E. Blelloch, Martin Farach-Colton, Yang Hu, Rob Johnson, Rotem Oshman, Renfei Zhou

We study the problem of constructing concurrent objects in a setting where $P$ processes run in parallel and interact through a shared memory that is subject to write contention. Our goal is to transform hardware primitives that are subject to write contention into ones that handle contention gracefully. We give contention-resolution algorithms for several basic primitives, and analyze them under a relaxed, roughly-synchronous stochastic scheduler, where processes run at roughly the same rate up to a constant factor with high probability. Specifically, we construct read/write registers and CAS registers that have latency $O(\log P)$ w.h.p. under our scheduler model, using $O(1)$ hardware read/write registers and, in the case of our CAS construction, one hardware CAS register. Our algorithms guarantee performance even when their operations are invoked by an adaptive adversary that is able to see the entire history of operations so far, including their timing and return values. This allows them to be used as building blocks inside larger programs; using this compositionality property, we obtain several other constructions (LL/SC, fetch-and-increment, bounded max registers, and counters). To complement our constructions, we give a trade-off showing that even under a perfectly synchronous schedule and even if each process only executes one operation, any algorithm that implements any of the primitives that we consider, uses space $M$, and has latency at most $L$ with high probability must have expected latency at least $Ω(\log_{ML} P)$.

Thursday, April 16

TR26-057 | Explicit Constant-Alphabet Subspace Design Codes | Rohan Goyal, Venkatesan Guruswami, Jun-Ting Hsieh

from ECCC Papers

The subspace design property for additive codes is a higher-dimensional generalization of the minimum distance property. As shown recently by Brakensiek, Chen, Dhar and Zhang, it implies that the code has similar performance as random linear codes with respect to all “local properties”. Explicit algebraic codes, such as folded Reed-Solomon and multiplicity codes, are known to have the subspace design property, but they need alphabet sizes that grow as a large polynomial in the block length. Constructing explicit constant-alphabet subspace design codes was subsequently posed as an open question in Brakensiek, Chen, Dhar and Zhang. In this work, we answer their question and give explicit constructions of subspace design codes over constant-sized alphabets, using the expander-based Alon-Edmonds-Luby (AEL) framework. This generalizes the recent work of Jeronimo and Shagrithaya, which showed that such codes share local properties of random linear codes. Our work obtains this consequence in a unified manner via the subspace design property. In addition, our approach yields some improvements in parameters for list-recovery.

Postdoc at University of Antwerp (apply by June 1, 2026)

from CCI: jobs

Automated Reasoning for Quantum Knowledge: at the intersection of TCS and quantum computing, contributing to research on representing and reasoning about quantum knowledge while collaborating on grant proposals, supervising PhD students, and engaging in light teaching. Open to candidates worldwide who hold (or soon obtain) a PhD. Starting around September 2026 (for 1–2 years).

Website: https://www.uantwerpen.be/en/jobs/vacancies/academic-staff/?q=4367&descr=Postdoctoral-scholarship-holder-Automated-Reasoning-for-Quantum-Knowledge
Email: guillermo.perez@uantwerpen.be

By shacharlovett

APPROX 2026 Call for Papers

from CS Theory Events

April 16 – May 7, 2026 Boston University, Boston (https://approxconference.com/) Submission deadline: May 6, 2026 Dear researchers, The 29th International Conference on Approximation Algorithms for Combinatorial Optimization Problems (APPROX 2026) will be held at Boston University, Boston, Massachusetts, USA, on August 19-21, 2026 (together with RANDOM 2026 and WOLA 2026). The deadline to submit your … Continue reading APPROX 2026 Call for Papers

By shacharlovett

Postdoc at West Virginia University (apply by May 31, 2026)

from CCI: jobs

The Lane Department of Computer Science and Electrical Engineering at West Virginia University invites applications for a Postdoctoral Fellow (funded by Algorithmic Foundations, NSF) in the general areas of theoretical computer science and algorithmic operations research, with an emphasis on computational complexity and game theory. The position is funded for two years, starting August 1, 2026.

Website: https://wvu.taleo.net/careersection/faculty/jobdetail.ftl?job=29316&tz=GMT-04%3A00&tzname=America%2FNew_York
Email: k.subramani@mail.wvu.edu

By shacharlovett

Machine Learning and Complexity

from Computational Complexity

At Oxford I focused my research and discussions on how we can use the tools of computational complexity to help us understand the power and limitations of machine learning. Last week I posted my paper How Does Machine Learning Manage Complexity?, a first step in this direction. Let me give a broad overview of the paper. Please refer to the paper for more technical details.

Instead of focusing on machine learning concepts like neural nets, transformers, etc., I wanted to abstract out a model defined in terms of complexity: time-efficient non-uniform computable distributions with minimum probabilities. Let's unpack this abstraction.

Neural nets as next token predictors don't always give the same next token; rather, they give a distribution over tokens, which leads to a distribution over text of length up to the size of the context window. The way probabilities are computed via a softmax function guarantees that every output can occur with at least an exponentially small probability, the "minimum probability" in the abstraction.
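The softmax point can be seen directly: for any finite logits, every output probability is strictly positive, and bounded logits give an explicit floor. A minimal sketch (the bound in the comment assumes logits in $[-B, B]$):

```python
from math import exp

def softmax(logits):
    """exp(z_i) / sum_j exp(z_j); strictly positive for any finite logits."""
    m = max(logits)                       # subtract max for numerical stability
    exps = [exp(z - m) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]

# If logits lie in [-B, B] over a vocabulary of size V, each probability is
# at least exp(-2B)/V: the (exponentially small) "minimum probability" floor.
probs = softmax([3.0, -1.0, 0.5])
assert min(probs) > 0
```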

In computational complexity, we study two main kinds of distributions. A distribution is sampleable if there is an algorithm that takes in uniformly random bits and outputs text according to that distribution. A distribution is computable if we can compute not only the probability that a piece of text occurs, but the cumulative distribution function: the sum of the probabilities of all outputs lexicographically less than that text. Every efficiently computable distribution is efficiently sampleable, but likely not the other way around.

Neural nets as next token predictors turn out to be equivalent to computable distributions. We need these kinds of distributions both for how neural nets are trained and for how we use them. Computable distributions allow for conditional sampling, which lets us use a large language model to answer questions or have a conversation. You can't get conditional sampling from a merely sampleable distribution.
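A tiny finite toy makes the distinction concrete: with probability (and hence CDF) access we can renormalize over an event and sample conditionally by inverse transform, while draw-only access provides no such renormalization. The sampler below is an illustrative sketch, not the neural-net setting.

```python
import random

def make_conditional_sampler(outcomes, probs):
    """A computable distribution lets us evaluate probabilities and prefix
    sums (the CDF), so we can condition on an event by renormalizing and
    then sample by inverse transform. Draws alone do not support this."""
    def draw(u=None, condition=None):
        keep = [i for i, x in enumerate(outcomes)
                if condition is None or condition(x)]
        mass = sum(probs[i] for i in keep)   # normalizing constant of the event
        if u is None:
            u = random.random()
        target, acc = u * mass, 0.0
        for i in keep:                       # walk the restricted CDF
            acc += probs[i]
            if acc > target:
                return outcomes[i]
        return outcomes[keep[-1]]
    return draw
```

Conditioning on "not c", say, renormalizes the mass of {a, b} and samples from that restricted CDF; this is the step that requires probability access rather than samples.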

Neural nets have a limited number of layers, typically about a hundred, which prevents them from handling full time-efficient (polynomial-time) computations. They can get around this restriction either with reasoning models that allow a model to talk to itself, or by directly writing and executing code, which can run efficient algorithms of any kind.

Most of the algorithms we use, think Dijkstra or matching, are uniform; that is, the same algorithm runs on graphs of twenty nodes as on twenty million. But neural nets change their weights based on the distributions they train on. These weights encode a compressed view of that data and the extra computation needed to process it. So we get different algorithms as our technology and data sources improve. I wrote more about this connection between AI and nonuniformity last fall.

What does this abstraction buy us?

Restricting to computable distributions forces machine learning algorithms to treat complex behavior as random behavior, much like we treat a coin flip as a random event because it is too complicated to work out all the physical interactions that would determine which side it lands on.

We illustrate this point by our main result. Let D be the distribution of outputs from a cryptographic pseudorandom generator and U be the uniform distribution of words of the same length. If \(\mu\) is a time-efficient non-uniform computable distribution with minimum probabilities and \(\mu\) does as good or a better job than U of modeling D then \(\mu\) and U are essentially the same distribution. Machine learning models the complex PRG as a uniform distribution, simplifying what we can't directly compute by using randomness. 

More in the paper and future blog posts.

By Lance Fortnow

At Oxford I focused my research and discussions on how we can use the tools of computational complexity to help us understand the power and limitations of machine learning. Last week I posted my paper How Does Machine Learning Manage Complexity?, a first step in this direction. Let me give a broad overview of the paper. Please refer to the paper for more technical details.

Instead of focusing on machine learning concepts like neural nets, transformers, etc., I wanted to abstract out a model defined in terms of complexity: time-efficient non-uniform computable distributions with minimum probabilities. Let's unpack this abstraction.

Neural nets as next token predictors don't always give the same next token; rather, they give a distribution over tokens, which leads to a distribution on text of length up to the size of the context window. The way probabilities are computed via a softmax function guarantees that every output can occur with at least an exponentially small probability, the "minimum probability" in the abstraction.
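
That minimum-probability property falls straight out of the softmax itself; a small sketch (the logits below are made up, not from any model):

```python
import math

def softmax(logits):
    """Map raw logits to a probability distribution over tokens."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Even a token with a hopeless logit keeps a (tiny) nonzero probability,
# since exp(x) > 0 for every finite x.
probs = softmax([10.0, 0.0, -30.0])
assert all(p > 0 for p in probs)
assert abs(sum(probs) - 1.0) < 1e-9
```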

In computational complexity, we study two main kinds of distributions. A distribution is sampleable if there is an algorithm that takes in uniformly random bits and outputs text according to that distribution. A distribution is computable if we can compute not only the probability that a piece of text occurs, but also the cumulative distribution function: the sum of the probabilities of all outputs lexicographically less than that text. Every efficiently computable distribution is efficiently sampleable, but likely not the other way around.
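
The forward direction is classical inverse-transform sampling: CDF access plus uniform random bits yields a sampler. A toy sketch over a hypothetical distribution on 2-bit strings:

```python
import bisect

# A toy computable distribution on 2-bit strings: we can query both the
# point probabilities and the CDF in lexicographic order. The numbers
# are made up for illustration.
OUTPUTS = ["00", "01", "10", "11"]
PROBS = [0.1, 0.4, 0.2, 0.3]

def cdf(i):
    """Cumulative probability of OUTPUTS[0..i] (lexicographic order)."""
    return sum(PROBS[: i + 1])

def sample(u):
    """Inverse-transform sampling: a uniform u in [0,1) selects the first
    output whose CDF exceeds u. This is the easy direction: efficiently
    computable implies efficiently sampleable."""
    cdfs = [cdf(i) for i in range(len(OUTPUTS))]
    return OUTPUTS[bisect.bisect_right(cdfs, u)]

assert sample(0.05) == "00"
assert sample(0.3) == "01"
assert sample(0.99) == "11"
```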

Neural nets as next token predictors turn out to be equivalent to computable distributions. We need these kinds of distributions both for how neural nets are trained and for how we use them. Computable distributions allow for conditional sampling, which lets us use a large language model to answer questions or have a conversation. You can't get conditional sampling from a merely sampleable distribution.
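
Why computability buys conditioning: pointwise probabilities let us renormalize over the completions of a prefix, which black-box sample access does not. A toy sketch (the distribution below is invented for illustration):

```python
from itertools import product

def prob(s):
    """A toy computable distribution on 3-bit strings: probability
    proportional to 1 + (number of ones in s)."""
    total = sum(1 + "".join(t).count("1") for t in product("01", repeat=3))
    return (1 + s.count("1")) / total

def next_token_dist(prefix):
    """P(next bit | prefix): sum the probabilities of all completions of
    each extended prefix, then renormalize. This needs pointwise
    probabilities; sampling access alone does not provide it."""
    rest_len = 3 - len(prefix) - 1
    mass = {
        b: sum(prob(prefix + b + "".join(r))
               for r in product("01", repeat=rest_len))
        for b in "01"
    }
    total = sum(mass.values())
    return {b: m / total for b, m in mass.items()}

d = next_token_dist("1")
assert abs(d["0"] + d["1"] - 1.0) < 1e-12
assert d["1"] > d["0"]  # strings with more ones are upweighted
```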

Neural nets have a limited number of layers, typically about a hundred, which prevents them from handling all time-efficient (polynomial-time) computations. They can get around this restriction either with reasoning models that allow a model to talk to itself, or by directly writing and executing code that can run efficient algorithms of any kind.

Most of the algorithms we use (think Dijkstra or matching) are uniform; that is, we run the same algorithm on graphs of twenty nodes as on twenty million. But neural nets change their weights based on the distributions they train on. These weights encode a compressed view of that data and the extra computation needed to process that data. So we get different algorithms as our technology and data sources improve. I wrote more about this connection between AI and nonuniformity last fall.

What does this abstraction buy us?

Restricting to computable distributions forces machine learning algorithms to treat complex behavior as random behavior, much like we treat a coin flip as a random event because it is too complicated to work out all the physical interactions that would determine which side it lands on.

We illustrate this point with our main result. Let D be the distribution of outputs from a cryptographic pseudorandom generator and U the uniform distribution on words of the same length. If \(\mu\) is a time-efficient non-uniform computable distribution with minimum probabilities, and \(\mu\) models D at least as well as U does, then \(\mu\) and U are essentially the same distribution. Machine learning models the complex PRG as a uniform distribution, simplifying what we can't directly compute by using randomness.

More in the paper and future blog posts.

By Lance Fortnow

Reachability Constraints in Variational Quantum Circuits: Optimization within Polynomial Group Module

from arXiv: Computational Complexity

Authors: Yun-Tak Oh, Dongsoo Lee, Jungyoul Park, Kyung Chul Jeong, Panjin Kim

This work identifies a necessary condition for any variational quantum approach to reach the exact ground state. Briefly, the norms of the projections of the input and the ground state onto each group module must match, implying that module weights of the solution state have to be known in advance in order to reach the exact ground state. An exemplary case is provided by matchgate circuits applied to problems whose solutions are classical bit strings, since all computational basis states share the same module-wise weights. Combined with the known classical simulability of quantum circuits for which observables lie in a small linear subspace, this implies that certain problems admit a classical surrogate for exact solution with each step taking $O(n^5)$ time. The Maximum Cut problem serves as an illustrative example.

On the Decidability of Verification under Release/Acquire

from arXiv: Computational Complexity

Authors: Giovanna Kobus Conrado, Andreas Pavlogiannis

The verification of concurrent programs under weak-memory models is a burgeoning effort, owing to the increasing adoption of weak memory in concurrent software and hardware. Release/Acquire has become the standard model for high-performance concurrent programming, adopted by common mainstream languages and computer architectures. In a surprising result, Abdulla et al. (PLDI'19) proved that reachability in this model is undecidable when programs have access to atomic Read-Modify-Write (RMW) operations. Moreover, undecidability holds even for executions with just 4 contexts, and is thus immune to underapproximations based on bounded context switching. The canonical, RMW-free case was left as a challenging question, proving a non-primitive recursive lower bound as a first step, and has remained open for the past seven years. In this paper, we settle the above open question by proving that reachability is undecidable even in the RMW-free fragment of Release/Acquire, thereby characterizing the simplest set of communication primitives that gives rise to undecidability. Moreover, we prove that bounding both the number of context switches and the number of RMWs recovers decidability, thus fully characterizing the boundary of decidability along the dimensions of RMW-bounding and context-bounding.

Fast Time-Varying Contiguous Cartograms Using Integral Images

from arXiv: Computational Geometry

Authors: Vladimir Molchanov, Hennes Rave, Lars Linsen

Cartograms are a technique for visually representing geographically distributed statistical data, where values of a numerical attribute are mapped to the size of geographic regions. Contiguous cartograms preserve the adjacencies of the original regions during the mapping. To be useful, contiguous cartograms also require approximate preservation of shapes and relative positions. Due to these desirable properties, contiguous cartograms are among the most popular ones. Most methods for constructing contiguous cartograms exploit a deformation of the original map. Aiming at the preservation of geographical properties, existing approaches are often algorithmically cumbersome and computationally intensive. We propose a novel deformation technique for computing time-varying contiguous cartograms based on integral images evaluated for a series of discrete density distributions. The density textures represent the given dynamic statistical data. The iterative application of the proposed mapping smoothly transforms the domain to gradually equalize the temporal density, i.e., region areas grow or shrink following their evolutionary statistical data. Global shape preservation at each time step is controlled by a single parameter that can be interactively adjusted by the user. Our efficient GPU implementation of the proposed algorithm is significantly faster than existing state-of-the-art methods while achieving comparable quality for cartographic accuracy, shape preservation, and topological error. We investigate strategies for transitioning between adjacent time steps and discuss the parameter choice. Our approach applies to comparative cartograms' morphing and interactive cartogram exploration.

NP-Hardness and a PTAS for the Pinwheel Problem

from arXiv: Data Structures and Algorithms

Authors: Robert Kleinberg, Ahan Mishra

In the pinwheel problem, one is given an $m$-tuple of positive integers $(a_1, \ldots, a_m)$ and asked whether the integers can be partitioned into $m$ color classes $C_1,\ldots,C_m$ such that every interval of length $a_i$ has non-empty intersection with $C_i$, for $i = 1, 2, \ldots, m$. It was a long-standing open question whether the pinwheel problem is NP-hard. We affirm a prediction of Holte et al. (1989) by demonstrating, for the first time, NP-hardness of the pinwheel problem. This enables us to prove NP-hardness for a host of other problems considered in the literature: pinwheel covering, bamboo garden trimming, windows scheduling, recurrent scheduling, and the constant gap problem. On the positive side, we develop a PTAS for an approximate version of the pinwheel problem. Previously, the best approximation factor known to be achievable in polynomial time was $\frac{9}{7}$.
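
The feasibility question being decided can be checked mechanically for a candidate periodic schedule; a brute-force verifier, purely illustrative and not from the paper:

```python
def is_valid_pinwheel(schedule, periods):
    """Check a candidate periodic pinwheel schedule. `schedule` lists a
    color per time slot and repeats forever; `periods[i]` is a_i. Every
    window of a_i consecutive slots must contain color i; it suffices to
    check the windows starting inside one period of the doubled list."""
    n = len(schedule)
    doubled = schedule * 2  # windows may wrap around the period boundary
    return all(
        i in doubled[start:start + a]
        for i, a in enumerate(periods)
        for start in range(n)
    )

# (2, 4, 4) is schedulable: repeat [0, 1, 0, 2] forever.
assert is_valid_pinwheel([0, 1, 0, 2], [2, 4, 4])
# Here color 0 waits three slots between services, violating a_0 = 2.
assert not is_valid_pinwheel([0, 1, 2, 1], [2, 4, 4])
```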

Parallel Algorithms for Group Isomorphism via Code Equivalence

from arXiv: Data Structures and Algorithms

Authors: Michael Levet

In this paper, we exhibit $\textsf{AC}^{3}$ isomorphism tests for coprime extensions $H \ltimes N$ where $H$ is elementary Abelian and $N$ is Abelian; and groups where $\text{Rad}(G) = Z(G)$ is elementary Abelian and $G = \text{Soc}^{*}(G)$. The fact that isomorphism testing for these families is in $\textsf{P}$ was established respectively by Qiao, Sarma, and Tang (STACS 2011), and Grochow and Qiao (CCC 2014, SIAM J. Comput. 2017). The polynomial-time isomorphism tests for both of these families crucially leveraged small (size $O(\log |G|)$) instances of Linear Code Equivalence (Babai, SODA 2011). Here, we combine Luks' group-theoretic method for Graph Isomorphism (FOCS 1980, J. Comput. Syst. Sci. 1982) with the fact that $G$ is given by its multiplication table, to implement the corresponding instances of Linear Code Equivalence in $\textsf{AC}^{3}$. As a byproduct of our work, we show that isomorphism testing of arbitrary central-radical groups is decidable using $\textsf{AC}$ circuits of depth $O(\log^3 n)$ and size $n^{O(\log \log n)}$. This improves upon the previous bound of $n^{O(\log \log n)}$-time due to Grochow and Qiao (ibid.).

Max Cut with Small-Dimensional SDP Solutions

from arXiv: Data Structures and Algorithms

Authors: Hsien-Chih Chang, Suprovat Ghoshal, Euiwoong Lee

We study the Max-Cut semidefinite programming (SDP) relaxation in the regime where a near-optimal solution admits a low-dimensional realization. While the Goemans--Williamson hyperplane rounding achieves the worst-case optimal approximation ratio $\alpha_{GW} \approx 0.87856$, it is natural to ask whether one can beat $\alpha_{GW}$ when the SDP solution lives in $\mathbb{R}^d$ for a small dimension $d$. We answer this in the affirmative for every fixed $d$: there is a polynomial-time rounding algorithm that, given a $d$-dimensional feasible solution to the standard Max-Cut SDP strengthened with triangle inequalities, produces a cut of expected value at least $(\alpha_{GW}+2^{-O(d)})$ times the SDP value. Our improvement is driven by a new geometric anti-concentration lemma for signs of low-dimensional Gaussian projections.
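
For context, the Goemans--Williamson hyperplane rounding the abstract builds on can be sketched in a few lines; the triangle embedding below is a standard example, not taken from the paper:

```python
import math
import random

def hyperplane_round(vectors, edges, trials=100, seed=1):
    """Goemans-Williamson rounding: draw a random Gaussian vector g and
    assign each vertex the side sign(<x_v, g>) of the hyperplane with
    normal g; return the best cut value found over several trials."""
    rng = random.Random(seed)
    d = len(vectors[0])
    best = 0
    for _ in range(trials):
        g = [rng.gauss(0.0, 1.0) for _ in range(d)]
        side = [sum(x[k] * g[k] for k in range(d)) >= 0 for x in vectors]
        best = max(best, sum(side[u] != side[v] for u, v in edges))
    return best

# Triangle: the optimal SDP embedding puts three unit vectors at
# 120-degree angles, and any hyperplane through the origin separates
# them 2-vs-1, cutting 2 of the 3 edges: the true maximum cut.
triangle = [(math.cos(2 * math.pi * i / 3), math.sin(2 * math.pi * i / 3))
            for i in range(3)]
assert hyperplane_round(triangle, [(0, 1), (1, 2), (0, 2)]) == 2
```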

Block-Based Pathfinding: A Minecraft System for Visualizing Graph Algorithms

from arXiv: Data Structures and Algorithms

Authors: Luca-Stefan Pirvu, Bogdan-Alexandru Maciuca, Andrei-Ciprian Rabu, Adrian-Marius Dumitran

Graph theory is a cornerstone of Computer Science education, yet entry-level students often struggle to map abstract node-edge relationships to practical applications. This paper presents the design and architecture of a Minecraft-based educational tool specifically built to visualize graph traversal and shortest-path algorithms. We propose a three-layer system: (1) a Grid Traversal module where terrain types (e.g., soul sand, ice) represent edge weights, allowing for the gamified study of shortest path algorithms; (2) a "Sky Graph" module for interactive 3D manipulation of both directed and undirected graphs; and (3) lessons and quizzes available through books. The system grounds its design in Constructionist learning theory, transitioning students from passive observers to active protagonists who physically manipulate algorithmic behavior. We additionally present a planned empirical evaluation using NASA-TLX and in-game telemetry to validate the system's pedagogical efficacy.

Fully Dynamic Maintenance of Loop Nesting Forests in Reducible Flow Graphs

from arXiv: Data Structures and Algorithms

Authors: Gregory Morse, Tamás Kozsik

Loop nesting forests (LNFs) are a fundamental abstraction for reasoning about control-flow structure, enabling applications such as compiler optimizations, program analysis, and dominator computation. While efficient static algorithms for constructing LNFs are well understood, maintaining them under dynamic graph updates has remained largely unexplored due to the lack of efficient dynamic depth-first search (DFS) maintenance. In this paper, we present the first fully dynamic algorithm for maintaining loop nesting forests in reducible control-flow graphs. Our approach leverages recent advances in dynamic DFS maintenance to incrementally update loop structure under edge insertions and deletions. We show that updates can be confined to local regions of the depth-first spanning tree, avoiding global recomputation. We provide formal invariants, correctness arguments, and complexity analysis, and demonstrate how the maintained LNF enables efficient derivation of dominance information. Our results establish LNFs as a practical dynamic abstraction for modern compiler and analysis pipelines.

Lawler-Moore Speedups via Additive Combinatorics

from arXiv: Data Structures and Algorithms

Authors: Karl Bringmann, Danny Hermelin, Tomohiro Koana, Dvir Shabtay

The Lawler-Moore dynamic programming framework is a classical tool in scheduling on parallel machines. It applies when the objective is regular, i.e. monotone in job completion times, and each machine follows a fixed priority order such as Smith's Rule or Jackson's Rule. For the basic objectives $Pm||\sum w_jC_j$, $Pm||L_{\max}$, and $Pm||\sum w_jU_j$, it gives running times $O(P^{m-1}n)$, $O(P^{m-1}n)$, and $O(P^mn)$, respectively, where $P$ is the total processing time. Recent SETH-based lower bounds indicate that the dependence on $P$ is essentially optimal, but they do not rule out improved dependence on the maximum processing time $p_{\max}$. We give the first major speedup of the Lawler-Moore recurrence. Our main ingredients are a new state-pruning method and a swapping argument based on an additive-combinatorial lemma. We prove that, whenever this swap does not increase the objective value, there exists an optimal schedule in which, for every prefix of jobs, the load difference between any two machines is at most $4p_{\max}^2$. This lets us prune redundant states throughout the dynamic program, replacing the dependence on $P$ by a dependence on $p_{\max}^2$. We show that the swap is non-increasing for all three objectives above. Hence $Pm||\sum w_jC_j$ and $Pm||L_{\max}$ admit algorithms with running time $O(p_{\max}^{2m-2}n)$, while $Pm||\sum w_jU_j$ can be solved in time $O(p_{\max}^{2m-2}Pn)\le O(p_{\max}^{2m-1}n^2)$. These bounds strictly improve the original Lawler-Moore runtimes whenever $p_{\max}=o(\sqrt{P})$. In particular, for $Pm||\sum w_jC_j$ and $Pm||L_{\max}$, we obtain the first near-linear-time algorithms when processing times are polylogarithmic in $n$.

Lower Bounds for Testing Directed Acyclicity in the Unidirectional Bounded-Degree Model

from arXiv: Data Structures and Algorithms

Authors: Yuichi Yoshida

We study property testing of directed acyclicity in the unidirectional bounded-degree oracle model, where a query to a vertex reveals its outgoing neighbors. We prove that there exist absolute constants $d_0\in\mathbb{N}$ and $\varepsilon>0$ such that for every constant $d\ge d_0$, any one-sided $\varepsilon$-tester for acyclicity on $n$-vertex digraphs of maximum outdegree at most $d$ requires $\widetilde{\Omega}(n^{2/3})$ queries. This improves the previous $\widetilde{\Omega}(n^{5/9})$ lower bound for one-sided testing of acyclicity in the same model. We also prove that, under the same degree assumption, any two-sided $\varepsilon$-tester requires $\Omega(\sqrt n)$ queries, improving the previous $\Omega(n^{1/3})$ lower bound. We further prove an $\Omega(n)$ lower bound for tolerant testing for some absolute constant outdegree bound $d$ by reduction from bounded-degree $3$-colorability.

Linear-Time Exact Computation of Influence Spread on Bounded-Pathwidth Graphs

from arXiv: Data Structures and Algorithms

Authors: Kengo Nakamura, Masaaki Nishino

Given a network and a set of vertices called seeds to initially inject information, influence spread is the expected number of vertices that eventually receive the information under a certain stochastic model of information propagation. Under the commonly used independent cascade model, influence spread is equivalent to the expected number of vertices reachable from the seeds on a directed uncertain graph, and the exact evaluation of influence spread offers many applications, e.g., influence maximization. Although its evaluation is a \#P-hard task, there is an algorithm that can precisely compute the influence spread in $O(mn\omega_p^2 \cdot 2^{\omega_p^2})$ time, where $\omega_p$ is the pathwidth of the graph. We improve this by developing an algorithm that computes the influence spread in $O((m+n)\omega_p^2 \cdot 2^{\omega_p^2})$ time. This is achieved by identifying the similarities in the repetitive computations in the existing algorithm and sharing them to reduce computation. Although similar refinements have been considered for the probability computation on undirected uncertain graphs, a greater number of similarities must be leveraged for directed graphs to achieve linear time complexity.
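
To make the quantity concrete: influence spread under independent cascade can be evaluated by brute force over all edge realizations, which is exactly the exponential cost the paper's pathwidth-based algorithm avoids. A sketch (graph and probabilities hypothetical):

```python
from itertools import product

def influence_spread(n, edges, probs, seeds):
    """Exact influence spread under the independent cascade model: the
    expected number of vertices reachable from `seeds` on a directed
    uncertain graph, summing over all 2^m edge realizations."""
    total = 0.0
    for live in product([False, True], repeat=len(edges)):
        p = 1.0
        adj = {v: [] for v in range(n)}
        for (u, v), keep, pe in zip(edges, live, probs):
            p *= pe if keep else 1.0 - pe
            if keep:
                adj[u].append(v)
        # Vertices reachable from the seeds in this realization.
        seen, stack = set(seeds), list(seeds)
        while stack:
            for w in adj[stack.pop()]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        total += p * len(seen)
    return total

# Path 0 -> 1 -> 2, each edge live w.p. 1/2: 1 + 1/2 + 1/4 = 1.75.
assert abs(influence_spread(3, [(0, 1), (1, 2)], [0.5, 0.5], [0]) - 1.75) < 1e-12
```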

Online TCP Acknowledgment under General Delays

from arXiv: Data Structures and Algorithms

Authors: Sujoy Bhore, Michał Pawłowski, Seeun William Umboh

In a seminal work, Dooly, Goldman, and Scott (STOC 1998; JACM 2001) introduced the classic Online TCP Acknowledgment problem. In this problem, a sequence of $n$ packets arrives over time, and the objective is to minimize both the number of acknowledgments sent and the total delay experienced by the packets. They showed that a greedy algorithm -- acknowledge when the delay of pending packets equals the acknowledgment cost -- is $2$-competitive. Online TCP Acknowledgment is the canonical online problem with delay, capturing the fundamental tradeoff between reducing service cost through batching and the delay incurred by pending requests. Prior work has largely focused on different types of service costs, e.g., Joint Replenishment. However, to the best of our knowledge, beyond the work of Albers and Bals (SODA 2003), which studies maximum delay and closely related objectives, not much is known for more general delay cost models. In this work, we study the Online TCP Acknowledgment problem under two new generalized delay costs that we call batch-oblivious and batch-aware. In the former, the delay cost is a function of the vector of packet delays. We show that the classic greedy approach is also $2$-competitive for continuous submodular functions and $\ell_p$ norms. In the batch-aware model, each batch incurs a delay cost that is a function $f$ of the vector of delays incurred by its packets. When the overall delay cost is the maximum of the batch delay costs, we show that the greedy approach is also $2$-competitive. Our main technical contribution is for the batch-aware setting, where the overall delay cost is the sum of the batch delay costs. We show that the greedy approach is $\Omega(n)$-competitive and that the deterministic competitive ratio is $\Theta(\log n)$. Remarkably, our algorithm only requires the bare minimum assumption that $f$ is monotone.

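The greedy rule quoted above — acknowledge when the delay of pending packets reaches the acknowledgment cost — can be simulated exactly, since the total pending delay grows linearly between arrivals. A sketch of the classic rule (not the paper's generalized algorithms; names are illustrative):

```python
def greedy_acks(arrival_times, ack_cost):
    """Dooly-Goldman-Scott greedy rule: send an acknowledgment the moment
    the total delay accumulated by pending packets reaches ack_cost."""
    arrivals = sorted(arrival_times)
    acks, pending = [], []
    accumulated, last = 0.0, 0.0   # delay accrued so far; time of last update
    i, n = 0, len(arrivals)
    while i < n or pending:
        nxt = arrivals[i] if i < n else float("inf")
        if pending:
            # total delay grows at rate len(pending) until the next arrival;
            # it would hit ack_cost at time t_hit
            t_hit = last + (ack_cost - accumulated) / len(pending)
            if t_hit <= nxt:
                acks.append(t_hit)                     # acknowledge the batch
                pending, accumulated, last = [], 0.0, t_hit
                continue
            accumulated += len(pending) * (nxt - last)
        pending.append(nxt)
        last = nxt
        i += 1
    return acks
```

With a single packet at time 0 and unit acknowledgment cost, the rule fires at time 1, so each batch pays exactly its acknowledgment cost in delay — the balance behind the 2-competitive guarantee.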
Near-Optimal Constructive Bounds for $\ell_2$ Prefix Discrepancy and Steinitz Problems via Affine Spectral Independence

from arXiv: Data Structures and Algorithms

Authors: Kunal Dutta, Agastya Vibhuti Jha, Haotian Jiang

A classical result of Steinitz from 1913 \cite{Ste13}, answering an earlier question of Riemann and Lévy (e.g., \cite{Lev05}), states that for any norm $\|\cdot\|$ in $\mathbb{R}^d$ and any set of vectors $v_1, \cdots, v_n \in \mathbb{R}^d$ satisfying $\sum_{i=1}^n v_i = 0$, there exists an ordering $\pi: [n] \rightarrow [n]$ such that every partial sum along this order is bounded by $O(d)$, i.e., $\big\| \sum_{i=1}^t v_{\pi(i)} \big\| \leq O(d)$ for all $t \in [n]$. Steinitz's bound is tight up to constants in general, but for the $\ell_2$ norm $\|\cdot\|_2$, it has been conjectured that the best bound is $O(\sqrt{d})$. Almost a century later, a breakthrough work of Banaszczyk \cite{Ban12} gave a bound of $O(\sqrt{d} + \sqrt{\log n})$ for the $\ell_2$ Steinitz problem, matching the conjecture under the mild assumption that $d \geq \Omega(\log n)$. Banaszczyk's result is non-constructive, and the previous best algorithmic bound was $O(\sqrt{d \log n})$, due to Bansal and Garg \cite{BG17}. In this work, we give an efficient algorithm that matches the conjectured $O(\sqrt{d})$ bound for the $\ell_2$ Steinitz problem under the slightly worse, yet still polylogarithmic, condition of $d \geq \Omega(\log^7 n)$. As in prior work, our result extends to the harder problem of $\ell_2$ prefix discrepancy. We employ the framework of obtaining the desired ordering via a discrete Brownian motion, guided by a semidefinite program (SDP). To obtain our results, we use the new technique of ``Decoupling via Affine Spectral Independence'', proposed by Bansal and Jiang \cite{BJ26} to achieve substantial progress on the Beck-Fiala and Komlós conjectures, together with a ``Global Interval Tree'' data structure that simultaneously controls the deviations for all prefixes.

Encodings for Range Minimum Queries over Bounded Alphabets

from arXiv: Data Structures and Algorithms

Authors: Seungbum Jo, Srinivasa Rao Satti

Range minimum queries (RMQs) are fundamental operations with widespread applications in database management, text indexing and computational biology. While many space-efficient data structures have been designed for RMQs on arrays with arbitrary elements, no results have been developed for the case when the alphabet size is small, which is the case in many practical scenarios where RMQ structures are used. In this paper, we investigate the encoding complexity of RMQs on arrays over a bounded alphabet. We consider both one-dimensional (1D) and two-dimensional (2D) arrays. For the 1D case, we present a near-optimal space encoding. For constant-sized alphabets, this also supports the queries in constant time. For the 2D case, we systematically analyze the 1-sided, 2-sided, 3-sided and 4-sided queries and derive lower bounds for encoding space, and also matching upper bounds that support efficient queries in most cases. Our results demonstrate that, even with the bounded alphabet restriction, the space requirements remain close to those for the general alphabet case.

Wednesday, April 15

Linkage

from David Eppstein

Congratulations to Ken Clarkson and the rest of the newly-named SIAM Fellows (\(\mathbb{M}\))!

By David Eppstein

Reticulating Splines

from Ben Recht

The long legacy of simulation-based control.

This is a live blog of Lecture 9 of my graduate seminar “Feedback, Learning, and Adaptation.” A table of contents is here.

As I mentioned Monday, one of the big paradigms in modern robotics and control is the “sim2real” pipeline. People invest in complex computer simulators to test their robotic policies. The simulators have detailed dynamic and kinematic models of the robot and how it moves in contact with varied terrain and obstacles. The hope is that by burning through infinite GPU credits to troubleshoot every possibility in simulation, they can deploy code to their actual robot and need no troubleshooting once it’s unleashed in the real world.

While the young folks like to make this paradigm sound like a novel research program, all of optimal control rests on the sim2real pipeline. Think about the core problem of optimal control: the linear quadratic regulator. This problem looks for a control sequence that minimizes a quadratic cost subject to the world evolving according to a linear dynamical system. Control theorists banged their heads against this problem for decades, and we are now taught the beautiful dynamic programming derivations that reduce this problem to solving a compact equation. However, we can also solve it using gradient descent. The gradient computation amounts to simulating the system with the current control policy, computing the sensitivity of the cost trajectory to each control decision, and then adding this information up to compute the gradient.

The lovely thing about gradient descent is that it gives you a solution technique for general optimal control problems with nonquadratic costs or nonlinear dynamics. You evaluate your policy under the current control, run a dynamical system backward in time to compute how sensitive the trajectory was to your control decisions, and then add up the contributions of each time point to get the full gradient. Arthur Bryson invented this method to compute gradients of general optimal control problems in 1962. Today, we call his algorithm backpropagation. This simulation-based gradient method provides incremental improvement of policies for any differentiable dynamical model and any differentiable cost function.
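For the linear quadratic case, this simulate-forward, sensitivity-backward computation fits in a few lines. A minimal sketch under a standard finite-horizon quadratic cost (the function names and the scalar example are illustrative):

```python
import numpy as np

def lqr_cost_and_grad(A, B, Q, R, x0, U):
    """Roll the system forward, then run the adjoint (costate) recursion
    backward in time -- Bryson's gradient, i.e. backpropagation through time."""
    T = len(U)
    X = [x0]
    for t in range(T):
        X.append(A @ X[t] + B @ U[t])            # simulate forward
    cost = sum(x @ Q @ x for x in X) + sum(u @ R @ u for u in U)
    grad = [None] * T
    lam = 2 * Q @ X[T]                           # costate at the horizon
    for t in reversed(range(T)):
        grad[t] = 2 * R @ U[t] + B.T @ lam       # sensitivity of cost to u[t]
        lam = 2 * Q @ X[t] + A.T @ lam           # propagate costate backward
    return cost, grad

# gradient descent on a scalar instance (A = B = Q = R = 1)
A = B = Q = R = np.eye(1)
x0 = np.ones(1)
U = [np.zeros(1) for _ in range(10)]
for _ in range(500):
    cost, grad = lqr_cost_and_grad(A, B, Q, R, x0, U)
    U = [u - 0.01 * g for u, g in zip(U, grad)]
```

The backward pass is exactly the adjoint recursion: the costate lam carries the sensitivity of the remaining cost to the current state, so each control's gradient falls out as it passes by.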

Now, if your simulation isn’t differentiable, maybe you’ll use a different sort of policy optimization method to solve your optimal control problem. However, reinforcement learning for robotics is still optimal control. RL for robotics minimizes a designed cost function subject to dynamics. The modern departure is that no one bothers to write down the equations of motion anymore. They just assume the simulator will compute them.

This belief pushes a lot of work onto the simulator. GPU cycles are sadly neither free nor abundant. It would be nice to minimize the simulation time and cost required to find a good control policy. It would be particularly nice because many people would like to have a simulator on board the actual robot to compute policies with methods like model predictive control. This raises the question of how accurate your simulation needs to be.

Unfortunately, no one knows. We all think that if you can act quickly enough with enough control authority, then a really simple model should work. But it’s impossible to quantify “enough” in that sentence. You have to try things out because dynamical processes are always surprising.

While it feels like increasing the fidelity of a simulator to the minute details of physical law always improves performance, this is not remotely the case. In class on Monday, Spencer Schutz presented a paper on autonomous driving showing that a simple, inaccurate kinematic model with a low sampling rate performed just as well as a more accurate dynamic model. Anyone who's spent time with dynamic models knows that very high-dimensional complex systems often look simple when you have limited controllability and observability. This is the basis of thermodynamics, where infinitely many bodies colliding collectively produce fairly boring dissipative behavior. Many complex-looking circuits have the input-output behavior of resistors.

On the other side of the coin, safe execution demands identification of subtle aspects of input-output relationships. You can have two dynamical systems with nearly identical behavior perform completely differently once in a closed loop circuit. You can also have systems with completely different behavior look the same in closed loop. I worked through a few examples of this phenomenon in a blog post a couple of years ago. Your model needs to be perfect in exactly the right places. But it’s usually impossible to know those places in advance.

To make matters worse, you can't really identify the parameters of a robot in open loop. An expensive robot is always going to be running with its low-level controllers on, both for its safety and yours. The actual parameters of closed-loop systems can't be identified.1 So you're stuck with guesses in your simulator, and you have to hope that your plausible parameters are good enough for your sim2real application.

The most popular solution to this identification problem is domain randomization. Since you can only find a range of parameters that describe reality, you build a control policy that works for randomly sampled parameters. By constantly sampling different parameters in each run, you build a policy that performs well on average across all possible parameters.
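As a toy illustration of this averaging idea (a hypothetical scalar setup, not any particular robotics pipeline): re-sample a "physics" parameter each episode and nudge a single feedback gain down that episode's cost gradient.

```python
import random

def rollout_cost(a, k, x0=1.0, T=20):
    """Cost of the feedback law u = -k*x on the scalar system x[t+1] = a*x[t] + u[t]."""
    x, cost = x0, 0.0
    for _ in range(T):
        u = -k * x
        cost += x * x + u * u
        x = a * x + u
    return cost

# Domain randomization: tune a single gain k against a "physics" parameter a
# that is re-sampled on every episode, so k ends up good on average over a.
random.seed(0)
k, lr, eps = 1.0, 2e-3, 1e-4
for _ in range(2000):
    a = random.uniform(0.5, 1.5)     # sample the simulator's parameters
    # per-episode finite-difference gradient, clipped for stability
    g = (rollout_cost(a, k + eps) - rollout_cost(a, k - eps)) / (2 * eps)
    k -= lr * max(-25.0, min(25.0, g))
```

The clipped stochastic gradient keeps the gain from blowing up when an unlucky parameter draw makes the current closed loop nearly unstable; the gain settles near the value that trades off all the sampled systems at once.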

Finding controllers that work for the average model isn't new. Indeed, this is just a variant of optimal control called dual control, which has seen bursts of interest since the 1960s. Dual control is literally the problem of minimizing expected control performance over a distribution of parameters. Like dual control, domain randomization needs a good prior model for how the environment “samples” parameters. But you can also just YOLO and hope that as long as you include all the edge cases, you'll never crash. That's the machine learning mindset, after all.

But what does it mean to sample the coefficient of friction of a surface? What’s the right distribution of coefficients of friction? This is again a tricky question.

One approach to modeling the distribution of parameters is to add an element of adversarial behavior to the system. We can adapt the simulations to find hard parameter settings and train more on those. We can have the simulator learn to trip up the robot. Rather than minimizing expected cost, we are working to minimize a worst-case cost, where the supremum is over a distribution of parameters or disturbances. The dual control people were really into this sort of minimax robustness in the 60s. But practice in aerospace applications ultimately pushed the community to robust control.

But people hate robust control because it gives them conservative policies. Computer scientists love to hack and ship. Look how productive they’ve been! You only need to write a few tests and make sure your simulator passes those. No bugs detected, LGTM! What could go wrong, right?

Is that last paragraph about coding agents? It might be.

But regardless, robust control pointed out that unmodeled uncertainties are everywhere, and they can come back to bite you if you're not careful. For its entire history, robust control advocates have been haranguing people about the limits of simulators. They note a couple of significant problems: first, training on a simulator often means fitting to quirks of the simulator that don't appear in the real world. This is a major danger, even in linear systems. Second, many apparent parametric robustness properties of optimal controllers break down under scrutiny.

In class, I introduced the structured singular value to motivate this issue. The structured singular value showed that when you had a system with many inputs and outputs, and you only considered independent perturbations, you'd convince yourself the system was stable when it was not remotely so. Guaranteeing stable behavior required understanding the dependencies between different errors. But how to test for this kind of stability in simulation is not clear.

We are thus left considering a strategy beyond sim2real: sim2real2sim2real. Or sim2real2sim2real2sim2real. You deploy the system and find out what didn't work in reality. And then you go back to your simulator, add a few thousand lines of code to account for the mistake, and try again. The software state of mind is that we can always patch mistakes. You can have an all-hands, blameless post-mortem and say it won't happen again. This drives the old control theorists mad, but it's been working great so far, so why change course?

Subscribe now

1

In case you haven’t encountered this before, suppose you are trying to model a closed-loop system x[t+1] = Ax[t]+ Bu[t], u[t]= Kx[t]. Then for an arbitrary matrix EB,

A+BK = (A -EBK)+(B+EB) K

Hence, you can only identify a subspace of possible dynamical systems describing your data.
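A quick numerical check of this identity, with random matrices standing in for a real system:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))
B = rng.standard_normal((3, 2))
K = rng.standard_normal((2, 3))
E_B = rng.standard_normal((3, 2))       # an arbitrary perturbation

A2, B2 = A - E_B @ K, B + E_B           # a *different* open-loop model ...
# ... with the exact same closed-loop matrix:
assert np.allclose(A + B @ K, A2 + B2 @ K)

# so both models generate identical trajectories under u[t] = K x[t]
x = x2 = rng.standard_normal(3)
for _ in range(10):
    x, x2 = A @ x + B @ (K @ x), A2 @ x2 + B2 @ (K @ x2)
assert np.allclose(x, x2)
```

No amount of closed-loop data distinguishes (A, B) from (A2, B2): every choice of E_B gives another model that fits the data perfectly.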

By Ben Recht

PhD student at University of Salzburg (apply by May 6, 2026)

from CCI: jobs

The University of Salzburg is currently seeking a Predoctoral University Assistant (“PhD student”) in the Big Data Algorithms Group headed by Sebastian Forster. The goal is to develop graph algorithms for solving clustering, distance, flow, or cut problems that are as well-suited as possible to dynamic, parallel, or distributed computing models. Please consult the linked website for details.

Website: https://karriere.plus.ac.at/en/jobs/6e6aa798-0429-01e8-cabb-69c12151a2bd
Email: forster@cs.sbg.ac.at

By shacharlovett

A complexity phase transition at the EPR Hamiltonian

from arXiv: Computational Complexity

Authors: Kunal Marwaha, James Sud

We study the computational complexity of 2-local Hamiltonian problems generated by a positive-weight symmetric interaction term, encompassing many canonical problems in statistical mechanics and optimization. We show these problems belong to one of three complexity phases: QMA-complete, StoqMA-complete, and reducible to a new problem we call EPR*. The phases are physically interpretable, corresponding to the energy level ordering of the local term. The EPR* problem is a simple generalization of the EPR problem of King. Inspired by empirically efficient algorithms for EPR, we conjecture that EPR* is in BPP. If true, this would complete the complexity classification of these problems, and imply EPR* is the transition point between easy and hard local Hamiltonians. Our proofs rely on perturbative gadgets. One simple gadget, when recursed, induces a renormalization-group-like flow on the space of local interaction terms. This gives the correct complexity picture, but does not run in polynomial time. To overcome this, we design a gadget based on a large spin chain, which we analyze via the Jordan-Wigner transformation.

Symmetric subrank and its border analogue

from arXiv: Computational Complexity

Authors: Benjamin Biaggi, Jan Draisma, Koen de Nooij, Immanuel van Santen

The symmetric subrank of a homogeneous polynomial is the largest number of terms in a diagonal form to which it can be specialized by a (typically non-invertible) linear variable substitution. Building on earlier work by Derksen-Makam-Zuiddam and Biaggi-Chang-Draisma-Rupniewski for ordinary tensors, we determine the asymptotic behavior of symmetric subrank and symmetric border subrank of degree-d forms as the number of variables tends to infinity. Furthermore, by using results from geometric invariant theory we show that for cubic (resp. quartic) forms the symmetric subrank and symmetric border subrank coincide if the latter is at most three (resp. two).

From Witness-Space Sharpness To Family-Pointwise Exactness For The Solvability Complexity Index

from arXiv: Computational Complexity

Authors: Christopher Sorg

We study how exact Solvability Complexity Index (SCI) statements should be formulated for families of computational problems rather than for single problems. While the equality \(\operatorname{SCI}_G (\mathcal P)=k\) is unambiguous for an individual computational problem \(\mathcal P\), the family setting requires one to distinguish family-pointwise exactness, witness-space sharpness, and worst-case exactness. We formalize this trichotomy, prove that witness-space sharpness coincides with worst-case exactness but is, in general, strictly weaker than family-pointwise exactness, and show that certain Koopman-operator classification results are sharp only in this worst-case sense. We then establish two positive upgrade theorems: an abstract pullback principle and a concrete finite-query criterion guaranteeing that witness-space sharpness upgrades to family-pointwise exactness. Next, we introduce a decoder-regular finite-query transport preorder on SCI computational problems, prove that it is a preorder, derive a transport-saturation sufficient criterion extending the principal-source package, and show that the associated transport degrees need not form a lattice in full generality. We analyze the natural decoder classes \(\mathscr R_{\mathrm{cont}}\) and \(\mathscr R_{\mathrm{Bor}}\): on the full class the corresponding quotients are not upper semilattices, while on the nondegenerate subclass the preorder is upward and downward directed. Finally, we exhibit two natural positive families realizing the principal transport mechanism: exact integration on compact intervals and a fixed-window spectral decision family obtained by block-diagonal stabilization.

A Relativizing MIP for BQP

from arXiv: Computational Complexity

Authors: Scott Aaronson, Anand Natarajan, Avishay Tal, Agi Villanyi

Complexity class containments involving interactive proof classes are famously nonrelativizing: although $\mathsf{IP} = \mathsf{PSPACE}$, Fortnow and Sipser showed that there exists an oracle relative to which $\mathsf{coNP} \not\subseteq \mathsf{IP}$. In contrast, the question of whether the containment $\mathsf{BQP} \subseteq \mathsf{IP}$ is relativizing remains wide open. In this work we make progress towards resolving this question by showing that the containment $\mathsf{BQP} \subseteq \mathsf{MIP}$ holds with respect to any classical oracle. We obtain this result by constructing, for any classical oracle $O$, a $\mathsf{PCP}$ proof system for $\mathsf{BQP}^{O}$ where the verifier makes polynomially many classical queries to an exponentially-long proof, and to the oracle $O$. Our construction is inspired by the state synthesis algorithm of Grover and Rudolph, and serves as a complement to the "exponential PCP" constructed by Aharonov, Arad, and Vidick, which achieves similar parameters but which is based on different ideas and does not relativize. We propose relativization as a proxy for prover efficiency, and hope that progress towards an $\mathsf{IP}$ for $\mathsf{BQP}$ in the oracle world will lead to a non-cryptographic interactive protocol for proving any quantum computation to a classical skeptic in the unrelativized world, which is a longstanding open problem in quantum complexity theory.

Topology Understanding of B-Spline Surface/Surface Intersection with Mapper

from arXiv: Computational Geometry

Authors: Chenming Gao, Hongwei Lin, Gengchen Li

In the realm of computer-aided design (CAD) software, the intersection of B-spline surfaces stands as a fundamental operation. Despite the extensive history of surface intersection algorithms, the challenge of handling complex intersection topologies persists. While subdivision algorithms have demonstrated strong robustness in computing surface/surface intersection and are capable of addressing singular cases, determining the topology of the intersection obtained through these methods is a key factor for calculating correct intersection, and remains a difficult issue. To address this challenge, we propose a Mapper-based method for determining the topology of the intersection between two B-spline surfaces. Our algorithm is designed to efficiently handle various common and complex intersection topologies. Experimental results verify the robustness and topological correctness of this method.

The Parameterized Complexity of Vertex-Coloring Edge-Weighting

from arXiv: Data Structures and Algorithms

Authors: Shubhada Aute, Fahad Panolan, Geevarghese Philip

Motivated by the landmark resolution of the 1-2-3 Conjecture, we initiate the study of the parameterized complexity of the Vertex-Coloring {0,1}-Edge-Weighting problem and its generalization, Vertex-Coloring Pre-edge-Weighting, under various structural parameters. The base problem, Vertex-Coloring {0,1}-Edge-Weighting, asks whether we can assign a weight from {0,1} to each edge of a graph so that for every pair of adjacent vertices, the sums of their incident edge weights are distinct. In the Vertex-Coloring Pre-edge-Weighting variant, we are given a graph in which a subset of edges already has fixed weights from {0,1}, and the goal is to determine whether this partial weighting can be extended to all remaining edges so that the complete assignment satisfies the proper vertex coloring property. While the existence of such weightings is well understood for specific graph classes, their algorithmic complexity under structural parameterization has remained unexplored. We prove both hardness and tractability results across a hierarchy of structural parameters. We show that both the base problem and the Pre-edge-Weighting variant are W[1]-hard when parameterized by the size of a feedback vertex set of the input graph. On the positive side, we establish that the base problem, and the restricted Pre-edge-Weighting variant in which all pre-assigned weights are 1, become FPT when parameterized by the size of a vertex cover of the input graph. Further, we show that both the base problem and the Pre-edge-Weighting variant admit XP algorithms when parameterized by the treewidth of the input graph.
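Both decision problems can be made concrete with a small brute-force solver. This is exponential in the number of edges and purely illustrative; the paper's contribution is the parameterized algorithms and hardness results, not this search. The function name and interface below are my own.

```python
from itertools import product

def vc_edge_weighting(n, edges, pre=None):
    """Search for a {0,1} edge weighting under which every pair of
    adjacent vertices gets distinct weighted degrees.

    n: number of vertices, labeled 0..n-1
    edges: list of (u, v) pairs
    pre: optional dict {edge index: fixed weight in {0,1}}, i.e. the
         Pre-edge-Weighting variant
    Returns a weight list, or None if no valid weighting exists.
    """
    pre = pre or {}
    free = [i for i in range(len(edges)) if i not in pre]
    for bits in product((0, 1), repeat=len(free)):
        w = [0] * len(edges)
        for i, b in pre.items():
            w[i] = b
        for i, b in zip(free, bits):
            w[i] = b
        # weighted degree of each vertex under this assignment
        deg = [0] * n
        for (u, v), wi in zip(edges, w):
            deg[u] += wi
            deg[v] += wi
        if all(deg[u] != deg[v] for u, v in edges):
            return w
    return None
```

For example, the triangle admits no valid {0,1} weighting (its three weighted degrees would have to be exactly {0,1,2}, whose sum 3 is odd, but the sum of weighted degrees is always even), while the path on three vertices does.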

Dequantizing Short-Path Quantum Algorithms

from arXiv: Data Structures and Algorithms

Authors: François Le Gall, Suguru Tamaki

The short-path quantum algorithm introduced by Hastings (Quantum 2018, 2019) is a variant of adiabatic quantum algorithms that enables an easier worst-case analysis by avoiding the need to control the spectral gap along a long adiabatic path. Dalzell, Pancotti, Campbell, and Brandão (STOC 2023) recently revisited this framework and obtained a clear analysis of the complexity of the short-path algorithm for several classes of constraint satisfaction problems (MAX-$k$-CSPs), leading to quantum algorithms with complexity $2^{(1-c)n/2}$ for some constant $c>0$. This suggested a super-quadratic quantum advantage over classical algorithms. In this work, we identify an explicit classical mechanism underlying a substantial part of this line of work, and show that it leads to clean dequantizations. As a consequence, we obtain classical algorithms that run in time $2^{(1-c')n}$, for some constant $c'>c$, for the same classes of constraint satisfaction problems. This shows that current short-path quantum algorithms for these problems do not achieve a super-quadratic advantage. On the positive side, our results provide a new "quantum-inspired" approach to designing classical algorithms for important classes of constraint satisfaction problems.
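Why $c' > c$ rules out a super-quadratic advantage reduces to comparing exponents: a quantum runtime $2^{(1-c)n/2}$ squared is $2^{(1-c)n}$, and a classical runtime $2^{(1-c')n}$ with $c' > c$ undercuts that square. The numerical constants below are illustrative only, not values from the paper.

```python
# Exponent comparison behind the "no super-quadratic advantage" claim.
def quantum_exponent(c):
    return (1 - c) / 2          # quantum runtime: 2^{(1-c)n/2}

def classical_exponent(c_prime):
    return 1 - c_prime          # classical runtime: 2^{(1-c')n}

c, c_prime = 0.10, 0.12         # hypothetical constants with c' > c
# Classical exponent (0.88) beats twice the quantum exponent (0.90),
# so the quantum speedup is sub-quadratic.
assert classical_exponent(c_prime) < 2 * quantum_exponent(c)
```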
