OPML feed of all feeds.

Subscribe to the Atom or RSS feed to stay up to date.

Thank you to arXiv for use of its open access interoperability.

Note: the date of arXiv entries announced right after publication holidays might incorrectly show up as the date of the publication holiday itself. This is due to our ad hoc method of inferring announcement dates, which are not returned by the arXiv API.

Powered by Pluto.

Source on GitHub.

Maintained by Nima Anari, Arnab Bhattacharyya, Gautam Kamath.

Theory of Computing Report

Monday, March 16

Small World Models

from Ben Recht

How much do you need to know about a system to control it?

This is a live blog of Lecture 6 of my graduate seminar “Feedback, Learning, and Adaptation.” A table of contents is here.

The backbone of control engineering is the assumption of a reasonably reliable, reasonably simple dynamical system with inputs and outputs. We have to believe that the behavior of the thing we want to steer is consistent enough so that whatever we design in the lab will work on the road.

Now, what exactly do I need to know about this reliably consistent dynamical system to get it to do what I want to do? You want a model of the system that is rich enough to describe everything you might ever see, but small enough so that you can computationally derive control policies and, perhaps, performance guarantees. Simpler system descriptions yield simpler state-estimation and control-design algorithms, both online and offline. What’s the right balance between modeling precision and simplicity? This is the question of system identification.

System identification is the natural place where machine learning and statistics meet control engineering. You need to either estimate parameters of models you believe are true or build predictions of how the system will respond to a string of inputs. What sort of statistical infrastructure you need varies with your control engineering task.

Sometimes you really believe that for all intents and purposes, the system behavior is captured by simple differential equations. An object falling in space without friction will obey Newton’s laws. To identify this system, you just need to measure the object’s mass. Easy peasy.

For more complicated mechanical systems, like quadrotors or simple, slow-moving wheeled vehicles, you can still get away with relatively simple modeling, where the geometry of the problem gives you a nice differential equation with a few parameters determined by the build of your drone. In these cases, where you don't need particularly high performance, you need only break out a ruler and a scale to guesstimate all of the parameters.

Sometimes you can’t determine the parameters from simple measurements, as environmental conditions dictate their values. For example, coefficients of friction might depend on temperature and the particulars of the flooring. Now you can find the parameters by repeatedly testing your system and running a nonlinear regression to minimize the input-output error.
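
As a toy illustration of that fit-by-regression step (my sketch, not from the post): simulate a hypothetical first-order friction model, then recover its coefficient by minimizing the input-output error. The model, the coefficient value, and the grid-search fit are all invented for illustration.

```python
import numpy as np

# Hypothetical first-order friction model (illustrative, not from the post):
#   v[t+1] = v[t] + dt * (u[t] - b * v[t]),  where b is a friction coefficient
def simulate(u, b, dt=0.1):
    v, out = 0.0, []
    for ut in u:
        v = v + dt * (ut - b * v)
        out.append(v)
    return np.array(out)

rng = np.random.default_rng(0)
u = rng.uniform(-1.0, 1.0, 300)                              # test inputs
y = simulate(u, b=0.7) + 0.01 * rng.standard_normal(300)     # noisy measurements

# Nonlinear regression, brute force: minimize simulation error over a grid of b
grid = np.linspace(0.0, 2.0, 2001)
errors = [np.sum((simulate(u, b) - y) ** 2) for b in grid]
b_hat = grid[int(np.argmin(errors))]
print(b_hat)  # close to the true value 0.7
```

A grid search stands in here for a proper nonlinear least-squares solver; the point is only that repeated input-output tests pin down a parameter you can't measure directly.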

As your problem gets even more complicated, maybe you don’t want to bother building a sophisticated simulator and would be perfectly fine with a “black-box” prediction of outputs from inputs. We’ve developed a zoo of methods to do this sort of prediction. The simplest are the “ARMAX” models, which predict outputs as a linear combination of the past inputs and outputs. You can fit these using least squares. If you want to be fancy, you can even compute nice “state-space” models from these linear ARMAX models, using a family of methods that are called subspace identification. This will yield smaller models and simplify your control synthesis problem.
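
A minimal sketch of that least-squares fit, using an ARX model (the ARMAX family without the moving-average noise term); the "true" system below is invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
u = rng.standard_normal(500)
y = np.zeros(500)
# Unknown "true" system (illustrative): y[t] = 0.8 y[t-1] + 0.5 u[t-1] + noise
for t in range(1, 500):
    y[t] = 0.8 * y[t - 1] + 0.5 * u[t - 1] + 0.01 * rng.standard_normal()

# ARX prediction: y[t] as a linear combination of past output and past input
X = np.column_stack([y[:-1], u[:-1]])              # regressor matrix
theta, *_ = np.linalg.lstsq(X, y[1:], rcond=None)  # least-squares fit
a_hat, b_hat = theta
print(a_hat, b_hat)  # close to 0.8 and 0.5
```

Longer lag horizons just mean more columns in the regressor matrix; subspace identification would then compress the fitted model into a small state-space realization.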

On the other hand, you can go in a completely different direction and make your time-series predictor nonlinear. You can use a neural network to predict the next output from your history. If you want to get extra fancy, throw a transformer at the problem. I’m sure this will work great and build the best simulator without knowing anything about the problem at all.

So what’s the right level of modeling granularity for your problem? I don’t have a clear answer. In optimal control, the better your estimate, the better your performance. But maybe you care about the minimal amount of information you need to control something. How much is it?

You might think none. We’ve seen in class already that two systems that look completely different in open loop look the same in closed loop. Feedback can correct modeling errors. The simplest example is

Input u[0]=1, and the “x” variable goes to infinity, but the “z” variable goes to zero. However, under the negative state feedback rule “u[t]=-x[t]”, the systems are identical

which both quickly converge to zero.
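
The post's equations are rendered as images and don't survive in this text, but the phenomenon is easy to reproduce with two illustrative scalar systems of my own choosing, picked so that they differ wildly in open loop yet coincide under the feedback rule u[t] = -x[t]:

```python
# Two illustrative scalar systems (stand-ins, not the post's equations).
# Open loop: x blows up while z decays. Under u[t] = -state, both reduce
# to s[t+1] = 0.5 s[t], so the closed-loop systems are identical.
def step_x(x, u): return 2.0 * x + 1.5 * u   # open-loop unstable
def step_z(z, u): return 0.8 * z + 0.3 * u   # open-loop stable

# Open loop: impulse input u[0] = 1, then zero
x = z = 0.0
for t in range(20):
    u = 1.0 if t == 0 else 0.0
    x, z = step_x(x, u), step_z(z, u)
print(x > 1e4, abs(z) < 1e-2)  # True True

# Closed loop: negative state feedback u[t] = -state
x = z = 1.0
for t in range(20):
    x, z = step_x(x, -x), step_z(z, -z)
print(abs(x) < 1e-5, abs(z) < 1e-5)  # True True
```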

Negative feedback is powerful and can drive solid performance in the face of huge model uncertainties. If you simply care about robust tracking or homeostatic behavior, perhaps you can get away with the most minimal system identification. Unfortunately, it’s not quite that easy. You can have two systems that look the same in open loop but have completely different closed-loop behavior. Karl Astrom has a relatively simple example that I described in an earlier post. There, one system has a filter between the controller and the state that slightly attenuates the frequencies needed to stabilize the system.

Now the question is whether Astrom's pathological counterexample—where two systems look similar in open loop simulation but are catastrophically different under feedback—is indicative of widespread problems. Probably not. I'm not convinced that you have to learn sophisticated robust control for most small-scale robotics demos. (Sorry, John, though complex aerospace systems are certainly another story). I think the takeaway from Astrom's examples is that your model should represent the sorts of disturbances and signals you will see out in the world. And it should be cognizant of the fact that you are going to use the model in closed loop, so you have to understand whether there are delays and noise between the actuation signal and the actuation action.

Of course, this makes sense to any graduate student who has worked on a real robot. Every robotics grad student I’ve spoken to has told me that investing the time in system identification makes the robotic performance infinitely better. Sometimes we have to sit with our dynamical systems for a long time before we know what we need to control them. Understanding what it means for our models to be good enough is the tricky part.

By Ben Recht

Dynamic direct (ranked) access of MSO query evaluation over SLP-compressed strings

from arXiv: Data Structures and Algorithms

Authors: Martín Muñoz

We present an algorithm that, given an index $t$, produces the $t$-th (lexicographically ordered) answer of an MSO query over a string. The algorithm requires linear-time preprocessing, and builds a data structure that answers each of these calls in logarithmic time. We then show how to extend this algorithm for a string that is compressed by a straight-line program (SLP), also with linear-time preprocessing in the (compressed encoding of the) string, and maintaining direct access in logtime of the original string. Lastly, we extend the algorithm by allowing complex edits on the SLP after the direct-access data structure has been processed, which are translated into the data structure in logtime. We do this by adapting a document editing framework introduced by Schmid and Schweikardt (PODS 2022). This work improves on a recent result of dynamic direct access of MSO queries over strings (Bourhis et al., ICDT 2025) by a log-factor on the access procedure, and by extending the results to SLPs.

Tight (S)ETH-based Lower Bounds for Pseudopolynomial Algorithms for Bin Packing and Multi-Machine Scheduling

from arXiv: Data Structures and Algorithms

Authors: Karl Bringmann, Anita Dürr, Karol Węgrzycki

Bin Packing with $k$ bins is a fundamental optimisation problem in which we are given a set of $n$ integers and a capacity $T$ and the goal is to partition the set into $k$ subsets, each of total sum at most $T$. Bin Packing is NP-hard already for $k=2$ and a textbook dynamic programming algorithm solves it in pseudopolynomial time $\mathcal O(n T^{k-1})$. Jansen, Kratsch, Marx, and Schlotter [JCSS'13] proved that this time cannot be improved to $(nT)^{o(k / \log k)}$ assuming the Exponential Time Hypothesis (ETH). Their result has become an important building block, explaining the hardness of many problems in parameterised complexity. Note that their result is one log-factor short of being tight. In this paper, we prove a tight ETH-based lower bound for Bin Packing, ruling out time $2^{o(n)} T^{o(k)}$. This answers an open problem of Jansen et al. and yields improved lower bounds for many applications in parameterised complexity. Since Bin Packing is an example of multi-machine scheduling, it is natural to next study other scheduling problems. We prove tight lower bounds based on the Strong Exponential Time Hypothesis (SETH) for several classic $k$-machine scheduling problems, including makespan minimisation with release dates ($P_k|r_j|C_{\max}$), minimizing the number of tardy jobs ($P_k||ΣU_j$), and minimizing the weighted sum of completion times ($P_k || Σw_j C_j$). For all these problems, we rule out time $2^{o(n)} T^{k-1-\varepsilon}$ for any $\varepsilon > 0$ assuming SETH, where $T$ is the total processing time; this matches classic $n^{\mathcal O(1)} T^{k-1}$-time algorithms from the 60s and 70s. Moreover, we rule out time $2^{o(n)} T^{k-\varepsilon}$ for minimizing the total processing time of tardy jobs ($P_k||Σp_jU_j$), which matches a classic $\mathcal O(n T^{k})$-time algorithm and answers an open problem of Fischer and Wennmann [TheoretiCS'25].
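
For intuition, the textbook dynamic program the abstract refers to can be sketched for $k=2$ bins (my sketch, not the paper's code): track the achievable loads of bin 1 and let bin 2 take the remainder, giving $O(nT)$ time, matching $\mathcal O(n T^{k-1})$ for $k=2$.

```python
# Textbook pseudopolynomial DP for Bin Packing with k = 2 bins (illustrative).
# For general k, the table would track (k-1)-tuples of bin loads instead.
def fits_two_bins(items, T):
    total = sum(items)
    reachable = {0}                    # achievable total loads of bin 1
    for a in items:
        reachable |= {s + a for s in reachable if s + a <= T}
    # feasible iff some bin-1 load s leaves total - s <= T for bin 2
    return any(total - s <= T for s in reachable)

print(fits_two_bins([4, 5, 6, 7], 11))  # True: {4, 7} and {5, 6}
print(fits_two_bins([4, 5, 6, 7], 10))  # False: total 22 exceeds 2 * 10
```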

Extending Exact Integrality Gap Computations for the Metric TSP

from arXiv: Data Structures and Algorithms

Authors: William Cook, Stefan Hougardy, Moritz Petrich

The subtour relaxation of the traveling salesman problem (TSP) plays a central role in approximation algorithms and polyhedral studies of the TSP. A long-standing conjecture asserts that the integrality gap of the subtour relaxation for the metric TSP is exactly 4/3. In this paper, we extend the exact verification of this conjecture for small numbers of vertices. Using the framework introduced by Benoit and Boyd in 2008, we confirm their results up to n=10. We further show that for n=11 and n=12, the published lists of extreme points of the subtour polytope are incomplete: one extreme point is missing for n=11 and twenty-two extreme points are missing for n=12. We extend the enumeration of the extreme points of the subtour polytope to instances with up to 14 vertices in the general case. Restricted to half-integral vertices, we extend the enumeration of extreme points up to n=17. Our results provide additional support for the 4/3-Conjecture.

Optimal Enumeration of Eulerian Trails in Directed Graphs

from arXiv: Data Structures and Algorithms

Authors: Ben Bals, Solon P. Pissis, Matei Tinca

The BEST theorem, due to de Bruijn, van Aardenne-Ehrenfest, Smith, and Tutte, is a classical tool from graph theory that links the Eulerian trails in a directed graph $G=(V,E)$ with the arborescences in $G$. In particular, one can use the BEST theorem to count the Eulerian trails in $G$ in polynomial time. For enumerating the Eulerian trails in $G$, one could naturally resort to first enumerating the arborescences in $G$ and then exploiting the insight of the BEST theorem to enumerate the Eulerian trails in $G$: every arborescence in $G$ corresponds to at least one Eulerian trail in $G$. Instead, we take a simple and direct approach. Our central contribution is a remarkably simple algorithm to directly enumerate the $z_T$ Eulerian trails in $G$ in the \emph{optimal} $O(m + z_T)$ time. As a consequence, our result improves on an implementation of the BEST theorem for counting Eulerian trails in $G$ when $z_T=o(n^2)$, and, in addition, it unconditionally improves the combinatorial $O(m\cdot z_T)$-time algorithm of Conte et al. [FCT 2021] for the same task. Moreover, we show that, with some care, our algorithm can be extended to enumerate Eulerian trails in directed multigraphs in optimal time, enabling applications in bioinformatics and data privacy.
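
To make the enumerated object concrete, here is a naive backtracking baseline of my own (exponential in general, and emphatically not the paper's optimal $O(m + z_T)$ algorithm): it lists every trail from a fixed start vertex that uses each directed edge exactly once.

```python
from collections import defaultdict

# Brute-force enumeration of all Eulerian trails from a fixed start vertex.
# A simple baseline for intuition, not the paper's optimal algorithm.
def eulerian_trails(edges, start):
    adj = defaultdict(list)
    for i, (u, v) in enumerate(edges):
        adj[u].append((i, v))          # keep edge index to mark usage
    trails, used, path = [], [False] * len(edges), [start]

    def extend(u, remaining):
        if remaining == 0:
            trails.append(tuple(path))
            return
        for i, v in adj[u]:
            if not used[i]:
                used[i] = True
                path.append(v)
                extend(v, remaining - 1)
                path.pop()
                used[i] = False

    extend(start, len(edges))
    return trails

# Two directed cycles sharing vertex 0: the trail can traverse them in
# either order, so there are exactly 2 Eulerian trails from 0.
edges = [(0, 1), (1, 2), (2, 0), (0, 3), (3, 0)]
print(len(eulerian_trails(edges, 0)))  # 2
```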

ExpanderGraph-128: A Novel Graph-Theoretic Block Cipher with Formal Security Analysis and Hardware Implementation

from arXiv: Data Structures and Algorithms

Authors: W. A. Susantha Wijesinghe

Lightweight block cipher design has largely focused on incremental optimization of established paradigms such as substitution--permutation networks, Feistel structures, and ARX constructions, where security derives from the algebraic complexity of individual components. We propose a different approach based on \emph{expander-graph interaction networks}, where diffusion and security arise from sparse structural connectivity rather than component sophistication. We present \textbf{ExpanderGraph-128 (EGC128)}, a 128-bit block cipher constructed as a 20-round balanced Feistel network. Each round applies a 64-bit nonlinear transformation governed by a 3-regular expander graph whose vertices execute identical 4-input Boolean functions on local neighborhoods. Security analysis combines MILP-based differential bounds, proven optimal through 10 rounds via SCIP, establishing 147.3-bit differential security and conservatively extrapolating to 413 bits for the full cipher. Linear analysis provides MILP bounds of $\geq 2^{145}$, while related-key evaluation shows no free rounds for any nonzero key difference. Additional tests confirm rapid algebraic degree growth and the absence of invariant affine subspaces. Implementation results demonstrate practical efficiency. FPGA synthesis on Xilinx Artix-7 achieves 261~Mbps at 100~MHz using only 380 LUTs, while ARM Cortex-M4F software requires 25.8~KB Flash and 1.66~ms per encryption. These results show that expander-graph-driven diffusion provides a promising design methodology for lightweight cryptography.

Early Pruning for Public Transport Routing

from arXiv: Data Structures and Algorithms

Authors: Andrii Rohovyi, Abdallah Abuaisha, Toby Walsh

Routing algorithms for public transport, particularly the widely used RAPTOR and its variants, often face performance bottlenecks during the transfer relaxation phase, especially on dense transfer graphs, when supporting unlimited transfers. This inefficiency arises from iterating over many potential inter-stop connections (walks, bikes, e-scooters, etc.). To maintain acceptable performance, practitioners often limit transfer distances or exclude certain transfer options, which can reduce path optimality and restrict the multimodal options presented to travellers. This paper introduces Early Pruning, a low-overhead technique that accelerates routing algorithms without compromising optimality. By pre-sorting transfer connections by duration and applying a pruning rule within the transfer loop, the method discards longer transfers at a stop once they cannot yield an earlier arrival than the current best solution. Early Pruning can be integrated with minimal changes to existing codebases and requires only a one-time preprocessing step. Across multiple state-of-the-art RAPTOR-based solutions, including RAPTOR, ULTRA-RAPTOR, McRAPTOR, BM-RAPTOR, ULTRA-McRAPTOR, and UBM-RAPTOR and tested on the Switzerland and London transit networks, we achieved query time reductions of up to 57%. This approach provides a generalizable improvement to the efficiency of transit pathfinding algorithms. Beyond algorithmic performance, Early Pruning has practical implications for transport planning. By reducing computational costs, it enables transit agencies to expand transfer radii and incorporate additional mobility modes into journey planners without requiring extra server infrastructure. This is particularly relevant for passengers in areas with sparse direct transit coverage, such as outer suburbs and smaller towns, where richer multimodal routing can reveal viable alternatives to private car use.
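
The core rule can be sketched in a few lines (hypothetical names and data layout, not the paper's code): transfers out of each stop are sorted by duration once at preprocessing time, so inside the relaxation loop, the first transfer that cannot beat the current best arrival bound lets us skip all remaining transfers at that stop.

```python
def relax_transfers(stop, transfers, arrival, dest_bound):
    """Relax foot transfers out of `stop` with early pruning (sketch).

    transfers[stop] is pre-sorted by duration, so once arrival[stop] +
    duration can no longer beat the best known arrival at the destination
    (dest_bound), every later (longer) transfer fails too and we break.
    """
    tau = arrival[stop]
    relaxed = 0
    for target, duration in transfers[stop]:
        if tau + duration >= dest_bound:   # early pruning: sorted order means
            break                          # all remaining transfers fail too
        if tau + duration < arrival.get(target, float("inf")):
            arrival[target] = tau + duration
        relaxed += 1
    return relaxed

transfers = {"A": [("B", 2), ("C", 5), ("D", 9), ("E", 12)]}  # sorted by duration
arrival = {"A": 10, "B": 15, "C": 14, "D": 16}
print(relax_transfers("A", transfers, arrival, dest_bound=18))  # 2: D, E pruned
print(arrival["B"], arrival["C"])  # 12 14
```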

Weighted Set Multi-Cover on Bounded Universe and Applications in Package Recommendation

from arXiv: Data Structures and Algorithms

Authors: Nima Shahbazi, Aryan Esmailpour, Stavros Sintos

The weighted set multi-cover problem is a fundamental generalization of set cover that arises in data-driven applications where one must select a small, low-cost subset from a large collection of candidates under coverage constraints. In data management settings, such problems arise naturally either as expressive database queries or as post-processing steps over query results, for example, when selecting representative or diverse subsets from large relations returned by database queries for decision support, recommendation, fairness-aware data selection, or crowd-sourcing. While the general weighted set multi-cover problem is NP-complete, many practical workloads involve a \emph{bounded universe} of items that must be covered, leading to the Weighted Set Multi-Cover with Bounded Universe (WSMC-BU) problem, where the universe size is constant. In this paper, we develop exact and approximation algorithms for WSMC-BU. We first discuss a dynamic programming algorithm that solves WSMC-BU exactly in $O(n^{\ell+1})$ time, where $n$ is the number of input sets and $\ell=O(1)$ is the universe size. We then present a $2$-approximation algorithm based on linear programming and rounding, running in $O(\mathcal{L}(n))$ time, where $\mathcal{L}(n)$ denotes the complexity of solving a linear program with $O(n)$ variables. To further improve efficiency for large datasets, we propose a faster $(2+\varepsilon)$-approximation algorithm with running time $O(n \log n + \mathcal{L}(\log W))$, where $W$ is the ratio of the total weight to the minimum weight, and $\varepsilon$ is an arbitrary constant specified by the user. Extensive experiments on real and synthetic datasets demonstrate that our methods consistently outperform greedy and standard LP-rounding baselines in both solution quality and runtime, making them suitable for data-intensive selection tasks over large query outputs.
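
For reference, the greedy baseline the experiments compare against can be sketched as follows (my sketch, not the authors' code): repeatedly pick the set with the lowest weight per unit of residual demand it satisfies.

```python
def greedy_wsmc(sets, weights, demand):
    """Greedy baseline for weighted set multi-cover (illustrative sketch).

    demand[e] = how many distinct chosen sets must contain element e.
    Repeatedly picks the set minimizing weight per newly satisfied
    demand unit; each set may be chosen at most once.
    """
    residual = dict(demand)
    chosen, cost = [], 0
    available = list(range(len(sets)))
    while any(r > 0 for r in residual.values()):
        def gain(i):
            return sum(1 for e in sets[i] if residual.get(e, 0) > 0)
        # cheapest set per unit of residual demand covered (ties: lowest index)
        best = min((i for i in available if gain(i) > 0),
                   key=lambda i: weights[i] / gain(i))
        available.remove(best)
        chosen.append(best)
        cost += weights[best]
        for e in sets[best]:
            if residual.get(e, 0) > 0:
                residual[e] -= 1
    return chosen, cost

sets = [{"a", "b"}, {"b", "c"}, {"a", "c"}, {"a"}]
weights = [3, 3, 3, 1]
print(greedy_wsmc(sets, weights, {"a": 2, "b": 1, "c": 1}))  # ([3, 0, 1], 7)
```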

Pairwise Exchanges of Freely Replicable Goods with Negative Externalities

from arXiv: Data Structures and Algorithms

Authors: Shangyuan Yang, Kirthevasan Kandasamy

We study a setting where a set of agents engage in pairwise exchanges of freely replicable goods (e.g., digital goods such as data), where two agents grant each other a copy of a good they possess in exchange for a good they lack. Such exchanges introduce a fundamental tension: while agents benefit from acquiring additional goods, they incur negative externalities when others do the same. This dynamic typically arises in real-world scenarios where competing entities may benefit from selective collaboration. For example, in a data sharing consortium, pharmaceutical companies might share (copies of) drug discovery data, when the value of accessing a competitor's data outweighs the risk of revealing their own. In our model, an altruistic central planner wishes to design an exchange protocol (without money), to structure such exchanges between agents. The protocol operates over multiple rounds, proposing sets of pairwise exchanges in each round, which agents may accept or reject. We formulate three key desiderata for such a protocol: (i) individual rationality: agents should not be worse off by participating in the protocol; (ii) incentive-compatibility: agents should be incentivized to share as much as possible by accepting all exchange proposals by the planner; (iii) stability: there should be no further mutually beneficial exchanges upon termination. We design an exchange protocol for the planner that satisfies all three desiderata. While the above desiderata are inspired by classical models for exchange, free-replicability and negative externalities necessitate novel and nontrivial reformalizations of these goals. We also argue that achieving Pareto-efficient agent utilities -- often a central goal in exchange models without externalities -- may be ill-suited in this setting.

Sunday, March 15

For \(R^3\) the problem is open. That's too bad. We live in \(R^3\)

from Computational Complexity

(If you live in Montgomery County Maryland OR if you care about Education, you MUST read this guest blog by Daniel Gottesman on Scott Aaronson's blog HERE.) 

(This post is a sequel to a prior post on this topic that was here. However, this post is self-contained---you don't need to have read the prior post.)  

(Later in the post I point to my open problems column that does what is in this post rigorously. However, that link might be hard to find, so here it is: HERE)



BILL: I have a nice problem to tell you about. First, the setup.

Say you have a finite coloring of \(R^n\).

A mono unit square is a set of four points that

(a) are all the same color, and

(b) form a square of side 1. The square does not need to be parallel to any of the axes.
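
As a sanity check on the definition (illustrative code, not from the post): four points form a square of side 1, in any orientation, exactly when their six pairwise squared distances are 1, 1, 1, 1, 2, 2.

```python
def is_unit_square(pts):
    """Check whether four points in the plane form a square of side 1.

    A square of side 1 has four sides of squared length 1 and two
    diagonals of squared length 2, in any orientation.
    """
    assert len(pts) == 4
    d2 = sorted(
        (ax - bx) ** 2 + (ay - by) ** 2
        for i, (ax, ay) in enumerate(pts)
        for (bx, by) in pts[i + 1:]
    )
    return all(abs(d - t) < 1e-9 for d, t in zip(d2, [1, 1, 1, 1, 2, 2]))
```

For example, the axis-aligned square on \((0,0),(1,0),(1,1),(0,1)\) passes, and so does its rotation by 45 degrees.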

DARLING: Okay. What is the problem?

BILL: It is known that for all 2-colorings of \(R^6\) there is a mono unit square.

DARLING: \(R^6\)? Really! That's hilarious! Surely, better is known.

BILL: Yes better is known. And stop calling me Shirley.

DARLING: Okay, so what else is known?

BILL: An observation about the \(R^6\) result gives us the result for \(R^5\). (The \(R^5\) result also follows from a different technique.) Then a much harder proof gives us the result for \(R^4\). It is easy to construct a coloring of \(R^2\) without a mono unit square. The problem for \(R^3\) is open.

DARLING: That's too bad. We live in \(R^3\).

DARLING: Someone should write an article about all this including proofs of all the known results, open problems,  and maybe a few new things.

BILL: By someone you mean Auguste Gezalyan (got his PhD in CS, in computational geometry, at UMCP), Ryan Parker (an undergrad working on computational geometry at UMCP), and Bill Gasarch (that's me!). Good idea!

A FEW WEEKS LATER

BILL: Done! See here. And I call the problem about \(R^3\) The Darling Problem.

DARLING: Great! Now that you have an in-depth knowledge of the problem---

BILL: Auguste and Ryan have in-depth knowledge. Frankly, I'm out of my depth.

DARLING: Okay, then I'll ask them: What do you think happens in \(R^3\), and when do you think it will be proved?

AUGUSTE: I think there is a 2-coloring of \(R^3\) with no mono unit square.

RYAN: I think that for every 2-coloring of \(R^3\) there is a mono unit square.

BILL: I have no conjecture; however, I think this is the kind of problem that really could be solved. It has not been worked on that much, and it might be just one key idea away from being solved. It is my hope that this article and blog post inspire someone to work on it and solve it.

OBLIGATORY AI COMMENT

Auguste asked ChatGPT (or some AI) about the problem. It replied that the problem is open and is known as The Darling's Problem. This is rather surprising---Auguste asked the AI about this before I had submitted the article (it has since appeared) and before this blog post. So how did the AI know about it? It was on my website. I conjecture that Auguste used some of the same language we used in the paper, so the AI found our paper. The oddest thing about this is that I don't find this odd anymore.

COLOR COMMENTARY

The article appeared as a SIGACT News Open Problems Column. Are you better off reading it there or on my website, which is pointed to above? The SIGACT News version is (a) behind a paywall, and (b) in black and white. The version on my website is (a) free access, and (b) uses color. You decide.

By gasarch


Linkage with plum blossoms

from David Eppstein

Remembering Joe Halpern (\(\mathbb{M}\)) focuses on Joe’s pivotal role in founding and guiding the CS section of the arXiv.

By David Eppstein

On Montgomery County public magnet schools: a guest post by Daniel Gottesman

from Scott Aaronson



Scott’s foreword: I’ve known fellow quantum computing theorist Daniel Gottesman, now at the University of Maryland, for a quarter-century at this point. Daniel has been a friend, colleague, coauthor, and one of the people from whom I’ve learned the most in my career. Today he writes about a topic close to my heart, and one to which I’ve regularly lent this blog over the decades: namely, the struggle to protect enrichment and acceleration in the United States (in this case, the public magnet programs in Montgomery County, Maryland) from the constant attempts to weaken or dismantle them. Thanks so much to Daniel for doing this, and please help out if you can!


Without further ado, Daniel Gottesman:

Scott has kindly let me write this guest post because I’d like to ask the readers of Shtetl-Optimized for help.  I live in Montgomery County, Maryland, and the county is getting ready to replace our current handful of great magnet programs with a plethora of mediocre ones.

Montgomery County has a generally quite good school system, but its gifted education programs are really inadequate at the elementary and middle school level.  Montgomery County Public Schools (MCPS) offers nothing at all for gifted children until 4th grade.  Starting in 4th grade, magnet programs are available, but there are not enough spaces for everyone who meets the minimum qualifications.  A few years ago, the elementary and middle school magnets were switched to a lottery system, meaning the highest-achieving students, who most need special programming, might or might not get in, based purely on luck of the draw.

The remaining bright spot has been the high school magnets.  Montgomery County has two well-known and high-performing magnets, a STEM magnet at Montgomery Blair high school and an International Baccalaureate (IB) program at Richard Montgomery.  The Richard Montgomery IB program draws students from the whole county and the Blair Magnet draws from 2/3 of the county (with the remaining 1/3 eligible to go to another successful but less well-known magnet at Poolesville).  And these programs have so far resisted the lottery: They pick the best students from the application pool.

So with inadequate magnets in the lower grades and stellar magnets in high school, you can guess which one is up for a change.

MCPS now wants to reconfigure the high school magnet programs by splitting the county up into 6 regions.  Students will only be allowed to apply to programs in their home region.  Each region will have its own STEM magnet and its own IB program, as well as programs in the arts, medicine, and leadership.  And actually there are multiple program strands in each of these subjects, sometimes in different schools.  The whole plan is big and complicated, with close to 100 different programs around the county, more than half of them new.

The stated purpose of this plan is to expand access to these programs by admitting more students and reducing travel times to the programs.  And who could object to that?  There are definitely places in the county that are far from the current magnets and there are certainly more students that can benefit from high-quality magnets than there is currently space for.

The problem is that making high-quality magnets has not been a priority in the design process.  The last time MCPS tried adding regional magnets was about 7 years ago, when they added 3 regional IB programs while keeping Richard Montgomery available to students all over the county.  It was a failure: Test scores at the regional IB programs are far below those at Richard Montgomery (the worst-performing regional IB had only 24% getting a passing grade in even one subject in 2024, compared to 99% at Richard Montgomery) and all 3 are underenrolled.  Now MCPS has decided they can solve this problem by preventing students from going to Richard Montgomery to try to force them to go to the regional IBs.  In addition, they want to repeat the same mistakes with the STEM and other magnets.  The best programs in the county will shrink and only be accessible to a small fraction of students, leaving everyone else with new programs of likely highly-varying quality.

And if that were not enough, they want to do this revamp on a ridiculously short timeline.  The new programs are supposed to start in the 2027-8 school year, and between now and then, they need to recruit and train teachers for these 100 programs, create all the curricula for the first year of the programs (they are only planning to do one year at a time), and much much more.  The probability of a train wreck in the early years of the new system seems high.

Equity is certainly a concern driving this change.  And let me be clear: I am totally in favor of improving equity in the school system.  But I agree with Scott on this point: strong magnet programs in the public schools are pro-equity and weakening magnet programs is anti-equity.  Magnet programs are pro-equity even if the magnets are disproportionally populated by more affluent students, which is admittedly the case in MCPS: Affluent students will always have access to enrichment outside school and to private schools for the most affluent, whereas the public magnet programs are the only source of enrichment for those without those resources.

If MCPS really wants to address the difference in achievement between richer and poorer students, the way to do that is to create gifted programming starting from kindergarten.  If you wait until high school, it is unreasonable to expect even brilliant students to catch up to their also highly-capable peers who have been doing math and science camps and extracurriculars and contests and whatnot since they were little.  Some can manage it, but it is certainly not easy.  Unfortunately, MCPS’s notion of equity seems more focused on optimizing the demographic breakdown of magnet programs, which is most easily achieved by techniques which don’t improve — and usually degrade — the quality of the education provided.

So how can you help?  The Board of Education (BOE) is supposed to vote on this plan on Mar. 26.  Those of us opposed to it are hoping to sway enough members to vote to tell MCPS to investigate alternatives.  For instance, I have proposed a model with only 3 regions, which could also substantially improve access while preserving the strong existing magnets.

If you live in Montgomery County, write to BOE members telling them you oppose this change.  You can also sign a petition — there are many, but my favorite is here.

If you are an alumnus of one of the MCPS magnets, write to the BOE telling them how your education there was valuable to you and how a smaller program would not have served you as well.

If you are unconnected to Montgomery County, you can still spread the word.  If the BOE gets enough press inquiries asking about the many things that don’t add up in the MCPS proposal, perhaps they will recognize that this is a bad idea.

If you are really really interested in this topic and want to learn more: Last fall, I put together a long analysis of some of the flaws in MCPS’s plan and their claims, and of the alternative 3-region model.  You can find it here.

By Scott

Saturday, March 14

TR26-039 | Super-quadratic Lower Bounds for Depth-2 Linear Threshold Circuits | Lijie Chen, Avishay Tal, Yichuan Wang

from ECCC Papers

Proving lower bounds against depth-$2$ linear threshold circuits (a.k.a. $THR \circ THR$) is one of the frontier questions in complexity theory. Despite tremendous effort, our best lower bounds for $THR \circ THR$ only hold for sub-quadratic number of gates, which was proven a decade ago by Tamaki (ECCC TR16) and Alman, Chan, and Williams (FOCS 2016) for a hard function in $E^{NP}$. In this work, we prove that there is a function $f \in E^{NP}$ that requires $n^{2.5-\varepsilon}$-size $THR \circ THR$ circuits for any $\varepsilon > 0$. We obtain our new results by designing a new $2^{n - n^{\Omega(\varepsilon)}}$-time algorithm for estimating the acceptance probability of an XOR of two $n^{2.5-\varepsilon}$-size $THR \circ THR$ circuits, and apply Williams' algorithmic method to obtain the desired lower bound.

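
To unpack the circuit model: a linear threshold gate fires iff a weighted sum of its inputs reaches a threshold, and a $THR \circ THR$ circuit feeds a layer of such gates into a single output threshold gate. A minimal evaluator with a toy XOR instance (illustrative only; the gate weights are made up, not from the paper):

```python
def thr_gate(weights, theta, x):
    # Linear threshold gate: outputs 1 iff <weights, x> >= theta.
    return 1 if sum(w * xi for w, xi in zip(weights, x)) >= theta else 0

def thr_thr(bottom, top_weights, top_theta, x):
    # Depth-2 circuit: a layer of threshold gates feeding one top gate.
    hidden = [thr_gate(w, t, x) for (w, t) in bottom]
    return thr_gate(top_weights, top_theta, hidden)

# XOR of two bits as a THR o THR circuit with two bottom gates:
# g1 = [x1 + x2 >= 1], g2 = [x1 + x2 >= 2], output = [g1 - g2 >= 1].
XOR_BOTTOM = [([1, 1], 1), ([1, 1], 2)]
```

Here `thr_thr(XOR_BOTTOM, [1, -1], 1, [a, b])` equals `a XOR b` for bits `a, b`; an XOR of threshold circuits is exactly the kind of composition whose acceptance probability the paper's algorithm estimates.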

News for February 2026

from Property Testing Review


Apologies for the very late post! Last month was a bit calmer on the property testing front, with “merely” 3 papers we found. (Of course, if we missed any… let us know in the comments!)

Testing Monotonicity of Real-Valued Functions on DAGs, by Yuichi Yoshida (arXiv). Monotonicity of functions is a fundamental and well-studied property in the literature, and testing monotonicity on the line, the reals, the Boolean hypercube, and the hypergrid (among others) has been studied at great length (and yet is still not fully understood!). This paper considers a new twist on the question, where the object of study is a real-valued function defined on an \(n\)-vertex directed acyclic graph (DAG) provided to the algorithm. The key contribution of this work is showing that, on this type of structured poset, testing monotonicity requires \(\Omega(n^{1/2-\delta}/\sqrt{\varepsilon})\) non-adaptive queries for any constant \(\delta>0\), nearly matching the general-poset non-adaptive upper bound of Fischer, Lehman, Newman, Raskhodnikova, Rubinfeld, and Samorodnitsky (2002). The paper also provides a similar adaptive lower bound for one-sided testers. The author also establishes more fine-grained results (both upper and lower bounds), leveraging assumptions on either the range of the function or the sparsity of the DAG.
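
For intuition on the query model (this is the naive baseline, not the paper's algorithm): the simplest non-adaptive tester samples comparable pairs \(u \preceq v\) from the DAG's reachability order and rejects only if it witnesses a violation \(f(u) > f(v)\), which makes it one-sided by construction. A hedged sketch:

```python
import random

def comparable_pairs(adj):
    # All pairs (u, v) with v reachable from u, via DFS from each vertex.
    pairs = []
    for u in adj:
        stack, seen = list(adj[u]), set()
        while stack:
            v = stack.pop()
            if v not in seen:
                seen.add(v)
                pairs.append((u, v))
                stack.extend(adj[v])
    return pairs

def pair_tester(adj, f, queries, rng=random):
    # One-sided monotonicity tester: reject only on a witnessed violation.
    pairs = comparable_pairs(adj)
    for _ in range(queries):
        u, v = rng.choice(pairs)
        if f[u] > f[v]:
            return "reject"
    return "accept"
```

The lower bound above says that any non-adaptive strategy, not just this one, needs roughly \(\sqrt{n}\) queries on some DAGs.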

The Power of Two Bases: Robust and copy-optimal certification of nearly all quantum states with few-qubit measurements, by Andrea Coladangelo, Jerry Li, Joseph Slote, and Ellen Wu (arXiv). Following recent works by Huang, Preskill, and Soleimanifar, and then Gupta, He, and O’Donnell, this paper considers the task of state certification (“is this unknown quantum state, which I am given copies of, equal to, or very different from, the reference quantum state I want?”), which can be seen as the quantum analogue of identity testing in the classical distribution testing case, for pure reference target states. The key aspect of these works is that one requires the testing algorithm to make very “simple” measurements (ideally single-qubit ones) on the copies of the unknown \(n\)-qubit state: the underlying idea being that certifying a state given to you should be, in a very quantifiable sense, much “simpler” than preparing the reference state from scratch, otherwise the whole endeavor is sort of useless. Long story short, in this paper, the authors obtain both a very long title and a much more robust algorithm to perform this task, allowing them to do tolerant state certification with a constant tolerance parameter. Only slight wrinkles: the algorithm requires one final measurement on logarithmically many qubits (not a single qubit, which would be the Holy Qugrail), and only works for “nearly all” reference states.

Instance-optimal estimation of \(L_2\)-norm, by Tomer Adar (arXiv). Given i.i.d. samples from a probability distribution \(p\) over an arbitrary discrete domain, estimate its collision probability \(\|p\|^2_2\) (equivalently, its \(\ell_2\)-norm) to a multiplicative \(1\pm \varepsilon\) factor. How hard can this be? Quite surprisingly, this question had not, in fact, been fully solved, and it shows a much more complex landscape than expected, in that the right answer is not the obvious guess. An algorithm matching a (known, yet unpublished) lower bound of Tugkan Batu and myself was posed as an open problem by Tugkan at WoLA 2025: in this work, the author solves the problem, showing that the lower bound is indeed tight, by providing an algorithm achieving the right sample complexity, \(O\left(\frac{1}{\varepsilon \|p\|_2}+\frac{\|p\|_3^3-\|p\|_2^4}{ \varepsilon^2\|p\|_2^4}\right)\). It feels good to see this very simple-looking (but not simple, it turns out!) fundamental question solved.
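
The collision probability in that last item has a textbook unbiased estimator: the fraction of ordered sample pairs that collide, since each pair collides with probability exactly \(\|p\|_2^2\). A minimal baseline sketch (not the instance-optimal algorithm from the paper):

```python
from collections import Counter

def collision_probability_estimate(samples):
    """Unbiased estimate of ||p||_2^2 from i.i.d. samples:
    the fraction of ordered pairs of samples that collide."""
    m = len(samples)
    counts = Counter(samples)
    colliding = sum(c * (c - 1) for c in counts.values())
    return colliding / (m * (m - 1))
```

How many samples this needs for a \(1\pm\varepsilon\) multiplicative guarantee, as a function of \(p\) itself, is exactly what the instance-optimal bounds above pin down.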

By Clement Canonne

Friday, March 13

Visibly Recursive Automata

from arXiv: Computational Complexity

Authors: Kévin Dubrulle, Véronique Bruyère, Guillermo A. Pérez, Gaëtan Staquet

As an alternative to visibly pushdown automata, we introduce visibly recursive automata (VRAs), composed of a set of classical automata that can call each other. VRAs are a strict extension of so-called systems of procedural automata, a model proposed by Frohme and Steffen. We study the complexity of standard language-theoretic operations and classical decision problems for VRAs. Since the class of deterministic VRAs forms a strict subclass in terms of expressiveness, we propose a (weaker) notion that does not restrict expressive power and which we call codeterminism. Codeterminism comes with many desirable algorithmic properties that we demonstrate by using it, e.g., as a stepping stone towards implementing complementation of VRAs.


On the Computational Hardness of Transformers

from arXiv: Computational Complexity

Authors: Barna Saha, Yinzhan Xu, Christopher Ye, Hantao Yu

The transformer has revolutionized modern AI across language, vision, and beyond. It consists of $L$ layers, each running $H$ attention heads in parallel and feeding the combined output to the subsequent layer. In attention, the input consists of $N$ tokens, each a vector of dimension $m$. The attention mechanism involves multiplying three $N \times m$ matrices, applying softmax to an intermediate product. Several recent works have advanced our understanding of the complexity of attention. Known algorithms for transformers compute each attention head independently. This raises a fundamental question that has recurred throughout TCS under the guise of ``direct sum'' problems: can multiple instances of the same problem be solved more efficiently than solving each instance separately? Many answers to this question, both positive and negative, have arisen in fields spanning communication complexity and algorithm design. Thus, we ask whether transformers can be computed more efficiently than $LH$ independent evaluations of attention. In this paper, we resolve this question in the negative, and give the first non-trivial computational lower bounds for multi-head multi-layer transformers. In the small embedding regime ($m = N^{o(1)}$), computing $LH$ attention heads separately takes $LHN^{2 + o(1)}$ time. We establish that this is essentially optimal under SETH. In the large embedding regime ($m = N$), one can compute $LH$ attention heads separately using $LHN^{\omega + o(1)}$ arithmetic operations (plus exponents), where $\omega$ is the matrix multiplication exponent. We establish that this is optimal, by showing that $LHN^{\omega - o(1)}$ arithmetic operations are necessary when $\omega > 2$. Our lower bound in the large embedding regime relies on a novel application of the Baur-Strassen theorem, a powerful algorithmic tool underpinning the famous backpropagation algorithm.

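
For reference, the single-head computation the abstract describes, on $N$ tokens of dimension $m$ (a plain NumPy sketch of standard scaled dot-product attention, not code from the paper):

```python
import numpy as np

def attention(Q, K, V):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(m)) V.

    Q, K, V are N x m matrices; the N x N score matrix is the
    quadratic bottleneck the lower bounds are about.
    """
    N, m = Q.shape
    scores = Q @ K.T / np.sqrt(m)                  # N x N
    scores -= scores.max(axis=1, keepdims=True)    # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)  # row-wise softmax
    return weights @ V                             # N x m
```

A multi-head, multi-layer transformer evaluates $LH$ such maps; the paper shows this cannot be done asymptotically faster than $LH$ independent evaluations, under SETH.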

Space-Efficient Approximate Spherical Range Counting in High Dimensions

from arXiv: Computational Geometry

Authors: Andreas Kalavas, Ioannis Psarros

We study the following range searching problem in high-dimensional Euclidean spaces: given a finite set $P\subset \mathbb{R}^d$, where each $p\in P$ is assigned a weight $w_p$, and radius $r>0$, we need to preprocess $P$ into a data structure such that when a new query point $q\in \mathbb{R}^d$ arrives, the data structure reports the cumulative weight of points of $P$ within Euclidean distance $r$ from $q$. Solving the problem exactly seems to require space usage that is exponential to the dimension, a phenomenon known as the curse of dimensionality. Thus, we focus on approximate solutions where points up to $(1+\varepsilon)r$ away from $q$ may be taken into account, where $\varepsilon>0$ is an input parameter known during preprocessing. We build a data structure with near-linear space usage, and query time in $n^{1-\Theta(\varepsilon^4/\log(1/\varepsilon))}+t_q^{\varrho}\cdot n^{1-\varrho}$, for some $\varrho=\Theta(\varepsilon^2)$, where $t_q$ is the number of points of $P$ in the ambiguity zone, i.e., at distance between $r$ and $(1+\varepsilon)r$ from the query $q$. To the best of our knowledge, this is the first data structure with efficient space usage (subquadratic or near-linear for any $\varepsilon>0$) and query time that remains sublinear for any sublinear $t_q$. We supplement our worst-case bounds with a query-driven preprocessing algorithm to build data structures that are well-adapted to the query distribution.

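
For contrast with the data-structure bounds above, the exact problem has a trivial preprocessing-free baseline: a linear scan summing the weights of points within distance $r$ of the query (illustrative only):

```python
import math

def spherical_range_count(points, weights, q, r):
    # Exact weighted spherical range counting by brute-force linear scan.
    return sum(w for p, w in zip(points, weights) if math.dist(p, q) <= r)
```

The paper's data structure trades one-time near-linear-space preprocessing for sublinear query time, at the cost of possibly counting points in the $(r, (1+\varepsilon)r]$ ambiguity zone.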

On strictly output sensitive color frequency reporting

from arXiv: Computational Geometry

Authors: Erwin Glazenburg, Frank Staals

Given a set of $n$ colored points $P \subset \mathbb{R}^d$ we wish to store $P$ such that, given some query region $Q$, we can efficiently report the colors of the points appearing in the query region, along with their frequencies. This is the \emph{color frequency reporting} problem. We study the case where query regions $Q$ are axis-aligned boxes or dominance ranges. If $Q$ contains $k$ colors, the main goal is to achieve ``strictly output sensitive'' query time $O(f(n) + k)$. Firstly, we show that, for every $s \in \{ 2, \dots, n \}$, there exists a simple $O(ns\log_s n)$ size data structure for points in $\mathbb{R}^2$ that allows frequency reporting queries in $O(\log n + k\log_s n)$ time. Secondly, we give a lower bound for the weighted version of the problem in the arithmetic model of computation, proving that with $O(m)$ space one cannot achieve query times better than $\Omega\left(\varphi\frac{\log (n / \varphi)}{\log (m / n)}\right)$, where $\varphi$ is the number of possible colors. This means that our data structure is near-optimal. We extend these results to higher dimensions as well. Thirdly, we present a transformation that allows us to reduce the space usage of the aforementioned data structure to $O(n(s \varphi)^\varepsilon \log_s n)$. Finally, we give an $O(n^{1+\varepsilon} + m \log n + K)$-time algorithm that can answer $m$ dominance queries in $\mathbb{R}^2$ with total output complexity $K$, while using only linear working space.

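
A baseline for the problem statement: without preprocessing, one pass over $P$ tallies the colors inside an axis-aligned query box in $O(n + k)$ time; the paper's structures replace the $O(n)$ term with $O(\log n + k\log_s n)$. A sketch (illustrative, not from the paper):

```python
from collections import Counter

def color_frequencies(points, lo, hi):
    """Color -> frequency for colored points inside the axis-aligned box
    with corners lo and hi (coordinate tuples), by brute-force scan."""
    freq = Counter()
    for coords, color in points:
        if all(l <= c <= h for c, l, h in zip(coords, lo, hi)):
            freq[color] += 1
    return dict(freq)
```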

On the maximum number of tangencies among $1$-intersecting curves

from arXiv: Computational Geometry

Authors: Eyal Ackerman, Balázs Keszegh

According to a conjecture of Pach, there are $O(n)$ tangent pairs among any family of $n$ Jordan arcs in which every pair of arcs has precisely one common point and no three arcs share a common point. This conjecture was proved for two special cases; however, for the general case the currently best upper bound is only $O(n^{7/4})$. This is also the best known bound on the number of tangencies in the relaxed case where every pair of arcs has \emph{at most} one common point. We improve the bounds for the latter and former cases to $O(n^{5/3})$ and $O(n^{3/2})$, respectively. We also consider a few other variants of these questions, for example, we show that if the arcs are \emph{$x$-monotone}, each pair intersects at most once and their left endpoints lie on a common vertical line, then the maximum number of tangencies is $\Theta(n^{4/3})$. Without this last condition the number of tangencies is $O(n^{4/3}(\log n)^{1/3})$, improving a previous bound of Pach and Sharir. Along the way we prove a graph-theoretic theorem which extends a result of Erdős and Simonovits and may be of independent interest.

Fast and exact visibility on digitized shapes and application to saliency-aware normal estimation

from arXiv: Computational Geometry

Authors: Romain Negro, Jacques-Olivier Lachaud

Computing visibility on a geometric object requires heavy computations, since it requires identifying pairs of points that are visible to each other, i.e., there is a straight segment joining them that stays in the close vicinity of the object boundary. We propose to exploit a specific representation of digital sets based on lists of integral intervals in order to compute efficiently the complete visibility graph between lattice points of the digital shape. As a quite direct application, we then show how we can use visibility to estimate the normal vector field of a digital shape in an accurate and convergent manner while staying aware of the salient and sharp features of the shape.

Approximate Dynamic Nearest Neighbor Searching in a Polygonal Domain

from arXiv: Computational Geometry

Authors: Joost van der Laan, Frank Staals, Lorenzo Theunissen

We present efficient data structures for approximate nearest neighbor searching and approximate 2-point shortest path queries in a two-dimensional polygonal domain $P$ with $n$ vertices. Our goal is to store a dynamic set of $m$ point sites $S$ in $P$ so that we can efficiently find a site $s \in S$ closest to an arbitrary query point $q$. We will allow both insertions and deletions in the set of sites $S$. However, as even just computing the distance between an arbitrary pair of points $q,s \in P$ requires a substantial amount of space, we allow for approximating the distances. Given a parameter $\varepsilon > 0$, we build an $O(\frac{n}{\varepsilon}\log n)$ space data structure that can compute a $(1+\varepsilon)$-approximation of the distance between $q$ and $s$ in $O(\frac{1}{\varepsilon^2}\log n)$ time. Building on this, we then obtain an $O(\frac{n+m}{\varepsilon}\log n + \frac{m}{\varepsilon}\log m)$ space data structure that allows us to report a site $s \in S$ so that the distance between query point $q$ and $s$ is at most $(1+\varepsilon)$-times the distance between $q$ and its true nearest neighbor in $O(\frac{1}{\varepsilon^2}\log n + \frac{1}{\varepsilon}\log n \log m + \frac{1}{\varepsilon}\log^2 m)$ time. Our data structure supports updates in $O(\frac{1}{\varepsilon^2}\log n + \frac{1}{\varepsilon}\log n \log m + \frac{1}{\varepsilon}\log^2 m)$ amortized time.

Deterministic Algorithm for Non-monotone Submodular Maximization under Matroid and Knapsack Constraints

from arXiv: Data Structures and Algorithms

Authors: Shengminjie Chen, Yiwei Gao, Kaifeng Lin, Xiaoming Sun, Jialin Zhang

Submodular maximization constitutes a prominent research topic in combinatorial optimization and theoretical computer science, with extensive applications across diverse domains. While substantial advancements have been achieved in approximation algorithms for submodular maximization, the majority of algorithms yielding high approximation guarantees are randomized. In this work, we investigate deterministic approximation algorithms for maximizing non-monotone submodular functions subject to matroid and knapsack constraints. For the two distinct constraint settings, we propose novel deterministic algorithms grounded in an extended multilinear extension framework. Under matroid constraints, our algorithm achieves an approximation ratio of $(0.385 - \varepsilon)$, whereas for knapsack constraints, the proposed algorithm attains an approximation ratio of $(0.367 - \varepsilon)$. Both algorithms have $\mathrm{poly}(n)$ query complexity, where $n$ is the size of the ground set, and improve upon the state-of-the-art deterministic approximation ratios of $(0.367 - \varepsilon)$ for matroid constraints and $0.25$ for knapsack constraints.
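
For context on what a deterministic algorithm for non-monotone submodular maximization looks like, the classical starting point is the deterministic double greedy of Buchbinder, Feldman, Naor, and Schwartz, which gives a $1/3$-approximation for the *unconstrained* case. The sketch below illustrates that classical baseline only; the paper's constrained algorithms use a different, multilinear-extension-based technique:

```python
def double_greedy(f, ground):
    """Deterministic 1/3-approximation for unconstrained non-monotone
    submodular maximization (Buchbinder et al.): grow X from the empty set
    and shrink Y from the full ground set, deciding each element by
    comparing its marginal gain to X against its marginal loss from Y."""
    X, Y = set(), set(ground)
    for e in ground:
        a = f(X | {e}) - f(X)      # marginal gain of adding e to X
        b = f(Y - {e}) - f(Y)      # marginal gain of dropping e from Y
        if a >= b:
            X.add(e)
        else:
            Y.discard(e)
    return X                        # X == Y when the loop finishes
```

On a 4-cycle cut function (nonnegative and submodular), it recovers the maximum cut $\{0, 2\}$ of value 4.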

Bounding the Fragmentation of B-Trees Subject to Batched Insertions

from arXiv: Data Structures and Algorithms

Authors: Michael A. Bender, Aaron Bernstein, Nairen Cao, Alex Conway, Martín Farach-Colton, Hanna Komlós, Yarin Shechter, Nicole Wein

The issue of internal fragmentation in data structures is a fundamental challenge in database design. A seminal result of Yao in this field shows that evenly splitting the leaves of a B-tree against a workload of uniformly random insertions achieves space utilization of around 69%. However, many database applications perform batched insertions, where a small run of consecutive keys is inserted at a single position. We develop a generalization of Yao's analysis to provide rigorous treatment of such batched workloads. Our approach revisits and reformulates the analytical structure underlying Yao's result in a way that enables generalization and is used to argue that even splitting works well for many workloads in our extended class. For the remaining workloads, we develop simple alternative strategies that provably maintain good space utilization.
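
Yao's 69% figure is $\ln 2 \approx 0.693$, and the even-split policy it analyzes is easy to simulate. The toy model below is my own illustrative sketch (uniformly random keys, leaves of a fixed capacity, even splits on overflow), not the paper's analysis; the measured average fill should hover near $\ln 2$:

```python
import bisect
import random

def leaf_utilization(n=100_000, cap=16, seed=0):
    """Toy simulation of B-tree leaf utilization under uniformly random
    insertions with even splits: when a leaf of capacity `cap` overflows,
    split it into two half-full leaves. Returns the average leaf fill,
    which tends toward ln 2 ~ 0.693 (Yao's ~69%)."""
    rng = random.Random(seed)
    leaves = [[]]        # each leaf holds a sorted list of keys
    splits = [1.0]       # upper key boundary of each leaf
    for _ in range(n):
        key = rng.random()
        i = bisect.bisect_left(splits, key)   # leaf whose range covers key
        bisect.insort(leaves[i], key)
        if len(leaves[i]) > cap:              # even split at the median
            half = len(leaves[i]) // 2
            left, right = leaves[i][:half], leaves[i][half:]
            leaves[i:i + 1] = [left, right]
            splits.insert(i, left[-1])
    return n / (cap * len(leaves))
```

The batched-insertion workloads the paper studies break the uniformity assumption this simulation relies on, which is precisely why the even-split analysis needs to be generalized.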

Time, Message and Memory-Optimal Distributed Minimum Spanning Tree and Partwise Aggregation

from arXiv: Data Structures and Algorithms

Authors: Michael Elkin, Tanya Goldenfeld

Memory-(in)efficiency is a crucial consideration that oftentimes prevents deployment of state-of-the-art distributed algorithms in real-life modern networks. In the context of the MST problem, roughly speaking, there are three types of algorithms. The algorithm of Gallager-Humblet-Spira and its versions are memory- and message-efficient, but their running time is at least linear in the number of vertices $n$, even when the unweighted diameter $D$ is much smaller than $n$. The algorithm of Garay-Kutten-Peleg and its versions are time-efficient, but not message- or memory-efficient. More recent algorithms are time- and message-efficient, but are not memory-efficient. As a result, GHS-type algorithms are much more prominent in real-life applications than time-efficient ones. In this paper we develop a deterministic time-, message- and memory-efficient algorithm for the MST problem. It is also applicable to the more general partwise aggregation problem. We believe that our techniques will be useful for devising memory-efficient versions for many other distributed problems.

Pivot based correlation clustering in the presence of good clusters

from arXiv: Data Structures and Algorithms

Authors: David Rasmussen Lolck, Mikkel Thorup, Shuyi Yan

The classic pivot-based clustering algorithm of Ailon, Charikar and Chawla [JACM'08] achieves approximation factor 3, but all concrete examples showing that it is no better than 3 are based on some very good clusters, e.g., a complete graph minus a matching. By removing all good clusters before each pivot step, we show that this improves the approximation ratio to $2.9991$. We also evaluate the proposed algorithm on synthetic datasets, where it performs remarkably well and improves over both the algorithm for locating good clusters and the classic pivot algorithm.
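
The Ailon-Charikar-Chawla pivot algorithm that serves as the baseline here is short enough to state in full: pick a random pivot, cluster it with every remaining vertex it has a "+" (similar) edge to, and recurse. A sketch, where `similar(u, v)` is an assumed predicate for a "+" edge:

```python
import random

def pivot_cluster(vertices, similar, seed=0):
    """Classic ACC pivot algorithm for correlation clustering: repeatedly
    pick a random pivot from the remaining vertices and form a cluster from
    the pivot plus all remaining vertices similar to it."""
    rng = random.Random(seed)
    remaining = list(vertices)
    clusters = []
    while remaining:
        pivot = remaining.pop(rng.randrange(len(remaining)))
        clusters.append([pivot] + [v for v in remaining if similar(pivot, v)])
        remaining = [v for v in remaining if not similar(pivot, v)]
    return clusters
```

On an input consisting of two disjoint cliques (a "very good" instance in the abstract's sense), any pivot choice recovers the cliques exactly.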

Enumerating All Directed Spanning Trees in Optimal Time

from arXiv: Data Structures and Algorithms

Authors: Paweł Gawrychowski, Marcin Knapik

We consider the problem of enumerating, for a given directed graph $G=(V,E)$ and a node $r\in V$, all directed spanning trees of $G$ rooted at $r$. For undirected graphs, the corresponding problem of enumerating all spanning trees has received considerable attention, culminating in the algorithm of Kapoor and Ramesh [SICOMP 1995] working in $\mathcal{O}(n+m+N)$ time, where $N, n, m$ denote the number of spanning trees, vertices, and edges of $G$, respectively. In the area of enumeration algorithms, this is known as Constant Amortised Time, or CAT. To achieve only constant time per each spanning tree, the algorithm outputs the relative change between the subsequent spanning trees instead of the whole spanning trees themselves. The natural generalization to enumerating all directed spanning trees has been already considered by Gabow and Myers [SICOMP 1978], who provided an $\mathcal{O}(n+m+Nm)$ time algorithm. This time complexity has been improved upon a couple of times, and in 1998 Uno introduced the framework of trimming and balancing that allowed him to obtain an $\mathcal{O}(n+m\log n+N\log^{2}n)$ time algorithm for this problem. By plugging in later results it is immediate to improve the time complexity to $\mathcal{O}(n+m+N\log n)$, but achieving the optimal bound of $\mathcal{O}(n+m+N)$ seems problematic within this framework. In this paper, we show how to enumerate all directed spanning trees in $\mathcal{O}(n+m+N)$ time and $\mathcal{O}(n+m)$ space, matching the time bound for undirected graphs. Our improvement is obtained by designing a purely graph-theoretical characterization of graphs with very few directed spanning trees, and using their structure to speed up the algorithm.
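
While the paper enumerates the trees themselves, the count $N$ that appears in all these bounds can be computed directly by Tutte's directed matrix-tree theorem: the number of spanning arborescences rooted at $r$ is the determinant of the in-degree Laplacian with the root's row and column removed. A self-contained sketch for context (exact arithmetic via `Fraction`, so no floating-point error):

```python
from fractions import Fraction

def count_arborescences(n, arcs, root):
    """Number of spanning arborescences of a digraph rooted at `root`
    (edges directed away from the root), via Tutte's directed matrix-tree
    theorem: det of L = D_in - A with the root's row/column deleted."""
    idx = [v for v in range(n) if v != root]
    pos = {v: j for j, v in enumerate(idx)}
    m = len(idx)
    L = [[Fraction(0)] * m for _ in range(m)]
    for u, v in arcs:                  # arc u -> v
        if v == root:
            continue
        L[pos[v]][pos[v]] += 1         # in-degree of v
        if u != root:
            L[pos[u]][pos[v]] -= 1
    det = Fraction(1)                  # Gaussian elimination over Fractions
    for c in range(m):
        piv = next((r for r in range(c, m) if L[r][c]), None)
        if piv is None:
            return 0
        if piv != c:
            L[c], L[piv] = L[piv], L[c]
            det = -det
        det *= L[c][c]
        for r in range(c + 1, m):
            factor = L[r][c] / L[c][c]
            for k in range(c, m):
                L[r][k] -= factor * L[c][k]
    return int(det)
```

On the complete digraph with 3 vertices this gives 3, matching Cayley's formula $n^{n-2}$ for the trees rooted at a fixed vertex.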

Adapting Dijkstra for Buffers and Unlimited Transfers

from arXiv: Data Structures and Algorithms

Authors: Denys Katkalo, Andrii Rohovyi, Toby Walsh

In recent years, RAPTOR-based algorithms have been considered the state-of-the-art for path-finding with unlimited transfers without preprocessing. However, this status largely stems from the evolution of routing research, where Dijkstra-based solutions were superseded by timetable-based algorithms without a systematic comparison. In this work, we revisit classical Dijkstra-based approaches for public transit routing with unlimited transfers and demonstrate that Time-Dependent Dijkstra (TD-Dijkstra) outperforms MR. However, efficient TD-Dijkstra implementations rely on filtering dominated connections during preprocessing, which assumes passengers can always switch to a faster connection. We show that this filtering is unsound when stops have buffer times, as it cannot distinguish between seated passengers who may continue without waiting and transferring passengers who must respect the buffer. To address this limitation, we introduce Transfer Aware Dijkstra (TAD), a modification that scans entire trip sequences rather than individual edges, correctly handling buffer times while maintaining performance advantages over MR. Our experiments on London and Switzerland networks show that we can achieve a more than two-fold speed-up over MR while producing optimal results on both networks with and without buffer times.
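
The core idea of Time-Dependent Dijkstra is standard Dijkstra where the "weight" of an edge depends on the arrival time at its tail: you catch the first connection departing at or after your arrival. A minimal sketch, under a FIFO timetable assumption and with zero buffer times (so it deliberately does not model the seated-vs-transferring distinction TAD fixes); the `adj` layout is my own assumed interface:

```python
import heapq

def td_dijkstra(adj, source, t0=0):
    """Sketch of Time-Dependent Dijkstra on a timetable graph. adj[u] is a
    list of (v, departures) where `departures` is a time-sorted list of
    (dep_time, arr_time) connections u -> v; a passenger at u at time t
    boards the first connection with dep_time >= t. Returns earliest
    known arrival times. Assumes FIFO: later boarding never arrives earlier."""
    earliest = {source: t0}
    heap = [(t0, source)]
    while heap:
        t, u = heapq.heappop(heap)
        if t > earliest.get(u, float('inf')):
            continue                           # stale heap entry
        for v, departures in adj.get(u, []):
            arr = next((a for d, a in departures if d >= t), None)
            if arr is not None and arr < earliest.get(v, float('inf')):
                earliest[v] = arr
                heapq.heappush(heap, (arr, v))
    return earliest
```

With buffer times at stops, the boarding condition would have to depend on whether the passenger is already seated on the arriving trip, which is exactly the information this edge-at-a-time scan discards.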

Beyond BFS: A Comparative Study of Rooted Spanning Tree Algorithms on GPUs

from arXiv: Data Structures and Algorithms

Authors: Abhijeet Sahu, Srikar Vilas Donur

Rooted spanning trees (RSTs) are a core primitive in parallel graph analytics, underpinning algorithms such as biconnected components and planarity testing. On GPUs, RST construction has traditionally relied on breadth-first search (BFS) due to its simplicity and work efficiency. However, BFS incurs an O(D) step complexity, which severely limits parallelism on high-diameter and power-law graphs. We present a comparative study of alternative RST construction strategies on modern GPUs. We introduce a GPU adaptation of the Path Reversal RST (PR-RST) algorithm, optimizing its pointer-jumping and broadcast operations for modern GPU architecture. In addition, we evaluate an integrated approach that combines a state-of-the-art connectivity framework (GConn) with Eulerian tour-based rooting. Across more than 10 real-world graphs, our results show that the GConn-based approach achieves up to 300x speedup over optimized BFS on high-diameter graphs. These findings indicate that the O(log n) step complexity of connectivity-based methods can outweigh their structural overhead on modern hardware, motivating a rethinking of RST construction in GPU graph analytics.
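
The $O(\log n)$ step complexity mentioned above comes from pointer jumping: in each parallel round every vertex's parent pointer jumps to its grandparent, halving all distances to the root. A sequential simulation of the parallel rounds (each list comprehension below corresponds to one GPU step):

```python
def pointer_jump(parent):
    """Pointer jumping on a rooted forest given as a parent array (roots
    are self-parented). Each round replaces every parent with the
    grandparent; after O(log n) rounds all vertices point at their root.
    Returns the flattened parent array and the number of rounds used."""
    rounds = 0
    while any(parent[parent[v]] != parent[v] for v in range(len(parent))):
        parent = [parent[parent[v]] for v in range(len(parent))]
        rounds += 1
    return parent, rounds
```

A chain of 8 vertices flattens in 3 rounds, versus the 7 sequential steps a BFS-style frontier would take, which is the diameter effect the abstract describes.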

Graph Generation Methods under Partial Information

from arXiv: Data Structures and Algorithms

Authors: Tong Sun, Jianshu Hao, Michael C. Fu, Guangxin Jiang

We study the problem of generating graphs with prescribed degree sequences for bipartite, directed, and undirected networks. We first propose a sequential method for bipartite graph generation and establish a necessary and sufficient interval condition that characterizes the admissible number of connections at each step, thereby guaranteeing global feasibility. Based on this result, we develop bipartite graph enumeration and sampling algorithms suitable for different problem sizes. We then extend these bipartite graph algorithms to the directed and undirected cases by incorporating additional connection constraints, as well as feasibility verification and symmetric connection steps, while preserving the same algorithmic principles. Finally, numerical experiments demonstrate the performance of the proposed algorithms, particularly their scalability to large instances where existing methods become computationally prohibitive.
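
The classical reference point for sequential degree-sequence realization is the Havel-Hakimi construction for undirected graphs, which greedily connects the highest-degree vertex to the next-highest ones; the paper's interval conditions play the analogous feasibility-guaranteeing role for its bipartite and directed settings. A sketch of the classical algorithm (background only, not the paper's method):

```python
def havel_hakimi(degrees):
    """Havel-Hakimi: realize an undirected degree sequence, or detect that
    it is not graphical. Returns an edge list, or None if infeasible."""
    deg = list(degrees)
    nodes = list(range(len(deg)))
    edges = []
    while True:
        nodes.sort(key=lambda v: -deg[v])   # highest residual degree first
        u = nodes[0]
        if deg[u] == 0:
            return edges                    # all degrees satisfied
        if deg[u] > len(nodes) - 1:
            return None                     # more stubs than partners
        for v in nodes[1:deg[u] + 1]:
            if deg[v] == 0:
                return None                 # would need an impossible edge
            deg[v] -= 1
            edges.append((u, v))
        deg[u] = 0
```

For example, `[2, 2, 2]` realizes as a triangle, while `[3, 1]` is rejected.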

Faster Relational Algorithms Using Geometric Data Structures

from arXiv: Data Structures and Algorithms

Authors: Aryan Esmailpour, Stavros Sintos

Optimization tasks over relational data, such as clustering, often suffer from the prohibitive cost of join operations, which are necessary to access the full dataset. While geometric data structures like BBD trees yield fast approximation algorithms in the standard computational setting, their application to relational data remains unclear due to the size of the join output. In this paper, we introduce a framework that leverages geometric insights to design faster algorithms when the data is stored as the results of a join query in a relational database. Our core contribution is the development of the RBBD tree, a randomized variant of the BBD tree tailored for relational settings. Instead of completely constructing the RBBD tree, by leveraging efficient sampling and counting techniques over relational joins, we enable on-the-fly efficient expansion of the RBBD tree, maintaining only the necessary parts. This allows us to simulate geometric query procedures without materializing the join result. As an application, we present algorithms that improve the state-of-the-art for relational $k$-center/means/median clustering by a factor of $k$ in running time while maintaining the same approximation guarantees. Our method is general and can be applied to various optimization problems in the relational setting.

Induced Minors and Coarse Tree Decompositions

from arXiv: Data Structures and Algorithms

Authors: Maria Chudnovsky, Julien Codsi, Ajaykrishnan E S, Daniel Lokshtanov

Let $G$ be a graph, $S \subseteq V(G)$ be a vertex set in $G$ and $r$ be a positive integer. The distance $r$-independence number of $S$ is the size of the largest subset $I \subseteq S$ such that no pair $u$, $v$ of vertices in $I$ have a path on at most $r$ edges between them in $G$. It has been conjectured [Chudnovsky et al., arXiv, 2025] that for every positive integer $t$ there exist positive integers $c$, $d$ such that every graph $G$ that excludes both the complete bipartite graph $K_{t,t}$ and the grid $\boxplus_t$ as an induced minor has a tree decomposition in which every bag has (distance $1$) independence number at most $c(\log n)^d$. We prove a weaker version of this conjecture where every bag of the tree decomposition has distance $16(\log n + 1)$-independence number at most $c(\log n)^d$. On the way we also prove a version of the conjecture where every bag of the decomposition has distance $8$-independence number at most $2^{c (\log n)^{1-(1/d)}}$.

On the PLS-Completeness of $k$-Opt Local Search for the Traveling Salesman Problem

from arXiv: Data Structures and Algorithms

Authors: Sophia Heimann, Hung P. Hoang, Stefan Hougardy

The $k$-Opt algorithm is a local search algorithm for the traveling salesman problem. Starting with an initial tour, it iteratively replaces at most $k$ edges in the tour with the same number of edges to obtain a better tour. Krentel (FOCS 1989) showed that the traveling salesman problem with the $k$-Opt neighborhood is complete for the class PLS (polynomial time local search). However, his proof requires $k \gg 1000$ and has a substantial gap. We provide the first rigorous proof for the PLS-completeness and at the same time drastically lower the value of $k$ to $k \geq 15$, addressing an open question by Monien, Dumrauf, and Tscheuschner (ICALP 2010). Our result holds for both the general and the metric traveling salesman problem.
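
The simplest member of the $k$-Opt family is 2-Opt, which is enough to make the solution concept concrete: repeat improving segment reversals until none exists, and that stopping point is precisely the local optimum whose computation the PLS-completeness result concerns (for $k \geq 15$, not $k = 2$). A minimal sketch:

```python
def two_opt(tour, dist):
    """2-Opt local search for TSP: replace edges (a,b) and (c,d) with
    (a,c) and (b,d) -- i.e., reverse the segment between them -- whenever
    that shortens the tour; stop at a 2-Opt local optimum."""
    n = len(tour)
    improved = True
    while improved:
        improved = False
        for i in range(n - 1):
            # skip j = n-1 when i = 0: those edges share a vertex
            for j in range(i + 2, n - (i == 0)):
                a, b = tour[i], tour[i + 1]
                c, d = tour[j], tour[(j + 1) % n]
                if dist[a][c] + dist[b][d] < dist[a][b] + dist[c][d]:
                    tour[i + 1:j + 1] = reversed(tour[i + 1:j + 1])
                    improved = True
    return tour
```

On four corners of a unit square, it repairs a crossing tour into the optimal perimeter tour of length 4.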

Frequency Moments in Noisy Streaming and Distributed Data under Mismatch Ambiguity

from arXiv: Data Structures and Algorithms

Authors: Kaiwen Liu, Qin Zhang

We propose a novel framework for statistical estimation on noisy datasets. Within this framework, we focus on the frequency moments ($F_p$) problem and demonstrate that it is possible to approximate $F_p$ of the unknown ground-truth dataset using sublinear space in the data stream model and sublinear communication in the coordinator model, provided that the approximation ratio is parameterized by a data-dependent quantity, which we call the $F_p$-mismatch-ambiguity. We also establish a set of lower bounds, which are tight in terms of the input size. Our results yield several interesting insights: (1) In the data stream model, the $F_p$ problem is inherently more difficult in the noisy setting than in the noiseless one. In particular, while $F_2$ can be approximated in logarithmic space in terms of the input size in the noiseless setting, any algorithm for $F_2$ in the noisy setting requires polynomial space. (2) In the coordinator model, in sharp contrast to the noiseless case, achieving polylogarithmic communication in the input size is generally impossible for $F_p$ under noise. However, when the $F_p$ mismatch ambiguity falls below a certain threshold, it becomes possible to achieve communication that is entirely independent of the input size.
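For reference, in the noiseless setting the quantity being approximated is simply $F_p = \sum_i f_i^p$ over the item frequencies $f_i$. A direct linear-space computation, for contrast with the sublinear sketches discussed above (illustrative code, our own):

```python
from collections import Counter

def frequency_moment(stream, p):
    """Exact F_p = sum of f_i^p over the item frequencies f_i.
    Linear space: the baseline that sublinear-space sketches approximate."""
    return sum(f ** p for f in Counter(stream).values())
```

Note that $F_1$ is the stream length and $F_0$ counts distinct items.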

Thursday, March 12

TCS+ talk: Wednesday, March 18 — Chris Gartland, UNC Charlotte

from TCS+ Seminar Series

The next TCS+ talk will take place this coming Wednesday, March 18th at 1:00 PM Eastern Time (10:00 AM Pacific Time, 18:00 Central European Time, 17:00 UTC). Chris Gartland from UNC Charlotte will speak about “L_1-Distortion of EMD over Grids” (abstract below).

You can reserve a spot as an individual or a group to join us live by signing up on the online form. Registration is not required to attend the interactive talk, and the link will be posted on the website the day prior to the talk; however, by registering in the form, you will receive a reminder, along with the link. (The recorded talk will also be posted on our website afterwards.) As usual, for more information about the TCS+ online seminar series and the upcoming talks, or to suggest a possible topic or speaker, please see the website.

Abstract: The Earth Mover Distance (EMD) is a popular metric used in the comparison of probability distributions over a metric space, and low-distortion embeddings of this metric into L_1 is a commonly used approximation tool. We will discuss a general technique of using Sobolev-type inequalities to prove lower bounds for the L_1-distortion of EMD. While the main focus will be on describing the specific Sobolev-type inequality for the planar grid \{1,\dots n\}^2, we will also mention results for the higher dimensional grids \{1,\dots n\}^d, d \geq 3. Based on joint work with Mikhail Ostrovskii, Yuval Rabani, and Robert Young.
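In one dimension the picture is simple: for two distributions on a path with unit spacing, EMD equals the L_1 distance between their prefix sums. A sketch of this d = 1 analogue of the grids in the talk (our own illustrative code):

```python
def emd_line(p, q):
    """Earth Mover Distance between two distributions on {0,...,n-1}
    with unit spacing: the L1 norm of the prefix-sum differences.
    'carry' is the net mass that must cross between positions i and i+1."""
    assert abs(sum(p) - sum(q)) < 1e-9, "distributions must have equal mass"
    emd, carry = 0.0, 0.0
    for pi, qi in zip(p, q):
        carry += pi - qi
        emd += abs(carry)
    return emd
```

Moving one unit of mass across two grid steps costs 2, as expected; the hard questions in the talk concern how well such distances embed into L_1 in dimension two and higher.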

By plustcs

TR26-038 | Hardness Amplification Beyond Boolean Functions | Nobutaka Shimizu, Kenji Yasunaga

from ECCC Papers

A central goal in average-case complexity is to understand how average-case hardness can be amplified to near-optimal hardness. Classical results such as Yao’s XOR lemma establish this principle for Boolean functions, but these techniques typically apply only to artificially constructed functions, rather than to natural computational problems. In this work, we extend hardness amplification beyond the Boolean setting and extend the XOR Lemma to the sum of functions over the finite field $\mathbb{F}_p$, where $p$ is a prime. Specifically, we show that if a function $f \colon \{0,1\}^n \to \mathbb{F}_p$ fails to be computed on at least a $\delta$-fraction of inputs, then the $k$-wise sum $f^{+k}(x_1,\dots,x_k) = f(x_1) + \cdots + f(x_k)$ becomes almost optimally unpredictable: no efficient algorithm can compute it with success probability exceeding $\frac{1 + \varepsilon}{p}$ for suitable parameters $k,\delta,\varepsilon$. Our proof is based on the pseudo-average-min entropy characterization of unpredictability due to Zheng (2014) and Vadhan and Zheng (2012), which we simplify and quantitatively refine to make the dependence of the circuit blow-up on all parameters fully explicit. As an application, we obtain the first error-tolerant random self-reduction for a natural subgraph counting problem. Specifically, we show that any circuit that correctly counts triangles in an Erdős--Rényi random graph with noticeable probability can be transformed into a worst-case circuit with only a quasi-linear overhead. We further extend the query lower bound framework of Shaltiel and Viola (2010) to the $\mathbb{F}_p$-valued setting, proving that any (possibly adaptive) black-box hardness amplification over $\mathbb{F}_p$ must make at least $\Omega(p\log(1/\delta)/\varepsilon^2)$ oracle queries. Our proof substantially simplifies the core \emph{fixed-set lemma} underlying previous analyses, offering a more modular and entropy-based argument.

The Generation-Recognition Asymmetry: Six Dimensions of a Fundamental Divide in Formal Language Theory

from arXiv: Computational Complexity

Authors: Romain Peyrichou

Every formal grammar defines a language and can in principle be used in three ways: to generate strings (production), to recognize them (parsing), or -- given only examples -- to infer the grammar itself (grammar induction). Generation and recognition are extensionally equivalent -- they characterize the same set -- but operationally asymmetric in multiple independent ways. Inference is a qualitatively harder problem: it does not have access to a known grammar. Despite the centrality of this triad to compiler design, natural language processing, and formal language theory, no survey has treated it as a unified, multidimensional phenomenon. We identify six dimensions along which generation and recognition diverge: computational complexity, ambiguity, directionality, information availability, grammar inference, and temporality. We show that the common characterization "generation is easy, parsing is hard" is misleading: unconstrained generation is trivial, but generation under constraints can be NP-hard. The real asymmetry is that parsing is always constrained (the input is given) while generation need not be. Two of these dimensions -- directionality and temporality -- have not previously been identified as dimensions of the generation-recognition asymmetry. We connect the temporal dimension to the surprisal framework of Hale (2001) and Levy (2008), arguing that surprisal formalizes the temporal asymmetry between a generator (surprisal = 0) and a parser that predicts under uncertainty (surprisal > 0). We review bidirectional systems in NLP and observe that bidirectionality has been available for fifty years yet has not transferred to most domain-specific applications. We conclude with a discussion of large language models, which architecturally unify generation and recognition while operationally preserving the asymmetry.

Punctually Standard and Nonstandard Models of Natural Numbers

from arXiv: Computational Complexity

Authors: Nikolay Bazhenov, Ivan Georgiev, Dariusz Kalociński, Stefan Vatev, Michał Wrocławski

Abstract models of computation often treat the successor function $S$ on $\mathbb{N}$ as a primitive operation, even though its low-level implementations correspond to non-trivial programs operating on specific numerical representations. This behaviour can be analyzed without referring to notations by replacing the standard interpretation $(\mathbb{N}, S)$ with an isomorphic copy ${\mathcal A} = (\mathbb{N}, S^{\mathcal A})$, in which $S^{\mathcal A}$ is no longer computable by a single instruction. While the class of computable functions on $\mathcal{A}$ is standard if $S^{\mathcal{A}}$ is computable, existing results indicate that this invariance fails at the level of primitive recursion. We investigate which sets of operations have the property that if they are primitive recursive on $\mathcal A$ then the class of primitive recursive functions on $\mathcal A$ remains standard. We call such sets of operations \emph{bases for punctual standardness}. We exhibit a series of non-basis results which show how the induced class of primitive recursive functions on $\mathcal A$ can deviate substantially from the standard one. In particular, we demonstrate that a wide range of natural operations, including large subclasses of primitive recursive functions studied by Skolem and Levitz, fail to form such bases. On the positive side, we exhibit natural finite bases for punctual standardness. Our results answer a question recently posed by Grabmayr and establish punctual categoricity for certain natural finitely generated structures.

Large chirotopes with computable numbers of triangulations

from arXiv: Computational Geometry

Authors: Mathilde Bouvel, Valentin Féray, Xavier Goaoc, Florent Koechlin

Chirotopes are a common combinatorial abstraction of (planar) point sets. In this paper we investigate decomposition methods for chirotopes, and their application to the problem of counting the number of triangulations supported by a given planar point set. In particular, we generalize the convex and concave sums operations defined by Rutschmann and Wettstein for a particular family of chirotopes (which they call chains), and obtain a precise asymptotic estimate for the number of triangulations of the double circle, using a functional equation and the kernel method.

Sublinear-Time Reconfiguration of Programmable Matter with Joint Movements

from arXiv: Data Structures and Algorithms

Authors: Manish Kumar, Othon Michail, Andreas Padalkin, Christian Scheideler

We study centralized reconfiguration problems for geometric amoebot structures. A set of $n$ amoebots occupy nodes on the triangular grid and can reconfigure via expansion and contraction operations. We focus on the joint movement extension, where amoebots may expand and contract in parallel, enabling coordinated motion of larger substructures. Prior work introduced this extension and analyzed reconfiguration under additional assumptions such as metamodules. In contrast, we investigate the intrinsic dynamics of reconfiguration without such assumptions by restricting attention to centralized algorithms, leaving distributed solutions for future work. We study the reconfiguration problem between two classes of amoebot structures $A$ and $B$: For every structure $S\in A$, the goal is to compute a schedule that reconfigures $S$ into some structure $S'\in B$. Our focus is on sublinear-time algorithms. We affirmatively answer the open problem by Padalkin et al. (Auton. Robots, 2025) whether a within-the-model sublinear-time universal reconfiguration algorithm is possible, by proving that any structure can be reconfigured into a canonical line-segment structure in $O(\sqrt{n}\log n)$ rounds. Additionally, we give a constant-time algorithm for reconfiguring any spiral structure into a line segment. These results are enabled by new constant-time primitives that facilitate efficient parallel movement. Our findings demonstrate that the joint movement model supports sublinear reconfiguration without auxiliary assumptions. A central open question is whether universal reconfiguration within this model can be achieved in polylogarithmic or even constant time.

Instruction set for the representation of graphs

from arXiv: Data Structures and Algorithms

Authors: Ezequiel Lopez-Rubio, Mario Pascual-Gonzalez

We present IsalGraph, a method for representing the structure of any finite, simple graph as a compact string over a nine-character instruction alphabet. The encoding is executed by a small virtual machine comprising a sparse graph, a circular doubly-linked list (CDLL) of graph-node references, and two traversal pointers. Instructions either move a pointer through the CDLL or insert a node or edge into the graph. A key design property is that every string over the alphabet decodes to a valid graph, with no invalid states reachable. A greedy \emph{GraphToString} algorithm encodes any connected graph into a string in time polynomial in the number of nodes; an exhaustive-backtracking variant produces a canonical string by selecting the lexicographically smallest shortest string across all starting nodes and all valid traversal orders. We evaluate the representation on five real-world graph benchmark datasets (IAM Letter LOW/MED/HIGH, LINUX, and AIDS) and show that the Levenshtein distance between IsalGraph strings correlates strongly with graph edit distance (GED). Together, these properties make IsalGraph strings a compact, isomorphism-invariant, and language-model-compatible sequential encoding of graph structure, with direct applications in graph similarity search, graph generation, and graph-conditioned language modelling.
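The string metric used in that comparison is the standard Levenshtein (edit) distance. A compact dynamic-programming sketch, written by us for illustration:

```python
def levenshtein(s, t):
    """Minimum number of single-character insertions, deletions, and
    substitutions turning s into t; classic O(|s|*|t|) dynamic program
    keeping only one previous row."""
    prev = list(range(len(t) + 1))
    for i, cs in enumerate(s, 1):
        cur = [i]
        for j, ct in enumerate(t, 1):
            cur.append(min(prev[j] + 1,                  # delete cs
                           cur[j - 1] + 1,               # insert ct
                           prev[j - 1] + (cs != ct)))    # substitute
        prev = cur
    return prev[-1]
```

The paper's claim is that this cheap string distance on encodings tracks the (NP-hard) graph edit distance on the underlying graphs.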

Separating Oblivious and Adaptive Differential Privacy under Continual Observation

from arXiv: Data Structures and Algorithms

Authors: Mark Bun, Marco Gaboardi, Connor Wagaman

We resolve an open question of Jain, Raskhodnikova, Sivakumar, and Smith (ICML 2023) by exhibiting a problem separating differential privacy under continual observation in the oblivious and adaptive settings. The continual observation (a.k.a. continual release) model formalizes privacy for streaming algorithms, where data is received over time and output is released at each time step. In the oblivious setting, privacy need only hold for data streams fixed in advance; in the adaptive setting, privacy is required even for streams that can be chosen adaptively based on the streaming algorithm's output. We describe the first explicit separation between the oblivious and adaptive settings. The problem showing this separation is based on the correlated vector queries problem of Bun, Steinke, and Ullman (SODA 2017). Specifically, we present an $(\varepsilon,0)$-DP algorithm for the oblivious setting that remains accurate for exponentially many time steps in the dimension of the input. On the other hand, we show that every $(\varepsilon,\delta)$-DP adaptive algorithm fails to be accurate after releasing output for only a constant number of time steps.

Simple minimally unsatisfiable subsets of 2-CNFs

from arXiv: Data Structures and Algorithms

Authors: Oliver Kullmann, Edward Clewer

We present a study of minimal unsatisfiable subsets (MUSs) of 2-CNF Boolean formulas, building on the Abbasizanjani-Kullmann classification of minimally unsatisfiable 2-CNFs (2-MUs). We start by giving a linear-time procedure for recognising 2-MUs. Then we study the problem of finding one simple MUS. On the one hand we extend the results by Kleine Büning et al., who showed NP-completeness of deciding whether a deficiency-1 MUS exists. On the other hand we show that deciding/finding an MUS containing one or two unit-clauses (which are special deficiency-1 MUSs) can be done in polynomial time. Finally we present an incremental polynomial time algorithm for some special type of MUSs, namely those MUSs containing at least one unit-clause. We conclude by discussing the main open problem: developing a deeper understanding of the landscape of easy/hard MUSs of 2-CNFs.
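The central notion can be checked directly on tiny formulas: a clause set is minimally unsatisfiable if it is unsatisfiable while dropping any single clause makes it satisfiable. A brute-force sketch (our own illustrative code, not the paper's linear-time procedure; clauses are tuples of signed variable indices):

```python
from itertools import product

def satisfiable(clauses, nvars):
    """Try all 2**nvars assignments; literal v > 0 means variable v is true."""
    return any(all(any((lit > 0) == assignment[abs(lit) - 1] for lit in c)
                   for c in clauses)
               for assignment in product([False, True], repeat=nvars))

def is_mu(clauses, nvars):
    """Minimally unsatisfiable: unsatisfiable, but every proper subset
    obtained by dropping one clause is satisfiable."""
    return (not satisfiable(clauses, nvars) and
            all(satisfiable(clauses[:i] + clauses[i + 1:], nvars)
                for i in range(len(clauses))))
```

For example, the four 2-clauses over two variables covering all sign patterns form a 2-MU, while adding any redundant clause destroys minimality.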

Huffman-Bucket Sketch: A Simple $O(m)$ Algorithm for Cardinality Estimation

from arXiv: Data Structures and Algorithms

Authors: Matti Karppa

We introduce the Huffman-Bucket Sketch (HBS), a simple, mergeable data structure that losslessly compresses a HyperLogLog (HLL) sketch with $m$ registers to optimal space $O(m+\log n)$ bits, with amortized constant-time updates, acting as a drop-in replacement for HLL that retains mergeability and substantially reduces memory requirements. We partition registers into small buckets and encode their values with a global Huffman codebook derived from the strongly concentrated HLL rank distribution, using the current cardinality estimate for determining the mode of the distribution. We prove that the Huffman tree needs rebuilding only $O(\log n)$ times over a stream, roughly when cardinality doubles. The framework can be extended to other sketches with similar strongly concentrated distributions. We provide preliminary numerical evidence that suggests that HBS is practical and can potentially be competitive with state-of-the-art in practice.
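For context, a HyperLogLog register stores the maximum, over items hashed to it, of the "rank" of the hash value (position of the first 1-bit); it is this heavily concentrated rank distribution that HBS compresses. A rough sketch of a single update (hash choice and names are our own assumptions, not the paper's implementation):

```python
import hashlib

def hll_update(registers, item, b):
    """One HyperLogLog update with m = 2**b registers: the top b bits of a
    64-bit hash select a register, which keeps the maximum rank (number of
    leading zeros plus one) of the remaining 64 - b bits."""
    h = int.from_bytes(hashlib.sha256(item.encode()).digest()[:8], "big")
    idx = h >> (64 - b)                       # register index from top b bits
    rest = h & ((1 << (64 - b)) - 1)          # remaining 64 - b bits
    rank = (64 - b) - rest.bit_length() + 1   # leading zeros in rest, plus 1
    registers[idx] = max(registers[idx], rank)
    return idx, rank
```

Since ranks are geometrically distributed, almost all registers sit within a few values of the mode, which is what makes a global Huffman codebook over small buckets effective.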

Sample-and-Search: An Effective Algorithm for Learning-Augmented k-Median Clustering in High dimensions

from arXiv: Data Structures and Algorithms

Authors: Kangke Cheng, Shihong Song, Guanlin Mo, Hu Ding

In this paper, we investigate the learning-augmented $k$-median clustering problem, which aims to improve the performance of traditional clustering algorithms by preprocessing the point set with a predictor of error rate $\alpha \in [0,1)$. This preprocessing step assigns potential labels to the points before clustering. We introduce an algorithm for this problem based on a simple yet effective sampling method, which substantially improves upon the time complexities of existing algorithms. Moreover, we mitigate their exponential dependency on the dimensionality of the Euclidean space. Lastly, we conduct experiments to compare our method with several state-of-the-art learning-augmented $k$-median clustering methods. The experimental results suggest that our proposed approach can significantly reduce the computational complexity in practice, while achieving a lower clustering cost.

Polynomial-size encoding of all cuts of small value in integer-valued symmetric submodular functions

from arXiv: Data Structures and Algorithms

Authors: Sang-il Oum, Marek Sokołowski

We study connectivity functions, that is, integer-valued symmetric submodular functions on a finite ground set attaining $0$ on the empty set. For a connectivity function $f$ on an $n$-element set $V$ and an integer $k\ge 0$, we show that the family of all sets $X\subseteq V$ with $f(X)=k$ admits a polynomial-size representation: it can be described by a list of at most $O(n^{4k})$ items, each consisting of a set to be included, another set to be excluded, and a partition of remaining elements, such that the union of some members of the partition and the set to be included are precisely all sets $X$ with $f(X)=k$. We also give an algorithm that constructs this representation in time $O(n^{2k+7}\gamma+n^{2k+8}+n^{4k+2})$, where $\gamma$ is the oracle time to evaluate $f$. This generalizes the low rank structure theorem of Bojańczyk, Pilipczuk, Przybyszewski, Sokołowski, and Stamoulis [Low rank MSO, arXiv, 2025] on cut-rank functions on graphs to general connectivity functions. As an application, for fixed $k$, we obtain a polynomial-time algorithm for finding a set $A$ with $f(A)=k$ and a prescribed cardinality constraint on $A$.
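For concreteness, the cut function of a graph (the number of edges leaving $X$) is the canonical connectivity function: integer-valued, symmetric, submodular, and $0$ on the empty set. A brute-force check of these three axioms on the 4-cycle, plus an enumeration of the level set $\{X : f(X)=2\}$ that the paper's representation would encode:

```python
from itertools import combinations

def cut(edges, X):
    """Number of edges with exactly one endpoint in X."""
    return sum(1 for u, v in edges if (u in X) != (v in X))

V = frozenset({0, 1, 2, 3})
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]  # the 4-cycle C4
subsets = [frozenset(c) for r in range(5) for c in combinations(sorted(V), r)]

assert cut(edges, frozenset()) == 0                              # f(empty) = 0
assert all(cut(edges, X) == cut(edges, V - X) for X in subsets)  # symmetric
assert all(cut(edges, A) + cut(edges, B)                         # submodular
           >= cut(edges, A | B) + cut(edges, A & B)
           for A in subsets for B in subsets)

# Level set f(X) = 2: the four singletons, four adjacent pairs,
# and the four complements of singletons.
level2 = [X for X in subsets if cut(edges, X) == 2]
print(len(level2))
```

Here the level set has 12 members out of 16 subsets; the point of the paper is that even when such families are exponentially large, an $O(n^{4k})$-item description always exists.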

Intermittent Cauchy walks enable optimal 3D search across target shapes and sizes

from arXiv: Data Structures and Algorithms

Authors: Matteo Stromieri, Emanuele Natale, Amos Korman

Target shape, not just size, plays a pivotal role in determining detectability during random search. We analyze intermittent Lévy walks in three dimensions, and mathematically prove that the widely observed Cauchy strategy (Lévy exponent $\mu = 2$) uniquely achieves scale-invariant, near-optimal detection across a broad spectrum of target sizes and shapes. In a domain of volume $n$ with boundary conditions, expected detection time for a convex target of surface area $\Delta$ optimally scales as $n/\Delta$. Conversely, Lévy strategies with $\mu < 2$ are slow at detecting targets with large surface area-to-volume ratios, while those with $\mu > 2$ excel at finding large elongated shapes but degrade as targets become wider. Our results further indicate a continuous geometric transition: volume dictates detection near $\mu = 1$, ceding dominance to surface area as $\mu \to 2$, after which surface area and elongation couple to govern detection. Ultimately, 3D search introduces a pronounced sensitivity to target shape that is absent in lower dimensions. Our work provides a rigorous foundation for the Lévy flight foraging hypothesis in 3D by establishing the scale-invariant optimality of the Cauchy walk. Furthermore, our results reveal dimensionality-driven shape vulnerabilities and offer testable predictions for biological and engineered systems.
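As a minimal numerical illustration (not the paper's intermittent-search model), step lengths of a Lévy walk with exponent $\mu$ can be drawn by inverse-CDF sampling from $p(\ell)\propto \ell^{-\mu}$ on $\ell \ge 1$; $\mu = 2$ is the Cauchy regime, and smaller $\mu$ gives heavier tails:

```python
import random

def levy_steps(mu, n, seed=0):
    """n step lengths with density p(l) proportional to l^(-mu), l >= 1,
    via inverse-CDF sampling: l = u^(-1/(mu - 1)), u uniform in (0, 1]."""
    rng = random.Random(seed)
    return [(1.0 - rng.random()) ** (-1.0 / (mu - 1)) for _ in range(n)]

# Tail weight P(l > 100) = 100^(1 - mu): 0.1 for mu = 1.5, 0.01 for the
# Cauchy case mu = 2, and 0.0001 for mu = 3.
long_frac = {mu: sum(s > 100 for s in levy_steps(mu, 100_000)) / 100_000
             for mu in (1.5, 2.0, 3.0)}
print(long_frac)
```

The empirical tail fractions match the closed-form $100^{1-\mu}$, showing how the exponent interpolates between ballistic-like long relocations and diffusive short steps, the trade-off the paper shows is balanced uniquely at $\mu = 2$.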

Density-Dependent Graph Orientation and Coloring in Scalable MPC

from arXiv: Data Structures and Algorithms

Authors: Mohsen Ghaffari, Christoph Grunau

This paper presents massively parallel computation (MPC) algorithms in the strongly sublinear memory regime (aka, scalable MPC) for orienting and coloring graphs as a function of their subgraph density. Our algorithms run in $\mathrm{poly}(\log\log n)$ rounds and compute an orientation of the edges with maximum outdegree $O(\alpha \log\log n)$ as well as a coloring of the vertices with $O(\alpha \log\log n)$ colors. Here, $\alpha$ denotes the density of the densest subgraph. Our algorithms' round complexity is notable because it breaks the $\tilde{\Theta}(\sqrt{\log n})$ barrier, which applies to the previously best known density-dependent orientation algorithm [Ghaffari, Lattanzi, and Mitrovic ICML'19] and is common to many other scalable MPC algorithms.
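The MPC machinery is beyond a short snippet, but the classical sequential idea behind density-dependent orientation (a sketch, not the paper's algorithm) is degeneracy peeling: repeatedly remove a minimum-degree vertex and orient its remaining edges away from it, so every outdegree is at most the degeneracy, which is at most $2\alpha$:

```python
import heapq
from collections import Counter

def peel_orientation(n, edges):
    """Orient edges by repeatedly removing a minimum-degree vertex and
    directing its remaining edges outward. Each outdegree is then at most
    the degeneracy, which never exceeds twice the max subgraph density."""
    adj = {v: set() for v in range(n)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    heap = [(len(adj[v]), v) for v in range(n)]
    heapq.heapify(heap)
    removed, orient = set(), []
    while heap:
        d, v = heapq.heappop(heap)
        if v in removed or d != len(adj[v]):
            continue  # stale heap entry from an earlier degree
        removed.add(v)
        for u in adj[v]:
            orient.append((v, u))  # edge leaves v
            adj[u].discard(v)
            heapq.heappush(heap, (len(adj[u]), u))
        adj[v].clear()
    return orient

# K4 has degeneracy 3, so no vertex gets outdegree above 3.
orient = peel_orientation(4, [(a, b) for a in range(4) for b in range(a + 1, 4)])
outdeg = Counter(u for u, _ in orient)
```

This peeling is inherently sequential, with up to $n$ dependent rounds; the point of the paper is achieving a comparable outdegree guarantee in $\mathrm{poly}(\log\log n)$ MPC rounds.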

Reconstructing Bounded Treelength Graphs with Linearithmic Shortest Path Distance Queries

from arXiv: Data Structures and Algorithms

Authors: Chirag Kaudan, Amir Nayyeri

We consider the following graph reconstruction problem: given an unweighted connected graph $G = (V,E)$ with visible vertex set $V$ and an oracle which takes two vertices $u,v \in V$ and returns the shortest path distance between $u$ and $v$, how many queries are needed to reconstruct $E$? Specifically, we consider bounded degree $\Delta$ and bounded treelength $\mathrm{tl}$ connected graphs and show that reconstruction can be done in $O_{\Delta,\mathrm{tl}}(n \log n)$ queries with a deterministic algorithm. This result improves over the best known algorithm (deterministic or randomized) for this graph class by a $\log n$ factor and matches the known lower bound for the class of graphs with bounded chordality, which is a subclass of bounded treelength graphs.
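For contrast with the $O_{\Delta,\mathrm{tl}}(n \log n)$ bound, the trivial baseline spends $\binom{n}{2}$ queries: in an unweighted graph the edges are exactly the pairs at distance $1$. A sketch with an illustrative oracle for a path graph (the oracle and graph are example choices, not from the paper):

```python
def reconstruct_naive(vertices, dist):
    """Recover the edge set of an unweighted connected graph with
    n(n-1)/2 distance queries: {u, v} is an edge iff dist(u, v) == 1."""
    edges = set()
    for i, u in enumerate(vertices):
        for v in vertices[i + 1:]:
            if dist(u, v) == 1:
                edges.add(frozenset((u, v)))
    return edges

# Distance oracle for the path 0 - 1 - 2 - 3 - 4 (a graph of bounded
# degree and bounded treelength).
path_dist = lambda u, v: abs(u - v)
edges = reconstruct_naive(list(range(5)), path_dist)
```

The whole game in this line of work is to exploit structure (here, bounded degree and treelength) to avoid querying most of those $\Theta(n^2)$ pairs.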

Transposition is Nearly Optimal for IID List Update

from arXiv: Data Structures and Algorithms

Authors: Christian Coester

The list update problem is one of the oldest and simplest problems in online algorithms: A set of items must be maintained in a list while requests to these items arrive over time. Whenever an item is requested, the algorithm pays a cost equal to the position of the item in the list. In the i.i.d. model, where requests are drawn independently from a fixed distribution, the static ordering by decreasing access probabilities $p_1\ge p_2\ge \dots \ge p_n$ achieves the minimal expected access cost OPT$=\sum_{i=1}^n ip_i$. However, $p$ is typically unknown, and approximating it by tracking access frequencies creates undesirable overheads. We prove that the Transposition rule (swap the requested item with its predecessor) has expected access cost at most OPT$+1$ in its stationary distribution. This confirms a 50-year-old conjecture by Rivest up to an unavoidable additive constant. More abstractly, it yields a purely memoryless procedure to approximately sort probabilities via sampling. Our proof is based on a decomposition of excess cost, and its technical core is a "sign-eliminating" combinatorial injection to witness nonnegativity of a constrained multivariate polynomial.
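A quick simulation of the Transposition rule against OPT (the request distribution, horizon, and burn-in below are arbitrary illustrative choices; the theorem concerns the stationary expectation, which it bounds by OPT $+\,1$):

```python
import random

def transposition_avg_cost(p, steps, burn_in, seed=0):
    """Run the Transposition rule on i.i.d. requests from p; after a
    burn-in, return the average access cost (1-indexed list position)."""
    rng = random.Random(seed)
    lst = list(range(len(p)))
    total = 0
    for t in range(burn_in + steps):
        item = rng.choices(range(len(p)), weights=p)[0]
        i = lst.index(item)
        if t >= burn_in:
            total += i + 1
        if i > 0:  # swap the requested item with its predecessor
            lst[i - 1], lst[i] = lst[i], lst[i - 1]
    return total / steps

p = [0.5, 0.25, 0.15, 0.1]                       # already sorted descending,
opt = sum((i + 1) * q for i, q in enumerate(p))  # so OPT = sum i * p_i = 1.85
avg = transposition_avg_cost(p, steps=50_000, burn_in=5_000)
print(f"OPT = {opt:.2f}, Transposition average cost = {avg:.2f}")
```

On runs like this the simulated average sits between OPT and OPT $+\,1$, consistent with the stationary bound, without the algorithm ever tracking frequencies.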

Wednesday, March 11

PhD position at Uppsala University (apply by April 7, 2026)

from CCI: jobs

Fully funded PhD position at the Department of Information Technology, Uppsala University on the topic of scalable quantum program verification. The project is at the intersection of formal verification, programming languages and quantum computing and aims to develop mathematically grounded methods for reasoning about hybrid quantum-classical programs.

Website: https://uu.varbi.com/en/what:job/jobID:907722/
Email: ramanathan.s.thinniyam@it.uu.se


By shacharlovett