Theory of Computing Report

Friday, February 27

Anthropic: Stay strong!

from Scott Aaronson

I don’t have time to write a full post right now, but hopefully this is self-explanatory.

Regardless of their broader views on the AI industry, the eventual risks from AI, or American politics, right now every person of conscience needs to stand behind Anthropic, as they stand up for their right to [checks notes] not be effectively nationalized by the Trump administration and forced to build murderbots and to help surveil American citizens. No, I wouldn’t have believed this either in a science-fiction movie, but it’s now just the straightforward reality of our world, years ahead of schedule. In particular, I call on all other AI companies, in the strongest possible terms, to do the right thing and stand behind Anthropic, in this make-or-break moment for the AI industry and the entire world.

By Scott

Defining Normal to See Abnormal

from Ben Recht

The data scientific techniques used to find dietary allowances.

This is Part 4 of a blogged essay “Steampunk Data Science.” A table of contents is here.

Lafayette Mendel, Elmer McCollum’s post-doc mentor at Yale, was intrigued by his mentee’s foray into the chemistry of nutrition. Even if the results were puzzling, Mendel saw great potential in the controlled synthetic food experiments in rodents. Mendel recruited Thomas Osborne and Edna Ferry to synthesize dozens of diets for rats and compile their results in a two-volume monograph. They focused on providing the bare backbone macronutrients of carbohydrates, fats, and proteins, but purified the food to be as basic as possible.

One of the Yale team’s key insights was comparing the relative value of diets by tracking rat growth against a baseline of “normal” growth. They drew on the rat studies of Henry Donaldson, Elizabeth Dunn, and John Watson, who used rats to understand the development of the nervous system. Donaldson et al. had weighed dozens of rats at different developmental stages and recorded the average and extreme weights of their samples. This is the average weight of Donaldson, Dunn, and Watson’s rats over time:

This normal growth curve eliminated the need for control groups when studying rat diets: a good diet would hit the curve, a bad diet would miss it. Osborne, Mendel, and Ferry first used it to disprove McCollum’s palatability hypothesis. They found a diet that the rats readily devoured but that could not sustain normal growth. It consisted of hordein (a type of gluten), starch, agar, lard, and “protein-free milk.” Delicious. The milk was boiled and processed to remove fats and proteins, leaving behind milk sugars and other substances that were unknown at the time.

Even though the rats ate their fill of this mixture, their growth was terribly stunted.

The Yale team plotted the growth of individual rats, rather than the averages of multiple rats. With such individualized studies, they could also demonstrate the effectiveness of dietary changes. Here is another prototypical paper from their research monograph.

The rat was fed multiple diets over its life. The experiment begins with a mixed food diet and then switches to a purified food. Soon after the dietary switch, the rat’s growth reverses, and it rapidly loses weight. Clearly, there is something deficient with the purified food. This sort of single-subject temporal experimentation let them home in on which of the many food substances we eat are necessary for survival. What would we call this today? A “within-subject crossover design?” What’s the p-value? Whatever the case, other chemists and biologists were convinced there was something important going on in these studies, and the work won Osborne and Mendel a Science paper. They would be credited with the discovery of essential amino acids.

F. Gowland Hopkins, a nutrition researcher at Cambridge who had discovered the amino acid tryptophan, happened across their work in those illustrious pages and realized that he had already conducted similar experiments years earlier. Hopkins hadn’t written up his rat diet findings because he suffered an illness that had taken him out of the lab. Likely, he also hadn’t realized the significance of his findings until seeing Osborne and Mendel’s write-up. Wanting to join in on the party, Hopkins rushed to write up his earlier experiments and sent them to The Journal of Physiology. Figure 2 in that paper arguably won him the Nobel Prize.

In this experiment, there were two groups, each with eight rats. Each dot in the plot is the average weight of the group on a particular day (the x-axis is days, the y-axis is grams). The white dots correspond to rats fed only bread. The black dots correspond to rats fed bread and a tiny amount of milk. The rats with the milk supplement grew dramatically faster than those fed bread alone. Just like the Yale team, Hopkins had performed a crossover study. At day 18, he swapped the diets of the two groups. Subsequently, the formerly milk-fed rats stopped growing on their new milk-free diet. In sharp contrast, the rats previously fed bread alone started to grow once Hopkins gave them some milk.

Hopkins’ paper includes a detailed appendix that shares much of the raw data plotted in the paper’s figures. For the plot above, he does not share the complete growth series for each rat, but he does list the rat weights at days 0, 18, and 50. Using the weights at days 0 and 18, I can create a more modern display of the data, revealing the variation within the groups rather than just representing the averages. In Figure 8, I plot the percentage growth from day zero for each rat, splitting them into the groups fed milk and those not fed milk.

Each dot here corresponds to the percentage growth of an individual rat. The shaded region corresponds to the middle half of the data. The line in the shaded region is the median of the data. The lines outside the box denote the extremes of the data. Without milk, all of the rats grew less than 20% over the 18 days. With milk, every rat grew more than 60%. Scientists didn’t need a hypothesis test to see that something incredible was in the milk.
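
For readers who want to recreate this kind of display, here is a minimal Python sketch. The weights below are made-up placeholders, not Hopkins’ actual appendix values; only the structure (two groups of eight rats, weights at days 0 and 18, a box marking the middle half of the data with the median inside and whiskers at the extremes) follows the description above.

```python
import matplotlib.pyplot as plt

# Hypothetical day-0 and day-18 weights in grams, two groups of eight rats.
# Placeholder values for illustration -- not Hopkins' published data.
bread_day0, bread_day18 = [29, 28, 31, 30, 27, 29, 30, 28], [32, 30, 35, 33, 29, 32, 34, 30]
milk_day0, milk_day18 = [28, 30, 29, 31, 27, 30, 28, 32], [46, 50, 48, 53, 44, 51, 46, 55]

def pct_growth(day0, day18):
    # Percentage growth from day 0 for each individual rat.
    return [100.0 * (w18 - w0) / w0 for w0, w18 in zip(day0, day18)]

groups = [pct_growth(bread_day0, bread_day18), pct_growth(milk_day0, milk_day18)]

# Box = middle half of the data, line = median, whiskers = extremes.
plt.boxplot(groups, whis=(0, 100))
for i, g in enumerate(groups, start=1):
    plt.plot([i] * len(g), g, "ko", markersize=3)  # one dot per rat
plt.xticks([1, 2], ["bread only", "bread + milk"])
plt.ylabel("growth from day 0 (%)")
plt.show()
```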

Meanwhile, back in Madison, Elmer McCollum was stressed out. Not only was he confused by his early experiments, but he was embarrassed by the refutation of his work by Mendel’s team. He was distracted by the oppressive obligations of faculty life, found teaching an unwelcome burden, and resented being tasked with working on the cow experiment. Ah, the professor life. On top of all this, he would admit later that he was not naturally gifted at caring for his rats. The combination left his rat studies far less productive than he had hoped.

The missing catalyst arrived in 1909. Marguerite Davis had recently moved back to Madison to care for her father after the death of her mother. Davis had completed her bachelor’s degree at Berkeley and hoped to continue some form of graduate study while also looking after her dad. She told McCollum she’d be willing to do freelance research with him so she could learn biochemistry. McCollum agreed, and Davis began to learn the ways of the lab. Davis quickly noted that McCollum, though proficient at catching rats, was terrible at keeping them. His colony was struggling and in general disarray. She offered to care for them, and McCollum was overjoyed to pass off the rat-keeping responsibility. Davis not only enjoyed the work but also had a talent for running the experiments. Within a year, McCollum and Davis would publish revolutionary findings in the biochemical foundations of nutrition.

McCollum and Davis tried to figure out what exactly was missing from the protein-free milk of the Yale experiments. They devised a procedure to isolate the critical component by centrifuging egg yolks in an ether bath to separate a key fat-soluble material. They found that this isolated substance was itself sufficient to trigger rat growth.

Here’s a typical example of their experimental results, which they presented in the same style as Mendel and collaborators.

For the first period, when the rat was young, McCollum and Davis fed the rat salt, protein, fats, and a little bit of carbohydrates. For the second period, they removed the fat. In this second period, the rat’s weight eventually plateaus. Then, in period three, they added the fat-soluble compounds from the egg yolk. Whatever this stuff was, it sufficed to reignite the rat’s growth.

They found this pattern remarkably repeatable. On the simple synthetic diets of only pure protein (casein), sugar (lactose), carbohydrates (starch and dextrine), and fats (lard), the rats displayed striking symptoms of malnourishment. They would languish and develop crusty buildup in their eyes. They wouldn’t mate. Adding the ether extract of egg yolk consistently cured the poor growth in these rats and fully restored them to normal.

McCollum and Davis considered five case studies of rats and their associated growth curves sufficient evidence of a major discovery. They wrote:

We have seen this prompt resumption of growth after a period of suspension result from the introduction of ether extract of butter or of egg in about thirty animals and are convinced that these extracts contain some organic complex without which the animals cannot make further increase in body weight, but may maintain themselves in a fairly good nutritive state for a prolonged period.

This fat-soluble compound turned out to be Vitamin A.

By Ben Recht

TR26-031 | On the Need for (Quantum) Memory with Short Outputs | Zihan Hao, Zikuan Huang, Qipeng Liu

from ECCC Papers

In this work, we establish the first separation between computation with bounded and unbounded space for problems with short outputs (i.e., the working memory can be exponentially larger than the output size), in both the classical and the quantum setting. Towards that, we introduce a problem called nested collision finding and show that its optimal query complexity cannot be achieved without exponential memory. Our result is based on a novel "two-oracle recording" technique, where one oracle "records" the computation's long outputs under the other oracle, effectively reducing the time-space trade-off for short-output problems to that of long-output problems. We believe this technique will be of independent interest for establishing time-space tradeoffs in other short-output settings.

Faster algorithms for graph homomorphism via tractable constraint satisfaction

from arXiv: Computational Complexity

Authors: Clément Carbonnel

We show that the existence of a homomorphism from an $n$-vertex graph $G$ to an $h$-vertex graph $H$ can be decided in time $2^{O(n)}h^{O(1)}$ and polynomial space if $H$ comes from a family of graphs that excludes a topological minor. The algorithm is based on a reduction to a single-exponential number of constraint satisfaction problems over tractable languages and can handle cost minimization. We also present an improved randomized algorithm for the special case where the graph $H$ is an odd cycle.
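
To make the problem statement concrete, here is a naive brute-force homomorphism test in Python. This is only the trivial $h^n$-time baseline for intuition; it is not the paper's algorithm, which reduces the problem to tractable constraint satisfaction to reach $2^{O(n)}h^{O(1)}$ time.

```python
from itertools import product

def has_homomorphism(G_edges, n, H_edges, h):
    """Brute-force test for a homomorphism from G (vertices 0..n-1) to H
    (vertices 0..h-1): a map f with f(u)f(v) an edge of H for every edge uv
    of G. Runs in O(h^n * |E(G)|) time -- toy instances only."""
    H_adj = set()
    for u, v in H_edges:
        H_adj.add((u, v))
        H_adj.add((v, u))
    return any(all((f[u], f[v]) in H_adj for u, v in G_edges)
               for f in product(range(h), repeat=n))

# A 5-cycle maps homomorphically to a triangle but not to a single edge,
# since a homomorphism to K2 is exactly a proper 2-coloring.
C5 = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
print(has_homomorphism(C5, 5, [(0, 1), (1, 2), (2, 0)], 3))  # True
print(has_homomorphism(C5, 5, [(0, 1)], 2))                  # False
```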

Results on three problems on isolation of graphs

from arXiv: Computational Complexity

Authors: Peter Borg, Yair Caro

The graph isolation problem was introduced by Caro and Hansberg in 2015. It is a vast generalization of the classical graph domination problem and its study is expanding rapidly. In this paper, we address a number of questions that arise naturally. Let $F$ be a graph. We show that the $F$-isolating set problem is NP-complete if $F$ is connected. We investigate how the $F$-isolation number $\iota(G,F)$ of a graph $G$ is affected by the minimum degree $d$ of $G$, establishing a bounded range, in terms of $d$ and the orders of $F$ and $G$, for the largest possible value of $\iota(G,F)$ with $d$ sufficiently large. We also investigate how close $\iota(G,tF)$ is to $\iota(G,F)$, using domination and, in suitable cases, the Erdős–Pósa property.
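
To make the definition concrete: for the simplest connected pattern $F = K_2$, an $F$-isolating set is a set $D$ whose closed neighborhood can be deleted so that no edge survives. Below is a brute-force Python sketch of this special case; it is our illustration of the definition, not the paper's methods, and it takes exponential time (consistent with the NP-completeness result above).

```python
from itertools import combinations

def isolation_number(vertices, edges):
    """Smallest |D| such that deleting the closed neighborhood N[D]
    leaves a graph with no edges (the special case F = K_2).
    Exponential-time brute force -- toy graphs only."""
    vertices = list(vertices)
    closed = {v: {v} for v in vertices}      # closed neighborhoods N[v]
    for u, v in edges:
        closed[u].add(v)
        closed[v].add(u)
    for k in range(len(vertices) + 1):
        for D in combinations(vertices, k):
            removed = {w for d in D for w in closed[d]}
            rest = set(vertices) - removed
            if not any(u in rest and v in rest for u, v in edges):
                return k

# Path on 5 vertices: D = {2} works, since deleting N[2] = {1, 2, 3}
# leaves only the isolated vertices 0 and 4.
print(isolation_number(range(5), [(0, 1), (1, 2), (2, 3), (3, 4)]))  # 1
```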

Dynamic Level Sets

from arXiv: Computational Complexity

Authors: Michael Stephen Fiske

A mathematical concept is identified and analyzed that is implicit in the 2012 paper Turing Incomputable Computation, presented at the Alan Turing Centenary Conference (Turing 100, Manchester). The concept, called dynamic level sets, is distinct from mathematical concepts in the standard literature on dynamical systems, topology, and computability theory. The resulting new mathematical object is explained, along with why it may have escaped prior characterizations, including the classical result of de Leeuw, Moore, Shannon, and Shapiro (1956) that probabilistic Turing machines compute no more than deterministic ones.

FLIGHT: Fibonacci Lattice-based Inference for Geometric Heading in real-Time

from arXiv: Computational Geometry

Authors: David Dirnfeld, Fabien Delattre, Pedro Miraldo, Erik Learned-Miller

Estimating camera motion from monocular video is a fundamental problem in computer vision, central to tasks such as SLAM, visual odometry, and structure-from-motion. Existing methods that recover the camera's heading under known rotation, whether from an IMU or an optimization algorithm, tend to perform well in low-noise, low-outlier conditions, but often decrease in accuracy or become computationally expensive as noise and outlier levels increase. To address these limitations, we propose a novel generalization of the Hough transform on the unit sphere ($S^2$) to estimate the camera's heading. First, the method extracts correspondences between two frames and generates a great circle of directions compatible with each pair of correspondences. Then, by discretizing the unit sphere using a Fibonacci lattice as bin centers, each great circle casts votes for a range of directions, ensuring that features unaffected by noise or dynamic objects vote consistently for the correct motion direction. Experimental results on three datasets demonstrate that the proposed method is on the Pareto frontier of accuracy versus efficiency. Additionally, experiments on SLAM show that the proposed method reduces RMSE by correcting the heading during camera pose initialization.
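
The Fibonacci lattice itself is a standard construction and easy to reproduce; a short Python sketch follows. The thresholded "voting" at the end is our own toy illustration of how a great circle (with unit normal m) touches nearby bins, not the authors' actual voting scheme.

```python
import numpy as np

def fibonacci_lattice(n):
    """n nearly uniform points on the unit sphere S^2 via the
    golden-angle (Fibonacci) spiral, used here as bin centers."""
    golden_angle = np.pi * (3.0 - np.sqrt(5.0))
    i = np.arange(n)
    z = 1.0 - (2.0 * i + 1.0) / n    # evenly spaced heights in (-1, 1)
    r = np.sqrt(1.0 - z * z)         # radius of each horizontal circle
    theta = golden_angle * i         # azimuth advances by the golden angle
    return np.stack([r * np.cos(theta), r * np.sin(theta), z], axis=1)

centers = fibonacci_lattice(1000)
# A great circle of candidate directions has some unit normal m; bins whose
# centers are nearly orthogonal to m lie close to that circle and get a vote.
m = np.array([0.0, 0.0, 1.0])
votes = np.abs(centers @ m) < 0.05
print(int(votes.sum()), "of", len(centers), "bins receive a vote")
```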

Learning Tangent Bundles and Characteristic Classes with Autoencoder Atlases

from arXiv: Computational Geometry

Authors: Eduardo Paluzo-Hidalgo, Yuichi Ike

We introduce a theoretical framework that connects multi-chart autoencoders in manifold learning with the classical theory of vector bundles and characteristic classes. Rather than viewing autoencoders as producing a single global Euclidean embedding, we treat a collection of locally trained encoder-decoder pairs as a learned atlas on a manifold. We show that any reconstruction-consistent autoencoder atlas canonically defines transition maps satisfying the cocycle condition, and that linearising these transition maps yields a vector bundle coinciding with the tangent bundle when the latent dimension matches the intrinsic dimension of the manifold. This construction provides direct access to differential-topological invariants of the data. In particular, we show that the first Stiefel-Whitney class can be computed from the signs of the Jacobians of learned transition maps, yielding an algorithmic criterion for detecting orientability. We also show that non-trivial characteristic classes provide obstructions to single-chart representations, and that the minimum number of autoencoder charts is determined by the good cover structure of the manifold. Finally, we apply our methodology to low-dimensional orientable and non-orientable manifolds, as well as to a non-orientable high-dimensional image dataset.
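
To fix notation (ours, not necessarily the paper's): with encoder-decoder pairs $(E_i, D_i)$ serving as charts, the two conditions at the heart of the construction can be written as follows.

```latex
% Transition map between charts i and j, defined on the overlap U_i \cap U_j:
\[
  g_{ij} \;=\; E_j \circ D_i .
\]
% Reconstruction consistency forces the cocycle condition on triple overlaps:
\[
  g_{jk} \circ g_{ij} \;=\; g_{ik}
  \qquad \text{on } U_i \cap U_j \cap U_k .
\]
% Orientability criterion from the abstract: the signs of the Jacobian
% determinants, sgn(det J(g_{ij})), form a Cech 1-cochain with Z/2
% coefficients; the manifold is orientable iff this cochain is a
% coboundary, i.e. iff the first Stiefel--Whitney class w_1 vanishes.
```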

Equivalent Dichotomies for Triangle Detection in Subgraph, Induced, and Colored H-Free Graphs

from arXiv: Data Structures and Algorithms

Authors: Amir Abboud, Ron Safier, Nathan Wallheimer

A recent paper by the authors (ITCS'26) initiates the study of the Triangle Detection problem in graphs avoiding a fixed pattern $H$ as a subgraph and proposes a \emph{dichotomy hypothesis} characterizing which patterns $H$ make the Triangle Detection problem easier in $H$-free graphs than in general graphs. In this work, we demonstrate that this hypothesis is, in fact, equivalent to analogous hypotheses in two broader settings that a priori seem significantly more challenging: \emph{induced} $H$-free graphs and \emph{colored} $H$-free graphs. Our main contribution is a reduction from the induced $H$-free case to the non-induced $H^+$-free case, where $H^+$ preserves the structural properties of $H$ that are relevant for the dichotomy, namely $3$-colorability and triangle count. A similar reduction is given for the colored case. A key technical ingredient is a self-reduction to Unique Triangle Detection that preserves the induced $H$-freeness property, via a new color-coding-like reduction.

Dequantization Barriers for Guided Stoquastic Hamiltonians

from arXiv: Data Structures and Algorithms

Authors: Yassine Hamoudi, Yvan Le Borgne, Shrinidhi Teganahally Sridhara

We construct a probability distribution, induced by the Perron--Frobenius eigenvector of an exponentially large graph, which cannot be efficiently sampled by any classical algorithm, even when provided with the best-possible warm-start distribution. In the quantum setting, this problem can be viewed as preparing the ground state of a stoquastic Hamiltonian given a guiding state as input, and is known to be efficiently solvable on a quantum computer. Our result suggests that no efficient classical algorithm can solve a broad class of stoquastic ground-state problems. Our graph is constructed from a class of high-degree, high-girth spectral expanders to which self-similar trees are attached. This builds on and extends prior work of Gilyén, Hastings, and Vazirani [Quantum 2021, STOC 2021], which ruled out dequantization for a specific stoquastic adiabatic path algorithm. We strengthen their result by ruling out any classical algorithm for guided ground-state preparation.

Flip Distance of Triangulations of Convex Polygons / Rotation Distance of Binary Trees is NP-complete

from arXiv: Data Structures and Algorithms

Authors: Joseph Dorfer

Flips in triangulations of convex polygons arise in many different settings. They are isomorphic to rotations in binary trees, define edges in the 1-skeleton of the Associahedron and cover relations in the Tamari Lattice. The complexity of determining the minimum number of flips that transform one triangulation of a convex point set into another remained a tantalizing open question for many decades. We settle this question by proving that computing shortest flip sequences between triangulations of convex polygons, and therefore also computing the rotation distance of binary trees, is NP-hard. For our proof we develop techniques for flip sequences of triangulations whose counterparts were introduced for the study of flip sequences of non-crossing spanning trees by Bjerkevik, Kleist, Ueckerdt, and Vogtenhuber~[SODA25] and Bjerkevik, Dorfer, Kleist, Ueckerdt, and Vogtenhuber~[SoCG26].

Mean Estimation from Coarse Data: Characterizations and Efficient Algorithms

from arXiv: Data Structures and Algorithms

Authors: Alkis Kalavasis, Anay Mehrotra, Manolis Zampetakis, Felix Zhou, Ziyu Zhu

Coarse data arise when learners observe only partial information about samples; namely, a set containing the sample rather than its exact value. This occurs naturally through measurement rounding, sensor limitations, and lag in economic systems. We study Gaussian mean estimation from coarse data, where each true sample $x$ is drawn from a $d$-dimensional Gaussian distribution with identity covariance, but is revealed only through the set of a partition containing $x$. When the coarse samples, roughly speaking, have "low" information, the mean cannot be uniquely recovered from observed samples (i.e., the problem is not identifiable). Recent work by Fotakis, Kalavasis, Kontonis, and Tzamos [FKKT21] established that sample-efficient mean estimation is possible when the unknown mean is identifiable and the partition consists of only convex sets. Moreover, they showed that without convexity, mean estimation becomes NP-hard. However, two fundamental questions remained open: (1) When is the mean identifiable under convex partitions? (2) Is computationally efficient estimation possible under identifiability and convex partitions? This work resolves both questions. [...]

Isolation critical graphs under multiple edge subdivision

from arXiv: Data Structures and Algorithms

Authors: Karl Bartolo, Peter Borg, Magda Dettlaff, Magdalena Lemańska, Paweł Żyliński

This paper introduces the notion of $(\iota,q)$-critical graphs. The isolation number of a graph $G$, denoted by $\iota(G)$ and also known as the vertex-edge domination number, is the minimum number of vertices in a set $D$ such that the subgraph induced by the vertices not in the closed neighbourhood of $D$ has no edges. A graph $G$ is $(\iota,q)$-critical, $q \ge 1$, if the subdivision of any $q$ edges in $G$ gives a graph with isolation number greater than $\iota(G)$ and there exists a set of $q-1$ edges such that subdividing them gives a graph with isolation number equal to $\iota(G)$. We prove that for each integer $q \ge 1$ there exists an $(\iota,q)$-critical graph, while for a given graph $G$, the admissible values of $q$ satisfy $1 \le q \le |E(G)| - 1$. In addition, we provide a general characterisation of $(\iota,1)$-critical graphs as well as a constructive characterisation of $(\iota,1)$-critical trees.

Efficient Parallel Algorithms for Hypergraph Matching

from arXiv: Data Structures and Algorithms

Authors: Henrik Reinstädtler, Christian Schulz, Nodari Sitchinava, Fabian Walliser

We present efficient parallel algorithms for computing maximal matchings in hypergraphs. Our algorithm finds locally maximal edges in the hypergraph and adds them in parallel to the matching. In the CRCW PRAM model, our algorithms achieve $O(\log m)$ time with $O((\kappa + n) \log m)$ work w.h.p., where $m$ is the number of hyperedges and $\kappa$ is the sum of all vertex degrees. In the CREW PRAM model, our algorithm has a running time of $O((\log \Delta + \log d)\log m)$ and requires $O((\kappa + n) \log m)$ work w.h.p. It can be implemented work-optimally with $O(\kappa + n)$ work in $O((\log m + \log n)\log m)$ time. We prove a $1/d$-approximation guarantee for our algorithms. We evaluate our algorithms experimentally by implementing and running them on the GPU using CUDA and Kokkos. Our experimental evaluation demonstrates the practical efficiency of our approach on real-world hypergraph instances, yielding a speedup of up to 76 times compared to a single-core CPU algorithm.
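
For intuition, here is the sequential greedy baseline in Python that such algorithms are measured against: scan the hyperedges and keep each one whose vertices are all still unmatched. The paper's contribution is the parallel, round-based version that adds locally maximal edges simultaneously; that machinery is not reproduced in this sketch.

```python
def greedy_maximal_matching(hyperedges):
    """Sequential greedy maximal matching in a hypergraph: the result is
    maximal (no remaining hyperedge is disjoint from all chosen ones),
    though not necessarily maximum."""
    matched = set()   # vertices already used by the matching
    matching = []
    for e in hyperedges:
        if matched.isdisjoint(e):
            matching.append(e)
            matched.update(e)
    return matching

edges = [{1, 2, 3}, {3, 4}, {5, 6, 7}, {2, 5}]
print(greedy_maximal_matching(edges))  # [{1, 2, 3}, {5, 6, 7}]
```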

A Simple Distributed Deterministic Planar Separator

from arXiv: Data Structures and Algorithms

Authors: Yaseen Abd-Elhaleem, Michal Dory, Oren Weimann

A balanced separator of a graph $G$ is a set of vertices whose removal disconnects the graph into connected components that are a constant factor smaller than $G$. Lipton and Tarjan [FOCS'77] famously proved that every planar graph admits a balanced separator of size $O(\sqrt{n})$, as well as a balanced separator of size $O(D)$ that is a simple path (where $D$ is $G$'s diameter). In the centralized setting, both separators can be found in linear time. In the distributed setting, $D$ is a universal lower bound for the round complexity of solving many optimization problems, so separators of size $O(D)$ are preferable. It was not until [DISC'17] that a distributed algorithm was devised by Ghaffari and Parter to compute such an $O(D)$-size separator in $\tilde O(D)$ rounds, by adapting the Lipton-Tarjan algorithm to the distributed model. Since then, this algorithm has been used in several distributed algorithms for planar graphs, e.g., [GP, DISC'17], [LP, STOC'19], [AEDPW, PODC'25]. However, the algorithm is randomized, rendering the algorithms that use it randomized as well. Obtaining a deterministic algorithm remained an interesting open question until [PODC'25], when a (complex) deterministic separator algorithm was given by Jauregui, Montealegre and Rapaport. We present a much simpler deterministic separator algorithm with the same (near-optimal) $\tilde O(D)$-round complexity. While previous works devised either complicated or randomized ways of transferring weights from vertices to faces of $G$, we show that a straightforward way also works: each vertex simply transfers its weight to one arbitrary face it lies on. That's it! We note that a deterministic separator algorithm directly derandomizes the state-of-the-art distributed algorithms for classical problems on planar graphs such as single-source shortest paths, maximum flow, directed global min-cut, and reachability.

An $\mathcal{O}(\log N)$ Time Algorithm for the Generalized Egg Dropping Problem

from arXiv: Data Structures and Algorithms

Authors: Kleitos Papadopoulos

The generalized egg dropping problem is a canonical benchmark in sequential decision-making. Standard dynamic programming evaluates the minimum number of tests in the worst case in $\mathcal{O}(K \cdot N^2)$ time. The previous state-of-the-art approach formulates the testable thresholds as a partial sum of binomial coefficients and applies a combinatorial search to reduce the time complexity to $\mathcal{O}(K \log N)$. In this paper, we demonstrate that the discrete binary search over the decision tree can be bypassed entirely. By utilizing a relaxation of the binomial bounds, we compute an approximate root that tightly bounds the optimal value. We mathematically prove that this approximation restricts the remaining search space to exactly $\mathcal{O}(K)$ discrete steps. Because constraints inherently enforce $K < \log_2(N+1)$, our algorithm achieves an unconditional worst-case time complexity of $\mathcal{O}(\min(K, \log N))$. Furthermore, we formulate an explicit $\mathcal{O}(1)$ space deterministic policy to dynamically retrace the optimal sequential choices, eliminating classical state-transition matrices completely.
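
The partial binomial sum from the abstract is easy to state concretely: with $t$ trials and $K$ eggs one can distinguish $\sum_{i=1}^{K} \binom{t}{i}$ floors, so the answer is the least $t$ for which this sum reaches $N$. The Python sketch below uses a plain linear scan over $t$ in place of the search that prior work performs by binary search and that the paper replaces with a closed-form approximate root plus $\mathcal{O}(K)$ correction steps.

```python
from math import comb

def floors_coverable(t, k):
    # Maximum number of floors decidable with t trials and k eggs:
    # the classical partial binomial sum C(t,1) + ... + C(t,k).
    return sum(comb(t, i) for i in range(1, k + 1))

def min_trials(n_floors, k_eggs):
    """Least t with floors_coverable(t, k) >= N (naive linear scan)."""
    t = 0
    while floors_coverable(t, k_eggs) < n_floors:
        t += 1
    return t

print(min_trials(100, 2))   # 14 -- the classical two-egg, 100-floor answer
print(min_trials(1000, 1))  # 1000 -- one egg forces floor-by-floor testing
```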

SYK thermal expectations are classically easy at any temperature

from arXiv: Data Structures and Algorithms

Authors: Alexander Zlokapa, Bobak T. Kiani

Estimating thermal expectations of local observables is a natural target for quantum advantage. We give a simple classical algorithm that approximates thermal expectations, and we show it has quasi-polynomial cost $n^{O(\log n/\varepsilon)}$ for all temperatures above a phase transition in the free energy. For many natural models, this coincides with the entire fast-mixing, quantumly easy phase. Our results apply to the Sachdev-Ye-Kitaev (SYK) model at any constant temperature -- including when the thermal state is highly entangled and satisfies polynomial quantum circuit lower bounds, a sign problem, and nontrivial instance-to-instance fluctuations. Our analysis of the SYK model relies on the replica trick to control the complex zeros of the partition function.

static_maps: consteval std::map and std::unordered_map Implementations in C++23

from arXiv: Data Structures and Algorithms

Authors: Isaac D. Myhal, Oliver Serang

Using consteval from C++23, we implement efficient new versions of std::map and std::unordered_map for use when the keys are known at compile time. We demonstrate superior performance of our unordered_map on three demonstration use-cases: lookup of elemental mass from atomic symbol, lookup of amino acid from codon, and modification of stock prices from S&P 500 ticker symbols. These produced runtimes <40%, <35%, and <73% of the respective runtimes of the std implementations. Our library's runtimes were <80%, <45%, and <97% of the lookup time of Frozen, an alternative perfect-hashing implementation in C++ for problems also using constexpr keys. To our knowledge, this makes our library the overall fastest drop-in (i.e., with a similar API) alternative to std::unordered_map. On one arbitrarily chosen demo, we demonstrate runtimes <35% of PTHash and <89% of gperf, state-of-the-art, but not drop-in, hashing libraries that operate via external tools.

Thursday, February 26

TR26-030 | Spiky Rank and Its Applications to Rigidity and Circuits | Lianna Hambardzumyan, Konstantin Myasnikov, Artur Riazanov, Morgan Shirley, Adi Shraibman

from ECCC Papers

We introduce spiky rank, a new matrix parameter that enhances blocky rank by combining the combinatorial structure of the latter with linear-algebraic flexibility. A spiky matrix is block-structured with diagonal blocks that are arbitrary rank-one matrices, and the spiky rank of a matrix is the minimum number of such matrices required to express it as a sum. This measure extends blocky rank to real matrices and is more robust for problems with both combinatorial and algebraic character. Our conceptual contribution is as follows: we propose spiky rank as a well-behaved candidate matrix complexity measure and demonstrate its potential through applications. We show that large spiky rank implies high matrix rigidity, and that spiky rank lower bounds yield lower bounds for depth-2 ReLU circuits, the basic building blocks of neural networks. On the technical side, we establish tight bounds for random matrices and develop a framework for explicit lower bounds, applying it to Hamming distance matrices and spectral expanders. Finally, we relate spiky rank to other matrix parameters, including blocky rank, sparsity, and the $\gamma_2$-norm.
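
One way to write the definition down formally, based on our reading of the abstract (see the paper for the precise statement):

```latex
% S is "spiky" if, after permuting rows and columns, it is block-diagonal
% with every diagonal block an arbitrary rank-one matrix. Then
\[
  \mathrm{srank}(M) \;=\; \min\Bigl\{\, s : M = \sum_{i=1}^{s} S_i,
  \ \text{each } S_i \ \text{spiky} \Bigr\}.
\]
% Compare blocky rank, where the diagonal blocks must be all-ones blocks;
% since an all-ones block has rank one, every blocky matrix is spiky, and
% the spiky rank of a Boolean matrix is at most its blocky rank.
```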

Ratts of the Capital

from Ben Recht

In the quest for knowledge, Elmer McCollum cooks up some synthetic rodent food.

This is Part 3 of a blogged essay “Steampunk Data Science.” A table of contents is here.

Shortly after Elmer McCollum was hired as a young faculty member in agricultural chemistry at the University of Wisconsin, Edwin Hart recruited him to work on the single-grain experiment. Being a junior faculty member susceptible to pressure from those more senior, McCollum reluctantly agreed to join the project. But McCollum knew from the get-go that the single-grain experiment was probably hopeless. How could they hope to isolate the differences between the grains if every experiment would require the half-decade needed to raise their herd of cows to maturity?

McCollum was trained as an organic chemist, and his limited experience in nutrition had been gleaned during his postdoctoral fellowship at Yale. As a go-getting young faculty member, he hit the books, purchasing all 37 volumes of the German “Yearbook on the Progress of Animal Chemistry.” There, he found numerous nutrition studies done in Germany on mice. McCollum had an epiphany. Clearly, the Wisconsin investigations into nutrition could be accelerated by switching from cows to smaller animals, thereby converting studies that were taking years to complete into ones that could be finished in a few months. But which animals would be most appropriate?

McCollum decided upon rats. Rats certainly grew much faster than cows. Rats were also much smaller. A researcher could keep dozens of cages in the same space required to house a cow. Also, for better or for worse, rats would eat almost anything you put in front of them. And few have sympathy for rodents being sacrificed for the sake of human knowledge.

For McCollum, it was settled: the Wisconsin Agricultural Experiment Station must immediately begin investigations on rats. He presented his plan to the Dean of the Agriculture School in 1907, shortly after the single-grain experiment had begun. McCollum’s lobbying met the same fate as Stephen Babcock’s earlier pleas for the single-grain experiment. The dean emphatically denied his request. He told McCollum he couldn’t imagine the trouble he’d be in had the citizens of Wisconsin discovered that he had allowed vermin to be grown at his field station on the government dime.

Undeterred, and with the encouragement of now emeritus faculty Babcock, McCollum decided to covertly run rat experiments anyway. He had grown up on a farm in Kansas, and his parents had paid him and his brother a bounty of a penny for each rat they could catch. So he put his farmboy skills to the test. Despite the dean’s denial, the Madison campus’ sprawling array of barns was already swimming with rats. McCollum trapped some of the locals and set up a clandestine animal laboratory. After several unfortunate encounters, McCollum determined that the wild rats were far too feral to keep in a lab.

McCollum needed nicer rats. He journeyed south to Chicago and bought a dozen tame white rats from a pet store. The pet rats proved far more manageable than the barn rats. From this initial dozen, McCollum bred a large rat colony with some combination of his research discretionary funds and his personal salary.

Now ready to experiment, McCollum returned to the European literature to find the best initial foray into nutrition science. Data visualization in nutrition was still in its infancy, but by the early 1900s, growth curves had become standard tools for understanding diet. An influential paper by Watson and Hunter analyzed various food diets for rats, feeding them restricted diets of either bread and milk, porridge and milk, rice, horse flesh, or ox flesh. In Figure 1, they plotted the average growth of the rats fed bread and milk and those fed porridge (oats cooked in boiled skim milk and water with a pinch of salt).

The thin line is the average weight in grams of the rats fed the bread and milk diet, while the thick line is the average weight in grams of the rats fed the porridge diet. The arrows denote the deaths of the rats. Watson and Hunter observed that the bread-and-milk diet resulted in uniformly healthy adolescent rats. In contrast, for whatever reason, young rats couldn’t live on only porridge, and all died in under 22 weeks.

Watson and Hunter never specify exactly what they mean by “bread.” Indeed, not every bread recipe is the same, and British bread likely differed from French or German bread in its grains, seeds, and other ingredients. From the experimental protocol in the paper, we can’t determine what exactly the bread contained that was lacking in the porridge. Whatever was in the bread and not in the oatmeal seemed essential for sustaining the growth of rats.

McCollum realized he needed a more fine-grained approach to isolating the nutrients necessary for rat growth. Using his skills as a chemist, he created “synthetic diets” that built food up from its basic building blocks. McCollum’s synthetic foods could be titrated to find the bare essentials required to sustain rats.

In his first paper on his rat colony, McCollum compared the nutritional value of some of his synthetic diets. He recorded the results in tables like these:

He fed some rats a diet based on organic compounds and inorganic phosphorus. He compared this synthetic diet to one with added hydrolyzed beef fat to make it tastier for the rats. McCollum’s assessment of the experiment was somewhat surprising: he concluded that palatability, that is, the tastiness of the diet, was the most essential factor in nutrition. In fact, he went as far as ruling out the possibility that poor growth could be attributed to the “lack of certain organic complexes in the food given, which the body was not able to supply through its synthetic power from the materials at hand.” This conclusion was astonishingly wrong.

I’ll tell you why in the next installment of this series, but let’s first try to understand how McCollum could have been so confused by his experiment. His data presentations certainly didn’t help. Though growth curves were commonplace by the time he published this work, McCollum’s paper contained no plots. All the data are collected in unruly tables listing the weights of different rats under different diets. Even his experimental protocols were confusing. He decided to change the diet of Rat VII on day 53. McCollum explains that he made this dietary change because Rat VII ate more aggressively than the others, and he wanted to find out whether the phosphorus was necessary for continued growth.

It’s possible that plotting the data would have revealed to McCollum the tenuousness of his conclusions. For example, in Figure 3, I plotted McCollum’s data (from Figure 2) in the style of Watson and Hunter, with each point denoting the average weight of the rats in each lot.

In McCollum’s experiment, the control group was fed a combination of oats, corn, and wheat, like in the single-grain experiment. It is clear here that the two lots fed synthetic foods are doing much worse than rats fed normal food mixtures. Perhaps the differences between Lots 1 and 2 were due to unrelated traits of the individual rats in the study. With the data we have, it’s impossible to know.

Even though it wasn’t particularly convincing, McCollum still managed to find a journal to accept his first attempts at systematic experimental nutrition. Though we now know the results were not reproducible and the conclusions were way off, the publication got the ball rolling in American nutrition science. Within a few years, it would lead to a revolution in our understanding of food.

Endnote: Though I heard many variants of McCollum’s legend when I was a professor at Wisconsin, I’ve based my account here mostly on McCollum’s autobiography, which paints the most favorable account of him. McCollum was a prickly guy, and his bucking the trend of the cow experiment did not make him many friends in Madison. Whereas Babcock and Harry Steenbock have buildings and streets and ice cream shops named after them, McCollum’s legacy is the Sprague-Dawley rat, developed at Wisconsin after McCollum had moved to Johns Hopkins, but still one of the most popular strains of laboratory rat in biology.

By Ben Recht

The Lens of Abelian Embeddings

from arXiv: Computational Complexity

Authors: Dor Minzer

We discuss a recent line of research investigating inverse theorems with respect to general k-wise correlations, and explain how such correlations arise in different contexts in mathematics. We outline some of the results that were established and their applications in discrete mathematics and theoretical computer science. We also mention some open problems for future research.

Two NP-hard Extensions of the Spearman Footrule even for a Small Constant Number of Voters

from arXiv: Computational Complexity

Authors: Martin Durand

The Spearman footrule is a voting rule that takes as input voter preferences expressed as rankings. It outputs a ranking that minimizes the sum of the absolute differences between the position of each candidate in the ranking and in the voters' preferences. In this paper, we study the computational complexity of two extensions of the Spearman footrule when the number of voters is a small constant. The first extension, introduced by Pascual et al. (2018), arises from the collective scheduling problem and treats candidates, referred to as tasks in their model, as having associated lengths. The second extension, proposed by Kumar and Vassilvitskii (2010), assigns weights to candidates; these weights serve both as lengths, as in the collective scheduling model, and as coefficients in the objective function to be minimized. Although computing a ranking under the standard Spearman footrule is polynomial-time solvable, we demonstrate that the first extension is NP-hard with as few as 3 voters, and the second extension is NP-hard with as few as 4 voters. Both extensions are polynomial-time solvable for 2 voters.
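
For context on the baseline result the abstract mentions: the standard footrule ranking can be found in polynomial time by reducing it to a minimum-cost assignment of candidates to positions, where placing candidate $c$ at position $j$ costs $\sum_v |\mathrm{pos}_v(c) - j|$. Below is a minimal sketch of that classic reduction, not of the paper’s extensions; the preference encoding is my own choice for illustration.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def footrule_ranking(prefs):
    """prefs[v][c] = position of candidate c in voter v's ranking
    (0 = first place). Returns positions minimizing the summed
    footrule distance to all voters, via min-cost assignment."""
    n_voters, n_cands = prefs.shape
    positions = np.arange(n_cands)
    # cost[c][j] = total displacement if candidate c lands at position j
    cost = np.abs(prefs[:, :, None] - positions[None, None, :]).sum(axis=0)
    cand, pos = linear_sum_assignment(cost)   # optimal bipartite matching
    out = np.empty(n_cands, dtype=int)
    out[cand] = pos
    return out

# Three voters ranking four candidates:
prefs = np.array([[0, 1, 2, 3],
                  [1, 0, 2, 3],
                  [0, 2, 1, 3]])
print(footrule_ranking(prefs))  # [0 1 2 3]
```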

(Semi-)Invariant Curves from Centers of Triangle Families

from arXiv: Computational Geometry

Authors: Klara Mundilova, Oliver Gross

We study curves obtained by tracing triangle centers within special families of triangles, focusing on centers and families that yield (semi-)invariant triangle curves, meaning that varying the initial triangle changes the loci only by an affine transformation. We identify four two-parameter families of triangle centers that are semi-invariant and determine which are invariant, in the sense that the resulting curves for different initial triangles are related by a similarity transformation. We further observe that these centers, when combined with the aliquot triangle family, yield sheared Maclaurin trisectrices, whereas the nedian triangle family yields Limaçon trisectrices.

Steiner Forest for $H$-Subgraph-Free Graphs

from arXiv: Data Structures and Algorithms

Authors: Tala Eagling-Vose, David C. Kutner, Felicia Lucke, Dániel Marx, Barnaby Martin, Daniël Paulusma, Erik Jan van Leeuwen

Our main result is a full classification, for every connected graph $H$, of the computational complexity of Steiner Forest on $H$-subgraph-free graphs. To obtain this dichotomy, we establish the following new algorithmic, hardness, and combinatorial results: Algorithms: We identify two new classes of graph-theoretical structures that make it possible to solve Steiner Forest in polynomial time. Roughly speaking, our algorithms handle the following cases: (1) a set $X$ of vertices of bounded size that are pairwise connected by subgraphs of treewidth $2$ or bounded size, possibly together with an independent set of arbitrary size that is connected to $X$ in an arbitrary way; (2) a set $X$ of vertices of arbitrary size that are pairwise connected in a cyclic manner by subgraphs of treewidth $2$ or bounded size. Hardness results: We show that Steiner Forest remains NP-complete for graphs with 2-deletion set number $3$. (The $c$-deletion set number is the size of a smallest cutset $S$ such that every component of $G-S$ has at most $c$ vertices.) Combinatorial results: To establish the dichotomy, we perform a delicate graph-theoretic analysis showing that if $H$ is a path or a subdivided claw, then excluding $H$ as a subgraph either yields one of the two algorithmically favourable structures described above, or yields a graph class for which NP-completeness of Steiner Forest follows from either our new hardness result or a previously known one. Along the way to classifying the hardness for excluded subgraphs, we establish a dichotomy for graphs with $c$-deletion set number at most $k$. Specifically, our results together with pre-existing ones show that Steiner Forest is polynomial-time solvable if (1) $c=1$ and $k\geq 0$, or (2) $c=2$ and $k\leq 2$, or (3) $c\geq 3$ and $k=1$, and is NP-complete otherwise.

Optimal Trajectories in Discrete Space with Acceleration Constraints

from arXiv: Data Structures and Algorithms

Authors: Arnaud Casteigts, Matteo De Francesco, Pierre Leone

In the racetrack acceleration model, proposed by Martin Gardner in 1973, each step consists of changing the position of the vehicle by a vector in $\mathbb{Z}^2$, with the constraints that two consecutive vectors differ by at most one unit in each dimension. We investigate three problems related to this model in arbitrary dimension in open space (no obstacles), where a configuration of the vehicle consists of its current position and the last-used vector. The three problems are the following. In Branching Cost (BC), given two configurations, the goal is to compute the minimum number of intermediate configurations (length of a trajectory) between the two configurations. Branching Trajectory (BT) has the same input and asks for a description of the corresponding trajectory. Multipoint Trajectory (MT) asks for an optimal trajectory that visits given points $p_1,\dots,p_n$ in a prescribed order, starting and ending with zero-speed configurations. We revisit known approaches to solve BC in 2D, showing that this problem can be solved in constant time in any fixed number of dimensions $d$ (more generally, in $O(d \log d)$ time). We show that BT can also be solved in constant time for any fixed $d$, despite the fact that the length of the trajectory is not constant, by leveraging the fact that there always exists \emph{one} optimal trajectory compactly represented by $O(1)$ intermediate configurations. For MT, we collect theoretical and experimental evidence that the speed cannot be trivially bounded; local decisions may be impacted by points that are arbitrarily far in the visit order; and an optimal trajectory may require significant excursions out of the convex hull of the points. We still establish conservative speed bounds that a natural dynamic programming (DP) algorithm can exploit to solve reasonably large instances efficiently.
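
To make the model concrete, here is a small brute-force sketch, mine rather than the paper’s constant-time algorithm, that computes the Branching Cost between two 2D configurations by breadth-first search over (position, last-used vector) states.

```python
from collections import deque
from itertools import product

def branching_cost(start, target):
    """Minimum number of steps between two configurations
    ((x, y), (vx, vy)): each step's vector may differ from the
    previous one by at most one unit per coordinate."""
    queue = deque([(start, 0)])
    seen = {start}
    while queue:
        (pos, vel), steps = queue.popleft()
        if (pos, vel) == target:
            return steps
        (x, y), (vx, vy) = pos, vel
        for dvx, dvy in product((-1, 0, 1), repeat=2):
            nvx, nvy = vx + dvx, vy + dvy
            nxt = ((x + nvx, y + nvy), (nvx, nvy))
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, steps + 1))

# Start at rest, stop 6 cells to the right at rest:
# speeds 1, 2, 2, 1, 0 are optimal, so the answer is 5.
print(branching_cost(((0, 0), (0, 0)), ((6, 0), (0, 0))))  # 5
```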

Sample Complexity Bounds for Robust Mean Estimation with Mean-Shift Contamination

from arXiv: Data Structures and Algorithms

Authors: Ilias Diakonikolas, Giannis Iakovidis, Daniel M. Kane, Sihan Liu

We study the basic task of mean estimation in the presence of mean-shift contamination. In the mean-shift contamination model, an adversary is allowed to replace a small constant fraction of the clean samples by samples drawn from arbitrarily shifted versions of the base distribution. Prior work characterized the sample complexity of this task for the special cases of the Gaussian and Laplace distributions. Specifically, it was shown that consistent estimation is possible in these cases, a property that is provably impossible in Huber's contamination model. An open question posed in earlier work was to determine the sample complexity of mean estimation in the mean-shift contamination model for general base distributions. In this work, we study and essentially resolve this open question. Specifically, we show that, under mild spectral conditions on the characteristic function of the (potentially multivariate) base distribution, there exists a sample-efficient algorithm that estimates the target mean to any desired accuracy. We complement our upper bound with a qualitatively matching sample complexity lower bound. Our techniques make critical use of Fourier analysis, and in particular introduce the notion of a Fourier witness as an essential ingredient of our upper and lower bounds.

Robust Permutation Flowshops Under Budgeted Uncertainty

from arXiv: Data Structures and Algorithms

Authors: Noam Goldberg, Danny Hermelin, Dvir Shabtay

We consider the robust permutation flowshop problem under the budgeted uncertainty model, where at most a given number of job processing times may deviate on each machine. We show that solutions for this problem can be determined by solving polynomially many instances of the corresponding nominal problem. As a direct consequence, our result implies that this robust flowshop problem can be solved in polynomial time for two machines, and can be approximated in polynomial time for any fixed number of machines. The reduction that is our main result follows from an analysis similar to Bertsimas and Sim (2003) except that dualization is applied to the terms of a min-max objective rather than to a linear objective function. Our result may be surprising considering that heuristic and exact integer programming based methods have been developed in the literature for solving the two-machine flowshop problem. We conclude by showing a logarithmic factor improvement in the overall running time implied by a naive reduction to nominal problems in the case of two machines and three machines.

Tight Bounds for Online Scheduling in the One-Fast-Many-Slow Machines Setting

from arXiv: Data Structures and Algorithms

Authors: John Jeang, Vladimir Podolskii

In the One-Fast-Many-Slow decision problem, introduced by Sheffield and Westover (ITCS '25), a scheduler, with access to one fast machine and infinitely many slow machines, receives a series of tasks and must allocate the work among its machines. The goal is to minimize the overhead of an online algorithm over the optimal offline algorithm. Three versions of this setting were considered: Instantly-committing schedulers that must assign tasks to machines immediately and irrevocably, Eventually-committing schedulers whose assignments are irrevocable but can occur anytime after a task arrives, and Never-committing schedulers that can interrupt and restart a task on a different machine. In the Instantly-committing model, Sheffield and Westover showed that the optimal competitive ratio is equal to 2, while in the Eventually-committing model the competitive ratio lies in the interval [1.618, 1.678], and in the Never-committing model the competitive ratio lies in the interval [1.366, 1.5] (SPAA '24, ITCS '25). In the latter two models, the exact optimal competitive ratios were left as open problems, moreover Kuszmaul and Westover (SPAA '24) conjectured that the lower bound in the Eventually-committing model is tight. In this paper we resolve this problem by providing tight bounds for the competitive ratios in the Eventually-committing and Never-committing models. For Eventually-committing, we prove Kuszmaul and Westover's conjecture by giving an algorithm achieving a competitive ratio equal to the lower bound of $\frac{1+\sqrt{5}}{2}\approx 1.618$. For Never-committing, we provide an explicit Task Arrival Process (TAP) lower bounding the competitive ratio to the previous upper bound of 1.5.

Delayed-Clairvoyant Flow Time Scheduling via a Borrow Graph Analysis

from arXiv: Data Structures and Algorithms

Authors: Alexander Lindermayr, Jens Schlöter

We study the problem of preemptively scheduling jobs online over time on a single machine to minimize the total flow time. In the traditional clairvoyant scheduling model, the scheduler learns about the processing time of a job at its arrival, and scheduling at any time the job with the shortest remaining processing time (SRPT) is optimal. In contrast, the practically relevant non-clairvoyant model assumes that the processing time of a job is unknown at its arrival, and is only revealed when it completes. Non-clairvoyant flow time minimization does not admit algorithms with a constant competitive ratio. Consequently, the problem has been studied under speed augmentation (JACM'00) or with predicted processing times (STOC'21, SODA'22) to attain constant guarantees. In this paper, we consider $α$-clairvoyant scheduling, where the scheduler learns the processing time of a job once it completes an $α$-fraction of its processing time. This naturally interpolates between clairvoyant scheduling ($α=0$) and non-clairvoyant scheduling ($α=1$). By elegantly fusing two traditional algorithms, we propose a scheduling rule with a competitive ratio of $\mathcal{O}(\frac{1}{1-α})$ whenever $0 \leq α< 1$. As $α$ increases, our competitive guarantee transitions nicely (up to constants) between the previously established bounds for clairvoyant and non-clairvoyant flow time minimization. We complement this positive result with a tight randomized lower bound.

Maximal Biclique Enumeration with Improved Worst-Case Time Complexity Guarantee: A Partition-Oriented Strategy

from arXiv: Data Structures and Algorithms

Authors: Kaixin Wang, Kaiqiang Yu, Cheng Long

The maximal biclique enumeration problem in bipartite graphs is fundamental and has numerous applications in E-commerce and transaction networks. Most existing studies adopt a branch-and-bound framework, which recursively expands a partial biclique with a vertex until no further vertices can be added. Equipped with a basic pivot selection strategy, all state-of-the-art methods have a worst-case time complexity no better than $O(m\cdot (\sqrt{2})^n)$, where $m$ and $n$ are the number of edges and vertices in the graph, respectively. In this paper, we introduce a new branch-and-bound (BB) algorithm \texttt{IPS}. In \texttt{IPS}, we relax the strict stopping criterion of existing methods by allowing termination when all maximal bicliques within the current branch can be output in time proportional to their number, reducing the total number of branches required. Second, to fully unleash the power of the new termination condition, we propose an improved pivot selection strategy, which aligns well with the new termination condition to achieve better theoretical and practical performance. Formally, \texttt{IPS} improves the worst-case time complexity to $O(m\cdot α^n + n\cdot β)$, where $α(\approx 1.3954)$ is the largest positive root of $x^4-2x-1=0$ and $β$ is the number of maximal bicliques in the graph. This result surpasses that of all existing algorithms given that $α$ is strictly smaller than $\sqrt{2}$ and $β$ is at most $(\sqrt{2})^n-2$ theoretically. Furthermore, we apply an inclusion-exclusion-based framework to boost the performance of \texttt{IPS}, improving the worst-case time complexity to $O(n\cdot γ^2\cdot α^γ + γ\cdot β)$ for large sparse graphs ($γ$ is a parameter satisfying $γ\ll n$ for sparse graphs).

The Instability of all Backoff Protocols

from arXiv: Data Structures and Algorithms

Authors: Leslie Ann Goldberg, John Lapinskas

In this paper we prove Aldous's conjecture from 1987 that there is no backoff protocol that is stable for any positive arrival rate. The setting is a communication channel for coordinating requests for a shared resource. Each user who wants to access the resource makes a request by sending a message to the channel. The users don't have any way to communicate with each other, except by sending messages to the channel. The operation of the channel proceeds in discrete time steps. If exactly one message is sent to the channel during a time step then this message succeeds (and leaves the system). If multiple messages are sent during a time step then these messages collide. Each of the users that sent these messages therefore waits a random amount of time before re-sending. A backoff protocol is a randomised algorithm for determining how long to wait -- the waiting time is a function of how many collisions a message has had. Specifically, a backoff protocol is described by a send sequence $\overline{p} = (p_0,p_1,p_2,\ldots)$. If a message has had $k$ collisions before a time step then, with probability $p_k$, it sends during that time step, whereas with probability $1-p_k$ it is silent (waiting for later). The most famous backoff protocol is binary exponential backoff, where $p_k = 2^{-k}$. Under Kelly's model, in which the number of new messages that arrive in the system at each time step is given by a Poisson random variable with mean $λ$, Aldous proved that binary exponential backoff is unstable for any positive $λ$. He conjectured that the same is true for any backoff protocol. We prove this conjecture.
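
The channel model is simple enough to simulate directly. The sketch below is mine, not anything from the paper: it implements Kelly’s model with a configurable send sequence. With binary exponential backoff and an arrival rate above the classic slotted-ALOHA throughput ceiling of $1/e$, the backlog visibly grows; Aldous’s theorem, and now this paper for all protocols, says instability eventually sets in at any positive rate.

```python
import math
import random

def poisson(lam, rng):
    """Poisson sample via Knuth's method; fine for small lam."""
    threshold, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1

def simulate(send_prob, lam=0.5, steps=5000, seed=0):
    """Track the backlog: each waiting message remembers its number of
    prior collisions k and sends with probability send_prob(k)."""
    rng = random.Random(seed)
    backlog, sizes = [], []
    for _ in range(steps):
        backlog += [0] * poisson(lam, rng)      # new arrivals, zero collisions
        senders = [i for i, k in enumerate(backlog) if rng.random() < send_prob(k)]
        if len(senders) == 1:
            backlog.pop(senders[0])             # unique sender succeeds, leaves
        else:
            for i in senders:                   # two or more senders collide
                backlog[i] += 1
        sizes.append(len(backlog))
    return sizes

sizes = simulate(lambda k: 2.0 ** -k)           # binary exponential backoff
print(sum(sizes[:500]) / 500, sum(sizes[-500:]) / 500)  # early vs late backlog
```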

Precedence-Constrained Decision Trees and Coverings

from arXiv: Data Structures and Algorithms

Authors: Michał Szyfelbein, Dariusz Dereniowski

This work considers a number of optimization problems and reductive relations between them. The two main problems we are interested in are the \emph{Optimal Decision Tree} and \emph{Set Cover}. We study these two fundamental tasks under precedence constraints, that is, if a test (or set) $X$ is a predecessor of $Y$, then in any feasible decision tree $X$ needs to be an ancestor of $Y$ (or respectively, if $Y$ is added to set cover, then so must be $X$). For the Optimal Decision Tree we consider two optimization criteria: worst-case identification time (height of the tree) or the average identification time. Similarly, for the Set Cover we study two cost measures: the size of the cover or the average cover time. Our approach is to develop a number of algorithmic reductions, where an approximation algorithm for one problem provides an approximation for another via a black-box usage of a procedure for the former. En route we introduce other optimization problems either to complete the `reduction landscape' or because they hold the essence of combinatorial structure of our problems. The latter is brought by a problem of finding a maximum density precedence closed subfamily, where the density is defined as the ratio of the number of items the family covers to its size. By doing so we provide $\mathcal{O}^*(\sqrt{m})$-approximation algorithms for all of the aforementioned problems. The picture is complemented by a number of hardness reductions that provide $o(m^{1/12-ε})$-inapproximability results for the decision tree and covering problems. Besides giving a complete set of results for general precedence constraints, we also provide polylogarithmic approximation guarantees for two most typically studied and applicable precedence types, outforests and inforests. By providing corresponding hardness results, we show these results to be tight.

Testable Learning of General Halfspaces under Massart Noise

from arXiv: Data Structures and Algorithms

Authors: Ilias Diakonikolas, Giannis Iakovidis, Daniel M. Kane, Sihan Liu

We study the algorithmic task of testably learning general Massart halfspaces under the Gaussian distribution. In the testable learning setting, the aim is the design of a tester-learner pair satisfying the following properties: (1) if the tester accepts, the learner outputs a hypothesis and a certificate that it achieves near-optimal error, and (2) it is highly unlikely that the tester rejects if the data satisfies the underlying assumptions. Our main result is the first testable learning algorithm for general halfspaces with Massart noise and Gaussian marginals. The complexity of our algorithm is $d^{\mathrm{polylog}(\min\{1/γ, 1/ε\})}$, where $ε$ is the excess error and $γ$ is the bias of the target halfspace, which qualitatively matches the known quasi-polynomial Statistical Query lower bound for the non-testable setting. The analysis of our algorithm hinges on a novel sandwiching polynomial approximation to the sign function with multiplicative error that may be of broader interest.

Wednesday, February 25

Cosmin Pohoata: The Cayley-Bacharach theorem and its applications

from Gil Kalai

Cosmin Pohoata’s new and beautiful blog post presents several combinatorial applications of the Cayley–Bacharach theorem. Let me also mention his earlier post, in which he gave a new proof of Jamison’s direction tree theorem: every finite noncollinear point set in … Continue reading →

Cosmin Pohoata’s new and beautiful blog post presents several combinatorial applications of the Cayley–Bacharach theorem. Let me also mention his earlier post, in which he gave a new proof of Jamison’s direction tree theorem: every finite noncollinear point set in $\mathbb{R}^2$ admits a direction tree, that is, a spanning tree whose edges have pairwise distinct directions.

In January, after a long break, Cosmin wrote another post presenting an entropy-based proof of an interesting sum-product theorem over finite fields.

Some direction trees from Jamison’s original paper

By Gil Kalai

A Single Grain Experiment

from Ben Recht

Steampunk Data Science - Part 2

This is Part 2 of a blogged essay “Steampunk Data Science.” A table of contents is here.

Though nutrition science of his time suggested otherwise, Stephen Babcock knew you couldn’t feed cows coal. He would devote his life to finding the best diet for cattle and, in the process, precipitate the discovery of vitamins.

In 1888, Babcock moved to Madison to chair the Department of Agricultural Chemistry at the University of Wisconsin. Agriculture was big business in Wisconsin, and the university was founded as a land-grant institution tasked with emphasizing research and education in agriculture as part of its core academic mission. We all now know Wisconsin as the dairy state, but in the mid-1800s, Wisconsin’s primary export was wheat. However, after seasonal blights and competition from more southerly midwestern states like Iowa, Wisconsin farmers began to shift their emphasis. By the turn of the century, 90% of Wisconsin farms reared dairy cows. By 1915, Wisconsin produced more dairy than any other state in the country.1

This transition to dairy farming was the product of an ambitious public-private partnership between the state’s farmers, government, and university. The university had a mandate to improve the economic and social conditions of Wisconsin. As part of this mission, it established an agricultural research station on the western edge of campus. You can still visit its barns and silos. Though it sounds unusual today, the top biologists and chemists of the 1800s devoted major research efforts to improving agriculture.

A particularly important question for farmers was how to feed their livestock. Cows, in particular, are large, hungry animals and are expensive to feed. With the inherently low margins of farming, farmers wanted to spend as little on them as possible. A central research theme in 1800s Agricultural Chemistry was finding the least expensive diet that produced healthy, large cows that gave delicious milk and bore healthy offspring.

When Babcock arrived in Wisconsin, modern nutrition theory was in its infancy. Protein had been discovered in 1838, and most nutritionists understood that it was essential to a healthy diet. As often happens after a revolutionary discovery, the conventional advice from scientific experts focused too strongly on the newly discovered nutrient. Wilbur Atwater had devised a method to determine the available protein and caloric energy in foods based on nitrogen content. From this, nutritional guidelines were proposed to maximize protein at the lowest possible price.

You couldn’t buy big tubs of whey isolate in 1887, and the table above lists some potential alternatives for the protein-conscious. Each quantity of food could be purchased for 25 cents in 1887 dollars.2 If price and protein were the only concerns and intestinal distress could be dismissed, a person would eat nothing but beans.

Having grown up on a farm himself, Babcock knew this conventional wisdom was suspect. He would ridicule Atwater and his colleagues, asking why we didn’t just feed cows dung or soft coal, as both cleared the bar for available energy and nitrogen content.

Farmers’ intuitions suggested there was value in variety when it came to feeding cows. Babcock figured they must mix feeds for a reason: although the common grains had similar macronutrients, perhaps no single grain would be sufficient for healthy animals. What would happen if cows were fed a single grain? If cows were fed only wheat, or only corn, or only barley, balanced so the protein and mineral content were the same, would this still yield thriving cattle?

Babcock pressured his dean to allow him to run an experiment. He proposed purchasing some young calves, raising them on single feeds, and comparing their development. The university had ample resources to tend and manage the cows, and Babcock would oversee the scientific reporting.

The dean balked. Such an interdisciplinary project would be nothing but a headache. He’d have to marshal support from the Animal Husbandry department, and they were demanding resources for their own pet breeding projects. Cows were expensive! Couldn’t Babcock pursue research on food chemistry without raising his own animals?

Though disappointed, Babcock found he could succeed in the lab without animals. In the next few years, he devised simple tests that would revolutionize dairy farming. His most famous discovery was a simple procedure to test the fat content of milk. This test could be done by civilians and allowed people to determine whether impurities had been added to the milk to dilute it or if someone had skimmed cream off the top. Moreover, farmers could use Babcock’s test to select cows that produced higher fat milk for breeding. In addition to his innovations, Babcock was an engaging, committed teacher and held forums with farmers to discuss best practices and current challenges. By 1900, Babcock had become a Wisconsin celebrity. He was a fixture in both the academic and farming circles across the state.3

Despite his extraordinary academic success and impact, Babcock never gave up on his single-grain experiment. He finally got his way upon his retirement in 1907, when he was succeeded by Edwin Hart. Hart loved Babcock’s research proposal and marshaled the resources for the experiment.

The university purchased 16 cows and divided them into four groups, each fed a special diet. The first group was fed only corn, the second only wheat, the third only oats, and the fourth an equal combination of the three grains. Then they waited to see what would happen.

1

My first job as a professor was at the University of Wisconsin, and that’s also part of why I was so drawn to this story. They named streets after Babcock! If you don’t have the pleasure of visiting Madison, you can read more on the history of the economy of Wisconsin and the public-private partnership with the university in

John D Buenker. (2013). The History of Wisconsin, Volume IV: The Progressive Era, 1893-1914, volume 4. Wisconsin Historical Society.

2

This table was originally assembled by Atwater and printed in the Century Magazine, one of the most popular magazines of the late 1800s. For more on these early developments in nutrition, see

Kenneth J Carpenter. (2003). A short history of nutritional science: part 2 (1885–1912). The Journal of Nutrition, 133(4):975–984. https://jn.nutrition.org/article/S0022-3166(22)15713-5/fulltext.

3

My biography of Babcock is pieced together from recollections of his collaborators. In particular:

Edwin Bret Hart. (1949). Stephen Moulton Babcock. The Journal of Nutrition, 37(1):1–7. doi:10.1093/jn/37.1.1.

Elmer Verner McCollum. (1964). From Kansas Farm Boy to Scientist: The Autobiography of Elmer Verner McCollum. University of Kansas Press.

Harry L. Russell, Glenn Frank, and A. S. Alexander. (1943). Stephen Moulton Babcock, man of science: a memorial to him in observance of the centenary of his birth. Wisconsin Alumni Research Foundation. https://digital.library.wisc.edu/1711.dl/FPJBPAWEVQOFZ87.

By Ben Recht

A Probability Challenge

from Computational Complexity

Last week I had the pleasure of meeting Alex Bellos in Oxford. Among other things, Bellos writes the Guardian Monday puzzle column. He gave me a copy of his latest book, Puzzle Me Twice, a collection of puzzles where the obvious answer is not the correct one. I got more right than wrong, but I hated being wrong. Here is one of those puzzles, Sistery Mystery (page 28), which is a variation of a puzzle from Rob Eastaway.

Puzzle 1: Suppose the probability of a girl is 51% independently and uniformly over all children. In expectation, who has more sisters, a boy or a girl?

Go ahead and try to solve this before reading further.

In any family with both boys and girls, each boy will have one more sister than each girl. For example, in a family with four girls, each boy has four sisters while each girl has only three. Thus boys have more sisters on average.

Wrong, it's exactly the same. To see this, consider Alex, the sixth child of a ten-child family. The number of Alex's sisters is independent of Alex's gender. This is a pretty robust result: it doesn't depend on the probability of a child being a girl, on whether we allow non-binary children, or on the distributions being identical, say if the probability of a girl is higher for later kids in a family. All you need is independence.

So what's wrong with my initial intuition that in every family boys have more sisters than girls? Eastaway suggests this gets balanced by families of a single gender, but that happens rarely for large families. Instead it's a variation of Simpson's paradox: the naive argument doesn't account for the fact that girls are overrepresented in girl-heavy families. Consider a family of two boys and eight girls. Each of the two boys has eight sisters, but there are four times as many girls, each with seven sisters, so the family contributes more weight to the girls' average than to the boys'.
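
A quick Monte Carlo check makes this concrete. The sketch below is mine, not from the post; the family-size distribution (uniform on 1 to 8 children) is an arbitrary choice, since independence is all that matters.

```python
import random

def expected_sisters(trials=200_000, p_girl=0.51, max_kids=8, seed=0):
    """Average number of sisters of a random boy vs. a random girl,
    with genders independent across children."""
    rng = random.Random(seed)
    boy_total = boy_count = girl_total = girl_count = 0
    for _ in range(trials):
        n = rng.randint(1, max_kids)   # arbitrary family-size distribution
        girls = sum(rng.random() < p_girl for _ in range(n))
        boys = n - girls
        boy_total += boys * girls            # each boy has `girls` sisters
        boy_count += boys
        girl_total += girls * (girls - 1)    # each girl excludes herself
        girl_count += girls
    return boy_total / boy_count, girl_total / girl_count

print(expected_sisters())  # the two averages agree, about 2.38 here
```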

If you lose independence the solution may not hold, for example if we have identical twins. 

I'll leave you with one more puzzle.

Puzzle 2: Suppose you are in a country where each family has children until they get their first boy. In this country, do boys or girls have more sisters on average?

Answer below.

In Puzzle 2 we lose independence: if Alex is a girl, she's more likely to be in a family with many girls. Indeed, if boys and girls are equally likely, working out the infinite sums shows that in expectation a boy has one sister and a girl has two.
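
To spell out those sums: a family ends up with $g$ girls followed by one boy with probability $2^{-(g+1)}$, so the lone boy's expected number of sisters is $\sum_{g\ge 0} g\cdot 2^{-(g+1)} = 1$. A randomly chosen girl, on the other hand, is $g$ times as likely to come from a family with $g$ girls, so her expected number of sisters is $\sum_g g(g-1)2^{-(g+1)} \big/ \sum_g g\cdot 2^{-(g+1)} = (E[G^2]-E[G])/E[G] = (3-1)/1 = 2$.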

By Lance Fortnow

Polynomial Identity Testing and Reconstruction for Depth-4 Powering Circuits of High Degree

from arXiv: Computational Complexity

Authors: Amir Shpilka, Yann Tal

We study deterministic polynomial identity testing (PIT) and reconstruction algorithms for depth-$4$ arithmetic circuits of the form \[ Σ^{[r]}\!\wedge^{[d]}\!Σ^{[s]}\!Π^{[δ]}. \] This model generalizes Waring decompositions and diagonal circuits, and captures sums of powers of low-degree sparse polynomials. Specifically, each circuit computes a sum of $r$ terms, where each term is a $d$-th power of an $s$-sparse polynomial of degree $δ$. This model also includes algebraic representations that arise in tensor decomposition and moment-based learning tasks such as mixture models and subspace learning. We give deterministic worst-case algorithms for PIT and reconstruction in this model. Our PIT construction applies when $d>r^2$ and yields explicit hitting sets of size $O(r^4 s^4 n^2 d δ^3)$. The reconstruction algorithm runs in time $\textrm{poly}(n,s,d)$ under the condition $d=Ω(r^4δ)$, and in particular it tolerates polynomially large top fan-in $r$ and bottom degree $δ$. Both results hold over fields of characteristic zero and over fields of sufficiently large characteristic. These algorithms provide the first polynomial-time deterministic solutions for depth-$4$ powering circuits with unbounded top fan-in. In particular, the reconstruction result improves upon previous work which required non-degeneracy or average-case assumptions. The PIT construction relies on the ABC theorem for function fields (Mason-Stothers theorem), which ensures linear independence of high-degree powers of sparse polynomials after a suitable projection. The reconstruction algorithm combines this with Wronskian-based differential operators, structural properties of their kernels, and a robust version of the Klivans-Spielman hitting set.

Markets are competitive if and only if P != NP

from arXiv: Computational Complexity

Authors: Philip Z. Maymin

I prove that competitive market outcomes require computational intractability. If P = NP, firms can efficiently solve the collusion detection problem, identifying deviations from cooperative agreements in complex, noisy markets and thereby making collusion sustainable as an equilibrium. If P != NP, the collusion detection problem is computationally infeasible for markets satisfying a natural instance-hardness condition on their demand structure, rendering punishment threats non-credible and collusion unstable. Combined with Maymin (2011), who proved that market efficiency requires P = NP, this yields a fundamental impossibility: markets can be informationally efficient or competitive, but not both. Artificial intelligence, by expanding firms' computational capabilities, is pushing markets from the competitive regime toward the collusive regime, explaining the empirical emergence of algorithmic collusion without explicit coordination.

Exponential Lower Bounds for 2-query Relaxed Locally Decodable Codes

from arXiv: Computational Complexity

Authors: Alexander R. Block, Jeremiah Blocki, Kuan Cheng, Elena Grigorescu, Xin Li, Yu Zheng, Minshen Zhu

Locally Decodable Codes (LDCs) are error-correcting codes $C\colonΣ^n\rightarrow Σ^m$, encoding \emph{messages} in $Σ^n$ to \emph{codewords} in $Σ^m$, with super-fast decoding algorithms. They are important mathematical objects in many areas of theoretical computer science, yet the best constructions so far have codeword length $m$ that is super-polynomial in $n$, for codes with constant query complexity and constant alphabet size. In a very surprising result, Ben-Sasson, Goldreich, Harsha, Sudan, and Vadhan (SICOMP 2006) showed how to construct a relaxed version of LDCs (RLDCs) with constant query complexity and almost linear codeword length over the binary alphabet, and used them to obtain significantly improved constructions of Probabilistically Checkable Proofs. In this work, we study RLDCs in the standard Hamming-error setting. We prove an exponential lower bound on the length of Hamming RLDCs making $2$ queries (even adaptively) over the binary alphabet. This answers a question explicitly raised by Gur and Lachish (SICOMP 2021) and is the first exponential lower bound for RLDCs. Combined with the results of Ben-Sasson et al., our result exhibits a ``phase-transition''-type behavior in the codeword length at some constant query complexity. We achieve these lower bounds via a transformation of RLDCs to standard Hamming LDCs, using a careful analysis of restrictions of message bits that fix codeword bits.
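
For readers unfamiliar with the relaxation, the standard guarantee (stated informally here, following the usual definition rather than any formulation specific to this paper) allows the decoder to output a special symbol $⊥$ on corrupted words instead of being forced to commit:

\[ \mathrm{Dec}^{C(x)}(i) = x_i \text{ for all } i, \qquad \Pr\bigl[\mathrm{Dec}^{w}(i) \in \{x_i, ⊥\}\bigr] \ge 2/3 \text{ whenever } \mathrm{dist}(w, C(x)) \le δm. \]

That is, on an uncorrupted codeword the decoder must always be correct, while on a mildly corrupted word it may refuse to answer but must rarely answer wrongly.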

Incomplete Open Platonic Solids

from arXiv: Computational Geometry

Authors: Mikael Vejdemo-Johansson

Sol LeWitt famously enumerated all the incomplete open cubes, finding 122 of these connected, non-planar subsets of the edges of the cube. Since then, while several projects have revisited the cube enumeration, no such enumeration has been published for any other interesting solid. In this paper we present work on enumerating all the incomplete open Platonic solids, finding 6 tetrahedra, 122 cubes (just like LeWitt), 185 octahedra, 2,423,206 dodecahedra and 16,096,166 icosahedra.
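
As a sense of what such an enumeration involves, here is a small self-contained sketch (our illustration, not the paper's code) that counts connected, non-planar edge subsets of the cube up to rotation; the exact conventions (which symmetries identify two pieces, minimum size) are assumptions on our part, so the resulting count need not be exactly 122:

```python
from itertools import combinations, permutations, product

# Cube vertices at (±1, ±1, ±1); edges join vertices differing in one coordinate.
verts = list(product((-1, 1), repeat=3))
edges = [frozenset((u, v)) for u, v in combinations(verts, 2)
         if sum(a != b for a, b in zip(u, v)) == 1]          # 12 edges

# The 24 rotations of the cube: signed coordinate permutations of determinant +1.
def rotations():
    for perm in permutations(range(3)):
        inv = sum(perm[i] > perm[j] for i in range(3) for j in range(i + 1, 3))
        for signs in product((-1, 1), repeat=3):
            if (-1) ** inv * signs[0] * signs[1] * signs[2] == 1:
                yield perm, signs

ROTS = list(rotations())

def apply(rot, v):
    perm, signs = rot
    return tuple(signs[i] * v[perm[i]] for i in range(3))

def connected(sub):
    vs = set().union(*sub)
    seen, stack = set(), [next(iter(vs))]
    while stack:
        u = stack.pop()
        if u not in seen:
            seen.add(u)
            stack += [w for e in sub if u in e for w in e if w != u]
    return seen == vs

def nonplanar(sub):
    vs = list(set().union(*sub))
    d = [tuple(a - b for a, b in zip(v, vs[0])) for v in vs[1:]]
    det = lambda a, b, c: (a[0] * (b[1] * c[2] - b[2] * c[1])
                           - a[1] * (b[0] * c[2] - b[2] * c[0])
                           + a[2] * (b[0] * c[1] - b[1] * c[0]))
    return any(det(a, b, c) for a, b, c in combinations(d, 3))

def canon(sub):  # lexicographically least rotated copy, as a hashable normal form
    return min(tuple(sorted(tuple(sorted(apply(r, v) for v in e)) for e in sub))
               for r in ROTS)

classes = {canon(sub)
           for k in range(3, 12)                  # "incomplete": not all 12 edges
           for sub in combinations(edges, k)
           if connected(sub) and nonplanar(sub)}
print(len(classes))
```

The dodecahedron and icosahedron each have 30 edges, so the same brute force would face $2^{30}$ subsets, which suggests why the two largest counts above call for more careful orbit enumeration.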

Frontier Space-Time Algorithms Using Only Full Memory

from arXiv: Data Structures and Algorithms

Authors: Petr Chmel, Aditi Dudeja, Michal Koucký, Ian Mertz, Ninad Rajgopal

We develop catalytic algorithms for fundamental problems in algorithm design that run in polynomial time, use only $\mathcal{O}(\log(n))$ workspace, and use sublinear catalytic space matching the best-known space bounds of non-catalytic algorithms running in polynomial time. First, we design a polynomial time algorithm for directed $s$-$t$ connectivity using $n \big/ 2^{Θ(\sqrt{\log n})}$ catalytic space, which matches the state-of-the-art time-space bounds in the non-catalytic setting [Barnes et al., 1998], and improves the catalytic space usage of the best known algorithm [Cook and Pyne, 2026]. Furthermore, using only $\mathcal{O}(\log(n))$ random bits we get a randomized algorithm whose running time nearly matches the fastest time bounds known for space-unrestricted algorithms. Second, we design polynomial time algorithms for the problems of computing Edit Distance, Longest Common Subsequence, and the Discrete Fréchet Distance, again using $n \big/ 2^{Θ(\sqrt{\log n})}$ catalytic space. This again matches the non-catalytic time-space frontier for Edit Distance and Longest Common Subsequence [Kiyomi et al., 2021].
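
The defining constraint of catalytic computation is that the large auxiliary memory is full of arbitrary data that must be restored exactly when the algorithm halts. A toy illustration of the reversible-update trick that makes borrowed memory usable (our sketch, not the authors' algorithm):

```python
def borrow_register(r, x, y):
    """Compute x*y using a 'full' register r[0] of unknown content,
    restoring r exactly before returning (the catalytic contract)."""
    r[0] += x * y        # reversible update into borrowed memory
    out = r[0]           # ...the borrowed value can feed further computation...
    r[0] -= x * y        # undo the update: catalytic memory restored
    return out - r[0]    # equals x*y regardless of r[0]'s initial content

mem = [12345]            # arbitrary pre-existing content
assert borrow_register(mem, 6, 7) == 42 and mem == [12345]
```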

A Space-space Trade-off for Directed st-Connectivity

from arXiv: Data Structures and Algorithms

Authors: Roman Edenhofer

We prove a space-space trade-off for directed $st$-connectivity in the catalytic space model. For any integer $k \leq n$, we give an algorithm that decides directed $st$-connectivity using $O(\log n \cdot \log k+\log n)$ regular workspace and $O\left(\frac{n}{k} \cdot \log^2 n\right)$ bits of catalytic memory. This interpolates between the classical $O(\log^2 n)$-space bound from Savitch's algorithm and a catalytic endpoint with $O(\log n)$ workspace and $O(n\cdot \log^2 n)$ catalytic memory. As a warm-up, we present a catalytic variant of Savitch's algorithm achieving the endpoint above. Up to logarithmic factors, this matches the smallest catalyst size currently known for catalytic logspace algorithms, due to Cook and Pyne (ITCS 2026). Our techniques also extend to counting the number of walks from $s$ to $t$ of a given length $\ell\leq n$.
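
Plugging the extreme values of $k$ into these bounds makes the interpolation explicit:

\[ k = 1:\ O(\log n) \text{ workspace},\ O(n \log^2 n) \text{ catalytic bits}; \qquad k = n:\ O(\log^2 n) \text{ workspace},\ O(\log^2 n) \text{ catalytic bits}, \]

so the first endpoint is the catalytic regime described above, while the second recovers Savitch's $O(\log^2 n)$ space bound (with only a polylogarithmic catalyst).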

Statistical Query Lower Bounds for Smoothed Agnostic Learning

from arXiv: Data Structures and Algorithms

Authors: Ilias Diakonikolas, Daniel M. Kane

We study the complexity of smoothed agnostic learning, recently introduced by~\cite{CKKMS24}, in which the learner competes with the best classifier in a target class under slight Gaussian perturbations of the inputs. Specifically, we focus on the prototypical task of agnostically learning halfspaces under subgaussian distributions in the smoothed model. The best known upper bound for this problem relies on $L_1$-polynomial regression and has complexity $d^{\tilde{O}(1/σ^2) \log(1/ε)}$, where $σ$ is the smoothing parameter and $ε$ is the excess error. Our main result is a Statistical Query (SQ) lower bound providing formal evidence that this upper bound is close to best possible. In more detail, we show that (even for Gaussian marginals) any SQ algorithm for smoothed agnostic learning of halfspaces requires complexity $d^{Ω(1/σ^{2}+\log(1/ε))}$. This is the first non-trivial lower bound on the complexity of this task and nearly matches the known upper bound. Roughly speaking, we show that applying $L_1$-polynomial regression to a smoothed version of the function is essentially best possible. Our techniques involve finding a moment-matching hard distribution by way of linear programming duality. This dual program corresponds exactly to finding a low-degree approximating polynomial to the smoothed version of the target function (which turns out to be the same condition required for the $L_1$-polynomial regression to work). Our explicit SQ lower bound then comes from proving lower bounds on this approximation degree for the class of halfspaces.
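
The duality at work is the classical correspondence between best $L_1$-approximation and bounded dual witnesses; schematically (our rendering, with the distributional details suppressed):

\[ \min_{\deg p \le k} \mathbb{E}_x\bigl[|f_σ(x) - p(x)|\bigr] \;=\; \max\Bigl\{ \mathbb{E}_x[g(x) f_σ(x)] \,:\, \|g\|_\infty \le 1,\ \mathbb{E}_x[g(x) q(x)] = 0 \text{ for all } \deg q \le k \Bigr\}, \]

so any dual witness $g$, suitably reweighted, yields a distribution matching the low-degree moments of the reference distribution, which is how the hard instance is extracted.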

Complexity of Classical Acceleration for $\ell_1$-Regularized PageRank

from arXiv: Data Structures and Algorithms

Authors: Kimon Fountoulakis, David Martínez-Rubio

We study the degree-weighted work required to compute $\ell_1$-regularized PageRank using the standard one-gradient-per-iteration accelerated proximal-gradient method (FISTA). For non-accelerated local methods, the best known worst-case work scales as $\widetilde{O} ((αρ)^{-1})$, where $α$ is the teleportation parameter and $ρ$ is the $\ell_1$-regularization parameter. A natural question is whether FISTA can improve the dependence on $α$ from $1/α$ to $1/\sqrt{α}$ while preserving the $1/ρ$ locality scaling. The challenge is that acceleration can break locality by transiently activating nodes that are zero at optimality, thereby increasing the cost of gradient evaluations. We analyze FISTA on a slightly over-regularized objective and show that, under a checkable confinement condition, all spurious activations remain inside a boundary set $\mathcal{B}$. This yields a bound consisting of an accelerated $(ρ\sqrt{α})^{-1}\log(α/\varepsilon)$ term plus a boundary overhead $\sqrt{\mathrm{vol}(\mathcal{B})}/(ρα^{3/2})$. We provide graph-structural conditions that imply such confinement. Experiments on synthetic and real graphs show the resulting speedup and slowdown regimes under the degree-weighted work model.
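
For orientation, here is the standard one-gradient-per-iteration FISTA loop with an $\ell_1$ proximal step, applied to a stand-in quadratic (a minimal sketch; the paper's degree-weighted PageRank objective and its locality bookkeeping are not modeled here):

```python
import numpy as np

def soft_threshold(v, t):
    # proximal operator of t * ||.||_1
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def fista(grad, lam, L, x0, iters=500):
    """Beck--Teboulle FISTA for min_x f(x) + lam*||x||_1, one gradient of the
    smooth part f per iteration; L is a Lipschitz constant of grad f."""
    x, y, t = x0.copy(), x0.copy(), 1.0
    for _ in range(iters):
        x_next = soft_threshold(y - grad(y) / L, lam / L)
        t_next = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        y = x_next + ((t - 1.0) / t_next) * (x_next - x)
        x, t = x_next, t_next
    return x

# Tiny usage example on a stand-in quadratic f(x) = 0.5*||Ax - b||^2.
A = np.array([[3.0, 0.2], [0.2, 2.0]])
b = np.array([1.0, -1.0])
grad = lambda x: A.T @ (A @ x - b)
L = np.linalg.norm(A, 2) ** 2          # Lipschitz constant of the gradient
print(fista(grad, lam=0.1, L=L, x0=np.zeros(2)))
```

The locality issue discussed in the abstract corresponds, in this sketch, to the momentum term in $y$ driving iterates to activate coordinates that vanish at the optimum.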

Ski Rental with Distributional Predictions of Unknown Quality

from arXiv: Data Structures and Algorithms

Authors: Qiming Cui, Michael Dinitz

We revisit the central online problem of ski rental in the "algorithms with predictions" framework from the point of view of distributional predictions. Ski rental was one of the first problems to be studied with predictions, where a natural prediction is simply the number of ski days. But it is both more natural and potentially more powerful to think of a prediction as a distribution $\hat{p}$ over the ski days. If the true number of ski days is drawn from some true (but unknown) distribution $p$, then we show as our main result that there is an algorithm with expected cost at most $\mathrm{OPT} + O(\min(\max(η, 1) \cdot \sqrt{b},\, b \log b))$, where $\mathrm{OPT}$ is the expected cost of the optimal policy for the true distribution $p$, $b$ is the cost of buying, and $η$ is the Earth Mover's (Wasserstein-1) distance between $p$ and $\hat{p}$. Note that when $η = o(\sqrt{b})$ this gives additive loss less than $b$ (the trivial bound), and when $η$ is arbitrarily large (corresponding to an extremely inaccurate prediction) we still do not pay more than $O(b \log b)$ additive loss. An implication of these bounds is that our algorithm has consistency $O(\sqrt{b})$ (additive loss when the prediction error is $0$) and robustness $O(b \log b)$ (additive loss when the prediction error is arbitrarily large). Moreover, we do not need to assume that we know (or have any bound on) the prediction error $η$, in contrast with previous work in robust optimization which assumes that we know this error. We complement this upper bound with a variety of lower bounds showing that it is essentially tight: not only can the consistency/robustness tradeoff not be improved, but our particular loss function cannot be meaningfully improved.
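
As a baseline for what "the optimal policy for a distribution" means here, note that against a known day distribution the optimal policy can be taken to be a threshold rule (rent through day $k-1$, buy on day $k$), so it is computable by direct search. A sketch of this baseline only; the paper's algorithm must additionally hedge against the unknown prediction error $η$:

```python
def expected_cost(p, b, k):
    """Expected cost of renting through day k-1 and buying on day k, when the
    number of ski days is distributed as p (dict: days -> prob), buying costs b,
    and renting is normalized to cost 1 per day."""
    return sum(q * (d if d < k else (k - 1) + b) for d, q in p.items())

def best_threshold(p, b):
    # k ranges over 1..max(p)+1; k = max(p)+1 amounts to never buying
    return min(range(1, max(p) + 2), key=lambda k: expected_cost(p, b, k))

# Usage with a hypothetical predicted distribution and buy cost 8.
p_hat = {5: 0.2, 10: 0.5, 15: 0.3}
k = best_threshold(p_hat, 8)
print(k, expected_cost(p_hat, 8, k))
```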

Asymptotics of solutions to the linear search problem

from arXiv: Data Structures and Algorithms

Authors: Robin A. Heinonen

The exact leading asymptotics of solutions to the symmetric linear search problem are obtained for any positive probability density on the real line with a monotonic, sufficiently regular tail. A similar result holds for densities on a compact interval.

A $2$-branching construction for the $χ\leq 2r$ bound

from arXiv: Data Structures and Algorithms

Authors: Vinicius Tikara Venturi Date, Leandro Miranda Zatesko

The string repetitiveness measures $χ$ (the size of a smallest suffixient set of a string) and $r$ (the number of runs in the Burrows--Wheeler Transform) are related. Recently, we have shown that the bound $χ\leq 2r$, proved by Navarro et al., is asymptotically tight as the size $σ$ of the alphabet increases, but achieving near-tight ratios for fixed $σ> 2$ remained open. We introduce a \emph{2-branching property}: a cyclic string is 2-branching at order~$k$ if every $(k{-}1)$-length substring admits exactly two $k$-length extensions. We show that 2-branching strings of order~$k$ yield closed-form ratios $χ/r = (2σ^{k-1}+1)/(σ^{k-1}+4)$. For order~$3$, we give an explicit construction for every $σ\geq 2$, narrowing the gap to~$2$ from $O(1/σ)$ to $O(1/σ^2)$. For $σ\in \{3,4\}$, we additionally present order-$5$ instances with ratios exceeding~$1.91$.
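
For order $k = 3$ the claimed gap follows directly from the closed form:

\[ 2 - \frac{2σ^{2}+1}{σ^{2}+4} \;=\; \frac{2(σ^{2}+4) - (2σ^{2}+1)}{σ^{2}+4} \;=\; \frac{7}{σ^{2}+4} \;=\; O(1/σ^{2}), \]

and the same computation at order $k$ gives a gap of $7/(σ^{k-1}+4)$; for instance, at order $5$ with $σ = 3$ the ratio is $2 - 7/85 \approx 1.9176$, consistent with the ratios exceeding $1.91$ reported above.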

Adversarial Robustness on Insertion-Deletion Streams

from arXiv: Data Structures and Algorithms

Authors: Elena Gribelyuk, Honghao Lin, David P. Woodruff, Huacheng Yu, Samson Zhou

We study adversarially robust algorithms for insertion-deletion (turnstile) streams, where future updates may depend on past algorithm outputs. While robust algorithms exist for insertion-only streams with only a polylogarithmic overhead in memory over non-robust algorithms, it was widely conjectured that turnstile streams of length polynomial in the universe size $n$ require space linear in $n$. We refute this conjecture, showing that robustness can be achieved using space which is significantly sublinear in $n$. Our framework combines multiple linear sketches in a novel estimator-corrector-learner framework, yielding the first insertion-deletion algorithms that approximate: (1) the second moment $F_2$ up to a $(1+\varepsilon)$-factor in polylogarithmic space, (2) any symmetric function $\mathcal{F}$ with an $\mathcal{O}(1)$-approximate triangle inequality up to a $2^{\mathcal{O}(C)}$ factor in $\tilde{\mathcal{O}}(n^{1/C}) \cdot S(n)$ bits of space, where $S$ is the space required to approximate $\mathcal{F}$ non-robustly; this includes a broad class of functions such as the $L_1$-norm, the support size $F_0$, and non-normed losses such as the $M$-estimators, and (3) $L_2$ heavy hitters. For the $F_2$ moment, our algorithm is optimal up to $\textrm{poly}((\log n)/\varepsilon)$ factors. Given the recent results of Gribelyuk et al. (STOC, 2025), this shows an exponential separation between linear sketches and non-linear sketches for achieving adversarial robustness in turnstile streams.
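
The linear sketches underpinning such frameworks are classical; for example, the AMS sketch for $F_2$ is a random linear map whose coordinates square to unbiased estimates (a minimal non-robust sketch for illustration; the paper's contribution is the estimator-corrector-learner machinery wrapped around such sketches):

```python
import random

class AMSF2:
    """Classic AMS linear sketch for F_2 = sum_i f_i^2 over a turnstile stream.
    Each coordinate z_r = <sign_r, f> is linear in the frequency vector, so
    insertions and deletions are handled identically."""
    def __init__(self, reps, n, seed=0):
        rng = random.Random(seed)
        # truly random signs for simplicity; 4-wise independence suffices in theory
        self.signs = [[rng.choice((-1, 1)) for _ in range(n)] for _ in range(reps)]
        self.z = [0] * reps
    def update(self, i, delta):           # frequency of item i changes by delta
        for r, row in enumerate(self.signs):
            self.z[r] += row[i] * delta
    def estimate(self):
        # mean of squares is unbiased; median-of-means gives high-prob. bounds
        return sum(z * z for z in self.z) / len(self.z)

sk = AMSF2(reps=99, n=1000, seed=1)
for i, d in [(3, 5), (7, 2), (3, -5)]:    # insertions and a deletion
    sk.update(i, d)
print(sk.estimate())                       # true F_2 is 2^2 = 4
```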

Turing Completeness of GNU find: From mkdir-assisted Loops to Standalone Computation

from arXiv: Data Structures and Algorithms

Authors: Keigo Oka

The Unix command \texttt{find} is among the first commands taught to beginners, yet remains indispensable for experienced engineers. In this paper, we demonstrate that \texttt{find} possesses unexpected computational power, establishing three Turing completeness results using the GNU implementation (a standard in Linux distributions). (1) \texttt{find} + \texttt{mkdir} (a system that has only \texttt{find} and \texttt{mkdir}) is Turing complete: by encoding computational states as directory paths and using regex back-references to copy substrings, we simulate 2-tag systems. (2) GNU \texttt{find} 4.9.0+ alone is Turing complete: by reading and writing to files during traversal, we simulate a two-counter machine without \texttt{mkdir}. (3) \texttt{find} + \texttt{mkdir} without regex back-references is still Turing complete: by a trick of encoding regex patterns directly into directory names, we achieve the same power. These results place \texttt{find} among the ``surprisingly Turing-complete'' systems, highlighting the hidden complexity within seemingly simple standard utilities.
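
To see what is being simulated in result (1), here is a 2-tag system in a few lines of Python; per the abstract, the paper encodes the evolving word of such a system into directory paths that \texttt{find} traverses (the productions below are a well-known example, not from the paper):

```python
def run_2tag(productions, word, max_steps=10_000):
    """A 2-tag system: read the first symbol, append its production, delete the
    first two symbols; halt when the word is too short or has no production."""
    for _ in range(max_steps):
        if len(word) < 2 or word[0] not in productions:
            return word
        word = word[2:] + productions[word[0]]
    raise RuntimeError("step budget exhausted (tag systems need not halt)")

# De Mol's Collatz tag system: starting from a^n, the word tracks the
# Collatz iteration of n, halting once it shrinks below length 2.
print(run_2tag({'a': 'bc', 'b': 'a', 'c': 'aaa'}, 'aaa'))
```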

Online Algorithms with Unreliable Guidance

from arXiv: Data Structures and Algorithms

Authors: Julien Dallot, Yuval Emek, Yuval Gil, Maciej Pacut, Stefan Schmid

This paper introduces a new model for ML-augmented online decision making, called online algorithms with unreliable guidance (OAG). This model cleanly separates the predictive and algorithmic components, thus offering a single well-defined analysis framework that relies solely on the considered problem. Formulated through the lens of request-answer games, an OAG algorithm receives, with each incoming request, a piece of guidance taken from the problem's answer space; ideally, this guidance is the optimal answer for the current request; with probability $β$, however, the guidance is adversarially corrupted. The goal is to develop OAG algorithms that admit good competitiveness when $β = 0$ (a.k.a. consistency) as well as when $β = 1$ (a.k.a. robustness); the appealing notion of smoothness, which in most prior work required a dedicated loss function, now arises naturally as $β$ shifts from $0$ to $1$. We then describe a systematic method, called the drop or trust blindly (DTB) compiler, which transforms any online algorithm into a learning-augmented online algorithm in the OAG model. Given a prediction-oblivious online algorithm, its learning-augmented counterpart produced by applying the DTB compiler either follows the incoming guidance blindly or ignores it altogether and proceeds as the initial algorithm would have; the choice between these two alternatives is based on the outcome of a (biased) coin toss. As our main technical contribution, we prove (rigorously) that although remarkably simple, the class of algorithms produced via the DTB compiler includes algorithms with attractive consistency-robustness guarantees for three classic online problems: for caching and uniform metrical task systems our algorithms are optimal, whereas for bipartite matching (with adversarial arrival order), our algorithm outperforms the state-of-the-art.
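
Read literally, the DTB compiler is almost a one-liner; a minimal sketch under our assumption that the coin is tossed once per run rather than per request (the abstract does not specify, and the interface names below are hypothetical):

```python
import random

def dtb_compile(base_algorithm, trust_prob, rng=None):
    """Drop or Trust Blindly: a single biased coin toss decides whether the
    produced algorithm follows guidance verbatim on every request or ignores
    guidance entirely and answers as the base online algorithm would."""
    rng = rng or random.Random()
    trust = rng.random() < trust_prob
    def algorithm(request, guidance, state):
        return guidance if trust else base_algorithm(request, state)
    return algorithm

# Usage with a trivial stand-in base algorithm.
base = lambda request, state: "rent"
alg = dtb_compile(base, trust_prob=0.5, rng=random.Random(7))
print(alg("day 1", guidance="buy", state={}))
```

Presumably the consistency-robustness trade-off then comes from tuning the coin's bias against the corruption rate $β$.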