Last Update

OPML feed of all feeds.

Subscribe to the Atom feed, RSS feed to stay up to date.

Thank you to arXiv for use of its open access interoperability.

Note: the date of arXiv entries announced right after publication holidays might incorrectly show up as the date of the publication holiday itself. This is due to our ad hoc method of inferring announcement dates, which are not returned by the arXiv API.

Powered by Pluto.

Source on GitHub.

Maintained by Nima Anari, Arnab Bhattacharyya, Gautam Kamath.

Theory of Computing Report

Monday, May 25

Recursion and proof theoretical characterizations of small circuit classes with modulo counting via discrete differential equations (long version)

from arXiv: Computational Complexity

Authors: Melissa Antonelli, Arnaud Durand, Rui Li

The paper proposes an implicit (i.e., machine-independent) complexity approach to studying computation by polynomial-size, constant-depth circuits with gates counting modulo a constant through the lens of discrete ordinary differential equations (ODEs). So far, recursion-theoretic characterizations have been provided for functions computed by circuits of constant depth, including gates counting modulo 2 and 6 only (i.e., for the classes FAC0[2] and FAC0[6], resp.). In this paper, it is shown that considering ODE schemas, rather than bounded recursion, allows for a more fine-grained analysis, leading to (uniform) characterizations for all classes FAC0[n] (n \in N), i.e. functions computed by circuits including counting modulo n gates. Inspired by the syntactic form of the ODE schemas, we go further in this direction and present first-order bounded theories for capturing provably total functions in each of these classes.

Authors: Melissa Antonelli, Arnaud Durand, Rui Li

The paper proposes an implicit (i.e., machine-independent) complexity approach to studying computation by polynomial-size, constant-depth circuits with gates counting modulo a constant through the lens of discrete ordinary differential equations (ODEs). So far, recursion-theoretic characterizations have been provided for functions computed by circuits of constant depth, including gates counting modulo 2 and 6 only (i.e., for the classes FAC0[2] and FAC0[6], resp.). In this paper, it is shown that considering ODE schemas, rather than bounded recursion, allows for a more fine-grained analysis, leading to (uniform) characterizations for all classes FAC0[n] (n \in N), i.e. functions computed by circuits including counting modulo n gates. Inspired by the syntactic form of the ODE schemas, we go further in this direction and present first-order bounded theories for capturing provably total functions in each of these classes.

Probabilistically checkable proofs for the Existential Theory of the Reals

from arXiv: Computational Complexity

Authors: Jack Stade

We prove a PCP theorem for the existential theory of the reals, showing that MAX-ETR-INV is $\exists\mathbb{R}$-hard to approximate to within some constant factor. The existential theory of the reals (ETR) is a decision problem asking if there exists a set of real-valued variables satisfying some constraints involving polynomials and inequalities, and $\exists\mathbb{R}$ is the complexity class of problems polynomial-time reducible to ETR. Many important geometric problems are known to be $\exists\mathbb{R}$-complete. $\exists\mathbb{R}$-hardness results frequently work by a reduction from the $\exists\mathbb{R}$-complete problem ETR-INV, which asks if there is a an assignment of real variables each in the interval $[\frac12, 2]$ satisfying some constraints of form $x=1$, $xy=1$ and $x+y=z$. MAX-ETR-INV is a related optimization problem that asks, given a set of constraints of form $x=1$, $xy=1$, and $x+y=z$, for a feasible (that is, satisfiable with variables in $[\frac12, 2]$) subset of those constraints of the largest possible size. We show that there is some constant $ε>0$ such that it is $\exists\mathbb{R}$-hard to approximate MAX-ETR-INV better than a $1-ε$ factor. This means that even a non-deterministic polynomial-time algorithm can't approximate MAX-ETR-INV better than this factor unless $\exists\mathbb{R}=\text{NP}$. We also give a polynomial-time $8$-factor approximation algorithm and a non-deterministic-polynomial-time $2$-factor approximation algorithm for MAX-ETR-INV.

Authors: Jack Stade

We prove a PCP theorem for the existential theory of the reals, showing that MAX-ETR-INV is $\exists\mathbb{R}$-hard to approximate to within some constant factor. The existential theory of the reals (ETR) is a decision problem asking if there exists a set of real-valued variables satisfying some constraints involving polynomials and inequalities, and $\exists\mathbb{R}$ is the complexity class of problems polynomial-time reducible to ETR. Many important geometric problems are known to be $\exists\mathbb{R}$-complete. $\exists\mathbb{R}$-hardness results frequently work by a reduction from the $\exists\mathbb{R}$-complete problem ETR-INV, which asks if there is a an assignment of real variables each in the interval $[\frac12, 2]$ satisfying some constraints of form $x=1$, $xy=1$ and $x+y=z$. MAX-ETR-INV is a related optimization problem that asks, given a set of constraints of form $x=1$, $xy=1$, and $x+y=z$, for a feasible (that is, satisfiable with variables in $[\frac12, 2]$) subset of those constraints of the largest possible size. We show that there is some constant $ε>0$ such that it is $\exists\mathbb{R}$-hard to approximate MAX-ETR-INV better than a $1-ε$ factor. This means that even a non-deterministic polynomial-time algorithm can't approximate MAX-ETR-INV better than this factor unless $\exists\mathbb{R}=\text{NP}$. We also give a polynomial-time $8$-factor approximation algorithm and a non-deterministic-polynomial-time $2$-factor approximation algorithm for MAX-ETR-INV.

On the Approximate Non-Deterministic Degree of Total Boolean Functions

from arXiv: Computational Complexity

Authors: Samruddhi Pednekar, Supartha Podder

The approximate non-deterministic degree of a Boolean function $f$, denoted $\mathsf{ndeg}_ε(f)$ (written $\mathsf{N}_ε(f)$ for brevity), is the minimum degree of a real polynomial $p$ such that $0 \le |p(x)| \le ε$ whenever $f(x) = 0$, and $|p(x)| \ge 1$ whenever $f(x) = 1$. Unlike exact non-deterministic degree, which only requires the polynomial to be nonzero on $1$-inputs, this measure enforces a uniform gap: the polynomial must stay close to zero on all $0$-inputs and bounded away from zero on all $1$-inputs. The rational degree conjecture, open for over three decades, was recently resolved by Kothari, Kovacs-Deak, Wang, and Yang, who showed that for every total Boolean function $f$, \[ deg(f) \le \widetilde O\!\left(\operatorname{rdeg}(f)^3\right). \] In their paper, they explicitly propose a stronger conjecture: that approximate degree is polynomially bounded by $\mathsf{N}_ε(f)$ and $\mathsf{N}_ε(\overline{f})$ jointly, i.e., for every total Boolean function $f$ and every constant $0<ε<1$, \[ \widetilde{deg}(f) \le \operatorname{poly}(\mathsf N_ε(f), \mathsf N_ε(\overline f)). \] This conjecture, if true, would imply a polynomial version of the rational degree result and bring us closer to resolving de Wolf's longstanding non-deterministic degree conjecture. In this work, we make the first systematic progress on this problem, establishing the conjecture for several broad and natural function classes: monotone and unate functions, functions of bounded alternation number, symmetric functions, $k$-uniform hypergraph properties, and read-$k$ Disjunctive Normal Form (DNF) formulas.

Authors: Samruddhi Pednekar, Supartha Podder

The approximate non-deterministic degree of a Boolean function $f$, denoted $\mathsf{ndeg}_ε(f)$ (written $\mathsf{N}_ε(f)$ for brevity), is the minimum degree of a real polynomial $p$ such that $0 \le |p(x)| \le ε$ whenever $f(x) = 0$, and $|p(x)| \ge 1$ whenever $f(x) = 1$. Unlike exact non-deterministic degree, which only requires the polynomial to be nonzero on $1$-inputs, this measure enforces a uniform gap: the polynomial must stay close to zero on all $0$-inputs and bounded away from zero on all $1$-inputs. The rational degree conjecture, open for over three decades, was recently resolved by Kothari, Kovacs-Deak, Wang, and Yang, who showed that for every total Boolean function $f$, \[ deg(f) \le \widetilde O\!\left(\operatorname{rdeg}(f)^3\right). \] In their paper, they explicitly propose a stronger conjecture: that approximate degree is polynomially bounded by $\mathsf{N}_ε(f)$ and $\mathsf{N}_ε(\overline{f})$ jointly, i.e., for every total Boolean function $f$ and every constant $0<ε<1$, \[ \widetilde{deg}(f) \le \operatorname{poly}(\mathsf N_ε(f), \mathsf N_ε(\overline f)). \] This conjecture, if true, would imply a polynomial version of the rational degree result and bring us closer to resolving de Wolf's longstanding non-deterministic degree conjecture. In this work, we make the first systematic progress on this problem, establishing the conjecture for several broad and natural function classes: monotone and unate functions, functions of bounded alternation number, symmetric functions, $k$-uniform hypergraph properties, and read-$k$ Disjunctive Normal Form (DNF) formulas.

Query Lower Bounds for Correlation Clustering under Memory Constraints

from arXiv: Computational Complexity

Authors: Sumegha Garg, Songhua He, Periklis A. Papakonstantinou

This work initiates the study of memory-query tradeoffs for graph problems, with a focus on correlation clustering. Correlation clustering asks for a partition of the vertices that minimizes disagreements: non-edges inside clusters plus edges across clusters. Our first result is a tight query lower bound: to output a partition whose cost approximates the optimum up to an additive error of $\varepsilon n^2$, any algorithm requires $Ω(n/\varepsilon^2)$ adjacency-matrix queries. Under memory constraints, we show that even for the seemingly easier task of approximating the optimal clustering cost (without producing a partition), any algorithm in the random query model must make $\gg n/\varepsilon^2$ adjacency-matrix queries. Finally, we prove the first general graph model query lower bound for correlation clustering, where algorithms are allowed adjacency-matrix, neighbor, and degree queries. The latter two bounds are not yet tight, leaving room for sharper results.

Authors: Sumegha Garg, Songhua He, Periklis A. Papakonstantinou

This work initiates the study of memory-query tradeoffs for graph problems, with a focus on correlation clustering. Correlation clustering asks for a partition of the vertices that minimizes disagreements: non-edges inside clusters plus edges across clusters. Our first result is a tight query lower bound: to output a partition whose cost approximates the optimum up to an additive error of $\varepsilon n^2$, any algorithm requires $Ω(n/\varepsilon^2)$ adjacency-matrix queries. Under memory constraints, we show that even for the seemingly easier task of approximating the optimal clustering cost (without producing a partition), any algorithm in the random query model must make $\gg n/\varepsilon^2$ adjacency-matrix queries. Finally, we prove the first general graph model query lower bound for correlation clustering, where algorithms are allowed adjacency-matrix, neighbor, and degree queries. The latter two bounds are not yet tight, leaving room for sharper results.

The Deterministic Horizon: Impossibility Results as Design Specifications for Trustworthy AI Systems

from arXiv: Computational Complexity

Authors: Dongxin Guo

Large language models now write software, draft legal documents, and produce clinical notes, yet fundamental limits, from Turing and Arrow to the No Free Lunch theorems, shape what computation can do. This thesis turns such impossibility results from curiosities into design rules. Its flagship result proves an accuracy ceiling set by architecture alone: past a critical reasoning depth, no amount of training moves it, at any adapter rank, sample size, or loss function. Computable before deployment from layer count and embedding width, this Deterministic Horizon is measured between nineteen and thirty-one across twelve transformer architectures, and fine-tuning on optimal-length traces recovers under four percentage points. The mechanism is a capacity invariant of the residual stream, and an information-theoretic conversion yields super-exponential accuracy decay past the horizon. An unconditional circuit-complexity lower bound for modular exponentiation against constant-depth prime-modulus circuits complements this result. The same argument recasts across subfields: preference learning under any misspecified model jumps discontinuously in sample complexity; multi-stage retrieval pipelines require at least as many independent metrics as stages; standard truthful auctions fail for agents with prompt-dependent valuations; and zero-knowledge verification of neural inference pays a measured overhead of one hundred ten to one hundred ninety times per non-linear activation. Together these form a catalogue of sixteen specifications, each pairing a computable boundary, a quantified violation cost, and a constructive design rule: two compositions are proved, one pairing is an honest obstruction, and four remain open. The impossibility-specification methodology is offered for the generative research programme that trustworthy AI may need. Every fundamental limit of AI is also a design rule.

Authors: Dongxin Guo

Large language models now write software, draft legal documents, and produce clinical notes, yet fundamental limits, from Turing and Arrow to the No Free Lunch theorems, shape what computation can do. This thesis turns such impossibility results from curiosities into design rules. Its flagship result proves an accuracy ceiling set by architecture alone: past a critical reasoning depth, no amount of training moves it, at any adapter rank, sample size, or loss function. Computable before deployment from layer count and embedding width, this Deterministic Horizon is measured between nineteen and thirty-one across twelve transformer architectures, and fine-tuning on optimal-length traces recovers under four percentage points. The mechanism is a capacity invariant of the residual stream, and an information-theoretic conversion yields super-exponential accuracy decay past the horizon. An unconditional circuit-complexity lower bound for modular exponentiation against constant-depth prime-modulus circuits complements this result. The same argument recasts across subfields: preference learning under any misspecified model jumps discontinuously in sample complexity; multi-stage retrieval pipelines require at least as many independent metrics as stages; standard truthful auctions fail for agents with prompt-dependent valuations; and zero-knowledge verification of neural inference pays a measured overhead of one hundred ten to one hundred ninety times per non-linear activation. Together these form a catalogue of sixteen specifications, each pairing a computable boundary, a quantified violation cost, and a constructive design rule: two compositions are proved, one pairing is an honest obstruction, and four remain open. The impossibility-specification methodology is offered for the generative research programme that trustworthy AI may need. Every fundamental limit of AI is also a design rule.

Dynamic Query Modification for Binary Locality Sensitive Hashing

from arXiv: Computational Geometry

Authors: Ben Claydon, Richard Connor, Alan Dearle

Our context of interest is how binary locality sensitive hash (LSH) functions can be used to solve the approximate near neighbour (ANN) problem, which seeks to find the k closest elements of some dataset X to some further point q presented as a query. Binary locality sensitive function families H are sets of functions each which accept a point and return a binary value. A function is locality sensitive if and only if the output of the function is more likely to be equal (a 'hash collision') if two close vectors are used as input than if two far vectors are used. A data structure can be built by generating binary hash codes for each member of X, which are generated by drawing and applying one or more functions from H. When q is presented as a query, the same set of functions is applied to it and those elements of X with equal binary hash codes are retrieved. In this paper we introduce dynamic query modification. This process changes q at query time to form a new value c, which by theoretical and experimental analysis we prove has two significant advantages. Firstly, the hash output of c collides with near neighbours with a greater probability than q. Secondly, we show there is little chance of c failing to collide with any near neighbours; a property which we demonstrate is not true for q. To demonstrate the efficacy of the technique, we define a novel structure MQ-Forest, a modified version of RP-Forest. Both are binary LSH-based ANN mechanisms, but MQ-Forest dynamically estimates a value for c during the query process. We show that MQ-Forest reduces both build and query times by up to 40% when measured over several large, high-dimensional benchmark datasets.

Authors: Ben Claydon, Richard Connor, Alan Dearle

Our context of interest is how binary locality sensitive hash (LSH) functions can be used to solve the approximate near neighbour (ANN) problem, which seeks to find the k closest elements of some dataset X to some further point q presented as a query. Binary locality sensitive function families H are sets of functions each which accept a point and return a binary value. A function is locality sensitive if and only if the output of the function is more likely to be equal (a 'hash collision') if two close vectors are used as input than if two far vectors are used. A data structure can be built by generating binary hash codes for each member of X, which are generated by drawing and applying one or more functions from H. When q is presented as a query, the same set of functions is applied to it and those elements of X with equal binary hash codes are retrieved. In this paper we introduce dynamic query modification. This process changes q at query time to form a new value c, which by theoretical and experimental analysis we prove has two significant advantages. Firstly, the hash output of c collides with near neighbours with a greater probability than q. Secondly, we show there is little chance of c failing to collide with any near neighbours; a property which we demonstrate is not true for q. To demonstrate the efficacy of the technique, we define a novel structure MQ-Forest, a modified version of RP-Forest. Both are binary LSH-based ANN mechanisms, but MQ-Forest dynamically estimates a value for c during the query process. We show that MQ-Forest reduces both build and query times by up to 40% when measured over several large, high-dimensional benchmark datasets.

Optimal Dimension-Free Sampling for Regularized Classification

from arXiv: Data Structures and Algorithms

Authors: Meysam Alishahi, Alexander Munteanu, Simon Omlor, Jeff M. Phillips

We prove optimal sampling bounds achieving $(1\pm\varepsilon)$-relative error for a broad class of Lipschitz continuous classification loss functions under various regularization terms. This includes important functions such as logistic and sigmoid loss, hinge loss, and ReLU loss, as prominent and popular representative examples. In particular, we prove $k^2/\varepsilon^2$ upper and lower bounds for $\|\cdot\|_2/k$ regularization, and $k/\varepsilon^2$ upper and lower bounds for $\|\cdot\|_1/k$ regularization. For $\|\cdot\|_2^2/k$ regularization, the sampling complexity depends mainly on a bounded derivative property: if $|g'(x)|\leq g(x)$, and $g(0)>0$, and $g$ is monotonic or convex, then it admits linear in $k$ sampling complexity; otherwise the general bound is $k^2/\varepsilon^2$. However, if $g(0)=0$, our results indicate that no dimension-free bounds are possible, and even sublinear bounds are ruled out. All upper bounds are complemented by matching lower bounds up to polylogarithmic terms. Moreover, our work relies conceptually and algorithmically on simple uniform or (squared) norm sampling and hereby improves over recent cubic $k^3/\varepsilon^2$ sensitivity sampling bounds of (Alishahi and Phillips, ICML'24). This is achieved by refined arguments involving higher moment bounds and empirical process analyses to avoid overcounting that appears in the de-facto standard VC-dimension and sensitivity framework.

Authors: Meysam Alishahi, Alexander Munteanu, Simon Omlor, Jeff M. Phillips

We prove optimal sampling bounds achieving $(1\pm\varepsilon)$-relative error for a broad class of Lipschitz continuous classification loss functions under various regularization terms. This includes important functions such as logistic and sigmoid loss, hinge loss, and ReLU loss, as prominent and popular representative examples. In particular, we prove $k^2/\varepsilon^2$ upper and lower bounds for $\|\cdot\|_2/k$ regularization, and $k/\varepsilon^2$ upper and lower bounds for $\|\cdot\|_1/k$ regularization. For $\|\cdot\|_2^2/k$ regularization, the sampling complexity depends mainly on a bounded derivative property: if $|g'(x)|\leq g(x)$, and $g(0)>0$, and $g$ is monotonic or convex, then it admits linear in $k$ sampling complexity; otherwise the general bound is $k^2/\varepsilon^2$. However, if $g(0)=0$, our results indicate that no dimension-free bounds are possible, and even sublinear bounds are ruled out. All upper bounds are complemented by matching lower bounds up to polylogarithmic terms. Moreover, our work relies conceptually and algorithmically on simple uniform or (squared) norm sampling and hereby improves over recent cubic $k^3/\varepsilon^2$ sensitivity sampling bounds of (Alishahi and Phillips, ICML'24). This is achieved by refined arguments involving higher moment bounds and empirical process analyses to avoid overcounting that appears in the de-facto standard VC-dimension and sensitivity framework.

Reducing the Randomness in Partition Oracles for Bounded Degree Minor-Free Graphs

from arXiv: Data Structures and Algorithms

Authors: Akash Kumar, Abhiruk Lahiri, C. Seshadhri

Consider a bounded-degree graph $G$ that belongs to a minor-closed family (such as planar graphs). Such a graph has a hyperfinite decomposition, wherein, for a sufficiently small $\varepsilon > 0$, one can remove $\varepsilon dn$ edges to obtain connected components of size independent of $n$. (As usual, $n$ is the number of vertices and $d$ is the degree bound.) In a seminal result, Hassidim-Kelner-Nguyen-Onak (FOCS 2009) introduced the partition oracle, a procedure that provides local access to a hyperfinite decomposition. The partition oracle computes the component containing an input vertex $v$ with query complexity (to $G$) independent of $n$. Remarkably, this is done without any preprocessing on $G$. The coordination is done purely through a shared random seed. Despite a line of work on optimizing the query complexity of partition oracles, there were no attempts to bound the size of the random seed. All existing partition oracles use a random seed of size $Ω(n)$, which technically implies a linear setup time. Any blackbox derandomization would likely need $Ω(\log^2n)$ uniform random bits. A natural question is whether the random seed can also have length independent of $n$. We prove the $poly(d\varepsilon^{-1})$-query partition oracles of Kumar-Seshadhri-Stolman can be implemented with a random seed of $poly(d\varepsilon^{-1}) \cdot \log n$ length. To get a deeper understanding on the randomness complexity, we consider a more general model where the vertex labels come from the universe $[N]$, where $N \geq n$. In this setting, we prove that any partition oracle even for cycles requires $ω_N(1)$ random bits.

Authors: Akash Kumar, Abhiruk Lahiri, C. Seshadhri

Consider a bounded-degree graph $G$ that belongs to a minor-closed family (such as planar graphs). Such a graph has a hyperfinite decomposition, wherein, for a sufficiently small $\varepsilon > 0$, one can remove $\varepsilon dn$ edges to obtain connected components of size independent of $n$. (As usual, $n$ is the number of vertices and $d$ is the degree bound.) In a seminal result, Hassidim-Kelner-Nguyen-Onak (FOCS 2009) introduced the partition oracle, a procedure that provides local access to a hyperfinite decomposition. The partition oracle computes the component containing an input vertex $v$ with query complexity (to $G$) independent of $n$. Remarkably, this is done without any preprocessing on $G$. The coordination is done purely through a shared random seed. Despite a line of work on optimizing the query complexity of partition oracles, there were no attempts to bound the size of the random seed. All existing partition oracles use a random seed of size $Ω(n)$, which technically implies a linear setup time. Any blackbox derandomization would likely need $Ω(\log^2n)$ uniform random bits. A natural question is whether the random seed can also have length independent of $n$. We prove the $poly(d\varepsilon^{-1})$-query partition oracles of Kumar-Seshadhri-Stolman can be implemented with a random seed of $poly(d\varepsilon^{-1}) \cdot \log n$ length. To get a deeper understanding on the randomness complexity, we consider a more general model where the vertex labels come from the universe $[N]$, where $N \geq n$. In this setting, we prove that any partition oracle even for cycles requires $ω_N(1)$ random bits.

Ancilla-Efficient QSAMPLE Preparation for Reversible Markov Chains

from arXiv: Data Structures and Algorithms

Authors: Nicholas Zhao

Preparing quantum samples (QSAMPLES), coherent encodings of stationary distributions of reversible Markov chains, is a fundamental primitive in quantum sampling, particularly for quantum simulated annealing. A central limitation of existing phase-estimation-based frameworks is the ancilla qubit overhead. In this work, we present a new end-to-end framework requiring only one ancilla qubit in the working register. The key technical ingredient is a selective phase compiler circuit using one ancilla qubit, built from a generalized quantum signal processing (GQSP)-based projector onto the 1-eigenspace of the qubitized Szegedy walk. Embedding these selective phase compilers into the fixed-point amplitude amplification (FPAA) procedure and iterating yields a quantum algorithm that, given an initial state, oracle access, lower bounds on the overlaps between adjacent states, and lower bounds on the phase gaps, outputs a QSAMPLE within any desired trace distance and thus total variation distance. The query complexity scales inversely with the square roots of both the minimum overlap and the minimum spectral gap of the Markov chains across the cooling schedule, up to polylogarithmic factors. We also perform simulations to verify how our qubit and query complexity evolve with the trace distance, and how this work compares to the previous framework. These results establish two improvements over the previous framework by Wocjan and Abeyesinghe. First, the working-register ancilla cost is reduced to one. Second, by inserting our GQSP-based selective phase compiler into the FPAA procedure, we improve the QSAMPLE transport overlap dependence from inverse minimum overlap to inverse square-root minimum overlap, relative to their Grover pi-over-three fixed-point method. Finally, as a direct application, we apply the quantum algorithm to prepare a Gibbs QSAMPLE and obtain a rigorous complexity analysis.

Authors: Nicholas Zhao

Preparing quantum samples (QSAMPLES), coherent encodings of stationary distributions of reversible Markov chains, is a fundamental primitive in quantum sampling, particularly for quantum simulated annealing. A central limitation of existing phase-estimation-based frameworks is the ancilla qubit overhead. In this work, we present a new end-to-end framework requiring only one ancilla qubit in the working register. The key technical ingredient is a selective phase compiler circuit using one ancilla qubit, built from a generalized quantum signal processing (GQSP)-based projector onto the 1-eigenspace of the qubitized Szegedy walk. Embedding these selective phase compilers into the fixed-point amplitude amplification (FPAA) procedure and iterating yields a quantum algorithm that, given an initial state, oracle access, lower bounds on the overlaps between adjacent states, and lower bounds on the phase gaps, outputs a QSAMPLE within any desired trace distance and thus total variation distance. The query complexity scales inversely with the square roots of both the minimum overlap and the minimum spectral gap of the Markov chains across the cooling schedule, up to polylogarithmic factors. We also perform simulations to verify how our qubit and query complexity evolve with the trace distance, and how this work compares to the previous framework. These results establish two improvements over the previous framework by Wocjan and Abeyesinghe. First, the working-register ancilla cost is reduced to one. Second, by inserting our GQSP-based selective phase compiler into the FPAA procedure, we improve the QSAMPLE transport overlap dependence from inverse minimum overlap to inverse square-root minimum overlap, relative to their Grover pi-over-three fixed-point method. Finally, as a direct application, we apply the quantum algorithm to prepare a Gibbs QSAMPLE and obtain a rigorous complexity analysis.

Beyond the Half-Approximation: Fair and Efficient Online Class Matching

from arXiv: Data Structures and Algorithms

Authors: Sander Borst, Max Springer

Online bipartite matching, where agents are known in advance but items arrive sequentially and must be irrevocably assigned, is fundamental to problems ranging from ride-sharing to online advertising. When agents belong to classes such as demographic groups or geographic regions, fairness demands equitable treatment across these groups. Recent work introduced class envy-freeness (CEF), a natural extension of the classical fair division notion: an algorithm is $α$-CEF if each class receives value at least an $α$ fraction of what it could extract from any other class's bundle. However, all known algorithms achieving constant-factor CEF guarantees attain utilitarian social welfare (total matching value) of at most $\frac{1}{2}$ times the optimum, far below the $1-\frac{1}{e} \approx 0.632$ achievable without fairness constraints. We resolve the open question of whether fairness necessitates this efficiency loss, by introducing threshold-based algorithms parameterized by $γ\in [0,1]$ that equalize allocations across classes until threshold $γ$, then maximize efficiency. For divisible matching, this yields simultaneous $(1-e^{-γ})$-CEF and $(1 - \frac{e^{γ-1}}{γ+1})$-USW guarantees; for indivisible matching, $\fracγ{2}$-CEF with the same USW. Setting $γ> 0$ produces the first algorithms beating $\frac{1}{2}$-USW while maintaining constant CEF. We complement this with a novel upper bound construction, proving no non-wasteful $α$-CEF algorithm can exceed $\frac{1 +α- e^{α-1}}{1+α}$-USW and correcting prior bounds that were vacuous for $α< 0.58$. Our upper bound nearly matches our algorithms' performance, giving the first substantive characterization of the price of fairness in online class matching.

Authors: Sander Borst, Max Springer

Online bipartite matching, where agents are known in advance but items arrive sequentially and must be irrevocably assigned, is fundamental to problems ranging from ride-sharing to online advertising. When agents belong to classes such as demographic groups or geographic regions, fairness demands equitable treatment across these groups. Recent work introduced class envy-freeness (CEF), a natural extension of the classical fair division notion: an algorithm is $α$-CEF if each class receives value at least an $α$ fraction of what it could extract from any other class's bundle. However, all known algorithms achieving constant-factor CEF guarantees attain utilitarian social welfare (total matching value) of at most $\frac{1}{2}$ times the optimum, far below the $1-\frac{1}{e} \approx 0.632$ achievable without fairness constraints. We resolve the open question of whether fairness necessitates this efficiency loss, by introducing threshold-based algorithms parameterized by $γ\in [0,1]$ that equalize allocations across classes until threshold $γ$, then maximize efficiency. For divisible matching, this yields simultaneous $(1-e^{-γ})$-CEF and $(1 - \frac{e^{γ-1}}{γ+1})$-USW guarantees; for indivisible matching, $\fracγ{2}$-CEF with the same USW. Setting $γ> 0$ produces the first algorithms beating $\frac{1}{2}$-USW while maintaining constant CEF. We complement this with a novel upper bound construction, proving no non-wasteful $α$-CEF algorithm can exceed $\frac{1 +α- e^{α-1}}{1+α}$-USW and correcting prior bounds that were vacuous for $α< 0.58$. Our upper bound nearly matches our algorithms' performance, giving the first substantive characterization of the price of fairness in online class matching.

Tractable Maximization of Budgeted Phylogenetic Diversity on Networks Utilizing Node Scanwidth

from arXiv: Data Structures and Algorithms

Authors: Niels Holtgrefe, Jannik Schestag

Identifying a subset of taxa that maximizes Phylogenetic Diversity (PD) is a cornerstone of quantitative conservation planning. Traditionally, PD is defined over a phylogenetic tree in which leaves resemble present-day taxa and the branch lengths capture the estimated evolutionary distinctiveness. While PD maximization is computationally tractable on trees with unit costs, the problem becomes NP-hard when transitioning to phylogenetic networks or to budgeted versions in which protecting taxa incurs non-homogeneous costs. In this paper, we address these two challenges by providing definitions and a comprehensive analysis of three distinct variants of budgeted PD on networks. We conduct our study through the lens of a small structural parameter, node scanwidth (nsw), which measures the "tree-likeness" of a phylogenetic network. We show that two of the considered variants can be optimized in O*(2^nsw B^2) time, where B is the budget. For the computationally harder, third variant, we provide an algorithm to compute PD scores in O*(3^nsw) time. We further contribute the first exact algorithms to compute node scanwidth, recognizing that the utility of algorithms based on nsw depends on the ability to compute nsw and its corresponding decomposition. Our approaches integrate data reduction rules, dynamic programming, and an Integer Linear Programming formulation. We validate our theoretical results through extensive experiments on highly reticulated, simulated networks containing several hundred taxa, using heterogeneous costs. Our implementation computes PD scores and optimal nsw in fractions of a second, even on the most challenging instances. Furthermore, our budgeted optimization algorithms significantly outperform existing benchmarks for computing PD on networks, which were previously limited to unit-cost scenarios. The software makes analyses even on networks with a thousand taxa tracta...

Authors: Niels Holtgrefe, Jannik Schestag

Identifying a subset of taxa that maximizes Phylogenetic Diversity (PD) is a cornerstone of quantitative conservation planning. Traditionally, PD is defined over a phylogenetic tree in which leaves resemble present-day taxa and the branch lengths capture the estimated evolutionary distinctiveness. While PD maximization is computationally tractable on trees with unit costs, the problem becomes NP-hard when transitioning to phylogenetic networks or to budgeted versions in which protecting taxa incurs non-homogeneous costs. In this paper, we address these two challenges by providing definitions and a comprehensive analysis of three distinct variants of budgeted PD on networks. We conduct our study through the lens of a small structural parameter, node scanwidth (nsw), which measures the "tree-likeness" of a phylogenetic network. We show that two of the considered variants can be optimized in O*(2^nsw B^2) time, where B is the budget. For the computationally harder, third variant, we provide an algorithm to compute PD scores in O*(3^nsw) time. We further contribute the first exact algorithms to compute node scanwidth, recognizing that the utility of algorithms based on nsw depends on the ability to compute nsw and its corresponding decomposition. Our approaches integrate data reduction rules, dynamic programming, and an Integer Linear Programming formulation. We validate our theoretical results through extensive experiments on highly reticulated, simulated networks containing several hundred taxa, using heterogeneous costs. Our implementation computes PD scores and optimal nsw in fractions of a second, even on the most challenging instances. Furthermore, our budgeted optimization algorithms significantly outperform existing benchmarks for computing PD on networks, which were previously limited to unit-cost scenarios. The software makes analyses even on networks with a thousand taxa tracta...

Fairness in Aggregation: Optimal Top-$k$ and Improved Full Ranking

from arXiv: Data Structures and Algorithms

Authors: Diptarka Chakraborty, Arya Mazumdar, Barna Saha, Alvin Hong Yao Yan

Ensuring fairness in algorithmic ranking systems is a critical challenge with significant societal implications for hiring, recommendations, web search, and data management. Standard methods for aggregating multiple preference orders into a consensus ranking may perpetuate and even amplify the lack of representation of underrepresented groups. To address this, recent research has focused on incorporating fairness constraints to ensure the presence of different groups in the top-$k$ positions of the final aggregate ranking. We study two fairness-aware variants under the well-known Spearman footrule, which corresponds to the $L_1$ distance between rankings. First, we address the practically salient task of computing a fair aggregate top-$k$ ranking -- crucial in settings like recommendations and hiring where selection is primarily based on the top-$k$ results -- and present the first optimal algorithm for this problem. Second, we consider fair (full) rank aggregation over all candidates (not specifically on top-$k$). We already know of a $3$-approximation for this fair rank aggregation variant (Wei et al., SIGMOD'22; Chakraborty et al., NeurIPS'22), whereas an exact algorithm exists for the corresponding unconstrained (unfair) version (Dwork et al., WWW'01). Closing the computational gap between fair and unconstrained rank aggregation has remained a tantalizing open problem. We make significant progress by giving a $2$-approximation algorithm for fair (full) rank aggregation, improving substantially over the previous $3$-approximation. Further, we complement our theoretical contributions with experiments on different real-world datasets, which corroborate our theoretical results and demonstrate strong empirical performance relative to state-of-the-art baselines.

Authors: Diptarka Chakraborty, Arya Mazumdar, Barna Saha, Alvin Hong Yao Yan

Ensuring fairness in algorithmic ranking systems is a critical challenge with significant societal implications for hiring, recommendations, web search, and data management. Standard methods for aggregating multiple preference orders into a consensus ranking may perpetuate and even amplify the lack of representation of underrepresented groups. To address this, recent research has focused on incorporating fairness constraints to ensure the presence of different groups in the top-$k$ positions of the final aggregate ranking. We study two fairness-aware variants under the well-known Spearman footrule, which corresponds to the $L_1$ distance between rankings. First, we address the practically salient task of computing a fair aggregate top-$k$ ranking -- crucial in settings like recommendations and hiring where selection is primarily based on the top-$k$ results -- and present the first optimal algorithm for this problem. Second, we consider fair (full) rank aggregation over all candidates (not specifically on top-$k$). We already know of a $3$-approximation for this fair rank aggregation variant (Wei et al., SIGMOD'22; Chakraborty et al., NeurIPS'22), whereas an exact algorithm exists for the corresponding unconstrained (unfair) version (Dwork et al., WWW'01). Closing the computational gap between fair and unconstrained rank aggregation has remained a tantalizing open problem. We make significant progress by giving a $2$-approximation algorithm for fair (full) rank aggregation, improving substantially over the previous $3$-approximation. Further, we complement our theoretical contributions with experiments on different real-world datasets, which corroborate our theoretical results and demonstrate strong empirical performance relative to state-of-the-art baselines.

Learning-Augmented Online Scheduling with Parsimonious Preemption

from arXiv: Data Structures and Algorithms

Authors: Mugen Blue, Sungjin Im, Alexander Lindermayr

Learning-augmented algorithms have emerged as a powerful paradigm to surpass traditional worst-case lower bounds by integrating potentially noisy predictions. While this framework has seen success in online scheduling, existing work primarily optimizes job latency while relying on frequent, ``blind'' preemptions. This ignores the fundamental trade-off between algorithmic performance and preemption complexity. We provide the first systematic study of learning-augmented scheduling that curbs preemption while optimizing latency. We establish that the gap between theoretical latency bounds and preemption overhead can be bridged with solid analytical foundations. Our results include $O(1)$-competitive algorithms for single and unrelated parallel machines with only $O(1)$ preemptions per job under accurate predictions, with overhead scaling logarithmically with the prediction error. By providing the first bounded-preemption guarantees for unrelated and malleable machines, we extend the theoretical reach of the learning-augmented framework to more constrained and realistic settings. Finally, our algorithms are validated through experiments.

Authors: Mugen Blue, Sungjin Im, Alexander Lindermayr

Learning-augmented algorithms have emerged as a powerful paradigm to surpass traditional worst-case lower bounds by integrating potentially noisy predictions. While this framework has seen success in online scheduling, existing work primarily optimizes job latency while relying on frequent, ``blind'' preemptions. This ignores the fundamental trade-off between algorithmic performance and preemption complexity. We provide the first systematic study of learning-augmented scheduling that curbs preemption while optimizing latency. We establish that the gap between theoretical latency bounds and preemption overhead can be bridged with solid analytical foundations. Our results include $O(1)$-competitive algorithms for single and unrelated parallel machines with only $O(1)$ preemptions per job under accurate predictions, with overhead scaling logarithmically with the prediction error. By providing the first bounded-preemption guarantees for unrelated and malleable machines, we extend the theoretical reach of the learning-augmented framework to more constrained and realistic settings. Finally, our algorithms are validated through experiments.

Entropy Equivalence Testing

from arXiv: Data Structures and Algorithms

Authors: Clément L. Canonne, Yash Pote, Jonathan Scarlett, Joy Qiping Yang

We introduce the problem of \emph{entropy equivalence testing} for probability distributions, a relaxation of the well-studied closeness testing problem, where the distribution testing algorithm is now only required to distinguish, given samples from two unknown distributions $p,q$ and a parameter $\varepsilon \in(0,1/2]$, between $p=q$ and $|H(p)-H(q)| \geq \varepsilon$ (where $H$ denotes the Shannon entropy). We provide a time- and sample-efficient algorithm for this task, showing that the optimal sample complexity for this task can be significantly lower than that of closeness testing. As an application, we leverage this result to provide the first non-trivial testing algorithm for (standard) closeness of low-degree \emph{Bayesian networks}, which significantly improves on either the sample or time complexity of a baseline based on full learning.

Authors: Clément L. Canonne, Yash Pote, Jonathan Scarlett, Joy Qiping Yang

We introduce the problem of \emph{entropy equivalence testing} for probability distributions, a relaxation of the well-studied closeness testing problem, where the distribution testing algorithm is now only required to distinguish, given samples from two unknown distributions $p,q$ and a parameter $\varepsilon \in(0,1/2]$, between $p=q$ and $|H(p)-H(q)| \geq \varepsilon$ (where $H$ denotes the Shannon entropy). We provide a time- and sample-efficient algorithm for this task, showing that the optimal sample complexity for this task can be significantly lower than that of closeness testing. As an application, we leverage this result to provide the first non-trivial testing algorithm for (standard) closeness of low-degree \emph{Bayesian networks}, which significantly improves on either the sample or time complexity of a baseline based on full learning.

Sunday, May 24

Two Erdős Problems on Points in the Plane and AI

from Computational Complexity

In a 1946 paper in the American Mathematical Monthly, Paul Erdős posed the Erdős Distinct Distance Problem and the Erdős Unit Distance Problem.

--------------------------------------------------------------------

THE ERDŐS DISTINCT DISTANCE PROBLEM

A nice high school math competition problem, if it were not fairly well known, is:

Show that for all sets of \(n\) points in the plane, there are at least \(0.5\sqrt{n}\) distinct distances.

A bit harder is to show that there exist \(n\) points in the plane where the number of distinct distances is \(O(\frac{n}{(\log n)^{1/2}})\). The points are the grid points of a \(\sqrt{n}\times\sqrt{n}\) integer grid. 

ADDED LATER:  The set of points is \(\{ (x,y) \in Z\times Z   \colon    1\le x,y\le \sqrt{n}   \} \)

Let \(g(n)\) be the minimum number such that, for every set of \(n\) points in the plane,
there are \(g(n)\) distinct distances. The above results (of Erdős) show that

\( \Omega(n^{1/2}) \le g(n) \le O(\frac{n}{(\log n)^{1/2}}) \)

Erdős made no conjecture about \(g(n)\) in that 1946 paper. However, later papers attribute to him one of two conjectures:

1) \( (\forall \epsilon)[g(n)\ge n^{1-\epsilon}]\)

or

2) \(g(n) = \Omega(\frac{n}{(\log n)^{1/2}})\)

Those papers incorrectly point to the 1946 paper for where he made those conjectures.  I suspect he made those conjectures later in talks or compilations of open problems.

This was a great problem in that there were many papers making progress on it, each one having new ideas. The current best result is that \(g(n) \ge \Omega(\frac{n}{\log n})\).  This result was obtained by Larry Guth and Netz Katz in 2011 and is, by far, the largest leap forward in progress on the problem.  Brass-Moser-Pach (2005) and later Pach-Sharir (2009) observed that existing techniques could never prove \(g(n) \ge \Omega(n^{8/9+\epsilon})\).  I've also heard the observation credited to Rusza. Guth and Katz invented (discovered?) new techniques to obtain their breakthrough.

See my webpage of all papers on the problem here or the Wikipedia page here

Takeaway: The Erdős Distinct Distance Problem saw decades of steady human progress and was eventually almost completely resolved without computer assistance.  I would not have emphasized without computer assistance one year ago.  Maybe not even one week ago.

----------------------------------------------

THE ERDŐS UNIT DISTANCE PROBLEM

Let S be a set of \(n\) points in the plane. Some of the distances between points of S may be 1. How many?  What is the maximum number of unit distances that can occur? Denote this by \(f(n)\). Erdős showed

\( n^{1+\Omega(1/\log\log n)} \le f(n) \le O(n^{3/2}) \).

The lower bound is obtained by using the grid points of a scaled version of the \(\sqrt{n}\times\sqrt{n}\) integer grid. See a writeup by ChatGPT,   here.  We note for later that this can be phrased as using the points in the Gaussian integers, \(Z[i]\).  

In the 1946 paper Erdős conjectured that, for all \(\epsilon\),  \(f(n) = O(n^{1+\epsilon})\).

Joel Spencer, Endre Szemerédi, and William Trotter in 1984 showed \(f(n) = O(n^{4/3})\).

Aside from the Spencer-Szemerédi-Trotter bound, there has been relatively little progress on either the upper or lower bound.

----------------------------------

CONTRASTING THE TWO PROBLEMS

Contrast: The Erdős Distinct Distance problem has seen extensive progress and a large literature. By contrast, although many papers were written on the Erdős unit distance problem, there was little improvement. That changed on May 20, 2026.

------------------------------------------

OPENAI PRODUCES A COUNTEREXAMPLE TO THE ERDŐS CONJECTURE

OpenAI recently produced a counterexample to Erdős's conjecture.  More precisely, OpenAI proved the following:

There exists \(\epsilon>0 \) and a sequence of point  sets \(P_i \subseteq R^2\) such that, letting \(|P_i|=n_i\):

a) \(n_i \rightarrow\infty\), and

b) for all \(i\), the number of unit distances in \(P_i\) is at least \(n_i^{1+\epsilon}\).


The  OpenAI announcement is here. The technical paper is here. 


Note that the author is OpenAI. There are no human co-authors.  However,(1)  Lije Chen used an internal OpenAI model and (2) Mark Selke and Metaab Sawhney verified correctness. The \(\epsilon\) is not given though one could go through the paper and find it. It would be very small. 

After the initial AI-generated argument (or AI-assisted?) human mathematicians streamlined and clarified the proof. The result is a paper by Noga Alon, Thomas Bloom, W.T. Gowers, Daniel Litt, Will Sawin, Arul Shankar, Jacob Tsimermann, Victor Wang, and Melanie Matchett Wood  that explains the history, the proof, and their reactions to it. That paper is here. They obtain \(\epsilon \sim 6\times 10^{-38}\). 

Recall that Erdős's lower bound was obtained by using points in \(Z[i]\). The new result used more complicated number fields. Indeed, the degree of the number fields used goes to infinity.  For more details see the Alon et al. proof.  

Will Sawin wrote a paper where he obtained \(\epsilon=0.014114\). Quoting his paper:

To do this, we make explicit and sharpen every step of the argument. Interestingly, this rarely makes the argument much more complex. The total length of this paper is essentially the same as the original OpenAI writeup, though longer than the simplified version prepared by human authors [Alon et a.].

His paper is here

-------------------------------------------

 MY POINTS

1) The Erdős's Unit Distance problem is a well known and interesting conjecture, so this is not picking out some obscure problem.

2) AI-generated or AI-assisted? The announcement by OpenAI, and the Alon et al. paper, indicate that the proof is AI-generated.

Fellow blogger Gary Marcus disagrees and argues that it is AI-assisted.  See his post here. 

ADDED LATER: An interview with Gary Marcus is 


I think this is a hard question, and perhaps there is more of a continuum.  However, I trust Alon et al., so I will go with AI-generated.

3) Humans were still needed to verify and  clean up the proof. In the future, we will all be grad students, verifying and cleaning up what AI outputs.

4) The final proof was readable. One concern about AI proofs is that they would be unreadable and hard to verify. That was not the case here.

5) The ideas needed for the solution already existed; however:

a) The right combination was hard to find.

b) The relevant techniques used, algebraic number theory, are not standard tools in combinatorial geometry.

c) It was widely believed that Erdős's conjecture was true.

d) Producing a counterexample seems less impressive than proving the theorem.  It would be of interest to see if OpenAI could prove the Guth-Katz result; however, that would not work now since Guth-Katz is already out there for OpenAI to see. Perhaps OpenAI should try to improve Guth-Katz. 


6) In the short term, this result and what it portends, is good: math problems we care about will be resolved with the help of AI, perhaps solely by AI. But in the long run we may lose the ability---or at least the patience---to do the proofs ourselves. Note that I am assuming that AI will have future successes---see item 10-ONE for a counterthought.

7) AI seems to be good at combining known concepts. Is it good at coming up with new ones? Are humans? The distinction between combining known ideas and coming up with new ones is very thin.

8) Two contrary lessons:

a) You should know many fields of mathematics so that you can use ideas from one field in another, like the AI did.

b) You should know some field of math really well so you may do something new in it that current AI could not have done.

9) If an AI produces a new treatment for cancer that is cheaper and better than what is known I do not think we will care that an AI did it (though we will have humans check it). Is mathematics similar? Do we care that an AI did it?  Medicine primarily values outcomes; Mathematics also values understanding. Does reading a proof that AI did give the same understanding as coming up with it yourself?  Can AI help with that as well?


10) I suggest two futures:

ONE: While this AI-generated (or AI-assisted) result is impressive, it will be a rare occurrence. This result was actually a counterexample. The needed math was known. The result was interesting. This is a perfect storm that we might not see again for a while.

TWO: Even before the AI revolution, when I came up with a math problem I wanted solved, I would seek help, perhaps too early. My curiosity far exceeds my ego.  Since AI makes it easy to get help, my fear is that eventually we will all be Bill Gasarch---scary.


By gasarch

In a 1946 paper in the American Mathematical Monthly, Paul Erdős posed the Erdős Distinct Distance Problem and the Erdős Unit Distance Problem.

--------------------------------------------------------------------

THE ERDŐS DISTINCT DISTANCE PROBLEM

A nice high school math competition problem, if it were not fairly well known, is:

Show that for all sets of \(n\) points in the plane, there are at least \(0.5\sqrt{n}\) distinct distances.


A bit harder is to show that there exist \(n\) points in the plane where the number of distinct distances is \(O(\frac{n}{(\log n)^{1/2}})\). The points are the grid points of a \(\sqrt{n}\times\sqrt{n}\) integer grid. 

ADDED LATER:  The set of points is \(\{ (x,y) \in Z\times Z   \colon    1\le x,y\le \sqrt{n}   \} \)

Let \(g(n)\) be the minimum number such that, for every set of \(n\) points in the plane,
there are \(g(n)\) distinct distances. The above results (of Erdős) show that

\( \Omega(n^{1/2}) \le g(n) \le O(\frac{n}{(\log n)^{1/2}}) \)

Erdős made no conjecture about \(g(n)\) in that 1946 paper. However, later papers attribute to him one of two conjectures:

1) \( (\forall \epsilon)[g(n)\ge n^{1-\epsilon}]\)

or

2) \(g(n) = \Omega(\frac{n}{(\log n)^{1/2}})\)

Those papers incorrectly point to the 1946 paper for where he made those conjectures.  I suspect he made those conjectures later in talks or compilations of open problems.

This was a great problem in that there were many papers making progress on it, each one having new ideas. The current best result is that \(g(n) \ge \Omega(\frac{n}{\log n})\).  This result was obtained by Larry Guth and Netz Katz in 2011 and is, by far, the largest leap forward in progress on the problem.  Brass-Moser-Pach (2005) and later Pach-Sharir (2009) observed that existing techniques could never prove \(g(n) \ge \Omega(n^{8/9+\epsilon})\).  I've also heard the observation credited to Rusza. Guth and Katz invented (discovered?) new techniques to obtain their breakthrough.

See my webpage of all papers on the problem here or the Wikipedia page here

Takeaway: The Erdős Distinct Distance Problem saw decades of steady human progress and was eventually almost completely resolved without computer assistance.  I would not have emphasized without computer assistance one year ago.  Maybe not even one week ago.

----------------------------------------------

THE ERDŐS UNIT DISTANCE PROBLEM

Let S be a set of \(n\) points in the plane. Some of the distances between points of S may be 1. How many?  What is the maximum number of unit distances that can occur? Denote this by \(f(n)\). Erdős showed

\( n^{1+\Omega(1/\log\log n)} \le f(n) \le O(n^{3/2}) \).

The lower bound is obtained by using the grid points of a scaled version of the \(\sqrt{n}\times\sqrt{n}\) integer grid. See a writeup by ChatGPT,   here.  We note for later that this can be phrased as using the points in the Gaussian integers, \(Z[i]\).  

In the 1946 paper Erdős conjectured that, for all \(\epsilon\),  \(f(n) = O(n^{1+\epsilon})\).

Joel Spencer, Endre Szemerédi, and William Trotter in 1984 showed \(f(n) = O(n^{4/3})\).

Aside from the Spencer-Szemerédi-Trotter bound, there has been relatively little progress on either the upper or lower bound.

----------------------------------

CONTRASTING THE TWO PROBLEMS

Contrast: The Erdős Distinct Distance problem has seen extensive progress and a large literature. By contrast, although many papers were written on the Erdős unit distance problem, there was little improvement. That changed on May 20, 2026.

------------------------------------------

OPENAI PRODUCES A COUNTEREXAMPLE TO THE ERDŐS CONJECTURE

OpenAI recently produced a counterexample to Erdős's conjecture.  More precisely, OpenAI proved the following:

There exists \(\epsilon>0 \) and a sequence of point  sets \(P_i \subseteq R^2\) such that, letting \(|P_i|=n_i\):

a) \(n_i \rightarrow\infty\), and

b) for all \(i\), the number of unit distances in \(P_i\) is at least \(n_i^{1+\epsilon}\).


The  OpenAI announcement is here. The technical paper is here


Note that the author is OpenAI. There are no human co-authors.  However,(1)  Lije Chen used an internal OpenAI model and (2) Mark Selke and Metaab Sawhney verified correctness. The \(\epsilon\) is not given though one could go through the paper and find it. It would be very small. 

After the initial AI-generated argument (or AI-assisted?) human mathematicians streamlined and clarified the proof. The result is a paper by Noga Alon, Thomas Bloom, W.T. Gowers, Daniel Litt, Will Sawin, Arul Shankar, Jacob Tsimermann, Victor Wang, and Melanie Matchett Wood  that explains the history, the proof, and their reactions to it. That paper is here. They obtain \(\epsilon \sim 6\times 10^{-38}\). 

Recall that Erdős's lower bound was obtained by using points in \(Z[i]\). The new result used more complicated number fields. Indeed, the degree of the number fields used goes to infinity.  For more details see the Alon et al. proof.  

Will Sawin wrote a paper where he obtained \(\epsilon=0.014114\). Quoting his paper:

To do this, we make explicit and sharpen every step of the argument. Interestingly, this rarely makes the argument much more complex. The total length of this paper is essentially the same as the original OpenAI writeup, though longer than the simplified version prepared by human authors [Alon et a.].

His paper is here

-------------------------------------------

 MY POINTS

1) The Erdős's Unit Distance problem is a well known and interesting conjecture, so this is not picking out some obscure problem.

2) AI-generated or AI-assisted? The announcement by OpenAI, and the Alon et al. paper, indicate that the proof is AI-generated.

Fellow blogger Gary Marcus disagrees and argues that it is AI-assisted.  See his post here

ADDED LATER: An interview with Gary Marcus is 


I think this is a hard question, and perhaps there is more of a continuum.  However, I trust Alon et al., so I will go with AI-generated.

3) Humans were still needed to verify and  clean up the proof. In the future, we will all be grad students, verifying and cleaning up what AI outputs.

4) The final proof was readable. One concern about AI proofs is that they would be unreadable and hard to verify. That was not the case here.

5) The ideas needed for the solution already existed; however:

a) The right combination was hard to find.

b) The relevant techniques used, algebraic number theory, are not standard tools in combinatorial geometry.

c) It was widely believed that Erdős's conjecture was true.

d) Producing a counterexample seems less impressive than proving the theorem.  It would be of interest to see if OpenAI could prove the Guth-Katz result; however, that would not work now since Guth-Katz is already out there for OpenAI to see. Perhaps OpenAI should try to improve Guth-Katz. 


6) In the short term, this result and what it portends, is good: math problems we care about will be resolved with the help of AI, perhaps solely by AI. But in the long run we may lose the ability---or at least the patience---to do the proofs ourselves. Note that I am assuming that AI will have future successes---see item 10-ONE for a counterthought.

7) AI seems to be good at combining known concepts. Is it good at coming up with new ones? Are humans? The distinction between combining known ideas and coming up with new ones is very thin.

8) Two contrary lessons:

a) You should know many fields of mathematics so that you can use ideas from one field in another, like the AI did.

b) You should know some field of math really well so you may do something new in it that current AI could not have done.

9) If an AI produces a new treatment for cancer that is cheaper and better than what is known I do not think we will care that an AI did it (though we will have humans check it). Is mathematics similar? Do we care that an AI did it?  Medicine primarily values outcomes; Mathematics also values understanding. Does reading a proof that AI did give the same understanding as coming up with it yourself?  Can AI help with that as well?


10) I suggest two futures:

ONE: While this AI-generated (or AI-assisted) result is impressive, it will be a rare occurrence. This result was actually a counterexample. The needed math was known. The result was interesting. This is a perfect storm that we might not see again for a while.

TWO: Even before the AI revolution, when I came up with a math problem I wanted solved, I would seek help, perhaps too early. My curiosity far exceeds my ego.  Since AI makes it easy to get help, my fear is that eventually we will all be Bill Gasarch---scary.


By gasarch

TR26-086 | A Note on Second-Order Expected Maximum-Load Bounds for Binary Linear Hashing | Nader Bshouty

from ECCC Papers

Let $S\subseteq {\mathbb F}_2^u$ have size $n=2^\ell$, and let $h:{\mathbb F}_2^u\to {\mathbb F}_2^\ell$ be a uniformly random linear map. For $y\in{\mathbb F}_2^\ell$, write ${load}_h(y):=|h^{-1}(y)\cap S|$, and let $M(S,h):=\max_{y\in{\mathbb F}_2^\ell}\{load}_h(y)$ be the maximum load. Jaber, Kumar and Zuckerman (STOC 2025) proved that the expected maximum load of $h$ on $S$ is at most $16\log n/\log\log n$, matching the fully independent keys-into-bins scale up to constants. Their proof also gives the tail estimate \[ \Pr\left[ M(S,h)\ge R\frac{\log n}{\log\log n} \right] \le O\left(\frac{1}{R^{2}}\right). \] We record a base optimization in their exponential-potential method showing that binary linear hashing nearly matches fully independent hashing also at the level of the second-order maximum-load scale. For every $R>1$ satisfying $R\ell^{1-1/R}\ge D\ln\ell$, where $D$ is an absolute constant, we prove \[ \Pr\left[ M(S,h)\ge R\frac{\log n}{\log\log n} \right] \le O\left( \frac{(\log\log n)^2}{R^2(\log n)^{2-2/R}} \right). \] Integrating this tail yields \[ {\mathbb E}[M(S,h)] \le \left( 1+ (1+o(1)) \frac{\log\log\log n}{\log\log n} \right) \frac{\log n}{\log\log n}. \] Thus binary linear hashing matches fully independent hashing in the leading term and matches the dominant second-order correction up to a $1+o(1)$ factor. We also prove, by an independent self-contained argument, a sharp tail bound for one prescribed bucket: for fixed $y\in{\mathbb F}_2^\ell$, \[ \Pr[{load}_h(y)>2^a-2]\le \gamma^{-1}2^{-a^2}, \] where $\gamma=\prod_{j\ge1}(1-2^{-j})$. A subspace construction shows that this is asymptotically tight even in the leading constant as $a\to\infty$. However, this controls only a fixed bucket; a direct union bound over all buckets loses a factor $2^\ell$.

Let $S\subseteq {\mathbb F}_2^u$ have size $n=2^\ell$, and let $h:{\mathbb F}_2^u\to {\mathbb F}_2^\ell$ be a uniformly random linear map. For $y\in{\mathbb F}_2^\ell$, write ${load}_h(y):=|h^{-1}(y)\cap S|$, and let $M(S,h):=\max_{y\in{\mathbb F}_2^\ell}\{load}_h(y)$ be the maximum load. Jaber, Kumar and Zuckerman (STOC 2025) proved that the expected maximum load of $h$ on $S$ is at most $16\log n/\log\log n$, matching the fully independent keys-into-bins scale up to constants. Their proof also gives the tail estimate \[ \Pr\left[ M(S,h)\ge R\frac{\log n}{\log\log n} \right] \le O\left(\frac{1}{R^{2}}\right). \] We record a base optimization in their exponential-potential method showing that binary linear hashing nearly matches fully independent hashing also at the level of the second-order maximum-load scale. For every $R>1$ satisfying $R\ell^{1-1/R}\ge D\ln\ell$, where $D$ is an absolute constant, we prove \[ \Pr\left[ M(S,h)\ge R\frac{\log n}{\log\log n} \right] \le O\left( \frac{(\log\log n)^2}{R^2(\log n)^{2-2/R}} \right). \] Integrating this tail yields \[ {\mathbb E}[M(S,h)] \le \left( 1+ (1+o(1)) \frac{\log\log\log n}{\log\log n} \right) \frac{\log n}{\log\log n}. \] Thus binary linear hashing matches fully independent hashing in the leading term and matches the dominant second-order correction up to a $1+o(1)$ factor. We also prove, by an independent self-contained argument, a sharp tail bound for one prescribed bucket: for fixed $y\in{\mathbb F}_2^\ell$, \[ \Pr[{load}_h(y)>2^a-2]\le \gamma^{-1}2^{-a^2}, \] where $\gamma=\prod_{j\ge1}(1-2^{-j})$. A subspace construction shows that this is asymptotically tight even in the leading constant as $a\to\infty$. However, this controls only a fixed bucket; a direct union bound over all buckets loses a factor $2^\ell$.

TR26-085 | On Parallel Complexity of Arboricity in Structured Graphs | Archit Chauhan, Himanshi Singh, Rohit Gurjar, Sujoy Bhore

from ECCC Papers

We study the parallel complexity of computing the arboricity of a graph, defined as the minimum number of forests into which its edges can be partitioned. For graphs of bounded treewidth, we present a simple dynamic programming–based parallel algorithm that constructs an optimal partition of the edges into forests. For graphs of bounded genus, we give an alternative and simple parallel algorithm for computing arboricity by adapting Goldberg’s method for finding dense subgraphs.

We study the parallel complexity of computing the arboricity of a graph, defined as the minimum number of forests into which its edges can be partitioned. For graphs of bounded treewidth, we present a simple dynamic programming–based parallel algorithm that constructs an optimal partition of the edges into forests. For graphs of bounded genus, we give an alternative and simple parallel algorithm for computing arboricity by adapting Goldberg’s method for finding dense subgraphs.

Saturday, May 23

The Branch and the Wish

from Nisheeth Vishnoi

On mathematics, machines, and what the proof never held When I was young, I learned about the great Sanskrit poet and playwright Kālidāsa, the author of Śakuntalā and other works whose beauty has survived across centuries. But the story I remember most about him is about a man sitting on a tree branch and cutting the very […]

On mathematics, machines, and what the proof never held


When I was young, I learned about the great Sanskrit poet and playwright Kālidāsa, the author of Śakuntalā and other works whose beauty has survived across centuries. But the story I remember most about him is about a man sitting on a tree branch and cutting the very branch on which he sits. It was usually told as a joke about stupidity, but I could never hear it only that way. The man was absorbed in the motion of the blade, unable to see the branch beneath him. He was not failing at the task. He was doing exactly what he had set out to do, letting the task fill the whole field of attention.

Mathematics has always needed proof. Without it, a claim remains too dependent on the person who made it. Proof allows an insight to be checked by others, long after the circumstances of its discovery have disappeared. It is one of the most beautiful inventions of human thought.

But I have begun to wonder whether, over the centuries, something else happened alongside this beauty. 

Full essay: https://nisheethvishnoi.substack.com/

By nisheethvishnoi

Friday, May 22

A sharp interaction-degree threshold for simulating QAOA

from arXiv: Computational Complexity

Authors: Ralfs Āboliņš, Andris Ambainis

We identify a sharp interaction-degree threshold for the classical simulation of QAOA with $2$-local cost functions. At degree $3$, classical sampling from depth-$1$ QAOA with small multiplicative error would collapse the polynomial hierarchy to its third level. At degree $2$, exact classical sampling from depth-$p$ QAOA on $n$ qubits runs in time $n^{O(1)}$ whenever $p = O(\log n)$. The hard degree-$3$ instances have trivially optimizable cost functions, so sampling hardness does not by itself imply a quantum optimization advantage.

Authors: Ralfs Āboliņš, Andris Ambainis

We identify a sharp interaction-degree threshold for the classical simulation of QAOA with $2$-local cost functions. At degree $3$, classical sampling from depth-$1$ QAOA with small multiplicative error would collapse the polynomial hierarchy to its third level. At degree $2$, exact classical sampling from depth-$p$ QAOA on $n$ qubits runs in time $n^{O(1)}$ whenever $p = O(\log n)$. The hard degree-$3$ instances have trivially optimizable cost functions, so sampling hardness does not by itself imply a quantum optimization advantage.

Quoridor is PSPACE-Hard

from arXiv: Computational Complexity

Authors: Marius Drop, Benjamin G. Rin, Finn van der Velde

Quoridor is an award-winning abstract strategy game designed by Mirko Marchesi and published in 1997. Similar games include Maze Attack, Blockade (also known as Cul-de-sac), and Pinko Pallino. In line with chess, checkers, Go, and other classic combinatorial games, Quoridor is a turn-based, deterministic, perfect-information game played on a square grid. We show that it is PSPACE-complete to determine whether a given player has a winning strategy in a given Quoridor position on a board with size $n \times n$. We prove this by reduction from Gpos(POS CNF), a Boolean formula game originally defined in 1978 by T. Schaefer.

Authors: Marius Drop, Benjamin G. Rin, Finn van der Velde

Quoridor is an award-winning abstract strategy game designed by Mirko Marchesi and published in 1997. Similar games include Maze Attack, Blockade (also known as Cul-de-sac), and Pinko Pallino. In line with chess, checkers, Go, and other classic combinatorial games, Quoridor is a turn-based, deterministic, perfect-information game played on a square grid. We show that it is PSPACE-complete to determine whether a given player has a winning strategy in a given Quoridor position on a board with size $n \times n$. We prove this by reduction from Gpos(POS CNF), a Boolean formula game originally defined in 1978 by T. Schaefer.

A Simple Sub-Polynomial Degree Coboundary Expander

from arXiv: Computational Complexity

Authors: Max Hopkins, Arka Ray

High dimensional expanders simultaneously satisfying spectral and combinatorial (coboundary) expansion have recently played a major role in breakthroughs in PCP and coding theory, but the only known construction of such complexes is extremely involved, requiring deep algebraic number theory. In this work, we give an extremely simple combinatorial construction of a sub-polynomial degree complex based on projections of the flags complex (subspace chains) that is (i) a local spectral expander, (ii) a coboundary expander, and (iii) a swap coboundary expander. As a corollary, we also give the first near-linear size combinatorial hypergraphs with good agreement tests in the '1%' regime, and a simple PCP construction with near-linear size.

Authors: Max Hopkins, Arka Ray

High dimensional expanders simultaneously satisfying spectral and combinatorial (coboundary) expansion have recently played a major role in breakthroughs in PCP and coding theory, but the only known construction of such complexes is extremely involved, requiring deep algebraic number theory. In this work, we give an extremely simple combinatorial construction of a sub-polynomial degree complex based on projections of the flags complex (subspace chains) that is (i) a local spectral expander, (ii) a coboundary expander, and (iii) a swap coboundary expander. As a corollary, we also give the first near-linear size combinatorial hypergraphs with good agreement tests in the '1%' regime, and a simple PCP construction with near-linear size.

Maximum-Weight Two Boxes Symmetric Difference Problem

from arXiv: Computational Geometry

Authors: José Fernández Goycoolea, Luis H. Herrera, Pablo Pérez Lantero, Carlos Seara

Let $P$ be a set of $n$ points in the plane, where each element of $P$ is assigned a weight $ω(p)$, positive or negative. In this paper, we present an algorithm that runs in $O(n^4\log n)$ time and $O(n)$ space to find two possibly overlapping axis-aligned rectangles $A$ and $B$ so as to maximize the total weight of the points contained in the symmetric difference of $A$ and $B$. The same optimization framework can easily be adapted to solve related problems such as to maximize the total weight in the symmetric difference of $k \geq 3$ boxes and/or in the union of $k \geq 2$ boxes.

Authors: José Fernández Goycoolea, Luis H. Herrera, Pablo Pérez Lantero, Carlos Seara

Let $P$ be a set of $n$ points in the plane, where each element of $P$ is assigned a weight $ω(p)$, positive or negative. In this paper, we present an algorithm that runs in $O(n^4\log n)$ time and $O(n)$ space to find two possibly overlapping axis-aligned rectangles $A$ and $B$ so as to maximize the total weight of the points contained in the symmetric difference of $A$ and $B$. The same optimization framework can easily be adapted to solve related problems such as to maximize the total weight in the symmetric difference of $k \geq 3$ boxes and/or in the union of $k \geq 2$ boxes.

A geometric modelling framework to support the design of heterogeneous lattice structures with non-linearly varying geometry

from arXiv: Computational Geometry

Authors: Nikita Letov, Yaoyao Fiona Zhao

Geometric modelling has been a crucial component of the design process ever since the introduction of the first Computer-Aided Design (CAD) systems. Additive Manufacturing (AM) pushes design freedom to previously unachievable limits. AM allows the manufacturing of lattice structures which are otherwise close to impossible to be manufactured conventionally. Yet, the geometric modelling of heterogeneous lattice structures is still greatly limited. Thus, the AM industry is now in a situation where the manufacturing capabilities exceed the geometric modelling capabilities. While there have been advancements in the modelling of heterogeneous lattice structures, the review of relevant literature revealed critical limitations of the existing approaches. These limitations include their inability to model non-linear variation of geometric parameters, as well as the limited amount of controllable geometric parameters. This work presents a novel geometric modelling methodology based on function representation as an attempt to bridge this gap. The proposed approach avoids the manual definition of geometric parameters and provides a method to control them with mathematical functions instead. A software prototype implementing the proposed approach is presented, and several use-cases are analysed.

Authors: Nikita Letov, Yaoyao Fiona Zhao

Geometric modelling has been a crucial component of the design process ever since the introduction of the first Computer-Aided Design (CAD) systems. Additive Manufacturing (AM) pushes design freedom to previously unachievable limits. AM allows the manufacturing of lattice structures which are otherwise close to impossible to be manufactured conventionally. Yet, the geometric modelling of heterogeneous lattice structures is still greatly limited. Thus, the AM industry is now in a situation where the manufacturing capabilities exceed the geometric modelling capabilities. While there have been advancements in the modelling of heterogeneous lattice structures, the review of relevant literature revealed critical limitations of the existing approaches. These limitations include their inability to model non-linear variation of geometric parameters, as well as the limited amount of controllable geometric parameters. This work presents a novel geometric modelling methodology based on function representation as an attempt to bridge this gap. The proposed approach avoids the manual definition of geometric parameters and provides a method to control them with mathematical functions instead. A software prototype implementing the proposed approach is presented, and several use-cases are analysed.

Exact Uniform L1 Spacing for Solow-Polasky Diversity on Lines and Ordered Pareto Fronts

from arXiv: Computational Geometry

Authors: Michael T. M. Emmerich, Mahboubeh Nezhadmoghaddam, Jesús Guillermo Falcón Cardona

We study fixed-cardinality maximization of the inverse-matrix Solow--Polasky diversity, equivalently finite metric magnitude for the exponential kernel, on one-dimensional and ordered metric sets. The analysis starts from the known finite-line gap formula for the exponential kernel, which writes the excess inverse-matrix diversity as a sum of functions of consecutive gaps. Building on this formula, the main interval theorem proves that, for every $k\geq 2$, the unique maximizing $k$-point subset of $[0,1]$ is the equally spaced set. Thus the objective selects a uniform gap representation on the real line. A converse kernel proposition shows that, among normalized non-increasing distance kernels, requiring the corresponding adjacent-gap additive structure forces the exponential family. Further results transfer the interval theorem to ordered $\ell_1$ (L1, or Manhattan) curves by isometry: the maximizing sets are uniform in accumulated $\ell_1$ length. As a consequence, monotone biobjective Pareto fronts admit Solow--Polasky optimal finite approximations that are uniformly spaced in accumulated objective-space change, a natural representation when all parts of a continuous front should be covered. Examples, including a dense connected front and a finite disconnected ZDT3 front, illustrate how the continuous uniform-gap result appears on discrete candidate sets. Solow-Polasky diversity; diversity measures; finite metric magnitude; L1 distance; uniform spacing; Pareto-front approximation; multiobjective optimization; fixed-cardinality subset selection

Authors: Michael T. M. Emmerich, Mahboubeh Nezhadmoghaddam, Jesús Guillermo Falcón Cardona

We study fixed-cardinality maximization of the inverse-matrix Solow--Polasky diversity, equivalently finite metric magnitude for the exponential kernel, on one-dimensional and ordered metric sets. The analysis starts from the known finite-line gap formula for the exponential kernel, which writes the excess inverse-matrix diversity as a sum of functions of consecutive gaps. Building on this formula, the main interval theorem proves that, for every $k\geq 2$, the unique maximizing $k$-point subset of $[0,1]$ is the equally spaced set. Thus the objective selects a uniform gap representation on the real line. A converse kernel proposition shows that, among normalized non-increasing distance kernels, requiring the corresponding adjacent-gap additive structure forces the exponential family. Further results transfer the interval theorem to ordered $\ell_1$ (L1, or Manhattan) curves by isometry: the maximizing sets are uniform in accumulated $\ell_1$ length. As a consequence, monotone biobjective Pareto fronts admit Solow--Polasky optimal finite approximations that are uniformly spaced in accumulated objective-space change, a natural representation when all parts of a continuous front should be covered. Examples, including a dense connected front and a finite disconnected ZDT3 front, illustrate how the continuous uniform-gap result appears on discrete candidate sets. Solow-Polasky diversity; diversity measures; finite metric magnitude; L1 distance; uniform spacing; Pareto-front approximation; multiobjective optimization; fixed-cardinality subset selection

Bifunction and Interlevel Delaunay Trifiltrations

from arXiv: Computational Geometry

Authors: Ángel Javier Alonso, Michael Kerber, Tung Lam, Michael Lesnick, Abhishek Rathod

A key property of the Delaunay filtration is that it is topologically (i.e., weakly) equivalent to the offset (union-of-balls) filtration. Recently, this filtration has been extended to point clouds equipped with an $\mathbb{R}$-valued function, yielding a computable 2-parameter filtration that satisfies an analogous weak equivalence. Motivated in part by the study of time-varying data, we introduce a 3-parameter extension of the Delaunay filtration for point clouds equipped with an $\mathbb{R}^2$-valued function, also satisfying an analogous weak equivalence. For a point cloud $X \subset \mathbb{R}^d$, our trifiltration has size $O\bigl(|X|^{\lceil(d+1)/2\rceil+1}\bigr)$. We present an algorithm that computes this trifiltration in time $O\bigl(|X|^{\lceil d/2\rceil+2}\bigr)$, together with an implementation. Our experiments demonstrate that implementation can handle thousands of points in $\mathbb{R}^3$, with memory growth that is nearly linear.

Authors: Ángel Javier Alonso, Michael Kerber, Tung Lam, Michael Lesnick, Abhishek Rathod

A key property of the Delaunay filtration is that it is topologically (i.e., weakly) equivalent to the offset (union-of-balls) filtration. Recently, this filtration has been extended to point clouds equipped with an $\mathbb{R}$-valued function, yielding a computable 2-parameter filtration that satisfies an analogous weak equivalence. Motivated in part by the study of time-varying data, we introduce a 3-parameter extension of the Delaunay filtration for point clouds equipped with an $\mathbb{R}^2$-valued function, also satisfying an analogous weak equivalence. For a point cloud $X \subset \mathbb{R}^d$, our trifiltration has size $O\bigl(|X|^{\lceil(d+1)/2\rceil+1}\bigr)$. We present an algorithm that computes this trifiltration in time $O\bigl(|X|^{\lceil d/2\rceil+2}\bigr)$, together with an implementation. Our experiments demonstrate that implementation can handle thousands of points in $\mathbb{R}^3$, with memory growth that is nearly linear.

On the Parameterized Complexity of Min-Sum-Radii

from arXiv: Data Structures and Algorithms

Authors: Pankaj Kumar, Haiko Müller, Sebastian Ordyniak, Melanie Schmidt

In the Min-Sum-Radii (MSR) clustering problem, we are given a finite set X of n points in a metric space. The objective is to find at most k clusters centered at a subset of these points such that every point of X is assigned to one of the clusters, minimizing the sum of the radii of the clusters. The problem is known to be NP-hard even on metrics induced by weighted planar graphs and metrics with constant doubling dimension, as shown by Gibson et al. (SWAT 2008). In this work, we investigate the parameterized complexity of MSR on metrics induced by undirected graphs. We distinguish between weighted graph metrics (with positive edge weights) and unweighted graph metrics (where all edges have unit weight). Weighted Graph Metrics: We show that MSR is W[1]-hard on metrics induced by weighted bipartite graphs, when parameterized by the combined parameter k (the number of clusters) and Delta (the cost of the clustering). We then investigate the structural parameterized complexity of the problem. Drexler et al. (arXiv:2310.02130) showed that the MSR problem admits an XP algorithm on metrics induced by weighted graphs when parameterized by treewidth, and asked whether this can be improved to fixed-parameter tractability. We first answer their question in the negative, and more strongly show that MSR stays W[1]-hard on metrics induced by undirected weighted bipartite graphs when parameterized by the vertex cover number plus k. We then turn our attention to parameters for dense graphs and show that MSR remains W[1]-hard when parameterized by k+Delta even on cliques and complete bipartite graphs. On the positive side, we employ the known XP algorithm parameterized by treewidth, to show that the MSR problem is FPT when parameterized by the parameter treewidth plus Delta.

Authors: Pankaj Kumar, Haiko Müller, Sebastian Ordyniak, Melanie Schmidt

In the Min-Sum-Radii (MSR) clustering problem, we are given a finite set X of n points in a metric space. The objective is to find at most k clusters centered at a subset of these points such that every point of X is assigned to one of the clusters, minimizing the sum of the radii of the clusters. The problem is known to be NP-hard even on metrics induced by weighted planar graphs and metrics with constant doubling dimension, as shown by Gibson et al. (SWAT 2008). In this work, we investigate the parameterized complexity of MSR on metrics induced by undirected graphs. We distinguish between weighted graph metrics (with positive edge weights) and unweighted graph metrics (where all edges have unit weight). Weighted Graph Metrics: We show that MSR is W[1]-hard on metrics induced by weighted bipartite graphs, when parameterized by the combined parameter k (the number of clusters) and Delta (the cost of the clustering). We then investigate the structural parameterized complexity of the problem. Drexler et al. (arXiv:2310.02130) showed that the MSR problem admits an XP algorithm on metrics induced by weighted graphs when parameterized by treewidth, and asked whether this can be improved to fixed-parameter tractability. We first answer their question in the negative, and more strongly show that MSR stays W[1]-hard on metrics induced by undirected weighted bipartite graphs when parameterized by the vertex cover number plus k. We then turn our attention to parameters for dense graphs and show that MSR remains W[1]-hard when parameterized by k+Delta even on cliques and complete bipartite graphs. On the positive side, we employ the known XP algorithm parameterized by treewidth, to show that the MSR problem is FPT when parameterized by the parameter treewidth plus Delta.

Asymptotic Rank Speedup Theorems, Revisited

from arXiv: Data Structures and Algorithms

Authors: Josh Alman, Baitian Li

Motivated by fast matrix multiplication and recent connections between asymptotic tensor rank and fine-grained complexity, we revisit classical tools from the matrix multiplication literature and develop a framework for obtaining improved asymptotic rank upper bounds for tensors beyond matrix multiplication. In the 1980s, Coppersmith-Winograd and Strassen discovered a series of speedup theorems for asymptotic rank: in certain regimes, one can extract additional terms from a border rank upper bound on a tensor $T$, and then use these terms to obtain an improved asymptotic rank of $T$. We establish general speedup theorems that subsume these results and enable quantitative improvements. Two representative applications are: (1) The asymptotic rank of the small Coppersmith-Winograd tensor $\mathrm{cw}_q$ is less than its border rank. For instance, we prove the asymptotic rank of $\mathrm{cw}_2$ is smaller than $3.931$, improving on $\underline{\mathrm{R}}(\mathrm{cw}_2)=4$. It is known that if the asymptotic rank of $\mathrm{cw}_2$ equals $3$, this would imply $ω=2$. (2) A general improvement over Strassen's bound: we obtain an upper bound below $d^{2ω/3}$ on the asymptotic rank of any $d\times d\times d$ tensor. To make full use of speedups, we analyze degenerations in which both sides are nontrivial direct sums, a setting where the optimal quantitative bound one can achieve was previously unclear. We do so via an approach we call Strassen calculus: a systematic method for converting such degeneration data into explicit asymptotic rank bounds using Strassen's theory of the asymptotic spectrum.

Authors: Josh Alman, Baitian Li

Motivated by fast matrix multiplication and recent connections between asymptotic tensor rank and fine-grained complexity, we revisit classical tools from the matrix multiplication literature and develop a framework for obtaining improved asymptotic rank upper bounds for tensors beyond matrix multiplication. In the 1980s, Coppersmith-Winograd and Strassen discovered a series of speedup theorems for asymptotic rank: in certain regimes, one can extract additional terms from a border rank upper bound on a tensor $T$, and then use these terms to obtain an improved asymptotic rank of $T$. We establish general speedup theorems that subsume these results and enable quantitative improvements. Two representative applications are: (1) The asymptotic rank of the small Coppersmith-Winograd tensor $\mathrm{cw}_q$ is less than its border rank. For instance, we prove the asymptotic rank of $\mathrm{cw}_2$ is smaller than $3.931$, improving on $\underline{\mathrm{R}}(\mathrm{cw}_2)=4$. It is known that if the asymptotic rank of $\mathrm{cw}_2$ equals $3$, this would imply $ω=2$. (2) A general improvement over Strassen's bound: we obtain an upper bound below $d^{2ω/3}$ on the asymptotic rank of any $d\times d\times d$ tensor. To make full use of speedups, we analyze degenerations in which both sides are nontrivial direct sums, a setting where the optimal quantitative bound one can achieve was previously unclear. We do so via an approach we call Strassen calculus: a systematic method for converting such degeneration data into explicit asymptotic rank bounds using Strassen's theory of the asymptotic spectrum.

Optimal Testing of Reed-Muller Codes with an Online Adversary

from arXiv: Data Structures and Algorithms

Authors: Esty Kelman, Uri Meir, Kai Zhe Zheng

Motivated by applications to property testing in the online-erasure model of Kalemaj, Raskhodnikova, and Varma (ITCS 2022 and Theory of Computing 2023), we define and analyze {\em semi-sample-based testers} for Reed-Muller codes. The task in Reed-Muller testing is to determine whether an input function $f: \F^n \to \F$ belongs to the Reed-Muller code or is far from it, using as few point queries to $f$ as possible. Reed-Muller testing is a well-studied task with its roots in both the Property Testing and Probabilistically Checkable Proofs literature. The online-erasure model introduces a twist: after each query made, an adversary may erase up to $t$ points of the input function, potentially thwarting any test in which the queries follow a predictable pattern. Semi-sample-based testers are a hybrid between sample-based testers -- which can only make uniformly random queries to the input function -- and standard testers, which can choose their queries freely. They are designed with the online-erasure model in mind and operate by first choosing some subset $S$ of the domain and then making their queries uniformly at random inside of $S$. We describe semi-sample-based testers for the Reed-Muller code and give an optimal analysis of their soundness. Consequently, we show that semi-sample-based testers are indeed effective in the presence of online erasures, and thereby achieve optimal query complexity for testing the Reed-Muller code in the online-erasure model. This result improves upon prior work of Minzer and Zheng (SODA 2024). As an added bonus, we show that semi-sample-based testers also exist for the lifted affine-invariant codes of Guo, Kopparty, and Sudan (ITCS 2013), thereby providing the first known testers for these codes in the online-erasure model.

Authors: Esty Kelman, Uri Meir, Kai Zhe Zheng

Motivated by applications to property testing in the online-erasure model of Kalemaj, Raskhodnikova, and Varma (ITCS 2022 and Theory of Computing 2023), we define and analyze {\em semi-sample-based testers} for Reed-Muller codes. The task in Reed-Muller testing is to determine whether an input function $f: \F^n \to \F$ belongs to the Reed-Muller code or is far from it, using as few point queries to $f$ as possible. Reed-Muller testing is a well-studied task with its roots in both the Property Testing and Probabilistically Checkable Proofs literature. The online-erasure model introduces a twist: after each query made, an adversary may erase up to $t$ points of the input function, potentially thwarting any test in which the queries follow a predictable pattern. Semi-sample-based testers are a hybrid between sample-based testers -- which can only make uniformly random queries to the input function -- and standard testers, which can choose their queries freely. They are designed with the online-erasure model in mind and operate by first choosing some subset $S$ of the domain and then making their queries uniformly at random inside of $S$. We describe semi-sample-based testers for the Reed-Muller code and give an optimal analysis of their soundness. Consequently, we show that semi-sample-based testers are indeed effective in the presence of online erasures, and thereby achieve optimal query complexity for testing the Reed-Muller code in the online-erasure model. This result improves upon prior work of Minzer and Zheng (SODA 2024). As an added bonus, we show that semi-sample-based testers also exist for the lifted affine-invariant codes of Guo, Kopparty, and Sudan (ITCS 2013), thereby providing the first known testers for these codes in the online-erasure model.

Lumberjack: Better Differentially Private Random Forests through Heavy Hitter Detection in Trees

from arXiv: Data Structures and Algorithms

Authors: Christian Janos Lebeda, David Erb, Tudor Cebere, Aurélien Bellet

Random forests are widely used in fields involving sensitive tabular data, but existing approaches to enforcing differential privacy (DP) typically degrade performance to the point of impracticality. In this paper, we introduce Lumberjack, a differentially private random forest algorithm that achieves substantially higher utility by constructing large random decision trees and then applying aggressive, privacy-preserving pruning to retain only sufficiently populated nodes. A key component of our approach is a novel $(\varepsilon,δ)$-DP heavy hitter detection algorithm for hierarchical data, whose error is $O_{\varepsilon,δ}(\sqrt{\log h})$ for trees of height $h$ and may be of independent interest. This favorable scaling enables the use of significantly deeper trees than in prior work, leading to improved expressiveness under privacy constraints. Our empirical evaluation on benchmark datasets shows that Lumberjack consistently outperforms prior DP random forest methods, establishing a new state of the art. In particular, our approach yields substantial improvements in the privacy-utility trade-off for practical privacy budgets. Our findings suggest that carefully designed DP random forests can close much of the utility gap, highlighting a promising and underexplored direction for future research.

Authors: Christian Janos Lebeda, David Erb, Tudor Cebere, Aurélien Bellet

Random forests are widely used in fields involving sensitive tabular data, but existing approaches to enforcing differential privacy (DP) typically degrade performance to the point of impracticality. In this paper, we introduce Lumberjack, a differentially private random forest algorithm that achieves substantially higher utility by constructing large random decision trees and then applying aggressive, privacy-preserving pruning to retain only sufficiently populated nodes. A key component of our approach is a novel $(\varepsilon,δ)$-DP heavy hitter detection algorithm for hierarchical data, whose error is $O_{\varepsilon,δ}(\sqrt{\log h})$ for trees of height $h$ and may be of independent interest. This favorable scaling enables the use of significantly deeper trees than in prior work, leading to improved expressiveness under privacy constraints. Our empirical evaluation on benchmark datasets shows that Lumberjack consistently outperforms prior DP random forest methods, establishing a new state of the art. In particular, our approach yields substantial improvements in the privacy-utility trade-off for practical privacy budgets. Our findings suggest that carefully designed DP random forests can close much of the utility gap, highlighting a promising and underexplored direction for future research.

The Secretary Problem with a Stochastic Precursor

from arXiv: Data Structures and Algorithms

Authors: Franziska Eberle, Alexander Lindermayr

In learning-augmented online algorithms, predictions are usually valued for what they say: a value estimate, a solution, or an algorithmic recommendation. This paper shows that predictions can also be valuable solely due to their arrival time. We study the fundamental secretary problem augmented with a stochastic precursor: a content-free signal that is guaranteed to arrive no later than the best item, but is otherwise stochastically timed. The signal does not carry any additional information; nevertheless, its timing alone changes the structure of optimal stopping. We characterize optimal policies in the random-order and adversarial-order models. In random order, a single uniformly timed precursor already gives success probability at least $\frac12$, improving on the classic $\frac1e$ benchmark. With increasingly late precursors, the success probability approaches $1$. In adversarial order, for which traditional models do not admit strong guarantees, sufficiently concentrated precursors recover constant success guarantees. Our results show that such novel forms of asynchronous temporal information are a distinct and powerful form of advice in online decision making and may also be effective for other problems.

Authors: Franziska Eberle, Alexander Lindermayr

In learning-augmented online algorithms, predictions are usually valued for what they say: a value estimate, a solution, or an algorithmic recommendation. This paper shows that predictions can also be valuable solely due to their arrival time. We study the fundamental secretary problem augmented with a stochastic precursor: a content-free signal that is guaranteed to arrive no later than the best item, but is otherwise stochastically timed. The signal does not carry any additional information; nevertheless, its timing alone changes the structure of optimal stopping. We characterize optimal policies in the random-order and adversarial-order models. In random order, a single uniformly timed precursor already gives success probability at least $\frac12$, improving on the classic $\frac1e$ benchmark. With increasingly late precursors, the success probability approaches $1$. In adversarial order, for which traditional models do not admit strong guarantees, sufficiently concentrated precursors recover constant success guarantees. Our results show that such novel forms of asynchronous temporal information are a distinct and powerful form of advice in online decision making and may also be effective for other problems.

A Coalgebraic Dijkstra Algorithm

from arXiv: Data Structures and Algorithms

Authors: Takahiro Sanada, Yoàv Montacute, Kittiphon Phalakarn, Ichiro Hasuo

The Dijkstra algorithm is a classical method for solving the shortest path problem on weighted graphs. There are several variations of the Dijkstra algorithm, including algorithms for the widest path problem and for two-player games. In this paper, we introduce the coalgebraic shortest path problem (CSPP), a unifying framework for a broad class of optimization problems on state-transition systems. This framework encompasses not only the aforementioned problems but also new ones such as the shortest binary tree problem. We further present a coalgebraic Dijkstra algorithm for solving the CSPP efficiently under a suitable condition. Our condition is necessary and sufficient for the algorithm to return correct solutions, thereby providing a precise criterion for when Dijkstra-style acceleration is possible. We also show that the proposed algorithm achieves asymptotic complexity comparable to that of the classical Dijkstra algorithm.

Authors: Takahiro Sanada, Yoàv Montacute, Kittiphon Phalakarn, Ichiro Hasuo

The Dijkstra algorithm is a classical method for solving the shortest path problem on weighted graphs. There are several variations of the Dijkstra algorithm, including algorithms for the widest path problem and for two-player games. In this paper, we introduce the coalgebraic shortest path problem (CSPP), a unifying framework for a broad class of optimization problems on state-transition systems. This framework encompasses not only the aforementioned problems but also new ones such as the shortest binary tree problem. We further present a coalgebraic Dijkstra algorithm for solving the CSPP efficiently under a suitable condition. Our condition is necessary and sufficient for the algorithm to return correct solutions, thereby providing a precise criterion for when Dijkstra-style acceleration is possible. We also show that the proposed algorithm achieves asymptotic complexity comparable to that of the classical Dijkstra algorithm.

Optimal $e^{(γ+o(1))n}$-Approximation of the Permanent of Positive Semidefinite Matrices

from arXiv: Data Structures and Algorithms

Authors: Nima Anari, Farzam Ebrahimnejad

We determine, up to lower-order terms in the exponent, the best possible deterministic polynomial-time approximation ratio for the permanent of a Hermitian positive semidefinite matrix. If $A\succeq 0$ has no zero diagonal entry, $d=\operatorname{rank}(A)$, $A=VV^\dagger$ with $V\in\mathbb{C}^{n\times d}$ full column rank, and $v_1,\ldots,v_n$ are the rows of $V$, define \[ Φ(V)=\max_{X\succ 0} \left\{\sum_{i=1}^n \log(v_i^\dagger Xv_i)+\log\det X-\operatorname{tr} X+d\right\}, \qquad \widehat P(A)=e^{Φ(V)}. \] We prove the exact sandwich \[ e^{-γn}\widehat P(A)\le \operatorname{per}(A)\le \widehat P(A). \] Here $γ$ is the Euler--Mascheroni constant. Since the maximization is concave, this gives a deterministic polynomial-time $e^{(γ+\varepsilon)n}$-approximation for every $\varepsilon>0$. Combined with the previous $e^{(γ-\varepsilon)n}$-hardness of approximation for positive semidefinite permanents, this resolves the optimal exponential approximation ratio for deterministic polynomial-time algorithms as $e^{(γ+o(1))n}$, assuming $\mathrm{P}\ne\mathrm{NP}$. The proof is an entropy argument applied to the standard Wick integral formula for $\operatorname{per}(A)$; the loss is exactly $γ$ per factor because $\mathbb{E}[\log T]=-γ$ for $T\sim\operatorname{Exp}(1)$. The result was obtained through interactions with GPT 5.5 Pro Extended: the first author's interaction was one-shot, while the second author's was a separate multi-turn interaction with high-level guidance. Both authors verified the theorem and proof. Codex was used to assemble and typeset the manuscript.

Authors: Nima Anari, Farzam Ebrahimnejad

We determine, up to lower-order terms in the exponent, the best possible deterministic polynomial-time approximation ratio for the permanent of a Hermitian positive semidefinite matrix. If $A\succeq 0$ has no zero diagonal entry, $d=\operatorname{rank}(A)$, $A=VV^\dagger$ with $V\in\mathbb{C}^{n\times d}$ full column rank, and $v_1,\ldots,v_n$ are the rows of $V$, define \[ Φ(V)=\max_{X\succ 0} \left\{\sum_{i=1}^n \log(v_i^\dagger Xv_i)+\log\det X-\operatorname{tr} X+d\right\}, \qquad \widehat P(A)=e^{Φ(V)}. \] We prove the exact sandwich \[ e^{-γn}\widehat P(A)\le \operatorname{per}(A)\le \widehat P(A). \] Here $γ$ is the Euler--Mascheroni constant. Since the maximization is concave, this gives a deterministic polynomial-time $e^{(γ+\varepsilon)n}$-approximation for every $\varepsilon>0$. Combined with the previous $e^{(γ-\varepsilon)n}$-hardness of approximation for positive semidefinite permanents, this resolves the optimal exponential approximation ratio for deterministic polynomial-time algorithms as $e^{(γ+o(1))n}$, assuming $\mathrm{P}\ne\mathrm{NP}$. The proof is an entropy argument applied to the standard Wick integral formula for $\operatorname{per}(A)$; the loss is exactly $γ$ per factor because $\mathbb{E}[\log T]=-γ$ for $T\sim\operatorname{Exp}(1)$. The result was obtained through interactions with GPT 5.5 Pro Extended: the first author's interaction was one-shot, while the second author's was a separate multi-turn interaction with high-level guidance. Both authors verified the theorem and proof. Codex was used to assemble and typeset the manuscript.

Minimum Sum Set Cover: Structures and Algorithm

from arXiv: Data Structures and Algorithms

Authors: Zhongyi Zhang, Yixin Cao

A set cover of a hypergraph $H$ is a set of vertices intersecting every hyperedge. In the minimum sum set cover problem, vertices are selected one by one; each edge pays the position of the first vertex that hits it, and the objective is to minimize the total cost. When $H$ is a graph, this is the minimum sum vertex cover problem. A solution is specified by a set cover $S$ together with an ordering of its vertices. While the classical set cover problem seeks to minimize $|S|$, the minimum sum variant favors covering many edges early and may prefer larger covers. This motivates a natural question: how large can the gap between~$\overrightarrowτ$ and $τ$ be? We prove an upper bound $\overrightarrowτ \le τ\log_{2} \lvert E(H)\rvert$, and show that for any positive~$n$, there exists a hypergraph $H$ on $n + 3$ vertices with $τ=3$ and $\overrightarrowτ=n$. For graphs, we obtain stronger bounds: we prove~$\overrightarrowτ \le 2τ\log_{2} τ$, improving the bound of Liu et al.\ [Theor. Comput. Sci., 2025], and we construct graphs with~$\overrightarrowτ = Ω\left( \frac{τ\log τ}{\log\log τ}\right)$, nearly matching this upper bound. On the algorithmic side, we show that minimum sum set cover is fixed-parameter tractable on bounded-rank hypergraphs, parameterized by~$\overrightarrowτ$, extending the algorithm of Liu et al.\ for graphs (i.e., rank-two hypergraphs).

Authors: Zhongyi Zhang, Yixin Cao

A set cover of a hypergraph $H$ is a set of vertices intersecting every hyperedge. In the minimum sum set cover problem, vertices are selected one by one; each edge pays the position of the first vertex that hits it, and the objective is to minimize the total cost. When $H$ is a graph, this is the minimum sum vertex cover problem. A solution is specified by a set cover $S$ together with an ordering of its vertices. While the classical set cover problem seeks to minimize $|S|$, the minimum sum variant favors covering many edges early and may prefer larger covers. This motivates a natural question: how large can the gap between~$\overrightarrowτ$ and $τ$ be? We prove an upper bound $\overrightarrowτ \le τ\log_{2} \lvert E(H)\rvert$, and show that for any positive~$n$, there exists a hypergraph $H$ on $n + 3$ vertices with $τ=3$ and $\overrightarrowτ=n$. For graphs, we obtain stronger bounds: we prove~$\overrightarrowτ \le 2τ\log_{2} τ$, improving the bound of Liu et al.\ [Theor. Comput. Sci., 2025], and we construct graphs with~$\overrightarrowτ = Ω\left( \frac{τ\log τ}{\log\log τ}\right)$, nearly matching this upper bound. On the algorithmic side, we show that minimum sum set cover is fixed-parameter tractable on bounded-rank hypergraphs, parameterized by~$\overrightarrowτ$, extending the algorithm of Liu et al.\ for graphs (i.e., rank-two hypergraphs).

Robust Statistical Estimators with Bounded Empirical Sensitivity

from arXiv: Data Structures and Algorithms

Authors: Valentio Iverson, Gautam Kamath, Argyris Mouzakis, Adam Smith

We introduce a new measure of robustness for statistical estimators, which we call \emph{empirical sensitivity}. An estimator $\hat θ$ has bounded empirical sensitivity if, with high probability over a dataset $X = (X_1, \dots, X_n) \sim \mathcal{D}^{\otimes n}$, for any dataset $Y$ obtained by modifying at most $ηn$ points in $X$, we have that $\hat θ(Y)$ is close to $\hat θ(X)$. We study bounds on this quantity for the prototypical problem of Gaussian mean estimation. We prove new lower bounds, showing that for any estimator $\hat μ$ which achieves an optimal $\ell_2$-error bound of $O\left(\sqrt{d/n}\right)$, the empirical sensitivity is at least $Ω\left(η+ \sqrt{ηd/n}\right)$. The two terms arise due to obstructions on the mean and variance (via an Efron-Stein argument) of such an estimator. We show that this bound is tight up to logarithmic factors, by employing recent results for robust empirical mean estimation.

Authors: Valentio Iverson, Gautam Kamath, Argyris Mouzakis, Adam Smith

We introduce a new measure of robustness for statistical estimators, which we call \emph{empirical sensitivity}. An estimator $\hat θ$ has bounded empirical sensitivity if, with high probability over a dataset $X = (X_1, \dots, X_n) \sim \mathcal{D}^{\otimes n}$, for any dataset $Y$ obtained by modifying at most $ηn$ points in $X$, we have that $\hat θ(Y)$ is close to $\hat θ(X)$. We study bounds on this quantity for the prototypical problem of Gaussian mean estimation. We prove new lower bounds, showing that for any estimator $\hat μ$ which achieves an optimal $\ell_2$-error bound of $O\left(\sqrt{d/n}\right)$, the empirical sensitivity is at least $Ω\left(η+ \sqrt{ηd/n}\right)$. The two terms arise due to obstructions on the mean and variance (via an Efron-Stein argument) of such an estimator. We show that this bound is tight up to logarithmic factors, by employing recent results for robust empirical mean estimation.

An $Ω(n \log n)$ Randomized Lower Bound for Cutting a Cake into Proportionally Fair Pieces

from arXiv: Data Structures and Algorithms

Authors: Stephen Arndt, Kirk Pruhs, Trung Tran

We consider the classic cake cutting problem in the Robertson-Webb model, with the objective of proportional fairness. We show that any randomized algorithm must use $Ω(n \log n)$ queries.

Authors: Stephen Arndt, Kirk Pruhs, Trung Tran

We consider the classic cake cutting problem in the Robertson-Webb model, with the objective of proportional fairness. We show that any randomized algorithm must use $Ω(n \log n)$ queries.

Finding a Solution to the Erdős-Ginzburg-Ziv Theorem in Linear Time

from arXiv: Data Structures and Algorithms

Authors: Sunghyeon Jo

The Erdős-Ginzburg-Ziv theorem states that every sequence of 2n - 1 integers contains a subsequence of length n whose sum is divisible by n. Choi, Kang, and Lim gave a simple deterministic O(n log n) algorithm for finding such a subsequence, and Leung recently improved this to O(n log log log n). We give a deterministic linear-time algorithm. The core is a linear-time algorithm for the following prime target subset-sum problem: given p - 1 nonzero residues in Z_p and a target residue, find a subset with the prescribed sum. Our algorithm maintains a compact arithmetic-progression representation of reachable sums. When two progressions intersect, a bounded Frobenius interval in their sum allows them to be merged into one longer progression, with enough growth to pay for the update. When the representation either contains a full progression or covers all nonzero residues, the target residue is recovered constructively. The standard multiplicative reduction then extends the prime algorithm to arbitrary moduli.

Authors: Sunghyeon Jo

The Erdős-Ginzburg-Ziv theorem states that every sequence of 2n - 1 integers contains a subsequence of length n whose sum is divisible by n. Choi, Kang, and Lim gave a simple deterministic O(n log n) algorithm for finding such a subsequence, and Leung recently improved this to O(n log log log n). We give a deterministic linear-time algorithm. The core is a linear-time algorithm for the following prime target subset-sum problem: given p - 1 nonzero residues in Z_p and a target residue, find a subset with the prescribed sum. Our algorithm maintains a compact arithmetic-progression representation of reachable sums. When two progressions intersect, a bounded Frobenius interval in their sum allows them to be merged into one longer progression, with enough growth to pay for the update. When the representation either contains a full progression or covers all nonzero residues, the target residue is recovered constructively. The standard multiplicative reduction then extends the prime algorithm to arbitrary moduli.

Near-Optimal Generalized Private Testing

from arXiv: Data Structures and Algorithms

Authors: Anamay Chaturvedi, Monika Henzinger, Jalaj Upadhyay

In differential privacy (DP), the generalized private testing problem was introduced by Liu and Talwar (STOC 2019). Given a dataset $X \in \mathcal{X}$ and a sequence of black-box $\varepsilon_t$-DP mechanisms $M_t:\mathcal{X}\to\{+1,-1\}$, the analyst must accept the first mechanism whose success probability $p_t=\Pr[M_t(X)=+1]$ exceeds a given threshold $p^*\in(0,1)$, while achieving DP. Accuracy is measured by the gap between $p^*$ and a rejection threshold $\bar{p}$, such that with probability $1-β$ for all $t\geq1$, if $p_t\leq\bar{p}$, then $M_t$ is rejected, and if $p_t\geq p^*$, then it is accepted. This generalizes the standard private testing problem, whose solution, the Sparse Vector Technique, is ubiquitous in DP. We introduce the Generalized Thresholding Mechanism (GTM) for generalized private testing. For $\varepsilon>0$ and any sequence of $(\varepsilon_t,δ_t)$-DP mechanisms $M_t$, the GTM is pure $\varepsilon$-DP. For $θ>0$, $γ\in(1,2]$, and $β\in(0,1)$, $\bar{p}_t=\max(p^*/γΛ_t, 1 - γΛ_t(1-p^*))-δ_t/\varepsilon_t$ for $Λ_t=(5t\ln^3(t+2))^{(2+θ)\varepsilon_t/\varepsilon}(4/β)^{(3+θ+2/θ)\varepsilon_t/\varepsilon}$. With probability $1-β$, the number of evaluations of $M_t$ is at most $O((\ln(t/β)/(γ-1)^2)\max(Λ_t/p^*,(1-p^*)^{-1}))$ for all $t\geq 1$. Our lower bounds prove near-optimality of our accuracy and sample complexity guarantees. Via the GTM, we give a black-box reduction for DP optimization from the continual observation (CO) setting to the batch setting. This gives us the first DP-CO algorithms for many maximization problems. Further, the GTM permits an adaptive choice of acceptance thresholds $(p^*_t)_{t\geq1}$, addressing a challenge mentioned in prior work on using generalized private testing for hyperparameter optimization (Papernot and Steinke (ICLR 2022)).

Authors: Anamay Chaturvedi, Monika Henzinger, Jalaj Upadhyay

In differential privacy (DP), the generalized private testing problem was introduced by Liu and Talwar (STOC 2019). Given a dataset $X \in \mathcal{X}$ and a sequence of black-box $\varepsilon_t$-DP mechanisms $M_t:\mathcal{X}\to\{+1,-1\}$, the analyst must accept the first mechanism whose success probability $p_t=\Pr[M_t(X)=+1]$ exceeds a given threshold $p^*\in(0,1)$, while achieving DP. Accuracy is measured by the gap between $p^*$ and a rejection threshold $\bar{p}$, such that with probability $1-β$ for all $t\geq1$, if $p_t\leq\bar{p}$, then $M_t$ is rejected, and if $p_t\geq p^*$, then it is accepted. This generalizes the standard private testing problem, whose solution, the Sparse Vector Technique, is ubiquitous in DP. We introduce the Generalized Thresholding Mechanism (GTM) for generalized private testing. For $\varepsilon>0$ and any sequence of $(\varepsilon_t,δ_t)$-DP mechanisms $M_t$, the GTM is pure $\varepsilon$-DP. For $θ>0$, $γ\in(1,2]$, and $β\in(0,1)$, $\bar{p}_t=\max(p^*/γΛ_t, 1 - γΛ_t(1-p^*))-δ_t/\varepsilon_t$ for $Λ_t=(5t\ln^3(t+2))^{(2+θ)\varepsilon_t/\varepsilon}(4/β)^{(3+θ+2/θ)\varepsilon_t/\varepsilon}$. With probability $1-β$, the number of evaluations of $M_t$ is at most $O((\ln(t/β)/(γ-1)^2)\max(Λ_t/p^*,(1-p^*)^{-1}))$ for all $t\geq 1$. Our lower bounds prove near-optimality of our accuracy and sample complexity guarantees. Via the GTM, we give a black-box reduction for DP optimization from the continual observation (CO) setting to the batch setting. This gives us the first DP-CO algorithms for many maximization problems. Further, the GTM permits an adaptive choice of acceptance thresholds $(p^*_t)_{t\geq1}$, addressing a challenge mentioned in prior work on using generalized private testing for hyperparameter optimization (Papernot and Steinke (ICLR 2022)).

Thursday, May 21

Interview with Naoki Kobayashi, CONCUR 2026 ToT Award recipient

from Luca Aceto

 As mentioned in a previous post, Naoki Kobayashi will receive one of the two CONCUR 2026 Test-of-Time Awards at CONCUR 2026. Naoki has kindly agreed to answer some questions of mine on his award-winning paper via email. I am delighted to post his answers below and hope you'll enjoy reading them as much as I did. Thanks, Naoki!

Luca: You receive the CONCUR ToT Award 2026 for your paper  A New Type System for Deadlock-Free Processes, which appeared at CONCUR 2006. That article is the culmination of a series of contributions you gave on the development of type systems for variations on the pi-calculus that guarantee deadlock-freedom. Could you briefly explain to our readers how you came to study the question addressed in your award-winning article and how the main ideas in your CONCUR 2006 paper evolved over time? Which of the ideas and results in your paper did you find most pleasing, surprising or challenging? 

Naoki: If I remember correctly, the idea of using types to ensure deadlock-freedom came up during discussions on the linear pi-calculus with Benjamin Pierce and David Turner [1]. Indeed, if one wants to ensure that a channel is really used exactly once, one also has to ensure that the process does not get stuck before using it. An initial result on type systems for deadlock-freedom was published in LICS 1997 [2].

After that, the type system was extended in two directions: enabling automated type inference and increasing expressiveness. These were somewhat conflicting goals, and the CONCUR 2006 paper achieved a balance between them. The core idea was as follows. To ensure deadlock-freedom, it is necessary to control dependencies among different channels. In [2], such dependencies were expressed using “time tags” and a possibly infinite partial order on them. The time tags were later replaced by natural numbers, called capability and obligation levels, to enable automated inference [3], but at the cost of expressive power. The CONCUR 2006 paper combined a restricted form of time tags with capability and obligation levels, thereby recovering much of the expressive power while retaining automated inference.

What I found most pleasing was precisely this balance: the paper showed that one could make the type system significantly more practical without completely giving up the conceptual elegance and expressiveness of the earlier formulation.

[1] Naoki Kobayashi, Benjamin C. Pierce, David N. Turner, Linearity and the Pi-Calculus. POPL 1996: 358-37
[2] Naoki Kobayashi, A Partially Deadlock-Free Typed Process Calculus. LICS 1997: 128-139
[3] Naoki Kobayashi, Type-based information flow analysis for the pi-calculus. Acta Informatica 42(4-5): 291-347 (2005)
Luca: The contribution in your article achieves a good trade-off between expressiveness of the type system and ease of implementing its associated type inference algorithm. Did you or any other researcher try to extend the range of examples that could be handled by a variation on your type system? Do you think that it would be worth doing so or have we hit the point of diminishing returns, where any advance would be very technical and hard to implement?
Naoki: With Elena Giachino and Cosimo Laneve [4], I indeed investigated such an extension. The system developed there can deal with infinite chains of dependencies more general than those handled in the CONCUR 2006 paper. It was, however, developed for value-passing CCS rather than the pi-calculus. Incorporating the idea of [4] into the pi-calculus setting remains future work. Another possible direction would be to combine the approach with generic types [5], which can capture precise dependencies among a set of channels. Unfortunately, I did not have time to fully explore this direction.
As for whether it is worth pushing this line further, my feeling is that further general-purpose extensions may easily become technically involved and hard to implement. Perhaps a more promising direction would be to adapt the type system to real concurrent languages such as Go, and then see what kinds of extensions are most beneficial in practice.

[4] Elena Giachino, Naoki Kobayashi, Cosimo Laneve, Deadlock Analysis of Unbounded Process Networks. CONCUR 2014: 63-77[5] Atsushi Igarashi, Naoki Kobayashi, A generic type system for the Pi-calculus. Theor. Comput. Sci. 311(1-3): 121-163 (2004)
Luca: Commendably, you supported the theoretical developments in your article with an implementation in TyPiCal. What role did the implementation work play in your research on type systems guaranteeing deadlock-freedom? Did the theoretical developments benefit from the implementation? Do you think that every proposal for a type system for some language ought to be accompanied by an implementation? 
Naoki: TyPiCal was initially developed as a proof of concept for the type inference algorithm of [3]. At that time, the theory had already been fully developed, and the implementation was mainly used to convince readers that the algorithm indeed works in practice. In later extensions of TyPiCal, for the CONCUR 2006 paper and the work of [6], the implementation was also useful for checking the theories.
I think it is desirable for every type-system proposal to be accompanied by an implementation, although I would not say that it is absolutely necessary. An implementation is useful not only for checking the theory, but also for identifying limitations of the theory and suggesting possible directions for improvement.

[6] Naoki Kobayashi, Davide Sangiorgi, A Hybrid Type System for Lock-Freedom of Mobile Processes. CAV 2008: 80-93
Luca:  In your paper, you showed, amongst other things, that the simply-typed λ-calculus with recursion could be encoded in the deadlock-free fragment of your calculus. What role did this result play in convincing you that your type system was sufficiently expressive? What were the main challenges that you had to overcome in developing that encoding and what were the main inspirations for that result?
Naoki: The encodability of the simply-typed λ-calculus with recursion in the deadlock-free fragment was a sort of minimal requirement for expressiveness. Since that property had already been shown for the original deadlock-free type system [2], I wanted to show that a similar level of expressive power was retained in the CONCUR 2006 type system. 
There was also a more conceptual reason for considering such an encoding. Deadlock-freedom corresponds to the progress property of well-typed functional programs: evaluation does not get stuck. Thus, the encodability result shows that this progress property can be preserved by compilation from possibly parallel functional programs into the pi-calculus, which may be viewed as a CPS-style, lower-level language with concurrency primitives. 
The main challenge was how to give a finite representation, within the type system, of the infinite chains of causal dependencies induced by recursion. As mentioned in the answer to the first question, we achieved this by combining the ideas of capability/obligation levels from [3] with a partial order on time tags from [2].
Luca: Your article gave a seminal contribution to the study of type systems for the pi-calculus that guarantee some desirable properties. Other examples of such contributions are, for instance, your paper with Davide Sangiorgi on a type system for lock-freedom or the paper by Yuxin Deng and Davide Sangiorgi on ensuring termination by typability amongst others. In hindsight, are there any relations amongst those contributions or ideas that underlie them and that can guide further developments in a systematic fashion? 
Naoki: All three type systems you mentioned are related in that they control causal dependencies using natural numbers called “levels.” The type system for deadlock-freedom forbids cyclic dependencies, while the type systems for lock-freedom and termination forbid infinite chains of dependencies. The formulation of such reasoning through types, however, suffers from inherent limitations in expressive power. 
To address this limitation, my paper with Davide Sangiorgi [7] suggested a new direction: combining syntactic typing rules with semantic ones. The latter can be discharged by resorting to model checkers, interactive theorem provers, or other verification tools, thereby complementing the limitations of type systems. I think this combination of type-based reasoning and semantic verification may provide a useful guide for further developments: types can capture common structural patterns, while semantic methods can handle cases that fall outside the reach of purely syntactic typing rules. 
[7] Naoki Kobayashi, Davide Sangiorgi, A hybrid type system for lock-freedom of mobile processes. ACM Trans. Program. Lang. Syst. 32(5): 16:1-16:49 (2010)
Luca: Are there any problems that you left open in your award-winning article that you'd still love to see solved? 
Naoki: I would still like to see at least two directions explored further. First, increasing the expressiveness of the type system by integrating the ideas of [4] and [5], mentioned above, would probably be necessary to make the approach practical. Second, since the type-based approach is inherently incomplete, finding a smooth integration with other methods, such as model checking and interactive theorem proving, remains an important open problem, as suggested in [6].
Luca: How did the results and the techniques you developed in your award-winning paper influence your subsequent research? Is there any result obtained by other researchers that builds on, or is related to, your work and that you like in particular? 
Naoki: A key ingredient of our type systems for deadlock-freedom is the notion of obligations and capabilities. An obligation of a send or receive operation describes the need to perform that operation, while a capability describes a guarantee that the operation can succeed. For example, in a client-server model, a client has the capability to send a request successfully, while a server has an obligation to receive and handle the request. In lock-based mutual exclusion, a thread has the capability to eventually acquire a lock, and, after acquiring the lock, it has an obligation to eventually release the lock. 
These dual notions of obligations and capabilities turned out to be useful in other contexts. For example, we adapted them for automated verification of security protocols [8]. Another line of work developed related ideas for more conventional concurrent programming languages. Leino et al. [9] applied related techniques to deadlock-freedom of channels and locks, and this line was further extended to monitors and condition variables by Hamin and Jacobs [10]. I found this line of work particularly interesting because it shows that similar ideas arise naturally also in a more conventional programming-language setting. 
[8] Morten Dahl, Naoki Kobayashi, Yunde Sun, Hans Hüttel, Type-Based Automated Verification of Authenticity in Asymmetric Cryptographic Protocols. ATVA 2011: 75-89 [9] K. Rustan M. Leino, Peter Müller, Jan Smans, Deadlock-Free Channels and Locks. ESOP 2010: 407-426
[10] Jafar Hamin, Bart Jacobs, Deadlock-Free Monitors. ESOP 2018: 415-441
Luca: What are the research topics related to type systems that you find most interesting right now? What role do you think type systems will or should have, if any, in the era of coding agents based on GenAI?
Naoki:  I think GenAI may change not only type systems, but also the focus of formal methods more broadly. Traditionally, programs have been written by humans, and therefore “lightweight formal methods” have played an important role. In that setting, it was important to keep program annotations minimal and to automate verification or bug finding as much as possible, even if this sometimes came at the cost of precision or completeness. 
In the era of GenAI-based coding agents, however, the situation may change. If coding agents can also generate annotations such as types and loop invariants, then the burden of writing annotations may no longer be such a serious issue. As a result, verification methods that aim at strong correctness guarantees, rather than merely finding bugs or checking lightweight properties, may become more important. 
In this respect, the CONCUR 2006 type system can be seen as a product of the former design philosophy: it emphasized automation, at some cost in precision. A possible future direction would be to move beyond such lightweight type systems, by incorporating dependent types or combining type systems with other methods such as model checking, aiming at more complete verification methods.
Luca: What advice would you give to a young researcher who is keen to start working on topics related to types for programming languages? 

Naoki: As discussed above, GenAI is rapidly changing the landscape of PL research, and we are facing significant challenges in adapting to these changes and finding new research directions to explore. I think this is a very good opportunity for “AI-native” young researchers to play important roles. My advice would be to learn the classical foundations carefully, but at the same time not to be constrained by traditional assumptions about how programs are written and verified.

By Luca Aceto

 As mentioned in a previous post, Naoki Kobayashi will receive one of the two CONCUR 2026 Test-of-Time Awards at CONCUR 2026. Naoki has kindly agreed to answer some questions of mine on his award-winning paper via email. I am delighted to post his answers below and hope you'll enjoy reading them as much as I did. Thanks, Naoki!

Luca: You receive the CONCUR ToT Award 2026 for your paper  A New Type System for Deadlock-Free Processes, which appeared at CONCUR 2006. That article is the culmination of a series of contributions you gave on the development of type systems for variations on the pi-calculus that guarantee deadlock-freedom. Could you briefly explain to our readers how you came to study the question addressed in your award-winning article and how the main ideas in your CONCUR 2006 paper evolved over time? Which of the ideas and results in your paper did you find most pleasing, surprising or challenging? 

Naoki: If I remember correctly, the idea of using types to ensure deadlock-freedom came up during discussions on the linear pi-calculus with Benjamin Pierce and David Turner [1]. Indeed, if one wants to ensure that a channel is really used exactly once, one also has to ensure that the process does not get stuck before using it. An initial result on type systems for deadlock-freedom was published in LICS 1997 [2].

After that, the type system was extended in two directions: enabling automated type inference and increasing expressiveness. These were somewhat conflicting goals, and the CONCUR 2006 paper achieved a balance between them. The core idea was as follows. To ensure deadlock-freedom, it is necessary to control dependencies among different channels. In [2], such dependencies were expressed using “time tags” and a possibly infinite partial order on them. The time tags were later replaced by natural numbers, called capability and obligation levels, to enable automated inference [3], but at the cost of expressive power. The CONCUR 2006 paper combined a restricted form of time tags with capability and obligation levels, thereby recovering much of the expressive power while retaining automated inference.

What I found most pleasing was precisely this balance: the paper showed that one could make the type system significantly more practical without completely giving up the conceptual elegance and expressiveness of the earlier formulation.

[1] Naoki Kobayashi, Benjamin C. Pierce, David N. Turner, Linearity and the Pi-Calculus. POPL 1996: 358-37
[2] Naoki Kobayashi, A Partially Deadlock-Free Typed Process Calculus. LICS 1997: 128-139
[3] Naoki Kobayashi, Type-based information flow analysis for the pi-calculus. Acta Informatica 42(4-5): 291-347 (2005)

Luca: The contribution in your article achieves a good trade-off between expressiveness of the type system and ease of implementing its associated type inference algorithm. Did you or any other researcher try to extend the range of examples that could be handled by a variation on your type system? Do you think that it would be worth doing so or have we hit the point of diminishing returns, where any advance would be very technical and hard to implement?

Naoki: With Elena Giachino and Cosimo Laneve [4], I indeed investigated such an extension. The system developed there can deal with infinite chains of dependencies more general than those handled in the CONCUR 2006 paper. It was, however, developed for value-passing CCS rather than the pi-calculus. Incorporating the idea of [4] into the pi-calculus setting remains future work.
Another possible direction would be to combine the approach with generic types [5], which can capture precise dependencies among a set of channels. Unfortunately, I did not have time to fully explore this direction.
As for whether it is worth pushing this line further, my feeling is that further general-purpose extensions may easily become technically involved and hard to implement. Perhaps a more promising direction would be to adapt the type system to real concurrent languages such as Go, and then see what kinds of extensions are most beneficial in practice.

[4] Elena Giachino, Naoki Kobayashi, Cosimo Laneve, Deadlock Analysis of Unbounded Process Networks. CONCUR 2014: 63-77
[5] Atsushi Igarashi, Naoki Kobayashi, A generic type system for the Pi-calculus. Theor. Comput. Sci. 311(1-3): 121-163 (2004)

Luca: Commendably, you supported the theoretical developments in your article with an implementation in TyPiCal. What role did the implementation work play in your research on type systems guaranteeing deadlock-freedom? Did the theoretical developments benefit from the implementation? Do you think that every proposal for a type system for some language ought to be accompanied by an implementation? 

Naoki: TyPiCal was initially developed as a proof of concept for the type inference algorithm of [3]. At that time, the theory had already been fully developed, and the implementation was mainly used to convince readers that the algorithm indeed works in practice. In later extensions of TyPiCal, for the CONCUR 2006 paper and the work of [6], the implementation was also useful for checking the theories.

I think it is desirable for every type-system proposal to be accompanied by an implementation, although I would not say that it is absolutely necessary. An implementation is useful not only for checking the theory, but also for identifying limitations of the theory and suggesting possible directions for improvement.

[6] Naoki Kobayashi, Davide Sangiorgi, A Hybrid Type System for Lock-Freedom of Mobile Processes. CAV 2008: 80-93

Luca:  In your paper, you showed, amongst other things, that the simply-typed λ-calculus with recursion could be encoded in the deadlock-free fragment of your calculus. What role did this result play in convincing you that your type system was sufficiently expressive? What were the main challenges that you had to overcome in developing that encoding and what were the main inspirations for that result?

Naoki: The encodability of the simply-typed λ-calculus with recursion in the deadlock-free fragment was a sort of minimal requirement for expressiveness. Since that property had already been shown for the original deadlock-free type system [2], I wanted to show that a similar level of expressive power was retained in the CONCUR 2006 type system. 

There was also a more conceptual reason for considering such an encoding. Deadlock-freedom corresponds to the progress property of well-typed functional programs: evaluation does not get stuck. Thus, the encodability result shows that this progress property can be preserved by compilation from possibly parallel functional programs into the pi-calculus, which may be viewed as a CPS-style, lower-level language with concurrency primitives. 

The main challenge was how to give a finite representation, within the type system, of the infinite chains of causal dependencies induced by recursion. As mentioned in the answer to the first question, we achieved this by combining the ideas of capability/obligation levels from [3] with a partial order on time tags from [2].

Luca: Your article gave a seminal contribution to the study of type systems for the pi-calculus that guarantee some desirable properties. Other examples of such contributions are, for instance, your paper with Davide Sangiorgi on a type system for lock-freedom or the paper by Yuxin Deng and Davide Sangiorgi on ensuring termination by typability amongst others. In hindsight, are there any relations amongst those contributions or ideas that underlie them and that can guide further developments in a systematic fashion? 

Naoki: All three type systems you mentioned are related in that they control causal dependencies using natural numbers called “levels.” The type system for deadlock-freedom forbids cyclic dependencies, while the type systems for lock-freedom and termination forbid infinite chains of dependencies. The formulation of such reasoning through types, however, suffers from inherent limitations in expressive power. 

To address this limitation, my paper with Davide Sangiorgi [7] suggested a new direction: combining syntactic typing rules with semantic ones. The latter can be discharged by resorting to model checkers, interactive theorem provers, or other verification tools, thereby complementing the limitations of type systems. I think this combination of type-based reasoning and semantic verification may provide a useful guide for further developments: types can capture common structural patterns, while semantic methods can handle cases that fall outside the reach of purely syntactic typing rules. 

[7] Naoki Kobayashi, Davide Sangiorgi, A hybrid type system for lock-freedom of mobile processes. ACM Trans. Program. Lang. Syst. 32(5): 16:1-16:49 (2010)

Luca: Are there any problems that you left open in your award-winning article that you'd still love to see solved? 

Naoki: I would still like to see at least two directions explored further. First, increasing the expressiveness of the type system by integrating the ideas of [4] and [5], mentioned above, would probably be necessary to make the approach practical. Second, since the type-based approach is inherently incomplete, finding a smooth integration with other methods, such as model checking and interactive theorem proving, remains an important open problem, as suggested in [6].

Luca: How did the results and the techniques you developed in your award-winning paper influence your subsequent research? Is there any result obtained by other researchers that builds on, or is related to, your work and that you like in particular? 

Naoki: A key ingredient of our type systems for deadlock-freedom is the notion of obligations and capabilities. An obligation of a send or receive operation describes the need to perform that operation, while a capability describes a guarantee that the operation can succeed. For example, in a client-server model, a client has the capability to send a request successfully, while a server has an obligation to receive and handle the request. In lock-based mutual exclusion, a thread has the capability to eventually acquire a lock, and, after acquiring the lock, it has an obligation to eventually release the lock. 

These dual notions of obligations and capabilities turned out to be useful in other contexts. For example, we adapted them for automated verification of security protocols [8]. Another line of work developed related ideas for more conventional concurrent programming languages. Leino et al. [9] applied related techniques to deadlock-freedom of channels and locks, and this line was further extended to monitors and condition variables by Hamin and Jacobs [10]. I found this line of work particularly interesting because it shows that similar ideas arise naturally also in a more conventional programming-language setting. 

[8] Morten Dahl, Naoki Kobayashi, Yunde Sun, Hans Hüttel, Type-Based Automated Verification of Authenticity in Asymmetric Cryptographic Protocols. ATVA 2011: 75-89
[9] K. Rustan M. Leino, Peter Müller, Jan Smans, Deadlock-Free Channels and Locks. ESOP 2010: 407-426
[10] Jafar Hamin, Bart Jacobs, Deadlock-Free Monitors. ESOP 2018: 415-441

Luca: What are the research topics related to type systems that you find most interesting right now? What role do you think type systems will or should have, if any, in the era of coding agents based on GenAI?

Naoki:  I think GenAI may change not only type systems, but also the focus of formal methods more broadly. Traditionally, programs have been written by humans, and therefore “lightweight formal methods” have played an important role. In that setting, it was important to keep program annotations minimal and to automate verification or bug finding as much as possible, even if this sometimes came at the cost of precision or completeness. 

In the era of GenAI-based coding agents, however, the situation may change. If coding agents can also generate annotations such as types and loop invariants, then the burden of writing annotations may no longer be such a serious issue. As a result, verification methods that aim at strong correctness guarantees, rather than merely finding bugs or checking lightweight properties, may become more important. 

In this respect, the CONCUR 2006 type system can be seen as a product of the former design philosophy: it emphasized automation, at some cost in precision. A possible future direction would be to move beyond such lightweight type systems, by incorporating dependent types or combining type systems with other methods such as model checking, aiming at more complete verification methods.

Luca: What advice would you give to a young researcher who is keen to start working on topics related to types for programming languages? 

Naoki: As discussed above, GenAI is rapidly changing the landscape of PL research, and we are facing significant challenges in adapting to these changes and finding new research directions to explore. I think this is a very good opportunity for “AI-native” young researchers to play important roles. My advice would be to learn the classical foundations carefully, but at the same time not to be constrained by traditional assumptions about how programs are written and verified.

By Luca Aceto

Linear Functional Testing with General Loadings in Sparse Regression: Separation Rates and Computational Barriers

from arXiv: Computational Complexity

Authors: Jie Xie, Dongming Huang

We study the problem of testing $H_0: ξ^\topβ=t_0$ in high-dimensional sparse linear regression with Gaussian random design and unknown design covariance. The loading vector $ξ$ is arbitrary, and the exact sparsity level $k$ is unknown but bounded by a known value $k_u$. Tests are required to control Type I error uniformly over the $k_u$-sparse null, while power is evaluated against $k$-sparse alternatives. We construct a computationally efficient mixed test that gives an upper bound on the adaptive separation distance and establish an information-theoretic lower bound calibrated to the magnitude profile of $ξ$. In the ultra-sparse regime $k_u\lesssim \sqrt n/\log p$, these bounds characterize the adaptive separation rate up to logarithmic factors for arbitrary $ξ$. In the moderately sparse regime $\sqrt n/\log p\ll k_u\lesssim n/\log p$, these bounds match for several classes of loading vectors but may differ in general. In this regime, we further prove a low-degree lower bound that matches the upper bound up to logarithmic factors. This provides evidence that improving on the rate of the mixed test, if statistically possible, may be computationally hard. For flat sparse loadings, we complement this evidence with a polynomial-time reduction from sparse CCA. Finally, we examine how information about the design covariance affects the adaptive separation rate in two settings. Under a sparse signed-spiked covariance model, the information-theoretic lower bound is attainable up to logarithmic factors by a computationally inefficient procedure, while the low-degree lower bound and sparse-CCA reduction continue to apply, providing evidence for a statistical-computational gap. When the design covariance is known and diagonal, the adaptive separation rate takes the same form as in the ultra-sparse regime.

Authors: Jie Xie, Dongming Huang

We study the problem of testing $H_0: ξ^\topβ=t_0$ in high-dimensional sparse linear regression with Gaussian random design and unknown design covariance. The loading vector $ξ$ is arbitrary, and the exact sparsity level $k$ is unknown but bounded by a known value $k_u$. Tests are required to control Type I error uniformly over the $k_u$-sparse null, while power is evaluated against $k$-sparse alternatives. We construct a computationally efficient mixed test that gives an upper bound on the adaptive separation distance and establish an information-theoretic lower bound calibrated to the magnitude profile of $ξ$. In the ultra-sparse regime $k_u\lesssim \sqrt n/\log p$, these bounds characterize the adaptive separation rate up to logarithmic factors for arbitrary $ξ$. In the moderately sparse regime $\sqrt n/\log p\ll k_u\lesssim n/\log p$, these bounds match for several classes of loading vectors but may differ in general. In this regime, we further prove a low-degree lower bound that matches the upper bound up to logarithmic factors. This provides evidence that improving on the rate of the mixed test, if statistically possible, may be computationally hard. For flat sparse loadings, we complement this evidence with a polynomial-time reduction from sparse CCA. Finally, we examine how information about the design covariance affects the adaptive separation rate in two settings. Under a sparse signed-spiked covariance model, the information-theoretic lower bound is attainable up to logarithmic factors by a computationally inefficient procedure, while the low-degree lower bound and sparse-CCA reduction continue to apply, providing evidence for a statistical-computational gap. When the design covariance is known and diagonal, the adaptive separation rate takes the same form as in the ultra-sparse regime.

Towards Single Exponential Time for Temporal and Spatial Reasoning: A Study via Redundancy and Dynamic Programming

from arXiv: Computational Complexity

Authors: Victor Lagerkvist, Johanna Groven, Leif Eriksson

The region connection calculus ($RCC$) and Allen's interval algebra ($IA$) are two well-known NP-hard spatial-temporal qualitative reasoning problems. They are solvable in $2^{O(n \log n)}$ time, where $n$ is the number of variables, and $IA$ is additionally known to be solvable in $o(n)^n$ time. However, no improvement over exhaustive search is known for $RCC$, and if they are also solvable in single exponential time $2^{O(n)}$ is unknown. We investigate multiple avenues towards reaching such bounds. First, we show that branching is insufficient since there are too many non-redundant constraints. Concretely, we classify the maximum number of non-redundant constraints in $RCC$ and $IA$. Algorithmically, we make two significant contributions based on dynamic programming (DP). The first algorithm runs in $4^n$ time and is applicable to a non-trivial, NP-hard fragment of $IA$, which includes the well-known interval graph sandwich problem of Golumbic and Shamir (1993). For the richer $RCC$ problem with 8 basic relations we use a more sophisticated approach which asymptotically matches the $o(n)^n$ bound for $IA$.

Authors: Victor Lagerkvist, Johanna Groven, Leif Eriksson

The region connection calculus ($RCC$) and Allen's interval algebra ($IA$) are two well-known NP-hard spatial-temporal qualitative reasoning problems. They are solvable in $2^{O(n \log n)}$ time, where $n$ is the number of variables, and $IA$ is additionally known to be solvable in $o(n)^n$ time. However, no improvement over exhaustive search is known for $RCC$, and if they are also solvable in single exponential time $2^{O(n)}$ is unknown. We investigate multiple avenues towards reaching such bounds. First, we show that branching is insufficient since there are too many non-redundant constraints. Concretely, we classify the maximum number of non-redundant constraints in $RCC$ and $IA$. Algorithmically, we make two significant contributions based on dynamic programming (DP). The first algorithm runs in $4^n$ time and is applicable to a non-trivial, NP-hard fragment of $IA$, which includes the well-known interval graph sandwich problem of Golumbic and Shamir (1993). For the richer $RCC$ problem with 8 basic relations we use a more sophisticated approach which asymptotically matches the $o(n)^n$ bound for $IA$.

Resource bounded Kučera-Gács Theorems

from arXiv: Computational Complexity

Authors: Satyadev Nandakumar, Akhil S, Chandra Shekhar Tiwari

The Kučera--Gács theorem is a fundamental result in algorithmic randomness. It states that every infinite sequence $X$ is Turing reducible to a Martin-Löf random $R$. This paper studies resource-bounded analogues of the Kučera-Gács Theorem, at the resource bounds of polynomial-time and finite-state computation. We prove a {quasi-polynomial-time}{ Kučera-Gács Theorem}, showing that every infinite sequence $X$ is quasi-polynomial-time reducible to a \emph{polynomial-time random} sequence $R$. We also show that for any $X$, the oracle use of $R$ is $n+o(n)$ bits for obtaining the first $n$ bits of $X$. We then study the relationship between compressibility and Turing reductions, in the polynomial-time setting. We establish that $ρ^-_{\mathsf{poly}}(X) = K_{poly}(X)$, demonstrating that the lower polynomial-time Turing decompression ratio is precisely characterized by the polynomial-time Kolmogorov complexity rate. We note that this characterization fails for the polynomial-time dimension if one-way functions exist, resolving an open problem from Doty's work. We use these results to strengthen the {quasi-polynomial-time}{ Kučera-Gács Theorem}. We show that every infinite sequence $X$ is quasi-polynomial-time reducible to a {polynomial-time random} sequence $R$, where the lower oracle use rate of the reduction is less than ${K}_{poly}(X)$. We also show that any sequence extracted from the (even larger) set of \emph{normal sequences} by a finite-state reduction must have a convergent asymptotic frequency for its symbols. Since sequences lacking this invariant property exist, they cannot be finite-state reduced from any normal sequence. Hence we show that the Kučera-Gács theorem \emph{fails} for finite-state reductions.

Authors: Satyadev Nandakumar, Akhil S, Chandra Shekhar Tiwari

The Kučera--Gács theorem is a fundamental result in algorithmic randomness. It states that every infinite sequence $X$ is Turing reducible to a Martin-Löf random $R$. This paper studies resource-bounded analogues of the Kučera-Gács Theorem, at the resource bounds of polynomial-time and finite-state computation. We prove a {quasi-polynomial-time}{ Kučera-Gács Theorem}, showing that every infinite sequence $X$ is quasi-polynomial-time reducible to a \emph{polynomial-time random} sequence $R$. We also show that for any $X$, the oracle use of $R$ is $n+o(n)$ bits for obtaining the first $n$ bits of $X$. We then study the relationship between compressibility and Turing reductions, in the polynomial-time setting. We establish that $ρ^-_{\mathsf{poly}}(X) = K_{poly}(X)$, demonstrating that the lower polynomial-time Turing decompression ratio is precisely characterized by the polynomial-time Kolmogorov complexity rate. We note that this characterization fails for the polynomial-time dimension if one-way functions exist, resolving an open problem from Doty's work. We use these results to strengthen the {quasi-polynomial-time}{ Kučera-Gács Theorem}. We show that every infinite sequence $X$ is quasi-polynomial-time reducible to a {polynomial-time random} sequence $R$, where the lower oracle use rate of the reduction is less than ${K}_{poly}(X)$. We also show that any sequence extracted from the (even larger) set of \emph{normal sequences} by a finite-state reduction must have a convergent asymptotic frequency for its symbols. Since sequences lacking this invariant property exist, they cannot be finite-state reduced from any normal sequence. Hence we show that the Kučera-Gács theorem \emph{fails} for finite-state reductions.

Polynomial-Time Robust Multiclass Linear Classification under Gaussian Marginals

from arXiv: Data Structures and Algorithms

Authors: Ilias Diakonikolas, Giannis Iakovidis, Mingchen Ma

We study the task of agnostic learning of multiclass linear classifiers under the Gaussian distribution. Given labeled examples $(x, y)$ from a distribution over $\mathbb{R}^d \times [k]$, with Gaussian $x$-marginal, the goal is to output a hypothesis whose error is comparable to that of the best $k$-class linear classifier. While the binary case $k=2$ has a well-developed algorithmic theory, much less is known for $k \ge 3$. Even for $k=3$, prior robust algorithms incur exponential dependence on the inverse of the desired accuracy in both complexity and representation size. In this work, we develop new structural results for multiclass linear classifiers and use them to design fully polynomial-time robust learners with dimension-independent error guarantees. Our first result shows that the standard multiclass perceptron algorithm requires super-polynomially many samples and updates, even with clean labels and Gaussian marginals, revealing a basic obstruction absent in the binary case. Our main positive result is a pairwise improper-learning framework which yields an efficient learner with error $\widetilde O(k^{3/2}\sqrt{\mathrm{opt}})+ε$ for general $k$. Additionally, we develop a sharper localization-based framework which leads to error $O(\mathrm{opt})+ε$ for $k=3$, and error $\mathrm{poly}(k)\mathrm{opt}+ε$ for geometrically regular $k$-class linear classifiers.

Authors: Ilias Diakonikolas, Giannis Iakovidis, Mingchen Ma

We study the task of agnostic learning of multiclass linear classifiers under the Gaussian distribution. Given labeled examples $(x, y)$ from a distribution over $\mathbb{R}^d \times [k]$, with Gaussian $x$-marginal, the goal is to output a hypothesis whose error is comparable to that of the best $k$-class linear classifier. While the binary case $k=2$ has a well-developed algorithmic theory, much less is known for $k \ge 3$. Even for $k=3$, prior robust algorithms incur exponential dependence on the inverse of the desired accuracy in both complexity and representation size. In this work, we develop new structural results for multiclass linear classifiers and use them to design fully polynomial-time robust learners with dimension-independent error guarantees. Our first result shows that the standard multiclass perceptron algorithm requires super-polynomially many samples and updates, even with clean labels and Gaussian marginals, revealing a basic obstruction absent in the binary case. Our main positive result is a pairwise improper-learning framework which yields an efficient learner with error $\widetilde O(k^{3/2}\sqrt{\mathrm{opt}})+ε$ for general $k$. Additionally, we develop a sharper localization-based framework which leads to error $O(\mathrm{opt})+ε$ for $k=3$, and error $\mathrm{poly}(k)\mathrm{opt}+ε$ for geometrically regular $k$-class linear classifiers.

Distributed Stochastic Graph Algorithms

from arXiv: Data Structures and Algorithms

Authors: Keren Censor-Hillel, Aditi Dudeja, George Giakkoupis

We study stochastic graph optimization problems in a novel distributed setting. As in the standard centralized setting, a random subgraph $G^*$ of a known base graph $G$ is realized by including each edge $e$ independently with a known probability $p_e$, and we must solve an optimization problem on $G^*$ despite uncertainty about its edges. In the standard setting, to cope with this uncertainty, the algorithm can query any edge of $G$ to learn if the edge exists in $G^*$, and its complexity is the number of queried edges. The distributed setting incorporates uncertainty in a natural manner, by having each vertex know only about its own edges in $G^*$ (and only communicate over them), and the complexity is measured by the number of synchronous communication rounds. We establish that distributed stochastic algorithms can be drastically faster than their non-stochastic counterparts and overcome known lower bounds, by showing fast distributed approximation algorithms for maximum matching, minimum vertex cover, and minimum dominating set.

Authors: Keren Censor-Hillel, Aditi Dudeja, George Giakkoupis

We study stochastic graph optimization problems in a novel distributed setting. As in the standard centralized setting, a random subgraph $G^*$ of a known base graph $G$ is realized by including each edge $e$ independently with a known probability $p_e$, and we must solve an optimization problem on $G^*$ despite uncertainty about its edges. In the standard setting, to cope with this uncertainty, the algorithm can query any edge of $G$ to learn if the edge exists in $G^*$, and its complexity is the number of queried edges. The distributed setting incorporates uncertainty in a natural manner, by having each vertex know only about its own edges in $G^*$ (and only communicate over them), and the complexity is measured by the number of synchronous communication rounds. We establish that distributed stochastic algorithms can be drastically faster than their non-stochastic counterparts and overcome known lower bounds, by showing fast distributed approximation algorithms for maximum matching, minimum vertex cover, and minimum dominating set.

Efficient Banzhaf-Based Data Valuation for $k$-Nearest Neighbors Classification

from arXiv: Data Structures and Algorithms

Authors: Guangyi Zhang, Lutz Oettershagen, Lixu Wang, Aristides Gionis

Data valuation, the task of quantifying the contribution of individual data points to model performance, has emerged as a fundamental challenge in machine learning. Game-theoretic approaches, such as the Banzhaf value, offer principled frameworks for fair data valuation; however, they suffer from exponential computational complexity. We address this challenge by developing efficient algorithms specifically tailored for computing Banzhaf values in $k$-nearest neighbor ($k$NN) classifiers. We first establish the theoretical hardness of the problem by proving that it is \#P-hard. Despite this intractability, we exploit the locality properties of $k$NN classifiers to develop practical exact algorithms. Our main contribution is a dynamic programming framework that achieves significant computational improvements: we present a pseudo-polynomial algorithm with $O(Wkn^2)$ time complexity for weighted $k$NN classifiers, where $W$ is the maximum sum of top-$k$ weights, and a specialized algorithm for unweighted $k$NN that achieves $O(nk^2)$ time complexity, that is, linear in the number of data points. We also offer efficient Monte Carlo estimation methods. Extensive experiments on real-world datasets demonstrate the practical efficiency of our approach and its effectiveness in data valuation applications.

Authors: Guangyi Zhang, Lutz Oettershagen, Lixu Wang, Aristides Gionis

Data valuation, the task of quantifying the contribution of individual data points to model performance, has emerged as a fundamental challenge in machine learning. Game-theoretic approaches, such as the Banzhaf value, offer principled frameworks for fair data valuation; however, they suffer from exponential computational complexity. We address this challenge by developing efficient algorithms specifically tailored for computing Banzhaf values in $k$-nearest neighbor ($k$NN) classifiers. We first establish the theoretical hardness of the problem by proving that it is \#P-hard. Despite this intractability, we exploit the locality properties of $k$NN classifiers to develop practical exact algorithms. Our main contribution is a dynamic programming framework that achieves significant computational improvements: we present a pseudo-polynomial algorithm with $O(Wkn^2)$ time complexity for weighted $k$NN classifiers, where $W$ is the maximum sum of top-$k$ weights, and a specialized algorithm for unweighted $k$NN that achieves $O(nk^2)$ time complexity, that is, linear in the number of data points. We also offer efficient Monte Carlo estimation methods. Extensive experiments on real-world datasets demonstrate the practical efficiency of our approach and its effectiveness in data valuation applications.

Treewidth of the $n \times n$ toroidal grid

from arXiv: Data Structures and Algorithms

Authors: Tatsuya Gima, Hiraku Morimoto, Yuto Okada, Yota Otachi

In this paper, we show that the treewidth of the $n \times n$ toroidal grid is $2n-1$ for all $n \ge 5$. This closes the gap between the previously known upper bound of $2n-1$ (Ellis and Warren, DAM 2008) and the lower bound of $2n-2$ (Kiyomi, Okamoto, and Otachi, DAM 2016). To establish the matching lower bound, we construct a bramble of maximum order by utilizing maximum components obtained after removing $2n-1$ vertices. Our construction relies on the vertex-isoperimetric properties of the infinite grid to establish tight lower bounds on neighborhood sizes, combined with a careful analysis of balls of radius $n/2-1$ and their boundaries to overcome structural obstructions when $n$ is even.

Authors: Tatsuya Gima, Hiraku Morimoto, Yuto Okada, Yota Otachi

In this paper, we show that the treewidth of the $n \times n$ toroidal grid is $2n-1$ for all $n \ge 5$. This closes the gap between the previously known upper bound of $2n-1$ (Ellis and Warren, DAM 2008) and the lower bound of $2n-2$ (Kiyomi, Okamoto, and Otachi, DAM 2016). To establish the matching lower bound, we construct a bramble of maximum order by utilizing maximum components obtained after removing $2n-1$ vertices. Our construction relies on the vertex-isoperimetric properties of the infinite grid to establish tight lower bounds on neighborhood sizes, combined with a careful analysis of balls of radius $n/2-1$ and their boundaries to overcome structural obstructions when $n$ is even.

Creating Robust and Fair Graph Structures for Connectivity and Clustering

from arXiv: Data Structures and Algorithms

Authors: Kushagra Chatterjee

Graph algorithms are central to large-scale applications such as navigation systems, social networks, and data analysis platforms. This thesis studies two important challenges in such systems: robustness to failures and fairness in clustering outcomes. In the first part, we investigate fault-tolerant reachability preservers in directed graphs. We present the first non-trivial constructions of dual fault-tolerant pairwise reachability preservers that remain resilient to two edge or vertex failures, achieving a sparse construction of size $O(n^{4/3}|\mathcal{P}|^{1/3})$. In the second part, we study fair clustering algorithms that ensure balanced representation of protected groups. We develop approximation algorithms for fair consensus clustering and introduce the framework of closest fair clustering, establishing hardness results and efficient algorithms for multi-group settings. Building on this framework, we obtain improved guarantees for fair correlation clustering and design the first streaming algorithm for fair consensus clustering using only logarithmic memory. Together, these results contribute toward the design of graph algorithms that are both robust and socially responsible.

Authors: Kushagra Chatterjee

Graph algorithms are central to large-scale applications such as navigation systems, social networks, and data analysis platforms. This thesis studies two important challenges in such systems: robustness to failures and fairness in clustering outcomes. In the first part, we investigate fault-tolerant reachability preservers in directed graphs. We present the first non-trivial constructions of dual fault-tolerant pairwise reachability preservers that remain resilient to two edge or vertex failures, achieving a sparse construction of size $O(n^{4/3}|\mathcal{P}|^{1/3})$. In the second part, we study fair clustering algorithms that ensure balanced representation of protected groups. We develop approximation algorithms for fair consensus clustering and introduce the framework of closest fair clustering, establishing hardness results and efficient algorithms for multi-group settings. Building on this framework, we obtain improved guarantees for fair correlation clustering and design the first streaming algorithm for fair consensus clustering using only logarithmic memory. Together, these results contribute toward the design of graph algorithms that are both robust and socially responsible.

Circuits of Quantum Hashing and Quantum Fourier Transform for a Cactus as a Qubit Connectivity Graph

from arXiv: Data Structures and Algorithms

Authors: Kamil Khadiev, Ilnur Valeev

We present a quantum circuit implementation of the quantum hashing algorithm (quantum fingerprinting) for a quantum device with restrictions on the application of two-qubit gates by a qubit connectivity graph. We present an optimization technique for the shallow circuit for quantum hashing in the case of a cactus as a qubit connectivity graph. The algorithm has $O(n^3)$ complexity to build the circuit, where $n$ is the number of qubits and $m$ is the number of connections (edges) in the graph. It is improvement compared to the existing exponential-time algorithm in the case of arbitrary graphs. The algorithm uses solution for the shortest non-simple 1-covering path problem as a subroutine. We present an $O(n^3)$-time solution for this graph-theory problem in the case of a cactus. This result can be interesting independently. The algorithm also used for improving of the quantum circuit for Quantum Fourier Transform.

Authors: Kamil Khadiev, Ilnur Valeev

We present a quantum circuit implementation of the quantum hashing algorithm (quantum fingerprinting) for a quantum device with restrictions on the application of two-qubit gates by a qubit connectivity graph. We present an optimization technique for the shallow circuit for quantum hashing in the case of a cactus as a qubit connectivity graph. The algorithm has $O(n^3)$ complexity to build the circuit, where $n$ is the number of qubits and $m$ is the number of connections (edges) in the graph. It is improvement compared to the existing exponential-time algorithm in the case of arbitrary graphs. The algorithm uses solution for the shortest non-simple 1-covering path problem as a subroutine. We present an $O(n^3)$-time solution for this graph-theory problem in the case of a cactus. This result can be interesting independently. The algorithm also used for improving of the quantum circuit for Quantum Fourier Transform.

An $O(n^5)$-Time Algorithm for Optimal Broadcast Domination

from arXiv: Data Structures and Algorithms

Authors: Kleitos Papadopoulos

Broadcast domination assigns a nonnegative integer power to every vertex of a graph so that every vertex is within the assigned power of some broadcasting vertex, and the objective is to minimize the sum of the powers. Heggernes and Lokshtanov proved that the problem is polynomial-time solvable on arbitrary connected unweighted graphs by showing that some optimal efficient broadcast has a domination graph that is a path or a cycle, and by reducing the general case to an $O(n^6)$-time algorithm. This paper gives an efficient algorithm of the path-case. Instead of building one auxiliary acyclic graph for every possible left endpoint vertex, we build a single directed acyclic graph whose states are oriented broadcast balls together with their two possible residual sides. The resulting path-case algorithm runs in $O(n^3)$ time and $O(n^3)$ space on an $n$-vertex graph. Combining this routine with the same peel-one-ball reduction of Heggernes and Lokshtanov yields an exact $O(n^5)$-time algorithm for optimal broadcast domination on arbitrary connected unweighted graphs. This resolves the quintic-time conjecture for general graphs attributed to Heggernes and Sæther and recorded in subsequent surveys of broadcast domination.

Authors: Kleitos Papadopoulos

Broadcast domination assigns a nonnegative integer power to every vertex of a graph so that every vertex is within the assigned power of some broadcasting vertex, and the objective is to minimize the sum of the powers. Heggernes and Lokshtanov proved that the problem is polynomial-time solvable on arbitrary connected unweighted graphs by showing that some optimal efficient broadcast has a domination graph that is a path or a cycle, and by reducing the general case to an $O(n^6)$-time algorithm. This paper gives an efficient algorithm of the path-case. Instead of building one auxiliary acyclic graph for every possible left endpoint vertex, we build a single directed acyclic graph whose states are oriented broadcast balls together with their two possible residual sides. The resulting path-case algorithm runs in $O(n^3)$ time and $O(n^3)$ space on an $n$-vertex graph. Combining this routine with the same peel-one-ball reduction of Heggernes and Lokshtanov yields an exact $O(n^5)$-time algorithm for optimal broadcast domination on arbitrary connected unweighted graphs. This resolves the quintic-time conjecture for general graphs attributed to Heggernes and Sæther and recorded in subsequent surveys of broadcast domination.

Wednesday, May 20

Amazing: Erdős’ Unit Distance Problem was Disproved! It was achieved by AI!

from Gil Kalai

Paul Erdős’s, in his 1946  paper published in the American Mathematical Monthly, posed two general questions about the distribution of distances determined by a finite set of points in a metric space. 1. Unit Distance Problem: At most how many … Continue reading →

Paul Erdős’s, in his 1946  paper published in the American Mathematical Monthly, posed two general questions about the distribution of distances determined by a finite set of points in a metric space.

1. Unit Distance Problem: At most how many times can the same distance (say, distance 1) occur among a set of n points?

2. Distinct Distances Problem: What is the minimum number of distinct distances determined by a set of n points?

Erdős conjectured that in the plane the number of unit distances determined by n points is at most n^{1+c/loglog n}, for a positive constant c, but the best known upper bound, due to Spencer, Szemeredi, and Trotter is only O(n^{4/3}).

As for the Distinct Distances Problem, the order of magnitude of the conjectured minimum is n/\sqrt{log n}.  In 2010 a sensational paper of Guth and Katz presents a proof of an almost tight lower bound of the order of n/log n. Janos Pach wrote about it in this 2010 post. See also this post from 2008, that explains (among other things) the proof for the upper bound.

I have just learned that an internal model of OpenAI have very recently disproved Erdős’ conjecture for the unit distance problem and found an example with more than n^{1+\epsilon} unit distances. This is truly amazing. The construction relies on algebraic number theory.

Here is Open AI announcement and technical paper.

The proof is presented, explained and discussed in the paper Remarks on the disproof of the unit distance conjecture by Alon, Bloom, Gowers, Litt, Sawin, Shankar,Tsimerman, Wang, and Matchett Wood. (I learned about it from an answer to an MO question on the topic of mathematics and AI.)

Like the computer-based proof of the four color theorem in 1976 by Appel and Haken, this may well be a scientific landmark whose importance goes beyond combinatorics and beyond mathematics. I will add a link to the Open AI document when I will have it. (Done.)

There is also a short Open AI video where the result is presented by Tim Gowers and by four members of the team, Lijie Chen, Mark Sellke, Mehtaab Sawhney, and Sebastien (Seb) Bubeck.

Update: Will Sawin proved a lower bound of n^{1.014}; This was further improved to n^{ 1.0318};  There are interesting comments on the ErdosProblems site; The list of best constants so far can be found in Terence Tao’s Github. The result is reported and discussed on the Geomblog; computational complexity; An interesting commentary by Erik Hoel.

By Gil Kalai

The unit distances problem

from The Geomblog

 OpenAI just announced that ChatGPT has disproved a conjecture about one of Erdos's most famous problems: the unit distance problem.

This problem is personal to me: I spent a good chunk of time during my Ph.D mulling over it, and it's what hooked me into computational geometry. Like most of Erdos's problems, it's really easy to state

Let P be a set of n points in the plane (amen). What is the maximum number of unit distances that can be achieved? 

(the amen is a running joke in my old field of computational geometry) 

Note that this is really about duplicate distances (because if you get a bunch of pairs of points at distance d, you can scale the point set so that d = 1). It's also trivial to see that the maximum is at least n-1 (just put points evenly spaced along the number line), and that the number can't be more than n-choose-2 (because that's the number of pairs). 

So what's the real number? Erdos showed using a fairly complicated construction that you can get a set of points that has $n^{1 + 1/\log\log n}$ unit distances. On the other side of it, a famous result from 1983 by Spencer, Szemederi and Trotter showed that you can't get more than $O(n^{4/3})$ distances. 

Erdos himself used a really elegant but weaker argument to show that you can't get more than $O(n^{3/2})$ distances. And the argument was cool and deceptive (in retrospect). To get a distance pair of 1, a circle of size 1 around a point must touch another point (and vice versa). Draw an edge between those two points when this happens. So here's a fun fact: you can't create a situation where two points are connected by edges to each of three other points. Or to be more formal, you can't create a situation where there's a $K_{2,3}$ bipartite graph hidden inside the set of edges. By a well known result in graph theory this means this graph can't have too many edges in it (because if it did you'd eventually find one of these special graphs): specifically no more than $O(n^{3/2})$ edges. 

So there's a limit. But what's the real limit? A long standing conjecture was that you could NOT get anything nontrivially more than n pairs. Specifically that you couldn't get $\Omega(n^{1 + \epsilon})$ pairs for any $\epsilon$. This is frustrating because the gap between this, and $n^{4/3}$ is huge. 

Turns out this conjecture was wrong. And ChatGPT proved it, by building a complicated generalization of Erdos's construction that is indeed of size $\Omega(n^{1 + \epsilon})$ for some $\epsilon > 0$. 

This was a tantalizing, infuriating, and beautiful problem that has resisted progress for a very long time, and touches on some very deep concepts in mathematics. It's really impressive that an AI system has provided a proof for it. For more on the significance of the result and some interpretation of the proof technique, check out the companion article. 

By Suresh Venkatasubramanian

 OpenAI just announced that ChatGPT has disproved a conjecture about one of Erdos's most famous problems: the unit distance problem.

This problem is personal to me: I spent a good chunk of time during my Ph.D mulling over it, and it's what hooked me into computational geometry. Like most of Erdos's problems, it's really easy to state

Let P be a set of n points in the plane (amen). What is the maximum number of unit distances that can be achieved? 

(the amen is a running joke in my old field of computational geometry) 

Note that this is really about duplicate distances (because if you get a bunch of pairs of points at distance d, you can scale the point set so that d = 1). It's also trivial to see that the maximum is at least n-1 (just put points evenly spaced along the number line), and that the number can't be more than n-choose-2 (because that's the number of pairs). 

So what's the real number? Erdos showed using a fairly complicated construction that you can get a set of points that has $n^{1 + 1/\log\log n}$ unit distances. On the other side of it, a famous result from 1983 by Spencer, Szemederi and Trotter showed that you can't get more than $O(n^{4/3})$ distances. 

Erdos himself used a really elegant but weaker argument to show that you can't get more than $O(n^{3/2})$ distances. And the argument was cool and deceptive (in retrospect). To get a distance pair of 1, a circle of size 1 around a point must touch another point (and vice versa). Draw an edge between those two points when this happens. So here's a fun fact: you can't create a situation where two points are connected by edges to each of three other points. Or to be more formal, you can't create a situation where there's a $K_{2,3}$ bipartite graph hidden inside the set of edges. By a well known result in graph theory this means this graph can't have too many edges in it (because if it did you'd eventually find one of these special graphs): specifically no more than $O(n^{3/2})$ edges. 

So there's a limit. But what's the real limit? A long standing conjecture was that you could NOT get anything nontrivially more than n pairs. Specifically that you couldn't get $\Omega(n^{1 + \epsilon})$ pairs for any $\epsilon$. This is frustrating because the gap between this, and $n^{4/3}$ is huge. 

Turns out this conjecture was wrong. And ChatGPT proved it, by building a complicated generalization of Erdos's construction that is indeed of size $\Omega(n^{1 + \epsilon})$ for some $\epsilon > 0$. 

This was a tantalizing, infuriating, and beautiful problem that has resisted progress for a very long time, and touches on some very deep concepts in mathematics. It's really impressive that an AI system has provided a proof for it. For more on the significance of the result and some interpretation of the proof technique, check out the companion article

By Suresh Venkatasubramanian