Last Update

OPML feed of all feeds.

Subscribe to the Atom feed or the RSS feed to stay up to date.

Thank you to arXiv for use of its open access interoperability.

Note: the date of arXiv entries announced right after publication holidays might incorrectly show up as the date of the publication holiday itself. This is due to our ad hoc method of inferring announcement dates, which are not returned by the arXiv API.

Powered by Pluto.

Source on GitHub.

Maintained by Nima Anari, Arnab Bhattacharyya, Gautam Kamath.

Theory of Computing Report

Wednesday, February 12

TR25-010 | Towards Free Lunch Derandomization from Necessary Assumptions (and OWFs) | Roei Tell, Marshall Ball, Lijie Chen

from ECCC Papers

The question of optimal derandomization, introduced by Doron et al. (JACM 2022), has garnered significant recent attention. Works in recent years showed conditional superfast derandomization algorithms, as well as conditional impossibility results, and barriers for obtaining superfast derandomization using certain black-box techniques. Of particular interest is the extreme high-end, which focuses on "free lunch" derandomization, as suggested by Chen and Tell (FOCS 2021). This is derandomization that incurs essentially no time overhead, and errs only on inputs that are infeasible to find. Constructing such algorithms is challenging, and so far there have not been any results following the one in their initial work. In their result, the algorithm is essentially the classical Nisan-Wigderson generator, and they relied on an ad-hoc assumption asserting the existence of a function that is non-batch-computable over all polynomial-time samplable distributions. In this work we deduce free lunch derandomization from a variety of natural hardness assumptions. In particular, we do not resort to non-batch-computability, and the common denominator of all of our assumptions is hardness over all polynomial-time samplable distributions, which is necessary for the conclusion. The main technical components in our proofs are constructions of new and superfast targeted generators, which completely eliminate the time overheads that are inherent to all previously known constructions. In particular, we present an alternative construction for the targeted generator by Chen and Tell (FOCS 2021), which is faster than the original construction, and also more natural and technically intuitive. These contributions significantly strengthen the evidence for the possibility of free lunch derandomization, distill the required assumptions for such a result, and provide the first set of dedicated technical tools that are useful for studying the question.

A forbidden subgraph study for cut problems on graphs permitting loops and multiedges

from arXiv: Computational Complexity

Authors: Tala Eagling-Vose, Barnaby Martin, Daniel Paulusma, Siani Smith

We take the recently-introduced C123-framework, for the study of (simple) graph problems restricted to inputs specified by the omission of some finite set of subgraphs, to more general graph problems possibly involving self-loops and multiedges. We study specifically the problems Partially Reflexive Stable Cut and Multigraph Matching Cut in this connection. When one forbids a single (simple) subgraph, these problems exhibit the same complexity behaviour as C123-problems, but on finite sets of forbidden subgraphs, the classification appears more complex. While Multigraph Matching Cut and Multigraph d-Cut are C123-problems, already Partially Reflexive Stable Cut fails to be. This is witnessed by forbidding as subgraphs both $C_3$ and $H_1$. Indeed, the difference of behaviour occurs only around pendant subdivisions of nets and pendant subdivisions of $H_1$. We examine this area in close detail.

Explicit Codes approaching Generalized Singleton Bound using Expanders

from arXiv: Computational Complexity

Authors: Fernando Granha Jeronimo, Tushant Mittal, Shashank Srivastava, Madhur Tulsiani

We construct a new family of explicit codes that are list decodable to capacity and achieve an optimal list size of $O(\frac{1}{\epsilon})$. In contrast to existing explicit constructions of codes achieving list decoding capacity, our arguments do not rely on algebraic structure but utilize simple combinatorial properties of expander graphs. Our construction is based on a celebrated distance amplification procedure due to Alon, Edmonds, and Luby [FOCS'95], which transforms any high-rate code into one with near-optimal rate-distance tradeoff. We generalize it to show that the same procedure can be used to transform any high-rate code into one that achieves list decoding capacity. Our proof can be interpreted as a "local-to-global" phenomenon for (a slight strengthening of) the generalized Singleton bound. Using this construction, for every $R, \epsilon \in (0,1)$ and $k \in \mathbb{N}^+$, we obtain an \emph{explicit} family of codes $\mathcal{C} \subseteq \Sigma^n$ with rate $R$ such that:

  • They achieve the $\epsilon$-relaxed generalized Singleton bound: for any $g \in \Sigma^n$ and any list $\mathcal{H}$ of at most $k$ codewords, we have \[ \underset{h \in \mathcal{H}}{\mathbb{E}} [\Delta(g,h)] ~\geq~ \frac{|\mathcal{H}|-1}{|\mathcal{H}|} \cdot (1 - R - \epsilon). \]
  • The alphabet size is a constant depending only on $\epsilon$ and $k$.
  • They can be list decoded up to radius $\frac{k-1}{k}(1-R-\epsilon)$, in time $n^{O_{k,\epsilon}(1)}$.

As a corollary of our result, we also obtain the first explicit construction of LDPC codes achieving list decoding capacity, and in fact arbitrarily close to the generalized Singleton bound.
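
As a quick numerical aid (not code from the paper; the rate and error values are arbitrary illustrative choices), the snippet below evaluates the list-decoding radius $\frac{k-1}{k}(1-R-\epsilon)$ from the last bullet and shows it approaching $1-R-\epsilon$ as the list size $k$ grows.

```python
# Only the formula (k-1)/k * (1 - R - eps) comes from the abstract;
# the example values of R, eps, and k below are illustrative.
def list_decoding_radius(R: float, eps: float, k: int) -> float:
    """Radius up to which a rate-R code in the family is list decodable with list size k."""
    return (k - 1) / k * (1 - R - eps)

R, eps = 0.5, 0.05
for k in (2, 4, 16, 256):
    print(f"k={k:4d}  radius={list_decoding_radius(R, eps, k):.4f}")
# The radius tends to 1 - R - eps = 0.45 as k grows.
```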

Multiview Point Cloud Registration Based on Minimum Potential Energy for Free-Form Blade Measurement

from arXiv: Computational Geometry

Authors: Zijie Wu, Yaonan Wang, Yang Mo, Qing Zhu, He Xie, Haotian Wu, Mingtao Feng, Ajmal Mian

Point cloud registration is an essential step for free-form blade reconstruction in industrial measurement. Nonetheless, measurement defects of the 3D acquisition system unavoidably result in noisy and incomplete point cloud data, which renders efficient and accurate registration challenging. In this paper, we propose a novel global registration method based on minimum potential energy (MPE) to address these problems. The basic strategy is that the objective function is defined as the minimum potential energy optimization function of the physical registration system. The function distributes more weight to the majority of inlier points and less weight to the noise and outliers, which essentially reduces the influence of perturbations in the mathematical formulation. We decompose the solution into a globally optimal approximation procedure and a fine registration process with the trimmed iterative closest point algorithm to boost convergence. The approximation procedure consists of two main steps. First, according to the construction of the force traction operator, we can simply compute the position of the potential energy minimum. Second, to find the MPE point, we propose a new theory that employs two flags to observe the status of the registration procedure. We demonstrate the performance of the proposed algorithm on four types of blades. The proposed method outperforms the other global methods in terms of both accuracy and noise resistance.
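
As a rough illustration of the fine-registration stage mentioned above, here is a minimal sketch of one trimmed-ICP iteration for 3-D point sets. Trimmed ICP is a standard technique; the paper's MPE-based global stage is not shown, the function name and the trimming fraction `keep` are illustrative choices.

```python
import numpy as np

def trimmed_icp_step(src: np.ndarray, dst: np.ndarray, keep: float = 0.8):
    """One trimmed-ICP iteration for 3-D point sets: nearest-neighbour matches,
    keep the best `keep` fraction, then fit a rigid transform (Kabsch)."""
    d2 = ((src[:, None, :] - dst[None, :, :]) ** 2).sum(-1)   # brute-force squared distances
    nn = d2.argmin(axis=1)                                    # nearest target for each source point
    residuals = d2[np.arange(len(src)), nn]
    kept = np.argsort(residuals)[: int(keep * len(src))]      # trim the worst matches
    P, Q = src[kept], dst[nn[kept]]
    Pc, Qc = P - P.mean(0), Q - Q.mean(0)
    U, _, Vt = np.linalg.svd(Pc.T @ Qc)
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])  # guard against reflections
    R = Vt.T @ D @ U.T                                           # optimal rotation
    t = Q.mean(0) - R @ P.mean(0)
    return R, t   # dst is approximately R @ src + t for the kept inliers
```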

Efficient Sparsification of Simplicial Complexes via Local Densities of States

from arXiv: Computational Geometry

Authors: Anton Savostianov, Michael T. Schaub, Nicola Guglielmi, Francesco Tudisco

Simplicial complexes (SCs), a generalization of graph models for relational data that account for higher-order relations between data items, have become a popular abstraction for analyzing complex data using tools from topological data analysis or topological signal processing. However, the analysis of many real-world datasets leads to dense SCs with a large number of higher-order interactions. Unfortunately, analyzing such large SCs often has a prohibitive cost in terms of computation time and memory consumption. The sparsification of such complexes, i.e., the approximation of an original SC with a sparser simplicial complex with only a log-linear number of high-order simplices while maintaining a spectrum close to the original SC, is of broad interest. In this work, we develop a novel method for a probabilistic sparsification of SCs. At its core lies the efficient computation of sparsifying sampling probabilities through local densities of states as functional descriptors of the spectral information. To avoid pathological structures in the spectrum of the corresponding Hodge Laplacian operators, we suggest a "kernel-ignoring" decomposition for approximating the sampling probability; additionally, we exploit error estimates to show asymptotically prevailing algorithmic complexity of the developed method. The performance of the framework is demonstrated on the family of Vietoris--Rips filtered simplicial complexes.

VLWE: Variety-based Learning with Errors for Vector Encryption through Algebraic Geometry

from arXiv: Computational Geometry

Authors: Dongfang Zhao

Lattice-based cryptography is a foundation for post-quantum security, with the Learning with Errors (LWE) problem as a core component in key exchange, encryption, and homomorphic computation. Structured variants like Ring-LWE (RLWE) and Module-LWE (MLWE) improve efficiency using polynomial rings but remain constrained by traditional polynomial multiplication rules, limiting their ability to handle structured vectorized data. This work introduces Variety-LWE (VLWE), a new structured lattice problem based on algebraic geometry. Unlike RLWE and MLWE, which use polynomial quotient rings with standard multiplication, VLWE operates over multivariate polynomial rings defined by algebraic varieties. A key difference is that these polynomials lack mixed variables, and multiplication is coordinate-wise rather than following standard polynomial multiplication. This enables direct encoding and homomorphic processing of high-dimensional data while preserving worst-case to average-case hardness reductions. We prove VLWE's security by reducing it to multiple independent Ideal-SVP instances, demonstrating resilience against classical and quantum attacks. Additionally, we analyze hybrid algebraic-lattice attacks, showing that existing Grobner basis and lattice reduction methods do not directly threaten VLWE. We further construct a vector homomorphic encryption scheme based on VLWE, supporting structured computations while controlling noise growth. This scheme offers advantages in privacy-preserving machine learning, encrypted search, and secure computations over structured data. VLWE emerges as a novel and independent paradigm in lattice-based cryptography, leveraging algebraic geometry to enable new cryptographic capabilities beyond traditional polynomial quotient rings.
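
A toy contrast of the multiplication rules described above. This is only an illustration of the structural difference the abstract highlights, not the VLWE scheme itself; the modulus `q`, dimension `n`, and example vectors are hypothetical.

```python
import numpy as np

q, n = 97, 4                      # hypothetical modulus and dimension, for illustration only
a = np.array([3, 1, 4, 1])
b = np.array([2, 7, 1, 8])

# Coordinate-wise (Hadamard) product: the rule the abstract attributes to VLWE.
coordinatewise = (a * b) % q

# Standard negacyclic product modulo x^n + 1: the rule used in Ring-LWE.
full = np.convolve(a, b)          # ordinary polynomial multiplication
negacyclic = np.zeros(n, dtype=full.dtype)
for i, c in enumerate(full):
    if i < n:
        negacyclic[i] += c
    else:
        negacyclic[i - n] -= c    # reduce using x^n = -1
negacyclic %= q

print("coordinate-wise:", coordinatewise)
print("negacyclic     :", negacyclic)
```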

Polynomial-Time Approximability of Constrained Reinforcement Learning

from arXiv: Data Structures and Algorithms

Authors: Jeremy McMahan

We study the computational complexity of approximating general constrained Markov decision processes. Our primary contribution is the design of a polynomial time $(0,\epsilon)$-additive bicriteria approximation algorithm for finding optimal constrained policies across a broad class of recursively computable constraints, including almost-sure, chance, expectation, and their anytime variants. Matching lower bounds imply our approximation guarantees are optimal so long as $P \neq NP$. The generality of our approach results in answers to several long-standing open complexity questions in the constrained reinforcement learning literature. Specifically, we are the first to prove polynomial-time approximability for the following settings: policies under chance constraints, deterministic policies under multiple expectation constraints, policies under non-homogeneous constraints (i.e., constraints of different types), and policies under constraints for continuous-state processes.

Online matching and market imbalance

from arXiv: Data Structures and Algorithms

Authors: Benjamin Barrientos, Daniel Freund, Daniela Saban

Our work introduces the effect of supply/demand imbalances into the literature on online matching with stochastic rewards in bipartite graphs. We provide a parameterized definition that characterizes instances as over- or undersupplied (or balanced), and show that higher competitive ratios against an offline clairvoyant algorithm are achievable, for both adversarial and stochastic arrivals, when instances are more imbalanced. The competitive ratio guarantees we obtain are the best possible for the class of delayed algorithms we focus on (such algorithms may adapt to the history of arrivals and the algorithm's own decisions, but not to the stochastic realization of each potential match). We then explore the real-world implications of our improved competitive ratios. First, we demonstrate analytically that the improved competitive ratios under imbalanced instances are not a one-way street: a platform that conducts effective supply and demand management should incorporate the effect of imbalance on its matching performance into its supply planning in order to create imbalanced instances. Second, we empirically study the relationship between achieved competitive ratios and imbalance using the data of a volunteer matching platform.

Coresets for Robust Clustering via Black-box Reductions to Vanilla Case

from arXiv: Data Structures and Algorithms

Authors: Shaofeng H. -C. Jiang, Jianing Lou

We devise $\epsilon$-coresets for robust $(k,z)$-Clustering with $m$ outliers through black-box reductions to the vanilla case. Given an $\epsilon$-coreset construction for vanilla clustering with size $N$, we construct coresets of size $N\cdot \mathrm{poly}\log(km\epsilon^{-1}) + O_z\left(\min\{km\epsilon^{-1}, m\epsilon^{-2z}\log^z(km\epsilon^{-1}) \}\right)$ for various metric spaces, where $O_z$ hides $2^{O(z\log z)}$ factors. This increases the size of the vanilla coreset by a small multiplicative factor of $\mathrm{poly}\log(km\epsilon^{-1})$, and the additive term is up to a $(\epsilon^{-1}\log (km))^{O(z)}$ factor to the size of the optimal robust coreset. Plugging in vanilla coreset results of [Cohen-Addad et al., STOC'21], we obtain the first coresets for $(k,z)$-Clustering with $m$ outliers with size near-linear in $k$, while previous results have size at least $\Omega(k^2)$ [Huang et al., ICLR'23; Huang et al., SODA'25]. Technically, we establish two conditions under which a vanilla coreset is also a robust coreset. The first condition requires the dataset to satisfy a special structure: it can be broken into "dense" parts with bounded diameter. We combine this with a new bounded-diameter decomposition that has only $O_z(km \epsilon^{-1})$ non-dense points to obtain the $O_z(km \epsilon^{-1})$ additive bound. Another condition requires the vanilla coreset to possess an extra size-preserving property. We further give a black-box reduction that turns a vanilla coreset into one satisfying the said size-preserving property, leading to the alternative $O_z(m\epsilon^{-2z}\log^{z}(km\epsilon^{-1}))$ additive bound. We also implement our reductions in the dynamic streaming setting and obtain the first streaming algorithms for $k$-Median and $k$-Means with $m$ outliers, using space $\tilde{O}(k+m)\cdot\mathrm{poly}(d\epsilon^{-1}\log\Delta)$ for inputs on the grid $[\Delta]^d$.

Private Low-Rank Approximation for Covariance Matrices, Dyson Brownian Motion, and Eigenvalue-Gap Bounds for Gaussian Perturbations

from arXiv: Data Structures and Algorithms

Authors: Oren Mangoubi, Nisheeth K. Vishnoi

We consider the problem of approximating a $d \times d$ covariance matrix $M$ with a rank-$k$ matrix under $(\varepsilon,\delta)$-differential privacy. We present and analyze a complex variant of the Gaussian mechanism and obtain upper bounds on the Frobenius norm of the difference between the matrix output by this mechanism and the best rank-$k$ approximation to $M$. Our analysis provides improvements over previous bounds, particularly when the spectrum of $M$ satisfies natural structural assumptions. The novel insight is to view the addition of Gaussian noise to a matrix as a continuous-time matrix Brownian motion. This viewpoint allows us to track the evolution of eigenvalues and eigenvectors of the matrix, which are governed by stochastic differential equations discovered by Dyson. These equations enable us to upper bound the Frobenius distance between the best rank-$k$ approximation of $M$ and that of a Gaussian perturbation of $M$ as an integral that involves inverse eigenvalue gaps of the stochastically evolving matrix, as opposed to a sum of perturbation bounds obtained via Davis-Kahan-type theorems. Subsequently, again using the Dyson Brownian motion viewpoint, we show that the eigenvalues of the matrix $M$ perturbed by Gaussian noise have large gaps with high probability. These results also contribute to the analysis of low-rank approximations under average-case perturbations, and to an understanding of eigenvalue gaps for random matrices, both of which may be of independent interest.
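
For orientation, here is a minimal sketch of the plain, real-valued Gaussian-mechanism approach the abstract builds on: perturb the covariance matrix with symmetric Gaussian noise and keep its top-$k$ eigenspace. The paper's mechanism is a complex variant with a more careful analysis; the function name is my own, and the noise scale `sigma` is left uncalibrated (the privacy calibration is omitted).

```python
import numpy as np

def noisy_rank_k(M: np.ndarray, k: int, sigma: float, seed=None) -> np.ndarray:
    """Rank-k approximation of a symmetric matrix M from a Gaussian-perturbed copy.

    `sigma` would have to be calibrated to (epsilon, delta)-DP and the data's
    sensitivity; that calibration is omitted in this sketch."""
    rng = np.random.default_rng(seed)
    d = M.shape[0]
    G = rng.normal(scale=sigma, size=(d, d))
    E = (G + G.T) / np.sqrt(2)              # symmetric Gaussian noise
    vals, vecs = np.linalg.eigh(M + E)      # eigendecomposition of the perturbed matrix
    top = np.argsort(vals)[-k:]             # indices of the k largest eigenvalues
    return (vecs[:, top] * vals[top]) @ vecs[:, top].T
```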

Robust-Sorting and Applications to Ulam-Median

from arXiv: Data Structures and Algorithms

Authors: Ragesh Jaiswal, Amit Kumar, Jatin Yadav

Sorting is one of the most basic primitives in many algorithms and data analysis tasks. Comparison-based sorting algorithms, like quick-sort and merge-sort, are known to be optimal when the outcome of each comparison is error-free. However, many real-world sorting applications operate in scenarios where the outcome of each comparison can be noisy. In this work, we explore settings where a bounded number of comparisons are potentially corrupted by erroneous agents, resulting in arbitrary, adversarial outcomes. We model the sorting problem as a query-limited tournament graph where edges involving erroneous nodes may yield arbitrary results. Our primary contribution is a randomized algorithm inspired by quick-sort that, in expectation, produces an ordering close to the true total order while only querying $\tilde{O}(n)$ edges. We achieve a distance from the target order $\pi$ within $(3 + \epsilon)|B|$, where $B$ is the set of erroneous nodes, balancing the competing objectives of minimizing both query complexity and misalignment with $\pi$. Our algorithm needs to carefully balance two aspects: identify a pivot that partitions the vertex set evenly and ensure that this partition is "truthful" and yet query as few "triangles" in the graph $G$ as possible. Since the nodes in $B$ can potentially hide in an intricate manner, our algorithm requires several technical steps. Additionally, we demonstrate significant implications for the Ulam-$k$-Median problem, a classical clustering problem where the metric is defined on the set of permutations on a set of $d$ elements. Chakraborty, Das, and Krauthgamer gave a $(2-\varepsilon)$ FPT approximation algorithm for this problem, where the running time is super-linear in both $n$ and $d$. We use our robust sorting framework to give the first $(2-\varepsilon)$ FPT linear time approximation algorithm for this problem.
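
To make the setting concrete, the sketch below runs quick-sort against a comparison oracle in which a small set of erroneous elements may answer arbitrarily. This is only the naive recursion illustrating the noise model, not the paper's algorithm, which adds careful pivot selection and verification via queried triangles; the random-answer oracle stands in for an adversary.

```python
import random

def noisy_less(a: int, b: int, bad: set) -> bool:
    """Comparison oracle: correct unless one of the elements is erroneous."""
    if a in bad or b in bad:
        return random.random() < 0.5   # stands in for an arbitrary, possibly adversarial answer
    return a < b

def noisy_quicksort(items: list, bad: set) -> list:
    """Plain randomized quick-sort over the noisy oracle (assumes distinct items)."""
    if len(items) <= 1:
        return items
    pivot = random.choice(items)
    left, right = [], []
    for x in items:
        if x == pivot:
            continue
        (left if noisy_less(x, pivot, bad) else right).append(x)
    return noisy_quicksort(left, bad) + [pivot] + noisy_quicksort(right, bad)

print(noisy_quicksort(list(range(20)), bad={3, 17}))
```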

Faster diameter computation in graphs of bounded Euler genus

from arXiv: Data Structures and Algorithms

Authors: Kacper Kluk, Marcin Pilipczuk, Michał Pilipczuk, Giannos Stamoulis

We show that for any fixed integer $k \geq 0$, there exists an algorithm that computes the diameter and the eccentricities of all vertices of an input unweighted, undirected $n$-vertex graph of Euler genus at most $k$ in time \[ \mathcal{O}_k(n^{2-\frac{1}{25}}). \] Furthermore, for the more general class of graphs that can be constructed by clique-sums from graphs that are of Euler genus at most $k$ after deletion of at most $k$ vertices, we show an algorithm for the same task that achieves the running time bound \[ \mathcal{O}_k(n^{2-\frac{1}{356}} \log^{6k} n). \] To date, the only known subquadratic algorithms for computing the diameter in those graph classes are those of [Ducoffe, Habib, Viennot; SICOMP 2022], [Le, Wulff-Nilsen; SODA 2024], and [Duraj, Konieczny, Potępa; ESA 2024]. These algorithms work in the more general setting of $K_h$-minor-free graphs, but the running time bound is $\mathcal{O}_h(n^{2-c_h})$ for some constant $c_h > 0$ depending on $h$. That is, our savings in the exponent, as compared to the naive quadratic algorithm, are independent of the parameter $k$. The main technical ingredient of our work is an improved bound on the number of distance profiles, as defined in [Le, Wulff-Nilsen; SODA 2024], in graphs of bounded Euler genus.

Exploring Word-Representable Temporal Graphs

from arXiv: Data Structures and Algorithms

Authors: Duncan Adamson

Word-representable graphs are a subset of graphs that may be represented by a word $w$ over an alphabet composed of the vertices in the graph. In such graphs, an edge exists if and only if the occurrences of the corresponding vertices alternate in the word $w$. We generalise this notion to temporal graphs, constructing timesteps by partitioning the word into factors (contiguous subwords) such that no factor contains more than one copy of any given symbol. With this definition, we study the problem of \emph{exploration}, asking for the fastest schedule such that a given agent may explore all $n$ vertices of the graph. We show that if the corresponding temporal graph is connected in every timestep, we may explore the graph in $2\delta n$ timesteps, where $\delta$ is the lowest degree of any vertex in the graph. In general, we show that, for any temporal graph represented by a word of length at least $n(2dn + d)$, with a connected underlying graph, the full graph can be explored in $2 d n$ timesteps, where $d$ is the diameter of the graph. We show this is asymptotically optimal by providing a class of graphs of diameter $d$ requiring $\Omega(d n)$ timesteps to explore, for any $d \in [1, n]$.
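
A minimal sketch of the two definitions above: the alternation test that decides edges of a word-representable graph, and a partition of the word into factors in which no symbol repeats. The greedy factoring is an illustrative choice, not necessarily the partition used in the paper, and the example word is made up.

```python
def alternate(word: str, u: str, v: str) -> bool:
    """Edge test for word-representable graphs: do u and v alternate in word?"""
    seq = [c for c in word if c in (u, v)]
    return all(a != b for a, b in zip(seq, seq[1:]))

def greedy_factors(word: str) -> list:
    """Partition the word into factors (timesteps), none containing a repeated symbol."""
    factors, current, seen = [], "", set()
    for c in word:
        if c in seen:                  # symbol already used: start a new factor
            factors.append(current)
            current, seen = "", set()
        current += c
        seen.add(c)
    if current:
        factors.append(current)
    return factors

w = "abcabdcd"
print(alternate(w, "a", "b"), alternate(w, "a", "d"))   # True False
print(greedy_factors(w))                                # ['abc', 'abdc', 'd']
```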

Quantum Communication Advantage for Leader Election and Agreement

from arXiv: Data Structures and Algorithms

Authors: Fabien Dufoulon, Frédéric Magniez, Gopal Pandurangan

This work focuses on understanding the quantum message complexity of two central problems in distributed computing, namely, leader election and agreement in synchronous message-passing communication networks. We show that quantum communication gives an advantage for both problems by presenting quantum distributed algorithms that significantly outperform their respective classical counterparts under various network topologies. While prior works have studied and analyzed quantum distributed algorithms in the context of (improving) round complexity, a key conceptual contribution of our work is positing a framework to design and analyze the message complexity of quantum distributed algorithms. We present and show how quantum algorithmic techniques such as Grover search, quantum counting, and quantum walks can make distributed algorithms significantly message-efficient. In particular, our leader election protocol for diameter-2 networks uses quantum walks to achieve the improved message complexity. To the best of our knowledge, this is the first such application of quantum walks in distributed computing.

Pareto Optimal Algorithmic Recourse in Multi-cost Function

from arXiv: Data Structures and Algorithms

Authors: Wen-Ling Chen, Hong-Chang Huang, Kai-Hung Lin, Shang-Wei Hwang, Hao-Tsung Yang

In decision-making systems, algorithmic recourse aims to identify minimal-cost actions that alter an individual's features, thereby obtaining a desired outcome. This empowers individuals to understand, question, or alter decisions that negatively affect them. However, due to the variety and sensitivity of system environments and individual personalities, quantifying cost with a single function is nearly impossible when multiple criteria must be considered. Most current recourse mechanisms use gradient-based methods that assume cost functions are differentiable, an assumption that often does not hold in real-world scenarios, resulting in sub-optimal solutions that compromise various criteria. These solutions are typically intractable and lack rigorous theoretical foundations, raising concerns regarding interpretability, reliability, and transparency from the explainable AI (XAI) perspective. To address these issues, this work proposes an algorithmic recourse framework that handles non-differentiable and discrete multi-cost functions. By formulating recourse as a multi-objective optimization problem and assigning weights to different criteria based on their importance, our method identifies Pareto optimal recourse recommendations. To demonstrate scalability, we incorporate the concept of an epsilon-net, proving the ability to find approximately Pareto optimal actions. Experiments show the trade-offs between different criteria and the method's scalability in large graphs. Compared to current heuristic practices, our approach provides a stronger theoretical foundation and better aligns recourse suggestions with real-world requirements.
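
As a small illustration of the Pareto-optimality criterion above (not the paper's algorithm), the sketch below filters a set of candidate recourse actions, each scored by several cost criteria, down to those not dominated in every criterion. The action names and cost tuples are invented for the example.

```python
def pareto_front(candidates: dict) -> list:
    """candidates maps an action name to a tuple of costs (lower is better);
    return the names of actions that no other action dominates in every criterion."""
    front = []
    for name, costs in candidates.items():
        dominated = any(
            all(o <= c for o, c in zip(other, costs)) and other != costs
            for other_name, other in candidates.items() if other_name != name
        )
        if not dominated:
            front.append(name)
    return front

actions = {"a": (2, 5), "b": (3, 3), "c": (4, 4), "d": (1, 9)}
print(pareto_front(actions))   # ['a', 'b', 'd']: 'c' is dominated by 'b'
```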

One-Shot Learning for k-SAT

from arXiv: Data Structures and Algorithms

Authors: Andreas Galanis, Leslie Ann Goldberg, Xusheng Zhang

Consider a $k$-SAT formula $\Phi$ where every variable appears at most $d$ times, and let $\sigma$ be a satisfying assignment of $\Phi$ sampled proportionally to $e^{\beta m(\sigma)}$ where $m(\sigma)$ is the number of variables set to true and $\beta$ is a real parameter. Given $\Phi$ and $\sigma$, can we learn the value of $\beta$ efficiently? This problem falls into a recent line of works about single-sample ("one-shot") learning of Markov random fields. The $k$-SAT setting we consider here was recently studied by Galanis, Kandiros, and Kalavasis (SODA'24) where they showed that single-sample learning is possible when roughly $d\leq 2^{k/6.45}$ and impossible when $d\geq (k+1) 2^{k-1}$. Crucially, for their impossibility results they used the existence of unsatisfiable instances which, aside from the gap in $d$, left open the question of whether the feasibility threshold for one-shot learning is dictated by the satisfiability threshold of $k$-SAT formulas of bounded degree. Our main contribution is to answer this question negatively. We show that one-shot learning for $k$-SAT is infeasible well below the satisfiability threshold; in fact, we obtain impossibility results for degrees $d$ as low as $k^2$ when $\beta$ is sufficiently large, and bootstrap this to small values of $\beta$ when $d$ scales exponentially with $k$, via a probabilistic construction. On the positive side, we simplify the analysis of the learning algorithm and obtain significantly stronger bounds on $d$ in terms of $\beta$. In particular, for the uniform case $\beta\rightarrow 0$ that has been studied extensively in the sampling literature, our analysis shows that learning is possible under the condition $d\lesssim 2^{k/2}$. This is nearly optimal (up to constant factors) in the sense that it is known that sampling a uniformly-distributed satisfying assignment is NP-hard for $d\gtrsim 2^{k/2}$.
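
To make the sampling model concrete, here is a brute-force toy (illustration only, feasible just for tiny formulas, and not a learning algorithm): it enumerates the satisfying assignments of a small CNF and samples one with probability proportional to $e^{\beta m(\sigma)}$. The example formula and the helper names are made up.

```python
import itertools, math, random

def satisfies(clauses, assignment):
    # Clause literals: positive int i means x_i, negative means not x_i (1-indexed).
    return all(any((lit > 0) == assignment[abs(lit) - 1] for lit in c) for c in clauses)

def sample(clauses, n_vars, beta, rng=random):
    """Sample a satisfying assignment with weight exp(beta * #variables set to true)."""
    sats = [a for a in itertools.product([False, True], repeat=n_vars)
            if satisfies(clauses, a)]
    weights = [math.exp(beta * sum(a)) for a in sats]
    return rng.choices(sats, weights=weights, k=1)[0]

phi = [(1, 2, 3), (-1, -2, 3)]   # a toy 3-CNF formula
print(sample(phi, 3, beta=1.5))
```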

Breaking Barriers: Combinatorial Algorithms for Non-monotone Submodular Maximization with Sublinear Adaptivity and $1/e$ Approximation

from arXiv: Data Structures and Algorithms

Authors: Yixin Chen, Wenjing Chen, Alan Kuhnle

With the rapid growth of data in modern applications, parallel combinatorial algorithms for maximizing non-monotone submodular functions have gained significant attention. The state-of-the-art approximation ratio of $1/e$ is currently achieved only by a continuous algorithm (Ene & Nguyen, 2020) with adaptivity $\mathcal O(\log(n))$. In this work, we focus on size constraints and propose a $(1/4-\varepsilon)$-approximation algorithm with high probability for this problem, as well as the first randomized parallel combinatorial algorithm achieving a $1/e-\varepsilon$ approximation ratio, which bridges the gap between continuous and combinatorial approaches. Both algorithms achieve $\mathcal O(\log(n)\log(k))$ adaptivity and $\mathcal O(n\log(n)\log(k))$ query complexity. Empirical results show our algorithms achieve competitive objective values, with the first algorithm particularly efficient in queries.

Tuesday, February 11

Toward a non-constant cancellation function

from Scott Aaronson

It now seems the switch of Cancel Culture has only two settings:

  1. everything is cancellable—including giving intellectual arguments against specific DEI policies, or teaching students about a Chinese filler word (“ne-ge”) that sounds a little like the N-word, or else
  2. nothing is cancellable—not even tweeting “normalize Indian hate” and “I was racist before it was cool,” shortly before getting empowered to remake the US federal government.

How could we possibly draw any line between these two extremes? Wouldn’t that require … judgment? Common sense? Consideration of the facts of individual cases?

I, of course, survived attempted cancellation by a large online mob a decade ago, led by well-known figures such as Amanda Marcotte and Arthur Chu. Though it was terrifying at the time—it felt like my career and even my life were over—I daresay that, here in 2025, not many people would still condemn me for trying to have the heartfelt conversation I did about nerds, feminism, and dating, deep in the comments section of this blog. My side has now conclusively “won” that battle. The once-terrifying commissars of the People’s Republic of Woke, who delighted in trying to ruin me, are now bound and chained, as whooping soldiers of the MAGA Empire drag them by their hair to the torture dungeons.

And this is … not at all the outcome I wanted? It’s a possible outcome that I foresaw in 2014, and was desperately trying to help prevent, through fostering open dialogue between shy male nerds and feminists? I’m now, if anything, more terrified for my little tribe of pro-Enlightenment, science-loving nerds than I was under the woke regime? Speaking of switches with only two settings.

Anyway, with whatever moral authority this experience vests in me, I’d like to suggest that, in future cancellation controversies, the central questions ought to include the following:

  1. What did the accused person actually say or do? Disregarding all confident online discourse about what that “type” of person normally does, or wants to do.
  2. Is there a wider context that often gets cut from social media posts, but that, as soon as you know it, makes the incident seem either better or worse?
  3. How long ago was the offense: more like thirty years or like last week?
  4. Was the person in a radically different condition than they are now—e.g., were they very young, or undergoing a mental health episode, or reacting to a fresh traumatic incident, or drunk or high?
  5. Were the relevant cultural norms different when the offense happened? Did countless others say or do the same thing, and if so, are they also at risk of cancellation?
  6. What’s reasonable to infer about what the person actually believes? What do they want to have happen to whichever group they offended? What would they do to the group given unlimited power? Have they explicitly stated answers to these questions, either before or after the incident? Have they taken real-world actions by which we could judge their answers as either sincere or insincere?
  7. If we don’t cancel this person, what are we being asked to tolerate? Just that they get to keep teaching and publishing views that many people find objectionable? Or that they get to impose their objectionable views on an entire academic department, university, company, organization, or government?
  8. If we agree that the person said something genuinely bad, did they apologize or express regret? Or, if what they said got confused with something bad, did they rush to clarify and disclaim the bad interpretation?
  9. Did they not only refuse to clarify or apologize, but do the opposite? That is, did they express glee about what they were able to get away with, or make light of the suffering or “tears” of their target group?

People can debate how to weigh these considerations, though I personally put enormous weight on 8 and 9, what you could call the “clarification vs. glee axis.” I have nearly unlimited charity for people willing to have a good-faith moral conversation with the world, and nearly unlimited contempt for people who mock the request for such a conversation.

The sad part is that, in practice, the criteria for cancellation have tended instead to be things like:

  • Is the target giving off signals of shame, distress, and embarrassment—thereby putting blood in the water and encouraging us to take bigger bites?
  • Do we, the mob, have the power to cancel this person? Does the person’s reputation and livelihood depend on organizations that care what we think, that would respond to pressure from us?

The trouble with these questions is that, not only are their answers not positively correlated with which people deserve to be cancelled, they’re negatively correlated. This is precisely how you get the phenomenon of the left-wing circular firing squad, which destroys the poor schmucks capable of shame even while the shameless, the proud racists and pussy-grabbers, go completely unpunished. Surely we can do better than that.

By Scott

Holding out for an explanation

from Ben Recht

Learning theory needs to admit test sets exist.

After class last Thursday, Ishaq Aden-Ali brought up a sneakily obvious point about how theorists approach learning theory: it starts from the position that test sets don’t exist.

This is not to say that learning theory can’t analyze methods that use test sets. Some theory papers talk about using test sets for generalization (notably this one and this one). It’s not hard to prove theorems about the utility of test sets. But when you dig into the history of the subject or the most famous textbooks, you see that the test set never gets mentioned.

For example, the most famous result from the 1960s is Novikoff's Mistake Bound for the Perceptron. This online learning result proves that if you have data separable by a hyperplane, the Perceptron algorithm will make a finite number of mistakes no matter how many data points you have. There is no test set in Novikoff's bound. You don't need one! In the model where your data is separable by a hyperplane, if you run Rosenblatt's algorithm for long enough, you generalize perfectly.
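For readers who have never seen it written down, here is a minimal sketch of the Perceptron on separable toy data (my own toy setup, not from any of the papers discussed); the quantity Novikoff bounds is exactly the number of updates in the inner loop.

```python
# Minimal Perceptron sketch on linearly separable toy data.  Novikoff's theorem
# bounds the number of mistakes (updates below) by (R / gamma)^2, where R bounds
# the norm of the points and gamma is the margin of the separating hyperplane.
import numpy as np

rng = np.random.default_rng(0)
w_true = np.array([2.0, -1.0, 0.5])                  # a true separating hyperplane (with bias)
X = np.hstack([rng.normal(size=(200, 2)), np.ones((200, 1))])
y = np.where(X @ w_true > 0, 1.0, -1.0)              # separable labels by construction

w = np.zeros(3)
mistakes = 0
for _ in range(100):                                 # passes over the data
    clean_pass = True
    for xi, yi in zip(X, y):
        if yi * (w @ xi) <= 0:                       # mistake: point on the wrong side
            w += yi * xi                             # Perceptron update
            mistakes += 1
            clean_pass = False
    if clean_pass:                                   # no mistakes in a full pass: done
        break

print("mistakes:", mistakes)
```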

Less well known, though frequently mentioned on this blog, Bill Highleyman arguably invented empirical risk minimization in his PhD thesis at Bell Labs. Bill’s first paper on ERM also has no mention of test sets. He has a theorem that asserts that with a finite data set, the empirical risk is an unbiased estimate of the average prediction error. This is only true for a single function, not for multiple functions. Highleyman’s result caught the eye of Russian mathematicians, including Alexey Chervonenkis (the C in VC Dimension). In 2004, Chervonenkis recalled an argument about Highleyman’s paper with Yakov Khurgin in 1965:

“The justification of convergence was rather ‘wild’… We gave explicit examples where getting an acceptable result required an arbitrarily long training sequence (even for linear rules). It is here that uniform convergence was mentioned for the first time. Khurgin was saying, ‘You are playing on the non-compactness of the Hilbert ball,’ or ‘You are demanding uniform convergence.’ I was replying, ‘Yes, we are!’”

The whole recollection is a fun read. Chervonenkis describes arguments about overparameterization, the first leave-one-out proofs, and a colorful metaphor comparing optimization to visiting a “venereal clinic” (h.t. Maxim Raginsky). But he never mentions testing sets.

In fact, generalization bounds pioneered by Chervonenkis and Vapnik assume you don't need a test set. You get bounds that argue your generalization error will be small if you take the model that minimizes the training error. With appropriate simplifying assumptions about the behavior of the world, these theorems assert that you don't need a test set. The output of your optimization algorithm generalizes. These are interesting theorems! They certainly give us some actionable insights, telling us facts about simple linear rules with bounded coefficients. However, it is hard to argue that these theorems say much about actual practice.
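A tiny simulation (my illustration, not Highleyman's or Vapnik's actual setup) makes the contrast concrete: the empirical risk of one fixed classifier is an unbiased estimate of its true error, but the training error of the best of many classifiers is optimistically biased, and that optimism is exactly what a holdout set removes.

```python
# Empirical risk is unbiased for a single fixed predictor, but the minimum
# empirical risk over many predictors is optimistically biased; a holdout set
# restores an honest estimate.  Toy setup: labels are fair coin flips, so every
# classifier here has true error 0.5.  (An illustration, not a historical experiment.)
import numpy as np

rng = np.random.default_rng(0)
n_train, n_test, n_classifiers, n_trials = 50, 50, 1000, 500

fixed_train, best_train, best_holdout = [], [], []
for _ in range(n_trials):
    y_train = rng.integers(0, 2, n_train)
    y_test = rng.integers(0, 2, n_test)
    preds_train = rng.integers(0, 2, (n_classifiers, n_train))   # "classifiers" guessing
    preds_test = rng.integers(0, 2, (n_classifiers, n_test))

    errs = (preds_train != y_train).mean(axis=1)
    fixed_train.append(errs[0])                                  # one fixed classifier
    best = errs.argmin()                                         # empirical risk minimizer
    best_train.append(errs[best])
    best_holdout.append((preds_test[best] != y_test).mean())

print("fixed classifier, train error  :", round(np.mean(fixed_train), 3))   # ~0.5 (unbiased)
print("best of 1000,     train error  :", round(np.mean(best_train), 3))    # well below 0.5
print("best of 1000,     holdout error:", round(np.mean(best_holdout), 3))  # ~0.5 again
```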

Somehow, everyone missed Highleyman’s other 1962 paper, which tried to build a theory of test sets. Here’s the key excerpt from the abstract:

“[T]he problem of the optimum partitioning of a sample of fixed size between the design and test phases of a pattern recognition machine is discussed. One important nonparametric result is that the proportion of the total sample used for testing the machine should never be less than that proportion used for designing the machine, and in some cases should be a good deal more.”

Highleyman argues that everyone uses holdout sets for evaluation, but we don’t know why they work or have rigorous guidance for how to build them. Frankly, he comes to confusing conclusions. Why does the test set need to be larger than the training set? His reasoning is convoluted. Not only is the paper hard to follow, but I’m pretty sure his mathematical derivations are incorrect. However, what is clear is that the method he is attempting to analyze is precisely the holdout method we use today, not the more sophisticated statistical methods like cross validation. Indeed, Stone’s famous paper on cross validation comes a decade after Highleyman and a year after Duda and Hart’s Pattern Recognition text proposes the holdout method for model selection.

Why did machine learning theory never engage with the holdout method? Partially, it’s because machine learning practice didn’t fully embrace the holdout method until the 1990s. This can’t be the whole story, though. The holdout method has been around for as long as empirical risk minimization and mistake bounds.

Whatever the reason for the omission, we can’t claim a “theory” of machine learning that doesn’t rest on a theory of the holdout method. The last forty years have proven the test set to be as important as the training set. The holdout method has remained a methodological standard, even as other algorithmic fads for fiddling with training data have come and gone. The evaluation is more important than the learning algorithm.

Can I close with a call to theoretical action? There are still a lot of questions about how these holdout sets shape our choice of prediction functions. For example, I still don’t have a satisfactory answer for why we robustly observe these “generalization scatter plots” we first saw when reproducing CIFAR10.

Are these plots explainable by theory? Practitioners might just say it doesn’t matter, but I’d bet that if we better understood the causes of this trend, we’d be better equipped to make sense of the current benchmark wars that dominate our LLM discourse.

By Ben Recht

Roadmap for the Debate about Quantum Computers

from Gil Kalai

Here is a roadmap for the debate about quantum computers from my perspective. Skeptical claims are in blue, responses to these claims are in purple. Points 1-8 represent skeptical claims made over the years, while points a-p are replies to these skeptical claims. Points (i)-(x) are replies to some of these replies. Underlined claims were made by me. Green boldface indicates points of agreement. 

  1. The understanding of noise is a crucial ingredient of any model of computation. Noise may cause quantum computing to fail.

    • a) Noise is not fundamental. There is no inherent “noise” in quantum mechanics.

      • (i) A realistic computational model must also include the modeling of errors.

    • b) Quantum error correction and quantum fault tolerance have the potential to deal successfully with noise and various forms of inaccuracies.

    • c) Topological quantum computing gives an alternative avenue for quantum computing that does not depend on the circuit model.
  2. General forms of quantum noise (in particular, correlated errors) will cause quantum error correction to fail.

    • d) These types of general noise are not physical.

    • e) These types of general noise would cause classical computation to fail as well.

    • f) For arbitrary forms of noise, if the error rate is sufficiently small, log-depth quantum computing survives.

  3. Specific physical noise models differ from the ideal model.

    • g) We will be able to extend the threshold theorem to these models as well.

      • (ii) All those extensions still assume common properties that might be unrealistic.

  4. There will be a systematic relationship between the “signal” and the noise. For example, the noise for a pair of entangled qubits will be highly correlated. (This argument extends to attempts to build quantum gadgets needed for topological quantum computing.)

    • h) This cannot be: how would nature know what the intended state is? It is like saying, “wet roads cause rain to pour.”

      • (iii) This systematic relation can be a general rule if you consider special-purpose quantum computational devices.

      • (iv) This systematic relation holds for NISQ (noisy intermediate scale quantum) systems and more generally for systems that do not apply quantum fault tolerance. This systematic relation describes quantum physics “above the quantum fault tolerance threshold.”

    • i) Such a principle would harm classical computation as well, unless you believe that by a miracle the threshold for fault tolerance lies between what is required for classical and quantum computation.

      • (v) The argument about the miraculous location of the threshold is incorrect because classical computers are not universal quantum computers.

  5. NISQ computers represent a primitive classical computational power (LDP), which implies an inherent reason they cannot achieve quantum computational supremacy. This suggests there is an absolute lower bound on noise rates that prevents effective quantum error correction. 

    • j) It is incorrect to apply asymptotic insights (on the number of qubits) to systems with only tens or a few hundred qubits.

      • (vi) We commonly apply asymptotic insights to small systems.

    • k) But why, then, are classical information and computation possible?

      • (vii) The computational class LDP still supports classical information.

    • l) The Google 2019 experiment refutes this argument.

      • (Agreed upon) Google’s 2019 supremacy claims appear to be incorrect.

    • m) Some aspects of the Google 2019 experiment and its replications are promising and support basic assumptions about quantum noise.

      • (viii) These aspects of Google 2019 appear to be incorrect as well.

    • n) (New) The 2019 Google experiment was long ago superseded. Several recent experiments, including Google’s “septillion years advantage over classical” and other results from IBM, Quantinuum, QuEra, and USTC, are more robust.

      • (ix) We need to scrutinize these assertions as well. Gradual experimental progress is expected to hit a barrier, and fantastical experimental claims are expected to be falsified.

  6. Building quantum computers is a ridiculously hard engineering task.

    • o) It is very hard but not ridiculously hard; progress is being made, and a long-term approach is necessary.

      • (x) Even achieving near-term tasks is inherently impossible.

      • (Agreed upon) We may have an opportunity to test this matter in the next few years.

  7. (Simplified) Quantum computations conflict with important principles and laws from physics and computing theory:

    • 7.1 The time-energy uncertainty principle.

    • 7.2 The effective Church-Turing thesis.

    • 7.3 Several principles of thermodynamics.

    • 7.4 Bohr’s selection principle.

    • p) (Simplified) These physical and computational rules are not universally correct, which is part of what makes quantum computation so remarkable: it would enable novel physical and computational behaviors.

  8. Quantum computers will allow sampling according to the exact values of permanents. Achieving this is implausible given that computing permanents is #P-complete.

Summary

Proponents of Quantum Computers: BQP is the computational complexity class representing our quantum physical world. BQP characterizes the complexity of quantum processes on a microscopic scale, as evidenced by the challenges of simulating quantum physics on classical computers. Building quantum computers is primarily an engineering task, and pursuing this endeavor will drive exciting scientific progress across various fields. Many promising avenues exist, ranging from different implementations of quantum circuits to non-abelian anyons and topological quantum computing. Numerous impressive experimental results have already been achieved.

Skeptics of Quantum Computers: BPP is the computational class representing computation in the physical world. Classical information and computation are sustained in our noisy quantum world through the “averaging” of noisy quantum information. Quantum computational supremacy and high-quality quantum error correction are inherently impossible, and there is no compelling evidence for either in natural quantum processes. Advances in experimental quantum computing should be scrutinized carefully. Gradual experimental progress is expected to hit a barrier, and fantastical experimental claims are unlikely to withstand rigorous examination.

A “meta claim” of proponents is that there is no symmetry between the two sides of the debate: the prevalent view shared by many experts is that quantum computation is possible. 

_____________

Future Plans for Quantum Posts

I am planning an ambitious post, “Some Quantum Mysteries,” which will explore several mysteries in quantum physics, many of which are connected to quantum information theory. The first item is about the elusive Majorana zero modes, and the second item is about the security of quantum key distributions.

As a spinoff of the mysteries post, I am also writing two posts on the debate surrounding quantum computing. The first will summarize a few skeptical perspectives on quantum computing from Robert Alicki, Michel Dyakonov, Leonid Levin, Oded Goldreich, and others. The second post will begin with the roadmap presented here, briefly outline my own skeptical views with relevant links, and then delve into the broader debate, including counterarguments from John Preskill, Aram Harrow, Scott Aaronson, Dave Bacon, Boaz Barak, and others.

Seven assertions about quantum computing

In my previous post, I presented seven assertions about quantum computing that emerged from my research. My primary goal was to articulate these assertions as clearly as possible. (Of course, there is much to discuss regarding the arguments for their validity, precise quantitative formulations, falsifiable predictions, and their place within the broader context of the quantum computing endeavor.)

By Gil Kalai

Symmetric Algebraic Circuits and Homomorphism Polynomials

from arXiv: Computational Complexity

Authors: Anuj Dawar, Benedikt Pago, Tim Seppelt

The central open question of algebraic complexity is whether VP is unequal to VNP, which is saying that the permanent cannot be represented by families of polynomial-size algebraic circuits. For symmetric algebraic circuits, this has been confirmed by Dawar and Wilsenach (2020) who showed exponential lower bounds on the size of symmetric circuits for the permanent. In this work, we set out to develop a more general symmetric algebraic complexity theory. Our main result is that a family of symmetric polynomials admits small symmetric circuits if and only if they can be written as a linear combination of homomorphism counting polynomials of graphs of bounded treewidth. We also establish a relationship between the symmetric complexity of subgraph counting polynomials and the vertex cover number of the pattern graph. As a concrete example, we examine the symmetric complexity of immanant families (a generalisation of the determinant and permanent) and show that a known conditional dichotomy due to Curticapean (2021) holds unconditionally in the symmetric setting.

A Parameterized Study of Secluded Structures in Directed Graphs

from arXiv: Computational Complexity

Authors: Nadym Mallek, Jonas Schmidt, Shaily Verma

Given an undirected graph $G$ and an integer $k$, the Secluded $\Pi$-Subgraph problem asks you to find a maximum size induced subgraph that satisfies a property $\Pi$ and has at most $k$ neighbors in the rest of the graph. This problem has been extensively studied; however, there is no prior study of the problem in directed graphs. This question has been mentioned by Jansen et al. [ISAAC'23]. In this paper, we initiate the study of the Secluded Subgraph problem in directed graphs by incorporating different notions of neighborhoods: in-neighborhood, out-neighborhood, and their union. Formally, we call these problems \{In, Out, Total\}-Secluded $\Pi$-Subgraph, where given a directed graph $G$ and an integer $k$, we want to find an induced subgraph satisfying $\Pi$ of maximum size that has at most $k$ in/out/total-neighbors in the rest of the graph, respectively. We investigate the parameterized complexity of these problems for different properties $\Pi$. In particular, we prove the following parameterized results: - We design an FPT algorithm for the Total-Secluded Strongly Connected Subgraph problem when parameterized by $k$. - We show that the In/Out-Secluded $\mathcal{F}$-Free Subgraph problem with parameter $k+w$ is W[1]-hard, where $\mathcal{F}$ is a family of directed graphs except any subgraph of a star graph whose edges are directed towards the center. This result also implies that In/Out-Secluded DAG is W[1]-hard, unlike the undirected variants of the two problems, which are FPT. - We design an FPT-algorithm for In/Out/Total-Secluded $\alpha$-Bounded Subgraph when parameterized by $k$, where $\alpha$-bounded graphs are a superclass of tournaments. - For undirected graphs, we improve the best-known FPT algorithm for Secluded Clique by providing a faster FPT algorithm that runs in time $1.6181^kn^{\mathcal{O}(1)}$.

Hardness of Hypergraph Edge Modification Problems

from arXiv: Computational Complexity

Authors: Lior Gishboliner, Yevgeny Levanzov, Asaf Shapira

For a fixed graph $F$, let $ex_F(G)$ denote the size of the largest $F$-free subgraph of $G$. Computing or estimating $ex_F(G)$ for various pairs $F,G$ is one of the central problems in extremal combinatorics. It is thus natural to ask how hard is it to compute this function. Motivated by an old problem of Yannakakis from the 80's, Alon, Shapira and Sudakov [ASS'09] proved that for every non-bipartite graph $F$, computing $ex_F(G)$ is NP-hard. Addressing a conjecture of Ailon and Alon (2007), we prove a hypergraph analogue of this theorem, showing that for every $k \geq 3$ and every non-$k$-partite $k$-graph $F$, computing $ex_F(G)$ is NP-hard. Furthermore, we conjecture that our hardness result can be extended to all $k$-graphs $F$ other than a matching of fixed size. If true, this would give a precise characterization of the $k$-graphs $F$ for which computing $ex_F(G)$ is NP-hard, since we also prove that when $F$ is a matching of fixed size, $ex_F(G)$ is computable in polynomial time. This last result can be considered an algorithmic version of the celebrated Erd\H{o}s-Ko-Rado Theorem. The proof of [ASS'09] relied on a variety of tools from extremal graph theory, one of them being Tur\'an's theorem. One of the main challenges we have to overcome in order to prove our hypergraph extension is the lack of a Tur\'an-type theorem for $k$-graphs. To circumvent this, we develop a completely new graph theoretic approach for proving such hardness results.

Provably Overwhelming Transformer Models with Designed Inputs

from arXiv: Computational Complexity

Authors: Lev Stambler, Seyed Sajjad Nezhadi, Matthew Coudron

We develop an algorithm which, given a trained transformer model $\mathcal{M}$ as input, as well as a string of tokens $s$ of length $n_{fix}$ and an integer $n_{free}$, can generate a mathematical proof that $\mathcal{M}$ is ``overwhelmed'' by $s$, in time and space $\widetilde{O}(n_{fix}^2 + n_{free}^3)$. We say that $\mathcal{M}$ is ``overwhelmed'' by $s$ when the output of the model evaluated on this string plus any additional string $t$, $\mathcal{M}(s + t)$, is completely insensitive to the value of the string $t$ whenever length($t$) $\leq n_{free}$. Along the way, we prove a particularly strong worst-case form of ``over-squashing'', which we use to bound the model's behavior. Our technique uses computer-aided proofs to establish this type of operationally relevant guarantee about transformer models. We empirically test our algorithm on a single layer transformer complete with an attention head, layer-norm, MLP/ReLU layers, and RoPE positional encoding. We believe that this work is a stepping stone towards the difficult task of obtaining useful guarantees for trained transformer models.
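The "overwhelmed" property is easy to state operationally, and a purely empirical (non-certified) check is one way to see what the paper proves rigorously: sample suffixes $t$ of length at most $n_{free}$ and test whether the output on $s + t$ ever changes. The toy model below is a stand-in assumption, not a trained transformer, and this sampling loop is not the paper's proof-generating algorithm.

```python
# Empirical (non-certified) check of the "overwhelmed" property: does the output
# on s + t stay constant over many sampled suffixes t with len(t) <= n_free?
# The paper proves such insensitivity rigorously; this sketch only samples.
# `toy_model` is a stand-in, not a real transformer.
import random

VOCAB = list(range(8))

def toy_model(tokens):
    # Depends only on the first 16 tokens, so a length-16 prefix "overwhelms" it.
    return sum(tokens[:16]) % 7

def looks_overwhelmed(model, s, n_free, samples=10_000, rng=random.Random(0)):
    base = model(list(s))
    for _ in range(samples):
        t = [rng.choice(VOCAB) for _ in range(rng.randint(0, n_free))]
        if model(list(s) + t) != base:
            return False          # found a suffix that changes the output
    return True                   # no counterexample found (evidence, not a proof)

s = [7] * 16                      # fixed prefix of length n_fix = 16
print(looks_overwhelmed(toy_model, s, n_free=5))   # True for this toy model
```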

Barriers and Pathways to Human-AI Alignment: A Game-Theoretic Approach

from arXiv: Computational Complexity

Authors: Aran Nayebi

Under what conditions can capable AI agents efficiently align their actions with human preferences? More specifically, when they are proficient enough to collaborate with us, how long does coordination take, and when is it computationally feasible? These foundational questions of AI alignment help define what makes an AI agent ``sufficiently safe'' and valuable to humans. Since such generally capable systems do not yet exist, a theoretical analysis is needed to establish when guarantees hold -- and what they even are. We introduce a game-theoretic framework that generalizes prior alignment approaches with fewer assumptions, allowing us to analyze the computational complexity of alignment across $M$ objectives and $N$ agents, providing both upper and lower bounds. Unlike previous work, which often assumes common priors, idealized communication, or implicit tractability, our framework formally characterizes the difficulty of alignment under minimal assumptions. Our main result shows that even when agents are fully rational and computationally \emph{unbounded}, alignment can be achieved with high probability in time \emph{linear} in the task space size. Therefore, in real-world settings, where task spaces are often \emph{exponential} in input length, this remains impractical. More strikingly, our lower bound demonstrates that alignment is \emph{impossible} to speed up when scaling to exponentially many tasks or agents, highlighting a fundamental computational barrier to scalable alignment. Relaxing these idealized assumptions, we study \emph{computationally bounded} agents with noisy messages (representing obfuscated intent), showing that while alignment can still succeed with high probability, it incurs additional \emph{exponential} slowdowns in the task space size, number of agents, and number of tasks. We conclude by identifying conditions that make alignment more feasible.

FeatPCA: A feature subspace based principal component analysis technique for enhancing clustering of single-cell RNA-seq data

from arXiv: Computational Complexity

Authors: Md Romizul Islam, Swakkhar Shatabda

Single-cell RNA sequencing (scRNA-seq) has revolutionized our ability to analyze gene expression at the cellular level. By providing data on gene expression for each individual cell, scRNA-seq generates large datasets with thousands of genes. However, handling such high-dimensional data poses computational challenges due to increased complexity. Dimensionality reduction becomes crucial for scRNA-seq analysis. Various dimensionality reduction algorithms, including Principal Component Analysis (PCA), Uniform Manifold Approximation and Projection (UMAP), and t-Distributed Stochastic Neighbor Embedding (t-SNE), are commonly used to address this challenge. These methods transform the original high-dimensional data into a lower-dimensional representation while preserving relevant information. In this paper we propose FeatPCA. Instead of applying dimensionality reduction directly to the entire dataset, we divide it into multiple subspaces. Within each subspace, we apply dimension reduction techniques, and then merge the reduced data. FeatPCA offers four variations for subspacing. Our experimental results demonstrate that clustering based on subspacing yields better accuracy than working with the full dataset. Across a variety of scRNA-seq datasets, FeatPCA consistently outperforms existing state-of-the-art clustering tools.
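As a rough sketch of the subspacing idea as I read it from the abstract (split the genes into feature blocks, reduce each block, merge the reduced blocks), and emphatically not the authors' FeatPCA implementation:

```python
# Rough sketch of the subspace-then-reduce-then-merge idea: split the features
# (genes) into blocks, run PCA inside each block, and concatenate the per-block
# components before clustering.  A reading of the abstract, not the authors' code.
import numpy as np

def pca_reduce(X, n_components):
    """Plain PCA via SVD of the centered data matrix."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T

def subspace_pca(X, n_blocks, n_components_per_block):
    """Split features into contiguous blocks, reduce each block, and merge."""
    blocks = np.array_split(np.arange(X.shape[1]), n_blocks)
    reduced = [pca_reduce(X[:, idx], n_components_per_block) for idx in blocks]
    return np.hstack(reduced)

# Example: 500 "cells" x 2000 "genes" reduced to 4 blocks x 10 components = 40 dims.
X = np.random.default_rng(0).normal(size=(500, 2000))
Z = subspace_pca(X, n_blocks=4, n_components_per_block=10)
print(Z.shape)   # (500, 40)
```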

From an odd arity signature to a Holant dichotomy

from arXiv: Computational Complexity

Authors: Boning Meng, Juqiu Wang, Mingji Xia, Jiayi Zheng

\textsf{Holant} is an essential framework in the field of counting complexity. For over fifteen years, researchers have been clarifying the complexity classification for complex-valued \textsf{Holant} on the Boolean domain, a challenge that remains unresolved. In this article, we prove a complexity dichotomy for complex-valued \textsf{Holant} on Boolean domain when a non-trivial signature of odd arity exists. This dichotomy is based on the dichotomy for \textsf{\#EO}, and consequently is an $\text{FP}^\text{NP}$ vs. \#P dichotomy as well, stating that each problem is either in $\text{FP}^\text{NP}$ or \#P-hard. Furthermore, we establish a generalized version of the decomposition lemma for complex-valued \textsf{Holant} on Boolean domain. It asserts that each signature can be derived from its tensor product with other signatures, or conversely, the problem itself is in $\text{FP}^\text{NP}$. We believe that this result is a powerful method for building reductions in complex-valued \textsf{Holant}, as it is also employed as a pivotal technique in the proof of the aforementioned dichotomy in this article.

Exponential Separation Criteria for Quantum Iterative Power Algorithms

from arXiv: Computational Complexity

Authors: András Czégel, Boglárka G. -Tóth

In the vast field of Quantum Optimization, Quantum Iterative Power Algorithms (QIPA) has been introduced recently with a promise of exponential speedup over an already established and well-known method, the variational Quantum Imaginary Time Evolution (varQITE) algorithm. Since the convergence and error of varQITE are known, the promise of QIPA also implied certain collapses in the complexity hierarchy - such as NP $\subseteq$ BQP, as we show in our study. However the original article of QIPA explicitly states the algorithm does not cause any collapses. In this study we prove that these collapses indeed do not occur, and with that, prove that the promised exponential separation is practically unachievable. We do so by introducing criteria for the exponential separation between QIPA that uses a double exponential function and varQITE, and then showing how these criteria require certain properties in problem instances. After that we introduce a preprocessing step that enforces problems to satisfy these criteria, and then we show that the algorithmic error blows up exponentially for these instances, as there is an inverse polynomial term between speedup and the error. Despite the theoretical results, we also show that practically relevant polynomial enhancement is still possible, and show experimental results on a small problem instance, where we used our preprocessing step to achieve the improvement.

The Complexity of Blocking All Solutions

from arXiv: Computational Complexity

Authors: Christoph Grüne, Lasse Wulf

We consider the general problem of blocking all solutions of some given combinatorial problem with only few elements. For example, the problem of destroying all maximum cliques of a given graph by forbidding only few vertices. Problems of this kind are so fundamental that they have been studied under many different names in many different disjoint research communities already since the 90s. Depending on the context, they have been called the interdiction, most vital vertex, most vital edge, blocker, or vertex deletion problem. Despite their apparent popularity, surprisingly little is known about the computational complexity of interdiction problems in the case where the original problem is already NP-complete. In this paper, we fill that gap of knowledge by showing that a large amount of interdiction problems are even harder than NP-hard. Namely, they are complete for the second stage of Stockmeyer's polynomial hierarchy, the complexity class $\Sigma^p_2$. Such complexity insights are important because they imply that all these problems can not be modelled by a compact integer program (unless the unlikely conjecture NP $= \Sigma_2^p$ holds). Concretely, we prove $\Sigma^p_2$-completeness of the following interdiction problems: satisfiability, 3satisfiability, dominating set, set cover, hitting set, feedback vertex set, feedback arc set, uncapacitated facility location, $p$-center, $p$-median, independent set, clique, subset sum, knapsack, Hamiltonian path/cycle (directed/undirected), TSP, $k$ directed vertex disjoint path ($k \geq 2$), Steiner tree. We show that all of these problems share an abstract property which implies that their interdiction counterpart is $\Sigma_2^p$-complete. Thus, all of these problems are $\Sigma_2^p$-complete \enquote{for the same reason}. Our result extends a recent framework by Gr\"une and Wulf.

Induced Disjoint Paths Without an Induced Minor

from arXiv: Computational Complexity

Authors: Pierre Aboulker, Édouard Bonnet, Timothé Picavet, Nicolas Trotignon

We exhibit a new obstacle to the nascent algorithmic theory for classes excluding an induced minor. We indeed show that on the class of string graphs -- which avoids the 1-subdivision of, say, $K_5$ as an induced minor -- Induced 2-Disjoint Paths is NP-complete. So, while $k$-Disjoint Paths, for a fixed $k$, is polynomial-time solvable in general graphs, the absence of a graph as an induced minor does not make its induced variant tractable, even for $k=2$. This answers a question of Korhonen and Lokshtanov [SODA '24], and complements a polynomial-time algorithm for Induced $k$-Disjoint Paths in classes of bounded genus by Kobayashi and Kawarabayashi [SODA '09]. In addition to being string graphs, our produced hard instances are subgraphs of a constant power of bounded-degree planar graphs, hence have bounded twin-width and bounded maximum degree. We also leverage our new result to show that there is a fixed subcubic graph $H$ such that deciding if an input graph contains $H$ as an induced subdivision is NP-complete. Until now, all the graphs $H$ for which such a statement was known had a vertex of degree at least 4. This answers a question by Chudnovsky, Seymour, and the fourth author [JCTB '13], and by Le [JGT '19]. Finally we resolve another question of Korhonen and Lokshtanov by exhibiting a subcubic graph $H$ without two adjacent degree-3 vertices and such that deciding if an input $n$-vertex graph contains $H$ as an induced minor is NP-complete, and unless the Exponential-Time Hypothesis fails, requires time $2^{\Omega(\sqrt n)}$. This complements an algorithm running in subexponential time $2^{O(n^{2/3} \log n)}$ by these authors [SODA '24] under the same technical condition.

Computational Complexity of Polynomial Subalgebras

from arXiv: Computational Complexity

Authors: Leonie Kayser

The computational complexity of polynomial ideals and Gr\"obner bases has been studied since the 1980s. In recent years the related notions of polynomial subalgebras and SAGBI bases have gained more and more attention in computational algebra, with a view towards effective algorithms. We investigate the computational complexity of the subalgebra membership problem and degree bounds. In particular, we place these problems in the complexity class EXPSPACE and prove PSPACE-completeness for homogeneous algebras. We highlight parallels and differences compared to the settings of ideals and also look at important classes of polynomials such as monomial algebras.

Physically-Based Mesh Generation for Confined 3D Point Clouds Using Flexible Foil Models

from arXiv: Computational Geometry

Authors: Netzer Moriya

We propose a method for constructing high-quality, closed-surface meshes from confined 3D point clouds via a physically-based simulation of flexible foils under spatial constraints. The approach integrates dynamic elasticity, pressure-driven deformation, and adaptive snapping to fixed vertices, providing a robust framework for realistic and physically accurate mesh creation. Applications in computer graphics and computational geometry are discussed.

Scalable k-Means Clustering for Large k via Seeded Approximate Nearest-Neighbor Search

from arXiv: Computational Geometry

Authors: Jack Spalding-Jamieson, Eliot Wong Robson, Da Wei Zheng

For very large values of $k$, we consider methods for fast $k$-means clustering of massive datasets with $10^7\sim10^9$ points in high-dimensions ($d\geq100$). All current practical methods for this problem have runtimes at least $\Omega(k^2)$. We find that initialization routines are not a bottleneck for this case. Instead, it is critical to improve the speed of Lloyd's local-search algorithm, particularly the step that reassigns points to their closest center. Attempting to improve this step naturally leads us to leverage approximate nearest-neighbor search methods, although this alone is not enough to be practical. Instead, we propose a family of problems we call "Seeded Approximate Nearest-Neighbor Search", for which we propose "Seeded Search-Graph" methods as a solution.
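The step being optimized is the reassignment inside Lloyd's iteration; written naively it costs $O(nkd)$ per round, which is exactly where a (seeded) approximate nearest-neighbor index over the $k$ centers would be swapped in. The sketch below is a plain brute-force baseline, not the authors' seeded search-graph method.

```python
# Lloyd's iteration with the brute-force reassignment step made explicit.
# The reassignment is the O(n*k*d) bottleneck: for very large k it would be
# replaced by an (approximate) nearest-neighbor search over the k centers.
# Plain baseline sketch, not the seeded search-graph method of the paper.
import numpy as np

def lloyd(X, k, n_iter=20, seed=0):
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)].copy()   # simple init
    for _ in range(n_iter):
        # Reassignment: distance from every point to every center (the expensive step).
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        labels = d2.argmin(axis=1)
        # Update: each center moves to the mean of its assigned points.
        for j in range(k):
            members = X[labels == j]
            if len(members):
                centers[j] = members.mean(axis=0)
    return centers, labels

X = np.random.default_rng(1).normal(size=(2000, 16))
centers, labels = lloyd(X, k=50)
print(centers.shape, labels.shape)
```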

Validity-first automatic polycube labeling for CAD models

from arXiv: Computational Geometry

Authors: Sébastien Mestrallet, Christophe Bourcier, Franck Ledoux

For many simulation codes, block-structured hex meshes remain preferred while their automatic generation is unsolved. We investigate the usage of a polycube-based approach. More specifically, we focus on the labeling stage, which consists in assigning each boundary facet to one of the 6 signed principal axis. Similar works are confronted with 2 challenges: over-constraining validity criteria, and the conflated processing of validity criteria with quality metrics. We tackle these obstacles with automatic routines based on semi-global labeling operators. Our approach is successfully tested on CAD models, which are of interest for many numerical simulation problems.

Engineering Insights into Biclique Partitions and Fractional Binary Ranks of Matrices

from arXiv: Data Structures and Algorithms

Authors: Angikar Ghosal, Andreas Karrenbauer

We investigate structural properties of the binary rank of Kronecker powers of binary matrices, equivalently, the biclique partition numbers of the corresponding bipartite graphs. To this end, we engineer a Column Generation approach to solve linear optimization problems for the fractional biclique partition number of bipartite graphs, specifically examining the Domino graph and its Kronecker powers. We address the challenges posed by the double exponential growth of the number of bicliques in increasing Kronecker powers. We discuss various strategies to generate suitable initial sets of bicliques, including an inductive method for increasing Kronecker powers. We show how to manage the number of active bicliques to improve running time and to stay within memory limits. Our computational results reveal that the fractional binary rank is not multiplicative with respect to the Kronecker product. Hence, there are binary matrices, and bipartite graphs, respectively, such as the Domino, where the asymptotic fractional binary rank is strictly smaller than the fractional binary rank. While we used our algorithm to reduce the upper bound, we formally prove that the fractional biclique cover number is a lower bound, which is at least as good as the widely used isolating (or fooling set) bound. For the Domino, we obtain that the asymptotic fractional binary rank lies in the interval $[2,2.373]$. Since our computational resources are not sufficient to further reduce the upper bound, we encourage further exploration using more substantial computing resources or further mathematical engineering techniques to narrow the gap and advance our understanding of biclique partitions, particularly, to settle the open question whether binary rank and biclique partition number are multiplicative with respect to the Kronecker product.

Decay of correlation for edge colorings when $q > 3\Delta$

from arXiv: Data Structures and Algorithms

Authors: Zejia Chen, Yulin Wang, Chihao Zhang, Zihan Zhang

We examine various perspectives on the decay of correlation for the uniform distribution over proper $q$-edge colorings of graphs with maximum degree $\Delta$. First, we establish the coupling independence property when $q\ge 3\Delta$ for general graphs. Together with the work of Chen et al. (2024), this result implies a fully polynomial-time approximation scheme (FPTAS) for counting the number of proper $q$-edge colorings. Next, we prove the strong spatial mixing property on trees, provided that $q> (3+o(1))\Delta$. The strong spatial mixing property is derived from the spectral independence property of a version of the weighted edge coloring distribution, which is established using the matrix trickle-down method developed in Abdolazimi, Liu and Oveis Gharan (FOCS, 2021) and Wang, Zhang and Zhang (STOC, 2024). Finally, we show that the weak spatial mixing property holds on trees with maximum degree $\Delta$ if and only if $q\ge 2\Delta-1$.

Robust Scatter Matrix Estimation for Elliptical Distributions in Polynomial Time

from arXiv: Data Structures and Algorithms

Authors: Gleb Novikov

We study the problem of computationally efficient robust estimation of scatter matrices of elliptical distributions under the strong contamination model. We design polynomial time algorithms that achieve dimension-independent error in Frobenius norm. Our first result is a sequence of efficient algorithms that approaches nearly optimal error. Specifically, under a mild assumption on the eigenvalues of the scatter matrix $\Sigma$, for every $t \in \mathbb{N}$, we design an estimator that, given $n = d^{O(t)}$ samples, in time $n^{O(t)}$ finds $\hat{\Sigma}$ such that $ \Vert{\Sigma^{-1/2}\, ({\hat{\Sigma} - \Sigma})\, \Sigma^{-1/2}}\Vert_{\text{F}} \le O(t \cdot \varepsilon^{1-\frac{1}{t}})$, where $\varepsilon$ is the fraction of corruption. We do not require any assumptions on the moments of the distribution, while all previously known computationally efficient algorithms for robust covariance/scatter estimation with dimension-independent error rely on strong assumptions on the moments, such as sub-Gaussianity or (certifiable) hypercontractivity. Furthermore, under a stronger assumption on the eigenvalues of $\Sigma$ (that, in particular, is satisfied by all matrices with constant condition number), we provide a fast (sub-quadratic in the input size) algorithm that, given nearly optimal number of samples $n = \tilde{O}(d^2/\varepsilon)$, in time $\tilde{O}({nd^2 poly(1/\varepsilon)})$ finds $\hat{\Sigma}$ such that $\Vert\hat{\Sigma} - \Sigma\Vert_{\text{F}} \le O(\Vert{\Sigma}\Vert \cdot \sqrt{\varepsilon})$. Our approach is based on robust covariance estimation of the spatial sign (the projection onto the sphere of radius $\sqrt{d}$) of elliptical distributions.
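
The spatial-sign transform mentioned in the last sentence is easy to state concretely. The sketch below only applies the transform and a plain (non-robust) empirical covariance to synthetic Gaussian data, as an illustration of the preprocessing step rather than of the paper's robust estimator; all parameters are arbitrary.

```python
# Spatial signs: project each sample onto the sphere of radius sqrt(d),
# then estimate a covariance-type matrix of the projected points.
import numpy as np

rng = np.random.default_rng(0)
d, n = 5, 10_000
X = rng.standard_normal((n, d)) @ np.diag([3.0, 1.0, 1.0, 1.0, 0.5])  # elliptical (Gaussian) sample

signs = np.sqrt(d) * X / np.linalg.norm(X, axis=1, keepdims=True)  # spatial signs
cov_of_signs = signs.T @ signs / n                                  # NOT a robust estimator
print(np.round(cov_of_signs, 2))
```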

On the FirstFit Algorithm for Online Unit-Interval Coloring

from arXiv: Data Structures and Algorithms

Authors: Bob Krekelberg, Alison Hsiang-Hsuan Liu

In this paper, we study the performance of the FirstFit algorithm for the online unit-length interval coloring problem, where intervals may be either open or closed; this serves as a further investigation into the actual performance of FirstFit. We develop a sophisticated counting method by generalizing the classic neighborhood bound, which limits the color FirstFit can assign to an interval by counting its potential intersections. In the generalization, we show that for any interval there is a critical interval intersecting it that helps reduce the overestimation of the number of intersections, which in turn helps bound the color an interval can be assigned. The technical challenge then falls on identifying these critical intervals so that the counting is guaranteed to be effective. Using this new mechanism for bounding the color that FirstFit can assign to an interval, we provide a tight analysis of $2\omega$ colors when all intervals have integral endpoints and an upper bound of $\lceil\frac{7}{3}\omega\rceil-2$ colors for the general case, where $\omega$ is the optimal number of colors needed for the input set of intervals.
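
For reference, FirstFit itself is the simple online rule sketched below: each arriving interval receives the smallest color not used by any earlier interval intersecting it. The sketch treats all intervals as closed unit intervals given by their left endpoints; handling the open/closed distinction studied in the paper would only change the intersection test.

```python
# FirstFit for online coloring of closed unit intervals [a, a+1].
def first_fit_unit_intervals(left_endpoints):
    colored = []   # (left_endpoint, color) of already-colored intervals
    colors = []
    for a in left_endpoints:
        used = {c for (b, c) in colored if abs(a - b) <= 1}  # [a,a+1] meets [b,b+1]
        color = 0
        while color in used:       # smallest color not used by an intersecting interval
            color += 1
        colored.append((a, color))
        colors.append(color)
    return colors

print(first_fit_unit_intervals([0.0, 0.5, 1.0, 1.5, 0.25]))  # [0, 1, 2, 0, 3]
```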

Approximation Algorithms for Optimal Hopsets

from arXiv: Data Structures and Algorithms

Authors: Michael Dinitz, Ama Koranteng, Yasamin Nazari

For a given graph $G$, a "hopset" $H$ with hopbound $\beta$ and stretch $\alpha$ is a set of edges such that between every pair of vertices $u$ and $v$, there is a path with at most $\beta$ hops in $G \cup H$ that approximates the distance between $u$ and $v$ up to a multiplicative stretch of $\alpha$. Hopsets have found a wide range of applications for distance-based problems in various computational models since the 90s. More recently, there has been significant interest in understanding these fundamental objects from an existential and structural perspective. But all of this work takes a worst-case (or existential) point of view: How many edges do we need to add to satisfy a given hopbound and stretch requirement for any input graph? We initiate the study of the natural optimization variant of this problem: given a specific graph instance, what is the minimum number of edges that satisfy the hopbound and stretch requirements? We give approximation algorithms for a generalized hopset problem which, when combined with known existential bounds, lead to different approximation guarantees for various regimes depending on hopbound, stretch, and directed vs. undirected inputs. We complement our upper bounds with a lower bound that implies Label Cover hardness for directed hopsets and shortcut sets with hopbound at least $3$.
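
A hopset is easiest to internalize as a checkable property. The sketch below verifies the hopbound–stretch guarantee directly on small weighted graphs given as adjacency dictionaries (undirected edges listed in both directions), using Dijkstra for exact distances and a round-limited Bellman–Ford for $\beta$-hop distances. It is a definition checker under these representation assumptions, not one of the paper's algorithms.

```python
# Check: for every pair u, v, the beta-hop distance in G ∪ H is <= alpha * dist_G(u, v).
# Graphs are dicts {u: {v: weight}}; hopset edges carry the weight they shortcut.
import heapq

def dijkstra(G, s):
    dist, pq = {s: 0.0}, [(0.0, s)]
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist.get(u, float("inf")):
            continue
        for v, w in G.get(u, {}).items():
            if d + w < dist.get(v, float("inf")):
                dist[v] = d + w
                heapq.heappush(pq, (d + w, v))
    return dist

def hop_limited_dist(G, s, beta):
    # beta rounds of Bellman-Ford = shortest walks using at most beta edges
    dist = {s: 0.0}
    for _ in range(beta):
        new = dict(dist)
        for u, d in dist.items():
            for v, w in G.get(u, {}).items():
                if d + w < new.get(v, float("inf")):
                    new[v] = d + w
        dist = new
    return dist

def is_hopset(G, H, alpha, beta):
    GH = {u: dict(G.get(u, {})) for u in set(G) | set(H)}
    for u, nbrs in H.items():
        for v, w in nbrs.items():
            GH[u][v] = min(w, GH[u].get(v, float("inf")))
    for u in G:
        exact = dijkstra(G, u)
        limited = hop_limited_dist(GH, u, beta)
        if any(limited.get(v, float("inf")) > alpha * d + 1e-9 for v, d in exact.items()):
            return False
    return True
```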

ARRIVAL: Recursive Framework & $\ell_1$-Contraction

from arXiv: Data Structures and Algorithms

Authors: Sebastian Haslebacher

ARRIVAL is the problem of deciding which out of two possible destinations will be reached first by a token that moves deterministically along the edges of a directed graph, according to so-called switching rules. It is known to lie in NP $\cap$ CoNP, but not known to lie in P. The state-of-the-art algorithm due to G\"artner et al. (ICALP `21) runs in time $2^{\mathcal{O}(\sqrt{n} \log n)}$ on an $n$-vertex graph. We prove that ARRIVAL can be solved in time $2^{\mathcal{O}(k \log^2 n)}$ on $n$-vertex graphs of treewidth $k$. Our algorithm is derived by adapting a simple recursive algorithm for a generalization of ARRIVAL called G-ARRIVAL. This simple recursive algorithm acts as a framework from which we can also rederive the subexponential upper bound of G\"artner et al. Our second result is a reduction from G-ARRIVAL to the problem of finding an approximate fixed point of an $\ell_1$-contracting function $f : [0, 1]^n \rightarrow [0, 1]^n$. Finding such fixed points is a well-studied problem in the case of the $\ell_2$-metric and the $\ell_\infty$-metric, but little is known about the $\ell_1$-case. Both of our results highlight parallels between ARRIVAL and the Simple Stochastic Games (SSG) problem. Concretely, Chatterjee et al. (SODA `23) gave an algorithm for SSG parameterized by treewidth that achieves a similar bound as we do for ARRIVAL, and SSG is known to reduce to $\ell_\infty$-contraction.
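
To make the switching rules concrete, the sketch below simply runs the ARRIVAL process: each non-target vertex has an "even" and an "odd" successor and forwards the token to them alternately, starting with the even one. Naive simulation can take exponentially many steps in the worst case, which is exactly the obstacle the algorithms above address; the graph in the example is hypothetical.

```python
# Naive simulation of the ARRIVAL switching process with a step cap.
def arrival(succ, start, targets, max_steps=10**6):
    """succ: {v: (even_successor, odd_successor)}; returns the target reached, or None."""
    toggles = {v: 0 for v in succ}   # how often each vertex has forwarded the token
    v = start
    for _ in range(max_steps):
        if v in targets:
            return v
        nxt = succ[v][toggles[v] % 2]  # alternate between the two successors
        toggles[v] += 1
        v = nxt
    return None

succ = {"s": ("a", "b"), "a": ("b", "d1"), "b": ("a", "d2")}
print(arrival(succ, "s", targets={"d1", "d2"}))  # "d1"
```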

New simple and fastest quicksort algorithm for equal keys

from arXiv: Data Structures and Algorithms

Authors: Parviz Afereidoon

This paper introduces a novel and efficient partitioning technique for quicksort, specifically designed for real-world data with duplicate elements (a 50-year-old problem). The method is referred to as "equal quicksort" or "eqsort". Based on the experimental findings, the newly developed algorithm, eqsort, is competitive with the best current implementations, such as fat partitioning algorithms and dual-pivot quicksort. The method offers several advantages over the commonly used dual-pivot method and pdqsort partitioning, making it a potential replacement.
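
The abstract does not describe eqsort's partitioning scheme, so the sketch below shows the standard baseline it is compared against: a three-way ("fat") partition that keeps keys equal to the pivot in place, which is what makes quicksort robust to inputs with many duplicates. This is not eqsort.

```python
# Classic three-way (Dutch-national-flag) quicksort, the "fat partition" baseline.
def quicksort3(a, lo=0, hi=None):
    if hi is None:
        hi = len(a) - 1
    if lo >= hi:
        return
    pivot = a[(lo + hi) // 2]
    lt, i, gt = lo, lo, hi
    while i <= gt:                       # partition into < pivot, == pivot, > pivot
        if a[i] < pivot:
            a[lt], a[i] = a[i], a[lt]
            lt += 1
            i += 1
        elif a[i] > pivot:
            a[i], a[gt] = a[gt], a[i]
            gt -= 1
        else:
            i += 1
    quicksort3(a, lo, lt - 1)            # keys equal to the pivot are already in place
    quicksort3(a, gt + 1, hi)

data = [3, 1, 3, 3, 2, 1, 3, 2, 3]
quicksort3(data)
print(data)  # [1, 1, 2, 2, 3, 3, 3, 3, 3]
```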

Maximum Coverage $k$-Antichains and Chains: A Greedy Approach

from arXiv: Data Structures and Algorithms

Authors: Manuel Cáceres, Andreas Grigorjew, Wanchote Po Jiamjitrak, Alexandru I. Tomescu

Given an input acyclic digraph $G = (V,E)$ and a positive integer $k$, the problem of Maximum Coverage $k$-Antichains (resp., Chains) denoted as MA-$k$ (resp., MC-$k$) asks to find $k$ sets of pairwise unreachable vertices, known as antichains (resp., $k$ subsequences of paths, known as chains), maximizing the number of vertices covered by these antichains (resp. chains). While MC-$k$ has been recently solved in (almost) optimal $O(|E|^{1+o(1)})$ time [Kogan and Parter, ICALP 2022], the fastest known algorithm for MA-$k$ is a recent $(k|E|)^{1+o(1)}$-time solution [Kogan and Parter, ESA 2024] as well as a $1/2$ approximation running in $|E|^{1+o(1)}$ time in the same paper. In this paper, we leverage a paths-based proof of the Greene-Kleitmann (GK) theorem with the help of the greedy algorithm for set cover and recent advances on fast algorithms for flows and shortest paths to obtain the following results for MA-$k$: - The first (exact) algorithm running in $|E|^{1+o(1)}$ time, hence independent of $k$. - A randomized algorithm running in $\tilde{O}(\alpha_k|E|)$ time, where $\alpha_k$ is the size of the optimal solution. That is, a near-linear parameterized running time, generalizing the result of [M\"akinen et al., ACM TALG] obtained for $k=1$. - An approximation algorithm running in time $O(\alpha_1^2|V| + (\alpha_1+k)|E|)$ with approximation ratio of $(1-1/e) > 0.63 > 1/2$. Our last two solutions rely on the use of greedy set cover, first exploited in [Felsner et al., Order 2003] for chains, which we now apply to antichains. We complement these results with two examples (one for chains and one for antichains) showing that, for every $k \ge 2$, greedy misses a $1/4$ portion of the optimal coverage. We also show that greedy is an $\Omega(\log{|V|})$ factor away from minimality when required to cover all vertices, which was previously unknown for sets of chains or antichains.
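
The greedy ingredient borrowed from set cover is the classical maximum-coverage rule: repeatedly take the set that covers the most still-uncovered elements, which is where the $(1-1/e)$ guarantee comes from. The sketch below shows this generic rule with plain finite sets standing in for the antichains produced via the Greene-Kleitmann machinery, which it does not build.

```python
# Generic greedy maximum coverage with a budget of k sets.
def greedy_max_coverage(sets, k):
    covered, chosen = set(), []
    for _ in range(k):
        best = max(sets, key=lambda s: len(s - covered))  # most new elements
        if not best - covered:
            break
        chosen.append(best)
        covered |= best
    return chosen, covered

sets = [{1, 2, 3}, {3, 4}, {4, 5, 6}, {1, 6}]
print(greedy_max_coverage(sets, k=2))  # picks {1,2,3} and {4,5,6}
```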

On the query complexity of sampling from non-log-concave distributions

from arXiv: Data Structures and Algorithms

Authors: Yuchen He, Chihao Zhang

We study the problem of sampling from a $d$-dimensional distribution with density $p(x)\propto e^{-f(x)}$, which does not necessarily satisfy good isoperimetric conditions. Specifically, we show that for any $L,M$ satisfying $LM\ge d\ge 5$, $\epsilon\in \left(0,\frac{1}{32}\right)$, and any algorithm with query accesses to the value of $f(x)$ and $\nabla f(x)$, there exists an $L$-log-smooth distribution with second moment at most $M$ such that the algorithm requires $\left(\frac{LM}{d\epsilon}\right)^{\Omega(d)}$ queries to compute a sample whose distribution is within $\epsilon$ in total variation distance to the target distribution. We complement the lower bound with an algorithm requiring $\left(\frac{LM}{d\epsilon}\right)^{\mathcal O(d)}$ queries, thereby characterizing the tight (up to the constant in the exponent) query complexity for sampling from the family of non-log-concave distributions. Our results are in sharp contrast with the recent work of Huang et al. (COLT'24), where an algorithm with quasi-polynomial query complexity was proposed for sampling from a non-log-concave distribution when $M=\mathtt{poly}(d)$. Their algorithm works under the stronger condition that all distributions along the trajectory of the Ornstein-Uhlenbeck process, starting from the target distribution, are $\mathcal O(1)$-log-smooth. We investigate this condition and prove that it is strictly stronger than requiring the target distribution to be $\mathcal O(1)$-log-smooth. Additionally, we study this condition in the context of mixtures of Gaussians. Finally, we place our results within the broader theme of ``sampling versus optimization'', as studied in Ma et al. (PNAS'19). We show that for a wide range of parameters, sampling is strictly easier than optimization by a super-exponential factor in the dimension $d$.

Pcodec: Better Compression for Numerical Sequences

from arXiv: Data Structures and Algorithms

Authors: Martin Loncaric, Niels Jeppesen, Ben Zinberg

We present Pcodec (Pco), a format and algorithm for losslessly compressing numerical sequences. Pco's core and most novel component is a binning algorithm that quickly converges to the true entropy of smoothly, independently, and identically distributed (SIID) data. To automatically handle more general data, Pco has two opinionated preprocessing steps. The first step, Pco's mode, decomposes the data into more smoothly distributed latent variables. The second step, delta encoding, makes the latents more independently and identically distributed. We prove that, given $k$ bins, binning uses only $\mathcal{O}(1/k)$ bits more than the SIID data's entropy. Additionally, we demonstrate that Pco achieves 29-94% higher compression ratio than other approaches on six real-world datasets while using less compression time.
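
The two preprocessing ideas are easy to illustrate on synthetic data. The sketch below delta-encodes a smoothly increasing sequence and then buckets the resulting latents into equal-count bins, reporting the entropy of the bin labels; this is only a caricature of Pco's binning and entropy coding, with all parameters chosen arbitrarily.

```python
# Toy delta encoding + equal-count binning; not Pco's actual algorithm.
import numpy as np

rng = np.random.default_rng(1)
x = np.cumsum(rng.integers(0, 8, size=1000))             # smoothly increasing raw data
deltas = np.diff(x, prepend=x[:1])                        # delta encoding: small, similar latents

k = 8
edges = np.quantile(deltas, np.linspace(0, 1, k + 1))     # equal-count bin edges
labels = np.clip(np.searchsorted(edges, deltas, "right") - 1, 0, k - 1)
p = np.bincount(labels, minlength=k) / len(labels)
entropy = -(p[p > 0] * np.log2(p[p > 0])).sum()           # cost of the bin labels alone
print(f"bin-label entropy: {entropy:.2f} bits/value over {k} bins")
```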

Improved Sublinear Algorithms for Classical and Quantum Graph Coloring

from arXiv: Data Structures and Algorithms

Authors: Asaf Ferber, Liam Hardiman, Xiaonan Chen

We present three sublinear randomized algorithms for vertex-coloring of graphs with maximum degree $\Delta$. The first is a simple algorithm that extends the idea of Morris and Song to color graphs with maximum degree $\Delta$ using $\Delta+1$ colors. Combined with the greedy algorithm, it achieves an expected runtime of $O(n^{3/2}\sqrt{\log n})$ in the query model, improving on Assadi, Chen, and Khanna's algorithm by a $\sqrt{\log n}$ factor in expectation. When we allow quantum queries to the graph, we can accelerate the first algorithm using Grover's famous algorithm, resulting in a runtime of $\tilde{O}(n^{4/3})$ quantum queries. Finally, we introduce a quantum algorithm for $(1+\epsilon)\Delta$-coloring, achieving $O(\epsilon^{-1}n^{5/4}\log^{3/2}n)$ quantum queries, offering a polynomial improvement over the previous best bound by Morris and Song.

Amnesiac Flooding: Easy to break, hard to escape

from arXiv: Data Structures and Algorithms

Authors: Henry Austin, Maximilien Gadouleau, George B. Mertzios, Amitabh Trehan

Broadcast is a central problem in distributed computing. Recently, Hussak and Trehan [PODC'19/DC'23] proposed a stateless broadcasting protocol (Amnesiac Flooding), which was surprisingly proven to terminate in asymptotically optimal time (linear in the diameter of the network). However, it remains unclear: (i) Are there other stateless terminating broadcast algorithms with the desirable properties of Amnesiac Flooding, (ii) How robust is Amnesiac Flooding with respect to \emph{faults}? In this paper we make progress on both of these fronts. Under a reasonable restriction (obliviousness to message content) additional to the fault-free synchronous model, we prove that Amnesiac Flooding is the \emph{only} strictly stateless deterministic protocol that can achieve terminating broadcast. We identify four natural properties of a terminating broadcast protocol that Amnesiac Flooding uniquely satisfies. In contrast, we prove that even minor relaxations of \textit{any} of these four criteria allow the construction of other terminating broadcast protocols. On the other hand, we prove that Amnesiac Flooding can become non-terminating or non-broadcasting, even if we allow just one node to drop a single message on a single edge in a single round. As a tool for proving this, we focus on the set of all \textit{configurations} of transmissions between nodes in the network, and obtain a \textit{dichotomy} characterizing the configurations, starting from which, Amnesiac Flooding terminates. Additionally, we characterise the structure of sets of Byzantine agents capable of forcing non-termination or non-broadcast of the protocol on arbitrary networks.
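
Amnesiac Flooding itself is short enough to simulate directly: a node that receives the message in some round forwards it in the next round to exactly those neighbours it did not receive it from, and keeps no other state. The sketch below runs the fault-free synchronous process on a small hypothetical graph.

```python
# Round-by-round simulation of Amnesiac Flooding on an undirected graph.
def amnesiac_flooding(adj, source, max_rounds=1000):
    msgs = {(source, v) for v in adj[source]}     # messages in transit this round
    informed, rounds = {source}, 0
    while msgs and rounds < max_rounds:
        rounds += 1
        received = {}
        for u, v in msgs:                         # deliver this round's messages
            received.setdefault(v, set()).add(u)
            informed.add(v)
        msgs = {(v, w) for v, senders in received.items()
                       for w in adj[v] if w not in senders}  # forward away from senders
    return informed, rounds

adj = {0: {1, 2}, 1: {0, 2, 3}, 2: {0, 1}, 3: {1}}
print(amnesiac_flooding(adj, source=0))  # ({0, 1, 2, 3}, 3): broadcast completed, then terminated
```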

Constant sensitivity on the CDAWGs

from arXiv: Data Structures and Algorithms

Authors: Rikuya Hamai, Hiroto Fujimaru, Shunsuke Inenaga

Compact directed acyclic word graphs (CDAWGs) [Blumer et al. 1987] are a fundamental data structure on strings with applications in text pattern searching, data compression, and pattern discovery. Intuitively, the CDAWG of a string $T$ is obtained by merging isomorphic subtrees of the suffix tree [Weiner 1973] of the same string $T$, and thus CDAWGs are a compact indexing structure. In this paper, we investigate the sensitivity of CDAWGs when a single character edit operation is performed at an arbitrary position in $T$. We show that the size of the CDAWG after an edit operation on $T$ is asymptotically at most 8 times larger than the original CDAWG before the edit.

Faster Approximation Algorithms for k-Center via Data Reduction

from arXiv: Data Structures and Algorithms

Authors: Arnold Filtser, Shaofeng H. -C. Jiang, Yi Li, Anurag Murty Naredla, Ioannis Psarros, Qiaoyuan Yang, Qin Zhang

We study efficient algorithms for the Euclidean $k$-Center problem, focusing on the regime of large $k$. We take the approach of data reduction via $\alpha$-coresets: an $\alpha$-coreset is a small subset $S$ of the dataset $P$ such that any $\beta$-approximation on $S$ is an $(\alpha + \beta)$-approximation on $P$. We give efficient algorithms to construct coresets whose size is $k \cdot o(n)$, which immediately speeds up existing approximation algorithms. Notably, we obtain a near-linear time $O(1)$-approximation when $k = n^c$ for any $0 < c < 1$. We validate the performance of our coresets on real-world datasets with large $k$, and we observe that the coreset speeds up the well-known Gonzalez algorithm by up to $4$ times, while still achieving similar clustering cost. Technically, one of our coreset results is based on a new efficient construction of consistent hashing with competitive parameters. This general tool may be of independent interest for algorithm design in high-dimensional Euclidean spaces.
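
The Gonzalez algorithm referenced in the experiments is the classical farthest-first traversal, a 2-approximation for $k$-Center; the sketch below implements it on raw points, and running it on a coreset $S$ instead of the full dataset $P$ is the kind of speedup the paper measures. The data here is synthetic.

```python
# Gonzalez's farthest-first traversal for k-Center.
import numpy as np

def gonzalez(points, k, seed=0):
    points = np.asarray(points, dtype=float)
    rng = np.random.default_rng(seed)
    centers = [points[rng.integers(len(points))]]
    dists = np.linalg.norm(points - centers[0], axis=1)
    for _ in range(k - 1):
        i = int(np.argmax(dists))                  # farthest point becomes the next center
        centers.append(points[i])
        dists = np.minimum(dists, np.linalg.norm(points - points[i], axis=1))
    return np.array(centers), dists.max()          # centers and the clustering radius

pts = np.random.default_rng(1).normal(size=(500, 2))
centers, radius = gonzalez(pts, k=5)
print(radius)
```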

Sink-free orientations: a local sampler with applications

from arXiv: Data Structures and Algorithms

Authors: Konrad Anand, Graham Freifeld, Heng Guo, Chunyang Wang, Jiaheng Wang

For sink-free orientations in graphs of minimum degree at least $3$, we show that there is a deterministic approximate counting algorithm that runs in time $O((n^{73}/\varepsilon^{72})\log(n/\varepsilon))$, a near-linear time sampling algorithm, and a randomised approximate counting algorithm that runs in time $O((n/\varepsilon)^2\log(n/\varepsilon))$, where $n$ denotes the number of vertices of the input graph and $0<\varepsilon<1$ is the desired accuracy. All three algorithms are based on a local implementation of the sink popping method (Cohn, Pemantle, Propp, 2002) under the partial rejection sampling framework (Guo, Jerrum, Liu, 2019).
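
The sink-popping method referenced above admits a very short global description (the paper's contribution is a local implementation of it): orient every edge uniformly at random and, while sinks exist, re-randomize the orientations of all edges incident to them. A minimal sketch on an arbitrary small example graph:

```python
# Global sink-popping sampler for sink-free orientations (Cohn, Pemantle, Propp).
import random

def sink_popping(edges, seed=None):
    rng = random.Random(seed)
    orient = {e: rng.random() < 0.5 for e in edges}   # True means e = (u, v) is directed u -> v

    def sinks():
        out = {u for (u, v) in edges} | {v for (u, v) in edges}
        for (u, v), d in orient.items():
            out.discard(u if d else v)                # this endpoint has an outgoing edge
        return out                                    # vertices with no outgoing edge

    s = sinks()
    while s:
        for e in edges:                               # resample every edge touching a sink
            if e[0] in s or e[1] in s:
                orient[e] = rng.random() < 0.5
        s = sinks()
    return [e if orient[e] else (e[1], e[0]) for e in edges]

edges = [(0, 1), (1, 2), (2, 0), (0, 3), (1, 3), (2, 3)]  # K4, minimum degree 3
print(sink_popping(edges, seed=42))
```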

Attainability of Two-Point Testing Rates for Finite-Sample Location Estimation

from arXiv: Data Structures and Algorithms

Authors: Spencer Compton, Gregory Valiant

LeCam's two-point testing method yields perhaps the simplest lower bound for estimating the mean of a distribution: roughly, if it is impossible to reliably distinguish a distribution centered at $\mu$ from the same distribution centered at $\mu+\Delta$, then it is impossible to estimate the mean to accuracy better than $\Delta/2$. Whether a nearly matching upper bound is attainable depends on the setting. We study the conditions under which the two-point testing lower bound can be attained for univariate mean estimation, both in the setting of location estimation (where the distribution is known up to translation) and adaptive location estimation (unknown distribution). Roughly, we say an estimate nearly attains the two-point testing lower bound if it incurs error that is at most polylogarithmically larger than the Hellinger modulus of continuity for $\tilde{\Omega}(n)$ samples. Adaptive location estimation is particularly interesting, as some distributions admit much better guarantees than sub-Gaussian rates (e.g. $\operatorname{Unif}(\mu-1,\mu+1)$ permits error $\Theta(\frac{1}{n})$, while the sub-Gaussian rate is $\Theta(\frac{1}{\sqrt{n}})$), yet it is not obvious whether these rates may be adaptively attained by one unified approach. Our main result is an algorithm that nearly attains the two-point testing rate for mixtures of symmetric, log-concave distributions with a common mean. Moreover, this algorithm runs in near-linear time and is parameter-free. In contrast, we show the two-point testing rate is not nearly attainable even for symmetric, unimodal distributions. We complement this with results for location estimation, showing that the two-point testing rate is nearly attainable for unimodal distributions, but unattainable for symmetric distributions.
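
The rate gap cited for $\operatorname{Unif}(\mu-1,\mu+1)$ is easy to see numerically: the midrange estimator concentrates at rate roughly $1/n$, while the sample mean only achieves the $1/\sqrt{n}$ sub-Gaussian rate. A quick simulation (not from the paper, with arbitrary parameters):

```python
# Midrange vs. sample mean for Unif(mu - 1, mu + 1).
import numpy as np

rng = np.random.default_rng(0)
mu, n, trials = 3.0, 10_000, 200
samples = rng.uniform(mu - 1, mu + 1, size=(trials, n))

midrange_err = np.abs((samples.min(1) + samples.max(1)) / 2 - mu).mean()
mean_err = np.abs(samples.mean(1) - mu).mean()
print(f"midrange |error|    ~ {midrange_err:.5f}")  # scales like 1/n
print(f"sample mean |error| ~ {mean_err:.5f}")      # scales like 1/sqrt(n)
```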
