Theory of Computing Report

Friday, April 04

Sublinear Time Algorithms for Abelian Group Isomorphism and Basis Construction

from arXiv: Computational Complexity

Authors: Nader H. Bshouty

In this paper, we study the problems of abelian group isomorphism and basis construction in two models. In the {\it partially specified model} (PS-model), the algorithm does not know the group size but can access randomly chosen elements of the group along with the Cayley table of those elements, which provides the result of the binary operation for every pair of selected elements. In the stronger {\it fully specified model} (FS-model), the algorithm knows the size of the group and has access to its elements and Cayley table. Given two abelian groups $G$ and $H$, we present an algorithm in the PS-model (and hence in the FS-model) that runs in time $\tilde O(\sqrt{|G|})$ and decides if they are isomorphic. This improves on Kavitha's linear-time algorithm and gives the first sublinear-time solution for this problem. We then prove the lower bound $\Omega(|G|^{1/4})$ for the FS-model and the tight bound $\Omega(\sqrt{|G|})$ for the PS-model. These are the first known lower bounds for this problem. We obtain similar results for finding a basis for abelian groups. For deterministic algorithms, a simple $\Omega(|G|)$ lower bound is given.
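
As a point of reference for the models above, here is a brute-force baseline in Python (emphatically not the paper's sublinear algorithm): it decides isomorphism of two finite abelian groups from their full Cayley tables, using the classical fact that two finite abelian groups are isomorphic if and only if they contain the same number of elements of each order. It inspects the whole table, so it runs in time $\tilde O(|G|^2)$, far above the $\tilde O(\sqrt{|G|})$ bound achieved in the paper.

```python
from collections import Counter

def element_orders(table):
    """Multiset of element orders of a finite group given by its Cayley table.

    table[i][j] is the index of g_i * g_j; index 0 is assumed to be the identity.
    """
    orders = []
    for g in range(len(table)):
        x, k = g, 1
        while x != 0:          # keep multiplying by g until we hit the identity
            x = table[x][g]
            k += 1
        orders.append(k)
    return Counter(orders)

def abelian_isomorphic(table_g, table_h):
    """For finite *abelian* groups, equal order statistics imply isomorphism."""
    return len(table_g) == len(table_h) and \
        element_orders(table_g) == element_orders(table_h)

# Z_4 vs Z_2 x Z_2: same size, not isomorphic.
z4 = [[(i + j) % 4 for j in range(4)] for i in range(4)]
z2xz2 = [[i ^ j for j in range(4)] for i in range(4)]
print(abelian_isomorphic(z4, z2xz2))  # False
```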

Quantum algorithms through graph composition

from arXiv: Computational Complexity

Authors: Arjan Cornelissen

In this work, we introduce the graph composition framework, a generalization of the st-connectivity framework for generating quantum algorithms, where the availability of each of the graph's edges is computed by a span program. We provide an exact characterization of the resulting witness sizes in terms of effective resistances of related graphs. We also provide less-powerful, but easier-to-use upper bounds on these witness sizes. We give generic time-efficient implementations of algorithms generated through the graph composition framework, in the quantum read-only memory model, which is a weaker assumption than the more common quantum random-access model. Along the way, we simplify the span program algorithm, and remove the dependence of its analysis on the effective spectral gap lemma. We unify the quantum algorithmic frameworks that are based on span programs or the quantum adversary bound. In particular, we show how the st-connectivity framework subsumes the learning graph framework, the weighted-decision-tree framework, and a zero-error version of the latter. We show that the graph composition framework subsumes part of the quantum divide and conquer framework, and that it is itself subsumed by the multidimensional quantum walk framework. Moreover, we show that the weighted-decision-tree complexity is quadratically related to deterministic query complexity, and to the GT-bound with polynomial exponent 3/2. For the latter, we also provide a matching separation. We apply our techniques to give improved algorithms for various string-search problems, namely the Dyck-language recognition problem of depth 3, the 3-increasing subsequence problem, and the OR $\circ$ pSEARCH problem. We also simplify existing quantum algorithms for the space-efficient directed st-connectivity problem, the pattern-matching problem and the infix-search problem.
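
The quantity at the heart of the witness-size characterization, the effective resistance between two vertices, has a standard classical formula: $R_{\mathrm{eff}}(s,t) = (e_s - e_t)^\top L^{+} (e_s - e_t)$, where $L^{+}$ is the pseudoinverse of the graph Laplacian. A minimal sketch of that classical computation (for orientation only; it reflects none of the quantum implementation details of the paper):

```python
import numpy as np

def effective_resistance(n, edges, s, t):
    """Effective resistance between s and t with unit resistance per edge."""
    L = np.zeros((n, n))
    for u, v in edges:                  # assemble the graph Laplacian
        L[u, u] += 1; L[v, v] += 1
        L[u, v] -= 1; L[v, u] -= 1
    Lp = np.linalg.pinv(L)              # Moore-Penrose pseudoinverse
    chi = np.zeros(n)
    chi[s], chi[t] = 1.0, -1.0
    return chi @ Lp @ chi

# Two length-2 paths from 0 to 3 in parallel: (1+1) || (1+1) = 1 ohm.
print(effective_resistance(4, [(0, 1), (1, 3), (0, 2), (2, 3)], 0, 3))  # ~1.0
```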

Investigating Simple Drawings of $K_n$ using SAT

from arXiv: Computational Geometry

Authors: Helena Bergold, Manfred Scheucher

We present a SAT framework that allows us to investigate properties of simple drawings of the complete graph $K_n$ using the power of AI. In contrast to classic imperative programming, where a program is executed step by step, our framework models mathematical questions as Boolean formulas which are then solved using modern SAT solvers. Our framework for simple drawings is based on a characterization via rotation systems and finite forbidden substructures. We showcase its universality by addressing various open problems, reproving previous computational results and deriving several new computational results. In particular, we test and make progress on several unavoidable configurations such as variants of Rafla's conjecture on plane Hamiltonian cycles, Harborth's conjecture on empty triangles, and crossing families for general simple drawings as well as for various subclasses. Moreover, based on our computational results, we propose some new challenging conjectures.
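
To convey the flavor of encoding finite forbidden substructures as clauses (an illustrative toy, not the paper's rotation-system encoding): the snippet below writes a DIMACS CNF asking whether the edges of $K_5$ admit a 2-coloring with no monochromatic triangle, one clause per forbidden substructure, ready for any off-the-shelf SAT solver.

```python
from itertools import combinations

n = 5
var = {e: i + 1 for i, e in enumerate(combinations(range(n), 2))}

clauses = []
for a, b, c in combinations(range(n), 3):
    tri = [var[(a, b)], var[(a, c)], var[(b, c)]]
    clauses.append(tri)                  # forbid an all-color-0 triangle
    clauses.append([-x for x in tri])    # forbid an all-color-1 triangle

with open("k5.cnf", "w") as f:
    f.write(f"p cnf {len(var)} {len(clauses)}\n")
    for cl in clauses:
        f.write(" ".join(map(str, cl)) + " 0\n")
# Any solver (e.g. `kissat k5.cnf`) reports SAT here, while the same
# encoding for K_6 is UNSAT: exactly the Ramsey fact R(3,3) = 6.
```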

Instance-Optimal Imprecise Convex Hull

from arXiv: Computational Geometry

Authors: Sarita de Berg, Ivor van der Hoog, Eva Rotenberg, Daniel Rutschmann, Sampson Wong

Imprecise measurements of a point set P = (p1, ..., pn) can be modelled by a family of regions F = (R1, ..., Rn), where each imprecise region Ri contains a unique point pi. A retrieval models an accurate measurement by replacing an imprecise region Ri with its corresponding point pi. We construct the convex hull of an imprecise point set in the plane, where regions in F may be retrieved at unit cost. The goal is to determine the cyclic ordering of the convex hull vertices of P as efficiently as possible. Here, efficiency is interpreted in two ways: (i) minimising the number of retrievals, and (ii) computing each retrieval location quickly. Prior works focused on only one of these two aspects: either minimising retrievals or optimising algorithmic runtime. Our contribution is the first to simultaneously achieve both. Let r(F, P) denote the minimal number of retrievals required by any algorithm to determine the convex hull of P for a given instance (F, P). For a family F of n constant-complexity polygons, our main result is a reconstruction algorithm that performs O(r(F, P)) retrievals in O(r(F, P) log^3 n) time. Compared to previous approaches that achieve optimal retrieval counts, we improve the runtime per retrieval by an exponential factor, from polynomial to polylogarithmic. Compared to near-linear time algorithms, we significantly reduce the number of retrievals used, and broaden the input families to include overlapping regions. We further extend our results to simple k-gons and to pairwise disjoint disks with radii in [1,k], where our runtime scales linearly with k.

Determining Sphere Radius through Pairwise Distances

from arXiv: Computational Geometry

Authors: Boris Sukhovilov

We propose a novel method for determining the radius of a spherical surface based on the distances measured between points on this surface. We consider the most general case of determining the radius when the distances are measured with errors and the sphere has random deviations from its ideal shape. For the solution, we use both the minimally necessary four points and an arbitrary number $N$ of points. We provide a new closed-form solution for the radius of the sphere through the matrix of pairwise distances. We also determine the standard deviation of the radius estimate caused by measurement errors and deviations of the sphere from its ideal shape. We found optimal configurations of points on the sphere that provide the minimum standard deviation of the radius estimate. This paper describes our solution and provides all the mathematical derivations. We share the implementation of our method as open source code at https://github.com/boris-sukhovilov/Sphere_Radius.
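
For the minimal four-point configuration, a classical reference computation (stated as a generic sanity check; the paper derives its own closed-form expression through the full pairwise-distance matrix) is the Cayley-Menger circumradius formula $R^2 = -\det(\Delta)/(2\det(\hat\Delta))$, where $\Delta$ is the matrix of squared pairwise distances and $\hat\Delta$ borders it with a row and column of ones:

```python
import numpy as np

def circumradius_from_distances(D):
    """Radius of the sphere through the points of a simplex.

    D is the k x k matrix of *squared* pairwise distances (zero diagonal);
    k = 4 points in general position determine a sphere in 3D.
    """
    k = len(D)
    bordered = np.ones((k + 1, k + 1))
    bordered[0, 0] = 0.0
    bordered[1:, 1:] = D                 # Cayley-Menger bordered matrix
    return np.sqrt(-np.linalg.det(D) / (2.0 * np.linalg.det(bordered)))

# Four points on the unit sphere: the radius 1 is recovered.
pts = np.array([[1, 0, 0], [-1, 0, 0], [0, 1, 0], [0, 0, 1]], dtype=float)
D = ((pts[:, None, :] - pts[None, :, :]) ** 2).sum(-1)
print(circumradius_from_distances(D))  # ~1.0
```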

Beating full state tomography for unentangled spectrum estimation

from arXiv: Data Structures and Algorithms

Authors: Angelos Pelecanos, Xinyu Tan, Ewin Tang, John Wright

How many copies of a mixed state $\rho \in \mathbb{C}^{d \times d}$ are needed to learn its spectrum? To date, the best known algorithms for spectrum estimation require as many copies as full state tomography, suggesting the possibility that learning a state's spectrum might be as difficult as learning the entire state. We show that this is not the case in the setting of unentangled measurements, by giving a spectrum estimation algorithm that uses $n = O(d^3\cdot (\log\log(d) / \log(d))^4 )$ copies of $\rho$, which is asymptotically fewer than the $n = \Omega(d^3)$ copies necessary for full state tomography. Our algorithm is inspired by the technique of local moment matching from classical statistics, and shows how it can be applied in the quantum setting. As an important subroutine in our spectrum estimation algorithm, we give an estimator of the $k$-th moment $\operatorname{tr}(\rho^k)$ which performs unentangled measurements and uses $O(d^{3-2/k})$ copies of $\rho$ in order to achieve a constant multiplicative error. This directly translates to an additive-error estimator of quantum Rényi entropy of order $k$ with the same number of copies. Finally, we present numerical evidence that the sample complexity of spectrum estimation can only improve over full state tomography by a sub-polynomial factor. Specifically, for spectrum learning with fully entangled measurements, we run simulations which suggest a lower bound of $\Omega(d^{2 - \gamma})$ copies for any constant $\gamma > 0$. From this, we conclude that the current best lower bound of $\Omega(d)$ is likely not tight.
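
The moments being estimated here are plain spectral functionals: $\operatorname{tr}(\rho^k) = \sum_i \lambda_i^k$ over the eigenvalues $\lambda_i$ of $\rho$. A small numerical check of this identity (a classical computation with full access to $\rho$, unlike the copy-access model of the paper):

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8
# Random density matrix: A A^dagger normalized to unit trace is PSD.
A = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
rho = A @ A.conj().T
rho /= np.trace(rho).real

for k in (2, 3, 4):
    via_power = np.trace(np.linalg.matrix_power(rho, k)).real
    via_spectrum = (np.linalg.eigvalsh(rho) ** k).sum()
    print(k, via_power, via_spectrum)    # the two columns agree
```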

Mind the Gap? Not for SVP Hardness under ETH!

from arXiv: Data Structures and Algorithms

Authors: Divesh Aggarwal, Rishav Gupta, Aditya Morolia

We prove new hardness results for fundamental lattice problems under the Exponential Time Hypothesis (ETH). Building on a recent breakthrough by Bitansky et al. [BHIRW24], who gave a polynomial-time reduction from $\mathsf{3SAT}$ to the (gap) $\mathsf{MAXLIN}$ problem, a class of CSPs with linear equations over finite fields, we derive ETH-hardness for several lattice problems. First, we show that for any $p \in [1, \infty)$, there exists an explicit constant $\gamma > 1$ such that $\mathsf{CVP}_{p,\gamma}$ (the $\ell_p$-norm approximate Closest Vector Problem) does not admit a $2^{o(n)}$-time algorithm unless ETH is false. Our reduction is deterministic and proceeds via a direct reduction from (gap) $\mathsf{MAXLIN}$ to $\mathsf{CVP}_{p,\gamma}$. Next, we prove a randomized ETH-hardness result for $\mathsf{SVP}_{p,\gamma}$ (the $\ell_p$-norm approximate Shortest Vector Problem) for all $p > 2$. This result relies on a novel property of the integer lattice $\mathbb{Z}^n$ in the $\ell_p$ norm and a randomized reduction from $\mathsf{CVP}_{p,\gamma}$ to $\mathsf{SVP}_{p,\gamma'}$. Finally, we improve over prior reductions from $\mathsf{3SAT}$ to $\mathsf{BDD}_{p, \alpha}$ (the Bounded Distance Decoding problem), yielding better ETH-hardness results for $\mathsf{BDD}_{p, \alpha}$ for any $p \in [1, \infty)$ and $\alpha > \alpha_p^{\ddagger}$, where $\alpha_p^{\ddagger}$ is an explicit threshold depending on $p$. We additionally observe that prior work implies ETH hardness for the gap minimum distance problem ($\gamma$-$\mathsf{MDP}$) in codes.

Finding Diverse Solutions in Combinatorial Problems with a Distributive Lattice Structure

from arXiv: Data Structures and Algorithms

Authors: Mark de Berg, Andrés López Martínez, Frits Spieksma

We generalize the polynomial-time solvability of $k$-\textsc{Diverse Minimum s-t Cuts} (De Berg et al., ISAAC'23) to a wider class of combinatorial problems whose solution sets have a distributive lattice structure. We identify three structural conditions that, when met by a problem, ensure that a $k$-sized multiset of maximally-diverse solutions -- measured by the sum of pairwise Hamming distances -- can be found in polynomial time. We apply this framework to obtain polynomial time algorithms for finding diverse minimum $s$-$t$ cuts and diverse stable matchings. Moreover, we show that the framework extends to two other natural measures of diversity. Lastly, we present a simpler algorithmic framework for finding a largest set of pairwise disjoint solutions in problems that meet these structural conditions.
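
The diversity measure above decomposes coordinate-wise: for $k$ binary vectors, the sum of pairwise Hamming distances equals $\sum_j c_j (k - c_j)$, where $c_j$ counts the vectors with a 1 in coordinate $j$. A quick sketch of this computation (generic to any 0/1 encoding of solutions, e.g. edge-indicator vectors of cuts or matchings):

```python
def hamming_diversity(solutions):
    """Sum of pairwise Hamming distances over equal-length 0/1 vectors."""
    k = len(solutions)
    total = 0
    for coords in zip(*solutions):       # visit one coordinate at a time
        ones = sum(coords)
        total += ones * (k - ones)       # number of pairs differing here
    return total

sols = [(0, 1, 1, 0), (1, 1, 0, 0), (0, 0, 1, 1)]
print(hamming_diversity(sols))  # 8 = d(s1,s2) + d(s1,s3) + d(s2,s3) = 2+2+4
```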

A PTAS for Travelling Salesman Problem with Neighbourhoods Over Parallel Line Segments of Similar Length

from arXiv: Data Structures and Algorithms

Authors: Benyamin Ghaseminia, Mohammad R. Salavatipour

We consider the Travelling Salesman Problem with Neighbourhoods (TSPN) on the Euclidean plane ($\mathbb{R}^2$) and present a Polynomial-Time Approximation Scheme (PTAS) when the neighbourhoods are parallel line segments with lengths in $[1, \lambda]$ for any constant value $\lambda \ge 1$. In TSPN (which generalizes classic TSP), each client represents a set (or neighbourhood) of points in a metric and the goal is to find a minimum cost TSP tour that visits at least one point from each client set. In the Euclidean setting, each neighbourhood is a region on the plane. TSPN is significantly more difficult than classic TSP even in the Euclidean setting, as it captures group TSP. A notable case of TSPN is when each neighbourhood is a line segment. Although there are PTASs for the case when neighbourhoods are fat objects (with limited overlap), TSPN over line segments is \textbf{APX}-hard even if all the line segments have unit length. For parallel (unit) line segments, the best approximation factor is $3\sqrt2$ from more than two decades ago [DM03]. The PTAS we present in this paper settles the approximability of this case of the problem. Our algorithm finds a $(1 + \varepsilon)$-factor approximation for an instance of the problem for $n$ segments with lengths in $[1,\lambda]$ in time $n^{O(\lambda/\varepsilon^3)}$.

Dynamic Treewidth in Logarithmic Time

from arXiv: Data Structures and Algorithms

Authors: Tuukka Korhonen

We present a dynamic data structure that maintains a tree decomposition of width at most $9k+8$ of a dynamic graph with treewidth at most $k$, which is updated by edge insertions and deletions. The amortized update time of our data structure is $2^{O(k)} \log n$, where $n$ is the number of vertices. The data structure also supports maintaining any ``dynamic programming scheme'' on the tree decomposition, providing, for example, a dynamic version of Courcelle's theorem with $O_{k}(\log n)$ amortized update time; the $O_{k}(\cdot)$ notation hides factors that depend on $k$. This improves upon a result of Korhonen, Majewski, Nadara, Pilipczuk, and Soko{\l}owski [FOCS 2023], who gave a similar data structure but with amortized update time $2^{k^{O(1)}} n^{o(1)}$. Furthermore, our data structure is arguably simpler. Our main novel idea is to maintain a tree decomposition that is ``downwards well-linked'', which allows us to implement local rotations and analysis similar to those for splay trees.

Efficient Algorithms for Cardinality Estimation and Conjunctive Query Evaluation With Simple Degree Constraints

from arXiv: Data Structures and Algorithms

Authors: Sungjin Im, Benjamin Moseley, Hung Q. Ngo, Kirk Pruhs

Cardinality estimation and conjunctive query evaluation are two of the most fundamental problems in database query processing. Recent work proposed, studied, and implemented a robust and practical information-theoretic cardinality estimation framework. In this framework, the estimator is the cardinality upper bound of a conjunctive query subject to ``degree-constraints'', which model a rich set of input data statistics. For general degree constraints, computing this bound is computationally hard. Researchers have naturally sought efficiently computable relaxed upper bounds that are as tight as possible. The polymatroid bound is the tightest among those relaxed upper bounds. While it is an open question whether the polymatroid bound can be computed in polynomial-time in general, it is known to be computable in polynomial-time for some classes of degree constraints. Our focus is on a common class of degree constraints called simple degree constraints. Researchers had not previously determined how to compute the polymatroid bound in polynomial time for this class of constraints. Our first main result is a polynomial time algorithm to compute the polymatroid bound given simple degree constraints. Our second main result is a polynomial-time algorithm to compute a ``proof sequence'' establishing this bound. This proof sequence can then be incorporated into the PANDA framework to give a faster algorithm to evaluate a conjunctive query. In addition, we show computational limitations to extending our results to broader classes of degree constraints. Finally, our technique leads naturally to a new relaxed upper bound called the {\em flow bound}, which is computationally tractable.
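
For orientation (this example is folklore in this literature, not taken from the paper): when the only statistics are cardinality constraints, the polymatroid bound specializes to the classical AGM bound, computable as a small linear program over fractional edge covers. A sketch for the triangle query $Q(x,y,z) = R(x,y) \bowtie S(y,z) \bowtie T(z,x)$:

```python
import numpy as np
from scipy.optimize import linprog

# AGM bound: minimize sum_F w_F * log|F| over fractional edge covers w.
rels = ["R", "S", "T"]
sizes = {"R": 1000, "S": 1000, "T": 1000}
vars_in = {"R": {"x", "y"}, "S": {"y", "z"}, "T": {"z", "x"}}

c = [np.log2(sizes[r]) for r in rels]
A_ub, b_ub = [], []
for v in ("x", "y", "z"):            # each variable must be covered: sum w_F >= 1
    A_ub.append([-1.0 if v in vars_in[r] else 0.0 for r in rels])
    b_ub.append(-1.0)

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None)] * len(rels))
print(res.x, 2 ** res.fun)   # cover (0.5, 0.5, 0.5); bound 1000^{3/2} ~ 31623
```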

Faster Mixing of the Jerrum-Sinclair Chain

from arXiv: Data Structures and Algorithms

Authors: Xiaoyu Chen, Weiming Feng, Zhe Ju, Tianshun Miao, Yitong Yin, Xinyuan Zhang

We show that the Jerrum-Sinclair Markov chain on matchings mixes in time $\widetilde{O}(\Delta^2 m)$ on any graph with $n$ vertices, $m$ edges, and maximum degree $\Delta$, for any constant edge weight $\lambda>0$. For general graphs with arbitrary, potentially unbounded $\Delta$, this provides the first improvement over the classic $\widetilde{O}(n^2 m)$ mixing time bound of Jerrum and Sinclair (1989) and Sinclair (1992). To achieve this, we develop a general framework for analyzing mixing times, combining ideas from the classic canonical path method with the "local-to-global" approaches recently developed in high-dimensional expanders, introducing key innovations to both techniques.
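
For reference, the chain being analyzed is the classical add/delete/slide walk on matchings, with a Metropolis filter giving stationary weight proportional to $\lambda^{|M|}$. A minimal simulation sketch of that standard chain (the paper's contribution is the sharper mixing analysis, not the chain itself):

```python
import random

def js_step(M, matched, edges, lam=1.0):
    """One add/delete/slide step of the Jerrum-Sinclair chain on matchings.

    M is a set of edges; matched maps each covered vertex to its matching edge.
    """
    u, v = e = random.choice(edges)
    if e in M:
        move, delta = ("del", e, None), -1
    elif u not in matched and v not in matched:
        move, delta = ("add", e, None), +1
    elif (u in matched) != (v in matched):        # slide: swap e for the blocker f
        f = matched.get(u, matched.get(v))
        move, delta = ("slide", e, f), 0
    else:
        return                                    # both endpoints covered: stay put
    if random.random() <= min(1.0, lam ** delta): # Metropolis acceptance
        kind, e, f = move
        if kind != "add":
            g = e if kind == "del" else f
            M.remove(g)
            for x in g:
                del matched[x]
        if kind != "del":
            M.add(e)
            matched[u] = e
            matched[v] = e

# Walk over matchings of a 4-cycle.
edges = [(0, 1), (1, 2), (2, 3), (0, 3)]
M, matched = set(), {}
for _ in range(10_000):
    js_step(M, matched, edges)
print(M)
```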

Computing High-dimensional Confidence Sets for Arbitrary Distributions

from arXiv: Data Structures and Algorithms

Authors: Chao Gao, Liren Shan, Vaidehi Srinivas, Aravindan Vijayaraghavan

We study the problem of learning a high-density region of an arbitrary distribution over $\mathbb{R}^d$. Given a target coverage parameter $\delta$, and sample access to an arbitrary distribution $D$, we want to output a confidence set $S \subset \mathbb{R}^d$ such that $S$ achieves $\delta$ coverage of $D$, i.e., $\mathbb{P}_{y \sim D} \left[ y \in S \right] \ge \delta$, and the volume of $S$ is as small as possible. This is a central problem in high-dimensional statistics with applications in finding confidence sets, uncertainty quantification, and support estimation. In the most general setting, this problem is statistically intractable, so we restrict our attention to competing with sets from a concept class $C$ with bounded VC-dimension. An algorithm is competitive with class $C$ if, given samples from an arbitrary distribution $D$, it outputs in polynomial time a set that achieves $\delta$ coverage of $D$, and whose volume is competitive with the smallest set in $C$ with the required coverage $\delta$. This problem is computationally challenging even in the basic setting when $C$ is the set of all Euclidean balls. Existing algorithms based on coresets find in polynomial time a ball whose volume is $\exp(\tilde{O}(d/\log d))$-factor competitive with the volume of the best ball. Our main result is an algorithm that finds a confidence set whose volume is $\exp(\tilde{O}(d^{2/3}))$-factor competitive with the optimal ball having the desired coverage. The algorithm is improper (it outputs an ellipsoid). Combined with our computational intractability result for proper learning balls within an $\exp(\tilde{O}(d^{1-o(1)}))$ approximation factor in volume, our results provide an interesting separation between proper and (improper) learning of confidence sets.
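
As a concrete but much weaker baseline illustrating why ellipsoid outputs are natural in the improper setting (a hypothetical sketch with no volume-competitiveness guarantee, unlike the paper's algorithm): fit the sample mean and covariance, then inflate to the empirical $\delta$-quantile of Mahalanobis distances.

```python
import numpy as np

def coverage_ellipsoid(samples, delta):
    """Return (c, M, r) describing {x : (x-c)^T M^{-1} (x-c) <= r^2}."""
    c = samples.mean(axis=0)
    M = np.cov(samples, rowvar=False)
    Minv = np.linalg.inv(M)
    diffs = samples - c
    mahal = np.sqrt(np.einsum("ij,jk,ik->i", diffs, Minv, diffs))
    return c, M, np.quantile(mahal, delta)   # empirical delta-quantile radius

rng = np.random.default_rng(1)
X = rng.normal(size=(5000, 3)) @ np.diag([1.0, 2.0, 0.5])
c, M, r = coverage_ellipsoid(X, 0.9)
d2 = np.einsum("ij,jk,ik->i", X - c, np.linalg.inv(M), X - c)
print((d2 <= r * r).mean())   # ~0.9 in-sample coverage
```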

Edge-weighted balanced connected partitions: Hardness and formulations

from arXiv: Data Structures and Algorithms

Authors: Morteza Davari, Phablo F. S. Moura, Hande Yaman

The balanced connected $k$-partition problem (BCP) is a classic problem which consists in partitioning the set of vertices of a vertex-weighted connected graph into a collection of $k$ sets such that each of them induces a connected subgraph of roughly the same weight. There exists a vast literature on BCP that includes hardness results, approximation algorithms, integer programming formulations, and a polyhedral study. We investigate edge-weighted variants of BCP where we are given a connected graph $G$, $k \in \mathbb{Z}_{\ge 1}$, and an edge-weight function $w \colon E(G)\to\mathbb{Q}_{\ge 0}$, and the goal is to compute a spanning $k$-forest $\mathcal{T}$ of $G$ (i.e., a forest with exactly $k$ trees) that minimizes the weight of the heaviest tree in $\mathcal{T}$ in the min-max version, or maximizes the weight of the lightest tree in $\mathcal{T}$ in the max-min version. We show that both versions of this problem are $\mathsf{NP}$-hard on complete graphs with $k=2$, unweighted split graphs, and unweighted bipartite graphs with $k\geq 2$ fixed. Moreover, we prove that these problems do not admit subexponential-time algorithms, unless the Exponential-Time Hypothesis fails. Finally, we devise compact and non-compact integer linear programming formulations, valid inequalities, and separation algorithms.

Quantum singular value transformation without block encodings: Near-optimal complexity with minimal ancilla

from arXiv: Data Structures and Algorithms

Authors: Shantanav Chakraborty, Soumyabrata Hazra, Tongyang Li, Changpeng Shao, Xinzhao Wang, Yuxin Zhang

We develop new algorithms for Quantum Singular Value Transformation (QSVT), a unifying framework underlying a wide range of quantum algorithms. Existing implementations of QSVT rely on block encoding, incurring $O(\log L)$ ancilla overhead and circuit depth $\widetilde{O}(d\lambda L)$ for polynomial transformations of a Hamiltonian $H=\sum_{k=1}^L \lambda_k H_k$, where $d$ is the polynomial degree, and $\lambda=\sum_k |\lambda_k|$. We introduce a new approach that eliminates block encoding, needs only a single ancilla qubit, and maintains near-optimal complexity, using only basic Hamiltonian simulation methods such as Trotterization. Our method achieves a circuit depth of $\widetilde{O}(L(d\lambda_{\mathrm{comm}})^{1+o(1)})$, without any multi-qubit controlled gates. Here, $\lambda_{\mathrm{comm}}$ depends on the nested commutators of the $H_k$'s and can be much smaller than $\lambda$. Central to our technique is a novel use of Richardson extrapolation, enabling systematic error cancellation in interleaved sequences of arbitrary unitaries and Hamiltonian evolution operators, establishing a broadly applicable framework beyond QSVT. Additionally, we propose two randomized QSVT algorithms for cases with only sampling access to Hamiltonian terms. The first uses qDRIFT, while the second replaces block encodings in QSVT with randomly sampled unitaries. Both achieve quadratic complexity in $d$, which we establish as a lower bound for any randomized method implementing polynomial transformations in this model. Finally, as applications, we develop end-to-end quantum algorithms for quantum linear systems and ground state property estimation, achieving near-optimal complexity without oracular access. Our results provide a new framework for quantum algorithms, reducing hardware overhead while maintaining near-optimal performance, with implications for both near-term and fault-tolerant quantum computing.
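
Richardson extrapolation, the key tool named above, combines runs at several step sizes with signed coefficients so that the low-order error terms cancel. A generic scalar demonstration of the principle (only the extrapolation mechanics; nothing here reflects the paper's operator-level construction): estimate $f(0)$ from evaluations at $h, h/2, h/4, \dots$ assuming an error expansion $f(h) = f(0) + c_1 h + c_2 h^2 + \cdots$

```python
import numpy as np

def richardson(f, h, levels):
    """Neville-style Richardson extrapolation of f(h) toward f(0)."""
    T = [[f(h / 2 ** i)] for i in range(levels)]
    for j in range(1, levels):
        for i in range(levels - j):
            # This update cancels the remaining O(h^j) error term.
            T[i].append(T[i + 1][j - 1]
                        + (T[i + 1][j - 1] - T[i][j - 1]) / (2 ** j - 1))
    return T[0][-1]

f = lambda h: np.exp(h)   # f(0) = 1, with a full power-series error expansion
for levels in (1, 2, 3, 4):
    print(levels, abs(richardson(f, 0.5, levels) - 1.0))
# The error drops by roughly an order of magnitude or more per extra level.
```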

Interval Graphs are Reconstructible

from arXiv: Data Structures and Algorithms

Authors: Irene Heinrich, Masashi Kiyomi, Yota Otachi, Pascal Schweitzer

A graph is reconstructible if it is determined up to isomorphism by the multiset of its proper induced subgraphs. The reconstruction conjecture postulates that every graph of order at least 3 is reconstructible. We show that interval graphs with at least three vertices are reconstructible. For this purpose we develop a technique to handle separations in the context of reconstruction. This resolves a major roadblock to using graph structure theory in the context of reconstruction. To apply our novel technique, we also develop a resilient combinatorial structure theory for interval graphs. A consequence of our result is that interval graphs can be reconstructed in polynomial time.

On the twin-width of near-regular graphs

from arXiv: Data Structures and Algorithms

Authors: Irene Heinrich, Ferdinand Ihringer, Simon Raßmann, Lena Volk

Twin-width is a recently introduced graph parameter based on the repeated contraction of near-twins. It has shown remarkable utility in algorithmic and structural graph theory, as well as in finite model theory -- particularly since first-order model checking is fixed-parameter tractable when a witness certifying small twin-width is provided. However, the behavior of twin-width in specific graph classes, particularly cubic graphs, remains poorly understood. While cubic graphs are known to have unbounded twin-width, no explicit cubic graph of twin-width greater than 4 is known. This paper explores this phenomenon in regular and near-regular graph classes. We show that extremal graphs of bounded degree and high twin-width are asymmetric, partly explaining their elusiveness. Additionally, we establish bounds for circulant and $d$-degenerate graphs, and examine strongly regular graphs, which exhibit similar behavior to cubic graphs. Our results include determining the twin-width of Johnson graphs over 2-sets and of cyclic Latin square graphs.

Asymmetric graph alignment and the phase transition for asymmetric tree correlation testing

from arXiv: Data Structures and Algorithms

Authors: Jakob Maier, Laurent Massoulié

Graph alignment, identifying node correspondences between two graphs, is a fundamental problem with applications in network analysis, biology, and privacy research. While substantial progress has been made in aligning correlated Erd\H{o}s-R\'enyi graphs under symmetric settings, real-world networks often exhibit asymmetry in both node numbers and edge densities. In this work, we introduce a novel framework for asymmetric correlated Erd\H{o}s-R\'enyi graphs, generalizing existing models to account for these asymmetries. We conduct a rigorous theoretical analysis of graph alignment in the sparse regime, where local neighborhoods exhibit tree-like structures. Our approach leverages tree correlation testing as the central tool in our polynomial-time algorithm, MPAlign, which achieves one-sided partial alignment under certain conditions. A key contribution of our work is characterizing these conditions under which asymmetric tree correlation testing is feasible: If two correlated graphs $G$ and $G'$ have average degrees $\lambda s$ and $\lambda s'$ respectively, where $\lambda$ is their common density and $s,s'$ are marginal correlation parameters, their tree neighborhoods can be aligned if $ss' > \alpha$, where $\alpha$ denotes Otter's constant, provided $\lambda$ is large enough. The feasibility of this tree comparison problem undergoes a sharp phase transition since $ss' \leq \alpha$ implies its impossibility. These new results on tree correlation testing allow us to solve a class of random subgraph isomorphism problems, resolving an open problem in the field.

Efficient Computation of Hyper-triangles on Hypergraphs

from arXiv: Data Structures and Algorithms

Authors: Haozhe Yin, Kai Wang, Wenjie Zhang, Ying Zhang, Ruijia Wu, Xuemin Lin

Hypergraphs, which use hyperedges to capture groupwise interactions among different entities, have gained increasing attention recently for their versatility in effectively modeling real-world networks. In this paper, we study the problem of computing hyper-triangles (formed by three fully-connected hyperedges), which is a basic structural unit in hypergraphs. Although existing approaches can be adopted to compute hyper-triangles by exhaustively examining hyperedge combinations, they overlook the structural characteristics distinguishing different hyper-triangle patterns. Consequently, these approaches lack specificity in computing particular hyper-triangle patterns and exhibit low efficiency. In this paper, we unveil a new formation pathway for hyper-triangles, transitioning from hyperedges to hyperwedges before assembling into hyper-triangles, and classify hyper-triangle patterns based on hyperwedges. Leveraging this insight, we introduce a two-step framework to reduce the redundant checking of hyperedge combinations. Under this framework, we propose efficient algorithms for computing a specific pattern of hyper-triangles. Approximate algorithms are also devised to support estimated counting scenarios. Furthermore, we introduce a fine-grained hypergraph clustering coefficient measurement that can reflect diverse properties of hypergraphs based on different hyper-triangle patterns. Extensive experimental evaluations conducted on 11 real-world datasets validate the effectiveness and efficiency of our proposed techniques.
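
To make the hyperedge-to-hyperwedge-to-hyper-triangle pathway concrete, here is a naive enumeration following that two-step shape (a hypothetical brute-force baseline, reading "fully-connected" as pairwise intersecting, and without the pattern classification or pruning the paper develops):

```python
from itertools import combinations

def hyper_triangles(hyperedges):
    """Triples of pairwise-intersecting hyperedges, assembled via hyperwedges."""
    sets = [frozenset(e) for e in hyperedges]
    # Step 1: hyperwedges = pairs (i, j), i < j, of intersecting hyperedges.
    wedges = [(i, j) for i, j in combinations(range(len(sets)), 2)
              if sets[i] & sets[j]]
    # Step 2: close each hyperwedge with a third hyperedge meeting both.
    triangles = []
    for i, j in wedges:
        for k in range(j + 1, len(sets)):
            if sets[k] & sets[i] and sets[k] & sets[j]:
                triangles.append((i, j, k))
    return triangles

H = [{1, 2, 3}, {3, 4}, {2, 4, 5}, {6, 7}]
print(hyper_triangles(H))  # [(0, 1, 2)]
```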

Quantum Gibbs states are locally Markovian

from arXiv: Data Structures and Algorithms

Authors: Chi-Fang Chen, Cambyse Rouzé

The Markov property entails the conditional independence structure inherent in Gibbs distributions for general classical Hamiltonians, a feature that plays a crucial role in inference, mixing time analysis, and algorithm design. However, much less is known about quantum Gibbs states. In this work, we show that for any Hamiltonian with a bounded interaction degree, the quantum Gibbs state is locally Markov at arbitrary temperature, meaning there exists a quasi-local recovery map for every local region. Notably, this recovery map is obtained by applying a detailed-balanced Lindbladian with jumps acting on the region. Consequently, we prove that (i) the conditional mutual information (CMI) for a shielded small region decays exponentially with the shielding distance, and (ii) under the assumption of uniform clustering of correlations, Gibbs states of general non-commuting Hamiltonians on $D$-dimensional lattices can be prepared by a quantum circuit of depth $e^{O(\log^D(n/\epsilon))}$, which can be further reduced assuming a certain local gap condition. Our proofs introduce a regularization scheme for imaginary-time-evolved operators at arbitrarily low temperatures and reveal a connection between the Dirichlet form, a dynamic quantity, and the commutator in the KMS inner product, a static quantity. We believe these tools pave the way for tackling further challenges in quantum thermodynamics and mixing times, particularly in low-temperature regimes.

Thursday, April 03

In favor of the morally sane thing

from Scott Aaronson

The United States is now a country that disappears people.

Visa holders, green card holders, and even occasionally citizens mistaken for non-citizens: Trump’s goons can now seize them off the sidewalk at any time, handcuff them, detain them indefinitely in a cell in Louisiana with minimal access to lawyers, or even fly them to an overcrowded prison in El Salvador to be tortured.

It’s important to add: from what I know, some of the people being detained and deported are genuinely horrible. Some worked for organizations linked to Hamas, and cheered the murder of Jews. Some trafficked fentanyl. Some were violent gang members.

There are proper avenues to deport such people, in normal pre-Trumpian US law. For example, you can void someone’s visa by convincing a judge that they lied about not supporting terrorist organizations in their visa application.

But already other disappeared people seem to have been entirely innocent. Some apparently did nothing worse than write lefty op-eds or social media posts. Others had innocuous tattoos that were mistaken for gang insignia.

Millennia ago, civilization evolved mechanisms like courts and judges and laws and evidence and testimony, to help separate the guilty from the innocent. These are known problems with known solutions. No new ideas are needed.

One reader advised me not to blog about this issue unless I had something original to say: how could I possibly add to the New York Times’ and CNN’s daily coverage of every norm-shattering wrinkle? But other readers were livid at me for not blogging, even interpreting silence or delay as support for fascism.

For those readers, but more importantly for my kids and posterity, let me say: no one who follows this blog could ever accuse me of reflexive bleeding-heart wokery, much less of undue sympathy for “globalize the intifada” agitators. So with whatever credibility that grants me: Shtetl-Optimized unequivocally condemns the “grabbing random foreign students off the street” method of immigration enforcement. If there are resident aliens who merit deportation, prove it to a friggin’ judge (I’ll personally feel more confident that the law is being applied sanely if the judge wasn’t appointed by Trump). Prove that you got the right person, and that they did what you said, and that that violated the agreed-upon conditions of their residency according to some consistently-applied standard. And let the person contest the charges, with advice of counsel.

I don’t want to believe the most hyperbolic claims of my colleagues, that the US is now a full Soviet-style police state, or inevitably on its way to one. I beg any conservatives reading this post, particularly those with influence over events: help me not to believe this.

By Scott

Recovery Reductions in the Random Noise Model via Group Theory: Insights into NP-Complete and Fine-Grained Problems

from arXiv: Computational Complexity

Authors: Tejas Nareddy, Abhishek Mishra

We introduce and initiate the study of a new model of reductions called the random noise model. In this model, the truth table $T_f$ of the function $f$ is corrupted on a randomly chosen $\delta$-fraction of instances. A randomized algorithm $A$ is a $\left(t, \delta, 1-\varepsilon\right)$-recovery reduction for $f$ if: (1) with probability $1-\varepsilon$ over the choice of $\delta$-fraction corruptions, given access to the corrupted truth table, the algorithm $A$ computes $f(\phi)$ correctly with probability at least $2/3$ on every input $\phi$; and (2) the algorithm $A$ runs in time $O(t)$. We believe this model, which is a natural relaxation of average-case complexity, both has practical motivations and is mathematically interesting. Pointing towards this, we show the existence of robust deterministic polynomial-time recovery reductions with the highest tolerable noise level for many of the canonical NP-complete problems: SAT, kSAT, kCSP, CLIQUE, and more. Our recovery reductions are optimal for non-adaptive algorithms under complexity-theoretic assumptions. Notably, all our recovery reductions follow as corollaries of one black-box algorithm based on group theory and permutation group algorithms. This suggests that recovery reductions in the random noise model are important to the study of the structure of NP-completeness. Furthermore, we establish recovery reductions with optimal parameters for the Orthogonal Vectors and Parity $k$-Clique problems. These problems exhibit structural similarities to NP-complete problems, with Orthogonal Vectors admitting a $2^{0.5n}$-time reduction from kSAT on $n$ variables and Parity $k$-Clique, a subexponential-time reduction from 3SAT. This further highlights the relevance of our model to the study of NP-completeness.
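
As a toy instance of the model (our own simplification, not the paper's group-theoretic black box): for a symmetric function, every permutation of the input has the same answer, so one can query the corrupted truth table on random permutations of the input and take a majority vote.

    import random

    def recover_symmetric(x, corrupted_table, trials=101):
        # corrupted_table maps input tuples to 0/1 and is wrong on a random
        # delta-fraction of entries; since f is symmetric, all permutations
        # of x share the value f(x), so a majority vote over the orbit
        # succeeds with high probability (assuming the orbit is large).
        x = list(x)
        votes = 0
        for _ in range(trials):
            y = x[:]
            random.shuffle(y)
            votes += corrupted_table[tuple(y)]
        return votes * 2 > trials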

Lower Bounds for Leader Election and Collective Coin Flipping, Revisited

from arXiv: Computational Complexity

Authors: Eshan Chattopadhyay, Mohit Gurumukhani, Noam Ringach, Rocco Servedio

We study the tasks of collective coin flipping and leader election in the full-information model. We prove new lower bounds for coin flipping protocols, implying lower bounds for leader election protocols. We show that any $k$-round coin flipping protocol, where each of $\ell$ players sends 1 bit per round, can be biased by $O(\ell/\log^{(k)}(\ell))$ bad players. For all $k>1$ this strengthens previous lower bounds [RSZ, SICOMP 2002], which ruled out protocols resilient to adversaries controlling $O(\ell/\log^{(2k-1)}(\ell))$ players. Consequently, we establish that any protocol tolerating a linear fraction of corrupt players, with only 1 bit per round, must run for at least $\log^*\ell-O(1)$ rounds, improving on the prior best lower bound of $\frac12 \log^*\ell-\log^*\log^*\ell$. This lower bound matches the number of rounds, $\log^*\ell$, taken by the current best coin flipping protocols from [RZ, JCSS 2001], [F, FOCS 1999] that can handle a linear-sized coalition of bad players, but with players sending unlimited bits per round. We also derive lower bounds for protocols allowing multi-bit messages per round. Our results show that the protocols from [RZ, JCSS 2001], [F, FOCS 1999] that handle a linear number of corrupt players are almost optimal in terms of round complexity and communication per player in a round. A key technical ingredient in proving our lower bounds is a new result regarding biasing most functions from a family of functions using a common set of bad players and a small specialized set of bad players specific to each function that is biased. We give improved constant-round coin flipping protocols in the setting that each player can send 1 bit per round. For two rounds, our protocol can handle a coalition of $O(\ell/((\log\ell)(\log\log\ell)^2))$ bad players, better than the best one-round protocol by [AL, Combinatorica 1993] in this setting.
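
For orientation, the simplest protocol in this one-bit-per-round setting is single-round majority (a sketch of the model, not of the paper's protocols):

    def majority_coin_flip(bits):
        # Each of the ell players broadcasts one bit; the outcome is the
        # majority. In the full-information model, a coalition of roughly
        # sqrt(ell) players who speak after seeing the honest bits can
        # already bias the outcome, which motivates more refined protocols.
        return int(sum(bits) * 2 > len(bits))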

Epistemic Skills: Reasoning about Knowledge and Oblivion

from arXiv: Computational Complexity

Authors: Xiaolong Liang, Yì N. Wáng

This paper presents a class of epistemic logics that captures the dynamics of acquiring knowledge and descending into oblivion, while incorporating concepts of group knowledge. The approach is grounded in a system of weighted models, introducing an ``epistemic skills'' metric to represent the epistemic capacities tied to knowledge updates. Within this framework, knowledge acquisition is modeled as a process of upskilling, whereas oblivion is represented as a consequence of downskilling. The framework further enables exploration of ``knowability'' and ``forgettability,'' defined as the potential to gain knowledge through upskilling and to lapse into oblivion through downskilling, respectively. Additionally, it supports a detailed analysis of the distinctions between epistemic de re and de dicto expressions. The computational complexity of the model checking and satisfiability problems is examined, offering insights into their theoretical foundations and practical implications.

Dichotomies for \#CSP on graphs that forbid a clique as a minor

from arXiv: Computational Complexity

Authors: Boning Meng, Yicheng Pan

We prove complexity dichotomies for \#CSP problems (not necessarily symmetric) with Boolean domain and complex range on several typical minor-closed graph classes. These dichotomies give a complete characterization of the complexity of \#CSP on graph classes that forbid a complete graph as a minor. In particular, we demonstrate that whether the maximum degree of vertices is bounded may influence the complexity on specific minor-closed graph classes, a phenomenon that has not been observed in previous related studies. Furthermore, our proofs integrate the properties of each graph class with techniques from counting complexity, and develop a systematic approach for analyzing the complexity of \#CSP on these graph classes.

Confidence Bands for Multiparameter Persistence Landscapes

from arXiv: Computational Geometry

Authors: Inés García-Redondo, Anthea Monod, Qiquan Wang

Multiparameter persistent homology is a generalization of classical persistent homology, a central and widely-used methodology from topological data analysis, which takes into account density estimation and is an effective tool for data analysis in the presence of noise. Similar to its classical single-parameter counterpart, however, it is challenging to compute and use in practice due to its complex algebraic construction. In this paper, we study a popular and tractable invariant for multiparameter persistent homology in a statistical setting: the multiparameter persistence landscape. We derive a functional central limit theorem for multiparameter persistence landscapes, from which we compute confidence bands, giving rise to one of the first statistical inference methodologies for multiparameter persistence landscapes. We provide an implementation of confidence bands and demonstrate their application in a machine learning task on synthetic data.
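
The shape of such a band can be sketched with a bootstrap stand-in (our simplification; the paper instead derives its bands from a functional central limit theorem):

    import numpy as np

    def landscape_confidence_band(landscapes, level=0.95, n_boot=1000, seed=0):
        # landscapes: array of shape (n, ...) holding n sampled persistence
        # landscapes evaluated on a common grid. Bootstrap the sup-deviation
        # of the mean to get a uniform band around it.
        rng = np.random.default_rng(seed)
        L = np.asarray(landscapes, dtype=float)
        n = L.shape[0]
        mean = L.mean(axis=0)
        sups = [np.abs(L[rng.integers(0, n, size=n)].mean(axis=0) - mean).max()
                for _ in range(n_boot)]
        r = np.quantile(sups, level)
        return mean - r, mean + r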

Distributed Triangle Detection is Hard in Few Rounds

from arXiv: Data Structures and Algorithms

Authors: Sepehr Assadi, Janani Sundaresan

In the distributed triangle detection problem, we have an $n$-vertex network $G=(V,E)$ with one player for each vertex of the graph who sees the edges incident on the vertex. The players communicate in synchronous rounds using the edges of this network and have a limited bandwidth of $O(\log{n})$ bits over each edge. The goal is to detect whether or not $G$ contains a triangle as a subgraph in a minimal number of rounds. We prove that any protocol (deterministic or randomized) for distributed triangle detection requires $\Omega(\log\log{n})$ rounds of communication. Prior to our work, only one-round lower bounds were known for this problem. The primary technique for proving these types of distributed lower bounds is via reductions from two-party communication complexity. However, it has been known for a while that this approach is provably incapable of establishing any meaningful lower bounds for distributed triangle detection. Our main technical contribution is a new information theoretic argument which combines recent advances on multi-pass graph streaming lower bounds with the point-to-point communication aspects of distributed models, and can be of independent interest.

Shared-Memory Hierarchical Process Mapping

from arXiv: Data Structures and Algorithms

Authors: Christian Schulz, Henning Woydt

Modern large-scale scientific applications consist of thousands to millions of individual tasks. These tasks involve not only computation but also communication with one another. Typically, the communication pattern between tasks is sparse and can be determined in advance. Such applications are executed on supercomputers, which are often organized in a hierarchical hardware topology, consisting of islands, racks, nodes, and processors, where processing elements reside. To ensure efficient workload distribution, tasks must be allocated to processing elements in a way that ensures balanced utilization. However, this approach optimizes only the workload, not the communication cost of the application. It is straightforward to see that placing groups of tasks that frequently exchange large amounts of data on processing elements located near each other is beneficial. The problem of mapping tasks to processing elements considering optimization goals is called process mapping. In this work, we focus on minimizing communication cost while evenly distributing work. We present the first shared-memory algorithm that utilizes hierarchical multisection to partition the communication model across processing elements. Our parallel approach achieves the best solution on 95 percent of instances while also being marginally faster than the next best algorithm. Even in a serial setting, it delivers the best solution quality while also outperforming previous serial algorithms in speed.
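
A skeletal version of hierarchical multisection (our sketch; `partition` stands in for a real graph partitioner that minimizes cut communication):

    from dataclasses import dataclass, field

    @dataclass
    class Node:
        id: int
        children: list = field(default_factory=list)

    def hierarchical_map(tasks, node, partition):
        # At each hierarchy level (island -> rack -> node -> processor),
        # split the tasks into one block per child, then recurse; leaves
        # are processing elements.
        if not node.children:
            return {t: node.id for t in tasks}
        mapping = {}
        for block, child in zip(partition(tasks, len(node.children)),
                                node.children):
            mapping.update(hierarchical_map(block, child, partition))
        return mapping

    # Toy partitioner: round-robin split (a real one minimizes cut size).
    chunk = lambda ts, k: [ts[i::k] for i in range(k)]
    pes = [Node(i) for i in range(4)]
    root = Node(-1, [Node(-2, pes[:2]), Node(-3, pes[2:])])
    print(hierarchical_map(list(range(8)), root, chunk))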

Local Computation Algorithms for Knapsack: impossibility results, and how to avoid them

from arXiv: Data Structures and Algorithms

Authors: Clément L. Canonne, Yun Li, Seeun William Umboh

Local Computation Algorithms (LCAs), as introduced by Rubinfeld, Tamir, Vardi, and Xie (2011), are a class of ultra-efficient algorithms which, given access to a (large) input for a given computational task, are required to provide fast query access to a consistent output solution, without maintaining state between queries. This paradigm of computation in particular allows for hugely distributed algorithms, where independent instances of a given LCA provide consistent access to a common output solution. The past decade has seen a significant amount of work on LCAs, by and large focusing on graph problems. In this paper, we initiate the study of Local Computation Algorithms for perhaps the archetypal combinatorial optimization problem, Knapsack. We first establish strong impossibility results, ruling out the existence of any non-trivial LCA for Knapsack as well as several of its relaxations. We then show how equipping the LCA with additional access to the Knapsack instance, namely, weighted item sampling, allows one to circumvent these impossibility results and obtain LCAs with sublinear time and query complexity. Our positive result draws on a connection to the recent notion of reproducibility for learning algorithms (Impagliazzo, Lei, Pitassi, and Sorrell, 2022), a connection we believe to be of independent interest for the design of LCAs.

Generalized Assignment and Knapsack Problems in the Random-Order Model

from arXiv: Data Structures and Algorithms

Authors: Max Klimm, Martin Knaack

We study different online optimization problems in the random-order model. There is a finite set of bins with known capacity and a finite set of items arriving in a random order. Upon arrival of an item, its size and its value for each of the bins is revealed and it has to be decided immediately and irrevocably to which bin the item is assigned, or to not assign the item at all. In this setting, an algorithm is $\alpha$-competitive if the total value of all items assigned to the bins is at least an $\alpha$-fraction of the total value of an optimal assignment that knows all items beforehand. We give an algorithm that is $\alpha$-competitive with $\alpha = (1-\ln(2))/2 \approx 1/6.52$ improving upon the previous best algorithm with $\alpha \approx 1/6.99$ for the generalized assignment problem and the previous best algorithm with $\alpha \approx 1/6.65$ for the integral knapsack problem. We then study the fractional knapsack problem where we have a single bin and it is also allowed to pack items fractionally. For that case, we obtain an algorithm that is $\alpha$-competitive with $\alpha = 1/e \approx 1/2.71$ improving on the previous best algorithm with $\alpha = 1/4.39$. We further show that this competitive ratio is the best-possible for deterministic algorithms in this model.
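
A schematic sample-then-threshold strategy conveys the random-order flavor (our illustration only; the paper's algorithms and their analysis differ in the details):

    import math

    def sample_then_threshold(items, capacity):
        # items: (value, size) pairs with positive sizes, in random order.
        # Observe the first ~n/e items to learn a value-density threshold,
        # then fractionally pack later items whose density beats it.
        n = len(items)
        m = max(1, int(n / math.e))
        sample = sorted(items[:m], key=lambda it: it[0] / it[1], reverse=True)
        filled, threshold = 0.0, 0.0
        for v, s in sample:
            filled += s
            threshold = v / s
            if filled >= capacity:
                break
        value, remaining = 0.0, capacity
        for v, s in items[m:]:
            if remaining <= 0:
                break
            if v / s >= threshold:
                take = min(s, remaining)
                value += v * take / s
                remaining -= take
        return value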

Diameter Shortcut Sets on Temporal Graphs

from arXiv: Data Structures and Algorithms

Authors: Gerome Quantmeyer

Shortcut sets are a vital instrument for reducing the diameter of a static graph and, consequently, its shortest path complexity, which is relevant in numerous subfields of graph theory. We explore the notion of shortcut sets in temporal graphs, which incorporate a discrete time model into the graph, rendering each edge accessible exclusively at specific points in time. This not only alters the underlying assumptions of regular graphs but also substantially increases the complexity of path problems and reachability. In turn, a temporal graph is often a much more realistic and accurate representation of a real-world network. In this thesis we provide a definition for a shortcut set in a temporal graph and explore differences to classic shortcut sets. Utilizing this definition, we show that temporal and regular shortcut sets yield the same results on temporal paths, enabling the application of existing construction algorithms for static shortcut sets on paths. The primary contribution of this thesis is a translation approach for general temporal graphs that utilizes the static expansion of a temporal graph, allowing the conversion of static shortcut sets into temporal shortcut sets, yielding similar results.

Computing Time-varying Network Reliability using Binary Decision Diagrams

from arXiv: Data Structures and Algorithms

Authors: Yu Nakahata, Shun Arizono, Shoji Kasahara

Computing the reliability of a time-varying network, taking into account its dynamic nature, is crucial for networks that change over time, such as space networks, vehicular ad-hoc networks, and drone networks. These networks are modeled using temporal graphs, in which each edge is labeled with a time indicating its existence at a specific point in time. The time-varying network reliability is defined as the probability that a data packet from the source vertex can reach the terminal vertex, following links with increasing time labels (i.e., a journey), while taking into account the possibility of network link failures. Currently, the existing method for calculating this reliability involves explicitly enumerating all possible journeys between the source and terminal vertices and then calculating the reliability using the sum of disjoint products method. However, this method has high computational complexity. In contrast, there is an efficient algorithm that uses binary decision diagrams (BDDs) to evaluate the reliability of a network whose topology does not change over time. This paper presents an efficient exact algorithm that utilizes BDDs for computing the time-varying network reliability. Experimental results show that the proposed method runs up to four orders of magnitude faster than the existing method.
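
For concreteness, here is the exponential-time baseline that the BDD construction is designed to beat (a sketch; journeys here use strictly increasing time labels):

    def has_journey(edges, s, t):
        # edges are (u, v, time) triples. One pass over the time-sorted
        # edges computes earliest arrival times along journeys.
        best = {s: -1}
        for u, v, tm in sorted(edges, key=lambda e: e[2]):
            if u in best and best[u] < tm and tm < best.get(v, float("inf")):
                best[v] = tm
        return t in best

    def reliability(edges, s, t, p):
        # Exact reliability by summing over all 2^|E| failure patterns,
        # each edge alive independently with probability p -- the
        # exponential enumeration the BDD method avoids.
        edges = list(edges)
        total = 0.0
        for mask in range(1 << len(edges)):
            alive = [e for i, e in enumerate(edges) if mask >> i & 1]
            if has_journey(alive, s, t):
                k = bin(mask).count("1")
                total += p ** k * (1 - p) ** (len(edges) - k)
        return total

    print(reliability([(0, 1, 1), (1, 2, 2)], 0, 2, 0.9))  # 0.81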

SplineSketch: Even More Accurate Quantiles with Error Guarantees

from arXiv: Data Structures and Algorithms

Authors: Aleksander Łukasiewicz, Jakub Tětek, Pavel Veselý

Space-efficient estimation of quantiles in massive datasets is a fundamental problem with numerous applications in data monitoring and analysis. While theoretical research led to optimal algorithms, such as the Greenwald-Khanna algorithm or the KLL sketch, practitioners often use other sketches that perform significantly better in practice but lack theoretical guarantees. Most notably, the widely used t-digest has unbounded worst-case error. In this paper, we seek to get the best of both worlds. We present a new quantile summary, SplineSketch, for numeric data, offering near-optimal theoretical guarantees and outperforming t-digest by a factor of 2-20 on a range of synthetic and real-world datasets with non-skewed frequency distributions. To achieve such performance, we develop a novel approach that maintains a dynamic subdivision of the input range into buckets while fitting the input distribution using monotone cubic spline interpolation. The core challenge is implementing this method in a space-efficient manner while ensuring strong worst-case guarantees.
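
The interpolation step is easy to illustrate (a sketch of the key idea under our own simplifications; it omits the dynamic bucket maintenance and the worst-case error analysis):

    import numpy as np
    from scipy.interpolate import PchipInterpolator

    def quantile_from_buckets(boundaries, counts, q):
        # boundaries: len(counts)+1 increasing bucket edges; counts: items
        # per bucket (assumed non-empty here; empty buckets would be merged).
        # Fit a monotone cubic (PCHIP) quantile function and evaluate it.
        cdf = np.concatenate([[0.0], np.cumsum(counts)]) / np.sum(counts)
        inverse_cdf = PchipInterpolator(cdf, boundaries)
        return float(inverse_cdf(q))

    print(quantile_from_buckets([0.0, 1.0, 2.0, 4.0], [10, 30, 60], 0.5))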

LimTDD: A Compact Decision Diagram Integrating Tensor and Local Invertible Map Representations

from arXiv: Data Structures and Algorithms

Authors: Xin Hong, Aochu Dai, Dingchao Gao, Sanjiang Li, Zhengfeng Ji, Mingsheng Ying

Tensor Decision Diagrams (TDDs) provide an efficient structure for representing tensors by combining techniques from both tensor networks and decision diagrams, demonstrating competitive performance in quantum circuit simulation and verification. However, existing decision diagrams, including TDDs, fail to exploit isomorphisms within tensors, limiting their compression efficiency. This paper introduces Local Invertible Map Tensor Decision Diagrams (LimTDDs), an extension of TDD that integrates local invertible maps (LIMs) to achieve more compact representations. Unlike LIMDD, which applies Pauli operators to quantum states, LimTDD generalizes this approach using the XP-stabilizer group, enabling broader applicability. We develop efficient algorithms for normalization and key tensor operations, including slicing, addition, and contraction, essential for quantum circuit simulation and verification. Theoretical analysis shows that LimTDD surpasses TDD in compactness while maintaining its generality and offers exponential advantages over both TDD and LIMDD in the best-case scenarios. Experimental results validate these improvements, demonstrating LimTDD's superior efficiency in quantum circuit simulation and functionality computation.

Wednesday, April 02

Improved Round-by-round Soundness IOPs via Reed-Muller Codes

from arXiv: Computational Complexity

Authors: Dor Minzer, Kai Zhe Zheng

We give an IOPP (interactive oracle proof of proximity) for trivariate Reed-Muller codes that achieves the best known query complexity in some range of security parameters. Specifically, for degree $d$ and security parameter $\lambda\leq \frac{\log^2 d}{\log\log d}$, our IOPP has $2^{-\lambda}$ round-by-round soundness, $O(\lambda)$ queries, $O(\log\log d)$ rounds and $O(d)$ length. This improves upon the FRI [Ben-Sasson, Bentov, Horesh, Riabzev, ICALP 2018] and the STIR [Arnon, Chiesa, Fenzi, Yogev, Crypto 2024] IOPPs for Reed-Solomon codes, that have larger query and round complexity standing at $O(\lambda \log d)$ and $O(\log d+\lambda\log\log d)$ respectively. We use our IOPP to give an IOP for the NP-complete language Rank-1-Constraint-Satisfaction with the same parameters. Our construction is based on the line versus point test in the low-soundness regime. Compared to the axis parallel test (which is used in all prior works), the general affine lines test has improved soundness, which is the main source of our improved soundness. Using this test involves several complications, most significantly that projection to affine lines does not preserve individual degrees, and we show how to overcome these difficulties. En route, we extend some existing machinery to more general settings. Specifically, we give proximity generators for Reed-Muller codes, show a more systematic way of handling ``side conditions'' in IOP constructions, and generalize the compiling procedure of [Arnon, Chiesa, Fenzi, Yogev, Crypto 2024] to general codes.

SAT problem and Limit of Solomonoff's inductive reasoning theory

from arXiv: Computational Complexity

Authors: Feng Pan

This paper explores the Boolean Satisfiability Problem (SAT) in the context of Kolmogorov complexity theory. We present three versions of the distinguishability problem (for Boolean formulas, Turing machines, and quantum systems), each focused on distinguishing between two Bernoulli distributions induced by these computational models. A reduction is provided that establishes the equivalence between the Boolean formula version of the program output statistical prediction problem and the #SAT problem. Furthermore, we apply Solomonoff's inductive reasoning theory, revealing its limitations: the only "algorithm" capable of determining the output of any shortest program is the program itself, and any other algorithms are computationally indistinguishable from a universal computer, based on the coding theorem. The quantum version of this problem introduces a unique algorithm based on statistical distance and distinguishability, reflecting a fundamental limit in quantum mechanics. Finally, the potential equivalence of Kolmogorov complexity between circuit models and Turing machines may have significant implications for the NP vs P problem. We also investigate the nature of short programs corresponding to exponentially long bit sequences that can be compressed, revealing that these programs inherently contain loops that grow exponentially.

Strongly sublinear separators and bounded asymptotic dimension for sphere intersection graphs

from arXiv: Computational Geometry

Authors: James Davies, Agelos Georgakopoulos, Meike Hatzel, Rose McCarty

In this paper, we consider the class $\mathcal{C}^d$ of sphere intersection graphs in $\mathbb{R}^d$ for $d \geq 2$. We show that for each integer $t$, the class of all graphs in $\mathcal{C}^d$ that exclude $K_{t,t}$ as a subgraph has strongly sublinear separators. We also prove that $\mathcal{C}^d$ has asymptotic dimension at most $2d+2$.

Crossing number inequalities for curves on surfaces

from arXiv: Computational Geometry

Authors: Alfredo Hubard, Hugo Parlier

We prove that, as $m$ grows, any family of $m$ homotopically distinct closed curves on a surface induces a number of crossings that grows at least like $(m \log m)^2$. We use this to answer two questions of Pach, Tardos and Toth related to crossing numbers of drawings of multigraphs where edges are required to be non-homotopic. Furthermore, we generalize these results, obtaining effective bounds with optimal growth rates on every orientable surface.

Co-design Optimization of Moving Parts for Compliance and Collision Avoidance

from arXiv: Computational Geometry

Authors: Amir M. Mirzendehdel, Morad Behandish

Design requirements for moving parts in mechanical assemblies are typically specified in terms of interactions with other parts. Some are purely kinematic (e.g., pairwise collision avoidance) while others depend on physics and material properties (e.g., deformation under loads). Kinematic design methods and physics-based shape/topology optimization (SO/TO) deal separately with these requirements. They rarely talk to each other as the former uses set algebra and group theory while the latter requires discretizing and solving differential equations. Hence, optimizing a moving part based on physics typically relies on either neglecting or pruning kinematic constraints in advance, e.g., by restricting the design domain to a collision-free space using an unsweep operation. In this paper, we show that TO can be used to co-design two or more parts in relative motion to simultaneously satisfy physics-based criteria and collision avoidance. We restrict our attention to maximizing linear-elastic stiffness while penalizing collision measures aggregated in time. We couple the TO loops for two parts in relative motion so that the evolution of each part's shape is accounted for when penalizing collision for the other part. The collision measures are computed by a correlation functional that can be discretized by left- and right-multiplying the shape design variables by a pre-computed matrix that depends solely on the motion. This decoupling is key to making the computations scalable for TO iterations. We demonstrate the effectiveness of the approach with 2D and 3D examples.
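
The decoupling described above reduces, in the discretized setting, to a bilinear form (our rendering of that sentence, with hypothetical per-voxel design variables):

    import numpy as np

    def collision_measure(x1, x2, M):
        # x1, x2: density design variables of the two parts; M: precomputed
        # once from the relative motion alone. The time-aggregated collision
        # measure is x1^T M x2, and its gradients M @ x2 and M.T @ x1 are
        # cheap to evaluate inside every topology-optimization iteration.
        return float(x1 @ M @ x2)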

Linked Array Tree: A Constant-Time Search Structure for Big Data

from arXiv: Data Structures and Algorithms

Authors: Songpeng Liu

As data volumes continue to grow rapidly, traditional search algorithms, like the red-black tree and B+ Tree, face increasing challenges in performance, especially in big data scenarios with intensive storage access. This paper presents the Linked Array Tree (LAT), a novel data structure designed to achieve constant-time complexity for search, insertion, and deletion operations. LAT leverages a sparse, non-moving hierarchical layout that enables direct access paths without requiring rebalancing or data movement. Its low memory overhead and avoidance of pointer-heavy structures make it well-suited for large-scale and intensive workloads. While not specifically tested under parallel or concurrent conditions, the structure's static layout and non-interfering operations suggest potential advantages in such environments. This paper first introduces the structure and algorithms of LAT, followed by a detailed analysis of its time complexity in search, insertion, and deletion operations. Finally, it presents experimental results across both data-intensive and sparse usage scenarios to evaluate LAT's practical performance.
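
To see the kind of layout that yields constant-time access without rebalancing or data movement, here is a minimal two-level direct-access sketch (our toy, far simpler than LAT itself; keys are non-negative integers):

    class TwoLevelArray:
        LEAF = 1024  # leaf capacity; keys split into (directory, offset)

        def __init__(self):
            self.top = {}  # sparse directory: leaf index -> leaf array

        def _slot(self, key):
            return key // self.LEAF, key % self.LEAF

        def set(self, key, value):
            hi, lo = self._slot(key)
            leaf = self.top.setdefault(hi, [None] * self.LEAF)
            leaf[lo] = value  # direct write: no rebalancing, no moves

        def get(self, key):
            hi, lo = self._slot(key)
            leaf = self.top.get(hi)
            return leaf[lo] if leaf else None

        def delete(self, key):
            self.set(key, None)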

Expander Pruning with Polylogarithmic Worst-Case Recourse and Update Time

from arXiv: Data Structures and Algorithms

Authors: Simon Meierhans, Maximilian Probst Gutenberg, Thatchaphol Saranurak

Expander graphs are known to be robust to edge deletions in the following sense: for any online sequence of edge deletions $e_1, e_2, \ldots, e_k$ to an $m$-edge graph $G$ that is initially a $\phi$-expander, the algorithm can grow a set $P \subseteq V$ such that at any time $t$, $G[V \setminus P]$ is an expander of the same quality as the initial graph $G$ up to a constant factor and the set $P$ has volume at most $O(t/\phi)$. However, currently, there is no algorithm to grow $P$ with low worst-case recourse that achieves any non-trivial guarantee. In this work, we present an algorithm with near-optimal guarantees: it grows $P$ by only $\tilde{O}(1/\phi^2)$ vertices per time step and ensures that $G[V \setminus P]$ remains an $\tilde{\Omega}(\phi)$-expander at any time. Even more excitingly, our algorithm is extremely efficient: it can process each update in near-optimal worst-case update time $\tilde{O}(1/\phi^2)$. This affirmatively answers the main open question posed in [SW19] whether such an algorithm exists. By combining our results with recent techniques in [BvdBPG+22], we obtain the first adaptive algorithms to maintain spanners, cut and spectral sparsifiers with $\tilde{O}(n)$ edges and polylogarithmic approximation guarantees, worst-case update time and recourse. More generally, we believe that worst-case pruning is an essential tool for obtaining worst-case guarantees in dynamic graph algorithms and online algorithms.

How to Protect Yourself from Threatening Skeletons: Optimal Padded Decompositions for Minor-Free Graphs

from arXiv: Data Structures and Algorithms

Authors: Jonathan Conroy, Arnold Filtser

Roughly speaking, a metric space has padding parameter $\beta$ if for every $\Delta>0$ there is a stochastic decomposition of the metric points into clusters of diameter at most $\Delta$ such that every ball of radius $\gamma\Delta$ is contained in a single cluster with probability at least $e^{-\gamma\beta}$. The padding parameter is an important characteristic of a metric space, with vast algorithmic implications. In this paper we prove that the shortest-path metric of every $K_r$-minor-free graph has padding parameter $O(\log r)$, which is tight. This resolves a long-standing open question and exponentially improves the previous bound. En route to our main result, we construct sparse covers for $K_r$-minor-free graphs with improved parameters, and we prove a general reduction from sparse covers to padded decompositions.
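
In symbols (our restatement of the prose definition above, writing $\mathcal{P}(x)$ for the cluster of a sampled partition $\mathcal{P}$ containing $x$): a metric space $(X,d)$ has padding parameter $\beta$ if
\[
\forall \Delta>0\ \exists\,\mathcal{D}\ \text{over partitions of }X\text{ into clusters of diameter}\le\Delta:\quad
\Pr_{\mathcal{P}\sim\mathcal{D}}\big[B(x,\gamma\Delta)\subseteq\mathcal{P}(x)\big]\ \ge\ e^{-\beta\gamma}
\quad\text{for all }x\in X,\ \gamma>0.
\]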

Tuesday, April 01

ICE-TCS seminar by Benjamin Moore on "Smoothed analysis for graph isomorphism"

from Luca Aceto

Today, the ICE-TCS seminar series at Reykjavik University hosted a talk by Benjamin Moore (Institute of Science and Technology Austria), who is visiting our postdoctoral researcher Nicolaos Matsakis.

Benjamin presented the main results in his paper "Smoothed analysis for graph isomorphism", coauthored with his ISTA colleagues Michael Anastos and Matthew Kwan. (In passing, I just saw that Matthew Kwan received the main prize of the Austrian Mathematical Society last year. Congratulations!) 

To my mind, Benjamin did an excellent job in presenting the context for their exciting (but very technical) contribution and the main ideas that underlie it. Kudos! The work by Benjamin and his collaborators provides another explanation of the effectiveness of the colour refinement algorithm (also known as the one-dimensional Weisfeiler-Leman algorithm) in checking whether two graphs are isomorphic. I encourage you to read at least the introduction of their paper, which will be presented at STOC 2025, and the ISTA news article here, which does a much better job of putting their work in context than an interested, but ignorant, observer like me ever could. FWIW, I find results like theirs, which offer some explanation as to why theoretically hard problems are seemingly easy in practice, fascinating, and I feel that paper might be a strong candidate for a best paper award.

It was also fitting to see recent work on smoothed analysis being presented at our seminar series since Daniel Spielman and Shang-Hua Teng received the 2008 Gödel Prize at ICALP 2008, which was held at Reykjavik University. Time flies, but great work is timeless. 


By Luca Aceto

PDQ Shor (?-2025)

from Computational Complexity

PDQ Shor

PDQ Shor, Peter Shor's smarter brother, passed away last week. PDQ was a Physicist/Computer Scientist/Mathematician/Astrologer/Psychic at the University of Northern South Dakota in Wakpala.

Dr. Phineas Dominic Quincy Shor III, PhD, MBA, BLT, received his education at Europa U. during one of his many alien abductions. He ended up in South Dakota after having fled every other state.

He was most famous for the concept of unnatural proofs, collected in his anthology Proofs from the Other Book, which includes his classic "interpretive dance proof" of the Pythagorean theorem. Critics complain the proof only handles the case where the right angle is on the left.

His follow-up book, Proofs from the Crypt, contains his masterpiece, a 1237-page proof that conclusively shows that the empty set contains no irrational numbers.

Like his brother, he moved into the quantum space, reverse engineering Peter's work by giving a doubly exponential time quantum algorithm for multiplying numbers. He created the innovative dipping bird quantum error collection machine that constantly monitors a quantum machine, collapsing all entanglement. Apple bought the device for $327 million, which immediately destroyed their plans for a QiPhone.

PDQ used the proceeds to create the perpetual Turing machine, guaranteed to never halt. Until it did.

Sadly PDQ passed away from paranormal causes last week. Or was it last year? No one is quite sure. He donated his body to pseudoscience, currently lying in state in an undisclosed state. We hardly knew you.

With apologies to Peter Schickele. This April Fools post was inspired by the complexity class PDQMA.

By Lance Fortnow

Lifting for Arbitrary Gadgets

from arXiv: Computational Complexity

Authors: Siddharth Iyer

We prove a sensitivity-to-communication lifting theorem for arbitrary gadgets. Given functions $f: \{0,1\}^n\to \{0,1\}$ and $g : \mathcal X\times \mathcal Y\to \{0,1\}$, denote $f\circ g(x,y) := f(g(x_1,y_1),\ldots,g(x_n,y_n))$. We show that for any $f$ with sensitivity $s$ and any $g$, \[D(f\circ g) \geq s\cdot \bigg(\frac{\Omega(D(g))}{\log\mathsf{rk}(g)} - \log\mathsf{rk}(g)\bigg),\] where $D(\cdot)$ denotes the deterministic communication complexity and $\mathsf{rk}(g)$ is the rank of the matrix associated with $g$. As a corollary, we get that if $D(g)$ is a sufficiently large constant, $D(f\circ g) = \Omega(\min\{s,d\}\cdot \sqrt{D(g)})$, where $s$ and $d$ denote the sensitivity and degree of $f$. In particular, computing the OR of $n$ copies of $g$ requires $\Omega(n\cdot\sqrt{D(g)})$ bits.

$\mathsf{P}$-completeness of Graph Local Complementation

from arXiv: Computational Complexity

Authors: Pablo Concha-Vega

Local complementation of a graph $G$ at a vertex $v$ is an operation that results in a new graph $G*v$, in which the neighborhood of $v$ is complemented. This operation has been widely studied in graph theory and quantum computing. This article introduces the Local Complementation Problem, a decision problem that captures the complexity of applying a sequence of local complementations. Given a graph $G$, a sequence of vertices $s$, and a pair of vertices $u,v$, the problem asks whether the edge $(u,v)$ is present in the graph obtained after applying the local complementations specified by $s$. The main contribution of this work is a proof that this problem is $\mathsf{P}$-complete, implying that computing a sequence of local complementations is unlikely to be efficiently parallelizable. The proof is based on a reduction from the Circuit Value Problem, a well-known $\mathsf{P}$-complete problem, by simulating circuits through local complementations. Additionally, the complexity of the problem is analyzed under different restrictions. In particular, it is shown that for complete graphs and star graphs the problem belongs to $\mathsf{LOGSPACE}$. Finally, it is conjectured that the problem remains $\mathsf{P}$-complete for the class of circle graphs.
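
Since the operation itself is elementary, a short sketch may help. The following Python toggles every edge between distinct neighbours of $v$, which is exactly one local complementation step, and then answers the edge query after a sequence $s$ (function and variable names are ours, not the paper's; this naive sequential simulation is consistent with the $\mathsf{P}$-completeness result, which suggests no fast parallel shortcut exists).

# One local complementation step G -> G*v on an adjacency matrix:
# complement the subgraph induced on the neighbourhood of v.

def local_complement(adj, v):
    """adj: symmetric 0/1 matrix (list of lists); v: vertex index."""
    nbrs = [u for u in range(len(adj)) if adj[v][u]]
    for i in range(len(nbrs)):
        for j in range(i + 1, len(nbrs)):
            a, b = nbrs[i], nbrs[j]
            adj[a][b] ^= 1   # toggle edge between two neighbours of v
            adj[b][a] ^= 1
    return adj

# The decision problem: apply the sequence s, then test edge (u, v).
def lcp(adj, s, u, v):
    for w in s:
        local_complement(adj, w)
    return bool(adj[u][v])

# Tiny example: path 0-1-2; complementing at 1 creates the edge (0, 2).
P3 = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
assert lcp(P3, [1], 0, 2) is True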

Simple general magnification of circuit lower bounds

from arXiv: Computational Complexity

Authors: Albert Atserias, Moritz Müller

We construct so-called distinguishers: sparse matrices that retain some properties of error-correcting codes. They provide a technically and conceptually simple approach to magnification. We generalize and strengthen known general (i.e., not problem-specific) magnification results, and in particular achieve magnification thresholds below known lower bounds. For example, we show that fixed-polynomial formula-size lower bounds for NP are implied by slightly superlinear formula-size lower bounds for approximating any sufficiently sparse problem in NP. We also show that the thresholds achieved are sharp. Additionally, our approach yields a uniform magnification result for the minimum circuit size problem. This seems to sidestep the localization barrier.

On the Quantum Chromatic Gap

from arXiv: Computational Complexity

Authors: Lorenzo Ciardo

The largest known gap between the quantum and classical chromatic numbers of graphs, obtained via quantum protocols for colouring Hadamard graphs based on the Deutsch--Jozsa algorithm and the quantum Fourier transform, is exponential. We put forth a quantum pseudo-telepathy version of Khot's $d$-to-$1$ Games Conjecture and prove that, conditional on its validity, the gap is unbounded: there exist graphs whose quantum chromatic number is $3$ and whose classical chromatic number is arbitrarily large. Furthermore, we show that the existence of a certain form of pseudo-telepathic XOR games would imply the conjecture and, thus, the unboundedness of the quantum chromatic gap. As two technical steps of our proof that might be of independent interest, we establish a quantum adjunction theorem for Pultr functors between categories of relational structures, and we prove that the Dinur--Khot--Kindler--Minzer--Safra reduction, recently used for proving the $2$-to-$2$ Games Theorem, is quantum complete.

Classical Simulation of Quantum CSP Strategies

from arXiv: Computational Complexity

Authors: Demian Banakh, Lorenzo Ciardo, Marcin Kozik, Jan Tułowiecki

We prove that any perfect quantum strategy for the two-prover game encoding a constraint satisfaction problem (CSP) can be simulated via a perfect classical strategy with an extra classical communication channel, whose size depends only on $(i)$ the size of the shared quantum system used in the quantum strategy, and $(ii)$ structural parameters of the CSP template. The result is obtained via a combinatorial characterisation of perfect classical strategies with extra communication channels and a geometric rounding procedure for the projection-valued measurements involved in quantum strategies. A key intermediate step of our proof is to establish that the gap between the classical chromatic number of graphs and its quantum variant is bounded when the quantum strategy involves shared quantum information of bounded size.

On the difficulty of order constrained pattern matching with applications to feature matching based malware detection

from arXiv: Computational Complexity

Authors: Adiesha Liyanage, Braeden Sopp, Binhai Zhu

We formulate low-level malware detection via feature matching as Order-based Malware Detection with Critical Instructions (General-OMDCI): given a pattern in the form of a sequence $M$ of colored blocks, where each block contains a critical character (representing a unique sequence of critical instructions potentially associated with malware, but without certainty), and a program $A$, represented as a sequence of $n$ colored blocks with critical characters, the goal is to find two subsequences, $M'$ of $M$ and $A'$ of $A$, with blocks matching in color and whose critical characters form a permutation of each other. When $M$ is a permutation in both colors and critical characters, the problem is called OMDCI. If we additionally require $M'=M$, then the problem is called OMDCI+; if in this case $d=|M|$ is used as a parameter, then OMDCI+ is easily shown to be FPT. Our main (negative) results concern the case when $|M|$ is arbitrary and are summarized as follows: OMDCI+ is NP-complete, which implies that OMDCI is also NP-complete. For the special case of OMDCI, deciding whether the optimal solution has length $0$ (i.e., deciding whether no part of $M$ appears in $A$) is co-NP-hard. As a result, OMDCI does not admit an FPT algorithm unless P=co-NP. In summary, our results imply that using feature-matching algorithms either to identify malware or to certify its absence in a given low-level program is hard.
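
To make the matching condition concrete, here is a brute-force Python sketch (all names and the encoding are our assumptions): blocks are (color, critical character) pairs, and we search for equal-length subsequences that agree in color position by position and whose critical characters are permutations of each other. The exponential running time is consistent with the hardness results above.

# Brute-force check of the General-OMDCI matching condition for a
# fixed target length k. Exponential in k; purely illustrative.

from itertools import combinations
from collections import Counter

def omdci_exists(M, A, k):
    """Is there a matching pair of subsequences of length k?"""
    for mi in combinations(range(len(M)), k):
        for ai in combinations(range(len(A)), k):
            Msub = [M[i] for i in mi]
            Asub = [A[j] for j in ai]
            colors_match = all(m[0] == a[0] for m, a in zip(Msub, Asub))
            chars_permute = (Counter(m[1] for m in Msub)
                             == Counter(a[1] for a in Asub))
            if colors_match and chars_permute:
                return True
    return False

# Tiny example: pattern of two blocks, program of three blocks.
M = [("red", "x"), ("blue", "y")]
A = [("red", "y"), ("green", "z"), ("blue", "x")]
assert omdci_exists(M, A, 2)  # take A[0], A[2]: colors red, blue; chars {x, y}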
