Last Update

OPML feed of all feeds.

Subscribe to the Atom feed or RSS feed to stay up to date.

Thank you to arXiv for use of its open access interoperability.

Note: the date of arXiv entries announced right after publication holidays might incorrectly show up as the date of the publication holiday itself. This is due to our ad hoc method of inferring announcement dates, which are not returned by the arXiv API.

Powered by Pluto.

Source on GitHub.

Maintained by Nima Anari, Arnab Bhattacharyya, Gautam Kamath.

Theory of Computing Report

Friday, July 19

Liquid Amortization: Proving Amortized Complexity with LiquidHaskell (Functional Pearl)

from arXiv: Computational Complexity

Authors: Jan van Brügge

Formal reasoning about the time complexity of algorithms and data structures is usually done in interactive theorem provers like Isabelle/HOL. This includes reasoning about amortized time complexity which looks at the worst case performance over a series of operations. However, most programs are not written within a theorem prover and thus use the data structures of the production language. To verify the correctness it is necessary to translate the data structures from the production language into the language of the prover. Such a translation step could introduce errors, for example due to a mismatch in features between the two languages. We show how to prove amortized complexity of data structures directly in Haskell using LiquidHaskell. Besides skipping the translation step, our approach can also provide a didactic advantage. Learners do not have to learn an additional language for proofs and can focus on the new concepts only. For this paper, we do not assume prior knowledge of amortized complexity as we explain the concepts and apply them in our first case study, a simple stack with multipop. Moving to more complicated (and useful) data structures, we show that the same technique works for binomial heaps which can be used to implement a priority queue. We also prove amortized complexity bounds for Claessen's version of the finger tree, a sequence-like data structure with constant-time cons/uncons on either end. Finally we discuss the current limitations of LiquidHaskell that made certain versions of the data structures not feasible.

Computing the second and third systoles of a combinatorial surface

from arXiv: Computational Geometry

Authors: Matthijs Ebbens, Francis Lazarus

Given a weighted, undirected graph $G$ cellularly embedded on a topological surface $S$, we describe algorithms to compute the second shortest and third shortest closed walks of $G$ that are homotopically non-trivial in $S$. Our algorithms run in $O(n^2\log n)$ time for the second shortest walk and in $O(n^3)$ time for the third shortest walk. We also show how to reduce the running time for the second shortest homotopically non-trivial closed walk to $O(n\log n)$ when both the genus and the number of boundaries are fixed. Our algorithms rely on a careful analysis of the configurations of the first three shortest homotopically non-trivial curves in $S$. As an intermediate step, we also describe how to compute a shortest essential arc between \emph{one} pair of vertices or between \emph{all} pairs of vertices of a given boundary component of $S$ in $O(n^2)$ time or $O(n^3)$ time, respectively.

On Finding the Closest Zonotope to a Polytope in Hausdorff Distance

from arXiv: Computational Geometry

Authors: George D. Torres

We provide a local theory for the optimization of the Hausdorff distance between a polytope and a zonotope. To do this, we compute explicit local formulae for the Hausdorff function $d(P, -) : Z_n \to \mathbb{R}$, where $P$ is a fixed polytope and $Z_n$ is the space of rank $n$ zonotopes. This local theory is then used to provide an optimization algorithm based on subgradient descent that converges to critical points of $d(P, -)$. We also express the condition of being at a local minimum as a polyhedral feasibility condition.

The Madness of Multiple Entries in March Madness

from arXiv: Data Structures and Algorithms

Authors: Jeff Decary, David Bergman, Carlos Cardonha, Jason Imbrogno, Andrea Lodi

This paper explores multi-entry strategies for betting pools related to single-elimination tournaments. In such betting pools, participants select winners of games, and their respective score is a weighted sum of the number of correct selections. Most betting pools have a top-heavy payoff structure, so the paper focuses on strategies that maximize the expected score of the best-performing entry. There is no known closed-formula expression for the estimation of this metric, so the paper investigates the challenges associated with the estimation and the optimization of multi-entry solutions. We present an exact dynamic programming approach for calculating the maximum expected score of any given fixed solution, which is exponential in the number of entries. We explore the structural properties of the problem to develop several solution techniques. In particular, by extracting insights from the solutions produced by one of our algorithms, we design a simple yet effective problem-specific heuristic that was the best-performing technique in our experiments, which were based on real-world data extracted from recent March Madness tournaments. In particular, our results show that the best 100-entry solution identified by our heuristic had a 2.2% likelihood of winning a $1 million prize in a real-world betting pool.

Rényi-infinity constrained sampling with $d^3$ membership queries

from arXiv: Data Structures and Algorithms

Authors: Yunbum Kook, Matthew S. Zhang

Uniform sampling over a convex body is a fundamental algorithmic problem, yet the convergence in KL or R\'enyi divergence of most samplers remains poorly understood. In this work, we propose a constrained proximal sampler, a principled and simple algorithm that possesses elegant convergence guarantees. Leveraging the uniform ergodicity of this sampler, we show that it converges in the R\'enyi-infinity divergence ($\mathcal R_\infty$) with no query complexity overhead when starting from a warm start. This is the strongest of commonly considered performance metrics, implying rates in $\{\mathcal R_q, \mathsf{KL}\}$ convergence as special cases. By applying this sampler within an annealing scheme, we propose an algorithm which can approximately sample $\varepsilon$-close to the uniform distribution on convex bodies in $\mathcal R_\infty$-divergence with $\widetilde{\mathcal{O}}(d^3\, \text{polylog} \frac{1}{\varepsilon})$ query complexity. This improves on all prior results in $\{\mathcal R_q, \mathsf{KL}\}$-divergences, without resorting to any algorithmic modifications or post-processing of the sample. It also matches the prior best known complexity in total variation distance.

Thursday, July 18

Remembering Luca Trevisan (1971–2024)

from Simons Institute Blog

The theory community mourns the loss of Luca Trevisan. We hope that this page will serve as an enduring memorial to a brilliant scientist and expositor, and beloved friend and colleague. Please share your memories, roasts, and toasts in the …

By 956284

The Story of Shor's Algorithm

from Computational Complexity

The quantum factoring algorithm of Peter Shor (FOCS 1994, SIAM Review 1999) turns thirty this year. Before his algorithm, quantum computing lacked the killer app, something practical that quantum could do that seems hard for classical computers. Back in 1994, I said Shor's algorithm bought quantum computing another twenty years. How I misjudged the longevity of quantum hype. 

Peter got the idea for his algorithm from a paper by Daniel Simon solving a theoretical complexity problem. The quantum factoring algorithm is a great example of how a complexity result can open doors to new algorithmic ideas.

Simon came up with a beautifully simple example of a problem that requires exponential time on a probabilistic machine but only polynomial time on a quantum computer. Let's define addition over \(n\)-bit strings: for \(x\) and \(y\) in \(\{0,1\}^n\), \(x+y\) is the bitwise parity (XOR) of \(x\) and \(y\). For example, if \(x\) is 0110 and \(y\) is 1100, then \(x+y = 1010\).

Suppose we have a Boolean function \(f:\{0,1\}^n\rightarrow\{0,1\}^n\) (mapping \(n\) bits to \(n\) bits) with the property that \(f(x)=f(y)\) iff \(x=y+z\) for some fixed \(z\). The problem: given \(f\) as an oracle or a circuit, find \(z\). A classical machine would need a number of steps exponential in \(n\) to find \(z\) in the worst case.

Simon gave a simple quantum algorithm that, with a single query, outputs a random \(w\) such that \(w\cdot z=0\). With \(n-1\) linearly independent \(w\)'s, you can solve for \(z\).
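
To make that classical post-processing concrete, here is a small sketch (my own illustration, not code from the post) of recovering \(z\) by Gaussian elimination over GF(2) from the measured \(w\)'s; the function name and bit-list representation are just for the example.

```python
# Sketch: recover z from Simon-style measurement outcomes w with w . z = 0 (mod 2).
# Bit strings are lists of 0/1 of length n; names here are illustrative only.

def solve_for_z(ws, n):
    """Given n-1 linearly independent w's orthogonal to z over GF(2),
    return a nonzero z in their common null space by Gaussian elimination."""
    rows = [list(w) for w in ws]
    pivots = {}  # column index -> row index of its pivot
    r = 0
    for col in range(n):
        for i in range(r, len(rows)):          # find a pivot row for this column
            if rows[i][col] == 1:
                rows[r], rows[i] = rows[i], rows[r]
                break
        else:
            continue                           # no pivot: this is a free column
        for i in range(len(rows)):             # eliminate the column elsewhere (mod 2)
            if i != r and rows[i][col] == 1:
                rows[i] = [a ^ b for a, b in zip(rows[i], rows[r])]
        pivots[col] = r
        r += 1
    free_cols = [c for c in range(n) if c not in pivots]
    z = [0] * n
    z[free_cols[0]] = 1                        # set one free variable to 1
    for col, row in pivots.items():            # back-substitute the pivot variables
        z[col] = sum(rows[row][c] * z[c] for c in free_cols) % 2
    return z

# Toy run: n = 3 and hidden z = 110; both 001 and 111 are orthogonal to z.
print(solve_for_z([[0, 0, 1], [1, 1, 1]], 3))  # -> [1, 1, 0]
```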

Shor asked what if we could do the same for regular integer addition instead of bitwise parity. Suppose you have a function \(f\) with \(f(x)=f(y)\) iff \(x-y\) is a multiple of \(z\) for a fixed \(z\). (In Simon's case over bits the only multiples are zero and one.) That means \(f\) is periodic and \(z\) is the period. Shor knew, from an algorithm of Miller, that finding the period leads to factoring.

Let \(m\) be an odd number with multiple prime factors. Consider \(f(x)=a^x\bmod m\) for a randomly chosen \(a\) relatively prime to \(m\). If this function has period \(z\), then \(a^z\bmod m=1\), and with probability at least one-half, \(z\) is even and the gcd of \(a^{z/2}-1\) and \(m\) is a nontrivial factor of \(m\). 
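
As a toy illustration of this classical reduction (again my own sketch, not from the post), the following brute-forces the period and extracts a factor via the gcd, using \(m=21\), the number mentioned at the end of the post.

```python
from math import gcd

# Toy sketch of the reduction from period finding to factoring.
# The quantum computer's job is to find the period; here we brute-force it.

def period(a, m):
    """Smallest z > 0 with a**z = 1 (mod m), assuming gcd(a, m) = 1."""
    z, x = 1, a % m
    while x != 1:
        x = (x * a) % m
        z += 1
    return z

def factor_from_period(a, m):
    z = period(a, m)
    if z % 2 == 1:
        return None                      # odd period: pick another a
    y = pow(a, z // 2, m)
    if y == m - 1:
        return None                      # a^(z/2) = -1 (mod m): pick another a
    return gcd(y - 1, m), gcd(y + 1, m)  # both gcds are nontrivial factors of m

# Example: m = 21, a = 2 has period 6; 2^3 = 8, gcd(7, 21) = 7 and gcd(9, 21) = 3.
print(factor_from_period(2, 21))         # -> (7, 3)
```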

Getting all this to work on a quantum computer requires a number of additional tricks beyond what Simon did, but once Shor had the inspiration, the rest followed. 

Peter Shor really understood the landscape of theory from complexity to cryptography, had a curiosity about quantum computing, and had the vision to see how it all connected together, yielding the quantum algorithm that almost single-handedly brought billions of dollars to the field. 

Peter just received the Shannon Award for his work on quantum error correction, which would help enable quantum computers to run his algorithm. Still, the largest number present-day quantum computers can factor with the algorithm is 21. If (and it's a big if) that number gets up past the RSA challenge numbers, Peter will have far larger prizes in his future.

By Lance Fortnow

A polynomial-time classical algorithm for noisy quantum circuits

from arXiv: Computational Complexity

Authors: Thomas Schuster, Chao Yin, Xun Gao, Norman Y. Yao

We provide a polynomial-time classical algorithm for noisy quantum circuits. The algorithm computes the expectation value of any observable for any circuit, with a small average error over input states drawn from an ensemble (e.g. the computational basis). Our approach is based upon the intuition that noise exponentially damps non-local correlations relative to local correlations. This enables one to classically simulate a noisy quantum circuit by only keeping track of the dynamics of local quantum information. Our algorithm also enables sampling from the output distribution of a circuit in quasi-polynomial time, so long as the distribution anti-concentrates. A number of practical implications are discussed, including a fundamental limit on the efficacy of noise mitigation strategies: any quantum circuit for which error mitigation is efficient must be classically simulable.

Quasi-Linear Size PCPs with Small Soundness from HDX

from arXiv: Computational Complexity

Authors: Mitali Bafna, Dor Minzer, Nikhil Vyas

We construct 2-query, quasi-linear sized probabilistically checkable proofs (PCPs) with arbitrarily small constant soundness, improving upon Dinur's 2-query quasi-linear size PCPs with soundness $1-\Omega(1)$. As an immediate corollary, we get that under the exponential time hypothesis, for all $\epsilon >0$ no approximation algorithm for $3$-SAT can obtain an approximation ratio of $7/8+\epsilon$ in time $2^{n/\log^C n}$, where $C$ is a constant depending on $\epsilon$. Our result builds on a recent line of works showing the existence of linear sized direct product testers with small soundness by independent works of Bafna, Lifshitz, and Minzer, and of Dikstein, Dinur, and Lubotzky. The main new ingredient in our proof is a technique that embeds a given PCP construction into a PCP on a prescribed graph, provided that the latter is a graph underlying a sufficiently good high-dimensional expander. Towards this end, we use ideas from fault-tolerant distributed computing, and more precisely from the literature of the almost everywhere agreement problem starting with the work of Dwork, Peleg, Pippenger, and Upfal (1986). We show that graphs underlying HDXs admit routing protocols that are tolerant to adversarial edge corruptions, and in doing so we also improve the state of the art in this line of work. Our PCP construction requires variants of the aforementioned direct product testers with poly-logarithmic degree. The existence and constructability of these variants is shown in an appendix by Zhiwei Yun.

On the Complexity of Identification in Linear Structural Causal Models

from arXiv: Computational Complexity

Authors: Julian Dörfler, Benito van der Zander, Markus Bläser, Maciej Liskiewicz

Learning the unknown causal parameters of a linear structural causal model is a fundamental task in causal analysis. The task, known as the problem of identification, asks to estimate the parameters of the model from a combination of assumptions on the graphical structure of the model and observational data, represented as a non-causal covariance matrix. In this paper, we give a new sound and complete algorithm for generic identification which runs in polynomial space. By standard simulation results, this algorithm has exponential running time, which vastly improves the state-of-the-art double exponential time method using a Gr\"obner basis approach. The paper also presents evidence that parameter identification is computationally hard in general. In particular, we prove that the task asking whether, for a given feasible correlation matrix, there are exactly one or two or more parameter sets explaining the observed matrix, is hard for $\forall R$, the co-class of the existential theory of the reals. In particular, this problem is $coNP$-hard. To the best of our knowledge, this is the first hardness result for some notion of identifiability.

Geometric and computational hardness of bilevel programming

from arXiv: Computational Complexity

Authors: Jérôme Bolte, Quoc-Tung Le, Edouard Pauwels, Samuel Vaiter

We first show a simple but striking result in bilevel optimization: unconstrained $C^\infty$ smooth bilevel programming is as hard as general extended-real-valued lower semicontinuous minimization. We then proceed to a worst-case analysis of box-constrained bilevel polynomial optimization. We show in particular that any extended-real-valued semi-algebraic function, possibly non-continuous, can be expressed as the value function of a polynomial bilevel program. Secondly, from a computational complexity perspective, the decision version of polynomial bilevel programming is one level above NP in the polynomial hierarchy ($\Sigma^p_2$-hard). Both types of difficulties are uncommon in non-linear programs for which objective functions are typically continuous and belong to the class NP. These results highlight the irremediable hardness attached to general bilevel optimization and the necessity of imposing some form of regularity on the lower level.

Pseudorandomness, symmetry, smoothing: II

from arXiv: Computational Complexity

Authors: Harm Derksen, Peter Ivanov, Chin Ho Lee, Emanuele Viola

We prove several new results on the Hamming weight of bounded uniform and small-bias distributions. We exhibit bounded-uniform distributions whose weight is anti-concentrated, matching existing concentration inequalities. This construction relies on a recent result in approximation theory due to Erd\'elyi (Acta Arithmetica 2016). In particular, we match the classical tail bounds, generalizing a result by Bun and Steinke (RANDOM 2015). Also, we improve on a construction by Benjamini, Gurel-Gurevich, and Peled (2012). We give a generic transformation that converts any bounded uniform distribution to a small-bias distribution that almost preserves its weight distribution. Applying this transformation in conjunction with the above results and others, we construct small-bias distributions with various weight restrictions. In particular, we match the concentration that follows from that of bounded uniformity and the generic closeness of small-bias and bounded-uniform distributions, answering a question by Bun and Steinke (RANDOM 2015). Moreover, these distributions are supported on only a constant number of Hamming weights. We further extend the anti-concentration constructions to small-bias distributions perturbed with noise, a class that has received much attention recently in derandomization. Our results imply (but are not implied by) a recent result of the authors (CCC 2024), and are based on different techniques. In particular, we prove that the standard Gaussian distribution is far from any mixture of Gaussians with bounded variance.

A Practical Solver for Scalar Data Topological Simplification

from arXiv: Computational Geometry

Authors: Mohamed Kissi, Mathieu Pont, Joshua A. Levine, Julien Tierny

This paper presents a practical approach for the optimization of topological simplification, a central pre-processing step for the analysis and visualization of scalar data. Given an input scalar field f and a set of "signal" persistence pairs to maintain, our approach produces an output field g that is close to f and which optimizes (i) the cancellation of "non-signal" pairs, while (ii) preserving the "signal" pairs. In contrast to pre-existing simplification algorithms, our approach is not restricted to persistence pairs involving extrema and can thus address a larger class of topological features, in particular saddle pairs in three-dimensional scalar data. Our approach leverages recent generic persistence optimization frameworks and extends them with tailored accelerations specific to the problem of topological simplification. Extensive experiments report substantial accelerations over these frameworks, thereby making topological simplification optimization practical for real-life datasets. Our approach enables a direct visualization and analysis of the topologically simplified data, e.g., via isosurfaces of simplified topology (fewer components and handles). We apply our approach to the extraction of prominent filament structures in three-dimensional data. Specifically, we show that our pre-simplification of the data leads to practical improvements over standard topological techniques for removing filament loops. We also show how our approach can be used to repair genus defects in surface processing. Finally, we provide a C++ implementation for reproducibility purposes.

Sampling with a Black Box: Faster Parameterized Approximation Algorithms for Vertex Deletion Problems

from arXiv: Data Structures and Algorithms

Authors: Barış Can Esmer, Ariel Kulik

In this paper we introduce Sampling with a Black Box, a generic technique for the design of parameterized approximation algorithms for vertex deletion problems (e.g., Vertex Cover, Feedback Vertex Set, etc.). The technique relies on two components: $\bullet$ A Sampling Step. A polynomial time randomized algorithm which given a graph $G$ returns a random vertex $v$ such that the optimum of $G\setminus \{v\}$ is smaller by $1$ than the optimum of $G$ with some prescribed probability $q$. We show such algorithms exist for multiple vertex deletion problems. $\bullet$ A Black Box algorithm which is either an exact parameterized algorithm or a polynomial time approximation algorithm. Our technique combines these two components together. The sampling step is applied iteratively to remove vertices from the input graph, and then the solution is extended using the black box algorithm. The process is repeated sufficiently many times so that the target approximation ratio is attained with a constant probability. The main novelty of our work lies in the analysis of the framework and the optimization of the parameters it uses. We use the technique to derive parameterized approximation algorithms for several vertex deletion problems, including Feedback Vertex Set, $d$-Hitting Set and $\ell$-Path Vertex Cover. In particular, for every approximation ratio $1<\beta<2$, we attain a parameterized $\beta$-approximation for Feedback Vertex Set which is faster than the parameterized $\beta$-approximation of [Jana, Lokshtanov, Mandal, Rai and Saurabh, MFCS 23']. Furthermore, our algorithms are always faster than the algorithms attained using Fidelity Preserving Transformations [Fellows, Kulik, Rosamond, and Shachnai, JCSS 18'].
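
The abstract describes the framework at a high enough level to sketch its skeleton. The following is a rough paraphrase in code, not the authors' implementation; sample_step, black_box, num_removals, and repetitions are placeholders whose correct choices are exactly what the paper analyzes.

```python
# Rough sketch of the two-component framework from the abstract (not the
# authors' code): iteratively remove sampled vertices, extend with a black-box
# algorithm, and repeat so the target ratio is hit with constant probability.
# `graph` is assumed to expose copy() and remove_node(), e.g. a networkx.Graph.

def one_run(graph, budget, sample_step, black_box, num_removals):
    g = graph.copy()
    removed = []
    for _ in range(num_removals):              # the Sampling Step, applied iteratively
        v = sample_step(g)                      # random vertex, hits the optimum w.p. >= q
        removed.append(v)
        g.remove_node(v)
    rest = black_box(g, budget - len(removed))  # exact FPT or poly-time approximation;
    return None if rest is None else removed + rest  # None signals "no solution found"

def sampling_with_a_black_box(graph, budget, sample_step, black_box,
                              num_removals, repetitions):
    best = None
    for _ in range(repetitions):                # repeat for constant success probability
        sol = one_run(graph, budget, sample_step, black_box, num_removals)
        if sol is not None and (best is None or len(sol) < len(best)):
            best = sol
    return best
```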

Exact Graph Matching in Correlated Gaussian-Attributed Erdős-Rényi Model

from arXiv: Data Structures and Algorithms

Authors: Joonhyuk Yang, Hye Won Chung

The graph matching problem aims to identify node correspondence between two or more correlated graphs. Previous studies have primarily focused on models where only edge information is provided. However, in many social networks, not only the relationships between users, represented by edges, but also their personal information, represented by features, are present. In this paper, we address the challenge of identifying node correspondence in correlated graphs, where additional node features exist, as in many real-world settings. We propose a two-step procedure, where we initially match a subset of nodes only using edge information, and then match the remaining nodes using node features. We derive information-theoretic limits for exact graph matching on this model. Our approach provides a comprehensive solution to the real-world graph matching problem by providing systematic ways to utilize both edge and node information for exact matching of the graphs.

Engineering Fully Dynamic Exact $Δ$-Orientation Algorithms

from arXiv: Data Structures and Algorithms

Authors: Ernestine Großmann, Henrik Reinstädtler, Christian Schulz, Fabian Walliser

A (fully) dynamic graph algorithm is a data structure that supports edge insertions, edge deletions, and answers specific queries pertinent to the problem at hand. In this work, we address the fully dynamic edge orientation problem, also known as the fully dynamic $\Delta$-orientation problem. The objective is to maintain an orientation of the edges in an undirected graph such that the out-degree of any vertex remains low. When edges are inserted or deleted, it may be necessary to reorient some edges to prevent vertices from having excessively high out-degrees. In this paper, we introduce the first algorithm that maintains an optimal edge orientation during both insertions and deletions. In experiments comparing with recent nearly exact algorithms, we achieve a 32% lower running time. The update time of our algorithm is up to 6 orders of magnitude faster than static exact algorithms.

A Unified Model of Congestion Games with Priorities: Two-Sided Markets with Ties, Finite and Non-Affine Delay Functions, and Pure Nash Equilibria

from arXiv: Data Structures and Algorithms

Authors: Kenjiro Takazawa

The study of equilibrium concepts in congestion games and two-sided markets with ties has been a primary topic in game theory, economics, and computer science. Ackermann, Goldberg, Mirrokni, R\"oglin, V\"ocking (2008) gave a common generalization of these two models, in which a player more prioritized by a resource produces an infinite delay on less prioritized players. While presenting several theorems on pure Nash equilibria in this model, Ackermann et al.\ posed an open problem of how to design a model in which more prioritized players produce a large but finite delay on less prioritized players. In this paper, we present a positive solution to this open problem by combining the model of Ackermann et al.\ with a generalized model of congestion games due to Bil\`o and Vinci (2023). In the model of Bil\`o and Vinci, the more prioritized players produce a finite delay on the less prioritized players, while the delay functions are of a specific kind of affine function, and all resources have the same priorities. By unifying these two models, we achieve a model in which the delay functions may be finite and non-affine, and the priorities of the resources may be distinct. We prove some positive results on the existence and computability of pure Nash equilibria in our model, which extend those for the previous models and support the validity of our model.

Optimal Padded Decomposition For Bounded Treewidth Graphs

from arXiv: Data Structures and Algorithms

Authors: Arnold Filtser, Tobias Friedrich, Davis Issac, Nikhil Kumar, Hung Le, Nadym Mallek, Ziena Zeif

A $(\beta,\delta,\Delta)$-padded decomposition of an edge-weighted graph $G = (V,E,w)$ is a stochastic decomposition into clusters of diameter at most $\Delta$ such that for every vertex $v\in V$, the probability that $\rm{ball}_G(v,\gamma\Delta)$ is entirely contained in the cluster containing $v$ is at least $e^{-\beta\gamma}$ for every $\gamma \in [0,\delta]$. Padded decompositions have been studied for decades and have found numerous applications, including metric embedding, multicommodity flow-cut gap, multicut, and zero extension problems, to name a few. In these applications, parameter $\beta$, called the padding parameter, is the most important parameter since it decides either the distortion or the approximation ratios. For general graphs with $n$ vertices, $\beta = \Theta(\log n)$. Klein, Plotkin, and Rao showed that $K_r$-minor-free graphs have padding parameter $\beta = O(r^3)$, which is a significant improvement over general graphs when $r$ is a constant. A long-standing conjecture is to construct a padded decomposition for $K_r$-minor-free graphs with padding parameter $\beta = O(\log r)$. Despite decades of research, the best-known result is $\beta = O(r)$, even for graphs with treewidth at most $r$. In this work, we make significant progress toward the aforementioned conjecture by showing that graphs with treewidth $\rm{tw}$ admit a padded decomposition with padding parameter $O(\log \rm{tw})$, which is tight. As corollaries, we obtain an exponential improvement in dependency on treewidth in a host of algorithmic applications: $O(\sqrt{ \log n \cdot \log(\rm{tw})})$ flow-cut gap, max flow-min multicut ratio of $O(\log(\rm{tw}))$, an $O(\log(\rm{tw}))$ approximation for the 0-extension problem, an $\ell^{O(\log n)}_\infty$ embedding with distortion $O(\log \rm{tw})$, and an $O(\log \rm{tw})$ bound for integrality gap for the uniform sparsest cut.

Optimal Distance Labeling for Permutation Graphs

from arXiv: Data Structures and Algorithms

Authors: Paweł Gawrychowski, Wojciech Janczewski

A permutation graph is the intersection graph of a set of segments between two parallel lines. In other words, they are defined by a permutation $\pi$ on $n$ elements, such that $u$ and $v$ are adjacent if and only if $u<v$ and $\pi(u)>\pi(v)$. We consider the problem of computing the distances in such a graph in the setting of informative labeling schemes. The goal of such a scheme is to assign a short bitstring $\ell(u)$ to every vertex $u$, such that the distance between $u$ and $v$ can be computed using only $\ell(u)$ and $\ell(v)$, and no further knowledge about the whole graph (other than that it is a permutation graph). This elegantly captures the intuition that we would like our data structure to be distributed, and often leads to interesting combinatorial challenges while trying to obtain lower and upper bounds that match up to the lower-order terms. For distance labeling of permutation graphs on $n$ vertices, Katz, Katz, and Peleg [STACS 2000] showed how to construct labels consisting of $\mathcal{O}(\log^{2} n)$ bits. Later, Bazzaro and Gavoille [Discret. Math. 309(11)] obtained asymptotically optimal bounds by showing how to construct labels consisting of $9\log{n}+\mathcal{O}(1)$ bits, and proving that $3\log{n}-\mathcal{O}(\log{\log{n}})$ bits are necessary. This, however, leaves quite a large gap between the known lower and upper bounds. We close this gap by showing how to construct labels consisting of $3\log{n}+\mathcal{O}(\log\log n)$ bits.

Wednesday, July 17

Clinical versus Statistical Prediction (III)

from Ben Recht

Meehl's Philosophical Psychology, Lecture 10, part 3/3.

This post digs into Lecture 10 of Paul Meehl’s course “Philosophical Psychology.” Technically speaking, this lecture starts at minute 74 of Lecture 9. The video for Lecture 10 is here. Here’s the full table of contents of my blogging through the class.

The earliest study Meehl finds demonstrating the superiority of statistical judgment asked whether “scientific methods” could be applied to parole. In the 1920s, sociologist Ernest Burgess worked with the Illinois Parole Board to determine the factors that contributed to recidivism and whether it was possible to predict whether a parolee would commit further crimes after release. 

Burgess assembled 21 predictive factors, including age, the type of offense, whether a person was a repeat offender, and whether the person had held a job before. He then constructed a sophisticated AI tool for predicting parole: he scored each factor either 0 or 1 and then added them all up. Of the 68 men with at least 16 positive factors, only one ever committed a crime again. Of the 25 men with fewer than five positive factors, 19 recidivated.
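
To make the method concrete, here is a minimal sketch of a Burgess-style unit-weight score; the factor names are illustrative placeholders, not Burgess's actual 21 items.

```python
# Unit-weight scoring in the style of Burgess: each factor is coded 0 or 1
# (1 = favorable) and the score is simply their sum. The factor names below
# are illustrative placeholders, not the actual items from the 1928 report.

def burgess_score(factors):
    """factors: dict mapping factor name -> 0 or 1."""
    return sum(factors.values())

parolee = {
    "first_offender": 1,
    "held_steady_job": 1,
    "offense_not_violent": 0,
    # ... in the study, 21 binary factors in total
}

score = burgess_score(parolee)
print(score, "predict no violation" if score >= 10 else "predict violation")
# The 1928 comparison used a cutoff of at least ten positive factors.
```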

In a 1928 report, Burgess compared his predictions against two prison psychiatrists. He gathered a dataset of 1000 men who appeared before the Illinois Parole Board in the 1920s. The psychiatrists assigned each prisoner as likely to violate parole, unlikely to violate parole, or uncertain. Of the ones deemed unlikely to violate, the first psychiatrist predicted 85 percent correctly, the second 80 percent. Of the ones deemed likely to violate parole, the first psychiatrist predicted 30 percent correctly, the second 51 percent. Burgess’ method, looking for at least ten positive factors, not only made a prediction for all parolees but correctly predicted 86 percent of the ones unlikely to violate and 51 percent of those likely to violate. It outperformed the first psychiatrist at predicting violations and the second at predicting successful parole.

Algorithmic recidivism prediction remains a contentious topic. It is one of the most popular examples discussed by the machine learning fairness community. The common refrain is to argue that these risk assessments are examples of “an opaque decision-making system that influences the fundamental rights of residents of the US.” But Burgess was attempting to make the case for a more liberal parole system. He thought his algorithm could be less political, more fair, and more accurate.

Meehl highlights a dozen other studies in his book and continued to track these throughout his career. No matter how much he looked, he kept finding the same thing as Burgess: statistical rules were seldom worse and often much better than clinical predictions. In a reflection on his book, Meehl wrote in 1986, “There is no controversy in social science that shows such a large body of qualitatively diverse studies coming out so uniformly in the same direction as this one.” 

He may have been right. I wasn’t sure how best to convey the evidence, but let me discuss two meta-analyses from the 21st century. Meehl was no fan of meta-analysis (and neither am I), but sometimes it’s worth gathering all the papers and looking at the trends. 

The two biggest, broadest meta-analyses were done by Grove et al. (2000) and Ægisdóttir et al. (2006). Grove et al.’s analysis included 136 predictions. In 46% of the predictions, mechanical methods were roughly 5 percentage points better than clinical judgments.1 That is, the difference between the accuracy of the statistical and clinical predictions was at least 0.05. In 48%, the two kinds of predictions were within about 5 points of each other. Clinical predictions were substantially better than mechanical predictions in fewer than 6% of the studies. This plot from Grove et al. further demonstrates the skew in the distribution.

Here, a positive score denotes an advantage for mechanical prediction and a negative score an advantage for clinical. When they were better, mechanical predictions were more frequently far better.

Ægisdóttir et al. focused on statistical methods but found the same results as Grove et al. In their compilation of 48 predictions, 52% favored statistical methods, 38% reported comparable performance, and 10% favored clinical judgment.

What should we make of these findings? First and foremost, we should accept that statistical judgment can be considerably better than expert judgment. This should inform how we proceed in decision-making about people. However, I cannot emphasize enough that just because statistical prediction is never worse and often better than clinical judgment, it doesn’t follow that you can’t screw up statistical prediction. Careful statistical prediction remains a delicate skill.

  • You can have too few features.

  • You can have too many features.

  • You can have completely uninformative features.

  • You can have missing data.

  • You can have non-stationarity and frequency shifts.

These are just a few of the major headaches you have to deal with. If we’re going to rely on statistical prediction, then we need expertise in statistical hygiene.

With regard to that hygiene, I also want to emphasize again and again and again that you shouldn’t just break out some fancy new machine learning method and assume it’s going to be the best method. Many have noted that sophisticated machine learning methods are often outperformed by least squares. But least squares is still statistical prediction! One of Meehl’s examples is Sarbin’s study (1943), which showed that a two-variable linear regression based on high school ranking and college entrance exam score was more predictive of a student’s success at the University of Minnesota than the assessments of the university’s clinical counselors. Just because simple ML methods perform better than complex ones does not mean that simple ML methods are inferior to clinical judgment.
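As a purely synthetic illustration of how little machinery a Sarbin-style rule needs, here is a sketch that fits a two-variable least-squares predictor. The feature names, data, and coefficients are invented; only the form of the rule matters.

```python
# Minimal sketch: a two-variable linear rule fit by ordinary least squares on
# synthetic data (hypothetical high-school rank and entrance exam score).
import numpy as np

rng = np.random.default_rng(0)
n = 200
hs_rank = rng.uniform(0, 100, n)          # hypothetical high-school percentile rank
exam = rng.uniform(0, 100, n)             # hypothetical entrance exam score
gpa = 1.0 + 0.02 * hs_rank + 0.01 * exam + rng.normal(0, 0.3, n)  # synthetic outcome

X = np.column_stack([np.ones(n), hs_rank, exam])
coef, *_ = np.linalg.lstsq(X, gpa, rcond=None)   # ordinary least squares fit

def predict_gpa(rank, score):
    return coef[0] + coef[1] * rank + coef[2] * score

print(predict_gpa(80, 75))   # prediction for a hypothetical applicant
```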

Recidivism prediction provides another great example. An infamous ProPublica report highlighted a bizarre, opaque psychometric system, COMPAS, sold by Northpointe to the state of Wisconsin for predicting recidivism. Later analysis showed that COMPAS was no better than simple rules like Sarbin’s. Fancy opaque rules should always be compared to the simplest baselines. For many of these messy social questions, you’ll never beat the simple rules because the prediction problems are so hard anyway.

Statistical rules are more accurate, faster, and cheaper than experts. They can even be more fair and safe. And yet, statistical prediction is not a panacea. Meehl didn’t think so either! Statistical rules need to be targeted at interventions with simple outcomes. They are challenging to keep updated. Tech companies retrain their prediction systems every day. Medical risk assessments might stay static for decades. And mechanical rules have human costs. They can lead to an erosion of expertise as practitioners spend too much time deferring to their apps. They can lead to decision fatigue, forcing too many things into a computerized system. And they can lead to complacency, as following mechanical rules is drudgery. For all these reasons, the adoption of mechanical rules and statistical prediction in high-stakes scenarios must be done with care.


I understand why the power of statistical prediction will remain discomfiting for professionals. Statistical prediction is atheoretical. There’s no good reason why counts of the past lead to reasonable predictions of the future. That’s the problem of induction, my friends. I’m not going to go full neorationalist on you and argue that statistical prediction always works. That would be ridiculous and I don’t believe it (I have written endless blogs on why). We should interpret Meehl as providing us with a setting where statistical is probably going to be better than clinical: answering clear, multiple-choice questions about simple actions from machine-readable data. This characterization is useful! 

Still, it feels like a doctor can assess more than what is fed into the computer. That a counselor can see subtle cues that are valuable for prediction. That there are edge cases statistics can’t catch. Isn’t this true? Why is clinical judgment worse on average?

The key to the entire clinical-statistical puzzle is those last two words. “On average.” The trick that Meehl plays–and that all bureaucrats play–is in the quantification of “better.” By better we of course mean on average. Once you decide that things will be evaluated by averages, the game is up. If you believe that prediction is possible, and you tell me that I’m going to be evaluated by hit rates, then I’m going to find a method that maximizes hit rate over some class of possible algorithms. In machine learning, we call this empirical risk minimization. You should find a rule that predicts the past well and use this to make predictions about the future. Since you will be evaluated based on averages, this is effectively the optimal thing to do.
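Here is a toy version of that logic, with synthetic data and a deliberately tiny rule class: pick the threshold that maximizes hit rate on past cases and use it going forward.

```python
# Minimal sketch of empirical risk minimization over a tiny class of threshold rules.
# The scores and outcomes are synthetic; the point is only the "maximize the average" recipe.
import numpy as np

rng = np.random.default_rng(1)
scores = rng.uniform(0, 21, 500)                   # e.g. Burgess-style factor counts
outcomes = (scores + rng.normal(0, 4, 500)) > 10   # noisy "success" labels

def hit_rate(threshold):
    predictions = scores >= threshold
    return np.mean(predictions == outcomes)

best = max(range(22), key=hit_rate)
print(best, hit_rate(best))   # the empirically best threshold and its past hit rate
```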

Meehl summarizes the situation in the last paragraph of his 1954 book. If we subscribe to the bureaucratic utilitarian mindset, the algorithm always wins:

“If a clinician says, ‘This one is different’ or ‘It’s not like the ones in your table,’ ‘This time I’m surer,’ the obvious question is, ‘Why should we care whether you think this one is different or whether you are surer?’ Again, there is only one rational reply to such a question. We have now to study the success frequency of the clinician’s guesses when he asserts that he feels this way. If we have already done so and found him still behind the hit frequency of the table, we would be well advised to ignore him. Always, we might as well face it, the shadow of the statistician hovers in the background; always the actuary will have the final word.”

Footnote 1

As is the case with all meta-analyses, the way they pool their comparisons is frustrating. In order to evaluate the bulk benefit of one method versus another, you have to take a diverse set of results and homogenize them. Both of these studies do this by trying to scale the difference between clinical and statistical judgment to standardized units using Cohen’s d. Specifically, if one method has accuracy a1 and the other method accuracy a2, then d = 2 arcsin(√a1) - 2 arcsin(√a2) (the arcsine-transformed difference; this is the reading consistent with the numbers below).

A method was favored if |d| was greater than 0.1. Now, if a2 is 60% and d is 0.1, then a1 is 65%. If a2=70% and d=0.1, then a1=74%. This is why I say “roughly 0.05” above. Is this the right metric? Gah, I don’t think so. But if you have a better idea, please tell me in the comments!
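If you want to check the arithmetic, here is a quick sketch using the arcsine-based effect size. The formula is my reading of how the accuracies were standardized, but it reproduces the 60-to-65 and 70-to-74 percent figures above.

```python
# Quick check of the arcsine-based effect size and the back-solved accuracies.
import math

def effect_size(a1, a2):
    return 2 * math.asin(math.sqrt(a1)) - 2 * math.asin(math.sqrt(a2))

def a1_from(a2, d):
    """Invert the effect size: the a1 that sits d units above a2."""
    return math.sin((2 * math.asin(math.sqrt(a2)) + d) / 2) ** 2

print(round(a1_from(0.60, 0.1), 3))   # ~0.65, matching "if a2 is 60% ... a1 is 65%"
print(round(a1_from(0.70, 0.1), 3))   # ~0.74, matching "if a2=70% ... a1=74%"
```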

By Ben Recht

Pseudorandom density matrices

from arXiv: Computational Complexity

Authors: Nikhil Bansal, Wai-Keong Mok, Kishor Bharti, Dax Enshan Koh, Tobias Haug

Pseudorandom states (PRSs) are state ensembles that cannot be distinguished from Haar random states by any efficient quantum algorithm. However, the definition of PRSs has been limited to pure states and lacks robustness against noise. In this work, we introduce pseudorandom density matrices (PRDMs), ensembles of $n$-qubit states that are computationally indistinguishable from the generalized Hilbert-Schmidt ensemble, which is constructed from $(n+m)$-qubit Haar random states with $m$ qubits traced out. For a mixedness parameter $m=0$, PRDMs are equivalent to PRSs, whereas for $m=\omega(\log n)$, PRDMs are computationally indistinguishable from the maximally mixed state. In contrast to PRSs, PRDMs with $m=\omega(\log n)$ are robust to unital noise channels and a recently introduced $\mathsf{PostBQP}$ attack. Further, we construct pseudomagic and pseudocoherent state ensembles, which possess near-maximal magic and coherence, but are computationally indistinguishable from states with zero magic and coherence. PRDMs can exhibit a pseudoresource gap of $\Theta(n)$ vs $0$, surpassing previously found gaps. We introduce noise-robust EFI pairs, which are state ensembles that are computationally indistinguishable yet statistically far, even when subject to noise. We show that testing entanglement, magic and coherence is not efficient. Further, we prove that black-box resource distillation requires a superpolynomial number of copies. We also establish lower bounds on the purity needed for efficient testing and black-box distillation. Finally, we introduce memoryless PRSs, a noise-robust notion of PRSs that are indistinguishable from Haar random states for efficient algorithms without quantum memory. Our work provides a comprehensive framework of pseudorandomness for mixed states, which yields powerful quantum cryptographic primitives and fundamental bounds on quantum resource theories.

Intrinsic Universality in Seeded Active Tile Self-Assembly

from arXiv: Computational Complexity

Authors: Tim Gomez, Elise Grizzell, Asher Haun, Ryan Knobel, Tom Peters, Robert Schweller, Tim Wylie

The Tile Automata (TA) model describes self-assembly systems in which monomers can build structures and transition with an adjacent monomer to change their states. This paper shows that seeded TA is a non-committal intrinsically universal model of self-assembly. We present a single universal Tile Automata system containing approximately 4600 states that can simulate (a) the output assemblies created by any other Tile Automata system G, (b) the dynamics involved in building G's assemblies, and (c) G's internal state transitions. It does so in a non-committal way: it preserves the full non-deterministic dynamics of a tile's potential attachment or transition by selecting its state in a single step, considering all possible outcomes until the moment of selection. The system uses supertiles, each encoding the complete system being simulated. The universal system builds supertiles from its seed, each representing a single tile in G, transferring the information to simulate G to each new tile. Supertiles may also asynchronously transition states according to the rules of G. This result directly transfers to a restricted version of asynchronous Cellular Automata: pairwise Cellular Automata.

Transforming the Challenge of Constructing Low-Discrepancy Point Sets into a Permutation Selection Problem

from arXiv: Computational Geometry

Authors: François Clément, Carola Doerr, Kathrin Klamroth, Luís Paquete

Low discrepancy point sets have been widely used as a tool to approximate continuous objects by discrete ones in numerical processes, for example in numerical integration. Following a century of research on the topic, it is still unclear how low the discrepancy of point sets can go; in other words, how regularly distributed can points be in a given space. Recent insights using optimization and machine learning techniques have led to substantial improvements in the construction of low-discrepancy point sets, resulting in configurations of much lower discrepancy values than previously known. Building on the optimal constructions, we present a simple way to obtain $L_{\infty}$-optimized placement of points that follow the same relative order as an (arbitrary) input set. Applying this approach to point sets in dimensions 2 and 3 for up to 400 and 50 points, respectively, we obtain point sets whose $L_{\infty}$ star discrepancies are up to 25% smaller than those of the current-best sets, and around 50% better than classical constructions such as the Fibonacci set.

Faster Algorithms for Schatten-p Low Rank Approximation

from arXiv: Data Structures and Algorithms

Authors: Praneeth Kacham, David P. Woodruff

We study algorithms for the Schatten-$p$ Low Rank Approximation (LRA) problem. First, we show that by using fast rectangular matrix multiplication algorithms and different block sizes, we can improve the running time of the algorithms in the recent work of Bakshi, Clarkson and Woodruff (STOC 2022). We then show that by carefully combining our new algorithm with the algorithm of Li and Woodruff (ICML 2020), we can obtain even faster algorithms for Schatten-$p$ LRA. While the block-based algorithms are fast in the real number model, we do not have a stability analysis which shows that the algorithms work when implemented on a machine with polylogarithmic bits of precision. We show that the LazySVD algorithm of Allen-Zhu and Li (NeurIPS 2016) can be implemented on a floating point machine with only logarithmic, in the input parameters, bits of precision. As far as we are aware, this is the first stability analysis of any algorithm using $O((k/\sqrt{\varepsilon})\text{poly}(\log n))$ matrix-vector products with the matrix $A$ to output a $1+\varepsilon$ approximate solution for the rank-$k$ Schatten-$p$ LRA problem.
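As a reminder of the objective being approximated (this is not the paper's algorithm), the best rank-$k$ approximation in any Schatten-$p$ norm is the truncated SVD, and its error is the Schatten-$p$ norm of the discarded singular values. A minimal sketch:

```python
# Illustrates the Schatten-p low rank approximation objective, not the paper's method:
# the truncated SVD is the best rank-k approximation in any Schatten-p norm, and its
# error is the Schatten-p norm of the tail singular values.
import numpy as np

def schatten_p_error(A, k, p):
    """Schatten-p norm of A - A_k, where A_k is the rank-k truncated SVD of A."""
    s = np.linalg.svd(A, compute_uv=False)
    tail = s[k:]
    return (tail ** p).sum() ** (1.0 / p)

rng = np.random.default_rng(0)
A = rng.standard_normal((50, 30))
print(schatten_p_error(A, k=5, p=1))   # nuclear-norm error of the best rank-5 approximation
print(schatten_p_error(A, k=5, p=2))   # Frobenius-norm error
```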

Text Indexing for Long Patterns using Locally Consistent Anchors

from arXiv: Data Structures and Algorithms

Authors: Lorraine A. K. Ayad, Grigorios Loukides, Solon P. Pissis

In many real-world database systems, a large fraction of the data is represented by strings: sequences of letters over some alphabet. This is because strings can easily encode data arising from different sources. It is often crucial to represent such string datasets in a compact form but also to simultaneously enable fast pattern matching queries. This is the classic text indexing problem. The four absolute measures anyone should pay attention to when designing or implementing a text index are: (i) index space; (ii) query time; (iii) construction space; and (iv) construction time. Unfortunately, however, most (if not all) widely-used indexes (e.g., suffix tree, suffix array, or their compressed counterparts) are not optimized for all four measures simultaneously, as it is difficult to have the best of all four worlds. Here, we take an important step in this direction by showing that text indexing with sampling based on locally consistent anchors (lc-anchors) offers remarkably good performance in all four measures, when we have at hand a lower bound $\ell$ on the length of the queried patterns -- which is arguably a quite reasonable assumption in practical applications. Our index offers average-case guarantees. In our experiments using real benchmark datasets, we show that it compares favorably based on the four measures to all classic indexes: (compressed) suffix tree; (compressed) suffix array; and the FM-index. Notably, we also present a counterpart of our index with worst-case guarantees based on the lc-anchors notion of partitioning sets. To the best of our knowledge, this is the first index achieving the best of all worlds in the regime where we have at hand a lower bound $\ell$ on the length of the queried patterns.

Independent Set Reconfiguration Under Bounded-Hop Token

from arXiv: Data Structures and Algorithms

Authors: Hiroki Hatano, Naoki Kitamura, Taisuke Izumi, Takehiro Ito, Toshimitsu Masuzawa

The independent set reconfiguration problem (ISReconf) is the problem of determining, for given independent sets I_s and I_t of a graph G, whether I_s can be transformed into I_t by repeatedly applying a prescribed reconfiguration rule that transforms an independent set to another. As reconfiguration rules for the ISReconf, the Token Sliding (TS) model and the Token Jumping (TJ) model are commonly considered. While the TJ model admits the addition of any vertex (as far as the addition yields an independent set), the TS model admits the addition of only a neighbor of the removed vertex. It is known that the complexity status of the ISReconf differs between the TS and TJ models for some graph classes. In this paper, we analyze how changes in reconfiguration rules affect the computational complexity of reconfiguration problems. To this end, we generalize the TS and TJ models to a unified reconfiguration rule, called the k-Jump model, which admits the addition of a vertex within distance k from the removed vertex. Then, the TS and TJ models are the 1-Jump and D(G)-Jump models, respectively, where D(G) denotes the diameter of a connected graph G. We give the following three results: First, we show that the computational complexity of the ISReconf under the k-Jump model for general graphs is equivalent for all k >= 3. Second, we present a polynomial-time algorithm to solve the ISReconf under the 2-Jump model for split graphs. We note that the ISReconf under the 1-Jump (i.e., TS) model is PSPACE-complete for split graphs, and hence the complexity status of the ISReconf differs between k = 1 and k = 2. Third, we consider the optimization variant of the ISReconf, which computes the minimum number of steps of any transformation between Is and It. We prove that this optimization variant under the k-Jump model is NP-complete for chordal graphs of diameter at most 2k + 1, for any k >=3.

IID Prophet Inequality with Random Horizon: Going Beyond Increasing Hazard Rates

from arXiv: Data Structures and Algorithms

Authors: Giordano Giambartolomei, Frederik Mallmann-Trenn, Raimundo Saona

Prophet inequalities are a central object of study in optimal stopping theory. In the iid model, a gambler sees values in an online fashion, sampled independently from a given distribution. Upon observing each value, the gambler either accepts it as a reward or irrevocably rejects it and proceeds to observe the next value. The goal of the gambler, who cannot see the future, is maximising the expected value of the reward while competing against the expectation of a prophet (the offline maximum). In other words, one seeks to maximise the gambler-to-prophet ratio of the expectations. This model has been studied with infinite, finite and unknown number of values. When the gambler faces a random number of values, the model is said to have random horizon. We consider the model in which the gambler is given a priori knowledge of the horizon's distribution. Alijani et al. (2020) designed a single-threshold algorithm achieving a ratio of $1/2$ when the random horizon has an increasing hazard rate and is independent of the values. We prove that with a single-threshold, a ratio of $1/2$ is actually achievable for several larger classes of horizon distributions, with the largest being known as the $\mathcal{G}$ class in reliability theory. Moreover, we extend this result to its dual, the $\overline{\mathcal{G}}$ class (which includes the decreasing hazard rate class), and to low-variance horizons. Finally, we construct the first example of a family of horizons, for which multiple thresholds are necessary to achieve a nonzero ratio. We establish that the Secretary Problem optimal stopping rule provides one such algorithm, paving the way towards the study of the model beyond single-threshold algorithms.
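A minimal Monte Carlo sketch of the setup (not of the paper's results): estimate the gambler-to-prophet ratio for one single-threshold rule, with an arbitrary value distribution and an arbitrary random horizon independent of the values.

```python
# Toy simulation of the iid prophet inequality with a random horizon.
# The uniform values, the horizon distribution, and the threshold are arbitrary choices.
import random

def run_once(threshold, rng):
    horizon = rng.randint(1, 10)                  # hypothetical random horizon
    values = [rng.random() for _ in range(horizon)]
    gambler = next((v for v in values if v >= threshold), 0.0)  # accept first value >= threshold
    prophet = max(values)                                        # offline maximum
    return gambler, prophet

rng = random.Random(0)
samples = [run_once(threshold=0.7, rng=rng) for _ in range(100_000)]
gambler_mean = sum(g for g, _ in samples) / len(samples)
prophet_mean = sum(p for _, p in samples) / len(samples)
print(gambler_mean / prophet_mean)    # empirical gambler-to-prophet ratio for this threshold
```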

Speed-robust scheduling revisited

from arXiv: Data Structures and Algorithms

Authors: Josef Minařík, Jiří Sgall

Speed-robust scheduling is the following two-stage problem of scheduling $n$ jobs on $m$ uniformly related machines. In the first stage, the algorithm receives the value of $m$ and the processing times of $n$ jobs; it has to partition the jobs into $b$ groups called bags. In the second stage, the machine speeds are revealed and the bags are assigned to the machines, i.e., the algorithm produces a schedule where all the jobs in the same bag are assigned to the same machine. The objective is to minimize the makespan (the length of the schedule). The algorithm is compared to the optimal schedule and it is called $\rho$-robust, if its makespan is always at most $\rho$ times the optimal one. Our main result is an improved bound for equal-size jobs for $b=m$. We give an upper bound of $1.6$. This improves the previous bound of $1.8$ and is almost tight in light of the previous lower bound of $1.58$. Second, for infinitesimally small jobs, we give tight upper and lower bounds for the case when $b\geq m$. This generalizes and simplifies the previous bounds for $b=m$. Finally, we introduce a new special case with relatively small jobs for which we give an algorithm whose robustness is close to that of infinitesimal jobs and thus gives a better-than-$2$-robust guarantee for a large class of inputs.

Paralleling and Accelerating Arc Consistency Enforcement with Recurrent Tensor Computations

from arXiv: Data Structures and Algorithms

Authors: Mingqi Yang

We propose a new arc consistency enforcement paradigm that transforms arc consistency enforcement into recurrent tensor operations. In each iteration of the recurrence, all involved processes can be fully parallelized with tensor operations. And the number of iterations is quite small. Based on these benefits, the resulting algorithm fully leverages the power of parallelization and GPU, and therefore is extremely efficient on large and densely connected constraint networks.

Learning-augmented Maximum Independent Set

from arXiv: Data Structures and Algorithms

Authors: Vladimir Braverman, Prathamesh Dharangutte, Vihan Shah, Chen Wang

We study the Maximum Independent Set (MIS) problem on general graphs within the framework of learning-augmented algorithms. The MIS problem is known to be NP-hard and is also NP-hard to approximate to within a factor of $n^{1-\delta}$ for any $\delta>0$. We show that we can break this barrier in the presence of an oracle obtained through predictions from a machine learning model that answers vertex membership queries for a fixed MIS with probability $1/2+\varepsilon$. In the first setting we consider, the oracle can be queried once per vertex to know if a vertex belongs to a fixed MIS, and the oracle returns the correct answer with probability $1/2 + \varepsilon$. Under this setting, we show an algorithm that obtains an $\tilde{O}(\sqrt{\Delta}/\varepsilon)$-approximation in $O(m)$ time where $\Delta$ is the maximum degree of the graph. In the second setting, we allow multiple queries to the oracle for a vertex, each of which is correct with probability $1/2 + \varepsilon$. For this setting, we show an $O(1)$-approximation algorithm using $O(n/\varepsilon^2)$ total queries and $\tilde{O}(m)$ runtime.

On the Houdré-Tetali conjecture about an isoperimetric constant of graphs

from arXiv: Data Structures and Algorithms

Authors: Lap Chi Lau, Dante Tjowasi

Houdr\'e and Tetali defined a class of isoperimetric constants $\varphi_p$ of graphs for $0 \leq p \leq 1$, and conjectured a Cheeger-type inequality for $\varphi_\frac12$ of the form $$\lambda_2 \lesssim \varphi_\frac12 \lesssim \sqrt{\lambda_2}$$ where $\lambda_2$ is the second smallest eigenvalue of the normalized Laplacian matrix. If true, the conjecture would be a strengthening of the hard direction of the classical Cheeger's inequality. Morris and Peres proved Houdr\'e and Tetali's conjecture up to an additional log factor, using techniques from evolving sets. We present the following related results on this conjecture.

  • We provide a family of counterexamples to the conjecture of Houdr\'e and Tetali, showing that the logarithmic factor is needed.

  • We match Morris and Peres's bound using standard spectral arguments.

  • We prove that Houdr\'e and Tetali's conjecture is true for any constant $p$ strictly bigger than $\frac12$, which is also a strengthening of the hard direction of Cheeger's inequality.

Furthermore, our results can be extended to directed graphs using Chung's definition of eigenvalues for directed graphs.

Almost-linear Time Approximation Algorithm to Euclidean $k$-median and $k$-means

from arXiv: Data Structures and Algorithms

Authors: Max Dupré la Tour, David Saulpic

Clustering is one of the staples of data analysis and unsupervised learning. As such, clustering algorithms are often used on massive data sets, and they need to be extremely fast. We focus on the Euclidean $k$-median and $k$-means problems, two of the standard ways to model the task of clustering. For these, the go-to algorithm is $k$-means++, which yields an $O(\log k)$-approximation in time $\tilde O(nkd)$. While it is possible to improve either the approximation factor [Lattanzi and Sohler, ICML19] or the running time [Cohen-Addad et al., NeurIPS 20], it is unknown how precise a linear-time algorithm can be. In this paper, we almost answer this question by presenting an almost linear-time algorithm to compute a constant-factor approximation.

Trace reconstruction from local statistical queries

from arXiv: Data Structures and Algorithms

Authors: Xi Chen, Anindya De, Chin Ho Lee, Rocco A. Servedio

The goal of trace reconstruction is to reconstruct an unknown $n$-bit string $x$ given only independent random traces of $x$, where a random trace of $x$ is obtained by passing $x$ through a deletion channel. A Statistical Query (SQ) algorithm for trace reconstruction is an algorithm which can only access statistical information about the distribution of random traces of $x$ rather than individual traces themselves. Such an algorithm is said to be $\ell$-local if each of its statistical queries corresponds to an $\ell$-junta function over some block of $\ell$ consecutive bits in the trace. Since several -- but not all -- known algorithms for trace reconstruction fall under the local statistical query paradigm, it is interesting to understand the abilities and limitations of local SQ algorithms for trace reconstruction. In this paper we establish nearly-matching upper and lower bounds on local Statistical Query algorithms for both worst-case and average-case trace reconstruction. For the worst-case problem, we show that there is an $\tilde{O}(n^{1/5})$-local SQ algorithm that makes all its queries with tolerance $\tau \geq 2^{-\tilde{O}(n^{1/5})}$, and also that any $\tilde{O}(n^{1/5})$-local SQ algorithm must make some query with tolerance $\tau \leq 2^{-\tilde{\Omega}(n^{1/5})}$. For the average-case problem, we show that there is an $O(\log n)$-local SQ algorithm that makes all its queries with tolerance $\tau \geq 1/\mathrm{poly}(n)$, and also that any $O(\log n)$-local SQ algorithm must make some query with tolerance $\tau \leq 1/\mathrm{poly}(n).$
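For readers unfamiliar with the model, the sketch below (not from the paper) simply samples traces from the deletion channel; the statistical-query algorithms themselves are not shown.

```python
# Tiny sketch of the deletion channel: each bit of x is deleted independently
# with some probability, and the surviving bits (in order) form a trace.
import random

def random_trace(x, deletion_prob=0.5, rng=random):
    """Pass the bit string x through a deletion channel."""
    return "".join(bit for bit in x if rng.random() >= deletion_prob)

random.seed(0)
x = "1011001110"
print([random_trace(x) for _ in range(3)])   # three independent traces of x
```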

Tuesday, July 16

TR24-121 | Approximating the Number of Relevant Variables in a Parity Implies Proper Learning | Nader Bshouty

from ECCC Papers

Consider the model where we can access a parity function through random uniform labeled examples in the presence of random classification noise. In this paper, we show that approximating the number of relevant variables in the parity function is as hard as properly learning parities. More specifically, let $\gamma:{\mathbb R}^+\to {\mathbb R}^+$, where $\gamma(x) \ge x$, be any strictly increasing function. In our first result, we show that from any polynomial-time algorithm that returns a $\gamma$-approximation, $D$ (i.e., $\gamma^{-1}(d(f)) \leq D \leq \gamma(d(f))$), of the number of relevant variables~$d(f)$ for any parity $f$, we can, in polynomial time, construct a solution to the long-standing open problem of polynomial-time learning $k(n)$-sparse parities (parities with $k(n)\le n$ relevant variables), where $k(n) = \omega_n(1)$. In our second result, we show that from any $T(n)$-time algorithm that, for any parity $f$, returns a $\gamma$-approximation of the number of relevant variables $d(f)$ of $f$, we can, in polynomial time, construct a $poly(\Gamma(n))T(\Gamma(n)^2)$-time algorithm that properly learns parities, where $\Gamma(x)=\gamma(\gamma(x))$. If $T(\Gamma(n)^2)=\exp({o(n/\log n)})$, this would resolve another long-standing open problem of properly learning parities in the presence of random classification noise in time~$\exp({o(n/\log n)})$.

Clinical versus Statistical Prediction (II)

from Ben Recht

Meehl's Philosophical Psychology, Lecture 10, part 2.

This post digs into Lecture 10 of Paul Meehl’s course “Philosophical Psychology.” Technically speaking, this lecture starts at minute 74 of Lecture 9. The video for Lecture 10 is here. Here’s the full table of contents of my blogging through the class.

One of the more common misreadings of Meehl is that he thought you could somehow do away with clinicians altogether. This was not his position, and as we’ll see more in Lecture 11, Meehl did not believe that all decisions could be made statistically. His aim was to determine the scope of statistical judgment and when it might be useful. There was a significant set of decisions where he deemed statistics superior. By being precise about this subset, he thought that he could both improve care and simplify the life of the clinician, allowing them room to automate part of their job. Today, let’s home in on the sorts of predictions Meehl thought were best decided by statistical methods.

Actions

Meehl first clarifies that the goal should be about predicting the outcome of interventions. He is not interested in diagnostic tests. He is not asking about the construct validity of testing for diseases. (He has written other papers about that topic!) Here, he wants to understand how to predict the consequences of actions.

All of the example questions he asks are attempting to predict how an action will affect a particular person. If granted admission, will a person succeed in law school? If released from prison, will a person recidivate? If a depressed person isn’t hospitalized, will they commit suicide? If a person receives shock therapy, will their depression be relieved? 

These sorts of questions are about the impact of single actions. They also have yes or no answers. Meehl focuses on questions with a small list of possible outcomes. For open-ended questions, Meehl thought clinical expertise was indispensable. It was only for problems with simple multiple-choice answers where he thought statistical decision-making could play a role.

Data

To make the decision, Meehl assumes the clinician has the same data as the statistical rule. He belabors distinguishing between the kind of data and the mode of combining the data. As long as the statistical formula and the clinician are presented with the same information, the data can be anything: interviews, life history data, a mental test, other biometrics. 

Obviously, such data has to be transformed into a machine-readable format somehow. Here’s another place the clinician may be indispensable. A clinician may be required to observe a patient’s behavior or facial expressions and write down appropriate diagnostics. Today, this could perhaps also be done with statistical machine learning. In his 1989 lectures, he notes that character recognition is still barely functional. He doesn’t rule out the possibility of more sophisticated pattern recognition methods being used if computers improve. (Spoiler alert: they did). 

Regardless, he just wants the computer and the clinician to be using the same data. The controversy is about the mode of combination not the data types.

Mechanical and actuarial rules

Meehl defines two forms of algorithmic decision rules. First, there are mechanical rules, which we now call algorithms. Mechanical rule and algorithm are synonymous. A mechanical rule is a well-defined, step-by-step process for translating data into a decision that can be implemented on a computer.

Actuarial rules are a special kind of mechanical rule. They are algorithms that make decisions based on rates of past occurrences. These are the statistical prediction methods. A decade ago we called these prediction methods machine learning. Today we call them AI.
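To make the distinction concrete, here is a toy sketch (mine, not Meehl's): both rules below are mechanical, but only the second is actuarial, because its output is driven by rates of past occurrences. The features and case data are invented.

```python
# Illustrative contrast between a mechanical rule and an actuarial rule.
# The specific features and numbers are made up for the sketch.

# A mechanical rule: a fixed, explicit step-by-step procedure.
def mechanical_rule(age, prior_offenses):
    if prior_offenses == 0 and age >= 30:
        return "low risk"
    return "high risk"

# An actuarial rule: the same kind of procedure, but its decision comes from
# observed rates in past cases (here, the empirical recidivism rate per group).
past_cases = [
    # (age >= 30, no prior offenses, recidivated?)
    (True, True, False), (True, True, False), (True, False, True),
    (False, True, True), (False, False, True), (True, True, False),
]

def actuarial_rule(age_30_plus, no_priors):
    group = [r for a, p, r in past_cases if (a, p) == (age_30_plus, no_priors)]
    rate = sum(group) / len(group) if group else 0.5  # fall back when no past data
    return "low risk" if rate < 0.5 else "high risk"

print(mechanical_rule(35, 0), actuarial_rule(True, True))
```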

Actually, now that I think about it, we’re in the goofy phase of the hype cycle where all mechanical rules are now annoyingly called AI. So I’m going to use Meehl’s terms of mechanical and actuarial to keep things clear. Let me still emphasize that Meehl’s clinical-statistical question asks when AI is better than people at making decisions. There’s a large academic community that still argues the answer is never. As we’ll see, Meehl does not agree.

Clinical judgment

Meehl’s definition of a clinical judgment is a bit more vague. He says it’s anything “informal” made by a human specialist. It’s whatever process occurs in a person’s head. Clinical rules are those made by clinicians based on intuitive assessments of data. These are decisions that clinicians can’t cleanly explain and hence aren’t formalizable as algorithms. 

The clinical-statistical question

With all of this setup, we can now pose Meehl’s central question:

Given a decision problem with a small set of possible outcomes and an appropriate, fixed collection of data, do actuarial rules or clinical judgment provide more accurate judgments about the future?

For this narrow but broadly applicable question, Meehl came down solidly on one side: Statistical prediction would never be worse than clinical prediction.

If you had asked me a year ago, I’d have vehemently disagreed. But I’ve come around. Meehl provides compelling empirical evidence in his 1954 book. And 70 years of studies have backed him up. You’d be hard-pressed to find a result in social science that is as robust as statistical decisions outperforming clinical judgment. After grappling with the evidence and the counterarguments, I now totally agree with Meehl. Tomorrow, let me try to convince you, too. I will present the empirical evidence, Meehl’s philosophical arguments, and what I consider to be a simple but deceptively subtle explanation. It’s through the subtlety that we might find some resolution.

By Ben Recht

Planar Graphs—Again

from Richard Lipton

Professors Yin Tat Lee and Thomas Rothvoss are faculty at the Allen School of the University of Washington.

Lee received the A.W. Tucker Prize, which recognizes the best doctoral thesis in optimization in the past three years. Rothvoss, who holds a dual appointment in the Allen School and the Department of Mathematics, earned the Delbert Ray Fulkerson Prize, which recognizes outstanding papers in the area of discrete mathematics. They are advising theory PhD students in the Allen School—some jointly.

Paul Beame and Anna Karlin are some of the top senior theorists in the Allen School that I have known for many years.

Sally
One of the students in the theory group is Sally Dong, who is a final-year PhD student. I ran across her work the other day here. She is working with Lee and Rothvoss.

Sally’s Paper

Computing Circle Packing Representations of Planar Graphs by Sally Dong, Yin Tat Lee, Kent Quanrud. See it here:

The Circle Packing Theorem states that every planar graph can be represented as the tangency graph of a family of internally-disjoint circles. A well-known generalization is the Primal-Dual Circle Packing Theorem for 3-connected planar graphs. The existence of these representations has widespread applications in theoretical computer science and mathematics; however, the algorithmic aspect has received relatively little attention. In this work, we present an algorithm based on convex optimization for computing a primal-dual circle packing representation of maximal planar graphs, i.e. triangulations. This in turn gives an algorithm for computing a circle packing representation of any planar graph. Both take O(n log(R/s)) expected run-time to produce a solution that is s-close to a true representation, where R is the ratio between the maximum and minimum circle radius in the true representation.

Open Problems

I loved the fact that the Circle Packing Theorem was related to an ancient theorem of mine—with Bob Tarjan—on planar graphs. Dong’s paper says:

It gives a geometric proof of the Planar Separator Theorem of Lipton and Tarjan; an analysis of circle packing properties further gives an improved constant bound for the separator size; it is also used crucially to design a simple spectral algorithm for computing optimal separators in graphs of bounded genus and degree.

I loved that it could be used to prove better bounds on the size of the separators. Very cool.

By rjlipton

Postdoc at ETH Institute for Theoretical Studies in Zurich (apply by September 19, 2024)

from CCI: jobs

The Institute for Theoretical Studies at ETH Zurich is looking for Junior and Advanced Fellows for 2025. Please visit our website for more detailed information. Please note that we do not accept any direct applications. Candidates must be nominated by faculty or senior researchers in mathematics, theoretical computer science or the theoretical natural sciences.

Website: https://eth-its.ethz.ch/fellows/nomination-of-junior-fellows1.html
Email: nominations@eth-its.ethz.ch

By shacharlovett

On full-separating sets in graphs

from arXiv: Computational Complexity

Authors: Dipayan Chakraborty, Annegret K. Wagler

Several different types of identification problems have already been studied in the literature, where the objective is to distinguish any two vertices of a graph by their unique neighborhoods in a suitably chosen dominating or total-dominating set of the graph, often referred to as a \emph{code}. To study such problems under a unifying point of view, reformulations of the already studied problems in terms of covering problems in suitably constructed hypergraphs have been provided. Analyzing these hypergraph representations, we introduce a new separation property, called \emph{full-separation}, which has not yet been considered in the literature. We study it in combination with both domination and total-domination, and call the resulting codes \emph{full-separating-dominating codes} (or \emph{FD-codes} for short) and \emph{full-separating-total-dominating codes} (or \emph{FTD-codes} for short), respectively. We address the conditions for the existence of FD- and FTD-codes, bounds for their size and their relation to codes of the other types. We show that the problem of determining an FD- or an FTD-code of minimum cardinality in a graph is NP-hard. We also show that the cardinalities of minimum FD- and FTD-codes differ by at most one, but that it is NP-complete to decide if they are equal for a given graph in general. We find the exact values of minimum cardinalities of the FD- and FTD-codes on some familiar graph classes like paths, cycles, half-graphs and spiders. This helps us compare the two codes with other codes on these graph families, thereby exhibiting extremal cases for several lower bounds.
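
As background, the domination and separation conditions that such codes combine are usually stated as follows for a code $C \subseteq V(G)$, writing $N(v)$ for the open and $N[v] = N(v) \cup \{v\}$ for the closed neighborhood (these are the standard conditions; the paper's new full-separation property is a further variant defined there):

\[
\begin{aligned}
&\text{domination:} && N[v] \cap C \neq \emptyset \text{ for all } v \in V, \\
&\text{total domination:} && N(v) \cap C \neq \emptyset \text{ for all } v \in V, \\
&\text{separation (as in identifying codes):} && N[u] \cap C \neq N[v] \cap C \text{ for all distinct } u, v \in V.
\end{aligned}
\]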

Explicit Commutative ROABPs from Partial Derivatives

from arXiv: Computational Complexity

Authors: Vishwas Bhargava, Anamay Tengse

The dimension of partial derivatives (Nisan and Wigderson, 1997) is a popular measure for proving lower bounds in algebraic complexity. It is used to give strong lower bounds on the Waring decomposition of polynomials (called Waring rank). This naturally leads to an interesting open question: does this measure essentially characterize the Waring rank of any polynomial? The well-studied model of Read-once Oblivious ABPs (ROABPs for short) lends itself to an interesting hierarchy of 'sub-models': Any-Order-ROABPs (ARO), Commutative ROABPs, and Diagonal ROABPs. It follows from previous works that for any polynomial, a bound on its Waring rank implies an analogous bound on its Diagonal ROABP complexity (called the duality trick), and a bound on its dimension of partial derivatives implies an analogous bound on its 'ARO complexity': ROABP complexity in any order (Nisan, 1991). Our work strengthens the latter connection by showing that a bound on the dimension of partial derivatives in fact implies a bound on the commutative ROABP complexity. Thus, we improve our understanding of partial derivatives and move a step closer towards answering the above question. Our proof builds on the work of Ramya and Tengse (2022) to show that the commutative-ROABP-width of any homogeneous polynomial is at most the dimension of its partial derivatives. The technique itself is a generalization of the proof of the duality trick due to Saxena (2008).
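
As a concrete illustration of the measure (a small SymPy sketch, not from the paper): the dimension of partial derivatives of a polynomial is the dimension of the linear span of all its partial derivatives of all orders, so for $x_1x_2x_3$ it equals $8$, since every multilinear monomial in the three variables appears.

# Sketch: compute the Nisan-Wigderson dimension-of-partial-derivatives measure.
import sympy as sp

def partial_derivative_dimension(poly, variables):
    # Collect all partial derivatives of all orders (including the polynomial itself).
    derivs, frontier = set(), {sp.expand(poly)}
    while frontier:
        new = set()
        for f in frontier:
            derivs.add(f)
            for x in variables:
                d = sp.expand(sp.diff(f, x))
                if d != 0 and d not in derivs:
                    new.add(d)
        frontier = new - derivs
    # Write each derivative in the monomial basis and return the rank of the span.
    polys = [sp.Poly(f, *variables) for f in derivs]
    monomials = sorted({m for p in polys for m in p.monoms()})
    rows = [[dict(p.terms()).get(m, 0) for m in monomials] for p in polys]
    return sp.Matrix(rows).rank()

x, y, z = sp.symbols('x y z')
print(partial_derivative_dimension(x*y*z, (x, y, z)))           # 8
print(partial_derivative_dimension((x + y + z)**2, (x, y, z)))  # 3: span of {f, x+y+z, 1}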

Circuits and Backdoors: Five Shades of the SETH

from arXiv: Computational Complexity

Authors: Michael Lampis

The SETH is a hypothesis of fundamental importance to (fine-grained) parameterized complexity theory and many important tight lower bounds are based on it. This situation is somewhat problematic, because the validity of the SETH is not universally believed and because in some senses the SETH seems to be "too strong" a hypothesis for the considered lower bounds. Motivated by this, we consider a number of reasonable weakenings of the SETH that render it more plausible, with sources ranging from circuit complexity, to backdoors for SAT-solving, to graph width parameters, to weighted satisfiability problems. Despite the diversity of the different formulations, we are able to uncover several non-obvious connections using tools from classical complexity theory. This leads us to a hierarchy of five main equivalence classes of hypotheses, with some of the highlights being the following: We show that beating brute force search for SAT parameterized by a modulator to a graph of bounded pathwidth, or bounded treewidth, or logarithmic tree-depth, is actually the same question, and is in fact equivalent to beating brute force for circuits of depth $\epsilon n$; we show that beating brute force search for a strong 2-SAT backdoor is equivalent to beating brute force search for a modulator to logarithmic pathwidth; we show that beating brute force search for a strong Horn backdoor is equivalent to beating brute force search for arbitrary circuit SAT.
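
For readers who have not seen it spelled out, the Strong Exponential Time Hypothesis (in its standard form, not anything specific to this paper) asserts:

\[
\text{for every } \varepsilon > 0 \text{ there exists a } k \text{ such that } k\text{-SAT on } n \text{ variables has no } O\!\left(2^{(1-\varepsilon)n}\right)\text{-time algorithm.}
\]

The weakenings considered in the paper keep the "no $2^{(1-\varepsilon)n}$ algorithm" form but replace $k$-SAT by the restricted circuit, backdoor, width, and weighted-satisfiability variants listed above.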

Complexity of 2D Snake Cube Puzzles

from arXiv: Computational Geometry

Authors: MIT Hardness Group, Nithid Anchaleenukoon, Alex Dang, Erik D. Demaine, Kaylee Ji, Pitchayut Saengrungkongka

Given a chain of $HW$ cubes where each cube is marked "turn $90^\circ$" or "go straight", when can it fold into a $1 \times H \times W$ rectangular box? We prove several variants of this (still) open problem NP-hard: (1) allowing some cubes to be wildcard (can turn or go straight); (2) allowing a larger box with empty spaces (simplifying a proof from CCCG 2022); (3) growing the box (and the number of cubes) to $2 \times H \times W$ (improving a prior 3D result from height $8$ to $2$); (4) with hexagonal prisms rather than cubes, each specified as going straight, turning $60^\circ$, or turning $120^\circ$; and (5) allowing the cubes to be encoded implicitly to compress exponentially large repetitions.
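
To make the objects concrete, here is a tiny brute-force folder for the $1 \times H \times W$ case (a hypothetical sketch for playing with small instances, not related to the paper's hardness reductions): marks[i] records whether cube $i$ must go straight or turn, and we search for a self-avoiding placement of the chain that fills the $H \times W$ grid.

# Brute-force sketch for the 1 x H x W snake cube puzzle (only for tiny instances).
DIRS = [(0, 1), (1, 0), (0, -1), (-1, 0)]  # right, down, left, up

def folds(marks, H, W):
    n = H * W
    assert len(marks) == n  # marks[i] is 'S' (go straight) or 'T' (turn 90 degrees)

    def search(i, r, c, d, used):
        # Cube i sits at (r, c); d is the direction in which we arrived there.
        if i == n - 1:
            return True  # all n cubes placed, so the grid is full
        outs = [d] if marks[i] == 'S' else [(d + 1) % 4, (d + 3) % 4]
        for nd in outs:
            nr, nc = r + DIRS[nd][0], c + DIRS[nd][1]
            if 0 <= nr < H and 0 <= nc < W and (nr, nc) not in used:
                used.add((nr, nc))
                if search(i + 1, nr, nc, nd, used):
                    return True
                used.remove((nr, nc))
        return False

    # Try every start cell and every formal incoming direction for cube 0; since
    # cube 0 has no predecessor, this simply lets its first step go anywhere.
    return any(search(0, r, c, d, {(r, c)})
               for r in range(H) for c in range(W) for d in range(4))

print(folds("STTS", 2, 2))  # True: the two interior cubes must both turn
print(folds("SSSS", 2, 2))  # False: a straight chain cannot fill a 2 x 2 box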

On the twin-width of smooth manifolds

from arXiv: Computational Geometry

Authors: Édouard Bonnet, Kristóf Huszár

Building on Whitney's classical method of triangulating smooth manifolds, we show that every compact $d$-dimensional smooth manifold admits a triangulation with dual graph of twin-width at most $d^{O(d)}$. In particular, it follows that every compact 3-manifold has a triangulation with dual graph of bounded twin-width. This is in sharp contrast to the case of treewidth, where for any natural number $n$ there exists a closed 3-manifold such that every triangulation thereof has dual graph with treewidth at least $n$. To establish this result, we bound the twin-width of the incidence graph of the $d$-skeleton of the second barycentric subdivision of the $2d$-dimensional hypercubic honeycomb. We also show that every compact, piecewise-linear (hence smooth) $d$-dimensional manifold has triangulations where the dual graph has an arbitrarily large twin-width.
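
For readers unfamiliar with the parameter, the standard definition (not specific to this paper) is the following. A contraction sequence of an $n$-vertex graph $G$ merges two vertices at a time until a single vertex remains, maintaining a trigraph with black and red edges: when $u$ and $v$ are merged into $w$, the edge $wx$ is black if $x$ was a black neighbour of both $u$ and $v$, absent if $x$ was adjacent to neither, and red otherwise (and red edges stay red). The twin-width of $G$ is the minimum, over all contraction sequences, of the maximum red degree appearing along the sequence. The dual graph of a triangulation has a vertex for each top-dimensional simplex and an edge between two simplices that share a facet.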

Spanning Trees Minimizing Branching Costs

from arXiv: Data Structures and Algorithms

Authors: Luisa Gargano, Adele A. Rescigno

The Minimum Branch Vertices Spanning Tree problem aims to find a spanning tree $T$ in a given graph $G$ with the fewest branch vertices, defined as vertices with degree three or more in $T$. This problem, known to be NP-hard, has attracted significant attention due to its importance in network design and optimization. Extensive research has been conducted on the algorithmic and combinatorial aspects of this problem, with recent studies delving into its fixed-parameter tractability. In this paper, we focus primarily on the parameter modular-width. We demonstrate that finding a spanning tree with the minimum number of branch vertices is Fixed-Parameter Tractable (FPT) when considered with respect to modular-width. Additionally, in cases where each vertex in the input graph has an associated cost for serving as a branch vertex, we prove that the problem of finding a spanning tree with the minimum branch cost (i.e., minimizing the sum of the costs of branch vertices) is FPT with respect to neighborhood diversity.
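
As a small illustration of the objective (a brute-force sketch for tiny graphs, unrelated to the paper's FPT algorithms): enumerate all spanning trees and count branch vertices, i.e. vertices of degree at least three in the tree.

# Brute-force sketch: spanning tree minimizing the number of branch vertices.
from itertools import combinations

def min_branch_vertices(n, edges):
    # Vertices are 0..n-1; edges is a list of pairs. Try all (n-1)-edge subsets.
    best = None
    for subset in combinations(edges, n - 1):
        parent = list(range(n))
        def find(a):
            while parent[a] != a:
                parent[a] = parent[parent[a]]
                a = parent[a]
            return a
        # n-1 edges form a spanning tree iff they are acyclic.
        acyclic = True
        for u, v in subset:
            ru, rv = find(u), find(v)
            if ru == rv:
                acyclic = False
                break
            parent[ru] = rv
        if not acyclic:
            continue
        degree = [0] * n
        for u, v in subset:
            degree[u] += 1
            degree[v] += 1
        branches = sum(1 for deg in degree if deg >= 3)
        best = branches if best is None else min(best, branches)
    return best

edges = [(0, 1), (0, 2), (0, 3), (1, 2), (2, 3)]
print(min_branch_vertices(4, edges))  # 0: the path 1-0-3-2 is a spanning tree with no branch vertex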

An efficient algorithm to compute the minimum free energy of interacting nucleic acid strands

from arXiv: Data Structures and Algorithms

Authors: Ahmed Shalaby, Damien Woods

The information-encoding molecules RNA and DNA form a combinatorially large set of secondary structures through nucleic acid base pairing. Thermodynamic prediction algorithms predict favoured, or minimum free energy (MFE), secondary structures, and can assign an equilibrium probability to any structure via the partition function: a Boltzmann-weighted sum over the set of secondary structures. MFE is NP-hard in the presence of pseudoknots, base pairings that violate a restricted planarity condition. However, unpseudoknotted structures are amenable to dynamic programming: for a single DNA/RNA strand there are polynomial time algorithms for MFE and partition function. For multiple strands, the problem is more complicated due to entropic penalties. Dirks et al [SICOMP Review; 2007] showed that for O(1) strands, with N bases, there is a partition function algorithm running in time polynomial in N; however, their technique did not generalise to MFE, which they left open. We give the first polynomial time (O(N^4)) algorithm for unpseudoknotted multiple (O(1)) strand MFE, answering the open problem from Dirks et al. The challenge lies in considering rotational symmetry of secondary structures, a feature not immediately amenable to dynamic programming algorithms. Our proof has two main technical contributions: First, a polynomial upper bound on the number of symmetric secondary structures to be considered when computing rotational symmetry penalties. Second, that bound is leveraged by a backtracking algorithm to find the MFE in an exponential space of contenders. Our MFE algorithm has the same asymptotic run time as Dirks et al's partition function algorithm, suggesting efficient handling of rotational symmetry, although higher space complexity. It also seems reasonably tight in the number of strands since Condon, Hajiaghayi & Thachuk [DNA27, 2021] have shown that unpseudoknotted MFE is NP-hard for O(N) strands.
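
To illustrate why unpseudoknotted structures are amenable to dynamic programming for a single strand, here is the textbook Nussinov-style recursion that maximizes the number of non-crossing base pairs; this is only a warm-up, not the paper's multi-strand MFE algorithm, which additionally has to handle realistic energy models, entropic penalties, and rotational symmetry.

# Textbook Nussinov-style DP (single strand): maximize non-crossing base pairs in O(n^3).
def nussinov(seq, min_loop=3):
    pairs = {("A", "U"), ("U", "A"), ("C", "G"), ("G", "C"), ("G", "U"), ("U", "G")}
    n = len(seq)
    dp = [[0] * n for _ in range(n)]          # dp[i][j] = max pairs in seq[i..j]
    for length in range(min_loop + 1, n):     # j - i = length
        for i in range(n - length):
            j = i + length
            best = dp[i + 1][j]               # base i left unpaired
            for k in range(i + min_loop + 1, j + 1):   # base i paired with base k
                if (seq[i], seq[k]) in pairs:
                    inside = dp[i + 1][k - 1]
                    after = dp[k + 1][j] if k + 1 <= j else 0
                    best = max(best, 1 + inside + after)
            dp[i][j] = best
    return dp[0][n - 1]

print(nussinov("GGGAAAUCC"))  # 3: e.g. pair positions (0,8), (1,7), (2,6)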

Fine-Grained Optimality of Partially Dynamic Shortest Paths and More

from arXiv: Data Structures and Algorithms

Authors: Barna Saha, Virginia Vassilevska Williams, Yinzhan Xu, Christopher Ye

Single Source Shortest Paths ($\textrm{SSSP}$) is among the most well-studied problems in computer science. In the incremental (resp. decremental) setting, the goal is to maintain distances from a fixed source in a graph undergoing edge insertions (resp. deletions). A long line of research culminated in a near-optimal deterministic $(1 + \varepsilon)$-approximate data structure with $m^{1 + o(1)}$ total update time over all $m$ updates by Bernstein, Probst Gutenberg and Saranurak [FOCS 2021]. However, there has been remarkably little progress on the exact $\textrm{SSSP}$ problem beyond Even and Shiloach's algorithm [J. ACM 1981] for unweighted graphs. For weighted graphs, there are no exact algorithms beyond recomputing $\textrm{SSSP}$ from scratch in $\widetilde{O}(m^2)$ total update time, even for the simpler Single-Source Single-Target Shortest Path problem ($\textrm{stSP}$). Despite this lack of progress, known (conditional) lower bounds only rule out algorithms with amortized update time better than $m^{1/2 - o(1)}$ in dense graphs. In this paper, we give a tight (conditional) lower bound: any partially dynamic exact $\textrm{stSP}$ algorithm requires $m^{2 - o(1)}$ total update time for any sparsity $m$. We thus resolve the complexity of partially dynamic shortest paths, and separate the hardness of exact and approximate shortest paths, giving evidence as to why no non-trivial exact algorithms have been obtained while fast approximation algorithms are known. Moreover, we give tight bounds on the complexity of combinatorial algorithms for several path problems that have been studied in the static setting since the early sixties: Node-weighted shortest paths (studied alongside edge-weighted shortest paths), bottleneck paths (early work dates back to 1960), and earliest arrivals (early work dates back to 1958).

Improved Lower Bounds on the Expected Length of Longest Common Subsequences

from arXiv: Data Structures and Algorithms

Authors: George T. Heineman, Chase Miller, Daniel Reichman, Andrew Salls, Gábor Sárközy, Duncan Soiffer

It has been proven that, when normalized by $n$, the expected length of a longest common subsequence of $d$ random strings of length $n$ over an alphabet of size $\sigma$ converges to some constant that depends only on $d$ and $\sigma$. These values are known as the Chv\'{a}tal-Sankoff constants, and determining their exact values is a well-known open problem. Upper and lower bounds are known for some combinations of $\sigma$ and $d$, with the best lower and upper bounds for the most studied case, $\sigma=2, d=2$, at $0.788071$ and $0.826280$, respectively. Building on previous algorithms for lower-bounding the constants, we implement runtime optimizations, parallelization, and an efficient memory reading and writing scheme to obtain an improved lower bound of $0.792665992$ for $\sigma=2, d=2$. We additionally improve upon almost all previously reported lower bounds for the Chv\'{a}tal-Sankoff constants when either the size of the alphabet, the number of strings, or both are larger than 2.
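
For intuition about where these constants come from (a quick Monte Carlo sketch, not the paper's method, which computes rigorous lower bounds): generate random strings, compute LCS lengths by the classic dynamic program, and normalize by $n$.

# Monte Carlo sketch: noisy point estimate of the Chvatal-Sankoff constant for d = 2.
import random

def lcs_length(a, b):
    # Classic O(|a||b|) LCS dynamic program with a rolling row.
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, 1):
            cur.append(prev[j - 1] + 1 if x == y else max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]

def estimate_constant(n=1000, sigma=2, samples=5, seed=0):
    rng = random.Random(seed)
    total = 0
    for _ in range(samples):
        a = [rng.randrange(sigma) for _ in range(n)]
        b = [rng.randrange(sigma) for _ in range(n)]
        total += lcs_length(a, b)
    return total / (samples * n)

print(estimate_constant())  # a noisy estimate; the true value lies in [0.792665992, 0.826280]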

Cut-Preserving Vertex Sparsifiers for Planar and Quasi-bipartite Graphs

from arXiv: Data Structures and Algorithms

Authors: Yu Chen, Zihan Tan

We study vertex sparsification for preserving cuts. Given a graph $G$ with a subset $|T|=k$ of its vertices called terminals, a \emph{quality-$q$ cut sparsifier} is a graph $G'$ that contains $T$, such that, for any partition $(T_1,T_2)$ of $T$ into non-empty subsets, the value of the min-cut in $G'$ separating $T_1$ from $T_2$ is within factor $q$ from the value of the min-cut in $G$ separating $T_1$ from $T_2$. The construction of cut sparsifiers with good (small) quality and size has been a central problem in graph compression for years. Planar graphs and quasi-bipartite graphs are two important special families studied in this research direction. The main results in this paper are new cut sparsifier constructions for them in the high-quality regime (where $q=1$ or $1+\varepsilon$ for small $\varepsilon>0$). We first show that every planar graph admits a planar quality-$(1+\varepsilon)$ cut sparsifier of size $\tilde O(k/\text{poly}(\varepsilon))$, which is in sharp contrast with the lower bound of $2^{\Omega(k)}$ for the quality-$1$ case. We then show that every quasi-bipartite graph admits a quality-$1$ cut sparsifier of size $2^{\tilde O(k^2)}$. This is only the second graph class shown to improve over the doubly-exponential bound for general graphs (previously, only planar graphs had been shown to have single-exponential-size quality-$1$ cut sparsifiers). Lastly, we show that contraction, a common approach for constructing cut sparsifiers adopted in most previous works, does not always give optimal bounds for cut sparsifiers. We demonstrate this by showing that the optimal size bound for quality-$(1+\varepsilon)$ contraction-based cut sparsifiers for quasi-bipartite graphs lies in the range $[k^{\tilde\Omega(1/\varepsilon)},k^{O(1/\varepsilon^2)}]$, while in previous work an upper bound of $\tilde O(k/\varepsilon^2)$ was achieved via a non-contraction approach.

Almost-Linear Time Algorithms for Decremental Graphs: Min-Cost Flow and More via Duality

from arXiv: Data Structures and Algorithms

Authors: Jan van den Brand, Li Chen, Rasmus Kyng, Yang P. Liu, Simon Meierhans, Maximilian Probst Gutenberg, Sushant Sachdeva

We give the first almost-linear total time algorithm for deciding if a flow of cost at most $F$ still exists in a directed graph, with edge costs and capacities, undergoing decremental updates, i.e., edge deletions, capacity decreases, and cost increases. This implies almost-linear time algorithms for approximating the minimum-cost flow value and $s$-$t$ distance on such decremental graphs. Our framework additionally allows us to maintain decremental strongly connected components in almost-linear time deterministically. These algorithms also improve over the current best known runtimes for statically computing minimum-cost flow, in both the randomized and deterministic settings. We obtain our algorithms by taking the dual perspective, which yields cut-based algorithms. More precisely, our algorithm computes the flow via a sequence of $m^{1+o(1)}$ dynamic min-ratio cut problems, the dual analog of the dynamic min-ratio cycle problem that underlies recent fast algorithms for minimum-cost flow. Our main technical contribution is a new data structure that returns an approximately optimal min-ratio cut in amortized $m^{o(1)}$ time by maintaining a tree-cut sparsifier. This is achieved by devising a new algorithm to maintain the dynamic expander hierarchy of [Goranci-R\"{a}cke-Saranurak-Tan, SODA 2021] that also works in capacitated graphs. All our algorithms are deterministic, though they can be sped up further using randomized techniques while still working against an adaptive adversary.

From Data Completion to Problems on Hypercubes: A Parameterized Analysis of the Independent Set Problem

from arXiv: Data Structures and Algorithms

Authors: Eduard Eiben, Robert Ganian, Iyad Kanj, Sebastian Ordyniak, Stefan Szeider

Several works have recently investigated the parameterized complexity of data completion problems, motivated by their applications in machine learning, and clustering in particular. Interestingly, these problems can be equivalently formulated as classical graph problems on induced subgraphs of powers of partially-defined hypercubes. In this paper, we follow up on this recent direction by investigating the Independent Set problem on this graph class, which has been studied in the data science setting under the name Diversity. We obtain a comprehensive picture of the problem's parameterized complexity and establish its fixed-parameter tractability w.r.t. the solution size plus the power of the hypercube. Given that several such FO-definable problems have been shown to be fixed-parameter tractable on the considered graph class, one may ask whether fixed-parameter tractability could be extended to capture all FO-definable problems. We answer this question in the negative by showing that FO model checking on induced subgraphs of hypercubes is as difficult as FO model checking on general graphs.

NPA Hierarchy for Quantum Isomorphism and Homomorphism Indistinguishability

from arXiv: Data Structures and Algorithms

Authors: Prem Nigam Kar, David E. Roberson, Tim Seppelt, Peter Zeman

Man\v{c}inska and Roberson~[FOCS'20] showed that two graphs are quantum isomorphic if and only if they are homomorphism indistinguishable over the class of planar graphs. Atserias et al.~[JCTB'19] proved that quantum isomorphism is undecidable in general. The NPA hierarchy gives a sequence of semidefinite programming relaxations of quantum isomorphism. Recently, Roberson and Seppelt~[ICALP'23] obtained a homomorphism indistinguishability characterization of the feasibility of each level of the Lasserre hierarchy of semidefinite programming relaxations of graph isomorphism. We prove a quantum analogue of this result by showing that each level of the NPA hierarchy of SDP relaxations for quantum isomorphism of graphs is equivalent to homomorphism indistinguishability over an appropriate class of planar graphs. By combining the convergence of the NPA hierarchy with the fact that the union of these graph classes is the set of all planar graphs, we are able to give a new proof of the result of Man\v{c}inska and Roberson~[FOCS'20] that avoids the use of the theory of quantum groups. This homomorphism indistinguishability characterization also allows us to give a randomized polynomial-time algorithm deciding exact feasibility of each fixed level of the NPA hierarchy of SDP relaxations for quantum isomorphism.
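
For context, the notion underlying all of these results (a standard definition, not specific to this paper): writing $\mathrm{hom}(F, G)$ for the number of graph homomorphisms from $F$ to $G$, two graphs $G$ and $H$ are homomorphism indistinguishable over a class $\mathcal{F}$ if

\[
\mathrm{hom}(F, G) = \mathrm{hom}(F, H) \quad \text{for every } F \in \mathcal{F}.
\]

Man\v{c}inska and Roberson's theorem is the case where $\mathcal{F}$ is the class of all planar graphs; the present paper identifies, for each level of the NPA hierarchy, an appropriate subclass of planar graphs playing the same role.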
