
Theory of Computing Report

Thursday, April 25

Applied Algorithms for Machine Learning: A Workshop on Future of Computation

from CS Theory Events


June 10-12, 2024 Paris, France https://aaforml.com In this workshop, we present a series of talks on the intersection between applied algorithms and machine learning, two indispensable areas of future computation. We will cover a range of specific topics, including randomized and approximation algorithms; large-scale machine learning; distributed and federated learning; learning-augmented algorithms; algorithms for fairness … Continue reading Applied Algorithms for Machine Learning: A Workshop on Future of Computation

By shacharlovett

Workshop on Local Algorithms

from CS Theory Events


August 5-8, 2024 Simons Institute (Berkeley, USA) https://ptreview.sublinear.info/2024/04/wola24-dates-registration-and-call-for-contributed-talks-and-posters/ Submission deadline: May 3, 2024 The 8th edition of WoLA, the Workshop on Local Algorithms, will be taking place on August 5-7, 2024 at the Simons Institute, as part of the Simons Institute's summer program on Sublinear Algorithms.

By shacharlovett

A nearly-$4\log n$ depth lower bound for formulas with restriction on top

from arXiv: Computational Complexity

Authors: Hao Wu

One of the major open problems in complexity theory is to demonstrate an explicit function which requires super-logarithmic depth, a.k.a. the $\mathbf{P}$ versus $\mathbf{NC^1}$ problem. The current best depth lower bound is $(3-o(1))\cdot \log n$, and it is wide open how to prove a super-$3\log n$ depth lower bound. Recently, Mihajlin and Sofronova (CCC'22) showed that if we consider formulas with a restriction on top, we can break the $3\log n$ barrier. Formally, they prove that there exist two functions $f:\{0,1\}^n \rightarrow \{0,1\}, g:\{0,1\}^n \rightarrow \{0,1\}^n$ such that for any constant $0<\alpha<0.4$ and constant $0<\epsilon<\alpha/2$, their XOR composition $f(g(x)\oplus y)$ is not computable by an AND of $2^{(\alpha-\epsilon)n}$ formulas of size at most $2^{(1-\alpha/2-\epsilon)n}$. This implies that a modified version of the Andreev function is not computable by any circuit of depth $(3.2-\epsilon)\log n$ with the restriction that the top $0.4-\epsilon$ layers consist only of AND gates, for any small constant $\epsilon>0$. They ask whether the parameter $\alpha$ can be pushed up to nearly $1$, which would imply a nearly-$3.5\log n$ depth lower bound. In this paper, we provide a stronger answer to their question. We show that there exist two functions $f:\{0,1\}^n \rightarrow \{0,1\}, g:\{0,1\}^n \rightarrow \{0,1\}^n$ such that for any constant $0<\alpha<2-o(1)$, their XOR composition $f(g(x)\oplus y)$ is not computable by an AND of $2^{\alpha n}$ formulas of size at most $2^{(1-\alpha/2-o(1))n}$. This implies a $(4-o(1))\log n$ depth lower bound with the restriction that the top $2-o(1)$ layers consist only of AND gates. We prove this by observing that one crucial component in Mihajlin and Sofronova's work, the well-mixed set of functions, can be significantly simplified and thus improved. With this observation and a more careful analysis, we obtain these nearly tight results.


A Review on Message Complexity of the Algorithms for Clock Synchronization in Distributed Systems

from arXiv: Computational Complexity

Authors: Chandeepa Dissanayake, Chanuka Algama

In this work, we present an extensive analysis of clock synchronization algorithms, with a specific focus on message complexity. We begin by introducing fundamental concepts in clock synchronization, such as the Byzantine generals problem, along with specific notions like clock accuracy, precision, skew, offset, timestamping, and clock drift estimation. We then describe the concept of logical clocks and discuss their implementation in distributed systems, highlighting their significance and the various approaches. The paper then examines four prominent algorithms: Lamport's Algorithm, the Ricart-Agrawala Algorithm, the Vector Clocks Algorithm, and Cristian's Algorithm. Special attention is given to the analysis of message complexity, providing insights into the efficiency of each algorithm. Finally, we compare the message complexities of the discussed algorithms.
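
The algorithms listed are classical; as a reminder of what a logical clock does, here is a minimal sketch of Lamport's rule (an illustration of the textbook algorithm the survey reviews, not code from the paper):

```python
# Minimal sketch of Lamport's logical clock rule (illustration only, not code
# from the paper): every event bumps the local counter, and a receive jumps
# past both the local counter and the message's timestamp.

class LamportClock:
    def __init__(self):
        self.time = 0

    def local_event(self):
        self.time += 1
        return self.time

    def send(self):
        # A send is an event; the message carries the current timestamp.
        self.time += 1
        return self.time

    def receive(self, msg_time):
        # On receipt, move strictly past both clocks.
        self.time = max(self.time, msg_time) + 1
        return self.time

# Two processes exchanging one message: the receive is ordered after the send.
p, q = LamportClock(), LamportClock()
t_send = p.send()           # p: 1
q.local_event()             # q: 1
t_recv = q.receive(t_send)  # q: max(1, 1) + 1 = 2
assert t_send < t_recv
```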


Minimum Consistent Subset in Trees and Interval Graphs

from arXiv: Computational Geometry

Authors: Aritra Banik, Sayani Das, Anil Maheshwari, Bubai Manna, Subhas C Nandy, Krishna Priya K M, Bodhayan Roy, Sasanka Roy, Abhishek Sahu

In the Minimum Consistent Subset (MCS) problem, we are presented with a connected simple undirected graph $G=(V,E)$, consisting of a vertex set $V$ of size $n$ and an edge set $E$. Each vertex in $V$ is assigned a color from the set $\{1,2,\ldots, c\}$. The objective is to determine a subset $V' \subseteq V$ with minimum possible cardinality, such that for every vertex $v \in V$, at least one of its nearest neighbors in $V'$ (measured in terms of the hop distance) shares the same color as $v$. The decision problem, indicating whether there exists a subset $V'$ of cardinality at most $l$ for some positive integer $l$, is known to be NP-complete even for planar graphs. In this paper, we establish that the MCS problem for trees, when the number of colors $c$ is considered an input parameter, is NP-complete. We propose a fixed-parameter tractable (FPT) algorithm for MCS on trees running in $O(2^{6c}n^6)$ time, significantly improving the currently best-known algorithm whose running time is $O(2^{4c}n^{2c+3})$. In an effort to comprehensively understand the computational complexity of the MCS problem across different graph classes, we extend our investigation to interval graphs. We show that it remains NP-complete for interval graphs, thus enriching graph classes where MCS remains intractable.
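
To make the consistency condition concrete, here is a brute-force sketch (hypothetical helper code, not from the paper) that checks it via BFS hop distances and searches over all subsets; it is only meant for toy instances:

```python
# Hypothetical helper, not from the paper: brute-force check of the
# consistency condition stated in the abstract, for a small connected graph
# given as an adjacency list.
from collections import deque
from itertools import combinations

def hop_distances(adj, src):
    """BFS hop distances from src; adj maps vertex -> iterable of neighbors."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

def is_consistent(adj, color, subset):
    """Every vertex must have a nearest vertex of `subset` with its own color."""
    subset = set(subset)
    for v in adj:
        dist = hop_distances(adj, v)
        dmin = min(dist[u] for u in subset)
        if not any(dist[u] == dmin and color[u] == color[v] for u in subset):
            return False
    return True

def minimum_consistent_subset(adj, color):
    """Exponential-time search over all subsets; only sensible for tiny instances."""
    vertices = list(adj)
    for k in range(1, len(vertices) + 1):
        for cand in combinations(vertices, k):
            if is_consistent(adj, color, cand):
                return set(cand)

# Path a-b-c with colors 1, 1, 2: no single vertex works, but {a, c} does.
adj = {'a': ['b'], 'b': ['a', 'c'], 'c': ['b']}
color = {'a': 1, 'b': 1, 'c': 2}
print(minimum_consistent_subset(adj, color))  # {'a', 'c'}
```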


A Note on Approximating Weighted Nash Social Welfare with Additive Valuations

from arXiv: Data Structures and Algorithms

Authors: Yuda Feng, Shi Li

We give the first $O(1)$-approximation for the weighted Nash Social Welfare problem with additive valuations. The approximation ratio we obtain is $e^{1/e} + \epsilon \approx 1.445 + \epsilon$, which matches the best known approximation ratio for the unweighted case \cite{BKV18}. Both our algorithm and analysis are simple. We solve a natural configuration LP for the problem, and obtain the allocation of items to agents using a randomized version of the Shmoys-Tardos rounding algorithm developed for unrelated machine scheduling problems. In the analysis, we show that the approximation ratio of the algorithm is at most the worst gap between the Nash social welfare of the optimum allocation and that of an EF1 allocation, for an unweighted Nash Social Welfare instance with identical additive valuations. This was shown to be at most $e^{1/e} \approx 1.445$ by Barman et al., leading to our approximation ratio.
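
For concreteness, the objective itself is easy to state in code; this is only the weighted Nash social welfare objective under additive valuations, not the paper's algorithm:

```python
# Weighted Nash social welfare of an allocation under additive valuations
# (illustration of the objective only, not the paper's algorithm).
import math

def weighted_nsw(v, weights, allocation):
    """Weighted geometric mean of bundle values: (prod_i u_i^{w_i})^{1/sum_i w_i}."""
    total_w = sum(weights)
    log_sum = 0.0
    for i, bundle in enumerate(allocation):
        u_i = sum(v[i][j] for j in bundle)   # additive valuation
        if u_i == 0:
            return 0.0
        log_sum += weights[i] * math.log(u_i)
    return math.exp(log_sum / total_w)

# Two agents, three items; the second allocation has higher NSW because it
# balances the agents' utilities.
v = [[4, 1, 1], [1, 3, 3]]
print(weighted_nsw(v, [1, 1], [{0, 1, 2}, set()]))  # one agent gets nothing -> 0
print(weighted_nsw(v, [1, 1], [{0}, {1, 2}]))       # sqrt(4 * 6) ~ 4.90
```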


Online Disjoint Set Covers: Randomization is not Necessary

from arXiv: Data Structures and Algorithms

Authors: Marcin Bienkowski, Jarosław Byrka, Łukasz Jeż

In the online disjoint set covers problem, the edges of a hypergraph are revealed online, and the goal is to partition them into a maximum number of disjoint set covers. That is, n nodes of a hypergraph are given at the beginning, and then a sequence of hyperedges (subsets of [n]) is presented to an algorithm. For each hyperedge, an online algorithm must assign a color (an integer). Once an input terminates, the gain of the algorithm is the number of colors that correspond to valid set covers (i.e., the union of hyperedges that have that color contains all n nodes). We present a deterministic online algorithm that is O(log^2 n)-competitive, exponentially improving on the previous bound of O(n) and matching the performance of the best randomized algorithm by Emek et al. [ESA 2019]. For color selection, our algorithm uses a novel potential function, which can be seen as an online counterpart of the derandomization method of conditional probabilities and pessimistic estimators. There are only a few cases where derandomization has been successfully used in the field of online algorithms. In contrast to previous approaches, our result extends this tool to tackle the following new challenges: (i) the potential function derandomizes not only the Chernoff bound, but also the coupon collector's problem, (ii) the value of OPT of the maximization problem is not bounded a priori, and (iii) we do not produce a fractional solution first, but work directly on the input.
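
As a concrete illustration of the objective (a hypothetical offline helper, not from the paper), the gain of a coloring can be computed as follows:

```python
# Hypothetical helper, not from the paper: compute the objective ("gain") of a
# coloring in the online disjoint set covers problem -- the number of colors
# whose hyperedges together cover all n nodes.
def gain(n, hyperedges, colors):
    """hyperedges: list of sets over {0,...,n-1}; colors[k] = color of edge k."""
    covered = {}
    for edge, c in zip(hyperedges, colors):
        covered.setdefault(c, set()).update(edge)
    return sum(1 for nodes in covered.values() if len(nodes) == n)

# Four hyperedges over n = 3 nodes, split into two colors, each a valid cover.
edges = [{0, 1}, {2}, {0, 2}, {1}]
print(gain(3, edges, [1, 1, 2, 2]))  # 2
```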


Renting Servers for Multi-Parameter Jobs in the Cloud

from arXiv: Data Structures and Algorithms

Authors: Yaqiao Li, Mahtab Masoori, Lata Narayanan, Denis Pankratov

We study the Renting Servers in the Cloud problem (RSiC) in multiple dimensions. In this problem, a sequence of multi-parameter jobs must be scheduled on servers that can be rented on demand. Each job has an arrival time, a finishing time, and a multi-dimensional size vector that specifies its resource demands. Each server has a multi-dimensional capacity, and jobs can be scheduled on a server as long as, in each dimension, the sum of the sizes of the jobs does not exceed the capacity of the server in that dimension. The goal is to minimize the total rental time of servers needed to process the job sequence. AF algorithms do not rent new servers to accommodate a job unless they have to. We introduce a sub-family of AF algorithms called monotone AF algorithms. We show that this family has a tight competitive ratio of $\Theta(d\mu)$, where $d$ is the dimension of the problem and $\mu$ is the ratio between the maximum and minimum duration of jobs in the input sequence. We also show that upper bounds for the RSiC problem obey the direct-sum property with respect to the dimension $d$; that is, we show how to transform $1$-dimensional algorithms for RSiC to work in the $d$-dimensional setting, with the competitive ratio scaling by a factor of $d$. As a corollary, we obtain an $O(d\sqrt{\log \mu})$ upper bound for $d$-dimensional clairvoyant RSiC. We also establish a lower bound of $\widetilde{\Omega}(d\mu)$ for both deterministic and randomized algorithms for $d$-dimensional non-clairvoyant RSiC, under the assumption that $\mu \le \log d - 2$. Lastly, we propose a natural greedy algorithm called Greedy. Greedy is a clairvoyant algorithm that belongs to the monotone AF family and achieves a competitive ratio of $\Theta(d\mu)$. Our experimental results indicate that Greedy matches or outperforms all other existing algorithms, for almost all the settings of arrival rates and values of $\mu$ and $d$ that we implemented.


Wednesday, April 24

Is Persistence an Anachronism?

from Computational Complexity

Guest post by Martin Bullinger

Very recently, Vijay Vazirani's paper A Theory of Alternating Paths and Blossoms, from the Perspective of Minimum Length got accepted to Mathematics of Operations Research. For the first time, it gives a complete and correct proof that the Micali-Vazirani algorithm finds a maximum cardinality matching in time \(\mathcal O\left(m\sqrt{n}\right)\). I would like to give an account of the extraordinary story of this proof and how Vazirani's contribution inspires persistence.

My fascination with matching started back during my undergrad, when I gave a talk on Edmonds' blossom algorithm. It was at this time that I first heard about the Micali-Vazirani (MV) algorithm. Naturally, I was quite excited when I got to know Vazirani personally years later. When I talked to him about the MV algorithm, I was, however, shocked: Vazirani admitted that even to that day, there did not exist a complete proof of its correctness. How can a theoretical result be accepted to FOCS without a proof?

Now, 44 years after the publication of the algorithm, a proof exists and has been peer-reviewed in great depth. But why did it take so long? Apparently, some results just need time. Sometimes a lot of time. Think of Fermat's Last Theorem, whose proof took 358 years! So what is the story behind the MV algorithm? It can without a doubt be seen as a lifework. Together with his fellow PhD student Silvio Micali, Vazirani discovered it in the first year of his PhD in 1979-80. It was published in the proceedings of FOCS 1980 without even an attempted proof. The first proof attempt by Vazirani was published in 1994 in Combinatorica. Unfortunately, this proof turned out to be flawed. It took another 30 years until his current paper.

What kept Vazirani going for so long? In the acknowledgements of his paper, he thanks matching theory for its gloriously elegant structure. Vazirani was driven by his passion for the subject matter---but passion by itself can only go so far. Even more important was his belief in the correctness of the algorithm and the theory, which he had broadly outlined in his 1994 paper. Similar to Andrew Wiles' story, his perseverance led him to the idea which clinched the proof. In Vazirani's case, this was to use the new algorithmic idea of double depth-first search, which forms the core of the MV algorithm, and now, its proof as well. But Vazirani's result is also the story of an excellent research environment. Finding deep results requires colleagues or friends to discuss ideas with. Vazirani had these in the form of strong postdocs and PhD students. About ten years ago, he had been discussing ideas towards his proof with his former postdoc Ruta Mehta, and in the last three years, he discussed the final touches of his proof with his current PhD student Rohith Gangam. Needless to say, both of them gained a lot from these discussions.

So why should we care about the MV algorithm? I have several reasons. First, without doubt, it is a historic result within combinatorial optimization. Matching is one of the most fundamental objects in discrete mathematics, and we keep finding new applications for it, for example, in health, labor markets, and modern-day matching markets on the Internet, basically in every part of our lives. But there is more. Once again, one can look at Vazirani's paper, where he describes the impact of matching on the development of the theory of algorithms: matching theory has led to foundational concepts like the definition of the complexity classes \(\mathcal P\) (Edmonds, 1965a) and \(\# \mathcal P\) (Valiant, 1979), the primal-dual paradigm (Kuhn, 1955), and polyhedral combinatorics (Edmonds, 1965b). The impact of matching on complexity theory was an earlier topic of this blog.

Despite being around for decades, the MV algorithm is still the fastest known algorithm for computing a maximum cardinality matching. This is surprising, to put it mildly. Similar to many other fundamental problems in combinatorial optimization, I would have expected the discovery of better algorithms in the last four decades. Why has this not happened? Vazirani appears to have gotten to the essence of the problem: a profound theory that interleaves algorithmic invariants and graph-theoretic concepts. It seems to be the kind of theory which would play an active role in the field of combinatorial optimization.

However, Vazirani's result proves something else, possibly even more important: the massive gains to be made by single-minded persistence. In a world in which departments and promotion procedures focus on publishing large numbers of papers, it seems impossible to work on one result for more than a year, let alone for decades. Vazirani managed to achieve both: pursue his passion and get the unfinished job done, but not let it come in the way of the rest of his otherwise-active research career. As a young researcher, this inspires me! In the end, it is through such persistence that science will take big steps forward.

This blog post evolved from many enjoyable discussions, which I had with Vijay Vazirani during a research stay at UC Irvine in spring 2024. I am grateful to Ruta Mehta for feedback on the initial version of this post. Vazirani recently presented his paper in a mini series of two talks available online.

By Lance Fortnow


assistant/associate professor at University of Sheffield (apply by May 20, 2024)

from CCI: jobs

This is an exciting opportunity for a Lecturer or Senior Lecturer in Algorithms at the University of Sheffield. Working in the Department of Computer Science, you will join our Foundations of Computation (FOX) Group. Its research topics range from the theoretical mathematical foundations that underpin computer science to their applications in real world contexts.

Website: https://www.jobs.ac.uk/job/DHG612/lecturer-senior-lecturer-in-algorithms
Email: feldmann.a.e@gmail.com

By shacharlovett

Popperian Falsification

from Ben Recht

Meehl's Philosophical Psychology, Lecture 2, Part 1

Meehl’s second lecture is almost entirely about Karl Popper and his program of refutation. Popper is the scientist’s favorite philosopher, as he conceives the scientist as a heroic truth-seeker, carving out understanding with the sword of falsification. I’ve been guilty of falling for Popper’s flirting! If you don’t think too deeply about it, Popper’s view of science is very romantic. Great thinkers put theories up for scrutiny and do ingenious experiments rendering them false, rapidly revealing an essential theoretical core. But, as we’ll see, not only does no science work this way, but if you probe a scientist for more than a few minutes, they’ll agree they aren’t in the falsification business. Let me return to these social issues after first describing the logical idea behind falsification.

As is always the case with logic, we have to start with some stodgy formal notation. I’m not going to use Meehl’s notation as I’d like to avoid the Emoji & Symbols Viewer when possible. But I think what I’ve chosen should be fine. If P and Q are statements, then I’ll write ~P to denote the negation of P, and P-->Q will denote material implication. Material implication is a logical rule that we colloquially read as “if P, then Q.” You could say that Q is necessary for P, or P is sufficient for Q. If you really like logic, the implication is equivalent to “~P or Q.” Or, more instructively, “~(P and ~Q).” Bah, I don’t like logic! But fortunately we won’t need much more than this to discuss Popper.

The final piece we need is the hypothetical syllogism.

The way to read a chart like this is “A is true. B is true. Therefore C is true.” A is some rule, B is a truth statement, and C is the implication.

There are four combinations from the “if P, then Q” relationship.
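
The table itself is not reproduced here; reconstructed from the corner descriptions that follow, the four combinations are:

\[
\begin{array}{c|cc}
 & \text{antecedent} & \text{consequent}\\ \hline
\text{affirming} & \frac{P \rightarrow Q,\;\; P}{\therefore\; Q} & \frac{P \rightarrow Q,\;\; Q}{\therefore\; P}\\[2ex]
\text{denying} & \frac{P \rightarrow Q,\;\; \lnot P}{\therefore\; \lnot Q} & \frac{P \rightarrow Q,\;\; \lnot Q}{\therefore\; \lnot P}
\end{array}
\]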

In the second line of each of these syllogisms, the truth of one of the propositions is asserted. Only two of these correspond to valid logical deduction. The top left corner is called modus ponens or affirming the antecedent. If P implies Q and P is true, then Q must be true. This is all well and good. 

The lower left corner is called denying the antecedent. It doesn’t get a fancy Latin name as it’s not valid. Certainly, just because P implies Q doesn’t mean that Q can’t happen when P doesn’t happen. When the Patriots win a lot of games, it makes me happy. I’m happy. Checks the Patriots’ record in 2023

Now, the really interesting cell in this table is the upper right. It is not valid. Just because P implies Q does not mean that Q implies P. “If I listen to Taylor Swift, I get a headache. I have a headache. Therefore I listened to Taylor Swift.” Or whatever. This implication is called the “converse fallacy” or “affirming the consequent.” We learn that it’s invalid in high school geometry at the latest.
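
A brute-force check over the four truth assignments makes the same point (my own sanity check, not from the post): an argument form is valid exactly when no assignment makes its premises true and its conclusion false.

```python
# Brute-force validity check over the four (P, Q) truth assignments
# (illustration only, not from the post).
from itertools import product

def implies(p, q):
    return (not p) or q   # material implication "if P then Q"

def valid(premises, conclusion):
    return all(conclusion(p, q)
               for p, q in product([True, False], repeat=2)
               if all(prem(p, q) for prem in premises))

rule = lambda p, q: implies(p, q)
print(valid([rule, lambda p, q: p],     lambda p, q: q))      # modus ponens: True
print(valid([rule, lambda p, q: q],     lambda p, q: p))      # affirming the consequent: False
print(valid([rule, lambda p, q: not p], lambda p, q: not q))  # denying the antecedent: False
print(valid([rule, lambda p, q: not q], lambda p, q: not p))  # modus tollens: True
```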

But when you think about it, science and engineering are entirely built upon affirming the consequent. Typical reasoning in science goes something like this: “If my theory is true, then I’ll observe this outcome of my experiment. I observe exactly this outcome. Therefore, my theory is true.” We do this all the time. “If Newton’s Laws are true, bowling balls and feathers drop at the same rate in a vacuum. I see that bowling balls and feathers drop at the same rate in a vacuum. Therefore, I conclude Newton’s Laws are awesome.”

Huh, this can’t be the way things work, can it? Science can’t be built upon irrationality! Popper certainly didn’t think so. But let me get back to Popper in a second.

Even if it’s the first logical fallacy we learn, our entire society is built upon affirming the consequent. We all agree to believe the future will resemble the past, at least somewhat. This belief will always just be your opinion, man. Postmodern Machine Learning Dude Ben learned to stop worrying and embrace Hume’s Problem of Induction. It’s unavoidable. The sun will come up tomorrow. We build technology around prediction, assuming the future is like the past. Our society affirms the consequent. We’re delightfully arational. The Dude abides.

If you don’t want to embrace inductive anarchy like me, Meehl offers a probabilistic fix, one I incessantly write about on this blog:

“All empirical inference is only probable. That's why it differs from inference in mathematics, set theory, pure logic. That's why no matter how much evidence you have about facts, the theory can never be said to be proved in the strong sense of Euclid. It's only proved in the sense of rendered more likely, rendered more credible, supported, whatever you want to say.”

For this reason, probability will necessarily play a central role in Meehl’s course. 

OK, but let’s get back to Popper. Popper hated inductive reasoning. He knew it was logically invalid. And he thought that we were just confused by Hume’s ramblings and could do science with purely logical deduction. Popper’s scientific logic is based on the fourth hypothetical syllogism: “If P, then Q. Not Q. Therefore Not P.” This is denying the consequent, also known as modus tollens. It gets a fancy Latin name as it’s logically valid. It forms the logical basis of our proofs by contradiction. And Popper tried to make it the basis of scientific inference.

Popper figured that he could solve the problem of induction by denying that induction exists. Bold! Or, at least, you could do science without induction. “You don’t support theories with facts.” For Popper, what is essential about science is its falsifiability. A scientist honorably tells their colleagues what sorts of observations would undermine their theory. And then the other scientists do these experiments. The unrefuted theories are left standing.

I get why this is appealing, of course. We like to teach the scientific method as about generating alternative hypotheses and finding clever experiments to show these are false. After all, didn’t Galileo actually do that Leaning Tower of Pisa experiment to prove Aristotle wrong? Though Popper and the Logical Positivists were allergic to history, they were clearly inspired by certain historical anecdotes. But tomorrow, I’ll dive into some alternative anecdotes showing how science has never been about falsification. How scientists cling to theories despite significant evidence against them. And how Popper and others tried to patch this up.


By Ben Recht

Positive Moments Forever: Undecidable and Decidable Cases

from arXiv: Computational Complexity

Authors: Gemma De les Coves, Joshua Graf, Andreas Klingler, Tim Netzer

Is there an algorithm to determine attributes such as positivity or non-zeroness of linear recurrence sequences? This long-standing question is known as Skolem's problem. In this paper, we study the complexity of an equivalent problem, namely the (generalized) moment membership problem for matrices. We show that this problem is decidable for orthogonal, unitary and real eigenvalue matrices, and undecidable for matrices over certain commutative and non-commutative polynomial rings. Our results imply that the positivity problem for simple unitary linear recurrence sequences is decidable, and is undecidable for linear recurrence sequences over the ring of commutative polynomials. As a byproduct, we prove a free version of Polya's theorem.
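
To fix terminology (a naive illustration, not the paper's method): checking positivity of finitely many terms of a linear recurrence is trivial, but it never settles the question for all $n$; the decidability results are about answering that question for the entire infinite sequence.

```python
# Naive illustration, not the paper's method: generate the first few terms of a
# linear recurrence and test them for positivity. This can only ever certify a
# *violation* of positivity; it cannot certify positivity for all n.
def recurrence_terms(coeffs, initial, count):
    """u_n = coeffs[0]*u_{n-1} + ... + coeffs[k-1]*u_{n-k}, given k initial terms."""
    terms = list(initial)
    k = len(coeffs)
    while len(terms) < count:
        terms.append(sum(c * t for c, t in zip(coeffs, terms[-1:-k-1:-1])))
    return terms[:count]

# u_n = 2*u_{n-1} - 2*u_{n-2}, u_0 = u_1 = 1: the terms quickly go non-positive.
terms = recurrence_terms([2, -2], [1, 1], 12)
print(terms)
print(all(t > 0 for t in terms))  # False -- this sequence is not positive
```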


Transformers Can Represent $n$-gram Language Models

from arXiv: Computational Complexity

Authors: Anej Svete, Ryan Cotterell

Plenty of existing work has analyzed the abilities of the transformer architecture by describing its representational capacity with formal models of computation. However, the focus so far has been on analyzing the architecture in terms of language \emph{acceptance}. We contend that this is an ill-suited problem in the study of \emph{language models} (LMs), which are definitionally \emph{probability distributions} over strings. In this paper, we focus on the relationship between transformer LMs and $n$-gram LMs, a simple and historically relevant class of language models. We show that transformer LMs using the hard or sparse attention mechanisms can exactly represent any $n$-gram LM, giving us a concrete lower bound on their probabilistic representational capacity. This provides a first step towards understanding the mechanisms that transformer LMs can use to represent probability distributions over strings.
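
For readers who have not seen one in a while, an $n$-gram LM is just a table of conditional probabilities estimated from counts; here is a toy bigram ($n=2$) sketch (standard textbook construction, not code from the paper):

```python
# Toy bigram (n = 2) language model from raw counts (textbook construction,
# not code from the paper).
from collections import Counter, defaultdict

def train_bigram(corpus):
    """Maximum-likelihood bigram model: p(w | prev) from counts."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        tokens = ["<bos>"] + sentence.split() + ["<eos>"]
        for prev, cur in zip(tokens, tokens[1:]):
            counts[prev][cur] += 1
    return {prev: {w: c / sum(ctr.values()) for w, c in ctr.items()}
            for prev, ctr in counts.items()}

model = train_bigram(["the cat sat", "the cat ran", "a dog ran"])
print(model["the"])   # {'cat': 1.0}
print(model["cat"])   # {'sat': 0.5, 'ran': 0.5}
```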


Pseudorandom Permutations from Random Reversible Circuits

from arXiv: Computational Complexity

Authors: William He, Ryan O'Donnell

We study pseudorandomness properties of permutations on $\{0,1\}^n$ computed by random circuits made from reversible $3$-bit gates (permutations on $\{0,1\}^3$). Our main result is that a random circuit of depth $n \cdot \tilde{O}(k^2)$, with each layer consisting of $\approx n/3$ random gates in a fixed nearest-neighbor architecture, yields almost $k$-wise independent permutations. The main technical component is showing that the Markov chain on $k$-tuples of $n$-bit strings induced by a single random $3$-bit nearest-neighbor gate has spectral gap at least $1/n \cdot \tilde{O}(k)$. This improves on the original work of Gowers [Gowers96], who showed a gap of $1/\mathrm{poly}(n,k)$ for one random gate (with non-neighboring inputs), and on subsequent work [HMMR05, BH08] that improved the gap to $\Omega(1/n^2k)$ in the same setting. From the perspective of cryptography, our result can be seen as a particularly simple/practical block cipher construction that gives provable statistical security against attackers with access to $k$ input-output pairs within few rounds. We also show that the Luby--Rackoff construction of pseudorandom permutations from pseudorandom functions can be implemented with reversible circuits. From this, we make progress on the complexity of the Minimum Reversible Circuit Size Problem (MRCSP), showing that block ciphers of fixed polynomial size are computationally secure against arbitrary polynomial-time adversaries, assuming the existence of one-way functions (OWFs).


Complexity of Planar Graph Orientation Consistency, Promise-Inference, and Uniqueness, with Applications to Minesweeper Variants

from arXiv: Computational Complexity

Authors: MIT Hardness Group, Della Hendrickson, Andy Tockman

We study three problems related to the computational complexity of the popular game Minesweeper. The first is consistency: given a set of clues, is there any arrangement of mines that satisfies it? This problem has been known to be NP-complete since 2000, but our framework proves it as a side effect. The second is inference: given a set of clues, is there any cell that the player can prove is safe? The coNP-completeness of this problem has been in the literature since 2011, but we discovered a flaw that we believe is present in all published results, and we provide a fixed proof. Finally, the third is solvability: given the full state of a Minesweeper game, can the player win the game by safely clicking all non-mine cells? This problem has not yet been studied, and we prove that it is coNP-complete.
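
The consistency question is simple to state in code; here is a brute-force sketch (hypothetical helper, not from the paper, and exponential in the number of unknown cells):

```python
# Hypothetical brute-force checker, not from the paper: given numeric clues on
# a grid, is there any placement of mines on the remaining cells consistent
# with every clue? (Clue cells are themselves assumed mine-free.)
from itertools import product

def neighbors(cell, rows, cols):
    r, c = cell
    return [(r + dr, c + dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1)
            if (dr, dc) != (0, 0) and 0 <= r + dr < rows and 0 <= c + dc < cols]

def consistent(rows, cols, clues):
    """clues: dict mapping (row, col) -> number of adjacent mines."""
    unknown = [cell for cell in product(range(rows), range(cols))
               if cell not in clues]
    for bits in product([False, True], repeat=len(unknown)):   # exponential!
        mines = {cell for cell, is_mine in zip(unknown, bits) if is_mine}
        if all(sum(nb in mines for nb in neighbors(cell, rows, cols)) == k
               for cell, k in clues.items()):
            return True
    return False

# A 1x3 strip of cells.
print(consistent(1, 3, {(0, 0): 1, (0, 2): 1}))  # True: a mine at (0, 1) fits both clues
print(consistent(1, 3, {(0, 0): 2}))             # False: (0, 0) has only one neighbor
```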


PHLP: Sole Persistent Homology for Link Prediction -- Interpretable Feature Extraction

from arXiv: Computational Geometry

Authors: Junwon You, Eunwoo Heo, Jae-Hun Jung

Link prediction (LP), inferring the connectivity between nodes, is a significant research area in graph data, where a link represents essential information on relationships between nodes. Although graph neural network (GNN)-based models have achieved high performance in LP, understanding why they perform well is challenging because most comprise complex neural networks. We employ persistent homology (PH), a topological data analysis method that helps analyze the topological information of graphs, to explain the reasons for the high performance. We propose a novel method that employs PH for LP (PHLP) focusing on how the presence or absence of target links influences the overall topology. The PHLP utilizes the angle hop subgraph and new node labeling called degree double radius node labeling (Degree DRNL), distinguishing the information of graphs better than DRNL. Using only a classifier, PHLP performs similarly to state-of-the-art (SOTA) models on most benchmark datasets. Incorporating the outputs calculated using PHLP into the existing GNN-based SOTA models improves performance across all benchmark datasets. To the best of our knowledge, PHLP is the first method of applying PH to LP without GNNs. The proposed approach, employing PH while not relying on neural networks, enables the identification of crucial factors for improving performance.


Neural Slicer for Multi-Axis 3D Printing

from arXiv: Computational Geometry

Authors: Tao Liu, Tianyu Zhang, Yongxue Chen, Yuming Huang, Charlie C. L. Wang

We introduce a novel neural network-based computational pipeline as a representation-agnostic slicer for multi-axis 3D printing. This advanced slicer can work on models with diverse representations and intricate topology. The approach involves employing neural networks to establish a deformation mapping, defining a scalar field in the space surrounding an input model. Isosurfaces are subsequently extracted from this field to generate curved layers for 3D printing. Creating a differentiable pipeline enables us to optimize the mapping through loss functions directly defined on the field gradients as the local printing directions. New loss functions have been introduced to meet the manufacturing objectives of support-free and strength reinforcement. Our new computation pipeline relies less on the initial values of the field and can generate slicing results with significantly improved performance.


The Geometry of the Set of Equivalent Linear Neural Networks

from arXiv: Computational Geometry

Authors: Jonathan Richard Shewchuk, Sagnik Bhattacharya

We characterize the geometry and topology of the set of all weight vectors for which a linear neural network computes the same linear transformation $W$. This set of weight vectors is called the fiber of $W$ (under the matrix multiplication map), and it is embedded in the Euclidean weight space of all possible weight vectors. The fiber is an algebraic variety that is not necessarily a manifold. We describe a natural way to stratify the fiber--that is, to partition the algebraic variety into a finite set of manifolds of varying dimensions called strata. We call this set of strata the rank stratification. We derive the dimensions of these strata and the relationships by which they adjoin each other. Although the strata are disjoint, their closures are not. Our strata satisfy the frontier condition: if a stratum intersects the closure of another stratum, then the former stratum is a subset of the closure of the latter stratum. Each stratum is a manifold of class $C^\infty$ embedded in weight space, so it has a well-defined tangent space and normal space at every point (weight vector). We show how to determine the subspaces tangent to and normal to a specified stratum at a specified point on the stratum, and we construct elegant bases for those subspaces. To help achieve these goals, we first derive what we call a Fundamental Theorem of Linear Neural Networks, analogous to what Strang calls the Fundamental Theorem of Linear Algebra. We show how to decompose each layer of a linear neural network into a set of subspaces that show how information flows through the neural network. Each stratum of the fiber represents a different pattern by which information flows (or fails to flow) through the neural network. The topology of a stratum depends solely on this decomposition. So does its geometry, up to a linear transformation in weight space.
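
A quick numerical illustration of a fiber (mine, not from the paper): two different weight settings of a two-layer linear network that multiply out to the same map $W$, and hence lie in the same fiber.

```python
# Two distinct weight settings of a two-layer linear network computing the
# same linear map W, i.e., two points of the same fiber (illustration only).
import numpy as np

W = np.array([[2.0, 0.0],
              [0.0, 3.0]])

# Factorization 1: identity first layer.
W1_a, W2_a = np.eye(2), W.copy()

# Factorization 2: insert an invertible matrix and its inverse between layers.
S = np.array([[1.0, 1.0],
              [0.0, 1.0]])
W1_b, W2_b = S @ np.eye(2), W @ np.linalg.inv(S)

# Both weight vectors compute the same linear transformation W.
assert np.allclose(W2_a @ W1_a, W)
assert np.allclose(W2_b @ W1_b, W)
print(np.allclose(W2_a @ W1_a, W2_b @ W1_b))  # True
```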


Parameterized Maximum Node-Disjoint Paths

from arXiv: Data Structures and Algorithms

Authors: Michael Lampis, Manolis Vasilakis

We revisit the Maximum Node-Disjoint Paths problem, the natural optimization version of Node-Disjoint Paths, where we are given a graph $G$, $k$ pairs of vertices $(s_i, t_i)$ and an integer $\ell$, and are asked whether there exist at least $\ell$ vertex-disjoint paths in $G$ whose endpoints are the given pairs. We present several results, with an emphasis on FPT approximation. Our main positive contribution is to show that the problem's intractability can be overcome using approximation and that for several of the structural parameters for which the problem is hard, most notably tree-depth, it admits an efficient FPT approximation scheme, returning a $(1-\varepsilon)$-approximate solution in time $f(td,\varepsilon)n^{O(1)}$. We manage to obtain these results by comprehensively mapping out the structural parameters for which the problem is FPT if $\ell$ is also a parameter, hence showing that understanding $\ell$ as a parameter is key to the problem's approximability. This, in turn, is a problem we are able to solve via a surprisingly simple color-coding algorithm, which relies on identifying an insightful problem-specific variant of the natural parameter, namely the number of vertices used in the solution. A natural question is whether the FPT approximation algorithm we devised for tree-depth can be extended to pathwidth. We resolve this negatively, showing that under the Parameterized Inapproximability Hypothesis no FPT approximation scheme for this parameter is possible, even in time $f(pw,\varepsilon)n^{g(\varepsilon)}$, thus precisely determining the parameter border where the problem transitions from ``hard but approximable'' to ``inapproximable''. Lastly, we strengthen existing lower bounds by replacing W[1]-hardness by XNLP-completeness for parameter pathwidth, and improving the $n^{o(\sqrt{td})}$ ETH-based lower bound for tree-depth to $n^{o(td)}$.

It's Hard to HAC with Average Linkage!

from arXiv: Data Structures and Algorithms

Authors: MohammadHossein Bateni, Laxman Dhulipala, Kishen N Gowda, D Ellis Hershkowitz, Rajesh Jayaram, Jakub Łącki

Average linkage Hierarchical Agglomerative Clustering (HAC) is an extensively studied and applied method for hierarchical clustering. Recent applications to massive datasets have driven significant interest in near-linear-time and efficient parallel algorithms for average linkage HAC. We provide hardness results that rule out such algorithms. On the sequential side, we establish a runtime lower bound of $n^{3/2-\epsilon}$ on $n$ node graphs for sequential combinatorial algorithms under standard fine-grained complexity assumptions. This essentially matches the best-known running time for average linkage HAC. On the parallel side, we prove that average linkage HAC likely cannot be parallelized even on simple graphs by showing that it is CC-hard on trees of diameter $4$. On the possibility side, we demonstrate that average linkage HAC can be efficiently parallelized (i.e., it is in NC) on paths and can be solved in near-linear time when the height of the output cluster hierarchy is small.
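
For intuition about the objective whose fast computation is being ruled out, here is a deliberately naive Python sketch of average-linkage HAC on a dense similarity matrix (higher values mean more similar); the paper concerns graph inputs, and its point is precisely that this kind of roughly cubic running time is hard to beat with combinatorial algorithms.

```python
import numpy as np

def average_linkage_hac(sim):
    """Naive average-linkage agglomerative clustering.

    sim : (n, n) symmetric similarity matrix.
    Returns the merge sequence as pairs of frozensets of point indices.
    """
    clusters = [frozenset([i]) for i in range(len(sim))]
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # average linkage: mean pairwise similarity between clusters
                w = np.mean([sim[a][b] for a in clusters[i] for b in clusters[j]])
                if best is None or w > best[0]:
                    best = (w, i, j)
        _, i, j = best
        merges.append((clusters[i], clusters[j]))
        merged = clusters[i] | clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return merges
```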

Near-Universally-Optimal Differentially Private Minimum Spanning Trees

from arXiv: Data Structures and Algorithms

Authors: Richard Hladík, Jakub Tětek

Devising mechanisms with good beyond-worst-case input-dependent performance has been an important focus of differential privacy, with techniques such as smooth sensitivity, propose-test-release, or inverse sensitivity mechanism being developed to achieve this goal. This makes it very natural to use the notion of universal optimality in differential privacy. Universal optimality is a strong instance-specific optimality guarantee for problems on weighted graphs, which roughly states that for any fixed underlying (unweighted) graph, the algorithm is optimal in the worst-case sense, with respect to the possible setting of the edge weights. In this paper, we give the first such result in differential privacy. Namely, we prove that a simple differentially private mechanism for approximately releasing the minimum spanning tree is near-optimal in the sense of universal optimality for the $\ell_1$ neighbor relation. Previously, it was only known that this mechanism is nearly optimal in the worst case. We then focus on the $\ell_\infty$ neighbor relation, for which the described mechanism is not optimal. We show that one may implement the exponential mechanism for MST in polynomial time, and that this results in universal near-optimality for both the $\ell_1$ and the $\ell_\infty$ neighbor relations.
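
The sketch below illustrates a simple input-perturbation mechanism of the kind discussed; it is not claimed to be exactly the mechanism analyzed in the paper. Assuming neighboring inputs are weight vectors differing by at most $1$ in $\ell_1$ norm, adding Laplace noise of scale $1/\varepsilon$ to every edge weight is $\varepsilon$-differentially private by the standard Laplace mechanism, and releasing the MST of the noisy graph is post-processing.

```python
import numpy as np

class DSU:
    """Union-find, used here for Kruskal's algorithm."""
    def __init__(self, n):
        self.parent = list(range(n))
    def find(self, x):
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]
            x = self.parent[x]
        return x
    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return False
        self.parent[ra] = rb
        return True

def private_mst(n, edges, eps):
    """Release a spanning tree computed from Laplace-perturbed weights.

    edges : list of (u, v, w) triples on vertices 0..n-1.
    """
    noisy = sorted((w + np.random.laplace(0.0, 1.0 / eps), u, v)
                   for u, v, w in edges)
    dsu, tree = DSU(n), []
    for _, u, v in noisy:
        if dsu.union(u, v):
            tree.append((u, v))
    return tree
```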

On the Number of Steps of CyclePopping in Weakly Inconsistent U(1)-Connection Graphs

from arXiv: Data Structures and Algorithms

Authors: Michaël Fanuel, Rémi Bardenet

A U(1)-connection graph $G$ is a graph in which each oriented edge is endowed with a unit complex number, the latter being conjugated under orientation flip. We consider cycle-rooted spanning forests (CRSFs), a particular kind of spanning subgraphs of $G$ that have recently found computational applications as randomized spectral sparsifiers. In this context, CRSFs are drawn from a determinantal measure. Under a condition on the connection, Kassel and Kenyon gave an elegant algorithm, named CyclePopping, to sample from this distribution. The algorithm is an extension of the celebrated algorithm of Wilson that uses a loop-erased random walk to sample uniform spanning trees. In this paper, we give an alternative, elementary proof of correctness of CyclePopping for CRSF sampling; we fill the gaps of a proof sketch by Kassel, who was himself inspired by Marchal's proof of the correctness of Wilson's original algorithm. One benefit of the full proof \`a la Marchal is that we obtain a concise expression for the law of the number of steps to complete the sampling procedure, shedding light on practical situations where the algorithm is expected to run fast. Furthermore, we show how to extend the proof to more general distributions over CRSFs, which are not determinantal. The correctness of CyclePopping is known even in the non-determinantal case from the work of Kassel and Kenyon, so our merit is only to provide an alternate proof. One interest of this alternate proof is again to provide the distribution of the time complexity of the algorithm, in terms of a Poisson point process on the graph loops, or equivalently as a Poisson process on pyramids of cycles, a combinatorial notion introduced by Viennot. Finally, we strive to make the connections to loop measures and combinatorial structures as explicit as possible, to provide a reference for future extensions of the algorithm and its analysis.
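
For readers unfamiliar with the base case that CyclePopping extends, here is a compact Python sketch of Wilson's classical algorithm, which samples a uniform spanning tree of a connected graph by loop-erased random walk; the connection/CRSF version treated in the paper additionally retains cycles with probabilities depending on their holonomy, which is not modeled here.

```python
import random

def wilson_uniform_spanning_tree(adj, root=0):
    """Sample a uniform spanning tree of a connected undirected graph.

    adj : adjacency lists, adj[u] = list of neighbours of u.
    Returns the tree as a list of (child, parent) edges towards the root.
    """
    n = len(adj)
    in_tree = [False] * n
    parent = [None] * n
    in_tree[root] = True
    for start in range(n):
        # Random walk from `start` until hitting the current tree; storing
        # only the last exit from each vertex erases loops implicitly.
        u = start
        while not in_tree[u]:
            parent[u] = random.choice(adj[u])
            u = parent[u]
        # Retrace the loop-erased path and attach it to the tree.
        u = start
        while not in_tree[u]:
            in_tree[u] = True
            u = parent[u]
    return [(v, parent[v]) for v in range(n) if v != root]
```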

$α_i$-Metric Graphs: Hyperbolicity

from arXiv: Data Structures and Algorithms

Authors: Feodor F. Dragan, Guillaume Ducoffe

A graph is called $\alpha_i$-metric ($i \in {\cal N}$) if it satisfies the following $\alpha_i$-metric property for every vertices $u, w, v$ and $x$: if a shortest path between $u$ and $w$ and a shortest path between $x$ and $v$ share a terminal edge $vw$, then $d(u,x) \ge d(u,v) + d(v,x) - i$. The latter is a discrete relaxation of the property that in Euclidean spaces the union of two geodesics sharing a terminal segment must be also a geodesic. Recently in (Dragan & Ducoffe, WG'23) we initiated the study of the algorithmic applications of $\alpha_i$-metric graphs. Our results in this prior work were very similar to those established in (Chepoi et al., SoCG'08) and (Chepoi et al., COCOA'18) for graphs with bounded hyperbolicity. The latter is a heavily studied metric tree-likeness parameter first introduced by Gromov. In this paper, we clarify the relationship between hyperbolicity and the $\alpha_i$-metric property, proving that $\alpha_i$-metric graphs are $f(i)$-hyperbolic for some function $f$ linear in $i$. We give different proofs of this result, using various equivalent definitions to graph hyperbolicity. By contrast, we give simple constructions of $1$-hyperbolic graphs that are not $\alpha_i$-metric for any constant $i$. Finally, in the special case of $i=1$, we prove that $\alpha_1$-metric graphs are $1$-hyperbolic, and the bound is sharp. By doing so, we can answer some questions left open in (Dragan & Ducoffe, WG'23).
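
To make the definition concrete, the following brute-force Python check (purely illustrative, for a connected graph given as a dict of neighbour sets) tests the $\alpha_i$-metric property, using the fact that some shortest $u$-$w$ path ends with the edge $vw$ exactly when $w$ is adjacent to $v$ and $d(u,w) = d(u,v) + 1$, and symmetrically for the $x$-$v$ path.

```python
from collections import deque
from itertools import product

def bfs_dist(adj, s):
    """Hop distances from s by breadth-first search."""
    d = {s: 0}
    q = deque([s])
    while q:
        u = q.popleft()
        for v in adj[u]:
            if v not in d:
                d[v] = d[u] + 1
                q.append(v)
    return d

def is_alpha_i_metric(adj, i):
    dist = {s: bfs_dist(adj, s) for s in adj}
    for v in adj:
        for w in adj[v]:                                  # oriented terminal edge vw
            for u, x in product(adj, repeat=2):
                ends_uw = dist[u][w] == dist[u][v] + 1    # shortest u-w path via v
                ends_xv = dist[x][v] == dist[x][w] + 1    # shortest x-v path via w
                if ends_uw and ends_xv and dist[u][x] < dist[u][v] + dist[v][x] - i:
                    return False
    return True
```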

On the sizes of BDDs and ZDDs representing matroids

from arXiv: Data Structures and Algorithms

Authors: Hiromi Emoto, Yuni Iwamasa, Shin-ichi Minato

Matroids are often represented as oracles since there are no unified and compact representations for general matroids. This paper initiates the study of binary decision diagrams (BDDs) and zero-suppressed binary decision diagrams (ZDDs) as relatively compact data structures for representing matroids in a computer. This study particularly focuses on the sizes of BDDs and ZDDs representing matroids. First, we compare the sizes of different variations of BDDs and ZDDs for a matroid. These comparisons involve concise transformations between specific decision diagrams. Second, we provide upper bounds on the size of BDDs and ZDDs for several classes of matroids. These bounds are closely related to the number of minors of the matroid and depend only on the connectivity function or pathwidth of the matroid, which deeply relates to the classes of matroids called strongly pigeonhole classes. In essence, these results indicate upper bounds on the number of minors for specific classes of matroids and new strongly pigeonhole classes.

An inexact augmented Lagrangian algorithm for unsymmetric saddle-point systems

from arXiv: Data Structures and Algorithms

Authors: N. Huang, Y. -H. Dai, D. Orban, M. A. Saunders

Augmented Lagrangian (AL) methods are a well known class of algorithms for solving constrained optimization problems. They have been extended to the solution of saddle-point systems of linear equations. We study an AL (SPAL) algorithm for unsymmetric saddle-point systems and derive convergence and semi-convergence properties, even when the system is singular. At each step, our SPAL requires the exact solution of a linear system of the same size but with an SPD (2,2) block. To improve efficiency, we introduce an inexact SPAL algorithm. We establish its convergence properties under reasonable assumptions. Specifically, we use a gradient method, known as the Barzilai-Borwein (BB) method, to solve the linear system at each iteration. We call the result the augmented Lagrangian BB (SPALBB) algorithm and study its convergence. Numerical experiments on test problems from Navier-Stokes equations and coupled Stokes-Darcy flow show that SPALBB is more robust and efficient than BICGSTAB and GMRES. SPALBB often requires the least CPU time, especially on large systems.
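
For context on the inner solver, here is a minimal NumPy sketch of the Barzilai-Borwein gradient method applied to an SPD system $Ax = b$, i.e., to minimizing $\tfrac12 x^\top A x - b^\top x$; the BB1 step length is standard, while the starting step and stopping rule are illustrative choices rather than the paper's.

```python
import numpy as np

def bb_solve(A, b, x0=None, tol=1e-8, max_iter=1000):
    """Barzilai-Borwein iteration for SPD A x = b.

    Uses the BB1 step alpha = (s.s)/(s.y) with s = x_k - x_{k-1} and
    y = g_k - g_{k-1}, where g = A x - b is the gradient.
    """
    x = np.zeros_like(b, dtype=float) if x0 is None else np.asarray(x0, dtype=float)
    g = A @ x - b
    alpha = 1.0                         # plain gradient step to get started
    for _ in range(max_iter):
        if np.linalg.norm(g) <= tol * np.linalg.norm(b):
            break
        x_new = x - alpha * g
        g_new = A @ x_new - b
        s, y = x_new - x, g_new - g
        denom = s @ y                   # equals s^T A s > 0 for SPD A
        if denom == 0.0:
            x = x_new
            break
        alpha = (s @ s) / denom
        x, g = x_new, g_new
    return x
```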

Tuesday, April 23

Arturo Merino, Torsten Mütze, and Namrata Apply Gliders for Hamiltonicity!

from Gil Kalai

Happy Passover to all our readers On the way from Cambridge to Tel Aviv I had a splendid three hour visit to London (from Kings Cross to UCL and back), where I met my graduate student Gabriel Gendler and Freddie … Continue reading →

Happy Passover to all our readers

On the way from Cambridge to Tel Aviv I had a splendid three-hour visit to London (from Kings Cross to UCL and back), where I met my graduate student Gabriel Gendler and Freddie Illingworth from UCL. (Like Tim Gowers and Imre Leader but a few decades later, Freddie and Gabriel first met at an IMO.) Gabriel and Freddie told me about several striking things, including the remarkable “glider” proof showing that Kneser graphs, Schrijver graphs, and stable Kneser graphs are Hamiltonian. (These proofs settled long-standing conjectures.) To quote Merino, Mütze, and Namrata:

“Our main technical innovation is to study cycles in Kneser graphs by a kinetic system of multiple gliders that move at different speeds and that interact over time, reminiscent of the gliders in Conway’s Game of Life.”

Here are relevant links: 

  1. Arturo Merino, Torsten Mütze, and Namrata: Kneser graphs are Hamiltonian.
  2. Torsten Mütze and Namrata: Hamiltonicity of Schrijver graphs and stable Kneser graphs.
  3. (A forthcoming survey article in the Notices of the AMS) Torsten Mütze: On Hamilton cycles in graphs defined by intersecting set systems.
  4. Torsten Mütze: Gliders in Kneser graphs, internet site.
  5. (While looking at 4, I discovered this very cool site) COS++: The combinatorial object server.

By Gil Kalai

Everything Inherently Meta

from Ben Recht

Meehl's Philosophical Psychology, Lecture 1

Meehl kicks things off with an alternative title for the course: Meehl’s Theory of Metatheory for Psychology Students. He defines metatheory as the empirical theory of scientific theory. The course is about concepts, theories, and how we appraise them. How do you assemble evidence of how people make theories? What can we say about the general process? Meehl argues that the utility of metatheory is not in designing new experiments or theories but in how you analyze data, criticize theory, and defend arguments. My goal for these blogs is to extract Meehl’s Theory of Metatheory into a broader context that lets me think more about the foundations of statistics and engineering.

As is the case with all first days of class, the first lecture here is a hodgepodge of topics. After describing the scope of the class, the remainder of the lecture has two distinct modules: first, the role of language in theory; second, a brief survey of the history of science and philosophy that Meehl finds relevant to the class.

Object language vs metalanguage

The main technical part of the first lecture is setting up a boundary between object language and metalanguage. Meehl’s definitions here are fairly conventional, but let me quote his distinction directly.

“The object language of a science is the language that speaks about its subject matter and the entities that the science investigates.”

Object language doesn’t necessarily mean observable objects. Meehl notes that protons are not really directly observable. Nor is libido. These are two conceptual abstractions that are still in the object language of their respective fields. 

“In the metalanguage, the subject matter is not the material objects of science but the statements that occur in science and the relations between them.”

Metalanguage consists of statements, properties of statements, and relations between the statements. It’s like the logic of a science. Meehl gives a list of examples of metalanguage, with dramatic pauses between each:

  • the term true

  • the term false

  • the phrase confirmed by data 

  • the term rational 

  • the term unknown 

  • the term fallacious 

  • the term derivable or deducible

  • the term valid

  • the term probable 

These are all metalinguistic as they will describe statements or propositions. As Meehl highlights, metalanguage is about relationships between beliefs and relationships between beliefs and evidence. Evidence lives in the object language, beliefs live in your mind, and metalanguage stitches it all together. If you see this as foreshadowing future discussion of probabilistic reasoning about empirics, you are correct.

Meehl’s historical overview

The second part of the lecture gives a rather broad history of the 20th-century psychology and philosophy that Meehl thinks is relevant. Meehl knew many of these seminal characters. He describes the 200 martinis he drank with Paul Feyerabend at the Temple Bar. Meehl rightly notes that being able to bug these scholars about their writings provides insights you can’t extract from the page. Every book, paper, and monograph misses context and details that can completely reframe one’s interpretation of an argument. Meehl’s badgering of these characters for clarification informs the unique perspective of this class. In a similar way, listening to Meehl talk about his own writing in these videos clarifies what he was after in his metatheoretic program.

I’m not going to summarize the psychology history, as I’m not well versed in the subject. And, I’ll admit it, I’m also not particularly interested in hundred-year-old squabbling about mouse studies. But if you’re into that sort of thing, Meehl briefly introduces some of the characters he’ll return to as examples of theory builders in psychology. In particular, his discussion of B. F. Skinner and Skinner’s dogmatism is helpful as it exemplifies the role of the psychology of the scientist in the philosophy of science. Skinner, who was at Minnesota at the same time as Meehl, is a common character in anecdotes throughout the class.

With regard to philosophy, Meehl spends the rest of the class discussing the history and failure of logical positivism. Into the early 1940s, when Meehl was a graduate student, logical positivism was dominant in the American philosophy of science. The logical positivists figured you could do philosophy of science without looking at how people actually did science (“from the armchair” as Meehl puts it). Science was some sort of automatic truth-grinding machine that took evidence and produced truth. 

“If you're not poisoned by your politics. Or by your religion. Or by your social class. Or your sex. Or your age. Or whatever. And if you don't have dirty glasses. Then you look at the empirical world, and you collect these protocols, and then you apply the instrumentation of mathematics (maybe in social sciences a lot of statistics), and out of that truth will emerge.”

Some people still believe this! But those people believe a lot of other kooky things too.

The logical positivists wanted a rational reconstruction of science. Meehl asks “why should a rational person believe in chemistry?” This is a good question! The logical positivists thought they could justify the analysis of science with a few simple protocols. The goal would be to take what scientists present as “how the world is” and then come back and see how to reconstruct it.

Rudolf Carnap, one of the more influential logical positivists, saw the philosophy of science as equivalent to the logic of science. Carnap was a logician after all. The logic of science is the logical syntax of the language of science. Hence, to justify science, you just had to parse the discourse of science. The positivists believed that “no overarching epistemological or methodological things are needed if you think clearly.”

This didn’t pan out, but Meehl argues that logical positivism is useful for showing us the limits of logic. I buy this. It’s important to know where deduction ends and our postmodern relativism begins. In the next lecture, Meehl turns to Popper, who makes a valiant but untenable attempt to save science from relativism and historicism. Popper is the scientist’s favorite philosopher. But, as we’ll see next time, the Popperian view of science is a romantic illusion.

Loose ends

A few other points I couldn’t work into the synopsis, but I want to record for myself for later.

  • Meehl describes how he tried to pin down Feyerabend about theory-infection and what that means. For Kuhn and Feyerabend, every observation is dependent on the theory. This is related to what I was getting at in my post on quantum mechanics. In physics, evidence and measurement are tricky! Both the measurement device and the thing you are measuring are interacting through the “laws” of physics. Meehl argues that this is less of a concern in social science. Feyerabend clarified that he was mostly thinking about “cosmological theories” like those in physics.

  • The second goal of the positivists was the elimination of metaphysics. This was quite a lofty goal! But it led to all sorts of unpleasant conclusions. There were matters of fact like science, and then everything else was opinion. If you eliminate metaphysics, you eliminate ethics and aesthetics. And hence you end up being unable to argue against the Nazis.

  • Meehl also gets into the positivist’s view of meaning. Wittgenstein would destroy this, but it’s good to get a bit of perspective on the grappling with language and logic before the Philosophical Investigations. For the logical positivists, the meaning of a sentence is its method of verification. Meehl gives a simple example of “Caesar crossed the Rubicon” showing that “method of verification” leads to two completely different meanings: that held by a Roman soldier and that held by a contemporary historian.

By Ben Recht

Postdoc at Purdue University (apply by June 30, 2024)

from CCI: jobs

Applications are sought for a 1-2 year postdoctoral position at the Department of Statistics, Purdue University, under the supervision of Dr. Jordan Awan, working on projects at the intersection of statistics and differential privacy.

Website: https://careers.purdue.edu/default/job/Post-Doc-Research-Associate/31276-en_US
Email: jawan@purdue.edu

By shacharlovett

An Open Problem

from Richard Lipton

Richard Feynman and Gian-Carlo Rota worked on different parts of science during their separate careers. Feynman of course was one of the most important scientists of the 20th century—see here. His work in theoretical physics radically reshaped our understanding of the universe we live in at the most fundamental subatomic levels. But Feynman’s brilliance was […]

Richard Feynman and Gian-Carlo Rota worked on different parts of science during their separate careers. Feynman of course was one of the most important scientists of the 20th century—see here.

His work in theoretical physics radically reshaped our understanding of the universe we live in at the most fundamental subatomic levels.

But Feynman’s brilliance was not solely due to his natural cognitive abilities. He relied on a method: a simple technique for seeing the world through the lens of open-ended questions, which he called his “favorite problems.”

Rota worked in the subject of combinatorics, and he lifted it from a barely respectable obscurity to one of the most active areas of mathematics today—to quote Richard Stanley, one of his great students. Rota had many neat results, and his great paper Ten Lessons I Wish I Had Been Taught is wonderful. It fits neatly with Feynman’s notion of favorite problems.

The Problem

Our problem today is not one that Feynman or Rota would most likely have worked on. Not physics and not combinatorics. But I do think it is possibly one that is still interesting. I hope you agree.

We often have to create a post so that a user can ask some particular question. The post should make it easy for them to get the answer that they are interested in without any difficulty. This is not always easy for us to do.

One simple issue is this: given a post, we wish to make it easy for you to ask for some particular information. The problem is that the actual post may let you get into some type of nasty state. From that state it may be hard to get back to a place where you feel you are in control. The post may be in some strange state—one that does not respond to your questions in a way that you understand. This can be quite upsetting and lead to a tough experience for your audience. This is something that we wish to avoid.

More Details

Imagine you are using our post. You interact via some application programming interface (API). You input information to the API by typing some keys or clicking on some icons. Each such input causes the API to change and enter some potentially new state. The difficulty is that this state may be a wrong one. You would like some input to force the post to reset. That is, the API should change to a “reset” state or some other nice state. But nothing causes this to happen. Nothing.

How can we fix the API to make that happen when we wish? This is the problem that we would like to solve.

Open Problems

This problem does have a simple solution. But the solution is not that pleasant. We could agree that some input—if you type XXX—will always reset the API. But we often do not wish to reserve a fixed sequence of inputs to always cause a reset. What if we wish to be able to input XXX in some situation? This is the problem. How do we have a way out of trouble and yet not restrict the inputs? The answer, we believe, is not that natural. What would you do?

One idea is to have a special sequence like XXX that triggers the reset. It does mean that XXX cannot be used in any other situation. But we could agree that XXX is replaced with a special secret sequence that is extremely unlikely to be used except in the reset case. Perhaps we could replace XXX with the Fibonacci sequence 0112358132134 or some other special unusual sequence. Does this make some sense? Users would have to know the sequence and operate like this (see the sketch below):

If the API is in some nasty state, they would say to themselves: “I had better type the special sequence. Oh, I recall it is 0112358132134,” and the API will always reset.

And the reset will happen. Neat?
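
Here is a toy Python sketch of this idea; the method names handle and reset are placeholders rather than a real API. The wrapper watches the input stream and forces a reset whenever the reserved sequence appears, whatever state the wrapped API is in. Note the trade-off from the discussion above: the characters of the sequence still reach the API while the match is in progress, which is exactly why one picks a sequence that is extremely unlikely to occur otherwise.

```python
class ResettableAPI:
    """Wrap an API so that a reserved escape sequence always resets it."""

    RESET_SEQ = "0112358132134"          # the special sequence from the post

    def __init__(self, api):
        self.api = api                   # assumed to offer handle() and reset()
        self.buffer = ""

    def send(self, ch):
        self.api.handle(ch)              # normal processing of the input
        # remember the most recent characters and watch for the escape sequence
        self.buffer = (self.buffer + ch)[-len(self.RESET_SEQ):]
        if self.buffer == self.RESET_SEQ:
            self.api.reset()             # unconditional way out of any state
            self.buffer = ""
```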

By rjlipton

Workshop on Average-Case Computation

from CS Theory Events

May 20-24, 2024 Bocconi University, Milan, Italy https://rosenalon.github.io/cifra/workshop We are organizing a full-week workshop on average-case computation that aims to explore interactions between cryptography, average-case complexity, statistical physics, and machine learning. The workshop will be held at Bocconi University, Milan, from May 20th to May 24th. It is conveniently timed and located close to Eurocrypt. … Continue reading Workshop on Average-Case Computation

By shacharlovett

Functional Closure Properties of Finite $\mathbb{N}$-weighted Automata

from arXiv: Computational Complexity

Authors: Julian Dörfler, Christian Ikenmeyer

We determine all functional closure properties of finite $\mathbb{N}$-weighted automata, even all multivariate ones, and in particular all multivariate polynomials. We also determine all univariate closure properties in the promise setting, and all multivariate closure properties under certain assumptions on the promise, in particular we determine all multivariate closure properties where the output vector lies on a monotone algebraic graph variety.

Prove Symbolic Regression is NP-hard by Symbol Graph

from arXiv: Computational Complexity

Authors: Jinglu Song, Qiang Lu, Bozhou Tian, Jingwen Zhang, Jake Luo, Zhiguang Wang

Symbolic regression (SR) is the task of discovering a symbolic expression that fits a given data set from the space of mathematical expressions. Despite the abundance of research surrounding the SR problem, there's a scarcity of works that confirm its NP-hard nature. Therefore, this paper introduces the concept of a symbol graph as a comprehensive representation of the entire mathematical expression space, effectively illustrating the NP-hard characteristics of the SR problem. Leveraging the symbol graph, we establish a connection between the SR problem and the task of identifying an optimally fitted degree-constrained Steiner Arborescence (DCSAP). The complexity of DCSAP, which is proven to be NP-hard, directly implies the NP-hard nature of the SR problem.

Unambiguous and Co-Nondeterministic Computations of Finite Automata and Pushdown Automata Families and the Effects of Multiple Counters

from arXiv: Computational Complexity

Authors: Tomoyuki Yamakami

Nonuniform families of polynomial-size finite automata and pushdown automata respectively have strong connections to nonuniform-NL and nonuniform-LOGCFL. We examine the behaviors of unambiguous and co-nondeterministic computations produced by such families of automata operating multiple counters. As its consequences, we obtain various collapses of the complexity classes of families of promise problems solvable by finite and pushdown automata families when all valid instances are limited to either polynomially long strings or unary strings. A key technical ingredient of our proofs is an inductive counting of reachable vertices of each computation graph of finite and pushdown automata that operate multiple counters simultaneously.

Quantum Advantage and CSP Complexity

from arXiv: Computational Complexity

Authors: Lorenzo Ciardo

Information-processing tasks modelled by homomorphisms between relational structures can witness quantum advantage when entanglement is used as a computational resource. We prove that the occurrence of quantum advantage is determined by the same type of algebraic structure (known as a minion) that captures the polymorphism identities of CSPs and, thus, CSP complexity. We investigate the connection between the minion of quantum advantage and other known minions controlling CSP tractability and width. In this way, we make use of complexity results from the algebraic theory of CSPs to characterise the occurrence of quantum advantage in the case of graphs, and to obtain new necessary and sufficient conditions in the case of arbitrary relational structures.

A Tight Subexponential-time Algorithm for Two-Page Book Embedding

from arXiv: Computational Geometry

Authors: Robert Ganian, Haiko Mueller, Sebastian Ordyniak, Giacomo Paesani, Mateusz Rychlicki

A book embedding of a graph is a drawing that maps vertices onto a line and edges to simple pairwise non-crossing curves drawn into pages, which are half-planes bounded by that line. Two-page book embeddings, i.e., book embeddings into 2 pages, are of special importance as they are both NP-hard to compute and have specific applications. We obtain a 2^(O(\sqrt{n})) algorithm for computing a book embedding of an n-vertex graph on two pages -- a result which is asymptotically tight under the Exponential Time Hypothesis. As a key tool in our approach, we obtain a single-exponential fixed-parameter algorithm for the same problem when parameterized by the treewidth of the input graph. We conclude by establishing the fixed-parameter tractability of computing minimum-page book embeddings when parameterized by the feedback edge number, settling an open question arising from previous work on the problem.
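
As a reminder of what a two-page book embedding requires, here is a short, purely illustrative Python checker: fix a vertex order along the spine and assign each edge to page 0 or 1; the embedding is valid exactly when no two edges on the same page interleave along the spine.

```python
from itertools import combinations

def is_two_page_embedding(order, page_of):
    """Check a candidate two-page book embedding.

    order   : list of vertices in spine order
    page_of : dict mapping each edge (a frozenset {u, v}) to page 0 or 1
    """
    pos = {v: i for i, v in enumerate(order)}
    for e, f in combinations(page_of, 2):
        if page_of[e] != page_of[f]:
            continue
        a, b = sorted(pos[v] for v in e)
        c, d = sorted(pos[v] for v in f)
        if a < c < b < d or c < a < d < b:
            return False                 # two same-page edges cross
    return True
```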

On Support Relations Inference and Scene Hierarchy Graph Construction from Point Cloud in Clustered Environments

from arXiv: Computational Geometry

Authors: Gang Ma, Hui Wei

Over the years, scene understanding has attracted a growing interest in computer vision, providing the semantic and physical scene information necessary for robots to complete some particular tasks autonomously. In 3D scenes, rich spatial geometric and topological information are often ignored by RGB-based approaches for scene understanding. In this study, we develop a bottom-up approach for scene understanding that infers support relations between objects from a point cloud. Our approach utilizes the spatial topology information of the plane pairs in the scene, consisting of three major steps. 1) Detection of pairwise spatial configuration: dividing primitive pairs into local support connection and local inner connection; 2) primitive classification: a combinatorial optimization method applied to classify primitives; and 3) support relations inference and hierarchy graph construction: bottom-up support relations inference and scene hierarchy graph construction containing primitive level and object level. Through experiments, we demonstrate that the algorithm achieves excellent performance in primitive classification and support relations inference. Additionally, we show that the scene hierarchy graph contains rich geometric and topological information of objects, and it possesses great scalability for scene understanding.

On the rectilinear crossing number of complete balanced multipartite graphs and layered graphs

from arXiv: Computational Geometry

Authors: Ruy Fabila-Monroy, Rosna Paul, Jenifer Viafara-Chanchi, Alexandra Weinberger

A rectilinear drawing of a graph is a drawing of the graph in the plane in which the edges are drawn as straight-line segments. The rectilinear crossing number of a graph is the minimum number of pairs of edges that cross over all rectilinear drawings of the graph. Let $n \ge r$ be positive integers. The graph $K_n^r$, is the complete $r$-partite graph on $n$ vertices, in which every set of the partition has at least $\lfloor n/r \rfloor$ vertices. The layered graph, $L_n^r$, is an $r$-partite graph on $n$ vertices, in which for every $1\le i \le r-1$, all the vertices in the $i$-th partition are adjacent to all the vertices in the $(i+1)$-th partition. In this paper, we give upper bounds on the rectilinear crossing numbers of $K_n^r$ and~$L_n^r$.
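
For concreteness, a rectilinear drawing is determined by an assignment of points to vertices, and its crossing number counts pairs of edges whose segments cross; the sketch below counts proper crossings, assuming general position (no three collinear points) and skipping edge pairs that share an endpoint.

```python
from itertools import combinations

def orient(p, q, r):
    """Sign of the cross product (q - p) x (r - p)."""
    return (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])

def rectilinear_crossings(points, edges):
    """Count crossing pairs of straight-line edges.

    points : dict vertex -> (x, y); edges : list of (u, v) pairs.
    """
    count = 0
    for (a, b), (c, d) in combinations(edges, 2):
        if {a, b} & {c, d}:
            continue                     # edges sharing an endpoint do not count
        p, q, r, s = points[a], points[b], points[c], points[d]
        if orient(p, q, r) * orient(p, q, s) < 0 and orient(r, s, p) * orient(r, s, q) < 0:
            count += 1
    return count
```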

Computing the LCP Array of a Labeled Graph

from arXiv: Data Structures and Algorithms

Authors: Jarno Alanko, Davide Cenzato, Nicola Cotumaccio, Sung-Hwan Kim, Giovanni Manzini, Nicola Prezza

The LCP array is an important tool in stringology, allowing to speed up pattern matching algorithms and enabling compact representations of the suffix tree. Recently, Conte et al. [DCC 2023] and Cotumaccio et al. [SPIRE 2023] extended the definition of this array to Wheeler DFAs and, ultimately, to arbitrary labeled graphs, proving that it can be used to efficiently solve matching statistics queries on the graph's paths. In this paper, we provide the first efficient algorithm building the LCP array of a directed labeled graph with $n$ nodes and $m$ edges labeled over an alphabet of size $\sigma$. After arguing that the natural generalization of a compact-space LCP-construction algorithm by Beller et al. [J. Discrete Algorithms 2013] runs in time $\Omega(n\sigma)$, we present a new algorithm based on dynamic range stabbing building the LCP array in $O(n\log \sigma)$ time and $O(n\log\sigma)$ bits of working space.

Minimizing the Number of Tardy Jobs with Uniform Processing Times on Parallel Machines

from arXiv: Data Structures and Algorithms

Authors: Klaus Heeger, Hendrik Molter

In this work, we study the computational (parameterized) complexity of $P \mid r_j, p_j=p \mid \sum_j w_j U_j$. Here, we are given $m$ identical parallel machines and $n$ jobs with equal processing time, each characterized by a release date, a due date, and a weight. The task is to find a feasible schedule, that is, an assignment of the jobs to starting times on machines, such that no job starts before its release date and no machine processes several jobs at the same time, that minimizes the weighted number of tardy jobs. A job is considered tardy if it finishes after its due date. Our main contribution is showing that $P \mid r_j, p_j=p \mid \sum_j U_j$ (the unweighted version of the problem) is NP-hard and W[2]-hard when parameterized by the number of machines. The former resolves an open problem in Note 2.1.19 by Kravchenko and Werner [Journal of Scheduling, 2011] and Open Problem 2 by Sgall [ESA, 2012], and the latter resolves Open Problem 7 by Mnich and van Bevern [Computers & Operations Research, 2018]. Furthermore, our result shows that the known XP-algorithm for $P \mid r_j, p_j=p \mid \sum_j w_j U_j$ parameterized by the number of machines is optimal from a classification standpoint. On the algorithmic side, we provide alternative running time bounds for the above-mentioned known XP-algorithm. Our analysis shows that $P \mid r_j, p_j=p \mid \sum_j w_j U_j$ is contained in XP when parameterized by the processing time, and that it is contained in FPT when parameterized by the combination of the number of machines and the processing time. Finally, we give an FPT-algorithm for $P \mid r_j, p_j=p \mid \sum_j w_j U_j$ parameterized by the number of release dates or the number of due dates. With this work, we lay out the foundation for a systematic study of the parameterized complexity of $P \mid r_j, p_j=p \mid \sum_j w_j U_j$.
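
To fix notation, here is a small Python sketch (not from the paper) that evaluates a candidate schedule for $P \mid r_j, p_j=p \mid \sum_j w_j U_j$: it rejects schedules that violate release dates or run two jobs simultaneously on one machine, and otherwise returns the weighted number of tardy jobs, counting unscheduled jobs as tardy.

```python
def weighted_tardy(jobs, schedule, p):
    """Objective value for P | r_j, p_j = p | sum w_j U_j.

    jobs     : list of (release, due, weight) triples
    schedule : list of (job_index, machine, start_time) for scheduled jobs
    p        : common processing time
    Returns the weighted number of tardy jobs, or None if infeasible.
    """
    by_machine, on_time = {}, set()
    for j, m, start in schedule:
        release, due, _ = jobs[j]
        if start < release:
            return None                          # starts before its release date
        by_machine.setdefault(m, []).append(start)
        if start + p <= due:
            on_time.add(j)
    for starts in by_machine.values():
        starts.sort()
        if any(t2 < t1 + p for t1, t2 in zip(starts, starts[1:])):
            return None                          # two jobs overlap on a machine
    return sum(w for j, (_, _, w) in enumerate(jobs) if j not in on_time)
```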

Semirandom Planted Clique and the Restricted Isometry Property

from arXiv: Data Structures and Algorithms

Authors: Jarosław Błasiok, Rares-Darius Buhai, Pravesh K. Kothari, David Steurer

We give a simple, greedy $O(n^{\omega+0.5})=O(n^{2.872})$-time algorithm to list-decode planted cliques in a semirandom model introduced in [CSV17] (following [FK01]) that succeeds whenever the size of the planted clique is $k\geq O(\sqrt{n} \log^2 n)$. In the model, the edges touching the vertices in the planted $k$-clique are drawn independently with probability $p=1/2$ while the edges not touching the planted clique are chosen by an adversary in response to the random choices. Our result shows that the computational threshold in the semirandom setting is within a $O(\log^2 n)$ factor of the information-theoretic one [Ste17] thus resolving an open question of Steinhardt. This threshold also essentially matches the conjectured computational threshold for the well-studied special case of fully random planted clique. All previous algorithms [CSV17, MMT20, BKS23] in this model are based on rather sophisticated rounding algorithms for entropy-constrained semidefinite programming relaxations and their sum-of-squares strengthenings and the best known guarantee is a $n^{O(1/\epsilon)}$-time algorithm to list-decode planted cliques of size $k \geq \tilde{O}(n^{1/2+\epsilon})$. In particular, the guarantee trivializes to quasi-polynomial time if the planted clique is of size $O(\sqrt{n} \operatorname{polylog} n)$. Our algorithm achieves an almost optimal guarantee with a surprisingly simple greedy algorithm. The prior state-of-the-art algorithmic result above is based on a reduction to certifying bounds on the size of unbalanced bicliques in random graphs -- closely related to certifying the restricted isometry property (RIP) of certain random matrices and known to be hard in the low-degree polynomial model. Our key idea is a new approach that relies on the truth of -- but not efficient certificates for -- RIP of a new class of matrices built from the input graphs.

Individual Rationality in Topological Distance Games is Surprisingly Hard

from arXiv: Data Structures and Algorithms

Authors: Argyrios Deligkas, Eduard Eiben, Dušan Knop, Šimon Schierreich

In the recently introduced topological distance games, strategic agents need to be assigned to a subset of vertices of a topology. In such an assignment, the utility of an agent depends both on the agent's inherent utilities for the other agents and on its distance from them in the topology. We study the computational complexity of finding individually rational outcomes; this notion is widely regarded as the minimal stability requirement and demands that the utility of every agent in a solution be non-negative. We perform a comprehensive study of the problem's complexity and prove that, even in very basic cases, deciding whether an individually rational solution exists is intractable. To obtain at least some tractability, one needs to combine several restrictions on the input instance, bounding the number of agents, the topology, and the influence of distant agents on the utility.
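To make the definitions concrete, the following brute-force Python sketch checks individual rationality for a toy instance. The 1/distance weighting of utilities and all names are our own simplifying assumptions for illustration, not the paper's exact model; the exponential brute force is consistent with the intractability result above.

    from itertools import permutations

    def shortest_dists(n_vertices, edges):
        """All-pairs shortest-path distances on a small unweighted topology
        (Floyd-Warshall)."""
        INF = float("inf")
        d = [[0 if i == j else INF for j in range(n_vertices)] for i in range(n_vertices)]
        for u, v in edges:
            d[u][v] = d[v][u] = 1
        for k in range(n_vertices):
            for i in range(n_vertices):
                for j in range(n_vertices):
                    d[i][j] = min(d[i][j], d[i][k] + d[k][j])
        return d

    def is_individually_rational(assignment, utilities, dist):
        """assignment: agent -> vertex; utilities[i][j]: inherent utility of i for j.
        We assume, purely for illustration, that agent i's total utility is
        sum_j utilities[i][j] / dist(pos_i, pos_j); the paper allows other forms
        of distance dependence. IR means every agent's utility is non-negative."""
        for i, vi in assignment.items():
            total = sum(utilities[i].get(j, 0) / dist[vi][vj]
                        for j, vj in assignment.items() if j != i)
            if total < 0:
                return False
        return True

    def exists_ir_outcome(n_vertices, edges, utilities):
        """Brute force over all placements of the agents on distinct vertices."""
        dist = shortest_dists(n_vertices, edges)
        agents = list(utilities)
        for verts in permutations(range(n_vertices), len(agents)):
            if is_individually_rational(dict(zip(agents, verts)), utilities, dist):
                return True
        return False

    # Path topology 0-1-2-3; 'a' likes 'b', 'b' dislikes 'a' but likes 'c'.
    edges = [(0, 1), (1, 2), (2, 3)]
    utilities = {"a": {"b": 2}, "b": {"a": -1, "c": 3}, "c": {}}
    print(exists_ir_outcome(4, edges, utilities))  # True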

Decline and Fall of the ICALP 2008 Modular Decomposition algorithm

from arXiv: Data Structures and Algorithms

Authors: William Atherton, Dmitrii V. Pasechnik

We provide a counterexample to a crucial lemma in the ICALP 2008 paper "Simpler Linear-Time Modular Decomposition Via Recursive Factorizing Permutations", invalidating the algorithm described there.

Engineering Edge Orientation Algorithms

from arXiv: Data Structures and Algorithms

Authors: H. Reinstädtler, C. Schulz, B. Uçar

Given an undirected graph G, the edge orientation problem asks for assigning a direction to each edge so as to convert G into a directed graph. The aim is to minimize the maximum out-degree of a vertex in the resulting directed graph. This problem, which is solvable in polynomial time, arises in many applications. An ongoing challenge for edge orientation algorithms is scalability, in particular handling large-scale networks with millions or billions of edges efficiently. We propose a novel algorithmic framework based on finding and manipulating simple paths to address this challenge. Our framework builds on an existing algorithm and allows many algorithmic choices. By carefully exploring these choices and engineering the underlying algorithms, we obtain an implementation that is more efficient and scalable than the current state of the art: our experiments demonstrate significant performance improvements over state-of-the-art solvers, and on average our algorithm is 6.59 times faster.
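For intuition, here is a compact Python sketch (ours) of the classical path-reversal approach to edge orientation: orient arbitrarily, then repeatedly reverse a directed path from a vertex of maximum out-degree to a vertex whose out-degree is smaller by at least two. It is a simple baseline for the problem statement, not the engineered framework of the paper.

    from collections import deque

    def orient_edges(n, edges):
        """Orient an undirected graph to reduce the maximum out-degree by the
        classical path-reversal local search. Simple baseline only."""
        out = [set() for _ in range(n)]
        for u, v in edges:
            out[u].add(v)                       # initial arbitrary orientation u -> v
        while True:
            max_deg = max(len(s) for s in out)
            sources = [v for v in range(n) if len(out[v]) == max_deg]
            parent = {v: None for v in sources}
            queue = deque(sources)
            found = None
            while queue and found is None:
                u = queue.popleft()
                for w in out[u]:
                    if w not in parent:
                        parent[w] = u
                        if len(out[w]) <= max_deg - 2:
                            found = w
                            break
                        queue.append(w)
            if found is None:
                return out                      # no improving path exists
            v = found                           # reverse the path source -> ... -> found
            while parent[v] is not None:
                u = parent[v]
                out[u].remove(v)
                out[v].add(u)
                v = u

    edges = [(0, 1), (0, 2), (0, 3), (1, 2), (2, 3)]
    out = orient_edges(4, edges)
    print("max out-degree:", max(len(s) for s in out))  # 2, which is optimal here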

Faster Algorithms for Dual-Failure Replacement Paths

from arXiv: Data Structures and Algorithms

Authors: Shiri Chechik, Tianyi Zhang

Given a simple weighted directed graph $G = (V, E, \omega)$ on $n$ vertices and two designated terminals $s, t\in V$, our goal is to compute the shortest path from $s$ to $t$ avoiding any pair of failed edges $f_1, f_2\in E$; this is a natural generalization of the classical replacement path problem, which considers single edge failures only. The dual-failure replacement paths problem was recently studied by Vassilevska Williams, Woldeghebriel and Xu [FOCS 2022], who designed a cubic-time algorithm for general weighted digraphs which is conditionally optimal; in the same paper, for unweighted graphs where $\omega \equiv 1$, the authors presented an algebraic algorithm with runtime $\tilde{O}(n^{2.9146})$, as well as a conditional lower bound of $n^{8/3-o(1)}$ against combinatorial algorithms. However, their work left open whether fast matrix multiplication is necessary for a subcubic runtime in unweighted digraphs. As our primary result, we present the first truly subcubic combinatorial algorithm for dual-failure replacement paths in unweighted digraphs, with runtime $\tilde{O}(n^{3-1/18})$. We also study algebraic algorithms for digraphs with small integer edge weights from $\{-M, -M+1, \cdots, M-1, M\}$. As our secondary result, we obtain a runtime of $\tilde{O}(Mn^{2.8716})$, which is faster than the previous bound of $\tilde{O}(M^{2/3}n^{2.9144} + Mn^{2.8716})$ from [Vassilevska Williams, Woldeghebriel and Xu, 2022].
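As a point of reference, the following Python sketch (ours) solves dual-failure replacement paths by brute force, rerunning Dijkstra for every pair of failed edges. It only pins down the problem statement and is far slower than the algorithms above.

    import heapq
    from itertools import combinations

    def dijkstra(n, adj, s, banned):
        """Single-source shortest paths avoiding the directed edges in `banned`."""
        dist = [float("inf")] * n
        dist[s] = 0
        pq = [(0, s)]
        while pq:
            d, u = heapq.heappop(pq)
            if d > dist[u]:
                continue
            for v, w in adj[u]:
                if (u, v) in banned or d + w >= dist[v]:
                    continue
                dist[v] = d + w
                heapq.heappush(pq, (d + w, v))
        return dist

    def dual_failure_replacement_paths(n, edges, s, t):
        """Brute-force baseline: rerun Dijkstra for every pair of failed edges,
        i.e. O(m^2) shortest-path computations."""
        adj = [[] for _ in range(n)]
        for u, v, w in edges:
            adj[u].append((v, w))
        pairs = combinations([(u, v) for u, v, _ in edges], 2)
        return {(e1, e2): dijkstra(n, adj, s, {e1, e2})[t] for e1, e2 in pairs}

    edges = [(0, 1, 1), (1, 2, 1), (0, 2, 5), (1, 3, 4), (2, 3, 1)]
    print(dual_failure_replacement_paths(4, edges, s=0, t=3))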

Sublinear Time Low-Rank Approximation of Toeplitz Matrices

from arXiv: Data Structures and Algorithms

Authors: Cameron Musco, Kshiteej Sheth

We present a sublinear time algorithm for computing a near-optimal low-rank approximation to any positive semidefinite (PSD) Toeplitz matrix $T\in \mathbb{R}^{d\times d}$, given noisy access to its entries. In particular, given entrywise query access to $T+E$ for an arbitrary noise matrix $E\in \mathbb{R}^{d\times d}$, integer rank $k\leq d$, and error parameter $\delta>0$, our algorithm runs in time $\text{poly}(k,\log(d/\delta))$ and outputs (in factored form) a Toeplitz matrix $\widetilde{T} \in \mathbb{R}^{d \times d}$ with rank $\text{poly}(k,\log(d/\delta))$ satisfying, for some fixed constant $C$, \begin{equation*} \|T-\widetilde{T}\|_F \leq C \cdot \max\{\|E\|_F,\|T-T_k\|_F\} + \delta \cdot \|T\|_F. \end{equation*} Here $\|\cdot \|_F$ is the Frobenius norm and $T_k$ is the best (not necessarily Toeplitz) rank-$k$ approximation to $T$ in the Frobenius norm, given by projecting $T$ onto its top $k$ eigenvectors. Our result has the following applications. When $E = 0$, we obtain the first sublinear time near-relative-error low-rank approximation algorithm for PSD Toeplitz matrices, resolving the main open problem of Kapralov et al. (SODA 2023), whose algorithm had sublinear query complexity but exponential runtime. Our algorithm can also be applied to approximate the unknown Toeplitz covariance matrix of a multivariate Gaussian distribution, given sample access to this distribution, resolving an open question of Eldar et al. (SODA 2020). Our algorithm applies sparse Fourier transform techniques to recover a low-rank Toeplitz matrix using its Fourier structure. Our key technical contribution is the first polynomial time algorithm for \emph{discrete time off-grid} sparse Fourier recovery, which may be of independent interest.
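For comparison, this short numpy sketch (ours) computes the benchmark quantity $T_k$ exactly by a dense eigendecomposition of a small PSD Toeplitz matrix. It illustrates the approximation target in the guarantee above, not the paper's sublinear-time algorithm, and the example matrix is our own construction.

    import numpy as np

    def psd_toeplitz(first_row):
        """Symmetric Toeplitz matrix determined by its first row."""
        d = len(first_row)
        return np.array([[first_row[abs(i - j)] for j in range(d)] for i in range(d)])

    def best_rank_k(T, k):
        """The benchmark T_k from the guarantee above: project T onto its top-k
        eigenvectors. This dense eigendecomposition costs Theta(d^3) time, unlike
        the paper's poly(k, log(d/delta))-time algorithm."""
        vals, vecs = np.linalg.eigh(T)            # eigenvalues in ascending order
        top = np.argsort(vals)[-k:]
        return (vecs[:, top] * vals[top]) @ vecs[:, top].T

    # Example (ours): a PSD Toeplitz matrix built from two cosines plus a small
    # diagonal term, so it is approximately rank 4.
    d, freqs = 64, [0.05, 0.21]
    row = sum(np.cos(2 * np.pi * f * np.arange(d)) for f in freqs)
    row[0] += 0.01
    T = psd_toeplitz(row)
    Tk = best_rank_k(T, k=4)
    print("relative Frobenius error:", np.linalg.norm(T - Tk) / np.linalg.norm(T))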

Stochastic Multi-round Submodular Optimization with Budget

from arXiv: Data Structures and Algorithms

Authors: Vincenzo Auletta, Diodato Ferraioli, Cosimo Vinci

In this work we study the problem of Stochastic Budgeted Multi-round Submodular Maximization (SBMSm), in which we aim to maximize the sum, over multiple rounds, of the value of a monotone submodular objective function, where the values of this function depend on the realization of stochastic events and the number of observations we can make over all rounds is limited by a given budget. This problem extends, and generalizes to multi-round settings, well-studied problems such as (adaptive) influence maximization and stochastic probing. We first show that whenever a certain single-round optimization problem can be solved optimally in polynomial time, there is a polynomial-time dynamic programming algorithm that returns the same solution as the optimal algorithm, which can adaptively choose both which observations to make and in which round to make them. Unfortunately, this dynamic programming approach cannot be extended to the case in which the single-round optimization problem cannot be solved efficiently (even if we only require it to be approximated within an arbitrarily small constant factor). Nevertheless, in this case we provide a simple greedy algorithm for the problem that guarantees a $(1/2-\epsilon)$-approximation to the optimal value, even though it allocates the budget to rounds non-adaptively.
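For background, here is the textbook greedy for monotone submodular maximization under a cardinality budget, shown on a coverage function. The paper's algorithm is a stochastic, multi-round variant of this kind of greedy idea; this sketch (ours) only shows the deterministic single-round case.

    def greedy_max_coverage(sets, budget):
        """Textbook greedy: repeatedly pick the set with the largest marginal
        coverage gain until the budget is exhausted. Coverage is a monotone
        submodular function, so this is the classic single-round setting."""
        chosen, covered = [], set()
        for _ in range(budget):
            best = max(sets, key=lambda s: len(sets[s] - covered))
            if not sets[best] - covered:
                break                      # no remaining marginal gain
            chosen.append(best)
            covered |= sets[best]
        return chosen, covered

    sets = {"A": {1, 2, 3}, "B": {3, 4}, "C": {4, 5, 6, 7}, "D": {1, 7}}
    print(greedy_max_coverage(sets, budget=2))  # picks C then A, covering all 7 elements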

Predict to Minimize Swap Regret for All Payoff-Bounded Tasks

from arXiv: Data Structures and Algorithms

Authors: Lunjia Hu, Yifan Wu

A sequence of predictions is calibrated if and only if it induces no swap regret for any downstream decision task. We study the Maximum Swap Regret (MSR) of predictions for binary events: the swap regret maximized over all downstream tasks with bounded payoffs. Previously, the best online prediction algorithm for minimizing MSR was obtained by minimizing the $K_1$ calibration error, which upper bounds MSR up to a constant factor. However, recent work (Qiao and Valiant, 2021) gives an $\Omega(T^{0.528})$ lower bound on the worst-case expected $K_1$ calibration error incurred by any randomized algorithm in $T$ rounds, presenting a barrier to achieving better rates for MSR. Several relaxations of MSR have been considered to overcome this barrier, via external regret (Kleinberg et al., 2023) and via regret bounds depending polynomially on the number of actions in downstream tasks (Noarov et al., 2023; Roth and Shi, 2024). We show that the barrier can be surpassed without any relaxations: we give an efficient randomized prediction algorithm that guarantees $O(\sqrt{T\log T})$ expected MSR. We also discuss the economic utility of calibration by viewing MSR as a decision-theoretic calibration error metric and study its relationship to existing metrics.
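To make the classical quantity concrete, the following Python snippet (ours) computes an unnormalized $K_1$-style calibration error of a prediction sequence, the kind of calibration error said above to upper bound MSR up to a constant factor. Normalization conventions vary across papers, so treat the exact formula as our illustrative choice.

    from collections import defaultdict

    def k1_calibration_error(predictions, outcomes):
        """Unnormalized ell_1 / K_1-style calibration error for binary events:
        for each distinct predicted value p, compare p times the number of
        rounds it was used against the realized outcomes on those rounds."""
        count, ones = defaultdict(int), defaultdict(int)
        for p, y in zip(predictions, outcomes):
            count[p] += 1
            ones[p] += y
        return sum(abs(count[p] * p - ones[p]) for p in count)

    outcomes = [1, 0, 1, 0]
    print(k1_calibration_error([0.5] * 4, outcomes))  # 0.0: perfectly calibrated
    print(k1_calibration_error([0.9] * 4, outcomes))  # 1.6: overconfident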

Optimal Non-Adaptive Tolerant Junta Testing via Local Estimators

from arXiv: Data Structures and Algorithms

Authors: Shivam Nadimpalli, Shyamal Patel

We give a non-adaptive algorithm that makes $2^{\tilde{O}(\sqrt{k\log(1/(\varepsilon_2 - \varepsilon_1))})}$ queries to a Boolean function $f:\{\pm 1\}^n \rightarrow \{\pm 1\}$ and distinguishes between $f$ being $\varepsilon_1$-close to some $k$-junta and $\varepsilon_2$-far from every $k$-junta. At the heart of our algorithm is a local mean estimation procedure for Boolean functions that may be of independent interest. We complement our upper bound with a matching lower bound, improving on a recent lower bound of Chen et al. We thus obtain the first tight bounds for a natural property of Boolean functions in the tolerant testing model.
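The phrase "local mean estimation" can be illustrated by a naive sampling estimator of a Boolean function's mean over a subcube. The sketch below (ours) is only that illustration; it is not the paper's estimator and says nothing about its query bounds.

    import random

    def estimate_local_mean(f, n, fixed, samples=2000, seed=0):
        """Naive sampling estimate of the mean of f: {-1,+1}^n -> {-1,+1} over
        the subcube in which the coordinates in `fixed` are pinned and the
        remaining coordinates are uniform."""
        rng = random.Random(seed)
        total = 0
        for _ in range(samples):
            x = [fixed.get(i, rng.choice((-1, 1))) for i in range(n)]
            total += f(x)
        return total / samples

    # A 2-junta on 10 variables: f depends only on coordinates 0 and 3.
    f = lambda x: x[0] * x[3]
    print(estimate_local_mean(f, 10, fixed={0: 1}))        # close to 0
    print(estimate_local_mean(f, 10, fixed={0: 1, 3: 1}))  # close to 1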

An Optimal MPC Algorithm for Subunit-Monge Matrix Multiplication, with Applications to LIS

from arXiv: Data Structures and Algorithms

Authors: Jaehyun Koo

We present an $O(1)$-round, fully scalable, deterministic massively parallel algorithm for computing the min-plus matrix multiplication of unit-Monge matrices. We use this to derive an $O(\log n)$-round, fully scalable massively parallel algorithm for solving the exact longest increasing subsequence (LIS) problem. In the fully scalable MPC regime, this substantially improves on the previously known algorithm, which requires $O(\log^4 n)$ rounds, and matches the best algorithm for computing a $(1+\epsilon)$-approximation of the LIS.
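As a sequential baseline for the problem being parallelized, here is the classic $O(n\log n)$ patience-sorting LIS algorithm in Python; the paper's unit-Monge and MPC machinery is not reproduced here.

    from bisect import bisect_left

    def lis_length(seq):
        """Classic O(n log n) sequential LIS: tails[k] holds the smallest
        possible tail of a strictly increasing subsequence of length k+1."""
        tails = []
        for x in seq:
            i = bisect_left(tails, x)
            if i == len(tails):
                tails.append(x)
            else:
                tails[i] = x
        return len(tails)

    print(lis_length([3, 1, 4, 1, 5, 9, 2, 6]))  # 4 (e.g. 1, 4, 5, 9)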

New Structures and Algorithms for Length-Constrained Expander Decompositions

from arXiv: Data Structures and Algorithms

Authors: Bernhard Haeupler, D Ellis Hershkowitz, Zihan Tan

Expander decompositions form the basis of one of the most flexible paradigms for close-to-linear-time graph algorithms. Length-constrained expander decompositions generalize this paradigm to work better for problems with lengths, distances and costs. Roughly, an $(h,s)$-length $\phi$-expander decomposition is a small collection of length increases to a graph so that nodes within distance $h$ can route flow over paths of length $hs$ with congestion at most $1/\phi$. In this work, we give a close-to-linear-time algorithm for computing length-constrained expander decompositions in graphs with general lengths and capacities. Notably, and unlike previous works, our algorithm allows one to trade off between the size of the decomposition and the length of routing paths: for any $\epsilon > 0$ not too small, our algorithm computes in close-to-linear time an $(h,s)$-length $\phi$-expander decomposition of size $m \cdot \phi \cdot n^\epsilon$ where $s = \exp(\text{poly}(1/\epsilon))$. The key foundations of our algorithm are: (1) a simple yet powerful structural theorem stating that the union of a sequence of sparse length-constrained cuts is itself sparse, and (2) new algorithms for efficiently computing sparse length-constrained flows.
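For intuition about the underlying paradigm, here is a brute-force Python sketch (ours) of a plain, not length-constrained, expander decomposition: remove sparse cuts recursively until every remaining piece expands. It is exponential-time and purely illustrative; the conductance-based notion of sparsity is our simplifying choice and none of the length-constrained machinery above is reflected.

    from itertools import combinations

    def conductance(adj, S):
        """Cut edges leaving S divided by min(vol(S), vol(rest))."""
        cut = sum(1 for u in S for v in adj[u] if v not in S)
        vol_S = sum(len(adj[u]) for u in S)
        vol_rest = sum(len(adj[u]) for u in adj if u not in S)
        return cut / max(1, min(vol_S, vol_rest))

    def expander_decompose(adj, phi):
        """While some cut has conductance below phi, remove its edges and
        recurse on both sides; pieces with no such cut are phi-expanders."""
        nodes = list(adj)
        if len(nodes) <= 1:
            return []
        for r in range(1, len(nodes) // 2 + 1):
            for S in map(set, combinations(nodes, r)):
                if conductance(adj, S) < phi:
                    removed = [(u, v) for u in S for v in adj[u] if v not in S]
                    left = {u: [v for v in adj[u] if v in S] for u in S}
                    right = {u: [v for v in adj[u] if v not in S]
                             for u in nodes if u not in S}
                    return removed + expander_decompose(left, phi) \
                                   + expander_decompose(right, phi)
        return []   # every cut is phi-expanding: this piece is left intact

    # Two triangles joined by a single edge; that edge is the sparse cut.
    adj = {0: [1, 2, 3], 1: [0, 2], 2: [0, 1], 3: [0, 4, 5], 4: [3, 5], 5: [3, 4]}
    print(expander_decompose(adj, phi=0.2))  # [(0, 3)]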
