Maintained by Nima Anari, Arnab Bhattacharyya, Gautam Kamath.

Theory of Computing Report

Monday, February 06

A Computational Separation Between Quantum No-cloning and No-teleportation

Authors: Barak Nehoran, Mark Zhandry

Two of the fundamental no-go theorems of quantum information are the no-cloning theorem (that it is impossible to make copies of general quantum states) and the no-teleportation theorem (the prohibition on sending quantum states over classical channels without pre-shared entanglement). They are known to be equivalent, in the sense that a collection of quantum states is clonable if and only if it is teleportable. Our main result suggests that this is not the case when computational efficiency is considered. We give a collection of quantum states and oracles relative to which these states are efficiently clonable but not efficiently teleportable. Given that the opposite scenario is impossible (states that can be teleported can always trivially be cloned), this gives the most complete oracle separation possible between these two important no-go properties. In doing so, we introduce a related quantum no-go property, reconstructibility, which refers to the ability to construct a quantum state from a uniquely identifying classical description. We show the stronger result of a collection of quantum states that are efficiently clonable but not efficiently reconstructible. This novel no-go property only exists in relation to computational efficiency, as it is trivial for unbounded computation. It thus opens up the possibility of further computational no-go properties that have not yet been studied because they do not exist outside the computational context.

Complexity of Solo Chess with Unlimited Moves

Authors: Josh Brunner, Lily Chung, Michael Coulombe, Erik D. Demaine, Timothy Gomez, Jayson Lynch

We analyze Solo Chess puzzles, where the input is an $n \times n$ board containing some standard Chess pieces of the same color, and the goal is to make a sequence of capture moves to reduce down to a single piece. Prior work analyzes this puzzle for a single piece type when each piece is limited to make at most two capture moves (as in the Solo Chess puzzles on chess.com). By contrast, we study when each piece can make an unlimited number of capture moves. We show that any single piece type can be solved in polynomial time in a general model of piece types, while any two standard Chess piece types are NP-complete. We also analyze the restriction (as on chess.com) that one piece type is unique and must be the last surviving piece, showing that in this case some pairs of piece types become tractable while others remain hard.

Optimal Heaviest Induced Ancestors

Authors: Panagiotis Charalampopoulos, Bartłomiej Dudek, Paweł Gawrychowski, Karol Pokorski

We revisit the Heaviest Induced Ancestors (HIA) problem that was introduced by Gagie, Gawrychowski, and Nekrich [CCCG 2013] and has a number of applications in string algorithms. Let $T_1$ and $T_2$ be two rooted trees whose nodes have weights that are increasing in all root-to-leaf paths, and labels on the leaves, such that no two leaves of a tree have the same label. A pair of nodes $(u, v)\in T_1 \times T_2$ is \emph{induced} if and only if there is a label shared by leaf-descendants of $u$ and $v$. In an HIA query, given nodes $x \in T_1$ and $y \in T_2$, the goal is to find an induced pair of nodes $(u, v)$ of the maximum total weight such that $u$ is an ancestor of~$x$ and $v$ is an ancestor of $y$.

Let $n$ be the upper bound on the sizes of the two trees. It is known that no data structure of size $\tilde{\mathcal{O}}(n)$ can answer HIA queries in $o(\log n / \log \log n)$ time [Charalampopoulos, Gawrychowski, Pokorski; ICALP 2020]. This (unconditional) lower bound is a $\operatorname{polyloglog} n$ factor away from the query time of the fastest $\tilde{\mathcal{O}}(n)$-size data structure known to date for the HIA problem [Abedin, Hooshmand, Ganguly, Thankachan; Algorithmica 2022]. In this work, we resolve the query-time complexity of the HIA problem for the near-linear space regime by presenting a data structure that can be built in $\tilde{\mathcal{O}}(n)$ time and answers HIA queries in $\mathcal{O}(\log n/\log\log n)$ time. As a direct corollary, we obtain an $\tilde{\mathcal{O}}(n)$-size data structure that maintains the LCS of a static string and a dynamic string, both of length at most $n$, in time optimal for this space regime.

The main ingredients of our approach are fractional cascading and the utilization of an $\mathcal{O}(\log n/ \log\log n)$-depth tree decomposition.

Group Fairness in Non-monotone Submodular Maximization

Authors: Jing Yuan, Shaojie Tang

Maximizing a submodular function has a wide range of applications in machine learning and data mining. One such application is data summarization, whose goal is to select a small set of representative and diverse data items from a large dataset. However, data items might have sensitive attributes such as race or gender; in this setting, it is important to design \emph{fairness-aware} algorithms to mitigate potential algorithmic bias that may cause over- or under-representation of particular groups. Motivated by this, we propose and study the classic non-monotone submodular maximization problem subject to novel group fairness constraints. Our goal is to select a set of items that maximizes a non-monotone submodular function, while ensuring that the number of selected items from each group is proportionate to its size, to the extent specified by the decision maker. We develop the first constant-factor approximation algorithms for this problem. We also extend the basic model to incorporate an additional global size constraint on the total number of selected items.
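As a rough illustration of the constraint shape described above (the per-group lower/upper fractions `alpha` and `beta` and the function name are assumptions for this sketch, not the paper's exact model), checking whether a selection respects such a group-fairness constraint takes only a few lines:

```python
# Illustrative sketch only: a selection is admissible here if, for every
# group, the number of selected items lies between alpha and beta
# fractions of that group's size (bounds chosen by the decision maker).
def is_group_fair(selected, groups, alpha, beta):
    selected = set(selected)
    for items in groups.values():
        count = len(selected & set(items))
        if not (alpha * len(items) <= count <= beta * len(items)):
            return False
    return True

# Two groups of equal size: a balanced selection passes, a skewed one fails.
groups = {"A": [1, 2, 3, 4], "B": [5, 6, 7, 8]}
print(is_group_fair({1, 2, 5, 6}, groups, 0.25, 0.75))  # True
print(is_group_fair({1, 2, 3, 4}, groups, 0.25, 0.75))  # False
```

The algorithmic difficulty in the paper is, of course, optimizing a non-monotone submodular objective subject to this kind of constraint, not merely checking it.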

Chaining of Maximal Exact Matches in Graphs

Authors: Nicola Rizzo, Manuel Cáceres, Veli Mäkinen

We study the problem of finding maximal exact matches (MEMs) between a query string $Q$ and a labeled directed acyclic graph (DAG) $G=(V,E,\ell)$ and subsequently co-linearly chaining these matches. We show that it suffices to compute MEMs between node labels and $Q$ (node MEMs) to encode full MEMs. Node MEMs can be computed in linear time, and we show how to co-linearly chain them to solve the Longest Common Subsequence (LCS) problem between $Q$ and $G$. Our chaining algorithm is the first to consider a symmetric formulation of the chaining problem in graphs and runs in $O(k^2|V| + |E| + kN\log N)$ time, where $k$ is the width (minimum number of paths covering the nodes) of $G$, and $N$ is the number of node MEMs. We then consider the problem of finding MEMs when the input graph is an indexable elastic founder graph (a subclass of labeled DAGs studied by Equi et al., Algorithmica 2022). For arbitrary input graphs, the problem cannot be solved in truly sub-quadratic time under SETH (Equi et al., ICALP 2019). We show that we can report all MEMs between $Q$ and an indexable elastic founder graph in time $O(nH^2 + m + M_\kappa)$, where $n$ is the total length of node labels, $H$ is the maximum number of nodes in a block of the graph, $m = |Q|$, and $M_\kappa$ is the number of MEMs of length at least $\kappa$.

Authors: Fabian Spaeh, Alina Ene

Display Ads and the generalized assignment problem are two well-studied online packing problems with important applications in ad allocation and other areas. In both problems, ad impressions arrive online and have to be allocated immediately to budget-constrained advertisers. Worst-case algorithms that achieve the ideal competitive ratio are known, but might act overly conservatively given the predictable and usually tame nature of real-world input. Given this discrepancy, we develop an algorithm for both problems that incorporates machine-learned predictions and can thus improve the performance beyond the worst case. Our algorithm is based on the work of Feldman et al. (2009) and similar in nature to that of Mahdian et al. (2007), who were the first to develop a learning-augmented algorithm for the related, but more structured, AdWords problem. We use a novel analysis to show that our algorithm is able to capitalize on a good prediction, while being robust against poor predictions. We experimentally evaluate our algorithm on synthetic and real-world data on a wide range of predictions. Our algorithm consistently outperforms the worst-case algorithm without predictions.

Qubit-Efficient Randomized Quantum Algorithms for Linear Algebra

Authors: Samson Wang, Sam McArdle, Mario Berta

We propose a class of randomized quantum algorithms for the task of sampling from matrix functions, without the use of quantum block encodings or any other coherent oracle access to the matrix elements. As such, our use of qubits is purely algorithmic, and no additional qubits are required for quantum data structures. For $N\times N$ Hermitian matrices, the space cost is $\log(N)+1$ qubits and, depending on the structure of the matrices, the gate complexity can be comparable to state-of-the-art methods that use quantum data structures of up to size $O(N^2)$, when considering equivalent end-to-end problems. Within our framework, we present a quantum linear system solver that allows one to sample properties of the solution vector, as well as an algorithm for sampling properties of ground states of Hamiltonians. As a concrete application, we combine these two sub-routines to present a scheme for calculating Green's functions of quantum many-body systems.

Sunday, February 05

Nathan Never strikes back

from Emanuele Viola

Amiga Germany Fan’zine – #5 features Nathan Never, 30 years after the game came out (the article is from some months ago, but I just found out). You can read the article here.

My first day of high school I was looking for a desk, and the one that was left was occupied by Marco Genovesi (also see previous post about Black Viper, our other game together, also featured in the article). We had both programmed a little on the mythical C64. I had written in BASIC a fighting game inspired by Hokuto No Ken. It was hand-coded pixelated sprites doing flying martial arts kicks. It was on a cassette — I didn’t have a floppy disk. I would love to see it again, but it seems lost forever. It was my first game, written sometime when I was between 10 and 12.

If I think back at my schools, from elementary to high, I scream in horror. There was nothing. No play-yard, no props, nothing. Even the walls! How much does it cost to hang a world map? Instead they were solid color. We were there 8-1 or something, every day, including Saturdays, in a room with solid-color walls, solid-color desks, and a blackboard, and that’s it. Perhaps no wonder that I exercised my art and craft on what I had. I dismembered floppy disks, cassettes, speakers etc. and stuck them on the wall. I scribbled all over my desk.

At some point I got an Amiga 1000 (my trajectory was Spectrum 48K, Commodore 64, Amiga 1000). Naturally I wanted to program, but had zero leads. Unlike today, it was very hard to get started. Interestingly, people more experienced than me had the same complaint, like “when I was 13 myself good luck finding an assembler.” Actually, I think things will continue to move in this direction. Even today, learning something isn’t that easy, and one can imagine how in 20 years you’ll have tutorials precisely tailored to your background, instead of having to zap through youtube to find the small step you’re missing.

So for a while I was stuck fiddling with startup sequences and shell commands. I remember my parents puzzled at why I wasn’t doing anything fancy like on the C64.
“You’re still at popcli” (a type of shell, known back then as CLI, command line interface) they told me once — I am cursed with good memory.
My reply was “You need a program to write a program.” They looked puzzled. (And so was I, after all, I didn’t need one on the C64.)

My turning point occurred when I did one of my stupid things. I was balancing myself on the top beam of a swing. Naturally, I fell, didn’t land right, and injured my ankle. I was stuck in my room for a month, and one day Marco showed up and brought me the Amiga hardware reference manual, a book we had overheard was critical. I thanked him and I remember him saying: “I am doing it for me.” This book became my bible. I thrashed the stupid C compilers, got an assembler, and started “bashing metal” and having fun.

Then there was the store. I don’t remember how I ended up there. It sold games, so maybe I was just there to buy games. But I never bought a new game — only pirate — so the explanation doesn’t stand. Anyway, we were there and the person there was a techno-musician who knew people who wanted to make a game on Nathan Never, a pretty high-profile graphic novel series in Italy.

To this day I am shocked they didn’t find anyone else, since as I said it’s pretty high-profile. Maybe the programmer wanted too much money, or ditched them last moment and they needed the game fast. Who knows. I had just barely started programming on Amiga, but… I know blitter, I know copper; I am obviously ready for a major project.

So we signed the contract, and I insisted on some money upfront, which I got. A few Ks. The development of the game was covered in games magazines, which other people in our high school were devouring. They were shocked to find out the team was us! Walking around, we would pass by a store and find the box of the game on display.

As part of my deal I managed to get an Amiga 3000 tower, a one-of-a-kind machine, which for a kid like me was a dream. Before that, I didn’t have enough memory to play the game from the assembler with the graphics loaded, so when I tested it I would see dots instead of sprites. Marco, who did the graphics, at the beginning didn’t have a color monitor, and would pick colors from the hex code, quite cool. I brought the 3000T to our vacation that summer, and instead of going to the sea (which I hated) I stayed indoors to program. Naturally, I spent 50% (easily) of that time programming a script that would compile an animation through a landscape of fractal mountains. The final animation was quite cool.

I wish the game was more fun to play. It isn’t really my fault: no experience, no time, and crucially the people who imposed the game design on us were not players but exclusively concerned with showing the main character, whose frames occupy most of the memory. More than the game they cared about the graphic novel that came with it in the box. Of all the people buzzing around the game, I was the only one who actually played computer games (I still do). Still, of course, I could have improved the gameplay. Instead, on an ultra-tight deadline, I decided to embark on a 3D “escape from the tunnel” last level, something they didn’t ask for (they were happy with an animation, which would have cost me zero effort) and that probably almost no one has seen, since to get there you need to beat two grueling platform levels and a puzzle level, with one life and no save state. (The 3D level starts at 20:09 in this long play.) And this is what it means to be 14.

On the bright side, however, perhaps the game has its unique charm, and today it can be played on a hard disk through the emulator, even though the game was not designed to be installed on a hard disk. And this is great because playing the game from the disks is really painful, despite or in part because of a crazy color hack to make a 3D “change disk” message (another uncalled-for feature). And it’s a good feeling that after 30+ years some people are still watching and playing the game.

By Manu

Artificial Intelligence Just Lost a Leader

from Richard Lipton

Plus a position-search announcement from NSF

Roger Schank just passed away. Roger was a top leader of AI. I overlapped with him for my time at Yale. In 1974, he became a professor of computer science and psychology at Yale University. In 1981, Schank became Chairman of Computer Science at Yale and director of the Yale Artificial Intelligence Project. I was gone by then off to Berkeley.

John Brockman wrote: He is a computer scientist and cognitive psychologist who has worked in the AI field for twenty-five years. Like Marvin Minsky, he takes the strong AI view, but rather than trying to build an intelligent machine he wants to deconstruct the human mind. He wants to know, in particular, how natural language — one’s mother tongue — is processed, how memory works, and how learning occurs. Schank thinks of the human mind as a learning device, and he thinks that it is being taught in the wrong way. He is something of a gadfly; he deplores the curriculum-based, drill-oriented methods in today’s schools, and his most recent contributions have been in the area of education, looking at ways to use computers to enhance the learning process.

Roger’s View

Roger was a scary guy—especially for junior theory faculty like me. I did get along with Roger and he did support me when I was up for tenure. But he was scary no matter what. Part of the issue was that Roger did not think much of any research that was not AI based. Theory was not a top issue in his view.

Beat Roger Out

At Yale once Schank walked into Alan Perlis’s office and told Alan that he had thought about some post-doc position we had funding for. Schank began to explain to Perlis that he, Roger, had a couple of ideas of who he planned to hire to fill this position—of course the position was his. Perlis waved him off and said,

Lipton already has found a candidate and the position is gone.

I was not there, so all this is second-hand from Perlis, but I can imagine that Roger was not pleased—in his mind all resources of the department should have gone to build up the AI group. Roger was initially upset, but after this event he acted differently toward me. He treated me with more respect—in general theory was not his type of research. I always thought that he respected someone who “beat” him at the resource game, since he was such a strong player. I probably should not do it again, but doing it once was cool.

Years later, after Roger had moved to Northwestern University, he tried hard to hire me. Perhaps I should have gone. Oh well.

Open Problems

We want to announce this from Dilma Da Silva:

From the Division Director of NSF’s Computing and Communication Foundations (CCF) division: NSF/CCF is looking for an IPA (rotator) Program Director (PD) for the Algorithm Foundation cluster. The job posting for the IPA PD position is available here. My colleague Tracy Kimbrel and I will be happy to address any questions that potential applicants may have.


By rjlipton

TR23-008 | Limits of structures and Total NP Search Problems | Ondřej Ježil

from ECCC Papers

For a class of finite graphs, we define a limit object relative to some computationally restricted class of functions. The properties of the limit object then reflect how a computationally restricted viewer "sees" a generic instance from the class. The construction uses Krajíček's forcing with random variables [7]. We prove sufficient conditions for universal and existential sentences to be valid in the limit, provide several examples, and prove that such a limit object can then be expanded to a model of weak arithmetic. We then take the limit of all finite pointed paths to obtain a model of arithmetic where the problem OntoWeakPigeon is total but Leaf (the complete problem for $\textbf{PPA}$) is not. This can be viewed as a logical separation of the oracle classes of total NP search problems, which in our setting implies standard nonreducibility of Leaf to OntoWeakPigeon.

Saturday, February 04

Thursday, Feb 9th, 2023 — Amit Chakrabarti from Dartmouth College

The next Foundations of Data Science virtual talk series on recent advances in adversarially robust streaming will take place on Thursday, February 9th at 2:00 PM Pacific Time (17:00 Eastern Time, 23:00 Central European Time, 22:00 UTC). Amit Chakrabarti from Dartmouth College will talk about “How to color your adversary’s graph” Details of the talkContinue reading "Thursday, Feb 9th, 2023 — Amit Chakrabarti from Dartmouth College"

The next Foundations of Data Science virtual talk series on recent advances in adversarially robust streaming will take place on Thursday, February 9th at 2:00 PM Pacific Time (17:00 Eastern Time, 23:00 Central European Time, 22:00 UTC). Amit Chakrabarti from Dartmouth College will talk about “How to color your adversary’s graph”

Details of the talk (Zoom link) are available here.

An n-vertex graph with maximum degree D is (D+1)-colorable: an almost trivial combinatorial result, with an equally simple greedy algorithm to produce a (D+1)-coloring. However, given a stream of edges of such a graph, can we maintain a valid (D+1)-coloring as the edges arrive, while using not much more than O(n) space? What if the edges are chosen by an adversary who can look at our current coloring and add additional edges to try to confound us? This is the newly-popular setting of adversarially robust streaming algorithms and this talk is about the coloring problem in this setting.
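For intuition, the "almost trivial" offline fact can be sketched in a few lines of Python (this is only the classical offline greedy baseline, not the streaming or adversarial setting the talk is about): each vertex takes the smallest color unused by its already-colored neighbors, and since at most D neighbors can block colors, a color in {0, ..., D} is always free.

```python
# Offline greedy (D+1)-coloring sketch; the talk's adversarially robust
# streaming setting is much harder, this only illustrates the baseline.
def greedy_coloring(n, edges):
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    color = [None] * n
    for v in range(n):
        used = {color[u] for u in adj[v] if color[u] is not None}
        # deg(v) neighbors block at most deg(v) <= D colors, so some
        # color in {0, ..., deg(v)} is always free.
        color[v] = next(c for c in range(len(adj[v]) + 1) if c not in used)
    return color

# A 4-cycle has maximum degree D = 2, so at most D + 1 = 3 colors are used.
print(greedy_coloring(4, [(0, 1), (1, 2), (2, 3), (3, 0)]))  # → [0, 1, 0, 1]
```

The streaming difficulty is that the edges arrive one at a time and an adversary may choose the next edge after seeing the current coloring, so a committed color may later become invalid.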

We obtain upper and lower bound results for this problem. In O(n polylog n) space, we can maintain an O(D^(5/2))-coloring of such an adversarial graph. On the other hand, every adversarially robust coloring algorithm under the same space limitation must spend Omega(D^2) colors. We in fact prove more general results that trade off the space usage against the color budget. One interesting by-product of our work is that, in combination with the celebrated Assadi-Chen-Khanna algorithm (SODA 2019), it provides the first separation between randomized and deterministic algorithms for the (ordinary, non-robust) streaming graph coloring problem.

Based on joint works [C.-Ghosh-Stoeckl] and [Assadi-C.-Ghosh-Stoeckl].

The series is supported by the NSF HDR TRIPODS Grant 1934846.

By dstheory

Friday, February 03

Postdoc at IST Austria (apply by March 15, 2023)

from CCI: jobs

Postdoctoral positions at IST Austria are available, funded by the ERC Advanced Grant “Modern Dynamic Data Structures”, led by Prof. Monika Henzinger

Website: https://ist.ac.at/en/job/postdoc-research-group-monika-henzinger/
Email: monika.henzinger@ista.ac.at

By shacharlovett

TR23-007 | Extended Nullstellensatz proof systems | Jan Krajicek

from ECCC Papers

For a finite set $\cal F$ of polynomials over a fixed finite prime field of size $p$ containing all polynomials $x^2 - x$, a Nullstellensatz proof of the unsolvability of the system $$f = 0\ ,\ \mbox{ all } f \in {\cal F}$$ is a linear combination $\sum_{f \in {\cal F}} \ h_f \cdot f$ that equals to $1$ in the ring of polynomials. The measure of complexity of such a proof is its degree: $\max_f deg(h_f f)$. We study the problem to establish degree lower bounds for some {\em extended} NS proof systems: these systems prove the unsolvability of $\cal F$ by proving the unsolvability of a bigger set ${\cal F}\cup {\cal E}$, where set $\cal E$ may use more variables $r$ and contains polynomials $r^p - r$ for them, and satisfies the following soundness condition: -- Any $0,1$-assignment ${\overline a}$ to variables ${\overline x}$ can be appended by an assignment ${\overline b}$ to variables $\overline r$ such that for all $g \in {\cal E}$ it holds that $g(\overline a, \overline b) = 0$. We define a notion of pseudo-solutions of $\cal F$ and prove that the existence of pseudo-solutions with suitable parameters implies lower bounds for two extended NS proof systems ENS and UENS defined in Buss et al. (1996/97). Further we give a combinatorial example of $\cal F$ and candidate pseudo-solutions based on the pigeonhole principle.

This Game Is Not Going To Analyze Itself

Authors: Aviv Adler, Joshua Ani, Lily Chung, Michael Coulombe, Erik D. Demaine, Yevhenii Diomidov, Dylan Hendrickson, Jayson Lynch

We analyze the puzzle video game This Game Is Not Going To Load Itself, where the player routes data packets of three different colors from given sources to given sinks of the correct color. Given the sources, sinks, and some previously placed arrow tiles, we prove that the game is in Sigma_2^P; in NP for sources of equal period; NP-complete for three colors and six equal-period sources with player input; and even without player input, simulating the game is both NP- and coNP-hard for two colors and many sources with different periods. On the other hand, we characterize which locations for three data sinks admit a perfect placement of arrow tiles that guarantee correct routing no matter the placement of the data sources, effectively solving most instances of the game as it is normally played.

Order-Preserving Squares in Strings

Authors: Paweł Gawrychowski, Samah Ghazawi, Gad M. Landau

An order-preserving square in a string is a fragment of the form $uv$ where $u\neq v$ and $u$ is order-isomorphic to $v$. We show that a string $w$ of length $n$ over an alphabet of size $\sigma$ contains $\mathcal{O}(\sigma n)$ order-preserving squares that are distinct as words. This improves the upper bound of $\mathcal{O}(\sigma^{2}n)$ by Kociumaka, Radoszewski, Rytter, and Wale\'n [TCS 2016]. Further, for every $\sigma$ and $n$ we exhibit a string with $\Omega(\sigma n)$ order-preserving squares that are distinct as words, thus establishing that our upper bound is asymptotically tight. Finally, we design an $\mathcal{O}(\sigma n)$ time algorithm that outputs all order-preserving squares that occur in a given string and are distinct as words. By our lower bound, this is optimal in the worst case.

A Universal Technique for Machine-Certified Proofs of Linearizable Algorithms

Authors: Prasad Jayanti, Siddhartha Jayanti, Ugur Y. Yavuz, Lizzie Hernandez

Linearizability has been the long standing gold standard for consistency in concurrent data structures. However, proofs of linearizability can be long and intricate, hard to produce, and extremely time consuming even to verify. In this work, we address this issue by introducing simple $universal$, $sound$, and $complete$ proof methods for producing machine-verifiable proofs of linearizability and its close cousin, strong linearizability. Universality means that our method works for any object type; soundness means that an algorithm can be proved correct by our method only if it is linearizable (resp. strong linearizable); and completeness means that any linearizable (resp. strong linearizable) implementation can be proved so using our method. We demonstrate the simplicity and power of our method by producing proofs of linearizability for the Herlihy-Wing queue and Jayanti's single-scanner snapshot, as well as a proof of strong linearizability of the Jayanti-Tarjan union-find object. All three of these proofs are machine-verified by TLAPS (the Temporal Logic of Actions Proof System).

Constant RMR Recoverable Mutex under System-wide Crashes

Authors: Prasad Jayanti, Siddhartha Jayanti, Anup Joshi

We design two Recoverable Mutual Exclusion (RME) locks for the system-wide crash model. Our first algorithm requires only $O(1)$ space per process, and achieves $O(1)$ worst-case RMR complexity in the CC model. Our second algorithm enhances the first algorithm to achieve (the same) $O(1)$ space per process and $O(1)$ worst-case RMR complexity in both the CC and DSM models. Furthermore, both algorithms allow dynamically created threads of arbitrary names to join the protocol and access the locks. To our knowledge, these are the only RME locks to achieve worst-case $O(1)$ RMR complexity assuming nothing more than standard hardware support. In light of Chan and Woelfel's $\Omega(\log n / \log\log n)$ worst-case RMR lower bound for RME in the individual crash model, our results show a separation between the system-wide crash and individual crash models in worst-case RMR complexity in both the CC and DSM models.

Rethinking Warm-Starts with Predictions: Learning Predictions Close to Sets of Optimal Solutions for Faster $\text{L}$-/$\text{L}^\natural$-Convex Function Minimization

Authors: Shinsaku Sakaue, Taihei Oki

An emerging line of work has shown that machine-learned predictions are useful to warm-start algorithms for discrete optimization problems, such as bipartite matching. Previous studies have shown time complexity bounds proportional to some distance between a prediction and an optimal solution, which we can approximately minimize by learning predictions from past optimal solutions. However, such guarantees may not be meaningful when multiple optimal solutions exist. Indeed, the dual problem of bipartite matching and, more generally, $\text{L}$-/$\text{L}^\natural$-convex function minimization have arbitrarily many optimal solutions, making such prediction-dependent bounds arbitrarily large. To resolve this theoretically critical issue, we present a new warm-start-with-prediction framework for $\text{L}$-/$\text{L}^\natural$-convex function minimization. Our framework offers time complexity bounds proportional to the distance between a prediction and the set of all optimal solutions. The main technical difficulty lies in learning predictions that are provably close to sets of all optimal solutions, for which we present an online-gradient-descent-based method. We thus give the first polynomial-time learnability of predictions that can provably warm-start algorithms regardless of multiple optimal solutions.

Speed-Oblivious Online Scheduling: Knowing (Precise) Speeds is not Necessary

Authors: Alexander Lindermayr, Nicole Megow, Martin Rapp

We consider online scheduling on unrelated (heterogeneous) machines in a speed-oblivious setting, where an algorithm is unaware of the exact job-dependent processing speeds. We show strong impossibility results for clairvoyant and non-clairvoyant algorithms and overcome them in models inspired by practical settings: (i) we provide competitive learning-augmented algorithms, assuming that (possibly erroneous) predictions on the speeds are given, and (ii) we provide competitive algorithms for the speed-ordered model, where a single global order of machines according to their unknown job-dependent speeds is known. We prove strong theoretical guarantees and evaluate our findings on a representative heterogeneous multi-core processor. These seem to be the first empirical results for algorithms with predictions that are performed in a non-synthetic environment on real hardware.

Thursday, February 02

Postdoc at Centre for Quantum Computer Science, University of Latvia (apply by February 20, 2023)

from CCI: jobs

Applications are invited for a postdoctoral position at the Centre for Quantum Computer Science, University of Latvia (https://www.quantum.lu.lv), led by Prof. Andris Ambainis. The Centre is among the strongest European research groups in quantum algorithms and currently consists of 18 researchers (including postdocs and graduate students).

Website: https://quantum.lu.lv/join-us/
Email: andris.ambainis@lu.lv

By shacharlovett

from Francis Bach

Among continuous optimization problems, convex problems (with convex objectives and convex constraints) define a class that can be solved efficiently with a variety of algorithms and with arbitrary precision. This is not true more generally when the convexity assumption is removed (see this post). This of course does not mean that (1) nobody should attempt to solve high-dimensional non-convex problems (in fact, the spell checker run on this document was trained solving such a problem…), and that (2) no other problems have efficient solutions.

In this post, we will look at a classical class of continuous optimization problems that can be solved efficiently, namely quadratic optimization problems on the Euclidean sphere or ball. That is, we look at solving $$\tag{1} \min_{ \| x\| \leqslant 1} \ \frac{1}{2} x^\top A x - b^\top x,$$ and $$\tag{2} \min_{ \| x\| = 1} \ \frac{1}{2} x^\top A x - b^\top x,$$ where $$\|x\|^2 = x^\top x$$ is the standard squared Euclidean norm.

The matrix $$A \in \mathbb{R}^{n \times n}$$ is assumed only symmetric (no need to be positive semi-definite), and $$b \in \mathbb{R}^n$$. Therefore, the objective may not be convex, and the constraint set (in the case of the sphere) is not convex either. We could replace the standard squared Euclidean norm $$x^\top x$$ by any Mahalanobis squared norm $$x^\top B x$$ for $$B$$ positive-definite, but to keep it simple, let’s only consider $$B = I$$.

Note that there are other continuous non-convex optimization problems that can be solved efficiently through a convex reformulation, such as the minimization of one-dimensional (trigonometric) polynomials, and more generally sum-of-squares problems (see this post). If you are aware of many more beyond combinatorial optimization problems, please let me know.

Special case of eigenvalues. If $$b=0$$ (no linear term), then the solution of Problem $$(2)$$ is the eigenvector associated with the smallest eigenvalue of $$A$$, while the solution of Problem $$(1)$$ is the same eigenvector if the smallest eigenvalue of $$A$$ is negative, and zero otherwise. This has a large number of applications, is the topic of tons of “spectral relaxation” works, and will not be the focus of this post.
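Even though it is not the focus of the post, this special case makes a handy sanity check; a minimal NumPy sketch on a random instance (variable names are my own):

```python
import numpy as np

# With b = 0, Problem (2) reduces to an eigenvalue problem:
# min_{||x||=1} (1/2) x^T A x is attained at a unit eigenvector
# for the smallest eigenvalue of A.
rng = np.random.default_rng(0)
M = rng.standard_normal((5, 5))
A = (M + M.T) / 2                      # symmetric, not necessarily PSD

eigvals, eigvecs = np.linalg.eigh(A)   # eigenvalues in ascending order
x_star = eigvecs[:, 0]
value = 0.5 * x_star @ A @ x_star
assert np.isclose(value, 0.5 * eigvals[0])

# No random unit vector does better than the bottom eigenvector.
for _ in range(100):
    y = rng.standard_normal(5)
    y /= np.linalg.norm(y)
    assert 0.5 * y @ A @ y >= value - 1e-12
```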

Applications

Problems $$(1)$$ and $$(2)$$ appear in several areas and first appeared in [2] in 1965 (if you know of an earlier reference, please let me know). The three main occurrences that I am aware of are in trust-region methods, constrained eigenvalue problems, and relaxation of binary optimization problems.

Trust-region methods. When minimizing a differentiable function $$f$$ on $$\mathbb{R}^n$$, the classical gradient descent iteration $$x^+ = x - \gamma f'(x)$$ can be seen as the solution of $$\min_{y \in \mathbb{R}^n} \ f(x) + f'(x)^\top ( y - x ) \mbox{ such that } \| y - x\| \leqslant \delta,$$ for $$\delta = \gamma \| f'(x)\|_2$$. This corresponds to minimizing the first-order Taylor expansion in a ball centered at $$x$$, and leads to the minimization of an affine function on a Euclidean ball. When using the second-order model, we get to solve $$\min_{y \in \mathbb{R}^n} \ f(x) + f'(x)^\top ( y - x ) + \frac{1}{2} ( y -x )^\top f^{\prime \prime}(x) ( y -x ) \mbox{ such that } \| y - x\| \leqslant \delta,$$ which can be cast as Problem $$(1)$$.

The intuitive idea is that the Taylor expansion is only local, so we optimize it locally, instead of globally, like the classical Newton method would do. Moreover, it is well-defined even for singular Hessians. See figure below and [4] for more details.
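To make the reduction explicit, the second-order model can be mapped to the data $$(A, b)$$ of Problem $$(1)$$ through the change of variable $$z = (y - x)/\delta$$, which turns the objective (up to the constant $$f(x)$$) into $$\frac{1}{2} z^\top (\delta^2 H) z - (-\delta g)^\top z$$ over $$\|z\| \leqslant 1$$. A small sketch of this mapping (the function name is mine):

```python
import numpy as np

def trust_region_model(grad, hess, delta):
    """Cast min_{||y - x|| <= delta} g^T (y-x) + (1/2)(y-x)^T H (y-x)
    into the form of Problem (1) via z = (y - x) / delta:
        min_{||z|| <= 1} (1/2) z^T (delta^2 H) z - (-delta g)^T z.
    Returns the pair (A, b) of Problem (1)."""
    return delta ** 2 * hess, -delta * grad

# Example: gradient g, Hessian H = I, trust-region radius 2.
g, H = np.array([1.0, 0.0]), np.eye(2)
A, b = trust_region_model(g, H, 2.0)
assert np.allclose(A, 4 * np.eye(2))
assert np.allclose(b, [-2.0, 0.0])
```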

Constrained eigenvalue problems. If we aim to minimize $$x^\top A x$$ subject to $$x^\top x = 1$$ and an affine constraint [3], then, by writing the affine constraint as $$x = Cz+d$$, we obtain the minimization of a quadratic-linear function subject to a quadratic-linear constraint, which we can rewrite in a form similar to Problem $$(2)$$.

Relaxation of binary optimization problems. When minimizing a linear-quadratic function on $$\{-1,1\}^n$$, we can relax it by replacing the constraint $$x \in \{-1,1\}^n$$ by $$x^\top x = n$$.

From local to global optimality conditions

Let’s now look at optimality conditions from first principles (see [5, 6] for more details), before relating them to a broader discussion on tight semi-definite relaxations.

Existence of minimizers. Minimizers always exist for $$f(x) = \frac{1}{2} x^\top A x - b^\top x$$ since $$f$$ is continuous and the two sets $$\mathbb{S} = \{ x \in \mathbb{R}^n, x^\top x = 1\}$$ and $$\mathbb{B} = \{ x \in \mathbb{R}^n, x^\top x \leqslant 1\}$$ are compact. Therefore, the problems are well-posed.

First-order necessary conditions on the sphere. We consider an optimal $$x \in \mathbb{S}$$. For any $$y \in \mathbb{S}$$ which is orthogonal to $$x$$, and any $$\theta \in \mathbb{R}$$, we have: $$f( \cos \theta \cdot x + \sin \theta \cdot y) = f(x) + f'(x)^\top y \cdot \theta + o(\theta).$$ Thus, since $$\cos \theta \cdot x + \sin \theta \cdot y$$ is always on $$\mathbb{S}$$, we must have $$f'(x)^\top y=0$$, and this holds for all $$y$$ orthogonal to $$x$$. Thus $$f'(x)$$ has to be proportional to $$x$$, that is, there exists $$\mu \in \mathbb{R}$$ such that $$f'(x) + \mu x = 0$$, that is, $$(A + \mu I) x = b$$.

First-order necessary conditions on the ball. If $$x \in \mathbb{B}$$ is optimal and in the interior, that is, $$x^\top x < 1$$, then we directly have $$f'(x) = 0$$. If $$x \in \mathbb{S}$$, it has to be optimal for the sphere, and thus there exists $$\mu \in \mathbb{R}$$ such that $$f'(x) + \mu x = 0$$. By considering that $$g: t \mapsto f(t x)$$ has to be minimized on $$[0,1]$$, with minimum at $$t=1$$, we must have $$g'(1) \leqslant 0$$, i.e., $$\mu = - f'(x)^\top x \geqslant 0$$.

In order to cover the interior case, we need to add the “complementary slackness” condition $$\mu ( 1 -x^\top x)=0$$.

Obtaining necessary conditions from Lagrange duality. We can obtain the same first-order optimality conditions using Lagrange duality, by adding a Lagrange multiplier $$\mu \in \mathbb{R}$$ for the equality constraint $$x^\top x = 1$$, or $$\mu \in \mathbb{R}_+$$ for the inequality constraint $$x^\top x \leqslant 1$$, and forming the Lagrangian $$\tag{3} \mathcal{L}(x,\mu) = \frac{1}{2} x^\top A x - b^\top x + \frac{1}{2} \mu ( x^\top x - 1).$$ A necessary condition is thus that the partial derivative with respect to $$x$$ is zero for a certain $$\mu$$, which is exactly the condition $$f'(x) + \mu x = 0$$ above.

Second-order conditions on the sphere. Assuming that $$f'(x) + \mu x = 0$$, with $$\mu$$ potentially negative (i.e., the first-order optimality conditions are satisfied), we then have, for any $$y \in \mathbb{S}$$, $$\begin{array}{rcl}f(y) & = & f(x) + f'(x)^\top(y-x) + \frac{1}{2}(x-y)^\top A ( x-y) \\ & = & f(x) + \frac{1}{2}(x-y)^\top ( A + \mu I) ( x-y) + \frac{\mu}{2} ( x^\top x - y^\top y). \end{array}$$ Thus, if $$x$$ is optimal, we must have $$(x-y)^\top ( A + \mu I) ( x-y) \geqslant 0$$ for all $$y \in \mathbb{S}$$, which implies that $$A+ \mu I \succcurlyeq 0$$. Note that our reasoning implies that the optimality condition, that is, the existence of $$\mu \in \mathbb{R}$$ such that $$\begin{array}{l} ( A+ \mu I) x = b \\ A+ \mu I \succcurlyeq 0 \\ x^\top x = 1 , \end{array}$$ is necessary and sufficient for the optimality of $$x$$. The sufficiency can also be obtained through Karush-Kuhn-Tucker (KKT) conditions, which apply regardless of convexity. This is one of the few problems where strong duality holds for a non-convex optimization problem.

Second-order necessary condition on the ball. We also get the following necessary and sufficient condition, that is, the existence of $$\mu \in \mathbb{R}_+$$ such that $$\begin{array}{l} ( A+ \mu I) x = b \\ A+ \mu I \succcurlyeq 0 \\ x^\top x \leqslant 1 \\ \mu \, ( 1 - x^\top x) = 0. \end{array}$$

In both cases, once $$\mu$$ is known, we can recover the optimizers $$x$$. We now focus on the sphere for simplicity.

Equivalence to a one-dimensional problem

We can define the function $$(M,u) \mapsto u^\top M^{-1} u$$ as the minimal $$t \in \mathbb{R}$$ such that the matrix $$\bigg( \begin{array}{cc} \!M\!\! & \!u\! \\[-.1cm] \!u^\top \!\! & \! t \! \end{array} \bigg) \succcurlyeq 0$$. It is thus jointly convex in $$(M,u)$$, and is infinite when $$M$$ is not positive semi-definite (PSD). When $$M$$ is PSD but not invertible, the function is finite if and only if $$u$$ is in the column space of $$M$$. We can define $$u^\top M^{-2} u$$ similarly.

We can now get the dual problem associated with the Lagrangian in $$(3)$$, by minimizing it with respect to $$x$$, leading to $$\max_{\mu \in \mathbb{R}} \ - \frac{\mu}{2} - \frac{1}{2} b^\top ( A+\mu I)^{-1} b,$$ which is a concave maximization problem in one dimension (with the constraint that $$A + \mu I \succcurlyeq 0$$).

Thus, a simple algorithm for solving the problem is to solve this one-dimensional concave maximization problem. Once an eigenvalue decomposition $$A = \sum_{i=1}^n \! \lambda_i u_i u_i^\top$$ has been obtained, we need to maximize $$\tag{4} - \frac{\mu}{2} - \frac{1}{2} \sum_{i=1}^n \frac{ (b^\top u_i)^2}{\lambda_i + \mu}.$$

Assuming that $$\lambda_1 \geqslant \lambda_2 \geqslant \cdots \geqslant \lambda_n$$, we have the constraint $$\lambda_n + \mu \geqslant 0$$. We first need to check if $$\mu = -\lambda_n$$ is the solution, which occurs when $$b^\top ( A+ \mu I)^{-2} b \leqslant 1$$ (the problem is then called “degenerate”, and this can only happen if $$b$$ is orthogonal to the eigensubspace of $$A$$ associated with the eigenvalue $$\lambda_n$$, which is rather uncommon). Otherwise, the maximum is attained at $$\mu > -\lambda_n$$ (note that since we have assumed $$b \neq 0$$, the problem is strictly concave and thus has a unique maximizer in $$\mu$$). Moreover, $$\mu$$ is characterized by the equation $$\tag{5} b^\top ( A+ \mu I)^{-2} b = 1,$$ which can be obtained directly from the optimality conditions.
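In the non-degenerate case, Equation $$(5)$$ can be solved by simple bisection on $$\mu > -\lambda_n$$, since $$\mu \mapsto b^\top (A+\mu I)^{-2} b$$ is decreasing there. A NumPy sketch (function and variable names are my own, not from the references):

```python
import numpy as np

def solve_sphere(A, b, tol=1e-13):
    """Minimize (1/2) x^T A x - b^T x over ||x|| = 1, non-degenerate case.

    Finds the root mu > -lambda_min of the secular equation
    b^T (A + mu I)^{-2} b = 1 by bisection, then x = (A + mu I)^{-1} b.
    """
    lam, U = np.linalg.eigh(A)      # eigenvalues in ascending order
    beta = U.T @ b                  # coordinates of b in the eigenbasis

    def phi(mu):                    # b^T (A + mu I)^{-2} b - 1, decreasing
        return np.sum(beta ** 2 / (lam + mu) ** 2) - 1.0

    lo, hi = -lam[0] + 1e-12, -lam[0] + 1.0
    while phi(hi) > 0:              # grow the bracket until phi(hi) <= 0
        hi += hi + lam[0]           # double the offset above -lambda_min
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if phi(mid) > 0 else (lo, mid)
    mu = 0.5 * (lo + hi)
    return U @ (beta / (lam + mu)), mu

rng = np.random.default_rng(0)
M = rng.standard_normal((6, 6))
A, b = (M + M.T) / 2, rng.standard_normal(6)
x, mu = solve_sphere(A, b)
assert np.isclose(np.linalg.norm(x), 1.0, atol=1e-6)        # on the sphere
assert np.allclose((A + mu * np.eye(6)) @ x, b, atol=1e-6)  # stationarity
assert np.linalg.eigvalsh(A + mu * np.eye(6))[0] >= -1e-8   # A + mu I >= 0
```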

This one-dimensional problem can be solved using Newton’s method [6, 7] to estimate $$\mu$$ given the eigendecomposition of $$A$$. There are also cheaper, less precise algorithms that do not require a full eigendecomposition. We will also see below a surprising reformulation as a simple eigenvalue problem.

Other “secular” equations. Equation $$(5)$$ is often referred to as a secular equation. There are other types of similar equations, in particular for rank-one perturbations of the symmetric eigenvalue problem [8, 9].

Semi-definite relaxations

We can now give a more modern take on the quadratic minimization problem on the sphere, using semi-definite programming. We can first rewrite the objective function in Equation $$(1)$$ as $$f(x) = \frac{1}{2}x^\top A x - b^\top x = \frac{1}{2} {\rm tr}(AX) - b^\top x,$$ with $$X = xx^\top$$. We now have a linear objective in $$(X,x)$$. Moreover, the matrix $$X$$ satisfies the convex constraints $$X \succcurlyeq xx^\top \Leftrightarrow \left( \begin{array}{cc} \!X\!\! & \!x\! \\[-.1cm] \!x^\top \!\!&\! 1\! \end{array} \right) \succcurlyeq 0,$$ and $${\rm tr}(X) = x^\top x = 1$$. However, the rank-one constraint $$X = xx^\top$$ is not convex.

A classical tool in optimization is to remove the rank-one constraint, and only obtain a lower bound (a so-called “relaxation”), with the following optimization problem: $$\tag{6} \min_{ X, x} \frac{1}{2} {\rm tr}(AX)-b^\top x \mbox{ such that } \left( \begin{array}{cc} \!X\!\! & \!x\! \\[-.1cm] \!x^\top \!\!&\! 1\! \end{array} \right) \succcurlyeq 0 \mbox{ and } {\rm tr}(X)=1.$$ One can check that the dual problem is exactly Equation $$(4)$$, and thus the relaxation is here tight. Moreover, the SDP formulation can be used to derive algorithms that do not need a full eigenvalue decomposition [12].

Semi-definite relaxation of QCQP’s. Problems $$(1)$$ and $$(2)$$ are in fact instances of quadratically constrained quadratic programming problems, and the problem $$(6)$$ is the usual semi-definite relaxation. It turns out that with a single constraint, such relaxations are always tight, owing to the S-lemma [10] (see a nice derivation in Boyd and Vandenberghe’s book [11, Appendix B.1]).

Tight sum-of-squares relaxation. Yet another reformulation is through sum-of-squares (see an earlier post), where we consider the feature vector $$\varphi(x) = {x \choose 1}$$ and represent non-negative functions as quadratic forms $$x \mapsto \varphi(x)^\top B \varphi(x)$$. The problem in $$(2)$$ can then be relaxed as $$\max_{c \in \mathbb{R}, \ B \succcurlyeq 0} c \ \mbox{ such that } \ f(x) - c = \varphi(x)^\top B \varphi(x),$$ which is exactly the tight SDP relaxation above.

Having fun with adding affine constraints. Recently I had to look at Problem $$(2)$$ with an extra affine constraint, which I will take for simplicity of the form $$c^\top x = 1$$ (for a vector $$c \in \mathbb{R}^n$$ such that $$\|c\| > 1$$, to avoid a trivial problem). By projecting on the subspace orthogonal to $$c \in \mathbb{R}^n$$, we obtain again a quadratic minimization problem, this time on a Euclidean sphere embedded in a space of dimension $$n-1$$. Therefore, we can apply the above techniques to the reduced problem. However, I did not want to do that; I wanted to keep the original formulation on $$\mathbb{R}^n$$, and so tried to use duality to solve it. Two natural possibilities emerge here.

In order to solve it, we could first imagine using Lagrange duality, with a Lagrange multiplier $$\mu$$ for the constraint $$x^\top x = 1$$ (this worked exactly without the extra affine constraint), and now an extra Lagrange multiplier $$\nu$$ for the constraint $$c^\top x = 1$$. This leads to the Lagrangian $$\mathcal{L}(x,\mu,\nu) = \frac{1}{2} x^\top A x - b^\top x + \frac{1}{2} \mu ( x^\top x - 1) + \nu( c^\top x - 1),$$ and thus, after a short calculation, the dual problem $$\max_{\mu,\nu \in \mathbb{R}} \ -\frac{\mu}{2} - \nu - \frac{1}{2} (b - \nu c)^\top ( A + \mu I)^{-1} (b - \nu c).$$ Another Lagrangian can be obtained with the equivalent constraint $$(c^\top x - 1)^2 = 0$$, leading to a new Lagrangian $$\mathcal{L}'(x,\mu,\nu') = \frac{1}{2} x^\top A x - b^\top x + \frac{1}{2} \mu ( x^\top x - 1) + \frac{1}{2} \nu' (c^\top x - 1)^2,$$ and then the dual problem $$\max_{\mu,\nu' \in \mathbb{R}} \ -\frac{\mu}{2} + \frac{\nu'}{2} - \frac{1}{2} (b+\nu' c)^\top ( A + \nu' cc^\top + \mu I)^{-1} (b+\nu' c).$$ Are they both tight? Make up your own mind and see the bottom of the post for the answer.

Amazing eigenvalue reformulations

The Newton method to solve the one-dimensional problem is efficient, but requires some safeguards to work properly, and a full eigenvalue decomposition. It turns out that one can obtain exact reformulations as eigenvalue problems for a single eigenvalue, for which efficient packages exist.

From [3], for the optimization on the sphere, we can obtain the optimal $$\mu$$ from the largest real eigenvalue of the following non-symmetric matrix: $$\left( \begin{array}{cc} \!-A\! & \!\! I\! \\[-.1cm] \! bb^\top \!& \!\! -A\! \end{array} \right).$$ Indeed, one can check that, in the non-degenerate case, given the optimal $$(x,\mu)$$, the vector $$y = \left( \begin{array}{c} \!(A+\mu I)^{-1} x \! \\[-.1cm] x \end{array} \right)$$ is an eigenvector of the $$2n \times 2n$$ matrix above, with eigenvalue $$\mu$$.

This leads to two lines of code to solve the problem, at least for the non-degenerate case! See more details in [3, 14], in particular, to deal with the degenerate case, often called the “hard case”. See the code snippets in Matlab, Julia, and Python.
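For illustration, here is a minimal NumPy sketch of the same idea, covering only the non-degenerate case. The instance $$A, b$$ below is a made-up example chosen so that the answer is easy to check by hand; a robust implementation would also handle the hard case.

```python
import numpy as np

# Sketch (non-degenerate case): the optimal multiplier mu is the largest real
# eigenvalue of the 2n x 2n matrix [[-A, I], [b b^T, -A]], and the minimizer
# then solves the stationarity condition (A + mu I) x = b.
A = np.diag([1.0, 3.0])        # made-up symmetric matrix
b = np.array([1.0, 0.0])       # made-up linear term
n = len(b)

M = np.block([[-A, np.eye(n)], [np.outer(b, b), -A]])
mu = max(ev.real for ev in np.linalg.eigvals(M) if abs(ev.imag) < 1e-9)
x = np.linalg.solve(A + mu * np.eye(n), b)   # (A + mu I) x = b
value = 0.5 * x @ A @ x - b @ x              # optimal value on the sphere
```

On this instance the recovered multiplier is $$\mu = 0$$ and the minimizer is $$x = (1, 0)$$; when $$A + \mu I$$ is singular (the “hard case”), this sketch breaks down and the techniques of [3, 14] are needed.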

Symmetric generalized eigenproblems. If you prefer symmetric matrices, one can obtain a similar result with the generalized eigenvector of the two matrices $$\left( \begin{array}{cc} \!I\!\! & \!\!-A\! \\ \!-A\!\! & \! bb^\top\! \end{array} \right) \ \mbox{ and } \ \left( \begin{array}{cc} \! 0 \! & \! \! I \\[-.1cm] I \!\! & \!\! 0 \end{array} \right).$$ If you want to avoid forming a potentially dense matrix $$bb^\top$$, you can use instead the matrices $$\left( \begin{array}{ccc} \!-1\! & \! 0\! & \! b^\top \! \\[-.1cm] \!0\! &\! I \!&\! -A \! \\[-.1cm]\! b \!&\! -A \!& \!0\! \end{array} \right) \ \mbox{ and } \ \left( \begin{array}{ccc} \! 0\! &\! 0 \!& \! 0\! \\[-.1cm]\! 0 \! & \! 0 \!&\! I \! \\[-.1cm] \! 0 \! &\! I \! &\! 0 \! \end{array} \right).$$ See all details in [14]. Note that beyond the two-line code above, which leads to precise solutions, more efficient algorithms exist that lead to approximate solutions [14, 15].

Conclusion

In this blog post, I described one of the few non-convex problems where strong duality holds. There are many other instances within combinatorial optimization (that is, with variables in $$\{0,1\}^n$$ or $$\{-1,1\}^n$$), in particular related to submodularity. I will hopefully cover these in future posts.

Acknowledgements. I would like to thank Alessandro Rudi, Gaspard Beugnot, and ChatGPT for helping with the code snippets.

References

[2] George E. Forsythe and Gene H. Golub. On the stationary values of a second-degree polynomial on the unit sphere. Journal of the Society for Industrial and Applied Mathematics, 13(4):1050-1068, 1965.
[3] Walter Gander, Gene H. Golub, and Urs Von Matt. A constrained eigenvalue problem. Linear Algebra and its Applications, 114:815-839, 1989.
[4] Andrew R. Conn, Nicholas I. M. Gould, and Philippe L. Toint. Trust Region Methods. Society for Industrial and Applied Mathematics, 2000.
[5] William W. Hager. Minimizing a quadratic over a sphere. SIAM Journal on Optimization, 12(1):188-208, 2001.
[6] Danny C. Sorensen. Newton’s method with a model trust region modification. SIAM Journal on Numerical Analysis, 19(2):409-426, 1982.
[7] Danny C. Sorensen. Minimization of a large-scale quadratic function subject to a spherical constraint. SIAM Journal on Optimization, 7(1):141-161, 1997.
[8] James R. Bunch, Christopher P. Nielsen, and Danny C. Sorensen. Rank-one modification of the symmetric eigenproblem. Numerische Mathematik, 31(1):31-48, 1978.
[9] Ming Gu and Stanley C. Eisenstat. A stable and efficient algorithm for the rank-one modification of the symmetric eigenproblem. SIAM Journal on Matrix Analysis and Applications, 15(4):1266-1276, 1994.
[10] Imre Pólik and Tamás Terlaky. A survey of the S-lemma. SIAM Review, 49(3):371-418, 2007.
[11] Stephen P. Boyd and Lieven Vandenberghe. Convex Optimization. Cambridge University Press, 2004.
[12] Franz Rendl and Henry Wolkowicz. A semidefinite framework for trust region subproblems with applications to large scale minimization. Mathematical Programming, 77(1):273-299, 1997.
[13] Satoru Adachi, Satoru Iwata, Yuji Nakatsukasa, and Akiko Takeda. Solving the trust-region subproblem by a generalized eigenvalue problem. SIAM Journal on Optimization, 27(1):269-291, 2017.
[14] Elad Hazan and Tomer Koren. A linear-time algorithm for trust region problems. Mathematical Programming, 158(1-2):363-381, 2016.
[15] Amir Beck and Yakov Vaisbourd. Globally solving the trust region subproblem using simple first-order methods. SIAM Journal on Optimization, 28(3):1951-1967, 2018.

Having fun with affine constraints

Let’s now look at the solution! The second relaxation is tight, while the first is not. To show that the first relaxation is not tight, we can simply exhibit a counterexample in dimension $$n = 2$$ (found by trying random matrices). For example, for $$A = \left( \begin{array}{cc} \!3\! &\! 0\! \\[-.1cm] \! 0\! & \!-2 \! \end{array} \right) , \ \ b = \left( \begin{array}{c} \! 0\! \\[-.1cm] \!-1 \! \end{array} \right), \mbox{ and } \ c = \left( \begin{array}{c} \! 0\! \\[-.1cm] \!2 \! \end{array} \right),$$ a minimizer is $$x = \left( \begin{array}{c}\! \sqrt{3}/2\! \\[-.1cm] \! 1/2 \! \end{array} \right)$$, with optimal value $$11/8$$, while the non-tight relaxation leads to a value of $$-1/2$$.
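This counterexample is easy to check numerically. The sketch below (not part of the original post's code) enumerates the feasible set, which here consists of only two points, and evaluates the first dual on a grid over $$(\mu, \nu)$$ with $$A + \mu I \succ 0$$, i.e. $$\mu > 2$$:

```python
import numpy as np

# Instance from the post: primal optimum is 11/8, first dual's supremum is -1/2.
A = np.diag([3.0, -2.0])
b = np.array([0.0, -1.0])
c = np.array([0.0, 2.0])

# Feasible set {x : ||x|| = 1, c^T x = 1} = {(+sqrt(3)/2, 1/2), (-sqrt(3)/2, 1/2)}.
feasible = [np.array([s * np.sqrt(3) / 2, 0.5]) for s in (1, -1)]
primal = min(0.5 * x @ A @ x - b @ x for x in feasible)

def dual1(mu, nu):
    # First dual function: -mu/2 - nu - (1/2)(b - nu c)^T (A + mu I)^{-1} (b - nu c)
    r = b - nu * c
    return -mu / 2 - nu - 0.5 * r @ np.linalg.solve(A + mu * np.eye(2), r)

# Coarse grid search; the values stay below -1/2, far from the primal 11/8.
dual_best = max(dual1(mu, nu)
                for mu in np.arange(2.001, 6.0, 0.05)
                for nu in np.arange(-1.5, 1.0, 0.01))
```

The grid maximum sits just below $$-1/2$$ (the supremum is approached as $$\mu \to 2^+$$, $$\nu = -1/2$$), exhibiting the duality gap.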

To show the tightness of the second relaxation, we first notice that the convex problem is equivalent to the following SDP relaxation: $$\min_{ X, x} \frac{1}{2} {\rm tr}(AX)-b^\top x \mbox{ such that } \left( \begin{array}{cc} \!X\!\! & \!x\! \\[-.1cm] \!x^\top \!\!&\! 1\! \end{array} \right) \succcurlyeq 0, {\rm tr}(X)=1, \mbox{ and } {\rm tr}(cc^\top X) - 2 t c^\top x + t^2 = 0.$$ Given the PSD constraint and the fact that $${\rm tr}(cc^\top X) - 2 t c^\top x + t^2 = {\rm tr} \left( \begin{array}{cc} \!X\!\! & \!x\! \\[-.1cm] \!x^\top \!\!&\! 1\! \end{array} \right)\left( \begin{array}{cc} \! cc^\top \! & \! -tc \! \\[-.1cm] \!-tc^\top \! & \! t^2 \! \end{array} \right),$$ the new constraint implies (since a PSD matrix $$Z$$ with $$v^\top Z v = 0$$ must satisfy $$Zv = 0$$) that $$\left( \begin{array}{cc} \!X\!\! & \!x\! \\[-.1cm] \!x^\top \!\!&\! 1\! \end{array} \right) \left( \begin{array}{c} \!c\! \\[-.1cm] \!-t \! \end{array} \right)= 0,$$ that is, $$Xc = t x$$ and $$c^\top x = t$$. One can then check that these constraints are exactly equivalent to a projection of the problem into a space of dimension $$n-1$$. The incorrect relaxation only has $$c^\top x = t$$, which is not enough. It took me a while to realize it…
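As a quick numerical sanity check (a sketch, not from the original post), one can verify the implication on the rank-one feasible point built from the counterexample's minimizer, with $$t = 1$$:

```python
import numpy as np

# Rank-one feasible point Z = [[X, x], [x^T, 1]] with X = x x^T. The affine
# constraint makes v^T Z v = 0 for v = (c, -t), which forces Z v = 0,
# i.e. X c = t x and c^T x = t.
x = np.array([np.sqrt(3) / 2, 0.5])   # minimizer from the counterexample
c = np.array([0.0, 2.0])
t = 1.0

X = np.outer(x, x)
Z = np.block([[X, x[:, None]], [x[None, :], np.ones((1, 1))]])
v = np.append(c, -t)

quad = v @ Z @ v     # equals tr(cc^T X) - 2 t c^T x + t^2
res = Z @ v          # vanishes since Z is PSD and v^T Z v = 0
```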

By Francis Bach

Responsibility

Nature laid out their ground rules for large language models like ChatGPT, including

No LLM tool will be accepted as a credited author on a research paper. That is because any attribution of authorship carries with it accountability for the work, and AI tools cannot take such responsibility.

Let's focus on the last word "responsibility". What does that mean for an author? It means we can hold an author, or set of authors, responsible for any issues in the paper such as

• The proofs, calculations, code, formulas, measurements, statistics, and other details of the research.
• Any interpretations or conclusions made in the article.
• Properly citing related work, especially work that calls into question the novelty of this research.
• Ensuring the article does not contain text identical or very similar to previous work.
• Anything else described in the article.
The authors should take reasonable measures to ensure that a paper is free from any issues above. Nobody is perfect and if you make a mistake in a paper, you should, as with all mistakes, take responsibility and acknowledge the problems, do everything you can to rectify the issues, such as publishing a corrigendum if needed, and work to ensure you won't make similar mistakes in the future.
Mistakes can arise outside of an author's actions. Perhaps a computer chip makes faulty calculations, you relied on a faulty theorem in another paper, your main result appeared in a paper fifteen years ago in an obscure journal, a LaTeX package for the journal created some mistakes in the formulas or a student who helped with the research or exposition took a lazy way out, or you put too much trust in AI generative text. Nevertheless the responsibility remains with the authors.
Could an AI ever take responsibility for an academic paper? Would a toaster ever take responsibility for burning my breakfast?

By Lance Fortnow


Assistant/Associate Professor at Ecole polytechnique, Institut Polytechnique de Paris (apply by March 15, 2023)

from CCI: jobs

Ecole polytechnique, leading engineering school in France, member of the Institut Polytechnique de Paris, welcomes outstanding applications for one assistant Monge Professor in Computer Science, specialty “Foundations of Computer Science”, a tenure-track teaching/research position.

Website: https://portail.polytechnique.edu/informatique/en/recruitment2023
Email: samuel.mimram@polytechnique.edu


By shacharlovett

Mathematics of the impossible: Computational Complexity, Chapter 3: The grand challenge

from Emanuele Viola


All posts in this series.
A PDF version of this post will be published with a delay, but if you’d like to have it soon let me know.

Contents

1 The grand challenge
1.1 Information bottleneck: Palindromes requires quadratic time on TMs
1.2 Counting: impossibility results for non-explicit functions
1.3 Diagonalization: Enumerating machines
1.3.1 $\text {TM-Time}(o(n\log n))=\text {TM-Time}(n)$
1.4 Circuits
1.4.1 The circuit won’t fit in the universe: Non-asymptotic, cosmological results
1.5 Problems

As mentioned in Chapter ??, our ability to prove impossibility results related to efficient computation appears very limited. We can now express this situation more precisely with the models we’ve introduced since then.

It is consistent with our knowledge that any problem in a standard algorithm textbook can be solved

1. in Time $cn^{2}$ on a TM, and
2. in Time $cn$ on a 2-TM, and
3. by circuits of size $cn$.

Note that 2. implies 1. by Theorem ??.

In this chapter we begin to present several impossibility results, covering a variety of techniques which will be used later as well. As hinted above, they appear somewhat weak. However, jumping ahead, there is a flip side to all of this:

1. At times, contrary to our intuition, stronger impossibility results are actually false. One example appears in Chapter ??. A list will be given later.
2. Many times, the impossibility results that we can prove turn out to be, surprisingly, just “short” of proving major results. Here by “major result” I mean a result that would be phenomenal and that was in focus long before the connection was established. We will see several examples of this (Section ??, Section ??).
3. Yet other times, one can identify broad classes of proof techniques, and argue that impossibility results can’t be proved with them (Section ??).

Given this situation, I don’t subscribe to the general belief that stronger impossibility results are true and we just can’t prove them.

1.1 Information bottleneck: Palindromes requires quadratic time on TMs

Intuitively, the weakness of TMs is the bottleneck of passing information from one end of the tape to the other. We now show how to formalize this and use it to show that deciding if a string is a palindrome requires quadratic time on TMs, which is tight and likely matches the time in Exercise ??. The same bound can be shown for other functions; palindromes just happen to be convenient to obtain matching bounds.

Theorem 1.1. $\text {Palindromes}\not \in \text {TM-Time}(t(n))$ for any $t(n)=o(n^{2})$.

More precisely, for every $n$ and $s$, an $s$-state TM that decides if an $n$-bit input is a palindrome requires time $\ge cn^{2}/\log s$.

The main concept that allows us to formalize the information bottleneck mentioned above is the following.

Definition 1.1. A crossing sequence of a TM M on input $x$ and boundary $i$, abbreviated $i$-CS, is the sequence of states that $M$ is transitioning to when crossing cell boundary $i$ (i.e., going from Cell $i$ to $i+1$ or vice versa) during the computation on $x$.

The idea in the proof is very interesting. If $M$ accepts inputs $x$ and $y$ and those two inputs have the same $i$-CS for some $i$, then we can “stitch together” the computation of $M$ on $x$ and $y$ at boundary $i$ to create a new input $z$ that is still accepted by $M$. The input $z$ is formed by picking bits from $x$ to the left of cell boundary $i$ and bits from $y$ to the right of $i$:

\begin{aligned} z:=x_{1}x_{2}\cdots x_{i}y_{i+1}y_{i+2}\cdots y_{n}. \end{aligned}

The proof that $z$ is still accepted is left as an exercise.

Now, for many problems, input $z$ should not be accepted by $M$, and this gives a contradiction. In particular this will be the case for palindromes. We are going to find two palindromes $x$ and $y$ that have the same $i$-CS for some $i$, but the corresponding $z$ is not a palindrome, yet it is still accepted by $M$. We can find these two palindromes if $M$ takes too little time. The basic idea is that if $M$ runs in time $t$, because $i$-CSs for different $i$ correspond to different steps of the computation, for every input there is a value of $i$ such that the $i$-CS is short, namely has length at most $t(|x|)/n$. If $t(n)$ is much less than $n^{2}$, the length of this CS is much less than $n$, from which we can conclude that the number of CSs is much less than the number of inputs, and so we can find two inputs with the same CS.

Proof. Let $n$ be divisible by four, without loss of generality, and consider palindromes of the form

\begin{aligned} p(x):=x0^{n/2}x^{R} \end{aligned}

where $x\in \{0,1\} ^{n/4}$ and $x^{R}$ is the reverse of $x$.

Assume there are $x\ne y$ in $\{0,1\} ^{n/4}$ and $i$ in the middle part, i.e., $n/4\le i\le 3n/4-1$, so that the $i$-CS of $M$ on $p(x)$ and $p(y)$ is the same. Then we can define $z:=x0^{n/2}y^{R}$ which is not a palindrome but is still accepted by $M$, concluding the proof.

It remains to prove that the assumption of Theorem 1.1 implies the assumption in the previous paragraph. Suppose $M$ runs in time $t$. Since crossing sequences at different boundaries correspond to different steps of the computation, for every $x\in \{0,1\} ^{n/4}$ there is a value of $i$ in the middle part such that the $i$-CS of $M$ on $p(x)$ has length $\le ct/n$. This implies that there is an $i$ in the middle s.t. there are $\ge c2^{n/4}/n$ inputs $x$ for which the $i$-CS of $M$ on $p(x)$ has length $\le ct/n$.

For fixed $i$, the number of $i$-CS of length $\le \ell$ is $\le (s+1)^{\ell }$.

Hence there are $x\ne y$ for which $p(x)$ and $p(y)$ have the same $i$-CS whenever $c2^{n/4}/n\ge (s+1)^{ct/n}$. Taking logs one gets $ct\log (s)/n\le cn$. QED
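The stitching step in the proof can be made concrete with a short sketch on specific strings (here $n = 8$, and $x, y$ are arbitrary distinct strings in $\{0,1\}^{n/4}$):

```python
# Sketch of the stitching step for palindromes p(x) = x 0^{n/2} x^R: combining
# the part of p(x) left of a middle boundary i with the part of p(y) right of i
# yields z = x 0^{n/2} y^R, which is no longer a palindrome when x != y.
def p(x, n):
    return x + "0" * (n // 2) + x[::-1]

def is_palindrome(w):
    return w == w[::-1]

n = 8
x, y = "01", "10"              # distinct strings in {0,1}^{n/4}
i = n // 2                     # a boundary in the middle part
z = p(x, n)[:i] + p(y, n)[i:]  # stitch the two computations at boundary i
```

If $M$ has the same $i$-CS on $p(x)$ and $p(y)$, it accepts $z$ even though $z$ is not a palindrome.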

Exercise 1.1. For every $s$ and $n$ describe an $s$-state TM deciding palindromes in time $cn^{2}/\log s$ (matching Theorem 1.1).

Exercise 1.2. Let $L:=\{xx:x\in \{0,1\} ^{*}\}$. Show $L\in \text {TM-Time}(cn^{2})$, and prove this is tight up to constants.

One may be tempted to think that it is not hard to prove stronger bounds for similar functions. In fact as mentioned above this has resisted all attempts!

1.2 Counting: impossibility results for non-explicit functions

Proving the existence of hard functions is simple: Just count. If there are more functions than efficient machines, some function is not efficiently computable. This is applicable to any model; next we state it for TMs for concreteness. Later we will state it for circuits.

Theorem 1.2. There exists a function $f:\{0,1\} ^{n}\to \{0,1\}$ that cannot be computed by a TM with $s$ states unless $cs\log s\ge 2^{n}$, regardless of time.

Proof. The number of TMs with $s$ states is $\le s^{cs}$, and each TM computes at most one function (it may compute none, if it does not stop). The number of functions on $n$ bits is $2^{2^{n}}$. Hence if $2^{n}>cs\log s$ some function cannot be computed. QED
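The count can be checked with exact integer arithmetic; here is a small sketch, taking the constant $c$ to be $1$ for concreteness:

```python
# With n = 16 there are 2^(2^16) boolean functions on n bits, while the number
# of s-state machines is at most s^s under this (c = 1) convention. Crossing
# the threshold 2^n > s log s is what makes uncomputable functions appear.
n = 16
num_functions = 2 ** (2 ** n)       # 2^65536 functions f: {0,1}^16 -> {0,1}
machines_small = 4096 ** 4096       # s = 4096 gives 2^49152: too few machines
machines_large = 8192 ** 8192       # s = 8192 gives 2^106496: counting no longer applies
```

So 4096-state machines cannot cover all functions on 16 bits, while at 8192 states the counting argument is silent.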

Note this bound is not far from that in Exercise ??.

It is instructive to present this basic result as an application of the probabilistic method:

Proof. Let us pick $f$ uniformly at random. We want to show that

\begin{aligned} \mathbb {P}_{f}[\exists \text { an }s\text {-state TM }M\text { such that }M(x)=f(x)\text { for every }x\in \{0,1\} ^{n}]<1. \end{aligned}

Indeed, if the probability is less than $1$ then some function exists that cannot be computed. By a union bound we can say that this probability is

\begin{aligned} \le \sum _{M}\mathbb {P}_{f}[M(x)=f(x)\text { for every }x\in \{0,1\} ^{n}], \end{aligned}

where the sum is over all $s$-state machines. Each probability in the sum is $(1/2)^{2^{n}}$, since $M$ is fixed. The number of $s$-state machines is $\le s^{cs}$. So the sum is $\le s^{cs}(1/2)^{2^{n}}$, and we can conclude as before taking logs. QED

1.3 Diagonalization: Enumerating machines

Can you compute more if you have more time? For example, can you write a program that runs in time $n^{2}$ and computes something that cannot be computed in time $n^{1.5}$? The answer is yes for trivial reasons if we allow for non-boolean functions.

Exercise 1.3. Give a function $f:\{0,1\}^* \to \{0,1\}^*$ in $\text {Time}(n^{2})\setminus \text {Time}(n^{1.5})$.

The answer is more interesting if the functions are boolean. Such results are known as time hierarchies, and a generic technique for proving them is diagonalization, applicable to any model.

We first illustrate the result in the simpler case of partial functions, which contains the main ideas. Later we discuss total functions.

Theorem 1.3. There is a partial function in $\text {TM-Time}(t(n))$ such that any TM $M$ computing it runs in time $\ge c_{M}t(n)$, for any $t(n)=\omega (1)$.

In other words, $\text {Time}(t(n))\supsetneq \text {Time}(o(t(n)))$.

Proof. Consider the TM $H$ that on input $x=(M,1^{n-|M|})$ of length $n$ runs $M$ on $x$ until it stops and then complements the answer. (We can use a simple encoding of these pairs, for example every even-position bit of the description of $M$ is a $0$.)

Now define $X$ to be the subset of pairs s.t. $M$ runs in time $\le t(n)/|M|^{c}$ on inputs of length $n$, and $|M|^{c}\le t(n)/2$.

On these inputs, $H$ runs in time $|M|^{c}+|M|^{c}\cdot t(n)/|M|^{c}\le t(n)$, as desired. To accomplish this, $H$ can begin by making a copy of $M$ in time $|M|^{c}\le t(n)/2$. Then every step of the computation of $M$ can be simulated by $H$ with $|M|^{c}$ steps, always keeping the description of $M$ to the left of the head.

Now suppose $N$ computes the same function as $H$ in time $t(n)/|N|^{c}$. Note that $x:=(N,1^{n-|N|})$ falls in the domain $X$ of the function, for $n$ sufficiently large, using that $t(n)=\omega (1)$. Now consider running $N$ on $x$. We have $N(x)=H(x)$ by supposition, but $H(x)$ is the complement of $N(x)$, contradiction. QED

This proof is somewhat unsatisfactory; in particular we have no control on the running time of $H$ on inputs not in $X$. It is desirable to have a version of this fundamental result for total functions. Such a version is stated next. It requires additional assumptions on $t$ and a larger gap between the running times. Perhaps surprisingly, as we shall discuss, both requirements are essential.

Theorem 1.4. Let $t(n)\ge n$ be a function. Suppose that $f(x):=t(|x|)$ is in $\text {TM-Time}(t(n)/\log ^{c}n)$.

There is a total function in $\text {TM-Time}(ct(n)\log t(n))$ that cannot be computed by any TM $M$ in time $c_{M}t(n)$.

The assumption about $t$ is satisfied by all standard functions, including all those in this book. (For example, take $t(n):=n^{2}$. The corresponding $f$ is then $|x|^{2}$. To compute $f$ on input $x$ of $n$ bits we can first compute $|x|=n$ in time $cn\log n$ (Exercise ??). This is a number of $b:=\log n$ bits. We can then square this number in time $b^{c}$. Note that the time to compute $f(x)$ is dominated by the $cn\log n$ term coming from computing $|x|$, which does not depend on $t$ and is much less than the $n^{2}/\log ^{c}n$ in the assumption.) The assumption cannot be removed altogether because there exist pathological functions $t$ for which the result is false.

The proof is similar to that of Theorem 1.3. However, to make the function total we need to deal with arbitrary machines, which may not run in the desired time or even stop at all. The solution is to clock $H$ in a manner similar to the proof of the universal machine, Lemma ??.

Also, we define a slightly different language to give a stronger result – a unary language – and to avoid some minor technical details (the possibility that the computation of $f$ erases the input).

We define a TM $H$ that on input $1^{n}$ obtains a description of a TM $M$, computes $t(n)$, and then simulates $M$ on input $1^{n}$ for $t(n)$ steps in a way similar to Lemma ??, and if $M$ stops then $H$ outputs the complement of the output of $M$; and if $M$ does not stop then $H$ stops and outputs anything. Now $H$ computes a function in time about $t(n)$. We argue that this function cannot be computed in much less time as follows. Suppose some TM $M$ computes the function in time somewhat less than $t(n)$. Then we can pick an $1^{n}$ for which $H$ obtains the description of this $M$, and the simulation always stops. Hence, for that $1^{n}$ we would obtain $M(1^{n})=H(1^{n})=1-M(1^{n})$, which is a contradiction.

However, there are interesting differences with the simulation in Lemma ??. In that lemma the universal machine $U$ was clocking the steps of the machine $M$ being simulated. Now instead we need to clock the steps of $U$ itself, even while $U$ is parsing the description of $M$ to compute its transition function. This is necessary to guarantee that $H$ does not waste time on big TM descriptions.

Whereas in Lemma ?? the tape was arranged as

\begin{aligned} (x,M,\underline {i},t',y), \end{aligned}

it will now be arranged as

\begin{aligned} (x,M',\underline {i},t',M'',y) \end{aligned}

which is parsed as follows. The description of $M$ is $M'M''$, $M$ is in state $\underline {i}$, the tape of $M$ contains $xy$ and the head is on the left-most symbol of $y$. The integer $t'$ is the counter decreased at every step.

Proof. Define a TM $H$ that on input $1^{n}$:

1. Compute $(n,t(n),1^{n})$.
2. Compute $(M_{n},t(n),1^{n})$. Here $M_{n}$ is obtained from $n$ by removing all left-most zeroes until the first $1$. I.e., if $n=0^{j}1x$ then $M_{n}=x$. This is similar to the fact that a program does not change if you add, say, empty lines at the bottom.
3. Simulate $M_{n}$ on $1^{n}$, reducing the counter $t(n)$ at every step, including those parsing $M_{n}$, as explained before.
4. If $M_{n}$ stops before the counter reaches $0$, output the complement of its output. If the counter reaches $0$ stop and output anything.

Running time of $H$.

1. Computing $n$ is similar to Exercise ??. By assumption $t(n)$ is computable in time $t(n)/\log ^{c}n$. Our definition of computation allows for erasing the input, but we can keep $n$ around spending at most another $\log ^{c}n$ factor. Thus we can compute $(n,t(n))$ in time $t(n)$. Finally, in case it was erased, we can re-compute $1^{n}$ in time $cn\log n$ by keeping a counter (initialized to a copy of $n$).
2. This takes time $c(n+t(n))$: simply scan the input and remove zeroes.
3. Decreasing the counter takes $c|t(n)|$ steps. Hence this simulation will take $ct(n)\log t(n)$ time.

Overall the running time is $ct(n)\log t(n)$.

Proof that the function computed by $H$ requires much time. Suppose some TM $M$ computes the same function as $H$. Consider inputs $1^{n}$ where $n=0^{j}1M$. Parsing the description of $M$ to compute its transition function takes time $c_{M}$, a value that depends on $M$ only and not on $j$. Hence $H$ will simulate $\lfloor t(n)/c_{M}\rfloor$ steps of $M$. If $M$ stops within that time (which requires $t(n)$ to be larger than $b_{M}$, and so $n$ and $j$ sufficiently large compared to $M$) then the simulation terminates and we reach a contradiction as explained before. QED

The extra $\log t(n)$ factor cannot be reduced because of the surprising result presented in Theorem 1.5 showing that, on TMs, time $o(n\log n)$ equals time $n$ for computing total functions.

However, tighter time hierarchies hold for more powerful models, like RAMs. Also, a time hierarchy for total functions for $\text {BPTime}$ is… an open problem! But a hierarchy is known for partial functions.

Exercise 1.4. (1) State and prove a tighter time hierarchy for Time (which recall corresponds to RAMs) for total functions. You don’t need to address simulation details, but you need to explain why a sharper separation is possible.

(2) Explain the difficulty in extending (1) to $\text {BPTime}$.

(3) State and prove a time hierarchy for $\text {BPTime}$ for partial functions.

1.3.1 $\text {TM-Time}(o(n\log n))=\text {TM-Time}(n)$

In this subsection we prove the result in the title, which we also mentioned earlier. First let us state the result formally.

Theorem 1.5. [12] Let $f:\{0,1\}^* \to \{0,1\}$ be in $\text {TM-Time}(t(n))$ for a $t(n)=o(n\log n)$. Then $f\in \text {TM-Time}(n)$.

Note that time $n$ is barely enough to scan the input. Indeed, the corresponding machines in Theorem 1.5 will only move the head in one direction.

The rest of this section is devoted to proving the above theorem. Let $M$ be a machine for $f$ witnessing the assumption of the theorem. We can assume that $M$ stops on every input (even though our definition of time only applies to large enough inputs), possibly by adding $\le n$ to the time, which does not change the assumption on $t(n)$. The theorem now follows from the combination of the next two lemmas.

Lemma 1.1. Let $M$ be a TM running in time $t(n)\le o(n\log n)$. Then on every input $x\in \{0,1\}^*$ every $i$-CS with $i\le |x|$ has length $\le c_{M}$.

Proof. For an integer $b$ let $x(b)$ be a shortest input such that there exists $j\in \{0,1,\ldots ,|x(b)|\}$ for which the $j$-CS in the computation of $M$ on $x(b)$ has length $\ge b$. Let $n(b):=|x(b)|$.

Exercise 1.5. Prove $n(b)\to \infty$ for $b\to \infty$.

There are $n(b)+1\ge n(b)$ tape boundaries within or bordering $x(b)$. If we pick a boundary uniformly at random, the average length of a CS on $x(b)$ is $\le t(n(b))/n(b)$. Hence there are $\ge n(b)/2$ choices for $i$ s.t. the $i$-CS on $x(b)$ has length $\le 2t(n(b))/n(b)$.

The number of such crossing sequences is

\begin{aligned} \le (s+1)^{2t(n(b))/n(b)}=(s+1)^{o(n(b)\log n(b))/n(b)}=n(b)^{o(\log s)}. \end{aligned}

Hence, the same crossing sequence occurs at $\ge (n(b)/2)/n(b)^{o(\log s)}\ge 4$ positions $i$, using that $n(b)$ is large enough.

Of these four, one could be the CS of length $\ge b$ from the definition of $x(b)$. Of the other three, two are on the same side of $j$. We can remove the corresponding interval of the input without removing the CS of length $\ge b$. Hence we obtained a shorter input with a CS of length $\ge b$. Contradiction. QED

Lemma 1.2. Suppose $f:\{0,1\}^* \to \{0,1\}$ is computable by a TM such that on every input $x$, every $i$-CS with $i\le |x|$ has length $\le b$. Then $f$ is computable in time $n$ by a TM with $c_{b}$ states that only moves the head in one direction.

Exercise 1.6. Prove this.

1.4 Circuits

The situation for circuits is similar to that of 2-TMs, and by Theorem ?? we know that proving $\omega (n\log n)$ bounds on circuits is harder than for 2-TMs. Even a bound of $cn$ is unknown. The following is a circuit version of Theorem 1.2. Again, the bound is close to what we saw in Theorem ??.

Theorem 1.6. [3] There are functions $f:\{0,1\} ^{n}\to \{0,1\}$ that require circuits of size $\ge (1-o(1))2^{n}/n$, for every $n$.

One can prove a hierarchy for circuit size, by combining Theorem 1.6 and Theorem ??. As stated, the results give that circuits of size $cs$ compute more functions than those of size $s$. In fact, the “$o(1)$” in the theorems is small, so one can prove a sharper result. But a stronger and more enjoyable argument exists.

Theorem 1.7. [Hierarchy for circuit size] For every $n$ and $s\le c2^{n}/n$ there is a function $f:\{0,1\} ^{n}\to \{0,1\}$ that can be computed by circuits of size $s+cn$ but not by circuits of size $s$.

Proof. Consider a path from the all-zero function to a function which requires circuits of size $\ge s$, guaranteed to exist by Theorem 1.6, changing the output of the function on one input at each step of the path. Let $h$ be the first function that requires size $>s$, and let $h'$ be the function right before that in the path. Note that $h'$ has circuits of size $\le s$, and $h$ can be computed from $h'$ by changing the value on a single input. The latter can be implemented by circuits of size $cn$. QED
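The last step, changing the value on a single input $a$ with about $n$ extra gates, can be sketched as follows; the particular $h'$ and $a$ below are arbitrary illustrative choices:

```python
from itertools import product

n = 4
a = (1, 0, 1, 1)   # the single input where the value is flipped

def h_prime(x):
    # stand-in for the function h' with circuits of size <= s
    return x[0] ^ (x[1] & x[2])

def eq_a(x):
    # indicator [x == a]: an AND of n literals (x_i if a_i = 1, else NOT x_i),
    # implementable with about n gates
    out = 1
    for xi, ai in zip(x, a):
        out &= xi if ai else 1 - xi
    return out

def h(x):
    # one extra XOR gate on top of h' and eq_a
    return h_prime(x) ^ eq_a(x)

flipped = [x for x in product((0, 1), repeat=n) if h(x) != h_prime(x)]
```

Enumerating all inputs confirms that $h$ and $h'$ differ exactly at $a$, so the circuit for $h$ costs only the indicator plus one XOR beyond the circuit for $h'$.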

Exercise 1.7. Prove a stronger hierarchy result for alternating circuits, where the “$cn$” in Theorem 1.7 is replaced with “$c$.”

In fact, this improvement is possible even for (non-alternating) circuits, see Problem 1.2.

1.4.1 The circuit won’t fit in the universe: Non-asymptotic, cosmological results

Most of the results in this book are asymptotic, i.e., we do not explicitly work out the constants because they become irrelevant for larger and larger input lengths. As stated, these results don’t say anything for inputs of a fixed length. For example, any function on $10^{100}$ bits is in Time$(c)$.

However, it is important to note that all the proofs are constructive and one can work out the constants, and produce non-asymptotic results. We state next one representative example when this was done. It is about a problem in logic, an area which often produces very hard problems.

On an alphabet of size $63$, the language used to write formulas includes first-order variables that range over $\mathbb {N}$, second-order variables that range over finite subsets of $\mathbb {N}$, the predicates “$y=x+1$” and “$x\in S$” where $x$ and $y$ denote first-order variables and $S$ denotes a set variable, and standard quantifiers, connectives, constants, binary relation symbols on integers, and set equality. For example one can write things like “every finite set has a maximum:” $\forall S\exists x\in S\forall y\in S,y\le x.$

Theorem 1.8. [4] To decide the truth of logical formulas of length at most $610$ in this language requires a circuit containing at least $10^{125}$ gates. So even if each gate were the size of a proton, the circuit would not fit in the known universe.

Their result applies even to randomized circuits with error $1/3$, if $610$ is replaced with $614$. (We can define randomized circuits analogously to BPTime.)

1.5 Problems

Problem 1.1. Hierarchy Theorem 1.4 only gives a function $f$ that cannot be computed fast on all large enough input lengths: it is consistent with the theorem that $f$ can be computed fast on infinitely many input lengths.

Give a function $f:\{0,1\}^* \to \{0,1\}^*$ mapping each $x$ to a string in $\{0,1\} ^{\log \log \log |x|}$ that is computable in time $n^{c}$ but such that for any TM $M$ running in time $n^{2}$ the following holds: for every $n\ge c_{M}$ and every $x\in \{0,1\} ^{n}$ we have $M(x)\ne f(x)$.

Hint: Note the range of $f$. Can this result hold as stated with range $\{0,1\}$?

Problem 1.2. Replace “$cn$” in Theorem 1.7 with “$c$.”

Problem 1.3. Prove that $\{0^{i}1^{i}:i\ge 0\}\in \text {TM-Time}(cn\log n)\setminus \text {TM-Time}(n)$.

This shows Theorem 1.5 is tight, and gives an explicit problem witnessing the time-hierarchy separation in Theorem 1.4, for a specific time bound. For the negative result, don’t use pumping lemmas or other characterization results not covered in this book.

Problem 1.4. The following argument contradicts Theorem 1.4; what is wrong with it?

“By Theorem 1.5, $\text {TM-Time}(n\log ^{0.9}n)=\text {TM-Time}(n)$. By padding (Theorem 1.5), $\text {TM-Time}(n\log ^{1.1}n)=\text {TM-Time}(n\log ^{0.9}n)$. Hence $\text {TM-Time}(n\log ^{1.1}n)=\text {TM-Time}(n).$”

References

[1]   F. C. Hennie. One-tape, off-line Turing machine computations. Information and Control, 8(6):553–578, 1965.

[2]   Kojiro Kobayashi. On the structure of one-tape nondeterministic Turing machine time hierarchy. Theor. Comput. Sci., 40:175–193, 1985.

[3]   Claude E. Shannon. The synthesis of two-terminal switching circuits. Bell System Tech. J., 28:59–98, 1949.

[4]   Larry Stockmeyer and Albert R. Meyer. Cosmological lower bound on the circuit complexity of a small problem in logic. J. ACM, 49(6):753–784, 2002.

By Manu

Professor in Theoretical Computer Science at The University of Melbourne (apply by February 12, 2023)

from CCI: jobs

The University of Melbourne is seeking an outstanding academic with expertise in Theoretical Computer Science to join an internationally recognised group of academics within the School of Computing and Information Systems in the Faculty of Engineering and Information Technology. We are seeking to appoint at the Full Professor level. The University is an equal opportunity employer.

Website: https://jobs.unimelb.edu.au/en/job/911413/professor-in-theoretical-computing-science
Email: awirth@unimelb.edu.au

By shacharlovett

Hardness of braided quantum circuit optimization in the surface code

Authors: Kunihiro Wasa, Shin Nishio, Koki Suetsugu, Michael Hanks, Ashley Stephens, Yu Yokoi, Kae Nemoto

Large-scale quantum information processing requires the use of quantum error correcting codes to mitigate the effects of noise in quantum devices. Topological error-correcting codes, such as surface codes, are promising candidates as they can be implemented using only local interactions in a two-dimensional array of physical qubits. Procedures such as defect braiding and lattice surgery can then be used to realize a fault-tolerant universal set of gates on the logical space of such topological codes. However, error correction also introduces a significant overhead in computation time, the number of physical qubits, and the number of physical gates. While optimizing fault-tolerant circuits to minimize this overhead is critical, the computational complexity of such optimization problems remains unknown. This ambiguity leaves room for doubt surrounding the most effective methods for compiling fault-tolerant circuits for a large-scale quantum computer. In this paper, we show that the optimization of a special subset of braided quantum circuits is NP-hard by a polynomial-time reduction of the optimization problem into a specific problem called Planar Rectilinear 3SAT.

Parameterized Complexity of Weighted Team Definability

Authors: Juha Kontinen, Yasir Mahmood, Arne Meier, Heribert Vollmer

In this article, we study the complexity of weighted team definability for logics with team semantics. This problem is a natural analogue of one of the most studied problems in parameterized complexity, the notion of weighted Fagin-definability, which is formulated in terms of satisfaction of first-order formulas with free relation variables. We focus on the parameterized complexity of weighted team definability for a fixed formula phi of central team-based logics. Given a first-order structure A and the parameter value k as input, the question is to determine whether A,T models phi for some team T of size k. We show several results on the complexity of this problem for dependence, independence, and inclusion logic formulas. Moreover, we also relate the complexity of weighted team definability to the complexity classes in the well-known W-hierarchy as well as paraNP.

An automated, geometry-based method for the analysis of hippocampal thickness

Authors: Kersten Diers, Hannah Baumeister, Frank Jessen, Emrah Düzel, David Berron, Martin Reuter

The hippocampus is one of the most studied neuroanatomical structures due to its involvement in attention, learning, and memory as well as its atrophy in ageing, neurological, and psychiatric diseases. Hippocampal shape changes, however, are complex and cannot be fully characterized by a single summary metric such as hippocampal volume as determined from MR images. In this work, we propose an automated, geometry-based approach for the unfolding, point-wise correspondence, and local analysis of hippocampal shape features such as thickness and curvature. Starting from an automated segmentation of hippocampal subfields, we create a 3D tetrahedral mesh model as well as a 3D intrinsic coordinate system of the hippocampal body. From this coordinate system, we derive local curvature and thickness estimates as well as a 2D sheet for hippocampal unfolding. We evaluate the performance of our algorithm with a series of experiments to quantify neurodegenerative changes in Mild Cognitive Impairment and Alzheimer's disease dementia. We find that hippocampal thickness estimates detect known differences between clinical groups and can determine the location of these effects on the hippocampal sheet. Further, thickness estimates improve classification of clinical groups and cognitively unimpaired controls when added as an additional predictor. Comparable results are obtained with different datasets and segmentation algorithms. Taken together, we replicate canonical findings on hippocampal volume/shape changes in dementia, extend them by gaining insight into their spatial localization on the hippocampal sheet, and provide additional, complementary information beyond traditional measures. We provide a new set of sensitive processing and analysis tools for the analysis of hippocampal geometry that allows comparisons across studies without relying on image registration or requiring manual intervention.

Faster maximal clique enumeration in large real-world link streams

Authors: Alexis Baudin, Clémence Magnien, Lionel Tabourier

Link streams offer a good model for representing interactions over time. They consist of links $(b,e,u,v)$, where $u$ and $v$ are vertices interacting during the whole time interval $[b,e]$. In this paper, we deal with the problem of enumerating maximal cliques in link streams. A clique is a pair $(C,[t_0,t_1])$, where $C$ is a set of vertices that all interact pairwise during the full interval $[t_0,t_1]$. It is maximal when neither its set of vertices nor its time interval can be increased. Some of the main works solving this problem are based on the famous Bron-Kerbosch algorithm for enumerating maximal cliques in graphs. We take this idea as a starting point to propose a new algorithm which matches the cliques of the instantaneous graphs formed by links existing at a given time $t$ to the maximal cliques of the link stream. We prove its validity and compute its complexity, which is better than the state-of-the-art ones in many cases of interest. We also study the output-sensitive complexity, which is close to the output size, thereby showing that our algorithm is efficient. To confirm this, we perform experiments on link streams used in the state of the art, and on massive link streams, up to 100 million links. In all cases our algorithm is faster, mostly by a factor of at least 10 and up to a factor of $10^4$. Moreover, it scales to massive link streams for which the existing algorithms are not able to provide the solution.
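For background, the classic Bron-Kerbosch recursion on a static graph, which the paper takes as its starting point, can be sketched as follows (an illustrative sketch of the textbook algorithm, not the paper's link-stream variant):

```python
def bron_kerbosch(R, P, X, adj, out):
    """Enumerate maximal cliques: R = current clique, P = candidates,
    X = already-processed vertices (used to reject non-maximal cliques)."""
    if not P and not X:
        out.append(set(R))  # R is a maximal clique
        return
    for v in list(P):
        bron_kerbosch(R | {v}, P & adj[v], X & adj[v], adj, out)
        P.remove(v)
        X.add(v)

# Example: a triangle {0,1,2} with a pendant edge {2,3}.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
cliques = []
bron_kerbosch(set(), set(adj), set(), adj, cliques)
# cliques: {0, 1, 2} and {2, 3}
```

The link-stream setting additionally tracks, for each clique, the maximal time interval over which all its links coexist.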

On the Within-Group Discrimination of Screening Classifiers

Authors: Nastaran Okati, Stratis Tsirtsis, Manuel Gomez Rodriguez

Screening classifiers are increasingly used to identify qualified candidates in a variety of selection processes. In this context, it has been recently shown that, if a classifier is calibrated, one can identify the smallest set of candidates which contains, in expectation, a desired number of qualified candidates using a threshold decision rule. This lends support to focusing on calibration as the only requirement for screening classifiers. In this paper, we argue that screening policies that use calibrated classifiers may suffer from an understudied type of within-group discrimination -- they may discriminate against qualified members within demographic groups of interest. Further, we argue that this type of discrimination can be avoided if classifiers satisfy within-group monotonicity, a natural monotonicity property within each of the groups. Then, we introduce an efficient post-processing algorithm based on dynamic programming to minimally modify a given calibrated classifier so that its probability estimates satisfy within-group monotonicity. We validate our algorithm using US Census survey data and show that within-group monotonicity can be often achieved at a small cost in terms of prediction granularity and shortlist size.

Differentially-Private Hierarchical Clustering with Provable Approximation Guarantees

Hierarchical Clustering is a popular unsupervised machine learning method with decades of history and numerous applications. We initiate the study of differentially private approximation algorithms for hierarchical clustering under the rigorous framework introduced by (Dasgupta, 2016). We show strong lower bounds for the problem: that any $\epsilon$-DP algorithm must exhibit $\Omega(|V|^2/ \epsilon)$-additive error for an input dataset $V$. Then, we exhibit a polynomial-time approximation algorithm with $O(|V|^{2.5}/ \epsilon)$-additive error, and an exponential-time algorithm that meets the lower bound. To overcome the lower bound, we focus on the stochastic block model, a popular model of graphs, and, with a separation assumption on the blocks, propose a private $1+o(1)$ approximation algorithm which also recovers the blocks exactly. Finally, we perform an empirical study of our algorithms and validate their performance.

Exploring Wedges of an Oriented Grid by an Automaton with Pebbles

Authors: Subhash Bhagat, Andrzej Pelc

A mobile agent, modeled as a deterministic finite automaton, navigates in the infinite anonymous oriented grid $\mathbb{Z} \times \mathbb{Z}$. It has to explore a given infinite subgraph of the grid by visiting all of its nodes. We focus on the simplest subgraphs, called {\em wedges}, spanned by all nodes of the grid located between two half-lines in the plane, with a common origin. Many wedges turn out to be impossible to explore by an automaton that cannot mark nodes of the grid. Hence, we study the following question: Given a wedge $W$, what is the smallest number $p$ of (movable) pebbles for which there exists an automaton that can explore $W$ using $p$ pebbles? Our main contribution is a complete solution of this problem. For each wedge $W$ we determine this minimum number $p$, show an automaton that explores it using $p$ pebbles and show that fewer pebbles are not enough. We show that this smallest number of pebbles can vary from 0 to 3, depending on the angle between half-lines limiting the wedge and depending on whether the automaton can cross these half-lines or not.

Adding an Edge in a $P_4$-sparse Graph

Authors: Anna Mpanti, Stavros D. Nikolopoulos, Leonidas Palios

The minimum completion (fill-in) problem is defined as follows: Given a graph family $\mathcal{F}$ (more generally, a property $\Pi$) and a graph $G$, the completion problem asks for the minimum number of non-edges needed to be added to $G$ so that the resulting graph belongs to the graph family $\mathcal{F}$ (or has property $\Pi$). This problem is NP-complete for many subclasses of perfect graphs and polynomial solutions are available only for minimal completion sets. We study the minimum completion problem of a $P_4$-sparse graph $G$ with an added edge. For any optimal solution of the problem, we prove that there is an optimal solution whose form is of one of a small number of possibilities. This along with the solution of the problem when the added edge connects two non-adjacent vertices of a spider or connects two vertices in different connected components of the graph enables us to present a polynomial-time algorithm for the problem.

Sublinear Approximation Schemes for Scheduling Precedence Graphs of Bounded Depth

Authors: Bin Fu, Yumei Huo, Hairong Zhao

We study the classical scheduling problem on parallel machines with precedence constraints, where the precedence graph has bounded depth $h$. Our goal is to minimize the maximum completion time. We focus on developing approximation algorithms that use only sublinear space or sublinear time. We develop the first one-pass streaming approximation schemes using sublinear space when all jobs' processing times differ by no more than a constant factor $c$ and the number of machines $m$ is at most $\tfrac {2n \epsilon}{3 h c }$. This is so far the best approximation we can have in terms of $m$, since no polynomial-time approximation better than $\tfrac{4}{3}$ exists when $m = \tfrac{n}{3}$ unless P=NP. The algorithms are then extended to the more general problem where the largest $\alpha n$ jobs, for some constant $0 < \alpha \le 1$, have no more than a factor-$c$ difference. We also develop the first sublinear-time algorithms for both problems. For the more general problem, when $m \le \tfrac { \alpha n \epsilon}{20 c^2 \cdot h }$, our algorithm is a randomized $(1+\epsilon)$-approximation scheme that runs in sublinear time. This work not only provides an algorithmic solution to the studied problem in a big-data and cloud-computing environment, but also gives a methodological framework for designing sublinear approximation algorithms for other scheduling problems.

Durable Algorithms for Writable LL/SC and CAS with Dynamic Joining

Authors: Prasad Jayanti, Siddhartha Jayanti, Sucharita Jayanti

We present durable implementations for two well known universal primitives -- CAS (compare-and-swap), and its ABA-free counter-part LLSC (load-linked, store-conditional). All our implementations are: writable, meaning they support a Write() operation; have constant time complexity per operation; allow for dynamic joining, meaning newly created processes (a.k.a. threads) of arbitrary names can join a protocol and access our implementations; and have adaptive space complexities, meaning the space use scales in the number of processes $n$ that actually use the objects, as opposed to previous protocols which are designed for a maximum number of processes $N$. Our durable Writable-CAS implementation, DuraCAS, requires $O(m + n)$ space to support $m$ objects that get accessed by $n$ processes, improving on the state-of-the-art $O(m + N^2)$. By definition, LLSC objects must store "contexts" in addition to object values. Our Writable-LLSC implementation, DuraLL, requires $O(m + n + C)$ space, where $C$ is the number of "contexts" stored across all the objects. While LLSC has an advantage over CAS due to being ABA-free, the object definition seems to require additional space usage. To address this trade-off, we define an External Context (EC) variant of LLSC. Our EC Writable-LLSC implementation is ABA-free and has a space complexity of just $O(m + n)$.

To our knowledge, we are the first to present durable CAS algorithms that allow for dynamic joining, and our algorithms are the first to exhibit adaptive space complexities. To our knowledge, we are the first to implement any type of durable LLSC objects.

Approximating Red-Blue Set Cover

Authors: Eden Chlamtáč, Yury Makarychev, Ali Vakilian

We provide a new approximation algorithm for the Red-Blue Set Cover problem and give a new hardness result. Our approximation algorithm achieves $\tilde O(m^{1/3})$-approximation improving on the $\tilde O(m^{1/2})$-approximation due to Elkin and Peleg (where $m$ is the number of sets). Additionally, we provide a nearly approximation preserving reduction from Min $k$-Union to Red-Blue Set Cover that gives an $\tilde\Omega(m^{1/4 - \varepsilon})$ hardness under the Dense-vs-Random conjecture.
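As a reminder of the problem being approximated, here is a brute-force sketch of Red-Blue Set Cover on toy instances (problem definition only, with names of my own; the paper's contribution is a polynomial-time $\tilde O(m^{1/3})$-approximation, not exhaustive search):

```python
from itertools import combinations

def red_blue_set_cover(sets, red, blue):
    """Pick a subfamily covering every blue element while touching as
    few red elements as possible; exponential-time exhaustive search."""
    best = None
    for r in range(1, len(sets) + 1):
        for fam in combinations(sets, r):
            covered = set().union(*fam)
            if blue <= covered:  # all blue elements covered
                cost = len(covered & red)
                if best is None or cost < best[0]:
                    best = (cost, fam)
    return best  # (number of red elements touched, chosen subfamily)
```

For example, with sets $\{1,a\}$, $\{2,a\}$, $\{1,2,b\}$, red $= \{a,b\}$ and blue $= \{1,2\}$, the optimum touches a single red element.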

A Nearly-Optimal Bound for Fast Regression with $\ell_\infty$ Guarantee

Authors: Zhao Song, Mingquan Ye, Junze Yin, Lichen Zhang

Given a matrix $A\in \mathbb{R}^{n\times d}$ and a vector $b\in \mathbb{R}^n$, we consider the regression problem with $\ell_\infty$ guarantees: finding a vector $x'\in \mathbb{R}^d$ such that $\|x'-x^*\|_\infty \leq \frac{\epsilon}{\sqrt{d}}\cdot \|Ax^*-b\|_2\cdot \|A^\dagger\|$ where $x^*=\arg\min_{x\in \mathbb{R}^d}\|Ax-b\|_2$. One popular approach for solving such $\ell_2$ regression problems is via sketching: pick a structured random matrix $S\in \mathbb{R}^{m\times n}$ with $m\ll n$ such that $SA$ can be quickly computed, then solve the “sketched” regression problem $\arg\min_{x\in \mathbb{R}^d} \|SAx-Sb\|_2$. In this paper, we show that in order to obtain such $\ell_\infty$ guarantee for $\ell_2$ regression, one has to use sketching matrices that are dense. To the best of our knowledge, this is the first use case in which dense sketching matrices are necessary. On the algorithmic side, we prove that there exists a distribution of dense sketching matrices with $m=\epsilon^{-2}d\log^3(n/\delta)$ such that solving the sketched regression problem gives the $\ell_\infty$ guarantee, with probability at least $1-\delta$. Moreover, the matrix $SA$ can be computed in time $O(nd\log n)$. Our row count is nearly-optimal up to logarithmic factors, and significantly improves the result in [Price, Song and Woodruff, ICALP'17], in which a super-linear in $d$ number of rows, $m=\Omega(\epsilon^{-2}d^{1+\gamma})$ for $\gamma=\Theta(\sqrt{\frac{\log\log n}{\log d}})$, is required. We also develop a novel analytical framework for $\ell_\infty$ guarantee regression that utilizes the Oblivious Coordinate-wise Embedding (OCE) property introduced in [Song and Yu, ICML'21]. Our analysis is arguably much simpler and more general than [Price, Song and Woodruff, ICALP'17], and it extends to dense sketches for tensor product of vectors.
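A minimal numpy sketch of the sketch-and-solve paradigm described above, using a plain dense Gaussian sketch as a stand-in for the structured dense sketches the paper constructs (which additionally admit the fast $O(nd\log n)$ application). The instance and dimensions are arbitrary toy choices:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, m = 2000, 10, 200  # sketch has m rows, m << n

A = rng.standard_normal((n, d))
x_true = rng.standard_normal(d)
b = A @ x_true + 0.1 * rng.standard_normal(n)

# Exact least-squares solution x* = argmin ||Ax - b||_2.
x_star, *_ = np.linalg.lstsq(A, b, rcond=None)

# Dense Gaussian sketch; solve the sketched problem argmin ||SAx - Sb||_2.
S = rng.standard_normal((m, n)) / np.sqrt(m)
x_sketch, *_ = np.linalg.lstsq(S @ A, S @ b, rcond=None)

# Coordinate-wise (ell_infty) error of the sketched solution.
err_inf = np.max(np.abs(x_sketch - x_star))
print(err_inf)
```

On well-conditioned Gaussian data like this, the coordinate-wise error is tiny relative to the residual norm, which is the flavor of guarantee the paper proves for its sketch distributions.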

Flip-width: Cops and Robber on dense graphs

Authors: Symon Toruńczyk

We define new graph parameters that generalize tree-width, degeneracy, and generalized coloring numbers for sparse graphs, and clique-width and twin-width for dense graphs. Those parameters are defined using variants of the Cops and Robber game, in which the robber has speed bounded by a fixed constant $r\in\mathbb N\cup\{\infty\}$, and the cops perform flips (or perturbations) of the considered graph. We propose a new notion of tameness of a graph class, called bounded flip-width, which is a dense counterpart of classes of bounded expansion of Ne\v{s}et\v{r}il and Ossona de Mendez, and includes classes of bounded twin-width of Bonnet, Kim, Thomass\'e and Watrigant. We prove that boundedness of flip-width is preserved by first-order interpretations, or transductions, generalizing previous results concerning classes of bounded expansion and bounded twin-width. We provide an algorithm approximating the flip-width of a given graph, which runs in slicewise polynomial time (XP) in the size of the graph. We also propose a more general notion of tameness, called almost bounded flip-width, which is a dense counterpart of nowhere dense classes, and includes all structurally nowhere dense classes. We conjecture, and provide evidence, that classes with almost bounded flip-width coincide with monadically dependent classes, introduced by Shelah in model theory.

Improved Exact and Heuristic Algorithms for Maximum Weight Clique

Authors: Roman Erhardt, Kathrin Hanauer, Nils Kriege, Christian Schulz, Darren Strash

We propose improved exact and heuristic algorithms for solving the maximum weight clique problem, a well-known problem in graph theory with many applications. Our algorithms interleave successful techniques from related work with novel data reduction rules that use local graph structure to identify and remove vertices and edges while retaining the optimal solution. We evaluate our algorithms on a range of synthetic and real-world graphs, and find that they outperform the current state of the art on most inputs. Our data reductions always produce smaller reduced graphs than existing data reductions alone. As a result, our exact algorithm, MWCRedu, finds solutions orders of magnitude faster on naturally weighted, medium-sized map labeling graphs and random hyperbolic graphs. Our heuristic algorithm, MWCPeel, outperforms its competitors on these instances, but is slightly less effective on extremely dense or large instances.
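For illustration, the underlying problem can be stated as a few lines of brute force on a hypothetical toy graph (the paper's algorithms, MWCRedu and MWCPeel, are of course far more sophisticated than exhaustive search):

```python
from itertools import combinations

# Hypothetical toy graph: vertex weights and an edge set.
weights = {0: 4, 1: 3, 2: 5, 3: 2}
edges = {(0, 1), (1, 2), (0, 2), (2, 3)}

def is_clique(vs):
    # Every pair of chosen vertices must be adjacent.
    return all((a, b) in edges or (b, a) in edges
               for a, b in combinations(vs, 2))

best_w, best_clique = 0, ()
for k in range(1, len(weights) + 1):
    for vs in combinations(weights, k):
        if is_clique(vs):
            w = sum(weights[v] for v in vs)
            if w > best_w:
                best_w, best_clique = w, vs

print(best_w, best_clique)  # the maximum weight clique
```

Here the triangle {0, 1, 2} of total weight 12 beats the heavier-per-vertex edge {2, 3}.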

The Investment Management Game: Extending the Scope of the Notion of Core

Authors: Vijay V. Vazirani

The core is a dominant solution concept in economics and game theory. In this context, the following question arises: ``How versatile is this solution concept?'' We note that within game theory, this notion has been used for profit -- equivalently, cost or utility -- sharing only. In this paper, we show a completely different use for it: in an {\em investment management game}, in which an agent needs to allocate her money among investment firms in such a way that {\em in each of exponentially many future scenarios}, sufficient money is available in the ``right'' firms so she can buy an ``optimal investment'' for that scenario.

We study a restriction of this game to {\em perfect graphs} and characterize its core. Our characterization is analogous to Shapley and Shubik's characterization of the core of the assignment game. The difference is the following: whereas their characterization follows from {\em total unimodularity}, ours follows from {\em total dual integrality}. The latter is another novelty of our work.

Parameterized Algorithms for Colored Clustering

Authors: Leon Kellerhals, Tomohiro Koana, Pascal Kunz, Rolf Niedermeier

In the Colored Clustering problem, one is asked to cluster edge-colored (hyper-)graphs whose colors represent interaction types. More specifically, the goal is to select as many edges as possible without choosing two edges that share an endpoint and are colored differently. Equivalently, the goal can also be described as assigning colors to the vertices in a way that fits the edge-coloring as well as possible. As this problem is NP-hard, we build on previous work by studying its parameterized complexity. We give a $2^{\mathcal O(k)} \cdot n^{\mathcal O(1)}$-time algorithm where $k$ is the number of edges to be selected and $n$ the number of vertices. We also prove the existence of a problem kernel of size $\mathcal O(k^{5/2} )$, resolving an open problem posed in the literature. We consider parameters that are smaller than $k$, the number of edges to be selected, and $r$, the number of edges that can be deleted. Such smaller parameters are obtained by considering the difference between $k$ or $r$ and some lower bound on these values. We give both algorithms and lower bounds for Colored Clustering with such parameterizations. Finally, we settle the parameterized complexity of Colored Clustering with respect to structural graph parameters by showing that it is $W[1]$-hard with respect to both vertex cover number and tree-cut width, but fixed-parameter tractable with respect to slim tree-cut width.
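The equivalence between the edge-selection and vertex-coloring views mentioned above can be checked by brute force on a hypothetical toy instance; here we enumerate vertex colorings directly and count edges that "fit" (both endpoints receive the edge's color):

```python
from itertools import product

# Hypothetical toy instance: colored edges on 4 vertices.
edges = [((0, 1), "red"), ((1, 2), "red"), ((2, 3), "blue"), ((0, 3), "blue")]
vertices = {0, 1, 2, 3}
colors = {"red", "blue"}

best = 0
for assignment in product(sorted(colors), repeat=len(vertices)):
    col = dict(zip(sorted(vertices), assignment))
    # An edge fits iff both endpoints are assigned its color; maximizing
    # fit edges equals the conflict-free edge-selection formulation.
    fit = sum(1 for (u, v), c in edges if col[u] == c and col[v] == c)
    best = max(best, fit)

print(best)
```

In this instance no coloring fits more than two edges, since vertex 2 would need to be both red (for edge (1, 2)) and blue (for edge (2, 3)) to fit three.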

Adding a Tail in Classes of Perfect Graphs

Authors: Anna Mpanti, Stavros D. Nikolopoulos, Leonidas Palios

Consider a graph $G$ which belongs to a graph class ${\cal C}$. We are interested in connecting a node $w \not\in V(G)$ to $G$ by a single edge $u w$ where $u \in V(G)$; we call such an edge a \emph{tail}. As the graph resulting from $G$ after the addition of the tail, denoted $G+uw$, need not belong to the class ${\cal C}$, we want to compute a minimum ${\cal C}$-completion of $G+uw$, i.e., the minimum number of non-edges (excluding the tail $u w$) to be added to $G+uw$ so that the resulting graph belongs to ${\cal C}$. In this paper, we study this problem for the classes of split, quasi-threshold, threshold, and $P_4$-sparse graphs and we present linear-time algorithms by exploiting the structure of split graphs and the tree representation of quasi-threshold, threshold, and $P_4$-sparse graphs.

Wednesday, February 01

Near-perfect Reachability of Variational Quantum Search with Depth-1 Ansatz

Authors: Junpeng Zhan

Grover's search algorithm is renowned for its dramatic speedup in solving many important scientific problems. The recently proposed Variational Quantum Search (VQS) algorithm has shown an exponential advantage over Grover's algorithm for up to 26 qubits. However, its advantage for larger numbers of qubits has not yet been proven. Here we show that the exponentially deep circuit required by Grover's algorithm can be replaced by a multi-controlled NOT gate together with either a single layer of Ry gates or two layers of circuits consisting of Hadamard and NOT gates, which is valid for any number of qubits greater than five. We prove that the VQS, with a single layer of Ry gates as its Ansatz, has near-perfect reachability in finding the good element of an arbitrarily large unstructured data set, and its reachability exponentially improves with the number of qubits, where the reachability is defined to quantify the ability of a given Ansatz to generate an optimal quantum state. Numerical studies further validate the excellent reachability of the VQS. Proving the near-perfect reachability of the VQS, with a depth-1 Ansatz, for any number of qubits completes an essential step in proving its exponential advantage over Grover's algorithm for any number of qubits, and the latter is significant as it means that the VQS can efficiently solve NP-complete problems.

A Finer Analysis of Multi-Structural Games and Beyond

Authors: Marco Carmosino, Ronald Fagin, Neil Immerman, Phokion Kolaitis, Jonathan Lenchner, Rik Sengupta

Multi-structural (MS) games are combinatorial games that capture the number of quantifiers of first-order sentences. On the face of their definition, MS games differ from Ehrenfeucht-Fraisse (EF) games in two ways: first, MS games are played on two sets of structures, while EF games are played on a pair of structures; second, in MS games, Duplicator can make any number of copies of structures. In the first part of this paper, we perform a finer analysis of MS games and develop a closer comparison of MS games with EF games. In particular, we point out that the use of sets of structures is of the essence and that when MS games are played on pairs of structures, they capture Boolean combinations of first-order sentences with a fixed number of quantifiers. After this, we focus on another important difference between MS games and EF games, namely, the necessity for Spoiler to play on top of a previous move in order to win some MS games. Via an analysis of the types realized during MS games, we delineate the expressive power of the variant of MS games in which Spoiler never plays on top of a previous move. In the second part we focus on simultaneously capturing number of quantifiers and number of variables in first-order logic. We show that natural variants of the MS game do not achieve this. We then introduce a new game, the quantifier-variable tree game, and show that it simultaneously captures the number of quantifiers and number of variables.

Limits of structures and Total NP Search Problems

Authors: Ondřej Ježil

For a class of finite graphs, we define a limit object relative to some computationally restricted class of functions. The properties of the limit object then reflect how a computationally restricted viewer "sees" a generic instance from the class. The construction uses Kraj\'i\v{c}ek's forcing with random variables [7]. We prove sufficient conditions for universal and existential sentences to be valid in the limit, provide several examples, and prove that such a limit object can then be expanded to a model of weak arithmetic. We then take the limit of all finite pointed paths to obtain a model of arithmetic where the problem OntoWeakPigeon is total but Leaf (the complete problem for $\textbf{PPA}$) is not. This can be viewed as a logical separation of the oracle classes of total NP search problems, which in our setting implies standard nonreducibility of Leaf to OntoWeakPigeon.

A Survey and Benchmark of Automatic Surface Reconstruction from Point Clouds

Authors: Raphael Sulzer, Loic Landrieu, Renaud Marlet, Bruno Vallet

We survey and benchmark traditional and novel learning-based algorithms that address the problem of surface reconstruction from point clouds. Surface reconstruction from point clouds is particularly challenging when applied to real-world acquisitions, due to noise, outliers, non-uniform sampling and missing data. Traditionally, different handcrafted priors of the input points or the output surface have been proposed to make the problem more tractable. However, hyperparameter tuning for adjusting priors to different acquisition defects can be a tedious task. To this end, the deep learning community has recently addressed the surface reconstruction problem. In contrast to traditional approaches, deep surface reconstruction methods can learn priors directly from a training set of point clouds and corresponding true surfaces. In our survey, we detail how different handcrafted and learned priors affect the robustness of methods to defect-laden input and their capability to generate geometric and topologically accurate reconstructions. In our benchmark, we evaluate the reconstructions of several traditional and learning-based methods on the same grounds. We show that learning-based methods can generalize to unseen shape categories, but their training and test sets must share the same point cloud characteristics. We also provide the code and data to compete in our benchmark and to further stimulate the development of learning-based surface reconstruction: https://github.com/raphaelsulzer/dsr-benchmark.

Singular Value Approximation and Reducing Directed to Undirected Graph Sparsification

In this paper, we introduce a new, spectral notion of approximation between directed graphs, which we call Singular Value (SV) approximation. SV-approximation is stronger than previous notions of spectral approximation considered in the literature, including spectral approximation of Laplacians for undirected graphs (Spielman and Teng, STOC 2004), standard approximation for directed graphs (Cohen et al., STOC 2007), and unit-circle approximation for directed graphs (Ahmadinejad et al., FOCS 2020). Moreover, SV approximation enjoys several useful properties not known to be possessed by previous notions of approximation, such as being preserved under products of random-walk matrices and with matrices of bounded norm.

Notably, we show that there is a simple black-box reduction from SV-sparsifying Eulerian directed graphs to SV-sparsifying undirected graphs. With this reduction in hand, we provide a nearly linear-time algorithm for SV-sparsifying undirected and hence also Eulerian directed graphs. This also yields the first nearly linear-time algorithm for unit-circle-sparsifying Eulerian directed graphs. In addition, we give a nearly linear-time algorithm for SV-sparsifying (and UC-sparsifying) random-walk polynomials of Eulerian directed graphs with second normalized singular value bounded away from $1$ by $1/\text{poly}(n)$.

Finally, we show that a simple repeated-squaring and sparsification algorithm for solving Laplacian systems, introduced by (Peng Spielman STOC 2014) for undirected graphs, also works for Eulerian digraphs whose random-walk matrix is normal (i.e. unitarily diagonalizable), if we use SV-sparsification at each step. Prior Laplacian solvers for Eulerian digraphs are significantly more complicated.

A Safety Framework for Flow Decomposition Problems via Integer Linear Programming

Authors: Fernando H. C. Dias, Manuel Caceres, Lucia Williams, Brendan Mumey, Alexandru I. Tomescu

Many important problems in Bioinformatics (e.g., assembly or multi-assembly) admit multiple solutions, while the final objective is to report only one. A common approach to deal with this uncertainty is finding safe partial solutions (e.g., contigs) which are common to all solutions. Previous research on safety has focused on polynomial-time solvable problems, whereas many successful and natural models are NP-hard to solve, leaving a lack of "safety tools" for such problems. We propose the first method for computing all safe solutions for an NP-hard problem, minimum flow decomposition. We obtain our results by developing a "safety test" for paths based on a general Integer Linear Programming (ILP) formulation. Moreover, we provide implementations with practical optimizations aimed to reduce the total ILP time, the most efficient of these being based on a recursive group-testing procedure.

Results: Experimental results on the transcriptome datasets of Shao and Kingsford (TCBB, 2017) show that all safe paths for minimum flow decompositions correctly recover up to 90% of the full RNA transcripts, which is at least 25% more than previously known safe paths, such as (Caceres et al. TCBB, 2021), (Zheng et al., RECOMB 2021), (Khan et al., RECOMB 2022, ESA 2022). Moreover, despite the NP-hardness of the problem, we can report all safe paths for 99.8% of the over 27,000 non-trivial graphs of this dataset in only 1.5 hours. Our results suggest that, on perfect data, there is less ambiguity than thought in the notoriously hard RNA assembly problem.

Availability: https://github.com/algbio/mfd-safety
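To illustrate the object being computed, here is a simple greedy flow decomposition on a hypothetical toy DAG. This heuristic is not the paper's ILP-based safety method; it merely peels weighted source-to-sink paths off a flow until the flow is exhausted:

```python
# Hypothetical toy flow network: a DAG with flow values on edges.
flow = {("s", "a"): 3, ("s", "b"): 2, ("a", "t"): 3, ("b", "t"): 2}

def find_path(flow, src, dst):
    # Iterative DFS for any source-sink path along edges with positive flow.
    stack = [(src, [src])]
    while stack:
        node, path = stack.pop()
        if node == dst:
            return path
        for (u, v), f in flow.items():
            if u == node and f > 0:
                stack.append((v, path + [v]))
    return None

paths = []
while (path := find_path(flow, "s", "t")) is not None:
    path_edges = list(zip(path, path[1:]))
    width = min(flow[e] for e in path_edges)  # bottleneck of the path
    for e in path_edges:
        flow[e] -= width
    paths.append((path, width))

print(paths)  # weighted paths whose superposition reproduces the flow
```

On this instance the greedy peeling happens to find a decomposition into two paths, which is also minimum; in general, greedy peeling can use more paths than the minimum flow decomposition the paper targets.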

Breadth-First Depth-Next: Optimal Collaborative Exploration of Trees with Low Diameter

Authors: Romain Cosson, Laurent Massoulié, Laurent Viennot

We consider the problem of collaborative tree exploration posed by Fraigniaud, Gasieniec, Kowalski, and Pelc where a team of $k$ agents is tasked to collectively go through all the edges of an unknown tree as fast as possible. Denoting by $n$ the total number of nodes and by $D$ the tree depth, the $\mathcal{O}(n/\log(k)+D)$ algorithm of Fraigniaud et al. achieves the best-known competitive ratio with respect to the cost of offline exploration which is $\Theta(\max{\{2n/k,2D\}})$. Brass, Cabrera-Mora, Gasparri, and Xiao consider an alternative performance criterion, namely the additive overhead with respect to $2n/k$, and obtain a $2n/k+\mathcal{O}((D+k)^k)$ runtime guarantee. In this paper, we introduce `Breadth-First Depth-Next' (BFDN), a novel and simple algorithm that performs collaborative tree exploration in time $2n/k+\mathcal{O}(D^2\log(k))$, thus outperforming Brass et al. for all values of $(n,D)$ and being order-optimal for all trees with depth $D=o_k(\sqrt{n})$. Moreover, a recent result from Disser et al. implies that no exploration algorithm can achieve a $2n/k+\mathcal{O}(D^{2-\epsilon})$ runtime guarantee. The dependency in $D^2$ of our bound is in this sense optimal. The proof of our result crucially relies on the analysis of an associated two-player game. We extend the guarantees of BFDN to: scenarios with limited memory and communication, adversarial setups where robots can be blocked, and exploration of classes of non-tree graphs. Finally, we provide a recursive version of BFDN with a runtime of $\mathcal{O}_\ell(n/k^{1/\ell}+\log(k) D^{1+1/\ell})$ for parameter $\ell\ge 1$, thereby improving performance for trees with large depth.

The Iteration Number of the Weisfeiler-Leman Algorithm

Authors: Martin Grohe, Moritz Lichter, Daniel Neuen

We prove new upper and lower bounds on the number of iterations the $k$-dimensional Weisfeiler-Leman algorithm ($k$-WL) requires until stabilization. For $k \geq 3$, we show that $k$-WL stabilizes after at most $O(kn^{k-1}\log n)$ iterations (where $n$ denotes the number of vertices of the input structures), obtaining the first improvement over the trivial upper bound of $n^{k}-1$ and extending a previous upper bound of $O(n \log n)$ for $k=2$ [Lichter et al., LICS 2019].

We complement our upper bounds by constructing $k$-ary relational structures on which $k$-WL requires at least $n^{\Omega(k)}$ iterations to stabilize. This improves over a previous lower bound of $n^{\Omega(k / \log k)}$ [Berkholz, Nordstr\"{o}m, LICS 2016].

We also investigate tradeoffs between the dimension and the iteration number of WL, and show that $d$-WL, where $d = \lceil\frac{3(k+1)}{2}\rceil$, can simulate the $k$-WL algorithm using only $O(k^2 \cdot n^{\lfloor k/2\rfloor + 1} \log n)$ iterations, but still requires at least $n^{\Omega(k)}$ iterations for any dimension $d$ (that is sufficiently smaller than $n$).

The number of iterations required by $k$-WL to distinguish two structures corresponds to the quantifier rank of a sentence distinguishing them in the $(k + 1)$-variable fragment $C_{k+1}$ of first-order logic with counting quantifiers. Hence, our results also imply new upper and lower bounds on the quantifier rank required in the logic $C_{k+1}$, as well as tradeoffs between variable number and quantifier rank.
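For intuition, the one-dimensional variant of WL (color refinement) iteratively recolors each vertex by its own color together with the multiset of its neighbors' colors, until the partition stabilizes; the iteration numbers studied above concern the $k$-tuple generalization of this process. A minimal sketch of color refinement (illustrative only; the paper's results concern $k$-WL for $k \geq 2$):

```python
def color_refinement(adj):
    """1-dimensional Weisfeiler-Leman (color refinement) on a graph
    given as an adjacency dict {vertex: set_of_neighbors}.
    Returns (stable_coloring, number_of_refining_iterations)."""
    colors = {v: 0 for v in adj}  # start from the uniform coloring
    iterations = 0
    while True:
        # New signature of v = old color of v plus the (sorted) multiset
        # of its neighbors' old colors.
        signatures = {
            v: (colors[v], tuple(sorted(colors[u] for u in adj[v])))
            for v in adj
        }
        # Canonically rename the distinct signatures to small integers.
        palette = {s: i for i, s in enumerate(sorted(set(signatures.values())))}
        new_colors = {v: palette[signatures[v]] for v in adj}
        if new_colors == colors:  # partition is stable
            return colors, iterations
        colors = new_colors
        iterations += 1
```

On a path with four vertices, one refining iteration already yields the stable partition separating the two endpoints from the two inner vertices.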