Last Update

OPML feed of all feeds.

Subscribe to the Atom feed or RSS feed to stay up to date.

Thank you to arXiv for use of its open access interoperability.

Note: the date of arXiv entries announced right after publication holidays might incorrectly show up as the date of the publication holiday itself. This is due to our ad hoc method of inferring announcement dates, which are not returned by the arXiv API.

Powered by Pluto.

Source on GitHub.

Maintained by Nima Anari, Arnab Bhattacharyya, Gautam Kamath.

Theory of Computing Report

Friday, April 10

Calibrated Games

from Ben Recht

Solving the game of forecasting with accounting strategies.

This is a live blog of Lecture 8 of my graduate seminar “Feedback, Learning, and Adaptation.” A table of contents is here.

One of the main uses of simulation and forecasting in designed feedback systems is for deciding how to act. If I can map what will happen next, I can choose actions that steer me toward good outcomes. This mindset seems perfectly sensible, and it’s the backbone of statistical decision theory, tree search in game play, optimal control, and model predictive control. Moreover, people who are good at prediction get clout. You can even win money in markets. It seems like forecasting is a skill and talent, and one that requires deep knowledge of how the world works. And yet, in class on Monday, I discussed how you can make excellent forecasts by simple, strategic accounting.

To understand why, let’s examine how we know if forecasts are good. It’s sort of obvious, but we can only evaluate our predictions of the future once the future has become the past. I can’t tell how good your forecast is until the forecast event occurs. No matter how much we think about setting up ungamable metrics, forecasters can only be evaluated retrospectively. And this retrospective nature means we can cast forecasting in the game theoretic framework from last week. Let me write out the rules in the format I’ve been using.

We have a two-player game with repeated interactions. In every round $t$,

  1. Information $x_t$ is revealed to both players.

  2. Player One makes a forecast $p_t$.

  3. Player Two reveals the actual outcome $y_t$.

  4. A score $s_t$ is assigned based on the triple $(x_t, p_t, y_t)$.

Player One is the “forecaster.” Their goal is to accumulate as high a score as possible, summed across all rounds. Player Two wants the sum of all of the $s_t$ to be as low as possible.
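
To make the protocol concrete, here is a minimal sketch of the game loop in Python. Everything in it is an illustrative assumption: the Brier-style score, the placeholder strategies, and the horizon are stand-ins, not anything prescribed above.

```python
import random

def brier_score(x, p, y):
    # Negated squared error: higher is better for the forecaster.
    return -(p - y) ** 2

def play_game(forecaster, adversary, score, T=1000):
    """Run the repeated forecasting game and return Player One's total score."""
    history, total = [], 0.0
    for t in range(T):
        x = t                         # 1. information revealed to both players
        p = forecaster(x, history)    # 2. Player One forecasts p_t
        y = adversary(x, p, history)  # 3. Player Two reveals the outcome y_t
        total += score(x, p, y)       # 4. score the triple (x_t, p_t, y_t)
        history.append((x, p, y))
    return total

# Placeholder players: a constant forecaster against i.i.d. biased coin flips.
print(play_game(lambda x, h: 0.5,
                lambda x, p, h: int(random.random() < 0.7),
                brier_score))
```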

Now, we need to come up with score functions that can’t be “gamed,” and people have thought of many. For example, you might require the forecaster to have a low Brier score.

But a low Brier score is impossible to guarantee in the adversarial context. Think about this weird game of bit prediction, where Player One guesses a number, 0 or 1, and Player Two responds with the correct answer, also 0 or 1. Player Two goes second and can always pick the opposite of what Player One says. That seems unfair.
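
A few lines of simulation make the unfairness concrete: against the "always play the opposite" adversary, any deterministic guesser suffers the worst possible Brier loss every single round. The alternating guess rule below is an arbitrary stand-in; any deterministic rule fares the same.

```python
def opposite_adversary(p):
    # Player Two moves second and picks the opposite bit.
    return 0 if p >= 0.5 else 1

T, loss = 100, 0.0
for t in range(T):
    p = t % 2                  # some deterministic guessing rule
    y = opposite_adversary(p)
    loss += (p - y) ** 2       # per-round Brier loss
print(loss / T)                # 1.0, the worst possible average
```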

Last week, I brought up the possibility of judging Player One with regret. Regret would measure the difference between Player One’s score and the score of a player who knows all of Player Two’s moves in advance but can only play a single prediction. In math, this quantity is written as

\[ \mathrm{Regret}_T = \max_{p} \sum_{t=1}^{T} s(x_t, p, y_t) - \sum_{t=1}^{T} s(x_t, p_t, y_t). \]

But good predictions alone may not be what you care about. Certainly, you’d want the frequencies to match. If you are predicting a sequence of probabilities, the average of those probabilities should match the average of the actual outcomes. If you consistently predict a player makes 90% of their free throws, we should see 90% of free throws made.

Similarly, other expected values should match. If you are changing your probabilities over time, the variance of the outcomes should still match the variance of your probabilities.

Maybe you’d prefer the predictions to be good across stratifications of the data. For example, if you are predicting free throws, maybe you’d want your forecasts to be accurate for all players individually. There are lots of subtests and subsets I can inspect, and I’d like to check that you are making good predictions on all of them.

Perhaps you’d like a certain degree of calibration from the forecast. In all of Player One’s forecasts where they say 20%, Player Two should say 1 only 20% of the time. In the forecasts where they say 60%, Player Two should say 1 60% of the time. If in all of the times Player One says there’s a 90% chance of a 1, only 10% of the times Player Two plays a 1, we’d think Player One is a pretty bad forecaster. Trying to achieve calibration across all possible probabilistic predictions seems a lot harder than just getting a single frequency correct in a Brier score game.

Mathematically, however, all of these problems are basically the same. They list a set of “test functions”, and Player One wants the following quantity to be small for every single test function $f$:

\[ E_f = \frac{1}{T} \sum_{t=1}^{T} f(x_t, p_t)\,(y_t - p_t). \]

What are the test functions? If all we care about is getting the frequency that $y_t$ equals one correct, then the test function is the constant function. For calibration, the test function is equal to one when the forecaster predicts probability x% and zero otherwise. You'll have one test function for each calibration bin. For calibration across strata, there will be a function for each stratum. Even Brier scores amount to calibration. You can get a low Brier score by calibrating the functions

\[ f_q(x, p) = \mathbf{1}\{p = q\} \]

for all values of $q$.
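
Auditing a forecaster against any collection of test functions takes one line per function. The example tests below are illustrative stand-ins matching the paragraph above: a constant test and one calibration-bin indicator.

```python
def calibration_error(f, history):
    # E_f = (1/T) * sum_t f(x_t, p_t) * (y_t - p_t)
    return sum(f(x, p) * (y - p) for x, p, y in history) / len(history)

tests = {
    "frequency": lambda x, p: 1.0,                                  # constant test
    "bin p=0.6": lambda x, p: 1.0 if abs(p - 0.6) < 0.05 else 0.0,  # one calibration bin
}
# errors = {name: calibration_error(f, history) for name, f in tests.items()}
```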

The amazing thing is that making calibration errors of the form $E_f$ small is incredibly mechanical. Juanky Perdomo and I spell out the general details in Section 3 of the tutorial “In Defense of Defensive Forecasting.” More or less, you just have to choose a prediction that makes the future look uncorrelated with the past. And you can always find such a prediction with simple search. Though there are specific details you have to deal with for each case, essentially the same procedure applies to very general sets of calibration functions.
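
The tutorial's general procedure isn't reproduced here, but the following sketch, in the spirit of the Foster–Vohra style of calibration algorithms, shows how mechanical the bookkeeping is: track each forecast bin's running bias and randomize between two adjacent grid points that bracket zero.

```python
import random

def defensive_forecast(history, m=20):
    # Sketch of a bookkeeping forecaster (not the exact procedure from the
    # tutorial). bias[i] accumulates (y_s - q_i) over rounds forecast at
    # grid point q_i = i/m. With 0/1 outcomes, bias[0] >= 0 and bias[m] <= 0
    # always hold, so some adjacent pair of grid points brackets zero.
    grid = [i / m for i in range(m + 1)]
    bias = [0.0] * (m + 1)
    for _, p, y in history:
        bias[round(p * m)] += y - p
    for i in range(m):
        if bias[i] >= 0 >= bias[i + 1]:
            a, b = bias[i], -bias[i + 1]
            if a + b == 0:
                return grid[i]
            # Randomize so the expected update is nearly orthogonal to the
            # current bias vector (a Blackwell-approachability argument):
            # play q_i with probability b/(a+b), q_{i+1} with probability a/(a+b).
            return grid[i + 1] if random.random() < a / (a + b) else grid[i]
    return grid[m]  # unreachable for outcomes in {0, 1}

# forecaster = lambda x, h: defensive_forecast(h)  # plug into the game loop above
```

Plugged into the game loop from earlier, every bin's bias should stay small no matter how Player Two plays, provided Player Two cannot see the forecaster's coin flips.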

We found that we could reduce every metric used to evaluate forecasting skill to some form of generalized calibration. There are whole bodies of work on proper scoring rules, conformal prediction, omniprediction, and outcome indistinguishability that reduce to generalized calibration. In the forecasting game, this generalized calibration can be done without specific domain expertise. As long as the evaluation metrics are prescribed in advance, a Defensive Forecaster will do well in fantasy sports, weather prediction, and election forecasting. It doesn’t need to know anything about the topic other than the judgment scheme.

Though Juanky and I wrote up our defense of defensive forecasting almost a year ago, this week was the first time I tried to present it in class. I got a lot of puzzled looks, as if I was playing clever card tricks. That’s the correct reaction! We are naturally impressed by people who are good at forecasting. We’re obsessed with predicting the future. Predictions from soothsayers are reassuring even if they’re consistently wrong.

And yet, forecasting is often just playing clever tricks for fun and profit. Though Dean Foster and Rakesh Vohra famously showed thirty years ago that percentile calibration amounted to bookkeeping, it turns out that all forms of generalized calibration can be achieved through bookkeeping. Next time someone tries to impress you with their prediction market prowess, remember that cooking the books isn’t the same as clairvoyance.


By Ben Recht

Afterthoughts on Banach-Tarski and the Miracle of Loaves and Fishes

from Computational Complexity

I posted about using the Banach-Tarski Paradox (BT) to explain the miracle of Loaves and Fishes (LF) here.

Darling says that whenever I fool my readers or my students then I have to tell them later, so I'll tell you now: The story about me meeting with the Pope and talking about the BT Paradox (that would be a good name for a rock band: B-T-Paradox) was not true. I think my readers know that.

 

1) I first learned the Banach-Tarski Paradox as a grad student in 1981 when I read Hilary Putnam's article Models and Reality, where he writes on page 470:

One cannot simply sit down in one's study and ``decide'' that ``V=L'' is to be true, or that the axiom of choice is to be true. Nor would it be appropriate for the mathematical community to call an international convention and legislate these matters. Yet, it seems to me that if we encountered an extra-terrestrial species of intelligent beings who had developed a high level of mathematics, and it turned out they rejected the axiom of choice (perhaps because of the Tarski-Banach Theorem), it would be wrong to regard them as simply making a mistake. To do that would, on my view, amount to saying that acceptance of the axiom of choice is built into our notion of rationality itself; that does not seem to me to be the case.

I agree with him and I wonder if we accept AC too readily. See my blog post on the BT paradox and my wife's strong opinion (she's against it).    

2) Back in 1981 my first thought was I wonder if someone has thought to use the BT paradox to explain the LF? And if so, were they serious or was it some kind of  joke? And does the Pope really get a discount at Pope-Yes? 

I also had the meta-thought (which I could not have said as cleanly as I will now):

 I wonder how I could find out if anyone else has thought of the BT-LF connection? 

Recall that back in 1981 the Internet was but a glint in Al Gore's eyes. So back then I could not find out if anyone else had thought of BT-LF.

But now I can! And indeed, as I expected, some other people have made the connection of BT to LF: 

A tweet and a Reddit thread discussing the tweet: here. Not serious.

A serious article, I think, here   

A 24-page article about Holy Water and BT. I can't imagine an article that long being a parody, so I think it's serious, see here. On the other hand, there is a 12-page article about Ramsey Theory and History that I think is supposed to be a parody, see here.

A parody article, I think, here

I am sure there are more. 

3) I had thought of doing a blog post about BT and LF a long time ago, but Pope Leo having a math degree was the final push I needed.

4) The word cardinal has three very different meanings: (a) a type of infinity, (b) a position in the Catholic church, (c) the bird. Same for large cardinal.

5) One of my students who proofread the post thought that people will know it's a hoax since I am a vegetarian and hence would not eat at Pope-Yes, even if the Pope was paying. 

 

By gasarch


The Quantum Query Complexity of Finding a Tarski Fixed Point on the 2D Grid

from arXiv: Computational Complexity

Authors: Reed Phillips

Tarski's theorem states that every monotone function from a complete lattice to itself has a fixed point. We specifically consider the two-dimensional lattice $\mathcal{L}^2_n$ on points $\{1, \ldots, n\}^2$, where $(x_1, y_1) \leq (x_2, y_2)$ if $x_1 \leq x_2$ and $y_1 \leq y_2$. We show that the quantum query complexity of finding a fixed point given query access to a monotone function on $\mathcal{L}^2_n$ is $\Omega((\log n)^2)$, matching the classical deterministic upper bound. The proof consists of two main parts: a lower bound on the quantum query complexity of a composition of a class of functions including ordered search, and an extremely close relationship between finding Tarski fixed points and nested ordered search.

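As background for the result above: Tarski's theorem is constructive on this lattice, and the naive bottom-up iteration already finds a fixed point, just slowly. Below is a minimal sketch of that $O(n)$-query iteration (illustrative only; the paper concerns the far more efficient $O((\log n)^2)$ query regime and its quantum analogue).

```python
def tarski_fixed_point(f):
    # Kleene/Tarski iteration on {1,...,n}^2 with the coordinatewise order:
    # bottom <= f(bottom), and monotonicity keeps the iterates ascending,
    # so the walk must stop at a fixed point within O(n) steps.
    x = (1, 1)
    while f(x) != x:
        x = f(x)
    return x

# A made-up monotone map: step each coordinate up toward n.
n = 8
print(tarski_fixed_point(lambda p: (min(p[0] + 1, n), min(p[1] + 1, n))))  # (8, 8)
```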

The Boolean surface area of polynomial threshold functions

from arXiv: Computational Complexity

Authors: Joseph Slote, Alexander Volberg, Haonan Zhang

Polynomial threshold functions (PTFs) are an important low-complexity class of Boolean functions, with strong connections to learning theory and approximation theory. Recent work on learning and testing PTFs has exploited structural and isoperimetric properties of the class, especially bounds on average sensitivity, one of the central themes in the study of PTFs since the Gotsman--Linial conjecture. In this work we exhibit a new geometric sense in which PTFs are tightly constrained, by studying them through the lens of the \textit{Boolean surface area} (or Talagrand boundary): \[ \text{BSA}[f] = {\mathbb E}|\nabla f| = {\mathbb E}\sqrt{\mathrm{Sens}_f(x)}, \] which is a natural measure of vertex-boundary complexity on the discrete cube. Our main result is that every degree-$d$ PTF $f$ has subpolynomial Boolean surface area: \[ \text{BSA}[f]\le \exp(C(d)\sqrt{\log n}). \] This is a superpolynomial improvement over the previous bound of $n^{1/4}(\log n)^{C(d)}$ that follows from Kane's landmark bounds on average sensitivity of PTFs \cite{DK}.

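For intuition about the quantity being bounded, here is a brute-force evaluation of the Boolean surface area on a small cube. The majority function (a degree-1 PTF) is an illustrative choice; the paper's bound is about the large-$n$ regime.

```python
from itertools import product
from math import sqrt

def boolean_surface_area(f, n):
    # BSA[f] = E_x sqrt(Sens_f(x)), where Sens_f(x) counts the coordinates
    # whose flip changes f(x). Brute force over the n-cube; fine for small n.
    total = 0.0
    for x in product((-1, 1), repeat=n):
        sens = sum(1 for i in range(n)
                   if f(x[:i] + (-x[i],) + x[i + 1:]) != f(x))
        total += sqrt(sens)
    return total / 2 ** n

# Majority on 3 bits, a halfspace: BSA = 6*sqrt(2)/8, about 1.06.
print(boolean_surface_area(lambda x: 1 if sum(x) > 0 else -1, 3))
```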

Exponential quantum advantage in processing massive classical data

from arXiv: Computational Complexity

Authors: Haimeng Zhao, Alexander Zlokapa, Hartmut Neven, Ryan Babbush, John Preskill, Jarrod R. McClean, Hsin-Yuan Huang

Broadly applicable quantum advantage, particularly in classical data processing and machine learning, has been a fundamental open problem. In this work, we prove that a small quantum computer of polylogarithmic size can perform large-scale classification and dimension reduction on massive classical data by processing samples on the fly, whereas any classical machine achieving the same prediction performance requires exponentially larger size. Furthermore, classical machines that are exponentially larger yet below the required size need superpolynomially more samples and time. We validate these quantum advantages in real-world applications, including single-cell RNA sequencing and movie review sentiment analysis, demonstrating four to six orders of magnitude reduction in size with fewer than 60 logical qubits. These quantum advantages are enabled by quantum oracle sketching, an algorithm for accessing the classical world in quantum superposition using only random classical data samples. Combined with classical shadows, our algorithm circumvents the data loading and readout bottleneck to construct succinct classical models from massive classical data, a task provably impossible for any classical machine that is not exponentially larger than the quantum machine. These quantum advantages persist even when classical machines are granted unlimited time or if BPP=BQP, and rely only on the correctness of quantum mechanics. Together, our results establish machine learning on classical data as a broad and natural domain of quantum advantage and a fundamental test of quantum mechanics at the complexity frontier.


Vulnerability Abundance: A formal proof of infinite vulnerabilities in code

from arXiv: Computational Complexity

Authors: Eireann Leverett, Jeroen van der Ham-de Vos

We present a constructive proof that a single C program, the \emph{Vulnerability Factory}, admits a countably infinite set of distinct, independently CVE-assignable software vulnerabilities. We formalise the argument using elementary set theory, verify it against MITRE's CVE Numbering Authority counting rules, sketch a model-checking analysis that corroborates unbounded vulnerability generation, and provide a Turing-machine characterisation that situates the result within classical computability theory. We then contextualise this result within the long-running debate on whether undiscovered vulnerabilities in software are \emph{dense} or \emph{sparse}, and introduce the concept of \emph{vulnerability abundance}: a quantitative analogy to chemical elemental abundance that describes the proportional distribution of vulnerability classes across the global software corpus. Because different programming languages render different vulnerability classes possible or impossible, and because language popularity shifts over time, vulnerability abundance is neither static nor uniform. Crucially, we distinguish between infinite \emph{vulnerabilities} and the far smaller set of \emph{exploits}: empirical evidence suggests that fewer than 6\% of published CVEs are ever exploited in the wild, and that exploitation frequency depends not only on vulnerability abundance but on the market share of the affected software. We argue that measuring vulnerability abundance, and its interaction with software deployment, has practical value for both vulnerability prevention and cyber-risk analysis. We conclude that if one programme can harbour infinitely many vulnerabilities, the set of all software vulnerabilities is necessarily infinite, and we suggest the Vulnerability Factory may serve as a reusable proof artifact, a foundational `test object', for future formal results in vulnerability theory.


Quantum Property Testing for Bounded-Degree Directed Graphs

from arXiv: Data Structures and Algorithms

Authors: Pan Peng, Jingyu Wu

We study quantum property testing for directed graphs with maximum in-degree and out-degree bounded by some universal constant $d$. For a proximity parameter $\varepsilon$, we show that any property that can be tested with $O_{\varepsilon,d}(1)$ queries in the classical bidirectional model, where both incoming and outgoing edges are accessible, can also be tested in the quantum unidirectional model, where only outgoing edges are accessible, using $n^{1/2 - \Omega_{\varepsilon,d}(1)}$ queries. This yields an almost quadratic quantum speedup over the best known classical algorithms in the unidirectional model. Moreover, we prove that our transformation is almost tight by giving an explicit property $P_\varepsilon$ that is $\varepsilon$-testable within $O_\varepsilon(1)$ classical queries in the bidirectional model, but requires $\widetilde{\Omega}(n^{1/2-f'(\varepsilon)})$ quantum queries in the unidirectional model, where $f'(\varepsilon)$ is a function that approaches $0$ as $\varepsilon$ approaches $0$. As a byproduct, we show that in the unidirectional model, the number of occurrences of any constant-size subgraph $H$ can be approximated up to additive error $\delta n$ using $o(\sqrt{n})$ quantum queries.


Differentially Private Language Generation and Identification in the Limit

from arXiv: Data Structures and Algorithms

Authors: Anay Mehrotra, Grigoris Velegkas, Xifan Yu, Felix Zhou

We initiate the study of language generation in the limit, a model recently introduced by Kleinberg and Mullainathan [KM24], under the constraint of differential privacy. We consider the continual release model, where a generator must eventually output a stream of valid strings while protecting the privacy of the entire input sequence. Our first main result is that for countable collections of languages, privacy comes at no qualitative cost: we provide an $\varepsilon$-differentially-private algorithm that generates in the limit from any countable collection. This stands in contrast to many learning settings where privacy renders learnability impossible. However, privacy does impose a quantitative cost: there are finite collections of size $k$ for which uniform private generation requires $\Omega(k/\varepsilon)$ samples, whereas just one sample suffices non-privately. We then turn to the harder problem of language identification in the limit. Here, we show that privacy creates fundamental barriers. We prove that no $\varepsilon$-DP algorithm can identify a collection containing two languages with an infinite intersection and a finite set difference, a condition far stronger than the classical non-private characterization of identification. Next, we turn to the stochastic setting where the sample strings are sampled i.i.d. from a distribution (instead of being generated by an adversary). Here, we show that private identification is possible if and only if the collection is identifiable in the adversarial model. Together, our results establish new dimensions along which generation and identification differ and, for identification, a separation between adversarial and stochastic settings induced by privacy constraints.


Rapid mixing for high-temperature Gibbs states with arbitrary external fields

from arXiv: Data Structures and Algorithms

Authors: Ainesh Bakshi, Xinyu Tan

Gibbs states are a natural model of quantum matter at thermal equilibrium. We investigate the role of external fields in shaping the entanglement structure and computational complexity of high-temperature Gibbs states. External fields can induce entanglement in states that are otherwise provably separable, and the crossover scale is $h \asymp \beta^{-1} \log(1/\beta)$, where $h$ is an upper bound on any on-site potential and $\beta$ is the inverse temperature. We introduce a quasi-local Lindbladian that satisfies detailed balance and rapidly mixes to the Gibbs state in $\mathcal{O}(\log(n/\epsilon))$ time, even in the presence of an arbitrary on-site external field. Additionally, we prove that for any $\beta < 1$, there exist local Hamiltonians for which sampling from the computational-basis distribution of the corresponding Gibbs state with a sufficiently large external field is classically hard, under standard complexity-theoretic assumptions. Therefore, high-temperature Gibbs states with external fields are natural physical models that can exhibit entanglement and classical hardness while also admitting efficient quantum Gibbs samplers, making them suitable candidates for quantum advantage via state preparation.


Revisiting Fair and Efficient Allocations for Bivalued Goods

from arXiv: Data Structures and Algorithms

Authors: Hui Liu, Zhijie Zhang

This paper re-examines the problem of fairly and efficiently allocating indivisible goods among agents with additive bivalued valuations. Garg and Murhekar (2021) proposed a polynomial-time algorithm that purported to find an EFX and fPO allocation. However, we provide a counterexample demonstrating that their algorithm may fail to terminate. To address this issue, we propose a new polynomial-time algorithm that computes a WEFX (Weighted Envy-Free up to any good) and fPO allocation, thereby correcting the prior approach and offering a more general solution. Furthermore, we show that our algorithm can be adapted to compute a WEQX (Weighted Equitable up to any good) and fPO allocation.

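For readers new to the fairness notions here, the sketch below is a brute-force check of the (unweighted) EFX condition for additive valuations; the bivalued instance is made up for illustration, and the paper's constructive algorithms are not reproduced.

```python
def is_efx(allocation, values):
    # allocation[i]: list of good indices held by agent i.
    # values[i][g]: agent i's additive value for good g.
    # EFX: no agent envies another's bundle even after removing any
    # single good from that bundle.
    def v(i, bundle):
        return sum(values[i][g] for g in bundle)
    n = len(allocation)
    return all(v(i, [h for h in allocation[j] if h != g]) <= v(i, allocation[i])
               for i in range(n) for j in range(n) if i != j
               for g in allocation[j])

# Bivalued example: every good is worth 1 or 2 to each agent.
values = [[2, 1, 1, 2],   # agent 0 prizes goods 0 and 3
          [1, 2, 2, 1]]   # agent 1 prizes goods 1 and 2
print(is_efx([[0, 3], [1, 2]], values))  # True
```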

Counting HyperGraphlets via Color Coding: a Quadratic Barrier and How to Break It

from arXiv: Data Structures and Algorithms

Authors: Marco Bressan, Stefano Clemente, Giacomo Fumagalli

We study the problem of counting $k$-\emph{hyper}graphlets, an interesting but surprisingly ignored primitive, with the aim of understanding if efficient algorithms exist. To this end we consider \emph{color coding}, a well-known technique for approximately counting $k$-graphlets in graphs. Our first result is that, on hypergraphs, color coding encounters a \emph{quadratic barrier}: under the Orthogonal Vector Conjecture, no implementation of it can run in time sub-quadratic in the size of the input. We then introduce a simple property, $(\alpha,\beta)$-niceness, that hypergraphs from real-world datasets appear to satisfy for small values of $\alpha$ and $\beta$. Intuitively, an $(\alpha,\beta)$-nice hypergraph can be split into two sub-hypergraphs having respectively rank at most $\alpha$ and degree at most $\beta$. By applying different techniques to each sub-hypergraph and carefully combining the outputs, we show how to run color coding in time $2^{O(k)} \cdot \big(2^{\beta}|V| + \alpha^k |E| + \alpha^2 \beta \, \mathrm{size}(H)\big)$, where $H=(V,E)$ is the input hypergraph. Afterwards, we can sample colorful $k$-hypergraphlets uniformly in expected $k^{O(k)} \cdot (\beta^2+\ln |V|)$ time per sample. Experiments on real-world hypergraphs show that our algorithm neatly outperforms the naive quadratic algorithm, sometimes by more than an order of magnitude.

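For background on the technique the paper adapts, here is the classic color-coding estimator of Alon, Yuster, and Zwick for counting $k$-vertex paths in an ordinary graph (an illustrative sketch only; the paper's hypergraph algorithm and its $(\alpha,\beta)$-niceness machinery are not reproduced).

```python
import random
from math import factorial

def count_k_paths(adj, k, trials=200):
    # Color coding: randomly k-color the vertices, count *colorful* k-paths
    # exactly via DP over color subsets, then divide by the probability
    # k!/k^k that any fixed k-path is colorful. Unbiased estimator.
    n = len(adj)
    est = 0.0
    for _ in range(trials):
        color = [random.randrange(k) for _ in range(n)]
        # dp[v][S] = number of colorful paths ending at v with color set S
        dp = [{frozenset([color[v]]): 1} for v in range(n)]
        for size in range(1, k):
            for v in range(n):
                for S, cnt in [(S, c) for S, c in dp[v].items() if len(S) == size]:
                    for u in adj[v]:
                        if color[u] not in S:
                            T = S | frozenset([color[u]])
                            dp[u][T] = dp[u].get(T, 0) + cnt
        colorful = sum(c for v in range(n) for S, c in dp[v].items() if len(S) == k)
        est += (colorful / 2) * (k ** k / factorial(k))  # each path counted twice
    return est / trials

# Sanity check: a 4-cycle has four 3-vertex paths.
print(count_k_paths([[1, 3], [0, 2], [1, 3], [0, 2]], 3))  # ~4 in expectation
```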

Competitive Transaction Admission in PCNs: Online Knapsack with Positive and Negative Items

from arXiv: Data Structures and Algorithms

Authors: Marcin Bienkowski, Julien Dallot, Dominik Danelski, Maciej Pacut, Stefan Schmid

Payment channel networks (PCNs) are a promising solution to make cryptocurrency transactions faster and more scalable. At their core, PCNs bypass the blockchain by routing the transactions through intermediary channels. However, a channel can forward a transaction only if it possesses the necessary funds: the problem of keeping the channels balanced is a current bottleneck on the PCN's transaction throughput. This paper considers the problem of maximizing the number of accepted transactions by a channel in a PCN. Previous works either considered the associated optimization problem with all transactions known in advance or developed heuristics tested on particular transaction datasets. This work, however, considers the problem in its purely online form, where the transactions are arbitrary and revealed one after the other. We show that the problem can be modeled as a new online knapsack variant where the items (transaction proposals) can be either positive or negative depending on the direction of the transaction. The main contribution of this paper is a deterministic online algorithm that is $O(\log B)$-competitive, where $B$ is the knapsack capacity (initially allocated funds). We complement this result with an asymptotically matching lower bound of $\Omega(\log B)$ which holds for any randomized algorithm, demonstrating our algorithm's optimality.


Identifying bubble-like subgraphs in linear-time via a unified SPQR-tree framework

from arXiv: Data Structures and Algorithms

Authors: Francisco Sena, Aleksandr Politov, Corentin Moumard, Massimo Cairo, Romeo Rizzi, Manuel Cáceres, Sebastian Schmidt, Juha Harviainen, Alexandru I. Tomescu

A fundamental algorithmic problem in computational biology is to find all subgraphs of a given type (superbubbles, snarls, and ultrabubbles) in a directed or bidirected input graph. These correspond to regions of genetic variation and are useful in analyzing collections of genomes. We present the first linear-time algorithms for identifying all snarls and all ultrabubbles, resolving problems open since 2018. The algorithm for snarls relies on a new linear-size representation of all snarls with respect to the number of vertices in the graph. We employ the well-known SPQR-tree decomposition, which encodes all 2-separators of a biconnected graph. After several dynamic-programming-style traversals of this tree, we maintain key properties (such as acyclicity) that allow us to decide whether a given 2-separator defines a subgraph to be reported. A crucial ingredient for linear-time complexity is that acyclicity of linearly many subgraphs can be tested simultaneously via the problem of computing all arcs in a directed graph whose removal renders it acyclic (so-called feedback arcs). As such, we prove a fundamental result for bidirected graphs, that may be of independent interest: all feedback arcs can be computed in linear time for tipless bidirected graphs, while in general this is at least as hard as matrix multiplication, assuming the k-Clique Conjecture. Our results form a unified framework that also yields a completely different linear-time algorithm for finding all superbubbles. Although some of the results are technically involved, the underlying ideas are conceptually simple, and may extend to other bubble-like subgraphs. More broadly, our work contributes to the theoretical foundations of computational biology and advances a growing line of research that uses SPQR-tree decompositions as a general tool for designing efficient algorithms, beyond their traditional role in graph drawing.


Density Decomposition on Hypergraphs

from arXiv: Data Structures and Algorithms

Authors: Xiaoyu Leng, Hongchao Qin, Rong-Hua Li

Decomposing hypergraphs is a key task in hypergraph analysis with broad applications in community detection, pattern discovery, and task scheduling. Existing approaches such as $k$-core and neighbor-$k$-core rely on vertex degree constraints, which often fail to capture true density variations induced by multi-way interactions and may lead to sparse or uneven decomposition layers. To address these issues, we propose a novel $(k,\delta)$-dense subhypergraph model for decomposing hypergraphs based on integer density values. Here, $k$ represents the density level of a subhypergraph, while $\delta$ sets the upper limit for each hyperedge's contribution to density, allowing fine-grained control over density distribution across layers. Computing such dense subhypergraphs is algorithmically challenging, as it requires identifying an egalitarian orientation under bounded hyperedge contributions, which may incur an intuitive worst-case complexity of up to $O(2^{m\delta})$. To enable efficient computation, we develop a fair-stable-based algorithm that reduces the complexity of mining a single $(k,\delta)$-dense subhypergraph from $O(m^{2}\delta^{2})$ to $O(nm\delta)$. Building on this result, we further design a divide-and-conquer decomposition framework that improves the overall complexity of full density decomposition from $O(nm\delta \cdot d^E_{\max} \cdot k_{\max})$ to $O(nm\delta \cdot d^E_{\max} \cdot \log k_{\max})$. Experiments on nine real-world hypergraph datasets demonstrate that our approach produces more continuous and less redundant decomposition hierarchies than existing baselines, while maintaining strong computational efficiency. Case studies further illustrate the practical utility of our model by uncovering cohesive and interpretable community structures.


Parallel Batch-Dynamic Maximal Independent Set

from arXiv: Data Structures and Algorithms

Authors: Guy Blelloch, Andrew Brady, Laxman Dhulipala, Jeremy Fineman, Jared Lo

We develop the first theoretically-efficient algorithm for maintaining the maximal independent set (MIS) of a graph in the parallel batch-dynamic setting. In this setting, a graph is updated with batches of edge insertions/deletions, and for each batch a parallel algorithm updates the maximal independent set to agree with the new graph. A batch-dynamic algorithm is considered efficient if it is work efficient (i.e., does no more asymptotic work than applying the updates sequentially) and has polylogarithmic depth (parallel time). In the sequential setting, the best known dynamic algorithms for MIS, by Chechik and Zhang (CZ) [FOCS19] and Behnezhad et al. (BDHSS) [FOCS19], take $O(\log^4 n)$ time per update in expectation. For a batch of $b$ updates, our algorithm has $O(b \log^3 n)$ expected work and polylogarithmic depth with high probability (whp). It therefore outperforms the best algorithm even in the sequential dynamic case ($b = 1$). As with the sequential dynamic MIS algorithms of CZ and BDHSS, our solution maintains a lexicographically first MIS based on a random ordering of the vertices. Their analysis relied on a result of Censor-Hillel, Haramaty and Karnin [PODC16] that bounded the ``influence set'' for a single update, but surprisingly, the influence of a batch is not simply the union of the influence of each update therein. We therefore develop a new approach to analyze the influence set for a batch of updates. Our construction of the batch influence set is natural and leads to an arguably simpler analysis than prior work. We then instrument this construction to bound the work of our algorithm. To argue our depth is polylogarithmic, we prove that the number of subrounds our algorithm takes is the same as depth bounds on parallel static MIS.

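For context, the object the algorithms maintain is easy to compute statically: the lexicographically-first MIS under a random vertex ordering is a greedy scan, sketched below. The whole difficulty, and the paper's contribution, lies in updating it under batches of edge changes without recomputing.

```python
import random

def lex_first_mis(adj, seed=0):
    # Scan vertices in a fixed random priority order; add a vertex to the
    # MIS iff none of its higher-priority neighbors was already added.
    n = len(adj)
    order = list(range(n))
    random.Random(seed).shuffle(order)   # random priorities, fixed once
    in_mis, blocked = [False] * n, [False] * n
    for v in order:
        if not blocked[v]:
            in_mis[v] = True
            for u in adj[v]:
                blocked[u] = True
    return [v for v in range(n) if in_mis[v]]

# Path on five vertices.
print(lex_first_mis([[1], [0, 2], [1, 3], [2, 4], [3]]))
```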

Optimal Quantum State Testing Even with Limited Entanglement

from arXiv: Data Structures and Algorithms

Authors: Chirag Wadhwa, Sitan Chen

In this work, we consider the fundamental task of quantum state certification: given copies of an unknown quantum state $\rho$, test whether it matches some target state $\sigma$ or is $\varepsilon$-far from it. For certifying $d$-dimensional states, $\Theta(d/\varepsilon^2)$ copies of $\rho$ are known to be necessary and sufficient. However, the algorithm achieving this complexity makes fully entangled measurements over all $O(d/\varepsilon^2)$ copies of $\rho$. Often, one is interested in certifying states to a high precision; this makes such joint measurements intractable even for low-dimensional states. Thus, we study whether one can obtain optimal rates for quantum state certification and related testing problems while only performing measurements on $t$ copies at once, for some $1 < t \ll d/\varepsilon^2$. While it is well-understood how to use intermediate entanglement to achieve optimal quantum state learning, the only protocol known to achieve optimal testing is the one using fully entangled measurements. Our main result is a smooth copy complexity upper bound for state certification as a function of $t$, which achieves a near-optimal rate at $t = d^2$. In the high-precision regime, i.e., for $\varepsilon < \frac{1}{\sqrt{d}}$, this is a strict improvement over the entanglement used by the aforementioned optimal protocol. We also extend our techniques to develop new algorithms for the related tasks of mixedness testing and purity estimation, and show tradeoffs achieving the optimal rates for these problems at $t = d^2$ as well. Our algorithms are based on novel reductions from testing to learning and leverage recent advances in quantum state tomography in a non-black-box fashion. We complement our upper bounds with smooth lower bounds that imply joint measurements on $t \geq d^{\Omega(1)}$ copies are necessary to achieve optimal rates for certification in the high-precision regime.


Thursday, April 09

TR26-054 | Optimal Single-Pass Streaming Lower Bounds for Approximating CSPs | Noah Singer, Madhur Tulsiani, Santhoshini Velusamy

from ECCC Papers

For an arbitrary family of predicates $\mathcal{F} \subseteq \{0,1\}^{[q]^k}$ and any $\epsilon > 0$, we prove a single-pass, linear-space streaming lower bound against the gap promise problem of distinguishing instances of Max-CSP$({\mathcal{F}})$ with at most a $\beta+\epsilon$ fraction of satisfiable constraints from instances with at least a $\gamma-\epsilon$ fraction of satisfiable constraints, whenever Max-CSP$({\mathcal{F}})$ admits a $(\gamma,\beta)$-integrality gap instance for the basic LP. This subsumes the linear-space lower bound of Chou, Golovnev, Sudan, Velingker, and Velusamy (STOC 2022), which applies only to a special subclass of CSPs with linear-algebraic structure. (Their result itself generalizes work of Kapralov and Krachun (STOC 2019) for Max-CUT.) Our approach identifies the right "analytic" analogues of previously used linear-algebraic conditions; this yields substantial simplifications while capturing a much larger class of problems. Our lower bound is essentially optimal for single-pass streaming, since (1) all CSPs admit $(1-\epsilon)$-approximations in quasilinear space, and (2) sublinear-space streaming algorithms can simulate the LP (on bounded-degree instances), giving approximation algorithms when integrality gap instances do not exist. The starting point for our lower bound is a reduction from a "distributional implicit hidden partition" problem defined by Fei, Minzer, and Wang (STOC 2026) in the context of multi-pass streaming. Our result is an analogue of theirs in the single-pass setting, where we obtain a much stronger (and tight) space lower bound.
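
For readers unfamiliar with the relaxation referenced above, one standard formulation of the basic LP for Max-CSP (our rendering; notation not from the paper) is

$$\max\ \frac{1}{m}\sum_{C}\ \sum_{\tau \in [q]^k :\, C(\tau)=1} \lambda_C(\tau) \qquad \text{s.t.} \qquad \sum_{\tau :\, \tau(v)=a} \lambda_C(\tau) = \mu_v(a) \quad \forall\, C,\ v \in \mathrm{scope}(C),\ a \in [q],$$

where each $\lambda_C$ is a probability distribution over local assignments in $[q]^k$ and each $\mu_v$ is a distribution over $[q]$. A $(\gamma,\beta)$-integrality gap instance has LP value at least $\gamma$ but true optimum at most $\beta$.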

TR26-053 | How Does Machine Learning Manage Complexity? | Lance Fortnow

from ECCC Papers

We provide a computational complexity lens to understand the power of machine learning models, particularly their ability to model complex systems. Machine learning models are often trained on data drawn from sampleable or more complex distributions, a far wider range of distributions than just computable ones. By focusing on computable distributions, machine learning models can better manage complexity via probability. We abstract away from specific learning mechanisms, modeling machine learning as producing P/poly-computable distributions with polynomially-bounded max-entropy. We illustrate how learning computable distributions models complexity by showing that if a machine learning model produces a distribution $\mu$ that minimizes error against the distribution generated by a cryptographic pseudorandom generator, then $\mu$ must be close to uniform.
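
As a reminder (our gloss, assuming the standard definition), "max-entropy" is the Rényi entropy of order zero,

$$H_0(\mu) = \log_2 \bigl|\{x : \mu(x) > 0\}\bigr|,$$

so a polynomial bound on $H_0(\mu)$ bounds the support of $\mu$ by $2^{\mathrm{poly}(n)}$ strings.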

postdoc at Nagoya University (apply by April 30, 2026)

from CCI: jobs

Applications are invited for a postdoctoral position in theoretical computer science (especially quantum computing) at the Graduate School of Mathematics, Nagoya University.

The position will start after July 2026. The starting date and duration are negotiable.

Website: http://francoislegall.com/jobs2026.html
Email: legall@math.nagoya-u.ac.jp

By shacharlovett

Toward a Tractability Frontier for Exact Relevance Certification

from arXiv: Computational Complexity

Authors: Tristan Simas

Exact relevance certification asks which coordinates are necessary to determine the optimal action in a coordinate-structured decision problem. The tractable families treated here admit a finite primitive basis, but optimizer-quotient realizability is maximal, so quotient shape alone cannot characterize the frontier. We prove a meta-impossibility theorem for efficiently checkable structural predicates invariant under the theorem-forced closure laws of exact certification. Structural convergence with zero-distortion summaries, quotient entropy bounds, and support-counting arguments explains why those closure laws are canonical. We establish the theorem by constructing same-orbit disagreements for four obstruction families, namely dominant-pair concentration, margin masking, ghost-action concentration, and additive/statewise offset concentration, using action-independent, pair-targeted affine witnesses. Consequently no correct tractability classifier on a closure-closed domain yields an exact characterization over these families. Here closure-orbit agreement is forced by correctness rather than assumed as an invariance axiom. The result therefore applies to correct classifiers on closure-closed domains, not only to classifiers presented through a designated admissibility package.

Multiple Planted Structures Below $\sqrt{n}$: An SoS Integrality Gap and an SQ Lower Bound

from arXiv: Computational Complexity

Authors: Matvey Mosievskiy, Lev Reyzin

We study computational limitations in \emph{multi-plant} average-case inference problems, in which $t$ disjoint planted structures of size $k$ are embedded in a random background on $n$ elements. A natural parameter in this setting is the total planted size $K := kt$. For several classic planted-subgraph problems, including planted clique, existing algorithmic and lower-bound evidence suggests a characteristic computational threshold near $\sqrt{n}$ in the single-plant setting. Our main result is a Sum-of-Squares (SoS) integrality gap for refuting the presence of multiple planted cliques. Specifically, for $G \sim G(n,1/2)$, we construct a degree-$d$ SoS pseudoexpectation for the natural relaxation that maximizes the total size of up to $t$ disjoint cliques. Throughout the regime $kt \le n^{1/2 - c\sqrt{d/\log n}},$ for a universal constant $c>0$, this relaxation achieves objective value $kt(1-o(1))$, and therefore degree-$d$ SoS cannot certify an upper bound below $kt$. This extends the planted-clique SoS lower bounds of~\cite{BarakHKKMP19} to a multi-plant setting with explicit disjointness constraints. As complementary evidence from a different computational model, we prove a lower bound in the statistical query (SQ) framework, extending the results of~\cite{FeldmanGRVX17}. We show that for detecting $t$ disjoint planted $k \times k$ bicliques (equivalently, a row-mixture distribution), when $kt = O(n^{1/2-δ})$ for any fixed $δ>0$, no polynomial-time SQ algorithm can distinguish the planted and null distributions with constant advantage.
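
A minimal sketch of the multi-plant model in question (parameter choices and naming are ours, purely for illustration; the regime of interest is $kt \lesssim \sqrt{n}$):

```python
import numpy as np

def multi_planted_cliques(n, k, t, rng):
    # Erdos-Renyi background G(n, 1/2) as a symmetric 0/1 adjacency matrix.
    A = rng.integers(0, 2, size=(n, n))
    A = np.triu(A, 1); A = A + A.T
    verts = rng.permutation(n)[: k * t]
    for i in range(t):                     # t disjoint planted cliques of size k
        S = verts[i * k : (i + 1) * k]
        A[np.ix_(S, S)] = 1
    np.fill_diagonal(A, 0)
    return A

A = multi_planted_cliques(n=400, k=10, t=2, rng=np.random.default_rng(0))
```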

How Does Machine Learning Manage Complexity?

from arXiv: Computational Complexity

Authors: Lance Fortnow

We provide a computational complexity lens to understand the power of machine learning models, particularly their ability to model complex systems. Machine learning models are often trained on data drawn from sampleable or more complex distributions, a far wider range of distributions than just computable ones. By focusing on computable distributions, machine learning models can better manage complexity via probability. We abstract away from specific learning mechanisms, modeling machine learning as producing P/poly-computable distributions with polynomially-bounded max-entropy. We illustrate how learning computable distributions models complexity by showing that if a machine learning model produces a distribution $μ$ that minimizes error against the distribution generated by a cryptographic pseudorandom generator, then $μ$ must be close to uniform.

When Majority Fails: Tight Bounds for Correlation Distillation Conjectures

from arXiv: Computational Complexity

Authors: Pritish Kamath, Ravi Kumar, Pasin Manurangsi

We study two conjectures posed in the analysis of Boolean functions $f : \{-1, 1\}^n \to \{-1, 1\}$, in both of which the Majority function plays a central role: "Majority is Least Stable" (Benjamini et al., 1999) and "Non-Interactive Correlation Distillation for Erasures" (Yang, 2004; O'Donnell and Wright, 2012). While both conjectures have been refuted in their originally stated form, we obtain a nearly tight characterization of the noise parameter regime in which each conjecture holds, for all $n \ge 5$; for $n=3$, both conjectures hold in all noise parameter regimes. We state refined versions of both conjectures that we believe capture the spirit of the originals.
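
The stability in "Majority is Least Stable" is noise stability, $\mathrm{Stab}_\rho(f) = \mathbb{E}[f(x)f(y)]$ for $y$ a $\rho$-correlated copy of $x$; a Monte Carlo sketch (ours) for estimating it:

```python
import numpy as np

def noise_stability_mc(f, n, rho, trials=100_000, seed=0):
    # Estimate Stab_rho(f) = E[f(x) f(y)], where each y_i equals x_i
    # with probability (1 + rho)/2, independently across coordinates.
    rng = np.random.default_rng(seed)
    x = rng.choice([-1, 1], size=(trials, n))
    flip = rng.random((trials, n)) < (1 - rho) / 2
    y = np.where(flip, -x, x)
    return np.mean(f(x) * f(y))

maj = lambda z: np.sign(z.sum(axis=1))      # odd n: never ties
print(noise_stability_mc(maj, n=5, rho=0.4))
```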

On Formally Undecidable Propositions of Nondeterministic Complexity and Related Classes

from arXiv: Computational Complexity

Authors: Martin Kolář

The definition of \NP\ requires, for each member language~$L$, a polynomial-time checking relation~$R$ and a constant~$k$ such that $w \in L \iff \exists y\,(|y| \leq |w|^k \wedge R(w,y))$. We show that this biconditional instantiates, for each member language, Hilbert's triple: a sound, complete, decidable proof system in which truth-in-$L$ and bounded provability coincide by fiat. We show further that the polynomial-time restriction on~$R$ does not exclude Gödel's proof-checking relation, which is itself polynomial-time and fits the definition as a literal instance. Hence \NP, taken as a totality over all polynomial-time~$R$, contains languages for which the biconditional asserts a property that Gödel's First Incompleteness Theorem prohibits. The semantic definition of \NP\ is unsatisfiable, for the same reason that Hilbert's Program is.

The Josehedron: A space-filling plesiohedron based on the Fischer-Koch S Triply Periodic Minimal Surface

from arXiv: Computational Geometry

Authors: Mathias Bernhard

This paper presents a novel space-filling polyhedron (SFPH), here named the Josehedron, derived from the extremal points of the Fischer-Koch S triply periodic minimal surface (TPMS). The Josehedron is a plesiohedron with 12 faces (4 isosceles triangles and 8 mirror-symmetric quadrilaterals), 12 vertices, and 22 edges. It tiles three-dimensional space with 12 instances per cubic unit cell in 6 distinct orientations. The generating point set exhibits a remarkable connection to the pentagonal Cairo tiling when projected onto any coordinate plane. Several additional geometric properties are described, including integer vertex coordinates, interwoven labyrinths, and chiral symmetry between the polyhedra obtained from the combined minima and maxima of the function. Finally, the paper presents a general method for finding novel SFPHs based on any periodic function, TPMS, or other functions. The described method is applied to a selection of TPMS, and 7 additional, previously undocumented SFPH are shown in the Appendix.
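
One rough way to explore such extremal points numerically is via the trigonometric nodal (level-set) approximation commonly used for Fischer-Koch S; note this is only an approximation of the true minimal surface the paper works with, and the grid search below is purely illustrative (formula and code are our sketch, not the paper's method):

```python
import numpy as np

def fks(x, y, z):
    # Nodal approximation commonly used for the Fischer-Koch S surface.
    return (np.cos(2*x)*np.sin(y)*np.cos(z)
          + np.cos(x)*np.cos(2*y)*np.sin(z)
          + np.sin(x)*np.cos(y)*np.cos(2*z))

t = np.linspace(0, 2*np.pi, 96, endpoint=False)
X, Y, Z = np.meshgrid(t, t, t, indexing="ij")
F = fks(X, Y, Z)
i = np.unravel_index(np.argmax(F), F.shape)   # one candidate maximum
print(X[i], Y[i], Z[i], F[i])
```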

An Algebraic Introduction to Persistence

from arXiv: Computational Geometry

Authors: Ulrich Bauer, Thomas Brüstle, Luis Scoccola

We introduce persistence with an emphasis on its algebraic foundations, using the representation theory of posets. Linear representations of posets arise in several areas of mathematics, including the representation theory of quivers and finite dimensional algebras, Morse theory and other areas of geometry, as well as topological inference and topological data analysis -- often via persistent homology. In some of these contexts, the category of poset representations of interest admits a metric structure given by the so-called interleaving distance. Persistence studies the algebraic properties of these poset representations and their behavior under perturbations in the interleaving distance. We survey fundamental results in the area and applications to pure and applied mathematics, as well as theoretical challenges and open questions.
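
For context, in the standard one-parameter setting the interleaving distance is defined as follows: persistence modules $M, N$ over $\mathbb{R}$ are $\varepsilon$-interleaved if there are morphisms $f\colon M \to N(\varepsilon)$ and $g\colon N \to M(\varepsilon)$ (where $(\cdot)(\varepsilon)$ shifts a module by $\varepsilon$) such that $g(\varepsilon)\circ f$ and $f(\varepsilon)\circ g$ are the internal $2\varepsilon$-shift maps of $M$ and $N$, and then

$$d_I(M, N) = \inf\{\varepsilon \ge 0 : M \text{ and } N \text{ are } \varepsilon\text{-interleaved}\}.$$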

Toward a Uniform Algorithm and Uniform Reduction for Constraint Problems

from arXiv: Data Structures and Algorithms

Authors: Libor Barto, Maximilian Hadek, Dmitriy Zhuk

We develop a unified framework to characterize the power of higher-level algorithms for the constraint satisfaction problem (CSP), such as $k$-consistency, the Sherali-Adams LP hierarchy, and the affine IP hierarchy. As a result, solvability of a fixed-template CSP or, more generally, a Promise CSP by a given level is shown to depend only on the polymorphism minion of the template. Similarly, we obtain a minion-theoretic description of $k$-consistency reductions between Promise CSPs. We introduce a new hierarchy of SDP-like vector relaxations with vectors over $\mathbb Z_{p}$ in which orthogonality is imposed on $k$-tuples of vectors. Surprisingly, this relaxation turns out to be equivalent to the $k$-th level of the AIP-$\mathbb{Z}_p$ relaxation. We show that it solves the CSP of the dihedral group $\mathbf{D}_4$, the smallest CSP that fools the singleton BLP+AIP algorithm. Using this vector representation, we further show that the $p$-th level of the $\mathbb{Z}_p$ relaxation solves linear equations modulo $p^2$.

On the Price of Privacy for Language Identification and Generation

from arXiv: Data Structures and Algorithms

Authors: Xiaoyu Li, Andi Han, Jiaojiao Jiang, Junbin Gao

As large language models (LLMs) are increasingly trained on sensitive user data, understanding the fundamental cost of privacy in language learning becomes essential. We initiate the study of differentially private (DP) language identification and generation in the agnostic statistical setting, establishing algorithms and matching lower bounds that precisely quantify the cost of privacy. For both tasks, approximate $(\varepsilon, δ)$-DP with constant $\varepsilon > 0$ recovers the non-private error rates: $\exp(-r(n))$ for identification (for any $r(n) = o(n)$) and $\exp(-Ω(n))$ for generation. Under pure $\varepsilon$-DP, the exponents degrade by a multiplicative factor of $\min\{1, \varepsilon\}$, which we show is tight up to constants. Notably, for generation under pure DP with mild assumptions, the upper bound $\exp(-\min\{1,\varepsilon\} \cdot Ω(n))$ matches the lower bound up to some constants, establishing an optimal rate. Our results show that the cost of privacy in language learning is surprisingly mild: absent entirely under approximate DP, and exactly a $\min\{1,\varepsilon\}$ factor in the exponent under pure DP.
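
As a reminder, the privacy notion here is standard differential privacy: a randomized algorithm $M$ is $(\varepsilon, \delta)$-DP if for all neighboring inputs $S \sim S'$ and all events $E$,

$$\Pr[M(S) \in E] \;\le\; e^{\varepsilon}\, \Pr[M(S') \in E] + \delta,$$

and pure $\varepsilon$-DP is the case $\delta = 0$.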

Wednesday, April 08

TR26-052 | A Sharp Characterization of Pessiland | Mikito Nanashima, Shuichi Hirahara

from ECCC Papers

It is a long-standing open question whether the average-case hardness of NP implies the existence of a one-way function. The hypothetical world in which this does not hold is called Pessiland, which is the most pessimistic among Impagliazzo's five possible worlds. In this paper, we present the first "sharp" characterization of Pessiland:

- NP is hard on average if and only if the minimum description length of programs in agnostic learning is hard to approximate on average with an approximation factor $\ell / \mathrm{polylog}(\ell)$, where $\ell$ is a new complexity measure of a distribution called the advice complexity of sampling.

- A one-way function does not exist if and only if the minimum description length of programs in agnostic learning is easy to approximate on average with an approximation factor $O(\ell)$.

In particular, Pessiland is ruled out if and only if the small quantitative gap between the approximation factors $\ell/\mathrm{polylog}(\ell)$ and $O(\ell)$ is closed. Our characterization is based on an optimal NP-hardness result for the Collective Minimum Monotone Satisfying Assignment (CMMSA) problem, whose task is, given as input a collection of monotone formulas with at most $\ell$ literals, to compute the minimum weight of an assignment that satisfies as many monotone formulas as possible. We prove the NP-hardness of approximating the minimum weight within a factor of $\ell / \mathrm{polylog}(\ell)$, improving the previous inapproximability factor of $\ell^{\Omega(1)}$ by Hirahara (FOCS 2022). Our inapproximability factor is optimal up to the $\mathrm{polylog}(\ell)$ factor unless NP $\subseteq$ coAM, because the CMMSA problem with an approximation factor $O(\ell)$ is in coAM.
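
A toy CMMSA instance (ours, assuming weight counts the variables set to true): for the monotone formulas $\varphi_1 = x_1$, $\varphi_2 = x_1 \wedge x_2$, $\varphi_3 = x_2 \vee x_3$, all three are simultaneously satisfiable, and the minimum weight of an assignment doing so is $2$ (set $x_1 = x_2 = 1$); the hardness result concerns approximating this minimum weight.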

PhD student at Lund University (apply by April 15, 2026)

from CCI: jobs

The Department of Computer Science at Lund University invites applications for 1-2 PhD positions in theoretical computer science with focus on computational complexity and algorithms to be working in the research group of Susanna de Rezende. These are four-year full-time employed positions, but PhD positions usually include 20% teaching, in which case they are prolonged for one more year.

Website: https://derezende.github.io/openpositions/
Email: susanna.rezende@cs.lth.se

By shacharlovett

SMB algebras II: On the Constraint Satisfaction Problem over Semilattices of Mal'cev Blocks

from arXiv: Computational Complexity

Authors: Petar Marković, Miklós Maróti, Ralph McKenzie, Aleksandar Prokić

We define a class of algebras, the semilattices of Mal'cev blocks (for short, SMB algebras). In a nutshell, these algebras are semilattices in which each element gets blown up into a Mal'cev algebra. We publish for the first time our old proofs that some SMB algebras induce tractable templates of the Constraint Satisfaction Problem. Next, we reprove that, in fact, all SMB algebras induce tractable templates of the Constraint Satisfaction Problem, a result already proved by A. Bulatov. Also, we compare the two general proofs of the CSP Dichotomy and show that they are more similar than initially thought when applied to SMB algebras. This paper is the second in a series of papers investigating SMB algebras and is a precursor to our further research on the similarities between the proofs of the Dichotomy Theorem.

Selecting a Maximum Solow-Polasky Diversity Subset in General Metric Spaces Is NP-hard

from arXiv: Computational Geometry

Authors: Michael T. M. Emmerich, Ksenia Pereverdieva, André H. Deutz

The Solow--Polasky diversity indicator (or magnitude) is a classical measure of diversity based on pairwise distances. It has applications in ecology, conservation planning, and, more recently, in algorithmic subset selection and diversity optimization. In this note, we investigate the computational complexity of selecting a subset of fixed cardinality from a finite set so as to maximize the Solow--Polasky diversity value. We prove that this problem is NP-hard in general metric spaces. The reduction is from the classical Independent Set problem and uses a simple metric construction containing only two non-zero distance values. Importantly, the hardness result holds for every fixed kernel parameter $θ_0>0$; equivalently, by rescaling the metric, one may fix the parameter to $1$ without loss of generality. A central point is that this is not a boilerplate reduction: because the Solow--Polasky objective is defined through matrix inversion, it is a nontrivial nonlinear function of the distances. Accordingly, the proof requires a dedicated strict-monotonicity argument for the specific family of distance matrices arising in the reduction; this strict monotonicity is established here for that family, but it is not assumed to hold in full generality. We also explain how the proof connects to continuity and monotonicity considerations for diversity indicators.
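
For concreteness, the Solow--Polasky value of a subset with distance matrix $D$ is $\mathbf{1}^{\top} Z^{-1} \mathbf{1}$ with $Z_{ij} = e^{-\theta d_{ij}}$. A brute-force selection sketch (ours; the paper's point is that no polynomial-time method solves the selection problem exactly in general metrics unless P = NP):

```python
import numpy as np
from itertools import combinations

def solow_polasky(D, theta=1.0):
    # D: pairwise distance matrix of the selected subset.
    Z = np.exp(-theta * D)                               # similarity kernel
    return np.linalg.solve(Z, np.ones(len(D))).sum()     # 1^T Z^{-1} 1

def best_subset(D, k, theta=1.0):
    # exhaustive search over all size-k subsets (exponential in general)
    n = len(D)
    return max(combinations(range(n), k),
               key=lambda S: solow_polasky(D[np.ix_(S, S)], theta))

rng = np.random.default_rng(1)
P = rng.random((8, 2))
D = np.linalg.norm(P[:, None] - P[None, :], axis=-1)     # Euclidean metric
print(best_subset(D, k=3))
```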

Realizing Planar Linkages in Polygonal Domains

from arXiv: Computational Geometry

Authors: Thomas Depian, Carolina Haase, Martin Nöllenburg, André Schulz

A linkage $\mathcal{L}$ consists of a graph $G=(V,E)$ and an edge-length function $\ell$. Deciding whether $\mathcal{L}$ can be realized as a planar straight-line embedding in $\mathbb{R}^2$ with edge length $\ell(e)$ for all $e \in E$ is $\exists\mathbb{R}$-complete [Abel et al., JoCG'25], even if $\ell \equiv 1$, but a considerable part of $\mathcal{L}$ is rigid. In this paper, we study the computational complexity of the realization question for structurally simpler, less rigid linkages inside an open polygonal domain $P$, where the placement of some vertices may be specified in the input. We show XP-membership and W[1]-hardness with respect to the size of $G$, even if $\ell \equiv 1$ and no vertex positions are prescribed. Furthermore, we consider the case where $G$ is a path with prescribed start and end position and $\ell \equiv 1$. Despite the absence of any rigid components, we obtain NP-hardness in general, and provide a linear-time algorithm for arbitrary $\ell$ if $G$ has only three edges and $P$ is convex.
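
As a sanity check on the flavor of the problem (our example): for a path with a single interior vertex, fixed endpoints $p_s, p_t$, and edge lengths $\ell_1, \ell_2$, a realization exists in the plane, ignoring the domain $P$, iff

$$|\ell_1 - \ell_2| \;\le\; \|p_s - p_t\|_2 \;\le\; \ell_1 + \ell_2$$

by the triangle inequality; the domain constraints and longer paths are what make the problem hard.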

Learning $\mathsf{AC}^0$ Under Graphical Models

from arXiv: Data Structures and Algorithms

Authors: Gautam Chandrasekaran, Jason Gaitonde, Ankur Moitra, Arsen Vasilyan

In a landmark result, Linial, Mansour and Nisan (J. ACM 1993) gave a quasipolynomial-time algorithm for learning constant-depth circuits given labeled i.i.d. samples under the uniform distribution. Their work has had a deep and lasting legacy in computational learning theory, in particular introducing the $\textit{low-degree algorithm}$. However, an important critique of many results and techniques in the area is the reliance on product structure, which is unlikely to hold in realistic settings. Obtaining similar learning guarantees for more natural correlated distributions has been a longstanding challenge in the field. In particular, we give quasipolynomial-time algorithms for learning $\mathsf{AC}^0$ substantially beyond the product setting, when the inputs come from any graphical model with polynomial growth that exhibits strong spatial mixing. The main technical challenge is in giving a workaround to Fourier analysis, which we do by showing how new sampling algorithms allow us to transfer statements about low-degree polynomial approximation under the uniform setting to graphical models. Our approach is general enough to extend to other well-studied function classes, like monotone functions and halfspaces.
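
A minimal sketch of the low-degree algorithm in the uniform-distribution setting it originated in (ours; the paper's contribution is transferring such guarantees to graphical models):

```python
import numpy as np
from itertools import combinations

def low_degree_learn(X, y, d):
    # Low-degree algorithm over the uniform distribution on {-1,1}^n:
    # estimate Fourier coefficients f_hat(S) = E[f(x) * prod_{i in S} x_i]
    # for all |S| <= d, then predict with the sign of the truncation.
    n = X.shape[1]
    sets = [S for k in range(d + 1) for S in combinations(range(n), k)]
    coeffs = {S: float(np.mean(y * X[:, list(S)].prod(axis=1))) for S in sets}
    def predict(x):
        val = sum(c * x[list(S)].prod() for S, c in coeffs.items())
        return 1.0 if val >= 0 else -1.0
    return predict

rng = np.random.default_rng(0)
X = rng.choice([-1, 1], size=(2000, 6))
y = np.sign(X[:, :3].sum(axis=1))   # target: majority of three coordinates
h = low_degree_learn(X, y, d=2)
```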

$k$-Clustering via Iterative Randomized Rounding

from arXiv: Data Structures and Algorithms

Authors: Jarosław Byrka, Yuhao Guo, Yang Hu, Shi Li, Chengzhang Wan, Zaixuan Wang

In this work we propose a single rounding algorithm for fractional solutions of the standard LP relaxation for $k$-clustering. As a starting point, we obtain an iterative-rounding $(\frac{3^p + 1}{2})$-Lagrangian-Multiplier-Preserving (LMP) approximation for the $k$-clustering problem with the cost function being the $p$-th power of the distance. Such an algorithm outputs a random solution that opens $k$ facilities \emph{in expectation}, whose expected cost is at most $\frac{3^p + 1}{2}$ times the optimum cost. Thus, we recover the $2$-LMP approximation for $k$-median by Jain et al.~[JACM'03], which played a central role in deriving the current best $2$-approximation for $k$-median. Unlike the result of Jain et al., our algorithm is based on LP rounding, and it can be easily adapted to the $L_p^p$-cost setting. For the Euclidean $k$-means problem, the LMP factor we obtain is $\frac{11}{3}$, which is better than the $5$-approximation given by this framework for general metrics. Then, we show how to convert the LMP approximation algorithms into true approximations, with only a $(1+\varepsilon)$ factor loss in the approximation ratio. We obtain a $(\frac{3^p + 1}{2}+\varepsilon)$-approximation algorithm for $k$-clustering with the cost function being the $p$-th power of the distance, for $p \geq 1$. This reproduces the best known $(2+\varepsilon)$-approximation for $k$-median by Cohen-Addad et al. [STOC'25], and improves the approximation factor for metric $k$-means from $5.83$ by Charikar et al. [FOCS'25] to $5+\varepsilon$ in our framework. Moreover, the same algorithm, but with a specialized analysis, attains a $(4+\varepsilon)$-approximation for Euclidean $k$-means, matching the recent result by Charikar et al. [STOC'26].
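
The standard LP relaxation being rounded is, in our notation (with $x_{ij}$ the extent to which client $j$ is assigned to facility $i$, and $y_i$ the extent to which $i$ is open):

$$\min \sum_{i,j} d(i,j)^{p}\, x_{ij} \qquad \text{s.t.} \qquad \sum_i x_{ij} = 1 \;\; \forall j, \qquad x_{ij} \le y_i, \qquad \sum_i y_i \le k, \qquad x, y \ge 0.$$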

Distributed Quantum Property Testing with Communication Constraints

from arXiv: Data Structures and Algorithms

Authors: Mina Doosti, Ryan Sweke, Chirag Wadhwa

We introduce a framework for distributed quantum inference under communication constraints. In our model, $m$ distributed nodes each receive one copy of an unknown $d$-dimensional quantum state $ρ$, before communicating via a constrained one-way communication channel with a central node, which aims to infer some property of $ρ$. This framework generalizes the classical distributed inference framework introduced by Acharya, Canonne, and Tyagi [COLT2019], by allowing quantum resources such as quantum communication and shared entanglement. Within this setting, we focus on the fundamental problem of quantum state certification: Given a complete description of some state $σ$, decide whether $ρ=σ$ or $\|ρ-σ\|_1\geq ε$. Additionally, we focus on the case of limited quantum communication between distributed nodes and the central node. We show that when each communication channel is limited to only $n_q\leq \log d$ qubits, then the sample complexity of distributed state certification is $\mathcal{O}(\frac{d^2}{2^{n_q}ε^2})$ when public randomness is available to all nodes. Moreover, under the assumption that the channels used by the distributed nodes are mixedness-preserving, we prove a matching lower bound. We further demonstrate that shared randomness is necessary to achieve the above complexity, by proving an $Ω(\frac{d^3}{4^{n_q} ε^2})$ lower bound in the private-coin setting under the same assumption as above. Our lower bounds leverage a recently introduced quantum analogue of the celebrated Ingster-Suslina method and generalize arguments from the classical setting. Together, our work provides the first characterization of distributed quantum state certification in the regime of limited quantum communication and establishes a general framework for distributed quantum inference with communication constraints.

Polynomial-Time Algorithm for Thiele Voting Rules with Voter Interval Preferences

from arXiv: Data Structures and Algorithms

Authors: Pasin Manurangsi, Krzysztof Sornat

We present a polynomial-time algorithm for computing an optimal committee of size $k$ under any given Thiele voting rule for elections on the Voter Interval domain (i.e., when voters can be ordered so that each candidate is approved by a consecutive block of voters). Our result extends to the Generalized Thiele rule, in which each voter has an individual weight (scoring) sequence. This resolves a 10-year-old open problem that was originally posed for Proportional Approval Voting and later extended to every Thiele rule (Elkind and Lackner, IJCAI 2015; Peters, AAAI 2018). Our main technical ingredient is a new structural result -- a concavity theorem for families of intervals. It shows that, given two solutions of different sizes, one can construct a solution of any intermediate size whose score is at least the corresponding linear interpolation of the two scores. As a consequence, on Voter Interval profiles, the optimal total Thiele score is a concave function of the committee size. We exploit this concavity within an optimization framework based on a Lagrangian relaxation of a natural integer linear program formulation, obtained by moving the cardinality constraint into the objective. On Voter Interval profiles, the resulting constraint matrix is totally unimodular, so it can be solved in polynomial time. Our main algorithm and its proof were obtained via human--AI collaboration. In particular, a slightly simplified version of the main structural theorem used by the algorithm was obtained in a single call to Gemini Deep Think.
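
For concreteness, a Thiele rule scores a committee through a weight sequence; a tiny sketch (ours), with Proportional Approval Voting as the instance:

```python
def thiele_score(approvals, committee, w):
    # approvals: iterable of voter approval sets; w[j] is the gain from a
    # voter's (j+1)-st approved committee member. PAV: w = (1, 1/2, 1/3, ...).
    return sum(sum(w[:len(A & committee)]) for A in approvals)

w_pav = [1 / j for j in range(1, 50)]
votes = [{0, 1}, {1, 2}, {2, 3}]              # interval-like approval sets
print(thiele_score(votes, {1, 2}, w_pav))     # 1 + (1 + 1/2) + 1 = 3.5
```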

Improved Space-Time Tradeoffs for Permutation Problems via Extremal Combinatorics

from arXiv: Data Structures and Algorithms

Authors: Afrouz Jabal Ameli, Jesper Nederlof, Shengzhe Wang

We provide improved space-time tradeoffs for permutation problems over additively idempotent semirings. In particular, there is an algorithm for the Traveling Salesperson Problem that solves $N$-vertex instances using space $S$ and time $T$ where $S\cdot T \leq 3.7493^{N}$. This improves on previous work by Koivisto and Parviainen [SODA'10], where $S\cdot T \leq 3.9271^N$, and overcomes a barrier they identified, as their bound was shown to be optimal within their framework. To get our results, we introduce a new parameter of a set system that we call the chain efficiency, which relates the number of maximal chains contained in the set system to the cardinality of the system. We show that set systems of high chain efficiency imply efficient space-time tradeoffs for permutation problems, and give constructions of set systems with high chain efficiency, disproving a conjecture by Johnson, Leader and Russell [Comb. Probab. Comput.'15].

Improved space-time tradeoff for TSP via extremal set systems

from arXiv: Data Structures and Algorithms

Authors: Justin Dallant, László Kozma

The traveling salesman problem (TSP) is a cornerstone of combinatorial optimization and has deeply influenced the development of algorithmic techniques in both exact and approximate settings. Yet, improving on the decades-old bounds for solving TSP exactly remains elusive: the dynamic program of Bellman, Held, and Karp from 1962 uses $2^{n+O(\log{n})}$ time and space, and the divide-and-conquer approach of Gurevich and Shelah from 1987 uses $4^{n + O(\log^2{n})}$ time and polynomial space. A straightforward combination of the two algorithms trades off $T^{n+o(n)}$ time and $S^{n+o(n)}$ space at various points of the curve $ST = 4$. An improvement to this tradeoff when $2 < T < 2\sqrt{2}$ was found by Koivisto and Parviainen (SODA 2010), yielding a minimum of $ST \approx 3.93$. Koivisto and Parviainen show their method to be optimal among a broad class of partial-order-based approaches, and to date, no improvement or alternative method has been found. In this paper we give a tradeoff that strictly improves all previous ones for all $2 < T < 4$, achieving a minimum of $ST < 3.572$. A key ingredient is the construction of sparse set systems (hypergraphs) that admit a large number of maximal chains. The existence of such objects is of independent interest in extremal combinatorics, likely to see further applications. Along the way we disprove a combinatorial conjecture of Johnson, Leader, and Russell from 2013, relating it with the optimality of the previous tradeoff schemes for TSP. Our techniques extend to a broad class of permutation problems over arbitrary semirings, yielding improved space-time tradeoffs in these settings as well.
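
The $2^{n+O(\log n)}$ time-and-space baseline mentioned above is the Bellman-Held-Karp dynamic program; a compact sketch (ours):

```python
from itertools import combinations

def held_karp(dist):
    # dp[(S, v)]: length of the shortest path that starts at city 0, visits
    # exactly the cities in bitmask S (which contains 0 and v), and ends at v.
    # Both time and space are ~2^n -- the quantities the tradeoffs above trade off.
    n = len(dist)
    dp = {(1 | (1 << v), v): dist[0][v] for v in range(1, n)}
    for size in range(3, n + 1):
        for S in combinations(range(1, n), size - 1):
            mask = 1 | sum(1 << v for v in S)
            for v in S:
                dp[(mask, v)] = min(dp[(mask ^ (1 << v), u)] + dist[u][v]
                                    for u in S if u != v)
    full = (1 << n) - 1
    return min(dp[(full, v)] + dist[v][0] for v in range(1, n))

D = [[0, 2, 9, 10], [2, 0, 6, 4], [9, 6, 0, 8], [10, 4, 8, 0]]
print(held_karp(D))   # 23
```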

Classes Testable with $O(1/ε)$ Queries for Small $ε$ Independent of the Number of Variables

from arXiv: Data Structures and Algorithms

Authors: Nader H. Bshouty, George Haddad

In this paper, we study classes of Boolean functions that are testable with $O(ψ+1/ε)$ queries, where $ψ$ depends on the parameters of the class (e.g., the number of terms, the number of relevant variables, etc.) but not on the total number of variables $n$. In particular, when $ε\le 1/ψ$, the query complexity is $O(1/ε)$, matching the known tight bound $Ω(1/ε)$. This result was previously known for classes of terms of size at most $k$ and exclusive-OR functions of at most $k$ variables. In this paper, we extend this list to include the following classes: $k$-juntas, functions with Fourier degree at most $d$, $s$-sparse polynomials of degree at most $d$, and $s$-sparse polynomials. Additionally, we show that for any class $C$ of Boolean functions that depend on at most $k$ variables, if $C$ is properly exactly learnable, then it is testable with $O(1/ε)$ queries for $ε<1/ψ$, where $ψ$ depends on $k$ and is independent of the total number of variables $n$.
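The last claim follows the classic testing-by-learning template. Here is a rough sketch of that template, not the paper's algorithm: it assumes a proper exact learner `learn_C` is available (the learner itself, and the query count $ψ$ it contributes, are where the class-specific work lives).

```python
import random

def test_by_learning(f, learn_C, n, eps):
    """Generic testing-by-learning template (a sketch, not the paper's algorithm).

    f: query access to the unknown function; f(x) takes a tuple of n bits.
    learn_C: assumed proper exact learner for class C; returns a hypothesis
             h in C, or None if f's answers fit no function in C.
    Accepts every f in C; rejects (w.h.p.) every f that is eps-far from C.
    """
    h = learn_C(f, n)              # contributes the psi term of the query count
    if h is None:
        return False               # f is inconsistent with all of C
    # Verify the hypothesis on O(1/eps) uniformly random points.
    for _ in range(int(2 / eps) + 1):
        x = tuple(random.randint(0, 1) for _ in range(n))
        if f(x) != h(x):
            return False
    return True
```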

Maintaining Random Assignments under Adversarial Dynamics

from arXiv: Data Structures and Algorithms

Authors: Bernhard Haeupler, Anton Paramonov

We study and further develop powerful general-purpose schemes to maintain random assignments under adversarial dynamic changes. The goal is to maintain assignments that are (approximately) distributed as if all assignments were freshly resampled after each change, while doing only a few resamples per change. This becomes particularly interesting and challenging when the dynamics are controlled by an adaptive adversary. Our work builds on and further develops the proactive resampling technique [Bhattacharya, Saranurak, and Sukprasert ESA'22]. We identify a new "temporal selection" attack that adaptive adversaries can use to cause biases, even against proactive resampling. We propose a new "temporal aggregation" principle that algorithms should follow to counteract these biases, and present two powerful new resampling schemes based on this principle. We give various applications of our new methods. The main one is maintaining a proper coloring of a graph under adaptive adversarial modifications: we maintain an $O(Δ)$ coloring for general graphs with maximum degree $Δ$ and an $O(Δ/\ln Δ)$ coloring for triangle-free graphs, both with average work per modification sublinear in the number of vertices. Other applications include efficiently maintaining random walks in dynamically changing graphs.
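For contrast with the paper's schemes, here is a sketch of the naive resampling baseline for dynamic coloring: recolor a conflicted endpoint uniformly from its free colors. With a palette larger than the maximum degree this always succeeds, but it is exactly the kind of simple strategy whose output distribution an adaptive adversary can bias, which is what motivates proactive resampling and the temporal-aggregation principle.

```python
import random
from collections import defaultdict

def naive_recolor(adj, color, palette_size, v):
    """Resample v's color uniformly among colors unused by its neighbors.

    The choice set is nonempty whenever palette_size > deg(v),
    e.g., with a palette of size 2 * Delta.
    """
    used = {color[u] for u in adj[v]}
    color[v] = random.choice([c for c in range(palette_size) if c not in used])

def insert_edge(adj, color, palette_size, u, v):
    """Insert edge (u, v) and repair the coloring if a conflict appears."""
    adj[u].add(v)
    adj[v].add(u)
    if color[u] == color[v]:
        naive_recolor(adj, color, palette_size, v)

# Toy usage: a 4-cycle, palette of size 2 * max_degree = 6.
adj = defaultdict(set)
color = {v: 0 for v in range(4)}        # deliberately conflicting start
for (u, v) in [(0, 1), (1, 2), (2, 3), (3, 0)]:
    insert_edge(adj, color, 6, u, v)
assert all(color[u] != color[v] for u in adj for v in adj[u])
```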

A canonical generalization of OBDD

from arXiv: Data Structures and Algorithms

Authors: Florent Capelli, YooJung Choi, Stefan Mengel, Martín Muñoz, Guy Van den Broeck

We introduce Tree Decision Diagrams (TDD) as a model for Boolean functions that generalizes OBDD. They can be seen as a restriction of structured d-DNNF; that is, d-DNNF that respect a vtree $T$. We show that TDDs enjoy the same tractability properties as OBDD, such as model counting, enumeration, conditioning, and apply, and are more succinct. In particular, we show that CNF formulas of treewidth $k$ can be represented by TDDs of FPT size, which is known to be impossible for OBDD. We study the complexity of compiling CNF formulas into deterministic TDDs via bottom-up compilation and relate the complexity of this approach with the notion of factor width introduced by Bova and Szeider.
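As a reminder of the kind of tractability being inherited here, a sketch of model counting on a plain OBDD; the tuple encoding below is made up for illustration, and TDDs support the analogous traversal over their vtree structure.

```python
from functools import lru_cache

# Hypothetical encoding: a node is True, False, or ('node', var, low, high),
# with variable indices strictly increasing along every root-to-leaf path.

@lru_cache(maxsize=None)
def model_count(node, n_vars, level=0):
    """Count satisfying assignments over variables level..n_vars-1."""
    if node is True:
        return 2 ** (n_vars - level)      # remaining variables are free
    if node is False:
        return 0
    _, var, low, high = node
    skipped = 2 ** (var - level)          # variables not tested on this path
    return skipped * (model_count(low, n_vars, var + 1) +
                      model_count(high, n_vars, var + 1))

# x0 AND x1 over 3 variables: satisfied by (1,1,0) and (1,1,1).
f = ('node', 0, False, ('node', 1, False, True))
print(model_count(f, 3))  # 2
```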

Parameterized algorithms for $k$-Inversion

from arXiv: Data Structures and Algorithms

Authors: Dhanyamol Antony, L. Sunil Chandran, Dalu Jacob, R. B. Sandeep

Inversion of a directed graph $D$ with respect to a vertex subset $Y$ is the directed graph obtained from $D$ by reversing the direction of every arc whose endpoints both lie in $Y$. More generally, the inversion of $D$ with respect to a tuple $(Y_1, Y_2, \ldots, Y_\ell)$ of vertex subsets is defined as the directed graph obtained by successively applying inversions with respect to $Y_1, Y_2, \ldots, Y_\ell$. Such a tuple is called a \emph{decycling family} of $D$ if the resulting graph is acyclic. In the \textsc{$k$-Inversion} problem, the input consists of a directed graph $D$ and an integer $k$, and the task is to decide whether $D$ admits a decycling family of size at most $k$. Alon et al.\ (SIAM J.\ Discrete Math., 2024) proved that the problem is NP-complete for every fixed value of $k$, thereby ruling out XP algorithms, and presented a fixed-parameter tractable (FPT) algorithm parameterized by $k$ for tournament inputs. In this paper, we generalize their algorithm to a broader variant of the problem on tournaments and subsequently use this result to obtain an FPT algorithm for \textsc{$k$-Inversion} when the underlying undirected graph of the input is a block graph. Furthermore, we obtain an algorithm for \textsc{$k$-Inversion} on general directed graphs with running time $2^{O(\mathrm{tw}(k + \mathrm{tw}))} \cdot n^{O(1)}$, where $\mathrm{tw}$ denotes the treewidth of the underlying graph.
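The inversion operation itself is simple enough to state in code; the following is a direct transcription of the definition above.

```python
def invert(arcs, Y):
    """Reverse every arc whose endpoints both lie in Y.

    arcs: set of (u, v) pairs representing a directed graph.
    """
    Y = set(Y)
    return {(v, u) if u in Y and v in Y else (u, v) for (u, v) in arcs}

def apply_family(arcs, family):
    """Apply inversions w.r.t. Y_1, ..., Y_l in order; (Y_1, ..., Y_l) is a
    decycling family iff the resulting graph is acyclic."""
    for Y in family:
        arcs = invert(arcs, Y)
    return arcs

# A directed 3-cycle becomes acyclic after inverting {0, 1}.
print(apply_family({(0, 1), (1, 2), (2, 0)}, [{0, 1}]))
# {(1, 0), (1, 2), (2, 0)} -- no directed cycle remains
```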

Solving Hard Instances from Knapsack and Bounded Knapsack Problems: A new state-of-the-art solver

from arXiv: Data Structures and Algorithms

Authors: Renan F. F. da Silva, Thiago A. de Queiroz, Rafael C. S. Schouery

The Knapsack Problem (KP) and its generalization, the Bounded Knapsack Problem (BKP), are classical NP-hard problems with numerous practical applications. Despite being introduced over 25 years ago, the solvers COMBO and BOUKNAP remain the state of the art due to their highly optimized implementations and sophisticated bounding techniques. In this work, we present RECORD (Refined Core-based Dynamic Programming), a new solver for both problems that builds upon key components of COMBO, including core- and state-based dynamic programming, weak upper bounds, and surrogate relaxation with cardinality constraints, while introducing novel strategies to overcome its limitations. In particular, we propose multiplicity reduction to limit the number of distinct item types, combined with on-the-fly item aggregation, refined fixing-by-dominance techniques, and a new divisibility bound that strengthens item fixing and symmetry breaking. These enhancements allow RECORD to preserve COMBO's near-linear-time behavior on most instances while achieving substantial speedups on more challenging cases. Computational experiments show that RECORD consistently outperforms both COMBO and BOUKNAP on difficult benchmark sets, often by several orders of magnitude, establishing a new state of the art for KP and BKP.
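As baseline context (RECORD's core- and state-based machinery is far more involved), here is the textbook pseudopolynomial DP for BKP, using binary splitting of multiplicities to reduce to 0/1 knapsack.

```python
def bounded_knapsack(items, capacity):
    """Textbook BKP baseline (a sketch; not the RECORD solver).

    items: list of (profit, weight, multiplicity) triples.
    Binary-splits each multiplicity into pieces 1, 2, 4, ..., remainder,
    then runs the standard 0/1 knapsack DP over the resulting items.
    """
    zero_one = []
    for profit, weight, mult in items:
        k = 1
        while mult > 0:
            take = min(k, mult)
            zero_one.append((profit * take, weight * take))
            mult -= take
            k *= 2
    dp = [0] * (capacity + 1)
    for profit, weight in zero_one:
        for c in range(capacity, weight - 1, -1):
            dp[c] = max(dp[c], dp[c - weight] + profit)
    return dp[capacity]

print(bounded_knapsack([(10, 3, 2), (7, 2, 3)], 8))
# 27: two copies of the first item plus one of the second (weight 8)
```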

Polynomial and Pseudopolynomial Algorithms for Two Classes of Bin Packing Instances

from arXiv: Data Structures and Algorithms

Authors: Renan Fernando Franco da Silva, Vinícius Loti de Lima, Rafael C. S. Schouery, Jean-François Côté, Manuel Iori

Cutting and packing problems are fundamental in manufacturing and logistics, as they aim to minimize waste and improve efficiency. The Cutting Stock Problem (CSP) concerns material cutting, whereas the Bin Packing Problem (BPP) concerns packing items into bins. Since the 1960s, these problems have been widely studied because of their industrial relevance and computational complexity. Over time, exact algorithms, often based on mixed-integer programming (MIP), have become able to solve increasingly large instances, often with hundreds of items, within minutes. In 2016, Delorme et al. showed that the algorithm BELOV, combined with a modern version of CPLEX, could solve all benchmark instances available at that time within ten minutes. Motivated by this progress, they introduced two new classes of instances, AI and ANI, which proved extremely challenging for all exact solvers and have guided research on CSP and BPP over the past decade. Despite significant subsequent advances, 13 out of 500 of these instances remain unsolved by state-of-the-art algorithms within a one-hour time limit. In this paper, we show that although AI and ANI instances are particularly hard for MIP-based methods, the BPP restricted to these classes is not strongly NP-hard. We present polynomial-time algorithms for the AI class and pseudopolynomial-time algorithms for the ANI class. Our best algorithms solve all benchmark instances from these classes orders of magnitude faster than previous approaches. They are also straightforward to adapt to the Skiving Stock Problem (SSP), which can be seen as a counterpart of the CSP. Additionally, they can be used as preprocessing routines in exact methods, as their runtime is independent of the instance class, although they are guaranteed to return an optimality status only for instances belonging to the class for which they were designed.

Tuesday, April 07

TR26-051 | Cryptographic Implications of Worst-Case Hardness of Time-Bounded Kolmogorov Complexity | Yanyi Liu, Noam Mazor, Rafael Pass

from ECCC Papers

We consider the worst-case hardness of the gap version of the classic time-bounded Kolmogorov complexity problem—$Gap_pMK^tP[s_1,s_2]$—where the goal is to determine whether for a given string x, $K^t(x) \le s_1(n)$ or $K^{p(t)}(x) > s_2(n)$, where $K^t(x)$ denotes the t-bounded Kolmogorov complexity of x. As shown by Hirahara (STOC’18), if $Gap_pMK^tP[s_1,s_2] \notin prBPP$ for every polynomial p, then (under an appropriate derandomization assumption) $Gap_pMK^tP$ is errorless average-case hard with respect to BPP heuristics. The notion of errorless average-case hardness, however, is seemingly insufficient for cryptographic applications, where one needs to consider average-case hardness against attacks that may simply err with some probability (i.e., two-sided error hardness). In this work, we present several new consequences of the assumption that $Gap_pMK^tP[s_1,s_2]\notin P/poly$ for all polynomials p, for appropriate choices of $s_1$, $s_2$, and under appropriate (worst-case) derandomization assumptions. In particular, we show that this assumption implies:

- The existence of an (inefficient-prover) zero-knowledge proof system for NP with a non-uniform simulator w.r.t. adversaries with a-priori bounded-length auxiliary-input.

- The existence of a hard disjoint NP pair, defined as a promise problem $(Y,N)$ where both $Y,N\in NP$; this provides a barrier towards showing that $Gap_pMK^tP$ is NP-complete.

The above results are proven by first showing that the assumption implies the existence of a so-called conditional PRG—roughly speaking, a cryptographic PRG where indistinguishability only needs to hold for some (potentially not efficiently sampleable) distribution over the seed to the PRG. (In fact, this notion of a PRG also almost directly implies average-case hardness of $Gap_pMK^tP$, and as such, this provides a modular explanation of Hirahara’s results.) Finally, we show that for the results on conditional PRGs and zero-knowledge proofs, unconditional results can be obtained (that is, without making any derandomization assumptions) if one considers an appropriate version of $Gap_pMK^tP$ concerning randomized $K^t$.
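In display form, the promise problem at the center of the paper reads:

```latex
\mathrm{Gap}_p \mathrm{MK}^t \mathrm{P}[s_1, s_2]:\ \text{given } x \in \{0,1\}^n,
\ \text{decide whether}
\quad
\begin{cases}
K^{t}(x) \le s_1(n) & (\textsc{Yes}\ \text{instances}), \\
K^{p(t)}(x) > s_2(n) & (\textsc{No}\ \text{instances}).
\end{cases}
```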

TR26-050 | Witness-Indistinguishable Arguments of Knowledge and One-Way Functions | Gal Arnon, Noam Mazor, Rafael Pass, Jad Silbak

from ECCC Papers

In this paper we study the cryptographic complexity of non-trivial witness-indistinguishable ($WI$) arguments of knowledge. We establish that:

- Assuming that $NP\not\subseteq P/poly$, the existence of a constant-round computational $WI$ argument of knowledge for $NP$ implies that (infinitely-often) auxiliary-input one-way functions exist.

- Assuming that $NP\not\subseteq P^{Sam}/poly$, there is no black-box construction of a constant-round (unbounded-verifier) statistical $WI$ argument of knowledge from one-way permutations. Here, $Sam$ is the collision-finder oracle of Haitner, Hoch, Reingold, and Segev [FOCS '07].

Moreover, we identify a natural class of knowledge extractors for which stronger versions of the above implications hold (e.g., even if the protocols have many rounds).

Unreal Is Here

from Ben Recht

Mapping the territory of simulation and its many purposes.

Though I’ve been prefacing my lecture blog posts with italicized disclaimers, I want to single this lecture blog out as being targeted a bit more broadly. Because, in a weird confluence, the topic of this week’s lecture coincides with the topic of an op-ed by Leif Weatherby and me that appears this morning in the New York Times: forecasting and simulation.

We can’t avoid prediction and simulation in a class about feedback systems. Our theories suggest that better predictions and forecasts lead to better plans of action. Additionally, we try to make sense of complex, interconnected systems by simulating their behavior, and simulations often reveal surprising “emergent” behavior of the whole, which wasn’t evident from the modeled behavior of the parts. We also tend to think that the subcomponents of complex, interconnected systems make sense of their surroundings by predicting what other components around them will do.

I was a bit slippery in that paragraph about what the difference is between simulation and prediction. That’s because I’m still not sure how to draw a boundary between the two concepts. The most common axis is opacity: everyone thinks there is a fundamental difference between a model that is “easy to describe” from first principles and one that is purely data-driven. We call the latter “black box” to mark our disdain. The “transparent box” systems might derive from physical laws, and we write down a set of equations that dictate how each step relates to the next. The black box systems might be derived by curve fitting, where we pick a function of convenience, untethered from causal explanation, to describe how inputs have historically mapped to outputs.

I’ll talk more about the opacity slider in later posts this week, but today, I want to ask about the purpose of simulation. That axis is more interesting to me. Simulations can be used in many different ways. You might use a simulation to better understand a system itself. Simulations of mechanical systems can give you a feel for their performance limits. You can use simulations to figure out why something went wrong, deriving causal explanations from plausible mechanisms. And, of course, you can use simulations to predict the future. You can use these simulation forecasts to make a plan of action. Or, in our DraftKings-addled culture, you might use them to gamble.

Leif and I talked about this murky simulation landscape in the world of public opinion polling. Specifically, we wrote about the absurdity of silicon sampling. For those unfamiliar with the term, silicon sampling is when you design a social science survey experiment and give the questions to LLMs rather than people. As absurd as this sounds, people are really pushing to do this. There’s a billion-dollar startup called Aaru that is based entirely on this silly idea. And one of their fake polls slipped its way into Axios last week, without Mike Allen realizing that the “poll” he was reporting on was a computer simulation (embarrassed, Axios later edited the story to reflect the phoniness).

But why do silicon samples have so much cachet with pollsters and social scientists? As Leif and I argue in our piece, it’s because polls already rely heavily on simulation methods. Because of remarkably high nonresponse bias, pollsters lean heavily on statistical modeling to tweak their numbers to align with reality. Polls that use multilevel regression and poststratification are already inputting a lot of simulated reality to “correct” their summarization of the data they collected. The number isn’t “percentage of yesses in my sample,” it’s “what I think the percentage of yesses is in the population given my sample and my beliefs about the population.”
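To make the poststratification step concrete, here is a toy example (every number is made up for illustration): each demographic cell's sample mean gets reweighted by that cell's known share of the population, so the reported topline already blends observed answers with a model of who makes up the electorate.

```python
# Toy poststratification; all numbers here are hypothetical.
sample = {            # cell: (yes answers, respondents reached)
    "18-34": (40, 100),
    "35-64": (90, 150),
    "65+":   (45, 50),
}
population_share = {"18-34": 0.30, "35-64": 0.50, "65+": 0.20}

raw = sum(y for y, _ in sample.values()) / sum(n for _, n in sample.values())
adjusted = sum(population_share[c] * y / n for c, (y, n) in sample.items())

print(f"raw sample percentage:     {raw:.1%}")       # 58.3%: who picked up
print(f"poststratified percentage: {adjusted:.1%}")  # 60.0%: reweighted estimate
```

Full MRP additionally fits a multilevel regression to smooth the per-cell estimates, but the reweighting above is already where beliefs about the population enter the reported number.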

Since polling already relies heavily on simulation, tossing out the expensive part of the process—you know, asking actual people questions—feels like a logical conclusion. The Nate Silverization of political coverage turned polling into prediction. In the media, the goal of polls stopped being about understanding what people think and became more about predicting the outcome of elections. If all you need to do is predict, you don’t really need pristine distillations of understanding. You can take your empirical facts and use them solely to predict outcomes. And if the goal is just prediction, you don’t need to bother asking people at all. In fact, you want more reliable data than the fickle behavior of people nagged by pollsters at the end of some modern transmission line. If your goal is only prediction, you’re probably better off not talking to people at all.

But is the purpose of polling prediction? It depends on who you ask, but I’d like to think that the answer is no. At pure face value, the topline numbers of an opinion poll are a summary of a survey. They reduce a list of ones and zeros into two numbers: a mean and a variance.

Now, using a bit more social-scientific reasoning, we might interpret this summarization as a measurement of what a group of people believes. With a rigid methodology, we can consider polling to be quantified opinion. It’s a bit odd to think that you can “objectively” measure opinion in the first place, but this has been a supposition of social science research for a long time.

Unfortunately, statistics has incredibly slippery semantics that lead people to conflate summarization with measurement and measurement with prediction. Is the percentage of “people who answered yes” a summarization of the data? Is it a measured quantity about the opinion of a broader population? Is it a prediction of how people will vote in November? Yes?

I’m interested in this conflation for both political and academic reasons. Leif and I think the polling industry is harmful to the public sphere. But setting those politics aside, I think that being upfront about the purpose of simulations and forecasts helps demystify their outputs. Indeed, this week I’ll describe how purpose dictates forecasts. Prediction of the future is difficult. But if you tell me how my predictions will be evaluated, prediction of the future is trivial. I’ll explain more about why in the next post.


By Ben Recht

TR26-049 | No Constant-Cost Protocol for Point–Line Incidence | Mika Göös, Nathaniel Harms, Florian Richter, Anastasia Sofronova

from ECCC Papers

Alice and Bob are given $n$-bit integer pairs $(x,y)$ and $(a,b)$, respectively, and they must decide if $y=ax+b$. We prove that the randomised communication complexity of this Point–Line Incidence problem is $\Theta(\log n)$. This confirms a conjecture of Cheung, Hatami, Hosseini, and Shirley (CCC 2023) that the complexity is super-constant, and gives the first example of a communication problem with constant support-rank but super-constant randomised complexity.
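The upper bound side of $\Theta(\log n)$ is achievable by standard fingerprinting with shared randomness (a sketch of that folklore direction, not necessarily the protocol in the paper; the paper's substance is the matching lower bound): if $y \ne ax+b$, the nonzero integer $y-ax-b$ has only $O(n)$ prime divisors, so checking the relation modulo a random prime among roughly $n^2$ candidates catches the difference with good probability, at a cost of $O(\log n)$ communicated bits.

```python
import random

def primes_up_to(m):
    """Simple sieve of Eratosthenes."""
    sieve = [True] * (m + 1)
    sieve[0] = sieve[1] = False
    for i in range(2, int(m ** 0.5) + 1):
        if sieve[i]:
            sieve[i*i::i] = [False] * len(sieve[i*i::i])
    return [i for i, is_p in enumerate(sieve) if is_p]

def incidence_protocol(x, y, a, b, n):
    """One-sided-error check of y == a*x + b over n-bit integers.

    Shared randomness picks a prime p of O(log n) bits; Alice sends
    (x mod p, y mod p), and Bob checks the relation mod p. Incidences
    always pass; non-incidences fail with probability 1 - O(log n / n).
    """
    p = random.choice(primes_up_to(max(100, n * n)))   # shared randomness
    x_p, y_p = x % p, y % p                            # Alice's O(log n)-bit message
    return y_p == (a * x_p + b) % p                    # Bob's local check
```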

Before we start on quantum

from Scott Aaronson

Imagine that every week for twenty years, people message you asking you to comment on the latest wolf sighting, and every week you have to tell them: I haven’t seen a wolf, I haven’t heard a wolf, I believe wolves exist but I don’t yet see evidence of them anywhere near our town.

Then one evening, you hear a howl in the distance, and sure enough, on a hill overlooking the town is the clear silhouette of a large wolf. So you point to it — and all the same people laugh and accuse you of “crying wolf.”

Now you know how it’s been for me with cryptographically relevant quantum computing.


I’ve been writing about QC on this blog for a while, and have done hundreds of public lectures and interviews and podcasts on the subject. By now, I can almost always predict where a non-expert’s QC question is going from its first few words, and have a well-rehearsed answer ready to go the moment they stop talking. Yet sometimes I feel like it’s all for naught.

Only today did it occur to me that I should write about something more basic. Not quantum computing itself, but the habits of mind that seem to prevent some listeners from hearing whatever I or other researchers have to tell them about QC. The stuff that we’re wasting our breath if we don’t get past.

Which habits of mind am I talking about?

  1. The Tyranny of Black and White. Hundreds of times, I’ve answered someone’s request to explain QC, only to have them nod impatiently, then interrupt as soon as they can with: “So basically, the take-home message is that quantum is coming, and it’ll change everything?” Someone else might respond to exactly the same words from me with: “So basically, you’re saying it’s all hype and I shouldn’t take any of it seriously?” As in my wolf allegory, the same person might even jump from one reaction to the other. Seeing this, I’ve become a fervent believer in horseshoe theory, in QC no less than in politics. Which sort of makes sense: if you think QCs are “the magic machines of the future that will revolutionize everything,” and then you learn that they’re not, why wouldn’t you jump to the opposite extreme and conclude you’ve been lied to and it’s all a scam?
  2. The Unidimensional Hype-Meter. “So … [long, thoughtful pause] … you’re actually telling me that some of what I hear about QC is real … but some of it is hype? Or—yuk yuk, I bet no one ever told you this one before—it’s a superposition of real and hype?” OK, that’s better. But it’s still trying to project everything down onto a 1-dimensional subspace that loses almost all the information!
  3. Words As Seasoning. I often get the sense that a listener is treating all the words of explanation—about amplitudes and interference, Shor versus Grover, physical versus logical qubits, etc.—as seasoning, filler, an annoying tic, a stalling tactic to put off answering the only questions that matter: “is Quantum real or not real? If it’s real, when is it coming? Which companies will own the Quantum space?” In reality, explanations are the entire substance of what I can offer. For my experience has consistently been that, if someone has no interest in learning what QC is, which classes of problems it helps for, etc., then even if I answer their simplistic questions like “which QC companies are good or bad?,” they won’t believe my answers anyway. Or they’ll believe my answers only until the next person comes along and tells them the opposite.
  4. Black-Boxing. Sometimes these days, I’ll survey the spectacular recent progress in fault-tolerance, 2-qubit gate fidelities, programmable hundred-qubit systems, etc., only to be answered with a sneer: “What’s the biggest number that Shor’s algorithm has factored? Still 15 after all these years? Haha, apparently the emperor has no clothes!” I’ve commented that this is sort of like dismissing the Manhattan Project as hopelessly stalled in 1944, on the ground that so far it hasn’t produced even a tiny nuclear explosion. Or the Apollo program in 1967, on the ground that so far it hasn’t gotten any humans even 10% of the way to the moon. Or GPT in 2020, on the ground that so far it can’t even do elementary-school math. Yes, sometimes emperors are naked—but you can’t tell until you actually look at the emperor! Engage with the specifics of quantum error correction. If there’s a reason why you think it can’t work beyond a certain scale, say so. But don’t fixate on one external benchmark and ignore everything happening under the hood, if the experts are telling you that under the hood is where all the action now is, and your preferred benchmark is only relevant later.
  5. Questions with Confused Premises. “When is Q-Day?” I confess that this question threw me for a loop the first few times I heard it, because I had no idea what “Q-Day” was. Apparently, it’s the single day when quantum computing becomes powerful enough to break all of cryptography? Or: “What differentiates quantum from binary?” “How will daily life be different once we all have quantum computers in our homes?” Try to minimize the number of presuppositions.
  6. Anchoring on Specific Marketing Claims. “What do you make of D-Wave’s latest quantum annealing announcement?” “What about IonQ’s claim to recognize handwriting with a QC?” “What about Microsoft’s claim to have built a topological qubit?” These questions can be fine as part of a larger conversation. Again and again, though, someone who doesn’t know the basics will lead with them—with whichever specific, contentious thing they most recently read. Then the entire conversation gets stuck at a deep node within the concept tree, and it can’t progress until we backtrack about five levels.

Anyway—sorry for yet another post of venting and ranting. Maybe this will help:

The wise child asks, “what are the main classes of problems that are currently known to admit superpolynomial quantum speedups?” To this child, you can talk about quantum simulation and finding hidden structures in abelian and occasionally nonabelian groups, as well as Forrelation, glued trees, HHL, and DQI—explaining how the central challenge has been to find end-to-end speedups for non-oracular tasks.

The wicked child asks, “so can I buy a quantum computer right now to help me pick stocks and search for oil and turbocharge LLMs, or is this entire thing basically a fraud?” To this child you answer: “the quantum computing people who seek you as their audience are frauds.”

The simple child asks, “what is quantum computing?” You answer: “it’s a strange new way of harnessing nature to do computation, one that dramatically speeds up certain tasks, but doesn’t really help with others.”

And to the child who doesn’t know how to ask—well, to that child you don’t need to bring up quantum computing at all. That child is probably already fascinated to learn classical stuff.

By Scott