Last Update

OPML feed of all feeds.

Subscribe to the Atom feed, RSS feed to stay up to date.

Thank you to arXiv for use of its open access interoperability.

Note: the date of arXiv entries announced right after publication holidays might incorrectly show up as the date of the publication holiday itself. This is due to our ad hoc method of inferring announcement dates, which are not returned by the arXiv API.

Powered by Pluto.

Source on GitHub.

Maintained by Nima Anari, Arnab Bhattacharyya, Gautam Kamath.

Theory of Computing Report

Sunday, May 17

TR26-082 | Pseudorandomness Beating the Hybrid Argument for Insensitive Algorithms | Dean Doron, David Zuckerman, Dana Moshkovitz, Justin Oh

from ECCC Papers

The hardness vs. randomness paradigm converts a function $f \colon \{0,1\}^n \rightarrow \{0,1\}$ that is hard for circuits of size $s$ into a pseudorandom generator (PRG) $G \colon \{0,1\}^d \to \{0,1\}^{s'}$ that fools circuits of size $s' = s'(s)$. In the application for derandomization, such as proofs of $\mathbf{BPP} = \mathbf{P}$, the overhead of using such a PRG directly depends on two quantities: the runtime of $G$ and its seed length $d$. Ideally, the runtime of a PRG fooling circuits of size $s'$ should only be slightly larger than $s'$, and $d$ should be $\log s'$. A central challenge in optimizing these parameters is optimizing the relationship between $s$ and $s'$. Ideally, $s'$ should only be slightly less than $s$. Impagliazzo and Wigderson [IW97] constructed PRGs where $s$ is a large polynomial in $s'$ and $d=c\cdot \log s$ for a large constant $c\ge 8$. This gives derandomization with a large polynomial slowdown. Doron, Moshkovitz, Oh and Zuckerman [DMOZ22] constructed near optimal PRGs where $s'=s^{1-\alpha}$ and $d=(1+\alpha)\log s$ for arbitrarily small $\alpha>0$. However, they require that $f$ is hard against nondeterministic randomized circuits rather than deterministic circuits. Subsequent works of Chen, and Tell [CT21, CT24], later joined by Ball [BCT25], provide fast derandomization under various new assumptions. However, no result recovers the nearly optimal PRG of [DMOZ22] under the standard deterministic hardness assumption of [IW97]. In fact, from such an assumption, even constructing a nearly optimal hitting set or pseudoentropy generator against algorithms that err rarely is open. A key bottleneck in obtaining such a result is the use of the hybrid argument, which is known to require a large polynomial loss from $s$ to $s'$. We identify a property of randomized algorithms that we call ``insensitivity'', which allows us to bypass the hybrid argument. Insensitivity is a simple property we identify that many randomized algorithms satisfy: flipping a uniformly random bit in a randomness string on which the algorithm is correct is likely to yield yet another randomness string on which the algorithm is correct. We construct hitting sets and pseudoentropy generators for insensitive algorithms with better parameters than the hybrid argument can offer. Additionally, we consider a two-sided variant of insensitivity. In this variant, flipping a uniformly random bit in a randomness string for which the algorithm is incorrect either always yields another incorrect string, or yields a correct string with decent probability. For such algorithms we construct hitting sets and pseudoentropy generators with near optimal parameters.

The hardness vs. randomness paradigm converts a function $f \colon \{0,1\}^n \rightarrow \{0,1\}$ that is hard for circuits of size $s$ into a pseudorandom generator (PRG) $G \colon \{0,1\}^d \to \{0,1\}^{s'}$ that fools circuits of size $s' = s'(s)$. In the application for derandomization, such as proofs of $\mathbf{BPP} = \mathbf{P}$, the overhead of using such a PRG directly depends on two quantities: the runtime of $G$ and its seed length $d$. Ideally, the runtime of a PRG fooling circuits of size $s'$ should only be slightly larger than $s'$, and $d$ should be $\log s'$. A central challenge in optimizing these parameters is optimizing the relationship between $s$ and $s'$. Ideally, $s'$ should only be slightly less than $s$. Impagliazzo and Wigderson [IW97] constructed PRGs where $s$ is a large polynomial in $s'$ and $d=c\cdot \log s$ for a large constant $c\ge 8$. This gives derandomization with a large polynomial slowdown. Doron, Moshkovitz, Oh and Zuckerman [DMOZ22] constructed near optimal PRGs where $s'=s^{1-\alpha}$ and $d=(1+\alpha)\log s$ for arbitrarily small $\alpha>0$. However, they require that $f$ is hard against nondeterministic randomized circuits rather than deterministic circuits. Subsequent works of Chen, and Tell [CT21, CT24], later joined by Ball [BCT25], provide fast derandomization under various new assumptions. However, no result recovers the nearly optimal PRG of [DMOZ22] under the standard deterministic hardness assumption of [IW97]. In fact, from such an assumption, even constructing a nearly optimal hitting set or pseudoentropy generator against algorithms that err rarely is open. A key bottleneck in obtaining such a result is the use of the hybrid argument, which is known to require a large polynomial loss from $s$ to $s'$. We identify a property of randomized algorithms that we call ``insensitivity'', which allows us to bypass the hybrid argument. Insensitivity is a simple property we identify that many randomized algorithms satisfy: flipping a uniformly random bit in a randomness string on which the algorithm is correct is likely to yield yet another randomness string on which the algorithm is correct. We construct hitting sets and pseudoentropy generators for insensitive algorithms with better parameters than the hybrid argument can offer. Additionally, we consider a two-sided variant of insensitivity. In this variant, flipping a uniformly random bit in a randomness string for which the algorithm is incorrect either always yields another incorrect string, or yields a correct string with decent probability. For such algorithms we construct hitting sets and pseudoentropy generators with near optimal parameters.

Can AI do Theory @ STOC 2026

from CS Theory Events

June 27, 2026 STOC 2026, Salt Lake City, Utah, USA pritishkamath.github.io/ai-tcs-stoc-2026/#about Submission deadline: May 29, 2026 This workshop will explore the intersection of artificial intelligence and theoretical computer science. Through invited talks, a panel discussion and a poster session, we hope to foster a community-wide dialogue on whether and how AI can augment our current … Continue reading Can AI do Theory @ STOC 2026

By shacharlovett

June 27, 2026 STOC 2026, Salt Lake City, Utah, USA https://pritishkamath.github.io/ai-tcs-stoc-2026/#about Submission deadline: May 29, 2026 This workshop will explore the intersection of artificial intelligence and theoretical computer science. Through invited talks, a panel discussion and a poster session, we hope to foster a community-wide dialogue on whether and how AI can augment our current … Continue reading Can AI do Theory @ STOC 2026

By shacharlovett

Scott Aaronson wins Trevisan Award? Prize? Medal? Statue?

from Computational Complexity

 1) Congratulations to Scott Aaronson for winning the first Trevisan Award.

The Trevisan Award is in memory of Luca Trevisan and recognizes expository work in Theoretical Computer Science. It is given out by the ACM. The ACM announcement of Scott's award is here. Scott blogged about winning it here.

2) When preparing this blog post, I googled Trevisan Award. The first hit I got was this. That site says the award was for significant research. Hmmm. My first thought was Scott has, in fact, done significant research. Perhaps I misunderstood the award. I shared these thoughts with Lance who pointed out I was at the wrong website:

Scott won the Trevisan AWARD.

There is also a Trevisan PRIZE. This was the first hit I got when I googled Trevisan Award. 

There is not yet a Trevisan MEDAL. Give it time.

 
4) I then realized I did not actually know the difference between an award, a prize, and a medal.

a) What are all the ways to honor someone in academia?

b) How do they differ?

5) I asked our future AI overlords:

In academia there are PRIZES, AWARDS, MEDALS. Other ways to honor people in other fields are Ribbons and Trophies. Are there other ways to honor people? How do they differ?

Chatty gave me 10 pages of interesting material that appeared correct, which is about the best one can hope for. See here. In case you are thinking tl;dr, here is a short version focused on academia, with some of my own thought mixed in.  

a) Awards: General Recognition of excellence. May or may not involve competition (that's a tautology). I aspire to win the Montgomery Burns Award for Outstanding Achievement in the Field of Excellence.

b) Prizes: Usually competitive. Often recognizes a specific breakthrough.

c) Medals: Prestige and ceremony. Often accompanied by an actual medal.

d) Fellowships: Recognizes a substantial body of work.

e) Named lectures: Prestige plus airfare.

f) Named professorships: Prestige plus discretionary funds.

g) Election to academies: Peer validation with nice stationary.

h) Honorary degrees. A way to get a celebrity speaker at a graduation.

i) Statues. The only academic one I know of is The Slide Rule Statue, see here. 


Academia has many ways to honor people. Apparently the hard part is keeping track of which noun goes with which person.

The real award was the friends we cited along the way.


 

By gasarch

 1) Congratulations to Scott Aaronson for winning the first Trevisan Award.

The Trevisan Award is in memory of Luca Trevisan and recognizes expository work in Theoretical Computer Science. It is given out by the ACM. The ACM announcement of Scott's award is here. Scott blogged about winning it here.

2) When preparing this blog post, I googled Trevisan Award. The first hit I got was this. That site says the award was for significant research. Hmmm. My first thought was Scott has, in fact, done significant research. Perhaps I misunderstood the award. I shared these thoughts with Lance who pointed out I was at the wrong website:

Scott won the Trevisan AWARD.

There is also a Trevisan PRIZE. This was the first hit I got when I googled Trevisan Award. 

There is not yet a Trevisan MEDAL. Give it time.

 
4) I then realized I did not actually know the difference between an award, a prize, and a medal.

a) What are all the ways to honor someone in academia?

b) How do they differ?

5) I asked our future AI overlords:

In academia there are PRIZES, AWARDS, MEDALS. Other ways to honor people in other fields are Ribbons and Trophies. Are there other ways to honor people? How do they differ?

Chatty gave me 10 pages of interesting material that appeared correct, which is about the best one can hope for. See here. In case you are thinking tl;dr, here is a short version focused on academia, with some of my own thought mixed in.  

a) Awards: General Recognition of excellence. May or may not involve competition (that's a tautology). I aspire to win the Montgomery Burns Award for Outstanding Achievement in the Field of Excellence.

b) Prizes: Usually competitive. Often recognizes a specific breakthrough.

c) Medals: Prestige and ceremony. Often accompanied by an actual medal.

d) Fellowships: Recognizes a substantial body of work.

e) Named lectures: Prestige plus airfare.

f) Named professorships: Prestige plus discretionary funds.

g) Election to academies: Peer validation with nice stationary.

h) Honorary degrees. A way to get a celebrity speaker at a graduation.

i) Statues. The only academic one I know of is The Slide Rule Statue, see here


Academia has many ways to honor people. Apparently the hard part is keeping track of which noun goes with which person.

The real award was the friends we cited along the way.


 

By gasarch

TR26-081 | Hard-to-Sample Distributions from Robust Extractors | Anthony Ostuni, Daniel Kane, Jackson Morris, Farzan Byramji

from ECCC Papers

We provide a unified method for constructing explicit distributions which are difficult for restricted models of computation to generate. Our constructions are based on a new notion of robust extractors, which are extractors that remain sound even when a small number of points violate the min-entropy constraint. Using such objects, we show that for a broad range of sampling models (e.g., low-depth circuits, small-space sources, etc.), every output of the model has distance $1 - o(1)$ from our target distribution, qualitatively recovering essentially all previously known hardness results. Our work extends that of Viola (SICOMP '14), who developed an earlier unified framework based on traditional extractors to rule out sampling with very small error. As a further application of our technique, we leverage a recent extractor construction of Chattopadhyay, Goodman, and Gurumukhani (ITCS '24) to present the first explicit distribution with distance $1 - o(1)$ from the output of any low-degree $\mathbb{F}_2$-polynomial source. We also describe a potential avenue toward proving a similar hardness result for $AC^0[\oplus]$ circuits.

We provide a unified method for constructing explicit distributions which are difficult for restricted models of computation to generate. Our constructions are based on a new notion of robust extractors, which are extractors that remain sound even when a small number of points violate the min-entropy constraint. Using such objects, we show that for a broad range of sampling models (e.g., low-depth circuits, small-space sources, etc.), every output of the model has distance $1 - o(1)$ from our target distribution, qualitatively recovering essentially all previously known hardness results. Our work extends that of Viola (SICOMP '14), who developed an earlier unified framework based on traditional extractors to rule out sampling with very small error. As a further application of our technique, we leverage a recent extractor construction of Chattopadhyay, Goodman, and Gurumukhani (ITCS '24) to present the first explicit distribution with distance $1 - o(1)$ from the output of any low-degree $\mathbb{F}_2$-polynomial source. We also describe a potential avenue toward proving a similar hardness result for $AC^0[\oplus]$ circuits.

TR26-080 | Maximum Matching and Related Problems in Catalytic Logspace | SRIJAN CHAKRABORTY, Samir Datta, Aryan Kusre, Partha Mukhopadhyay, Amit Sinhababu

from ECCC Papers

Understanding the power of space-bounded computation with access to catalytic space has been an important theme in complexity theory over the recent years. One of the key algorithmic results in this area is that bipartite maximum matching can be computed in catalytic logspace with a polynomial-time bound, Agarwala and Mertz (2025). In this paper, we show that we can construct a \emph{maximum matching} in \emph{general graphs} in CL, and, in fact, in CLP. We first show that the size of a \emph{maximum matching} in \emph{general graphs} can be determined in CL. Our algorithm is based on the linear-algebraic algorithm for maximum matching by Geelen (2000). We then show that this algorithm, along with some new ideas, can be used to \emph{find} a maximum matching in general graphs. Using a similar algorithm of Geelen (1999), we also solve the \emph{maximum rank completion problem} in CLP, which was previously known to be solvable in deterministic polynomial time, Geelen. This problem turns out to be equivalent to the \emph{linear matroid intersection} problem (shown by Murota, 1995) which has been shown to be in CLP by Agarwala, Alekseev, and Vinciguerra (2026). Finally, using a PTAS algorithm Bl\"{a}ser, Jindal and Pandey (2018), for approximating the rank in Edmond's problem, we derive a CLP algorithm that can approximate the rank given by any instance of the \emph{Edmond's problem} upto a factor of $(1-\eps)$ for any $\eps\in(0,1)$. An application of this is a CLP bound for approximating the maximum independent matching in the \emph{linear matroid matching} problem.

Understanding the power of space-bounded computation with access to catalytic space has been an important theme in complexity theory over the recent years. One of the key algorithmic results in this area is that bipartite maximum matching can be computed in catalytic logspace with a polynomial-time bound, Agarwala and Mertz (2025). In this paper, we show that we can construct a \emph{maximum matching} in \emph{general graphs} in CL, and, in fact, in CLP. We first show that the size of a \emph{maximum matching} in \emph{general graphs} can be determined in CL. Our algorithm is based on the linear-algebraic algorithm for maximum matching by Geelen (2000). We then show that this algorithm, along with some new ideas, can be used to \emph{find} a maximum matching in general graphs. Using a similar algorithm of Geelen (1999), we also solve the \emph{maximum rank completion problem} in CLP, which was previously known to be solvable in deterministic polynomial time, Geelen. This problem turns out to be equivalent to the \emph{linear matroid intersection} problem (shown by Murota, 1995) which has been shown to be in CLP by Agarwala, Alekseev, and Vinciguerra (2026). Finally, using a PTAS algorithm Bl\"{a}ser, Jindal and Pandey (2018), for approximating the rank in Edmond's problem, we derive a CLP algorithm that can approximate the rank given by any instance of the \emph{Edmond's problem} upto a factor of $(1-\eps)$ for any $\eps\in(0,1)$. An application of this is a CLP bound for approximating the maximum independent matching in the \emph{linear matroid matching} problem.

Fast Simplex on a Fixed View Schedule

from Decentralized Thoughts

Simplex FVS uses fixed views of length $3\Delta$. This post gives a two round version with view length $2\Delta$, using the two round Simplex certificate structure. The price is the usual one for two round BFT: we use $n=5f+1$ parties. A view has only one leader proposal and one round of votes. If a later leader needs to skip the view, it does not wait for a skip vote sent...

By Ittai Abraham

Simplex FVS uses fixed views of length $3\Delta$. This post gives a two round version with view length $2\Delta$, using the two round Simplex certificate structure. The price is the usual one for two round BFT: we use $n=5f+1$ parties. A view has only one leader proposal and one round of votes. If a later leader needs to skip the view, it does not wait for a skip vote sent...

By Ittai Abraham

TR26-079 | On the LSH Distortion of Ulam and Cayley Similarities | Flavio Chierichetti, Mirko Giacchini, Ravi Kumar, Erasmo Tani

from ECCC Papers

Locality-sensitive hashing (LSH) has found widespread use as a fundamental primitive, particularly to accelerate nearest neighbor search. An LSH scheme for a similarity function $S:\mathcal{X} \times \mathcal{X} \to [0,1]$ is a distribution over hash functions on $\mathcal{X}$ with the property that the probability of collision of any two elements $x,y\in \mathcal{X}$ is exactly equal to $S(x,y)$. However, not all similarity functions admit exact LSH schemes. The notion of LSH distortion measures how multiplicatively close a similarity function is to having an LSH scheme. In this work, we study the LSH distortion of the Ulam and Cayley similarities, which are popular similarity measures on permutations of $n$ elements. We show that the Ulam similarity admits a sublinear LSH distortion of $O(n / \sqrt{\log n})$; we also prove a lower bound of $\Omega(n^{0.12})$ on the best LSH distortion achievable. On the other hand, we show that the LSH distortion of the Cayley similarity is $\Theta(n)$.

Locality-sensitive hashing (LSH) has found widespread use as a fundamental primitive, particularly to accelerate nearest neighbor search. An LSH scheme for a similarity function $S:\mathcal{X} \times \mathcal{X} \to [0,1]$ is a distribution over hash functions on $\mathcal{X}$ with the property that the probability of collision of any two elements $x,y\in \mathcal{X}$ is exactly equal to $S(x,y)$. However, not all similarity functions admit exact LSH schemes. The notion of LSH distortion measures how multiplicatively close a similarity function is to having an LSH scheme. In this work, we study the LSH distortion of the Ulam and Cayley similarities, which are popular similarity measures on permutations of $n$ elements. We show that the Ulam similarity admits a sublinear LSH distortion of $O(n / \sqrt{\log n})$; we also prove a lower bound of $\Omega(n^{0.12})$ on the best LSH distortion achievable. On the other hand, we show that the LSH distortion of the Cayley similarity is $\Theta(n)$.

TR26-078 | Superpolynomial Length Lower Bounds for Tree-Like Semantic Proof Systems with Bounded Line Size | Susanna F. de Rezende, David Engström, Yassine Ghannane, Kilian Risse

from ECCC Papers

We prove superpolynomial length lower bounds for the semantic tree-like Frege refutation system with bounded line size. Concretely, for any function $n^{2-\varepsilon} \leq s(n) \leq \exp\bigl(n^{1-\varepsilon}\bigr)$ we exhibit an explicit family $\mathcal{A}$ of $n$-variate CNF formulas $A$, each of size $|A| \le s(n)^{1+\varepsilon}$, such that if $A$ is chosen uniformly from $\mathcal{A}$, then asymptotically almost surely any tree-like Frege refutation of $A$ in line-size $s(n)$ is of length super-polynomial in $|A|$. Our lower bounds apply also to tree-like degree $d$ threshold systems, for $d \approx \log\bigl(s(n)\bigr)$, that is, for $d$ up to $n^{1-\varepsilon}$. More generally, our lower bounds apply to the semantic version of these systems and to any semantic tree-like proof system where the number of distinct lines is bounded by $\exp\bigl(s(n)\bigr)$.

We prove superpolynomial length lower bounds for the semantic tree-like Frege refutation system with bounded line size. Concretely, for any function $n^{2-\varepsilon} \leq s(n) \leq \exp\bigl(n^{1-\varepsilon}\bigr)$ we exhibit an explicit family $\mathcal{A}$ of $n$-variate CNF formulas $A$, each of size $|A| \le s(n)^{1+\varepsilon}$, such that if $A$ is chosen uniformly from $\mathcal{A}$, then asymptotically almost surely any tree-like Frege refutation of $A$ in line-size $s(n)$ is of length super-polynomial in $|A|$. Our lower bounds apply also to tree-like degree $d$ threshold systems, for $d \approx \log\bigl(s(n)\bigr)$, that is, for $d$ up to $n^{1-\varepsilon}$. More generally, our lower bounds apply to the semantic version of these systems and to any semantic tree-like proof system where the number of distinct lines is bounded by $\exp\bigl(s(n)\bigr)$.

Two New Preprints: Balanced Separators for (Fat) Minor-Free Graphs

from Hung Le

The graph minor structure theorem by Robertson and Seymour, and balanced separators are two central algorithmic tools in minor-free graphs. I personally like thinking about balanced separators. I wrote about it 3 years ago on this blog. A few months ago, we gave the first linear time algorithm for finding such a separator. Now I write about it again, focusing on two recent preprints. A Separator for Minor-Free Graphs Beyond the Flow Barrier. Coarse Balanced Separators in Fat-Minor-Free Graphs, joint with Édouard Bonnet, Marcin Pilipczuk and Michał Pilipczuk. I am sure this is not the last time I write about separators; more will be coming. Fat-minor-free graphs are a relatively new concept and quite interesting. I will say more about them below. 1. The Balanced Separator Conjecture for Minor-Free Graphs In 1990, Alon, Seymour, and Thomas (AST) proved their landmark result that \( K_h \)-minor-free graphs have a balanced separator of size \( O(h^{3/2} \sqrt{n}) \). (In this section, I will not treat \( h \) as constant so the dependence of the separator’s size on \( h \) will be given explicitly.) They conjectured that the size of the separator should be the clean bound \( O(h\sqrt{n}) \). AST Separator Conjecture: \( K_h \)-minor-free graphs have a balanced separator of size \( O(h \sqrt{n}) \). They also observed that the bound in the conjecture would be: Expander graphs of degree-3 have clique minors of size \( h = \Theta(\sqrt{n}) \) and balanced separators of size \( \Theta(n) \). The history of the progress on this conjecture is a bit messy. Kawarabayashi and Reed claimed a solution but the precise bound on the size of their separator is \( O(h\sqrt{n} + f(h)) \) where \( f \) is a very fast-growing function in the graph minor structure theorem. So it is only \( O(h\sqrt{n}) \) for very small values of \( h \). On the other hand, it is clear that AST is interested in the regime where \( h \) is relatively large, in which the bound by Kawarabayashi and Reed is vacuous. I wrote about another relatively simple algorithm by Plotkin, Rao and Smith which gives a balanced separator size \( O(h\sqrt{n \log n}) \). There is an annoying \( \sqrt{\log n} \) factor, which makes the bound inferior to the original AST bound for small \( h \). There is another amazing line of work on constructing balanced separators for \( K_h \)-minor-free graphs using the duality between concurrent flow and sparsest cut. In this line of work, the paper to read is Korhonen and Lokshtanov, a beautiful paper, giving a clear presentation of how the flow-cut duality helps in finding balanced separators. In a nutshell, the lesson is this: if the flow-cut duality gap is \( \beta \), then \( K_h \)-minor-free graphs have balanced separators of size \(O(h\beta \sqrt{n}).\) I included the detail of this in Section 4 of this preprint. Of course if one can show \( \beta = O(1) \), then one proves the AST conjecture. However, for general graphs, \( \beta = \Omega(\log n) \), and for \( K_h \)-minor-free graphs, \( \beta = \Omega(\log h) \). The recent paper on padded decomposition by Filtser and Conroy (which I also wrote about previously) showed that one can achieve the optimal \( \beta = O(\log h) \). This means one can construct a balanced separator of size \( O(h\log h \sqrt{n}) \), which is pretty close to the AST conjecture. However, the \( \log h \) factor is inherent in this type of technique due to the lower bound \( \beta = \Omega(\log h) \). As beautiful as it is, this technique cannot give a proof of the AST conjecture. So the question I am interested in is whether one can get below the “flow barrier” bound \( O(h\log h \sqrt{n}) \). And as you might well guess, the answer is yes: the first preprint mentioned above gives a balanced separator of size \(O(h\sqrt{\log h} \sqrt{n}).\) The \( h\sqrt{\log h} \) factor above mysteriously coincides with the same factor in the sparsity bound. I think it is just a coincidence and nothing else. To get this bound, I add a new twist, namely a low-diameter decomposition, to the same old framework for balanced separators. If you are curious, check it out. 2. Balanced Separators in Fat-Minor-Free Graphs Michał introduced me to the fat-minor-free graph stuffs in a Dagstuhl workshop that I co-organized. Fat-minor is a central concept in a so-called coarse graph theory. There are many interesting and ambitious conjectures in the theory. The most famous one is the existence of a quasi-isometry to minor-free graphs (Conjecture 1.1 in the linked preprint). Unfortunately, many, if not all, of these conjectures have been recently disproved, including the famous conjecture. But not all are lost. And one of which is the existence of a good coarse balanced separator, the topic discussed here. Let me introduce the notion of a \( d \)-fat clique minor. A clique minor \( K_h \) of a graph \( G \) refers to a collection of vertex-disjoint connected subgraphs that have pairwise edges between them; this collection is called a \( K_h \)-minor model. A graph is \( K_h \)-minor-free if it does not have a \( K_h \)-minor model. Easy peasy. There is a more complicated way to define a \( K_h \)-minor model of \( G \). Let \( B \) be a map that maps each vertex \( v \) in \( K_h \) to a connected subgraph \( B_v \) of \( G \) and each edge \( e \) in \( K_h \) to a connected subgraph \( B_e \) of \( G \). The map \( B \) realizes a \( K_h \)-minor model if: whenever \( v \) is an endpoint of \( e \), \( B_v \) and \( B_e \) share at least one vertex. otherwise, subgraphs in \( \{B_v \mid v \in V\} \cup \{B_e \mid e \in E\} \) are vertex disjoint. Figure 1: A \( 2 \)-fat minor of \( K_4 \) in a graph \( G \). Picture generated by Gemini. This somewhat complicated definition is naturally generalizable. Given an integer \( d \ge 1 \), a map \( B \) realizes a \( d \)-fat \( K_h \)-minor if: whenever \( v \) is an endpoint of \( e \), \( B_v \) and \( B_e \) share at least one vertex. otherwise, subgraphs in \( \{B_v \mid v \in V\} \cup \{B_e \mid e \in E\} \) are distance at least \( d \) in \( G \). When \( d = 1 \), a \( d \)-fat \( K_h \)-minor is also a \( K_h \)-minor. Things are interesting when \( d \ge 2 \). For example, a \( 2 \)-fat minor is closely related to an induced minor. Things are more complicated when \( d \ge 3 \), which is the central regime in coarse graph theory. We say that a graph class is fat-minor-free if every graph in the class has no \( d \)-fat \( K_h \)-minor for a constant \( d \) and \( h \). Fat-minor-free graphs are very rich: a complete graph on \( n \) vertices excludes a \( K_2 \) (yes, \( K_2 \)) as a \( 2 \)-fat minor. This is because one cannot have a map \( B \) mapping two vertices of \( K_2 \) to two vertex-disjoint subgraphs of distance \( 2 \): every pair of distinct vertices in \( K_n \) has distance exactly 1. But richness comes at a cost: it’s hard to understand this class. A question we ask is a separator theorem for fat-minor-free graphs. Of course, we can hope to have a balanced separator with a sublinear number of vertices, as the complete graph \( K_n \) does not have any. But we could hope for a balanced separator that contains a small number of balls of constant radius. Specifically, the question Michał asked is: Fat-Minor Separator Conjecture: For any two given integers \( h, d \in \mathbb{N} \), there exist constants \( c, r \in \mathbb{N} \) such that every \( n \)-vertex graph \( G \) that excludes \( K_h \) as a \( d \)-fat minor has a balanced separator that can be covered by \( c \sqrt{n} \) balls of radius \( r \). The conjecture remains open. In our preprint, we showed the following separator theorem, which is very close to the conjecture. Theorem: For any two given integers \( h, d \in \mathbb{N} \), and real \( \epsilon > 0 \), the following holds: Every \( n \)-vertex graph that excludes \( K_h \) as a \( d \)-fat minor has a balanced separator that can be covered by \( O(h^4 \cdot n^{1/2 + \epsilon}) \) balls of radius \( O(d / \epsilon) \). The proof of the above theorem follows the flow-based technique of Korhonen and Lokshtanov mentioned above. The key difficulty in the proof is to guarantee that the distance between the branch sets (the connected subgraphs \( B_v \) and \( B_e \)) is sufficiently far in the input graph. This is precisely the reason why we need the \( \epsilon \) in the theorem above. 3. Open problems Two open problems: Open Problem 1: Prove the AST separator conjecture. Open Problem 2: Prove the fat-minor separator conjecture. Regarding the fat-minor separator conjectures, in our preprint, we formulated a few related conjectures that could lead to its proof, so be sure to check them out.

The graph minor structure theorem by Robertson and Seymour, and balanced separators are two central algorithmic tools in minor-free graphs. I personally like thinking about balanced separators. I wrote about it 3 years ago on this blog. A few months ago, we gave the first linear time algorithm for finding such a separator. Now I write about it again, focusing on two recent preprints.

  1. A Separator for Minor-Free Graphs Beyond the Flow Barrier.
  2. Coarse Balanced Separators in Fat-Minor-Free Graphs, joint with Édouard Bonnet, Marcin Pilipczuk and Michał Pilipczuk.

I am sure this is not the last time I write about separators; more will be coming.

Fat-minor-free graphs are a relatively new concept and quite interesting. I will say more about them below.


1. The Balanced Separator Conjecture for Minor-Free Graphs

In 1990, Alon, Seymour, and Thomas (AST) proved their landmark result that \( K_h \)-minor-free graphs have a balanced separator of size \( O(h^{3/2} \sqrt{n}) \). (In this section, I will not treat \( h \) as constant so the dependence of the separator’s size on \( h \) will be given explicitly.) They conjectured that the size of the separator should be the clean bound \( O(h\sqrt{n}) \).


AST Separator Conjecture: \( K_h \)-minor-free graphs have a balanced separator of size \( O(h \sqrt{n}) \).


They also observed that the bound in the conjecture would be: Expander graphs of degree-3 have clique minors of size \( h = \Theta(\sqrt{n}) \) and balanced separators of size \( \Theta(n) \).

The history of the progress on this conjecture is a bit messy. Kawarabayashi and Reed claimed a solution but the precise bound on the size of their separator is \( O(h\sqrt{n} + f(h)) \) where \( f \) is a very fast-growing function in the graph minor structure theorem. So it is only \( O(h\sqrt{n}) \) for very small values of \( h \). On the other hand, it is clear that AST is interested in the regime where \( h \) is relatively large, in which the bound by Kawarabayashi and Reed is vacuous.

I wrote about another relatively simple algorithm by Plotkin, Rao and Smith which gives a balanced separator size \( O(h\sqrt{n \log n}) \). There is an annoying \( \sqrt{\log n} \) factor, which makes the bound inferior to the original AST bound for small \( h \).

There is another amazing line of work on constructing balanced separators for \( K_h \)-minor-free graphs using the duality between concurrent flow and sparsest cut. In this line of work, the paper to read is Korhonen and Lokshtanov, a beautiful paper, giving a clear presentation of how the flow-cut duality helps in finding balanced separators. In a nutshell, the lesson is this: if the flow-cut duality gap is \( \beta \), then \( K_h \)-minor-free graphs have balanced separators of size \(O(h\beta \sqrt{n}).\)

I included the detail of this in Section 4 of this preprint. Of course if one can show \( \beta = O(1) \), then one proves the AST conjecture. However, for general graphs, \( \beta = \Omega(\log n) \), and for \( K_h \)-minor-free graphs, \( \beta = \Omega(\log h) \). The recent paper on padded decomposition by Filtser and Conroy (which I also wrote about previously) showed that one can achieve the optimal \( \beta = O(\log h) \). This means one can construct a balanced separator of size \( O(h\log h \sqrt{n}) \), which is pretty close to the AST conjecture.

However, the \( \log h \) factor is inherent in this type of technique due to the lower bound \( \beta = \Omega(\log h) \). As beautiful as it is, this technique cannot give a proof of the AST conjecture. So the question I am interested in is whether one can get below the “flow barrier” bound \( O(h\log h \sqrt{n}) \). And as you might well guess, the answer is yes: the first preprint mentioned above gives a balanced separator of size \(O(h\sqrt{\log h} \sqrt{n}).\)

The \( h\sqrt{\log h} \) factor above mysteriously coincides with the same factor in the sparsity bound. I think it is just a coincidence and nothing else. To get this bound, I add a new twist, namely a low-diameter decomposition, to the same old framework for balanced separators. If you are curious, check it out.


2. Balanced Separators in Fat-Minor-Free Graphs

Michał introduced me to the fat-minor-free graph stuffs in a Dagstuhl workshop that I co-organized. Fat-minor is a central concept in a so-called coarse graph theory. There are many interesting and ambitious conjectures in the theory. The most famous one is the existence of a quasi-isometry to minor-free graphs (Conjecture 1.1 in the linked preprint). Unfortunately, many, if not all, of these conjectures have been recently disproved, including the famous conjecture. But not all are lost. And one of which is the existence of a good coarse balanced separator, the topic discussed here.

Let me introduce the notion of a \( d \)-fat clique minor. A clique minor \( K_h \) of a graph \( G \) refers to a collection of vertex-disjoint connected subgraphs that have pairwise edges between them; this collection is called a \( K_h \)-minor model. A graph is \( K_h \)-minor-free if it does not have a \( K_h \)-minor model. Easy peasy.

There is a more complicated way to define a \( K_h \)-minor model of \( G \). Let \( B \) be a map that maps each vertex \( v \) in \( K_h \) to a connected subgraph \( B_v \) of \( G \) and each edge \( e \) in \( K_h \) to a connected subgraph \( B_e \) of \( G \). The map \( B \) realizes a \( K_h \)-minor model if:

  1. whenever \( v \) is an endpoint of \( e \), \( B_v \) and \( B_e \) share at least one vertex.
  2. otherwise, subgraphs in \( \{B_v \mid v \in V\} \cup \{B_e \mid e \in E\} \) are vertex disjoint.

K4-fatminor

Figure 1: A \( 2 \)-fat minor of \( K_4 \) in a graph \( G \). Picture generated by Gemini.

This somewhat complicated definition is naturally generalizable. Given an integer \( d \ge 1 \), a map \( B \) realizes a \( d \)-fat \( K_h \)-minor if:

  1. whenever \( v \) is an endpoint of \( e \), \( B_v \) and \( B_e \) share at least one vertex.
  2. otherwise, subgraphs in \( \{B_v \mid v \in V\} \cup \{B_e \mid e \in E\} \) are distance at least \( d \) in \( G \).

When \( d = 1 \), a \( d \)-fat \( K_h \)-minor is also a \( K_h \)-minor. Things are interesting when \( d \ge 2 \). For example, a \( 2 \)-fat minor is closely related to an induced minor. Things are more complicated when \( d \ge 3 \), which is the central regime in coarse graph theory.

We say that a graph class is fat-minor-free if every graph in the class has no \( d \)-fat \( K_h \)-minor for a constant \( d \) and \( h \). Fat-minor-free graphs are very rich: a complete graph on \( n \) vertices excludes a \( K_2 \) (yes, \( K_2 \)) as a \( 2 \)-fat minor. This is because one cannot have a map \( B \) mapping two vertices of \( K_2 \) to two vertex-disjoint subgraphs of distance \( 2 \): every pair of distinct vertices in \( K_n \) has distance exactly 1. But richness comes at a cost: it’s hard to understand this class.

A question we ask is a separator theorem for fat-minor-free graphs. Of course, we can hope to have a balanced separator with a sublinear number of vertices, as the complete graph \( K_n \) does not have any. But we could hope for a balanced separator that contains a small number of balls of constant radius. Specifically, the question Michał asked is:


Fat-Minor Separator Conjecture: For any two given integers \( h, d \in \mathbb{N} \), there exist constants \( c, r \in \mathbb{N} \) such that every \( n \)-vertex graph \( G \) that excludes \( K_h \) as a \( d \)-fat minor has a balanced separator that can be covered by \( c \sqrt{n} \) balls of radius \( r \).


The conjecture remains open. In our preprint, we showed the following separator theorem, which is very close to the conjecture.


Theorem: For any two given integers \( h, d \in \mathbb{N} \), and real \( \epsilon > 0 \), the following holds: Every \( n \)-vertex graph that excludes \( K_h \) as a \( d \)-fat minor has a balanced separator that can be covered by \( O(h^4 \cdot n^{1/2 + \epsilon}) \) balls of radius \( O(d / \epsilon) \).


The proof of the above theorem follows the flow-based technique of Korhonen and Lokshtanov mentioned above. The key difficulty in the proof is to guarantee that the distance between the branch sets (the connected subgraphs \( B_v \) and \( B_e \)) is sufficiently far in the input graph. This is precisely the reason why we need the \( \epsilon \) in the theorem above.


3. Open problems

Two open problems:


Open Problem 1: Prove the AST separator conjecture.


Open Problem 2: Prove the fat-minor separator conjecture.


Regarding the fat-minor separator conjectures, in our preprint, we formulated a few related conjectures that could lead to its proof, so be sure to check them out.

By Hung Le

Friday, May 15

Linkage

from David Eppstein

Daniel Lemire speeds up binary search using parallel data-comparison instructions (\(\mathbb{M}\), see also).

By David Eppstein

TR26-077 | Redundancy Is All You Need (for CSP Sparsification) | Joshua Brakensiek, Venkatesan Guruswami

from ECCC Papers

The seminal work of Benczu}r and Karger demonstrated cut sparsifiers of near-linear size, with several applications throughout theoretical computer science. Subsequent extensions have yielded sparsifiers for hypergraph cuts and more recently linear codes over Abelian groups. A decade ago, Kogan and Krauthgamer asked about the sparsifiability of arbitrary constraint satisfaction problems (CSPs). For this question, a trivial lower bound is the size of a *non-redundant* CSP instance, which admits, for each constraint, an assignment satisfying only that constraint (so that no constraint can be dropped by the sparsifier). For instance, for graph cuts, spanning trees are non-redundant instances. Our main result is that redundant clauses are sufficient for sparsification: for any CSP predicate $R$, every unweighted instance of CSP$(R)$ has a sparsifier of size at most its non-redundancy (up to polylog and $1/\epsilon$ factors). For weighted instances, we similarly pin down the sparsifiability to the so-called chain length of the predicate. These results precisely determine the extent to which any CSP can be sparsified. Our result is established in the general setting of non-linear codes, or equivalently set families, yielding a VC-type theorem for multiplicative error approximation. This unifies and extends known sparsification results for graph cuts, linear codes, and broad CSP classes. A key technical ingredient in our work is a novel application of the entropy method from Gilmer's recent breakthrough on the union-closed sets conjecture. As an immediate consequence of our main theorem, a number of results in the non-redundancy literature immediately extend to CSP sparsification. We also contribute new techniques for understanding the non-redundancy of CSP predicates. By adapting methods from the matching vector codes literature in coding theory, we are able to construct an explicit predicate whose non-redundancy lies between $\Omega(n^{1.5})$ and $\widetilde{O}(n^{1.6})$, the first example with a provably non-integral exponent. The tools we introduce in this paper, including a conditional version of non-redundancy, have spurred several follow-ups on non-redundancy as well as new connections to extremal combinatorics, sparsification, and streaming.

The seminal work of Benczu}r and Karger demonstrated cut sparsifiers of near-linear size, with several applications throughout theoretical computer science. Subsequent extensions have yielded sparsifiers for hypergraph cuts and more recently linear codes over Abelian groups. A decade ago, Kogan and Krauthgamer asked about the sparsifiability of arbitrary constraint satisfaction problems (CSPs). For this question, a trivial lower bound is the size of a *non-redundant* CSP instance, which admits, for each constraint, an assignment satisfying only that constraint (so that no constraint can be dropped by the sparsifier). For instance, for graph cuts, spanning trees are non-redundant instances. Our main result is that redundant clauses are sufficient for sparsification: for any CSP predicate $R$, every unweighted instance of CSP$(R)$ has a sparsifier of size at most its non-redundancy (up to polylog and $1/\epsilon$ factors). For weighted instances, we similarly pin down the sparsifiability to the so-called chain length of the predicate. These results precisely determine the extent to which any CSP can be sparsified. Our result is established in the general setting of non-linear codes, or equivalently set families, yielding a VC-type theorem for multiplicative error approximation. This unifies and extends known sparsification results for graph cuts, linear codes, and broad CSP classes. A key technical ingredient in our work is a novel application of the entropy method from Gilmer's recent breakthrough on the union-closed sets conjecture. As an immediate consequence of our main theorem, a number of results in the non-redundancy literature immediately extend to CSP sparsification. We also contribute new techniques for understanding the non-redundancy of CSP predicates. By adapting methods from the matching vector codes literature in coding theory, we are able to construct an explicit predicate whose non-redundancy lies between $\Omega(n^{1.5})$ and $\widetilde{O}(n^{1.6})$, the first example with a provably non-integral exponent. The tools we introduce in this paper, including a conditional version of non-redundancy, have spurred several follow-ups on non-redundancy as well as new connections to extremal combinatorics, sparsification, and streaming.

One-Sided Differential Privacy

from DifferentialPrivacy.org

Differential privacy is defined in terms of pairs of neighboring datasets. That is, \(M\) is \((\varepsilon,\delta)\)-differentially private if, for all measurable events \(T\) and all neighboring pairs of datasets \(x,x’\), we have \[\mathbb{P}[M(x) \in T] \le e^\varepsilon \mathbb{P}[M(x’) \in T] + \delta.\] There are three common ways to define neighors: (i) \(x\) and \(x’\) differ only by the replacement of one person’s data, (ii) \(x\) and \(x’\) differ only by the addition or removal of one person’s data, or (iii) \(x\) and \(x’\) differ only by the addition, removal, or replacement of one person’s data. All three of these are symmetric in the sense that we can swap \(x\) and \(x’\). This symmetry is nice mathematically, but maybe you’ve wondered what happens if we make it asymmetric; if so, read on.

Since replacement is inherently symmetric, we will focus on the asymmetry of addition or removal. Let’s write out the two definitions of one-sided differential privacy that we will work with:

Definition 1 (Addition-only Differential Privacy). A randomized algorithm \(M\) is \((\varepsilon,\delta)\)-differentially private for addition if, for all measurable events \(T\) and all pairs of datasets \(x,x’\) such that \(x\) is formed by adding one element to \(x’\), we have \[\mathbb{P}[M(x) \in T] \le e^\varepsilon \mathbb{P}[M(x’) \in T] + \delta.\]

Definition 2 (Removal-only Differential Privacy). A randomized algorithm \(M\) is \((\varepsilon,\delta)\)-differentially private for removal if, for all measurable events \(T\) and all pairs of datasets \(x,x’\) such that \(x’\) is formed by removing one element from \(x\), we have \[\mathbb{P}[M(x’) \in T] \le e^\varepsilon \mathbb{P}[M(x) \in T] + \delta.\]

To make this post less confusing we will stick to the notational convention that \(x’ \subseteq x\).

Interpreting the Asymmetric Guarantees

Before getting technical, let’s try to get some intuition for how these two definitions differ. I generally think of the promise of differential privacy as the following \[\mathbb{P}\left[\text{bad event happens} \atop \text{in the real world}\right] \le e^\varepsilon \mathbb{P}\left[\text{bad event happens in a} \atop \text{hypothetical ideal world}\right] + \delta. \tag{1}\] Thinking of it like this hopefully gives some sense for how the two datasets in the definition have differing roles – the real world is the real world and the hypothetical ideal world is a counterfactual that we devise. From a privacy standpoint, your ideal world is usually one where your data has been deleted. In other words, we typically want something like \[\mathbb{P}\left[\text{bad event happens} \atop \text{in the real world}\right] \le e^\varepsilon \mathbb{P}\left[\text{bad event happens} \atop \text{without access to your data}\right] + \delta. \tag{2}\] This corresponds to to addition-only differential privacy (Definition 1).

OK, hopefully, I’ve convinced you that Definition 1 is what we really want. Is there any reason to work with Definition 2? I can think of two reasons:

  1. Suppose the dataset is a collection of people who are deemed to be wise and pure of heart. Perhaps you have been omitted from this dataset (for whatever reason) and you want to be part of this dataset. In this case, your ideal world is the one where your data is inserted, and Definition 2 is what you want.

  2. The other reason is that I’ve implicitly assumed that bad events are unlikely. Equation 1 can become vacuous, say, if \(\varepsilon=1\) and \(\mathbb{P}[\text{bad event happens in a hypothetical ideal world}]\ge 0.5\).

    In such a case, we can apply Definition 2 to the good event (\(\notin T\)) instead of the bad event (\(\in T\)): \[\mathbb{P}[M(x’) \notin T] \le e^\varepsilon \mathbb{P}[M(x) \notin T] + \delta,\] which rearranges to \[\mathbb{P}[M(x) \in T] \le e^{-\varepsilon} \mathbb{P}[M(x’) \in T] + 1 - e^{-\varepsilon} + e^{-\varepsilon} \delta .\] This guarantee is of the form of Definition 1. It’s weird, but it’s useful when your bad event in the hypothetical ideal world (\(M(x’) \in T\)) is actually quite likely.

    This calculation shows that \((\varepsilon,\delta)\)-differential privacy for removal implies \((-\varepsilon,1-e^{-\varepsilon}+e^{-\varepsilon}\delta)\)-differential privacy for addition (and vice versa). In other words, the difference between the two definitions is, in some sense, just a matter of differing parameter regimes.

These two reasons are a little contrived. I’d still argue that Definition 1 is more natural, but Definition 2 has its place.

One-Sided Laplace Noise

OK, so far we’ve stated the definitions and tried to give them some intuitive meaning. What can we actually do with these definitions?

Let’s suppose we have a counting query \(q\) (or, more generally, a monotone low-sensitivity query). Under standard two-sided differential privacy, we can answer the query with Laplace noise addition. What can we do under one-sided differential privacy? Of course, we can still use Laplace noise, but can we do something with better accuracy under this weaker privacy notion?

It turns out we can use one-sided Laplace noise – more commonly known as Exponential noise – to get one-sided differential privacy:

Proposition 3 (One-sided Noise for Addition-only Differential Privacy). Let \(q : \mathcal{X}^* \to \mathbb{R}\) be monotone (i.e., \(x’ \subseteq x \implies q(x’) \le q(x)\)) and have sensitivity 1 (i.e. \(q(x)-q(x’) \le |x \setminus x’| \)). Let \(\xi\) denote a standard exponential random variable (i.e., \(\forall y \in \mathbb{R} ~ ~ \mathbb{P}[\xi \ge y]=\min\{1,\exp(-y)\}\)).

Define \(M : \mathcal{X}^* \to \mathbb{R}\) by \(M(x) = q(x) + \frac{1}{\varepsilon}\xi\). Then \(M\) satisfies \((\varepsilon,0)\)-differential privacy for addition (Definition 1).

Proof. Let \(x’ \subseteq x\) be inputs differing by the addition of one element – i.e. \(|x \setminus x’|=1\). And let \(T \subseteq \mathbb{R}\) be measurable. By assumption, \(q(x)-1 \le q(x’) \le q(x)\). Now \[\mathbb{P}[M(x) \in T] = \mathbb{P}[q(x) + \frac{1}{\varepsilon} \xi \in T] ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ \] \[ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ = \int_T e^{-\varepsilon (y-q(x))} \varepsilon \mathbb{I}[y \ge q(x)]\mathrm{d}y\] \[ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ \le \int_T e^{-\varepsilon (y-q(x’)-1)} \varepsilon \mathbb{I}[y \ge q(x’)]\mathrm{d}y\] \[ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ = e^\varepsilon \int_T e^{-\varepsilon (y-q(x’))} \varepsilon \mathbb{I}[y \ge q(x’)]\mathrm{d}y\] \[ ~ ~ ~ = e^\varepsilon \mathbb{P}[M(x’) \in T] .\] Here we used the fact that the shifted and scaled exponential \(\mu + \lambda \xi\) (with \(\lambda>0\)) has probability density \(e^{-\lambda(y-\mu)}\lambda\mathbb{I}[y \ge \mu]\) at \(y\) to write the probabilities as integrals. And we used \(q(x)-1 \le q(x’) \le q(x)\) to obtain the bounds \(e^{-\varepsilon (y-q(x))} \le e^{-\varepsilon (y-q(x’)-1)}\) and \(\mathbb{I}[y \ge q(x)] \le \mathbb{I}[y \ge q(x’)]\). ∎

We can also do the other side:

Proposition 4 (One-sided Noise for Removal-only Differential Privacy). Let \(q : \mathcal{X}^* \to \mathbb{R}\) be monotone and have sensitivity 1. Let \(\xi\) denote a standard exponential random variable.

Define \(M : \mathcal{X}^* \to \mathbb{R}\) by \(M(x) = q(x) - \frac{1}{\varepsilon}\xi\). Then \(M\) satisfies \((\varepsilon,0)\)-differential privacy for removal (Definition 2).

The proof is symmetric. A few remarks are in order:

  1. If \(M\) satisfies both addition-only and removal-only differential privacy then \(M\) satisfies regular two-sided differential privacy. In particular, if we define \(M(x) = q(x) + \frac{1}{\varepsilon}\xi - \frac{1}{\varepsilon}\xi’\) where \(\xi\) and \(\xi’\) are independent standard exponential random variables, then \(M\) satisfies regular two-sided differential privacy. And – surprise, surprise – this is identical to the regular Laplace mechanism; \(\xi-\xi’\) follows a standard Laplace distribution.

    Thus decomposing the differential privacy guarantee into addition- and removal-only parts corresponds exactly to decomposing the Laplace noise into positive and negative parts!

  2. Exponential noise has half the variance of Laplace noise. Thus relaxing to one-sided differential privacy saves you a factor of two in the variance, relative to two-sided differential privacy. That’s a significant win.

  3. The exponential mechani—er, the mechanism in Propositions 3 and 4 is biased. This can be fixed. Namely, \(M(x) = q(x) \pm \frac{1}{\varepsilon}(\xi-1)\) would be unbiased and would satisfy the same privacy guarantee.

  4. We can extend Propositions 3 and 4 to histograms. I.e., a collection of counts where each person contributes to only one count. Histograms demonstrate that positive and negative noise are fundamentally different. Namely, suppose the true count is \(q(x)=0\), then negative noise results in a negative number, but since counts can’t be negative, we’ll just round this back to 0. So negative noise will not change zero counts. On the other hand, positive noise will take zero counts and increase them. Histograms are often sparse (i.e., most counts are zero), so this is a significant difference.

Subsampling

We’ve posted about privacy amplification by subsampling before. Let’s examine it under one-sided differential privacy.

For addition-only differential privacy, nothing much changes compared to regular two-sided differential privacy: If we start with \((\varepsilon,\delta)\)-differential privacy for addition and combine it with Poisson subsampling with probability \(p\), then we get \((\varepsilon’,p\delta)\)-differential privacy for addition with \(\varepsilon’ = \log(1+p(\exp(\varepsilon)-1))\). And the proof is the same as for two-sided differential privacy.

However, for removal-only differential privacy, strange things happen:

Theorem 5 (Subsampling is Removal-only Differentially Private). For \(p \in [0,1)\), define \(S_p : \mathcal{X}^* \to \mathcal{X}^*\) to be the Poisson random subsampling procedure that includes each element independently with probability \(p\) – i.e., \(\forall x \in \mathcal{X}^* ~ S_p(x) \subseteq x\) and, independently for all \(x_i \in x\), we have \(\mathbb{P}[x_i \in S_p(x)]=p\). Then \(S_p\) is \((\varepsilon,0)\)-differentially private for removal (Definition 2) for \(\varepsilon=-\log(1-p)\).

Proof. Let \(x’ \subseteq x\) be inputs differing by the removal of one element – i.e. \(x’ = x \setminus \{x_i\}\). And let \(T \subseteq \mathcal{X}^*\) be measurable. We need to show that \[\mathbb{P}[S_p(x’) \in T] \le e^\varepsilon \mathbb{P}[S_p(x) \in T].\] By independence, the distribution of \(S_p(x)\) conditioned on \(x_i \notin S_p(x)\) is equivalent to the distribution of \(S_p(x \setminus \{x_i\})\), which is simply \(S_p(x’)\). Similarly, the distribution of \(S_p(x)\) conditioned on \(x_i \in S_p(x)\) is equivalent to the distribution of \(S_p(x’) \cup \{x_i\}\). Now we can decompose:
\(\mathbb{P}[S_p(x) \in T] = \mathbb{P}[S_p(x) \in T \mid x_i \in S_p(x)] \mathbb{P}[x_i \in S_p(x)] \)
\( ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ + \mathbb{P}[S_p(x) \in T \mid x_i \notin S_p(x)] \mathbb{P}[x_i \notin S_p(x)]\)
\( ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ = \mathbb{P}[S_p(x’) \cup \{x_i\} \in T] p + \mathbb{P}[S_p(x’) \in T] (1-p)\)
\( ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ \ge \mathbb{P}[S_p(x’) \in T] e^{-\varepsilon},\)
which rearranges to give the desired result. ∎

Note that \(\varepsilon=-\log(1-p)\) is equivalent to \(p = 1 - e^{-\varepsilon}\). And for small values we have \(\varepsilon \approx p\).

OK, what do we make of this? Combined with a differentially private algorithm, subsampling does amplify privacy, but this is saying that subsampling on its own provides removal-only differential privacy. This algorithm outputs a \(p\)-fraction of the dataset in the clear; intuitively this should not be private. I interpret this result as telling us that something is fishy about removal-only differential privacy; and this connects back to the interpretation discussion earlier.

From Addition-only DP to Two-sided DP

Regular two-sided differential privacy is the conjunction of addition- and removal-only differential privacy. And we saw with the one-sided Laplace noise that we can combine two one-sided differentially private algorithms to get a two-sided differentially private algorithm. Now we’re going to turn this into a generic recipe, where Theorem 5 provides the generic removal-only half of the recipe.

Theorem 6 (Addition-only DP + Subsampling = Two-sided DP). [GKKMS24, Lemma 3.1]

Let \(M : \mathcal{X}^* \to \mathcal{Y}\) satisfy \((\varepsilon,\delta)\)-differential privacy for addition (Definition 1). For \(p \in [0,1)\), define \(S_p : \mathcal{X}^* \to \mathcal{X}^*\) to be the Poisson random subsampling procedure that includes each element independently with probability \(p\).

Define \(\widetilde{M} : \mathcal{X}^* \to \mathcal{Y}\) by \(\widetilde{M}(x) = M(S_p(x)))\). Then \(\widetilde{M}\) satisfies \((\widetilde{\varepsilon},p\delta)\)-differential privacy (for addition and removal) for \[\widetilde{\varepsilon} = \max\{-\log(1-p), \log(1+p(\exp(\varepsilon)-1))\}.\]

Proof Sketch. Let \(x’ \subseteq x\) be inputs differing by the addition or removal of one element – i.e. \(|x \setminus x’|=1\). And let \(T \subseteq \mathcal{Y}\) be measurable. We have to prove two directions.

First, the removal direction: \[\mathbb{P}[\widetilde{M}(x’) \in T] \le \frac{1}{1-p} \mathbb{P}[\widetilde{M}(x) \in T] \le e^{\widetilde{\varepsilon}} \mathbb{P}[\widetilde{M}(x) \in T]\] follows from Theorem 5 and the fact that \(\widetilde{M}\) is a postprocessing of \(S_p\).

Second, the addition direction \[\mathbb{P}[\widetilde{M}(x) \in T] \le (1-p + pe^\varepsilon) \mathbb{P}[\widetilde{M}(x’) \in T] + p\delta\] follows from the addition-only differential privacy of the base algorithm \(M\) and privacy amplification by subsampling. ∎

This is an interesting result – it provides a new way to construct differentially private algorithms. In particular, combining Theorem 6 with Proposition 3 shows that subsampling and adding exponential noise satisfies differential privacy. To be precise, if \(q : \mathcal{X}^* \to \mathbb{R}\) is monotone and sensitivity 1, then, for \(p=1-e^{-\varepsilon}\), \(\lambda = \frac{1}{\log\left(1 + e^\varepsilon\right)},\) and \(\xi\) a standard exponential random variable, \[\widetilde{M}(x) = q(S_p(x)) + \lambda \xi \tag{3}\] defines a \((\varepsilon,0)\)-differentially private algorithm (for addition and removal).1

We can also view this as a negative result. Propositions 3 and 4 showed that we can reduce the variance of count queries by a factor of two by relaxing from standard two-sided differential privacy to one-sided differential privacy. How much more gain could we get? Theorem 6 shows that we cannot improve accuracy by too much via a relaxed one-sided privacy definition, since any algorithm satisfying the relaxed addition-only definition can be converted into one that satisfies the two-sided definition.

Conclusion

In this post we explored asymmetric one-sided differential privacy. We proved a few results, but the real takeaways are the insights we made along the way! By separating the two sides of the standard two-sided differential privacy guarantee, we gain a deeper understanding of what is going on in the definition.

We saw both at the intuitive and formal level that addition-only differential privacy (Definition 1) is a more meaningful guarantee than removal-only differential privacy (Definition 2). Indeed, the removal side can be tacked on relatively easily via subsampling (Theorem 6).

Should we be using one-sided differential privacy guarantees in practice? We definitely shouldn’t rely on removal-only differential privacy. But addition-only differential privacy seems meaningful, so perhaps its worthwhile if it lets us improve constants. However, in my humble opinion, the added complexity of one-sided differential privacy is probably not worthwhile. It’s more of a curiosity, or perhaps an intermediate analytical tool.2

Finally, we note that there are a couple of papers that have explored one-sided differential privacy:

  • Ghazi, Kamath, Kumar, Manurangsi, and Sealfon [GKKMS24] used addition-only differential privacy as an intermediate analytical tool; in particular, they proved Theorem 6 (and used it).
  • Kotsogiannis, Doudalis, Haney, Machanavajjhala, and Mehrotra [KDHMM20] define one-sided differential privacy differently than we do. Roughly, they define one-sided differential privacy in terms of replacement (rather than addition/removal); to make this asymmetric they classify records as either sensitive or non-sensitive and they allow replacement of a sensitive record with a non-sensitive record. They analyzed one-sided Laplace noise in this setting (cf. Proposition 3).
  • Takagi, Kato, Cao, and Yoshikawa [TCY22] define asymmetric differential privacy in terms of replacement similarly to Kotsogiannis et al. and also give an analysis of one-sided Laplace noise.
  • Chen, Hu, Zhuang, Zhao, and Yu [CHZZY25] define one-sided personalized differential privacy using the same replacement formalism as Kotsogiannis et al., but with an added twist where each person has their own privacy parameter \(\varepsilon\).
  1. There exist settings where the algorithm in Equation 3 achieves lower variance than the Laplace mechanism; working out this parameter regime is left as an exercise for the reader. ↩

  2. Some speculative examples where addition-only differential privacy might be a useful tool: There are some situations – e.g., the shuffle model – where it is easier to add noise than to subtract noise. In such cases addition-only differential privacy may be a useful tool for designing and analyzing algorithms. Another example is when using differential privacy to prove generalization guarantees. ↩

By

Differential privacy is defined in terms of pairs of neighboring datasets. That is, \(M\) is \((\varepsilon,\delta)\)-differentially private if, for all measurable events \(T\) and all neighboring pairs of datasets \(x,x’\), we have \[\mathbb{P}[M(x) \in T] \le e^\varepsilon \mathbb{P}[M(x’) \in T] + \delta.\] There are three common ways to define neighors: (i) \(x\) and \(x’\) differ only by the replacement of one person’s data, (ii) \(x\) and \(x’\) differ only by the addition or removal of one person’s data, or (iii) \(x\) and \(x’\) differ only by the addition, removal, or replacement of one person’s data. All three of these are symmetric in the sense that we can swap \(x\) and \(x’\). This symmetry is nice mathematically, but maybe you’ve wondered what happens if we make it asymmetric; if so, read on.

Since replacement is inherently symmetric, we will focus on the asymmetry of addition or removal. Let’s write out the two definitions of one-sided differential privacy that we will work with:

Definition 1 (Addition-only Differential Privacy). A randomized algorithm \(M\) is \((\varepsilon,\delta)\)-differentially private for addition if, for all measurable events \(T\) and all pairs of datasets \(x,x’\) such that \(x\) is formed by adding one element to \(x’\), we have \[\mathbb{P}[M(x) \in T] \le e^\varepsilon \mathbb{P}[M(x’) \in T] + \delta.\]

Definition 2 (Removal-only Differential Privacy). A randomized algorithm \(M\) is \((\varepsilon,\delta)\)-differentially private for removal if, for all measurable events \(T\) and all pairs of datasets \(x,x’\) such that \(x’\) is formed by removing one element from \(x\), we have \[\mathbb{P}[M(x’) \in T] \le e^\varepsilon \mathbb{P}[M(x) \in T] + \delta.\]

To make this post less confusing we will stick to the notational convention that \(x’ \subseteq x\).

Interpreting the Asymmetric Guarantees

Before getting technical, let’s try to get some intuition for how these two definitions differ. I generally think of the promise of differential privacy as the following \[\mathbb{P}\left[\text{bad event happens} \atop \text{in the real world}\right] \le e^\varepsilon \mathbb{P}\left[\text{bad event happens in a} \atop \text{hypothetical ideal world}\right] + \delta. \tag{1}\] Thinking of it like this hopefully gives some sense for how the two datasets in the definition have differing roles – the real world is the real world and the hypothetical ideal world is a counterfactual that we devise. From a privacy standpoint, your ideal world is usually one where your data has been deleted. In other words, we typically want something like \[\mathbb{P}\left[\text{bad event happens} \atop \text{in the real world}\right] \le e^\varepsilon \mathbb{P}\left[\text{bad event happens} \atop \text{without access to your data}\right] + \delta. \tag{2}\] This corresponds to to addition-only differential privacy (Definition 1).

OK, hopefully, I’ve convinced you that Definition 1 is what we really want. Is there any reason to work with Definition 2? I can think of two reasons:

  1. Suppose the dataset is a collection of people who are deemed to be wise and pure of heart. Perhaps you have been omitted from this dataset (for whatever reason) and you want to be part of this dataset. In this case, your ideal world is the one where your data is inserted, and Definition 2 is what you want.

  2. The other reason is that I’ve implicitly assumed that bad events are unlikely. Equation 1 can become vacuous, say, if \(\varepsilon=1\) and \(\mathbb{P}[\text{bad event happens in a hypothetical ideal world}]\ge 0.5\).

    In such a case, we can apply Definition 2 to the good event (\(\notin T\)) instead of the bad event (\(\in T\)): \[\mathbb{P}[M(x’) \notin T] \le e^\varepsilon \mathbb{P}[M(x) \notin T] + \delta,\] which rearranges to \[\mathbb{P}[M(x) \in T] \le e^{-\varepsilon} \mathbb{P}[M(x’) \in T] + 1 - e^{-\varepsilon} + e^{-\varepsilon} \delta .\] This guarantee is of the form of Definition 1. It’s weird, but it’s useful when your bad event in the hypothetical ideal world (\(M(x’) \in T\)) is actually quite likely.

    This calculation shows that \((\varepsilon,\delta)\)-differential privacy for removal implies \((-\varepsilon,1-e^{-\varepsilon}+e^{-\varepsilon}\delta)\)-differential privacy for addition (and vice versa). In other words, the difference between the two definitions is, in some sense, just a matter of differing parameter regimes.

These two reasons are a little contrived. I’d still argue that Definition 1 is more natural, but Definition 2 has its place.

One-Sided Laplace Noise

OK, so far we’ve stated the definitions and tried to give them some intuitive meaning. What can we actually do with these definitions?

Let’s suppose we have a counting query \(q\) (or, more generally, a monotone low-sensitivity query). Under standard two-sided differential privacy, we can answer the query with Laplace noise addition. What can we do under one-sided differential privacy? Of course, we can still use Laplace noise, but can we do something with better accuracy under this weaker privacy notion?

It turns out we can use one-sided Laplace noise – more commonly known as Exponential noise – to get one-sided differential privacy:

Proposition 3 (One-sided Noise for Addition-only Differential Privacy). Let \(q : \mathcal{X}^* \to \mathbb{R}\) be monotone (i.e., \(x’ \subseteq x \implies q(x’) \le q(x)\)) and have sensitivity 1 (i.e. \(q(x)-q(x’) \le |x \setminus x’| \)). Let \(\xi\) denote a standard exponential random variable (i.e., \(\forall y \in \mathbb{R} ~ ~ \mathbb{P}[\xi \ge y]=\min\{1,\exp(-y)\}\)).

Define \(M : \mathcal{X}^* \to \mathbb{R}\) by \(M(x) = q(x) + \frac{1}{\varepsilon}\xi\). Then \(M\) satisfies \((\varepsilon,0)\)-differential privacy for addition (Definition 1).

Proof. Let \(x’ \subseteq x\) be inputs differing by the addition of one element – i.e. \(|x \setminus x’|=1\). And let \(T \subseteq \mathbb{R}\) be measurable. By assumption, \(q(x)-1 \le q(x’) \le q(x)\). Now \[\mathbb{P}[M(x) \in T] = \mathbb{P}[q(x) + \frac{1}{\varepsilon} \xi \in T] ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ \] \[ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ = \int_T e^{-\varepsilon (y-q(x))} \varepsilon \mathbb{I}[y \ge q(x)]\mathrm{d}y\] \[ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ \le \int_T e^{-\varepsilon (y-q(x’)-1)} \varepsilon \mathbb{I}[y \ge q(x’)]\mathrm{d}y\] \[ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ = e^\varepsilon \int_T e^{-\varepsilon (y-q(x’))} \varepsilon \mathbb{I}[y \ge q(x’)]\mathrm{d}y\] \[ ~ ~ ~ = e^\varepsilon \mathbb{P}[M(x’) \in T] .\] Here we used the fact that the shifted and scaled exponential \(\mu + \lambda \xi\) (with \(\lambda>0\)) has probability density \(e^{-\lambda(y-\mu)}\lambda\mathbb{I}[y \ge \mu]\) at \(y\) to write the probabilities as integrals. And we used \(q(x)-1 \le q(x’) \le q(x)\) to obtain the bounds \(e^{-\varepsilon (y-q(x))} \le e^{-\varepsilon (y-q(x’)-1)}\) and \(\mathbb{I}[y \ge q(x)] \le \mathbb{I}[y \ge q(x’)]\). ∎

We can also do the other side:

Proposition 4 (One-sided Noise for Removal-only Differential Privacy). Let \(q : \mathcal{X}^* \to \mathbb{R}\) be monotone and have sensitivity 1. Let \(\xi\) denote a standard exponential random variable.

Define \(M : \mathcal{X}^* \to \mathbb{R}\) by \(M(x) = q(x) - \frac{1}{\varepsilon}\xi\). Then \(M\) satisfies \((\varepsilon,0)\)-differential privacy for removal (Definition 2).

The proof is symmetric. A few remarks are in order:

  1. If \(M\) satisfies both addition-only and removal-only differential privacy then \(M\) satisfies regular two-sided differential privacy. In particular, if we define \(M(x) = q(x) + \frac{1}{\varepsilon}\xi - \frac{1}{\varepsilon}\xi’\) where \(\xi\) and \(\xi’\) are independent standard exponential random variables, then \(M\) satisfies regular two-sided differential privacy. And – surprise, surprise – this is identical to the regular Laplace mechanism; \(\xi-\xi’\) follows a standard Laplace distribution.

    Thus decomposing the differential privacy guarantee into addition- and removal-only parts corresponds exactly to decomposing the Laplace noise into positive and negative parts!

  2. Exponential noise has half the variance of Laplace noise. Thus relaxing to one-sided differential privacy saves you a factor of two in the variance, relative to two-sided differential privacy. That’s a significant win.

  3. The exponential mechani—er, the mechanism in Propositions 3 and 4 is biased. This can be fixed. Namely, \(M(x) = q(x) \pm \frac{1}{\varepsilon}(\xi-1)\) would be unbiased and would satisfy the same privacy guarantee.

  4. We can extend Propositions 3 and 4 to histograms. I.e., a collection of counts where each person contributes to only one count. Histograms demonstrate that positive and negative noise are fundamentally different. Namely, suppose the true count is \(q(x)=0\), then negative noise results in a negative number, but since counts can’t be negative, we’ll just round this back to 0. So negative noise will not change zero counts. On the other hand, positive noise will take zero counts and increase them. Histograms are often sparse (i.e., most counts are zero), so this is a significant difference.

Subsampling

We’ve posted about privacy amplification by subsampling before. Let’s examine it under one-sided differential privacy.

For addition-only differential privacy, nothing much changes compared to regular two-sided differential privacy: If we start with \((\varepsilon,\delta)\)-differential privacy for addition and combine it with Poisson subsampling with probability \(p\), then we get \((\varepsilon’,p\delta)\)-differential privacy for addition with \(\varepsilon’ = \log(1+p(\exp(\varepsilon)-1))\). And the proof is the same as for two-sided differential privacy.

However, for removal-only differential privacy, strange things happen:

Theorem 5 (Subsampling is Removal-only Differentially Private). For \(p \in [0,1)\), define \(S_p : \mathcal{X}^* \to \mathcal{X}^*\) to be the Poisson random subsampling procedure that includes each element independently with probability \(p\) – i.e., \(\forall x \in \mathcal{X}^* ~ S_p(x) \subseteq x\) and, independently for all \(x_i \in x\), we have \(\mathbb{P}[x_i \in S_p(x)]=p\). Then \(S_p\) is \((\varepsilon,0)\)-differentially private for removal (Definition 2) for \(\varepsilon=-\log(1-p)\).

Proof. Let \(x’ \subseteq x\) be inputs differing by the removal of one element – i.e. \(x’ = x \setminus \{x_i\}\). And let \(T \subseteq \mathcal{X}^*\) be measurable. We need to show that \[\mathbb{P}[S_p(x’) \in T] \le e^\varepsilon \mathbb{P}[S_p(x) \in T].\] By independence, the distribution of \(S_p(x)\) conditioned on \(x_i \notin S_p(x)\) is equivalent to the distribution of \(S_p(x \setminus \{x_i\})\), which is simply \(S_p(x’)\). Similarly, the distribution of \(S_p(x)\) conditioned on \(x_i \in S_p(x)\) is equivalent to the distribution of \(S_p(x’) \cup \{x_i\}\). Now we can decompose:
\(\mathbb{P}[S_p(x) \in T] = \mathbb{P}[S_p(x) \in T \mid x_i \in S_p(x)] \mathbb{P}[x_i \in S_p(x)] \)
\( ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ + \mathbb{P}[S_p(x) \in T \mid x_i \notin S_p(x)] \mathbb{P}[x_i \notin S_p(x)]\)
\( ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ = \mathbb{P}[S_p(x’) \cup \{x_i\} \in T] p + \mathbb{P}[S_p(x’) \in T] (1-p)\)
\( ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ \ge \mathbb{P}[S_p(x’) \in T] e^{-\varepsilon},\)
which rearranges to give the desired result. ∎

Note that \(\varepsilon=-\log(1-p)\) is equivalent to \(p = 1 - e^{-\varepsilon}\). And for small values we have \(\varepsilon \approx p\).

OK, what do we make of this? Combined with a differentially private algorithm, subsampling does amplify privacy, but this is saying that subsampling on its own provides removal-only differential privacy. This algorithm outputs a \(p\)-fraction of the dataset in the clear; intuitively this should not be private. I interpret this result as telling us that something is fishy about removal-only differential privacy; and this connects back to the interpretation discussion earlier.

From Addition-only DP to Two-sided DP

Regular two-sided differential privacy is the conjunction of addition- and removal-only differential privacy. And we saw with the one-sided Laplace noise that we can combine two one-sided differentially private algorithms to get a two-sided differentially private algorithm. Now we’re going to turn this into a generic recipe, where Theorem 5 provides the generic removal-only half of the recipe.

Theorem 6 (Addition-only DP + Subsampling = Two-sided DP). [GKKMS24, Lemma 3.1]

Let \(M : \mathcal{X}^* \to \mathcal{Y}\) satisfy \((\varepsilon,\delta)\)-differential privacy for addition (Definition 1). For \(p \in [0,1)\), define \(S_p : \mathcal{X}^* \to \mathcal{X}^*\) to be the Poisson random subsampling procedure that includes each element independently with probability \(p\).

Define \(\widetilde{M} : \mathcal{X}^* \to \mathcal{Y}\) by \(\widetilde{M}(x) = M(S_p(x)))\). Then \(\widetilde{M}\) satisfies \((\widetilde{\varepsilon},p\delta)\)-differential privacy (for addition and removal) for \[\widetilde{\varepsilon} = \max\{-\log(1-p), \log(1+p(\exp(\varepsilon)-1))\}.\]

Proof Sketch. Let \(x’ \subseteq x\) be inputs differing by the addition or removal of one element – i.e. \(|x \setminus x’|=1\). And let \(T \subseteq \mathcal{Y}\) be measurable. We have to prove two directions.

First, the removal direction: \[\mathbb{P}[\widetilde{M}(x’) \in T] \le \frac{1}{1-p} \mathbb{P}[\widetilde{M}(x) \in T] \le e^{\widetilde{\varepsilon}} \mathbb{P}[\widetilde{M}(x) \in T]\] follows from Theorem 5 and the fact that \(\widetilde{M}\) is a postprocessing of \(S_p\).

Second, the addition direction \[\mathbb{P}[\widetilde{M}(x) \in T] \le (1-p + pe^\varepsilon) \mathbb{P}[\widetilde{M}(x’) \in T] + p\delta\] follows from the addition-only differential privacy of the base algorithm \(M\) and privacy amplification by subsampling. ∎

This is an interesting result – it provides a new way to construct differentially private algorithms. In particular, combining Theorem 6 with Proposition 3 shows that subsampling and adding exponential noise satisfies differential privacy. To be precise, if \(q : \mathcal{X}^* \to \mathbb{R}\) is monotone and sensitivity 1, then, for \(p=1-e^{-\varepsilon}\), \(\lambda = \frac{1}{\log\left(1 + e^\varepsilon\right)},\) and \(\xi\) a standard exponential random variable, \[\widetilde{M}(x) = q(S_p(x)) + \lambda \xi \tag{3}\] defines a \((\varepsilon,0)\)-differentially private algorithm (for addition and removal).1

We can also view this as a negative result. Propositions 3 and 4 showed that we can reduce the variance of count queries by a factor of two by relaxing from standard two-sided differential privacy to one-sided differential privacy. How much more gain could we get? Theorem 6 shows that we cannot improve accuracy by too much via a relaxed one-sided privacy definition, since any algorithm satisfying the relaxed addition-only definition can be converted into one that satisfies the two-sided definition.

Conclusion

In this post we explored asymmetric one-sided differential privacy. We proved a few results, but the real takeaways are the insights we made along the way! By separating the two sides of the standard two-sided differential privacy guarantee, we gain a deeper understanding of what is going on in the definition.

We saw both at the intuitive and formal level that addition-only differential privacy (Definition 1) is a more meaningful guarantee than removal-only differential privacy (Definition 2). Indeed, the removal side can be tacked on relatively easily via subsampling (Theorem 6).

Should we be using one-sided differential privacy guarantees in practice? We definitely shouldn’t rely on removal-only differential privacy. But addition-only differential privacy seems meaningful, so perhaps its worthwhile if it lets us improve constants. However, in my humble opinion, the added complexity of one-sided differential privacy is probably not worthwhile. It’s more of a curiosity, or perhaps an intermediate analytical tool.2

Finally, we note that there are a couple of papers that have explored one-sided differential privacy:

  • Ghazi, Kamath, Kumar, Manurangsi, and Sealfon [GKKMS24] used addition-only differential privacy as an intermediate analytical tool; in particular, they proved Theorem 6 (and used it).
  • Kotsogiannis, Doudalis, Haney, Machanavajjhala, and Mehrotra [KDHMM20] define one-sided differential privacy differently than we do. Roughly, they define one-sided differential privacy in terms of replacement (rather than addition/removal); to make this asymmetric they classify records as either sensitive or non-sensitive and they allow replacement of a sensitive record with a non-sensitive record. They analyzed one-sided Laplace noise in this setting (cf. Proposition 3).
  • Takagi, Kato, Cao, and Yoshikawa [TCY22] define asymmetric differential privacy in terms of replacement similarly to Kotsogiannis et al. and also give an analysis of one-sided Laplace noise.
  • Chen, Hu, Zhuang, Zhao, and Yu [CHZZY25] define one-sided personalized differential privacy using the same replacement formalism as Kotsogiannis et al., but with an added twist where each person has their own privacy parameter \(\varepsilon\).

  1. There exist settings where the algorithm in Equation 3 achieves lower variance than the Laplace mechanism; working out this parameter regime is left as an exercise for the reader. 

  2. Some speculative examples where addition-only differential privacy might be a useful tool: There are some situations – e.g., the shuffle model – where it is easier to add noise than to subtract noise. In such cases addition-only differential privacy may be a useful tool for designing and analyzing algorithms. Another example is when using differential privacy to prove generalization guarantees. 

By

TCS for All Spotlight Workshop at STOC 2026: Conference registration grants and call for speaker nominations

from CS Theory Events

May 15-25, 2026 STOC 2026/TCS4All events sigact.org/tcsforall/ Submission deadline: May 24, 2026 This year, TCS for All *Registration Scholarships* will cover the early registration cost at STOC 2026. The scholarships are intended for researchers at the beginning of their careers and they are being made available for all students in TCS. This scholarship is open … Continue reading TCS for All Spotlight Workshop at STOC 2026: Conference registration grants and call for speaker nominations

By shacharlovett

May 15-25, 2026 STOC 2026/TCS4All events https://sigact.org/tcsforall/ Submission deadline: May 24, 2026 This year, TCS for All *Registration Scholarships* will cover the early registration cost at STOC 2026. The scholarships are intended for researchers at the beginning of their careers and they are being made available for all students in TCS. This scholarship is open … Continue reading TCS for All Spotlight Workshop at STOC 2026: Conference registration grants and call for speaker nominations

By shacharlovett

TR26-076 | Polynomial Identity Testing for Read-4 Arithmetic Formulas | Nimrod Kaplan, Amir Shpilka

from ECCC Papers

We present the first algorithms for polynomial identity testing (PIT) of read-$4$ arithmetic formulas in the non-multilinear setting. Specifically, we give a polynomial-time PIT algorithm in the whitebox model and a quasi-polynomial-time algorithm in the blackbox model. Since our techniques are based on proving hardness of representation results, we extend our algorithms to orbits of read-$4$ formulas under the action of the affine linear group. Prior to our work, no subexponential white- or blackbox algorithms were known for this class of formulas. All our results hold over any field $\mathbb{F}$ with $\text{char}(\mathbb{F}) = 0$ or $\text{char}(\mathbb{F}) \ge 5$. Prior work addressed only restricted cases. Anderson, van Melkebeek, and Volkovich (Computational Complexity, 2015) studied multilinear read-$k$ formulas, giving a polynomial-time whitebox PIT algorithm and a quasi-polynomial-time blackbox algorithm. Without the multilinearity restriction, Mahajan, Rao, and Sreenivasaiah (TCS, 2014) gave polynomial-time whitebox algorithms for read-$2$ and read-$3$ formulas, Prakriya (Doctoral Thesis, 2019) obtained quasi-polynomial-time blackbox PIT algorithm for both read-$2$ and read-$3$ formulas. Independently, Shamir (Master's Thesis, 2022) obtained a quasi-polynomial-time blackbox PIT algorithm for read-$2$ formulas. For bounded-depth read-$k$ formulas, Agrawal, Saha, Saptharishi, and Saxena (SICOMP, 2016) obtained a polynomial-time blackbox algorithm in the non-multilinear case. The running time of their algorithm is $n^{k^{2^\Delta}}$ for read-$k$, depth-$\Delta$ formulas, and hence it is applicable only to constant depth. Partial derivatives are a central tool in the study of deterministic PIT for bounded-read formulas. However, for non-multilinear R$k$Fs, differentiation may increase the number of reads. To address this, we develop new structural results that ensure ``nice behavior'' of derivatives. Specifically, we introduce a new Fragmentation Lemma that reduces the PIT problem for general R$k$Fs to simpler models via differentiation. In addition, we define the notion of dominating degree patterns and show that, in certain cases, taking partial derivatives with respect to these patterns preserves the read count.

We present the first algorithms for polynomial identity testing (PIT) of read-$4$ arithmetic formulas in the non-multilinear setting. Specifically, we give a polynomial-time PIT algorithm in the whitebox model and a quasi-polynomial-time algorithm in the blackbox model. Since our techniques are based on proving hardness of representation results, we extend our algorithms to orbits of read-$4$ formulas under the action of the affine linear group. Prior to our work, no subexponential white- or blackbox algorithms were known for this class of formulas. All our results hold over any field $\mathbb{F}$ with $\text{char}(\mathbb{F}) = 0$ or $\text{char}(\mathbb{F}) \ge 5$. Prior work addressed only restricted cases. Anderson, van Melkebeek, and Volkovich (Computational Complexity, 2015) studied multilinear read-$k$ formulas, giving a polynomial-time whitebox PIT algorithm and a quasi-polynomial-time blackbox algorithm. Without the multilinearity restriction, Mahajan, Rao, and Sreenivasaiah (TCS, 2014) gave polynomial-time whitebox algorithms for read-$2$ and read-$3$ formulas, Prakriya (Doctoral Thesis, 2019) obtained quasi-polynomial-time blackbox PIT algorithm for both read-$2$ and read-$3$ formulas. Independently, Shamir (Master's Thesis, 2022) obtained a quasi-polynomial-time blackbox PIT algorithm for read-$2$ formulas. For bounded-depth read-$k$ formulas, Agrawal, Saha, Saptharishi, and Saxena (SICOMP, 2016) obtained a polynomial-time blackbox algorithm in the non-multilinear case. The running time of their algorithm is $n^{k^{2^\Delta}}$ for read-$k$, depth-$\Delta$ formulas, and hence it is applicable only to constant depth. Partial derivatives are a central tool in the study of deterministic PIT for bounded-read formulas. However, for non-multilinear R$k$Fs, differentiation may increase the number of reads. To address this, we develop new structural results that ensure ``nice behavior'' of derivatives. Specifically, we introduce a new Fragmentation Lemma that reduces the PIT problem for general R$k$Fs to simpler models via differentiation. In addition, we define the notion of dominating degree patterns and show that, in certain cases, taking partial derivatives with respect to these patterns preserves the read count.

Extensive long-range magic in non-Abelian topological orders

from arXiv: Computational Complexity

Authors: Yuzhen Zhang, Isaac H. Kim, Yimu Bao, Sagar Vijay

We show that the low-energy states of non-Abelian topological orders possess extensive magic which is long-ranged, and cannot be eliminated by a constant-depth local unitary circuit. This refines conventional notions of complexity beyond the linear circuit depth which is required to prepare any topological phase, and provides a new resource-theoretic characterization of topological orders. A central technical result is a no-go theorem establishing that stabilizer states--even up to constant-depth local unitarie--cannot approximate low-energy states of non-Abelian string-net models which satisfy the entanglement bootstrap axioms. Moreover, we show that stabilizer-realizable Abelian string-net phases have mutual braiding phases quantized by the on-site qudit dimension, and that any violation of this condition necessarily implies extensive long-range magic. Extending to higher spatial dimensions, we argue that any state obeying an entanglement area law and hosting excitations with nontrivial fusion spaces must exhibit extensive long-range magic. This applies, in particular, to ground-states and low-energy states of higher-dimensional quantum double models.

Authors: Yuzhen Zhang, Isaac H. Kim, Yimu Bao, Sagar Vijay

We show that the low-energy states of non-Abelian topological orders possess extensive magic which is long-ranged, and cannot be eliminated by a constant-depth local unitary circuit. This refines conventional notions of complexity beyond the linear circuit depth which is required to prepare any topological phase, and provides a new resource-theoretic characterization of topological orders. A central technical result is a no-go theorem establishing that stabilizer states--even up to constant-depth local unitarie--cannot approximate low-energy states of non-Abelian string-net models which satisfy the entanglement bootstrap axioms. Moreover, we show that stabilizer-realizable Abelian string-net phases have mutual braiding phases quantized by the on-site qudit dimension, and that any violation of this condition necessarily implies extensive long-range magic. Extending to higher spatial dimensions, we argue that any state obeying an entanglement area law and hosting excitations with nontrivial fusion spaces must exhibit extensive long-range magic. This applies, in particular, to ground-states and low-energy states of higher-dimensional quantum double models.

On the Limits of PAC Learning of Networks from Opinion Dynamics

from arXiv: Computational Complexity

Authors: Dmitry Chistikov, Luisa Estrada, Mike Paterson, Paolo Turrini

Agents in social networks with threshold-based dynamics change opinions when influenced by sufficiently many peers. Existing literature typically assumes that the network structure and dynamics are fully known, which is often unrealistic. In this work, we ask how to learn a network structure from samples of the agents' synchronous opinion updates. Firstly, if the opinion dynamics follow a threshold rule in which a fixed number of influencers prevent opinion change (e.g., unanimity and quasi-unanimity), we provide an efficient PAC learning algorithm provided that the number of influencers per agent is bounded. Secondly, under standard computational complexity assumptions, we prove that if agents' opinions follow the majority of their influencers, then there is no efficient PAC learning algorithm. We propose a polynomial-time heuristic that successfully learns consistent networks in over $98\%$ of our simulations on random graphs, with no failures for some specified conditions on the numbers of agents and opinion diffusion examples.

Authors: Dmitry Chistikov, Luisa Estrada, Mike Paterson, Paolo Turrini

Agents in social networks with threshold-based dynamics change opinions when influenced by sufficiently many peers. Existing literature typically assumes that the network structure and dynamics are fully known, which is often unrealistic. In this work, we ask how to learn a network structure from samples of the agents' synchronous opinion updates. Firstly, if the opinion dynamics follow a threshold rule in which a fixed number of influencers prevent opinion change (e.g., unanimity and quasi-unanimity), we provide an efficient PAC learning algorithm provided that the number of influencers per agent is bounded. Secondly, under standard computational complexity assumptions, we prove that if agents' opinions follow the majority of their influencers, then there is no efficient PAC learning algorithm. We propose a polynomial-time heuristic that successfully learns consistent networks in over $98\%$ of our simulations on random graphs, with no failures for some specified conditions on the numbers of agents and opinion diffusion examples.

Extending CDCL to disjunctions of parity equations

from arXiv: Computational Complexity

Authors: Paul Beame, Glenn Sun

Because CDCL produces proofs in the Resolution proof system, problems provably hard for Resolution are also provably hard for CDCL. Exponentially shorter proofs can sometimes be found using stronger proof systems such as $\text{Res}(\oplus)$, a generalization of Resolution to XNF formulas, whose constraints are disjunctions of parity equations ("linear clauses") such as $(x \oplus y) \lor \lnot (y \oplus z)$. While some modern solvers like CryptoMiniSAT reason on Boolean clauses with separate parity equations, reasoning about more general linear clauses is less explored. We present $\text{CDCL}(\oplus)$, a generalization of CDCL to XNF formulas, and prove a bidirectional connection with $\text{Res}(\oplus)$: $\text{CDCL}(\oplus)$ not only produces $\text{Res}(\oplus)$ proofs, but also polynomially simulates $\text{Res}(\oplus)$ given nondeterministic decisions and restarts, mirroring the classical relationship between CDCL and Resolution. Our key technical tool is a new set of inference rules for $\text{Res}(\oplus)$ that helps us translate Resolution-based subroutines such as 1-UIP clause learning. Altogether, $\text{CDCL}(\oplus)$'s parity reasoning includes branching on arbitrary parity equations, linear-algebraic reasoning during unit propagation, and learning linear clauses through conflict analysis. We provide a proof-of-concept implementation of $\text{CDCL}(\oplus)$ called Xorcle, which includes adaptations of existing CDCL heuristics to XNF formulas and an extension of LRUP proof logging that we call $\text{LRUP}(\oplus)$. On a selected suite of benchmarks focusing on native XNF formulas, Xorcle outperforms existing solvers such as Kissat and CryptoMiniSAT. Additionally, on Tseitin formulas written in CNF, even without preprocessing, Xorcle's running time appears to scale nearly polynomially.

Authors: Paul Beame, Glenn Sun

Because CDCL produces proofs in the Resolution proof system, problems provably hard for Resolution are also provably hard for CDCL. Exponentially shorter proofs can sometimes be found using stronger proof systems such as $\text{Res}(\oplus)$, a generalization of Resolution to XNF formulas, whose constraints are disjunctions of parity equations ("linear clauses") such as $(x \oplus y) \lor \lnot (y \oplus z)$. While some modern solvers like CryptoMiniSAT reason on Boolean clauses with separate parity equations, reasoning about more general linear clauses is less explored. We present $\text{CDCL}(\oplus)$, a generalization of CDCL to XNF formulas, and prove a bidirectional connection with $\text{Res}(\oplus)$: $\text{CDCL}(\oplus)$ not only produces $\text{Res}(\oplus)$ proofs, but also polynomially simulates $\text{Res}(\oplus)$ given nondeterministic decisions and restarts, mirroring the classical relationship between CDCL and Resolution. Our key technical tool is a new set of inference rules for $\text{Res}(\oplus)$ that helps us translate Resolution-based subroutines such as 1-UIP clause learning. Altogether, $\text{CDCL}(\oplus)$'s parity reasoning includes branching on arbitrary parity equations, linear-algebraic reasoning during unit propagation, and learning linear clauses through conflict analysis. We provide a proof-of-concept implementation of $\text{CDCL}(\oplus)$ called Xorcle, which includes adaptations of existing CDCL heuristics to XNF formulas and an extension of LRUP proof logging that we call $\text{LRUP}(\oplus)$. On a selected suite of benchmarks focusing on native XNF formulas, Xorcle outperforms existing solvers such as Kissat and CryptoMiniSAT. Additionally, on Tseitin formulas written in CNF, even without preprocessing, Xorcle's running time appears to scale nearly polynomially.

The Complexity of Nested Reset Counter Systems

from arXiv: Computational Complexity

Authors: A. R. Balasubramanian, Franzisco Schmidt

Nested counter systems (NCS) are a generalization of counter systems to higher-order counters. Here, a higher-order counter is allowed to have other (lower-order) counters as elements, instead of just a number. Such systems can be viewed as working on trees, where the height of the tree naturally corresponds to the highest order counter that the system is working with. It is known that the coverability problem for NCS, which asks if a given final tree can be covered from a given initial tree, is $\mathbf{F}_{ε_0}$-complete. Here $\mathbf{F}_{ε_0}$ is a class in the fast-growing hierarchy of complexity classes. In this paper, we consider an extension of NCS called nested reset counter systems (NRCS) that extends NCS with resets. We show that coverability for NRCS over order-$k$ counters is $\mathbf{F}_{Ω_k}$-complete where $Ω_k$ is the tower of height $k$ of the $ω$ ordinal. This gives the first natural hierarchy of complete problems for all of these classes. Furthermore, to prove our upper bounds, we also develop length function theorems for any fixed amount of applications of the multiset operation on finite sets. As an application of our results, we improve existing upper bounds for various problems from XML processing, graph transformation systems, $π$-calculus, logic and parameterized verification. Furthermore, using our completeness results for $k$-NRCS, we also prove $\mathbf{F}_{Ω_k}$-completeness of the considered problems from the realms of parameterized verification and logic, for all $k$.

Authors: A. R. Balasubramanian, Franzisco Schmidt

Nested counter systems (NCS) are a generalization of counter systems to higher-order counters. Here, a higher-order counter is allowed to have other (lower-order) counters as elements, instead of just a number. Such systems can be viewed as working on trees, where the height of the tree naturally corresponds to the highest order counter that the system is working with. It is known that the coverability problem for NCS, which asks if a given final tree can be covered from a given initial tree, is $\mathbf{F}_{ε_0}$-complete. Here $\mathbf{F}_{ε_0}$ is a class in the fast-growing hierarchy of complexity classes. In this paper, we consider an extension of NCS called nested reset counter systems (NRCS) that extends NCS with resets. We show that coverability for NRCS over order-$k$ counters is $\mathbf{F}_{Ω_k}$-complete where $Ω_k$ is the tower of height $k$ of the $ω$ ordinal. This gives the first natural hierarchy of complete problems for all of these classes. Furthermore, to prove our upper bounds, we also develop length function theorems for any fixed amount of applications of the multiset operation on finite sets. As an application of our results, we improve existing upper bounds for various problems from XML processing, graph transformation systems, $π$-calculus, logic and parameterized verification. Furthermore, using our completeness results for $k$-NRCS, we also prove $\mathbf{F}_{Ω_k}$-completeness of the considered problems from the realms of parameterized verification and logic, for all $k$.

Enhanced and Efficient Reasoning in Large Learning Models

from arXiv: Computational Complexity

Authors: Leslie G. Valiant

In current Large Language Models we can trust the production of smoothly flowing prose on the basis of the principles of machine learning. However, there is no comparably principled basis to justify trust in the content of the text produced. It appears to be conventional wisdom that addressing this issue by adding more principled reasoning is not computationally affordable. Here we propose a principled method of reasoning that is efficient enough to be practical for large language models. Further, the method allows the retention of much of the currently used software and hardware base. Our method for improving the functioning of large language models consists of a first stage of preprocessing that recodes the data to a Unary Relational Integracode that is more explicit about the relationships among the objects described in the text, followed as a second stage by a standard but possibly streamlined machine learning process that then also learns to predict these relationships. The method may be viewed as realizing a world model and applying beyond natural language, to vision and actions, for example, where the multiple properties of an object referred to in an input are brought together explicitly, rather than remaining distributed in the various references to it in the input. We articulate its advantages in terms of Robust Logic, a system for performing principled chaining on learned, and hence uncertain, information. We show that this recoding has the surprising and fortuitous property that, while succinct, it makes the task of learning a core subset of relational rules that hold in the world described in the training data polynomial time learnable in a defined sense, the polynomial depending on the complexity of the rule. This gives support for sound reasoning within each single call of the learned classifier as well as between multiple calls.

Authors: Leslie G. Valiant

In current Large Language Models we can trust the production of smoothly flowing prose on the basis of the principles of machine learning. However, there is no comparably principled basis to justify trust in the content of the text produced. It appears to be conventional wisdom that addressing this issue by adding more principled reasoning is not computationally affordable. Here we propose a principled method of reasoning that is efficient enough to be practical for large language models. Further, the method allows the retention of much of the currently used software and hardware base. Our method for improving the functioning of large language models consists of a first stage of preprocessing that recodes the data to a Unary Relational Integracode that is more explicit about the relationships among the objects described in the text, followed as a second stage by a standard but possibly streamlined machine learning process that then also learns to predict these relationships. The method may be viewed as realizing a world model and applying beyond natural language, to vision and actions, for example, where the multiple properties of an object referred to in an input are brought together explicitly, rather than remaining distributed in the various references to it in the input. We articulate its advantages in terms of Robust Logic, a system for performing principled chaining on learned, and hence uncertain, information. We show that this recoding has the surprising and fortuitous property that, while succinct, it makes the task of learning a core subset of relational rules that hold in the world described in the training data polynomial time learnable in a defined sense, the polynomial depending on the complexity of the rule. This gives support for sound reasoning within each single call of the learned classifier as well as between multiple calls.

Meschers: Geometry Processing of Impossible Objects

from arXiv: Computational Geometry

Authors: Ana Dodik, Isabella Yu, Kartik Chandra, Jonathan Ragan-Kelley, Joshua Tenenbaum, Vincent Sitzmann, Justin Solomon

Impossible objects, geometric constructions that humans can perceive but that cannot exist in real life, have been a topic of intrigue in visual arts, perception, and graphics, yet no satisfying computer representation of such objects exists. Previous work embeds impossible objects in 3D, cutting them or twisting/bending them in the depth axis. Cutting an impossible object changes its local geometry at the cut, which can hamper downstream graphics applications, such as smoothing, while bending makes it difficult to relight the object. Both of these can invalidate geometry operations, such as distance computation. As an alternative, we introduce Meschers, meshes capable of representing impossible constructions akin to those found in M.C. Escher's woodcuts. Our representation has a theoretical foundation in discrete exterior calculus and supports the use-cases above, as we demonstrate in a number of example applications. Moreover, because we can do discrete geometry processing on our representation, we can inverse-render impossible objects. We also compare our representation to cut and bend representations of impossible objects.

Authors: Ana Dodik, Isabella Yu, Kartik Chandra, Jonathan Ragan-Kelley, Joshua Tenenbaum, Vincent Sitzmann, Justin Solomon

Impossible objects, geometric constructions that humans can perceive but that cannot exist in real life, have been a topic of intrigue in visual arts, perception, and graphics, yet no satisfying computer representation of such objects exists. Previous work embeds impossible objects in 3D, cutting them or twisting/bending them in the depth axis. Cutting an impossible object changes its local geometry at the cut, which can hamper downstream graphics applications, such as smoothing, while bending makes it difficult to relight the object. Both of these can invalidate geometry operations, such as distance computation. As an alternative, we introduce Meschers, meshes capable of representing impossible constructions akin to those found in M.C. Escher's woodcuts. Our representation has a theoretical foundation in discrete exterior calculus and supports the use-cases above, as we demonstrate in a number of example applications. Moreover, because we can do discrete geometry processing on our representation, we can inverse-render impossible objects. We also compare our representation to cut and bend representations of impossible objects.

Fast and Robust Mesh Simplification for Generated and Real-World 3D Assets

from arXiv: Computational Geometry

Authors: Kunal Bhosikar, Preet Savalia, Lokender Tiwari, Brojeshwar Bhowmick

The rapid growth of 3D content from modern reconstruction and generative pipelines, such as neural rendering and large-scale 3D asset generation, has led to an abundance of dense, noisy, and often non-manifold meshes. While these representations achieve high visual fidelity, their complexity poses significant challenges for downstream applications in simulation, AR/VR, and scientific computing, where efficient and reliable geometry is essential. This necessitates mesh simplification methods that are not only fast and robust to "in-the-wild" inputs, but also capable of preserving fine geometric structures and high-quality appearance. In this paper, we propose Feature-Aware Quadric Error Metric (FA-QEM), a comprehensive mesh simplification pipeline designed for modern 3D assets. Our approach introduces a novel multi-term quadric error formulation that jointly encodes geometric deviation, boundary curvature, and surface normal consistency, enabling optimal vertex placement that preserves sharp features even under aggressive simplification. Furthermore, we show that high-fidelity geometric simplification significantly improves downstream appearance transfer, serving as a superior front-end for texture mapping via successive mapping techniques. We conduct extensive evaluations on both AI-generated meshes and large-scale real-world datasets, including Thingi10K and the Real-World Textured Things dataset. Our results demonstrate that FA-QEM achieves consistently lower geometric error, better visual fidelity, and substantially faster runtimes compared to existing methods, while maintaining robustness across diverse and challenging inputs. These properties make FA-QEM a practical and effective component for scalable 3D reconstruction and generation pipelines.

Authors: Kunal Bhosikar, Preet Savalia, Lokender Tiwari, Brojeshwar Bhowmick

The rapid growth of 3D content from modern reconstruction and generative pipelines, such as neural rendering and large-scale 3D asset generation, has led to an abundance of dense, noisy, and often non-manifold meshes. While these representations achieve high visual fidelity, their complexity poses significant challenges for downstream applications in simulation, AR/VR, and scientific computing, where efficient and reliable geometry is essential. This necessitates mesh simplification methods that are not only fast and robust to "in-the-wild" inputs, but also capable of preserving fine geometric structures and high-quality appearance. In this paper, we propose Feature-Aware Quadric Error Metric (FA-QEM), a comprehensive mesh simplification pipeline designed for modern 3D assets. Our approach introduces a novel multi-term quadric error formulation that jointly encodes geometric deviation, boundary curvature, and surface normal consistency, enabling optimal vertex placement that preserves sharp features even under aggressive simplification. Furthermore, we show that high-fidelity geometric simplification significantly improves downstream appearance transfer, serving as a superior front-end for texture mapping via successive mapping techniques. We conduct extensive evaluations on both AI-generated meshes and large-scale real-world datasets, including Thingi10K and the Real-World Textured Things dataset. Our results demonstrate that FA-QEM achieves consistently lower geometric error, better visual fidelity, and substantially faster runtimes compared to existing methods, while maintaining robustness across diverse and challenging inputs. These properties make FA-QEM a practical and effective component for scalable 3D reconstruction and generation pipelines.

Fast Leaf-to-Ancestor Minimum Query in the Oracle Model

from arXiv: Data Structures and Algorithms

Authors: Aleksey Upirvitskiy, Aleksandr Levin

We study leaf-to-ancestor path-minimum queries on a rooted, weighted tree in the oracle model, where the only allowed value operation is a comparison oracle on edge (or node) weights. We give a static data structure that, after O(n log h) preprocessing time, space, and oracle calls (where $n$ is the number of nodes and $h$ is the tree height), answers any leaf-to-ancestor query in O(1) worst-case time with zero oracle calls at query time. The method combines (I) an edge-to-node weight conversion with a deterministic tie-break to obtain a total order; (II) ladder (longest-path) decomposition; (III) binary lifting; and (IV) sparse-table RMQ built over ladder arrays, storing indices selected via the oracle during preprocessing. We also show that the preprocessing oracle-comparison bound is tight in the deterministic comparison model.

Authors: Aleksey Upirvitskiy, Aleksandr Levin

We study leaf-to-ancestor path-minimum queries on a rooted, weighted tree in the oracle model, where the only allowed value operation is a comparison oracle on edge (or node) weights. We give a static data structure that, after O(n log h) preprocessing time, space, and oracle calls (where $n$ is the number of nodes and $h$ is the tree height), answers any leaf-to-ancestor query in O(1) worst-case time with zero oracle calls at query time. The method combines (I) an edge-to-node weight conversion with a deterministic tie-break to obtain a total order; (II) ladder (longest-path) decomposition; (III) binary lifting; and (IV) sparse-table RMQ built over ladder arrays, storing indices selected via the oracle during preprocessing. We also show that the preprocessing oracle-comparison bound is tight in the deterministic comparison model.

Non-Redundancy of Low-Arity Symmetric Boolean CSPs

from arXiv: Data Structures and Algorithms

Authors: Amatya Sharma, Santhoshini Velusamy

Non-redundancy, introduced by Bessiere, Carbonnel, and Katsirelos (AAAI 2020), is a structural parameter for Constraint Satisfaction Problems ($\mathsf{CSPs}$) that governs kernelization, exact and approximate sparsification, and exact streaming complexity. It is the largest size of a $\mathsf{CSP}$ instance admitting no smaller subinstance with the same satisfying assignments. We study non-redundancy $\mathsf{NRD}_n(R)$ for Boolean symmetric $\mathsf{CSPs}$ defined by an $r$-ary relation $R$ whose value depends only on Hamming weight. An instance of $\mathsf{CSP}(R)$ has $n$ variables and constraints given by $r$-tuples; a constraint is satisfied exactly when the induced tuple lies in $R$. This class includes natural predicates such as cuts and $k$-SAT clauses. Our main result is a near-complete classification of the asymptotic growth of $\mathsf{NRD}_n(R)$ for symmetric Boolean predicates of arity at most $5$. Using computational experiments and algebraic upper- and lower-bound criteria, we resolve every predicate of arity at most $4$ and all but two predicates of arity $5$. For upper bounds, we introduce $t$-balancedness, a lifted, higher-degree version of the balancedness notion of Chen, Jansen, and Pieterse (Algorithmica 2020). We prove that $t$-balancedness is equivalent to the existence of degree-$t$ multilinear polynomials capturing $R$, and hence implies $\mathsf{NRD}_n(R)=O(n^t)$. For lower bounds, we use Carbonnel's (CP 2022) framework: predicates admitting a special reduction from $k$-ary OR inherit OR's lower bound $Ω(n^k)$. The only unresolved arity-$5$ predicates in our framework have bounds $Ω(n^2)$ and $O(n^3)$; we reduce their exact classification to natural extremal set-system questions.

Authors: Amatya Sharma, Santhoshini Velusamy

Non-redundancy, introduced by Bessiere, Carbonnel, and Katsirelos (AAAI 2020), is a structural parameter for Constraint Satisfaction Problems ($\mathsf{CSPs}$) that governs kernelization, exact and approximate sparsification, and exact streaming complexity. It is the largest size of a $\mathsf{CSP}$ instance admitting no smaller subinstance with the same satisfying assignments. We study non-redundancy $\mathsf{NRD}_n(R)$ for Boolean symmetric $\mathsf{CSPs}$ defined by an $r$-ary relation $R$ whose value depends only on Hamming weight. An instance of $\mathsf{CSP}(R)$ has $n$ variables and constraints given by $r$-tuples; a constraint is satisfied exactly when the induced tuple lies in $R$. This class includes natural predicates such as cuts and $k$-SAT clauses. Our main result is a near-complete classification of the asymptotic growth of $\mathsf{NRD}_n(R)$ for symmetric Boolean predicates of arity at most $5$. Using computational experiments and algebraic upper- and lower-bound criteria, we resolve every predicate of arity at most $4$ and all but two predicates of arity $5$. For upper bounds, we introduce $t$-balancedness, a lifted, higher-degree version of the balancedness notion of Chen, Jansen, and Pieterse (Algorithmica 2020). We prove that $t$-balancedness is equivalent to the existence of degree-$t$ multilinear polynomials capturing $R$, and hence implies $\mathsf{NRD}_n(R)=O(n^t)$. For lower bounds, we use Carbonnel's (CP 2022) framework: predicates admitting a special reduction from $k$-ary OR inherit OR's lower bound $Ω(n^k)$. The only unresolved arity-$5$ predicates in our framework have bounds $Ω(n^2)$ and $O(n^3)$; we reduce their exact classification to natural extremal set-system questions.

Min-1-Planarity is NP-Hard

from arXiv: Data Structures and Algorithms

Authors: Yuto Okada

In this paper, we show that it is NP-hard to determine whether a given graph admits a min-1-planar drawing. A drawing of a graph is min-$k$-planar if, for every crossing in the drawing, at least one of the two crossing edges involves at most $k$ crossings. This notion of min-$k$-planarity was introduced by Binucci, Büngener, Di Battista, Didimo, Dujmović, Hong, Kaufmann, Liotta, Morin, and Tappini [GD 2023; JGAA, 2024] as a generalization of $k$-planarity.

Authors: Yuto Okada

In this paper, we show that it is NP-hard to determine whether a given graph admits a min-1-planar drawing. A drawing of a graph is min-$k$-planar if, for every crossing in the drawing, at least one of the two crossing edges involves at most $k$ crossings. This notion of min-$k$-planarity was introduced by Binucci, Büngener, Di Battista, Didimo, Dujmović, Hong, Kaufmann, Liotta, Morin, and Tappini [GD 2023; JGAA, 2024] as a generalization of $k$-planarity.

Hitting Axis-Parallel Segments with Weighted Points

from arXiv: Data Structures and Algorithms

Authors: Rajiv Raman, Siddhartha Sarkar, Jatin Yadav

We study a geometric hitting-set problem in which the input consists of a set $P$ of weighted points and a family $S=H\cup V$ of axis-parallel segments in the plane. The goal is to select a minimum-weight subset of $P$ that hits every segment in $S$. Even restricted geometric hitting-set problems are known to be computationally hard, and for axis-parallel segments the standard decomposition into horizontal and vertical sub-instances yields only a simple factor-$2$ approximation. We present an LP-rounding algorithm that breaks the factor-2 barrier. For the weighted problem, we obtain a randomized $(1+2/e)$-approximation by combining systematic rounding on horizontal lines with an exact repair step on residual vertical sub-instances. In the unweighted case, a sharper analysis gives a $(1+1/(e-1))$-approximation. Finally, we consider the case where one of the sub-instances consists of lines instead of line segments, a problem considered by Fekete et al. (Geometric Hitting Set for Segments of Few Orientations, Theor. Comp. Sys., 62 (2) 2018),. In this case, we improve their result to obtain an approximation factor of $1+1/e$ and show that the problem is APX-hard. We also present algorithms for the generalization to $d$ orientations, as well as PTASes for bounded-complexity subclasses of the unweighted Hitting Set problem.

Authors: Rajiv Raman, Siddhartha Sarkar, Jatin Yadav

We study a geometric hitting-set problem in which the input consists of a set $P$ of weighted points and a family $S=H\cup V$ of axis-parallel segments in the plane. The goal is to select a minimum-weight subset of $P$ that hits every segment in $S$. Even restricted geometric hitting-set problems are known to be computationally hard, and for axis-parallel segments the standard decomposition into horizontal and vertical sub-instances yields only a simple factor-$2$ approximation. We present an LP-rounding algorithm that breaks the factor-2 barrier. For the weighted problem, we obtain a randomized $(1+2/e)$-approximation by combining systematic rounding on horizontal lines with an exact repair step on residual vertical sub-instances. In the unweighted case, a sharper analysis gives a $(1+1/(e-1))$-approximation. Finally, we consider the case where one of the sub-instances consists of lines instead of line segments, a problem considered by Fekete et al. (Geometric Hitting Set for Segments of Few Orientations, Theor. Comp. Sys., 62 (2) 2018),. In this case, we improve their result to obtain an approximation factor of $1+1/e$ and show that the problem is APX-hard. We also present algorithms for the generalization to $d$ orientations, as well as PTASes for bounded-complexity subclasses of the unweighted Hitting Set problem.

Hybrid Sketching Methods for Dynamic Connectivity on Sparse Graphs

from arXiv: Data Structures and Algorithms

Authors: Quinten De Man, Gilvir Gill, Michael A. Bender, Laxman Dhulipala, David Tench

Dynamic connectivity is a fundamental dynamic graph problem, and recent algorithmic breakthroughs on dynamic graph sketching have reshaped what is theoretically possible: by encoding the graph as per-vertex linear sketches, these algorithms solve dynamic connectivity in only $Θ(V \log^2 V)$ space, independent of the number of edges,outperforming lossless $Θ(V+E)$-space structures that grow as the graph becomes denser. Prior to this work, no practical dynamic connectivity algorithm has been able to translate these theoretical breakthroughs into space savings on real-world graphs. The main obstacle is that per-vertex sketches cost thousands of bytes per vertex, so sketching only pays off once the graph becomes extremely dense. We observe that sparse real-world graphs are often not uniformly sparse, these graphs can contain dense cores on a small subset of vertices that account for a large fraction of edges. We exploit this structure via hybrid sketching: sketch only the dense core, and store the sparse periphery losslessly. We design new hybrid algorithms for fully-dynamic and semi-streaming connectivity with space $O(\min\{V+E, V \log V \log(2+E/V)\})$ w.h.p., simultaneously matching the lossless bound on sparse graphs, the sketching bound on dense graphs, and improving on both in an intermediate regime. A key component is BalloonSketch, a new l0-sampler reducing per-vertex sketch sizes by up to 8x. We implement HybridSCALE, a modular system treating the lossless and sketch-based components as subroutines. HybridSCALE is the first sketch-based dynamic connectivity system to save space on common real-world graphs. Compared to the state-of-the-art lossless baseline, HybridSCALE saves up to 15% space on sparse graphs (average degree < 100), up to 92% on intermediate density graphs (average degree ~ 100-1000), and up to 97% on dense graphs (average degree > 1000).

Authors: Quinten De Man, Gilvir Gill, Michael A. Bender, Laxman Dhulipala, David Tench

Dynamic connectivity is a fundamental dynamic graph problem, and recent algorithmic breakthroughs on dynamic graph sketching have reshaped what is theoretically possible: by encoding the graph as per-vertex linear sketches, these algorithms solve dynamic connectivity in only $Θ(V \log^2 V)$ space, independent of the number of edges,outperforming lossless $Θ(V+E)$-space structures that grow as the graph becomes denser. Prior to this work, no practical dynamic connectivity algorithm has been able to translate these theoretical breakthroughs into space savings on real-world graphs. The main obstacle is that per-vertex sketches cost thousands of bytes per vertex, so sketching only pays off once the graph becomes extremely dense. We observe that sparse real-world graphs are often not uniformly sparse, these graphs can contain dense cores on a small subset of vertices that account for a large fraction of edges. We exploit this structure via hybrid sketching: sketch only the dense core, and store the sparse periphery losslessly. We design new hybrid algorithms for fully-dynamic and semi-streaming connectivity with space $O(\min\{V+E, V \log V \log(2+E/V)\})$ w.h.p., simultaneously matching the lossless bound on sparse graphs, the sketching bound on dense graphs, and improving on both in an intermediate regime. A key component is BalloonSketch, a new l0-sampler reducing per-vertex sketch sizes by up to 8x. We implement HybridSCALE, a modular system treating the lossless and sketch-based components as subroutines. HybridSCALE is the first sketch-based dynamic connectivity system to save space on common real-world graphs. Compared to the state-of-the-art lossless baseline, HybridSCALE saves up to 15% space on sparse graphs (average degree < 100), up to 92% on intermediate density graphs (average degree ~ 100-1000), and up to 97% on dense graphs (average degree > 1000).

Sharp Bounds on the Eigenvalues of Kikuchi Graphs and Applications to Quantum Max Cut

from arXiv: Data Structures and Algorithms

Authors: Ainesh Bakshi, Arpon Basu, Pravesh Kothari, Anqi Li

We prove that the maximum eigenvalue of the (both signed and unsigned) Laplacian of level $k$ Kikuchi graph of any graph $G$ with $m$ edges is at most $m+k$. This confirms four recent conjectures of Apte, Parekh, and Sud. As applications, we obtain that tensor products of one and two qubit product states achieve an approximation ratio of $5/8$ for Quantum Max Cut and $5/7$ for the XY Hamiltonian. Moreover, combining our bounds with the algorithms analyzed by Apte, Parekh, and Sud, yields efficient algorithms achieving an approximation ratio of $0.614$ for Quantum Max Cut and $0.674$ for the XY Hamiltonian. Finally, we also make modest progress on Brouwer's conjecture and improve Lew's bound on the sum of the top-$k$ eigenvalues of a Graph Laplacian.

Authors: Ainesh Bakshi, Arpon Basu, Pravesh Kothari, Anqi Li

We prove that the maximum eigenvalue of the (both signed and unsigned) Laplacian of level $k$ Kikuchi graph of any graph $G$ with $m$ edges is at most $m+k$. This confirms four recent conjectures of Apte, Parekh, and Sud. As applications, we obtain that tensor products of one and two qubit product states achieve an approximation ratio of $5/8$ for Quantum Max Cut and $5/7$ for the XY Hamiltonian. Moreover, combining our bounds with the algorithms analyzed by Apte, Parekh, and Sud, yields efficient algorithms achieving an approximation ratio of $0.614$ for Quantum Max Cut and $0.674$ for the XY Hamiltonian. Finally, we also make modest progress on Brouwer's conjecture and improve Lew's bound on the sum of the top-$k$ eigenvalues of a Graph Laplacian.

Optimal Bounds for the k-Disjoint Paths Problem

from arXiv: Data Structures and Algorithms

Authors: Dario Cavallaro, Maximilian Gorsky, Stephan Kreutzer, Dimitrios M. Thilikos, Sebastian Wiederrecht

The Graph Minors Series of Robertson and Seymour forms the foundation of algorithmic structural graph theory, yielding fixed-parameter algorithms for problems such as Disjoint Paths, Rooted Minor Checking, and Folio. A key ingredient behind the fixed-parameter tractability of the $k$-Disjoint Paths problem is the irrelevant-vertex technique. This machinery is governed by the Vital Linkage Theorem and the so-called Linkage Function $\ell$. However, despite its foundational role, the best known bounds on the Linkage Function are enormous and are only implicitly understood. The quantitative bounds behind these results have traditionally been so large that the resulting algorithms are regarded as "galactic". Our main result is a general irrelevant-vertex theorem for a common generalisation of $k$-Disjoint Paths and Rooted Minor Checking for graphs of size at most $d,$ commonly called the $(k,d)$-Folio problem. Specifically, we show that for any graph $G$ in which the $k$ terminals are chosen from some set $R,$ if the treewidth of $G$ exceeds $β(k,b,d)\in$ $2^{{\bf poly}(b + d)}$ $\cdot {\bf poly}(k)$ then we can locate an irrelevant vertex for the $(k,d)$-Folio problem. Here, the quantity $b$ is the bidimensionality of $R,$ that is, the largest $b$ for which a $(b\times b)$-grid minor in $G$ can be rooted on $R$. Thus, the exponential component of the irrelevant-vertex threshold is driven by the bound on the bidimensionality, rather than by the number of terminals, and we argue that this dependence is essentially optimal up to polynomial factors. As a consequence, the Linkage Function satisfies $\ell(k) \in 2^{{\bf poly}(k)}$. Beyond its structural significance, our result yields improved parameter dependencies for algorithms for Disjoint Paths and Rooted Minor Checking}, and provides a quantitative improvement for a broad range of graph-minor-based algorithmic frameworks.

Authors: Dario Cavallaro, Maximilian Gorsky, Stephan Kreutzer, Dimitrios M. Thilikos, Sebastian Wiederrecht

The Graph Minors Series of Robertson and Seymour forms the foundation of algorithmic structural graph theory, yielding fixed-parameter algorithms for problems such as Disjoint Paths, Rooted Minor Checking, and Folio. A key ingredient behind the fixed-parameter tractability of the $k$-Disjoint Paths problem is the irrelevant-vertex technique. This machinery is governed by the Vital Linkage Theorem and the so-called Linkage Function $\ell$. However, despite its foundational role, the best known bounds on the Linkage Function are enormous and are only implicitly understood. The quantitative bounds behind these results have traditionally been so large that the resulting algorithms are regarded as "galactic". Our main result is a general irrelevant-vertex theorem for a common generalisation of $k$-Disjoint Paths and Rooted Minor Checking for graphs of size at most $d,$ commonly called the $(k,d)$-Folio problem. Specifically, we show that for any graph $G$ in which the $k$ terminals are chosen from some set $R,$ if the treewidth of $G$ exceeds $β(k,b,d)\in$ $2^{{\bf poly}(b + d)}$ $\cdot {\bf poly}(k)$ then we can locate an irrelevant vertex for the $(k,d)$-Folio problem. Here, the quantity $b$ is the bidimensionality of $R,$ that is, the largest $b$ for which a $(b\times b)$-grid minor in $G$ can be rooted on $R$. Thus, the exponential component of the irrelevant-vertex threshold is driven by the bound on the bidimensionality, rather than by the number of terminals, and we argue that this dependence is essentially optimal up to polynomial factors. As a consequence, the Linkage Function satisfies $\ell(k) \in 2^{{\bf poly}(k)}$. Beyond its structural significance, our result yields improved parameter dependencies for algorithms for Disjoint Paths and Rooted Minor Checking}, and provides a quantitative improvement for a broad range of graph-minor-based algorithmic frameworks.

Hardness of Burning Number Problem on Regular Graphs

from arXiv: Data Structures and Algorithms

Authors: Dhanyamol Antony, L. Sunil Chandran, Anita Das, Shirish Gosavi, Dalu Jacob, Shashanka Kulamarva

The Burning Number Problem (BNP) models the spread of information or contagion in a network through a discrete-time process on a graph. At each step, one new vertex is selected as a burning source, while fire simultaneously spreads from previously burned vertices to their neighbors. The burning number of a graph is the minimum number of steps required to burn all vertices. The decision version asks whether the burning number is at most a given integer $k$. BNP is known to be NP-complete even on restricted graph classes such as path forests. We study BNP on connected regular graphs, a natural and previously unexplored graph class. We prove that BNP is NP-complete on connected cubic graphs, and moreover APX-hard under this restriction. We further show that BNP remains APX-hard on connected $d$-regular graphs for every fixed $d \geq 4$.

Authors: Dhanyamol Antony, L. Sunil Chandran, Anita Das, Shirish Gosavi, Dalu Jacob, Shashanka Kulamarva

The Burning Number Problem (BNP) models the spread of information or contagion in a network through a discrete-time process on a graph. At each step, one new vertex is selected as a burning source, while fire simultaneously spreads from previously burned vertices to their neighbors. The burning number of a graph is the minimum number of steps required to burn all vertices. The decision version asks whether the burning number is at most a given integer $k$. BNP is known to be NP-complete even on restricted graph classes such as path forests. We study BNP on connected regular graphs, a natural and previously unexplored graph class. We prove that BNP is NP-complete on connected cubic graphs, and moreover APX-hard under this restriction. We further show that BNP remains APX-hard on connected $d$-regular graphs for every fixed $d \geq 4$.

Branch-width of represented matroids in matrix multiplication time

from arXiv: Data Structures and Algorithms

Authors: Mujin Choi, Tuukka Korhonen, Sang-il Oum

For an $n$-element matroid $M$ given by an $n \times n$ matrix representation over a finite field $\mathbb F$ and an integer $k$, we present an $(O_{k,\mathbb F}(n^2)+O(n^ω))$-time algorithm that either finds a branch-decomposition of $M$ of width at most $k$, or confirms that the branch-width of $M$ is more than $k$, where $ω< 2.3714$ is the matrix multiplication exponent, and the $O_{k,\mathbb F}(\cdot)$-notation hides factors that depend on $k$ and $\mathbb F$ in a computable manner. All previous algorithms including Hliněný and Oum [SIAM J. Comput. (2008)] and Jeong, Kim, and Oum [SIAM J. Discrete Math. (2021)] run in at least $Ω(n^3)$ time. Moreover, if the input matrix representation is given by a standard form, our algorithm runs in $O_{k,\mathbb F}(n^2)$-time, since $O(n^ω)$-time is only needed for finding a standard form of the input matrix. When $M$ is given by an $m \times n$ matrix, the overhead for finding a standard form is $O(mn \min(m,n)^{ω-2})$. As corollaries, we obtain faster algorithms for rank-width of directed graphs and path-width of matroids represented over a fixed finite field. Furthermore, we also present an approximation algorithm for finding branch-width that works on infinite fields provided that the input matrix is of a standard form and contains a bounded number of distinct values of entries. To suggest that our algorithm is optimal, we observe that for every field $\mathbb F$, deciding whether the branch-width of a matroid represented over $\mathbb F$ is $0$ is as hard as deciding whether a square matrix over $\mathbb F$ is singular. Under the assumption that singularity testing requires $Ω(n^ω)$-time, this implies that the overhead of $O(n^ω)$ is unavoidable. We also show strengthenings of this observation to rule out some approximations under this assumption.

Authors: Mujin Choi, Tuukka Korhonen, Sang-il Oum

For an $n$-element matroid $M$ given by an $n \times n$ matrix representation over a finite field $\mathbb F$ and an integer $k$, we present an $(O_{k,\mathbb F}(n^2)+O(n^ω))$-time algorithm that either finds a branch-decomposition of $M$ of width at most $k$, or confirms that the branch-width of $M$ is more than $k$, where $ω< 2.3714$ is the matrix multiplication exponent, and the $O_{k,\mathbb F}(\cdot)$-notation hides factors that depend on $k$ and $\mathbb F$ in a computable manner. All previous algorithms including Hliněný and Oum [SIAM J. Comput. (2008)] and Jeong, Kim, and Oum [SIAM J. Discrete Math. (2021)] run in at least $Ω(n^3)$ time. Moreover, if the input matrix representation is given by a standard form, our algorithm runs in $O_{k,\mathbb F}(n^2)$-time, since $O(n^ω)$-time is only needed for finding a standard form of the input matrix. When $M$ is given by an $m \times n$ matrix, the overhead for finding a standard form is $O(mn \min(m,n)^{ω-2})$. As corollaries, we obtain faster algorithms for rank-width of directed graphs and path-width of matroids represented over a fixed finite field. Furthermore, we also present an approximation algorithm for finding branch-width that works on infinite fields provided that the input matrix is of a standard form and contains a bounded number of distinct values of entries. To suggest that our algorithm is optimal, we observe that for every field $\mathbb F$, deciding whether the branch-width of a matroid represented over $\mathbb F$ is $0$ is as hard as deciding whether a square matrix over $\mathbb F$ is singular. Under the assumption that singularity testing requires $Ω(n^ω)$-time, this implies that the overhead of $O(n^ω)$ is unavoidable. We also show strengthenings of this observation to rule out some approximations under this assumption.

zSort: Stable Distribution Sort using Z-Score Partitioning

from arXiv: Data Structures and Algorithms

Authors: Hriday Jain, Ketan Sabale, Aditya Shastri, Hiren Kumar Thakkar, Ashutosh Londhe

Sorting is a foundational primitive in modern data processing, influencing the execution speed of high-performance data pipelines. However, the algorithmic landscape is currently bifurcated by a pervasive "Stability Tax": practitioners must sacrifice either order preservation for high throughput or execution speed for stability. To address these limitations, this paper introduces, zSort, an adaptive z-score based distribution sorting algorithm that guarantees stability while avoiding pass complexity that scales with key-width. The performance of the proposed technique is evaluated using Microarchitectural analysis and experimental results. Microarchitectural analysis shows that zSort achieves a lower bad-speculation overhead (19.7%) than both stable baselines and several high-performance unstable algorithms and sustains a competitive IPC of 1.44. Empirical evaluation across diverse input distributions and datasets of up to 10^7 elements (64 bit) demonstrates that zSort consistently outperforms widely used comparison based stable sorting algorithms, achieving up to 3x-4.5x speedups, and a relatively better performance compared to LSD Radix, with larger gains on duplicate heavy and partially ordered inputs. Despite providing stability, zSort achieves comparable throughput as compared to high-performance unstable algorithms such as Skasort. It also maintains this performance on adaptive workloads where methods like Pdqsort typically excel and doesn't exhibit any extreme worst case. These results indicate that zSort substantially narrows the traditional performance gap between stable and unstable sorting and provides an efficient, stable sorting alternative.

Authors: Hriday Jain, Ketan Sabale, Aditya Shastri, Hiren Kumar Thakkar, Ashutosh Londhe

Sorting is a foundational primitive in modern data processing, influencing the execution speed of high-performance data pipelines. However, the algorithmic landscape is currently bifurcated by a pervasive "Stability Tax": practitioners must sacrifice either order preservation for high throughput or execution speed for stability. To address these limitations, this paper introduces, zSort, an adaptive z-score based distribution sorting algorithm that guarantees stability while avoiding pass complexity that scales with key-width. The performance of the proposed technique is evaluated using Microarchitectural analysis and experimental results. Microarchitectural analysis shows that zSort achieves a lower bad-speculation overhead (19.7%) than both stable baselines and several high-performance unstable algorithms and sustains a competitive IPC of 1.44. Empirical evaluation across diverse input distributions and datasets of up to 10^7 elements (64 bit) demonstrates that zSort consistently outperforms widely used comparison based stable sorting algorithms, achieving up to 3x-4.5x speedups, and a relatively better performance compared to LSD Radix, with larger gains on duplicate heavy and partially ordered inputs. Despite providing stability, zSort achieves comparable throughput as compared to high-performance unstable algorithms such as Skasort. It also maintains this performance on adaptive workloads where methods like Pdqsort typically excel and doesn't exhibit any extreme worst case. These results indicate that zSort substantially narrows the traditional performance gap between stable and unstable sorting and provides an efficient, stable sorting alternative.

Fast Gossip-based Rumor Spreading using Small Messages

from arXiv: Data Structures and Algorithms

Authors: Fabien Dufoulon, William K. Moses, Gopal Pandurangan

We study gossip algorithms for the fundamental rumor spreading problem, where the goal is to disseminate a rumor from a given source node to all nodes in an arbitrary (and unknown) graph. Gossip algorithms allow each node to call only one neighbor per round and are therefore highly message-efficient, with low per-node communication overhead per round. The state of the art present fast gossip algorithms, however they typically leverage large-sized messages. This undermines the light-weight communication advantage of gossip, since even though only one neighbor is contacted per round, the message size can be linear in $n$, the network size. Hence, a fundamental question is whether one can perform fast gossip using small messages. The main contribution of this paper is to answer the above question in the affirmative and present two gossip algorithms that achieve fast rumor spreading using messages of polylog{n} size. Specifically, we present the following algorithms: 1. An algorithm that runs in $O(c \log n / Φ_c)$ rounds for every $c \geq 1$, and $Φ_c$ is the weak conductance. Our bound in terms of weak conductance is essentially optimal. 2. An algorithm that depends on the network diameter (and is independent of the graph's conductance), which runs in $\tilde{O}(D+\sqrt{n})$ rounds with high probability. Our algorithm can be modified to output a minimum spanning tree (MST) in the same number of rounds, which is essentially round-optimal (even for non-gossip algorithms). Our gossip algorithms use graph sketches [Ahn, Guha, McGregor, SODA 2012] in a novel way to overcome communication bottlenecks and achieve small communication overhead with small message sizes.

Authors: Fabien Dufoulon, William K. Moses, Gopal Pandurangan

We study gossip algorithms for the fundamental rumor spreading problem, where the goal is to disseminate a rumor from a given source node to all nodes in an arbitrary (and unknown) graph. Gossip algorithms allow each node to call only one neighbor per round and are therefore highly message-efficient, with low per-node communication overhead per round. The state of the art present fast gossip algorithms, however they typically leverage large-sized messages. This undermines the light-weight communication advantage of gossip, since even though only one neighbor is contacted per round, the message size can be linear in $n$, the network size. Hence, a fundamental question is whether one can perform fast gossip using small messages. The main contribution of this paper is to answer the above question in the affirmative and present two gossip algorithms that achieve fast rumor spreading using messages of polylog{n} size. Specifically, we present the following algorithms: 1. An algorithm that runs in $O(c \log n / Φ_c)$ rounds for every $c \geq 1$, and $Φ_c$ is the weak conductance. Our bound in terms of weak conductance is essentially optimal. 2. An algorithm that depends on the network diameter (and is independent of the graph's conductance), which runs in $\tilde{O}(D+\sqrt{n})$ rounds with high probability. Our algorithm can be modified to output a minimum spanning tree (MST) in the same number of rounds, which is essentially round-optimal (even for non-gossip algorithms). Our gossip algorithms use graph sketches [Ahn, Guha, McGregor, SODA 2012] in a novel way to overcome communication bottlenecks and achieve small communication overhead with small message sizes.

Semi-Streaming Algorithms for Submodular Maximization under Random Arrival Order

from arXiv: Data Structures and Algorithms

Authors: Niv Buchbinder, Moran Feldman, Siyue Liu, Sherry Sarkar

We study random order semi-streaming algorithms for submodular maximization under a wide range of combinatorial constraint classes, including matroids, matroid $p$-parity, $p$-exchange systems and $p$-systems. For most of these classes of constraints, our results are the first improvement over what is known to be achievable for adversarial order. For matroids, matching and $p$-matchoids, previous random order results were known, and we improve over some of these as well. In the case of matroids, our improved results show a separation between adversarial and random order semi-streaming algorithms, and exponentially improve the number of passes necessary for getting $1 - 1/e - \varepsilon$ approximation for maximizing a monotone submodular function subject to a matroid constraint. We also prove a new hardness result showing a similar separation for $p$-systems. Our results are based on two new technical tools. One tool provides a general way to translate offline algorithms for many classes of constraints into random order semi-streaming algorithms. The other tool is a semi-streaming variant of a recently proposed offline algorithm for matroid constraints.

Authors: Niv Buchbinder, Moran Feldman, Siyue Liu, Sherry Sarkar

We study random order semi-streaming algorithms for submodular maximization under a wide range of combinatorial constraint classes, including matroids, matroid $p$-parity, $p$-exchange systems and $p$-systems. For most of these classes of constraints, our results are the first improvement over what is known to be achievable for adversarial order. For matroids, matching and $p$-matchoids, previous random order results were known, and we improve over some of these as well. In the case of matroids, our improved results show a separation between adversarial and random order semi-streaming algorithms, and exponentially improve the number of passes necessary for getting $1 - 1/e - \varepsilon$ approximation for maximizing a monotone submodular function subject to a matroid constraint. We also prove a new hardness result showing a similar separation for $p$-systems. Our results are based on two new technical tools. One tool provides a general way to translate offline algorithms for many classes of constraints into random order semi-streaming algorithms. The other tool is a semi-streaming variant of a recently proposed offline algorithm for matroid constraints.

Stochastic Matching via Local Sparsification

from arXiv: Data Structures and Algorithms

Authors: Sara Ahmadian, Edith Cohen, Mohammad Roghani

The classic online stochastic matching problem typically requires immediate and irrevocable matching decisions. However, in many modern decentralized systems such as real-time ride-hailing and distributed cloud computing, the primary bottleneck is often local communication bandwidth rather than the timing of the match itself. We formalize this challenge by introducing a two-stage local sparsification framework. In this setting, arriving requests must prune their realized compatibility sets to a strict budget of $k$ edges before a central coordinator optimizes the global matching. This creates a "middle ground" between local information constraints and global optimization utility. We propose a local selection strategy, parametrized by a fractional solution of the expected instance. Theoretically, we quantify the approximation ratio as a function of the solution's {\em spread}. We prove that under sufficient spread, our sparsifier globally preserves the expected size of the maximum matching. Empirically, we demonstrate the robustness of our approach using the New York City ride-hailing datasets and adversarial synthetic benchmarks. Our results show that near-optimal global matching is achievable even with highly constrained local budgets, significantly outperforming standard online baselines.

Authors: Sara Ahmadian, Edith Cohen, Mohammad Roghani

The classic online stochastic matching problem typically requires immediate and irrevocable matching decisions. However, in many modern decentralized systems such as real-time ride-hailing and distributed cloud computing, the primary bottleneck is often local communication bandwidth rather than the timing of the match itself. We formalize this challenge by introducing a two-stage local sparsification framework. In this setting, arriving requests must prune their realized compatibility sets to a strict budget of $k$ edges before a central coordinator optimizes the global matching. This creates a "middle ground" between local information constraints and global optimization utility. We propose a local selection strategy, parametrized by a fractional solution of the expected instance. Theoretically, we quantify the approximation ratio as a function of the solution's {\em spread}. We prove that under sufficient spread, our sparsifier globally preserves the expected size of the maximum matching. Empirically, we demonstrate the robustness of our approach using the New York City ride-hailing datasets and adversarial synthetic benchmarks. Our results show that near-optimal global matching is achievable even with highly constrained local budgets, significantly outperforming standard online baselines.

Finite Sample Bounds for Learning with Score Matching

from arXiv: Data Structures and Algorithms

Authors: Devin Smedira, Abhijith Jayakumar, Sidhant Misra, Marc Vuffray, Andrey Y. Lokhov

Learning of continuous exponential family distributions with unbounded support remains an important area of research for both theory and applications in high-dimensional statistics. In recent years, score matching has become a widely used method for learning exponential families with continuous variables due to its computational ease when compared against maximum likelihood estimation. However, theoretical understanding of the statistical properties of score matching is still lacking. In this work, we provide a non-asymptotic sample complexity analysis for learning the structure of exponential families of polynomials with score matching. The derived sample bounds show a polynomial dependence on the model dimension. These bounds are the first of its kind, as all prior work has shown only asymptotic bounds on the sample complexity.

Authors: Devin Smedira, Abhijith Jayakumar, Sidhant Misra, Marc Vuffray, Andrey Y. Lokhov

Learning of continuous exponential family distributions with unbounded support remains an important area of research for both theory and applications in high-dimensional statistics. In recent years, score matching has become a widely used method for learning exponential families with continuous variables due to its computational ease when compared against maximum likelihood estimation. However, theoretical understanding of the statistical properties of score matching is still lacking. In this work, we provide a non-asymptotic sample complexity analysis for learning the structure of exponential families of polynomials with score matching. The derived sample bounds show a polynomial dependence on the model dimension. These bounds are the first of its kind, as all prior work has shown only asymptotic bounds on the sample complexity.

New Algorithms for Parity-SAT and Its Bounded-Occurrence Versions

from arXiv: Data Structures and Algorithms

Authors: Sanjay Jain, Junqiang Peng, Frank Stephan, Haoyun Tang, Mingyu Xiao

Parity-SAT is the problem of determining whether a given CNF formula has an odd number of satisfying assignments. As a canonical $\oplus$P-complete problem, it represents a fundamental variant of the exact model counting problem (#SAT). Under the Strong Exponential Time Hypothesis (SETH), Parity-SAT admits no $O^*((2-\varepsilon)^n)$-time or $O^*((2-\varepsilon)^m)$-time algorithm for any constant $\varepsilon>0$, where $n$ and $m$ denote the numbers of variables and clauses, respectively. Thus, breaking the $2^n$ or $2^m$ barrier appears impossible in full generality. In this work, we revisit this barrier through structural restrictions and a refined exploitation of parity. We study Parity-$d$-occ-SAT, where each variable appears in at most $d$ clauses, and obtain three main results. First, we design {a randomized} $O^*(2^{m(1-1/O(d))})$-time algorithm, thereby breaking the $2^m$ barrier for every fixed $d$. Second, for the special case $d=2$, we develop a significantly sharper branching algorithm running in $O^*(1.1193^n)$ time or $O^*(1.3248^m)$ time. Third, leveraging the structural insights underlying the $d=2$ case, we obtain an $O^*(1.1052^L)$-time algorithm for general Parity-SAT, where $L$ denotes the formula length. All algorithms use only polynomial space. Notably, our running-time bounds are better than the best known bounds for the corresponding exact counting counterparts, highlighting a genuine algorithmic advantage of parity over counting. Conceptually, our results demonstrate that parity admits finer structural reductions and more efficient branching than exact model counting, and that bounded occurrence can be systematically leveraged to circumvent classical exponential barriers.

Authors: Sanjay Jain, Junqiang Peng, Frank Stephan, Haoyun Tang, Mingyu Xiao

Parity-SAT is the problem of determining whether a given CNF formula has an odd number of satisfying assignments. As a canonical $\oplus$P-complete problem, it represents a fundamental variant of the exact model counting problem (#SAT). Under the Strong Exponential Time Hypothesis (SETH), Parity-SAT admits no $O^*((2-\varepsilon)^n)$-time or $O^*((2-\varepsilon)^m)$-time algorithm for any constant $\varepsilon>0$, where $n$ and $m$ denote the numbers of variables and clauses, respectively. Thus, breaking the $2^n$ or $2^m$ barrier appears impossible in full generality. In this work, we revisit this barrier through structural restrictions and a refined exploitation of parity. We study Parity-$d$-occ-SAT, where each variable appears in at most $d$ clauses, and obtain three main results. First, we design {a randomized} $O^*(2^{m(1-1/O(d))})$-time algorithm, thereby breaking the $2^m$ barrier for every fixed $d$. Second, for the special case $d=2$, we develop a significantly sharper branching algorithm running in $O^*(1.1193^n)$ time or $O^*(1.3248^m)$ time. Third, leveraging the structural insights underlying the $d=2$ case, we obtain an $O^*(1.1052^L)$-time algorithm for general Parity-SAT, where $L$ denotes the formula length. All algorithms use only polynomial space. Notably, our running-time bounds are better than the best known bounds for the corresponding exact counting counterparts, highlighting a genuine algorithmic advantage of parity over counting. Conceptually, our results demonstrate that parity admits finer structural reductions and more efficient branching than exact model counting, and that bounded occurrence can be systematically leveraged to circumvent classical exponential barriers.

Improved Speed via Regional Fulfillment

from arXiv: Data Structures and Algorithms

Authors: Daniel Hathcock, R. Ravi, Amitabh Sinha

In e-retail, order fulfillment speed has become one of the most important metrics affecting customer satisfaction. While common wisdom dictates that maintaining a large global fulfillment network maximizes efficiency via economies of scale, recent evidence has shown that breaking up the network into smaller regions can yield significant speed improvements. In this paper, we consider a simple abstract model of order fulfillment by which we explain this phenomenon. We characterize fulfillment assignments satisfying an equilibrium condition based on the greedy fulfillment strategy, and quantify how the resulting fulfillment delay can be decreased by regionalizing the network. Finally, we provide some algorithmic results for computing low delay assignments, and some simulations supporting our equilibrium framework.

Authors: Daniel Hathcock, R. Ravi, Amitabh Sinha

In e-retail, order fulfillment speed has become one of the most important metrics affecting customer satisfaction. While common wisdom dictates that maintaining a large global fulfillment network maximizes efficiency via economies of scale, recent evidence has shown that breaking up the network into smaller regions can yield significant speed improvements. In this paper, we consider a simple abstract model of order fulfillment by which we explain this phenomenon. We characterize fulfillment assignments satisfying an equilibrium condition based on the greedy fulfillment strategy, and quantify how the resulting fulfillment delay can be decreased by regionalizing the network. Finally, we provide some algorithmic results for computing low delay assignments, and some simulations supporting our equilibrium framework.

Thursday, May 14

TCS+ talk: Wednesday, May 20 — Shyamal Patel, Columbia University

from TCS+ Seminar Series

The next TCS+ talk will take place this coming Wednesday, May 20th at 1:00 PM Eastern Time (10:00 AM Pacific Time, 19:00 Central European Time, 17:00 UTC). Shyamal Patel from Columbia University will speak about “Learning Functions of Halfspaces” (abstract below). You can reserve a spot as an individual or a group to join us […]

The next TCS+ talk will take place this coming Wednesday, May 20th at 1:00 PM Eastern Time (10:00 AM Pacific Time, 19:00 Central European Time, 17:00 UTC). Shyamal Patel from Columbia University will speak about “Learning Functions of Halfspaces” (abstract below).

You can reserve a spot as an individual or a group to join us live by signing up on the online form. Registration is not required to attend the interactive talk, and the link will be posted on the website the day prior to the talk; however, by registering in the form, you will receive a reminder, along with the link. (The recorded talk will also be posted on our website afterwards) As usual, for more information about the TCS+ online seminar series and the upcoming talks, or to suggest a possible topic or speaker, please see the website.

Abstract: Learning halfspaces is one of the most basic and fundamental problems in learning theory. While we have a good understanding of the problem, simple generalizations such as learning an intersection of halfspaces appear much more challenging. While learning an intersection of halfspaces can be done efficiently if we assume our data is drawn from a uniform or log-concave distribution, there are large gaps in our understanding when we make no assumptions on the background distribution. In particular, even for learning an intersection of two halfspaces, we have no non-trivial learning algorithms and no evidence that the problem requires super-polynomial time.
In this talk, we’ll describe a sub-exponential time algorithm for learning an intersection of halfspaces and more generally for learning an arbitrary function of \tilde{O}(\log(n)) halfspaces (joint work with Josh Alman and Rocco Servedio).

By plustcs

The degree of Fibonacci heaps

from David Eppstein

The Fibonacci heap is a priority queue data structure in which finding and removing the top-priority item takes logarithmic time (like a binary heap) but reprioritizing items to have higher priority takes only constant time, under amortized time analysis. This property is useful for speeding up the time complexity of algorithms including Dijkstra’s algorithm for shortest paths and Jarník’s algorithm for minimum spanning trees, so that they take logarithmic time per vertex and constant time per edge.

The Fibonacci heap is a priority queue data structure in which finding and removing the top-priority item takes logarithmic time (like a binary heap) but reprioritizing items to have higher priority takes only constant time, under amortized time analysis. This property is useful for speeding up the time complexity of algorithms including Dijkstra’s algorithm for shortest paths and Jarník’s algorithm for minimum spanning trees, so that they take logarithmic time per vertex and constant time per edge.

In its internal structure, the Fibonacci heap is a forest of trees, with one tree node per prioritized object, higher priorities towards to roots of the trees. The analysis of its logarithmic-time find-and-remove operation can be split into two parts: showing that the time of this operation is proportional to the maximum number of children of any node in this tree (the degree of the node), and showing that the degree is logarithmic. It is the second part of this analysis that gives the data structure its name: a node with degree \(d\) must be the root of a subtree at least a Fibonacci number of nodes, \(F_{d+2}\) (numbering the Fibonacci numbers starting from \(F_0=0\) and \(F_1=1\)). For instance, a subtree whose root has three children must contain at least five nodes, etc. Therefore, in a Fibonacci heap that reaches a maximum size of \(n\), the degree is bounded by the requirement that \(F_{d+2}\le n\), for otherwise we would have a tree with more elements than the entire heap. But this standard analysis, which you can find in the original Fibonacci heap paper, Wikipedia, or the textbook Introduction to Algorithms, is not tight! It turns out that the actual requirement is \(F_{d+3}\le n\). Or in other words, the degrees of the trees in a Fibonacci heap are smaller by one than the bound given by the standard analysis.

To prove that each subtree has a Fibonacci number of nodes, one uses a lemma that the \(i\)th child of any node (in the order that it was added as a child) must itself have degree at least \(i-2\). This follows from the three ways that the Fibonacci heap data structure can change its forest. First, a deletion operation can remove the root of its tree, making its children into roots, but their degrees remain unchanged. Second, any deletion triggers a cleanup phase in which pairs of trees with equal numbers of children are merged by making one root into the child of another. If this merge adds a node as the \(i\)th child of the merged root, the new child had the same degree as the root before the merge, \(i-1\). And third, a change of priority can trigger the removal of some tree edges (promoting the child of a node to become a new tree root), but every node that is not a tree root can have only one child removed while it is not a root. Thus, the \(i\)th child of a node, which was at least the \(i\)th child and had degree at least \(i-1\) when it first became a child, can only have its degree decrease by one, to \(i-2\).

Based on this lemma, one can construct a sequence of the smallest trees \(T_d\) with each root degree \(d\). In this sequence, each tree \(T_d\) is formed from \(T_{d-1}\) by adding to it another child with \(d-2\) grandchildren (the minimum allowed by the lemma). The added child’s subtree takes the form of \(T_{d-2}\), the smallest subtree that could give that number of grandchildren. Thus, \(T_d=T_{d-1}+T_{d-2}\) for an addition operation on trees that adds the second tree argument of the operation as a child of the first tree argument. This is the Fibonacci recurrence, and these trees have a Fibonacci number of nodes.

The smallest trees in a Fibonacci heap with root degree 0, 1, 2, 3, and 4

But now, instead of looking at the minimum size of a tree with degree \(d\), let’s consider what situation has to occur for a tree with degree \(d\) to be created. As a base case, a tree with degree one is created by merging two trees of degree zero. For this merge to happen, the Fibonacci heap needs to perform a deletion operation (to trigger the merge) in a state where there are already two trees of degree zero to be merged. Thus, creating a tree with degree one requires there to exist three heap elements, before the deletion operation: one to be deleted and two more in the merged trees. Similarly, to create a tree with degree two, we need to merge two trees of degree one (with at least two nodes each), triggered by a deletion of another element. So this creation requires five elements: four in the two merged trees and one to be deleted to trigger a merge or sequence of merges that ends up merging two degree-one trees.

More generally, we can prove by induction that creating a tree of degree \(d\) can only happen after the heap has reached a size of \(F_{d+3}\). To create a tree of degree \(d\), we need the first of two trees of degree \(d-1\) to be created (in the same merge step or in an earlier merge step as the one creating the tree of degree \(d\)), and then the second, and then a merge of the two. When the first tree is created, it contains at least \(F_{d+1}\) elements that cannot participate in the creation of the second tree, by the tree size bound from the old analysis of Fibonacci heaps. When the second tree is created, the trees that merge to form it and the deleted element that trigger the merge comprise at least \(F_{d+2}\) elements, by the induction hypothesis. Together these give \(F_{d+1}+F_{d+2}=F_{d+3}\) elements in the heap, just before the deletion that triggered the creation of the second merged subtree and then the merge of the two degree-\((d-1)\) trees into a single degree-\(d\) tree.

This induction proof also guides a construction of a sequence of operations that creates a degree-\(d\) tree, and then trims it down to the form \(T_{d-1}\), while keeping the maximum number of elements in the heap at most \(F_{d+3}\). As a base case, inserting three elements into an empty heap and deleting one will create a degree-one tree \(T_1\), with no trimming needed. Then for any degree \(d>1\), create a degree-\((d-1)\) tree and trim it to the form \(T_{d-1}\) recursively. Then, create a second degree-\((d-1)\) tree. The merge step that creates it will also merge it with the first \(T_{d-1}\); assign priorities in such a way that the second tree becomes a child of the first \(T_{d-1}\). Continue performing improve-priority and delete operations that trim the second tree to the form \(T_{d-1}\), while it remains a subtree of the first tree, producing a tree formed by merging two copies of \(T_{d-1}\). Then, use additional improve-priority and delete operations to remove the last child of the last child of the root, producing a tree in the form of \(T_d\). The number of elements in the heap rises to \(F_{d+2}\) while creating the first \(T_{d-1}\), falls back down to \(F_{d+1}\), and then rises to \(F_{d+1}+F_{d+2}=F_{d+3}\) while finishing the construction, matching the lower bound.

(Discuss on Mastodon)

By David Eppstein

Prediction Markets Redux

from Computational Complexity

For those very long-time readers this blog extensively covered prediction markets from 2006 to 2008. In a prediction market, you have a future event, such as the winner of an election, and a market that pays off one dollar if that event happens and nothing if it doesn't. The price of this market corresponds to a predicted probability that the event will happen. 

I worked with David Pennock and Yiling Chen to create an interactive map that colored every state based on how likely it was to go Democratic or Republican for both the presidential, gubernatorial and Senate races.


This all ended when Intrade, the source of the data that we used, went out of business after its CEO died climbing Mount Everest and may have embezzled money from the company.

In 2016 prediction markets went out of favor after badly predicting against Brexit once and Trump twice. 

But what's old becomes new again. Prediction markets are back with a bang, with Kalshi and Polymarket providing real money markets open to U.S. citizens, something we didn't legally have, except for very low stakes, twenty years ago.

Prediction markets play two distinct roles:

  1. As a betting system so people can put their "money where their mouth is" based on their beliefs.
  2. As an information aggregation system, to use the wisdom of the crowds to predict future events.
Sometimes these roles conflict with each other. With real money comes real greed and we've seen some recent issues of insider trading, most notably a US soldier who used classified knowledge about the then upcoming raid in Venezuela to net $400K on Polymarket. 
Insider trading will likely lead to better predictions, but those who have bet on that market might feel cheated. In the short run, insider trading might be a good thing, but in the long run, if we scare away participants because they're afraid others who have more information will take advantage of them, then we have fewer people in the market making it harder to draw upon the wisdom of the crowds. 
Another challenge for prediction markets is determining how the market pays off. You need to come up with a very specific definition of what it means for the market to pay off, but strange circumstances can lead to unexpected results. Back in 2006 North Korea launched a test missile outside the country but the market for that event paid off zero because the U.S. Department of Defense never verified that the missile actually left North Korean airspace. 
The market I like to see is factoring the RSA numbers by a given date. Very easy to verify. Some people might buy such a security, expecting quantum computing to be factoring numbers soon. And then I can make some money by betting against it. 

By Lance Fortnow

For those very long-time readers this blog extensively covered prediction markets from 2006 to 2008. In a prediction market, you have a future event, such as the winner of an election, and a market that pays off one dollar if that event happens and nothing if it doesn't. The price of this market corresponds to a predicted probability that the event will happen. 

I worked with David Pennock and Yiling Chen to create an interactive map that colored every state based on how likely it was to go Democratic or Republican for both the presidential, gubernatorial and Senate races.


This all ended when Intrade, the source of the data that we used, went out of business after its CEO died climbing Mount Everest and may have embezzled money from the company.

In 2016 prediction markets went out of favor after badly predicting against Brexit once and Trump twice. 

But what's old becomes new again. Prediction markets are back with a bang, with Kalshi and Polymarket providing real money markets open to U.S. citizens, something we didn't legally have, except for very low stakes, twenty years ago.

Prediction markets play two distinct roles:

  1. As a betting system so people can put their "money where their mouth is" based on their beliefs.
  2. As an information aggregation system, to use the wisdom of the crowds to predict future events.
Sometimes these roles conflict with each other. With real money comes real greed and we've seen some recent issues of insider trading, most notably a US soldier who used classified knowledge about the then upcoming raid in Venezuela to net $400K on Polymarket. 

Insider trading will likely lead to better predictions, but those who have bet on that market might feel cheated. In the short run, insider trading might be a good thing, but in the long run, if we scare away participants because they're afraid others who have more information will take advantage of them, then we have fewer people in the market making it harder to draw upon the wisdom of the crowds. 

Another challenge for prediction markets is determining how the market pays off. You need to come up with a very specific definition of what it means for the market to pay off, but strange circumstances can lead to unexpected results. Back in 2006 North Korea launched a test missile outside the country but the market for that event paid off zero because the U.S. Department of Defense never verified that the missile actually left North Korean airspace. 

The market I like to see is factoring the RSA numbers by a given date. Very easy to verify. Some people might buy such a security, expecting quantum computing to be factoring numbers soon. And then I can make some money by betting against it. 

By Lance Fortnow

Simplex on a Fixed View Schedule

from Decentralized Thoughts

This post gives a fixed view schedule version of Simplex, called Simplex FVS, with the advantages of a fixed view schedule discussed here. In the FVS design, the start and end times of each view are fixed in advance; this requires synchronized clocks. We number protocol views starting at $1$, and view $v$ starts at time $3v\Delta$. After a view ends, parties no longer sign new messages for it, but...

By Ittai Abraham

This post gives a fixed view schedule version of Simplex, called Simplex FVS, with the advantages of a fixed view schedule discussed here. In the FVS design, the start and end times of each view are fixed in advance; this requires synchronized clocks. We number protocol views starting at $1$, and view $v$ starts at time $3v\Delta$. After a view ends, parties no longer sign new messages for it, but...

By Ittai Abraham

Upper Bounds for Symmetric Approximate Bounded Indistinguishability

from arXiv: Computational Complexity

Authors: Christopher Williamson

A pair of probability distributions over $\{0,1\}^n$ is said to be $(k,δ)$-wise indistinguishable if all of the size $k$ marginals are within statistical distance at most $δ$. Previous works introduced this concept and study when and how well one can distinguish between such a pair of symmetric distributions by observing $t$ bits. We use a simple hypergeometric smoothing approach and Hahn polynomials to obtain new upper bounds that apply across a wider range of parameters and improve previously available bounds in several regimes. In particular, prior works left open the basic question of whether there exist constants $0

Authors: Christopher Williamson

A pair of probability distributions over $\{0,1\}^n$ is said to be $(k,δ)$-wise indistinguishable if all of the size $k$ marginals are within statistical distance at most $δ$. Previous works introduced this concept and study when and how well one can distinguish between such a pair of symmetric distributions by observing $t$ bits. We use a simple hypergeometric smoothing approach and Hahn polynomials to obtain new upper bounds that apply across a wider range of parameters and improve previously available bounds in several regimes. In particular, prior works left open the basic question of whether there exist constants $0

Polyhedral Instability Governs Regret in Online Learning

from arXiv: Computational Complexity

Authors: Yuetai Li, Fengqing Jiang, Yichen Feng, Kaiyuan Zheng, Luyao Niu, Bhaskar Ramasubramanian, Basel Alomair, Linda Bushnell, Radha Poovendran

Many online decision problems over combinatorial actions are addressed via convex relaxations, leading to online convex optimization with piecewise linear objectives and induced polyhedral structure. We show that regret in such problems is governed by \emph{polyhedral instability}: the number of changes of the active region. Under full information feedback and fixed partition assumptions, if $\mathrm{RS}_T$ denotes the number of region switches and $V_{\max}$ the maximum number of vertices per region, we prove $\Regret_T= Θ(\sqrt{(1+\mathrm{RS}_T)\,T\,\log V_{\max}})$ interpolating between experts-like and dimension-dependent OCO rates. For online submodular--concave games under Lovász convexification, this reduces to the permutation-switch count $\mathrm{SC}_T$, yielding the matching rate $\Regret_T= Θ(\sqrt{(1+\mathrm{SC}_T)\,T\,\log n})$. Experiments on synthetic and real combinatorial problems (shortest path, influence maximization) validate the predicted scaling and indicate that low-instability regimes can arise in practice without explicit enumeration of actions.

Authors: Yuetai Li, Fengqing Jiang, Yichen Feng, Kaiyuan Zheng, Luyao Niu, Bhaskar Ramasubramanian, Basel Alomair, Linda Bushnell, Radha Poovendran

Many online decision problems over combinatorial actions are addressed via convex relaxations, leading to online convex optimization with piecewise linear objectives and induced polyhedral structure. We show that regret in such problems is governed by \emph{polyhedral instability}: the number of changes of the active region. Under full information feedback and fixed partition assumptions, if $\mathrm{RS}_T$ denotes the number of region switches and $V_{\max}$ the maximum number of vertices per region, we prove $\Regret_T= Θ(\sqrt{(1+\mathrm{RS}_T)\,T\,\log V_{\max}})$ interpolating between experts-like and dimension-dependent OCO rates. For online submodular--concave games under Lovász convexification, this reduces to the permutation-switch count $\mathrm{SC}_T$, yielding the matching rate $\Regret_T= Θ(\sqrt{(1+\mathrm{SC}_T)\,T\,\log n})$. Experiments on synthetic and real combinatorial problems (shortest path, influence maximization) validate the predicted scaling and indicate that low-instability regimes can arise in practice without explicit enumeration of actions.

On the Complexity of the Minimum-($k,ρ$)-Shortcut Problem

from arXiv: Computational Complexity

Authors: Tatiana Rocha Avila, Julian Christoph Brinkmann, Alexander Leonhardt, Conrad Schecker

We consider the Minimum-$(k,ρ)$-$\mathrm{Shortcut}$ problem ($\min(k,ρ)\text{-}\mathrm{Shortcut}$), where the goal is to find the smallest set of shortcut edges such that every vertex in a given graph can reach its $ρ$ closest vertices using paths of at most $k$ edges. This is a fundamental graph optimization problem used to accelerate parallel shortest path algorithms. It is well-known that the problem is trivially solvable for the cases $k=1$ and $k\geqρ$. While recent work by Leonhardt, Meyer, and Penschuck (ESA 2024) showed that in undirected graphs $\min(k,ρ)\text{-}\mathrm{Shortcut}$ is NP-hard for $k\geq 3$ if $ρ=Θ(n^ε)$, the boundary where the problem transitions from polynomial-time solvable to NP-hard remained open. In this paper, we narrow this gap significantly. We present a simpler and more direct reduction from the Hitting Set problem which establishes that $\min(k,ρ)\text{-}\mathrm{Shortcut}$ is NP-hard for $k\geq2$ and $ρ\geq k+2$ in both directed and undirected graphs. Complementing this, we use the symmetry of the undirected case to show that $ρ=k+1$ is solvable in polynomial time, a regime where the directed version remains a candidate for NP-hardness. Therefore, we obtain an almost complete characterization of the complexity of $\min(k,ρ)\text{-}\mathrm{Shortcut}$, with the sole remaining open case being $ρ= k+1$ in the directed setting.

Authors: Tatiana Rocha Avila, Julian Christoph Brinkmann, Alexander Leonhardt, Conrad Schecker

We consider the Minimum-$(k,ρ)$-$\mathrm{Shortcut}$ problem ($\min(k,ρ)\text{-}\mathrm{Shortcut}$), where the goal is to find the smallest set of shortcut edges such that every vertex in a given graph can reach its $ρ$ closest vertices using paths of at most $k$ edges. This is a fundamental graph optimization problem used to accelerate parallel shortest path algorithms. It is well-known that the problem is trivially solvable for the cases $k=1$ and $k\geqρ$. While recent work by Leonhardt, Meyer, and Penschuck (ESA 2024) showed that in undirected graphs $\min(k,ρ)\text{-}\mathrm{Shortcut}$ is NP-hard for $k\geq 3$ if $ρ=Θ(n^ε)$, the boundary where the problem transitions from polynomial-time solvable to NP-hard remained open. In this paper, we narrow this gap significantly. We present a simpler and more direct reduction from the Hitting Set problem which establishes that $\min(k,ρ)\text{-}\mathrm{Shortcut}$ is NP-hard for $k\geq2$ and $ρ\geq k+2$ in both directed and undirected graphs. Complementing this, we use the symmetry of the undirected case to show that $ρ=k+1$ is solvable in polynomial time, a regime where the directed version remains a candidate for NP-hardness. Therefore, we obtain an almost complete characterization of the complexity of $\min(k,ρ)\text{-}\mathrm{Shortcut}$, with the sole remaining open case being $ρ= k+1$ in the directed setting.

Diversity of Extensions in Abstract Argumentation

from arXiv: Computational Complexity

Authors: Johannes K. Fichte, Markus Hecher, Yasir Mahmood, Zhengjun Wang

Argumentation is an important topic of AI for modelling and reasoning about arguments. In abstract argumentation, we consider directed graphs, so-called argumentation frameworks (AF), that express conflicts between arguments. The semantics is defined by the notion of extensions, which are sets of arguments that satisfy particular relationship conditions in the AF. Usually, standard reasoning in argumentation do not reveal how far apart extensions are. We introduce a quantitative notion of diversity of extensions based on the symmetric difference and provide a systematic complexity classification. Intuitively, diversity captures whether extensions of a framework (accepted viewpoints) differ only marginally or represent fundamentally incompatible sets of arguments. We study whether an AF admits k-diverse extensions, admits k-diverse extensions covering specific arguments, and to compute the largest k for which an AF admits k-diverse extensions. We outline a prototype and provide an evaluation for computing diversity levels.

Authors: Johannes K. Fichte, Markus Hecher, Yasir Mahmood, Zhengjun Wang

Argumentation is an important topic of AI for modelling and reasoning about arguments. In abstract argumentation, we consider directed graphs, so-called argumentation frameworks (AF), that express conflicts between arguments. The semantics is defined by the notion of extensions, which are sets of arguments that satisfy particular relationship conditions in the AF. Usually, standard reasoning in argumentation do not reveal how far apart extensions are. We introduce a quantitative notion of diversity of extensions based on the symmetric difference and provide a systematic complexity classification. Intuitively, diversity captures whether extensions of a framework (accepted viewpoints) differ only marginally or represent fundamentally incompatible sets of arguments. We study whether an AF admits k-diverse extensions, admits k-diverse extensions covering specific arguments, and to compute the largest k for which an AF admits k-diverse extensions. We outline a prototype and provide an evaluation for computing diversity levels.

Decision Tree Learning on Product Spaces

from arXiv: Computational Complexity

Authors: Arshia Soltani Moakahr, Faraz Ghahremani, Kiarash Banihashem, MohammadTaghi Hajiaghayi

Decision tree learning has long been a central topic in theoretical computer science, driven by its practical importance. A fundamental and widely used method for decision tree construction is the top-down greedy heuristic, which recursively splits on the most influential variable. Despite its empirical success, theoretical analysis of this heuristic has been limited. A recent breakthrough by Blanc et al. (ITCS, 2020) provided the first rigorous theoretical guarantees for the greedy approach, but only under the uniform distribution. We extend this analysis to the more general and practically relevant setting of arbitrary product distributions. Our main result shows that for any function $f$ computable by an optimal decision tree of size $s$, maximum depth $D_{\text{opt}}$, and average depth $Δ_{\text{opt}}$, the greedy heuristic constructs an $ε$-approximating tree whose size grows at most with $\exp\bigl(Δ_{\text{opt}} D_{\text{opt}} \log(e/ε)\bigr)$. In the special case where the optimal tree is a full binary tree, this bound improves upon the bound of Blanc et al. and holds under a strictly broader class of distributions. Moreover, we present an algorithm based on the top-down greedy heuristic that is entirely parameter-free -- it requires no prior knowledge of the optimal tree's size or depth -- offering a practical advantage over Blanc et al.'s method.

Authors: Arshia Soltani Moakahr, Faraz Ghahremani, Kiarash Banihashem, MohammadTaghi Hajiaghayi

Decision tree learning has long been a central topic in theoretical computer science, driven by its practical importance. A fundamental and widely used method for decision tree construction is the top-down greedy heuristic, which recursively splits on the most influential variable. Despite its empirical success, theoretical analysis of this heuristic has been limited. A recent breakthrough by Blanc et al. (ITCS, 2020) provided the first rigorous theoretical guarantees for the greedy approach, but only under the uniform distribution. We extend this analysis to the more general and practically relevant setting of arbitrary product distributions. Our main result shows that for any function $f$ computable by an optimal decision tree of size $s$, maximum depth $D_{\text{opt}}$, and average depth $Δ_{\text{opt}}$, the greedy heuristic constructs an $ε$-approximating tree whose size grows at most with $\exp\bigl(Δ_{\text{opt}} D_{\text{opt}} \log(e/ε)\bigr)$. In the special case where the optimal tree is a full binary tree, this bound improves upon the bound of Blanc et al. and holds under a strictly broader class of distributions. Moreover, we present an algorithm based on the top-down greedy heuristic that is entirely parameter-free -- it requires no prior knowledge of the optimal tree's size or depth -- offering a practical advantage over Blanc et al.'s method.

LFPL: Revisited and Mechanized

from arXiv: Computational Complexity

Authors: Nathaniel Glover, Jan Hoffmann

Hofmann (1999) introduced the functional programming language LFPL to characterize the functions computable in polynomial time using an affine type system. LFPL enables a natural programming style, including nested recursion, and has inspired the development of type systems for automatic cost analysis, linear dependent type theories, and efficient memory management in functional programming languages. Despite its prominence, there does not exist a self-contained presentation, let alone a full mechanization, of LFPL and its core metatheory. This article presents a modern account and mechanization of LFPL and its metatheory with the goal of being self-contained and accessible while streamlining the strongest-known soundness and completeness results. The soundness proof works with the language LFPL+, which extends LFPL with additional language features. The proof is novel, adapting a technique by Aehlig and Schwichtenberg (2002) to construct explicit polynomials that bound the cost of an LFPL+ expression with respect to a big-step cost semantics. The completeness proof shows that LFPL programs can simulate polynomial-time Turing machines while only relying on restricted forms of linear functions and lists. It has the same structure as the original proof by Hofmann (2002) but greatly simplifies the core argument with a novel stack-like data structure that is implemented with first-class functions and lists. The mechanization includes the full soundness and completeness proofs, and serves as one of the first case studies of mechanized metatheory in the recently developed proof assistant Istari.

Authors: Nathaniel Glover, Jan Hoffmann

Hofmann (1999) introduced the functional programming language LFPL to characterize the functions computable in polynomial time using an affine type system. LFPL enables a natural programming style, including nested recursion, and has inspired the development of type systems for automatic cost analysis, linear dependent type theories, and efficient memory management in functional programming languages. Despite its prominence, there does not exist a self-contained presentation, let alone a full mechanization, of LFPL and its core metatheory. This article presents a modern account and mechanization of LFPL and its metatheory with the goal of being self-contained and accessible while streamlining the strongest-known soundness and completeness results. The soundness proof works with the language LFPL+, which extends LFPL with additional language features. The proof is novel, adapting a technique by Aehlig and Schwichtenberg (2002) to construct explicit polynomials that bound the cost of an LFPL+ expression with respect to a big-step cost semantics. The completeness proof shows that LFPL programs can simulate polynomial-time Turing machines while only relying on restricted forms of linear functions and lists. It has the same structure as the original proof by Hofmann (2002) but greatly simplifies the core argument with a novel stack-like data structure that is implemented with first-class functions and lists. The mechanization includes the full soundness and completeness proofs, and serves as one of the first case studies of mechanized metatheory in the recently developed proof assistant Istari.

The Distributed Complexity Landscape on Trees Depends on the Knowledge About the Network Size

from arXiv: Computational Complexity

Authors: Alkida Balliu, Sebastian Brandt, Fabian Kuhn, Dennis Olivetti, Timothé Picavet, Gustav Schmid

One of the central models in distributed computing is Linial's LOCAL model [SIAM J. Comp. 1992]. Over time, researchers have studied distributed graph problems in the LOCAL model under slightly different assumptions, such as whether nodes know the exact network size $n$, only a polynomial upper bound on $n$, or nothing at all. We ask whether these differences are merely technical or fundamentally affect the theory of Locally Checkable Labelings (LCLs), one of the most studied problem classes. LCLs are graph problems whose valid solutions can be characterized by a finite set of allowed constant-radius neighborhoods. Since their introduction by Naor and Stockmeyer [FOCS 1995], they have become central in distributed computing, and the last decade has seen major progress in understanding their complexity. For example, Chang, Kopelowitz, and Pettie [FOCS 2016] showed that the randomized complexity of any LCL on $n$-node graphs is at least its deterministic complexity on $\sqrt{\log n}$-node graphs. Later, Chang and Pettie [FOCS 2017] showed that any randomized $n^{o(1)}$-round algorithm for LCLs on bounded-degree trees can be turned into a deterministic $O(\log n)$-round algorithm. Then, Balliu et al. [STOC 2018] showed that such automatic speedups are impossible for general bounded-degree graphs. However, these results fundamentally rely on nodes knowing $n$. How much does this assumption affect the theory of LCLs? Our work shows that if nodes are oblivious to $n$, or know only a polynomial upper bound on it, then even on trees, the theory of LCLs changes significantly. While the fundamental classification of problems remains the same, we show the landscape becomes much more complex: for example, for LCLs, randomness helps in more cases; some problems have very unnatural complexities; and some have a lower bound that depends on which definition of $Ω$ we use!

Authors: Alkida Balliu, Sebastian Brandt, Fabian Kuhn, Dennis Olivetti, Timothé Picavet, Gustav Schmid

One of the central models in distributed computing is Linial's LOCAL model [SIAM J. Comp. 1992]. Over time, researchers have studied distributed graph problems in the LOCAL model under slightly different assumptions, such as whether nodes know the exact network size $n$, only a polynomial upper bound on $n$, or nothing at all. We ask whether these differences are merely technical or fundamentally affect the theory of Locally Checkable Labelings (LCLs), one of the most studied problem classes. LCLs are graph problems whose valid solutions can be characterized by a finite set of allowed constant-radius neighborhoods. Since their introduction by Naor and Stockmeyer [FOCS 1995], they have become central in distributed computing, and the last decade has seen major progress in understanding their complexity. For example, Chang, Kopelowitz, and Pettie [FOCS 2016] showed that the randomized complexity of any LCL on $n$-node graphs is at least its deterministic complexity on $\sqrt{\log n}$-node graphs. Later, Chang and Pettie [FOCS 2017] showed that any randomized $n^{o(1)}$-round algorithm for LCLs on bounded-degree trees can be turned into a deterministic $O(\log n)$-round algorithm. Then, Balliu et al. [STOC 2018] showed that such automatic speedups are impossible for general bounded-degree graphs. However, these results fundamentally rely on nodes knowing $n$. How much does this assumption affect the theory of LCLs? Our work shows that if nodes are oblivious to $n$, or know only a polynomial upper bound on it, then even on trees, the theory of LCLs changes significantly. While the fundamental classification of problems remains the same, we show the landscape becomes much more complex: for example, for LCLs, randomness helps in more cases; some problems have very unnatural complexities; and some have a lower bound that depends on which definition of $Ω$ we use!

Quantum state isomorphism problems for groups

from arXiv: Computational Complexity

Authors: Alexandru Gheorghiu, Dale Jacobs, Saeed Mehraban, Arsalan Motamedi

We study the computational complexity of quantum state isomorphism problems under group actions: given two quantum circuits that prepare pure or mixed states, decide whether the two states are related by a group action. This can be seen as a quantum state version of the Hidden Shift Problem, in much the same way that the State Hidden Subgroup Problem is a quantum version of the ordinary Hidden Subgroup Problem. We prove several results for this computational problem: - For the pure-state version, we show that the problem is BQP-hard for all nontrivial groups, and contained in QCMA $\cap$ QCSZK. We further obtain refined results for specific groups of interest: for abelian groups we show that the problem reduces to the state hidden subgroup problem over the generalized dihedral group; for the Clifford group, the problem is at least as hard as Graph Isomorphism under polynomial-time reductions; for the Pauli group it is BQP-complete. - For the mixed-state version, for nontrivial, finite and efficiently representable groups, the problem is QSZK-complete. - We also study a variant of this problem over an infinite group, in particular, the bosonic linear optical unitaries. We show that in the setting where the classical description of the quantum state is given in a suitable wave function representation known as the stellar representation, the problem is at least as hard as Graph Isomorphism, and is contained in NP $\cap$ SZK. Prior to our work, state isomorphism problems had only been studied for the symmetric group [LG17]. As a consequence of our results, we resolve an open question posed in [HEC25] about the existence of a quantum algorithm for the abelian state hidden subgroup problem on mixed states. We show that this problem is QSZK-hard in the worst case, thereby ruling out an efficient quantum algorithm unless QSZK = BQP.

Authors: Alexandru Gheorghiu, Dale Jacobs, Saeed Mehraban, Arsalan Motamedi

We study the computational complexity of quantum state isomorphism problems under group actions: given two quantum circuits that prepare pure or mixed states, decide whether the two states are related by a group action. This can be seen as a quantum state version of the Hidden Shift Problem, in much the same way that the State Hidden Subgroup Problem is a quantum version of the ordinary Hidden Subgroup Problem. We prove several results for this computational problem: - For the pure-state version, we show that the problem is BQP-hard for all nontrivial groups, and contained in QCMA $\cap$ QCSZK. We further obtain refined results for specific groups of interest: for abelian groups we show that the problem reduces to the state hidden subgroup problem over the generalized dihedral group; for the Clifford group, the problem is at least as hard as Graph Isomorphism under polynomial-time reductions; for the Pauli group it is BQP-complete. - For the mixed-state version, for nontrivial, finite and efficiently representable groups, the problem is QSZK-complete. - We also study a variant of this problem over an infinite group, in particular, the bosonic linear optical unitaries. We show that in the setting where the classical description of the quantum state is given in a suitable wave function representation known as the stellar representation, the problem is at least as hard as Graph Isomorphism, and is contained in NP $\cap$ SZK. Prior to our work, state isomorphism problems had only been studied for the symmetric group [LG17]. As a consequence of our results, we resolve an open question posed in [HEC25] about the existence of a quantum algorithm for the abelian state hidden subgroup problem on mixed states. We show that this problem is QSZK-hard in the worst case, thereby ruling out an efficient quantum algorithm unless QSZK = BQP.

Topology-Preserving Neural Operator Learning via Hodge Decomposition

from arXiv: Computational Geometry

Authors: Dongzhe Zheng, Tao Zhong, Christine Allen-Blanchette

In this paper, we study solution operators of physical field equations on geometric meshes from a function-space perspective. We reveal that Hodge orthogonality fundamentally resolves spectral interference by isolating unlearnable topological degrees of freedom from learnable geometric dynamics, enabling an additive approximation confined to structure-preserving subspaces. Building on Hodge theory and operator splitting, we derive a principled operator-level decomposition. The result is a Hybrid Eulerian-Lagrangian architecture with an algebraic-level inductive bias we call Hodge Spectral Duality (HSD). In our framework, we use discrete differential forms to capture topology-dominated components and an orthogonal auxiliary ambient space to represent complex local dynamics. Our method achieves superior accuracy and efficiency on geometric graphs with enhanced fidelity to physical invariants. Our code is available at github.com/ContinuumCoder/Hodge-Spectral-Duality

Authors: Dongzhe Zheng, Tao Zhong, Christine Allen-Blanchette

In this paper, we study solution operators of physical field equations on geometric meshes from a function-space perspective. We reveal that Hodge orthogonality fundamentally resolves spectral interference by isolating unlearnable topological degrees of freedom from learnable geometric dynamics, enabling an additive approximation confined to structure-preserving subspaces. Building on Hodge theory and operator splitting, we derive a principled operator-level decomposition. The result is a Hybrid Eulerian-Lagrangian architecture with an algebraic-level inductive bias we call Hodge Spectral Duality (HSD). In our framework, we use discrete differential forms to capture topology-dominated components and an orthogonal auxiliary ambient space to represent complex local dynamics. Our method achieves superior accuracy and efficiency on geometric graphs with enhanced fidelity to physical invariants. Our code is available at https://github.com/ContinuumCoder/Hodge-Spectral-Duality