OPML feed of all feeds.

Subscribe to the Atom feed or RSS feed to stay up to date.

Thank you to arXiv for use of its open access interoperability.

Note: the date of arXiv entries announced right after publication holidays might incorrectly show up as the date of the publication holiday itself. This is due to our ad hoc method of inferring announcement dates, which are not returned by the arXiv API.

Powered by Pluto.

Source on GitHub.

Maintained by Nima Anari, Arnab Bhattacharyya, Gautam Kamath.

Theory of Computing Report

Sunday, March 03

TR24-042 | Pebble Games and Algebraic Proof Systems Meet Again | Jacobo Toran, Lisa Jaser

from ECCC Papers

Analyzing refutations of the well-known pebbling formulas, we prove some new strong connections between pebble games and algebraic proof systems, showing that there is a parallelism between the reversible, black and black-white pebbling games on one side, and the three algebraic proof systems NS, MC and PC on the other side. In particular we prove: \begin{itemize} \item For any DAG $G$ with a single sink, if there is a Monomial Calculus refutation for $Peb(G)$ having simultaneously degree $s$ and size $t$ then there is a black pebbling strategy on $G$ with space $s$ and time $t+s$. Also, if there is a black pebbling strategy for $G$ with space $s$ and time $t$, it is possible to extract from it an MC refutation for $Peb(G)$ having simultaneously degree $s$ and size $2t(s-1)$. These results are analogous to those proven in [de Rezende et al. 21] for the case of reversible pebbling and Nullstellensatz. Using them we prove degree separations between NS and MC as well as strong degree-size tradeoffs for MC. \item We show that the variable space needed for the refutation of pebbling formulas in Polynomial Calculus exactly coincides with the black-white pebbling number of the corresponding graph. One direction of this result was known; we present a new elementary proof of it. \item We show that for any unsatisfiable CNF formula $F$, the variable space in a Resolution refutation of the formula is a lower bound for the monomial space in a PCR refutation of the extended formula $F[\oplus]$. This implies that for any DAG $G$, the monomial space needed in a PCR refutation of the XOR pebbling formula is lower bounded by the black-white pebbling number of the corresponding graph. This affirmatively solves Open Problem 7.11 from [Buss Nordström 21]. \item The last result also proves a strong separation between degree and monomial space in PCR of size $\Omega(\frac{n}{\log n})$, with the additional property that it is independent of the field characteristic. This question was posed in [Filmus et al. 13]. \end{itemize}

Saturday, March 02

Two Professorships (open rank) interfacing classical Network Science and Graph Machine Learning at Goethe University Frankfurt, Center for Critical Computational Studies (C3S) (apply by April 2, 2024)

from CCI: jobs

Ideal candidates will possess a robust cross-disciplinary profile or interest, demonstrating not only expertise within their specific domains but also a genuine openness to collaborative, cross-disciplinary work. While the primary focus is on researchers with a strong background in network science, computer science or mathematics, other degrees in suitable application areas are also welcome.

Website: https://www.c3s-frankfurt.de/workshop
Email: office@c3s.uni-frankfurt.de

By shacharlovett

Postdoc at Yale University (apply by March 15, 2024)

from CCI: jobs

A postdoc position is available, focusing on emerging problems in the areas of the foundations of AI (specifically, extending diffusion models or understanding/evaluating LLMs) or in the area of responsible AI (specifically, fairness and privacy).

Submit CV, research statement, and 3 letters by March 15, 2024.

Strong math background, with experience in modeling and empirical work required.

Website: https://www.cs.yale.edu/homes/vishnoi/Home.html
Email: nisheeth.vishnoi@gmail.com

By shacharlovett

Sum(m)it280 – Frankl, Füredi, Győri, and Pach are 70

from CS Theory Events

July 8-12, 2024, Budapest, Hungary
https://conferences.renyi.hu/summit280/home
Submission deadline: March 31, 2024
Registration deadline: May 15, 2024

In 2024, Péter Frankl, Zoltán Füredi, Ervin Győri and János Pach will turn 70. On the occasion of this joyful event, we organize a conference, Sum(m)it280. We would like to invite you to celebrate these four Hungarian combinatorialists with us.

By shacharlovett

Friday, March 01

TR24-041 | Launching Identity Testing into (Bounded) Space | Nikhil Gupta, Pranav Bisht, Prajakta Nimbhorkar, Ilya Volkovich

from ECCC Papers

In this work, we initiate the study of the space complexity of the Polynomial Identity Testing problem (PIT). First, we observe that the majority of the existing (time-)efficient ``blackbox'' PIT algorithms already give rise to space-efficient ``whitebox'' algorithms for the respective classes of arithmetic formulas via a space-efficient arithmetic formula evaluation procedure. Among other things, we observe that the results of Minahan-Volkovich (ACM Transactions on Computation Theory, 2018), Gurjar et al. (Theory of Computing, 2017) and Agrawal et al. (SIAM Journal of Computing, 2016) imply logspace PIT algorithms for read-once formulas, constant-width read-once oblivious branching programs, and bounded-transcendence-degree depth-3 circuits, respectively. However, since the best known blackbox PIT algorithms for the class of multilinear read-$k$ formulas run in quasi-polynomial time, as shown in Anderson et al. (Computational Complexity, 2015), our previous observation only yields an $O(\log^2 n)$-space whitebox PIT algorithm. Our main result, thus, is the first $O(\log n)$-space PIT algorithm for multilinear read-twice formulas. We also extend this result to test if a given read-twice formula is equal to a given read-once formula. Our technical contributions include the development of a space-efficient measure $\mu_\ell$ which is ``distilled'' from the result of Anderson et al. (Computational Complexity, 2015) and can be used to reduce PIT for a read-$k$ formula to PIT for a sum of two read-$(k-1)$ formulas, in logarithmic space. In addition, we show how to combine a space-efficient blackbox PIT algorithm for read-$(k-1)$ formulas together with a space-efficient whitebox PIT algorithm for read-$k$ formulas to test if a given read-$k$ formula is equal to a given read-$(k-1)$ formula.

Dominic Welsh, 1938–2023

from Richard Lipton

My doctoral thesis advisor (photo source: Geoffrey Grimmett)

Dominic Welsh passed away last November 30th. He was my doctoral advisor 1981–86 at Merton College, Oxford University, and a giant in several fields of combinatorial and applied mathematics.

Today I remember Dominic and describe his late-career influence on a modern problem: How “natural” is bounded-error quantum polynomial time?

Among several memorials by colleagues and friends, there is now a detailed tribute on the Matroid Union blog by my fellow student and Oxford officemate Graham Farr with fellow students Dillon Mayhew and James Oxley. Graham lays out the full variety of Dominic’s career, beginning with his work on discrete probability and then going on to matroid theory—work that led him to write the definitive text Matroid Theory already in 1976.

I first met Dominic on a visit to Oxford in August 1980, on my way back from playing chess in Norway. This was a month before the start of my senior year at Princeton and my decision to apply for a Marshall Scholarship. I was beginning an undergraduate thesis on algebraic combinatorics supervised by Doug West and Harold Kuhn. Although I had attended some talks on computational complexity, including one by Eugene Luks on isomorphism for graphs of bounded degree being in {\mathsf{P}}, I had not thought of doing complexity as a research area in itself until I arrived at Merton College in September 1981. Dominic was so enthusiastic about complexity that I never heard much about matroid theory from him; ironically I have come to that subject only fairly recently.

Computational Complexity and Physical Systems

Dominic’s view on complexity came from statistical physics, a subject proceeding from his own doctoral thesis in 1964 under John Hammersley. He was most interested in counting problems. Leslie Valiant had recently introduced the complexity class {\mathsf{\#P}} and shown that computing the permanent of a matrix is {\mathsf{\#P}}-complete, hence {\mathsf{NP}}-hard. The problems of counting satisfying assignments, graph 3-colorings, Hamiltonian cycles, and perfect matchings in bipartite graphs are also {\mathsf{\#P}}-complete.

The first three are counting versions of {\mathsf{NP}}-complete decision problems, but the decision version of the last—does a graph have a perfect matching?—belongs to {\mathsf{P}}. The counting problem {\mathsf{\#2SAT}} for clause size {k=2} is {\mathsf{\#P}}-complete, even though {\mathsf{2SAT}} is in {\mathsf{P}}.

What fascinated Dominic even more was that instances of the hard problems can often be structured with a natural parameter {p} so that the complexity transitions from easy to hard in a narrow critical region as {p} varies. The classic example is the ratio of clauses to variables in {k}-SAT instances and other forms of constraint satisfaction. A diagram for 3SAT in a 2013 paper by Marcelo Finger and Poliana Reis shows both the probability of being satisfiable and the time taken by an exhaustive algorithm.
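To see the transition concretely, here is a minimal brute-force sketch (my illustration, not taken from the post or from that paper) that estimates the probability a random 3SAT instance with {n} variables and {m} clauses is satisfiable at several ratios {m/n}; for larger {n} the drop concentrates near a ratio of roughly 4.27.

import random
from itertools import product

def random_3sat(n_vars, n_clauses, rng):
    # Each clause picks 3 distinct variables and negates each with probability 1/2.
    return [[v if rng.random() < 0.5 else -v
             for v in rng.sample(range(1, n_vars + 1), 3)]
            for _ in range(n_clauses)]

def satisfiable(n_vars, clauses):
    # Exhaustive search over all assignments; only sensible for small n_vars.
    for bits in product([False, True], repeat=n_vars):
        if all(any(bits[abs(lit) - 1] == (lit > 0) for lit in clause)
               for clause in clauses):
            return True
    return False

rng = random.Random(0)
n, trials = 12, 100
for ratio in (2.0, 3.0, 4.0, 4.3, 5.0, 6.0):
    sat = sum(satisfiable(n, random_3sat(n, int(ratio * n), rng))
              for _ in range(trials))
    print(f"m/n = {ratio:.1f}: Pr[satisfiable] ~ {sat / trials:.2f}")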

For graph coloring and Hamiltonian cycles one can use parameters that govern the distribution for drawing random graph instances. The meta-question here is:

If in fact {\mathsf{P=NP}}, then what accounts for this observed phenomenon of sharp phase transition in evident hardness?

See this 2012 talk by Cris Moore with hints on how further phase phenomena could possibly unlock {\mathsf{P \neq NP}}. Stepping back to the early 1980s, all this raised early hope that the theory of physical systems, such as the Ising model, could help surmount barriers to complexity lower bounds that were already being felt.

Percolation and a Story

The world has just been through a long engagement with a phase-transition parameter, the reproduction number {R_0} of an epidemic. It is normalized so that if {R_0 < 1} then the infection will be contained in ideal circumstances, but if {R_0 > 1} it will almost certainly pervade the population.
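As a toy illustration of the threshold (my sketch, not from the post), one can model an outbreak as a branching process in which each case infects a Poisson({R_0}) number of new cases. Below {R_0 = 1} the process dies out with probability {1}; above it, a positive fraction of outbreaks escape.

import numpy as np

def dies_out(r0, rng, cap=10_000):
    # Branching process: each active case infects Poisson(r0) new cases.
    current, total = 1, 1
    while current > 0 and total < cap:
        current = int(rng.poisson(r0, size=current).sum())
        total += current
    return total < cap  # True if the outbreak went extinct before hitting the cap

rng = np.random.default_rng(0)
for r0 in (0.8, 1.0, 1.2, 1.5):
    extinct = sum(dies_out(r0, rng) for _ in range(2000)) / 2000
    print(f"R0 = {r0}: fraction of outbreaks that die out ~ {extinct:.2f}")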

Dominic introduced me to related examples in percolation theory that were not causing actual disasters. Open problems emerge almost immediately. Consider {n\times n} square lattices in which every node is independently colored black with probability {p}. There is a critical value {p_c} such that when {p < p_c}, the probability of having a path of black nodes span the lattice goes to {0} sharply as {n \rightarrow \infty}, but for {p > p_c} the probability of a spanning black cluster is nonzero and ramps up quickly. This value has been empirically computed as {0.59274621 \pm 0.00000013} but is not known analytically.
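Here is a small Monte Carlo sketch (mine, not from the post) that estimates the spanning probability for site percolation on an {n\times n} lattice; even at modest {n}, the crossing probability climbs steeply around {p \approx 0.593}.

import random
from collections import deque

def spans(n, p, rng):
    # Does a top-to-bottom path of open (black) sites exist in an n-by-n lattice?
    open_site = [[rng.random() < p for _ in range(n)] for _ in range(n)]
    queue = deque((0, c) for c in range(n) if open_site[0][c])
    seen = set(queue)
    while queue:
        r, c = queue.popleft()
        if r == n - 1:
            return True
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < n and 0 <= nc < n and open_site[nr][nc] and (nr, nc) not in seen:
                seen.add((nr, nc))
                queue.append((nr, nc))
    return False

rng = random.Random(1)
n, trials = 64, 200
for p in (0.55, 0.58, 0.593, 0.61, 0.64):
    hits = sum(spans(n, p, rng) for _ in range(trials))
    print(f"p = {p:.3f}: Pr[spanning cluster] ~ {hits / trials:.2f}")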

(Image source: Paul Heitjans)

I was duly warned that analytic bounds on {p_c} for other families of grids were as hard to do as complexity lower bounds. Dominic and Paul Seymour had made important progress on the problem for edges in the square lattice, which had helped Harry Kesten prove {p_c = 0.5} for them in 1980. (See also this 2022 paper.) Thresholds for many other lattice structures have been computed empirically but remain open analytically.

My strongest memory of percolation in my first year had a different purpose. Much work was being done by physicists whose level of rigor was below standard. Dominic set me to probe one such paper, and I confirmed his suspicion of a major gap in the main result. Rather than send a letter, he invited the author over to Oxford, and after suitable entertainment at Fellows Lunch, brought him to meet me at the Mathematical Institute. Having prepped the approach with me beforehand, and with his usual twinkle in the eyes, he invited the author to lay out his proof. I posed questions that led to the flaw, and the poor chap was duly flustered—but appreciative at the same time. At least he was treated in best fashion.

Complexity

I took the mindset, however, that hope of progress in complexity needed not only greater rigor but a full embrace of logic and formal structures. I was primed to be caught up in the Structural Complexity movement of the 1980s, which enshrined the field’s main conference until it was renamed the “Computational Complexity Conference” in 1996. Among papers I studied on the possible independence of {\mathsf{P}} versus {\mathsf{NP}} from strong formal systems was this by Dick with Rich DeMillo.

This went outside Dominic’s interests, but his steady and kindly hand was important especially through a tumultuous fourth year for me. During that fourth year, we co-taught a course on complexity, communication, and coding theory, which figured into his 1988 textbook Codes and Cryptography. Here is the end of its preface:



As for doing something about “hieroglyphic handwriting,” I brought in the inexpensive PC-based typesetting system {T^3} (now Scientific Word) with the generous support of the Mathematical Institute, where it was housed in room T-3. I manually improved all the {24 \times 18} pixel math-and-language fonts, which had evidently been digitized rather than crafted, so I joked that I had “written” a dozen dissertations before I finished mine the next year.

My camera, Merton College, Oxford, October 1986

Capturing BQP

Between then and Dominic’s 2005 retirement, he branched into new strands of complexity involving graph and knot polynomials, which grew into and beyond his 1993 book Complexity: Knots, Colorings, and Countings. (Although this is by Cambridge University Press, I have added an Oxford comma to the title.) This fostered a connection to quantum computing via the Jones polynomial of knot theory. The basic relation was discovered by Michael Freedman, Alexei Kitaev, Michael Larsen, and Zhenghan Wang in their foundational 2003 paper “Topological Quantum Computing”:

Theorem 1 (paraphrase) In time polynomial in the size {s} of a given quantum circuit {C}, an argument {x} to {C}, and {\log(1/\epsilon)}, we can create a knot link {L} and a simple formula {g_L} such that

\displaystyle  \left|\Pr[C(x)=1] - g_L(V_L(e^{2\pi i/5}))\right| < \epsilon,

where {V_L} is the Jones polynomial of {L}.

In contrast to polynomial translations I’ve discussed here and here, there is only one evaluation of the Jones polynomial—and at a “magic” fifth root of unity.

Now Dominic—with his student Dirk Vertigan and with François Jaeger of Grenoble—showed that evaluating {V_L} at any point other than a {2,3,4,6}th root of unity is {\mathsf{\#P}}-hard given general {L}. Many other problems that express the evaluation of quantum circuits are likewise {\mathsf{\#P}}-hard, including some of the special cases shown here. This raises a natural meta-question:

Is it possible to simulate {\mathsf{BQP}} via a natural computational problem that is not (known to be) {\mathsf{\#P}}-hard on its full natural domain of instances?

There is a case to be made for a categorical no answer. The dichotomy phenomenon is that whole classes {C} of natural functions {f} in {\mathsf{\#P}} have the property that every {f \in C} is either in {\mathsf{P}} or is {\mathsf{\#P}}-complete. Now {\mathsf{BQP}} is believed to be intermediate between {\mathsf{P}} and {\mathsf{PP}}, the latter being the language-class peer of {\mathsf{\#P}}. But dichotomy seems to leave no room for a natural capture of {\mathsf{BQP}} that stays in-between. This is besides the observation from structural complexity theory that insofar as {\mathsf{BQP}} is a “promise” class, it is unlikely to have a complete decision problem, nor ipso facto be captured by simple function computations.

An Answer

Dominic’s 2005 paper with Freedman, Laci Lovász, and Dominic’s student Magnus Bordewich gave—among several results—a task that exactly captures {\mathsf{BQP}}. This required a clever new condition of approximation for numerical functions {f}. The most familiar ones require computing (deterministically or with high probability) a value {y} such that

\displaystyle  f(x) - \epsilon f(x) \leq y \leq f(x) + \epsilon f(x),

where for every prescribed {\epsilon > 0}, the running time is polynomial in {|x|}. If the time is also polynomial in {\frac{1}{\epsilon}}, one speaks of a fully polynomial approximation scheme (an FPRAS in the randomized case).

A rub with this is that the most familiar numerical approaches to simulating quantum circuits {C} make the acceptance probability (or variously, the amplitude) have the form

\displaystyle  \Pr[C(x)=1] = \frac{f_1(x) - f_2(x)}{R},

where {f_1} and {f_2} are fully approximable but the difference {f_1(x) - f_2(x)} is small, typically of order {(f_1(x) + f_2(x))^{1/2}}. This “multiplicative” approximation property does not carry through to the difference.

Their additive approximation scheme takes an auxiliary function {u(x)} and seeks to compute {y} such that

\displaystyle  f(x) - \epsilon u(x) \leq y \leq f(x) + \epsilon u(x). \ \ \ \ \ (1)

Now if {f_1(x)} and {f_2(x)} are approximable with “normalizer” {u(x)} in this sense, then so is {f_1(x) - f_2(x)} with normalizer {2u(x)}. This works even if the difference is relatively tiny as above, provided {u(x)} is chosen appropriately. It still took more cleverness and an appeal to knot theory to make the idea work for {\mathsf{BQP}}:
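To spell out the check: if {y_1} and {y_2} satisfy (1) for {f_1} and {f_2} with the same normalizer {u(x)}, then by the triangle inequality

\displaystyle  |(y_1 - y_2) - (f_1(x) - f_2(x))| \leq |y_1 - f_1(x)| + |y_2 - f_2(x)| \leq 2\epsilon u(x),

so {y_1 - y_2} additively approximates {f_1(x) - f_2(x)} with normalizer {2u(x)}.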

Theorem 2 (paraphrase) Let {A} be an oracle function that takes as input {\epsilon} and a knot link {L}, where {L} is given as the plat closure of a braid of string size {m}, and returns an additive approximation of {V_L(e^{2\pi i/5})} within {\epsilon} times the normalizer {(2\cos\frac{\pi}{5})^{m/2}}. Then {\mathsf{BQP = P^A}}.

Here {L} is the “{x}” in the definition of additive approximation. In words, the task of generating such approximations of the Jones polynomial at a fifth root of unity is equivalent in power to {\mathsf{BQP}}, allowing only deterministic polynomial time computation otherwise. As the tribute co-written by Graham Farr puts it,

Dominic collaborated with Bordewich, Freedman and Lovász on an important paper (2005) showing that an additive approximation (which is weaker than an FPRAS) to a certain Tutte polynomial evaluation (related to the Jones polynomial) is sufficient to capture the power of quantum computation. (emphasis added)

Dominic had much more engagement with quantum computing, including his co-supervision with Artur Ekert of Michele Mosca, who went on to the University of Waterloo and was one of several who welcomed Dominic there for an honorary doctorate in 2006.

Open Problems

The Matroid Union tribute mentions numerous conjectures arising from Dominic’s work: proved, disproved, and still open. One of the proved ones is featured in the citation for the 2022 Fields Medal awarded to June Huh of Princeton University. I’ve tried to lay out motivations for the open ones having important ramifications—the tribute gives links by which to pursue them.

Our condolences to Bridget, their family, and all who were graced to know Dominic over the years. A memorial service and tea reception afterward will be held at Merton College on June 1, 3:00pm UK time (with livestream).


[added Mathematics Genealogy link to Dominic’s students at top; fixed that Farr’s first section is on discrete probability not matroid theory]

By KWRegan

On Efficient Computation of DiRe Committees

from arXiv: Computational Complexity

Authors: Kunal Relia

Consider a committee election consisting of (i) a set of candidates who are divided into arbitrary groups each of size \emph{at most} two and a diversity constraint that stipulates the selection of \emph{at least} one candidate from each group and (ii) a set of voters who are divided into arbitrary populations each approving \emph{at most} two candidates and a representation constraint that stipulates the selection of \emph{at least} one candidate from each population who has a non-null set of approved candidates. The DiRe (Diverse + Representative) committee feasibility problem (a.k.a. the minimum vertex cover problem on unweighted undirected graphs) concerns the determination of the smallest size committee that satisfies the given constraints. Here, for this problem, we discover an unconditional deterministic polynomial-time algorithm that is an amalgamation of maximum matching, breadth-first search, maximal matching, and local minimization.

The Power of Unentangled Quantum Proofs with Non-negative Amplitudes

from arXiv: Computational Complexity

Authors: Fernando Granha Jeronimo, Pei Wu

Quantum entanglement is a fundamental property of quantum mechanics and plays a crucial role in quantum computation and information. We study entanglement via the lens of computational complexity by considering quantum generalizations of the class NP with multiple unentangled quantum proofs, the so-called QMA(2) and its variants. The complexity of QMA(2) is a longstanding open problem, and only the trivial bounds QMA $\subseteq$ QMA(2) $\subseteq$ NEXP are known. In this work, we study the power of unentangled quantum proofs with non-negative amplitudes, a class which we denote $\text{QMA}^+(2)$. In this setting, we are able to design proof verification protocols for problems both using logarithmic size quantum proofs and having a constant probability gap in distinguishing yes from no instances. In particular, we design global protocols for small set expansion, unique games, and PCP verification. As a consequence, we obtain NP $\subseteq \text{QMA}^+_{\log}(2)$ with a constant gap. By virtue of the new constant gap, we are able to ``scale up'' this result to $\text{QMA}^+(2)$, obtaining the full characterization $\text{QMA}^+(2)$=NEXP by establishing stronger explicitness properties of the PCP for NEXP. One key novelty of these protocols is the manipulation of quantum proofs in a global and coherent way yielding constant gaps. Previous protocols (only available for general amplitudes) are either local having vanishingly small gaps or treat the quantum proofs as classical probability distributions requiring polynomially many proofs thereby not implying non-trivial bounds on QMA(2). Finally, we show that QMA(2) is equal to $\text{QMA}^+(2)$ provided the gap of the latter is a sufficiently large constant. In particular, if $\text{QMA}^+(2)$ admits gap amplification, then QMA(2)=NEXP.

Spectral Meets Spatial: Harmonising 3D Shape Matching and Interpolation

from arXiv: Computational Geometry

Authors: Dongliang Cao, Marvin Eisenberger, Nafie El Amrani, Daniel Cremers, Florian Bernard

Although 3D shape matching and interpolation are highly interrelated, they are often studied separately and applied sequentially to relate different 3D shapes, thus resulting in sub-optimal performance. In this work we present a unified framework to predict both point-wise correspondences and shape interpolation between 3D shapes. To this end, we combine the deep functional map framework with classical surface deformation models to map shapes in both spectral and spatial domains. On the one hand, by incorporating spatial maps, our method obtains more accurate and smooth point-wise correspondences compared to previous functional map methods for shape matching. On the other hand, by introducing spectral maps, our method gets rid of commonly used but computationally expensive geodesic distance constraints that are only valid for near-isometric shape deformations. Furthermore, we propose a novel test-time adaptation scheme to capture both pose-dominant and shape-dominant deformations. Using different challenging datasets, we demonstrate that our method outperforms previous state-of-the-art methods for both shape matching and interpolation, even compared to supervised approaches.

Statistical Estimation in the Spiked Tensor Model via the Quantum Approximate Optimization Algorithm

from arXiv: Data Structures and Algorithms

Authors: Leo Zhou, Joao Basso, Song Mei

The quantum approximate optimization algorithm (QAOA) is a general-purpose algorithm for combinatorial optimization. In this paper, we analyze the performance of the QAOA on a statistical estimation problem, namely, the spiked tensor model, which exhibits a statistical-computational gap classically. We prove that the weak recovery threshold of $1$-step QAOA matches that of $1$-step tensor power iteration. Additional heuristic calculations suggest that the weak recovery threshold of $p$-step QAOA matches that of $p$-step tensor power iteration when $p$ is a fixed constant. This further implies that multi-step QAOA with tensor unfolding could achieve, but not surpass, the classical computation threshold $\Theta(n^{(q-2)/4})$ for spiked $q$-tensors. Meanwhile, we characterize the asymptotic overlap distribution for $p$-step QAOA, finding an intriguing sine-Gaussian law verified through simulations. For some $p$ and $q$, the QAOA attains an overlap that is larger by a constant factor than the tensor power iteration overlap. Of independent interest, our proof techniques employ the Fourier transform to handle difficult combinatorial sums, a novel approach differing from prior QAOA analyses on spin-glass models without planted structure.

Higher-Order Networks Representation and Learning: A Survey

from arXiv: Data Structures and Algorithms

Authors: Hao Tian, Reza Zafarani

Network data has become widespread, larger, and more complex over the years. Traditional network data is dyadic, capturing the relations among pairs of entities. With the need to model interactions among more than two entities, significant research has focused on higher-order networks and ways to represent, analyze, and learn from them. There are two main directions to studying higher-order networks. One direction has focused on capturing higher-order patterns in traditional (dyadic) graphs by changing the basic unit of study from nodes to small frequently observed subgraphs, called motifs. As most existing network data comes in the form of pairwise dyadic relationships, studying higher-order structures within such graphs may uncover new insights. The second direction aims to directly model higher-order interactions using new and more complex representations such as simplicial complexes or hypergraphs. Some of these models have long been proposed, but improvements in computational power and the advent of new computational techniques have increased their popularity. Our goal in this paper is to provide a succinct yet comprehensive summary of the advanced higher-order network analysis techniques. We provide a systematic review of its foundations and algorithms, along with use cases and applications of higher-order networks in various scientific domains.

Total Completion Time Scheduling Under Scenarios

from arXiv: Data Structures and Algorithms

Authors: Thomas Bosman, Martijn van Ee, Ekin Ergen, Csanad Imreh, Alberto Marchetti-Spaccamela, Martin Skutella, Leen Stougie

Scheduling jobs with given processing times on identical parallel machines so as to minimize their total completion time is one of the most basic scheduling problems. We study interesting generalizations of this classical problem involving scenarios. In our model, a scenario is defined as a subset of a predefined and fully specified set of jobs. The aim is to find an assignment of the whole set of jobs to identical parallel machines such that the schedule, obtained for the given scenarios by simply skipping the jobs not in the scenario, optimizes a function of the total completion times over all scenarios. While the underlying scheduling problem without scenarios can be solved efficiently by a simple greedy procedure (SPT rule), scenarios, in general, make the problem NP-hard. We paint an almost complete picture of the evolving complexity landscape, drawing the line between easy and hard. One of our main algorithmic contributions relies on a deep structural result on the maximum imbalance of an optimal schedule, based on a subtle connection to Hilbert bases of a related convex cone.
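For context, the SPT rule mentioned above simply sorts the jobs by processing time and repeatedly assigns the next job to the machine that becomes free first. A minimal sketch of this classical rule (my illustration, not the paper's code), assuming the standard identical-machines model:

import heapq

def spt_total_completion_time(processing_times, num_machines):
    # Shortest Processing Time first on identical parallel machines.
    # Returns the sum of completion times, which is optimal for this
    # objective in the scenario-free setting.
    machines = [0.0] * num_machines           # current finishing time of each machine
    heapq.heapify(machines)
    total = 0.0
    for p in sorted(processing_times):        # SPT order
        finish = heapq.heappop(machines) + p  # place the job on the earliest-free machine
        total += finish
        heapq.heappush(machines, finish)
    return total

print(spt_total_completion_time([3, 1, 4, 1, 5, 9, 2, 6], num_machines=2))  # 56.0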

Edit and Alphabet-Ordering Sensitivity of Lex-parse

from arXiv: Data Structures and Algorithms

Authors: Yuto Nakashima, Dominik Köppl, Mitsuru Funakoshi, Shunsuke Inenaga, Hideo Bannai

We investigate the compression sensitivity [Akagi et al., 2023] of lex-parse [Navarro et al., 2021] for two operations: (1) single character edit and (2) modification of the alphabet ordering, and give tight upper and lower bounds for both operations. For both lower bounds, we use the family of Fibonacci words. For the bounds on edit operations, our analysis makes heavy use of properties of the Lyndon factorization of Fibonacci words to characterize the structure of lex-parse.

Computing Longest Common Subsequence under Cartesian-Tree Matching Model

from arXiv: Data Structures and Algorithms

Authors: Taketo Tsujimoto, Koki Shibata, Takuya Mieno, Yuto Nakashima, Shunsuke Inenaga

Two strings of the same length are said to Cartesian-tree match (CT-match) if their Cartesian-trees are isomorphic [Park et al., TCS 2020]. Cartesian-tree matching is a natural model that allows for capturing similarities of numerical sequences. Oizumi et al. [CPM 2022] showed that subsequence pattern matching under CT-matching model can be solved in polynomial time. This current article follows and extends this line of research: We present the first polynomial-time algorithm that finds the longest common subsequence under CT-matching of two given strings $S$ and $T$ of length $n$, in $O(n^6)$ time and $O(n^4)$ space for general ordered alphabets. We then show that the problem has a faster solution in the binary case, by presenting an $O(n^2 / \log n)$-time and space algorithm.
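As a quick illustration of the matching model (a sketch of the definition only, not the paper's LCS algorithm, and assuming the common leftmost-minimum tie-breaking convention):

def cartesian_shape(seq):
    # Shape of the Cartesian tree of seq, rooted at the leftmost minimum.
    if not seq:
        return None
    root = min(range(len(seq)), key=lambda i: seq[i])
    return (cartesian_shape(seq[:root]), cartesian_shape(seq[root + 1:]))

def ct_match(s, t):
    # Two equal-length sequences CT-match if their Cartesian trees are isomorphic.
    return len(s) == len(t) and cartesian_shape(s) == cartesian_shape(t)

print(ct_match([3, 1, 4], [9, 2, 7]))  # True: both have their minimum in the middle
print(ct_match([3, 1, 4], [1, 2, 3]))  # False: the tree shapes differ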

Rahmani Sort: A Novel Variant of Insertion Sort Algorithm with O(nlogn) Complexity

from arXiv: Data Structures and Algorithms

Authors: Mohammad Khalid Imam Rahmani

Various decision support systems implement Data Mining and Data Warehousing techniques for diving into the sea of data to find useful patterns of knowledge (pearls). Classification, regression, clustering, and many other algorithms are used to enhance the precision and accuracy of the decision process. So, there is scope for improving the response time of the decision process, especially in mission-critical operations. If data are ordered with a suitable and efficient sorting operation, the response time of the decision process can be minimized. Insertion sort is well suited for such applications due to its simple and straightforward logic and its dynamic nature, which fits list implementations. But it is slower than merge sort and quick sort. The main reasons it is slow are: first, a sequential search is used to find the position of the next key element in the sorted left subarray, and second, elements must be shifted one position to the right to accommodate the newly inserted element. Therefore, I propose a new algorithm that uses a binary search mechanism to find the sorted location of the next key item in the previously sorted left subarray much more quickly than the conventional insertion sort algorithm does. The actual running time of the new algorithm has been measured and compared with that of other conventional sorting algorithms besides insertion sort. The results obtained on various sample data show that the new algorithm performs better than the conventional insertion sort and merge sort algorithms.
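For orientation, the core idea described above, finding each insertion point by binary search inside the already-sorted prefix, looks roughly like the following sketch (my illustration, not the paper's exact procedure); note that in an array the shifting step still costs up to linear time per insertion, so the savings apply to comparisons.

from bisect import bisect_right

def binary_insertion_sort(items):
    # Insertion sort that locates each insertion point with binary search.
    a = list(items)
    for i in range(1, len(a)):
        key = a[i]
        pos = bisect_right(a, key, 0, i)  # insertion point in the sorted prefix a[:i]
        a[pos + 1:i + 1] = a[pos:i]       # shift the tail of the prefix right by one
        a[pos] = key
    return a

print(binary_insertion_sort([5, 2, 9, 1, 5, 6]))  # [1, 2, 5, 5, 6, 9]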

Efficient Processing of Subsequent Densest Subgraph Query

from arXiv: Data Structures and Algorithms

Authors: Chia-Yang Hung, Chih-Ya Shen

Dense subgraph extraction is a fundamental problem in graph analysis and data mining, aimed at identifying cohesive and densely connected substructures within a given graph. It plays a crucial role in various domains, including social network analysis, biological network analysis, recommendation systems, and community detection. However, extracting a subgraph with the highest node similarity remains largely unexplored. To address this problem, we studied the Member Selection Problem and extended it with a dynamic constraint variant. By incorporating dynamic constraints, our algorithm can adapt to changing conditions or requirements, allowing for more flexible and personalized subgraph extraction. This approach enables the algorithm to provide tailored solutions that meet specific needs, even in scenarios where constraints may vary over time. We also provide a theoretical analysis showing that our algorithm is a 1/3-approximation. Finally, experiments show that our algorithm is effective and efficient in tackling the member selection problem with dynamic constraints.

Thursday, February 29

Linkage for leap day

from David Eppstein

The mysterious math of billiards tables (\(\mathbb{M}\)), Dave Richeson in Quanta.

By David Eppstein

The Soothing Warmth of Vacuum Tubes

from Ben Recht

Going back to the very beginning of feedback to understand its future.

Coming back to this scatter plot of decision systems today, let’s jump all the way to the top-most corner:

What happens when you can act instantaneously with maximal impact? This is the realm of old-school electrical engineering and the feedback amplifier. Though it’s a bit of a stretch to think of circuits like this as decision systems, they highlight some fundamental issues in a very elementary way. And the mathematics is just middle school algebra (I know this because I ran the post by my fifth grader, and he was into it).

I have two circuits. The first circuit is a high-gain amplifier. This thing just takes inputs and makes them super loud. But I don’t know how loud the amplifier really is. On any given day, it might change its amplification level by a factor of two or more. And I’ve noticed that the amplifier is sometimes better at amplifying treble than bass, but sometimes it’s the opposite. The amp can make signals a thousand times louder, but it’s unreliable. Should I throw it out?

Maybe not. I have a second component, an attenuator, that takes signals and makes them slightly less loud. I can use the attenuator to make my unruly amplifier well-behaved by connecting the two parts in a feedback loop:

I can write what this circuit does as a few simple equations that link the inputs to the outputs. The amplifier is driven by the signal u. The voltage output of the amplifier is the amplifier gain times the input signal. This is the equation:

Vout = A · u

“A” is the amplifier gain I don’t know particularly well. The attenuator works by taking its input and reducing it by a factor of B. The corresponding equation of this effect is

z = B · Vout

Finally, the feedback interconnection rule subtracts z from the input voltage to produce the input to the amplifier:

u = Vin − z

We can combine these three equations into one, eliminating the variables u and z:

Vout = A · (Vin − B · Vout)

Now I can solve for Vout and get the final expression:

Vout = (A / (1 + A · B)) · Vin

Let’s stare at this formula for a bit to see what it implies. First, as we make A larger, the gain from Vin to Vout approaches an asymptote. For super large values of A, the gain is basically just 1/B. But this feedback loop is very insensitive to the actual value of A. Suppose the attenuation factor B is ½. Then, when A is large enough, the gain should be around a factor of 2. We can plug in specific numbers and see a remarkable range of stable behavior.  If A = 10,000, the gain from Vin to Vout is 1.9996. If A = 20,000, the gain is 1.9998. If A = 5,000, the gain is 1.9992. Over a huge range of open loop gains, from 400 to infinity, the gain of this system is within less than 1% of the ideal gain. Despite vast differences in open loop behavior, the closed loop behavior is predictably the same for all of these different amplifiers.
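A few lines of arithmetic reproduce those numbers (a quick sketch of the closed-loop formula above, with B fixed at 1/2):

def closed_loop_gain(A, B):
    # Gain of the negative-feedback loop: Vout / Vin = A / (1 + A * B).
    return A / (1 + A * B)

B = 0.5
for A in (400, 5_000, 10_000, 20_000):
    print(f"A = {A:>6}: closed-loop gain = {closed_loop_gain(A, B):.4f}")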

It’s even better than this. I told you earlier that the amplifier was temperamental and might change its amplification on any given day or might amplify different signals to different levels. Suppose that the amplifier amplifies one signal by 10,000 and one by 20,000. Both signals will be amplified by 2 when put through the feedback circuit. We mitigate all sorts of uncertainty in one component with the simple negative feedback law.

So what’s the catch? Though we are insensitive to the amplifier, we’re very sensitive to the system we’re using to control the amplifier. The output gain is very sensitive to the attenuation factor B. If the attenuator changes by a factor of 2 from ½ to ¼, then the closed loop gain changes from 2 to 4. The attenuation circuit can be precision manufactured if a precise amplification is necessary. But variability is often desirable, as you might have a knob that changes the attenuation value. This would give you a volume knob. Turn it up to 11. 

There are less obvious, more dangerous issues. The feedback equations I wrote above assume the feedback occurs instantaneously. This is why I said we’re in the upper right corner of the scatter plot. But what if we can’t act immediately and there are delays between measuring the output of the amplifier and computing the feedback signal? Rather than a nice and simple formula for amplifier gain, we get stuck with an equation that looks like

Vout(t) = A · (Vin(t) − B · Vout(t − delay))

This equation doesn’t have a clean analytic solution. Still, you can imagine what might happen: if Vin is changing at a rate comparable to the delay time, we might be effectively adding to the signal instead of subtracting. These errors can compound, and the signal might get amplified to arbitrarily high gains, ruining the amplifier. Control engineers are always wary of time delays and their pernicious effects, and the patches for dealing with them are quite complicated and non-elementary.
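To see the failure mode concretely, here is a crude discrete-time sketch (mine, with an assumed first-order smoothing step alpha standing in for the amplifier's internal dynamics). With no delay the loop settles near the ideal gain of 2; the same loop fed stale measurements blows up.

def run_loop(A=10_000, B=0.5, alpha=0.0003, delay=0, steps=200, v_in=1.0):
    # The output relaxes toward A * (v_in - B * delayed output); `delay` counts
    # how many steps old the fed-back measurement is. Both alpha and this step
    # model are illustrative assumptions, not from the post.
    v_out = [0.0] * (delay + 1)
    for _ in range(steps):
        target = A * (v_in - B * v_out[-1 - delay])
        v_out.append(v_out[-1] + alpha * (target - v_out[-1]))
    return v_out[-1]

print(f"no delay:     output settles at {run_loop(delay=0):.4f}")  # close to the ideal gain of ~2
print(f"3-step delay: output ends at {run_loop(delay=3):.3g}")     # diverges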

This simple amplifier example is instructive. By designing an actuator that acts instantaneously, the analysis becomes solving a simple equation. A barely quantified system can be made into a well-controlled, useful mechanism.  But there is no free lunch. Seemingly innocuous delays can bring the whole system off the rails.

Let me bring this back to decision making. The amplifier example is a bit contrived as we don’t think of electronic signal levels as “decisions.” But the ideas here generalize to a much wider set of general feedback rules. This general theory might be harder to explain in such clear terms, but let me attempt a summary in the next post.

By Ben Recht

TR24-040 | Randomness Extractors in $\mathrm{AC}^0$ and $\mathrm{NC}^1$: Optimal up to Constant Factors | Ruiyang Wu, Kuan Cheng

from ECCC Papers

We study extractors computable in uniform $\mathrm{AC}^0$ and uniform $\mathrm{NC}^1$. For the $\mathrm{AC}^0$ setting, we give a construction such that for every $k \ge n/ \mathrm{poly} \log n$ and $\epsilon \ge 2^{-\mathrm{poly} \log n}$, it can extract $(1-\gamma)k$ bits of randomness from an $(n, k)$ source for an arbitrary constant $\gamma$, with seed length $O(\log \frac{n}{\epsilon})$. The output length and seed length are optimal up to constant factors, matching the parameters of the best polynomial-time constructions such as [GUV09]. The range of $k$ and $\epsilon$ almost meets the lower bounds in [GVW15] and [CL18]. We also generalize the main lower bound of [GVW15] for extractors in $\mathrm{AC}^0$, showing that when $k < n/ \mathrm{poly} \log n$, even strong dispersers do not exist in $\mathrm{AC}^0$. For the $\mathrm{NC}^1$ setting, we also give a construction with seed length $O(\log \frac{n}{\epsilon})$ and a small constant fraction of entropy loss in the output. The construction works for every $k \ge O(\log^2 n)$ and $\epsilon \ge 2^{-O(\sqrt{k})}$. To our knowledge the previous best $\mathrm{NC}^1$ constructions are Trevisan's extractor [Tre01] and its improved version [RRV02], which have seed length $\mathrm{poly} \log \frac{n}{\epsilon}$. Our main techniques include a new error reduction process and a new output stretch process, based on low-depth circuit implementations of mergers from [DKSS13], condensers from [KT22], and somewhere extractors from [Ta-98].

TR24-039 | Optimal PSPACE-hardness of Approximating Set Cover Reconfiguration | Shuichi Hirahara, Naoto Ohsaka

from ECCC Papers

In the Minmax Set Cover Reconfiguration problem, given a set system $\mathcal{F}$ over a universe and its two covers $\mathcal{C}^\mathrm{start}$ and $\mathcal{C}^\mathrm{goal}$ of size $k$, we wish to transform $\mathcal{C}^\mathrm{start}$ into $\mathcal{C}^\mathrm{goal}$ by repeatedly adding or removing a single set of $\mathcal{F}$ while covering the universe in any intermediate state. Then, the objective is to minimize the maximum size of any intermediate cover during transformation. We prove that Minmax Set Cover Reconfiguration and Minmax Dominating Set Reconfiguration are PSPACE-hard to approximate within a factor of $2-\frac{1}{\mathrm{polyloglog} N}$, where $N$ is the size of the universe and the number of vertices in a graph, respectively, improving upon Ohsaka (SODA 2024) and Karthik C. S. and Manurangsi (2023). This is the first result that exhibits a sharp threshold for the approximation factor of any reconfiguration problem because both problems admit a $2$-factor approximation algorithm as per Ito, Demaine, Harvey, Papadimitriou, Sideri, Uehara, and Uno (Theor. Comput. Sci., 2011). Our proof is based on a reconfiguration analogue of the FGLSS reduction from Probabilistically Checkable Reconfiguration Proofs of Hirahara and Ohsaka (2024). We also prove that for any constant $\varepsilon \in (0,1)$, Minmax Hypergraph Vertex Cover Reconfiguration on $\mathrm{poly}(\varepsilon^{-1})$-uniform hypergraphs is PSPACE-hard to approximate within a factor of $2-\varepsilon$.

TR24-038 | Polynomial Calculus for Quantified Boolean Logic: Lower Bounds through Circuits and Degree | Olaf Beyersdorff, Kaspar Kasche, Luc Nicolas Spachmann

from ECCC Papers

We initiate an in-depth proof-complexity analysis of polynomial calculus (Q-PC) for Quantified Boolean Formulas (QBF). In the course of this we establish a tight proof-size characterisation of Q-PC in terms of a suitable circuit model (polynomial decision lists). Using this correspondence we show a size-degree relation for Q-PC, similar in spirit, yet different from the classic size-degree formula for propositional PC by Impagliazzo, Pudlák and Sgall (1999). We use the circuit characterisation together with the size-degree relation to obtain various new lower bounds on proof size in Q-PC. This leads to incomparability results for Q-PC systems over different fields.

DIMACS Tutorial on Fine-grained Complexity

from CS Theory Events

July 15-19, 2024 DIMACS, New Jersey, USA dimacs.rutgers.edu/events/details?eID=2764 Submission deadline: March 28, 2024 Registration deadline: March 25, 2024 DIMACS is organizing a tutorial in Fine-grained complexity in July 2024. The tutorial is primarily for graduate students working on topics in and around theoretical computer science (TCS) who are not already familiar with fine-grained complexity. Students … Continue reading DIMACS Tutorial on Fine-grained Complexity

By shacharlovett

Counting points with Riemann-Roch formulas

from arXiv: Computational Complexity

Authors: Jorge Martín-Morales

We provide an algorithm for computing the number of integral points lying in certain triangles that do not have integral vertices. We use techniques from Algebraic Geometry such as the Riemann-Roch formula for weighted projective planes and resolution of singularities. We analyze the complexity of the method and show that the worst case is given by the Fibonacci sequence. At the end of the manuscript a concrete example is developed in detail where the interplay with other invariants of singularity theory is also treated.
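For orientation only (this is the obvious brute-force baseline, not the paper's Riemann-Roch-based method, and the function name is mine), the quantity in question can be computed for small instances by scanning the bounding box of the triangle:

from math import floor, ceil

def lattice_points_in_triangle(p, q, r):
    # Count integer points in the closed triangle with vertices p, q, r (brute force).
    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])
    xs, ys = (p[0], q[0], r[0]), (p[1], q[1], r[1])
    count = 0
    for x in range(floor(min(xs)), ceil(max(xs)) + 1):
        for y in range(floor(min(ys)), ceil(max(ys)) + 1):
            d1, d2, d3 = cross(p, q, (x, y)), cross(q, r, (x, y)), cross(r, p, (x, y))
            # the point lies in the closed triangle iff the three orientations do not strictly disagree
            if not ((d1 < 0 or d2 < 0 or d3 < 0) and (d1 > 0 or d2 > 0 or d3 > 0)):
                count += 1
    return count

# A triangle with non-integral vertices: (0,0), (7/2,0), (0,5/2) contains 8 lattice points
print(lattice_points_in_triangle((0, 0), (3.5, 0), (0, 2.5)))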

Enhancing Roadway Safety: LiDAR-based Tree Clearance Analysis

from arXiv: Computational Geometry

Authors: Miriam Louise Carnot, Eric Peukert, Bogdan Franczyk

In the effort for safer roads, ensuring adequate vertical clearance above roadways is of great importance. Frequently, trees or other vegetation grow above the roads, blocking the sight of traffic signs and lights and posing a danger to traffic participants. Accurately estimating this space from simple images proves challenging due to a lack of depth information. This is where LiDAR technology, a laser scanning sensor that reveals a three-dimensional perspective, comes into play. Thus far, LiDAR point clouds at the street level have mainly been used for applications in the field of autonomous driving. These scans, however, also open up possibilities in urban management. In this paper, we present a new point cloud algorithm that can automatically detect those parts of the trees that grow over the street and need to be trimmed. Our system uses semantic segmentation to filter relevant points and downstream processing steps to create the required volume to be kept clear above the road. Challenges include obscured stretches of road, the noisy unstructured nature of LiDAR point clouds, and the assessment of the road shape. The identified points of non-compliant trees can be projected from the point cloud onto images, providing municipalities with a visual aid for dealing with such occurrences. By automating this process, municipalities can address potential road space constraints, enhancing safety for all. They may also save valuable time by carrying out the inspections more systematically. Our open-source code gives communities inspiration on how to automate the process themselves.

A One-step Image Retargeting Algorithm Based on Conformal Energy

from arXiv: Computational Geometry

Authors: Chengyang Liu, Michael K. Ng

The image retargeting problem is to find a proper mapping to resize an image to one with a prescribed aspect ratio, which is quite popular these days. In this paper, we propose an efficient and orientation-preserving one-step image retargeting algorithm based on minimizing the harmonic energy, which can well preserve the regions of interest (ROIs) and line structures in the image. We also give some mathematical proofs in the paper to ensure the well-posedness and accuracy of our algorithm.

On the Parameterized Complexity of Motion Planning for Rectangular Robots

from arXiv: Computational Geometry

Authors: Iyad Kanj, Salman Parsa

We study computationally-hard fundamental motion planning problems where the goal is to translate $k$ axis-aligned rectangular robots from their initial positions to their final positions without collision, and with the minimum number of translation moves. Our aim is to understand the interplay between the number of robots and the geometric complexity of the input instance measured by the input size, which is the number of bits needed to encode the coordinates of the rectangles' vertices. We focus on axis-aligned translations, and more generally, translations restricted to a given set of directions, and we study the two settings where the robots move in the free plane, and where they are confined to a bounding box. We obtain fixed-parameter tractable (FPT) algorithms parameterized by $k$ for all the settings under consideration. In the case where the robots move serially (i.e., one in each time step) and axis-aligned, we prove a structural result stating that every problem instance admits an optimal solution in which the moves are along a grid, whose size is a function of $k$, that can be defined based on the input instance. This structural result implies that the problem is fixed-parameter tractable parameterized by $k$. We also consider the case in which the robots move in parallel (i.e., multiple robots can move during the same time step), and which falls under the category of Coordinated Motion Planning problems. Finally, we show that, when the robots move in the free plane, the FPT results for the serial motion case carry over to the case where the translations are restricted to any given set of directions.

Fractional Linear Matroid Matching is in quasi-NC

from arXiv: Data Structures and Algorithms

Authors: Rohit Gurjar, Taihei Oki, Roshan Raj

The matching and linear matroid intersection problems are solvable in quasi-NC, meaning that there exist deterministic algorithms that run in polylogarithmic time and use quasi-polynomially many parallel processors. However, such a parallel algorithm is unknown for linear matroid matching, which generalizes both of these problems. In this work, we propose a quasi-NC algorithm for fractional linear matroid matching, which is a relaxation of linear matroid matching and commonly generalizes fractional matching and linear matroid intersection. Our algorithm builds upon the connection of fractional matroid matching to non-commutative Edmonds' problem recently revealed by Oki and Soma~(2023). As a corollary, we also solve black-box non-commutative Edmonds' problem with rank-two skew-symmetric coefficients.

Max-Cut with $ε$-Accurate Predictions

from arXiv: Data Structures and Algorithms

Authors: Vincent Cohen-Addad, Tommaso d'Orsi, Anupam Gupta, Euiwoong Lee, Debmalya Panigrahi

We study the approximability of the MaxCut problem in the presence of predictions. Specifically, we consider two models: in the noisy predictions model, for each vertex we are given its correct label in $\{-1,+1\}$ with some unknown probability $1/2 + \epsilon$, and the other (incorrect) label otherwise. In the more-informative partial predictions model, for each vertex we are given its correct label with probability $\epsilon$ and no label otherwise. We assume only pairwise independence between vertices in both models. We show how these predictions can be used to improve on the worst-case approximation ratios for this problem. Specifically, we give an algorithm that achieves an $\alpha + \widetilde{\Omega}(\epsilon^4)$-approximation for the noisy predictions model, where $\alpha \approx 0.878$ is the MaxCut threshold. While this result also holds for the partial predictions model, we can also give a $\beta + \Omega(\epsilon)$-approximation, where $\beta \approx 0.858$ is the approximation ratio for MaxBisection given by Raghavendra and Tan. This answers a question posed by Ola Svensson in his plenary session talk at SODA'23.

Polynomial-time approximation schemes for induced subgraph problems on fractionally tree-independence-number-fragile graphs

from arXiv: Data Structures and Algorithms

Authors: Esther Galby, Andrea Munaro, Shizhou Yang

We investigate a relaxation of the notion of fractional treewidth-fragility, namely fractional tree-independence-number-fragility. In particular, we obtain polynomial-time approximation schemes for meta-problems such as finding a maximum-weight sparse induced subgraph satisfying a given $\mathsf{CMSO}_2$ formula on fractionally tree-independence-number-fragile graph classes. Our approach unifies and extends several known polynomial-time approximation schemes on seemingly unrelated graph classes, such as classes of intersection graphs of fat objects in a fixed dimension or proper minor-closed classes. We also study the related notion of layered tree-independence number, a relaxation of layered treewidth, and its applications to exact subexponential-time algorithms.

Dynamic Deterministic Constant-Approximate Distance Oracles with $n^ε$ Worst-Case Update Time

from arXiv: Data Structures and Algorithms

Authors: Bernhard Haeupler, Yaowei Long, Thatchaphol Saranurak

We present a new distance oracle in the fully dynamic setting: given a weighted undirected graph $G=(V,E)$ with $n$ vertices undergoing both edge insertions and deletions, and an arbitrary parameter $\epsilon$ where $1/\log^{c} n<\epsilon<1$ and $c>0$ is a small constant, we can deterministically maintain a data structure with $n^{\epsilon}$ worst-case update time that, given any pair of vertices $(u,v)$, returns a $2^{{\rm poly}(1/\epsilon)}$-approximate distance between $u$ and $v$ in ${\rm poly}(1/\epsilon)\log\log n$ query time. Our algorithm significantly advances the state-of-the-art in two aspects, both for fully dynamic algorithms and even decremental algorithms. First, no existing algorithm with worst-case update time guarantees a $o(n)$-approximation while also achieving an $n^{2-\Omega(1)}$ update and $n^{o(1)}$ query time, while our algorithm offers a constant $O_{\epsilon}(1)$-approximation with $n^{\epsilon}$ update time and $O_{\epsilon}(\log \log n)$ query time. Second, even if amortized update time is allowed, it is the first deterministic constant-approximation algorithm with $n^{1-\Omega(1)}$ update and query time. The best result in this direction is the recent deterministic distance oracle by Chuzhoy and Zhang [STOC 2023] which achieves an approximation of $(\log\log n)^{2^{O(1/\epsilon^{3})}}$ with amortized update time of $n^{\epsilon}$ and query time of $2^{{\rm poly}(1/\epsilon)}\log n\log\log n$. We obtain the result by dynamizing tools related to length-constrained expanders [Haeupler-R\"acke-Ghaffari, STOC 2022; Haeupler-Hershkowitz-Tan, 2023; Haeupler-Huebotter-Ghaffari, 2022]. Our technique completely bypasses the 40-year-old Even-Shiloach tree, which has remained the most pervasive tool in the area but is inherently amortized.

On the enumeration of signatures of XOR-CNF's

from arXiv: Data Structures and Algorithms

Authors: Nadia Creignou, Oscar Defrain, Frédéric Olive, Simon Vilmin

Given a CNF formula $\varphi$ with clauses $C_1, \dots, C_m$ over a set of variables $V$, a truth assignment $\mathbf{a} : V \to \{0, 1\}$ generates a binary sequence $\sigma_\varphi(\mathbf{a})=(C_1(\mathbf{a}), \ldots, C_m(\mathbf{a}))$, called a signature of $\varphi$, where $C_i(\mathbf{a})=1$ if clause $C_i$ evaluates to 1 under assignment $\mathbf{a}$, and $C_i(\mathbf{a})=0$ otherwise. Signatures and their associated generation problems have given rise to new and promising research questions in algorithmic enumeration. In a recent paper, B\'erczi et al. interestingly proved that generating signatures of a CNF is tractable despite the fact that verifying a solution is hard. They also showed the hardness of finding maximal signatures of an arbitrary CNF due to the intractability of satisfiability in general. Their contribution leaves open the problem of efficiently generating maximal signatures for tractable classes of CNFs, i.e., those for which satisfiability can be solved in polynomial time. Stepping in that direction, we completely characterize the complexity of generating all, minimal, and maximal signatures for XOR-CNFs.
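To make the definition of a signature concrete (my own illustration, not code from the paper; clauses are encoded as lists of signed variable indices, so $(x_1 \lor \lnot x_2)$ becomes [1, -2]):

def signature(clauses, assignment):
    # assignment maps variable index -> 0/1; returns (C_1(a), ..., C_m(a))
    def clause_value(clause):
        return int(any(assignment[abs(lit)] == (1 if lit > 0 else 0) for lit in clause))
    return tuple(clause_value(c) for c in clauses)

# phi = (x1 v x2) & (~x1 v x2) & (~x2); the assignment x1=0, x2=1 has signature (1, 1, 0)
phi = [[1, 2], [-1, 2], [-2]]
print(signature(phi, {1: 0, 2: 1}))

For the XOR-CNFs studied in the paper, the inner any would be replaced by a parity of the literal values; the notion of signature is otherwise unchanged.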

Interval-Constrained Bipartite Matching over Time

from arXiv: Data Structures and Algorithms

Authors: Andreas Abels, Mariia Anapolska

The interval-constrained online bipartite matching problem frequently occurs in medical appointment scheduling: unit-time jobs representing patients arrive online and are assigned to a time slot within their given time interval. We consider a variant of this problem where reassignments are allowed and extend it by a notion of current time, which is decoupled from the job arrival events. As jobs appear, the current point in time gradually advances. Jobs that are assigned to the current time unit become processed, which fixes part of the matching and disables these jobs or slots for reassignments in future steps. We refer to these time-dependent restrictions on reassignments as the over-time property. We show that FirstFit with reassignments according to the shortest augmenting path rule is $\frac{2}{3}$-competitive with respect to the matching cardinality, and that the bound is tight. Interestingly, this bound holds even if the number of reassignments per job is bounded by a constant. For the number of reassignments performed by the algorithm, we show that it is in $\Omega(n \log n)$ in the worst case, where $n$ is the number of patients or jobs on the online side. This result is in line with lower bounds for the number of reassignments in online bipartite matching with reassignments, and, similarly to this previous work, we also conjecture that this bound should be tight. Known upper bounds like the $O(n \log^2 n)$ bound for online bipartite matching with reassignments by Bernstein, Holm, and Rotenberg do not transfer directly: while our interval constraints simplify the problem, the over-time property restricts the set of possible reassignments.

DynaWarp -- Efficient, large-scale log storage and retrieval

from arXiv: Data Structures and Algorithms

Authors: Julian Reichinger, Thomas Krismayer, Jan Rellermeyer

Modern, large-scale monitoring systems have to process and store vast amounts of log data in near real-time. At query time the systems have to find relevant logs based on the content of the log message using support structures that can scale to these amounts of data while still being efficient to use. We present our novel DynaWarp membership sketch, capable of answering Multi-Set Multi-Membership-Queries, which can be used as an alternative to existing indexing structures for streamed log data. In our experiments, DynaWarp required up to 93% less storage space than the tested state-of-the-art inverted index and had up to four orders of magnitude fewer false positives than the tested state-of-the-art membership sketch. Additionally, DynaWarp achieved up to 250 times higher query throughput than the tested inverted index and up to 240 times higher query throughput than the tested membership sketch.

Online Edge Coloring is (Nearly) as Easy as Offline

from arXiv: Data Structures and Algorithms

Authors: Joakim Blikstad, Ola Svensson, Radu Vintan, David Wajc

The classic theorem of Vizing (Diskret. Analiz.'64) asserts that any graph of maximum degree $\Delta$ can be edge colored (offline) using no more than $\Delta+1$ colors (with $\Delta$ being a trivial lower bound). In the online setting, Bar-Noy, Motwani and Naor (IPL'92) conjectured that a $(1+o(1))\Delta$-edge-coloring can be computed online in $n$-vertex graphs of maximum degree $\Delta=\omega(\log n)$. Numerous algorithms made progress on this question, using a higher number of colors or assuming restricted arrival models, such as random-order edge arrivals or vertex arrivals (e.g., AGKM FOCS'03, BMM SODA'10, CPW FOCS'19, BGW SODA'21, KLSST STOC'22). In this work, we resolve this longstanding conjecture in the affirmative in the most general setting of adversarial edge arrivals. We further generalize this result to obtain online counterparts of the list edge coloring result of Kahn (J. Comb. Theory. A'96) and of the recent "local" edge coloring result of Christiansen (STOC'23).
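As a point of reference for the conjecture discussed above (my own sketch of the folklore online greedy baseline, not the paper's algorithm; the function name is mine), assigning each arriving edge the smallest color unused at both endpoints never needs more than $2\Delta - 1$ colors:

def greedy_online_edge_coloring(edge_stream):
    # Process edges in arrival order; uses colors 0, 1, 2, ... and at most 2*Delta - 1 of them.
    used_at = {}                     # vertex -> set of colors already incident to it
    coloring = {}
    for u, v in edge_stream:
        taken = used_at.setdefault(u, set()) | used_at.setdefault(v, set())
        c = 0
        while c in taken:
            c += 1
        coloring[(u, v)] = c
        used_at[u].add(c)
        used_at[v].add(c)
    return coloring

# Example: a triangle plus a pendant edge arriving one edge at a time
print(greedy_online_edge_coloring([(1, 2), (2, 3), (3, 1), (3, 4)]))

The gap between this trivial $2\Delta - 1$ baseline and Vizing's offline $\Delta + 1$ bound is exactly what the $(1+o(1))\Delta$ conjecture resolved in this paper is about.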

Output-Sensitive Enumeration of Potential Maximal Cliques in Polynomial Space

from arXiv: Data Structures and Algorithms

Authors: Caroline Brosse, Alessio Conte, Vincent Limouzy, Giulia Punzi, Davide Rucci

A set of vertices in a graph forms a potential maximal clique if there exists a minimal chordal completion in which it is a maximal clique. Potential maximal cliques were first introduced as a key tool to obtain an efficient, though exponential-time, algorithm to compute the treewidth of a graph. As a byproduct, this made it possible to compute the treewidth of various graph classes in polynomial time. In recent years, the concept of potential maximal cliques regained interest as it proved to be useful for a handful of graph algorithmic problems. In particular, it turned out to be a key tool to obtain a polynomial-time algorithm for computing maximum weight independent sets in $P_5$-free and $P_6$-free graphs (Lokshtanov et al., SODA '14, and Grzesik et al., SODA '19). In most of their applications, obtaining all the potential maximal cliques constitutes an algorithmic bottleneck, thus motivating the question of how to efficiently enumerate all the potential maximal cliques in a graph $G$. The state-of-the-art algorithm by Bouchitt\'e \& Todinca can enumerate potential maximal cliques in output-polynomial time by using exponential space, a significant limitation for the size of feasible instances. In this paper, we revisit this algorithm and design an enumeration algorithm that preserves an output-polynomial time complexity while only requiring polynomial space.

Lower Bounds for Leaf Rank of Leaf Powers

from arXiv: Data Structures and Algorithms

Authors: Svein Høgemo

Leaf powers and $k$-leaf powers have been studied for over 20 years, but there are still several aspects of this graph class that are poorly understood. One such aspect is the leaf rank of leaf powers, i.e. the smallest number $k$ such that a graph $G$ is a $k$-leaf power. Computing the leaf rank of leaf powers has proved a hard task, and furthermore, results about the asymptotic growth of the leaf rank as a function of the number of vertices in the graph have been few and far between. We present an infinite family of rooted directed path graphs that are leaf powers, and prove that they have leaf rank exponential in the number of vertices (utilizing a type of subtree model first presented by Rautenbach [Some remarks about leaf roots. Discrete mathematics, 2006]). This answers an open question by Brandst\"adt et al. [Rooted directed path graphs are leaf powers. Discrete mathematics, 2010].

Computing Minimal Absent Words and Extended Bispecial Factors with CDAWG Space

from arXiv: Data Structures and Algorithms

Authors: Shunsuke Inenaga, Takuya Mieno, Hiroki Arimura, Mitsuru Funakoshi, Yuta Fujishige

A string $w$ is said to be a minimal absent word (MAW) for a string $S$ if $w$ does not occur in $S$ and any proper substring of $w$ occurs in $S$. We focus on non-trivial MAWs which are of length at least 2. Finding such non-trivial MAWs for a given string is motivated for applications in bioinformatics and data compression. Fujishige et al. [TCS 2023] proposed a data structure of size $\Theta(n)$ that can output the set $\mathsf{MAW}(S)$ of all MAWs for a given string $S$ of length $n$ in $O(n + |\mathsf{MAW}(S)|)$ time, based on the directed acyclic word graph (DAWG). In this paper, we present a more space efficient data structure based on the compact DAWG (CDAWG), which can output $\mathsf{MAW}(S)$ in $O(|\mathsf{MAW}(S)|)$ time with $O(e)$ space, where $e$ denotes the minimum of the sizes of the CDAWGs for $S$ and for its reversal $S^R$. For any strings of length $n$, it holds that $e < 2n$, and for highly repetitive strings $e$ can be sublinear (up to logarithmic) in $n$. We also show that MAWs and their generalization minimal rare words have close relationships with extended bispecial factors, via the CDAWG.
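A direct way to compute the non-trivial MAWs of a short string follows from the definition: $w$ (with $|w| \ge 2$) is a MAW iff $w$ is absent from $S$ while both $w[:-1]$ and $w[1:]$ occur in $S$. The following brute-force Python sketch is my own illustration (restricted to the alphabet of $S$), not the paper's CDAWG-based data structure:

def minimal_absent_words(S):
    # Non-trivial MAWs (length >= 2) of S over the alphabet of S, by brute force.
    occ = {S[i:j] for i in range(len(S)) for j in range(i + 1, len(S) + 1)}   # all substrings of S
    maws = set()
    for u in occ:                          # u plays the role of w[:-1]
        for a in sorted(set(S)):
            w = u + a
            if w not in occ and w[1:] in occ:
                maws.add(w)
    return sorted(maws)

print(minimal_absent_words("abaab"))       # -> ['aaa', 'aaba', 'bab', 'bb']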

Tighter Bounds for Local Differentially Private Core Decomposition and Densest Subgraph

from arXiv: Data Structures and Algorithms

Authors: Monika Henzinger, A. R. Sricharan, Leqi Zhu

Computing the core decomposition of a graph is a fundamental problem that has recently been studied in the differentially private setting, motivated by practical applications in data mining. In particular, Dhulipala et al. [FOCS 2022] gave the first mechanism for approximate core decomposition in the challenging and practically relevant setting of local differential privacy. One of the main open problems left by their work is whether the accuracy, i.e., the approximation ratio and additive error, of their mechanism can be improved. We show the first lower bounds on the additive error of approximate and exact core decomposition mechanisms in the centralized and local model of differential privacy, respectively. We also give mechanisms for exact and approximate core decomposition in the local model, with almost matching additive error bounds. Our mechanisms are based on a black-box application of continual counting. They also yield improved mechanisms for the approximate densest subgraph problem in the local model.

Decremental $(1+ε)$-Approximate Maximum Eigenvector: Dynamic Power Method

from arXiv: Data Structures and Algorithms

Authors: Deeksha Adil, Thatchaphol Saranurak

We present a dynamic algorithm for maintaining $(1+\epsilon)$-approximate maximum eigenvector and eigenvalue of a positive semi-definite matrix $A$ undergoing \emph{decreasing} updates, i.e., updates which may only decrease eigenvalues. Given a vector $v$ updating $A\gets A-vv^{\top}$, our algorithm takes $\tilde{O}(\mathrm{nnz}(v))$ amortized update time, i.e., polylogarithmic per non-zeros in the update vector. Our technique is based on a novel analysis of the influential power method in the dynamic setting. The two previous sets of techniques have the following drawbacks (1) algebraic techniques can maintain exact solutions but their update time is at least polynomial per non-zeros, and (2) sketching techniques admit polylogarithmic update time but suffer from a crude additive approximation. Our algorithm exploits an oblivious adversary. Interestingly, we show that any algorithm with polylogarithmic update time per non-zeros that works against an adaptive adversary and satisfies an additional natural property would imply a breakthrough for checking psd-ness of matrices in $\tilde{O}(n^{2})$ time, instead of $O(n^{\omega})$ time.
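For context, here is the classical, static power method that the title alludes to, in a minimal NumPy sketch (my own illustration; this is the building block, not the paper's dynamic data structure):

import numpy as np

def power_method(A, iters=500, seed=0):
    # Approximate the top eigenvalue/eigenvector of a symmetric PSD matrix A by power iteration.
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(A.shape[0])
    v /= np.linalg.norm(v)
    for _ in range(iters):
        w = A @ v
        nrm = np.linalg.norm(w)
        if nrm == 0:
            break
        v = w / nrm
    return v @ (A @ v), v                  # Rayleigh quotient and eigenvector estimate

A = np.array([[4.0, 1.0], [1.0, 3.0]])
lam, v = power_method(A)
print(lam)                                 # close to the true top eigenvalue (7 + sqrt(5)) / 2
# The paper maintains such an estimate while A undergoes updates A <- A - v v^T,
# which can only decrease eigenvalues (the decremental setting described above).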

Wednesday, February 28

Meta-Complexity: A Basic Introduction for the Meta-Perplexed

from Simons Institute Blog

by Adam Becker (science communicator in residence, Spring 2023) Think about the last time you faced a problem you couldn’t solve. Say it was something practical, something that seemed small — a leaky faucet, for example. There’s an exposed screw … Continue reading →

By Simons Institute Editor

A Quantum State

from Computational Complexity

Illinois' most famous citizen working on a quantum computer
The governor of Illinois, JB Pritzker, unveiled his budget last week including $500 million for quantum computing research. Is this the best way to spend my tax dollars?

As long-time readers know, I have strong doubts about the real-world applications of quantum computing and the hype for the field. But the article does not suggest any applications of quantum computing, rather

Pritzker says he's optimistic that the Illinois state legislature will embrace his proposal as a catalyst for job creation and investment attraction.

That does make sense. Investing in quantum may very well bring in extra federal and corporate investment into quantum in Chicago. At the least it will bring in smart people to Illinois to fill research roles. And it's not as if this money would go to any other scientific endeavor if we don't put it into quantum.

So it makes sense financially and scientifically even if these machines don't actually solve any real-world problems. Quantum winter will eventually come, but we might as well take advantage of the hype while it's still there. Or should we?

A physicist colleague strongly supports Illinois spending half a billion on quantum. He lives in Indiana. 

By Lance Fortnow

Tight Lower Bounds for Block-Structured Integer Programs

from arXiv: Computational Complexity

Authors: Christoph Hunkenschröder, Kim-Manuel Klein, Martin Koutecký, Alexandra Lassota, Asaf Levin

We study fundamental block-structured integer programs called tree-fold and multi-stage IPs. Tree-fold IPs admit a constraint matrix with independent blocks linked together by few constraints in a recursive pattern, and transposing their constraint matrix yields multi-stage IPs. The state-of-the-art algorithms to solve these IPs have an exponential gap in their running times, making it natural to ask whether this gap is inherent. We answer this question in the affirmative. Assuming the Exponential Time Hypothesis, we prove lower bounds showing that the exponential difference is necessary, and that the known algorithms are near optimal. Moreover, we prove unconditional lower bounds on the norms of the Graver basis, a fundamental building block of all known algorithms to solve these IPs. This shows that none of the current approaches can be improved beyond this bound.

Graph Neural Networks and Arithmetic Circuits

from arXiv: Computational Complexity

Authors: Timon Barlag, Vivian Holzapfel, Laura Strieker, Jonni Virtema, Heribert Vollmer

We characterize the computational power of neural networks that follow the graph neural network (GNN) architecture, not restricted to aggregate-combine GNNs or other particular types. We establish an exact correspondence between the expressivity of GNNs using diverse activation functions and arithmetic circuits over real numbers. In our results the activation function of the network becomes a gate type in the circuit. Our result holds for families of constant depth circuits and networks, both uniformly and non-uniformly, for all common activation functions.

Geometric Deep Learning for Computer-Aided Design: A Survey

from arXiv: Computational Geometry

Authors: Negar Heidari, Alexandros Iosifidis

Geometric Deep Learning techniques have become a transformative force in the field of Computer-Aided Design (CAD), and have the potential to revolutionize how designers and engineers approach and enhance the design process. By harnessing the power of machine learning-based methods, CAD designers can optimize their workflows, save time and effort while making better informed decisions, and create designs that are both innovative and practical. The ability to process the CAD designs represented by geometric data and to analyze their encoded features enables the identification of similarities among diverse CAD models, the proposition of alternative designs and enhancements, and even the generation of novel design alternatives. This survey offers a comprehensive overview of learning-based methods in computer-aided design across various categories, including similarity analysis and retrieval, 2D and 3D CAD model synthesis, and CAD generation from point clouds. Additionally, it provides a complete list of benchmark datasets and their characteristics, along with open-source codes that have propelled research in this domain. The final discussion delves into the challenges prevalent in this field, followed by potential future research directions in this rapidly evolving field.

Enclosing Points with Geometric Objects

from arXiv: Data Structures and Algorithms

Authors: Timothy M. Chan, Qizheng He, Jie Xue

Let $X$ be a set of points in $\mathbb{R}^2$ and $\mathcal{O}$ be a set of geometric objects in $\mathbb{R}^2$, where $|X| + |\mathcal{O}| = n$. We study the problem of computing a minimum subset $\mathcal{O}^* \subseteq \mathcal{O}$ that encloses all points in $X$. Here a point $x \in X$ is enclosed by $\mathcal{O}^*$ if it lies in a bounded connected component of $\mathbb{R}^2 \backslash (\bigcup_{O \in \mathcal{O}^*} O)$. We propose two algorithmic frameworks to design polynomial-time approximation algorithms for the problem. The first framework is based on sparsification and min-cut, which results in $O(1)$-approximation algorithms for unit disks, unit squares, etc. The second framework is based on LP rounding, which results in an $O(\alpha(n)\log n)$-approximation algorithm for segments, where $\alpha(n)$ is the inverse Ackermann function, and an $O(\log n)$-approximation algorithm for disks.
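
To make the enclosure predicate concrete, here is a rough, discretized Python sketch (purely illustrative, not part of the paper's frameworks): it flood-fills the free cells of a padded grid from the border, so a point is declared enclosed exactly when the fill never reaches its cell. The function name, the restriction to disks, and the grid resolution are assumptions made for the example.

```python
import math
from collections import deque

def is_enclosed(point, disks, resolution=200, pad=1.0):
    """Discretized check of whether `point` lies in a bounded connected component of
    the plane minus the union of `disks` (given as (center, radius) pairs): flood-fill
    the free cells of a padded grid from its border; the point is enclosed iff the
    fill never reaches its cell. Approximate and illustrative only."""
    xs = [c[0] for c, _ in disks] + [point[0]]
    ys = [c[1] for c, _ in disks] + [point[1]]
    rmax = max(r for _, r in disks)
    x0, x1 = min(xs) - rmax - pad, max(xs) + rmax + pad
    y0, y1 = min(ys) - rmax - pad, max(ys) + rmax + pad
    dx, dy = (x1 - x0) / resolution, (y1 - y0) / resolution

    def blocked(i, j):
        x, y = x0 + (i + 0.5) * dx, y0 + (j + 0.5) * dy
        return any((x - cx) ** 2 + (y - cy) ** 2 <= r * r for (cx, cy), r in disks)

    target = (min(int((point[0] - x0) / dx), resolution - 1),
              min(int((point[1] - y0) / dy), resolution - 1))
    border = [(i, j) for i in range(resolution) for j in range(resolution)
              if (i in (0, resolution - 1) or j in (0, resolution - 1)) and not blocked(i, j)]
    seen, queue = set(border), deque(border)
    while queue:
        i, j = queue.popleft()
        if (i, j) == target:
            return False                      # reachable from the unbounded face
        for nb in ((i + 1, j), (i - 1, j), (i, j + 1), (i, j - 1)):
            if (0 <= nb[0] < resolution and 0 <= nb[1] < resolution
                    and nb not in seen and not blocked(*nb)):
                seen.add(nb)
                queue.append(nb)
    return True

# A closed ring of unit disks around the origin encloses it.
ring = [((2 * math.cos(t), 2 * math.sin(t)), 1.0) for t in (k * math.pi / 6 for k in range(12))]
print(is_enclosed((0.0, 0.0), ring), is_enclosed((10.0, 0.0), ring))  # True False
```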

Robustly Learning Single-Index Models via Alignment Sharpness

from arXiv: Data Structures and Algorithms

Authors: Nikos Zarifis, Puqian Wang, Ilias Diakonikolas, Jelena Diakonikolas

We study the problem of learning Single-Index Models under the $L_2^2$ loss in the agnostic model. We give an efficient learning algorithm, achieving a constant factor approximation to the optimal loss, that succeeds under a range of distributions (including log-concave distributions) and a broad class of monotone and Lipschitz link functions. This is the first efficient constant-factor approximate agnostic learner, even for Gaussian data and for any nontrivial class of link functions. Prior work for the case of an unknown link function either works in the realizable setting or does not attain a constant-factor approximation. The main technical ingredient enabling our algorithm and analysis is a novel notion of a local error bound in optimization that we term alignment sharpness and that may be of broader interest.
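
For readers unfamiliar with the setting, the following is a small, hypothetical Python sketch of the single-index model class with a monotone Lipschitz link, fitted by plain projected gradient descent on the empirical squared loss. It only illustrates the learning problem; it is not the paper's algorithm and makes no use of the alignment-sharpness machinery.

```python
import numpy as np

rng = np.random.default_rng(0)

# Single-index model: y = sigma(<w*, x>) + noise with a monotone, Lipschitz link sigma.
sigma = lambda t: 1.0 / (1.0 + np.exp(-t))         # logistic link (monotone, 1/4-Lipschitz)
d, n = 10, 5000
w_star = rng.normal(size=d); w_star /= np.linalg.norm(w_star)
X = rng.normal(size=(n, d))                        # Gaussian marginal (a log-concave example)
y = sigma(X @ w_star) + 0.1 * rng.normal(size=n)   # labels with additive noise

# Plain projected gradient descent on the empirical squared loss (illustration only).
w = rng.normal(size=d); w /= np.linalg.norm(w)
for _ in range(500):
    p = sigma(X @ w)
    grad = (2.0 / n) * (X.T @ ((p - y) * p * (1.0 - p)))   # chain rule through the link
    w = w - 0.5 * grad
    w /= max(np.linalg.norm(w), 1.0)                        # project back onto the unit ball
print("alignment <w, w*> =", float(w @ w_star))
```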

Learning-Based Algorithms for Graph Searching Problems

from arXiv: Data Structures and Algorithms

Authors: Adela Frances DePavia, Erasmo Tani, Ali Vakilian

We consider the problem of graph searching with predictions, recently introduced by Banerjee et al. (2022). In this problem, an agent, starting at some vertex $r$, has to traverse a (potentially unknown) graph $G$ to find a hidden goal node $g$ while minimizing the total distance travelled. We study a setting in which at any node $v$, the agent receives a noisy estimate of the distance from $v$ to $g$. We design algorithms for this search task on unknown graphs. We establish the first formal guarantees on unknown weighted graphs and provide lower bounds showing that the algorithms we propose have optimal or nearly-optimal dependence on the prediction error. Further, we perform numerical experiments demonstrating that in addition to being robust to adversarial error, our algorithms perform well in typical instances in which the error is stochastic. Finally, we provide alternative, simpler performance bounds on the algorithms of Banerjee et al. (2022) for the case of searching on a known graph, and establish new lower bounds for this setting.
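
As a concrete (and deliberately naive) point of reference for the setup, here is a greedy baseline in Python, not the paper's algorithm: the agent always walks to the unvisited neighbour with the smallest noisy predicted distance and backtracks at dead ends. The graph encoding, function names, and toy instance are assumptions made for the example.

```python
import random

def greedy_walk(graph, start, goal, predict, max_steps=10000):
    """Naive baseline: the agent walks along edges, always moving to the unvisited
    neighbour whose noisy predicted distance to the goal is smallest, and backtracks
    at dead ends. Returns the total distance travelled, or None if it gives up.
    `graph[v]` maps each neighbour of v to the edge weight; `predict(v)` is a noisy
    estimate of dist(v, goal)."""
    path, visited, travelled = [start], {start}, 0.0
    for _ in range(max_steps):
        v = path[-1]
        if v == goal:
            return travelled
        candidates = [u for u in graph[v] if u not in visited]
        if candidates:
            u = min(candidates, key=predict)    # trust the prediction greedily
            visited.add(u)
            path.append(u)
        else:
            if len(path) == 1:
                return None                     # everything reachable was explored
            path.pop()
            u = path[-1]                        # backtrack one step
        travelled += graph[v][u]
    return None

# Toy run on a weighted path graph 0-1-2-3-4 with noisy true distances to the goal 4.
path_graph = {i: {j: 1.0 for j in (i - 1, i + 1) if 0 <= j <= 4} for i in range(5)}
noisy_dist = lambda v: abs(4 - v) + random.uniform(-0.4, 0.4)
print(greedy_walk(path_graph, 0, 4, noisy_dist))  # e.g. 4.0
```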

The SMART approach to instance-optimal online learning

from arXiv: Data Structures and Algorithms

Authors: Siddhartha Banerjee, Alankrita Bhatt, Christina Lee Yu

We devise an online learning algorithm -- titled Switching via Monotone Adapted Regret Traces (SMART) -- that adapts to the data and achieves regret that is instance optimal, i.e., simultaneously competitive on every input sequence compared to the performance of the follow-the-leader (FTL) policy and the worst-case guarantee of any other input policy. We show that the regret of the SMART policy on any input sequence is within a multiplicative factor $e/(e-1) \approx 1.58$ of the smaller of: 1) the regret obtained by FTL on the sequence, and 2) the upper bound on regret guaranteed by the given worst-case policy. This implies a strictly stronger guarantee than typical 'best-of-both-worlds' bounds, as the guarantee holds for every input sequence regardless of how it is generated. SMART is simple to implement as it begins by playing FTL and switches at most once during the time horizon to the worst-case algorithm. Our approach and results follow from an operational reduction of instance-optimal online learning to competitive analysis for the ski-rental problem. We complement our competitive ratio upper bounds with a fundamental lower bound showing that over all input sequences, no algorithm can get better than a $1.43$-fraction of the minimum regret achieved by FTL and the minimax-optimal policy. We also present a modification of SMART that combines FTL with a "small-loss" algorithm to achieve instance optimality between the regret of FTL and the small-loss regret bound.
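
Below is a schematic Python sketch of the single-switch idea in the experts setting, under simplifying assumptions (deterministic play, a crude switching threshold, Hedge as the stand-in worst-case algorithm); it follows the spirit of playing FTL until its realized regret exceeds a worst-case bound and then switching once, but it is not the exact SMART policy.

```python
import math
import numpy as np

def single_switch_play(losses, worst_case_bound):
    """Schematic single-switch policy in the spirit of SMART (not the exact policy):
    play follow-the-leader (FTL) and switch once to multiplicative weights (Hedge)
    as soon as FTL's realized regret exceeds the given worst-case regret bound.
    losses[t, i] is the loss of expert i at round t, assumed to lie in [0, 1]."""
    T, K = losses.shape
    eta = math.sqrt(8.0 * math.log(K) / T)       # standard Hedge learning rate
    cum = np.zeros(K)                            # cumulative losses of the experts
    hedge_w = np.ones(K)
    total, switched = 0.0, False
    for t in range(T):
        if not switched:
            action = int(np.argmin(cum))         # FTL: follow the current leader
        else:
            action = int(np.argmax(hedge_w))     # deterministic stand-in for sampling from Hedge
        total += losses[t, action]
        cum += losses[t]
        hedge_w *= np.exp(-eta * losses[t])
        if not switched and total - cum.min() > worst_case_bound:
            switched = True                      # switch at most once, then never return to FTL
    return total - cum.min(), switched           # realized regret and whether we switched

rng = np.random.default_rng(1)
losses = rng.uniform(size=(1000, 2))
print(single_switch_play(losses, worst_case_bound=math.sqrt(1000 * math.log(2) / 2)))
```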

Deterministic Cache-Oblivious Funnelselect

from arXiv: Data Structures and Algorithms

Authors: Gerth Stølting Brodal, Sebastian Wild

In the multiple-selection problem one is given an unsorted array $S$ of $N$ elements and an array of $q$ query ranks $r_1 < \cdots < r_q$, and the task is to return, in sorted order, the $q$ elements of $S$ having those ranks. Here $\Delta_i = r_{i} - r_{i-1}$ (assuming $r_0 = 0$ and $r_{q+1} = N + 1$).
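
For context on the problem itself, here is a short Python sketch of the classic comparison-based multiple-selection strategy (partition around a pivot and recurse with the surviving ranks). It is a textbook illustration, not the deterministic cache-oblivious funnelselect algorithm of the paper, and the base-case cutoff is an arbitrary choice.

```python
import random

def multiselect(a, ranks):
    """Return, in sorted order, the elements of the given 1-indexed, strictly
    increasing ranks of list `a`: partition around a random pivot and recurse on
    the two sides with the ranks that fall there."""
    if not ranks:
        return []
    if len(a) <= 16:                         # small base case: just sort
        s = sorted(a)
        return [s[r - 1] for r in ranks]
    pivot = random.choice(a)
    lo = [x for x in a if x < pivot]
    eq_count = a.count(pivot)
    hi = [x for x in a if x > pivot]
    left  = [r for r in ranks if r <= len(lo)]
    mid   = [r for r in ranks if len(lo) < r <= len(lo) + eq_count]
    right = [r - len(lo) - eq_count for r in ranks if r > len(lo) + eq_count]
    return multiselect(lo, left) + [pivot] * len(mid) + multiselect(hi, right)

data = [9, 1, 7, 3, 5, 8, 2, 6, 4, 0] * 3
print(multiselect(data, [1, 5, 15, 30]))     # [0, 1, 4, 9]
```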

FlipHash: A Constant-Time Consistent Range-Hashing Algorithm

from arXiv: Data Structures and Algorithms

Authors: Charles Masson, Homin K. Lee

Consistent range-hashing is a technique used in distributed systems, either directly or as a subroutine for consistent hashing, commonly to realize an even and stable data distribution over a variable number of resources. We introduce FlipHash, a consistent range-hashing algorithm with constant time complexity and low memory requirements. Like Jump Consistent Hash, FlipHash is intended for applications where resources can be indexed sequentially. Under this condition, it ensures that keys are hashed evenly across resources and that changing the number of resources only causes keys to be remapped from a removed resource or to an added one, but never shuffled across persisted ones. FlipHash differentiates itself with its low computational cost, achieving constant-time complexity. We show that FlipHash beats Jump Consistent Hash's cost, which is logarithmic in the number of resources, both theoretically and in experiments over practical settings.
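
For comparison, here is the well-known Jump Consistent Hash of Lamping and Veach (2014), the logarithmic-time baseline the abstract refers to; FlipHash itself is not sketched here, since its construction is given in the paper.

```python
def jump_consistent_hash(key: int, num_buckets: int) -> int:
    """Jump Consistent Hash (Lamping & Veach, 2014): maps a 64-bit key to a bucket
    in [0, num_buckets) with the consistent range-hashing property, i.e., growing
    the number of buckets only moves keys into the newly added bucket. Its running
    time is logarithmic in num_buckets, the cost FlipHash improves to constant."""
    b, j = -1, 0
    while j < num_buckets:
        b = j
        key = (key * 2862933555777941757 + 1) & 0xFFFFFFFFFFFFFFFF
        j = int((b + 1) * (1 << 31) / ((key >> 33) + 1))
    return b

# Keys keep their bucket when a resource is added, unless they move to the new one.
for n in (5, 6):
    print(n, [jump_consistent_hash(k, n) for k in range(10)])
```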
