OPML feed of all feeds.

Subscribe to the Atom feed or RSS feed to stay up to date.

Thank you to arXiv for use of its open access interoperability.

Note: the date of arXiv entries announced right after publication holidays might incorrectly show up as the date of the publication holiday itself. This is due to our ad hoc method of inferring announcement dates, which are not returned by the arXiv API.

Powered by Pluto.

Source on GitHub.

Maintained by Nima Anari, Arnab Bhattacharyya, Gautam Kamath.

Theory of Computing Report

Sunday, December 08

TR24-202 | Lifting to Randomized Parity Decision Trees | Farzan Byramji, Russell Impagliazzo

from ECCC Papers

We prove a lifting theorem from randomized decision tree depth to randomized parity decision tree (PDT) size. We use the same property of the gadget, stifling, which was introduced by Chattopadhyay, Mande, Sanyal and Sherif [ITCS'23] to prove a lifting theorem for deterministic PDTs. Moreover, even the milder condition that the gadget has minimum parity certificate complexity at least $2$ suffices for lifting to randomized PDT size. To further improve the dependence on the gadget $g$ in the lower bounds for composed functions, we consider a related problem $g_*$ whose inputs are certificates of $g$. It is implicit in the work of Chattopadhyay et al. that for any function $f$, lower bounds for the $*$-depth of $f_*$ give lower bounds for the PDT size of $f$. We make this connection explicit in the deterministic case and show that it also holds for randomized PDTs. We then combine this with composition theorems for $*$-depth, which follow by adapting known composition theorems for decision trees. As a corollary, we get tight lifting theorems when the gadget is Indexing, Inner Product or Disjointness.
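
Schematically, and suppressing the gadget-dependent parameters (the precise statement is in the paper), a size lifting theorem of this kind for a suitable gadget $g$ takes the form

\[
\log \mathrm{RPDTsize}\left(f \circ g^n\right) = \Omega\left(\mathrm{Rdepth}(f)\right),
\]

so a randomized query lower bound for $f$ translates into an exponential randomized PDT size lower bound for the composed function.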

Friday, December 06

Estimating the persistent homology of $\mathbb{R}^n$-valued functions using function-geometric multifiltrations

from arXiv: Computational Geometry

Authors: Ethan André, Jingyi Li, Steve Oudot

Given an unknown $\mathbb{R}^n$-valued function $f$ on a metric space $X$, can we approximate the persistent homology of $f$ from a finite sampling of $X$ with known pairwise distances and function values? This question has been answered in the case $n=1$, assuming $f$ is Lipschitz continuous and $X$ is a sufficiently regular geodesic metric space, and using filtered geometric complexes with fixed scale parameter for the approximation. In this paper we answer the question for arbitrary $n$, under similar assumptions and using function-geometric multifiltrations. Our analysis offers a different view on these multifiltrations by focusing on their approximation properties rather than on their stability properties. We also leverage the multiparameter setting to provide insight into the influence of the scale parameter, whose choice is central to this type of approach.

Dudeney's Dissection is Optimal

from arXiv: Computational Geometry

Authors: Erik D. Demaine, Tonan Kamata, Ryuhei Uehara

In 1907, Henry Ernest Dudeney posed a puzzle: ``cut any equilateral triangle \dots\ into as few pieces as possible that will fit together and form a perfect square'' (without overlap, via translation and rotation). Four weeks later, Dudeney demonstrated a beautiful four-piece solution, which today remains perhaps the most famous example of a dissection. In this paper (over a century later), we finally solve Dudeney's puzzle, by proving that the equilateral triangle and square have no common dissection with three or fewer polygonal pieces. We reduce the problem to the analysis of a discrete graph structure representing the correspondence between the edges and vertices of the pieces forming each polygon, using ideas from common unfolding.

Dynamical Persistent Homology via Wasserstein Gradient Flow

from arXiv: Computational Geometry

Authors: Minghua Wang, Jinhui Xu

In this study, we introduce novel methodologies designed to adapt original data in response to the dynamics of persistence diagrams along Wasserstein gradient flows. Our research focuses on the development of algorithms that translate variations in persistence diagrams back into the data space. This advancement enables direct manipulation of the data, guided by observed changes in persistence diagrams, offering a powerful tool for data analysis and interpretation in the context of topological data analysis.

Recognizing 2-Layer and Outer $k$-Planar Graphs

from arXiv: Data Structures and Algorithms

Authors: Yasuaki Kobayashi, Yuto Okada, Alexander Wolff

The crossing number of a graph is the least number of crossings over all drawings of the graph in the plane. Computing the crossing number of a given graph is NP-hard, but fixed-parameter tractable (FPT) with respect to the natural parameter. Two well-known variants of the problem are 2-layer crossing minimization and circular crossing minimization, where every vertex must lie on one of two layers, namely two parallel lines, or a circle, respectively. Both variants are NP-hard, but FPT with respect to the natural parameter. Recently, a local version of the crossing number has also received considerable attention. A graph is $k$-planar if it admits a drawing with at most $k$ crossings per edge. In contrast to the crossing number, recognizing $k$-planar graphs is NP-hard even if $k=1$. In this paper, we consider the two above variants in the local setting. The $k$-planar graphs that admit a straight-line drawing with vertices on two layers or on a circle are called 2-layer $k$-planar and outer $k$-planar graphs, respectively. We study the parameterized complexity of the two recognition problems with respect to $k$. For $k=0$, both problems can easily be solved in linear time. Two groups independently showed that outer 1-planar graphs can also be recognized in linear time [Algorithmica 2015/2016]. One group asked whether outer 2-planar graphs can be recognized in polynomial time. Our main contribution consists of XP-algorithms for recognizing 2-layer $k$-planar graphs and outer $k$-planar graphs. We complement these results by showing that both recognition problems are XNLP-hard. This implies that both problems are W$[t]$-hard for every $t$ and that it is unlikely that they admit FPT-algorithms. On the other hand, we present an FPT-algorithm for recognizing 2-layer $k$-planar graphs where the order of the vertices on one layer is specified.

Robust Contraction Decomposition for Minor-Free Graphs and its Applications

from arXiv: Data Structures and Algorithms

Authors: Sayan Bandyapadhyay, William Lochet, Daniel Lokshtanov, Dániel Marx, Pranabendu Misra, Daniel Neuen, Saket Saurabh, Prafullkumar Tale, Jie Xue

We prove a robust contraction decomposition theorem for $H$-minor-free graphs, which states that given an $H$-minor-free graph $G$ and an integer $p$, one can partition in polynomial time the vertices of $G$ into $p$ sets $Z_1,\dots,Z_p$ such that $\operatorname{tw}(G/(Z_i \setminus Z')) = O(p + |Z'|)$ for all $i \in [p]$ and $Z' \subseteq Z_i$. Here, $\operatorname{tw}(\cdot)$ denotes the treewidth of a graph and $G/(Z_i \setminus Z')$ denotes the graph obtained from $G$ by contracting all edges with both endpoints in $Z_i \setminus Z'$. Our result generalizes earlier results by Klein [SICOMP 2008] and Demaine et al. [STOC 2011] based on partitioning $E(G)$, and some recent theorems for planar graphs by Marx et al. [SODA 2022], for bounded-genus graphs (more generally, almost-embeddable graphs) by Bandyapadhyay et al. [SODA 2022], and for unit-disk graphs by Bandyapadhyay et al. [SoCG 2022]. The robust contraction decomposition theorem directly results in parameterized algorithms with running time $2^{\widetilde{O}(\sqrt{k})} \cdot n^{O(1)}$ or $n^{O(\sqrt{k})}$ for every vertex/edge deletion problem on $H$-minor-free graphs that can be formulated as Permutation CSP Deletion or 2-Conn Permutation CSP Deletion. Consequently, we obtain the first subexponential-time parameterized algorithms for Subset Feedback Vertex Set, Subset Odd Cycle Transversal, Subset Group Feedback Vertex Set, and 2-Conn Component Order Connectivity on $H$-minor-free graphs. For other problems which already have subexponential-time parameterized algorithms on $H$-minor-free graphs (e.g., Odd Cycle Transversal, Vertex Multiway Cut, Vertex Multicut, etc.), our theorem gives much simpler algorithms of the same running time.

Computing diverse pair of solutions for tractable SAT

from arXiv: Data Structures and Algorithms

Authors: Tatsuya Gima, Yuni Iwamasa, Yasuaki Kobayashi, Kazuhiro Kurita, Yota Otachi, Rin Saito

In many decision-making processes, one may prefer multiple solutions to a single solution, which allows us to choose an appropriate solution from the set of promising solutions that are found by algorithms. Given this, finding a set of \emph{diverse} solutions plays an indispensable role in enhancing human decision-making. In this paper, we investigate the problem of finding diverse solutions of Satisfiability from the perspective of parameterized complexity with a particular focus on \emph{tractable} Boolean formulas. We present several parameterized tractable and intractable results for finding a diverse pair of satisfying assignments of a Boolean formula. In particular, we design an FPT algorithm for finding an ``almost disjoint'' pair of satisfying assignments of a $2$CNF formula.
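
As a toy illustration of the underlying objective (brute force, not the paper's FPT algorithm), the following sketch finds a most Hamming-diverse pair of satisfying assignments of a small CNF; the clause and variable encodings here are our own choices.

```python
from itertools import product

def sat_assignments(clauses, n):
    """Yield all satisfying assignments of a CNF over variables 1..n.
    A clause is a tuple of nonzero ints; a negative literal is negated."""
    for bits in product([False, True], repeat=n):
        if all(any(bits[abs(l) - 1] == (l > 0) for l in c) for c in clauses):
            yield bits

def most_diverse_pair(clauses, n):
    """Return a pair of satisfying assignments maximizing Hamming distance."""
    sols = list(sat_assignments(clauses, n))
    return max(((a, b) for a in sols for b in sols),
               key=lambda pair: sum(x != y for x, y in zip(*pair)))

# 2CNF example: (x1 or x2) and (not x1 or x3)
a, b = most_diverse_pair([(1, 2), (-1, 3)], n=3)
print(a, b, sum(x != y for x, y in zip(a, b)))
```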

Combinatorial Selection with Costly Information

from arXiv: Data Structures and Algorithms

Authors: Shuchi Chawla, Dimitris Christou, Amit Harlev, Ziv Scully

We consider a class of optimization problems over stochastic variables where the algorithm can learn information about the value of any variable through a series of costly steps; we model this information acquisition process as a Markov Decision Process (MDP). The algorithm's goal is to minimize the cost of its solution plus the cost of information acquisition, or alternately, maximize the value of its solution minus the cost of information acquisition. Such bandit superprocesses have been studied previously but solutions are known only for fairly restrictive special cases. We develop a framework for approximate optimization of bandit superprocesses that applies to arbitrary processes with a matroid (and in some cases, more general) feasibility constraint. Our framework establishes a bound on the optimal cost through a novel cost amortization; it then couples this bound with a notion of local approximation that allows approximate solutions for each component MDP in the superprocess to be composed without loss into a global approximation. We use this framework to obtain approximately optimal solutions for several variants of bandit superprocesses for both maximization and minimization. We obtain new approximations for combinatorial versions of the previously studied Pandora's Box with Optional Inspection and Pandora's Box with Partial Inspection; as well as approximation algorithms for a new problem that we call the Weighing Scale problem.
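
For orientation, the classical single-selection special case (Weitzman's Pandora's Box) already illustrates the cost-of-information trade-off via an index policy. A minimal sketch for discrete value distributions follows; `draw` is a hypothetical callback revealing a box's value, and none of this is the paper's combinatorial framework.

```python
def reservation_value(values, probs, cost, lo=0.0, hi=1e9):
    """Weitzman index: the sigma solving E[(V - sigma)^+] = cost (bisection)."""
    for _ in range(100):
        mid = (lo + hi) / 2
        gain = sum(p * max(v - mid, 0.0) for v, p in zip(values, probs))
        lo, hi = (mid, hi) if gain > cost else (lo, mid)
    return (lo + hi) / 2

def weitzman(boxes, draw):
    """boxes: list of (values, probs, cost); opens boxes in decreasing
    index order while the best index still exceeds the best value seen."""
    order = sorted(range(len(boxes)), reverse=True,
                   key=lambda i: reservation_value(*boxes[i]))
    best = total_cost = 0.0
    for i in order:
        if reservation_value(*boxes[i]) <= best:
            break  # stopping rule: no remaining box is worth its cost
        total_cost += boxes[i][2]
        best = max(best, draw(i))
    return best - total_cost
```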

The Online Submodular Assignment Problem

from arXiv: Data Structures and Algorithms

Authors: Daniel Hathcock, Billy Jin, Kalen Patton, Sherry Sarkar, Michael Zlatin

Online resource allocation is a rich and varied field. One of the most well-known problems in this area is online bipartite matching, introduced in 1990 by Karp, Vazirani, and Vazirani [KVV90]. Since then, many variants have been studied, including AdWords, the generalized assignment problem (GAP), and online submodular welfare maximization. In this paper, we introduce a generalization of GAP which we call the submodular assignment problem (SAP). This generalization captures many online assignment problems, including all classical online bipartite matching problems as well as broader online combinatorial optimization problems such as online arboricity, flow scheduling, and laminar restricted allocations. We present a fractional algorithm for online SAP that is (1-1/e)-competitive. Additionally, we study several integral special cases of the problem. In particular, we provide a (1-1/e-epsilon)-competitive integral algorithm under a small-bids assumption, and a (1-1/e)-competitive integral algorithm for online submodular welfare maximization where the utility functions are given by rank functions of matroids. The key new ingredient for our results is the construction and structural analysis of a "water level" vector for polymatroids, which allows us to generalize the classic water-filling paradigm used in online matching problems. This construction reveals connections to submodular utility allocation markets and principal partition sequences of matroids.
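
The water-filling idea being generalized is easiest to see in its classical fractional-matching form: when an online vertex arrives, its unit of mass is poured into the offline neighbors with the currently lowest "water levels". A sketch of that primitive (ours, not the paper's polymatroid construction):

```python
def water_fill(levels, nbrs, amount=1.0):
    """Pour `amount` into offline vertices `nbrs` (capacities 1),
    always raising the currently lowest levels first."""
    while amount > 1e-12:
        active = [i for i in nbrs if levels[i] < 1.0]
        if not active:
            break  # all neighbors saturated
        low = min(levels[i] for i in active)
        lows = [i for i in active if levels[i] <= low + 1e-12]
        # raise the lowest group until it meets the next level or the cap
        others = [levels[i] for i in active if i not in lows]
        target = min(others + [1.0])
        delta = min(target - low, amount / len(lows))
        for i in lows:
            levels[i] += delta
        amount -= delta * len(lows)
    return levels
```

Running this per arrival yields the classic (1-1/e)-competitive fractional matching; the paper's contribution is a water level vector that makes the same paradigm work over polymatroids.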

Classic Round-Up Variant of Fast Unsigned Division by Constants: Algorithm and Full Proof

from arXiv: Data Structures and Algorithms

Authors: Yifei Li

The integer division instruction is generally expensive on most architectures. If the divisor is a constant, the division can be transformed into a combination of several inexpensive integer instructions. This article discusses the classic round-up variant of the fast unsigned division by constants algorithm and provides a full proof of its correctness and feasibility. Additionally, a simpler variant for bounded dividends is presented.
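
A minimal sketch of the round-up variant under the standard choice of parameters (the paper supplies the full correctness proof; this is just the textbook derivation):

```python
def roundup_magic(d, N=32):
    """Constants (m, s) with n // d == (n * m) >> s for all 0 <= n < 2**N.
    Round-up method: m = ceil(2**s / d) with s = N + ceil(log2(d)), so the
    error e = m*d - 2**s satisfies e < d <= 2**(s - N), which is exactly
    what the correctness argument needs."""
    assert d >= 1
    s = N + (d - 1).bit_length()
    m = ((1 << s) + d - 1) // d  # may need N+1 bits, hence mulhi tricks in practice
    return m, s

m, s = roundup_magic(7)
assert all((n * m) >> s == n // 7 for n in range(0, 1 << 32, 65537))
assert ((2**32 - 1) * m) >> s == (2**32 - 1) // 7
```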

Thursday, December 05

Wisdom without a brain

from Ben Recht

Thinking through my preparation of a lecture about homeostasis

With the semester winding down and my latest dive into class live blogging coming to an end, I’m plotting out the next course for the blog. Looking at my Notes App, I’m guessing it may be the “potpourri” column for the next while. To choose from the rather scattered assortment of directions, I’m going with reactive mode, as I am a bit panicked about putting together a talk for tomorrow. Hopefully, rambling on the blog will get my thoughts in order.

This week, I’m attending a workshop at the Simons Institute on “Modern Paradigms in Generalization.” Narrowly construed, generalization mathematically characterizes how well a machine learning model performs on the next data point based on properties of data in the past. This typically means you assume the future is the same as the past and apply some contorted manipulation of the law of large numbers to get a prediction interval around your machine learning model predictions.
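
One standard form of that manipulation: for a loss bounded in $[0,1]$ and $k$ i.i.d. samples, Hoeffding's inequality bounds the gap between the true risk $R$ and the empirical risk $\hat{R}$ by

\[
\Pr\left[\,|R - \hat{R}| > \varepsilon\,\right] \le 2\exp\left(-2k\varepsilon^2\right),
\]

which is exactly the kind of interval that evaporates the moment the future stops looking like the past.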

This workshop tries to get after something bigger. For the organizers, “Generalization, broadly construed, is the ability of machine learning methods to perform well in scenarios outside their training data.” That’s pretty broadly construed! They further propose:

We will characterize not just what it means for a machine learning model to do well on future data, but more generally for any entity to behave effectively in unknown future settings.

Ambitious!

The organizers kindly invited me to speak but told me I could talk about whatever I wanted. After consulting some of the other Simons program participants, I decided to go after the second half of that sentence. I’ve been grappling with a concept for the last year, a concept that precisely captures this characterization while looking nothing like artificial intelligence: homeostasis.

Homeostasis is the term coined by Walter Cannon to describe how organisms maintain some notion of internal constancy despite existing in an extremely adversarial environment. In his 1932 text, The Wisdom of the Body, Cannon deliberately chose homeostasis over equilibrium:

“The constant conditions which are maintained in the body might be termed equilibria. That word, however, has come to have fairly exact meaning as applied to relatively simple physico-chemical states, in closed systems, where known forces are balanced. The coordinated physiological processes which maintain most of the steady states in the organism are so complex and so peculiar to living beings–involving, as they may, the brain and nerves, the heart, lungs, kidneys, and spleen, all working cooperatively—that I have suggested a special designation for these states, homeostasis. The word does not imply something set and immobile, a stagnation. It means a condition—a condition which may vary, but which is relatively constant.”

For Cannon, equilibrium was a thermodynamic property of dead stuff. Two water baths connected by a pipe reach thermal equilibrium. A person in a sauna maintains an internal temperature of 37 degrees Celsius. Homeostasis is an active mechanism of regulation—a constant push to keep something constant. It requires millions of years of evolution and a constant intake of new resources to maintain.

Cannon’s concept of homeostasis would revolutionize the way we think about medicine. Doctors began to conceive many ailments and diseases as regulatory disorders. If a body becomes dysregulated, a doctor can prescribe a treatment to steer it back to homeostasis. One of the more striking examples in The Wisdom of the Body is an early description of how the body regulates blood sugar. Insulin was only discovered 10 years earlier, and Cannon describes its role in regulating blood sugar. Diabetes could thus be thought of as a disease of dysregulation. In 1920, children diagnosed with Type I diabetes would tragically die, often in as little as six months. Today, though still a very serious medical condition, Type I diabetes is not only treatable but can be managed with insulin pumps controlled by a cell phone.

I set out to write a talk that starts with Cannon and then thoroughly explores his influence, but I won’t be able to do that justice in a single talk. The pile of things I was hoping to engage with that have already been cut could be turned into talks themselves. Notably, homeostasis was a central motivating concept for the Cyberneticists and remained a tenet of systems thinking as cybernetics moved into management sciences. Dan Davies frequently writes about cybernetic homeostasis on his blog and in his fantastic book The Unaccountability Machine, and I want to engage more with Dan’s work here on argmin at some point soon.

But what can I do in 40 minutes? I’ve decided to present some results on how to model homeostatic systems as mathematical feedback systems. Some rather general principles have to apply if you want to keep a signal tightly regulated. Control theorists worked out these concepts in the 1950s, and I’m going to describe them as simply as I can, without any differential equations or Laplace Transforms.

What’s particularly interesting is that these sorts of regulatory loops often adapt to unknown disturbances without a brain. Bacteria are very adaptive! I’ll cover some of the ideas from this fantastic survey by Mustafa Khammash, showing how adaptation occurs even in isolated chemical reaction networks.

Central to these homeostatic adaptations is the idea of integral control, and I will try to explain why some sort of integral control is necessary and emerges everywhere you look. And then I want to describe some of my own research in this area, which provides some general principles of using integral control to treat dysregulation in surprising ways. This hits on my favorite topic, individualized decision making, and hints at how we can improve ourselves by simultaneously being the treatment and control group.
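
Here is the one-line version of why integral control is the key mechanism: the integrator keeps accumulating error until the error is exactly zero, so a constant disturbance gets rejected without ever being measured. A toy simulation (illustrative parameters, not anything from the talk):

```python
# Plant: x' = -x + u + d, with unknown constant disturbance d.
# Integral controller: u' = ki * (setpoint - x).
# At steady state u' = 0 forces x == setpoint, whatever d turns out to be.
def simulate(setpoint=37.0, disturbance=-5.0, ki=0.5, dt=0.01, T=40.0):
    x, u = 20.0, 0.0
    for _ in range(int(T / dt)):
        u += ki * (setpoint - x) * dt   # integral action
        x += (-x + u + disturbance) * dt
    return x

print(simulate())  # settles at ~37.0 despite the disturbance
```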

I’ve been trying to get my thoughts on homeostasis in order for a while and decided to force my hand by submitting a talk abstract for a talk I haven’t written. I’m not sure if inventing deadlines is the best strategy to get things done, but it’s certainly a strategy. I’m fairly sure I’ll be able to make a compelling case, but these first talks always have a high degree of unquantifiable uncertainty associated with them. I’ll write more next week about how it goes.

By Ben Recht

Tenure track at HSE University Faculty of Computer Science (apply by January 15, 2025)

from CCI: jobs

The HSE Faculty of Computer Science is looking for tenure-track Assistant Professor positions in various computer science areas, including AI and software engineering. Candidates must hold a recent PhD in Computer Science or a related field. Preferred qualifications include teaching experience at top foreign universities. Proficiency in English is essential. Deadline: January 15, 2025.

Website: https://iri.hse.ru/ru/TT_CompSci_2024
Email: iri@hse.ru

By shacharlovett

Postdoc at University of Bordeaux (apply by January 15, 2025)

from CCI: jobs

We are offering a 2-year postdoc position in the Quantum Information and Computation group at the CS department (LaBRI) of the University of Bordeaux, France.

We are looking for candidates with a strong interest in research on quantum information, quantum algorithms, and complexity theory. Please refer to the attached link for details on the application process.

Website: https://quantique.labri.fr/positions/
Email: yassine.hamoudi@labri.fr

By shacharlovett

TR24-201 | On one-way functions and the average time complexity of almost-optimal compression | Marius Zimand

from ECCC Papers

We show that one-way functions exist if and only if there exists an efficient distribution relative to which almost-optimal compression is hard on average. The result is obtained by combining a theorem of Ilango, Ren, and Santhanam [STOC 2022] and one by Bauwens and Zimand [JACM, 2023].

Faculty at Ben-Gurion University of the Negev (apply by March 31, 2025)

from CCI: jobs

The Computer Science department is seeking candidates for a tenure-track faculty position starting in October 2025. Although we particularly seek applicants in Cyber Security, Systems, Software Engineering, Machine Learning, Data Sciences, and Robotics, we are interested in outstanding candidates in all areas of Computer Science. Applicants with ties to the industry are encouraged to apply.

Website: https://bgu-academic-recruitment.my.site.com/Recruiters/VF_BGUPositions?Id=02iTc000009Gb9i
Email: cs_chair@cs.bgu.ac.il

By shacharlovett

Characterizing the Distinguishability of Product Distributions through Multicalibration

from arXiv: Computational Complexity

Authors: Cassandra Marcussen, Aaron L. Putterman, Salil Vadhan

Given a sequence of samples $x_1, \dots , x_k$ promised to be drawn from one of two distributions $X_0, X_1$, a well-studied problem in statistics is to decide $\textit{which}$ distribution the samples are from. Information theoretically, the maximum advantage in distinguishing the two distributions given $k$ samples is captured by the total variation distance between $X_0^{\otimes k}$ and $X_1^{\otimes k}$. However, when we restrict our attention to $\textit{efficient distinguishers}$ (i.e., small circuits) of these two distributions, exactly characterizing the ability to distinguish $X_0^{\otimes k}$ and $X_1^{\otimes k}$ is more involved and less understood. In this work, we give a general way to reduce bounds on the computational indistinguishability of $X_0$ and $X_1$ to bounds on the $\textit{information-theoretic}$ indistinguishability of some specific, related variables $\widetilde{X}_0$ and $\widetilde{X}_1$. As a consequence, we prove a new, tight characterization of the number of samples $k$ needed to efficiently distinguish $X_0^{\otimes k}$ and $X_1^{\otimes k}$ with constant advantage as \[ k = \Theta\left(d_H^{-2}\left(\widetilde{X}_0, \widetilde{X}_1\right)\right), \] which is the inverse of the squared Hellinger distance $d_H$ between two distributions $\widetilde{X}_0$ and $\widetilde{X}_1$ that are computationally indistinguishable from $X_0$ and $X_1$. Likewise, our framework can be used to re-derive a result of Geier (TCC 2022), proving nearly-tight bounds on how computational indistinguishability scales with the number of samples for arbitrary product distributions.
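
For reference, the squared Hellinger distance in the bound is the standard quantity; for discrete distributions,

\[
d_H^2(P, Q) = \frac{1}{2}\sum_x \left(\sqrt{P(x)} - \sqrt{Q(x)}\right)^2 = 1 - \sum_x \sqrt{P(x)\,Q(x)}.
\]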

On one-way functions and the average time complexity of almost-optimal compression

from arXiv: Computational Complexity

Authors: Marius Zimand

We show that one-way functions exist if and only if there exists an efficient distribution relative to which almost-optimal compression is hard on average. The result is obtained by combining a theorem of Ilango, Ren, and Santhanam and one by Bauwens and Zimand.

Hard diagrams of split links

from arXiv: Computational Geometry

Authors: Corentin Lunel, Arnaud de Mesmay, Jonathan Spreer

Deformations of knots and links in ambient space can be studied combinatorially on their diagrams via local modifications called Reidemeister moves. While it is well-known that, in order to move between equivalent diagrams with Reidemeister moves, one sometimes needs to insert excess crossings, there are significant gaps between the best known lower and upper bounds on the required number of these added crossings. In this article, we study the problem of turning a diagram of a split link into a split diagram, and we show that there exist split links with diagrams requiring an arbitrarily large number of such additional crossings. More precisely, we provide a family of diagrams of split links, so that any sequence of Reidemeister moves transforming a diagram with $c$ crossings into a split diagram requires going through a diagram with $\Omega(\sqrt{c})$ extra crossings. Our proof relies on the framework of bubble tangles, as introduced by Lunel and de Mesmay, and a technique of Chambers and Liokumovich to turn homotopies into isotopies in the context of Riemannian geometry.

On Approximability of $\ell_2^2$ Min-Sum Clustering

from arXiv: Data Structures and Algorithms

Authors: Karthik C. S., Euiwoong Lee, Yuval Rabani, Chris Schwiegelshohn, Samson Zhou

The $\ell_2^2$ min-sum $k$-clustering problem is to partition an input set into clusters $C_1,\ldots,C_k$ to minimize $\sum_{i=1}^k\sum_{p,q\in C_i}\|p-q\|_2^2$. Although $\ell_2^2$ min-sum $k$-clustering is NP-hard, it is not known whether it is NP-hard to approximate $\ell_2^2$ min-sum $k$-clustering beyond a certain factor. In this paper, we give the first hardness-of-approximation result for the $\ell_2^2$ min-sum $k$-clustering problem. We show that it is NP-hard to approximate the objective to a factor better than $1.056$ and moreover, assuming a balanced variant of the Johnson Coverage Hypothesis, it is NP-hard to approximate the objective to a factor better than 1.327. We then complement our hardness result by giving the first $(1+\varepsilon)$-coreset construction for $\ell_2^2$ min-sum $k$-clustering. Our coreset uses $\mathcal{O}\left(k^{\varepsilon^{-4}}\right)$ space and can be leveraged to achieve a polynomial-time approximation scheme with runtime $nd\cdot f(k,\varepsilon^{-1})$, where $d$ is the underlying dimension of the input dataset and $f$ is a fixed function. Finally, we consider a learning-augmented setting, where the algorithm has access to an oracle that outputs a label $i\in[k]$ for input point, thereby implicitly partitioning the input dataset into $k$ clusters that induce an approximately optimal solution, up to some amount of adversarial error $\alpha\in\left[0,\frac{1}{2}\right)$. We give a polynomial-time algorithm that outputs a $\frac{1+\gamma\alpha}{(1-\alpha)^2}$-approximation to $\ell_2^2$ min-sum $k$-clustering, for a fixed constant $\gamma>0$.
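
A standard identity clarifies how this objective relates to $k$-means: within a cluster $C$ with centroid $\mu_C = \frac{1}{|C|}\sum_{p \in C} p$,

\[
\sum_{p,q\in C}\|p-q\|_2^2 = 2|C|\sum_{p\in C}\|p-\mu_C\|_2^2,
\]

so min-sum clustering is a $k$-means-type objective in which each cluster's cost is weighted by its size.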

Theoretical limitations of multi-layer Transformer

from arXiv: Data Structures and Algorithms

Authors: Lijie Chen, Binghui Peng, Hongxun Wu

Transformers, especially the decoder-only variants, are the backbone of most modern large language models; yet we do not have much understanding of their expressive power except for the simple $1$-layer case. Due to the difficulty of analyzing multi-layer models, all previous work relies on unproven complexity conjectures to show limitations for multi-layer Transformers. In this work, we prove the first $\textit{unconditional}$ lower bound against multi-layer decoder-only transformers. For any constant $L$, we prove that any $L$-layer decoder-only transformer needs a polynomial model dimension ($n^{\Omega(1)}$) to perform sequential composition of $L$ functions over an input of $n$ tokens. As a consequence, our results give: (1) the first depth-width trade-off for multi-layer transformers, exhibiting that the $L$-step composition task is exponentially harder for $L$-layer models compared to $(L+1)$-layer ones; (2) an unconditional separation between encoder and decoder, exhibiting a hard task for decoders that can be solved by an exponentially shallower and smaller encoder; (3) a provable advantage of chain-of-thought, exhibiting a task that becomes exponentially easier with chain-of-thought. On the technical side, we propose the multi-party $\textit{autoregressive}$ $\textit{communication}$ $\textit{model}$ that captures the computation of a decoder-only Transformer. We also introduce a new proof technique that finds a certain $\textit{indistinguishable}$ $\textit{decomposition}$ of all possible inputs iteratively for proving lower bounds in this model. We believe our new communication model and proof technique will be helpful to further understand the computational power of transformers.

Dynamic Consistent $k$-Center Clustering with Optimal Recourse

from arXiv: Data Structures and Algorithms

Authors: Sebastian Forster, Antonis Skarlatos

Given points from an arbitrary metric space and a sequence of point updates sent by an adversary, what is the minimum recourse per update (i.e., the minimum number of changes needed to the set of centers after an update), in order to maintain a constant-factor approximation to a $k$-clustering problem? This question has received attention in recent years under the name consistent clustering. Previous works by Lattanzi and Vassilvitskii [ICML '17] and Fichtenberger, Lattanzi, Norouzi-Fard, and Svensson [SODA '21] studied $k$-clustering objectives, including the $k$-center and the $k$-median objectives, under only point insertions. In this paper we study the $k$-center objective in the fully dynamic setting, where the update is either a point insertion or a point deletion. Before our work, Łącki, Haeupler, Grunau, Rozhoň, and Jayaram [SODA '24] gave a deterministic fully dynamic constant-factor approximation algorithm for the $k$-center objective with worst-case recourse of $2$ per update. In this work, we prove that the $k$-center clustering problem admits optimal recourse bounds by developing a deterministic fully dynamic constant-factor approximation algorithm with worst-case recourse of $1$ per update. Moreover our algorithm performs simple choices based on light data structures, and thus is arguably more direct and faster than the previous one which uses a sophisticated combinatorial structure. Additionally, we develop a new deterministic decremental algorithm and a new deterministic incremental algorithm, both of which maintain a $6$-approximate $k$-center solution with worst-case recourse of $1$ per update. Our incremental algorithm improves over the $8$-approximation algorithm by Charikar, Chekuri, Feder, and Motwani [STOC '97]. Finally, we remark that since all three of our algorithms are deterministic, they work against an adaptive adversary.

Optimal bounds on a tree inference algorithm

from arXiv: Data Structures and Algorithms

Authors: Jack Gardiner, Lachlan L. H. Andrew, Junhao Gan, Jean Honorio, Seeun William Umboh

This paper tightens the best known analysis of Hein's 1989 algorithm to infer the topology of a weighted tree based on the lengths of paths between its leaves. It shows that the number of length queries required for a degree-$k$ tree of $n$ leaves is $O(n k \log_k n)$, matching the lower bound. It also presents a family of trees for which the performance is asymptotically better, and shows that no such family exists for a competing $O(n k \log_k n)$ algorithm.

Provably Extending PageRank-based Local Clustering Algorithm to Weighted Directed Graphs with Self-Loops and to Hypergraphs

from arXiv: Data Structures and Algorithms

Authors: Zihao Li, Dongqi Fu, Hengyu Liu, Jingrui He

Local clustering aims to find a compact cluster near the given starting instances. This work focuses on graph local clustering, which has broad applications beyond graphs because of the internal connectivities within various modalities. While most existing studies on local graph clustering adopt the discrete graph setting (i.e., unweighted graphs without self-loops), real-world graphs can be more complex. In this paper, we extend the non-approximating Andersen-Chung-Lang ("ACL") algorithm beyond discrete graphs and generalize its quadratic optimality to a wider range of graphs, including weighted, directed, and self-looped graphs and hypergraphs. Specifically, leveraging PageRank, we propose two algorithms: GeneralACL for graphs and HyperACL for hypergraphs. We theoretically prove that, under two mild conditions, both algorithms can identify a quadratically optimal local cluster in terms of conductance with at least 1/2 probability. On the property of hypergraphs, we address a fundamental gap in the literature by defining conductance for hypergraphs from the perspective of hypergraph random walks. Additionally, we provide experiments to validate our theoretical findings.
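
As background, the discrete-graph primitive being generalized is the Andersen-Chung-Lang push procedure for approximate personalized PageRank (this sketch is the classical approximate version on an undirected, unweighted graph, not the paper's GeneralACL or HyperACL):

```python
from collections import defaultdict, deque

def acl_push(graph, seed, alpha=0.15, eps=1e-4):
    """Approximate personalized PageRank via local pushes.
    graph: node -> list of neighbors (undirected, unweighted)."""
    p = defaultdict(float)                  # approximate PPR vector
    r = defaultdict(float, {seed: 1.0})     # residual mass
    queue = deque([seed])
    while queue:
        u = queue.popleft()
        deg = len(graph[u])
        if r[u] < eps * deg:
            continue
        p[u] += alpha * r[u]                # settle a fraction at u
        push = (1 - alpha) * r[u] / (2 * deg)
        r[u] = (1 - alpha) * r[u] / 2       # lazy walk keeps half at u
        for v in graph[u]:
            r[v] += push
            if r[v] >= eps * len(graph[v]):
                queue.append(v)
        if r[u] >= eps * deg:
            queue.append(u)
    return p
```

A sweep over vertices ordered by $p(u)/\deg(u)$ then extracts the low-conductance local cluster.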

Extracting Dual Solutions via Primal Optimizers

from arXiv: Data Structures and Algorithms

Authors: Yair Carmon, Arun Jambulapati, Liam O'Carroll, Aaron Sidford

We provide a general method to convert a "primal" black-box algorithm for solving regularized convex-concave minimax optimization problems into an algorithm for solving the associated dual maximin optimization problem. Our method adds recursive regularization over a logarithmic number of rounds where each round consists of an approximate regularized primal optimization followed by the computation of a dual best response. We apply this result to obtain new state-of-the-art runtimes for solving matrix games in specific parameter regimes, obtain improved query complexity for solving the dual of the CVaR distributionally robust optimization (DRO) problem, and recover the optimal query complexity for finding a stationary point of a convex function.

Improved Differentially Private Continual Observation Using Group Algebra

from arXiv: Data Structures and Algorithms

Authors: Monika Henzinger, Jalaj Upadhyay

Differentially private weighted prefix sum under continual observation is a crucial component in the production-level deployment of private next-word prediction for Gboard, which, according to Google, has over a billion users. More specifically, Google uses a differentially private mechanism to sum weighted gradients in its \emph{private follow-the-regularized leader} algorithm. Apart from efficiency, the additive error of the private mechanism is crucial: multiplied with the square root of the model's dimension $d$ (with $d$ ranging up to $10$ trillion, for example, Switch Transformers or M6-10T), it determines the accuracy of the learning system. So, any improvement in the leading constant matters significantly in practice. In this paper, we show a novel connection between mechanisms for continual weighted prefix sum and a concept in representation theory known as the group matrix, introduced in the correspondence between Dedekind and Frobenius (1897) and generalized by Schur (1904). To the best of our knowledge, this is the first application of group algebra to analyze differentially private algorithms. Using this connection, we analyze a class of matrix norms known as {\em factorization norms} that give upper and lower bounds for the additive error under general $\ell_p$-norms of the matrix mechanism. This allows us to give the first efficient factorization that matches the best-known non-constructive upper bound on the factorization norm by Mathias (1993) for the matrix used in Google's deployment, improving on the previous best-known constructive bounds of Fichtenberger et al. (ICML 2023) and Henzinger et al. (SODA 2023). It also yields the first upper bound on the additive error for a large class of weight functions for weighted prefix sum problems, including the sliding window matrix (Bolot et al., ICDT 2013).
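
A small numerical illustration of why factorizations matter (a sketch, not the paper's construction): for the matrix mechanism, writing the prefix-sum matrix as $M = LR$ gives additive error governed by $\max_i \|L_{i,:}\|_2 \cdot \max_j \|R_{:,j}\|_2$ (up to the noise scale). The matrix square root of $M$, the constructive factorization whose leading constant is at issue here, already improves exponentially over the trivial factorization.

```python
import numpy as np
from scipy.linalg import sqrtm

n = 256
M = np.tril(np.ones((n, n)))   # unweighted prefix-sum (continual counting) matrix

def err(L, R):
    """Factorization-based error proxy: max row norm of L times max column norm of R."""
    return np.linalg.norm(L, axis=1).max() * np.linalg.norm(R, axis=0).max()

print("trivial  M = M @ I:", err(M, np.eye(n)))   # ~ sqrt(n) = 16
S = np.real(sqrtm(M))                             # M = S @ S
print("sqrt     M = S @ S:", err(S, S))           # grows only logarithmically in n
```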

Computing the Center of Uncertain Points on Cactus Graphs

from arXiv: Data Structures and Algorithms

Authors: Ran Hu, Divy H. Kanani, Jingru Zhang

In this paper, we consider the (weighted) one-center problem of uncertain points on a cactus graph. Given are a cactus graph $G$ and a set of $n$ uncertain points. Each uncertain point has $m$ possible locations on $G$ with probabilities and a non-negative weight. The (weighted) one-center problem aims to compute a point (the center) $x^*$ on $G$ to minimize the maximum (weighted) expected distance from $x^*$ to all uncertain points. No previous algorithm is known for this problem. In this paper, we propose an $O(|G| + mn\log mn)$-time algorithm for solving it. Since the input is $O(|G|+mn)$, our algorithm is almost optimal.
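
To make the objective concrete, here is a brute-force evaluation of the weighted one-center objective, restricted to the vertices of a tiny cactus (a single cycle); the paper's algorithm also considers centers in the interiors of edges and runs in near-linear time. The networkx usage and all names are illustrative.

```python
import networkx as nx

def expected_dist(G, x, locs, probs):
    """Expected shortest-path distance from point x to an uncertain point."""
    d = nx.single_source_dijkstra_path_length(G, x, weight="weight")
    return sum(p * d[v] for v, p in zip(locs, probs))

def one_center_objective(G, x, uncertain):
    # uncertain: list of (weight, locations, probabilities) triples
    return max(w * expected_dist(G, x, locs, probs) for w, locs, probs in uncertain)

G = nx.cycle_graph(6)                      # a cycle is the simplest cactus
nx.set_edge_attributes(G, 1.0, "weight")
uncertain = [(1.0, [0, 2], [0.5, 0.5]), (2.0, [3, 4], [0.3, 0.7])]
best = min(G.nodes, key=lambda x: one_center_objective(G, x, uncertain))
print(best, one_center_objective(G, best, uncertain))
```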

Wednesday, December 04

Podcasts!

from Scott Aaronson

Do you like watching me spout about AI alignment, watermarking, my time at OpenAI, the P versus NP problem, quantum computing, consciousness, Penrose’s views on physics and uncomputability, university culture, wokeness, free speech, my academic trajectory, and much more, despite my slightly spastic demeanor and my many verbal infelicities? Then holy crap are you in luck today! Here’s 2.5 hours of me talking to former professional poker players (and now wonderful Austin-based friends) Liv Boeree and her husband Igor Kurganov about all of those topics. (Or 1.25 hours if you watch at 2x speed, as I strongly recommend.)

But that’s not all! Here I am talking to Harvard’s Hrvoje Kukina, in a much shorter (45-minute) podcast focused on quantum computing, cosmological bounds on information processing, and the idea of the universe as a computer:

Last but not least, here I am in an hour-long podcast (this one audio-only) with longtime friend Kelly Weinersmith and her co-host Daniel Whiteson, talking about quantum computing.

Enjoy!

By Scott

TR24-200 | Testing vs Estimation for Index-Invariant Properties in the Huge Object Model | Sourav Chakraborty, Eldar Fischer, Arijit Ghosh, Amit Levi, Gopinath Mishra, Sayantan Sen

from ECCC Papers

The Huge Object model of property testing [Goldreich and Ron, TheoretiCS 23] concerns properties of distributions supported on $\{0,1\}^n$, where $n$ is so large that even reading a single sampled string is unrealistic. Instead, query access is provided to the samples, and the efficiency of the algorithm is measured by the total number of queries that were made to them. Index-invariant properties under this model were defined in [Chakraborty et al., COLT 23], as a compromise between enduring the full intricacies of string testing when considering unconstrained properties, and giving up completely on the string structure when considering label-invariant properties. Index-invariant properties are those that are invariant through a consistent reordering of the bits of the involved strings. Here we provide an adaptation of Szemer\'edi's regularity method for this setting, and in particular show that if an index-invariant property admits an $\epsilon$-test with a number of queries depending only on the proximity parameter $\epsilon$, then it also admits a distance estimation algorithm whose number of queries depends only on the approximation parameter.

Postdoc at University of Toronto (apply by December 15, 2024)

from CCI: jobs

We are currently accepting applications for postdoctoral positions, starting in the academic year 2025-2026. The positions are typically for two years. We will consider all areas of theory CS, including algorithms, complexity theory, cryptography, differential privacy, distributed computing, graph theory, quantum computing, and theoretical aspects of machine learning.

Website: https://www.cs.toronto.edu/theory/positions.html
Email: postdocapp@cs.toronto.edu

By shacharlovett

Favorite Theorems: The Complete List

from Computational Complexity

Now in one place all of my sixty favorite theorems from the six decades of computational complexity (1965-2024).

2015-2024
  • Graph Isomorphism (Babai)
  • Sensitivity (Huang)
  • Quantum Provers (Ji-Natarajan-Vidick-Wright-Yuen)
  • Dichotomy (Bulatov, Zhuk)
  • Algebraic Circuits (Limaye-Srinivasan-Tavenas)
  • Extracting Ramsey Graphs (Chattopadhyay-Zuckerman)
  • Random Oracles (Håstad-Rossman-Servedio-Tan)
  • Parity Games (Calude-Jain-Khoussainov-Li-Stephan)
  • Gradient Descent (Fearnley-Goldberg-Hollender-Savani)
  • Learning from Natural Proofs (Carmosino-Impagliazzo-Kabanets-Kolokolova)

2005-2014
  • Undirected Connectivity in Log Space (Reingold)
  • Optimal Inapproximability for Max-Cut (Khot-Kindler-Mossel-O'Donnell, Mossel-O'Donnell-Oleszkiewicz)
  • Limitations of linear programming (Fiorini-Massar-Pokutta-Tiwary-de Wolf, Rothvoß)
  • Complexity of Nash Equilibrium (Daskalakis-Goldberg-Papadimitriou, Chen-Deng)
  • Combinatorial Proof of PCP Theorem (Dinur)
  • Constructive Proof of the Lovász Local Lemma (Moser)
  • Polylogarithmic independence fools AC0 (Braverman)
  • QIP = PSPACE (Jain-Ji-Upadhyay-Watrous)
  • Lower Bounds on Multilinear Formulas (Raz)
  • NEXP not in ACC0 (Williams)

1995-2004
  • Derandomization (Impagliazzo-Wigderson)
  • Primality (Agrawal-Kayal-Saxena)
  • Probabilistically Checkable Proofs (Håstad)
  • Connections (Trevisan)
  • Superlinear Bounds on Branching Programs (Ajtai)
  • Parallel Repetition (Raz)
  • List Decoding (Sudan)
  • Learning Circuits (Bshouty-Cleve-Gavaldà-Kannan-Tamon)
  • Quantum Lower Bounds (Ambainis)
  • Derandomizing Space (Saks-Zhou)

1985-1994
To mark my first decade in computational complexity during my pre-blog days, I chose my first set of favorite theorems from that time period for an invited talk and paper (PDF) at the 1994 Foundations of Software Technology and Theoretical Computer Science (FST&TCS) conference in Madras (now Chennai), India. The links below go to the papers directly, except for Szelepcsényi’s, which I can't find online.
  • Branching Programs (Barrington)
  • Bounded-Depth Circuits (Håstad)
  • Monotone Circuits (Razborov)
  • Nondeterministic Space (Immerman, Szelepcsényi)
  • Cryptographic Assumptions (Impagliazzo-Levin-Luby)
  • Isomorphism Conjecture (Ogihara-Watanabe)
  • Simulating Randomness (Nisan)
  • Counting Complexity (Toda)
  • Counting Classes (Beigel-Reingold-Spielman)
  • Interactive Proof Systems (Arora-Lund-Motwani-Sudan-Szegedy)

1975-1984 (From 2006)
  • Alternation (Chandra-Kozen-Stockmeyer)
  • Relativization (Baker-Gill-Solovay)
  • Small Sets (Mahaney)
  • Primality Algorithms (Solovay-Strassen, Miller, Rabin)
  • Probabilistic Complexity (Gill, Sipser)
  • The Permanent (Valiant)
  • Unique Witnesses (Valiant-Vazirani)
  • Parity (Furst-Saxe-Sipser, Ajtai)
  • The Yao Principle (Yao)
  • Nonuniform Complexity (Karp-Lipton)

1965-1974 (From 2005)

  • The Seminal Paper (Hartmanis-Stearns)
  • Efficient Computation (Cobham-Edmonds)
  • NP-Completeness (Cook, Levin)
  • Combinatorial NP-Complete Problems (Karp)
  • The Polynomial-Time Hierarchy (Meyer-Stockmeyer)
  • P = NP for Space (Savitch)
  • Abstract Complexity (Blum)
  • NP-Incomplete Sets (Ladner)
  • Logical Characterization of NP (Fagin)
  • Defining the Future (Cook-Reckhow)

Will I do this again in ten years when I'm 70? Come back in 2034 and find out.

By Lance Fortnow

TR24-199 | Randomized Lifting to Semi-Structured Communication Complexity via Linear Diversity | Vladimir Podolskii, Alexander Shekhovtsov

from ECCC Papers

We study query-to-communication lifting. The major open problem in this area is to prove a lifting theorem for gadgets of constant size. The recent paper (Beame, Koroth, 2023) introduces semi-structured communication complexity, in which one of the players can only send parities of their input bits. They have shown that for any $m \geq 4$, the deterministic decision tree complexity of a function $f$ can be lifted to the so-called semi-structured communication complexity of $f \circ {Ind}_m$, where ${Ind}_m$ is the Indexing gadget. As our main contribution we extend these results to the randomized setting. Our results also apply to a substantially larger set of gadgets. More specifically, we introduce a new complexity measure of gadgets, linear diversity. For all gadgets $g$ with non-trivial linear diversity we show that the randomized decision tree complexity of $f$ lifts to the randomized semi-structured communication complexity of $f \circ g$. In particular, this gives tight lifting results for the Indexing gadget ${Ind}_m$, the Inner Product gadget ${IP}_m$, and the Majority gadget ${MAJ}_m$ for all $m$. We prove the same results for the deterministic case. From our result it immediately follows that deterministic/randomized decision tree complexity lifts to deterministic/randomized parity decision tree complexity. For the randomized case, this is the first result of this type. For the deterministic case, our result improves the bound in (Chattopadhyay et al., 2023) for the Inner Product gadget. To obtain our results we introduce a new secret sets approach to simulating semi-structured communication protocols by decision trees. It allows us to simulate (restricted classes of) communication protocols on a truly uniform distribution of inputs.

Unconditional proofs of quantumness between small-space machines

from arXiv: Computational Complexity

Authors: A. C. Cem Say, M. Utkan Gezer

A proof of quantumness is a protocol through which a classical machine can test whether a purportedly quantum device, with comparable time and memory resources, is performing a computation that is impossible for classical computers. Existing approaches to provide proofs of quantumness depend on unproven assumptions about some task being impossible for machines of a particular model under certain resource restrictions. We study a setup where both devices have space bounds $\mathit{o}(\log \log n)$. Under such memory budgets, it has been unconditionally proven that probabilistic Turing machines are unable to solve certain computational problems. We formulate a new class of problems, and show that these problems are polynomial-time solvable for quantum machines, impossible for classical machines, and have the property that their solutions can be "proved" by a small-space quantum machine to a classical machine with the same space bound. These problems form the basis of our newly defined protocol, where the polynomial-time verifier's verdict about the tested machine's quantumness is not conditional on an unproven weakness assumption.

Persistent (Co)Homology in Matrix Multiplication Time

from arXiv: Computational Complexity

Authors: Dmitriy Morozov, Primoz Skraba

Most algorithms for computing persistent homology do so by tracking cycles that represent homology classes. There are many choices of such cycles, and specific choices have found different uses in applications. Although it is known that persistence diagrams can be computed in matrix multiplication time [8] for the more general case of zigzag persistent homology, it is not clear how to extract cycle representatives, especially if specific representatives are desired. In this paper, we provide the same matrix multiplication bound for computing representatives for the two choices common in applications in the case of ordinary persistent (co)homology. We first provide a fast version of the reduction algorithm, which is simpler than the algorithm in [8], but returns a different set of representatives than the standard algorithm [6]. We then give a fast version of a different variant called the row algorithm [4], which returns the same representatives as the standard algorithm.
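
For reference, here is the standard quadratic-time column-reduction algorithm whose representative choices are at issue, sketched over $\mathbb{F}_2$ with columns stored as sets of row indices; the paper's contribution is computing such representatives within the matrix multiplication bound, which this sketch does not attempt.

```python
def reduce_boundary(columns):
    """Standard persistence reduction over F_2.

    columns: list of sets of row indices (the filtered boundary matrix).
    Returns the reduced columns and the persistence pairs (birth, death).
    """
    low_to_col = {}                      # lowest nonzero row -> owning column
    pairs = []
    for j, col in enumerate(columns):
        while col and max(col) in low_to_col:
            col ^= columns[low_to_col[max(col)]]   # F_2 column addition
        columns[j] = col
        if col:                          # column j kills the simplex max(col)
            low_to_col[max(col)] = j
            pairs.append((max(col), j))
        # an emptied column means simplex j creates a new cycle (a birth)
    return columns, pairs

# Filtered triangle: vertices 0-2, edges 3-5, the 2-simplex 6.
cols = [set(), set(), set(), {0, 1}, {1, 2}, {0, 2}, {3, 4, 5}]
_, pairs = reduce_boundary(cols)
print(pairs)   # [(1, 3), (2, 4), (5, 6)]: two edges kill vertices, the triangle fills a cycle
```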

Vanishing of Schubert Coefficients

from arXiv: Computational Complexity

Authors: Igor Pak, Colleen Robichaux

Schubert coefficients are nonnegative integers $c^w_{u,v}$ that arise in Algebraic Geometry and play a central role in Algebraic Combinatorics. It is a major open problem whether they have a combinatorial interpretation, i.e., whether $c^w_{u,v} \in \#{\sf P}$. We study the closely related vanishing problem of Schubert coefficients: $\{c^w_{u,v}=^?0\}$. Until this work it was open whether this problem is in the polynomial hierarchy ${\sf PH}$. We prove that $\{c^w_{u,v}=^?0\}$ is in ${\sf coAM}$ assuming the GRH. In particular, the vanishing problem is in ${\Sigma_2^{{\text{p}}}}$. Our approach is based on constructing lifted formulations, which give polynomial systems of equations for the problem. The result follows from a reduction to Parametric Hilbert's Nullstellensatz, recently studied in arXiv:2408.13027. We apply our construction to show that the vanishing problem is in ${\sf NP}_{\mathbb{C}} \cap {\sf P}_{\mathbb{R}}$ in the Blum--Shub--Smale (BSS) model of computation over complex and real numbers respectively. Similarly, we prove that computing Schubert coefficients is in $\#{\sf P}_{\mathbb{C}}$, a counting version of the BSS model. We also extend our results to classical types. With the one notable exception of the vanishing problem in type $D$, our results extend to all types.

All Polyhedral Manifolds are Connected by a 2-Step Refolding

from arXiv: Computational Geometry

Authors: Lily Chung, Erik D. Demaine, Jenny Diomidova, Tonan Kamata, Jayson Lynch, Ryuhei Uehara, Hanyu Alice Zhang

We prove that, for any two polyhedral manifolds P, Q, there is a polyhedral manifold I such that P, I share a common unfolding and I, Q share a common unfolding. In other words, we can unfold P, refold (glue) that unfolding into I, unfold I, and then refold into Q. Furthermore, if P, Q are embedded in 3D, then I can be embedded in 3D (without self-intersection). These results generalize to n given manifolds P_1, P_2, ..., P_n; they all have a common unfolding with an intermediate manifold I. Allowing more than two unfold/refold steps, we obtain stronger results for two special cases: for doubly covered convex planar polygons, we achieve that all intermediate polyhedra are planar; and for tree-shaped polycubes, we achieve that all intermediate polyhedra are tree-shaped polycubes.

Simple Construction of Greedy Trees and Greedy Permutations

from arXiv: Data Structures and Algorithms

Authors: Oliver Chubet, Don Sheehy, Siddharth Sheth

Greedy permutations, also known as Gonzalez Orderings or Farthest Point Traversals, are a standard way to approximate $k$-center clustering and have many applications in sampling and approximating metric spaces. A greedy tree is an added structure on a greedy permutation that tracks the (approximate) nearest predecessor. Greedy trees have applications in proximity search as well as in topological data analysis. For metrics of doubling dimension $d$, a $2^{O(d)}n\log n$ time algorithm is known, but it is randomized and also quite complicated. Its construction involves a series of intermediate structures and $O(n \log n)$ space. In this paper, we show how to construct greedy permutations and greedy trees using a simple variation of an algorithm of Clarkson that was shown to run in $2^{O(d)}n\log \Delta$ time, where the spread $\Delta$ is the ratio of largest to smallest pairwise distances. The improvement comes from the observation that the greedy tree can be constructed more easily than the greedy permutation. This leads to a linear time algorithm for merging two approximate greedy trees and thus a $2^{O(d)}n \log n$ time algorithm for computing the tree. Then, we show how to extract a $(1+\frac{1}{n})$-approximate greedy permutation from the approximate greedy tree in the same asymptotic running time.
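
For contrast, here is the textbook quadratic-time farthest-point traversal over points in $\mathbb{R}^d$, which already yields both the greedy permutation and the exact nearest-predecessor information that defines the greedy tree; the paper's contribution is achieving $2^{O(d)}n \log n$ time in general doubling metrics with approximate predecessors, which this naive version does not attempt.

```python
import numpy as np

def greedy_permutation(points, start=0):
    """O(n^2) Gonzalez-style traversal; also records each point's nearest predecessor."""
    n = len(points)
    order = [start]
    dist = np.linalg.norm(points - points[start], axis=1)  # distance to chosen prefix
    parent = np.full(n, start)                             # greedy-tree parent pointers
    for _ in range(n - 1):
        nxt = int(dist.argmax())                           # farthest point from the prefix
        order.append(nxt)
        d_new = np.linalg.norm(points - points[nxt], axis=1)
        closer = d_new < dist
        parent[closer], dist[closer] = nxt, d_new[closer]
    return order, parent

pts = np.random.default_rng(1).random((100, 2))
order, parent = greedy_permutation(pts)
print(order[:5], parent[order[5]])   # first few greedy points; one predecessor
```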

The Broader Landscape of Robustness in Algorithmic Statistics

from arXiv: Data Structures and Algorithms

Authors: Gautam Kamath

The last decade has seen a number of advances in computationally efficient algorithms for statistical methods subject to robustness constraints. An estimator may be robust in a number of different ways: to contamination of the dataset, to heavy-tailed data, or in the sense that it preserves privacy of the dataset. We survey recent results in these areas with a focus on the problem of mean estimation, drawing technical and conceptual connections between the various forms of robustness, showing that the same underlying algorithmic ideas lead to computationally efficient estimators in all these settings.
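
As a one-dimensional taste of the kind of robustness the survey connects, here is a median-of-means sketch, which tolerates heavy-tailed samples where the empirical mean does not; it is illustrative and not any particular estimator from the survey.

```python
import numpy as np

def median_of_means(x, n_blocks=10, seed=0):
    """Median of block means: a simple heavy-tail-robust mean estimator."""
    shuffled = np.random.default_rng(seed).permutation(x)
    return float(np.median([b.mean() for b in np.array_split(shuffled, n_blocks)]))

heavy = np.random.default_rng(1).standard_t(df=2, size=10_000)  # heavy tails, mean 0
print(np.mean(heavy), median_of_means(heavy))  # the latter is typically closer to 0
```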

Efficient Graph Matching for Correlated Stochastic Block Models

from arXiv: Data Structures and Algorithms

Authors: Shuwen Chai, Miklós Z. Rácz

We study learning problems on correlated stochastic block models with two balanced communities. Our main result gives the first efficient algorithm for graph matching in this setting. In the most interesting regime where the average degree is logarithmic in the number of vertices, this algorithm correctly matches all but a vanishing fraction of vertices with high probability, whenever the edge correlation parameter $s$ satisfies $s^2 > \alpha \approx 0.338$, where $\alpha$ is Otter's tree-counting constant. Moreover, we extend this to an efficient algorithm for exact graph matching whenever this is information-theoretically possible, positively resolving an open problem of R\'acz and Sridhar (NeurIPS 2021). Our algorithm generalizes the recent breakthrough work of Mao, Wu, Xu, and Yu (STOC 2023), which is based on centered subgraph counts of a large family of trees termed chandeliers. A major technical challenge that we overcome is dealing with the additional estimation errors that are necessarily present due to the fact that, in relevant parameter regimes, the latent community partition cannot be exactly recovered from a single graph. As an application of our results, we give an efficient algorithm for exact community recovery using multiple correlated graphs in parameter regimes where it is information-theoretically impossible to do so using just a single graph.

The Space Complexity of Approximating Logistic Loss

from arXiv: Data Structures and Algorithms

Authors: Gregory Dexter, Petros Drineas, Rajiv Khanna

We provide space complexity lower bounds for data structures that approximate logistic loss up to $\epsilon$-relative error on a logistic regression problem with data $\mathbf{X} \in \mathbb{R}^{n \times d}$ and labels $\mathbf{y} \in \{-1,1\}^n$. The space complexity of existing coreset constructions depends on a natural complexity measure $\mu_\mathbf{y}(\mathbf{X})$, first defined in (Munteanu, 2018). We give an $\tilde{\Omega}(\frac{d}{\epsilon^2})$ space complexity lower bound in the regime $\mu_\mathbf{y}(\mathbf{X}) = O(1)$ that shows existing coresets are optimal in this regime up to lower-order factors. We also prove a general $\tilde{\Omega}(d\cdot \mu_\mathbf{y}(\mathbf{X}))$ space lower bound when $\epsilon$ is constant, showing that the dependency on $\mu_\mathbf{y}(\mathbf{X})$ is not an artifact of mergeable coresets. Finally, we refute a prior conjecture that $\mu_\mathbf{y}(\mathbf{X})$ is hard to compute by providing an efficient linear programming formulation, and we empirically compare our algorithm to prior approximate methods.

The Two-Center Problem of Uncertain Points on Trees

from arXiv: Data Structures and Algorithms

Authors: Haitao Xu, Jingru Zhang

In this paper, we consider the (weighted) two-center problem of uncertain points on a tree. Given are a tree $T$ and a set $\mathcal{P}$ of $n$ (weighted) uncertain points, each of which has $m$ possible locations on $T$ associated with probabilities. The goal is to compute two points on $T$, i.e., two centers with respect to $\mathcal{P}$, so that the maximum (weighted) expected distance of the $n$ uncertain points to their own expected closest center is minimized. This problem can be solved in $O(|T|+ n^{2}\log n\log mn + mn\log^2 mn \log n)$ time by the algorithm for the general $k$-center problem. In this paper, we give a simpler and more efficient algorithm that solves this problem in $O(|T| + mn\log mn)$ time.

The Two-Center Problem of Uncertain Points on Cactus Graphs

from arXiv: Data Structures and Algorithms

Authors: Haitao Xu, Jingru Zhang

We study the two-center problem on cactus graphs in facility location, which aims to place two facilities on the graph network to serve customers in order to minimize the maximum transportation cost. In our problem, the location of each customer is uncertain and may appear at $O(m)$ points on the network with probabilities. More specifically, given are a cactus graph $G$ and a set $\mathcal{P}$ of $n$ (weighted) uncertain points, where every uncertain point has $O(m)$ possible locations on $G$, each associated with a probability, and is of a non-negative weight. The problem aims to compute two centers (points) on $G$ so that the maximum (weighted) expected distance of the $n$ uncertain points to their own expected closest center is minimized. No previous algorithms are known for this problem. In this paper, we present the first algorithm for this problem; it runs in $O(|G|+ m^{2}n^{2}\log mn)$ time.

The Cost of Consistency: Submodular Maximization with Constant Recourse

from arXiv: Data Structures and Algorithms

Authors: Paul Dütting, Federico Fusco, Silvio Lattanzi, Ashkan Norouzi-Fard, Ola Svensson, Morteza Zadimoghaddam

In this work, we study online submodular maximization, and how the requirement of maintaining a stable solution impacts the approximation. In particular, we seek bounds on the best-possible approximation ratio that is attainable when the algorithm is allowed to make at most a constant number of updates per step. We show a tight information-theoretic bound of $\tfrac{2}{3}$ for general monotone submodular functions, and an improved (also tight) bound of $\tfrac{3}{4}$ for coverage functions. Since both these bounds are attained by non-poly-time algorithms, we also give a poly-time randomized algorithm that achieves a $0.51$-approximation. Combined with an information-theoretic hardness of $\tfrac{1}{2}$ for deterministic algorithms from prior work, our work thus shows a separation between deterministic and randomized algorithms, both information-theoretically and for poly-time algorithms.

UNIFY: Unified Index for Range Filtered Approximate Nearest Neighbors Search

from arXiv: Data Structures and Algorithms

Authors: Anqi Liang, Pengcheng Zhang, Bin Yao, Zhongpu Chen, Yitong Song, Guangxu Cheng

This paper presents an efficient and scalable framework for Range Filtered Approximate Nearest Neighbors Search (RF-ANNS) over high-dimensional vectors associated with attribute values. Given a query vector $q$ and a range $[l, h]$, RF-ANNS aims to find the approximate $k$ nearest neighbors of $q$ among data whose attribute values fall within $[l, h]$. Existing methods, including pre-, post-, and hybrid filtering strategies that perform attribute range filtering before, after, or during the ANNS process, all suffer from significant performance degradation when query ranges shift. Though building dedicated indexes for each strategy and selecting the best one based on the query range can address this problem, it leads to index consistency and maintenance issues. Our framework, called UNIFY, constructs a unified Proximity Graph-based (PG-based) index that seamlessly supports all three strategies. In UNIFY, we introduce SIG, a novel Segmented Inclusive Graph, which segments the dataset by attribute values. It ensures that the PG of objects from any segment combination is a sub-graph of SIG, thereby enabling efficient hybrid filtering by reconstructing and searching a PG from relevant segments. Moreover, we present the Hierarchical Segmented Inclusive Graph (HSIG), a variant of SIG which incorporates a hierarchical structure inspired by HNSW to achieve logarithmic hybrid filtering complexity. We also implement pre- and post-filtering for HSIG by fusing skip list connections and compressed HNSW edges into the hierarchical graph. Experimental results show that UNIFY delivers state-of-the-art RF-ANNS performance across small, mid, and large query ranges.
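
A brute-force reference implementation pinning down the RF-ANNS semantics (exact rather than approximate; all names are illustrative): filter by attribute range, then take the $k$ nearest neighbors among the survivors. UNIFY's indexes answer the same queries approximately and far faster.

```python
import numpy as np

def range_filtered_knn(X, attrs, q, l, h, k):
    """Exact k nearest neighbors of q among points with attribute in [l, h]."""
    idx = np.flatnonzero((attrs >= l) & (attrs <= h))
    d = np.linalg.norm(X[idx] - q, axis=1)
    return idx[np.argsort(d)[:k]]

rng = np.random.default_rng(0)
X, attrs = rng.random((1000, 16)), rng.random(1000)
print(range_filtered_knn(X, attrs, rng.random(16), 0.2, 0.6, k=5))
```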

Testing vs Estimation for Index-Invariant Properties in the Huge Object Model

from arXiv: Data Structures and Algorithms

Authors: Sourav Chakraborty, Eldar Fischer, Arijit Ghosh, Amit Levi, Gopinath Mishra, Sayantan Sen

The Huge Object model of property testing [Goldreich and Ron, TheoretiCS 23] concerns properties of distributions supported on $\{0,1\}^n$, where $n$ is so large that even reading a single sampled string is unrealistic. Instead, query access is provided to the samples, and the efficiency of the algorithm is measured by the total number of queries that were made to them. Index-invariant properties under this model were defined in [Chakraborty et al., COLT 23], as a compromise between enduring the full intricacies of string testing when considering unconstrained properties, and giving up completely on the string structure when considering label-invariant properties. Index-invariant properties are those that are invariant through a consistent reordering of the bits of the involved strings. Here we provide an adaptation of Szemer\'edi's regularity method for this setting, and in particular show that if an index-invariant property admits an $\epsilon$-test with a number of queries depending only on the proximity parameter $\epsilon$, then it also admits a distance estimation algorithm whose number of queries depends only on the approximation parameter.

You (Almost) Can't Beat Brute Force for 3-Matroid Intersection

from arXiv: Data Structures and Algorithms

Authors: Ilan Doron-Arad, Ariel Kulik, Hadas Shachnai

The $\ell$-matroid intersection ($\ell$-MI) problem asks if $\ell$ given matroids share a common basis. Already for $\ell = 3$, notable canonical NP-complete special cases are $3$-Dimensional Matching and Hamiltonian Path on directed graphs. However, while these problems admit exponential-time algorithms that improve on simple brute force, the fastest known algorithm for $3$-MI is exactly brute force with runtime $2^{n}/poly(n)$, where $n$ is the number of elements. Our first result shows that in fact, brute force cannot be significantly improved, by ruling out an algorithm for $\ell$-MI with runtime $o\left(2^{n-5 \cdot n^{\frac{1}{\ell-1}} \cdot \log (n)}\right)$, for any fixed $\ell\geq 3$. The complexity gap between $3$-MI and the polynomially solvable $2$-matroid intersection calls for a better understanding of the complexity of intermediate problems. One such prominent problem is exact matroid intersection (EMI). Given two matroids whose elements are either red or blue and a number $k$, decide if there is a common basis which contains exactly $k$ red elements. We show that EMI does not admit a randomized polynomial time algorithm. This bound implies that the parameterized algorithm of Eisenbrand et al. (FOCS'24) for exact weight matroid cannot be generalized to matroid intersection. We further obtain: (i) an algorithm that solves $\ell$-MI faster than brute force in time $2^{n-\Omega\left(\log^2 (n)\right)}$, and (ii) a parameterized running time lower bound of $2^{(\ell-2) \cdot k \cdot \log k} \cdot poly(n)$ for $\ell$-MI, where the parameter $k$ is the rank of the matroids. We obtain these two results by generalizing the Monotone Local Search technique of Fomin et al. (J. ACM'19). Broadly speaking, our generalization converts any parameterized algorithm for a subset problem into an exponential-time algorithm which is faster than brute force.
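
To pin down the brute-force baseline that the lower bound shows is near-optimal, here is the naive enumeration for $3$-matroid intersection with independence oracles, on a toy instance built from partition matroids (the encoding of $3$-Dimensional Matching); all names are illustrative.

```python
from itertools import combinations

def common_basis(n, rank, oracles):
    """Brute force: try every rank-sized subset against all independence oracles."""
    for S in combinations(range(n), rank):
        if all(independent(set(S)) for independent in oracles):
            return set(S)
    return None

# Three partition matroids on 6 elements (independent = at most one per block).
blocks = [[{0, 1}, {2, 3}, {4, 5}],
          [{0, 2}, {1, 4}, {3, 5}],
          [{0, 5}, {1, 3}, {2, 4}]]
oracles = [lambda S, bs=bs: all(len(S & b) <= 1 for b in bs) for bs in blocks]
print(common_basis(6, 3, oracles))   # {0, 3, 4} is a common basis
```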

Tuesday, December 03

(Tenure-Track) Assistant Professor in Theoretical Computer Science at Duke University (apply by December 15, 2024)

from CCI: jobs

At Duke Computer Science, we are looking to hire a tenure-track Assistant Professor in theoretical computer science. All areas within TCS will be considered. Although the official deadline is December 15, applications will continue to be considered until the position is filled. In particular, all applications arriving by December 31 will receive full consideration.

Website: https://academicjobsonline.org/ajo/jobs/29268
Email: debmalya@cs.duke.edu

By shacharlovett

Project Projections

from Ben Recht

A rough survey of submitted class projects

This post reflects on the live blog of my graduate class “Convex Optimization.” A Table of Contents is here.

It’s class presentation week! No matter which graduate class I teach, my assignment is the same: I ask students to choose a project related to their research but connected to the course content. This is purposefully vague and open, as all academic research questions begin. Not only do I like seeing how students grapple with making (sometimes tenuous) connections between the class and their interests, but I learn a lot from their interpretations of the assignment. I get insights into what students took away from the class and into the current hot research topics on campus.

The projects also gauge how well pedagogical content connects to these current hot research directions. This was one of the main questions I set out to answer in the first place this semester, and I’m getting the opportunity to see over thirty different ways in which convex optimization can be applied in our age of LLMs.

Right now, all I have are the slides for the presentations, but I can give a superficial view of some of these connections. Unsurprisingly, the vast majority of the projects have something to do with neural networks and machine learning. A few projects use convex methods to analyze neural net systems, aiming to better understand what the neural nets are doing and how they might best be optimized. Some projects wanted to see if they could use convex techniques to replace the neural methods. Some wanted to see if neural techniques could improve on convex methods. I like this tension! There’s no clean answer for when one will be better than the other. We too often run with what is trendy, not what is best.

Of the projects not aimed at understanding neural nets, the two dominant application areas were energy and robotics. The energy projects ranged from building energy management and energy use in public transportation to planning new additions to the electricity grid and solving massive inverse problems in energy resources. Since energy management is tightly coupled to policy concerns about conservation and economics, it's not surprising that optimization techniques play a central role.

The other dominant application area was robotics. High-level robotic planning is a persistently challenging optimization problem, and I suppose it's unsurprising that we're still trying to devise clever ways to solve it. The last week of class may have made it seem like all optimal control could be solved by simply running backpropagation. Unfortunately, in practice, the physical world gets in the way of such clean plans. The course projects propose new methods for avoiding collisions and obstructions, dealing with sensor noise, and guaranteeing safe execution. There were a lot of proposed neural methods here as well, such as training robots to mimic people and training robots to mimic convex optimization solutions.

Though energy and robotics were the dominant applications, they were by no means the only ones. I received project submissions on accelerating linear algebra, scheduling hardware, understanding neural coding, and improving variational inference. It's good to see how convex optimization can still inspire new techniques and methods in a broad set of fields.

That’s my superficial overview of the projects, and I’m excited to get more details from the students. Later this week, I’ll share follow-up thoughts about what comes out of the lightning presentations and associated discussions.

By Ben Recht

On the Computational Complexity of Multi-Objective Ordinal Unconstrained Combinatorial Optimization

from arXiv: Computational Complexity

Authors: José Rui Figueira, Kathrin Klamroth, Michael Stiglmayr, Julia Sudhoff Santos

Multi-objective unconstrained combinatorial optimization problems (MUCO) are in general hard to solve, i.e., the corresponding decision problem is NP-hard and the outcome set is intractable. In this paper we explore special cases of MUCO problems that are actually easy, i.e., solvable in polynomial time. More precisely, we show that MUCO problems with up to two ordinal objective functions plus one real-valued objective function are tractable, and that their complete nondominated set can be computed in polynomial time. For MUCO problems with one ordinal objective function and a second objective function that is either ordinal or real-valued, we present an even more efficient algorithm that applies a greedy strategy multiple times.
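
Whatever algorithm produces the candidate solutions, the output of interest is the nondominated set mentioned above. As a generic illustration (a standard helper, not the algorithm from the paper), here is a minimal Python filter that reduces a finite list of outcome vectors to its nondominated subset, assuming all objectives are to be minimized:

```python
# Keep exactly the points not strictly dominated by any other point.
# A point q dominates p if q is no worse in every objective and
# strictly better in at least one.
def nondominated(points):
    def dominates(q, p):
        return all(x <= y for x, y in zip(q, p)) and any(x < y for x, y in zip(q, p))
    return [p for p in points if not any(dominates(q, p) for q in points)]

print(nondominated([(1, 3), (2, 2), (3, 1), (2, 3), (3, 3)]))
# [(1, 3), (2, 2), (3, 1)]
```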

Limit-sure reachability for small memory policies in POMDPs is NP-complete

from arXiv: Computational Complexity

Authors: Ali Asadi, Krishnendu Chatterjee, Raimundo Saona, Ali Shafiee

A standard model that arises in several applications in sequential decision making is the partially observable Markov decision process (POMDP), in which a decision-making agent interacts with an uncertain environment. A basic objective in such POMDPs is the reachability objective: given a target set of states, the goal is to eventually arrive at one of them. The limit-sure problem asks whether reachability can be ensured with probability arbitrarily close to 1. In general, the limit-sure reachability problem for POMDPs is undecidable. However, in many practical cases the most relevant question is the existence of policies with a small amount of memory. In this work, we study the limit-sure reachability problem for POMDPs with a fixed amount of memory. We establish that the problem is NP-complete.
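
For some intuition about the NP upper bound: once a concrete small-memory policy is fixed, the POMDP induces a finite Markov chain on (state, memory) pairs, and reaching the target with probability 1 is a purely graph-theoretic condition on the support of that chain: it holds iff every state reachable from the initial state can itself reach the target. A minimal Python sketch of that check (our illustration of a standard building block, not the paper's construction; the limit-sure analysis involves further subtleties, such as families of randomized policies):

```python
from collections import deque

def reachable(succ, sources):
    # BFS over the support graph; succ maps a state to its successor set.
    seen, queue = set(sources), deque(sources)
    while queue:
        u = queue.popleft()
        for v in succ.get(u, ()):
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return seen

def almost_sure_reach(succ, s0, targets):
    # Pr(reach targets from s0) = 1 in a finite Markov chain iff every
    # state reachable from s0 can reach the target set.
    pred = {}
    for u, vs in succ.items():
        for v in vs:
            pred.setdefault(v, set()).add(u)
    can_reach_target = reachable(pred, set(targets))  # backward from targets
    return reachable(succ, {s0}) <= can_reach_target  # forward from s0

# Toy chain: state 2 is a reachable trap avoiding the target {3}.
succ = {0: {1, 2}, 1: {0, 3}, 2: {2}, 3: {3}}
print(almost_sure_reach(succ, 0, {3}))  # False
```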

A Hierarchical Heuristic for Clustered Steiner Trees in the Plane with Obstacles

from arXiv: Computational Geometry

Authors: Victor Parque

Euclidean Steiner trees model minimal networks in a wide range of real-world applications. In this paper, we study the feasibility of a hierarchical approach, embedded with bundling operations, for computing multiple, mutually disjoint Euclidean Steiner trees that avoid clutter and overlap with obstacles in the plane; this is significant for modeling the decentralized, multipoint coordination of agents in constrained 2D domains. Our computational experiments, using arbitrary obstacle configurations with convex and non-convex geometries, show the feasibility and attractive performance of the approach when computing multiple obstacle-avoiding Steiner trees in the plane. Our results offer mechanisms for devising new operators for obstacle-avoiding Steiner trees.
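
For context on the baseline such heuristics compete with: in the obstacle-free plane, the minimum spanning tree on the terminals alone is a standard starting point, and it exceeds the length of the optimal Steiner tree by at most a small constant factor (conjectured to be about 15.5% in the Euclidean plane). A minimal Prim's-algorithm sketch of this baseline (illustrative only; it is not the hierarchical, obstacle-aware method of the paper):

```python
import math

def euclidean_mst(points):
    """Prim's algorithm on the complete Euclidean graph over `points`.
    Returns a list of tree edges (i, j, length)."""
    n = len(points)
    dist = lambda i, j: math.dist(points[i], points[j])
    in_tree, best, edges = {0}, {}, []
    for j in range(1, n):
        best[j] = (dist(0, j), 0)  # cheapest known connection to the tree
    while len(best) > 0:
        j = min(best, key=lambda k: best[k][0])
        d, i = best.pop(j)
        in_tree.add(j)
        edges.append((i, j, d))
        for k in best:  # relax connections through the newly added vertex
            dk = dist(j, k)
            if dk < best[k][0]:
                best[k] = (dk, j)
    return edges

pts = [(0, 0), (4, 0), (2, 3)]
print(euclidean_mst(pts))
```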
