Theory of Computing Report

Tuesday, May 21

Openness on OpenAI

from Scott Aaronson

I am, of course, sad that Jan Leike and Ilya Sutskever, the two central people who recruited me to OpenAI and then served as my “bosses” there—two people for whom I developed tremendous admiration—have both now resigned from the company. Ilya’s resignation followed the board drama six months ago, but Jan’s resignation last week came as a shock to me and others. The Superalignment team, which Jan and Ilya led and which I was part of, is being split up and merged into other teams at OpenAI.

See here for Ilya’s parting statement, and here for Jan’s. See here for Zvi Mowshowitz’s perspective and summary of reporting on these events. For additional takes, see pretty much the entire rest of the nerd Internet.

As for me? My two-year leave at OpenAI was scheduled to end this summer anyway. It seems pretty clear that I ought to spend my remaining months at OpenAI simply doing my best for AI safety—for example, by shepherding watermarking toward deployment. After a long delay, I’m gratified that interest in watermarking has spiked recently, not only within OpenAI and other companies but among legislative bodies in the US and Europe.

And afterwards? I’ll certainly continue thinking about how AI is changing the world and how (if at all) we can steer its development to avoid catastrophes, because how could I not think about that? I spent 15 years mostly avoiding the subject, and that now seems like a huge mistake, and probably like enough of that mistake for one lifetime.

So I’ll continue looking for juicy open problems in complexity theory that are motivated by interpretability, or scalable oversight, or dangerous capability evaluations, or other aspects of AI safety—I’ve already identified a few such problems! And without giving up on quantum computing (because how could I?), I expect to reorient at least some of my academic work toward problems at the interface of theoretical computer science and AI safety, and to recruit students who want to work on those problems, and to apply for grants about them. And I’ll presumably continue giving talks about this stuff, and doing podcasts and panels and so on—anyway, as long as people keep asking me to!

And I’ll be open to future sabbaticals or consulting arrangements with AI organizations, like the one I’ve done at OpenAI. But I expect that my main identity will always be as an academic. Certainly I never want to be in a position where I have to speak for an organization rather than myself, or censor what I can say in public about the central problems I’m working on, or sign a nondisparagement agreement or anything of the kind.

I can tell you this: in two years at OpenAI, hanging out at the office and meeting the leadership and rank-and-file engineers, I never once found a smoke-filled room where they laugh at all the rubes who take the talk about “safety” and “alignment” seriously. While my interactions were admittedly skewed toward safetyists, the OpenAI folks I met were invariably smart and earnest and dead serious about the mission of getting AI right for humankind.

It’s more than fair for outsiders to ask whether that’s enough, whether even good intentions can survive bad incentives. It’s likewise fair of them to ask: what fraction of compute and other resources ought to be set aside for alignment research? What exactly should OpenAI do on alignment going forward? What should governments force them and other AI companies to do? What should employees and ex-employees be allowed, or encouraged, to share publicly?

I don’t know the answers to these questions, but if you do, feel free to tell me in the comments!

By Scott

Noise-tolerant learnability of shallow quantum circuits from statistics and the cost of quantum pseudorandomness

from arXiv: Computational Complexity

Authors: Chirag Wadhwa, Mina Doosti

This work studies the learnability of unknown quantum circuits in the near term. We prove the natural robustness of quantum statistical queries for learning quantum processes and provide an efficient way to benchmark various classes of noise from statistics, which gives us a powerful framework for developing noise-tolerant algorithms. We adapt a learning algorithm for constant-depth quantum circuits to the quantum statistical query setting with a small overhead in the query complexity. We prove average-case lower bounds for learning random quantum circuits of logarithmic and higher depths within diamond distance with statistical queries. Additionally, we show the hardness of the quantum threshold search problem from quantum statistical queries and discuss its implications for the learnability of shallow quantum circuits. Finally, we prove that pseudorandom unitaries (PRUs) cannot be constructed using circuits of constant depth by constructing an efficient distinguisher and proving a new variation of the quantum no-free lunch theorem.

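The quantum statistical query access model that the abstract builds on can be pictured with a small classical simulation. The sketch below is an illustration, not the authors' construction: the learner never sees the channel, only expectation values Tr(O Φ(ρ)) reported up to an additive tolerance τ, which is the kind of access that makes noise-tolerance arguments possible. The channel, noise rate, and tolerance are made-up choices.

```python
# Hypothetical sketch of a quantum statistical query (QSQ) oracle for a 1-qubit
# process, simulated classically with NumPy. Not the paper's algorithm; it only
# illustrates the access model: the learner supplies an input state rho and an
# observable O, and receives Tr(O * Phi(rho)) up to an additive tolerance tau.
import numpy as np

def depolarize(rho, p):
    """Apply a single-qubit depolarizing channel with noise rate p."""
    return (1 - p) * rho + p * np.eye(2) / 2

def unknown_process(rho):
    """The 'unknown' channel: a fixed rotation followed by depolarizing noise."""
    theta = 0.3
    U = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])
    return depolarize(U @ rho @ U.conj().T, p=0.05)

def qsq_oracle(rho, observable, tau, rng):
    """Return Tr(observable * Phi(rho)) perturbed by at most tau."""
    exact = np.real(np.trace(observable @ unknown_process(rho)))
    return exact + rng.uniform(-tau, tau)

rng = np.random.default_rng(0)
rho0 = np.array([[1.0, 0.0], [0.0, 0.0]])   # |0><0|
Z = np.diag([1.0, -1.0])                    # Pauli-Z observable
print(qsq_oracle(rho0, Z, tau=0.01, rng=rng))
```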

Fixed-parameter tractability of canonical polyadic decomposition over finite fields

from arXiv: Computational Complexity

Authors: Jason Yang

We present a simple proof that finding a rank-$R$ canonical polyadic decomposition of 3-dimensional tensors over a finite field $\mathbb{F}$ is fixed-parameter tractable with respect to $R$ and $\mathbb{F}$. We also show some more concrete upper bounds on the time complexity of this problem.

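For intuition about the problem itself (not the paper's fixed-parameter algorithm), here is a hedged brute-force sketch that searches for a rank-R CP decomposition of a tiny 3-dimensional tensor over GF(2). The tensor and dimensions are made up for illustration, and the enumeration is exponential where the paper's algorithm is fixed-parameter tractable in R and the field size.

```python
# Brute-force search for a rank-R canonical polyadic (CP) decomposition of a
# small 3-dimensional tensor over GF(2). Purely illustrative; the paper's
# fixed-parameter tractable algorithm is far more clever than this enumeration.
import itertools
import numpy as np

def cp_reconstruct(factors):
    """Sum of R outer products a_r (x) b_r (x) c_r, reduced mod 2."""
    a, b, c = factors
    T = np.zeros((a.shape[1], b.shape[1], c.shape[1]), dtype=int)
    for r in range(a.shape[0]):
        T += np.einsum('i,j,k->ijk', a[r], b[r], c[r])
    return T % 2

def find_cp_gf2(T, R):
    """Try every choice of R factor vectors per mode over GF(2)."""
    n1, n2, n3 = T.shape
    vecs = lambda n: list(itertools.product([0, 1], repeat=n))
    for a in itertools.product(vecs(n1), repeat=R):
        for b in itertools.product(vecs(n2), repeat=R):
            for c in itertools.product(vecs(n3), repeat=R):
                fac = (np.array(a), np.array(b), np.array(c))
                if np.array_equal(cp_reconstruct(fac), T % 2):
                    return fac
    return None

# A 2x2x2 tensor over GF(2) with an obvious rank-1 decomposition.
T = np.einsum('i,j,k->ijk', [1, 1], [1, 0], [0, 1]) % 2
print(find_cp_gf2(T, R=1))
```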

Inner-approximate Reachability Computation via Zonotopic Boundary Analysis

from arXiv: Computational Complexity

Authors: Dejin Ren, Zhen Liang, Chenyu Wu, Jianqiang Ding, Taoran Wu, Bai Xue

Inner-approximate reachability analysis involves calculating subsets of reachable sets, known as inner-approximations. This analysis is crucial in the fields of dynamic systems analysis and control theory as it provides a reliable estimation of the set of states that a system can reach from given initial states at a specific time instant. In this paper, we study the inner-approximate reachability analysis problem based on the set-boundary reachability method for systems modelled by ordinary differential equations, in which the computed inner-approximations are represented with zonotopes. The set-boundary reachability method computes an inner-approximation by excluding states reached from the initial set's boundary. The effectiveness of this method is highly dependent on the efficient extraction of the exact boundary of the initial set. To address this, we propose methods leveraging boundary and tiling matrices that can efficiently extract and refine the exact boundary of the initial set represented by zonotopes. Additionally, we enhance the exclusion strategy by contracting the outer-approximations in a flexible way, which allows for the computation of less conservative inner-approximations. To evaluate the proposed method, we compare it with state-of-the-art methods against a series of benchmarks. The numerical results demonstrate that our method is not only efficient but also accurate in computing inner-approximations.

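The abstract's set representation is the zonotope, Z = {c + Gξ : ξ ∈ [-1,1]^p}, i.e., an affine image of a hypercube. The minimal sketch below (an illustration, not the boundary/tiling-matrix machinery from the paper) shows the representation and its closure under affine maps, one reason zonotopes are convenient in reachability analysis.

```python
# Minimal zonotope sketch: Z = { c + G @ xi : xi in [-1, 1]^p }.
# Zonotopes are closed under affine maps, which is one reason they are a
# convenient set representation for reachability analysis. This is only an
# illustration, not the boundary/tiling-matrix approach from the paper.
import numpy as np

class Zonotope:
    def __init__(self, center, generators):
        self.c = np.asarray(center, dtype=float)       # shape (n,)
        self.G = np.asarray(generators, dtype=float)   # shape (n, p)

    def affine(self, A, b):
        """Image of the zonotope under x -> A x + b (again a zonotope)."""
        return Zonotope(A @ self.c + b, A @ self.G)

    def interval_hull(self):
        """Axis-aligned bounding box: c_i +/- sum_j |G_ij|."""
        radius = np.sum(np.abs(self.G), axis=1)
        return np.stack([self.c - radius, self.c + radius], axis=1)

# Unit square centered at the origin, then rotated and shifted.
Z = Zonotope([0.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])
A = np.array([[0.8, -0.6], [0.6, 0.8]])
print(Z.affine(A, b=np.array([1.0, 2.0])).interval_hull())
```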

A Nearly Quadratic Improvement for Memory Reallocation

from arXiv: Data Structures and Algorithms

Authors: Martin Farach-Colton, William Kuszmaul, Nathan Sheffield, Alek Westover

In the Memory Reallocation Problem a set of items of various sizes must be dynamically assigned to non-overlapping contiguous chunks of memory. It is guaranteed that the sum of the sizes of all items present at any time is at most a $(1-\varepsilon)$-fraction of the total size of memory (i.e., the load-factor is at most $1-\varepsilon$). The allocator receives insert and delete requests online, and can re-arrange existing items to handle the requests, but at a reallocation cost defined to be the sum of the sizes of items moved divided by the size of the item being inserted/deleted. The folklore algorithm for Memory Reallocation achieves a cost of $O(\varepsilon^{-1})$ per update. In recent work at FOCS'23, Kuszmaul showed that, in the special case where each item is promised to be smaller than an $\varepsilon^4$-fraction of memory, it is possible to achieve expected update cost $O(\log\varepsilon^{-1})$. Kuszmaul conjectures, however, that for larger items the folklore algorithm is optimal. In this work we disprove Kuszmaul's conjecture, giving an allocator that achieves expected update cost $O(\varepsilon^{-1/2} \operatorname*{polylog} \varepsilon^{-1})$ on any input sequence. We also give the first non-trivial lower bound for the Memory Reallocation Problem: we demonstrate an input sequence on which any resizable allocator (even offline) must incur amortized update cost at least $\Omega(\log\varepsilon^{-1})$. Finally, we analyze the Memory Reallocation Problem on a stochastic sequence of inserts and deletes, with random sizes in $[\delta, 2 \delta]$ for some $\delta$. We show that, in this simplified setting, it is possible to achieve $O(\log\varepsilon^{-1})$ expected update cost, even in the ``large item'' parameter regime ($\delta > \varepsilon^4$).

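To make the cost model concrete, here is a hedged toy simulator: it charges each update the total size of moved items divided by the size of the item being inserted or deleted, exactly as the abstract defines reallocation cost. The first-fit/compact-on-failure strategy is an arbitrary illustration, neither the folklore O(1/ε) algorithm nor the allocator from the paper.

```python
# Toy simulator for the Memory Reallocation cost model: the cost of an update is
# (total size of items moved) / (size of the item being inserted or deleted).
# The first-fit / compact-on-failure strategy is only to illustrate the model;
# it is neither the folklore O(1/eps) algorithm nor the paper's allocator.

MEMORY = 1.0

class ToyAllocator:
    def __init__(self):
        self.items = {}                     # id -> (start, size)

    def _gaps(self):
        """Free intervals between allocated items (and at the end of memory)."""
        placed = sorted(self.items.values())
        gaps, cursor = [], 0.0
        for start, size in placed:
            if start > cursor:
                gaps.append((cursor, start - cursor))
            cursor = start + size
        gaps.append((cursor, MEMORY - cursor))
        return gaps

    def insert(self, item_id, size):
        for start, length in self._gaps():
            if length >= size:              # first fit: nothing moves
                self.items[item_id] = (start, size)
                return 0.0
        moved = sum(s for _, s in self.items.values())
        cursor = 0.0                        # otherwise compact everything left
        for other in sorted(self.items, key=lambda i: self.items[i][0]):
            s = self.items[other][1]
            self.items[other] = (cursor, s)
            cursor += s
        self.items[item_id] = (cursor, size)
        return moved / size

    def delete(self, item_id):
        self.items.pop(item_id)
        return 0.0                          # this toy never moves items on delete

alloc = ToyAllocator()
for item in [("a", 0.3), ("b", 0.3), ("c", 0.3)]:
    alloc.insert(*item)
alloc.delete("b")
print("cost of inserting d:", alloc.insert("d", 0.35))   # forces a compaction
```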

Count-Min Sketch with Conservative Updates: Worst-Case Analysis

from arXiv: Data Structures and Algorithms

Authors: Younes Ben Mazziane, Othmane Marfoq

Count-Min Sketch with Conservative Updates (\texttt{CMS-CU}) is a memory-efficient hash-based data structure used to estimate the occurrences of items within a data stream. \texttt{CMS-CU} stores~$m$ counters and employs~$d$ hash functions to map items to these counters. We first argue that the estimation error in \texttt{CMS-CU} is maximal when each item appears at most once in the stream. Next, we study \texttt{CMS-CU} in this setting. Precisely, \begin{enumerate} \item In the case where~$d=m-1$, we prove that the average estimation error and the average counter rate converge almost surely to~$\frac{1}{2}$, contrasting with the vanilla Count-Min Sketch, where the average counter rate is equal to~$\frac{m-1}{m}$. \item For any given~$m$ and~$d$, we prove novel lower and upper bounds on the average estimation error, incorporating a positive integer parameter~$g$. Larger values of this parameter improve the accuracy of the bounds. Moreover, the computation of each bound involves examining an ergodic Markov process with a state space of size~$\binom{m+g-d}{g}$ and a sparse transition probabilities matrix containing~$\mathcal{O}(m\binom{m+g-d}{g})$ non-zero entries. \item For~$d=m-1$, $g=1$, and as $m\to \infty$, we show that the lower and upper bounds coincide. In general, our bounds exhibit high accuracy for small values of $g$, as shown by numerical computation. For example, for $m=50$, $d=4$, and $g=5$, the difference between the lower and upper bounds is smaller than~$10^{-4}$. \end{enumerate}

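The data structure described in the abstract is easy to sketch. The version below uses a d × w table (one row per hash function) rather than the abstract's m shared counters, but the conservative-update rule is the same: only the counters currently achieving the minimum are raised, which can only reduce overestimation relative to the vanilla Count-Min Sketch. Parameters are illustrative.

```python
# Count-Min Sketch with Conservative Updates (CMS-CU). On an update, only the
# counters currently equal to the minimum estimate are raised; queries still
# return the minimum over the item's cells and never underestimate true counts.
import random

class CMSConservative:
    def __init__(self, width, depth, seed=0):
        rng = random.Random(seed)
        self.salts = [rng.getrandbits(64) for _ in range(depth)]
        self.table = [[0] * width for _ in range(depth)]
        self.width = width

    def _cells(self, item):
        return [(row, hash((salt, item)) % self.width)
                for row, salt in enumerate(self.salts)]

    def update(self, item, count=1):
        cells = self._cells(item)
        new_value = min(self.table[r][c] for r, c in cells) + count
        for r, c in cells:                      # conservative update rule
            self.table[r][c] = max(self.table[r][c], new_value)

    def query(self, item):
        return min(self.table[r][c] for r, c in self._cells(item))

sketch = CMSConservative(width=50, depth=4)
for token in ["a"] * 100 + ["b"] * 10 + list("cdefgh"):
    sketch.update(token)
print(sketch.query("a"), sketch.query("b"))
```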

Scheduling Jobs with Work-Inefficient Parallel Solutions

from arXiv: Data Structures and Algorithms

Authors: William Kuszmaul, Alek Westover

This paper introduces the \emph{serial-parallel decision problem}. Consider an online scheduler that receives a series of tasks, where each task has both a parallel and a serial implementation. The parallel implementation has the advantage that it can make progress concurrently on multiple processors, but the disadvantage that it is (potentially) work-inefficient. As tasks arrive, the scheduler must decide for each task which implementation to use. We begin by studying \emph{total awake time}. We give a simple \emph{decide-on-arrival} scheduler that achieves a competitive ratio of $3$ for total awake time -- this scheduler makes serial/parallel decisions immediately when jobs arrive. Our second result is a \emph{parallel-work-oblivious} scheduler that achieves a competitive ratio of $6$ for total awake time -- this scheduler makes all of its decisions based only on the size of each serial job and without needing to know anything about the parallel implementations. Finally, we prove a lower bound showing that, if a scheduler wishes to achieve a competitive ratio of $O(1)$, it can have at most one of the two aforementioned properties (decide-on-arrival or parallel-work-oblivious). We also prove lower bounds of the form $1 + \Omega(1)$ on the optimal competitive ratio for any scheduler. Next, we turn our attention to optimizing \emph{mean response time}. Here, we show that it is possible to achieve an $O(1)$ competitive ratio with $O(1)$ speed augmentation. This is the most technically involved of our results. We also prove that, in this setting, it is not possible for a parallel-work-oblivious scheduler to do well. In addition to these results, we present tight bounds on the optimal competitive ratio if we allow for arrival dependencies between tasks (e.g., tasks are components of a single parallel program), and we give an in-depth discussion of the remaining open questions.

Lipschitz Continuous Allocations for Optimization Games

from arXiv: Data Structures and Algorithms

Authors: Soh Kumabe, Yuichi Yoshida

In cooperative game theory, the primary focus is the equitable allocation of payoffs or costs among agents. However, in the practical applications of cooperative games, accurately representing games is challenging. In such cases, using an allocation method sensitive to small perturbations in the game can lead to various problems, including dissatisfaction among agents and the potential for manipulation by agents seeking to maximize their own benefits. Therefore, the allocation method must be robust against game perturbations. In this study, we explore optimization games, in which the value of the characteristic function is provided as the optimal value of an optimization problem. To assess the robustness of the allocation methods, we use the Lipschitz constant, which quantifies the extent of change in the allocation vector in response to a unit perturbation in the weight vector of the underlying problem. Thereafter, we provide an algorithm for the matching game that returns an allocation belonging to the $\left(\frac{1}{2}-\epsilon\right)$-approximate core with Lipschitz constant $O(\epsilon^{-1})$. Additionally, we provide an algorithm for a minimum spanning tree game that returns an allocation belonging to the $4$-approximate core with a constant Lipschitz constant. The Shapley value is a popular allocation that satisfies several desirable properties. Therefore, we investigate the robustness of the Shapley value. We demonstrate that the Lipschitz constant of the Shapley value for the minimum spanning tree is constant, whereas that for the matching game is $\Omega(\log n)$, where $n$ denotes the number of vertices.

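One of the allocations the abstract analyzes is the Shapley value. As a point of reference, here is a hedged brute-force computation of the Shapley value of a tiny matching-style game directly from its characteristic function; it is exponential in the number of players and has nothing to do with the paper's efficient, Lipschitz-robust constructions.

```python
# Brute-force Shapley value of a cooperative game given by a characteristic
# function v: subsets of players -> value. Exponential-time and only meant to
# illustrate the allocation concept the abstract studies; the paper's point is
# about Lipschitz robustness of such allocations, not about computing them.
from itertools import permutations

def shapley(players, v):
    """Average marginal contribution of each player over all orderings."""
    phi = {p: 0.0 for p in players}
    orderings = list(permutations(players))
    for order in orderings:
        coalition = frozenset()
        for p in order:
            phi[p] += v(coalition | {p}) - v(coalition)
            coalition = coalition | {p}
    return {p: total / len(orderings) for p, total in phi.items()}

# Toy matching-style game on a path a-b-c with unit edge weights:
# a coalition's value is the size of a maximum matching it can form.
def v(coalition):
    if {"a", "b"} <= coalition or {"b", "c"} <= coalition:
        return 1.0
    return 0.0

print(shapley(["a", "b", "c"], v))   # player b gets the largest share
```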

Equilibria in multiagent online problems with predictions

from arXiv: Data Structures and Algorithms

Authors: Gabriel Istrate, Cosmin Bonchiş, Victor Bogdan

We study the power of (competitive) algorithms with predictions in a multiagent setting. For this we introduce a multiagent version of the ski-rental problem. In this problem agents can collaborate by pooling resources to get a group license for some asset. If the license price is not met agents have to rent the asset individually for the day at a unit price. Otherwise the license becomes available forever to everyone at no extra cost. Our main contribution is a best-response analysis of a single-agent competitive algorithm that assumes perfect knowledge of other agents' actions (but no knowledge of its own renting time). We then analyze the setting when agents have a predictor for their own active time, yielding a tradeoff between robustness and consistency. We investigate the effect of using such a predictor in an equilibrium, as well as the new equilibria formed in this way.

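For background, the single-agent ski-rental problem with a prediction already exhibits the robustness-consistency tradeoff the abstract mentions. The sketch below follows a standard learning-augmented rule of thumb (buy earlier when the prediction says the horizon is long); it is not the multiagent group-license algorithm from the paper, and the threshold rule and the parameter lam are illustrative choices.

```python
# Single-agent ski rental with a prediction: rent at 1 per day, buying costs B.
# A parameter lam in (0, 1] trades off consistency (trusting the prediction)
# against robustness (worst-case guarantee). This is a sketch of the standard
# learning-augmented rule, not the multiagent group-license setting of the paper.
import math

def ski_rental_cost(B, true_days, predicted_days, lam):
    """Total cost when the buy threshold is chosen from the prediction."""
    if predicted_days >= B:
        buy_day = math.ceil(lam * B)          # prediction says "long": buy early
    else:
        buy_day = math.ceil(B / lam)          # prediction says "short": buy late
    if true_days < buy_day:
        return true_days                      # rented the whole time
    return (buy_day - 1) + B                  # rented, then bought

B = 10
for pred in (3, 30):
    for true in (3, 30):
        cost = ski_rental_cost(B, true, pred, lam=0.5)
        opt = min(true, B)
        print(f"pred={pred:2d} true={true:2d} cost={cost:2d} ratio={cost/opt:.2f}")
```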

Decentralized Privacy Preservation for Critical Connections in Graphs

from arXiv: Data Structures and Algorithms

Authors: Conggai Li, Wei Ni, Ming Ding, Youyang Qu, Jianjun Chen, David Smith, Wenjie Zhang, Thierry Rakotoarivelo

Many real-world interconnections among entities can be characterized as graphs. Collecting local graph information with balanced privacy and data utility has garnered notable interest recently. This paper delves into the problem of identifying and protecting critical information of entity connections for individual participants in a graph based on cohesive subgraph searches. This problem has not been addressed in the literature. To address the problem, we propose to extract the critical connections of a queried vertex using a fortress-like cohesive subgraph model known as $p$-cohesion. A user's connections within a fortress are obfuscated when being released, to protect critical information about the user. Novel merit and penalty score functions are designed to measure each participant's critical connections in the minimal $p$-cohesion, facilitating effective identification of the connections. We further propose to preserve the privacy of a queried vertex by protecting only its critical connections when responding to queries raised by data collectors. We prove that, under the decentralized differential privacy (DDP) mechanism, one's response satisfies $(\varepsilon, \delta)$-DDP when its critical connections are protected while the rest remains unperturbed. The effectiveness of our proposed method is demonstrated through extensive experiments on real-life graph datasets.

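As a point of reference for the privacy layer only (not the paper's p-cohesion-based method), here is a hedged sketch of randomized response applied to a vertex's adjacency bits, the generic building block for decentralized/local differential privacy on graphs; the flip probability follows the standard ε-LDP formula.

```python
# Randomized response on a vertex's adjacency bits: each bit is reported
# truthfully with probability e^eps / (e^eps + 1) and flipped otherwise, which
# satisfies eps-local DP for that vertex's edge list. This is only the generic
# building block; the paper perturbs just the "critical connections" identified
# via minimal p-cohesions, which this sketch does not attempt to model.
import math
import random

def randomized_response_bits(bits, eps, rng):
    keep = math.exp(eps) / (math.exp(eps) + 1.0)
    return [b if rng.random() < keep else 1 - b for b in bits]

rng = random.Random(0)
adjacency_row = [0, 1, 1, 0, 0, 1, 0, 0]      # this vertex's edges to others
print(randomized_response_bits(adjacency_row, eps=1.0, rng=rng))
```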

BYO: A Unified Framework for Benchmarking Large-Scale Graph Containers

from arXiv: Data Structures and Algorithms

Authors: Brian Wheatman, Xiaojun Dong, Zheqi Shen, Laxman Dhulipala, Jakub Łącki, Prashant Pandey, Helen Xu

A fundamental building block in any graph algorithm is a graph container - a data structure used to represent the graph. Ideally, a graph container enables efficient access to the underlying graph, has low space usage, and supports updating the graph efficiently. In this paper, we conduct an extensive empirical evaluation of graph containers designed to support running algorithms on large graphs. To our knowledge, this is the first apples-to-apples comparison of graph containers rather than overall systems, which include confounding factors such as differences in algorithm implementations and infrastructure. We measure the running time of 10 highly-optimized algorithms across over 20 different containers and 10 graphs. Somewhat surprisingly, we find that the average algorithm running time does not differ much across containers, especially those that support dynamic updates. Specifically, a simple container based on an off-the-shelf B-tree is only 1.22x slower on average than a highly optimized static one. Moreover, we observe that simplifying a graph-container Application Programming Interface (API) to only a few simple functions incurs a mere 1.16x slowdown compared to a complete API. Finally, we also measure batch-insert throughput in dynamic-graph containers for a full picture of their performance. To perform the benchmarks, we introduce BYO, a unified framework that standardizes evaluations of graph-algorithm performance across different graph containers. BYO extends the Graph Based Benchmark Suite (Dhulipala et al. '18), a state-of-the-art graph algorithm benchmark, to easily plug into different dynamic graph containers and enable fair comparisons between them on a large suite of graph algorithms. While several graph algorithm benchmarks have been developed to date, to the best of our knowledge, BYO is the first system designed to benchmark graph containers.

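To illustrate what a graph-container API reduced to "only a few simple functions" might look like (a guess at the shape, not BYO's actual interface), the sketch below writes BFS against just num_vertices() and neighbors(v); swapping containers then means swapping only the object behind that interface, which is the kind of apples-to-apples comparison the paper runs.

```python
# A guessed-at minimal graph-container interface (not BYO's actual API):
# algorithms only use num_vertices() and neighbors(v). Any container -- CSR
# arrays, a B-tree keyed by (src, dst), a hash map of adjacency sets -- can sit
# behind it, enabling apples-to-apples comparisons across containers.
from collections import deque

class AdjacencyListContainer:
    def __init__(self, n, edges):
        self.adj = [[] for _ in range(n)]
        for u, v in edges:
            self.adj[u].append(v)
            self.adj[v].append(u)

    def num_vertices(self):
        return len(self.adj)

    def neighbors(self, v):
        return iter(self.adj[v])

def bfs_distances(graph, source):
    """BFS written purely against the two-function container interface."""
    dist = [-1] * graph.num_vertices()
    dist[source] = 0
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in graph.neighbors(u):
            if dist[w] == -1:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

g = AdjacencyListContainer(5, [(0, 1), (1, 2), (2, 3), (3, 4), (0, 4)])
print(bfs_distances(g, source=0))
```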

Fair Set Cover

from arXiv: Data Structures and Algorithms

Authors: Mohsen Dehghankar, Rahul Raychaudhury, Stavros Sintos, Abolfazl Asudeh

The potential harms of algorithmic decisions have ignited algorithmic fairness as a central topic in computer science. One of the fundamental problems in computer science is Set Cover, which has numerous applications with societal impacts, such as assembling a small team of individuals that collectively satisfy a range of expertise requirements. However, despite its broad application spectrum and significant potential impact, set cover has yet to be studied through the lens of fairness. Therefore, in this paper, we introduce Fair Set Cover, which aims not only to cover with a minimum-size set but also to satisfy demographic parity in its selection of sets. To this end, we develop multiple versions of fair set cover, study their hardness, and devise efficient approximation algorithms for each variant. Notably, under certain assumptions, our algorithms always guarantee zero unfairness, with only a small increase in the approximation ratio compared to regular set cover. Furthermore, our experiments on various data sets and across different settings confirm the negligible price of fairness, as (a) the output size increases only slightly (if any) and (b) the time to compute the output does not significantly increase.

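For reference, the classic greedy Set Cover heuristic is sketched below on a made-up instance; it has no fairness constraint at all, which is exactly the gap the paper's fair variants address. It is not one of the paper's algorithms.

```python
# Classic greedy Set Cover: repeatedly pick the set covering the most uncovered
# elements (a ln n approximation). Shown only as the fairness-unaware baseline;
# the paper's algorithms additionally enforce demographic parity across sets.

def greedy_set_cover(universe, sets):
    uncovered = set(universe)
    chosen = []
    while uncovered:
        best = max(sets, key=lambda name: len(sets[name] & uncovered))
        if not sets[best] & uncovered:
            raise ValueError("instance is not coverable")
        chosen.append(best)
        uncovered -= sets[best]
    return chosen

universe = range(1, 8)
sets = {
    "s1": {1, 2, 3},
    "s2": {3, 4, 5},
    "s3": {5, 6, 7},
    "s4": {1, 4, 7},
}
print(greedy_set_cover(universe, sets))
```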

String 2-Covers with No Length Restrictions

from arXiv: Data Structures and Algorithms

Authors: Itai Boneh, Shay Golan, Arseny Shur

A $\lambda$-cover of a string $S$ is a set of strings $\{C_i\}_1^\lambda$ such that every index in $S$ is contained in an occurrence of at least one string $C_i$. The existence of a $1$-cover defines a well-known class of quasi-periodic strings. Quasi-periodicity can be decided in linear time, and all $1$-covers of a string can be reported in linear time plus the size of the output. Since in general it is NP-complete to decide whether a string has a $\lambda$-cover, the natural next step is the development of efficient algorithms for $2$-covers. Radoszewski and Straszy\'nski [ESA 2020] analysed the particular case where the strings in a $2$-cover must be of the same length. They provided an algorithm that reports all such $2$-covers of $S$ in time near-linear in $|S|$ and in the size of the output. In this work, we consider $2$-covers in full generality. Since every length-$n$ string has $\Omega(n^2)$ trivial $2$-covers (every prefix and suffix of total length at least $n$ constitute such a $2$-cover), we state the reporting problem as follows: given a string $S$ and a number $m$, report all $2$-covers $\{C_1,C_2\}$ of $S$ with length $|C_1|+|C_2|$ upper bounded by $m$. We present an $\tilde{O}(n + Output)$ time algorithm solving this problem, with Output being the size of the output. This algorithm admits a simpler modification that finds a $2$-cover of minimum length. We also provide an $\tilde{O}(n)$ time construction of a $2$-cover oracle which, given two substrings $C_1,C_2$ of $S$, reports in poly-logarithmic time whether $\{C_1,C_2\}$ is a $2$-cover of $S$.

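The defining condition is easy to state in code: {C1, C2} is a 2-cover of S if every index of S lies inside some occurrence of C1 or of C2. Below is a hedged quadratic-time checker that just verifies this definition; it is nothing like the near-linear reporting algorithm or the poly-logarithmic oracle from the paper.

```python
# Naive checker for the 2-cover definition: {C1, C2} covers S if every index of
# S is inside at least one occurrence of C1 or of C2. Quadratic time; the paper
# gives near-linear reporting and a poly-logarithmic-time oracle instead.

def occurrences(S, C):
    """Start positions of all (possibly overlapping) occurrences of C in S."""
    return [i for i in range(len(S) - len(C) + 1) if S[i:i + len(C)] == C]

def is_two_cover(S, C1, C2):
    covered = [False] * len(S)
    for C in (C1, C2):
        for start in occurrences(S, C):
            for i in range(start, start + len(C)):
                covered[i] = True
    return all(covered)

S = "abaababaab"
print(is_two_cover(S, "aba", "ab"))   # True: every index lies in "aba" or "ab"
print(is_two_cover(S, "aba", "ba"))   # False: the final "ab" is left uncovered
```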

Comparisons Are All You Need for Optimizing Smooth Functions

from arXiv: Data Structures and Algorithms

Authors: Chenyi Zhang, Tongyang Li

When optimizing machine learning models, there are various scenarios where gradient computations are challenging or even infeasible. Furthermore, in reinforcement learning (RL), preference-based RL that only compares between options has wide applications, including reinforcement learning with human feedback in large language models. In this paper, we systematically study optimization of a smooth function $f\colon\mathbb{R}^n\to\mathbb{R}$ only assuming an oracle that compares function values at two points and tells which is larger. When $f$ is convex, we give two algorithms using $\tilde{O}(n/\epsilon)$ and $\tilde{O}(n^{2})$ comparison queries to find an $\epsilon$-optimal solution, respectively. When $f$ is nonconvex, our algorithm uses $\tilde{O}(n/\epsilon^2)$ comparison queries to find an $\epsilon$-approximate stationary point. All these results match the best-known zeroth-order algorithms with function evaluation queries in $n$ dependence, thus suggesting that \emph{comparisons are all you need for optimizing smooth functions using derivative-free methods}. In addition, we also give an algorithm for escaping saddle points and reaching an $\epsilon$-second order stationary point of a nonconvex $f$, using $\tilde{O}(n^{1.5}/\epsilon^{2.5})$ comparison queries.

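A toy version of the access model: with only the bit 1{f(x) > f(y)}, comparing f at x + δe_i and x - δe_i reveals the sign of each partial derivative of a smooth f, and moving against that sign vector decreases the objective. The sketch below is an illustration of comparison-only optimization, not the paper's algorithms or their query bounds; the step sizes are arbitrary.

```python
# Comparison-only descent sketch: the optimizer never sees f(x), only the bit
# compare(x, y) = 1 if f(x) > f(y). Comparing f at x + delta*e_i and x - delta*e_i
# reveals the sign of the i-th partial derivative (for small delta on a smooth f),
# and moving against the sign vector decreases f. Illustrative only; not the
# paper's algorithms and far from their query complexity.
import numpy as np

def make_comparison_oracle(f):
    return lambda x, y: f(x) > f(y)

def comparison_descent(compare, x0, steps=200, delta=1e-4, lr=0.05):
    x = np.array(x0, dtype=float)
    for _ in range(steps):
        direction = np.zeros_like(x)
        for i in range(len(x)):
            e = np.zeros_like(x)
            e[i] = delta
            # If f(x + e) > f(x - e), the partial derivative is positive: move down.
            direction[i] = -1.0 if compare(x + e, x - e) else 1.0
        x = x + lr * direction
    return x

f = lambda x: (x[0] - 1.0) ** 2 + 2.0 * (x[1] + 0.5) ** 2   # hidden objective
oracle = make_comparison_oracle(f)
print(comparison_descent(oracle, x0=[3.0, 3.0]))            # approaches (1, -0.5)
```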

Monday, May 20

Les Atomes

from Ben Recht

Meehl's Philosophical Psychology, Lecture 5, Part 3.

It’s impossible for me to fathom, but the molecular theory of matter wasn’t widely accepted until the 1910s. Very famous, very smart scientists, including Ernst Mach and Henri Poincare, thought atoms were merely a convenient fiction for predicting experimental outcomes. Statistical calculations, like those deployed to derive the ideal gas laws from kinetic theory, were akin to approximating integrals with discrete sums. Since bulk media obeyed differential equations, many thought it was more likely that they were continuous substances. That nature was an assemblage of a nearly infinite collection of invisible, discrete billiard balls seemed rightfully outlandish.

The controversy was effectively put to rest by Jean Baptiste Perrin in his 1913 book Les Atomes. Perrin spent hundreds of pages detailing the experimental evidence for atoms and molecules. The core of his argument was that if you assume molecules exist, you could count them in surprisingly diverse ways.

Again, it’s difficult to take my 21st century brain and imagine the mindset of a nineteenth century chemist, but scientists had somehow accepted the unit of a mole before they accepted the theory of atoms. The person who coined the term mole, Nobel Prize Winner Wilhelm Ostwald, was another famous atomic skeptic. How could you believe in moles but not molecules? I suppose it’s not that crazy. Different substances had different “molecular weights,” meaning that the same “amount of stuff” could weigh different amounts. You could observe nitrogen and hydrogen balloons at the same temperature, pressure, and volume; one would rise while the other would not. Oil floats on water. Gold is heavier than lead.

Perrin argued a mole always had the same number of molecules. This number, NA, is called Avogadro’s Constant and equals 6.02214076 × 10²³.¹ No matter how he went about trying to estimate NA, he always found the same answer.

Perrin presented a derivation of NA from Brownian motion, assuming a macroscopic particle was bombarded by microscopic particles in a fluid. This bombardment caused the macroscopic particle to randomly move around. The predictions of Brownian motion would rely on some number of molecules in any given unit of the fluid. Balancing the predictions of Brownian motion against the effect of gravity gave the number of water molecules in a small volume. Extrapolating out provided an estimate of NA.

Perrin described a derivation of Einstein that predicted the color of the sky by analyzing the statistical mechanical scattering of light by air particles. This derivation relied on an estimate of the number of particles in a given volume of air. Perrin derived an estimate from the theory of black body radiation, using its calculation of Boltzmann’s constant and applying the fact that the “R” in the ideal gas law is equal to Avogadro’s number multiplied by Boltzmann’s constant.

Perrin also used properties of alpha decay to compute an estimate. Radium emits alpha particles, which combine with electrons to produce helium. Perrin knew the rate of alpha particle decay, which yielded a prediction of the number of helium atoms. He then compared this number to the amount of helium produced to get yet another estimate of NA.

Electrochemistry determined the charge required to deposit one mole of silver onto an electrode through electrolysis. This was called a Faraday and had value F. Assuming it costs one electron to deposit one atom, then the total number of atoms deposited should be F/e where e is the charge of the electron. Millikan had recently computed an estimate for e, and hence this gave another path to estimating Avogadro’s number.

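The arithmetic behind the electrolysis route (and the gas-law route two paragraphs up) is worth making explicit; with modern values of the constants, both quotients land on the same number.

```python
# Two of Perrin's routes to Avogadro's number, with modern constant values:
# electrolysis gives N_A = F / e, and the ideal gas law gives N_A = R / k_B.
faraday = 96485.332              # C per mole of electrons
electron_charge = 1.602177e-19   # C
gas_constant_R = 8.314462        # J / (mol K)
boltzmann_k = 1.380649e-23       # J / K

print(f"F / e   = {faraday / electron_charge:.3e}")      # ~6.022e+23
print(f"R / k_B = {gas_constant_R / boltzmann_k:.3e}")   # ~6.022e+23
```
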
The conclusion of Les Atomes contains a remarkable table. Perrin lists his estimates of NA from 13 different derivations.

No matter which he chose, the number was always around 6 × 10²³. The fact that all of these different calculations gave the same answer was indeed a damned strange coincidence. After Perrin’s work, almost everyone (except Mach²) conceded that matter was composed of atoms and molecules. This was in 1913! Look where we are today.

Let’s take a minute to explore how this example fits into Meehl’s Lakatosian Defense framework and why Meehl spent so much time on Avogadro in Lecture 5. Here we have a single theory: “A mole is a constant number of molecules.” Adjoining this theory to a variety of different derivation chains gives the same value for NA. The results are too close to each other for that to have happened by pure chance, and hence the theory is corroborated.

Well, except it’s not really that clean. First, it’s hard to precisely say what is the probability here. What does it mean that the probability that Perrin’s calculations all gave the same NA was exceptionally small if atoms weren’t real? What is p(atoms)? The atomic skeptics were clearly very surprised by Perrin’s results. Wesley Salmon quotes Poincare as exclaiming, “How can you argue, if all of the counts come out the same?” He announced, “Atoms are no longer a useful fiction; things seem to us in favor of saying that we see them since we know how to count them.”

Still, I don’t think we can make this “probability” notion too formal. Meehl chirps against hypothesis testing here. It’s certainly not the case that physicists ran some F-test or something on Perrin’s table to convince themselves of the result. It certainly wasn’t the case that scientists at the time ground out some Bayesian confirmation calculation, as Bayesian statistics wouldn’t be formulated for another twenty years. Salmon tries to construct a post hoc “common cause” formalization of atomic confirmation, but I am not at all convinced by his handwavy argument.

Meehl argues it’s the specificity of the predictions that convinced everyone. I suppose we could reverse engineer a Meehlian Spielraum argument here. Suppose Perrin had instead assumed that Avogadro’s number was 6 × 10²³. Then he’d get estimates of the charge of the electron, Boltzmann’s constant, the decay rate of radium, and a dozen other physical constants, all within 25% of their known values. Perrin’s atomic theory would be predicting a remarkably narrow range of the Spielraum over dozens of experiments. Again, this isn’t what Perrin did, but I can imagine that if he had presented his results this way, people would have been just as convinced.

And obviously, the fact that all of Perrin’s estimates rested on a single number made the whole theory too good to ignore. A friend joked this weekend that these days, we’re happy if a billion-parameter model gets one prediction correct. But we’re certainly much happier if one parameter yields a billion predictions.

Indeed, the atomic theory would continue to predict the outcomes of endless experiments. It would prove critical in understanding X-rays and estimating their wavelengths. And once the atomic theory became entrenched, it would be guarded by Lakatosian Defense. When Bäcklin and Bearden made measurements that found too high a value for X-ray wavelengths, this suggested an error in the estimates of Avogadro's number. Prins would write to Nature to correct the record: “the usual diffraction formula needs a correction when applied to X-rays under the usual experimental conditions.” Atomic theory was now far too useful and had to be defended at all costs.

If you want to get more into the weeds of atoms, check out Wesley Salmon’s Scientific Explanation and the Causal Structure of the World. The blog here pieces together Meehl’s lecture, Salmon’s argument, and my own reading of Perrin. Maybe I got too into the weeds here.

1

By modern standards, Avogadro’s constant defines the mole. This is now completely backwards: a mole was first defined by a volume of gas, but is now defined to be NA molecules.

2

As Meehl says, Mach died in 1916, but maybe he would have changed his mind had he lived a bit longer.

By Ben Recht

TR24-094 | Interactive Proofs for General Distribution Properties | Tal Herman, Guy Rothblum

from ECCC Papers

Suppose Alice has collected a small number of samples from an unknown distribution, and would like to learn about the distribution. Bob, an untrusted data analyst, claims that he ran a sophisticated data analysis on the distribution, and makes assertions about its properties. Can Alice efficiently verify Bob's claims using fewer resources (say in terms of samples and computation) than would be needed to run the analysis herself? We construct an interactive proof system for any distribution property that can be decided by uniform polynomial-size circuits of bounded depth: the circuit gets a complete description of the distribution and decides whether it has the property. Taking $N$ to be an upper bound on the size of the distribution's support, the verifier's sample complexity, running time, and the communication complexity are all sublinear in $N$: they are bounded by $\widetilde{O}( N^{1-\alpha} + D)$ for a constant $\alpha > 0$, where $D$ is a bound on the depth of the circuits that decide the property. The honest prover runs in $\text{poly}(N)$ time and has quasi-linear sample complexity. Moreover, the proof system is tolerant: it can be used to approximate the distribution's distance from the property. We show similar results for any distribution property that can be decided by a bounded-space Turing machine (that gets as input a complete description of the distribution). We remark that even for simple properties, deciding the property without a prover requires quasi-linear sample complexity and running time. Prior work [Herman and Rothblum, FOCS 2023] demonstrated sublinear interactive proof systems, but only for the much more restricted class of label-invariant distribution properties.

I don't do that well when the Jeopardy category is Math

from Computational Complexity

Bill and Darling are watching Jeopardy.


DARLING: Bill, one of the categories is MATH TALK. You will kick butt!
BILL: Not clear. I doubt they will have the least number n such that R(n) is not known. They will ask things easy enough so that my math knowledge won't help.
DARLING: But you can answer faster.
BILL: Not clear.

--------------------------------------------

Recall that in Jeopardy they give the answers and you come up with the question. Like Sheldon Cooper I prefer my questions in the form of a question. Even so, I will present the answers that were given on the show (that sounds funny), then I will provide the questions (that sounds funny), what happened, and what I would have gotten right.

$400
ANSWER: It's a demonstrably true mathematical statement; Calculus has a ``Fundamental'' one.
QUESTION: What is a Theorem?
WHAT HAPPENED: Someone buzzed in and said AXIOM. I knew the answer to this one and would have won!

$800
ANSWER: Fire up the engines of your mind and name this solid figure with equal and parallel circles at either end.
QUESTION: What is a Cylinder?
WHAT HAPPENED: Someone buzzed in with the correct answer. I had a hard time parsing this one and only got it right in hindsight. This one I would have lost on. Note that the phrase Fire up your engines is supposed to make you think of Fire on all cylinders. This did not help me.

$1200
ANSWER: Multiply the numerator of one fraction by the denominator of another (and vice versa) to get the ``cross'' this.
QUESTION: What is a Product?
WHAT HAPPENED: I got this one very fast. So did the contestant on the real show. Not clear what would have happened if I was there.

$1600
ANSWER: See if you can pick off this term for the point at which a line or curve crosses an axis.
QUESTION: What is an Intercept?
WHAT HAPPENED: Someone buzzed in with the correct answer. I really didn't know what they were getting at. Even in hindsight the answer does not seem right, though I am sure that it is. The phrase pick off this term is supposed to remind me of something, but it didn't. Lance happened to read a draft of this post and did the obvious thing: asked ChatGPT about it. ChatGPT said that in football a pick off is an interception. To see the ChatGPT transcript see here.

$2000
ANSWER: In 19-5=14, 19 is the minuend; 5 is this other ``end.''
QUESTION: What is a Subtrahend?
WHAT HAPPENED: Someone buzzed in with the correct answer. The answer was news to me. It is correct; however, I am not embarrassed to say I never heard these terms. Spellcheck thinks that minuend and subtrahend are words. This is similar to when I was not smarter than a fifth grader (see blog post here).

----------------------------------------------------------------

So the final tally: the $400 question I would have gotten right; the $1200 question I might have gotten right if I was fast on the buzzer.

But that's it. Why did I do so badly?

1) Two of the ones I got wrong were phrased in funny ways. I thought so anyway. And note that they did not use advanced math knowledge, so my math knowledge didn't help. (This is not a complaint; it would be bad if they used advanced math knowledge. Like when a crossword puzzle my wife was working on wanted Log-Man and it began with N and I knew Napier. Why was that in a crossword puzzle for laypeople? Because Napier has a lot of vowels in it.)

2) One of them I really did not know the math knowledge. Is it arrogant to say that if there is a math question on Jeopardy where I don't know the answer then its a bad question? I leave that as an exercise for the reader. 
On questions about  presidents, vice presidents, or American history, I do well.
On questions about novelty songs  (sometimes comes up) I do very well. (One question was about this song here. The question: here.)

But math... not so much. 
For computer science questions I also do not do that well, but I've learned some common abbreviations that I did not know: 
BIT: Binary Integer (A reader named Anonymous, who makes many comments, pointed out that BIT is actually Binary Digit. I have a possibly false memory of Jeopardy telling me Binary Integer. Either my memory is wrong or Jeopardy is wrong. But Anonymous is right- its Binary Digit.) 
HTTP: Hypertext Transfer Protocol
HTML: Hyper Text Markup Language
FORTRAN: Formula Translation
Those were more interesting than learning about minuend and subtrahend, terms I had never heard before and won't hear again unless I catch a rerun of Jeopardy (at which time I will get it right).



By gasarch

Bill and Darling are watching Jeopardy.


DARLING: Bill, one of the categories is MATH TALK. You will kick butt!

BILL: Not clear. I doubt they will ask for the least n such that the Ramsey number R(n) is not known. They will ask things easy enough that my math knowledge won't help.

DARLING: But you can answer faster.

BILL: Not clear. 
--------------------------------------------
Recall that in Jeopardy they give the answers and you come up with the question.
Like Sheldon Cooper I prefer my questions in the form of a question. 
Even so, I will present the answers that were given on the show (that sounds funny), then 
I will provide the questions (that sounds funny), what happened, and what I would have gotten right. 

$400
ANSWER: It's a demonstrably true mathematical statement; Calculus has a ``Fundamental'' one.
QUESTION: What is a Theorem?
WHAT HAPPENED: Someone buzzed in and said AXIOM. I knew the answer to this one and would have won!

$800
ANSWER: Fire up the engines of your mind and name this solid figure with equal and parallel circles at either end. 
QUESTION: What is a Cylinder?
WHAT HAPPENED: Someone buzzed in with the correct answer. I had a hard time parsing this one and only got it right in hindsight. This one I would have lost. Note that the phrase ``fire up the engines'' is supposed to make you think of firing on all cylinders. That did not help me.

$1200
ANSWER: Multiply the numerator of one fraction by the denominator of another (and vice versa) to get the ``cross'' this. 
QUESTION: What is a Product?
WHAT HAPPENED: I got this one very fast. So did the contestant on the real show. Not clear what would have happened if I had been there.

$1600
ANSWER: See if you can pick off this term for the point at which a line or curve crosses an axis. 
QUESTION: What is an Intercept?
WHAT HAPPENED: Someone buzzed in with the correct answer. I really didn't know what they were getting at. Even in hindsight the answer does not seem right, though I am sure that it is. The phrase ``pick off this term'' is supposed to remind me of something, but it didn't. Lance happened to read a draft of this post and did the obvious thing: he asked ChatGPT about it. ChatGPT said that in football a pick off is an interception. The ChatGPT transcript is here.

$2000
ANSWER: In 19-5=14 19 is the minuend; 5 is this other ``end''
QUESTION: What is a  Subtrahend?
WHAT HAPPENED: Someone buzzed in with the correct answer. The answer was news to me. It is correct; however, I am not embarrassed to say I had never heard these terms. Spellcheck thinks that minuend and subtrahend are words. This is similar to when I was not smarter than a fifth grader (see blog post here). 

----------------------------------------------------------------
So the final tally:
The $400 question I would have gotten right.
The $1200 question I might have gotten right if I had been fast on the buzzer.

But that's it. Why did I do so badly? 
1) Two of the ones I got wrong were phrased in funny ways (I thought so, anyway). And note that they did not require advanced math knowledge, so my math knowledge didn't help. (This is not a complaint: it would be bad if they required advanced math knowledge. It would be like the time a crossword puzzle my wife was working on wanted ``Log-Man,'' it began with N, and I knew it was Napier. Why was that in a crossword puzzle for laypeople? Because Napier has a lot of vowels in it.)

2) For one of them I really did not know the math. Is it arrogant to say that if there is a math question on Jeopardy whose answer I don't know, then it's a bad question? I leave that as an exercise for the reader. 

On questions about  presidents, vice presidents, or American history, I do well.

On questions about novelty songs (they sometimes come up) I do very well. (One question was about this song here. The question: here.)

But math... not so much. 

For computer science questions I also do not do that well, but I've learned some common abbreviations that I did not know: 

BIT: Binary Integer (A reader named Anonymous, who makes many comments, pointed out that BIT is actually Binary Digit. I have a possibly false memory of Jeopardy telling me Binary Integer. Either my memory is wrong or Jeopardy is wrong. But Anonymous is right: it's Binary Digit.) 

HTTP: Hypertext Transfer Protocol

HTML: Hyper Text Markup Language

FORTRAN: Formula Translation

Those were more interesting than learning about minuend and subtrahend, terms I had never heard before and won't hear again unless I catch a rerun of Jeopardy (at which time I will get it right).




By gasarch

Submodular Information Selection for Hypothesis Testing with Misclassification Penalties

from arXiv: Computational Complexity

Authors: Jayanth Bhargav, Mahsa Ghasemi, Shreyas Sundaram

We consider the problem of selecting an optimal subset of information sources for a hypothesis testing/classification task where the goal is to identify the true state of the world from a finite set of hypotheses, based on finite observation samples from the sources. In order to characterize the learning performance, we propose a misclassification penalty framework, which enables non-uniform treatment of different misclassification errors. In a centralized Bayesian learning setting, we study two variants of the subset selection problem: (i) selecting a minimum cost information set to ensure that the maximum penalty of misclassifying the true hypothesis remains bounded and (ii) selecting an optimal information set under a limited budget to minimize the maximum penalty of misclassifying the true hypothesis. Under mild assumptions, we prove that the objectives (or constraints) of these combinatorial optimization problems are weakly (or approximately) submodular, and establish high-probability performance guarantees for greedy algorithms. Further, we propose an alternate metric for information set selection which is based on the total penalty of misclassification. We prove that this metric is submodular and establish near-optimal guarantees for the greedy algorithms for both information set selection problems. Finally, we present numerical simulations to validate our theoretical results over several randomly generated instances.
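As a rough illustration of the greedy template analyzed above (not the paper's algorithm; the coverage-style objective and source names below are stand-ins), here is a minimal budgeted greedy selection sketch in Python.

```python
# Minimal sketch of budgeted greedy subset selection (illustrative only;
# the objective below is a placeholder, not the paper's penalty-based metric).

def greedy_select(candidates, gain, budget, cost=lambda s: 1):
    """Greedily add the source with the best marginal gain per unit cost
    until the budget is exhausted."""
    selected = set()
    spent = 0
    while True:
        best, best_ratio = None, 0.0
        for s in candidates - selected:
            c = cost(s)
            if spent + c > budget:
                continue
            marginal = gain(selected | {s}) - gain(selected)
            if c > 0 and marginal / c > best_ratio:
                best, best_ratio = s, marginal / c
        if best is None:
            return selected
        selected.add(best)
        spent += cost(best)

# Toy example: a "coverage" gain, which is monotone submodular.
universe_of = {                      # hypothetical sources and what they observe
    "s1": {1, 2, 3}, "s2": {3, 4}, "s3": {4, 5, 6}, "s4": {1, 6},
}
gain = lambda S: len(set().union(*(universe_of[s] for s in S)) if S else set())
print(greedy_select(set(universe_of), gain, budget=2))
```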


Injective hardness condition for PCSPs

from arXiv: Computational Complexity

Authors: Demian Banakh, Marcin Kozik

We present a template for the Promise Constraint Satisfaction Problem (PCSP) which is NP-hard but does not satisfy the current state-of-the-art hardness condition [ACMTCT'21]. We introduce a new "injective" condition based on the smooth version of the layered PCP Theorem and use this new condition to confirm that the problem is indeed NP-hard. In the second part of the article, we establish a dichotomy for Boolean PCSPs defined by templates with polymorphisms in the set of linear threshold functions. The reasoning relies on the new injective condition.


You Can't Solve These Super Mario Bros. Levels: Undecidable Mario Games

from arXiv: Computational Complexity

Authors: MIT Hardness Group, Hayashi Ani, Erik D. Demaine, Holden Hall, Ricardo Ruiz, Naveen Venkat

We prove RE-completeness (and thus undecidability) of several 2D games in the Super Mario Bros. platform video game series: the New Super Mario Bros. series (original, Wii, U, and 2), and both Super Mario Maker games in all five game styles (Super Mario Bros. 1 and 3, Super Mario World, New Super Mario Bros. U, and Super Mario 3D World). These results hold even when we restrict to constant-size levels and screens, but they do require generalizing to allow arbitrarily many enemies at each location and onscreen, as well as allowing for exponentially large (or no) timer. Our New Super Mario Bros. constructions fit within one standard screen size. In our Super Mario Maker reductions, we work within the standard screen size and use the property that the game engine remembers offscreen objects that are global because they are supported by "global ground". To prove these Mario results, we build a new theory of counter gadgets in the motion-planning-through-gadgets framework, and provide a suite of simple gadgets for which reachability is RE-complete.


Learning low-degree quantum objects

from arXiv: Data Structures and Algorithms

Authors: Srinivasan Arunachalam, Arkopal Dutt, Francisco Escudero Gutiérrez, Carlos Palazuelos

We consider the problem of learning low-degree quantum objects up to $\varepsilon$-error in $\ell_2$-distance. We show the following results: $(i)$ unknown $n$-qubit degree-$d$ (in the Pauli basis) quantum channels and unitaries can be learned using $O(1/\varepsilon^d)$ queries (independent of $n$), $(ii)$ polynomials $p:\{-1,1\}^n\rightarrow [-1,1]$ arising from $d$-query quantum algorithms can be classically learned from $O((1/\varepsilon)^d\cdot \log n)$ many random examples $(x,p(x))$ (which implies learnability even for $d=O(\log n)$), and $(iii)$ degree-$d$ polynomials $p:\{-1,1\}^n\to [-1,1]$ can be learned through $O(1/\varepsilon^d)$ queries to a quantum unitary $U_p$ that block-encodes $p$. Our main technical contributions are new Bohnenblust-Hille inequalities for quantum channels and completely bounded~polynomials.
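As an illustration of the classical-learning flavor of item $(ii)$ (a generic low-degree regression sketch with toy parameters, not the paper's method or its sample-complexity bounds):

```python
# Illustrative sketch (not the paper's algorithm): recover a low-degree
# multilinear polynomial p:{-1,1}^n -> R from random examples (x, p(x))
# by least-squares regression on all monomials of degree <= d.
import itertools
import numpy as np

rng = np.random.default_rng(0)
n, d, m = 8, 2, 2000                      # variables, degree bound, samples

monomials = [S for k in range(d + 1) for S in itertools.combinations(range(n), k)]

def features(X):
    # One column per monomial chi_S(x) = prod_{i in S} x_i.
    return np.stack([np.prod(X[:, list(S)], axis=1) if S else np.ones(len(X))
                     for S in monomials], axis=1)

# Hidden degree-2 polynomial used to generate examples (assumption for the demo).
true_coef = {(): 0.1, (0,): 0.5, (1, 3): -0.4, (2, 6): 0.3}
def p(X):
    return sum(c * (np.prod(X[:, list(S)], axis=1) if S else 1.0)
               for S, c in true_coef.items())

X = rng.choice([-1.0, 1.0], size=(m, n))
coef, *_ = np.linalg.lstsq(features(X), p(X), rcond=None)

X_test = rng.choice([-1.0, 1.0], size=(1000, n))
err = np.mean((features(X_test) @ coef - p(X_test)) ** 2)
print(f"test mean-squared error: {err:.2e}")   # should be ~0 up to numerics
```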


Real-World Graph Analysis: Techniques for Static, Dynamic, and Temporal Communities

from arXiv: Data Structures and Algorithms

Authors: Davide Rucci

Graphs are widely used in various fields of computer science. They have also found application in unrelated areas, leading to a diverse range of problems. These problems can be modeled as relationships between entities in various contexts, such as social networks, protein interactions in cells, and route maps. Therefore it is logical to analyze these data structures with diverse approaches, whether they are numerical or structural, global or local, approximate or exact. In particular, the concept of community plays an important role in local structural analysis, as it is able to highlight the composition of the underlying graph while providing insights into what the organization and importance of the nodes in a network look like. This thesis pursues the goal of extracting knowledge from different kinds of graphs, including static, dynamic, and temporal graphs, with a particular focus on their community substructures. To tackle this task we use combinatorial algorithms that can list all the communities in a graph according to different formalizations, such as cliques, $k$-graphlets, and $k$-cores. We first develop new algorithms to enumerate subgraphs, using traditional and novel techniques such as push-out amortization, and CPU cache analysis to boost their efficiency. We then extend these concepts to the analysis of real-world graphs across diverse domains, ranging from social networks to autonomous systems modeled as temporal graphs. In this field, there is currently no widely accepted adaptation, even for straightforward subgraphs like $k$-cores, and the available data is expanding both in terms of quantity and scale. As a result, our findings advance the state of the art both from a theoretical and a practical perspective and can be used in a static or dynamic setting to further speed up and refine graph analysis techniques.
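As a small illustration of one of the community notions mentioned ($k$-cores), here is the standard sequential peeling sketch; it is not the thesis's optimized enumeration machinery.

```python
# Minimal k-core sketch: repeatedly remove vertices of degree < k (standard
# peeling; not the thesis's enumeration algorithms).
from collections import defaultdict

def k_core(edges, k):
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    changed = True
    while changed:
        changed = False
        for v in list(adj):
            if v not in adj:
                continue
            if len(adj[v]) < k:
                for u in adj[v]:
                    adj[u].discard(v)
                del adj[v]
                changed = True
    return set(adj)

# Toy example: a triangle plus a pendant vertex; the 2-core drops the pendant.
edges = [(1, 2), (2, 3), (1, 3), (3, 4)]
print(k_core(edges, 2))   # {1, 2, 3}
```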


On Minimal Transversals of Maximal Cliques in Graphs

from arXiv: Data Structures and Algorithms

Authors: Endre Boros, Vladimir Gurvich, Martin Milanič, Dmitry Tikhanovsky, Yushi Uno

A hypergraph is conformal if it is the family of maximal cliques of a graph. In this paper we are interested in the problem of determining when the family of minimal transversals of maximal cliques of a graph is conformal. Such graphs are called clique dually conformal (CDC for short). As our main results, we completely characterize CDC graphs within the families of triangle-free graphs and split graphs. Both characterizations lead to polynomial-time recognition algorithms. We also show that the class of CDC graphs is closed under substitution, in the strong sense that substituting a graph $H$ for a vertex of a graph $G$ results in a CDC graph if and only if both $G$ and $H$ are CDC.


Parameterized Complexity of Dominating Set Variants in Almost Cluster and Split Graphs

from arXiv: Data Structures and Algorithms

Authors: Dishant Goyal, Ashwin Jacob, Kaushtubh Kumar, Diptapriyo Majumdar, Venkatesh Raman

We consider structural parameterizations of the fundamental Dominating Set problem and its variants in the parameter ecology program. We give improved FPT algorithms and lower bounds under well-known conjectures for dominating set in graphs that are k vertices away from a cluster graph or a split graph. These are graphs in which there is a set of k vertices (called the modulator) whose deletion results in a cluster graph or a split graph. We also call k the deletion distance (to the appropriate class of graphs). When parameterized by the deletion distance k to cluster graphs, we can find a minimum dominating set (DS) in 3^k n^{O(1)}-time. Within the same time, we can also find a minimum independent dominating set (IDS), a minimum dominating clique (DC), a minimum efficient dominating set (EDS), or a minimum total dominating set (TDS). We also show that most of these variants of dominating set do not have polynomial-sized kernels. Additionally, we show that when parameterized by the deletion distance k to split graphs, IDS can be solved in 2^k n^{O(1)}-time and EDS can be solved in 3^{k/2} n^{O(1)}-time.


Lock-Free Augmented Trees

from arXiv: Data Structures and Algorithms

Authors: Panagiota Fatourou, Eric Ruppert

Augmenting an existing sequential data structure with extra information to support greater functionality is a widely used technique. For example, search trees are augmented to build sequential data structures like order-statistic trees, interval trees, tango trees, link/cut trees and many others. We study how to design concurrent augmented tree data structures. We present a new, general technique that can augment a lock-free tree to add any new fields to each tree node, provided the new fields' values can be computed from information in the node and its children. This enables the design of lock-free, linearizable analogues of a wide variety of classical augmented data structures. As a first example, we give a wait-free trie that stores a set $S$ of elements drawn from $\{1,\ldots,N\}$ and supports linearizable order-statistic queries such as finding the $k$th smallest element of $S$. Updates and queries take $O(\log N)$ steps. We also apply our technique to a lock-free binary search tree (BST), where changes to the structure of the tree make the linearization argument more challenging. Our augmented BST supports order statistic queries in $O(h)$ steps on a tree of height $h$. The augmentation does not affect the asymptotic running time of the updates. For both our trie and BST, we give an alternative augmentation to improve searches and order-statistic queries to run in $O(\log |S|)$ steps (with a small increase in step complexity of updates). As an added bonus, our technique supports arbitrary multi-point queries (such as range queries) with the same time complexity as they would have in the corresponding sequential data structure.
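For reference, the classical sequential structure being made concurrent here is an order-statistic tree: a BST whose nodes store subtree sizes, so that the $k$th smallest element is found in $O(h)$ steps. A minimal sequential (non-concurrent, unbalanced) sketch:

```python
# Sequential sketch of an augmented BST with subtree sizes (the classical
# order-statistic tree; the paper's contribution is making such augmentation
# lock-free and linearizable, which this toy code does not attempt).
class Node:
    def __init__(self, key):
        self.key, self.left, self.right, self.size = key, None, None, 1

def insert(root, key):
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = insert(root.left, key)
    else:
        root.right = insert(root.right, key)
    root.size += 1
    return root

def kth_smallest(root, k):          # 1-indexed rank query in O(height)
    left_size = root.left.size if root.left else 0
    if k == left_size + 1:
        return root.key
    if k <= left_size:
        return kth_smallest(root.left, k)
    return kth_smallest(root.right, k - left_size - 1)

root = None
for x in [17, 5, 23, 2, 11, 19, 42]:
    root = insert(root, x)
print([kth_smallest(root, k) for k in range(1, 8)])  # prints the keys in sorted order
```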


A Polynomial-Time Approximation for Pairwise Fair $k$-Median Clustering

from arXiv: Data Structures and Algorithms

Authors: Sayan Bandyapadhyay, Eden Chlamtáč, Yury Makarychev, Ali Vakilian

In this work, we study pairwise fair clustering with $\ell \ge 2$ groups, where for every cluster $C$ and every group $i \in [\ell]$, the number of points in $C$ from group $i$ must be at most $t$ times the number of points in $C$ from any other group $j \in [\ell]$, for a given integer $t$. To the best of our knowledge, only bi-criteria approximation and exponential-time algorithms follow for this problem from the prior work on fair clustering problems when $\ell > 2$. In our work, focusing on the $\ell > 2$ case, we design the first polynomial-time $(t^{\ell}\cdot \ell\cdot k)^{O(\ell)}$-approximation for this problem with $k$-median cost that does not violate the fairness constraints. We complement our algorithmic result by providing hardness of approximation results, which show that our problem even when $\ell=2$ is almost as hard as the popular uniform capacitated $k$-median, for which no polynomial-time algorithm with an approximation factor of $o(\log k)$ is known.
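The pairwise fairness constraint itself is easy to state in code. A small checker following the definition in the abstract (the cluster and group encodings below are illustrative):

```python
# Check the pairwise fairness constraint from the abstract: in every cluster,
# the count of each group is at most t times the count of every other group.
from collections import Counter

def is_pairwise_fair(clusters, group_of, t):
    """clusters: iterable of iterables of points; group_of: point -> group id."""
    all_groups = set(group_of.values())
    for cluster in clusters:
        counts = Counter(group_of[p] for p in cluster)
        for i in all_groups:
            for j in all_groups:
                if counts[i] > t * counts[j]:
                    return False
    return True

group_of = {1: "a", 2: "a", 3: "b", 4: "b", 5: "a", 6: "b"}
print(is_pairwise_fair([[1, 3], [2, 4], [5, 6]], group_of, t=1))   # True
print(is_pairwise_fair([[1, 2, 5], [3, 4, 6]], group_of, t=1))     # False
```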


Saturday, May 18

Sheila Sundaram, May 19th, 18:05 (Israel time):  Stirling numbers, Koszul duality and cohomology of configuration spaces (joint work with Ayah Almousa and Vic Reiner)

from Gil Kalai

Let me share an announcement of a zoom lecture by Sheila Sundaram. The lecture (tomorrow, Sunday) is about the paper   Koszulity, supersolvability, and Stirling representations, by Ayah Almousa, Victor Reiner, and Sheila Sundaram. _____________   Bar-Ilan Combinatorics Seminar The next meeting … Continue reading →

Let me share an announcement of a zoom lecture by Sheila Sundaram. The lecture (tomorrow, Sunday) is about the paper
 

Koszulity, supersolvability, and Stirling representations,

by Ayah Almousa, Victor Reiner, and Sheila Sundaram.

_____________
 

Bar-Ilan Combinatorics Seminar

The next meeting will be, IYH, on Sunday 11 Iyar (May 19th), 18:05 pm 

(18:05 refers to Israel time.)

*Note change to unusual time
 
on zoom
 

Title:  Stirling numbers, Koszul duality and cohomology of configuration spaces

Abstract: (Joint work with Ayah Almousa and Vic Reiner)

 
The unsigned Stirling numbers c(n,k) of the first kind give the Hilbert function for two algebras associated to the hyperplane arrangement in type A, the Orlik-Solomon algebra and the graded Varchenko-Gelfand algebra. Both algebras carry symmetric group actions with a rich structure, and have been well studied by topologists, algebraists and combinatorialists: the first coincides with the Whitney homology of the partition lattice, and the second with a well-known decomposition (Thrall’s decomposition, giving the higher Lie characters) of the universal enveloping algebra of the free Lie algebra. In each case the graded representations have dimension c(n,k).

Both these algebras are examples of Koszul algebras, for which the Priddy resolution defines a group-equivariant dual Koszul algebra. Now the Hilbert function is given by the Stirling numbers S(n,k) of the second kind, and hence the Koszul duality relation defines representations of the symmetric group whose dimensions are the numbers S(n,k). Investigating this observation leads to the realisation that this situation generalises to all supersolvable matroids.

 
The Koszul duality recurrence is shown to have interesting general consequences. For the resulting group representations, it implies the existence of branching rules which, in the case of the braid arrangement, specialise by dimension to the classical enumerative recurrences satisfied by the Stirling numbers of both kinds. It also implies representation stability in the sense of Church and Farb.
 
The associated Koszul dual representations appear to have other properties that are more mysterious; for example, in the case of the braid arrangement, the Stirling representations of the Koszul dual are sometimes tantalisingly close to being permutation modules.
 
We also show a connection with the homology of the subword order poset, from previous work of the speaker, where Stirling numbers arise again in the representations.
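As a purely numerical aside, the dimensions mentioned above are the Stirling numbers of both kinds, computable from their standard recurrences (standard facts, not taken from the paper):

```python
# Unsigned Stirling numbers of the first kind c(n,k) and of the second kind
# S(n,k), via their standard recurrences (the dimensions mentioned above).
from functools import lru_cache

@lru_cache(maxsize=None)
def c(n, k):                      # permutations of [n] with k cycles
    if n == 0:
        return 1 if k == 0 else 0
    return c(n - 1, k - 1) + (n - 1) * c(n - 1, k) if k > 0 else 0

@lru_cache(maxsize=None)
def S(n, k):                      # partitions of [n] into k blocks
    if n == 0:
        return 1 if k == 0 else 0
    return S(n - 1, k - 1) + k * S(n - 1, k) if k > 0 else 0

n = 5
print([c(n, k) for k in range(n + 1)])  # [0, 24, 50, 35, 10, 1]
print([S(n, k) for k in range(n + 1)])  # [0, 1, 15, 25, 10, 1]
```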

You are welcome to attend

 

By Gil Kalai

Friday, May 17

Postdoc at King’s College London (apply by June 6, 2024)

from CCI: jobs

A postdoctoral research position is available at King’s College London. The successful candidate will be hosted by Hubie Chen and will be expected to work on topics related to the themes of complexity, database theory, structural decomposition methods, and logic. Informal enquiries and discussion are strongly encouraged prior to application; please send a CV when […]

A postdoctoral research position is available at King’s College London. The successful candidate will be hosted by Hubie Chen and will be expected to work on topics related to the themes of complexity, database theory, structural decomposition methods, and logic.

Informal enquiries and discussion are strongly encouraged prior to application; please send a CV when initiating correspondence.

Website: https://hubie-chen.github.io/postdoc2024.html
Email: hubie.chen@kcl.ac.uk

By shacharlovett

TR24-093 | Low-Degree Polynomials Are Good Extractors | Jesse Goodman, Omar Alrabiah, Joao Ribeiro, Jonathan Mosheiff

from ECCC Papers

We prove that random low-degree polynomials (over $\mathbb{F}_2$) are unbiased, in an extremely general sense. That is, we show that random low-degree polynomials are good randomness extractors for a wide class of distributions. Prior to our work, such results were only known for the small families of (1) uniform sources, (2) affine sources, and (3) local sources. We significantly generalize these results, and prove the following. 1. Low-degree polynomials extract from small families. We show that a random low-degree polynomial is a good low-error extractor for any small family of sources. In particular, we improve the positive result of Alrabiah, Chattopadhyay, Goodman, Li, and Ribeiro (ICALP 2022) for local sources, and give new results for polynomial sources and variety sources via a single unified approach. 2. Low-degree polynomials extract from sumset sources. We show that a random low-degree polynomial is a good extractor for sumset sources, which are the most general large family of sources (capturing independent sources, interleaved sources, small-space sources, and more). This extractor achieves polynomially small error, and its min-entropy requirement is tight up to a square. Our results on sumset extractors imply new complexity separations for linear ROBPs, and the tools that go into its proof have further applications, as well. The two main tools we use are a new structural result on sumset-punctured Reed-Muller codes, paired with a novel type of reduction between randomness extractors. Using the first new tool, we strengthen and generalize the extractor impossibility results of Chattopadhyay, Goodman, and Gurumukhani (ITCS 2024). Using the second, we show the existence of sumset extractors for min-entropy $k=O(\log(n/\varepsilon))$, resolving an open problem of Chattopadhyay and Liao (STOC 2022).
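As a toy numerical illustration of the objects involved (in no way reflecting the paper's proofs), one can sample a random degree-2 polynomial over $\mathbb{F}_2$ and estimate its bias on the uniform source versus a low-dimensional affine source; the parameters below are arbitrary:

```python
# Toy illustration (not from the paper): estimate the bias of a random
# degree-2 polynomial over F_2 on the uniform source vs. a low-dimensional
# affine source, where bias(f, X) = |E[(-1)^{f(X)}]|.
import itertools
import numpy as np

rng = np.random.default_rng(1)
n, samples = 12, 20000

# Random degree-<=2 polynomial over F_2: constant + linear + quadratic terms.
const = rng.integers(0, 2)
lin = rng.integers(0, 2, size=n)
quad = {(i, j): rng.integers(0, 2) for i, j in itertools.combinations(range(n), 2)}

def f(X):                                  # X: (m, n) 0/1 matrix
    out = (const + X @ lin) % 2
    for (i, j), coeff in quad.items():
        if coeff:
            out = (out + X[:, i] * X[:, j]) % 2
    return out

def bias(X):
    return abs(np.mean((-1.0) ** f(X)))

uniform = rng.integers(0, 2, size=(samples, n))
# Affine source of dimension 4: x = z A + b (mod 2) for uniform z in F_2^4.
A, b = rng.integers(0, 2, size=(4, n)), rng.integers(0, 2, size=n)
affine = (rng.integers(0, 2, size=(samples, 4)) @ A + b) % 2

print(f"bias on uniform source: {bias(uniform):.3f}")
print(f"bias on 4-dim affine source: {bias(affine):.3f}")
```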


TR24-092 | Hilbert Functions and Low-Degree Randomness Extractors | Alexander Golovnev, Zeyu Guo, Pooya Hatami, Satyajeet Nagargoje, Chao Yan

from ECCC Papers

For $S\subseteq \mathbb{F}^n$, consider the linear space of restrictions of degree-$d$ polynomials to $S$. The Hilbert function of $S$, denoted $\mathrm{h}_S(d,\mathbb{F})$, is the dimension of this space. We obtain a tight lower bound on the smallest value of the Hilbert function of subsets $S$ of arbitrary finite grids in $\mathbb{F}^n$ with a fixed size $|S|$. We achieve this by proving that this value coincides with a combinatorial quantity, namely the smallest number of low Hamming weight points in a down-closed set of size $|S|$. Understanding the smallest values of Hilbert functions is closely related to the study of degree-$d$ closure of sets, a notion introduced by Nie and Wang (Journal of Combinatorial Theory, Series A, 2015). We use bounds on the Hilbert function to obtain a tight bound on the size of degree-$d$ closures of subsets of $\mathbb{F}_q^n$, which answers a question posed by Doron, Ta-Shma, and Tell (Computational Complexity, 2022). We use the bounds on the Hilbert function and degree-$d$ closure of sets to prove that a random low-degree polynomial is an extractor for samplable randomness sources. Most notably, we prove the existence of low-degree extractors and dispersers for sources generated by constant degree polynomials and polynomial-size circuits. Until recently, even the existence of arbitrary deterministic extractors for such sources was not known.
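The Hilbert function as defined above can be computed directly for tiny parameters: it is the $\mathbb{F}_2$-rank of the matrix of evaluations of the multilinear monomials of degree at most $d$ at the points of $S$. A brute-force sketch (the example set $S$ is arbitrary):

```python
# Brute-force sketch: h_S(d, F_2) as defined above is the dimension of the
# space of restrictions to S of degree-<=d polynomials, i.e. the F_2-rank of
# the matrix whose rows index points of S and whose columns are evaluations
# of the multilinear monomials of degree <= d.
import itertools

def rank_gf2(rows):
    rows = [r[:] for r in rows]
    rank, ncols = 0, len(rows[0]) if rows else 0
    for col in range(ncols):
        pivot = next((i for i in range(rank, len(rows)) if rows[i][col]), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        for i in range(len(rows)):
            if i != rank and rows[i][col]:
                rows[i] = [a ^ b for a, b in zip(rows[i], rows[rank])]
        rank += 1
    return rank

def hilbert_function(S, n, d):
    monomials = [m for k in range(d + 1) for m in itertools.combinations(range(n), k)]
    matrix = [[int(all(x[i] for i in m)) for m in monomials] for x in S]
    return rank_gf2(matrix)

# Example over F_2^3: a 4-point set and d = 0, 1, 2.
S = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (1, 1, 1)]
print([hilbert_function(S, 3, d) for d in range(3)])   # [1, 4, 4]
```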


Low-Degree Polynomials Are Good Extractors

from arXiv: Computational Complexity

Authors: Omar Alrabiah, Jesse Goodman, Jonathan Mosheiff, João Ribeiro

We prove that random low-degree polynomials (over $\mathbb{F}_2$) are unbiased, in an extremely general sense. That is, we show that random low-degree polynomials are good randomness extractors for a wide class of distributions. Prior to our work, such results were only known for the small families of (1) uniform sources, (2) affine sources, and (3) local sources. We significantly generalize these results, and prove the following. 1. Low-degree polynomials extract from small families. We show that a random low-degree polynomial is a good low-error extractor for any small family of sources. In particular, we improve the positive result of Alrabiah, Chattopadhyay, Goodman, Li, and Ribeiro (ICALP 2022) for local sources, and give new results for polynomial sources and variety sources via a single unified approach. 2. Low-degree polynomials extract from sumset sources. We show that a random low-degree polynomial is a good extractor for sumset sources, which are the most general large family of sources (capturing independent sources, interleaved sources, small-space sources, and more). This extractor achieves polynomially small error, and its min-entropy requirement is tight up to a square. Our results on sumset extractors imply new complexity separations for linear ROBPs, and the tools that go into its proof have further applications, as well. The two main tools we use are a new structural result on sumset-punctured Reed-Muller codes, paired with a novel type of reduction between randomness extractors. Using the first new tool, we strengthen and generalize the extractor impossibility results of Chattopadhyay, Goodman, and Gurumukhani (ITCS 2024). Using the second, we show the existence of sumset extractors for min-entropy $k=O(\log(n/\varepsilon))$, resolving an open problem of Chattopadhyay and Liao (STOC 2022).


Hilbert Functions and Low-Degree Randomness Extractors

from arXiv: Computational Complexity

Authors: Alexander Golovnev, Zeyu Guo, Pooya Hatami, Satyajeet Nagargoje, Chao Yan

For $S\subseteq \mathbb{F}^n$, consider the linear space of restrictions of degree-$d$ polynomials to $S$. The Hilbert function of $S$, denoted $\mathrm{h}_S(d,\mathbb{F})$, is the dimension of this space. We obtain a tight lower bound on the smallest value of the Hilbert function of subsets $S$ of arbitrary finite grids in $\mathbb{F}^n$ with a fixed size $|S|$. We achieve this by proving that this value coincides with a combinatorial quantity, namely the smallest number of low Hamming weight points in a down-closed set of size $|S|$. Understanding the smallest values of Hilbert functions is closely related to the study of degree-$d$ closure of sets, a notion introduced by Nie and Wang (Journal of Combinatorial Theory, Series A, 2015). We use bounds on the Hilbert function to obtain a tight bound on the size of degree-$d$ closures of subsets of $\mathbb{F}_q^n$, which answers a question posed by Doron, Ta-Shma, and Tell (Computational Complexity, 2022). We use the bounds on the Hilbert function and degree-$d$ closure of sets to prove that a random low-degree polynomial is an extractor for samplable randomness sources. Most notably, we prove the existence of low-degree extractors and dispersers for sources generated by constant-degree polynomials and polynomial-size circuits. Until recently, even the existence of arbitrary deterministic extractors for such sources was not known.


Voronoi Graph -- Improved raycasting and integration schemes for high dimensional Voronoi diagrams

from arXiv: Computational Geometry

Authors: Alexander Sikorski, Martin Heida

The computation of Voronoi diagrams, or their dual Delaunay triangulations, is difficult in high dimensions. In a recent publication Polianskii and Pokorny propose an iterative randomized algorithm facilitating the approximation of Voronoi tessellations in high dimensions. In this paper, we provide an improved vertex search method that is not only exact but even faster than the bisection method that was previously recommended. Building on this we also provide a depth-first graph-traversal algorithm which allows us to compute the entire Voronoi diagram. This enables us to compare the outcomes with those of classical algorithms like qHull, which we either match or marginally beat in terms of computation time. We furthermore show how the raycasting algorithm naturally lends itself to a Monte Carlo approximation for the volume and boundary integrals of the Voronoi cells, both of which are of importance for finite volume methods. We compare the Monte Carlo methods to the exact polygonal integration, as well as a hybrid approximation scheme.
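For intuition, the basic Monte Carlo idea for cell volumes (not the authors' raycasting-based estimator) can be sketched in a few lines; the sites and sample counts below are arbitrary:

```python
# Generic Monte Carlo estimate of Voronoi cell volume fractions within the
# unit box (not the raycasting-based scheme of the paper, just the basic
# idea of estimating cell volumes by random sampling).
import numpy as np

rng = np.random.default_rng(2)
dim, n_sites, n_samples = 6, 5, 200_000

sites = rng.random((n_sites, dim))
points = rng.random((n_samples, dim))

# Assign every sample point to its nearest site (squared Euclidean distance).
d2 = ((points[:, None, :] - sites[None, :, :]) ** 2).sum(axis=2)
nearest = d2.argmin(axis=1)

volumes = np.bincount(nearest, minlength=n_sites) / n_samples
print("estimated cell volume fractions:", np.round(volumes, 3))
print("sum (should be ~1):", volumes.sum())
```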


Rounding Large Independent Sets on Expanders

from arXiv: Data Structures and Algorithms

Authors: Mitali Bafna, Jun-Ting Hsieh, Pravesh K. Kothari

We develop a new approach for approximating large independent sets when the input graph is a one-sided spectral expander - that is, the uniform random walk matrix of the graph has the second eigenvalue bounded away from 1. Consequently, we obtain a polynomial time algorithm to find linear-sized independent sets in one-sided expanders that are almost $3$-colorable or are promised to contain an independent set of size $(1/2-\epsilon)n$. Our second result above can be refined to require only a weaker vertex expansion property with an efficient certificate. Somewhat surprisingly, we observe that the analogous task of finding a linear-sized independent set in almost $4$-colorable one-sided expanders (even when the second eigenvalue is $o_n(1)$) is NP-hard, assuming the Unique Games Conjecture. All prior algorithms that beat the worst-case guarantees for this problem rely on bottom eigenspace enumeration techniques (following the classical spectral methods of Alon and Kahale) and require two-sided expansion, meaning a bounded number of negative eigenvalues of magnitude $\Omega(1)$. Such techniques naturally extend to almost $k$-colorable graphs for any constant $k$, in contrast to analogous guarantees on one-sided expanders, which are Unique Games-hard to achieve for $k \geq 4$. Our rounding builds on the method of simulating multiple samples from a pseudodistribution introduced by Barak et. al. for rounding Unique Games instances. The key to our analysis is a new clustering property of large independent sets in expanding graphs - every large independent set has a larger-than-expected intersection with some member of a small list - and its formalization in the low-degree sum-of-squares proof system.


Adaptive Quotient Filters

from arXiv: Data Structures and Algorithms

Authors: Richard Wen, Hunter McCoy, David Tench, Guido Tagliavini, Michael A. Bender, Alex Conway, Martin Farach-Colton, Rob Johnson, Prashant Pandey

Adaptive filters, such as telescoping and adaptive cuckoo filters, update their representation upon detecting a false positive to avoid repeating the same error in the future. Adaptive filters require an auxiliary structure, typically much larger than the main filter and often residing on slow storage, to facilitate adaptation. However, existing adaptive filters are not practical and have seen no adoption in real-world systems due to two main reasons. Firstly, they offer weak adaptivity guarantees, meaning that fixing a new false positive can cause a previously fixed false positive to come back. Secondly, the sub-optimal design of the auxiliary structure results in adaptivity overheads so substantial that they can actually diminish the overall system performance compared to a traditional filter. In this paper, we design and implement AdaptiveQF, the first practical adaptive filter with minimal adaptivity overhead and strong adaptivity guarantees, which means that the performance and false-positive guarantees continue to hold even for adversarial workloads. The AdaptiveQF is based on the state-of-the-art quotient filter design and preserves all the critical features of the quotient filter such as cache efficiency and mergeability. Furthermore, we employ a new auxiliary structure design which results in considerably low adaptivity overhead and makes the AdaptiveQF practical in real systems.


Near Uniform Triangle Sampling Over Adjacency List Graph Streams

from arXiv: Data Structures and Algorithms

Authors: Arijit Bishnu, Arijit Ghosh, Gopinath Mishra, Sayantan Sen

Triangle counting and sampling are two fundamental problems for streaming algorithms. Arguably, designing sampling algorithms is more challenging than their counting variants. It may be noted that triangle counting has received far greater attention in the literature than the sampling variant. In this work, we consider the problem of approximately sampling triangles in different models of streaming with the focus being on the adjacency list model. In this problem, the edges of a graph $G$ will arrive over a data stream. The goal is to design efficient streaming algorithms that can sample and output a triangle from a distribution, over the triangles in $G$, that is close to the uniform distribution over the triangles in $G$. The distance between distributions is measured in terms of $\ell_1$-distance. The main technical contribution of this paper is to design algorithms for this triangle sampling problem in the adjacency list model with the space complexities matching their counting variants. For the sake of completeness, we also show results on the vertex and edge arrival models.
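To fix the target distribution, here is the naive in-memory baseline: enumerate all triangles and pick one uniformly. The paper's algorithms achieve this approximately over a stream with small space, which this baseline does not attempt.

```python
# Naive in-memory baseline: enumerate all triangles and return one uniformly
# at random. (Only meant to pin down the target distribution; the streaming
# algorithms in the paper work with far less space.)
import random
from collections import defaultdict

def sample_triangle(edges, rng=random):
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    triangles = [(u, v, w)
                 for u in adj for v in adj[u] if u < v
                 for w in adj[u] & adj[v] if v < w]
    return rng.choice(triangles) if triangles else None

edges = [(1, 2), (2, 3), (1, 3), (3, 4), (2, 4), (1, 5)]
print(sample_triangle(edges))   # uniform over {(1,2,3), (2,3,4)}
```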

Distributed Delta-Coloring under Bandwidth Limitations

from arXiv: Data Structures and Algorithms

Authors: Yannic Maus, Magnús M. Halldórsson

We consider the problem of coloring graphs of maximum degree $\Delta$ with $\Delta$ colors in the distributed setting with limited bandwidth. Specifically, we give a $\mathsf{poly}\log\log n$-round randomized algorithm in the CONGEST model. This is close to the lower bound of $\Omega(\log \log n)$ rounds from [Brandt et al., STOC '16], which holds also in the more powerful LOCAL model. The core of our algorithm is a reduction to several special instances of the constructive Lov\'asz local lemma (LLL) and the $deg+1$-list coloring problem.
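
As background for the $deg+1$-list coloring subroutine mentioned above, here is the easy sequential argument in Python: if every vertex has a list of at least $\deg(v)+1$ colors, a single greedy pass always succeeds, because a vertex's at most $\deg(v)$ colored neighbors cannot block its whole list. The distributed difficulty is doing this in few CONGEST rounds, not the existence of the coloring.

```python
def greedy_list_coloring(adj, lists):
    """Sequential (deg+1)-list coloring: adj[v] is the set of neighbors of v,
    lists[v] is v's color list with len(lists[v]) >= deg(v) + 1.
    Returns a proper coloring as a dict vertex -> color."""
    color = {}
    for v in adj:
        blocked = {color[u] for u in adj[v] if u in color}   # at most deg(v) colors
        color[v] = next(c for c in lists[v] if c not in blocked)
    return color

# Example: a 4-cycle where every vertex has degree 2 and a list of 3 colors.
adj = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}
lists = {v: [0, 1, 2] for v in adj}
coloring = greedy_list_coloring(adj, lists)
assert all(coloring[u] != coloring[v] for u in adj for v in adj[u])
```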

Dynamic online matching with budget refills

from arXiv: Data Structures and Algorithms

Authors: Maria Cherifa, Clément Calauzènes, Vianney Perchet

Inspired by sequential budgeted allocation problems, we study the online matching problem with budget refills. In this context, we consider an online bipartite graph G=(U,V,E), where the nodes in $V$ are discovered sequentially and nodes in $U$ are known beforehand. Each $u\in U$ is endowed with a budget $b_{u,t}\in \mathbb{N}$ that dynamically evolves over time. Unlike the canonical setting, in many applications, the budget can be refilled from time to time, which leads to a much richer dynamic that we consider here. Intuitively, adding extra budgets in $U$ seems to ease the matching task, and our results support this intuition. In fact, in the stochastic framework, where we study the matching size built by the Greedy algorithm on an Erd\H{o}s-R\'enyi random graph, we show that the matching size generated by Greedy converges with high probability to the solution of an explicit system of ODEs. Moreover, under specific conditions, the competitive ratio (performance measure of the algorithm) can even tend to 1. For the adversarial part, where the graph considered is deterministic and the algorithm used is Balance, the $b$-matching bound holds when the refills are scarce. However, when refills are regular, our results suggest a potential improvement in algorithm performance. In both cases, the Balance algorithm manages to reach the performance of the upper bound on the adversarial graphs considered.
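
A minimal sketch of the Greedy policy in the budgeted setting described above, in Python: offline nodes carry budgets that may be refilled between arrivals, and each arriving online node is matched to any neighbor with remaining budget. The refill schedule, the arrival format, and the tie-breaking rule here are illustrative choices, not the paper's exact model.

```python
def greedy_with_refills(offline_budgets, arrivals, refills):
    """offline_budgets: dict u -> initial budget b_u.
    arrivals: list of (v, neighbors_of_v) for online nodes, in arrival order.
    refills: dict t -> {u: extra budget granted just before arrival t}.
    Returns the list of matched pairs (u, v)."""
    budget = dict(offline_budgets)
    matched = []
    for t, (v, neighbors) in enumerate(arrivals):
        for u, extra in refills.get(t, {}).items():   # budgets refilled over time
            budget[u] = budget.get(u, 0) + extra
        # Greedy: match v to the first neighbor that still has budget left.
        for u in neighbors:
            if budget.get(u, 0) > 0:
                budget[u] -= 1
                matched.append((u, v))
                break
    return matched

# Example: u0 starts with budget 1 and is refilled before the third arrival,
# so it can serve two online nodes overall.
pairs = greedy_with_refills(
    offline_budgets={"u0": 1, "u1": 1},
    arrivals=[("v0", ["u0"]), ("v1", ["u0", "u1"]), ("v2", ["u0"])],
    refills={2: {"u0": 1}},
)
print(pairs)   # [('u0', 'v0'), ('u1', 'v1'), ('u0', 'v2')]
```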

Risk-Sensitive Online Algorithms

from arXiv: Data Structures and Algorithms

Authors: Nicolas Christianson, Bo Sun, Steven Low, Adam Wierman

We initiate the study of risk-sensitive online algorithms, in which risk measures are used in the competitive analysis of randomized online algorithms. We introduce the CVaR$_\delta$-competitive ratio ($\delta$-CR) using the conditional value-at-risk of an algorithm's cost, which measures the expectation of the $(1-\delta)$-fraction of worst outcomes against the offline optimal cost, and use this measure to study three online optimization problems: continuous-time ski rental, discrete-time ski rental, and one-max search. The structure of the optimal $\delta$-CR and algorithm varies significantly between problems: we prove that the optimal $\delta$-CR for continuous-time ski rental is $2-2^{-\Theta(\frac{1}{1-\delta})}$, obtained by an algorithm described by a delay differential equation. In contrast, in discrete-time ski rental with buying cost $B$, there is an abrupt phase transition at $\delta = 1 - \Theta(\frac{1}{\log B})$, after which the classic deterministic strategy is optimal. Similarly, one-max search exhibits a phase transition at $\delta = \frac{1}{2}$, after which the classic deterministic strategy is optimal; we also obtain an algorithm that is asymptotically optimal as $\delta \downarrow 0$ that arises as the solution to a delay differential equation.
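
To make the performance measure concrete, here is a short Python sketch (an assumption-laden toy, not the paper's algorithms) that estimates the empirical CVaR$_\delta$ of the cost ratio of a simple randomized ski-rental strategy against the offline optimum.

```python
import random

def cvar(samples, delta):
    """Empirical CVaR_delta: the mean of the worst (1 - delta) fraction of the
    samples (here, larger = worse, matching cost ratios)."""
    samples = sorted(samples, reverse=True)
    k = max(1, int(round((1 - delta) * len(samples))))
    return sum(samples[:k]) / k

def ski_rental_ratio(buy_day, horizon, B):
    """Cost ratio of 'rent until buy_day, then buy' on a season lasting
    `horizon` days, with purchase cost B, against the offline optimum."""
    alg = horizon if horizon < buy_day else (buy_day - 1) + B
    opt = min(horizon, B)
    return alg / opt

# A toy randomized strategy: buy on a uniformly random day in {1, ..., B}.
# (This is NOT the optimal randomized or risk-sensitive strategy; it only
# illustrates how expectation and CVaR_delta summarize the same cost distribution.)
B, horizon, trials = 10, 10_000, 20_000
ratios = [ski_rental_ratio(random.randint(1, B), horizon, B) for _ in range(trials)]
print("expected ratio:", sum(ratios) / len(ratios))
print("CVaR_0.9 ratio:", cvar(ratios, 0.9))   # focuses on the worst 10% of runs
```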

Thursday, May 16

Trench Warfare

from Ben Recht

Meehl's Philosophical Psychology, Lecture 5, Part 2.

Since I spread the argument out over the last post, let me tidily summarize Meehl’s metatheory of Lakatosian Defense. An experimental outcome is deduced from a collection of scientific assertions in a derivation chain. The derivation chain uses clauses from the core theory (TC), assorted auxiliary theories from the discipline (AT), auxiliary theories about the instruments (AI), a ceteris paribus clause (CP), and a variety of experimental conditions (CN). From the logical conjunction of all of these clauses, we logically deduce the prediction “If I observe O1, then I will observe O2” (O1⊃O2). We choose experiments such that, with the background information we have, the probability of O2 conditioned on O1 is small. Together, we get the expression:
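
Reconstructed from the definitions above (the display itself is missing from this text), the expression reads, up to notation:

$(T_C \wedge A_T \wedge A_I \wedge C_P \wedge C_N) \;\vdash\; (O_1 \supset O_2), \qquad p_E = \Pr(O_2 \mid O_1) \text{ on background knowledge alone, chosen to be small.}$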

If we observe O1 and O2 in our experiment, the low probability statement corroborates our theory. The smaller pE, the more the theory is corroborated. If we observe O1 but not O2, our derivation chain is falsified and we look for something to blame, moving from right to left.

Today, I want to cast the examples I’ve listed so far in this series as stages of Lakatosian defense to illustrate why this framework elucidates the role of falsifiers in scientific development. In the next post, I’ll discuss corroboration.

Experimental conditions

Meehl’s description of the controversies in latent learning shows how arguments about experimental conditions alone can keep scientists busy for years. He described squabbles about the right way to manipulate rats before setting them loose in a maze. It mattered how you carried the rat from the cage to the maze. If you held the rat by the tail, it would be less likely to cooperate than if you had let it gently sit on your forearm while petting it. If you gave the rat a little food in advance, it might perform better than if it went in hungry. Every tiny adjustment mattered. Whichever side of the latent learning debate you were on, you could criticize the other side by arguing about CN.

Wars about experimental conditions, as Meehl says, are usually focused on replication. It is replication in the narrowest sense here: 

“You are not denying the facts. You’re denying that something is a fact.” 

You attack CN by asking if someone can create the same conditions that a paper reports on and see the same outcome. A more general sort of replication, seeing if a result translates to slightly different contexts, takes us one step up the Lakatosian Defense Hierarchy.

Ceteris Paribus

Ceteris paribus clauses are so general that fields can spend decades fighting about them. Here, we ask if something outside of the derivation chain is responsible for the experimental outcome. Ceteris paribus clauses assert that we assume the derivation chain is true “everything else being equal.” But of course, such a bold statement is never literally true. We can’t really control everything in an experiment. The question is always whether we have controlled things enough.

Most of the arguments about “DAGs” in causal inference are attacks on CP. If an unspecified confounding variable causes both O1 and O2, then the experimental association is spurious. The confounder violates CP. Arguments about selection biases (like Berkson’s paradox) are also attacking ceteris paribus, as they argue the association only holds on the very specific population collected for the particular experiment.

But the ceteris paribus clause is even more general than this. CP is where we store all of our idealizations. You assert that certain things you know to be true are not relevant to the outcome under the prescribed experimental conditions. Since you know these idealizations might be false, you can use CP to your advantage in a Lakatosian defense. You can try to explain the outcome away by adding an additional auxiliary theory up the derivation chain to fix the CP violation. Perhaps there was a matter of fact about the universe you weren’t aware of in your initial experiment, but the observation is explained once you add the fact to your derivation chain. This was the case with the discovery of Neptune. Or, perhaps there were idealizations about facts you knew were present but thought had low influence. Incorporating these facts might fix up the experimental correspondence. This was what happened in the kinetic theory of gases with van der Waals corrections. In both of these, we’re modifying the “these facts don’t matter” clauses, CP, to transform a falsifying experiment into a corroborating experiment. But to do so, we had to bloat up the auxiliary theory clause with more facts and parameters.

Instrumental auxiliaries

Meehl emphasizes that in his characterization, instrumental auxiliaries are only those outside of the discipline. But where you draw disciplinary boundaries can be tricky. In the example of Eddington fudging his analysis of light deflection, are telescopes inside or outside the theory of astrophysics? I might argue that the telescopes and photographic plates are governed by terrestrial physical concerns, not grand theories of cosmology. 

What about Dayton Miller’s observations of aether drift? While many people questioned the functionality of the Michelson-Morley interferometer, Miller’s apparatus was harder to attack because Miller was such a careful experimentalist. In the end, the results were explained away as thermal artifacts. Was this a violation of ceteris paribus or a problem with the instrument? I suppose we could say it’s both.

I bring this up because I want to talk about software and statistics, which messily infect all of the clauses in the Lakatosian Defense. I’ll say more about this in a future post.

Theoretical auxiliaries

The final stand of a Lakatosian Defense attacks the theoretical auxiliaries of a core theory. As I mentioned, adding auxiliary theories is a common part of Lakatosian defense. We let ourselves explain facts by adding conditional characterizations of when certain approximations are valid. But removing auxiliary theories—declaring them false—is much more rare. In fact, I’m hard-pressed to find good examples, though I’m probably just not thinking hard enough as I write. For what it’s worth, Meehl doesn’t give any clean examples of experiments messing with theoretical auxiliaries directly. If you have any fun examples of attacking auxiliary theories, tell me in the comments!

The reason it’s hard to come by attacks on auxiliaries is that removing them messes up all of your past results. Removing an auxiliary will not only invalidate a bunch of past derivations, but it will also cause your theory to disagree with past experiments. You’ll have to explain that away, too. Meehl argues that scientists will trade some contradictions with the past for a bunch of new damned strange coincidences in the future, but he doesn’t give any examples. I’ll try to pin this all down in Lecture 6 when discussing Lakatos’ notion of the “Hard Core” and “Protective Belt” of a theory. But before we get there, I want to get into the second part of Lakatosian Defense, describing what happens when experiments corroborate your theory.


By Ben Recht

Accepted papers for GandALF 2024 and SLSS 2024

from Luca Aceto

The list of papers that were selected for presentation at GandALF 2024 is available at https://scool24.github.io/GandALF/. I am looking forward to listening to the presentations based on those articles and to the four invited talks by Bernd Finkbeiner (CISPA Helmholtz Center for Information Security), Kim Guldstrand Larsen (Aalborg University), Brigitte Pientka (McGill University) and Azalea Raad (Imperial College London). 


On behalf of the GandALF SC, I thank the GandALF 2024 PC co-chairs, Antonis Achilleos and Adrian Francalanza, and their PC for the efficient PC work. 

 In case you missed it, the list of selected contributions and invited talks at the co-located Twelfth Scandinavian Logic Symposium (SLSS 2024) is at https://scool24.github.io/SLSS/. SLSS will be held on 14-16 June and GandALF on 19-21 June. As part of the Reykjavik Summer of Cool Logic 2024 (SCooL 2024), we will also host the Fifth Nordic Logic Summer School (NLS 2024) on 10-13 June. 

Thanks to all my colleagues at ICE-TCS, Department of Computer Science at Reykjavik University, who are working very hard on the organisation of these three events back to back. 

I hope to see good participation at those events.

By Luca Aceto

Recurrence solution of monomer-polymer models on two-dimensional rectangular lattices

from arXiv: Computational Complexity

Authors: Yong Kong

The problem of counting polymer coverings on rectangular lattices is investigated. In this model, a linear rigid polymer covers $k$ adjacent lattice sites such that no two polymers occupy a common site. The unoccupied lattice sites are considered monomers. We prove that for a given number of polymers ($k$-mers), the number of arrangements for the polymers on two-dimensional rectangular lattices satisfies simple recurrence relations. These recurrence relations are quite general and apply for arbitrary polymer length ($k$) and lattice width ($n$). The well-studied monomer-dimer problem is a special case of the monomer-polymer model when $k=2$. It is known that the enumeration of monomer-dimer configurations in planar lattices is #P-complete. The recurrence relations shown here may offer hints toward the solution of long-standing problems in this class of computational complexity.
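
For intuition about what such recurrences look like, here is a Python sketch of the one-dimensional warm-up (a $1 \times n$ strip rather than the paper's two-dimensional lattices): the number of ways to place exactly $j$ non-overlapping $k$-mers, with the remaining sites left as monomers, satisfies $f(n, j) = f(n-1, j) + f(n-k, j-1)$.

```python
from functools import lru_cache
from math import comb

def count_placements(n: int, j: int, k: int) -> int:
    """Number of ways to place exactly j non-overlapping k-mers on a 1 x n
    strip (remaining sites are monomers). One-dimensional warm-up only; the
    paper derives analogous recurrences for 2D rectangular lattices."""
    @lru_cache(maxsize=None)
    def f(n, j):
        if j == 0:
            return 1
        if n < k * j:          # not enough room for j polymers of length k
            return 0
        # The last site is either a monomer, or the last cell of a k-mer.
        return f(n - 1, j) + f(n - k, j - 1)
    return f(n, j)

# Example: one dimer (k = 2) on a 1 x 3 strip fits in exactly 2 positions.
print(count_placements(3, 1, 2))                          # 2
# The 1D closed form is C(n - (k-1)*j, j); spot-check a larger instance.
assert count_placements(10, 3, 3) == comb(10 - 2 * 3, 3)  # 4
```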

Lens functions for exploring UMAP Projections with Domain Knowledge

from arXiv: Computational Geometry

Authors: Daniel M. Bot, Jan Aerts

Dimensionality reduction algorithms are often used to visualise high-dimensional data. Previously, studies have used prior information to enhance or suppress expected patterns in projections. In this paper, we adapt such techniques for domain knowledge guided interactive exploration. Inspired by Mapper and STAD, we present three types of lens functions for UMAP, a state-of-the-art dimensionality reduction algorithm. Lens functions enable analysts to adapt projections to their questions, revealing otherwise hidden patterns. They filter the modelled connectivity to explore the interaction between manually selected features and the data's structure, creating configurable perspectives each potentially revealing new insights. The effectiveness of the lens functions is demonstrated in two use cases and their computational cost is analysed in a synthetic benchmark. Our implementation is available in an open-source Python package: github.com/vda-lab/lensed_umap.

Symmetric-Difference (Degeneracy) and Signed Tree Models

from arXiv: Data Structures and Algorithms

Authors: Édouard Bonnet, Julien Duron, John Sylvester, Viktor Zamaraev

We introduce a dense counterpart of graph degeneracy, which extends the recently-proposed invariant symmetric difference. We say that a graph has sd-degeneracy (for symmetric-difference degeneracy) at most $d$ if it admits an elimination order of its vertices where a vertex $u$ can be removed whenever it has a $d$-twin, i.e., another vertex $v$ such that at most $d$ vertices outside $\{u,v\}$ are neighbors of exactly one of $u, v$. The family of graph classes of bounded sd-degeneracy is a superset of that of graph classes of bounded degeneracy or of bounded flip-width, and more generally, of bounded symmetric difference. Unlike most graph parameters, sd-degeneracy is not hereditary: it may be strictly smaller on a graph than on some of its induced subgraphs. In particular, every $n$-vertex graph is an induced subgraph of some $O(n^2)$-vertex graph of sd-degeneracy 1. In spite of this and the breadth of classes of bounded sd-degeneracy, we devise $\tilde{O}(\sqrt n)$-bit adjacency labeling schemes for them, which are optimal up to the hidden polylogarithmic factor. This is attained on some even more general classes, consisting of graphs $G$ whose vertices bijectively map to the leaves of a tree $T$, where transversal edges and anti-edges added to $T$ define the edge set of $G$. We call such graph representations signed tree models as they extend the so-called tree models (or twin-decompositions) developed in the context of twin-width, by adding transversal anti-edges. While computing the degeneracy of an input graph can be done in linear time, we show that deciding whether its symmetric difference is at most 8 is co-NP-complete, and whether its sd-degeneracy is at most 1 is NP-complete.
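
A direct Python transcription of the $d$-twin test underlying the definition above; it only checks the condition for one pair of vertices and says nothing about computing the sd-degeneracy itself (which, as stated, is NP-hard to decide even for value 1).

```python
def is_d_twin(adj, u, v, d):
    """True if u and v are d-twins in the graph given by adjacency sets adj:
    at most d vertices outside {u, v} are neighbors of exactly one of u, v."""
    diff = (adj[u] ^ adj[v]) - {u, v}    # symmetric difference, u and v excluded
    return len(diff) <= d

# Example: in the path a - b - c - d, the endpoints a and d are 2-twins
# (their neighborhoods differ in {b, c}) but not 1-twins.
adj = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}
assert is_d_twin(adj, "a", "d", 2) and not is_d_twin(adj, "a", "d", 1)
```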

A Primal-Dual Framework for Symmetric Cone Programming

from arXiv: Data Structures and Algorithms

Authors: Jiaqi Zheng, Antonios Varvitsiotis, Tiow-Seng Tan, Wayne Lin

In this paper, we introduce a primal-dual algorithmic framework for solving Symmetric Cone Programs (SCPs), a versatile optimization model that unifies and extends Linear, Second-Order Cone (SOCP), and Semidefinite Programming (SDP). Our work generalizes the primal-dual framework for SDPs introduced by Arora and Kale, leveraging a recent extension of the Multiplicative Weights Update method (MWU) to symmetric cones. Going beyond existing works, our framework can handle SOCPs and mixed SCPs, exhibits nearly linear time complexity, and can be effectively parallelized. To illustrate the efficacy of our framework, we employ it to develop approximation algorithms for two geometric optimization problems: the Smallest Enclosing Sphere problem and the Support Vector Machine problem. Our theoretical analyses demonstrate that the two algorithms compute approximate solutions in nearly linear running time and with parallel depth scaling polylogarithmically with the input size. We compare our algorithms against CGAL as well as interior point solvers applied to these problems. Experiments show that our algorithms are highly efficient when implemented on a CPU and achieve substantial speedups when parallelized on a GPU, allowing us to solve large-scale instances of these problems.
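
The primitive being generalized is the Multiplicative Weights Update method; below is the textbook version over the probability simplex in Python. The symmetric-cone extension used in the paper replaces the weight vector by an element of a symmetric cone and the exponential by its matrix/Jordan-algebra analogue; this sketch only shows the classical special case.

```python
import numpy as np

def mwu(loss_fn, n, eta, rounds):
    """Textbook Multiplicative Weights Update over the probability simplex.
    loss_fn(p) must return a loss vector in [-1, 1]^n for the played
    distribution p. With eta ~ sqrt(log(n) / rounds), the average regret
    against the best fixed coordinate is O(sqrt(log(n) / rounds))."""
    weights = np.ones(n)
    played = []
    for _ in range(rounds):
        p = weights / weights.sum()
        played.append(p)
        losses = loss_fn(p)
        weights *= np.exp(-eta * losses)   # multiplicative update
    return np.mean(played, axis=0)

# Example: losses that penalize every coordinate except the last one.
n = 4
avg = mwu(lambda p: np.array([1.0, 1.0, 1.0, 0.0]), n, eta=0.3, rounds=200)
print(avg)   # mass concentrates on the last coordinate
```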

Improved classical shadows from local symmetries in the Schur basis

from arXiv: Data Structures and Algorithms

Authors: Daniel Grier, Sihan Liu, Gaurav Mahajan

We study the sample complexity of the classical shadows task: what is the fewest number of copies of an unknown state you need to measure to predict expected values with respect to some class of observables? Large joint measurements are likely required in order to minimize sample complexity, but previous joint measurement protocols only work when the unknown state is pure. We present the first joint measurement protocol for classical shadows whose sample complexity scales with the rank of the unknown state. In particular we prove $\mathcal O(\sqrt{rB}/\epsilon^2)$ samples suffice, where $r$ is the rank of the state, $B$ is a bound on the squared Frobenius norm of the observables, and $\epsilon$ is the target accuracy. In the low-rank regime, this is a nearly quadratic advantage over traditional approaches that use single-copy measurements. We present several intermediate results that may be of independent interest: a solution to a new formulation of classical shadows that captures functions of non-identical input states; a generalization of a ``nice'' Schur basis used for optimal qubit purification and quantum majority vote; and a measurement strategy that allows us to use local symmetries in the Schur basis to avoid intractable Weingarten calculations in the analysis.
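
For context, here is the traditional single-copy baseline the paper compares against, specialized to one qubit and random Pauli-basis measurements in the style of Huang, Kueng, and Preskill: each measurement yields the unbiased snapshot $3|s\rangle\langle s| - I$, and averaging snapshots reconstructs $\rho$. This is a NumPy simulation of that baseline, not the joint-measurement protocol of the paper.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
PAULI_EIGENBASES = {
    "X": np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2),   # columns |+>, |->
    "Y": np.array([[1, 1], [1j, -1j]], dtype=complex) / np.sqrt(2), # columns |+i>, |-i>
    "Z": np.array([[1, 0], [0, 1]], dtype=complex),                 # columns |0>, |1>
}

def shadow_snapshot(rho, rng):
    """One single-copy snapshot: measure rho in a uniformly random Pauli basis
    and return the unbiased estimator 3|s><s| - I of rho."""
    basis = list(PAULI_EIGENBASES.values())[rng.integers(3)]
    probs = np.array([np.real(basis[:, k].conj() @ rho @ basis[:, k]) for k in range(2)])
    k = rng.choice(2, p=probs / probs.sum())    # Born-rule outcome
    s = basis[:, k].reshape(2, 1)
    return 3 * (s @ s.conj().T) - I2

rng = np.random.default_rng(0)
rho = np.array([[0.75, 0.2 - 0.1j], [0.2 + 0.1j, 0.25]])   # a valid qubit state
estimate = np.mean([shadow_snapshot(rho, rng) for _ in range(50_000)], axis=0)
print(np.round(estimate, 2))   # close to rho; error shrinks like 1/sqrt(#samples)
```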

Counting overlapping pairs of strings

from arXiv: Data Structures and Algorithms

Authors: Eric Rivals, Pengfei Wang

A correlation is a binary vector that encodes all possible positions of overlaps of two words, where an overlap for an ordered pair of words (u,v) occurs if a suffix of word u matches a prefix of word v. As multiple pairs can have the same correlation, it is relevant to count how many pairs of words share the same correlation depending on the alphabet size and word length n. We exhibit recurrences to compute the number of such pairs -- which is termed population size -- for any correlation; for this, we exploit a relationship between overlaps of two words and self-overlap of one word. This theorem allows us to compute the number of pairs with a longest overlap of a given length and to show that the expected length of the longest border of two words asymptotically diverges, which solves two open questions raised by Gabric in 2022. Finally, we also provide bounds for the asymptotic of the population ratio of any correlation. Given the importance of word overlaps in areas like word combinatorics, bioinformatics, and digital communication, our results may ease analyses of algorithms for string processing, code design, or genome assembly.
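
The definition above translates directly into code; here is a small Python function computing the correlation vector of an ordered pair of words, with entry $i$ set when the length-$i$ suffix of $u$ equals the length-$i$ prefix of $v$ (indexing conventions differ across papers; this is one natural choice).

```python
def correlation(u: str, v: str):
    """Correlation vector of the ordered pair (u, v): entry i (for overlap
    lengths i = 1 .. min(|u|, |v|)) is 1 if the length-i suffix of u equals
    the length-i prefix of v."""
    return [1 if u[-i:] == v[:i] else 0 for i in range(1, min(len(u), len(v)) + 1)]

# Example: "cabab" overlaps "ababc" with overlap lengths 2 ("ab") and 4 ("abab").
print(correlation("cabab", "ababc"))   # [0, 1, 0, 1, 0]
```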

Interval Selection in Sliding Windows

from arXiv: Data Structures and Algorithms

Authors: Cezar-Mihail Alexandru, Christian Konrad

We initiate the study of the Interval Selection problem in the (streaming) sliding window model of computation. In this problem, an algorithm receives a potentially infinite stream of intervals on the line, and the objective is to maintain at every moment an approximation to a largest possible subset of disjoint intervals among the $L$ most recent intervals, for some integer $L$. We give the following results: - In the unit-length intervals case, we give a $2$-approximation sliding window algorithm with space $\tilde{\mathrm{O}}(|OPT|)$, and we show that any sliding window algorithm that computes a $(2-\varepsilon)$-approximation requires space $\Omega(L)$, for any $\varepsilon > 0$. - In the arbitrary-length case, we give a $(\frac{11}{3}+\varepsilon)$-approximation sliding window algorithm with space $\tilde{\mathrm{O}}(|OPT|)$, for any constant $\varepsilon > 0$, which constitutes our main result. We also show that space $\Omega(L)$ is needed for algorithms that compute a $(2.5-\varepsilon)$-approximation, for any $\varepsilon > 0$. Our main technical contribution is an improvement over the smooth histogram technique, which consists of running independent copies of a traditional streaming algorithm with different start times. By employing the one-pass $2$-approximation streaming algorithm by Cabello and P\'{e}rez-Lantero [Theor. Comput. Sci. '17] for \textsf{Interval Selection} on arbitrary-length intervals as the underlying algorithm, the smooth histogram technique immediately yields a $(4+\varepsilon)$-approximation in this setting. Our improvement is obtained by forwarding the structure of the intervals identified in a run to the subsequent run, which constrains the shape of an optimal solution and allows us to target optimal intervals differently.
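
The quantity being approximated is the maximum number of pairwise-disjoint intervals among the window contents; offline, this is solved exactly by the classic earliest-right-endpoint greedy, sketched below in Python. The sliding-window algorithms in the paper approximate this benchmark without storing the whole window.

```python
def max_disjoint_intervals(intervals):
    """Offline optimum: greedy by earliest right endpoint returns a
    maximum-size set of pairwise-disjoint (closed) intervals."""
    chosen, last_right = [], float("-inf")
    for left, right in sorted(intervals, key=lambda iv: iv[1]):
        if left > last_right:          # disjoint from everything chosen so far
            chosen.append((left, right))
            last_right = right
    return chosen

# Example: the benchmark the sliding-window algorithms compete against, here
# restricted to the L = 4 most recent intervals of a small stream.
stream = [(0, 2), (1, 3), (4, 6), (5, 7), (6, 9), (8, 10)]
L = 4
print(max_disjoint_intervals(stream[-L:]))   # [(4, 6), (8, 10)]
```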

Tree-Packing Revisited: Faster Fully Dynamic Min-Cut and Arboricity

from arXiv: Data Structures and Algorithms

Authors: Tijn de Vos, Aleksander B. G. Christiansen

A tree-packing is a collection of spanning trees of a graph. It has been a useful tool for computing the minimum cut in static, dynamic, and distributed settings. In particular, [Thorup, Comb. 2007] used tree-packings to obtain his dynamic min-cut algorithm with $\tilde O(\lambda^{14.5}\sqrt{n})$ worst-case update time. We reexamine this relationship, showing that we need to maintain fewer spanning trees for such a result; we show that we only need to pack $\Theta(\lambda^3 \log m)$ greedy trees to guarantee a 1-respecting cut or a trivial cut in some contracted graph. Based on this structural result, we then provide a deterministic algorithm for fully dynamic exact min-cut that has $\tilde O(\lambda^{5.5}\sqrt{n})$ worst-case update time, for min-cut value bounded by $\lambda$. In particular, this also leads to an algorithm for general fully dynamic exact min-cut with $\tilde O(m^{1-1/12})$ amortized update time, improving upon $\tilde O(m^{1-1/31})$ [Goranci et al., SODA 2023]. We also give the first fully dynamic algorithm that maintains a $(1+\varepsilon)$-approximation of the fractional arboricity -- which is strictly harder than the integral arboricity. Our algorithm is deterministic and has $O(\alpha \log^6m/\varepsilon^4)$ amortized update time, for arboricity at most $\alpha$. We extend these results to a Monte Carlo algorithm with $O(\text{poly}(\log m,\varepsilon^{-1}))$ amortized update time against an adaptive adversary. Our algorithms work on multi-graphs as well. Both results are obtained by exploring the connection between the min-cut/arboricity and (greedy) tree-packing. We investigate tree-packing in a broader sense, including a lower bound for greedy tree-packing, which - to the best of our knowledge - is the first progress on this topic since [Thorup, Comb. 2007].
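
Greedy tree packing, as used above, is usually described very compactly: repeatedly compute a minimum spanning tree where each edge is weighted by how many previously packed trees already use it. Below is a self-contained Python sketch (Kruskal with union-find); it only illustrates the packing step, not the dynamic min-cut machinery built on top of it.

```python
def greedy_tree_packing(n, edges, num_trees):
    """Pack num_trees spanning trees of the connected graph on vertices
    0..n-1 greedily: each tree is a minimum spanning tree with respect to the
    current edge loads (number of earlier trees using the edge)."""
    load = {e: 0 for e in edges}
    packing = []
    for _ in range(num_trees):
        parent = list(range(n))
        def find(x):
            while parent[x] != x:
                parent[x] = parent[parent[x]]   # path halving
                x = parent[x]
            return x
        tree = []
        # Kruskal's algorithm on loads: lightly loaded edges are preferred.
        for u, v in sorted(edges, key=lambda e: load[e]):
            ru, rv = find(u), find(v)
            if ru != rv:
                parent[ru] = rv
                tree.append((u, v))
        packing.append(tree)
        for e in tree:
            load[e] += 1
    return packing

# Example: pack 3 greedy trees into a 4-cycle plus one chord.
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
for t in greedy_tree_packing(4, edges, 3):
    print(t)
```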

Pointwise Lipschitz Continuous Graph Algorithms via Proximal Gradient Analysis

from arXiv: Data Structures and Algorithms

Authors: Quanquan C. Liu, Grigoris Velegkas, Yuichi Yoshida, Felix Zhou

In many real-world applications, it is prohibitively expensive to drastically change the solution to a problem after a small perturbation in the environment. Therefore, the stability of an algorithm is a very desirable property. In this paper, we study the class of pointwise Lipschitz continuous algorithms as introduced in the recent work of Kumabe and Yoshida [KY23b, FOCS'23]. The Lipschitz constant of an algorithm, intuitively, bounds the ratio of the changes in its output (measured in $\ell_1$ distance) over the perturbations of its input. Prior to our work, most of the attention was focused on the weighted setting, whereas only the maximum bipartite matching and the minimum spanning tree problems were studied in the unweighted setting, which is our focus. In this paper, we give a general and simple framework for bounding the Lipschitz constant of algorithms measured through the unweighted $\ell_1$ distance of their outputs. Our approach consists of three main steps. First, we consider a natural continuous relaxation of the underlying graph problem by adding a smooth and strongly convex regularizer to the objective function. Then, we give upper bounds on the $\ell_1$ distance of the optimal solutions of the convex programs, under small perturbations of the weights, via a stability analysis of the trajectory of the proximal gradient method. Finally, we present new problem-specific rounding techniques to obtain integral solutions to several graph problems that approximately maintain the stability guarantees of the fractional solutions. We apply our framework to a number of problems including minimum $s$-$t$ cut, multiway cut, densest subgraph, maximum ($b$-)matching, and packing integer programs. To complement our algorithms, we show the tightness of our results for certain problems by establishing matching lower bounds.
