
Thank you to arXiv for use of its open access interoperability.

Note: the date of arXiv entries announced right after publication holidays might incorrectly show up as the date of the publication holiday itself. This is due to our ad hoc method of inferring announcement dates, which are not returned by the arXiv API.

Powered by Pluto.

Source on GitHub.

Maintained by Nima Anari, Arnab Bhattacharyya, Gautam Kamath.

Theory of Computing Report

Thursday, April 03

Recovery Reductions in the Random Noise Model via Group Theory: Insights into NP-Complete and Fine-Grained Problems

from arXiv: Computational Complexity

Authors: Tejas Nareddy, Abhishek Mishra

We introduce and initiate the study of a new model of reductions called the random noise model. In this model, the truth table $T_f$ of the function $f$ is corrupted on a randomly chosen $\delta$-fraction of instances. A randomized algorithm $A$ is a $\left(t, \delta, 1-\varepsilon\right)$-recovery reduction for $f$ if: 1. With probability $1-\varepsilon$ over the choice of $\delta$-fraction corruptions, given access to the corrupted truth table, the algorithm $A$ computes $f(\phi)$ correctly with probability at least $2/3$ on every input $\phi$. 2. The algorithm $A$ runs in time $O(t)$. We believe this model, which is a natural relaxation of average-case complexity, both has practical motivations and is mathematically interesting. Pointing towards this, we show the existence of robust deterministic polynomial-time recovery reductions with the highest tolerable noise level for many of the canonical NP-complete problems: SAT, kSAT, kCSP, CLIQUE, and more. Our recovery reductions are optimal for non-adaptive algorithms under complexity-theoretic assumptions. Notably, all our recovery reductions follow as corollaries of one black-box algorithm based on group theory and permutation group algorithms. This suggests that recovery reductions in the random noise model are important to the study of the structure of NP-completeness. Furthermore, we establish recovery reductions with optimal parameters for the Orthogonal Vectors and Parity $k$-Clique problems. These problems exhibit structural similarities to NP-complete problems, with Orthogonal Vectors admitting a $2^{0.5n}$-time reduction from kSAT on $n$ variables and Parity $k$-Clique a subexponential-time reduction from 3SAT. This further highlights the relevance of our model to the study of NP-completeness.
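As a concrete illustration of the corruption step in this model, the sketch below flips the bits of a uniformly random $\delta$-fraction of a Boolean truth table. The function name and list representation are our own, not the paper's; a recovery reduction would then be asked to compute $f$ correctly on every input given only oracle access to the corrupted table.

```python
import random

def corrupt_truth_table(table, delta, rng=random):
    """Flip the bits of a randomly chosen delta-fraction of a 0/1 truth
    table, as in the random noise model: the corrupted positions are
    picked uniformly at random among all instances."""
    k = int(delta * len(table))              # number of corrupted instances
    corrupted = list(table)
    for i in rng.sample(range(len(table)), k):
        corrupted[i] ^= 1                    # flip the bit at this position
    return corrupted
```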

Lower Bounds for Leader Election and Collective Coin Flipping, Revisited

from arXiv: Computational Complexity

Authors: Eshan Chattopadhyay, Mohit Gurumukhani, Noam Ringach, Rocco Servedio

We study the tasks of collective coin flipping and leader election in the full-information model. We prove new lower bounds for coin flipping protocols, implying lower bounds for leader election protocols. We show that any $k$-round coin flipping protocol, where each of $\ell$ players sends 1 bit per round, can be biased by $O(\ell/\log^{(k)}(\ell))$ bad players. For all $k>1$, this strengthens previous lower bounds [RSZ, SICOMP 2002], which ruled out protocols resilient to adversaries controlling $O(\ell/\log^{(2k-1)}(\ell))$ players. Consequently, we establish that any protocol tolerating a linear fraction of corrupt players, with only 1 bit per round, must run for at least $\log^*\ell-O(1)$ rounds, improving on the prior best lower bound of $\frac12 \log^*\ell-\log^*\log^*\ell$. This lower bound matches the number of rounds, $\log^*\ell$, taken by the current best coin flipping protocols from [RZ, JCSS 2001], [F, FOCS 1999] that can handle a linear-sized coalition of bad players, but with players sending unlimited bits per round. We also derive lower bounds for protocols allowing multi-bit messages per round. Our results show that the protocols from [RZ, JCSS 2001], [F, FOCS 1999] that handle a linear number of corrupt players are almost optimal in terms of round complexity and communication per player in a round. A key technical ingredient in proving our lower bounds is a new result regarding biasing most functions from a family of functions using a common set of bad players and a small specialized set of bad players specific to each function that is biased. We also give improved constant-round coin flipping protocols in the setting where each player can send 1 bit per round. For two rounds, our protocol can handle a coalition of $O(\ell/((\log\ell)(\log\log\ell)^2))$ bad players, better than the best one-round protocol of [AL, Combinatorica 1993] in this setting.
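To see the baseline phenomenon these bounds quantify, the snippet below computes the exact bias an adversary achieves in the simplest one-round, 1-bit protocol (majority vote) when $b$ of the $\ell$ players always send 1. This is only an illustration of how a coalition biases a 1-bit-per-round protocol, not the paper's protocols or lower-bound argument.

```python
from math import comb

def majority_bias(l, b):
    """Exact probability that the majority outcome is 1 when b of the l
    players are bad and always send 1, while the remaining l - b honest
    players send independent fair coins (l assumed odd, no abstentions)."""
    honest = l - b
    need = max((l + 1) // 2 - b, 0)      # honest 1s needed for majority 1
    return sum(comb(honest, j) for j in range(need, honest + 1)) / 2 ** honest
```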

Epistemic Skills: Reasoning about Knowledge and Oblivion

from arXiv: Computational Complexity

Authors: Xiaolong Liang, Yì N. Wáng

This paper presents a class of epistemic logics that captures the dynamics of acquiring knowledge and descending into oblivion, while incorporating concepts of group knowledge. The approach is grounded in a system of weighted models, introducing an ``epistemic skills'' metric to represent the epistemic capacities tied to knowledge updates. Within this framework, knowledge acquisition is modeled as a process of upskilling, whereas oblivion is represented as a consequence of downskilling. The framework further enables exploration of ``knowability'' and ``forgettability,'' defined as the potential to gain knowledge through upskilling and to lapse into oblivion through downskilling, respectively. Additionally, it supports a detailed analysis of the distinctions between epistemic de re and de dicto expressions. The computational complexity of the model checking and satisfiability problems is examined, offering insights into their theoretical foundations and practical implications.

Dichotomies for \#CSP on graphs that forbid a clique as a minor

from arXiv: Computational Complexity

Authors: Boning Meng, Yicheng Pan

We prove complexity dichotomies for \#CSP problems (not necessarily symmetric) with Boolean domain and complex range on several typical minor-closed graph classes. These dichotomies give a complete characterization of the complexity of \#CSP on graph classes that forbid a complete graph as a minor. In particular, we demonstrate that whether the maximum degree of the vertices is bounded can influence the complexity on specific minor-closed graph classes, a phenomenon that has not been observed in previous related studies. Furthermore, our proofs integrate the properties of each graph class with techniques from counting complexity, developing a systematic approach for analyzing the complexity of \#CSP on these graph classes.

Confidence Bands for Multiparameter Persistence Landscapes

from arXiv: Computational Geometry

Authors: Inés García-Redondo, Anthea Monod, Qiquan Wang

Multiparameter persistent homology is a generalization of classical persistent homology, a central and widely-used methodology from topological data analysis, which takes into account density estimation and is an effective tool for data analysis in the presence of noise. Similar to its classical single-parameter counterpart, however, it is challenging to compute and use in practice due to its complex algebraic construction. In this paper, we study a popular and tractable invariant for multiparameter persistent homology in a statistical setting: the multiparameter persistence landscape. We derive a functional central limit theorem for multiparameter persistence landscapes, from which we compute confidence bands, giving rise to one of the first statistical inference methodologies for multiparameter persistence landscapes. We provide an implementation of confidence bands and demonstrate their application in a machine learning task on synthetic data.
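For intuition, the sketch below evaluates the classical single-parameter persistence landscape, the object whose multiparameter generalization the paper studies (the multiparameter landscape is built from such functions along lines in the parameter space). This is a minimal illustration, not the paper's implementation.

```python
def landscape(pairs, k, t):
    """k-th persistence landscape function (1-indexed) of a persistence
    diagram given as (birth, death) pairs: lambda_k(t) is the k-th largest
    of the tent values max(0, min(t - birth, death - t))."""
    tents = sorted((max(0.0, min(t - b, d - t)) for b, d in pairs), reverse=True)
    return tents[k - 1] if k <= len(tents) else 0.0
```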

Distributed Triangle Detection is Hard in Few Rounds

from arXiv: Data Structures and Algorithms

Authors: Sepehr Assadi, Janani Sundaresan

In the distributed triangle detection problem, we have an $n$-vertex network $G=(V,E)$ with one player for each vertex of the graph who sees the edges incident on the vertex. The players communicate in synchronous rounds using the edges of this network and have a limited bandwidth of $O(\log{n})$ bits over each edge. The goal is to detect whether or not $G$ contains a triangle as a subgraph in a minimal number of rounds. We prove that any protocol (deterministic or randomized) for distributed triangle detection requires $\Omega(\log\log{n})$ rounds of communication. Prior to our work, only one-round lower bounds were known for this problem. The primary technique for proving these types of distributed lower bounds is via reductions from two-party communication complexity. However, it has been known for a while that this approach is provably incapable of establishing any meaningful lower bounds for distributed triangle detection. Our main technical contribution is a new information theoretic argument which combines recent advances on multi-pass graph streaming lower bounds with the point-to-point communication aspects of distributed models, and can be of independent interest.
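For reference, the predicate the network must decide is simply the following; the distributed protocol, bandwidth limits, and lower-bound machinery are the paper's subject, and this sequential check is only the specification.

```python
def has_triangle(adj):
    """Decide whether an undirected graph, given as {vertex: set of
    neighbors} with no self-loops, contains a triangle: an edge (u, v)
    whose endpoints share a common neighbor."""
    for u, nbrs in adj.items():
        for v in nbrs:
            if u < v and nbrs & adj[v]:  # common neighbor w completes u-v-w
                return True
    return False
```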

Shared-Memory Hierarchical Process Mapping

from arXiv: Data Structures and Algorithms

Authors: Christian Schulz, Henning Woydt

Modern large-scale scientific applications consist of thousands to millions of individual tasks. These tasks involve not only computation but also communication with one another. Typically, the communication pattern between tasks is sparse and can be determined in advance. Such applications are executed on supercomputers, which are often organized in a hierarchical hardware topology, consisting of islands, racks, nodes, and processors, where processing elements reside. To ensure efficient workload distribution, tasks must be allocated to processing elements in a way that ensures balanced utilization. However, this approach optimizes only the workload, not the communication cost of the application. It is straightforward to see that placing groups of tasks that frequently exchange large amounts of data on processing elements located near each other is beneficial. The problem of mapping tasks to processing elements considering optimization goals is called process mapping. In this work, we focus on minimizing communication cost while evenly distributing work. We present the first shared-memory algorithm that utilizes hierarchical multisection to partition the communication model across processing elements. Our parallel approach achieves the best solution on 95 percent of instances while also being marginally faster than the next best algorithm. Even in a serial setting, it delivers the best solution quality while also outperforming previous serial algorithms in speed.
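As a rough sketch of the objective, the code below scores a task-to-PE assignment by charging each message its volume times a hierarchy distance. The distance used here (the number of levels on which the two PEs' paths from the root differ) is a simplifying assumption of ours; the paper's exact cost model may differ.

```python
def mapping_cost(comm, assign, levels):
    """Communication cost of a process mapping. `comm` maps task pairs to
    communication volume, `assign` maps each task to a processing element,
    and `levels` gives each PE's path in the hardware hierarchy, e.g.
    (island, rack, node, processor). Each message costs its volume times
    the number of hierarchy levels on which the two PEs differ."""
    total = 0
    for (a, b), volume in comm.items():
        pa, pb = levels[assign[a]], levels[assign[b]]
        total += volume * sum(x != y for x, y in zip(pa, pb))
    return total
```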

Cutwidth Bounds via Vertex Partitions

from arXiv: Data Structures and Algorithms

Authors: Antoine Amarilli, Benoît Groz

We study the cutwidth measure on graphs and ways to bound the cutwidth of a graph by partitioning its vertices. We consider bounds expressed as a function of two quantities: on the one hand, the maximal cutwidth x of the subgraphs induced by the classes of the partition, and on the other hand, the cutwidth y of the quotient multigraph obtained by merging each class to a single vertex. We consider in particular the decomposition of directed graphs into strongly connected components (SCCs): in this case, x is the maximal cutwidth of an SCC, and y is the cutwidth of the directed acyclic condensation multigraph. We show that the cutwidth of a graph is always in O(x + y); specifically, it can be upper bounded by 1.5x + y. We also show a lower bound justifying that the constant 1.5 cannot be improved in general.
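To make the measure concrete, here is a brute-force computation of cutwidth for small graphs (exponential in the number of vertices, for illustration only): the minimum over vertex orderings of the maximum number of edges crossing any gap between consecutive positions.

```python
from itertools import permutations

def cutwidth(vertices, edges):
    """Cutwidth of a small undirected graph by exhaustive search: minimize,
    over linear orderings of the vertices, the maximum number of edges that
    cross a gap between consecutive positions in the ordering."""
    best = len(edges)                        # trivial upper bound
    for order in permutations(vertices):
        pos = {v: i for i, v in enumerate(order)}
        width = max(
            sum(min(pos[u], pos[v]) < gap <= max(pos[u], pos[v]) for u, v in edges)
            for gap in range(1, len(order))
        )
        best = min(best, width)
    return best
```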

Local Computation Algorithms for Knapsack: impossibility results, and how to avoid them

from arXiv: Data Structures and Algorithms

Authors: Clément L. Canonne, Yun Li, Seeun William Umboh

Local Computation Algorithms (LCA), as introduced by Rubinfeld, Tamir, Vardi, and Xie (2011), are a class of ultra-efficient algorithms which, given access to a (large) input for a given computational task, are required to provide fast query access to a consistent output solution, without maintaining a state between queries. This paradigm of computation in particular allows for hugely distributed algorithms, where independent instances of a given LCA provide consistent access to a common output solution. The past decade has seen a significant amount of work on LCAs, by and large focusing on graph problems. In this paper, we initiate the study of Local Computation Algorithms for perhaps the archetypal combinatorial optimization problem, Knapsack. We first establish strong impossibility results, ruling out the existence of any non-trivial LCA for Knapsack as well as for several of its relaxations. We then show how equipping the LCA with additional access to the Knapsack instance, namely, weighted item sampling, allows one to circumvent these impossibility results and obtain LCAs with sublinear time and query complexity. Our positive result draws on a connection to the recent notion of reproducibility for learning algorithms (Impagliazzo, Lei, Pitassi, and Sorrell, 2022), a connection we believe to be of independent interest for the design of LCAs.

Generalized Assignment and Knapsack Problems in the Random-Order Model

from arXiv: Data Structures and Algorithms

Authors: Max Klimm, Martin Knaack

We study different online optimization problems in the random-order model. There is a finite set of bins with known capacity and a finite set of items arriving in a random order. Upon arrival of an item, its size and its value for each of the bins is revealed and it has to be decided immediately and irrevocably to which bin the item is assigned, or to not assign the item at all. In this setting, an algorithm is $\alpha$-competitive if the total value of all items assigned to the bins is at least an $\alpha$-fraction of the total value of an optimal assignment that knows all items beforehand. We give an algorithm that is $\alpha$-competitive with $\alpha = (1-\ln(2))/2 \approx 1/6.52$ improving upon the previous best algorithm with $\alpha \approx 1/6.99$ for the generalized assignment problem and the previous best algorithm with $\alpha \approx 1/6.65$ for the integral knapsack problem. We then study the fractional knapsack problem where we have a single bin and it is also allowed to pack items fractionally. For that case, we obtain an algorithm that is $\alpha$-competitive with $\alpha = 1/e \approx 1/2.71$ improving on the previous best algorithm with $\alpha = 1/4.39$. We further show that this competitive ratio is the best possible for deterministic algorithms in this model.
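The quoted constants are easy to sanity-check numerically; this is pure arithmetic, not part of the algorithms themselves.

```python
from math import e, log

alpha_gap = (1 - log(2)) / 2   # ratio for generalized assignment / integral knapsack
alpha_frac = 1 / e             # ratio for fractional knapsack

assert round(1 / alpha_gap, 2) == 6.52   # matches the quoted 1/6.52
assert round(1 / alpha_frac, 2) == 2.72  # 1/alpha = e = 2.71828...
```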

Diameter Shortcut Sets on Temporal Graphs

from arXiv: Data Structures and Algorithms

Authors: Gerome Quantmeyer

Shortcut sets are a vital instrument for reducing the diameter of a static graph and, consequently, its shortest path complexity, which is relevant in numerous subfields of graph theory. We explore the notion of shortcut sets in temporal graphs, which incorporate a discrete time model into the graph, rendering each edge accessible exclusively at specific points in time. This not only alters the underlying assumptions of regular graphs but also substantially increases the complexity of path problems and reachability. In turn, a temporal graph is often a much more realistic and accurate representation of a real-world network. In this thesis we provide a definition for a shortcut set in a temporal graph and explore differences to classic shortcut sets. Utilizing this definition, we show that temporal and regular shortcut sets yield the same results on temporal paths, enabling the application of existing construction algorithms for static shortcut sets on paths. The primary contribution of this thesis is a translation approach for general temporal graphs that utilizes the static expansion of a temporal graph, allowing the conversion of static shortcut sets into temporal shortcut sets, yielding similar results.
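The translation approach rests on the static expansion of a temporal graph; a minimal version of one common variant of that construction is sketched below. The thesis's exact definition may differ in details such as edge traversal duration.

```python
from collections import defaultdict

def static_expansion(temporal_edges):
    """Static expansion of a temporal graph given as (u, v, t) triples:
    each temporal edge becomes a static edge between the time copies
    (u, t) and (v, t), and 'waiting' edges link consecutive time copies
    (v, t) -> (v, t') of every vertex."""
    times = defaultdict(set)
    for u, v, t in temporal_edges:
        times[u].add(t)
        times[v].add(t)
    edges = [((u, t), (v, t)) for u, v, t in temporal_edges]
    for v, ts in times.items():
        ts = sorted(ts)
        edges += [((v, a), (v, b)) for a, b in zip(ts, ts[1:])]
    return edges
```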

Computing Time-varying Network Reliability using Binary Decision Diagrams

from arXiv: Data Structures and Algorithms

Authors: Yu Nakahata, Shun Arizono, Shoji Kasahara

Computing the reliability of a time-varying network, taking into account its dynamic nature, is crucial for networks that change over time, such as space networks, vehicular ad-hoc networks, and drone networks. These networks are modeled using temporal graphs, in which each edge is labeled with a time indicating its existence at a specific point in time. The time-varying network reliability is defined as the probability that a data packet from the source vertex can reach the terminal vertex, following links with increasing time labels (i.e., a journey), while taking into account the possibility of network link failures. Currently, the existing method for calculating this reliability involves explicitly enumerating all possible journeys between the source and terminal vertices and then calculating the reliability using the sum of disjoint products method. However, this method has high computational complexity. In contrast, there is an efficient algorithm that uses binary decision diagrams (BDDs) to evaluate the reliability of a network whose topology does not change over time. This paper presents an efficient exact algorithm that utilizes BDDs for computing the time-varying network reliability. Experimental results show that the proposed method runs faster than the existing method by up to four orders of magnitude.
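The reachability notion underlying this reliability, a journey with increasing time labels, can be decided with a single pass over the edges in time order. The snippet below is only this deterministic reachability check (assuming positive time labels), not the BDD-based reliability computation.

```python
def temporal_reachable(temporal_edges, source, target):
    """Decide whether `target` is reachable from `source` by a journey:
    a path whose edges have strictly increasing time labels. Edges are
    (u, v, t) triples with t > 0; earliest[v] is the earliest arrival
    time at v, with the source available from time 0."""
    earliest = {source: 0}
    for u, v, t in sorted(temporal_edges, key=lambda edge: edge[2]):
        if u in earliest and earliest[u] < t:
            earliest[v] = min(earliest.get(v, t), t)
    return target in earliest
```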

SplineSketch: Even More Accurate Quantiles with Error Guarantees

from arXiv: Data Structures and Algorithms

Authors: Aleksander Łukasiewicz, Jakub Tětek, Pavel Veselý

Space-efficient estimation of quantiles in massive datasets is a fundamental problem with numerous applications in data monitoring and analysis. While theoretical research led to optimal algorithms, such as the Greenwald-Khanna algorithm or the KLL sketch, practitioners often use other sketches that perform significantly better in practice but lack theoretical guarantees. Most notably, the widely used t-digest has unbounded worst-case error. In this paper, we seek to get the best of both worlds. We present a new quantile summary, SplineSketch, for numeric data, offering near-optimal theoretical guarantees and outperforming t-digest by a factor of 2-20 on a range of synthetic and real-world datasets with non-skewed frequency distributions. To achieve such performance, we develop a novel approach that maintains a dynamic subdivision of the input range into buckets while fitting the input distribution using monotone cubic spline interpolation. The core challenge is implementing this method in a space-efficient manner while ensuring strong worst-case guarantees.
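The bucket idea at the core of such summaries can be illustrated, far more crudely than SplineSketch itself (which subdivides dynamically and fits monotone cubic splines), with a fixed-range histogram that answers quantile queries by linear interpolation inside a bucket. Everything here is a simplified stand-in, not the paper's algorithm:

```python
class BucketQuantiles:
    """Toy quantile summary: k fixed equal-width buckets over [lo, hi),
    linear interpolation within a bucket. Unlike SplineSketch, the buckets
    are static and the fit is piecewise linear, so accuracy is much worse."""

    def __init__(self, lo, hi, k=64):
        self.lo, self.hi, self.k = lo, hi, k
        self.counts = [0] * k
        self.n = 0

    def insert(self, x):
        i = int((x - self.lo) / (self.hi - self.lo) * self.k)
        self.counts[min(max(i, 0), self.k - 1)] += 1
        self.n += 1

    def quantile(self, q):
        """Estimate the q-quantile (0 <= q <= 1) from bucket counts."""
        target, cum = q * self.n, 0
        width = (self.hi - self.lo) / self.k
        for i, c in enumerate(self.counts):
            if c > 0 and cum + c >= target:
                return self.lo + (i + (target - cum) / c) * width
            cum += c
        return self.hi
```

The space usage is k counters regardless of stream length, which is the trade-off all quantile sketches navigate; SplineSketch's contribution is doing so with worst-case error guarantees.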

LimTDD: A Compact Decision Diagram Integrating Tensor and Local Invertible Map Representations

from arXiv: Data Structures and Algorithms

Authors: Xin Hong, Aochu Dai, Dingchao Gao, Sanjiang Li, Zhengfeng Ji, Mingsheng Ying

Tensor Decision Diagrams (TDDs) provide an efficient structure for representing tensors by combining techniques from both tensor networks and decision diagrams, demonstrating competitive performance in quantum circuit simulation and verification. However, existing decision diagrams, including TDDs, fail to exploit isomorphisms within tensors, limiting their compression efficiency. This paper introduces Local Invertible Map Tensor Decision Diagrams (LimTDDs), an extension of TDD that integrates local invertible maps (LIMs) to achieve more compact representations. Unlike LIMDD, which applies Pauli operators to quantum states, LimTDD generalizes this approach using the XP-stabilizer group, enabling broader applicability. We develop efficient algorithms for normalization and key tensor operations, including slicing, addition, and contraction, essential for quantum circuit simulation and verification. Theoretical analysis shows that LimTDD surpasses TDD in compactness while maintaining its generality and offers exponential advantages over both TDD and LIMDD in the best-case scenarios. Experimental results validate these improvements, demonstrating LimTDD's superior efficiency in quantum circuit simulation and functionality computation.

Wednesday, April 02

Improved Round-by-round Soundness IOPs via Reed-Muller Codes

from arXiv: Computational Complexity

Authors: Dor Minzer, Kai Zhe Zheng

We give an IOPP (interactive oracle proof of proximity) for trivariate Reed-Muller codes that achieves the best known query complexity in some range of security parameters. Specifically, for degree $d$ and security parameter $\lambda\leq \frac{\log^2 d}{\log\log d}$, our IOPP has $2^{-\lambda}$ round-by-round soundness, $O(\lambda)$ queries, $O(\log\log d)$ rounds and $O(d)$ length. This improves upon the FRI [Ben-Sasson, Bentov, Horesh, Riabzev, ICALP 2018] and the STIR [Arnon, Chiesa, Fenzi, Yogev, Crypto 2024] IOPPs for Reed-Solomon codes, which have larger query and round complexity, standing at $O(\lambda \log d)$ and $O(\log d+\lambda\log\log d)$ respectively. We use our IOPP to give an IOP for the NP-complete language Rank-1-Constraint-Satisfaction with the same parameters. Our construction is based on the line versus point test in the low-soundness regime. Compared to the axis parallel test (which is used in all prior works), the general affine lines test has improved soundness, which is the main source of our improved soundness. Using this test involves several complications, most significantly that projection to affine lines does not preserve individual degrees, and we show how to overcome these difficulties. En route, we extend some existing machinery to more general settings. Specifically, we give proximity generators for Reed-Muller codes, show a more systematic way of handling ``side conditions'' in IOP constructions, and generalize the compiling procedure of [Arnon, Chiesa, Fenzi, Yogev, Crypto 2024] to general codes.

SAT problem and Limit of Solomonoff's inductive reasoning theory

from arXiv: Computational Complexity

Authors: Feng Pan

This paper explores the Boolean Satisfiability Problem (SAT) in the context of Kolmogorov complexity theory. We present three versions of the distinguishability problem (Boolean formulas, Turing machines, and quantum systems), each focused on distinguishing between two Bernoulli distributions induced by these computational models. A reduction is provided that establishes the equivalence between the Boolean formula version of the program output statistical prediction problem and the #SAT problem. Furthermore, we apply Solomonoff's inductive reasoning theory, revealing its limitations: the only "algorithm" capable of determining the output of any shortest program is the program itself, and all other algorithms are computationally indistinguishable from a universal computer, based on the coding theorem. The quantum version of this problem introduces a unique algorithm based on statistical distance and distinguishability, reflecting a fundamental limit in quantum mechanics. Finally, the potential equivalence of Kolmogorov complexity between circuit models and Turing machines may have significant implications for the NP vs P problem. We also investigate the nature of short programs corresponding to exponentially long bit sequences that can be compressed, revealing that these programs inherently contain loops that grow exponentially.

Strongly sublinear separators and bounded asymptotic dimension for sphere intersection graphs

from arXiv: Computational Geometry

Authors: James Davies, Agelos Georgakopoulos, Meike Hatzel, Rose McCarty

In this paper, we consider the class $\mathcal{C}^d$ of sphere intersection graphs in $\mathbb{R}^d$ for $d \geq 2$. We show that for each integer $t$, the class of all graphs in $\mathcal{C}^d$ that exclude $K_{t,t}$ as a subgraph has strongly sublinear separators. We also prove that $\mathcal{C}^d$ has asymptotic dimension at most $2d+2$.

Crossing number inequalities for curves on surfaces

from arXiv: Computational Geometry

Authors: Alfredo Hubard, Hugo Parlier

We prove that, as $m$ grows, any family of $m$ homotopically distinct closed curves on a surface induces a number of crossings that grows at least like $(m \log m)^2$. We use this to answer two questions of Pach, Tardos and Toth related to crossing numbers of drawings of multigraphs where edges are required to be non-homotopic. Furthermore, we generalize these results, obtaining effective bounds with optimal growth rates on every orientable surface.

Co-design Optimization of Moving Parts for Compliance and Collision Avoidance

from arXiv: Computational Geometry

Authors: Amir M. Mirzendehdel, Morad Behandish

Design requirements for moving parts in mechanical assemblies are typically specified in terms of interactions with other parts. Some are purely kinematic (e.g., pairwise collision avoidance) while others depend on physics and material properties (e.g., deformation under loads). Kinematic design methods and physics-based shape/topology optimization (SO/TO) deal separately with these requirements. They rarely talk to each other as the former uses set algebra and group theory while the latter requires discretizing and solving differential equations. Hence, optimizing a moving part based on physics typically relies on either neglecting or pruning kinematic constraints in advance, e.g., by restricting the design domain to a collision-free space using an unsweep operation. In this paper, we show that TO can be used to co-design two or more parts in relative motion to simultaneously satisfy physics-based criteria and collision avoidance. We restrict our attention to maximizing linear-elastic stiffness while penalizing collision measures aggregated in time. We couple the TO loops for two parts in relative motion so that the evolution of each part's shape is accounted for when penalizing collision for the other part. The collision measures are computed by a correlation functional that can be discretized by left- and right-multiplying the shape design variables by a pre-computed matrix that depends solely on the motion. This decoupling is key to making the computations scalable for TO iterations. We demonstrate the effectiveness of the approach with 2D and 3D examples.
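The decoupling described at the end of the abstract amounts to a bilinear form: once a motion-dependent matrix M of time-aggregated overlaps has been precomputed, every optimization iteration obtains the collision measure x^T M y and both sensitivities by plain matrix-vector products. A tiny self-contained sketch (random stand-in data; in the paper M comes from the motion and the discretization):

```python
import random

random.seed(0)
n = 8  # hypothetical number of cells per part
# M[i][j]: precomputed time-aggregated overlap of cell i of part A
# with cell j of part B under the prescribed relative motion
M = [[random.random() for _ in range(n)] for _ in range(n)]

x = [random.random() for _ in range(n)]  # density variables of part A
y = [random.random() for _ in range(n)]  # density variables of part B

# collision measure c = x^T M y, cheap to re-evaluate every TO iteration
collision = sum(x[i] * M[i][j] * y[j] for i in range(n) for j in range(n))
grad_x = [sum(M[i][j] * y[j] for j in range(n)) for i in range(n)]  # M y
grad_y = [sum(M[i][j] * x[i] for i in range(n)) for j in range(n)]  # M^T x
```

Because M never changes across iterations, coupling the two parts' optimization loops costs only these products per step, which is what makes the approach scalable.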

Linked Array Tree: A Constant-Time Search Structure for Big Data

from arXiv: Data Structures and Algorithms

Authors: Songpeng Liu

As data volumes continue to grow rapidly, traditional search algorithms, like the red-black tree and B+ Tree, face increasing challenges in performance, especially in big data scenarios with intensive storage access. This paper presents the Linked Array Tree (LAT), a novel data structure designed to achieve constant-time complexity for search, insertion, and deletion operations. LAT leverages a sparse, non-moving hierarchical layout that enables direct access paths without requiring rebalancing or data movement. Its low memory overhead and avoidance of pointer-heavy structures make it well-suited for large-scale and intensive workloads. While not specifically tested under parallel or concurrent conditions, the structure's static layout and non-interfering operations suggest potential advantages in such environments. This paper first introduces the structure and algorithms of LAT, followed by a detailed analysis of its time complexity in search, insertion, and deletion operations. Finally, it presents experimental results across both data-intensive and sparse usage scenarios to evaluate LAT's practical performance.
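The paper's structure is more elaborate, but the flavor of "direct access paths without rebalancing" can be shown with a fixed-depth array tree over 16-bit integer keys, where the key's bits index directly into nested arrays (a hypothetical toy, not the LAT of the paper):

```python
class TwoLevelArray:
    """Toy fixed-depth array tree over 16-bit integer keys. The high byte
    indexes a top-level array and the low byte a leaf array, so search,
    insert and delete are a constant number of array accesses with no
    rebalancing or data movement. (Illustrative only -- not the paper's LAT.)"""

    FAN = 256

    def __init__(self):
        self.top = [None] * self.FAN

    def _leaf(self, key, create):
        hi, lo = key >> 8, key & 0xFF
        leaf = self.top[hi]
        if leaf is None and create:
            leaf = self.top[hi] = [None] * self.FAN
        return leaf, lo

    def insert(self, key, value):
        leaf, lo = self._leaf(key, True)
        leaf[lo] = value

    def search(self, key):
        leaf, lo = self._leaf(key, False)
        return None if leaf is None else leaf[lo]

    def delete(self, key):
        leaf, lo = self._leaf(key, False)
        if leaf is not None:
            leaf[lo] = None
```

The sparsity matches the abstract's description: leaf arrays are allocated only for key prefixes that are actually used, and nothing is ever moved once placed.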

Expander Pruning with Polylogarithmic Worst-Case Recourse and Update Time

from arXiv: Data Structures and Algorithms

Authors: Simon Meierhans, Maximilian Probst Gutenberg, Thatchaphol Saranurak

Expander graphs are known to be robust to edge deletions in the following sense: for any online sequence of edge deletions $e_1, e_2, \ldots, e_k$ to an $m$-edge graph $G$ that is initially a $\phi$-expander, the algorithm can grow a set $P \subseteq V$ such that at any time $t$, $G[V \setminus P]$ is an expander of the same quality as the initial graph $G$ up to a constant factor and the set $P$ has volume at most $O(t/\phi)$. However, currently, there is no algorithm to grow $P$ with low worst-case recourse that achieves any non-trivial guarantee. In this work, we present an algorithm that achieves near-optimal guarantees: we give an algorithm that grows $P$ only by $\tilde{O}(1/\phi^2)$ vertices per time step and ensures that $G[V \setminus P]$ remains $\tilde{\Omega}(\phi)$-expander at any time. Even more excitingly, our algorithm is extremely efficient: it can process each update in near-optimal worst-case update time $\tilde{O}(1/\phi^2)$. This affirmatively answers the main open question posed in [SW19] whether such an algorithm exists. By combining our results with recent techniques in [BvdBPG+22], we obtain the first adaptive algorithms to maintain spanners, cut and spectral sparsifiers with $\tilde{O}(n)$ edges and polylogarithmic approximation guarantees, worst-case update time and recourse. More generally, we believe that worst-case pruning is an essential tool for obtaining worst-case guarantees in dynamic graph algorithms and online algorithms.

How to Protect Yourself from Threatening Skeletons: Optimal Padded Decompositions for Minor-Free Graphs

from arXiv: Data Structures and Algorithms

Authors: Jonathan Conroy, Arnold Filtser

Roughly, a metric space has padding parameter $\beta$ if for every $\Delta>0$, there is a stochastic decomposition of the metric points into clusters of diameter at most $\Delta$ such that every ball of radius $\gamma\Delta$ is contained in a single cluster with probability at least $e^{-\gamma\beta}$. The padding parameter is an important characteristic of a metric space with vast algorithmic implications. In this paper we prove that the shortest path metric of every $K_r$-minor-free graph has padding parameter $O(\log r)$, which is also tight. This resolves a long-standing open question, and exponentially improves the previous bound. En route to our main result, we construct sparse covers for $K_r$-minor-free graphs with improved parameters, and we prove a general reduction from sparse covers to padded decompositions.

Tuesday, April 01

ICE-TCS seminar by Benjamin Moore on "Smoothed analysis for graph isomorphism"

from Luca Aceto

Today, the ICE-TCS seminar series at Reykjavik University hosted a talk by Benjamin Moore (Institute of Science and Technology Austria), who is visiting our postdoctoral researcher Nicolaos Matsakis.

Benjamin presented the main results in his paper "Smoothed analysis for graph isomorphism", coauthored with his ISTA colleagues Michael Anastos and Matthew Kwan. (In passing, I just saw that Matthew Kwan received the main prize of the Austrian Mathematical Society last year. Congratulations!) 

To my mind, Benjamin did an excellent job in presenting the context for their exciting (but very technical) contribution and the main ideas that underlie it. Kudos! The work by Benjamin and his collaborators provides another explanation of the effectiveness of the colour refinement algorithm (also known as the one-dimensional Weisfeiler-Leman algorithm) in checking whether two graphs are isomorphic. I encourage you to read at least the introduction of their paper, which will be presented at STOC 2025, and the ISTA news article here, which does a much better job at putting their work in context than an interested, but ignorant, observer like me ever could. FWIW, I find results like theirs, which offer some explanation as to why theoretically hard problems are seemingly easy in practice, fascinating and I feel like that paper might be a strong candidate for a best paper award. 

It was also fitting to see recent work on smoothed analysis being presented at our seminar series since Daniel Spielman and Shang-Hua Teng received the 2008 Gödel Prize at ICALP 2008, which was held at Reykjavik University. Time flies, but great work is timeless. 


By Luca Aceto

PDQ Shor (?-2025)

from Computational Complexity

PDQ Shor

PDQ Shor, Peter Shor's smarter brother, passed away last week. PDQ was a Physicist/Computer Scientist/Mathematician/Astrologer/Psychic at the University of Northern South Dakota in Wakpala.

Dr. Phineas Dominic Quincy Shor III, PhD, MBA, BLT, received his education at Europa U. during one of his many alien abductions. He ended up in South Dakota after having fled every other state.

He was most famous for the concept of unnatural proofs, collected in his anthology Proofs from the Other Book, which includes his classic "interpretive dance proof" of the Pythagorean theorem. Critics complain the proof only handles the case where the right angle is on the left.

His follow-up book, Proofs from the Crypt, contains his masterpiece, a 1237-page proof that conclusively shows that the empty set contains no irrational numbers.

Like his brother, he's moved to the quantum space, reverse engineering Peter's work by giving a doubly exponential time quantum algorithm for multiplying numbers. He created the innovative dipping bird quantum error collection machine that constantly monitors a quantum machine, collapsing all entanglement. Apple bought the device for $327 million, which immediately destroyed their plans for a QiPhone.

PDQ used the proceeds to create the perpetual Turing machine, guaranteed to never halt. Until it did.

Sadly PDQ passed away from paranormal causes last week. Or was it last year? No one is quite sure. He donated his body to pseudoscience, currently lying in state in an undisclosed state. We hardly knew you.

With apologies to Peter Schickele. This April Fools post was inspired by the complexity class PDQMA.

By Lance Fortnow

Lifting for Arbitrary Gadgets

from arXiv: Computational Complexity

Authors: Siddharth Iyer

We prove a sensitivity-to-communication lifting theorem for arbitrary gadgets. Given functions $f: \{0,1\}^n\to \{0,1\}$ and $g : \mathcal X\times \mathcal Y\to \{0,1\}$, denote $f\circ g(x,y) := f(g(x_1,y_1),\ldots,g(x_n,y_n))$. We show that for any $f$ with sensitivity $s$ and any $g$, \[D(f\circ g) \geq s\cdot \bigg(\frac{\Omega(D(g))}{\log\mathsf{rk}(g)} - \log\mathsf{rk}(g)\bigg),\] where $D(\cdot)$ denotes the deterministic communication complexity and $\mathsf{rk}(g)$ is the rank of the matrix associated with $g$. As a corollary, we get that if $D(g)$ is a sufficiently large constant, $D(f\circ g) = \Omega(\min\{s,d\}\cdot \sqrt{D(g)})$, where $s$ and $d$ denote the sensitivity and degree of $f$. In particular, computing the OR of $n$ copies of $g$ requires $\Omega(n\cdot\sqrt{D(g)})$ bits.

$\mathsf{P}$-completeness of Graph Local Complementation

from arXiv: Computational Complexity

Authors: Pablo Concha-Vega

Local complementation of a graph $G$ on vertex $v$ is an operation that results in a new graph $G*v$, where the neighborhood of $v$ is complemented. This operation has been widely studied in graph theory and quantum computing. This article introduces the Local Complementation Problem, a decision problem that captures the complexity of applying a sequence of local complementations. Given a graph $G$, a sequence of vertices $s$, and a pair of vertices $u,v$, the problem asks whether the edge $(u,v)$ is present in the graph obtained after applying local complementations according to $s$. The main contribution of this work is proving that this problem is $\mathsf{P}$-complete, implying that computing a sequence of local complementations is unlikely to be efficiently parallelizable. The proof is based on a reduction from the Circuit Value Problem, a well-known $\mathsf{P}$-complete problem, by simulating circuits through local complementations. Additionally, the complexity of this problem is analyzed under different restrictions. In particular, it is shown that for complete and star graphs, the problem belongs to $\mathsf{LOGSPACE}$. Finally, it is conjectured that the problem remains $\mathsf{P}$-complete for the class of circle graphs.
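The operation itself is easy to state in code. A minimal sketch (adjacency-set representation; all names are illustrative), including the straightforward sequential decider whose apparent non-parallelizability the $\mathsf{P}$-completeness result captures:

```python
def local_complement(adj, v):
    """Return G*v: complement the edges among the neighbors of v.

    `adj` maps each vertex to the set of its neighbors (undirected graph);
    the input graph is left unmodified.
    """
    nbrs = sorted(adj[v])
    new = {u: set(s) for u, s in adj.items()}
    for i, a in enumerate(nbrs):
        for b in nbrs[i + 1:]:
            if b in new[a]:          # edge present among neighbors: remove it
                new[a].remove(b)
                new[b].remove(a)
            else:                    # edge absent among neighbors: add it
                new[a].add(b)
                new[b].add(a)
    return new

def edge_after_sequence(adj, seq, u, v):
    """Decide a Local Complementation Problem instance (G, seq, u, v)
    by applying the sequence one vertex at a time."""
    for w in seq:
        adj = local_complement(adj, w)
    return v in adj[u]
```

For example, complementing the path 1-2-3 at vertex 2 creates the edge (1,3); doing it twice restores the original graph, since local complementation is an involution.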

Simple general magnification of circuit lower bounds

from arXiv: Computational Complexity

Authors: Albert Atserias, Moritz Müller

We construct so-called distinguishers, sparse matrices that retain some properties of error correcting codes. They provide a technically and conceptually simple approach to magnification. We generalize and strengthen known general (not problem specific) magnification results and in particular achieve magnification thresholds below known lower bounds. For example, we show that fixed polynomial formula size lower bounds for NP are implied by slightly superlinear formula size lower bounds for approximating any sufficiently sparse problem in NP. We also show that the thresholds achieved are sharp. Additionally, our approach yields a uniform magnification result for the minimum circuit size problem. This seems to sidestep the localization barrier.

On the Quantum Chromatic Gap

from arXiv: Computational Complexity

Authors: Lorenzo Ciardo

The largest known gap between quantum and classical chromatic number of graphs, obtained via quantum protocols for colouring Hadamard graphs based on the Deutsch--Jozsa algorithm and the quantum Fourier transform, is exponential. We put forth a quantum pseudo-telepathy version of Khot's $d$-to-$1$ Games Conjecture and prove that, conditional to its validity, the gap is unbounded: There exist graphs whose quantum chromatic number is $3$ and whose classical chromatic number is arbitrarily large. Furthermore, we show that the existence of a certain form of pseudo-telepathic XOR games would imply the conjecture and, thus, the unboundedness of the quantum chromatic gap. As two technical steps of our proof that might be of independent interest, we establish a quantum adjunction theorem for Pultr functors between categories of relational structures, and we prove that the Dinur--Khot--Kindler--Minzer--Safra reduction, recently used for proving the $2$-to-$2$ Games Theorem, is quantum complete.

Classical Simulation of Quantum CSP Strategies

from arXiv: Computational Complexity

Authors: Demian Banakh, Lorenzo Ciardo, Marcin Kozik, Jan Tułowiecki

We prove that any perfect quantum strategy for the two-prover game encoding a constraint satisfaction problem (CSP) can be simulated via a perfect classical strategy with an extra classical communication channel, whose size depends only on $(i)$ the size of the shared quantum system used in the quantum strategy, and $(ii)$ structural parameters of the CSP template. The result is obtained via a combinatorial characterisation of perfect classical strategies with extra communication channels and a geometric rounding procedure for the projection-valued measurements involved in quantum strategies. A key intermediate step of our proof is to establish that the gap between the classical chromatic number of graphs and its quantum variant is bounded when the quantum strategy involves shared quantum information of bounded size.

On the difficulty of order constrained pattern matching with applications to feature matching based malware detection

from arXiv: Computational Complexity

Authors: Adiesha Liyanage, Braeden Sopp, Binhai Zhu

We formulate low-level malware detection using algorithms based on feature matching as Order-based Malware Detection with Critical Instructions (General-OMDCI): given a pattern in the form of a sequence \(M\) of colored blocks, where each block contains a critical character (representing a unique sequence of critical instructions potentially associated with malware but without certainty), and a program \(A\), represented as a sequence of \(n\) colored blocks with critical characters, the goal is to find two subsequences, \(M'\) of \(M\) and \(A'\) of \(A\), with blocks matching in color and whose critical characters form a permutation of each other. When $M$ is a permutation in both colors and critical characters, the problem is called OMDCI. If we additionally require $M'=M$, then the problem is called OMDCI+; if in this case $d=|M|$ is used as a parameter, then the OMDCI+ problem is easily shown to be FPT. Our main (negative) results are on the cases when $|M|$ is arbitrary and are summarized as follows: OMDCI+ is NP-complete, which implies OMDCI is also NP-complete. For the special case of OMDCI, deciding if the optimal solution has length $0$ (i.e., deciding if no part of \(M\) appears in \(A\)) is co-NP-hard. As a result, the OMDCI problem does not admit an FPT algorithm unless P=co-NP. In summary, our results imply that using algorithms based on feature matching to identify malware or determine the absence of malware in a given low-level program are both hard.
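For intuition, here is a brute-force sketch of the General-OMDCI search over tiny instances (illustrative only; the hardness results above mean no approach of this shape scales):

```python
from itertools import combinations
from collections import Counter

def max_order_match(M, A):
    """Brute-force General-OMDCI: blocks are (color, critical_char) pairs.

    Returns the largest k for which there are subsequences M' of M and
    A' of A of length k whose colors agree position by position and whose
    critical characters are permutations of each other.  Exponential time;
    usable only on toy inputs.
    """
    for k in range(min(len(M), len(A)), 0, -1):
        for mi in combinations(range(len(M)), k):
            for ai in combinations(range(len(A)), k):
                colors_ok = all(M[i][0] == A[j][0] for i, j in zip(mi, ai))
                chars_ok = (Counter(M[i][1] for i in mi)
                            == Counter(A[j][1] for j in ai))
                if colors_ok and chars_ok:
                    return k
    return 0
```

Note that deciding whether this maximum is $0$ is already co-NP-hard for OMDCI, which is what rules out feature-matching-based certification of malware absence.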

Disjunctive Complexity

from arXiv: Computational Complexity

Authors: Nikita Ivanov, Alexander Rubtsov, Michael Vyalyi

A recently introduced measure of Boolean function complexity--disjunctive complexity (DC)--is compared with other complexity measures: the space complexity of streaming algorithms and the complexity of nondeterministic branching programs (NBP). We show that DC is incomparable with NBP. Specifically, we present a function that has low NBP but has subexponential DC. Conversely, we provide arguments based on computational complexity conjectures to show that DC can superpolynomially exceed NBP in certain cases. Additionally, we prove that the monotone version of NBP complexity is strictly weaker than DC. We prove that the space complexity of one-pass streaming algorithms is strictly weaker than DC. Furthermore, we introduce a generalization of streaming algorithms that captures the full power of DC. This generalization can be expressed in terms of nondeterministic algorithms that irreversibly write 1s to entries of a Boolean vector (i.e., changes from 1 to 0 are not allowed). Finally, we discuss an unusual phenomenon in disjunctive complexity: the existence of uniformly hard functions. These functions exhibit the property that their disjunctive complexity is maximized, and this property extends to all functions dominated by them.

Integer multiplication is at least as hard as matrix transposition

from arXiv: Computational Complexity

Authors: David Harvey, Joris van der Hoeven

Working in the multitape Turing model, we show how to reduce the problem of matrix transposition to the problem of integer multiplication. If transposing an $n \times n$ binary matrix requires $\Omega(n^2 \log n)$ steps on a Turing machine, then our reduction implies that multiplying $n$-bit integers requires $\Omega(n \log n)$ steps. In other words, if matrix transposition is as hard as expected, then integer multiplication is also as hard as expected.

The Stamp Folding Problem From a Mountain-Valley Perspective: Enumerations and Bounds

from arXiv: Computational Geometry

Authors: Thomas C. Hull, Adham Ibrahim, Jacob Paltrowitz, Natalya Ter-Saakov, Grace Wang

A strip of square stamps can be folded in many ways such that all of the stamps are stacked in a single pile in the folded state. The stamp folding problem asks for the number of such foldings and has previously been studied extensively. We consider this problem with the additional restriction of fixing the mountain-valley assignment of each crease in the stamp pattern. We provide a closed form for counting the number of legal foldings on specific patterns of mountain-valley assignments, including a surprising appearance of the Catalan numbers. We construct upper and lower bounds for the number of ways to fold a given mountain-valley assignment on the strip of stamps. Lastly, we provide experimental evidence towards more general results.

Simplification of Trajectory Streams

from arXiv: Computational Geometry

Authors: Siu-Wing Cheng, Haoqiang Huang, Le Jiang

While there are software systems that simplify trajectory streams on the fly, few curve simplification algorithms with quality guarantees fit the streaming requirements. We present streaming algorithms for two such problems under the Fr\'{e}chet distance $d_F$ in $\mathbb{R}^d$ for some constant $d \geq 2$. Consider a polygonal curve $\tau$ in $\mathbb{R}^d$ in a stream. We present a streaming algorithm that, for any $\varepsilon\in (0,1)$ and $\delta > 0$, produces a curve $\sigma$ such that $d_F(\sigma,\tau[v_1,v_i])\le (1+\varepsilon)\delta$ and $|\sigma|\le 2\,\mathrm{opt}-2$, where $\tau[v_1,v_i]$ is the prefix in the stream so far, and $\mathrm{opt} = \min\{|\sigma'|: d_F(\sigma',\tau[v_1,v_i])\le \delta\}$. Let $\alpha = 2(d-1){\lfloor d/2 \rfloor}^2 + d$. The working storage is $O(\varepsilon^{-\alpha})$. Each vertex is processed in $O(\varepsilon^{-\alpha}\log\frac{1}{\varepsilon})$ time for $d \in \{2,3\}$ and $O(\varepsilon^{-\alpha})$ time for $d \geq 4$. Thus, the whole $\tau$ can be simplified in $O(\varepsilon^{-\alpha}|\tau|\log\frac{1}{\varepsilon})$ time. Ignoring polynomial factors in $1/\varepsilon$, this running time is a factor $|\tau|$ faster than the best static algorithm that offers the same guarantees. We present another streaming algorithm that, for any integer $k \geq 2$ and any $\varepsilon \in (0,\frac{1}{17})$, maintains a curve $\sigma$ such that $|\sigma| \leq 2k-2$ and $d_F(\sigma,\tau[v_1,v_i])\le (1+\varepsilon) \cdot \min\{d_F(\sigma',\tau[v_1,v_i]): |\sigma'| \leq k\}$, where $\tau[v_1,v_i]$ is the prefix in the stream so far. The working storage is $O((k\varepsilon^{-1}+\varepsilon^{-(\alpha+1)})\log \frac{1}{\varepsilon})$. Each vertex is processed in $O(k\varepsilon^{-(\alpha+1)}\log^2\frac{1}{\varepsilon})$ time for $d \in \{2,3\}$ and $O(k\varepsilon^{-(\alpha+1)}\log\frac{1}{\varepsilon})$ time for $d \geq 4$.
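As a point of reference for the error measure, the standard discrete Fréchet dynamic program is easy to implement; it is only a proxy for the continuous $d_F$ used in the paper, and this sketch is for experimenting with simplification quality on small curves, not a piece of the streaming algorithms above:

```python
import math
from functools import lru_cache

def discrete_frechet(P, Q):
    """Discrete Fréchet distance between point sequences P and Q.

    Classic O(|P||Q|) dynamic program over the coupling of indices:
    both walkers advance monotonically and we minimize the maximum
    pairwise distance along the walk.
    """
    @lru_cache(maxsize=None)
    def c(i, j):
        if i == 0 and j == 0:
            return math.dist(P[0], Q[0])
        if i == 0:
            return max(c(0, j - 1), math.dist(P[0], Q[j]))
        if j == 0:
            return max(c(i - 1, 0), math.dist(P[i], Q[0]))
        return max(min(c(i - 1, j), c(i - 1, j - 1), c(i, j - 1)),
                   math.dist(P[i], Q[j]))

    return c(len(P) - 1, len(Q) - 1)
```

For two parallel horizontal segments at vertical distance 1, sampled at matching x-coordinates, the distance is exactly 1.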

Matching random colored points with rectangles (Corrigendum)

from arXiv: Computational Geometry

Authors: Josué Corujo, Paul Horn, Pablo Pérez-Lantero

Given $n>0$, let $S\subset [0,1]^2$ be a set of $n$ points, chosen uniformly at random. Let $R\cup B$ be a random partition, or coloring, of $S$ in which each point of $S$ is included in $R$ uniformly at random with probability $1/2$. Corujo et al.~(JOCO 2023) studied the random variable $M(n)$ equal to the number of points of $S$ that are covered by the rectangles of a maximum matching of $S$ using pairwise-disjoint rectangles. Each rectangle is axis-aligned and covers exactly two points of $S$ of the same color. They designed a deterministic algorithm to match points of $S$, and the algorithm was modeled as a discrete stochastic process over a finite set of states. After claiming that the stochastic process is a Markov chain, they proved that almost surely $M(n)\ge 0.83\,n$ for $n$ large enough. The issue is that such a process is not actually a Markov one, as we discuss in this note. We explain this issue and correct it, obtaining the same result by showing that, although the stochastic process is not Markov, it satisfies a first-order homogeneity property that allows us to compute its marginal distributions.

Skeletonization Quality Evaluation: Geometric Metrics for Point Cloud Analysis in Robotics

from arXiv: Computational Geometry

Authors: Qingmeng Wen, Yu-Kun Lai, Ze Ji, Seyed Amir Tafrishi

Skeletonization is a powerful tool for shape analysis, rooted in the inherent instinct to understand an object's morphology. It has found applications across various domains, including robotics. Although skeletonization algorithms have been studied in recent years, their performance is rarely quantified with detailed numerical evaluations. This work focuses on defining and quantifying geometric properties to systematically score the skeletonization results of point cloud shapes across multiple aspects, including topological similarity, boundedness, centeredness, and smoothness. We introduce these representative metric definitions along with a numerical scoring framework to analyze skeletonization outcomes concerning point cloud data for different scenarios, from object manipulation to mobile robot navigation. Additionally, we provide an open-source tool to enable the research community to evaluate and refine their skeleton models. Finally, we assess the performance and sensitivity of the proposed geometric evaluation methods from various robotic applications.

Multi-Pass Streaming Lower Bounds for Approximating Max-Cut

from arXiv: Data Structures and Algorithms

Authors: Yumou Fei, Dor Minzer, Shuo Wang

In the Max-Cut problem in the streaming model, an algorithm is given the edges of an unknown graph $G = (V,E)$ in some fixed order, and its goal is to approximate the size of the largest cut in $G$. Improving upon an earlier result of Kapralov, Khanna and Sudan, it was shown by Kapralov and Krachun that for all $\varepsilon>0$, no $o(n)$ memory streaming algorithm can achieve a $(1/2+\varepsilon)$-approximation for Max-Cut. Their result holds for single-pass streams, i.e.~the setting in which the algorithm only views the stream once, and it was open whether multi-pass access may help. The state-of-the-art result along these lines, due to Assadi and N, rules out arbitrarily good approximation algorithms with constantly many passes and $n^{1-\delta}$ space for any $\delta>0$. We improve upon this state-of-the-art result, showing that any non-trivial approximation algorithm for Max-Cut requires either polynomially many passes or polynomially large space. More specifically, we show that for all $\varepsilon>0$, a $k$-pass streaming $(1/2+\varepsilon)$-approximation algorithm for Max-Cut requires $\Omega_{\varepsilon}\left(n^{1/3}/k\right)$ space. This result leads to a similar lower bound for the Maximum Directed Cut problem, showing the near optimality of the algorithm of [Saxena, Singer, Sudan, Velusamy, SODA 2025]. Our lower bounds proceed by showing a communication complexity lower bound for the Distributional Implicit Hidden Partition (DIHP) Problem, introduced by Kapralov and Krachun. While a naive application of the discrepancy method fails, we identify a property of protocols called ``globalness'', and show that (1) any protocol for DIHP can be turned into a global protocol, (2) the discrepancy of a global protocol must be small. The second step is the more technically involved step in the argument, and therein we use global hypercontractive inequalities.
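For context, the $1/2$-approximation baseline that these lower bounds show is essentially unbeatable in small space is trivial to achieve in a single pass: a uniformly random cut has expected size $|E|/2$, so $m/2 \le \mathrm{maxcut}(G) \le m$, and counting edges suffices. A sketch:

```python
def maxcut_half_m_estimate(edge_stream):
    """One-pass, O(log n)-space estimate of Max-Cut: count edges, report m/2.

    Since m/2 <= maxcut(G) <= m for every graph with m edges, the value m/2
    is always within a factor 1/2 of the true maximum cut -- the trivial
    baseline that, by the lower bounds above, no small-space streaming
    algorithm can beat by any constant.
    """
    m = 0
    for _ in edge_stream:   # consume the stream; only a counter is stored
        m += 1
    return m / 2
```

On a triangle (maximum cut 2), the estimate is $1.5$, within the guaranteed factor.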

Accelerated Approximate Optimization of Multi-Commodity Flows on Directed Graphs

from arXiv: Data Structures and Algorithms

Authors: Li Chen, Andrei Graur, Aaron Sidford

We provide $m^{1+o(1)}k\epsilon^{-1}$-time algorithms for computing multiplicative $(1 - \epsilon)$-approximate solutions to multi-commodity flow problems with $k$-commodities on $m$-edge directed graphs, including concurrent multi-commodity flow and maximum multi-commodity flow. To obtain our results, we provide new optimization tools of potential independent interest. First, we provide an improved optimization method for solving $\ell_{q, p}$-regression problems to high accuracy. This method makes $\tilde{O}_{q, p}(k)$ queries to a high accuracy convex minimization oracle for an individual block, where $\tilde{O}_{q, p}(\cdot)$ hides factors depending only on $q$, $p$, or $\mathrm{poly}(\log m)$, improving upon the $\tilde{O}_{q, p}(k^2)$ bound of [Chen-Ye, ICALP 2024]. As a result, we obtain the first almost-linear time algorithm that solves $\ell_{q, p}$ flows on directed graphs to high accuracy. Second, we present optimization tools to reduce approximately solving composite $\ell_{1, \infty}$-regression problems to solving $m^{o(1)}\epsilon^{-1}$ instances of composite $\ell_{q, p}$-regression problem. The method builds upon recent advances in solving box-simplex games [Jambulapati-Tian, NeurIPS 2023] and the area convex regularizer introduced in [Sherman, STOC 2017] to obtain faster rates for constrained versions of the problem. Carefully combining these techniques yields our directed multi-commodity flow algorithm.

On Speedups for Convex Optimization via Quantum Dynamics

from arXiv: Data Structures and Algorithms

Authors: Shouvanik Chakrabarti, Dylan Herman, Jacob Watkins, Enrico Fontana, Brandon Augustino, Junhyung Lyle Kim, Marco Pistoia

We explore the potential for quantum speedups in convex optimization using discrete simulations of the Quantum Hamiltonian Descent (QHD) framework, as proposed by Leng et al., and establish the first rigorous query complexity bounds. We develop enhanced analyses for quantum simulation of Schrödinger operators with black-box potential via the pseudo-spectral method, providing explicit resource estimates independent of wavefunction assumptions. These bounds are applied to assess the complexity of optimization through QHD. Our findings pertain to unconstrained convex optimization in $d$ dimensions. In continuous time, we demonstrate that QHD, with suitable parameters, can achieve arbitrarily fast convergence rates. The optimization speed limit arises solely from the discretization of the dynamics, mirroring a property of the classical dynamics underlying QHD. Considering this cost, we show that a $G$-Lipschitz convex function can be optimized to an error of $\epsilon$ with $\widetilde{\mathcal{O}}(d^{1.5}G^2 R^2/\epsilon^2)$ queries. Moreover, under reasonable assumptions on the complexity of Hamiltonian simulation, $\widetilde{\Omega}(d/\epsilon^2)$ queries are necessary. Thus, QHD does not offer a speedup over classical zeroth order methods with exact oracles. However, we demonstrate that the QHD algorithm tolerates $\widetilde{\mathcal{O}}(\epsilon^3/d^{1.5}G^2 R^2)$ noise in function evaluation. We show that QHD offers a super-quadratic query advantage over all known classical algorithms tolerating this level of evaluation noise in the high-dimension regime. Additionally, we design a quantum algorithm for stochastic convex optimization that provides a super-quadratic speedup over all known classical algorithms in the high-dimension regime. To our knowledge, these results represent the first rigorous quantum speedups for convex optimization achieved through a dynamical algorithm.

Sample-Optimal Private Regression in Polynomial Time

from arXiv: Data Structures and Algorithms

Authors: Prashanti Anderson, Ainesh Bakshi, Mahbod Majid, Stefan Tiegel

We consider the task of privately obtaining prediction error guarantees in ordinary least-squares regression problems with Gaussian covariates (with unknown covariance structure). We provide the first sample-optimal polynomial time algorithm for this task under both pure and approximate differential privacy. We show that any improvement to the sample complexity of our algorithm would violate either statistical-query or information-theoretic lower bounds. Additionally, our algorithm is robust to a small fraction of arbitrary outliers and achieves optimal error rates as a function of the fraction of outliers. In contrast, all prior efficient algorithms either incurred sample complexities with sub-optimal dimension dependence, scaling with the condition number of the covariates, or obtained a polynomially worse dependence on the privacy parameters. Our technical contributions are two-fold: first, we leverage resilience guarantees of Gaussians within the sum-of-squares framework. As a consequence, we obtain efficient sum-of-squares algorithms for regression with optimal robustness rates and sample complexity. Second, we generalize the recent robustness-to-privacy framework [HKMN23, (arXiv:2212.05015)] to account for the geometry induced by the covariance of the input samples. This framework crucially relies on the robust estimators to be sum-of-squares algorithms, and combining the two steps yields a sample-optimal private regression algorithm. We believe our techniques are of independent interest, and we demonstrate this by obtaining an efficient algorithm for covariance-aware mean estimation, with an optimal dependence on the privacy parameters.

Word Break on SLP-Compressed Texts

from arXiv: Data Structures and Algorithms

Authors: Rajat De, Dominik Kempa

Word Break is a prototypical factorization problem in string processing: Given a word $w$ of length $N$ and a dictionary $\mathcal{D} = \{d_1, d_2, \ldots, d_{K}\}$ of $K$ strings, determine whether we can partition $w$ into words from $\mathcal{D}$. We propose the first algorithm that solves the Word Break problem over the SLP-compressed input text $w$. Specifically, we show that, given the string $w$ represented using an SLP of size $g$, we can solve the Word Break problem in $\mathcal{O}(g \cdot m^{\omega} + M)$ time, where $m = \max_{i=1}^{K} |d_i|$, $M = \sum_{i=1}^{K} |d_i|$, and $\omega \geq 2$ is the matrix multiplication exponent. We obtain our algorithm as a simple corollary of a more general result: We show that in $\mathcal{O}(g \cdot m^{\omega} + M)$ time, we can index the input text $w$ so that solving the Word Break problem for any of its substrings takes $\mathcal{O}(m^2 \log N)$ time (independent of the substring length). Our second contribution is a lower bound: We prove that, unless the Combinatorial $k$-Clique Conjecture fails, there is no combinatorial algorithm for Word Break on SLP-compressed strings running in $\mathcal{O}(g \cdot m^{2-\epsilon} + M)$ time for any $\epsilon > 0$.
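As background for the compressed setting, the classical Word Break problem on an uncompressed string admits a simple prefix dynamic program; the sketch below is the standard textbook approach (not the SLP-based algorithm of the paper) and makes roughly $N \cdot m$ dictionary lookups for a text of length $N$ and longest dictionary word of length $m$:

```python
def word_break(w, dictionary):
    """Prefix DP: ok[i] is True iff w[:i] can be partitioned into dictionary words."""
    words = set(dictionary)
    m = max(map(len, words), default=0)  # longest dictionary word
    N = len(w)
    ok = [False] * (N + 1)
    ok[0] = True  # the empty prefix is trivially partitionable
    for i in range(1, N + 1):
        # only the last m characters of w[:i] can form the final word
        for j in range(max(0, i - m), i):
            if ok[j] and w[j:i] in words:
                ok[i] = True
                break
    return ok[N]
```

For example, `word_break("applepie", ["apple", "pie"])` succeeds, while replacing `"pie"` with `"pi"` leaves the final `"e"` uncovered. The compressed algorithm of the paper must avoid touching all $N$ positions, since $N$ can be exponential in the SLP size $g$.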

Space of Data through the Lens of Multilevel Graph

from arXiv: Data Structures and Algorithms

Authors: Marco Caputo, Michele Russo, Emanuela Merelli

This work seeks to tackle the inherent complexity of dataspaces by introducing a novel data structure that can represent datasets across multiple levels of abstraction, ranging from local to global. We propose the concept of a multilevel graph, which is equipped with two fundamental operations: contraction and expansion of its topology. This multilevel graph is specifically designed to fulfil the requirements for incremental abstraction and flexibility, as outlined in existing definitions of dataspaces. Furthermore, we provide a comprehensive suite of methods for manipulating this graph structure, establishing a robust framework for data analysis. While its effectiveness has been empirically validated for unstructured data, its application to structured data is also inherently viable. Preliminary results are presented through a real-world scenario based on a collection of dream reports.

Network Unreliability in Almost-Linear Time

from arXiv: Data Structures and Algorithms

Authors: Ruoxu Cen, Jason Li, Debmalya Panigrahi

The network unreliability problem asks for the probability that a given undirected graph gets disconnected when every edge independently fails with a given probability $p$. Valiant (1979) showed that this problem is #P-hard; therefore, the best we can hope for are approximation algorithms. In a classic result, Karger (1995) obtained the first FPTAS for this problem by leveraging the fact that when a graph disconnects, it almost always does so at a near-minimum cut, and there are only a small (polynomial) number of near-minimum cuts. Since then, a series of results have obtained progressively faster algorithms to the current bound of $m^{1+o(1)} + \tilde{O}(n^{3/2})$ (Cen, He, Li, and Panigrahi, 2024). In this paper, we obtain an $m^{1+o(1)}$-time algorithm for the network unreliability problem. This is essentially optimal, since we need $O(m)$ time to read the input graph. Our main new ingredient is relating network unreliability to an ideal tree packing of spanning trees (Thorup, 2001).
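To make the quantity concrete, the unreliability of a graph can always be estimated by naive Monte Carlo: sample edge failures and check connectivity with union-find. This baseline (not the paper's algorithm) needs on the order of $1/(\epsilon^2 u)$ trials to $(1\pm\epsilon)$-approximate an unreliability of $u$, which is exactly what makes small-$u$ instances hard and motivates the cited line of work:

```python
import random

def unreliability_mc(n, edges, p, trials=2000, seed=0):
    """Monte Carlo estimate of Pr[G disconnects] when each edge fails
    independently with probability p. Connectivity via union-find."""
    rng = random.Random(seed)
    disconnected = 0
    for _ in range(trials):
        parent = list(range(n))

        def find(x):
            while parent[x] != x:
                parent[x] = parent[parent[x]]  # path halving
                x = parent[x]
            return x

        comps = n
        for u, v in edges:
            if rng.random() >= p:  # edge survives this trial
                ru, rv = find(u), find(v)
                if ru != rv:
                    parent[ru] = rv
                    comps -= 1
        if comps > 1:
            disconnected += 1
    return disconnected / trials
```

On a triangle with $p = 0$ the estimate is $0$, and with $p = 1$ it is $1$; for intermediate $p$ the variance of this estimator is what the FPTAS line of work, culminating in the $m^{1+o(1)}$ bound above, circumvents.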

Generalized Capacity Planning for the Hospital-Residents Problem

from arXiv: Data Structures and Algorithms

Authors: Haricharan Balasundaram, Girija Limaye, Meghana Nasre, Abhinav Raja

The Hospital Residents setting models important problems like school choice and assignment of undergraduate students to degree programs, among many others. In this setting, fixed quotas are associated with the programs that limit the number of agents that can be assigned to them. Motivated by scenarios where all agents must be matched, we propose and study a generalized capacity planning problem, which allows cost-controlled flexibility with respect to quotas. Our setting is an extension of the Hospital Residents setting where programs have the usual quota as well as an associated cost, indicating the cost of matching an agent beyond the initial quotas. We seek to compute a matching that matches all agents, is optimal with respect to preferences, and minimizes either a local or a global objective on cost. We show that there is a sharp contrast -- minimizing the local objective is polynomial-time solvable, whereas minimizing the global objective is NP-hard. On the positive side, we present approximation algorithms for the global objective in the general case and a particular hard case. We achieve the approximation guarantee for the special hard case via a linear programming based algorithm. We strengthen the NP-hardness result by showing a lower bound matching our algorithmic guarantee.
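For readers unfamiliar with the base model, the underlying Hospital Residents setting (with fixed quotas and no cost extension) is solved by resident-proposing deferred acceptance. A minimal sketch follows; the data layout and names are illustrative, not from the paper:

```python
def hospital_residents(res_pref, hosp_pref, quota):
    """Resident-proposing deferred acceptance with fixed quotas.

    res_pref:  resident -> ordered list of acceptable hospitals
    hosp_pref: hospital -> ordered list of acceptable residents
    quota:     hospital -> capacity
    Returns a stable matching as a dict resident -> hospital.
    """
    rank = {h: {r: i for i, r in enumerate(prefs)} for h, prefs in hosp_pref.items()}
    free = list(res_pref)                  # residents still proposing
    nxt = {r: 0 for r in res_pref}         # next hospital on r's list
    held = {h: [] for h in hosp_pref}      # tentatively accepted residents
    match = {}
    while free:
        r = free.pop()
        if nxt[r] >= len(res_pref[r]):
            continue                       # r exhausted their list; unmatched
        h = res_pref[r][nxt[r]]
        nxt[r] += 1
        if r not in rank[h]:
            free.append(r)                 # h finds r unacceptable
            continue
        held[h].append(r)
        match[r] = h
        if len(held[h]) > quota[h]:        # over capacity: reject worst resident
            worst = max(held[h], key=lambda x: rank[h][x])
            held[h].remove(worst)
            del match[worst]
            free.append(worst)
    return match
```

With two residents both preferring one hospital of quota 1, the hospital keeps its higher-ranked resident and the other stays unmatched; the paper's cost model is precisely a remedy for such unmatched agents.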

Improved algorithms for single machine serial-batch scheduling to minimize makespan and maximum cost

from arXiv: Data Structures and Algorithms

Authors: Shuguang Li, Zhenxin Wen, Jing Wei

This paper studies the bicriteria problem of scheduling $n$ jobs on a serial-batch machine to minimize makespan and maximum cost simultaneously. A serial-batch machine can process up to $b$ jobs as a batch, where $b$ is known as the batch capacity. When a new batch starts, a constant setup time is required for the machine. Within each batch, the jobs are processed sequentially, and thus the processing time of a batch equals the sum of the processing times of its jobs. All the jobs in a batch have the same completion time, namely, the completion time of the batch. The main result is an $O(n^3)$-time algorithm which can generate all Pareto optimal points for the bounded model ($b

Wagner's Algorithm Provably Runs in Subexponential Time for SIS$^\infty$

from arXiv: Data Structures and Algorithms

Authors: Léo Ducas, Lynn Engelberts, Johanna Loyer

At CRYPTO 2015, Kirchner and Fouque claimed that a carefully tuned variant of the Blum-Kalai-Wasserman (BKW) algorithm (JACM 2003) should solve the Learning with Errors problem (LWE) in slightly subexponential time for modulus $q=\mathrm{poly}(n)$ and narrow error distribution, when given enough LWE samples. Taking a modular view, one may regard BKW as a combination of Wagner's algorithm (CRYPTO 2002), run over the corresponding dual problem, and the Aharonov-Regev distinguisher (JACM 2005). Hence the subexponential Wagner step alone should be of interest for solving this dual problem - namely, the Short Integer Solution problem (SIS) - but this appears to be undocumented so far. We re-interpret this Wagner step as walking backward through a chain of projected lattices, zigzagging through some auxiliary superlattices. We further randomize the bucketing step using Gaussian randomized rounding to exploit the powerful discrete Gaussian machinery. This approach avoids sample amplification and turns Wagner's algorithm into an approximate discrete Gaussian sampler for $q$-ary lattices. For an SIS lattice with $n$ equations modulo $q$, this algorithm runs in subexponential time $\exp(O(n/\log \log n))$ to reach a Gaussian width parameter $s = q/\mathrm{polylog}(n)$ only requiring $m = n + \omega(n/\log \log n)$ many SIS variables. This directly provides a provable algorithm for solving the Short Integer Solution problem in the infinity norm ($\mathrm{SIS}^\infty$) for norm bounds $\beta = q/\mathrm{polylog}(n)$. This variant of SIS underlies the security of the NIST post-quantum cryptography standard Dilithium. Despite its subexponential complexity, Wagner's algorithm does not appear to threaten Dilithium's concrete security.

Length-Constrained Directed Expander Decomposition and Length-Constrained Vertex-Capacitated Flow Shortcuts

from arXiv: Data Structures and Algorithms

Authors: Bernhard Haeupler, Yaowei Long, Thatchaphol Saranurak, Shengzhe Wang

We show the existence of length-constrained expander decomposition in directed graphs and undirected vertex-capacitated graphs. Previously, its existence was shown only in undirected edge-capacitated graphs [Haeupler-Räcke-Ghaffari, STOC 2022; Haeupler-Hershkowitz-Tan, FOCS 2024]. Along the way, we prove the multi-commodity maxflow-mincut theorems for length-constrained expansion in both directed and undirected vertex-capacitated graphs. Based on our decomposition, we build a length-constrained flow shortcut for undirected vertex-capacitated graphs, which roughly speaking is a set of edges and vertices added to the graph so that every multi-commodity flow demand can be routed with approximately the same vertex-congestion and length, but all flow paths only contain few edges. This generalizes the shortcut for undirected edge-capacitated graphs from [Haeupler-Hershkowitz-Li-Roeyskoe-Saranurak, STOC 2024]. Length-constrained expander decomposition and flow shortcuts have been crucial in the recent algorithms in undirected edge-capacitated graphs [Haeupler-Hershkowitz-Li-Roeyskoe-Saranurak, STOC 2024; Haeupler-Long-Saranurak, FOCS 2024]. Our work thus serves as a foundation to generalize these concepts to directed and vertex-capacitated graphs.

Monday, March 31

Linkage

from David Eppstein

Matthias Merzenich finds a true-period-14 glider gun in Conway’s Game of Life (\(\mathbb{M}\)).

By David Eppstein

TR25-037 | Uniform Bounds on Product Sylvester-Gallai Configurations | Abhibhav Garg, Rafael Mendes de Oliveira, Akash Sengupta

from ECCC Papers

In this work, we explore a non-linear extension of the classical Sylvester-Gallai configuration. Let $\mathbb{K}$ be an algebraically closed field of characteristic zero, and let $\mathcal{F} = \{F_1, \ldots, F_m\} \subset \mathbb{K}[x_1, \ldots, x_N]$ denote a collection of irreducible homogeneous polynomials of degree at most $d$, where each $F_i$ is not a scalar multiple of any other $F_j$ for $i \neq j$. We define $\mathcal{F}$ to be a product Sylvester-Gallai configuration if, for any two distinct polynomials $F_i, F_j \in \mathcal{F}$, the following condition is satisfied: $$\prod_{k \neq i, j} F_k \in \textrm{rad} \left( F_i, F_j \right).$$ We prove that product Sylvester-Gallai configurations are inherently low dimensional. Specifically, we show that there exists a function $\lambda : \mathbb{N} \rightarrow \mathbb{N}$, independent of $\mathbb{K}$, $N$, and $m$, such that any product Sylvester-Gallai configuration must satisfy: $$ \dim (\text{span}_{\mathbb{K}}(\mathcal{F})) \leq \lambda(d). $$ This result generalizes the main theorems from [S19,PS20a, OS23], and gets us one step closer to a full derandomization of the polynomial identity testing problem for the class of depth 4 circuits with bounded top and bottom fan-in.

Tragedy in one shitty act

from Scott Aaronson

Far-Left Students and Faculty: We’d sooner burn universities to the ground than allow them to remain safe for the hated Zionist Jews, the baby-killing demons of the earth. We’ll disrupt their classes, bar them from student activities, smash their Hillel centers, take over campus buildings and quads, and chant for Hezbollah and the Al-Aqsa Martyrs Brigades to eradicate them like vermin. We’ll do all this because we’ve so thoroughly learned the lessons of the Holocaust.

Trump Administration [cackling]: Burn universities to the ground, you say? What a coincidence! We’d love nothing more than to do exactly that. Happy to oblige you.

Far-Left Students and Faculty: You fascist scum. We didn’t mean “call our bluff”! Was it the campus Zionists who ratted us out to you? It was, wasn’t it? You can’t do this without due process; we have rights!

Trump Administration: We don’t answer to you and we don’t care about “due process” or your supposed “rights.” We’re cutting all your funding, effective immediately. Actually, since you leftists don’t have much funding to speak of, let’s just cut any university funding whatsoever that we can reach. Cancer studies. Overhead on NIH grants. Student aid. Fellowships. Whatever universities use to keep the lights on. The more essential it is, the longer it took to build, the more we’ll enjoy the elitist professors’ screams of anguish as we destroy it all in a matter of weeks.

Far-Left Students and Faculty: This is the end, then. But if our whole little world must go up in flames, at least we’ll die having never compromised our most fundamental moral principle: the eradication of the State of Israel and the death of its inhabitants.

Sane Majorities at Universities, Including Almost Everyone in STEM: [don’t get a speaking part in this play. They’ve already bled out on the street, killed in the crossfire]

By Scott