Thank you to arXiv for use of its open access interoperability.
Note: the date of arXiv entries announced right after publication holidays might incorrectly show up as the date of the publication holiday itself. This is due to our ad hoc method of inferring announcement dates, which are not returned by the arXiv API.
We consider the computational complexity of computing Bayes-Nash equilibria
in first-price auctions, where the bidders' values for the item are drawn from
a general (possibly correlated) joint distribution. We show that when the
values and the bidding space are discrete, determining the existence of a pure
Bayes-Nash equilibrium is NP-hard. This is the first hardness result in the
literature on this problem that does not rely on subjective priors or on
convoluted tie-breaking rules. We then present two main
approaches for achieving positive results, via bid sparsification and via bid
densification. The former is more combinatorial and is based on enumeration
techniques, whereas the latter makes use of the continuous theory of the
problem developed in the economics literature. Using these approaches, we
develop polynomial-time approximation algorithms for computing equilibria in
symmetric settings or settings with a fixed number of bidders, for different
(discrete or continuous) variants of the auction.
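To make the discrete setting concrete, below is a minimal sketch, assuming an illustrative two-bidder instance with a correlated prior, discrete values and bids, and uniform tie-breaking, of a brute-force check for a pure Bayes-Nash equilibrium. It is not the paper's reduction or algorithm, and the enumeration is exponential in the instance size.

```python
# A minimal sketch (not the paper's construction): brute-force search for a pure
# Bayes-Nash equilibrium in a tiny discrete first-price auction with a correlated
# prior and uniform tie-breaking. All parameters below are illustrative.

import itertools

values = [1, 2]                    # discrete value space (shared, for simplicity)
bids = [0, 1, 2]                   # discrete bid space
# A correlated joint prior over (v1, v2), assumed for illustration.
prior = {(1, 1): 0.25, (1, 2): 0.25, (2, 1): 0.25, (2, 2): 0.25}

def utility(my_bid, other_bid, my_value):
    if my_bid > other_bid:
        return my_value - my_bid
    if my_bid == other_bid:
        return 0.5 * (my_value - my_bid)      # uniform tie-breaking
    return 0.0

def interim_utility(i, my_value, my_bid, other_strategy):
    """Expected utility of bidder i bidding my_bid, conditional on holding my_value."""
    total, mass = 0.0, 0.0
    for (v1, v2), p in prior.items():
        vs = (v1, v2)
        if vs[i] != my_value:
            continue
        other_bid = other_strategy[vs[1 - i]]
        total += p * utility(my_bid, other_bid, my_value)
        mass += p
    return total / mass if mass > 0 else 0.0

def is_pure_bne(strategies):
    """strategies[i] maps a value to a bid; check interim best responses."""
    for i in (0, 1):
        for v in values:
            current = interim_utility(i, v, strategies[i][v], strategies[1 - i])
            best = max(interim_utility(i, v, b, strategies[1 - i]) for b in bids)
            if current + 1e-12 < best:
                return False
    return True

# Enumerate all pure strategy profiles (a bid for every value, for both bidders).
found = None
for s1 in itertools.product(bids, repeat=len(values)):
    for s2 in itertools.product(bids, repeat=len(values)):
        profile = (dict(zip(values, s1)), dict(zip(values, s2)))
        if is_pure_bne(profile):
            found = profile
            break
    if found:
        break
# Prints the first pure BNE found; with coarse bid grids, very low-bid equilibria can exist.
print(found)
```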
Authors: Alperen A. Ergür, Josué Tonelli-Cueto, Elias Tsigaridas
We introduce beyond-worst-case analysis into symbolic computation. This is an
extensive field which almost entirely relies on worst-case bit complexity, and
we start from a basic problem in the field: isolating the real roots of
univariate polynomials. This is a fundamental problem in symbolic computation
and it is arguably one of the most basic problems in computational mathematics.
The problem has a long history decorated with numerous ingenious algorithms and
furnishes an active area of research. However, most available results in the
literature either focus on worst-case analysis in the bit complexity model or
simply provide experimental benchmarking without any theoretical justifications
of the observed results. We aim to address the discrepancy between practical
performance of root isolation algorithms and prescriptions of worst-case
complexity theory: We develop a smoothed analysis framework for polynomials
with integer coefficients to bridge this gap. We demonstrate (quasi-)linear
(expected and smoothed) complexity bounds for the Descartes algorithm, one of
the most well-known symbolic algorithms for isolating the real roots of
univariate polynomials with integer coefficients. Our results explain the
surprising efficiency of the Descartes solver in comparison to sophisticated
algorithms that have superior worst-case complexity. We also analyse the Sturm
solver; ANewDsc, a symbolic-numeric algorithm that combines Descartes with the
Newton operator; and a symbolic algorithm for sparse polynomials.
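To fix ideas about the solver being analyzed, here is a minimal sketch of Descartes-style real-root isolation for a square-free integer polynomial: a sign-variation test on a Möbius-transformed polynomial drives interval bisection. This is a textbook variant, not the paper's implementation or its smoothed analysis; the initial interval (assumed to contain all real roots, e.g., via a Cauchy bound) and the example polynomial are illustrative.

```python
# A minimal sketch of Descartes-style real-root isolation for a square-free
# polynomial with integer coefficients (low-to-high order). Not the paper's code.

from fractions import Fraction

def eval_poly(coeffs, x):
    """Evaluate c0 + c1*x + ... via Horner's rule."""
    acc = 0
    for c in reversed(coeffs):
        acc = acc * x + c
    return acc

def variations_on_interval(coeffs, a, b):
    """Descartes bound on the number of roots in the open interval (a, b):
    sign variations of (1+x)^n * p((a + b*x)/(1 + x))."""
    n = len(coeffs) - 1
    def poly_mul(p, q):
        r = [0] * (len(p) + len(q) - 1)
        for i, pi in enumerate(p):
            for j, qj in enumerate(q):
                r[i + j] += pi * qj
        return r
    def poly_pow(p, k):
        r = [1]
        for _ in range(k):
            r = poly_mul(r, p)
        return r
    q = [0] * (n + 1)
    for i, c in enumerate(coeffs):
        term = poly_mul(poly_pow([a, b], i), poly_pow([1, 1], n - i))
        for j, t in enumerate(term):
            q[j] += c * t
    signs = [c for c in q if c != 0]
    return sum(1 for s, t in zip(signs, signs[1:]) if s * t < 0)

def isolate_roots(coeffs, a=Fraction(-1024), b=Fraction(1024)):
    """Disjoint intervals (or points) each containing exactly one real root in (a, b).
    Assumes the polynomial is square-free and all real roots lie in (a, b)."""
    v = variations_on_interval(coeffs, a, b)
    if v == 0:
        return []
    if v == 1:
        return [(a, b)]
    m = (a + b) / 2
    out = isolate_roots(coeffs, a, m) + isolate_roots(coeffs, m, b)
    if eval_poly(coeffs, m) == 0:          # the bisection midpoint is itself a root
        out.append((m, m))
    return out

# Example: p(x) = x^3 - 2x has roots -sqrt(2), 0, sqrt(2).
print(isolate_roots([0, -2, 0, 1]))        # three intervals/points, one per real root
```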
This paper proposes a fast and unsupervised scheme for a polygonal
approximation of a closed digital curve. It is demonstrated that the scheme is
faster than state-of-the-art approximation schemes and is competitive with them
in Rosin's measure and in aesthetic quality. The scheme comprises three phases:
initial segmentation, iterative vertex insertion, and iterative merging,
followed by vertex adjustment. The initial segmentation detects sharp turnings,
i.e., vertices that seemingly have high curvature. Since some important
vertices with low curvature may be missed in the first phase, iterative vertex
insertion adds vertices in regions where the curvature changes slowly but
steadily. The initial phase may also pick up undesirable vertices, so iterative
merging eliminates the redundant ones. Finally, vertex adjustment enhances the
aesthetic look of the
approximation. The quality of the approximations is measured using Rosin's
measure. The robustness of the proposed scheme with respect to geometric
transformation is observed.
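The vertex-insertion idea is easiest to see in code. Below is a generic split-phase sketch in the style of Ramer-Douglas-Peucker, not the authors' three-phase scheme: a segment is recursively split at the point of maximum deviation until every curve point lies within a tolerance. The tolerance and the sampled curve are illustrative assumptions.

```python
# A generic split-phase sketch (Ramer-Douglas-Peucker style), not the paper's scheme:
# recursively insert the vertex of maximum deviation until every curve point is
# within `tol` of the approximating polygon.

import math

def point_segment_distance(p, a, b):
    """Distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def split(points, tol):
    """Indices of polygon vertices approximating the (open) curve `points`."""
    if len(points) < 3:
        return list(range(len(points)))
    dists = [point_segment_distance(p, points[0], points[-1]) for p in points[1:-1]]
    k = max(range(len(dists)), key=dists.__getitem__) + 1
    if dists[k - 1] <= tol:
        return [0, len(points) - 1]
    left = split(points[:k + 1], tol)
    right = split(points[k:], tol)
    return left[:-1] + [i + k for i in right]

# Illustrative input: a circle sampled at 31 points (a closed curve opened at index 0).
curve = [(math.cos(t / 30 * 2 * math.pi), math.sin(t / 30 * 2 * math.pi))
         for t in range(31)]
print(split(curve, tol=0.05))       # indices of the retained vertices
```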
Motivated by practical applications in the design of optimization compilers
for neural networks, we initiate the study of identity testing problems for
arithmetic circuits augmented with \emph{exponentiation gates} that compute the
real function $x\mapsto e^x$. These circuits compute real functions of the form
$P(\vec x)/P'(\vec x)$, where both $P(\vec x)$ and $P'(\vec x)$ are exponential
polynomials
\[
\sum_{i=1}^k f_i(\vec x)\cdot \exp\left(\frac{g_i(\vec x)}{h_i(\vec x)}\right),
\]
for polynomials $f_i(\vec x),g_i(\vec x)$, and $h_i(\vec x)$.
We formalize a black-box query model over finite fields for this class of
circuits, which is mathematically simple and reflects constraints faced by
real-world neural network compilers. We prove that a simple and efficient
randomized identity testing algorithm achieves perfect completeness and
non-trivial soundness. Concurrently with our work, the algorithm has been
implemented in the optimization compiler Mirage by Wu et al.~(OSDI 2025),
demonstrating promising empirical performance in both efficiency and soundness
error. Finally, we propose a number-theoretic conjecture under which our
algorithm is sound with high probability.
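For intuition about the objects involved, here is a naive numerical sanity check, emphatically not the paper's finite-field black-box algorithm: an exponential polynomial is represented as a list of $(f_i, g_i, h_i)$ coefficient triples, and two candidate expressions are compared at random real points up to a tolerance. The sample range, tolerance, and examples are illustrative assumptions.

```python
# A naive numerical check for intuition only; the paper's algorithm works in a
# black-box query model over finite fields, which this sketch does not implement.

import math
import random

def eval_poly(coeffs, x):
    """Evaluate c0 + c1*x + ... by Horner's rule."""
    acc = 0.0
    for c in reversed(coeffs):
        acc = acc * x + c
    return acc

def eval_exp_poly(terms, x):
    """Evaluate sum_i f_i(x) * exp(g_i(x) / h_i(x)); each term is (f, g, h)
    given as low-to-high coefficient lists."""
    total = 0.0
    for f, g, h in terms:
        total += eval_poly(f, x) * math.exp(eval_poly(g, x) / eval_poly(h, x))
    return total

def probably_identical(lhs, rhs, trials=20, tol=1e-8, seed=0):
    """Compare two exponential polynomials at random points, up to relative tolerance."""
    rng = random.Random(seed)
    for _ in range(trials):
        x = rng.uniform(0.1, 2.0)          # avoid poles of h_i in this illustration
        a, b = eval_exp_poly(lhs, x), eval_exp_poly(rhs, x)
        if abs(a - b) > tol * max(1.0, abs(a), abs(b)):
            return False
    return True

# exp(x) + exp(x) versus 2 * exp(x): identical.
lhs = [([1.0], [0.0, 1.0], [1.0]), ([1.0], [0.0, 1.0], [1.0])]
rhs = [([2.0], [0.0, 1.0], [1.0])]
print(probably_identical(lhs, rhs))                              # True
# exp(x) + exp(x) versus exp(2x): not identical.
print(probably_identical(lhs, [([1.0], [0.0, 2.0], [1.0])]))     # False
```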
Authors: Thomas Depian, Simon D. Fink, Robert Ganian, Martin Nöllenburg
We consider the problem of computing $\ell$-page queue layouts, which are
linear arrangements of vertices accompanied by an assignment of the edges to
pages from one to $\ell$ that avoid the nesting of edges on any of the pages.
Inspired by previous work in the extension of stack layouts, here we consider
the setting of extending a partial $\ell$-page queue layout into a complete one
and primarily analyze the problem through the refined lens of parameterized
complexity. We obtain novel algorithms and lower bounds which provide a
detailed picture of the problem's complexity under various measures of
incompleteness, and identify surprising distinctions between queue and stack
layouts in the extension setting.
Minimizers are sampling schemes with numerous applications in computational
biology. Assuming a fixed alphabet of size $\sigma$, a minimizer is defined by
two integers $k,w\ge2$ and a linear order $\rho$ on strings of length $k$ (also
called $k$-mers). A string is processed by a sliding window algorithm that
chooses, in each window of length $w+k-1$, its minimal $k$-mer with respect to
$\rho$. A key characteristic of the minimizer is its density, which is the
expected frequency of chosen $k$-mers among all $k$-mers in a random infinite
$\sigma$-ary string. Minimizers of smaller density are preferred as they
produce smaller samples with the same guarantee: each window is represented by
a $k$-mer.
The problem of finding a minimizer of minimum density for given input
parameters $(\sigma,k,w)$ has a huge search space of $(\sigma^k)!$ and is
representable by an ILP of size $\tilde\Theta(\sigma^{k+w})$, which has
worst-case solution time that is doubly-exponential in $(k+w)$ under standard
complexity assumptions. We solve this problem in $w\cdot 2^{\sigma^k+O(k)}$
time and provide several additional tricks reducing the practical runtime and
search space. As a by-product, we describe an algorithm computing the average
density of a minimizer within the same time bound. Then we propose a novel
method of studying minimizers via regular languages and show how to find, via
the eigenvalue/eigenvector analysis over finite automata, minimizers with the
minimal density in the asymptotic case $w\to\infty$. Implementing our
algorithms, we compute the minimum density minimizers for
$(\sigma,k)\in\{(2,2),(2,3),(2,4),(2,5),(4,2)\}$ and \textbf{all} $w\ge 2$. The
obtained densities are compared against the average density and the theoretical
lower bounds, including the new bound presented in this paper.
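The density statistic is easy to estimate empirically. Below is a minimal sketch, not the paper's exact algorithm, that draws a random linear order on $k$-mers, runs the sliding-window selection on a long random string, and reports the fraction of chosen positions; the parameters are illustrative.

```python
# A minimal sketch: empirically estimate the density of a minimizer scheme with a
# random linear order rho on k-mers. Not the paper's exact or asymptotic methods.

import random

def minimizer_positions(s, k, w, rank):
    """Positions of selected k-mers over all windows of w consecutive k-mers."""
    chosen = set()
    for start in range(len(s) - (w + k - 1) + 1):
        window = [(rank[s[i:i + k]], i) for i in range(start, start + w)]
        chosen.add(min(window)[1])          # leftmost minimal k-mer breaks ties
    return chosen

def estimate_density(sigma=4, k=5, w=4, n=200_000, seed=0):
    rng = random.Random(seed)
    alphabet = "ACGT"[:sigma]
    kmers = [""]
    for _ in range(k):                      # enumerate all sigma^k k-mers
        kmers = [p + a for p in kmers for a in alphabet]
    order = kmers[:]
    rng.shuffle(order)                      # a random linear order rho
    rank = {km: r for r, km in enumerate(order)}
    s = "".join(rng.choice(alphabet) for _ in range(n))
    selected = minimizer_positions(s, k, w, rank)
    return len(selected) / (n - k + 1)      # fraction of k-mers that are chosen

# Roughly 2/(w+1) for a random order when k is not too small relative to w.
print(estimate_density())
```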
Shapley values have emerged as a critical tool for explaining which features
impact the decisions made by machine learning models. However, computing exact
Shapley values is difficult, generally requiring an exponential (in the feature
dimension) number of model evaluations. To address this, many model-agnostic
randomized estimators have been developed, the most influential and widely used
being the KernelSHAP method (Lundberg & Lee, 2017). While related estimators
such as unbiased KernelSHAP (Covert & Lee, 2021) and LeverageSHAP (Musco &
Witter, 2025) are known to satisfy theoretical guarantees, bounds for
KernelSHAP have remained elusive. We describe a broad and unified framework
that encompasses KernelSHAP and related estimators constructed using both with
and without replacement sampling strategies. We then prove strong
non-asymptotic theoretical guarantees that apply to all estimators from our
framework. This provides, to the best of our knowledge, the first theoretical
guarantees for KernelSHAP and sheds further light on tradeoffs between existing
estimators. Through comprehensive benchmarking on small and medium dimensional
datasets for Decision-Tree models, we validate our approach against exact
Shapley values, consistently achieving low mean squared error with modest
sample sizes. Furthermore, we make specific implementation improvements to
enable scalability of our methods to high-dimensional datasets. Our methods,
tested on datasets such as MNIST and CIFAR10, provide consistently better results
compared to the KernelSHAP library.
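Since KernelSHAP is defined by a particular coalition-weighting scheme, a small from-scratch illustration helps fix ideas. The following sketch, not the paper's estimator and not the shap library, recovers the Shapley values of a toy three-player game by enumerating all coalitions and solving the kernel-weighted least-squares problem; the toy game and the large weights used to pin the two constraint coalitions are illustrative assumptions.

```python
# A minimal sketch of the KernelSHAP idea on a toy cooperative game: exact Shapley
# values by enumeration, versus a Shapley-kernel weighted least-squares fit.

import itertools
import math
import numpy as np

def shapley_exact(v, M):
    """Exact Shapley values by enumerating all coalitions (exponential in M)."""
    phi = np.zeros(M)
    for i in range(M):
        others = [j for j in range(M) if j != i]
        for r in range(M):
            for S in itertools.combinations(others, r):
                w = math.factorial(r) * math.factorial(M - r - 1) / math.factorial(M)
                phi[i] += w * (v(set(S) | {i}) - v(set(S)))
    return phi

def kernel_shap(v, M):
    """Weighted regression over all coalitions with the Shapley kernel weights;
    the empty and full coalitions are pinned by very large weights."""
    rows, y, w = [], [], []
    for r in range(M + 1):
        for S in itertools.combinations(range(M), r):
            z = np.zeros(M + 1)
            z[0] = 1.0                                 # intercept column
            for j in S:
                z[1 + j] = 1.0
            rows.append(z)
            y.append(v(set(S)))
            if r == 0 or r == M:
                w.append(1e9)                          # pin g(0)=v({}), g(1)=v(N)
            else:
                w.append((M - 1) / (math.comb(M, r) * r * (M - r)))
    X, y, w = np.array(rows), np.array(y), np.array(w)
    sw = np.sqrt(w)
    beta, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    return beta[1:]                                    # drop the intercept

# Toy 3-player game: singleton worths plus a pairwise synergy between players 0 and 1.
worth = {0: 1.0, 1: 2.0, 2: 3.0}
v = lambda S: sum(worth[i] for i in S) + (1.5 if {0, 1} <= S else 0.0)
print(shapley_exact(v, 3))     # approximately [1.75, 2.75, 3.0]
print(kernel_shap(v, 3))       # should match closely
```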
Advances in storage technology have introduced Non-Volatile Memory (NVM) as a
new storage medium. NVM, along with Dynamic Random Access Memory (DRAM), Solid
State Disk (SSD), and Disk present a system designer with a wide array of
options in designing caching middleware. Moreover, design decisions to
replicate a data item in more than one level of a caching memory hierarchy may
enhance the overall system performance with a faster recovery time in the event
of a memory failure. Given a fixed budget, the key configuration questions are:
Which storage media should constitute the memory hierarchy? What is the storage
capacity of each level? Should data be replicated or partitioned across the
different levels of the hierarchy? We model these cache configuration questions
as an instance of the Multiple Choice Knapsack Problem (MCKP). This model is
guided by the specification of each type of memory along with an application's
database characteristics and its workload. Although MCKP is NP-complete, its
linear programming relaxation is efficiently solvable and can be used to
closely approximate the optimal solution. We use the resulting simple algorithm
to evaluate design tradeoffs in the context of a memory hierarchy for a
Key-Value Store (e.g., memcached) as well as a host-side cache (e.g.,
Flashcache). The results show selective replication is appropriate with certain
failure rates and workload characteristics. With a slim failure rate and
frequent data updates, tiering of data across the different storage media that
constitute the cache is superior to replication.
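The LP-relaxation greedy is simple enough to sketch. Below is a minimal illustration of the classical approach to the Multiple Choice Knapsack LP, not the paper's configuration tool: take the cheapest option in every class, then buy incremental upgrades along each class's upper convex hull in decreasing value-to-cost order, splitting the last upgrade fractionally. The example classes and budget are illustrative assumptions.

```python
# A minimal sketch of the classic greedy solution to the LP relaxation of the
# Multiple Choice Knapsack Problem (MCKP). Not the paper's configuration model.

def lp_dominant(options):
    """Options on the upper convex hull of (cost, value): the only ones an
    LP-optimal fractional solution ever uses."""
    by_cost = {}
    for c, v in options:                        # best value available at each cost
        by_cost[c] = max(by_cost.get(c, float("-inf")), v)
    undominated, best = [], float("-inf")
    for c, v in sorted(by_cost.items()):        # drop options beaten at lower cost
        if v > best:
            undominated.append((c, v))
            best = v
    hull = []                                   # upper concave envelope
    for c, v in undominated:
        while len(hull) >= 2:
            (c1, v1), (c2, v2) = hull[-2], hull[-1]
            if (v2 - v1) * (c - c1) <= (v - v1) * (c2 - c1):
                hull.pop()
            else:
                break
        hull.append((c, v))
    return hull

def mckp_lp_greedy(classes, budget):
    """LP-optimal fractional selection: one option per class, with at most one
    'upgrade' split between two options of a single class."""
    hulls = [lp_dominant(opts) for opts in classes]
    cost = sum(h[0][0] for h in hulls)          # cheapest option in every class
    value = sum(h[0][1] for h in hulls)
    if cost > budget:
        raise ValueError("even the cheapest configuration exceeds the budget")
    upgrades = []                               # incremental steps along each hull
    for i, h in enumerate(hulls):
        for (c1, v1), (c2, v2) in zip(h, h[1:]):
            upgrades.append(((v2 - v1) / (c2 - c1), c2 - c1, v2 - v1, i))
    for ratio, dc, dv, i in sorted(upgrades, reverse=True):
        if budget - cost >= dc:
            cost, value = cost + dc, value + dv
        else:
            return value + (budget - cost) / dc * dv   # fractional break item
    return value

# Illustrative instance: one class per cache level, each option is (cost, benefit).
classes = [
    [(2, 10), (4, 16), (8, 20)],    # e.g. DRAM sizing options
    [(1, 3), (3, 8), (6, 12)],      # e.g. NVM sizing options
    [(1, 2), (2, 5), (5, 9)],       # e.g. SSD sizing options
]
print(mckp_lp_greedy(classes, budget=10))       # LP-optimal value (about 30.33 here)
```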
Hypergraphs model complex, non-binary relationships like co-authorships,
social group memberships, and recommendations. Like traditional graphs,
hypergraphs can grow large, posing challenges for storage, transmission, and
query performance. We propose HyperCSA, a novel compression method for
hypergraphs that maintains support for standard queries over the succinct
representation. HyperCSA achieves compression ratios of 26% to 79% of the
original file size on real-world hypergraphs - outperforming existing methods
on all large hypergraphs in our experiments. Additionally, HyperCSA scales to
larger datasets than existing approaches. Furthermore, for common real-world
hypergraphs, HyperCSA evaluates neighbor queries 6 to 40 times faster than both
standard data structures and other hypergraph compression approaches.
Authors: Pengxin Bian, Panagiotis Charalampopoulos, Lorraine A. K. Ayad, Manal Mohamed, Solon P. Pissis, Grigorios Loukides
Frequent pattern mining is a flagship problem in data mining. In its most
basic form, it asks for the set of substrings of a given string $S$ of length
$n$ that occur at least $\tau$ times in $S$, for some integer $\tau\in[1,n]$.
We introduce a resilient version of this classic problem, which we term the
$(\tau, k)$-Resilient Pattern Mining (RPM) problem. Given a string $S$ of
length $n$ and two integers $\tau, k\in[1,n]$, RPM asks for the set of
substrings of $S$ that occur at least $\tau$ times in $S$, even when the
letters at any $k$ positions of $S$ are substituted by other letters. Unlike
frequent substrings, resilient ones account for the fact that changes to string
$S$ are often expensive to handle or are unknown.
We propose an exact $\mathcal{O}(n\log n)$-time and $\mathcal{O}(n)$-space
algorithm for RPM, which employs advanced data structures and combinatorial
insights. We then present experiments on real large-scale datasets from
different domains demonstrating that: (I) The notion of resilient substrings is
useful in analyzing genomic data and is more powerful than that of frequent
substrings, in scenarios where resilience is required, such as in the case of
versioned datasets; (II) Our algorithm is several orders of magnitude faster
and more space-efficient than a baseline algorithm that is based on dynamic
programming; and (III) Clustering based on resilient substrings is effective.
The Burrows-Wheeler Transform (BWT) is a fundamental component in many data
structures for text indexing and compression, widely used in areas such as
bioinformatics and information retrieval. The extended BWT (eBWT) generalizes
the classical BWT to multisets of strings, providing a flexible framework that
captures many BWT-like constructions. Several known variants of the BWT can be
viewed as instances of the eBWT applied to specific decompositions of a word. A
central property of the BWT, essential for its compressibility, is the number
of maximal ranges of equal letters, known as runs. In this article, we explore how
different decompositions of a word impact the number of runs in the resulting
eBWT. First, we show that the number of decompositions of a word is
exponential, even under minimal constraints on the size of the subsets in the
decomposition. Second, we present an infinite family of words for which the
ratio of the number of runs between the worst and best decompositions is
unbounded, under the same minimal constraints. These results illustrate the
potential cost of decomposition choices in eBWT-based compression and underline
the challenges in optimizing run-length encoding in generalized BWT frameworks.
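For readers unfamiliar with the run statistic, here is a minimal sketch of the classical single-string BWT built from sorted rotations, together with a run counter; it does not implement the eBWT of a multiset or the paper's decompositions, and the example word is illustrative.

```python
# Classical single-string BWT via sorted rotations, plus a run counter. The paper
# studies the extended BWT of multisets of strings; this sketch only illustrates
# the run statistic on the ordinary BWT.

def bwt(s, sentinel="$"):
    """Burrows-Wheeler Transform of s + sentinel (sentinel assumed absent in s)."""
    t = s + sentinel
    rotations = sorted(t[i:] + t[:i] for i in range(len(t)))
    return "".join(rot[-1] for rot in rotations)

def runs(s):
    """Number of maximal blocks of equal consecutive letters."""
    return sum(1 for i, c in enumerate(s) if i == 0 or c != s[i - 1])

word = "banana"
transformed = bwt(word)
print(transformed, runs(transformed), runs(word))   # annb$aa 5 6
```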
Authors: Maria Cherifa, Clément Calauzènes, Vianney Perchet
While online bipartite matching has gained significant attention in recent
years, existing analyses in stochastic settings fail to capture the performance
of algorithms on heterogeneous graphs, such as those incorporating inter-group
affinities or other social network structures. In this work, we address this
gap by studying online bipartite matching within the stochastic block model
(SBM). A fixed set of offline nodes is matched to a stream of online arrivals,
with connections governed probabilistically by latent class memberships. We
analyze two natural algorithms: a $\tt{Myopic}$ policy that greedily matches
each arrival to the most compatible class, and the $\tt{Balance}$ algorithm,
which accounts for both compatibility and remaining capacity. For the
$\tt{Myopic}$ algorithm, we prove that the size of the matching converges, with
high probability, to the solution of an ordinary differential equation (ODE),
for which we provide a tractable approximation along with explicit error
bounds. For the $\tt{Balance}$ algorithm, we demonstrate convergence of the
matching size to a differential inclusion and derive an explicit limiting
solution. Lastly, we explore the impact of estimating the connection
probabilities between classes online, which introduces an
exploration-exploitation trade-off.
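The $\tt{Myopic}$ policy can be simulated directly. Below is a minimal sketch, not the paper's ODE analysis, of greedy matching under a two-class stochastic block model: each arrival draws fresh edges to the free offline nodes with class-dependent probabilities and is matched in its most compatible class that contains a neighbor. The class sizes, affinity matrix, and preference rule are illustrative assumptions.

```python
# A minimal simulation sketch of Myopic online matching under a 2x2 stochastic
# block model. The parameters below are illustrative assumptions.

import random

def simulate_myopic(n_offline_per_class=500, n_online=1000, seed=0,
                    p=((0.02, 0.005), (0.005, 0.02))):
    rng = random.Random(seed)
    classes = len(p)
    free = [n_offline_per_class] * classes            # unmatched offline nodes per class
    matched = 0
    for _ in range(n_online):
        a = rng.randrange(classes)                    # class of the online arrival
        # Try offline classes from most to least compatible with class a.
        for b in sorted(range(classes), key=lambda c: -p[a][c]):
            # Edges to free nodes of class b are drawn independently with prob p[a][b].
            neighbors = sum(rng.random() < p[a][b] for _ in range(free[b]))
            if neighbors > 0:
                free[b] -= 1                          # match to one such neighbor
                matched += 1
                break
    return matched

print(simulate_myopic())        # size of the matching produced by the Myopic policy
```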
Byzantine agreement is a fundamental problem in fault-tolerant distributed
computing that has been studied intensively for the last four decades. Much of
the research has focused on a static Byzantine adversary, where the adversary
is constrained to choose the Byzantine nodes in advance of the protocol's
execution. This work focuses on the harder case of an adaptive Byzantine
adversary that can choose the Byzantine nodes \emph{adaptively} based on the
protocol's execution. While efficient $O(\log n)$-round protocols ($n$ is the
total number of nodes) are known for the static adversary (Goldwasser, Pavlov,
and Vaikuntanathan, FOCS 2006) tolerating up to $t < n/(3+\epsilon)$ Byzantine
nodes, $\Omega(t/\sqrt{n \log n})$ rounds is a well-known lower bound for an
adaptive adversary [Bar-Joseph and Ben-Or, PODC 1998]. The best-known protocol
for an adaptive adversary runs in $O(t/\log n)$ rounds [Chor and Coan, IEEE Trans.
Soft. Engg., 1985].
This work presents a synchronous randomized Byzantine agreement protocol
under an adaptive adversary that improves over previous results. Our protocol
works under the powerful \emph{adaptive rushing adversary in the full
information model}. That is, we assume that the Byzantine nodes can behave
arbitrarily and maliciously, have knowledge about the entire state of the
network at every round, including random choices made by all the nodes up to
and including the current round, have unlimited computational power, and may
collude among themselves. Furthermore, the adversary can \emph{adaptively}
corrupt up to $t < n/3$ nodes based on the protocol's execution. We present a
simple randomized Byzantine agreement protocol that runs in $O(\min\{t^2\log
n/n, t/\log n\})$ rounds that improves over the long-standing bound of
$O(t/\log n)$ rounds due to Chor and Coan [IEEE Trans. Soft. Engg., 1985].
Authors: Jakub Łącki, Slobodan Mitrović, Srikkanth Ramachandran, Wen-Horng Sheu
We study the allocation problem in the Massively Parallel Computation (MPC)
model. This problem is a special case of $b$-matching, in which the input is a
bipartite graph with capacities greater than $1$ in only one part of the
bipartition. We give a $(1+\epsilon)$ approximate algorithm for the problem,
which runs in $\tilde{O}(\sqrt{\log \lambda})$ MPC rounds, using sublinear
space per machine and $\tilde{O}(\lambda n)$ total space, where $\lambda$ is
the arboricity of the input graph. Our result is obtained by providing a new
analysis of a LOCAL algorithm by Agrawal, Zadimoghaddam, and Mirrokni [ICML
2018], which improves its round complexity from $O(\log n)$ to $O(\log
\lambda)$. Prior to our work, no $o(\log n)$ round algorithm for
constant-approximate allocation was known in either LOCAL or sublinear space
MPC models for graphs with low arboricity.
We study rumor spreading in dynamic random graphs. Starting with a single
informed vertex, the information flows until it reaches all the vertices of the
graph (completion), according to the following process. At each step $k$, the
information is propagated to neighbors of the informed vertices, in the $k$-th
generated random graph. The way this information propagates from vertex to
vertex at each step will depend on the ``protocol''. We provide a method based
on strong stationary times to study the completion time when the graph dynamics
are Markovian, using known results from the literature for independent graphs.
The concept of strong stationary times is then extended to non-Markovian
dynamics using coupling-from-the-past algorithms. This allows us to extend
results on completion times to non-Markovian dynamics.
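As a concrete instance of the process, here is a minimal simulation sketch, not the strong-stationary-time machinery of the paper, of flooding on a sequence of independently regenerated Erdős-Rényi graphs: at each step a fresh graph is drawn and every neighbor of an informed vertex becomes informed; the function reports the completion time. The graph model and parameters are illustrative assumptions.

```python
# A minimal simulation of rumor spreading on dynamic random graphs: at each step a
# fresh Erdos-Renyi graph G(n, p) is drawn (independent dynamics) and every neighbor
# of an informed vertex becomes informed. Parameters are illustrative.

import random

def completion_time(n=200, p=0.02, seed=0, max_steps=10_000):
    rng = random.Random(seed)
    informed = {0}                                  # single initially informed vertex
    for step in range(1, max_steps + 1):
        newly = set()
        for u in informed:                          # sample this step's edges lazily,
            for v in range(n):                      # only around informed vertices
                if v not in informed and rng.random() < p:
                    newly.add(v)
        informed |= newly
        if len(informed) == n:
            return step                             # completion: everyone informed
    return None

print(completion_time())                            # number of steps until completion
```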
If peer review only refers to papers, it is not defensible
Bah, I made a mistake. In Tuesday’s post, I tried to take some poetic license and propose a broader definition of “peer review” to encompass the complexity of the system of academic knowledge curation. Unfortunately, almost all of the feedback I received latched onto the more common usage, the flawed system we use to vet which manuscripts get published.
Anyway, that’s my bad. Loaded terms can’t get overloaded definitions. I should have instead called academia a “Bureaucracy of Expertise.” Peer review could then maintain its original common definition as the subsystem of complex, obfuscated, administrative rules, procedures, and paperwork that go into accepting and rejecting papers.
Fine, you all win. And if we’re going with that definition, I won't defend peer review. It’s not defensible. However, I will use it as an interesting case study to exemplify how change happens (very, very slowly) in The Academic Bureau of Expertise.
If you don’t know how pre-publication peer review is supposed to work, it’s like this. An academic team writes a paper and then sends it to some venue to be published. There, an editor skims the paper and finds three people to write reviews of the submission. Those three people write something about the paper and send it back to the editor. Based on what they write, the editor decides whether or not to publish it. The editor then sends their decision and the reviews, with the reviewer names redacted to protect the innocent, back to the academic team.
This is the system we have, and no one is happy with it. I fully reject that we can repurpose Churchill and say, “Our system of pre-publication manuscript review is terrible, but it’s better than everything else.” Others have eloquently written elsewhere that this is a failed post-war experiment, and we’d be best off abandoning it as part of any academic reform. I highly recommend Adam Mastroianni’s viral screed for an accessible introduction. Everyone knows what it doesn’t do: it doesn’t elevate great work, it doesn’t catch errors, it doesn’t catch fraud, it is susceptible to collusion rings. No one has a good argument for what it does well, and there are a trillion other ways to get feedback on your research. Those who say there is no better alternative don’t provide evidence when pressed.
Besides, how would you prove it's the best system in the first place? Given the current provisions in the Acceptable Procedures Manual of The Academic Bureau of Expertise, you would run an RCT, of course! Evidence-based academia. Someone would set up a contrived randomized controlled experiment and show that some particular group fares better with one particular setup of the peer review system when compared to another. People have already conducted many such studies, and they don’t tell us anything. I saw a new RCT yesterday claiming that academics can’t even evaluate reviews of their own paper. Part of the problem with academia is that we have blessed this idea of RCTs as a way of sensemaking in social science, and it simply doesn’t work.
Importantly, Paul Beame writes correctly that these RCTs have virtually no external validity. Generalizing from machine learning conferences to everything else is a fool’s errand. I couldn’t agree more. Last month, our AI Winter conference received thirty thousand submissions! Our Summer and Spring conferences receive similar numbers. Reviewers are supposed to evaluate these submissions to determine whether they deserve to be considered “top-tier” publications. But they only get six weeks to review six or more papers. Most communities would be horrified by our system, and rightfully so. With the way the marketplace of ideas now functions in machine learning, literally in the bubbly capitalist marketplace, it’s unclear why we keep this sideshow of review cycles that consume a third of the lives of graduate students (and the many undergrads who are reviewing too).
A more organic fix that wouldn’t require explicit rule changes would be to devalue the currency. We could collectively just agree that NeurIPS and ICML papers are valueless. That they no longer count towards any of the advancement steps. In many ways, just as Adam Smith said it should, this is happening. People are grumbling about how the undergrad with five machine learning publications didn’t turn into a good graduate student. Faculty recruiting committees, baffled by candidates with 50 papers on their CVs, often move on to those with clearer, more focused records, often outside the field of artificial intelligence entirely. The self-correction happens slowly. Maybe there will still be a hundred thousand machine learning papers submitted every year, but AI paper count is slowly beginning to be treated as equivalent to the number of tweets a candidate has posted.
A general dissatisfaction with peer review is why social media and preprint servers are winning. As my mad-tweeting friend Dimitris Papailiopoulos frequently points out, the ratio of arxiv papers to peer-reviewed papers of any machine learning researcher is very close to one. We do the peer-reviewed part because we “have to,” not because anyone thinks it matters for anything related to knowledge dissemination. It has become divorced from the aforementioned marketplace of ideas and now just serves as a bean on a CV to be counted.
And yet, full openness on social media is not a panacea. We have to be careful as we move away from pre-publication manuscript peer review. The frenzied, decentralized sloshing around with systems to maximize their visibility in the market of ideas doesn’t always take us to good outcomes.
Scientists were tricked by the logic of the platform into having a single, public-facing account where they did three very different kinds of things:
Had intense debates about the status of scientific research, often criticizing scientific practice.
Shared their own scientific findings with the public, towards the goal of “impact” so prized by today’s hiring committees and grant awarding institutions.
Spouted off about their private political opinions.
Each of these things is a perfectly reasonable thing to be doing — just not all of them, at once, from the same account.
I can’t recommend Kevin’s post enough. You can be all for open science and believe academic Twitter was a mistake. Academics making sausage and posting their whole selves to Twitter was, for the most part, a very bad look.
More importantly, Kevin articulates why all of these changes to academic systems have to be made organically. Making explicit rules and then not providing additional resources ends up strangling academic work. More mandatory paperwork—whether it be forcing people to write ethics statements, funding statements, or preregistration plans—with no increase (in fact, a decrease!) in resources isn’t going to lead to more creativity and breakthroughs. And worse, it provides openings for opportunists to exert authoritarian control.
Today at 14:30, on Monday, June 9 at 11:00, and on Wednesday, June 11 at 11:20, Mehtaab S. Sawhney will deliver the 2025 Erdős lecture. Three talks, each representing a monumental achievement!
Authors: Julian Golak, Finn Sörensen, Malte Fliedner
In this work, we consider extensions of both the Line Traveling Salesman and
Line Traveling Repairman Problem, in which a single server must service a set
of clients located along a line segment under the assumption that not only the
server, but also the clients can move along the line and seek to collaborate
with the server to speed up service times. We analyze the structure of
different problem versions and identify hard and easy subproblems by building
on prior results from the literature. Specifically, we investigate problem
versions with zero or general processing times, clients that are either slower
or faster than the server, as well as different time window restrictions.
Collectively, these results map out the complexity landscape of the Line
Traveling Salesman and Repairman Problem with collaboration.
Authors: Shaoshan Liu, Fan Wang, Hongjun Zhou, Yuanfeng Wang
While theory and practice are often seen as separate domains, this article
shows that theoretical insight is essential for overcoming real-world
engineering barriers. We begin with a practical challenge: training a
cross-morphology embodied AI policy that generalizes across diverse robot
morphologies. We formalize this as the Heterogeneous Embodied Agent Training
(HEAT) problem and prove it reduces to a structured Partially Observable Markov
Decision Process (POMDP) that is PSPACE-complete. This result explains why
current reinforcement learning pipelines break down under morphological
diversity, due to sequential training constraints, memory-policy coupling, and
data incompatibility. We further explore Collective Adaptation, a distributed
learning alternative inspired by biological systems. Though NEXP-complete in
theory, it offers meaningful scalability and deployment benefits in practice.
This work illustrates how computational theory can illuminate system design
trade-offs and guide the development of more robust, scalable embodied AI. For
practitioners and researchers to explore this problem, the implementation code
of this work has been made publicly available at
github.com/airs-admin/HEAT
Hive is an abstract strategy game played on a table with hexagonal pieces.
First published in 2001, it was and continues to be highly popular among both
casual and competitive players. In this paper, we show that for a suitably
generalized version of the game, the computational problem of determining
whether a given player in an arbitrary position has a winning strategy is
PSPACE-hard. We do this by reduction from a variant of Generalized Geography we
call Formula Game Geography.
Authors: James Demmel, Hengrui Luo, Ryan Schneider, Yifu Wang
In this paper, we analyze several versions of Jacobi's method for the
symmetric eigenvalue problem. Our goal throughout is to reduce the asymptotic
cost of the algorithm as much as possible, as measured by the number of
arithmetic operations performed and associated (sequential or parallel)
communication, i.e., the amount of data moved between slow and fast memory or
between processors in a network. In producing rigorous complexity bounds, we
allow our algorithms to be built on both classic $O(n^3)$ matrix multiplication
and fast, Strassen-like $O(n^{\omega_0})$ alternatives. In the classical
setting, we show that a blocked implementation of Jacobi's method attains the
communication lower bound for $O(n^3)$ matrix multiplication (and is therefore
expected to be communication optimal among $O(n^3)$ methods). In the fast
setting, we demonstrate that a recursive version of blocked Jacobi can go even
further, reaching essentially optimal complexity in both measures. We also
discuss Jacobi-based SVD algorithms and a parallel version of block Jacobi,
showing that analogous complexity bounds apply.
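For reference, here is a minimal sketch of the classical (unblocked) cyclic Jacobi iteration, the textbook method that the blocked and recursive variants in the paper reorganize for communication efficiency; the tolerance and the test matrix are illustrative assumptions.

```python
# Classical cyclic Jacobi iteration for a symmetric matrix: repeatedly zero the
# (p, q) off-diagonal entry with a Givens rotation until the off-diagonal mass is
# small. This is the textbook method, not the blocked or recursive variants.

import numpy as np

def jacobi_eigenvalues(A, tol=1e-12, max_sweeps=50):
    A = np.array(A, dtype=float)
    n = A.shape[0]
    V = np.eye(n)                                   # accumulates eigenvectors
    for _ in range(max_sweeps):
        off = np.sqrt(np.sum((A - np.diag(np.diag(A)))**2))
        if off < tol:
            break
        for p in range(n - 1):                      # one cyclic sweep
            for q in range(p + 1, n):
                if abs(A[p, q]) < tol:
                    continue
                theta = 0.5 * np.arctan2(2 * A[p, q], A[q, q] - A[p, p])
                c, s = np.cos(theta), np.sin(theta)
                J = np.eye(n)
                J[p, p] = J[q, q] = c
                J[p, q], J[q, p] = s, -s
                A = J.T @ A @ J                     # similarity transform zeroes A[p, q]
                V = V @ J
    order = np.argsort(np.diag(A))
    return np.diag(A)[order], V[:, order]

A = np.array([[4.0, 1.0, 2.0],
              [1.0, 3.0, 0.5],
              [2.0, 0.5, 1.0]])
eigs, _ = jacobi_eigenvalues(A)
print(eigs)                                         # agrees with np.linalg.eigvalsh(A)
```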
We analyze the computational power of non-Hermitian quantum dynamics, i.e.,
conditional time evolutions that arise when a quantum system is monitored and
one postselects on a particular measurement record. We establish an approximate
equivalence between post-selection and arbitrary non-Hermitian Hamiltonians.
Namely, first we establish hardness in the following sense: Let $U=e^{-iHt}$ be
an NH gate on $n$ qubits whose smallest and largest singular values differ by
at least $2^{-\text{poly}(n)}$. Together with any universal set of unitary
gates, the ability to apply such a gate lets one efficiently emulate
postselection. The resulting model decides every language in PostBQP; hence,
under standard complexity conjectures, fully scalable NH quantum computers are
unlikely to be engineered. Second, we establish upper bounds which show that
conversely, any non-Hermitian evolution can be written as a unitary on a
system-meter pair followed by postselecting the meter. This ``purification'' is
compact -- it introduces only $O(\delta^{2})$ Trotter error per time step
$\delta$ -- so any NH model whose purification lies in a strongly simulable
unitary family (e.g., Clifford, matchgate, or low-bond-dimension tensor-network
circuits) remains efficiently simulable. Thus, non-Hermitian physics neither
guarantees a quantum advantage nor precludes efficient classical simulation:
its complexity is controlled by the singular-value radius of the evolution
operator and by the structure of its unitary purification.
We introduce an imperative, stack-based, and reversible computational model
that characterizes Two-way Bijections both implicitly, concerning their
computational complexity, and with zero-garbage.
Fix a positive integer $r$, and a graph $G$ that is $K_{3,r}$-minor-free. Let
$I_s$ and $I_t$ be two independent sets in $G$, each of size $k$. We begin with
a ``token'' on each vertex of $I_s$ and seek to move all tokens to $I_t$, by
repeated ``token jumping'', removing a single token from one vertex and placing
it on another vertex. We require that each intermediate arrangement of tokens
again specifies an independent set of size $k$. Given $G$, $I_s$, and $I_t$, we
ask whether there exists a sequence of token jumps that transforms $I_s$ into
$I_t$. When $k$ is part of the input, this problem is known to be
PSPACE-complete. However, it was shown by Ito, Kami\'nski, and Ono (2014) to be
fixed-parameter tractable. That is, when $k$ is fixed, the problem can be
solved in time polynomial in the order of $G$.
Here we strengthen the upper bound on the running time in terms of $k$ by
showing that the problem has a kernel of size linear in $k$. More precisely, we
transform an arbitrary input problem on a $K_{3,r}$-minor-free graph into an
equivalent problem on a ($K_{3,r}$-minor-free) graph with order $O(k)$. This
answers positively a question of Bousquet, Mouawad, Nishimura, and Siebertz
(2024) and improves the recent quadratic kernel of Cranston, M\"{u}hlenthaler,
and Peyrille (2024+). For planar graphs, we further strengthen this upper bound
to get a kernel of size at most $42k$.
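For intuition about the reconfiguration question itself (not the kernelization), here is a brute-force breadth-first search over independent sets that decides reachability under token jumping on small instances; the graph and the two token sets below are placeholder examples.

from collections import deque
from itertools import combinations
import networkx as nx

def is_independent(G, S):
    return not any(G.has_edge(u, v) for u, v in combinations(S, 2))

def token_jumping_reachable(G, I_s, I_t):
    """BFS over independent sets of size k; one jump moves a single token
    from its vertex to any other vertex, keeping the set independent."""
    start, goal = frozenset(I_s), frozenset(I_t)
    assert is_independent(G, start) and is_independent(G, goal)
    seen, queue = {start}, deque([start])
    while queue:
        S = queue.popleft()
        if S == goal:
            return True
        for u in S:
            for v in G.nodes:
                if v not in S:
                    T = (S - {u}) | {v}
                    if T not in seen and is_independent(G, T):
                        seen.add(T)
                        queue.append(T)
    return False

G = nx.path_graph(6)                    # a toy planar (hence K_{3,r}-minor-free) graph
print(token_jumping_reachable(G, {0, 2, 4}, {1, 3, 5}))   # True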
Authors: Boris Aronov, Sang Won Bae, Sergio Cabello, Otfried Cheong, David Eppstein, Christian Knauer, Raimund Seidel
Let $\mathcal{A}$ be the subdivision of $\mathbb{R}^d$ induced by $m$ convex
polyhedra having $n$ facets in total. We prove that $\mathcal{A}$ has
combinatorial complexity $O(m^{\lceil d/2 \rceil} n^{\lfloor d/2 \rfloor})$ and
that this bound is tight. The bound is mentioned several times in the
literature, but no proof for arbitrary dimension has been published before.
We consider the Top-$K$ selection problem, which aims to identify the
largest-$K$ elements from an array. Top-$K$ selection arises in many machine
learning algorithms and often becomes a bottleneck on accelerators, which are
optimized for dense matrix multiplications. To address this problem,
\citet{chern2022tpuknnknearestneighbor} proposed a fast two-stage
\textit{approximate} Top-$K$ algorithm: (i) partition the input array and
select the top-$1$ element from each partition, (ii) sort this \textit{smaller
subset} and return the top $K$ elements. In this paper, we consider a
generalized version of this algorithm, where the first stage selects top-$K'$
elements, for some $1 \leq K' \leq K$, from each partition. Our contributions
are as follows: (i) we derive an expression for the expected recall of this
generalized algorithm and show that choosing $K' > 1$ with fewer partitions in
the first stage reduces the input size to the second stage more effectively
while maintaining the same expected recall as the original algorithm, (ii) we
derive a bound on the expected recall for the original algorithm in
\citet{chern2022tpuknnknearestneighbor} that is provably tighter by a factor of
$2$ than the one in that paper, and (iii) we implement our algorithm on Cloud
TPUv5e and achieve speedups of around an order of magnitude over the original
algorithm without sacrificing recall on real-world tasks.
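A minimal NumPy sketch of the generalized two-stage algorithm described above (the array sizes, padding convention, partition count, and recall check are our own choices for the demo, not the TPU implementation):

import numpy as np

def approx_topk(x, k, num_partitions, k_prime):
    """Stage 1: top-k' of each partition (no full sort); stage 2: exact top-k
    of the much smaller candidate set."""
    pad = (-len(x)) % num_partitions                  # pad so the split is even
    parts = np.concatenate([x, np.full(pad, -np.inf)]).reshape(num_partitions, -1)
    idx = np.argpartition(parts, -k_prime, axis=1)[:, -k_prime:]
    candidates = np.take_along_axis(parts, idx, axis=1).ravel()
    return np.sort(candidates)[-k:]

rng = np.random.default_rng(0)
x = rng.standard_normal(1_000_000)
k = 128
approx = approx_topk(x, k, num_partitions=2048, k_prime=4)
exact = np.sort(x)[-k:]
print(np.mean(np.isin(approx, exact)))                # empirical recall

Choosing a larger k_prime with fewer partitions shrinks the candidate set passed to the exact second stage, which is exactly the trade-off the expected-recall analysis quantifies.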
Authors: Jan Seyfried, Sayantan Sen, Marco Tomamichel
We investigate the sample complexity of mutual information and conditional
mutual information testing. For conditional mutual information testing, given
access to independent samples of a triple of random variables $(A, B, C)$ with
unknown distribution, we want to distinguish between two cases: (i) $A$ and $C$
are conditionally independent, i.e., $I(A\!:\!C|B) = 0$, and (ii) $A$ and $C$
are conditionally dependent, i.e., $I(A\!:\!C|B) \geq \varepsilon$ for some
threshold $\varepsilon$. We establish an upper bound on the number of samples
required to distinguish between the two cases with high confidence, as a
function of $\varepsilon$ and the three alphabet sizes. We conjecture that our
bound is tight and show that this is indeed the case in several parameter
regimes. For the special case of mutual information testing (when $B$ is
trivial), we establish the necessary and sufficient number of samples required
up to polylogarithmic terms.
Our technical contributions include a novel method to efficiently simulate
weakly correlated samples from the conditionally independent distribution
$P_{A|B} P_{C|B} P_B$ given access to samples from an unknown distribution
$P_{ABC}$, and a new estimator for equivalence testing that can handle such
correlated samples, which might be of independent interest.
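For intuition only, the naive plug-in estimate $\hat{I}(A\!:\!C|B)=\sum_{a,b,c}\hat{p}(a,b,c)\log\frac{\hat{p}(a,b,c)\,\hat{p}(b)}{\hat{p}(a,b)\,\hat{p}(b,c)}$ is easy to compute from empirical counts; this is a baseline illustration, not the simulation-plus-equivalence-testing approach developed in the paper, and the synthetic distribution below is made up.

from collections import Counter
import numpy as np

def plug_in_cmi(samples):
    """Empirical I(A:C|B) in nats from i.i.d. triples (a, b, c)."""
    n = len(samples)
    n_abc = Counter(samples)
    n_ab = Counter((a, b) for a, b, _ in samples)
    n_bc = Counter((b, c) for _, b, c in samples)
    n_b = Counter(b for _, b, _ in samples)
    return sum(cnt / n * np.log(cnt * n_b[b] / (n_ab[(a, b)] * n_bc[(b, c)]))
               for (a, b, c), cnt in n_abc.items())

rng = np.random.default_rng(0)
b = rng.integers(0, 3, size=5000)
a = (b + rng.integers(0, 2, size=5000)) % 3   # A depends on B only
c = (b + rng.integers(0, 2, size=5000)) % 3   # C depends on B only
print(plug_in_cmi(list(zip(a, b, c))))        # close to 0 (slightly positive from plug-in bias)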
Authors: Yaojian Chen, Tianyu Ma, An Yang, Lin Gan, Wenlai Zhao, Guangwen Yang
Tensor permutation is a fundamental operation widely applied in AI, tensor
networks, and related fields. However, it is extremely complex, and different
shapes and permutation maps can make a huge difference. SIMD permutation began
to be studied in 2006, but the best method at that time was to split complex
permutations into multiple simple permutations to do SIMD, which might increase
the complexity for very complex permutations. Subsequently, as tensor
contraction gained significant attention, researchers explored structured
permutations associated with tensor contraction. Progress on general
permutations has been limited, and with increasing SIMD bit widths, achieving
efficient performance for these permutations has become increasingly
challenging. We propose a SIMD permutation toolkit, \system, that generates
optimized permutation code for arbitrary instruction sets, bit widths, tensor
shapes, and permutation patterns, while maintaining low complexity. In our
experiments, \system achieves speedups of up to $38\times$ for special cases
and $5\times$ for general cases compared to NumPy.
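For reference, the NumPy baseline that the paper benchmarks against amounts to a strided transpose followed by a copy that materializes the permuted layout; the shape and permutation map below are arbitrary examples.

import numpy as np

x = np.random.rand(64, 32, 16, 8)
perm = (2, 0, 3, 1)                                  # example permutation map
y = np.ascontiguousarray(np.transpose(x, perm))      # the copy is where the data movement happens
assert y.shape == tuple(x.shape[p] for p in perm)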
Authors: Haricharan Balasundaram, J B Krishnashree, Girija Limaye, Meghana Nasre
The Hospital Residents problem with sizes (HRS) is a generalization of the
well-studied hospital residents (HR) problem. In the HRS problem, an agent $a$
has a size $s(a)$ and the agent occupies $s(a)$ many positions of the hospital
$h$ when assigned to $h$. The notion of stability in this setting is suitably
modified, and it is known that deciding whether an HRS instance admits a stable
matching is NP-hard under severe restrictions. In this work, we explore a
variation of stability, which we term occupancy-based stability. This notion
was defined by McDermid and Manlove; however, to the best of our knowledge, it
remains unexplored. We show that every HRS instance
admits an occupancy-stable matching. We further show that computing a
maximum-size occupancy-stable matching is NP-hard. We complement our hardness
result by providing a linear-time 3-approximation algorithm for the max-size
occupancy-stable matching problem. Given that the classical notion of stability
adapted for HRS is not guaranteed to exist in general, we show a practical
restriction under which a stable matching is guaranteed to exist. We present an
efficient algorithm to output a stable matching in the restricted HRS
instances. We also provide an alternate NP-hardness proof for the decision
version of the stable matching problem for HRS which imposes a severe
restriction on the number of neighbours of non-unit sized agents.
We explain how the maximum energies of the Quantum MaxCut, XY, and EPR
Hamiltonians on a graph $G$ are related to the spectral radii of the token
graphs of $G$. From numerical study, we conjecture new bounds for these
spectral radii based on properties of $G$. We show how these conjectures
tighten the analysis of existing algorithms, implying state-of-the-art
approximation ratios for all three Hamiltonians. Our conjectures also provide
simple combinatorial bounds on the ground state energy of the antiferromagnetic
Heisenberg model, which we prove for bipartite graphs.
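To make the objects concrete, the short script below (our own illustration, using the standard definition) builds the $k$-token graph of a graph $G$, whose vertices are the $k$-subsets of $V(G)$ with two subsets adjacent when their symmetric difference is an edge of $G$, and computes its spectral radius; the input graph and $k$ are arbitrary.

from itertools import combinations
import networkx as nx
import numpy as np

def token_graph(G, k):
    F = nx.Graph()
    F.add_nodes_from(frozenset(S) for S in combinations(G.nodes, k))
    for S in map(frozenset, combinations(G.nodes, k)):
        for u in S:
            for v in G.nodes:
                if v not in S and G.has_edge(u, v):
                    F.add_edge(S, (S - {u}) | {v})   # move one token along an edge of G
    return F

G = nx.cycle_graph(6)
F = token_graph(G, 2)
spectral_radius = float(np.max(np.abs(np.linalg.eigvalsh(nx.to_numpy_array(F)))))
print(F.number_of_nodes(), spectral_radius)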
$Q_{n,p}$, the random subgraph of the $n$-vertex hypercube $Q_n$, is obtained
by independently retaining each edge of $Q_n$ with probability $p$. We give
precise values for the cover time of $Q_{n,p}$ above the connectivity
threshold.
Authors: Diego Diaz-Dominguez, Travis Gagie, Veronica Guerrini, Ben Langmead, Zsuzsanna Liptak, Giovanni Manzini, Francesco Masillo, Vikram Shivakumar
When building Burrows-Wheeler Transforms (BWTs) of truly huge datasets,
prefix-free parsing (PFP) can use an unreasonable amount of memory. In this
paper we show that, if a dataset can be broken down into small datasets that are
not very similar to each other -- such as collections of many copies of genomes
of each of several species, or collections of many copies of each of the human
chromosomes -- then we can drastically reduce PFP's memory footprint by
building the BWTs of the small datasets and then merging them into the BWT of
the whole dataset.
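As background for readers new to the transform (this is the textbook definition, not a sketch of prefix-free parsing or of the merging procedure), the BWT of a string is simply the last column of its sorted rotations:

def bwt(text, sentinel="$"):
    """Last column of the sorted rotations of text + sentinel (sentinel assumed
    absent from text and lexicographically smallest)."""
    s = text + sentinel
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rot[-1] for rot in rotations)

print(bwt("banana"))   # annb$aa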
By Lance Fortnow
You can write laws that are very specific, like the US tax code, or open to interpretation like the first amendment. In the literature these are known as rules and standards respectively.
In computational complexity, we generally think of complexity as bad. We want to solve problems quickly and simply. Sometimes complexity is good, if you want to hide information, generate randomness or need some friction. But mostly we want simplicity. How does simplicity guide us in setting guidance, either through rules or standards?
Rules are like a computer program. Feed in the input and get an output. Predictable and easy to compute. So why not always have tight rules?
Nobody ever gets a computer program right the first time, and the same goes for rules. Rules can be overly restrictive or have loopholes, leading to feelings of unfairness. Rules can require hoops to jump through to get things done. Rules don't engender trust to the ones the rules apply to, like very tight requirements on how grant funds can be spent. We know that in general we can't predict anything about how a computer program behaves, so why do we trust the rules?
A good example of a standard is that a PhD dissertation requires significant original research. Rules are things like the exact formatting requirements of a thesis, or statements like a CS thesis must contain three papers published in a specific given set of conferences.
As an administrator I like to focus on making decisions based on what's best for my unit, as opposed to ensuring I followed every letter of every rule. Because if you live by the rules, you'll die by the rules. People will try to use their interpretation of the rules to force your hand.
Sometimes we do need strict rules, like safety standards, especially for people unfamiliar with the equipment. Structured rules do give complete clarity about when an action is allowed. But they also give an excuse. Have you ever been satisfied by someone who did something you didn't like but said "I was just following the rules"?
Even strict rules tend to have an out, like a petition to take a set of courses that don't exactly match the requirements of a major. The petition is a standard, open to interpretation to capture what the rules don't.
As a complexity theorist I know what programs can't achieve, and as an administrator I see the same with rules. I prefer standards, principles over policies. Set your expectations, live by example, and trust the people, faculty, staff and students, to do the right thing and push back when they don't. People don't want strict rules, but they mostly act properly when they believe they are trusted and have wide latitude in their work.
Not whether AI will change work, but what kind of work — and which workers — will endure.
Former President Obama recently shared an interview with Dario Amodei, CEO of Anthropic, who warned in Axios that AI could eliminate up to 50% of entry-level white-collar jobs within five years, potentially raising unemployment to 10–20%. Economists and institutions like the IMF and the World Economic Forum are sounding similar alarms.
This kind of forecast is no longer hypothetical. It reframes the urgency of our conversation: not whether AI will change work, but what kind of work — and which workers — will endure.
In this post, I want to offer a clearer way to think about that question. Drawing on recent research with colleagues, I propose a framework that breaks down jobs into their smallest units — skills — and then divides each skill into two distinct components: decision-making and execution. This distinction, I argue, is essential to understanding where humans still matter, how AI is reshaping labor, and what we should be measuring, teaching, and building toward.
If we can’t say what a job is, we can’t reason about what AI is replacing.
The Wrong Debate
Much of the public debate still hinges on a narrow question: Will AI replace us, or will it help us? But this framing misses the deeper structural shift underway. It treats jobs as flat, undifferentiated collections of tasks, as if replacing a few steps is equivalent to automating the whole. In reality, most jobs are complex systems of judgment, adaptation, and execution. And not all parts of that system are equally exposed to AI. If we want to understand how work is changing — and what skills will remain valuable — we need a more granular lens.
Ada’s Shift
Consider Ada, a mid-level software engineer. Just a few years ago, her day revolved around writing code, debugging features, documenting functions, and reviewing pull requests. Today, GitHub Copilot writes much of the scaffolding. GPT drafts her documentation. Internal models flag bugs before she sees them. Her technical execution has been accelerated — even outsourced — by AI. But her job hasn’t disappeared. It has shifted.
What Ada does has changed. What Ada is responsible for has not.
What matters now is whether Ada can decide what to build, why it matters, how to navigate tradeoffs that aren’t in the training data — and crucially, whether the result is even correct. Her value no longer lies in execution — it lies in judgment and verification. This isn’t just a shift in what she does — it’s a shift in what makes her valuable. Execution can be delegated. Judgment cannot.
The Real Divide
Ada’s story isn’t just a sign of change—it’s a window into the deeper structure of modern work. In our research, we argue that to understand how AI is reshaping labor, we need to go beyond jobs and tasks, and examine the composition of skills themselves:
Decision-level skills: framing problems, evaluating goals and tradeoffs, and deciding what to do and why
Action-level skills: executing plans through tools, language, or procedures
This distinction is foundational to our framework. It allows us to model worker capabilities and job demands more precisely. And once you see the split, it’s everywhere.
A doctor uses AI to flag anomalies in a scan — but must decide when to intervene, and why.
An analyst uses GPT to draft a report — but must decide which story the data tells.
A teacher uses AI to generate exercises — but must decide how to adapt them for a struggling student.
A programmer uses Copilot to write code — but must decide what to build, how to build it, and in what language, architecture, or paradigm.
AI acts. Humans decide. That’s the frontier.
This isn’t a cosmetic difference. It’s a deep structural boundary — one that shapes who adds value, who adapts, and who gets displaced.
The distinction between decision and action is becoming the new fault line in labor.
A New Lens on Success
This structure isn’t just descriptive—it makes success measurable. We model each worker using an ability profile that captures their expected strength and variability at each subskill. Given a job’s subskill demands, we can compute the probability of success—a quantitative match between a worker’s capabilities and a job’s needs. This also lets us explore how training, support, or AI tools can shift that success rate—and identify where small changes make a big difference.
One of our key findings is that decision-level skills often act as a bottleneck: small improvements can trigger phase transitions, where success probability jumps sharply rather than smoothly. A little more judgment can be worth far more than a lot more action.
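A toy Monte Carlo sketch (not the model in the paper, just an illustration with made-up numbers) shows the flavor of that phase transition: when success requires both the decision and the action component to clear a demand threshold, a small bump in decision skill near the threshold moves the success probability far more than the same bump elsewhere.

import numpy as np

rng = np.random.default_rng(0)

def success_probability(decision_mean, action_mean, threshold=1.0, spread=0.3, trials=50_000):
    """Succeed only if both components clear the job's demand threshold."""
    decisions = rng.normal(decision_mean, spread, trials)
    actions = rng.normal(action_mean, spread, trials)
    return float(np.mean((decisions >= threshold) & (actions >= threshold)))

for mu in (0.4, 0.8, 1.0, 1.2):
    print(mu, round(success_probability(mu, action_mean=1.3), 3))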
We also show how merging two workers—or pairing a worker with an AI system—can significantly boost job success by combining complementary skills. This framework yields four canonical pairings—each capturing a mode of human or hybrid productivity:
Strong decision + strong action: the ideal worker, high success probability.
Strong decision + weak action: can be aided by AI to improve execution.
Weak decision + strong action: needs human or institutional support to frame and evaluate problems.
Weak decision + weak action: unlikely to succeed without extensive support or retraining.
This pairing framework also models human-AI collaboration. AI tools excel in action-level tasks, reducing execution noise. But they cannot compensate for poor decisions. Conversely, humans with decision strength but noisy action skills can leverage AI to execute more reliably.
AI can make your actions sharper. But it takes another human—or a better-trained self—to make your decisions wiser.
Rethinking Upskilling
Most upskilling initiatives focus on teaching people how to use tools — how to code in Python, use Excel, write better prompts. But these are action-level skills. They train people to do — not to decide. And as AI becomes more fluent in these domains, such training risks becoming obsolete even faster.
What our model shows is that the most durable leverage comes from strengthening decision-level abilities: learning how to formulate the right problem, evaluate competing goals, and adapt strategy under uncertainty. These skills are harder to teach, harder to measure — but they’re also harder to replace.
Reskilling should not mean trading one automatable task for another. It should mean elevating workers into roles where human judgment is indispensable — and building the scaffolding to support that transition.
The goal isn’t to out-code AI. It’s to out-decide it.
But understanding skill composition isn’t just useful for support or upskilling—it changes how we identify and select talent in the first place.
Beyond the Average: Hiring for Complements, Not Proxies
Traditional hiring often relies on blunt proxies—credentials, test scores, résumé keywords—that collapse a person’s skill into a single metric. But real talent is rarely uniform. Someone may struggle with polish but excel at solving hard problems. Another may be smooth in execution but shallow in judgment. Conventional systems force institutions to average across these traits—hiring for perceived overall competence rather than true fit.
Our framework breaks that bind. By distinguishing decision-level and action-level abilities, and modeling how they combine, we can identify complementary strengths—either across people or in partnership with AI. This makes it possible to spot a high-judgment candidate with messy execution and pair them with tools that stabilize output. Or to recognize an efficient executor who thrives when decisions are pre-structured by a manager or teammate.
You no longer have to bet on the average. You can hire for the right complement.
This shift doesn’t just improve hiring. It makes evaluation more precise, support more targeted, and teams more balanced. It also disrupts the performative incentives of current systems, where polish often trumps insight, and fluency overshadows depth. If we can build systems that recognize decision strength—even in the absence of perfect execution—we can unlock talent that today gets overlooked.
Designing for the Future of Work
Ada’s story is only the beginning. Her challenge—learning how to decide, not just how to do—is quickly becoming the central challenge for institutions.
The AI wave isn’t just increasing efficiency. It’s reshaping the anatomy of work, separating action from decision, and forcing us to ask: what remains uniquely human?
Economist Daron Acemoglu argues that the most dangerous trend in recent technological history isn’t automation itself—it’s the way we’ve used automation to displace workers rather than augment them. Over the past few decades, many technologies have replaced mid-skill jobs without meaningfully improving productivity. The result: wage stagnation, rising inequality, and a polarized labor market. Acemoglu’s call is clear—we need to redirect innovation toward task-augmenting technologies that enhance human capability rather than erode it.
Our framework offers a concrete way to act on that vision. By distinguishing between decision- and action-level skills, and modeling how they interact with AI, we can design technologies—and institutions—that truly support human strengths.
But the real challenge is not theoretical. It’s institutional.
If we continue to train, hire, and evaluate people based on action-level outputs, we will misread talent, mistrain workers, and misdesign systems. Worse, we will cede the future of work to automation by default—not because AI is more capable, but because we forgot how to measure what humans uniquely contribute.
Our model not only reframes how we think about work, but also offers a practical tool for institutions. By enabling fine-grained evaluations along decision- and action-level dimensions, it allows for more accurate assessments—not just in hiring, but also in upskilling, promotions, task allocation, and even layoff decisions. Instead of collapsing a worker’s abilities into a single metric, we can now ask: where does their judgment shine? Where is support or augmentation most effective?
We need tools, metrics, and institutions that can recognize decision-level excellence—not just output volume. Otherwise, we’ll keep mistaking speed for insight, and productivity for progress. That means rethinking how we evaluate students, support workers, and guide innovation. It means building systems that value judgment, verification, ethical reasoning, and strategic adaptation—the things that don’t show up in a prompt but define good decisions.
The question is no longer whether humans have a role. The question is whether we’re designing for it.
This is the work ahead. And we’ll need to be as thoughtful about decisions as we’ve been clever with tools.
This essay draws on a forthcoming paper, A Mathematical Framework for AI-Human Integration in Work, by L. Elisa Celis, Lingxiao Huang, and Nisheeth K. Vishnoi, to appear at ICML 2025. You can read the full paper here: https://arxiv.org/abs/2505.23432
Many current and near-future applications of quantum computing utilise
parametric families of quantum circuits and variational methods to find optimal
values for these parameters. Solving a quantum computational problem with such
variational methods relies on minimising some cost function, e.g., the energy
of a physical system. As such, this is similar to the training process in
machine learning and variational quantum simulations can therefore suffer from
similar problems encountered in machine learning training. This includes
non-convergence to the global minimum due to local minima as well as critical
slowing down. In this article, we analyse the imaginary time evolution as a
means of compiling parametric quantum circuits and finding optimal parameters,
and show that it guarantees convergence to the global minimum without critical
slowing down. We also show that the compilation process, including the task of
finding optimal parameters, can be performed efficiently up to an arbitrary
error threshold if the underlying physical system is of bounded order. This
includes many relevant computational problems, e.g., local physical theories
and combinatorial optimisation problems such as the flight-to-gate assignment
problem. In particular, we show a priori estimates on the success probability
for these combinatorial optimisation problems. There seem to be no known
classical methods with similar efficiency and convergence guarantees. Meanwhile
the imaginary time evolution method can be implemented on current quantum
computers.
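The mechanism behind the convergence claim is easy to see numerically: repeatedly applying $e^{-H\,\delta\tau}$ and renormalising drives any state with nonzero ground-state overlap to the ground state, with no local minima to get stuck in. Below is a minimal classical sketch on a toy two-qubit Hamiltonian of our own choosing, not the paper's circuit-compilation scheme.

import numpy as np
from scipy.linalg import expm

Z = np.diag([1.0, -1.0]); X = np.array([[0.0, 1.0], [1.0, 0.0]]); I = np.eye(2)
H = np.kron(Z, Z) + 0.5 * (np.kron(X, I) + np.kron(I, X))   # toy 2-qubit Hamiltonian

psi = np.ones(4) / 2.0            # any state with nonzero ground-state overlap works
step = expm(-0.1 * H)             # one imaginary-time step e^{-H dtau}
for _ in range(300):
    psi = step @ psi
    psi /= np.linalg.norm(psi)    # renormalise after each non-unitary step

print(psi @ H @ psi, np.linalg.eigvalsh(H)[0])   # energy converges to the ground-state energy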
An evaluator is trustworthy when there exists some agreed-upon way to measure
its performance as a labeller. The two ways to establish trustworthiness are
either by testing it, or by assuming the evaluator `knows' somehow the way to
label the corpus. However, if labelled references (e.g., a development set) are
unavailable, neither of these approaches works: the former requires the data,
and the latter is an assumption, not evidence. To address this, we introduce an
algorithm (the `No-Data Algorithm') by which to establish trust in an evaluator
without any existing references. Our algorithm works by successively posing
challenges to said evaluator. We show that this is sufficient to establish
trustworthiness w.h.p., in such a way that when the evaluator actually knows
the way to label the corpus, the No-Data Algorithm accepts its output; and,
conversely, flags untrustworthy evaluators when these are unable to prove it.
We present formal proofs of correctness and limited experiments.
A litany of theoretical and numerical results have established the
sketch-and-precondition paradigm as a powerful approach to solving large linear
regression problems in standard computing environments. Perhaps surprisingly,
much less work has been done on understanding how sketch-and-precondition
performs on graphics processing unit (GPU) systems. We address this gap by
benchmarking an implementation of sketch-and-precondition based on sparse
sign-sketches on single and multi-GPU systems. In doing so, we describe a
novel, easily parallelized, rejection-sampling based method for generating
sparse sign sketches. Our approach, which is particularly well-suited for GPUs,
is easily adapted to a variety of computing environments. Taken as a whole, our
numerical experiments indicate that sketch-and-precondition with sparse sign
sketches is particularly well-suited for GPUs, and may be suitable for use in
black-box least-squares solvers.
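A compact CPU-only sketch of the sketch-and-precondition idea with a sparse sign sketch (the dimensions, the per-column sampling loop, and the LSQR tolerances are illustrative choices; the paper's GPU implementation and its rejection-sampling generator are more involved, and a real solver would apply $R^{-1}$ implicitly rather than forming $AR^{-1}$):

import numpy as np
from scipy.linalg import qr
from scipy.sparse.linalg import lsqr

rng = np.random.default_rng(0)

def sparse_sign_sketch(d, n, nnz_per_col=8):
    """d x n sketching matrix with nnz_per_col random +-1/sqrt(nnz) entries per column."""
    S = np.zeros((d, n))
    for j in range(n):
        rows = rng.choice(d, size=nnz_per_col, replace=False)
        S[rows, j] = rng.choice([-1.0, 1.0], size=nnz_per_col) / np.sqrt(nnz_per_col)
    return S

n, p = 10000, 50                        # overdetermined least squares: min ||Ax - b||
A = rng.standard_normal((n, p))
b = rng.standard_normal(n)

S = sparse_sign_sketch(4 * p, n)        # sketch down to 4p rows
_, R = qr(S @ A, mode='economic')       # R from the sketched matrix is the preconditioner
AR = A @ np.linalg.solve(R, np.eye(p))  # right-preconditioned matrix A R^{-1} (well conditioned)
y = lsqr(AR, b, atol=1e-12, btol=1e-12)[0]
x = np.linalg.solve(R, y)
print(np.linalg.norm(A.T @ (A @ x - b)))   # ~0 at the least-squares solution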
The theta function of Lovasz is a graph parameter that can be computed up to
arbitrary precision in polynomial time. It plays a key role in algorithms that
approximate graph parameters such as maximum independent set, maximum clique
and chromatic number, or even compute them exactly in some models of random and
semi-random graphs. For Erdos-Renyi random $G_{n,1/2}$ graphs, the expected
value of the theta function is known to be at most $2\sqrt{n}$ and at least
$\sqrt{n}$. These bounds have not been improved in over 40 years.
In this work, we introduce a new class of polynomial time computable graph
parameters, where every parameter in this class is an upper bound on the theta
function. We also present heuristic arguments for determining the expected
values of parameters from this class in random graphs. The values suggested by
these heuristic arguments are in agreement with results that we obtain
experimentally, by sampling graphs at random and computing the value of the
respective parameter. Based on parameters from this new class, we feel safe in
conjecturing that for $G_{n,1/2}$, the expected value of the theta function is
below $1.55 \sqrt{n}$. Our paper falls short of rigorously proving such an
upper bound, because our analysis makes use of unproven assumptions.
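For concreteness, the classical theta SDP (not the new class of parameters introduced in this work) can be computed for small graphs in a few lines of CVXPY; on the 5-cycle it returns $\sqrt{5}$.

import cvxpy as cp
import networkx as nx
import numpy as np

# theta(G) = max sum_ij X_ij  s.t.  X is PSD, trace(X) = 1, X_ij = 0 for every edge ij
G = nx.cycle_graph(5)
n = G.number_of_nodes()
X = cp.Variable((n, n), symmetric=True)
constraints = [X >> 0, cp.trace(X) == 1] + [X[i, j] == 0 for i, j in G.edges]
theta = cp.Problem(cp.Maximize(cp.sum(X)), constraints).solve()
print(theta, np.sqrt(5))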
Authors: Bastien Auvray, Julien David, Richard Groult, Thierry Lecroq
In this paper, we introduce the notion of Cartesian Forest, which generalizes
Cartesian Trees, in order to deal with partially ordered sequences. We show
that algorithms that solve both exact and approximate Cartesian Tree Matching
can be adapted to solve Cartesian Forest Matching in average linear time. We
adapt the notion of Cartesian Tree Signature to Cartesian Forests and show how
filters can be used to experimentally improve the algorithm for the exact
matching. We also show a one-to-one correspondence between Cartesian Forests
and Schr\"oder Trees.
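For readers unfamiliar with the underlying objects, here is the standard linear-time stack construction of a (min-)Cartesian tree, on which Cartesian tree matching, and by extension the forest variant, is built; the input sequence is an arbitrary example.

def cartesian_tree_parents(seq):
    """Parent index of each position in the min-Cartesian tree of seq (root: -1)."""
    parent = [-1] * len(seq)
    stack = []                                   # indices whose values increase
    for i, v in enumerate(seq):
        last = -1
        while stack and seq[stack[-1]] > v:
            last = stack.pop()                   # pop larger values; the last popped becomes the left child
        if stack:
            parent[i] = stack[-1]
        if last != -1:
            parent[last] = i
        stack.append(i)
    return parent

print(cartesian_tree_parents([3, 1, 4, 1, 5, 9, 2, 6]))   # [1, -1, 3, 1, 6, 4, 3, 6]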
This paper investigates the role of mediators in Bayesian games by examining
their impact on social welfare through the price of anarchy (PoA) and price of
stability (PoS). Mediators can communicate with players to guide them toward
equilibria of varying quality, and different communication protocols lead to a
variety of equilibrium concepts collectively known as Bayes (coarse) correlated
equilibria. To analyze these equilibrium concepts, we consider a general class
of Bayesian games with submodular social welfare, which naturally extends valid
utility games and their variant, basic utility games. These frameworks,
introduced by Vetta (2002), have been developed to analyze the social welfare
guarantees of equilibria in games such as competitive facility location,
influence maximization, and other resource allocation problems.
We provide upper and lower bounds on the PoA and PoS for a broad class of
Bayes (coarse) correlated equilibria. Central to our analysis is the strategy
representability gap, which measures the multiplicative gap between the optimal
social welfare achievable with and without knowledge of other players' types.
For monotone submodular social welfare functions, we show that this gap is
$1-1/\mathrm{e}$ for independent priors and $\Theta(1/\sqrt{n})$ for correlated
priors, where $n$ is the number of players. These bounds directly lead to upper
and lower bounds on the PoA and PoS for various equilibrium concepts, while we
also derive improved bounds for specific concepts by developing smoothness
arguments. Notably, we identify a fundamental gap in the PoA and PoS across
different classes of Bayes correlated equilibria, highlighting essential
distinctions among these concepts.
Authors: J. A. Alejandro-Soto, Joel Antonio Trejo-Sanchez, Carlos Segura
This work proposes \textsc{H-Td}, a practical linear-time algorithm for
computing an optimal-width tree decomposition of Halin graphs. Unlike
state-of-the-art methods based on reduction rules or separators, \textsc{H-Td}
exploits the structural properties of Halin graphs. Although two theoretical
linear-time algorithms exist that can be applied to graphs of treewidth three,
no practical implementation has been made publicly available. Furthermore,
extending reduction-based approaches to partial $k$-trees with $k > 3$ results
in increasingly complex rules that are challenging to implement. This motivates
the exploration of alternative strategies that leverage structural insights
specific to certain graph classes. Experimental validation against the winners
of the Parameterized Algorithms and Computational Experiments Challenge (PACE)
2017 and the treewidth library \texttt{libtw} demonstrates the advantage of
\textsc{H-Td} when the input is known to be a Halin graph.
Authors: Aleix Boquet-Pujadas, Pol del Aguila Pla, Michael Unser
We formulate an optimization problem to estimate probability densities in the
context of multidimensional problems that are sampled with uneven probability.
It considers detector sensitivity as a heterogeneous density and takes
advantage of the computational speed and flexible boundary conditions offered
by splines on a grid. We choose to regularize the Hessian of the spline via the
nuclear norm to promote sparsity. As a result, the method is spatially adaptive
and stable against the choice of the regularization parameter, which plays the
role of the bandwidth. We test our computational pipeline on standard densities
and provide software. We also present a new approach to PET rebinning as an
application of our framework.
We study the problem of learning the optimal item pricing for a unit-demand
buyer with independent item values, and the learner has query access to the
buyer's value distributions. We consider two common query models in the
literature: the sample access model where the learner can obtain a sample of
each item value, and the pricing query model where the learner can set a price
for an item and obtain a binary signal on whether the sampled value of the item
is greater than the proposed price. In this work, we give nearly tight bounds
on the sample complexity and the pricing query complexity of the unit-demand
pricing problem.
Authors: Eden Hartman, Dinesh Kumar Baghel, Erel Segal-Halevi
In many parts of the world - particularly in developing countries - the
demand for electricity exceeds the available supply. In such cases, it is
impossible to provide electricity to all households simultaneously. This raises
a fundamental question: how should electricity be allocated fairly? In this
paper, we explore this question through the lens of egalitarianism - a
principle that emphasizes equality by prioritizing the welfare of the worst-off
households. One natural rule that aligns with this principle is to maximize the
egalitarian welfare - the smallest utility across all households. We show that
computing such an allocation is NP-hard, even under strong simplifying
assumptions. Leximin is a stronger fairness notion that generalizes the
egalitarian welfare: it also requires to maximize the smallest utility, but
then, subject to that, the second-smallest, then the third, and so on. The
hardness result extends directly to leximin as well. Despite this, we present
a Fully Polynomial-Time Approximation Scheme (FPTAS) for leximin in the special
case where the network connectivity graph is a tree. This means that we can
efficiently approximate leximin - and, in particular, the egalitarian welfare -
to any desired level of accuracy.
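As a concrete reading of the definitions (the allocation computation itself aside), the egalitarian welfare of a utility vector is its minimum, and leximin compares two vectors by their sorted utilities lexicographically; the vectors below are arbitrary examples.

def egalitarian_welfare(utilities):
    return min(utilities)

def leximin_prefers(u, v):
    """True if utility vector u is leximin-better than v: compare sorted
    utilities (smallest first) lexicographically."""
    return tuple(sorted(u)) > tuple(sorted(v))

u, v = [2, 5, 7], [2, 3, 9]
print(egalitarian_welfare(u), egalitarian_welfare(v))   # 2 2: same worst-off household
print(leximin_prefers(u, v))                            # True: larger second-smallest utility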
Last week’s blogging and ensuing feedback storm got so many issues swirling in my head that I found myself writing in ten different directions. To address all the issues raised, I needed a logical starting place. I needed to begin at the root of all of academia’s problems: peer review.1
I’ve written here on the blog and elsewhere about my disdain for the peer review system we use to evaluate papers, but a conversation with Kevin Munger earlier this year convinced me I needed far more nuance. Because academia does not exist without peer review.
To explain what I mean, let me distinguish between hierarchical evaluation2 and peer review. In academia, hierarchical evaluation starts with graduate school admissions, which grants access to the first floor of the ivory tower. Graduate students are admitted by the faculty to the departments to which they apply. In the PhD defense, a committee of faculty decides whether a candidate gets blessed as an expert of academic knowledge. They learn the secret handshake if they pass. A step beyond, and far more competitive, there’s “faculty recruiting,” the euphemism used for the bizarre, unprincipled practices where professors decide who gets hired as new professors. There are junior faculty investigator awards, where older professors recognize those who are exemplary in their early professorial careers. Then there’s tenure review, where a department of professors chooses whether or not it wants to keep a professor on its staff for the next half century. There is promotion to “full professor.” No one knows why we do this. Then elections into the National Academy. And so on and so on.
In each of these cases, younger people are evaluated by people of higher rank, ostensibly based on merit. Each stage involves reams of recommendation letters, which always extol the singular genius of their subjects. Older individuals choose who remains in the system, thereby creating a narrowing circle of authority. The review is not “peer” because the hierarchy and power asymmetries are so evident.
By contrast, peer review ostensibly cuts across this hierarchy. The most famous form of peer review is the one I mentioned above, which is the system used to decide what gets published in academic venues. Editors of journals send a manuscript to three people they believe to have relevant expertise, and beg them to carefully read the paper and write a detailed report on whether it should be published. This work is uncompensated. Based on these reviews, the editor makes a publication decision and sends the author the reports as “feedback.” Since there is a potential asymmetry of power, with juniors evaluating seniors, the identity of the reviewers is kept confidential from the authors. For some reason, this sort of peer review has to be anonymous, even though in the hierarchical review examples, the evaluated knows every department or professor that’s evaluating them.
I’ve previously criticized the system of blind pre-publication manuscript peer review, but I don’t want to dwell on that today. Because it’s only one of the many applications of peer review in the elaborate, confusing mechanisms for building academic knowledge. We also use peer review systems to review grant proposals. Again, in mostly uncompensated work, peers pore over hundreds of pages of grant proposals to decide who gets a share of an ever-dwindling pot of money. More broadly, there’s peer review with every choice we make in our academic career. Whose papers do we cite? Whose open problems should we work on? Who should we invite to give talks? Who do we flame on Twitter? This is all peer review too.
Academia is built upon a foundation of review. It is messy, decentralized collective decision making that selects a group of people and their writing into a canon of thought. That thought is what we call academic expertise. The academic canon is an artifice entirely constructed out of hierarchical evaluation and peer review. It only exists because we say it exists. Academia is a giant social system of credentialized credentialing.
“Should it be the scientists themselves? Mark argues that ‘Researchers serve on review panels for grants; they argue against grants being awarded that would not (in their professional opinion) yield good science.’ I disagree with Mark’s assessment of peer review. Our prestige systems fail to determine what is good and bad all the time. Peer review completely fails to improve science and can’t even reliably detect fraud. At this point, it’s clear most scientists don't work hard at paper reviews, and LLM reviews are rapidly on the rise. Grant panels are frequently plagued by old guys yelling at the clouds that forgot to cite their work. Scientists are prone to herding around hype. That science trudges forward despite all of the apparent ‘irrationality’ is what makes it such a remarkable human endeavor.”
I still agree with this, but want to emphasize its nuance. Academia is by no means a perfect system of deciding what is “good” and “bad.” Although it stumbles and makes many mistakes, it is often quite good at discovering knowledge that the rest of society deems important too.
And the peer review system is all we have. It has its warts, but academic knowledge canonization needs some systems for us to be counted, evaluated, promoted, awarded, and canonized. Academia is a social system after all, and social systems love to classify and rank.
These social systems are daunting to change. Individual faculty can be agile, creative forces, but in the aggregate, academia is the most conservative, turgid mess. Somehow, together, that can work well, because we occasionally stumble upon world-changing breakthroughs that shape how we see the world. Peer review might not be all we need, but it’s all we have. Our solutions for any future models of academia start there.
In machine learning, we now let undergrads evaluate full professors. Just like in teaching evaluations. A truly egalitarian system of knowledge curation.
The Theory community has a tradition of a crowdsourced spreadsheet of sharing who has accepted which jobs, previously hosted by Grigory Yaroslavtsev and Lance Fortnow.
Past years have been slightly chaotic, due to the sheet being an anonymous free-for-all in terms of edits. This year, you can report jobs accepted via this form (which is still anonymous), and view the results on this sheet. If you feel a change needs to be made to the results reported in the form, please email me or fill out this other form (also anonymous, unless you choose to identify yourself).
The rules, from past years, are as follows:
You are welcome to add yourself, or people your department has hired.
Hires should be connected to theoretical computer science, broadly defined.
Only add jobs that you are absolutely sure have been offered and accepted. This is not the place for speculation and rumors. Please be particularly careful when adding senior hires (people who already have an academic or industrial job) — end dates of their current positions might still be in the future.
To keep this list focused around theory (and particularly theoretical computer science), mild curation will be applied: individuals should have publications at theory-centric conferences such as STOC, FOCS, SODA, ICALP, STACS, COLT, CRYPTO, CCC, ITCS, etc. If you don’t meet this criterion, but still identify as a theorist/theoretician (at least partially), please email me (ideally, with a sentence or two of explanation) at g@csail.mit.edu and I’ll include you, no further questions asked.
Ethereum Foundation talk, today
This afternoon (Tuesday, June 3, 2025) at 17:00 Israel time I give a Zoom lecture on A Critical View on Quantum Computing. The lecture is hosted by the Ethereum Foundation, and the 90-minute event will include questions/answers/conversation with the Ethereum participants. The talk will be streamed on YouTube here: https://youtube.com/live/HhWWkTkaXSI?feature=share . (It will be uploaded to YouTube as well.)
I plan to devote the first 35-40 minutes to the opening slides of this presentation. The presentation (click to open) contains five additional parts, and we may zoom in on one or two parts at the request of the participants.
Part II: Background: Boolean circuits; quantum circuits; efficient computation
Part III: The Argument Against Quantum Computers;
Part IV: Noise Stability and sensitivity for boson sampling;
Part V: The failure of quantum computation – underlying principles and consequences;
Part VI: The Analysis of Google’s 2019 experiment and other QC experiments
Later at 20:00 I plan to be with Eran, my son-in-law, and Ilan and Yoav, my grandchildren, at a basketball game between Hapo’el Tel Aviv and Hapo’el Jerusalem; my grandchildren taught me some (mild) required songs (and advised a red T-shirt).
A geometry day tomorrow honoring Micha Sharir
Tomorrow (June 4, 2025) there will be a geometry day in honor of Micha Sharir’s 75th birthday. And congratulations, Micha, for being named the 2025 Knuth Prize recipient. Here is the agenda:
· 10am: Rom Pinchasi – Odd area and odd volume (Micha Sharir is 75)
· 11am: Danny Halperin – The Piano Movers at 40: Geometry for Robotics
· Noon: Gil Kalai – Helly type theorems and problems
· 2:30pm: A special session that includes personal addresses, gifts, videos, and more. This event will also be on Zoom.
A pseudorandom code is a keyed error-correction scheme with the property that
any polynomial number of encodings appear random to any computationally bounded
adversary. We show that the pseudorandomness of any code tolerating a constant
rate of random errors cannot be based on black-box reductions to almost any
generic cryptographic primitive: for instance, anything that can be built from
random oracles, generic multilinear groups, and virtual black-box obfuscation.
Our result is optimal, as Ghentiyala and Guruswami (2024) observed that
pseudorandom codes tolerating any sub-constant rate of random errors exist
using a black-box reduction from one-way functions.
The key technical ingredient in our proof is the hypercontractivity theorem
for Boolean functions, which we use to prove our impossibility in the random
oracle model. It turns out that this easily extends to an impossibility in the
presence of ``crypto oracles,'' a notion recently introduced -- and shown to be
capable of implementing all the primitives mentioned above -- by Lin, Mook, and
Wichs (EUROCRYPT 2025).
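(An illustrative aside, not from the paper: the toy Python sketch below only shows the syntax of a keyed, randomized error-correcting scheme that tolerates a constant rate of random bit flips. The names keygen/encode/decode and the repetition-code-plus-hashed-mask construction are hypothetical; the scheme is neither pseudorandom in the paper's sense nor cryptographically secure.)

```python
# Toy keyed error-correcting scheme: a repetition code for error tolerance,
# XOR-masked with a keystream derived from (key, nonce) via SHA-256.
# This illustrates only the interface of a pseudorandom code, not its
# pseudorandomness or any security property.
import hashlib, os, random

def keygen():
    return os.urandom(16)

def _mask(key, nonce, nbits):
    # Expand (key, nonce) into a bit mask by hashing with a counter.
    bits, ctr = [], 0
    while len(bits) < nbits:
        digest = hashlib.sha256(key + nonce + ctr.to_bytes(4, "big")).digest()
        bits.extend((byte >> i) & 1 for byte in digest for i in range(8))
        ctr += 1
    return bits[:nbits]

def encode(key, message_bits, reps=7):
    nonce = os.urandom(8)
    repeated = [b for b in message_bits for _ in range(reps)]
    mask = _mask(key, nonce, len(repeated))
    return nonce, [b ^ m for b, m in zip(repeated, mask)]

def decode(key, nonce, codeword, reps=7):
    mask = _mask(key, nonce, len(codeword))
    unmasked = [c ^ m for c, m in zip(codeword, mask)]
    # Majority vote within each block of `reps` copies of a message bit.
    return [int(sum(unmasked[i:i + reps]) * 2 > reps)
            for i in range(0, len(unmasked), reps)]

key = keygen()
msg = [1, 0, 1, 1]
nonce, cw = encode(key, msg)
noisy = [b ^ int(random.random() < 0.1) for b in cw]   # ~10% random bit flips
print(decode(key, nonce, noisy))   # recovers [1, 0, 1, 1] with high probability
```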
In ethics, individual responsibility is often defined through Frankfurt's
principle of alternative possibilities. This definition is not adequate in a
group decision-making setting because it often results in the lack of a
responsible party or "responsibility gap". One of the existing approaches to
address this problem is to consider group responsibility. Another, recently
proposed, approach is "higher-order" responsibility. The paper considers the
problem of deciding if higher-order responsibility up to degree $d$ is enough
to close the responsibility gap. The main technical result is that this problem
is $\Pi_{2d+1}$-complete.
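(A toy illustration, not the paper's formal model: under Frankfurt's principle of alternative possibilities, an agent is responsible for a bad outcome only if a unilateral deviation by that agent would have prevented it. The hypothetical majority-vote example below exhibits the responsibility gap: when all three agents vote for the bad outcome, no single agent is pivotal, so no one is individually responsible.)

```python
# Responsibility gap under the principle of alternative possibilities:
# the bad outcome occurs if at least two of three agents vote "yes".

def outcome(votes):
    """True means the bad outcome occurred (a majority voted 'yes')."""
    return sum(votes) >= 2

def individually_responsible(votes, i):
    """Agent i is responsible iff some unilateral deviation avoids the bad outcome."""
    if not outcome(votes):
        return False
    for alt in (0, 1):
        deviated = list(votes)
        deviated[i] = alt
        if not outcome(tuple(deviated)):
            return True
    return False

profile = (1, 1, 1)   # everyone votes "yes"
print(outcome(profile))                                          # True
print([individually_responsible(profile, i) for i in range(3)])
# [False, False, False]: no one is individually responsible, a responsibility gap
```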
Many problems in Euclidean geometry, arising in computational design and
fabrication, amount to a system of constraints, which is challenging to solve.
We suggest a new general approach to the solution, which is to start with
analogous problems in isotropic geometry. Isotropic geometry can be viewed as a
structure-preserving simplification of Euclidean geometry. The solutions found
in the isotropic case give insight and can initialize optimization algorithms
to solve the original Euclidean problems. We illustrate this general approach
with three examples: quad-mesh mechanisms, composite asymptotic-geodesic
gridshells, and asymptotic gridshells with constant node angle.
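(A minimal sketch, not from the paper: in the standard model of isotropic 3-space, distance is measured only in the top view, so differences in the isotropic (z) direction do not contribute. The toy functions below illustrate this structure-preserving degeneration of the Euclidean metric.)

```python
# Euclidean distance versus its isotropic counterpart, which ignores the
# isotropic (z) direction and measures only the top-view distance.
import math

def euclidean_dist(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def isotropic_dist(p, q):
    # Only the x- and y-coordinates enter the isotropic metric.
    return math.sqrt((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2)

p, q = (0.0, 0.0, 0.0), (3.0, 4.0, 12.0)
print(euclidean_dist(p, q))   # 13.0
print(isotropic_dist(p, q))   # 5.0: the z-difference is "invisible" isotropically
```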