OPML feed of all feeds.
Subscribe to the Atom feed, RSS feed to stay up to date.
Thank you to arXiv for use of its open access interoperability.
Note: the date of arXiv entries announced right after publication holidays might incorrectly show up as the date of the publication holiday itself. This is due to our ad hoc method of inferring announcement dates, which are not returned by the arXiv API.
Powered by Pluto.
Source on GitHub.
Maintained by Nima Anari, Arnab Bhattacharyya, Gautam Kamath.
from ECCC Papers
from Ben Recht
You might have noticed a pattern in last week’s blog history of the megatrials in cardiology. All the breakthrough trials I covered were essentially drug trials of anticoagulant interventions like aspirin, streptokinase, and heparin. This, of course, wasn’t just a coincidence.
Though randomized trials are often pitched as the ultimate arbiters of causality, you need to have a good reason to do one. You can’t (and don’t) just propose a random trial with a control group. Not only must there be considerable uncertainty about the outcome to get a trial approved, but there must also be some reason to do the trial in the first place. There has to be some intuition that leads some investigator to propose some intervention, and they have to have enough faith in that intervention to devote massive resources to proving it’s efficacious. Though randomized trials sit far above expert opinion in the hierarchy of clinical evidence, expert opinion necessarily must still drive the operation.
So what was the core hypothesis driving the series of anti-coagulant megatrials? Whether clots were a primary cause of heart attacks was a hot debate in cardiology well into the 1980s. The prominent theory for much of the 20th century posited that it was arterial hardening and valve narrowing that caused heart attacks. In this model, blood clots were caused by the heart attack and not the other way around. This felt consistent with the evidence, as most of those who died from heart attacks didn’t have blood clots in their autopsies.
It wasn’t until 1980 that a paradigm-shifting study showed that blocked arteries were overwhelmingly common in the early stages of heart attacks. The investigators corroborated earlier findings that the clots were less prevalent the following day. This suggested that some biological mechanism was naturally breaking down the clots during heart attacks, and suggested a mode of attack for treatment. Cardiologists spent a decade finding new and creative ways to break down blood clots in the heart and intervene as early as possible, leading to marked improvements in survival.
The large pragmatic drug trials were important for building practice. They did find a fairly robust suite of therapies for treating heart attacks. Could an individual cardiologist have determined a better regimen? Possibly. But the megatrial infrastructure of cardiology trials proved that, for the right clinical hypothesis, professional societies of trialists could maximize population outcomes to achieve a reasonable standard of care with 3-6 times lower mortality than the prior one.
Now, I also mentioned a trial that wasn’t about coagulation: CAST. As a reminder, the CAST trial studied the value of drugs that quelled irregular heartbeats in heart attack patients. The trial found that these therapies led to a major increase in mortality, and it ended the practice. But why were doctors convinced that this was a good idea in the first place?
I suppose it might sound reasonable to make a sick heart look “healthier” by messing with its rhythm. But what’s the model behind why that would be a good idea? I’m sure there is some grand rounds somewhere that lays out the full hypothesis, but I couldn’t find one, and the literature associated with the CAST trial doesn’t provide it. Instead, the trial report cites a bunch of correlational studies showing that arrhythmia is correlated with mortality in heart attack patients. The main driving hypothesis seems to be this statistical correlation. Expert opinion on anti-arrhythmia drugs was decidedly mixed before CAST, and that’s part of the reason the trial went forward.
It’s worth dwelling on one remarkable feature of the CAST trial. The investigators were allowed to compare against a placebo. Without the placebo arm, the bad practice of treating arrhythmias would not have been ended. It remains a heated debate in medical ethics as to when it’s acceptable to compare an intervention to a placebo rather than to the standard of care. There is a strong case for placebos. A cascade of RCTs has a great deal of path dependence. Every trial slightly adjusts the standard of care along a single axis. This is what optimizers call random search, and there is nothing to prevent it from ending up in a disadvantageous local minimum. Thus, trials against the standard of care might keep you in an iatrogenic region of medical policy. It is quite possible that the standard of care needs to be completely reimagined to achieve the next major therapeutic improvement. As evidence, this was exactly what was needed to force cardiology to shift to anticoagulant therapy. CAST remains a classic reminder that it is not just ethical but often imperative to test against placebos.
Though often held up as a poster child for the megatrial, the full history of cardiology looks an awful lot like the rest of science and engineering. There isn’t a single magic bullet that automatically led to improved therapies. There was a mixture of expert opinion and institutional debate. Small studies inspired huge randomized trials. The megatrials were guided by biological plausibility and mechanistic intuition. The RCT was clearly an important part of improving practice, but it was part of a complex curation of expertise, rather than a cut-and-dried gold standard.
However, some people stubbornly maintain that RCTs are the only pure way to get to the truth of what works. This belief led to one of the weirder moves in medical history that’s still with us today: The rise of the postmodern thinking of evidence-based medicine. This is the topic of my next post.
Authors: Yoshiki Kanazawa, Naphan Benchasattabuse, Michal Hajdušek, Rodney Van Meter
We formulate a structure-informed multiple sequence alignment problem, denoted MSA-S. The model abstracts biological sequences as strings and structural information as designated position-pairs. It augments a fixed pairwise string score, defined by a fixed non-gap symbol-pair scoring rule and fixed affine gap penalties, with a binary overlap score on designated position-pairs, which can be interpreted as a contact-map overlap score in structural applications. This yields a fixed-score, integer-valued optimization model suitable for complexity-theoretic analysis. Under this formulation, we show that the decision problem MSA-S-DEC is NP-complete for a broad class of fixed pairwise string scoring schemes. We also show that NP-hardness persists even under the restriction that every designated position-pair set is nonempty and the pair-overlap threshold is strictly positive. For the associated scalarized optimization problem MSA-S-OPT(lambda) with any fixed rational constant lambda >= 1, we further show that, under the canonical unit scheme for the non-gap symbol-pair scoring rule, MSA-S-OPT(lambda) admits no polynomial-time approximation scheme (PTAS) even for two input strings (k = 2), unless P = NP. These results establish a formal complexity-theoretic baseline for structure-informed multiple sequence alignment.Authors: Angelos Charalambidis, Babis Kostopoulos, Panos Rondogiannis
We consider a fragment of Higher-Order Datalog with negation and argue that it generalizes the familiar and important fragment of Linear Datalog. We investigate the expressive power of this fragment, establishing a tight connection with the hierarchy of space complexity classes. In particular, we demonstrate that for all $k \ge 1$, the $(k+1)$-order fragment of Stratified Linear Higher-Order Datalog$^\neg$ captures $(k-1)$-EXPSPACE. This result suggests that restricting programs to linear recursion shifts the expressive power of the corresponding fragments from time to space, generalizing the classical result that (Stratified) Linear Datalog captures NL. Unlike the first-order setting where an ordering assumption is required to capture NL, our results hold without any such assumption on the input database. The proof relies on simulating space-bounded Turing machines using Stratified Linear Higher-Order Datalog$^\neg$ programs and providing a space-efficient evaluation of the query program. We argue that identifying such computationally well-behaved fragments is a crucial step towards paving the way for practical implementations of Higher-Order Datalog.Authors: Hyun-Gee Jei, Caleb J. Armstrong, Farzan Sasangohar
Modern command, control, communications, computers, cyber, intelligence, surveillance, and reconnaissance (C5ISR) environments place substantial attentional demands on mission commanders. Failures in attention allocation in these high-risk settings can have severe operational consequences. This study investigates the efficacy of gaze-driven, attention-guided adaptive decision support tools, including visual-only and multimodal designs, in a high-fidelity simulated military command center. To characterize gaze and attentional dynamics during interaction with these tools, recurrence quantification analysis was applied to eye-tracking data. Stepwise regression using the Bayesian information criterion was then used to identify recurrence-based gaze metrics associated with performance. Results showed that the multimodal adaptive decision support tool was associated with significantly higher performance than the visual-only attention-guided tool. Average diagonal line length showed a negative linear association with performance, whereas entropy showed a positive linear association. Recurrence rate, determinism, and entropy also showed nonlinear quadratic relationships with performance. In particular, recurrence rate and determinism followed an inverted-U pattern consistent with the Yerkes-Dodson law. These findings suggest that effective performance in dynamic C5ISR contexts depends on a balance between structured and flexible visual scanning, and that recurrence-based gaze metrics can help characterize attentional dynamics during interaction with adaptive decision support systems.Authors: Qian Li, Xinyu Mao, Shang-Hua Teng
Positional encoding (PE) is widely viewed as necessary for transformers to process ordered sequences: without them, the next-token map appears permutation-invariant in its context tokens. This intuition underlies all prior universality results, which rely on positional information to prove that transformers with chain-of-thought can perform arbitrary computation, i.e., they are Turing complete. We revisit this belief in the regime most relevant to long-form reasoning, where generation proceeds through a finite sliding context window. Our opening perception is that the window mechanism itself (mildly) breaks the permutation symmetry. To distill and precisely capture the degree of this added expressiveness, we introduce an abstract autoregressive model, the HIST model, in which each update depends only on constant-size internal state and the token-count histogram within the current window. We prove that this HIST model is Turing complete by showing that the evolution of the window can reveal the token that has just left the window, which suffices to simulate Turing-complete Post machines. We then construct a sliding-window transformer over a constant-size token alphabet, without PE, and show that it can simulate the HIST model. Our result demonstrates that positional encodings are not indispensable for transformers to perform universal computation: The window sliding itself already breaks permutation symmetry and captures sufficient positional information.Authors: Fabian Egidy
We study the relationship between the existence of optimal proof systems and recursive jump operators, two central open problems in proof complexity. For a set L, an optimal proof system is a strongest proof system in terms of proof length, whereas a recursive jump operator uniformly transforms any proof system for L into a stronger one with respect to proof length, thereby witnessing non-optimality. It is clear that the existence of a recursive jump operator for L rules out optimal proof systems for L. Khaniki (FOCS 2024) is interested in the converse of this implication and explicitly poses the following question, where TAUT denotes the set of propositional tautologies. Q: Does the non-existence of optimal proof systems for TAUT imply the existence of recursive jump operators for TAUT? We generalize and address this question from both a relativized and an unrelativized perspective. We show that proving a positive answer for Q is provably hard by constructing the following oracle. O: The polynomial-time hierarchy is infinite, TAUT has no optimal proof systems, and TAUT has no recursive jump operators. This shows that Khaniki's question can not be answered in the positive by relativizable means, even under the standard complexity-theoretic assumption that the polynomial-time hierarchy is infinite. In contrast, we obtain positive results when the question Q is posed for sets different from TAUT. We prove that the existence of recursive jump operators is upward closed under $\leq_{\text{m}}^{\text{p}}$-reducibility, a result that so far was only known for the non-existence of optimal proof systems. Furthermore, we show that the sets known to have no optimal proof systems by Messner (STACS 1999) in fact admit recursive jump operators. Thus, essentially all sets currently known to have no optimal proof systems have recursive jump operators.Authors: Artem Parfenov, Michael Vyalyi
In this paper, we study the complexity of the recurrence evaluation problem. We are interested in finitely valued recurrent functions. We present two results in this direction. First, we study the recurrence problem for sequences, assuming that a recurrence relation is defined by a fixed function, while the offsets are part of the input. Depending on the form of presentation (whether the offsets are given in unary or in binary), the problem is PSPACE-complete or EXP-complete. Second, we study recurrences defined by the NAND function. They are related to impartial games. We prove PP-hardness of the recurrence evaluation problem for a very simple 3-dimensional game, in which the offset vectors are coordinate vectors (1,0,0), (0,1,0) and (0,0,1) but the boundary conditions are arbitrary. In other words, we consider generalized winning conditions for the game extending the normal and the misère winning conditions.Authors: Mark Braverman, Jingyi Liu, Eric Xue, Chenghan Zhou
In this paper, we investigate the computational hardness of finding fractional allocations to unit-demand players using competitive equilibria from equal incomes (CEEI), where we allow a small constant error in players' response to market prices (also known as an approximate Hylland-Zeckhauser equilibrium). We show that assuming the $\mathbf{(\varepsilon,δ)}$-Generalized Circuits problem is PPAD-hard (the "PCP for PPAD" conjecture), finding an approximate HZ equilibrium is also PPAD-hard. This result provides additional motivation for trying to prove the PCP for PPAD conjecture as a tool for obtaining robust computational hardness results about markets. Further, we introduce a natural restriction on approximate HZ equilibria, where players' bundles may still only be approximately optimal given the prices, but may not contain positive-price items for which the player has zero utility. We show unconditionally that there exists a constant $ε$ such that finding a restricted $ε$-HZ equilibrium is PPAD-hard.Authors: Ezequiel López-Rubio
We introduce Grid Programs, a novel model of computation in which programs are finite two-dimensional arrangements of instructions on an integer grid rather than linear sequences of statements. Three properties distinguish this model fundamentally from classical frameworks: (i) programs are planar structures through which an instruction pointer moves in the four cardinal directions; (ii) there are no syntax constraints, so any assignment of instructions to grid cells constitutes a valid program; and (iii) the model uses no named variables or explicit memory addresses. Program state is maintained through a data stack, an address stack, and a circularly doubly linked list accessed via three named pointers. Control flow is achieved spatially, with branching encoded as perpendicular turns of the instruction pointer. The address stack stores triplets (cell row, cell column, direction), enabling precise restoration of both position and heading after branches, loops, and function calls. We give a formal operational semantics, present a representative instruction set covering arithmetic, control flow, and linked-list manipulation, and work through several detailed examples, including an absolute-value function, a factorial computation, a linear-search algorithm, a string-reversal program, and a while-loop summation. We establish that Grid Programs are Turing-complete by simulating an arbitrary register machine, and we discuss their relationship to prior two-dimensional languages such as Befunge and Funge-98, to stack-based languages such as Forth and PostScript, and to dataflow and spatial computation models. Grid Programs offer a fresh vantage point for exploring the design space of computation, with potential applications in visual programming environments, cellular-automaton-inspired hardware, and obfuscation-resistant code.Authors: Alexey Ovchinnikov, Pedro Soto
Global parameter identifiability is a property of a parametric ODE model to recover the parameter values uniquely from the input-output data. Not all parametric ODE models have this property, and checking for parameter identifiability is a prerequisite to perform numerical parameter estimation. There are many algorithms and software packages for global parameter identifiability, and frequently the runtime is large. However, the computational complexity for this problem has not been analyzed yet, though there are complexity results for local (finitely many values fit the data) parameter identifiability. In this paper, we estimate the complexity of checking global parameter identifiability over real fields for ODE models that depend linearly on the state variables and rationally on the parameters. In particular, we prove that it is equivalent to the injectivity problem.Authors: Omrit Filtser, Tzalik Maimon, Michal Moiseev
The Fréchet distance is a well-studied distance measure between two curves. In this work, we demonstrate that the merit of Fréchet distance extends beyond evaluating similarity, and introduce a new setting in which it proves useful. Consider a situation where two agents are required to visit a given set of sites, while staying close to each other throughout their traversal. In this paper, we study problems where the goal is to construct two curves whose vertices are from a given set of points, under the constraint that the Fréchet distance between the curves is kept as small as possible. This problem can be viewed as a variant of the Traveling Salesman Problem (TSP), and thus may be of interest in routing, network planning and more. We present a near-linear algorithm for this problem under the discrete Fréchet distance, and explore several variants of the problem, including minimizing the lengths of the curves and balancing the number of sites assigned to each agent. Lastly, we prove that the problem is NP-hard under the continuous Fréchet Distance.Authors: Hossein Baktash, Mark Gillespie, Keenan Crane
We describe a method for recovering a manifold, intersection-free triangle mesh from the points where edges of a tetrahedral grid pierce a continuous surface. Unlike classic marching cubes or tets, our subgrid marching scheme allows arbitrarily many surface patches within a single cell, capturing fine features and thin sheets. Moreover, it requires neither a well-defined inside/outside (allowing surfaces with boundary), nor consistently-oriented input geometry. Yet we retain the local, parallel nature of classic marching: reconstruction is performed independently per tet, yielding a conforming mesh across tet boundaries. Our key innovation is a generalization of normal coordinates from geometric topology, which encode surface connectivity via arbitrary integer intersection counts along each grid edge. This encoding sidesteps the usual Nyquist--Shannon limit, putting no lower bound on the size of features that can be resolved on a fixed grid. In practice, for similar compute time and equal grid resolution -- or even an equal number of output triangles -- meshes produced by subgrid marching are far more accurate than those from classic marching. Beyond standard contouring, our method can be used to convert polygon soup into a manifold, intersection-free mesh.Authors: Jacob Ender, Chris Kapulkin
We develop a new algorithm for computing the second discrete homology group of a graph which is much faster when compared to existing algorithms. To do so, we identify five basic shapes, which are quotient graphs of the 3-cube with the property that the injective maps from them detect all possible 2-boundaries in the singular chain complex computing discrete homology.Authors: Benjamin Merlin Bumpus, Rod Downey, Tala Eagling-Vose, Jessica Enright, Michael R. Fellows, David C. Kutner, Laura Larios-Jones, Barnaby Martin, Frances Rosamond, Ella Yates
Parameterized complexity has always been concerned with practical computing: by confining combinatorial explosion to a secondary parameter $k$, one can uncover why and how many NP-hard problems are effectively tackled in practice. Today, however, the scale of data has changed: scientists study Big Data, which is so large that even quadratic dependence in the total input size $n$ is unaffordable. Therefore, what constitutes a practical algorithm has also changed. Classically, parameterized complexity is blind to the difference between defining fixed parameter tractability multiplicatively (i.e. $f(k) \cdot n^c$) or additively (i.e. $f(k) + n^c$). But what if the constant $c$ is one and we require true linearity, is this distinction still inconsequential? Here, we define and explore Truly Linear FPT (TLFPT) -- that is $O(n)+f(k)$ -- and show that it is a strict subset of Linear FPT (LFPT) -- that is $O(n) \cdot f(k)$ -- via diagonalization. Populating TLFPT requires careful consideration of linear-time algorithmics and data structures. We meet many inhabitants of TLFPT: SAT, Vertex Cover, Min-Max Matching, $(n-k)$-Coloring, Diverse Pair of Matchings, $k$-Path, and $H$-Coloring. Our parameterizations are equally varied. Beyond classical parameters like solution size, we leverage two parameters, treedepth and BFS-width, which are particularly well-suited to the TLFPT regime. We do so by developing techniques based on depth- and breadth-first search. For parameterized complexity to be of service to the scientific community, we need to contend with Big Data. For sufficiently large inputs, FPT beyond linear may not suffice. Thus, there is a practical and theoretical need for more ambitious goals. TLFPT is a first step forward.Authors: Bart M. P. Jansen, Ruben F. A. Verhaegh
For a fixed set $\mathcal{F}$ of Boolean constraint types, a MinCSP$(\mathcal{F})$-instance consists of a formula $F$ that applies $m$ constraints from $\mathcal{F}$ to a set of $n$ Boolean variables. The goal is to remove a minimum subset of constraint applications from $F$ to make the remaining formula satisfiable. Previous work characterized how the choice of $\mathcal{F}$ affects its polynomial-time solvability and approximability. We extend a recently introduced preprocessing framework for graph problems to the problem above. Rephrased in the context of CSPs, this framework defines a constraint application from a given formula $F$ as $c$-essential if it is contained in all $c$-approximate solutions to $F$. Being able to efficiently detect these essential parts of a solution reduces the search space of any follow-up FPT algorithms parameterized by the solution size and yields an immediate asymptotic improvement to the runtime of such algorithms. In this work, we present a dichotomy theorem that distinguishes constraint sets $\mathcal{F}$ for which $c_\mathcal{F}$-essential constraint applications can be detected efficiently for some $c_{\mathcal{F}} \in \mathcal{O}(1)$, from those for which this task is intractable under established complexity-theoretic conjectures. Our results show that for any set $\mathcal{F}$ of bijunctive constraints, there is a polynomial-time algorithm that detects $\mathcal{O}(1)$-essential constraint applications. This contrasts the fact that constant-factor approximating a bijunctive MinCSP$(\mathcal{F})$-problem is intractable under the Unique Games Conjecture.Authors: Farzam Ebrahimnejad, Shayan Oveis Gharan
In 2010, Steurer conjectured that any family of $n$ unit-norm vectors $v_1,\dots,v_n$ with polynomially small average correlation $\mathbb{E}_{i,j}|\langle v_i,v_j\rangle|\leq n^{-ε}$ contains linear-sized constant-separated sets. We refute this conjecture in a strong sense using the machinery of sparse high-dimensional expanders: such vector families do not even have linear-sized $\frac{1}{\log^{1/4-o(1)}(n)}$-separated sets. Consequently, we show that there are families of vertex expanders on $n$ vertices for which the (average) $L_2$-mixing time to the uniform distribution of any reweighted simple random walk is at least $\log^{5/4-o(1)} n$.Authors: Jyothish S, Sadagopan Narasimhan
Given a connected graph $G$ and a terminal set $R \subseteq V(G)$, the Steiner tree problem (ST) asks for a tree that spans all of $R$ with at most $r$ vertices from $V(G)\backslash R$, for some integer $r\geq 0$. It is known from (Garey et al.,1977 ) that ST is NP-complete. A Steiner tree in which all terminal vertices are constrained to be leaves is called a terminal Steiner tree. Our study addresses the existence of a terminal Steiner tree, its complexity across various graph classes, black-box applications of the ST, and a fixed-parameter tractable (FPT) algorithm with respect to the number of terminals.Authors: Peter Clifford, Raphaël Clifford
We study exact uniform sampling of permutations of length $n$ whose longest increasing subsequence (LIS) has prescribed length $k$. For $k \in Θ(n)$, we give a direct rejection sampler whose expected running time is $O(n\log\log n)$ in the word-RAM model. The sampler uses an expanded proposal space consisting of permutations together with a specified increasing subsequence, and accepts exactly those proposals whose specified subsequence is the leftmost LIS. For arbitrary $1\le k\le n$, we give an exact sampler based on the Robinson--Schensted correspondence. The algorithm samples the corresponding Plancherel-conditioned shape by computing exact completion counts via determinant identities, and then samples two uniform tableaux of that shape. The direct implementation runs in $\tilde O(n^4k^5)$ expected time. We then show that the same sampler can be implemented in expected $\tilde O(n^3k^4)$ time by evaluating a determinant oracle through Hankel moment matrices.Authors: Karl Bringmann, Nick Fischer, Yanheng Wang
The subgraph isomorphism problem and its generalizations such as conjunctive queries, where some nodes are projected, are among the most fundamental problems in graph algorithms and database theory. In this paper, we study the listing and enumeration variants of these problems and present two main results. (1) We present the first algorithms for enumerating projected trees with polynomial preprocessing time ($\widetilde{O}(n^{17.42})$) and polylogarithmic delay ($\mathrm{polylog}(n)$). Prior to this work, all algorithms in the literature required time $Ω(n^{Ω(k)} + t)$ or $t \cdot n^{Ω(1)}$ to list all copies of a $k$-node tree with projections, where $t$ is the number of solutions. Our result generalizes to arbitrary projected hypergraphs, achieving enumeration in preprocessing time $\widetilde{O}(m^{17.42 \cdot \mathrm{subw}(H)})$ and polylogarithmic delay, where $\mathrm{subw}(H)$ is the submodular width of the pattern hypergraph $H$. We heavily rely on fast (rectangular and output-sensitive) matrix multiplication, which we complement by fine-grained lower bounds indicating that any algorithm beating time $Ω(n^{Ω(k)} + t)$ must rely on fast matrix multiplication. (2) As our second main result, we present a generic enumeration-to-listing reduction, establishing that listing and enumeration are equivalent under natural assumptions. For (colored) subgraph isomorphism, our reduction transforms any listing algorithm running in time $O(f(n,m) + t \cdot g(n,m))$ into an enumeration algorithm with preprocessing time $O((f(n,m)+g(n,m)+m) \log^2 n)$ and delay $O(g(n,m))$. We utilize this equivalence as a tool for proving our first main result, and we expect that our generic reduction will find many future applications.Authors: Kao-Chuan Liang, Ya-Chun Liang
We study online scheduling with obligatory testing on $m$ identical machines with the objective of minimizing the sum of completion times. In this model, every job must undergo a test before its actual processing time is revealed. Consequently, the central algorithmic challenge is no longer whether to acquire information, but how to optimally balance machine capacity between revealing unknown jobs and processing currently known ones. While this tradeoff becomes structurally richer in the multiple-machine setting, the only prior explicit deterministic lower bound for this objective was $\sqrt{2}$, established strictly for a single machine in 2024 by Dogeas et al. [ESA 2024: 48:1-48:14]. Our core conceptual contribution is demonstrating that completion-threshold quantities, denoted $T_X$, serve as the fundamental analytical metric for this setting. Because every completed job must first pass through the testing phase, delayed revelation inherently forces delayed completion. By bounding these $T_X$ thresholds, we systematically derive strong lower bounds on the total completion time. Utilizing this framework, we establish the first substantial deterministic lower bounds for multiple machines, including a three-type bound of $1.4811$ and a multi-type dyadic construction that asymptotically approaches $3/2$. Finally, we complement these theoretical limits with a deterministic $2$-competitive list-scheduling algorithm for arbitrary test times.Authors: Debarati Das, Maximilian Probst Gutenberg, Christian Wulff-Nilsen
In the planar, dynamic All-Pairs Shortest Paths (APSP) problem, a planar, weighted digraph $G$ undergoes a sequence of edge weight updates and the goal is to maintain a data structure on $G$, that can quickly answer distance queries between any two vertices $x,y \in V(G)$. The currently best algorithms for this problem require $\tilde{O}(n^{2/3})$ worst-case update and query time, while conditional lower bounds show that either update or query time $n^{0.5-δ}$ is needed for any constant $δ> 0$. In this article, we present the first algorithm with near-optimal $\tilde{O}(\sqrt{n})$ worst-case update and query time for the offline setting, where the update sequence is given initially. This result is obtained by giving the first offline dynamic algorithm for maintaining dense distance graphs (DDGs) faster than recomputing from scratch after each update. Further, we also present an \emph{online} algorithm for the incremental APSP problem with $\tilde{O}(\sqrt{n})$ worst-case update/ query time. This allows us to reduce the online dynamic APSP problem to the online decremental APSP problem, which constitutes partial progress even for the online version of this notorious problem.Authors: Pratheek Prakash Shetty, Thomas R. W. Scogland, Wu-chun Feng
Concurrent queues can significantly impact supercomputing performance by being critical bottlenecks for task distribution, load balancing, and resource utilization. As HPC systems move beyond 10-million processor cores, the ability to rapidly move items between producer and consumer threads without excessive locking is essential for efficient queues, preventing idle cores, maximizing utilization, and achieving high parallel speedup. While concurrent queues are well studied on CPUs, they remain largely unexplored on modern GPUs, where SIMT execution, massive parallelism, and atomic contention reshape the design space. We present three linearizable GPU concurrent queues spanning from lock-free to wait-free guarantees: (1) G-WFQ-YMC, an adaptation of Yang and Mellor-Crummey's wait-free queue using preallocated segments; (2) G-LFQ, a bounded lock-free queue that uses wave-batched fast paths to maximize throughput; and (3) G-WFQ, a bounded wait-free queue that packs shared state into 64-bit compare-and-swap operations while preserving linearizability and bounded memory.Authors: Peng Chen, Hailiang Zhao, Xueyan Tang, Yixuan Wang, Shuiguang Deng
Learning-augmented paging has been extensively studied in recent years. A key advantage over naive ML-based approaches is \emph{bounded robustness}, which guarantees worst-case performance even when predictions are inaccurate, making these algorithms valuable for real-world systems. Prior work achieves robustness bounds of $2H_k + O(1)$ in the randomized setting, leaving a gap to the optimal competitive ratio $H_k$. In this paper, we study how to close this gap. We begin by reviewing online optimality and proving a new property of the latest $H_k$-competitive algorithm, which facilitates our analysis in the learning-augmented setting. Then, we review existing learning-augmented paging algorithms and introduce a unifying primitive, the \emph{relative prediction budget}, which captures the essence of establishing robustness and reveals that prior algorithms either overuse or underutilize predictions. Guided by the above analysis, we develop a new framework that achieves the best-possible robustness up to an additive constant for learning-augmented paging: $H_k + O(1)$. Experiments further demonstrate strong practical performance.Authors: Micah Gold
ReCom is a leading Markov Chain Monte Carlo algorithm for sampling balanced graph partitions in computational redistricting. At each step, its transition function proposes a new partition by merging two adjacent districts and if possible re-splitting the conjoined region. The transition function is efficient in practice, however, it is unknown whether it is guaranteed to run in polynomial time. In this report we exhibit an explicit family of 3-partitions on planar square grid graphs from which ReCom requires an exponentially large expected number of steps to re-split the graph (even if we admit approximately balanced splits), showing that in the worst case ReCom does not run in polynomial time. Notably, this result implies that ReCom is not technically rapidly mixing (if started from an adversarial configuration, ReCom requires exponential many steps to reach the stationary distribution).Authors: Alireza Haqi, Shayan Oveis Gharan
We resolve the thin matching problem proposed by Anari, Charikar and Ramakrishnan [ACR23] up to polylogarithmic factors. Given a fractional perfect matching $x$, we say a perfect matching $M$ is $α$-thin w.r.t. $x$ if for any cut $(S,\overline{S})$, we have $$ |M \cap E(S,\overline{S})| \leq α\cdot x(S,\overline{S}).$$ [ACR23] conjectured that for any fractional perfect matching $x$, there exists a perfect matching $M$ which is $O(1)$-thin w.r.t. $x$. First, we show that if $M$ is restricted to be in the support of $x$, then $α\geq Ω(n)$ and we complement this by designing an efficient algorithm that outputs an $O(n\log n)$-thin perfect matching where $n$ is the number of vertices. Then, we relax this constraint and show that for any fractional perfect matching $x$, there is a perfect matching $M$ (which is not necessarily in the support of $x$) such that $M$ is $\text{polylog}(n)$-thin w.r.t. $x$. All results work for both bipartite and non-bipartite graphs. We also discuss applications to the metric distortion problem.Authors: Qingwen Ma, Chao Peng, Changfeng Xu, Chenyang Xu, Ruilong Zhang
This paper introduces a general multiagent matroid upgrading problem that models a broad class of real-world resource allocation tasks. In this setting, there are multiple agents and a ground set of elements, where each element is assigned to a specific agent and has two associated costs: a default cost and a reduced (upgraded) cost. Upgrading an element lowers its cost to the upgraded value, while non-upgraded elements retain their default costs. Each agent is associated with its own matroid, with the goal of finding a minimum-cost basis. The central task is to select at most k elements to upgrade so as to minimize a non-decreasing convex function over the agents' minimum basis costs, capturing both efficiency and fairness objectives in multiagent systems.Authors: Groot Koerkamp, Ragnar
In recent years, there has been a renewed interest in the search for low density minimizer schemes. These schemes take a window of $w$ consecutive $k$-mers, and sample one of them: the smallest under some specific order. Schemes such as the mod-minimizer provide a low density (fraction of sampled $k$-mers) when $k \gg w$, while schemes such as the greedy minimizer work well for explicit small parameters roughly in the regime $k \leq 2w$, for $k$ and $w$ up to $15$ or so. When $k < \log_σw$ is very small, minimizer schemes cannot do well, and more general sampling schemes are needed that can be richer than just comparing $k$-mers. Bidirectional-string anchors (bd-anchors) form one such scheme. Inspired by bd-anchors, we introduce the smallest unique substring or SUS-anchor: Given a window, this considers all suffixes that do not occur as a substring elsewhere in the window. It then samples the start position of the smallest suffix according to the new anti-lexicographic order that minimizes the first character and maximizes the remaining characters. We give a linear-time and $O(w)$ space streaming algorithm to compute all SUS-anchors of a string. For alphabet size $σ=4$ and $k=1$, the anti-lexicographic SUS-anchor empirically has density $<1\%$ away from the density lower bound, significantly improving over bd-anchors that are often $>15\%$ above it. For alphabet size $σ=2$, the density is at most $10\%$ above the lower bound, which again improves over the $>50\%$ overhead of bd-anchors.Authors: Shahbaz Khan, Shubham Kumar Verma, Utkarsh Lohiya
Given a graph $G(V,E)$ having $n$ vertices and $m$ edges, we maintain its Breadth-First Search (BFS) tree from source $s$ under an online sequence of edge updates in the prediction model. Our approach leverages a predicted update sequence aiding online processing. We present algorithms for incremental (insertions-only), decremental (deletions-only), and fully dynamic (insertions and deletions) settings that maintain a BFS tree (parent and level information). Classically, the incremental and decremental BFS tree requires total $O(mn)$ time [JACM81], with amortized $O(n)$ and worst-case $O(m)$ update time. The combinatorial BMM conjecture restricts any polynomial improvement [FOCS14] even when the updates are known in advance [STOC15]. For fully dynamic BFS trees, only the trivial $O(m)$ time recomputation is known. Our complexity bounds are expressed in prediction error measures, where error vertices are those having incorrectly predicted distances, with the corresponding difference as their error. The vertex prediction error $η_{v}$ is the sum of degrees of error vertices, weighted vertex prediction error $η^*_{v}$ is error-weighted sum of degrees of error vertices, and $η_e$ counts the incorrectly predicted updates. For incremental and decremental BFS, our algorithm requires respectively $O(η_v + η_e)$ and $O(\min\{m,η^*_v + η_e\})$ worst case update time using $O(mn)$ preprocessing time and space, and total update time of $O(η^*_v + η_e)$. For fully-dynamic updates, our algorithm requires $O(\min\{m,η^*_v+η_e\})$ worst case update time. At its core, we extend the classical ES Trees [JACM81] for batch updates and fully dynamic updates. This simple extension is sufficient to give a competitive prediction algorithm, which may be generalized to other graph problems. We also consider space optimizations and error correction to improve our results.Authors: Jake Yoon
Every electronic exchange relies on an order book whose storage layer determines matching latency. The dominant implementation -- linked lists chained through a balanced tree -- imposes two costs on every operation: pointer-chased traversal to reach the insertion point, and root-to-leaf search to locate the target price level. Under micro-burst conditions these costs produce tail-latency spikes that degrade market quality when liquidity is most needed. We present two data-structure contributions that eliminate these costs. The first is the Priority-Indicated Node (PIN), a priority queue in which entries occupy fixed-capacity, contiguously addressable slots, each carrying a per-slot indicator encoding the entry's global priority. Unlike heaps, which require O(log n) comparisons per operation, the PIN resolves insertion position directly from the indicators without comparing entries; indicator updates are O(1), independent of queue size. The second addresses a broader inefficiency: balanced search trees search root-to-leaf on every insertion and deletion, even when the caller already knows the key's in-order neighbors -- as in ordered event streams, incremental index already knows the key's in-order neighbors -- as in ordered event streams, incremental index maintenance, and electronic trading. Neighbor-aware insertion and deletion exploit known neighbor references to attach or remove a node with O(1) reference writes, followed by single-path rebalancing, uniformly across red-black, AVL, and B/B+-tree variants. A single CPU core sustains 32 million order messages per second with sub-microsecond tail latency under multi-million message-per-second micro-bursts, and is 5-11x faster than the best available open-source matching engines on the same hardware. Scaled to a single 96-core instance, the engine sustains 640 million messages per second across 10,000 symbols.Authors: Andreas Charalampopoulos, Dimitris Fotakis, Thanos Tolias
We study budget feasible procurement auctions, in which $n$ agents, each with a privately held service cost, offer their services to an employer. The employer seeks to maximize a public submodular valuation function over the set of hired agents, while facing a hard budget constraint. We consider an online posted-price setting, in which agents arrive in a uniformly random order (a.k.a. \emph{secretary arrivals}) and the employer must make irrevocable take-it-or-leave-it offers upon their arrival. The employer does not get any feedback about the agent service costs other than whether they accept the offer or not. We introduce Repeated Descent (a.k.a. \RED), a deterministic framework based on adaptive linear posted pricing. \RED enforces budget feasibility by adaptively adjusting its pricing and balancing each pricing level with the number of agents considered in it. Using \RED as the main building block, we obtain a $1046$-competitive posted-price mechanism for online budget feasible auctions with secretary agent arrivals and submodular valuations, thus improving on the previously best known ratio of (Charalampopoulos et al., EC 2025) by several orders of magnitude. Combining \RED with random subsampling, we obtain the first constant-competitive posted-price budget feasible mechanism for non-monotone submodular valuations. On the negative side, we show that every online budget feasible mechanism with XOS valuations has a competitive ratio of $Ω\!\left(\tfrac{\log n}{(\log\log n)^2}\right)$.Authors: Nima Anari, Alireza Haqi, Eric Ma
We study correlated rounding on the hypersimplex, the base polytope of the uniform matroid. For each point $x$ in the hypersimplex, the goal is to sample a $k$-subset $A(x)$ with marginals $x$, while coupling the samples for all choices of $x$ so that nearby inputs produce nearby sets. We give a constant-stretch scheme. Our scheme samples the maximum-entropy $k$-subset distribution with prescribed marginals using a common random ordering and common uniform thresholds. For every $x,y\in[0,1]^n$ with $\sum_i x_i=\sum_i y_i=k$, it satisfies $\mathbb{E}[|A(x)\triangle A(y)|]\le 6\|x-y\|_1$. This improves the previous $O(\log k)$ bound for hypersimplex correlated rounding and answers an open question raised by Naor, Raju, Shetty, Srinivasan, Valieva, and Wajc. By adding dummy coordinates, the same result gives stretch at most $12$ for the at-most-$k$ polytope. The proof was found in a GPT 5.5 Pro Extended conversation prompted by the authors, and Codex was used to help assemble the manuscript under the authors' supervision.Authors: Qiming Fang, Sihong Shao, Yuxuan Wu
Using the concepts of Eulerian-spanning set and coboundary operator, we generalize Hadlock's conversion of the maxcut problem on planar graphs to one on general graphs with non-negative weights. Using our conversion, we can explore algorithms for maxcut beyond the class of planar graphs. We obtain a Fixed-Parameter Tractable algorithm for $k$-contraction apex graphs. Specifically, our algorithm can be applied to graphs with crossing number $k$, giving an $O(2^k(n+k)^{3/2}\log (n+k))$-time algorithm that matches the best known results when restricted to non-negative weights.Authors: Misha Ivkov, Tselil Schramm
We present a simple and efficient algorithm for robust approximate message passing (AMP) in the spiked matrix setting. In particular, let $\varepsilon$ be a sufficiently small constant, and suppose that $X \in \mathbb R^{n \times n}$ is a Gaussian matrix with a planted rank-$1$ spike, and $E \in \mathbb R^{n \times n}$ is an adversarially chosen matrix supported on an $\varepsilon n \times \varepsilon n$ principal minor. Let $v_{\mathrm{AMP}}(X)$ be the output of an AMP iteration on the uncorrupted matrix $X$. We give a procedure that, given access only to the corrupted matrix $Y = X + E$, computes a vector $v_{\mathrm{ALG}}(Y)$ which is $\tilde{O}(\sqrt{\varepsilon})$-close to $v_{\mathrm{AMP}}(X)$, for any of a class of AMP iterations which includes sparse Principal Component Analysis (PCA), non-negative PCA, and $\mathbb Z_2$ synchronization. Our algorithm consists of a spectral pre-processing step combined with a robust spectral initialization procedure; given these inputs, we prove that (perhaps surprisingly) AMP is robust out-of-the-box.Authors: Nathan White, Krish Singal
Quantization is a fundamental tool used to compress datasets, neural network weights, and memory usage in a range of computational tasks. Many downstream applications of vector quantization perform inner products with arbitrary inputs. This motivates the study of inner product aware quantization schemes that approximately preserve inner products with unseen vectors -- in contrast to simply minimizing the mean-squared error. In this work, we formulate objectives that capture natural desiderata and develop adaptive and unbiased quantization methods that approximately preserve inner products with worst-case and average-case inputs. An analysis of these objectives shows a tight connection with the well-studied notion of Adaptive Stochastic Quantization (ASQ). We develop provably fast exact and approximate algorithms for our objectives. Our theoretical results inspire efficient practical algorithms that perform well across a variety of workload distributions. They also lead to practical algorithms for standard ASQ which are 2-10$\times$ faster than prior state-of-the-art methods while maintaining quality. These theoretical and empirical results contribute towards making adaptive quantization techniques more efficient and tractable in practical settings.Authors: Steven Heilman
We show that Grothendieck's real constant $K_G$ can be upper bounded by projecting vectors onto a random plane through the origin and thresholding a degree five Hermite polynomial. This resolves a conjecture of Braverman-Makarychev-Makarychev-Naor from 2011, who required an extra randomization step in their rounding scheme and proved $K_G<\fracπ{2\log(1+\sqrt{2})}-10^{-500}$. As a corollary of our result, we prove the bound $K_G<\fracπ{2\log(1+\sqrt{2})}-10^{-217}$ by thresholding degree three Hermite polynomials in the plane. We finally give a rigorous computer-assisted proof that $K_G<\fracπ{2\log(1+\sqrt{2})}-10^{-5}$ using interval arithmetic and degree three Hermite polynomial thresholding.from Kamathematics
The Theory community has a tradition of a crowdsourced spreadsheet of sharing who has accepted which jobs (see, e.g., 2025). You can report jobs accepted via this form (anonymous), and view the results on this sheet. If you feel a change needs to be made to the results reported in the form, please email me or fill out this … Continue reading "Theory Jobs 2026"
The Theory community has a tradition of a crowdsourced spreadsheet of sharing who has accepted which jobs (see, e.g., 2025).
You can report jobs accepted via this form (anonymous), and view the results on this sheet. If you feel a change needs to be made to the results reported in the form, please email me or fill out this other form (also anonymous, unless you choose to identify yourself).
The rules are as follows:
To keep this list focused around theory, mild curation will be applied: individuals should have publications at theory-centric conferences such as STOC, FOCS, SODA, ICALP, STACS, COLT, CRYPTO, CCC, ITCS, etc., or publications of a similar style. If you don’t meet this criterion, but still identify as a theorist/theoretician (at least partially), please email me (ideally, with a sentence or two of explanation) at g@csail.mit.edu and I’ll include you, no further questions asked.
Odd Scenarios about Research Claims
I blogged about OpenAI's achievement of having AI solve a math problem here.
My post had a few comments about authorship of such results.
Lance had a post about co-authorship and AI here
There are times when an author on a paper prefers to not be listed as a co-author.
There are times when an author wants to give credit in odd places.
We give some scenarios.
1) Professors Alice and Bob prove a theorem. They have their grad student Carol
write up the result. Alice and Bob intentionally leave their name
off of the paper since they want Carol to get a career boost.
2) Alice has a conjecture and does some work on it. Famous Mathematicians
Bob and Carol solve it and offer Alice a co-authorship. Alice declines thinking that
having it known that Bob and Carol cared about her work is more important
(and Alice thinks, rightly or wrongly, that it's more impressive if she is not a co-author)
3) Alice has a good idea and tells it to Bob and Carol. They incorporate the idea
into a paper and make her a co-author. Later she tells them her idea is wrong and
to take her name OFF of the journal version. They say, no, they will keep her name
on the journal version. She does not inquire why they will do this for fear they
will rethink it. She offers to proofread the paper very carefully to earn her keep.
She does that and does a really good job.
4) Alice has a conjecture and does some work on it. Other people solve it
and offer Alice a co-authorship. Alice declines since she doesn't really understand
the final paper.
5) Alice and Bob are working for the NSA. They come up with a breakthrough (either making or
breaking codes). They can't publish it because of national security.
6) A pair of scenarios.
6a) Alice, Bob, and Carol prove a theorem in number theory that leads to a new classical
factoring algorithm. They code up the algorithm and find that it's 1000 times
faster than the best algorithms known. They make it into a package and want to sell it.
However, they claim it's a quantum computer since they think it will sell better that way.
They do not claim that it's Shor's algorithm.
They just claim that it's 1000 times faster than current algorithms, which is true.
If someone buys it because they want to factor large numbers, do they care
that it's not quantum?
If someone buys it because they want to explore the quantum aspect, they do care.
b) Alice, Bob, and Carol are quantum engineers. They build a quantum computer and
use it to factor large numbers. They find that it factors 1000 times faster than current classical
algorithms. However, by this point there has been so much quantum hype that has been
debunked, that, in order to sell it, they claim it's a classical algorithm achieved
by a breakthrough in number theory. They do not claim it's Shor's algorithm,
just that it's 1000 times faster which is true.
If someone buys it because they want to factor large numbers, do they care
that it's not quantum?
If someone buys it because they want to explore the number-theory aspects, they do care.
7) A pair of scenarios.
7a) Alice, Bob, and Carol work for a math research lab. They use AI to crack an open problem in math.
The AI did all the heavy lifting and produced a very readable proof. They claim AI didn't do much
since they think this will increase how much grant money they can get.
If someone just wants to read the proof, then do they care that it's AI-generated?
If someone wants to give Alice-Bob-Carol grant money, then they may dislike the lying, but
Alice-Bob-Carol could probably use AI to get more results. Hence Alice-Bob-Carol were fools
to hide the use of AI.
7b) An AI company hires Alice, Bob, and Carol to solve a math problem. They succeed and
produce a readable proof. The company then claims that AI did it to increase attention and funding.
If someone just wants to read the proof, then do they care that it's human-generated?
If someone wants to give the company venture capital money, then they do care.
8) Alice reads a novel and likes it. She later finds out that it was written by AI.
Now she decides she doesn't like it. Is that rational? What if the cover of the novel
said quite clearly ``This novel was written by AI'' but she missed it (perhaps she
has a used copy where the cover is torn off). Then she can't use the
``I mind that they lied'' excuse for why she doesn't like it.
9) In Scenario 7 you could replace ``novel'' with ``song'' or ``movie''
or ``episode of a TV show'' or whatever you want.
10) If I found out that a novelty song about Ramsey Theory was actually
AI-generated, then would I mind? I would probably search my files and find
that I was the one who asked AI to do it a few years ago and forgot about it.
Do you know anyone else who would use ChatGPT to write songs about Ramsey Theory?
11) Having written point 9 I am morally obligated to give you a pointer to
a Ramsey Song I wrote with the help of ChatGPT. It's about the proof of
van der Waerden's theorem and is titled
I like big blocks (and I cannot lie) here.
By gasarch
Odd Scenarios about Research Claims
I blogged about OpenAI's achievement of having AI solve a math problem here.
My post had a few comments about authorship of such results.
Lance had a post about co-authorship and AI here
There are times when an author on a paper prefers to not be listed as a co-author.
There are times when an author wants to give credit in odd places.
We give some scenarios.
1) Professors Alice and Bob prove a theorem. They have their grad student Carol
write up the result. Alice and Bob intentionally leave their name
off of the paper since they want Carol to get a career boost.
2) Alice has a conjecture and does some work on it. Famous Mathematicians
Bob and Carol solve it and offer Alice a co-authorship. Alice declines thinking that
having it known that Bob and Carol cared about her work is more important
(and Alice thinks, rightly or wrongly, that it's more impressive if she is not a co-author)
3) Alice has a good idea and tells it to Bob and Carol. They incorporate the idea
into a paper and make her a co-author. Later she tells them her idea is wrong and
to take her name OFF of the journal version. They say, no, they will keep her name
on the journal version. She does not inquire why they will do this for fear they
will rethink it. She offers to proofread the paper very carefully to earn her keep.
She does that and does a really good job.
4) Alice has a conjecture and does some work on it. Other people solve it
and offer Alice a co-authorship. Alice declines since she doesn't really understand
the final paper.
5) Alice and Bob are working for the NSA. They come up with a breakthrough (either making or
breaking codes). They can't publish it because of national security.
6) A pair of scenarios.
6a) Alice, Bob, and Carol prove a theorem in number theory that leads to a new classical
factoring algorithm. They code up the algorithm and find that it's 1000 times
faster than the best algorithms known. They make it into a package and want to sell it.
However, they claim it's a quantum computer since they think it will sell better that way.
They do not claim that it's Shor's algorithm.
They just claim that it's 1000 times faster than current algorithms, which is true.
If someone buys it because they want to factor large numbers, do they care
that it's not quantum?
If someone buys it because they want to explore the quantum aspect, they do care.
b) Alice, Bob, and Carol are quantum engineers. They build a quantum computer and
use it to factor large numbers. They find that it factors 1000 times faster than current classical
algorithms. However, by this point there has been so much quantum hype that has been
debunked, that, in order to sell it, they claim it's a classical algorithm achieved
by a breakthrough in number theory. They do not claim it's Shor's algorithm,
just that it's 1000 times faster which is true.
If someone buys it because they want to factor large numbers, do they care
that it's not quantum?
If someone buys it because they want to explore the number-theory aspects, they do care.
7) A pair of scenarios.
7a) Alice, Bob, and Carol work for a math research lab. They use AI to crack an open problem in math.
The AI did all the heavy lifting and produced a very readable proof. They claim AI didn't do much
since they think this will increase how much grant money they can get.
If someone just wants to read the proof, then do they care that it's AI-generated?
If someone wants to give Alice-Bob-Carol grant money, then they may dislike the lying, but
Alice-Bob-Carol could probably use AI to get more results. Hence Alice-Bob-Carol were fools
to hide the use of AI.
7b) An AI company hires Alice, Bob, and Carol to solve a math problem. They succeed and
produce a readable proof. The company then claims that AI did it to increase attention and funding.
If someone just wants to read the proof, then do they care that it's human-generated?
If someone wants to give the company venture capital money, then they do care.
8) Alice reads a novel and likes it. She later finds out that it was written by AI.
Now she decides she doesn't like it. Is that rational? What if the cover of the novel
said quite clearly ``This novel was written by AI'' but she missed it (perhaps she
has a used copy where the cover is torn off). Then she can't use the
``I mind that they lied'' excuse for why she doesn't like it.
9) In Scenario 7 you could replace ``novel'' with ``song'' or ``movie''
or ``episode of a TV show'' or whatever you want.
10) If I found out that a novelty song about Ramsey Theory was actually
AI-generated, then would I mind? I would probably search my files and find
that I was the one who asked AI to do it a few years ago and forgot about it.
Do you know anyone else who would use ChatGPT to write songs about Ramsey Theory?
11) Having written point 9 I am morally obligated to give you a pointer to
a Ramsey Song I wrote with the help of ChatGPT. It's about the proof of
van der Waerden's theorem and is titled
I like big blocks (and I cannot lie) here.
Authors: Timon Barlag, Nicolas Fröhlich, Miika Hannula, Phokion G. Kolaitis, Juha Kontinen, Arne Meier, Jouko Väänänen
Dependence logic extends first-order logic with dependence atoms asserting that the value of a variable is determined by the values of certain other variables. The semantics of dependence logic has a second-order character and involves sets of assignments, called teams, instead of individual assignments as in the classical Tarski semantics. Since the model-checking problem is known to be NP-complete even for quantifier-free dependence logic (DQF) formulas, researchers have pursued conditions on formulas that make this problem tractable. In 2010, Jarmo Kontinen introduced the notion of k-coherence for dependence logic formulas, where k is a positive integer. This notion asserts that if the formula is satisfied in a structure by all k-element subteams of a given team, then the given team itself satisfies the formula. It has been proved that k-coherent DQF-formulas have a tame model-checking problem, because such formulas admit a first-order rewriting. In this paper, we investigate the structural and algorithmic aspects of coherence. We show that if a DQF-formula is first-order ewritable, then it is k-coherent for some positive integer k. Thus, for DQF-formulas, coherence is equivalent to first-order rewritability. Furthermore, we show that an analogous result holds for universally quantified dependence logic formulas under a stronger notion of coherence. After this, we focus on the complexity of deciding if a given dependence logic formula is k-coherent. We establish that this decision problem is highly undecidable for arbitrary dependence logic formulas, while for DQF-formulas this problem is co-recursively enumerable. Furthermore, we pinpoint the computational complexity of the coherence problem for propositional dependence logic formulas by showing that this problem is complete for the second level of the exponential hierarchy.Authors: Liviu Aolaritei, Ricky Huang, Michael I. Jordan, Paul Grigas
We introduce a diffusion-based uncertainty model for robust optimization on directed graphs, in which perturbations of edge weights propagate along adjacent edges and satisfy conservation constraints at nodes. This topology-aware structure is natural in networked systems where uncertainty is induced by flows and local interactions, including transportation, logistics, communication, and energy networks. We analyze how such diffusive uncertainty reshapes the computational landscape of robust graph optimization. For convex network problems, such as minimum-cost flow and maximum flow, the resulting formulations remain convex and admit polynomial-time solution methods across all diffusion regimes considered. For combinatorial problems, the effect is more delicate. We focus on two canonical combinatorial graph problems, shortest path and the traveling salesman problem (TSP), which provide complementary benchmarks: shortest path is polynomial-time solvable in the nominal setting, whereas TSP is already NP-hard. We show that, for shortest path, propagation depth induces a sharp transition between tractable and intractable robust counterparts. For the traveling salesman problem, robustness often adds no computational complexity beyond ordinary TSP, because the structure of Hamiltonian cycles makes the fixed-tour adversarial problem collapse to explicit formulas. Together, these results show that topology-aware uncertainty can fundamentally change robust combinatorial optimization, with tractability governed by the interaction between propagation, budget geometry, and the structure of feasible solutions.Authors: Anej Svete, William Merrill, Ryan Cotterell, Ashish Sabharwal
Recent work describes what transformers can and cannot compute through connections to boolean circuits, but existing results lack exact characterizations and are sensitive to modeling choices. Padded transformers -- to whose input filler symbols such as ``...'' are appended -- emerge as a useful gadget for establishing equivalences to circuit classes by providing polynomial space for adaptive parallel computation. However, only a limited set of padded transformer idealizations has been studied, leaving open how robustly these equivalences hold under changes to attention type, model width, and uniformity. We find that, under practical assumptions, padded transformers are surprisingly robust to all of these, and identify numeric precision and model depth as the main factors affecting expressivity. Concretely, we prove that polynomially padded $\text{L-uniform}$ constant-precision transformers are equivalent to $\text{L-uniform AC}^0$, while growing-precision ones achieve $\text{L-uniform TC}^0$ regardless of width. Furthermore, looping enables sequential processing analogous to circuits: $\log^d N$-looped constant-precision transformers reach $\text{FO-uniform AC}^d$, and growing-precision ones reach $\text{FO-uniform TC}^d$. Interestingly, growing width or precision beyond logarithmic does not increase expressivity, and all our results hold for both softmax and average hard attention transformers.Authors: Michael A. Bekos, Eleni Katsanou, Philipp Kindermann, Maria Eleni Pavlidi
In this work, we study the interplay between the number of slopes, the number of bends per edge, and the area requirements for planar drawings of bounded-degree graphs. Our motivation stems from the fact that, while numerous algorithms produce planar drawings with few slopes for graphs of relatively small degree in polynomial area, existing approaches for higher-degree graphs often require super-polynomial area. We address this gap in the literature by presenting new constructions that yield polynomial-area drawings with few bends per edge while slightly increasing the required number of slopes, thereby providing the first systematic study of slopes, bends and area trade-offs.Authors: Leo van Iersel, Mark Jones, Mathias Weller
TREE CONTAINMENT is a central decision problem in mathematical phylogenetics, asking whether a given rooted phylogenetic tree is embeddable in ("displayed by") a given rooted phylogenetic network. While the problem is NP-complete for general networks, many algorithmic advances have relied on structural parameters that capture how "tree-like" a network is. In this paper we investigate TREE CONTAINMENT under the structural parameter scanwidth, a directed width measure generalizing popular parameters measuring tree-likeness of phylogenetic networks. We first present a parameterized algorithm that solves the problem in $O(4^{k + k\log{k}} n + nm^2)$ time, where $n$ and $m$ are the numbers of nodes and arcs in the network and $k$ is the width of a given tree-extension. Complementing this upper bound, we prove a matching lower bound under the Exponential-Time Hypothesis (ETH), showing that there is no algorithm for TREE CONTAINMENT that runs in $2^{o(c\log{c})} n^{O(1)}$ time, even on binary inputs, where $c$ is the directed cutwidth of the input network, which upper-bounds the scanwidth $k$.Authors: Fabio Massimo Zanzotto, Federico Ranaldi, Giorgio Satta
In this paper, we show the possibility of a direct injection of algorithms into neural network architecture. We focus on a complex algorithm, that is, Cocke-Youger-Kasami (CYK) for parsing context-free grammars in Chomsky Normal Form and we propose CYKNN, a simple recurrent neural network architecture for encoding the CYK algorithm in trainable matrix-vector multiplications.We experimented with a very simple grammar with 4 variations showing that our approach outperforms existing LLMs with more than 20B parameters with an in-context learning setting and smaller LLMs of the Qwen family fine-tuned with LoRA. Our attempt paves the way to a different approach to neuro-symbolic methodologies.Authors: Nick Fischer, Mursalin Habib
We revisit the Binary Closest String problem, which asks, given a set of binary strings $X \subseteq \{0, 1\}^n$, to compute a string minimizing the maximum Hamming distance to $X$. A long line of work has focused on parameterized algorithms with respect to the optimal distance $d$, yielding a sequence of improvements from $O^*(d^d)$ through $O^*(16^d)$, $O^*(9.513^d)$, $O^*(8^d)$, $O^*(6.731^d)$ to the current best-known running time of $O^*(5^d)$ [Chen, Ma, Wang; Algorithmica '16]. We present a faster randomized algorithm running in time $O^*(4^d)$. Our result matches a recent fine-grained lower bound [Abboud, Fischer, Goldenberg, Karthik C.S., Safier; ESA '23], and is therefore conditionally optimal. As an extra benefit, our algorithm is remarkably simple.Authors: Jason M. Altschuler, Pablo A. Parrilo
Can gradient descent be accelerated by just choosing better stepsizes? Surprisingly, the answer is yes. This short expository article provides an accessible introduction to this phenomenon of stepsize hedging.Authors: Miltiadis Stouras, Vincent Cohen-Addad, Silvio Lattanzi, Ola Svensson
Retrieval-augmented generation (RAG) systems typically rely on a single retriever and a single set of hyperparameters, despite facing highly heterogeneous queries that range from simple factoid questions to complex multi-hop reasoning. We propose a method that automatically selects a small, diverse subset of retrievers (a portfolio) from a large pool of candidates, to cover different regions of the target query distribution. We formalize this setting via an expected best-of-$k$ objective over the query distribution and show that it admits an efficient portfolio construction algorithm with near-optimal guarantees. Across multiple QA benchmarks, our learned portfolios and router pipeline consistently outperform single-retriever and naive multi-retriever baselines on both retrieval metrics and answer quality. In addition, compared to inference-time hyperparameter tuning approaches, fixed portfolios enable parallel retrieval and LLM calls, achieving comparable (and sometimes better) accuracy with substantially lower latency and token cost.Authors: Shenghu Jiang, Ruihao Gong
We propose a novel algorithm for incremental Byte Pair Encoding (BPE) tokenization. The algorithm processes each input byte in worst-case $\mathcal{O}(\log^2 t)$ time, leading to an overall complexity of $\mathcal{O}(n \log^2 t)$, where $n$ is the input length and $t$ is the maximum token length. The algorithm incrementally maintains BPE tokenization results for every prefix of the input text, implementing the standard BPE merge procedure defined by a fixed set of merge rules. This enables efficient partial tokenization in streaming settings. Functioning as a drop-in replacement for standard BPE, our approach achieves a speedup of up to ${\sim}3\times$ over Hugging Face's tokenizers, and demonstrates significant latency reductions over OpenAI's tiktoken on pathological inputs. We further introduce an eager output algorithm that enables streaming output, emitting tokens as soon as token boundaries are determined during incremental tokenization. Overall, our results demonstrate that BPE tokenization can be performed incrementally with strong worst-case guarantees, while providing practical latency benefits in modern large language model pipelines. Code: github.com/ModelTC/mtc-inc-bpeAuthors: Vasileios Nakos, Hung Q. Ngo, Andreas Panayi
A classic result of Alon, Yuster, and Zwick (AYZ, Algorithmica 1997) shows that all $2k$-cycles in an $m$-edge graph can be listed in $\tilde O(m^{2-1/k}+t)$ time, where $t$ is the output size. This bound underlies the {\em submodular width} of Marx (JACM 2013) and the PANDA framework of Abo Khamis, Ngo, and Suciu (PODS 2017), which extend AYZ to arbitrary conjunctive queries with degree constraints. A central open question is whether combinatorial algorithms can beat the submodular-width barrier. Bringmann and Gorbachev (STOC 2025) gave lower-bound evidence that submodular width may be optimal for general conjunctive queries under combinatorial algorithms. The picture changes for $2k$-cycles on undirected graphs, whose queries have self-joins and symmetric EDBs: recent works improve on AYZ for even-cycle detection and listing. Pinning down the complexity of $C_{2k}$-detection and listing is thus a natural step toward overcoming the submodular-width barrier for such queries. For detection, Dahlgaard, Knudsen, and St{ö}ckel (STOC 2017) solved $C_{2k}$-detection in $\tilde O(m^{2k/(k+1)})$ time. Listing is harder. Jin and Xu (STOC 2023), and independently Abboud, Khoury, Leibowitz, and Safier (FSTTCS 2023), listed 4-cycles in $\tilde O(m^{4/3}+t)$ time; Vassilevska~Williams and Westover (ITCS 2025) listed 6-cycles in $\tilde O(m^{8/5}+t)$ time, improving the AYZ bounds of $\tilde O(m^{3/2})$ and $\tilde O(m^{5/3})$. The general case has remained open for 30 years. Building on these works, we list $2k$-cycles in $\tilde O(m^{(2k^2-k+1)/(k^2+1)}+t)$ time, improving AYZ for every $k\geq 3$. The key ingredient is an \emph{asymmetric supersaturation} result for even cycles. Our algorithms use only join and project operators over multiple tree-decomposition plans, making them naturally implementable in database systems, in contrast to prior BFS-based graph approaches.Authors: Yi-Jun Chang, Yang Ze Guan
We study the aggregation problem in synchronous multi-hop radio networks with $O(\log n)$-bit messages and no collision detection. Each node initially holds a value, and the goal is to compute a global aggregate such as the sum of all values. Aggregation tasks arise naturally in wireless sensor networks, where nodes are often battery-powered and radio activity is the dominant source of energy consumption. Accordingly, our main objective is to minimize the energy complexity, defined as the maximum number of rounds in which any node is awake. Our main result is a randomized distributed algorithm that, with high probability, constructs and executes an aggregation schedule in $O(n \operatorname{polylog} n)$ rounds and using $O(Δ^\ast \operatorname{polylog} n)$ energy, where $Δ^\ast$ is the minimum possible maximum degree of a spanning tree of the network graph. This guarantee is nearly optimal: for any aggregation schedule and any graph, there exists a node that must be awake for at least $Δ^\ast$ rounds. As a by-product, the algorithm also computes a spanning tree whose maximum degree is within an $O(\log n)$ factor of $Δ^\ast$, with the same round and energy guarantees. For every tree edge, both endpoints learn that the edge belongs to the tree.