OPML feed of all feeds.
Subscribe to the Atom feed, RSS feed to stay up to date.
Thank you to arXiv for use of its open access interoperability.
Note: the date of arXiv entries announced right after publication holidays might incorrectly show up as the date of the publication holiday itself. This is due to our ad hoc method of inferring announcement dates, which are not returned by the arXiv API.
Powered by Pluto.
Source on GitHub.
Maintained by Nima Anari, Arnab Bhattacharyya, Gautam Kamath.
from ECCC Papers
from ECCC Papers
from The Geomblog
News moves fast. 12 hours ago I was enjoying the demolition that the US put on Paraguay when I heard that Anthropic had shut down access to Fable and Mythos (their latest and most powerful models).
Since then, more news has surfaced about what went down, and I feel like it's a good exercise in understanding both the policy and psychodrama around AI today - with maybe even a moral or two like Aesop's .... FABLES (yes I'm going to keep making bad jokes and no you can't stop me)
Part 1: The eventLet's first lay out the facts of the matter. To the best of my knowledge, here's what transpired.
By Suresh Venkatasubramanian
News moves fast. 12 hours ago I was enjoying the demolition that the US put on Paraguay when I heard that Anthropic had shut down access to Fable and Mythos (their latest and most powerful models).
Since then, more news has surfaced about what went down, and I feel like it's a good exercise in understanding both the policy and psychodrama around AI today - with maybe even a moral or two like Aesop's .... FABLES (yes I'm going to keep making bad jokes and no you can't stop me)
Part 1: The eventLet's first lay out the facts of the matter. To the best of my knowledge, here's what transpired.
from David Eppstein
This past week was the final exam week for our spring term, and in it we also held the doctoral defense of another Ph.D. student, Vinesh Sridhar (advised by Mike Goodrich). Vinesh organized his thesis in an unusually symmetric way, with two parts on computing with noisy primitives, two parts on in-place algorithms, two parts on sequential algorithms, and two parts on parallel algorithms, in four combinations (noisy sequential, noisy parallel, in-place sequential, in-place parallel).
I’ve previously written here about Vinesh’s work with Mike and me on computational geometry algorithms with noisy primitives, which we published in WADS 2025. In this work, following a random walk back and forth along a search path in a history DAG or other geometric data structure (and occasionally, choosing a false step and then backtracking) allows algorithms based on this idea to avoid the logarithmic time penalty of repeating each noisy query for enough repetitions to be sure that it is accurate. Being sure means “with high probability”: the failure probability should be inversely polynomial in the input size.
Vinesh and Mike published a parallel follow-up to this at CCCG 2025, “Optimal parallel algorithms for convex hulls in 2d and 3d under noisy primitive operations”. A difficulty here is that, for a divide-and-conquer algorithm, a failure probability that is inverse polynomial in the size of a small subproblem may not be inverse polynomial in the much larger input size. To work around this, Vinesh and Mike used the idea of “failure sweeping”, in which one divides the input into polynomially-many subproblems, lets some of them fail with a probability that is larger than the target failure probability (but still small enough that with high probability there are few failures). Then, one cleans up the failed subproblems in a second pass using a fast and more reliable parallel algorithm with a greater total work bound that is cancelled by the smaller number of subproblems.
The in-place part of Vinesh’s thesis focused on in-place sorting algorithms, where one is allowed only \(O(1)\) extra words of memory beyond the array being sorted. The classical choice for this is heapsort, which can be implemented to run in-place and takes time \(O(n\log n)\) to sort an \(n\)-item array. This is optimal in the worst case, but many other sorting algorithms go beyond worst case time complexity to run more quickly for inputs that are not completely unsorted. An important class of these are entropy-sensitive sorting algorithms. Their runtime on input sequences that can be partitioned into \(r\) sorted runs of size \(n_i\) is \(O(n(H+1))\), where the entropy \(H\) is defined by
\[\mathrm{H}=-\sum_{i=1}^r \frac{n_i}{n}\log_2 \frac{n_i}{n}.\]This can be done by finding runs and then repeatedly merging the shortest two, but the merged runs may not be consecutive in the input sequence, making an in-place merge problematic. Instead, Timsort and several similar alternatives build a stack of runs in a left-right scan of the input, while doing so merging pairs of short runs near the top of the stack. This leaves a sequence of sorted runs on the stack whose sizes grow roughly according to a geometric sequence, which can be repeatedly popped and merged. The geometric growth of the stack means that the stack itself can be stored in \(O(\log n)\) space, but while small this is not small enough to be in-place. Vinesh finds two strategies for reducing the space in this family of algorithms. In the “walking” strategy, only the top \(O(1)\) stack elements are maintained; the next ones below that are rediscovered one at a time, as they become needed, by a sequential search within each run. In the “jumping” strategy, some elements within each run are placed out of order, in order to encode extra information to speed up the walking algorithm. Vinesh shows that most entropy-sensitive merging algorithms (but not Timsort itself) can be made in-place by walking, and that all known ones can be made in-place by jumping. He published this work (joint with Ofek Gila and Mike Goodrich) as “How to sort in a refrigerator: Simple entropy-sensitive strictly in-place sorting algorithms” in LATIN 2026.
The final part of the thesis comes from a paper with the colorful title “Raiders of the lost log: Synchronous parallel in-place models and algorithms” (with Goodrich in the 28th Workshop on Advances in Parallel and Distributed Computational Models, 2026, and selected as an “outstanding paper” for APDCM 2027). It’s not currently free online but Vinesh and Mike promise to make an arXiv version soon. This part involves PRAM model parallel machines where the only shared memory is the input and each parallel processor has only a bounded amount of private memory. The title comes from a technique inspired by the first scene of the first Indiana Jones movie, in which Jones (unsuccessfully) swaps a sandbag for a booby-trapped gold figurine. In their algorithmic technique, Vinesh and Mike have the parallel processors each swap their working memory for a piece of the input data that they control, freeing up enough shared memory to perform their algorithms. They apply this technique not only to in-place sorting, but also to convex hulls, parallel primitives such as prefix sums and tree contraction, and the generation of random permutations.
The defense was presented very smoothly and clearly, and Vinesh’s many results (including several papers not included in the thesis) made for an easy pass of his defense. Congratulations, Vinesh!
The 10th edition of WoLA, the Workshop on Local Algorithms, will be taking place on August 16-18, 2026 at at Boston University, Boston, MA, just before APPROX/RANDOM!
For those unfamiliar with WoLA:
Local algorithms, that is, algorithms that compute and make decisions on parts of the output considering only a portion of the input, have been studied in a number of areas in theoretical computer science and mathematics. Some of these areas include sublinear-time algorithms, distributed algorithms, inference in large networks and graphical models. These communities have similar goals but a variety of approaches, techniques, and methods. This workshop is aimed at fostering dialogue and cross-pollination of ideas between the various communities.
You are all invited! For more on (free) registration, how to submit a poster or propose a talk, list of invited speakers, and local arrangements, see
the website.
Don’t miss it! Not only is WoLA a truly enjoyable and interesting event, showcasing exciting new results and directions of research, it’s also one centered around an incredibly welcoming and collegial (and fun!) community of researchers. But that’s not all! Expect a particularly amazing edition this year, as this is the 10th anniversary edition of WoLA, and it is being held on the top floor of the new Computer and Data Sciences building at BU, with breathtaking views of Boston.
Program Committee: Maryam Aliakbarpour (Rice University), Sepehr Assadi (University of Waterloo; Chair), Eric Blais (University of Waterloo), Yi-Jun Chang (National University of Singapore), Talya Eden (Bar Ilan University), Nathan Harms (University of British Columbia), Akash Kumar (IIT Bombay), Yannic Maus (TU Graz), Sofya Raskhodnikova (Boston University; Local Chair)), Chen Wang (Rensselaer Polytechnic Institute).
from Ben Recht
Leif Weatherby, Tyler Shoemaker, and I have a new essay out today in The Ideas Letter about the creation of reality through pseudoscience, religion, and psychosis. Of course, it’s mostly about AI. Starting with the bizarre appearance of Antropic co-founder and mechanistic interpretability zealot Chris Olah at the side of Pope Leo as he read out his encyclical, we explore how commercial large language models automate the creation of reality.
The term of art for the process by which abstract concepts are made into material objects is reification. Marxist theorists think of the reification of social relations, where qualitative things like actions become commodified. This is the process by which labor becomes objectified and priced, and subsequently alienated from the person who does the work. Distinctly, historians and philosophers of science think of reification in terms of how correlational associations are named and become real through coordinated research programs. When your biggest companies all claim to be doing research, you can see how these two become one, with coordinated research becoming commodity itself.
LLMs are not only generated by a technological oligarchy, but also introduce a new twist into the process of reality generation. LLMs automate reification. LLMs are designed to produce explanations for us. We skip the steps in the collective creation of scientific concepts (aka thinking and arguing) and let LLMs do it for us through their linguistic association. LLMs entice you to skip the step of interpretation of outputs. Why not just ask Claude what it thinks?
As an illustrative example, we discuss Stephen J Gould’s historical analysis of the g factor in psychometrics research. In the social sciences, everything is correlated with everything. If you run some sort of correlational analysis on any data set, you’ll find something. Gould describes the process through which psychometricians decided that the correlational quantity that linked different measures of intelligence—the g factor—was real, and how they created a whole scientific research program to reinforce the reality of the g factor.
You know who still loves factor analysis? Mechanistic interpretability researchers. AI researchers have found themselves enamored with psychometrics as they work to prove their LLM toys are conscious. But they have new technology that Charles Spearman and his colleagues didn’t have a century ago: the LLMs themselves! Whereas 20th-century psychometrists had to think about causal stories to explain their strong correlational findings, interpretability researchers can automate this process. They can give an LLM the output of a factor analysis and ask it to explain the factors. Since it returns answers in natural language, this feels like discovery even though you have offloaded the process that would have been done by thinking about it to a machine. This process sets off a loop in which scientists and citizens alike give iterative credence to concepts generated by correlating text. We write:
“Models, in other words, kick off an associative chain of ideas by effectively auto-labeling queries. It’s like taking the principal components derived from that data about oak trees, boat speeds, cat whiskers, and Letterboxd reviews, and asking ChatGPT: “What do each of these artifacts mean?” ChatGPT will respond—and then keep the conversation going, bringing in more associations that more or less fit. But as we have already argued, this doesn’t only happen to amateurs who are easy to pathologize. How is this different from the standard methodology of interpretability research? In both the cases that might be dismissed as psychosis and the ones celebrated at AI conferences, interaction with LLMs induces mental friction. They create a feeling that discovery is there. By elaborating on what you put in through a context that the model has trained on, it is able to make connections that feel both correct and expansive, filling in the area around your thought—simulating the feeling that you are having a new thought. The model helps you refine the obscurity of your prompt through a chain of associations, and suddenly you have something. This is reification at work. And when the next link in a chain of thoughts comes along, it becomes hard to resist prompting the model again.”
LLMs are built to automate a complex web of reification. They are designed to tell us what we want to hear. As we write, “Trained on an enormous corpus of human writing, speech, and code, and tuned to refine responses around context and memory as user interactions unroll, a model of this kind is designed to provide the sense that one’s expectation is being exceeded.”
Artificial intelligence is a bizarre technology that participates in its own mythmaking. Anthropic is the most deeply invested in telling outrageous stories about its products, but no one in this space is innocent. As I’ve written before, I think that benchmarks get an unnecessarily bad rap. I certainly agree that you can’t “benchmark general purpose technology.” I agree that maxing out benchmark scores doesn’t mean a product will be better received by customers. But when we move away from benchmarking in artificial intelligence, we get stuck in storytelling. LLMs are designed and intended to convince us that their stories are real. When backed by a turbocharged capitalist engine, the reification becomes inevitable. Stories become commodities. Those commodities sell like hotcakes. And there is nothing more real than when the number goes up.
I’ll leave you there, as I need to go check on the value of my SpaceX shares. Read the whole article and let us know what you think.
from CS Theory Events
By shacharlovett
Authors: Jan Krajicek
Inspired by a statement about Extended Frege proof systems by Jain and Jin (FOCS 2022) we prove that: - there is a p-time binary relation $\approx$ between circuits that implies their logical equivalence, - the relation $\approx$ implies that each of the two circuits can be rewritten into the other one by possibly deleting some gates and adding at most seven new gates, - if the equivalence $C \equiv D$ has a size $s$ proof in an Extended Frege or a Circuit Frege proof system then there is a chain of circuits $E_i$ $$ C = E_0 \approx \dots \approx E_t = D $$ with $t \le s^{O(1)}$.Authors: Jie Wang
AI-augmented computing delegates natural language queries, code generation requests, and other open-ended tasks to a cluster of AI models that processes queries and generates responses. This paradigm introduces a resource dimension that neither classical time nor space complexity captures: the cost of sending queries to and receiving responses from such a cluster. We introduce token complexity, a formal resource measure defined as the minimum expected token cost to achieve a specified level of output quality on a task, and develop a taxonomy classifying AI systems by the strength of their probabilistic properties. We develop token complexity within the framework of AI-Oracle Turing machines, in which a probabilistic Turing machine interacts with a stochastic oracle via dedicated query and response tapes. We prove basic theorems establishing that token complexity behaves as expected: monotonicity (higher quality costs more tokens), convexity (quality improvements become progressively more expensive), price sensitivity (small price changes produce bounded cost changes), and price-relativity of task ordering (the token complexity ordering of tasks can reverse depending on the query-to-response cost ratio). We prove that the complexity frontier, defined as the set of all feasible resource bounds in tokens, time, and space, is non-empty, upward-closed, and convex.Authors: Bruno Loff, Suhail Sherif, Navid Talebanfard, Francesca Ugazio
Razborov and Rudich (JCSS'97) observed that all known lower-bound proofs follow a certain pattern: when showing that a function $F$ is hard, along the way the proof provides us with a distinguisher, namely, an efficient algorithm which can distinguish easy functions from random functions. They called such lower-bound proofs natural proofs. They then showed a natural-proofs barrier: under standard cryptographic assumptions, natural proofs cannot show superpolynomial lower-bounds against Boolean circuits. Along similar lines it can be shown that under a suitable cryptographic assumption, natural proofs cannot significantly improve the current state-of-the-art lower bound against constant depth circuits (AC0). The state of the art, using Håstad's Switching Lemma (SL), is $2^{n^{1/(d-1)}}$ for depth-$d$ circuits, and (conditionally) no natural proof can prove lower bounds of $2^{n^{c/d}}$ for some large constant $c$. In this paper we revisit the natural-proofs barrier from an $\textit{unconditional}$ perspective. We focus on AC0-natural proofs, i.e. proofs whose distinguishers are computable by AC0 circuits. Razborov and Rudich observed that lower bounds based on SL are AC0-natural. We show that this is true for most known lower-bound techniques against constant-depth circuits. We then establish an unconditional barrier for such proofs. By localizing the Trevisan--Xue pseudorandom generator, we are able to show that no AC0-natural proof can prove a lower bound greater than $2^{n^{7/(d-5)}}$ against depth-$d$ circuits. This is in the same quantitative regime as the SL frontier which instead has $1/(d-1)$ in the power of $n$. The proof has a striking self-referential aspect: the proof of security of the Trevisan--Xue generator crucially relies on SL, and so SL has been used to show that AC0-natural proofs, such as SL itself, cannot prove AC0 lower bounds better than that of SL.Authors: Karthik Sheshadri
The border determinantal complexity $\dcb(f)$ of a polynomial $f$ is the least $m$ such that $f$ is a limit of determinants of $m\times m$ matrices of affine-linear forms. We prove that for every $n\ge3$, over $\CC$, \[ \dcb\Big(\sum_{i=1}^n x_i^n\Big)\ \ge\ \frac{(n-1)^2}{4e}, \qquad \sdcb\Big(\sum_{i=1}^n x_i^n\Big)\ \ge\ \frac{(n-1)^2}{2e} \] in the ordinary and symmetric models respectively; both match the known $O(n^2)$ upper bounds up to the constant. To our knowledge these are the first border determinantal lower bounds for an explicit family that are superlinear in the number of variables: the known quadratic border bound for the permanent reads the \emph{dimension} of the dual variety and is linear in its number of variables, whereas we transfer the dual \emph{degree}. The proof has two ingredients. The first is an unconditional bound on the slot-$(n-2)$ conormal multidegree of the multiplicity-one Gauss-graph cycle of an arbitrary affine-linear determinant -- singular, reducible, and non-reduced fibers allowed -- by a multihomogeneous Bézout count of a lifted kernel incidence. The second is a specialization argument: along any degeneration $\det A_c\to\sum_ix_i^n$, the flat limit of these Gauss-graph cycles contains the conormal variety of the Fermat cone with positive coefficient. A cone-shift identity converts that conormal multidegree into the classical dual degree $n(n-1)^{n-2}$ of the smooth Fermat hypersurface, and an $(n-1)$-st root yields the quadratic bound. The exact lower bounds of the author's companion manuscripts follow as corollaries.Authors: Luca Castelli, Paolo Massazza
The degree of convexity of a convex polyomino P is the smallest integer k such that any two cells of P can be joined by a monotone path inside P with at most k changes of direction. In this paper, we present a set of formulas and equations that are the basis of a C++ program that allows you to compute the longest counting sequence known to date (with respect to the area) of convex polyominoes of degree of convexity at most 2Authors: Faruk Alpay, Baris Basaran
Every finite family of compact convex bodies in Euclidean space induces an apartness relation between disjoint index sets: two sets are apart when the convex hulls of the corresponding unions are disjoint. This paper studies the finite theory obtained by taking apartness as the primitive relation. Its basic laws are symmetry, bilateral subsumption, and vacuity, equivalently the separation-polarity form of acyclic separoids. The main contribution is an effective rational realization theorem with uniform margins and the exact consequence theory it supports. Every finite apartness separoid is realized by rational polytopes whose coordinates are indexed by maximal separations. Maximal separations and minimal Radon partitions can be enumerated from a full table, generators, or a membership oracle; the coordinate values have controlled bit height; and each coordinate records a readable certificate of one maximal separation. The realization separates every apart pair with clearance at least 2, remains correct under outer parallel enlargement by any radius below 1, and yields full-dimensional convex bodies after thickening. The distance-function layer records standard convex-analytic stability through Lipschitz comparison, monotonicity under inclusion, and outer parallel bodies. On the logical side, positive entailment is exactly one-premise subsumption. Boolean consequence over Euclidean scenes is sound, complete, and decidable; satisfiability is NP-complete, validity is coNP-complete, and positive entailment is linear for sorted encodings. A stratification theorem shows that Boolean reasoning introduces no new atomic apartness beyond separoid closure. Fixed-dimensional consequence relations form a strictly decreasing hierarchy that stabilizes in dimension n minus 1 for n sites.Authors: Flavio Chierichetti, Mirko Giacchini, Ravi Kumar, Alessandro Panconesi, Erasmo Tani, Andrew Tomkins
In this work we settle the complexity of three sketching problems. (i) We show that sketching vertex neighborhood sizes in graphs requires $Ω(n^2)$ bits, standing in sharp contrast to the $\tilde{O}(n)$ complexity of sketching edge cuts. (ii) We obtain tight lower and upper bounds of $\tildeΘ(n^2)$ for sketching coverage functions with additive and multiplicative errors. (iii) We prove an $Ω(n^2)$ lower bound for sketching Random Utility Models under the $\ell_\infty$-norm, improving upon the previous $Ω(n \log n)$ bound and matching a known upper bound to within logarithmic factors. These bounds are obtained through a connection with the problem of sketching the intersection profile of a distribution $D$ on $2^{[n]}$. Specifically, we seek a succinct data structure that, for any query set $S \subseteq [n]$, approximates the quantity $\Pr_{T \sim D}[T \cap S \neq \varnothing]$ to within a small constant additive error. One can obtain lower bounds for this latter problem directly from known results about the itemset frequency estimation problem in databases for which tight bounds are known. As an additional contribution, we also provide an alternative proof for the intersection profile sketching lower bound, in the setting in which the accuracy parameter is constant. This proof relies solely on elementary probability avoiding the heavier machinery used in previous proofs.Authors: Tatiana Rocha Avila, Holger Dell, John Lapinskas
The voter model is a classical stochastic process that models how opinions might spread through a network: at each step, every node lazily adopts the opinion of a random neighbour; eventually all nodes share the same opinion (consensus). Stronger connectivity should yield faster consensus. Berenbrink, Giakkoupis, Kermarrec, and Mallmann-Trenn (ICALP 2016) make this precise via the network's conductance: if the network has $m$ edges, minimum degree $d_{\min}$, and conductance at least $φ$, then the voter model reaches consensus in expected $O(m/(d_{\min}φ))$ steps. Their results extend to dynamic networks with fixed vertex degrees by considering the network's conductance at each time step. We introduce temporal conductance $Φ$, a more general connectivity measure for dynamic networks. Unlike static conductance, which collapses to $0$ whenever some snapshot is disconnected, $Φ$ captures connectivity through edges that appear at different times. We generalise the results of Berenbrink et al. from static conductance to temporal conductance, showing that the expected consensus time of the standard voter model is at most $O(m/(d_{\min}Φ))$. Moreover, we prove that this bound is tight up to constant factors. We expect temporal conductance to be a useful primitive for analysing other dynamics on temporal networks, and potentially time-inhomogeneous Markov chains more generally.Authors: Georg Hasebe, Johannes Lengler, Raghu Raman Ravi
We study the $(1 + 1)$-EA in dynamic linear environments, where in every generation selection is performed with respect to a freshly sampled linear function with positive weights. We consider the Dynamic Binary Value problem, where each generation uses a uniformly random permutation of $1,2,4,\dots,2^{n-1}$, and a Uniform weight variant, where the weights are drawn independently from $\mathrm{Unif}(0,1)$. Both of them have recently been integrated into the IOHprofiler platform and empirically studied. For both models we prove a sharp threshold in the mutation parameter $χ$ for mutation rate $χ/n$. Below the threshold, the expected optimisation time is $\mathcal{O}(n\log n)$, whereas above it the runtime becomes $2^{Ω(n)}$. For the Dynamic Binary Value problem in the exponential regime, we also quantify at what distance from the optimum the optimisation process stagnates. We show that there is a second threshold: a distance that is efficiently reached, but reaching any smaller distance takes exponential time. This quantifies and proves previous empirical findings.Authors: Faruk Alpay, Levent Sarioglu
We study retrospective auditing for dynamic ordered sets maintained by an untrusted party. A passive auditor watches insert, delete, membership, predecessor, successor, min, and max operations, stores five machine words and a flag, and receives a constant-size public tally record per operation. At audit time the maintainer discloses the claimed live vacant intervals. The method represents order semantics by maximal gaps: gaps are born, cited, consumed, and timestamped, while two hidden field accumulators test equality of the birth and consumption ledgers. Honest executions are accepted with probability one. If any answer in a T-operation session is wrong, acceptance occurs with probability at most (4T+1)/p over one secret field element, against computationally unbounded maintainers. We prove that deterministic and visible-coin auditors require linear state, and that removing the timestamp rule permits an exact replay forgery. A leaf-oriented (2,4)-tree implements the maintainer in O(log n) worst-case time per operation with one extra word per element, and its rebalancing events admit an auditable O(m) envelope over m updates. Checkpoint audits compose with additive error.Authors: Alexander Omelchenko
We present an algorithmic framework for the exhaustive generation and tabulation of knot and link diagrams on the thickened torus T^2 x I, based on the theory of maps on surfaces. Cellular 4-regular torus projections are encoded by permutation pairs (alpha, sigma), and unsensed equivalence classes are enumerated completely and without duplication via canonical representatives. Crossing assignments, local diagram-level reductions, and the generalized Kauffman-type bracket are formulated entirely within the same permutation model. The pipeline is validated against published genus-one classifications for crossing numbers N <= 5 and then extended to N = 6, 7, 8, producing, to our knowledge, the first complete genus-one tabulation at these crossing numbers under the stated comparison conventions. The resulting dataset contains more than 33,000 knot and link types. Besides the tables, the computation yields proved structural facts, including a parity statement for the a-span of the bracket and a sharp upper bound N-1 for the number of bigon faces in a 4-regular torus map. It also suggests several conjectures, among them a formula for the maximum number of straight-ahead components, the absence of equi-quadrilateral knot projections, and a 4N upper bound for the genus-one bracket span.Authors: Nathanaël Hassler, Vincent Vajnovszki
Permutations avoiding a pattern of length three are enumerated by the Catalan numbers. In this work, we present methods for ranking and unranking such permutations in lexicographic or colexicographic order.Authors: Elena Barcucci, Antonio Bernini, Stefano Bilotta, Renzo Pinzani
We study k-coloured Motzkin paths, namely Motzkin paths in which horizontal steps can be coloured in k different ways, and investigate their connection with the number of prefixes ending at odd height from both an analytical and a combinatorial point of view. Moreover, the combinatorial approach provides a random generation algorithm for k-coloured Motzkin paths in linear-time.Authors: Kaito Baba, Evripidis Bampis, Giorgos Mitropoulos
Recently, Antoniadis et al. (ICLR 2025) proposed a framework for incorporating predictions to approximate NP-hard selection problems. Despite its simplicity, this approach tightly matches theoretical lower bounds, making its generalization highly compelling. We address an open question raised in the work of Antoniadis et al., concerning the extension of this approach to other important problems outside the class of selection problems, such as scheduling. We develop a learning-augmented algorithm for the makespan minimization problem on unrelated machines, denoted by $R\|C_{\max}$. By using predictions of heavy job assignments, we achieve a polynomial-time $(1+\varepsilon)$-approximation for accurate predictions that smoothly degrades to a worst-case 2-approximation as the error increases. We conclude our work with an empirical analysis of our method.Authors: Ali Dasdan
Binary search is deceptively simple in concept yet notoriously difficult to implement correctly. This paper presents a unified treatment of binary search: five core variants, six derived query functions, and four standard library implementations (BSD, glibc, Java, C++ STL), each with consistent notation, loop invariants, and analysis. We introduce bsearch_ultimate, a combined search that subsumes all variants in a single call. Every algorithm is provided as synchronized Python code, Dafny formal proof, and pseudocode. All implementations are validated by over 9,500 tests and 21 Dafny formal verifications; an additional six deliberately faulty implementations demonstrate common bug categories and Dafny's ability to detect them. We also provide memorable rules linking boundary choices to loop conditions and update formulas.Authors: Ziao Wang, Lei Ying
This paper studies a variation of the classic network alignment problem, named diffusion-network alignment. The goal is to align the vertices of a rooted diffusion tree to the vertices of a network, where the diffusion tree could be from a communication trace or contact tracing, and the network could be an online or offline social network. Different from the classic network alignment where both networks are fully observed, this model captures the information asymmetry of two networks. To solve this problem, this paper presents an efficient algorithm based on tree correlation tests to extract alignment information from local neighborhoods. We analyze the performance of the algorithm in the sparse graph regime and show that with high probability, all matched pairs are correct. Furthermore, for each vertex on the diffusion tree, this paper establishes an explicit lower bound on the probability that the vertex is correctly matched. These lower bounds are depth-dependent and increase as vertices get closer to the root.Authors: Aditya Bhaskara, Ashok Cutkosky, Ravi Kumar, Manish Purohit
We study the problem of selecting the largest among $n$ unknown values $x_1,\dots,x_n$ given only a single unbiased estimate $y_i$ for each $x_i$. We design strategies that are simultaneously admissible (not uniformly dominated by any other strategy) and also never worse than a given baseline such as uniform random selection. We provide an application to stochastic optimization, where we obtain online-to-batch conversion bounds with a desirable "no-compromise" guarantee: they are never worse than standard random iterate selection, and yet can be significantly better in benign settings.Authors: Yunbum Kook, Santosh S. Vempala
We give a simple, unified, and nearly tight bound for sampling arbitrary logconcave distributions from a warm start using the In-and-Out algorithm along with exponential lifting. The main new ingredient in the analysis is an improved bound on the Poincaré constant of a lifted distribution. As a consequence, the resulting convergence rate is nearly tight for both constrained settings (e.g., Gaussian restricted to a convex body) and well-conditioned settings (e.g., strongly logconcave and smooth densities).Authors: Ahmed M. Alzuhair, Ahmed Alherz
We propose a randomized local-improvement algorithm for the Maximum Weighted Matching (MWM) problem. Our method introduces a softmax-based biased sampling mechanism that achieves local $\varepsilon$-dominance and yields an expected $\frac{1}{2}-\varepsilon$ approximation ratio. We prove convergence guarantees and show that the algorithm runs in $O\!\left(m\log(1/\varepsilon)/p_{\min}\right)$ time, where $p_{\min}$ is the minimum softmax proposal probability over all edges; under mild conditions on the bias parameter and weight range, this simplifies to $O(m\log(1/\varepsilon))$. The framework provides a tunable tradeoff between convergence speed and approximation quality.Authors: Sasha Voitovych, Abhishek Shetty, Noah Golowich, Alexander Rakhlin
Understanding the minimal assumptions necessary for generalization is the fundamental question in learning theory. Unfortunately, most results rely heavily on independence (or some proxy thereof) of the data-generating process, while results for strongly dependent data are far more limited. Towards addressing this gap, we introduce the framework of simulatable processes, where the learner has access to a simulator that approximates the distribution generating the data (which may be an arbitrarily complex and dependent process). Surprisingly, given access to such a simulator, we show that we can recover the same learning guarantees as in the classical setting with independent data, namely, error bounds that depend on the VC dimension. Further, we use this framework to study the power of conditional sampling and show strict statistical and computational advantages in this setting. As a highlight of our framework, we exhibit a single algorithm that simultaneously learns any given VC class under all processes samplable in bounded polynomial time, with regret controlled by the time-bounded Kolmogorov complexity of the process. This provides a significant conceptual broadening of the classical PAC model.Authors: Maximilian J. Kramer, Carsten Schubert, Jens Eisert
For general max-k-XORSAT with $k \geq 3$, no polynomial-time algorithm can do substantially better than random guessing on worst-case instances unless $\mathsf{P} = \mathsf{NP}$: approximating beyond the random-assignment value of $1/2$ is $\mathsf{NP}$-hard. The picture changes when each variable appears in at most $D$ constraints. In that bounded-degree setting, polynomial-time algorithms can provably beat the random baseline by an additive amount of order $1/\sqrt{D}$. For Boolean instances, this scaling is known to be optimal: the matching hardness result is due to Trevisan, while the corresponding algorithmic guarantee was established by Barak et al. Whether the same holds over general finite fields, and what it implies for quantum algorithms, has not been established. We make this connection explicit and extend the hardness to max-E$k$-LINSAT$(q,r)$ with bounded degree $D$ and over arbitrary finite fields $\mathbb{F}_q$, proving that it is $\mathsf{NP}$-hard to exceed $r/q + \mathcal{O}_{q,r}(1/\sqrt{D})$. These results provide the complexity-theoretic benchmark for the bounded-degree instances targeted by decoded quantum interferometry (DQI), QAOA, and classical heuristics. Any quantum advantage on bounded-degree instances is therefore confined to the constant prefactor. We further show that in the context of DQI and on $(k,D)$-regular instances, this prefactor is sensitive to the nature of the decoder: DQI with classical decoders faces an information-theoretic $1/\sqrt{D \log D}$ barrier that prevents it from matching the hardness scaling, while DQI with quantum decoders is compatible with the $1/\sqrt{D}$ scaling -- identifying quantum decoding as the key ingredient for matching the complexity-theoretic scaling with DQI.Authors: Yumou Fei, Ronitt Rubinfeld
The seminal work of Goldreich and Ron (\textit{Combinatorica, 1999}) showed that bipartiteness of bounded-degree graphs can be tested using $O(\sqrt{n\log n})$ random walks of length $O(\log^{6} n)$. In this work, we improve their result by showing that $O(\sqrt{n})$ random walks of length $O(\log n)$ suffice. As a corollary, we obtain an $O(\log n)$-pass, $O(\sqrt{n}\log n)$-space streaming algorithm for testing bipartiteness, whose pass complexity is optimal in light of a recent lower bound of Fei, Minzer, and Wang (\textit{arXiv, 2026}). Our proof takes a different approach from that of Goldreich and Ron, using the semidefinite programming relaxation for Max-Cut introduced by Goemans and Williamson (\textit{J. ACM, 1995}).Authors: Ari Biswas, Graham Cormode, Yaron Kanza, Divesh Srivastava, Zhengyi Zhou
The task of finding _Hierarchical_ Heavy Hitters (HHH) was introduced by Cormode et al. [VLDB 2003] as a generalisation of the heavy hitter problem. While finding HHH in data streams has been studied extensively, the question of releasing HHH when the underlying data is private remains unexplored. In this paper, we study differentially private HHH release in both the streaming and non-streaming setting. In the non-streaming setting, we show the surprising result that the relative error in estimating the residual count for any prefix is independent of the height of the hierarchy and the number of heavy hitters in the stream. Meanwhile, in the streaming setting, although the exact version of HHH has low global sensitivity (as counting queries are 1-sensitive), the approximation functions due to streaming have high global sensitivity, linear in the available space. Despite this obstacle, we show that the absolute error for estimating frequencies in the steaming setting is independent of the available space.from Emanuele Viola
Enrollment in computer science is declining, out of fear that AI will wipe out much of the demand for software engineers. So here are the jobs of the future:
In 2008, when I was on the job market, computer science was at its nadir. The people behind the doors I was knocking on told me kids were told to study biology and law instead. What happened instead is that people entering the field precisely at that point were going to graduate at a very good time.
Authors: Krti Tallam
Enterprise security was built to govern data boundaries: the protected surface was data at rest and in transit, and the controls -- access control, data-loss prevention, perimeter inspection -- governed crossings of that boundary. Production AI agents dissolve this assumption. An agent reads context, calls tools, invokes connectors, and modifies systems of record on an enterprise's behalf, so risk moves inside the workflow, into sequences of individually-permitted actions that may transform a business process no one authorized. Existing policy engines do not extend to this regime: they evaluate request-time decisions against atomic principals, where agentic systems require stateful evaluation against composite principals whose authority attenuates through delegation chains. We present a reference architecture for the runtime governance of production agents, built from four composable primitives: a five-plane decomposition (a reasoning plane that adjudicates intent, and four enforcement planes -- network, identity, endpoint, data -- that realize the decision), stop-anywhere mediation, composite principals with capability attenuation, and audit as a structured evidence substrate. We define a taxonomy of six interruption primitives that generalize allow and deny, state and argue for four correctness invariants, and demonstrate the foreclosure of seven production-agent threats across five concrete workflows. A reference implementation of the policy-engine core supplies measured evidence: attenuation correctness and evidence reconstructability hold on every trial, adjudication runs in single-digit microseconds, and the audit substrate's tamper-evidence behaves exactly as designed. We are explicit about scope: the architecture governs delegated action, not model behavior, and a full-system evaluation against a live agent benchmark is the invited next step.Authors: Ruichen Qiu, Yichuan Cao, Qiao-Long Huang, Ruyong Feng, Xiao-Shan Gao
In this paper, we prove that output-sensitive sparse polynomial GCD computation over finite fields is NP-hard under BPP many-one reduction. More precisely, for two sparse univariate polynomials $f,g$ with finite field coefficients, there exists no randomized algorithm to compute $\mathrm{gcd}(f,g)$, which is polynomial-time in the sizes of $f,g,\gcd(f,g)$ under the standard complexity assumption $\mathrm{NP}\nsubseteq\mathrm{BPP}$. This settles the open problem posed as Challenge 5 in The Sparsity Challenges in the finite field setting. Furthermore, we show that the Roots of Unity Detection problem over finite fields is NP-hard; that is, determining whether the GCD of a sparse univariate polynomial and $x^n - 1$ has nonzero degree is NP-hard.Authors: Yichuan Cao, Ruichen Qiu, Qiao-Long Huang, Ruyong Feng, Xiao-Shan Gao
In this paper, we show that deciding whether a sparse polynomial does not divide another sparse polynomial exactly over finite fields is NP-hard under BPP many-one reductions. Equivalently, the sparse polynomial divisibility test over finite fields is CoNP-hard. This resolves the long-standing open problem concerning the computational complexity of the divisibility test for sparse polynomials in the setting of finite fields.Authors: Qiao-Long Huang, Yichuan Cao, Ruichen Qiu, Xiao-Shan Gao
Sparse polynomial multiplication is a fundamental problem in computer algebra and the theory of computation, and the development of a quasi-linear time output-sensitive multiplication algorithm has been posed as an open challenge. In this paper, a counterexample is provided to a previously claimed solution to this open problem for integer coefficients. By employing the existing quasi-linear modular-black-box interpolation algorithm, we are able to provide an algorithm with quasi-linear bit complexity for the integer coefficients setting. Furthermore, in the case of coefficients over a finite field, we obtain an algorithm whose bit complexity is linear in the number of terms, the logarithm of the degree, and the logarithm of the size of the finite field.Authors: Arie Soeteman, Balder ten Cate, Maurice Funk, Benny Kimelfeld, Carsten Lutz, Moritz Schönherr
The conventional approach to deep learning over relational databases applies neural models, such as Graph Neural Networks (GNNs), to a graph representation of the database. Recent approaches instead operate on databases directly, associating tuples with embeddings and extending query mechanisms to jointly process embeddings and relational content. Inspired by these developments, we introduce Neuro-Relational Programs (NRPs), a declarative query language for relational databases whose facts carry numeric vector embeddings. NRPs extend Datalog-style rules with operations that combine, aggregate, and transform embeddings, thereby interleaving relational reasoning and learnable neural components within a single formalism. This yields a general approach to neural computation over relational data: an NRP can be read both as a query plan with trainable components and as a neural architecture with relational structure built in. Natural syntactic fragments of NRPs recover existing architectures and query formalisms. Zero-ary NRPs correspond to non-adaptive query algorithms; monadic NRPs generalize GNN-style message passing and precisely capture Deep Homomorphism Networks, a connection that we extend to frontier-guarded NRPs over databases with row-ids. We characterize the expressive power of unrestricted NRPs with ReLU-FFN transformations by FOCQ, an extension of first-order logic with counting interpreted over real-weighted structures, yielding a precise connection with uniform TC$^0$ over ordered databases. Together, these results establish NRPs as a broad declarative framework for querying and neural computation over relational data.Authors: Qi Duan
We study the undirected three-terminal reachability-preserving minimum edge cut problem. The input is an undirected graph $G=(V,E)$ with nonnegative edge costs, two protected terminals $s_1,s_2$, and a target terminal $t$. The goal is to remove a minimum-cost edge set so that $t$ is disconnected from the protected terminals while $s_1$ and $s_2$ remain connected. This problem captures a basic tension between separation and connectivity preservation. Prior work on connectivity-preserving cuts established polynomial-time solvability for some special cases, such as planar edge-cut instances, and strong hardness for node-cut variants, but a general-graph approximation guarantee for the undirected three-terminal edge-cut version does not appear to have been known. We give a polynomial-time $O(\sqrt n)$-approximation algorithm in this paper. This is the first known approximation algorithm for the problemAuthors: Arjen Simons, Sarah Schöttler, Wouter Meulemans, Kevin Verbeek, Bettina Speckmann
Thematic maps visually communicate statistical information about spatial units such as countries or states. They must balance the individual readability of those map elements that carry the statistical information and the overall cartographic context. Nowadays, most maps are not static images, but must flexibly respond to a range of device types and display sizes. Current approaches to responsive thematic mapping are limited: they are labor-intensive for practitioners and often rely on combining disjointed visual encodings to cover different device types. In this paper we introduce the first algorithmic framework to efficiently compute responsive thematic maps that smoothly adapt to different display sizes. A key component of our framework is the layout guide: a combinatorial structure which encodes the two essential aspects of a thematic map. The first aspect are the visual requirements of each statistical map element (at least their desired width and height), the second aspect is the cartographic context in the form of relative positions of map elements. Our main algorithmic contribution is the map arranger which takes a visual container as input and returns a suitable layout guide. The map arranger does so in a stable and consistent manner: if the container changes only a little, then so does the layout guide, and the same input container always results in the same layout guide. To use our framework, one needs three ingredients: $(1)$ a reference layout, which corresponds to the ``ideal'' thematic map, $(2)$ a total vertical and horizontal order for all map elements (the desired layouts for containers with extreme aspect ratios), and $(3)$ a thematic mapping algorithm that can construct a thematic map from a layout guide. We demonstrate our framework on two types of thematic maps, namely rectangular and Demers cartograms.Authors: Yuzhou Gu, Xin Li, Yinzhan Xu
We study the query complexity of Boolean functions $f: \{0, 1\}^n \rightarrow \{0, 1\}$ in the noisy query model introduced by Feige, Raghavan, Peleg and Upfal [SICOMP 1994]. In this model, an algorithm can adaptively query the bits of an input vector, but each query result is independently flipped with constant probability $p \in (0, 1/2)$; repeated queries are allowed. The noisy query complexity $\mathsf{N}_p(f)$ of a function $f$ is defined as the minimum expected number of queries needed to compute $f(x)$ with error probability at most $1/3$, for the worst case input $x$. We prove a general lower bound on $\mathsf{N}_p(f)$ based on degree statistics of certain subgraphs of the Boolean hypercube. This is the first general lower bound beyond those implied by the simple observation that $\mathsf{N}_p(f)$ is lower bounded by the randomized query complexity. We show that this recovers (up to a constant factor) most previously known lower bounds on the noisy query complexity of Boolean functions, providing a unified framework for understanding these results and simplifying the proofs in several cases. Furthermore, this resolves in the affirmative an open problem of Gu, Li and Xu [COLT 2025] that $\mathsf{N}_p(f) = Ω(\mathsf{I}(f) \log \mathsf{I}(f))$, where $\mathsf{I}(f)$ denotes the total influence of $f$. We also apply our general lower bound to obtain tight bounds on the noisy query complexity for several new functions.Authors: Christoper Musco, Indu Ramesh
A large body of work studies the problem of learning an approximation to an implicit matrix $A\in \mathbb{R}^{m\times n}$ that is only accessible implicitly via matrix-vector product queries (matvec queries) of the form ${x} \rightarrow {A}{x}$ or ${x} \rightarrow {A}^T{x}$. Of particular interest are methods that learn a near-optimal approximation with a fixed sparsity pattern. For example, we might want to learn a near-optimal diagonal, banded, or arrow-head approximation to an implicit matrix $A$. Naturally, the number of matvec queries required to solve this problem depends on the sparsity pattern, which can be encoded as a binary matrix ${S}\in \{0,1\}^{m\times n}$. The query complexity of previous algorithms scales with quantities like the total number of ones in ${S}$, its maximum column/row sparsity, or the chromatic number of a its "conflict graph". These quantities are incomparable: for a given ${S}$, parameterizing by one might yield lower query complexity than another. In this work, we unify and tighten these prior results by providing a nearly sharp characterization of the matvec query complexity of sparse matrix approximation. Generalizing a definition from graph algorithms, let the degeneracy, ${degen}({S})$, denote the smallest number $k$ so that, if we iteratively delete all rows and columns of ${S}$ with $\leq k$ ones, we are left with an empty matrix. We show that a near-optimal approximation to $A$ with sparsity pattern $S$ can be learned with $\tilde{O}({degen}({S}))$ matrix-vector product queries, and $Ω({degen}({S}))$ queries are necessary, for any sparsity pattern ${S}$. Moreover, unlike prior work based on graph coloring, all of our methods run in polynomial time.Authors: Malte Baumecker, Rustam Latypov, Yannic Maus, Jara Uitto
Given a graph $G=(V,E)$, a $β$-ruling set is a subset of nodes $S\subseteq V$ that is independent, and each node in $V$ is at distance at most $β$ from some node in $S$. In this paper, we present almost optimal distributed algorithms for finding $2$-ruling sets in the classical \LOCAL model. Our main contribution is a randomized algorithm that w.h.p.\ computes a $2$-ruling set on any $n$-node graph with bounded arboricity in $O(\log \log n)$ rounds. In fact, the algorithm works up to arboricity $O(\log\log n)$, improves exponentially over the prior state of the art that can be achieved by combining [Barenboim, Elkin, Pettie, Schneider; JACM'16], [Ghaffari; SODA'16], and [Bisht, Kothapalli and Pemmaraju; PODC'14], and nearly matches the lower bound of $Ω(\log \log n / \log \log \log n)$ [Balliu, Brandt, Kuhn, Olivetti; FOCS'20]. The domination parameter $β=2$ is optimal for algorithms with runtime $\log^{o(1)}n$: on graphs with arboricity $2$, there is a lower bound of $Ω(\sqrt{\log n})$ rounds for MIS (i.e., $β= 1$) [Khoury, Schild; FOCS'25]. Additionally, we obtain improved algorithms for larger arboricity. For general graphs with arboricity $α$, we present a randomized algorithm that computes a $2$-ruling set in $\widetilde{O}(\log^{5/8} α+\log^{5/3} \log n)$ rounds. This improves exponentially over the state of the art for a large range of non-constant arboricity. Our techniques extend beyond distributed computing. We present an $O(\log \log \log n)$-round algorithm in the low-space Massively Parallel Computation (\mpc) model that w.h.p.\ computes a $2$-ruling set on any graph with arboricity up to $2^{poly (\log \log n)}$, improving exponentially over the state of the art from [Kothapalli, Pai, Pemmaraju; FSTTCS'20] combined with [Fischer, Giliberti, Grunau; SPAA'23].Authors: Daniel Dadush, László A. Végh, Giacomo Zambelli
We consider linear programming in the oracle model: $\max\{c^\top x \,:\, x\in P\}$, where the polyhedron $P=\{x\in\mathbb{R}^n\,:\, Ax\le b\}$ is given by a separation oracle. We present an algorithm that finds exact primal and dual solutions using $O(n^2\log(n/δ))$ oracle calls and $O(n^4\log(n/δ)+n^5\log\log(1/δ))$ arithmetic operations, where $δ$ is a geometric condition number associated with the system $(A,b)$. These bounds do not depend on the cost vector $c$ and do not require a priori knowledge of $δ$. For rational data, $\log(1/δ)$ is polynomially bounded in the encoding size of $(A,b)$, thus providing a polynomial-time algorithm. The algorithm works in a black box manner, requiring a subroutine for approximate primal and dual solutions; the above running times are achieved when using the cutting plane method of Jiang, Lee, Song, and Wong (STOC 2020) for this subroutine. Whereas approximate solvers may return primal solutions only, we develop a general framework for extracting dual certificates based on the work of Burrell and Todd (Math. Oper. Res. 1985). Our algorithm strengthens results by Grötschel, Lovász, and Schrijver (Prog. Comb. Opt. 1984), and by Frank and Tardos (Combinatorica 1987) that rely on bit-complexity arguments. Our algorithm avoids rounding-based arguments such as simultaneous Diophantine approximation and uses geometric arguments instead.Authors: Rasmus Pagh, Sia Sejer
We consider the problem of privately releasing a $k$-dimensional vector under updates: Starting with a zero vector, at times $t_1, t_2,\dots$ the vector is updated by adding $x^{(1)}, x^{(2)},\dots$, respectively. For positive integers $T$, $k$ we model the updates as a data set $\{(t_i, x^{(i)})\}_i$, where $t_i \in [T]$ and $x^{(i)} \in B_k$ (the $k$-dimensional unit ball). Two such data sets are said to be neighboring if their symmetric difference has size at most $1$. The continual release consists of the sum $A^{(t)} = \sum_{i \; : \; t_i \leq t} x^{(i)}$ for each time step $t=1,\dots,T$. Classical continual release techniques allow us to release an approximation of $A^{(1)},\dots,A^{(T)}$ with additive noise of magnitude $\text{polylog}(T)$, computed in time $O(kT)$, even in the on-line, adaptive case where data is continually revealed for the current time step. Motivated by private sketching techniques, we consider the setting where only a \emph{subset} of entries in $A^{(t)}$ need to be released at time step $t$. Our new result is that it is possible to sample any desired entry in a given noise vector in \emph{constant time} while reproducing exactly the distribution of the binary tree mechanism with Gaussian noise. The improvement on the known time bound of $O(\log T)$ comes from a new data structure that allows us to sample a new noise value with the correct correlations in constant time using Brownian bridges. We present two data management applications, of independent interest, that use our technique in conjunction with differentially private CountSketches: 1) A dynamic data structure for orthogonal range counting queries with a better privacy/accuracy/space trade-off than previous data structures, and 2) Join size estimation, where in addition we show improved high-probability bounds.Authors: Tait Weicht, Alexander S. Wein
Multireference alignment (MRA) is the task of recovering a hidden "signal" vector, given many noisy copies that have been cyclically shifted by unknown offsets. This task belongs to the class of orbit recovery problems, in which the observed samples are affected by some group action. These problems have a variety of practical motivations, including the reconstruction of 3-dimensional molecular structure from cryogenic electron microscopy (cryo-EM) images. We consider two variants of MRA: dihedral MRA, where the cyclic group is replaced by the dihedral group, allowing for reversals of the vector in addition to shifts; and projected MRA, where the observations are passed through a projection operator akin to the tomographic projection present in cryo-EM. We apply the method of moments and aim to recover the signal from the third moment tensor of the samples. This inverse problem is well understood for basic MRA, but for the variants we consider there is no polynomial-time algorithm known to succeed for generic signals. We give the first such algorithm for both of these variants. Our method requires the signal length to be a power of two, and recursively subdivides the problem into smaller problems of half the size. The algorithm's success for generic signals is proven, conditional on a conjecture about the rank of a certain symbolic matrix of polynomials. For any given problem size, this conjecture can be verified on a computer.Authors: Spencer Compton, Jerry Li
We study the task of density estimation, where we hope to accurately estimate a probability density from $n$ samples. A textbook method for density estimation in total variation distance is the minimum-distance estimator approach, where we conclude both the algorithm and the analysis merely from bounding the VC dimension of a particular concept class (the so-called Yatracos class). While this technique has originally yielded sharp guarantees primarily for total variation distance, in this work we extend the minimum-distance estimator approach for learning within Hellinger distance. Our main observation is that we may produce an analogous recipe for Hellinger (where we only require bounding the VC dimension of a related concept class) by drawing connections to recent results yielding reverse data processing inequalities. This recipe is flexible enough to accommodate fast algorithms originally designed for total variation distance; by modifying the approach of Acharya et al. (2017) we conclude the first near-linear time algorithm for learning classes including univariate mixtures of log-concave densities and mixtures of Gaussians (with arbitrary variances), with near-optimal sample complexity.Authors: Noah Golowich, Ankur Moitra, Dhruv Rohatgi
Efficiently sampling from a complex probability distribution is a fundamental problem which has become increasingly pertinent in recent years with the rise of generative AI, as sophisticated sampling procedures from LLMs have been proposed to solve challenging reasoning problems. The efficacy of such sampling algorithms is limited, however, by the relationship between the LLM and the particular sampling task at hand, which has motivated the framework of test-time training (TTT). TTT works by updating a model's weights in response to partial generations and reward feedback received at inference time, thus adapting to the particular problem. In this work, we propose a formalization for TTT as the problem of producing a sample from a given probability measure $μ^\star$ belonging to a known class ${F}$ of distributions, given an oracle $\hat μ$ which yields approximate density estimates for $μ^\star$. This is closely related to the problem of reducing sampling to approximate counting studied in seminal works of Jerrum, Valiant & Vazirani (1986) and Jerrum & Sinclair (1989): namely, when ${F}$ is the class of all distributions, it coincides exactly with the aforementioned counting-to-sampling reduction. In this paper, we first show a quadratic lower bound on the query complexity of sampling from $μ^\star$ given query access to $\hat μ$ (for sufficiently large classes ${F}$), thus showing that the random walk approach proposed by Jerrum & Sinclair (1989) and refined by Hayes & Sinclair (2010), is optimal. This answers an open question posed by Hayes & Sinclair. We then show that this lower bound can be circumvented if the size of ${F}$ is bounded appropriately. As we discuss, this latter result can be viewed as an abstraction of TTT, and thus represents a starting point for the development of a principled theoretical framework for TTT.There are two ways to look at the P v NP problem, as a formal mathematically defined conjecture as a Clay Millennium Prize Problem, and as the more intuitive notion that everything efficiently verifiable is efficiently computable and the implications that has on our ability to compute.
I've written considerably about how artificial intelligence has affected the latter. In particular, how AI and other advances in computing have brought us to this Optiland of getting most of the good implications of P=NP while our cryptographic codes remain unbreakable.
But now with the recent advances in AI-created and assisted proofs, will AI change what we know about the formal mathematical statement? Is an AI-generated proof of P ≠ NP around the corner?
No, it isn't. I do not believe we will see a P v NP proof in my lifetime proven by man or machine, separately or working together.
While the disproof of the Erdős unit distance problem is an impressive AI achievement, keep in mind that for every AI math proof there are hundreds of problems that we have tried to solve with AI where we haven't seen progress. And there is a huge chasm between Erdős combinatorial conjectures and the Clay Millennium problems. AI will continue to improve, but there are limits.
People, particularly those outside of computational complexity, don't realize how difficult a mathematical challenge this is. Polynomial-time algorithms can work in strange and mysterious ways. They don't have to respect the semantics of an NP-search problem or do any searching at all. Bill gave me the following "algorithm" for clique: Take the eigenvalues of the adjacency matrix. For all we know, if there are two primes p and q such that the pth eigenvalue and the qth eigenvalue differ by more than 1/k then the graph has a k-clique. Of course this doesn't work. But to prove P ≠ NP, you need to prove not only that this algorithm doesn't work, but neither do any of the infinitely other potential algorithms for solving NP-complete problems.
We simply know of no way to manage general polynomial-time algorithms other than by simulating them. We know by relativization that simulation and diagonalization will not work to settle P v NP. Other attempts to understand polynomial time, like circuit complexity, proof complexity and algebraic geometry have gotten bogged down well below the full power of polynomial-time. At this time we don't even have a viable approach to settling the P v NP problem.
Don't waste your time trying a formal approach via Lean. (I'm looking at you Dmitry Khanukov) Computational complexity is very messy to formulate technically. I can't get an AI willing to give me a full Lean-verified proof of something trivial like P closed under complement, forget the PCP theorem. If someone or something does come up with a P ≠ NP, it'll be following the right intuitive approach, not a formalistic one.
At least start with something simpler, like showing BPP is in subexponential time, or SAT doesn't have quadratic algorithms. You won't succeed there either, even though these questions should be galactically simpler than P ≠ NP.
By Lance Fortnow
There are two ways to look at the P v NP problem, as a formal mathematically defined conjecture as a Clay Millennium Prize Problem, and as the more intuitive notion that everything efficiently verifiable is efficiently computable and the implications that has on our ability to compute.
I've written considerably about how artificial intelligence has affected the latter. In particular, how AI and other advances in computing have brought us to this Optiland of getting most of the good implications of P=NP while our cryptographic codes remain unbreakable.
But now with the recent advances in AI-created and assisted proofs, will AI change what we know about the formal mathematical statement? Is an AI-generated proof of P ≠ NP around the corner?
No, it isn't. I do not believe we will see a P v NP proof in my lifetime proven by man or machine, separately or working together.
While the disproof of the Erdős unit distance problem is an impressive AI achievement, keep in mind that for every AI math proof there are hundreds of problems that we have tried to solve with AI where we haven't seen progress. And there is a huge chasm between Erdős combinatorial conjectures and the Clay Millennium problems. AI will continue to improve, but there are limits.
People, particularly those outside of computational complexity, don't realize how difficult a mathematical challenge this is. Polynomial-time algorithms can work in strange and mysterious ways. They don't have to respect the semantics of an NP-search problem or do any searching at all. Bill gave me the following "algorithm" for clique: Take the eigenvalues of the adjacency matrix. For all we know, if there are two primes p and q such that the pth eigenvalue and the qth eigenvalue differ by more than 1/k then the graph has a k-clique. Of course this doesn't work. But to prove P ≠ NP, you need to prove not only that this algorithm doesn't work, but neither do any of the infinitely other potential algorithms for solving NP-complete problems.
We simply know of no way to manage general polynomial-time algorithms other than by simulating them. We know by relativization that simulation and diagonalization will not work to settle P v NP. Other attempts to understand polynomial time, like circuit complexity, proof complexity and algebraic geometry have gotten bogged down well below the full power of polynomial-time. At this time we don't even have a viable approach to settling the P v NP problem.
Don't waste your time trying a formal approach via Lean. (I'm looking at you Dmitry Khanukov) Computational complexity is very messy to formulate technically. I can't get an AI willing to give me a full Lean-verified proof of something trivial like P closed under complement, forget the PCP theorem. If someone or something does come up with a P ≠ NP, it'll be following the right intuitive approach, not a formalistic one.
At least start with something simpler, like showing BPP is in subexponential time, or SAT doesn't have quadratic algorithms. You won't succeed there either, even though these questions should be galactically simpler than P ≠ NP.
Authors: Karthik Sheshadri
The symmetric determinantal complexity sdc(f) of a polynomial f is the least m such that f = det(M) for an m x m symmetric matrix M of affine-linear forms. We prove, over the complex numbers, that sdc(sum_{i=1}^n x_i^n) >= (1/(2e) - o(1)) n^2. This is a symmetric companion to the author's non-symmetric polar-degree preprint (arXiv:7680505); the method parallels that work, but the proof here is self-contained and redoes the load-bearing local incidence analysis in the symmetric setting. The general theorem: if X = V(f) in P^{N-1} is a smooth degree-d hypersurface, N >= 3, and f = det(A_0 + sum x_i A_i) with all A_i symmetric of size m, then the top polar degree d(d-1)^{N-2} is at most 2^{N-2} C(m, N-1). The proof uses the symmetric rank-one kernel incidence M(z,x) u = 0. At a genuine polar point M has rank m-1, and a symmetric Schur-complement normal form eliminates the unique kernel line scheme-theoretically; on the resulting local graph the lifted conormal forms u^T A_i u are a common unit multiple of the partials d_i f, so the lifted polar equations cut the ordinary polar slice up to units and each genuine lifted polar point is a zero-dimensional isolated solution. Multihomogeneous Bezout on P^N x P^{m-1} then yields the bound 2^{N-2} C(m, N-1). For F_n = sum x_i^n this gives the constant 1/(2e). More generally, for F_{N,d} = sum_{i=1}^N x_i^d the same theorem gives sdc(F_{N,d}) >= (1/(2e) - o_N(1)) N(d-1) as N -> infinity. We give an explicit symmetric representation of F_{N,d} of size 2N(d+1)+1, so the diagonal bounds are non-vacuous and tight up to a constant. The result is for exact symmetric determinantal complexity in characteristic zero; it is not a border statement and not a uniform positive-characteristic theorem.Authors: Chiara Piombi
The last decade of research on the LOCAL model has seen tremendous progress in understanding locally checkable labeling (LCL) problems, culminating in an almost complete classification of the possible complexities LCL problems can exhibit. In particular, on undirected trees, Chang and Pettie showed that there is no LCL problem with complexity between $ω(\log n)$ and $n^{o(1)}$ and Chang showed that, for every positive integer $k$, there is no LCL problem with complexity between $ω(n^{1/(k+1)})$ and $o(n^{1/k})$; additionally, which side of each gap a problem is found on is decidable. While the class of LCL problems - which, roughly speaking, consists of problems for which the correctness of a solution can be described by a finite set of allowed node configurations, which in turn can be locally verified by a constant-time algorithm - includes many important problems, it has one major restriction: problems can be defined only on bounded degree graphs, which consequently restricts all the classification and gap results mentioned above. In this work, we propose a generalization of LCL problems to unbounded degree using Presburger monadic second-order (PMSO) formulas; more specifically, we consider what we call Local PMSO (LPMSO) problems, i.e., problems whose correct solutions are both finitely described by a PMSO formula and locally verifiable by a LOCAL algorithm in constant time - this class contains many of the important problems studied in the LOCAL model but defines them on unbounded degree graphs. As our main result we prove that, on unbounded degree rooted trees, the aforementioned $ω(\log n)$ - $n^{o(1)}$ and $ω(n^{1/(k+1)})$ - $o(n^{1/k})$ complexity gaps (and their decidability) extend to the class of LPMSO problems.Authors: Pavel Pudlák, Neil Thapen
In the implicit version of a propositional proof system Q, we work with Q-proofs that are not written down directly, but are succinctly encoded by circuits. Thus implicit Q-proofs are potentially exponentially shorter than usual Q-proofs. We study narrow implicit proofs, a restricted version of this notion, in which lines in the encoded proof can only have polynomial size. We use a cut-elimination construction to show that G_{i+1} is equivalent to narrow implicit G_i, for i >= 1, where G_i is the extension of Frege allowing reasoning with Sigma^q_i quantified propositional formulas. We show that G_1 is equivalent to implicit resolution.