Last Update

OPML feed of all feeds.

Subscribe to the Atom feed, RSS feed to stay up to date.

Thank you to arXiv for use of its open access interoperability.

Note: the date of arXiv entries announced right after publication holidays might incorrectly show up as the date of the publication holiday itself. This is due to our ad hoc method of inferring announcement dates, which are not returned by the arXiv API.

Powered by Pluto.

Source on GitHub.

Maintained by Nima Anari, Arnab Bhattacharyya, Gautam Kamath.

Theory of Computing Report

Sunday, December 28

TR25-221 | A Meta-Complexity Characterization of Minimal Quantum Cryptography | Bruno P. Cavalar, Boyang Chen, Andrea Coladangelo, Matthew Gray, Zihan Hu, Zhengfeng Ji, Xingjian Li

from ECCC Papers

We give a meta-complexity characterization of EFI pairs, which are considered the ''minimal'' primitive in quantum cryptography (and are equivalent to quantum commitments). More precisely, we show that the existence of EFI pairs is equivalent to the following: there exists a non-uniformly samplable distribution over pure states such that the problem of estimating a certain Kolmogorov-like complexity measure is hard given a single copy. A key technical step in our proof, which may be of independent interest, is to show that the existence of EFI pairs is equivalent to the existence of non-uniform single-copy secure pseudorandom state generators (nu 1-PRS). As a corollary, we get an alternative, arguably simpler, construction of a universal EFI pair.

We give a meta-complexity characterization of EFI pairs, which are considered the ''minimal'' primitive in quantum cryptography (and are equivalent to quantum commitments). More precisely, we show that the existence of EFI pairs is equivalent to the following: there exists a non-uniformly samplable distribution over pure states such that the problem of estimating a certain Kolmogorov-like complexity measure is hard given a single copy. A key technical step in our proof, which may be of independent interest, is to show that the existence of EFI pairs is equivalent to the existence of non-uniform single-copy secure pseudorandom state generators (nu 1-PRS). As a corollary, we get an alternative, arguably simpler, construction of a universal EFI pair.

TR25-220 | A Note on Avoid vs MCSP | Edward Hirsch, Ilya Volkovich

from ECCC Papers

A recent result of Ghentiyala, Li, and Stephens-Davidowitz (ECCC TR 25-210) shows that any language reducible to the Range Avoidance Problem (Avoid) via deterministic or randomized Turing reductions is contained in AM $\cap$ coAM. In this note, we present a different potential avenue for obtaining the same result via the Minimal Circuit Size Problem (MCSP).

A recent result of Ghentiyala, Li, and Stephens-Davidowitz (ECCC TR 25-210) shows that any language reducible to the Range Avoidance Problem (Avoid) via deterministic or randomized Turing reductions is contained in AM $\cap$ coAM. In this note, we present a different potential avenue for obtaining the same result via the Minimal Circuit Size Problem (MCSP).

Friday, December 26

Israel AGT day in Weizmann — 29/1/2026

from Turing's Invisible Hand

The next Israeli Algorithmic Game Theory day will be held in the Weizmann Institute on 29/1/2026. Registration is free but required.

The next Israeli Algorithmic Game Theory day will be held in the Weizmann Institute on 29/1/2026. Registration is free but required.

By Noam Nisan

Combinatorial Morning in Tel Aviv, Sunday 28/12/2025

from Gil Kalai

Coming very soon! Organizer:      Michael Krivelevich Place:  Schreiber 309, Tel Aviv University Event’s site.  sites.google.com/view/combinatorics-seminar-tel-aviv Program 09:30-10:00        Asaf Ferber (UC Irvine)  Quantum algorithms on graphs 10:00-10:30        Gal Kronenberg (Oxford U.)  2-factors in … Continue reading →

Coming very soon!

Organizer:      Michael Krivelevich

Place:  Schreiber 309, Tel Aviv University

Event’s site.  https://sites.google.com/view/combinatorics-seminar-tel-aviv

Program

By Gil Kalai

Thursday, December 25

My Christmas gift: telling you about PurpleMind, which brings CS theory to the YouTube masses

from Scott Aaronson

Merry Christmas, everyone! Ho3! Here’s my beloved daughter baking chocolate chip cookies, which she’ll deliver tomorrow morning with our synagogue to firemen, EMTs, and others who need to work on Christmas Day. My role was limited to taste-testing. While (I hope you’re sitting down for this) the Aaronson-Moshkovitzes are more of a latke/dreidel family, I […]

Merry Christmas, everyone! Ho3!

Here’s my beloved daughter baking chocolate chip cookies, which she’ll deliver tomorrow morning with our synagogue to firemen, EMTs, and others who need to work on Christmas Day. My role was limited to taste-testing.

While (I hope you’re sitting down for this) the Aaronson-Moshkovitzes are more of a latke/dreidel family, I grew up surrounded by Christmas and am a lifelong enjoyer of the decorations, the songs and movies (well, some of them), the message of universal goodwill, and even gingerbread and fruitcake.


Therefore, as a Christmas gift to my readers, I hereby present what I now regard as one of the great serendipitous “discoveries” in my career, alongside students like Paul Christiano and Ewin Tang who later became superstars.

Ever since I was a pimply teen, I dreamed of becoming the prophet who’d finally bring the glories of theoretical computer science to the masses—who’d do for that systematically under-sung field what Martin Gardner did for math, Carl Sagan for astronomy, Richard Dawkins for evolutionary biology, Douglas Hofstadter for consciousness and Gödel. Now, with my life half over, I’ve done … well, some in that direction, but vastly less than I’d dreamed.

A month ago, I learned that maybe I can rest easier. For a young man named Aaron Gostein is doing the work I wish I’d done—and he’s doing it using tools I don’t have, and so brilliantly that I could barely improve a pixel.

Aaron recently graduated from Carnegie Mellon, majoring in CS. He’s now moved back to Austin, TX, where he grew up, and where of course I now live as well. (Before anyone confuses our names: mine is Scott Aaronson, even though I’ve gotten hundreds of emails over the years calling me “Aaron.”)

Anyway, here in Austin, Aaron is producing a YouTube channel called PurpleMind. In starting this channel, Aaron was directly inspired by Grant Sanderson’s 3Blue1Brown—a math YouTube channel that I’ve also praised to the skies on this blog—but Aaron has chosen to focus on theoretical computer science.

I first encountered Aaron a month ago, when he emailed asking to interview me about … which topic will it be this time, quantum computing and Bitcoin? quantum computing and AI? AI and watermarking? no, diagonalization as a unifying idea in mathematical logic. That got my attention.

So Aaron came to my office and we talked for 45 minutes. I didn’t expect much to come of it, but then Aaron quickly put out this video, in which I have a few unimportant cameos:

After I watched this, I brought Dana and the kids and even my parents to watch it too. The kids, whose attention spans normally leave much to be desired, were sufficiently engaged that they made me pause every 15 seconds to ask questions (“what would go wrong if you diagonalized a list of all whole numbers, where we know there are only ℵ0 of them?” “aren’t there other strategies that would work just as well as going down the diagonal?”).

Seeing this, I sat the kids down to watch more PurpleMind. Here’s the video on the P versus NP problem:

Here’s one on the famous Karatsuba algorithm, which reduced the number of steps needed to multiply two n-digit numbers from ~n2 to only ~n1.585, and thereby helped inaugurate the entire field of algorithms:

Here’s one on RSA encryption:

Here’s one on how computers quickly generate the huge random prime numbers that RSA and other modern encryption methods need:

These are the only ones we’ve watched so far. Each one strikes me as close to perfection. There are many others (for example, on Diffie-Hellman encryption, the Bernstein-Vazirani quantum algorithm, and calculating pi) that I’m guessing will be equally superb.

In my view, what makes these videos so good is their concreteness, achieved without loss of correctness. When, for example, Aaron talks about Gödel mailing a letter to the dying von Neumann posing what we now know as P vs. NP, or any other historical event, he always shows you an animated reconstruction. When he talks about an algorithm, he always shows you his own Python code, and what happened when he ran the code, and then he invites you to experiment with it too.

I might even say that the results singlehandedly justify the existence of YouTube, as the ten righteous men would’ve saved Sodom—with every crystal-clear animation of a CS concept canceling out a thousand unboxing videos or screamingly-narrated Minecraft play-throughs in the eyes of God.

Strangely, the comments below Aaron’s YouTube videos attack him relentlessly for his use of AI to help generate the animations. To me, it seems clear that AI is the only thing that could let one person, with no production budget to speak of, create animations of this quality and quantity. If people want so badly for the artwork to be 100% human-generated, let them volunteer to create it themselves.


Even as I admire the PurpleMind videos, or the 3Blue1Brown videos before them, a small part of me feels melancholic. From now until death, I expect that I’ll have only the same pedagogical tools that I acquired as a young’un: talking; waving my arms around; quizzing the audience; opening the floor to Q&A; cracking jokes; drawing crude diagrams on a blackboard or whiteboard until the chalk or the markers give out; typing English or LaTeX; the occasional PowerPoint graphic that might (if I’m feeling ambitious) fade in and out or fly across the screen.

Today there are vastly better tools, both human and AI, that make it feasible to create spectacular animations for each and every mathematical concept, as if transferring the imagery directly from mind to mind. In the hands of a master explainer like Grant Sanderson or Aaron Gostein, these tools are tractors to my ox-drawn plow. I’ll be unable to compete in the long term.

But then I reflect that at least I can help this new generation of math and CS popularizers, by continuing to feed them raw material. I can do cameos in their YouTube productions. Or if nothing else, I can bring their jewels to my community’s attention, as I’m doing right now.

Peace on Earth, and to all a good night.

By Scott

Adjusted Kolmogorov Complexity of Binary Words with Empirical Entropy Normalization

from arXiv: Computational Complexity

Authors: Brani Vidakovic

Kolmogorov complexity of a finite binary word reflects both algorithmic structure and the empirical distribution of symbols appearing in the word. Words with symbol frequencies far from one half have smaller combinatorial richness and therefore appear less complex under the standard definition. In this paper an entropy-normalized complexity measure is introduced that divides the Kolmogorov complexity of a word by the empirical entropy of its observed distribution of zeros and ones. This adjustment isolates intrinsic descriptive complexity from the purely combinatorial effect of symbol imbalance. For Martin Löf random sequences under constructive exchangeable measures, the adjusted complexity grows linearly and converges to one. A pathological construction shows that regularity of the underlying measure is essential. The proposed framework connects Kolmogorov complexity, empirical entropy, and randomness in a natural manner and suggests applications in randomness testing and in the analysis of structured binary data.

Authors: Brani Vidakovic

Kolmogorov complexity of a finite binary word reflects both algorithmic structure and the empirical distribution of symbols appearing in the word. Words with symbol frequencies far from one half have smaller combinatorial richness and therefore appear less complex under the standard definition. In this paper an entropy-normalized complexity measure is introduced that divides the Kolmogorov complexity of a word by the empirical entropy of its observed distribution of zeros and ones. This adjustment isolates intrinsic descriptive complexity from the purely combinatorial effect of symbol imbalance. For Martin Löf random sequences under constructive exchangeable measures, the adjusted complexity grows linearly and converges to one. A pathological construction shows that regularity of the underlying measure is essential. The proposed framework connects Kolmogorov complexity, empirical entropy, and randomness in a natural manner and suggests applications in randomness testing and in the analysis of structured binary data.

Stochastic well-structured transition systems

from arXiv: Computational Complexity

Authors: James Aspnes

Extending well-structured transition systems to incorporate a probabilistic scheduling rule, we define a new class of stochastic well-structured transition systems that includes population protocols, chemical reaction networks, and many common gossip models; as well as augmentations of these systems by an oracle that exposes a total order on agents as in population protocols in the comparison model or an equivalence relation as in population protocols with unordered data. We show that any implementation of a phase clock in these systems either stops or ticks too fast after polynomially many expected steps, and that any terminating computation in these systems finishes or fails in expected polynomial time. This latter property allows an exact characterization of the computational power of many stochastic well-structured transition systems augmented with a total order or equivalence relation on agents, showing that these compute exactly the languages in BPP, while the corresponding unaugmented systems compute just the symmetric languages in BPL.

Authors: James Aspnes

Extending well-structured transition systems to incorporate a probabilistic scheduling rule, we define a new class of stochastic well-structured transition systems that includes population protocols, chemical reaction networks, and many common gossip models; as well as augmentations of these systems by an oracle that exposes a total order on agents as in population protocols in the comparison model or an equivalence relation as in population protocols with unordered data. We show that any implementation of a phase clock in these systems either stops or ticks too fast after polynomially many expected steps, and that any terminating computation in these systems finishes or fails in expected polynomial time. This latter property allows an exact characterization of the computational power of many stochastic well-structured transition systems augmented with a total order or equivalence relation on agents, showing that these compute exactly the languages in BPP, while the corresponding unaugmented systems compute just the symmetric languages in BPL.

Shifted Partial Derivative Polynomial Rank and Codimension

from arXiv: Computational Complexity

Authors: Darren J. Edwards

Shifted partial derivative (SPD) methods are a central algebraic tool for circuit lower bounds, measuring the dimension of spaces of shifted derivatives of a polynomial. We develop the Shifted Partial Derivative Polynomial (SPDP) framework, packaging SPD into an explicit coefficient-matrix formalism. This turns shifted-derivative spans into concrete linear-algebraic objects and yields two dual measures: SPDP rank and SPDP codimension. We define the SPDP generating family, its span, and the SPDP matrix M_{k,l}(p) inside a fixed ambient coefficient space determined by the (k,l) regime, so rank is canonical and codimension is a deficit from ambient fullness. We prove structural properties needed for reuse: monotonicity in the shift/derivative parameters (with careful scoping for |S|=k versus |S|<=k conventions), invariance under admissible variable symmetries and basis changes, and robustness across standard Boolean/multilinear embeddings. We then give generic width-to-rank upper-bound templates for local circuit models via profile counting, separating the model-agnostic SPDP toolkit from additional compiled refinements used elsewhere. We illustrate the codimension viewpoint on representative examples.

Authors: Darren J. Edwards

Shifted partial derivative (SPD) methods are a central algebraic tool for circuit lower bounds, measuring the dimension of spaces of shifted derivatives of a polynomial. We develop the Shifted Partial Derivative Polynomial (SPDP) framework, packaging SPD into an explicit coefficient-matrix formalism. This turns shifted-derivative spans into concrete linear-algebraic objects and yields two dual measures: SPDP rank and SPDP codimension. We define the SPDP generating family, its span, and the SPDP matrix M_{k,l}(p) inside a fixed ambient coefficient space determined by the (k,l) regime, so rank is canonical and codimension is a deficit from ambient fullness. We prove structural properties needed for reuse: monotonicity in the shift/derivative parameters (with careful scoping for |S|=k versus |S|<=k conventions), invariance under admissible variable symmetries and basis changes, and robustness across standard Boolean/multilinear embeddings. We then give generic width-to-rank upper-bound templates for local circuit models via profile counting, separating the model-agnostic SPDP toolkit from additional compiled refinements used elsewhere. We illustrate the codimension viewpoint on representative examples.

Uncovering Hierarchical Structure in LLM Embeddings with $δ$-Hyperbolicity, Ultrametricity, and Neighbor Joining

from arXiv: Computational Geometry

Authors: Prakash Chourasia, Sarwan Ali, Murray Patterson

The rapid advancement of large language models (LLMs) has enabled significant strides in various fields. This paper introduces a novel approach to evaluate the effectiveness of LLM embeddings in the context of inherent geometric properties. We investigate the structural properties of these embeddings through three complementary metrics $δ$-hyperbolicity, Ultrametricity, and Neighbor Joining. $δ$-hyperbolicity, a measure derived from geometric group theory, quantifies how much a metric space deviates from being a tree-like structure. In contrast, ultrametricity characterizes strictly hierarchical structures where distances obey a strong triangle inequality. While Neighbor Joining quantifies how tree-like the distance relationships are, it does so specifically with respect to the tree reconstructed by the Neighbor Joining algorithm. By analyzing the embeddings generated by LLMs using these metrics, we uncover to what extent the embedding space reflects an underlying hierarchical or tree-like organization. Our findings reveal that LLM embeddings exhibit varying degrees of hyperbolicity and ultrametricity, which correlate with their performance in the underlying machine learning tasks.

Authors: Prakash Chourasia, Sarwan Ali, Murray Patterson

The rapid advancement of large language models (LLMs) has enabled significant strides in various fields. This paper introduces a novel approach to evaluate the effectiveness of LLM embeddings in the context of inherent geometric properties. We investigate the structural properties of these embeddings through three complementary metrics $δ$-hyperbolicity, Ultrametricity, and Neighbor Joining. $δ$-hyperbolicity, a measure derived from geometric group theory, quantifies how much a metric space deviates from being a tree-like structure. In contrast, ultrametricity characterizes strictly hierarchical structures where distances obey a strong triangle inequality. While Neighbor Joining quantifies how tree-like the distance relationships are, it does so specifically with respect to the tree reconstructed by the Neighbor Joining algorithm. By analyzing the embeddings generated by LLMs using these metrics, we uncover to what extent the embedding space reflects an underlying hierarchical or tree-like organization. Our findings reveal that LLM embeddings exhibit varying degrees of hyperbolicity and ultrametricity, which correlate with their performance in the underlying machine learning tasks.

An Allele-Centric Pan-Graph-Matrix Representation for Scalable Pangenome Analysis

from arXiv: Data Structures and Algorithms

Authors: Roberto Garrone

Population-scale pangenome analysis increasingly requires representations that unify single-nucleotide and structural variation while remaining scalable across large cohorts. Existing formats are typically sequence-centric, path-centric, or sample-centric, and often obscure population structure or fail to exploit carrier sparsity. We introduce the H1 pan-graph-matrix, an allele-centric representation that encodes exact haplotype membership using adaptive per-allele compression. By treating alleles as first-class objects and selecting optimal encodings based on carrier distribution, H1 achieves near-optimal storage across both common and rare variants. We further introduce H2, a path-centric dual representation derived from the same underlying allele-haplotype incidence information that restores explicit haplotype ordering while remaining exactly equivalent in information content. Using real human genome data, we show that this representation yields substantial compression gains, particularly for structural variants, while remaining equivalent in information content to pangenome graphs. H1 provides a unified, population-aware foundation for scalable pangenome analysis and downstream applications such as rare-variant interpretation and drug discovery.

Authors: Roberto Garrone

Population-scale pangenome analysis increasingly requires representations that unify single-nucleotide and structural variation while remaining scalable across large cohorts. Existing formats are typically sequence-centric, path-centric, or sample-centric, and often obscure population structure or fail to exploit carrier sparsity. We introduce the H1 pan-graph-matrix, an allele-centric representation that encodes exact haplotype membership using adaptive per-allele compression. By treating alleles as first-class objects and selecting optimal encodings based on carrier distribution, H1 achieves near-optimal storage across both common and rare variants. We further introduce H2, a path-centric dual representation derived from the same underlying allele-haplotype incidence information that restores explicit haplotype ordering while remaining exactly equivalent in information content. Using real human genome data, we show that this representation yields substantial compression gains, particularly for structural variants, while remaining equivalent in information content to pangenome graphs. H1 provides a unified, population-aware foundation for scalable pangenome analysis and downstream applications such as rare-variant interpretation and drug discovery.

An O($nlogn$) approximate knapsack algorithm

from arXiv: Data Structures and Algorithms

Authors: Nick Dawes

A modified dynamic programming algorithm rapidly and accurately solves large 0/1 knapsack problems. It has computational O($nlogn$), space O($nlogn$) and predictable maximum error. Experimentally it's accuracy increases faster than linearly with the solution size $k$. Problems with $k=10^3$ are solved with an average maximum fractional error of $10^{-4}$ and problems with $k=10^5$ with an average maximum fractional error of $10^{-7}$. The algorithm runs in constant time for all problems with a given $n$. On a common desktop computer the algorithm processes $n=10^3$ problems in $10^{-3}$ seconds and $n=10^6$ problems in 2 seconds.

Authors: Nick Dawes

A modified dynamic programming algorithm rapidly and accurately solves large 0/1 knapsack problems. It has computational O($nlogn$), space O($nlogn$) and predictable maximum error. Experimentally it's accuracy increases faster than linearly with the solution size $k$. Problems with $k=10^3$ are solved with an average maximum fractional error of $10^{-4}$ and problems with $k=10^5$ with an average maximum fractional error of $10^{-7}$. The algorithm runs in constant time for all problems with a given $n$. On a common desktop computer the algorithm processes $n=10^3$ problems in $10^{-3}$ seconds and $n=10^6$ problems in 2 seconds.

Approximation Schemes for Planar Graph Connectivity Problems

from arXiv: Data Structures and Algorithms

Authors: Meike Neuwohner, Vera Traub, Rico Zenklusen

Finding a smallest subgraph that is k-edge-connected, or augmenting a k-edge-connected graph with a smallest subset of given candidate edges to become (k+1)-edge-connected, are among the most fundamental Network Design problems. They are both APX-hard in general graphs. However, this hardness does not carry over to the planar setting, which is not well understood, except for very small values of k. One main obstacle in using standard decomposition techniques for planar graphs, like Baker's technique and extensions thereof, is that connectivity requirements are global (rather than local) properties that are not captured by existing frameworks. We present a novel, and arguably clean, decomposition technique for such classical connectivity problems on planar graphs. This technique immediately implies PTASs for the problems of finding a smallest k-edge-connected or k-vertex-connected spanning subgraph of a planar graph for arbitrary k. By leveraging structural results for minimally k-edge-connected graphs, we further obtain a PTAS for planar k-connectivity augmentation for any constant k. We complement this with an NP-hardness result, showing that our results are essentially optimal.

Authors: Meike Neuwohner, Vera Traub, Rico Zenklusen

Finding a smallest subgraph that is k-edge-connected, or augmenting a k-edge-connected graph with a smallest subset of given candidate edges to become (k+1)-edge-connected, are among the most fundamental Network Design problems. They are both APX-hard in general graphs. However, this hardness does not carry over to the planar setting, which is not well understood, except for very small values of k. One main obstacle in using standard decomposition techniques for planar graphs, like Baker's technique and extensions thereof, is that connectivity requirements are global (rather than local) properties that are not captured by existing frameworks. We present a novel, and arguably clean, decomposition technique for such classical connectivity problems on planar graphs. This technique immediately implies PTASs for the problems of finding a smallest k-edge-connected or k-vertex-connected spanning subgraph of a planar graph for arbitrary k. By leveraging structural results for minimally k-edge-connected graphs, we further obtain a PTAS for planar k-connectivity augmentation for any constant k. We complement this with an NP-hardness result, showing that our results are essentially optimal.

ESCHER: Efficient and Scalable Hypergraph Evolution Representation with Application to Triad Counting

from arXiv: Data Structures and Algorithms

Authors: S. M. Shovan, Arindam Khanda, Sanjukta Bhowmick, Sajal K. Das

Higher-order interactions beyond pairwise relationships in large complex networks are often modeled as hypergraphs. Analyzing hypergraph properties such as triad counts is essential, as hypergraphs can reveal intricate group interaction patterns that conventional graphs fail to capture. In real-world scenarios, these networks are often large and dynamic, introducing significant computational challenges. Due to the absence of specialized software packages and data structures, the analysis of large dynamic hypergraphs remains largely unexplored. Motivated by this gap, we propose ESCHER, a GPU-centric parallel data structure for Efficient and Scalable Hypergraph Evolution Representation, designed to manage large scale hypergraph dynamics efficiently. We also design a hypergraph triad-count update framework that minimizes redundant computation while fully leveraging the capabilities of ESCHER for dynamic operations. We validate the efficacy of our approach across multiple categories of hypergraph triad counting, including hyperedge-based, incident-vertex-based, and temporal triads. Empirical results on both large real-world and synthetic datasets demonstrate that our proposed method outperforms existing state-of-the-art methods, achieving speedups of up to 104.5x, 473.7x, and 112.5x for hyperedge-based, incident-vertex-based, and temporal triad types, respectively.

Authors: S. M. Shovan, Arindam Khanda, Sanjukta Bhowmick, Sajal K. Das

Higher-order interactions beyond pairwise relationships in large complex networks are often modeled as hypergraphs. Analyzing hypergraph properties such as triad counts is essential, as hypergraphs can reveal intricate group interaction patterns that conventional graphs fail to capture. In real-world scenarios, these networks are often large and dynamic, introducing significant computational challenges. Due to the absence of specialized software packages and data structures, the analysis of large dynamic hypergraphs remains largely unexplored. Motivated by this gap, we propose ESCHER, a GPU-centric parallel data structure for Efficient and Scalable Hypergraph Evolution Representation, designed to manage large scale hypergraph dynamics efficiently. We also design a hypergraph triad-count update framework that minimizes redundant computation while fully leveraging the capabilities of ESCHER for dynamic operations. We validate the efficacy of our approach across multiple categories of hypergraph triad counting, including hyperedge-based, incident-vertex-based, and temporal triads. Empirical results on both large real-world and synthetic datasets demonstrate that our proposed method outperforms existing state-of-the-art methods, achieving speedups of up to 104.5x, 473.7x, and 112.5x for hyperedge-based, incident-vertex-based, and temporal triad types, respectively.

Time-Bucketed Balance Records: Bounded-Storage Ephemeral Tokens for Resource-Constrained Systems

from arXiv: Data Structures and Algorithms

Authors: Shaun Scovil, Bhargav Chickmagalur Nanjundappa

Fungible tokens with time-to-live (TTL) semantics require tracking individual expiration times for each deposited unit. A naive implementation creates a new balance record per deposit, leading to unbounded storage growth and vulnerability to denial-of-service attacks. We present time-bucketed balance records, a data structure that bounds storage to O(k) records per account while guaranteeing that tokens never expire before their configured TTL. Our approach discretizes time into k buckets, coalescing deposits within the same bucket to limit unique expiration timestamps. We prove three key properties: (1) storage is bounded by k+1 records regardless of deposit frequency, (2) actual expiration time is always at least the configured TTL, and (3) adversaries cannot increase a victim's operation cost beyond O(k)[amortized] worst case. We provide a reference implementation in Solidity with measured gas costs demonstrating practical efficiency.

Authors: Shaun Scovil, Bhargav Chickmagalur Nanjundappa

Fungible tokens with time-to-live (TTL) semantics require tracking individual expiration times for each deposited unit. A naive implementation creates a new balance record per deposit, leading to unbounded storage growth and vulnerability to denial-of-service attacks. We present time-bucketed balance records, a data structure that bounds storage to O(k) records per account while guaranteeing that tokens never expire before their configured TTL. Our approach discretizes time into k buckets, coalescing deposits within the same bucket to limit unique expiration timestamps. We prove three key properties: (1) storage is bounded by k+1 records regardless of deposit frequency, (2) actual expiration time is always at least the configured TTL, and (3) adversaries cannot increase a victim's operation cost beyond O(k)[amortized] worst case. We provide a reference implementation in Solidity with measured gas costs demonstrating practical efficiency.

Fairness in the k-Server Problem

from arXiv: Data Structures and Algorithms

Authors: Mohammadreza Daneshvaramoli, Helia Karisani, Mohammad Hajiesmaili, Shahin Kamali, Cameron Musco

We initiate a formal study of fairness for the $k$-server problem, where the objective is not only to minimize the total movement cost, but also to distribute the cost equitably among servers. We first define a general notion of $(α,β)$-fairness, where, for parameters $α\ge 1$ and $β\ge 0$, no server incurs more than an $α/k$-fraction of the total cost plus an additive term $β$. We then show that fairness can be achieved without a loss in competitiveness in both the offline and online settings. In the offline setting, we give a deterministic algorithm that, for any $\varepsilon > 0$, transforms any optimal solution into an $(α,β)$-fair solution for $α= 1 + \varepsilon$ and $β= O(\mathrm{diam} \cdot \log k / \varepsilon)$, while increasing the cost of the solution by just an additive $O(\mathrm{diam} \cdot k \log k / \varepsilon)$ term. Here $\mathrm{diam}$ is the diameter of the underlying metric space. We give a similar result in the online setting, showing that any competitive algorithm can be transformed into a randomized online algorithm that is fair with high probability against an oblivious adversary and still competitive up to a small loss. The above results leave open a significant question: can fairness be achieved in the online setting, either with a deterministic algorithm or a randomized algorithm, against a fully adaptive adversary? We make progress towards answering this question, showing that the classic deterministic Double Coverage Algorithm (DCA) is fair on line metrics and on tree metrics when $k = 2$. However, we also show a negative result: DCA fails to be fair for any non-vacuous parameters on general tree metrics.

Authors: Mohammadreza Daneshvaramoli, Helia Karisani, Mohammad Hajiesmaili, Shahin Kamali, Cameron Musco

We initiate a formal study of fairness for the $k$-server problem, where the objective is not only to minimize the total movement cost, but also to distribute the cost equitably among servers. We first define a general notion of $(α,β)$-fairness, where, for parameters $α\ge 1$ and $β\ge 0$, no server incurs more than an $α/k$-fraction of the total cost plus an additive term $β$. We then show that fairness can be achieved without a loss in competitiveness in both the offline and online settings. In the offline setting, we give a deterministic algorithm that, for any $\varepsilon > 0$, transforms any optimal solution into an $(α,β)$-fair solution for $α= 1 + \varepsilon$ and $β= O(\mathrm{diam} \cdot \log k / \varepsilon)$, while increasing the cost of the solution by just an additive $O(\mathrm{diam} \cdot k \log k / \varepsilon)$ term. Here $\mathrm{diam}$ is the diameter of the underlying metric space. We give a similar result in the online setting, showing that any competitive algorithm can be transformed into a randomized online algorithm that is fair with high probability against an oblivious adversary and still competitive up to a small loss. The above results leave open a significant question: can fairness be achieved in the online setting, either with a deterministic algorithm or a randomized algorithm, against a fully adaptive adversary? We make progress towards answering this question, showing that the classic deterministic Double Coverage Algorithm (DCA) is fair on line metrics and on tree metrics when $k = 2$. However, we also show a negative result: DCA fails to be fair for any non-vacuous parameters on general tree metrics.

In-Place BWT and Lyndon Array Construction in Constant Space

from arXiv: Data Structures and Algorithms

Authors: Felipe A. Louza, Arnaud Lefebvre

We present an extension of the in-place BWT algorithm of Crochemore et al. [8] that enables the construction of the Lyndon array using O(1) extra space. Our approach incrementally maintains the lexicographic ranks of the suffixes during the right-to-left BWT construction and then derives the Lyndon array through a simple next-smaller-value procedure. Although not intended for practical use due to its quadratic running time, the method is conceptually simple and works for unbounded alphabets.

Authors: Felipe A. Louza, Arnaud Lefebvre

We present an extension of the in-place BWT algorithm of Crochemore et al. [8] that enables the construction of the Lyndon array using O(1) extra space. Our approach incrementally maintains the lexicographic ranks of the suffixes during the right-to-left BWT construction and then derives the Lyndon array through a simple next-smaller-value procedure. Although not intended for practical use due to its quadratic running time, the method is conceptually simple and works for unbounded alphabets.

Wednesday, December 24

Top-K Exterior Power Persistent Homology: Algorithm, Structure, and Stability

from arXiv: Computational Geometry

Authors: Yoshihiro Maruyama

Exterior powers play important roles in persistent homology in computational geometry. In the present paper we study the problem of extracting the $K$ longest intervals of the exterior-power layers of a tame persistence module. We prove a structural decomposition theorem that organizes the exterior-power layers into monotone per-anchor streams with explicit multiplicities, enabling a best-first algorithm. We also show that the Top-$K$ length vector is $2$-Lipschitz under bottleneck perturbations of the input barcode, and prove a comparison-model lower bound. Our experiments confirm the theory, showing speedups over full enumeration in high overlap cases. By enabling efficient extraction of the most prominent features, our approach makes higher-order persistence feasible for large datasets and thus broadly applicable to machine learning, data science, and scientific computing.

Authors: Yoshihiro Maruyama

Exterior powers play important roles in persistent homology in computational geometry. In the present paper we study the problem of extracting the $K$ longest intervals of the exterior-power layers of a tame persistence module. We prove a structural decomposition theorem that organizes the exterior-power layers into monotone per-anchor streams with explicit multiplicities, enabling a best-first algorithm. We also show that the Top-$K$ length vector is $2$-Lipschitz under bottleneck perturbations of the input barcode, and prove a comparison-model lower bound. Our experiments confirm the theory, showing speedups over full enumeration in high overlap cases. By enabling efficient extraction of the most prominent features, our approach makes higher-order persistence feasible for large datasets and thus broadly applicable to machine learning, data science, and scientific computing.

Algorithm for Interpretable Graph Features via Motivic Persistent Cohomology

from arXiv: Computational Geometry

Authors: Yoshihiro Maruyama

We present the Chromatic Persistence Algorithm (CPA), an event-driven method for computing persistent cohomological features of weighted graphs via graphic arrangements, a classical object in computational geometry. We establish rigorous complexity results: CPA is exponential in the worst case, fixed-parameter tractable in treewidth, and nearly linear for common graph families such as trees, cycles, and series-parallel graphs. Finally, we demonstrate its practical applicability through a controlled experiment on molecular-like graph structures.

Authors: Yoshihiro Maruyama

We present the Chromatic Persistence Algorithm (CPA), an event-driven method for computing persistent cohomological features of weighted graphs via graphic arrangements, a classical object in computational geometry. We establish rigorous complexity results: CPA is exponential in the worst case, fixed-parameter tractable in treewidth, and nearly linear for common graph families such as trees, cycles, and series-parallel graphs. Finally, we demonstrate its practical applicability through a controlled experiment on molecular-like graph structures.

Hierarchical Rectangle Packing Solved by Multi-Level Recursive Logic-based Benders Decomposition

from arXiv: Computational Geometry

Authors: Josef Grus, Zdeněk Hanzálek, Christian Artigues, Cyrille Briand, Emmanuel Hebrard

We study the two-dimensional hierarchical rectangle packing problem, motivated by applications in analog integrated circuit layout, facility layout, and logistics. Unlike classical strip or bin packing, the dimensions of the container are not fixed, and the packing is inherently hierarchical: each item is either a rectangle or a block occurrence, whose dimensions are a solution of another packing problem. This recursive structure reflects real-world scenarios in which components, boxes, or modules must be packed within higher-level containers. We formally define the problem and propose exact formulations in Mixed-Integer Linear Programming and Constraint Programming. Given the computational difficulty of solving complex packing instances directly, we propose decomposition heuristics. First, we implement an existing Bottom-Up baseline method that solves subblocks before combining them at higher levels. Building upon this, we introduce a novel multilevel Logic-based Benders Decomposition method. This heuristic method dynamically refines block dimension constraints, eliminating the need for manual selection of candidate widths or aspect ratios. Experiments on synthetic instances with up to seven hierarchy levels, 80 items per block, and limited computation time show that the proposed decomposition significantly outperforms both monolithic formulations and the Bottom-Up method in terms of solution quality and scalability.

Authors: Josef Grus, Zdeněk Hanzálek, Christian Artigues, Cyrille Briand, Emmanuel Hebrard

We study the two-dimensional hierarchical rectangle packing problem, motivated by applications in analog integrated circuit layout, facility layout, and logistics. Unlike classical strip or bin packing, the dimensions of the container are not fixed, and the packing is inherently hierarchical: each item is either a rectangle or a block occurrence, whose dimensions are a solution of another packing problem. This recursive structure reflects real-world scenarios in which components, boxes, or modules must be packed within higher-level containers. We formally define the problem and propose exact formulations in Mixed-Integer Linear Programming and Constraint Programming. Given the computational difficulty of solving complex packing instances directly, we propose decomposition heuristics. First, we implement an existing Bottom-Up baseline method that solves subblocks before combining them at higher levels. Building upon this, we introduce a novel multilevel Logic-based Benders Decomposition method. This heuristic method dynamically refines block dimension constraints, eliminating the need for manual selection of candidate widths or aspect ratios. Experiments on synthetic instances with up to seven hierarchy levels, 80 items per block, and limited computation time show that the proposed decomposition significantly outperforms both monolithic formulations and the Bottom-Up method in terms of solution quality and scalability.

A Comprehensive Guide to Mesh Simplification using Edge Collapse

from arXiv: Computational Geometry

Authors: Purva Kulkarni, Aravind Shankara Narayanan

Mesh simplification is the process of reducing the number of vertices, edges and triangles in a three-dimensional (3D) mesh while preserving the overall shape and salient features of the mesh. A popular strategy for this is edge collapse, where an edge connecting two vertices is merged into a single vertex. The edge to collapse is chosen based on a cost function that estimates the error introduced by this collapse. This paper presents a comprehensive, implementation-oriented guide to edge collapse for practitioners and researchers seeking both theoretical grounding and practical insight. We review and derive the underlying mathematics and provide reference implementations for foundational cost functions including Quadric Error Metrics (QEM) and Lindstrom-Turk's geometric criteria. We also explain the mathematics behind attribute-aware edge collapse in QEM variants and Hoppe's energy-based method used in progressive meshes. In addition to cost functions, we outline the complete edge collapse algorithm, including the specific sequence of operations and the data structures that are commonly used. To create a robust system, we also cover the necessary programmatic safeguards that prevent issues like mesh degeneracies, inverted normals, and improper handling of boundary conditions. The goal of this work is not only to consolidate established methods but also to bridge the gap between theory and practice, offering a clear, step-by-step guide for implementing mesh simplification pipelines based on edge collapse.

Authors: Purva Kulkarni, Aravind Shankara Narayanan

Mesh simplification is the process of reducing the number of vertices, edges and triangles in a three-dimensional (3D) mesh while preserving the overall shape and salient features of the mesh. A popular strategy for this is edge collapse, where an edge connecting two vertices is merged into a single vertex. The edge to collapse is chosen based on a cost function that estimates the error introduced by this collapse. This paper presents a comprehensive, implementation-oriented guide to edge collapse for practitioners and researchers seeking both theoretical grounding and practical insight. We review and derive the underlying mathematics and provide reference implementations for foundational cost functions including Quadric Error Metrics (QEM) and Lindstrom-Turk's geometric criteria. We also explain the mathematics behind attribute-aware edge collapse in QEM variants and Hoppe's energy-based method used in progressive meshes. In addition to cost functions, we outline the complete edge collapse algorithm, including the specific sequence of operations and the data structures that are commonly used. To create a robust system, we also cover the necessary programmatic safeguards that prevent issues like mesh degeneracies, inverted normals, and improper handling of boundary conditions. The goal of this work is not only to consolidate established methods but also to bridge the gap between theory and practice, offering a clear, step-by-step guide for implementing mesh simplification pipelines based on edge collapse.

On the near-tightness of $χ\leq 2r$: a general $σ$-ary construction and a binary case via LFSRs

from arXiv: Data Structures and Algorithms

Authors: Vinicius T. V. Date, Leandro M. Zatesko

In the field of compressed string indexes, recent work has introduced suffixient sets and their corresponding repetitiveness measure $χ$. In particular, researchers have explored its relationship to other repetitiveness measures, notably $r$, the number of runs in the Burrows--Wheeler Transform (BWT) of a string. Navarro et al. (2025) proved that $χ\leq 2r$, although empirical results by Cenzato et al. (2024) suggest that this bound is loose, with real data bounding $χ$ by around $1.13r$ to $1.33r$ when the size of the alphabet is $σ= 4$. To better understand this gap, we present two cases for the asymptotic tightness of the $χ\leq 2r$ bound: a general construction for arbitrary $σ$ values, and a binary alphabet case, consisting of de Bruijn sequences constructed by linear-feedback shift registers (LFSRs) from primitive polynomials over $\mathbb{F}_2$. The second is a novel characterization of which de Bruijn sequences achieve the literature run-minimal pattern for the cyclic BWT. Moreover, we show that de Bruijn sequences fail to close the gap for $σ\geq 3$.

Authors: Vinicius T. V. Date, Leandro M. Zatesko

In the field of compressed string indexes, recent work has introduced suffixient sets and their corresponding repetitiveness measure $χ$. In particular, researchers have explored its relationship to other repetitiveness measures, notably $r$, the number of runs in the Burrows--Wheeler Transform (BWT) of a string. Navarro et al. (2025) proved that $χ\leq 2r$, although empirical results by Cenzato et al. (2024) suggest that this bound is loose, with real data bounding $χ$ by around $1.13r$ to $1.33r$ when the size of the alphabet is $σ= 4$. To better understand this gap, we present two cases for the asymptotic tightness of the $χ\leq 2r$ bound: a general construction for arbitrary $σ$ values, and a binary alphabet case, consisting of de Bruijn sequences constructed by linear-feedback shift registers (LFSRs) from primitive polynomials over $\mathbb{F}_2$. The second is a novel characterization of which de Bruijn sequences achieve the literature run-minimal pattern for the cyclic BWT. Moreover, we show that de Bruijn sequences fail to close the gap for $σ\geq 3$.

Certified Lower Bounds and Efficient Estimation of Minimum Accuracy in Quantum Kernel Methods

from arXiv: Data Structures and Algorithms

Authors: Demerson N. Gonçalves, Tharso D. Fernandes, Andrias M. M. Cordeiro, Pedro H. G. Lugao, João T. Dias

The minimum accuracy heuristic evaluates quantum feature maps without requiring full quantum support vector machine (QSVM) training. However, the original formulation is computationally expensive, restricted to balanced datasets, and lacks theoretical backing. This work generalizes the metric to arbitrary binary datasets and formally proves it constitutes a certified lower bound on the optimal empirical accuracy of any linear classifier in the same feature space. Furthermore, we introduce Monte Carlo strategies to efficiently estimate this bound using a random subset of Pauli directions, accompanied by rigorous probabilistic guarantees. These contributions establish minimum accuracy as a scalable, theoretically sound tool for pre-screening feature maps on near-term quantum devices.

Authors: Demerson N. Gonçalves, Tharso D. Fernandes, Andrias M. M. Cordeiro, Pedro H. G. Lugao, João T. Dias

The minimum accuracy heuristic evaluates quantum feature maps without requiring full quantum support vector machine (QSVM) training. However, the original formulation is computationally expensive, restricted to balanced datasets, and lacks theoretical backing. This work generalizes the metric to arbitrary binary datasets and formally proves it constitutes a certified lower bound on the optimal empirical accuracy of any linear classifier in the same feature space. Furthermore, we introduce Monte Carlo strategies to efficiently estimate this bound using a random subset of Pauli directions, accompanied by rigorous probabilistic guarantees. These contributions establish minimum accuracy as a scalable, theoretically sound tool for pre-screening feature maps on near-term quantum devices.

Approximation and parameterized algorithms for covering disjointness-compliable set families

from arXiv: Data Structures and Algorithms

Authors: Zeev Nutov, Anael Vaknin

A set-family ${\cal F}$ is disjointness-compliable if $A' \subseteq A \in {\cal F}$ implies $A' \in {\cal F}$ or $A \setminus A' \in {\cal F}$; if ${\cal F}$ is also symmetric then ${\cal F}$ is proper. A classic result of Goemans and Williamson [SODA 92:307-316] states that the problem of covering a proper set-family by a min-cost edge set admits approximation ratio $2$, by a classic primal-dual algorithm. However, there are several famous algorithmic problems whose set-family ${\cal F}$ is disjointness-compliable but not symmetric -- among them $k$-Minimum Spanning Tree ($k$-MST), Generalized Point-to-Point Connection (G-P2P), Group Steiner, Covering Steiner, multiroot versions of these problems, and others. We will show that any such problem admits approximation ratio $O(α\log τ)$, where $τ$ is the number of inclusion-minimal sets in the family ${\cal F}$ that models the problem and $α$ is the best known approximation ratio for the case when $τ=1$. This immediately implies several results, among them the following two. (i) The first deterministic polynomial time $O(\log n)$-approximation algorithm for the G-P2P problem. Here the $τ=1$ case is the $k$-MST problem. (ii) Approximation ratio $O(\log^4 n)$ for the multiroot version of the Covering Steiner problem, where each root has its own set of groups. Here the $τ=1$ case is the Covering Steiner problem. We also discuss the parameterized complexity of covering a disjointness-compliable family ${\cal F}$, when parametrized by $τ$. We will show that if ${\cal F}$ is proper then the problem is fixed parameter tractable and can be solved in time $O^*(3^τ)$. For the non-symmetric case we will show that the problem admits approximation ratio between $α$ and $α+1$ in time $O^*(3^τ)$, which is essentially the best possible.

Authors: Zeev Nutov, Anael Vaknin

A set-family ${\cal F}$ is disjointness-compliable if $A' \subseteq A \in {\cal F}$ implies $A' \in {\cal F}$ or $A \setminus A' \in {\cal F}$; if ${\cal F}$ is also symmetric then ${\cal F}$ is proper. A classic result of Goemans and Williamson [SODA 92:307-316] states that the problem of covering a proper set-family by a min-cost edge set admits approximation ratio $2$, by a classic primal-dual algorithm. However, there are several famous algorithmic problems whose set-family ${\cal F}$ is disjointness-compliable but not symmetric -- among them $k$-Minimum Spanning Tree ($k$-MST), Generalized Point-to-Point Connection (G-P2P), Group Steiner, Covering Steiner, multiroot versions of these problems, and others. We will show that any such problem admits approximation ratio $O(α\log τ)$, where $τ$ is the number of inclusion-minimal sets in the family ${\cal F}$ that models the problem and $α$ is the best known approximation ratio for the case when $τ=1$. This immediately implies several results, among them the following two. (i) The first deterministic polynomial time $O(\log n)$-approximation algorithm for the G-P2P problem. Here the $τ=1$ case is the $k$-MST problem. (ii) Approximation ratio $O(\log^4 n)$ for the multiroot version of the Covering Steiner problem, where each root has its own set of groups. Here the $τ=1$ case is the Covering Steiner problem. We also discuss the parameterized complexity of covering a disjointness-compliable family ${\cal F}$, when parametrized by $τ$. We will show that if ${\cal F}$ is proper then the problem is fixed parameter tractable and can be solved in time $O^*(3^τ)$. For the non-symmetric case we will show that the problem admits approximation ratio between $α$ and $α+1$ in time $O^*(3^τ)$, which is essentially the best possible.

Tuesday, December 23

Research Associate (between postdoc & research scientist) at Harvard University (apply by January 15, 2026)

from CCI: jobs

This is a position within the OpenDP open-source software project. Candidates should have familiarity with differential privacy and an interest in advancing its applications. Candidates whose profile does not exactly fit the posting (e.g. being more junior or more senior) should still reach out to us and apply; we may be able to tailor the […]

This is a position within the OpenDP open-source software project. Candidates should have familiarity with differential privacy and an interest in advancing its applications. Candidates whose profile does not exactly fit the posting (e.g. being more junior or more senior) should still reach out to us and apply; we may be able to tailor the position to fit.

Website: https://opendp.org/job-opportunities/
Email: info@opendp.org

By shacharlovett

Postdoc at University of Waterloo (apply by January 31, 2026)

from CCI: jobs

The Algorithms & Complexity group at the University of Waterloo is offering postdoctoral positions starting in the Fall of 2026. We seek candidates from all areas of theoretical computer science. Interested applicants should have their materials sent as in the link below, and questions can be sent to the email below. Applications are due **January […]

The Algorithms & Complexity group at the University of Waterloo is offering postdoctoral positions starting in the Fall of 2026. We seek candidates from all areas of theoretical computer science.

Interested applicants should have their materials sent as in the link below, and questions can be sent to the email below.
Applications are due **January 31st, 2026**.

Website: https://algcomp.uwaterloo.ca/positions/
Email: theory.waterloo@gmail.com

By shacharlovett

TR25-219 | Negations are powerful even in small depth | Bruno Pasqualotto Cavalar, Théo Fabris, Partha Mukhopadhyay, Srikanth Srinivasan, Amir Yehudayoff

from ECCC Papers

We study the power of negation in the Boolean and algebraic settings and show the following results. * We construct a family of polynomials $P_n$ in $n$ variables, all of whose monomials have positive coefficients, such that $P_n$ can be computed by a depth three circuit of polynomial size but any monotone circuit computing it has size $2^{\Omega(n)}$. This is the strongest possible separation result between monotone and non-monotone arithmetic computations and improves upon all earlier results, including the seminal work of Valiant (1980) and more recently by Chattopadhyay, Datta, and Mukhopadhyay (2021). We then boot-strap this result to prove strong monotone separations for polynomials of constant degree, which solves an open problem from the survey of Shpilka and Yehudayoff (2010). * By moving to the Boolean setting, we can prove superpolynomial monotone Boolean circuit lower bounds for specific Boolean functions, which imply that all the powers of certain monotone polynomials cannot be computed by polynomially sized monotone arithmetic circuits. This leads to a new kind of monotone vs. non-monotone separation in the arithmetic setting. * We then define a collection of problems with linear-algebraic nature, which are similar to span programs, and prove monotone Boolean circuit lower bounds for them. In particular, this gives the strongest known monotone lower bounds for functions in uniform (non-monotone) $\textbf{NC}^2$. Our construction also leads to an explicit matroid that defines a monotone function that is difficult to compute, which solves an open problem by Jukna and Seiwert (2020) in the context of the relative powers of greedy and pure dynamic programming algorithms. Our monotone arithmetic and Boolean circuit lower bounds are based on known techniques, such as reduction from monotone arithmetic complexity to multipartition communication complexity and the approximation method for proving lower bounds for monotone Boolean circuits, but we overcome several new challenges in order to obtain efficient upper bounds using low-depth circuits.

We study the power of negation in the Boolean and algebraic settings and show the following results. * We construct a family of polynomials $P_n$ in $n$ variables, all of whose monomials have positive coefficients, such that $P_n$ can be computed by a depth three circuit of polynomial size but any monotone circuit computing it has size $2^{\Omega(n)}$. This is the strongest possible separation result between monotone and non-monotone arithmetic computations and improves upon all earlier results, including the seminal work of Valiant (1980) and more recently by Chattopadhyay, Datta, and Mukhopadhyay (2021). We then boot-strap this result to prove strong monotone separations for polynomials of constant degree, which solves an open problem from the survey of Shpilka and Yehudayoff (2010). * By moving to the Boolean setting, we can prove superpolynomial monotone Boolean circuit lower bounds for specific Boolean functions, which imply that all the powers of certain monotone polynomials cannot be computed by polynomially sized monotone arithmetic circuits. This leads to a new kind of monotone vs. non-monotone separation in the arithmetic setting. * We then define a collection of problems with linear-algebraic nature, which are similar to span programs, and prove monotone Boolean circuit lower bounds for them. In particular, this gives the strongest known monotone lower bounds for functions in uniform (non-monotone) $\textbf{NC}^2$. Our construction also leads to an explicit matroid that defines a monotone function that is difficult to compute, which solves an open problem by Jukna and Seiwert (2020) in the context of the relative powers of greedy and pure dynamic programming algorithms. Our monotone arithmetic and Boolean circuit lower bounds are based on known techniques, such as reduction from monotone arithmetic complexity to multipartition communication complexity and the approximation method for proving lower bounds for monotone Boolean circuits, but we overcome several new challenges in order to obtain efficient upper bounds using low-depth circuits.

Negations are powerful even in small depth

from arXiv: Computational Complexity

Authors: Bruno Cavalar, Théo Borém Fabris, Partha Mukhopadhyay, Srikanth Srinivasan, Amir Yehudayoff

We study the power of negation in the Boolean and algebraic settings and show the following results. * We construct a family of polynomials $P_n$ in $n$ variables, all of whose monomials have positive coefficients, such that $P_n$ can be computed by a depth three circuit of polynomial size but any monotone circuit computing it has size $2^{Ω(n)}$. This is the strongest possible separation result between monotone and non-monotone arithmetic computations and improves upon all earlier results, including the seminal work of Valiant (1980) and more recently by Chattopadhyay, Datta, and Mukhopadhyay (2021). We then boot-strap this result to prove strong monotone separations for polynomials of constant degree, which solves an open problem from the survey of Shpilka and Yehudayoff (2010). * By moving to the Boolean setting, we can prove superpolynomial monotone Boolean circuit lower bounds for specific Boolean functions, which imply that all the powers of certain monotone polynomials cannot be computed by polynomially sized monotone arithmetic circuits. * We then define a collection of problems with linear-algebraic nature, which are similar to span programs, and prove monotone Boolean circuit lower bounds for them. In particular, this gives the strongest known monotone lower bounds for functions in uniform (non-monotone) $\textbf{NC}^2$. Our construction also leads to an explicit matroid that defines a monotone function that is difficult to compute, which solves an open problem by Jukna and Seiwert (2020). Our monotone arithmetic and Boolean circuit lower bounds are based on known techniques, such as reduction from monotone arithmetic complexity to multipartition communication complexity and the approximation method for proving lower bounds for monotone Boolean circuits, but we overcome several new challenges in order to obtain efficient upper bounds using low-depth circuits.

Authors: Bruno Cavalar, Théo Borém Fabris, Partha Mukhopadhyay, Srikanth Srinivasan, Amir Yehudayoff

We study the power of negation in the Boolean and algebraic settings and show the following results. * We construct a family of polynomials $P_n$ in $n$ variables, all of whose monomials have positive coefficients, such that $P_n$ can be computed by a depth three circuit of polynomial size but any monotone circuit computing it has size $2^{Ω(n)}$. This is the strongest possible separation result between monotone and non-monotone arithmetic computations and improves upon all earlier results, including the seminal work of Valiant (1980) and more recently by Chattopadhyay, Datta, and Mukhopadhyay (2021). We then boot-strap this result to prove strong monotone separations for polynomials of constant degree, which solves an open problem from the survey of Shpilka and Yehudayoff (2010). * By moving to the Boolean setting, we can prove superpolynomial monotone Boolean circuit lower bounds for specific Boolean functions, which imply that all the powers of certain monotone polynomials cannot be computed by polynomially sized monotone arithmetic circuits. * We then define a collection of problems with linear-algebraic nature, which are similar to span programs, and prove monotone Boolean circuit lower bounds for them. In particular, this gives the strongest known monotone lower bounds for functions in uniform (non-monotone) $\textbf{NC}^2$. Our construction also leads to an explicit matroid that defines a monotone function that is difficult to compute, which solves an open problem by Jukna and Seiwert (2020). Our monotone arithmetic and Boolean circuit lower bounds are based on known techniques, such as reduction from monotone arithmetic complexity to multipartition communication complexity and the approximation method for proving lower bounds for monotone Boolean circuits, but we overcome several new challenges in order to obtain efficient upper bounds using low-depth circuits.

Classical billiards can compute

from arXiv: Computational Complexity

Authors: Eva Miranda, Isaac Ramos

We show that two-dimensional billiard systems are Turing complete by encoding their dynamics within the framework of Topological Kleene Field Theory. Billiards serve as idealized models of particle motion with elastic reflections and arise naturally as limits of smooth Hamiltonian systems under steep confining potentials. Our results establish the existence of undecidable trajectories in physically natural billiard-type models, including billiard-type models arising in hard-sphere gases and in collision-chain limits of celestial mechanics.

Authors: Eva Miranda, Isaac Ramos

We show that two-dimensional billiard systems are Turing complete by encoding their dynamics within the framework of Topological Kleene Field Theory. Billiards serve as idealized models of particle motion with elastic reflections and arise naturally as limits of smooth Hamiltonian systems under steep confining potentials. Our results establish the existence of undecidable trajectories in physically natural billiard-type models, including billiard-type models arising in hard-sphere gases and in collision-chain limits of celestial mechanics.

On the complexity of computing Strahler numbers

from arXiv: Computational Complexity

Authors: Moses Ganardi, Markus Lohrey

It is shown that the problem of computing the Strahler number of a binary tree given as a term is complete for the circuit complexity class uniform $\mathsf{NC}^1$. For several variants, where the binary tree is given by a pointer structure or in a succinct form by a directed acyclic graph or a tree straight-line program, the complexity of computing the Strahler number is determined as well. The problem, whether a given context-free grammar in Chomsky normal form produces a derivation tree (resp., an acyclic derivation tree), whose Strahler number is at least a given number $k$ is shown to be P-complete (resp., PSPACE-complete).

Authors: Moses Ganardi, Markus Lohrey

It is shown that the problem of computing the Strahler number of a binary tree given as a term is complete for the circuit complexity class uniform $\mathsf{NC}^1$. For several variants, where the binary tree is given by a pointer structure or in a succinct form by a directed acyclic graph or a tree straight-line program, the complexity of computing the Strahler number is determined as well. The problem, whether a given context-free grammar in Chomsky normal form produces a derivation tree (resp., an acyclic derivation tree), whose Strahler number is at least a given number $k$ is shown to be P-complete (resp., PSPACE-complete).

Assembly Addition Chains

from arXiv: Computational Complexity

Authors: Leroy Cronin, Juan Carlos Morales Parra, Keith Y. Patarroyo

In this paper we extend the notion of Addition Chains over Z+ to a general set S. We explain how the algebraic structure of Assembly Multi-Magma over the pairs (S,BB proper subset of S) allows to define the concept of Addition Chain over S, called Assembly Addition Chains of S with Building Blocks BB. Analogously to the Z+ case, we introduce the concept of Optimal Assembly Addition Chains over S and prove lower and upper bounds for their lengths, similar to the bounds found by Schonhage for the Z+ case. In the general case the unit 1 is in set Z+ is replaced by the subset BB and the mentioned bounds for the length of an Optimal Assembly Addition Chain of O is in set S are defined in terms of the size of O (i.e. the number of Building Blocks required to construct O). The main examples of S that we consider through this papers are (i) j-Strings (Strings with an alphabeth of j letters), (ii) Colored Connected Graphs and (iii) Colored Polyominoes.

Authors: Leroy Cronin, Juan Carlos Morales Parra, Keith Y. Patarroyo

In this paper we extend the notion of Addition Chains over Z+ to a general set S. We explain how the algebraic structure of Assembly Multi-Magma over the pairs (S,BB proper subset of S) allows to define the concept of Addition Chain over S, called Assembly Addition Chains of S with Building Blocks BB. Analogously to the Z+ case, we introduce the concept of Optimal Assembly Addition Chains over S and prove lower and upper bounds for their lengths, similar to the bounds found by Schonhage for the Z+ case. In the general case the unit 1 is in set Z+ is replaced by the subset BB and the mentioned bounds for the length of an Optimal Assembly Addition Chain of O is in set S are defined in terms of the size of O (i.e. the number of Building Blocks required to construct O). The main examples of S that we consider through this papers are (i) j-Strings (Strings with an alphabeth of j letters), (ii) Colored Connected Graphs and (iii) Colored Polyominoes.

Construction and deformation of P-hedra using control polylines

from arXiv: Computational Geometry

Authors: Georg Nawratil

In the 19th International Symposium on Advances in Robot Kinematics the author introduced a novel class of continuous flexible discrete surfaces and mentioned that these so-called P-hedra (or P-nets) allow direct access to their spatial shapes by three control polylines. In this follow-up paper we study this intuitive method, which makes these flexible planar quad surfaces suitable for transformable design tasks by means of interactive tools. The construction of P-hedra from the control polylines can also be used for an efficient algorithmic computation of their isometric deformations. In addition we discuss flexion limits, bifurcation configurations, developable/flat-foldable pattern and tubular P-hedra.

Authors: Georg Nawratil

In the 19th International Symposium on Advances in Robot Kinematics the author introduced a novel class of continuous flexible discrete surfaces and mentioned that these so-called P-hedra (or P-nets) allow direct access to their spatial shapes by three control polylines. In this follow-up paper we study this intuitive method, which makes these flexible planar quad surfaces suitable for transformable design tasks by means of interactive tools. The construction of P-hedra from the control polylines can also be used for an efficient algorithmic computation of their isometric deformations. In addition we discuss flexion limits, bifurcation configurations, developable/flat-foldable pattern and tubular P-hedra.

On The Computational Complexity for Minimizing Aerial Photographs for Full Coverage of a Planar Region

from arXiv: Computational Geometry

Authors: Si Wei Feng

With the popularity of drone technologies, aerial photography have become prevalent in many daily scenarios such as environment monitoring, structure inspection, law enforcement etc. A central challenge in this domain is the efficient coverage of a target area with photographs that can entirely capture the region, while respecting constraints such as the image resolution, and limited number of pictures that can be taken. This work investigates the computational complexity of several fundamental problems arised from this challenge. By abstracting the aerial photography problem into the coverage problems in computational geometry, we demonstrate that most of these problems are in fact computationally intractable, with the implication that traditional algorithms cannot solve them efficiently. The intuitions of this work can extend beyond aerial photography to broader applications such as pesticide spraying, and strategic sensor placement.

Authors: Si Wei Feng

With the popularity of drone technologies, aerial photography have become prevalent in many daily scenarios such as environment monitoring, structure inspection, law enforcement etc. A central challenge in this domain is the efficient coverage of a target area with photographs that can entirely capture the region, while respecting constraints such as the image resolution, and limited number of pictures that can be taken. This work investigates the computational complexity of several fundamental problems arised from this challenge. By abstracting the aerial photography problem into the coverage problems in computational geometry, we demonstrate that most of these problems are in fact computationally intractable, with the implication that traditional algorithms cannot solve them efficiently. The intuitions of this work can extend beyond aerial photography to broader applications such as pesticide spraying, and strategic sensor placement.

Finding Cliques in Geometric Intersection Graphs with Grounded or Stabbed Constraints

from arXiv: Computational Geometry

Authors: J. Mark Keil, Debajyoti Mondal

A geometric intersection graph is constructed over a set of geometric objects, where each vertex represents a distinct object and an edge connects two vertices if and only if the corresponding objects intersect. We examine the problem of finding a maximum clique in the intersection graphs of segments and disks under grounded and stabbed constraints. In the grounded setting, all objects lie above a common horizontal line and touch that line. In the stabbed setting, all objects can be stabbed with a common line. - We prove that finding a maximum clique is NP-hard for the intersection graphs of upward rays. This strengthens the previously known NP-hardness for ray graphs and settles the open question for the grounded segment graphs. The hardness result holds in the stabbed setting. - We show that the problem is polynomial-time solvable for intersection graphs of grounded unit-length segments, but NP-hard for stabbed unit-length segments. - We give a polynomial-time algorithm for the case of grounded disks. If the grounded constraint is relaxed, then we give an $O(n^3 f(n))$-time $3/2$-approximation for disk intersection graphs with radii in the interval $[1,3]$, where $n$ is the number of disks and $f(n)$ is the time to compute a maximum clique in an $n$-vertex cobipartite graph. This is faster than previously known randomized EPTAS, QPTAS, or 2-approximation algorithms for arbitrary disks. We obtain our result by proving that pairwise intersecting disks with radii in $[1,3]$ are 3-pierceable, which extends the 3-pierceable property from the long known unit disk case to a broader class.

Authors: J. Mark Keil, Debajyoti Mondal

A geometric intersection graph is constructed over a set of geometric objects, where each vertex represents a distinct object and an edge connects two vertices if and only if the corresponding objects intersect. We examine the problem of finding a maximum clique in the intersection graphs of segments and disks under grounded and stabbed constraints. In the grounded setting, all objects lie above a common horizontal line and touch that line. In the stabbed setting, all objects can be stabbed with a common line. - We prove that finding a maximum clique is NP-hard for the intersection graphs of upward rays. This strengthens the previously known NP-hardness for ray graphs and settles the open question for the grounded segment graphs. The hardness result holds in the stabbed setting. - We show that the problem is polynomial-time solvable for intersection graphs of grounded unit-length segments, but NP-hard for stabbed unit-length segments. - We give a polynomial-time algorithm for the case of grounded disks. If the grounded constraint is relaxed, then we give an $O(n^3 f(n))$-time $3/2$-approximation for disk intersection graphs with radii in the interval $[1,3]$, where $n$ is the number of disks and $f(n)$ is the time to compute a maximum clique in an $n$-vertex cobipartite graph. This is faster than previously known randomized EPTAS, QPTAS, or 2-approximation algorithms for arbitrary disks. We obtain our result by proving that pairwise intersecting disks with radii in $[1,3]$ are 3-pierceable, which extends the 3-pierceable property from the long known unit disk case to a broader class.

BlockSets: A Structured Visualization for Sets with Large Elements

from arXiv: Computational Geometry

Authors: Neda Novakova, Veselin Todorov, Steven van den Broek, Tim Dwyer, Bettina Speckmann

Visualizations of set systems frequently use enclosing geometries for the sets in combination with reduced representations of the elements, such as short text labels, small glyphs, or points. Hence they are generally unable to adequately represent sets whose elements are larger text fragments, images, or charts. In this paper we introduce BlockSets, a novel set visualization technique specifically designed for sets with large elements. BlockSets places the elements on a grid and uses rectilinear shapes as enclosing geometries. We describe integer linear programs that find high-quality layouts of the elements on the grid. Since not all set systems allow a compact contiguous representation in this form, we also present an algorithm that splits the visualization into parts when needed; our visual encoding highlights the parts for the user in the final visualization. BlockSets utilizes orthoconvex shapes which offer a good trade-off between compactness and readability. Finally, BlockSets renders the enclosing geometries as stacked opaque shapes. We describe an algorithm that finds a stacking order such that all shapes can be inferred. Such a stacking does not have to exist, but our algorithm did find a stacking for all real-world data sets that we tested.

Authors: Neda Novakova, Veselin Todorov, Steven van den Broek, Tim Dwyer, Bettina Speckmann

Visualizations of set systems frequently use enclosing geometries for the sets in combination with reduced representations of the elements, such as short text labels, small glyphs, or points. Hence they are generally unable to adequately represent sets whose elements are larger text fragments, images, or charts. In this paper we introduce BlockSets, a novel set visualization technique specifically designed for sets with large elements. BlockSets places the elements on a grid and uses rectilinear shapes as enclosing geometries. We describe integer linear programs that find high-quality layouts of the elements on the grid. Since not all set systems allow a compact contiguous representation in this form, we also present an algorithm that splits the visualization into parts when needed; our visual encoding highlights the parts for the user in the final visualization. BlockSets utilizes orthoconvex shapes which offer a good trade-off between compactness and readability. Finally, BlockSets renders the enclosing geometries as stacked opaque shapes. We describe an algorithm that finds a stacking order such that all shapes can be inferred. Such a stacking does not have to exist, but our algorithm did find a stacking for all real-world data sets that we tested.

Clustering with Label Consistency

from arXiv: Data Structures and Algorithms

Authors: Diptarka Chakraborty, Hendrik Fichtenberger, Bernhard Haeupler, Silvio Lattanzi, Ashkan Norouzi-Fard, Ola Svensson

Designing efficient, effective, and consistent metric clustering algorithms is a significant challenge attracting growing attention. Traditional approaches focus on the stability of cluster centers; unfortunately, this neglects the real-world need for stable point labels, i.e., stable assignments of points to named sets (clusters). In this paper, we address this gap by initiating the study of label-consistent metric clustering. We first introduce a new notion of consistency, measuring the label distance between two consecutive solutions. Then, armed with this new definition, we design new consistent approximation algorithms for the classical $k$-center and $k$-median problems.

Authors: Diptarka Chakraborty, Hendrik Fichtenberger, Bernhard Haeupler, Silvio Lattanzi, Ashkan Norouzi-Fard, Ola Svensson

Designing efficient, effective, and consistent metric clustering algorithms is a significant challenge attracting growing attention. Traditional approaches focus on the stability of cluster centers; unfortunately, this neglects the real-world need for stable point labels, i.e., stable assignments of points to named sets (clusters). In this paper, we address this gap by initiating the study of label-consistent metric clustering. We first introduce a new notion of consistency, measuring the label distance between two consecutive solutions. Then, armed with this new definition, we design new consistent approximation algorithms for the classical $k$-center and $k$-median problems.

Near-optimal streaming approximation for Max-DICUT in sublinear space using two passes

from arXiv: Data Structures and Algorithms

Authors: Santhoshini Velusamy

The Max-DICUT problem has gained a lot of attention in the streaming setting in recent years, and has so far served as a canonical problem for designing algorithms for general constraint satisfaction problems (CSPs) in this setting. A seminal result of Kapralov and Krachun [STOC 2019] shows that it is impossible to beat $1/2$-approximation for Max-DICUT in sublinear space in the single-pass streaming setting, even on bounded-degree graphs. In a recent work, Saxena, Singer, Sudan, and Velusamy [SODA 2025] prove that the above lower bound is tight by giving a single-pass algorithm for bounded-degree graphs that achieves $(1/2-ε)$-approximation in sublinear space, for every constant $ε>0$. For arbitrary graphs of unbounded degree, they give an $O(1/ε)$-pass $O(\log n)$ space algorithm. Their work left open the question of obtaining $1/2$-approximation for arbitrary graphs in the single-pass setting in sublinear space. We make progress towards this question and give a two-pass algorithm that achieves $(1/2-ε)$-approximation in sublinear space, for every constant $ε>0$.

Authors: Santhoshini Velusamy

The Max-DICUT problem has gained a lot of attention in the streaming setting in recent years, and has so far served as a canonical problem for designing algorithms for general constraint satisfaction problems (CSPs) in this setting. A seminal result of Kapralov and Krachun [STOC 2019] shows that it is impossible to beat $1/2$-approximation for Max-DICUT in sublinear space in the single-pass streaming setting, even on bounded-degree graphs. In a recent work, Saxena, Singer, Sudan, and Velusamy [SODA 2025] prove that the above lower bound is tight by giving a single-pass algorithm for bounded-degree graphs that achieves $(1/2-ε)$-approximation in sublinear space, for every constant $ε>0$. For arbitrary graphs of unbounded degree, they give an $O(1/ε)$-pass $O(\log n)$ space algorithm. Their work left open the question of obtaining $1/2$-approximation for arbitrary graphs in the single-pass setting in sublinear space. We make progress towards this question and give a two-pass algorithm that achieves $(1/2-ε)$-approximation in sublinear space, for every constant $ε>0$.

Fare Zone Assignment

from arXiv: Data Structures and Algorithms

Authors: Martin Hoefer, Lennart Kauther, Philipp Pabst, Britta Peis, Khai Van Tran

Tariff setting in public transportation networks is an important challenge. A popular approach is to partition the network into fare zones ("zoning") and fix journey prices depending on the number of traversed zones ("pricing"). In this paper, we focus on finding revenue-optimal solutions to the zoning problem for a given concave pricing function. We consider tree networks with $n$ vertices, since trees already pose non-trivial algorithmic challenges. Our main results are efficient algorithms that yield a simple $\mathcal{O}(\log n)$-approximation as well as a more involved $\mathcal{O}(\log n/\log \log n)$-approximation. We show how to solve the problem exactly on rooted instances, in which all demand arises at the same source. For paths, we prove strong NP-hardness and outline a PTAS. Moreover, we show that computing an optimal solution is in FPT or XP for several natural problem parameters.

Authors: Martin Hoefer, Lennart Kauther, Philipp Pabst, Britta Peis, Khai Van Tran

Tariff setting in public transportation networks is an important challenge. A popular approach is to partition the network into fare zones ("zoning") and fix journey prices depending on the number of traversed zones ("pricing"). In this paper, we focus on finding revenue-optimal solutions to the zoning problem for a given concave pricing function. We consider tree networks with $n$ vertices, since trees already pose non-trivial algorithmic challenges. Our main results are efficient algorithms that yield a simple $\mathcal{O}(\log n)$-approximation as well as a more involved $\mathcal{O}(\log n/\log \log n)$-approximation. We show how to solve the problem exactly on rooted instances, in which all demand arises at the same source. For paths, we prove strong NP-hardness and outline a PTAS. Moreover, we show that computing an optimal solution is in FPT or XP for several natural problem parameters.

On Factoring and Power Divisor Problems via Rank-3 Lattices and the Second Vector

from arXiv: Data Structures and Algorithms

Authors: Yiming Gao, Yansong Feng, Honggang Hu, Yanbin Pan

We propose a deterministic algorithm based on Coppersmith's method that employs a rank-3 lattice to address factoring-related problems. An interesting aspect of our approach is that we utilize the second vector in the LLL-reduced basis to avoid trivial collisions in the Baby-step Giant-step method, rather than the shortest vector as is commonly used in the literature. Our results are as follows: 1. Compared to the result by Harvey and Hittmeir (Math. Comp. 91 (2022), 1367 - 1379), who achieved a complexity of O( N^(1/5) log^(16/5) N / (log log N)^(3/5)) for factoring a semiprime N = pq, we demonstrate that in the balanced p and q case, the complexity can be improved to O( N^(1/5) log^(13/5) N / (log log N)^(3/5) ). 2. For factoring sums and differences of powers, that is, numbers of the form N = a^n plus or minus b^n, we improve Hittmeir's result (Math. Comp. 86 (2017), 2947 - 2954) from O( N^(1/4) log^(3/2) N ) to O( N^(1/5) log^(13/5) N ). 3. For the problem of finding r-power divisors, that is, finding all integers p such that p^r divides N, Harvey and Hittmeir (Proceedings of ANTS XV, Research in Number Theory 8 (2022), no. 4, Paper No. 94) recently directly applied Coppersmith's method and achieved a complexity of O( N^(1/(4r)) log^(10+epsilon) N / r^3 ). By using faster LLL-type algorithms and sieving on small primes, we improve their result to O( N^(1/(4r)) log^(7+3 epsilon) N / ((log log N minus log(4r)) r^(2+epsilon)) ). The worst-case running time for their algorithm occurs when N = p^r q with q on the order of N^(1/2). By focusing on this case and employing our rank-3 lattice approach, we achieve a complexity of O( r^(1/4) N^(1/(4r)) log^(5/2) N ). In conclusion, we offer a new perspective on these problems, which we hope will provide further insights.

Authors: Yiming Gao, Yansong Feng, Honggang Hu, Yanbin Pan

We propose a deterministic algorithm based on Coppersmith's method that employs a rank-3 lattice to address factoring-related problems. An interesting aspect of our approach is that we utilize the second vector in the LLL-reduced basis to avoid trivial collisions in the Baby-step Giant-step method, rather than the shortest vector as is commonly used in the literature. Our results are as follows: 1. Compared to the result by Harvey and Hittmeir (Math. Comp. 91 (2022), 1367 - 1379), who achieved a complexity of O( N^(1/5) log^(16/5) N / (log log N)^(3/5)) for factoring a semiprime N = pq, we demonstrate that in the balanced p and q case, the complexity can be improved to O( N^(1/5) log^(13/5) N / (log log N)^(3/5) ). 2. For factoring sums and differences of powers, that is, numbers of the form N = a^n plus or minus b^n, we improve Hittmeir's result (Math. Comp. 86 (2017), 2947 - 2954) from O( N^(1/4) log^(3/2) N ) to O( N^(1/5) log^(13/5) N ). 3. For the problem of finding r-power divisors, that is, finding all integers p such that p^r divides N, Harvey and Hittmeir (Proceedings of ANTS XV, Research in Number Theory 8 (2022), no. 4, Paper No. 94) recently directly applied Coppersmith's method and achieved a complexity of O( N^(1/(4r)) log^(10+epsilon) N / r^3 ). By using faster LLL-type algorithms and sieving on small primes, we improve their result to O( N^(1/(4r)) log^(7+3 epsilon) N / ((log log N minus log(4r)) r^(2+epsilon)) ). The worst-case running time for their algorithm occurs when N = p^r q with q on the order of N^(1/2). By focusing on this case and employing our rank-3 lattice approach, we achieve a complexity of O( r^(1/4) N^(1/(4r)) log^(5/2) N ). In conclusion, we offer a new perspective on these problems, which we hope will provide further insights.

Constant Approximation of Arboricity in Near-Optimal Sublinear Time

from arXiv: Data Structures and Algorithms

Authors: Jiangqi Dai, Mohsen Ghaffari, Julian Portmann

We present a randomized algorithm that computes a constant approximation of a graph's arboricity, using $\tilde{O}(n/λ)$ queries to adjacency lists and in the same time bound. Here, $n$ and $λ$ denote the number of nodes and the graph's arboricity, respectively. The $\tilde{O}(n/λ)$ query complexity of our algorithm is nearly optimal. Our constant approximation settles a question of Eden, Mossel, and Ron [SODA'22], who achieved an $O(\log^2 n)$ approximation with the same query and time complexity and asked whether a better approximation can be achieved using near-optimal query complexity. A key technical challenge in the problem is due to recursive algorithms based on probabilistic samplings, each with a non-negligible error probability. In our case, many of the recursions invoked could have bad probabilistic samples and result in high query complexities. The particular difficulty is that those bad recursions are not easy or cheap to detect and discard. Our approach runs multiple recursions in parallel, to attenuate the error probability, using a careful \textit{scheduling mechanism} that manages the speed at which each of them progresses and makes our overall query complexity competitive with the single good recursion. We find this usage of parallelism and scheduling in a sublinear algorithm remarkable, and we are hopeful that similar ideas may find applications in a wider range of sublinear algorithms that rely on probabilistic recursions.

Authors: Jiangqi Dai, Mohsen Ghaffari, Julian Portmann

We present a randomized algorithm that computes a constant approximation of a graph's arboricity, using $\tilde{O}(n/λ)$ queries to adjacency lists and in the same time bound. Here, $n$ and $λ$ denote the number of nodes and the graph's arboricity, respectively. The $\tilde{O}(n/λ)$ query complexity of our algorithm is nearly optimal. Our constant approximation settles a question of Eden, Mossel, and Ron [SODA'22], who achieved an $O(\log^2 n)$ approximation with the same query and time complexity and asked whether a better approximation can be achieved using near-optimal query complexity. A key technical challenge in the problem is due to recursive algorithms based on probabilistic samplings, each with a non-negligible error probability. In our case, many of the recursions invoked could have bad probabilistic samples and result in high query complexities. The particular difficulty is that those bad recursions are not easy or cheap to detect and discard. Our approach runs multiple recursions in parallel, to attenuate the error probability, using a careful \textit{scheduling mechanism} that manages the speed at which each of them progresses and makes our overall query complexity competitive with the single good recursion. We find this usage of parallelism and scheduling in a sublinear algorithm remarkable, and we are hopeful that similar ideas may find applications in a wider range of sublinear algorithms that rely on probabilistic recursions.

Quantization for Vector Search under Streaming Updates

from arXiv: Data Structures and Algorithms

Authors: Ishaq Aden-Ali, Hakan Ferhatosmanoglu, Alexander Greaves-Tunnell, Nina Mishra, Tal Wagner

Large-scale vector databases for approximate nearest neighbor (ANN) search typically store a quantized dataset in main memory for fast access, and full precision data on remote disk. State-of-the-art ANN quantization methods are highly data-dependent, rendering them unable to handle point insertions and deletions. This either leads to degraded search quality over time, or forces costly global rebuilds of the entire search index. In this paper, we formally study data-dependent quantization under streaming dataset updates. We formulate a computation model of limited remote disk access and define a dynamic consistency property that guarantees freshness under updates. We use it to obtain the following results: Theoretically, we prove that static data-dependent quantization can be made dynamic with bounded disk I/O per update while retaining formal accuracy guarantees for ANN search. Algorithmically, we develop a practical data-dependent quantization method which is provably dynamically consistent, adapting itself to the dataset as it evolves over time. Our experiments show that the method outperforms baselines in large-scale nearest neighbor search quantization under streaming updates.

Authors: Ishaq Aden-Ali, Hakan Ferhatosmanoglu, Alexander Greaves-Tunnell, Nina Mishra, Tal Wagner

Large-scale vector databases for approximate nearest neighbor (ANN) search typically store a quantized dataset in main memory for fast access, and full precision data on remote disk. State-of-the-art ANN quantization methods are highly data-dependent, rendering them unable to handle point insertions and deletions. This either leads to degraded search quality over time, or forces costly global rebuilds of the entire search index. In this paper, we formally study data-dependent quantization under streaming dataset updates. We formulate a computation model of limited remote disk access and define a dynamic consistency property that guarantees freshness under updates. We use it to obtain the following results: Theoretically, we prove that static data-dependent quantization can be made dynamic with bounded disk I/O per update while retaining formal accuracy guarantees for ANN search. Algorithmically, we develop a practical data-dependent quantization method which is provably dynamically consistent, adapting itself to the dataset as it evolves over time. Our experiments show that the method outperforms baselines in large-scale nearest neighbor search quantization under streaming updates.

Learning Dependency Models for Subset Repair

from arXiv: Data Structures and Algorithms

Authors: Haoda Li, Jiahui Chen, Yu Sun, Shaoxu Song, Haiwei Zhang, Xiaojie Yuan

Inconsistent values are commonly encountered in real-world applications, which can negatively impact data analysis and decision-making. While existing research primarily focuses on identifying the smallest removal set to resolve inconsistencies, recent studies have shown that multiple minimum removal sets may exist, making it difficult to make further decisions. While some approaches use the most frequent values as the guidance for the subset repair, this strategy has been criticized for its potential to inaccurately identify errors. To address these issues, we consider the dependencies between attribute values to determine a more appropriate subset repair. Our main contributions include (1) formalizing the optimal subset repair problem with attribute dependencies and analyzing its computational hardness; (2) computing the exact solution using integer linear programming; (3) developing an approximate algorithm with performance guarantees based on cliques and LP relaxation; and (4) designing a probabilistic approach with an approximation bound for efficiency. Experimental results on real-world datasets validate the effectiveness of our methods in both subset repair performance and downstream applications.

Authors: Haoda Li, Jiahui Chen, Yu Sun, Shaoxu Song, Haiwei Zhang, Xiaojie Yuan

Inconsistent values are commonly encountered in real-world applications, which can negatively impact data analysis and decision-making. While existing research primarily focuses on identifying the smallest removal set to resolve inconsistencies, recent studies have shown that multiple minimum removal sets may exist, making it difficult to make further decisions. While some approaches use the most frequent values as the guidance for the subset repair, this strategy has been criticized for its potential to inaccurately identify errors. To address these issues, we consider the dependencies between attribute values to determine a more appropriate subset repair. Our main contributions include (1) formalizing the optimal subset repair problem with attribute dependencies and analyzing its computational hardness; (2) computing the exact solution using integer linear programming; (3) developing an approximate algorithm with performance guarantees based on cliques and LP relaxation; and (4) designing a probabilistic approach with an approximation bound for efficiency. Experimental results on real-world datasets validate the effectiveness of our methods in both subset repair performance and downstream applications.

Constrained Cuts, Flows, and Lattice-Linearity

from arXiv: Data Structures and Algorithms

Authors: Robert Streit, Vijay K. Garg

In a capacitated directed graph, it is known that the set of all min-cuts forms a distributive lattice [1], [2]. Here, we describe this lattice as a regular predicate whose forbidden elements can be advanced in constant parallel time after precomputing a max-flow, so as to obtain parallel algorithms for min-cut problems with additional constraints encoded by lattice-linear predicates [3]. Some nice algorithmic applications follow. First, we use these methods to compute the irreducibles of the sublattice of min-cuts satisfying a regular predicate. By Birkhoff's theorem [4] this gives a succinct representation of such cuts, and so we also obtain a general algorithm for enumerating this sublattice. Finally, though we prove computing min-cuts satisfying additional constraints is NP-hard in general, we use poset slicing [5], [6] for exact algorithms with constraints not necessarily encoded by lattice-linear predicates) with better complexity than exhaustive search. We also introduce $k$-transition predicates and strong advancement for improved complexity analyses of lattice-linear predicate algorithms in parallel settings, which is of independent interest.

Authors: Robert Streit, Vijay K. Garg

In a capacitated directed graph, it is known that the set of all min-cuts forms a distributive lattice [1], [2]. Here, we describe this lattice as a regular predicate whose forbidden elements can be advanced in constant parallel time after precomputing a max-flow, so as to obtain parallel algorithms for min-cut problems with additional constraints encoded by lattice-linear predicates [3]. Some nice algorithmic applications follow. First, we use these methods to compute the irreducibles of the sublattice of min-cuts satisfying a regular predicate. By Birkhoff's theorem [4] this gives a succinct representation of such cuts, and so we also obtain a general algorithm for enumerating this sublattice. Finally, though we prove computing min-cuts satisfying additional constraints is NP-hard in general, we use poset slicing [5], [6] for exact algorithms with constraints not necessarily encoded by lattice-linear predicates) with better complexity than exhaustive search. We also introduce $k$-transition predicates and strong advancement for improved complexity analyses of lattice-linear predicate algorithms in parallel settings, which is of independent interest.

Graph-based Nearest Neighbors with Dynamic Updates via Random Walks

from arXiv: Data Structures and Algorithms

Authors: Nina Mishra, Yonatan Naamad, Tal Wagner, Lichen Zhang

Approximate nearest neighbor search (ANN) is a common way to retrieve relevant search results, especially now in the context of large language models and retrieval augmented generation. One of the most widely used algorithms for ANN is based on constructing a multi-layer graph over the dataset, called the Hierarchical Navigable Small World (HNSW). While this algorithm supports insertion of new data, it does not support deletion of existing data. Moreover, deletion algorithms described by prior work come at the cost of increased query latency, decreased recall, or prolonged deletion time. In this paper, we propose a new theoretical framework for graph-based ANN based on random walks. We then utilize this framework to analyze a randomized deletion approach that preserves hitting time statistics compared to the graph before deleting the point. We then turn this theoretical framework into a deterministic deletion algorithm, and show that it provides better tradeoff between query latency, recall, deletion time, and memory usage through an extensive collection of experiments.

Authors: Nina Mishra, Yonatan Naamad, Tal Wagner, Lichen Zhang

Approximate nearest neighbor search (ANN) is a common way to retrieve relevant search results, especially now in the context of large language models and retrieval augmented generation. One of the most widely used algorithms for ANN is based on constructing a multi-layer graph over the dataset, called the Hierarchical Navigable Small World (HNSW). While this algorithm supports insertion of new data, it does not support deletion of existing data. Moreover, deletion algorithms described by prior work come at the cost of increased query latency, decreased recall, or prolonged deletion time. In this paper, we propose a new theoretical framework for graph-based ANN based on random walks. We then utilize this framework to analyze a randomized deletion approach that preserves hitting time statistics compared to the graph before deleting the point. We then turn this theoretical framework into a deterministic deletion algorithm, and show that it provides better tradeoff between query latency, recall, deletion time, and memory usage through an extensive collection of experiments.

Fast Rational Search via Stern-Brocot Tree

from arXiv: Data Structures and Algorithms

Authors: Connor Weyers, N. V. Vinodchandran

We revisit the problem of rational search: given an unknown rational number $α= \frac{a}{b} \in (0,1)$ with $b \leq n$, the goal is to identify $α$ using comparison queries of the form ``$β\leq α$?''. The problem has been studied several decades ago and optimal query algorithms are known. We present a new algorithm for rational search based on a compressed traversal of the Stern--Brocot tree, which appeared to have been overlooked in the literature. This approach also naturally extends to two related problems that, to the best of our knowledge, have not been previously addressed: (i) unbounded rational search, where the bound $n$ is unknown, and (ii) computing the best (in a precise sense) rational approximation of an unknown real number using only comparison queries.

Authors: Connor Weyers, N. V. Vinodchandran

We revisit the problem of rational search: given an unknown rational number $α= \frac{a}{b} \in (0,1)$ with $b \leq n$, the goal is to identify $α$ using comparison queries of the form ``$β\leq α$?''. The problem has been studied several decades ago and optimal query algorithms are known. We present a new algorithm for rational search based on a compressed traversal of the Stern--Brocot tree, which appeared to have been overlooked in the literature. This approach also naturally extends to two related problems that, to the best of our knowledge, have not been previously addressed: (i) unbounded rational search, where the bound $n$ is unknown, and (ii) computing the best (in a precise sense) rational approximation of an unknown real number using only comparison queries.

Monday, December 22

Complexity Year in Review

from Computational Complexity

An easy choice for paper of the year, a paper that has nothing to do with randomness, interaction, quantum, circuits or codes. Just a near quadratic improvement in the amount of memory you need to simulate time.

Simulating Time with Square-Root Space by Ryan Williams

Any time \(t(n)\) algorithm can be simulated in space \(O(\sqrt{t(n)\log t(n)})\) greatly improving the \(O(t(n)/\log t(n))\) result from the 70's. Ryan's work makes strong use of last year's space efficient tree evaluation by James Cook and Ian Mertz. More in my February post and a Quanta article which did a better job explaining the importance of the result than I could.

Bill is also excited by the new \(O(m\log^{2/3}n)\) single-sourced shortest path algorithm by Ran Duan, Jiayi Mao, Xiao Mao, Xinkai Shu and Longhui Yinthat that beats out Dijkstra on sparse graphs. 

Last year I wrote

We're heading to a perfect storm for US higher education with the oncoming trains of the new administration, artificial intelligence, fiscal challenges and the demographic cliff. Hang on tight, it's going to be a bumpy ride.

Bumpy is an understatement and we're just starting the ride. Limited immigration, National Science Foundation woes in its 75th anniversary, and a drop in computer science enrollments as AI continues to suck up the atmosphere. Do we buckle down or should we completely rethink our institutions? 

In the spirit of all the AI wrapped content, I asked Claude to put together a full year in review for this blog. This is getting scarily good.

We remember George Foreman, Frank Gehry, Ray Laflamme, Tom Lehrer, Charles Lin, Pradyut Shah and Tom Stoppard. 

We thank our guest posters Eric Allender, Daniel Fernández and Alberto Fraile, Clyde Kruskal and Nick Sovich.

See you all in January!

By Lance Fortnow

An easy choice for paper of the year, a paper that has nothing to do with randomness, interaction, quantum, circuits or codes. Just a near quadratic improvement in the amount of memory you need to simulate time.

Simulating Time with Square-Root Space by Ryan Williams

Any time \(t(n)\) algorithm can be simulated in space \(O(\sqrt{t(n)\log t(n)})\) greatly improving the \(O(t(n)/\log t(n))\) result from the 70's. Ryan's work makes strong use of last year's space efficient tree evaluation by James Cook and Ian Mertz. More in my February post and a Quanta article which did a better job explaining the importance of the result than I could.

Bill is also excited by the new \(O(m\log^{2/3}n)\) single-sourced shortest path algorithm by Ran Duan, Jiayi Mao, Xiao Mao, Xinkai Shu and Longhui Yinthat that beats out Dijkstra on sparse graphs. 

Last year I wrote

We're heading to a perfect storm for US higher education with the oncoming trains of the new administration, artificial intelligence, fiscal challenges and the demographic cliff. Hang on tight, it's going to be a bumpy ride.

Bumpy is an understatement and we're just starting the ride. Limited immigration, National Science Foundation woes in its 75th anniversary, and a drop in computer science enrollments as AI continues to suck up the atmosphere. Do we buckle down or should we completely rethink our institutions? 

In the spirit of all the AI wrapped content, I asked Claude to put together a full year in review for this blog. This is getting scarily good.

We remember George Foreman, Frank GehryRay Laflamme, Tom Lehrer, Charles Lin, Pradyut Shah and Tom Stoppard

We thank our guest posters Eric Allender, Daniel Fernández and Alberto Fraile, Clyde Kruskal and Nick Sovich.

See you all in January!

By Lance Fortnow

On the Complexity of Bipartite Degree Realizability

from arXiv: Computational Complexity

Authors: István Miklós

We study the \emph{Bipartite Degree Realization} (BDR) problem: given a graphic degree sequence $D$, decide whether it admits a realization as a bipartite graph. While bipartite realizability for a fixed vertex partition can be decided in polynomial time via the Gale--Ryser theorem, the computational complexity of BDR without a prescribed partition remains unresolved. We address this question through a parameterized analysis. For constants $0 \le c_1 \le c_2 \le 1$, we define $\mathrm{BDR}_{c_1,c_2}$ as the restriction of BDR to degree sequences of length $n$ whose degrees lie in the interval $[c_1 n, c_2 n]$. Our main result shows that $\mathrm{BDR}_{c_1,c_2}$ is solvable in polynomial time whenever $0 \le c_1 \le c_2 \le \frac{\sqrt{c_1(c_1+4)}-c_1}{2}$, as well as for all $c_1 > \tfrac12$. The proof relies on a reduction to extremal \emph{least balanced degree sequences} and a detailed verification of the critical Gale--Ryser inequalities, combined with a bounded subset-sum formulation. We further show that, assuming the NP-completeness of unrestricted BDR, the problem $\mathrm{BDR}_{c_1,c_2}$ remains NP-complete for all $0 < c_2 < \tfrac12$ and $c_1 < 1 - c_2 - \sqrt{1-2c_2}$. Our results clarify the algorithmic landscape of bipartite degree realization and contribute to the broader study of potentially bipartite graphic degree sequences.

Authors: István Miklós

We study the \emph{Bipartite Degree Realization} (BDR) problem: given a graphic degree sequence $D$, decide whether it admits a realization as a bipartite graph. While bipartite realizability for a fixed vertex partition can be decided in polynomial time via the Gale--Ryser theorem, the computational complexity of BDR without a prescribed partition remains unresolved. We address this question through a parameterized analysis. For constants $0 \le c_1 \le c_2 \le 1$, we define $\mathrm{BDR}_{c_1,c_2}$ as the restriction of BDR to degree sequences of length $n$ whose degrees lie in the interval $[c_1 n, c_2 n]$. Our main result shows that $\mathrm{BDR}_{c_1,c_2}$ is solvable in polynomial time whenever $0 \le c_1 \le c_2 \le \frac{\sqrt{c_1(c_1+4)}-c_1}{2}$, as well as for all $c_1 > \tfrac12$. The proof relies on a reduction to extremal \emph{least balanced degree sequences} and a detailed verification of the critical Gale--Ryser inequalities, combined with a bounded subset-sum formulation. We further show that, assuming the NP-completeness of unrestricted BDR, the problem $\mathrm{BDR}_{c_1,c_2}$ remains NP-complete for all $0 < c_2 < \tfrac12$ and $c_1 < 1 - c_2 - \sqrt{1-2c_2}$. Our results clarify the algorithmic landscape of bipartite degree realization and contribute to the broader study of potentially bipartite graphic degree sequences.

When Symmetry Yields NP-Hardness: Affine ML-SAT on S5 Frames

from arXiv: Computational Complexity

Authors: Andreas Krebs, Arne Meier

Hemaspaandra~et~al.~[JCSS 2010] conjectured that satisfiability for multi-modal logic restricted to the connectives XOR and 1, over frame classes T, S4, and S5, is solvable in polynomial time. We refute this for S5 frames, by proving NP-hardness.

Authors: Andreas Krebs, Arne Meier

Hemaspaandra~et~al.~[JCSS 2010] conjectured that satisfiability for multi-modal logic restricted to the connectives XOR and 1, over frame classes T, S4, and S5, is solvable in polynomial time. We refute this for S5 frames, by proving NP-hardness.

Plane Strong Connectivity Augmentation

from arXiv: Computational Geometry

Authors: Stéphane Bessy, Daniel Gonçalves, Amadeus Reinald, Dimitrios M. Thilikos

We investigate the problem of strong connectivity augmentation within plane oriented graphs. We show that deciding whether a plane oriented graph $D$ can be augmented with (any number of) arcs $X$ such that $D+X$ is strongly connected, but still plane and oriented, is NP-hard. This question becomes trivial within plane digraphs, like most connectivity augmentation problems without a budget constraint. The budgeted version, Plane Strong Connectivity Augmentation (PSCA) considers a plane oriented graph $D$ along with some integer $k$, and asks for an $X$ of size at most $k$ ensuring that $D+X$ is strongly connected, while remaining plane and oriented. Our main result is a fixed-parameter tractable algorithm for PSCA, running in time $2^{O(k)} n^{O(1)}$. The cornerstone of our procedure is a structural result showing that, for any fixed $k$, each face admits a bounded number of partial solutions "dominating" all others. Then, our algorithm for PSCA combines face-wise branching with a Monte-Carlo reduction to the polynomial Minimum Dijoin problem, which we derandomize. To the best of our knowledge, this is the first FPT algorithm for a (hard) connectivity augmentation problem constrained by planarity.

Authors: Stéphane Bessy, Daniel Gonçalves, Amadeus Reinald, Dimitrios M. Thilikos

We investigate the problem of strong connectivity augmentation within plane oriented graphs. We show that deciding whether a plane oriented graph $D$ can be augmented with (any number of) arcs $X$ such that $D+X$ is strongly connected, but still plane and oriented, is NP-hard. This question becomes trivial within plane digraphs, like most connectivity augmentation problems without a budget constraint. The budgeted version, Plane Strong Connectivity Augmentation (PSCA) considers a plane oriented graph $D$ along with some integer $k$, and asks for an $X$ of size at most $k$ ensuring that $D+X$ is strongly connected, while remaining plane and oriented. Our main result is a fixed-parameter tractable algorithm for PSCA, running in time $2^{O(k)} n^{O(1)}$. The cornerstone of our procedure is a structural result showing that, for any fixed $k$, each face admits a bounded number of partial solutions "dominating" all others. Then, our algorithm for PSCA combines face-wise branching with a Monte-Carlo reduction to the polynomial Minimum Dijoin problem, which we derandomize. To the best of our knowledge, this is the first FPT algorithm for a (hard) connectivity augmentation problem constrained by planarity.

Delaunay-Rips filtration: a study and an algorithm

from arXiv: Computational Geometry

Authors: Mattéo Clémot, Julie Digne, Julien Tierny

The Delaunay-Rips filtration is a lighter and faster alternative to the well-known Rips filtration for low-dimensional Euclidean point clouds. Despite these advantages, it has seldom been studied. In this paper, we aim to bridge this gap by providing a thorough theoretical and empirical analysis of this construction. From a theoretical perspective, we show how the persistence diagrams associated with the Delaunay-Rips filtration approximate those obtained with the Rips filtration. Additionally, we describe the instabilities of the Delaunay-Rips persistence diagrams when the input point cloud is perturbed. Finally, we introduce an algorithm that computes persistence diagrams of Delaunay-Rips filtrations in any dimension. We show that our method is faster and has a lower memory footprint than traditional approaches in low dimensions. Our C++ implementation, which comes with Python bindings, is available at github.com/MClemot/GeoPH.

Authors: Mattéo Clémot, Julie Digne, Julien Tierny

The Delaunay-Rips filtration is a lighter and faster alternative to the well-known Rips filtration for low-dimensional Euclidean point clouds. Despite these advantages, it has seldom been studied. In this paper, we aim to bridge this gap by providing a thorough theoretical and empirical analysis of this construction. From a theoretical perspective, we show how the persistence diagrams associated with the Delaunay-Rips filtration approximate those obtained with the Rips filtration. Additionally, we describe the instabilities of the Delaunay-Rips persistence diagrams when the input point cloud is perturbed. Finally, we introduce an algorithm that computes persistence diagrams of Delaunay-Rips filtrations in any dimension. We show that our method is faster and has a lower memory footprint than traditional approaches in low dimensions. Our C++ implementation, which comes with Python bindings, is available at https://github.com/MClemot/GeoPH.

Refining the Complexity Landscape of Speed Scaling: Hardness and Algorithms

from arXiv: Data Structures and Algorithms

Authors: Antonios Antoniadis, Denise Graafsma, Ruben Hoeksma, Maria Vlasiou

We study the computational complexity of scheduling jobs on a single speed-scalable processor with the objective of capturing the trade-off between the (weighted) flow time and the energy consumption. This trade-off has been extensively explored in the literature through a number of problem formulations that differ in the specific job characteristics and the precise objective function. Nevertheless, the computational complexity of four important problem variants has remained unresolved and was explicitly identified as an open question in prior work. In this paper, we settle the complexity of these variants. More specifically, we prove that the problem of minimizing the objective of total (weighted) flow time plus energy is NP-hard for the cases of (i) unit-weight jobs with arbitrary sizes, and (ii)~arbitrary-weight jobs with unit sizes. These results extend to the objective of minimizing the total (weighted) flow time subject to an energy budget and hold even when the schedule is required to adhere to a given priority ordering. In contrast, we show that when a completion-time ordering is provided, the same problem variants become polynomial-time solvable. The latter result highlights the subtle differences between priority and completion orderings for the problem.

Authors: Antonios Antoniadis, Denise Graafsma, Ruben Hoeksma, Maria Vlasiou

We study the computational complexity of scheduling jobs on a single speed-scalable processor with the objective of capturing the trade-off between the (weighted) flow time and the energy consumption. This trade-off has been extensively explored in the literature through a number of problem formulations that differ in the specific job characteristics and the precise objective function. Nevertheless, the computational complexity of four important problem variants has remained unresolved and was explicitly identified as an open question in prior work. In this paper, we settle the complexity of these variants. More specifically, we prove that the problem of minimizing the objective of total (weighted) flow time plus energy is NP-hard for the cases of (i) unit-weight jobs with arbitrary sizes, and (ii)~arbitrary-weight jobs with unit sizes. These results extend to the objective of minimizing the total (weighted) flow time subject to an energy budget and hold even when the schedule is required to adhere to a given priority ordering. In contrast, we show that when a completion-time ordering is provided, the same problem variants become polynomial-time solvable. The latter result highlights the subtle differences between priority and completion orderings for the problem.