Constant-Time Query Evaluation That Barely Exceeds Optimal Sequential Work

A recent arXiv paper (2301.08178) shows that relational algebra queries can be evaluated in constant time on a CRCW PRAM with only a small overhead in work compared to the best sequential algorithm. Constant-time evaluation by itself was already possible; the new part is the work bound. For three important query classes, the authors give algorithms whose work is $\mathcal{O}(T^{1+\varepsilon})$ for every $\varepsilon>0$, where $T$ is the running time of an optimal sequential algorithm. That is close to the best one can hope for, since work $T$ would match sequential evaluation exactly.

Why Constant Time on a PRAM Usually Blows Up Work

Naive parallel evaluation of a query on a CRCW PRAM can run in constant time, but only by throwing a polynomial number of processors at the problem. The total work (the processor-time product) is then a large polynomial in the database size, and the result tuples end up scattered across memory rather than stored in a compact array. The paper pins down the obstacles: compacting the output without a sequential bottleneck, and computing aggregates under tight work bounds.
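To make the blow-up concrete, here is a minimal Python sketch that simulates the naive strategy sequentially: one virtual processor per pair of tuples, each writing to its own output cell. The function name `naive_parallel_join` and the sample relations are illustrative assumptions, not taken from the paper.

```python
def naive_parallel_join(R, S):
    """Simulate a naive constant-time join of R(a, b) and S(b, c).

    One virtual processor is assigned to every pair (i, j). Each processor
    does O(1) work and writes only to its own cell, so on a PRAM all of them
    could run in a single parallel step. The loops below merely simulate it.
    """
    cells = [None] * (len(R) * len(S))   # one output cell per processor
    for i, (a, b) in enumerate(R):
        for j, (b2, c) in enumerate(S):
            if b == b2:
                cells[i * len(S) + j] = (a, b, c)
    work = len(R) * len(S)               # number of processors times O(1) steps
    return cells, work


# Hypothetical sample data: only two pairs of tuples actually join.
R = [(1, 10), (2, 20), (3, 30), (4, 40)]
S = [(10, "x"), (50, "y"), (60, "z"), (40, "w")]
cells, work = naive_parallel_join(R, S)
result = [t for t in cells if t is not None]
```

In this run the two result tuples sit at positions 0 and 15 of a 16-cell array. The work is quadratic in the input size even though the output is tiny, and the output is buried among empty cells.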

Three Query Classes That Bend the Rules

The authors present algorithms for three settings where efficient sequential evaluation already exists: acyclic queries, semijoin algebra queries, and join queries within the worst-case optimal framework. For each class, they show how to parallelize the evaluation to constant time while keeping work close to the sequential baseline. The key tools are the approximate prefix sums and compaction algorithms of Goldberg and Zwick (1995), which trade a little precision for a high degree of parallelism.
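The link between prefix sums and compaction can be sketched in a few lines of Python. This is an illustration of the general idea only, under a loud assumption: it uses exact prefix sums computed sequentially, not the approximate algorithms of Goldberg and Zwick, and the name `compact` is hypothetical.

```python
from itertools import accumulate


def compact(cells):
    """Move the non-empty entries of `cells` to the front of a dense array.

    Step 1: flag each cell (1 if occupied, 0 if empty).
    Step 2: prefix sums over the flags give each occupied cell its target slot.
    Step 3: every occupied cell writes itself to that slot; these writes are
            independent of each other, so a PRAM could do them in one step.
    NOTE: step 2 uses exact, sequential prefix sums for illustration only.
    """
    flags = [0 if c is None else 1 for c in cells]
    offsets = list(accumulate(flags))            # inclusive prefix sums
    out = [None] * (offsets[-1] if offsets else 0)
    for i, c in enumerate(cells):                # simulated parallel writes
        if flags[i]:
            out[offsets[i] - 1] = c
    return out


scattered = [None, "t1", None, None, "t2", "t3", None]
dense = compact(scattered)
```

Once the target slots are known, the final write step is trivially parallel, so the hard part is the prefix-sum step. That is the step where the paper relies on approximate algorithms instead of the exact computation used here.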

The Cost: $\mathcal{O}(T^{1+\varepsilon})$, Nearly Linear in $T$

The catch is that one of two mild assumptions must hold: either the data values are numbers of polynomial size in the database size, or the relations are suitably sorted. Under either condition, the algorithms are what the authors call weakly work-efficient: they run in constant time, and for every $\varepsilon>0$ their work can be brought down to $\mathcal{O}(T^{1+\varepsilon})$, where $T$ is the time of an optimal sequential algorithm. The total work therefore exceeds the sequential cost by a factor of about $T^{\varepsilon}$, which is still polynomial in $T$ but with an exponent that can be made arbitrarily small. With $\varepsilon = 0.01$, for example, the bound is $T^{1.01}$: a multiplicative overhead of $T^{0.01}$ up to constant factors, not an additive one.
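To get a feel for how small the factor $T^{\varepsilon}$ is, the following Python sketch evaluates it for a hypothetical sequential cost of $T = 10^9$ steps. The helper `overhead_factor` is an illustrative assumption, and it ignores the constant factors hidden in the $\mathcal{O}$-notation, which may themselves depend on $\varepsilon$.

```python
def overhead_factor(T, eps):
    """Ratio between a work bound of T**(1 + eps) and sequential time T,
    ignoring the constant factors hidden in the O-notation."""
    return T ** eps


T = 10**9  # hypothetical number of sequential steps
factors = {eps: overhead_factor(T, eps) for eps in (0.5, 0.1, 0.01)}
for eps, factor in factors.items():
    print(f"eps = {eps}: work is about {factor:.2f} times T")
```

For this $T$, the factor drops from roughly 31,600 at $\varepsilon = 0.5$ to roughly 7.9 at $\varepsilon = 0.1$ and roughly 1.23 at $\varepsilon = 0.01$.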

If these algorithms survive the transition from PRAM theory to real shared-memory or GPU architectures, database engineers may gain a roadmap for scaling query evaluation: constant parallel time with total work close to the sequential optimum.

Source: Work-Efficient Query Evaluation in Constant Time with PRAMs
Domain: arxiv.org
