
Prefetch Trick Nails p99 0ms Autocomplete for 240M Domains

ruurtjan.com · @fast_lynx · 3 hours ago · Systems Engineering · 5 comments

Wirewiki's autocomplete API responds in under 2ms at p50, and client-side prefetching hides the rest, so results appear before the user lifts a finger from the key.

Tags: wirewiki, ruurtjan pul, autocomplete, domain names, latency optimization, trie

99% of autocomplete queries on Wirewiki return before the user finishes pressing the key - effectively 0 milliseconds of perceived latency. That's the headline number from Ruurtjan Pul's deep dive on building an autocomplete for 240 million domain names.

The Prefetch Trick That Hides API Latency

Every keystroke on Wirewiki triggers a prefetch on keyDown and a render on keyUp. The API response arrives in the gap between those two events, typically within 121ms at p99 for a reasonably fast typist. A 60Hz display paints every 16.7ms, so the budget is tight. Because the pressed character is already known on keyDown, the client can request results for the current prefix plus that character, which buys just enough time.

Pul measured his own typing: 121ms p99 from one key press to the next release. That means if the API returns before that window ends, the results are ready before the user sees the next frame. The asterisk in "p99 0ms*" means 0ms of wall-clock wait after keyUp - the response is already cached.
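The keyDown/keyUp split can be sketched without a browser. The following is a minimal model, not Wirewiki's code: the class name `PrefetchingAutocomplete`, its methods, and the injected `fetch` callable are all hypothetical, and a thread pool stands in for the browser's asynchronous request.

```python
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Callable, Dict, List


class PrefetchingAutocomplete:
    """Starts the lookup on key down and renders from the cache on key up."""

    def __init__(self, fetch: Callable[[str], List[str]]) -> None:
        self._fetch = fetch                      # hypothetical API client
        self._pool = ThreadPoolExecutor(max_workers=4)
        self._inflight: Dict[str, Future] = {}   # prefix -> pending or finished lookup
        self.text = ""                           # what the input box currently shows
        self._pending = ""                       # text once the pressed key is released

    def _request(self, prefix: str) -> Future:
        # Each prefix is requested at most once; later calls reuse the same future.
        if prefix not in self._inflight:
            self._inflight[prefix] = self._pool.submit(self._fetch, prefix)
        return self._inflight[prefix]

    def on_key_down(self, char: str) -> None:
        # The pressed character is known here, so the next prefix can be requested now.
        self._pending = self.text + char
        self._request(self._pending)

    def on_key_up(self) -> List[str]:
        # The response has usually arrived by now; result() only blocks if it has not.
        self.text = self._pending
        return self._request(self.text).result()
```

When the lookup finishes inside the key-hold window, `on_key_up` returns without waiting, which is the sense in which the perceived latency is 0ms.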

Two Structures for 240M Names

The autocomplete covers the Tranco top 1 million domains (most popular) plus every domain from CZDS gTLD zone files - about 240 million total. Two data structures handle that scale.

An in-memory character trie stores precomputed top-8 suggestions for every prefix. Lookup is O(length of the query), walking a few pointers. For the long tail, Pul built an SSD-backed memory-mapped block index: 240M domains sorted, delta-compressed into fixed-size blocks, with a 27MB in-memory directory. A binary search over the directory finds the right block, then a linear scan covers the block's 256 names. Hot pages stay cached by the OS. Neither lookup is strictly O(1), but both are tightly bounded: the trie walk by the query length, and the block index by a logarithmic search of the directory plus a fixed-size scan.
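Both structures can be sketched in a few dozen lines. This is an illustration under assumptions, not Pul's implementation: the function names are hypothetical, the trie is a nested dict, "delta-compressed" is modeled as front coding (shared-prefix length plus differing tail), and the blocks live in Python lists rather than a memory-mapped file.

```python
from bisect import bisect_right
from typing import Dict, List, Tuple

TOP_K = 8          # suggestions stored per prefix, as in the article
BLOCK_SIZE = 256   # names per block, as in the article


def build_trie(ranked: List[str]) -> Dict:
    """Nested-dict trie; each node keeps the first TOP_K names that pass through it."""
    root: Dict = {"top": [], "next": {}}
    for name in ranked:                      # input is ordered most popular first
        node = root
        for ch in name:
            node = node["next"].setdefault(ch, {"top": [], "next": {}})
            if len(node["top"]) < TOP_K:
                node["top"].append(name)
    return root


def trie_lookup(root: Dict, prefix: str) -> List[str]:
    node = root
    for ch in prefix:                        # one hop per typed character
        node = node["next"].get(ch)
        if node is None:
            return []
    return node["top"]


def build_blocks(sorted_names: List[str]) -> Tuple[List[str], List[List[Tuple[int, str]]]]:
    directory, blocks = [], []
    for start in range(0, len(sorted_names), BLOCK_SIZE):
        chunk = sorted_names[start:start + BLOCK_SIZE]
        directory.append(chunk[0])           # first name of each block stays in memory
        encoded, prev = [], ""
        for name in chunk:
            shared = 0
            while shared < min(len(prev), len(name)) and prev[shared] == name[shared]:
                shared += 1
            encoded.append((shared, name[shared:]))   # store only the differing tail
            prev = name
        blocks.append(encoded)
    return directory, blocks


def block_lookup(directory, blocks, prefix: str, limit: int = TOP_K) -> List[str]:
    # Binary search for the last block whose first name is <= prefix.
    i = max(bisect_right(directory, prefix) - 1, 0)
    found: List[str] = []
    while i < len(blocks) and len(found) < limit:
        prev = ""
        for shared, tail in blocks[i]:       # linear scan, decoding as it goes
            prev = prev[:shared] + tail
            if prev.startswith(prefix):
                found.append(prev)
                if len(found) == limit:
                    break
            elif prev > prefix:
                return found                 # sorted order: no later name can match
        i += 1
    return found
```

The trie answers popular prefixes from memory, and the block index only needs its directory resident, which is what keeps the in-memory footprint small relative to the full name list.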

Under 2ms at the API, 0ms at the UI

Pul stress-tested the production server with an LLM generating 720k keystroke queries from 60k simulated domain names, fired open-loop. The API alone responded in 2ms at p50 and 15ms at p99 even at 1,600 requests per second. Through Nginx and Cloudflare, end-to-end p99 stayed under 15ms at moderate load.
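The shape of such a test can be sketched as follows. The helpers are hypothetical and send no network traffic; they only illustrate how names expand into per-keystroke queries (720k queries from 60k names works out to 12 keystrokes per name on average), what "open-loop" means for the send schedule, and how p50 and p99 are read off a list of latency samples using the nearest-rank definition.

```python
import math
from typing import List


def keystroke_queries(name: str) -> List[str]:
    """One query per typed character: 'abc' -> ['a', 'ab', 'abc']."""
    return [name[:i] for i in range(1, len(name) + 1)]


def open_loop_send_times(count: int, rate_per_s: float) -> List[float]:
    """Send times fixed in advance, so a slow response never delays the next request."""
    return [i / rate_per_s for i in range(count)]


def percentile(samples: List[float], p: float) -> float:
    """Nearest-rank percentile: smallest sample with at least p% of samples at or below it."""
    ordered = sorted(samples)
    rank = max(math.ceil(p * len(ordered) / 100), 1)
    return ordered[rank - 1]
```

An open-loop schedule matters because a closed-loop client that waits for each response before sending the next one slows down with the server and hides queueing delay from the measured percentiles.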

At that speed, the API finishes long before the next key press. The prefetch cache then serves the suggestion instantly on keyUp. Wirewiki's approach shows that with careful timing and data structure design, instant suggestion is achievable at internet scale. The full methodology and a live demo are up for anyone to poke at.


Source: p99 0ms* autocomplete for 240 million domain names
Domain: ruurtjan.com
