
Three Dimensions Suffice for Any Tree Hierarchy Embedding

A new proof shows any directed tree can be embedded into just 3 dimensions for exact reachability, shattering prior bounds that scaled with tree depth. For general DAGs, dimension drops from O(n) to O(t log n) when...


3 dimensions. Not 3000. Not O(depth). Just three. That's what a new paper on arXiv (2606.18520) proves: any directed tree, no matter how deep or wide, can be mapped into a constant 3-dimensional embedding that exactly preserves ancestor-descendant reachability. Previous bounds for deep hierarchies required dimension proportional to the number of nodes. That's now dead for trees.

Trees Are Easy: Constant Dimension, Zero Depth Dependence

The trick is to exploit the structural simplicity of trees compared to general DAGs. Earlier work by You et al. showed that reachability embeddings exist when the number of descendants is small, but the bounds degraded badly for deep hierarchies. The new result shows that any directed tree has a reachability embedding in just 3 dimensions, with no dependence on size or depth. That's a concrete theoretical guarantee with practical implications for hierarchical retrieval systems.
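To get a feel for why trees admit constant-size labels at all, here is a minimal sketch of classical DFS interval labeling. This is not the paper's 3-dimensional construction, which this summary does not spell out. It is a swapped-in textbook technique that stores two integers per node and answers ancestor queries by comparison. The function names `interval_labels` and `is_ancestor` and the toy tree are illustrative assumptions.

```python
from typing import Dict, Hashable, List, Tuple

Label = Tuple[int, int]


def interval_labels(
    children: Dict[Hashable, List[Hashable]], root: Hashable
) -> Dict[Hashable, Label]:
    """Assign each node an (entry, exit) pair from an iterative DFS.

    NOTE: classical interval labeling, used here only as an illustration;
    it is not the embedding constructed in the paper.
    """
    labels: Dict[Hashable, Label] = {}
    entry: Dict[Hashable, int] = {}
    clock = 0
    stack = [(root, False)]  # (node, True) means the subtree is finished
    while stack:
        node, finished = stack.pop()
        if finished:
            labels[node] = (entry[node], clock)
            clock += 1
            continue
        entry[node] = clock
        clock += 1
        stack.append((node, True))
        for child in reversed(children.get(node, [])):
            stack.append((child, False))
    return labels


def is_ancestor(labels: Dict[Hashable, Label], u: Hashable, v: Hashable) -> bool:
    """True if u is an ancestor of v (a node counts as its own ancestor)."""
    u_in, u_out = labels[u]
    v_in, v_out = labels[v]
    # v's interval must nest inside u's interval.
    return u_in <= v_in and v_out <= u_out


# Hypothetical toy hierarchy.
tree = {"root": ["a", "b"], "a": ["c", "d"], "b": ["e"]}
labels = interval_labels(tree, "root")
```

The label size stays at two numbers per node no matter how deep the tree gets, which is the same flavor of depth independence the paper proves for its geometric embeddings.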

General DAGs: Dimension Scales with Treewidth

For general DAGs, the authors provide upper and lower bounds parameterized by treewidth $t$, a common structural sparsity measure. They construct embeddings of dimension $O(t \log n)$ and prove a lower bound of $\Omega(t/\log(n/t))$. For graphs where $t$ is small (e.g., $t=O(\log n)$), the dimension becomes polylogarithmic instead of the $\Omega(n)$ required for arbitrary DAGs. The contrast between these bounds and the $\Omega(n)$ worst case for general DAGs shows that structural parameters, not just node count, determine embedding compactness.
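A quick way to see the scale of these bounds is to plug numbers into the expressions. The sketch below evaluates only the shapes $t \log n$ and $t/\log(n/t)$. The hidden constants and the base of the logarithm are not given here, so base 2 and a constant of 1 are assumptions, and the values are orders of magnitude, not actual dimensions. The function names are illustrative.

```python
import math


def upper_bound_shape(n: int, t: int) -> float:
    """Shape of the O(t log n) upper bound (constant 1, log base 2 assumed)."""
    return t * math.log2(n)


def lower_bound_shape(n: int, t: int) -> float:
    """Shape of the Omega(t / log(n/t)) lower bound; requires 0 < t < n."""
    return t / math.log2(n / t)


# Compare against the Omega(n) worst case for arbitrary DAGs.
for n, t in [(1024, 4), (2**20, 20), (2**30, 30)]:
    print(
        f"n={n:>12,} t={t:>3} "
        f"upper~{upper_bound_shape(n, t):8.1f} "
        f"lower~{lower_bound_shape(n, t):6.2f} "
        f"worst-case~{n:,}"
    )
```

With $n = 1024$ and $t = 4$, the upper-bound shape evaluates to 40 against a worst case on the order of 1024, and the gap widens quickly as $n$ grows while $t$ stays logarithmic.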

Real Data Confirms the Theory

The paper doesn't stop at asymptotics. Experiments on real-world hierarchical datasets show that the new embeddings need much smaller dimensions than prior theoretical constructions, especially in high-recall regimes. That means you can build retrieval systems that scale to deep hierarchies without blowing up embedding storage or compute. What makes this work is the clean mapping between structural graph parameters and geometric dimension, an approach that could generalize to other retrieval tasks beyond hierarchies.


Source: Compact Geometric Representations of Hierarchies
Domain: arxiv.org
