
RB-PEM Beats Noisy Evolution Strategies by Betting on Depth

A Rao-Blackwellized trick called probabilistic elite membership cuts through ranking noise, delivering consistent gains on COCO bbob-noisy and RL tasks under tight evaluation budgets.


When you have a fixed budget of noisy function evaluations, every wasted sample hurts. The standard play in evolution strategies is to spend evaluations denoising the ranking within each generation, but every re-evaluation spent that way is one the optimizer cannot spend on a later distribution update. A new paper argues the opposite: depth over fidelity, meaning more generations with noisier rankings rather than fewer generations with cleaner ones.
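The trade-off comes down to simple arithmetic. Here is a minimal sketch, with made-up numbers that are not taken from the paper: `budget`, `population`, and `reevals` are hypothetical names for the total evaluation budget, the number of candidates per generation, and the number of times each candidate is evaluated to average out noise.

```python
def generations_available(budget: int, population: int, reevals: int) -> int:
    """Number of full generations a fixed evaluation budget can pay for."""
    if budget < 0 or population <= 0 or reevals <= 0:
        raise ValueError("budget must be >= 0; population and reevals must be > 0")
    return budget // (population * reevals)


# Illustrative numbers only (assumption, not from the paper).
budget, population = 10_000, 20
for reevals in (1, 2, 5, 10):
    gens = generations_available(budget, population, reevals)
    print(f"{reevals:>2} evaluation(s) per candidate -> {gens} generations")
```

Averaging ten evaluations per candidate cuts the number of distribution updates by a factor of ten. Whether the cleaner ranking is worth that depends on the problem, which is the question the paper takes up.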

Probabilistic elite membership (PEM), introduced in an arXiv paper (2606.06555), replaces hard rank-based weights with conditional expected rank weights that integrate over ranking uncertainty. This is a Rao-Blackwellization of the noisy rank-based step: it preserves the conditional mean update while reducing the conditional dispersion of the update. In plain terms, you get a less jittery parameter update for the same evaluation budget.
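To make the idea of expected rank weights concrete, here is a rough sketch rather than the paper's estimator. It assumes Gaussian noise with a known standard deviation and simple truncation weights (1/mu for the best mu candidates), and it averages those hard weights over many plausible rankings by Monte Carlo. The function names, the noise model, and the example values are all assumptions made for illustration.

```python
import numpy as np


def hard_rank_weights(fitness, mu):
    """Truncation weights: 1/mu for the mu lowest fitness values, 0 otherwise."""
    fitness = np.asarray(fitness, dtype=float)
    weights = np.zeros(fitness.shape[0])
    elite = np.argsort(fitness, kind="stable")[:mu]
    weights[elite] = 1.0 / mu
    return weights


def expected_rank_weights(fitness, noise_std, mu, n_draws=2000, seed=0):
    """Average the hard weights over rankings that are plausible given the noise.

    ASSUMPTION: additive Gaussian noise with known std `noise_std`. The paper's
    actual conditional expectation may be computed differently.
    """
    rng = np.random.default_rng(seed)
    fitness = np.asarray(fitness, dtype=float)
    total = np.zeros(fitness.shape[0])
    for _ in range(n_draws):
        plausible = fitness + rng.normal(0.0, noise_std, size=fitness.shape[0])
        total += hard_rank_weights(plausible, mu)
    return total / n_draws


# Illustrative observed fitness values (lower is better); not from the paper.
observed = np.array([1.0, 1.9, 2.0, 5.0])
hard = hard_rank_weights(observed, mu=2)
soft = expected_rank_weights(observed, noise_std=0.5, mu=2)
print("hard:", hard)
print("soft:", soft.round(3))
```

The hard weights treat the second and third candidates as categorically different even though their observed values are nearly tied. The averaged weights split the elite share between them, so a single lucky or unlucky evaluation moves the update less.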

Residual Bootstrapping Makes It Practical

The PEM idea is instantiated via residual bootstrapping (RB-PEM) with capped per-generation overhead. An adaptive probe-and-switch mechanism kicks in for low-noise regimes where the bootstrapping overhead isn't justified. No free lunch, but the switch avoids wasting compute when the noise is small.
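One way to picture the bootstrap and the switch is sketched below. This is a hypothetical illustration, not the authors' algorithm. It assumes a pool of centered noise residuals is already available, resamples them onto the observed fitness values, and averages truncation weights across replicates. A small probe batch runs first, and if it barely disagrees with the hard ranking, the code falls back to hard weights. The names `max_boot`, `probe_boot`, and `switch_tol`, and their default values, are invented for this sketch.

```python
import numpy as np


def hard_weights(fitness, mu):
    """Truncation weights: 1/mu for the mu lowest fitness values, 0 otherwise."""
    fitness = np.asarray(fitness, dtype=float)
    weights = np.zeros(fitness.shape[0])
    weights[np.argsort(fitness, kind="stable")[:mu]] = 1.0 / mu
    return weights


def bootstrap_weights(fitness, residuals, mu, n_boot, rng):
    """Mean of hard weights over residual-bootstrap replicates."""
    fitness = np.asarray(fitness, dtype=float)
    residuals = np.asarray(residuals, dtype=float)
    total = np.zeros(fitness.shape[0])
    for _ in range(n_boot):
        resampled = rng.choice(residuals, size=fitness.shape[0], replace=True)
        total += hard_weights(fitness + resampled, mu)
    return total / n_boot


def probe_and_switch_weights(fitness, residuals, mu, max_boot=256,
                             probe_boot=16, switch_tol=0.05, seed=0):
    """Hypothetical probe-and-switch: skip the full bootstrap when noise is low.

    `max_boot` caps the per-generation overhead. All thresholds are assumptions.
    """
    if not 0 < probe_boot < max_boot:
        raise ValueError("need 0 < probe_boot < max_boot")
    rng = np.random.default_rng(seed)
    base = hard_weights(fitness, mu)
    probe = bootstrap_weights(fitness, residuals, mu, probe_boot, rng)
    # Average share of elite weight that the probe moves off the hard ranking.
    disagreement = 0.5 * np.abs(probe - base).sum()
    if disagreement <= switch_tol:
        return base, "hard"
    rest = max_boot - probe_boot
    full = bootstrap_weights(fitness, residuals, mu, rest, rng)
    return (probe * probe_boot + full * rest) / max_boot, "bootstrap"


# Illustrative inputs only: well-separated versus nearly tied candidates.
print(probe_and_switch_weights([1.0, 2.0, 3.0, 4.0], [-0.01, 0.0, 0.01], mu=2))
print(probe_and_switch_weights([1.0, 1.05, 1.1, 1.15],
                               [-1.0, -0.5, 0.0, 0.5, 1.0], mu=2))
```

In the first call the residuals are tiny compared with the gaps between candidates, so the probe agrees with the hard ranking and the extra replicates are skipped. In the second, the noise swamps the gaps and the capped bootstrap runs in full.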

Results across the COCO bbob-noisy suite, plus external tasks like RL policy search and hyperparameter optimization, show consistent gains specifically in high-misranking, budget-constrained settings. That's exactly where resampling-based approaches bleed samples just to figure out which candidate is actually better.

What This Means for Practitioners

If you're tuning hyperparameters or running policy search with a tight evaluation budget, RB-PEM gives you more effective update steps per unit of evaluation cost. The depth-over-fidelity principle plausibly generalizes beyond evolution strategies: any optimizer that ranks noisy candidates could borrow this conditional expectation trick.

Next time someone says you need more samples to get a clean ranking, ask whether a Rao-Blackwellized estimate would let you spend that budget on an extra generation instead.


Source: Depth over Fidelity in Fixed-Budget Noisy Evolution Strategies
Domain: arxiv.org
