MERGEvolve Escapes Convex Jail to Find Better Merged Models

Existing model merging only explores convex combinations of experts; MERGEvolve uses evolution to search outside that space, achieving competitive multi-task performance without extra training.

Tags: MERGEvolve, model merging, evolution strategies, parameter space exploration, multi-task learning

Model merging techniques have been stuck inside the convex hull of the expert parameters, and MERGEvolve is built to break out of it.

The Convex Trap

Popular model merging methods such as TIES-Merging, DARE, and Fisher Merging are, in the paper's framing, confined to the convex combination space of the source models. You pick coefficients, blend weights, and hope the weighted average beats every single expert. But the high-performance regions worth finding often lie outside that convex hull. The MERGEvolve paper (arXiv:2606.28373) calls this out directly: existing methods fail to explore those regions.
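To make the constraint concrete, here is a minimal NumPy sketch of a convex merge. It is an illustration only, not code from the paper: the function name convex_merge and the toy expert vectors are assumptions made up for this example. Because the coefficients are non-negative and sum to 1, every merged parameter stays between the smallest and largest expert value for that parameter.

```python
import numpy as np

def convex_merge(experts, coeffs):
    """Blend expert parameter vectors with non-negative coefficients summing to 1.

    Hypothetical helper for illustration; not the paper's implementation.
    """
    coeffs = np.asarray(coeffs, dtype=float)
    if np.any(coeffs < 0) or not np.isclose(coeffs.sum(), 1.0):
        raise ValueError("coefficients must be non-negative and sum to 1")
    # Weighted sum over the expert axis: sum_i coeffs[i] * experts[i].
    return np.tensordot(coeffs, np.stack(experts), axes=1)

# Two toy "experts" with three parameters each (made-up values).
expert_a = np.array([0.0, 2.0, -1.0])
expert_b = np.array([1.0, 0.0, 3.0])

merged = convex_merge([expert_a, expert_b], [0.25, 0.75])
print(merged)  # [0.75 0.5  2.  ]
```

No choice of valid coefficients can push the first parameter above 1.0 here, which is the kind of boundary the article is describing.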

MERGEvolve: Merging as Init, Evolution as Search

The fix is surprisingly elegant. Instead of stopping at the merged model, MERGEvolve uses that merged point to initialize an evolution strategy. The expert models act as deterministic sources that build a strong starting point; random noise then drives exploration of the parameter space beyond the convex combination boundary. The framework unifies model merging and evolutionary search: the merging phase supplies a high-quality initialization, and the evolution phase widens the search. According to the paper's theoretical analysis, MERGEvolve can explore regions the convex methods cannot reach.
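The merge-then-evolve idea can be sketched with a very simple evolution strategy. This is a rough illustration under stated assumptions, not the algorithm from the paper: the function evolve_from_merge, the elite-averaging update, the hyperparameters, and the toy fitness function are all invented for this example. The fitness rewards closeness to a target placed outside the segment between the two toy experts, so any improvement has to come from the noise-driven search rather than from interpolation.

```python
import numpy as np

def evolve_from_merge(merged, fitness, sigma=0.1, pop_size=32,
                      elite=8, steps=50, seed=0):
    """Simple elite-averaging evolution strategy seeded at a merged point.

    Hypothetical sketch: MERGEvolve's actual update rule may differ.
    """
    rng = np.random.default_rng(seed)
    mean = np.array(merged, dtype=float)
    best, best_fit = mean.copy(), fitness(mean)
    for _ in range(steps):
        # Random noise around the current mean drives exploration.
        noise = rng.standard_normal((pop_size, mean.size))
        candidates = mean + sigma * noise
        scores = np.array([fitness(c) for c in candidates])
        top = np.argsort(scores)[-elite:]       # indices of the best candidates
        mean = candidates[top].mean(axis=0)     # move toward the elites
        if scores[top[-1]] > best_fit:          # track the best point seen so far
            best, best_fit = candidates[top[-1]].copy(), scores[top[-1]]
    return best, best_fit

# Toy setup (made-up values): the target lies outside the experts' convex hull.
expert_a = np.array([0.0, 2.0, -1.0])
expert_b = np.array([1.0, 0.0, 3.0])
merged = 0.5 * expert_a + 0.5 * expert_b        # merging provides the start
target = np.array([2.0, -1.0, 4.0])

def toy_fitness(theta):
    # Higher is better: negative squared distance to the target.
    return -float(np.sum((theta - target) ** 2))

start_fit = toy_fitness(merged)
best, best_fit = evolve_from_merge(merged, toy_fitness)
print(start_fit, best_fit)
```

In a real setting the fitness would be a validation score of the model built from the candidate parameters, which is far more expensive than this toy function.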

What the Numbers Say

On single-task and multi-task benchmarks, MERGEvolve consistently performs on par with advanced model merging baselines. It is no blowout, but it matches them without extra training data, and with no compute beyond the evolution search itself. Ablation studies back the core insight: without the high-quality initial point from merging, the evolution wanders; with it, the search finds strong multi-task models quickly.

This framework suggests that future model merging won't just interpolate; it will actively explore parameter space for higher-quality multi-task models.


Source: Model Merging to Evolution: Parameter Space Exploration for Expert Models
Domain: arxiv.org
