The square-root rule for commit delays collapses above this load threshold

A closed-loop analysis shows that the greedy commit policy matches the best-tuned timer within 0.1%, making commit_delay=0 optimal above a computable device-set load threshold.

postgresql, aws, ebs gp3, nvme, database systems, group commit

Stop tuning your group commit timer. Above a device-set load threshold, the parameter-free greedy-pipelined flush policy (flush the instant the device is free) comes within 0.1% of the best oracle-tuned timer. That's from a new paper that models group commit as a closed queueing network - the real world, not the textbook open-loop fantasy.

The textbook says you need an optimal timer: the EOQ square-root rule $T^\star=\sqrt{2F_0/\lambda}$ for Poisson arrivals, or a ski-rental 2-competitive wait-or-flush decision. That's open-loop theory. In actual OLTP, clients are closed-loop: they wait for their commit to complete before issuing the next transaction. The arrival rate is induced by the policy's own latency. Model that correctly, and the greedy-pipelined policy self-clocks to a fixed point. No tuning knob required.
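To make the self-clocking idea concrete, here is a minimal simulation sketch of closed-loop clients driving a greedy flush. It is an illustration under stated assumptions, not the paper's model: the flush time is constant, think times are exponential, and the names `simulate_greedy`, `n_clients`, `flush_time`, and `mean_think` are hypothetical.

```python
import heapq
import random


def simulate_greedy(n_clients, flush_time, mean_think, horizon, seed=0):
    """Closed-loop clients with a greedy flush; returns (commits/sec, mean latency).

    Assumptions (not from the paper): constant flush_time, exponential think
    times, and a device that serves one batched flush at a time.
    """
    if flush_time <= 0:
        raise ValueError("flush_time must be positive")
    rng = random.Random(seed)

    def think():
        return rng.expovariate(1.0 / mean_think) if mean_think > 0 else 0.0

    # Min-heap of the times at which each client issues its next commit.
    arrivals = [think() for _ in range(n_clients)]
    heapq.heapify(arrivals)

    device_free_at = 0.0
    committed = 0
    total_latency = 0.0
    while True:
        # Greedy rule: flush as soon as the device is free and a commit is pending.
        start = max(device_free_at, arrivals[0])
        if start >= horizon:
            break
        batch = []
        while arrivals and arrivals[0] <= start:
            batch.append(heapq.heappop(arrivals))
        done = start + flush_time
        for issued in batch:
            committed += 1
            total_latency += done - issued
            # Closed loop: the client only issues again after its commit returns.
            heapq.heappush(arrivals, done + think())
        device_free_at = done

    if committed == 0:
        return 0.0, 0.0
    return committed / device_free_at, total_latency / committed


if __name__ == "__main__":
    for n in (1, 8, 64):
        rate, latency = simulate_greedy(n, flush_time=0.001, mean_think=0.005, horizon=10.0)
        print(f"clients={n:3d}  commits/s={rate:9.1f}  mean latency={latency * 1e3:.3f} ms")
```

There is no timer anywhere in the loop. Batches grow on their own as more clients pile up behind a busy device, which is the sense in which the arrival rate is induced by the policy's own latency.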

The Device-Set Load Threshold That Makes Tuning Vacuous

The key relationship is the threshold $\lambda^\star = 2/F_0$, the arrival rate at which the square-root timer $T^\star=\sqrt{2F_0/\lambda}$ shrinks to $F_0$ itself. Above this device-set load threshold, the optimal timer collapses onto zero - the greedy policy. Below that threshold, the clean theory applies, but in practice most production databases run above $\lambda^\star$ on modern storage. The paper measures fsync distributions on two AWS storage classes, EBS gp3 and instance NVMe, spanning a 25x range in latency. Both confirm the effect: the threshold is easily exceeded under realistic loads.
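If you want a rough fsync latency figure for your own volume, a small probe like the one below is one way to get it. This is a sketch, not the paper's measurement method: the helper name `measure_fsync_latency`, the 4 KiB payload, and the sample count are all assumptions chosen for illustration.

```python
import os
import statistics
import tempfile
import time


def measure_fsync_latency(directory=None, samples=50, payload=b"x" * 4096):
    """Return per-call fsync latencies in seconds for a temp file in `directory`.

    Hypothetical helper: writes a small payload, then times os.fsync alone.
    """
    latencies = []
    with tempfile.NamedTemporaryFile(dir=directory) as f:
        fd = f.fileno()
        for _ in range(samples):
            os.write(fd, payload)          # dirty the file so fsync has work to do
            t0 = time.perf_counter()
            os.fsync(fd)                   # force the write down to the device
            latencies.append(time.perf_counter() - t0)
    return latencies


if __name__ == "__main__":
    lat = measure_fsync_latency(samples=20)
    print(f"median fsync: {statistics.median(lat) * 1e6:.0f} us")
    print(f"max fsync:    {max(lat) * 1e6:.0f} us")
```

Point `directory` at the volume that holds your WAL, since a temp file on a different device tells you nothing useful. PostgreSQL also ships a `pg_test_fsync` utility that is likely a better probe for a real deployment.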

PostgreSQL Confirms: commit_delay=0 is Competitive

They tested directly on PostgreSQL, the most common open-source database with a commit_delay parameter. Setting commit_delay=0 (the greedy flush) was competitive with any tuned value across their workloads. No need for adaptive policies, no square-root calculus, no ski-rental gymnastics. Just flush when the device is free. The paper's contribution is a characterization that explains why deployed practice already defaults to zero - and why your tuning efforts above a moderate load are wasted.

This characterization gives you a simple decision rule: compute your $\lambda^\star$ from your fsync latency, and if your commit rate exceeds it, stop tuning and go work on something that actually matters.
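The decision rule can be sketched in a few lines. This assumes the threshold is read as $\lambda^\star = 2/F_0$, with $F_0$ the per-flush cost in seconds approximated by your measured fsync latency; that reading, the function names, and the 1 ms example figure are assumptions for illustration, not values from the paper.

```python
import math


def sqrt_rule_timer(flush_cost_s, arrival_rate):
    """Open-loop square-root rule: T* = sqrt(2 * F0 / lambda)."""
    return math.sqrt(2.0 * flush_cost_s / arrival_rate)


def load_threshold(flush_cost_s):
    """Assumed threshold lambda* = 2 / F0, where the square-root timer equals F0."""
    return 2.0 / flush_cost_s


def timer_worth_tuning(commit_rate, flush_cost_s):
    """True only below the threshold; above it, the greedy flush is the answer."""
    return commit_rate < load_threshold(flush_cost_s)


if __name__ == "__main__":
    f0 = 0.001  # illustrative 1 ms fsync latency, not a measured value
    print(f"threshold: {load_threshold(f0):.0f} commits/s")
    for rate in (500, 5000):
        verdict = "tune the timer" if timer_worth_tuning(rate, f0) else "leave commit_delay=0"
        print(f"{rate} commits/s -> {verdict}")
```

Because the threshold scales as $1/F_0$ under this reading, a slow device has a low threshold and a fast one has a high threshold, so the same commit rate can land on opposite sides on two different volumes.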


Source: Group Commit Self-Clocks: Why Tuning Is Unnecessary Above a Device-Set Load Threshold
Domain: arxiv.org
