
ZeroFS turns S3 into a 1.6 µs POSIX filesystem without the latency tax

Warm-cache reads take 1.6 microseconds; raw S3 round trips take 50–300 ms. ZeroFS provides POSIX semantics over NFS, 9P, or NBD, with encryption and compression on every write.

Tags: zerofs, s3, log structured filesystem, posix, nfs, jepsen

A warm random read through ZeroFS takes 1.6 microseconds; a direct S3 round trip costs 50–300 milliseconds. ZeroFS closes that gap with a local cache, log-structured writes, and a single userspace daemon that exports POSIX semantics over NFS and 9P, and block devices over NBD.
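To put the two latency figures side by side, here is a small arithmetic sketch. The constants come from the article; the `blended_latency_ns` helper and its hit-ratio model (every miss pays one full S3 round trip) are illustrative assumptions, not measurements or ZeroFS internals.

```python
# Latencies quoted in the article, as integer nanoseconds to avoid float rounding.
WARM_READ_NS = 1_600          # 1.6 µs warm-cache random read
S3_MIN_NS = 50_000_000        # 50 ms S3 round trip
S3_MAX_NS = 300_000_000       # 300 ms S3 round trip


def speedup(s3_ns: int, warm_ns: int = WARM_READ_NS) -> int:
    """How many warm reads fit into one S3 round trip."""
    return s3_ns // warm_ns


def blended_latency_ns(hit_ratio: float, miss_ns: int,
                       warm_ns: int = WARM_READ_NS) -> float:
    """Expected read latency for a hypothetical cache hit ratio.

    Assumption: a miss costs exactly one S3 round trip.
    """
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    return hit_ratio * warm_ns + (1.0 - hit_ratio) * miss_ns


if __name__ == "__main__":
    print(speedup(S3_MIN_NS), speedup(S3_MAX_NS))
```

Under these figures a warm read is between 31,250 and 187,500 times faster than a raw round trip, so the effective latency of a workload depends almost entirely on how often it misses the cache.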

16 EiB, 1.6 µs, and a Log-Structured Engine

ZeroFS packs file data into 32 KiB extents inside immutable segment objects. Writes go to S3 as new objects; compaction reclaims deleted space. Compression (zstd or lz4) and encryption (XChaCha20-Poly1305, with a key derived via Argon2id) happen before anything leaves your machine. There is no unencrypted mode.
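The fixed 32 KiB extent size means any byte range in a file maps onto a run of extents by simple division. The sketch below shows only that arithmetic; the function name and the tuple layout are hypothetical, and the source does not describe how ZeroFS actually indexes extents inside segments.

```python
EXTENT_SIZE = 32 * 1024  # 32 KiB, the extent size quoted in the article


def extents_for_range(offset: int, length: int) -> list[tuple[int, int, int]]:
    """Split a byte range into (extent_index, offset_in_extent, byte_count) pieces."""
    if offset < 0 or length < 0:
        raise ValueError("offset and length must be non-negative")
    pieces = []
    end = offset + length
    while offset < end:
        index, within = divmod(offset, EXTENT_SIZE)
        # Take bytes up to the end of this extent or the end of the range.
        count = min(EXTENT_SIZE - within, end - offset)
        pieces.append((index, within, count))
        offset += count
    return pieces
```

For example, an 8-byte read that starts 4 bytes before an extent boundary touches two extents, 4 bytes from each.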

Mean small-write latency is 0.83 ms. Maximum filesystem size: 16 EiB, courtesy of 64-bit inode and size fields. Checkpoints snapshot the filesystem at any point; read replicas pick up the writer’s changes automatically and return EROFS on writes.
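The 16 EiB ceiling follows directly from the width of a 64-bit size field, as this short check shows. The helper name is hypothetical; the only input taken from the article is the 64-bit field width.

```python
EIB = 2 ** 60  # bytes in one exbibyte


def addressable_eib(bits: int) -> int:
    """Whole exbibytes covered by the 2**bits byte offsets of a `bits`-wide field."""
    return (2 ** bits) // EIB


if __name__ == "__main__":
    print(addressable_eib(64))
```

Strictly, the largest value an unsigned 64-bit field can hold is 2^64 - 1, so the limit is one byte short of 16 EiB.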

Verified by 8,662 POSIX Tests and Jepsen Fault Injection

CI runs pjdfstest (8,662 tests) on every change over NFS, 9P, and the FUSE client, plus xfstests, the same suite used to test ext4 and XFS. Kernel builds run in parallel on live mounts, where make -j$(nproc) stresses concurrent writes. stress-ng and Jepsen’s local-fs suite verify operation histories against a reference model, including crash recovery mid-fsync.
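Checking a history against a reference model can be illustrated with a deliberately tiny example. This is not Jepsen's checker: it assumes a strictly sequential history, whole-file writes, and a hypothetical `Op` record, and it uses a plain dict as the model.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Op:
    kind: str                          # "write", "read", or "unlink"
    path: str
    data: Optional[bytes] = None       # payload for writes
    observed: Optional[bytes] = None   # what the filesystem returned for reads


def check_history(history: list[Op]) -> list[str]:
    """Replay a sequential history against an in-memory model and list mismatches."""
    model: dict[str, bytes] = {}
    errors = []
    for i, op in enumerate(history):
        if op.kind == "write":
            model[op.path] = op.data or b""
        elif op.kind == "unlink":
            model.pop(op.path, None)
        elif op.kind == "read":
            expected = model.get(op.path)  # None means the file should not exist
            if op.observed != expected:
                errors.append(
                    f"op {i}: read {op.path!r} saw {op.observed!r}, "
                    f"model expected {expected!r}"
                )
        else:
            raise ValueError(f"unknown op kind: {op.kind}")
    return errors
```

A real checker also has to handle concurrent operations and crashes, where several outcomes can be legal; the sketch only shows the basic idea of comparing observed results with what a trusted model predicts.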

Failover testing drives a leader/standby pair over MinIO, kills and restarts leaders, and confirms every acknowledged write survives. ZFS itself is the end-to-end test: CI builds a ZFS pool on ZeroFS block devices, extracts the Linux kernel source onto it, and scrubs the pool with zero checksum errors.
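The durability property being tested, that every acknowledged write survives a failover, reduces to a simple comparison. The sketch below is an illustration under assumptions: the function name and the path-to-bytes dictionaries are hypothetical and do not reflect the project's actual test harness.

```python
def lost_writes(acknowledged: dict[str, bytes],
                recovered: dict[str, bytes]) -> list[str]:
    """Return paths whose acknowledged contents are missing or changed after failover."""
    return sorted(
        path for path, data in acknowledged.items()
        if recovered.get(path) != data
    )
```

Writes that were never acknowledged may or may not appear in the recovered state without violating the property, which is why the check iterates only over the acknowledged set.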

Geo-Distributed ZFS Mirror Over Three S3 Regions

ZeroFS exposes each S3 region as an NBD block device. A ZFS mirror across us-east, eu-west, and ap-southeast is configured like any other pool. If a region goes dark, the pool degrades and data stays available from the other two. There are no custom kernel modules and no FUSE overhead for block devices, just a single daemon per region.
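Because each region appears as an ordinary block device, the pool is created with the standard `zpool create <pool> mirror <devices...>` form. The sketch below only builds that command line and does not run it; the pool name `geo` and the `/dev/nbd0` through `/dev/nbd2` device paths are assumptions for illustration, since the article does not say which devices the regions attach to.

```python
import shlex

# Hypothetical mapping of region to the local NBD device it is attached as.
REGION_DEVICES = {
    "us-east": "/dev/nbd0",
    "eu-west": "/dev/nbd1",
    "ap-southeast": "/dev/nbd2",
}


def zpool_mirror_argv(pool: str, devices: list[str]) -> list[str]:
    """Build the argv for creating a single mirror vdev from the given devices."""
    if len(devices) < 2:
        raise ValueError("a mirror needs at least two devices")
    return ["zpool", "create", pool, "mirror", *devices]


if __name__ == "__main__":
    argv = zpool_mirror_argv("geo", list(REGION_DEVICES.values()))
    print(shlex.join(argv))
```

Each device in a mirror vdev holds a full copy of the data, which is why the pool can keep serving reads after one region becomes unreachable.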

ZeroFS makes S3 a viable primary store for workloads that previously required local or SAN storage.


Source: Show HN: ZeroFS - A log-structured filesystem for S3
Domain: zerofs.net
