Amazon Cognito Ships Multi-Region Replication and Customer-Managed Keys on a New Storage Layer

aws.amazon.com · @threat_watch · 3 hours ago · Systems Engineering · 2 comments

AWS quietly migrated hundreds of millions of user profiles, with zero downtime, to a purpose-built identity storage layer, enabling multi-Region replication and customer-managed keys (CMKs).

Tags: aws, amazon cognito, identity management, multi region replication, zero downtime migration, encryption

Hundreds of millions of user profiles migrated from AWS Cloud Directory to a purpose-built identity store without a single minute of downtime. That's the operational headline from Amazon Cognito's new infrastructure, and it's the kind of migration most teams would lose sleep over.

What the New Architecture Unlocks

Three capabilities ship with the new storage layer: high-throughput performance (tens of millions of users per user pool, thousands of transactions per second), customer-managed keys via AWS KMS for encryption at rest, and multi-Region replication that synchronizes passwords, attributes, and configurations across Regions. This isn't an incremental improvement; it's a foundation rewrite that changes what Cognito can do.

Three design tenets anchor the architecture: identity-first storage that understands identities rather than generic blobs, backward compatibility at all costs, and a deliberate avoidance of one-way doors. Georgi Baghdasaryan, the principal engineer on the project, split the old single-data-store architecture into independently deployable domains. No more multi-service coordination to tweak a schema.

How They Pulled Off the Zero-Downtime Migration

Shadow mode validation ran customer API requests through both old and new infrastructure simultaneously, comparing response structures and status codes. Sensitive data never leaked in plaintext during comparison. They accounted for known variances—timestamps differ slightly between systems—so only meaningful discrepancies triggered alerts.
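A minimal Python sketch of that comparison step, under stated assumptions: the field names in KNOWN_VARIANT_FIELDS and the response shape (a dict with "status" and "body") are hypothetical, and comparing leaf types instead of leaf values is just one way to keep sensitive data out of the diff. The article does not say how the Cognito team did it.

```python
from typing import Any

# Hypothetical fields whose values are expected to differ between systems.
KNOWN_VARIANT_FIELDS = {"LastModifiedDate", "RequestId"}


def normalize(body: Any) -> Any:
    """Reduce a response body to its structure, dropping known-variant fields."""
    if isinstance(body, dict):
        return {
            key: normalize(value)
            for key, value in body.items()
            if key not in KNOWN_VARIANT_FIELDS
        }
    if isinstance(body, list):
        return [normalize(item) for item in body]
    # Keep only the leaf type, so raw values never reach logs or diffs.
    return type(body).__name__


def compare_responses(legacy: dict, candidate: dict) -> list[str]:
    """Return the meaningful discrepancies between two responses."""
    problems = []
    if legacy["status"] != candidate["status"]:
        problems.append(f"status: {legacy['status']} != {candidate['status']}")
    if normalize(legacy["body"]) != normalize(candidate["body"]):
        problems.append("body structure differs")
    return problems


legacy = {"status": 200, "body": {"Username": "alice", "LastModifiedDate": 1700000000.1}}
candidate = {"status": 200, "body": {"Username": "alice", "LastModifiedDate": 1700000000.4}}
print(compare_responses(legacy, candidate))  # []
```

The timestamps differ, but no alert fires because that field is on the known-variance list. A changed status code or a missing attribute would still be reported.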

Data backfill ran alongside live traffic with dual-write capture. Any write that failed against the new store still succeeded against the legacy system, so customers never saw an error. Anti-entropy scans continuously compared records across both stores (user attributes, credential hashes, group memberships), catching edge cases that shadow mode and dual-writes alone missed. The legacy system served as source of truth when reconciliation was needed.
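The interplay between dual-writes and anti-entropy scans can be sketched in a few lines of Python. This is an illustration, not the Cognito implementation: the in-memory dicts stand in for the two stores, and DualWriter, fail_new, and anti_entropy_scan are hypothetical names. Writing to the legacy store first is also an assumption.

```python
import copy


class DualWriter:
    """Writes to the legacy store; the new store is best-effort."""

    def __init__(self, legacy: dict, new: dict, fail_new: bool = False):
        self.legacy = legacy
        self.new = new
        self.fail_new = fail_new  # hook to simulate a new-store failure

    def _write_new(self, user_id: str, profile: dict) -> None:
        if self.fail_new:
            raise ConnectionError("new store unavailable")
        self.new[user_id] = copy.deepcopy(profile)

    def put(self, user_id: str, profile: dict) -> bool:
        # The legacy write decides whether the caller sees success.
        self.legacy[user_id] = copy.deepcopy(profile)
        try:
            self._write_new(user_id, profile)
        except ConnectionError:
            pass  # hidden from the caller; the scan below repairs it later
        return True


def anti_entropy_scan(legacy: dict, new: dict) -> list[str]:
    """Compare both stores; repair the new one from the legacy source of truth."""
    repaired = []
    for user_id in sorted(legacy.keys() | new.keys()):
        if legacy.get(user_id) != new.get(user_id):
            if user_id in legacy:
                new[user_id] = copy.deepcopy(legacy[user_id])
            else:
                del new[user_id]
            repaired.append(user_id)
    return repaired


legacy, new = {}, {}
writer = DualWriter(legacy, new)
writer.put("u1", {"email": "a@example.com", "groups": ["admins"]})
writer.fail_new = True  # the next write to the new store fails silently
writer.put("u2", {"email": "b@example.com", "groups": []})
print(anti_entropy_scan(legacy, new))  # ['u2']
print(legacy == new)  # True
```

The failed write to the new store is invisible to the caller, and the scan later finds and repairs the missing record.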

Incremental rollout with immediate rollback capability meant every user pool could be reverted at any point. If a rollback was triggered, an orchestrator replayed entries in timestamp order to sync profiles back to the legacy system. That's defense in depth for a migration.
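A rough Python sketch of timestamp-ordered replay, assuming a change log of per-attribute entries. The ChangeEntry shape and the replay_to_legacy helper are hypothetical, since the article does not describe the log format or the orchestrator's internals.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ChangeEntry:
    timestamp: float
    user_id: str
    attribute: str
    value: Optional[str]  # None means the attribute was deleted


def replay_to_legacy(entries: list[ChangeEntry], legacy: dict) -> None:
    """Apply logged changes to the legacy store, oldest first."""
    # sorted() is stable, so entries with equal timestamps keep arrival order.
    for entry in sorted(entries, key=lambda e: e.timestamp):
        profile = legacy.setdefault(entry.user_id, {})
        if entry.value is None:
            profile.pop(entry.attribute, None)
        else:
            profile[entry.attribute] = entry.value


legacy = {"u1": {"email": "old@example.com"}}
log = [
    ChangeEntry(3.0, "u1", "email", "newest@example.com"),
    ChangeEntry(1.0, "u1", "email", "newer@example.com"),
    ChangeEntry(2.0, "u1", "nickname", "al"),
]
replay_to_legacy(log, legacy)
print(legacy)  # {'u1': {'email': 'newest@example.com', 'nickname': 'al'}}
```

Ordering by timestamp is what makes the last write win. Replaying the log in arrival order here would leave the stale "newer@example.com" address in place.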

Why Behavioral Preservation Is Harder Than Functional Testing

The team found that functional tests validate intended behaviors, but customers build applications around specific API timing and consistency windows. Concurrent writes to the same user could resolve to slightly different final states between old and new systems even when all writes succeed. A customer writing an attribute and immediately reading it could hit a stale read due to subtle timing differences in update visibility.
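The write-then-read case is easy to reproduce with a toy model. In this Python sketch, LaggedStore is a hypothetical store in which a write becomes visible a fixed number of logical ticks after it is accepted. Neither configuration is meant to represent the real legacy or new Cognito store; the point is only that the same client code observes different results.

```python
class LaggedStore:
    """Toy store: a write becomes visible `lag` ticks after it is accepted."""

    def __init__(self, lag: int):
        self.lag = lag
        self.clock = 0
        self.visible: dict = {}
        self.pending: list = []  # (visible_at, key, value) tuples

    def tick(self, steps: int = 1) -> None:
        """Advance the logical clock and surface writes that are now due."""
        self.clock += steps
        still_pending = []
        for visible_at, key, value in self.pending:
            if visible_at <= self.clock:
                self.visible[key] = value
            else:
                still_pending.append((visible_at, key, value))
        self.pending = still_pending

    def write(self, key: str, value: str) -> None:
        if self.lag == 0:
            self.visible[key] = value
        else:
            self.pending.append((self.clock + self.lag, key, value))

    def read(self, key: str):
        return self.visible.get(key)


def write_then_read(store: LaggedStore):
    """Client code that assumes a write is readable immediately."""
    store.write("nickname", "al")
    return store.read("nickname")


print(write_then_read(LaggedStore(lag=0)))  # al
print(write_then_read(LaggedStore(lag=2)))  # None
```

Both writes succeed, so a functional test that only checks the write's status code passes against either store. Only a check on the observed read, which is what shadow mode supplies, reveals the difference.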

Shadow mode verification surfaced these edge cases that automated tests alone would have missed. The lesson: invest in behavioral preservation techniques early, not as an afterthought. Dual-writes, shadow mode, and anti-entropy scans each cover different failure modes, and the gaps between them are where production issues hide.

Cognito's new infrastructure is live now, and all customers will eventually get the new capabilities with no action required. The same architecture that moved hundreds of millions of profiles without breaking a single application is the platform for whatever Amazon decides to build next.


Source: Amazon Cognito unlocks advanced capabilities with next-generation infrastructure
Domain: aws.amazon.com
