The topics every senior loop covers (13 topics).
Long-form deep dives — not blog posts. Each one covers the failure modes, tradeoffs, and interview traps that surface-level guides skip.
Idempotency & Exactly-Once Payment Processing
[free] Networks lose responses. Clients retry. Without protection, that retry charges the card twice. Learn the protocol every payments engineer must know cold.
Saga, Outbox & CDC for Payments
[free] When one local transaction cannot cover a multi-service payment flow, a saga coordinates committed steps, reliable events, compensation, and forward recovery.
Kafka Internals & Production Operations
[free] Almost every durability and ordering guarantee Kafka makes reduces to one primitive: a partition is a log, replicated to a few brokers, with one leader. This note builds up from that primitive to production failure modes.
Distributed Cache — Redis at Scale
[pro] A cache is a bet: keep a hot slice of data close and fast, accept some staleness. The hard part is everything that goes wrong at scale: cold start, cache stampede, hot keys, eviction under pressure.
Distributed Rate Limiter
[pro] Every API gateway must answer one question within its latency budget: has this caller used up its quota? Counting accurately across nodes, through clock skew and traffic spikes, is the real problem.
Consistent Hashing & Sharding
[pro] When data outgrows one machine, it must be split across many. Naive modulo hashing reshuffles almost every key when a node is added or removed. Consistent hashing moves only the keys that must move.
Distributed Locking
[pro] 'Only one worker may run this at a time' sounds trivial until the worker is one of many processes on different machines. Single-active-worker and leader election appear in every payments and fraud system.
Two-Phase Commit Protocol
[pro] Atomic commits across distributed participants: the protocol, its failure modes, the blocking flaw every interviewer probes, and when 2PC is the right answer versus when to use a saga instead.
Leader Election & Raft Basics
[pro] Every system that needs a single source of truth across machines (Kafka's controller, etcd, Postgres failover) rests on consensus. Raft elects one leader, funnels all changes through it, and replicates them to a majority.
Chat Systems at Scale
[pro] A chat system looks trivial until you see the hard parts: millions of persistent connections, presence that is always slightly wrong, ordering that must hold per-conversation across multiple devices, and reconnect storms.
Push Notifications at Scale
[pro] Design a large-scale notification pipeline: fan-out architecture, queueing, provider delivery, token lifecycle, cancellation, and the failure modes an interviewer is likely to probe.
Matching Engine & Order Book
[pro] The core of every exchange, whether FX, crypto, or equities. Orders pour in; the engine maintains a price-time priority book and matches each incoming order against the best-priced resting orders on the opposite side.
URL Shortener at Scale
[pro] The 'design bit.ly' question. The baseline is table stakes; the substance lies in the follow-ups: hot keys, cache invalidation, OLTP/OLAP separation, and CAP trade-offs under write pressure.