senior loop
Deep Dives

The topics every senior loop covers.

Long-form deep dives — not blog posts. Each one covers the failure modes, tradeoffs, and interview traps that surface-level guides skip.

13 topics
3 free
~35 min avg
long-form
Start here · free

Idempotency & Exactly-Once Payment Processing

free

Networks lose responses. Clients retry. Without protection, that retry charges the card twice. Learn the protocol every payments engineer must know cold.

Payments · Distributed Systems · ~30 min
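To make the double-charge problem concrete, here is a minimal in-memory sketch of an idempotency key check. The `PaymentService` class, its fields, and the key format are all hypothetical and exist only for illustration; a production system would persist the key and the charge in the same transaction and handle concurrent requests for the same key, which this sketch ignores.

```python
from dataclasses import dataclass, field


@dataclass
class PaymentService:
    """Toy in-memory processor; every name here is hypothetical."""
    charges: list = field(default_factory=list)    # side effects actually performed
    responses: dict = field(default_factory=dict)  # idempotency key -> stored response

    def charge(self, idempotency_key: str, card: str, amount_cents: int) -> dict:
        # A retry with a known key gets the stored response back;
        # the card is not charged a second time.
        if idempotency_key in self.responses:
            return self.responses[idempotency_key]
        self.charges.append((card, amount_cents))
        response = {"charge_id": len(self.charges), "amount_cents": amount_cents}
        self.responses[idempotency_key] = response
        return response


service = PaymentService()
first = service.charge("order-42", "card_abc", 1999)
retry = service.charge("order-42", "card_abc", 1999)  # client retries after a lost response
```

The retry returns the same response as the first call, and only one entry lands in `service.charges`.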

Saga, Outbox & CDC for Payments

free
Payments · Microservices

When one local transaction cannot cover a multi-service payment flow, a saga coordinates committed steps, reliable events, compensation, and forward recovery.

~35 min

Kafka Internals & Production Operations

free
Streaming · Distributed Systems

Almost every durability and ordering guarantee Kafka makes reduces to one primitive: a partition is a log, replicated to a few brokers, with one leader. This note builds from that up through production failure modes.

~40 min

Distributed Cache — Redis at Scale

pro
Caching · Infrastructure

A cache is a bet — keep a hot slice of data close and fast, accept some staleness. The hard part is everything that goes wrong at scale: cold start, cache stampede, hot keys, eviction under pressure.

~40 min

Distributed Rate Limiter

pro
Distributed Systems · Infrastructure

Every API gateway must answer one question within its latency budget: has this caller used up its quota? Counting accurately across nodes, through clock skew and traffic spikes, is the real problem.

~35 min
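As a starting point for the quota question, the sketch below shows a single-node token bucket with the clock passed in explicitly. It is only an illustration under stated assumptions: the `TokenBucket` class and its parameters are hypothetical, and it does not address the distributed part (sharing the counter across nodes), which is the hard problem the card describes. It does clamp backwards clock jumps so skew cannot mint extra tokens.

```python
class TokenBucket:
    """Single-node token bucket; names and parameters are hypothetical."""

    def __init__(self, capacity: int, refill_per_sec: float, now: float = 0.0):
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self.tokens = float(capacity)  # start full
        self.last = now                # timestamp of the last refill

    def allow(self, now: float) -> bool:
        # Refill for the elapsed time; a clock that moves backwards adds nothing.
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_sec)
        self.last = max(self.last, now)
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


bucket = TokenBucket(capacity=2, refill_per_sec=1.0)
decisions = [
    bucket.allow(0.0),  # True: bucket starts with 2 tokens
    bucket.allow(0.0),  # True: second token
    bucket.allow(0.0),  # False: bucket is empty
    bucket.allow(1.0),  # True: one token refilled after one second
]
```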

Consistent Hashing & Sharding

pro
Distributed Systems · Sharding

When data outgrows one machine it must split across many. Naive modulo hashing reshuffles almost everything when a node is added or removed. Consistent hashing moves only the keys that must move.

~40 min
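The difference between the two schemes can be measured directly. The sketch below grows a cluster from four nodes to five and counts how many keys change owner under modulo placement versus a hash ring with virtual nodes. The `Ring` class, the node names, the MD5 hash choice, and the 100 virtual nodes per node are assumptions made for illustration, not details from the deep dive.

```python
import bisect
import hashlib


def h(value: str) -> int:
    # MD5 is used only as a stable, well-mixed hash; this is not a security use.
    return int(hashlib.md5(value.encode()).hexdigest(), 16)


def modulo_owner(key: str, nodes: list) -> str:
    return nodes[h(key) % len(nodes)]


class Ring:
    """Hash ring with virtual nodes; a hypothetical minimal version."""

    def __init__(self, nodes, vnodes=100):
        self.points = sorted((h(f"{n}#{i}"), n) for n in nodes for i in range(vnodes))
        self.hashes = [p for p, _ in self.points]

    def owner(self, key: str) -> str:
        # First ring point clockwise from the key's hash, wrapping around at the end.
        i = bisect.bisect_right(self.hashes, h(key)) % len(self.points)
        return self.points[i][1]


keys = [f"key-{i}" for i in range(2000)]
old_nodes = [f"node-{i}" for i in range(4)]
new_nodes = old_nodes + ["node-4"]

modulo_moved = sum(modulo_owner(k, old_nodes) != modulo_owner(k, new_nodes) for k in keys)

old_ring, new_ring = Ring(old_nodes), Ring(new_nodes)
ring_moved = [k for k in keys if old_ring.owner(k) != new_ring.owner(k)]

print(f"modulo moved {modulo_moved} keys, ring moved {len(ring_moved)} keys")
```

Under modulo placement a key keeps its owner only when its hash gives the same remainder for 4 and for 5, so roughly four keys in five move. On the ring, the only keys that move are the ones the new node takes over, which should be around one in five here.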

Distributed Locking

pro
Distributed Systems

'Only one worker may run this at a time' sounds trivial until the worker is one of many processes on different machines. Single-active-worker and leader election appear in every payments and fraud system.

~35 min

Two-Phase Commit Protocol

pro
Distributed Systems

Atomic commits across distributed participants: the protocol, its failure modes, the blocking flaw every interviewer probes, and when 2PC is the right answer versus when to use a saga instead.

~25 min

Leader Election & Raft Basics

pro
Consensus · Distributed Systems

Every system that needs a single source of truth across machines — Kafka's controller, etcd, Postgres failover — rests on consensus. Raft elects one leader, funnels all changes through it, replicates to a majority.

~40 min

Chat Systems at Scale

pro
Realtime · Distributed Systems

A chat system looks trivial until you see the hard parts: millions of persistent connections, presence that is always slightly wrong, ordering that must hold per-conversation across multiple devices, and reconnect storms.

~40 min

Push Notifications at Scale

pro
Infrastructure

Design a large-scale notification pipeline: fan-out architecture, queueing, provider delivery, token lifecycle, cancellation, and the failure modes an interviewer is likely to probe.

~25 min

Matching Engine & Order Book

pro
Fintech · Distributed Systems

The core of every exchange — FX, crypto, equities. Orders pour in; the engine maintains a price-time priority book and matches each incoming order against the best opposite side.

~35 min
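Price-time priority can be sketched with two heaps. The example below is a minimal illustration assuming limit orders only and integer prices (for example, ticks); the `OrderBook` class, its `submit` method, and the fill tuple layout are hypothetical. Cancels, market orders, and order IDs beyond a sequence number are left out.

```python
import heapq
from itertools import count


class OrderBook:
    """Toy limit-order book with price-time priority; all names are hypothetical."""

    def __init__(self):
        self.bids = []      # heap of (-price, seq, price, qty): highest bid first
        self.asks = []      # heap of (price, seq, price, qty): lowest ask first
        self.seq = count()  # arrival order of resting orders, breaks price ties

    def submit(self, side: str, price: int, qty: int) -> list:
        """Match an incoming limit order, rest any remainder.

        Returns fills as (resting_seq, price, qty); trades execute at the resting price.
        """
        fills = []
        book, opposite = (self.bids, self.asks) if side == "buy" else (self.asks, self.bids)
        while qty > 0 and opposite:
            key, seq, rest_price, rest_qty = opposite[0]
            crosses = price >= rest_price if side == "buy" else price <= rest_price
            if not crosses:
                break
            traded = min(qty, rest_qty)
            fills.append((seq, rest_price, traded))
            qty -= traded
            if traded == rest_qty:
                heapq.heappop(opposite)
            else:
                # Partial fill: the resting order keeps its place in the queue.
                heapq.heapreplace(opposite, (key, seq, rest_price, rest_qty - traded))
        if qty > 0:
            key = -price if side == "buy" else price
            heapq.heappush(book, (key, next(self.seq), price, qty))
        return fills


book = OrderBook()
book.submit("sell", 101, 5)  # rests as seq 0
book.submit("sell", 100, 5)  # rests as seq 1, better price
book.submit("sell", 100, 5)  # rests as seq 2, same price but later
fills = book.submit("buy", 101, 12)
# fills == [(1, 100, 5), (2, 100, 5), (0, 101, 2)]
```

The buy order takes the two asks at 100 first, in arrival order, and only then touches the ask at 101, leaving 3 units of it resting on the book.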

URL Shortener at Scale

pro
Distributed Systems · Caching

The 'design bit.ly' question. The baseline is table stakes; the substance lies in the follow-ups: hot keys, cache invalidation, OLTP/OLAP separation, and CAP trade-offs under write pressure.

~35 min
Up next: Webhook reliability · Event sourcing · Circuit breakers