
Backend System Design Fundamentals — Questions

Introduction

Building scalable, reliable, maintainable backend systems comes down to a few repeatable ideas: how services communicate, how state changes are represented, how failures are handled, how data scales, and how you observe what’s happening in production.

This guide is structured as interview-style questions with practical explanations. Use it for quick revision, as a checklist when designing systems, or as a set of topics to deepen over time.


Architecture Patterns

Event-Driven Architecture (EDA)

What is Event-Driven Architecture?

Event-Driven Architecture is a style where services communicate by publishing events (facts about something that happened) and subscribing to the events they care about. Instead of direct service-to-service calls, you push events to a broker or bus and let consumers react.

Key characteristics

Common use cases

Example flow

User Action → Event Producer → Event Bus/Broker → Event Consumers

Things to watch


Saga Pattern

What is the Saga Pattern?

The Saga Pattern manages multi-step business transactions across services without relying on a single ACID transaction. A saga is a sequence of local transactions. If something fails, you run compensating actions to undo earlier steps.

Two styles

  1. Choreography: services react to events and decide the next step
  2. Orchestration: a coordinator explicitly tells each service what to do

Example: order processing

1. Create Order (Order Service)
2. Reserve Inventory (Inventory Service) → Compensate: Release Inventory
3. Charge Payment (Payment Service) → Compensate: Refund Payment
4. Arrange Shipment (Shipping Service) → Compensate: Cancel Shipment

⚠️ If step 3 fails → release inventory + cancel order.
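The order-processing flow above can be sketched as an orchestration loop. This is a minimal illustration with stand-in callables for the service calls, not a production saga framework:

```python
# Orchestration-style saga sketch: each step pairs an action with a
# compensating action. On failure, completed steps are undone in reverse.

class SagaFailed(Exception):
    pass

def run_saga(steps, context):
    """steps: list of (action, compensation) pairs, each a callable taking context."""
    completed = []
    for action, compensation in steps:
        try:
            action(context)
            completed.append(compensation)
        except Exception as exc:
            for undo in reversed(completed):  # compensate in reverse order
                undo(context)
            raise SagaFailed(f"saga rolled back: {exc}") from exc
    return context
```

With the order example, a payment failure triggers Release Inventory and then Cancel Order, in that order, before the saga reports failure.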

When to use

Trade-offs


CQRS (Command Query Responsibility Segregation)

What is CQRS?

CQRS splits your system into:

  • Commands: operations that change state (the write side)
  • Queries: operations that read state (the read side)

The read and write sides can have different data models and even different storage, so each can be optimized independently.

Traditional vs CQRS

| Traditional | CQRS |
| --- | --- |
| One model handles reads + writes | Command model (write-optimized) → events/replication → Query model (read-optimized) |
| Same storage for reads and writes | Different storage optimized per side |
| Single source of truth | Separate models, eventual consistency |

Why it’s useful

Example: e-commerce product catalog

Write side: Normalized relational DB for product updates (price, inventory, description). Write operations are simple and consistent.

Read side: Denormalized document store optimized for queries. Product detail pages might store:

{
  "id": 123,
  "name": "Product Name",
  "price": 99.99,
  "images": [...],
  "reviews": [...],
  "relatedProducts": [...]
}

All in one document for fast reads. Search indexes, category listings, and recommendation engines each have their own optimized read models.

When a product price changes, the write model updates the normalized table, then an event or CDC mechanism propagates the change to all read models.
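The price-change flow can be sketched with in-memory dicts standing in for the two stores. The names are illustrative; in a real system the projection runs asynchronously via events or CDC, which is where the eventual consistency comes from:

```python
# CQRS sketch: a command updates the normalized write model, then an event
# is projected into the denormalized read document.

write_db = {}   # product_id -> normalized row
read_db = {}    # product_id -> denormalized document for the product page

def create_product(product_id, name, price):
    write_db[product_id] = {"name": name, "price": price}
    handle_event({"type": "ProductCreated", "id": product_id,
                  "name": name, "price": price})

def change_price(product_id, price):
    write_db[product_id]["price"] = price
    handle_event({"type": "PriceChanged", "id": product_id, "price": price})

def handle_event(event):
    # Projection: in a real system this runs asynchronously, so the read
    # model lags the write model briefly.
    if event["type"] == "ProductCreated":
        read_db[event["id"]] = {"name": event["name"], "price": event["price"],
                                "reviews": [], "relatedProducts": []}
    elif event["type"] == "PriceChanged":
        read_db[event["id"]]["price"] = event["price"]
```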

Costs


Event Sourcing

What is Event Sourcing?

Event Sourcing stores a log of events as the source of truth, rather than storing only the latest state. Current state is derived by replaying events (often with snapshots for speed).

Traditional vs Event Sourcing

| Traditional | Event Sourcing |
| --- | --- |
| current state → update → new state | Event 1 → Event 2 → Event 3 → … → current state (derived) |
| Stores current state only | Stores all events as source of truth |
| State is the source of truth | Events are the source of truth |
| Limited audit trail | Complete audit trail by default |

Example: bank account

Events:
- AccountCreated
- MoneyDeposited(50)
- MoneyWithdrawn(20)
- MoneyDeposited(70)

Current Balance (derived): 50 - 20 + 70 = 100
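Deriving the balance by replay can be sketched as a fold over the event log:

```python
# Replaying the event log to derive current state (events mirror the
# bank-account example above).

def replay_balance(events):
    balance = 0
    for event in events:
        kind = event[0]
        if kind == "MoneyDeposited":
            balance += event[1]
        elif kind == "MoneyWithdrawn":
            balance -= event[1]
        # AccountCreated carries no amount and doesn't change the balance
    return balance
```

A snapshot is just a cached balance at some event index, so replay can start there instead of from the first event.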

Benefits

Challenges


Circuit Breaker Pattern

What is the Circuit Breaker Pattern?

Circuit breakers prevent cascading failures by stopping calls to a dependency that is failing. After too many failures, the breaker “opens” and requests fail fast (or return fallbacks) until recovery.

States

  1. Closed: normal traffic
  2. Open: fail fast (dependency is unhealthy)
  3. Half-open: allow limited test traffic to see if it recovered

State transitions

Closed → (failure threshold) → Open
Open → (cooldown timeout) → Half-open
Half-open → (enough success) → Closed
Half-open → (failure) → Open

| State | Behavior |
| --- | --- |
| Closed | Normal operation, requests pass through |
| Open | Fail fast, no requests to unhealthy dependency |
| Half-open | Test mode, limited requests to check recovery |

Example: payment service dependency

Your order service calls a payment service. After 5 failures in 10 seconds, the circuit opens. All new requests immediately return a fallback response (e.g., “Payment temporarily unavailable, please try again”) without waiting for the payment service timeout. After 30 seconds, the circuit goes half-open and allows one test request. If it succeeds, the circuit closes and normal traffic resumes. If it fails, the circuit opens again.
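The example can be sketched as a small state machine. This is illustrative only (no thread safety, a simple failure counter rather than a sliding 10-second window), with the open/cooldown thresholds mirroring the example:

```python
import time

# Circuit-breaker sketch: 5 failures open the circuit, 30 s cooldown
# before a half-open test request is allowed.

class CircuitBreaker:
    def __init__(self, failure_threshold=5, cooldown=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.cooldown = cooldown
        self.clock = clock          # injectable for testing
        self.failures = 0
        self.state = "closed"
        self.opened_at = None

    def call(self, fn, fallback):
        if self.state == "open":
            if self.clock() - self.opened_at >= self.cooldown:
                self.state = "half-open"        # allow one test request
            else:
                return fallback()               # fail fast
        try:
            result = fn()
        except Exception:
            self.failures += 1
            if self.state == "half-open" or self.failures >= self.failure_threshold:
                self.state = "open"
                self.opened_at = self.clock()
            return fallback()
        self.failures = 0
        self.state = "closed"
        return result
```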

Where it helps

Pairs well with


Distributed Systems

Distributed Tracing

What is Distributed Tracing?

Distributed tracing tracks a single request as it flows through multiple services using a shared trace ID. Each step is a span with timing, metadata, and relationships.

Key terms

  • Trace: the full journey of one request, identified by a shared trace ID
  • Span: one step within a trace, with timing, metadata, and a parent/child relationship

Why it matters

Example

Trace abc123
├─ GET /api/orders (200ms)
│  ├─ DB query (50ms)
│  ├─ Payment service call (100ms)
│  └─ Inventory service call (40ms)

This shows the full request path, making it easy to identify that the payment service call takes the longest (100ms).
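A toy version of spans and trace IDs, assuming an in-process tree of operations (real tracers such as OpenTelemetry propagate the trace ID between services via request headers):

```python
import time
import uuid

# Tracing sketch: spans share a trace ID, record their parent, and time
# themselves. Finished spans accumulate in a list a collector would export.

finished_spans = []

class Span:
    def __init__(self, name, trace_id=None, parent=None):
        self.name = name
        self.trace_id = trace_id or uuid.uuid4().hex  # root span mints the trace ID
        self.parent = parent.name if parent else None
        self.start = time.monotonic()

    def __enter__(self):
        return self

    def __exit__(self, *exc):
        self.duration_ms = (time.monotonic() - self.start) * 1000
        finished_spans.append(self)

# One request: a root span with a child span sharing the same trace ID.
with Span("GET /api/orders") as root:
    with Span("DB query", trace_id=root.trace_id, parent=root):
        pass
```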


CAP Theorem

What is the CAP Theorem?

CAP says a distributed system can’t simultaneously guarantee all three:

  • Consistency: every read sees the most recent write
  • Availability: every request gets a non-error response
  • Partition tolerance: the system keeps working despite network partitions

In real distributed systems, partitions happen, so you typically choose between:

  • CP: stay consistent, sacrifice availability during a partition
  • AP: stay available, serve possibly stale data during a partition

Practical examples

Important nuance

CAP is about behavior during a partition, not forever. Many systems offer knobs (timeouts, quorum reads/writes) that move you along the spectrum.


Idempotency

What is Idempotency?

An operation is idempotent if repeating it produces the same effect as doing it once. This matters because retries happen (timeouts, network glitches, client re-sends).

HTTP intuition

  • GET, PUT, and DELETE are defined as idempotent: repeating them is safe
  • POST is not: two identical POSTs can create two resources or two charges

Common strategy: idempotency keys

Client sends a unique key; server stores the result for that key and returns the same response for duplicates.

Example: payment processing

POST /api/payments
Headers:
  Idempotency-Key: abc-123-xyz
Body:
  {
    "amount": 100,
    "card": "..."
  }

First request: Processes payment, stores result with key abc-123-xyz

Retry (same key): Returns stored result, doesn’t charge again

The server checks: “Have I seen key abc-123-xyz?” If yes, return the previous response. If no, process and store the result.
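That check can be sketched with an in-memory store. A real implementation would use a database with an atomic insert so two concurrent retries can’t both process:

```python
# Idempotency-key sketch: the first request with a key is processed and its
# result stored; retries with the same key replay the stored response.

seen = {}      # idempotency key -> stored response
charges = []   # side effects, to show they happen only once

def process_payment(idempotency_key, amount):
    if idempotency_key in seen:
        return seen[idempotency_key]           # duplicate: replay stored result
    charges.append(amount)                      # side effect happens exactly once
    result = {"status": "charged", "amount": amount}
    seen[idempotency_key] = result
    return result
```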

Best practices


Data Management

Data Sharding

What is Data Sharding?

Sharding splits a dataset across multiple databases/servers (horizontal partitioning). Each shard holds a subset of the data.

Sharding vs replication

| Sharding | Replication |
| --- | --- |
| Different data on different nodes | Same data on multiple nodes |
| Scales capacity and write throughput | Scales reads and availability |
| Each node holds a subset | Each node holds a copy |

Common strategies

  • Range-based: each shard holds a contiguous key range (e.g., user IDs 1–25M)
  • Hash-based: hash the key and take it modulo the shard count
  • Directory-based: a lookup service maps each key to its shard

Example: user data sharding

You have 100M users. With hash-based sharding across 4 shards:

hash(user_id) % 4 = 0 → Shard 1
hash(user_id) % 4 = 1 → Shard 2
hash(user_id) % 4 = 2 → Shard 3
hash(user_id) % 4 = 3 → Shard 4

User ID 12345 → hash(12345) % 4 = 2 → stored on Shard 3. Each shard holds ~25M users. Queries for a specific user are fast (only one shard). Queries like “all users in California” require querying all shards and merging results.
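The routing rule can be sketched with a stable hash. Python’s built-in hash() is randomized per process, so SHA-256 is used here instead; the exact shard indices therefore differ from the worked example above:

```python
import hashlib

# Hash-based shard routing sketch: map each user_id to one of 4 shards.

NUM_SHARDS = 4

def shard_for(user_id):
    # Stable digest of the key, reduced to a shard index 0..NUM_SHARDS-1.
    digest = hashlib.sha256(str(user_id).encode()).hexdigest()
    return int(digest, 16) % NUM_SHARDS
```

Note that changing NUM_SHARDS remaps almost every key, which is why resharding with modulo hashing is painful and why consistent hashing exists.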

Trade-offs


Database Indexing

What is Database Indexing?

Indexes are extra data structures that help databases find rows faster—like a book index. Without an index, the DB may scan many rows. With the right index, it can jump close to the answer.

What indexes improve

Example: user lookup

Without index:

SELECT * FROM users WHERE email = 'user@example.com';
-- Full table scan: checks all 10M rows, takes 2 seconds

With index on email:

CREATE INDEX idx_email ON users(email);
SELECT * FROM users WHERE email = 'user@example.com';
-- Index lookup: finds row directly, takes 5ms

The index is like a sorted list: email → row location. The database can binary search instead of scanning every row.

Trade-offs

Practical guidance


Database Replication

What is Database Replication?

Replication copies data from a primary database to one or more replicas.

Common setup

Sync vs async

| Synchronous | Asynchronous |
| --- | --- |
| Stronger consistency | Eventual consistency |
| Slower writes (wait for replica confirmation) | Faster writes (don’t wait) |
| Replicas stay in sync | Replicas may lag |
| Higher latency | Lower latency |

Why it’s used

Common pitfall: replication lag

A user writes to primary, then immediately reads from a replica and sees stale data. Solutions include reading from primary for critical paths, or using stronger replication settings where needed.
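One common mitigation, read-your-writes routing, can be sketched as: pin a user’s reads to the primary for a short window after that user writes. The names and the 5-second window are illustrative, and time is passed in explicitly to keep the logic testable:

```python
# Read-your-writes sketch: a user who just wrote reads from the primary
# until replicas have likely caught up; everyone else reads from replicas.

PIN_SECONDS = 5.0
last_write_at = {}  # user_id -> time of the user's last write

def record_write(user_id, now):
    last_write_at[user_id] = now

def choose_node(user_id, now):
    if now - last_write_at.get(user_id, float("-inf")) < PIN_SECONDS:
        return "primary"
    return "replica"
```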


Infrastructure & Operations

API Gateway

What is an API Gateway?

An API gateway is a single entry point for clients. It routes requests to internal services and handles cross-cutting concerns.

Typical responsibilities

Example: request flow

Client sends: GET /api/v1/orders/123

1. API Gateway receives request
2. Validates JWT token (auth)
3. Checks rate limits (user has 50 requests left this hour)
4. Routes to Order Service (based on path /api/v1/orders/*)
5. Order Service returns order data
6. Gateway transforms response (adds metadata, formats dates)
7. Gateway logs request and metrics
8. Returns response to client

The client never knows about Order Service directly. If you later split Order Service into multiple services, you only change routing in the gateway.
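The path-based routing step (step 4 above) can be sketched as a prefix table; the service names are illustrative:

```python
# Path-prefix routing sketch: splitting a service later only means editing
# this table, not the clients.

routes = {
    "/api/v1/orders": "order-service",
    "/api/v1/users": "user-service",
}

def route(path):
    for prefix, service in routes.items():
        if path.startswith(prefix):
            return service
    return None  # unknown path -> gateway would return 404
```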

Trade-offs


Caching

What is Caching?

Caching stores frequently used data in fast storage (often memory) to reduce latency and lower database/service load.

Common pattern: cache-aside

1. Check cache
2. On miss, read DB
3. Store result in cache

Example: product details

User requests product ID 123:

  1. Check Redis: GET product:123 → miss
  2. Query database: SELECT * FROM products WHERE id = 123
  3. Store in Redis: SET product:123 <data> EX 3600 (expires in 1 hour)
  4. Return data

Next request for product 123: Redis hit, return immediately (no DB query).
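The cache-aside flow can be sketched with a dict standing in for Redis and a fake database; TTL is simplified to an expiry timestamp, with time passed in explicitly:

```python
# Cache-aside sketch: check the cache, fall back to the database on a miss,
# then populate the cache with a TTL.

cache = {}                                   # key -> (value, expires_at)
db = {123: {"name": "Product Name", "price": 99.99}}
db_queries = []                              # track DB hits for illustration

def get_product(product_id, now, ttl=3600):
    key = f"product:{product_id}"
    entry = cache.get(key)
    if entry is not None and entry[1] > now:
        return entry[0]                      # cache hit
    db_queries.append(product_id)            # miss: read the database
    value = db[product_id]
    cache[key] = (value, now + ttl)          # populate with a 1-hour TTL
    return value
```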

When product price changes:

  1. Update the database
  2. Invalidate the cache: DEL product:123 (or overwrite the key with the new value)

Why cache invalidation is hard

Because the moment the source data changes, you must decide:

  • Which cached keys are now stale? (product:123, product-list, search-results, category-page?)
  • When to expire them? (immediately or wait for TTL?)
  • How to keep multiple caches consistent? (Redis, CDN, browser cache all need updates)

Practical approaches


Rate Limiting

What is Rate Limiting?

Rate limiting caps how many requests a client can make in a window to protect your system and ensure fair usage.

Common algorithms

  • Fixed window: count requests per fixed interval, reset at the boundary
  • Sliding window: count requests over a rolling interval
  • Token bucket: tokens refill at a steady rate; each request spends one
  • Leaky bucket: requests drain at a constant rate; excess queues or drops

Example: API rate limiting

Free tier: 100 requests per hour per API key.

Fixed window approach: reset a per-key counter at the top of each hour. Requests 1–100 pass; request 101 is rejected until the window resets. Simple, but bursty: a client can send 100 requests just before the boundary and 100 just after.

Token bucket approach: each key gets a bucket of 100 tokens that refills at 100 tokens per hour. Every request spends one token; an empty bucket means rejection. This smooths usage while still allowing short bursts up to the bucket size.

When limit exceeded:

HTTP/1.1 429 Too Many Requests
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1609459200
Retry-After: 3600
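A token bucket matching the free-tier numbers can be sketched as follows; time is passed in explicitly to keep the logic easy to test, where a real limiter would read the clock itself:

```python
# Token-bucket sketch: refill proportionally to elapsed time, capped at
# capacity; each allowed request spends one token.

class TokenBucket:
    def __init__(self, capacity, refill_per_sec):
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self.tokens = float(capacity)
        self.last = 0.0

    def allow(self, now):
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_sec)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False

# The free tier above would be TokenBucket(capacity=100, refill_per_sec=100 / 3600).
```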

Typical response


Load Balancing

What is Load Balancing?

Load balancing distributes incoming traffic across multiple instances so no single server is overloaded and failures don’t take the system down.

Layer 4 vs layer 7

| Layer 4 (Transport) | Layer 7 (Application) |
| --- | --- |
| Routes by IP/port | Routes by HTTP path/headers |
| Faster (less processing) | Smarter (more context-aware) |
| Lower latency | Can do path-based routing, SSL termination |
| TCP/UDP level | HTTP/HTTPS level |

Common policies

  • Round robin: rotate through servers in order
  • Least connections: send to the server with the fewest active connections
  • Weighted: send more traffic to bigger servers
  • Consistent hashing: route the same key to the same server

Example: request distribution

You have 3 servers behind a load balancer.

Round robin:

Request 1 → Server A
Request 2 → Server B
Request 3 → Server C
Request 4 → Server A (cycle repeats)
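The round-robin cycle above can be sketched with itertools.cycle:

```python
import itertools

# Round-robin picker over the three servers from the example.

servers = ["Server A", "Server B", "Server C"]
_cycle = itertools.cycle(servers)

def next_server():
    return next(_cycle)
```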

With least connections (better for long-lived connections): each new request goes to the server with the fewest active connections, so a server tied up with slow requests isn’t given more work.

With consistent hashing (useful for caching): requests for the same key always land on the same server, keeping its local cache warm; adding or removing a server remaps only a fraction of keys.

Session affinity problem

User logs in on Server A, session stored there. Next request goes to Server B, session missing.

Solutions:

  • Sticky sessions: the load balancer pins each user to one server (fails over poorly)
  • External session store: keep sessions in Redis or a database so any server can serve the user
  • Stateless tokens: carry session data in a signed token (e.g., JWT) so no server-side session is needed


Conclusion

These concepts show up everywhere in real backend work. EDA and Sagas handle workflows. CQRS and Event Sourcing enable scalable state changes and auditability. Circuit breakers and idempotency improve reliability. Tracing helps debug across services. CAP helps understand trade-offs. Sharding, replication, and indexing scale data. Gateways with caching, rate limiting, and load balancing make systems production-ready.

Treat this as a checklist: learn the definition, understand the trade-offs, then build one small demo per topic.

