Monolith vs Microservices — Architecture Trade-offs at Scale
- system-design
- microservices
- monolith
- api-gateway
- scalability
The Story: The City Hall vs Specialised Departments
Monolith: City Hall. One building. Every department (tax office, licensing, permits, courts) sits under one roof. One building code, one management, one key. Convenient and efficient, with fast internal communication. But: if the building catches fire, everything stops. If the plumbing needs upgrading, the whole building is affected.
Microservices: Specialised departments scattered across the city. Tax office in one district. Licensing in another. Courts in the civic center. Each runs independently. One can renovate without shutting the others down. But: they need messengers (APIs) to talk. Coordination is complex. More buildings to maintain.
Neither is always better. The architecture should match the team’s maturity and the product’s complexity.
The Monolith
A monolith is a single deployable unit containing all application logic, usually backed by a single database. All components run in the same process and are often tightly coupled.
┌────────────────────────────────────────────────────────────────────┐
│ Monolith │
│ │
│ [User Auth] [Product Catalog] [Orders] [Payments] │
│ [Reviews] [Search] [Email] [Analytics] │
│ │
│ Single database │
│ Single codebase │
│ Deploy everything together │
└────────────────────────────────────────────────────────────────────┘
Types of monoliths
Modular Monolith
Still one deployable unit, but internally structured with clear module boundaries. Each module has its own domain logic and can only communicate with others through defined interfaces. The best of both worlds for many teams.
┌─────────────────────────────────────────┐
│ Modular Monolith │
│ ┌──────────┐ ┌──────────┐ │
│ │ Orders │ → │ Payments │ │
│ │ Module │ │ Module │ │
│ └──────────┘ └──────────┘ │
│ (communicate via interfaces) │
└─────────────────────────────────────────┘
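One way to picture "communicate via interfaces" is the minimal sketch below. It assumes a hypothetical `PaymentGateway` interface exposed by the payments module and a hypothetical `OrderService` in the orders module; none of these names come from a real framework. The orders code depends only on the interface, so the payments internals can change (or later move behind a network call) without touching it.

```python
from dataclasses import dataclass
from typing import Protocol


class PaymentGateway(Protocol):
    """Public interface of the (hypothetical) payments module."""

    def charge(self, order_id: str, amount_cents: int) -> bool: ...


class InMemoryPayments:
    """Payments module implementation; its internals stay private to the module."""

    def __init__(self) -> None:
        self._charged: dict[str, int] = {}

    def charge(self, order_id: str, amount_cents: int) -> bool:
        if amount_cents <= 0:
            return False
        self._charged[order_id] = amount_cents
        return True


@dataclass
class OrderService:
    """Orders module: depends only on the PaymentGateway interface."""

    payments: PaymentGateway

    def place_order(self, order_id: str, amount_cents: int) -> str:
        ok = self.payments.charge(order_id, amount_cents)
        return "confirmed" if ok else "rejected"


orders = OrderService(payments=InMemoryPayments())
print(orders.place_order("o-1", 2500))  # prints "confirmed"
```

Because the call is still an in-process function call, this keeps the monolith's speed and simple debugging while making the module boundary explicit.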
Distributed Monolith
The worst of both worlds: multiple services so tightly coupled that they must be deployed together. You pay for all the complexity of microservices and get none of the benefits.
Pros and Cons
| Pros | Cons |
|---|---|
| Simple to develop and debug | Harder to scale individual components |
| No network overhead for internal calls | Long deployment pipelines (deploy everything to change anything) |
| Easy transactions (same DB) | Technology lock-in (one language, one framework) |
| Simple deployment (one artifact) | Teams step on each other at scale (merge conflicts, coupling) |
| Easy to refactor (same codebase) | Single point of failure risk |
| Fastest for early-stage | Risky deployments (change X, accidentally break Y) |
Microservices
Independently deployable services, each owning its domain and its data.
┌────────────┐ ┌──────────────┐ ┌────────────┐
│ User │ │ Product │ │ Order │
│ Service │ │ Service │ │ Service │
│ DB: users │ │ DB: products │ │ DB: orders │
└────────────┘ └──────────────┘ └────────────┘
↕ (API calls) ↕ (events) ↕ (events)
┌────────────┐ ┌─────────────┐ ┌─────────────┐
│ Payment │ │ Search │ │ Email │
│ Service │ │ Service │ │ Service │
│ DB: pymts │ │ DB: elastic │ │ (stateless) │
└────────────┘ └─────────────┘ └─────────────┘
The critical rule: Each microservice owns its own database. Never share a database between microservices: with a shared schema, one team's migration can break another team's service, and the services stop being independently deployable.
Pros and Cons
| Pros | Cons |
|---|---|
| Independent deployments (deploy payment without touching search) | Network overhead on every inter-service call |
| Independent scaling (scale only the Order service) | Distributed systems complexity (partial failures, consistency) |
| Technology freedom (Python for ML, Go for high-throughput) | Harder to debug (trace spans 8 services) |
| Team autonomy (team owns their service end-to-end) | No ACID transactions across services (typically handled with the Saga pattern) |
| Isolated failures | Service discovery and networking overhead |
| Smaller codebases per service | Operational overhead (many deployments, many databases) |
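The Saga pattern mentioned in the table replaces one cross-service transaction with a sequence of local steps, each paired with a compensating action. The sketch below is a simplified, in-process illustration under assumed names (`run_saga`, `fail_payment`, and the step names are all hypothetical); a real saga would call separate services and persist its progress.

```python
from typing import Callable

# (name, action, compensation)
Step = tuple[str, Callable[[], None], Callable[[], None]]


def run_saga(steps: list[Step]) -> list[str]:
    """Run steps in order; on failure, undo completed steps in reverse order."""
    log: list[str] = []
    done: list[Step] = []
    for step in steps:
        name, action, _ = step
        try:
            action()
            log.append(f"done:{name}")
            done.append(step)
        except Exception:
            log.append(f"failed:{name}")
            # Compensate only the steps that actually completed.
            for prev_name, _, compensate in reversed(done):
                compensate()
                log.append(f"undone:{prev_name}")
            break
    return log


def fail_payment() -> None:
    raise RuntimeError("card declined")


def noop() -> None:
    pass


saga_log = run_saga([
    ("reserve_stock", noop, noop),
    ("charge_card", fail_payment, noop),
    ("send_email", noop, noop),
])
print(saga_log)
# ['done:reserve_stock', 'failed:charge_card', 'undone:reserve_stock']
```

The system ends up consistent, but only eventually: between the failure and the compensation, other services can observe the reserved stock.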
How to Decide
The Conway’s Law insight
Organizations which design systems are constrained to produce designs which are copies of the communication structures of those organizations. — Melvin Conway (1967)
Your architecture mirrors your team structure. Before choosing microservices, look at your team:
| Team situation | Recommended architecture |
|---|---|
| 1–3 developers | Monolith |
| 4–10 developers | Modular monolith |
| Multiple teams working in the same codebase causing friction | Consider extracting services |
| Teams stepping on each other at release time | Microservices |
| Scale is your bottleneck, not team coordination | Monolith with vertical/horizontal scaling first |
| Different parts need different technology stacks | Microservices |
The maturity ladder
Stage 1: Monolith
↓ (when: teams fighting over codebase, specific scale needs)
Stage 2: Modular Monolith (with clean internal APIs)
↓ (when: a specific module needs independent scaling or team)
Stage 3: Extract one or two services that are clearly bounded
↓ (when: proven value and team has distributed systems expertise)
Stage 4: Microservices (where justified)
The common mistake: Jumping from stage 1 straight to stage 4, for example a startup with 2 engineers building 15 microservices. These "premature microservices" carry all the operational complexity and none of the scale benefits.
Communication Between Microservices
Synchronous (Request-Response)
[Order Service] ──> HTTP/gRPC ──> [Payment Service]
← response (success/fail)
When to use: When you need the result immediately to serve the user.
Risk: Payment service down → Order service returns error. Tight coupling.
Asynchronous (Event-Driven)
[Order Service] ──> "order_placed" event ──> [Message Queue]
↓
[Payment Service] (processes when ready)
[Email Service] (sends confirmation)
[Inventory] (decrements stock)
When to use: When downstream services can process independently.
Benefit: Order service doesn’t care if payment service is temporarily slow.
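The event-driven flow above can be sketched with a toy in-memory broker. This is an illustration only: `InMemoryBroker` is a hypothetical stand-in for a real message queue, and the three lambdas stand in for the payment, email, and inventory consumers.

```python
from collections import defaultdict, deque
from typing import Callable

Handler = Callable[[dict], None]


class InMemoryBroker:
    """Toy stand-in for a message broker: publish enqueues, drain delivers."""

    def __init__(self) -> None:
        self._subscribers: dict[str, list[Handler]] = defaultdict(list)
        self._queue: deque[tuple[str, dict]] = deque()

    def subscribe(self, topic: str, handler: Handler) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, event: dict) -> None:
        # The publisher returns immediately; no consumer has run yet.
        self._queue.append((topic, event))

    def drain(self) -> None:
        # Deliver each queued event to every subscriber of its topic.
        while self._queue:
            topic, event = self._queue.popleft()
            for handler in self._subscribers[topic]:
                handler(event)


processed: list[str] = []
broker = InMemoryBroker()
broker.subscribe("order_placed", lambda e: processed.append(f"payment:{e['order_id']}"))
broker.subscribe("order_placed", lambda e: processed.append(f"email:{e['order_id']}"))
broker.subscribe("order_placed", lambda e: processed.append(f"inventory:{e['order_id']}"))

broker.publish("order_placed", {"order_id": "o-1"})
assert processed == []  # the order service is done; nothing has been consumed yet
broker.drain()
print(processed)
# ['payment:o-1', 'email:o-1', 'inventory:o-1']
```

The gap between `publish` and `drain` is the decoupling: the order service finishes its work regardless of when, or how fast, the consumers run.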
gRPC vs REST for inter-service communication
| | REST (HTTP/JSON) | gRPC (HTTP/2 + Protobuf) |
|---|---|---|
| Format | JSON (human-readable) | Binary Protobuf (compact, fast) |
| Speed | Slower (text parsing, larger payloads) | Often several times faster; the gain depends on payload and workload |
| Typing | Loose | Strict (schema enforced) |
| Streaming | Not native | Built-in bidirectional streaming |
| Browser support | Native | Needs proxy (grpc-web) |
| Best for | External APIs, browser-facing | Internal microservice communication |
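To get a feel for why a schema-based binary format is more compact, the sketch below encodes the same hypothetical payment message as JSON and as a fixed binary layout. It uses Python's `struct` module purely as a stand-in; this is not Protobuf's actual wire format, and the field names are made up for illustration.

```python
import json
import struct

payment = {"order_id": 12345, "amount_cents": 2599, "approved": True}

# Text encoding: field names travel with every message.
json_bytes = json.dumps(payment).encode("utf-8")

# Fixed binary layout agreed on in advance (a stand-in for a schema, NOT Protobuf):
# little-endian, unsigned 32-bit order_id, unsigned 32-bit amount_cents, 1-byte bool.
LAYOUT = "<II?"
binary_bytes = struct.pack(
    LAYOUT, payment["order_id"], payment["amount_cents"], payment["approved"]
)

# Both sides must share the layout to decode the message.
decoded = struct.unpack(LAYOUT, binary_bytes)

print(len(json_bytes), len(binary_bytes))
print(decoded)  # (12345, 2599, True)
```

The binary message is smaller because both sides already know the field order and types. That shared knowledge is also the cost: you cannot read the bytes without the schema.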
Service Discovery
Problem: You have 50 microservices. Service A needs to call Service B. Where is Service B? Its IP address changes when it scales up/down or restarts.
Without service discovery:
Service A → hard-coded "10.0.0.5:8080" → service B
If Service B restarts with new IP → Service A is broken
Client-side discovery
Services register themselves in a registry. Callers look up the registry.
Service B startup → registers "payment-service" → [Service Registry] (e.g., Consul, Eureka)
Service A wants to call B:
1. Query registry: "where is payment-service?"
2. Registry returns: ["10.0.1.5:8080", "10.0.1.6:8080"]
3. Service A load-balances and calls one
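The three steps above can be sketched as follows. `ServiceRegistry` and `RoundRobinClient` are hypothetical in-memory classes written for this example; they do not mirror the Consul or Eureka APIs, and they skip health checks entirely.

```python
import itertools


class ServiceRegistry:
    """Toy in-memory registry; real ones (Consul, Eureka) also track health."""

    def __init__(self) -> None:
        self._instances: dict[str, list[str]] = {}

    def register(self, name: str, address: str) -> None:
        self._instances.setdefault(name, []).append(address)

    def deregister(self, name: str, address: str) -> None:
        self._instances[name].remove(address)

    def lookup(self, name: str) -> list[str]:
        return list(self._instances.get(name, []))


class RoundRobinClient:
    """Caller-side load balancing: look up instances, then rotate through them."""

    def __init__(self, registry: ServiceRegistry) -> None:
        self._registry = registry
        self._counter = itertools.count()

    def pick(self, name: str) -> str:
        instances = self._registry.lookup(name)
        if not instances:
            raise LookupError(f"no instances registered for {name}")
        return instances[next(self._counter) % len(instances)]


registry = ServiceRegistry()
registry.register("payment-service", "10.0.1.5:8080")
registry.register("payment-service", "10.0.1.6:8080")

client = RoundRobinClient(registry)
picks = [client.pick("payment-service") for _ in range(3)]
print(picks)
# ['10.0.1.5:8080', '10.0.1.6:8080', '10.0.1.5:8080']
```

Note that the load-balancing logic lives in every caller, which is the main reason many teams prefer to push it into the infrastructure instead.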
Server-side discovery (more common today)
Services call a fixed endpoint. The infrastructure routes to the right place.
Service A → calls "http://payment-service" (fixed hostname)
→ Kubernetes DNS resolves to the right pod
→ kube-proxy load-balances across healthy pods
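Kubernetes handles this automatically: if a Service named payment-service exposes port 8080 in the caller's namespace, you call http://payment-service:8080 and cluster DNS plus kube-proxy do the rest.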
API Gateway in Microservices
Without an API gateway, clients must know about all services:
Mobile app → /users → User Service
Mobile app → /orders → Order Service
Mobile app → /products → Product Service
With an API gateway:
Mobile app → [API Gateway]
↓ (routes to)
/users → User Service
/orders → Order Service
/products → Product Service
API Gateway responsibilities:
- Routing: Forward requests to correct service
- Authentication: Verify JWT/OAuth before requests reach services
- Rate limiting: Per-client throttling
- SSL termination: Handle HTTPS, forward HTTP internally
- Request/response transformation: Adapt between API versions
- Aggregation: Combine responses from multiple services (BFF pattern)
Tools: Kong, AWS API Gateway, Nginx, Traefik, Envoy
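As a rough illustration of the first two responsibilities (routing and authentication), here is a minimal gateway function. The route table, the handlers, and the `VALID_TOKENS` set are all hypothetical; a real gateway would verify a JWT or OAuth token and forward the request over the network.

```python
from typing import Callable, Optional

Handler = Callable[[str], tuple[int, str]]

# Hypothetical backends; in practice these would be network calls.
ROUTES: dict[str, Handler] = {
    "/users": lambda path: (200, f"user-service handled {path}"),
    "/orders": lambda path: (200, f"order-service handled {path}"),
    "/products": lambda path: (200, f"product-service handled {path}"),
}

VALID_TOKENS = {"secret-token"}  # stand-in for real JWT/OAuth verification


def gateway(path: str, token: Optional[str]) -> tuple[int, str]:
    """Authenticate once at the edge, then route by path prefix."""
    if token not in VALID_TOKENS:
        return 401, "unauthorized"
    for prefix, handler in ROUTES.items():
        if path == prefix or path.startswith(prefix + "/"):
            return handler(path)
    return 404, "no route"


print(gateway("/orders/42", "secret-token"))  # (200, 'order-service handled /orders/42')
print(gateway("/orders/42", None))            # (401, 'unauthorized')
```

Because authentication happens before routing, the backend services never see unauthenticated traffic and do not each need their own token-checking code.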
Backend For Frontend (BFF) Pattern
Different clients have different data needs.
Mobile app → [Mobile BFF] → calls multiple microservices → tailored response
Web app → [Web BFF] → calls multiple microservices → tailored response
3rd party → [Public API] → calls microservices → standardised response
Each BFF aggregates and adapts the response for its specific client — avoiding over-fetching (getting too much data) or under-fetching (multiple round trips).
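A minimal sketch of that aggregation, assuming two hypothetical upstream calls (`get_product`, `get_reviews`) that return canned data here instead of hitting real services:

```python
# Hypothetical service clients returning canned data instead of making network calls.
def get_product(product_id: str) -> dict:
    return {
        "id": product_id,
        "name": "Kettle",
        "description": "1.7 L electric kettle",
        "price_cents": 3499,
        "image_urls": ["small.jpg", "large.jpg"],
    }


def get_reviews(product_id: str) -> list[dict]:
    return [{"rating": 5, "text": "Boils fast"}, {"rating": 4, "text": "A bit loud"}]


def mobile_bff_product(product_id: str) -> dict:
    """Mobile screen: one round trip, only the fields the small view renders."""
    product = get_product(product_id)
    reviews = get_reviews(product_id)
    return {
        "name": product["name"],
        "price_cents": product["price_cents"],
        "thumbnail": product["image_urls"][0],
        "avg_rating": sum(r["rating"] for r in reviews) / len(reviews),
    }


def web_bff_product(product_id: str) -> dict:
    """Web page: same upstream calls, richer payload."""
    product = get_product(product_id)
    return {**product, "reviews": get_reviews(product_id)}


print(mobile_bff_product("p-1"))
# {'name': 'Kettle', 'price_cents': 3499, 'thumbnail': 'small.jpg', 'avg_rating': 4.5}
```

The mobile client makes one request and receives four fields; the web client gets the full product plus reviews from the same underlying services.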
Distributed Tracing
Problem: A request fails. It touched 8 services. Where did it fail?
Without tracing:
[User] → [API Gateway] → [Order Service] → [Payment] → [Fraud] → [Notification]
⚠️ Error!
→ You check 6 different log files, each with different timestamps and formats
With distributed tracing:
Every request gets a trace ID. Each service logs with that trace ID. You view the complete trace in one place.
Trace ID: abc-123
Span 1: API Gateway 100ms total
Span 2: Order Service 80ms
Span 3: Payment Svc 60ms ← ⚠️ ERROR HERE (failed after 60ms, before the timeout)
Span 4: Fraud Svc 40ms
Tools: Jaeger, Zipkin, AWS X-Ray, Datadog APM
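Implementation: Pass an X-Trace-ID header through every service, and include the trace ID in every structured log line.
# Middleware in every service (sketch: assumes loguru and a request object with mutable headers)
import uuid
from loguru import logger

def trace_middleware(request, next_handler):
    # Reuse the caller's trace ID, or start a new trace if this is the first hop
    trace_id = request.headers.get('X-Trace-ID') or str(uuid.uuid4())
    request.headers['X-Trace-ID'] = trace_id  # downstream calls copy this header
    # contextualize() adds these fields to every log record emitted inside the block
    with logger.contextualize(trace_id=trace_id, service="order-service"):
        return next_handler(request)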
The Strangler Fig Pattern: Migrating Monolith → Microservices
Don’t rewrite everything at once. Big-bang rewrites are notoriously risky: the old system keeps changing while the new one is built, and nothing ships until the very end.
The Strangler Fig: A vine that grows around a tree, slowly replacing it. The tree is still alive and functioning while the vine grows around it.
Phase 1: Monolith handles everything
[Client] → [Monolith]
Phase 2: Extract one service, proxy in front
[Client] → [Proxy/API Gateway]
↓ ↓
[User Service] [Monolith]
(new) (everything else)
Phase 3: Extract another service
[Client] → [API Gateway]
↓ ↓ ↓
[User Svc] [Order Svc] [Monolith]
Phase N: Monolith is gone
[Client] → [API Gateway]
↓ ↓ ↓ ↓
[Svc A] [Svc B] [Svc C] [Svc D]
Rules for successful migration:
- Extract one service at a time
- Start with the most independently bounded domain
- Keep the monolith working throughout
- Run the new service and monolith in parallel initially (feature flags)
- Only cut over when the new service is proven
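The "run in parallel, then cut over" rules can be sketched as a routing function in the proxy. Everything here is an assumption made for illustration: the `ROLLOUT_PERCENT` flag store, the `EXTRACTED` table, and the choice of hashing the client ID to pick a stable slice of traffic.

```python
import zlib


def monolith(path: str) -> str:
    return f"monolith:{path}"


def user_service(path: str) -> str:
    return f"user-service:{path}"


# Hypothetical flag store: percentage of traffic per prefix sent to the new service.
ROLLOUT_PERCENT = {"/users": 25}
EXTRACTED = {"/users": user_service}


def route(path: str, client_id: str) -> str:
    """Send a stable slice of clients to the extracted service; the rest hit the monolith."""
    for prefix, new_service in EXTRACTED.items():
        if path == prefix or path.startswith(prefix + "/"):
            # crc32 is deterministic across runs, so a client stays in the same bucket.
            bucket = zlib.crc32(client_id.encode("utf-8")) % 100
            if bucket < ROLLOUT_PERCENT.get(prefix, 0):
                return new_service(path)
    return monolith(path)


print(route("/orders/7", "client-a"))  # prints "monolith:/orders/7" (not extracted yet)
```

Raising the percentage to 100 completes the cut-over for that prefix, and setting it back to 0 is the rollback, with no change to the monolith in either case.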
Flashcards
Q: Monolith or microservices for this system?
Default answer: I’d start with a modular monolith. It gives us fast development velocity, simple deployment, and easy debugging. The modules are clearly separated internally, so if we need to extract a service later (perhaps the ML recommendation engine or the payment flow), we can do so with the strangler fig pattern without a full rewrite.
When to say microservices: If the design problem specifies different scaling needs for different components, or if it explicitly mentions a large organisation with multiple teams, I’d decompose based on domain boundaries following domain-driven design.
Q: How do microservices communicate?
Synchronously via HTTP REST or gRPC for operations where the caller needs an immediate response. Asynchronously via a message broker (Kafka/RabbitMQ) for events where downstream services process independently. My default is async/event-driven because it decouples services and reduces the risk of cascading failures.
Q: What is the key rule of microservices data ownership?
Each microservice owns its own database. Never share a database between services.
Q: What is Conway’s Law?
Organizations build systems that mirror their communication structures. Team structure drives architecture.
Q: What is the Strangler Fig pattern?
Migrating from a monolith to microservices by extracting one service at a time, putting a proxy in front, and keeping the monolith functional throughout.
Q: What does an API Gateway do?
Routing, authentication, rate limiting, SSL termination, request transformation. Single entry point for all clients.
Q: What is the BFF pattern?
Backend For Frontend — separate gateway/aggregation layer per client type (mobile, web, 3rd party) to serve tailored responses.
Q: What is distributed tracing?
Attaching a trace ID to every request that flows through all services, enabling you to see the full call chain and locate failures/latency.
Q: What is a distributed monolith?
Multiple services that are so tightly coupled they must be deployed together. Worst of both worlds.
Series · System Design
Part 8 of 13 · Jun 2026