Context
Cloud Native Overview
Serverless is one of the operating models of cloud-native architecture, not a separate kind of magic.
Serverless patterns speed up delivery and relieve some of the operational load, but they shift complexity into event contracts, observability, and cost control. Robust designs are built around asynchrony, idempotency, and managed retry policies.
When is serverless appropriate?
- Irregular or bursty load, where paying only for actual consumption matters.
- Asynchronous workflows and event-driven integration (queues, webhooks, stream events).
- Fast product launch without a dedicated platform operations team.
- Automation around storage, messaging, cron/schedules, and lightweight API endpoints.
Key patterns
Async first
Separate request acceptance from heavy processing via a queue or topic. This dampens load spikes and makes the system more resilient to downstream failures.
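A minimal in-process sketch of the async-first split, with `queue.Queue` standing in for a managed queue (e.g. SQS or Pub/Sub); the function and status names are illustrative:

```python
import queue
import threading

# Stand-ins for a managed queue and a results store.
work_queue: "queue.Queue[dict]" = queue.Queue()
results: list[str] = []

def accept_request(payload: dict) -> dict:
    """Sync API path: validate, enqueue, and return immediately (202)."""
    if "order_id" not in payload:
        return {"status": 400}
    work_queue.put(payload)          # heavy work is deferred to the consumer
    return {"status": 202, "order_id": payload["order_id"]}

def worker() -> None:
    """Async path: drains the queue at its own pace, absorbing spikes."""
    while True:
        payload = work_queue.get()
        if payload is None:          # sentinel to stop the worker
            work_queue.task_done()
            break
        results.append(f"processed:{payload['order_id']}")
        work_queue.task_done()

t = threading.Thread(target=worker)
t.start()
accept_request({"order_id": "A1"})   # returns 202 before processing happens
work_queue.put(None)
t.join()
```

The caller gets an acknowledgment immediately; processing latency is decoupled from request latency.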
Idempotent handlers
Every function must handle repeated events safely (at-least-once delivery is the default in many managed services).
Function-per-capability
Split the logic by bounded context rather than into monolithic lambdas. This makes each capability easier to scale, test, and deploy independently.
State externalization
Do not store critical state in function memory. Use managed databases, caches, or object storage, with versioned schemas.
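A sketch of externalized, versioned state; a dict stands in for a managed store, and the record shape and version numbers are illustrative:

```python
# Stand-in for a managed DB/cache/object store.
store: dict[str, dict] = {}

def save_cart(user_id: str, items: list[str]) -> None:
    """Write state to the external store with an explicit schema version."""
    store[f"cart:{user_id}"] = {"schema_version": 2, "items": items}

def load_cart(user_id: str) -> list[str]:
    """Read state; migrate older record shapes on read."""
    record = store.get(f"cart:{user_id}",
                       {"schema_version": 2, "items": []})
    if record["schema_version"] == 1:   # v1 stored items as a CSV string
        return record["items_csv"].split(",")
    return record["items"]

save_cart("u1", ["book", "pen"])
```

Because the state lives outside the function, any warm or cold instance can serve the next invocation, and the version field lets readers handle old records safely.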
Cost Optimization & FinOps
The economics of serverless should be measured based on actual production traffic, not just expectations.
Risks and how to cover them
Cold starts
Plan a latency budget, use provisioned concurrency or warm-up approaches, and minimize the initialization path.
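One way to minimize the init path is to cache expensive initialization outside the per-request handler so warm invocations reuse it; `make_db_client` and the handler shape below are illustrative stand-ins:

```python
import time

# Module-level cache: survives between invocations in a warm instance.
_db_client = None

def make_db_client() -> dict:
    """Stand-in for expensive init: TLS handshake, config load, etc."""
    time.sleep(0.01)
    return {"connected": True}

def get_db_client() -> dict:
    """Pay the init cost once, on the first (cold) invocation."""
    global _db_client
    if _db_client is None:
        _db_client = make_db_client()
    return _db_client

def handler(event: dict) -> dict:
    db = get_db_client()        # warm calls reuse the cached client
    return {"ok": db["connected"], "event": event}
```

The same pattern argues for importing heavy dependencies lazily, so the cold-start path only pays for what the first request actually needs.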
Hidden coupling through events
Introduce event contracts, schema versioning, and end-to-end observability across the pipeline.
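A minimal sketch of such a contract: a versioned event envelope carrying a correlation ID for end-to-end tracing. The field names and supported versions are assumptions for illustration:

```python
# Versions this consumer knows how to handle.
SUPPORTED_VERSIONS = {1, 2}

def validate_envelope(event: dict) -> bool:
    """Reject events that violate the contract before any side effects."""
    required = {"schema_version", "correlation_id", "type", "payload"}
    if not required.issubset(event):
        return False
    return event["schema_version"] in SUPPORTED_VERSIONS

ok = validate_envelope({
    "schema_version": 2,
    "correlation_id": "req-42",   # propagated API -> queue -> function
    "type": "order.created",
    "payload": {"order_id": "A1"},
})
```

Rejecting unknown schema versions at the boundary makes the coupling between producer and consumer explicit instead of failing deep inside business logic.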
Timeout and retry storms
Set explicit timeouts, a DLQ/retry policy, and a retry budget at the level of each consumer.
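A sketch of a bounded retry loop with a dead-letter queue, so a poison message cannot trigger a retry storm; the names and attempt limit are illustrative:

```python
dead_letter_queue: list[dict] = []
MAX_ATTEMPTS = 3   # the per-consumer retry budget

def consume(message: dict, process) -> str:
    """Try `process` up to MAX_ATTEMPTS times, then park the message."""
    for attempt in range(1, MAX_ATTEMPTS + 1):
        try:
            process(message)
            return "ok"
        except Exception:
            continue            # real code: exponential backoff + jitter
    dead_letter_queue.append(message)   # handled later via the runbook
    return "dead-lettered"

def always_fails(msg: dict) -> None:
    raise ValueError("bad payload")

consume({"id": 1}, always_fails)   # parked after 3 attempts
```

In a real system the backoff between attempts matters as much as the cap: retries without jitter synchronize consumers and amplify the very spike they are retrying through.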
Increased cost with high constant load
Compare the unit economics of serverless vs containers/VMs against the production load profile.
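The risk section above can be made concrete with back-of-envelope arithmetic; the prices below are purely illustrative assumptions, not any provider's actual rates:

```python
# Assumed rates for illustration only.
PRICE_PER_GB_SECOND = 0.0000167   # serverless billing rate
CONTAINER_MONTHLY = 120.0         # flat cost of a small always-on fleet

def serverless_monthly_cost(requests_per_month: int,
                            avg_duration_s: float,
                            memory_gb: float) -> float:
    """Pay-per-use model: requests x duration x memory x rate."""
    return (requests_per_month * avg_duration_s
            * memory_gb * PRICE_PER_GB_SECOND)

low = serverless_monthly_cost(1_000_000, 0.2, 0.5)     # bursty product
high = serverless_monthly_cost(100_000_000, 0.2, 0.5)  # constant heavy load
# At low volume serverless wins; at high constant load the flat fleet does.
```

The crossover point is what should be recalculated as traffic grows, which is exactly what the checklist below asks for.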
Practical checklist
- Latency/SLO boundaries are defined separately for the sync API and for async processing.
- All handlers are idempotent and support replay without duplicating side effects.
- There is a DLQ/parking lot and an operational runbook for stuck/bad messages.
- Observability covers tracing across API -> queue -> function -> storage.
- The economic model is regularly recalculated based on actual traffic.
Frequent anti-pattern: lifting a synchronous monolith into a single function without decomposition and without queue-based backpressure.
