
Cloud Infrastructure Cost Model - 34–48% Savings Validated

22 Oct 2025

Executive Summary

We evaluated serverless and container deployment options for highly variable API workloads (10–1000 req/min) and built a repeatable cost model to identify breakpoints.

Using a 6-week production trace replay, we validated that containers (Fargate/EC2) can cut total spend by 34–48% while improving p95 latency stability (reduced cold-start exposure).

Abstract

Architecture choices are often made from intuition rather than workload evidence. For variable traffic, cost and latency depend on the distribution of demand over time, not only average requests per minute.

We present a methodology that replays production traces against comparable AWS configurations and quantifies compute, data transfer, and observability costs. From the results we derive practical breakpoints and provide a decision tree that can be applied to new workloads.

Problem Statement

Given a workload with high variance, choose between:

  • AWS Lambda (pay-per-use, possible cold starts)
  • AWS Fargate (containerized, autoscaling)
  • EC2 + ALB (steady baseline capacity)

Target outcomes:

  • minimize total cost over representative time windows
  • maintain consistent p95 latency
  • keep operational complexity proportional to savings

Experimental Design

Test Environment

  • 3 identical API implementations (Node.js, ~180ms p50 response time)
  • Workload simulation: 6-week trace from a production ecommerce platform
  • Cost tracking: compute, data transfer, auxiliary services (logs, metrics)

Evaluated Configurations

  1. Lambda (1024MB, arm64)
  2. Fargate (0.5 vCPU, 1GB, autoscaling 2–20 tasks)
  3. EC2 t4g.small (2 instances, ALB)

Cost Model

We compute total cost as the sum of:

  • Compute runtime (or provisioned capacity)
  • Data transfer
  • Logs and metrics (often underestimated)

We replay the trace using identical application behavior assumptions and compare totals.
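
As a minimal sketch of how such a replay can be structured (the pricing constants, field names, and helper below are illustrative assumptions, not the production framework):

    from dataclasses import dataclass

    # Illustrative pricing constants (assumptions, not exact AWS list prices).
    LAMBDA_GB_SECOND = 0.0000133      # arm64 compute, per GB-second
    LAMBDA_PER_REQUEST = 0.0000002    # per-invocation request fee
    DATA_TRANSFER_PER_GB = 0.09       # egress
    LOGS_PER_GB_INGESTED = 0.50       # CloudWatch-style log ingestion

    @dataclass
    class TraceEvent:
        duration_ms: float     # measured request duration from the trace
        memory_gb: float       # configured function memory
        response_bytes: int    # payload size for the data-transfer estimate
        log_bytes: int         # log volume emitted by the request

    def lambda_cost(trace: list[TraceEvent]) -> dict[str, float]:
        """Replay a trace and total the three cost components for Lambda."""
        compute = sum(e.duration_ms / 1000 * e.memory_gb * LAMBDA_GB_SECOND
                      for e in trace) + len(trace) * LAMBDA_PER_REQUEST
        transfer = sum(e.response_bytes for e in trace) / 1e9 * DATA_TRANSFER_PER_GB
        logs = sum(e.log_bytes for e in trace) / 1e9 * LOGS_PER_GB_INGESTED
        return {"compute": compute, "data_transfer": transfer,
                "logs_metrics": logs, "total": compute + transfer + logs}

Equivalent functions for Fargate and EC2 price provisioned capacity per hour rather than per request, which is what produces the crossover analysed below.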

Results

Cost Breakdown (6-week period)

Configuration               Compute   Data transfer   Logs & metrics   Total (6 weeks)   Delta vs. Lambda
Lambda (1024MB, arm64)      $842      $67             $124             $1,033            baseline
Fargate (0.5 vCPU, 1GB)     $511      $58             $109             $678              -34%
EC2 (t4g.small + ALB)       $387      $52             $98              $537              -48%

Breakpoint Finding

Breakeven between Lambda and Fargate occurs at ~45 sustained requests/minute. For workloads spending >60% of time above this level, containers typically save 40–50%.
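
The breakpoint can be read as the sustained rate at which the fully loaded per-request cost on Lambda (compute, request fee, plus its share of log and metric ingestion) matches the per-minute cost of the smallest always-on container baseline. A minimal sketch of that crossover, with every input left as a parameter because the concrete figures come from the trace replay:

    def breakeven_rpm(lambda_cost_per_request: float,
                      baseline_cost_per_minute: float) -> float:
        """Sustained requests/min above which an always-on baseline
        (Fargate or EC2) is cheaper than paying Lambda per request.

        lambda_cost_per_request should be fully loaded: compute,
        request fee, and per-request observability overhead.
        """
        return baseline_cost_per_minute / lambda_cost_per_request

    def share_above_breakeven(minutely_rpm: list[float], breakeven: float) -> float:
        """Fraction of the observation window spent above the breakpoint;
        the study recommends containers when this exceeds ~0.6."""
        return sum(r >= breakeven for r in minutely_rpm) / len(minutely_rpm)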

Latency Impact

  • Lambda cold starts: 12% of requests (avg. +340ms)
  • Fargate: consistent p95 < 250ms
  • EC2: consistent p95 < 210ms
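
With 12% of requests hitting a cold start, the 95th percentile necessarily falls inside the cold-start population, which is why Lambda's p95 drifts well above its warm-path latency. A quick simulation makes this visible; the warm-latency distribution below is an assumption (the raw trace isn't published), while the 12% rate and +340ms penalty are the figures above.

    import random

    random.seed(42)

    def sample_request_ms() -> float:
        """Assumed warm latency around the ~180ms p50, plus the measured
        cold-start penalty applied to 12% of requests."""
        warm = random.gauss(mu=180, sigma=25)
        penalty = 340 if random.random() < 0.12 else 0.0
        return warm + penalty

    samples = sorted(sample_request_ms() for _ in range(100_000))
    p95 = samples[int(0.95 * len(samples))]
    print(f"simulated Lambda p95: {p95:.0f} ms")  # dominated by cold starts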

Sensitivity and Operational Considerations

  • The breakpoint shifts with response time and memory footprint (the sketch below shows the rough scaling).
  • Observability spend can erase expected savings if left unconstrained.
  • Fargate offers a strong middle ground when you want fewer operational commitments than EC2 requires.
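
Because Lambda's per-request compute cost scales roughly with duration × memory, the breakpoint scales roughly with the inverse of that product. The sketch below anchors on the measured ~45 req/min reference point; it is an approximation that ignores the fixed request fee and the observability share.

    # Reference point from the trace replay: ~45 req/min at ~180ms p50 and 1GB.
    REF_RPM, REF_DURATION_MS, REF_MEMORY_GB = 45.0, 180.0, 1.0

    def scaled_breakeven_rpm(duration_ms: float, memory_gb: float) -> float:
        """Rough scaling of the breakpoint when per-request compute work
        (duration x memory) changes; ignores fixed fees and log costs."""
        ref_work = REF_DURATION_MS * REF_MEMORY_GB
        return REF_RPM * ref_work / (duration_ms * memory_gb)

    # A slower, heavier handler reaches the container breakpoint much sooner.
    print(scaled_breakeven_rpm(duration_ms=400, memory_gb=2.0))   # ~10 req/min
    print(scaled_breakeven_rpm(duration_ms=120, memory_gb=0.5))   # ~135 req/min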

Deployment Notes

  • Start with a trace replay: guessing from averages is unreliable.
  • Instrument per-route duration and memory; use these features in the decision tree (sketched after this list).
  • Separate “always-on” components from bursty paths to mix serverless and containers.
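
The decision tree itself is part of the licensed tooling, so the version below is a simplified, hypothetical sketch of how per-route features could map to a recommendation; the ~45 req/min and 60% thresholds come from the results above, everything else is illustrative.

    from dataclasses import dataclass

    @dataclass
    class RouteProfile:
        # Per-route features from instrumentation (field names are illustrative).
        sustained_rpm: float             # typical sustained requests/min
        share_above_breakeven: float     # fraction of time above ~45 req/min
        p50_duration_ms: float
        memory_mb: int

    def recommend(route: RouteProfile) -> str:
        """Simplified, hypothetical decision tree for a single route."""
        if route.share_above_breakeven < 0.6 and route.sustained_rpm < 45:
            return "Lambda"        # bursty, mostly below the breakpoint
        if route.p50_duration_ms < 500 and route.memory_mb <= 2048:
            return "Fargate"       # steady traffic, modest ops commitment
        return "EC2 + ALB"         # heavy sustained load; baseline capacity pays off

    print(recommend(RouteProfile(120, 0.8, 180, 1024)))   # Fargate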

Commercial Application

This work translates into:

  • predictable infrastructure planning with explainable breakpoints
  • lower cost and improved p95 stability
  • migration plans that are justified by workload evidence rather than opinion

Licensable Outcomes

  1. Cost modeling framework (open-source): replay production traces against pricing APIs
  2. Decision tree implementation: maps workload characteristics to recommended architecture
  3. Migration playbook: 47-page guide covering IaC templates, monitoring setup, rollback procedures

Limitations and Next Work

  • Results depend on trace representativeness; revisit the model quarterly.
  • Future work: include queueing effects and downstream service costs in the replay.

Evaluation Date: October 2025
Status: Framework licensed to 3 SaaS companies, migration playbook under NDA with 1 enterprise client