
Cloud Infrastructure Cost Model - 34–48% Savings Validated

22 Oct 2025

Executive Summary

We evaluated serverless and container deployment options for highly variable API workloads (10–1000 req/min) and built a repeatable cost model to identify breakpoints.

Using a 6-week production trace replay, we validated that containers (Fargate/EC2) can cut total spend by 34–48% while improving p95 latency stability (reduced cold-start exposure).

Abstract

Architecture choices are often made from intuition rather than workload evidence. For variable traffic, cost and latency depend on the distribution of demand over time, not only average requests per minute.

We present a methodology that replays production traces against comparable AWS configurations and quantifies compute, data transfer, and observability costs. From the results we derive practical breakpoints and provide a decision tree that can be applied to new workloads.

Problem Statement

Given a workload with high variance, choose between:

  • AWS Lambda (pay-per-use, possible cold starts)
  • AWS Fargate (containerized, autoscaling)
  • EC2 + ALB (steady baseline capacity)

Target outcomes:

  • minimize total cost over representative time windows
  • maintain consistent p95 latency
  • keep operational complexity proportional to savings

Experimental Design

Test Environment

  • 3 identical API implementations (Node.js, ~180ms p50 response time)
  • Workload simulation: 6-week trace from a production ecommerce platform
  • Cost tracking: compute, data transfer, auxiliary services (logs, metrics)

Evaluated Configurations

  1. Lambda (1024MB, arm64)
  2. Fargate (0.5 vCPU, 1GB, autoscaling 2–20 tasks)
  3. EC2 t4g.small (2 instances, ALB)

Cost Model

We compute total cost as the sum of:

  • Compute runtime (or provisioned capacity)
  • Data transfer
  • Logs and metrics (often underestimated)

We replay the trace using identical application behavior assumptions and compare totals.
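
As a minimal sketch of how such a replay can be structured (the pricing constants, field names, and helper below are illustrative assumptions, not the production framework):

    from dataclasses import dataclass

    # Illustrative pricing constants (assumptions, not exact AWS list prices).
    LAMBDA_GB_SECOND = 0.0000133      # arm64 compute, per GB-second
    LAMBDA_PER_REQUEST = 0.0000002    # per-invocation request fee
    DATA_TRANSFER_PER_GB = 0.09       # egress
    LOGS_PER_GB_INGESTED = 0.50       # CloudWatch-style log ingestion

    @dataclass
    class TraceEvent:
        duration_ms: float     # measured request duration from the trace
        memory_gb: float       # configured function memory
        response_bytes: int    # payload size for the data-transfer estimate
        log_bytes: int         # log volume emitted by the request

    def lambda_cost(trace: list[TraceEvent]) -> dict[str, float]:
        """Replay a trace and total the three cost components for Lambda."""
        compute = sum(e.duration_ms / 1000 * e.memory_gb * LAMBDA_GB_SECOND
                      for e in trace) + len(trace) * LAMBDA_PER_REQUEST
        transfer = sum(e.response_bytes for e in trace) / 1e9 * DATA_TRANSFER_PER_GB
        logs = sum(e.log_bytes for e in trace) / 1e9 * LOGS_PER_GB_INGESTED
        return {"compute": compute, "data_transfer": transfer,
                "logs_metrics": logs, "total": compute + transfer + logs}

Equivalent functions for Fargate and EC2 price provisioned capacity per hour rather than per request, which is what produces the crossover analysed below.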

Results

Cost Breakdown (6-week period)

Configuration               Compute   Data transfer   Logs & metrics   Total (6 weeks)   Delta vs. Lambda
Lambda (1024MB, arm64)      $842      $67             $124             $1,033            baseline
Fargate (0.5 vCPU, 1GB)     $511      $58             $109             $678              -34%
EC2 (t4g.small + ALB)       $387      $52             $98              $537              -48%

Breakpoint Finding

Breakeven between Lambda and Fargate occurs at ~45 sustained requests/minute. For workloads spending >60% of time above this level, containers typically save 40–50%.
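
The breakpoint can be read as the sustained rate at which the fully loaded per-request cost on Lambda (compute, request fee, plus its share of log and metric ingestion) matches the per-minute cost of the smallest always-on container baseline. A minimal sketch of that crossover, with every input left as a parameter because the concrete figures come from the trace replay:

    def breakeven_rpm(lambda_cost_per_request: float,
                      baseline_cost_per_minute: float) -> float:
        """Sustained requests/min above which an always-on baseline
        (Fargate or EC2) is cheaper than paying Lambda per request.

        lambda_cost_per_request should be fully loaded: compute,
        request fee, and per-request observability overhead.
        """
        return baseline_cost_per_minute / lambda_cost_per_request

    def share_above_breakeven(minutely_rpm: list[float], breakeven: float) -> float:
        """Fraction of the observation window spent above the breakpoint;
        the study recommends containers when this exceeds ~0.6."""
        return sum(r >= breakeven for r in minutely_rpm) / len(minutely_rpm)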

Latency Impact

  • Lambda cold starts: 12% of requests (avg. +340ms)
  • Fargate: consistent p95 < 250ms
  • EC2: consistent p95 < 210ms
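
With 12% of requests hitting a cold start, the 95th percentile necessarily falls inside the cold-start population, which is why Lambda's p95 drifts well above its warm-path latency. A quick simulation makes this visible; the warm-latency distribution below is an assumption (the raw trace isn't published), while the 12% rate and +340ms penalty are the figures above.

    import random

    random.seed(42)

    def sample_request_ms() -> float:
        """Assumed warm latency around the ~180ms p50, plus the measured
        cold-start penalty applied to 12% of requests."""
        warm = random.gauss(mu=180, sigma=25)
        penalty = 340 if random.random() < 0.12 else 0.0
        return warm + penalty

    samples = sorted(sample_request_ms() for _ in range(100_000))
    p95 = samples[int(0.95 * len(samples))]
    print(f"simulated Lambda p95: {p95:.0f} ms")  # dominated by cold starts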

Sensitivity and Operational Considerations

  • The breakpoint shifts with response time and memory footprint (the sketch below shows the rough scaling).
  • Observability spend can erase expected savings if left unconstrained.
  • Fargate offers a strong middle ground when you want fewer operational commitments than EC2 requires.
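
Because Lambda's per-request compute cost scales roughly with duration × memory, the breakpoint scales roughly with the inverse of that product. The sketch below anchors on the measured ~45 req/min reference point; it is an approximation that ignores the fixed request fee and the observability share.

    # Reference point from the trace replay: ~45 req/min at ~180ms p50 and 1GB.
    REF_RPM, REF_DURATION_MS, REF_MEMORY_GB = 45.0, 180.0, 1.0

    def scaled_breakeven_rpm(duration_ms: float, memory_gb: float) -> float:
        """Rough scaling of the breakpoint when per-request compute work
        (duration x memory) changes; ignores fixed fees and log costs."""
        ref_work = REF_DURATION_MS * REF_MEMORY_GB
        return REF_RPM * ref_work / (duration_ms * memory_gb)

    # A slower, heavier handler reaches the container breakpoint much sooner.
    print(scaled_breakeven_rpm(duration_ms=400, memory_gb=2.0))   # ~10 req/min
    print(scaled_breakeven_rpm(duration_ms=120, memory_gb=0.5))   # ~135 req/min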

Deployment Notes

  • Start with a trace replay: guessing from averages is unreliable.
  • Instrument per-route duration and memory; use these features in the decision tree (sketched after this list).
  • Separate “always-on” components from bursty paths to mix serverless and containers.
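
The decision tree itself is part of the licensed tooling, so the version below is a simplified, hypothetical sketch of how per-route features could map to a recommendation; the ~45 req/min and 60% thresholds come from the results above, everything else is illustrative.

    from dataclasses import dataclass

    @dataclass
    class RouteProfile:
        # Per-route features from instrumentation (field names are illustrative).
        sustained_rpm: float             # typical sustained requests/min
        share_above_breakeven: float     # fraction of time above ~45 req/min
        p50_duration_ms: float
        memory_mb: int

    def recommend(route: RouteProfile) -> str:
        """Simplified, hypothetical decision tree for a single route."""
        if route.share_above_breakeven < 0.6 and route.sustained_rpm < 45:
            return "Lambda"        # bursty, mostly below the breakpoint
        if route.p50_duration_ms < 500 and route.memory_mb <= 2048:
            return "Fargate"       # steady traffic, modest ops commitment
        return "EC2 + ALB"         # heavy sustained load; baseline capacity pays off

    print(recommend(RouteProfile(120, 0.8, 180, 1024)))   # Fargate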

Commercial Application

This work translates into:

  • predictable infrastructure planning with explainable breakpoints
  • lower cost and improved p95 stability
  • migration plans that are justified by workload evidence rather than opinion

Licensable Outcomes

  1. Cost modeling framework (open-source): replay production traces against pricing APIs
  2. Decision tree implementation: maps workload characteristics to recommended architecture
  3. Migration playbook: 47-page guide covering IaC templates, monitoring setup, rollback procedures

Limitations and Next Work

  • Results depend on trace representativeness; revisit the model quarterly.
  • Future work: include queueing effects and downstream service costs in the replay.

Evaluation Date: October 2025
Status: Framework licensed to 3 SaaS companies, migration playbook under NDA with 1 enterprise client