
Kubernetes cost optimisation — 12 levers we pull first

Most Kubernetes clusters we inherit are over-provisioned by roughly 30–40% relative to what their workloads actually need. Here are the first checks we run on every engagement.

Moksh Parikh

Dec 2025 · 9 min read

Cloud bills have a way of growing quietly. Kubernetes makes it worse because the abstraction layer makes it easy to over-provision 'just to be safe' — and that safety buffer accumulates across every deployment, every namespace, every team. When we audit a new client's cluster, the median finding is 35–45% resource over-provisioning.

Here are the first twelve checks we run, in order of typical impact.

Resource requests and limits

1. Unset requests: Pods without resource requests get scheduled as if they need nothing. The scheduler cannot make good placement decisions. Set requests on every container.
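2. Over-specified limits: Limits set at 10x requests are common. The scheduler places pods by requests, not limits, so a gap that wide lets nodes become heavily overcommitted and makes packing decisions unreliable. Benchmark your actual P99 usage and set limits at 1.5–2x requests.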
3. Limit ranges: Use LimitRange objects per namespace to enforce sensible defaults for teams that haven't specified requests.
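As a rough illustration of item 2, the sketch below derives a CPU request and limit from observed usage samples. It assumes the request is set to the observed P99 and the limit to 1.5x the request; the function names, the sample data, and the choice of a nearest-rank percentile are ours for illustration, not a prescribed method.

```python
import math


def percentile_nearest_rank(samples, pct):
    """Return the pct-th percentile (integer pct) using the nearest-rank method."""
    if not samples:
        raise ValueError("need at least one usage sample")
    ordered = sorted(samples)
    # Integer ceiling of pct * n / 100, avoiding floating-point rounding.
    rank = max(1, (pct * len(ordered) + 99) // 100)
    return ordered[rank - 1]


def suggest_cpu_resources(usage_millicores, limit_factor=1.5):
    """Suggest a CPU request and limit in millicores.

    Assumption: request = observed P99, limit = limit_factor x request.
    """
    request = percentile_nearest_rank(usage_millicores, 99)
    limit = math.ceil(request * limit_factor)
    return {"request_m": request, "limit_m": limit}


if __name__ == "__main__":
    # Hypothetical per-minute CPU readings for one container, in millicores.
    samples = [120, 135, 110, 400, 150, 140, 125, 130, 145, 138]
    print(suggest_cpu_resources(samples))
```

With only a handful of samples the P99 collapses to the maximum, so in practice the input should cover at least a full traffic cycle before the numbers are trusted.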

Node and cluster sizing

4. Node type mismatch: Many clusters run general-purpose nodes for workloads that would be 40% cheaper on compute-optimised or memory-optimised instances. Profile your workload mix before choosing node types.
5. Cluster autoscaler tuning: Default scale-down delays are conservative; in the upstream cluster autoscaler, `--scale-down-unneeded-time` and `--scale-down-delay-after-add` both default to 10 minutes. If your workload is bursty, tighten both.
6. Spot/preemptible nodes for tolerant workloads: Batch jobs, ML training, and dev environments rarely need on-demand instances. Move them to spot with appropriate tolerations and PodDisruptionBudgets.
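One simple way to start profiling the workload mix from item 4 is to compare the aggregate memory-to-CPU ratio of pod requests against the ratio each node family offers. The sketch below does that. The ratios in `NODE_FAMILIES` are illustrative assumptions (real instance families vary by provider), and the pod dictionary shape is hypothetical.

```python
# Illustrative GiB of memory per vCPU; check your provider's actual instance shapes.
NODE_FAMILIES = {
    "compute-optimised": 2.0,
    "general-purpose": 4.0,
    "memory-optimised": 8.0,
}


def workload_memory_per_cpu(pods):
    """Return aggregate requested GiB of memory per requested vCPU."""
    total_cpu = sum(p["cpu_cores"] for p in pods)
    total_mem = sum(p["memory_gib"] for p in pods)
    if total_cpu <= 0:
        raise ValueError("total CPU requests must be positive")
    return total_mem / total_cpu


def closest_node_family(pods, families=NODE_FAMILIES):
    """Pick the family whose memory-per-vCPU ratio is nearest the workload's."""
    ratio = workload_memory_per_cpu(pods)
    return min(families, key=lambda name: abs(families[name] - ratio))


if __name__ == "__main__":
    # Hypothetical request totals for two deployments.
    pods = [
        {"cpu_cores": 2, "memory_gib": 2},
        {"cpu_cores": 4, "memory_gib": 6},
    ]
    print(workload_memory_per_cpu(pods), closest_node_family(pods))
```

This only looks at requested ratios. It says nothing about price, and it is only as good as the requests themselves, so it is best run after the requests have been right-sized.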

Idle and zombie resources

7. Idle namespaces: Dev and staging namespaces often run 24/7. Schedule them to scale to zero overnight and on weekends.
8. Unused PersistentVolumes: PVCs that are no longer mounted by any pod still incur storage costs. Run a periodic audit against `kubectl get pvc --all-namespaces` and cross-reference with running pods.
9. Old images in registries: Container registry costs are minor but often overlooked. Set lifecycle policies to expire images older than 90 days that aren't tagged as release versions.
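The PVC audit in item 8 can be sketched as a cross-reference between two JSON listings. The code below assumes its inputs are the parsed output of `kubectl get pvc --all-namespaces -o json` and `kubectl get pods --all-namespaces -o json`. Treating both Pending and Running pods as "in use" is our assumption, and the function names are illustrative.

```python
# Assumption: a claim counts as in use if a Pending or Running pod references it.
ACTIVE_PHASES = {"Pending", "Running"}


def claims_in_use(pod_list):
    """Collect (namespace, claim name) pairs referenced by active pods."""
    used = set()
    for pod in pod_list.get("items", []):
        if pod.get("status", {}).get("phase") not in ACTIVE_PHASES:
            continue
        namespace = pod["metadata"]["namespace"]
        for volume in pod.get("spec", {}).get("volumes") or []:
            claim = volume.get("persistentVolumeClaim")
            if claim:
                used.add((namespace, claim["claimName"]))
    return used


def unmounted_pvcs(pvc_list, pod_list):
    """Return sorted (namespace, name) pairs for PVCs no active pod mounts."""
    used = claims_in_use(pod_list)
    candidates = []
    for pvc in pvc_list.get("items", []):
        key = (pvc["metadata"]["namespace"], pvc["metadata"]["name"])
        if key not in used:
            candidates.append(key)
    return sorted(candidates)


if __name__ == "__main__":
    # Hypothetical, heavily trimmed listings.
    pvcs = {"items": [
        {"metadata": {"namespace": "dev", "name": "db-data"}},
        {"metadata": {"namespace": "dev", "name": "old-cache"}},
    ]}
    pods = {"items": [{
        "metadata": {"namespace": "dev", "name": "db-0"},
        "status": {"phase": "Running"},
        "spec": {"volumes": [{"persistentVolumeClaim": {"claimName": "db-data"}}]},
    }]}
    print(unmounted_pvcs(pvcs, pods))
```

The result is a list of candidates to review, not a delete list: a claim may belong to a workload that is scaled to zero on purpose.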

Network and data transfer

10. Cross-AZ traffic: Traffic between availability zones incurs data transfer costs. Co-locate pods that communicate frequently using topology spread constraints or affinity rules.
11. Egress optimisation: Review what data is leaving your cluster and to where. NAT gateway costs are frequently the largest surprise on a cloud bill.
12. Service mesh overhead: If you're running a service mesh, profile the sidecar proxy overhead. Envoy sidecars can consume 50–100MB of memory per pod — on a 200-pod cluster, that's 10–20GB of reserved memory for networking alone.
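The sidecar arithmetic in item 12 is easy to rerun for your own pod count. A minimal sketch, assuming one sidecar per pod and the 50–100MB range quoted above, with 1GB taken as 1000MB:

```python
def sidecar_memory_gb(pod_count, per_pod_mb_low=50, per_pod_mb_high=100):
    """Return (low, high) total sidecar memory in GB, assuming one sidecar per pod."""
    low = pod_count * per_pod_mb_low / 1000
    high = pod_count * per_pod_mb_high / 1000
    return (low, high)


if __name__ == "__main__":
    low, high = sidecar_memory_gb(200)
    print(f"200 pods: {low:g}-{high:g} GB reserved for sidecars")
```

Replacing the defaults with per-pod figures measured in your own cluster gives a more useful number than the generic range.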

Kubernetes cost optimisation is not a one-time project. Build it into your quarterly engineering review — clusters drift toward over-provisioning naturally as teams make 'safe' changes.

The highest-leverage intervention is almost always right-sizing resource requests based on actual observed usage. Tools like Goldilocks, VPA in recommendation mode, or Kubecost's right-sizing recommendations can surface the changes needed in an afternoon. The manual work is reviewing and applying them — but for most clusters, that work pays back within the first billing cycle.
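To decide where to start the review, it can help to rank workloads by how much they would free up if requests matched the recommendation. The sketch below does this for CPU. The input shape is hypothetical and is not the output format of Goldilocks, VPA, or Kubecost; you would map each tool's recommendations into it yourself.

```python
def rank_by_reclaimable_cpu(workloads):
    """Sort workloads by millicores freed if requests matched the recommendation.

    Each workload is a dict with hypothetical keys:
    name, requested_m, recommended_m (per replica), replicas.
    """
    ranked = []
    for w in workloads:
        per_replica = w["requested_m"] - w["recommended_m"]
        ranked.append({
            "name": w["name"],
            "reclaimable_m": per_replica * w["replicas"],
        })
    # Largest savings first; negative values mean the workload is under-requested.
    ranked.sort(key=lambda r: r["reclaimable_m"], reverse=True)
    return ranked


if __name__ == "__main__":
    # Hypothetical recommendations for three deployments.
    workloads = [
        {"name": "worker", "requested_m": 500, "recommended_m": 450, "replicas": 10},
        {"name": "api", "requested_m": 1000, "recommended_m": 400, "replicas": 3},
        {"name": "cache", "requested_m": 200, "recommended_m": 300, "replicas": 2},
    ]
    for row in rank_by_reclaimable_cpu(workloads):
        print(row)
```

Workloads that come out negative are worth a look too, since they are requesting less than they appear to use.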
