Autoscaling in Kubernetes: Understanding Vertical Pod Autoscaler (VPA)

Published in

FAUN — Developer Community 🐾

17 min read · Aug 10, 2023

Historically, companies scaled their servers and allocated resources according to predictable usage patterns on fixed daily, weekly, or yearly cycles. For example, administrators would scale up the office’s internal systems during business hours, and a retailer might increase server capacity ahead of holiday sales spikes. However, in an era where a viral TikTok post or breaking news can shift demand immediately and without warning, usage is far less predictable, and manual scaling can no longer keep up.

To address this and avoid the poor performance or wasted spend caused by under- or over-provisioning, cloud service providers and Kubernetes introduced functionality that automatically scales resources up or down based on real-world metrics such as load and traffic.

Kubernetes offers three main autoscalers:

Vertical Pod Autoscaler (VPA)
Horizontal Pod Autoscaler (HPA)
Cluster Autoscaler (CA)

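Each autoscaler targets a different layer of the cluster: VPA adjusts the CPU and memory requests of individual pods, HPA adjusts the number of pod replicas, and CA adjusts the number of nodes. In this series of articles, I will explore each of them in more depth, starting with the Vertical Pod Autoscaler.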
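To make the first of these concrete, here is a minimal sketch of what a VPA object looks like, built as a Python dictionary and printed as JSON (which `kubectl apply -f -` accepts alongside YAML). The names `web-vpa` and `web` are hypothetical placeholders, the helper `build_vpa_manifest` is my own illustration rather than part of any library, and the sketch assumes the VPA components are already installed in the cluster, since VPA is an add-on and not built into Kubernetes.

```python
import json


def build_vpa_manifest(name, deployment, update_mode="Off"):
    """Return a VerticalPodAutoscaler manifest as a plain dict.

    name and deployment are hypothetical example values supplied by the caller.
    update_mode "Off" asks VPA to only publish recommendations, without
    changing running pods; other accepted values are "Initial", "Recreate",
    and "Auto".
    """
    return {
        "apiVersion": "autoscaling.k8s.io/v1",
        "kind": "VerticalPodAutoscaler",
        "metadata": {"name": name},
        "spec": {
            # The workload whose pods VPA should observe.
            "targetRef": {
                "apiVersion": "apps/v1",
                "kind": "Deployment",
                "name": deployment,
            },
            "updatePolicy": {"updateMode": update_mode},
        },
    }


# Hypothetical names, chosen only for illustration.
manifest = build_vpa_manifest("web-vpa", "web")
print(json.dumps(manifest, indent=2))
```

Starting in `Off` mode is a cautious choice for a first experiment: the recommendations can be inspected before VPA is allowed to restart any pods.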

Written by Ivan (이반) Porta
