Autoscaling
The application can be scaled manually by setting the number of replicas in the configuration. Additionally, the Helm chart includes a Horizontal Pod Autoscaler (HPA) that dynamically adjusts the number of replicas based on workload. The HPA configuration allows defining:
- Minimum and maximum number of replicas (minReplicas, maxReplicas)
- Target CPU and memory utilization (targetCPU, targetMemory)
- Scaling behavior, such as stabilization windows and scaling policies

Example HPA configuration in the Helm chart:

hpa:
  enabled: true
  minReplicas: 2
  maxReplicas: 10
  targetCPU: 80
  targetMemory: 75

- enabled – Enables or disables autoscaling.
- minReplicas – Minimum number of replicas; the HPA never scales the application below this limit.
- maxReplicas – Maximum number of replicas; the HPA never scales the application above this limit.
- targetCPU – Target average CPU utilization (in percent); the HPA adds or removes replicas to keep utilization near this value.
- targetMemory – Target average memory utilization (in percent); the HPA adds or removes replicas to keep utilization near this value.
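
For orientation, values like these typically end up in a Kubernetes HorizontalPodAutoscaler object. The sketch below shows what an autoscaling/v2 manifest equivalent to the example could look like. The resource name my-app and the mapping from targetCPU and targetMemory to averageUtilization are assumptions, since the chart's own template is not shown here.

```yaml
# Sketch only: names and the values-to-manifest mapping are assumptions.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app                     # hypothetical release name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app                   # hypothetical Deployment managed by the chart
  minReplicas: 2                   # from hpa.minReplicas
  maxReplicas: 10                  # from hpa.maxReplicas
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 80   # assumed to come from hpa.targetCPU
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 75   # assumed to come from hpa.targetMemory
```

Utilization targets are measured relative to the containers' resource requests, so the pods need CPU and memory requests set for these metrics to work. When several metrics are configured, the HPA computes a replica count for each and uses the largest.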
Additionally, HPA configuration allows defining scaling behavior for both scaling up (scaleUp) and scaling down (scaleDown).
behavior:
  scaleUp:
    stabilizationWindowSeconds: 120 # Time (in seconds) for which past recommendations are considered before scaling up
    selectPolicy: Max # Policy selection method ("Max", "Min", "Disabled")
    policies: [] # Scaling policies for scale up
  scaleDown:
    stabilizationWindowSeconds: 300 # Time (in seconds) for which past recommendations are considered before scaling down
    selectPolicy: Max # Policy selection method ("Max", "Min", "Disabled")
    policies:
      - type: Pods # Policy type (Pods - absolute number of replicas, Percent - percentage of current replicas)
        value: 1 # Maximum number of replicas removed within one period
        periodSeconds: 300 # Length of the period (in seconds) over which the policy limit applies

If the application has specific scaling requirements, a custom HPA definition can be added to the chart by modifying values.yaml or creating a separate Kubernetes HPA manifest.
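
As one illustration of adapting the behavior to specific scaling requirements, the fragment below fills in the scaleUp policies list, which is empty in the example above. It is a sketch that assumes the chart passes the behavior block through to the Kubernetes HPA unchanged; the numbers are arbitrary placeholders, not recommendations.

```yaml
# Sketch only: assumes the chart copies this block into the HPA spec as-is.
behavior:
  scaleUp:
    stabilizationWindowSeconds: 0   # act on load increases without a stabilization delay
    selectPolicy: Max               # use whichever policy allows the larger change
    policies:
      - type: Percent
        value: 100                  # add at most 100% of the current replicas...
        periodSeconds: 60           # ...within any 60-second period
      - type: Pods
        value: 4                    # or add at most 4 replicas per 60 seconds
        periodSeconds: 60
```

With selectPolicy set to Max, the HPA evaluates both policies and applies the one that permits the larger change, so a small deployment can grow by up to 4 replicas per minute while a large one can double. Setting selectPolicy to Disabled turns off scaling in that direction entirely.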