Auto Scaling and Elasticity
Elasticity is the ability to scale resources up or down dynamically as demand changes. Auto scaling puts it into practice by adjusting capacity automatically, so your applications have the resources they need at any given time: no more, no less.
Auto Scaling Flow
┌──────────────────────────────────────────────────────┐
│                AUTO SCALING WORKFLOW                 │
│                                                      │
│  ┌──────────┐    ┌──────────┐    ┌──────────┐        │
│  │CloudWatch│───▶│ Scaling  │───▶│ Launch/  │        │
│  │  Alarm   │    │  Policy  │    │ Terminate│        │
│  └──────────┘    └──────────┘    │ Instance │        │
│       ▲                          └──────────┘        │
│       │                                │             │
│       │          ┌──────────┐          │             │
│       └──────────┤ Metrics  ├──────────┘             │
│                  │ (CPU,    │                        │
│                  │ memory,  │  New instances join    │
│                  │ requests,│  the load balancer     │
│                  │ etc.)    │                        │
│                  └──────────┘                        │
└──────────────────────────────────────────────────────┘
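The loop in the workflow above can be sketched as a small simulation. This is only an illustration, not how AWS implements it: the `Group` class, the 70%/30% thresholds, and the one-instance adjustment are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Group:
    """Hypothetical stand-in for an Auto Scaling group."""
    instances: int
    min_size: int
    max_size: int


def alarm_state(avg_cpu: float, high: float = 70.0, low: float = 30.0) -> str:
    """Turn a metric reading into an alarm state (thresholds are assumed)."""
    if avg_cpu > high:
        return "HIGH"
    if avg_cpu < low:
        return "LOW"
    return "OK"


def apply_policy(group: Group, state: str) -> int:
    """Map the alarm state to a capacity change, clamped to the group limits."""
    change = {"HIGH": 1, "LOW": -1, "OK": 0}[state]
    group.instances = max(group.min_size, min(group.max_size, group.instances + change))
    return group.instances


def simulate(total_load: list[float], group: Group) -> list[int]:
    """Run the metric -> alarm -> policy loop once per load sample."""
    history = []
    for load in total_load:
        avg_cpu = load / group.instances  # load is spread evenly across instances
        history.append(apply_policy(group, alarm_state(avg_cpu)))
    return history
```

Starting from two instances, `simulate([240, 240, 240, 50, 50], Group(2, 1, 4))` grows the group while the average CPU stays above the high threshold and shrinks it again once the load drops.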
Scaling Policies
Target Tracking: Maintain a target metric (e.g., keep CPU at 50%). AWS adjusts capacity automatically.
Step Scaling: Scale based on CloudWatch alarms with step adjustments (e.g., add 2 instances at 60% CPU, add 4 at 80%).
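Simple Scaling: Add or remove a fixed number of instances based on a single alarm. Least recommended, because the policy must wait for the cooldown period to expire before it can respond to further alarms.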
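The difference between the three policy types is easiest to see as arithmetic. The functions below are a rough sketch, not the formulas AWS uses internally: the target tracking version assumes the metric falls in proportion as instances are added, and the step thresholds are taken from the example in the text.

```python
import math


def target_tracking_capacity(current: int, metric: float, target: float) -> int:
    """Size the group so the average metric lands near the target.

    Assumes the metric scales inversely with instance count (a simplification).
    """
    return max(1, math.ceil(current * metric / target))


def step_adjustment(cpu: float) -> int:
    """Step scaling with the thresholds from the text: +2 at 60% CPU, +4 at 80%."""
    if cpu >= 80:
        return 4
    if cpu >= 60:
        return 2
    return 0


def simple_adjustment(alarm_firing: bool, fixed_change: int = 1) -> int:
    """Simple scaling: one alarm, one fixed change."""
    return fixed_change if alarm_firing else 0
```

With four instances averaging 75% CPU and a 50% target, `target_tracking_capacity(4, 75.0, 50.0)` asks for six instances, whereas step scaling at the same reading would add two regardless of the group's current size.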
Scaling Strategies
┌──────────────────┬──────────────────────────────────┐
│ Strategy         │ When to Use                      │
├──────────────────┼──────────────────────────────────┤
│ Reactive         │ Scale after demand increases     │
│                  │ (catch up with traffic)          │
├──────────────────┼──────────────────────────────────┤
│ Scheduled        │ Predictable traffic patterns     │
│                  │ (e.g., scale up at 9am)          │
├──────────────────┼──────────────────────────────────┤
│ Predictive       │ ML-based forecasting             │
│                  │ (AWS Auto Scaling)               │
├──────────────────┼──────────────────────────────────┤
│ Step             │ Fine-grained multi-step scaling  │
│                  │ (complex scaling needs)          │
└──────────────────┴──────────────────────────────────┘
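Scheduled and reactive scaling are often used together. A minimal sketch of one way to combine them follows; the `WEEKDAY` schedule and the choice to treat the scheduled value as a floor are assumptions for illustration, not a prescribed AWS behavior.

```python
def scheduled_capacity(hour: int, schedule: dict[int, int]) -> int:
    """Capacity set by the most recent scheduled action at or before `hour`."""
    past = [h for h in schedule if h <= hour]
    # Before the first action of the day, yesterday's last action still applies.
    latest = max(past) if past else max(schedule)
    return schedule[latest]


def desired_capacity(hour: int, schedule: dict[int, int], reactive_need: int) -> int:
    """Treat the schedule as a floor and let reactive scaling add on top."""
    return max(scheduled_capacity(hour, schedule), reactive_need)


# Hypothetical weekday schedule: 10 instances from 09:00, 3 from 18:00.
WEEKDAY = {9: 10, 18: 3}
```

Here the schedule handles the predictable 9am ramp ahead of time, while `reactive_need` (whatever a reactive policy currently asks for) covers unexpected traffic outside business hours.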
Best Practices
Set appropriate minimum and maximum instance counts.
Use warm-up periods to avoid thrashing.
Design applications to handle being started and stopped.
Use lifecycle hooks for graceful shutdown.
Combine with load balancers for seamless scaling.
Test scaling behavior under load.
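Two of these practices, capacity limits and thrash avoidance, can be illustrated together. The `CooldownScaler` class below is hypothetical and uses a single cooldown timer as a stand-in for warm-up and cooldown settings; real services track these separately.

```python
from dataclasses import dataclass


@dataclass
class CooldownScaler:
    """Hypothetical scaler that enforces min/max limits and a cooldown."""
    capacity: int
    min_size: int
    max_size: int
    cooldown_s: float
    last_change_at: float = float("-inf")

    def request(self, change: int, now: float) -> int:
        """Apply `change` unless a recent change is still settling."""
        if change == 0 or now - self.last_change_at < self.cooldown_s:
            return self.capacity  # ignore requests during the cooldown
        new_capacity = max(self.min_size, min(self.max_size, self.capacity + change))
        if new_capacity != self.capacity:
            self.capacity = new_capacity
            self.last_change_at = now
        return self.capacity
```

With a 300-second cooldown, a scale-in request that arrives 100 seconds after a scale-out is ignored, which prevents the group from oscillating while new instances are still warming up. Requests that would push capacity past `min_size` or `max_size` are clamped to those limits.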