AWS On-Demand pricing is the convenience tax. Reserved Instances, Spot Instances, and Savings Plans can cut your compute costs by roughly 30-90%, depending on the model, if you use them correctly. This guide covers when to use each option and how to implement them without operational pain.

Pricing Model Comparison

Model               Discount   Commitment     Flexibility   Best For
On-Demand           0%         None           Full          Unpredictable, short-term
Savings Plans       30-66%     1 or 3 years   High          Steady baseline compute
Reserved Instances  30-72%     1 or 3 years   Low           Specific instance types
Spot Instances      60-90%     None           Variable      Fault-tolerant workloads
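
To make the table concrete, here is a minimal sketch that estimates a blended monthly bill for a fleet split across pricing models. The $0.10/hour rate, the fleet split, and the discounts (picked from inside the ranges above) are illustrative assumptions, not AWS prices.

```python
HOURS_PER_MONTH = 730  # Common billing approximation (8760 hours / 12)


def blended_monthly_cost(on_demand_rate, instance_count, mix):
    """Estimate monthly cost for a fleet split across pricing models.

    mix maps a label to (share_of_fleet, discount_vs_on_demand).
    Shares must sum to 1.0.
    """
    total_share = sum(share for share, _ in mix.values())
    if abs(total_share - 1.0) > 1e-9:
        raise ValueError("fleet shares must sum to 1.0")

    full_price = on_demand_rate * instance_count * HOURS_PER_MONTH
    cost = sum(
        full_price * share * (1.0 - discount)
        for share, discount in mix.values()
    )
    return round(cost, 2)


# Hypothetical fleet: 10 instances at an assumed $0.10/hour On-Demand rate.
mix = {
    "savings_plan": (0.60, 0.40),  # 60% of fleet, assumed 40% discount
    "spot": (0.25, 0.70),          # 25% of fleet, assumed 70% discount
    "on_demand": (0.15, 0.00),     # 15% of fleet at full price
}

all_on_demand = blended_monthly_cost(0.10, 10, {"on_demand": (1.0, 0.0)})
blended = blended_monthly_cost(0.10, 10, mix)
savings_pct = round(100 * (1 - blended / all_on_demand), 1)
print(all_on_demand, blended, savings_pct)
```

With these assumed numbers the blended bill comes to about $427 against $730 all On-Demand. The point is the shape of the calculation, not the specific figures.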

Savings Plans

Compute Savings Plans

Most flexible — applies to any EC2, Fargate, or Lambda usage.

# Analyze current usage first (last 30 days; the date syntax assumes GNU date)
aws ce get-cost-and-usage \
  --time-period Start=$(date -u -d '30 days ago' +%F),End=$(date -u +%F) \
  --granularity MONTHLY \
  --metrics "UnblendedCost" "UsageQuantity" \
  --group-by Type=DIMENSION,Key=SERVICE \
  --filter '{"Dimensions": {"Key": "SERVICE", "Values": ["Amazon Elastic Compute Cloud - Compute"]}}'

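# Purchase via AWS Console or API
# 1. Find the offering ID for the plan type, term, and payment option
# aws savingsplans describe-savings-plans-offerings \
#   --plan-types "Compute" \
#   --durations 31536000 \
#   --payment-options "No Upfront"
#
# 2. Commit to an hourly amount (USD) against that offering
# aws savingsplans create-savings-plan \
#   --savings-plan-offering-id <offering-id> \
#   --commitment 10.00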

EC2 Instance Savings Plans

Locked to one instance family in one Region, but with deeper discounts (up to 72%) than Compute Savings Plans.

# Get recommendations
aws ce get-savings-plans-purchase-recommendation \
  --savings-plans-type "EC2_INSTANCE_SP" \
  --term-in-years "ONE_YEAR" \
  --payment-option "NO_UPFRONT" \
  --lookback-period-in-days "SIXTY_DAYS"

Savings Plans Strategy

# Recommended approach
savings_plan_strategy:
  # Cover 70-80% of steady-state with Compute Savings Plans
  compute_savings_plan:
    commitment: "$XX/hour based on baseline"
    term: "3-year for maximum discount"
    payment: "No Upfront (preserve cash) or All Upfront (maximum discount)"

  # Cover specific, stable workloads with EC2 Instance Savings Plans
  ec2_instance_savings_plan:
    commitment: "Remainder of baseline after Compute SP"
    families: ["m6i", "c6i"]  # Your most-used families
    term: "1-year (more flexibility)"

  # Leave 20-30% for On-Demand/Spot
  buffer: "Handles spikes, new workloads, experimentation"
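
A Savings Plan commitment is expressed in dollars per hour at the discounted rate, so turning the strategy above into a number takes two steps: pick a baseline from your usage history, then scale it by the coverage target and the discount. The sketch below assumes hourly On-Demand-equivalent spend samples as input. The 10th-percentile baseline, the 30% discount, and the sample data are illustrative assumptions.

```python
import math


def baseline_spend(hourly_on_demand_spend, percentile=10):
    """Return a low percentile of hourly spend as the steady-state baseline."""
    if not hourly_on_demand_spend:
        raise ValueError("need at least one hourly sample")
    ordered = sorted(hourly_on_demand_spend)
    # Nearest-rank percentile: 1-based rank into the sorted samples
    rank = max(1, math.ceil(percentile / 100 * len(ordered)))
    return ordered[rank - 1]


def hourly_commitment(hourly_on_demand_spend, coverage=0.75, discount=0.30, percentile=10):
    """Convert baseline On-Demand spend into a Savings Plan $/hour commitment."""
    baseline = baseline_spend(hourly_on_demand_spend, percentile)
    covered_on_demand = baseline * coverage  # Leave the rest for On-Demand/Spot
    # The commitment is billed at Savings Plan rates, so apply the discount
    return round(covered_on_demand * (1.0 - discount), 2)


# Hypothetical hourly On-Demand-equivalent spend in USD
samples = [12.0, 12.5, 13.0, 14.0, 15.0, 18.0, 22.0, 30.0, 16.0, 13.5]
commitment = hourly_commitment(samples)
print(commitment)
```

Using a low percentile instead of the average keeps short spikes from inflating the commitment; the spikes fall into the On-Demand/Spot buffer instead.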

Reserved Instances

When to Use RIs Over Savings Plans

Use Reserved Instances when:
- You need a capacity reservation (zonal RIs)
- You run RDS, ElastiCache, Redshift, or OpenSearch
- You want the option to resell on the Reserved Instance Marketplace (Standard EC2 RIs only)

Use Savings Plans when:
- Flexibility across instance types matters
- You use Lambda or Fargate alongside EC2
- You prefer simpler management
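
Whichever commitment you pick, it bills for every hour of the term whether or not the resource is running, so it only pays off above a break-even utilization. A small sketch of that arithmetic follows; the function names and the 40% discount are hypothetical.

```python
def break_even_utilization(discount):
    """Fraction of the term a resource must run for a commitment to beat On-Demand.

    Committed cost  = rate * (1 - discount) * hours_in_term
    On-Demand cost  = rate * utilization * hours_in_term
    The two are equal when utilization == 1 - discount.
    """
    if not 0.0 <= discount < 1.0:
        raise ValueError("discount must be in [0, 1)")
    return 1.0 - discount


def commitment_saves_money(expected_utilization, discount):
    """True when the committed price is lower than paying On-Demand as you go."""
    return expected_utilization > break_even_utilization(discount)


# Assumed 40% discount: the resource must run more than 60% of the term
print(break_even_utilization(0.40))
print(commitment_saves_money(0.50, 0.40))  # e.g. a dev database up half the time
print(commitment_saves_money(0.90, 0.40))  # e.g. a production database
```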

RDS Reserved Instances

# List available RDS RI offerings (duration is in seconds; 31536000 = 1 year)
aws rds describe-reserved-db-instances-offerings \
  --db-instance-class db.r6g.large \
  --product-description postgresql \
  --duration 31536000 \
  --offering-type "No Upfront"

# Purchase RDS RI
aws rds purchase-reserved-db-instances-offering \
  --reserved-db-instances-offering-id <offering-id> \
  --db-instance-count 2

Terraform for RDS RI Monitoring

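# Alert when RDS RI coverage drops, for example after a reservation expires.
# CloudWatch does not publish a "days until RI expiration" metric, so this
# uses an AWS Budgets RI coverage budget instead.
resource "aws_budgets_budget" "rds_ri_coverage" {
  name         = "rds-ri-coverage"
  budget_type  = "RI_COVERAGE"
  limit_amount = "100.0"
  limit_unit   = "PERCENTAGE"
  time_unit    = "MONTHLY"

  # RI budgets require a Service cost filter
  cost_filter {
    name   = "Service"
    values = ["Amazon Relational Database Service"]
  }

  notification {
    comparison_operator       = "LESS_THAN"
    threshold                 = 80  # Alert below 80% coverage; tune to your baseline
    threshold_type            = "PERCENTAGE"
    notification_type         = "ACTUAL"
    subscriber_sns_topic_arns = [aws_sns_topic.finops.arn]
  }
}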

Spot Instances

Spot Best Practices

spot_best_practices:
  # Diversify across instance types and AZs
  instance_pools:
    - family: "m6i"
      sizes: ["large", "xlarge", "2xlarge"]
    - family: "m5"
      sizes: ["large", "xlarge", "2xlarge"]
    - family: "m5a"
      sizes: ["large", "xlarge", "2xlarge"]

  # Use allocation strategies
  allocation_strategy: "capacity-optimized"  # or "price-capacity-optimized"; "lowest-price" raises interruption risk

  # Handle interruptions gracefully
  interruption_handling:
    - Use instance metadata service to detect interruptions
    - Implement graceful shutdown (2-minute warning)
    - Use EBS for stateful data
    - Design for fault tolerance
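
The interruption-handling items above can be sketched in Python as follows. The notice is served as a small JSON document at the `spot/instance-action` metadata path and returns 404 until an interruption is scheduled. The function names and the sample payload are illustrative; the fetch helper only works when run on an EC2 instance, so the demo at the bottom parses a sample notice instead.

```python
import json
import urllib.error
import urllib.request
from datetime import datetime, timezone

IMDS = "http://169.254.169.254/latest"


def fetch_instance_action(timeout=2):
    """Return the raw interruption notice via IMDSv2, or None if none is scheduled."""
    token_request = urllib.request.Request(
        f"{IMDS}/api/token",
        method="PUT",
        headers={"X-aws-ec2-metadata-token-ttl-seconds": "60"},
    )
    with urllib.request.urlopen(token_request, timeout=timeout) as response:
        token = response.read().decode()

    action_request = urllib.request.Request(
        f"{IMDS}/meta-data/spot/instance-action",
        headers={"X-aws-ec2-metadata-token": token},
    )
    try:
        with urllib.request.urlopen(action_request, timeout=timeout) as response:
            return response.read().decode()
    except urllib.error.HTTPError as error:
        if error.code == 404:  # No interruption scheduled
            return None
        raise


def parse_instance_action(body, now=None):
    """Return (action, seconds_remaining) from an interruption notice."""
    notice = json.loads(body)
    deadline = datetime.strptime(notice["time"], "%Y-%m-%dT%H:%M:%SZ")
    deadline = deadline.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return notice["action"], max(0.0, (deadline - now).total_seconds())


# Sample notice; "now" is pinned so the result is reproducible
sample = '{"action": "terminate", "time": "2017-09-18T08:22:00Z"}'
action, seconds_left = parse_instance_action(
    sample, now=datetime(2017, 9, 18, 8, 20, tzinfo=timezone.utc)
)
print(action, seconds_left)
```

Knowing the seconds remaining lets a drain routine decide how much in-flight work it can still finish before shutting down.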

Auto Scaling Group with Mixed Instances

resource "aws_launch_template" "app" {
  name_prefix   = "app-"
  image_id      = data.aws_ami.amazon_linux.id
  instance_type = "m6i.large"

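  user_data = base64encode(<<-EOF
    #!/bin/bash
    # Poll IMDSv2 in the background for a Spot interruption notice.
    # The endpoint returns 404 until a stop, hibernate, or terminate is scheduled.
    IMDS="http://169.254.169.254/latest"
    while true; do
      TOKEN=$(curl -s -X PUT "$IMDS/api/token" -H "X-aws-ec2-metadata-token-ttl-seconds: 60")
      if curl -sf -H "X-aws-ec2-metadata-token: $TOKEN" "$IMDS/meta-data/spot/instance-action" > /dev/null; then
        echo "Spot interruption detected, draining..."
        /usr/local/bin/drain-and-shutdown.sh
        break
      fi
      sleep 5
    done &

    # Start application
    /usr/local/bin/start-app.sh
  EOF
  )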

  iam_instance_profile {
    arn = aws_iam_instance_profile.app.arn
  }

  network_interfaces {
    associate_public_ip_address = false
    security_groups             = [aws_security_group.app.id]
  }

  tag_specifications {
    resource_type = "instance"
    tags = merge(var.tags, {
      Name = "app-spot"
    })
  }
}

resource "aws_autoscaling_group" "app" {
  name                = "app-asg"
  vpc_zone_identifier = var.private_subnet_ids
  min_size            = 2
  max_size            = 20
  desired_capacity    = 4

  mixed_instances_policy {
    instances_distribution {
      on_demand_base_capacity                  = 2  # Minimum On-Demand
      on_demand_percentage_above_base_capacity = 25  # 75% Spot
      spot_allocation_strategy                 = "capacity-optimized"
      spot_max_price                           = ""  # On-Demand price (default)
    }

    launch_template {
      launch_template_specification {
        launch_template_id = aws_launch_template.app.id
        version            = "$Latest"
      }

      override {
        instance_type = "m6i.large"
        weighted_capacity = 1
      }

      override {
        instance_type = "m6i.xlarge"
        weighted_capacity = 2
      }

      override {
        instance_type = "m5.large"
        weighted_capacity = 1
      }

      override {
        instance_type = "m5.xlarge"
        weighted_capacity = 2
      }

      override {
        instance_type = "m5a.large"
        weighted_capacity = 1
      }

      override {
        instance_type = "m5a.xlarge"
        weighted_capacity = 2
      }
    }
  }

  instance_refresh {
    strategy = "Rolling"
    preferences {
      min_healthy_percentage = 75
    }
  }

  tag {
    key                 = "Name"
    value               = "app-asg"
    propagate_at_launch = true
  }
}

Spot for EKS

resource "aws_eks_node_group" "spot" {
  cluster_name    = aws_eks_cluster.main.name
  node_group_name = "spot-workers"
  node_role_arn   = aws_iam_role.node.arn
  subnet_ids      = var.private_subnet_ids
  capacity_type   = "SPOT"

  scaling_config {
    desired_size = 3
    max_size     = 10
    min_size     = 1
  }

  instance_types = [
    "m6i.large",
    "m6i.xlarge",
    "m5.large",
    "m5.xlarge",
    "m5a.large",
    "m5a.xlarge"
  ]

  labels = {
    "node.kubernetes.io/lifecycle" = "spot"
  }

  taint {
    key    = "spot"
    value  = "true"
    effect = "NO_SCHEDULE"
  }

  tags = var.tags
}

Kubernetes Spot Configuration

# Tolerations for spot nodes
apiVersion: apps/v1
kind: Deployment
metadata:
  name: stateless-app
spec:
  replicas: 6
  selector:
    matchLabels:
      app: stateless-app
  template:
    metadata:
      labels:
        app: stateless-app
    spec:
      tolerations:
      - key: "spot"
        operator: "Equal"
        value: "true"
        effect: "NoSchedule"
      affinity:
        nodeAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            preference:
              matchExpressions:
              - key: "node.kubernetes.io/lifecycle"
                operator: In
                values:
                - spot
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            podAffinityTerm:
              labelSelector:
                matchLabels:
                  app: stateless-app
              topologyKey: topology.kubernetes.io/zone
      terminationGracePeriodSeconds: 120
      containers:
      - name: app
        image: registry.example.com/stateless-app:1.0  # Placeholder image; replace with your own
        lifecycle:
          preStop:
            exec:
              command: ["/bin/sh", "-c", "sleep 30"]

Cost Monitoring and Alerts

Budget Alerts

resource "aws_budgets_budget" "monthly" {
  name         = "monthly-budget"
  budget_type  = "COST"
  limit_amount = "10000"
  limit_unit   = "USD"
  time_unit    = "MONTHLY"

  cost_filter {
    name   = "Service"
    values = ["Amazon Elastic Compute Cloud - Compute"]
  }

  notification {
    comparison_operator        = "GREATER_THAN"
    threshold                  = 80
    threshold_type             = "PERCENTAGE"
    notification_type          = "FORECASTED"
    subscriber_email_addresses = [var.finops_email]
  }

  notification {
    comparison_operator        = "GREATER_THAN"
    threshold                  = 100
    threshold_type             = "PERCENTAGE"
    notification_type          = "ACTUAL"
    subscriber_email_addresses = [var.finops_email]
    subscriber_sns_topic_arns  = [aws_sns_topic.budget_alerts.arn]
  }
}

Cost Anomaly Detection

resource "aws_ce_anomaly_monitor" "service" {
  name              = "service-cost-monitor"
  monitor_type      = "DIMENSIONAL"
  monitor_dimension = "SERVICE"
}

resource "aws_ce_anomaly_subscription" "alerts" {
  name      = "cost-anomaly-alerts"
  frequency = "IMMEDIATE"

  monitor_arn_list = [
    aws_ce_anomaly_monitor.service.arn
  ]

  # IMMEDIATE alerts are delivered through SNS only. For email summaries,
  # create a second subscription with frequency DAILY or WEEKLY.
  subscriber {
    type    = "SNS"
    address = aws_sns_topic.cost_alerts.arn
  }

  threshold_expression {
    dimension {
      key           = "ANOMALY_TOTAL_IMPACT_ABSOLUTE"
      values        = ["100"]
      match_options = ["GREATER_THAN_OR_EQUAL"]
    }
  }
}

Cost Optimization Checklist

compute_cost_checklist:
  immediate_wins:
    - [ ] Right-size instances (check CloudWatch metrics)
    - [ ] Stop/terminate unused instances
    - [ ] Delete unattached EBS volumes
    - [ ] Release unused Elastic IPs
    - [ ] Review NAT Gateway usage

  quick_wins:
    - [ ] Enable Savings Plans for baseline usage
    - [ ] Use Spot for fault-tolerant workloads
    - [ ] Implement auto-scaling
    - [ ] Schedule dev/test shutdown

  strategic:
    - [ ] Modernize to Graviton (up to about 20% cheaper than comparable x86 instances)
    - [ ] Containerize workloads (better utilization)
    - [ ] Move to serverless where appropriate
    - [ ] Review reserved instance coverage
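
The "Schedule dev/test shutdown" item is easy to quantify. A minimal sketch, assuming instances are billed only while running; attached EBS volumes keep billing while an instance is stopped, so this covers compute hours only. The 12-hour, 5-day schedule is an example.

```python
HOURS_PER_WEEK = 24 * 7  # 168


def scheduled_shutdown_savings(hours_per_day, days_per_week):
    """Percent of compute hours saved by running only on a fixed schedule."""
    if not (0 <= hours_per_day <= 24 and 0 <= days_per_week <= 7):
        raise ValueError("schedule must fit inside a week")
    running_hours = hours_per_day * days_per_week
    return round(100 * (1 - running_hours / HOURS_PER_WEEK), 1)


# Example: dev environment up 12 hours a day, Monday to Friday
savings = scheduled_shutdown_savings(12, 5)
print(savings)
```

Running 60 of 168 weekly hours removes roughly 64% of the compute hours, which is more than most commitment discounts and needs no term.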

Instance Right-Sizing Script

import boto3
from datetime import datetime, timedelta, timezone

cloudwatch = boto3.client('cloudwatch')
ec2 = boto3.client('ec2')

def analyze_instance(instance_id: str, days: int = 14) -> dict:
    """Analyze instance utilization and recommend right-sizing."""
    end_time = datetime.now(timezone.utc)  # datetime.utcnow() is deprecated
    start_time = end_time - timedelta(days=days)

    # Get CPU metrics
    cpu_response = cloudwatch.get_metric_statistics(
        Namespace='AWS/EC2',
        MetricName='CPUUtilization',
        Dimensions=[{'Name': 'InstanceId', 'Value': instance_id}],
        StartTime=start_time,
        EndTime=end_time,
        Period=3600,
        Statistics=['Average', 'Maximum']
    )

    cpu_points = cpu_response['Datapoints']
    if not cpu_points:
        return {'status': 'no_data'}

    avg_cpu = sum(p['Average'] for p in cpu_points) / len(cpu_points)
    max_cpu = max(p['Maximum'] for p in cpu_points)

    # Get current instance type
    instance = ec2.describe_instances(InstanceIds=[instance_id])
    instance_type = instance['Reservations'][0]['Instances'][0]['InstanceType']

    recommendation = {
        'instance_id': instance_id,
        'current_type': instance_type,
        'avg_cpu': round(avg_cpu, 2),
        'max_cpu': round(max_cpu, 2),
    }

    # Check the near-zero case first; otherwise the downsize branch would catch it
    if avg_cpu < 5:
        recommendation['action'] = 'stop_or_terminate'
        recommendation['reason'] = 'Near-zero utilization'
    elif avg_cpu < 20 and max_cpu < 50:
        recommendation['action'] = 'downsize'
        recommendation['reason'] = 'Consistently low utilization'
    else:
        recommendation['action'] = 'none'

    return recommendation
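
The script above analyzes one instance at a time. To run it across a fleet you first need the instance IDs; the helper below is a sketch of one way to collect them with the EC2 `describe_instances` paginator. The function name is hypothetical, and the client is passed in so the helper can be exercised without AWS credentials.

```python
def running_instance_ids(ec2_client):
    """Return the IDs of all running instances visible to the given EC2 client."""
    paginator = ec2_client.get_paginator("describe_instances")
    pages = paginator.paginate(
        Filters=[{"Name": "instance-state-name", "Values": ["running"]}]
    )
    return [
        instance["InstanceId"]
        for page in pages
        for reservation in page["Reservations"]
        for instance in reservation["Instances"]
    ]


# Usage with the script above (requires boto3 and AWS credentials):
#   for instance_id in running_instance_ids(boto3.client("ec2")):
#       print(analyze_instance(instance_id))
```

Each `analyze_instance` call makes its own CloudWatch request, so on large fleets expect the loop to take a while and to count against API rate limits.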

Key Takeaways

  1. Right-size first: Savings Plans bought for oversized instances waste money
  2. Then commit with Savings Plans: flexible, easy, 30%+ savings
  3. Use Spot for stateless workloads: 60-90% savings with proper architecture
  4. Diversify Spot pools: more instance types and AZs reduce interruption risk
  5. Monitor and alert: budgets and anomaly detection catch problems early

“The cheapest instance is the one you don’t run. The second cheapest is a properly sized Spot instance.”