Google Compute Engine offers aggressive discounts through Preemptible VMs (up to 80% off) and the newer Spot VMs (up to 91% off). Both can be terminated with 30 seconds' notice, but with the right architecture they're production-ready for many workloads.

Preemptible vs Spot VMs

| Feature            | Preemptible    | Spot                    |
|--------------------|----------------|-------------------------|
| Max lifetime       | 24 hours       | Unlimited               |
| Termination notice | 30 seconds     | 30 seconds              |
| Discount           | Up to 80%      | Up to 91%               |
| Availability       | Limited        | Better availability     |
| Pricing model      | Fixed discount | Dynamic (like AWS Spot) |

Recommendation: Use Spot VMs for new workloads. Preemptible is legacy.

Creating Spot VMs

gcloud CLI

# Create a Spot VM
gcloud compute instances create worker-1 \
  --zone=us-central1-a \
  --machine-type=n2-standard-4 \
  --provisioning-model=SPOT \
  --instance-termination-action=STOP \
  --image-family=debian-12 \
  --image-project=debian-cloud \
  --boot-disk-size=100GB \
  --boot-disk-type=pd-ssd \
  --metadata=startup-script='#!/bin/bash
    apt-get update
    apt-get install -y docker.io
    systemctl start docker
    docker pull myapp:latest
    docker run -d myapp:latest'

# Batch worker: terminate on host maintenance (Spot VMs are never
# live-migrated) and delete the instance on preemption
gcloud compute instances create batch-worker \
  --zone=us-central1-a \
  --machine-type=c2-standard-8 \
  --provisioning-model=SPOT \
  --maintenance-policy=TERMINATE \
  --instance-termination-action=DELETE

Terraform

resource "google_compute_instance" "spot_worker" {
  name         = "spot-worker"
  machine_type = "n2-standard-4"
  zone         = "us-central1-a"

  scheduling {
    provisioning_model          = "SPOT"
    preemptible                 = true
    automatic_restart           = false
    on_host_maintenance         = "TERMINATE"
    instance_termination_action = "STOP"  # or DELETE
  }

  boot_disk {
    initialize_params {
      image = "debian-cloud/debian-12"
      size  = 100
      type  = "pd-ssd"
    }
  }

  network_interface {
    network    = google_compute_network.main.id
    subnetwork = google_compute_subnetwork.main.id

    # No external IP for security
    # access_config {}
  }

  metadata_startup_script = <<-EOF
    #!/bin/bash
    apt-get update && apt-get install -y docker.io
    systemctl start docker
    docker run -d myapp:latest
  EOF

  service_account {
    email  = google_service_account.worker.email
    scopes = ["cloud-platform"]
  }

  labels = {
    environment = "production"
    team        = "data"
  }
}

Managed Instance Groups with Spot

# Instance template for Spot VMs
resource "google_compute_instance_template" "spot" {
  name_prefix  = "spot-worker-"
  machine_type = "n2-standard-4"
  region       = "us-central1"

  scheduling {
    provisioning_model          = "SPOT"
    preemptible                 = true
    automatic_restart           = false
    on_host_maintenance         = "TERMINATE"
    instance_termination_action = "STOP"
  }

  disk {
    source_image = "debian-cloud/debian-12"
    auto_delete  = true
    boot         = true
    disk_size_gb = 50
    disk_type    = "pd-ssd"
  }

  network_interface {
    network    = google_compute_network.main.id
    subnetwork = google_compute_subnetwork.main.id
  }

  metadata = {
    startup-script = file("${path.module}/startup.sh")
  }

  service_account {
    email  = google_service_account.worker.email
    scopes = ["cloud-platform"]
  }

  lifecycle {
    create_before_destroy = true
  }
}

# Managed Instance Group
resource "google_compute_region_instance_group_manager" "workers" {
  name               = "spot-workers"
  base_instance_name = "worker"
  region             = "us-central1"
  target_size        = 10

  version {
    instance_template = google_compute_instance_template.spot.id
  }

  named_port {
    name = "http"
    port = 8080
  }

  update_policy {
    type                           = "PROACTIVE"
    minimal_action                 = "REPLACE"
    most_disruptive_allowed_action = "REPLACE"
    max_surge_fixed                = 3
    max_unavailable_fixed          = 0
    replacement_method             = "SUBSTITUTE"
  }

  auto_healing_policies {
    health_check      = google_compute_health_check.http.id
    initial_delay_sec = 300
  }
}

# Autoscaler
resource "google_compute_region_autoscaler" "workers" {
  name   = "spot-workers-autoscaler"
  region = "us-central1"
  target = google_compute_region_instance_group_manager.workers.id

  autoscaling_policy {
    min_replicas    = 2
    max_replicas    = 50
    cooldown_period = 60

    cpu_utilization {
      target = 0.7
    }
  }
}
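Under the hood, the autoscaler sizes the group so average utilization converges on the target: roughly, recommended size = ceil(current instances × current utilization ÷ target utilization), clamped to the min/max bounds. A minimal sketch (function name and clamping are illustrative, not a Google API):

```python
import math

def desired_replicas(current_replicas: int, avg_cpu: float,
                     target: float = 0.7,
                     min_replicas: int = 2, max_replicas: int = 50) -> int:
    """Approximate the MIG autoscaler's CPU-based sizing rule:
    scale the group so average utilization moves toward the target."""
    recommended = math.ceil(current_replicas * avg_cpu / target)
    return max(min_replicas, min(max_replicas, recommended))

# 10 workers at 90% CPU against a 0.7 target -> scale out to 13
print(desired_replicas(10, 0.90))  # 13
```

Note that a conservative target (0.7 rather than 0.9) leaves headroom to absorb load from preempted peers while replacements boot.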

Handling Preemption

Metadata Server Polling

#!/usr/bin/env python3
"""
Preemption handler - runs as systemd service
Polls metadata server for preemption notice
"""

import requests
import subprocess
import sys
import time
import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

METADATA_URL = "http://metadata.google.internal/computeMetadata/v1/instance/preempted"
HEADERS = {"Metadata-Flavor": "Google"}

def check_preemption():
    """Check if instance is being preempted.

    wait_for_change makes the metadata server hold the request open for
    up to timeout_sec, so the client timeout must be longer than that.
    """
    try:
        response = requests.get(
            METADATA_URL,
            headers=HEADERS,
            timeout=35,  # must exceed timeout_sec below
            params={"wait_for_change": "true", "timeout_sec": "30"}
        )
        return response.text.strip().lower() == "true"
    except requests.exceptions.Timeout:
        return False
    except requests.exceptions.RequestException as e:
        logger.error(f"Error checking preemption: {e}")
        time.sleep(5)  # avoid a hot loop if the metadata server is unreachable
        return False

def graceful_shutdown():
    """Perform graceful shutdown tasks."""
    logger.info("Preemption detected! Starting graceful shutdown...")
    
    # Stop accepting new work
    subprocess.run(["docker", "exec", "app", "touch", "/tmp/shutdown"], check=False)
    
    # Wait for in-flight requests (max 25 seconds, leaving 5s buffer)
    time.sleep(25)
    
    # Checkpoint state to GCS ($(hostname) needs a shell to expand)
    subprocess.run(
        "gsutil cp /var/lib/app/state.json "
        "gs://my-bucket/checkpoints/$(hostname).json",
        shell=True, check=False
    )
    
    logger.info("Graceful shutdown complete")

def main():
    logger.info("Preemption handler started")
    
    while True:
        if check_preemption():
            graceful_shutdown()
            sys.exit(0)

if __name__ == "__main__":
    main()
Run the handler as a systemd service:

# /etc/systemd/system/preemption-handler.service
[Unit]
Description=GCE Preemption Handler
After=network.target

[Service]
Type=simple
ExecStart=/usr/local/bin/preemption-handler.py
Restart=always
RestartSec=5

[Install]
WantedBy=multi-user.target
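The polling logic can be smoke-tested locally against a fake metadata server before deploying. This sketch uses only the standard library (urllib instead of requests) and a hypothetical URL parameter; the real endpoint returns TRUE once preemption begins:

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

class FakeMetadata(BaseHTTPRequestHandler):
    """Stand-in for metadata.google.internal during local testing."""
    def do_GET(self):
        self.send_response(200)
        self.end_headers()
        self.wfile.write(b"TRUE")  # what the real endpoint returns on preemption

    def log_message(self, *args):  # silence per-request logging
        pass

def check_preemption(url: str) -> bool:
    req = urllib.request.Request(url, headers={"Metadata-Flavor": "Google"})
    with urllib.request.urlopen(req, timeout=5) as resp:
        return resp.read().decode().strip().lower() == "true"

server = HTTPServer(("127.0.0.1", 0), FakeMetadata)
threading.Thread(target=server.serve_forever, daemon=True).start()
url = (f"http://127.0.0.1:{server.server_address[1]}"
       "/computeMetadata/v1/instance/preempted")
result = check_preemption(url)
print(result)  # True
server.shutdown()
```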

Shutdown Script via Metadata

#!/bin/bash
# shutdown-script.sh - Runs when instance is preempted

# Log to Cloud Logging
logger -t preemption "Instance being preempted, starting shutdown"

# Drain load balancer (replace my-instance-group with your MIG's name;
# the instance's created-by metadata attribute also exposes it)
ZONE=$(curl -s -H "Metadata-Flavor: Google" \
  http://metadata.google.internal/computeMetadata/v1/instance/zone | cut -d/ -f4)
gcloud compute backend-services remove-backend my-backend \
  --global \
  --instance-group=my-instance-group \
  --instance-group-zone="$ZONE"

# Stop services gracefully
docker stop --time 20 $(docker ps -q)

# Checkpoint to GCS
gsutil -m cp -r /var/lib/data/* gs://checkpoints/$(hostname)/

logger -t preemption "Shutdown complete"
Wire the script in via instance metadata:

resource "google_compute_instance_template" "spot" {
  # ... other config ...

  metadata = {
    shutdown-script = file("${path.module}/shutdown-script.sh")
  }
}

Mixed Fleet Strategy

# On-demand baseline
resource "google_compute_region_instance_group_manager" "baseline" {
  name               = "baseline-workers"
  base_instance_name = "baseline"
  region             = "us-central1"
  target_size        = 2  # Always running

  version {
    instance_template = google_compute_instance_template.ondemand.id
  }
}

# Spot for burst capacity
resource "google_compute_region_instance_group_manager" "spot" {
  name               = "spot-workers"
  base_instance_name = "spot"
  region             = "us-central1"

  version {
    instance_template = google_compute_instance_template.spot.id
  }
}

resource "google_compute_region_autoscaler" "spot" {
  name   = "spot-autoscaler"
  region = "us-central1"
  target = google_compute_region_instance_group_manager.spot.id

  autoscaling_policy {
    min_replicas    = 0   # Scale to zero when not needed
    max_replicas    = 100
    cooldown_period = 60

    cpu_utilization {
      target = 0.6
    }
  }
}
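The split these two groups implement can be sketched as a simple sizing rule. In practice the autoscaler drives the Spot group from CPU load; this function is purely illustrative, with numbers mirroring the Terraform above:

```python
def fleet_split(desired_total: int, baseline: int = 2, spot_max: int = 100) -> dict:
    """Always-on baseline MIG absorbs the floor; the scale-to-zero
    Spot MIG covers everything above it, capped at its max size."""
    spot = min(max(desired_total - baseline, 0), spot_max)
    return {"baseline": baseline, "spot": spot}

print(fleet_split(1))   # {'baseline': 2, 'spot': 0}  -- Spot scales to zero
print(fleet_split(30))  # {'baseline': 2, 'spot': 28} -- burst on Spot
```

The design point: the baseline never shrinks, so a region-wide Spot stockout degrades capacity rather than availability.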

Cost Comparison

# n2-standard-4 (4 vCPU, 16 GB) in us-central1
# On-demand: $0.194/hour = $141.62/month
# Spot:      $0.024/hour = $17.52/month (87% savings!)

# Calculate monthly savings for a 10-instance fleet
# On-demand: 10 * $141.62 = $1,416.20/month
# Spot:      10 * $17.52  = $175.20/month
# Savings:   $1,241/month

# With 5% preemption rate and auto-replacement:
# Effective cost: ~$184/month (still 87% savings)
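The arithmetic above is easy to reproduce. The hourly prices are the illustrative figures from the comment block, not live rates (check the GCP pricing page for current numbers):

```python
HOURS_PER_MONTH = 730  # GCP's monthly billing convention

def monthly_cost(hourly: float, instances: int = 1) -> float:
    return round(hourly * HOURS_PER_MONTH * instances, 2)

on_demand = monthly_cost(0.194, 10)  # n2-standard-4, us-central1
spot      = monthly_cost(0.024, 10)
print(on_demand, spot, round(on_demand - spot, 2))
# 1416.2 175.2 1241.0

savings_pct = round(100 * (1 - spot / on_demand), 1)
print(savings_pct)  # 87.6
```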

Committed Use Discounts (CUDs)

For predictable baseline workloads, combine with CUDs:

# Purchase 1-year commitment for baseline
gcloud compute commitments create baseline-commitment \
  --region=us-central1 \
  --resources=vcpu=8,memory=32GB \
  --plan=12-month

# 1-year: 37% discount
# 3-year: 55% discount

# Strategy: CUDs for baseline + Spot for burst
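A quick way to sanity-check the blended strategy: price the committed baseline at the CUD rate and the burst at the Spot rate. All rates here are the illustrative figures used earlier in this post:

```python
ON_DEMAND_HOURLY = 0.194  # n2-standard-4, us-central1 (illustrative)
HOURS_PER_MONTH  = 730

def blended_monthly(baseline_vms: int, burst_vms: int,
                    cud_discount: float = 0.37,   # 1-year commitment
                    spot_hourly: float = 0.024) -> float:
    """Monthly cost: baseline on a CUD, burst capacity on Spot."""
    cud  = baseline_vms * ON_DEMAND_HOURLY * (1 - cud_discount) * HOURS_PER_MONTH
    spot = burst_vms * spot_hourly * HOURS_PER_MONTH
    return round(cud + spot, 2)

# 2 committed baseline VMs + 8 Spot burst VMs
print(blended_monthly(2, 8))  # vs ~$1,416 for 10 on-demand VMs
```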

Best Practices

  1. Design for failure — assume any instance can disappear in 30 seconds
  2. Externalize state — use Cloud Storage, Memorystore, or Cloud SQL
  3. Use managed instance groups — automatic replacement on preemption
  4. Spread across zones — reduces correlated failures
  5. Checkpoint frequently — save progress every few minutes
  6. Use shutdown scripts — 30 seconds is enough for graceful drain
  7. Monitor preemption rates — adjust strategy if too high
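Point 5 deserves a code sketch: a checkpoint write must be atomic, or a preemption mid-write leaves a torn file that the replacement instance cannot resume from. A minimal local version (on GCE you would then copy the file to GCS, as in the shutdown script above; all names here are illustrative):

```python
import json
import os
import tempfile

STATE_PATH = os.path.join(tempfile.gettempdir(), "state.json")

def save_checkpoint(state: dict, path: str = STATE_PATH) -> None:
    """Write to a temp file, fsync, then rename: the rename is atomic
    on POSIX, so readers see either the old state or the new one."""
    fd, tmp = tempfile.mkstemp(dir=os.path.dirname(path) or ".")
    with os.fdopen(fd, "w") as f:
        json.dump(state, f)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp, path)

def load_checkpoint(path: str = STATE_PATH) -> dict:
    if not os.path.exists(path):
        return {"last_item": 0}  # fresh start
    with open(path) as f:
        return json.load(f)

save_checkpoint({"last_item": 41})
print(load_checkpoint())  # {'last_item': 41}
```

Checkpointing every few minutes bounds lost work to one interval per preemption, which is what makes the "~5% preemption rate" cost estimate above hold in practice.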

Key Takeaways

  1. Spot VMs save up to 91% — use for batch, dev/test, and stateless workloads
  2. 30-second warning — enough time for graceful shutdown if you prepare
  3. Spot > Preemptible — no 24-hour limit, better availability
  4. MIGs auto-replace preempted instances automatically
  5. Mix on-demand baseline with Spot burst for reliability
  6. CUDs for predictable load — combine with Spot for maximum savings

“Spot VMs are the cheapest compute in any cloud. The only cost is engineering for graceful failure — which you should be doing anyway.”