Google Compute Engine: Preemptible VMs and Cost Optimization
Save up to 91% on compute costs with GCE Preemptible and Spot VMs. Learn instance selection, interruption handling, and cost optimization strategies.
Google Compute Engine offers aggressive discounts through Preemptible VMs (up to 80% off) and the newer Spot VMs (up to 91% off). Both can be terminated with 30 seconds' notice, but with the right architecture they're production-ready for many workloads.
Preemptible vs Spot VMs
| Feature | Preemptible | Spot |
|---|---|---|
| Max lifetime | 24 hours | Unlimited |
| Termination notice | 30 seconds | 30 seconds |
| Discount | Up to 80% | Up to 91% |
| Availability | Limited | Better availability |
| Pricing model | Fixed discount | Dynamic (like AWS Spot) |
Recommendation: Use Spot VMs for new workloads. Preemptible is legacy.
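To confirm which model an existing instance uses, look at its scheduling block (instance name and zone below are placeholders):

```shell
# Prints e.g. "SPOT True" for a Spot VM, "STANDARD False" for on-demand
gcloud compute instances describe my-instance \
    --zone=us-central1-a \
    --format="value(scheduling.provisioningModel, scheduling.preemptible)"
```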
Creating Spot VMs
gcloud CLI
# Create a Spot VM
gcloud compute instances create worker-1 \
--zone=us-central1-a \
--machine-type=n2-standard-4 \
--provisioning-model=SPOT \
--instance-termination-action=STOP \
--image-family=debian-12 \
--image-project=debian-cloud \
--boot-disk-size=100GB \
--boot-disk-type=pd-ssd \
--metadata=startup-script='#!/bin/bash
apt-get update
apt-get install -y docker.io
systemctl start docker
docker pull myapp:latest
docker run -d myapp:latest'
# Spot VMs can't live-migrate: terminate on host maintenance, delete on preemption
gcloud compute instances create batch-worker \
--zone=us-central1-a \
--machine-type=c2-standard-8 \
--provisioning-model=SPOT \
--maintenance-policy=TERMINATE \
--instance-termination-action=DELETE
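Once created, Spot instances across the project can be found with a filter on the same provisioning-model field (a sketch; adjust the format columns to taste):

```shell
gcloud compute instances list \
    --filter="scheduling.provisioningModel=SPOT" \
    --format="table(name, zone, status, machineType.basename())"
```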
Terraform
resource "google_compute_instance" "spot_worker" {
name = "spot-worker"
machine_type = "n2-standard-4"
zone = "us-central1-a"
scheduling {
provisioning_model = "SPOT"
preemptible = true
automatic_restart = false
on_host_maintenance = "TERMINATE"
instance_termination_action = "STOP" # or DELETE
}
boot_disk {
initialize_params {
image = "debian-cloud/debian-12"
size = 100
type = "pd-ssd"
}
}
network_interface {
network = google_compute_network.main.id
subnetwork = google_compute_subnetwork.main.id
# No external IP for security
# access_config {}
}
metadata_startup_script = <<-EOF
#!/bin/bash
apt-get update && apt-get install -y docker.io
systemctl start docker
docker run -d myapp:latest
EOF
service_account {
email = google_service_account.worker.email
scopes = ["cloud-platform"]
}
labels = {
environment = "production"
team = "data"
}
}
Managed Instance Groups with Spot
# Instance template for Spot VMs
resource "google_compute_instance_template" "spot" {
name_prefix = "spot-worker-"
machine_type = "n2-standard-4"
region = "us-central1"
scheduling {
provisioning_model = "SPOT"
preemptible = true
automatic_restart = false
on_host_maintenance = "TERMINATE"
instance_termination_action = "STOP"
}
disk {
source_image = "debian-cloud/debian-12"
auto_delete = true
boot = true
disk_size_gb = 50
disk_type = "pd-ssd"
}
network_interface {
network = google_compute_network.main.id
subnetwork = google_compute_subnetwork.main.id
}
metadata = {
startup-script = file("${path.module}/startup.sh")
}
service_account {
email = google_service_account.worker.email
scopes = ["cloud-platform"]
}
lifecycle {
create_before_destroy = true
}
}
# Managed Instance Group
resource "google_compute_region_instance_group_manager" "workers" {
name = "spot-workers"
base_instance_name = "worker"
region = "us-central1"
target_size = 10
version {
instance_template = google_compute_instance_template.spot.id
}
named_port {
name = "http"
port = 8080
}
update_policy {
type = "PROACTIVE"
minimal_action = "REPLACE"
most_disruptive_allowed_action = "REPLACE"
max_surge_fixed = 3
max_unavailable_fixed = 0
replacement_method = "SUBSTITUTE"
}
auto_healing_policies {
health_check = google_compute_health_check.http.id
initial_delay_sec = 300
}
}
# Autoscaler
resource "google_compute_region_autoscaler" "workers" {
name = "spot-workers-autoscaler"
region = "us-central1"
target = google_compute_region_instance_group_manager.workers.id
autoscaling_policy {
min_replicas = 2
max_replicas = 50
cooldown_period = 60
cpu_utilization {
target = 0.7
}
}
}
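The `auto_healing_policies` block above references `google_compute_health_check.http`, which isn't defined in this snippet. A minimal sketch (the port matches the MIG's named `http` port; the `/healthz` path is an assumption about the app):

```hcl
resource "google_compute_health_check" "http" {
  name                = "worker-http-health"
  check_interval_sec  = 10
  timeout_sec         = 5
  healthy_threshold   = 2
  unhealthy_threshold = 3

  http_health_check {
    port         = 8080        # same as the MIG's named_port
    request_path = "/healthz"  # assumed app health endpoint
  }
}
```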
Handling Preemption
Metadata Server Polling
#!/usr/bin/env python3
"""
Preemption handler - runs as systemd service
Polls metadata server for preemption notice
"""
import requests
import subprocess
import sys
import time
import logging
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)
METADATA_URL = "http://metadata.google.internal/computeMetadata/v1/instance/preempted"
HEADERS = {"Metadata-Flavor": "Google"}
def check_preemption():
    """Check if the instance is being preempted.

    wait_for_change makes the metadata server hang-poll for up to
    timeout_sec, so the client timeout must be longer than that.
    """
    try:
        response = requests.get(
            METADATA_URL,
            headers=HEADERS,
            timeout=35,  # must exceed timeout_sec below
            params={"wait_for_change": "true", "timeout_sec": "30"}
        )
        return response.text.strip().lower() == "true"
    except requests.exceptions.Timeout:
        return False
    except requests.exceptions.RequestException as e:
        logger.error(f"Error checking preemption: {e}")
        time.sleep(5)  # avoid a tight retry loop if the metadata server errors
        return False
def graceful_shutdown():
"""Perform graceful shutdown tasks."""
logger.info("Preemption detected! Starting graceful shutdown...")
# Stop accepting new work
subprocess.run(["docker", "exec", "app", "touch", "/tmp/shutdown"], check=False)
# Wait for in-flight requests (max 25 seconds, leaving 5s buffer)
time.sleep(25)
    # Checkpoint state to GCS (shell=True so $(hostname) expands;
    # with a list argv the literal string would be passed to gsutil)
    subprocess.run(
        "gsutil cp /var/lib/app/state.json "
        "gs://my-bucket/checkpoints/$(hostname).json",
        shell=True, check=False
    )
logger.info("Graceful shutdown complete")
def main():
logger.info("Preemption handler started")
while True:
if check_preemption():
graceful_shutdown()
sys.exit(0)
if __name__ == "__main__":
main()
# /etc/systemd/system/preemption-handler.service
[Unit]
Description=GCE Preemption Handler
After=network.target
[Service]
Type=simple
ExecStart=/usr/local/bin/preemption-handler.py
Restart=always
RestartSec=5
[Install]
WantedBy=multi-user.target
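Wiring the handler up on a systemd-based image (paths match the unit file above):

```shell
sudo chmod +x /usr/local/bin/preemption-handler.py
sudo systemctl daemon-reload
sudo systemctl enable --now preemption-handler.service
systemctl status preemption-handler.service
```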
Shutdown Script via Metadata
#!/bin/bash
# shutdown-script.sh - Runs when instance is preempted
# Log to Cloud Logging
logger -t preemption "Instance being preempted, starting shutdown"
# Drain load balancer. The zone and the MIG name are different values:
# the zone comes from instance metadata, the MIG name from the created-by
# attribute that the instance group manager sets on its instances.
ZONE=$(curl -s -H "Metadata-Flavor: Google" \
  http://metadata.google.internal/computeMetadata/v1/instance/zone | cut -d/ -f4)
MIG=$(curl -s -H "Metadata-Flavor: Google" \
  http://metadata.google.internal/computeMetadata/v1/instance/attributes/created-by \
  | awk -F/ '{print $NF}')
# Use --region=... instead of --global if the backend service is regional
gcloud compute backend-services remove-backend my-backend \
  --global \
  --instance-group="$MIG" \
  --instance-group-zone="$ZONE"
# Stop services gracefully
docker stop --time 20 $(docker ps -q)
# Checkpoint to GCS
gsutil -m cp -r /var/lib/data/* gs://checkpoints/$(hostname)/
logger -t preemption "Shutdown complete"
resource "google_compute_instance_template" "spot" {
# ... other config ...
metadata = {
shutdown-script = file("${path.module}/shutdown-script.sh")
}
}
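You can exercise the shutdown path before relying on it in production: Compute Engine's `simulate-maintenance-event` command preempts a Spot or preemptible VM on demand (instance name and zone are placeholders — run this against a test instance, as it really terminates the VM):

```shell
gcloud compute instances simulate-maintenance-event spot-worker \
    --zone=us-central1-a
```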
Mixed Fleet Strategy
# On-demand baseline
resource "google_compute_region_instance_group_manager" "baseline" {
name = "baseline-workers"
base_instance_name = "baseline"
region = "us-central1"
target_size = 2 # Always running
version {
instance_template = google_compute_instance_template.ondemand.id
}
}
# Spot for burst capacity
resource "google_compute_region_instance_group_manager" "spot" {
name = "spot-workers"
base_instance_name = "spot"
region = "us-central1"
version {
instance_template = google_compute_instance_template.spot.id
}
}
resource "google_compute_region_autoscaler" "spot" {
name = "spot-autoscaler"
region = "us-central1"
target = google_compute_region_instance_group_manager.spot.id
autoscaling_policy {
min_replicas = 0 # Scale to zero when not needed
max_replicas = 100
cooldown_period = 60
cpu_utilization {
target = 0.6
}
}
}
Cost Comparison
# n2-standard-4 (4 vCPU, 16 GB) in us-central1
# On-demand: $0.194/hour = $141.62/month
# Spot: $0.024/hour = $17.52/month (87% savings!)
# Calculate monthly savings for a 10-instance fleet
# On-demand: 10 * $141.62 = $1,416.20/month
# Spot: 10 * $17.52 = $175.20/month
# Savings: $1,241/month
# With 5% preemption rate and auto-replacement:
# Effective cost: ~$184/month (still 87% savings)
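The arithmetic above as a small script. The rates are the example figures from this section, not live pricing — always check the current price list for your region:

```python
# Example rates: n2-standard-4 in us-central1 (from this section, not live data)
HOURS_PER_MONTH = 730  # GCP's billing-month convention

def monthly_cost(hourly_rate: float, instances: int = 1) -> float:
    """Monthly cost in USD for a fleet at a flat hourly rate."""
    return round(hourly_rate * HOURS_PER_MONTH * instances, 2)

on_demand = monthly_cost(0.194, instances=10)
spot = monthly_cost(0.024, instances=10)
savings_pct = round((1 - spot / on_demand) * 100, 1)

print(f"On-demand: ${on_demand}/month")  # $1416.2/month
print(f"Spot:      ${spot}/month")       # $175.2/month
print(f"Savings:   {savings_pct}%")      # 87.6%
```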
Committed Use Discounts (CUDs)
For predictable baseline workloads, combine with CUDs:
# Purchase 1-year commitment for baseline
gcloud compute commitments create baseline-commitment \
--region=us-central1 \
--resources=vcpu=8,memory=32GB \
--plan=12-month
# 1-year: 37% discount
# 3-year: 55% discount
# Strategy: CUDs for baseline + Spot for burst
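A rough sketch of the blended strategy using this section's example rates: 2 baseline instances covered by a 1-year CUD (37% off on-demand) plus 8 Spot instances for burst. The fleet shape and rates are illustrative assumptions:

```python
HOURS_PER_MONTH = 730
ON_DEMAND_RATE = 0.194   # n2-standard-4, us-central1 (example rate)
SPOT_RATE = 0.024        # example Spot rate from this section
CUD_1YR_DISCOUNT = 0.37  # 1-year resource-based CUD

baseline = round(2 * ON_DEMAND_RATE * (1 - CUD_1YR_DISCOUNT) * HOURS_PER_MONTH, 2)
burst = round(8 * SPOT_RATE * HOURS_PER_MONTH, 2)
all_on_demand = round(10 * ON_DEMAND_RATE * HOURS_PER_MONTH, 2)

blended = round(baseline + burst, 2)
savings_pct = round((1 - blended / all_on_demand) * 100, 1)
print(f"Blended: ${blended}/month vs ${all_on_demand}/month "
      f"all on-demand ({savings_pct}% savings)")
```

Less than the pure-Spot 87%, but the CUD-covered baseline keeps serving through a preemption wave.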
Best Practices
- Design for failure — assume any instance can disappear in 30 seconds
- Externalize state — use Cloud Storage, Memorystore, or Cloud SQL
- Use managed instance groups — automatic replacement on preemption
- Spread across zones — reduces correlated failures
- Checkpoint frequently — save progress every few minutes
- Use shutdown scripts — 30 seconds is enough for graceful drain
- Monitor preemption rates — adjust strategy if too high
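"Checkpoint frequently" is easy to get wrong if a preemption lands mid-write. A minimal sketch of an atomic local checkpoint (write to a temp file, then rename) that a shutdown script can then copy to GCS — function names are illustrative:

```python
import json
import os
import tempfile

def save_checkpoint(state: dict, path: str) -> None:
    """Write state atomically: a preemption mid-write leaves the previous
    checkpoint intact, never a half-written file."""
    dirname = os.path.dirname(path) or "."
    fd, tmp = tempfile.mkstemp(dir=dirname, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as f:
            json.dump(state, f)
            f.flush()
            os.fsync(f.fileno())       # flushed to disk before the rename
        os.replace(tmp, path)          # atomic on POSIX filesystems
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)             # clean up the partial temp file
        raise

def load_checkpoint(path: str) -> dict:
    with open(path) as f:
        return json.load(f)
```

Pair it with a periodic timer in the worker loop; on restart, resume from `load_checkpoint` instead of from scratch.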
Key Takeaways
- Spot VMs save up to 91% — use for batch, dev/test, and stateless workloads
- 30-second warning — enough time for graceful shutdown if you prepare
- Spot > Preemptible — no 24-hour limit, better availability
- MIGs auto-replace preempted instances automatically
- Mix on-demand baseline with Spot burst for reliability
- CUDs for predictable load — combine with Spot for maximum savings
“Spot VMs are the cheapest compute in any cloud. The only cost is engineering for graceful failure — which you should be doing anyway.”