Terraform State Locking and Remote Backends
Configure Terraform remote state with S3, GCS, and Azure Blob backends, implement state locking to prevent corruption, and handle state migration safely.
Terraform state is the source of truth for your infrastructure. Lose it, corrupt it, or have two people modify it simultaneously, and you’re in for a bad time. This guide covers remote backends, state locking, and safe state operations.
Why Remote State Matters
Local state has critical problems:
- No collaboration: State file lives on one machine
- No locking: Two terraform apply runs can corrupt state
- No backup: Accidental deletion means recreating everything
- Secrets in plaintext: State contains sensitive values
Remote backends solve all of these.
AWS: S3 + DynamoDB
The most common setup for AWS users:
Bootstrap Script
#!/bin/bash
# bootstrap-terraform-backend.sh
BUCKET_NAME="mycompany-terraform-state"
DYNAMODB_TABLE="terraform-locks"
REGION="us-east-1"
# Create S3 bucket (regions other than us-east-1 also need
# --create-bucket-configuration LocationConstraint=$REGION)
aws s3api create-bucket \
--bucket $BUCKET_NAME \
--region $REGION
# Enable versioning
aws s3api put-bucket-versioning \
--bucket $BUCKET_NAME \
--versioning-configuration Status=Enabled
# Enable encryption
aws s3api put-bucket-encryption \
--bucket $BUCKET_NAME \
--server-side-encryption-configuration '{
"Rules": [{
"ApplyServerSideEncryptionByDefault": {
"SSEAlgorithm": "aws:kms"
},
"BucketKeyEnabled": true
}]
}'
# Block public access
aws s3api put-public-access-block \
--bucket $BUCKET_NAME \
--public-access-block-configuration '{
"BlockPublicAcls": true,
"IgnorePublicAcls": true,
"BlockPublicPolicy": true,
"RestrictPublicBuckets": true
}'
# Create DynamoDB table for locking
aws dynamodb create-table \
--table-name $DYNAMODB_TABLE \
--attribute-definitions AttributeName=LockID,AttributeType=S \
--key-schema AttributeName=LockID,KeyType=HASH \
--billing-mode PAY_PER_REQUEST \
--region $REGION
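If you prefer to manage the backend resources with Terraform itself (applied once with local state, then migrated), a minimal sketch of the same setup might look like this; resource names are illustrative:

```hcl
# Hypothetical Terraform equivalent of the bootstrap script above.
# Apply once with local state, then migrate that state into the bucket.
resource "aws_s3_bucket" "terraform_state" {
  bucket = "mycompany-terraform-state"
}

resource "aws_s3_bucket_versioning" "state" {
  bucket = aws_s3_bucket.terraform_state.id
  versioning_configuration {
    status = "Enabled"
  }
}

resource "aws_dynamodb_table" "terraform_locks" {
  name         = "terraform-locks"
  billing_mode = "PAY_PER_REQUEST"
  hash_key     = "LockID"

  attribute {
    name = "LockID"
    type = "S"
  }
}
```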
Backend Configuration
# backend.tf
terraform {
backend "s3" {
bucket = "mycompany-terraform-state"
key = "prod/networking/terraform.tfstate"
region = "us-east-1"
encrypt = true
dynamodb_table = "terraform-locks"
# Optional: Use a specific KMS key
# kms_key_id = "arn:aws:kms:us-east-1:111111111111:key/xxxxx"
}
}
State File Organization
terraform-state/
├── global/
│ ├── iam/terraform.tfstate
│ └── route53/terraform.tfstate
├── prod/
│ ├── networking/terraform.tfstate
│ ├── eks/terraform.tfstate
│ ├── rds/terraform.tfstate
│ └── app/terraform.tfstate
├── staging/
│ ├── networking/terraform.tfstate
│ └── app/terraform.tfstate
└── dev/
└── app/terraform.tfstate
GCP: Google Cloud Storage
# backend.tf
terraform {
backend "gcs" {
bucket = "mycompany-terraform-state"
prefix = "prod/networking"
}
}
# Create bucket with versioning
gsutil mb -l us-central1 gs://mycompany-terraform-state
gsutil versioning set on gs://mycompany-terraform-state
GCS handles locking automatically—no separate table needed.
Azure: Blob Storage
# backend.tf
terraform {
backend "azurerm" {
resource_group_name = "terraform-state-rg"
storage_account_name = "mycompanytfstate"
container_name = "tfstate"
key = "prod/networking/terraform.tfstate"
}
}
# Create storage account and container
az group create --name terraform-state-rg --location eastus
az storage account create \
--name mycompanytfstate \
--resource-group terraform-state-rg \
--sku Standard_LRS \
--encryption-services blob
az storage container create \
--name tfstate \
--account-name mycompanytfstate
Like GCS, the azurerm backend locks state automatically using blob leases, so no separate lock table is needed.
How State Locking Works
When you run terraform apply:
- Terraform attempts to acquire a lock (DynamoDB, GCS, etc.)
- If lock exists, operation fails with “state locked” error
- If lock acquired, operation proceeds
- Lock released after operation completes
User A: terraform apply
│
▼
┌─────────────┐
│ Acquire Lock │ ────────────────────────────┐
└─────────────┘ │
│ │
▼ User B: terraform apply
┌─────────────┐ │
│ Apply │ ▼
└─────────────┘ ┌─────────────┐
│ │ Lock Failed │
▼ └─────────────┘
┌─────────────┐ │
│Release Lock │ ▼
└─────────────┘ Error: "State locked"
Lock Information
# View lock details (if stuck)
aws dynamodb scan --table-name terraform-locks
# Example lock entry
{
"LockID": {
"S": "mycompany-terraform-state/prod/networking/terraform.tfstate"
},
"Info": {
"S": "{\"ID\":\"xxx\",\"Operation\":\"OperationTypeApply\",\"Info\":\"\",\"Who\":\"user@machine\",\"Version\":\"1.6.0\",\"Created\":\"2024-01-15T10:30:00.000000000Z\",\"Path\":\"\"}"
}
}
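Before force-unlocking, it helps to know who actually holds the lock. The Info attribute above is escaped JSON; a small helper (a sketch using only sed, no jq required) can pull out the holder from that payload:

```shell
# Hypothetical helper: extract the "Who" field from a lock's Info JSON
# (the escaped payload stored in the DynamoDB item above).
lock_holder() {
  printf '%s\n' "$1" | sed -n 's/.*"Who":"\([^"]*\)".*/\1/p'
}

lock_holder '{"ID":"xxx","Operation":"OperationTypeApply","Who":"user@machine","Version":"1.6.0"}'
# -> user@machine
```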
Force Unlock (Use Carefully!)
# Only if you're SURE no operation is running
terraform force-unlock LOCK_ID
# Example
terraform force-unlock a1b2c3d4-e5f6-7890-abcd-ef1234567890
Warning: Force unlocking while another operation is running will corrupt your state.
State Migration
Local to Remote
# Step 1: Add backend configuration
terraform {
backend "s3" {
bucket = "mycompany-terraform-state"
key = "prod/app/terraform.tfstate"
region = "us-east-1"
encrypt = true
dynamodb_table = "terraform-locks"
}
}
# Step 2: Initialize and migrate
terraform init -migrate-state
# Terraform will ask:
# "Do you want to copy existing state to the new backend?"
# Answer: yes
Between Remote Backends
# Step 1: Pull current state
terraform state pull > terraform.tfstate.backup
# Step 2: Update backend configuration
# Edit backend.tf with new backend settings
# Step 3: Reinitialize
terraform init -migrate-state
# If that fails, manual approach:
terraform init -reconfigure
terraform state push terraform.tfstate.backup
Between State Files (Refactoring)
Moving resources between state files:
# Remove from source state
cd old-project
terraform state rm aws_s3_bucket.data
# Import into destination state
cd ../new-project
terraform import aws_s3_bucket.data my-bucket-name
For bulk moves:
# Move resources between local state files
# (-state/-state-out only work with local state files, not remote backends)
terraform state mv -state=source.tfstate -state-out=dest.tfstate \
aws_s3_bucket.data aws_s3_bucket.data
# Or use terraform_remote_state data source for references
Partial Backend Configuration
Keep sensitive values out of code:
# backend.tf
terraform {
backend "s3" {
key = "prod/app/terraform.tfstate"
# bucket, region, dynamodb_table provided via -backend-config
}
}
# backend-config/prod.hcl
bucket = "mycompany-terraform-state"
region = "us-east-1"
dynamodb_table = "terraform-locks"
encrypt = true
# Initialize with partial config
terraform init -backend-config=backend-config/prod.hcl
Workspaces with Remote State
# Each workspace gets its own state file
terraform {
backend "s3" {
bucket = "mycompany-terraform-state"
key = "app/terraform.tfstate" # Becomes workspaces/<workspace>/app/terraform.tfstate
region = "us-east-1"
dynamodb_table = "terraform-locks"
workspace_key_prefix = "workspaces"
}
}
terraform workspace new staging
# State file: s3://mycompany-terraform-state/workspaces/staging/app/terraform.tfstate
terraform workspace new prod
# State file: s3://mycompany-terraform-state/workspaces/prod/app/terraform.tfstate
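The key mapping above can be sketched as a small function, assuming the backend configuration shown (key = "app/terraform.tfstate", workspace_key_prefix = "workspaces"). Note that the default workspace keeps the plain key with no prefix:

```shell
# Sketch of how the s3 backend maps workspaces to object keys.
# The default workspace uses the configured key unchanged; named
# workspaces get <workspace_key_prefix>/<workspace>/<key>.
state_key() {
  local prefix="$1" workspace="$2" key="$3"
  if [ "$workspace" = "default" ]; then
    printf '%s\n' "$key"
  else
    printf '%s\n' "$prefix/$workspace/$key"
  fi
}

state_key workspaces default app/terraform.tfstate
# -> app/terraform.tfstate
state_key workspaces staging app/terraform.tfstate
# -> workspaces/staging/app/terraform.tfstate
```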
State Access Control
IAM Policy for Terraform Users
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"s3:GetObject",
"s3:PutObject",
"s3:DeleteObject"
],
"Resource": "arn:aws:s3:::mycompany-terraform-state/*"
},
{
"Effect": "Allow",
"Action": "s3:ListBucket",
"Resource": "arn:aws:s3:::mycompany-terraform-state"
},
{
"Effect": "Allow",
"Action": [
"dynamodb:GetItem",
"dynamodb:PutItem",
"dynamodb:DeleteItem"
],
"Resource": "arn:aws:dynamodb:us-east-1:*:table/terraform-locks"
}
]
}
Read-Only Access (for CI/CD plan jobs)
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": ["s3:GetObject"],
"Resource": "arn:aws:s3:::mycompany-terraform-state/*"
},
{
"Effect": "Allow",
"Action": ["s3:ListBucket"],
"Resource": "arn:aws:s3:::mycompany-terraform-state"
},
{
"Effect": "Allow",
"Action": ["dynamodb:GetItem"],
"Resource": "arn:aws:dynamodb:us-east-1:*:table/terraform-locks"
}
]
}
Note: terraform plan acquires the state lock by default, which requires dynamodb:PutItem and DeleteItem. With this read-only policy, run plan jobs with terraform plan -lock=false.
Remote State Data Source
Reference outputs from other state files:
# In app configuration, reference networking outputs
data "terraform_remote_state" "networking" {
backend = "s3"
config = {
bucket = "mycompany-terraform-state"
key = "prod/networking/terraform.tfstate"
region = "us-east-1"
}
}
resource "aws_instance" "app" {
ami = data.aws_ami.amazon_linux.id
instance_type = "t3.micro"
subnet_id = data.terraform_remote_state.networking.outputs.private_subnet_ids[0]
vpc_security_group_ids = [
data.terraform_remote_state.networking.outputs.app_security_group_id
]
}
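For this to work, the networking configuration must actually export those values: only root-module outputs are visible through terraform_remote_state. A sketch of the producer side, with illustrative resource names:

```hcl
# In the networking configuration (the producer of the state file above).
# Only values declared as root-module outputs are readable by consumers.
output "private_subnet_ids" {
  value = aws_subnet.private[*].id
}

output "app_security_group_id" {
  value = aws_security_group.app.id
}
```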
Disaster Recovery
Automated Backups
S3 versioning provides automatic backups:
# List state file versions
aws s3api list-object-versions \
--bucket mycompany-terraform-state \
--prefix prod/app/terraform.tfstate
# Restore specific version
aws s3api get-object \
--bucket mycompany-terraform-state \
--key prod/app/terraform.tfstate \
--version-id "abc123" \
restored-state.tfstate
# Push restored state
terraform state push restored-state.tfstate
Cross-Region Replication
# Enable replication for disaster recovery
resource "aws_s3_bucket_replication_configuration" "state_replication" {
bucket = aws_s3_bucket.terraform_state.id
role = aws_iam_role.replication.arn
rule {
id = "replicate-state"
status = "Enabled"
destination {
bucket = aws_s3_bucket.terraform_state_replica.arn
storage_class = "STANDARD"
}
}
}
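Replication only works when versioning is enabled on both buckets, and the replication role must be able to read from the source and replicate into the destination. A sketch of the destination side; the provider alias and names are assumptions:

```hcl
# Hypothetical replica bucket in a second region. Versioning must be
# enabled on both source and destination buckets for replication.
resource "aws_s3_bucket" "terraform_state_replica" {
  provider = aws.dr
  bucket   = "mycompany-terraform-state-replica"
}

resource "aws_s3_bucket_versioning" "replica" {
  provider = aws.dr
  bucket   = aws_s3_bucket.terraform_state_replica.id
  versioning_configuration {
    status = "Enabled"
  }
}
```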
CI/CD Best Practices
Separate Plan and Apply
# .github/workflows/terraform.yml
jobs:
plan:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- name: Terraform Plan
run: |
terraform init
terraform plan -out=plan.tfplan
- name: Upload Plan
uses: actions/upload-artifact@v4
with:
name: tfplan
path: plan.tfplan
apply:
needs: plan
runs-on: ubuntu-latest
environment: production # Requires approval
steps:
- uses: actions/checkout@v4
- name: Download Plan
uses: actions/download-artifact@v4
with:
name: tfplan
- name: Terraform Apply
run: |
terraform init
terraform apply plan.tfplan
Lock Timeout Handling
# Increase lock timeout for slow operations
terraform apply -lock-timeout=10m
Troubleshooting
"Error acquiring state lock"
# 1. Check if another operation is running
aws dynamodb scan --table-name terraform-locks
# 2. If stuck lock, verify no operation is running
# 3. Force unlock only as last resort
terraform force-unlock LOCK_ID
"State file not found"
# Verify bucket and key exist
aws s3 ls s3://mycompany-terraform-state/prod/app/
# Check IAM permissions
aws sts get-caller-identity
aws s3api head-object --bucket mycompany-terraform-state --key prod/app/terraform.tfstate
"Backend configuration changed"
# Reinitialize with migration
terraform init -migrate-state
# Or reconfigure without migration
terraform init -reconfigure
Key Takeaways
- Always use remote state in team environments
- Always enable state locking—DynamoDB for S3, automatic for GCS
- Enable versioning for disaster recovery
- Organize state files by environment and component
- Use partial configuration to keep credentials out of code
- Never force-unlock unless you’re certain no operation is running
- Backup before migrations—terraform state pull > backup.tfstate
State management isn’t glamorous, but getting it wrong is catastrophic. Invest time in proper backend configuration, and you’ll avoid the late-night “who corrupted the state file” incidents.