Terraform state is the source of truth for your infrastructure. Lose it, corrupt it, or have two people modify it simultaneously, and you’re in for a bad time. This guide covers remote backends, state locking, and safe state operations.

Why Remote State Matters

Local state has critical problems:

  • No collaboration: State file lives on one machine
  • No locking: Two terraform apply runs can corrupt state
  • No backup: Accidental deletion = recreate everything
  • Secrets in plaintext: State contains sensitive values

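Remote backends address all of these: the state lives in shared storage, the backend can lock it during writes, versioning keeps earlier copies, and the storage layer can encrypt it at rest.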

AWS: S3 + DynamoDB

The most common setup for AWS users:

Bootstrap Script

#!/bin/bash
# bootstrap-terraform-backend.sh
set -euo pipefail  # Stop on the first failed command or unset variable

BUCKET_NAME="mycompany-terraform-state"
DYNAMODB_TABLE="terraform-locks"
REGION="us-east-1"

# Create S3 bucket
# Outside us-east-1, also pass:
#   --create-bucket-configuration LocationConstraint="$REGION"
aws s3api create-bucket \
  --bucket "$BUCKET_NAME" \
  --region "$REGION"

# Enable versioning
aws s3api put-bucket-versioning \
  --bucket "$BUCKET_NAME" \
  --versioning-configuration Status=Enabled

# Enable encryption
aws s3api put-bucket-encryption \
  --bucket "$BUCKET_NAME" \
  --server-side-encryption-configuration '{
    "Rules": [{
      "ApplyServerSideEncryptionByDefault": {
        "SSEAlgorithm": "aws:kms"
      },
      "BucketKeyEnabled": true
    }]
  }'

# Block public access
aws s3api put-public-access-block \
  --bucket "$BUCKET_NAME" \
  --public-access-block-configuration '{
    "BlockPublicAcls": true,
    "IgnorePublicAcls": true,
    "BlockPublicPolicy": true,
    "RestrictPublicBuckets": true
  }'

# Create DynamoDB table for locking
# The hash key must be a string attribute named LockID
aws dynamodb create-table \
  --table-name "$DYNAMODB_TABLE" \
  --attribute-definitions AttributeName=LockID,AttributeType=S \
  --key-schema AttributeName=LockID,KeyType=HASH \
  --billing-mode PAY_PER_REQUEST \
  --region "$REGION"

Backend Configuration

# backend.tf
terraform {
  backend "s3" {
    bucket         = "mycompany-terraform-state"
    key            = "prod/networking/terraform.tfstate"
    region         = "us-east-1"
    encrypt        = true
    dynamodb_table = "terraform-locks"
    
    # Optional: Use a specific KMS key
    # kms_key_id     = "arn:aws:kms:us-east-1:111111111111:key/xxxxx"
  }
}

State File Organization

terraform-state/
├── global/
│   ├── iam/terraform.tfstate
│   └── route53/terraform.tfstate
├── prod/
│   ├── networking/terraform.tfstate
│   ├── eks/terraform.tfstate
│   ├── rds/terraform.tfstate
│   └── app/terraform.tfstate
├── staging/
│   ├── networking/terraform.tfstate
│   └── app/terraform.tfstate
└── dev/
    └── app/terraform.tfstate
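
The layout above follows an <environment>/<component>/terraform.tfstate pattern. As a small illustration, a helper along these lines could generate keys consistently, for example when templating -backend-config values. The function name state_key and the fixed list of environments are assumptions made for this sketch, not part of Terraform.

```shell
# state_key ENVIRONMENT COMPONENT
# Prints the backend key for one component, following the
# <environment>/<component>/terraform.tfstate layout shown above.
state_key() {
  local environment=$1 component=$2
  case "$environment" in
    global|prod|staging|dev) ;;
    *) echo "unknown environment: $environment" >&2; return 1 ;;
  esac
  printf '%s/%s/terraform.tfstate\n' "$environment" "$component"
}

state_key prod networking   # prod/networking/terraform.tfstate
state_key global iam        # global/iam/terraform.tfstate
```

Rejecting unknown environments is a design choice for the sketch: a typo such as "prd" would otherwise silently create a new, empty state.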

GCP: Google Cloud Storage

# backend.tf
terraform {
  backend "gcs" {
    bucket  = "mycompany-terraform-state"
    prefix  = "prod/networking"
  }
}

# Create bucket with versioning
gsutil mb -l us-central1 gs://mycompany-terraform-state
gsutil versioning set on gs://mycompany-terraform-state

GCS handles locking automatically—no separate table needed.

Azure: Blob Storage

# backend.tf
terraform {
  backend "azurerm" {
    resource_group_name  = "terraform-state-rg"
    storage_account_name = "mycompanytfstate"
    container_name       = "tfstate"
    key                  = "prod/networking/terraform.tfstate"
  }
}

# Create storage account and container
az group create --name terraform-state-rg --location eastus

az storage account create \
  --name mycompanytfstate \
  --resource-group terraform-state-rg \
  --sku Standard_LRS \
  --encryption-services blob

az storage container create \
  --name tfstate \
  --account-name mycompanytfstate

How State Locking Works

When you run terraform apply:

  1. Terraform attempts to acquire a lock (DynamoDB, GCS, etc.)
  2. If lock exists, operation fails with “state locked” error
  3. If lock acquired, operation proceeds
  4. Lock released after operation completes

User A: terraform apply
        │
        ▼
   ┌─────────────┐
   │Acquire Lock │ ─────────────────────────────┐
   └─────────────┘                              │
        │                                       │
        ▼                              User B: terraform apply
   ┌─────────────┐                              │
   │   Apply     │                              ▼
   └─────────────┘                         ┌─────────────┐
        │                                  │ Lock Failed │
        ▼                                  └─────────────┘
   ┌─────────────┐                              │
   │Release Lock │                              ▼
   └─────────────┘                         Error: "State locked"
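
The flow above is mutual exclusion built on an atomic create-if-absent operation. As a rough local analogy, mkdir has the same property: it fails when the directory already exists, so only one caller can "hold" it. This is not how Terraform or any backend implements locking; the function names and the temporary path are assumptions made for illustration.

```shell
# Local analogy for state locking: mkdir is atomic and fails if the
# directory already exists, so only one caller can hold the lock.

acquire_lock() {
  # $1: lock directory, $2: who is asking
  if mkdir "$1" 2>/dev/null; then
    echo "$2" > "$1/owner"
    return 0
  fi
  echo "Error: state locked by $(cat "$1/owner" 2>/dev/null)" >&2
  return 1
}

release_lock() {
  rm -rf "$1"
}

LOCK_DIR="${TMPDIR:-/tmp}/tf-lock-demo.$$"

acquire_lock "$LOCK_DIR" "user-a"            # succeeds
acquire_lock "$LOCK_DIR" "user-b" || true    # fails: user-a holds the lock
release_lock "$LOCK_DIR"
acquire_lock "$LOCK_DIR" "user-b"            # succeeds now
release_lock "$LOCK_DIR"
```

The important detail is that checking for the lock and taking it happen in one step. A separate "does it exist?" check followed by a create would leave a window in which both users could proceed.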

Lock Information

# View lock details (if stuck)
# Items whose LockID ends in "-md5" are state checksums, not held locks
aws dynamodb scan --table-name terraform-locks

# Example lock entry
{
  "LockID": {
    "S": "mycompany-terraform-state/prod/networking/terraform.tfstate"
  },
  "Info": {
    "S": "{\"ID\":\"xxx\",\"Operation\":\"OperationTypeApply\",\"Info\":\"\",\"Who\":\"user@machine\",\"Version\":\"1.6.0\",\"Created\":\"2024-01-15T10:30:00.000000000Z\",\"Path\":\"\"}"
  }
}
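
The Info attribute is itself a JSON document, and its ID field is the value terraform force-unlock expects. Here is a minimal sketch for pulling single fields out of it with sed, assuming the flat, string-valued layout of the example entry. The helper name lock_field is hypothetical, and a real JSON parser is more robust than this pattern match.

```shell
# lock_field FIELD
# Reads the lock's Info JSON on stdin and prints the value of one
# string field. Assumes a flat object with string values, as in the
# example entry; it is not a general JSON parser.
lock_field() {
  sed -n "s/.*\"$1\":\"\([^\"]*\)\".*/\1/p"
}

# Sample value copied from the example lock entry (unescaped).
info='{"ID":"xxx","Operation":"OperationTypeApply","Info":"","Who":"user@machine","Version":"1.6.0","Created":"2024-01-15T10:30:00.000000000Z","Path":""}'

printf '%s\n' "$info" | lock_field ID         # xxx
printf '%s\n' "$info" | lock_field Who        # user@machine
printf '%s\n' "$info" | lock_field Operation  # OperationTypeApply
```

Checking Who and Created before unlocking helps you tell a genuinely stale lock from a colleague's run that is still in progress.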

Force Unlock (Use Carefully!)

# Only if you're SURE no operation is running
# LOCK_ID is the ID printed in the lock error message
terraform force-unlock LOCK_ID

# Example
terraform force-unlock a1b2c3d4-e5f6-7890-abcd-ef1234567890

Warning: Force-unlocking while another operation is still running lets two writers update the state at the same time, which can corrupt it.

State Migration

Local to Remote

# Step 1: Add backend configuration
terraform {
  backend "s3" {
    bucket         = "mycompany-terraform-state"
    key            = "prod/app/terraform.tfstate"
    region         = "us-east-1"
    encrypt        = true
    dynamodb_table = "terraform-locks"
  }
}

# Step 2: Initialize and migrate
terraform init -migrate-state

# Terraform will ask:
# "Do you want to copy existing state to the new backend?"
# Answer: yes

Between Remote Backends

# Step 1: Pull current state
terraform state pull > terraform.tfstate.backup

# Step 2: Update backend configuration
# Edit backend.tf with new backend settings

# Step 3: Reinitialize
terraform init -migrate-state

# If that fails, manual approach:
terraform init -reconfigure
terraform state push terraform.tfstate.backup

Between State Files (Refactoring)

Moving resources between state files:

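# Remove from source state (the bucket itself is not touched)
cd old-project
terraform state rm aws_s3_bucket.data

# Import into destination state
# Move the resource block into new-project's configuration first, and delete
# it from old-project so the next apply there does not recreate the bucket
cd ../new-project
terraform import aws_s3_bucket.data my-bucket-name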

For bulk moves:

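# Move resources between states
# Note: -state and -state-out operate on local state files only. With a
# remote backend, first `terraform state pull` each state to a local file,
# run the move, then `terraform state push` each file back.
terraform state mv -state=source.tfstate -state-out=dest.tfstate \
  aws_s3_bucket.data aws_s3_bucket.data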

# Or use terraform_remote_state data source for references

Partial Backend Configuration

Keep environment-specific and sensitive values out of code:

# backend.tf
terraform {
  backend "s3" {
    key = "prod/app/terraform.tfstate"
    # bucket, region, dynamodb_table provided via -backend-config
  }
}

# backend-config/prod.hcl
bucket         = "mycompany-terraform-state"
region         = "us-east-1"
dynamodb_table = "terraform-locks"
encrypt        = true

# Initialize with partial config
terraform init -backend-config=backend-config/prod.hcl
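
With one .hcl file per environment, the init step can be wrapped so that a missing or mistyped environment fails early instead of initializing against the wrong backend. This is a minimal sketch. The helper name backend_config_arg and the backend-config/<environment>.hcl naming are assumptions based on the example above.

```shell
# backend_config_arg ENVIRONMENT [DIR]
# Prints the -backend-config argument for one environment, or fails if
# the expected file is missing. DIR defaults to backend-config.
backend_config_arg() {
  local file="${2:-backend-config}/$1.hcl"
  if [ ! -f "$file" ]; then
    echo "no backend config for '$1' (expected $file)" >&2
    return 1
  fi
  printf -- '-backend-config=%s\n' "$file"
}

# Intended use (not executed here):
#   terraform init "$(backend_config_arg prod)"
```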

Workspaces with Remote State

# Each workspace gets its own state file
terraform {
  backend "s3" {
    bucket         = "mycompany-terraform-state"
    key            = "app/terraform.tfstate"  # Used as-is by the default workspace
    region         = "us-east-1"
    dynamodb_table = "terraform-locks"
    workspace_key_prefix = "workspaces"
  }
}

terraform workspace new staging
# State file: s3://mycompany-terraform-state/workspaces/staging/app/terraform.tfstate

terraform workspace new prod
# State file: s3://mycompany-terraform-state/workspaces/prod/app/terraform.tfstate
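
The S3 backend builds the object path for a non-default workspace as <workspace_key_prefix>/<workspace>/<key>, with env: as the prefix when none is set. The default workspace uses the key unchanged. A small sketch of that rule follows; the function name workspace_state_key is hypothetical.

```shell
# workspace_state_key WORKSPACE KEY [PREFIX]
# Prints the S3 object key used for a workspace's state.
# PREFIX mirrors workspace_key_prefix and defaults to "env:".
workspace_state_key() {
  local workspace=$1 key=$2 prefix=${3:-env:}
  if [ "$workspace" = "default" ]; then
    printf '%s\n' "$key"
  else
    printf '%s/%s/%s\n' "$prefix" "$workspace" "$key"
  fi
}

workspace_state_key default app/terraform.tfstate workspaces  # app/terraform.tfstate
workspace_state_key staging app/terraform.tfstate workspaces  # workspaces/staging/app/terraform.tfstate
workspace_state_key dev app/terraform.tfstate                 # env:/dev/app/terraform.tfstate
```

Knowing the resulting path matters when you write IAM policies or list object versions for a specific workspace.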

State Access Control

IAM Policy for Terraform Users

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:PutObject",
        "s3:DeleteObject"
      ],
      "Resource": "arn:aws:s3:::mycompany-terraform-state/*"
    },
    {
      "Effect": "Allow",
      "Action": "s3:ListBucket",
      "Resource": "arn:aws:s3:::mycompany-terraform-state"
    },
    {
      "Effect": "Allow",
      "Action": [
        "dynamodb:GetItem",
        "dynamodb:PutItem",
        "dynamodb:DeleteItem"
      ],
      "Resource": "arn:aws:dynamodb:us-east-1:*:table/terraform-locks"
    }
  ]
}

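Read-Only Access (for CI/CD plan jobs)

terraform plan acquires the state lock by default, which needs dynamodb:PutItem and dynamodb:DeleteItem. With the policy below, run plan jobs with -lock=false, or add those two actions to the DynamoDB statement.
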
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:GetObject"],
      "Resource": "arn:aws:s3:::mycompany-terraform-state/*"
    },
    {
      "Effect": "Allow",
      "Action": ["s3:ListBucket"],
      "Resource": "arn:aws:s3:::mycompany-terraform-state"
    },
    {
      "Effect": "Allow",
      "Action": ["dynamodb:GetItem"],
      "Resource": "arn:aws:dynamodb:us-east-1:*:table/terraform-locks"
    }
  ]
}

Remote State Data Source

Reference outputs from other state files:

# In app configuration, reference networking outputs
data "terraform_remote_state" "networking" {
  backend = "s3"
  config = {
    bucket = "mycompany-terraform-state"
    key    = "prod/networking/terraform.tfstate"
    region = "us-east-1"
  }
}

resource "aws_instance" "app" {
  ami           = data.aws_ami.amazon_linux.id
  instance_type = "t3.micro"
  subnet_id     = data.terraform_remote_state.networking.outputs.private_subnet_ids[0]
  
  vpc_security_group_ids = [
    data.terraform_remote_state.networking.outputs.app_security_group_id
  ]
}

Disaster Recovery

Automated Backups

S3 versioning provides automatic backups:

# List state file versions
aws s3api list-object-versions \
  --bucket mycompany-terraform-state \
  --prefix prod/app/terraform.tfstate

# Restore specific version
aws s3api get-object \
  --bucket mycompany-terraform-state \
  --key prod/app/terraform.tfstate \
  --version-id "abc123" \
  restored-state.tfstate

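# Push restored state
# If the push is rejected because the remote serial is higher,
# verify the restored file and re-run with -force
terraform state push restored-state.tfstate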
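
Terraform increments the state's serial on each write, and terraform state push refuses to replace a remote state whose serial is higher than the file being pushed unless -force is given. Below is a rough sketch for checking this up front, assuming the pretty-printed JSON layout that terraform state pull produces, with "serial" on its own line near the top. Both function names are hypothetical.

```shell
# state_serial FILE
# Prints the first "serial" value found in a state file, which is the
# top-level one in Terraform's usual pretty-printed layout.
state_serial() {
  sed -n 's/^[[:space:]]*"serial":[[:space:]]*\([0-9][0-9]*\).*/\1/p' "$1" | head -n 1
}

# needs_force RESTORED CURRENT
# Succeeds when RESTORED has a lower serial than CURRENT, meaning a
# plain `terraform state push` of RESTORED would be rejected.
needs_force() {
  [ "$(state_serial "$1")" -lt "$(state_serial "$2")" ]
}

# Intended use (not executed here):
#   terraform state pull > current.tfstate
#   needs_force restored-state.tfstate current.tfstate && echo "push needs -force"
```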

Cross-Region Replication

# Enable replication for disaster recovery
# (versioning must be enabled on both the source and the replica bucket)
resource "aws_s3_bucket_replication_configuration" "state_replication" {
  bucket = aws_s3_bucket.terraform_state.id
  role   = aws_iam_role.replication.arn

  rule {
    id     = "replicate-state"
    status = "Enabled"

    destination {
      bucket        = aws_s3_bucket.terraform_state_replica.arn
      storage_class = "STANDARD"
    }
  }
}

CI/CD Best Practices

Separate Plan and Apply

# .github/workflows/terraform.yml
jobs:
  plan:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      
      - name: Terraform Plan
        run: |
          terraform init
          terraform plan -out=plan.tfplan
          
      - name: Upload Plan
        uses: actions/upload-artifact@v4
        with:
          name: tfplan
          path: plan.tfplan

  apply:
    needs: plan
    runs-on: ubuntu-latest
    environment: production  # Requires approval
    steps:
      - uses: actions/checkout@v4
      
      - name: Download Plan
        uses: actions/download-artifact@v4
        with:
          name: tfplan
          
      - name: Terraform Apply
        run: |
          terraform init
          terraform apply plan.tfplan

Lock Timeout Handling

# Wait up to 10 minutes for a held lock instead of failing immediately
terraform apply -lock-timeout=10m

Troubleshooting

"Error acquiring state lock"

# 1. Check if another operation is running
aws dynamodb scan --table-name terraform-locks

# 2. If stuck lock, verify no operation is running
# 3. Force unlock only as last resort
terraform force-unlock LOCK_ID

"State file not found"

# Verify bucket and key exist
aws s3 ls s3://mycompany-terraform-state/prod/app/

# Check IAM permissions
aws sts get-caller-identity
aws s3api head-object --bucket mycompany-terraform-state --key prod/app/terraform.tfstate

"Backend configuration changed"

# Reinitialize with migration
terraform init -migrate-state

# Or reconfigure without migration
terraform init -reconfigure

Key Takeaways

  1. Always use remote state in team environments
  2. Always enable state locking—DynamoDB for S3, automatic for GCS
  3. Enable versioning for disaster recovery
  4. Organize state files by environment and component
  5. Use partial configuration to keep credentials out of code
  6. Never force-unlock unless you’re certain no operation is running
  7. Back up before migrations: terraform state pull > backup.tfstate

State management isn’t glamorous, but getting it wrong is catastrophic. Invest time in proper backend configuration, and you’ll avoid the late-night “who corrupted the state file” incidents.