Object storage is the backbone of cloud-native applications. Whether you’re storing user uploads, data lake files, or application backups, understanding the nuances between S3, GCS, and Azure Blob can save you thousands in storage costs and optimize performance.

Feature Comparison

Feature              AWS S3                   Google Cloud Storage       Azure Blob Storage
Storage Classes      6                        4                          4
Max Object Size      5 TB                     5 TB                       190.7 TB (block blob)
Versioning           Yes                      Yes                        Yes
Lifecycle Policies   Yes                      Yes                        Yes
Event Triggers       Lambda, SNS, SQS         Cloud Functions, Pub/Sub   Event Grid, Functions
CDN Integration      CloudFront               Cloud CDN                  Azure CDN
Encryption           SSE-S3, SSE-KMS, SSE-C   Google-managed, CMEK       Microsoft-managed, CMK

AWS S3

The original cloud object storage, S3 set the standard for the industry.

Creating a Bucket with Terraform

resource "aws_s3_bucket" "main" {
  bucket = "my-app-data-${var.environment}"

  tags = {
    Environment = var.environment
    Project     = "my-app"
  }
}

resource "aws_s3_bucket_versioning" "main" {
  bucket = aws_s3_bucket.main.id

  versioning_configuration {
    status = "Enabled"
  }
}

resource "aws_s3_bucket_server_side_encryption_configuration" "main" {
  bucket = aws_s3_bucket.main.id

  rule {
    apply_server_side_encryption_by_default {
      sse_algorithm     = "aws:kms"
      kms_master_key_id = aws_kms_key.s3.arn
    }
    bucket_key_enabled = true
  }
}

resource "aws_s3_bucket_public_access_block" "main" {
  bucket = aws_s3_bucket.main.id

  block_public_acls       = true
  block_public_policy     = true
  ignore_public_acls      = true
  restrict_public_buckets = true
}

resource "aws_s3_bucket_lifecycle_configuration" "main" {
  bucket = aws_s3_bucket.main.id

  rule {
    id     = "archive-old-data"
    status = "Enabled"

    transition {
      days          = 30
      storage_class = "STANDARD_IA"
    }

    transition {
      days          = 90
      storage_class = "GLACIER"
    }

    transition {
      days          = 365
      storage_class = "DEEP_ARCHIVE"
    }

    noncurrent_version_transition {
      noncurrent_days = 30
      storage_class   = "GLACIER"
    }

    noncurrent_version_expiration {
      noncurrent_days = 90
    }
  }

  rule {
    id     = "abort-incomplete-uploads"
    status = "Enabled"

    abort_incomplete_multipart_upload {
      days_after_initiation = 7
    }
  }
}

S3 Storage Classes

Standard          → Frequently accessed data
Standard-IA       → Infrequent access, rapid retrieval
One Zone-IA       → Infrequent access, single AZ (cheaper)
Glacier Instant   → Archive with millisecond access
Glacier Flexible  → Archive with minutes-hours retrieval
Glacier Deep      → Coldest storage, 12-48 hour retrieval
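
The lifecycle transitions configured in the Terraform above (30 days to Standard-IA, 90 to Glacier, 365 to Deep Archive) can be sketched as a pure function for reasoning about where an object of a given age lives. This is an illustrative helper, not part of any SDK:

```python
def storage_class_for_age(age_days):
    """Mirror the lifecycle transitions above: 30d -> STANDARD_IA,
    90d -> GLACIER, 365d -> DEEP_ARCHIVE (hypothetical helper)."""
    if age_days >= 365:
        return "DEEP_ARCHIVE"
    if age_days >= 90:
        return "GLACIER"
    if age_days >= 30:
        return "STANDARD_IA"
    return "STANDARD"
```

For example, an object last modified 45 days ago sits in Standard-IA; one untouched for over a year has reached Deep Archive.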

Python SDK Example

import time

import boto3
from botocore.config import Config

# Configure with retries and timeouts
config = Config(
    retries={'max_attempts': 3, 'mode': 'adaptive'},
    connect_timeout=5,
    read_timeout=30
)

s3 = boto3.client('s3', config=config)

# Upload with metadata
def upload_file(bucket, key, file_path, content_type=None):
    extra_args = {
        'Metadata': {
            'uploaded-by': 'my-app',
            'timestamp': str(int(time.time()))
        }
    }
    if content_type:
        extra_args['ContentType'] = content_type

    s3.upload_file(file_path, bucket, key, ExtraArgs=extra_args)

# Generate presigned URL
def get_presigned_url(bucket, key, expiration=3600):
    return s3.generate_presigned_url(
        'get_object',
        Params={'Bucket': bucket, 'Key': key},
        ExpiresIn=expiration
    )

# Multipart upload for large files
def upload_large_file(bucket, key, file_path, part_size=100*1024*1024):
    from boto3.s3.transfer import TransferConfig

    config = TransferConfig(
        multipart_threshold=part_size,
        multipart_chunksize=part_size,
        max_concurrency=10,
        use_threads=True
    )

    s3.upload_file(file_path, bucket, key, Config=config)

Google Cloud Storage

GCS offers tight integration with BigQuery and data analytics tools.

Creating a Bucket with Terraform

resource "google_storage_bucket" "main" {
  name          = "my-app-data-${var.environment}"
  location      = "US"
  storage_class = "STANDARD"

  uniform_bucket_level_access = true

  versioning {
    enabled = true
  }

  encryption {
    default_kms_key_name = google_kms_crypto_key.storage.id
  }

  lifecycle_rule {
    condition {
      age = 30
    }
    action {
      type          = "SetStorageClass"
      storage_class = "NEARLINE"
    }
  }

  lifecycle_rule {
    condition {
      age = 90
    }
    action {
      type          = "SetStorageClass"
      storage_class = "COLDLINE"
    }
  }

  lifecycle_rule {
    condition {
      age = 365
    }
    action {
      type          = "SetStorageClass"
      storage_class = "ARCHIVE"
    }
  }

  lifecycle_rule {
    condition {
      num_newer_versions = 3
    }
    action {
      type = "Delete"
    }
  }

  cors {
    origin          = ["https://myapp.com"]
    method          = ["GET", "HEAD", "PUT", "POST"]
    response_header = ["*"]
    max_age_seconds = 3600
  }

  labels = {
    environment = var.environment
    project     = "my-app"
  }
}

# IAM binding
resource "google_storage_bucket_iam_member" "viewer" {
  bucket = google_storage_bucket.main.name
  role   = "roles/storage.objectViewer"
  member = "serviceAccount:${google_service_account.app.email}"
}

GCS Storage Classes

Standard    → Frequently accessed data
Nearline    → Access once per month (30-day minimum)
Coldline    → Access once per quarter (90-day minimum)
Archive     → Access once per year (365-day minimum)
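
Those minimums are billing floors: deleting a Nearline object after 10 days still incurs the full 30 days of storage. A minimal sketch of the billed duration (`billed_days` is an illustrative helper, not an SDK call):

```python
# Minimum storage durations (days) per GCS class, from the list above.
MIN_DURATION = {"STANDARD": 0, "NEARLINE": 30, "COLDLINE": 90, "ARCHIVE": 365}

def billed_days(storage_class, actual_days):
    """Days of storage billed: early deletion is charged up to the class minimum."""
    return max(actual_days, MIN_DURATION[storage_class])
```

So `billed_days("NEARLINE", 10)` is 30, which is why short-lived data belongs in Standard even though its per-GB rate is higher.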

Python SDK Example

import time

from google.cloud import storage
from google.cloud.storage import transfer_manager

# Initialize client
client = storage.Client()
bucket = client.bucket('my-app-data')

# Upload with metadata
def upload_file(blob_name, file_path, content_type=None):
    blob = bucket.blob(blob_name)
    blob.metadata = {
        'uploaded-by': 'my-app',
        'timestamp': str(int(time.time()))
    }
    if content_type:
        blob.content_type = content_type
    blob.upload_from_filename(file_path)

# Generate signed URL
def get_signed_url(blob_name, expiration_minutes=60):
    from datetime import timedelta
    blob = bucket.blob(blob_name)
    return blob.generate_signed_url(
        version='v4',
        expiration=timedelta(minutes=expiration_minutes),
        method='GET'
    )

# Parallel upload for multiple files
def upload_many_files(file_paths, workers=8):
    results = transfer_manager.upload_many_from_filenames(
        bucket,
        file_paths,
        max_workers=workers
    )
    for path, result in zip(file_paths, results):
        if isinstance(result, Exception):
            print(f"Failed: {path}: {result}")

Azure Blob Storage

Azure Blob excels at hierarchical namespaces and big data scenarios.

Creating Storage with Terraform

resource "azurerm_storage_account" "main" {
  name                     = "myappdata${var.environment}"
  resource_group_name      = azurerm_resource_group.main.name
  location                 = azurerm_resource_group.main.location
  account_tier             = "Standard"
  account_replication_type = "GRS"
  account_kind             = "StorageV2"

  is_hns_enabled           = true  # Hierarchical namespace (Data Lake Gen2); note: blob versioning is not supported on HNS accounts
  min_tls_version          = "TLS1_2"

  blob_properties {
    versioning_enabled = true

    delete_retention_policy {
      days = 30
    }

    container_delete_retention_policy {
      days = 7
    }
  }

  network_rules {
    default_action             = "Deny"
    ip_rules                   = var.allowed_ips
    virtual_network_subnet_ids = [azurerm_subnet.private.id]
    bypass                     = ["AzureServices"]
  }

  identity {
    type = "SystemAssigned"
  }

  tags = var.tags
}

resource "azurerm_storage_container" "data" {
  name                  = "data"
  storage_account_name  = azurerm_storage_account.main.name
  container_access_type = "private"
}

resource "azurerm_storage_management_policy" "lifecycle" {
  storage_account_id = azurerm_storage_account.main.id

  rule {
    name    = "archive-old-data"
    enabled = true

    filters {
      blob_types   = ["blockBlob"]
      prefix_match = ["data/"]
    }

    actions {
      base_blob {
        tier_to_cool_after_days_since_modification_greater_than    = 30
        tier_to_archive_after_days_since_modification_greater_than = 90
        delete_after_days_since_modification_greater_than          = 365
      }
      snapshot {
        delete_after_days_since_creation_greater_than = 30
      }
      version {
        delete_after_days_since_creation = 90
      }
    }
  }
}

Azure Storage Tiers

Hot       → Frequently accessed data
Cool      → Infrequent access, 30-day minimum
Cold      → Rare access, 90-day minimum
Archive   → Long-term storage, hours to rehydrate
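
Tiers can also be changed per blob from the SDK rather than through a management policy. A minimal sketch, assuming the azure-storage-blob package and a connection string (`set_blob_tier` is a hypothetical helper name):

```python
def set_blob_tier(connection_string, container_name, blob_name, tier="Cool"):
    """Move one blob to another access tier.

    Moving a blob out of Archive triggers rehydration, which can take hours.
    """
    from azure.storage.blob import BlobServiceClient  # imported lazily for the sketch

    service = BlobServiceClient.from_connection_string(connection_string)
    blob = service.get_blob_client(container=container_name, blob=blob_name)
    blob.set_standard_blob_tier(tier)  # "Hot", "Cool", "Cold", or "Archive"
```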

Python SDK Example

import os
import time
from datetime import datetime, timedelta

from azure.storage.blob import (
    BlobSasPermissions,
    BlobServiceClient,
    ContentSettings,
    generate_blob_sas,
)

# Initialize client
connection_string = os.environ['AZURE_STORAGE_CONNECTION_STRING']
blob_service = BlobServiceClient.from_connection_string(connection_string)
container = blob_service.get_container_client('data')

# Upload with metadata
def upload_file(blob_name, file_path, content_type=None):
    blob = container.get_blob_client(blob_name)
    with open(file_path, 'rb') as data:
        blob.upload_blob(
            data,
            overwrite=True,
            content_settings=ContentSettings(content_type=content_type) if content_type else None,
            metadata={
                'uploaded-by': 'my-app',
                'timestamp': str(int(time.time()))
            }
        )

# Generate SAS URL
def get_sas_url(blob_name, expiration_hours=1):
    account_name = blob_service.account_name
    account_key = os.environ['AZURE_STORAGE_KEY']

    sas_token = generate_blob_sas(
        account_name=account_name,
        container_name='data',
        blob_name=blob_name,
        account_key=account_key,
        permission=BlobSasPermissions(read=True),
        expiry=datetime.utcnow() + timedelta(hours=expiration_hours)
    )

    return f"https://{account_name}.blob.core.windows.net/data/{blob_name}?{sas_token}"

# Parallel upload
import asyncio

async def upload_single(container_client, path):
    blob = container_client.get_blob_client(path)
    with open(path, 'rb') as data:
        await blob.upload_blob(data, overwrite=True)

async def upload_many_files(file_paths):
    from azure.storage.blob.aio import BlobServiceClient as AsyncBlobService

    async with AsyncBlobService.from_connection_string(connection_string) as client:
        container = client.get_container_client('data')
        await asyncio.gather(*(upload_single(container, path) for path in file_paths))

Pricing Comparison

Storage Cost (per GB/month, US regions)

Tier            S3         GCS       Azure
Hot/Standard    $0.023     $0.020    $0.018
Cool/Nearline   $0.0125    $0.010    $0.010
Cold/Coldline   $0.004     $0.004    $0.0036
Archive         $0.00099   $0.0012   $0.00099

Operations Cost (per 1,000 requests)

Operation   S3        GCS       Azure
PUT/POST    $0.005    $0.005    $0.005
GET         $0.0004   $0.0004   $0.0004
DELETE      Free      Free      Free

Egress Cost (per GB)

All three charge similar egress rates: ~$0.09/GB for first 10TB, decreasing with volume.
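
Putting the storage and egress numbers together, a quick estimator makes the comparison concrete. A sketch using the per-GB prices from the tables above, with egress approximated at a flat $0.09/GB (request charges ignored):

```python
# Per-GB/month storage prices from the comparison tables above.
STORAGE_PRICE = {
    "s3":    {"hot": 0.023, "cool": 0.0125, "cold": 0.004,  "archive": 0.00099},
    "gcs":   {"hot": 0.020, "cool": 0.010,  "cold": 0.004,  "archive": 0.0012},
    "azure": {"hot": 0.018, "cool": 0.010,  "cold": 0.0036, "archive": 0.00099},
}
EGRESS_PER_GB = 0.09  # flat approximation for the first 10 TB

def monthly_cost(provider, tier, stored_gb, egress_gb=0):
    """Rough monthly bill: storage plus egress."""
    return STORAGE_PRICE[provider][tier] * stored_gb + EGRESS_PER_GB * egress_gb
```

For 1 TiB of hot data with 100 GB of egress, storage is the smaller line item on every provider; the egress charge alone ($9) is a third of the bill, which is the "egress is the killer" point in concrete terms.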

Cross-Cloud Data Transfer

Using rclone

# Install rclone
curl https://rclone.org/install.sh | sudo bash

# Configure sources
rclone config

# Sync S3 to GCS
rclone sync s3:my-bucket gcs:my-bucket --progress

# Copy Azure to S3
rclone copy azure:container s3:bucket/prefix --transfers=16

Terraform Multi-Cloud Replication

# Note: S3 replication can only target another S3 bucket. A common pattern is to
# replicate into a dedicated "bridge" bucket that GCS Storage Transfer Service
# (or rclone, above) then pulls from on a schedule.
resource "aws_s3_bucket_replication_configuration" "to_gcs" {
  bucket = aws_s3_bucket.source.id
  role   = aws_iam_role.replication.arn

  rule {
    id     = "replicate-to-gcs"
    status = "Enabled"

    destination {
      bucket        = "arn:aws:s3:::gcs-bridge-bucket"
      storage_class = "STANDARD"
    }
  }
}

Security Best Practices

Common Across All Providers

# Checklist for object storage security
security_checklist:
  - Block all public access by default
  - Enable versioning for critical data
  - Use customer-managed encryption keys
  - Enable access logging
  - Implement lifecycle policies
  - Use VPC endpoints/Private Link
  - Apply least-privilege IAM policies
  - Enable MFA delete for sensitive buckets
  - Configure CORS only for required origins
  - Use presigned URLs instead of public access
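
The first checklist item can be audited from code. A hedged sketch with boto3 (the function name and usage are illustrative), checking that all four S3 public-access-block flags are enabled on a bucket:

```python
def is_public_access_blocked(bucket_name):
    """Return True only if all four S3 public-access-block settings are on."""
    import boto3
    from botocore.exceptions import ClientError

    s3 = boto3.client("s3")
    try:
        conf = s3.get_public_access_block(Bucket=bucket_name)[
            "PublicAccessBlockConfiguration"
        ]
    except ClientError:
        return False  # no public-access-block configuration set at all

    return all(conf.get(flag) for flag in (
        "BlockPublicAcls", "BlockPublicPolicy",
        "IgnorePublicAcls", "RestrictPublicBuckets",
    ))
```

Running this across every bucket in an account is a cheap periodic check before reaching for heavier tooling.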

Key Takeaways

  1. Azure has the cheapest hot tier — $0.018/GB vs $0.020 (GCS) and $0.023 (S3)
  2. S3 has the most storage classes — 6 tiers for fine-grained optimization
  3. Azure excels at hierarchical data — HNS for data lake scenarios
  4. Egress costs are the killer — design to minimize cross-region traffic
  5. Use lifecycle policies — automate tier transitions to save 50%+

“The cheapest storage is the storage you don’t have to pay for. Delete what you don’t need, archive what you rarely access.”