S3 vs GCS vs Azure Blob: Object Storage Compared
Deep dive into cloud object storage services. Compare features, pricing, performance, and best practices for AWS S3, Google Cloud Storage, and Azure Blob Storage.
Object storage is the backbone of cloud-native applications. Whether you’re storing user uploads, data lake files, or application backups, understanding the differences between S3, GCS, and Azure Blob can save you thousands in storage costs and help you get better performance.
Feature Comparison
| Feature | AWS S3 | Google Cloud Storage | Azure Blob Storage |
|---|---|---|---|
| Storage Classes | 6 | 4 | 4 |
| Max Object Size | 5 TB | 5 TB | 190.7 TB (block blob) |
| Versioning | Yes | Yes | Yes |
| Lifecycle Policies | Yes | Yes | Yes |
| Event Triggers | Lambda, SNS, SQS | Cloud Functions, Pub/Sub | Event Grid, Functions |
| CDN Integration | CloudFront | Cloud CDN | Azure CDN |
| Encryption | SSE-S3, SSE-KMS, SSE-C | Google-managed, CMEK | Microsoft-managed, CMK |
AWS S3
The original cloud object storage, S3 set the standard for the industry.
Creating a Bucket with Terraform
resource "aws_s3_bucket" "main" {
bucket = "my-app-data-${var.environment}"
tags = {
Environment = var.environment
Project = "my-app"
}
}
resource "aws_s3_bucket_versioning" "main" {
bucket = aws_s3_bucket.main.id
versioning_configuration {
status = "Enabled"
}
}
resource "aws_s3_bucket_server_side_encryption_configuration" "main" {
bucket = aws_s3_bucket.main.id
rule {
apply_server_side_encryption_by_default {
sse_algorithm = "aws:kms"
kms_master_key_id = aws_kms_key.s3.arn
}
bucket_key_enabled = true
}
}
resource "aws_s3_bucket_public_access_block" "main" {
bucket = aws_s3_bucket.main.id
block_public_acls = true
block_public_policy = true
ignore_public_acls = true
restrict_public_buckets = true
}
resource "aws_s3_bucket_lifecycle_configuration" "main" {
bucket = aws_s3_bucket.main.id
rule {
id = "archive-old-data"
status = "Enabled"
transition {
days = 30
storage_class = "STANDARD_IA"
}
transition {
days = 90
storage_class = "GLACIER"
}
transition {
days = 365
storage_class = "DEEP_ARCHIVE"
}
noncurrent_version_transition {
noncurrent_days = 30
storage_class = "GLACIER"
}
noncurrent_version_expiration {
noncurrent_days = 90
}
}
rule {
id = "abort-incomplete-uploads"
status = "Enabled"
abort_incomplete_multipart_upload {
days_after_initiation = 7
}
}
}
S3 Storage Classes
Standard → Frequently accessed data
Standard-IA → Infrequent access, rapid retrieval
One Zone-IA → Infrequent access, single AZ (cheaper)
Glacier Instant → Archive with millisecond access
Glacier Flexible → Archive with minutes-hours retrieval
Glacier Deep → Coldest storage, 12-48 hour retrieval
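Objects in Glacier Flexible Retrieval or Deep Archive aren’t readable until you restore them (Glacier Instant Retrieval is, as the name suggests, exempt). A minimal boto3 sketch; the bucket and key are placeholders:
import boto3

s3 = boto3.client('s3')

# Kick off a restore job; the temporary copy stays readable for 7 days
s3.restore_object(
    Bucket='my-app-data',
    Key='archive/report-2020.parquet',
    RestoreRequest={
        'Days': 7,
        'GlacierJobParameters': {'Tier': 'Standard'}  # or 'Expedited' / 'Bulk'
    }
)

# The Restore header reports progress until the copy is available
head = s3.head_object(Bucket='my-app-data', Key='archive/report-2020.parquet')
print(head.get('Restore'))  # e.g. 'ongoing-request="true"'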
Python SDK Example
import time
import boto3
from botocore.config import Config
# Configure with retries and timeouts
config = Config(
retries={'max_attempts': 3, 'mode': 'adaptive'},
connect_timeout=5,
read_timeout=30
)
s3 = boto3.client('s3', config=config)
# Upload with metadata
def upload_file(bucket, key, file_path, content_type=None):
extra_args = {
'Metadata': {
'uploaded-by': 'my-app',
'timestamp': str(int(time.time()))
}
}
if content_type:
extra_args['ContentType'] = content_type
s3.upload_file(file_path, bucket, key, ExtraArgs=extra_args)
# Generate presigned URL
def get_presigned_url(bucket, key, expiration=3600):
return s3.generate_presigned_url(
'get_object',
Params={'Bucket': bucket, 'Key': key},
ExpiresIn=expiration
)
# Multipart upload for large files
def upload_large_file(bucket, key, file_path, part_size=100*1024*1024):
from boto3.s3.transfer import TransferConfig
config = TransferConfig(
multipart_threshold=part_size,
multipart_chunksize=part_size,
max_concurrency=10,
use_threads=True
)
s3.upload_file(file_path, bucket, key, Config=config)
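A quick usage sketch tying these helpers together (bucket and paths are placeholders):
upload_file('my-app-data', 'uploads/avatar.png', '/tmp/avatar.png', content_type='image/png')
url = get_presigned_url('my-app-data', 'uploads/avatar.png', expiration=900)
print(url)  # shareable for 15 minutes, no public bucket access required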
Google Cloud Storage
GCS offers tight integration with BigQuery and data analytics tools.
Creating a Bucket with Terraform
resource "google_storage_bucket" "main" {
name = "my-app-data-${var.environment}"
location = "US"
storage_class = "STANDARD"
uniform_bucket_level_access = true
versioning {
enabled = true
}
encryption {
default_kms_key_name = google_kms_crypto_key.storage.id
}
lifecycle_rule {
condition {
age = 30
}
action {
type = "SetStorageClass"
storage_class = "NEARLINE"
}
}
lifecycle_rule {
condition {
age = 90
}
action {
type = "SetStorageClass"
storage_class = "COLDLINE"
}
}
lifecycle_rule {
condition {
age = 365
}
action {
type = "SetStorageClass"
storage_class = "ARCHIVE"
}
}
lifecycle_rule {
condition {
num_newer_versions = 3
}
action {
type = "Delete"
}
}
cors {
origin = ["https://myapp.com"]
method = ["GET", "HEAD", "PUT", "POST"]
response_header = ["*"]
max_age_seconds = 3600
}
labels = {
environment = var.environment
project = "my-app"
}
}
# IAM binding
resource "google_storage_bucket_iam_member" "viewer" {
bucket = google_storage_bucket.main.name
role = "roles/storage.objectViewer"
member = "serviceAccount:${google_service_account.app.email}"
}
GCS Storage Classes
Standard → Frequently accessed data
Nearline → Access once per month (30-day minimum)
Coldline → Access once per quarter (90-day minimum)
Archive → Access once per year (365-day minimum)
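Lifecycle rules handle transitions automatically, but you can also demote a single object on demand; in the Python client this rewrites the object in place. A small sketch (names are placeholders), keeping in mind the minimum-duration charges above still apply:
from google.cloud import storage

client = storage.Client()
blob = client.bucket('my-app-data').blob('logs/2023-01.json')

# Rewrites the object into the cheaper class
blob.update_storage_class('NEARLINE')
print(blob.storage_class)  # 'NEARLINE'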
Python SDK Example
import time
from google.cloud import storage
from google.cloud.storage import transfer_manager
# Initialize client
client = storage.Client()
bucket = client.bucket('my-app-data')
# Upload with metadata
def upload_file(blob_name, file_path, content_type=None):
blob = bucket.blob(blob_name)
blob.metadata = {
'uploaded-by': 'my-app',
'timestamp': str(int(time.time()))
}
if content_type:
blob.content_type = content_type
blob.upload_from_filename(file_path)
# Generate signed URL
def get_signed_url(blob_name, expiration_minutes=60):
from datetime import timedelta
blob = bucket.blob(blob_name)
return blob.generate_signed_url(
version='v4',
expiration=timedelta(minutes=expiration_minutes),
method='GET'
)
# Parallel upload for multiple files
def upload_many_files(file_paths, workers=8):
results = transfer_manager.upload_many_from_filenames(
bucket,
file_paths,
max_workers=workers
)
for path, result in zip(file_paths, results):
if isinstance(result, Exception):
print(f"Failed: {path}: {result}")
Azure Blob Storage
Azure Blob Storage stands out for its hierarchical namespace (Azure Data Lake Storage Gen2) and big-data scenarios.
Creating Storage with Terraform
resource "azurerm_storage_account" "main" {
name = "myappdata${var.environment}"
resource_group_name = azurerm_resource_group.main.name
location = azurerm_resource_group.main.location
account_tier = "Standard"
account_replication_type = "GRS"
account_kind = "StorageV2"
is_hns_enabled = true # Enable hierarchical namespace
min_tls_version = "TLS1_2"
blob_properties {
versioning_enabled = true
delete_retention_policy {
days = 30
}
container_delete_retention_policy {
days = 7
}
}
network_rules {
default_action = "Deny"
ip_rules = var.allowed_ips
virtual_network_subnet_ids = [azurerm_subnet.private.id]
bypass = ["AzureServices"]
}
identity {
type = "SystemAssigned"
}
tags = var.tags
}
resource "azurerm_storage_container" "data" {
name = "data"
storage_account_name = azurerm_storage_account.main.name
container_access_type = "private"
}
resource "azurerm_storage_management_policy" "lifecycle" {
storage_account_id = azurerm_storage_account.main.id
rule {
name = "archive-old-data"
enabled = true
filters {
blob_types = ["blockBlob"]
prefix_match = ["data/"]
}
actions {
base_blob {
tier_to_cool_after_days_since_modification_greater_than = 30
tier_to_archive_after_days_since_modification_greater_than = 90
delete_after_days_since_modification_greater_than = 365
}
snapshot {
delete_after_days_since_creation_greater_than = 30
}
version {
delete_after_days_since_creation = 90
}
}
}
}
Azure Storage Tiers
Hot → Frequently accessed data
Cool → Infrequent access, 30-day minimum
Cold → Rare access, 90-day minimum
Archive → Long-term storage, hours to rehydrate
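Archived blobs can’t be read until they’re rehydrated back to Hot or Cool, which can take hours. A minimal sketch with azure-storage-blob (container and blob names are placeholders):
import os
from azure.storage.blob import BlobServiceClient

blob_service = BlobServiceClient.from_connection_string(
    os.environ['AZURE_STORAGE_CONNECTION_STRING']
)
blob = blob_service.get_blob_client('data', 'archive/old-report.csv')

# Start rehydration; higher priority costs more but completes sooner
blob.set_standard_blob_tier('Hot', rehydrate_priority='High')

# archive_status stays 'rehydrate-pending-to-hot' until the blob is readable
print(blob.get_blob_properties().archive_status)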
Python SDK Example
import os
import time
import asyncio
from datetime import datetime, timedelta
from azure.storage.blob import BlobServiceClient, BlobSasPermissions, ContentSettings, generate_blob_sas
# Initialize client
connection_string = os.environ['AZURE_STORAGE_CONNECTION_STRING']
blob_service = BlobServiceClient.from_connection_string(connection_string)
container = blob_service.get_container_client('data')
# Upload with metadata
def upload_file(blob_name, file_path, content_type=None):
blob = container.get_blob_client(blob_name)
with open(file_path, 'rb') as data:
blob.upload_blob(
data,
overwrite=True,
content_settings=ContentSettings(content_type=content_type) if content_type else None,
metadata={
'uploaded-by': 'my-app',
'timestamp': str(int(time.time()))
}
)
# Generate SAS URL
def get_sas_url(blob_name, expiration_hours=1):
account_name = blob_service.account_name
account_key = os.environ['AZURE_STORAGE_KEY']
sas_token = generate_blob_sas(
account_name=account_name,
container_name='data',
blob_name=blob_name,
account_key=account_key,
permission=BlobSasPermissions(read=True),
expiry=datetime.utcnow() + timedelta(hours=expiration_hours)
)
return f"https://{account_name}.blob.core.windows.net/data/{blob_name}?{sas_token}"
# Parallel upload
async def upload_many_files(file_paths):
    from azure.storage.blob.aio import BlobServiceClient as AsyncBlobService

    async def upload_single(container, path):
        # Blob is named after the local file; adjust as needed
        blob = container.get_blob_client(os.path.basename(path))
        with open(path, 'rb') as data:
            await blob.upload_blob(data, overwrite=True)

    async with AsyncBlobService.from_connection_string(connection_string) as client:
        container = client.get_container_client('data')
        await asyncio.gather(*(upload_single(container, p) for p in file_paths))
Pricing Comparison
Storage Cost (per GB/month, US regions)
| Tier | S3 | GCS | Azure |
|---|---|---|---|
| Hot/Standard | $0.023 | $0.020 | $0.018 |
| Cool/Nearline | $0.0125 | $0.010 | $0.010 |
| Cold/Coldline | $0.004 | $0.004 | $0.0036 |
| Archive | $0.00099 | $0.0012 | $0.00099 |
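At these list prices, 10 TB of hot data runs roughly $235/month on S3, $205 on GCS, and $184 on Azure (10,240 GB × the per-GB rate), so tiering rarely read data down even one level pays off quickly.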
Operations Cost (per 1,000 requests)
| Operation | S3 | GCS | Azure |
|---|---|---|---|
| PUT/POST | $0.005 | $0.005 | $0.005 |
| GET | $0.0004 | $0.0004 | $0.0004 |
| DELETE | Free | Free | Free |
Egress Cost (per GB)
All three charge similar egress rates: roughly $0.09/GB for the first 10 TB per month, decreasing with volume. At that rate, serving 1 TB of downloads adds about $92, often more than the monthly cost of storing it.
Cross-Cloud Data Transfer
Using rclone
# Install rclone
curl https://rclone.org/install.sh | sudo bash
# Configure sources
rclone config
# Sync S3 to GCS
rclone sync s3:my-bucket gcs:my-bucket --progress
# Copy Azure to S3
rclone copy azure:container s3:bucket/prefix --transfers=16
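# Verify a transfer (compares sizes, and checksums where both backends support them)
rclone check s3:my-bucket gcs:my-bucket --one-way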
Terraform Multi-Cloud Replication
# S3 replication can only target other S3 buckets (and both sides need versioning
# enabled), so the pattern here replicates to an S3 staging bucket that is then
# synced to GCS with rclone or Storage Transfer Service (see the sketch below)
resource "aws_s3_bucket_replication_configuration" "to_gcs" {
bucket = aws_s3_bucket.source.id
role = aws_iam_role.replication.arn
rule {
id = "replicate-to-gcs"
status = "Enabled"
destination {
bucket = "arn:aws:s3:::gcs-bridge-bucket"
storage_class = "STANDARD"
}
}
}
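The second leg, staging bucket to GCS, has to run from the GCP side since S3 replication stops at S3. A sketch using the google-cloud-storage-transfer client; the project ID, credentials, and exact request shape are assumptions worth checking against the current library docs:
import os
from google.cloud import storage_transfer

client = storage_transfer.StorageTransferServiceClient()

# One-shot job pulling the S3 staging bucket into GCS
client.create_transfer_job({
    'transfer_job': {
        'project_id': 'my-gcp-project',  # placeholder
        'status': storage_transfer.TransferJob.Status.ENABLED,
        'transfer_spec': {
            'aws_s3_data_source': {
                'bucket_name': 'gcs-bridge-bucket',
                'aws_access_key': {
                    'access_key_id': os.environ['AWS_ACCESS_KEY_ID'],
                    'secret_access_key': os.environ['AWS_SECRET_ACCESS_KEY'],
                },
            },
            'gcs_data_sink': {'bucket_name': 'my-bucket'},
        },
    }
})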
Security Best Practices
Common Across All Providers
# Checklist for object storage security
security_checklist:
- Block all public access by default
- Enable versioning for critical data
- Use customer-managed encryption keys
- Enable access logging
- Implement lifecycle policies
- Use VPC endpoints/Private Link
- Apply least-privilege IAM policies
- Enable MFA delete for sensitive buckets
- Configure CORS only for required origins
- Use presigned URLs instead of public access
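To make the least-privilege item concrete, here is a sketch of an inline IAM policy that limits an app role to a single prefix; the role and bucket names are placeholders:
import json
import boto3

iam = boto3.client('iam')

# Read/write one prefix only: no listing other buckets, no deletes
policy = {
    'Version': '2012-10-17',
    'Statement': [{
        'Effect': 'Allow',
        'Action': ['s3:GetObject', 's3:PutObject'],
        'Resource': 'arn:aws:s3:::my-app-data-prod/uploads/*'
    }]
}

iam.put_role_policy(
    RoleName='my-app-role',
    PolicyName='s3-uploads-least-privilege',
    PolicyDocument=json.dumps(policy)
)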
Key Takeaways
- Azure has the cheapest hot tier — $0.018/GB vs $0.020 (GCS) and $0.023 (S3)
- S3 has the most storage classes — 6 tiers for fine-grained optimization
- Azure excels at hierarchical data — HNS for data lake scenarios
- Egress costs are the killer — design to minimize cross-region traffic
- Use lifecycle policies — automate tier transitions to save 50%+
“The cheapest storage is the storage you don’t have to pay for. Delete what you don’t need, archive what you rarely access.”