Google Cloud offers two serverless compute platforms: Cloud Functions for event-driven functions and Cloud Run for containerized applications. Both scale to zero and charge per-use, but they serve different needs. This guide helps you choose.

Quick Comparison

| Feature | Cloud Functions | Cloud Run |
|---|---|---|
| Unit of deployment | Function | Container |
| Languages | Node.js, Python, Go, Java, .NET, Ruby, PHP | Any (containerized) |
| Max timeout | 60 min (2nd gen, HTTP; 9 min for event-driven) | 60 min |
| Max memory | 32 GB | 32 GB |
| Max vCPUs | 8 | 8 |
| Concurrency | 1 per instance (1st gen), configurable (2nd gen) | Up to 1,000 per instance |
| Min instances | 0 or more | 0 or more |
| Cold start | Higher | Lower (with min instances) |
| Pricing model | Per invocation + compute | Per request + compute |

Cloud Functions: Event-Driven Simplicity

HTTP Function (2nd Gen)

# main.py
import functions_framework
from flask import jsonify

@functions_framework.http
def hello(request):
    """HTTP Cloud Function."""
    name = request.args.get('name', 'World')
    return jsonify({'message': f'Hello, {name}!'})
# Deploy HTTP function
gcloud functions deploy hello \
  --gen2 \
  --runtime=python312 \
  --region=us-central1 \
  --source=. \
  --entry-point=hello \
  --trigger-http \
  --allow-unauthenticated \
  --memory=256MB \
  --timeout=60s

Event-Driven Function (Pub/Sub)

# main.py
import functions_framework
import base64
import json

@functions_framework.cloud_event
def process_message(cloud_event):
    """Triggered by Pub/Sub message."""
    data = base64.b64decode(cloud_event.data["message"]["data"]).decode()
    message = json.loads(data)
    
    print(f"Processing order: {message['order_id']}")
    # Process order...
    
    return 'OK'
# Deploy Pub/Sub triggered function
gcloud functions deploy process-message \
  --gen2 \
  --runtime=python312 \
  --region=us-central1 \
  --source=. \
  --entry-point=process_message \
  --trigger-topic=orders \
  --memory=512MB \
  --timeout=300s
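Pub/Sub base64-encodes the payload inside the CloudEvent, so the decode step in process_message can be sanity-checked locally with nothing but the standard library (the order payload here is made up):

```python
import base64
import json

# Build the same envelope shape Pub/Sub delivers in cloud_event.data
payload = {"order_id": "A-1001"}
envelope = {"message": {"data": base64.b64encode(json.dumps(payload).encode()).decode()}}

# Mirror the decode path used in process_message above
decoded = json.loads(base64.b64decode(envelope["message"]["data"]).decode())
print(decoded["order_id"])  # A-1001
```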

Terraform for Cloud Functions

resource "google_cloudfunctions2_function" "api" {
  name     = "api-function"
  location = "us-central1"

  build_config {
    runtime     = "python312"
    entry_point = "handler"
    
    source {
      storage_source {
        bucket = google_storage_bucket.source.name
        object = google_storage_bucket_object.source.name
      }
    }
  }

  service_config {
    max_instance_count    = 100
    min_instance_count    = 0
    available_memory      = "256M"
    timeout_seconds       = 60
    service_account_email = google_service_account.function.email

    environment_variables = {
      PROJECT_ID = var.project_id
    }

    secret_environment_variables {
      key        = "API_KEY"
      project_id = var.project_id
      secret     = google_secret_manager_secret.api_key.secret_id
      version    = "latest"
    }
  }
}

# Allow unauthenticated access (2nd gen functions are backed by a Cloud Run
# service, so invoker IAM is granted on that underlying service)
resource "google_cloud_run_service_iam_member" "invoker" {
  location = google_cloudfunctions2_function.api.location
  service  = google_cloudfunctions2_function.api.name
  role     = "roles/run.invoker"
  member   = "allUsers"
}

Cloud Run: Container Flexibility

Basic Service

# Dockerfile
FROM python:3.12-slim

WORKDIR /app

COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY . .

# Cloud Run injects the PORT env var (default 8080); the app must listen on it
ENV PORT=8080
EXPOSE 8080

CMD ["gunicorn", "--bind", ":8080", "--workers", "1", "--threads", "8", "app:app"]
# app.py
from flask import Flask, request, jsonify
import os

app = Flask(__name__)

@app.route('/')
def hello():
    return jsonify({'message': 'Hello from Cloud Run!'})

@app.route('/process', methods=['POST'])
def process():
    data = request.get_json()
    # Handle concurrent requests efficiently
    result = process_data(data)
    return jsonify(result)

if __name__ == '__main__':
    port = int(os.environ.get('PORT', 8080))
    app.run(host='0.0.0.0', port=port)
# Build and deploy
gcloud builds submit --tag gcr.io/$PROJECT_ID/my-service

gcloud run deploy my-service \
  --image gcr.io/$PROJECT_ID/my-service \
  --region us-central1 \
  --platform managed \
  --allow-unauthenticated \
  --memory 512Mi \
  --cpu 1 \
  --concurrency 80 \
  --min-instances 1 \
  --max-instances 100

Terraform for Cloud Run

resource "google_cloud_run_v2_service" "api" {
  name     = "api-service"
  location = "us-central1"

  template {
    containers {
      image = "gcr.io/${var.project_id}/api:${var.image_tag}"

      ports {
        container_port = 8080
      }

      resources {
        limits = {
          cpu    = "1"
          memory = "512Mi"
        }
      }

      env {
        name  = "PROJECT_ID"
        value = var.project_id
      }

      env {
        name = "DATABASE_URL"
        value_source {
          secret_key_ref {
            secret  = google_secret_manager_secret.db_url.secret_id
            version = "latest"
          }
        }
      }

      startup_probe {
        http_get {
          path = "/health"
          port = 8080
        }
        initial_delay_seconds = 5
        period_seconds        = 10
        failure_threshold     = 3
      }

      liveness_probe {
        http_get {
          path = "/health"
          port = 8080
        }
        period_seconds = 30
      }
    }

    scaling {
      min_instance_count = 1
      max_instance_count = 100
    }

    service_account = google_service_account.api.email
  }

  traffic {
    type    = "TRAFFIC_TARGET_ALLOCATION_TYPE_LATEST"
    percent = 100
  }
}

# Allow unauthenticated access
resource "google_cloud_run_v2_service_iam_member" "public" {
  location = google_cloud_run_v2_service.api.location
  name     = google_cloud_run_v2_service.api.name
  role     = "roles/run.invoker"
  member   = "allUsers"
}

# Custom domain
resource "google_cloud_run_domain_mapping" "api" {
  location = "us-central1"
  name     = "api.example.com"

  metadata {
    namespace = var.project_id
  }

  spec {
    route_name = google_cloud_run_v2_service.api.name
  }
}
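Revisions also make gradual rollouts easy (one of Cloud Run's selling points). A sketch of a 90/10 split in the same google_cloud_run_v2_service resource; the pinned revision name is hypothetical:

```hcl
traffic {
  type     = "TRAFFIC_TARGET_ALLOCATION_TYPE_REVISION"
  revision = "api-service-00042-abc"  # hypothetical pinned revision
  percent  = 90
}

traffic {
  type    = "TRAFFIC_TARGET_ALLOCATION_TYPE_LATEST"
  percent = 10
}
```

Shift the percentages over time (or via `gcloud run services update-traffic`) until the new revision takes 100%.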

Concurrency: The Key Difference

Cloud Functions (1 request per instance)

Instance 1: [Request A............]
Instance 2: [Request B............]
Instance 3: [Request C............]
            ↑ 3 instances for 3 concurrent requests

Cloud Run (multiple requests per instance)

Instance 1: [Request A]
            [Request B]
            [Request C]
            [Request D]
            ↑ 1 instance handles multiple concurrent requests
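The diagrams above imply a simple scaling model: instances ≈ ceil(concurrent requests / per-instance concurrency). A rough illustration (real autoscaling also weighs CPU utilization):

```python
import math

def instances_needed(concurrent_requests: int, concurrency: int) -> int:
    """Rough instance count for a given load at a given per-instance concurrency."""
    return math.ceil(concurrent_requests / concurrency)

print(instances_needed(3, 1))    # 1st-gen Cloud Functions: 3 requests -> 3 instances
print(instances_needed(240, 80)) # Cloud Run at concurrency 80: 240 requests -> 3 instances
```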
# Cloud Run - optimize for concurrency
# Requires async-capable Flask: pip install "flask[async]" aiohttp
from flask import Flask, jsonify
import asyncio
import aiohttp

app = Flask(__name__)

URLS = []  # upstream endpoints to fan out to

async def fetch_data(session, url):
    async with session.get(url) as resp:
        return await resp.json()

@app.route('/batch')
async def batch_process():
    """Fan out to upstream services concurrently within one request."""
    async with aiohttp.ClientSession() as session:
        tasks = [fetch_data(session, url) for url in URLS]
        results = await asyncio.gather(*tasks)
    return jsonify(results)

When to Choose Cloud Functions

Choose Cloud Functions when:

  • Simple event handlers (Pub/Sub, Cloud Storage, Firestore)
  • Quick HTTP endpoints with minimal dependencies
  • Team wants to focus on code, not containers
  • Single-purpose functions with infrequent invocation
  • Tight integration with GCP event sources
# Perfect for Cloud Functions: Simple event handler
@functions_framework.cloud_event
def on_file_upload(cloud_event):
    """Triggered when file uploaded to Cloud Storage."""
    file_name = cloud_event.data["name"]
    bucket = cloud_event.data["bucket"]
    
    # Process uploaded file
    process_file(bucket, file_name)

When to Choose Cloud Run

Choose Cloud Run when:

  • Complex applications with multiple routes
  • Need custom runtimes or system dependencies
  • High concurrency workloads (many simultaneous requests)
  • Websockets or streaming responses
  • Want portability (standard containers)
  • Gradual traffic migration (revisions)
# Perfect for Cloud Run: Multi-route API with concurrency
from flask import Flask
from concurrent.futures import ThreadPoolExecutor

app = Flask(__name__)
executor = ThreadPoolExecutor(max_workers=10)

@app.route('/api/users')
def list_users():
    return fetch_users()

@app.route('/api/orders')  
def list_orders():
    return fetch_orders()

@app.route('/api/reports/<report_id>')
def generate_report(report_id):
    # Long-running, but handles concurrent requests
    return generate_complex_report(report_id)

Cost Comparison

Scenario: 1M requests/month, 200ms average duration, 256MB memory (illustrative list prices; free tiers omitted)

Cloud Functions (2nd Gen):
- Invocations: 1M × $0.0000004 = $0.40
- Compute: 1M × 0.2s × 0.25GB × $0.0000025 = $0.125
- Total: ~$0.53/month

Cloud Run (80 concurrency):
- Requests: 1M × $0.0000004 = $0.40  
- Compute: 1M / 80 × 0.2s × 0.5vCPU × $0.0000240 = $0.03
- Total: ~$0.43/month

Winner: Cloud Run (higher concurrency = lower compute cost)
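The arithmetic behind those figures, as a runnable sanity check (rates are the illustrative ones quoted above and will drift over time):

```python
requests = 1_000_000
duration_s = 0.2

# Cloud Functions (2nd gen): per-invocation fee + GB-seconds of memory
cf_invocations = requests * 0.0000004                      # $0.40
cf_compute = requests * duration_s * 0.25 * 0.0000025      # 256MB = 0.25GB
cf_total = cf_invocations + cf_compute

# Cloud Run at concurrency 80: instance time is shared across requests
cr_requests = requests * 0.0000004                         # $0.40
cr_compute = requests / 80 * duration_s * 0.5 * 0.0000240  # 0.5 vCPU
cr_total = cr_requests + cr_compute

print(f"Cloud Functions: ~${cf_total:.2f}/month")  # ~$0.53
print(f"Cloud Run:       ~${cr_total:.2f}/month")  # ~$0.43
```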

Migration Path

# Cloud Functions to Cloud Run migration
# 1. Wrap your function in a Flask/FastAPI app
# 2. Add Dockerfile
# 3. Deploy to Cloud Run

# Before (Cloud Function)
import functions_framework

@functions_framework.http
def my_function(request):
    return process(request)

# After (Cloud Run)
import os
from flask import Flask, request
app = Flask(__name__)

@app.route('/', methods=['GET', 'POST'])
def my_function():
    return process(request)

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=int(os.environ.get('PORT', 8080)))

Key Takeaways

  1. Cloud Functions 2nd Gen is built on Cloud Run — they’re converging
  2. Concurrency matters — Cloud Run handles more requests per instance
  3. Cloud Functions excels at event-driven, single-purpose handlers
  4. Cloud Run excels at APIs, complex apps, and high-concurrency workloads
  5. Both scale to zero — no cost when idle
  6. Use Cloud Functions for GCP event triggers (Pub/Sub, Storage, Firestore)
  7. Use Cloud Run when you need containers or complex dependencies

“Start with Cloud Functions for simplicity. Move to Cloud Run when you outgrow it — the migration is straightforward.”