# GitLab CI: Pipeline Optimization and Caching Strategies
Master GitLab CI/CD with advanced pipeline optimization techniques, intelligent caching, and parallel execution strategies that cut build times in half.
GitLab CI is powerful out of the box, but most teams leave significant performance on the table. This guide covers the optimization techniques that separate a 45-minute pipeline from a 10-minute one.
## The Problem with Slow Pipelines
Slow CI/CD directly impacts developer productivity. A 30-minute pipeline means:
- Developers context-switch while waiting
- Bugs take longer to catch
- Deployment frequency drops
- Engineers start skipping tests
The fix isn’t throwing more runners at the problem—it’s smarter pipeline design.
## Pipeline Structure Fundamentals

### Stages and Jobs
```yaml
# .gitlab-ci.yml
stages:
  - build
  - test
  - security
  - deploy

variables:
  DOCKER_DRIVER: overlay2
  DOCKER_TLS_CERTDIR: "/certs"

default:
  image: node:20-alpine
  tags:
    - docker

# Jobs within the same stage run in parallel
build:
  stage: build
  script:
    - npm ci --cache .npm
    - npm run build
  artifacts:
    paths:
      - dist/
    expire_in: 1 hour

unit-tests:
  stage: test
  script:
    - npm run test:unit

integration-tests:
  stage: test
  script:
    - npm run test:integration
```
### DAG Pipelines for True Parallelism
By default, GitLab waits for all jobs in a stage to complete before moving on. DAG (Directed Acyclic Graph) pipelines break this limitation:
```yaml
stages:
  - build
  - test
  - deploy

build-frontend:
  stage: build
  script:
    - npm ci && npm run build
  artifacts:
    paths:
      - frontend/dist/

build-backend:
  stage: build
  script:
    - go build -o app ./cmd/server

test-frontend:
  stage: test
  needs: [build-frontend]  # Starts as soon as build-frontend finishes
  script:
    - npm run test

test-backend:
  stage: test
  needs: [build-backend]  # Doesn't wait for build-frontend
  script:
    - go test ./...

deploy:
  stage: deploy
  needs: [test-frontend, test-backend]
  script:
    - ./deploy.sh
```
With `needs`, `test-backend` starts immediately after `build-backend` finishes, even if `build-frontend` is still running.
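A `needs` edge also downloads the upstream job's artifacts by default. When a downstream job only needs the ordering guarantee, not the files, the expanded `needs` syntax can skip that download. A sketch, reusing the jobs from the example above:

```yaml
test-backend:
  stage: test
  needs:
    - job: build-backend
      artifacts: false  # run after build-backend, but skip its artifact download
  script:
    - go test ./...
```

On jobs with large upstream artifacts, skipping the download shaves setup time off every DAG edge that doesn't actually consume the files.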
## Caching Strategies
Caching is where most pipelines gain the biggest wins. The key is understanding what to cache and when to invalidate.
### Dependency Caching
```yaml
variables:
  NPM_CACHE_DIR: "$CI_PROJECT_DIR/.npm"
  PIP_CACHE_DIR: "$CI_PROJECT_DIR/.pip"

# Node.js project
build:
  stage: build
  cache:
    key:
      files:
        - package-lock.json
    paths:
      - .npm/
      - node_modules/
    policy: pull-push
  script:
    - npm ci --cache .npm
    - npm run build

# Python project
test-python:
  stage: test
  image: python:3.11
  cache:
    key:
      files:
        - requirements.txt
    paths:
      - .pip/
      - venv/
  script:
    - python -m venv venv
    - source venv/bin/activate
    - pip install -r requirements.txt
    - pytest
```
### Cache Key Strategies
```yaml
# Per-branch cache (isolated but more cache misses)
cache:
  key: "$CI_COMMIT_REF_SLUG"
  paths:
    - node_modules/

# Lock file hash (invalidates only when dependencies change)
cache:
  key:
    files:
      - package-lock.json
  paths:
    - node_modules/

# Combined: branch + lock file (balance isolation and hits)
cache:
  key:
    prefix: "$CI_COMMIT_REF_SLUG"
    files:
      - package-lock.json
  paths:
    - node_modules/
```
### Cache Policies
```yaml
# Job that only reads the cache (faster startup)
lint:
  cache:
    key:
      files:
        - package-lock.json
    paths:
      - node_modules/
    policy: pull  # Never uploads, only downloads

# Job that updates the cache
install-deps:
  cache:
    key:
      files:
        - package-lock.json
    paths:
      - node_modules/
    policy: pull-push  # Downloads and uploads

# Job that creates a fresh cache (weekly refresh)
refresh-cache:
  rules:
    - if: $CI_PIPELINE_SOURCE == "schedule"
  cache:
    key:
      files:
        - package-lock.json
    paths:
      - node_modules/
    policy: push  # Only uploads, ignores existing
  script:
    - rm -rf node_modules
    - npm ci
```
## Parallel Test Execution

### Using the `parallel` Keyword
```yaml
test:
  stage: test
  parallel: 4
  script:
    - npm run test -- --shard=$CI_NODE_INDEX/$CI_NODE_TOTAL

# For Jest
test:
  parallel: 4
  script:
    - npx jest --shard=$CI_NODE_INDEX/$CI_NODE_TOTAL

# For pytest
test:
  image: python:3.11
  parallel: 4
  script:
    - pip install pytest-split
    - pytest --splits $CI_NODE_TOTAL --group $CI_NODE_INDEX
```
### Matrix Jobs
```yaml
test:
  stage: test
  parallel:
    matrix:
      - NODE_VERSION: ["18", "20", "22"]
        DATABASE: ["postgres", "mysql"]
  image: node:${NODE_VERSION}
  services:
    - name: ${DATABASE}:latest
      alias: db
  script:
    - npm test
```
## Conditional Job Execution
Don’t run what you don’t need:
```yaml
# Only run on specific file changes
test-frontend:
  rules:
    - changes:
        - frontend/**/*
        - package.json
  script:
    - npm run test:frontend

test-backend:
  rules:
    - changes:
        - backend/**/*
        - go.mod
  script:
    - go test ./...

# Skip CI for docs-only changes
workflow:
  rules:
    - if: $CI_COMMIT_MESSAGE =~ /\[skip ci\]/
      when: never
    - changes:
        - "**/*.md"
        - "docs/**/*"
      when: never
    - when: always

# Environment-specific jobs
deploy-production:
  stage: deploy
  rules:
    - if: $CI_COMMIT_BRANCH == "main"
  environment:
    name: production
  script:
    - ./deploy.sh production
```
## Artifacts vs. Cache
Understanding when to use each:
| Feature | Cache | Artifacts |
|---|---|---|
| Purpose | Speed up jobs | Pass data between jobs |
| Persistence | Best-effort | Guaranteed |
| Scope | Same job across pipelines | Different jobs in same pipeline |
| Use case | Dependencies (`node_modules/`) | Build outputs (`dist/`) |
```yaml
build:
  stage: build
  cache:
    key:
      files:
        - package-lock.json
    paths:
      - node_modules/  # Cache: reused in future pipelines
  artifacts:
    paths:
      - dist/  # Artifact: needed by the deploy job
    expire_in: 1 hour
  script:
    - npm ci
    - npm run build

deploy:
  stage: deploy
  dependencies:
    - build  # Only download artifacts from the build job
  script:
    - aws s3 sync dist/ s3://my-bucket/
```
## Docker Image Optimization

### Use Kaniko for Rootless Builds
```yaml
build-image:
  stage: build
  image:
    name: gcr.io/kaniko-project/executor:v1.19.2-debug
    entrypoint: [""]
  variables:
    DOCKER_CONFIG: /kaniko/.docker
  script:
    - |
      /kaniko/executor \
        --context "${CI_PROJECT_DIR}" \
        --dockerfile "${CI_PROJECT_DIR}/Dockerfile" \
        --destination "${CI_REGISTRY_IMAGE}:${CI_COMMIT_SHA}" \
        --destination "${CI_REGISTRY_IMAGE}:latest" \
        --cache=true \
        --cache-repo="${CI_REGISTRY_IMAGE}/cache"
```
### Layer Caching with BuildKit
```yaml
build-image:
  stage: build
  image: docker:24
  services:
    - docker:24-dind
  variables:
    DOCKER_BUILDKIT: "1"
  before_script:
    - docker login -u $CI_REGISTRY_USER -p $CI_REGISTRY_PASSWORD $CI_REGISTRY
  script:
    - |
      docker build \
        --cache-from ${CI_REGISTRY_IMAGE}:latest \
        --build-arg BUILDKIT_INLINE_CACHE=1 \
        -t ${CI_REGISTRY_IMAGE}:${CI_COMMIT_SHA} \
        -t ${CI_REGISTRY_IMAGE}:latest \
        .
    - docker push ${CI_REGISTRY_IMAGE}:${CI_COMMIT_SHA}
    - docker push ${CI_REGISTRY_IMAGE}:latest
```
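Inline caching only pays off if the Dockerfile is ordered so that rarely-changing layers come first. A minimal sketch, assuming a Node.js app (the entrypoint `dist/server.js` is illustrative):

```dockerfile
FROM node:20-alpine
WORKDIR /app

# Copy only the dependency manifests first, so this layer and the
# npm ci layer below are reused until the lock file changes
COPY package.json package-lock.json ./
RUN npm ci

# Source changes invalidate only the layers from here down
COPY . .
RUN npm run build

CMD ["node", "dist/server.js"]
```

With this ordering, a typical source-only commit hits the cached dependency layer and skips `npm ci` entirely, whether the cache comes from `--cache-from` or Kaniko's `--cache-repo`.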
## Include and Extends for DRY Pipelines

### Reusable Templates
```yaml
# templates/node.yml
.node-base:
  image: node:20-alpine
  cache:
    key:
      files:
        - package-lock.json
    paths:
      - node_modules/
  before_script:
    - npm ci --cache .npm

.deploy-base:
  image: amazon/aws-cli:2.15.0
  before_script:
    - aws configure set region $AWS_REGION
```

```yaml
# .gitlab-ci.yml
include:
  - local: templates/node.yml
  - project: 'devops/ci-templates'
    ref: main
    file: '/security/sast.yml'

build:
  extends: .node-base
  stage: build
  script:
    - npm run build

deploy:
  extends: .deploy-base
  stage: deploy
  script:
    - aws s3 sync dist/ s3://$BUCKET_NAME/
```
## Runner Configuration Tips

### Choosing the Right Executor
```toml
# config.toml for GitLab Runner

# Docker executor (isolated, consistent)
[[runners]]
  executor = "docker"
  [runners.docker]
    image = "alpine:latest"
    privileged = false
    volumes = ["/cache"]

# Docker Machine (auto-scaling)
[[runners]]
  executor = "docker+machine"
  [runners.machine]
    IdleCount = 2
    MaxBuilds = 100
    MachineDriver = "amazonec2"
```
### Tag-Based Job Routing
```yaml
# Route to specific runners
heavy-build:
  tags:
    - high-memory
    - ssd
  script:
    - npm run build:production

quick-lint:
  tags:
    - shared
  script:
    - npm run lint
```
## Monitoring Pipeline Performance

### Built-in Analytics
GitLab provides pipeline analytics at CI/CD > Analytics. Track:
- Pipeline duration trends
- Job failure rates
- Most time-consuming jobs
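The built-in charts can be supplemented with the GitLab REST API: `GET /projects/:id/pipelines` lists pipelines, and the per-pipeline detail endpoint includes a `duration` field in seconds. A hedged sketch of trend analysis over such records — the sample data below is illustrative, not real API output:

```python
from statistics import mean

def pipeline_duration_trend(pipelines, window=5):
    """Moving average of pipeline durations (seconds), oldest first.

    Skips records without a duration (e.g. skipped or running pipelines).
    """
    durations = [p["duration"] for p in pipelines if p.get("duration") is not None]
    return [
        round(mean(durations[max(0, i - window + 1): i + 1]), 1)
        for i in range(len(durations))
    ]

# Records shaped like the GitLab single-pipeline API response (illustrative)
sample = [
    {"id": 101, "duration": 600},
    {"id": 102, "duration": 660},
    {"id": 103, "duration": None},  # no duration recorded
    {"id": 104, "duration": 540},
]

print(pipeline_duration_trend(sample, window=2))
```

Run weekly against your project's pipelines, a rising trend line is usually the earliest signal that a cache key has gone stale or a test suite needs re-sharding.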
### Custom Metrics
Push job metadata to your own metrics endpoint from `after_script`. Note that GitLab exposes no `CI_JOB_DURATION` variable; `CI_JOB_STARTED_AT` and `CI_JOB_STATUS` are predefined, so send the start timestamp and compute durations on the receiving side:

```yaml
.metrics:
  after_script:
    - |
      curl -X POST "https://metrics.example.com/pipeline" \
        -H "Content-Type: application/json" \
        -d '{
          "job": "'$CI_JOB_NAME'",
          "started_at": "'$CI_JOB_STARTED_AT'",
          "status": "'$CI_JOB_STATUS'",
          "pipeline": "'$CI_PIPELINE_ID'"
        }' || true

test:
  extends: .metrics
  script:
    - npm test
```
## Complete Optimized Pipeline Example
```yaml
stages:
  - install
  - build
  - test
  - security
  - deploy

variables:
  NPM_CACHE_DIR: "$CI_PROJECT_DIR/.npm"

default:
  image: node:20-alpine
  tags:
    - docker

workflow:
  rules:
    - if: $CI_COMMIT_MESSAGE =~ /\[skip ci\]/
      when: never
    - when: always

# Separate install job that others depend on
install:
  stage: install
  cache:
    key:
      files:
        - package-lock.json
    paths:
      - node_modules/
      - .npm/
    policy: pull-push
  script:
    - npm ci --cache .npm
  artifacts:
    paths:
      - node_modules/
    expire_in: 1 hour

build:
  stage: build
  needs: [install]
  script:
    - npm run build
  artifacts:
    paths:
      - dist/
    expire_in: 1 day

lint:
  stage: test
  needs: [install]
  script:
    - npm run lint

unit-tests:
  stage: test
  needs: [install]
  parallel: 4
  script:
    - npm run test:unit -- --shard=$CI_NODE_INDEX/$CI_NODE_TOTAL
  coverage: '/Lines\s*:\s*(\d+\.?\d*)%/'

e2e-tests:
  stage: test
  needs: [build]
  image: cypress/included:13.6.0
  script:
    - npm run test:e2e
  artifacts:
    when: on_failure
    paths:
      - cypress/screenshots/
      - cypress/videos/

security-scan:
  stage: security
  needs: [install]
  allow_failure: true
  script:
    - npm audit --audit-level=high

deploy-staging:
  stage: deploy
  needs: [build, unit-tests, lint]
  environment:
    name: staging
  rules:
    - if: $CI_COMMIT_BRANCH == "develop"
  script:
    - ./deploy.sh staging

deploy-production:
  stage: deploy
  needs: [build, unit-tests, e2e-tests, lint]
  environment:
    name: production
  rules:
    - if: $CI_COMMIT_BRANCH == "main"
  script:
    - ./deploy.sh production
```
## When NOT to Optimize
Not every pipeline needs aggressive optimization:
- Small projects: Complex caching setup overhead > time saved
- Infrequent changes: Cache expires before reuse
- Compliance requirements: Some industries require fresh builds
- Debugging: Over-optimized pipelines are harder to troubleshoot
## Key Takeaways
- Use DAG pipelines with `needs` for true parallelism
- Cache dependencies, not build outputs
- Use `policy: pull` on jobs that don't need to update the cache
- Parallelize tests with the `parallel` keyword
- Skip jobs on irrelevant changes with `rules` and `changes`
- Measure before optimizing: use GitLab's pipeline analytics
The goal isn’t the fastest possible pipeline—it’s a pipeline fast enough that developers never context-switch waiting for it.