# GitLab CI: Pipeline Optimization and Caching Strategies
Master GitLab CI/CD with advanced pipeline optimization techniques, intelligent caching, and parallel execution strategies that cut build times in half.
GitLab CI is powerful out of the box, but most teams leave significant performance on the table. This guide covers the optimization techniques that separate a 45-minute pipeline from a 10-minute one.
## The Problem with Slow Pipelines
Slow CI/CD directly impacts developer productivity. A 30-minute pipeline means:
- Developers context-switch while waiting
- Bugs take longer to catch
- Deployment frequency drops
- Engineers start skipping tests
The fix isn’t throwing more runners at the problem—it’s smarter pipeline design.
## Pipeline Structure Fundamentals

### Stages and Jobs
```yaml
# .gitlab-ci.yml
stages:
  - build
  - test
  - security
  - deploy

variables:
  DOCKER_DRIVER: overlay2
  DOCKER_TLS_CERTDIR: "/certs"

default:
  image: node:20-alpine
  tags:
    - docker

# Jobs within the same stage run in parallel
build:
  stage: build
  script:
    - npm ci --cache .npm
    - npm run build
  artifacts:
    paths:
      - dist/
    expire_in: 1 hour

unit-tests:
  stage: test
  script:
    - npm run test:unit

integration-tests:
  stage: test
  script:
    - npm run test:integration
```
### DAG Pipelines for True Parallelism
By default, GitLab waits for all jobs in a stage to complete before moving on. DAG (Directed Acyclic Graph) pipelines break this limitation:
```yaml
stages:
  - build
  - test
  - deploy

build-frontend:
  stage: build
  script:
    - npm ci && npm run build
  artifacts:
    paths:
      - frontend/dist/

build-backend:
  stage: build
  script:
    - go build -o app ./cmd/server

test-frontend:
  stage: test
  needs: [build-frontend]  # Starts as soon as build-frontend finishes
  script:
    - npm run test

test-backend:
  stage: test
  needs: [build-backend]  # Doesn't wait for build-frontend
  script:
    - go test ./...

deploy:
  stage: deploy
  needs: [test-frontend, test-backend]
  script:
    - ./deploy.sh
```
With `needs`, `test-backend` starts immediately after `build-backend` finishes, even if `build-frontend` is still running.
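A `needs` edge also downloads the upstream job's artifacts by default. When a downstream job only needs the ordering guarantee, not the files, the expanded `needs` syntax can skip that download. A sketch, reusing the jobs from the example above:

```yaml
test-backend:
  stage: test
  needs:
    - job: build-backend
      artifacts: false  # run after build-backend, but skip its artifact download
  script:
    - go test ./...
```

On jobs with large upstream artifacts, skipping the download shaves setup time off every DAG edge that doesn't actually consume the files.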
## Caching Strategies
Caching is where most pipelines gain the biggest wins. The key is understanding what to cache and when to invalidate.
### Dependency Caching
```yaml
variables:
  NPM_CACHE_DIR: "$CI_PROJECT_DIR/.npm"
  PIP_CACHE_DIR: "$CI_PROJECT_DIR/.pip"

# Node.js project
build:
  stage: build
  cache:
    key:
      files:
        - package-lock.json
    paths:
      - .npm/
      - node_modules/
    policy: pull-push
  script:
    - npm ci --cache .npm
    - npm run build

# Python project
test-python:
  stage: test
  image: python:3.11
  cache:
    key:
      files:
        - requirements.txt
    paths:
      - .pip/
      - venv/
  script:
    - python -m venv venv
    - source venv/bin/activate
    - pip install -r requirements.txt
    - pytest
```
### Cache Key Strategies
```yaml
# Per-branch cache (isolated but more cache misses)
cache:
  key: "$CI_COMMIT_REF_SLUG"
  paths:
    - node_modules/

# Lock file hash (invalidates only when dependencies change)
cache:
  key:
    files:
      - package-lock.json
  paths:
    - node_modules/

# Combined: branch + lock file (balance isolation and hits)
cache:
  key:
    prefix: "$CI_COMMIT_REF_SLUG"
    files:
      - package-lock.json
  paths:
    - node_modules/
```
### Cache Policies
```yaml
# Job that only reads the cache (faster startup)
lint:
  cache:
    key:
      files:
        - package-lock.json
    paths:
      - node_modules/
    policy: pull  # Never uploads, only downloads

# Job that updates the cache
install-deps:
  cache:
    key:
      files:
        - package-lock.json
    paths:
      - node_modules/
    policy: pull-push  # Downloads and uploads

# Job that creates a fresh cache (weekly refresh)
refresh-cache:
  rules:
    - if: $CI_PIPELINE_SOURCE == "schedule"
  cache:
    key:
      files:
        - package-lock.json
    paths:
      - node_modules/
    policy: push  # Only uploads, ignores existing
  script:
    - rm -rf node_modules
    - npm ci
```
## Parallel Test Execution

### Using the `parallel` Keyword
```yaml
test:
  stage: test
  parallel: 4
  script:
    - npm run test -- --shard=$CI_NODE_INDEX/$CI_NODE_TOTAL

# For Jest
test:
  parallel: 4
  script:
    - npx jest --shard=$CI_NODE_INDEX/$CI_NODE_TOTAL

# For pytest
test:
  image: python:3.11
  parallel: 4
  script:
    - pip install pytest-split
    - pytest --splits $CI_NODE_TOTAL --group $CI_NODE_INDEX
```
### Matrix Jobs
```yaml
test:
  stage: test
  parallel:
    matrix:
      - NODE_VERSION: ["18", "20", "22"]
        DATABASE: ["postgres", "mysql"]
  image: node:${NODE_VERSION}
  services:
    - name: ${DATABASE}:latest
      alias: db
  script:
    - npm test
```
## Conditional Job Execution
Don’t run what you don’t need:
```yaml
# Only run on specific file changes
test-frontend:
  rules:
    - changes:
        - frontend/**/*
        - package.json
  script:
    - npm run test:frontend

test-backend:
  rules:
    - changes:
        - backend/**/*
        - go.mod
  script:
    - go test ./...

# Skip CI for docs-only changes
workflow:
  rules:
    - if: $CI_COMMIT_MESSAGE =~ /\[skip ci\]/
      when: never
    - changes:
        - "**/*.md"
        - "docs/**/*"
      when: never
    - when: always

# Environment-specific jobs
deploy-production:
  stage: deploy
  rules:
    - if: $CI_COMMIT_BRANCH == "main"
  environment:
    name: production
  script:
    - ./deploy.sh production
```
## Artifacts vs. Cache
Understanding when to use each:
| Feature | Cache | Artifacts |
|---|---|---|
| Purpose | Speed up jobs | Pass data between jobs |
| Persistence | Best-effort | Guaranteed |
| Scope | Same job across pipelines | Different jobs in same pipeline |
| Use case | Dependencies (`node_modules/`) | Build outputs (`dist/`) |
```yaml
build:
  stage: build
  cache:
    key:
      files:
        - package-lock.json
    paths:
      - node_modules/  # Cache: reused in future pipelines
  artifacts:
    paths:
      - dist/  # Artifact: needed by the deploy job
    expire_in: 1 hour
  script:
    - npm ci
    - npm run build

deploy:
  stage: deploy
  dependencies:
    - build  # Only download artifacts from the build job
  script:
    - aws s3 sync dist/ s3://my-bucket/
```
## Docker Image Optimization

### Use Kaniko for Rootless Builds
```yaml
build-image:
  stage: build
  image:
    name: gcr.io/kaniko-project/executor:v1.19.2-debug
    entrypoint: [""]
  variables:
    DOCKER_CONFIG: /kaniko/.docker
  script:
    - |
      /kaniko/executor \
        --context "${CI_PROJECT_DIR}" \
        --dockerfile "${CI_PROJECT_DIR}/Dockerfile" \
        --destination "${CI_REGISTRY_IMAGE}:${CI_COMMIT_SHA}" \
        --destination "${CI_REGISTRY_IMAGE}:latest" \
        --cache=true \
        --cache-repo="${CI_REGISTRY_IMAGE}/cache"
```
### Layer Caching with BuildKit
```yaml
build-image:
  stage: build
  image: docker:24
  services:
    - docker:24-dind
  variables:
    DOCKER_BUILDKIT: "1"
  before_script:
    - docker login -u $CI_REGISTRY_USER -p $CI_REGISTRY_PASSWORD $CI_REGISTRY
  script:
    - |
      docker build \
        --cache-from ${CI_REGISTRY_IMAGE}:latest \
        --build-arg BUILDKIT_INLINE_CACHE=1 \
        -t ${CI_REGISTRY_IMAGE}:${CI_COMMIT_SHA} \
        -t ${CI_REGISTRY_IMAGE}:latest \
        .
    - docker push ${CI_REGISTRY_IMAGE}:${CI_COMMIT_SHA}
    - docker push ${CI_REGISTRY_IMAGE}:latest
```
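Inline caching only pays off if the Dockerfile is ordered so that rarely-changing layers come first. A minimal sketch, assuming a Node.js app (the entrypoint `dist/server.js` is illustrative):

```dockerfile
FROM node:20-alpine
WORKDIR /app

# Copy only the dependency manifests first, so this layer and the
# npm ci layer below are reused until the lock file changes
COPY package.json package-lock.json ./
RUN npm ci

# Source changes invalidate only the layers from here down
COPY . .
RUN npm run build

CMD ["node", "dist/server.js"]
```

With this ordering, a typical source-only commit hits the cached dependency layer and skips `npm ci` entirely, whether the cache comes from `--cache-from` or Kaniko's `--cache-repo`.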
## Include and Extends for DRY Pipelines

### Reusable Templates
```yaml
# templates/node.yml
.node-base:
  image: node:20-alpine
  cache:
    key:
      files:
        - package-lock.json
    paths:
      - node_modules/
  before_script:
    - npm ci --cache .npm

.deploy-base:
  image: amazon/aws-cli:2.15.0
  before_script:
    - aws configure set region $AWS_REGION
```

```yaml
# .gitlab-ci.yml
include:
  - local: templates/node.yml
  - project: 'devops/ci-templates'
    ref: main
    file: '/security/sast.yml'

build:
  extends: .node-base
  stage: build
  script:
    - npm run build

deploy:
  extends: .deploy-base
  stage: deploy
  script:
    - aws s3 sync dist/ s3://$BUCKET_NAME/
```
## Runner Configuration Tips

### Choosing the Right Executor
```toml
# config.toml for GitLab Runner

# Docker executor (isolated, consistent)
[[runners]]
  executor = "docker"
  [runners.docker]
    image = "alpine:latest"
    privileged = false
    volumes = ["/cache"]

# Docker Machine (auto-scaling)
[[runners]]
  executor = "docker+machine"
  [runners.machine]
    IdleCount = 2
    MaxBuilds = 100
    MachineDriver = "amazonec2"
```
### Tag-Based Job Routing
```yaml
# Route to specific runners
heavy-build:
  tags:
    - high-memory
    - ssd
  script:
    - npm run build:production

quick-lint:
  tags:
    - shared
  script:
    - npm run lint
```
## Monitoring Pipeline Performance

### Built-in Analytics
GitLab provides pipeline analytics at CI/CD > Analytics. Track:
- Pipeline duration trends
- Job failure rates
- Most time-consuming jobs
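The built-in charts can be supplemented with the GitLab REST API: `GET /projects/:id/pipelines` lists pipelines, and the per-pipeline detail endpoint includes a `duration` field in seconds. A hedged sketch of trend analysis over such records — the sample data below is illustrative, not real API output:

```python
from statistics import mean

def pipeline_duration_trend(pipelines, window=5):
    """Moving average of pipeline durations (seconds), oldest first.

    Skips records without a duration (e.g. skipped or running pipelines).
    """
    durations = [p["duration"] for p in pipelines if p.get("duration") is not None]
    return [
        round(mean(durations[max(0, i - window + 1): i + 1]), 1)
        for i in range(len(durations))
    ]

# Records shaped like the GitLab single-pipeline API response (illustrative)
sample = [
    {"id": 101, "duration": 600},
    {"id": 102, "duration": 660},
    {"id": 103, "duration": None},  # no duration recorded
    {"id": 104, "duration": 540},
]

print(pipeline_duration_trend(sample, window=2))
```

Run weekly against your project's pipelines, a rising trend line is usually the earliest signal that a cache key has gone stale or a test suite needs re-sharding.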
### Custom Metrics
Push job metadata to your own metrics endpoint from `after_script`. Note that GitLab exposes no `CI_JOB_DURATION` variable; `CI_JOB_STARTED_AT` and `CI_JOB_STATUS` are predefined, so send the start timestamp and compute durations on the receiving side:

```yaml
.metrics:
  after_script:
    - |
      curl -X POST "https://metrics.example.com/pipeline" \
        -H "Content-Type: application/json" \
        -d '{
          "job": "'$CI_JOB_NAME'",
          "started_at": "'$CI_JOB_STARTED_AT'",
          "status": "'$CI_JOB_STATUS'",
          "pipeline": "'$CI_PIPELINE_ID'"
        }' || true

test:
  extends: .metrics
  script:
    - npm test
```
## Complete Optimized Pipeline Example
```yaml
stages:
  - install
  - build
  - test
  - security
  - deploy

variables:
  NPM_CACHE_DIR: "$CI_PROJECT_DIR/.npm"

default:
  image: node:20-alpine
  tags:
    - docker

workflow:
  rules:
    - if: $CI_COMMIT_MESSAGE =~ /\[skip ci\]/
      when: never
    - when: always

# Separate install job that others depend on
install:
  stage: install
  cache:
    key:
      files:
        - package-lock.json
    paths:
      - node_modules/
      - .npm/
    policy: pull-push
  script:
    - npm ci --cache .npm
  artifacts:
    paths:
      - node_modules/
    expire_in: 1 hour

build:
  stage: build
  needs: [install]
  script:
    - npm run build
  artifacts:
    paths:
      - dist/
    expire_in: 1 day

lint:
  stage: test
  needs: [install]
  script:
    - npm run lint

unit-tests:
  stage: test
  needs: [install]
  parallel: 4
  script:
    - npm run test:unit -- --shard=$CI_NODE_INDEX/$CI_NODE_TOTAL
  coverage: '/Lines\s*:\s*(\d+\.?\d*)%/'

e2e-tests:
  stage: test
  needs: [build]
  image: cypress/included:13.6.0
  script:
    - npm run test:e2e
  artifacts:
    when: on_failure
    paths:
      - cypress/screenshots/
      - cypress/videos/

security-scan:
  stage: security
  needs: [install]
  allow_failure: true
  script:
    - npm audit --audit-level=high

deploy-staging:
  stage: deploy
  needs: [build, unit-tests, lint]
  environment:
    name: staging
  rules:
    - if: $CI_COMMIT_BRANCH == "develop"
  script:
    - ./deploy.sh staging

deploy-production:
  stage: deploy
  needs: [build, unit-tests, e2e-tests, lint]
  environment:
    name: production
  rules:
    - if: $CI_COMMIT_BRANCH == "main"
  script:
    - ./deploy.sh production
```
## When NOT to Optimize
Not every pipeline needs aggressive optimization:
- Small projects: Complex caching setup overhead > time saved
- Infrequent changes: Cache expires before reuse
- Compliance requirements: Some industries require fresh builds
- Debugging: Over-optimized pipelines are harder to troubleshoot
## Key Takeaways
- Use DAG pipelines with `needs` for true parallelism
- Cache dependencies, not build outputs
- Use `policy: pull` on jobs that don't need to update the cache
- Parallelize tests with the `parallel` keyword
- Skip jobs on irrelevant changes with `rules` and `changes`
- Measure before optimizing: use GitLab's pipeline analytics
The goal isn’t the fastest possible pipeline—it’s a pipeline fast enough that developers never context-switch waiting for it.