GitLab CI is powerful out of the box, but most teams leave significant performance on the table. This guide covers the optimization techniques that separate a 45-minute pipeline from a 10-minute one.

The Problem with Slow Pipelines

Slow CI/CD directly impacts developer productivity. A 30-minute pipeline means:

  • Developers context-switch while waiting
  • Bugs take longer to catch
  • Deployment frequency drops
  • Engineers start skipping tests

The fix isn’t throwing more runners at the problem—it’s smarter pipeline design.

Pipeline Structure Fundamentals

Stages and Jobs

# .gitlab-ci.yml
stages:
  - build
  - test
  - security
  - deploy

variables:
  DOCKER_DRIVER: overlay2
  DOCKER_TLS_CERTDIR: "/certs"

default:
  image: node:20-alpine
  tags:
    - docker

# Jobs within the same stage run in parallel
build:
  stage: build
  script:
    - npm ci --cache .npm
    - npm run build
  artifacts:
    paths:
      - dist/
    expire_in: 1 hour

unit-tests:
  stage: test
  script:
    - npm run test:unit

integration-tests:
  stage: test
  script:
    - npm run test:integration

DAG Pipelines for True Parallelism

By default, GitLab waits for all jobs in a stage to complete before moving on. DAG (Directed Acyclic Graph) pipelines break this limitation:

stages:
  - build
  - test
  - deploy

build-frontend:
  stage: build
  script:
    - npm ci && npm run build
  artifacts:
    paths:
      - frontend/dist/

build-backend:
  stage: build
  script:
    - go build -o app ./cmd/server

test-frontend:
  stage: test
  needs: [build-frontend]  # Starts as soon as build-frontend finishes
  script:
    - npm run test

test-backend:
  stage: test
  needs: [build-backend]  # Doesn't wait for build-frontend
  script:
    - go test ./...

deploy:
  stage: deploy
  needs: [test-frontend, test-backend]
  script:
    - ./deploy.sh

With needs, test-backend starts immediately after build-backend, even if build-frontend is still running.
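The same mechanism works in the other direction: an explicit empty needs list detaches a job from stage ordering entirely, so it starts the moment the pipeline is created. A minimal sketch (the lint job here is hypothetical):

```yaml
# needs: [] means "no dependencies at all" — the job runs immediately,
# in parallel with the build stage, instead of waiting for it.
lint:
  stage: test
  needs: []
  script:
    - npm run lint
```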

Caching Strategies

Caching is where most pipelines gain the biggest wins. The key is understanding what to cache and when to invalidate.

Dependency Caching

variables:
  NPM_CACHE_DIR: "$CI_PROJECT_DIR/.npm"
  PIP_CACHE_DIR: "$CI_PROJECT_DIR/.pip"

# Node.js project
build:
  stage: build
  cache:
    key:
      files:
        - package-lock.json
    paths:
      - .npm/
      - node_modules/
    policy: pull-push
  script:
    - npm ci --cache "$NPM_CACHE_DIR"
    - npm run build

# Python project
test-python:
  stage: test
  image: python:3.11
  cache:
    key:
      files:
        - requirements.txt
    paths:
      - .pip/
      - venv/
  script:
    - python -m venv venv
    - source venv/bin/activate
    - pip install -r requirements.txt
    - pytest

Cache Key Strategies

# Per-branch cache (isolated but more cache misses)
cache:
  key: "$CI_COMMIT_REF_SLUG"
  paths:
    - node_modules/

# Lock file hash (invalidates only when dependencies change)
cache:
  key:
    files:
      - package-lock.json
  paths:
    - node_modules/

# Combined: branch + lock file (balance isolation and hits)
cache:
  key:
    prefix: "$CI_COMMIT_REF_SLUG"
    files:
      - package-lock.json
  paths:
    - node_modules/
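On GitLab 16.0 and later, cache:fallback_keys can soften the cache-miss cost of per-branch keys by falling back to another cache when the primary key has nothing yet. A sketch, assuming variable expansion in keys (which GitLab supports):

```yaml
cache:
  key: "$CI_COMMIT_REF_SLUG"
  fallback_keys:
    # Tried in order when the branch-specific key has no cache yet
    - "$CI_DEFAULT_BRANCH"
  paths:
    - node_modules/
```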

Cache Policies

# Job that only reads cache (skips the upload step at job end)
lint:
  cache:
    key:
      files:
        - package-lock.json
    paths:
      - node_modules/
    policy: pull  # Never uploads, only downloads

# Job that updates cache
install-deps:
  cache:
    key:
      files:
        - package-lock.json
    paths:
      - node_modules/
    policy: pull-push  # Downloads and uploads

# Job that creates fresh cache (weekly refresh)
refresh-cache:
  rules:
    - if: $CI_PIPELINE_SOURCE == "schedule"
  cache:
    key:
      files:
        - package-lock.json
    paths:
      - node_modules/
    policy: push  # Only uploads, ignores existing
  script:
    - rm -rf node_modules
    - npm ci

Parallel Test Execution

Using parallel Keyword

test:
  stage: test
  parallel: 4
  script:
    - npm run test -- --shard=$CI_NODE_INDEX/$CI_NODE_TOTAL

# For Jest
test:
  parallel: 4
  script:
    - npx jest --shard=$CI_NODE_INDEX/$CI_NODE_TOTAL

# For pytest
test:
  image: python:3.11
  parallel: 4
  script:
    - pip install pytest-split
    - pytest --splits $CI_NODE_TOTAL --group $CI_NODE_INDEX

Matrix Jobs

test:
  stage: test
  parallel:
    matrix:
      - NODE_VERSION: ["18", "20", "22"]
        DATABASE: ["postgres", "mysql"]
  image: node:${NODE_VERSION}
  services:
    - name: ${DATABASE}:latest
      alias: db
  script:
    - npm test

Conditional Job Execution

Don’t run what you don’t need:

# Only run on specific file changes
test-frontend:
  rules:
    - changes:
        - frontend/**/*
        - package.json
  script:
    - npm run test:frontend

test-backend:
  rules:
    - changes:
        - backend/**/*
        - go.mod
  script:
    - go test ./...

# Skip CI for docs-only changes
# Caveat: rules:changes matches when ANY changed file matches these patterns,
# so a commit touching both code and docs is also skipped. Scope with care.
workflow:
  rules:
    - if: $CI_COMMIT_MESSAGE =~ /\[skip ci\]/
      when: never
    - changes:
        - "**/*.md"
        - "docs/**/*"
      when: never
    - when: always

# Environment-specific jobs
deploy-production:
  stage: deploy
  rules:
    - if: $CI_COMMIT_BRANCH == "main"
  environment:
    name: production
  script:
    - ./deploy.sh production
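One subtlety: on branch pipelines, rules:changes diffs against the previous push by default, which can produce surprising matches. On GitLab 15.3+, compare_to pins the comparison base explicitly; a sketch reusing the backend job from above:

```yaml
test-backend:
  rules:
    - changes:
        paths:
          - backend/**/*
        compare_to: 'refs/heads/main'  # diff against main, not the previous push
  script:
    - go test ./...
```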

Artifacts vs. Cache

Understanding when to use each:

Feature       Cache                          Artifacts
Purpose       Speed up jobs                  Pass data between jobs
Persistence   Best-effort                    Guaranteed
Scope         Same job across pipelines      Different jobs in same pipeline
Use case      Dependencies (node_modules)    Build outputs (dist/)

build:
  stage: build
  cache:
    key:
      files:
        - package-lock.json
    paths:
      - node_modules/  # Cache: reused in future pipelines
  artifacts:
    paths:
      - dist/  # Artifact: needed by deploy job
    expire_in: 1 hour
  script:
    - npm ci
    - npm run build

deploy:
  stage: deploy
  dependencies:
    - build  # Only download artifacts from build job
  script:
    - aws s3 sync dist/ s3://my-bucket/

Docker Image Optimization

Use Kaniko for Rootless Builds

build-image:
  stage: build
  image:
    name: gcr.io/kaniko-project/executor:v1.19.2-debug
    entrypoint: [""]
  variables:
    DOCKER_CONFIG: /kaniko/.docker
  before_script:
    # Kaniko needs registry credentials to push; write them where it looks
    - mkdir -p /kaniko/.docker
    - echo "{\"auths\":{\"${CI_REGISTRY}\":{\"auth\":\"$(printf '%s:%s' "${CI_REGISTRY_USER}" "${CI_REGISTRY_PASSWORD}" | base64 | tr -d '\n')\"}}}" > /kaniko/.docker/config.json
  script:
    - |
      /kaniko/executor \
        --context "${CI_PROJECT_DIR}" \
        --dockerfile "${CI_PROJECT_DIR}/Dockerfile" \
        --destination "${CI_REGISTRY_IMAGE}:${CI_COMMIT_SHA}" \
        --destination "${CI_REGISTRY_IMAGE}:latest" \
        --cache=true \
        --cache-repo="${CI_REGISTRY_IMAGE}/cache"

Layer Caching with BuildKit

build-image:
  stage: build
  image: docker:24
  services:
    - docker:24-dind
  variables:
    DOCKER_BUILDKIT: "1"
  before_script:
    - echo "$CI_REGISTRY_PASSWORD" | docker login -u "$CI_REGISTRY_USER" --password-stdin "$CI_REGISTRY"
  script:
    - |
      docker build \
        --cache-from ${CI_REGISTRY_IMAGE}:latest \
        --build-arg BUILDKIT_INLINE_CACHE=1 \
        -t ${CI_REGISTRY_IMAGE}:${CI_COMMIT_SHA} \
        -t ${CI_REGISTRY_IMAGE}:latest \
        .
    - docker push ${CI_REGISTRY_IMAGE}:${CI_COMMIT_SHA}
    - docker push ${CI_REGISTRY_IMAGE}:latest
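Inline cache only preserves layers of the final image stage, so multi-stage builds can still miss. A registry-backed BuildKit cache with mode=max keeps intermediate stages too; a sketch using docker buildx (same registry credentials assumed as above):

```yaml
build-image:
  stage: build
  image: docker:24
  services:
    - docker:24-dind
  before_script:
    - echo "$CI_REGISTRY_PASSWORD" | docker login -u "$CI_REGISTRY_USER" --password-stdin "$CI_REGISTRY"
  script:
    # Registry cache needs the docker-container buildx driver
    - docker buildx create --use
    - |
      docker buildx build \
        --cache-from type=registry,ref=${CI_REGISTRY_IMAGE}/buildcache \
        --cache-to type=registry,ref=${CI_REGISTRY_IMAGE}/buildcache,mode=max \
        --push \
        -t ${CI_REGISTRY_IMAGE}:${CI_COMMIT_SHA} \
        .
```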

Include and Extends for DRY Pipelines

Reusable Templates

# templates/node.yml
.node-base:
  image: node:20-alpine
  cache:
    key:
      files:
        - package-lock.json
    paths:
      - node_modules/
  before_script:
    - npm ci --cache .npm

.deploy-base:
  image: amazon/aws-cli:2.15.0
  before_script:
    - aws configure set region $AWS_REGION

# .gitlab-ci.yml
include:
  - local: templates/node.yml
  - project: 'devops/ci-templates'
    ref: main
    file: '/security/sast.yml'

build:
  extends: .node-base
  stage: build
  script:
    - npm run build

deploy:
  extends: .deploy-base
  stage: deploy
  script:
    - aws s3 sync dist/ s3://$BUCKET_NAME/
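Where extends merges whole job definitions, GitLab's !reference YAML tag splices individual keys, which is handy when a job needs a template's script steps plus its own. A sketch (the .npm-setup template is hypothetical):

```yaml
.npm-setup:
  script:
    - npm ci --cache .npm

build:
  stage: build
  script:
    - !reference [.npm-setup, script]  # splice in the shared setup steps
    - npm run build
```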

Runner Configuration Tips

Choosing the Right Executor

# config.toml for GitLab Runner

# Docker executor (isolated, consistent)
[[runners]]
  executor = "docker"
  [runners.docker]
    image = "alpine:latest"
    privileged = false
    volumes = ["/cache"]

# Docker Machine (auto-scaling; deprecated in favor of the GitLab Runner autoscaler)
[[runners]]
  executor = "docker+machine"
  [runners.machine]
    IdleCount = 2
    MaxBuilds = 100
    MachineDriver = "amazonec2"

Tag-Based Job Routing

# Route to specific runners
heavy-build:
  tags:
    - high-memory
    - ssd
  script:
    - npm run build:production

quick-lint:
  tags:
    - shared
  script:
    - npm run lint

Monitoring Pipeline Performance

Built-in Analytics

GitLab provides pipeline analytics under Analyze > CI/CD analytics (CI/CD > Analytics on older versions). Track:

  • Pipeline duration trends
  • Job failure rates
  • Most time-consuming jobs

Custom Metrics

.metrics:
  after_script:
    - |
      curl -X POST "https://metrics.example.com/pipeline" \
        -H "Content-Type: application/json" \
        -d '{
          "job": "'$CI_JOB_NAME'",
          "duration": "'$CI_JOB_DURATION'",
          "status": "'$CI_JOB_STATUS'",
          "pipeline": "'$CI_PIPELINE_ID'"
        }' || true

test:
  extends: .metrics
  script:
    - npm test

Complete Optimized Pipeline Example

stages:
  - install
  - build
  - test
  - security
  - deploy

variables:
  NPM_CACHE_DIR: "$CI_PROJECT_DIR/.npm"

default:
  image: node:20-alpine
  tags:
    - docker

workflow:
  rules:
    - if: $CI_COMMIT_MESSAGE =~ /\[skip ci\]/
      when: never
    - when: always

# Separate install job that others depend on
install:
  stage: install
  cache:
    key:
      files:
        - package-lock.json
    paths:
      - node_modules/
      - .npm/
    policy: pull-push
  script:
    - npm ci --cache "$NPM_CACHE_DIR"
  artifacts:
    paths:
      - node_modules/
    expire_in: 1 hour

build:
  stage: build
  needs: [install]
  script:
    - npm run build
  artifacts:
    paths:
      - dist/
    expire_in: 1 day

lint:
  stage: test
  needs: [install]
  script:
    - npm run lint

unit-tests:
  stage: test
  needs: [install]
  parallel: 4
  script:
    - npm run test:unit -- --shard=$CI_NODE_INDEX/$CI_NODE_TOTAL
  coverage: '/Lines\s*:\s*(\d+\.?\d*)%/'

e2e-tests:
  stage: test
  needs: [build]
  image: cypress/included:13.6.0
  script:
    - npm run test:e2e
  artifacts:
    when: on_failure
    paths:
      - cypress/screenshots/
      - cypress/videos/

security-scan:
  stage: security
  needs: [install]
  allow_failure: true
  script:
    - npm audit --audit-level=high

deploy-staging:
  stage: deploy
  needs: [build, unit-tests, lint]
  environment:
    name: staging
  rules:
    - if: $CI_COMMIT_BRANCH == "develop"
  script:
    - ./deploy.sh staging

deploy-production:
  stage: deploy
  needs: [build, unit-tests, e2e-tests, lint]
  environment:
    name: production
  rules:
    - if: $CI_COMMIT_BRANCH == "main"
  script:
    - ./deploy.sh production

When NOT to Optimize

Not every pipeline needs aggressive optimization:

  • Small projects: Complex caching setup overhead > time saved
  • Infrequent changes: Cache expires before reuse
  • Compliance requirements: Some industries require fresh builds
  • Debugging: Over-optimized pipelines are harder to troubleshoot

Key Takeaways

  1. Use DAG pipelines with needs for true parallelism
  2. Cache dependencies, not build outputs
  3. Use policy: pull on jobs that don’t need to update cache
  4. Parallelize tests with the parallel keyword
  5. Skip jobs on irrelevant changes with rules and changes
  6. Measure before optimizing — use GitLab’s pipeline analytics

The goal isn’t the fastest possible pipeline—it’s a pipeline fast enough that developers never context-switch waiting for it.