GitLab CI/CD – .gitlab-ci.yml

In today’s blog post, we will discuss the .gitlab-ci.yml file used to define a GitLab CI/CD pipeline. In particular, we will cover its syntax and structure, its core components (keywords), an example .gitlab-ci.yml file with most of the core components, and some best practices for using it. So without any further delay, let us get started.

What is .gitlab-ci.yml?

The .gitlab-ci.yml is a configuration file that tells GitLab Runner—the agent that executes your CI/CD jobs—what to do. It uses a human-readable YAML syntax, allowing you to define a series of steps (jobs) that are organized into stages. When a new commit is pushed, GitLab automatically detects this file and starts a new pipeline based on its instructions.

Basic Structure and Key Keywords

A typical .gitlab-ci.yml file is composed of several key elements:

stages:

The stages keyword defines the order in which your jobs will execute. Jobs within the same stage run in parallel, while stages themselves run sequentially. If a job in an earlier stage fails, subsequent stages are not executed (by default).

stages:
  - build
  - test
  - deploy

In this example, all jobs in the build stage will run first, followed by test jobs, and finally deploy jobs.

before_script, script, after_script:

These keywords define the commands that your jobs will execute.

  • before_script: Commands that run before a job's script. Defined under default, they apply to every job that does not define its own before_script; defined inside a job, they apply to that job only. This is useful for common setup tasks like installing dependencies.
  • script: This is the most important part of any job. It contains the actual commands that the job will execute.
  • after_script: Commands that run after a job's script, regardless of whether the job succeeds or fails. They run in a separate shell, so changes made in script (such as exported variables) are not visible to them. Useful for cleanup or reporting.

Example of a simple job:

build_job:
  stage: build
  script:
    - echo "Compiling the application..."
    - mvn clean install
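
To see how the three keywords interact, here is a minimal sketch. The job names (test_job, lint_job) and the echo commands are placeholders chosen for illustration, and the jobs rely on the default test stage that GitLab provides when no stages list is defined.

```yaml
default:
  before_script:
    - echo "Default setup, used by jobs without their own before_script"

test_job:
  stage: test
  script:
    - echo "Main commands for this job"
  after_script:
    # Runs in a separate shell after script, even if script failed
    - echo "Cleaning up..."

lint_job:
  stage: test
  # A job-level before_script replaces the default one for this job only
  before_script:
    - echo "Lint-specific setup"
  script:
    - echo "Linting..."
```

In this sketch test_job runs the default setup first, while lint_job runs only its own setup before its script.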

image:

The image keyword allows you to specify a Docker image that will be used as the base environment for your jobs. This ensures consistent build environments and simplifies dependency management.

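default:
  image:
    name: alpine/git
    # alpine/git uses `git` as its entrypoint; clear it so the runner can start a shell
    entrypoint: [""]

build_job:
  stage: build
  script:
    - echo "Running inside alpine/git container"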

You can define a default image for all jobs, or specify different images for individual jobs.
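
As a small sketch of a per-job override, the configuration below sets a default image and replaces it in one job. The image tags (alpine:latest, node:18-alpine) and job names are assumptions picked for the example; substitute whatever your project needs.

```yaml
default:
  image: alpine:latest

build_job:
  stage: build
  script:
    # Uses the default image
    - cat /etc/os-release

test_job:
  stage: test
  # The job-level image replaces the default for this job only
  image: node:18-alpine
  script:
    - node --version
```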

variables: Defining Reusable Values

The variables keyword allows you to define custom variables that can be used throughout your pipeline. These can be global or job-specific.

variables:
  APP_VERSION: "1.0.0" # Quote values so YAML always treats them as strings
  TEST_SUITE: unit

build_job:
  stage: build
  script:
    - echo "Building version $APP_VERSION"

test_job:
  stage: test
  script:
    - echo "Running $TEST_SUITE tests"
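
The example above only uses global variables. A job-specific variable could look like the following sketch, where the job names are illustrative and a job-level value takes precedence over the global one with the same name.

```yaml
variables:
  TEST_SUITE: unit

unit_tests:
  stage: test
  script:
    # Uses the global value of TEST_SUITE
    - echo "Running $TEST_SUITE tests"

integration_tests:
  stage: test
  variables:
    # Overrides the global TEST_SUITE for this job only
    TEST_SUITE: integration
  script:
    - echo "Running $TEST_SUITE tests"
```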

only/except: Controlling When Jobs Run

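These keywords control when jobs are included in or excluded from a pipeline. You can specify branches, tags, or even file changes as a condition to run your jobs. GitLab no longer actively develops only/except and recommends rules for new pipelines, but you will still find them in many existing configurations.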

deploy_production:
  stage: deploy
  script:
    - echo "Deploying to production server..." # Replace with your real deploy command
  only:
    - main # This job will only run on the 'main' branch

test_feature:
  stage: test
  script:
    - echo "Running feature tests..." # Replace with your real test command
  except:
    - main # This job will run on all branches except 'main'
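
For comparison, here is a rough equivalent of the same two jobs written with rules. Treat it as a sketch rather than an exact translation: only/except match tags as well as branches, whereas $CI_COMMIT_BRANCH is set only in branch pipelines. The echo commands are placeholders.

```yaml
deploy_production:
  stage: deploy
  script:
    - echo "Deploying to production server..."
  rules:
    # Run only in pipelines for the 'main' branch
    - if: $CI_COMMIT_BRANCH == "main"

test_feature:
  stage: test
  script:
    - echo "Running feature tests..."
  rules:
    # Rules are evaluated in order; the first match wins
    - if: $CI_COMMIT_BRANCH == "main"
      when: never
    # Any other branch pipeline includes the job
    - if: $CI_COMMIT_BRANCH
```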

artifacts: Storing Job Output

The artifacts keyword allows you to specify files or directories that should be saved after a job completes. These artifacts can be downloaded from the pipeline view or passed to subsequent jobs.

build_job:
  stage: build
  script:
    - mkdir build
    - echo "Compiled code" > build/app.txt
  artifacts:
    paths:
      - build/
    expire_in: 1 week
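
To illustrate the "passed to subsequent jobs" part, the sketch below adds a hypothetical test_job that reads the file produced by build_job. By default a job downloads the artifacts of all jobs in earlier stages; the dependencies keyword is used here only to narrow that down.

```yaml
build_job:
  stage: build
  script:
    - mkdir build
    - echo "Compiled code" > build/app.txt
  artifacts:
    paths:
      - build/
    expire_in: 1 week

test_job:
  stage: test
  # Fetch artifacts from build_job only, not from every earlier job
  dependencies:
    - build_job
  script:
    # build/ was restored from the build_job artifacts
    - cat build/app.txt
```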

cache: Speeding Up Pipelines

The cache keyword allows you to specify files or directories that should be cached between pipeline runs. This is particularly useful for dependencies that do not change often, like node_modules or Maven repository contents. Unlike artifacts, a cache is only an optimization: a job must still work when the cache is empty or missing.

cache:
  paths:
    - node_modules/
    - .npm/

install_dependencies:
  stage: build
  script:
    - npm install --cache .npm --prefer-offline # Point npm at the cached .npm/ directory
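
One common refinement is to tie the cache to the lock file, so that the cache is rebuilt only when dependencies change. The following is a minimal sketch that assumes a Node.js project with a package-lock.json in the repository root.

```yaml
install_dependencies:
  stage: build
  cache:
    key:
      # The cache key changes whenever package-lock.json changes
      files:
        - package-lock.json
    paths:
      - .npm/
  script:
    # npm ci installs exactly what the lock file specifies
    - npm ci --cache .npm --prefer-offline
```

Only .npm/ is cached here because npm ci removes node_modules before installing.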

extends: Reusing Configuration

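The extends keyword allows you to inherit configuration from another job. This promotes code reuse and makes your .gitlab-ci.yml more DRY (Don't Repeat Yourself). Keys that hold a mapping (such as variables) are merged, while keys that hold a list (such as script) are replaced by the extending job.

.template_job:
  before_script:
    - echo "Inherited from the template"
  script:
    - echo "This is a template script"

my_job:
  extends: .template_job
  stage: build
  # script is a list, so this definition replaces the template's script
  # instead of appending to it; before_script is still inherited.
  script:
    - echo "This job overrides the template script"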
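
To make the merge behaviour of extends concrete, here is a sketch with hypothetical names (.node_template, unit_tests, NODE_ENV, LOG_LEVEL). The trailing comments describe the configuration the job ends up with after GitLab merges the template into it.

```yaml
.node_template:
  image: node:18-alpine
  variables:
    NODE_ENV: test
    LOG_LEVEL: info
  script:
    - echo "Template script"

unit_tests:
  extends: .node_template
  stage: test
  variables:
    LOG_LEVEL: debug
  script:
    - npm test

# Effective configuration of unit_tests after merging:
#   image:     node:18-alpine                  (inherited)
#   stage:     test
#   variables: NODE_ENV=test, LOG_LEVEL=debug  (mappings are merged key by key)
#   script:    npm test                        (lists are replaced, not appended)
```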

Advanced Concepts

  • Directed Acyclic Graph (DAG) Pipelines (needs): For more complex scenarios, you can use the needs keyword to define explicit dependencies between jobs. A job with needs starts as soon as the jobs it lists have finished, without waiting for the rest of the earlier stages to complete.
  • rules: A more flexible and powerful alternative to only/except, allowing for complex conditional logic based on various pipeline variables and changes.
  • Parent/Child Pipelines: Break down large pipelines into smaller, more manageable child pipelines for better organization and performance.
  • Manual Jobs: Jobs that require a manual trigger before execution, useful for production deployments.
  • Environments: Define deployment environments to track deployments and enable manual rollbacks.
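
The sketch below touches several of these concepts in one small pipeline. All job names and the child pipeline path (ci/child-pipeline.yml) are hypothetical, and the echo commands stand in for real work.

```yaml
stages:
  - build
  - test
  - deploy

build_app:
  stage: build
  script:
    - echo "Building the application..."

build_docs:
  stage: build
  script:
    - echo "Building the documentation..."

unit_tests:
  stage: test
  # needs: start as soon as build_app finishes, without waiting for build_docs
  needs: ["build_app"]
  script:
    - echo "Running unit tests..."

child_pipeline:
  stage: test
  # Parent/child pipeline: run the jobs defined in another file of this repository
  trigger:
    include: ci/child-pipeline.yml

deploy_production:
  stage: deploy
  script:
    - echo "Deploying to production..."
  # Environment: GitLab tracks this deployment under 'production'
  environment:
    name: production
  rules:
    # Manual job: someone has to start it from the GitLab UI
    - if: $CI_COMMIT_BRANCH == "main"
      when: manual
```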

.gitlab-ci.yml Example

Here is a larger .gitlab-ci.yml file that brings together most of the core components of GitLab CI/CD. Each section is commented to help you understand what it does. It is a lot to take in if you are just getting started, so treat it as a reference that shows what is ahead of you. I will cover each component in a dedicated blog post.

# .gitlab-ci.yml
# This file defines the CI/CD pipeline for your GitLab project.
# It uses YAML syntax and is processed by GitLab CI/CD runners.

# --- Global Variables ---
# Variables defined here are available globally to all jobs.
# They can be overridden at the job level.
variables:
  # General project variables
  GIT_DEPTH: "1" # Limit clone depth for faster CI/CD
  NODE_VERSION: "18" # Node.js major version; must form a valid image tag such as node:18-alpine
  DOCKER_IMAGE_NAME: "$CI_REGISTRY_IMAGE/$CI_COMMIT_REF_SLUG" # Dynamic image name for Docker builds
  DOCKER_DRIVER: "overlay2" # Recommended Docker storage driver
  # Define environment specific variables, often overridden in deploy jobs
  STAGING_SERVER_IP: "your_staging_server_ip"
  PRODUCTION_SERVER_IP: "your_production_server_ip"
  # Authentication variables (often masked/protected in GitLab UI)
  # CI_REGISTRY_USER and CI_REGISTRY_PASSWORD are provided automatically by GitLab
  # for the built-in Container Registry, so they do not need to be defined here.

# --- Stages Definition ---
# Stages define the order of execution for jobs.
# Jobs in earlier stages complete before jobs in later stages begin.
stages:
  - build       # Compile code, build Docker images, etc.
  - test        # Run unit tests, integration tests, linting, security scans.
  - deploy      # Deploy to various environments (staging, production).
  - cleanup     # Optional: Post-deployment cleanup or notification.

# --- Default Settings for Jobs ---
# These settings apply to all jobs unless explicitly overridden by a job.
default:
  # Use a common image for all jobs, or specify different images per job.
  # This example uses a Node.js Alpine image; the Docker client is added in before_script.
  image: node:${NODE_VERSION}-alpine3.18
  # Define a 'before_script' section to run commands before any job's script.
  # Useful for common setup tasks like installing dependencies.
  before_script:
    - echo "Executing default before_script..."
    # Install the Docker client for jobs that call 'docker'.
    # '|| true' keeps the job going on images where apk is missing or not permitted.
    - apk add --no-cache docker || true # Install docker client if not present (alpine)
    - echo "Default before_script completed."
  # Timeout for jobs (e.g., 1 hour and 30 minutes)
  timeout: 1h 30m

# --- Cache Configuration ---
# Caching speeds up builds by reusing files from previous jobs.
# 'key' defines a unique identifier for the cache.
cache:
  paths:
    - node_modules/ # Cache Node.js dependencies
    - .npm/         # Cache npm packages
  key: ${CI_COMMIT_REF_SLUG} # Cache per branch/tag

# --- Jobs Definition ---
# Each job is a dictionary that defines a specific task in the pipeline.

# --- Build Stage Jobs ---

build_nodejs_app:
  stage: build
  # Image can be overridden at job level
  image: node:${NODE_VERSION}-alpine
  script:
    - echo "Installing Node.js dependencies..."
    - npm ci --cache .npm --prefer-offline # Use `ci` for clean installs in CI
    - echo "Building Node.js application..."
    - npm run build # Assuming a 'build' script in package.json
  artifacts:
    # Path to store build output as artifacts
    paths:
      - dist/ # Assuming 'dist' directory contains build output
    expire_in: 1 week # How long to keep artifacts
  # Dependencies: This job does not depend on any other job in the 'build' stage.
  # It can run concurrently with other jobs in the same stage unless 'needs' is used.

build_docker_image:
  stage: build
  # Use a service for Docker-in-Docker (dind)
  image: docker:latest
  services:
    - docker:dind # This service provides a Docker daemon for building images.
  script:
    - echo "Logging into GitLab Container Registry..."
    - echo "$CI_REGISTRY_PASSWORD" | docker login -u "$CI_REGISTRY_USER" --password-stdin "$CI_REGISTRY" # Avoid passing the password on the command line
    - echo "Building Docker image: ${DOCKER_IMAGE_NAME}:${CI_COMMIT_SHA}"
    - docker build -t ${DOCKER_IMAGE_NAME}:${CI_COMMIT_SHA} -t ${DOCKER_IMAGE_NAME}:latest .
    - echo "Pushing Docker image..."
    - docker push ${DOCKER_IMAGE_NAME}:${CI_COMMIT_SHA}
    - docker push ${DOCKER_IMAGE_NAME}:latest
  # Rules define when a job should be executed.
  # $CI_COMMIT_BRANCH is set in branch pipelines (pushes, schedules, manual runs),
  # so this job is skipped in tag and merge request pipelines.
  rules:
    - if: $CI_COMMIT_BRANCH
  # Dependencies: This job does not depend on any other job to start.

# --- Test Stage Jobs ---

run_unit_tests:
  stage: test
  script:
    - echo "Running unit tests..."
    - npm test # Assuming 'test' script in package.json runs unit tests
  # Artifacts: Store test reports if available
  artifacts:
    when: always # Always upload artifacts, even if job fails
    paths:
      - junit.xml # Example for JUnit XML report
    reports:
      junit: junit.xml # GitLab will parse this for test results in MRs
  # Needs: This job needs `build_nodejs_app` to complete successfully
  # and will download its artifacts.
  needs: ["build_nodejs_app"]
  # Rules: Run in branch pipelines only (not in tag or merge request pipelines)
  rules:
    - if: $CI_COMMIT_BRANCH

run_integration_tests:
  stage: test
  script:
    - echo "Running integration tests..."
    - npm run test:integration # Assuming a dedicated script for integration tests
  # Needs: This job also needs the build to be complete.
  needs: ["build_nodejs_app"]
  rules:
    - if: $CI_COMMIT_BRANCH

run_linter:
  stage: test
  script:
    - echo "Running code linter..."
    - npm run lint # Assuming a 'lint' script in package.json
  allow_failure: true # Allow pipeline to continue even if linting fails (can be changed)
  rules:
    - if: $CI_COMMIT_BRANCH

# --- Deploy Stage Jobs ---

deploy_to_staging:
  stage: deploy
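  image:
    name: alpine/git # A lightweight image with git for deployment scripts
    entrypoint: [""] # alpine/git uses `git` as its entrypoint; clear it so the runner can start a shell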
  script:
    - echo "Deploying to staging environment..."
    # Example: SSH into server and pull latest Docker image or restart service
    - apk add --no-cache openssh-client # Install SSH client
    - eval "$(ssh-agent -s)" # Start SSH agent
    # Add your SSH private key (stored as a CI/CD variable)
    - echo "$SSH_PRIVATE_KEY" | tr -d '\r' | ssh-add - # SSH_PRIVATE_KEY must be a protected CI/CD variable
    - mkdir -p ~/.ssh
    - chmod 700 ~/.ssh
    - ssh-keyscan ${STAGING_SERVER_IP} >> ~/.ssh/known_hosts # Add server to known hosts
    - chmod 644 ~/.ssh/known_hosts
    - ssh user@${STAGING_SERVER_IP} "docker pull ${DOCKER_IMAGE_NAME}:latest && docker-compose down && docker-compose up -d"
    - echo "Deployment to staging complete."
  environment:
    name: staging
    url: http://${STAGING_SERVER_IP} # URL to access the deployed application
  # Rules: This job runs only on pushes to the 'develop' or 'main' branches.
  rules:
    - if: $CI_COMMIT_BRANCH == "develop"
    - if: $CI_COMMIT_BRANCH == "main"

deploy_to_production:
  stage: deploy
  image:
    name: alpine/git
    entrypoint: [""] # Clear the `git` entrypoint so the runner can start a shell
  script:
    - echo "Deploying to production environment..."
    - apk add --no-cache openssh-client
    - eval "$(ssh-agent -s)"
    - echo "$SSH_PRIVATE_KEY" | tr -d '\r' | ssh-add -
    - mkdir -p ~/.ssh
    - chmod 700 ~/.ssh
    - ssh-keyscan ${PRODUCTION_SERVER_IP} >> ~/.ssh/known_hosts
    - chmod 644 ~/.ssh/known_hosts
    - ssh user@${PRODUCTION_SERVER_IP} "docker pull ${DOCKER_IMAGE_NAME}:latest && docker-compose down && docker-compose up -d"
    - echo "Deployment to production complete."
  environment:
    name: production
    url: http://${PRODUCTION_SERVER_IP}
  # Rules: This job requires a manual action AND runs only on the 'main' branch.
  rules:
    - if: $CI_COMMIT_BRANCH == "main"
      when: manual # Requires a manual trigger in the GitLab UI
      allow_failure: false # Blocking manual job: the pipeline waits until this job is run

# --- Cleanup Stage Jobs (Optional) ---

send_slack_notification:
  stage: cleanup
  image: curlimages/curl:latest # A minimal image with curl
  script:
    - echo "Sending Slack notification..."
    # Replace with your actual Slack webhook URL and message
    - 'curl -X POST -H "Content-type: application/json" --data "{\"text\":\"Deployment to production for ${CI_PROJECT_NAME} (${CI_COMMIT_SHORT_SHA}) completed successfully!\"}" $SLACK_WEBHOOK_URL'
  # Needs: Only run after production deployment succeeds
  needs: ["deploy_to_production"]
  rules:
    - if: $CI_COMMIT_BRANCH == "main"
      when: on_success # Default behavior: run only after the jobs listed in 'needs' succeed

# --- Example of a Scheduled Job (using 'only' or 'rules') ---
# This job demonstrates how to run a job only on a schedule (e.g., nightly builds)
nightly_security_scan:
  stage: test
  image:
    name: alpine/git
    entrypoint: [""] # Clear the `git` entrypoint so the runner can start a shell
  script:
    - echo "Running nightly security scan..."
    # Example: Run a security scanner (e.g., SAST, DAST tool)
    - echo "Security scan completed."
  rules:
    - if: $CI_PIPELINE_SOURCE == "schedule" # Run only when triggered by a schedule

# --- Using Include for External Templates (Advanced) ---
# This feature allows you to reuse CI/CD configurations from other files or projects.
# include:
#   - local: '.gitlab/ci/template-build.yml' # Local file in your repository
#   - project: 'my-org/ci-templates' # From another project
#     ref: 'main'
#     file: '/templates/deploy-kubernetes.yml'
#   - remote: 'https://example.com/ci-templates/common.yml' # From a URL

This .gitlab-ci.yml file demonstrates various essential components:

  • variables: Defines global variables that can be used throughout the pipeline.
  • stages: Organizes jobs into logical sequences, ensuring, for example, that build jobs complete before test jobs.
  • default: Sets common configurations like the default image and before_script for all jobs, reducing redundancy.
  • cache: Configures caching for dependencies (node_modules), significantly speeding up subsequent runs.
  • jobs:
    • build_nodejs_app: An example of building a Node.js application, saving its output as artifacts.
    • build_docker_image: Demonstrates building and pushing a Docker image to the GitLab Container Registry using the docker:dind service.
    • run_unit_tests: Runs unit tests and showcases how to upload junit reports for display in merge requests. It uses needs to ensure it runs after the build_nodejs_app job.
    • deploy_to_staging: An example of deploying to a staging environment using SSH. It uses rules to run only on specific branches.
    • deploy_to_production: Shows a manual deployment step for production, requiring explicit trigger in the GitLab UI.
    • send_slack_notification: A post-deployment job that runs on success.
    • nightly_security_scan: Illustrates how to configure a job to run only on a schedule.
  • rules: Provides flexible control over when jobs execute (e.g., if conditions based on branch, when for manual triggers, on_success).
  • needs: Defines explicit job dependencies, allowing jobs in later stages to start earlier if their direct dependencies are met, or ensuring specific artifacts are passed.
  • artifacts: Specifies files or directories generated by a job to be saved and passed to subsequent jobs or downloaded by users.
  • environment: Configures deployment environments for better visibility and management within GitLab.

Best Practices for .gitlab-ci.yml

  • Start Simple: Begin with a basic pipeline and gradually add complexity as your needs evolve.
  • Keep it Readable: Use comments, consistent indentation, and logical grouping to make your file easy to understand.
  • Leverage Templates: Use extends and define templates for common job configurations to reduce duplication.
  • Version Control Your Pipeline: Since .gitlab-ci.yml is part of your repository, all changes are tracked, allowing for easy rollbacks and collaboration.
  • Use Variables Wisely: Externalize sensitive information using GitLab CI/CD variables (Settings > CI/CD > Variables) rather than hardcoding it.
  • Test Iteratively: Push small changes frequently to see their effect on the pipeline and debug issues early.
  • Optimize for Speed: Utilize caching, parallelize jobs, and use optimized Docker images to reduce pipeline execution time.
  • Secure Your Pipeline: Be mindful of the commands you run and the permissions granted to your GitLab Runner.
  • Document Your Pipeline: While the .gitlab-ci.yml file is self-documenting to an extent, provide additional documentation for complex logic or external dependencies.

FAQs – .gitlab-ci.yml File


What is the .gitlab-ci.yml file in GitLab CI/CD?
The .gitlab-ci.yml file is the main configuration file used by GitLab to define a CI/CD pipeline. It contains a set of instructions (called jobs) that GitLab Runner executes when certain events (like a push or merge) occur. It is stored in the root of your repository and version-controlled like any other code file.


How do I create a .gitlab-ci.yml file?
To create a .gitlab-ci.yml file:

  1. Go to the root directory of your GitLab project.
  2. Create a new file named .gitlab-ci.yml.
  3. Add pipeline stages and jobs like this:
stages:
  - build
  - test
  - deploy

build-job:
  stage: build
  script:
    - echo "Compiling the code..."

test-job:
  stage: test
  script:
    - echo "Running tests..."

deploy-job:
  stage: deploy
  script:
    - echo "Deploying application..."

Once committed, GitLab automatically triggers a pipeline based on this configuration.


What are stages in .gitlab-ci.yml and why are they important?
Stages define the order in which jobs are executed. Jobs in the same stage run in parallel, and stages run sequentially.

Example:

stages:
  - build
  - test
  - deploy

Here, all build jobs run first, followed by test, then deploy.


What is a job in .gitlab-ci.yml?
A job is a unit of work in GitLab CI/CD. It runs a series of shell commands. Every job must have a unique name and, in most cases, a script section (trigger jobs are the main exception):

lint:
  script:
    - npm run lint

This job will run the npm run lint command in the CI runner.


What does the script keyword do in GitLab CI/CD?
The script keyword defines the commands to execute inside a job. These are the actual tasks your runner performs, such as compiling code, running tests, or deploying an app:

build:
  script:
    - make
    - make install

How can I define variables in .gitlab-ci.yml?
You can define environment variables globally or per job using the variables keyword:

variables:
  APP_ENV: production

test:
  script:
    - echo $APP_ENV  # Prints 'production'

You can also define variables inside specific jobs for finer control.


Can I include other files in .gitlab-ci.yml?
Yes, use the include keyword to import configuration from other YAML files, remote URLs, or GitLab-managed templates:

include:
  - local: 'ci/test.yml'
  - template: 'Security/SAST.gitlab-ci.yml'

This helps you reuse CI logic across multiple projects.


How do I define a pipeline that runs only on specific branches or tags?
Use only, except, or the newer rules keyword:

deploy:
  script:
    - ./deploy.sh
  rules:
    - if: '$CI_COMMIT_BRANCH == "main"'

This ensures the deploy job runs only in pipelines for the main branch, for example when code is pushed to main.
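
For tags, a similar sketch could look like this. The job names and the ./release.sh script are hypothetical, and the version pattern (v1.2.3 style) is only one possible convention.

```yaml
release:
  script:
    - ./release.sh
  rules:
    # $CI_COMMIT_TAG is set only in tag pipelines
    - if: '$CI_COMMIT_TAG'

release_versioned:
  script:
    - ./release.sh
  rules:
    # Restrict the job to tags that look like v1.2.3
    - if: '$CI_COMMIT_TAG =~ /^v\d+\.\d+\.\d+$/'
```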


What is the difference between rules, only, and except in GitLab CI/CD?

  • only and except are legacy filters.
  • rules is the newer, more flexible way to control when a job runs.

Example using rules:

test:
  script: echo "Testing"
  rules:
    - if: '$CI_PIPELINE_SOURCE == "merge_request_event"'

Prefer rules for complex or condition-based logic.


How do I cache dependencies in GitLab CI/CD?
Use the cache keyword to save time by reusing previously downloaded dependencies:

cache:
  paths:
    - node_modules/

install:
  script:
    - npm install

This saves the node_modules directory between pipeline runs.


How can I use templates and reuse jobs in .gitlab-ci.yml?
You can use YAML anchors or extends to reuse job definitions:

.default-job: &default-template
  script:
    - echo "Common step"

job1:
  <<: *default-template

job2:
  <<: *default-template
  script:
    - echo "Custom for job2"

This avoids repeating the same configuration across multiple jobs.
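
The same reuse can be sketched with extends instead of anchors. The job names mirror the anchor example above; one practical difference is that extends also works with jobs brought in through include, whereas YAML anchors are limited to the file they are defined in.

```yaml
.default-job:
  script:
    - echo "Common step"

job1:
  # Inherits the script from .default-job
  extends: .default-job

job2:
  extends: .default-job
  # Replaces the inherited script
  script:
    - echo "Custom for job2"
```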


What is the default behavior if .gitlab-ci.yml is missing?
If your project does not contain a .gitlab-ci.yml file in the root directory, and neither a custom CI/CD configuration path nor Auto DevOps is set up for the project, GitLab will not trigger any CI/CD pipelines. You must create and commit the file to start using GitLab CI/CD.


Where can I validate my .gitlab-ci.yml file?
You can validate your CI configuration in GitLab using:

  • CI Lint Tool: Navigate to Build → Pipeline editor → Validate in your project (CI/CD → Editor → Validate in older GitLab versions)
  • Or visit: https://gitlab.com/<your-namespace>/<your-project>/-/ci/lint

This helps catch syntax and logic errors before running pipelines.


Author

Debjeet Bhowmik

Experienced Cloud & DevOps Engineer with hands-on experience in AWS, GCP, Terraform, Ansible, ELK, Docker, Git, GitLab, Python, PowerShell, Shell, and theoretical knowledge on Azure, Kubernetes & Jenkins. In my free time, I write blogs on ckdbtech.com
