GitLab CI/CD – Artifacts

This article explains what GitLab CI/CD artifacts are, how they pass data between jobs and stages, common uses such as build outputs and test reports, and practical considerations such as expiration policies and best practices for efficient pipeline management.

Understanding GitLab CI/CD Artifacts

GitLab CI/CD artifacts are a fundamental concept in building robust and efficient continuous integration and continuous delivery pipelines. Simply put, artifacts are files or directories that are generated by a CI/CD job and can then be saved, viewed, and passed to subsequent jobs in later stages of the pipeline, or downloaded by users. They are crucial for maintaining state and sharing data across otherwise independent pipeline jobs.

Why Use GitLab CI/CD Artifacts?

The primary purpose of artifacts is to enable communication and data persistence between different jobs and stages in a GitLab CI/CD pipeline. Each job runs in its own isolated environment, so without artifacts there would be no built-in way to share outputs like compiled code, test results, or generated documentation.

Key reasons to use artifacts:

  • Passing Data Between Stages: The most common use case, allowing a build job to produce an executable that a test job then uses, and a deploy job subsequently deploys.
  • Archiving Build Outputs: Storing compiled binaries, packages, or deployment files for later release or debugging.
  • Collecting Test Reports: Centralizing test results (e.g., JUnit XML files) for display in the GitLab UI or for analysis.
  • Generating Documentation: Saving generated documentation, such as API docs or user manuals, as part of the pipeline.
  • Caching Dependencies (Distinct from Artifacts): While similar in concept, caching is primarily for speeding up future pipeline runs by reusing downloaded dependencies (e.g., node_modules). Artifacts are for outputs that need to be preserved or passed to other jobs.
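
To make the cache-versus-artifact distinction concrete, here is a minimal sketch. It assumes a Node.js project whose build script writes to dist/; the job name, image tag, and paths are illustrative, not taken from a real project. Downloaded packages go in the cache, while the build output is saved as an artifact.

```yaml
# Sketch only: assumes a Node.js project with a "build" script that writes to dist/.
build_site:
  stage: build
  image: node:20
  cache:
    key:
      files:
        - package-lock.json   # the cache is reused until the lockfile changes
    paths:
      - .npm/                 # downloaded packages: cache, not artifact
  script:
    - npm ci --cache .npm --prefer-offline
    - npm run build
  artifacts:
    paths:
      - dist/                 # build output: preserved and passed to later jobs
```

The cache may be missing on any given run and the job still works, only slower. The artifact is the job's result, so later jobs can rely on it being there.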

Defining Artifacts in .gitlab-ci.yml

Artifacts are defined within the .gitlab-ci.yml file using the artifacts keyword. This keyword has several sub-parameters to control what files are saved, when they are saved, and how long they are retained.

Basic Artifact Definition

To specify which files or directories should be saved as artifacts, use the paths keyword:

stages:
  - build
  - test

build_job:
  stage: build
  script:
    - echo "Building application..."
    - mkdir -p build_output
    - echo "This is a compiled application." > build_output/app.txt
    - cp build_output/app.txt application.txt # Another file for artifact
  artifacts:
    paths:
      - build_output/ # Saves the entire directory
      - application.txt # Saves a specific file
    # By default, artifacts are uploaded when the job succeeds.

  • paths: This is a list of files or directories relative to the project’s root that will be included in the artifact archive.

Controlling Artifact Upload Conditions (when)

You can control when artifacts are uploaded using the when keyword:

  • on_success (default): Uploads artifacts only if the job succeeds.
  • on_failure: Uploads artifacts only if the job fails. Useful for debugging.
  • always: Uploads artifacts regardless of the job’s success or failure.

build_and_archive:
  stage: build
  script:
    - echo "Simulating a build process..."
    - mkdir -p build_logs
    - echo "Build log content" > build_logs/build.log
    - exit 1 # Simulate a failure to demonstrate 'on_failure'
  artifacts:
    paths:
      - build_logs/
    when: on_failure # Upload artifacts only if this job fails
    name: "failed-build-logs" # Custom name for the artifact archive
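
The always value is useful when output is worth keeping whether or not the job passes. The following is a sketch under assumptions: run_tests.sh and the test_output/ directory are hypothetical stand-ins for whatever your test runner produces.

```yaml
# Sketch only: run_tests.sh and test_output/ are hypothetical.
integration_tests:
  stage: test
  script:
    - ./run_tests.sh
  artifacts:
    when: always        # upload on success and on failure
    paths:
      - test_output/    # logs, screenshots, or similar test output
    expire_in: 3 days
```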

Setting Artifact Expiration (expire_in)

Artifacts consume storage space. It’s crucial to set an expiration policy to automatically remove old artifacts. The expire_in keyword specifies how long artifacts should be kept.

deploy_job:
  stage: deploy
  script:
    - echo "Deploying application..."
    # Assume the deployment package is downloaded from a previous artifact
    - echo "Deployment finished" > deployment_report.txt # Create the file saved below
  artifacts:
    paths:
      - deployment_report.txt
    expire_in: 1 day # Artifacts will be kept for 1 day

  • expire_in: Accepts human-readable durations such as 30 mins, 1 week, 3 months, or 1 year, as well as never. If it is not specified, GitLab applies the instance-wide default, which is 30 days unless an administrator has changed it. You can also stop an individual artifact from expiring by clicking “Keep” in the GitLab UI.

Passing Artifacts Between Jobs (dependencies and needs)

By default, jobs in later stages automatically fetch all artifacts from jobs in previous stages. However, for finer control and to optimize pipeline performance, you can use dependencies or needs.

  • dependencies: Specifies a list of jobs from which to download artifacts. This is useful when you only need artifacts from a subset of previous jobs. If set to an empty array [], no artifacts from previous jobs are downloaded.

stages:
  - build
  - test
  - deploy

build_app:
  stage: build
  script:
    - echo "Building application..."
    - mkdir -p build_output
    - echo "Compiled app content" > build_output/app.bin
  artifacts:
    paths:
      - build_output/

run_unit_tests:
  stage: test
  script:
    - echo "Running unit tests..."
    # This job will automatically download artifacts from 'build_app'
    - cat build_output/app.bin # Accessing the artifact
    - echo "Unit tests passed." > unit_test_results.txt
  artifacts:
    paths:
      - unit_test_results.txt
    
run_integration_tests:
  stage: test
  script:
    - echo "Running integration tests..."
    - cat build_output/app.bin # Accessing the artifact
    - echo "Integration tests passed." > integration_test_results.txt
  artifacts:
    paths:
      - integration_test_results.txt
  dependencies:
    - build_app # Explicitly state dependency on 'build_app' artifacts

deploy_to_staging:
  stage: deploy
  script:
    - echo "Deploying to staging..."
    - cat build_output/app.bin # Accessing the artifact
  dependencies:
    - build_app # Only needs artifacts from build_app
    - run_unit_tests # And artifacts from run_unit_tests (for reports, etc.)

  • needs: This keyword allows jobs to run out of stage order and explicitly defines dependencies between jobs. When using needs, artifacts are only fetched from the jobs explicitly listed. This can significantly speed up pipelines by running jobs in parallel that would otherwise wait for previous stages to complete.

stages:
  - build
  - lint
  - test
  - deploy

build_frontend:
  stage: build
  script:
    - echo "Building frontend..."
    - mkdir -p frontend_dist
    - echo "Frontend built" > frontend_dist/index.html
  artifacts:
    paths:
      - frontend_dist/

build_backend:
  stage: build
  script:
    - echo "Building backend..."
    - mkdir -p backend_dist
    - echo "Backend built" > backend_dist/api.jar
  artifacts:
    paths:
      - backend_dist/

run_lint:
  stage: lint
  script:
    - echo "Running linting..."

run_e2e_tests:
  stage: test
  script:
    - echo "Running E2E tests..."
    - cat frontend_dist/index.html # Accessing frontend artifact
    - cat backend_dist/api.jar # Accessing backend artifact
  needs:
    - build_frontend # Explicitly depends on build_frontend and its artifacts
    - build_backend # Explicitly depends on build_backend and its artifacts

deploy_production:
  stage: deploy
  script:
    - echo "Deploying to production..."
    - cat frontend_dist/index.html
    - cat backend_dist/api.jar
  needs:
    - run_e2e_tests # Wait for the E2E tests to pass
    - build_frontend # With needs, artifacts come only from listed jobs,
    - build_backend # so the build jobs are listed to fetch their output
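
Both keywords can also switch artifact downloads off entirely, which helps jobs that do not read any files from earlier jobs. The sketch below reuses the stage and job names from the example above; notify_team and smoke_check are hypothetical jobs added for illustration.

```yaml
# Sketch only: notify_team and smoke_check are hypothetical jobs.
notify_team:
  stage: deploy
  script:
    - echo "Pipeline reached the deploy stage."
  dependencies: []          # download no artifacts from earlier stages

smoke_check:
  stage: test
  script:
    - echo "Backend build finished."
  needs:
    - job: build_backend
      artifacts: false      # wait for the job, but skip its artifacts
```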

Artifact Reports (artifacts:reports)

GitLab CI/CD can automatically parse certain types of reports from artifacts and display them directly in the UI (e.g., on merge requests or pipeline views). This is particularly useful for code quality, test results, and security scans.

run_tests:
  stage: test
  script:
    - echo "Running tests and generating JUnit report..."
    - mkdir -p reports
    - echo "<testsuite name='my_suite'><testcase name='my_test'/></testsuite>" > reports/junit_report.xml
  artifacts:
    reports:
      junit: reports/junit_report.xml # Specifies the JUnit report path
    paths:
      - reports/ # Also save the report file as a browsable artifact
    expire_in: 1 week

Common report types include the following (availability depends on your GitLab version and subscription tier):

  • junit
  • codequality
  • sast (Static Application Security Testing)
  • dast (Dynamic Application Security Testing)
  • dependency_scanning
  • container_scanning
  • license_scanning
  • metrics
  • dotenv (for passing environment variables between jobs)
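
The dotenv report type works differently from the others: instead of being displayed in the UI, its contents are loaded as environment variables in later jobs. A minimal sketch follows, where the variable name APP_VERSION and its value are illustrative assumptions.

```yaml
# Sketch only: the variable name and version string are illustrative.
prepare_version:
  stage: build
  script:
    - echo "APP_VERSION=1.4.2" >> build.env   # one KEY=value pair per line
  artifacts:
    reports:
      dotenv: build.env

print_version:
  stage: test
  script:
    - echo "Testing version $APP_VERSION"     # set from the dotenv report
```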

Managing and Browsing Artifacts

GitLab provides a user-friendly interface to manage and browse artifacts:

  • Download: You can download the entire artifact archive for a job from the pipeline or job detail page.
  • Browse: Artifacts are stored as a ZIP archive by default, and you can browse the archive’s contents directly in the GitLab UI without downloading it.
  • Keep: You can manually override the expire_in setting for a specific job’s artifacts by clicking the “Keep” button, preventing them from being deleted.
  • Delete: Project Maintainers and Owners can delete individual job artifacts.
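
Artifacts can also be linked from the merge request page with the artifacts:expose_as keyword. The example below is a sketch: the job name, the coverage/ directory, and the placeholder HTML are assumptions for illustration.

```yaml
# Sketch only: the job name, directory, and file contents are placeholders.
coverage_report:
  stage: test
  script:
    - mkdir -p coverage
    - echo "<html><body>Coverage summary</body></html>" > coverage/index.html
  artifacts:
    expose_as: 'Coverage report'   # link text shown in the merge request
    paths:
      - coverage/
```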

Best Practices for GitLab CI/CD Artifacts

To make the most of GitLab CI/CD artifacts and maintain an efficient pipeline:

  • Be Specific with paths: Only include the files and directories that are truly necessary as artifacts. Large artifact archives consume more storage and take longer to upload and download, slowing down pipelines.
  • Use expire_in Judiciously: Set appropriate expiration times. For development branches, shorter retention might be suitable, while artifacts from release branches or production deployments might need to be kept longer.
  • Leverage dependencies and needs: Carefully define job dependencies to ensure only required artifacts are downloaded, improving pipeline speed and efficiency.
  • Utilize Report Artifacts: Integrate artifacts:reports for automated display of test results, code quality, and security findings within GitLab, enhancing visibility and feedback.
  • Consider Caching for Dependencies: For third-party dependencies (e.g., npm install, composer install), use GitLab CI/CD caching instead of artifacts. Caching is optimized for reusing fetched dependencies across pipeline runs, whereas artifacts are for outputs of jobs.
  • Artifact Naming: Use artifacts:name to give meaningful names to your artifact archives, especially in complex pipelines where many jobs produce archives (a job has a single artifacts definition, and its archive is named artifacts by default). You can use predefined CI/CD variables in artifact names for dynamic naming (e.g., name: "$CI_COMMIT_REF_NAME-build").
  • Object Storage for Self-Managed Instances: If you host your own GitLab instance, consider configuring object storage (like S3, MinIO) for artifacts to improve scalability and resilience.
  • Security: Be mindful of what sensitive information might be included in artifacts. Ensure proper access controls are in place for projects containing sensitive artifacts.
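
Several of these practices can be combined in one configuration. The following is a sketch, not a recommended policy: the make build command, the build/app.bin path, and the retention periods are all assumptions chosen for illustration.

```yaml
# Sketch only: the build command, output path, and retention periods are assumptions.
build_branch:
  stage: build
  script:
    - make build
  artifacts:
    name: "$CI_JOB_NAME-$CI_COMMIT_REF_SLUG"   # dynamic, readable archive name
    paths:
      - build/app.bin                           # only the file later jobs need
    expire_in: 3 days                           # short retention for branch builds
  rules:
    - if: $CI_COMMIT_BRANCH

build_release:
  stage: build
  script:
    - make build
  artifacts:
    name: "release-$CI_COMMIT_TAG"
    paths:
      - build/app.bin
    expire_in: 1 year                           # longer retention for tagged releases
  rules:
    - if: $CI_COMMIT_TAG
```

Splitting branch and tag builds into separate jobs keeps each retention policy explicit, at the cost of some duplicated configuration.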

FAQs – GitLab CI/CD Artifacts

What is the artifacts keyword in GitLab CI/CD?
The artifacts keyword in GitLab CI/CD is used to define files and directories that are generated by a job and should be preserved after the job finishes. These files can be used in later jobs or downloaded from the GitLab interface. Common use cases include preserving build results, test reports, and deployment packages.


How to define artifacts in a GitLab CI/CD job?
You can define artifacts using the artifacts keyword inside a job. Here’s an example:

build-job:
  stage: build
  script:
    - make build
  artifacts:
    paths:
      - build/   # This folder will be stored as an artifact

In this example, the build/ directory generated by the make build command is stored as an artifact and can be accessed later.


How long are artifacts stored in GitLab by default?
By default, GitLab keeps artifacts for 30 days, unless an administrator has changed the instance-wide setting. You can override this per job with the expire_in keyword:

build-job:
  script:
    - make build
  artifacts:
    paths:
      - build/
    expire_in: 1 week  # Artifacts will be deleted after 1 week

You can use time values like 1 hr, 2 days, 3 months, etc.


How to share artifacts between jobs in different stages?
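Jobs in later stages download the artifacts of jobs in earlier stages by default. To control which artifacts a downstream job receives, define the artifacts in one job and list that job under the dependencies keyword in the downstream job: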

build-job:
  stage: build
  script:
    - make build
  artifacts:
    paths:
      - build/

test-job:
  stage: test
  script:
    - ./run_tests.sh build/
  dependencies:
    - build-job

Here, test-job depends on build-job and will download its artifacts before running.


Can I exclude files from being saved as artifacts?
Yes, you can use the exclude option along with paths to prevent certain files or folders from being stored:

build-job:
  script:
    - make build
  artifacts:
    paths:
      - build/
    exclude:
      - build/temp/**/*   # Excludes every file under this subfolder

What file types can be stored as artifacts?
Any file type can be stored as artifacts—there is no file extension restriction. Common examples include:

  • .zip, .tar, .jar (build outputs)
  • .html, .xml, .json (reports)
  • Executables or logs

The only constraints are the maximum artifact size configured for the instance or project and the storage limits of your GitLab plan.


Can I store test reports as artifacts in GitLab?
Yes, and GitLab can even parse test reports if you use the reports keyword under artifacts:

test-job:
  script:
    - run_tests > report.xml
  artifacts:
    reports:
      junit: report.xml

This stores the report and enables GitLab to show test results directly in the UI.


How to make artifacts available for download in GitLab UI?
Once a job with artifacts finishes, GitLab adds a Download button in the pipeline/job UI. Users can click it to download the artifacts archive (.zip file by default).


Are artifacts transferred automatically to all jobs?
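Partly. By default, a job downloads all artifacts from jobs in earlier stages, while jobs in the same stage do not share artifacts with each other. To limit what is downloaded, list specific jobs under the dependencies keyword, or use needs: [job-name], in which case only artifacts from the listed jobs are fetched.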


Can I store artifacts only when a job fails or succeeds?
Yes, you can use the when keyword to control when artifacts are saved:

test-job:
  script:
    - run_tests
  artifacts:
    paths:
      - logs/
    when: on_failure  # Save artifacts only if job fails

Values for when include:

  • on_success (default)
  • on_failure
  • always

Do artifacts affect Git repository size?
No, artifacts are stored separately from your Git repository and do not affect repository size. However, they do consume CI storage and may be subject to cleanup policies or storage limits.

Author

Debjeet Bhowmik

Experienced Cloud & DevOps Engineer with hands-on experience in AWS, GCP, Terraform, Ansible, ELK, Docker, Git, GitLab, Python, PowerShell, Shell, and theoretical knowledge on Azure, Kubernetes & Jenkins. In my free time, I write blogs on ckdbtech.com
