This article provides a detailed exploration of the cache keyword in GitLab CI/CD, explaining its role in optimizing pipeline performance by storing and reusing dependencies between jobs and pipeline runs. We will cover the mechanics of caching, including keying strategies, path definitions, and cache policies, along with practical examples and best practices for efficient dependency management to accelerate your continuous integration and continuous delivery processes.
Understanding the cache Keyword in GitLab CI/CD
The cache keyword in GitLab CI/CD is a powerful feature designed to speed up your pipeline jobs by reusing a defined set of files and directories between subsequent runs. Its primary purpose is to store project dependencies (like downloaded libraries, compiled code, or package manager modules) so that they do not need to be re-downloaded or re-generated in every job or pipeline run. This significantly reduces job execution times, conserves network bandwidth, and leads to faster feedback cycles.
Why Use the cache Keyword?
The main benefits of using cache include:
- Faster Pipeline Execution: Reduces the time spent downloading or generating dependencies, directly impacting overall pipeline duration.
- Reduced Network Usage: Minimizes redundant downloads, which is especially beneficial for large projects with many dependencies.
- Improved Efficiency: Optimizes resource usage on your CI/CD runners.
- Consistent Environments: Helps ensure that jobs start with a consistent set of dependencies.
How cache Works
When a job with a cache definition runs:
- Before Script: GitLab CI/CD attempts to restore the cache based on the key defined. If a cache with that key exists, its contents are unzipped into the job’s working directory.
- After Script: If the job succeeds and the policy allows it, the files and directories specified in paths are compressed and uploaded as a new cache, associated with the key. This new cache overwrites any existing cache with the same key.
Configuring cache in .gitlab-ci.yml
The cache keyword is typically defined globally (for all jobs) or at the individual job level.
Basic cache Configuration
To specify which files or directories should be cached, use the paths keyword:
# .gitlab-ci.yml
cache:
  paths:
    - node_modules/ # Cache Node.js dependencies
    - .cache/ # Cache any other general build cache directory
stages:
  - build
  - test
install_dependencies:
  stage: build
  script:
    - echo "Installing dependencies..."
    - mkdir -p node_modules # Ensure directory exists for caching
    - touch node_modules/dummy_package.txt # Simulate package installation
    - echo "Dependencies installed."
run_tests:
  stage: test
  script:
    - echo "Running tests using cached dependencies..."
    - ls node_modules/ # Verify cached content is available
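paths: A list of files or directories (relative to the project’s root) to cache. These paths are usually where your project’s dependencies are installed (e.g., node_modules/, vendor/, .m2/repository/). Cached paths must be inside the project directory, so a location such as ~/.m2/repository/ cannot be cached directly; configure the tool to use a directory inside the project instead.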
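Because cached paths have to live inside the project directory, a Maven build is usually pointed at a project-local repository before that repository is cached. The sketch below assumes a Maven project; the job name build_java is chosen only for illustration, while maven.repo.local is Maven's own property for relocating the local repository.

```yaml
# Sketch: keep the Maven repository inside the project so it can be cached.
variables:
  # Tell Maven to use a repository under the project directory.
  MAVEN_OPTS: "-Dmaven.repo.local=$CI_PROJECT_DIR/.m2/repository"

build_java:            # hypothetical job name
  stage: build
  script:
    - mvn --batch-mode verify
  cache:
    key: "$CI_COMMIT_REF_SLUG"
    paths:
      - .m2/repository/
```

The same idea applies to other tools that default to a home-directory cache: redirect the tool into the project tree, then list that directory under paths.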
Cache Key (key)
The key determines how caches are stored and retrieved. Jobs with the same cache key will share the same cache. This is crucial for managing different caches for different branches, operating systems, or dependency sets.
Static Key
A simple, fixed string. All jobs using this key will share the same cache.
cache:
  key: my-static-cache
  paths:
    - .bundle/
Dynamic Keys using Predefined Variables
Using predefined CI/CD variables in the key is the most common and powerful approach.
$CI_COMMIT_REF_SLUG: Caches per branch or tag. Ideal for maintaining separate caches for feature branches.
cache:
  key: "$CI_COMMIT_REF_SLUG"
  paths:
    - node_modules/
$CI_COMMIT_REF_SLUG-$CI_JOB_NAME: Caches per branch/tag and per job. Useful if different jobs generate distinct caches that should not conflict.
cache:
  key: "$CI_COMMIT_REF_SLUG-$CI_JOB_NAME"
  paths:
    - build-artifacts/
files keyword for key: Dynamically generates a key based on the content of specified files. If the content of these files changes, a new cache is created. This is excellent for package manager lock files (e.g., package-lock.json, Gemfile.lock), as the cache updates only when dependencies change.
cache:
  key:
    files:
      - package.json
      - yarn.lock
    prefix: "$CI_COMMIT_REF_SLUG" # Optional prefix for the key
  paths:
    - node_modules/
- If package.json or yarn.lock changes, a new cache key is generated, and a fresh node_modules cache is created. Otherwise, the existing cache for that key is used.
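When a project has more than one dependency set, each set can get its own file-based key. The following is a minimal sketch, assuming a project with both Ruby (Bundler) and Node.js (Yarn classic) dependencies; the job name and directory names are illustrative, and it relies on GitLab accepting a list of caches in a single job.

```yaml
# Sketch: one cache per dependency set, each keyed on its own lock file.
test_job:                       # hypothetical job name
  stage: test
  cache:
    - key:
        files:
          - Gemfile.lock        # Ruby cache changes only when gems change
      paths:
        - vendor/ruby
    - key:
        files:
          - yarn.lock           # Node cache changes only when packages change
      paths:
        - .yarn-cache/
  script:
    - bundle config set --local path 'vendor/ruby'
    - bundle install
    - yarn install --cache-folder .yarn-cache
```

With this layout, a change to Gemfile.lock leaves the Yarn cache untouched, and vice versa.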
Cache Policy (policy)
The policy determines when the cache is uploaded and downloaded.
- pull-push (default): Cache is downloaded before the job and uploaded after the job succeeds.
- pull: Cache is only downloaded, not uploaded. Useful for jobs that consume dependencies but do not modify them (e.g., test jobs).
- push: Cache is only uploaded, not downloaded. Useful for jobs specifically dedicated to preparing and uploading a cache (e.g., an install_deps job).
install_dependencies_job:
  stage: build
  script:
    - npm install
  cache:
    key: "$CI_COMMIT_REF_SLUG"
    paths:
      - node_modules/
    policy: push # Only upload the cache from this job
test_job:
  stage: test
  script:
    - npm test
  cache:
    key: "$CI_COMMIT_REF_SLUG"
    paths:
      - node_modules/
    policy: pull # Only download the cache for this job, do not re-upload
Cache Expiration (expire_in)
Similar to artifacts, caches can also expire. The expire_in keyword specifies how long a cache should be kept.
cache:
  key: "$CI_COMMIT_REF_SLUG"
  paths:
    - node_modules/
  expire_in: 1 week # Cache will be kept for 1 week
expire_in: Accepts human-readable durations like 1 day, 3 months, 30 mins. If not specified, GitLab uses a default retention period (often 7 days on GitLab.com, or forever on self-hosted instances unless configured).
Global vs. Job-level Cache
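You can define cache globally at the top level of your .gitlab-ci.yml file, which applies to all jobs that do not define their own. Individual jobs can override this global definition. A job-level cache replaces the global one entirely rather than merging with it, so a job that overrides the cache must repeat any key, paths, or policy it still needs.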
# Global cache definition
cache:
  key: "$CI_COMMIT_REF_SLUG"
  paths:
    - .npm/
    - .gradle/
stages:
  - build
  - test
build_frontend:
  stage: build
  script:
    - npm install
    - npm run build
  cache: # Replaces the global cache for this job; nothing is inherited
    key: "$CI_COMMIT_REF_SLUG"
    paths:
      - node_modules/
      - dist/ # Add project-specific build output to cache if needed
test_backend:
  stage: test
  script:
    - ./gradlew test
  cache: # Key and paths are repeated; only the policy differs from the global cache
    key: "$CI_COMMIT_REF_SLUG"
    paths:
      - .npm/
      - .gradle/
    policy: pull
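Since a job-level cache definition replaces the global one rather than merging with it, a YAML anchor is one way to reuse the shared settings while changing a single field. This is a sketch only; the anchor name global_cache is an arbitrary choice, not a GitLab keyword.

```yaml
# Sketch: reuse the global cache settings and override only the policy.
cache: &global_cache
  key: "$CI_COMMIT_REF_SLUG"
  paths:
    - .npm/
    - .gradle/
  policy: pull-push

test_backend:
  stage: test
  script:
    - ./gradlew test
  cache:
    <<: *global_cache   # copy key and paths from the global definition
    policy: pull        # override only the policy
```

The merge happens when the YAML is parsed, so the job ends up with a complete cache definition of its own.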
Important Considerations and Best Practices
- Artifacts vs. Cache:
  - Cache is for dependencies that are downloaded or installed from external sources to speed up future job runs (input optimization). They are volatile and might be evicted.
  - Artifacts are for outputs generated by a job that need to be preserved and passed to subsequent jobs in the same pipeline or downloaded by users (output preservation).
  - Do not use cache for build outputs that are critical for subsequent jobs in the same pipeline. Use artifacts for that purpose.
- Choose the Right Cache Key:
  - Branch-specific caches ($CI_COMMIT_REF_SLUG) are good for development, as different branches often have different dependency versions.
  - File-based keys (key: files: [...]) are ideal for projects with lock files, as they ensure the cache is only invalidated when dependencies truly change, not on every commit.
  - Be mindful of cache size when using file-based keys, as they can create many distinct caches.
- Be Specific with paths: Only cache what is necessary. Caching too many files or large directories can negate the benefits by increasing upload/download times and storage usage. Exclude build artifacts that are specific to a single job and should not be reused.
- Use policy: push and policy: pull Strategically:
  - A dedicated “dependency installation” job with policy: push can prepare the cache.
  - Subsequent jobs can then use policy: pull to consume that cache efficiently.
- Clear or Invalidate Caches: If you encounter unexpected behavior or corrupted caches, you can manually clear project caches through the GitLab UI, using the Clear runner caches button on the project's Pipelines page. A new cache will be generated on the next pipeline run. Changing the key also effectively invalidates the old cache.
- Runners and Cache Storage: Caches are stored on the GitLab Runner (often on local disk or in object storage configured for the runner). The performance of caching depends on the runner’s I/O speed and network proximity to the cache storage.
- Cache Size Limits: Be aware of potential cache size limits imposed by your GitLab instance administrator. Large caches can take longer to transfer.
- Debugging Caching Issues: If caching does not seem to work, check:
  - Are the key definitions consistent across jobs that should share a cache?
  - Are the paths correctly specified and do the directories actually exist when the job runs?
  - Are the policy settings appropriate?
  - Are there any expire_in settings or manual expirations that might be removing the cache?
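Invalidating a cache by changing its key can be made explicit with a version prefix. The sketch below assumes a user-defined variable named CACHE_VERSION, which is a hypothetical name and not a GitLab predefined variable.

```yaml
# Sketch: bump CACHE_VERSION to make every job start from a fresh cache.
variables:
  CACHE_VERSION: "v2"   # hypothetical variable; change the value to invalidate

cache:
  key: "$CACHE_VERSION-$CI_COMMIT_REF_SLUG"
  paths:
    - node_modules/
```

Old caches stored under the previous prefix are simply no longer matched by any job.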
FAQs – Cache
What is the cache keyword in GitLab CI/CD?
The cache keyword defines files or directories that GitLab Runner should save and reuse between different job runs or pipeline executions. Caching improves performance by preventing repeated downloads or rebuilds of dependencies like node_modules, .m2, or vendor/.
How do I define a cache in a GitLab CI/CD job?
You define cache inside a job or globally, and specify the paths to be cached:
cache:
  paths:
    - node_modules/
build:
  script:
    - npm install
This stores node_modules/ so that future jobs or pipelines can reuse it, speeding up builds.
What is the difference between artifacts and cache in GitLab?
- Artifacts are saved after a job finishes and are typically used for sharing files across jobs/stages or for download.
- Cache is used to speed up jobs by storing and reusing files between different runs or pipelines.
Artifacts are tied to a pipeline; cache is shared and persistent (within configured keys).
Can I use cache across different jobs and stages?
Yes. As long as the cache key is the same, GitLab Runner shares the cache across different jobs or stages.
cache:
  key: shared-cache
  paths:
    - node_modules/
build:
  script: npm install
test:
  script: npm test
Both jobs use the same node_modules/ cache.
What is the key in the cache keyword, and why is it important?
The key uniquely identifies a cache. If two jobs share the same key, they share the same cache.
cache:
  key: my-cache
  paths:
    - node_modules/
You can use static or dynamic keys (e.g., branch-specific):
key: "$CI_COMMIT_REF_SLUG"
This makes the cache unique per branch.
Can I use different caches for different branches or environments?
Yes. Use dynamic keys based on GitLab predefined variables:
cache:
  key: "$CI_COMMIT_REF_SLUG"
  paths:
    - venv/
This creates separate caches for each branch.
How can I share a cache between different pipelines?
GitLab caches are persisted on the same runner. To share caches between pipelines:
- Use a consistent key.
- Use the same path structure.
- Ensure pipelines run on the same GitLab Runner or executor with shared storage (e.g., Docker volumes or mounted directories).
How do I cache multiple paths in GitLab CI/CD?
You can list multiple directories or files under the paths key:
cache:
  paths:
    - node_modules/
    - .cache/pip
    - vendor/
All listed paths will be stored and restored when the job runs.
What does policy mean in cache configuration?
The policy setting controls when caching happens. It supports:
- pull-push (default): download the cache before the job and upload afterward.
- pull: only download the cache.
- push: only upload the cache after the job finishes.
Example:
cache:
  key: packages
  policy: pull
  paths:
    - .npm/
This job only pulls the cache and would not update it after execution.
Can I disable cache for a specific job?
Yes. You can override the global cache for a job by setting cache to an empty list:
job:
  cache: []
  script:
    - echo "No cache used here"
This ensures the job does not use or create any cache.
Where is GitLab cache stored and how long does it persist?
Cache files are stored on the GitLab Runner machine, typically in a directory like /home/gitlab-runner/cache/.
GitLab does not automatically expire caches. You must manage them manually or set up a cache key rotation policy using dynamic keys.
How do I cache dependencies from package managers like npm, pip, or Maven?
✅ npm:
cache:
  paths:
    - node_modules/
✅ pip:
cache:
  paths:
    - .cache/pip
✅ Maven:
cache:
  paths:
    - .m2/repository
Using these paths ensures that downloaded packages are reused in future runs.
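These paths only help if the package manager actually writes its download cache there. A minimal sketch, assuming pip and npm projects; the job names and the requirements.txt file are illustrative, while PIP_CACHE_DIR and npm's --cache option are the tools' own settings for relocating their caches into the project directory.

```yaml
# Sketch: point each package manager at a cache directory inside the project.
variables:
  # pip reads PIP_CACHE_DIR and stores downloaded packages there.
  PIP_CACHE_DIR: "$CI_PROJECT_DIR/.cache/pip"

python_tests:          # hypothetical job name
  script:
    - pip install -r requirements.txt
  cache:
    key: pip-cache
    paths:
      - .cache/pip

node_tests:            # hypothetical job name
  script:
    # Keep npm's download cache in .npm/ inside the project.
    - npm ci --cache .npm --prefer-offline
  cache:
    key: npm-cache
    paths:
      - .npm/
```

Caching the download directory rather than the installed packages keeps the install step in every job but lets it run mostly offline.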
How do I debug or check if the cache is working in GitLab?
Enable verbose logging in your jobs and watch for messages like:
Checking cache for...
Successfully extracted cache
Creating cache...
Example:
job:
  script:
    - echo "Debugging cache"
  cache:
    key: cache-debug
    paths:
      - temp/
Also inspect the job log in the GitLab UI to confirm cache pull/push actions.
Can I use cache with GitLab’s Docker executor?
Yes. Caches work with the Docker executor but are stored outside the Docker container (on the host runner). GitLab mounts them into the container during execution.
Author

Experienced Cloud & DevOps Engineer with hands-on experience in AWS, GCP, Terraform, Ansible, ELK, Docker, Git, GitLab, Python, PowerShell, Shell, and theoretical knowledge on Azure, Kubernetes & Jenkins. In my free time, I write blogs on ckdbtech.com