Out of the box the Intelligent Cache is always active for enhanced runners. The most pertinent defaults are as follows.
- incremental mode is enabled if a cache stanza is present on the job definition
- all jobs are in the default collection
- cache expiration is 5 days for incremental, otherwise 2 hours
The cache stanza itself is disabled and is only used to determine the default for incremental mode.
Log
All job logs will contain a line, in the first collapsed section, that indicates the filesystem inputs. This can be useful when adjusting the variables below to ensure they are being propagated as expected.
Intelligent Cache: collection=default incremental dependencies=1
Variables
The Intelligent Cache may be configured through variables defined alongside CI jobs.
The following is an example .gitlab-ci.yml file demonstrating how to specify the configuration.
job:
  variables:
    CEDARCI_INCREMENTAL: "true"
  script:
    - ./download_dependencies
    - ./build_application
CEDARCI_ARTIFACT_DISABLED
Disable standard artifact upload and download.
Useful during a trial or migration to avoid modifying job definitions. Instead of passing files between jobs in a pipeline via GitLab artifacts, the Intelligent Cache performs the task more efficiently.
CEDARCI_CACHE_COMPLAIN
Fail jobs containing a cache stanza.
Useful to clarify that the cache stanza is no longer available. Avoids confusion after a migration has completed.
CEDARCI_COLLECTION
Set the cache collection.
The collection provides a user-controlled sub-partition. Caches are only found within the same collection unless explicitly overridden via CEDARCI_COLLECTION_SEARCH. The two primary use cases are:
- To provide a programmatic "cache wipe" by changing the collection.
  For projects that regularly make changes that require a cache wipe, the collection can be altered in a changeset. An example would be to start with 01 and increment to 02, but feature/release names and dates also work well.
- When using child pipelines, which allow for a job name to be repeated, the collection must be changed to avoid sharing the same cache.
  Monorepos are a common use case for this feature where the collection is set to the app or directory to which the pipeline pertains.
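As a minimal sketch of the "cache wipe" use case, the job below moves from collection 01 to 02 in the same changeset that invalidates the old caches. The job name and script are illustrative, not taken from a real project.

```yaml
build:
  variables:
    # Previously "01". Caches are only found within the same collection,
    # so changing the value starts this job from an empty cache.
    CEDARCI_COLLECTION: "02"
  script:
    - ./build_application
```

For a monorepo child pipeline the same variable would instead hold the app or directory name, for example "app1".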
CEDARCI_COLLECTION_SEARCH
Set the cache collection to search.
Search for dependencies from a different collection than the one containing the current job.
This feature is almost exclusively for monorepos. A child pipeline job may want to obtain files from a parent pipeline job. Since it can be tricky to properly depend on a job from a parent pipeline using the built-in stanzas, the CEDARCI_PSEUDO_DEPENDENCIES variable is provided to explicitly list them.
For example:
parent_pipeline
  |- build_deps (collection: default)
  |- child_pipeline
  |    \- build (collection: app1, search: default, pseudo: [build_deps])
  \- child_pipeline
       \- build (collection: app2, search: default, pseudo: [build_deps])
The above would keep the caches for the build jobs in each child pipeline separate while also allowing them to benefit from the output of the build_deps job.
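The layout above might be expressed roughly as follows. This is a sketch under assumptions: the file paths, stage names, and scripts are hypothetical, and only the app1 child is shown (app2 would be identical apart from its collection). The two files are separated by `---` for readability.

```yaml
# Parent pipeline: .gitlab-ci.yml
stages: [deps, apps]

build_deps:            # runs in the default collection
  stage: deps
  script:
    - ./download_dependencies

app1:
  stage: apps
  trigger:
    include: app1/.gitlab-ci.yml   # hypothetical path to the child pipeline

---
# Child pipeline: app1/.gitlab-ci.yml
build:
  variables:
    CEDARCI_COLLECTION: "app1"                 # keep this app's caches separate
    CEDARCI_COLLECTION_SEARCH: "default"       # look for dependencies in the parent's collection
    CEDARCI_PSEUDO_DEPENDENCIES: "build_deps"  # cache-only dependency on the parent job
  script:
    - ./build_application
```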
CEDARCI_EXPIRE
Set the cache expiration in seconds.
Overrides CEDARCI_EXPIRE_PROTECTED and CEDARCI_EXPIRE_FEATURE when set.
The expiration variables are mainly useful when a project moves either very quickly or very slowly. A fast-moving project can expire caches sooner to avoid excessive storage, while a slow-moving project can keep them longer to ensure matches are found.
The maximum expiration period is 14 days.
CEDARCI_EXPIRE_PROTECTED
Set the cache expiration in seconds when executed in a protected branch.
CEDARCI_EXPIRE_FEATURE
Set the cache expiration in seconds when executed in a feature branch (non-protected).
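As an illustration, a slow-moving project might keep protected-branch caches for a week while expiring feature-branch caches after a day. The durations and job name are arbitrary examples. Both values are in seconds and stay below the 14-day maximum (1209600 seconds).

```yaml
build:
  variables:
    # 7 days = 7 * 86400 seconds
    CEDARCI_EXPIRE_PROTECTED: "604800"
    # 1 day for feature (non-protected) branches
    CEDARCI_EXPIRE_FEATURE: "86400"
  script:
    - ./build_application
```

Setting CEDARCI_EXPIRE on the same job would override both values.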
CEDARCI_INCREMENTAL
Set the incremental mode: "true" or "false".
Mark jobs that benefit from an incremental cache, outside the current pipeline. Caches from previous instances of the job will be utilized.
In some cases, especially when migrating, jobs may misbehave when incremental mode is enabled. As such, this value may be used to disable incremental mode even when a cache stanza is present.
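A sketch of the migration case: the job keeps its existing cache stanza, which would normally switch incremental mode on by default, and opts out explicitly. The cached path and script are illustrative.

```yaml
build:
  cache:
    paths:
      - vendor/          # existing stanza left in place from before the migration
  variables:
    # Opt out of incremental mode despite the cache stanza above.
    CEDARCI_INCREMENTAL: "false"
  script:
    - ./build_application
```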
CEDARCI_PSEUDO_DEPENDENCIES
Set the list of pseudo (cache-only) dependencies.
A comma separated list of job names to consider as dependencies for the purposes of the cache. This has no impact on execution order or conditions.
See also CEDARCI_COLLECTION_SEARCH.
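For instance, a job that should reuse the cached output of two other jobs could list both by name. The job names here are hypothetical, and the list is written without spaces on the assumption that the separator is a bare comma.

```yaml
build:
  variables:
    # Cache-only dependencies; execution order and conditions are unaffected.
    CEDARCI_PSEUDO_DEPENDENCIES: "build_deps,generate_sources"
  script:
    - ./build_application
```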
Security
There are a number of options for tuning the Intelligent Cache security which may be set in the enhanced runner configuration. The default security orientation is protected-read, which assumes no secrets are written to the cache. The default is expected to be sufficient for the majority of cases.
The security boundaries are:
Partition
Full partitioning is enabled by default, meaning the cache is partitioned by all three boundaries.
When partitioning is disabled for a boundary, entities in that class are no longer separated and can read and write caches used by any other entity in that class.
- If partition_project is disabled, protected branches across all projects share a partition.
- If partition_protected is disabled, all branches within a project share a partition.
- If both are disabled, all branches in all projects share a partition.
Partitioning should only be disabled when everyone creating pipelines is trusted, such as in private projects.
Expand
Assuming partitioning is enabled, read access can be granted to entities outside the partition. Such access improves performance and is only a concern if secrets are stored in the cache.
- If expand_fork is enabled, a forked project will have read access to the parent project cache.
- If expand_protected is enabled, a non-protected branch will have read access to a protected branch cache.
- If both are enabled, a non-protected branch will have read access to a parent project's protected branch cache.
Performance
The highest performance can be achieved with partitioning disabled, since all caches are always usable. Otherwise, a protected branch will never use a non-protected branch cache, nor will a parent project use a fork cache.
For private projects, it is reasonable to disable partitioning, but in many cases it will make little difference.
For public projects, especially large projects with a GitLab instance, expand_fork can be a huge boost with little to no cause for concern.