Config / Intelligent Cache

Out of the box, the Intelligent Cache is always active for enhanced runners. The most pertinent defaults are as follows.

incremental mode is enabled if a cache stanza is present on the job definition
all jobs are in the default collection
cache expiration is 5 days for incremental, otherwise 2 hours

The cache stanza itself is disabled; its presence is only used to determine the default for incremental mode.

Log 

All job logs will contain a line, in the first collapsed section, that indicates the filesystem inputs. This can be useful when adjusting the variables below to ensure they are being propagated as expected.

Intelligent Cache: collection=default incremental dependencies=1

Variables 

The Intelligent Cache may be configured through variables defined alongside CI jobs.

The following is an example .gitlab-ci.yml file demonstrating how to specify the configuration.

job:
  variables:
    CEDARCI_INCREMENTAL: "true"
  script:
    - ./download_dependencies
    - ./build_application

CEDARCI_ARTIFACT_DISABLED 

Disable standard artifact upload and download.

Useful during a trial or migration to avoid modifying job definitions. Instead of passing artifacts between jobs in a pipeline via GitLab artifacts, the Intelligent Cache performs the task more efficiently.
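As a rough sketch of the trial scenario described above, the variable could be set once at the top of the file so that existing artifacts stanzas stay untouched. The value "true" is an assumption (the accepted values are not stated here), and the job names, the build/ path, and ./run_tests are hypothetical placeholders.

```yaml
# Hypothetical .gitlab-ci.yml used during a trial or migration.
variables:
  CEDARCI_ARTIFACT_DISABLED: "true"  # assumed value format

build:
  script:
    - ./build_application
  artifacts:
    # Left in place so the job definition does not need to change.
    paths:
      - build/

test:
  needs: [build]
  script:
    - ./run_tests
```

Removing the variable should restore standard artifact handling without any other change to the jobs.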

CEDARCI_CACHE_COMPLAIN 

Fail jobs containing a cache stanza.

Useful to clarify that the cache stanza is no longer available. Avoids confusion after a migration has completed.
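A minimal sketch of how this might look once a migration is complete. The value "true" is an assumption, and the job name and cache path are hypothetical.

```yaml
# Hypothetical .gitlab-ci.yml after a migration has completed.
variables:
  CEDARCI_CACHE_COMPLAIN: "true"  # assumed value format

build:
  # A leftover stanza like this one would cause the job to fail
  # while the variable above is set.
  cache:
    paths:
      - vendor/
  script:
    - ./build_application
```

Because the default for incremental mode is derived from the presence of a cache stanza, jobs that drop the stanza would presumably need CEDARCI_INCREMENTAL set explicitly to keep incremental behavior.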

CEDARCI_COLLECTION 

Set the cache collection.

The collection provides a user-controlled sub-partition. Caches are only found within the same collection unless explicitly overridden via CEDARCI_COLLECTION_SEARCH. The two primary use cases are:

To provide a programmatic "cache wipe" by changing the collection.
For projects that regularly make changes that require a cache wipe, the collection can be altered in the same changeset. For example, start with 01 and increment to 02; feature or release names and dates also work well.
To keep caches separate in child pipelines, where a job name may be repeated and the collection must be changed to avoid sharing the same cache.
Monorepos are a common use case for this: the collection is set to the app or directory to which the pipeline pertains.
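The "cache wipe" use case can be sketched as follows, assuming the collection is set for all jobs through a top-level variables block. The job name is a hypothetical placeholder.

```yaml
# Hypothetical .gitlab-ci.yml; the job name is a placeholder.
variables:
  # Bump this value (for example "01" to "02") in the changeset that
  # invalidates existing caches. Caches stored under "01" are no longer
  # found, because lookups stay within the current collection.
  CEDARCI_COLLECTION: "02"

build:
  script:
    - ./download_dependencies
    - ./build_application
```

For a monorepo, the same variable would instead be set in each child pipeline file to the name of the app or directory it builds.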

CEDARCI_COLLECTION_SEARCH 

Set the cache collection to search.

Search for dependencies from a different collection than the one containing the current job.

This feature is almost exclusively for monorepos. A child pipeline job may want to obtain files from a parent pipeline job. Since it can be tricky to properly depend on a job from a parent pipeline using the built-in stanzas, the CEDARCI_PSEUDO_DEPENDENCIES variable is provided to list such jobs explicitly.

For example:

parent_pipeline
|- build_deps (collection: default)
|- child_pipeline
|  \- build (collection: app1, search: default, pseudo: [build_deps])
\- child_pipeline
   \- build (collection: app2, search: default, pseudo: [build_deps])

The above would keep the caches for the build jobs in each child pipeline separate while also allowing them to benefit from the output of the build_deps job.
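In job terms, the app1 branch of that layout might be written roughly as below. This is a sketch only: the file location is hypothetical, and the value formats are assumed from the variable descriptions in this document.

```yaml
# Hypothetical child pipeline file for app1.
build:
  variables:
    # Keep this job's caches separate from the app2 build job.
    CEDARCI_COLLECTION: "app1"
    # Look for dependencies in the parent pipeline's collection.
    CEDARCI_COLLECTION_SEARCH: "default"
    # Treat the parent's build_deps job as a cache-only dependency.
    CEDARCI_PSEUDO_DEPENDENCIES: "build_deps"
  script:
    - ./build_application
```

The app2 child pipeline would use the same three variables with CEDARCI_COLLECTION set to "app2".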

CEDARCI_EXPIRE 

Set the cache expiration in seconds.

Overrides CEDARCI_EXPIRE_PROTECTED and CEDARCI_EXPIRE_FEATURE when set.

The expiration variables are only useful if a project moves either very quickly or very slowly. For fast-moving projects, caches can be expired sooner to avoid excessive storage; for slow-moving projects, they can be kept longer to ensure matches are still found.

The maximum expiration period is 14 days (1,209,600 seconds).

CEDARCI_EXPIRE_PROTECTED 

Set the cache expiration in seconds when executed on a protected branch.

CEDARCI_EXPIRE_FEATURE 

Set the cache expiration in seconds when executed on a feature (non-protected) branch.
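As an illustration, a slow-moving project might keep protected branch caches for the full 14 days while expiring feature branch caches after one day. The specific durations are arbitrary examples, and quoting the numbers as strings is a stylistic choice rather than a documented requirement.

```yaml
# Hypothetical .gitlab-ci.yml; durations are examples only.
variables:
  # 14 days * 24 h * 3600 s = 1209600 s (the documented maximum).
  CEDARCI_EXPIRE_PROTECTED: "1209600"
  # 1 day = 24 h * 3600 s = 86400 s.
  CEDARCI_EXPIRE_FEATURE: "86400"
  # CEDARCI_EXPIRE is deliberately left unset, since it would
  # override both values above.

build:
  script:
    - ./build_application
```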

CEDARCI_INCREMENTAL 

Set the incremental mode: "true" or "false".

Mark jobs that benefit from an incremental cache, meaning a cache that originates outside the current pipeline. Caches from previous runs of the job will be used.

In some cases, especially when migrating, jobs may misbehave when incremental mode is enabled. As such, this variable may be set to "false" to disable incremental mode even when a cache stanza is present.
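A short sketch of the opt-out case, where a job still carries a cache stanza but should not run incrementally. The job name and cache path are hypothetical.

```yaml
# Hypothetical job that misbehaves with incremental mode enabled.
test:
  cache:
    paths:
      - vendor/
  variables:
    # Without this, the presence of the cache stanza above would
    # enable incremental mode by default.
    CEDARCI_INCREMENTAL: "false"
  script:
    - ./build_application
```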

CEDARCI_PSEUDO_DEPENDENCIES 

Set the list of pseudo (cache-only) dependencies.

A comma-separated list of job names to consider as dependencies for the purposes of the cache. This has no impact on execution order or conditions.
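For instance, a job could list more than one cache-only dependency as shown in this sketch. The job names package and generate_code are hypothetical, and whether spaces around the commas are tolerated is not stated, so none are used.

```yaml
# Hypothetical job with two cache-only dependencies.
package:
  variables:
    # Comma-separated job names, written without spaces.
    CEDARCI_PSEUDO_DEPENDENCIES: "build_deps,generate_code"
  script:
    - ./build_application
```

Since the variable does not affect execution order, any ordering between these jobs still has to come from the usual pipeline configuration.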

Security 

Several options for tuning Intelligent Cache security may be set in the enhanced runner configuration. The default security orientation is protected-read, which assumes no secrets are written to the cache. The default is expected to be sufficient for the majority of cases.

The security boundaries are:

Partition 

Full partitioning is enabled by default, meaning the cache is partitioned by all three boundaries.

When a partition is disabled, caches for that entity class are no longer separated: any entity in that class can read and write the caches used by any other entity in that class.

If partition_project is disabled, protected branches across all projects share a partition.
If partition_protected is disabled, all branches within a project share a partition.
If both are disabled, all branches in all projects share a partition.

Partitioning should only be disabled when everyone creating pipelines is trusted, such as in private projects.

Expand 

Assuming partitioning is enabled, read access can be granted to entities outside the partition. Such access improves performance and is only a concern if secrets are stored in the cache.

If expand_fork is enabled, a forked project will have read access to the parent project cache.
If expand_protected is enabled, a non-protected branch will have read access to a protected branch cache.
If both are enabled, a non-protected branch in a fork will have read access to the protected branch cache of the parent project.

Performance 

The highest performance can be achieved with partitioning disabled, since all caches are always usable. Otherwise, a protected branch will never use a non-protected branch cache, nor will a parent project use a fork cache.

For private projects, it is reasonable to disable partitioning, but in many cases it will make little difference.

For public projects, especially large projects with a GitLab instance, expand_fork can be a huge boost with little to no cause for concern.