Scaling Pipelines

Table of Contents

How Do I Set Speed/Durability Settings?
Will Higher-Performance Durability Settings Help Me?
What Am I Giving Up With This Durability Setting "Trade-Off?"
Requirements To Use Durability Settings
What Are The Durability Settings?
Suggested Best Practices And Tips for Durability Settings
Other Scaling Suggestions

One of the main bottlenecks in Pipeline is that it writes transient data to disk FREQUENTLY so that running pipelines can handle an unexpected Jenkins restart or system crash. This durability is useful for many users but its performance cost can be a problem.

Pipeline now includes features to let users improve performance by reducing how much data is written to disk and how often it is written — at a small cost to durability. In some special cases, users may not be able to resume or visualize running Pipelines if Jenkins shuts down suddenly without getting a chance to write data.

Because these settings include a trade-off of speed vs. durability, they are initially opt-in. To enable performance-optimized modes, users need to explicitly set a Speed/Durability Setting for Pipelines. If no explicit choice is made, pipelines currently default to the "maximum durability" setting and write to disk as they have in the past. There are some I/O optimizations to this mode included in the same plugin releases, but the benefits are much smaller.

How Do I Set Speed/Durability Settings?

There are 3 ways to configure the durability setting:

Globally, you can choose a global default durability setting under "Manage Jenkins" > "System", labelled "Pipeline Speed/Durability Settings". You can override these with the more specific settings below.
Per pipeline job: at the top of the job configuration, labelled "Custom Pipeline Speed/Durability Level" - this overrides the global setting. Or, use a "properties" step - the setting will apply to the NEXT run after the step is executed (same result).
Per-branch for a multibranch project: configure a custom Branch Property Strategy (under the SCM) and add a property for Custom Pipeline Speed/Durability Level. This overrides the global setting. You can also use a "properties" step to override the setting, but remember that you may have to run the step again to undo this.

Durability settings will take effect with the next applicable Pipeline run, not immediately. The setting will be displayed in the log.

Will Higher-Performance Durability Settings Help Me?

Yes, if your Jenkins controller uses NFS, magnetic storage, runs many Pipelines at once, or shows high iowait.
Yes, if you are running Pipelines with many steps (more than several hundred).
Yes, if your Pipeline stores large files or complex data to variables in the script, keeps that variable in scope for future use, and then runs steps. This sounds oddly specific but happens more than you’d expect.
- For example: readFile step with a large XML/JSON file, or using configuration information from parsing such a file with One of the Utility Steps.
- Another common pattern is a "summary" object containing data from many branches (logs, results, or statistics). Often this is visible because you’ll be adding to it often via an add/append or Map.put() operations.
- Large arrays of data or Maps of configuration information are another common example of this situation.
No, if your Pipelines spend almost all their time waiting for a few shell/batch scripts to finish. This ISN’T a magic "go fast" button for everything!
No, if Pipelines are writing massive amounts of data to logs (logging is unchanged).
No, if you are not using Pipelines, or your system is loaded down by other factors.
No, if you don’t enable higher-performance modes for pipelines.

What Am I Giving Up With This Durability Setting "Trade-Off?"

Stability of Jenkins ITSELF is not changed regardless of this setting - it only applies to Pipelines. The worst-case behavior for Pipelines reverts to something like Freestyle builds — running Pipelines that cannot persist transient data may not be able to resume or be displayed in Pipeline Graph View/Stage View/etc, but will show logs. This impacts only running Pipelines and only when Jenkins is shut down abruptly and not gracefully before they get to complete.

A "graceful" shutdown is where Jenkins goes through a full shutdown process, such as visiting http://[jenkins-server]/exit, or using normal service shutdown scripts (if Jenkins is healthy). Sending a SIGTERM/SIGINT to Jenkins will trigger a graceful shutdown. Note that running Pipelines do not need to complete (you do not need to use /safeExit to shut down).

A "dirty" shutdown is when Jenkins does not get to do normal shutdown processes. This can occur if the process is forcibly terminated. The most common causes are using SIGKILL to terminate the Jenkins process or killing the container/VM running Jenkins. Simply stopping or pausing the container/VM will not cause this, as long as the Jenkins process is able to resume. A dirty shutdown can also happen due to catastrophic operating system failures, including the Linux OOMKiller attacking the Jenkins java process to free memory.

Atomic writes: All settings except "maximum durability" currently avoid atomic writes — what this means is that if the operating system running Jenkins fails, data that is buffered for writing to disk will not be flushed, it will be lost. This is quite rare, but can happen as a result of container or virtualization operations that halt the operating system or disconnect storage. Usually this data is flushed pretty quickly to disk, so the window for data loss is brief. On Linux this flush-to-disk can be forced by running 'sync'. In some rare cases this can also result in a build that cannot be loaded.

Requirements To Use Durability Settings

Jenkins LTS 2.73+ or higher (or a weekly 2.62+)
For all the Pipeline plugins below, at least the specified minimum version must be installed
- Pipeline: API (workflow-api) v2.25
- Pipeline: Groovy (workflow-cps) v2.43
- Pipeline: Job (workflow-job) v2.17
- Pipeline: Supporting APIs (workflow-support) v2.17
- Pipeline: Multibranch (workflow-multibranch) v2.17 - optional, only needed to enable this setting for multibranch pipelines.
Restart the controller to use the updated plugins - note: you need all of them to take advantage.

What Are The Durability Settings?

Performance-optimized mode ("PERFORMANCE_OPTIMIZED") - Greatly reduces disk I/O. If Pipelines do not finish AND Jenkins is not shut down gracefully, they may lose data and behave like Freestyle projects — see details above.
Maximum survivability/durability ("MAX_SURVIVABILITY") - behaves just like Pipeline did before, slowest option. Use this for running your most critical Pipelines.
Less durable, a bit faster ("SURVIVABLE_NONATOMIC") - Writes data with every step but avoids atomic writes. This is faster than maximum durability mode, especially on networked filesystems. It carries a small extra risk (details above under "What Am I Giving Up: Atomic Writes").

Suggested Best Practices And Tips for Durability Settings

Use the "performance-optimized" mode for most pipelines and especially basic build-test Pipelines or anything that can simply be run again if needed.
Use either the "maximum durability" or "less durable" mode for pipelines when you need a guaranteed record of their execution (auditing). These two modes record every step run. For example, use one of these two modes when:
- you have a pipeline that modifies the state of critical infrastructure
- you do a production deployment
Set a global default (see above) of "performance-optimized" for the Durability Setting, and then where needed set "maximum durability" on specific Pipeline jobs or Multibranch Pipeline branches ("master" or release branches).
You can force a Pipeline to persist data by pausing it.

Other Scaling Suggestions

Use @NonCPS-annotated functions for more complex work. This means more involved processing, logic, and transformations. This lets you leverage additional Groovy & functional features for more powerful, concise, and performant code.
- This still runs on controller so be aware of complexity of the work, but is much faster than native Pipeline code because it doesn’t provide durability and uses a faster execution model. Still, be mindful of the CPU cost and offload to executors when the cost becomes too high.
- @NonCPS functions can use a much broader subset of the Groovy language, such as iterators and functional features, which makes them more terse and fast to write.
- @NonCPS functions should not use Pipeline steps internally, however you can store the result of a Pipeline step to a variable and use it that as the input to a @NonCPS function.
  - Gotcha: It’s not guaranteed that use of a step will generate an error (there is an open RFE to implement that), but you should not rely on that behavior. You may see improper handling of exceptions.
- While normal Pipeline is restricted to serializable local variables, @NonCPS functions can use more complex, nonserializable types internally (for example regex matchers, etc). Parameters and return types should still be Serializable, however.
  - Gotcha: improper usages are not guaranteed to raise an error with normal Pipeline (optimizations may mask the issue), but it is unsafe to rely on this behavior.
- General Gotcha: when using running @NonCPS functions, the actual error can sometimes be swallowed by pipeline creating a confusing error message. Combat this by using a try/catch block and potentially using an echo to plain text print the error message in the catch
Whenever possible, run Jenkins with fast SSD-backed storage and not hard drives. This can make a huge difference.
In general try to fit the tool to the job. Consider writing short Shell/Batch/Groovy/Python scripts when running a complex process using a build agent. Good examples include processing data, communicating interactively with REST APIs, and parsing/templating larger XML or JSON files. The sh and bat steps are helpful to invoke these, especially with returnStdout: true to return the output from this script and save it as a variable (Scripted Pipeline).
- The Pipeline DSL is not designed for arbitrary networking and computation tasks - it is intended for CI/CD scripting.
Use the latest versions of the Pipeline plugins and Script Security, if applicable. These include regular performance improvements.
Try to simplify Pipeline code by reducing the number of steps run and using simpler Groovy code for Scripted Pipelines.
Consolidate sequential steps of the same type if you can, for example by using one Shell step to invoke a helper script rather than running many steps.
Try to limit the amount of data written to logs by Pipelines. If you are writing several MB of log data, such as from a build tool, consider instead writing this to an external file, compressing it, and archiving it as a build artifact.
When using Jenkins with more than 6 GB of heap use the suggested garbage collection tuning options to minimize garbage collection pause times and overhead.