JEP | 207
Title | External Build Logging support in the Jenkins Core
Sponsor |
Status | Deferred ⌛
Type | Standards
Created | 2018-07-23
BDFL-Delegate |
Discussions-To |
On large-scale Jenkins instances, master disk and network I/O become bottlenecks in particular cases. Build logging and Task reporting are among the most intensive I/O consumers, so it would be beneficial to redirect them to an external system. This is a continuation of the original story we had back in 2016 (see the public design document here).
This Jenkins enhancement proposal documents changes in the core required to achieve the external build logging objective (see Reasoning for explanation of the scope). Support of other logging types will be documented in subsequent JEPs.
-
All or almost all build logging is moved from Jenkins filesystem to External Build Log Storage
-
ON: “Task” logging is not considered as a requirement after the scope change
-
-
Minimize log traffic between Jenkins agents and the master: logs should be reported directly from agents when possible
-
External Build Log Storage has an extensible and pluggable architecture
-
Reference implementations:
-
File-system-based storage (current implementation)
-
One external build log storage based on the open-source stack, e.g. Elasticsearch-Logstash-Kibana (ELK)
-
Nice to have: one Cloud-focused implementation, e.g. for AWS CloudWatch
-
-
Freestyle and Jenkins Pipeline projects are supported out of the box
-
Run console logs should be provided using standard Jenkins interfaces
-
Data migration flow for upgrading instances is technically possible
-
It is OK if it requires special actions outside Jenkins
-
-
The reference External Build Logging implementation provides acceptable performance and fault tolerance compared to the original filesystem-based solution
-
All Global Configurations are designed in a way that they can be configured via Jenkins Configuration-as-Code Plugin
This section contains a not-so-sorted top-level description (aka “braindump”) of use-cases and concerns we need to address in the design.
In this story we consider Build logs as a sequence of events, which need to be registered in the system.
-
We cannot handle all kinds of executions in Jenkins since plugins may have their specific implementations
-
For console logs the minimal atomic item for Event is a line (for I/O Stream implementations) or event
Such events may also include metadata so that they can be queried by Jenkins (e.g. "get all log entries for a build step") or other log consumers.
-
We do not implement a new external logging system of our own; we want to integrate with existing open-source and/or proprietary systems
-
The build reporting system should support….
-
Reporting of events - the build log will be split into multiple (potentially thousands of) events. These timestamped events may be delivered to the storage systems in arbitrary order (e.g. in `parallel()` builds for Pipeline)
-
The reporting may be performed in parallel
-
Reporting of log data from master and agents will not be synchronized
-
Reporting from a single instance (e.g. Master) may be also parallel
-
-
-
Multiple Jenkins masters may be connected to the same External Log storage
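The requirements above (timestamped events, out-of-order delivery, several masters sharing one storage) can be illustrated with a minimal sketch. The `LogEvent` and `LogEvents` types and all of their fields are hypothetical names introduced only for this illustration; the JEP does not define an event schema.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical event model: one console line plus the metadata needed to query and reorder it. */
final class LogEvent {
    final String masterId;  // several masters may share one storage (assumed field)
    final String runId;     // e.g. "job/demo/42" (assumed format)
    final String stepId;    // optional step identifier, may be null
    final long timestamp;   // milliseconds, assigned by the reporting node
    final long sequence;    // per-reporter counter, breaks timestamp ties
    final String line;

    LogEvent(String masterId, String runId, String stepId,
             long timestamp, long sequence, String line) {
        this.masterId = masterId;
        this.runId = runId;
        this.stepId = stepId;
        this.timestamp = timestamp;
        this.sequence = sequence;
        this.line = line;
    }
}

final class LogEvents {
    private LogEvents() {}

    /** Selects the events of one run on one master and restores their order on the read side. */
    static List<LogEvent> forRun(List<LogEvent> stored, String masterId, String runId) {
        List<LogEvent> result = new ArrayList<>();
        for (LogEvent e : stored) {
            if (e.masterId.equals(masterId) && e.runId.equals(runId)) {
                result.add(e);
            }
        }
        // The sort is stable, so events with equal keys keep their arrival order.
        result.sort(Comparator.comparingLong((LogEvent e) -> e.timestamp)
                .thenComparingLong(e -> e.sequence));
        return result;
    }
}
```

In this sketch ordering is restored only when the log is read, which keeps reporters on the master and on agents free of any synchronization. A real storage would typically run the equivalent query and sort on the server side.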
Depending on the environment, different build logging destinations may be used. The solution should be generic enough to support common destination types. The following storage types should be supportable:
-
FileSystem-based storage (default implementation)
-
Industry-standard External Build Logging and storage systems: Fluentd, Logstash, Elasticsearch, etc.
-
SQL-based storages
-
No-SQL storages: Key-value storages, Document-based storages
-
We will support different External Build Logging systems for different builds
-
It allows updating without data migration
-
It allows configuring different loggers
-
Requirements:
-
We implement the new “LogStorageFactory” extension point, which allows tweaking logging strategies
-
For now we do not provide specific implementations except for the reference ones, but we can tweak the logging destination via JobProperty or NodeProperty later
-
Pipeline step / declarative will be complicated since we may lose some logging info (self-configuring logging within Pipeline, like JENKINS-41929)
Secret handling during Log reporting
-
-
Logging should be performed on both master and agent
-
Secrets should be masked on both sides ⇒ password suppression rules should be executed on both the master and the agent side
-
These suppression rules should be passed to the node. This creates a potential security risk if the implementation does not capture secrets properly, because they may end up in a location the Jenkins admin does not control (external storage)
-
JG: If a secret is defined in an environment variable, we are already sending it to the agent via `RemoteLaunchCallable.env`. So having the `ConsoleLogFilter` also include the same information is not an issue.
-
Design decisions:
-
Agent ⇔ Master communication should always be performed via an encrypted protocol when we use external build logging (ideally needs a NodeMonitor)
-
We should pass secret filtering options to the remote launcher when we invoke it
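As an illustration of passing secret filtering options to the remote side, here is a minimal sketch of a serializable masking rule set that could be applied on the master and on an agent before a line leaves for external storage. `SecretMasker` and its `mask` method are hypothetical names; in Jenkins itself this role is played by `ConsoleLogFilter` implementations, which are not modelled here.

```java
import java.io.Serializable;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical masking rule set; Serializable so that it can be shipped to an agent. */
final class SecretMasker implements Serializable {
    private static final long serialVersionUID = 1L;
    private static final String MASK = "****";

    private final ArrayList<String> secrets = new ArrayList<>();

    SecretMasker(List<String> secrets) {
        for (String s : secrets) {
            if (s != null && !s.isEmpty()) {
                this.secrets.add(s);
            }
        }
        // Longest first, so a secret that contains another secret is masked as a whole.
        this.secrets.sort(Comparator.comparingInt((String s) -> s.length()).reversed());
    }

    /** Returns the line with every known secret replaced by a fixed mask. */
    String mask(String line) {
        String result = line;
        for (String s : secrets) {
            result = result.replace(s, MASK);
        }
        return result;
    }
}
```

The sketch only covers literal matches. Secrets that appear in an encoded or split form would still leak, which is the risk described in the list above.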
-
According to the “Indexing” approach, we have binary and text annotations
-
ConsoleNote is technically a binary one, which is encoded to a string with a prefix and written to the output stream
Design decisions:
-
Log annotations should be performed on the master and agent side
-
Binary annotations (ConsoleNote classes) should be encoded into HEX representation and stored as additional annotation fields
-
They will be decoded by the Jenkins master only when it displays them
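The HEX representation mentioned above can be sketched as a plain byte-to-hex round trip. `AnnotationHex` is a hypothetical helper name; how a `ConsoleNote` is serialized to bytes in the first place is outside the scope of this sketch.

```java
/** Hypothetical helper: stores a binary annotation payload as a hex string field. */
final class AnnotationHex {
    private static final char[] DIGITS = "0123456789abcdef".toCharArray();

    private AnnotationHex() {}

    /** Encodes each byte as two lowercase hex digits. */
    static String encode(byte[] data) {
        StringBuilder sb = new StringBuilder(data.length * 2);
        for (byte b : data) {
            sb.append(DIGITS[(b >> 4) & 0x0f]).append(DIGITS[b & 0x0f]);
        }
        return sb.toString();
    }

    /** Decodes a hex string produced by encode(); rejects malformed input. */
    static byte[] decode(String hex) {
        if (hex.length() % 2 != 0) {
            throw new IllegalArgumentException("Hex string must have an even length");
        }
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++) {
            int hi = Character.digit(hex.charAt(2 * i), 16);
            int lo = Character.digit(hex.charAt(2 * i + 1), 16);
            if (hi < 0 || lo < 0) {
                throw new IllegalArgumentException("Not a hex digit at position " + (2 * i));
            }
            out[i] = (byte) ((hi << 4) | lo);
        }
        return out;
    }
}
```

Hex doubles the payload size, but it keeps the annotation field safe for text-oriented storages, and decoding only has to happen when the master renders the log.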
-
-
Log browsing should support both local and remote Logging systems
-
The interface should support…
-
Querying and Filtering logs
-
Progressive log output (for running builds and tasks)
-
Annotation visualization in console log
-
Design decisions:
-
Annotations should be stored in the external storage
-
Storage format is defined by the external log storage implementation
-
If the log storage can store objects, it is recommended to store annotations separately from the text
-
Log rotation is performed as for any other components within Jenkins builds
-
Currently log deletion is implemented as a part of the build deletion
Design decisions:
-
New API should be introduced to support deletion of logs
-
External logging APIs should provide methods for deletion of logs
-
These APIs may implement log deletion… or not. In the latter case Jenkins should be able to produce a warning, but it should not impact its operation
-
External log browser implementations should be able to explicitly indicate that there are no logs available
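The deletion behavior described above (a storage may or may not implement deletion, and a missing implementation should produce a warning without affecting Jenkins operation) could look roughly like the following sketch. `DeletableLogStorage`, `deleteLog` and `LogCleanup` are hypothetical names, not part of the proposed API.

```java
import java.io.IOException;
import java.util.logging.Level;
import java.util.logging.Logger;

/** Hypothetical deletion contract for a log storage. */
interface DeletableLogStorage {
    /** @return true if the logs were removed, false if this storage does not support deletion */
    boolean deleteLog(String runId) throws IOException;
}

final class LogCleanup {
    private static final Logger LOGGER = Logger.getLogger(LogCleanup.class.getName());

    private LogCleanup() {}

    /**
     * Attempts to delete the logs of a run. Never throws, so that build deletion
     * can proceed even if the external storage refuses or fails.
     */
    static boolean deleteQuietly(DeletableLogStorage storage, String runId) {
        try {
            boolean deleted = storage.deleteLog(runId);
            if (!deleted) {
                LOGGER.log(Level.WARNING,
                        "Log storage does not support deletion; logs of {0} were kept", runId);
            }
            return deleted;
        } catch (IOException e) {
            LOGGER.log(Level.WARNING, "Failed to delete logs of " + runId, e);
            return false;
        }
    }
}
```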
The following new API entities will be introduced:
-
`Loggable` - interface for objects supporting external logging
-
`LogStorage` - objects defining log reporting and browsing logic
-
`LogStorageFactory` - extension point for locating `LogStorage`
Implementations:
-
File-based `LogStorage` - logging to the local FileSystem, implements compatibility mode
-
No-op `LogStorage` - fallback implementation for reporting errors
The introduced entities are described below.
`Loggable` is a new interface, which will mark all objects supporting external logging.
In the current design this interface will be implemented only by `Run` instances, but other log types may be supported in further implementations.
The `Loggable` interface should provide the following methods:
-
Getters for the `LogStorage` being used in the object
-
Default implementation - consult with `LogStorageFactory` extensions
-
-
Getters for the default `LogStorage`
-
These getters will be used if there is no `LogStorage` configured for the item
-
For example, `Run`s will refer to the File-based storage to retain compatibility
-
-
`boolean isLoggingFinished()` - indicates that no new logging is being performed
-
`Charset getCharset()` - method which defines the charset to be used
-
Some instances like `Run` allow setting charsets explicitly.
-
This method propagates that requirement to the logging methods
-
-
`getLogFileCompatLocation` - provides the file path to be used by the File-based storage
-
This method is needed because instances like `Run` have complex logic which defines the storage location
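The method list above can be read as the following interface sketch. Only `isLoggingFinished()`, `getCharset()` and `getLogFileCompatLocation` are named in the text. The getter names `getLogStorage()`, `getConfiguredLogStorage()` and `getDefaultLogStorage()`, as well as the one-method `LogStorage` placeholder, are assumptions made so that the example compiles on its own.

```java
import java.io.File;
import java.nio.charset.Charset;

/** Placeholder for the storage type; only here to make the sketch self-contained. */
interface LogStorage {
    String name();
}

/** Sketch of the marker interface for objects supporting external logging. */
interface Loggable {
    /**
     * Storage selected for this object, or null if none is configured.
     * Hypothetical hook standing in for the LogStorageFactory lookup.
     */
    LogStorage getConfiguredLogStorage();

    /** Storage used when nothing is configured, e.g. the File-based one for runs. */
    LogStorage getDefaultLogStorage();

    /** Default getter: prefer the configured storage, otherwise fall back to the default. */
    default LogStorage getLogStorage() {
        LogStorage configured = getConfiguredLogStorage();
        return configured != null ? configured : getDefaultLogStorage();
    }

    /** Indicates that no new logging is being performed. */
    boolean isLoggingFinished();

    /** Charset to be used by the logging methods. */
    Charset getCharset();

    /** File path to be used by the File-based storage. */
    File getLogFileCompatLocation();
}
```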
`LogStorage` is a central class which represents the log storage being used for a particular `Loggable` instance.
It defines the API for reporting logs and retrieving them.
`LogStorage` is an `@ExportedBean`, so its instances can be exported to the REST API.
Methods to be offered:
-
`BuildListener createBuildListener() throws IOException, InterruptedException` - Build Listener provider.
-
This listener will receive build events and put them to the storage
-
Implementations are responsible for consulting Jenkins security logic such as `ConsoleLogFilter` extension points
-
-
`TaskListener createTaskListener() throws IOException, InterruptedException` - Same as `createBuildListener()`, but for tasks. This is a stub for supporting other task types in the future
-
`Launcher decorateLauncher(@Nonnull Launcher original, @Nonnull Run<?,?> run, @Nonnull Node node)` - Launcher decorator for logging. It allows altering the launcher logic in builds, e.g. to inject a custom environment. This logic may be invoked by core and plugins (see JENKINS-52914 for limitations).
-
`AnnotatedLargeText<T> overallLog()` - Get large text for the entire execution/run
Some implementations should also be moved from `Run` and generalized.
Jenkins core or External Logging API will provide default convenience implementations which can be overridden by implementations for better performance.
-
`InputStream getLogInputStream() throws IOException` - gets the log as an input stream
-
`Reader getLogReader() throws IOException` - gets the log as a Reader
-
`String getLog() throws IOException` - gets the entire log as a single String
-
This method is deprecated in `hudson.model.Run`, and it should remain deprecated
-
-
`List<String> getLog(int maxLines) throws IOException` - gets a number of log lines as a list of strings
-
`File getLogFile() throws IOException` - Compatibility method, which retrieves the log as a `File`.
-
By default a temporary file will be created, unless an implementation offers something better
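The default convenience implementations described above could be derived from a single primitive, the input stream. In the sketch below, `LogStorageSketch` is a hypothetical stand-in for the real class and models only the read-side methods. Returning the last `maxLines` lines from `getLog(int)` is an assumption (it mirrors how `Run#getLog(int)` is commonly understood to behave); the text only says "a number of log lines".

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.StandardCopyOption;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical base class: implementations supply the stream, the rest is derived. */
abstract class LogStorageSketch {
    abstract InputStream getLogInputStream() throws IOException;

    abstract Charset getCharset();

    Reader getLogReader() throws IOException {
        return new InputStreamReader(getLogInputStream(), getCharset());
    }

    /** Assumption: returns the last maxLines lines of the log. */
    List<String> getLog(int maxLines) throws IOException {
        if (maxLines <= 0) {
            return new ArrayList<>();
        }
        ArrayDeque<String> tail = new ArrayDeque<>();
        try (BufferedReader reader = new BufferedReader(getLogReader())) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (tail.size() == maxLines) {
                    tail.removeFirst(); // keep only the most recent lines in memory
                }
                tail.addLast(line);
            }
        }
        return new ArrayList<>(tail);
    }

    /** Compatibility method: copies the log into a temporary file. */
    File getLogFile() throws IOException {
        File tmp = File.createTempFile("build-log", ".txt");
        tmp.deleteOnExit();
        try (InputStream in = getLogInputStream()) {
            Files.copy(in, tmp.toPath(), StandardCopyOption.REPLACE_EXISTING);
        }
        return tmp;
    }
}
```

A storage with server-side tailing or a local cache would override `getLog(int)` and `getLogFile()` rather than stream the whole log on each call.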
`LogStorageFactory` is a low-level extension point, which allows locating the `LogStorage` to be used for a particular `Loggable` item.
This extension point should offer static methods which consult all implementations and provide proper extensions.
If there is no `LogStorageFactory` providing an implementation, the fallback `FileLogStorage` should be used.
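The lookup with fallback described above might be sketched as follows. All types here are simplified stand-ins with hypothetical members (`getId()`, `describe()`, `locate(...)`). In Jenkins core the list of factories would come from the extension registry rather than from a method parameter.

```java
import java.util.List;

/** Simplified stand-ins so that the sketch compiles on its own. */
interface Loggable {
    String getId();
}

interface LogStorage {
    String describe();
}

/** Fallback storage used when no factory provides one. */
final class FileLogStorage implements LogStorage {
    private final Loggable owner;

    FileLogStorage(Loggable owner) {
        this.owner = owner;
    }

    @Override
    public String describe() {
        return "file:" + owner.getId();
    }
}

abstract class LogStorageFactory {
    /** @return a storage for the item, or null if this factory does not apply */
    abstract LogStorage getLogStorage(Loggable loggable);

    /** Consults all factories in order; the first non-null result wins. */
    static LogStorage locate(List<LogStorageFactory> factories, Loggable loggable) {
        for (LogStorageFactory factory : factories) {
            LogStorage storage = factory.getLogStorage(loggable);
            if (storage != null) {
                return storage;
            }
        }
        return new FileLogStorage(loggable);
    }
}
```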
These classes implement extension points and contain the original logic for the Filesystem logging.
All Filesystem-specific logic from `hudson.model.Run` and other such classes should be moved to these implementations.
Integration with `Loggable`:
-
`Run` instance should implement `Loggable`
-
`Run` stores `LogStorage` references in fields. These fields can be persisted on the disk
-
`Run#onLoad()` method restores references to the owner which are stored by `LogStorage`
-
All methods in `Run` and child classes implement new APIs used by `LogStorage`
-
`Run` offers a `getLogStorage()` method which is `@Exported`
File operations:
-
File logging operations are moved to `FileLogStorage`
-
`Run#getLogFile()` method should be deprecated, and all usages in the Jenkins core should be cleaned up. The method will still invoke the compatibility layer from `LogStorage` so read-only API users do not lose compatibility
The default build logging in Jenkins is known to be a performance and scalability bottleneck at large-scale instances.
-
Build logging from agents goes through the master. This puts load on the network and on the master’s memory/CPU, especially in the case of massively parallel builds
-
Build browsing goes through master. Every time logs are displayed to users, a request is sent to the Jenkins master in order to load the data
-
Logs are stored on the disk in raw format. It consumes a lot of storage space.
Externalization of Build logging could improve the situation significantly, but Core API patches are required to support external logging in `AbstractProject`-based job types.
This JEP is needed in order to specify such core changes.
Other changes are documented in subsequent JEPs.
Compared to the original design in 2016, this design limits the scope of work so that it can be implemented and delivered in a reasonable timeframe.
The original design in this JEP proposed to keep independent implementations for log reporting and log browsing functionality in order to increase configuration flexibility of implementations.
After the discussion in Cloud Native SIG, it was decided to move this separation to the External Logging API Plugin (JEP-212).
In the original JEP it was proposed to support Log browsing for particular steps. This functionality is needed to browse Pipeline FlowNode logs, but it may be also used to browse other segmented logs.
After the review it was decided NOT to add this API to the core. Instead, the External Logging API implements it for now. If there is a need to support logging of steps, such a feature can be added in future core versions in a compatible way (implicit override).
During the original discussions in 2016, the log migration topic was raised. When a logging system is configured, one may expect the logs to be moved (e.g. from filesystem to the external storage).
-
We will NOT implement migration for old builds
-
We are going to provide multiple `LogStorage`s in parallel on a single instance according to the current design
-
We will show logs from the file system till they get log-rotated
Justification:
-
Not required since we offer smooth migration. All logs on the disk on old instances will be rotated eventually
-
It would be complicated since we may have multiple log sources.
-
We would also have to take ConsoleNote annotations into account
Currently Jenkins does not set limitations for encoding while doing logging. Any charsets may be used on agent and master sides, and it is hard to manage them. Some implementations also rely on the default encoding in master or agent JVMs, and these encodings may be different. This behavior should be retained, because it is a default one for Freestyle projects.
Although it is expected that all logs will eventually switch to UTF-8 (see the JEP-206 proposal for Pipeline), in the meantime external logging may be performed in different encodings.
-
`Loggable` implementations can define the charset to be used
-
`LogStorage` implementations may implement support of charsets or reject them; it is up to the implementation
-
If the implementation does not support the requested charset, `LogStorageFactory` may apply a compatibility layer or skip the Log Storage
In the current design, the encoding is up to the `LogStorage` implementation.
The default `FileLogStorage` implementation must support the default encoding.
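The two options mentioned above for an unsupported charset (apply a compatibility layer, or skip the storage) can be sketched as follows. `CharsetAwareStorage`, `supports(...)` and `CharsetSupport` are hypothetical names, and transcoding to UTF-8 is only one possible form of a compatibility layer.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.List;

/** Hypothetical view of a storage that can accept or reject a charset. */
interface CharsetAwareStorage {
    String name();

    boolean supports(Charset charset);
}

final class CharsetSupport {
    private CharsetSupport() {}

    /** Option 1: skip storages that reject the requested charset and use the fallback. */
    static CharsetAwareStorage select(List<CharsetAwareStorage> candidates,
                                      Charset requested,
                                      CharsetAwareStorage fallback) {
        for (CharsetAwareStorage storage : candidates) {
            if (storage.supports(requested)) {
                return storage;
            }
        }
        return fallback;
    }

    /** Option 2: compatibility layer that re-encodes raw log bytes as UTF-8. */
    static byte[] toUtf8(byte[] raw, Charset source) {
        return new String(raw, source).getBytes(StandardCharsets.UTF_8);
    }
}
```

Re-encoding must happen on complete character sequences. A real compatibility layer working on streams would need a decoder that buffers partial multi-byte input, which this sketch does not attempt.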
-
We investigated Kibana usage for client-only log browsing during ELK prototyping, and we were able to create it for non-authenticated instances
-
For real-world use there are limitations to consider:
-
Master-provided logs may be required by CLI, REST API, or by plugins relying on the current master-side implementations (like BlueOcean)
-
Isolation. The log storage (e.g. Elasticsearch or AWS CloudWatch) may be entirely inaccessible to users. Services may have some kind of access tokens for it, but we should not expect any Jenkins user to have such access
-
Network isolation. The services may be just unreachable for user machines
-
…
-
In the current design it was decided that log browsing by default will go through the master.
Client-side log browsing may be implemented via custom `RunAction` implementations.
Support of client-side log browsing may be added in a subsequent JEP.
This JEP guarantees full compatibility of Jenkins instances when they are upgraded and keep the legacy Filesystem-based storage.
On the other hand, some incompatibilities may be introduced for the new external logging modes.
-
Logging of non-UTF-8 charsets
-
Application of non-serializable `ConsoleLogFilter` implementations
-
etc.
External logging implementations will be responsible for documenting known incompatibilities and warning users about them. Some checks will be performed at the External Logging API plugin level.
`hudson.model.Run` offers the `File getLogFile()` method and several other methods which cannot be universally mapped to external storages.
In order to support them, all `LogStorage` implementations are expected to provide a `File toLogFile()` method which ensures compatibility with such old API.
This may be done by creating temporary files, so that read-only calls to `Run#getLogFile()` remain compatible.
Such a caching approach implies a performance hit, but the raw `File`-based APIs are deprecated by this design anyway.
There will be no performance overhead on the built-in File-based storage.
Also, caching does not prevent compatibility issues if one of the plugins invokes `Run#getLogFile()` and then modifies the file.
Such logic will be considered incompatible with new External Logging implementations.
This JEP defines the following security requirements:
-
All newly introduced methods should follow the Jenkins security model and perform user and queue authentication permission checks where necessary
-
Existing sensitive information masking logic should be executed on master and agent BEFORE logs are submitted to the external storage. External log storage should not expose secrets
-
The following sensitive data must be masked by default
-
Environment variables and parameters marked as sensitive
-
Credentials contributed by Credentials Binding plugin
-
`ConsoleLogFilter` implementations if they are `Serializable` (most Pipeline-compatible implementations are already serializable)
-
-
`ConsoleAnnotator`-based secret masking (e.g. Mask Passwords plugin) should be implementable in plugins
This Jenkins Enhancement Proposal does not define strong security requirements for external storage implementations. These implementations are responsible for defining their own security model.
There are no special infrastructure requirements defined for this JEP. Subsequent JEPs for the implementations may define such infrastructure requirements.
All tests will be implemented using Jenkins Test Harness or Acceptance Test Harness (ATH) frameworks.
The following use-cases must be covered:
-
Backward compatibility
-
Upgradeability - upgraded instances use the Filesystem Storage by default
-
Smoke tests - logging Method locators are invoked for new runs
Jenkins core will provide Logging methods and browsers only for the File System Log storage. This storage will be covered by existing tests for jobs.
External Logging implementations are expected to implement integration tests using `DockerRule` or similar technologies, if the target log storage allows it.
Once JENKINS-TODO is implemented, integration tests with External Task Logging API Plugin and one of the reference implementations should be added to the `essentialsTest()` run.