GSoC 2018. Jenkins Remoting over Message Bus/Queue

Project status: Active

My name is Pham Vu Tuan, and I am a final-year undergraduate student from Singapore. This is my first time participating in Google Summer of Code and contributing to an open-source organization.

My GSoC mentors on this project are Oleg Nenashev and Supun Wanniarachchi. I have also received great support from Devin Nusbaum and Jeff Thompson, developers on the Remoting project.

Project Overview

Current versions of Jenkins Remoting are based on the TCP protocol. If the TCP connection fails, the agent connection fails and the build fails with it. There are also issues with traffic prioritization and multi-agent communication, which affect Jenkins stability and scalability.

This project aims to develop a plugin that adds support for a popular message queue/bus technology (Kafka) as a fault-tolerant communication layer in Jenkins.

The plugin source code can be found on GitHub.

Benefits to the community

The plugin provides useful features to the community:

  • Provides a new way to connect an agent to the master using Kafka, alongside existing methods such as JNLP or ssh-slaves-plugin.

  • Helps to resolve existing issues with TCP-based communication between master and agent in Jenkins.

  • Helps to resolve traffic prioritization and multi-agent communication issues in Jenkins.

Why Kafka?

  • Kafka itself is not a queue like ActiveMQ or RabbitMQ; it is a distributed, replicated commit log. This removes the message delivery complexity found in traditional queue systems.

  • We need to support data streaming, and Kafka is strong in this area, which RabbitMQ lacks.

  • Kafka is said to have better scalability and good support from the development community.
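
To make the "commit log, not a queue" point concrete, here is a minimal in-memory sketch. It is not Kafka and not the plugin's code: the CommitLog class, the consumer names and the command payloads are all made up for illustration. It only shows the idea that records stay in the log after being read, and that each consumer resumes from its own committed offset.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class CommitLog:
    """Toy append-only log (hypothetical): reading never removes records."""
    records: List[bytes] = field(default_factory=list)
    # Each consumer keeps its own committed position (offset) in the log.
    offsets: Dict[str, int] = field(default_factory=dict)

    def append(self, record: bytes) -> int:
        """Append a record and return its offset."""
        self.records.append(record)
        return len(self.records) - 1

    def poll(self, consumer: str) -> List[bytes]:
        """Return every record at or after the consumer's committed offset."""
        start = self.offsets.get(consumer, 0)
        return self.records[start:]

    def commit(self, consumer: str, next_offset: int) -> None:
        """Remember the offset this consumer should resume from."""
        self.offsets[consumer] = next_offset


log = CommitLog()
for cmd in (b"cmd-1", b"cmd-2", b"cmd-3"):
    log.append(cmd)

# The agent handles only the first command, commits, then "disconnects".
first_batch = log.poll("agent")
log.commit("agent", 1)

# After reconnecting, it resumes from its committed offset.
resumed = log.poll("agent")

# A second, independent consumer still sees the full history.
monitor_view = log.poll("monitor")
```

In a traditional queue, a delivered message is typically gone once acknowledged; here the broker side only stores records, and the bookkeeping about what has been processed is a single offset per consumer.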

Architecture Overview

The project consists of multiple components:

  • Kafka Client Library - a new command transport implementation, plus producer and consumer client logic.

  • Remoting Kafka Plugin - plugin implementation with KafkaGlobalConfiguration, KafkaComputerLauncher and KafkaSecretManager.

  • Remoting Kafka Agent - a custom agent JAR that packages the remoting JAR together with a custom Engine implementation to set up a communication channel with Kafka. The agent is also packaged as a Docker image on DockerHub.

  • All the components are packaged together with Docker Compose.

The diagram below gives an overview of the current architecture:

[Image: remoting kafka architecture]

With this design, the master no longer communicates with the agent over a direct TCP connection; all communication commands are transferred through Kafka.
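
The idea of routing commands through a broker instead of a direct socket can be sketched as follows. This is a simplified stand-in, not the plugin's implementation: InMemoryBroker, BrokerCommandTransport, the topic names and the command format are all assumptions for illustration, and pickle stands in for the Java serialization that Remoting uses.

```python
import pickle
from collections import defaultdict, deque
from typing import Any, Deque, Dict


class InMemoryBroker:
    """Stand-in for a Kafka cluster: named topics holding byte messages.

    Simplification: messages are removed when read, unlike a real Kafka log.
    """

    def __init__(self) -> None:
        self._topics: Dict[str, Deque[bytes]] = defaultdict(deque)

    def send(self, topic: str, payload: bytes) -> None:
        self._topics[topic].append(payload)

    def receive(self, topic: str) -> bytes:
        return self._topics[topic].popleft()


class BrokerCommandTransport:
    """One end of a channel: writes to one topic and reads from another."""

    def __init__(self, broker: InMemoryBroker, send_topic: str, receive_topic: str) -> None:
        self._broker = broker
        self._send_topic = send_topic
        self._receive_topic = receive_topic

    def write(self, command: Any) -> None:
        # Serialize the command and publish it to the outgoing topic.
        self._broker.send(self._send_topic, pickle.dumps(command))

    def read(self) -> Any:
        # Take the next message from the incoming topic and deserialize it.
        return pickle.loads(self._broker.receive(self._receive_topic))


broker = InMemoryBroker()
# Hypothetical topic names, one per direction.
master = BrokerCommandTransport(broker, "master-to-agent", "agent-to-master")
agent = BrokerCommandTransport(broker, "agent-to-master", "master-to-agent")

# The master sends a request; the agent reads it and replies.
master.write({"id": 1, "call": "isWindows"})
request = agent.read()
agent.write({"id": request["id"], "result": False})
response = master.read()
```

Neither side holds a connection to the other; each only talks to the broker, which is what allows a master or agent to drop and reconnect without tearing down a shared socket.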

Phase 1 Summary

  • Set up the project as a set of Docker Compose components: Kafka cluster, Jenkins master (with plugin) and a custom agent.

  • Create a PoC with a command transport implementation that supports Kafka, covering command invocation, RMI, classloading and data streaming.

  • Make necessary changes in Remoting and Core to make them extensible (if needed).

  • Decide whether Kafka is suitable as the final implementation.

Phase 2 Summary

Phase 3 Summary

Features

Here are some screenshots of the plugin's features.

1. Kafka Global Configuration, with Credentials Plugin support for storing secrets.

[Image: remoting kafka configuration]

2. Launch an agent with the Kafka Launcher.

[Image: launch agent kafka]

3. Launch an agent from the CLI using the agent JAR, with a secret provided for security.

[Image: agent cli]

4. Run jobs and pipelines on a Kafka agent.

[Image: demo jobs]

5. Kafka communication between master and agent.

[Image: kafka commands]

Remoting operations are executed over Kafka. In the log you can see:

  • Command execution (SlaveInstallerFactoryImpl.isWindows())

  • Classloading (Classloader.fetch())

  • Log streaming (Pipe.chunk())
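
Log streaming sends output in pieces rather than as one message. The sketch below illustrates the general technique of splitting a byte stream into numbered chunks and reassembling them; it is an assumption-laden illustration, not how Pipe.chunk() is implemented. The function names, the 16-byte chunk size and the sample log text are made up.

```python
from typing import Iterable, Iterator, Tuple


def split_into_chunks(data: bytes, chunk_size: int) -> Iterator[Tuple[int, bytes]]:
    """Yield (sequence_number, payload) pairs of at most chunk_size bytes."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for seq, start in enumerate(range(0, len(data), chunk_size)):
        yield seq, data[start:start + chunk_size]


def reassemble(chunks: Iterable[Tuple[int, bytes]]) -> bytes:
    """Rebuild the original bytes, ordering chunks by sequence number."""
    return b"".join(payload for _, payload in sorted(chunks))


# Hypothetical build log produced on the agent.
log_output = b"Started by user\nBuilding on kafka-agent\nFinished: SUCCESS\n"

chunks = list(split_into_chunks(log_output, 16))

# Even if chunks arrive in a different order, the sequence numbers
# let the receiver restore the original stream.
restored = reassemble(reversed(chunks))
```

Keeping each message small also relates to the chunking item under Future Work, since brokers usually cap the size of a single message.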

How to run the demo of the project

We have set up a ready-to-fly demo for this plugin. You can run it by following these instructions. Features of the demo:

  • Docker Compose starts a preconfigured master and agent instance, which connect automatically using the Kafka launcher.

  • Kafka is secured and encrypted with SSL.

  • There are a few demo jobs in the instance so that a user can launch a job on the agent.

  • Kafka Manager is available at localhost:9000 for monitoring the Kafka cluster.

Future Work

  • Cloud API implementation (JENKINS-51474).

  • Chunking capabilities for Kafka channel (JENKINS-51709).

  • Stop bundling remoting in Remoting Kafka Agent (JENKINS-51944).

  • Consumer pooling, NIO options (JENKINS-52199).

  • Support multiple Kafka hosts to achieve fault-tolerant communication (JENKINS-52542).

  • Agent recovery to continue running jobs after disconnection from Kafka (JENKINS-52954).

  • Make Zookeeper configuration optional to support ad-hoc topic creation (JENKINS-52870).

Phase 3 Presentation Slides

Phase 3 Presentation Video

Team

Links