In this post, I’ll discuss how I’m currently working to migrate a suite of apps from Docker Swarm to Kubernetes.
The client chose this migration to align with more contemporary standards of container deployment and gain a more comprehensive feature set. Also, some products that they recently purchased were best supported as a Kube package.
To give you a fuller picture of the process, here are the high-level phases of the migration:
- Analyze and prepare.
- Run Kube in parallel with Swarm. Then, gradually migrate traffic.
- Retire Swarm.
The project is currently in Phase 2, so that’s what I will mainly focus on today. Here’s what we’ll cover:
- A high-level diagram of Phase 2
- The old system
- Design goals for the new system
- The new system
- The bridge between old and new
- Preview of what changes as part of Phase 3
I’ll describe each of the above steps in detail throughout the post. Without further ado, let’s get started!
High-Level Diagram of Phase 2:
Above, I included a diagram of Phase 2. We’ll use this diagram in future steps as well. The components as they’re used for this project are:
- NetScaler: load balancer appliance
- Docker Swarm: container orchestrator we’re migrating from
- Jenkins: automation server developers interact with
- GitLab: hosts the git repos
- Argo CD: GitOps continuous delivery
- Kubernetes: container orchestrator we’re migrating to
- Helm: package manager for Kube
- Teams: chat app we’re using for deployment notifications
The Old System:
When developers want to deploy, they run the Scheduled Deploy Jenkins job. It lets them select the app name, the target environment, and the Docker image and version to deploy, all from drop-down lists.
Then, the Jenkins job makes API calls to Swarm to create/update the necessary services.
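To make that step concrete, here is a minimal sketch of how such a request could be assembled. It is not the client's actual Jenkins job. It assumes the job talks to the Docker Engine API, where a service is updated by posting its full spec to `/services/{id}/update` along with the current version index. The service ID, image names, and the `build_service_update` helper are hypothetical, and the sketch only builds the request rather than sending it.

```python
import copy
import json
from typing import Tuple


def build_service_update(service: dict, image: str) -> Tuple[str, dict]:
    """Build the path and body for a Docker Engine API service update.

    `service` is the JSON object returned when inspecting a Swarm service.
    The update endpoint expects the complete spec plus the current version
    index, so we copy the existing spec and change only the image.
    """
    spec = copy.deepcopy(service["Spec"])
    spec["TaskTemplate"]["ContainerSpec"]["Image"] = image
    version = service["Version"]["Index"]
    path = f"/services/{service['ID']}/update?version={version}"
    return path, spec


# Hypothetical service, shaped like the output of a service inspect call.
current = {
    "ID": "abc123",
    "Version": {"Index": 42},
    "Spec": {
        "Name": "orders",
        "TaskTemplate": {
            "ContainerSpec": {"Image": "registry.example.com/orders:1.4.0"}
        },
        "Mode": {"Replicated": {"Replicas": 3}},
    },
}

path, body = build_service_update(current, "registry.example.com/orders:1.5.0")
print("POST", path)
print(json.dumps(body, indent=2))
```

Sending the version index along with the spec lets Swarm reject an update that was built from a stale view of the service.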
Design Goals for the New System:
Every good project should begin with a set of objectives. Goals guide and focus development so that as little time and effort as possible is wasted. Below, I discuss the goals we had for Phase 2 of the Swarm-to-Kubernetes migration.
First, the desired state of the environment must be managed with git. Each app in the app suite should have an independent Helm chart, and Kube/Helm deployments should be performed by Argo CD.
Additionally, no changes should be made to existing developer touchpoints, which means existing developer deployment procedures stay exactly as they are. While it’s not covered here, integrating Kube-side logs into the existing ES/Kibana indexes is another way to keep the transition seamless.
Breaking it down further, we also have specific goals for the migration period itself. First, Swarm and Kube should run as parallel systems with the same configuration and the same capacity. Swarm is the system of record, and Kube is kept in sync automatically. Minimal changes should be made to the existing Jenkins deployment jobs.
Also during migration, traffic should be shifted from Swarm to Kube gradually, following the strangler pattern, with the traffic percentage managed in the environment repo.
Post-migration, our goal is to have a path for updating the Helm charts even when there is no Swarm to sync with.
The New System:
Here’s the git repo-centric portion of the Kube deployment. It checks off many of the design goals for the final state.
The Environment GitLab repo contains a directory per app in the app suite. Each is an independent Helm chart containing all the templating needed to describe that app.
Argo CD deploys the Helm charts in the “Environment” GitLab repo and publishes status notifications to Teams.
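As an illustration of the one-directory-per-app layout, the sketch below generates one Argo CD `Application` manifest per chart directory. This is an assumed setup, not necessarily how the client wired it: the repo URL, app names, target namespace, `default` project, and automated sync policy are all placeholders, and the Teams notification configuration is left out.

```python
import json


def argo_application(app: str, repo_url: str, namespace: str) -> dict:
    """Build an Argo CD Application that deploys the chart in directory `app`."""
    return {
        "apiVersion": "argoproj.io/v1alpha1",
        "kind": "Application",
        "metadata": {"name": app, "namespace": "argocd"},
        "spec": {
            "project": "default",  # assumption: no custom Argo CD project
            "source": {
                "repoURL": repo_url,
                "path": app,  # one chart directory per app
                "targetRevision": "HEAD",
            },
            "destination": {
                "server": "https://kubernetes.default.svc",  # in-cluster API
                "namespace": namespace,
            },
            # Assumption: Argo CD syncs automatically on every commit.
            "syncPolicy": {"automated": {"prune": True, "selfHeal": True}},
        },
    }


# Hypothetical repo URL and app names.
REPO = "https://gitlab.example.com/platform/environment.git"
manifests = [argo_application(name, REPO, "apps") for name in ("orders", "billing")]
print(json.dumps(manifests[0], indent=2))
```

Because each app is its own `Application`, a failed sync for one chart does not block the others.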
A Bridge Between Old and New:
The Scheduled Deploy Jenkins job has an internal step added, transparently to the user, that triggers the Kube Sync Jenkins job once it has updated the Swarm API with the latest desired state of the apps.
The Kube Sync Jenkins job reads the Swarm API. Then it clones the Environment GitLab repo, programmatically commits the Helm chart equivalents, and pushes them back.
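The translation step at the heart of the sync job might look something like the sketch below. It assumes each Swarm service maps to one chart whose values use the `image.repository`, `image.tag`, and `replicaCount` keys from the default `helm create` scaffold. The `env` key, the sample service, and both helper functions are hypothetical, and the git clone, commit, and push steps are omitted.

```python
import json
from typing import Tuple


def split_image(image: str) -> Tuple[str, str]:
    """Split an image reference into (repository, tag), ignoring any digest."""
    name = image.split("@", 1)[0]
    # Only a colon in the last path segment is a tag separator; a colon
    # earlier in the reference belongs to a registry port.
    if ":" in name.rsplit("/", 1)[-1]:
        repository, tag = name.rsplit(":", 1)
        return repository, tag
    return name, "latest"


def swarm_service_to_values(service: dict) -> dict:
    """Translate a Swarm service (as returned by the API) into Helm values."""
    spec = service["Spec"]
    container = spec["TaskTemplate"]["ContainerSpec"]
    repository, tag = split_image(container["Image"])
    # Swarm stores environment variables as a list of "KEY=value" strings.
    env = dict(item.split("=", 1) for item in container.get("Env", []))
    # Assumption: services without a replica count (global mode) become 1.
    replicas = spec.get("Mode", {}).get("Replicated", {}).get("Replicas", 1)
    return {
        "image": {"repository": repository, "tag": tag},
        "replicaCount": replicas,
        "env": env,
    }


# Hypothetical service, shaped like one entry from a service list call.
service = {
    "Spec": {
        "Name": "orders",
        "TaskTemplate": {
            "ContainerSpec": {
                "Image": "registry.example.com:5000/orders:1.5.0",
                "Env": ["LOG_LEVEL=info", "DB_URL=postgres://db/orders?ssl=true"],
            }
        },
        "Mode": {"Replicated": {"Replicas": 3}},
    }
}

values = swarm_service_to_values(service)
print(json.dumps(values, indent=2))
```

Keeping the translation a pure function of the Swarm API response makes it easy to run repeatedly: if nothing changed in Swarm, the generated values are identical and there is nothing new to commit.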
In addition to Helm charts, the Environment GitLab repo also contains a traffic percentage variable. The Migration Task Jenkins job updates the NetScaler API with the desired percentage of traffic that should be sent to Kubernetes nodes vs. Swarm nodes.
The ability to control traffic percentage is a key piece in avoiding a “big bang” migration. As the percentage is slowly ramped up over days and weeks, we can compare behavior and performance between the two systems.
If there is ever a major problem with the Kube side of the house, it’s easy to send all traffic to Swarm. Just update the percentage variable to 0 and commit to the repo.
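The percentage logic itself is simple enough to sketch. The function below is a hypothetical helper, not the actual Migration Task job: it assumes the variable is a whole-number percentage of traffic for Kube and turns it into a pair of weights. How those weights are applied through the NetScaler API is deliberately not shown.

```python
def traffic_weights(kube_percent: int) -> dict:
    """Turn the Kube traffic percentage into weights for both backends."""
    if not 0 <= kube_percent <= 100:
        raise ValueError(f"percentage must be between 0 and 100, got {kube_percent}")
    return {"kube": kube_percent, "swarm": 100 - kube_percent}


# Part-way through the ramp-up: a quarter of the traffic goes to Kube.
print(traffic_weights(25))

# Rollback: setting the variable to 0 sends everything back to Swarm.
print(traffic_weights(0))
```

Rejecting out-of-range values up front means a typo in the repo fails the job instead of pushing a nonsensical split to the load balancer.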
Preview of Phase 3:
With Phase 2 completed, all interaction with the Swarm API is eliminated. Success!
The Scheduled Deploy Jenkins job can write directly to the Environment GitLab repo. The Migration Task Jenkins job can be eliminated, and NetScaler is set to always send 100% of the traffic to the Kube nodes.
In this post, we covered the steps we’re taking to perform a gradual migration from Docker Swarm to Kubernetes, while also incorporating git-based versioning of deployment states.
My client chose this migration to align with more contemporary standards of container deployment and gain a more comprehensive feature set. And the chances are high that you’ll be seeing this on the projects you work on as well. As Kubernetes continues to replace Swarm, it’s important to have a handle on how migration is done.
Maybe next time I’ll dive further into Phase 3! As always, thank you for reading, and check out the Keyhole Dev Blog for more.