Using GitHub Actions to deploy to Kubernetes in GKE

By Merlin Carter

GitHub actions and workflows are great. We use them to deploy a web app to different cluster environments, and I want to show you how we did it. Hopefully, this helps to simplify your deployment process as well. I’ve also written a companion article that describes our GitHub workflows for continuous integration.

Here, we’ll focus on our continuous deployment process.

Workflow Overview

Our continuous deployment workflow has four jobs. It includes the same jobs that we had in our continuous integration workflow, as well as two extra jobs:

(CD only) Validate that the release has been tagged correctly.
(CI/CD) Lint the code with Flake8 and test the application with Pytest.
(CI/CD) Build a Docker image and publish it to Google’s container registry.
(CD only) Deploy the container to a Kubernetes cluster in Google Cloud.

Release Process Overview

To understand our deployment workflow, you need to know a bit about our release process.

As mentioned in the preceding article, we set up these workflows while building a web app for a major property tech startup.

The web app consists of four sub-applications.
Each application is maintained in a separate repository and runs in its own Docker container.
For each repository, the workflow automation looks at certain attributes to determine which cluster to deploy to.

Release Attributes

When creating a release in GitHub, we use the pre-release flag to indicate that the release is not ready for production. In this case, the container is either deployed to a development or staging cluster, depending on what branch was used to create the release.

Development: When we’re in the process of developing, we push everything to a branch and create a release from that branch. We set the pre-release flag and give it a descriptive tag with the prefix “rc-” (release candidate).

Staging: When we’re ready to push to staging for proper QA, we merge to master and create another release. We select the master branch but keep the pre-release flag. We add a tag that indicates the intended version, but we keep the prefix “rc-”.

Production: When we deploy to production, we add a tag with the prefix “r-” and the “pre-release” flag is no longer selected.

The “r-” and “rc-” tag prefixes enable us to easily distinguish between “real” releases and release candidates when reviewing the release history. As you’ll soon see, we automatically validate tag prefixes when deploying to a production cluster.
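To make the prefix convention concrete, here is a minimal shell sketch that classifies a tag by its prefix. The function name release_kind and the sample tags are hypothetical; this is only an illustration of the convention and is not part of our workflow.

```shell
# Classify a release tag by its prefix.
# Assumes the "r-" / "rc-" convention described above; release_kind is a hypothetical helper.
release_kind() {
  case "$1" in
    rc-*) echo "release-candidate" ;;
    r-*)  echo "release" ;;
    *)    echo "unknown" ;;
  esac
}

release_kind "rc-add-search"   # prints: release-candidate
release_kind "r-3"             # prints: release
release_kind "v1.0"            # prints: unknown
```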

Create a workflow file

Now that you understand our release process, the deployment logic in the workflow file will make a lot more sense. So let’s go through what we did to create it. By the way, I’m assuming you’ve created a workflow file before. If you haven’t, check out my first article on creating a workflow file for CI.

I’ll be using the workflow file for our backend app as an example.

Define the triggering event

We wanted our deployment workflow to kick in whenever someone created a release in GitHub, so we updated our workflow file as follows.

name: CD
on:
  release:
    types: [created]

For more information on the trigger syntax, see GitHub’s reference doc on events.

Define Jobs

As mentioned in my introduction, we wanted to validate the GitHub tags, test the app, take a snapshot of it, and push it as a Docker image to our container registry (just in case we needed to roll back to an earlier iteration of the app). Finally, we wanted to deploy the image to a Kubernetes cluster.

For this purpose, we defined these four jobs:

Job 1: Validate tags.
Job 2: Run the tests.
Job 3: Build and publish the Docker image.
Job 4: Deploy the application to a cluster.

If any one of these jobs fails, the whole workflow is terminated. I won’t go too much into job 2 and job 3 because they also exist in our CI workflow, which I’ve covered in a companion article.

Job 1 is interesting because we’re calling an action that we’ve created ourselves. Also, we store the action in a private repository, which poses its own challenges, as you’ll soon see. So let’s get into it:

Job #1 (CD only): Validate release tags

We wanted to easily distinguish different release types on our releases page — so we defined the simple tagging rules that I described previously. But rules are no good unless you can enforce them.

Creating a custom action

To check that people are tagging releases correctly, we created a custom action. GitHub actions are essentially small predefined scripts that execute one specific task. There are plenty of user-contributed actions on the GitHub marketplace, but in this case, we needed to create our own.

GitHub supports two types of actions: Ones that run as JavaScript or ones that run in a Docker container.

We set up one that runs in a Docker container since that’s what we’re more familiar with. Our action lives in its own private repository.

The most important files are the action metadata (“action.yaml”) and the shell script (“entrypoint.sh”).

The action metadata

The action.yaml file defines the metadata for the action according to the metadata syntax.

name: 'Validate tags'
author: 'Hugobert Humperdinck'
description: 'Validate release/pre-release tags'
inputs:
  prerelease:
    description: 'Tag is prerelease'
    required: true
runs:
  using: 'docker'
  image: 'Dockerfile'
  args:
    - ${{ inputs.prerelease }}

It also defines the arguments that the action takes. In our action, we depend on the value of the pre-release flag for our validation logic, so we define it as an input for the action.

The shell script

The entrypoint.sh is the shell script to run in the Docker container. It includes all of our validation logic.

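#!/bin/bash
# Abort on any error, including a failure anywhere in a pipeline
set -e
set -o pipefail
echo "Start validation $1"
# Remote branches that contain the release commit; grep "" returns non-zero if there are none
BRANCH=$(git branch -r --contains "${GITHUB_SHA}" | grep "")
# Strip "refs/tags/" from the ref and replace any remaining slashes with dashes
RELEASE_VERSION=$(echo "${GITHUB_REF}" | sed -e "s/refs\/tags\///g" | sed -e "s/\//-/g")
MASTER_BRANCH_NAME='origin/master'
RELEASE_PREFIX='r-'
# Production release: not a pre-release, on master, and tagged with the "r-" prefix
if [[ "${INPUT_PRERELEASE}" != true ]] && [[ "$BRANCH" == *"$MASTER_BRANCH_NAME"* ]] && [[ "$RELEASE_VERSION" == "$RELEASE_PREFIX"* ]]; then
  echo "Release tag validation succeeded!"
  exit 0
# Pre-release: any tag is accepted
elif [[ "${INPUT_PRERELEASE}" == true ]]; then
  echo "Pre-Release tag validation succeeded!"
  exit 0
else
  echo "Tag validation failed!"
  exit 1
fi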

Here’s what the script does.

Set environment variables

We’re using the built-in GitHub environment variables GITHUB_REF and GITHUB_SHA to determine the following variables:

BRANCH: Our validation logic depends on the branch name, so we search through the branches for the commit that triggered the workflow and derive the associated branch.
RELEASE_VERSION: We search the full Git ref string to get just the tag at the end of the path.
The variables MASTER_BRANCH_NAME and the RELEASE_PREFIX are just hardcoded strings
INPUT_PRERELEASE is the value of the pre-release flag, and it comes from the input defined for the action (in action.yaml). When we call the action, we pass the pre-release flag as an argument.
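To see what the RELEASE_VERSION extraction does to a ref, here is a small sketch that applies the same two sed substitutions to sample values. The function name release_version_from_ref and the sample refs are assumptions made for illustration.

```shell
# Reduce a full Git ref to a tag name:
# remove the "refs/tags/" part, then replace any remaining slashes with dashes.
# release_version_from_ref is a hypothetical helper, not part of the action.
release_version_from_ref() {
  printf '%s\n' "$1" | sed -e 's/refs\/tags\///g' -e 's/\//-/g'
}

release_version_from_ref "refs/tags/r-3"            # prints: r-3
release_version_from_ref "refs/tags/rc-feature/x"   # prints: rc-feature-x
```

The second substitution matters only when a tag itself contains a slash, which would otherwise be awkward in a Docker tag or a file name.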

Validate Release Tag

We want the tags to be formatted a certain way, but we only care about tags for production releases.

First, we make sure it’s a production release by checking our two release criteria: the pre-release flag is not set, and the release is on the master branch.
The third criterion is that the RELEASE_VERSION has to start with “r-”. If it doesn’t have the right prefix, we fail the job and consequently the entire deployment workflow.
If the pre-release flag is set, we know it’s not a production release, so we accept the tag whatever it is.

This logic could obviously be improved. We could check the pre-release flag in the job definition, but my main goal is to show you how a basic action is structured.
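The decision itself can be isolated from Git and from the GitHub environment, which makes it easy to try out locally. The following is a sketch of the same three criteria as a plain shell function; validate_tag and its argument order are hypothetical, and the branch list is passed in as a string rather than read from the repository.

```shell
# validate_tag PRERELEASE BRANCHES VERSION
# Returns 0 if the tag is acceptable and 1 otherwise.
# Hypothetical helper that mirrors the criteria described above.
validate_tag() {
  prerelease="$1"; branches="$2"; version="$3"
  if [ "$prerelease" = "true" ]; then
    return 0                 # pre-releases: any tag is accepted
  fi
  case "$branches" in
    *origin/master*) ;;      # production releases must be on master
    *) return 1 ;;
  esac
  case "$version" in
    r-*) return 0 ;;         # production tags must start with "r-"
    *)   return 1 ;;
  esac
}

validate_tag false "origin/master" "r-3" && echo "accepted"   # prints: accepted
validate_tag false "origin/master" "v3"  || echo "rejected"   # prints: rejected
```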

Calling the action from our job

Normally, it’s easy to call an action from a workflow file — but that’s assuming the action is in a public repo. Our customers didn’t want to open up their source files for the world to see, so we needed to store our actions in a private repository.

Initially, this was a problem. It was difficult to check out a private repository, so we couldn’t use an action stored there. I’m telling you this because we’re still in the process of updating our workflow since GitHub recently fixed this problem.

Originally, checking out other repositories was simple. When you wanted to clone your working branch to the virtual machine, you called the public action “actions/checkout” without any arguments. You could also pass it extra arguments to check out other public repositories.

But version one of the checkout action did not work so well with private repositories. It could check out the repository but did not pass the path to the next workflow step. If you checked out a repository in the folder “my-magic-actions”, the workflow could not see it.

Luckily, version two of the checkout action fixes this issue, and it supports SSH for accessing private repositories.

Now, to use a private action, we can update our workflow file to check out our private actions repository like this:

jobs:
  validate-release-name:
    name: Validate release name
    runs-on: 'ubuntu-latest'
    steps:
      - name: Checkout working branch
        uses: actions/checkout@v2
      - name: Checkout private actions repo
        uses: actions/checkout@v2
        with:
           repository: acme/private-actions
           token: ${{ secrets.GitHub_PAT }} # `GitHub_PAT` is a secret that contains your PAT
           path: private-actions

In the job steps, we first call the checkout action without arguments to get the working branch for our application.

Then, we call the “checkout” action again to check out another private repository — the one that contains our action.

This time, we provide extra parameters, such as the repository path, the local path, and the GitHub personal access token (PAT) for the actions repository.
We can’t use ${{ github.token }} because it is scoped to our backend repository, so we have to specify a separate PAT that has access to the repository for our private actions (the “secrets” variables are defined in the repository settings).

For more information on how the Checkout action works, see the checkout README.

After all that, we can finally call our private action with the pre-release flag as an argument.

      - name: Validate release tag
        # A locally checked-out action is referenced by a path that starts with "./"
        uses: ./private-actions/validate-tag-action
        with:
          prerelease: ${{ github.event.release.prerelease }}

We get the pre-release flag by using the GitHub context syntax.

Job #2 (CI/CD): Run the tests

These are the same tests that we use in our continuous integration workflow, which I have already covered in a companion article.

Job #3 (CI/CD): Build the Docker image and publish it to the registry

Again, I already covered this job in my companion article, but there is one small difference this time around. We add the tag_names parameter, which changes how we tag the Docker image.

    - name: Publish Docker Image
      uses: elgohr/Publish-Docker-Github-Action@2.14
      env:
        SSH_PRIVATE_KEY: ${{ secrets.SSH_PRIVATE_KEY }}
      with:
        name: ${{ env.DOCKER_IMAGE }}
        username: ${{ steps.gcloud.outputs.username }}
        password: ${{ steps.gcloud.outputs.password }}
        registry: ${{ env.DOCKER_REGISTRY }}
        tag_names: true
        buildargs: SSH_PRIVATE_KEY

Instead of sticking with the default behavior, which is to tag the image with the originating branch, we pass the value of the GitHub release tag as our Docker tag. So in the Docker registry, released images look something like this:

Name: 8cd6851d850b Tag: r-3

Again, this enables us to visually distinguish released Docker images from ones that were pushed as part of the continuous integration workflow. As a reminder, pre-release images are tagged with the branch name like this:

Name: 8cd6851d850b Tag: XYZ-123_add_special_field_todo_magic

Job #4 (CD only): Deploy to Google Cloud

Assuming the previous job succeeds, we’re ready to push the Docker image to a Kubernetes cluster in our Google Cloud instance.

First, we need to set a few environment variables:

Set up the environment

We updated our workflow file as follows:

deployment:
    name: Deploy backend to cluster
    runs-on: 'ubuntu-latest'
    needs: [docker-image]
    steps:
    - name: Checkout working branch
      uses: actions/checkout@v1
    # Variables are written to $GITHUB_ENV because GitHub has disabled the older ::set-env command
    - name: Set Release version
      run: |
        echo "RELEASE_VERSION=$(echo "${GITHUB_REF}" | sed -e "s/refs\/tags\///g" | sed -e "s/\//-/g")" >> "$GITHUB_ENV"
    - name: Cluster env for production
      if: "!github.event.release.prerelease"
      run: |
        echo "CLUSTER_ENV=prod" >> "$GITHUB_ENV"
    - name: Cluster env for staging/dev
      if: "github.event.release.prerelease"
      run: |
        BRANCH=$(git branch -r --contains "${GITHUB_SHA}" | grep "")
        MASTER_BRANCH_NAME='origin/master'
        if [[ "$BRANCH" == *"$MASTER_BRANCH_NAME"* ]]; then
          echo "CLUSTER_ENV=stag" >> "$GITHUB_ENV"
        else
          echo "CLUSTER_ENV=dev" >> "$GITHUB_ENV"
        fi
    - name: Set Cluster credentials
      run: |
        echo "CLUSTER_NAME=acme-gke-${{ env.CLUSTER_ENV }}" >> "$GITHUB_ENV"
        echo "CLUSTER_ZONE=europe-west3-a" >> "$GITHUB_ENV"
        echo "PROJECT_NAME=acme-555555" >> "$GITHUB_ENV"

We check out the working branch again (you have to do this for each job), then set RELEASE_VERSION by extracting the tag name from the end of the GITHUB_REF.

Then we need to set all the variables that we’ll use for the “gcloud” command in a subsequent step:

CLUSTER_ENV: We use some simple logic to set it:

If the pre-release flag is not set, it’s a proper release, and we set it to “prod”.
If the pre-release flag is set, we check the originating branch.
If the branch is master, we set it to “stag” (staging), and if not, we set it to “dev”.

CLUSTER_NAME: We use the CLUSTER_ENV variable to set the suffix for the full name. So it’s either “acme-gke-prod”, “acme-gke-stag”, or “acme-gke-dev”.

The zone and project name are hard coded.
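The environment selection can be summarized as a small decision function. This is a sketch only: cluster_env and cluster_name are hypothetical helpers, and the branch list is passed in as a string instead of being read from Git.

```shell
# cluster_env PRERELEASE BRANCHES
# Prints "prod", "stag", or "dev" following the rules described above.
cluster_env() {
  prerelease="$1"; branches="$2"
  if [ "$prerelease" != "true" ]; then
    echo "prod"              # not a pre-release: production
    return
  fi
  case "$branches" in
    *origin/master*) echo "stag" ;;   # pre-release created from master
    *)               echo "dev" ;;    # pre-release created from any other branch
  esac
}

# cluster_name ENV: appends the environment as a suffix.
cluster_name() {
  echo "acme-gke-$1"
}

cluster_name "$(cluster_env false "origin/master")"      # prints: acme-gke-prod
cluster_name "$(cluster_env true "origin/master")"       # prints: acme-gke-stag
cluster_name "$(cluster_env true "origin/XYZ-123_fix")"  # prints: acme-gke-dev
```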

Install the necessary tools

Next, we need to install the Kubernetes command-line tool and Helm, which makes it easier to install Kubernetes applications.

    - name: Install kubectl
      run: |
        sudo apt-get install kubectl
    - name: Install helm
      run: |
        curl -fsSL -o get_helm.sh https://raw.githubusercontent.com/helm/helm/master/scripts/get-helm-3
        chmod 700 get_helm.sh
        ./get_helm.sh

Note that these tools seem to be preinstalled in the standard GitHub virtual machines now. But they weren’t when we set up our workflow file, so I’ll stick to describing what we did.

Deploy the image to a cluster

Finally, we use all the variables that we defined previously to run the gcloud CLI.

    - name: Deploy Release on cluster
      env:
        GCLOUD_KEY: ${{ secrets.GCLOUD_KEY }}
      run: |
        # decode the service account key and authenticate the gcloud CLI
        echo "$GCLOUD_KEY" | base64 --decode > ${HOME}/gcloud.json
        gcloud auth activate-service-account --key-file=${HOME}/gcloud.json
        gcloud auth configure-docker
        gcloud container clusters get-credentials \
          ${{ env.CLUSTER_NAME }} --zone ${{ env.CLUSTER_ZONE }} \
          --project ${{ env.PROJECT_NAME }}
        # install/upgrade helm chart
        helm upgrade --install backend ./deploy/helm/backend \
          --values ./deploy/helm/backend/env.values.${{ env.CLUSTER_ENV }}.yaml \
          --set env_values.image_version=${{ env.RELEASE_VERSION }}

First, we get our Google Cloud key from our repository secrets, then we give it to the gcloud CLI as well as our cluster and project details.
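The key is stored in the secret as a base64 string, so it has to be decoded into a file before gcloud can read it. Here is a sketch of that round trip with a dummy value; write_key_file is a hypothetical helper, and the sample JSON is not a real key.

```shell
# write_key_file ENCODED_KEY DEST
# Decodes a base64-encoded key into DEST.
# umask 077 ensures that a newly created DEST is readable only by the current user.
write_key_file() {
  (
    umask 077
    printf '%s' "$1" | base64 --decode > "$2"
  )
}

# Round trip with a dummy value (not a real key)
sample_key='{"type": "service_account"}'
encoded_key=$(printf '%s' "$sample_key" | base64)
key_file=$(mktemp)
write_key_file "$encoded_key" "$key_file"
cat "$key_file"    # prints the dummy JSON
rm -f "$key_file"
```

Encoding the key before storing it avoids problems with the line breaks and quotes in the JSON file when the value passes through the secrets store and the shell.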

Then, we use a Helm chart to install our backend application on the Kubernetes cluster. The Helm chart for the back end is stored in the backend repository because we prefer to maintain charts as part of the application (more on that in another article).

When installing, we pass Helm the following arguments to override the default settings in the chart:

--values defines the YAML file that contains the environment variables.
For a production release, it’s “env.values.prod.yaml”.
--set overrides a specific variable in the “env.values” YAML file, namely “image_version”.
In the YAML file, it’s set to “latest”, but we want it to use our release version, such as “r-3”.
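To check how the two variables end up in the final command, it can help to print the command instead of running it. The following sketch only assembles the string; helm_upgrade_command is a hypothetical helper, and the paths are the ones from our workflow file.

```shell
# helm_upgrade_command ENV VERSION
# Prints the Helm command for the given cluster environment and release version
# without executing it (a dry run for inspection only).
helm_upgrade_command() {
  env_name="$1"; version="$2"
  echo "helm upgrade --install backend ./deploy/helm/backend" \
       "--values ./deploy/helm/backend/env.values.${env_name}.yaml" \
       "--set env_values.image_version=${version}"
}

helm_upgrade_command prod r-3
# prints: helm upgrade --install backend ./deploy/helm/backend --values ./deploy/helm/backend/env.values.prod.yaml --set env_values.image_version=r-3
```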

And that’s the end of the workflow. Once it’s triggered, it’s easy to monitor the progress and make sure that everything is deployed correctly. For example, here’s a screenshot of the output for one of our other public projects:

screenshot: A log of a triggered workflow

Summary

As I mentioned in the first part of this two-part series, it was very easy to set these workflows up. The limitation with actions in private repositories was a minor irritation, but GitHub is continually improving its built-in actions, and this was soon addressed. Plus, there is a growing ecosystem of user-contributed actions for every kind of task. Unless a customer wants us to use something other than GitHub, we’ll be sticking with GitHub CI/CD workflows for future projects.