Python Package CI/CD with GitHub Actions

2021-05-09 — Python, CI/CD, GitHub Actions — 10 min read

In a previous post, I alluded to having pure CI/CD checks and autoreleases for my random-standup program. I wanted to ensure that:

Each change I make to my program won't break existing functionality (Continuous Integration), and
Publishing a new release to PyPI is automatic (Continuous Delivery/Deployment).

GitHub provides a workflow automation feature called GitHub Actions. Essentially, you write your workflow configurations in a YAML file in your-repo/.github/workflows/, and they'll be executed on certain repository events.

Continuous Integration

This automation is relatively straightforward. I want to run the following workflows on each commit into the repository trunk and on each pull request into trunk:

1. Test syntax by running a linting check for formatting (n.b. syntax correctness is a subset of formatting correctness).
2. Test functionality across a variety of operating systems and Python versions by running automated tests on the entire program. For this program, I only included a single basic black-box test that's more demonstrative than useful (it checks for a regex match with program output). A suite of unit tests would be more appropriate for a more complex program.
3. Test build stability by attempting to build the program (but discarding the build artifact) across the same combinations of operating systems and Python versions from Step 2.

Here's the full workflow.
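To make the black-box test from the list above concrete, here is a minimal sketch of what a regex-on-output test can look like. The command and the pattern are hypothetical stand-ins (the real program's output format isn't shown in this post), so treat it as an illustration of the shape rather than the actual test:

```python
import re
import subprocess
import sys
from typing import Sequence


def output_matches(command: Sequence[str], pattern: str) -> bool:
    """Run a command and report whether its stdout matches the regex."""
    result = subprocess.run(command, capture_output=True, text=True, check=True)
    return re.search(pattern, result.stdout) is not None


# Hypothetical stand-in for the real program: prints three names.
FAKE_PROGRAM = [sys.executable, "-c", "print('Alice, Bob, Carol')"]

# Illustrative pattern: exactly three comma-separated capitalized words.
THREE_NAMES = r"^[A-Z]\w+(, [A-Z]\w+){2}\s*$"


def test_output_format():
    # pytest collects this function by its test_ prefix.
    assert output_matches(FAKE_PROGRAM, THREE_NAMES)
```

A test like this only pins down the shape of the output, which is why it is more demonstrative than useful.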

Each commit to trunk

The trigger for this is declared at the top of the workflow file:

on:
  push:
    branches: [main]
  pull_request:
    branches: [main]

Test syntax by checking formatting

First, we have to check out the repository in GitHub Actions using GitHub's own checkout action. Then, we have to set up Python using GitHub's setup-python action. Finally, we can use Black's provided GitHub Action to check formatting: it runs black --check --diff on the workflow runner's clone of the repo and exits with an error code if any Python file in the repo fails Black's formatting rules. Note that Black also fails if a file cannot be parsed into an AST (i.e. if it contains syntax errors), so the same step checks syntax correctness, which in turn is a good proxy for catching leftover merge conflict markers.

jobs:
  black-formatting-check:
    name: Check formatting
    runs-on: 'ubuntu-latest'
    steps:
      - uses: actions/checkout@v2
      - uses: actions/setup-python@v2
      - uses: psf/black@stable
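The claim that a syntax check also catches merge conflict markers can be tried locally. The sketch below uses the standard library's ast module as a stand-in; it is not how Black is implemented, only a demonstration that conflict markers make a file unparseable:

```python
import ast


def has_valid_syntax(source: str) -> bool:
    """Return True if the source parses into an AST, False on a syntax error."""
    try:
        ast.parse(source)
    except SyntaxError:
        return False
    return True


CLEAN = "def greet():\n    return 'hi'\n"

# Leftover conflict markers from an unresolved merge.
CONFLICTED = (
    "<<<<<<< HEAD\n"
    "x = 1\n"
    "=======\n"
    "x = 2\n"
    ">>>>>>> feature\n"
)

print(has_valid_syntax(CLEAN))       # True
print(has_valid_syntax(CONFLICTED))  # False
```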

Running a job across different build environments

GitHub Actions provides matrix build functionality: you provide a set of options for each variable, and it runs the job once for every combination of options, i.e. the Cartesian product of the option sets:

  build:
    runs-on: ${{ matrix.os }}
    needs: black-formatting-check
    strategy:
      matrix:
        os:
          - 'ubuntu-latest'
          - 'macos-latest'
          - 'windows-latest'
        python-version:
          - '3.7'
          - '3.8'
          - '3.9'

This is defined in the jobs.<job_id>.strategy.matrix directive. I've added 2 variables: one for OS (with Ubuntu, macOS, and Windows as options) and one for Python version (with 3.7, 3.8, and 3.9 as options). This means that everything in the build job will run on every combination of OS and Python version options:

Ubuntu, Python 3.7
Ubuntu, Python 3.8
Ubuntu, Python 3.9
macOS, Python 3.7
macOS, Python 3.8
etc
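If it helps to see the expansion spelled out, the combinations listed above are just the Cartesian product of the two option lists. This is a rough Python model of that expansion, not GitHub's implementation; the function name is made up for illustration:

```python
from itertools import product


def expand_matrix(matrix):
    """Return one dict per combination of the matrix's option lists."""
    names = list(matrix)
    return [dict(zip(names, combo)) for combo in product(*matrix.values())]


matrix = {
    "os": ["ubuntu-latest", "macos-latest", "windows-latest"],
    "python-version": ["3.7", "3.8", "3.9"],
}

combinations = expand_matrix(matrix)
print(len(combinations))  # 9
print(combinations[0])    # {'os': 'ubuntu-latest', 'python-version': '3.7'}
```

Adding a third variable with two options would double the count to 18, which is worth keeping in mind before growing the matrix.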

Note that the runs-on directive is defined as ${{ matrix.os }} which points to the value of the os variable in the current runner. Internally, the steps are somewhat like:

1. GitHub Actions parses the directives for the job and sees there's a matrix strategy.
2. It spins up a separate runner for each matrix combination and defines the variables matrix.os and matrix.python-version as the values for that combination. For example, in the Ubuntu/Python 3.7 runner, matrix.os = 'ubuntu-latest' and matrix.python-version = '3.7'.
3. It runs the job steps in each runner it spun up in Step 2.

You can see an example of what this matrix run looks like in the GitHub Actions console here (see all the OS/Python combinations in the left sidebar). These matrix options run in parallel by default, so the runtime of the job is determined by the slowest matrix option. Note that if your repository is private, you will be charged Actions minutes for each separate build combination, with some hefty multipliers for macOS and Windows (as of May 2021, 1 macOS minute costs 10 minutes of Actions credit and 1 Windows minute costs 2).
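To get a feel for how those multipliers add up, here is a back-of-the-envelope sketch. The multipliers are the May 2021 figures quoted above; the per-job durations are invented for illustration, and the model only covers the matrix job, not the formatting check:

```python
# Per-minute multipliers quoted in this post (as of May 2021).
MULTIPLIERS = {"ubuntu-latest": 1, "macos-latest": 10, "windows-latest": 2}


def billed_minutes(job_minutes):
    """Sum billed minutes over jobs given as {(os, python_version): minutes}."""
    return sum(
        minutes * MULTIPLIERS[os_name]
        for (os_name, _version), minutes in job_minutes.items()
    )


def wall_clock_minutes(job_minutes):
    """Parallel jobs finish when the slowest one does."""
    return max(job_minutes.values())


# Hypothetical durations: macOS jobs take 3 minutes, all others take 2.
jobs = {
    (os_name, version): (3 if os_name == "macos-latest" else 2)
    for os_name in MULTIPLIERS
    for version in ("3.7", "3.8", "3.9")
}

print(wall_clock_minutes(jobs))  # 3
print(billed_minutes(jobs))      # 108
```

With these made-up numbers, a run that finishes in 3 minutes of wall-clock time still consumes 108 minutes of credit, 90 of them from the three macOS jobs.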

Test Functionality

Again, we need to check out the repo for this job and set up Python. The key difference from the Black formatting job is that the Python version is specified explicitly and points to the matrix option for python-version:

    steps:
      - name: Checkout code
        uses: actions/checkout@v2
      - name: Setup Python
        uses: actions/setup-python@v2
        with:
          python-version: ${{ matrix.python-version }}

Then, we need to set up the dependencies for the program to ensure it can run. I used Poetry for dependency and virtual environment management, and it's not included with any of the runner environments, so we have to install it in a workflow step. Installing it takes some time, though, so to speed up my workflow runtime, I "permanently" cache Poetry using GitHub's provided cache action. I only run the installation step on a cache miss, which shouldn't normally happen after the first run, since the cache key is constant for each OS.

      # Perma-cache Poetry since we only need it for checking pyproject version
      - name: Cache Poetry
        id: cache-poetry
        uses: actions/[email protected]
        with:
          path: ~/.poetry
          key: ${{ matrix.os }}-poetry
      # Only runs on a cache miss, i.e. when no cache exists for this key
      - name: Install latest version of Poetry
        if: steps.cache-poetry.outputs.cache-hit != 'true'
        run: |
          curl -sSL https://raw.githubusercontent.com/python-poetry/poetry/master/get-poetry.py | python -
      # Poetry still needs to be re-prepended to the PATH on each run, since
      # PATH does not persist between runs.
      - name: Add Poetry to $PATH
        run: |
          echo "$HOME/.poetry/bin" >> $GITHUB_PATH
      - name: Get Poetry version
        run: poetry --version

Then, I do another caching step for dependencies and install them if poetry.lock has changed:

      - name: Check pyproject.toml validity
        run: poetry check --no-interaction
      - name: Cache dependencies
        id: cache-deps
        uses: actions/[email protected]
        with:
          path: ${{ github.workspace }}/.venv
          key: ${{ matrix.os }}-${{ hashFiles('**/poetry.lock') }}
          restore-keys: ${{ matrix.os }}-
      # cache-hit is an output of the cache step, so it lives under "outputs"
      - name: Install deps
        if: steps.cache-deps.outputs.cache-hit != 'true'
        run: |
          poetry config virtualenvs.in-project true
          poetry install --no-interaction
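The interplay between key, restore-keys, and cache-hit is easy to misread, so here is a simplified model of it. This is a sketch under assumptions, not the cache action's code: the SHA-256 of the lockfile text stands in for hashFiles, the list of stored keys is assumed to be ordered oldest to newest, and the function names are invented:

```python
import hashlib


def cache_key(os_name: str, lockfile_text: str) -> str:
    """Build a key shaped like '<os>-<hash of poetry.lock>'."""
    digest = hashlib.sha256(lockfile_text.encode("utf-8")).hexdigest()
    return f"{os_name}-{digest}"


def lookup(stored_keys, key, restore_prefix):
    """Return (restored_key, cache_hit) for a simplified cache lookup.

    cache_hit is True only for an exact key match; a restore-keys prefix
    match restores an older cache but still counts as a miss.
    """
    if key in stored_keys:
        return key, True
    # Assumes stored_keys is ordered oldest to newest, so scan in reverse.
    for candidate in reversed(stored_keys):
        if candidate.startswith(restore_prefix):
            return candidate, False
    return None, False


# Hypothetical lockfile contents before and after a dependency bump.
old_key = cache_key("ubuntu-latest", "requests 2.25.0")
new_key = cache_key("ubuntu-latest", "requests 2.25.1")
stored = [old_key]

print(lookup(stored, old_key, "ubuntu-latest-")[1])  # True: install step is skipped
print(lookup(stored, new_key, "ubuntu-latest-")[1])  # False: old .venv restored, install runs
print(lookup([], new_key, "ubuntu-latest-"))         # (None, False): install runs from scratch
```

The middle case is the useful one: after a lockfile change, the stale virtualenv is restored as a starting point and the install step only has to reconcile the difference.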

Finally, once dependency and virtual environment setup is done, I run pytest:

      - name: Run tests
        run: poetry run pytest -v

Test build stability

For testing build stability, we simply run Poetry's build subcommand, which creates the build artifacts:

      - name: Build artifacts
        run: poetry build

Auto-merge

GitHub also allows pull requests to be merged automatically if branch protection rules are configured and if the pull request passes all required reviews and status checks. In the repo Settings > Branches > Branch Protection rules, I have a rule defined for main requiring all jobs in the build.yml workflow to pass before a branch can be merged into main.

Release automation

There are 2 parts to GitHub release automation:

Create the GitHub release using Git tags and add the build artifacts to it (workflow).
Publish the package to PyPI (workflow).

Create GitHub Release

We set up the workflow to trigger on push to a tag beginning with v:

on:
  push:
    # Sequence of patterns matched against refs/tags
    tags:
      - 'v*' # Push events to matching v*, i.e. v1.0, v20.15.10
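As a quick way to reason about which tags would start this workflow, the 'v*' filter can be approximated with a regular expression. This is an approximation I'm assuming for illustration (a literal v followed by any characters other than /), not GitHub's matcher, and the function name is hypothetical:

```python
import re

# Stand-in for the 'v*' filter: a literal "v", then any characters except "/".
V_TAG = re.compile(r"v[^/]*")


def triggers_release(tag: str) -> bool:
    """Return True if a tag name would match the 'v*' pattern."""
    return V_TAG.fullmatch(tag) is not None


for tag in ("v1.0", "v20.15.10", "1.0", "release-v1.0"):
    print(tag, triggers_release(tag))
# v1.0 True
# v20.15.10 True
# 1.0 False
# release-v1.0 False
```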

Then, we define our autorelease job, running on Ubuntu (cheapest and fastest GitHub Actions runner environment):

name: Create Release

jobs:
  autorelease:
    name: Create Release
    runs-on: 'ubuntu-latest'

Our first 2 steps are almost the same as in our Build workflow for pushes and PRs to main: we check out the repo and set up Poetry. Our checkout step is slightly different, though: we pass 0 to the fetch-depth input so we get a full clone with all commits and tags, not a shallow clone with just the most recent commit. The git describe calls later in this workflow need that history to find tags.

    steps:
      - name: Checkout code
        uses: actions/checkout@v2
        with:
          fetch-depth: 0

The Poetry setup steps are identical, so I won't include them here.

Then, we use Poetry to get the project version from pyproject.toml, store it in an environment variable, then check if the tag version matches the project version:

      - name: Add version to environment vars
        run: |
          PROJECT_VERSION=$(poetry version --short)
          echo "PROJECT_VERSION=$PROJECT_VERSION" >> $GITHUB_ENV
      - name: Check if tag version matches project version
        run: |
          TAG=$(git describe HEAD --tags --abbrev=0)
          echo $TAG
          echo $PROJECT_VERSION
          if [[ "$TAG" != "v$PROJECT_VERSION" ]]; then exit 1; fi

This is a bit of a guardrail because of how I trigger the autorelease. I update the pyproject.toml version on my local clone using poetry version <version>, commit it to main, then tag it with the same <version> and push the commit and the tag, which then starts this workflow. We need to ensure that the version tag and the pyproject.toml versions match (in case we forget to bump versions properly).
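The guardrail boils down to a single string comparison. Restated in Python (the function name and the version numbers are made up for illustration), the cases it catches look like this:

```python
def tag_matches_version(tag: str, project_version: str) -> bool:
    """Mirror the shell check: the tag must be 'v' plus the pyproject.toml version."""
    return tag == f"v{project_version}"


print(tag_matches_version("v1.2.0", "1.2.0"))  # True
print(tag_matches_version("v1.2.0", "1.1.0"))  # False: the version bump was forgotten
print(tag_matches_version("1.2.0", "1.2.0"))   # False: the tag is missing the "v" prefix
```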

Then, we do the same dependency and virtualenv setup as in my Build workflow using Poetry, then run pytest and poetry build. The build artifacts will be used when we create the release in the final step of this workflow.

The next step is to create some release notes. I keep a release template in the .github folder and append some git log output to it:

      - name: Release Notes
        run: git log $(git describe HEAD~ --tags --abbrev=0)..HEAD --pretty='format:* %h %s%n  * %an <%ae>' --no-merges >> ".github/RELEASE-TEMPLATE.md"

That gnarly git log command lists all non-merge commits after the previous tag up to HEAD. For each commit, it appends the abbreviated commit hash, the commit message subject, the author name, and the author email to the release template.
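To show what that format string produces, here is a small Python rendering of the same layout. The commits are invented sample data and the helper names are hypothetical; the sketch only mimics the text that git log would emit:

```python
def format_commit(short_hash, subject, author_name, author_email):
    """Render one commit the way '* %h %s%n  * %an <%ae>' does."""
    return f"* {short_hash} {subject}\n  * {author_name} <{author_email}>"


def release_notes(commits):
    """Join the per-commit entries with newlines, one entry per commit."""
    return "\n".join(format_commit(*commit) for commit in commits)


# Hypothetical commits, newest first, as git log would list them.
commits = [
    ("a1b2c3d", "Bump version to 1.2.0", "Jane Doe", "jane@example.com"),
    ("e4f5a6b", "Fix roster parsing", "Jane Doe", "jane@example.com"),
]

print(release_notes(commits))
```

Each commit becomes a top-level bullet with the author as a nested bullet, which renders as a two-level list in the release body.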

Finally, we use a 3rd-party release creation Action for creating a release draft with the release notes and artifacts we just created:

      - name: Create Release Draft
        uses: softprops/action-gh-release@v1
        with:
          body_path: ".github/RELEASE-TEMPLATE.md"
          draft: true
          files: |
            dist/random_standup-${{ env.PROJECT_VERSION }}-py3-none-any.whl
            dist/random-standup-${{ env.PROJECT_VERSION }}.tar.gz
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}

This creates a draft visible at https://github.com/jidicula/random-standup-py/releases. I modify the release announcements as needed, and publish the release.
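One detail in the files input above is easy to trip over: the wheel filename spells the project name with an underscore while the source archive keeps the hyphen. A sketch of that naming, assuming a pure-Python wheel with the py3-none-any tag as in this workflow (the helper name and version number are illustrative, and other tool versions may name the source archive differently):

```python
def expected_artifacts(project_name: str, version: str):
    """Return the wheel and sdist paths in the shape used by the files input."""
    wheel_name = project_name.replace("-", "_")  # wheel filenames use underscores
    return [
        f"dist/{wheel_name}-{version}-py3-none-any.whl",
        f"dist/{project_name}-{version}.tar.gz",
    ]


print(expected_artifacts("random-standup", "1.2.0"))
# ['dist/random_standup-1.2.0-py3-none-any.whl', 'dist/random-standup-1.2.0.tar.gz']
```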

Publishing to PyPI

The final step of the release process is to publish the package release to the Python Package Index along with the release assets. Here's the full workflow.

This time, we trigger the workflow to run on a release being published (the last step of the previous workflow is manually publishing a release draft):

on:
  release:
    types:
      - published

We do the same checkout and Poetry setup as before. Then, we simply run poetry publish --build, authenticating with a PyPI token that is stored in GitHub Secrets and exposed to the step as an environment variable:

      - name: Publish to PyPI
        env:
          PYPI_TOKEN: ${{ secrets.PYPI_TOKEN }}
        run: |
          poetry config pypi-token.pypi $PYPI_TOKEN
          poetry publish --build

Putting it all together

So overall, working on this project would involve:

1. Make a PR for my changes.
2. Confirm auto-merge.
3. Repeat Steps 1 and 2 until I'm ready to release.
4. Bump the pyproject.toml version on my local clone using poetry version <new_version>. Commit the changes.
5. Create a tag on main pointing to the version bump commit.
6. Push both the tag and the version bump commit to GitHub.
7. Wait for the Create Release run to finish.
8. Go to https://github.com/jidicula/random-standup-py/releases and modify the Announcements for the just-created release draft.
9. Publish the release.
10. Wait for the PyPI Publish run to finish.
11. Check PyPI for the updated package version.

If you have any questions or comments, email me at [email protected] or post a comment here.
