Continuous Deployment of Python eggs with VSTS on Azure

This blog shows how to create a basic continuous deployment (CD) pipeline for Python code with Visual Studio Team Services (VSTS) on Azure. Building a full CI/CD pipeline on VSTS is a bit of a challenge because Python is not a first-class citizen on the Azure stack (yet), so we'll focus on building a Python egg from a repository and putting that egg on a file share.

Our use case looks like this:

Our Python code is hosted on a Git repository in VSTS and is released to multiple machines in the Azure Cloud.

New commits on the master branch in the Git repository trigger a process that builds a Python egg and releases it on a file share. The Python egg is an uncompiled, zipped Python module that is system independent.
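Because an egg is just a zip archive, you can inspect one with nothing but the standard library. A minimal sketch (the egg path in the usage example is hypothetical):

```python
import zipfile


def list_egg_contents(egg_path):
    """List the files inside a built egg.

    An egg is a plain zip archive, so the stdlib zipfile module can
    open it directly; no setuptools machinery is needed.
    """
    with zipfile.ZipFile(egg_path) as egg:
        return egg.namelist()
```

For example, `list_egg_contents("dist/exampleproject-0.0.0-py2.7.egg")` would list the modules and metadata bundled into the egg built later in this post.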

The file share is mounted on a machine with JupyterHub, used for development, and one with Airflow, used for our jobs. New eggs get installed in the virtual environment on both machines so that everyone has access to our latest and greatest code.
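Once a new egg has been installed on the JupyterHub and Airflow machines, you can check from Python which version is active. A small sketch using pkg_resources (shipped with setuptools):

```python
import pkg_resources


def installed_version(package_name):
    """Return the installed version of a package, or None if absent."""
    try:
        return pkg_resources.get_distribution(package_name).version
    except pkg_resources.DistributionNotFound:
        return None
```

Calling `installed_version("exampleproject")` in both virtual environments is a quick way to verify that the latest egg actually landed everywhere.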

The repository

We'll start with a repository called example-project:

example_project/
├── exampleproject/     <- Python package with source code.
│   ├── __init__.py     <- Makes the folder a package.
│   └── process.py      <- Example module.
├── tests/              <- Tests for your Python package.
│   └── test_process.py <- Tests for process.py.
├── README.md           <- README with info on the project.
├── setup.py            <- Install and distribute your module.
└── vsts_build.bat      <- Windows build script.

The file setup.py specifies how to build and install your Python package and create an egg out of it:

# setup.py

import os
from setuptools import setup, find_packages


def read(fname):
    return open(os.path.join(os.path.dirname(__file__), fname)).read()


setup(
    name="exampleproject",
    description="Example project.",
    author="Henk Griffioen",
    long_description=read('README.md'),
    packages=find_packages(),
)

Normally, you can specify which folders to include or exclude with find_packages(), but the Hosted VS2017 agent has a version of setuptools that seems to have problems with submodules when doing that.
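For reference, this is what excluding the tests would normally look like. A sketch that works with an up-to-date setuptools, written here as a standalone helper so the discovery can be tried on any folder:

```python
from setuptools import find_packages


def discover_packages(root):
    """Find the packages to bundle, leaving out the tests.

    With a recent setuptools this is the usual pattern; the bare
    find_packages() call in setup.py is a workaround for the agent.
    """
    return find_packages(where=root, exclude=["tests", "tests.*"])
```

In setup.py itself this would simply be `packages=find_packages(exclude=["tests", "tests.*"])`.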

vsts_build.bat builds the egg and is rather simple:

:: vsts_build.bat

@echo off
echo Running build on "%AGENT_NAME%" with ID: %AGENT_ID%.

:: Uncomment the following lines to get some info on the agent:
:: @dir %AGENT_WORKFOLDER%
:: @dir %AGENT_BUILDDIRECTORY%
:: @dir %BUILD_SOURCESDIRECTORY%
:: @dir C:\Python27
:: @dir C:\

:: Build the egg.
C:\Python27\python.exe -W ignore setup.py bdist_egg

This simple batch script uses Python 2.7 (🔔🔔🔔 shame) to build the egg. Python 3.6 is also available on the Hosted VS2017 agent.

Make sure that you can build the egg with python setup.py bdist_egg on your own machine.

Build & release

Now that our repository is set up, we can create the build & release pipeline. We'll assume you already have an agent queue with the Hosted VS2017 agent. Instead of having two steps for the Build and Release, we'll build the egg and put it on a file share in one step.

In VSTS go to 'Build & Release' -> 'Builds' -> '+ New'. Start with an empty process, give it a nice name and choose the queue with the Hosted VS2017 agent.

The first task is to get the sources. Select your repository under 'This project' and the master branch.

The next task will build the egg and call our vsts_build.bat. Add a 'Batch Script' and point the 'Path' to the vsts_build.bat in the repository.

Add the task 'Run Inline Azure Powershell' (you may need to search for it) so that earlier deployed eggs are deleted. Under 'Script to run' add:

Param(
  [string]$SecretKey
)

$context = New-AzureStorageContext -StorageAccountName "<STORAGE-ACCOUNT-NAME>" -StorageAccountKey "$SecretKey"

Remove-AzureStorageFile -ShareName "<SHARE-NAME>" -Path "<FOLDER>/example_project-v0.0.0-py2.7.egg" -Context $context

The key of the Storage Account is a parameter for this script, so set the following under 'Argument': -SecretKey "$(SecretKey)". Under 'Control Options' enable Continue on Error so that your pipeline doesn't error out when there's no egg on the file share to remove.
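The same cleanup logic can be sketched in Python. The helper below mirrors 'Continue on Error': it swallows the failure when there is no egg on the share yet. The file_service argument is hypothetical; with the legacy azure-storage-file SDK it would be a FileService(account_name=..., account_key=...) instance:

```python
def remove_previous_egg(file_service, share_name, folder, egg_name):
    """Delete an earlier deployed egg from the file share.

    Returns True when the egg was deleted, False when the delete
    failed (e.g. no egg on the share yet), mirroring the pipeline's
    'Continue on Error' setting. `file_service` is any object with a
    delete_file(share, folder, filename) method, such as the legacy
    azure.storage.file.FileService (an assumption, not shown here).
    """
    try:
        file_service.delete_file(share_name, folder, egg_name)
        return True
    except Exception:
        return False
```

Wrapping the delete like this keeps the rest of a deployment script oblivious to whether a previous egg existed.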

Now that we have built our artifact, we can publish it. Select the 'Publish Build Artifacts' task and configure it.

We'll cheat a bit and put the uploading of the egg in our Build process (best practice is to create a separate Release step). Create a new task of type 'Azure Storage Upload'. This task is not available in VSTS by default; you can install it for free through the VSTS Marketplace. Once installed, configure the following:

  • Source: $(Build.SourcesDirectory)/dist
  • File: *.egg
  • Destination: https://<STORAGE-ACCOUNT>.file.core.windows.net/<SHARE-NAME>/<FOLDER>
  • Key: $(SecretKey)

The final build process should look something like:

Go to the tab 'Variables' to add a variable called SecretKey with your Storage Account key. This key grants full access to the storage account (it's better to generate a short-term SAS token). In the tab 'Triggers' you can define when the build should start. Enable continuous integration based on new code in the master branch.

Click 'Save & queue' to save your pipeline. Each time a new git commit is pushed to master, a new egg will be placed on the file share!

Conclusion

We've built a basic pipeline to deploy new Python code to a file share. The code is shipped as an egg distribution of the Python module. If you want to build a more sophisticated pipeline, you'll have to use resources not available in VSTS & Azure by default, for instance by hosting your own agent or using your own Jenkins server.


Thanks to Erik Kooijman from Delta-N and Marco Lormans from Xebia for all the VSTS tips!

Stay up to date on the latest insights and best-practices by registering for the GoDataDriven newsletter.