In a way, this is total overkill for a static site. If I have the repo cloned on my machine and I want to publish a new post, I can do it in two commands:
    stack exec site build
    scp -r _site/ email@example.com:/var/www/www.mrlee.dev/
It's flawed compared to using rsync, as it won't remove files that already exist on the server, but it does the job in a second or two.
The thing is, this isn't so quick if I want to publish a post from a different computer that doesn't have any programming tools installed. I would have to install stack [1], which is a build tool for Haskell, and then I would have to run stack build. This can take at least half an hour, as the command will pull down the correct version of GHC and a 'snapshot' (basically a huge collection of all the Hackage [2] libraries available for that build) before it even thinks about compiling my site.hs file. It also means committing a few gigs of storage space to all of that.
I like to write from my little Surface Pro when I'm out and about, so I'd rather not do a full-blown compilation on that for the sake of my battery. Enter Azure DevOps Pipelines [3].
I've been keen on playing with these pipelines for a while, and much like any dev tool, they have a free tier for open source repos. So does GitHub Actions [4], which actually shares some of the underlying architecture of DevOps Pipelines, but I wanted to play with something different.
Let's do a step-by-step walk through my setup.
    trigger:
      - master

    pool:
      vmImage: 'ubuntu-latest'
This is pretty much CI boilerplate. The build runs on every push to master, which includes merged PRs, and it uses Ubuntu as the underlying image. I'm not doing any Docker stuff here.
    jobs:
      - job: build
        steps:
          ...
I only have a couple of jobs in this pipeline, to keep it simple. The next bunch of steps are nested under this.
    - script: |
        mkdir -p ~/.local/bin $(Build.BinariesDirectory)
        curl -L https://get.haskellstack.org/stable/linux-x86_64.tar.gz | tar xz --wildcards --strip-components=1 -C ~/.local/bin '*/stack'
      displayName: Install Stack
Won't get far without grabbing the latest stable Stack binary.
    - task: Cache@2
      displayName: Cache Stack/GHC snapshot
      inputs:
        key: 'stack | root'
        path: .stack/
        cacheHitVar: 'STACK_SNAPSHOT_RESTORED'
Later on there will be a step that runs stack build, which will take about 40 minutes in CI. It would be a waste to repeatedly download all of that, so I'm caching the root stack folder for good measure. The cacheHitVar is something we will reference later.
    - task: Cache@2
      displayName: Cache local stack deps
      inputs:
        key: 'stack | stack.yaml.lock'
        path: .stack-work/
        cacheHitVar: 'STACK_DEPS_RESTORED'
This is the same as the last step, but it's for the dependencies my static site requires. I want to cache these separately so adding a new project dependency doesn't force a full refresh of the Stack snapshot.
    - script: |
        export PATH=$HOME/.local/bin:$PATH
        stack --no-terminal --stack-root $(System.DefaultWorkingDirectory)/.stack setup
      displayName: Build Snapshot
      condition: ne(variables.STACK_SNAPSHOT_RESTORED, 'true')
See the STACK_SNAPSHOT_RESTORED condition at the bottom there? This step sets up GHC and the Stack snapshot, but only if one wasn't restored from the cache. If the cache has it, then it will have already been fetched.
    - script: |
        export PATH=$HOME/.local/bin:$PATH
        stack --no-terminal --stack-root $(System.DefaultWorkingDirectory)/.stack build
      displayName: Build Dependencies
      condition: ne(variables.STACK_DEPS_RESTORED, 'true')
This is the same as above, but for the project dependencies. So far so good. We're almost done now.
    - script: |
        export PATH=$HOME/.local/bin:$PATH
        stack --no-terminal --stack-root $(System.DefaultWorkingDirectory)/.stack install --local-bin-path $(Build.BinariesDirectory)
      displayName: Build Site Executable
Since I've already run stack build, this just copies the binary to a different location, which I use to store it as a build artifact.
Build.BinariesDirectory is a special place on the VM to store compiled build artifacts. It doesn't matter where specifically that is, only that it's the same across steps.
    - task: PublishBuildArtifacts@1
      displayName: Save static site binary
      inputs:
        pathToPublish: $(Build.BinariesDirectory)
        artifactName: site
This is where that binaries directory comes into play, as I can tell Azure to upload everything in there as a build artifact, which I can then refer to in another job. This isn't quite the same as a cache, as a build is not expected to fail if the cache goes missing. It would fail if the binary isn't there though.
So, that's the first step done, but what about actually publishing a post? I have two jobs for that, which are very similar (one for draft posts/staging, one for prod). I'll describe one of them.
    - job: deploy_published
      dependsOn: build
      condition: and(succeeded(), eq(variables['build.sourceBranchName'], 'master'))
      steps:
        ...
The key to this job is the condition. It will run only if the build job was successful and the branch being built is the master branch. Practically, this only runs if I push straight to master or merge a PR. The staging version runs only on PRs.
    - task: DownloadBuildArtifacts@0
      displayName: Download site binary
      inputs:
        artifactName: site
        downloadPath: $(System.DefaultWorkingDirectory)
Time to put that binary I compiled to good use. It downloads it into the main working directory and I'll call it directly in a later step. The executable is self-contained (or otherwise dynamically links stuff the image already has), so I don't need to pull down Stack/GHC stuff again.
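One way to check that assumption about dynamic linking is to ask the loader which shared libraries a binary needs. The sketch below is hypothetical and not part of the pipeline: it uses `ls` as a stand-in so it runs on any Linux machine, whereas in the deploy job the path would be the downloaded `site/site` executable.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Stand-in binary (assumption): pass the real executable as the first
# argument, otherwise fall back to whatever `ls` resolves to.
bin="${1:-$(command -v ls)}"

# ldd lists the shared libraries the dynamic loader would resolve.
deps="$(ldd "$bin")"
echo "$deps"

# A "not found" entry means the image lacks a library the binary needs.
if grep -q "not found" <<<"$deps"; then
  echo "missing shared libraries for $bin" >&2
  exit 1
fi
```

If every entry resolves, the binary should start on that image without any of the Stack or GHC tooling installed.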
    - script: |
        export PATH=$(System.DefaultWorkingDirectory)/site:$PATH
        chmod +x $(System.DefaultWorkingDirectory)/site/site
        site build
      displayName: Build with published posts
This is the same as running stack exec site build on my local machine. It compiles the static site, so finally I'll have a new version to upload.
    - task: InstallSSHKey@0
      displayName: Setup SSH
      inputs:
        knownHostsEntry: '$(NexusKnownHost)'
        sshKeySecureFile: 'nexus_deploy'
I host this blog on my own little VPS, which means that the server needs to know that the CI is authorised to connect to it with its SSH key. This is the same as having a deploy key on GitHub, and requires generating a keypair, with the private key stored in CI and the public key added to the authorized_keys file of the appropriate user on the server.
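As a rough sketch of the keypair step, assuming OpenSSH's `ssh-keygen` is available: the key file name follows the `nexus_deploy` secure file referenced in the task, while the comment label and the throwaway directory are made up for illustration. A local file stands in for the server's `authorized_keys`.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Throwaway directory so the sketch never touches real keys in ~/.ssh.
keydir="$(mktemp -d)"

# Generate an Ed25519 keypair with an empty passphrase (-N ''), since a
# CI job cannot type one in. The -C value is an arbitrary label.
ssh-keygen -q -t ed25519 -N '' -C 'azure-pipelines-deploy' -f "$keydir/nexus_deploy"

# The private key (nexus_deploy) is the file uploaded to the CI.
# The public key (nexus_deploy.pub) is appended to authorized_keys for the
# deploy user on the server; a local stand-in file is used here.
cat "$keydir/nexus_deploy.pub" >> "$keydir/authorized_keys"
chmod 600 "$keydir/authorized_keys"

# Print the fingerprint, handy for checking which key the server trusts.
ssh-keygen -l -f "$keydir/nexus_deploy.pub"
```

The known-hosts value that the task also expects is a line in the usual `known_hosts` format; running `ssh-keyscan` against the server is one common way to obtain it.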
At this point I'll say that if you're doing this yourself, make sure to properly harden your server. I'll describe this more in a follow-up post.
There's only one step left now, and that's to deploy!
    - task: CopyFilesOverSSH@0
      displayName: Deploy to prod
      inputs:
        sshEndpoint: 'Nexus'
        sourceFolder: '_site/'
        contents: '**'
        targetFolder: '/var/www/www.mrlee.dev'
        cleanTargetFolder: true
        readyTimeout: '20000'
This is similar to running rsync to deploy, except that it knows where to get your private key from and where to connect to. The connection details are defined elsewhere in Azure DevOps, through the UI, rather than in the YAML file.
To solve the issue I first mentioned, cleanTargetFolder makes sure to delete the previous deployment before copying the new one over. Problem solved!
To see the pipeline in full, you can check out the full YAML file [5]. I've been using it with success for the past couple of weeks now.