When deploying Apollo it will create the cluster infrastructure in the cloud using an image built by Atlas and Packer. Additionally, it will provision and configure the new machines providing a full PAAS ready for deploying containers across the datacenter.

When dealing with several layers of abstraction and using many different technologies (Bash, Packer, Terraform, Ansible, etc.) it is fairly easy to break things during development.

We need to make sure we are not committing syntax errors, that every commit is deployable and that the behaviour of the platform with all the bits communicating to each other works as expected.

Changes need to be continuously integrated and driven by tests. For this purpose we chose Wercker out of a list of available CI as a service tools like Travis, circleci, Drone and Shippable

“Wercker is an open automation platform for developing, building and delivering your applications” from the simplicity of a YAML file and it is highly customizable via custom steps

For achieving our continuous deployment approach we’ll use two different pipelines: build and deploy

Build

“Builds are the result of a run-through of the steps in the wercker pipeline. If all steps in the pipeline are successful, a build has passed”.

Every time a Pull Request is created the build steps below will run:

  • Syntax checks against all our json packer templates using packer validate.
script:
  cwd: packer/
  name: validate packer template for AMI
  code: ./packer validate ubuntu-14.04_amd64-amis.json
capgemini/ansible-lint:
  name: run ansible-lint against the site.yml playbook
  playbook: site.yml
script:
  name: scripts testing
  cwd: bootstrap/tests
  code: |
    /bin/bash test.sh

In the future we might be adding a new step here for “packer push” the images into Atlas when merging into master branch.

You can see the feedback for every build in wercker:

Fig 1. Wercker builds

Deploy

Wercker allows you to deploy your application into supported platforms like Heroku or Openshift or you can create you own Deployment Target.

Every successful build from any specific commit can be deployed either automatically or manually.

Fig 2. Wercker deploy

At the moment we are continuously deploying Apollo into Amazon and Digitalocean using a step-apollo-deploy.

deploy:
    steps:
    - install-packages:
        packages: wget unzip
    - pip-install:
        requirements_file: "requirements.txt"
    - capgemini/apollo-deploy:
        cloud: $CLOUD
        artifact_name: $ARTIFACT_NAME
        artifact_version: $ARTIFACT_VERSION
        terraform_version: 0.5.3
        run_tests: true

Environment variables can be set for every target you create in wercker:

Fig 3. Slack notification

We have created step-apollo-deploy step for:

  • Installing a given version of Terraform.
  • Deploying Apollo in parallel into multiple clouds.
  • Running different types of testing inside the vms. At the moment Serverspecs and docker bench security.
  • Destroying the whole platform.

Every change in step-apollo-deploy is also continuously integrated by wercker itself using step-validate-wercker-step:

box: wercker/default
build:
  steps:
    - validate-wercker-step

So in essence with both “build” and “deploy” in place we have the ability to automatically triggering a fully tested ephemeral Apollo deployment for any specific commit.

Test

We use ansible roles for installing and running dockerbench and serverspecs covering components of Apollo:

  • Mesos
  • Marathon
  • Docker
  • Weave
  • Consul
  • Registrator
  • Dnsmasq
  • HAProxy
  • Zookeeper

Testing during a deploy can be optionally enabled using a toggle environment variable.

export APOLLO_serverspec_run_tests=true
export APOLLO_dockerbench_run_test=true

Dockerbench

We have recently decided to try and integrate dockerbench into our test suite.

The Docker Bench for Security is a script that checks for all the automatable tests included in the CIS Docker 1.6 Benchmark . It gives INFO, PASS and WARNING messages for each of the tests that run. We run tests at deploy time, to check we haven’t slipped any unwanted security risks into our build. Currently we only test if WARNING’s are above a certain configurable threshold.

The dockerbench project is still fairly young, and is still to settle on a standard way all tests can be run across any environment, whilst having configurable rules. We think this is a great project, as it allows us to bring security into our Continuous Integration cycle, which can decrease attack vectors and catch risks earlier in development cycles.

Serverspecs

Apollo default deployment provides 3 machines playing mesos master role and running zookeeper and marathon. As some configuration like the ips can be assigned on deployment time we need to populate some of the serverspecs dynamically by using an ansible j2 template.

describe command("curl --silent --show-error --fail --dump-header /dev/stderr --retry 2 http://{{ marathon_hostname }}:8080/v2/info") do
  its(:stdout) { should match '.*"master":"zk://"{{ marathon_peers|join(',') }}/mesos".*' }
  its(:stdout) { should match '.*"zk":"zk://{{ marathon_peers|join(',') }}/marathon".*' }
  its(:stdout) { should match '.*"name":"marathon".*' }
  its(:stdout) { should match '.*"mesos_user":"root".*' }
  its(:stdout) { should match '.*"executor":"//cmd".*' }
  its(:exit_status) { should eq 0 }
end

Below are some examples with more of the serverspecs we run for marathon:

describe docker_container('marathon') do
  it { should be_running }
  it { should have_volume('/store','/etc/marathon/store') }
end

describe port(8080) do
  it { should be_listening }
end

describe command("/usr/local/bin/marathon-wait-for-listen.sh") do
  its(:exit_status) { should eq 0 }
end

describe service('marathon') do
  it { should be_running.under('upstart') }
end

describe command("curl -s -XPOST 127.0.0.1:8080/v2/apps -d@spec/marathon/sample.json -H \"Content-Type: application/json\" && sleep 3") do
  its(:exit_status) { should eq 0 }
end

describe command("curl -s 127.0.0.1:8080/v2/deployments") do
  its(:stdout) { should match '[]' }
  its(:exit_status) { should eq 0 }
end

describe command("curl -s 127.0.0.1:8080/v2/apps/serverspecs-app") do
  its(:stdout) { should match '.*"id":"/serverspecs-app".*' }
  its(:stdout) { should match '.*"tasksRunning":1.*' }
  its(:exit_status) { should eq 0 }
end

Feedback

Finally we get the feeback for the deployment in every cloud in our Slack channel using wantedly/pretty-slack-notify:

after-steps:
    - wantedly/pretty-slack-notify:
        webhook_url: $SLACK_WEBHOOK_URL
        channel: apollo
        username: wercker-bot

Fig 4. Slack notification

You can check out the full wercker.yml file here and below is a video showing the whole cycle:

What’s next

If for your use case you need to keep track of the state of Terraform you might find this Atlas and Github integration useful.

Check out our upcoming posts if you want to hear more about CI, CD and our Continuous Delivery approach for services running inside docker containers deployed by Apollo, running Serverspecs into ephemeral containers via “docker exec” across multiple environments.

Join our team

If you like the sound of what you've read and would like to join our team, we're hiring!

Find out more about working with Capgemini

Comments