Automating the test of your infrastructure code

Infrastructure test automation

Quality must be everyone’s role. All the teams must be aware of this responsibility. Assure the quality of every code we produce is not a kind of task we should delegate. We must take the ownership of our work and deliver it with quality.

The infrastructure test automation, asides any application test automation, is important in the process of delivering code. Every change you make in your Ansible playbook, or any file of your infrastructure project, must be followed by the test of the entire project.

The tests can be done either manually or automatically. The advantage of automating the test is obviously save time and make it reproducible at any time. Although you have to invest some time in developing the automation, you get rid of manually repeating the tests. With automation, it becomes a simple matter of a click of a button.

The test script

You can use a tool like Molecule to test your Ansible playbooks, or simply use shell scripts. The file below is and example of the use of shell script to automate an entire Ansible project. More about the project you can find in the article How to deal with the same configuration file with different content in different environments. You can also clone the Codeyourinfra repository and take a look at the same_cfgfile_diff_content directory.


	vagrant destroy -f
	rm -rf .vagrant/ *.retry "$tmpfile"

. ../common/

# turn on the environment
vagrant up

# check the solution playbook syntax
checkPlaybookSyntax playbook.yml hosts

# execute the solution
ansible-playbook playbook.yml -i hosts | tee ${tmpfile}
assertEquals 3 $(tail -5 ${tmpfile} | grep -c "failed=0")

# validate the solution
ansible qa -i hosts -m shell -a "cat /etc/conf" | tee ${tmpfile}
assertEquals "prop1=Aprop2=B" $(awk '/qa1/ {for(i=1; i<=2; i++) {getline; printf}}' ${tmpfile})
assertEquals "prop1=Cprop2=D" $(awk '/qa2/ {for(i=1; i<=2; i++) {getline; printf}}' ${tmpfile})
assertEquals "prop1=Eprop2=F" $(awk '/qa3/ {for(i=1; i<=2; i++) {getline; printf}}' ${tmpfile})

# turn off the environment and exit
exit 0

The script is quite simple. It basically turns on the required environment for testing, do the tests and turn off the environment. If everything goes as expected, the script exits with the code 0. Otherwise, the exit code is 1. (Here is a great article about exit codes)

The environment for testing is managed by Vagrant. The command up turns the environment on, while the command destroy turns it down. Vagrant can manage both local virtual machines and AWS’ EC2 instances. When the test is done in the cloud, there’s an additional step of gathering the IP addresses from AWS. Ansible requires these IPv4 addresses in order to connect with the remote hosts through SSH. If you want more details, please take a look at the previous article Bringing the Ansible development to the cloud.

Notice that the environment is turned off and all the auxiliary files are removed in the teardown function. Other functions are also used within the script, loaded from the file. They are as follows:

  • checkPlaybookSyntax – uses the –check-syntax option of the ansible-playbook command in order to validate the playbook YML file;
  • assertEquals – compares an expected value with the actual one in order to validate what was supposed to happen;
  • assertFileExists – checks if a required file exists.

The script also creates a temporary file. In to the temporary file the command tee writes the output of the command ansible-playbook executions. Right after each execution, some assertions are made, in order to check if everything has just gone fine. The example below shows the output of the playbook.yml execution.

ansible-playbook playbook.yml -i hosts

PLAY [qa] **************************************************************************************************************************************************************************************************

TASK [Gathering Facts] *************************************************************************************************************************************************************************************
ok: [qa1]
ok: [qa2]
ok: [qa3]

TASK [Set the configuration file content] ******************************************************************************************************************************************************************
changed: [qa1] => (item={'key': u'prop1', 'value': u'A'})
changed: [qa3] => (item={'key': u'prop1', 'value': u'E'})
changed: [qa2] => (item={'key': u'prop1', 'value': u'C'})
changed: [qa1] => (item={'key': u'prop2', 'value': u'B'})
changed: [qa2] => (item={'key': u'prop2', 'value': u'D'})
changed: [qa3] => (item={'key': u'prop2', 'value': u'F'})

PLAY RECAP *************************************************************************************************************************************************************************************************
qa1                        : ok=2    changed=1    unreachable=0    failed=0   
qa2                        : ok=2    changed=1    unreachable=0    failed=0   
qa3                        : ok=2    changed=1    unreachable=0    failed=0

The command tail gets the last five (-5) lines of the temporary file, and the command grep counts (-c) how many lines have “failed=0”. Ansible outputs the result at the end, and it’s expected success (failed=0) in the performing of the tasks in all of the three target hosts (3).

In a single execution, Ansible is able to do tasks in multiple hosts. The Ansible ad-hoc command bellow executes the command cat /etc/conf in each of the hosts that belong to the test environment (q1, q2 and q3). The goal is validate the prior playbook’s execution. The content of the configuration file of each host must be as defined in the config.json file.

ansible qa -i hosts -m shell -a "cat /etc/conf"

qa2 | SUCCESS | rc=0 >>

qa3 | SUCCESS | rc=0 >>

qa1 | SUCCESS | rc=0 >>

The command awk finds a specific line by a pattern (/hostname/) and gets the two lines below in a single line. This way is possible compare the configuration file content obtained from each host with the expected content.


Every Codeyourinfra project’s solution has its own automated tests. You can check it out by navigating through the repository directories. The file of each folder does the job, including those which are in the aws subdirectories. In this case, the test environment is turned on in an AWS region of your choice.

Shell scripting is just an example of how you can implement your infrastructure test automation. You can use Docker containers instead of virtual machines managed by Vagrant, too. The important is having a consistent and reproducible way to guarantee the quality of your infrastructure code.

The next step is create an integration continuous process for developing your infrastructure. But it’s the subject of the next article. Stay tuned!

Before I forget, I must reinforce it: the purpose of the Codeyourinfra project is help you. So, don’t hesitate to tell the problems you face as a sysadmin.