Configuration management with Ansible: Playbooks & Execution

In my previous post, I described how you could set up Ansible. At this point, you are able to send commands to your hosts. Although running ad hoc tasks is useful, I won’t be covering that here. If you want to know more about running ad hoc tasks from the command line, check out the detailed examples on Ansible’s documentation page.

Directory Layout

To use playbooks, you’ll need to set up a directory layout. There are many different ways to organize your directories, so I can only recommend that you experiment. If the way I organize my directories and tasks doesn’t work for you, feel free to adapt it to your needs.

One thing you definitely want to do is use the “roles” organization feature. Roles automatically load certain vars_files, tasks and handlers based on a known file structure.
Roles are just automation around ‘include’ directives and don’t add functionality beyond some improvements to search path handling for referenced files. That alone can save you a lot of work, and the roles functionality may well expand in the future.

In my case, the directory structure looks like this:

dbservers.yml                        # playbook for database servers
gitservers.yml                       # playbook for git servers
webservers.yml                       # playbook for web servers
infrastructure.yml                   # playbook for the whole infrastructure ( db, git and web )
hosts                                # inventory file for servers
group_vars                           # assign variables to particular groups
host_vars                            # assign variables to specific servers
library                              # external modules can be stored here
roles                                # put your roles in this directory
  common                             # this hierarchy represents a “role”
    files                            # files for use with the copy module
    handlers                         # store handlers 
      main.yml                       # main handlers file
    tasks                            # store tasks
      main.yml                       # main tasks file
    templates                        # files for use with the template module. Templates end in .j2
  dbservers                          # same kind of structure as common role                
    ...
  gitservers                         # same kind of structure as common role
    ...        
  webservers                         # same kind of structure as common role
    ...        
  shared                             # same kind of structure as common role
    ...

Host inventory

I store all my hosts in one inventory file. You could easily split your inventory into multiple files, for instance for separating production and staging hosts.

But given the small size of my infrastructure, I have no need to do that. Basically, I need 3 groups:

  • gitservers
  • dbservers
  • webservers

You can define groups based on the purpose of the host (roles). This makes my inventory file look like this:

[gitservers]
gitserver1                ansible_ssh_host=gitserver1.example.org

[dbservers]
dbserver1                ansible_ssh_host=dbserver1.example.org

[webservers]
webserver1                ansible_ssh_host=webserver1.example.org
webserver2                ansible_ssh_host=webserver2.example.org
webserver3                ansible_ssh_host=webserver3.example.org

I prefer using a simple hostname in my inventory and specifying the host address through the ansible_ssh_host option. This way, when using host_vars, the filename stays nice and short, with no dots in it.
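
As a minimal sketch of why the short name matters: variables for the inventory host webserver1 would live in a file named after that host. The variable name and value below are hypothetical and only serve as an illustration.

```yaml
# host_vars/webserver1
# The file name matches the short inventory hostname, not the full address.
---
# hypothetical variable: name of this server's backup set
backup_set: webserver1-daily
```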

Writing playbooks

The infrastructure.yml file is a playbook that defines the entire infrastructure ( all available groups ). It’s very short because it only includes the group-specific playbooks.

---

- include: gitservers.yml
- include: dbservers.yml
- include: webservers.yml

The group-specific playbooks are pretty short as well. Here we simply map the hosts of a group to the roles performed for that group:

# file: webservers.yml
- hosts: webservers        # list of one or more groups or host patterns, separated by colons
  user: root               # give the SSH user to connect to the hosts
  roles:                   # list of roles that need to be executed
    - common
    - webservers

This designates the following behaviors, for each role:

  1. If roles/x/tasks/main.yml exists, tasks listed therein will be added to the play
  2. If roles/x/handlers/main.yml exists, handlers listed therein will be added to the play
  3. If roles/x/vars/main.yml exists, variables listed therein will be added to the play
  4. Any copy tasks can reference files in roles/x/files without having to path them relatively or absolutely
  5. Any script tasks can reference scripts in roles/x/files/ without having to path them relatively or absolutely
  6. Any template tasks can reference files in roles/x/templates/ without having to path them relatively or absolutely

Don’t worry: if any of these files or directories is not present, it is simply ignored. So it’s OK not to have a ‘vars’ subdirectory for a role, for instance.

So in the above example, all tasks residing in the roles/common/tasks/main.yml and roles/webservers/tasks/main.yml will be executed.
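
To illustrate how these conventions fit together, here is a minimal sketch of a task that notifies a handler. It assumes a hypothetical vhost.conf.j2 template in roles/webservers/templates/; the destination path and the handler name are illustrative as well.

```yaml
# roles/webservers/tasks/main.yml
---
- name: deploy the virtual host configuration
  # src is looked up in roles/webservers/templates/ thanks to the role search path
  template: src=vhost.conf.j2 dest=/etc/apache2/sites-available/example.conf
  notify: restart apache

# roles/webservers/handlers/main.yml
---
- name: restart apache
  # runs only when a task that notifies it reports a change
  service: name=apache2 state=restarted
```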

Reusable task includes

All the magic happens in the role’s main.yml task file, where we list all our tasks. For instance, our web stack consists of a default LAMP installation ( Apache, MySQL and PHP ):

---

- name: install apache package
  apt: pkg=apache2 state=latest

- name: install mysql package
  apt: pkg=mysql-server state=latest

- name: install php package
  apt: pkg=php5 state=latest

Since these 3 tasks consist only of installing APT packages, you could combine them into one task by providing a list:

# roles/webservers/tasks/main.yml
---

- name: install LAMP packages
  apt: pkg={{ item }} state=latest
  with_items:
    - apache2
    - mysql-server
    - php5

So what did we do here? First, we gave our task a name. Then we defined what needs to be executed by calling the desired module and passing it the correct options. In this case, we want to install packages using the Ansible APT module.
with_items saves you some typing by repeating a task in a loop, once for every item. It takes a list of strings or hashes.
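
When the items are hashes, each key is available as an attribute of the item. A minimal sketch using the user module follows; the account names and groups are made up for illustration.

```yaml
---
- name: create user accounts
  # item.name and item.groups come from the hashes listed below
  user: name={{ item.name }} groups={{ item.groups }} state=present
  with_items:
    - { name: 'deploy', groups: 'www-data' }
    - { name: 'backup', groups: 'adm' }
```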

Of course, a web server needs more than just a LAMP stack: creating users, managing projects, security, etc. All of this makes your main.yml file grow and become harder to maintain. That is why I prefer grouping tasks that belong together in separate YAML files.

So for example purposes, let’s split our LAMP installation into 3 separate files. Our main.yml file looks like this:

# roles/webservers/tasks/main.yml
---

- include: php.yml                tags=php
- include: apache.yml             tags=apache
- include: mysql.yml              tags=mysql

You might notice the tags directive behind every include. When your playbook starts to grow, it becomes useful to run only a specific part of the configuration. That’s where the tags attribute comes into play. More on the different ways to execute your playbooks later.
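
Each included file is an ordinary task list. As a sketch, apache.yml could look like the following; the second task is an assumption added for illustration and is not part of the original setup.

```yaml
# roles/webservers/tasks/apache.yml
---
- name: install apache package
  apt: pkg=apache2 state=latest

# illustrative extra task: keep the service running and enabled at boot
- name: make sure apache is running
  service: name=apache2 state=started enabled=yes
```

Because the include carries tags=apache, both tasks are selected when the playbook is run with that tag.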

It might also happen that you have certain tasks that don’t belong to a specific role but that need to be executed in 2 or more different roles.

Let’s say we have some tools that need to be installed on the web servers and the database server, but not on the git servers. We can’t just include them in the common role, since the common role gets executed in all playbooks. Here we have 3 possibilities:

  1. Just add another role. I’m not too keen on this. It might be doable when you don’t have a lot of roles, but once you do, the number of possible combinations grows with them. So it’s better to keep your structure as clean as possible from the beginning.
  2. Use a relative or absolute path when including your tasks. In the directory layout section, you might have noticed the directory named “shared” next to my roles. Store your task lists in the shared/tasks folder and point your role’s include statement to that location. This is a good alternative for smaller playbooks, but you lose the roles auto-path magic.
  3. Symlink the task list, which is my preference. As in option 2, you store your task lists, files and templates in the shared folder. But instead of including them using a relative or absolute path, you create a symlink in your role that targets the shared tasks. This way, you can include the task list and reference files and templates as intended, and still make use of the roles path magic.
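
Options 2 and 3 could be sketched as follows. The file name tools.yml is hypothetical, and the relative paths assume that “shared” sits in the roles directory as in the layout above. The two includes are alternatives; you would use only one of them.

```yaml
# roles/webservers/tasks/main.yml
---
# Option 2: include the shared task list through a relative path
- include: ../../shared/tasks/tools.yml

# Option 3: include a symlink that lives next to main.yml, created once with
#   ln -s ../../shared/tasks/tools.yml roles/webservers/tasks/tools.yml
- include: tools.yml
```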

Variables

In Ansible, you can have different kinds of variables: group variables, host variables and role variables.

  • Group variables are variables that can change depending on the group ( stored in group_vars/ )
  • Host variables are variables that can change according to the host, for instance the name of a server backup set ( stored in host_vars/ )
  • Role variables are variables that can change according to the role. If your roles layout is equal to that of your groups, you’re better off storing your variables as role variables ( stored in roles/x/vars/main.yml )

You can also specify variables in the vars section of a playbook. All these variables can be used in the playbook like this:

$varname
${varname}
{{ varname }}

The first two styles are legacy, whilst the last one is the preferred way (since Ansible 1.2).
In the when: clause, which you can think of as a Python-style expression, you write the variable name without any braces, e.g.:

when: varname is defined

You can use these variables in your templates as well as in tasks.
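
A minimal sketch of a group variable being used in a task and in a template. The variable http_port and the ports.conf.j2 template are hypothetical names chosen for illustration.

```yaml
# group_vars/webservers
---
http_port: 80

# roles/webservers/tasks/apache.yml
---
- name: write the apache ports configuration
  # inside ports.conf.j2 the variable would be used as: Listen {{ http_port }}
  template: src=ports.conf.j2 dest=/etc/apache2/ports.conf
  when: http_port is defined
```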

Executing playbooks

Now that we have a basic structure with some working tasks, we can execute our playbooks. Based on the given directory structure, you can execute your playbooks as follows:

# specify your inventory file with -i ( useful when you have multiple inventory files; otherwise you can set the ANSIBLE_HOSTS environment variable ). This runs all playbooks ( webservers, dbservers, gitservers ).
$ ansible-playbook -i hosts infrastructure.yml

# run only certain tags for all playbooks
$ ansible-playbook -i hosts infrastructure.yml --tags backup

# run a specific playbook
$ ansible-playbook -i hosts webservers.yml

# run a specific group
$ ansible-playbook -i hosts infrastructure.yml --limit webservers

# limit the number of hosts for group ( first 10 )
$ ansible-playbook -i hosts infrastructure.yml --limit webservers[0-10]

Conclusion

So I’m using Ansible to automate my own infrastructure ( 1 database server, 4 web servers and 1 git server ). Although that doesn’t seem like a lot, automating your infrastructure will pay off in the long run, even with only 2 servers.

The information above should get you started creating your playbook structure. But keep in mind that I only covered the more basic features of Ansible. This is only the tip of the iceberg. The rabbit hole goes a lot deeper, and Ansible has more complex and advanced features. I recommend visiting the Ansible documentation page for more detailed information.