Manage your datacenter infrastructure with Ansible

Ansible is not only a great tool for simple tasks (like updating servers); it can also be of great help deploying and automating the infrastructure underneath them. Ansible supports building your infrastructure from the ground up.

Ansible is compatible with almost anything: if you can use a CLI, you can use Ansible. Out of the box it has lots of plugins for vendors like Cisco, HP/Aruba, Arista, Juniper, NetApp and many more. Want to take it to a higher level? There is also support for VMware, XenServer, RHEV and more. There is little Ansible cannot build for you.

Build our network topology with Ansible

I will show you how to build a leaf-spine topology using Ansible. If you want to know more about leaf-spine network topologies, please refer to this article. In short: leafs are access switches connected to the spines using layer-3 routing (OSPF/BGP).
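The full mesh between the layers is what gives the fabric its redundancy: every leaf has one link to every spine. A quick illustration (device names match the inventory used in this post):

```shell
# 2 leafs x 2 spines = 4 point-to-point links in the fabric
links=""
for leaf in leaf01 leaf02; do
  for spine in spine01 spine02; do
    links="$links $leaf-$spine"
  done
done
echo "fabric links:$links"
```

Losing any single link or spine still leaves every leaf reachable over the remaining paths.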

Ansible Hosts file

In order to manage our entire infrastructure in one place, we will create a hosts file with groups (spines, leafs, servers) and children objects (the actual devices). For now I use VyOS as switches, but this can of course be any Cisco, HP or Juniper switch.

[leafs]
leaf01 ansible_host= ansible_network_os=vyos
leaf02 ansible_host= ansible_network_os=vyos

[spines]
spine01 ansible_host= ansible_network_os=vyos
spine02 ansible_host= ansible_network_os=vyos

[servers]
server01 ansible_host=
server02 ansible_host=

[infrastructure:children]
leafs
spines



In the above example you can see I have two leaf switches that I want to connect to my two spine switches. I grouped them under the two host categories and then created a new category "infrastructure" linking them together. With that setup I can run tasks on either a set of leafs or on both spines and leafs together. Don't forget to create a local ansible.cfg pointing to the hosts file:

[defaults]
inventory = ~/ansible-datacenter/hosts
filter_plugins = ~/ansible-datacenter/plugins/

Configuring the interfaces of leafs and spines

Let's start with the easy part: configuring all devices with interfaces in the correct subnets so they can communicate with each other. I am also giving each device a loopback address on interface lo, used for internal purposes and management. Let's create the playbook ipaddr.yml:

- hosts: infrastructure
  connection: network_cli
  vars:
    interface_data:
      leaf01:
          - { name: eth1, ipv4: }
          - { name: eth2, ipv4: }
          - { name: lo, ipv4: }
      leaf02:
          - { name: eth1, ipv4: }
          - { name: eth2, ipv4: }
          - { name: lo, ipv4: }
      spine01:
          - { name: eth1, ipv4: }
          - { name: eth2, ipv4: }
          - { name: lo, ipv4: }
      spine02:
          - { name: eth1, ipv4: }
          - { name: eth2, ipv4: }
          - { name: lo, ipv4: }
  tasks:
    - name: VyOS | Configure IPv4
      vyos_l3_interface:
        aggregate: "{{ interface_data[inventory_hostname] }}"

Notice that in this case I am using the host group 'infrastructure' because I want to set these IP addresses on all the switches (leafs and spines). This saves time, as I can now do it all from one playbook. So run it and check whether a leaf actually has the correct configuration now.
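The ipv4 values in interface_data typically come from small per-link subnets. As a sketch (the 10.0.n.0/30 scheme below is hypothetical, not the addressing used in this post), you could number each leaf-spine link like this:

```shell
# One /30 per fabric link: two usable addresses, spine side .1, leaf side .2
link=0
for leaf in leaf01 leaf02; do
  for spine in spine01 spine02; do
    echo "$leaf - $spine : 10.0.$link.0/30 (spine .1, leaf .2)"
    link=$((link + 1))
  done
done
```

A /30 (or /31) per point-to-point link keeps the fabric addressing compact and makes each link easy to identify in the routing table.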

[email protected]:~$ show interfaces
Codes: S - State, L - Link, u - Up, D - Down, A - Admin Down
Interface        IP Address                        S/L  Description
---------        ----------                        ---  -----------
eth0                    u/u
eth1                      u/u
eth2                      u/u
lo                            u/u

Looks great! So now I have two spines and two leafs that are connected and can ping each other.

[email protected]:~$ ping
PING ( 56(84) bytes of data.
64 bytes from icmp_req=1 ttl=64 time=0.288 ms
64 bytes from icmp_req=2 ttl=64 time=0.305 ms

Automated checking

Ansible is meant to automate things, so after applying the interface configuration it would be best to automatically check whether the devices are reachable. Let's add a task to the playbook that pings after applying the configuration.

    - name: VyOS | Test IPv4 connectivity
      vyos_command:
        commands:
          - "ping count 5"
          - "ping count 5"
      register: spine01_result
      when: 'inventory_hostname == "spine01"'

    - name: VyOS | Testresults Connectivity
      assert:
        that:
          - "'' in spine01_result.stdout[0]"
          - "'0% packet loss' in spine01_result.stdout[0]"
          - "'' in spine01_result.stdout[1]"
          - "'0% packet loss' in spine01_result.stdout[1]"
      when: 'inventory_hostname == "spine01"'
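The assert above simply looks for "0% packet loss" in the ping output. Here is the same check in plain shell, run against a captured sample summary line rather than a live ping (note the leading space in the pattern, so that "10% packet loss" does not accidentally match):

```shell
# Sample ping summary line (captured output; no live ping is run here)
summary="5 packets transmitted, 5 received, 0% packet loss, time 4004ms"

# The same check the Ansible assert performs: fail unless loss is 0%
if echo "$summary" | grep -q " 0% packet loss"; then
  echo "connectivity OK"
else
  echo "connectivity FAILED" >&2
  exit 1
fi
```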

So now we have spines and leafs, and everything is connected, but we still need layer-3 routing. We can use either BGP or OSPF. In this example I will use Ansible to push the OSPF configuration to VyOS and create a new "area 0". There are two ways to accomplish this: using the CLI, or pushing the config file itself. I'm going to take the easy way and push the file. So I create a template in Ansible and save it as ospf_conf.j2:

protocols {
    ospf {
        area 0 {
            {% for dict in interface_data[inventory_hostname] -%}
            {% if dict["name"] != "lo" -%}
            network {{ dict["ipv4"] | ipaddr("network") }}/{{ dict["ipv4"] | ipaddr("prefix") }}
            {% else -%}
            network {{ dict["ipv4"] }}
            {% endif -%}
            {% endfor -%}
        }
        parameters {
            {% for dict in interface_data[inventory_hostname] -%}
            {% if dict["name"] == "lo" -%}
            router-id {{ dict["ipv4"] | ipaddr("address") }}
            {% endif -%}
            {% endfor -%}
        }
    }
}

interfaces {
    {% for dict in interface_data[inventory_hostname] -%}
    {% if dict["name"] != "lo" -%}
    ethernet {{ dict["name"] }} {
        ip {
            ospf {
                network point-to-point
            }
        }
    }
    {% endif -%}
    {% endfor -%}
}
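The ipaddr("network") filter used in the template masks the host bits off an interface address. A rough shell equivalent, using a hypothetical 10.0.1.2/30 as input:

```shell
# What ipaddr("network") computes: the network address of ip/prefix
ip="10.0.1.2"; prefix=30

# Split the dotted quad into octets
IFS=. read -r o1 o2 o3 o4 <<EOF
$ip
EOF
addr=$(( (o1 << 24) | (o2 << 16) | (o3 << 8) | o4 ))
mask=$(( (0xFFFFFFFF << (32 - prefix)) & 0xFFFFFFFF ))
net=$(( addr & mask ))
network="$(( net >> 24 & 255 )).$(( net >> 16 & 255 )).$(( net >> 8 & 255 )).$(( net & 255 ))/$prefix"

echo "$network"
# -> 10.0.1.0/30
```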

What this does is add each interface range from interface_data to the OSPF networks, and add the OSPF parameters to the interfaces (eth1/eth2). Now add this task to the playbook:

  - name: push ospf configuration to vyos
    vyos_config:
      src: ./ospf_conf.j2
      save: yes

So, run the playbook. Afterwards you will see that each device has a working OSPF configuration and the leafs are now redundantly connected to each spine.

[email protected]# show protocols ospf
 area 0 {
 parameters {
[email protected]# run show ip ospf neighbor

Neighbor ID     Pri State           Dead Time Address         Interface            RXmtL RqstL DBsmL
                  1 Full/DROther      32.379s                 eth1:                    0     0     0
                  1 Full/DROther      32.369s                 eth2:                    0     0     0

[email protected]# traceroute
traceroute to (, 30 hops max, 60 byte packets
 1 (  0.320 ms  0.318 ms  0.313 ms
 2 (  0.557 ms  0.553 ms  0.584 ms
[email protected]# traceroute
traceroute to (, 30 hops max, 60 byte packets
 1 (  0.254 ms  0.249 ms  0.245 ms
 2 (  0.503 ms  0.502 ms  0.499 ms

The routing table will show you all four loopback IPs, some directly attached and some learned via OSPF:

[email protected]# run show ip route
Codes: K - kernel route, C - connected, S - static, R - RIP,
       O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
       T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
       F - PBR, f - OpenFabric,
       > - selected route, * - FIB route, q - queued route, r - rejected route

S>* [210/0] via, eth0, 19:18:47
O>* [110/10] via, eth1, 18:05:55
O>* [110/10] via, eth2, 17:20:50
O [110/0] is directly connected, lo, 18:06:16
C>* is directly connected, lo, 18:06:16
O>* [110/20] via, eth1, 17:20:50
  *                       via, eth2, 17:20:50

We did this in less than 5 minutes; remember when we had to configure this by hand? Now, if I need a new interface, I can edit my playbook, run it, and be done in 30 seconds.

Next: divide this playbook into nice roles.


Ubuntu 18.04 – Apache2 – HTTP2

Today, I'm going to install the latest Apache2 and PHP7 on an Ubuntu 18.04 server and enable the HTTP/2 protocol. To upgrade an existing Apache system to use HTTP/2, follow these simple instructions:

$ sudo -i
$ apt-get install software-properties-common
$ add-apt-repository -y ppa:ondrej/apache2
$ apt-key update
$ apt-get update

The above commands add the latest Apache2 repository to your system and update the list of available packages your system is aware of.

$ apt-get upgrade

Now your system is up to date with the latest packages. I am assuming you already have Apache and PHP running. Next, enable the HTTP/2 module in Apache:

a2enmod http2

Now we have to edit the virtual host and add this protocol to it.

<VirtualHost *:443>
 # prefer HTTP/2 over HTTP/1.1
 Protocols h2 http/1.1
</VirtualHost>

Now restart Apache and you should be good to go! Note that mod_http2 does not work with the prefork MPM; for PHP sites, switch to the event MPM with PHP-FPM.


Ansible – One role to rule them all

An Ansible role is basically another level of abstraction used to organize playbooks. Roles provide a skeleton for an independent and reusable collection of variables, tasks, templates, files, and modules which can be automatically loaded into a playbook. A playbook then becomes a collection of roles, where every role has a specific function.

For example, to install Nginx, we need to add a package repository, install the package and set up its configuration. Roles allow us to create very minimal playbooks that then look to a directory structure to determine the configuration steps they need to perform.

Role directory structure

In order for Ansible to handle roles correctly, we should build a directory structure that Ansible can find and understand. We do this by creating a roles directory in our working directory.

The directory structure for Roles looks like this:

 - files
 - handlers
 - meta
 - templates
 - tasks
 - vars

A role's directory structure consists of files, handlers, meta, templates, tasks, and vars directories. These contain all of the code that implements our configuration. We may not use all of them, so in practice we do not need to create every directory.
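As a sketch, the skeleton can be created by hand; ansible-galaxy init builds the same layout (plus a few extras such as defaults and tests). roles/nginx is the example role from this post:

```shell
# Create the standard role skeleton for the nginx role
mkdir -p roles/nginx/files \
         roles/nginx/handlers \
         roles/nginx/meta \
         roles/nginx/templates \
         roles/nginx/tasks \
         roles/nginx/vars

ls roles/nginx
```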

Ansible will automatically search for and read the file roles/nginx/tasks/main.yml. Here is that main.yml file:

- name: Installs Nginx
  apt: pkg=nginx state=installed update_cache=true
  notify:
    - Start Nginx

- name: Upload default index.php for host
  copy: src=index.php dest=/usr/share/nginx/html/ mode=0644
  register: php
  ignore_errors: True

- name: Remove index.html for host
  command: rm /usr/share/nginx/html/index.html
  when: php|success

- name: Upload default index.html for host
  copy: src=index.html dest=/usr/share/nginx/html/ mode=0644
  when: php|failed

As we can see, the file just lists the steps to be performed, which makes it read well.

We also changed how we reference external files in our configuration. Our src lines used to point at a static_files directory; that is unnecessary if we place all of our static files in the role's files subdirectory, because Ansible will find them there automatically.

Now that we have the task portion of the playbook in the tasks/main.yml file, we need to move the handlers section into a file located at handlers/main.yml.

- name: Start Nginx
  service: name=nginx state=started

Move index.html and index.php pages out of the static_files directory and put them into the roles/nginx/files directory.

So now we can create a very simple playbook with the following content:

- hosts: test_group
  roles:
    - role: nginx

Run it!

$ ansible-playbook -s test.yml

PLAY [test_group] ******************************************************************** 

GATHERING FACTS *************************************************************** 
ok: []

TASK: [nginx | Installs Nginx] ************************************************ 
ok: []

TASK: [nginx | Upload default index.php for host] ***************************** 
ok: []

TASK: [nginx | Remove index.html for host] ************************************ 
changed: []

TASK: [nginx | Upload default index.html for host] **************************** 
skipping: []

PLAY RECAP ********************************************************************              : ok=4    changed=1    unreachable=0    failed=0  

Shopware + NGINX

Shopware is a widely used professional open-source e-commerce software. Built on bleeding-edge technologies like Symfony 3, Doctrine 2 and Zend Framework, Shopware is a great platform for your next e-commerce project.

Set up the timezone and make sure all updates are done and required packages are installed:

sudo dpkg-reconfigure tzdata
sudo apt update && sudo apt upgrade -y
sudo apt install -y curl wget vim git unzip socat apt-transport-https

Install PHP and required packages

sudo apt install -y php7.0 php7.0-cli php7.0-fpm php7.0-common php7.0-mysql php7.0-curl php7.0-json php7.0-zip php7.0-gd php7.0-xml php7.0-mbstring php7.0-opcache

Install a database server (MySQL or MariaDB)

sudo apt install -y mariadb-server
sudo mysql_secure_installation
Would you like to setup VALIDATE PASSWORD plugin? N
New password: your_secure_password
Re-enter new password: your_secure_password
Remove anonymous users? [Y/n] Y
Disallow root login remotely? [Y/n] Y
Remove test database and access to it? [Y/n] Y
Reload privilege tables now? [Y/n] Y

Connect and create a user and database:

sudo mysql -u root -p
# Enter password
mysql> CREATE DATABASE dbname;
mysql> GRANT ALL ON dbname.* TO 'username' IDENTIFIED BY 'password';

Install and configure NGINX

sudo apt install -y nginx
sudo nano /etc/nginx/sites-available/shopware.conf
server {
    listen 80;
    listen 443 ssl;

    ssl_certificate /etc/letsencrypt/;
    ssl_certificate_key /etc/letsencrypt/;
    ssl_certificate /etc/letsencrypt/example.com_ecc/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/example.com_ecc/private.key;

    root /var/www/shopware;
    index shopware.php index.php;

    location / {
        try_files $uri $uri/ /shopware.php$is_args$args;
    }

    location /recovery/install {
        index index.php;
        try_files $uri /recovery/install/index.php$is_args$args;
    }

    location ~ \.php$ {
        include snippets/fastcgi-php.conf;
        fastcgi_pass unix:/var/run/php/php7.0-fpm.sock;
    }
}

sudo ln -s /etc/nginx/sites-available/shopware.conf /etc/nginx/sites-enabled
sudo systemctl reload nginx.service

Now it's time to install Shopware;

sudo mkdir -p /var/www/shopware
sudo chown -R {your_user}:{your_user} /var/www/shopware
cd /var/www/shopware
wget -O
sudo chown -R www-data:www-data /var/www/shopware

You should also raise the default PHP limits to memory_limit = 256M and upload_max_filesize = 6M.
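A sketch of that edit with sed. On Ubuntu 18.04 with php7.0-fpm the real file usually lives at /etc/php/7.0/fpm/php.ini; the commands below run against a throwaway copy so you can try them safely:

```shell
# Demonstrate the sed edit on a temporary copy of php.ini-style settings
ini=$(mktemp)
printf 'memory_limit = 128M\nupload_max_filesize = 2M\n' > "$ini"

sed -i \
  -e 's/^memory_limit = .*/memory_limit = 256M/' \
  -e 's/^upload_max_filesize = .*/upload_max_filesize = 6M/' \
  "$ini"

grep -E '^(memory_limit|upload_max_filesize)' "$ini"
# After editing the real php.ini, restart the service:
#   sudo systemctl restart php7.0-fpm
```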

Now point a browser at your server and you will see the Shopware setup wizard, ready to complete.


Ubuntu Bonding (trunk) with LACP

Linux allows us to bond multiple network interfaces into a single logical "bonded" interface, using a special kernel module named bonding.

sudo apt-get install ifenslave-2.6

Now, we have to make sure that the correct kernel module bonding is present, and loaded at boot time.
Edit /etc/modules file:

# /etc/modules: kernel modules to load at boot time.
# This file contains the names of kernel modules that should be loaded
# at boot time, one per line. Lines beginning with "#" are ignored.
# Parameters can be specified after the module name.
bonding

As you can see, we added "bonding".
Now stop the network service:

service networking stop

Load the module (or reboot server):

sudo modprobe bonding

Now edit the interfaces configuration to support bonding and LACP.

auto eth1
iface eth1 inet manual
    bond-master bond0
auto eth2
iface eth2 inet manual
    bond-master bond0
auto bond0
iface bond0 inet static
    # For jumbo frames, change mtu to 9000
    mtu 1500
    bond-miimon 100
    bond-downdelay 200 
    bond-updelay 200 
    bond-mode 4
    bond-slaves none
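The numeric bond-mode in the config maps to the bonding driver's mode names; mode 4 is the LACP mode the kernel reports as "IEEE 802.3ad". A small lookup sketch:

```shell
# Map a numeric bond-mode (as used in /etc/network/interfaces)
# to the bonding driver's mode name
bond_mode_name() {
  case "$1" in
    0) echo "balance-rr" ;;
    1) echo "active-backup" ;;
    2) echo "balance-xor" ;;
    3) echo "broadcast" ;;
    4) echo "802.3ad (LACP)" ;;
    5) echo "balance-tlb" ;;
    6) echo "balance-alb" ;;
    *) echo "unknown" ;;
  esac
}

bond_mode_name 4
# -> 802.3ad (LACP)
```

Mode 4 requires a switch configured for LACP on the member ports; the other modes work without switch cooperation.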

Now start the network service again

service networking start

Verify the bond is up:

cat /proc/net/bonding/bond0

Output should be something like:

~$ cat /proc/net/bonding/bond0
Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)
Bonding Mode: IEEE 802.3ad Dynamic link aggregation
Transmit Hash Policy: layer2 (0)
MII Status: up
MII Polling Interval (ms): 0
Up Delay (ms): 0
Down Delay (ms): 0
802.3ad info
LACP rate: slow
Min links: 0
Aggregator selection policy (ad_select): stable
Active Aggregator Info:
    Aggregator ID: 1
    Number of ports: 2
    Actor Key: 33
    Partner Key: 2
    Partner Mac Address: cc:e1:7f:2b:82:80
Slave Interface: eth1
MII Status: up
Speed: 10000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: 00:0c:29:4f:26:c5
Aggregator ID: 1
Slave queue ID: 0
Slave Interface: eth2
MII Status: up
Speed: 10000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: 00:0c:29:4f:26:cf
Aggregator ID: 1
Slave queue ID: 0
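If you want to script that verification, here is a sketch that checks the status text for LACP mode and down links. It parses a captured sample here so it runs anywhere; on the server, point it at /proc/net/bonding/bond0 instead:

```shell
# Captured sample of the bonding status (stand-in for the real file)
status="Bonding Mode: IEEE 802.3ad Dynamic link aggregation
MII Status: up
Slave Interface: eth1
MII Status: up
Slave Interface: eth2
MII Status: up"

# Fail unless the bond is actually in LACP mode
echo "$status" | grep -q "802.3ad" || { echo "bond is not in LACP mode" >&2; exit 1; }

# Count slaves and any links reported down
slaves=$(echo "$status" | grep -c "^Slave Interface:")
down=$(echo "$status" | grep -c "MII Status: down" || true)
echo "slaves: $slaves, links down: $down"
```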

Pacemaker and Corosync HA

In this guide we will build an HA failover solution using Corosync and Pacemaker, in an active/passive setup.

Installation and Setup


  • All nodes must be able to resolve each other (hosts file or DNS)
  • NTP must be installed and configured on all nodes
cat /etc/hosts
10.0.1.10   ha1 server01
            ha2 server02

We will install Pacemaker; it should pull in Corosync as a dependency. If not, install Corosync as well.

apt-get install pacemaker

Edit corosync.conf. The bind address is the network address, NOT the node's IP address. The default mcastaddr is fine.

cat /etc/corosync/corosync.conf
interface {
        # The following values need to be set based on your environment
        ringnumber: 0
        mcastport: 5405

We also want Corosync to start Pacemaker automatically; if we do not configure this, we have to start Pacemaker manually.
ver: 0 tells Corosync to start Pacemaker automatically. Setting it to 1 requires a manual start of Pacemaker!

cat /etc/corosync/corosync.conf
service {
    # Load the Pacemaker Cluster Resource Manager
    ver:       0
    name:      pacemaker

Copy/paste the content of corosync.conf, or scp the file to the second node.

scp /etc/corosync/corosync.conf

Make sure corosync starts at boot time.

cat /etc/default/corosync
# start corosync at boot [yes|no]
START=yes

Start corosync

/etc/init.d/corosync start

Check the status of the cluster with crm status:

Last updated: Fri Jun  9 11:02:55 2017          Last change: Wed Jun  7 14:26:06 2017 by root via cibadmin on server01
Stack: corosync
Current DC: server01 (version 1.1.14-70404b0) - partition with quorum
2 Nodes configured, 2 expected votes
0 Resources configured.
Online: [ server01 ]

Copy the config file to the second node

scp /etc/corosync/corosync.conf server02:/etc/corosync/

Now on the second node, try to start corosync

/etc/init.d/corosync start

Check the status again. We should now hopefully see the second node joining. If this fails, check the firewall settings and the hosts file (the nodes must be able to resolve each other).

We are getting some warnings. Use the following commands to disable STONITH and to ignore loss of quorum (acceptable for a two-node test setup):

crm configure property stonith-enabled=false
sudo crm configure property no-quorum-policy=ignore
crm_verify -L

Now add a virtual IP to the cluster.

crm configure primitive VIP ocf:heartbeat:IPaddr2 params ip= nic=eth0 op monitor interval=10s

We have now added a VIP/floating IP; we can test it with a simple ping. It should respond no matter which node currently holds it.

Adding Resources: Services

Now we are ready to add a service to our cluster. In this example we use the Postfix service (SMTP) that we want to fail over. Postfix must be installed on both nodes.

crm configure primitive HA-postfix lsb:postfix op monitor interval=15s

Check the status.

crm status

As we have not linked the IP to the service yet, Postfix could be running on server02 while the IP is on server01. We need to put them both in one HA group.

crm configure group HA-Group VIP HA-postfix

If we check the status again, we can see that the two resources are now running on the same server.

Online: [ server01 server02 ]
 Resource Group: HA-Group
     VIP    (ocf::heartbeat:IPaddr2):   Started server01
     HA-postfix (lsb:postfix):  Started server01

Looks good !

If a resource fails for some reason (say Postfix crashes and cannot start again), we want it to migrate to another server.
By default migration-threshold is not defined, which is treated as infinity: the resource will never migrate.
With the configuration below, after 3 failures the resource migrates to the other node, and the failure count expires after 60 seconds, which allows it to move back to this node automatically.

primitive HA-postfix lsb:postfix \
        op monitor interval="15s" \
        meta target-role="Started" migration-threshold="3" failure-timeout=60s

Now we are DONE!

Some extra commands that might be useful when managing the cluster:

Deleting a resource

crm resource stop HA-XXXX
crm configure delete HA-XXXX

Where HA-XXXX is the name of the resource.

Migrate / Move Resource

crm_resource --resource HA-Group --move --node server02

View configuration

crm configure show

View status and fail counts

crm_mon -1 --fail

Configure FC Multipath on Debian (HP EVA)

This is a detailed how-to guide to high availability and performance on Debian/Ubuntu with a dual FC HBA (Brocade) and shared storage on an HP EVA6300. Tested on Debian Linux 5.x and 6.x (64-bit) running on HP ProLiant DL360 and DL380 models, with 8Gb FC Host Bus Adapters from Brocade.

Install the software we need

# apt-get install multipath-tools-boot multipath-tools firmware-qlogic sysfsutils lsscsi
# reboot

Verifying that the correct Linux kernel module was loaded

[email protected]:~# cat /var/log/dmesg | grep Brocade
[ 11.584057] Brocade BFA FC/FCOE SCSI driver - version:
[ 11.654052] scsi1 : Brocade FC/FCOE Adapter, hwpath: 0000:0a:00.0 driver:
[ 12.011790] scsi4 : Brocade FC/FCOE Adapter, hwpath: 0000:0a:00.1 driver:
[email protected]:~# cat /var/log/dmesg | grep scsi
[ 11.550599] scsi0 : hpsa
[ 11.558223] scsi 0:0:0:0: RAID HP P420i 3.54 PQ: 0 ANSI: 5
[email protected]:~# modinfo bfa
filename: /lib/modules/3.2.0-4-amd64/kernel/drivers/scsi/bfa/bfa.ko
author: Brocade Communications Systems, Inc.
description: Brocade Fibre Channel HBA Driver fcpim

Create the /etc/multipath.conf for the HP EVA storage

First we need to find out the correct wwid:
As multipath is not yet correctly configured, the command below will show "undef" for some paths, as in the example below. What we need is the wwid shown between parentheses.

[email protected]:~# multipath -ll
fc_storage (3600143801259ba3a0000b00001650000) dm-1 HP,HSV340
size=2.0T features='1 queue_if_no_path' hwhandler='0' wp=rw
|-+- policy='round-robin 0' prio=1 status=active
| `- 1:0:0:1 sdb 8:16 active ready running
|-+- policy='round-robin 0' prio=1 status=enabled
| `- 1:0:1:1 sdc 8:32 active ready running
|-+- policy='round-robin 0' prio=1 status=enabled
| `- 4:0:0:1 sdd 8:48 active ready running
`-+- policy='round-robin 0' prio=1 status=enabled
  `- 4:0:1:1 sde 8:64 active ready running

Mind the wwid (3600…..)
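A small sketch to pull that wwid out of the multipath -ll output with sed; a captured sample line stands in for the live command here:

```shell
# First line of "multipath -ll" (captured sample)
sample="fc_storage (3600143801259ba3a0000b00001650000) dm-1 HP,HSV340"

# Extract the hex wwid between the parentheses
wwid=$(echo "$sample" | sed -n 's/.*(\([0-9a-f]*\)).*/\1/p')
echo "$wwid"
# -> 3600143801259ba3a0000b00001650000
```

The extracted value is exactly what goes into the wwid line of the multipaths section below.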

# Multipath.conf file for HP EVA system
# Version 1.02
# Storage node: HP EVA
# Connection: Dual 8GB FC
defaults {
    polling_interval    30
    failback            immediate
    no_path_retry       5
    rr_min_io           100
    path_checker        tur
    user_friendly_names yes
}
devices {
# These are the default settings for P6300 (HP EVA)
    device {
        vendor                   "HP"
        product                  "HSV340"
        path_grouping_policy     group_by_prio
    }
}
multipaths {
        multipath {
                wwid                    3600143801259ba3a0000b00001650000
                alias                   fc_storage
                path_grouping_policy    failover
                path_selector           "round-robin 0"
        }
}