Ansible, CloudFormation and Jinja2

CloudFormation is the cornerstone of provisioning AWS infrastructure as code, but it's not very DRY: modularization is poor, and variables and templates are almost static. This is where Ansible comes in.

However, at the moment Ansible's cloudformation module doesn't render Jinja2 in templates the way other modules do. Luckily there's a workaround to get the Ansible-CloudFormation-Jinja2 trio working together.

A simple CloudFormation snippet with a Jinja2 variable:

# roles/test-stack/templates/mycf.yaml.j2
...
Parameters:
  VpcId:
    Type: String
    Default: '{{ template_params.vpc_id }}'
...

I don't use the Jinja2 variable to replace the !Ref intrinsic function directly; injecting it as a parameter default is a softer approach, so any parameter validation in the template still works.
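
For example, the parameter can still carry validation and be consumed with !Ref as usual. The AllowedPattern and the subnet resource below are just illustrations of mine, not part of the original template:

# hypothetical continuation of mycf.yaml.j2
Parameters:
  VpcId:
    Type: String
    Default: '{{ template_params.vpc_id }}'
    AllowedPattern: '^vpc-[0-9a-f]+$'
Resources:
  TestSubnet:
    Type: AWS::EC2::Subnet
    Properties:
      VpcId: !Ref VpcId  # value injected by Jinja2, then validated by CloudFormation
      CidrBlock: 10.0.0.0/24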

Then the template can be loaded into Ansible's cloudformation module with the template lookup, which renders the Jinja2 on the control machine before the body is handed to AWS:

# roles/test-stack/tasks/main.yaml
- name: create a stack
  cloudformation:
    stack_name: '{{ stack_names.test_stack }}'
    state: present
    template_body: "{{ lookup('template', 'templates/mycf.yaml.j2') }}"
    tags: '{{ template_tags }}'
...

The inventory may look like this:

# inventory/local
all:
  hosts:
    dev:
      ansible_connection: local

All parameters can go into the global variable file group_vars/all or the host variable file host_vars/dev, so that later I can create a separate set of parameters for production.

# group_vars/all
stack_names:
  test_stack: test1
...
# host_vars/dev
template_params:
  vpc_id: vpc-xxxxxxxxxxxxxxx

template_tags:
  app_env: dev
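
A production parameter set can then live in its own host vars file, with a matching prod host added to the inventory; the values here are hypothetical:

# host_vars/prod (hypothetical)
template_params:
  vpc_id: vpc-yyyyyyyyyyyyyyy

template_tags:
  app_env: prod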

And the playbook can be as simple as this (gather_facts is a play keyword rather than an inventory variable, so it belongs here):

# deploy.yaml
- name: deploy stacks
  hosts: dev
  gather_facts: false
  tags: dev
  roles:
    - test-stack

Finally, this CloudFormation stack can be deployed by running:

$ ansible-playbook -i inventory/local deploy.yaml --tags dev

🙂

Ansible, Packer and Configurations

I use Ansible as a provisioner for Packer to build AMIs that serve as the base image of our development environment. When Ansible is invoked by Packer, it's not obvious whether it uses the same ansible.cfg that applies when I run the ansible-playbook command in a terminal.

Here's how to make sure Ansible in a Packer session will use the correct ansible.cfg file.

First, an environment variable is supplied in Packer's template, because the ANSIBLE_CONFIG environment variable takes precedence over every other location where Ansible searches for its configuration:

    {
      "type": "ansible",
      "playbook_file": "../../ansible/apps.yml",
      "ansible_env_vars": ["ANSIBLE_CONFIG=/tmp/ansible.cfg"],
      "extra_arguments": [
        "--skip-tags=packer-skip",
        "-vvv"
      ]
    },

The line with "ANSIBLE_CONFIG=/tmp/ansible.cfg" tells Ansible to use /tmp/ansible.cfg.
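
Note that the ansible provisioner runs ansible-playbook on the machine where Packer itself runs, so /tmp/ansible.cfg must exist there. As a sketch, the file could contain something minimal like this (the settings are illustrative, not required):

# /tmp/ansible.cfg (illustrative)
[defaults]
host_key_checking = False
retry_files_enabled = False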

With the ansible.cfg at /tmp and the extra verbosity switch -vvv, I can check the output to confirm the config file is picked up.
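
If everything is wired up correctly, the top of the verbose Ansible output should contain a line along these lines:

Using /tmp/ansible.cfg as config file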


Install Fluentd with Ansible

Fluentd has been a popular open source log aggregation framework for a while now. I'll give it a spin with Ansible. There are quite a few existing Ansible playbooks out there that install Fluentd, but I'd like to do it from scratch just to understand how it works.

From the installation guide page, I can grab the script and dependencies and then translate them into Ansible tasks:

---
# roles/fluentd-collector/tasks/install-xenial.yml
- name: install os packages
  package:
    name: '{{ item }}'
    state: latest
  with_items:
    - libcurl4-gnutls-dev
    - build-essential

- name: install fluentd on debian/ubuntu
  raw: "curl -L https://toolbelt.treasuredata.com/sh/install-ubuntu-xenial-td-agent2.sh | sh"

Then it can be included by the main task:

# roles/fluentd-collector/tasks/main.yml
# (incomplete)
- include: install-xenial.yml
  when: ansible_os_family == 'Debian'
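
Beyond installation, the role also needs to render the td-agent configuration discussed below and restart the service when it changes. A minimal sketch, with the template and handler names being my assumptions:

# roles/fluentd-collector/tasks/main.yml (continued sketch)
- name: deploy td-agent configuration
  template:
    src: td-agent.conf.j2
    dest: /etc/td-agent/td-agent.conf
  notify: restart td-agent

# roles/fluentd-collector/handlers/main.yml
- name: restart td-agent
  service:
    name: td-agent
    state: restarted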

On the log-collecting end, I need to configure /etc/td-agent/td-agent.conf so that fluentd (the stable release is called td-agent) receives syslog, tails other logs, and then forwards the data to the central collector. Here's some sample configuration with Jinja2 template placeholders:

<match *.**>
  @type forward
  phi_threshold 100
  hard_timeout 60s
  <server>
    name mycollector
    host {{ fluentd_server_ip }}
    port {{ fluentd_server_port }}
    weight 10
  </server>
</match>

<source>
  @type syslog
  port 42185
  tag {{ inventory_hostname }}.system
</source>

{% for tail in fluentd.tails %}
<source>
  @type tail
  format {{ tail.format }}
  time_format {{ tail.time_format }}
  path {{ tail.file }}
  pos_file /var/log/td-agent/pos.{{ tail.name }}
  tag {{ inventory_hostname }}.{{ tail.name }}
</source>
{% endfor %}
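
The fluentd.tails list that the loop iterates over would be defined alongside the other host variables; a hypothetical entry tailing an nginx access log:

# host_vars/dev (hypothetical)
fluentd:
  tails:
    - name: nginx_access
      file: /var/log/nginx/access.log
      format: nginx
      time_format: '%d/%b/%Y:%H:%M:%S %z'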

On the aggregator's end, a sample configuration can look like this:

<source>
  @type forward
  port {{ fluentd_server_port }}
</source>

<match *.**>
  @type elasticsearch
  logstash_format true
  flush_interval 10s
  index_name fluentd
  type_name fluentd
  include_tag_key true
  user {{ es_user }}
  password {{ es_pass }}
</match>

Then fluentd/td-agent can aggregate logs from all peers and forward them to Elasticsearch in Logstash format.
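
One caveat from my side: the elasticsearch output is a plugin rather than part of td-agent's core, so if it isn't bundled with the installed build it can be added with td-agent's own gem command:

$ sudo td-agent-gem install fluent-plugin-elasticsearch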

🙂

Distribute cron jobs to hours/minutes with Ansible

This is a handy trick to run a batch of cron jobs at different hour/minute combinations so they won't all fire at once and put pressure on the server.

The key is to use `with_indexed_items` and Jinja2 math:

- name: ansible daily cronjob {{ item.1 }}
  cron:
    user: ansible
    name: ansible-daily-{{ item.1 }}
    hour: '{{ item.0 % 24 }}'
    minute: '{{ (item.0 * 5) % 60 }}'
    job: /usr/local/run.sh {{ item.1 }}
  with_indexed_items:
    - job_foo
    - job_bar

So `job_foo` will run at hour 0, minute 0, `job_bar` at hour 1, minute 5, and so on…
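
The resulting crontab for the ansible user would look roughly like this (the #Ansible comment lines are how the cron module tracks its entries):

#Ansible: ansible-daily-job_foo
0 0 * * * /usr/local/run.sh job_foo
#Ansible: ansible-daily-job_bar
5 1 * * * /usr/local/run.sh job_bar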

🙂