Install Fluentd with Ansible

Fluentd has been a popular open source log aggregation framework for a while, so I’ll give it a spin with Ansible. There are quite a few existing Ansible playbooks for installing Fluentd, but I would like to do it from scratch to understand how it works.

From the installation guide page, I can grab the script and dependencies and then translate them into Ansible tasks:

# roles/fluentd-collector/tasks/install-xenial.yml
- name: install os packages
  apt:
    name: '{{ item }}'
    state: latest
  with_items:
    - libcurl4-gnutls-dev
    - build-essential

- name: install fluentd on debian/ubuntu
  raw: "curl -L | sh"

Then it can be included by the main task:

# roles/fluentd-collector/tasks/main.yml
# (incomplete)
- include: install-xenial.yml
  when: ansible_os_family == 'Debian'

On the log-collecting end, I need to configure /etc/td-agent/td-agent.conf so that Fluentd (the stable release is called td-agent) receives syslog, tails other logs and then forwards the data to the central collector. Here’s a sample configuration with Jinja2 template placeholders:

<match *.**>
  type forward
  phi_threshold 100
  hard_timeout 60s
  <server>
    name mycollector
    host {{ fluent_server_ip }}
    port {{ fluent_server_port }}
    weight 10
  </server>
</match>

<source>
  type syslog
  port 42185
  tag {{ inventory_hostname }}.system
</source>

{% for tail in fluentd.tails %}
<source>
  type tail
  format {{ tail.format }}
  time_format {{ tail.time_format }}
  path {{ tail.file }}
  pos_file /var/log/td-agent/pos.{{ }}
  tag {{ inventory_hostname }}.{{ }}
</source>
{% endfor %}
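The template above expects a few variables to be defined somewhere in the inventory. As a rough sketch only, they could look like the following; the file location, the IP address and the nginx log entry are all assumptions made for illustration, and whatever per-entry key fills the two empty `{{ }}` expressions in the template would have to be added to each item as well.

```yaml
# group_vars/all.yml (hypothetical location; all values are illustrative)
fluent_server_ip: 10.0.0.10      # assumed address of the central collector
fluent_server_port: 24224        # Fluentd's default forward port

fluentd:
  tails:
    # One entry per log file to tail. The keys match the
    # tail.file / tail.format / tail.time_format lookups in the template.
    - file: /var/log/nginx/access.log
      format: nginx
      time_format: '%d/%b/%Y:%H:%M:%S %z'
```

Keeping the tailed files in a list like this means adding another log is a one-entry change in the inventory rather than an edit to the template.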

At the aggregator’s end, a sample configuration can look like:

<source>
  type forward
  port {{ fluentd_server_port }}
</source>

<match *.**>
  @type elasticsearch
  logstash_format true
  flush_interval 10s
  index_name fluentd
  type_name fluentd
  include_tag_key true
  user {{ es_user }}
  password {{ es_pass }}
</match>

With this in place, td-agent can aggregate logs from all its peers and forward them to Elasticsearch in Logstash format.
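To get either configuration onto a host, the role also needs a task that renders the template and restarts the service when it changes. The post doesn't show that part, so the following is only a minimal sketch: the file name configure.yml, the template name td-agent.conf.j2 and the handler name are assumptions.

```yaml
# roles/fluentd-collector/tasks/configure.yml (hypothetical file name)
- name: deploy td-agent configuration
  template:
    src: td-agent.conf.j2              # assumed template name in the role's templates/ dir
    dest: /etc/td-agent/td-agent.conf
    owner: root
    group: root
    mode: '0644'
  notify: restart td-agent             # only fires when the rendered file changes

# roles/fluentd-collector/handlers/main.yml
- name: restart td-agent
  service:
    name: td-agent
    state: restarted
```

Using a handler rather than an unconditional restart keeps repeated playbook runs from bouncing the agent when nothing has changed.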


The Burnout Effect

Back in October 2015 I got an offer from a big data startup, and after 1 year and 4 months I decided to move on.

There was a 3D printer and a drone in the office, and the team was talking about Fallout 4 that morning because it had just been released. I thought the company and the team were very cool, and I still think so now.

My first challenge was to migrate a self-hosted MySQL database to AWS Aurora, because the MySQL server was overstretched and felt like it could collapse at any time. I was quite experienced with MySQL, so I didn’t think that would be hard. However, there were some complications: the MySQL server was a huge VM managed by a Ganeti cluster and backed by DRBD volumes, and best of all, the disks were old-school magnetic SAS disks.

The DB migration tool recommended by AWS failed randomly on large tables (~400GB). Doing a full mysqldump (~1TB) and then setting up replication to Aurora wasn’t acceptable because it would cause a huge downtime. And since there’s no access to Aurora’s underlying system, using an LVM snapshot was out too. I thought there must be a way, so I created a MySQL replica in the same cluster using an LVM snapshot, then set up replication between the replica and the master.

Once replication between the master and the replica was running and verified, I could pause it, run mysqldump on the replica, and then set up a second replication between the replica and Aurora. After the replica had caught up with the master, we did the DNS switch-over; the apps hardly felt a thing and started writing to Aurora. That was the first success.

I was involved in several big projects, such as the migration of DNS to Route 53, the migration of core servers (about 40) to AWS and the migration of the data warehouse from AWS Redshift to Google BigQuery in 2016.

I thought the job would become more comfortable since a large portion of the infrastructure had been rebuilt. However, a few months ago I started to sleep poorly and to have trouble concentrating during the day. I searched for an answer and, to my surprise, found that a lot of people share the same problem, which is called burnout. So rather than wait to be asked to leave for poor performance, I chose to take a break and search for a new job.

On the last day of the job, we played Rocket League together and those were my best hours. I felt super relieved, yet very sad to leave the team. Thanks to the team, especially Trist and Adam, from whom I learned a lot.


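Home and End Keys on a Mac

Companies generally offer you one of two kinds of laptop: Windows 10 on a Lenovo/Dell/HP, or OS X on a MacBook Pro. I got the latter, and a very nice one too: a top-spec 2015 MBP. But some of the settings on a Mac make no sense at all. Take the Home and End keys: OS X defines them as [top of page] and [bottom of page], whereas on Linux Home and End default to [beginning of line] and [end of line]. I don’t know how other people actually use them, but for me [beginning of line] and [end of line] are needed far more often, for example on the command line 😀

Here is how to redefine Home/End on OS X (done in Terminal):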

mkdir -p ~/Library/KeyBindings
# Quote the heredoc delimiter so the shell leaves "$" and "\" untouched
cat <<'EOF' > ~/Library/KeyBindings/DefaultKeyBinding.dict
{
    /* Remap Home / End keys to be correct */
    "\UF729" = "moveToBeginningOfLine:"; /* Home */
    "\UF72B" = "moveToEndOfLine:"; /* End */
    "$\UF729" = "moveToBeginningOfLineAndModifySelection:"; /* Shift + Home */
    "$\UF72B" = "moveToEndOfLineAndModifySelection:"; /* Shift + End */
    "^\UF729" = "moveToBeginningOfDocument:"; /* Ctrl + Home */
    "^\UF72B" = "moveToEndOfDocument:"; /* Ctrl + End */
    "$^\UF729" = "moveToBeginningOfDocumentAndModifySelection:"; /* Shift + Ctrl + Home */
    "$^\UF72B" = "moveToEndOfDocumentAndModifySelection:"; /* Shift + Ctrl + End */
}
EOF