Linux and Wake on LAN

Internet servers are usually on 24×7, which is probably why I never needed the Wake on LAN (WoL) feature on a computer before.

I’ve just built a home server running Ubuntu Linux, using consumer-grade PC parts. To avoid a big surge on my next electricity bill, I plan to turn the server on only when the sun is shining, or during off-peak hours when electricity is cheaper. Shutting down a Linux server via SSH is trivial; to my surprise, turning one on remotely using WoL is not much harder.

First, on the server, make sure the line `ethernet-wol g` exists under the interface stanza in /etc/network/interfaces (the option is handled by a hook shipped with the `ethtool` package, so make sure that is installed). e.g.

auto enp0s31f6
iface enp0s31f6 inet static
ethernet-wol g

Save the file and restart, then run `sudo ethtool enp0s31f6`; if the following line appears in the output then it's a success!

Wake-on: g

The next step is to turn on WoL in the BIOS. Different BIOS vendors may call it by different names, but generally it's the setting that allows the system to be powered on by PCI/network devices.

On Arch Linux (the machine that sends the wake-up signal), install `wol`, the command that wakes up a WoL-enabled computer:

sudo pacman -Sy wol
sudo wol <MAC ADDRESS>
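That second command is all the magic there is: `wol` broadcasts a UDP "magic packet" consisting of six 0xFF bytes followed by the target's MAC address repeated 16 times. As an illustration of what it does (a minimal sketch with my own function names, not how `wol` is actually implemented), here's the same thing in Python:

```python
import socket

def make_magic_packet(mac: str) -> bytes:
    """Build a WoL magic packet: 6 x 0xFF, then the MAC repeated 16 times."""
    mac_bytes = bytes.fromhex(mac.replace(":", "").replace("-", ""))
    if len(mac_bytes) != 6:
        raise ValueError("expected a 6-byte MAC address")
    return b"\xff" * 6 + mac_bytes * 16

def send_magic_packet(mac: str, broadcast: str = "255.255.255.255", port: int = 9) -> None:
    """Broadcast the magic packet over UDP on the local network."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        s.sendto(make_magic_packet(mac), (broadcast, port))

# send_magic_packet("aa:bb:cc:dd:ee:ff")  # placeholder MAC
```

Any machine on the same broadcast domain can send this; port 9 is conventional, but the port number barely matters since the NIC only pattern-matches the payload.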


That’s it 🙂

Install Fluentd with Ansible

Fluentd has been a popular open-source log-aggregation framework for a while now. I'll give it a spin with Ansible. There are quite a few existing Ansible playbooks for installing Fluentd out there, but I'd like to do it from scratch just to understand how it works.

From the installation guide page, I can grab the script and dependencies and then translate them into Ansible tasks:

# roles/fluentd-collector/tasks/install-debian.yml
- name: install os packages
  apt:
    name: '{{ item }}'
    state: latest
  with_items:
    - libcurl4-gnutls-dev
    - build-essential

- name: install fluentd on debian/ubuntu
  raw: "curl -L | sh"

Then it can be included by the main task:

# roles/fluentd-collector/tasks/main.yml
# (incomplete)
- include: install-debian.yml
  when: ansible_os_family == 'Debian'

On the log-collecting end, I need to configure /etc/td-agent/td-agent.conf to let fluentd (the stable release is called td-agent) receive syslog, tail other logs, and then forward the data to the central collector. Here's some sample configuration with Jinja2 template placeholders:

<match *.**>
  type forward
  phi_threshold 100
  hard_timeout 60s
  <server>
    name mycollector
    host {{ fluent_server_ip }}
    port {{ fluent_server_port }}
    weight 10
  </server>
</match>

<source>
  type syslog
  port 42185
  tag {{ inventory_hostname }}.system
</source>

{% for tail in fluentd.tails %}
<source>
  type tail
  format {{ tail.format }}
  time_format {{ tail.time_format }}
  path {{ tail.file }}
  pos_file /var/log/td-agent/pos.{{ }}
  tag {{ inventory_hostname }}.{{ }}
</source>
{% endfor %}

At the aggregator’s end, a sample configuration can look like:

<source>
  type forward
  port {{ fluentd_server_port }}
</source>
<match *.**>
  @type elasticsearch
  logstash_format true
  flush_interval 10s
  index_name fluentd
  type_name fluentd
  include_tag_key true
  user {{ es_user }}
  password {{ es_pass }}
</match>
Then fluentd/td-agent can aggregate all the logs from its peers and forward them to Elasticsearch in Logstash format.


The Burnout Effect

Back in October 2015 I got an offer from a big data startup, and after 1 year and 4 months I decided to move on.

There were a 3D printer and a drone in the office, and the team was talking about Fallout 4 in the morning because it had just been released. I thought the company and the team were very cool, and I still think so now.

My first challenge was to migrate a self-hosted MySQL database to AWS Aurora, because the MySQL server was overstretched and felt like it could collapse at any moment. I was quite experienced with MySQL, so I didn't think it would be hard. However, there were some complications: the MySQL server was a huge VM managed by a Ganeti cluster and backed by DRBD volumes, and best of all, the disks were old-school magnetic SAS disks.

The DB migration tool recommended by AWS just failed randomly on large tables (~400 GB). Doing a full mysqldump (~1 TB) and setting up replication to Aurora wasn't acceptable either, because that would cause huge downtime. And since there's no access to Aurora's underlying system, the option of using an LVM snapshot was out too. I figured there must be a way, so I created a MySQL replica in the same cluster from an LVM snapshot, then set up replication between the replica and the master.

Once replication between the master and the replica was running and verified, I could pause the replication, run mysqldump on the replica, and then set up another replication stream between the replica and Aurora. After the replica caught up with the master, we did the DNS switch-over; the apps almost didn't feel a thing and started writing to Aurora. That concluded the first success.

In 2016 I was involved in several big projects, such as the migration of DNS to Route 53, the migration of about 40 core servers to AWS, and the migration of our data warehouse from AWS Redshift to Google BigQuery.

I thought the job would become more comfortable once a large portion of the infrastructure had been rebuilt. However, a few months ago I started to sleep poorly and to struggle to concentrate during the day. I searched for an answer and, to my surprise, found a lot of people sharing the same issue, which is called burnout. So rather than waiting to be asked to leave for poor performance, I chose to take a break and look for a new job.

On the last day of the job, we played Rocket League together, and those were my best hours. I felt super relieved, yet very sad to leave the team. Thanks to the team, especially Trist and Adam, from whom I learned a lot.