Kubernetes Log Aggregation with Filebeat and Logstash

Following last blog, Filebeat is very easy to setup however it doesn’t do log pattern matching, guess I’ll need Logstash after all.

First is to install Logstash of course. To tell Filebeat to feed to Logstash instead of Elasticsearch is straightforward, here’s some configuration snippets:

Filebeat K8s configMap:

---
apiVersion: v1
kind: ConfigMap
metadata:
  name: filebeat-config
  namespace: kube-system
  labels:
    k8s-app: filebeat
    kubernetes.io/cluster-service: "true"
data:
  filebeat.yml: |-
  filebeat.config:
 
  ...
  # replace output.elasticsearch with this
  output.logstash:
    hosts: ['${LOGSTASH_HOST:logstash}:${LOGSTASH_PORT:5044}']

Sample Logstash configuration:

input {
  beats {
    port => "5044"
  }
}
filter {
  grok {
    match => { "message" => "%{COMBINEDAPACHELOG}"}
  }
}
output {
  elasticsearch {
    hosts => [ "localhost:9200" ]
    index => "%{[@metadata][beat]}-%{[@metadata][version]}-%{+YYYY.MM.dd}"
  }
}

COMBINEDAPACHELOG is the standard apache log format(as well as nginx’s). By using this predefined log format, values like request URI or referrer URL will be available as fields in Elastisearch.

🙂

Kubernetes Cluster Log Aggregation with Filebeat

Finally the Kubernetes cluster I was working on went live, and I didn’t provide a log aggregation solution yet. I had a look at dynaTrace, which is a paid SaaS. However it requires to install some agent in every container. It’s fun when there’s only several to play with but I wouldn’t rebuild dozens of docker containers just to get logs out.

Luckily enough I found Filebeat from Elastic which can be installed as a DaemonSet in a Kubernetes cluster and then pipe all logs to Elasticsearch and I already have an Elasticsearch cluster running so why not. The installation is quite easy following this guide:

1, Download the manifest

2, The only configuration needs to be changed are:

 env:
   - name: ELASTICSEARCH_HOST
     value: 10.1.1.10
   - name: ELASTICSEARCH_PORT
     value: "9200"
   - name: ELASTICSEARCH_USERNAME
     value: elastic
   - name: ELASTICSEARCH_PASSWORD
     value: changeme

Then load it to the kubernetes cluster:

kubectl apply -f filebeat.yaml

3, If the docker containers running in the cluster already logging to stdout/stderr, you should see logs flowing into Elasticsearch, otherwise check Filebeat logs in Kubernetes dashboard(it’s in kube-system name space).

4, Make sure to create an index for filebeat in Kibana, usually filebeat-*

That’s about it 🙂

Kubernetes External Service with HTTPS

This is a quick example to assign an SSL certificate to a Kubernetes external service(which is an ELB in AWS). Tested with kops 1.8 and kubernetes 1.8.

---
apiVersion: v1
kind: Service
metadata:
 name: my-https-service
 namespace: my-project
 labels:
   app: my-website-ssl
 annotations:
   service.beta.kubernetes.io/aws-load-balancer-ssl-cert: "arn:aws:acm:ap-southeast-2:xxx:certificate/xxx..."
   service.beta.kubernetes.io/aws-load-balancer-backend-protocol: "http"
   service.beta.kubernetes.io/aws-load-balancer-ssl-ports: "https"
   service.beta.kubernetes.io/aws-load-balancer-connection-idle-timeout: '3600'
spec:
 type: LoadBalancer
 selector:
   app: my-website
 ports:
   - name: http
     port: 80
     targetPort: 80
   - name: https
     port: 443
     targetPort: 80

🙂

Get access to a container in Kubernetes cluster

With Kubernetes(K8s), there’s no need to do ssh [email protected] anymore since everything is running as containers. There are still occasions when I need shell access to a container to do some troubleshooting.

With Docker I can do

docker exec -ti <container_id> /bin/bash

It’s quite similar in K8s

kubectl exec -ti <container_id> -- /bin/bash

However in K8s containers have random IDs so I need to know the container ID first

kubectl get pods

Then I can grab the container ID and do the `kubectl exec` command. This is hard to automate because picking up the expected container ID using `grep` and `awk` commands can fail if the matching condition is too strict.

Given a deployment like this

---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
 name: my-app-deploy
spec:
 replicas: 2
 template:
   metadata:
     labels:
       app: my-app
...

the K8s way to query the container will be

kubectl get pods --selector=app=my-app -o jsonpath='{.items[0].metadata.name}'

A chained one-liner could be

kubectl exec -ti $(kubectl get pods --selector=app=my-app -o jsonpath='{.items[0].metadata.name}') -- /bin/bash

This doesn’t check for errors, eg. if no container matching `app=my-app` was found, but a better script can be easily crafted from here.

🙂