Use Fluentd and Elasticsearch to Analyse Squid Proxy Traffic

TL;DR This is a quick guide to set up Fluentd + Elasticsearch integration to analyse Squid Proxy traffic. In the example below Fluentd td-agent is installed in the same host as Squid Proxy and Elasticsearch is installed in the other host. The OS is Ubuntu 20.04.

Useful links:
– Fluentd installation:
– Elasticsearch installation:

The logs of Squid need to be accessible by td-agent, it can be done by adding td-agent user to the proxy group:

$ sudo usermod --groups proxy -a td-agent

The configuration for td-agent looks like

  @type tail
  @id squid_tail
    @type regexp
    expression /^(?<timestamp>[0-9]+)[\.0-9]* +(?<elapsed>[0-9]+) (?<userIP>[0-9\.]+) (?<action>[A-Z_]+)\/(?<statusCode>[0-9]+) (?<size>[0-9]+) (?<method>[A-Z]+) (?<URL>[^ ]+) (?<rfc931>[^ ]+) (?<peerStatus>[^ ]+)/(?<peerIP>[^ ]+) (?<mime>[^ ]+)/
    time_key timestamp
    time_format %s
  path /var/log/squid/access.log
  tag squid.access

The key is to get the regex expression to fit the Squid access log, which looks like

1598101487.920 240256 TCP_TUNNEL/200 1562 CONNECT - HIER_DIRECT/ -

Then I can use the fields defined in the regex, such as userIP or URL in Elasticsearch for queries.