
Kibana “Hello World” Example – Part 3 of the ELK Stack Series


kibana-logo-color-v

Today, we will introduce Kibana, an open-source data visualization tool. As part of Elastic's ELK stack (now called the Elastic Stack), Kibana is often used to visualize logging statistics and to manage the Elastic Stack. In this tutorial, however, we will analyze statistical data from Twitter by comparing the popularity of Trump vs. Obama vs. Clinton.

For that, we will attach Logstash to the Twitter API and feed the data to Elasticsearch. Kibana will visualize the Elasticsearch data in a pie chart and a date/time histogram. At the end, we will see that Trump wins this little tweet count contest by far: he is mentioned in tweets about 20 times as often as Obama and Clinton together:


2016-11-20-18_24_06-popularity-of-us-politicians-kibana

But now let us get back to the technology topics.

This is the third blog post of a series about the Elastic Stack (a.k.a. ELK stack):

What is Kibana?

Kibana is a tool for visualizing logging statistics stored in the Elasticsearch database. Statistical graphs like histograms, line graphs, pie charts, and sunbursts are core capabilities of Kibana.

kibana-basics

In addition, together with Logstash's and Elasticsearch's capabilities, Kibana can visualize statistical data on a geographical map. And with tools like Timelion and Graph, an administrator can analyze time series and relationships, respectively:

kibana-geo kibana-time kibana-graph

Kibana is often used in the so-called ELK pipeline for log file collection, analysis and visualization:

  • Elasticsearch is for searching, analyzing, and storing your data
  • Logstash (and Beats) is for collecting and transforming data, from any source, in any format
  • Kibana is a portal for visualizing the data and for navigating within the Elastic Stack

 

2016-11-17-18_31_39

 

Target Configuration for this Blog Post

In this Hello World blog post, we will use simple HTTP verbs against the RESTful API of Elasticsearch to create, read, and delete data entries:

2016-11-18-21_15_08

As a second step, we will attach Logstash to Twitter and Elasticsearch, whose data will be visualized in Kibana. Apart from the data source on the left, this is the same as the usual ELK pipeline:

2016-11-20-18_54_28

This will allow us to analyze the number of Twitter tweets with certain keywords in the text.

Tools used

  • Vagrant 1.8.6
  • Virtualbox 5.0.20
  • Docker 1.12.1
  • Logstash 5.0.1
  • Elasticsearch 5.0.1
  • Kibana 5.0.1

Prerequisites:

  • RAM >= ~4 GB
  • The max virtual memory areas setting vm.max_map_count must be at least 262144; see this note in the official documentation.
    See also Appendix B below for how to set the value on Linux temporarily, permanently, and for the next Vagrant-created Linux VM. A quick check for both prerequisites is sketched right after this list.
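
Both prerequisites can be checked quickly on the Docker host before starting; a minimal sketch (standard Linux commands, assuming you are already logged into the VM):

(dockerhost)$ free -m                  # free/total memory in MB; ~4 GB total recommended
(dockerhost)$ sysctl vm.max_map_count  # should print at least 262144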

Step 1: Install a Docker Host via Vagrant and Connect to the Host via SSH

If you are using an existing Docker host, make sure that your host has enough memory and fulfills the other prerequisites above; in that case, you can skip this step.

We will run Kibana, Elasticsearch, and Logstash in Docker containers in order to allow for maximum interoperability. This way, we can always use the latest Logstash version without having to control the Java version on the host: e.g. Logstash 1.4.x works with Java 7, while version 5.0.x currently works with Java 8 only.

If you are new to Docker, you might want to read this blog post.

Installing Docker on Windows and Mac can be a real challenge, but no worries: we will show an easy way here that is much quicker than the one described in Docker’s official documentation:

Prerequisites of this step:

  • I recommend having direct access to the Internet: via a firewall, but without an HTTP proxy. However, if you cannot get rid of your HTTP proxy, read this blog post.
  • Administration rights on your computer.

Steps to install a Docker Host VirtualBox VM:

Download and install Virtualbox (if the installation fails with error message “Oracle VM Virtualbox x.x.x Setup Wizard ended prematurely” see Appendix A of this blog post: Virtualbox Installation Workaround below)

1. Download and Install Vagrant (requires a reboot)

2. Download Vagrant Box containing an Ubuntu-based Docker Host and create a VirtualBox VM like follows:

basesystem# mkdir ubuntu-trusty64-docker ; cd ubuntu-trusty64-docker
basesystem# vagrant init williamyeh/ubuntu-trusty64-docker
basesystem# vagrant up
basesystem# vagrant ssh

Now you are logged into the Docker host and we are ready for the next step: downloading the Kibana image.

Note: I have experienced problems with the vi editor when running vagrant ssh in a Windows terminal. On Windows, consider following Appendix C of this blog post and using putty instead.

Step 2 (optional): Download Kibana Image

This extra download step is optional, since the Kibana Docker image will be downloaded automatically in Step 5, if it is not already found on the system:

(dockerhost)$ docker pull kibana
Using default tag: latest
latest: Pulling from library/kibana

386a066cd84a: Already exists
9ca92df3a376: Pull complete
c04752ac6b44: Pull complete
7bfecbcf70ff: Pull complete
f1338b2c8ead: Pull complete
bfe1da400856: Pull complete
cf0b2da1d7f9: Pull complete
aeaada72e01d: Pull complete
0162f4823d8e: Pull complete
Digest: sha256:c75dbca9c774887a3ab778c859208db638fde1a67cfa48aad703ac8cc94a793d
Status: Downloaded newer image for kibana:latest

The version of the downloaded Kibana image can be checked with the following command:

(dockerhost)$ sudo docker run -it --rm kibana --version
5.0.1

We are currently using version 5.0.1. If you want to make sure that you use the exact same version as I have used in this blog, you can use the image name kibana:5.0.1 in all docker commands instead of kibana only.
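
For example, pinning the version in the optional pull and in the version check would look like this (a sketch only; the rest of this post keeps using the plain image name):

(dockerhost)$ docker pull kibana:5.0.1
(dockerhost)$ sudo docker run -it --rm kibana:5.0.1 --version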

Step 3: Start Elasticsearch

Kibana relies on the data stored and analyzed in Elasticsearch, so let us start that one first. Like in the Elasticsearch blog post, we run Elasticsearch interactively:

(dockerhost)$ sudo docker run -it --rm --name elasticsearch -p9200:9200 -p9300:9300 --entrypoint bash elasticsearch
(elasticsearchcontainer)# /docker-entrypoint.sh elasticsearch

After successful start, Elasticsearch is waiting for data.
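
Before moving on, we can quickly verify from the Docker host that Elasticsearch answers on the mapped port (a minimal check; for this single-node setup the cluster status is expected to be yellow or green):

(dockerhost)$ curl localhost:9200                          # node name, cluster name and version information
(dockerhost)$ curl localhost:9200/_cluster/health?pretty   # cluster status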

Step 4: Start Logstash and use Twitter as Data Source

For this demonstration, it is good to have a lot of data we can analyze. Why not use Twitter as a data source and look for tweets about Obama, Trump, or Clinton? For that, let us create a file logstash_twitter.conf on the Docker host in the directory we will start the Logstash container from:

# logstash_twitter.conf
input {
  twitter {
    consumer_key => "consumer_key"
    consumer_secret => "consumer_secret"
    oauth_token => "oauth_token"
    oauth_token_secret => "oauth_token_secret"
    keywords => [ "Obama", "Trump", "Clinton" ]
    full_tweet => true
  }
}

output {
  stdout { codec => dots }
  elasticsearch {
    action => "index"
    index => "twitter"
    hosts => "elasticsearch"
    document_type => "tweet"
    template => "/app/twitter_template.json"
    template_name => "twitter"
    workers => 1
  }
}

But how do you find your personal consumer_key and the other credentials? For that, you need a Twitter account: log in and create a new app on https://apps.twitter.com/.

Note: this only works if you have registered your mobile phone with the Twitter account under Profile -> Settings -> Mobile Phone. The Website field must have a valid URL format, even if you add a dummy address there.

2016-11-19-20_02_58-create-an-application-_-twitter-application-management

The consumer key and secret can be accessed on the “Keys and Access Tokens” tab of the page you are redirected to. We do not want to send any tweets, so we can set the Access Level to “Read only”. Then, on the “Keys and Access Tokens” tab again, create an access token by clicking the button at the bottom of the page. Then copy and paste the keys to the configuration file logstash_twitter.conf we have created above.

Now we need to download the template file twitter_template.json that logstash_twitter.conf refers to (found here in the elastic/examples Git repository and also in the Git repository of this blog post):

(dockerhost)$ curl -JO https://raw.githubusercontent.com/elastic/examples/master/ElasticStack_twitter/twitter_template.json

The content of the file is:

{
  "template": "twitter_elastic_example",
  "settings": {
    "number_of_shards": 1,
    "number_of_replicas": 0
  },
  "mappings": {
    "_default_": {
      "_all": {
        "enabled": true
      },
      "properties": {
        "@timestamp": {
          "type": "date",
          "format": "dateOptionalTime"
        },
        "text": {
          "type": "text"
        },
        "user": {
          "type": "object",
          "properties": {
            "description": {
              "type": "text"
            }
          }
        },
        "coordinates": {
          "type": "object",
          "properties": {
            "coordinates": {
              "type": "geo_point"
            }
          }
        },
        "entities": {
          "type": "object",
          "properties": {
            "hashtags": {
              "type": "object",
              "properties": {
                "text": {
                  "type": "text",
                  "fielddata": true
                }
              }
            }
          }
        },
        "retweeted_status": {
          "type": "object",
          "properties": {
            "text": {
              "type": "text"
            }
          }
        }
      },
      "dynamic_templates": [
        {
          "string_template": {
            "match": "*",
            "match_mapping_type": "string",
            "mapping": {
              "type": "keyword"
            }
          }
        }
      ]
    }
  }
}

With that, we are ready to start a Logstash Docker container with a link to the Elasticsearch container and start the Logstash process using the configuration file we have created above:

(dockerhost)$ sudo docker run -it --rm --name logstash --link elasticsearch -v "$PWD":/app --entrypoint bash logstash
(logstash-container)# logstash -f /app/logstash_twitter.conf

In my case, I get many warnings in the Logstash terminal like

...............................20:05:52.220 [[main]>worker0] WARN  logstash.outputs.elasticsearch - Failed action. {:status=>400, :action=>["index", {:_id=>nil, :_index=>"twitter", :_type=>"tweet", :_routing=>nil}, 2016-11-19T20:05:51.000Z %{host} %{message}], :response=>{"index"=>{"_index"=>"twitter", "_type"=>"tweet", "_id"=>"AVh-MfEl713eVTPkwAuA", "status"=>400, "error"=>{"type"=>"illegal_argument_exception", "reason"=>"Limit of total fields [1000] in index [twitter] has been exceeded"}}}}

and in the Elasticsearch terminal like

[2016-11-19T20:49:43,874][WARN ][o.e.d.i.m.StringFieldMapper$TypeParser] The [string] field is deprecated, please use [text] or [keyword] instead on [id_str]
[2016-11-19T20:49:43,874][WARN ][o.e.d.i.m.StringFieldMapper$TypeParser] The [string] field is deprecated, please use [text] or [keyword] instead on [raw]

However, the many dots in the Logstash terminal show that a high number of tweets is being recorded continuously. Let us ignore the warnings for now (they also appear if Logstash is started without any template, so they look like bugs in the current version of the Elastic Stack) and check the number of recorded tweets from the Docker host:

(dockerhost)$ curl -XGET localhost:9200/twitter/_count
{"count":115,"_shards":{"total":1,"successful":1,"failed":0}}
(dockerhost)$ curl -XGET localhost:9200/twitter/_count
{"count":253,"_shards":{"total":1,"successful":1,"failed":0}}

The number of tweets is rising quickly. While performing the next steps, we keep Logstash and Elasticsearch running, so we have a good amount of data entries to work with.
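
While Logstash keeps recording, we can also confirm that the index template from twitter_template.json has been installed under the name defined by template_name (a quick, optional check):

(dockerhost)$ curl -XGET localhost:9200/_template/twitter?pretty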

Step 5: Run Kibana in interactive Terminal Mode

In this step, we will run Kibana interactively (with the -it switch instead of the -d switch) to better see what is happening (in the Elasticsearch blog post, I had some memory issues, which cannot be seen easily in detached mode).

Similar to Logstash, we start Kibana with a link to the Elasticsearch container:

(dockerhost)$ sudo docker run -it --rm --name kibana -p5601:5601 --link elasticsearch --entrypoint bash kibana

By analyzing the Kibana image via the online imagelayer tool, we have found out that the default command is to run /docker-entrypoint.sh kibana. Let us do that now:

root@f13588d10379:/# /docker-entrypoint.sh kibana
[WARN  tini (5)] Tini is not running as PID 1 and isn't registered as a child subreaper.
        Zombie processes will not be re-parented to Tini, so zombie reaping won't work.
        To fix the problem, use -s or set the environment variable TINI_SUBREAPER to register Tini as a child subreaper, or run Tini as PID 1.
  log   [16:28:02.791] [info][status][plugin:kibana@5.0.1] Status changed from uninitialized to green - Ready
  log   [16:28:02.842] [info][status][plugin:elasticsearch@5.0.1] Status changed from uninitialized to yellow - Waiting for Elasticsearch
  log   [16:28:02.867] [info][status][plugin:console@5.0.1] Status changed from uninitialized to green - Ready
  log   [16:28:03.074] [info][status][plugin:timelion@5.0.1] Status changed from uninitialized to green - Ready
  log   [16:28:03.080] [info][listening] Server running at http://0.0.0.0:5601
  log   [16:28:03.085] [info][status][ui settings] Status changed from uninitialized to yellow - Elasticsearch plugin is yellow
  log   [16:28:08.118] [info][status][plugin:elasticsearch@5.0.1] Status changed from yellow to yellow - No existing Kibana index found
  log   [16:28:08.269] [info][status][plugin:elasticsearch@5.0.1] Status changed from yellow to green - Kibana index ready
  log   [16:28:08.270] [info][status][ui settings] Status changed from yellow to green - Ready

If you see errors at this point, refer to Appendix C.

Step 6: Open Kibana in a Browser

Now we want to connect to the Kibana portal. For that, open a browser and open the URL

<your_kibana_host>:5601

In our case, Kibana is running in a container and we have mapped the container port 5601 to the local port 5601 of the Docker host. On the Docker host, we can open the URL:

localhost:5601

Note: With Vagrant and VirtualBox, by default there is only a NAT-based interface, and you need to create a port forwarding rule for any port you want to reach from outside (the local machine you are working on also counts as outside). In this case, we need to add an entry to the port forwarding list of VirtualBox:

2016-11-19-18_45_38-regel-fur-port-weiterleitung
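
Instead of using the VirtualBox GUI, the same forwarding rule can be added from the command line of the base system; a sketch, where "<vm-name>" has to be replaced by the name reported by VBoxManage list vms:

(basesystem)$ VBoxManage list runningvms
(basesystem)$ VBoxManage controlvm "<vm-name>" natpf1 "kibana,tcp,,5601,,5601"   # forward host port 5601 to guest port 5601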

The Kibana dashboard will open:

2016-11-19-18_34_15-kibana

We change the index name pattern from logstash-* to twitter and press Create.
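
If Kibana does not accept the pattern, the twitter index probably does not exist yet; a quick check on the Docker host lists all existing indices:

(dockerhost)$ curl localhost:9200/_cat/indices?v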

After clicking Discover in the left pane, Kibana displays a time/date histogram of the total tweet count recorded, the fields received, and a list of the tweets:

2016-11-19-21_56_10-kibana

Now let us compare the popularity of Obama vs. Trump. On the Docker host, we can test a query like follows:

(dockerhost)$ curl -XGET localhost:9200/twitter/tweet/_count?q=text:Obama
{"count":2046,"_shards":{"total":1,"successful":1,"failed":0}}
(dockerhost)$ curl -XGET localhost:9200/twitter/tweet/_count?q=text:Trump
{"count":9357,"_shards":{"total":1,"successful":1,"failed":0}}
(dockerhost)$ curl -XGET localhost:9200/twitter/tweet/_count?q=text:Clinton
{"count":747,"_shards":{"total":1,"successful":1,"failed":0}}

Okay, we can already see who the winner of this little Twitter contest is: Trump. Let us analyze the data in a little more detail. For that, we can place the tested query into the query field:

2016-11-20-08_33_44-trump-or-obama-kibana

All matching entries are listed, and the matching strings are highlighted. Let us press Save and give it the name “Trump OR Obama OR Clinton”.

Step 7: Create a Pie Chart

Now let us visualize the data. Click the Visualize link in the left pane and choose the pie chart

Pie Chart Icon

and choose the “Trump OR Obama OR Clinton” query from the Saved Searches on the right pane. We are shown a very simple pie chart:

2016-11-20-08_48_38-kibana

This is not so interesting yet. Let us now click Split Slices, choose the Filters aggregation, and add the query text:Trump. Then click Add Filter and add text:Obama, and the same for Clinton. After that, press the white-on-blue triangle (Apply) button to apply the changes. That looks better now.

2016-11-20-08_55_13-kibana

Let us save this as “Trump vs. Obama vs. Clinton Pie Chart”.

Step 8: Create a Time Chart

Now we want to visualize how the popularity of the three politicians changes over time. For a single query, this could be done with a Line Chart: Visualize -> Line Chart -> choose the query -> choose the X-Axis aggregation Date Histogram. However, this is not what we want to achieve:

2016-11-20-09_01_29-kibana

We would like to display all three queries in a single graph, and this requires the Timelion plugin.

So, click the Timelion link in the left pane, then add the query .es('text:Trump'), .es('text:Obama'), .es('text:Clinton') at the top. This will create the chart we were looking for:

2016-11-20-09_05_28-timelion-kibana

Let us save this as a Kibana dashboard panel with the name “Trump vs. Obama vs. Clinton Time Chart”, so we can use it in the next step.

Step 9: Define a Dashboard

We will now create a dashboard. Click Dashboard in the left pane and click Add in the upper menu. Click on “Trump vs. Obama vs. Clinton Time Chart” and then on “Trump vs. Obama vs. Clinton Pie Chart”.

2016-11-20-09_19_46-kibana

Clicking the white-on-black ^ icon will give you more space.

Resize the charts from their corners and move them so they are aligned. We now see that the colors do not match yet:

2016-11-20-09_21_33-kibana

However, the colors of the Pie chart can easily be changed by clicking on the legends (it is not so easy on the Time chart, though).

2016-11-20-09_27_03-kibana

Even if the colors are not 100% the same, we are coming closer:

2016-11-20-09_31_25-kibana

Let us Save this as “Popularity of US Politicians” dashboard.

About 10 hours later, we can see the rise of tweets in the US, which is ~5 to 10 hours behind Germany’s time zone:

2016-11-20-18_24_06-popularity-of-us-politicians-kibana

Summary

With this Hello World tutorial, we have shown

  • how we can use Logstash to collect Twitter data,
  • store it in Elasticsearch, and
  • use Kibana to visualize the Elasticsearch search queries.

We can see the rise and fall of the number of tweets during a day, and we can also see that the number of Trump tweets outpaces the Obama and Clinton tweets by far.

DONE!

P.S.: the colors of the Timelion graphs can also be changed easily by adding a .color(...) directive after the .es(...). If we want to have Trump in red, Obama in blue, and Clinton in green, we set:

 

.es('text:Trump').color('red'), .es('text:Obama').color('blue'), .es('text:Clinton').color('green')

The resulting graph is:

2016-11-23-16_14_35-timelion-kibana

We can also use hex color codes like .color('#ff0000') instead of color names.

Appendix A: Error: Cannot allocate memory

This error has been seen when running Elasticsearch as a Docker container on a Docker host with only 250 MB of RAM left (as seen with top).

(dockerhost)$ sudo docker run -it --rm elasticsearch --version
OpenJDK 64-Bit Server VM warning: INFO: os::commit_memory(0x000000008a660000, 1973026816, 0) failed; error='Cannot allocate memory' (errno=12)
#
# There is insufficient memory for the Java Runtime Environment to continue.
# Native memory allocation (mmap) failed to map 1973026816 bytes for committing reserved memory.
# An error report file with more information is saved as:
# /tmp/hs_err_pid1.log

Resolution:

A temporary resolution is to

  1. shut down the Vagrant Docker host via vagrant halt,
  2. open the VirtualBox console,
  3. increase the memory by ~500 MB (right-click the VM in the left pane of the VirtualBox console -> Change -> System -> increase memory), and
  4. start the Vagrant Docker host again via vagrant up.

A permanent solution is to

  1. increase the value of vb.memory in the Vagrantfile, e.g. from

vb.memory = "1536"

to

vb.memory = "4096"

With that, the next time a VirtualBox VM is created by Vagrant, the new value will be used. I have also seen that the reboot itself freed up quite some resources…

Appendix B: vm.max_map_count too low

The Elasticsearch application requires a minimum vm.max_map_count of 262144. See the official documentation for details. If this minimum requirement is not met, we see the following log during startup of Elasticsearch:

$ sudo docker run -it --rm --name elasticsearch -p9200:9200 -p9300:9300 elasticsearch
[2016-11-18T13:29:35,124][INFO ][o.e.n.Node ] [] initializing ...
[2016-11-18T13:29:35,258][INFO ][o.e.e.NodeEnvironment ] [SfJmZdJ] using [1] data paths, mounts [[/usr/share/elasticsearch/data (/dev/dm-0)]], net usable_space [32.3gb], net total_space [38.2gb], spins? [possibly], types [ext4]
[2016-11-18T13:29:35,258][INFO ][o.e.e.NodeEnvironment ] [SfJmZdJ] heap size [1.9gb], compressed ordinary object pointers [true]
[2016-11-18T13:29:35,261][INFO ][o.e.n.Node ] [SfJmZdJ] node name [SfJmZdJ] derived from node ID; set [node.name] to override
[2016-11-18T13:29:35,267][INFO ][o.e.n.Node ] [SfJmZdJ] version[5.0.1], pid[1], build[080bb47/2016-11-11T22:08:49.812Z], OS[Linux/4.2.0-42-generic/amd64], JVM[Oracle Corporation/OpenJDK 64-Bit Server VM/1.8.0_111/25.111-b14]
[2016-11-18T13:29:37,449][INFO ][o.e.p.PluginsService ] [SfJmZdJ] loaded module [aggs-matrix-stats]
[2016-11-18T13:29:37,450][INFO ][o.e.p.PluginsService ] [SfJmZdJ] loaded module [ingest-common]
[2016-11-18T13:29:37,451][INFO ][o.e.p.PluginsService ] [SfJmZdJ] loaded module [lang-expression]
[2016-11-18T13:29:37,452][INFO ][o.e.p.PluginsService ] [SfJmZdJ] loaded module [lang-groovy]
[2016-11-18T13:29:37,452][INFO ][o.e.p.PluginsService ] [SfJmZdJ] loaded module [lang-mustache]
[2016-11-18T13:29:37,453][INFO ][o.e.p.PluginsService ] [SfJmZdJ] loaded module [lang-painless]
[2016-11-18T13:29:37,455][INFO ][o.e.p.PluginsService ] [SfJmZdJ] loaded module [percolator]
[2016-11-18T13:29:37,455][INFO ][o.e.p.PluginsService ] [SfJmZdJ] loaded module [reindex]
[2016-11-18T13:29:37,456][INFO ][o.e.p.PluginsService ] [SfJmZdJ] loaded module [transport-netty3]
[2016-11-18T13:29:37,456][INFO ][o.e.p.PluginsService ] [SfJmZdJ] loaded module [transport-netty4]
[2016-11-18T13:29:37,457][INFO ][o.e.p.PluginsService ] [SfJmZdJ] no plugins loaded
[2016-11-18T13:29:37,807][WARN ][o.e.d.s.g.GroovyScriptEngineService] [groovy] scripts are deprecated, use [painless] scripts instead
[2016-11-18T13:29:43,310][INFO ][o.e.n.Node ] [SfJmZdJ] initialized
[2016-11-18T13:29:43,310][INFO ][o.e.n.Node ] [SfJmZdJ] starting ...
[2016-11-18T13:29:43,716][INFO ][o.e.t.TransportService ] [SfJmZdJ] publish_address {172.17.0.3:9300}, bound_addresses {[::]:9300}
[2016-11-18T13:29:43,725][INFO ][o.e.b.BootstrapCheck ] [SfJmZdJ] bound or publishing to a non-loopback or non-link-local address, enforcing bootstrap checks
ERROR: bootstrap checks failed
max virtual memory areas vm.max_map_count [65530] is too low, increase to at least [262144]
[2016-11-18T13:29:43,741][INFO ][o.e.n.Node ] [SfJmZdJ] stopping ...
[2016-11-18T13:29:43,763][INFO ][o.e.n.Node ] [SfJmZdJ] stopped
[2016-11-18T13:29:43,764][INFO ][o.e.n.Node ] [SfJmZdJ] closing ...
[2016-11-18T13:29:43,791][INFO ][o.e.n.Node ] [SfJmZdJ] closed

Resolution:

Temporary solution:

(dockerhost)$ sudo sysctl -w vm.max_map_count=262144

The new value takes effect immediately (but will be lost after a reboot); afterwards, start the Elasticsearch container again.

Permanent solution on LINUX hosts:

Update the vm.max_map_count setting to 262144 or more in /etc/sysctl.conf. To verify after rebooting, run sysctl vm.max_map_count.
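
On most Linux distributions, this can be done like follows (a sketch; the sysctl -p call applies the setting without a reboot):

(dockerhost)$ echo "vm.max_map_count=262144" | sudo tee -a /etc/sysctl.conf   # persist the setting
(dockerhost)$ sudo sysctl -p                                                  # reload /etc/sysctl.conf
(dockerhost)$ sysctl vm.max_map_count                                         # verify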

Permanent solution for future Vagrant-created LINUX hosts:

In case we use Vagrant to create Linux VMs, we also need to make sure the next VM is created with the correct vm.max_map_count setting. For that, we can run a startup script as described here:

In the Vagrantfile we set:

config.vm.provision :file, :source => "elasticsearchpreparation.sh", :destination => "/tmp/elasticsearchpreparation.sh"  
config.vm.provision :shell, :inline => "sudo sed -i 's/\r//g' /tmp/elasticsearchpreparation.sh && chmod +x /tmp/elasticsearchpreparation.sh && /tmp/elasticsearchpreparation.sh", :privileged => true

with the file elasticsearchpreparation.sh:

#!/usr/bin/env bash
# file: elasticsearchpreparation.sh
sudo sysctl -w vm.max_map_count=262144
ulimit -n 65536

The sed and chmod commands make sense on Windows hosts in order to make sure the file has UNIX format and the required rights. Also here, make sure to run sysctl vm.max_map_count in order to check that the configuration is active (this might require a reboot).

Appendix C: Typical Kibana Startup Logs

Successful Log

Here we see a successful startup log:

root@f13588d10379:/# /docker-entrypoint.sh kibana
[WARN  tini (5)] Tini is not running as PID 1 and isn't registered as a child subreaper.
        Zombie processes will not be re-parented to Tini, so zombie reaping won't work.
        To fix the problem, use -s or set the environment variable TINI_SUBREAPER to register Tini as a child subreaper, or run Tini as PID 1.
  log   [16:28:02.791] [info][status][plugin:kibana@5.0.1] Status changed from uninitialized to green - Ready
  log   [16:28:02.842] [info][status][plugin:elasticsearch@5.0.1] Status changed from uninitialized to yellow - Waiting for Elasticsearch
  log   [16:28:02.867] [info][status][plugin:console@5.0.1] Status changed from uninitialized to green - Ready
  log   [16:28:03.074] [info][status][plugin:timelion@5.0.1] Status changed from uninitialized to green - Ready
  log   [16:28:03.080] [info][listening] Server running at http://0.0.0.0:5601
  log   [16:28:03.085] [info][status][ui settings] Status changed from uninitialized to yellow - Elasticsearch plugin is yellow
  log   [16:28:08.118] [info][status][plugin:elasticsearch@5.0.1] Status changed from yellow to yellow - No existing Kibana index found
  log   [16:28:08.269] [info][status][plugin:elasticsearch@5.0.1] Status changed from yellow to green - Kibana index ready
  log   [16:28:08.270] [info][status][ui settings] Status changed from yellow to green - Ready

At this point the system has connected successfully to Elasticsearch, as can be seen in the last three log lines above.

Logs if Elasticsearch is not reachable

If Kibana cannot connect to Elasticsearch on the IP layer (e.g. because the Docker container link is missing) the last four lines of the successful log are replaced by:

  log   [16:45:51.597] [info][status][ui settings] Status changed from uninitialized to yellow - Elasticsearch plugin is yellow
  log   [16:45:54.407] [error][status][plugin:elasticsearch@5.0.1] Status changed from yellow to red - Request Timeout after 3000ms
  log   [16:45:54.410] [error][status][ui settings] Status changed from yellow to red - Elasticsearch plugin is red
...

To correct the issue, make sure that the Elasticsearch server (or container) is reachable from the Kibana server (or container).

Logs if Elasticsearch is reachable but not started (TCP RST)

If Kibana can reach the Elasticsearch server, but the Elasticsearch process has not been started, the error messages appear even earlier in the log:

(kibanacontainer)# /docker-entrypoint.sh kibana
[WARN  tini (8)] Tini is not running as PID 1 and isn't registered as a child subreaper.
        Zombie processes will not be re-parented to Tini, so zombie reaping won't work.
        To fix the problem, use -s or set the environment variable TINI_SUBREAPER to register Tini as a child subreaper, or run Tini as PID 1.
  log   [17:06:57.714] [info][status][plugin:kibana@5.0.1] Status changed from uninitialized to green - Ready
  log   [17:06:57.763] [info][status][plugin:elasticsearch@5.0.1] Status changed from uninitialized to yellow - Waiting for Elasticsearch
  log   [17:06:57.780] [error][elasticsearch] Request error, retrying
HEAD http://elasticsearch:9200/ => connect ECONNREFUSED 172.17.0.3:9200
  log   [17:06:57.794] [warning][elasticsearch] Unable to revive connection: http://elasticsearch:9200/
  log   [17:06:57.795] [warning][elasticsearch] No living connections
  log   [17:06:57.798] [error][status][plugin:elasticsearch@5.0.1] Status changed from yellow to red - Unable to connect to Elasticsearch at http://elasticsearch:9200.
  log   [17:06:57.800] [info][status][plugin:console@5.0.1] Status changed from uninitialized to green - Ready
  log   [17:06:57.981] [info][status][plugin:timelion@5.0.1] Status changed from uninitialized to green - Ready
  log   [17:06:57.989] [info][listening] Server running at http://0.0.0.0:5601
  log   [17:06:57.992] [error][status][ui settings] Status changed from uninitialized to red - Elasticsearch plugin is red
  log   [17:07:00.309] [warning][elasticsearch] Unable to revive connection: http://elasticsearch:9200/
  log   [17:07:00.314] [warning][elasticsearch] No living connections

To correct the issue, make sure that the Elasticsearch server (or container) is reachable from the Kibana server (or container), that the Elasticsearch process is started, and that the port is reachable from outside. This may involve mapping TCP ports from inside networks to outside networks. In the example of this blog post, the container port is mapped to the Docker host with the docker run -p9200:9200 switch, and the Docker host port is then mapped via VirtualBox port forwarding from the Docker host VM to the local machine.
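
A quick way to test this from inside the Kibana container is to query Elasticsearch directly via the link name (a sketch; it assumes that curl is available in the image. If not, the same test can be run from the Docker host against the mapped port):

(kibanacontainer)# curl http://elasticsearch:9200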

Summary

In this blog post we have performed following tasks:

  1. attach Logstash to the Twitter API to retrieve all tweets with the keywords “Obama”, “Trump”, or “Clinton”
  2. feed Logstash’s data into Elasticsearch
  3. attach Kibana to Elasticsearch and visualize statistics on how often the text patterns “Obama”, “Trump”, and “Clinton” are found in the recorded tweets. The total numbers are shown in a pie chart, and the date/time histogram with more than one search term in a single chart is shown by means of the Timelion plugin.

In order to avoid any compatibility issues with the Java version on the host, we have run Kibana, Elasticsearch, and Logstash in Docker containers. In order to better see what happens under the hood, we have chosen Docker containers in interactive terminal mode. In the course of the Elasticsearch “Hello World” in the last blog post, we had hit two memory resource issues: too little memory and a too low number of mapped memory areas. Those issues and their workarounds/solutions are described in Appendices A and B here and in the last blog post.

References

 


Elasticsearch “Hello World” Example – Part 2 of the ELK Stack Series



elasticsearch_logo

In the last blog post, we explored Logstash, a tool for collecting and transforming log data from many different input sources. Today, we will explore Elasticsearch, a schema-less NoSQL database with a versatile (“elastic”) search engine. We will perform a little Elasticsearch “Hello World” by running Elasticsearch in a Docker container and manipulating database entries. After that, we will use Logstash as a data source for populating the Elasticsearch database. This configuration is often seen in a typical log processing pipeline.

This is the second blog post of a series about the Elastic Stack (a.k.a. ELK stack):

What is Elasticsearch?

Elasticsearch is a highly scalable, distributed, schema-less NoSQL database with a versatile (“elastic”) search engine. It is an open source project driven by Elastic. In this performance comparison it has been shown that Elasticsearch performs well even for millions of documents.

Elasticsearch is often used in the so-called ELK pipeline for log file collection, analysis and visualization:

  • Elasticsearch is for searching, analyzing, and storing your data
  • Logstash (and Beats) is for collecting and transforming data, from any source, in any format
  • Kibana is a portal for visualizing the data and for navigating within the Elastic Stack

 

2016-11-17-18_31_39

 

Target

In this post, we will perform a little Elasticsearch “Hello World” by running Elasticsearch in a Docker container and creating, reading, searching, and deleting our first database entries. This is done by sending simple HTTP messages to the RESTful API of Elasticsearch:

 

2016-11-18-21_15_08

As a second step, we will attach Logstash as a data source for Elasticsearch in order to move one step closer towards the ELK pipeline shown above:

2016-11-18-20_21_44

Tools used

  • Vagrant 1.8.6
  • Virtualbox 5.0.20
  • Docker 1.12.1
  • Logstash 5.0.1
  • Elasticsearch 5.0.1

Prerequisites:

  • Free Memory >= 3 GB for the Elasticsearch step and >= 4 GB for the Logstash + Elasticsearch pipeline (see Appendix A).
  • The max virtual memory areas setting vm.max_map_count must be at least 262144; see this note in the official documentation.
    See also Appendix B below for how to set the value on Linux temporarily, permanently, and for the next Vagrant-created Linux VM.

Step 1: Install a Docker Host via Vagrant and Connect to the Host via SSH

We will run Elasticsearch and Logstash in Docker containers in order to allow for maximum interoperability. This way, we can always use the latest Elasticsearch and Logstash versions without having to control the Java version on the host: e.g. Logstash 1.4.x works with Java 7, while version 5.0.x currently works with Java 8 only.

If you are new to Docker, you might want to read this blog post.

Installing Docker on Windows and Mac can be a real challenge, but no worries: we will show an easy way here that is much quicker than the one described in Docker’s official documentation:

Prerequisites of this step:

  • I recommend having direct access to the Internet: via a firewall, but without an HTTP proxy. However, if you cannot get rid of your HTTP proxy, read this blog post.
  • Administration rights on your computer.

Steps to install a Docker Host VirtualBox VM:

Download and install Virtualbox (if the installation fails with error message “Oracle VM Virtualbox x.x.x Setup Wizard ended prematurely”, see Appendix A of this blog post: Virtualbox Installation Workaround below)

1. Download and Install Vagrant (requires a reboot)

2. Download Vagrant Box containing an Ubuntu-based Docker Host and create a VirtualBox VM like follows:

basesystem# mkdir ubuntu-trusty64-docker ; cd ubuntu-trusty64-docker
basesystem# vagrant init williamyeh/ubuntu-trusty64-docker
basesystem# vagrant up
basesystem# vagrant ssh

Now you are logged into the Docker host and we are ready for the next step: downloading the Elasticsearch image.

Note: I have experienced problems with the vi editor when running vagrant ssh in a Windows terminal. On Windows, consider following Appendix C of this blog post and using putty instead.

Step 2 (optional): Download Elasticsearch Image

This extra download step is optional, since the Elasticsearch Docker image will be downloaded automatically in step 3, if it is not already found on the system:

(dockerhost)$ docker pull elasticsearch
Using default tag: latest
latest: Pulling from library/elasticsearch

386a066cd84a: Already exists
75ea84187083: Already exists
3e2e387eb26a: Already exists
eef540699244: Already exists
1624a2f8d114: Already exists
7018f4ec6e0a: Already exists
6ca3bc2ad3b3: Already exists
424638b495a6: Pull complete
2ff72d0b7bea: Pull complete
d0d6a2049bf2: Pull complete
51dc322097cb: Pull complete
5d6cdd5ecea8: Pull complete
51cdecfd285e: Pull complete
29a05afcfde6: Pull complete
Digest: sha256:c7eaa97e9b898b65f8f8588ade1c9c6187420b8ce6efb7d3300d9213cd5cb0dc
Status: Downloaded newer image for elasticsearch:latest

The version of the downloaded Elasticsearch image can be checked with the following command:

(dockerhost)$ sudo docker run -it --rm elasticsearch --version
Version: 5.0.1, Build: 080bb47/2016-11-11T22:08:49.812Z, JVM: 1.8.0_111

We are currently using version 5.0.1. If you want to make sure that you use the exact same version as I have used in this blog, you can use the image name elasticsearch:5.0.1 in all docker commands instead of elasticsearch only.

Step 3: Run Elasticsearch in interactive Terminal Mode

In this step, we will run Elasticsearch interactively (with the -it switch instead of the -d switch) to better see what is happening (I had some memory issues, see Appendices A and B, which cannot be seen easily in detached mode):

(dockerhost)$ sudo docker run -it --rm --name elasticsearch -p9200:9200 -p9300:9300 --entrypoint bash elasticsearch

By analyzing the Elasticsearch image via the online imagelayer tool, we have found out that the default command is to run /docker-entrypoint.sh elasticsearch. Let us do that now. The output should look something like this:

root@8e7170639d98:/usr/share/elasticsearch# /docker-entrypoint.sh elasticsearch
[2016-11-18T14:34:36,149][INFO ][o.e.n.Node               ] [] initializing ...
[2016-11-18T14:34:36,395][INFO ][o.e.e.NodeEnvironment    ] [iqF8643] using [1] data paths, mounts [[/usr/share/elasticsearch/data (/dev/dm-0)]], net usable_space [32.3gb], net total_space [38.2gb], spins? [possibly], types [ext4]
[2016-11-18T14:34:36,396][INFO ][o.e.e.NodeEnvironment    ] [iqF8643] heap size [1.9gb], compressed ordinary object pointers [true]
[2016-11-18T14:34:36,398][INFO ][o.e.n.Node               ] [iqF8643] node name [iqF8643] derived from node ID; set [node.name] to override
[2016-11-18T14:34:36,403][INFO ][o.e.n.Node               ] [iqF8643] version[5.0.1], pid[41], build[080bb47/2016-11-11T22:08:49.812Z], OS[Linux/4.2.0-42-generic/amd64], JVM[Oracle Corporation/OpenJDK 64-Bit Server VM/1.8.0_111/25.111-b14]
[2016-11-18T14:34:38,606][INFO ][o.e.p.PluginsService     ] [iqF8643] loaded module [aggs-matrix-stats]
[2016-11-18T14:34:38,607][INFO ][o.e.p.PluginsService     ] [iqF8643] loaded module [ingest-common]
[2016-11-18T14:34:38,607][INFO ][o.e.p.PluginsService     ] [iqF8643] loaded module [lang-expression]
[2016-11-18T14:34:38,607][INFO ][o.e.p.PluginsService     ] [iqF8643] loaded module [lang-groovy]
[2016-11-18T14:34:38,607][INFO ][o.e.p.PluginsService     ] [iqF8643] loaded module [lang-mustache]
[2016-11-18T14:34:38,608][INFO ][o.e.p.PluginsService     ] [iqF8643] loaded module [lang-painless]
[2016-11-18T14:34:38,608][INFO ][o.e.p.PluginsService     ] [iqF8643] loaded module [percolator]
[2016-11-18T14:34:38,608][INFO ][o.e.p.PluginsService     ] [iqF8643] loaded module [reindex]
[2016-11-18T14:34:38,608][INFO ][o.e.p.PluginsService     ] [iqF8643] loaded module [transport-netty3]
[2016-11-18T14:34:38,609][INFO ][o.e.p.PluginsService     ] [iqF8643] loaded module [transport-netty4]
[2016-11-18T14:34:38,610][INFO ][o.e.p.PluginsService     ] [iqF8643] no plugins loaded
[2016-11-18T14:34:39,104][WARN ][o.e.d.s.g.GroovyScriptEngineService] [groovy] scripts are deprecated, use [painless] scripts instead
[2016-11-18T14:34:42,833][INFO ][o.e.n.Node               ] [iqF8643] initialized
[2016-11-18T14:34:42,833][INFO ][o.e.n.Node               ] [iqF8643] starting ...
[2016-11-18T14:34:43,034][INFO ][o.e.t.TransportService   ] [iqF8643] publish_address {172.17.0.2:9300}, bound_addresses {[::]:9300}
[2016-11-18T14:34:43,040][INFO ][o.e.b.BootstrapCheck     ] [iqF8643] bound or publishing to a non-loopback or non-link-local address, enforcing bootstrap checks
[2016-11-18T14:34:43,839][INFO ][o.e.m.j.JvmGcMonitorService] [iqF8643] [gc][1] overhead, spent [434ms] collecting in the last [1s]
[2016-11-18T14:34:46,211][INFO ][o.e.c.s.ClusterService   ] [iqF8643] new_master {iqF8643}{iqF86430QRmm70Y5fDzVQw}{KsVmKueNQL6UBOMpiMsa5w}{172.17.0.2}{172.17.0.2:9300}, reason: zen-disco-elected-as-master ([0] nodes joined)
[2016-11-18T14:34:46,263][INFO ][o.e.h.HttpServer         ] [iqF8643] publish_address {172.17.0.2:9200}, bound_addresses {[::]:9200}
[2016-11-18T14:34:46,265][INFO ][o.e.n.Node               ] [iqF8643] started
[2016-11-18T14:34:46,276][INFO ][o.e.g.GatewayService     ] [iqF8643] recovered [0] indices into cluster_state

At this point the system is waiting for input on port 9200.

Step 4: Create sample Data

With the -p9200:9200 docker run option in the previous step, we have mapped the Docker container port 9200 to the Docker host port 9200. We can now send API calls to the Docker host's port 9200.

Let us open a new terminal on the Docker host and type:

(dockerhost)$ curl -XPOST localhost:9200/twitter/tweed/1 -d '
{
"user": "oveits",
"message": "this is my first elasticsearch message",
"postDate": "2016-11-18T15:55:00"
}'

This will return a result like

{"_index":"twitter","_type":"tweed","_id":"1","_version":1,"result":"created","_shards":{"total":2,"successful":1,"failed":0},"created":true}

On the Elasticsearch terminal we see that a new index has been created with the name “twitter” and a new mapping has been created with the name “tweed”:

[2016-11-18T14:56:46,777][INFO ][o.e.c.m.MetaDataCreateIndexService] [iqF8643] [twitter] creating index, cause [auto(index api)], templates [], shards [5]/[1], mappings []
[2016-11-18T15:01:01,361][INFO ][o.e.c.m.MetaDataMappingService] [iqF8643] [twitter/p9whAy1-TeSVZbUbz-3VVQ] create_mapping [tweed]
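
If you are curious, the automatically generated mapping can be inspected from the Docker host (an optional check; output omitted here):

(dockerhost)$ curl -XGET localhost:9200/twitter/_mapping?pretty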

Step 5: Read Data from the Database

We can read the data with an HTTP GET command:

curl -XGET localhost:9200/twitter/tweed/1

This will return

{"_index":"twitter","_type":"tweed","_id":"1","_version":1,"found":true,"_source":
{
"user": "oveits",
"message": "this is my first elasticsearch message",
"postDate": "2016-11-18T15:55:00"
}}
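
For completeness: a single entry could be removed again with an HTTP DELETE to the same URL. We do not run this here, since the entry is still needed in the following steps; shown as a sketch only:

(dockerhost)$ curl -XDELETE localhost:9200/twitter/tweed/1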

Let us send a second tweed a little bit later (postDate: 16:11 instead of 15:55):

(dockerhost)$ curl -XPOST localhost:9200/twitter/tweed/2 -d '
{
"user": "oveits",
"message": "this is my second message",
"postDate": "2016-11-18T16:11:00"
}'

The same entry can also be created or overwritten with an HTTP PUT to the same URL:

(dockerhost)$ curl -XPUT localhost:9200/twitter/tweed/2 -d '
{
"user": "oveits",
"message": "this is my second message",
"postDate": "2016-11-18T16:11:00"
}'

Step 6: Search Data based on Content

Now we will test some search capabilities of Elasticsearch. Let us search for all entries with a message that contains the string “elasticsearch”:

Step 6.1: Search String in Message

curl -XGET localhost:9200/twitter/_search?q=message:elasticsearch

This will return our first message only, since it contains the “elasticsearch” string:

{"took":58,"timed_out":false,"_shards":{"total":5,"successful":5,"failed":0},"hits":{"total":1,"max_score":0.25316024,"hits":[{"_index":"twitter","_type":"tweed","_id":"1","_score":0.25316024,"_source":
{
"user": "oveits",
"message": "this is my first elasticsearch message",
"postDate": "2016-11-18T15:55:00"
}}]}}

Note that the answer contains a _source field with the full text of the data.
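
Side note: the JSON responses are easier to read with the pretty parameter appended; the URL then needs to be quoted because of the & character:

(dockerhost)$ curl -XGET 'localhost:9200/twitter/_search?q=message:elasticsearch&pretty'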

Step 6.2: Search String in any Field

We can also search in any field if we remove the field name message: from the query, e.g.

$ curl -XGET localhost:9200/twitter/_search?q=2016
{"took":4,"timed_out":false,"_shards":{"total":5,"successful":5,"failed":0},"hits":{"total":2,"max_score":0.25316024,"hits":[{"_index":"twitter","_type":"tweed","_id":"1","_score":0.25316024,"_source":
{
"user": "oveits",
"message": "this is my first elasticsearch message",
"postDate": "2016-11-18T15:55:00"
}},{"_index":"twitter","_type":"tweed","_id":"2","_score":0.24257512,"_source":
{
"user": "oveits",
"message": "this is my second message",
"postDate": "2016-11-18T16:11:00"

The query has found both entries, since they both contain the string “2016” in one of the fields.

Step 6.3: Search for Entries within a Time Range

We can also filter database entries based on a time range. The command

$ curl -XGET localhost:9200/twitter/_search? -d '
{ "query": { "range": { "postDate": { "from": "2016-11-18T15:00:00", "to": "2016-11-18T17:00:00" } } } }'

returns both entries while

$ curl -XGET localhost:9200/twitter/_search? -d '
{ "query": { "range": { "postDate": { "from": "2016-11-18T15:00:00", "to": "2016-11-18T16:00:00" } } } }'

returns the first entry only:

{ "query": { "range": { "postDate": { "from": "2016-11-18T15:00:00", "to": "2016-11-18T16:00:00" } } } }'
{"took":3,"timed_out":false,"_shards":{"total":5,"successful":5,"failed":0},"hits":{"total":1,"max_score":1.0,"hits":[{"_index":"twitter","_type":"tweed","_id":"1","_score":1.0,"_source":
{
"user": "oveits",
"message": "this is my first elasticsearch message",
"postDate": "2016-11-18T15:55:00"
}}]}}

Step 7: Logstash as Input Source for Elasticsearch

Our final step for this Hello World post is to use Logstash as the data source for Elasticsearch. The target pipeline of this step is:

2016-11-18-20_21_44

It does not really make a difference, but for simplicity of this demonstration, we will replace the input file with command-line STDIN input. We have already shown in the Logstash blog post that both input sources create the same results. This helps us reduce the number of needed terminals: we can use the Logstash terminal to add the data, and there is no need to open a separate terminal for manipulating the input file.

2016-11-18-20_27_19

Note: For this step, make sure to have at least 500 MB of memory left on your (Docker) host after starting Elasticsearch, e.g. by checking with top. In my tests, I have created a Docker host VM with a total memory of 4 GB. I have seen Elasticsearch occupy up to 2.9 GB, while Logstash may need another 0.5 GB.

On the Docker host, we create a configuration file logstash_to_elasticsearch.conf like follows:

#logstash_to_elasticsearch.conf
input {
  stdin { }
}

output {
  elasticsearch {
    action => "index"
    index => "logstash"
    hosts => "10.0.2.15"
    workers => 1
  }
  stdout { }
}
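
An optional side note: if you are unsure which IP address to use for the hosts setting on your system, it can be looked up on the Docker host (interface names may differ between setups):

(dockerhost)$ ip addr show docker0   # address of the Docker bridge interface
(dockerhost)$ ip addr show eth0      # address of the VM's primary interface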

Here, 10.0.2.15 is the IP address of the Docker host (interface docker0). We have used STDIN and STDOUT for simplicity. This way, we can just type the input data into the Logstash terminal, similar to yesterday's Logstash blog post, as follows:

(dockerhost)$ sudo docker run -it --rm --name logstash -v "$PWD":/app --entrypoint bash logstash

And within the container we start Logstash with this configuration file:

(container)# logstash -f /app/logstash_to_elasticsearch.conf
...
18:43:58.751 [[main]-pipeline-manager] INFO  logstash.outputs.elasticsearch - Elasticsearch pool URLs updated {:changes=>{:removed=>[], :added=>["http://10.0.2.15:9200"]}}

In a second terminal on the Docker host, we clean the Elasticsearch database and verify that the database is empty by checking that the total number of entries is 0:

(dockerhost)$ curl -XDELETE 'http://localhost:9200/_all'
{"acknowledged":true}
(dockerhost)$ curl -XGET localhost:9200/_search
{"took":1,"timed_out":false,"_shards":{"total":0,"successful":0,"failed":0},"hits":{"total":0,"max_score":0.0,"hits":[]}}

Caution: this will delete all data in the database!

Now we type into the Logstash terminal:

This is a testlog<Enter>

In the Elasticsearch terminal, we see the log:

[2016-11-18T19:12:15,275][INFO ][o.e.c.m.MetaDataCreateIndexService] [kam5hQi] [logstash] creating index, cause [auto(bulk api)], templates [], shards [5]/[1], mappings []
[2016-11-18T19:12:15,422][INFO ][o.e.c.m.MetaDataMappingService] [kam5hQi] [logstash/TbRsmMiFRbuGyP_THANk3w] create_mapping [logs]

And with the following command we can verify that Logstash has forwarded the data to Elasticsearch:

(dockerhost)$ curl -XGET localhost:9200/_search
{"took":4,"timed_out":false,"_shards":{"total":5,"successful":5,"failed":0},"hits":{"total":1,"max_score":1.0,"hits":[{"_index":"logstash","_type":"logs","_id":"AVh42oA25J6ZuRKS_qBB","_score":1.0,"_source":{"@timestamp":"2016-11-18T19:12:14.442Z","@version":"1","host":"adf58f139fd3","message":"This is a testlog","tags":[]}}]}}

Perfect! With that we have verified that data is sent from Logstash to Elasticsearch.

thumps_up_3

Appendix A: Error: Cannot allocate memory

This error has been seen when running Elasticsearch as a Docker container on a Docker host with only 250 MB of RAM left (as seen with top).

(dockerhost)$ sudo docker run -it --rm elasticsearch --version
OpenJDK 64-Bit Server VM warning: INFO: os::commit_memory(0x000000008a660000, 1973026816, 0) failed; error='Cannot allocate memory' (errno=12)
#
# There is insufficient memory for the Java Runtime Environment to continue.
# Native memory allocation (mmap) failed to map 1973026816 bytes for committing reserved memory.
# An error report file with more information is saved as:
# /tmp/hs_err_pid1.log

Resolution:

A temporary resolution is to

  1. shut down the Vagrant Docker host via vagrant halt,
  2. open the VirtualBox console,
  3. increase the memory by ~500 MB (right-click the VM in the left pane of the VirtualBox console -> Change -> System -> increase memory), and
  4. start the Vagrant Docker host again via vagrant up.

A permanent solution is to

  1. increase the value of vb.memory in the Vagrantfile, e.g. from

vb.memory = "1536"

to

vb.memory = "4096"

With that, the next time a VirtualBox VM is created by Vagrant, the new value will be used. I have also seen that the reboot itself freed up quite some resources…

Appendix B: vm.max_map_count too low

The Elasticsearch application requires a minimum vm.max_map_count of 262144. See the official documentation for details. If this minimum requirement is not met, we see the following log during startup of Elasticsearch:

$ sudo docker run -it --rm --name elasticsearch -p9200:9200 -p9300:9300 elasticsearch
[2016-11-18T13:29:35,124][INFO ][o.e.n.Node ] [] initializing ...
[2016-11-18T13:29:35,258][INFO ][o.e.e.NodeEnvironment ] [SfJmZdJ] using [1] data paths, mounts [[/usr/share/elasticsearch/data (/dev/dm-0)]], net usable_space [32.3gb], net total_space [38.2gb], spins? [possibly], types [ext4]
[2016-11-18T13:29:35,258][INFO ][o.e.e.NodeEnvironment ] [SfJmZdJ] heap size [1.9gb], compressed ordinary object pointers [true]
[2016-11-18T13:29:35,261][INFO ][o.e.n.Node ] [SfJmZdJ] node name [SfJmZdJ] derived from node ID; set [node.name] to override
[2016-11-18T13:29:35,267][INFO ][o.e.n.Node ] [SfJmZdJ] version[5.0.1], pid[1], build[080bb47/2016-11-11T22:08:49.812Z], OS[Linux/4.2.0-42-generic/amd64], JVM[Oracle Corporation/OpenJDK 64-Bit Server VM/1.8.0_111/25.111-b14]
[2016-11-18T13:29:37,449][INFO ][o.e.p.PluginsService ] [SfJmZdJ] loaded module [aggs-matrix-stats]
[2016-11-18T13:29:37,450][INFO ][o.e.p.PluginsService ] [SfJmZdJ] loaded module [ingest-common]
[2016-11-18T13:29:37,451][INFO ][o.e.p.PluginsService ] [SfJmZdJ] loaded module [lang-expression]
[2016-11-18T13:29:37,452][INFO ][o.e.p.PluginsService ] [SfJmZdJ] loaded module [lang-groovy]
[2016-11-18T13:29:37,452][INFO ][o.e.p.PluginsService ] [SfJmZdJ] loaded module [lang-mustache]
[2016-11-18T13:29:37,453][INFO ][o.e.p.PluginsService ] [SfJmZdJ] loaded module [lang-painless]
[2016-11-18T13:29:37,455][INFO ][o.e.p.PluginsService ] [SfJmZdJ] loaded module [percolator]
[2016-11-18T13:29:37,455][INFO ][o.e.p.PluginsService ] [SfJmZdJ] loaded module [reindex]
[2016-11-18T13:29:37,456][INFO ][o.e.p.PluginsService ] [SfJmZdJ] loaded module [transport-netty3]
[2016-11-18T13:29:37,456][INFO ][o.e.p.PluginsService ] [SfJmZdJ] loaded module [transport-netty4]
[2016-11-18T13:29:37,457][INFO ][o.e.p.PluginsService ] [SfJmZdJ] no plugins loaded
[2016-11-18T13:29:37,807][WARN ][o.e.d.s.g.GroovyScriptEngineService] [groovy] scripts are deprecated, use [painless] scripts instead
[2016-11-18T13:29:43,310][INFO ][o.e.n.Node ] [SfJmZdJ] initialized
[2016-11-18T13:29:43,310][INFO ][o.e.n.Node ] [SfJmZdJ] starting ...
[2016-11-18T13:29:43,716][INFO ][o.e.t.TransportService ] [SfJmZdJ] publish_address {172.17.0.3:9300}, bound_addresses {[::]:9300}
[2016-11-18T13:29:43,725][INFO ][o.e.b.BootstrapCheck ] [SfJmZdJ] bound or publishing to a non-loopback or non-link-local address, enforcing bootstrap checks
ERROR: bootstrap checks failed
max virtual memory areas vm.max_map_count [65530] is too low, increase to at least [262144]
[2016-11-18T13:29:43,741][INFO ][o.e.n.Node ] [SfJmZdJ] stopping ...
[2016-11-18T13:29:43,763][INFO ][o.e.n.Node ] [SfJmZdJ] stopped
[2016-11-18T13:29:43,764][INFO ][o.e.n.Node ] [SfJmZdJ] closing ...
[2016-11-18T13:29:43,791][INFO ][o.e.n.Node ] [SfJmZdJ] closed

Resolution:

Temporary solution:

(dockerhost)$ sudo sysctl -w vm.max_map_count=262144

The new value takes effect immediately (but will be lost after a reboot); afterwards, start the Elasticsearch container again.

Permanent solution on LINUX hosts:

Update the vm.max_map_count setting to 262144 or more in /etc/sysctl.conf. To verify after rebooting, run sysctl vm.max_map_count.

Permanent solution for future Vagrant-created LINUX hosts:

In case we use Vagrant to create Linux VMs, we also need to make sure the next VM is created with the correct vm.max_map_count setting. For that, we can run a startup script as described here:

In the Vagrantfile we set:

config.vm.provision :file, :source => "elasticsearchpreparation.sh", :destination => "/tmp/elasticsearchpreparation.sh"  
config.vm.provision :shell, :inline => "sudo sed -i 's/\r//g' /tmp/elasticsearchpreparation.sh && chmod +x /tmp/elasticsearchpreparation.sh && /tmp/elasticsearchpreparation.sh", :privileged => true

with the file elasticsearchpreparation.sh:

#!/usr/bin/env bash
# file: elasticsearchpreparation.sh
sudo sysctl -w vm.max_map_count=262144
ulimit -n 65536

The sed and chmod commands make sense on Windows hosts in order to make sure the file has UNIX format and the required rights. Also here, make sure to run sysctl vm.max_map_count in order to check that the configuration is active (this might require a reboot).

Summary

In this blog post we have performed following Hello World tasks:

  1. we have fed Elasticsearch with JSON-style data using simple curl commands
  2. we have shown how to read and search data by full text search and by time range
  3. we have shown how Logstash can be used as the data source to feed data into the Elasticsearch database

In order to avoid any compatibility issues with the Java version on the host, we have run both Elasticsearch and Logstash in Docker containers. In order to better see what happens under the hood, we have chosen Docker containers in interactive terminal mode. In the course of the tests, we hit two memory resource issues: too little memory and a too low number of mapped memory areas. Those issues and their workarounds/solutions are described in Appendices A and B.

References

 


Logstash “Hello World” Example – Part 1 of the ELK Stack Series


2016-11-17-17_10_26-https___static-www-elastic-co_assets_bltdf06b3795cdbfb45_elastic-logstash-fw-svg

Today, we will first introduce Logstash, an open source project created by Elastic, before we perform a little Logstash “Hello World”: we will show how to read data from the command line or from a file, transform the data, and send it back to the command line or to a file. In the appendix, you will find a note on Logstash CSV input performance and on how to replace the timestamp with a custom timestamp read from the input message (e.g. from the input file).

For maximum interoperability with the host system (so that the installed Java version becomes irrelevant), Logstash will be run in a Docker-based container sandbox.

This is the first blog post of a series about the Elastic Stack (a.k.a. ELK stack):

What is Logstash?

Logstash can collect logging data from a multitude of sources, transform the data, and send the data to a multitude of “stashes”.

logstash_input_output
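
As a small preview of the “transform” part (a sketch only, not needed for the steps below): once the Docker host from Step 1 is available, a filter block can be placed between input and output, for example a mutate filter that adds a field to every event:

(dockerhost)$ sudo docker run -it --rm logstash -e 'input { stdin { } } filter { mutate { add_field => { "source" => "stdin" } } } output { stdout { codec => rubydebug } }'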

Elastic’s “favorite stash” is Elasticsearch, another open source project driven by Elastic. Together with Kibana, Logstash and Elasticsearch form the so-called ELK pipeline:

  • Elasticsearch is for searching, analyzing, and storing your data
  • Logstash (and Beats) is for collecting and transforming data, from any source, in any format
  • Kibana is a portal for visualizing the data and for navigating within the Elastic Stack

 

2016-11-17-18_31_39

 

In the current blog post, we will restrict ourselves to simplified Hello World pipelines like the following:

2016-11-17-19_52_26

and:

2016-11-17-18_34_43

We will first read from and write to the command line, before we use log files as input sources and output destinations.

Tools used

  • Vagrant 1.8.6
  • Virtualbox 5.0.20
  • Docker 1.12.1
  • Logstash 5.0.1

Step 1: Install a Docker Host via Vagrant and Connect to the Host via SSH

We will run Logstash in a Docker container in order to allow for maximum interoperability. This way, we can always use the latest Logstash version without having to control the Java version on the host: e.g. Logstash 1.4.x works with Java 7, while version 5.0.x currently works with Java 8 only.

If you are new to Docker, you might want to read this blog post.

Installing Docker on Windows and Mac can be a real challenge, but no worries: we will show an easy way here that is much quicker than the one described in Docker’s official documentation:

Prerequisites of this step:

  • I recommend having direct access to the Internet: via a firewall, but without an HTTP proxy. However, if you cannot get rid of your HTTP proxy, read this blog post.
  • Administration rights on your computer.

Steps to install a Docker Host VirtualBox VM:

1. Download and install Virtualbox (if the installation fails with error message “Oracle VM Virtualbox x.x.x Setup Wizard ended prematurely” see Appendix A of this blog post: Virtualbox Installation Workaround below)

2. Download and install Vagrant (requires a reboot)

3. Download the Vagrant box containing an Ubuntu-based Docker host and create a VirtualBox VM as follows:

(basesystem)# mkdir ubuntu-trusty64-docker ; cd ubuntu-trusty64-docker
(basesystem)# vagrant init williamyeh/ubuntu-trusty64-docker
(basesystem)# vagrant up
(basesystem)# vagrant ssh
(dockerhost)$

Now you are logged into the Docker host and we are ready for the next step: to download and run the Logstash image.

Note: I have experienced problems with the vi editor when running vagrant ssh in a Windows terminal. In case of Windows, consider following Appendix C of this blog post and using putty instead.

Step 2 (optional): Download Logstash Image

This extra download step is optional, since the Logstash Docker image will be downloaded automatically in step 3, if it is not already found on the system:

(dockerhost)$ sudo docker pull logstash
Unable to find image 'logstash:latest' locally
latest: Pulling from library/logstash

386a066cd84a: Already exists
75ea84187083: Already exists
3e2e387eb26a: Pull complete
eef540699244: Pull complete
1624a2f8d114: Pull complete
7018f4ec6e0a: Pull complete
6ca3bc2ad3b3: Pull complete
3829939e7052: Pull complete
1cf20bb3ce62: Pull complete
f737f281552e: Pull complete
f1b7aca72edd: Pull complete
fb821ca73c54: Pull complete
c1543e80c12a: Pull complete
566f64970d2a: Pull complete
de88d0e92195: Pull complete
Digest: sha256:048a18100f18cdec3a42ebaa42042d5ee5bb3acceacea027dee4ae3819039da7
Status: Downloaded newer image for logstash:latest

The version of the downloaded Logstash image can be checked with the following command:

(dockerhost)$ sudo docker run -it --rm logstash --version
logstash 5.0.1

We are using version 5.0.1 currently.

Step 3: Run Logstash as a Translator from Command Line to Command Line

In this step, we will use Logstash to translate the command line standard input (STDIN) to command line standard output (STDOUT).

2016-11-17-19_52_26

Once a Docker host is available, downloading, installing and running Logstash is as simple as typing the following command. If the image has already been downloaded, because Step 2 was accomplished before, the download part will be skipped:

(dockerhost)$ sudo docker run -it --rm logstash -e 'input { stdin { } } output { stdout { } }'

With the -e option, we tell Logstash to read from the command line input (STDIN) and to send all output to the command line output (STDOUT).

The output looks as follows:

Unable to find image 'logstash:latest' locally
latest: Pulling from library/logstash

386a066cd84a: Already exists
75ea84187083: Already exists
3e2e387eb26a: Pull complete
eef540699244: Pull complete
1624a2f8d114: Pull complete
7018f4ec6e0a: Pull complete
6ca3bc2ad3b3: Pull complete
3829939e7052: Pull complete
1cf20bb3ce62: Pull complete
f737f281552e: Pull complete
f1b7aca72edd: Pull complete
fb821ca73c54: Pull complete
c1543e80c12a: Pull complete
566f64970d2a: Pull complete
de88d0e92195: Pull complete
Digest: sha256:048a18100f18cdec3a42ebaa42042d5ee5bb3acceacea027dee4ae3819039da7
Status: Downloaded newer image for logstash:latest
Sending Logstash's logs to /var/log/logstash which is now configured via log4j2.properties
The stdin plugin is now waiting for input:
11:19:07.293 [[main]-pipeline-manager] INFO  logstash.pipeline - Starting pipeline {"id"=>"main", "pipeline.workers"=>2, "pipeline.batch.size"=>125, "pipeline.batch.delay"=>5, "pipeline.max_inflight"=>250}
11:19:07.334 [[main]-pipeline-manager] INFO  logstash.pipeline - Pipeline main started
11:19:07.447 [Api Webserver] INFO  logstash.agent - Successfully started Logstash API endpoint {:port=>9600}

In the first part, the Logstash Docker image is downloaded from Docker Hub, if it is not already available locally. Then we see the logs of the Logstash startup, and the output stops, waiting for your input. Now, if we type

hello logstash

we get an output similar to

2016-11-17T11:35:10.764Z 828389ba165b hello logstash

We can stop the container by typing <Ctrl>-D and we will get an output like

11:51:20.132 [LogStash::Runner] WARN  logstash.agent - stopping pipeline {:id=>"main"}

Now let us try another output format:

(dockerhost)$ sudo docker run -it --rm logstash -e 'input { stdin { } } output { stdout { codec => rubydebug } }'
Sending Logstash's logs to /var/log/logstash which is now configured via log4j2.properties
The stdin plugin is now waiting for input:
11:48:05.746 [[main]-pipeline-manager] INFO logstash.pipeline - Starting pipeline {"id"=>"main", "pipeline.workers"=>2, "pipeline.batch.size"=>125, "pipeline.batch.delay"=>5, "pipeline.max_inflight"=>250}
11:48:05.760 [[main]-pipeline-manager] INFO logstash.pipeline - Pipeline main started
11:48:05.827 [Api Webserver] INFO logstash.agent - Successfully started Logstash API endpoint {:port=>9600}

You will need to wait for ~8 sec before you can send your first log to the STDIN. Let us do that now and type:

hello logstash in ruby style

This will produce an output like

{
 "@timestamp" => 2016-11-17T11:50:24.571Z,
 "@version" => "1",
 "host" => "9cd979a20db4",
 "message" => "hello logstash in ruby style",
 "tags" => []
}

Step 4: Run Logstash as a Translator from File to File

In this example, we will use (log) files as input source and output destination:

2016-11-17-18_34_43

For this, we will create a Logstash configuration file on the Docker host as follows:

#logstash.conf
input {
  file {
    path => "/app/input.log"
  }
}

output {
  file {
    path => "/app/output.log"
  }
}

To be able to read a file in the current directory on the Docker host, we need to map the current directory to a directory inside the Docker container using the -v switch. This time we need to override the entrypoint, since we need access to the command line of the container itself: we cannot simply manipulate the input file on the mapped volume from the Docker host, because that leads to the permission error described in Appendix A below. So we start the container with a bash entrypoint:

(dockerhost-terminal1)$ sudo docker run -it --rm --name logstash -v "$PWD":/app --entrypoint bash logstash

Then within the container we run logstash:

(container-terminal1)# logstash -f /app/logstash.conf

In a second terminal on the docker host, we need to run a second bash terminal within the container by issuing the command:

(dockerhost-terminal2)$ sudo docker exec -it logstash bash

Now, on the container command line, we prepare to watch the output as follows:

(container-terminal2)# touch /app/output.log; tail -f /app/output.log

Now we need a third terminal, where we connect to the container again. Then we send a “Hello Logstash” to the input file:

(dockerhost-terminal3)$ sudo docker exec -it logstash bash
(container-terminal3)# echo "Hello Logstash" >> /app/input.log

This will create the following output on terminal 2:

{"path":"/app/input.log","@timestamp":"2016-11-17T19:53:02.728Z","@version":"1","host":"88a342b6b385","message":"Hello Logstash","tags":[]}

The output is in a format Elasticsearch understands.
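In fact, forwarding such events to Elasticsearch only requires swapping the file output for the elasticsearch output plugin. A minimal, hedged sketch (the host name is an assumption; a complete example can be found in Appendix C below):

output {
  elasticsearch {
    hosts => "elasticsearch"   # assumed host name of the Elasticsearch container
    index => "logstash-demo"   # illustrative index name
  }
}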

In order to improve the readability of the output, we can specify a “plain” output codec in the configuration file:

#logstash.conf
input {
  file {
    path => "/app/input.log"
  }
}

output {
  file {
    path => "/app/output.log"
    codec => "plain"
  }
}

Note that a change of the Logstash configuration file content requires the Logstash process to be restarted for the change to take effect; i.e. we can stop it with Ctrl-C and restart the Logstash process in terminal 1 with

(container-terminal1)# logstash -f /app/logstash.conf

Now again

(container-terminal-3)# echo "Hello Logstash" >> /app/input.log

in terminal 3. That will produce the following syslog-style output on terminal 2:

2016-11-17T20:10:39.861Z 88a342b6b385 Hello Logstash

Appendix A: Error Errno::EACCES: Permission denied if the Logfile is changed on a mapped Volume

This error has been seen when running Logstash as a Docker container with a mapped folder and manipulating the input file from the Docker host:

(dockerhost)$ sudo docker run -it --rm -v "$PWD":/app logstash -f /app/logstash.conf
Sending Logstash's logs to /var/log/logstash which is now configured via log4j2.properties
19:15:59.927 [[main]-pipeline-manager] INFO logstash.pipeline - Starting pipeline {"id"=>"main", "pipeline.workers"=>2, "pipeline.batch.size"=>125, "pipeline.batch.delay"=>5, "pipeline.max_inflight"=>250}
19:15:59.940 [[main]-pipeline-manager] INFO logstash.pipeline - Pipeline main started
19:16:00.005 [Api Webserver] INFO logstash.agent - Successfully started Logstash API endpoint {:port=>9600}

If we now change the input file on the Docker host in a second terminal as follows:

(dockerhost)$ echo "Hello Logstash" >> input.log

we receive the following output on the first terminal:

19:22:47.732 [[main]>worker1] INFO logstash.outputs.file - Opening file {:path=>"/app/output.log"}
19:22:47.779 [LogStash::Runner] FATAL logstash.runner - An unexpected error occurred! {:error=>#<Errno::EACCES: Permission denied - /app/output.log>, :backtrace=>["org/jruby/RubyFile.java:370:in `initialize'", "org/jruby/RubyIO.java:871:in `new'", "/usr/share/logstash/vendor/bundle/jruby/1.9/gems/logstash-output-file-4.0.1/lib/logstash/outputs/file.rb:280:in `open'", "/usr/share/logstash/vendor/bundle/jruby/1.9/gems/logstash-output-file-4.0.1/lib/logstash/outputs/file.rb:132:in `multi_receive_encoded'", "org/jruby/RubyHash.java:1342:in `each'", "/usr/share/logstash/vendor/bundle/jruby/1.9/gems/logstash-output-file-4.0.1/lib/logstash/outputs/file.rb:131:in `multi_receive_encoded'", "org/jruby/ext/thread/Mutex.java:149:in `synchronize'", "/usr/share/logstash/vendor/bundle/jruby/1.9/gems/logstash-output-file-4.0.1/lib/logstash/outputs/file.rb:130:in `multi_receive_encoded'", "/usr/share/logstash/logstash-core/lib/logstash/outputs/base.rb:90:in `multi_receive'", "/usr/share/logstash/logstash-core/lib/logstash/output_delegator_strategies/shared.rb:12:in `multi_receive'", "/usr/share/logstash/logstash-core/lib/logstash/output_delegator.rb:42:in `multi_receive'", "/usr/share/logstash/logstash-core/lib/logstash/pipeline.rb:297:in `output_batch'", "org/jruby/RubyHash.java:1342:in `each'", "/usr/share/logstash/logstash-core/lib/logstash/pipeline.rb:296:in `output_batch'", "/usr/share/logstash/logstash-core/lib/logstash/pipeline.rb:252:in `worker_loop'", "/usr/share/logstash/logstash-core/lib/logstash/pipeline.rb:225:in `start_workers'"]}
(dockerhost)$

There is a problem with the synchronization of the input.log file from the Docker host to the container, causing the Docker container to stop. The workaround is to run the container with a bash entrypoint and to manipulate the file from within the container, as shown in the step-by-step guide above.
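For reference, the workaround boils down to the commands already used above:

# terminal 1: start the container with a bash entrypoint and run Logstash inside the container:
(dockerhost-terminal1)$ sudo docker run -it --rm --name logstash -v "$PWD":/app --entrypoint bash logstash
(container-terminal1)# logstash -f /app/logstash.conf

# terminal 2: manipulate the input file from within the container, not from the Docker host:
(dockerhost-terminal2)$ sudo docker exec -it logstash bash
(container-terminal2)# echo "Hello Logstash" >> /app/input.log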

Appendix B: How to apply a custom Time Stamp

In a real customer project, I had the task to visualize the data of certain data dump files, which had their own time stamps in a custom format like the following:

2016-11-21|00:00:00|<other data>

Okay, you are right in thinking that this is a CSV with a pipe (|) separator and that the CSV Logstash plugin should be applied. However, before doing so, we can take it as an example of how to replace the built-in Logstash timestamp variable called @timestamp. This is better than creating your own timestamp variable with a different name: the latter is also possible and works with normal Kibana visualizations, but it does not seem to work with Timelion for more complex visualizations. So let us do it the right way now:

We will create a simple Logstash configuration file for demonstrating the topic as follows:

# logstash_custom_timestamp.conf
input {
  stdin { }
  file {
    path => "/app/input/*.*"
    start_position => "beginning"
    sincedb_path => "/dev/null"
  }
}

With that, we allow for STDIN input as well as for file input from any file you dump into the path /app/input/*. For testing, we have set the start_position to “beginning”, i.e. Logstash will always read the files from the beginning, even if it has already read parts of them. In addition, by setting the sincedb_path to "/dev/null", we make sure that Logstash forgets which files have already been processed. This way, we can restart Logstash and re-process any files in the folder.

Now let us find the time variable with a grok filter and replace the time variable with the date plugin:

filter {
  grok {
    match => {"message" => "(?<mydate>[1-9][0-9]{3}-[0-9]{2}-[0-9]{2}\|[0-9]{2}:[0-9]{2}:[0-9]{2})"}
  }

  date {
    match => ["mydate", "YYYY-MM-dd|HH:mm:ss", "MMM  d HH:mm:ss", "MMM dd HH:mm:ss", "ISO8601"]
    target => "@timestamp"
  }

}

The grok filter allows us to define a new interim variable named mydate, if the specified regular expression is found in the input message. In our case, we want to match something like 2016-11-21|00:00:00, i.e. one digit between 1 and 9 ([1-9]) and three digits between 0 and 9 ([0-9]{3}), then a dash (-), then two digits ([0-9]{2}), and so on.

Then we can use the date plugin to overwrite the built-in @timestamp with the variable mydate we have created with the grok filter. Within the date plugin, we can match clauses like YYYY-MM-dd|HH:mm:ss in the mydate variable and push the result to the @timestamp variable.

Note that it is not possible to just use the replace directive: if we try to overwrite @timestamp with mydate using the replace directive, Logstash will complain that you cannot overwrite a time variable with a String variable.
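For illustration, a sketch of the naive approach that does not work, since @timestamp is a Timestamp object while mydate is a plain String (only the date filter above performs the required conversion):

filter {
  mutate {
    # this does NOT work: @timestamp cannot be overwritten with a String value
    replace => { "@timestamp" => "%{mydate}" }
  }
}

Finally, the output section of our demonstration configuration simply prints the events in rubydebug format: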

output {
  stdout { codec => rubydebug }
}

Now, let us start Logstash in a Docker container and test the configuration:

(dockerhost)$ sudo docker run -it --rm --name logstash -v "$PWD":/app --entrypoint bash logstash
(container)$ logstash -f /app/logstash_custom_timestamp.conf

And now, the container is waiting for input. We do not let it wait and type in the first line shown below; the rest is the output produced by Logstash:

1966-10-23|12:00:00|birthday
{
 "mydate" => "1966-10-23|12:00:00",
 "@timestamp" => 1966-10-23T12:00:00.000Z,
 "@version" => "1",
 "host" => "02cec85c3aac",
 "message" => "1966-10-23|12:00:00|birthday",
 "tags" => []
}

Success: the built-in timestamp variable @timestamp has been updated with the date found in the input message.

Let us observe what happens with input messages that do not match:

this is a message that does not match
{
    "@timestamp" => 2016-12-04T10:19:17.501Z,
      "@version" => "1",
          "host" => "02cec85c3aac",
       "message" => "this is a message that does not match",
          "tags" => [
        [0] "_grokparsefailure"
    ]
}

We can see that the output is tagged with "_grokparsefailure" in this case and the timestamp is set to the current date and time, as expected.
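If such non-matching lines should not be processed any further, a conditional on that tag can drop them. A minimal, hedged sketch:

filter {
  # drop events whose date could not be parsed by the grok filter above:
  if "_grokparsefailure" in [tags] {
    drop { }
  }
}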

Appendix C: Logstash CSV read Performance

In a real project, I had to read in many millions of lines from a large set of CSV files. I have experienced that it took quite a bit of time to read in the data, so I want to measure the input performance of Logstash to be able to estimate the time consumption.

Note: we will reduce the data volume by random sampling of the input. This optimizes input and Elasticsearch performance with the trade-off that the data analysis becomes less accurate. However, if each data point still has more than 100 samples, the error is expected to be lower than a few per cent, provided the input data has no “unhealthy” value distribution (e.g. many records with low values and only a few records with very large values).

Tools used:

  • Notebook with i7-6700HQ CPU and 64 GB RAM and Windows 10 Pro
  • VirtualBox 5.0.20 r106931
  • VirtualBox VM with Ubuntu 14.04, 4GB RAM and 2 vCPU
  • Docker installed 1.12.1, build 23cf638
  • 3 Docker containers running in interactive mode (the performance in detached mode might be higher, so we will measure a lower bound of the performance):
    • Logstash 5.0.1
    • Elasticsearch 5.0.1
    • Kibana 5.0.1
  • Data input files:
    • CSV files with 12,200 lines each
    • Sample data lines (note that the first line of each file will be dropped by Logstash):
DATUM|ZEIT|IPV4_SRC_ADDR|IPV4_DST_ADDR|ROUTER_IP|INTF_IN|INTF_OUT|TOS|FLAGS|IP_PROTOCOL_VERSION|PROTOCOL|L4_SRC_PORT|L4_DST_PORT|IN_PKTS|IN_BYTES|FLOWS
2016-11-23|15:58:10|9.1.7.231|164.25.118.50|9.0.253.1|2|0|0|0|4|17|49384|161|6|1602|1
2016-10-23|15:58:12|9.1.7.231|9.60.64.1|9.0.253.1|2|2|0|0|4|17|51523|161|1|78|1
...

Logstash configuration

# logstash_netflow_csv_to_elasticsearch.conf
input {
  stdin { }
  file {
    path => "/app/input/*.*"
    start_position => "beginning"
    sincedb_path => "/dev/null"
  }
}

filter {
  ruby {
    # Sampling:
    code => "event.cancel if rand <= 0.90" # 10% sampling (i.e. cancel 90% of events) 
    #code => "event.cancel if rand <= 0.99" # 1% sampling (i.e. cancel 99% of events)
  } 

  # set timestamp: read from message 
  grok { 
    match => {"message" => "(?[1-9][0-9]{3}-[0-9]{2}-[0-9]{2}\|[0-9]{2}:[0-9]{2}:[0-9]{2})"}
  }

  # set timestamp: overwrite time stamp
  date {
    match => ["mydate", "YYYY-MM-dd|HH:mm:ss", "MMM  d HH:mm:ss", "MMM dd HH:mm:ss", "ISO8601"]
    target => "@timestamp"
  }

  csv {

    columns => [
      "DATUM",
      "ZEIT",
      "IPV4_SRC_ADDR",
      "IPV4_DST_ADDR",
      "ROUTER_IP",
      "INTF_IN",
      "INTF_OUT",
      "TOS",
      "FLAGS",
      "IP_PROTOCOL_VERSION",
      "PROTOCOL",
      "L4_SRC_PORT",
      "L4_DST_PORT",
      "IN_PKTS",
      "IN_BYTES",
      "FLOWS"
    ]

    separator => "|"
    remove_field => ["mydate"]
  }

  if ([DATUM] == "DATUM") {
    drop { }
  }

}

output {
  stdout { codec => dots }

  elasticsearch {
    action => "index"
    index => "csv"
    hosts => "elasticsearch"
    document_type => "data"
    workers => 1
  }
}

Results without Elasticsearch output

As a baseline, we will first perform tests with the elasticsearch output commented out:

Test 1) 100% Sampling -> 3,400 lines/sec (i.e. 3,400 data sets/sec)
(10 files with 12,200 lines each in ~35 sec)

Test 2) 10% Sampling -> 6,100 lines/sec (i.e. 610 data sets/sec)
(10 files with 12,200 lines each in ~20 sec)

Test 3) 1% Sampling -> 8,100 lines/sec (i.e. 81 data sets/sec)
(10 files with 12,200 lines each in ~15 sec)

Results with Elasticsearch output

Now let us test the performance in case the data is sent to Elasticsearch:

Test 1) 100% Sampling -> 1,700 lines/sec (i.e. 1,700 data sets/sec)
(10 files with 12,200 lines each in ~70 sec)

Test 2) 10% Sampling -> 3,500 lines/sec (i.e. 350 data sets/sec)
(10 files with 12,200 lines each in ~35 sec)

Test 3) 1% Sampling -> 6,100 lines/sec (i.e. 61 data sets/sec)
(10 files with 12,200 lines each in ~20 sec)

2016-12-05-14_27_24-logstash-input-performance-with-and-without-elasticsearch-output-ov-v0-1-ods-l

As we can see, the input rate is roughly 2,000 lines/sec lower if the output is sent to Elasticsearch instead of being sent to the console only (dots) (yellow vs. blue line).

In case of output to Elasticsearch, we get the following rate graph:

2016-12-05-15_45_32-logstash-input-performance-with-and-without-elasticsearch-output-ov-v0-1-ods-l

  • Sampling rate 1%: if only 1% of the data records are sent to the output, the input rate increases to 6,100 lines/sec (a factor of ~3.6 compared to a sampling rate of 100%).
  • Sampling rate 10%: if only 10% of the data records are sent to the output, one could expect the input rate to increase by a factor of 10 compared to 100% sampling, if the output pipe were the bottleneck. This does not seem to be the case, since we observe an increase by a factor of 2 only (3,500 lines/sec).
  • Sampling rate 100%: if all input lines are sent to the output, we can reach ~1,700 lines/sec.

The optimum sampling rate is determined by increasing the sampling rate until the required data accuracy is reached. The data accuracy can be checked by randomly sampling the same set of data several times and observing the variance of the output.

Summary

In this blog post we have created two simple Hello World examples:

  1. one for translation between command line input and command line output and
  2. a second one for translation from a file to a file.

In order to avoid any compatibility issues with the java version on the host, we have run Logstash in a Docker container. This works fine, if the input file is manipulated from within the container. As seen in Appendix A, we cannot manipulate the file on a mapped volume on the Docker Host, though.

References

 


Java Build Automation Part 2: Create executable jar using Gradle


Original title: How to build a lean JAR File with Gradle

2016-11-14-19_15_52

In this step by step guide, we will show that Gradle is a good alternative to Maven for packaging java code into executable jar files. In order to keep the executable jar files “lean”, we will keep the dependent jar files outside of the jar in a separate folder.

Tools Used

  1. Maven 3.3.9
  2. JDK 1.8.0_101
  3. log4j 1.2.17 (downloaded automatically)
  4. Joda-time 2.5 (downloaded automatically)
  5. Git-2.8.4 with GNU bash 4.3.42(5)

Why use Gradle for a Maven Project?

In this blog post, we will show how Gradle can be used to create an executable/runnable jar. The same task has been accomplished with Maven on this popular Mkyong blog post. Why would we want to do the same task using Gradle?

By working with both Maven and Gradle, I have found that:

  • Gradle allows me to move any resource file outside of the jar without the need for any additional Linux script or the like;
  • Gradle allows me to easily create an executable/runnable jar for the JUnit tests, even if those are not separated into a separate project.

Moreover, while Maven is declarative, Gradle is procedural in nature. With Maven, you describe the goal and rely on Maven and its plugins to perform the steps you had in mind, whereas with Gradle, you have explicit control over each step of the build process. Gradle is easy to understand for programmers and gives them fine-grained control over the build process.
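As a small, hedged illustration of this fine-grained control (the task name is made up), a custom Gradle task can hook into the build and inspect the resolved compile dependencies directly:

// illustrative custom task: print all resolved compile dependency jars
task listDependencyJars {
    doLast {
        configurations.compile.each { jarFile ->
            println "dependency jar: " + jarFile.name
        }
    }
}

It can be run with gradle listDependencyJars once the build.gradle file created below is in place.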

The Goal: a lean, executable JAR File

In the following step by step guide, we will create a lean executable jar file with all dependent libraries and resources.

Step 1: Download Hello World Maven Project of Mkyong

Download this hello world Maven project you can find on this popular HowTo page from Mkyong:

curl -OJ http://www.mkyong.com/wp-content/uploads/2012/11/maven-create-a-jar.zip
unzip maven-create-a-jar.zip
cd dateUtils

Logs:

$ curl -OJ http://www.mkyong.com/wp-content/uploads/2012/11/maven-create-a-jar.zip
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  7439  100  7439    0     0  23722      0 --:--:-- --:--:-- --:--:-- 24963

olive@LAPTOP-P5GHOHB7  /d/veits/eclipseWorkspaceRecent/MkYong/ttt
$ unzip maven-create-a-jar.zip
Archive:  maven-create-a-jar.zip
   creating: dateUtils/
  inflating: dateUtils/.classpath
  inflating: dateUtils/.DS_Store
   creating: __MACOSX/
   creating: __MACOSX/dateUtils/
  inflating: __MACOSX/dateUtils/._.DS_Store
  inflating: dateUtils/.project
   creating: dateUtils/.settings/
  inflating: dateUtils/.settings/org.eclipse.jdt.core.prefs
  inflating: dateUtils/log4j.properties
  inflating: dateUtils/pom.xml
   creating: dateUtils/src/
   creating: dateUtils/src/main/
   creating: dateUtils/src/main/java/
   creating: dateUtils/src/main/java/com/
   creating: dateUtils/src/main/java/com/mkyong/
   creating: dateUtils/src/main/java/com/mkyong/core/
   creating: dateUtils/src/main/java/com/mkyong/core/utils/
  inflating: dateUtils/src/main/java/com/mkyong/core/utils/App.java
   creating: dateUtils/src/main/resources/
  inflating: dateUtils/src/main/resources/log4j.properties
   creating: dateUtils/src/test/
   creating: dateUtils/src/test/java/
   creating: dateUtils/src/test/java/com/
   creating: dateUtils/src/test/java/com/mkyong/
   creating: dateUtils/src/test/java/com/mkyong/core/
   creating: dateUtils/src/test/java/com/mkyong/core/utils/
  inflating: dateUtils/src/test/java/com/mkyong/core/utils/AppTest.java
olive@LAPTOP-P5GHOHB7  /d/veits/eclipseWorkspaceRecent/MkYong/ttt
$ cd dateUtils/

olive@LAPTOP-P5GHOHB7  /d/veits/eclipseWorkspaceRecent/MkYong/ttt/dateUtils
$ 

Step 2 (optional): Create GIT Repository

In order to see which files have been changed by which step, we can create a local Git repository as follows:

git init
# echo "Converting Maven to Gradle" > Readme.txt
git add .
git commit -m "first commit"

After each step, you can then repeat the last two commands with a different message, so you can always go back to a previous step, if you need to do so. If you have made changes in a step that you have not committed yet, you can easily go back to the last clean commit state by issuing the command

# go back to status of last commit:
git stash -u

Warning: this will delete any new files you have created since the last commit.

Step 3 (required): Initialize Gradle

gradle init

This will automatically create a build.gradle file from the Maven POM file with the following content:

apply plugin: 'java'
apply plugin: 'maven'

group = 'com.mkyong.core.utils'
version = '1.0-SNAPSHOT'

description = """dateUtils"""

sourceCompatibility = 1.7
targetCompatibility = 1.7

repositories {

     maven { url "http://repo.maven.apache.org/maven2" }
}
dependencies {
    compile group: 'joda-time', name: 'joda-time', version:'2.5'
    compile group: 'log4j', name: 'log4j', version:'1.2.17'
    testCompile group: 'junit', name: 'junit', version:'4.11'
}

Step 4 (required): Gather Data

Since we are starting from a Maven project, which is already prepared to create a runnable JAR via Maven, we can extract the needed data from the pom.xml file:

MAINCLASS=`grep '<mainClass' pom.xml | cut -f2 -d">" | cut -f1 -d"<"`

Note: In case the pom.xml does not contain a mainClass entry (i.e. the corresponding Maven plugin is not configured), you need to set the MAINCLASS manually, e.g.

MAINCLASS=com.mkyong.core.utils.App

We can also define where the dependency jars will be copied to later:

DEPENDENCY_JARS=dependency-jars

Logs:

$ MAINCLASS=`grep '<mainClass' pom.xml | cut -f2 -d">" | cut -f1 -d"<"`
$ echo $MAINCLASS
com.mkyong.core.utils.App
$ DEPENDENCY_JARS=dependency-jars
$ echo $DEPENDENCY_JARS
dependency-jars

Step 5 (required): Prepare to copy dependent Jars

Here, we will add instructions to the build.gradle file that define which dependency JAR files are to be copied into a directory accessible by the executable jar.

We will need to copy the jars we depend on to a folder the runnable jar will access later on. See e.g. this StackOverflow question on the topic.

cat << END >> build.gradle

// copy dependency jars to build/libs/$DEPENDENCY_JARS 
task copyJarsToLib (type: Copy) {
    def toDir = "build/libs/$DEPENDENCY_JARS"

    // create directories, if not already done:
    file(toDir).mkdirs()

    // copy jars to lib folder:
    from configurations.compile
    into toDir
}
END

Step 6 (required): Prepare the Creation of an executable JAR File

In this step, we define in the build.gradle file, how to create an executable jar file.

cat << END >> build.gradle
jar {
    // exclude log properties (recommended)
    exclude ("log4j.properties")

    // make jar executable: see http://stackoverflow.com/questions/21721119/creating-runnable-jar-with-gradle
    manifest {
        attributes (
            'Main-Class': '$MAINCLASS',
            // add classpath to Manifest; see http://stackoverflow.com/questions/30087427/add-classpath-in-manifest-file-of-jar-in-gradle
            "Class-Path": '. dependency-jars/' + configurations.compile.collect { it.getName() }.join(' dependency-jars/')
            )
    }
}
END

Step 7 (required): Define build Dependencies

Up to now, a task copyJarsToLib was defined, but this task will not be executed unless we tell Gradle to do so. In this step, we will specify that each time a jar is created, the copyJarsToLib task is to be performed beforehand. This can be done by telling Gradle that the jar goal depends on the copyJarsToLib task as follows:

cat << END >> build.gradle

// always call copyJarsToLib when building jars:
jar.dependsOn copyJarsToLib
END

Step 8 (required): Build Project

Meanwhile, the build.gradle file should have the following content:

apply plugin: 'java'
apply plugin: 'maven'

group = 'com.mkyong.core.utils'
version = '1.0-SNAPSHOT'

description = """dateUtils"""

sourceCompatibility = 1.7
targetCompatibility = 1.7

repositories {

     maven { url "http://repo.maven.apache.org/maven2" }
}
dependencies {
    compile group: 'joda-time', name: 'joda-time', version:'2.5'
    compile group: 'log4j', name: 'log4j', version:'1.2.17'
    testCompile group: 'junit', name: 'junit', version:'4.11'
}

// copy dependency jars to build/libs/dependency-jars
task copyJarsToLib (type: Copy) {
    def toDir = "build/libs/dependency-jars"

    // create directories, if not already done:
    file(toDir).mkdirs()

    // copy jars to lib folder:
    from configurations.compile
    into toDir
}

jar {
    // exclude log properties (recommended)
    exclude ("log4j.properties")

    // make jar executable: see http://stackoverflow.com/questions/21721119/creating-runnable-jar-with-gradle
    manifest {
        attributes (
            'Main-Class': 'com.mkyong.core.utils.App',
            // add classpath to Manifest; see http://stackoverflow.com/questions/30087427/add-classpath-in-manifest-file-of-jar-in-gradle
            "Class-Path": '. dependency-jars/' + configurations.compile.collect { it.getName() }.join(' dependency-jars/')
            )
    }
}

// always call copyJarsToLib when building jars:
jar.dependsOn copyJarsToLib

Now is the time to create the runnable jar file:

gradle build

Note: Be patient at this step: it can appear to hang for several minutes when run for the first time, while it is working in the background.

This will create the runnable jar at build/libs/dateUtils-1.0-SNAPSHOT.jar and will copy the dependency jars to build/libs/dependency-jars/.

Logs:

$ gradle build
:compileJava
warning: [options] bootstrap class path not set in conjunction with -source 1.7
1 warning
:processResources
:classes
:copyJarsToLib
:jar
:assemble
:compileTestJava
warning: [options] bootstrap class path not set in conjunction with -source 1.7
1 warning
:processTestResources UP-TO-DATE
:testClasses
:test
:check
:build

BUILD SUCCESSFUL

Total time: 3.183 secs

$ ls build/libs/
dateUtils-1.0-SNAPSHOT.jar dependency-jars

$ ls build/libs/dependency-jars/
joda-time-2.5.jar log4j-1.2.17.jar

Step 9: Execute the JAR file

It is best practice to exclude the log4j.properties file from the runnable jar file and place it outside of the jar file, since we want to be able to change logging levels at runtime. This is why we had excluded the properties file in step 6. In order to avoid the error “No appenders could be found for logger”, we now need to specify the location of the log4j.properties file properly on the command line.

Step 9.1 Execute JAR file on Linux

On a Linux system, we run the command as follows:

java -jar -Dlog4j.configuration=file:full_path_to_log4j.properties build/libs/dateUtils-1.0-SNAPSHOT.jar

Example:

$ java -jar -Dlog4j.configuration=file:/usr/home/me/dateUtils/log4j.properties build/libs/dateUtils-1.0-SNAPSHOT.jar
11:47:33,018 DEBUG App:18 - getLocalCurrentDate() is executed!
2016-11-14

Note: if the log4j.properties file is in the current directory on a Linux machine, we can also create a shell script run.sh with the content

#!/usr/bin/env bash
java -jar -Dlog4j.configuration=file:`pwd`/log4j.properties build/libs/dateUtils-1.0-SNAPSHOT.jar

and run it via bash run.sh

Step 9.2 Execute JAR file on Windows

In case of Windows in a CMD shell all paths need to be in Windows style:

java -jar -Dlog4j.configuration=file:D:\veits\eclipseWorkspaceRecent\MkYong\dateUtils\log4j.properties build\libs\dateUtils-1.0-SNAPSHOT.jar
11:45:30,007 DEBUG App:18 - getLocalCurrentDate() is executed!
2016-11-14

If we run the command in a Windows GNU bash shell, the syntax is kind of mixed: the path to the jar file is in Linux style, while the path to the log properties file needs to be in Windows style (this is how the Windows java.exe expects the input of this option):

$ java -jar -Dlog4j.configuration=file:'D:\veits\eclipseWorkspaceRecent\MkYong\dateUtils\log4j.properties' build/libs/dateUtils-1.0-SNAPSHOT.jar
11:45:30,007 DEBUG App:18 - getLocalCurrentDate() is executed!
2016-11-14

Single quotes have been used in order to avoid the need for escaped backslashes like D:\\veits\\eclipseWorkspaceRecent\\… that would otherwise be required on a Windows system.

Note: if the log4j.properties file is in the current directory on a Windows machine, we can also create a batch file run.bat with the content

java -jar -Dlog4j.configuration=file:%cd%\log4j.properties build\libs\dateUtils-1.0-SNAPSHOT.jar

To run the bat file on GNU bash on Windows, just type ./run.bat

Yepp, that is it: the hello world executable file is printing the date to the console, just as it did in Mkyong’s blog post, where the executable file was created using Maven.

simpleicons_interface_folder-download-symbol-svg

Download the source code from GIT.

Note: in the source code, you also will find a file named prepare_build.gradle.sh, which can be run on a bash shell and will replace the manual steps 4 to 7.

References

Next Steps

  • create an even leaner jar with resource files kept outside of the executable jar. This opens the opportunity to change resource files at runtime.
  • create an executable jar file that will run the JUnit tests.

 


How to set up Docker Monitoring via cAdvisor, InfluxDB and Grafana


Have you ever tried to monitor a docker solution? In this blog post, we will discuss three open source docker monitoring alternatives, before we go through a step by step guide for a docker monitoring solution that consists of the components Google cAdvisor as data source, InfluxDB as the database and Grafana for creating the graphs.

The post is built upon a blog post by Brian Christner. However, we will take a shortcut via a docker-compose file created by Dale Kate-Murray and Ross Jimenez, which helps us to spin up the needed docker containers within minutes (depending on your Internet speed).

Go to Summary ->

Docker Monitoring Alternatives

Other free docker monitoring solutions are discussed in this YouTube video by Brian Christner:

  • Google cAdvisor (standalone): easy to use, no config needed
  • cAdvisor + InfluxDB + Grafana: flexible, adaptable (the one we will get hands-on experience with below)
  • Prometheus: all-in-one complete monitoring solution

He summarizes the capabilities of those solutions as follows:

2016-10-25-17_32_32-docker-monitoring-youtube

@Brian Christner: I hope it is okay that I have copied this slide from your YouTube video?

Those are open source alternatives. Brian Christner points out that you might need more complete, enterprise-level solutions than the open source alternatives can offer, e.g. Datadog (which offers a free service for up to five monitored hosts) or Sysdig (the latter also seems to be open source, though). See also this Rancher post, which compares seven docker monitoring alternatives.

Step by Step guide: “Installing” cAdvisor + InfluxDB + Grafana

Here, we will go through a step by step guide on how to deploy a flexible docker monitoring solution consisting of Google cAdvisor as data source, InfluxDB as the database and Grafana for creating the graphs. We will make use of a docker-compose file Ross Jimenez has created and Brian Christner has included in his Git repository.

Step 0: Prerequisites

We assume that the following prerequisites are met:

  • Docker is installed. A nice way to install an Ubuntu Docker host via Vagrant is described here (search for the term “Install a Docker Host”).
  • You have direct Internet access. If you need to cope with a HTTP proxy, see the official docker instructions, or, if it does not work you may try this blog post.

Step 1: Install docker-compose via Container:

On a Docker host, we will install docker-compose via a Docker container using the following script:

# detect whether sudo is needed:
sudo echo hello > /dev/null 2>&1 && SUDO=sudo

# download the docker-compose wrapper, if docker-compose is not installed yet:
$SUDO docker-compose --version || {
  $SUDO docker --version && \
  curl -L https://github.com/docker/compose/releases/download/1.8.1/run.sh | $SUDO tee /usr/local/bin/docker-compose && \
  $SUDO chmod +x /usr/local/bin/docker-compose
}

You might prefer a native installation of docker-compose. Please check out the official documentation in that case.
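For reference, a native installation at the time of writing would look roughly like the following sketch (release 1.8.1 is assumed here, matching the wrapper script above):

# download the docker-compose binary from the GitHub releases page:
curl -L "https://github.com/docker/compose/releases/download/1.8.1/docker-compose-$(uname -s)-$(uname -m)" \
  | sudo tee /usr/local/bin/docker-compose > /dev/null
sudo chmod +x /usr/local/bin/docker-compose

# verify the installation:
docker-compose --version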

Step 2: Download Docker Compose File

Now we download Brian Christner’s docker monitoring Git repository. The CircleCI tests of Brian’s repository are currently failing (as of 2016-10-25), but the software seems to work anyway.

git clone https://github.com/vegasbrianc/docker-monitoring && \
cd docker-monitoring

Step 3: Start Containers

Now let us start the containers via

$ docker-compose up
Starting dockermonitoringrepaired_influxdbData_1
Starting dockermonitoringrepaired_influxdb_1
Starting dockermonitoringrepaired_grafana_1
Starting dockermonitoringrepaired_cadvisor_1
Attaching to dockermonitoringrepaired_influxdbData_1, dockermonitoringrepaired_influxdb_1, dockermonitoringrepaired_cadvisor_1, dockermonitoringrepaired_grafana_1
dockermonitoringrepaired_influxdbData_1 exited with code 0
influxdb_1 | influxdb configuration:
influxdb_1 | ### Welcome to the InfluxDB configuration file.
influxdb_1 |
influxdb_1 | # Once every 24 hours InfluxDB will report anonymous data to m.influxdb.com
influxdb_1 | # The data includes raft id (random 8 bytes), os, arch, version, and metadata.
influxdb_1 | # We don't track ip addresses of servers reporting. This is only used
influxdb_1 | # to track the number of instances running and the versions, which
...
(trunkated; see full log in the Appendix)
...
influxdb_1 | [admin] 2016/10/25 16:48:44 Listening on HTTP: [::]:8083
influxdb_1 | [continuous_querier] 2016/10/25 16:48:44 Starting continuous query service
influxdb_1 | [httpd] 2016/10/25 16:48:44 Starting HTTP service
influxdb_1 | [httpd] 2016/10/25 16:48:44 Authentication enabled: false
influxdb_1 | [httpd] 2016/10/25 16:48:44 Listening on HTTP: [::]:8086
influxdb_1 | [retention] 2016/10/25 16:48:44 Starting retention policy enforcement service with check interval of 30m0s
influxdb_1 | [monitor] 2016/10/25 16:48:44 Storing statistics in database '_internal' retention policy 'monitor', at interval 10s
influxdb_1 | 2016/10/25 16:48:44 Sending anonymous usage statistics to m.influxdb.com
influxdb_1 | [run] 2016/10/25 16:48:44 Listening for signals

Note: if you see a continuous message “Waiting for confirmation of InfluxDB service startup”, you might hit a problem described in an Appendix below. Search for “Waiting for confirmation of InfluxDB service startup” on this page.

Step 4 (optional): In a different window on the Docker host, we can test the connection as follows:

$ curl --retry 10 --retry-delay 5 -v http://localhost:8083
* Rebuilt URL to: http://localhost:8083/
* Hostname was NOT found in DNS cache
* Trying ::1...
* Connected to localhost (::1) port 8083 (#0)
> GET / HTTP/1.1
> User-Agent: curl/7.35.0
...
</body>

</html>
* Connection #0 to host localhost left intact

Step 5: Connect to cAdvisor, InfluxDB, Grafana

Step 5.1 (optional): Connect to cAdvisor

Now let us connect to cAdvisor. For that, you need to find out which IP address your docker host is using. In my case, I am using a Vagrant-based Docker host and I have added the following additional line to the Vagrantfile:

config.vm.network "private_network", ip: "192.168.33.11"

The TCP port can be seen in the docker-compose.yml file: it is 8080. This allows me to connect to cAdvisor’s dashboard via http://192.168.33.11:8080/containers/:

2016-10-25-22_36_11-cadvisor-_
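If you do not want to dig through the docker-compose.yml, the published ports can also be listed directly on the Docker host (run from the directory containing the docker-compose.yml):

# show the services and their published ports:
sudo docker-compose ps

# or, equivalently, via docker itself:
sudo docker ps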

Step 5.2 (optional): connect to InfluxDB

InfluxDB is reachable via http://192.168.33.11:8083/:

2016-10-25-22_38_50-influxdb-admin-interface

Step 5.3 (required): Connect to Grafana

And Grafana can be reached via http://192.168.33.11:3000/: log in as admin with password admin, if you are prompted for it:

2016-10-25-22_42_32-grafana-home

Okay, the dashboard is still empty.

Step 6: Add Data Sources to Grafana manually

Connect to Grafana (http://192.168.33.11:3000/ in my case)

Click on Data Sources -> Add new

and add the following data:

Name: influxdb
Type: InfluxDB 0.9.x

Note: Be sure to check the default box! Otherwise, you will see random data created by Grafana below!

Http settings
Url: http://192.168.33.11:8086 (please adapt the IP address to your environment)
Access: proxy
Basic Auth: Enabled
User: admin
Password: admin

InfluxDB Details
Database: cadvisor
User: root
Password: root

Click Add -> Test Connection (should be successful) -> Save
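If you prefer scripting over clicking through the UI, the same data source can presumably also be created via Grafana’s HTTP API; the following is a hedged sketch only (the /api/datasources endpoint and the payload fields are assumptions about the Grafana 2.6 API, while the admin/admin credentials and IP address are those used above):

curl -s -u admin:admin -H "Content-Type: application/json" \
  -X POST http://192.168.33.11:3000/api/datasources \
  -d '{
        "name": "influxdb",
        "type": "influxdb",
        "url": "http://192.168.33.11:8086",
        "access": "proxy",
        "isDefault": true,
        "database": "cadvisor",
        "user": "root",
        "password": "root"
      }'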

Step 7: Add New Dashboard to Grafana via json File

Connect to Grafana (http://192.168.33.11:3000/ in my case)

Click on Grafana Home, then Grafana Import, navigate to the cloned GitHub repository, click on the button below Import File and pick the file docker-monitoring-0.9.json:

docker-monitoring-0.9.json

As if by an invisible hand, we get a dashboard with information on Filesystem Usage, CPU Usage, Memory Usage and Network Usage of the Containers on the host.

2016-10-30-20_05_15-grafana-new-dashboard

Note: if the graphs look as follows

2016-10-28-21_28_18-grafana-new-dashboard

and the graphs change substantially when clicking Grafana Dashboard Refresh, then you most probably have forgotten to check the “default” box in step 6. In this case, you need to click on the title of the graph -> Edit -> choose influxdb as data source.

Step 8 (optional): CPU Stress Test

Since only the docker monitoring containers are running, the absolute numbers we see are quite low. Let us start a container that stresses the CPU a little bit:

docker run -it petarmaric/docker.cpu-stress-test

The graphs of cAdvisor are reacting right away:

2016-10-30-21_08_06-cadvisor-_

Let us wait a few minutes and refresh the Grafana graphs and put our focus on the CPU usage:

2016-10-30-21_05_35-grafana-new-dashboard

The data does not seem to be reliable. I have opened issue #10 for this. When looking at the InfluxDB data by specifying following URL in a browser:

http://192.168.33.11:8086/query?pretty=true&db=cadvisor&q=SELECT%20%22value%22%20FROM%20%22cpu_usage_system%22

(you need to adapt the IP address to your environment), then we get data that changes by a factor of 10,000 within milliseconds!

{
    "results": [
        {
            "series": [
                {
                    "name": "cpu_usage_system",
                    "columns": [
                        "time",
                        "value"
                    ],
                    "values": [
                        [
                            "2016-10-24T17:12:49.021559212Z",
                            910720000000
                        ],
                        [
                            "2016-10-24T17:12:49.032153994Z",
                            20000000
                        ],
                        [
                            "2016-10-24T17:12:49.033316234Z",
                            5080000000
                        ],

Summary

Following Brian Christner’s YouTube video on Docker Monitoring, we have compared three open source Docker monitoring solutions:

  1. Google cAdvisor (standalone): easy to use, no config needed
  2. cAdvisor + InfluxDB + Grafana: flexible, adaptable (the one we have gotten hands-on experience with above)
  3. Prometheus: all-in-one complete monitoring solution

By using a pre-defined docker-compose file, solution 2 can be spun up in minutes (unless you are working in an NFS-synced Vagrant folder on Windows, which leads to a continuous ‘Waiting for confirmation of InfluxDB service startup’ message; see Appendix B below; that problem, which I have reported here, had caused quite a headache on my side).

After the data source is configured manually, a json file helps to create a nice Grafana dashboard within minutes. The dashboard shows graphs about File System Usage, CPU Usage, Memory Usage and Network Usage.

At the moment, there is a caveat that the data displayed is not trustworthy. This is being investigated in the framework of issue #10 of Brian Christner’s repository. I will report here when it is resolved.

Next Steps:

Appendix A: full startup log of successful ‘docker-compose up’

$ docker-compose up
Starting dockermonitoringrepaired_influxdbData_1
Starting dockermonitoringrepaired_influxdb_1
Starting dockermonitoringrepaired_grafana_1
Starting dockermonitoringrepaired_cadvisor_1
Attaching to dockermonitoringrepaired_influxdbData_1, dockermonitoringrepaired_influxdb_1, dockermonitoringrepaired_grafana_1, dockermonitoringrepaired_cadvisor_1
dockermonitoringrepaired_influxdbData_1 exited with code 0
influxdb_1 | influxdb configuration:
influxdb_1 | ### Welcome to the InfluxDB configuration file.
influxdb_1 |
influxdb_1 | # Once every 24 hours InfluxDB will report anonymous data to m.influxdb.com
influxdb_1 | # The data includes raft id (random 8 bytes), os, arch, version, and metadata.
influxdb_1 | # We don't track ip addresses of servers reporting. This is only used
influxdb_1 | # to track the number of instances running and the versions, which
influxdb_1 | # is very helpful for us.
influxdb_1 | # Change this option to true to disable reporting.
influxdb_1 | reporting-disabled = false
influxdb_1 |
influxdb_1 | # we'll try to get the hostname automatically, but if it the os returns something
influxdb_1 | # that isn't resolvable by other servers in the cluster, use this option to
influxdb_1 | # manually set the hostname
influxdb_1 | # hostname = "localhost"
influxdb_1 |
influxdb_1 | ###
influxdb_1 | ### [meta]
influxdb_1 | ###
influxdb_1 | ### Controls the parameters for the Raft consensus group that stores metadata
influxdb_1 | ### about the InfluxDB cluster.
influxdb_1 | ###
influxdb_1 |
influxdb_1 | [meta]
influxdb_1 | # Where the metadata/raft database is stored
influxdb_1 | dir = "/data/meta"
influxdb_1 |
influxdb_1 | retention-autocreate = true
influxdb_1 |
influxdb_1 | # If log messages are printed for the meta service
influxdb_1 | logging-enabled = true
influxdb_1 | pprof-enabled = false
influxdb_1 |
influxdb_1 | # The default duration for leases.
influxdb_1 | lease-duration = "1m0s"
influxdb_1 |
influxdb_1 | ###
influxdb_1 | ### [data]
influxdb_1 | ###
influxdb_1 | ### Controls where the actual shard data for InfluxDB lives and how it is
influxdb_1 | ### flushed from the WAL. "dir" may need to be changed to a suitable place
influxdb_1 | ### for your system, but the WAL settings are an advanced configuration. The
influxdb_1 | ### defaults should work for most systems.
influxdb_1 | ###
influxdb_1 |
influxdb_1 | [data]
influxdb_1 | # Controls if this node holds time series data shards in the cluster
influxdb_1 | enabled = true
influxdb_1 |
influxdb_1 | dir = "/data/data"
influxdb_1 |
influxdb_1 | # These are the WAL settings for the storage engine >= 0.9.3
influxdb_1 | wal-dir = "/data/wal"
influxdb_1 | wal-logging-enabled = true
influxdb_1 | data-logging-enabled = true
influxdb_1 |
influxdb_1 | # Whether queries should be logged before execution. Very useful for troubleshooting, but will
influxdb_1 | # log any sensitive data contained within a query.
influxdb_1 | # query-log-enabled = true
influxdb_1 |
influxdb_1 | # Settings for the TSM engine
influxdb_1 |
influxdb_1 | # CacheMaxMemorySize is the maximum size a shard's cache can
influxdb_1 | # reach before it starts rejecting writes.
influxdb_1 | # cache-max-memory-size = 524288000
influxdb_1 |
influxdb_1 | # CacheSnapshotMemorySize is the size at which the engine will
influxdb_1 | # snapshot the cache and write it to a TSM file, freeing up memory
influxdb_1 | # cache-snapshot-memory-size = 26214400
influxdb_1 |
influxdb_1 | # CacheSnapshotWriteColdDuration is the length of time at
influxdb_1 | # which the engine will snapshot the cache and write it to
influxdb_1 | # a new TSM file if the shard hasn't received writes or deletes
influxdb_1 | # cache-snapshot-write-cold-duration = "1h"
influxdb_1 |
influxdb_1 | # MinCompactionFileCount is the minimum number of TSM files
influxdb_1 | # that need to exist before a compaction cycle will run
influxdb_1 | # compact-min-file-count = 3
influxdb_1 |
influxdb_1 | # CompactFullWriteColdDuration is the duration at which the engine
influxdb_1 | # will compact all TSM files in a shard if it hasn't received a
influxdb_1 | # write or delete
influxdb_1 | # compact-full-write-cold-duration = "24h"
influxdb_1 |
influxdb_1 | # MaxPointsPerBlock is the maximum number of points in an encoded
grafana_1 | 2016/10/25 17:30:59 [I] Starting Grafana
grafana_1 | 2016/10/25 17:30:59 [I] Version: 2.6.0, Commit: v2.6.0, Build date: 2015-12-14 14:18:01 +0000 UTC
grafana_1 | 2016/10/25 17:30:59 [I] Configuration Info
grafana_1 | Config files:
grafana_1 | [0]: /usr/share/grafana/conf/defaults.ini
grafana_1 | [1]: /etc/grafana/grafana.ini
grafana_1 | Command lines overrides:
grafana_1 | [0]: default.paths.data=/var/lib/grafana
grafana_1 | [1]: default.paths.logs=/var/log/grafana
grafana_1 | Paths:
grafana_1 | home: /usr/share/grafana
grafana_1 | data: /var/lib/grafana
grafana_1 | logs: /var/log/grafana
grafana_1 |
grafana_1 | 2016/10/25 17:30:59 [I] Database: sqlite3
grafana_1 | 2016/10/25 17:30:59 [I] Migrator: Starting DB migration
grafana_1 | 2016/10/25 17:30:59 [I] Listen: http://0.0.0.0:3000
influxdb_1 | # block in a TSM file. Larger numbers may yield better compression
influxdb_1 | # but could incur a performance penalty when querying
influxdb_1 | # max-points-per-block = 1000
influxdb_1 |
influxdb_1 | ###
influxdb_1 | ### [cluster]
influxdb_1 | ###
influxdb_1 | ### Controls non-Raft cluster behavior, which generally includes how data is
influxdb_1 | ### shared across shards.
influxdb_1 | ###
influxdb_1 |
influxdb_1 | [cluster]
influxdb_1 | shard-writer-timeout = "5s" # The time within which a remote shard must respond to a write request.
influxdb_1 | write-timeout = "10s" # The time within which a write request must complete on the cluster.
influxdb_1 | max-concurrent-queries = 0 # The maximum number of concurrent queries that can run. 0 to disable.
influxdb_1 | query-timeout = "0s" # The time within a query must complete before being killed automatically. 0s to disable.
influxdb_1 | max-select-point = 0 # The maximum number of points to scan in a query. 0 to disable.
influxdb_1 | max-select-series = 0 # The maximum number of series to select in a query. 0 to disable.
influxdb_1 | max-select-buckets = 0 # The maximum number of buckets to select in an aggregate query. 0 to disable.
influxdb_1 |
influxdb_1 | ###
influxdb_1 | ### [retention]
influxdb_1 | ###
influxdb_1 | ### Controls the enforcement of retention policies for evicting old data.
influxdb_1 | ###
influxdb_1 |
influxdb_1 | [retention]
influxdb_1 | enabled = true
influxdb_1 | check-interval = "30m"
influxdb_1 |
influxdb_1 | ###
influxdb_1 | ### [shard-precreation]
influxdb_1 | ###
influxdb_1 | ### Controls the precreation of shards, so they are available before data arrives.
influxdb_1 | ### Only shards that, after creation, will have both a start- and end-time in the
influxdb_1 | ### future, will ever be created. Shards are never precreated that would be wholly
influxdb_1 | ### or partially in the past.
influxdb_1 |
influxdb_1 | [shard-precreation]
influxdb_1 | enabled = true
influxdb_1 | check-interval = "10m"
influxdb_1 | advance-period = "30m"
influxdb_1 |
influxdb_1 | ###
influxdb_1 | ### Controls the system self-monitoring, statistics and diagnostics.
influxdb_1 | ###
influxdb_1 | ### The internal database for monitoring data is created automatically if
influxdb_1 | ### if it does not already exist. The target retention within this database
influxdb_1 | ### is called 'monitor' and is also created with a retention period of 7 days
influxdb_1 | ### and a replication factor of 1, if it does not exist. In all cases the
influxdb_1 | ### this retention policy is configured as the default for the database.
influxdb_1 |
influxdb_1 | [monitor]
influxdb_1 | store-enabled = true # Whether to record statistics internally.
influxdb_1 | store-database = "_internal" # The destination database for recorded statistics
influxdb_1 | store-interval = "10s" # The interval at which to record statistics
influxdb_1 |
influxdb_1 | ###
influxdb_1 | ### [admin]
influxdb_1 | ###
influxdb_1 | ### Controls the availability of the built-in, web-based admin interface. If HTTPS is
influxdb_1 | ### enabled for the admin interface, HTTPS must also be enabled on the [http] service.
influxdb_1 | ###
influxdb_1 |
influxdb_1 | [admin]
influxdb_1 | enabled = true
influxdb_1 | bind-address = ":8083"
influxdb_1 | https-enabled = false
influxdb_1 | https-certificate = "/etc/ssl/influxdb.pem"
influxdb_1 |
influxdb_1 | ###
influxdb_1 | ### [http]
influxdb_1 | ###
influxdb_1 | ### Controls how the HTTP endpoints are configured. These are the primary
influxdb_1 | ### mechanism for getting data into and out of InfluxDB.
influxdb_1 | ###
influxdb_1 |
influxdb_1 | [http]
influxdb_1 | enabled = true
influxdb_1 | bind-address = ":8086"
influxdb_1 | auth-enabled = false
influxdb_1 | log-enabled = true
influxdb_1 | write-tracing = false
influxdb_1 | pprof-enabled = false
influxdb_1 | https-enabled = false
influxdb_1 | https-certificate = "/etc/ssl/influxdb.pem"
influxdb_1 | max-row-limit = 10000
influxdb_1 |
influxdb_1 | ###
influxdb_1 | ### [[graphite]]
influxdb_1 | ###
influxdb_1 | ### Controls one or many listeners for Graphite data.
influxdb_1 | ###
influxdb_1 |
influxdb_1 | [[graphite]]
influxdb_1 | enabled = false
influxdb_1 | database = "graphitedb"
influxdb_1 | bind-address = ":2003"
influxdb_1 | protocol = "tcp"
influxdb_1 | # consistency-level = "one"
influxdb_1 |
influxdb_1 | # These next lines control how batching works. You should have this enabled
influxdb_1 | # otherwise you could get dropped metrics or poor performance. Batching
influxdb_1 | # will buffer points in memory if you have many coming in.
influxdb_1 |
influxdb_1 | # batch-size = 5000 # will flush if this many points get buffered
influxdb_1 | # batch-pending = 10 # number of batches that may be pending in memory
influxdb_1 | # batch-timeout = "1s" # will flush at least this often even if we haven't hit buffer limit
influxdb_1 | # udp-read-buffer = 0 # UDP Read buffer size, 0 means OS default. UDP listener will fail if set above OS max.
influxdb_1 |
influxdb_1 | ### This string joins multiple matching 'measurement' values providing more control over the final measurement name.
influxdb_1 | # separator = "."
influxdb_1 |
influxdb_1 | ### Default tags that will be added to all metrics. These can be overridden at the template level
influxdb_1 | ### or by tags extracted from metric
influxdb_1 | # tags = ["region=us-east", "zone=1c"]
influxdb_1 |
influxdb_1 | ### Each template line requires a template pattern. It can have an optional
influxdb_1 | ### filter before the template and separated by spaces. It can also have optional extra
influxdb_1 | ### tags following the template. Multiple tags should be separated by commas and no spaces
influxdb_1 | ### similar to the line protocol format. There can be only one default template.
influxdb_1 | templates = [
influxdb_1 | # filter + template
influxdb_1 | #"*.app env.service.resource.measurement",
influxdb_1 | # filter + template + extra tag
influxdb_1 | #"stats.* .host.measurement* region=us-west,agent=sensu",
influxdb_1 | # default template. Ignore the first graphite component "servers"
influxdb_1 | "instance.profile.measurement*"
influxdb_1 | ]
influxdb_1 |
influxdb_1 | ###
influxdb_1 | ### [collectd]
influxdb_1 | ###
influxdb_1 | ### Controls one or many listeners for collectd data.
influxdb_1 | ###
influxdb_1 |
influxdb_1 | [[collectd]]
influxdb_1 | enabled = false
influxdb_1 | # bind-address = ":25826"
influxdb_1 | # database = "collectd"
influxdb_1 | # typesdb = "/usr/share/collectd/types.db"
influxdb_1 | # retention-policy = ""
influxdb_1 |
influxdb_1 | # These next lines control how batching works. You should have this enabled
influxdb_1 | # otherwise you could get dropped metrics or poor performance. Batching
influxdb_1 | # will buffer points in memory if you have many coming in.
influxdb_1 |
influxdb_1 | # batch-size = 1000 # will flush if this many points get buffered
influxdb_1 | # batch-pending = 5 # number of batches that may be pending in memory
influxdb_1 | # batch-timeout = "1s" # will flush at least this often even if we haven't hit buffer limit
influxdb_1 | # read-buffer = 0 # UDP Read buffer size, 0 means OS default. UDP listener will fail if set above OS max.
influxdb_1 |
influxdb_1 | ###
influxdb_1 | ### [opentsdb]
influxdb_1 | ###
influxdb_1 | ### Controls one or many listeners for OpenTSDB data.
influxdb_1 | ###
influxdb_1 |
influxdb_1 | [[opentsdb]]
influxdb_1 | enabled = false
influxdb_1 | # bind-address = ":4242"
influxdb_1 | # database = "opentsdb"
influxdb_1 | # retention-policy = ""
influxdb_1 | # consistency-level = "one"
influxdb_1 | # tls-enabled = false
influxdb_1 | # certificate= ""
influxdb_1 | # log-point-errors = true # Log an error for every malformed point.
influxdb_1 |
influxdb_1 | # These next lines control how batching works. You should have this enabled
influxdb_1 | # otherwise you could get dropped metrics or poor performance. Only points
influxdb_1 | # metrics received over the telnet protocol undergo batching.
influxdb_1 |
influxdb_1 | # batch-size = 1000 # will flush if this many points get buffered
cadvisor_1 | I1025 17:30:59.170040 1 storagedriver.go:42] Using backend storage type "influxdb"
cadvisor_1 | I1025 17:30:59.170881 1 storagedriver.go:44] Caching stats in memory for 2m0s
cadvisor_1 | I1025 17:30:59.171032 1 manager.go:131] cAdvisor running in container: "/docker/9839f9c5c9d674016006e4d4144f984ea91320686356235951f21f0b51306c47"
cadvisor_1 | I1025 17:30:59.194143 1 fs.go:107] Filesystem partitions: map[/dev/dm-0:{mountpoint:/rootfs major:252 minor:0 fsType: blockSize:0} /dev/sda1:{mountpoint:/rootfs/boot major:8 minor:1 fsType: blockSize:0}]
influxdb_1 | # batch-pending = 5 # number of batches that may be pending in memory
influxdb_1 | # batch-timeout = "1s" # will flush at least this often even if we haven't hit buffer limit
influxdb_1 |
influxdb_1 | ###
influxdb_1 | ### [[udp]]
influxdb_1 | ###
influxdb_1 | ### Controls the listeners for InfluxDB line protocol data via UDP.
influxdb_1 | ###
influxdb_1 |
influxdb_1 | [[udp]]
influxdb_1 | enabled = false
influxdb_1 | bind-address = ":4444"
influxdb_1 | database = "udpdb"
influxdb_1 | # retention-policy = ""
influxdb_1 |
influxdb_1 | # These next lines control how batching works. You should have this enabled
influxdb_1 | # otherwise you could get dropped metrics or poor performance. Batching
influxdb_1 | # will buffer points in memory if you have many coming in.
influxdb_1 |
influxdb_1 | # batch-size = 1000 # will flush if this many points get buffered
influxdb_1 | # batch-pending = 5 # number of batches that may be pending in memory
influxdb_1 | # batch-timeout = "1s" # will flush at least this often even if we haven't hit buffer limit
influxdb_1 | # read-buffer = 0 # UDP Read buffer size, 0 means OS default. UDP listener will fail if set above OS max.
influxdb_1 |
influxdb_1 | # set the expected UDP payload size; lower values tend to yield better performance, default is max UDP size 65536
influxdb_1 | # udp-payload-size = 65536
influxdb_1 |
influxdb_1 | ###
influxdb_1 | ### [continuous_queries]
influxdb_1 | ###
influxdb_1 | ### Controls how continuous queries are run within InfluxDB.
influxdb_1 | ###
influxdb_1 |
influxdb_1 | [continuous_queries]
influxdb_1 | log-enabled = true
influxdb_1 | enabled = true
influxdb_1 | # run-interval = "1s" # interval for how often continuous queries will be checked if they need to run
influxdb_1 | => Starting InfluxDB ...
influxdb_1 | => About to create the following database: cadvisor
influxdb_1 | => Database had been created before, skipping ...
influxdb_1 | exec influxd -config=${CONFIG_FILE}
influxdb_1 |
influxdb_1 | 8888888 .d888 888 8888888b. 888888b.
influxdb_1 | 888 d88P" 888 888 "Y88b 888 "88b
influxdb_1 | 888 888 888 888 888 888 .88P
influxdb_1 | 888 88888b. 888888 888 888 888 888 888 888 888 8888888K.
influxdb_1 | 888 888 "88b 888 888 888 888 Y8bd8P' 888 888 888 "Y88b
influxdb_1 | 888 888 888 888 888 888 888 X88K 888 888 888 888
influxdb_1 | 888 888 888 888 888 Y88b 888 .d8""8b. 888 .d88P 888 d88P
influxdb_1 | 8888888 888 888 888 888 "Y88888 888 888 8888888P" 8888888P"
influxdb_1 |
cadvisor_1 | I1025 17:30:59.228909 1 machine.go:50] Couldn't collect info from any of the files in "/rootfs/etc/machine-id,/var/lib/dbus/machine-id"
cadvisor_1 | I1025 17:30:59.229080 1 manager.go:166] Machine: {NumCores:2 CpuFrequency:2592000 MemoryCapacity:1569599488 MachineID: SystemUUID:B63CB367-870F-4E48-917F-7E524C2C67A0 BootID:e225b37a-b8e6-466b-9f67-84b74df8e90c Filesystems:[{Device:/dev/dm-0 Capacity:41092214784} {Device:/dev/sda1 Capacity:246755328}] DiskMap:map[252:0:{Name:dm-0 Major:252 Minor:0 Size:41884319744 Scheduler:none} 252:1:{Name:dm-1 Major:252 Minor:1 Size:805306368 Scheduler:none} 8:0:{Name:sda Major:8 Minor:0 Size:42949672960 Scheduler:deadline}] NetworkDevices:[{Name:br-067c518abd1f MacAddress:02:42:78:41:c0:71 Speed:0 Mtu:1500} {Name:br-1c136984ac6d MacAddress:02:42:0c:dc:89:ac Speed:0 Mtu:1500} {Name:br-9b7560132352 MacAddress:02:42:5c:df:9a:43 Speed:0 Mtu:1500} {Name:eth0 MacAddress:08:00:27:c7:ba:b5 Speed:1000 Mtu:1500} {Name:eth1 MacAddress:08:00:27:51:9c:7e Speed:1000 Mtu:1500}] Topology:[{Id:0 Memory:1569599488 Cores:[{Id:0 Threads:[0] Caches:[{Size:32768 Type:Data Level:1} {Size:32768 Type:Instruction Level:1} {Size:262144 Type:Unified Level:2} {Size:6291456 Type:Unified Level:3}]} {Id:1 Threads:[1] Caches:[{Size:32768 Type:Data Level:1} {Size:32768 Type:Instruction Level:1} {Size:262144 Type:Unified Level:2} {Size:6291456 Type:Unified Level:3}]}] Caches:[]}] CloudProvider:Unknown InstanceType:Unknown}
cadvisor_1 | I1025 17:30:59.229884 1 manager.go:172] Version: {KernelVersion:4.2.0-42-generic ContainerOsVersion:Alpine Linux v3.2 DockerVersion:1.12.1 CadvisorVersion:0.20.5 CadvisorRevision:9aa348f}
influxdb_1 | [run] 2016/10/25 17:30:58 InfluxDB starting, version 0.13.0, branch 0.13, commit e57fb88a051ee40fd9277094345fbd47bb4783ce
influxdb_1 | [run] 2016/10/25 17:30:58 Go version go1.6.2, GOMAXPROCS set to 2
influxdb_1 | [run] 2016/10/25 17:30:58 Using configuration at: /config/config.toml
influxdb_1 | [store] 2016/10/25 17:30:58 Using data dir: /data/data
influxdb_1 | [tsm1wal] 2016/10/25 17:30:58 tsm1 WAL starting with 10485760 segment size
influxdb_1 | [tsm1wal] 2016/10/25 17:30:58 tsm1 WAL writing to /data/wal/_internal/monitor/1
influxdb_1 | [tsm1wal] 2016/10/25 17:30:58 tsm1 WAL starting with 10485760 segment size
influxdb_1 | [tsm1wal] 2016/10/25 17:30:58 tsm1 WAL writing to /data/wal/cadvisor/default/2
influxdb_1 | [filestore] 2016/10/25 17:30:58 /data/data/_internal/monitor/1/000000001-000000001.tsm (#0) opened in 1.243404ms
influxdb_1 | [cacheloader] 2016/10/25 17:30:58 reading file /data/wal/_internal/monitor/1/_00001.wal, size 1777379
influxdb_1 | [filestore] 2016/10/25 17:30:58 /data/data/cadvisor/default/2/000000001-000000001.tsm (#0) opened in 1.725916ms
influxdb_1 | [cacheloader] 2016/10/25 17:30:58 reading file /data/wal/cadvisor/default/2/_00001.wal, size 4130244
influxdb_1 | [tsm1wal] 2016/10/25 17:30:58 tsm1 WAL starting with 10485760 segment size
influxdb_1 | [tsm1wal] 2016/10/25 17:30:58 tsm1 WAL writing to /data/wal/_internal/monitor/3
influxdb_1 | [cacheloader] 2016/10/25 17:30:58 reading file /data/wal/_internal/monitor/3/_00001.wal, size 1097258
cadvisor_1 | E1025 17:30:59.248299 1 manager.go:208] Docker container factory registration failed: docker found, but not using native exec driver.
cadvisor_1 | I1025 17:30:59.262682 1 factory.go:94] Registering Raw factory
cadvisor_1 | I1025 17:30:59.327660 1 manager.go:1000] Started watching for new ooms in manager
cadvisor_1 | W1025 17:30:59.327883 1 manager.go:239] Could not configure a source for OOM detection, disabling OOM events: exec: "journalctl": executable file not found in $PATH
cadvisor_1 | I1025 17:30:59.328250 1 manager.go:252] Starting recovery of all containers
cadvisor_1 | I1025 17:30:59.371456 1 manager.go:257] Recovery completed
cadvisor_1 | I1025 17:30:59.395792 1 cadvisor.go:106] Starting cAdvisor version: 0.20.5-9aa348f on port 8080
influxdb_1 | [cacheloader] 2016/10/25 17:30:59 reading file /data/wal/cadvisor/default/2/_00002.wal, size 2232957
influxdb_1 | [cacheloader] 2016/10/25 17:30:59 reading file /data/wal/_internal/monitor/3/_00002.wal, size 197651
influxdb_1 | [cacheloader] 2016/10/25 17:30:59 reading file /data/wal/_internal/monitor/3/_00003.wal, size 0
influxdb_1 | [shard] 2016/10/25 17:30:59 /data/data/_internal/monitor/3 database index loaded in 1.387775ms
influxdb_1 | [store] 2016/10/25 17:30:59 /data/data/_internal/monitor/3 opened in 865.976354ms
influxdb_1 | [cacheloader] 2016/10/25 17:30:59 reading file /data/wal/_internal/monitor/1/_00004.wal, size 0
influxdb_1 | [shard] 2016/10/25 17:30:59 /data/data/_internal/monitor/1 database index loaded in 3.29894ms
influxdb_1 | [store] 2016/10/25 17:30:59 /data/data/_internal/monitor/1 opened in 896.765569ms
influxdb_1 | [cacheloader] 2016/10/25 17:30:59 reading file /data/wal/cadvisor/default/2/_00003.wal, size 444696
influxdb_1 | [cacheloader] 2016/10/25 17:30:59 reading file /data/wal/cadvisor/default/2/_00004.wal, size 0
influxdb_1 | [shard] 2016/10/25 17:30:59 /data/data/cadvisor/default/2 database index loaded in 2.465579ms
influxdb_1 | [store] 2016/10/25 17:30:59 /data/data/cadvisor/default/2 opened in 981.523781ms
influxdb_1 | [subscriber] 2016/10/25 17:30:59 opened service
influxdb_1 | [monitor] 2016/10/25 17:30:59 Starting monitor system
influxdb_1 | [monitor] 2016/10/25 17:30:59 'build' registered for diagnostics monitoring
influxdb_1 | [monitor] 2016/10/25 17:30:59 'runtime' registered for diagnostics monitoring
influxdb_1 | [monitor] 2016/10/25 17:30:59 'network' registered for diagnostics monitoring
influxdb_1 | [monitor] 2016/10/25 17:30:59 'system' registered for diagnostics monitoring
influxdb_1 | [cluster] 2016/10/25 17:30:59 Starting cluster service
influxdb_1 | [shard-precreation] 2016/10/25 17:30:59 Starting precreation service with check interval of 10m0s, advance period of 30m0s
influxdb_1 | [snapshot] 2016/10/25 17:30:59 Starting snapshot service
influxdb_1 | [copier] 2016/10/25 17:30:59 Starting copier service
influxdb_1 | [admin] 2016/10/25 17:30:59 Starting admin service
influxdb_1 | [admin] 2016/10/25 17:30:59 Listening on HTTP: [::]:8083
influxdb_1 | [continuous_querier] 2016/10/25 17:30:59 Starting continuous query service
influxdb_1 | [httpd] 2016/10/25 17:30:59 Starting HTTP service
influxdb_1 | [httpd] 2016/10/25 17:30:59 Authentication enabled: false
influxdb_1 | [httpd] 2016/10/25 17:30:59 Listening on HTTP: [::]:8086
influxdb_1 | [retention] 2016/10/25 17:30:59 Starting retention policy enforcement service with check interval of 30m0s
influxdb_1 | [run] 2016/10/25 17:30:59 Listening for signals
influxdb_1 | [monitor] 2016/10/25 17:30:59 Storing statistics in database '_internal' retention policy 'monitor', at interval 10s
influxdb_1 | 2016/10/25 17:30:59 Sending anonymous usage statistics to m.influxdb.com


Appendix B: ‘Waiting for confirmation of InfluxDB service startup’

After issuing the command

docker-compose up

I hit a problem, described here, that was caused by using a Vagrant synced folder as the working directory.

vagrant@openshift-installer /vagrant/Monitoring/docker-monitoring_master $ docker-compose up
Starting dockermonitoringmaster_influxdbData_1
Starting dockermonitoringmaster_influxdb_1
Starting dockermonitoringmaster_cadvisor_1
Starting dockermonitoringmaster_grafana_1
Attaching to dockermonitoringmaster_influxdbData_1, dockermonitoringmaster_influxdb_1, dockermonitoringmaster_grafana_1, dockermonitoringmaster_cadvisor_1
dockermonitoringmaster_influxdbData_1 exited with code 0
influxdb_1      | => Starting InfluxDB in background ...
influxdb_1      | => Waiting for confirmation of InfluxDB service startup ...
influxdb_1      |
influxdb_1      |  8888888           .d888 888                   8888888b.  888888b.
influxdb_1      |    888            d88P"  888                   888  "Y88b 888  "88b
influxdb_1      |    888            888    888                   888    888 888  .88P
influxdb_1      |    888   88888b.  888888 888 888  888 888  888 888    888 8888888K.
influxdb_1      |    888   888 "88b 888    888 888  888  Y8bd8P' 888    888 888  "Y88b
influxdb_1      |    888   888  888 888    888 888  888   X88K   888    888 888    888
influxdb_1      |    888   888  888 888    888 Y88b 888 .d8""8b. 888  .d88P 888   d88P
influxdb_1      |  8888888 888  888 888    888  "Y88888 888  888 8888888P"  8888888P"
influxdb_1      |
influxdb_1      | 2016/10/28 12:34:49 InfluxDB starting, version 0.9.6.1, branch 0.9.6, commit 6d3a8603cfdaf1a141779ed88b093dcc5c528e5e, built 2015-12-10T23:40:23+0000
influxdb_1      | 2016/10/28 12:34:49 Go version go1.4.2, GOMAXPROCS set to 2
influxdb_1      | 2016/10/28 12:34:49 Using configuration at: /config/config.toml
influxdb_1      | [metastore] 2016/10/28 12:34:49 Using data dir: /data/meta
influxdb_1      | [retention] 2016/10/28 12:34:49 retention policy enforcement terminating
influxdb_1      | [monitor] 2016/10/28 12:34:49 shutting down monitor system
influxdb_1      | [handoff] 2016/10/28 12:34:49 shutting down hh service
influxdb_1      | [subscriber] 2016/10/28 12:34:49 closed service
influxdb_1      | run: open server: open meta store: raft: new bolt store: invalid argument
grafana_1       | 2016/10/28 12:34:50 [I] Starting Grafana
grafana_1       | 2016/10/28 12:34:50 [I] Version: 2.6.0, Commit: v2.6.0, Build date: 2015-12-14 14:18:01 +0000 UTC
grafana_1       | 2016/10/28 12:34:50 [I] Configuration Info
grafana_1       | Config files:
grafana_1       |   [0]: /usr/share/grafana/conf/defaults.ini
grafana_1       |   [1]: /etc/grafana/grafana.ini
grafana_1       | Command lines overrides:
grafana_1       |   [0]: default.paths.data=/var/lib/grafana
grafana_1       |   [1]: default.paths.logs=/var/log/grafana
grafana_1       | Paths:
grafana_1       |   home: /usr/share/grafana
grafana_1       |   data: /var/lib/grafana
grafana_1       |   logs: /var/log/grafana
grafana_1       |
grafana_1       | 2016/10/28 12:34:50 [I] Database: sqlite3
grafana_1       | 2016/10/28 12:34:50 [I] Migrator: Starting DB migration
grafana_1       | 2016/10/28 12:34:50 [I] Listen: http://0.0.0.0:3000
cadvisor_1      | I1028 12:34:50.214917       1 storagedriver.go:42] Using backend storage type "influxdb"
cadvisor_1      | I1028 12:34:50.215243       1 storagedriver.go:44] Caching stats in memory for 2m0s
cadvisor_1      | I1028 12:34:50.215376       1 manager.go:131] cAdvisor running in container: "/docker/2da85f53aaf23024eb2016dc330b05634972252eea2f230831e3676ad3b6fa73"
cadvisor_1      | I1028 12:34:50.238721       1 fs.go:107] Filesystem partitions: map[/dev/dm-0:{mountpoint:/rootfs major:252 minor:0 fsType: blockSize:0} /dev/sda1:{mountpoint:/rootfs/boot major:8 minor:1 fsType: blockSize:0}]
cadvisor_1      | I1028 12:34:50.249690       1 machine.go:50] Couldn't collect info from any of the files in "/rootfs/etc/machine-id,/var/lib/dbus/machine-id"
cadvisor_1      | I1028 12:34:50.249806       1 manager.go:166] Machine: {NumCores:2 CpuFrequency:2592000 MemoryCapacity:1569599488 MachineID: SystemUUID:B63CB367-870F-4E48-917F-7E524C2C67A0 BootID:e225b37a-b8e6-466b-9f67-84b74df8e90c Filesystems:[{Device:/dev/dm-0 Capacity:41092214784} {Device:/dev/sda1 Capacity:246755328}] DiskMap:map[252:0:{Name:dm-0 Major:252 Minor:0 Size:41884319744 Scheduler:none} 252:1:{Name:dm-1 Major:252 Minor:1 Size:805306368 Scheduler:none} 8:0:{Name:sda Major:8 Minor:0 Size:42949672960 Scheduler:deadline}] NetworkDevices:[{Name:br-067c518abd1f MacAddress:02:42:78:41:c0:71 Speed:0 Mtu:1500} {Name:br-1c136984ac6d MacAddress:02:42:0c:dc:89:ac Speed:0 Mtu:1500} {Name:br-3b100a8c826a MacAddress:02:42:11:2c:a0:4c Speed:0 Mtu:1500} {Name:br-5573a4076799 MacAddress:02:42:97:14:9a:fc Speed:0 Mtu:1500} {Name:br-9b7560132352 MacAddress:02:42:5c:df:9a:43 Speed:0 Mtu:1500} {Name:eth0 MacAddress:08:00:27:c7:ba:b5 Speed:1000 Mtu:1500} {Name:eth1 MacAddress:08:00:27:51:9c:7e Speed:1000 Mtu:1500}] Topology:[{Id:0 Memory:1569599488 Cores:[{Id:0 Threads:[0] Caches:[{Size:32768 Type:Data Level:1} {Size:32768 Type:Instruction Level:1} {Size:262144 Type:Unified Level:2} {Size:6291456 Type:Unified Level:3}]} {Id:1 Threads:[1] Caches:[{Size:32768 Type:Data Level:1} {Size:32768 Type:Instruction Level:1} {Size:262144 Type:Unified Level:2} {Size:6291456 Type:Unified Level:3}]}] Caches:[]}] CloudProvider:Unknown InstanceType:Unknown}
cadvisor_1      | I1028 12:34:50.251115       1 manager.go:172] Version: {KernelVersion:4.2.0-42-generic ContainerOsVersion:Alpine Linux v3.2 DockerVersion:1.12.1 CadvisorVersion:0.20.5 CadvisorRevision:9aa348f}
cadvisor_1      | E1028 12:34:50.273526       1 manager.go:208] Docker container factory registration failed: docker found, but not using native exec driver.
cadvisor_1      | I1028 12:34:50.279684       1 factory.go:94] Registering Raw factory
cadvisor_1      | I1028 12:34:50.316816       1 manager.go:1000] Started watching for new ooms in manager
cadvisor_1      | W1028 12:34:50.316960       1 manager.go:239] Could not configure a source for OOM detection, disabling OOM events: exec: "journalctl": executable file not found in $PATH
cadvisor_1      | I1028 12:34:50.317927       1 manager.go:252] Starting recovery of all containers
cadvisor_1      | I1028 12:34:50.336674       1 manager.go:257] Recovery completed
cadvisor_1      | I1028 12:34:50.352618       1 cadvisor.go:106] Starting cAdvisor version: 0.20.5-9aa348f on port 8080
influxdb_1      | => Waiting for confirmation of InfluxDB service startup ...
influxdb_1      | => Waiting for confirmation of InfluxDB service startup ...
influxdb_1      | => Waiting for confirmation of InfluxDB service startup ...
influxdb_1      | => Waiting for confirmation of InfluxDB service startup ...

To confirm the issue, you can try to connect to port 8083 from a second terminal window on the Docker host:

(docker host) $ curl --retry 10 --retry-delay 5 -v http://localhost:8083
* Rebuilt URL to: http://localhost:8083/
* Hostname was NOT found in DNS cache
*   Trying ::1...
* Connected to localhost (::1) port 8083 (#0)
> GET / HTTP/1.1
> User-Agent: curl/7.35.0
> Host: localhost:8083
> Accept: */*
>
* Recv failure: Connection reset by peer
* Closing connection 0
curl: (56) Recv failure: Connection reset by peer

I.e., the connection attempt is answered with a TCP RST on the port.

Reason:

The root cause is an issue with Vagrant synced folders of type “vboxsf”: InfluxDB cannot open its meta store on such a mount, which leads to the “raft: new bolt store: invalid argument” error shown in the log above.
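
You can check whether your working directory actually lives on such a mount directly on the Docker host; /vagrant is Vagrant's default synced folder path and may differ in your setup:

(docker host) $ mount | grep vboxsf

If the output lists your working directory with filesystem type vboxsf, you are affected.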

Workaround 1a: do not use synced folders

The problem disappears if you clone the repository into a non-synced folder.
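
A minimal sketch of this workaround; the repository URL and target directory are placeholders for your own setup:

(docker host) $ cd ~
(docker host) $ git clone <repository-url> docker-monitoring
(docker host) $ cd docker-monitoring
(docker host) $ docker-compose up

Since the home directory is local to the Docker host VM and not a vboxsf mount, InfluxDB can create its meta store there without problems.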

Workaround 1b: use a synced folder of a different type

The problem also disappears if you use a synced folder of a different type. I have tested this with a Vagrant synced folder of type “smb”:

  • add the line
     config.vm.synced_folder ".", "/vagrant", type: "smb"

    to the Vagrantfile inside the Vagrant.configure block

  • start a CMD as Administrator, e.g. run
     runas.exe /savecred /user:Administrator "cmd"

    in a non-privileged CMD

  • run
    vagrant up

    in the privileged CMD session.

After that, you can SSH into the Docker host, clone the repository, and run docker-compose up as in Step 1, without hitting the InfluxDB problem.

Workaround 2: upgrade InfluxDB to 0.13

The problem disappears even with vboxsf synced folders, if we upgrade InfluxDB to 0.13:

I have found this InfluxDB 0.9 issue with the same symptoms. That is why I tried the upgrade while still working within the Vagrant synced folder /vagrant.

Step 1: Upgrade InfluxDB

In the docker-compose.yml file, replace

influxdb:
  image: tutum/influxdb:0.9

by

influxdb:
  image: tutum/influxdb:0.13
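
Optionally, you can pull the new image explicitly before restarting; the service name influxdb is the key used in the docker-compose.yml above:

(docker host) $ docker-compose pull influxdb
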
Step 2: remove the ./data folder (important! Otherwise, the problem will persist!)
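
A minimal sketch of this step, assuming the InfluxDB data volume is mapped to ./data next to the docker-compose.yml, as in this setup:

(docker host) $ sudo rm -rf ./data

The folder is recreated on the next startup; as noted above, the old data written by the 0.9 image must be removed, or the problem will persist.
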
Step 3: Try again:

$ docker-compose up

Starting dockermonitoringrepaired_influxdbData_1
Starting dockermonitoringrepaired_influxdb_1
Starting dockermonitoringrepaired_grafana_1
Starting dockermonitoringrepaired_cadvisor_1
...
influxdb_1 | [monitor] 2016/10/25 16:48:44 Storing statistics in database '_internal' retention policy 'monitor', at interval 10s
influxdb_1 | 2016/10/25 16:48:44 Sending anonymous usage statistics to m.influxdb.com
influxdb_1 | [run] 2016/10/25 16:48:44 Listening for signals
Step 4: curl test

Now, the curl test is successful:

$ curl --retry 10 --retry-delay 5 -v http://localhost:8083
* Rebuilt URL to: http://localhost:8083/
* Hostname was NOT found in DNS cache
* Trying ::1...
* Connected to localhost (::1) port 8083 (#0)
> GET / HTTP/1.1
> User-Agent: curl/7.35.0
...
</body>

</html>
* Connection #0 to host localhost left intact

This, too, is successful now: InfluxDB answers on port 8083 with a proper HTML page instead of resetting the connection.
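
As an additional, optional sanity check you can query InfluxDB's HTTP API on port 8086 and verify that the cadvisor database announced in the startup logs exists (the exact output format depends on the InfluxDB version):

(docker host) $ curl -G http://localhost:8086/query --data-urlencode "q=SHOW DATABASES"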

Appendix: Error: load error nokogiri/nokogiri LoadError Vagrant

This is an error I encountered after installing Vagrant 1.8.1 on Windows 10 and …