This blog shows how a Docker CoreOS cluster can be set up in about 10 minutes (depending on your Internet speed), excluding the time needed to install Vagrant, VirtualBox and git.
In order to speed up the installation process of a CoreOS cluster, I have created a fork of the coreos-vagrant git repository. With that repository fork, a CoreOS cluster with 3 cluster nodes can be up and running and discovered in less than 10 minutes, if the prerequisites are met.
Vagrant, Virtualbox and git are assumed to be installed.
A Unix tool like dos2unix is assumed to be installed, but any tool that can convert line endings will do.
c) On Windows, the files must be converted to UNIX format (at minimum, provision_ssh_agent_bashrc.sh must be in UNIX format). If bash is installed on the system, you can run: bash dos2unix * (otherwise you might be able to use your editor, e.g. PSPad, UltraEdit, … to do the conversion)
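If dos2unix is not at hand, the same conversion can be done with standard Unix tools; a minimal sketch using sed (the file name below is just an example, not a file from the repository):

```shell
# Create an example file with Windows (CRLF) line endings
printf 'echo hello\r\necho world\r\n' > provision_example.sh

# Strip the trailing carriage returns in place (same effect as dos2unix)
sed -i 's/\r$//' provision_example.sh

# Verify: no carriage returns remain
if grep -q $'\r' provision_example.sh; then
  echo "still in DOS format"
else
  echo "converted to UNIX format"
fi
```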
In this post, I will explore services failover scenarios for docker containers on CoreOS clusters using fleet. A container-based service will be defined and started, and we will explore the service recovery after a failure of one or all cluster nodes (e.g. a power outage).
If you happen to read the full description after tl;dr section below, you will learn about the difference between fleet used here and orchestration tools like Kubernetes. In addition, we will explore the proper SSH key handling and HTTP proxy configuration.
tl;dr: Docker CoreOS Cluster Failover Test in less than 15 Minutes
In this chapter, we start from a cluster that is up and running, as created in the appendix of my last blog. In that blog, we created a CoreOS cluster in less than 10 minutes. Now we will define a hello world service that can be started on the cluster, and we will check that the cluster failover works as expected.
check the service (issue several times until you see some lines with date and “Hello World”):
fleetctl status hello.service
Failover Test: via the VirtualBox console, shut down the machine that has the service running (this might differ from the machine on which you started the service)
Connect to another node, e.g. via
vagrant ssh core-03
and repeat step 7. It should show that the service is up and running, as follows:
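For reference, the container behind hello.service does nothing more than print a greeting once per second; its output can be reproduced locally like this (three iterations instead of an endless loop):

```shell
# The container runs roughly: while true; do echo Hello World; sleep 1; done
# Here we print three timestamped lines instead of looping forever
for i in 1 2 3; do
  echo "$(date '+%b %d %H:%M:%S') Hello World"
done
```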
Full description: Starting a distributed Hello World Service
v1 (2015-07-22): step by step instruction with manual changes required
v2 (2015-07-27): added the “Appendix: CoreOS Cluster Failover Test in less than 15 Minutes”
v3 (2015-08-19): moved to WordPress, since the LinkedIn blog is not available publicly anymore
v4 (2015-08-19): moved the CoreOS Cluster Failover Tests in less than 15 Minutes to the top (for the quick reader)
For service definition and management, we explore the features and handling of the fleet software that ships with CoreOS. fleet is an open source project that describes itself as a “simple distributed init system”. The project team points to Kubernetes for all those who are looking for more complex scheduling requirements or a first-class container orchestration system.
Still, I would like to learn about fleet. fleet is described in the CoreOS Quick Start Guide and it seems to be simple, yet powerful, if you are looking for the following features:
define a (docker container) service independent of the CoreOS node it will run on later
deploy and start a service unit on any of the CoreOS nodes
deploy and start service units on all of the CoreOS nodes
upon failure of a node: automatic restart of a service unit on another node
make sure that all required service units are started on the same machine (affinity)
forbid specific service units from running on the same machine (anti-affinity)
allow for machine specific metadata
Note that the fleet software could work on any Linux system with an up-to-date systemd, but officially it is supported only on CoreOS (see a group discussion here).
Following the CoreOS Quick Start Guide, I have created a hello.service file in the home directory of the “core” user on core-01:
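The unit file follows the example in the CoreOS Quick Start Guide; it should look roughly like this (container and unit names as used in the guide):

```ini
[Unit]
Description=MyApp
After=docker.service
Requires=docker.service

[Service]
TimeoutStartSec=0
ExecStartPre=-/usr/bin/docker kill busybox1
ExecStartPre=-/usr/bin/docker rm busybox1
ExecStartPre=/usr/bin/docker pull busybox
ExecStart=/usr/bin/docker run --name busybox1 busybox /bin/sh -c "while true; do echo Hello World; sleep 1; done"
ExecStop=/usr/bin/docker stop busybox1
```

The ExecStartPre lines prefixed with “-” are allowed to fail (e.g. when no old container exists yet).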
It is not clear to me yet where this service definition is supposed to be located. Will it be distributed to all systems? If not, what happens if the machine on which this service definition is located becomes unavailable? We will see in the “Power Outage” section further down that the service is still started automatically. For now, let us move on:
We can see that the service has not been started on core-01, but on core-03 instead, since the IP address 172.17.8.103 is owned by core-03. Let us try to check the service status with the command
fleetctl status hello.service
Failed. We can still see that the service has been started on core-03 by using the fleetctl list-units command:
However, the “fleetctl status hello.service” command does work locally on core-03, where the service has been started:
It is not up and running, because busybox could not be downloaded. This is caused by a missing HTTP proxy configuration and will be fixed later. As a workaround, you can also connect temporarily to the Internet without an HTTP proxy (e.g. using the hot spot function of your mobile phone), and it will work. However, let us postpone that fix and concentrate on the fleetctl status issue first:
Fixing the fleetctl status Problem
A long troubleshooting session has led me to the conclusion that the problem is caused by the ssh agent not being started and by a missing SSH private key on the system that is issuing the fleetctl command. For resolution, we need to perform the following steps:
Find and upload the insecure SSH key to the nodes
start the ssh-agent
set permissions of the ssh key to 600
add the key with ssh-add
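Steps 2 to 4 can be sketched as a short shell session; the key file below is an empty placeholder created only for illustration (the real file is Vagrant’s insecure_private_key, uploaded in step 1):

```shell
# Placeholder for the uploaded key (the real one comes from step 1)
install -m 644 /dev/null insecure_private_key

# Step 2 would be: eval `ssh-agent -s`
# (commented out here, since it spawns a background process)

# Step 3: restrict the key permissions, otherwise ssh-add refuses the key
chmod 600 insecure_private_key
stat -c '%a' insecure_private_key   # prints 600

# Step 4 would be: ssh-add insecure_private_key
```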
1. find and upload the SSH key to the nodes
If you are using WinSCP as SFTP client, you need to convert the insecure private key found on
(where $USERPROFILE=C:/Users/<yourusername> on Windows)
to ppk format. This can be done by importing and saving the private key using the PuTTYgen tool.
The ppk key then can be used by WinSCP to connect to the system and transfer the private key (the one in the original format):
Here, you see the WinSCP connection to the core-0x machines with manually added private keys in the .ssh folder.
2. start the ssh agent
Perform the command
eval `ssh-agent -s`
After that the error message changes to “ssh: handshake failed: ssh: unable to authenticate”
3. set permissions of the SSH key file to 600
chmod 600 .ssh/insecure_private_key
on the node:
4. add the key with ssh-add
Now we need to perform the command ssh-add .ssh/insecure_private_key:
With that, the fleetctl problem is resolved and the command fleetctl status hello.service works also on core-01:
You need to perform those steps on any machine from which you want to issue fleetctl commands with full access to all nodes of the cluster. In the Appendix, I will show that this can be automated using Vagrant.
Manually Fixing the HTTP proxy Problems
Now let us fix the busybox download topic: the service could not be started because I am behind a proxy and the cluster nodes cannot download the busybox image from the repository. For this to work, I either need to pre-load the images, or I need to control the HTTP proxy configuration of each cluster node, similar to what I have done in DDDocker (6) in the section “HTTP Proxy Problems”.
The cool, automated way of doing this is to use the Proxy Configuration Plugin for Vagrant described on http://tmatilai.github.io/vagrant-proxyconf/. But note that there is an open bug, which will be addressed in the Appendix.
For core-03, let us test the manual way of solving the problem first:
check that the download fails, e.g. by trying a “docker search busybox” on the command line of core-03; it fails as expected (this takes a long time):
Perform the commands:
sudo mkdir /etc/systemd/system/docker.service.d
sudo vi /etc/systemd/system/docker.service.d/http-proxy.conf
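The file content follows the standard systemd drop-in format for docker; the proxy address below is a placeholder that must be replaced with your own:

```ini
[Service]
Environment="HTTP_PROXY=http://proxy.example.com:8080"
```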
reboot core-03 (best via “vagrant reload core-03” in the vagrant folder) and try the command again; it should succeed now:
Success: all docker requests are sent to the HTTP proxy and docker search command succeeds.
An implicit Cluster Node Failover Test
Now let us see, what happened to our service:
We can see that it is still down. Because of the reboot of core-03, the service was automatically moved to core-02. However, core-02 still has a wrong HTTP proxy configuration; therefore the download of the busybox image has failed again. Let us force the service back to core-03 by reloading core-02 and core-01 via
vagrant reload core-02 core-01
(performed on the host system in the vagrant project folder). As expected, the service is moved to core-03 and is up and running:
Note that our fleetctl (SSH key) fix has not survived the reload on core-01:
Therefore it is a good idea to look for an automated (vagrant) way to fix the SSH key problem as shown below.
Automated Fix of the SSH Key Problem
We have seen above that the fleetctl status command fails if the service is running on a remote cluster node. The reason was that the fleetctl client had no trusted SSH private key installed and the ssh agent was not started. The manual workaround to activate the SSH key has not survived a reboot of the node. Therefore, I have automated the installation of the Vagrant SSH key in my github fork https://github.com/oveits/coreos-vagrant.
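A Vagrantfile provisioning block for this purpose could look roughly as follows; this is a sketch of the idea, not the exact content of the fork (the destination path and the inline script are my assumptions):

```ruby
# Upload Vagrant's insecure private key to each node ...
config.vm.provision :file,
  source: "~/.vagrant.d/insecure_private_key",
  destination: "/home/core/.ssh/insecure_private_key"

# ... and make sure every login shell starts an agent and adds the key
config.vm.provision :shell, privileged: false, inline: <<-'SHELL'
  chmod 600 /home/core/.ssh/insecure_private_key
  grep -q ssh-agent /home/core/.bashrc || cat >> /home/core/.bashrc <<'EOF'
eval `ssh-agent -s` > /dev/null
ssh-add ~/.ssh/insecure_private_key 2> /dev/null
EOF
SHELL
```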
Automation of the HTTP proxy Configuration Distribution
On core-03, we have manually fixed the HTTP proxy configuration by editing the file /etc/systemd/system/docker.service.d/http-proxy.conf. On core-01 and core-02, we will now fix it using a nice plugin, vagrant-proxyconf. This plugin provisions the HTTP proxy configuration of all nodes of the cluster during Vagrant deployment.
Install the vagrant-proxyconf plugin
If you are currently behind an HTTP proxy, the HTTPS_PROXY variable must be set manually on the host system for the command
vagrant plugin install vagrant-proxyconf
to succeed. Below, I was connected directly to the Internet, so the HTTPS_PROXY had to be empty:
Now configure the Vagrantfile:
Replace the IP addresses and ports with the ones that fit your networking environment.
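The plugin is configured inside the Vagrant.configure block; a sketch with placeholder addresses (the no_proxy list contains the cluster node IPs used in this setup):

```ruby
# Only set the proxy options if the plugin is actually installed
if Vagrant.has_plugin?("vagrant-proxyconf")
  config.proxy.http     = "http://10.0.2.2:8080"   # placeholder: your proxy IP:port
  config.proxy.https    = "http://10.0.2.2:8080"
  config.proxy.no_proxy = "localhost,127.0.0.1,172.17.8.101,172.17.8.102,172.17.8.103"
end
```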
Let us verify that the HTTP configuration is still wrong for core-01:
The variables $HTTP_PROXY and $http_proxy are empty and docker search busybox has failed (timed out):
Now let us provision the cluster node (if you omit the name core-01, all nodes in the cluster will be provisioned; however, we want to verify the results on a single node first before rolling out the provisioning to all nodes):
vagrant reload --provision core-01
# (also possible: vagrant provision core-01, but you need to make sure you reconnect via SSH, since the effect will not be visible in the current SSH session)
After reconnecting to core-01 via SSH, we will see the following output of the “set” command (with different IP addresses and/or ports in your case):
Now docker search is successful:
We remove the file /etc/systemd/system/docker.service.d/http-proxy.conf from core-03, since we will not need it anymore:
and we perform the same provisioning step for core-02 and core-03:
vagrant reload --provision core-02 core-03
After that, the hello.service should have been moved to core-01, and the busybox download should have been successful, as can be verified by issuing the fleetctl status command, as seen below:
The hello.service is up and running also from behind a HTTP proxy.
Total Cluster Failure (e.g. Power Outage)
Now, we will simulate a total cluster failure by shutting down the cluster node VMs via VirtualBox. The hello.service has been defined on core-01 only. Let us first start only core-02 and core-03 and check whether they start up the service:
Cool: even after a simulated power outage, the service is restarted, even though I have not yet started core-01, where the service definition file is located. Perfect. In a production environment, we just need to make sure that the virtual machines are booted automatically after power is back (e.g. using vSphere functions).
fleet is a low-level init system for clusters, which can be used to manage clusters directly and can also be used to bootstrap higher-level container orchestration software like Kubernetes. It allows you to define services and run them on any node of the cluster.
In this post, using fleet, the failover and recovery of a container-based service after a cluster node failure has been tested successfully:
After failure of the node the service was running on, the service is automatically started on another node of the cluster.
After a simulated total cluster outage, the service will be started, even if the cluster node, where the hello.service file is located, is kept down.
Special care had to be taken with the following topics:
SSH keys and ssh agent on the fleet client
HTTP proxy configuration, if applicable to your environment
Both topics can be handled either manually or through automated Vagrant provisioning, as shown above.
About 1. SSH connection requirements:
The node running the fleet client needs to be provisioned with a trusted SSH private key. This can be automated using Vagrant; an according git repository has been provided here: https://github.com/oveits/coreos-vagrant.
About 2. HTTP proxy configuration:
If your cluster runs behind an HTTP proxy, several measures should be considered:
initial cluster discovery (see DDDocker (8)): temporary direct Internet access is needed if you do not want to install a local etcd discovery agent, since etcd discovery does not yet support HTTP proxies.
Successful service launch requires a correct HTTP proxy configuration of each node. This can be automated via the vagrant-proxyconf plugin. However, in the current version, a bug needs to be fixed manually (can be performed in two minutes).
This blog post shows a quick way to install Kubernetes, a Docker orchestration framework published by Google, as a set of Docker containers. With that, you can circumvent the hassle you may run into when trying to install Kubernetes natively.
v0.1 (draft, 2015-08-12): described 3 ways of installing Kubernetes, but they either failed or led to problems later on, because ‘make’ and other commands were missing on the machine running the kubectl client (a Windows-related problem). This installation version has been moved to the appendix, since it might still be needed later on when testing other procedures on Linux.
v0.2 (draft, 2015-08-14): added a coarse outline of my (now successful) 4th attempt
v1.0 (2015-08-17): full description of the successful attempt; I have moved the old, problematic attempts into the appendix.
v1.1 (2015-08-17): added a subchapter “Networking Challenges”, which shows how to route from the Windows host to the service.
v1.2 (2015-08-18): moved the page to wordpress.com, since the LinkedIn blog was not available to all of my colleagues.
v1.3 (2016-07-11): moved the documentation of the unsuccessful attempts to the end of the document
In the last blog, I investigated some low-level container orchestration using fleet, which calls itself a “simple distributed init system”, but we could show that it offers possibilities to
define container-based services and
monitor the health of Docker hosts
automatically restart containers on other hosts if a Docker host fails.
For those looking for more complex scheduling requirements or a first-class container orchestration system, Kubernetes from Google is recommended. Let us explore what Kubernetes adds to fleet’s capabilities, how to install it, and how to test its core features. Kubernetes is a core element of other, more complete container frameworks, e.g. Red Hat’s OpenShift Container Platform (a.k.a. OpenShift Enterprise 3.x).
The architecture consists of a master Docker node and one or more minion nodes. In our example, the master node offers:
kubectl, i.e. the kube client
the REST API with authentication, Replication Controller and Scheduler
the kubelet info service, i.e. the service, which talks to the other docker hosts
Depending on the size of the solution, these functions can be spread across different Docker hosts.
The minion Docker hosts that are hosting pods offer the following functions:
kubelet, i.e. the kube agent the kubelet info service talks to
cAdvisor, which is used to monitor containers
a proxy, which offers an abstraction layer for the communication with pods, see the description of pods.
Pods are a set of containers on a single Docker host
each pod is assigned an IP address
communication between pods is performed via a proxy, which is the abstraction layer offering the pod’s IP address from outside
kubectl is the client talking to the REST API, which in turn talks to the kubelet info service, which in turn talks to the pods via local kubelet agents.
etcd is used as a distributed key-value store. I guess host clustering is done via the etcd discovery service (t.b.v.).
Create and initialize a Vagrant working directory:
mkdir ubuntu-trusty64-docker; cd ubuntu-trusty64-docker
vagrant init williamyeh/ubuntu-trusty64-docker
Start and connect to the VM:
Start Kubernetes Docker Containers
If you are operating behind a proxy, set the http_proxy and https_proxy variables accordingly and add those variables also to the docker environment:
# perform this section, if you are behind a HTTP proxy, but replace IP address and port to match your environment:
sudo vi /etc/default/docker
# add the export commands above to the file /etc/default/docker (with sudo vi /etc/default/docker) and restart the docker service:
sudo service docker restart
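For reference, the export lines added to /etc/default/docker look like this (the proxy address is a placeholder for your environment):

```shell
# /etc/default/docker -- sourced by the docker init script on Ubuntu
export http_proxy="http://proxy.example.com:8080"
export https_proxy="http://proxy.example.com:8080"
```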
The service is reachable from the Vagrant Linux host. However, the service cannot be reached from my Windows machine yet.
The problem can be described as follows:
Kubernetes is automatically fetching an IP address from a pool (defined where?) for each service. In case of the Nginx service, this was the IP address 10.0.0.146.
The address is not owned by the Vagrant Linux VM, as can be seen with an ifconfig.
By default, a Vagrant VM has no public IP address. Vagrant offers the possibility to map a VM’s IP address and port to a port of the host (the Windows host in my case). However, I have not found any possibility to map a host’s IP:port pair to an IP address and port that is not owned by the VM.
We have two possibilities to resolve the problem:
1) Chained port mapping (only theory; not tested yet):
In Vagrant map the host’s port to an IP:port pair owned by the VM
In the VM, use e.g. the iptables NAT function to map that IP:port pair to the service’s IP:port pair
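Inside the VM, the second bullet could look roughly like this (untested, in line with the “only theory” remark above; 10.0.0.146 is the service IP from above, and 8080 stands for a VM port that Vagrant forwards from the host):

```shell
# Redirect traffic arriving on the VM's port 8080 to the Nginx service IP:port
sudo iptables -t nat -A PREROUTING -p tcp --dport 8080 \
     -j DNAT --to-destination 10.0.0.146:80
# Rewrite the source address so reply packets find their way back
sudo iptables -t nat -A POSTROUTING -j MASQUERADE
```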
2) Create an additional, reachable interface for the VM and route the service IP address to this public interface
In the Vagrantfile, add e.g. the line
config.vm.network "private_network", ip: "192.168.33.10" -> this will automatically create the interface eth1 in a new host-only network. You need to issue "vagrant reload --provision" to activate this setting.
On the (Windows) host, add a route to the network, which matches the pool Kubernetes is choosing the service IP addresses from. In my case, I have added the route using the command route add 10.0.0.0 mask 255.255.255.0 192.168.33.10
With this, the Nginx service becomes reachable (now on another, randomly chosen IP address 10.0.0.19, since I have restarted the host):
Perfect! The service is now available also from the Windows host.
In a real-world example, external load balancers will map the externally visible IP:port pair to the service’s IP:port pair, and the IP address and port will be chosen statically. Routing needs to ensure that the service is reached, no matter on which host it is located. This is something I need to explore in more detail in another post: how can we connect several minions to the same network, and how can we make sure that the IP packets are routed to the right minion? Gratuitous ARP?
Appendix: Installation of Kubernetes on CoreOS (removed)
This was a log of my efforts, but it led to a dead end (installation of kubectl on CoreOS is not supported, and running kubectl in a docker container did not lead to the desired results), so I have removed it; it is still available on request as revision 18 …
Appendix: Attempts to install Kubernetes (including unsuccessful attempts)
Installation of Kubernetes on Windows seems to be hard, as can be seen from the first three unsuccessful installation attempts. However, I have found a fairly automated way of installing Kubernetes as a set of Docker containers by using Vagrant and a base image that already has docker installed on an Ubuntu VM. This is listed as successful attempt 4) below and is described in more detail in the main part of this blog.
1) Multi-node CoreOS cluster installation following the Getting Started CoreOS page: I had to try 3 times until the kubectl client was downloaded and installed correctly. And when trying to start my first NginX example, I found myself in a dead end: the example(s) require normal Linux commands like “make”, but CoreOS neither supports those commands nor allows them to be installed.
2) Running Kubernetes locally via Docker
this is supposed to be the quick way for an evaluation installation, since we only need to download and run pre-installed Docker images. Not so this time: here, I ran into the problem that the kubectl client cannot be installed on my boot2docker host. When I try to use one of the kubectl docker images, I always get an error that 127.0.0.1:8080 cannot be reached.
4) SUCCESS: Running Kubernetes locally via Docker within an Ubuntu VM finally succeeded: I had created an Ubuntu VM using Vagrant with the image ubuntu-trusty64-docker from the Vagrant boxes repository, and I had downloaded kubectl v1.0.1 into /usr/local/bin of that image. First, I had the problem that kubectl always returned an error that I could not find on Google, saying that it had received the string “Supported versions: [v1.0,…]” or similar. Then I found out that kubectl connects to http://localhost:8080, which was already occupied by a docker image google/cadvisor that was up and running in the Ubuntu Vagrant image I had used. After finding this docker image with “docker ps” and stopping it with “docker stop <container-id>”, kubectl worked as expected. Now “wget -qO- http://localhost:8080/ | less” returns a list of paths in json format, and all kubectl commands on the instruction page work as expected. That was hard work. My assumption that it would work on Ubuntu was correct. I will troubleshoot in more detail why it had not worked in one of the other ways.