Saturday, October 24
Shadow

Configuring an etcd cluster for Kubernetes

Etcd is the primary store for Kubernetes cluster state. In this article, we discuss the basics of how etcd works in Kubernetes and show how to configure an etcd cluster.

We also explain how to enable SSL/TLS encryption for the data exchanged between cluster nodes and etcd, which makes the etcd deployment more secure and reliable.

Etcd is an open-source distributed key-value store. It was originally created for CoreOS, but it is now available for OS X, Linux, and BSD.

Like Kubernetes itself, etcd is written in Go. It uses the Raft consensus algorithm to replicate data with high availability. Its support for distributed systems has made it a common choice for Kubernetes, Cloud Foundry, Fleet, and other projects.

Why would you need an etcd cluster in Kubernetes?

Etcd is the database behind Kubernetes and a critical component of any cluster. It contains all the information needed for the cluster to operate correctly.

You can run etcd (1) on the same nodes as the Kubernetes cluster, (2) on separate systems, or (3) as a pod inside the Kubernetes cluster. Regardless of the configuration, the main task is to ensure the consistency of the data and the resiliency of the cluster. If etcd runs in several replicas, either on separate systems or together with Kubernetes, the cluster keeps operating when a Kubernetes node goes down.

A three-node etcd cluster

 

What should an etcd cluster look like?

In theory, the number of nodes in an etcd cluster is unlimited, and a larger cluster tolerates more failures. However, write latency grows with the size of the cluster because every write has to be replicated to more nodes. An etcd cluster should have at most seven nodes, and in most cases five nodes are enough. For example, Google recommends clusters of this size for Chubby, its internal service similar to etcd.

A 5-node cluster stays online even if two nodes are down, because three nodes are still available. To survive the failure of three nodes, you need a 7-node cluster, and so on.

As you can see, it is better to use an odd number of nodes. Data consistency between the nodes is maintained through a quorum mechanism, which is what ultimately makes the cluster resilient: a change is accepted only when a majority of the nodes agree on it. For example, in a 5-node cluster the quorum is three nodes (i.e., the cluster can lose two nodes and still be operational). It might seem that more nodes are always better and that adding a sixth node would help, but the quorum in a 6-node cluster is four nodes. Such a cluster therefore stays online only if at most two nodes are down, just like the 5-node cluster, while having one more node that can fail. See the etcd documentation for more details on the optimal number of nodes.
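The quorum arithmetic described above can be sketched in a few lines of shell. This is an illustrative calculation only; the function names quorum and fault_tolerance are hypothetical helpers, not part of etcd.

```shell
#!/bin/sh
# quorum N: smallest majority of an N-member cluster (integer division).
quorum() {
  echo $(( $1 / 2 + 1 ))
}

# fault_tolerance N: how many members can fail while a majority remains.
fault_tolerance() {
  echo $(( $1 - ($1 / 2 + 1) ))
}

# Compare the cluster sizes discussed in the text.
for n in 3 4 5 6 7; do
  printf '%d nodes: quorum %d, tolerates %d failure(s)\n' \
    "$n" "$(quorum "$n")" "$(fault_tolerance "$n")"
done
```

The loop shows why even sizes bring no benefit: 5 and 6 nodes both tolerate two failures, and 3 and 4 nodes both tolerate one.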

Some more important things to note:

  • It is better to run an etcd cluster on separate nodes.
  • Always monitor the nodes to make sure they have enough free resources to accommodate new data.
  • The recommended etcd version for production is 3.2.24 or later. Earlier versions may not work smoothly with newer versions of Kubernetes (1.12 and 1.13).

The effective and reliable operation of an etcd cluster depends to a large extent on disk I/O. To achieve reliability, etcd persists its data to a log and periodically compacts that log to discard obsolete entries. High I/O latency can delay the exchange of cluster state data between the nodes, which leads to instability and re-election of the etcd leader. Moreover, a lack of resources can cause the etcd cluster members to fall out of sync, making it impossible to write information to etcd or to launch new nodes in the Kubernetes cluster. That is why it is recommended to run etcd on SSDs.

Our example architecture

In this article, we will create a 3-node etcd cluster. If you need a larger cluster, then change the number of nodes and their addresses in the configuration files presented below accordingly.

In order to run etcd, you must have a Kubernetes cluster and a configured kubectl utility.

Setting up the etcd cluster

In this section, we will set up a cluster and install etcd from the standard repository. We will use three virtual nodes running CentOS 7.5. The components communicate using the following IP addresses and hostnames:

  • 10.10.0.10 etcd1
  • 10.10.0.11 etcd2
  • 10.10.0.12 etcd3

Note: It is recommended that nodes intended to run etcd are configured to use a separate internal subnetwork for peer-to-peer communication.

Preliminary configuration

In our example, we will not use an internal DNS, and therefore we will have to specify the hostnames in the /etc/hosts file on every node in the etcd cluster. If we had a DNS server, then we would be able to communicate with the nodes directly using hostnames.

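So, let’s add the hostnames to /etc/hosts on every node:

# replace the hostnames and their addresses with actual ones

10.10.0.10 etcd1

10.10.0.11 etcd2

10.10.0.12 etcd3

Next, switch SELinux to permissive mode for the current session, disable it permanently, and open the etcd client (2379) and peer (2380) ports in the firewall:

setenforce 0

sed -i 's/^SELINUX=.*/SELINUX=disabled/g' /etc/selinux/config

firewall-cmd --add-port={2379,2380}/tcp --permanent

firewall-cmd --reload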

Installing the etcd packages on all nodes

Install the etcd package on every node with the yum package manager:

yum install -y etcd

We are now ready to configure etcd.

Configuring etcd

On each node, open /etc/etcd/etcd.conf and enter the node’s own name and address along with the addresses of all cluster members. On every node, the file should look like this:

# [member]

ETCD_NAME=<node_name>

ETCD_DATA_DIR="/var/lib/etcd/default.etcd"

ETCD_LISTEN_PEER_URLS="http://<node_ip>:2380"

ETCD_LISTEN_CLIENT_URLS="http://<node_ip>:2379,http://127.0.0.1:2379"

#[cluster]

ETCD_INITIAL_ADVERTISE_PEER_URLS="http://<node_ip>:2380"

ETCD_INITIAL_CLUSTER="etcd1=http://10.10.0.10:2380,etcd2=http://10.10.0.11:2380,etcd3=http://10.10.0.12:2380"

ETCD_INITIAL_CLUSTER_STATE="new"

ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster-1"

ETCD_ADVERTISE_CLIENT_URLS="http://<node_ip>:2379"

where <node_name> is the name of the node (for example, etcd1) and <node_ip> is the node address (for example, 10.10.0.10).
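As a sketch of how the per-node values fit into this file, the following script prints the configuration for one member. The function render_etcd_conf is a hypothetical helper introduced only for illustration; it assumes the three example nodes and the plain HTTP setup used at this stage.

```shell
#!/bin/sh
# Hypothetical helper: print an etcd.conf for one cluster member.
# $1 = member name (e.g. etcd1), $2 = that member's IP address.
render_etcd_conf() {
  name=$1
  ip=$2
  cat <<EOF
# [member]
ETCD_NAME=${name}
ETCD_DATA_DIR="/var/lib/etcd/default.etcd"
ETCD_LISTEN_PEER_URLS="http://${ip}:2380"
ETCD_LISTEN_CLIENT_URLS="http://${ip}:2379,http://127.0.0.1:2379"
# [cluster]
ETCD_INITIAL_ADVERTISE_PEER_URLS="http://${ip}:2380"
ETCD_INITIAL_CLUSTER="etcd1=http://10.10.0.10:2380,etcd2=http://10.10.0.11:2380,etcd3=http://10.10.0.12:2380"
ETCD_INITIAL_CLUSTER_STATE="new"
ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster-1"
ETCD_ADVERTISE_CLIENT_URLS="http://${ip}:2379"
EOF
}

# Preview the file for the second node.
render_etcd_conf etcd2 10.10.0.11
```

To apply the result on a node, the output could be redirected to /etc/etcd/etcd.conf. Only the member name and the node's own address differ between nodes; the ETCD_INITIAL_CLUSTER line is identical everywhere.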

 

Completing configuration

Start etcd on every node (for example, with systemctl start etcd). Once the etcd cluster is running, change its state from new to existing. To do that, run the following on each node:

sed -i 's/ETCD_INITIAL_CLUSTER_STATE="new"/ETCD_INITIAL_CLUSTER_STATE="existing"/g' /etc/etcd/etcd.conf

Configuring Kubernetes API Server

For etcd to work with the Kubernetes cluster, configure the Kubernetes API server on the master node with the addresses of the nodes running etcd.

Open /etc/kubernetes/apiserver on the Kubernetes master and enter the addresses of the etcd nodes:

KUBE_ETCD_SERVERS="--etcd-servers=http://10.10.0.10:2379,http://10.10.0.11:2379,http://10.10.0.12:2379"

After that, make sure that the cluster can be reached. Run this command on one of the nodes with etcd:

etcdctl member list

If the command returns the list of etcd members, the cluster is operational and ready for further configuration.
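Beyond listing the members, each endpoint can be asked for its health. The sketch below builds the endpoint list from the example addresses and, only if etcdctl is installed on the machine, runs the v3 API health check. The addresses are the ones assumed throughout this article; adjust them for your environment.

```shell
#!/bin/sh
# Build a comma-separated list of client URLs from the example node addresses.
endpoints=""
for ip in 10.10.0.10 10.10.0.11 10.10.0.12; do
  endpoints="${endpoints:+$endpoints,}http://${ip}:2379"
done
echo "Endpoints: $endpoints"

# Query the cluster only when etcdctl is available on this machine.
if command -v etcdctl >/dev/null 2>&1; then
  ETCDCTL_API=3 etcdctl --endpoints="$endpoints" endpoint health \
    || echo "At least one endpoint did not report healthy"
fi
```

Checking every endpoint is more informative than a single member list, because a node can appear in the member list while being unreachable.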

However, to use etcd in production, it is necessary to configure interaction between the etcd and Kubernetes cluster components over https.

Configuring SSL/TLS for communications between etcd nodes

If etcd holds private or confidential information, all communications must be encrypted. Etcd supports SSL/TLS and certificate-based authentication for both client-server and peer-to-peer connections.

SSL/TLS provides encrypted data transfer over HTTPS. In our example, we will encrypt connections between the etcd nodes (acting as SSL/TLS servers) and the Kubernetes API server (acting as an SSL/TLS client), as well as connections between the etcd nodes themselves (peer-to-peer connections). To do that, we will create certificates for each etcd node and for the API server (master) in the Kubernetes cluster.

First, we must create a CA (certificate authority) certificate and then a key pair for each role a node plays: every etcd node gets one pair for serving clients and one for peer communication. It is recommended to create a new key pair for every cluster member.

Generating a self-signed TLS-certificate

You can easily generate new certificates using the cfssl tool. To install cfssl, run the following commands on each node:

curl https://pkg.cfssl.org/R1.2/cfssl_linux-amd64 -o /usr/local/bin/cfssl

curl https://pkg.cfssl.org/R1.2/cfssljson_linux-amd64 -o /usr/local/bin/cfssljson

chmod +x /usr/local/bin/cfssl /usr/local/bin/cfssljson

Create a folder for generating certificates and generate the default CA configuration files:

mkdir ~/cfssl

cd ~/cfssl

cfssl print-defaults config > ca-config.json

cfssl print-defaults csr > ca-csr.json

Edit the ca-config.json file as shown below, creating profiles for client-server communication and for communication between the servers:

{
  "signing": {
    "default": {
      "expiry": "43800h"
    },
    "profiles": {
      "server": {
        "expiry": "43800h",
        "usages": [
          "signing",
          "key encipherment",
          "server auth"
        ]
      },
      "client": {
        "expiry": "43800h",
        "usages": [
          "signing",
          "key encipherment",
          "client auth"
        ]
      },
      "peer": {
        "expiry": "43800h",
        "usages": [
          "signing",
          "key encipherment",
          "server auth",
          "client auth"
        ]
      }
    }
  }
}

 

Certificates for etcd nodes

First generate the CA certificate and its private key, then create a CSR template for the etcd nodes acting as SSL/TLS servers:

cfssl gencert -initca ca-csr.json | cfssljson -bare ca -

cfssl print-defaults csr > server.json

This produces the server.json file with the configuration for the server certificate.

Make a copy of server.json for every node (three files in total) and name them server1.json, server2.json, and server3.json for convenience. Edit each file, entering the node’s hostname and its IP address in the internal network:

{
  "CN": "etcd1",
  "hosts": [
    "etcd1",
    "10.10.0.10"
  ],
  "key": {
    "algo": "ecdsa",
    "size": 256
  }
}
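Instead of editing three copies by hand, the per-node files can be generated in a loop. This is a minimal sketch that assumes the example hostnames and addresses from this article; csr_json is a hypothetical helper and CSR_DIR is a hypothetical variable for the output folder (the current directory by default).

```shell
#!/bin/sh
# Hypothetical helper: print a cfssl CSR for one etcd member.
# $1 = hostname, $2 = IP address in the internal network.
csr_json() {
  cat <<EOF
{
  "CN": "$1",
  "hosts": [
    "$1",
    "$2"
  ],
  "key": {
    "algo": "ecdsa",
    "size": 256
  }
}
EOF
}

# Write server1.json, server2.json, and server3.json.
out_dir=${CSR_DIR:-.}
i=1
for ip in 10.10.0.10 10.10.0.11 10.10.0.12; do
  csr_json "etcd${i}" "$ip" > "${out_dir}/server${i}.json"
  i=$((i + 1))
done
```

Each generated file lists both the hostname and the IP address in hosts, so the resulting certificate is valid whichever of the two a client uses to connect.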

Generate certificates for etcd servers:

cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=server server1.json | cfssljson -bare server1

cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=server server2.json | cfssljson -bare server2

cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=server server3.json | cfssljson -bare server3

 

Certificate for the client

Now create a configuration file for the master node of the Kubernetes cluster acting as the SSL/TLS client:

cfssl print-defaults csr > client.json

Edit client.json so that it looks like this:

{
  "CN": "root",
  "key": {
    "algo": "ecdsa",
    "size": 256
  }
}

Generate the client certificate:

cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=client client.json | cfssljson -bare client

Certificates for peer-to-peer communications in the etcd cluster

Finally, create configuration files for the peer-to-peer communications between the etcd nodes:

cfssl print-defaults csr > member1.json

As with the server certificates, create a copy of the file for every node in the cluster and name the copies member1.json, member2.json, and member3.json. Edit each file, entering the node’s hostname and its IP address in the internal network:

{
  "CN": "etcd1",
  "hosts": [
    "etcd1",
    "10.10.0.10"
  ],
  "key": {
    "algo": "ecdsa",
    "size": 256
  }
}

Generate the certificates:

cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=peer member1.json | cfssljson -bare member1

cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=peer member2.json | cfssljson -bare member2

cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=peer member3.json | cfssljson -bare member3

Copy the certificates to the appropriate nodes

On each node (the etcd cluster members and the Kubernetes API server), create the folder /etc/ssl/etcd to store the etcd certificates:

mkdir /etc/ssl/etcd

Copy certificates to each node according to the following scheme:

  • etcd1: server1.pem, member1.pem, member1-key.pem, server1-key.pem, ca.pem
  • etcd2: server2.pem, member2.pem, member2-key.pem, server2-key.pem, ca.pem
  • etcd3: server3.pem, member3.pem, member3-key.pem, server3-key.pem, ca.pem
  • etcd client: client.pem, client-key.pem, ca.pem
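The distribution scheme above can be expressed as a small dry-run script. It only prints the scp commands instead of running them; copying over SSH as root is an assumption made for illustration, and cert_files_for is a hypothetical helper.

```shell
#!/bin/sh
# Hypothetical helper: list the files that etcd member number $1 needs.
cert_files_for() {
  echo "server$1.pem server$1-key.pem member$1.pem member$1-key.pem ca.pem"
}

# Dry run: print the copy commands for review instead of executing them.
for i in 1 2 3; do
  echo "scp $(cert_files_for "$i") root@etcd${i}:/etc/ssl/etcd/"
done
echo "scp client.pem client-key.pem ca.pem root@<kubernetes-master>:/etc/ssl/etcd/"
```

Reviewing the printed commands before running them helps avoid sending a private key to the wrong node. The ca-key.pem file is deliberately absent from every list: the CA private key should stay on the machine where the certificates are generated.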

Don’t forget to change the owner of the certificate files and restrict the permissions on the key files to prevent unauthorized access. By default, the user running etcd is also named etcd:

chown etcd: /etc/ssl/etcd/*

chmod 660 /etc/ssl/etcd/*-key.pem

Configuring HTTPS communication without authentication

Now let’s configure communication over HTTPS. On each node in the etcd cluster, edit /etc/etcd/etcd.conf: change http to https and add the lines in the Security section to enable the use of TLS certificates.

Example for node 1 (take similar steps for nodes 2 and 3):

# [member]

ETCD_NAME=etcd1

ETCD_DATA_DIR="/var/lib/etcd/default.etcd"

ETCD_LISTEN_PEER_URLS="https://10.10.0.10:2380"

ETCD_LISTEN_CLIENT_URLS="https://10.10.0.10:2379,https://127.0.0.1:2379"

#[cluster]

ETCD_INITIAL_ADVERTISE_PEER_URLS="https://10.10.0.10:2380"

ETCD_INITIAL_CLUSTER="etcd1=https://10.10.0.10:2380,etcd2=https://10.10.0.11:2380,etcd3=https://10.10.0.12:2380"

ETCD_INITIAL_CLUSTER_STATE="new"

ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster-1"

ETCD_ADVERTISE_CLIENT_URLS="https://10.10.0.10:2379"

#[Security]

ETCD_CERT_FILE="/etc/ssl/etcd/server1.pem"

ETCD_KEY_FILE="/etc/ssl/etcd/server1-key.pem"

ETCD_TRUSTED_CA_FILE="/etc/ssl/etcd/ca.pem"

ETCD_CLIENT_CERT_AUTH="false"

ETCD_PEER_CERT_FILE="/etc/ssl/etcd/member1.pem"

ETCD_PEER_KEY_FILE="/etc/ssl/etcd/member1-key.pem"

ETCD_PEER_CLIENT_CERT_AUTH="true"

ETCD_PEER_TRUSTED_CA_FILE="/etc/ssl/etcd/ca.pem"

Log in to the client node and run the following commands, adjusting the values for your environment:

export ETCDCTL_API=3

export ETCDCTL_DIAL_TIMEOUT=3s

export ETCDCTL_ENDPOINTS='https://10.10.0.10:2379,https://10.10.0.11:2379,https://10.10.0.12:2379'

export ETCDCTL_CACERT=/etc/ssl/etcd/ca.pem

export ETCDCTL_CERT=/etc/ssl/etcd/client.pem

export ETCDCTL_KEY=/etc/ssl/etcd/client-key.pem

Now the configuration for client-server and peer-to-peer connections over HTTPS is complete.

Checking TLS connections (client-server) without authentication

To verify that the communication is configured correctly, connect to etcd over HTTPS from the Kubernetes master node (repeat the check for etcd2 and etcd3):

curl --cacert /etc/ssl/etcd/ca.pem https://etcd1:2379/v2/keys/foo -XPUT -d value=bar -v

Configuring HTTPS communication with authentication

We can also enable client authentication on the etcd server to secure the data exchange and prevent unauthorized access to etcd.

Etcd can authenticate clients either with basic authentication (a username and password) or with certificates. Although basic authentication is simpler to configure, certificates offer a higher level of security. In our case, when the server receives a request from an etcd client, it verifies that the client’s certificate is valid and that the client is allowed to access the server.

To do that, edit the /etc/etcd/etcd.conf file on each node in the cluster, set ETCD_CLIENT_CERT_AUTH="true", and restart etcd so that the change takes effect.

Checking TLS connections (client-server) with authentication

Let’s try to connect to etcd from the Kubernetes master node with authentication (repeat the check for etcd2 and etcd3):

curl --cacert /etc/ssl/etcd/ca.pem --cert /etc/ssl/etcd/client.pem --key /etc/ssl/etcd/client-key.pem \
-L https://etcd1:2379/v2/keys/foo -XPUT -d value=bar -v

Because our example uses a self-signed CA certificate, we specify the certificate authority (CA) manually with the --cacert flag. If you don’t want to do that, you can add the CA certificate to the system’s trusted certificates folder (usually /etc/pki/tls/certs or /etc/ssl/certs).

Authorization and authentication in the etcd cluster

Authorization in etcd is based on Role-Based Access Control (RBAC), i.e., the rules of user access to data stored in etcd are defined by the roles of these users. To grant rights to an end user, assign that user a role that carries these rights.

Create the user named root that will have the root role assigned by default:

etcdctl user add root

When you run this command for a new user, you will be asked to set a password. If certificate authentication is enabled, the user will not need the password to connect. However, a password can be handy if certificate authentication breaks: you can still log in with it.

Authentication is disabled in etcd by default. To enable it, execute the following command:

etcdctl auth enable

After enabling authentication, we can configure authorization. In our example, we create a user, a role, define rights for this role and assign the role to this user:

etcdctl user add user1

etcdctl role add role1

etcdctl role grant-permission role1 --prefix=true readwrite /dir1/

etcdctl user grant-role user1 role1

This gives user1 read and write access to every key whose name starts with the /dir1/ prefix.

The outcome

We have configured a 3-node etcd cluster that provides reliable storage for the Kubernetes cluster state. Even this minimal etcd configuration improves the resiliency of the service: if you shut down one etcd node, Kubernetes keeps working. However, if you shut down two etcd nodes, the cluster becomes unavailable, so a five-node etcd cluster is recommended for critical applications with high loads.