Backing up etcd, the Kubernetes database
This is Kujirai Takahiro (@opensourcetech), LinuC Evangelist.
Introduction
This time we back up etcd, the database behind Kubernetes.
etcd stores the cluster's configuration data, so it runs in a redundant configuration;
on top of that, taking periodic backups raises fault tolerance further.
Where the data lives
etcd's data location is recorded in the YAML manifest that was used to deploy etcd.
Specifically, it is " - --data-dir=/var/lib/etcd" on line 19.
kubeuser@kubemaster1:~$ sudo cat -n /etc/kubernetes/manifests/etcd.yaml
     1  apiVersion: v1
     2  kind: Pod
     3  metadata:
     4    annotations:
     5      kubeadm.kubernetes.io/etcd.advertise-client-urls: https://192.168.1.251:2379
     6    creationTimestamp: null
     7    labels:
     8      component: etcd
     9      tier: control-plane
    10    name: etcd
    11    namespace: kube-system
    12  spec:
    13    containers:
    14    - command:
    15      - etcd
    16      - --advertise-client-urls=https://192.168.1.251:2379
    17      - --cert-file=/etc/kubernetes/pki/etcd/server.crt
    18      - --client-cert-auth=true
    19      - --data-dir=/var/lib/etcd
    20      - --initial-advertise-peer-urls=https://192.168.1.251:2380
    21      - --initial-cluster=kubemaster1=https://192.168.1.251:2380
    22      - --key-file=/etc/kubernetes/pki/etcd/server.key
    23      - --listen-client-urls=https://127.0.0.1:2379,https://192.168.1.251:2379
    24      - --listen-metrics-urls=http://127.0.0.1:2381
    25      - --listen-peer-urls=https://192.168.1.251:2380
    26      - --name=kubemaster1
    27      - --peer-cert-file=/etc/kubernetes/pki/etcd/peer.crt
    28      - --peer-client-cert-auth=true
    29      - --peer-key-file=/etc/kubernetes/pki/etcd/peer.key
    30      - --peer-trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt
    31      - --snapshot-count=10000
    32      - --trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt
    33      image: k8s.gcr.io/etcd:3.4.3-0
    34      imagePullPolicy: IfNotPresent
    35      livenessProbe:
    36        failureThreshold: 8
    37        httpGet:
    38          host: 127.0.0.1
    39          path: /health
    40          port: 2381
    41          scheme: HTTP
    42        initialDelaySeconds: 15
    43        timeoutSeconds: 15
    44      name: etcd
    45      resources: {}
    46      volumeMounts:
    47      - mountPath: /var/lib/etcd
    48        name: etcd-data
    49      - mountPath: /etc/kubernetes/pki/etcd
    50        name: etcd-certs
    51    hostNetwork: true
    52    priorityClassName: system-cluster-critical
    53    volumes:
    54    - hostPath:
    55        path: /etc/kubernetes/pki/etcd
    56        type: DirectoryOrCreate
    57      name: etcd-certs
    58    - hostPath:
    59        path: /var/lib/etcd
    60        type: DirectoryOrCreate
    61      name: etcd-data
    62  status: {}
kubeuser@kubemaster1:~$ sudo grep data-dir /etc/kubernetes/manifests/etcd.yaml
- --data-dir=/var/lib/etcd
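If you want to double-check on the host itself, the data directory can be listed directly. A quick sketch (the directory is root-owned; the member/snap and member/wal layout noted in the comment is etcd's usual on-disk structure):
kubeuser@kubemaster1:~$ sudo ls /var/lib/etcd/member
# expected to show the "snap" and "wal" subdirectories,
# where etcd keeps its on-disk snapshots and write-ahead log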
Logging in to the etcd container
In the Kubernetes environment operated on here, etcd runs as Pods (containers).
kubeuser@kubemaster1:~$ kubectl get pods -n kube-system
NAME READY STATUS RESTARTS AGE
calico-kube-controllers-77ff9c69dd-m5jbz 1/1 Running 1 5d
calico-node-7t474 1/1 Running 0 5d
calico-node-bdw22 1/1 Running 0 5d
calico-node-ftjg8 1/1 Running 0 5d
calico-node-lk26d 1/1 Running 0 4d23h
calico-node-qphzt 1/1 Running 0 4d23h
coredns-66bff467f8-cpd25 1/1 Running 0 5d
coredns-66bff467f8-wtww9 1/1 Running 0 5d
etcd-kubemaster1 1/1 Running 1 5d ・・・・here
etcd-kubemaster2 1/1 Running 2 5d ・・・・here
etcd-kubemaster3 1/1 Running 1 4d23h ・・・・here
kube-apiserver-kubemaster1 1/1 Running 3 5d
kube-apiserver-kubemaster2 1/1 Running 1 5d
kube-apiserver-kubemaster3 1/1 Running 1 4d23h
kube-controller-manager-kubemaster1 1/1 Running 6 5d
kube-controller-manager-kubemaster2 1/1 Running 6 5d
kube-controller-manager-kubemaster3 1/1 Running 5 4d23h
kube-proxy-47qm5 1/1 Running 0 4d23h
kube-proxy-6gq55 1/1 Running 0 5d
kube-proxy-flcg2 1/1 Running 0 5d
kube-proxy-xqvdx 1/1 Running 0 5d
kube-proxy-xv78r 1/1 Running 0 4d23h
kube-scheduler-kubemaster1 1/1 Running 7 5d
kube-scheduler-kubemaster2 1/1 Running 5 5d
kube-scheduler-kubemaster3 1/1 Running 4 4d23h
Now, let's log in to that container.
After logging in, run etcdctl -h to review the options of etcdctl, the command we will be using.
kubeuser@kubemaster1:~$ kubectl -n kube-system exec -it etcd-kubemaster1 -- sh
# etcdctl -h
NAME:
etcdctl - A simple command line client for etcd3.
USAGE:
etcdctl [flags]
VERSION:
3.4.3
API VERSION:
3.4
COMMANDS:
alarm disarm Disarms all alarms
alarm list Lists all alarms
auth disable Disables authentication
auth enable Enables authentication
check datascale Check the memory usage of holding data for different workloads on a given server endpoint.
check perf Check the performance of the etcd cluster
compaction Compacts the event history in etcd
defrag Defragments the storage of the etcd members with given endpoints
del Removes the specified key or range of keys [key, range_end)
elect Observes and participates in leader election
endpoint hashkv Prints the KV history hash for each endpoint in --endpoints
endpoint health Checks the healthiness of endpoints specified in `--endpoints` flag
endpoint status Prints out the status of endpoints specified in `--endpoints` flag
get Gets the key or a range of keys
help Help about any command
lease grant Creates leases
lease keep-alive Keeps leases alive (renew)
lease list List all active leases
lease revoke Revokes leases
lease timetolive Get lease information
lock Acquires a named lock
make-mirror Makes a mirror at the destination etcd cluster
member add Adds a member into the cluster
member list Lists all members in the cluster
member promote Promotes a non-voting member in the cluster
member remove Removes a member from the cluster
member update Updates a member in the cluster
migrate Migrates keys in a v2 store to a mvcc store
move-leader Transfers leadership to another etcd cluster member.
put Puts the given key into the store
role add Adds a new role
role delete Deletes a role
role get Gets detailed information of a role
role grant-permission Grants a key to a role
role list Lists all roles
role revoke-permission Revokes a key from a role
snapshot restore Restores an etcd member snapshot to an etcd directory
snapshot save Stores an etcd node backend snapshot to a given file
snapshot status Gets backend snapshot status of a given file
txn Txn processes all the requests in one transaction
user add Adds a new user
user delete Deletes a user
user get Gets detailed information of a user
user grant-role Grants a role to a user
user list Lists all users
user passwd Changes password of user
user revoke-role Revokes a role from a user
version Prints the version of etcdctl
watch Watches events stream on keys or prefixes
OPTIONS:
--cacert="" verify certificates of TLS-enabled secure servers using this CA bundle
--cert="" identify secure client using this TLS certificate file
--command-timeout=5s timeout for short running command (excluding dial timeout)
--debug[=false] enable client-side debug logging
--dial-timeout=2s dial timeout for client connections
-d, --discovery-srv="" domain name to query for SRV records describing cluster endpoints
--discovery-srv-name="" service name to query when using DNS discovery
--endpoints=[127.0.0.1:2379] gRPC endpoints
-h, --help[=false] help for etcdctl
--hex[=false] print byte strings as hex encoded strings
--insecure-discovery[=true] accept insecure SRV records describing cluster endpoints
--insecure-skip-tls-verify[=false] skip server certificate verification
--insecure-transport[=true] disable transport security for client connections
--keepalive-time=2s keepalive time for client connections
--keepalive-timeout=6s keepalive timeout for client connections
--key="" identify secure client using this TLS key file
--password="" password for authentication (if this option is used, --user option shouldn't include password)
--user="" username[:password] for authentication (prompt if password is not supplied)
-w, --write-out="simple" set the output format (fields, json, protobuf, simple, table)
Also, check the PKI files (certificates, private keys, and so on) that need to be specified when running etcdctl.
# cd /etc/kubernetes/pki/etcd
# pwd
/etc/kubernetes/pki/etcd
# ls
ca.crt healthcheck-client.crt peer.crt server.crt
ca.key healthcheck-client.key peer.key server.key
# exit
By the way, I tried out etcdctl in a recent article, which may be a useful reference:
https://www.opensourcetech.tokyo/entry/20211021/1634747642
Checking the database status
The command uses the PKI files confirmed above.
kubeuser@kubemaster1:~$ kubectl -n kube-system exec -it etcd-kubemaster1 -- sh -c "ETCDCTL_API=3 ETCDCTL_CACERT=/etc/kubernetes/pki/etcd/ca.crt ETCDCTL_CERT=/etc/kubernetes/pki/etcd/server.crt ETCDCTL_KEY=/etc/kubernetes/pki/etcd/server.key etcdctl endpoint health"
127.0.0.1:2379 is healthy: successfully committed proposal: took = 249.182775ms
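The command above checks only the local endpoint. As a sketch, adding the --cluster flag (the same flag used with endpoint status below) should check every member in one call, assuming the same certificate environment variables:
kubeuser@kubemaster1:~$ kubectl -n kube-system exec -it etcd-kubemaster1 -- sh -c "ETCDCTL_API=3 ETCDCTL_CACERT=/etc/kubernetes/pki/etcd/ca.crt ETCDCTL_CERT=/etc/kubernetes/pki/etcd/server.crt ETCDCTL_KEY=/etc/kubernetes/pki/etcd/server.key etcdctl endpoint health --cluster"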
Checking the etcd member count
The cluster consists of three etcd members (the Master Nodes), so let's verify that, just to be sure.
kubeuser@kubemaster1:~$ kubectl -n kube-system exec -it etcd-kubemaster1 -- sh -c "ETCDCTL_API=3 etcdctl --cert=/etc/kubernetes/pki/etcd/peer.crt --key=/etc/kubernetes/pki/etcd/peer.key --cacert=/etc/kubernetes/pki/etcd/ca.crt --endpoints=https://127.0.0.1:2379 member list"
1b1c43fd2a12bac1, started, kubemaster1, https://192.168.1.251:2380, https://192.168.1.251:2379, false
2e9d81b870c2839b, started, kubemaster2, https://192.168.1.252:2380, https://192.168.1.252:2379, false
83ceb25956e47fcd, started, kubemaster3, https://192.168.1.249:2380, https://192.168.1.249:2379, false
The output can also be displayed in table format.
kubeuser@kubemaster1:~$ kubectl -n kube-system exec -it etcd-kubemaster1 -- sh -c "ETCDCTL_API=3 ETCDCTL_CACERT=/etc/kubernetes/pki/etcd/ca.crt ETCDCTL_CERT=/etc/kubernetes/pki/etcd/server.crt ETCDCTL_KEY=/etc/kubernetes/pki/etcd/server.key etcdctl --endpoints=https://127.0.0.1:2379 -w table endpoint status --cluster"
+----------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
| ENDPOINT | ID | VERSION | DB SIZE | IS LEADER | IS LEARNER | RAFT TERM | RAFT INDEX | RAFT APPLIED INDEX | ERRORS |
+----------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
| https://192.168.1.251:2379 | 1b1c43fd2a12bac1 | 3.4.3 | 5.8 MB | false | false | 97 | 1445079 | 1445079 | |
| https://192.168.1.252:2379 | 2e9d81b870c2839b | 3.4.3 | 5.7 MB | false | false | 97 | 1445079 | 1445079 | |
| https://192.168.1.249:2379 | 83ceb25956e47fcd | 3.4.3 | 5.7 MB | true | false | 97 | 1445079 | 1445079 | |
+----------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
Taking a database backup
Take a backup with "etcdctl snapshot save".
kubeuser@kubemaster1:~$ kubectl -n kube-system exec -it etcd-kubemaster1 -- sh -c "ETCDCTL_API=3 ETCDCTL_CACERT=/etc/kubernetes/pki/etcd/ca.crt ETCDCTL_CERT=/etc/kubernetes/pki/etcd/server.crt ETCDCTL_KEY=/etc/kubernetes/pki/etcd/server.key etcdctl --endpoints=https://127.0.0.1:2379 snapshot save /var/lib/etcd/snapshot.db"
{"level":"info","ts":1638795365.0601962,"caller":"snapshot/v3_snapshot.go:110","msg":"created temporary db file","path":"/var/lib/etcd/snapshot.db.part"}
{"level":"warn","ts":"2021-12-06T12:56:05.069Z","caller":"clientv3/retry_interceptor.go:116","msg":"retry stream intercept"}
{"level":"info","ts":1638795365.0704958,"caller":"snapshot/v3_snapshot.go:121","msg":"fetching snapshot","endpoint":"https://127.0.0.1:2379"}
{"level":"info","ts":1638795367.2656121,"caller":"snapshot/v3_snapshot.go:134","msg":"fetched snapshot","endpoint":"https://127.0.0.1:2379","took":2.205339955}
{"level":"info","ts":1638795367.2657769,"caller":"snapshot/v3_snapshot.go:143","msg":"saved","path":"/var/lib/etcd/snapshot.db"}
Snapshot saved at /var/lib/etcd/snapshot.db
kubeuser@kubemaster1:~$ date
Mon Dec 6 12:56:35 UTC 2021
kubeuser@kubemaster1:~$ sudo ls -l /var/lib/etcd
[sudo] password for kubeuser:
total 5708
drwx------ 4 root root 4096 Dec 1 12:01 member
-rw------- 1 root root 5836832 Dec 6 12:56 snapshot.db
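Before relying on the file, it is worth verifying it with "etcdctl snapshot status", one of the commands listed in the help output earlier. A minimal sketch (no certificates are needed, since it only reads the local file):
kubeuser@kubemaster1:~$ kubectl -n kube-system exec -it etcd-kubemaster1 -- sh -c "ETCDCTL_API=3 etcdctl snapshot status /var/lib/etcd/snapshot.db -w table"
# prints the snapshot's hash, revision, total key count, and size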
Backing up other data
We also back up the other data needed to recover the Master Node that etcd runs on.
kubeuser@kubemaster1:~$ mkdir $HOME/backup
kubeuser@kubemaster1:~$ sudo cp /var/lib/etcd/snapshot.db $HOME/backup/snapshot.db-$(date +%m-%d-%y)
kubeuser@kubemaster1:~$ sudo cp /root/kubeadm-config.yaml $HOME/backup
cp: cannot stat '/root/kubeadm-config.yaml': No such file or directory
kubeuser@kubemaster1:~$ sudo cp /etc/kubernetes/kubeadm-config.yaml $HOME/backup
cp: cannot stat '/etc/kubernetes/kubeadm-config.yaml': No such file or directory
kubeuser@kubemaster1:~$ sudo cp /etc/kubernetes/
admin.conf controller-manager.conf kubelet.conf manifests/ pki/ scheduler.conf
kubeuser@kubemaster1:~$ sudo cp /etc/kubernetes/manifests/
etcd.yaml kube-apiserver.yaml kube-controller-manager.yaml kube-scheduler.yaml
kubeuser@kubemaster1:~$
kubeuser@kubemaster1:~$ sudo find / -name kubeadm-config.yaml
/home/kubeuser/kubeadm-config.yaml
^C
kubeuser@kubemaster1:~$ sudo cp /home/kubeuser/kubeadm-config.yaml $HOME/backup
kubeuser@kubemaster1:~$ sudo cp -r /etc/kubernetes/pki/etcd/ $HOME/backup/
kubeuser@kubemaster1:~$ ls -l $HOME/backup
total 5712
drwxr-xr-x 2 root root 4096 Dec 6 13:05 etcd
-rw-r--r-- 1 root root 165 Dec 6 13:04 kubeadm-config.yaml
-rw------- 1 root root 5836832 Dec 6 13:01 snapshot.db-12-06-21
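Everything in $HOME/backup still lives on the Master Node itself, so copying it to another machine protects against losing the node entirely. A sketch, where "backuphost" and the destination path are placeholders:
kubeuser@kubemaster1:~$ sudo tar czf etcd-backup-$(date +%m-%d-%y).tar.gz -C $HOME backup
kubeuser@kubemaster1:~$ scp etcd-backup-$(date +%m-%d-%y).tar.gz user@backuphost:/srv/backups/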
Appendix: restoring the data
To restore the data, follow the procedure below.
https://kubernetes.io/docs/tasks/administer-cluster/configure-upgrade-etcd/#restoring-an-etcd-cluster
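As a rough sketch of what that procedure boils down to: "etcdctl snapshot restore" (listed in the help output earlier) rebuilds a data directory from the snapshot file. The flags below mirror the member values from etcd.yaml above; the target --data-dir is illustrative, and the linked document also covers repointing etcd.yaml at the restored directory:
ETCDCTL_API=3 etcdctl snapshot restore $HOME/backup/snapshot.db-12-06-21 \
  --name kubemaster1 \
  --initial-cluster kubemaster1=https://192.168.1.251:2380 \
  --initial-advertise-peer-urls https://192.168.1.251:2380 \
  --data-dir /var/lib/etcd-restore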
Conclusion
This was a one-off backup, but to keep holding the latest backup data,
the backup needs to run periodically, for example from cron or a Kubernetes CronJob (a sketch follows below).
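As a minimal sketch, a root crontab entry on the Master Node like the one below would take a snapshot every day at 03:00. The schedule and file name are illustrative, it assumes root's environment has a working kubeconfig, and note that % must be escaped as \% inside a crontab:
# root crontab (sudo crontab -e): daily etcd snapshot at 03:00
0 3 * * * kubectl -n kube-system exec etcd-kubemaster1 -- sh -c "ETCDCTL_API=3 ETCDCTL_CACERT=/etc/kubernetes/pki/etcd/ca.crt ETCDCTL_CERT=/etc/kubernetes/pki/etcd/server.crt ETCDCTL_KEY=/etc/kubernetes/pki/etcd/server.key etcdctl snapshot save /var/lib/etcd/snapshot-$(date +\%m-\%d-\%y).db"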