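Current situation
Each k8s node has only 120 GB of disk, so a disk-usage alert fires every 3 to 5 weeks.
The space is mainly consumed by data written inside containers, system logs, containers stuck in abnormal states, and old images left behind by successive releases.
Back when the nodes ran Docker, docker system prune -a took care of the cleanup.
The nodes now run containerd, which has no equally automatic tool, so unused images and abnormal containers have to be cleaned up separately, as follows: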

    # Remove unused images
    crictl rmi --prune
    # Remove containers in the exited state
    # (-q prints only the IDs; xargs -r skips the call when there are none)
    crictl ps -a --state exited -q | xargs -r crictl rm

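The two manual steps above can be wrapped into one function that roughly plays the role `docker system prune -a` used to play. This is only a sketch: the function name `node_prune` and the `DRY_RUN` variable are hypothetical, not part of crictl, and it assumes crictl is already configured to reach the containerd socket.

```shell
#!/usr/bin/env bash
# Hypothetical helper: a rough crictl counterpart to `docker system prune -a`.
# Set DRY_RUN=1 to print the destructive commands instead of running them.
node_prune() {
  local run=""
  if [ "${DRY_RUN:-0}" = "1" ]; then
    run="echo"
  fi

  # Collect exited containers; `-q` prints only the container IDs.
  local ids
  ids=$(crictl ps -a --state exited -q)
  if [ -n "$ids" ]; then
    # Word splitting is intended here: one ID per argument.
    # shellcheck disable=SC2086
    $run crictl rm $ids
  fi

  # Remove every image that is no longer in use.
  $run crictl rmi --prune
}
```

Running `DRY_RUN=1 node_prune` first shows which `crictl rm` and `crictl rmi` commands would be issued. Containers are removed before images here on the assumption that this may let the prune step also release images that only exited containers still referenced.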
Goal
Give the k8s nodes the ability to clean themselves up automatically.
Steps
First, create a shell script named cleanup.sh:

    #!/bin/bash
    # cleanup.sh: prune unused images and remove exited containers.

    # Collect the IDs of all containers in the Exited state (-q prints IDs only).
    container_ids=$(crictl ps -a --state exited -q)

    # Remove unused images.
    crictl rmi --prune

    # Remove the Exited containers, if there are any.
    if [ -n "$container_ids" ]; then
        echo "Removing Exited containers..."
        for container_id in $container_ids; do
            crictl rm "$container_id"
        done
    else
        echo "No Exited containers found."
    fi

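Since the original trigger is a disk alert, one possible refinement is to run the cleanup only when disk usage is actually high. The sketch below is an assumption layered on top of the script, not part of it: the function names, the 80% threshold, and the /var/lib/containerd path are all hypothetical, and inside a pod the host path would have to be mounted for `df` to see it.

```shell
#!/usr/bin/env bash
# Hypothetical guard: decide whether a cleanup run is worth doing.

# Print the used-space percentage (integer, no % sign) of the
# filesystem that holds the path given as $1.
disk_usage_pct() {
  df -P "$1" | awk 'NR==2 { sub(/%/, "", $5); print $5 }'
}

# Succeed (exit 0) when usage ($1) is at or above the threshold ($2).
needs_cleanup() {
  local usage=$1 threshold=$2
  [ "$usage" -ge "$threshold" ]
}

# Example wiring, commented out so that sourcing this block has no side effects:
# if needs_cleanup "$(disk_usage_pct /var/lib/containerd)" 80; then
#   /cleanup.sh
# fi
```

With a guard like this, the loop could wake up more often than once a day and still stay idle while the disk has plenty of room.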
Next, create the Kubernetes manifest, cleanup.yaml. It does the job of a CronJob, but it is deployed as a DaemonSet so that the script runs on every node, once every 24 hours:

    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: crictl-config
    data:
      crictl.yaml: |
        # Must match the mountPath of the containerd-sock volume below.
        runtime-endpoint: unix:///run/containerd/containerd.sock
        image-endpoint: unix:///run/containerd/containerd.sock
    ---
    apiVersion: apps/v1
    kind: DaemonSet
    metadata:
      name: cleanup-daemonset
    spec:
      selector:
        matchLabels:
          app: cleanup
      template:
        metadata:
          labels:
            app: cleanup
        spec:
          containers:
            - name: cleanup
              image: registry-vpc.cn-hangzhou.aliyuncs.com/zhangjinhui/cleanup:v0.0.4
              # Run the script, then sleep for one day (86400 seconds).
              command: ["/bin/sh", "-c", "while true; do /cleanup.sh; sleep 86400; done"]
              volumeMounts:
                - name: crictl-config
                  mountPath: /etc/crictl.yaml
                  subPath: crictl.yaml
                - name: containerd-sock
                  mountPath: /run/containerd
              securityContext:
                privileged: true
          volumes:
            - name: crictl-config
              configMap:
                name: crictl-config
            - name: containerd-sock
              hostPath:
                path: /run/containerd
                type: Directory
          # Tolerate every taint so the pod is scheduled on all nodes.
          tolerations:
            - operator: Exists

Deploy

    kubectl apply -f cleanup.yaml

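After applying the manifest, it can help to confirm that a cleanup pod is running on every node and to look at what the script printed. The helper below is a sketch: the function name `verify_cleanup` is hypothetical, and it assumes the manifest was applied to the `default` namespace because cleanup.yaml does not set one.

```shell
#!/usr/bin/env bash
# Hypothetical helper: check the DaemonSet rollout and show recent script output.
# $1 is the namespace (assumed to be "default" when omitted).
verify_cleanup() {
  local ns=${1:-default}
  # Wait until every node has an up-to-date, ready cleanup pod.
  kubectl -n "$ns" rollout status daemonset/cleanup-daemonset --timeout=120s &&
  # List the pods together with the node each one landed on.
  kubectl -n "$ns" get pods -l app=cleanup -o wide &&
  # Show the last lines each pod logged (the script's echo messages).
  kubectl -n "$ns" logs -l app=cleanup --tail=20
}
```

Calling `verify_cleanup` with no argument checks the `default` namespace; pass another namespace name if the manifest was applied elsewhere.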