Creating an automatic cleanup job for my Kubernetes nodes

2024-11-21

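Current situation

Each k8s node has only 120 GB of disk, so a disk-usage alert fires every 3 to 5 weeks.

The space is mostly consumed by data written inside containers, system logs, containers stuck in abnormal states, and old images left behind by successive releases.

Back when the nodes ran Docker, docker system prune -a took care of the cleanup.

The nodes now run containerd, which has no equally automatic tool; unused images and abnormal containers have to be cleaned up separately, as follows: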

# Remove unused images
crictl rmi --prune

# Remove containers in the exited state (-r: do nothing when the list is empty)
crictl ps -a --state exited | awk 'NR>1 {print $1}' | xargs -r crictl rm
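To check whether a manual cleanup actually freed anything, disk usage can be compared before and after the command. The following is only a minimal sketch: the helper names used_kb and report_reclaimed are made up for illustration, and /var/lib/containerd in the usage note is assumed to be the containerd data directory, which may differ on your nodes.

```shell
# Hypothetical helper: print the used space, in KB, of the filesystem
# that holds the given path (df -P gives a stable single-line format).
used_kb() {
  df -Pk "$1" | awk 'NR==2 {print $3}'
}

# Hypothetical helper: run a command and report how many KB it freed
# on the filesystem that holds the given path.
# Usage: report_reclaimed PATH COMMAND [ARGS...]
report_reclaimed() {
  rr_path=$1
  shift
  rr_before=$(used_kb "$rr_path")
  "$@"
  rr_after=$(used_kb "$rr_path")
  echo "reclaimed $((rr_before - rr_after)) KB on $rr_path"
}
```

For example, report_reclaimed /var/lib/containerd crictl rmi --prune would print the difference once the prune has finished. The number can be negative if other workloads write to the disk in the meantime.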

Goal

Give the k8s nodes the ability to clean themselves up automatically.

Steps

First, create a shell script:

# vim cleanup.sh
#!/bin/bash

# Collect the IDs of all containers in the Exited state
container_ids=$(crictl ps -a --state exited | awk 'NR>1 {print $1}')
# Remove unused images
crictl rmi --prune
# Check whether any Exited containers were found
if [ -n "$container_ids" ]; then
  # Remove every Exited container
  echo "Removing Exited containers..."
  for container_id in $container_ids; do
    crictl rm "$container_id"
  done
else
  echo "No Exited containers found."
fi
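Because the script deletes things, it can be handy to preview what it would do before letting it run on a real node. One possible approach, sketched here under assumptions, is to route the destructive calls through a small wrapper. The function name run and the DRY_RUN variable are hypothetical and not part of the original script.

```shell
# Hypothetical wrapper: when DRY_RUN=1, print the command instead of
# executing it; otherwise execute it unchanged.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "would run: $*"
  else
    "$@"
  fi
}
```

Inside cleanup.sh the calls would then become run crictl rmi --prune and run crictl rm "$container_id", and DRY_RUN=1 ./cleanup.sh would only list the commands without removing anything.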

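Next, create the Kubernetes manifest. Instead of a CronJob, it uses a DaemonSet whose container runs the script in a loop once every 24 hours, so the cleanup happens on every node: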

# vim cleanup.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: crictl-config
data:
  crictl.yaml: |
    # Point at /run/containerd, the path where the DaemonSet mounts the host socket directory
    runtime-endpoint: unix:///run/containerd/containerd.sock
    image-endpoint: unix:///run/containerd/containerd.sock
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: cleanup-daemonset
spec:
  selector:
    matchLabels:
      app: cleanup
  template:
    metadata:
      labels:
        app: cleanup
    spec:
      containers:
      - name: cleanup
        image: registry-vpc.cn-hangzhou.aliyuncs.com/zhangjinhui/cleanup:v0.0.4
        command: ["/bin/sh", "-c", "while true; do /cleanup.sh; sleep 86400; done"]
        volumeMounts:
        - name: crictl-config
          mountPath: /etc/crictl.yaml
          subPath: crictl.yaml
        - name: containerd-sock
          mountPath: /run/containerd
        securityContext:
          privileged: true
      volumes:
      - name: crictl-config
        configMap:
          name: crictl-config
      - name: containerd-sock
        hostPath:
          path: /run/containerd
          type: Directory
      tolerations:
      - operator: Exists

Deploy

kubectl apply -f cleanup.yaml
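
After applying the manifest, it is worth confirming that a cleanup pod is running on every node and that the script produced output. The following is a rough sketch rather than part of the original setup: the function name verify_cleanup_daemonset is made up, the 120s timeout is an arbitrary choice, and the current kubectl namespace is assumed to be the one the manifest was applied to.

```shell
# Hypothetical helper: wait for the DaemonSet rollout to finish, list
# its pods with the node each one runs on, then show recent script output.
verify_cleanup_daemonset() {
  kubectl rollout status daemonset/cleanup-daemonset --timeout=120s &&
  kubectl get pods -l app=cleanup -o wide &&
  kubectl logs -l app=cleanup --tail=20
}
```

Calling verify_cleanup_daemonset should end with lines such as "Removing Exited containers..." or "No Exited containers found." from the first run of the loop.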
