Notes on batch-upgrading the kernel on CentOS Linux 7 nodes in a K8s cluster

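Preface
While installing an observability tool on the k8s cluster, its pre-check reported that the kernel version was too old to be supported, so I decided to upgrade the kernel.
This is a lab environment, so there was little to worry about.
For a production upgrade, plan an error budget first; ideally back up with Velero and be prepared to migrate the cluster.
Newer kernels support cgroup2, which is worth taking into account when deploying a new cluster.
My understanding is limited, so corrections are welcome.
For each person there is only one real duty: to find oneself, and then to hold to it for a lifetime, wholeheartedly and without rest. All other paths are incomplete; they are ways of escape, a cowardly return to the ideals of the crowd, drifting with the current, a fear of one's own inner self. (Hermann Hesse, "Demian")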

The local k8s cluster runs on CentOS Linux 7 (Core).

┌──[root@vms100.liruilongs.github.io]-[~]
└─$kubectl get nodes
NAME                          STATUS     ROLES           AGE    VERSION
vms100.liruilongs.github.io   Ready      control-plane   6d4h   v1.25.1
vms101.liruilongs.github.io   Ready      control-plane   6d4h   v1.25.1
vms102.liruilongs.github.io   Ready      control-plane   6d4h   v1.25.1
vms103.liruilongs.github.io   Ready      <none>          6d4h   v1.25.1
vms105.liruilongs.github.io   Ready      <none>          6d4h   v1.25.1
vms106.liruilongs.github.io   Ready      <none>          6d4h   v1.25.1
vms107.liruilongs.github.io   Ready      <none>          6d4h   v1.25.1
vms108.liruilongs.github.io   Ready      <none>          6d4h   v1.25.1
┌──[root@vms100.liruilongs.github.io]-[~]
└─$
The kernel version is Linux 3.10.0-693.el7.x86_64.

┌──[root@vms100.liruilongs.github.io]-[~/ansible/pixie]
└─$hostnamectl
   Static hostname: vms100.liruilongs.github.io
         Icon name: computer-vm
           Chassis: vm
        Machine ID: e93ae3f6cb354f3ba509eeb73568087e
           Boot ID: 5ed408a863df48ae80b51f1b6c4be85f
    Virtualization: vmware
  Operating System: CentOS Linux 7 (Core)
       CPE OS Name: cpe:/o:centos:centos:7
            Kernel: Linux 3.10.0-693.el7.x86_64
      Architecture: x86-64
┌──[root@vms100.liruilongs.github.io]-[~/ansible/pixie]
└─$
While installing an observability tool, the pre-check reported that the kernel version was too old:

┌──[root@vms100.liruilongs.github.io]-[~/ansible/pixie]
└─$px deploy --check_only
Pixie CLI

Running Cluster Checks:
 ✕    Kernel version > 4.14.0  ERR: kernel version for node (vms100.liruilongs.github.io) not supported. Must have minimum kernel version of (4.14.0)
Check pre-check has failed. To bypass pass in --check=false. error=kernel version for node (vms100.liruilongs.github.io) not supported. Must have minimum kernel version of (4.14.0)
So I decided to upgrade the kernel.
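The same comparison can be done by hand before running the tool. The following is a minimal sketch, not Pixie's own logic: `kernel_at_least` is a hypothetical helper name, and it assumes the release string starts with a plain `major.minor.patch` triple, as `uname -r` prints on these nodes.

```shell
# kernel_at_least RELEASE MINIMUM   (hypothetical helper, not part of Pixie)
# Succeeds when RELEASE (for example the output of `uname -r`) is at least
# MINIMUM (for example 4.14.0). Only major.minor.patch are compared.
kernel_at_least() {
    # "${1%%-*}" drops the "-693.el7.x86_64" style suffix of the release
    awk -v cur="${1%%-*}" -v min="$2" 'BEGIN {
        split(cur, c, ".")
        split(min, m, ".")
        for (i = 1; i <= 3; i++) {
            # "+ 0" forces a numeric comparison; missing fields count as 0
            if (c[i] + 0 > m[i] + 0) exit 0
            if (c[i] + 0 < m[i] + 0) exit 1
        }
        exit 0   # all three fields are equal
    }'
}
```

On a node it could be called as `kernel_at_least "$(uname -r)" 4.14.0 || echo "kernel too old"`, where 4.14.0 is the minimum quoted in the error message above.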

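The plan: upgrade one machine first, confirm that it works, and run a few simple tests against the cluster. If the cluster is still running normally after half an hour, upgrade the remaining nodes in batch with Ansible.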


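The official Linux kernel has to be downloaded from https://www.kernel.org/ and then compiled and installed.

Most Linux distributions ship a kernel that they maintain themselves, which can be upgraded through package managers such as yum, dnf, or rpm.

ELRepo is an RPM repository of Enterprise Linux packages that provides drivers and kernel images. It supports Red Hat Enterprise Linux (RHEL) and its rebuild projects.

The ELRepo project focuses on hardware-related packages to enhance the experience of using Enterprise Linux. This includes filesystem drivers, graphics drivers, network drivers, sound drivers, and webcam and video drivers.

ELRepo website: http://elrepo.org/tiki/tiki-index.php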

# List the kernel versions that yum can upgrade to
yum list kernel --showduplicates
# If the version you need is listed, run update directly; usually it is not, so follow the steps below

# Import the public key of the ELRepo repository
rpm --import https://www.elrepo.org/RPM-GPG-KEY-elrepo.org

# Install ELRepo on CentOS 7
yum install https://www.elrepo.org/elrepo-release-7.el7.elrepo.noarch.rpm
# Install ELRepo on CentOS 8
yum install https://www.elrepo.org/elrepo-release-8.el8.elrepo.noarch.rpm

# List the kernel versions provided by ELRepo
yum --disablerepo="*" --enablerepo="elrepo-kernel" list available
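Since the only difference between the CentOS 7 and CentOS 8 commands is the release RPM, the choice can be scripted. This is a rough sketch under assumptions: `elrepo_release_url` and `os_major_version` are hypothetical helper names, only the two URLs quoted above are covered, and the major version is assumed to be readable from `VERSION_ID` in `/etc/os-release`.

```shell
# elrepo_release_url MAJOR   (hypothetical helper)
# Prints the elrepo-release RPM URL for major version 7 or 8, the two
# URLs quoted in this post; fails for anything else.
elrepo_release_url() {
    case "$1" in
        7) echo "https://www.elrepo.org/elrepo-release-7.el7.elrepo.noarch.rpm" ;;
        8) echo "https://www.elrepo.org/elrepo-release-8.el8.elrepo.noarch.rpm" ;;
        *) echo "unsupported major version: $1" >&2; return 1 ;;
    esac
}

# os_major_version [FILE]   (hypothetical helper)
# Sources an os-release file in a subshell and prints the major part of
# VERSION_ID, e.g. "7" for VERSION_ID="7".
os_major_version() {
    ( . "${1:-/etc/os-release}" && echo "${VERSION_ID%%.*}" )
}
```

A possible use is `yum -y install "$(elrepo_release_url "$(os_major_version)")"`, which installs nothing when the version is not one of the two handled here.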
Kernel upgrade
Upgrade a single machine on its own first.

Install ELRepo on CentOS 7 and import the public key of the ELRepo repository.

┌──[root@vms100.liruilongs.github.io]-[~/back]
└─$yum -y install https://www.elrepo.org/elrepo-release-7.el7.elrepo.noarch.rpm
┌──[root@vms100.liruilongs.github.io]-[~/back]
└─$rpm --import https://www.elrepo.org/RPM-GPG-KEY-elrepo.org
List the kernel versions provided by ELRepo:

┌──[root@vms100.liruilongs.github.io]-[~/back]
└─$yum --disablerepo="*" --enablerepo="elrepo-kernel" list available
Loaded plugins: fastestmirror
elrepo-kernel                                                                                                                                                                             | 3.0 kB  00:00:00
elrepo-kernel/primary_db                                                                                                                                                                  | 2.1 MB  00:01:40
Loading mirror speeds from cached hostfile
 * elrepo-kernel: ftp.yz.yamagata-u.ac.jp
Available Packages
kernel-lt.x86_64                                                                                        5.4.230-1.el7.elrepo                                                                        elrepo-kernel
kernel-lt-devel.x86_64                                                                                  5.4.230-1.el7.elrepo                                                                        elrepo-kernel
kernel-lt-doc.noarch                                                                                    5.4.230-1.el7.elrepo                                                                        elrepo-kernel
kernel-lt-headers.x86_64                                                                                5.4.230-1.el7.elrepo                                                                        elrepo-kernel
kernel-lt-tools.x86_64                                                                                  5.4.230-1.el7.elrepo                                                                        elrepo-kernel
kernel-lt-tools-libs.x86_64                                                                             5.4.230-1.el7.elrepo                                                                        elrepo-kernel
kernel-lt-tools-libs-devel.x86_64                                                                       5.4.230-1.el7.elrepo                                                                        elrepo-kernel
kernel-ml.x86_64                                                                                        6.1.8-1.el7.elrepo                                                                          elrepo-kernel
kernel-ml-devel.x86_64                                                                                  6.1.8-1.el7.elrepo                                                                          elrepo-kernel
kernel-ml-doc.noarch                                                                                    6.1.8-1.el7.elrepo                                                                          elrepo-kernel
kernel-ml-headers.x86_64                                                                                6.1.8-1.el7.elrepo                                                                          elrepo-kernel
kernel-ml-tools.x86_64                                                                                  6.1.8-1.el7.elrepo                                                                          elrepo-kernel
kernel-ml-tools-libs.x86_64                                                                             6.1.8-1.el7.elrepo                                                                          elrepo-kernel
kernel-ml-tools-libs-devel.x86_64                                                                       6.1.8-1.el7.elrepo                                                                          elrepo-kernel
perf.x86_64                                                                                             5.4.230-1.el7.elrepo                                                                        elrepo-kernel
python-perf.x86_64                                                                                      5.4.230-1.el7.elrepo                                                                        elrepo-kernel
┌──[root@vms100.liruilongs.github.io]-[~/back]
└─$
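kernel-lt: lt stands for longterm, the long-term support kernel; currently 5.4 (5.4.230 in the listing above).
kernel-ml: ml stands for mainline, the current mainline kernel; currently 6.1 (6.1.8 in the listing above).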
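To read the current version of either flavour without scanning the listing by eye, the version column can be filtered out. A small sketch, assuming each package sits on a single line as in the listing above (yum may wrap long lines on narrow terminals); `available_kernel_version` is a hypothetical helper name.

```shell
# available_kernel_version PACKAGE   (hypothetical helper)
# Reads `yum list available` output on stdin and prints the version column
# for PACKAGE (kernel-lt or kernel-ml). Fails if the package is not listed.
available_kernel_version() {
    awk -v pkg="$1.x86_64" '
        $1 == pkg { print $2; found = 1; exit }   # exact match on column 1
        END       { exit !found }                 # non-zero when not listed
    '
}
```

For example: `yum --disablerepo="*" --enablerepo="elrepo-kernel" list available | available_kernel_version kernel-lt`.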






Here we upgrade to the long-term support version, installing it directly.

# Long-term support kernel
┌──[root@vms100.liruilongs.github.io]-[~/back]
└─$yum -y  --enablerepo=elrepo-kernel install kernel-lt.x86_64
List the kernels available on the system and set the boot entry.

┌──[root@vms100.liruilongs.github.io]-[~/back]
└─$sudo awk -F\' '$1=="menuentry " {print i++ " : " $2}' /etc/grub2.cfg
0 : CentOS Linux (5.4.230-1.el7.elrepo.x86_64) 7 (Core)
1 : CentOS Linux 7 Rescue e93ae3f6cb354f3ba509eeb73568087e (3.10.0-1160.83.1.el7.x86_64)
2 : CentOS Linux (3.10.0-1160.83.1.el7.x86_64) 7 (Core)
3 : CentOS Linux (3.10.0-693.el7.x86_64) 7 (Core)
4 : CentOS Linux (0-rescue-80c608ceab5342779ba1adc2ac29c213) 7 (Core)
Set the kernel to boot by default:

┌──[root@vms100.liruilongs.github.io]-[~/back]
└─$grub2-set-default 0
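Entry 0 is the new kernel on this machine, but the index depends on what is installed on each node. The sketch below looks the index up by kernel version, reusing the same `awk -F\'` parsing as the listing command above. `grub_entry_index` is a hypothetical helper name, and skipping titles that contain "rescue" is an assumption based on the menu shown here.

```shell
# grub_entry_index KERNEL_VERSION [GRUB_CFG]   (hypothetical helper)
# Prints the zero-based index of the first non-rescue menuentry whose title
# contains "(KERNEL_VERSION)". Fails if no such entry exists.
grub_entry_index() {
    awk -F\' -v ver="$1" '
        BEGIN { i = 0 }
        $1 == "menuentry " {
            # $2 is the entry title between the single quotes
            if (index($2, "(" ver ")") > 0 && $2 !~ /[Rr]escue/) {
                print i; found = 1; exit
            }
            i++
        }
        END { exit !found }
    ' "${2:-/etc/grub2.cfg}"
}
```

It could then feed the command above, for instance `grub2-set-default "$(grub_entry_index 5.4.230-1.el7.elrepo.x86_64)"`.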
Generate the grub configuration file:


┌──[root@vms100.liruilongs.github.io]-[~/back]
└─$grub2-mkconfig -o /boot/grub2/grub.cfg
Generating grub configuration file ...
Found linux image: /boot/vmlinuz-5.4.230-1.el7.elrepo.x86_64
Found initrd image: /boot/initramfs-5.4.230-1.el7.elrepo.x86_64.img
Found linux image: /boot/vmlinuz-3.10.0-1160.83.1.el7.x86_64
Found initrd image: /boot/initramfs-3.10.0-1160.83.1.el7.x86_64.img
Found linux image: /boot/vmlinuz-3.10.0-693.el7.x86_64
Found initrd image: /boot/initramfs-3.10.0-693.el7.x86_64.img
Found linux image: /boot/vmlinuz-0-rescue-80c608ceab5342779ba1adc2ac29c213
Found initrd image: /boot/initramfs-0-rescue-80c608ceab5342779ba1adc2ac29c213.img
Found linux image: /boot/vmlinuz-0-rescue-e93ae3f6cb354f3ba509eeb73568087e
Found initrd image: /boot/initramfs-0-rescue-e93ae3f6cb354f3ba509eeb73568087e.img
done
Reboot the system and verify:

┌──[root@vms100.liruilongs.github.io]-[~/back]
└─$reboot
Connection to 192.168.26.100 closed by remote host.
Connection to 192.168.26.100 closed.
....
┌──[root@vms100.liruilongs.github.io]-[~]
└─$hostnamectl
   Static hostname: vms100.liruilongs.github.io
         Icon name: computer-vm
           Chassis: vm
        Machine ID: e93ae3f6cb354f3ba509eeb73568087e
           Boot ID: a1150b6d97dc4afbb81dae58f131a487
    Virtualization: vmware
  Operating System: CentOS Linux 7 (Core)
       CPE OS Name: cpe:/o:centos:centos:7
            Kernel: Linux 5.4.230-1.el7.elrepo.x86_64
      Architecture: x86-64
┌──[root@vms100.liruilongs.github.io]-[~]
└─$
Once it is confirmed to work, run a few simple tests against the cluster, wait half an hour, and then upgrade the rest in batch.
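One simple test is to make sure no node has left the Ready state during the wait. The post does not say which tests were run, so this is only an illustrative sketch: `not_ready_nodes` and `all_nodes_ready` are hypothetical helper names, and the input is assumed to be plain `kubectl get nodes` output with its header line.

```shell
# not_ready_nodes   (hypothetical helper)
# Reads `kubectl get nodes` output on stdin and prints the name of every
# node whose STATUS column is not exactly "Ready".
not_ready_nodes() {
    awk 'NR > 1 && $2 != "Ready" { print $1 }'   # NR > 1 skips the header
}

# all_nodes_ready   (hypothetical helper)
# Succeeds only when not_ready_nodes prints nothing for the given input.
all_nodes_ready() {
    [ -z "$(not_ready_nodes)" ]
}
```

For example, `kubectl get nodes | all_nodes_ready || echo "hold the batch upgrade"`. A cordoned node shows `Ready,SchedulingDisabled` and is reported as not ready by this check.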

Write the upgrade script:

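#!/bin/bash

#@File    :   update_kernel
#@Time    :   2023/02/01 23:58:23
#@Author  :   Li Ruilong
#@Version :   1.0
#@Desc    :   Batch kernel upgrade script for CentOS 7
#@Contact :   liruilonger@gmail.com

# Install the ELRepo repository definition
yum -y install https://www.elrepo.org/elrepo-release-7.el7.elrepo.noarch.rpm

# Import the ELRepo public key
rpm --import https://www.elrepo.org/RPM-GPG-KEY-elrepo.org

# Install the long-term support kernel from the elrepo-kernel repository
yum -y  --enablerepo=elrepo-kernel install kernel-lt.x86_64

# Boot menu entry 0 by default (the new kernel on the node upgraded first)
grub2-set-default 0

# Regenerate the grub configuration
grub2-mkconfig -o /boot/grub2/grub.cfg

# Reboot into the new kernel
reboot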
Copy the script to the nodes to be upgraded:

┌──[root@vms100.liruilongs.github.io]-[~/ansible]
└─$ansible ansible_node -m copy -a "src=./update_kernel/update_kernel.sh dest=/tmp/" -i host.yaml
┌──[root@vms100.liruilongs.github.io]-[~/ansible]
└─$ansible ansible_node -m shell -a "cat /tmp/update_kernel.sh" -i host.yaml
Run the upgrade script:

┌──[root@vms100.liruilongs.github.io]-[~/ansible]
└─$ansible ansible_node -m shell -a "/usr/bin/bash /tmp/update_kernel.sh" -i host.yaml  -f 7 -vvv
After the upgrade completes, check the kernel version to confirm:

┌──[root@vms100.liruilongs.github.io]-[~/ansible]
└─$ansible ansible_node  -m shell -a 'hostnamectl | grep Kernel'  -i host.yaml
192.168.26.106 | CHANGED | rc=0 >>
            Kernel: Linux 5.4.230-1.el7.elrepo.x86_64
192.168.26.105 | CHANGED | rc=0 >>
            Kernel: Linux 5.4.230-1.el7.elrepo.x86_64
192.168.26.102 | CHANGED | rc=0 >>
            Kernel: Linux 5.4.230-1.el7.elrepo.x86_64
192.168.26.103 | CHANGED | rc=0 >>
            Kernel: Linux 5.4.230-1.el7.elrepo.x86_64
192.168.26.101 | CHANGED | rc=0 >>
            Kernel: Linux 5.4.230-1.el7.elrepo.x86_64
192.168.26.107 | CHANGED | rc=0 >>
            Kernel: Linux 5.4.230-1.el7.elrepo.x86_64
192.168.26.108 | CHANGED | rc=0 >>
            Kernel: Linux 5.4.230-1.el7.elrepo.x86_64
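With more nodes, reading this output line by line gets tedious. The following sketch prints only the hosts that are not on the expected kernel. `kernel_mismatches` is a hypothetical helper name, and it assumes the two-line "host | STATUS | rc=N >>" plus "Kernel:" layout shown above; hosts that Ansible could not reach produce no "Kernel:" line and are therefore not reported.

```shell
# kernel_mismatches EXPECTED   (hypothetical helper)
# Reads the ad-hoc Ansible output of `hostnamectl | grep Kernel` on stdin
# and prints every host whose kernel version differs from EXPECTED.
kernel_mismatches() {
    awk -v want="$1" '
        # Header line, e.g. "192.168.26.106 | CHANGED | rc=0 >>"
        / \| .* \| rc=/ { host = $1; next }
        # Result line, e.g. "Kernel: Linux 5.4.230-1.el7.elrepo.x86_64"
        $1 == "Kernel:" { if ($3 != want) print host }
    '
}
```

For example: `ansible ansible_node -m shell -a 'hostnamectl | grep Kernel' -i host.yaml | kernel_mismatches 5.4.230-1.el7.elrepo.x86_64`, which prints nothing when every reachable node has been upgraded.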
Check the cluster status to confirm:

┌──[root@vms100.liruilongs.github.io]-[~/ansible]
└─$kubectl get nodes
NAME                          STATUS   ROLES           AGE    VERSION
vms100.liruilongs.github.io   Ready    control-plane   6d5h   v1.25.1
vms101.liruilongs.github.io   Ready    control-plane   6d5h   v1.25.1
vms102.liruilongs.github.io   Ready    control-plane   6d5h   v1.25.1
vms103.liruilongs.github.io   Ready    <none>          6d4h   v1.25.1
vms105.liruilongs.github.io   Ready    <none>          6d4h   v1.25.1
vms106.liruilongs.github.io   Ready    <none>          6d4h   v1.25.1
vms107.liruilongs.github.io   Ready    <none>          6d4h   v1.25.1
vms108.liruilongs.github.io   Ready    <none>          6d4h   v1.25.1
┌──[root@vms100.liruilongs.github.io]-[~/ansible]
└─$
Run the original tool's check again:

┌──[root@vms100.liruilongs.github.io]-[~/ansible/pixie]
└─$px deploy --check_only
Pixie CLI

Running Cluster Checks:
 ✔    Kernel version > 4.14.0
 ✔    Cluster type is supported
 ✔    K8s version > 1.16.0
 ✔    Kubectl > 1.10.0 is present
 ✔    User can create namespace
INFO[0002] All Required Checks Passed!
┌──[root@vms100.liruilongs.github.io]-[~/ansible/pixie]
└─$



Author: 山河已无恙

