- Cause of the error
- Error output
- Solution: remove the newer kernel
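For building your own multi-GPU server, see https://blog.csdn.net/landian0531/article/details/120242839
Cause of the error: an unexpected power outage rebooted the Ubuntu server, and afterwards the Docker containers could not be started with the command docker ps -aq | xargs -I {} docker start {}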
Error output:
gpu@gpu-workstation:~$ docker ps -aq | xargs -I {} docker start {}
Error response from daemon: OCI runtime create failed: container_linux.go:380: starting container process caused: process_linux.go:545: container init caused: Running hook #1:: error running hook: exit status 1, stdout: , stderr: nvidia-container-cli: initialization error: nvml error: driver not loaded: unknown
Error: failed to start containers: 485f0e25b37c
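The "driver not loaded" part of the message suggests that the NVIDIA kernel module is not loaded for the kernel the machine booted into. A minimal sketch for checking this, assuming the module shows up under the name nvidia in lsmod output; the helper name nvidia_module_loaded is hypothetical and not part of any NVIDIA or Docker tooling:

```shell
# Hypothetical helper: reads lsmod-style text on stdin and succeeds
# (exit status 0) only if a module named exactly "nvidia" is listed.
nvidia_module_loaded() {
    awk '$1 == "nvidia" { found = 1 } END { exit !found }'
}

# Example use on the affected host (commented out because it needs lsmod):
# lsmod | nvidia_module_loaded || echo "nvidia module not loaded on kernel $(uname -r)"
```

If the check fails, comparing the output of uname -r with the kernel the driver was installed for is a reasonable next step.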
Solution: remove the newer kernel
List the kernels currently installed on the system with dpkg --get-selections | grep linux
gpu@gpu-workstation:~$ dpkg --get-selections | grep linux
binutils-x86-64-linux-gnu install
console-setup-linux install
libnvpair1linux install
libselinux1:amd64 install
libuutil1linux install
libzfs2linux install
libzpool2linux install
linux-base install
linux-firmware install
linux-generic install
linux-headers-5.4.0-88 install
linux-headers-5.4.0-88-generic hold
linux-headers-5.4.0-89 install
linux-headers-5.4.0-89-generic install
linux-headers-generic install
linux-image-5.4.0-88-generic hold
linux-image-5.4.0-89-generic install
linux-image-generic install
linux-libc-dev:amd64 install
linux-modules-5.4.0-88-generic hold
linux-modules-5.4.0-89-generic install
linux-modules-extra-5.4.0-88-generic hold
linux-modules-extra-5.4.0-89-generic install
util-linux install
zfsutils-linux install
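The grep above also matches many packages that are not kernels. A small sketch that narrows the listing to versioned kernel image packages; the helper name list_kernel_images is hypothetical, and the pattern assumes Ubuntu's linux-image-<version> naming (optionally with an unsigned- infix):

```shell
# Hypothetical helper: filter `dpkg --get-selections` text on stdin down to
# versioned kernel image packages, printing "<package> <state>" per line.
list_kernel_images() {
    awk '$1 ~ /^linux-image-(unsigned-)?[0-9]/ { print $1, $2 }'
}

# Example use on the host (commented out because it needs dpkg):
# dpkg --get-selections | list_kernel_images
```

Meta-packages such as linux-image-generic are deliberately excluded, so only the concrete kernel versions and their install/hold state remain.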
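The listing shows that the system had automatically installed kernel 5.4.0-89. Remove it with sudo apt-get purge linux-image-5.4.0-89-generic
A prompt appears partway through; choose Cancel. (Note: removing a kernel is risky, so use your own judgment.)
After the removal, reboot the server.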
gpu@gpu-workstation:~$ sudo apt-get purge linux-image-5.4.0-89-generic
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following packages were automatically installed and are no longer required:
amd64-microcode intel-microcode iucode-tool libdbus-glib-1-2 libevdev2 libimobiledevice6 libplist3 libupower-glib3 libusbmuxd6 linux-headers-generic thermald upower usbmuxd
Use 'sudo apt autoremove' to remove them.
The following additional packages will be installed:
linux-image-unsigned-5.4.0-89-generic
Suggested packages:
fdutils linux-doc | linux-source-5.4.0 linux-tools
The following packages will be REMOVED:
linux-generic* linux-image-5.4.0-89-generic* linux-image-generic* linux-modules-extra-5.4.0-89-generic*
The following NEW packages will be installed:
linux-image-unsigned-5.4.0-89-generic
0 upgraded, 1 newly installed, 4 to remove and 39 not upgraded.
Need to get 9,011 kB of archives.
After this operation, 202 MB disk space will be freed.
Do you want to continue? [Y/n] y
Get:1 http://ca.archive.ubuntu.com/ubuntu focal-updates/main amd64 linux-image-unsigned-5.4.0-89-generic amd64 5.4.0-89.100 [9,011 kB]
Fetched 9,011 kB in 4s (2,522 kB/s)
(Reading database ... 113040 files and directories currently installed.)
Removing linux-generic (5.4.0.89.93) ...
Removing linux-image-generic (5.4.0.89.93) ...
Removing linux-modules-extra-5.4.0-89-generic (5.4.0-89.100) ...
Removing linux-image-5.4.0-89-generic (5.4.0-89.100) ...
W: Removing the running kernel
I: /boot/vmlinuz is now a symlink to vmlinuz-5.4.0-88-generic
I: /boot/initrd.img is now a symlink to initrd.img-5.4.0-88-generic
/etc/kernel/postrm.d/initramfs-tools:
update-initramfs: Deleting /boot/initrd.img-5.4.0-89-generic
/etc/kernel/postrm.d/zz-update-grub:
Sourcing file `/etc/default/grub'
Sourcing file `/etc/default/grub.d/init-select.cfg'
Generating grub configuration file ...
Found linux image: /boot/vmlinuz-5.4.0-88-generic
Found initrd image: /boot/initrd.img-5.4.0-88-generic
Adding boot menu entry for UEFI Firmware Settings
done
Selecting previously unselected package linux-image-unsigned-5.4.0-89-generic.
(Reading database ... 107660 files and directories currently installed.)
Preparing to unpack .../linux-image-unsigned-5.4.0-89-generic_5.4.0-89.100_amd64.deb ...
Unpacking linux-image-unsigned-5.4.0-89-generic (5.4.0-89.100) ...
Setting up linux-image-unsigned-5.4.0-89-generic (5.4.0-89.100) ...
I: /boot/vmlinuz is now a symlink to vmlinuz-5.4.0-89-generic
I: /boot/initrd.img is now a symlink to initrd.img-5.4.0-89-generic
(Reading database ... 107663 files and directories currently installed.)
Purging configuration files for linux-modules-extra-5.4.0-89-generic (5.4.0-89.100) ...
Purging configuration files for linux-image-5.4.0-89-generic (5.4.0-89.100) ...
I: /boot/vmlinuz is now a symlink to vmlinuz-5.4.0-88-generic
I: /boot/initrd.img is now a symlink to initrd.img-5.4.0-88-generic
/var/lib/dpkg/info/linux-image-5.4.0-89-generic.postrm ... removing pending trigger
rmdir: failed to remove '/lib/modules/5.4.0-89-generic': Directory not empty
Processing triggers for linux-image-unsigned-5.4.0-89-generic (5.4.0-89.100) ...
gpu@gpu-workstation:~$
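In the dpkg listing, the 5.4.0-88 packages appear in the hold state. How that hold was set is not stated in this post; one common way is apt-mark hold. The sketch below only builds the package names for a given kernel release, assuming the -generic flavour seen in the listing; the helper name kernel_packages_for is hypothetical:

```shell
# Hypothetical helper: print the versioned kernel package names for one
# kernel release, assuming the "-generic" flavour seen in the listing.
kernel_packages_for() {
    version="$1"
    for prefix in linux-image linux-headers linux-modules linux-modules-extra; do
        printf '%s-%s-generic\n' "$prefix" "$version"
    done
}

# Example use (commented out because it needs root and apt):
# sudo apt-mark hold $(kernel_packages_for 5.4.0-88)
```

Holding only the versioned packages keeps them from being removed or replaced, but it does not by itself stop a newer kernel from being installed alongside them, which matches what happened here.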



