Fix for kernel bug in Ubuntu 20.04 causing various issues
-
This fix is not needed anymore. Ubuntu has released 5.4.0-135-generic.
Cloudron enables automatic Ubuntu security updates. Around 2022-11-17, the Linux kernel was updated to 5.4.0-132-generic. You can find the automatic updates log in /var/log/apt/history.log. This kernel has a bug that causes containerd, Prometheus node exporter and other software to fail. On Cloudron, this manifests itself as:
- automatic updates appear to get stuck in 'cleaning up old install'
- cron jobs don't work anymore
- file permissions inside containers become incorrect
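To tell whether a server is running the kernel named above, the release string can be compared against 5.4.0-132-generic. This is only a minimal sketch: the function name kernel_status is made up for illustration, and it checks for that single version only, since no other release is identified as buggy in this post.

```shell
#!/usr/bin/env bash
# Sketch: compare a kernel release string with the one version this post
# identifies as buggy. Other releases are not assessed at all.
AFFECTED_KERNEL="5.4.0-132-generic"

kernel_status() {
    # $1: kernel release to check; defaults to the running kernel (uname -r)
    local release="${1:-$(uname -r)}"
    if [ "$release" = "$AFFECTED_KERNEL" ]; then
        echo "affected"
    else
        echo "not-known-affected"
    fi
}

# Usage:
#   kernel_status                      # checks the running kernel
#   kernel_status 5.4.0-131-generic    # checks a given release string
```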
For the moment, it's best to revert to the previous kernel, 5.4.0-131-generic. How you do this depends on your VPS provider. Some VPS providers allow you to change the kernel via their control panels.
Please be careful with the instructions below. You might have to fine-tune them based on your setup/provider.
Many modern providers just use GRUB 2 as the bootloader (DigitalOcean, Linode, to name a few). On such a VPS, change the kernel as follows:
- Highly recommend taking a snapshot of the server, in case something goes wrong.
- SSH into the server
apt install linux-image-5.4.0-131-generic linux-modules-extra-5.4.0-131-generic
apt-mark hold linux-generic linux-image-generic linux-headers-generic
- Edit /etc/default/grub. Find the line GRUB_DEFAULT=0. Change this to GRUB_DEFAULT="Advanced options for Ubuntu>Ubuntu, with Linux 5.4.0-131-generic". Important you get this line right, otherwise your server may not boot!
update-grub
reboot
- After reboot, uname -nar will say 5.4.0-131-generic.
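Because a wrong GRUB_DEFAULT value can leave the server unbootable, it may help to confirm that the menu entry title really exists before rebooting. The sketch below is an assumption-laden helper, not part of the original instructions: grub_entry_exists is a hypothetical name, it assumes the generated config lives at /boot/grub/grub.cfg, and it assumes that file has been regenerated since the 5.4.0-131 kernel was installed.

```shell
#!/usr/bin/env bash
# Sketch: check that grub.cfg contains both the submenu and the menu entry
# that a GRUB_DEFAULT value of the form
#   "Advanced options for Ubuntu>Ubuntu, with Linux <version>"
# refers to.

grub_entry_exists() {
    # $1: kernel version, e.g. 5.4.0-131-generic
    # $2: path to the generated GRUB config (assumed default below)
    local version="$1"
    local cfg="${2:-/boot/grub/grub.cfg}"
    # -F matches literally; the closing quote keeps the
    # "(recovery mode)" entry from counting as a match.
    grep -qF "submenu 'Advanced options for Ubuntu'" "$cfg" &&
        grep -qF "menuentry 'Ubuntu, with Linux ${version}'" "$cfg"
}

# Usage:
#   grub_entry_exists 5.4.0-131-generic && echo "entry found"
```

If the check fails, compare the titles in grub.cfg with the string in /etc/default/grub before rebooting.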
To reverse the above changes:
apt-mark unhold linux-generic linux-image-generic linux-headers-generic
unattended-upgrade -d
- When running this, you will see the new kernel 5.4.0-135-generic getting installed.
- Edit /etc/default/grub. Change the line to GRUB_DEFAULT="Advanced options for Ubuntu>Ubuntu, with Linux 5.4.0-135-generic".
update-grub
reboot
- After reboot, uname -nar will say 5.4.0-135-generic.
apt remove linux-image-5.4.0-131-generic linux-modules-extra-5.4.0-131-generic
- This removes the old kernel.
- Edit /etc/default/grub. Change the line to GRUB_DEFAULT=0.
update-grub
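The GRUB_DEFAULT edits above can also be scripted. What follows is only a sketch under a few assumptions: set_grub_default is a hypothetical helper name, the file contains exactly one uncommented line starting with GRUB_DEFAULT=, and the value contains no backslashes. It keeps a .bak copy next to the file and does not run update-grub for you.

```shell
#!/usr/bin/env bash
# Sketch: rewrite the GRUB_DEFAULT line of a grub defaults file,
# keeping a backup copy as <file>.bak.

set_grub_default() {
    # $1: new value, e.g. 0 or "Advanced options for Ubuntu>Ubuntu, with Linux ..."
    # $2: file to edit (defaults to /etc/default/grub)
    local value="$1"
    local file="${2:-/etc/default/grub}"
    local line
    case "$value" in
        # Anything that is not a plain number gets double quotes
        *[!0-9]*|"") line="GRUB_DEFAULT=\"$value\"" ;;
        *)           line="GRUB_DEFAULT=$value" ;;
    esac
    cp -p "$file" "$file.bak" || return 1
    # Replace only the GRUB_DEFAULT line; all other lines pass through unchanged
    awk -v line="$line" '/^GRUB_DEFAULT=/ { print line; next } { print }' \
        "$file.bak" > "$file"
}

# Usage (as root), followed by update-grub as described above:
#   set_grub_default 0
```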
@girish Hello Girish,
after the fix, «automatic updates appear to get stuck in 'cleaning up old install'» has been solved, but it seems the "cron jobs don't work anymore" problem is still there...
Do you have any other feedback on this issue?
-
@p44 said in Fix for kernel bug in Ubuntu 20.04 causing various issues:
"cron jobs don't work anymore" problem is still there.
Seems to have gone for me. Previously my Nextclouds were giving me warnings about that, but they aren't doing that anymore.
-
@jdaviescoates Thanks a lot, I'll do more accurate tests... it seems only a few cron jobs are executed, in the external cron panel
-
@jdaviescoates Updating to Ubuntu 22.04 is a much riskier endeavor than just downgrading the kernel. Downgrading the kernel only takes 5 mins. In fact, just this weekend I upgraded all our servers from Ubuntu 18 to 20 and all of them just completely hosed.
Each one failed in a different place: one in upgrading cloud-init, another is still stuck in some "conflicting package", and for another I had to switch from the DO mirror to Canonical's mirror. I have generally not had good experiences with distro upgrades (on the server at least). On desktop Ubuntu, I feel things are better, maybe because I have the PC in front of me and have more control of the boot loader.
Ubuntu 22 should work fine though with Cloudron 7.3. But note that it requires you to also rebuild all containers because of the cgroup v1 to cgroup v2 migration. All this is in the docs, but we have had at least 2-3 bug reports of the migration script not working perfectly.
-
@girish said in Fix for kernel bug in Ubuntu 20.04 causing various issues:
I upgraded all our servers from Ubuntu 18 to 20 and all of them just completely hosed
Interesting.... I upgraded one server from 16 > 18 > 20 and two more servers from 18 > 20 using the Cloudron guides and never had a problem. I was going to upgrade to 22 thinking it'll all be easy but a little more unsure now.
-
At least on Vultr's Ubuntu 20.04 repository mirrors, the new fixed kernel is already available via security updates. You can check whether this is the same on your instance by running:
apt-get update && apt list --upgradable | grep "\-security"
If it lists
linux-generic/focal-updates,focal-security 5.4.0.135.133 amd64 [upgradable from: 5.4.0.132.132]
then you have to unhold the previously held packages, and the update will eventually be installed as normal:
apt-mark unhold linux-generic linux-image-generic linux-headers-generic
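The check above can be narrowed to the linux-generic metapackage so that only its candidate version is printed. This is a rough sketch: security_kernel_candidate is a made-up function name, and it relies on the "package/pockets version arch [...]" line layout shown in the sample output above.

```shell
#!/usr/bin/env bash
# Sketch: read `apt list --upgradable` output on stdin and print the
# candidate version of linux-generic if it is offered from a -security pocket.

security_kernel_candidate() {
    # Field 1 is "package/pocket[,pocket...]", field 2 is the candidate version
    awk '$1 ~ /^linux-generic\// && $1 ~ /-security/ { print $2 }'
}

# Usage (stderr is discarded because apt warns about its unstable CLI):
#   apt list --upgradable 2>/dev/null | security_kernel_candidate
```

An empty result means linux-generic is not currently listed as upgradable from a security pocket on that mirror.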
-
Ubuntu released a new kernel with the fix, 5.4.0-135.152: https://bugs.launchpad.net/ubuntu/+source/containerd/+bug/1996678/comments/28 . I don't know if this kernel arrives as a security update.
-
@girish Hello Girish, did you check if this kernel arrives as a security update?
Can we unhold the previously held packages?
Thanks a lot
-
@p44 this seems to have come to our servers at least. So, yes, go ahead and unmark the holds.
Start-Date: 2022-12-03 06:10:57
Commandline: /usr/bin/unattended-upgrade
Install: linux-image-5.4.0-135-generic:amd64 (5.4.0-135.152, automatic), linux-headers-5.4.0-135:amd64 (5.4.0-135.152, automatic), linux-modules-extra-5.4.0-135-generic:amd64 (5.4.0-135.152, automatic), linux-modules-5.4.0-135-generic:amd64 (5.4.0-135.152, automatic), linux-headers-5.4.0-135-generic:amd64 (5.4.0-135.152, automatic)
Upgrade: linux-headers-generic:amd64 (5.4.0.132.132, 5.4.0.135.133), linux-image-generic:amd64 (5.4.0.132.132, 5.4.0.135.133), linux-generic:amd64 (5.4.0.132.132, 5.4.0.135.133)
End-Date: 2022-12-03 06:12:14
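To find entries like this one on your own server, the apt history log can be scanned for kernel image installs. A minimal sketch follows, assuming the log uses the Start-Date / Install field names seen above; kernel_installs is a hypothetical helper name, and older entries may sit in rotated log files that this does not read.

```shell
#!/usr/bin/env bash
# Sketch: print "<date> <time> <kernel image package>" for every
# linux-image-* package recorded on an Install: line of an apt history log.

kernel_installs() {
    # $1: log file to read (defaults to /var/log/apt/history.log)
    awk '
        /^Start-Date:/ { start = $2 " " $3 }   # remember when the run started
        /^Install:/ {
            line = $0
            # Walk through the line, one linux-image-<version> name at a time
            while (match(line, /linux-image-[0-9][^: ]*/)) {
                print start, substr(line, RSTART, RLENGTH)
                line = substr(line, RSTART + RLENGTH)
            }
        }
    ' "${1:-/var/log/apt/history.log}"
}

# Usage:
#   kernel_installs
```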
-
@nebulon Do we also have to reverse this step?
- Edit /etc/default/grub. Find the line GRUB_DEFAULT=0. Change this to GRUB_DEFAULT="Advanced options for Ubuntu>Ubuntu, with Linux 5.4.0-131-generic". Important you get this line right, otherwise your server may not boot
-
@p44 said in Fix for kernel bug in Ubuntu 20.04 causing various issues:
@girish @nebulon please can you post here the whole process to reverse this fix?
Yes please, the whole process.
-
@girish said in Fix for kernel bug in Ubuntu 20.04 causing various issues:
@p44 I put in instructions to reverse in the original post.
Thanks.
I tried to implement the instructions to reverse. For me, this didn't work. I followed the steps, did the reboot, and uname -nar still shows 5.4.0-131-generic #147-Ubuntu SMP Fri Oct 14 17:07:22 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
I tried to follow the steps suggested by @nebulon here: https://forum.cloudron.io/post/57470
When checking for updates, it provided the following output:
Hit:1 http://no.archive.ubuntu.com/ubuntu focal InRelease
Hit:2 http://no.archive.ubuntu.com/ubuntu focal-updates InRelease
Hit:3 http://no.archive.ubuntu.com/ubuntu focal-backports InRelease
Hit:4 http://no.archive.ubuntu.com/ubuntu focal-security InRelease
Reading package lists... Done
WARNING: apt does not have a stable CLI interface. Use with caution in scripts.
Maybe it was wrong, but I unheld the previously held packages.
Then I rebooted and waited a few hours. Nothing happened, I was still on 5.4.0-131-generic
So I tried the next step I could find, by following the instructions here (as best as I could understand): https://forum.cloudron.io/post/57552
At command line, I typed sudo /usr/bin/unattended-upgrade
Again, I rebooted and again I was still on 5.4.0-131 when I checked today.
So now, I tried to follow the new instructions you put on top of this post.
First, the unhold command reported that the hold had already been removed.
Second, when doing sudo unattended-upgrade -d, I got a huge output, but nothing in that output said anything about 5.4.0-135-generic. (I pasted the full output below, sorry if it becomes a long post).
I continued to follow the instructions as best as I could, so I edited /etc/default/grub like it says in the reverse instructions, did a reboot, and checked uname -nar again. Still on 5.4.0-131-generic.
I have not apt removed 5.4.0-131 (as I thought that would perhaps be a bad idea since I apparently don't have a newer one), and I have not restored /etc/default/grub to the default 0.
Could you offer any assistance, perhaps? Is this easily fixable, or do I have to clean install the whole Cloudron and restore from backup?
I am on Ubuntu 20.04 and I self host.
Thank you!
edit: I have to reply to this post, because the output is too long for the editor: