Strange correlation of performance issue, backup issue and kernel.
-
Yesterday one of my Cloudron VPSes started being really slow. Apps were unusably slow. We checked and saw that we are on Ubuntu 20.04 with Cloudron 9. Trying to do a backup before upgrading was impossible. We have Hetzner Storage Boxes mounted as CIFS shares, which used to work just fine all the time.
Anyway, we supposed it might be the EOL Ubuntu, so my hoster upgraded the server to 22.04, recreated the containers and so on. Unfortunately the lag was still present. The backup task takes forever.
To make the VPS usable we temporarily scaled it to 8 vCPUs and 24 GB RAM. This made apps respond properly again. We were on 4 vCPUs and 12 GB RAM a few days ago and the only thing we added was a niche WordPress (Managed) site. Looking at the graphs, that does not explain the sudden slowdown, which also persists on 22.04.
This morning another of our VPSes shows the same pattern: sudden slowdown of all apps, failing backup. There are only two apps running on this VPS and we certainly don't have more users or anything extraordinarily demanding from the outside. It is also on 20.04. An upgrade will be made today.
As a side note: our 20.04 kernel on the first machine has been stuck on some 124 version for some reason.
Now it is clear that 20.04 is not supported anymore. But why is the machine similarly slow after the upgrade to a supported version (22.04), and why do backups fail? There is a certain lag in the UI as well that correlates with it. Another thing that happens is that "Disk Usage" displays nothing. Not even empty bars. The whole section is just missing.
Do you have any ideas?
We are open to SSH support from the Cloudron team, if you have time.
Chain of events:
- Cloudron 9 running on 20.04
- Sudden performance issues correlating with unstable backup tasks
- We upgrade to 22.04
- We recreate containers and also do the post-upgrade routines
- Performance is still laggy. Not as unusable as in the incident but definitely slower.
-
I used iostat to look at %user / %system / %iowait / %idle and everything looks normal.
w/s 31
%util 4.8
w_await 5.35 ms
It does not seem problematic.
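The CPU percentages above (%user, %system, %iowait, %idle) correspond to counters in the aggregate `cpu` line of `/proc/stat`. As a minimal sketch, the helpers below (the names `parse_cpu_line`, `cpu_percentages` and `sample_cpu` are made up for illustration) compute the same shares from two samples and also report steal time, which on a VPS is the share of time the hypervisor gave to other guests. Whether steal plays any role here is an assumption, not a finding.

```python
import time


def parse_cpu_line(line):
    """Parse the aggregate 'cpu' line of /proc/stat into named tick counters."""
    # Field order as documented in proc(5); guest time is already part of user time.
    names = ["user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal"]
    fields = line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    values = [int(v) for v in fields[1:1 + len(names)]]
    return dict(zip(names, values))


def cpu_percentages(before, after):
    """Return the percentage of CPU time per category between two samples."""
    a = parse_cpu_line(before)
    b = parse_cpu_line(after)
    deltas = {name: b[name] - a[name] for name in a}
    total = sum(deltas.values())
    if total <= 0:
        raise ValueError("samples are identical or out of order")
    return {name: 100.0 * delta / total for name, delta in deltas.items()}


def sample_cpu(interval=5.0, path="/proc/stat"):
    """Take two samples `interval` seconds apart (Linux only) and compare them."""
    with open(path) as f:
        first = f.readline()
    time.sleep(interval)
    with open(path) as f:
        second = f.readline()
    return cpu_percentages(first, second)
```

Calling `sample_cpu()` on the affected machine while it is slow, and again while it is fine, would give two sets of numbers to compare. A persistently high `steal` share would point at the host rather than at the VPS itself.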
-
Hello
@whitespace said in Strange correlation of performance issue, backup issue and kernel.:
As a side note: our 20.04 kernel on the first machine has been stuck on some 124 version for some reason.
What version are you referring to?
@whitespace said in Strange correlation of performance issue, backup issue and kernel.:
This morning another of our VPSes shows the same pattern.
Is this VPS hosted by the same provider, maybe even the same datacenter, same node?
It could be an underlying issue with your provider's storage system.
-
The machine was stuck on Ubuntu 20.04.6 LTS with Linux 5.4.0-124-generic. It has been upgraded to Ubuntu 22.04.5 LTS with Linux 5.15.0-170-generic. It was still too slow for normal operation after the upgrade. The provider scaled it up to compensate. Now it is working fine, but suddenly needing double the CPU and RAM seems quite strange.
Yes, indeed both machines are hosted by the same provider. Regarding data center and node, I cannot say for sure. Would a storage issue not be reflected in the I/O numbers of the VPS? Those seem fine.
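One caveat on the I/O numbers: per-device statistics cover local block devices, and a CIFS mount is not one, so latency towards the Storage Box would not necessarily show up there. A rough way to compare the two is to time small synced writes in a directory on each. The sketch below is only an illustration; the function name `time_synced_writes` and the example path `/mnt/backup` are hypothetical, so substitute the real mount point.

```python
import os
import statistics
import tempfile
import time


def time_synced_writes(directory, count=20, size=64 * 1024):
    """Write `count` temporary files of `size` bytes, fsync each, return latencies in seconds."""
    payload = os.urandom(size)
    latencies = []
    for _ in range(count):
        fd, path = tempfile.mkstemp(dir=directory, prefix="latency-probe-")
        try:
            with os.fdopen(fd, "wb") as f:
                start = time.perf_counter()
                f.write(payload)
                f.flush()
                os.fsync(f.fileno())  # force the data out to the storage backend
                latencies.append(time.perf_counter() - start)
        finally:
            os.unlink(path)  # always clean up the probe file
    return latencies


def summarize_ms(latencies):
    """Return min / median / max of the measured latencies in milliseconds."""
    return {
        "min_ms": min(latencies) * 1000.0,
        "median_ms": statistics.median(latencies) * 1000.0,
        "max_ms": max(latencies) * 1000.0,
    }
```

Running `summarize_ms(time_synced_writes("/mnt/backup"))` against the share and then against a directory on the local disk gives two sets of numbers to compare. A large gap, or a large spread between median and maximum on the share, would suggest looking at the network storage rather than the VPS disk.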