Apps Stuck Updating - Cleaning up old install - even after stopping/restarting task

jdaviescoates

@subven said in Apps Stuck Updating - Cleaning up old install - even after stopping/restarting task:

@girish said in Apps Stuck Updating - Cleaning up old install - even after stopping/restarting task:

So far, this is not reproducible on a netcup server. I left it overnight to make many updates and it seems to be doing fine.

Can confirm. Not reproducible on Netcup RS or VPS servers.

I wonder why so many of us Netcup customers (well, at least 3: me, @avatar1024 and @imc67) are hitting this issue.

I'm on a Netcup VPS 3000 G10.

avatar1024

Just to add some info:

All my three instances are on Netcup:
Two are RS on Cloudron 7.2.5 Ubuntu 20.04
One is a VPS on Cloudron 7.3.2 Ubuntu 22.04

All are using a Hetzner Storage Box for backup, mounted as SSHFS (not CIFS).

I only hit this issue on one instance on Cloudron 7.2.5. Rebooting the server solves it for a bit (I can update all apps manually and it works fine) but then the issue comes back at some point.

Everything in the update process works fine but it gets stuck on "Cleaning up old install". When I then cancel the task, go to the repair tab and click Restart task, the app just restarts to a working state (not updated) but the update does not restart.

So what's weird is that it works completely fine on another instance that's exactly the same: same provider, same root server, same Cloudron/Ubuntu version.

Happy to give SSH access if useful.

humptydumpty

@girish Updating the server was going smoothly until I tried to recreate the addon containers. It's still adding new dots so I assume it's okay, but it has been running for over 6 1/2 hours! Is this normal?

Disk size is 420GB total w/ around 65GB used.

girish

@humptydumpty that should definitely not take so long. I assume you mean the running of /home/yellowtent/box/scripts/recreate-containers ? If you go to the Services view, are things running already?

humptydumpty

@girish the dashboard and all apps aren't loading. It said "Cloudron is offline" in red, then showed an "unable to connect" error page.

humptydumpty

@girish I logged in in another putty window and ran

systemctl status box & systemctl status collectd -- both are active and running (green).

girish

@humptydumpty yeah, I think what has happened is that docker is getting stuck again and thus the addon containers are not getting created properly. Can you give me SSH access and drop a mail to support@ ? I can try to see if recreating the docker images solves anything.

humptydumpty

@girish email sent .. ~~btw should I close the addon container putty window?~~

girish

@humptydumpty thanks, just looking into this now. Indeed:

2022-11-22T05:48:22.494Z box:shell removeAllContainers exec: docker ps -qa --filter 'label=isCloudronManaged' | xargs --no-run-if-empty docker stop

Docker is not responding, so I will recreate the docker images completely. I did this now on @jdaviescoates' server a while ago to see if it helps as well.
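
For anyone wanting to check for the same symptom, here is a minimal sketch of a "does it answer in time" test. The helper name `responds_within` is made up for this example, and the 10-second limit in the usage line is an arbitrary assumption, not a Cloudron default. It only relies on the coreutils `timeout` command.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Cloudron): succeed only if the given
# command finishes successfully within SECONDS.
# Usage: responds_within SECONDS CMD [ARGS...]
responds_within() {
  local secs="$1"
  shift
  # timeout kills the command and exits non-zero if it overruns.
  timeout "$secs" "$@" >/dev/null 2>&1
}

# Example (assumes the docker CLI is installed):
#   responds_within 10 docker info || echo "docker daemon is not answering"
```

A hung daemon and a stopped daemon both make this fail, so it is only a first check before looking at logs.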

girish

Further, I am unable to remove the docker directory even after stopping and disabling docker...

root@vmi557975:/# sudo systemctl stop docker
Warning: Stopping docker.service, but it can still be activated by:
  docker.socket
root@vmi557975:/# rm -rf /var/lib/docker
rm: cannot remove '/var/lib/docker/overlay2/ff0c38183c43e7bcaaf9b564d44f762e0b22f4bd77592a1f0ddae1507dff138d/merged': Device or resource busy
rm: cannot remove '/var/lib/docker/overlay2/d19d5ab400f594ee0a9ce613fbd44ebdd98d72cfa7b0ef7afd1705189272636e/merged': Device or resource busy
rm: cannot remove '/var/lib/docker/overlay2/d8c1b7219fb875bed31be5c935420240e86a29ae38e42f06c547d8d20534e1c0/merged': Device or resource busy
root@vmi557975:/# ps aux | grep docker
root     1330267  0.0  0.0   9068  2260 pts/0    S+   13:58   0:00 grep --color=auto docker
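
The warning above means `docker.socket` can start the service again on the next connection. A minimal sketch for stopping both units follows, assuming a systemd host with the standard `docker.service` and `docker.socket` unit names. The `run` and `stop_docker_fully` helpers and the `DRY_RUN` variable are made up for this example. This alone would not necessarily release the busy `merged` mounts.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not part of Cloudron or Docker).
# Set DRY_RUN=1 to print the commands instead of running them.
run() {
  if [ -n "${DRY_RUN:-}" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

stop_docker_fully() {
  # Stop the socket first so nothing can re-activate the service.
  run systemctl stop docker.socket
  run systemctl stop docker.service
  # is-active exits non-zero when the unit is inactive, hence "|| true".
  run systemctl is-active docker.service || true
}

# Example: preview the commands without touching the system
#   DRY_RUN=1 stop_docker_fully
```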

Some dangling container is causing problems:

root@vmi557975:/# ps aux | grep container
root      304611  0.0  0.0 710832  8624 ?        Sl   Nov20   0:24 /usr/bin/containerd-shim-runc-v2 -namespace moby -id adf60365e7ebeed38c53723ed75725ed398a434fb7aaec805a910edcaa5ae901 -address /run/containerd/containerd.sock
root     1196183  0.0  0.0 710768  8464 ?        Sl   05:24   0:04 /usr/bin/containerd-shim-runc-v2 -namespace moby -id d4b36c14c6c2850853bd62513ff6203135f75047a8640001accadcf62ef496b1 -address /run/containerd/containerd.sock
root     1330278  0.0  0.0   9068  2200 pts/0    R+   13:59   0:00 grep --color=auto container

I guess this is why it's hanging, it's unable to remove images.
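
To see what is still holding the directory, something like the sketch below could be used. It lists mount points under a directory from the kernel mount table and picks out leftover `containerd-shim` PIDs from `ps` output. Both function names are made up for this example, and it assumes a Linux host where `/proc/mounts` is available.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not part of Cloudron or Docker).

# Print every mount point at or below DIR.
# Usage: list_mounts_under DIR [MOUNTS_FILE]
list_mounts_under() {
  local dir="${1%/}" table="${2:-/proc/mounts}"
  # Field 2 of the mounts table is the mount point.
  awk -v d="$dir" '$2 == d || index($2, d "/") == 1 { print $2 }' "$table"
}

# Read "PID COMMAND ..." lines on stdin and print the PIDs whose command
# is a containerd shim. Matching on the command field skips the grep itself.
shim_pids() {
  awk '$2 ~ /containerd-shim/ { print $1 }'
}

# Example:
#   list_mounts_under /var/lib/docker
#   ps -eo pid=,args= | shim_pids
```

Any `overlay2/.../merged` path printed by the first command is still mounted, which matches the "Device or resource busy" errors from `rm`.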

humptydumpty

@girish ~~what are my options to get the server back up? snapshot restore?~~ dashboard is up! apps are in "configuring - queued" state.

girish

@humptydumpty yes, I just removed the entire /var/lib/docker and am re-creating all the apps (I had to reboot to even remove it since it was locked by various processes).

humptydumpty

@girish they're "configuring" now. thanks!

humptydumpty

@girish excluding the time to install Ubuntu, which was around 10 mins at most, do you think I missed any incoming mail during the addon container recreation time?

girish

@humptydumpty practically all email servers will retry sending email if they hit you during the downtime.

jdaviescoates

@girish said in Apps Stuck Updating - Cleaning up old install - even after stopping/restarting task:

I did this now on @jdaviescoates' server a while ago to see if it helps as well.

Well, something is different. I see in your email that an n8n update successfully worked after the changes you made, but I just tried updating the outstanding WordPress update...

Now just seemingly stuck here:

So I tried clicking Restart App, which didn't work. Then I tried Retry Task > Retry app restart, which did work, but of course the app still isn't updated; it's just back where it was before I tried to update.

So, tried to update again, exact same thing. Just stuck on "Updating".

Sigh. I'm beginning to regret my server move (not only because of this, but also because I've been hitting email issues too, in part because my new IP was/is blacklisted, but also because email keeps timing out, which never used to happen before either).

Adding log:

All I could see when it first got stuck on "Updating" was this:

Nov 22 14:02:09 [POST] /backup
Nov 22 14:02:09 backing up
Nov 22 14:02:09 13:M 22 Nov 2022 14:02:09.480 * DB saved on disk

girish

@jdaviescoates I think this is something else, maybe I forgot something, let me check.

jdaviescoates

Then the restart, retry, attempt to update again logs (too long to paste here):

https://paste.uniteddiversity.coop/?0c9b9bec1cd7f806#5WH55kCQDv7pHUZtuzqSjS5EZCSDRqh5aL2WLvBbeKvA

girish

@jdaviescoates this was a bug in the patch I made yesterday to use the docker CLI instead of the docker API. I will revert the code on your server to use the API.

humptydumpty

@girish do you still need SSH access for further investigation?
