update to cloudron 5.4 crashes complete cloudron server

imc67

I just updated (or tried to update) to 5.4 and the whole Cloudron server crashed: all apps down, dashboard down. CPU busy, no network activity.

@nebulon and @girish were not available via chat, and I coincidentally had a 2-day-old server snapshot, so I could restore at the server level.

Now I'm missing 2 days of data and, worse, 2 days of email!

How can I quickly restore at least the email from the last backup (new mails keep coming in)?

nebulon

Hi,

What is the actual crash, so we can fix it? Is there something in the logs?

Regarding the restore, I'm not sure I understand this correctly: have you already restored a server snapshot from 2 days ago, or do you plan to do so?

If the latter, please enable remote ssh support via cloudron-support --enable-ssh and send a mail to support@cloudron.io with your domain. Then we can debug and fix this.

imc67

@nebulon sorry, but in the meantime I was busy for hours getting everything back online. This is what I did and found out:

Restored the server snapshot (-2 days), as at that time it was the only thing I could do.
Looked for a way to restore a backup of the box (because of email) while running the -2 days snapshot, but the docs mention nothing about restoring a box.
'Rented' a new NetCup VPS and started it with the Cloudron image, but restoring a backup was not possible because that image is already on 5.4.
Reinstalled the VPS with Ubuntu and manually installed Cloudron 5.3.4.
Then I was able to restore the backup made just before the 5.4 upgrade.

Lessons learned: always make a snapshot before a Cloudron update

Question: how can one restore a box (or specifically email) on a running Cloudron? A few years ago a user (on our previous host) accidentally deleted all the folders in his mailbox, and I was able (with DirectAdmin) to restore only his mailbox (it had an hourly backup system).

nebulon

Do you happen to remember what was crashing, so we can investigate the root issue?

Restoring the main Cloudron system database, mailboxes, ... (i.e. the "box") individually is not possible, as apps depend on it, and rolling it back while the apps are already using newer state could result in inconsistencies.

Generally I think what you did in the end was the correct approach: restore the whole server on a new VPS using the backup made prior to the update. We may have to improve that flow a bit further; in such situations the stress level is already high, so the restore path should be smooth.

imc67

@nebulon I SSH'd into the server during the 'crash' and ran 'top'. I remember seeing only a few processes causing load, something like 'Docker' and 'Node'.

I tried a reboot, but after that I wasn't even able to SSH into the VPS anymore. That was the moment I thought of rebuilding/moving.

Here is a screenshot from my Zabbix dashboard: the backup started around 7:10h and finished just after 8:05h, then the update started and "crashed".

[Screenshot: Schermafbeelding 2020-07-18 om 13.04.04.png]

nebulon

Unfortunately that does not give us much information to debug this further. If it happens again, it is worth looking into the log files at /home/yellowtent/platformdata/logs/; most likely the issue will surface in the logs.
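
For anyone who ends up in the same situation: below is a minimal sketch of saving that log directory before wiping or restoring the server, so it can still be sent to support afterwards. This is not a Cloudron tool. The function name `archive_logs`, the archive file name, and the `/tmp` destination are assumptions made for illustration; only the log path comes from the post above.

```python
import os
import tarfile
import time

# Path taken from the post above; adjust if your installation differs.
LOG_DIR = "/home/yellowtent/platformdata/logs"


def archive_logs(log_dir, dest_dir):
    """Pack log_dir into a timestamped .tar.gz inside dest_dir and return its path."""
    stamp = time.strftime("%Y%m%d-%H%M%S")
    # Hypothetical archive name, chosen only for this example.
    archive_path = os.path.join(dest_dir, "cloudron-logs-" + stamp + ".tar.gz")
    with tarfile.open(archive_path, "w:gz") as tar:
        # arcname keeps the paths inside the archive relative ("logs/...").
        tar.add(log_dir, arcname="logs")
    return archive_path


if __name__ == "__main__":
    if os.path.isdir(LOG_DIR):
        print(archive_logs(LOG_DIR, "/tmp"))
    else:
        print("log directory not found:", LOG_DIR)
```

Run it as root on the affected server (the log directory is usually not world-readable), then copy the resulting archive off the machine before restoring a snapshot over it. The destination directory should be outside the log directory, otherwise the archive would try to include itself.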

imc67

Just the latest update: the fresh new VPS with the restored Cloudron Pro automatically updated to 5.4 last night without any issue. My two other Cloudron Pro instances also updated automatically, and that went well.

Next time (hopefully not needed) I will download the logs before a restore to make debugging possible.

d19dotca

@imc67 @girish Are pre-release versions auto-updating? I thought they had to be manually installed? I ask because now I'm worried mine may auto-update to a pre-release version on a production system.

nebulon

There should currently be no automatic update to 5.4. However, if you manually check for updates, the update can already be applied.

girish

@imc67 I made https://git.cloudron.io/cloudron/box/-/issues/717 for keeping mail backups separately from the box backups.
