Auto-update to 8.3 - various apps down - database issue
-
@girish Man this is bad
-
@CptPlastic if the Cloudron is still in an error state, we would like to take a look at it. Can you email us at support@cloudron.io?
-
Not sure if strong words help the cause. It is not like we introduce bugs or slack on testing on purpose.
I wonder where the data loss comes in, though. There should be only a small window between the app backup and the app going down, and no data can be changed or added while the app is down.
-
I share the feelings of alarm, but we do need to keep it in context: it’s IT, it always goes wrong. We are amazed and pleasantly, nay deliriously, happy when it works smoothly, as Cloudron mostly does.
Apart from the root cause and understanding why it is smooth for some and (what feels in the moment like) a near-disaster for others, the interesting point for me is: when is a backup a backup? When can it be relied on?
I couldn’t restore to the pre-update backups, needed to go further back, and then manually add lost stuff. In my case it wasn’t nice, but it wasn’t that hard, but maybe I was lucky, judging by other reports.
But that’s probably an unanswerable question.
I wonder if snapshots offer more than backups in this scenario. But I guess that’s outside the Cloudron realm.
Maybe the biggest takeaway from all this: don’t apply big updates of core services automatically. Give the user notice and a chance to make a snapshot or take some other cautious approach, and leave them to decide when to run it.
-
@joseph thank you for your reply! This is the error:
An error occurred during the restore operation: Addons Error: Unexpected response code or HTTP error when piping /home/yellowtent/appsdata/fa89594f-7176-4a81-9c25-1686af1e50da/postgresqldump to http://172.18.30.2:3000/databases/dbfa89594f71764a819c251686af1e50da/restore?access_token=XXXX&username=userXXXX: status 500 complete false
-
@mazarian thanks for the access, we got to the bottom of the issue!
The issue is that the pgvector extension is crashing on some servers. Every time an app like Immich or Chatwoot attempts to use this extension, the entire database crashes.
This then makes the database go into recovery mode.
You can read more at:
https://github.com/pgvector/pgvector/issues/143
https://github.com/pgvector/pgvector/issues/752
https://github.com/pgvector/pgvector/issues/389
The database container has been updated now: https://git.cloudron.io/platform/box/-/commit/d2de2c7093e72bdcd3c5e6ea9f8d5dc88a595b77
We will make an 8.3.1 release with the fix.
-
@girish thank you guys for everything you do and have done to have a stable platform! It's nice to see that even if issues come up, you guys are there for support, more so than some of the large companies I buy tons of equipment from! You guys are awesome!
-
@timconsidine You are so right. The best day for Team Cloudron is one in which there are no updates. I give @girish @nebulon @joseph an enormous amount of credit for the job they do. As someone who rolls out roughly monthly updates to users, every time that happens I pray that my team and I didn't miss something important. But it happens to all of us, despite good processes and the best of intentions. 8.3 will be replaced shortly with 8.3.1, and then 8.4, and then 9.0. And this difficult day will be replaced in everyone's memory by the great things that are yet to come!
-
-