Solved MySQL server down
After some automatic app upgrades, my MySQL service is no longer starting.
My Event Log shows apps becoming unresponsive after updates to Ghost, Wekan, and Collabora Office.
Yesterday healthmonitor wekan.example.com (Wekan) is down
Yesterday cron Ghost at blog.example.com was updated to v3.189.3
Yesterday cron Update of Ghost at blog.example.com started from v3.189.3 to v3.190.0
Yesterday healthmonitor blog.example.com (Ghost) is down
Yesterday cron Wekan at wekan.example.com was updated to v4.16.2
Yesterday cron Ghost at blog.example.com was updated to v3.189.2
Yesterday cron Collabora Office at docs.example.com was updated to v1.10.18
Yesterday cron Update of Wekan at wekan.example.com started from v4.16.2 to v4.16.3
Yesterday cron Update of Collabora Office at docs.example.com started from v1.10.18 to v1.10.19
Yesterday cron Update of Ghost at blog.example.com started from v3.189.2 to v3.189.3
One odd thing I see is that it appears to apply updates in an odd order. E.g., the log says the update of Ghost from v3.189.2 to v3.189.3 started, then says it completed the update to v3.189.2 (the old version). The same thing happens again shortly after: the update of Ghost at blog.example.com started from v3.189.3 to v3.190.0, and then the log says it's been updated to v3.189.3. Maybe that is just a quirk in the Event Log, posting the old version when the update is completed.
When I inspect the mysql container, I find the following in the logs:
root@mysql:/# cat /tmp/mysqld.err
2021-11-07T17:18:12.374786Z 0 [System] [MY-010116] [Server] /usr/sbin/mysqld (mysqld 8.0.23-0ubuntu0.20.04.1) starting as process 13
2021-11-07T17:18:12.399005Z 1 [System] [MY-013576] [InnoDB] InnoDB initialization has started.
2021-11-07T17:18:13.560419Z 1 [System] [MY-013577] [InnoDB] InnoDB initialization has ended.
2021-11-07T17:18:14.130206Z 0 [System] [MY-011323] [Server] X Plugin ready for connections. Socket: /var/run/mysqld/mysqlx.sock
2021-11-07T17:18:14.224525Z 0 [ERROR] [MY-013183] [InnoDB] Assertion failure: trx0purge.cc:174:purge_sys->iter.trx_no <= purge_sys->rseg->last_trx_no thread 139872211805952
InnoDB: We intentionally generate a memory trap.
InnoDB: Submit a detailed bug report to http://bugs.mysql.com.
InnoDB: If you get repeated assertion failures or crashes, even
InnoDB: immediately after the mysqld startup, there may be
InnoDB: corruption in the InnoDB tablespace. Please refer to
InnoDB: http://dev.mysql.com/doc/refman/8.0/en/forcing-innodb-recovery.html
InnoDB: about forcing recovery.
17:18:14 UTC - mysqld got signal 6 ;
Most likely, you have hit a bug, but this error can also be caused by malfunctioning hardware.
Thread pointer: 0x7f3668000b60
Attempting backtrace. You can use the following information to find out where mysqld died. If you see no messages after this, something went terribly wrong...
2021-11-07T17:18:14.227292Z 0 [Warning] [MY-010068] [Server] CA certificate ca.pem is self signed.
2021-11-07T17:18:14.227358Z 0 [System] [MY-013602] [Server] Channel mysql_main configured to support TLS. Encrypted connections are now supported for this channel.
stack_bottom = 7f36897f9ba0 thread_stack 0x30000
/usr/sbin/mysqld(my_print_stacktrace(unsigned char const*, unsigned long)+0x41) [0x5628ef0ac961]
/usr/sbin/mysqld(handle_fatal_signal+0x31b) [0x5628eded7e0b]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x153c0) [0x7f36cfcd73c0]
/lib/x86_64-linux-gnu/libc.so.6(gsignal+0xcb) [0x7f36cf33518b]
/lib/x86_64-linux-gnu/libc.so.6(abort+0x12b) [0x7f36cf314859]
/usr/sbin/mysqld(+0xe89b63) [0x5628edc38b63]
/usr/sbin/mysqld(TrxUndoRsegsIterator::set_next()+0xa48) [0x5628ef392a18]
/usr/sbin/mysqld(+0x25e3d1e) [0x5628ef392d1e]
/usr/sbin/mysqld(+0x25e6931) [0x5628ef395931]
/usr/sbin/mysqld(trx_purge(unsigned long, unsigned long, bool)+0x13d) [0x5628ef39603d]
/usr/sbin/mysqld(srv_purge_coordinator_thread()+0x5be) [0x5628ef35ddae]
/usr/sbin/mysqld(std::thread::_State_impl<std::thread::_Invoker<std::tuple<Runnable, void (*)()> > >::_M_run()+0xbd) [0x5628ef25483d]
/lib/x86_64-linux-gnu/libstdc++.so.6(+0xd6d84) [0x7f36cf721d84]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x9609) [0x7f36cfccb609]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x43) [0x7f36cf411293]
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (0): is an invalid pointer
Connection ID (thread ID): 0
Status: NOT_KILLED
The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains information that should help you find out what is causing the crash.
I've done some research on recovering from database corruption and I'll try to follow that, but since all apps are backed up, I'm wondering if there's a way to blow away this database and just restore all apps that depend on MySQL from backup.
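For anyone else who hits this assertion, the forced-recovery approach the log points at boils down to starting mysqld with innodb_force_recovery set, dumping the data, then removing the setting. A minimal sketch, assuming a conf.d-style config directory (the file path is my guess and depends on the image):

```ini
# Hypothetical drop-in, e.g. /etc/mysql/conf.d/force-recovery.cnf
# Start at 1 and only raise the level if the server still fails to start.
# Levels 4 and above can permanently corrupt data, so dump and rebuild
# rather than running with this set.
[mysqld]
innodb_force_recovery = 1
```

Once the server comes up read-only-ish under forced recovery, mysqldump the databases, wipe the datadir, and restore. In my case a restore from app backups seemed simpler.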
I'm following this guide to wipe away and restore: https://docs.cloudron.io/troubleshooting/#mysql-addon
Hoping that works.
@iamthefij did you manage to bring back the MySQL instance?
@nebulon Yeah. I put the apps into recovery mode and stopped them, then wiped the data, restored each app, and turned recovery mode back off, one by one. Not sure if there's a root cause for this. It happened right after some app updates.
@nebulon Ever since I brought up my system this way, backups have failed.
Unexpected response code when piping https://172.18.0.4:3000/databases/0e9ccf544025299e/backup?access_token=<token>: 404
message: Not Found
filename: /home/yellowtent/appsdata/<id>/mysqldump
When I check that path, there is indeed a file there, but it's an empty one.
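For anyone checking the same thing: the shell's -s test distinguishes "present but zero bytes" from "missing" in one line. The path below is a stand-in, since the real one contains the app id:

```shell
#!/bin/sh
# Stand-in for /home/yellowtent/appsdata/<id>/mysqldump
dump=/tmp/mysqldump
: > "$dump"   # simulate the empty file seen on disk

if [ ! -e "$dump" ]; then
  echo "missing"
elif [ ! -s "$dump" ]; then
  echo "empty"   # file exists but is 0 bytes: created, never written
else
  echo "ok"
fi
```

An empty dump here suggests the backup task created the file and then the dump itself failed before writing anything.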
It looks like it may be one app (Matomo) failing and canceling the entire backup operation, since another app (Gitea) does get backed up before the failure.
I just disabled backups for Matomo and I'm trying to back up the entire box again now to narrow it down.
Oh, interesting. Matomo says "Database access denied". So something is up there.
@iamthefij Ok. Just restored Matomo from backup again and it's all good.