Cloudron stops working periodically



  • This is weird but here it goes :-).

    @tadeas I have validated your mail here. We got your email to support, but its FROM address is not your gmail address and does not work, so our reply is bouncing. Please send us an email from your gmail account. Thanks.



  • Thanks :) (I had Google Apps on my domain for many years, and it still works for sending email even after I migrated to Cloudron.)

    Anyway, here's the issue ---

    I've been using Cloudron for about 6 months now. Roughly every 4 weeks I need to log in via SSH and manually reboot because the server stops responding (mail, calendar, etc.). I haven't had time to troubleshoot that.

    This time, however, mail and calendar became inaccessible and the my.* interface said an update was in progress for many hours. I rebooted, and now I only get an error page from the Cloudron.

    Logging into the server via SSH, I see that MySQL is not running and does not restart either.

    Apr 22 20:30:49 my.*.net systemd[1]: Starting MySQL Community Server...
    Apr 22 20:30:50 my.*.net systemd[1]: mysql.service: Main process exited, code=exited, status=1/FAILURE
    Apr 22 20:31:19 my.*.net systemd[1]: Failed to start MySQL Community Server.
    Apr 22 20:31:19 my.*.net systemd[1]: mysql.service: Unit entered failed state.
    Apr 22 20:31:19 my.*.net systemd[1]: mysql.service: Failed with result 'exit-code'.
    Apr 22 20:31:20 my.*.net systemd[1]: mysql.service: Service hold-off time over, scheduling restart.
    Apr 22 20:31:20 my.*.net systemd[1]: Stopped MySQL Community Server.

    root@my:~# systemctl status mysql

    • mysql.service - MySQL Community Server
      Loaded: loaded (/lib/systemd/system/mysql.service; enabled; vendor preset: enabled)
      Active: activating (start-post) (Result: exit-code) since Sun 2018-04-22 21:30:42 UTC; 17s ago
      Process: 21980 ExecStart=/usr/sbin/mysqld (code=exited, status=1/FAILURE)
      Process: 21964 ExecStartPre=/usr/share/mysql/mysql-systemd-start pre (code=exited, status=0/SUCCESS)
      Main PID: 21980 (code=exited, status=1/FAILURE); : 21988 (mysql-systemd-s)
      Tasks: 2
      Memory: 320.0K
      CPU: 278ms
      CGroup: /system.slice/mysql.service
              └─control
                ├─21988 /bin/bash /usr/share/mysql/mysql-systemd-start post
                └─22390 sleep 1

    Apr 22 21:30:42 my.*.net systemd[1]: Stopped MySQL Community Server.
    Apr 22 21:30:42 my.*.net systemd[1]: Starting MySQL Community Server...
    Apr 22 21:30:43 my.*.net systemd[1]: mysql.service: Main process exited, code=exited, status=1/FAILURE

    The unbound service is also not running, but it starts if I restart it.

    I don't know how other services are doing.
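
    For reference, one way to get a quick overview of the other services is to ask systemd for the state of each unit. The following is a minimal sketch, assuming Python 3 is available on the server; the helper names `systemctl_is_active` and `check_units` are hypothetical, and only the `mysql` and `unbound` unit names come from this thread.

```python
import subprocess


def systemctl_is_active(unit):
    """Return the state printed by `systemctl is-active <unit>`."""
    try:
        result = subprocess.run(
            ["systemctl", "is-active", unit],
            stdout=subprocess.PIPE,
            stderr=subprocess.DEVNULL,
            universal_newlines=True,
        )
    except OSError:
        # systemctl is missing or could not be executed.
        return "unknown"
    # The command exits non-zero for any state other than "active",
    # so the return code is deliberately not checked.
    return result.stdout.strip() or "unknown"


def check_units(units, probe=systemctl_is_active):
    """Map each unit name to its reported state.

    `probe` is injectable so the function can be exercised without systemd.
    """
    return {unit: probe(unit) for unit in units}
```

    Calling `check_units(["mysql", "unbound"])` on the server returns a dictionary of unit name to state; any further unit names would have to be added by hand.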

    This happened by itself, I was not doing anything with the server. I can of course wipe the server, reinstall and load a backup, but I would prefer not to have this happen again.

    I have now wiped the server and am restoring from backup, I can have it offline for longer.

    But I will leave this here for future reference.

    Thanks, T.



  • @tadeas It's hard to tell what is happening. Are you able to give us SSH access to the server? The SSH keys are here: https://cloudron.io/documentation/support/#remote-support

    Can you also tell us:

    1. Does the server have sufficient disk space? If it ran out of disk space, please use the procedure here: https://cloudron.io/documentation/server/#recovery-after-disk-full

    2. What version of Cloudron are you running?

    3. Can you paste the output of "ls -lh /var/lib/mysql/box"? If the eventlog is very large (on the order of gigabytes), that is probably the root of all the problems.
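
    Checks 1 and 3 can also be scripted. The following is a minimal sketch, assuming Python 3 is available on the server; the function names and the 1 GiB default threshold are illustrative assumptions, not part of Cloudron.

```python
import os
import shutil


def disk_usage_summary(path="/"):
    """Return (used_gib, total_gib, percent_used) for the filesystem holding `path`."""
    usage = shutil.disk_usage(path)
    gib = 1024 ** 3
    percent = 100.0 * usage.used / usage.total if usage.total else 0.0
    return usage.used / gib, usage.total / gib, percent


def large_files(directory, threshold_bytes=1024 ** 3):
    """Return (name, size_in_bytes) for regular files in `directory`
    whose size is at least `threshold_bytes`, largest first."""
    found = []
    for entry in os.scandir(directory):
        if entry.is_file(follow_symlinks=False):
            size = entry.stat(follow_symlinks=False).st_size
            if size >= threshold_bytes:
                found.append((entry.name, size))
    return sorted(found, key=lambda item: item[1], reverse=True)
```

    For example, `disk_usage_summary("/")` covers the disk space question, and `large_files("/var/lib/mysql/box")` (run as root) lists any file in that directory that has grown past the threshold.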

    Thanks,

    Girish



  • @girish Thanks for the response. I have wiped the server and done a restore, so I can't do what you suggest atm.

    There is/was plenty of disk space (42 GB used out of 490 GB) and RAM (1 GB used out of 6 GB).

    I was/am running 2.0.1.

    I'll check the eventlog size the next time the problem occurs and will report here.