server seems to be going to sleep and becomes inaccessible
-
I started to have the IPv6 email bounce again on one of my servers, so I did a bunch of (a priori harmless) things to configure IPv6 properly (I had disabled it all before so that the server would use IPv4):
- activated IPv6 on the network interface on the server
- activated IPv6 on Cloudron (Network > IPv6 > Public IP)
- added * and @ AAAA records for the domain
- resynced DNS from the Cloudron interface
All seemed good, except that after a few minutes the dashboard and all apps became inaccessible. I tried to SSH into the server and the connection timed out.
I went to the Netcup Server Control Panel (SCP) and rebooted the server, and it all came back up right away...but then it all went down again about 15 min later. I tried powering the server off and on, rebooting etc., and every time the same thing happens: it all comes back up right away (all services etc. are fine)...but then goes down again after a few minutes. Then I realised that from the General tab of the Netcup Server Control Panel I could access the server from there instead of via SSH. Weird things happened:
- From the SCP interface the whole server screen looks black (unlike on my other server, where you can see it's a terminal asking for a username)
- when opening the server, it is in graphical mode! I can see a graphical login screen (instead of the usual terminal one) that shows my username and lets me enter a password and log in to a session
- just opening the server at the login screen (without even logging in) brings the whole server, Cloudron and everything else back up.
Any clues??
Below is an excerpt of the Cloudron log from the point where it's working, then a freeze (18:25), then brought back up (22:23):
Feb 11 18:24:30 box:apphealthmonitor app health: 19 running / 3 stopped / 0 unresponsive
Feb 11 18:24:40 box:apphealthmonitor app health: 19 running / 3 stopped / 0 unresponsive
Feb 11 18:24:50 box:apphealthmonitor app health: 19 running / 3 stopped / 0 unresponsive
Feb 11 22:23:20 box:updatechecker checkAppUpdates: Error getting app update info for 1b0f9d03-8de9-46f0-90c2-06355401006e BoxError: getaddrinfo EAI_AGAIN api.cloudron.io
Feb 11 22:23:20 at Object.getAppUpdate (/home/yellowtent/box/src/appstore.js:267:22)
Feb 11 22:23:20 at process.processTicksAndRejections (node:internal/process/task_queues:95:5) {
Feb 11 22:23:20 reason: 'Network Error',
Feb 11 22:23:20 details: {},
Feb 11 22:23:20 nestedError: Error: getaddrinfo EAI_AGAIN api.cloudron.io
Feb 11 22:23:20 at GetAddrInfoReqWrap.onlookupall [as oncomplete] (node:dns:120:26) {
Feb 11 22:23:20 errno: -3001,
Feb 11 22:23:20 code: 'EAI_AGAIN',
Feb 11 22:23:20 syscall: 'getaddrinfo',
Feb 11 22:23:20 hostname: 'api.cloudron.io',
Feb 11 22:23:20 response: undefined
Feb 11 22:23:20 }
Feb 11 22:23:20 }
Feb 11 22:23:20 box:shell Running as unit: box-task-18590.service
Update: another thing that I did earlier that day (even before the whole IPv6 stuff) was to reboot the server from the Cloudron notifications because of updates...maybe the issue started then, but I don't know, as I went on to do the IPv6 changes after the reboot.
-
@avatar1024 oh no, that sounds very frustrating. I've no clue what you should do, but I just wanted to express solidarity and hope you manage to work it out ASAP!
-
Are you saying the server has gnome/xfce/kde? Maybe you can check if those packages are installed? If a desktop environment is installed with power management enabled, then your guess that the system goes to sleep would indeed be correct. Did you install anything on the server recently?
@avatar1024 said in server seems to be going to sleep and becomes inaccessible:
Feb 11 22:23:20 nestedError: Error: getaddrinfo EAI_AGAIN api.cloudron.io
This is a DNS resolution error. DNS is not working; test with:
host api.cloudron.io
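Both checks mentioned here can be scripted. The following is only a minimal sketch under stated assumptions: `dns_ok` and `sleep_events` are hypothetical helper names (not part of Cloudron), the server is assumed to run systemd with `journalctl` and `getent` available, and the grep pattern is deliberately broad because the exact wording of sleep-related journal messages varies between kernel and systemd versions.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of Cloudron.

# Succeeds if the name resolves through the system resolver
# (getent ahosts goes through getaddrinfo, the call that failed in the log).
dns_ok() {
  getent ahosts "$1" > /dev/null
}

# Reads journal text on stdin and keeps only lines that mention sleep states.
# Prints nothing (and still succeeds) when there is no match.
sleep_events() {
  grep -iE 'suspend|hibernate|sleep' || true
}

# Example usage on the server (left as comments so nothing runs on load):
#   dns_ok api.cloudron.io && echo "DNS ok" || echo "DNS failing"
#   journalctl --no-pager --since "2 days ago" | sleep_events
#   systemctl get-default   # graphical.target would indicate a desktop boot
```

If `sleep_events` prints entries whose timestamps line up with the outages, that would support the power-management theory; an empty result would point elsewhere.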
-
@jdaviescoates Thank you, much appreciated.
@joseph Many thanks for the reply.
@joseph said in server seems to be going to sleep and becomes inaccessible:
Did you install anything on the server recently?
Nope I didn't install anything manually (say via ssh or otherwise). The only thing I did was to reboot the server from the Cloudron interface to complete updates as per notifications (I did not run the update manually on the server).
@joseph said in server seems to be going to sleep and becomes inaccessible:
Are you saying the server has gnome/xfce/kde?
Yes it looks like it's got Gnome...
Can see it even in the thumbnails from the Netcup SCP interface: [screenshot of this server] vs. [screenshot] on another server.
@joseph said in server seems to be going to sleep and becomes inaccessible:
Maybe you can check if those packages are installed?
Seems like there are both Gnome and power management packages installed....which is freaking insane.
root@v2202108132182160313:~# dpkg -l gnome*
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name                                     Version                 Architecture Description
+++-========================================-=======================-============-===========================>
un  gnome-applets                            <none>                  <none>       (no description available)
un  gnome-backgrounds                        <none>                  <none>       (no description available)
un  gnome-bluetooth                          <none>                  <none>       (no description available)
ii  gnome-bluetooth-3-common                 46.0-1ubuntu1           all          GNOME Bluetooth 3 common fi>
ii  gnome-bluetooth-common                   3.34.5-13build3         all          GNOME Bluetooth common files
un  gnome-calendar                           <none>                  <none>       (no description available)
un  gnome-characters                         <none>                  <none>       (no description available)
ii  gnome-control-center                     1:41.7-0ubuntu0.22.04.9 amd64        utilities to configure the >
ii  gnome-control-center-data                1:41.7-0ubuntu0.22.04.9 all          configuration applets for G>
ii  gnome-control-center-faces               1:46.4-0ubuntu0.24.04.1 all          utilities to configure the >
ii  gnome-desktop3-data                      42.9-0ubuntu1           all          Common files for GNOME desk>
un  gnome-documents                          <none>                  <none>       (no description available)
un  gnome-icon-theme                         <none>                  <none>       (no description available)
un  gnome-icon-theme-symbolic                <none>                  <none>       (no description available)
un  gnome-initial-setup                      <none>                  <none>       (no description available)
ii  gnome-keyring                            40.0-3ubuntu3           amd64        GNOME keyring services (dae>
ii  gnome-keyring-pkcs11:amd64               40.0-3ubuntu3           amd64        GNOME keyring module for th>
un  gnome-maps                               <none>                  <none>       (no description available)
ii  gnome-menus                              3.36.0-1.1ubuntu3       amd64        GNOME implementation of the>
ii  gnome-online-accounts                    3.44.0-1ubuntu1         amd64        service to manage online ac>
un  gnome-packagekit                         <none>                  <none>       (no description available)
un  gnome-panel                              <none>                  <none>       (no description available)
ii  gnome-remote-desktop                     42.9-0ubuntu0.22.04.2   amd64        Remote desktop daemon for G>
un  gnome-session                            <none>                  <none>       (no description available)
ii  gnome-session-bin                        42.0-1ubuntu2           amd64        GNOME Session Manager - Min>
ii  gnome-session-common                     42.0-1ubuntu2           all          GNOME Session Manager - com>
un  gnome-session-flashback                  <none>                  <none>       (no description available)
ii  gnome-settings-daemon                    42.1-1ubuntu2.2         amd64        daemon handling the GNOME s>
ii  gnome-settings-daemon-common             42.1-1ubuntu2.2         all          daemon handling the GNOME s>
un  gnome-settings-daemon-schemas            <none>                  <none>       (no description available)
ii  gnome-shell                              42.9-0ubuntu2.2         amd64        graphical shell for the GNO>
ii  gnome-shell                              42.9-0ubuntu2.2         amd64        graphical shell for the GNO>
ii  gnome-shell-common                       42.9-0ubuntu2.2         all          common files for the GNOME >
un  gnome-shell-extension-appindicator       <none>                  <none>       (no description available)
un  gnome-shell-extension-autohidetopbar     <none>                  <none>       (no description available)
un  gnome-shell-extension-caffeine           <none>                  <none>       (no description available)
un  gnome-shell-extension-dash-to-panel      <none>                  <none>       (no description available)
un  gnome-shell-extension-dashtodock         <none>                  <none>       (no description available)
un  gnome-shell-extension-desktop-icons      <none>                  <none>       (no description available)
un  gnome-shell-extension-desktop-icons-ng   <none>                  <none>       (no description available)
un  gnome-shell-extension-multi-monitors     <none>                  <none>       (no description available)
un  gnome-shell-extension-pixelsaver         <none>                  <none>       (no description available)
un  gnome-shell-extension-prefs              <none>                  <none>       (no description available)
un  gnome-shell-extension-taskbar            <none>                  <none>       (no description available)
un  gnome-shell-extension-top-icons-plus     <none>                  <none>       (no description available)
un  gnome-shell-extension-ubuntu-dock        <none>                  <none>       (no description available)
un  gnome-shell-extension-workspaces-to-dock <none>                  <none>       (no description available)
un  gnome-shell-extensions                   <none>                  <none>       (no description available)
un  gnome-shell-pomodoro                     <none>                  <none>       (no description available)
un  gnome-software                           <none>                  <none>       (no description available)
un  gnome-sound-recorder                     <none>                  <none>       (no description available)
ii  gnome-startup-applications               42.0-1ubuntu2           amd64        Startup Applications manage>
un  gnome-terminal                           <none>                  <none>       (no description available)
un  gnome-themes-standard-data               <none>                  <none>       (no description available)
un  gnome-todo                               <none>                  <none>       (no description available)
ii  gnome-user-docs                          46.0-1ubuntu1           all          GNOME Help
un  gnome-user-share                         <none>                  <none>       (no description available)
un  gnome-weather                            <none>                  <none>       (no description available)
root@v2202108132182160313:~# dpkg -l power*
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name                  Version      Architecture Description
+++-=====================-============-============-===================================================
ii  power-profiles-daemon 0.10.1-3     amd64        Makes power profiles handling available over D-Bus.
ii  powermgmt-base        1.37         all          common utils for power management
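In these listings only the rows starting with `ii` are installed packages; `un` rows are names dpkg knows about that are not installed. A minimal sketch for filtering the output down to what is really on the disk, assuming a hypothetical helper name `installed_only`:

```shell
#!/bin/sh
# Sketch only: installed_only is a hypothetical helper name.
# Reads `dpkg -l` output on stdin and prints "name version" for every
# row whose state column is "ii" (desired: install, status: installed).
installed_only() {
  awk '$1 == "ii" { print $2, $3 }'
}

# Example usage (quoting the pattern stops the shell from expanding it
# against file names in the current directory):
#   dpkg -l 'gnome*' | installed_only
#   dpkg -l 'power*' | installed_only
```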
@joseph said in server seems to be going to sleep and becomes inaccessible:
This is a DNS resolution error. DNS is not working, test with host api.cloudron.io.
Right, so this seemed to happen as the server wakes up, and then it becomes fine (all apps pass their healthchecks). I did change the DNS config (i.e. added IPv6 as explained above), so maybe propagation wasn't complete yet. Anyway, I can't see how any of this would be related to the server suddenly having a DE.
I left the logs open all night, and while I can see a pause between 23:37 and 8:30, there are no more DNS errors and the server has remained up since 8:30. So there is progress. However, the Cloudron Dashboard says a backup was completed at 01:10, but I cannot see that in the log...?
Feb 11 23:37:00 box:apphealthmonitor app health: 19 running / 3 stopped / 0 unresponsive
Feb 11 23:37:10 box:apphealthmonitor app health: 19 running / 3 stopped / 0 unresponsive
Feb 11 23:37:20 box:apphealthmonitor app health: 19 running / 3 stopped / 0 unresponsive
Feb 12 08:30:30 box:apphealthmonitor app health: 19 running / 3 stopped / 0 unresponsive
Feb 12 08:30:40 box:apphealthmonitor app health: 19 running / 3 stopped / 0 unresponsive
Feb 12 08:30:50 box:apphealthmonitor app health: 19 running / 3 stopped / 0 unresponsive
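Since the health monitor logs every 10 seconds, any longer pause between consecutive lines marks a period when the box was not running. A rough sketch for spotting such pauses automatically, assuming GNU `date` is available, that each line starts with a 15-character `Mon DD HH:MM:SS` timestamp as in the excerpt, and that the log does not cross a year boundary; `find_log_gaps` is a hypothetical name:

```shell
#!/bin/sh
# Sketch only: find_log_gaps is a hypothetical helper name.
# Reads log lines on stdin and reports every pause longer than the
# threshold in seconds (first argument, default 60).
find_log_gaps() {
  local threshold prev_ts prev_stamp line stamp ts
  threshold="${1:-60}"
  prev_ts=""
  prev_stamp=""
  while IFS= read -r line; do
    # The timestamp is the first 15 characters, e.g. "Feb 11 23:37:20".
    stamp=$(printf '%s' "$line" | cut -c1-15)
    [ -n "$stamp" ] || continue
    # Lines that do not start with a parseable timestamp are skipped.
    ts=$(date -u -d "$stamp" +%s 2>/dev/null) || continue
    if [ -n "$prev_ts" ] && [ $((ts - prev_ts)) -gt "$threshold" ]; then
      echo "gap of $((ts - prev_ts))s between '$prev_stamp' and '$stamp'"
    fi
    prev_ts=$ts
    prev_stamp=$stamp
  done
}

# Example usage, with the log saved to a file of your choosing:
#   find_log_gaps 60 < box.log
```

Comparing the reported gaps with the times the server was unreachable shows whether the box process itself stopped or only the network did.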
-
The browser uses the local timezone and the server uses UTC. You might have to convert between the two.
I recommend uninstalling gnome and friends altogether. I have never done this, but I expect it won't be easy, since uninstalling a DE is probably not supported. The easiest way is to set up a new server and migrate using Cloudron backups.
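As a stopgap until a migration is possible, systemd's sleep targets can be masked so that nothing can suspend the machine. This is not the fix recommended in this thread, only a sketch under assumptions: the server uses systemd, the commands are run as root, and `run` and `disable_sleep_and_gui` are hypothetical helper names. Setting `DRY_RUN=1` prints the commands instead of executing them.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical. Run as root.

# With DRY_RUN set, print the command instead of executing it.
run() {
  if [ -n "${DRY_RUN:-}" ]; then
    echo "$*"
  else
    "$@"
  fi
}

disable_sleep_and_gui() {
  # Masked targets cannot be started, so suspend/hibernate requests fail.
  run systemctl mask sleep.target suspend.target hibernate.target hybrid-sleep.target
  # Boot to a text console instead of the graphical login on next boot.
  run systemctl set-default multi-user.target
}

# Example usage:
#   DRY_RUN=1 disable_sleep_and_gui   # preview
#   disable_sleep_and_gui             # apply
```

This leaves the desktop packages installed, so it does not replace a clean server; `systemctl unmask` on the same targets reverts the first step.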
-
@joseph said in server seems to be going to sleep and becomes inaccessible:
The browser uses the local timezone and the server uses UTC. You might have to convert between the two.
I'm in the UK so I believe at this time of year UTC = local time?
@joseph said in server seems to be going to sleep and becomes inaccessible:
I recommend uninstalling gnome and friends altogether. I have never done this, but I expect it won't be easy, since uninstalling a DE is probably not supported. The easiest way is to set up a new server and migrate using Cloudron backups.
Yes, starting afresh seems the most sensible...but I really cannot understand what has happened (I am 100% positive I haven't installed packages manually). I wonder if this had something to do with Netcup. Also, since this morning the server seems kinda stable...a bit worrying though.
Many thanks for your help.
Update: I was able to track down the date when those packages were installed...and it was when I upgraded to Ubuntu 24.04 on 1st Jan 2025. I did that on three other servers without a DE being installed, so I guess something must have gone wrong when doing this one. But it means the issue I encountered yesterday was somehow due to something else. Maybe the DNS changes meant the server somehow had network issues and was left with no activity for a while, which sent it to sleep...I don't know.
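The post does not say how the install date was found. One possible way, sketched here as an assumption, is to search dpkg's own log, where each fresh installation is recorded as a line of the form `date time install package:arch old-version new-version`. `install_dates` is a hypothetical helper name, and the log path is the Debian/Ubuntu default.

```shell
#!/bin/sh
# Sketch only: install_dates is a hypothetical helper name.
# Reads dpkg.log lines on stdin and prints date, time and package for
# every "install" action whose package name matches the given regex.
install_dates() {
  awk -v pat="$1" '$3 == "install" && $4 ~ pat { print $1, $2, $4 }'
}

# Example usage (zcat -f passes uncompressed files through unchanged,
# so current and rotated logs can be read in one go):
#   zcat -f /var/log/dpkg.log* | install_dates '^gnome-shell'
```

Older entries may already have been rotated away, so an empty result does not prove a package was never installed.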
-