VPS down, restart button won't work, host denies any downtime
About two days ago, my VPS was unreachable for well over 1.5 hours. I hopped on my VPS dashboard and tried to reset the server, but I got an error saying that wasn't possible (not the exact message, but along those lines). I couldn't stop, start, or reboot the VPS. The server came back up by itself and life went on.
I emailed Contabo support, and they said "we did not record any technical issue both on the v-host where your server is located and in the same data center". I suspect it might be Cloudron related because I manually updated Invoice Ninja right before the server went down (literally, the red "Cloudron is offline" banner came up just as the update was finishing).
- Is it possible to find out if it's Cloudron related or a hosting issue?
- Could a Cloudron related bug/crash stop Contabo's reboot button from working?
- If so, how could the server come back up by itself then?
Edit: My email server resides on this VPS and I wanted to know if I'd lose any emails sent to me, so I emailed myself from my Proton address during the downtime. Oddly enough, it didn't bounce back, but it wasn't in my inbox when the server came back up either. It did show up half an hour after the server was online.
- The Invoice Ninja update had finished before the downtime (confirmed it after server was up).
- I could not SSH into the server
- I ran the MTR and trace tests that Contabo requires before contacting support (please let me know if you need those).
TBH, there is no way to truly know where things went wrong unless your infra provider is willing to work closely with you on this. For example, if my Ubuntu VM is down in DigitalOcean, there is no way for the Ubuntu VM to know by itself why it is down.
So this needs the infra provider to tell you the status of the VM (from the hypervisor tools) and what is happening while the server is down. Some infra providers make this easier: DigitalOcean has a console view which gives you the serial console, so the boot logs are available. If the server is not even booting, we already know it's an infra issue. Or maybe the boot is getting stuck in an fsck or something.
Does Contabo provide tools to debug this?
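One thing you can check from inside the VM once it is back: whether it actually rebooted during the outage. Below is a rough Python sketch, assuming a Linux guest with `/proc/uptime`. The outage start time is a made-up placeholder, and `boot_time`, `read_uptime_seconds` and `booted_after` are just illustrative helper names. If the current boot began after the outage started, something restarted the VM. If not, the guest kept running and was likely just unreachable. Neither result tells you why, so treat it as a hint only.

```python
from datetime import datetime, timedelta, timezone


def boot_time(uptime_seconds: float, now: datetime) -> datetime:
    """Return the moment the kernel booted, given seconds of uptime."""
    return now - timedelta(seconds=uptime_seconds)


def read_uptime_seconds(path: str = "/proc/uptime") -> float:
    """Read the first field of /proc/uptime (Linux only)."""
    with open(path) as f:
        return float(f.read().split()[0])


def booted_after(outage_start: datetime, uptime_seconds: float, now: datetime) -> bool:
    """True if the current boot began at or after the start of the outage."""
    return boot_time(uptime_seconds, now) >= outage_start


if __name__ == "__main__":
    # Hypothetical placeholder: replace with when your monitoring first failed.
    outage_start = datetime(2023, 1, 1, 12, 0, tzinfo=timezone.utc)
    now = datetime.now(timezone.utc)
    up = read_uptime_seconds()
    print("booted at:", boot_time(up, now).isoformat())
    print("rebooted since outage began:", booted_after(outage_start, up, now))
```

On a systemd machine, `journalctl --list-boots` shows the same kind of information with more history.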
@girish said in VPS down, restart button won't work, host denies any downtime - Cloudron update related?:
Does contabo provide tools to debug this?
I'm not sure if they do but here's a screenshot of the dashboard. It took two replies to get them to say that they detected no issues with my server. Their first response was,
"We have checked your VPS L NVMe and we can inform you that your VPS is up and running again. Therefore we would kindly ask you to check the situation also from your side."
Spring cleaning is up. Time to move hosts again or, better yet, spin up a server at work. I'll keep a cheap VPS for mail only. Can I run a mail server on Cloudron's free tier?
Edit: the "I can't connect to this server" option opens a popup that asks how you're trying to connect (RDP, SSH, or VNC) and then instructs you to run the MTR / trace tests and attach the results to your support request (not that they looked at them).
robi last edited by
@humptydumpty said in VPS down, restart button won't work, host denies any downtime - Cloudron update related?:
Can I run a mail server on Cloudron's free tier?
@robi perfect thanks
My VPS is down again and there's nothing on Contabo's server status page that indicates any issues at all. It's not just my VPS though, their dashboard is down too.
Edit: I stand corrected, the status page has been updated and is showing a downtime in my datacenter.
Estimated duration: 180 minutes We are experiencing a disruption in St. Louis Data Center. Once we will find out the reason, we will provide an update and provide an ETA. We’re sorry for the inconvenience. Next update 19:30 CET.
@girish As this isn't Cloudron related, can we please move this back to off-topic?
VPS is up. Total downtime for me was 1.5 hrs. Magic number!
[UPDATE 28.03.2023 07:30 pm (UTC+2)] We have identified the reason behind the disruption in our St. Louis Data Center which is affecting some of our servers. Our technicians on-site already working to fix the issue. The next update will be provided in 60 minutes. (so until 8:30 CEST)
[UPDATE 28.03.2023 07:43 pm (UTC+2)] We have fixed the problem causing disruption in the St. Louis Data Center which is affecting some of our servers. We expect to bring all servers online in the next 30 minutes (so until 8:10 CEST).
scooke last edited by
@humptydumpty Get out while you can!
Kimsufi KS-1's are still available from time to time. I use https://checkservers.ovh/ to check, plus the random "oh my goodness everyone get to ovh! ks-1's are available NOW" posts from lowendtalk.com (only to see 100 posts saying "sold out" five minutes later).
@scooke I definitely will! I decided to move when the previous downtime happened because of how they handle things, but it's on the back burner until I have some free time. I didn't expect the VPS to go down again this soon. Sigh. I'll keep an eye out for the OVH deals, thanks!