Disk space should never bring a whole server down
-
I just cannot get my head around how the disk can be allowed to fill to the point of a total system failure.
Slowdown, sure, I understand - but a total failure? I can't see why this isn't preventable.
Is loading it up with data really all it takes to bring a Cloudron down?
There are a bunch of apps that allow uploads; it really wouldn't take much effort to flood those with a few GB.
-
@nebulon @girish Apps have memory & CPU allocations - any reason they can't have disk-space allocations too?
I'd rather a single app hit a wall than an entire server.
It seems all one would have to do to bring a whole Cloudron down this way is send a lot of email attachments to the point of disk saturation.
Maybe I'm wrong and it's something else - but feel free to delete this post and move to email if it's a reproducible risk.
-
@marcusquinn said in Disk space should never bring a whole server down:
@nebulon @girish Apps have memory & CPU allocations - any reason they can't have disk-space allocations too?
Yes, the memory & CPU allocations are features of Linux kernel cgroups. However, disk space allocation is not part of them.
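For illustration (not necessarily how Cloudron wires this up internally), this is roughly how those limits surface at the Docker level. The disk cap is the odd one out because it needs storage-driver support (e.g. overlay2 on xfs with project quotas) rather than cgroups; the image name here is hypothetical:
docker run --memory=512m --cpus=1.0 myapp # memory/CPU caps, enforced via cgroups on any setup
docker run --storage-opt size=10G myapp # disk cap, only works on quota-capable storage drivers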
I guess the issue at least to handle right now is that, for some reason, disk space is full. Running
docker image prune -a
sometimes frees up some disk space. Can you try that? Alternatively, if you drop me a mail at support, I can look into the server.
-
@girish said in Disk space should never bring a whole server down:
docker image prune -a
OK, thanks, tried that: "Total reclaimed space: 816.4MB"
Still no Respondio though. I've emailed support@, but it's 2am here with an early start, so I'll be back online in 8h or so, by which time it'll be your 2am - and I appreciate it's Saturday, so I'm just grateful for pointers and hoping some of my other requests for assistance might be waking up soon too.
-
I'm wondering if maybe Cloudron should have its own volume by default.
A quick search on the subject, though I'm kinda tired now:
- https://www.reddit.com/r/docker/comments/loleal/how_to_limit_disk_space_for_a_docker_container/
- https://guide.blazemeter.com/hc/en-us/articles/115003812129-Overcoming-Container-Storage-Limitation-Overcoming-Container-Storage-Limitation#:~:text=In the current Docker version,be left in the container
- https://stackoverflow.com/questions/38542426/docker-container-specific-disk-quota
-
@marcusquinn Managed to bring it up by truncating many logs. Should be coming up in a bit, hold on.
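For reference, and not necessarily the exact commands used here: truncating empties a log in place, so processes that still hold the file open keep working, unlike deleting it. A minimal sketch with a hypothetical path:
truncate -s 0 /var/log/myapp.log # empty the file in place; safer than rm while it's open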
-
@girish Ahhhh - thank you kindly!
I have an unused 1TB volume mounted, although I'm not sure how much of the main volume's remaining free space the Move function uses as working space - I guess that's what was killing it when I triggered the move of the 16GB Jira app data to it?
-
@girish said in Disk space should never bring a whole server down:
Managed to bring it up by truncating many logs
Is this perhaps related to the issue I reported a little while back, regarding logrotate not running properly under certain circumstances?
-
Going to trigger a move of Confluence to the mounted volume - it's 4.5GB, with 7.5GB free now on the main volume - so hopefully that's enough working space. But I have to zzz; problems I don't immediately know how to solve are kinda exhausting.
-
@marcusquinn looks like things are back up! There is ~7GB left, so hopefully that should hold up for some time.
-
I am looking into some clues on what can be done to mitigate this and will report back. BTW, the volume suggestion is possible - in fact, we used to do this very long ago, with each app having its own btrfs partition. But usually people start with a simple VPS, which means that for this to work out of the box one has to create a loopback filesystem, which is very slow. Also, when I logged in to your server, it was MySQL that was down - it was not happy with the lack of disk space.
I am wondering if the solution involves suggesting a specific kind of setup to users who want to protect themselves against this kind of issue. That is totally doable (for example, suggest the user move platformdata and boxdata to a separate volume/disk post-installation).
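For anyone curious, the loopback approach is roughly this - a minimal sketch with hypothetical paths and sizes; the backing file gives the app a hard cap, at the cost of the extra (slow) filesystem layer mentioned above:
fallocate -l 5G /var/lib/appdata.img # preallocate a 5GB backing file
mkfs.ext4 -F /var/lib/appdata.img # put a filesystem inside it (-F because it's a file, not a block device)
mount -o loop /var/lib/appdata.img /mnt/appdata # data under /mnt/appdata can never exceed 5GB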
-
@robi We actually have a disk space alert, in fact, it's there right now in the dashboard.
But the above is not super useful because it's just checking space in a cronjob. This cronjob is quite conservative because we don't want to keep spinning the disk too much. I am not aware of a way to get a "signal" from the server when disk space limits are hit. If a server fills up too fast between cron runs, the whole thing is useless...
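The cron check is essentially this sort of thing - a minimal sketch assuming a 90% threshold and a hypothetical alert address, not the actual Cloudron code:
#!/bin/sh
USAGE=$(df --output=pcent / | tail -1 | tr -dc '0-9') # root filesystem usage as a bare number
if [ "$USAGE" -ge 90 ]; then
  echo "Disk usage at ${USAGE}% on $(hostname)" | mail -s "disk alert" admin@example.com
fi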
-
I've triggered some bigger app data moves to the mounted 1TB volume but it seems to have chewed through 3GB of the remaining free space on the main volume already and I'm back to "Cloudron is offline. Reconnecting". Probably just making hasty tiredness errors now.
-
@marcusquinn maybe it's best to move them by hand first. Can you send me the apps you want to move by email? I can move them by hand, since this seems to keep hitting a wall, i.e. free space -> try to move -> run out of space and start over...
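The by-hand move is essentially a copy to the new volume while the app is stopped - a rough sketch with hypothetical paths, not the exact procedure:
rsync -a /var/lib/appdata/ /mnt/volume1/appdata/ # copy preserving ownership and permissions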
-
@girish yes, but does it email you when approaching the threshold?
threshold setting? (twice a day should be plenty)
action setting checkboxes? (maybe a custom one too?)
Heck, even deleting a non-critical app would be fine, since it's restorable from backup.
-
@marcusquinn Hang in there. Good luck.
-
WHM has per-account disk space limits. Is it possible to copy their method and have it implemented in Cloudron?
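For context, WHM/cPanel-style caps generally ride on standard Linux filesystem quotas - a minimal sketch assuming each app ran as its own Unix user (hypothetical name) on a filesystem mounted with usrquota:
setquota -u app-user 9000000 10000000 0 0 / # ~10GB block soft/hard limits (1KB blocks), no inode limits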
-
Thanks for all the help - I managed to get some extra hands on deck this morning and we're moving lots of data to a mounted volume for much more headroom.
I still think it's a little too vulnerable that this hazard can bring a whole server down.
Also, I couldn't see whether there's a way to set email storage to be on a mounted volume too?