    Disk space should never bring a whole server down

    • girish
      girish Staff last edited by

      I am looking into some clues on what can be done to mitigate this and will report back. BTW, the volume suggestion is possible. In fact, we used to do this a long time ago, with each app having its own btrfs partition. Usually, people start with a simple VPS, which means that for this to work out of the box one has to create a loopback file system, which is very slow. Also, when I logged in to your server, it was MySQL that was down; it was not happy with the lack of disk space.

      I am wondering if the solution involves suggesting that users adopt a specific kind of setup if they want to protect themselves against this kind of issue. That is totally doable (for example, suggest that the user move platformdata and boxdata to a separate volume/disk post-installation).

      • girish
        girish Staff @robi last edited by

        @robi We actually have a disk space alert, in fact, it's there right now in the dashboard.

        [Image: disk space alert in the Cloudron dashboard]

        But the above is not super useful because it's just checking space in a cronjob. This cronjob is quite conservative because we don't want to keep spinning the disk too much. I am not aware of a way to get a "signal" from the server when disk space limits are hit. If a server fills up too fast between cron runs, the whole thing is useless...
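A minimal sketch of the kind of periodic check described above, assuming a POSIX shell and `df -P`. The function names `disk_used_percent` and `disk_over_threshold` and the default of 90 are illustrative assumptions, not Cloudron's actual implementation.

```shell
# Print the used-space percentage (integer, without the % sign) of the
# filesystem that holds the given path. Relies on the fixed column layout
# of "df -P": the 5th column of the 2nd line is the capacity.
disk_used_percent() {
    df -P "$1" | awk 'NR == 2 { gsub(/%/, "", $5); print $5 }'
}

# Return success (0) when usage is at or above the threshold.
# The default of 90 is an arbitrary placeholder, not a Cloudron setting.
disk_over_threshold() {
    check_path=$1
    check_threshold=${2:-90}
    check_used=$(disk_used_percent "$check_path")
    [ "$check_used" -ge "$check_threshold" ]
}
```

A cron job could call `disk_over_threshold / 85 && echo "disk nearly full"`. As the post notes, anything polled this way can still miss a disk that fills up between two runs.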

        • marcusquinn
          marcusquinn last edited by marcusquinn

          I've triggered some bigger app data moves to the mounted 1TB volume but it seems to have chewed through 3GB of the remaining free space on the main volume already and I'm back to "Cloudron is offline. Reconnecting". Probably just making hasty tiredness errors now.

          We're not here for a long time - but we are here for a good time :)
          Jersey/UK
          Work & Ecommerce Advice: https://brandlight.org
          Personal & Software Tips: https://marcusquinn.com

          • girish
            girish Staff @marcusquinn last edited by

            @marcusquinn maybe it's best to move them by hand first. Can you send me the apps you want to move by email, and I can move them by hand, since this seems to keep hitting a wall, i.e. free space -> try to free space -> run out of space and start over...

            • robi
              robi @girish last edited by

              @girish yes, but does it email you when approaching the threshold?

              threshold setting? (twice a day should be plenty)

              action setting checkboxes? (maybe a custom one too?)

              Heck, even deleting a non-critical app would be fine, since it's restorable from backup.

              Life of Advanced Technology

              • scooke
                scooke @marcusquinn last edited by

                @marcusquinn Hang in there @marcusquinn. Bon courage.

                A life lived in fear is a life half-lived

                • humptydumpty
                  humptydumpty last edited by

                  WHM has disk space limitations. Is it possible to copy their method and have it implemented in CR?

                  • marcusquinn
                    marcusquinn last edited by

                    Thanks for all the help - I managed to get some extra hands on deck this morning and we're moving lots of data to a mounted volume for much more headroom.

                    I still think it's a little too vulnerable having this hazard able to bring a server down.

                    Also, I couldn't see if there's a way to set Email storage to be a mounted volume too?

                    • mehdi
                      mehdi App Dev @girish last edited by

                      @girish Also, the current warning is IMO not very useful if the threshold is not configurable. Depending on how the server is used, a few GB may be enough for weeks, or for mere hours if there's media stuff on the server, or if a user uploads stuff on nextcloud or something.
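To illustrate why a fixed threshold says little on its own, here is a rough sketch that estimates time until full from two free-space readings. The function names are hypothetical, and the linear extrapolation is an assumption; real usage is rarely this regular.

```shell
# Print the available space in KiB for the filesystem holding the given path.
disk_avail_kb() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Given two free-space readings in KiB taken interval_s seconds apart,
# print the estimated number of seconds until the disk is full, or
# "stable" when free space is not shrinking. Integer arithmetic only.
seconds_until_full() {
    earlier_kb=$1
    later_kb=$2
    interval_s=$3
    consumed_kb=$((earlier_kb - later_kb))
    if [ "$consumed_kb" -le 0 ]; then
        echo stable
    else
        echo $((later_kb * interval_s / consumed_kb))
    fi
}
```

For example, `seconds_until_full 10485760 9437184 3600` (10 GiB free, then 9 GiB free an hour later) prints `32400`, i.e. about nine hours of headroom. The same 9 GiB would last far longer on a server that grows by a few MB per day.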

                      • d19dotca
                        d19dotca @marcusquinn last edited by d19dotca

                        @marcusquinn said in Disk space should never bring a whole server down:

                        Also, I couldn't see if there's a way to set Email storage to be a mounted volume too?

                        Currently, emails are part of boxdata and you need to move the boxdata entirely. I’ve done this in my current server due to the amount of email stored for my clients. The steps for this are at https://docs.cloudron.io/storage/#default-data-directory for reference.

                        I’m making an assumption by volume you meant an external disk vs the actual Volumes function that Cloudron has.

                        There is a feature request, I believe, to keep emails separate, but boxdata really doesn't contain much data at all other than emails, so it's doable as-is for now. It'd just be nice to see the GUI handle moving the email data much like it does for apps.

                        --
                        Dustin Dauncey
                        www.d19.ca

                        • marcusquinn
                          marcusquinn @d19dotca last edited by

                          @d19dotca Thanks. I'm an app specialist and anything more than a few minutes digging in the dirt is my kinda hell. Just getting brain fog now as I've lost a bunch of important work and 2 days of progress on it now 😞

                          • marcusquinn
                            marcusquinn last edited by

                            Anyone know where /app/data actually is in the full file system structure?

                            I'm trying to navigate a snapshot clone to see if that has the missing config.php file that hasn't come back for EspoCRM but just not seeing anything obvious and searching docs hasn't found me the clue.

                            • marcusquinn
                              marcusquinn last edited by

                              The problem I have is that EspoCRM Administration writes changes back to /app/data/data/config.php - however, that file also contains all the database connection details, password hash, basically everything for that instance to work.

                              So when the disk was full, it seems to have somehow written a 0 KB version of config.php.

                              And because the rsync encryption is failing to back up EspoCRM, the Cloudron backups aren't complete.

                              So that leaves restoring a provider backup snapshot and digging around.

                              Basically, whatever anyone does - never allow the disk to get full - the cascade of problems that can happen from that interruption is just one massive time hole.
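After a disk-full incident like the one described, a quick way to look for other truncated files is to list empty regular files under a directory. This is only a sketch: `find_empty_files` is a hypothetical helper, and which directory to scan depends on where your app data actually lives.

```shell
# List regular files of size zero under the given directory, staying on
# one filesystem (-xdev). Empty files are sometimes legitimate, so treat
# the output as candidates to inspect, not as a list of corrupted files.
find_empty_files() {
    find "$1" -xdev -type f -size 0 -print
}
```

Comparing the result against a known-good snapshot helps separate files that were always empty from ones that were zeroed when the write failed.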

                              • marcusquinn
                                marcusquinn @marcusquinn last edited by

                                @marcusquinn Holy sh*t, with some dumb-luck trying everything I know, I seem to have fixed it.

                                Lesson learnt: never run out of disk space. Sod's law says it will be the apps you rely on the most that get corrupted.

                                Now, given the many open ways to load up a Cloudron with data (email/FilePizza/PrivateBin) maybe there's a way to avoid this causing a total fail?

                                • jdaviescoates
                                  jdaviescoates @marcusquinn last edited by

                                  @marcusquinn said in Disk space should never bring a whole server down:

                                  Now, given the many open ways to load up a Cloudron with data (email/FilePizza/PrivateBin) maybe there's a way to avoid this causing a total fail?

                                  I think FilePizza is fully P2P, so I'm not sure you could fill the server up with that (but you could with Jirafeau).

                                  But yeah, I reckon configurable disk space notifications (e.g. email/notify me hourly/daily/whatever once I've only got x space left) would be a good first step to help prevent this.

                                  I use Cloudron with Gandi & Hetzner

                                  • marcusquinn
                                    marcusquinn last edited by

                                    Quick fix idea: maybe 70% full is a better nag threshold?

                                    • bestknownhost
                                      bestknownhost last edited by

                                      Thanks for all the feedback here. We discovered Cloudron a while back and have been testing it on a number of servers over the last couple of months. We wanted to get a good handle on how everything works before rolling anything out into production. Firstly, it's an excellent platform and fills a great need.

                                      But we did run into a little problem with one of our test servers running on a DigitalOcean droplet. About 2 weeks ago it went from using 20 GB of space to nearly 80 GB in the space of 4 hours. We received an alert from DigitalOcean, but things were happening so fast that all we could initially do was upgrade the instance. This gave us half an hour, and then we had to do it again; then we just attached a 100 GB volume.

                                      Although it was just in testing, there was a WordPress app we were fond of, so we transferred it off the Cloudron and left a Pixelfed app. Somewhere between shutting down the server to add the volume and moving the WordPress app, the space usage stopped increasing. I know what you're thinking: WordPress, right? No, we checked the install beforehand and it was working fine on another server. We then removed the 100 GB volume and resized the DigitalOcean server back to its original size, and everything was back to normal. I figured that some server updates ran that morning and some out-of-control process started this, and resizing the server up and down somehow got rid of the problem.

                                      • robi
                                        robi @bestknownhost last edited by

                                        @bestknownhost Did you perhaps have AdGuard installed?

                                        • bestknownhost
                                          bestknownhost @robi last edited by

                                          @robi No we didn't.

                                          • robi
                                            robi @bestknownhost last edited by

                                            @bestknownhost did you figure out what was filling up the disk with du -sh /* and drilling down?
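The drill-down suggested here can be sketched as a small helper that sorts the output of `du`. The name `largest_dirs` and its defaults are assumptions for illustration; `-d 1` and `-x` are supported by GNU and BSD `du`.

```shell
# Show the largest immediate children of a directory, biggest first, in KiB.
# -x keeps du on one filesystem, so separately mounted volumes are not
# counted. The first output line is the total for the directory itself.
largest_dirs() {
    top_dir=${1:-/}
    top_count=${2:-10}
    du -xk -d 1 "$top_dir" 2>/dev/null | sort -rn | head -n "$top_count"
}
```

Run `largest_dirs /`, then call it again on whichever child is largest, and repeat until the culprit shows up.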

                                            • nebulon
                                              nebulon Staff @bestknownhost last edited by

                                              @bestknownhost To start, can you clarify: are you using external backup storage or just the local disk for now? Using the local disk may cause disk usage to go up quickly depending on how much data you've put into the server.

                                              If that is not the case, then you may have hit some issue we recently saw with mysql binlogs https://forum.cloudron.io/topic/4510/able-to-clean-up-binlog-files-in-var-lib-mysql-directory?_=1616402616926 ?

                                              And as @robi mentioned, do you have any idea so far what is using all that disk space?

                                              • DigitEgal
                                                DigitEgal @marcusquinn last edited by DigitEgal

                                                @marcusquinn I ran into a similar issue while testing some stuff, most likely because of the Nextcloud plugin "External Sites". I am not sure right now, but I don't think it recreates the files; more likely it writes a lot of logs, since the CPU got pushed as well.

                                                (THIS IS NOT A TUTORIAL! IT IS ONLY FOR REPRODUCING THE BUG!)
                                                How to reproduce the issue:
                                                1. Create a Nextcloud instance and share a folder (structure) via a public link.
                                                2. Insert this link into any secondary website (WordPress etc.) as a button that does NOT open a new tab.
                                                3. Add the plugin "External Sites" to Nextcloud, go to its config and add the secondary website.
                                                4. When External Sites uses embed mode, the issue can be triggered by a user with access to the External Sites buttons.
                                                4.1 I have not actually tested this with a non-admin user as the "trigger" user.

                                                How to finally trigger the filling of disk space:
                                                -> Follow the link in Nextcloud to your secondary website.
                                                -> Click the button that leads back into Nextcloud; this triggers the issue.

                                                (THIS IS NOT A TUTORIAL! IT IS ONLY FOR REPRODUCING THE BUG!)

                                                • makemrproper
                                                  makemrproper last edited by makemrproper

                                                  Here is my SOLUTION:

                                                  It does not solve the root cause why you are running out of space, but with this methodology you will buy yourself time.

                                                  Generate 3 files of 2 gigabytes each.
                                                  This is one way of generating these files:
                                                  fallocate -l 2G /storage-padding-buffer-2-gb-file1.img
                                                  fallocate -l 2G /storage-padding-buffer-2-gb-file2.img
                                                  fallocate -l 2G /storage-padding-buffer-2-gb-file3.img

                                                  When your server is out of storage, you can delete one or all of these padding files to regain the space you need to rescue the server.
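The padding-file idea above can be wrapped in two small helpers. This is a sketch under assumptions: the function names, the file-name prefix and the `dd` fallback are mine, not part of the original post. `fallocate` reserves real blocks, whereas a sparse file (for example from `truncate`) would not hold any space back.

```shell
# Create <count> padding files of <size_mib> MiB each in <dir>.
# Falls back to dd when fallocate is missing or unsupported on the filesystem.
create_padding() {
    pad_dir=$1
    pad_mib=${2:-2048}
    pad_count=${3:-3}
    pad_i=1
    while [ "$pad_i" -le "$pad_count" ]; do
        pad_file="$pad_dir/storage-padding-buffer-file$pad_i.img"
        [ -e "$pad_file" ] ||
            fallocate -l "${pad_mib}MiB" "$pad_file" 2>/dev/null ||
            dd if=/dev/zero of="$pad_file" bs=1048576 count="$pad_mib" 2>/dev/null ||
            return 1
        pad_i=$((pad_i + 1))
    done
}

# Delete one remaining padding file to free space in an emergency and
# print its name. Returns 1 when no padding file is left.
release_padding() {
    for pad_file in "$1"/storage-padding-buffer-file*.img; do
        [ -e "$pad_file" ] || continue
        rm -f -- "$pad_file" && echo "$pad_file" && return 0
    done
    return 1
}
```

`create_padding / 2048 3` reproduces the three 2 GB files from the post, and `release_padding /` frees them one at a time. Releasing only one file at a time leaves some buffer in reserve in case the disk fills up again before the root cause is found.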

                                                  I have had the same issue with cloudron, because over time, storage will run out.
                                                  For now I chose not to update the storage of my VPS server because it will double my hosting cost for this node, from USD400 to USD800 per year. That's digital ocean pricing for you, but I digress.

                                                  This is a systems engineering issue and isn't caused by Cloudron. However I would not have anything against an elegant solution from the team if it were possible :).

                                                  I want to say I am working on a post describing how I went through a massive cleanup, and exactly which steps I took to regain loads of space. TL;DR: use ncdu, analyze all containers, identify where apps are storing logs and rotate these, and clear the NPM package cache in each container. More to come.

                                                  • marcusquinn
                                                    marcusquinn last edited by

                                                    Maybe the Cloudron app needs to generate its own partition to run from, where regular app storage can't saturate the OS or Cloudron partitions?

                                                    • girish
                                                      girish Staff @marcusquinn last edited by

                                                      @marcusquinn said in Disk space should never bring a whole server down:

                                                      Maybe the Cloudron app needs to generate its own partition to run from, where regular app storage can't saturate the OS or Cloudron partitions?

                                                      Right. The main issue is that it's not possible to create proper disk partitions on a VPS 😕 i.e. one can only create file-backed loopback file systems, but such things are not meant to be used in production and I have no idea about their reliability/durability.

                                                      • makemrproper
                                                        makemrproper @girish last edited by makemrproper

                                                        @girish This is still a huge problem.

                                                        My production server has failing applications again due to disk space filling up. Luckily, DigitalOcean's backup functionality saved my setup again this time. If I had relied only on Cloudron for backups, it would have been disaster time.

                                                        I am not asking you for help to fix it or to blame anyone. But this needs an engineered solution by upstream, you guys.

                                                        You could for instance recommend that we make use of separate storage volumes on the system drive. This brings the TCO cost down, for storage. Recommendations and verified testing from you would be valuable for us as customers.

                                                        I can imagine hundreds of other customers of yours that are seeing the same issue.

                                                        Also:
                                                        Your assertion above that disk space is cheap is true for physical drive storage, but not for VPS server storage. For me to double disk space from 80gb to 160gb it also doubles my yearly VPS cost. I would believe a sizable portion of your users are hosting Cloudron on a VPS.
                                                        Of course you could temporarily invalidate this problem by recommending 160GB storage capacity. This might alienate some potential users.

                                                        Now onwards to repair my Cloudron install and apps!

                                                        edit:
                                                        Solving the root cause

                                                        Used ncdu to browse every container

                                                        • Gained 1gb of storage by deleting /usr/local/share/.cache/yarn/ on a container volume
                                                        • Gained 500mb of storage by deleting Anaconda distribution package cache within a container volume

                                                        Analysis: Meteor seems to spread around old versions of libraries and builds in space-wasteful ways (?).
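A non-interactive complement to browsing with ncdu is to list package-cache directories by size. The sketch below assumes caches live in directories literally named `.cache` (as with the yarn path above); the helper name is hypothetical, and other tools may use different locations.

```shell
# Print the size in KiB of every directory named ".cache" under the given
# root, largest first. -prune stops find from descending into a cache once
# it is matched, so nested caches are counted once via their parent.
cache_dir_sizes() {
    find "$1" -xdev -type d -name .cache -prune -exec du -sk {} + 2>/dev/null |
        sort -rn
}
```

Whether a given cache is safe to delete depends on the app; the ones mentioned in this post were package caches that can be rebuilt.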

                                                        • girish
                                                          girish Staff @makemrproper last edited by

                                                          @makemrproper said in Disk space should never bring a whole server down:

                                                          If I had relied only on Cloudron for backups it would have been disaster time.

                                                          Can you clarify this? Why are you unable to rely on Cloudron backups ?

                                                          I agree with the bigger point though. Unfortunately, we have found no clear technical solution to solve disk space issues even outside of Cloudron. What do people generally do when hosting apps on a VPS?

                                                          As for the Anaconda cache and Meteor, are you referring to the Jupyter and Wekan apps? Maybe those packages can be fixed to clear the cache.

                                                          • michaelpope
                                                            michaelpope @makemrproper last edited by michaelpope

                                                            @makemrproper Hey makemrproper,

                                                            If you want a VPS with a lot of space, you might try looking into BuyVM's storage volumes.

                                                            They take a bit of time to set up (you have to mount the volume), but it's very difficult to get a deal on the space provided elsewhere.

                                                            It's also very well cached, so I've found it to be almost as performant disk wise as what I've used on digital ocean.

                                                            They are a smaller provider though, so the reliability won't be quite as good (you might have a little more downtime compared to other providers).

                                                            I know this doesn't necessarily solve your issue, but more disk space is always great :).

                                                            • scooke
                                                              scooke @girish last edited by scooke

                                                              @girish I think what @makemrproper meant was if their backups were on the same server as the Cloudron, they wouldn't be able to back up, or restore, from them since the disk was not responsive. I am impressed again by how patient you all are with these situations. Keep up the good work and attitudes.

                                                              • timconsidine
                                                                timconsidine @makemrproper last edited by timconsidine

                                                                @makemrproper said in Disk space should never bring a whole server down:

                                                                This is still a huge problem.

                                                                I understand the desire for an approach that stops the problem happening in the first place. In the interim, I really recommend an alert system like ntfy. Use their hosted service or host it yourself (I have packaged it for Cloudron myself; a more recent version is at https://forum.cloudron.io/post/54552).

                                                                Set a cron job for as often as you want, running a script for df -h, set alert levels in the script sending notifications to dashboard or iOS/Android device.
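A rough sketch of such a script, assuming `curl` is installed and that ntfy accepts a plain HTTP POST with `Title` and `Priority` headers. The topic URL, the `NTFY_URL` variable and the function names are placeholders I made up for illustration.

```shell
# Build the alert text, or print nothing when usage is below the threshold.
disk_alert_message() {
    alert_path=$1
    alert_threshold=$2
    alert_used=$(df -P "$alert_path" | awk 'NR == 2 { gsub(/%/, "", $5); print $5 }')
    if [ "$alert_used" -ge "$alert_threshold" ]; then
        echo "Disk usage on $(uname -n) for $alert_path is ${alert_used}% (threshold ${alert_threshold}%)"
    fi
}

# Publish the alert to an ntfy topic. Does nothing when there is no alert.
# NTFY_URL is a placeholder; point it at your own server and topic.
notify_if_low() {
    alert_msg=$(disk_alert_message "$1" "$2")
    [ -n "$alert_msg" ] || return 0
    curl -fsS -H "Title: Low disk space" -H "Priority: high" \
        -d "$alert_msg" "${NTFY_URL:-https://ntfy.sh/my-disk-alerts}"
}
```

If the functions are saved to a file (say, a hypothetical `/usr/local/lib/disk-alert.sh`), a crontab entry such as `*/15 * * * * . /usr/local/lib/disk-alert.sh && notify_if_low / 85` would check every 15 minutes. Pick the interval and threshold to match how fast your disk can realistically fill.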

                                                                As self-hosters we want to rely on things working, but we can't escape our responsibility to keep an eye on things. ntfy handles this in one of the simplest ways.
