    Disk space should never bring a whole server down

    Feature Requests
    • marcusquinn
      marcusquinn last edited by

      Right now I have a whole server down and because it's Saturday, can't get hold of anyone.

      Running out of disk space should never bring a whole server down.

      How about some safety margin that preserves 10% of disk space to keep the OS and Cloudron running?

      This really isn't my area but I'm beyond words right now.

      I've tried clearing out var/backups, numerous reboots - still a total fail 😞

      We're not here for a long time - but we are here for a good time :)
      Jersey/UK
      Work & Ecommerce Advice: https://brandlight.org
      Personal & Software Tips: https://marcusquinn.com

      • nebulon
        nebulon Staff last edited by

        Last time we checked, it was actually not easy to reserve space for the system as such; we would need support from the filesystem here.

        Either way after clearing up some space, did you follow https://docs.cloudron.io/troubleshooting/#recovery-after-disk-full ?

        • girish
          girish Staff @marcusquinn last edited by

          @marcusquinn when you say whole server down, do you mean that the dashboard is unreachable? Is the box code running? Check systemctl status box and systemctl status nginx. If those two are up, things should be working.

          AFAIK, there is no easy way to implement this feature. If a process starts writing to disk like crazy, whether because of a bug or intentionally, then other processes that write to disk start crashing. Unless the app itself is resilient to this, it's tricky to implement (I guess one approach is to also disk-sandbox all apps, but AFAIK that is not easy to do either).

          • marcusquinn
            marcusquinn last edited by

            @nebulon Thanks - tried all of that but all was OK.

            @girish yeah, the Dashboard.

            I think I've found an outlier symptom but not worked out how to solve yet.

            If I log in afresh with Firefox at my.example.com I can get in again.

            However, my main daily-driver "browser" for this is a WebCatalog (Electron) webapp.

            So, here's the weird thing - I clear cookies and cache, and I can get to the login screen. I log in and get just a blank https://my.brandlight.org/#/apps page, with nothing showing.

            All other Cloudron servers and instances working fine.

            Since it's working in Firefox now, I think it's safe to say it's not so urgent. My feeling is something went wrong when the disk filled up and somehow the browser now is all confused.

            Anyway, lower priority - I'll keep digging and see if I can find the cause/cure with some luck...

            • marcusquinn
              marcusquinn last edited by

              Ugh, nope, disk full again, whole server down. Makes me dislike weekends when this stuff happens.

              No clue what to do but the post title remains valid - maybe Apps need diskspace limits because I'm caught between hard server resets, brief times of access, and then lockout again.

              • marcusquinn
                marcusquinn last edited by

                Whatever 1000% CPU is doing, it's not showing a Cloudron Dashboard:

                867da7cb-a6af-4563-8933-abd715cdf7b7-image.png

                • marcusquinn
                  marcusquinn last edited by

                  # systemctl status box
                  ● box.service - Cloudron Admin
                     Loaded: loaded (/etc/systemd/system/box.service; enabled; vendor preset: enabled)
                     Active: activating (auto-restart) (Result: exit-code) since Sat 2021-03-06 23:22:48 UTC; 88ms ago
                    Process: 311 ExecStart=/home/yellowtent/box/box.js (code=exited, status=1/FAILURE)
                   Main PID: 311 (code=exited, status=1/FAILURE)
                  
                  # systemctl status nginx
                  ● nginx.service - nginx - high performance web server
                     Loaded: loaded (/lib/systemd/system/nginx.service; enabled; vendor preset: enabled)
                    Drop-In: /etc/systemd/system/nginx.service.d
                             └─cloudron.conf
                     Active: active (running) since Sat 2021-03-06 23:09:24 UTC; 14min ago
                       Docs: http://nginx.org/en/docs/
                    Process: 1431 ExecStart=/usr/sbin/nginx -c /etc/nginx/nginx.conf (code=exited, status=0/SUCCESS)
                   Main PID: 1634 (nginx)
                      Tasks: 17 (limit: 4915)
                     CGroup: /system.slice/nginx.service
                             ├─1634 nginx: master process /usr/sbin/nginx -c /etc/nginx/nginx.conf
                             ├─1638 nginx: worker process
                             ├─1639 nginx: worker process
                             ├─1641 nginx: worker process
                             ├─1642 nginx: worker process
                             ├─1645 nginx: worker process
                             ├─1646 nginx: worker process
                             ├─1647 nginx: worker process
                             ├─1648 nginx: worker process
                             ├─1649 nginx: worker process
                             ├─1650 nginx: worker process
                             ├─1651 nginx: worker process
                             ├─1652 nginx: worker process
                             ├─1653 nginx: worker process
                             ├─1654 nginx: worker process
                             ├─1655 nginx: worker process
                             └─1656 nginx: worker process
                  
                  Mar 06 23:09:23 cloudron01 systemd[1]: Starting nginx - high performance web server...
                  Mar 06 23:09:24 cloudron01 systemd[1]: Started nginx - high performance web server.
                  

                  Sorry, I have to work evenings and weekends, it's the only time I can concentrate on the deep work without all the emails & messages interruptions of the week days.

                  • marcusquinn
                    marcusquinn last edited by

                    # systemctl status unbound
                    ● unbound.service - Unbound DNS Resolver
                       Loaded: loaded (/etc/systemd/system/unbound.service; enabled; vendor preset: enabled)
                       Active: active (running) since Sat 2021-03-06 23:26:38 UTC; 8s ago
                     Main PID: 20802 (unbound)
                        Tasks: 1 (limit: 4915)
                       CGroup: /system.slice/unbound.service
                               └─20802 /usr/sbin/unbound -d
                    
                    Mar 06 23:26:38 cloudron01 systemd[1]: Started Unbound DNS Resolver.
                    Mar 06 23:26:38 cloudron01 unbound[20802]: [20802:0] notice: init module 0: subnet
                    Mar 06 23:26:38 cloudron01 unbound[20802]: [20802:0] notice: init module 1: validator
                    Mar 06 23:26:38 cloudron01 unbound[20802]: [20802:0] notice: init module 2: iterator
                    Mar 06 23:26:38 cloudron01 unbound[20802]: [20802:0] info: start of service (unbound 1.6.7).
                    Mar 06 23:26:39 cloudron01 unbound[20802]: [20802:0] error: could not fflush(/var/lib/unbound/root.key): No space left on device
                    Mar 06 23:26:39 cloudron01 unbound[20802]: [20802:0] error: could not fflush(/var/lib/unbound/root.key): No space left on device
                    

                    • marcusquinn
                      marcusquinn last edited by

                      I just cannot get my head around how the disk can be allowed to fill to the point of a total system failure.

                      Slowdown, sure I understand - but it's a total fail and I can't see why this isn't preventable.

                      Is loading it up with data really all one has to do to bring a Cloudron down?

                      There's a bunch of Apps that allow for uploads, it really wouldn't take much effort to flood those with a few GB.

                      • marcusquinn
                        marcusquinn @nebulon last edited by

                        @nebulon @girish Apps have Memory & CPU allocations - any reason they can't have disk-space allocations too?

                        I'd rather a single app hits a wall than an entire server.

                        It seems all one would have to do to bring a whole Cloudron down this way would be send a lot of email attachments to the point of disk saturation.

                        Maybe I'm wrong and it's something else - but feel free to delete this post and move to email if it's a reproducible risk.

                        • girish
                          girish Staff @marcusquinn last edited by

                          @marcusquinn said in Disk space should never bring a whole server down:

                          @nebulon @girish Apps have Memory & CPU allocations - any reason they can't have disk-space allocations too?

                          Yes, the memory & CPU allocations are features of the Linux kernel's cgroups. However, disk space allocation is not part of them.

                          I guess the issue to handle right now, at least, is that for some reason the disk is full. Running docker image prune -a sometimes frees up some disk space. Can you try that? Alternatively, if you drop me a mail on support, I can look into the server.

                          • marcusquinn
                            marcusquinn @girish last edited by

                            @girish said in Disk space should never bring a whole server down:

                            docker image prune -a

                            OK, thanks, tried that: "Total reclaimed space: 816.4MB"

                            Still no respondio though. I have emailed support@, but it's 2am here and I have an early start, so I'll be back online in 8h or so, by which time it'll be your 2am. I appreciate it's Saturday, so I'm just grateful for pointers, and hoping I might have some other requests for assistance waking up soon too.

                            • robi
                              robi last edited by

                              Can we add a disk space email alert, and an event where, after some critical threshold, /tmp is cleaned up and Docker images are pruned by Cloudron?

                              Completely avoidable with a bit of this..

                              Life of Advanced Technology
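
The threshold idea above could be sketched roughly as follows. This is a minimal illustration, not how Cloudron implements its disk check: the function names and the 20% / 10% thresholds are assumptions made up for the example, and the alerting and cleanup actions are left as comments.

```python
import shutil
from dataclasses import dataclass


@dataclass
class DiskStatus:
    path: str
    free_fraction: float  # 0.0 means full, 1.0 means empty
    level: str            # "ok", "warn" or "critical"


def classify(free_bytes: int, total_bytes: int,
             warn_below: float = 0.20, critical_below: float = 0.10) -> str:
    """Map free space to an alert level (thresholds are hypothetical defaults)."""
    free_fraction = free_bytes / total_bytes
    if free_fraction < critical_below:
        return "critical"
    if free_fraction < warn_below:
        return "warn"
    return "ok"


def check_disk(path: str = "/") -> DiskStatus:
    """Read the real usage of the filesystem holding `path` and classify it."""
    usage = shutil.disk_usage(path)
    return DiskStatus(path, usage.free / usage.total,
                      classify(usage.free, usage.total))


status = check_disk("/")
print(f"{status.path}: {status.free_fraction:.1%} free -> {status.level}")
# On "warn" a real setup might send an email; on "critical" it might
# clean temporary files or prune unused Docker images, as suggested above.
```

Keeping `classify` separate from the disk read makes the thresholds easy to test and to expose as settings.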

                              • marcusquinn
                                marcusquinn @robi last edited by

                                I'm wondering if maybe Cloudron should have its own volume by default.

                                A quick search on the subject, but kinda tired now:

                                • https://www.reddit.com/r/docker/comments/loleal/how_to_limit_disk_space_for_a_docker_container/
                                • https://guide.blazemeter.com/hc/en-us/articles/115003812129-Overcoming-Container-Storage-Limitation-Overcoming-Container-Storage-Limitation#:~:text=In the current Docker version,be left in the container
                                • https://stackoverflow.com/questions/38542426/docker-container-specific-disk-quota

                                • girish
                                  girish Staff @marcusquinn last edited by

                                  @marcusquinn Managed to bring it up by truncating many logs. Should be coming up in a bit, hold on.
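
For anyone in the same spot, a rough sketch of finding what to truncate might look like this. It is not the procedure used on this server (the thread does not say which logs were truncated); `largest_files` and `truncate_file` are hypothetical helper names, and the directory to scan is whatever you suspect is growing.

```python
import os
from typing import List, Tuple


def largest_files(root: str, top_n: int = 10) -> List[Tuple[int, str]]:
    """Return up to `top_n` (size_in_bytes, path) pairs under `root`, largest first."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                size = os.lstat(path).st_size  # lstat: do not follow symlinks
            except OSError:
                continue  # file vanished or is unreadable; skip it
            found.append((size, path))
    found.sort(reverse=True)
    return found[:top_n]


def truncate_file(path: str) -> None:
    """Empty a file in place. Unlike deleting, this frees the space even
    while another process still has the file open."""
    os.truncate(path, 0)
```

Calling `largest_files("/var/log", 5)` would list the five biggest files under that directory; review the list by hand before truncating anything.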

                                  • marcusquinn
                                    marcusquinn @girish last edited by

                                    @girish Ahhhh - thank you kindly!

                                    I have an unused 1TB volume mounted, although I'm not sure how much of the remaining free space the Move function uses. I guess that is what was killing it when I triggered the move of the 16GB Jira app data to it?

                                    • d19dotca
                                      d19dotca @girish last edited by d19dotca

                                      @girish said in Disk space should never bring a whole server down:

                                      Managed to bring it up by truncating many logs

                                      Is this perhaps related to the issue I reported a little while back too, regarding the logrotate not running properly under certain circumstances?

                                      --
                                      Dustin Dauncey
                                      www.d19.ca

                                      • marcusquinn
                                        marcusquinn @d19dotca last edited by

                                        @d19dotca I remembered that mention, although with a fading brain I never found it or got around to looking at it. I kinda think this situation is a bit too easy to get into and hard to get out of once it's Terminal only.

                                        • marcusquinn
                                          marcusquinn last edited by

                                          Going to trigger a move of Confluence to the mounted volume. It's 4.5GB, with 7.5GB free space now on the main volume, so hopefully that's enough working space. But I have to zzz; problems I don't immediately know how to solve are kinda exhausting.
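
The "is that enough working space" question can be turned into a quick pre-flight check. The sketch below is generic: the thread does not say how much temporary space Cloudron's move actually needs, so the 10% safety margin and the helper names `dir_size` and `has_room` are assumptions for illustration only.

```python
import os


def dir_size(root: str) -> int:
    """Total size in bytes of all files under `root` (symlinks not followed)."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            try:
                total += os.lstat(os.path.join(dirpath, name)).st_size
            except OSError:
                pass  # skip files that vanish or cannot be read
    return total


def has_room(needed_bytes: int, free_bytes: int, margin: float = 0.10) -> bool:
    """True if `needed_bytes` plus a safety margin fits into `free_bytes`."""
    return needed_bytes * (1 + margin) <= free_bytes


GB = 10**9
print(has_room(int(4.5 * GB), int(7.5 * GB)))  # the 4.5GB app vs 7.5GB free
print(has_room(16 * GB, int(7.5 * GB)))        # a 16GB app vs 7.5GB free
```

In practice `needed_bytes` would come from `dir_size` on the app's data directory and `free_bytes` from `shutil.disk_usage` on whichever volume the copy writes to.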

                                          • girish
                                            girish Staff @marcusquinn last edited by

                                            @marcusquinn looks like things are back up! There is ~7GB left, so hopefully that should hold up for some time.

                                            • girish
                                              girish Staff last edited by

                                              I am looking into some clues on what can be done to mitigate this, will report back. BTW, the volume suggestion is possible. In fact, we used to do this very long ago, with each app having its own btrfs partition. Usually, people start with a simple VPS. This means that for this to work out of the box, one has to create a loopback file system, which is very slow. Also, when I logged in to your server, it was MySQL that was down; it was not happy with the lack of disk space.

                                              I am wondering if the solution involves suggesting that users adopt a specific kind of setup if they want to protect themselves against this kind of issue. That is totally doable (for example, suggest that the user move platformdata and boxdata to a separate volume/disk post installation).

                                              • girish
                                                girish Staff @robi last edited by

                                                @robi We actually have a disk space alert, in fact, it's there right now in the dashboard.

                                                57ac2165-226d-4f31-aa1e-a22cf131df92-image.png

                                                But the above is not super useful because it's just checking space in a cronjob. This cronjob is quite conservative because we don't want to keep spinning the disk too much. I am not aware of a way to get a "signal" from the server when disk space limits are hit. If a server fills up too fast between cron runs, the whole thing is useless...
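
The limitation can be put in numbers. The sketch below is only arithmetic, not Cloudron's actual cron logic: the alert threshold, write rates and check interval are made-up illustrative values, and the function names are hypothetical. Once free space drops below the alert threshold, the time left until the disk is full is the threshold divided by the write rate; a periodic check is only guaranteed to run inside that window if the window is longer than the check interval.

```python
def alert_window_seconds(threshold_free_bytes: int,
                         write_rate_bytes_per_s: float) -> float:
    """Seconds between crossing the alert threshold and a completely full disk."""
    if write_rate_bytes_per_s <= 0:
        return float("inf")  # nothing is being written, so the disk never fills
    return threshold_free_bytes / write_rate_bytes_per_s


def alert_fires_in_time(threshold_free_bytes: int,
                        write_rate_bytes_per_s: float,
                        check_interval_s: float) -> bool:
    """True if a check is guaranteed to run while the alert can still help."""
    window = alert_window_seconds(threshold_free_bytes, write_rate_bytes_per_s)
    return window > check_interval_s


GB, MB, KB = 10**9, 10**6, 10**3
# Assumed: alert when under 5 GB free, check once an hour (3600 s).
print(alert_window_seconds(5 * GB, 10 * MB))        # prints 500.0
print(alert_fires_in_time(5 * GB, 10 * MB, 3600))   # prints False
print(alert_fires_in_time(5 * GB, 100 * KB, 3600))  # prints True
```

With these assumed numbers, a fast writer at 10 MB/s fills the last 5 GB in under ten minutes, well inside an hourly check, while a slow 100 KB/s trickle leaves hours of warning.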

                                                • marcusquinn
                                                  marcusquinn last edited by marcusquinn

                                                  I've triggered some bigger app data moves to the mounted 1TB volume but it seems to have chewed through 3GB of the remaining free space on the main volume already and I'm back to "Cloudron is offline. Reconnecting". Probably just making hasty tiredness errors now.

                                                  • girish
                                                    girish Staff @marcusquinn last edited by

                                                    @marcusquinn maybe it's best to move them by hand first. Can you send me the apps you want to move by email, and I can move them by hand, since this seems to keep hitting a wall, i.e. free space -> try to free space -> run out of space and start over...

                                                    • robi
                                                      robi @girish last edited by

                                                      @girish yes, but does it email you when approaching the threshold?

                                                      threshold setting? (twice a day should be plenty)

                                                      action setting checkboxes? (maybe a custom one too?)

                                                      heck, even deleting a non-critical app would be fine since it's restorable from backup.

                                                      • scooke
                                                        scooke @marcusquinn last edited by

                                                        @marcusquinn Hang in there. Bon courage.

                                                        A life lived in fear is a life half-lived

                                                        • humptydumpty
                                                          humptydumpty last edited by

                                                          WHM has disk space limitations. Is it possible to copy their method and have it implemented in CR?

                                                          • marcusquinn
                                                            marcusquinn last edited by

                                                            Thanks for all the help - I managed to get some extra hands on deck this morning and we're moving lots of data to a mounted volume for much more headroom.

                                                            I still think it's a little too vulnerable having this hazard able to bring a server down.

                                                            Also, I couldn't see if there's a way to set Email storage to be a mounted volume too?

                                                            • mehdi
                                                              mehdi App Dev @girish last edited by

                                                              @girish Also, the current warning is IMO not very useful if the threshold is not configurable. Depending on how the server is used, a few GB may be enough for weeks, or for mere hours if there's media stuff on the server, or if a user uploads stuff on nextcloud or something.

                                                              • d19dotca
                                                                d19dotca @marcusquinn last edited by d19dotca

                                                                @marcusquinn said in Disk space should never bring a whole server down:

                                                                Also, I couldn't see if there's a way to set Email storage to be a mounted volume too?

                                                                Currently, emails are part of boxdata and you need to move the boxdata entirely. I’ve done this in my current server due to the amount of email stored for my clients. The steps for this are at https://docs.cloudron.io/storage/#default-data-directory for reference.

                                                                I’m assuming that by volume you meant an external disk rather than the actual Volumes function that Cloudron has.

                                                                There is a feature request, I believe, to keep emails separate, but boxdata really doesn’t contain much data at all other than emails, so it’s doable as-is for now. It’d just be nice to see the GUI handle moving the email data much like it does for apps.

                                                                • marcusquinn
                                                                  marcusquinn @d19dotca last edited by

                                                                  @d19dotca Thanks. I'm an app specialist and anything more than a few minutes digging in the dirt is my kinda hell. Just getting brain fog now as I've lost a bunch of important work and 2 days of progress on it now 😞

                                                                  • marcusquinn
                                                                    marcusquinn last edited by

                                                                    Anyone know where /app/data actually is in the full file system structure?

                                                                    I'm trying to navigate a snapshot clone to see if that has the missing config.php file that hasn't come back for EspoCRM but just not seeing anything obvious and searching docs hasn't found me the clue.

                                                                    • marcusquinn
                                                                      marcusquinn last edited by

                                                                      The problem I have is that EspoCRM Administration writes changes back to /app/data/data/config.php - however, that file also contains all the database connection details, password hash, basically everything for that instance to work.

                                                                      So when the disk was full, it seems to have somehow written a 0kb version of config.php.

                                                                      And because the rsync encryption is failing to back up EspoCRM, the Cloudron backups aren't complete.

                                                                      So that leaves restoring a provider backup snapshot and digging around.

                                                                      Basically, whatever anyone does - never allow the disk to get full - the cascade of problems that can happen from that interruption is just one massive time hole.
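
One common way for software to avoid ending up with a 0kb config file is to write the new version to a temporary file and only swap it in once the write has succeeded. The sketch below shows that general pattern; it is not EspoCRM's code, `atomic_write` is a hypothetical helper name, and it assumes the empty file came from the original being truncated before a write that then failed, which the thread does not confirm.

```python
import os
import tempfile


def atomic_write(path: str, data: bytes) -> None:
    """Write `data` to `path` so that the old file survives if the write fails."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file lives in the same directory so the final rename stays
    # on one filesystem. Note: mkstemp creates it with owner-only permissions.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # force data to disk; a full disk errors here at the latest
        os.replace(tmp_path, path)  # atomic rename on POSIX filesystems
    except BaseException:
        # The original file is untouched; remove the partial temp file.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
```

If the disk is full, the failure happens while creating or writing the temporary file, and the existing config stays as it was.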

                                                                      • marcusquinn
                                                                        marcusquinn @marcusquinn last edited by

                                                                        @marcusquinn Holy sh*t, with some dumb-luck trying everything I know, I seem to have fixed it.

                                                                        Lesson learnt - never run out of disk space - Sod's law says it will be the apps you rely on the most that will get corrupted.

                                                                        Now, given the many open ways to load up a Cloudron with data (email/FilePizza/PrivateBin) maybe there's a way to avoid this causing a total fail?

                                                                        • jdaviescoates
                                                                          jdaviescoates @marcusquinn last edited by

                                                                          @marcusquinn said in Disk space should never bring a whole server down:

                                                                          Now, given the many open ways to load up a Cloudron with data (email/FilePizza/PrivateBin) maybe there's a way to avoid this causing a total fail?

                                                                          I think FilePizza is fully P2P, so I'm not sure you could fill the server up with that (but you could with Jirafeau).

                                                                          But yeah, I reckon configurable disk space notifications (e.g. email/notify me hourly/daily/whatever once I've only got x space left) would be a good first step to help this not happen.

                                                                          I use Cloudron with Gandi & Hetzner

                                                                          • marcusquinn
                                                                            marcusquinn last edited by

                                                                            Quick fix idea: maybe 70% full is a better nag threshold?

                                                                            • bestknownhost
                                                                              bestknownhost last edited by

                                                                              Thanks for all the feedback here. We discovered Cloudron a while back and have been testing it on a number of servers over the last couple of months. We wanted to get a good handle on how everything works before rolling anything out into production. Firstly, it's an excellent platform and fills a great need.

                                                                              But we did run into a little problem with one of our test servers running on a DigitalOcean droplet. About 2 weeks ago it went from using 20 GB of space to nearly 80 GB in the space of 4 hours. We received an alert from DigitalOcean, but things were happening so fast that all we could initially do was upgrade the instance. That gave us half an hour, then we had to do it again, and then we just attached a 100 GB volume.

                                                                              Although it was only a test server, there was a WordPress app on it we were fond of, so we transferred it off the Cloudron and left a Pixelfed app. Somewhere between shutting down the server to add the volume and moving the WordPress app, the space usage stopped increasing. I know what you're thinking: WordPress, right? No, we checked the install beforehand and it was working fine on another server.

                                                                              We then removed the 100 GB volume and resized the DigitalOcean server back to its original size, and everything was back to normal. My guess is that some server updates ran that morning and an out-of-control process started this, and that resizing the server up and down somehow got rid of the problem.

                                                                              • robi
                                                                                robi @bestknownhost last edited by

                                                                                @bestknownhost Did you perhaps have AdGuard installed?

                                                                                Life of Advanced Technology

                                                                                • bestknownhost
                                                                                  bestknownhost @robi last edited by

                                                                                  @robi No we didn't.

                                                                                  • robi
                                                                                    robi @bestknownhost last edited by

                                                                                    @bestknownhost did you figure out what was filling up the disk with du -sh /* and drilling down?
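A minimal sketch of the du drill-down suggested above, assuming a POSIX shell and a du that accepts -d (both GNU coreutils and BSD du do). The function name biggest_dirs is made up for illustration; it is not a Cloudron tool.

```shell
# List the largest entries directly under a directory, biggest first.
# Sizes are in KiB; -x keeps du on a single filesystem so mounted
# volumes and /proc do not distort the picture.
# The first line of output is the total for the directory itself.
biggest_dirs() {
    dir="${1:-/}"      # directory to inspect (defaults to /)
    count="${2:-10}"   # how many lines to show
    du -xk -d 1 "$dir" 2>/dev/null | sort -rn | head -n "$count"
}
```

Start with `biggest_dirs / 10`, then call it again on whichever entry is largest and repeat until the culprit shows up. Run it as root so that permission errors do not hide directories.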

                                                                                    • nebulon
                                                                                      nebulon Staff @bestknownhost last edited by

                                                                                      @bestknownhost for a start to clarify, are you using an external backup storage or just the local disk for now? Using the local disk may cause disk usage to go up quickly depending on how much data you've put into the server.

                                                                                      If that is not the case, then you may have hit an issue we recently saw with MySQL binlogs: https://forum.cloudron.io/topic/4510/able-to-clean-up-binlog-files-in-var-lib-mysql-directory?_=1616402616926

                                                                                      And as @robi mentioned, do you have any idea so far what is using all that disk space?

                                                                                      • D
                                                                                        DigitEgal @marcusquinn last edited by DigitEgal

                                                                                        @marcusquinn I ran into a similar issue while testing some stuff, most likely because of the Nextcloud plugin "External Sites".
                                                                                        I am not sure right now, but I don't think it recreates the files; more likely it writes a huge amount of logs, since the CPU got pushed as well.

                                                                                        ( THAT'S NOT A TUTORIAL! IT'S ONLY FOR REPEATABILITY OF BUGS! )
                                                                                        How to reproduce the issue:
                                                                                        1. Create a Nextcloud and share a folder (structure) via a public link.
                                                                                        2. Insert this link into any secondary website (WordPress etc.) as a button that does NOT open a new tab.
                                                                                        3. Add the plugin "External Sites" to Nextcloud, go to its config and add the secondary website.
                                                                                        4. When the embed mode of the External Sites implementation is used, the issue can be triggered by a user with access to the External Sites buttons.
                                                                                        4.1 *Actually, I did not test it with a non-admin user as the "trigger" user.

                                                                                        How to finally trigger the filling of disk space?
                                                                                        -> Now follow the link in Nextcloud to your secondary website.
                                                                                        -> Clicking the button that leads back into Nextcloud triggers the issue.

                                                                                        ( THAT'S NOT A TUTORIAL! IT'S ONLY FOR REPEATABILITY OF BUGS! )

                                                                                        • M
                                                                                          makemrproper last edited by makemrproper

                                                                                          Here is my SOLUTION:

                                                                                          It does not solve the root cause why you are running out of space, but with this methodology you will buy yourself time.

                                                                                          Generate 3 files of 2 gigabytes each.
                                                                                          This is one way of generating these files:
                                                                                          fallocate -l 2G /storage-padding-buffer-2-gb-file1.img
                                                                                          fallocate -l 2G /storage-padding-buffer-2-gb-file2.img
                                                                                          fallocate -l 2G /storage-padding-buffer-2-gb-file3.img

                                                                                          When your server is out of storage, you can delete one or all of these padding files to regain the space you need to rescue the server.
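The padding-file idea above could be wrapped in two small helpers, sketched here under some assumptions: the function names make_padding and release_padding and the file name pattern are invented for illustration, and the dd fallback is an addition for filesystems where fallocate is not supported.

```shell
# Create COUNT padding files of SIZE_MIB MiB each inside DIR.
# Files that already exist are left alone, so the call is safe to repeat.
make_padding() {
    dir="$1"; count="${2:-3}"; size_mib="${3:-2048}"
    i=1
    while [ "$i" -le "$count" ]; do
        f="$dir/storage-padding-buffer-file$i.img"
        if [ ! -e "$f" ]; then
            # fallocate reserves real blocks; fall back to dd if it fails.
            fallocate -l "${size_mib}MiB" "$f" 2>/dev/null ||
                dd if=/dev/zero of="$f" bs=1048576 count="$size_mib" 2>/dev/null
        fi
        i=$((i + 1))
    done
}

# Delete a single padding file to free space in an emergency.
# Returns non-zero when no padding file is left.
release_padding() {
    dir="$1"
    for f in "$dir"/storage-padding-buffer-file*.img; do
        [ -e "$f" ] || continue
        rm -f -- "$f" && echo "released $f" && return 0
    done
    return 1
}
```

With the defaults, `make_padding /` reproduces the three 2 GB files from the post, and `release_padding /` frees them one at a time when the disk fills up. The files must live on the same filesystem that is at risk of filling, otherwise deleting them frees nothing useful.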

                                                                                          I have had the same issue with cloudron, because over time, storage will run out.
                                                                                          For now I chose not to update the storage of my VPS server because it will double my hosting cost for this node, from USD400 to USD800 per year. That's digital ocean pricing for you, but I digress.

                                                                                          This is a systems engineering issue and isn't caused by Cloudron. However I would not have anything against an elegant solution from the team if it were possible :).

                                                                                          I am working on a post describing how I worked through a massive cleanup and exactly which steps I took to regain loads of space. TL;DR: use ncdu, analyze all containers, identify where apps are storing logs and rotate these, and clear the NPM package cache in each container. More to come.
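As a non-interactive complement to ncdu, the hunt for oversized logs and caches might start with something like the sketch below. The function name large_files and the 100 MiB default are assumptions chosen for illustration, not part of the cleanup described in the post.

```shell
# Print regular files under DIR that are larger than MIN_KIB KiB,
# biggest first, with their disk usage in KiB in the first column.
# -xdev keeps find on one filesystem.
large_files() {
    dir="${1:-/}"
    min_kib="${2:-102400}"   # default threshold: 100 MiB
    find "$dir" -xdev -type f -size +"${min_kib}"k -exec du -k {} + 2>/dev/null |
        sort -rn
}
```

Running `large_files /var/log` or pointing it at a container volume gives a quick shortlist. Whether a file on that list is safe to delete or rotate still has to be decided per app.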

                                                                                          • marcusquinn
                                                                                            marcusquinn last edited by

                                                                                            Maybe the Cloudron app needs to generate its own partition to run from, where regular app storage can't saturate the OS or Cloudron partitions?

                                                                                            • girish
                                                                                              girish Staff @marcusquinn last edited by

                                                                                              @marcusquinn said in Disk space should never bring a whole server down:

                                                                                              Maybe the Cloudron app needs to generate its own partition to run from, where regular app storage can't saturate the OS or Cloudron partitions?

                                                                                              Right. The main issue is that it's not possible to create proper disk partitions in a VPS 😕 i.e. one can only create file-backed loopback file systems, but such things are not meant to be used in production and I have no idea about their reliability/durability.

                                                                                              • M
                                                                                                makemrproper @girish last edited by makemrproper

                                                                                                @girish This is still a huge problem.

                                                                                                My production server has failing applications again due to disk space filling up. Luckily DigitalOcean's backup functionality saved my setup again this time. If I had relied only on Cloudron for backups it would have been disaster time.

                                                                                                I am not asking you for help to fix it, nor am I blaming anyone. But this needs an engineered solution from upstream, you guys.

                                                                                                You could for instance recommend that we make use of separate storage volumes on the system drive. This brings the TCO down for storage. Recommendations and verified testing from you would be valuable for us as customers.

                                                                                                I can imagine hundreds of other customers of yours that are seeing the same issue.

                                                                                                Also:
                                                                                                Your assertion above that disk space is cheap is true for physical drive storage, but not for VPS storage. For me, doubling disk space from 80 GB to 160 GB also doubles my yearly VPS cost. I believe a sizable portion of your users are hosting Cloudron on a VPS.
                                                                                                Of course you could temporarily sidestep this problem by recommending 160 GB of storage capacity. This might alienate some potential users.

                                                                                                Now onwards to repair my Cloudron install and apps!

                                                                                                edit:
                                                                                                Solving the root cause

                                                                                                Used ncdu to browse every container

                                                                                                • Gained 1gb of storage by deleting /usr/local/share/.cache/yarn/ on a container volume
                                                                                                • Gained 500mb of storage by deleting Anaconda distribution package cache within a container volume

                                                                                                Analysis: There seem to be space-wasteful ways in which Meteor spreads around old versions of libraries and builds (?).

                                                                                                • girish
                                                                                                  girish Staff @makemrproper last edited by

                                                                                                  @makemrproper said in Disk space should never bring a whole server down:

                                                                                                  If I had relied only on Cloudron for backups it would have been disaster time.

                                                                                                  Can you clarify this? Why are you unable to rely on Cloudron backups ?

                                                                                                  I agree with the bigger point though. Unfortunately, we have found no clear technical solution to solve disk space issues even outside of Cloudron. What do people generally do when hosting apps on a VPS?

                                                                                                  As for the Anaconda cache and Meteor, are you referring to the Jupyter and Wekan apps? Maybe those packages can be fixed to clear the cache.

                                                                                                  • M
                                                                                                    michaelpope @makemrproper last edited by michaelpope

                                                                                                    @makemrproper Hey makemrproper,

                                                                                                    If you want a VPS with a lot of space, you might try looking into BuyVM's storage volumes.

                                                                                                    They take a bit of time to set up (you have to mount the volume), but it's very difficult to get a comparable deal on space elsewhere.

                                                                                                    It's also very well cached, so I've found its disk performance to be almost as good as what I've used on DigitalOcean.

                                                                                                    They are a smaller provider though, so the reliability won't be quite as good (you might have a little more downtime compared to other providers).

                                                                                                    I know this doesn't necessarily solve your issue, but more disk space is always great :).

                                                                                                    • scooke
                                                                                                      scooke @girish last edited by scooke

                                                                                                      @girish I think what @makemrproper meant was if their backups were on the same server as the Cloudron, they wouldn't be able to back up, or restore, from them since the disk was not responsive. I am impressed again by how patient you all are with these situations. Keep up the good work and attitudes.

                                                                                                      A life lived in fear is a life half-lived

                                                                                                      • timconsidine
                                                                                                        timconsidine @makemrproper last edited by timconsidine

                                                                                                        @makemrproper said in Disk space should never bring a whole server down:

                                                                                                        This is still a huge problem.

                                                                                                        I understand the desire for an approach that stops the problem happening in the first place. In the interim, I really recommend an alert system like ntfy. Use their hosted service or host it yourself (I have packaged it for Cloudron myself; a more recent version is at https://forum.cloudron.io/post/54552).

                                                                                                        Set a cron job to run as often as you want, running a script that calls df -h, and set alert levels in the script so it sends notifications to the dashboard or an iOS/Android device.
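A rough sketch of such a script, under stated assumptions: the function names are invented, the 70% warning level is borrowed from the suggestion earlier in this thread, the 90% critical level is a guess, and the topic URL is a placeholder you would replace with your own. It uses df -P instead of df -h because the POSIX output format is easier to parse, and it publishes by POSTing the message body to the ntfy topic URL with curl.

```shell
# Print the used percentage (integer, no % sign) of the filesystem
# that holds the given path.
disk_used_percent() {
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Map a percentage to ok / warn / crit.
# Thresholds default to 70 and 90 (assumed values, adjust to taste).
alert_level() {
    used="$1"; warn="${2:-70}"; crit="${3:-90}"
    if [ "$used" -ge "$crit" ]; then
        echo crit
    elif [ "$used" -ge "$warn" ]; then
        echo warn
    else
        echo ok
    fi
}

# Check one path and publish a message to an ntfy topic URL
# unless the level is ok.
check_and_notify() {
    path="${1:-/}"
    topic_url="$2"   # e.g. https://ntfy.sh/your-topic (placeholder)
    used=$(disk_used_percent "$path")
    level=$(alert_level "$used")
    [ "$level" = ok ] && return 0
    curl -fsS -d "Disk $path is ${used}% full ($level)" "$topic_url" >/dev/null
}
```

Saved as a script that ends with a call such as `check_and_notify / https://ntfy.sh/your-topic`, it could be run from cron every 15 minutes with a line like `*/15 * * * * /usr/local/bin/disk-alert.sh` (the path is hypothetical). Anyone who knows a topic name on the hosted service can read it, so pick one that is hard to guess or self-host.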

                                                                                                        As self-hosters we want to rely on things working, but we can't escape our responsibility to keep an eye on things. ntfy handles this in one of the simplest ways.
