Cloudron Forum


Disk is (suddenly) full on 1TB drive, can't access cloudron

Tags: nginx, disk space, dashboard
      shan
      wrote on last edited by girish
      #1

      Hello. A few days ago my disk usage was around ~20% when I checked. Today I got an alert that my disk was full, yet the disk usage when I ran some du commands on my server does not reflect that. I can't access cloudron as the disk is completely full. It appears to have filled up rapidly within a few hours, with no changes on my end.

      I've had problems with my separate local backup server recently and thought maybe Cloudron had fallen back to storing backups on the local disk, but I checked var/backups and there is nothing there more than a few kb. Unless there is another place the backups might be stored?

      I got a glimpse of the disk usage statistics on Cloudron dashboard before I stopped being able to access it; it said that like 90% of the usage was for the green "System" type. Not sure if that is helpful for diagnosing what's going on.

      Seeking any tips for debugging this, thanks! (really hoping it's not a data corruption issue..)
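A quick way to see where the space went is to list the largest directories on the root filesystem. The helper below is a minimal sketch; the name `largest_dirs` is made up for illustration. Keep in mind that `du` cannot see files hidden underneath an active mount point, nor deleted files still held open by a process, so its total can be smaller than what `df` reports.

```shell
# Print the N largest directories under DIR (sizes in KiB), largest first.
# -x keeps du on DIR's own filesystem, so other mounted volumes are not counted.
largest_dirs() {
    dir="$1"
    n="${2:-10}"
    du -xk "$dir" 2>/dev/null | sort -rn | head -n "$n"
}
```

For example, `largest_dirs / 15` (run as root) lists the 15 biggest directories on the root filesystem, which can then be compared with the output of `df -h /`.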

        marcusquinn
        wrote on last edited by
        #2

        Frustrating!

        Cross-post of a thread we had on this same subject a while back:

        • https://forum.cloudron.io/topic/4604/disk-space-should-never-bring-a-whole-server-down

        Web Design https://www.evergreen.je
        Development https://brandlight.org
        Life https://marcusquinn.com

          imc67
          translator
          wrote on last edited by
          #3

          @shan @girish I recognize a similar effect, but I caught it in time because I use a separate Zabbix server with triggers.

          This was my situation after a long, long search:

          1. Since Cloudron doesn't notify the admin by mail when something goes wrong, like 'backup not succeeded' or 'CIFS connection lost', it took a few days before I noticed the backups were failing because of a CIFS disconnection.
          2. I reconnected and everything seemed fine, except that in Zabbix I noticed the disk usage graph had increased.
          3. Long story short: when I unmounted the CIFS share I found the "hidden" backups at the mount path (written before connecting)! I deleted all backups there and mounted again: SOLVED.

          This same issue occurred on 2 of my 4 Cloudron Premium servers.
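To check whether a backup directory really lives on the remote share or silently on the local disk, one option is to compare the filesystem that holds the directory with the one that holds its parent. This is only a sketch under assumptions: `fs_of` and `report_backup_dir` are illustrative names, and the `df` parsing assumes mount points without spaces.

```shell
# Print the mount point of the filesystem that holds PATH
# (column 6 of the POSIX-format `df -P` output).
fs_of() {
    df -P "$1" | awk 'NR == 2 { print $6 }'
}

# If DIR sits on the same filesystem as its parent, nothing is mounted on DIR
# and whatever it contains is using local disk space.
report_backup_dir() {
    dir="$1"
    if [ "$(fs_of "$dir")" = "$(fs_of "$dir/..")" ]; then
        echo "local: $dir uses $(du -sxk "$dir" | cut -f1) KiB on $(fs_of "$dir")"
    else
        echo "mounted: $dir is its own filesystem"
    fi
}
```

Running `report_backup_dir /path/to/backup/mount` (a placeholder path) while the share is disconnected shows how much "hidden" backup data is sitting on the local disk.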

            girish
            Staff
            wrote on last edited by
            #4

            @shan As @imc67 suggested, are you able to check the local file system after unmounting your remote backup?

            Meanwhile, I will investigate why mount failure is not reported as a backup error.

              shan
              wrote on last edited by shan
              #5

              Thanks for the responses guys!

              I realized that my backup server had been unmounted after it bugged out the other day, and Cloudron was storing backups locally in the mnt/backups/snapshots folder. After purging that folder, the disk space issue is resolved.

              However! My cloudron instance is still inaccessible.

              I followed all of the steps on this troubleshooting guide, to no avail.

              When I rebooted after cleaning up the disk space, both nginx and unbound were in an error state. unbound restarted right away, but nginx had some issues with old certs that prevented it from restarting. After purging the old certs (which I read was safe in the troubleshooting guide 🙂), nginx was able to restart and is now running.

              Unfortunately, my cloudron instance is still inaccessible and I'm not sure why. All other services mentioned in the troubleshooting guide are working properly (docker, mysql, box) according to the logs.

              As far as I can tell everything is working properly, I just can't access my cloudron instance and don't know where to go from here. Any ideas for troubleshooting?

              EDIT: Looks like nginx just died again for some reason. It restarted successfully once after I purged old certs, but now has the same error again even though the certs are gone. 🤔

              This is the error it's giving me when I run nginx -t:

              nginx: [emerg] cannot load certificate key "/home/yellowtent/platformdata/nginx/cert/_.myserver.net.key": 
              PEM_read_bio_PrivateKey() failed (SSL: error:0909006C:PEM routines:get_name:no start line:Expecting: ANY PRIVATE KEY)
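OpenSSL's "no start line" means the file contains no `-----BEGIN ...` PEM header, which is what an empty or truncated file looks like. A plausible cause here (an assumption, not something confirmed in this thread) is that the files were written while the disk was full. The sketch below lists suspicious `.key` and `.cert` files in a directory; `find_bad_pem` is an illustrative name, not a Cloudron tool.

```shell
# List *.key and *.cert files in DIR that are empty or contain no
# PEM "-----BEGIN" header line.
find_bad_pem() {
    dir="$1"
    for f in "$dir"/*.key "$dir"/*.cert; do
        [ -f "$f" ] || continue    # skip glob patterns that matched nothing
        if [ ! -s "$f" ] || ! grep -q -e '-----BEGIN' "$f"; then
            echo "$f"
        fi
    done
}
```

For example, `find_bad_pem /home/yellowtent/platformdata/nginx/cert` prints the files that nginx would likely fail to load.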
              
                girish
                Staff
                wrote on last edited by
                #6

                @shan Delete the nginx config files as well and then systemctl restart box. This will regenerate the nginx configs and cert files. After that, you will be able to access the dashboard. Go into each app's Location view and click save. That will regenerate the nginx config of each app.

                (This tedious process is automated/fixed in the next release.)

                  shan
                  wrote on last edited by shan
                  #7

                  @girish I've deleted the nginx conf file (home/yellowtent/platformdata/nginx/nginx.conf) and am encountering a new error. It seems systemctl restart box did not regenerate this.

                  [emerg] open() "/etc/nginx/nginx.conf" failed (2: No such file or directory)
                  
                    girish
                    Staff
                    wrote on last edited by
                    #8

                    @shan Oh, my bad. I should have been clearer that only app configs have to be deleted. Anyway, please run /home/yellowtent/box/setup/start.sh which will create nginx config files.

                      shan
                      wrote on last edited by
                      #9

                      @girish that seems to have fixed all my problems! Can access the dashboard again. Looking forward to the next release when this is automated lol. 🙏

                        nebulon
                        Staff
                        wrote on last edited by
                        #10

                        The root cause was that the backup continued even though the backup disk was not mounted. We have now found the bug that caused this and possibly other similar issues.

                        The check for the mountpoint itself was correct, but its result was simply ignored by the code. This oversight will be fixed in the next release and should prevent such cases for mounted backup volumes in the future.
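The general pattern of acting on the mount check, rather than ignoring it, can be sketched in shell as follows. This is not Cloudron's actual code; `backup_if_mounted` is a hypothetical wrapper that relies on the `mountpoint` utility from util-linux.

```shell
# Run the given command only if TARGET is an active mount point.
# Usage: backup_if_mounted TARGET COMMAND [ARGS...]
backup_if_mounted() {
    target="$1"
    shift
    if ! mountpoint -q "$target"; then
        echo "refusing to run backup: $target is not a mount point" >&2
        return 1    # propagate the failure instead of writing to the local disk
    fi
    "$@"
}
```

A call such as `backup_if_mounted /path/to/backup/mount rsync -a /data/ /path/to/backup/mount/` (placeholder paths) then fails loudly when the share is disconnected instead of filling the local disk.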

                          shan
                          wrote on last edited by
                          #11

                          @nebulon hey, not sure if you guys already implemented this fix, but I just had the exact same issue happen again this morning on a fully up-to-date Cloudron instance. Trying to remember wtf I did to fix this

                            girish
                            Staff
                            wrote on last edited by
                            #12

                            @shan yes, fix is coming in 7.3.3. What you have to do is first do journalctl -u nginx -fa. It will say some cert/conf file is bad. Just delete them and systemctl restart nginx and get it running. Then, systemctl restart box.

                              shan
                              wrote on last edited by shan
                              #13

                              @girish the error it gives is the same as this one above; it is not clear that I need to actually just delete the application certs ( just found their location again at nginx/applications )

                              nginx: [emerg] cannot load certificate key "/home/yellowtent/platformdata/nginx/cert/_.myserver.net.key": 
                              PEM_read_bio_PrivateKey() failed (SSL: error:0909006C:PEM routines:get_name:no start line:Expecting: ANY PRIVATE KEY)
                              

                              Deleting the application certs allowed nginx to restart but my webserver is still not running, gah

                                girish
                                Staff
                                wrote on last edited by girish
                                #14

                                @shan that message is saying that the cert cannot be loaded. Did you remove that file? (you have to also remove the .cert file along with the .key file). If you did remove that file, then go into /etc/nginx/applications and delete the conf files that reference the above cert.
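Finding the conf files that reference a given cert can be done with a recursive fixed-string `grep`. The helper name `confs_referencing` is made up for this sketch; the directory and the cert name are the ones quoted in this thread.

```shell
# Print every file under CONF_DIR that mentions NEEDLE.
# -r: recurse, -l: file names only, -F: treat NEEDLE as a literal string.
confs_referencing() {
    needle="$1"
    conf_dir="$2"
    grep -rlF -- "$needle" "$conf_dir"
}
```

For example, `confs_referencing '_.myserver.net.' /etc/nginx/applications` lists the app configs that point at either the `.key` or the `.cert` file of that certificate.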

                                  shan
                                  wrote on last edited by
                                  #15

                                  @girish I didn't remove that file, just the application certs and nginx is now running fine according to systemctl status nginx, just the webserver isn't loading. Unbound & box are fine too. Should I still delete that file even though nginx is running?

                                    girish
                                    Staff
                                    wrote on last edited by
                                    #16

                                    @shan what do you mean by "webserver" ? Do you mean the dashboard? nginx is the webserver.

                                      shan
                                      wrote on last edited by
                                      #17

                                      @girish yeah the dashboard

                                        girish
                                        Staff
                                        wrote on last edited by
                                        #18

                                        @shan you have to run systemctl restart box; it will regenerate the nginx config needed for the dashboard. Then, if you refresh in the browser, you might have to accept the self-signed certificate and log in (that's OK). Then, go to Domains -> Renew Certs and you should be back.

                                        BTW, it's safe to delete configs and certs because it's all in the database and code. "Renew Certs" above does not get a new cert internally; it syncs the existing cert in the db to disk.

                                          shan
                                          wrote on last edited by
                                          #19

                                          @girish alright was able to get the dashboard back, thanks! When can we expect that update with the fix?

                                            robi
                                            wrote on last edited by
                                            #20

                                            @shan 7.3.3 I believe.

                                            Conscious tech
