Cloudron Forum


Backup cleanup task starts but do not finish the job

    nebulon wrote:

    Looks like two totally different issues. For case 1, I have no clue why the database queries would be out of order. Is this easily reproducible?

    For case 2, looking at the logs, I wonder why they have mixed messages about the app domain. Is this because you manually replaced those, or are the domains actually mixed in the original logs? It is strange that an rm -rf would fail on a folder which contains data. Can you check the permissions on that folder?

    tobiasb wrote (#3):

    @nebulon

    Yes, internally these are 2 different issues, but in both cases there were ~5 backup cleanup jobs running.

    I do not see case 1 at the moment, but case 2 still occurs.

    35689108-1079-42fa-a0c2-00c4fe39fdf5-grafik.png

    My guess is that only one job should be started at a time; this can happen when job 2 tries to do something that job 1 is already doing.

      nebulon (Staff) wrote (#4):

      OK, just trying to understand this better. Is the issue that those cleanup jobs run in parallel? They are OK to run in parallel as such, so they shouldn't fail because of that.

      The "unable to prune backup directory" message just logs that fact; the task would not fail at that point.

      Do you see other errors or failing tasks as such?

        tobiasb wrote (#5):

        @nebulon

        Alongside some of the ENOTEMPTY errors I now see a new error:

        box:backupcleaner removeBackup: unable to prune backup directory /mnt/cloudronbackup/myhost/2024-11-26-030148-145: ENOTEMPTY: directory not empty, rmdir '/mnt/cloudronbackup/myhost/2024-11-26-030148-145'
        Dec 09 05:28:23 box:backupcleaner removeBackup: error removing app_cef6575d-2b80-4e31-bf0f-01b61e1ca149_v1.4.3_344f2264 from database. BoxError: Backup not found at Object.del (/home/yellowtent/box/src/backups.js:281:42) at process.processTicksAndRejections (node:internal/process/task_queues:95:5) { reason: 'Not found', details: {} }
        Dec 09 05:28:23
        
          nebulon (Staff) wrote (#6):

          So this is trying to remove things that are already gone; the log is maybe a bit misleading. Those messages are not critical, as the intended outcome of the task is already fulfilled. Basically, they are just debug logs in the end.
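The "already gone is fine" reasoning above can be sketched in a few lines of Python. This is only an illustration under assumptions, not Cloudron's actual code (the box code is Node.js); the function names remove_file and prune_dir are hypothetical. A missing file counts as success, and a non-empty directory is reported rather than treated as a failure.

```python
import errno
import os
import tempfile


def remove_file(path):
    """Delete a file. Returns True if this call removed it, False if it was already gone."""
    try:
        os.unlink(path)
        return True
    except FileNotFoundError:
        # ENOENT: someone else deleted it first, the desired end state is reached.
        return False


def prune_dir(path):
    """Try to remove an empty directory.

    Returns "removed", "missing" (already gone) or "not-empty" (still has entries).
    Any other error is re-raised because it is unexpected.
    """
    try:
        os.rmdir(path)
        return "removed"
    except FileNotFoundError:
        return "missing"
    except OSError as exc:
        # Linux reports ENOTEMPTY; some platforms report EEXIST instead.
        if exc.errno in (errno.ENOTEMPTY, errno.EEXIST):
            return "not-empty"
        raise
```

With this shape, a caller would only log "not-empty" and "missing" at debug level and reserve real errors for everything else (permissions, I/O errors).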

            tobiasb wrote (#7, last edited by tobiasb):

            If I trust systemctl status box-task-12345.service, then all 4 tasks are trying to remove the same directory (rm -rf /mnt/cloudronbackup/myhost/2024-11-26-030148-145/app_n8n.example.org_v3.57.0).
            I attached strace to the rm -rf process that is still running (pid 130049); it keeps issuing unlinkat/newfstatat syscalls:

            newfstatat(4, "tpgnVSb3898pzsvVCOpQQ-IwDccdLmSNWN1n-DYUdlo", {st_mode=S_IFREG|0644, st_size=2180, ...}, AT_SYMLINK_NOFOLLOW) = 0
            unlinkat(4, "tpgnVSb3898pzsvVCOpQQ-IwDccdLmSNWN1n-DYUdlo", 0) = 0
            

            All the other tasks attempt the same, but get ENOENT:

            newfstatat(4, "FRWM67Hwddxd+j4LlMw0eNkJtl1TB-NY4hupbahkGyTyWLn0Kduf8YTNra3FrvU3", 0x55fcd3be73e8, AT_SYMLINK_NOFOLLOW) = -1 ENOENT (No such file or directory)
            unlinkat(4, "FRWM67Hwddxd+j4LlMw0eNkJtl1TB-NY4hupbahkGyTyWLn0Kduf8YTNra3FrvU3", 0) = -1 ENOENT (No such file or directory)
            

            Summary:

            box-task-21539.service 
              130049 rm -rf /mnt/cloudronbackup/myhost/2024-11-26-030148-145/app_n8n.example.org_v3.57.0 # still running
            box-task-21538.service
              130047 rm -rf /mnt/cloudronbackup/myhost/2024-11-26-030148-145/app_n8n.example.org_v3.57.0
            box-task-21543.service
              130046 rm -rf /mnt/cloudronbackup/myhost/2024-11-26-030148-145/app_n8n.example.org_v3.57.0
            box-task-21545.service
              130048 rm -rf /mnt/cloudronbackup/myhost/2024-11-26-030148-145/app_n8n.example.org_v3.57.0
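The pattern in the strace output (one process unlinks a file, the others get ENOENT for the same name) can be reproduced with a small Python sketch. It is an assumption-laden simulation: threads stand in for the four rm -rf processes, a local temporary directory stands in for the sshfs mount, and the names delete_from_snapshot and race are hypothetical.

```python
import os
import tempfile
import threading


def delete_from_snapshot(directory, names, stats):
    """Unlink every name from a listing taken earlier, tolerating files that are already gone."""
    removed = 0
    missing = 0
    for name in names:
        try:
            os.unlink(os.path.join(directory, name))
            removed += 1
        except FileNotFoundError:
            # Another worker got there first (the ENOENT case in the strace output).
            missing += 1
    stats.append((removed, missing))


def race(num_workers=4, num_files=200):
    """Let several workers delete the same file list; return per-worker stats and leftovers."""
    with tempfile.TemporaryDirectory() as directory:
        for i in range(num_files):
            with open(os.path.join(directory, f"file-{i}"), "w") as fh:
                fh.write("x")
        # Every worker uses the same listing, taken before any deletion starts.
        names = sorted(os.listdir(directory))
        stats = []
        workers = [
            threading.Thread(target=delete_from_snapshot, args=(directory, names, stats))
            for _ in range(num_workers)
        ]
        for worker in workers:
            worker.start()
        for worker in workers:
            worker.join()
        leftover = os.listdir(directory)
    return stats, leftover
```

Each file can be unlinked successfully only once, so across all workers the "removed" counts add up to the number of files and every other attempt ends in ENOENT. How the work splits between workers varies from run to run, but all of them finish without failing and nothing is left behind.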
            

            I will wait and see what happens when the first task has finished.

              nebulon (Staff) wrote (#8):

              OK, so before we get into this: are the tasks finishing, and if so, successfully? Despite the log lines you posted earlier, are the files gone afterwards?

              I am trying to find out whether there is anything to debug in the first place, or whether the log lines were just confusing.

                tobiasb wrote (#9):

                @nebulon

                The tasks are still running (box-task-21538.service started Sun 2024-12-08 23:30:02 UTC), so I need to wait before I can say which tasks were successful and which were not.

                The directory /mnt/cloudronbackup/my_example_org/2024-11-14-030035-138 from my first post is gone.

                  tobiasb wrote (#10):

                  The four jobs from above all finished successfully.

                  So what I take from it: a job just starts and does not care whether another job is already running, so they may try to clean up the same things.
                  And perhaps it is even faster overall, because job A deletes file A while job B can delete file B if it gets there before job A.
                  I did not read the entire log to verify that this was the case.

                  Just an idea:

                  A cleanup job should record/claim what it is working on, so that a following job does not attempt the same thing and waste time finding out that the files in its in-memory list were already deleted.
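One way to picture the claim idea is an atomic marker file per backup, created with O_CREAT | O_EXCL so that only one job can win. This is a hypothetical sketch, not something Cloudron implements; try_claim, release_claim, the .claim suffix and the claim directory are all made up for illustration, and the claim directory is assumed to live on a local disk, since exclusive create is not dependable on every network filesystem.

```python
import os
import tempfile


def try_claim(claim_dir, backup_id, owner):
    """Atomically claim backup_id for owner. Returns True only for the job that wins."""
    marker = os.path.join(claim_dir, backup_id + ".claim")
    try:
        # O_EXCL makes creation fail if the marker already exists.
        fd = os.open(marker, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o644)
    except FileExistsError:
        return False
    with os.fdopen(fd, "w") as fh:
        fh.write(owner)  # record who holds the claim, for debugging
    return True


def release_claim(claim_dir, backup_id):
    """Drop the claim; a missing marker is not an error."""
    try:
        os.unlink(os.path.join(claim_dir, backup_id + ".claim"))
    except FileNotFoundError:
        pass
```

A job would call try_claim before removing a backup directory, skip it when the call returns False, and call release_claim when done. The cost is that a job that crashes leaves a stale marker behind, so some expiry or cleanup rule would be needed.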

                    nebulon (Staff) wrote (#11):

                    Yeah, I guess the main question is why some were even running in parallel. Time-wise they should be spaced out, so maybe those were manually triggered or took unexpectedly long?

                      tobiasb wrote (#12):

                      @nebulon

                      The jobs did not start at the same time; they started at 23:30, 01:30, 03:30 and 05:30.
                      At the moment only one cleanup is running.

                        nebulon (Staff) wrote (#13):

                        So 2 hours seems like plenty of time to finish removing a bunch of files and folders; strange that they took so long.

                        At least I don't see anything to fix as such: locking them can have other side effects, and a task that does some file deletion isn't going to consume many resources anyway. Maybe the target filesystem in this case is very slow?

                          tobiasb wrote (#14):

                          @nebulon Hetzner Storage Box (Finland) via sshfs, with rsync and encrypted filenames. 😄 The host filesystem (Cloudron) is not on an SSD (netcup).

                            tobiasb wrote (#15):

                            I would say let's close this here. I do not see any failed cleanup jobs, even though multiple jobs try to do the same thing.

                            nebulon has marked this topic as solved.