Backup to ext4 disk mount, using rsync, fails on first full backup but succeeds in all future ones.

  • Hello,

    I know you're probably sick of me and the backup issues I'm reporting. 👼 haha

    So in the last day I have set up a new environment again after all the previous backup trouble. I'm no longer mounting with CIFS or NFS; this time I've added an actual ext4 disk attached to the hosted instance. Unfortunately, backing up using rsync with hardlinks always seems to fail the first time with this error (but works every time after that first failure):

    Apr 02 13:19:13 box:shell copy (stdout): /bin/cp: cannot create hard link '/cloudron-backups/2020-04-02-201402-010/box_2020-04-02-201901-713_v5.0.6/mail/vmail/<email>/mail/cur/1585849277.M754834P1428.5aaeeeebead7,S=136634,W=138502:2,Sa' to '/cloudron-backups/snapshot/box/mail/vmail/<email>/mail/cur/1585849277.M754834P1428.5aaeeeebead7,S=136634,W=138502:2,Sa': Operation not permitted
    Apr 02 13:19:17 box:shell copy code: 1, signal: null

    However, what's strange to me is running it a second time without me removing any data, it all succeeds. At first, I thought maybe this ext4 disk isn't allowing hardlinks somehow, but hardlinks would be used on the second go-around too, wouldn't they? And those successive backups do not fail - only the very first one on an empty ext4 hard disk.

    It only seems to be reproducible when I have a clean-slate ext4 disk, where everything is owned by yellowtent and nothing is consuming disk space. If I run rm -rf * in the /cloudron-backups ext4 disk mount and run the backup again, it fails again. If I leave the data intact and run a second backup using the same settings, it succeeds.

    Any ideas on this one?
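
For context, the failing step in the log above is a hard-link copy from the snapshot directory into the dated backup directory. Below is a minimal sketch of that kind of copy, not Cloudron's actual implementation (the log shows it shelling out to /bin/cp). The function name `hardlink_tree` and the example paths are hypothetical. It collects per-file errors instead of aborting, which makes it easier to see which files are refused.

```python
import os
import tempfile


def hardlink_tree(src_root, dst_root):
    """Recreate src_root under dst_root, hard-linking every file.

    Returns a list of (source_path, OSError) for files that could not be linked.
    """
    failures = []
    for dirpath, _dirnames, filenames in os.walk(src_root):
        rel = os.path.relpath(dirpath, src_root)
        target_dir = os.path.normpath(os.path.join(dst_root, rel))
        os.makedirs(target_dir, exist_ok=True)
        for name in filenames:
            src = os.path.join(dirpath, name)
            dst = os.path.join(target_dir, name)
            try:
                # Creates a second directory entry for the same inode.
                # A refusal here surfaces as OSError (e.g. EPERM).
                os.link(src, dst)
            except OSError as err:
                failures.append((src, err))
    return failures


if __name__ == "__main__":
    # Hypothetical layout, loosely modelled on the paths in the log.
    with tempfile.TemporaryDirectory() as base:
        snapshot = os.path.join(base, "snapshot", "box")
        os.makedirs(os.path.join(snapshot, "mail"))
        with open(os.path.join(snapshot, "mail", "msg1"), "w") as fh:
            fh.write("hello")
        dated = os.path.join(base, "2020-04-02", "box")
        for path, err in hardlink_tree(snapshot, dated):
            print("could not link", path, err)
```

Running something like this as the same user that performs the backup, against the real mount, would show whether the refusal affects every file or only some of them.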

  • Update: Went into the /cloudron-backups directory and went to create a hardlink using this command: ln test1 test5 (I had created "test1" earlier). I got the same error as Cloudron did earlier... "operation not permitted".

    I then ran the same command but with sudo, so I ran sudo ln test1 test5 and then it worked. So this tells me the filesystem supports hard links, but needs sudo privileges. I assume yellowtent doesn't have sudo privileges.

    Perhaps the yellowtent user needs to be added to the sudoers file. I will try that next and see if that helps.
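
One thing worth checking before changing sudo settings: on Linux, when the fs.protected_hardlinks sysctl is enabled, an unprivileged user can only hard-link files they own or can both read and write, and other attempts fail with "Operation not permitted". Whether that applies here is an assumption, since the thread does not say who owned test1. The sketch below (function names are hypothetical) attempts the link and reports the ownership details alongside the result.

```python
import errno
import os


def read_protected_hardlinks(path="/proc/sys/fs/protected_hardlinks"):
    """Return the sysctl value as a string, or None if it cannot be read."""
    try:
        with open(path) as fh:
            return fh.read().strip()
    except OSError:
        return None


def try_hardlink(src, dst):
    """Attempt os.link(src, dst) and describe the outcome without raising."""
    st = os.stat(src)
    report = {
        "src_owner_uid": st.st_uid,
        "current_uid": os.geteuid(),
        "owned_by_caller": st.st_uid == os.geteuid(),
        # Note: os.access checks against the real uid, not the effective uid.
        "caller_can_read_write": os.access(src, os.R_OK | os.W_OK),
        "protected_hardlinks": read_protected_hardlinks(),
    }
    try:
        os.link(src, dst)
        report["result"] = "ok"
    except OSError as err:
        # Map the errno to its symbolic name, e.g. EPERM or EEXIST.
        report["result"] = errno.errorcode.get(err.errno, str(err.errno))
    return report
```

If `result` is EPERM while `owned_by_caller` is False and `protected_hardlinks` is "1", file ownership rather than missing sudo rights would be the more likely explanation.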

  • Okay, so I see now that yellowtent doesn't necessarily need sudo access, as it calls /usr/bin/sudo on specific scripts instead. Unfortunately I learned this the hard way. 😕

    I only ran the command sudo usermod -aG sudo yellowtent but somehow that killed the yellowtent user's ability to do any tasks that require sudo, as now backups crash right away and I cannot reboot the system through the Cloudron interface, etc. The error I get when trying to back up is:

    Apr 02 15:33:06 box:taskworker Starting task 2845
    Apr 02 15:33:06 box:settings initCache: pre-load settings
    Apr 02 15:33:06 box:tasks 2845: {"percent":1,"message":"Backing up www.<domain>.<tld>"}
    Apr 02 15:33:06 box:tasks 2845: {"percent":4.703703703703704,"message":"Snapshotting app www.<domain>.<tld>"}
    Apr 02 15:33:06 box:addons www.<domain>.<tld> backupAddons
    Apr 02 15:33:06 box:addons www.<domain>.<tld> backupAddons: Backing up ["mysql","localstorage","sendmail","ldap","scheduler"]
    Apr 02 15:33:06 box:addons www.<domain>.<tld> Backing up mysql
    Apr 02 15:33:07 box:shell backup-snapshot/app_27179da8-f406-4df2-b367-6f4a6f28f58a spawn: /usr/bin/sudo -S -E --close-from=4 /home/yellowtent/box/src/scripts/backupupload.js snapshot/app_27179da8-f406-4df2-b367-6f4a6f28f58a rsync {"localRoot":"/home/yellowtent/appsdata/27179da8-f406-4df2-b367-6f4a6f28f58a","layout":[]}
    Apr 02 15:33:07 box:shell backup-snapshot/app_27179da8-f406-4df2-b367-6f4a6f28f58a (stdout): sudo
    Apr 02 15:33:07 box:shell backup-snapshot/app_27179da8-f406-4df2-b367-6f4a6f28f58a (stdout): :
    Apr 02 15:33:07 box:shell backup-snapshot/app_27179da8-f406-4df2-b367-6f4a6f28f58a (stdout): you are not permitted to use the -C option
    Apr 02 15:33:07 box:shell backup-snapshot/app_27179da8-f406-4df2-b367-6f4a6f28f58a (stdout):
    Apr 02 15:33:07 box:shell backup-snapshot/app_27179da8-f406-4df2-b367-6f4a6f28f58a code: 1, signal: null
    Apr 02 15:33:07 box:backups Unable to backup { BoxError: Backuptask crashed
    at /home/yellowtent/box/src/backups.js:725:29
    at f (/home/yellowtent/box/node_modules/once/once.js:25:25)
    at ChildProcess.<anonymous> (/home/yellowtent/box/src/shell.js:67:9)
    at ChildProcess.emit (events.js:198:13)
    at Process.ChildProcess._handle.onexit (internal/child_process.js:248:12)
    name: 'BoxError',
    reason: 'Internal Error',
    details: {},
    message: 'Backuptask crashed' }
    Apr 02 15:33:07 box:tasks setCompleted - 2845: {"result":null,"error":{"stack":"BoxError: Backuptask crashed\n at /home/yellowtent/box/src/backups.js:725:29\n at f (/home/yellowtent/box/node_modules/once/once.js:25:25)\n at ChildProcess.<anonymous> (/home/yellowtent/box/src/shell.js:67:9)\n at ChildProcess.emit (events.js:198:13)\n at Process.ChildProcess._handle.onexit (internal/child_process.js:248:12)","name":"BoxError","reason":"Internal Error","details":{},"message":"Backuptask crashed"}}
    Apr 02 15:33:07 box:tasks 2845: {"percent":100,"result":null,"error":{"stack":"BoxError: Backuptask crashed\n at /home/yellowtent/box/src/backups.js:725:29\n at f (/home/yellowtent/box/node_modules/once/once.js:25:25)\n at ChildProcess.<anonymous> (/home/yellowtent/box/src/shell.js:67:9)\n at ChildProcess.emit (events.js:198:13)\n at Process.ChildProcess._handle.onexit (internal/child_process.js:248:12)","name":"BoxError","reason":"Internal Error","details":{},"message":"Backuptask crashed"}}

    I then proceeded to back out the change by running sudo deluser yellowtent sudo but this did not fix it. I rebooted the whole server just in case, still nothing. All the apps load and everything, but can't do administrative tasks. I screwed something up. Your guidance would be appreciated. 🙂

  • Okay, I fixed the last one. Somehow I accidentally removed the /etc/sudoers.d/yellowtent file. Strangely, I had created that file myself based on a recommendation for adding sudo privileges, so I removed it when backing out my changes, but I had never seen any of the default content in it. Anyway, long story short... that part is fixed now.

    So now I'm back to the original issue in this post, which is that when running rsync filesystem backups in Cloudron, it gets "operation not permitted" when creating a hardlink. I assume this has to do with some sort of sudo privileges, since it seems my default non-root user also gets the same error when I try to manually create a hardlink but it works fine for root.

    Any ideas on that one? Is this a Cloudron defect where it's not sudo-ing properly perhaps to run the command to create the hardlinks?

  • Latest update: I ran the 6th or so "fresh start" rsync backup with the empty disk, and I expected it to fail as it did the other 5 "fresh start" times, but this time it seemed to work without any errors at all.

    Unsure if this is a coincidence, or related to the change in /etc/sudoers.d/yellowtent file, or something else, but I wanted to say that it seems like it worked this time. Unsure why though. Would still love to hear thoughts on what could cause this issue, because I am worried there is a defect or environmental issue causing it that I need to nail down to have reliable backups of my Cloudron server.

  • Staff

    The yellowtent sudoers file is indeed required for backups as well as other operations. So maybe an intermediate Cloudron update has restored the correct file and thus solved the issue?

  • @nebulon Unfortunately that's not the case. I fixed that file myself after accidentally deleting it under a wrong understanding: I recreated /etc/sudoers.d/yellowtent and filled in the content by grabbing it directly from the repository. That file was apparently last edited quite a while ago, so nothing has changed in it recently. There was also no Cloudron update during my testing, as this all happened on the same day. The file I grabbed the contents from, btw, is this one:

    I wonder if it'd be reproducible by setting up a fresh instance again with the same setup, because that's essentially what occurred for me and it only seemed to be fixed after editing the sudoers.d/yellowtent file. Not sure if that's a coincidence or something different, though. Perhaps there was a discrepancy between what Cloudron set up and what's in the Cloudron git repository?

    Here are the exact steps I took to reach that issue, which no longer seems to be affecting me:

    1. Set up a new public cloud instance at OVH
    2. Create a block storage disk and attach it to the public instance created in step 1
    3. Format the block storage disk and mount it according to this article:
    4. Install and restore Cloudron from old server via DigitalOcean Spaces latest Cloudron backup ID
    5. Once Cloudron is running off latest backup, change backup configuration to file system using rsync non-encrypted with hardlinks, then initiate the first backup to the new location. This is where it failed for me.

    In addition to the steps above, I then tried giving the yellowtent user sudo privileges by running sudo usermod -aG sudo yellowtent. This did not work, so I created a file at /etc/sudoers.d/yellowtent (which, strangely enough, didn't warn me that it already existed, so it was almost as if that file hadn't existed previously). I deleted the file afterwards as it still did not work, and later found the contents that should have been there in the Cloudron/box git repository, so I manually created the file again and added those contents. This immediately resolved the sudo problem: backups ran again and were fully successful rather than hitting the "operation not permitted" error.

    The more I think about it, the more I wonder whether the yellowtent sudoers file even existed after installing Cloudron and restoring from backup. Is it possible that either of those tasks modified it or prevented it from being added?
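
To answer that question on a fresh instance, the state of the file could be recorded right after install and again after restore. The following is a rough sketch with a hypothetical helper name. The checks are deliberately conservative approximations of what sudo expects from a drop-in file (owned by root, not writable by group or others, and a file name without a dot or trailing tilde, since such names are skipped by #includedir); running visudo -c remains the authoritative check.

```python
import os
import stat


def check_sudoers_dropin(path, expected_uid=0):
    """Return a list of problems found with a sudoers drop-in file (empty if none)."""
    problems = []
    try:
        st = os.stat(path)
    except FileNotFoundError:
        return ["missing: " + path]
    if not stat.S_ISREG(st.st_mode):
        problems.append("not a regular file")
    if st.st_uid != expected_uid:
        problems.append("owner uid is %d, expected %d" % (st.st_uid, expected_uid))
    if st.st_mode & (stat.S_IWGRP | stat.S_IWOTH):
        problems.append("writable by group or others")
    name = os.path.basename(path)
    if "." in name or name.endswith("~"):
        problems.append("file name would be skipped by #includedir")
    return problems


if __name__ == "__main__":
    # Path taken from the thread; reading its metadata does not require root.
    for problem in check_sudoers_dropin("/etc/sudoers.d/yellowtent"):
        print(problem)
```

An empty result only means the file exists with plausible ownership and permissions. It says nothing about whether the contents match what Cloudron ships.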

    EDIT: Interestingly, I got collectd errors while running into the sudo issues, before resolving them, and I see in the changelog for your latest update, 5.1.1, that you fixed a collectd installation issue. I doubt it's related, but could that be a factor here, alongside the possibilities above?