Slow backup even without changes?
-
This is probably more of a question than a problem...
I am using incremental backups (rsync with hardlinks) on my Cloudron instances (to a Hetzner storage box via CIFS). Nevertheless, backing up takes over an hour even if there have been no changes (e.g. if I trigger one backup right after another).
The time is spent on the transfer of the app container, e.g.:
box:shell copy spawn: /bin/cp -dRl /mnt/cloudronbackup/snapshot/app_af90ceac-46a1-476a-88f8-26d36c6e2bcd /mnt/cloudronbackup/2021-09-25-082136-063/app_eg.mydomain.com_v4.10.2
Is there a reason for this or is this a problem?
Many thanks
-
@avatar1024 I'm not really sure, but I think rsync can take ages even if there are no changes, because it first has to check every file to see whether it has changed. So the transfer itself can end up being tiny or non-existent and quick, but working out whether anything has changed takes a lot of work beforehand, and that can take ages.
-
@jdaviescoates Thanks! That would make sense. I just wonder, in that case, why it is the cp commands (and not the rsync) that seem to take the time?
-
@avatar1024 /me shrugs
Hopefully someone else more knowledgeable will chime in
-
Again today it took almost 6 hours just to back up Nextcloud, even though there were almost no changes in files / contents.
Is this normal??
It's kinda creating issues because the backup now runs well into the day.
This Nextcloud instance is about 24GB and 38,000 files.
Here is the log:
2022-02-20T05:16:50.935Z box:backuptask runBackupUpload: result - {"result":""}
2022-02-20T05:16:50.949Z box:backuptask uploadAppSnapshot: file.xxxxxxx.org upload with id snapshot/app_af90ceac-46a1-476a-88f8-26d36c6e2bcd. 11.795 seconds
2022-02-20T05:16:50.950Z box:backuptask rotateAppBackup: rotating file.xxxxxxx.org to id 2022-02-20-020000-723/app_file.xxxxxxx.org_v4.12.1
2022-02-20T05:16:50.953Z box:tasks update 9112: {"percent":65.70588235294119,"message":"Copying /mnt/cloudronbackup/snapshot/app_af90ceac-46a1-476a-88f8-26d36c6e2bcd to /mnt/cloudronbackup/2022-02-20-020000-723/app_file.xxxxxxx.org_v4.12.1"}
2022-02-20T05:16:50.953Z box:shell copy spawn: /bin/cp -al /mnt/cloudronbackup/snapshot/app_af90ceac-46a1-476a-88f8-26d36c6e2bcd /mnt/cloudronbackup/2022-02-20-020000-723/app_file.xxxxxxx.org_v4.12.1
2022-02-20T11:00:32.842Z box:backuptask copy: copied successfully to id 2022-02-20-020000-723/app_file.xxxxxxx.org_v4.12.1. Took 20621.889 seconds
2022-02-20T11:00:32.842Z box:backuptask fullBackup: app file.xxxxxxx.org backup finished. Took 20635.402 seconds
The backup is done on a Hetzner storage box via sshfs.
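To see where the time goes, the two timestamps around the `cp -al` line in the log can be subtracted. This is only a small sketch: the two timestamp strings are copied from the log above, while the helper name `elapsed_seconds` is made up for illustration.

```python
from datetime import datetime

# Timestamp layout used in the log lines above (UTC, millisecond precision).
FMT = "%Y-%m-%dT%H:%M:%S.%fZ"


def elapsed_seconds(start: str, end: str) -> float:
    """Return the number of seconds between two log timestamps."""
    return (datetime.strptime(end, FMT) - datetime.strptime(start, FMT)).total_seconds()


# "copy spawn: /bin/cp -al ..." and "copy: copied successfully ..." lines.
cp_started = "2022-02-20T05:16:50.953Z"
cp_finished = "2022-02-20T11:00:32.842Z"

cp_seconds = elapsed_seconds(cp_started, cp_finished)
print(f"cp -al step: {cp_seconds:.3f} s ({cp_seconds / 3600:.2f} h)")
```

The result matches the "Took 20621.889 seconds" reported for the copy, while the upload step before it reports only 11.795 seconds, so nearly all of the run is the hardlink copy rather than the upload.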
-
@avatar1024 I don't use rsync (I do Tarball zipped), but I am backing up to a Hetzner Storage Box from a Hetzner Cloud VPS and it looks like my last backup finished at 3:09. They start at 2:00, so seemingly just over an hour for 153.8GB (48.13 of which is Nextcloud).
-
@jdaviescoates Many thanks for looking into this and for the info.
I'll try to switch to tarball zip and see if it improves as this is not really workable at the moment.
The time it takes might just be down to all of the network requests rsync has to make to check for changes on every file, but almost 6 hours still seems quite extreme for "only" 38,000 files (mind you, that is roughly half a second per file, or just under 2 files per second, so it might just be that).
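As a rough check of the per-file cost, the copy time from the log can be divided by the file count. Both inputs are approximate: 38,000 is the rounded file count quoted for the Nextcloud instance, and the backup directory may hold a somewhat different number of entries.

```python
# Copy duration reported in the log ("Took 20621.889 seconds").
copy_seconds = 20621.889
# Approximate file count quoted in the post; the real backup tree may differ.
file_count = 38_000

seconds_per_file = copy_seconds / file_count
files_per_second = file_count / copy_seconds

print(f"{seconds_per_file:.2f} s per file, {files_per_second:.2f} files per second")
```

That works out to roughly half a second per file, which is the sort of latency a few metadata round trips per file over a remote mount could plausibly add up to.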
-
I have the same issue with Nextcloud. The problem really is that having lots of data in Nextcloud means the rsync strategy makes a lot more sense. However, Nextcloud also requires all the plugin code to be writeable, so it gets backed up as well, and that code consists of a lot of small files. On top of that, Nextcloud creates multiple thumbnail versions per file, which increases the small file count once more. A lot of small files create a lot of small backup I/O requests, resulting in an overall speed reduction... it is not clear what the solution would be, unfortunately.
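To illustrate why the file count matters even when nothing has changed, here is a minimal sketch of what a hardlink copy such as `cp -al` roughly amounts to: one directory creation per directory and one link call per file, regardless of file size. This is not Cloudron's code; `hardlink_tree` is a hypothetical helper, and it ignores symlinks, permissions and special files.

```python
import os
import tempfile


def hardlink_tree(src: str, dst: str) -> int:
    """Recreate the directory tree of src under dst, hard-linking every file.

    Returns the number of link() calls made. No file data is copied, but each
    file still costs at least one metadata operation on the target filesystem.
    """
    links = 0
    for root, _dirs, files in os.walk(src):
        rel = os.path.relpath(root, src)
        target_dir = dst if rel == "." else os.path.join(dst, rel)
        os.makedirs(target_dir, exist_ok=True)
        for name in files:
            os.link(os.path.join(root, name), os.path.join(target_dir, name))
            links += 1
    return links


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        snapshot = os.path.join(tmp, "snapshot")
        os.makedirs(os.path.join(snapshot, "thumbnails"))
        for i in range(100):
            with open(os.path.join(snapshot, "thumbnails", f"{i}.jpg"), "w") as f:
                f.write("x")
        count = hardlink_tree(snapshot, os.path.join(tmp, "dated-backup"))
        print(f"{count} link operations for {count} unchanged files")
```

On a local disk each of those operations is very cheap; on a network mount each one is likely to involve at least one round trip to the server, so the total time scales with the number of files rather than with the amount of changed data.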
-
@nebulon Restic is a solid and fast solution for this: https://forum.cloudron.io/topic/1575/backup-improvements-restic-backend
-
@nebulon : interesting about Nextcloud
I used to have 4 x Nextcloud instances on my Cloudron server (different projects), 3 of them with about 10 GB of files each (the other one smaller).
My backups were working but slow.
Even after switching to a 'better' VPS provider.
So after testing Seafile for a while, I have moved 2 of the Nextcloud instances over to a single Seafile VPS (with different libraries for segregation). The other 2 Nextcloud instances will follow. I have nothing against Nextcloud and have been very happy with it for a long time. But if it is really only being used for file collaboration and device syncing, and you don't actually NEED the extra facilities it provides, my personal view is that Seafile is a better solution.
So would be awesome to get Seafile onto Cloudron.
But I'm quite happy running it on a separate VPS.