Improve Clone/Backup/Restore Speed

    robi wrote (#1):

    Looking at the progress of a clone from a backup on a remote S3 share, it seems to restore individual files, one by one, which takes hours for larger apps.

    Can the operation be restructured to quickly pull the full backup locally and then copy the files to the container much faster?

    d19dotca wrote (#2):

    @robi This likely has to do with the remote nature of your backups. Using a mounted disk makes cloning very quick. I know there was room for improvement before, and changes were later made to allow faster backups, but I'm not certain whether they affect restores too.

    --
    Dustin Dauncey
    www.d19.ca

      girish (Staff) wrote (#3):

      Is this rsync or tgz? You can now configure the Download concurrency in Advanced Settings.

        robi wrote (#4):

        @girish Currently it's set to rsync with the default concurrency of 10.

        Increasing concurrency isn't as useful when all the files are fetched one by one. It's too many requests.

        Object storage should NOT be treated like a filesystem because it isn't one.

        The large object should be requested in its entirety.

        This is something I worked on at IBM as few people on the planet understood this concept.

        As for a clone, it would be much faster now to install a new copy of the same app, and apply the few deltas. But that's not what's happening.

        How can we speed things up?

        Conscious tech
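
A rough way to see the point about request counts: when every file is its own object, each fetch pays a fixed round-trip cost, and concurrency only divides that cost rather than removing it. The sketch below is a back-of-the-envelope model, not a measurement. The file count, backup size, latency, and bandwidth are all made-up assumptions, and the function name is hypothetical.

```python
def estimate_restore_seconds(num_objects, total_bytes, latency_s, bandwidth_bps, concurrency):
    """Rough model: per-request latency is spread across `concurrency`
    workers, while the payload itself shares one network link."""
    request_overhead = num_objects * latency_s / concurrency
    transfer_time = total_bytes / bandwidth_bps
    return request_overhead + transfer_time


# Hypothetical app: 200,000 files, 5 GB in total, 50 ms per request, 50 MB/s link.
TOTAL_BYTES = 5 * 10**9
LINK_BPS = 50 * 10**6

per_file = estimate_restore_seconds(200_000, TOTAL_BYTES, 0.05, LINK_BPS, concurrency=10)
single_archive = estimate_restore_seconds(1, TOTAL_BYTES, 0.05, LINK_BPS, concurrency=1)

print(f"one request per file (rsync-style): {per_file:.0f} s")
print(f"one request for a single archive:   {single_archive:.0f} s")
```

Under these assumed numbers the per-file layout spends far longer on request overhead than on moving bytes, which is why raising the concurrency setting helps only up to a point.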

          girish (Staff) wrote (#5):

          @robi said in Improve Clone Speed:

          > The large object should be requested in its entirety.

          Mmm, I am not following. What do you mean by large object? If the files are stored separately as rsync does, they have to be fetched separately as well.

          > As for a clone, it would be much faster now to install a new copy of the same app, and apply the few deltas. But that's not what's happening.

          That's honestly very complex (might even be impossible) since you have to delta databases, caches etc.

            robi wrote (#6):

            @girish said in Improve Clone Speed:

            > Mmm, I am not following. What do you mean by large object? If the files are stored separately as rsync does, they have to be fetched separately as well.

            Then I am definitely using the wrong method.

            > > As for a clone, it would be much faster now to install a new copy of the same app, and apply the few deltas. But that's not what's happening.
            >
            > That's honestly very complex (might even be impossible) since you have to delta databases, caches etc.

            For a clone, it would be simpler to duplicate the database/caches, etc over the default install and not compare anything.

            Conscious tech

              girish (Staff) wrote (#7):

              @robi said in Improve Clone Speed:

              > For a clone, it would be simpler to duplicate the database/caches, etc over the default install and not compare anything.

              Yes, good point 🙂 And that's probably the solution to make things faster. Currently, our clone system is tied to a backup. So, this feature request is "clone a live app" (i.e., skip this backup stuff altogether).

                robi wrote (#8):

                @girish are you aware of the incremental feature of tar?

                This feature is provided by tar via the argument --listed-incremental=snapshot-file, where "snapshot-file" is a special file maintained by the tar command to determine which files have been added, modified, or deleted.

                That will speed things up.

                Conscious tech
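
With GNU tar itself this is `tar --create --gzip --file=level0.tgz --listed-incremental=data.snar <dir>`, rerun later with a new archive name and the same snapshot file. To illustrate the idea behind the snapshot file, here is a minimal Python sketch. It is not GNU tar's snapshot format and not how Cloudron works: the JSON snapshot layout, the function name, and the demo file names are all assumptions, and unlike tar this sketch does not record deletions.

```python
import json
import os
import tarfile
import tempfile


def incremental_backup(source_dir, archive_path, snapshot_path):
    """Archive only files that are new or changed since the previous run.

    The snapshot is a JSON map of relative path -> [mtime_ns, size].
    Returns the sorted list of relative paths written to the archive.
    """
    try:
        with open(snapshot_path) as fh:
            previous = json.load(fh)
    except FileNotFoundError:
        previous = {}  # first run behaves like a full (level 0) backup

    current, changed = {}, []
    for root, _dirs, files in os.walk(source_dir):
        for name in files:
            full = os.path.join(root, name)
            rel = os.path.relpath(full, source_dir)
            st = os.stat(full)
            current[rel] = [st.st_mtime_ns, st.st_size]
            if previous.get(rel) != current[rel]:
                changed.append(rel)

    changed.sort()
    with tarfile.open(archive_path, "w:gz") as tar:
        for rel in changed:
            tar.add(os.path.join(source_dir, rel), arcname=rel)

    # Persist the new state so the next run can diff against it.
    with open(snapshot_path, "w") as fh:
        json.dump(current, fh)
    return changed


# Demo in a throwaway directory: a full run, one edit, then an incremental run.
with tempfile.TemporaryDirectory() as work:
    src = os.path.join(work, "data")
    os.mkdir(src)
    for name, text in [("a.txt", "one"), ("b.txt", "two")]:
        with open(os.path.join(src, name), "w") as fh:
            fh.write(text)
    snap = os.path.join(work, "snapshot.json")
    first = incremental_backup(src, os.path.join(work, "level0.tgz"), snap)
    with open(os.path.join(src, "b.txt"), "w") as fh:
        fh.write("two, edited")
    second = incremental_backup(src, os.path.join(work, "level1.tgz"), snap)

print(first)   # every file on the first run
print(second)  # only the file that changed
```

Note that this mainly shrinks what each backup has to upload. A restore still needs the level 0 archive plus every later increment applied in order, so it would not by itself make a clone faster.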

                  yusf wrote (#9):

                  @robi Would this mean that we can have incremental, encrypted tar backups? That would be amazing as it would solve all my backup woes by combining the best of both methods.

                    robi wrote (#10):

                    @yusf that's probably a better question for @girish but likely yes.

                    Conscious tech

                      jdaviescoates wrote (#11):

                      @robi @yusf @girish

                      More info here:
                      https://www.gnu.org/software/tar/manual/html_node/Incremental-Dumps.html

                      And here:
                      https://serverfault.com/a/569668

                      But perhaps what we really want is Borg?
                      https://borgbackup.readthedocs.io/en/stable/

                      It's recommended in that Server Fault post. Also, when I asked Hetzner about backups I can download to store off-Hetzner (you can't download backups from their native backup solution), they suggested I use it. Here is their guide:

                      https://community.hetzner.com/tutorials/install-and-configure-borgbackup

                      But that's ^ all a bit too technical for me. I use Cloudron so I don't have to bother with all that. So I'd love to just have Borg as a backup option on Cloudron 🙂

                      I use Cloudron with Gandi & Hetzner

                        Lonkle wrote (#12):

                        Incremental backups. I thought we had those already, but if we don't - we should vote to support them.

                          jdaviescoates wrote (#13):

                          @Lonk said in Improve Clone/Backup/Restore Speed:

                          > Incremental backups. I thought we had those already, but if we don't - we should vote to support them.

                          We do have them with the rsync option.

                          I use Cloudron with Gandi & Hetzner

                            necrevistonnezr wrote (#14):

                            Actually, using rsync and then rclone-ing the incremental backups (encrypted) to OneDrive has been very reliable for me over the last few years. rclone is such a fantastic tool.

                              Lonkle wrote (#15):

                              @jdaviescoates I'm using .tgz with Backblaze right now; since tgz was the default, I didn't look into it. Should I switch to rsync for the benefit of incremental changes, or are there cons like the ones @robi is trying to solve (his cloning/restoring speed suggestions)?

                              Is "Incremental TAR files" the best of both worlds, basically?

                                marcusquinn wrote (#16):

                                @Lonk If you use rsync, use Wasabi as it has no ingress costs. Also, in Backblaze, check the lifecycle settings on all buckets to make sure you're not paying to keep unlimited old versions of every file; change the setting on each bucket to store only the latest version.

                                Web Design & Development: https://www.evergreen.je
                                Technology & Apps: https://www.marcusquinn.com

                                  Lonkle wrote (#17):

                                  @marcusquinn said in Improve Clone/Backup/Restore Speed:

                                  > @Lonk If you use rsync, use Wasabi as it has no ingress costs. Also, in Backblaze, check the lifecycle settings on all buckets to make sure you're not paying to keep unlimited old versions of every file; change the setting on each bucket to store only the latest version.

                                  I did have infinite versions on. Thanks for saving me there, I owe ya!

                                    marcusquinn wrote (#18):

                                    @Lonk Everyone does, as the sneaky f***ers make it the default. I must have lost thousands of dollars before I found that setting, which everyone else had missed too.

                                    Web Design & Development: https://www.evergreen.je
                                    Technology & Apps: https://www.marcusquinn.com

                                      marcusquinn wrote (#19):

                                      Wasabi don't make it the default though, and have a much better interface. I'm dropping Backblaze from my recommendations for S3 needs and only use it for personal machine backups, which don't have all those extra costs.

                                      Web Design & Development: https://www.evergreen.je
                                      Technology & Apps: https://www.marcusquinn.com

                                        yusf wrote (#20):

                                        @robi said in Improve Clone/Backup/Restore Speed:

                                        > This feature is provided by tar via the argument --listed-incremental=snapshot-file, where "snapshot-file" is a special file maintained by the tar command to determine which files have been added, modified, or deleted.

                                        Have @girish or @nebulon seen this? 😃

                                          jimcavoli (App Dev) wrote (#21):

                                          Recently, I accidentally found myself studying this problem. I've relocated backups for one machine from DigitalOcean Spaces to GCS, and suffice it to say I found the bottleneck in that process. Previously, it appeared to be some traffic management into Spaces, and/or the fact that it was heading to the SFO2 region from NYC3 (you know...because...geography). After turning on backups into GCS in the awesome US multi-region automatic replication (Nearline), it became very obvious that the main limiting factor was a 10 MB/s cap on the disk speed at DO.

                                          Seriously; here's their graph over the last 7 days for Disk I/O performance (it's pretty obvious where the backups are):
                                          [Image: DigitalOcean Disk I/O graph, last 7 days (CleanShot 2020-11-19 at 09.28.45.png)]

                                          The main reason this even showed up is that GCS ingest is way faster from a bandwidth perspective:
                                          [Image: GCS ingest bandwidth graph (CleanShot 2020-11-19 at 09.30.00.png)]

                                          Too bad I don't have the old Spaces graph to show as well, but suffice it to say, it wasn't great. So the GCS switchover actually moved the first bottleneck, getting at the main root of the issue.

                                          I'll update on how things go once the server in question gets moved into a GCP instance. By my rough math, there should be a noticeable performance bump in backups at least, and likely systemwide once it transitions onto the GCP volumes: the small volumes are rated at least 50% faster, and the big one (apps data) should have a network performance ceiling roughly 6x higher than the existing DO volumes. I know this is more on the production/operator side than the personal side of usage, and "throw more, bigger resources at it" is not a fix most folks can or would apply to a NAS/local server on a home internet connection, but it's some interesting data and an intriguing problem in any case.
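
The bottleneck reasoning above can be written as simple arithmetic: a streaming backup moves data only as fast as its slowest stage. In this sketch only the 10 MB/s disk cap comes from the post. The 50 GB data size and the 100 MB/s network rate are assumptions, as is treating the "roughly 6x" ceiling as a 60 MB/s disk rate, and the names are hypothetical.

```python
def find_bottleneck(stage_rates_bps):
    """Return the name of the slowest stage in the pipeline."""
    return min(stage_rates_bps, key=stage_rates_bps.get)


def backup_seconds(size_bytes, stage_rates_bps):
    """Time to stream `size_bytes` through stages that run concurrently."""
    return size_bytes / stage_rates_bps[find_bottleneck(stage_rates_bps)]


MB = 10**6
size = 50 * 10**9  # assumed 50 GB of app data

# 10 MB/s disk cap is from the post; the network rate is an assumption.
before = {"disk_read": 10 * MB, "network_upload": 100 * MB}
# Assumes the "roughly 6x" ceiling translates into 60 MB/s of disk throughput.
after = {"disk_read": 60 * MB, "network_upload": 100 * MB}

for label, stages in [("before", before), ("after", after)]:
    minutes = backup_seconds(size, stages) / 60
    print(f"{label}: limited by {find_bottleneck(stages)}, about {minutes:.0f} min")
```

In both assumed cases the disk stays the limiting stage, so a faster upload target alone would not shorten the backup.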
