Cloudron Forum

Backups redundant?

    miednr
    wrote last edited by
    #1

    I would like to know why Cloudron has a snapshot folder in the backup as well as a folder for each daily backup. This seems redundant, but I'm sure there is a good reason for it. How does it work? I searched the forums but did not find a discussion about this. Maybe it's too obvious? Would love to understand this miracle... 😉

      girish
      Staff
      wrote last edited by joseph
      #2

      @miednr Good question. Some background:

      • Uploading files is slow and expensive. In this context, "upload" means to put things in the backup destination. For S3, it's a network upload. For a (networked) disk, it is a copy of files over the network.

      • We made a decision that backup data have to be usable without any special tool. What this means is that if you go to your backup storage, you will see .tar.gz files or individual files (rsync). Basically, no special tool is needed if you want to move away from Cloudron. It's important for us to signal that you don't get locked in. For this reason, we don't do differential backups (which would involve some custom format).

      Which then brings us to the snapshot folder. Let's say your app has 50 files. On the first backup, we create a folder called snapshot and upload the 50 files there. Then, we complete the backup process by creating a timestamped directory and copying the contents of snapshot into it. This "copying" is very cheap because there are APIs to do a "remote copy". The timestamped directory is a complete standalone backup. It does not rely on anything else. You can just use normal tools to view the files.

      After some time, your app has 52 files (it created 2 new files). For the next backup, we want to skip uploading the 50 unchanged files and only upload the two new ones. The backup system updates the snapshot folder with the 2 new files and repeats the "remote copy" of the current snapshot to another timestamped directory. The key here is that only 2 files got uploaded (which is the expensive part).
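The 50-then-52 files flow above can be sketched roughly like this. It is only an illustration, not Cloudron's actual code: a local directory stands in for the backup destination, `shutil.copy2` stands in for the expensive "upload", `shutil.copytree` stands in for the cheap "remote copy", and the function name `run_backup` and the byte-by-byte change check are assumptions made for the example.

```python
import filecmp
import shutil
import tempfile
from pathlib import Path


def run_backup(app_dir: Path, dest: Path, label: str) -> list[str]:
    """Sync app_dir into dest/snapshot, then copy the snapshot to dest/<label>.

    Returns the relative paths that had to be "uploaded" (the expensive step).
    Deleted files are not handled in this sketch.
    """
    snapshot = dest / "snapshot"
    snapshot.mkdir(parents=True, exist_ok=True)
    uploaded = []

    for src in sorted(p for p in app_dir.rglob("*") if p.is_file()):
        rel = src.relative_to(app_dir)
        target = snapshot / rel
        # Assumed change check: compare contents. A real tool would more
        # likely compare sizes, timestamps or checksums.
        if target.is_file() and filecmp.cmp(src, target, shallow=False):
            continue  # unchanged, nothing to upload
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, target)  # stand-in for the slow upload
        uploaded.append(rel.as_posix())

    # Stand-in for the "remote copy": the timestamped directory becomes a
    # complete, standalone copy of the current snapshot.
    shutil.copytree(snapshot, dest / label)
    return uploaded


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        app, dest = Path(tmp) / "app", Path(tmp) / "backups"
        app.mkdir()
        for i in range(50):
            (app / f"file{i:02d}.txt").write_text(f"data {i}")
        print(len(run_backup(app, dest, "backup-1")))  # 50 files uploaded

        (app / "new-a.txt").write_text("a")
        (app / "new-b.txt").write_text("b")
        print(len(run_backup(app, dest, "backup-2")))  # only 2 files uploaded
```

Both `backup-1` and `backup-2` end up as full directories that can be browsed on their own, even though the second run only transferred two files.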

      In the case of real disks, we use hard links between the snapshot and timestamped dirs. Even if you have 100GB in the snapshot directory, there is only one copy of the files on disk. Hard links give us a copy-on-write style filesystem.
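Here is a small sketch of the hard-link idea on a local disk. The helper names `link_snapshot` and `replace_file` are made up for the example, and it assumes that a changed file is written as a new file and swapped in rather than edited in place (editing in place would change every hard-linked copy too). How Cloudron handles this internally is not shown in this thread.

```python
import os
import tempfile
from pathlib import Path


def link_snapshot(snapshot: Path, timestamped: Path) -> None:
    """Recreate the snapshot's directory tree, hard-linking every file."""
    for root, _dirs, files in os.walk(snapshot):
        out_dir = timestamped / Path(root).relative_to(snapshot)
        out_dir.mkdir(parents=True, exist_ok=True)
        for name in files:
            # A second name for the same data on disk; no bytes are copied.
            os.link(Path(root) / name, out_dir / name)


def replace_file(path: Path, text: str) -> None:
    """Write a new file and swap it in, leaving other hard links untouched."""
    tmp_path = path.with_name(path.name + ".tmp")
    tmp_path.write_text(text)
    os.replace(tmp_path, path)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        snap = Path(tmp) / "snapshot"
        snap.mkdir()
        (snap / "a.txt").write_text("v1")

        backup = Path(tmp) / "timestamped-1"  # hypothetical directory name
        link_snapshot(snap, backup)

        # Same inode means the data exists on disk only once.
        same = (snap / "a.txt").stat().st_ino == (backup / "a.txt").stat().st_ino
        print(same)

        # Updating the snapshot by replacement keeps the old backup intact.
        replace_file(snap / "a.txt", "v2")
        print((backup / "a.txt").read_text(), (snap / "a.txt").read_text())
```

Hard links only work within a single filesystem, so the snapshot and the timestamped directories have to live on the same disk for this to apply.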

      For S3 and friends, storage costs are lower than ever. I think 1TB on Hetzner is just 5 bucks.

      Maybe a better term for snapshot is scratchpad or working dir or something. But the folder name has stuck since we started 12 years ago!

        miednr
        wrote last edited by
        #3

        @girish Thank you for this profound explanation. It's great that you are using Hardlinks. I guess, that's a real space and time saver. I did my first restores with Cloudron via the UI yesterday.

        A) What does Cloudron internally use for the restore? Snapshot vs. Timestamp?

        B) What could/should anyone use to restore without Cloudron? Snapshot vs. Timestamp?

        This should be different, to make sense...

        As far as I understand, A) is Snapshot and B) is Timestamp. Right?

          robi
          wrote last edited by
          #4

          @girish said in Backups redundant?:

          We made a decision that backup data have to be usable without any special tool. What this means is that if you go to your backup storage - you will see .tar.gz files or individual files (rsync). Basically, no special tool needed if you want to move away from Cloudron. It's important for us to give the signal that you don't get locked-in. For this reason, we don't do differential backups (which will involve some custom format).

          That made me wonder: what would one restore this to, if not Cloudron? (It's kind of special!)

          Conscious tech

            girish
            Staff
            wrote last edited by
            #5

            @miednr said in Backups redundant?:

            A) What does Cloudron internally use for the restore? Snapshot vs. Timestamp?

            Always the timestamp! The snapshot is just a working directory.

            B) What could/should anyone use to restore without Cloudron? Snapshot vs. Timestamp?

            Always timestamp. These are the standalone backups at the specific point in time.

              girish
              Staff
              wrote last edited by
              #6

              @robi said in Backups redundant?:

              Made me ask to what would one restore this if not Cloudron (It's kind of special ! )?

              I mean that if someone wants to move apps out of Cloudron, they can use the backups to get the db dumps, config files and the data files.

              Of course, there is no magic button/tool to migrate with a click from the backup to a non-Cloudron installation. You have to do sysadmin work to migrate away, but I think this is expected. There is no standardized format for these backups; it would have been great if there were one.
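To illustrate the "no special tool" point, here is a minimal sketch that lists and unpacks a .tar.gz backup with nothing but Python's standard library. It assumes an unencrypted tarball; the function names and the example archive layout are hypothetical, and the real contents of a Cloudron backup may look different.

```python
import tarfile
from pathlib import Path


def list_backup(archive: Path) -> list[str]:
    """Return the names of all regular files inside a .tar.gz archive."""
    with tarfile.open(archive, "r:gz") as tar:
        return [m.name for m in tar.getmembers() if m.isfile()]


def extract_backup(archive: Path, out_dir: Path) -> None:
    """Unpack the archive into out_dir so db dumps, config and data can be read."""
    out_dir.mkdir(parents=True, exist_ok=True)
    # Use the safer "data" extraction filter where this Python version has it.
    kwargs = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}
    with tarfile.open(archive, "r:gz") as tar:
        tar.extractall(out_dir, **kwargs)
```

The same can of course be done with plain `tar` on the command line; the point is only that the timestamped backup is an ordinary archive (or, with rsync, an ordinary directory tree).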

                timconsidine
                App Dev
                wrote last edited by
                #7

                Docs could benefit from having girish's explanation added, maybe with a nice Excalidraw diagram.
                When there is time available (ha!)

                Indie app dev, scratching my itches, lover of Cloudron PaaS
