Cloudron Forum


Cloudron+ZFS?

Discuss · Tags: zfs, btrfs · 27 Posts · 7 Posters · 2.1k Views
infogulch (#3)

@girish said in Cloudron+ZFS?:

There is no solution that I am aware of that can take an exact point-in-time "snapshot" of an app across 3 different things without stopping the app. In fact, a very long time ago, we used to stop apps before backups, but people complained loudly. Eventually, we found that it's not really a big issue to take a live backup. It's probably not consistent, but most apps seem to deal with it fine.

Live backups are probably how almost all users of the upstream projects take backups (if they take any at all 😬), including the maintainers, so live backups seem like a reasonable choice from a user-coverage perspective. That also hints that perhaps it's not a big deal that you can't take simultaneous snapshots across multiple volumes; they'll probably be 'more simultaneous' than simply copying files, if that's worth anything.

@girish said in Cloudron+ZFS?:

Back in the day, app data volumes were BTRFS volumes. We faced a lot of instability with BTRFS and ran into issues...

It's very interesting to know that you've already experimented with this approach before. I did search the forum for ZFS but not for BTRFS, which in retrospect seems a bit silly on my part. I think it's quite reasonable that you chose a different direction after encountering so many problems trying to integrate with BTRFS. I've heard similar accounts of corruption and operational issues running BTRFS, and fewer accounts of such issues with ZFS, but I don't have personal experience with either, so this is just hearsay.

If you had some willing help / test subjects, would you consider experimenting with a CoW filesystem like ZFS again? If so, how might you expect that to look?

@girish said in Cloudron+ZFS?:

I don't have much experience with zfs, but back in the day app data volumes were BTRFS volumes. We faced a lot of instability with BTRFS and ran into issues which we couldn't solve or understand. It was really hard to get reasonable help for common issues: it kept running out of inodes, and then people ask you to rebalance, which also fails sometimes. Sometimes the file system becomes corrupt. Eventually, we decided not to use it.

For Cloudron's backup situation, having snapshots would indeed help, since we can upload (to the backup backend) an atomic point-in-time snapshot of the app data volumes, as you suggested. However, we found that this is not a "complete" solution, because apps store data in the database, the cache (redis), and also on the file system. There is no solution that I am aware of that can take an exact point-in-time "snapshot" of an app across 3 different things without stopping the app. In fact, a very long time ago, we used to stop apps before backups, but people complained loudly. Eventually, we found that it's not really a big issue to take a live backup. It's probably not consistent, but most apps seem to deal with it fine.

robi (#4)

@girish the stability of ZFS on Linux has improved drastically since I first played with it. It's definitely worth trying now.

Also, when you tried BTRFS, it was plagued by all of the immaturity issues that ZFS didn't have at the time.

Now that distros let you add ZFS volumes by default, it's pretty awesome for all the benefits you get.

girish (Staff) (#5)

Yeah, might be worth re-visiting. I think this might also help us in setting per-app data directory size limits. There was another long thread elsewhere about how running out of disk space should not bring down the server. If appdata lives in a volume with a specific size (assuming zfs allows us to resize volumes easily etc.), then that's a big win! Any idea on how these things perform? As long as it's reasonable, that should be ok. It's not like most apps are very disk-i/o heavy.


robi (#6)

          @girish said in Cloudron+ZFS?:

          I think this might also help us in setting per-app data directory size limits.

          Yes

There was another long thread elsewhere about how running out of disk space should not bring down the server. If appdata lives in a volume with a specific size (assuming zfs allows us to resize volumes easily etc.), then that's a big win!

Yes, it does! From 10 years ago, I recall you could only resize up; downsizing was a lateral move across volumes. (A quick sketch of per-dataset quotas follows.)
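
As a rough sketch of how per-app size limits could look (the pool/dataset names here are hypothetical, not anything Cloudron ships):

sudo zfs create -o quota=10G poolname/appdata/app1   # one dataset per app, each with its own limit
sudo zfs set quota=20G poolname/appdata/app1         # growing the limit later is a one-liner
sudo zfs set quota=5G poolname/appdata/app1          # shrinking only succeeds if existing data fits under the new quota
zfs list -o name,used,avail,quota poolname/appdata/app1   # check usage against the quota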

          Any idea on how these things perform? As long as it's reasonable, that should be ok. It's not like most apps are very disk i/o heavy.

Yes, it performs better than almost any other filesystem because of the efficiency and caching that ZFS does. It can also be further enhanced with the ZIL (intent log). For busy systems that is a godsend.

I was one of the key members of the team that actually created the first ZFS system on the planet stretched across two nodes.

          https://www.researchgate.net/publication/333250276_OCP_Hackathon_2015_N-Way_OpenZFS_Redundancy_and_Horizontal_Scaling_on_Flash_Memory_Switched_Fabrics

robi (#7)

Here's someone's account of do's and don'ts with ZFS.

            Creating and managing single/dual drive ZFS filesystem on Linux

Do NOT use ZFS in these cases:

• you want to use ZFS on a single external USB drive (worst case, data corruption will happen on a non-clean dismount, and you would have to recreate the whole dataset)
• you want to use ZFS on a single drive and you have no external drive for backup purposes (why? when the zpool is not cleanly dismounted/exported, some data can get corrupted permanently, and zfs will have no mirror drive from which it can automatically recover valid data, unless you add a second drive of the same type and size for parity/redundancy)
• you do not have hours of your time to learn the basics of ZFS management (the most basic things, though, are on this page)

The majority of the following commands will work on all Linux distributions, though the first part of the tutorial uses Arch/Manjaro Linux packages and package manager. On Ubuntu I was able to set up ZFS using "sudo apt install zfsutils-linux". If you have another distribution, you need to discover whether it has packages for zfs (and kernel modules).

Upgrade and update the system and reboot (in case a new kernel was installed since the last reboot)

            A)
            sudo pacman -S linux-latest-zfs
            reboot
            sudo /sbin/modprobe zfs

if modprobe does not work, try "sudo pacman -R linux-latest-zfs" and try method B:

            B)
            Discover installed kernels:
            uname -r
pacman -Q | grep "^linux"

            and install zfs packages for these:
            sudo pacman -Ss zfs|grep -i linux
            sudo pacman -S linux123-zfs
            pamac install zfs-dkms
            reboot
enable zfs support in the kernel (it was not enabled in 5.8.16-2-MANJARO after reboot, but once enabled by the following command it persists):

            sudo /sbin/modprobe zfs

            ===================================

            Open two pages and search for things and parameters to understand following commands:
            https://zfsonlinux.org/manpages/0.8.1/man8/zpool.8.html
            https://zfsonlinux.org/manpages/0.8.1/man8/zfs.8.html

            sudo smartctl -a /dev/sdb|grep -i "sector size"
            Sector Sizes: 512 bytes logical, 4096 bytes physical
            (smartctl is in package "smartmontools")

It was suggested here https://forum.proxmox.com/threads/how-can-i-set-the-correct-ashift-on-zfs.58242/post-268384 to use the parameter ashift=12 in the following "zpool create" command for drives with a 4096-byte physical sector size, and ashift=13 for 8K physical sectors. If ashift is not defined, zfs autodetects it; I do not know how good the autodetection is.
Attempt to create a pool named "poolname" on an HDD of choice (use a disk that stores no important data or it will be lost, and unmount the drive first, maybe using gparted):

A) sudo zpool create -o ashift=12 -o feature@async_destroy=enabled -o feature@empty_bpobj=enabled -o feature@lz4_compress=enabled poolname /dev/disk/by-id/ID-HERE
(find the disk ID with: ls -l /dev/disk/by-id/)

or the same command, except the pool will be created across 2 physical drives (of the same size, else the pool will not use all the space on the bigger drive?) where one is used for redundancy (recommended, to reduce the risk of irreversible data corruption and to double the read performance):
B) sudo zpool create -o ashift=12 -o feature@async_destroy=enabled -o feature@empty_bpobj=enabled -o feature@lz4_compress=enabled poolname mirror /dev/disk/by-id/DRIVE1-ID-HERE /dev/disk/by-id/DRIVE2-ID-HERE
(for a 4-drive setup of two mirrors, it would be: zpool create poolname mirror drive1id drive2id mirror drive3id drive4id)

Regarding the recordsize parameter: it was suggested in places like https://blog.programster.org/zfs-record-size and https://jrs-s.net/2019/04/03/on-zfs-recordsize/ and https://www.reddit.com/r/zfs/comments/8l20f5/zfs_record_size_is_smaller_really_better/ that for a drive holding large media files, the record size is better increased from 128k to 512k, so I did that for my multimedia drive. The zfs manual page linked above, though, says this value is only a suggestion and zfs automatically adjusts it per usage patterns. The unofficial articles also say the record size should be similar to the size of the typical storage operation within the dataset, which may contradict the file size itself. "zpool iostat -r" shows the operation size distribution/counts; if the zpool is a single drive, you can maybe also use "sudo iostat -axh 3 /dev/zpooldrivename" and check the "rareq-sz" column (average read request size).

            Creating two datasets one encrypted one not:
            sudo zfs create -o compression=lz4 -o checksum=skein -o atime=off -o xattr=sa -o encryption=on -o keyformat=passphrase -o mountpoint=/e poolname/enc
            sudo zfs create -o compression=lz4 -o checksum=skein -o atime=off -o xattr=sa -o encryption=off -o recordsize=512K -o mountpoint=/d poolname/data

            fix permissions:
            sudo chown -R $(whoami):$(whoami) /poolname /e /d

gracefully unmount the pools (I think this is necessary, or the pool will be marked as suspended and a restart will be needed):
            sudo zpool export -a

            mount the pools:
            sudo zpool import -a
            (if it fails, you have to mount manually, list disk names (ls -l /dev/disk/by-id/), then: sudo zpool import -a -d /dev/disk/by-id/yourdisk1name-part1 -d /dev/disk/by-id/yourdisk2name-part1 )

If some pool is encrypted, then an additional command is needed (the -l parameter prompts for the passphrase, else it complains "encryption key not loaded"):
            sudo zfs mount -a -l

            pool activity statistics:
            zpool iostat -vlq
            zpool iostat -r (request size histograms)
            zpool iostat -w (wait/latency histograms)

            intent log statistics:
            cat /proc/spl/kstat/zfs/zil

            change mountpoint of some dataset within the pool:
            sudo mkdir /new;sudo zfs set mountpoint=/new poolname/datasetname

rename/move a dataset (useful for the error "cannot destroy: filesystem has children"):
            sudo zfs rename poolname/dataset/subdataset poolname/subdatasetnew

attach a new drive (if the existing one is a non-redundant single drive, the result will be a mirror (something like RAID1, with enhanced reads, 1-drive fault tolerance, and data self-healing); if the existing drive is part of a mirror, it becomes a three-way mirror):
            zpool attach poolname existingdrive newdrive

            Detach, remove, replace, see manual page (man zpool) or https://zfsonlinux.org/manpages/0.8.1/man8/zpool.8.html

            create snapshot:
            zfs snapshot -r poolname@snapshot1

            destroy (delete) all snapshots (no prompt):
            sudo zfs list -H -o name -t snapshot -r POOLNAME|sudo xargs -n1 zfs destroy

            destroy (delete) dataset (no prompt):
            sudo zfs destroy poolname/enc

            destroy (delete) whole pool (no prompt):
            sudo zpool destroy poolname

            ========
If you are OK with HDD activity increasing at times of no/low regular activity, then consider enabling automatic scrubbing (a kind of runtime "fsck" that checks data and can even repair it on replicated devices (mirror/raidz)). The following sets up a monthly task:

sudo bash -c 'echo -e "[Unit]\nDescription=Monthly zpool scrub on %i\n\n[Timer]\nOnCalendar=monthly\nAccuracySec=1h\nPersistent=true\n\n[Install]\nWantedBy=multi-user.target" > /etc/systemd/system/zfs-scrub@.timer'

sudo bash -c 'echo -e "[Unit]\nDescription=zpool scrub on %i\n\n[Service]\nNice=19\nIOSchedulingClass=idle\nKillSignal=SIGINT\nExecStart=/usr/bin/zpool scrub %i\n\n[Install]\nWantedBy=multi-user.target" > /etc/systemd/system/zfs-scrub@.service'

sudo systemctl enable zfs-scrub@YOURPOOLNAMEHERE.timer

            ========
            Another page worth reading: https://wiki.archlinux.org/index.php/ZFS

            Terminology:
ZIL - The ZFS intent log is allocated from blocks within the main pool. However, it might be possible to get better sequential write performance using separate intent log devices (SLOG) such as NVRAM.
SLOG - It's just a really fast place/device to store the ZIL (ZFS Intent Log). Most systems do not write anything close to 4GB to the ZIL (cat /proc/spl/kstat/zfs/zil). ZFS will not benefit from more SLOG storage than the maximum ARC size, which is half of system memory on Linux by default. A SLOG device can only increase throughput and decrease latency in a workload with many sync writes. (See the example after this terminology list.)

            ARC - Adaptive Replacement Cache is the ZFS read cache in the main memory (DRAM).

L2ARC - Second Level Adaptive Replacement Cache is used to store read cache data outside of the main memory. ... use read-optimized SSDs (no need for mirroring/fault tolerance)

            Cache - These devices (typically a SSD) are managed by L2ARC to provide an additional layer of caching between main memory and disk. For read-heavy workloads, where the working set size is much larger than what can be cached in main memory, using cache devices allow much more of this working set to be served from low latency media. Using cache devices provides the greatest performance improvement for random read-workloads of mostly static content. (zpool add POOLNAME cache DEVICENAME)
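
To make the SLOG entry above concrete, attaching a dedicated log device is a single command (a sketch; poolname and the device paths are placeholders):

# add a fast device (e.g. an NVMe partition) as a separate intent log (SLOG)
sudo zpool add poolname log /dev/disk/by-id/nvme-ID-HERE

# or add a mirrored SLOG instead, since a failed SLOG can cost recent sync writes
sudo zpool add poolname log mirror /dev/disk/by-id/nvme-ID1 /dev/disk/by-id/nvme-ID2

# verify it shows up under "logs"
zpool status poolname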

            Interesting utilities:
            ZREP is a ZFS based replication and failover script https://github.com/bolthole/zrep
            Syncoid facilitates the asynchronous incremental replication of ZFS filesystems https://github.com/jimsalterjrs/sanoid#syncoid
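
For context, here's a minimal manual version of what such replication tools automate (pool, dataset, and host names are hypothetical):

# initial full replication of a snapshot to a remote host
zfs snapshot poolname@backup1
zfs send poolname@backup1 | ssh user@backuphost sudo zfs receive backuppool/poolname

# later, send only the blocks that changed since backup1
zfs snapshot poolname@backup2
zfs send -i poolname@backup1 poolname@backup2 | ssh user@backuphost sudo zfs receive backuppool/poolname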

            ========
ZFS zpool file statistics (file size, number of files):

/decko/:
a) zpool iostat -r; zpool iostat -w
b)
find /decko/ -type f -print0 | xargs -0 ls -l | awk '{ n=int(log($5)/log(2)); if (n<10) { n=10; } size[n]++ } END { for (i in size) printf("%d %d\n", 2^i, size[i]) }' | sort -n | awk 'function human(x) { x[1]/=1024; if (x[1]>=1024) { x[2]++; human(x) } } { a[1]=$1; a[2]=0; human(a); printf("%3d%s: %6d\n", a[1],substr("kMGTEPYZ",a[2]+1,1),$2) }'

            1k: 8102
            2k: 2938
            4k: 2169
            8k: 2102
            16k: 2311
            32k: 2986
            64k: 2533
            128k: 2164
            256k: 2146
            512k: 1692
            1M: 2284
            2M: 4512
            4M: 7483
            8M: 7890
            16M: 4184
            32M: 1911
            64M: 484
            128M: 1461
            256M: 4911
            512M: 2344
            1G: 578
            2G: 113
            4G: 13
            8G: 11
16G: 2

/ecko/ZN/:
find /ecko/ZN/ -type f -print0 2>/dev/null | xargs -0 ls -l 2>/dev/null | awk '{ n=int(log($5)/log(2)); if (n<10) { n=10; } size[n]++ } END { for (i in size) printf("%d %d\n", 2^i, size[i]) }' | sort -n | awk 'function human(x) { x[1]/=1024; if (x[1]>=1024) { x[2]++; human(x) } } { a[1]=$1; a[2]=0; human(a); printf("%3d%s: %6d\n", a[1],substr("kMGTEPYZ",a[2]+1,1),$2) }'

            1k: 403007
            2k: 33644
            4k: 48356
            8k: 155711
            16k: 62305
            32k: 52709
            64k: 47308
            128k: 44223
            256k: 35698
            512k: 32049
            1M: 34376
            2M: 22291
            4M: 38327
            8M: 8134
            16M: 2448
            32M: 1346
            64M: 1948
            128M: 1438
            256M: 379
            512M: 276
            1G: 124
2G: 3

@infogulch said in Cloudron+ZFS?:

              ZFS is a next-gen filesystem that uses a Copy-on-Write (CoW) storage strategy to provide very useful features such as instant volume snapshots that only consume space proportional to the changes since the previous snapshot, and similarly support "0-storage-consumption" clones of volumes. Naturally, snapshots are the primary vehicle for backups in a ZFS environment. I see one thread asking about ZFS back in 2019, but I didn't see a cogent argument for Cloudron+ZFS. I think ZFS volumes as Cloudron app data volumes would be a match made in heaven.

              Sorry for submitting a whole blog post πŸ˜… there's a TL;DR at the end.

              ZFS Snapshots

There are various [1] excellent [2] explanations of ZFS snapshots and volumes you can find online, but this one is mine: when a new block of data is written in a ZFS volume, the new data and the state of the filesystem after the write do not overwrite the existing block and filesystem; they are instead written to an unused location. Subsequent reads and writes see the old filesystem masked/shadowed by the new data. Eventually these shadowed blocks can be reclaimed and written to again, but until then they still contain the full state of the filesystem from before they were shadowed. If the operator desires to save a particular state (aka take a backup), that's easy: just keep a pointer to it and don't let the data it references be reclaimed -- ZFS calls this a "snapshot". Snapshots are incredibly cheap to create in both time and space, since all they do is tell the system not to reclaim the blocks they reference. You don't write to a snapshot -- they are read-only -- but you can create a "cloned" volume based on a particular snapshot. Again, the new clone is very cheap to create and doesn't consume any storage itself until something writes data to it, masking the snapshot with the new data. One might think of this system as conceptually similar to the way Docker's overlay2 filesystem driver works -- perhaps if it were much more mature and operationalized.

              Cloudron + ZFS

With this knowledge of how ZFS snapshots work, their applicability as an underlying implementation for Cloudron's model of app data volumes becomes clear: each app gets its own ZFS volume for /data. Backups are just snapshots of the volume and become instant, causing no interruption or degradation of app or system functionality. Restores are a cloned volume based on an existing snapshot and no longer require any extra space to create initially. App upgrades can be attempted and reverted immediately if needed. Since backups are cheap and truly incremental, it becomes viable to create them very frequently, potentially as often as every 10 minutes. This could even work for storage volumes hosting e.g. a PostgreSQL database.
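
A rough sketch of what that per-app flow could look like (hypothetical dataset names; not how Cloudron actually lays out /data):

# one dataset per app, mounted at the app's /data path
sudo zfs create -o mountpoint=/appdata/app1 poolname/appdata/app1

# "backup" = instant snapshot, named by timestamp
sudo zfs snapshot poolname/appdata/app1@2021-04-27T10:00

# "restore" = clone the snapshot and point the app at the clone
sudo zfs clone poolname/appdata/app1@2021-04-27T10:00 poolname/appdata/app1-restore

# or roll the live dataset back to the snapshot instead, if preferred
sudo zfs rollback poolname/appdata/app1@2021-04-27T10:00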

              You may guess (correctly) that I think ZFS is friggin awesome and could be a good fit for Cloudron, but I admit that this very simplified presentation of the benefits leaves a lot of open questions and caveats unanswered as it applies to the Cloudron ecosystem. I explore some of these below.

              The ZFS project

ZFS is typically thought of as a FreeBSD filesystem (see Wikipedia for a less truncated history), but the ZFS-on-Linux (ZoL) project made ZFS viable as a Linux filesystem, first shipping as a package in Ubuntu 14.04 in 2014. Late last year the project merged with the FreeBSD base, rebranded to OpenZFS in a combined code repository that supports many OSes, and (fwiw) released its first major versions, going from v0.8.6 to 2.0.0, and then version 2.0.4 just last month.

              Caveats & open questions

              • Cloudron would surely not migrate everyone to ZFS by default (at least not this decade..) and the current simple backup strategy is too valuable in simple installations to ever just dispense with, which means ZFS would have to become a new option, with the maintenance and support burden that implies.
              • In some situations ZFS could save storage space compared to a "normal" filesystem, but it would definitely complicate the storage space calculations since snapshots consume space until they are released. The main issue is that this storage consumption is not directly "visible" where an unknowing user might look (e.g. as files on a mount point) which could cause some confusion. Note, OpenZFS recommends maintaining at least 10% free space.
              • A big feature of Cloudron backups is seamless remote backups, which are certainly possible -- elegant even -- for ZFS snapshots, but this would require some additional fleshing out before it could substitute in for the current fully featured backup and restore system.
• It gives users additional footguns to shoot themselves with. For example, the ZFS deduplication feature sounds great, but it requires more resources than one might think and probably shouldn't be enabled without careful consideration (see the sketch after this list).
              • There are surely other caveats. πŸ™‚ Any others you can think of?
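
On the dedup footgun specifically: it's a one-liner to switch on, which is part of the danger. A sketch (poolname is a placeholder; the RAM figure is a commonly cited rule of thumb, not a guarantee):

# simulate dedup on existing data first: prints a table and an estimated ratio
sudo zdb -S poolname

# dedup tables must stay in RAM to perform well; a rough rule of thumb is ~5GB
# of RAM per 1TB of deduped data, so only enable it if the simulated ratio justifies it
sudo zfs set dedup=on poolname/somedataset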

ZFS technology for Linux systems appears to be converging in 2021, and I think Cloudron's app backup and restore model would lend itself beautifully to being supported directly by ZFS. What do you think? Anyone using ZFS on Cloudron or otherwise?

              TL;DR: yo I think ZFS is dope and could integrate nicely with Cloudron. hbu?

LoudLemur (#8)

@infogulch Ubuntu offers ZFS as an experimental option during installation in the Focal Fossa 20.04 LTS release. If you e.g. set up a virtual machine with ZFS and try to install Cloudron, you receive the following error message:

Error: Cloudron requires '/' to be ext4

I suppose the same thing will happen if you choose btrfs for the file system.


robi (#9)

                @LoudLemur removing this restriction would be useful.


girish (Staff) (#10)

@LoudLemur We use the overlay2 storage driver in Docker. In the past, this driver did not support btrfs at least. See https://git.cloudron.io/cloudron/box/-/issues/364 . You can remove the check in the cloudron-setup script and see what happens.

infogulch (#11)

There is a zfs storage driver for Docker that I think is relevant here: https://docs.docker.com/storage/storagedriver/zfs-driver/
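
Per those docs, switching Docker onto the zfs driver looks roughly like this (a sketch, assuming a fresh /var/lib/docker and a pool named poolname; back up existing images first):

# stop docker and put /var/lib/docker on a ZFS dataset
sudo systemctl stop docker
sudo zfs create -o mountpoint=/var/lib/docker poolname/docker

# tell docker to use the zfs storage driver
echo '{ "storage-driver": "zfs" }' | sudo tee /etc/docker/daemon.json

sudo systemctl start docker
docker info | grep "Storage Driver"   # should report: zfs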

3246 (#12)

Thanks for this interesting discussion. I am struggling with backup due to the volume (350-400GB) and wonder if ZFS + snapshots would be better (faster, easier, and more reliable) than Cloudron's way (tar or rsync)?

The new box is a Hetzner dedi with 2x 3TB drives and a 512GB NVMe. Instead of software RAID1 I am now pondering a ZFS mirror with daily snapshots sent to the Storage Box, and perhaps part of the NVMe as a ZIL log.

I am keen to hear any thoughts and experience you may have, folks 😊


robi (#13)

@3246 that would auto-dedupe at a file level, but the main issue is that backups would make it worse due to the tar and compression. Uncompressed, you'd save way more space, as most files would not change much.

That's why newer-tech storage appliances tend to dedupe at a multiple of 4k bytes, to be able to dedupe even more regardless of file type or compression type.

Otherwise one could play with storing rapidly or increasingly changing data differently from more static app data, which is exactly what Cloudron does, separating the two.


necrevistonnezr (#14)

@3246 I have a similar amount of data - pointing Cloudron's backup to a local drive via rsync and then pushing this snapshot via restic (and rclone) to OneDrive (via cron) works great. It's encrypted, de-duped, and mountable. Backup results are sent via email. If that's interesting, I could share my setup.


robi (#15)

                            @necrevistonnezr Please do.


necrevistonnezr (#16)

                              @robi
                              I use

                              Tools

                              • rclone: https://rclone.org/docs/
                              • restic: https://restic.readthedocs.io/en/stable/030_preparing_a_new_repo.html#other-services-via-rclone
                              • ssmtp: https://wiki.archlinux.org/title/SSMTP

                              Installation

                              • Install tools above via apt
• afterwards, update to the latest versions (the repo versions are old): sudo restic self-update && sudo rclone selfupdate

                              Setup rclone

                              • Enter an interactive setup process via rclone config
• in my case I use OneDrive, as it comes with 1TB of space from my Office 365 subscription
• for the rest of this summary, we assume you gave the remote the name "REPOSITORY"
                              • details at https://rclone.org/commands/rclone_config/

                              Setup restic

• set up a backup repository: restic -r rclone:REPOSITORY init
• for a subfolder on OneDrive, just use restic -r rclone:REPOSITORY:subfolder init
• save the password that you gave the repository in the file /home/USER/resticpw
                              • details at https://restic.readthedocs.io/en/latest/030_preparing_a_new_repo.html#other-services-via-rclone

                              Setup SSMTP

                              • for receiving backup results, otherwise not needed
                              • See https://wiki.archlinux.org/title/SSMTP

                              Cloudron Backup settings

                              • Provider: mountpoint
                              • Location: /media/CloudronBackup (<-- obviously adjust to your settings)
                              • this creates a snapshot at /media/CloudronBackup/snapshot for the current backup
                              • Storage Format: rsync
                              • Adjust schedule and retention to your liking

                              Backup, Prune and Check scripts

                              restic-cron-backup.sh

#!/bin/bash
d=$(date +%Y-%m-%d)
# exit if a previous run is still in progress
if pidof -o %PPID -x "$0"; then
    echo "$(date "+%d.%m.%Y %T") Exit, already running."
    exit 1
fi
restic -r rclone:REPOSITORY:subfolder backup /media/CloudronBackup/snapshot -p=/home/USER/resticpw
restic -r rclone:REPOSITORY:subfolder forget --keep-monthly 12 --keep-weekly 5 --keep-daily 14 -p=/home/USER/resticpw
restic -r rclone:REPOSITORY:subfolder check --read-data-subset=2% -p=/home/USER/resticpw
exit
                              

The first restic line does the backup (incremental, encrypted), the second applies the backup retention policy, and the third checks a random 2% of all data for errors.

                              restic-cron-prune.sh

#!/bin/bash
d=$(date +%Y-%m-%d)
# exit if a previous run is still in progress
if pidof -o %PPID -x "$0"; then
    echo "$(date "+%d.%m.%Y %T") Exit, already running."
    exit 1
fi
restic -r rclone:REPOSITORY:subfolder prune -p=/home/USER/resticpw
exit
                              

removes unused data from the repository; I run this on the first of each month (see crontab below)

                              restic-cron-check.sh

#!/bin/bash
d=$(date +%Y-%m-%d)
# exit if a previous run is still in progress
if pidof -o %PPID -x "$0"; then
    echo "$(date "+%d.%m.%Y %T") Exit, already running."
    exit 1
fi
restic -r rclone:REPOSITORY:subfolder check --read-data -p=/home/USER/resticpw
exit
                              

checks all data for errors; I run this on the first of each month (see crontab below)

                              Crontab

                              30 2 * * * sh /home/USER/restic-cron-backup.sh | mailx -s "Restic Backup Results" server@mydomain.com
                              1 5 1 * * sh /home/USER/restic-cron-prune.sh | mailx -s "Restic Prune Results" server@mydomain.com
                              1 8 1 * * sh /home/USER/restic-cron-check.sh | mailx -s "Restic Full Check Results" server@mydomain.com
                              

Back up daily at 2:30; prune and run the full check on the first of each month. Results are emailed to the specified address.

                              Mount backups

Just to be complete: you can mount restic backups locally like this:
restic -r rclone:REPOSITORY:subfolder mount /media/resticmount/ -p=/home/USER/resticpw && cd /media/resticmount
Obviously adjust /media/resticmount/ to your settings; this allows you to browse and copy from full snapshots of each backup.


robi (#17)

                                @necrevistonnezr beautiful πŸ˜„

                                Funny how you have to prune cuz it forgets. πŸ’­


necrevistonnezr (#18)

                                  @robi said in Cloudron+ZFS?:

                                  Funny how you have to prune cuz it forgets. πŸ’­

                                  😁 😡


3246 (#19)

                                    @necrevistonnezr amazing! Thank you for sharing πŸ™‚

                                    I am pondering a similar approach and currently back up to a secondary internal drive via rsync without encryption, although the backup drive is using LUKS.

I wonder if using rclone crypt, instead of encrypting the files via restic, would gain any advantages (e.g. maximum file/folder name length, path depth, speed)?

                                    Cloudron > rsync to local drive > rclone/crypt via restic > remote


3246 (#20)

Just reading up on restic, encryption, etc., and I may just skip the rclone part, as I am looking to go with either Wasabi or a Hetzner Storage Box.

However, I kinda like the crypt part and am looking for any comparisons between rclone/crypt with restic and restic with encryption, in terms of the time it takes to back up and any drawbacks.


necrevistonnezr (#21)

@3246 said in Cloudron+ZFS?:

Just reading up on restic, encryption, etc., and I may just skip the rclone part, as I am looking to go with either Wasabi or a Hetzner Storage Box.

However, I kinda like the crypt part and am looking for any comparisons between rclone/crypt with restic and restic with encryption, in terms of the time it takes to back up and any drawbacks.

Just to be clear: the encryption (as well as deduplication, the repository, data integrity checks, etc.) is completely handled by restic. rclone is just the "transporter tool" that copies data to providers that restic does not handle (restic out of the box currently handles SFTP, REST Server, Minio, Wasabi, etc.; see https://restic.readthedocs.io/en/stable/030_preparing_a_new_repo.html).


3246 (#22)

@necrevistonnezr thank you. Is the encryption always part of restic, or optional? It looks like it's baked into the way repos are built, right?
