Cloudron Forum


Struggling to Replace MinIO - Advice Welcome!

Discuss · 14 Posts · 9 Posters · 913 Views
  robi (#5)

    There is an effort to explore minio alternatives in another thread.
    Sponsorship may speed things along.

    Conscious tech

    davejgreen (#6, edited)

      Thanks for the responses. We are particularly interested in de-duplication; does anyone know if Cloudron backing up to a Hetzner Storage Box will do de-duplicated backups? I was surprised when Backblaze didn't, but maybe I configured something wrong?

      jadudm (#7)

        Depending on your appetite for loss, I would consider backups-in-depth. That is, one backup site is not a backup.

        1. Use rsync-based backup over SSHFS to Hetzner or similar. You will want to select "use hardlinks" and, if you want it, encryption. The use of hardlinks is, essentially, your de-duplication. (See below.)
        2. For a second layer of depth, I would consider a (daily? weekly? monthly?) backup of your primary backup site to a secondary. This could be a sync to AWS S3, for example. Note that any S3-based backup (B2, Cloudflare ObjectSomething, etc.) will have both a storage cost and an API cost. If you are dealing with millions of small files in your backups, the API costs will become real, because dedupe requires checking each object, and then possibly transferring it (multiple PUT/GET requests per file).
          1. S3 has the ability to automatically keep multiple versions of a file. You could use this to have an in-place rotation/update of files.
          2. If you are doing an S3 backup, you can use lifecycle rules to automatically move your S3 content to Glacier. This is much cheaper than "hot" S3 storage. But, you pay a penalty if you download/delete too early or too often.
        3. As a third, cheap-ish option, go get a 2- or 4-bay NAS that can run TrueNAS, and put a pair of 8-12TB HDDs in it. Configure the disks in a ZFS mirrored pair. Run a cron job once per day/week to pull down the contents of the Hetzner box. (Your cron will want to, again, use rsync with hardlinks.) You now have a local machine mirroring your hot backups. It is arguably more expensive than some other options (~600USD up front), but you don't have any "we might run out of space" issues. And, because you're using it to pull, you don't have any weird networking problems: just SCP the data down. (Or, rsync it down over SSH.)
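The rsync-with-hardlinks pattern from steps 1 and 3 could be sketched as two cron entries. Everything here is hypothetical: host names, users, and paths are placeholders, and provider details (for example, what a Hetzner Storage Box permits over SSH) vary, so treat this as the shape of the setup, not a recipe.

```shell
# Crontab on the server being backed up: one dated snapshot per day.
# Files unchanged since the previous snapshot become hardlinks via
# --link-dest, which is the de-duplication. (% must be escaped in crontabs.)
# "backuphost" and all paths below are made-up placeholders.
30 2 * * *  rsync -a --delete --link-dest=../latest \
              /var/backups/cloudron/ \
              backup@backuphost:/backups/daily-$(date +\%F)/ \
            && ssh backup@backuphost \
              'ln -snf daily-'"$(date +\%F)"' /backups/latest'

# Crontab on the local NAS (step 3): pull the whole snapshot tree down,
# with -H so the hardlinks (and thus the dedupe) survive the transfer.
0 5 * * *   rsync -aH --delete \
              backup@backuphost:/backups/ /mnt/tank/backups/mirror/
```

Note that `--link-dest=../latest` is resolved on the receiving side, relative to the destination directory, so each day's snapshot links back to the previous one.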

        Whatever you are doing, consider targeting two different destinations at two different times (per day/alternating/etc.). Or, consider having some combination of backups that give you multiple copies at multiple sites. That could be Hetzner in two regions, with backups run on alternating days, or it could be you backup to a storage box and pull down a clone every day to a local NAS, or ... or ...

        Ultimately, your 150GB is small. If you're increasing by a few GB per week, you're saying that you are likely to have 1TB/year. Not knowing your company's finances, this is generally considered a small amount of data. Trying to optimize for cost, immediately, is possibly less important than just getting the backups somewhere.

        Other strategies could involve backing up to the NAS locally first, and then using a cron to borg or rsync to a remote host (possibly more annoying to set up), etc. But, you might have more "dedupe" options then. (borg has dedupe built in, I think, but...)

        I have a suspicion that your desire to use object storage might be a red herring. But, again, I don't know your constraints/budget/needs/concerns.


        Deduplication: If you use rsync with hardlinks, then each daily backup will automatically dedupe unchanged files. A hardlink is an additional directory entry pointing at the same file data on disk. So, if you upload super_ai_outputs_day_1.md to your storage on Monday, and it remains unchanged for the rest of time, then each subsequent day is going to be a hardlink to that file. It will, for all intents and purposes, take up zero disk space. So, if you are backing up large numbers of small-to-medium-sized files that do not change, SSHFS/rsync with hardlinks is going to naturally dedupe your unchanging old data.
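That mechanism is easy to verify with plain coreutils. A minimal sketch (throwaway temp directories; the file name is just the example from above), showing that the hardlinked "copy" shares one inode and adds no data:

```shell
set -eu

# Throwaway directories standing in for two daily snapshots.
dir=$(mktemp -d)
mkdir "$dir/mon" "$dir/tue"

# Monday's backup contains a file that never changes again.
printf 'unchanging output\n' > "$dir/mon/super_ai_outputs_day_1.md"

# Tuesday's "copy" is a hardlink -- the same thing rsync --link-dest
# creates for files that did not change between runs.
ln "$dir/mon/super_ai_outputs_day_1.md" "$dir/tue/super_ai_outputs_day_1.md"

# Both names report the same inode: the data exists on disk once,
# and the inode's link count is now 2.
stat -c 'inode=%i links=%h' "$dir/mon/super_ai_outputs_day_1.md"
stat -c 'inode=%i links=%h' "$dir/tue/super_ai_outputs_day_1.md"

rm -rf "$dir"
```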

        This will not do binary deduplication across different files. So, if you're looking for a backup solution that would (say) identify that two 1GB files have an identical 500MB in the middle, and somehow dedupe that... you need more sophisticated tools and strategies. Rsync/hardlinks just makes sure that the same file, backed up every day, does not take (# days * size) space. It takes the original size of the file plus one directory entry per link.

        Note, though, that if you copy a snapshot of your hardlinked backups to an object store, every file may be stored at its full size for every day. I'm possibly wrong on that, but I'm not confident that most tools would know what to do with those hardlinks when copying to an object store. I think you'd end up multiplying your disk usage significantly, because your backup tool will have to create a copy of each file in the object store. (Most object stores have no notion of symlinks/hardlinks.) An experiment with a subset of the data, or even a few files, will tell you the answer to that question.
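That suspicion can be tested locally in miniature. In this sketch (all paths are throwaway temp dirs), a plain GNU `cp -r`, which does not preserve hardlinks, stands in for an upload to a store with no hardlink concept; the copy ends up roughly twice the size of the hardlinked original:

```shell
set -eu

work=$(mktemp -d)
mkdir -p "$work/snapshots/day1" "$work/snapshots/day2"

# One 1 MiB file, plus a hardlinked "day2" name for it.
dd if=/dev/zero of="$work/snapshots/day1/data.bin" bs=1024 count=1024 2>/dev/null
ln "$work/snapshots/day1/data.bin" "$work/snapshots/day2/data.bin"

# du counts the shared inode once: the tree is ~1 MiB, not 2.
du -sk "$work/snapshots"

# A plain recursive copy materializes each name as a full, separate
# file -- just as an object-store upload would. Now ~2 MiB total.
cp -r "$work/snapshots" "$work/uploaded"
du -sk "$work/uploaded"

rm -rf "$work"
```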

        If you have other questions, you can ask here, or DM me.

        I use Cloudron on a DXP2800 NAS w/ 8TB in ZFS RAID1

        ruihildt (#8, edited)

          Minio is back:

          https://blog.vonng.com/en/db/minio-resurrect/

          I see in the HN conversation that people are doubting the author, and others mention that Chainguard will also keep a fork with CVEs patched: https://github.com/chainguard-forks/minio

          Maybe it's worth waiting a bit to see which fork gets consistent maintenance.

          d19dotca (#9, edited)

            I am in a similar position. I currently use iDrive e2 for backups and it’s fine, but it does take around 1.5 hours uploading tarballs from my server. I’m looking at possibly deploying a low-budget Kimsufi server in the same OVH data centre and just mounting that disk over SSHFS to Cloudron on my primary server; I haven’t tried it out yet. If I go this way, I will likely still keep iDrive as a second backup destination and just run it a little less frequently and with lower retention to save on costs a little bit there.

            I’m wondering about MinIO alternatives, as I tried MinIO on a second Cloudron install but it seemed to take even longer than uploading to iDrive e2 somehow (I expected it’d be quicker, not slower). It seemed the project was dead too, but it also looks like there’s an active fork that maybe the Cloudron @staff can look into using instead. It brings back many of the lost MinIO features, by the sounds of it.

            Thinking of other avenues to keep backups more “local”, or as close to local as possible, for rapid backups, and then completely offsite as a second backup plan too.

            I have around 65 GB compressed to back up, around 125 GB uncompressed, I believe.

            --
            Dustin Dauncey
            www.d19.ca

            robi (#10)

              Have you tried the custom Garage S3 app?

              Conscious tech

              girish, Staff (#11)

                The garage app is packaged, just reviewing it and have to get an initial package out.

                d19dotca (#12)

                  I haven’t used Garage yet, but isn’t it just another S3 store? So it’d basically be a MinIO replacement, right? Do we have any other options for ones with “hardlinks” using rsync? I kind of think the Surfer app would honestly be a great way to use as a backup somehow, if it could be used to expose a disk.

                  --
                  Dustin Dauncey
                  www.d19.ca

                  girish, Staff (#13)

                    Whoops, apologies. The garage app is not packaged. We have packaged seaweedfs and that is what is in the pipeline. The S3 compat layer seems to work well in that app. https://git.cloudron.io/packages/seaweedfs-app/ is the package.

                    robi (#14)

                      The minimally changed, full-featured drop-in fork of MinIO is called Silo - https://github.com/pgsty/minio

                      Would be a great community package if not official app replacement.

                      I'd still like to see Garage packaged too, it's quick and simple, even does http hosting.

                      Conscious tech

