Cloudron Data Paths - Best Practices
-
I am trying to set up my first Cloudron instance properly, especially given plans for future growth. I expect to host a significant number of domains (200+), email users (200+) and several apps (Rocket Chat, Chatwoot and others).
I am using a Hetzner cloud server with an attached volume for expandable storage. As expected, volume-based disk storage is 6 times slower than local disk.
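For anyone who wants to check the local-vs-volume gap on their own server, here is a rough Python sketch of a sequential write test. It is only an illustration, not a proper benchmark: the function names and the default sizes are made up for this post, and it says nothing about random I/O, which is what databases mostly do.

```python
import os
import tempfile
import time


def write_throughput_mb_s(directory, size_mb=64, chunk_mb=4):
    """Write about size_mb of random data to a temp file in `directory`,
    fsync it, and return the observed throughput in MB/s."""
    chunk = os.urandom(chunk_mb * 1024 * 1024)
    chunks = size_mb // chunk_mb
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        with os.fdopen(fd, "wb") as handle:
            for _ in range(chunks):
                handle.write(chunk)
            handle.flush()
            os.fsync(handle.fileno())  # push the data to the device, not just the page cache
        elapsed = time.perf_counter() - start
    finally:
        os.remove(path)  # always clean up the test file
    return (chunks * chunk_mb) / elapsed


def compare(local_dir, volume_dir, **kwargs):
    """Run the same write test in both directories and report the ratio."""
    local = write_throughput_mb_s(local_dir, **kwargs)
    volume = write_throughput_mb_s(volume_dir, **kwargs)
    return {"local_mb_s": local, "volume_mb_s": volume, "ratio": local / volume}
```

Calling `compare()` with a directory on the local disk and a directory on the attached volume returns both throughputs and their ratio. Both directories are whatever applies to your server; nothing here is Cloudron-specific.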
According to the Cloudron documentation, boxdata, appsdata, platformdata and /var/lib/docker are all relocatable. Here are my initial thoughts:
-
/var/lib/docker. Docker images take up a lot of space and will only be accessed at boot time. Moving them to an external volume will increase server boot time, but since that is an occasional occurrence, this is perhaps the best path to relocate to an external volume.
-
boxdata. Contains all user emails. This is likely to grow significantly with added users and more email messages/attachments per user. Although email retrieval (and potentially other user-related actions) will be slower, this path relocation will eventually be necessary, so it is better to do it now.
Questions:
(1) Is relocating the docker and boxdata paths the right approach or am I missing something?
(2) What about appsdata and platformdata? Given the server disk size, I can comfortably allow these to reach 30GB total (with a safety factor). I know path sizes depend on which apps are selected and how they are used (e.g. chat will generate more and more data), but any advice from more experienced admins would be appreciated.
(3) What is the recommended procedure for safely increasing the size of an external volume?
-
/var/lib/docker will be accessed by Docker all the time: during app updates, restarts, re-configuration etc. So it's not just at boot time. It's best to keep it on a fast disk (SSD/NVMe). I would not put it on a network mount.
As you mentioned, appsdata and platformdata will largely depend on the apps you install and how much data each app consumes. platformdata mostly contains logs and the databases themselves. appsdata contains the file storage of apps.
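Since the sizes depend so much on usage, it can help to measure them from time to time against whatever budget you have in mind (the 30GB figure above, for example). A minimal Python sketch, assuming you pass in the real locations of the directories yourself; the function names and the dictionary layout are just for illustration and are not part of Cloudron:

```python
import os


def directory_size_bytes(root):
    """Sum the apparent size of all regular files under `root`.
    Symlinks are skipped and unreadable entries are ignored."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root, onerror=lambda err: None):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                if not os.path.islink(path):
                    total += os.lstat(path).st_size
            except OSError:
                pass  # file vanished or is not readable; skip it
    return total


def usage_report(paths, budget_gb):
    """`paths` maps a label (e.g. "appsdata") to a directory.
    Returns per-label sizes, the total, and whether the budget is exceeded."""
    sizes = {label: directory_size_bytes(path) for label, path in paths.items()}
    total = sum(sizes.values())
    return {
        "sizes": sizes,
        "total_bytes": total,
        "over_budget": total > budget_gb * 1024**3,
    }
```

For example, `usage_report({"appsdata": "/path/to/appsdata", "platformdata": "/path/to/platformdata"}, budget_gb=30)`, where both paths are placeholders. Note that this counts apparent file sizes, so the numbers can differ somewhat from what `du` reports for blocks actually used on disk.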
For apps, we have a feature called volumes - https://docs.cloudron.io/volumes/ . You can add an external disk as a volume and you can move an app's storage into a subpath of a volume. This is very similar to android's "move to sdcard" functionality.
For email data, you have to move things manually by ssh'ing into the server. We intend to integrate the storage into the volumes logic at some point (so you can move maildata just like you can move app's data).
-
@girish Thank you for explaining /var/lib/docker and best practice.
I am a bit confused about your documentation reference to volumes vs. https://docs.cloudron.io/storage/#default-data-directory. I am thinking about moving all data directories to an external ext4 volume that Hetzner has automatically mounted at /mnt/<volname>. Will that work? I also want to make sure that the default Cloudron backup processes will still apply. Do I need to move the default location of the Hetzner volume?
-
Hey @crazybrad In my experience, it works (including backup), although it does create issues if for whatever reason the volume doesn't mount correctly on a server restart (happens occasionally).
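One way to guard against the "volume didn't mount" case is to check the mount point before anything writes to it. Here is a small Python sketch of that idea. The function name and the optional marker file are my own invention for illustration (not something Cloudron provides), and you would need to hook it into your own monitoring or startup scripts.

```python
import os


def volume_ready(mount_point, marker_name=None):
    """Return True only if `mount_point` is an actual mount point and,
    when `marker_name` is given, a file with that name exists on it.

    The marker file is a hypothetical extra safeguard: create it once on
    the mounted volume, so an empty directory on the root disk can never
    be mistaken for the volume."""
    if not os.path.ismount(mount_point):
        return False
    if marker_name is not None:
        return os.path.exists(os.path.join(mount_point, marker_name))
    return True
```

If this returns False after a reboot, the data directory is probably just an empty folder on the root disk, and it is safer to fix the mount before starting apps that would otherwise write into it.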
-
CIFS and NFS look like the only "remote" file systems. Are either of them performant? Are there any hosted options for those?
I guess Volume storage that DO, Hetzner and others offer is the main use case aside from "local" Cloudron installs with local disk?
-
@girish Thank you for clarifying. @michaelpope That's another reason why no one likes to restart their servers:) Appreciate the "heads-up".
-
@bmann CIFS, NFS, are both remote file systems, yeah. SSHFS is a remote file system built on SSH, so that technically counts too. But all of those have somewhat limited functionality.
From what I've seen, most people who use the Mount backup use it when their VPS provider gives them a volume to work with. Sometimes those are technically remote too, on the backend, but they act like they are local (I know BuyVM's volumes are based on RDMA, which is basically like direct disk access over ethernet, which is kinda cool - Hetzner probably uses some other technology for their volumes).
-
As expected, volume-based disk storage is 6 times slower than local disk.
Now that the topic is set as "best practices", I wonder if any of you have a nice practice for setting up external Volumes for data (not backups). I'm interested in this because of the size of my Nextcloud instance, which produces a giant backup I don't necessarily need.
-
@michaelpope yeah I was skipping SSHFS as kind of being a temporary hack, whereas CIFS / NFS are designed as remote file systems (total assumption on my part).
I'm thinking about this for scaling storage beyond one machine (not for backup). Nextcloud, Peertube, Pixelfed, and some of the photo management apps are ones that could really eat storage.
-
@bmann said in Cloudron Data Paths - Best Practices:
Peertube, Pixelfed
For those (and Mastodon too) I'd strongly recommend connecting them to an S3 compatible object storage and storing everything there instead.
I use Scaleway Object Storage buckets for that myself.
-
@jdaviescoates OK, but that puts us into custom configuring those apps. Do those env config changes persist when they are Cloudron managed?
I see that being documented on the Mastodon on Cloudron page but it's not on the Peertube or Pixelfed pages.
Between app data, Volumes, and Backups, there are three places to put config around storage. I'd love to see those items be configurable at the Cloudron admin layer.
The wording doesn't totally match, but having Volumes actually be "Storage" or "Volumes and Backup" and having multiple configs there might be a useful evolution.
-
@bmann said in Cloudron Data Paths - Best Practices:
@jdaviescoates OK, but that puts us into custom configuring those apps. Do those env config changes persist when they are Cloudron managed?
Yes. Seems to be working fine.
@bmann said in Cloudron Data Paths - Best Practices:
I see that being documented on the Mastodon on Cloudron page but it's not on the Peertube or Pixelfed pages.
Yeah, could really do with lots more documentation on how to do it. It's mostly documented upstream, but it is way harder than one might expect and seems to require lots of trial and error. Doesn't help that it seems to be a bit different depending on which S3 provider you're using.
-
@bmann Yeah, in that case, I'd say just use a provider that allows you to buy extra storage inexpensively and mount it as a volume. BuyVM is the one that comes to mind honestly.
You can also use CIFS or NFS, but I don't know how well I'd trust them as primary storage for an app.
Also, as said, some apps support external S3 (although that limits your capability to fully encapsulate your backups).