Plausible (Analytics Platform)
-
@dylightful said in Plausible (Analytics Platform):
I'm thinking we ought to try and assemble a comprehensive list of Cloudron-like services like this. Seems there are so many these days! (and together it seems like at least someone here has tried them out)
-
@jdaviescoates indeed such a list would be very useful. Although Elest.io is really in the category of providing hosted versions of OpenSource apps. So from a deployment and data ownership perspective quite different.
-
@nebulon said in Plausible (Analytics Platform):
from a deployment and data ownership perspective quite different.
Yeah, they're all a bit different in their models. For me the inclusion criteria would just be something like "makes it really easy for relatively non technical people to install a bunch of open source apps and access them on their own domains", and possibly also those that are aimed at people who are a bit more technical but also automate much of the process too.
-
@dylightful Is the pricing per VM? Can multiple apps be deployed in a single VM?
-
@girish Sorry to awaken an old thread, but just wanted to give another vote for Plausible. I'm trying to have as few servers as possible, with the servers either being specifically for one or two specific apps or something like Cloudron (my TrueNAS Scale has a ton of apps, Synology NAS, and Ubuntu with Portainer). For business, looking at this, SerpBear, and Odoo. TrueNAS has Odoo but for the other two it looks like I can either try Elest.io or spin up yet another server but try out Easy Panel. I've tried CapRover and had mixed experiences, and they also don't have SerpBear.
-
I have this running off my TrueNAS Scale and it works perfectly. I'd really prefer to not have this running there as it's the only app installed that's exposed to the internet (rest are internal apps), so this would be nice to have on Cloudron... and I definitely prefer it over the current selection of analytic solutions.
-
Definitely hoping to see Plausible too. Currently testing Umami but not thrilled with it in all honesty, some benefits to Plausible over Umami such as concatenated referrers and more (i.e. "Google" as opposed to tracking google.ca and google.com separately, which may be helpful to some but frankly most customers just care how much comes from "Google" overall rather than the TLD of it).
-
I would like to contribute by creating this app. Where should I start to learn how to package an app for Cloudron?
-
@emilienbidet start with docs.cloudron.io
-
@robi said in Plausible (Analytics Platform):
@emilienbidet start with docs.cloudron.io
More specifically https://docs.cloudron.io/packaging/tutorial/
-
@emilienbidet We have a forum section (https://forum.cloudron.io/category/96/app-packaging-development) for this. See the pinned topics there for documentation and pointers. Feel free to post there for any packaging help.
-
Plausible is a great platform. We have contributed some (open source) code enhancements and I can say that the two individuals behind Plausible are awesome to work with. We are hosting this on our own infrastructure but continue to maintain a subscription at Plausible.io to support the platform (Note: this is an easy and inexpensive way to test the technology). For those considering "packaging" it, there are multiple docker containers required (5 if my memory is correct on this): database (PostgreSQL), events (Clickhouse), email, application, geoIP database. And Plausible is GDPR-compliant!
-
Plausible Analytics Packaging Attempt
I've begun an attempt at packaging Plausible for the Cloudron platform. So far, I've spent two days on it, and it's a non-trivial effort. I'll leave some notes here for those who may be considering packaging this, or similar apps:
Overview of Plausible's architecture
Plausible depends upon two databases:
- PostgreSQL, which is provided by Cloudron's Postgres addon
- Clickhouse, a high-performance DBMS commonly used in analytics.
Clickhouse must be made available to Plausible in some way, and the upstream maintainers at Plausible provide a reference self-hosting implementation using multiple docker containers and docker-compose. This is a turnkey solution which completely abstracts away the underlying Plausible implementation.
This reference implementation depends on an Alpine Linux docker container for the Plausible binary, as well as upstream container images for PostgreSQL and Clickhouse.
Difficulties in the packaging process
There were two major challenges that I encountered in the packaging process.
Building Plausible Binaries for Cloudron's Ubuntu-derived base Image
Plausible is not a simple Node.js application. It is an Elixir application that runs on the Erlang VM (BEAM) and uses NPM for asset management. It depends on an Elixir-based toolchain that creates compiled binaries from the Plausible source code.
The Plausible team does not provide builds for their application, hence any prospective user or self-hoster that does not wish to depend upon the reference implementation must build Plausible from source.
The Plausible repository contains a Dockerfile which builds the application binary as a part of their own containerization process. My first attempt was to replicate the build process using Plausible's original Dockerfile, copy the build artifacts, and then create a Cloudron package using said builds.
Unfortunately, this did not work because Plausible's original build toolchain depends on an Alpine Linux environment. Cloudron containers use a base image that is derived from Ubuntu 22.04. Hence the build binaries for Plausible were entirely incompatible with the Cloudron docker image.
Thus for my second attempt, I set up a parallel repository that specifically created an Ubuntu 22.04-based build environment for Plausible, and used git submodules to create unofficial, Ubuntu-compatible binaries.
These binaries worked, and with a little additional effort, I was able to copy them over to my Cloudron app package.
Packaging Clickhouse for the Cloudron App Package
The second difficulty that I encountered was providing Clickhouse for the Cloudron app package. Clickhouse is a high-performance DBMS often used in the analytics space, and Plausible requires it as a dependency.
Hence, any effort to package Plausible for Cloudron requires packaging Clickhouse as well.
The major difficulty that I am encountering is that Clickhouse expects to be run under the clickhouse user. This causes all sorts of obscure permissions issues when combined with gosu and the Cloudron environment. Additionally, the Clickhouse database configuration is non-trivial, and so far I have yet to manage a working database connection between Plausible and Clickhouse.
Plausible Initialization
Finally, Plausible's binary must be run with a set of configuration options exposed as environment variables. These configuration options require initialization, and some of them are undocumented.
I hope these notes will be helpful to any future packaging attempts, as they took me quite a lot of time to figure out.
Plausible requires a SECRET_KEY_BASE env, which you can provision according to the documentation here. However, in addition to that, it requires a TOTP_VAULT_KEY env, which is completely undocumented. The application binary will crash and segfault if it is not provided. The TOTP_VAULT_KEY env is a 32-byte base64-encoded string. An example for it can be found in Plausible's env.dev file.
Finally, on initial setup, Plausible expects certain tables to be available in the PostgreSQL database. If those tables do not exist, the binary crashes and segfaults. Plausible provides a createdb.sh script which initializes its database, but it does not work on Cloudron because it expects an empty PostgreSQL server and tries to create its own database. createdb.sh is unable to take Cloudron's existing database (which is created automatically per-application) and simply create tables within it.
A workaround is available for this issue: run Plausible's migrate.sh instead. The script exits with a non-zero exit code, because there is nothing to migrate, but it does create the right tables. Thus as a part of the application's initialization process, the packager must be careful to run the migration script in order to initialize Plausible's PostgreSQL database.
Summary
In conclusion, Plausible Analytics is a challenging app to package for the Cloudron environment, chiefly due to three difficulties:
- A non-trivial Elixir-based build process which must create compatible binaries for Cloudron's Ubuntu-derived base image.
- Dependency on Clickhouse, a high-performance DBMS which is difficult to configure and has obscure permissions-related issues because it expects to run as the clickhouse user.
- Database initialization is poorly documented and requires a workaround.
So far, I have been able to create a build toolchain that outputs compatible Ubuntu binaries. I was also able to solve the PostgreSQL database initialization issue with the aforementioned migrate.sh workaround, and I am even able to see the Plausible admin dashboard.
My only outstanding difficulty is with Clickhouse, as I can't seem to get it to read the config.xml files, nor write to files in the right place.
Thus, this is a call for help. Can somebody help me get Clickhouse working within a Cloudron application package? The most difficult part of Plausible's packaging process was the build toolchain, and I was able to get that working. If we can get Clickhouse to work, I am confident that we can release an app package for Plausible in due course.
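For anyone retracing the initialization notes above, here is a minimal sketch of the two steps in Python. It is not taken from the actual package: the function names are made up for illustration, the 48-byte length for SECRET_KEY_BASE is an assumption (check the upstream documentation), and the migrate.sh path in the last comment is hypothetical. Only the 32-byte base64 format of TOTP_VAULT_KEY and the non-zero exit of migrate.sh come from the notes.

```python
import base64
import secrets
import subprocess
import sys


def generate_secret_key_base(num_bytes: int = 48) -> str:
    """Random base64 string for SECRET_KEY_BASE (the length is an assumption)."""
    return base64.b64encode(secrets.token_bytes(num_bytes)).decode("ascii")


def generate_totp_vault_key() -> str:
    """32 random bytes, base64-encoded, as described for TOTP_VAULT_KEY."""
    return base64.b64encode(secrets.token_bytes(32)).decode("ascii")


def run_migration(command: list[str]) -> int:
    """Run the migration command and return its exit code without raising.

    The notes report that migrate.sh exits non-zero even though it creates
    the tables, so a failure code is logged here instead of aborting startup.
    """
    result = subprocess.run(command, check=False)
    if result.returncode != 0:
        print(
            f"migration exited with code {result.returncode}; continuing",
            file=sys.stderr,
        )
    return result.returncode


if __name__ == "__main__":
    # Generate these once and persist them; regenerating on every start
    # would invalidate whatever was encrypted or signed with the old values.
    print("SECRET_KEY_BASE=" + generate_secret_key_base())
    print("TOTP_VAULT_KEY=" + generate_totp_vault_key())
    # Hypothetical path, adjust to wherever the package installs Plausible:
    # run_migration(["/app/code/plausible/migrate.sh"])
```

Tolerating every non-zero exit code is a blunt choice. A stricter start script could check afterwards that the expected tables exist and fail if they do not.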
-
Update: I'm almost done packaging the app! This one was quite difficult! I've managed to make Clickhouse available in a secure and reasonably elegant fashion, and I've gotten to the point where I can get Plausible working within my Cloudron setup.
Right now, the only task I have left is to use supervisord to manage the Clickhouse and Plausible processes within the same Docker container. Once I complete this, I will release a repository for testing.
-
It works! After a final afternoon learning about Supervisord and multi-process Docker containerization, I was able to create a working Cloudron app package for Plausible! It was a very challenging, but enjoyable process - a journey where I learned how to use multiple tools, such as docker-based build environments, Elixir, Clickhouse, and Supervisord. All that remains now is for me to write the documentation and end-to-end unit tests!
I will release a git repository with the unstable app package once I finish writing the documentation. In the meantime, I am going to do some additional testing on my own, including testing of the sendmail addon configuration. In particular, I want to make sure that the Plausible database is properly backed up and restorable, prior to releasing a git repository. I hope to get this done in the next few days!
In the meantime, if you want to help me out, please help me test Plausible! Come visit my website with your ad-blocker disabled, so I can generate some test data. I can't figure out how to work the upstream Elixir test harness, so instead of synthetic data you can provide me with some real data.
-
@Sydney great progress. Some questions about Clickhouse: my understanding is that Clickhouse is a database. This means that it's not "safe" to back up raw files and instead one needs to save/restore Clickhouse dumps. Is my understanding correct? If so, we might have to make Clickhouse an addon.
And along those lines, if Clickhouse is to be made an addon, do you know if Clickhouse supports multi-tenancy, similar to PostgreSQL/MySQL? In those databases, you can have a single postgres instance and create multiple databases. Each app's database is completely isolated and you can create credentials for an app and set permissions in such a way that it cannot access another app's database. The other side of this is caches like redis. We spin up a redis instance per app since back in the day redis did not support multi-tenancy (maybe this has changed now with redis 6 and ACL).
-
@girish Great questions!
Best Practices for Backing-up Clickhouse
Clickhouse is a database, and you're absolutely right that the best practice for backing up databases would be to save and restore dumps. Right now, for the minimal viable prototype I've simply installed Clickhouse in the read-only Docker image, and then used a custom clickhouse-config.xml file to set its data directory to /app/data/clickhouse/. For now, this seems to work, but a more mature implementation would ideally dump backups to /app/data, and then load them in upon restore. I'll have to do further research to see what's the best way to do this, so please allow me to get back to you with more information.
Clickhouse Multi-tenancy and Potential Addon support
Clickhouse does support multi-tenancy. From my understanding, it is a full-featured database system that allows you to define multiple users with complex access-control lists and create multiple independent databases. It even offers a MySQL-compatible interface for legacy applications. Hence, I believe it will absolutely be possible to implement Clickhouse as a first-class Cloudron addon. It would likewise simplify the backup and restore process as well.
I am interested in implementing Clickhouse as a potential addon, for the future. I actually looked into it at the start of the packaging process, but I wasn't able to make much headway because I couldn't find a lot of documentation on how to create Cloudron addons. Where could I find the source code for some of Cloudron's existing addons? If you could direct me to the source code for Cloudron's MySQL, MongoDB, and PostgreSQL addons, I am confident I can figure something out by comparing and contrasting the code.
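To make the multi-tenancy point above a bit more concrete, here is a rough sketch of what per-app isolation could look like. It only builds the SQL statements and does not talk to a server. The function name and the db_/user_ naming scheme are hypothetical, and it assumes SQL-driven access management is enabled on the Clickhouse server. This is not how any existing Cloudron addon works, just an illustration of one database and one user per app.

```python
import hashlib
import re

# Only plain identifiers are accepted, since they are interpolated into SQL.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def tenant_statements(app_id: str, password: str) -> list[str]:
    """Build Clickhouse statements that give one app its own database and user."""
    if not _IDENTIFIER.fullmatch(app_id):
        raise ValueError(f"unsafe app id: {app_id!r}")
    database = f"db_{app_id}"      # hypothetical naming scheme
    user = f"user_{app_id}"        # hypothetical naming scheme
    # Pass a SHA-256 hex digest so the plain password never appears in the SQL.
    password_hash = hashlib.sha256(password.encode("utf-8")).hexdigest()
    return [
        f"CREATE DATABASE IF NOT EXISTS {database}",
        f"CREATE USER IF NOT EXISTS {user} "
        f"IDENTIFIED WITH sha256_hash BY '{password_hash}'",
        # The user gets privileges on its own database only.
        f"GRANT ALL ON {database}.* TO {user}",
    ]


if __name__ == "__main__":
    for statement in tenant_statements("plausible1", "change-me"):
        print(statement + ";")
```

Each app would then connect with its own credentials and, having no grants elsewhere, should not be able to read another app's database.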