Plausible (Analytics Platform)
-
@emilienbidet start with docs.cloudron.io
-
@robi said in Plausible (Analytics Platform):
@emilienbidet start with docs.cloudron.io
More specifically https://docs.cloudron.io/packaging/tutorial/
-
@emilienbidet We have a forum section (https://forum.cloudron.io/category/96/app-packaging-development) for this. See the pinned topics there for documentation and pointers. Feel free to post there for any packaging help.
-
Plausible is a great platform. We have contributed some (open source) code enhancements and I can say that the two individuals behind Plausible are awesome to work with. We are hosting this on our own infrastructure but continue to maintain a subscription at Plausible.io to support the platform (Note: this is an easy and inexpensive way to test the technology). For those considering "packaging" it, there are multiple docker containers required (5 if my memory is correct on this): database (PostgreSQL), events (Clickhouse), email, application, geoIP database. And Plausible is GDPR-compliant!
-
Plausible Analytics Packaging Attempt
I've begun an attempt at packaging Plausible for the Cloudron platform. So far, I've spent two days on it, and it's a non-trivial effort. I'll leave some notes here for those who may be considering packaging this, or similar apps:
Overview of Plausible's architecture
Plausible depends upon two databases:
- PostgreSQL, which is provided by Cloudron's Postgres addon
- Clickhouse, a high-performance DBMS commonly used in analytics.
Clickhouse must be made available to Plausible in some way, and the upstream maintainers at Plausible provide a reference self-hosting implementation using multiple docker containers and docker-compose. This is a turnkey solution which completely abstracts away the underlying Plausible implementation.
This reference implementation depends on an Alpine Linux docker container for the Plausible binary, as well as upstream container images for PostgreSQL and Clickhouse.
Difficulties in the packaging process
There were two major challenges that I encountered in the packaging process.
Building Plausible Binaries for Cloudron's Ubuntu-derived base Image
Plausible is not a simple Node.js application. It is written in Elixir, which runs on the Erlang VM, and it uses NPM for asset management. It depends on an Elixir-based toolchain that creates compiled binaries from the Plausible source code.
The Plausible team does not provide builds for their application, hence any prospective user or self-hoster that does not wish to depend upon the reference implementation must build Plausible from source.
The Plausible repository contains a Dockerfile which builds the application binary as a part of their own containerization process. My first attempt was to replicate the build process using Plausible's original Dockerfile, copy the build artifacts, and then create a Cloudron package using said builds.
Unfortunately, this did not work because Plausible's original build toolchain depends on an Alpine Linux environment (which links against musl libc), while Cloudron containers use a base image that is derived from Ubuntu 22.04 (which uses glibc). Hence the build binaries for Plausible were entirely incompatible with the Cloudron docker image.
Thus for my second attempt, I set up a parallel repository that specifically created an Ubuntu 22.04-based build environment for Plausible, and used git submodules to create unofficial, Ubuntu-compatible binaries.
These binaries worked, and with a little additional effort, I was able to copy them over to my Cloudron app package.
Packaging Clickhouse for the Cloudron App Package
The second difficulty that I encountered was providing Clickhouse for the Cloudron app package. Clickhouse is a high-performance DBMS often used in the analytics space, and Plausible requires it as a dependency.
Hence, any effort to package Plausible for Cloudron requires packaging Clickhouse as well.
The major difficulty that I am encountering is that Clickhouse expects to be run under the clickhouse user. This causes all sorts of obscure permissions issues when combined with gosu and the Cloudron environment. Additionally, the Clickhouse database configuration is non-trivial, and so far I have yet to manage a working database connection between Plausible and Clickhouse.
Plausible Initialization
Finally, Plausible's binary must be run with a set of configuration options exposed as environment variables. These configuration options require initialization, and some of them are undocumented.
I hope these notes will be helpful to any future packaging attempts, as they took me quite a lot of time to figure out.
Plausible requires a SECRET_KEY_BASE env, which you can provision according to the documentation here. However in addition to that, it requires a TOTP_VAULT_KEY env, which is completely undocumented. The application binary will crash and segfault if it is not provided. The TOTP_VAULT_KEY env is a 32-byte base64-encoded string. An example for it can be found in Plausible's env.dev file.
Finally, on initial setup, Plausible expects certain tables to be available in the PostgreSQL database. If those tables do not exist, the binary crashes and segfaults. Plausible provides a createdb.sh script which initializes its database, but it does not work on Cloudron because it expects an empty PostgreSQL server and tries to create its own database. createdb.sh is unable to take Cloudron's existing database (which is created automatically per-application) and simply create tables within it.
A workaround is available for this issue, by running Plausible's migrate.sh instead. The script exits with a non-zero exit code, because there is nothing to migrate, but it does create the right tables. Thus as a part of the application's initialization process, the packager must be careful to run the migration script in order to initialize Plausible's PostgreSQL database.
Summary
In conclusion, Plausible Analytics is a challenging app to package for the Cloudron environment, chiefly due to three difficulties:
- A non-trivial Erlang-based build process which must create compatible binaries for Cloudron's Ubuntu-derived base image
- Dependency on Clickhouse, a high-performance DBMS which is difficult to configure, and has obscure permissions-related issues due to its dependency on the clickhouse user.
- Database initialization is poorly documented and requires a workaround.
So far, I have been able to create a build toolchain that outputs compatible Ubuntu binaries. I was also able to solve the PostgreSQL database initialization issue with the aforementioned migrate.sh workaround, and I am even able to see the Plausible admin dashboard.
My only outstanding difficulty is with Clickhouse, as I can't seem to get it to read the config.xml files, nor write to files in the right place.
Thus, this is a call-for-help. Can somebody help me get Clickhouse working within a Cloudron application package? The most difficult part of Plausible's packaging process was the build toolchain, and I was able to get that working. If we can get Clickhouse to work, I am confident that we can release an app package for Plausible in due course.
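For anyone retracing the initialization notes in this post, here is a minimal Python sketch of a start-up step. It is an illustration, not the code used in the package. It assumes the secrets are generated once and stored in a persistent directory so they survive restarts. The file names, the 48-byte length chosen for SECRET_KEY_BASE, and the migration command passed in are all hypothetical. Only the 32-byte base64 format of TOTP_VAULT_KEY and the non-zero exit code of migrate.sh come from the notes in this post.

```python
import base64
import os
import secrets
import subprocess
import sys
from pathlib import Path
from typing import Callable, Dict, Sequence


def generate_secret_key_base() -> str:
    # Assumption: 48 random bytes, base64-encoded (64 characters).
    return base64.b64encode(secrets.token_bytes(48)).decode("ascii")


def generate_totp_vault_key() -> str:
    # 32 random bytes, base64-encoded, as described in the post.
    return base64.b64encode(secrets.token_bytes(32)).decode("ascii")


def load_or_create(path: Path, generate: Callable[[], str]) -> str:
    """Reuse a stored secret if one exists, otherwise generate and store it."""
    if path.exists():
        return path.read_text().strip()
    value = generate()
    path.write_text(value + "\n")
    path.chmod(0o600)  # readable by the owner only
    return value


def run_migration(command: Sequence[str], extra_env: Dict[str, str]) -> int:
    """Run the migration command and return its exit code without raising."""
    env = {**os.environ, **extra_env}
    return subprocess.run(list(command), env=env, check=False).returncode


def prepare_environment(data_dir: Path, migrate_command: Sequence[str]) -> Dict[str, str]:
    """Provision both secrets, then run the migration as a table-creation step."""
    env = {
        "SECRET_KEY_BASE": load_or_create(data_dir / "secret_key_base", generate_secret_key_base),
        "TOTP_VAULT_KEY": load_or_create(data_dir / "totp_vault_key", generate_totp_vault_key),
    }
    code = run_migration(migrate_command, env)
    if code != 0:
        # Per the notes, a non-zero exit is expected when there is nothing to migrate.
        print(f"migration exited with code {code}; continuing", file=sys.stderr)
    return env
```

A start script could call prepare_environment(Path("/app/data"), ["/path/to/migrate.sh"]) (both paths hypothetical) and export the returned values before launching Plausible. Tolerating the non-zero exit code keeps the workaround from aborting start-up, at the cost of also hiding genuine migration failures, so logging the code is the least a real package should do.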
-
Update: I'm almost done packaging the app! This one was quite difficult! I've managed to make Clickhouse available in a secure and reasonably elegant fashion, and I've gotten to the point where I can get Plausible working within my Cloudron setup.
Right now, the only task I have left is to use supervisord to manage the Clickhouse and Plausible processes within the same Docker container. Once I complete this, I will release a repository for testing.
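As a rough illustration of what running both processes under supervisord can look like, here is a small Python sketch that renders a supervisord configuration with one program section per process. This is not the configuration from the package: the command lines, the priority values, and the choice to run Clickhouse as the clickhouse user are assumptions made for the example.

```python
import configparser
import io


def render_supervisord_conf(clickhouse_cmd: str, plausible_cmd: str) -> str:
    """Render a supervisord config that keeps two processes in the foreground."""
    config = configparser.ConfigParser(interpolation=None)
    # nodaemon keeps supervisord itself in the foreground, as Docker expects.
    config["supervisord"] = {"nodaemon": "true"}

    programs = (
        # (name, command, user or None, priority); a lower priority starts first
        ("clickhouse", clickhouse_cmd, "clickhouse", "10"),
        ("plausible", plausible_cmd, None, "20"),
    )
    for name, command, user, priority in programs:
        section = {
            "command": command,
            "priority": priority,
            "autorestart": "true",
            # Send logs to the container's stdout/stderr without rotation.
            "stdout_logfile": "/dev/stdout",
            "stdout_logfile_maxbytes": "0",
            "stderr_logfile": "/dev/stderr",
            "stderr_logfile_maxbytes": "0",
        }
        if user is not None:
            section["user"] = user  # assumption: supervisord is started as root
        config[f"program:{name}"] = section

    buffer = io.StringIO()
    config.write(buffer)
    return buffer.getvalue()
```

For example, render_supervisord_conf("clickhouse-server --config-file=/path/to/config.xml", "/path/to/plausible start") returns text that could be written to a supervisord.conf file (both paths are placeholders). Note that priority only orders process start-up. It does not wait for Clickhouse to accept connections, so the application may still need to retry its first connection.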
-
It works! After a final afternoon learning about Supervisord and multi-process Docker containerization, I was able to create a working Cloudron app package for Plausible! It was a very challenging, but enjoyable process - a journey where I learned how to use multiple tools, such as docker-based build environments, Elixir, Clickhouse, and Supervisord. All that remains now is for me to write the documentation and end-to-end tests!
I will release a git repository with the unstable app package once I finish writing the documentation. In the meantime, I am going to do some additional testing on my own, including testing of the sendmail addon configuration. In particular, I want to make sure that the plausible database is properly backed up and restorable, prior to releasing a git repository. I hope to get this done in the next few days!
In the meantime, if you want to help me out, please help me test Plausible! Come visit my website with your ad-blocker disabled, so I can generate some test data. I can't figure out how to work the upstream Elixir test harness, so instead of synthetic data you can provide me with some real data.
-
@Sydney great progress. Some questions about clickhouse: my understanding is that clickhouse is a database. This means that it's not "safe" to back up raw files and instead one needs to save/restore clickhouse dumps. Is my understanding correct? If so, we might have to make clickhouse an addon.
And along those lines, if clickhouse is to be made an addon, do you know if clickhouse supports multi-tenancy, similar to postgresql/mysql? In those databases, you can have a single postgres instance and create multiple databases. Each app's database is completely isolated and you can create credentials for an app and set permissions in such a way that it cannot access another app's database. The other side of this is caches like redis. We spin a redis instance per app since back in the day redis did not support multi-tenancy (maybe this has changed now with redis 6 and ACL).
-
@girish Great questions!
Best Practices for Backing-up Clickhouse
Clickhouse is a database, and you're absolutely right that the best practice for backing up databases would be to save and restore dumps. Right now, for the minimal viable prototype I've simply installed Clickhouse in the read-only Docker image, and then used a custom clickhouse-config.xml file to set its data directory to /app/data/clickhouse/. For now, this seems to work, but a more mature implementation would ideally dump backups to /app/data, and then load them in upon restore. I'll have to do further research to see what's the best way to do this, so please allow me to get back to you with more information.
Clickhouse Multi-tenancy and Potential Addon support
Clickhouse does support multi-tenancy. From my understanding, it is a full featured database system that allows you to define multiple users with complex access-control-lists and create multiple independent databases. It even offers a MySQL-compatible interface for legacy applications. Hence, I believe it will absolutely be possible to implement Clickhouse as a first-class Cloudron addon. It would likewise simplify the backup and restore process.
I am interested in implementing Clickhouse as a potential addon, for the future. I actually looked into it at the start of the packaging process, but I wasn't able to make much headway because I couldn't find a lot of documentation on how to create Cloudron addons. Where could I find the source code for some of Cloudron's existing addons? If you could direct me to the source code for Cloudron's MySQL, MongoDB, and PostgreSQL addons, I am confident I can figure something out by comparing and contrasting the code.
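Returning to the data directory override mentioned under Best Practices for Backing-up Clickhouse, here is a minimal Python sketch that generates such a config file. It is not the file from the package. The sketch assumes a recent Clickhouse release where the root element is named clickhouse (older releases used yandex), and the tmp/ and user_files/ subdirectories are choices made for the example.

```python
import xml.etree.ElementTree as ET


def build_clickhouse_config(data_dir: str = "/app/data/clickhouse/") -> str:
    """Build a small server config that relocates Clickhouse's writable paths."""
    if not data_dir.endswith("/"):
        data_dir += "/"  # Clickhouse expects a trailing slash on these paths

    root = ET.Element("clickhouse")
    # Main data directory (tables, metadata).
    ET.SubElement(root, "path").text = data_dir
    # Scratch space used while processing large queries.
    ET.SubElement(root, "tmp_path").text = data_dir + "tmp/"
    # Directory for files readable through the file() table function.
    ET.SubElement(root, "user_files_path").text = data_dir + "user_files/"

    ET.indent(root)  # pretty-print; requires Python 3.9 or newer
    return ET.tostring(root, encoding="unicode")
```

Calling build_clickhouse_config() and writing the result to a file gives the server a writable location outside the read-only image. Keeping every writable path under one directory also means a dump-based backup step, once implemented, has a single place to write to.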
-
Cloudron App Package for Plausible Analytics Released
Hello everyone! Final update for the year of 2023. I have finished writing the initial documentation for my plausible app package, and I am now releasing the git repository for public testing. You may find the link to it below:
Please note that the plausible-app package depends upon pre-built Ubuntu binaries. These unofficial binaries are built directly from the upstream source at Plausible Analytics, and are compiled automatically using GitHub Actions. You may find the build environment used to build these binaries below:
Please be advised that while the app package works, it has not been extensively tested. This app package is not ready for production use, and using it may result in data loss.
-
@girish I've released the beta app package for plausible. Please let me know if you have any feedback: this app package is still very much a work in progress, and I hope to improve it, particularly in regards to the database backup concerns that you outlined earlier.
In particular, I hope to work with you to create a Clickhouse addon so we can support Clickhouse natively within Cloudron. I think this might be the best way forward to take care of the backups issue.
-
@Sydney thanks for the work and notes on clickhouse! For addons, I think it's best that we do the work instead of getting this contributed. Mostly because of issues around Cloudron licensing (copyright assignment) as well as writing tests etc., which is not straightforward since we haven't really documented all this. The (platform) project is not set up for contributions, unlike app packages. I will look into your clickhouse work and see how easy it is to make this an addon.
-
Thank you for the feedback. I am very excited for Plausible to be a part of the Cloudron app store, and I wish you the best of luck in packaging Clickhouse for Cloudron.
For backing up and restoring the Clickhouse database, you may reference the clickbase-backup.sh and clickbase-restore.sh scripts in my repository.
Should you use any of my code in the Clickhouse addon, I am happy to assign copyright to you - just send me a release form.
In the meantime, for those who are not willing to wait for @girish , feel free to use the code in my repository - it provisions a fully working installation of Plausible with a local Clickhouse db.
-
@Sydney , thank you very much, indeed!
I really hope an app will make it to Cloudron sooner, rather than later!
-
Bump on this. I can see it being another one that will bring in new Cloudron users.
-
@LoudLemur said in Plausible (Analytics Platform):
It seemed Plausible is a proprietary solution?
What do you mean? It's AGPL: https://github.com/plausible/analytics/blob/master/LICENSE.md