Update: hardening pass complete, production-proven, and one correction on backups
A few days on from the original post, the package has been through a full integration and hardening pass on a live Cloudron box. Most of this is good news. There is also one correction to a claim in the original that is important enough to lead with.
Correction: bundled-ClickHouse backup needs care (a fix is in progress)
The original post said that /app/data backup "just works, no custom backup hook needed." That holds right up until ClickHouse happens to be running a background merge during the backup snapshot.
Here is the race. ClickHouse continuously creates and deletes temporary merge directories under its store (tmp_merge_*, tmp_insert_*, tmp_fetch_*). Cloudron's rsync syncer walks that tree, and if one of those directories disappears between enumeration and read, readTree receives null and crashes on .sort(). The part that matters: that crash aborts the entire server backup run, not only Langfuse. Every app scheduled after it is left on the previous night's snapshot. The failure is timing-dependent, so it appears intermittently with no configuration change on your side.
This affects any ClickHouse-backed package (Plausible, PostHog, SigNoz, and others), so we have filed it upstream as a platform robustness issue: https://forum.cloudron.io/topic/15663/backup-task-crashes-when-a-clickhouse-app-deletes-a-temp-merge-dir-mid-snapshot. A single null-guard before the sort would fix it for every affected app.
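To make the race and the proposed guard concrete, here is a minimal sketch in Python. It is not Cloudron's syncer code; the names list_dir and read_tree and the injectable lister parameter are hypothetical, chosen only for illustration. The single idea it shows is that a directory which vanishes between enumeration and read is treated as empty instead of being sorted as a null value.

```python
import os
from typing import Callable, List, Optional

Lister = Callable[[str], Optional[List[str]]]


def list_dir(path: str) -> Optional[List[str]]:
    """Return the entry names in path, or None if the directory no longer exists."""
    try:
        return os.listdir(path)
    except FileNotFoundError:
        return None


def read_tree(root: str, lister: Lister = list_dir) -> List[str]:
    """Return file paths relative to root in a deterministic depth-first order,
    skipping any directory that disappears mid-walk."""
    entries = lister(root)
    if entries is None:
        # The guard: the parent enumerated this directory, but it was deleted
        # before it could be read. Treat it as empty rather than sorting None.
        return []
    files: List[str] = []
    for name in sorted(entries):
        full = os.path.join(root, name)
        if os.path.isdir(full):
            files.extend(os.path.join(name, child) for child in read_tree(full, lister))
        else:
            files.append(name)
    return files
```

Without the `if entries is None` check, the call to sorted() would raise on the vanished directory and take the whole walk down with it, which is the shape of the failure described above.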
Until that lands, the workaround (full detail in KNOWN-ISSUES.md):
Exclude Langfuse from the automatic backup schedule so a racy run cannot abort the others, and back it up out of band.
One correction to the obvious workaround: Cloudron will not back up a stopped app, so "stop the app, then back it up" is not a valid path. A quiesced backup therefore has to be a filesystem-level snapshot taken outside the platform. Alternatively, simply re-run the backup: the failure is a timing collision and usually clears on the next attempt.
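If you script an out-of-band snapshot, one rough way to see whether a merge is in flight is to look for the temporary directories named above. The sketch below is an assumption-laden helper, not part of the package: find_inflight_tmp_dirs is a hypothetical name, the store path is whatever your install uses, and ClickHouse may create temporary directories with prefixes beyond the three listed in this post. An empty result is only a hint, since a new merge can start immediately after the check.

```python
import fnmatch
import os
from typing import List

# Prefixes taken from this post; treat the list as possibly incomplete.
TMP_PATTERNS = ("tmp_merge_*", "tmp_insert_*", "tmp_fetch_*")


def find_inflight_tmp_dirs(store_root: str) -> List[str]:
    """Return the paths of temporary directories currently present under store_root."""
    hits: List[str] = []
    # os.walk ignores listing errors by default, so a directory that vanishes
    # during the walk is skipped rather than raising.
    for dirpath, dirnames, _ in os.walk(store_root):
        for name in dirnames:
            if any(fnmatch.fnmatch(name, pattern) for pattern in TMP_PATTERNS):
                hits.append(os.path.join(dirpath, name))
    return sorted(hits)
```

A snapshot script could wait and retry while this returns a non-empty list, then take the filesystem-level snapshot as soon as it comes back empty.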
The real fix is the next release. Version 0.2.0 moves the ClickHouse store to a persistentDir and backs it up as a consistent logical dump through backupCommand and restoreCommand, which removes the race structurally, because the syncer never walks the live store at all. It is scoped and tracked, and it gets its own backup-and-restore round-trip gate before it ships.
What is new since the original
A full hardening pass on the live instance, covering the code, configuration and dependency surface, container capabilities, the network surface, the authentication surface, and secrets at rest and in transit. Verdict: pass. The write-up is here: SECURITY.md. In short: the internal services (ClickHouse and the worker) are unreachable from outside; the only surface open without a login is /api/public/*, which is API-key gated by design; secrets are 0600; and the encryption key is byte-identical across both update and restore.
Production-proven from the public package: installed from the versions-url exactly as any operator would, OIDC single sign-on confirmed end to end, presigned trace-media working through the proxy, update and backup-restore both preserving the encryption key, and memory measured under sustained ingestion (peak around 2.6 GiB, with the 5 GiB default holding comfortably).
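If you want to repeat the two file-level checks on your own install (secrets at 0600, encryption key unchanged across update and restore), a minimal sketch might look like the following. The helper names are hypothetical and the location of the secrets files is not stated in this post, so the path you pass in is your own assumption. Comparing SHA-256 digests lets you confirm the key is byte-identical without printing it.

```python
import hashlib
import os
import stat


def is_owner_only(path: str) -> bool:
    """True if the file's permission bits are exactly 0600 (owner read/write only)."""
    return stat.S_IMODE(os.stat(path).st_mode) == 0o600


def fingerprint(path: str) -> str:
    """SHA-256 hex digest of the file contents, safe to log in place of the secret."""
    with open(path, "rb") as handle:
        return hashlib.sha256(handle.read()).hexdigest()
```

Record fingerprint() of the key file before an update or restore, run it again afterwards, and compare the two strings; equal digests mean the file contents did not change.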
Integration note: agentgateway
If you are considering wiring agentgateway's own traces into Langfuse, it will not work on agentgateway 1.3.1, and the cause is not on the Langfuse side. Its config.tracing OTLP exporter sends cleartext even when the endpoint is https://, and Cloudron's TLS-terminating proxy rejects that with a 400. Langfuse accepts the identical payload over TLS without complaint. agentgateway simply will not negotiate TLS for trace export, and there is no configuration option to change that. Filed upstream: [agentgateway issue — link]. Open WebUI and the Langfuse SDKs both handle TLS correctly, so the integrations described in the original post work as written.
What is next
Version 0.2.0 is the ClickHouse backup rework described above. Source, issues and pull requests are here: https://github.com/OrcVole/langfuse-cloudron. Real-world memory-footprint reports and "works on my install" confirmations remain very welcome.