Interesting... I restored the app but no data was recovered. How is this possible? Also, I can't attach screenshots here: it says the dimensions are too big, but they actually aren't.
mendoksai
Posts
-
Can't uninstall app and mongodb is not reachable
-
Can't uninstall app and mongodb is not reachable
I disabled all updates for now. Thank you, you may resolve this ticket too.
-
Can't uninstall app and mongodb is not reachable
Luckily I have a backup; I'll restore to the previous version.
-
Can't uninstall app and mongodb is not reachable
Oh no, now it's happening for another app after updating the app itself.
This is Kuma.
Apr 19 11:14:40 2026-04-19T02:14:40Z [SERVER] ERROR: Failed to prepare your database: Aggregate table migration is already in progress
Apr 19 11:14:40 2026-04-19T02:14:40Z [SERVER] INFO: Connected to the database
Apr 19 11:14:40 2026-04-19T02:14:40Z [SERVER] INFO: Creating express and socket.io instance
Apr 19 11:14:40 2026-04-19T02:14:40Z [SERVER] INFO: Data Dir: ./data/
Apr 19 11:14:40 2026-04-19T02:14:40Z [SERVER] INFO: Server Type: HTTP
Apr 19 11:14:40 2026-04-19T02:14:40Z [DB] ERROR: Database migration failed
Apr 19 11:14:40 2026-04-19T02:14:40Z [DB] WARN: Aggregate table migration is already in progress, or it was interrupted
Apr 19 11:14:40 2026-04-19T02:14:40Z [DB] INFO: Database Type: sqlite
Apr 19 11:14:53 => Healthcheck error: Error: connect EHOSTUNREACH 172.18.19.169:3001
Apr 19 11:15:03 => Healthcheck error: Error: connect EHOSTUNREACH 172.18.19.169:3001
Is there also a way to stop app updates all at once?
-
Server crashes caused by stopped app's runner container stuck in restart loop
I used rescue mode and saw that the issue was with the kernel, so I fixed the kernel version and disabled Docker and Cloudron so that I could connect. Further investigation suggested the RAID controller had a problem, so I asked Hetzner to check; they replaced the RAID controller and things went back to normal. So I think the issue was more likely caused by the RAID controller, and the timing of the upgrade and reboot just gave the wrong impression, although I'm not 100% sure. Anyway, it seems okay for now. You may close this ticket. Thank you.
-
Can't uninstall app and mongodb is not reachable
I had to remove MongoDB as well. Now it's okay; the app is gone from the dashboard.
-
Can't uninstall app and mongodb is not reachable
Hi Cloudron team,
I have a stale app entry in the dashboard for “Family Chat (Rocket.Chat)”.
The app appears to be already gone at the container/data level, but it still remains in the dashboard as an errored app.
Uninstall fails with a foreign key constraint error referencing box.appAddo....
While investigating, I found that cloudron-support --troubleshoot reports:
- No pending database migrations
- box v9.1.6 is running
- Service mongodb is not reachable
The MongoDB logs suggest it is not just “hanging”, but crashing during startup:
- WiredTiger-related stack frames appear during startup
- mongod aborts
- Cloudron then repeatedly gets MongoNetworkError: connect ECONNREFUSED 127.0.0.1:27017
This started after a hardware RAID controller issue on the server. Hetzner has already replaced the RAID controller successfully and the server is back online.
My current understanding is:
- MongoDB service startup is failing at the WiredTiger/storage layer
- the stale app entry may be a secondary symptom
- uninstall cannot complete because the addon/service cleanup is incomplete
Could you please advise the safest supported recovery path?
Thanks!
-
Server crashes caused by stopped app's runner container stuck in restart loop
Server was stable for 14 days after I fixed the DNS configuration myself. The original daily crash issue was resolved.
This morning I received Cloudron's security reboot email. Rebooted via dashboard. Server never came back. Ping responds, SSH returns kex_exchange_identification: Connection reset by peer. Hard reset via Hetzner Robot didn't help either.
So now I'm locked out of my own server because of an automatic security update that I didn't ask for and had no control over. My mail server is down, again.
I have to ask: is anyone actually testing these updates before pushing them? Every major issue I've had in the past two months has been triggered by an automatic update or upgrade. The previous instability started after a Cloudron update in February. Now this.
I need:
- Help getting my server back online — I'll likely need to use Hetzner rescue mode
- A way to permanently disable automatic security updates so I can apply them manually at a time that works for me
- Some assurance that updates are being properly tested before being pushed to production servers
This is a production server running critical mail services. I can't keep being the QA tester for untested updates.
Are you guys vibe coding?
-
Server crashes caused by stopped app's runner container stuck in restart loop
@girish @nebulon Server crashed again last night. But this time the pattern is different: no containers in restart loop, no runner issues. The cron cleanup job is working. All containers were stable (Up 11 hours) before the crash.
The Docker journal shows the DNS resolver dying on its own:
23:38 - External DNS timeouts begin (185.12.64.2)
23:57 - Internal Docker DNS fails (172.18.0.1:53 i/o timeout)
23:59 - [resolver] connect failed: dial tcp 172.18.0.1:53: i/o timeout
00:xx - Server becomes unresponsive
There's also a container (different ID each time) producing "ignoring event" / "cleaning up dead shim" messages every minute; not sure if related.
This happens roughly at the same time every night (~23:00-00:00 UTC). All previous fixes applied (no restart loops, domain renewed, hardware clean). I'm running out of ideas on my end.
Would it be possible to get SSH-level support to debug this? I can provide access anytime. This is really urgent as it's been impacting my mail service daily for weeks now.
Thank you.
-
Server crashes caused by stopped app's runner container stuck in restart loop
Update: I renewed the expired domain and the app (Lychee) is now running properly. No containers in restart loop currently. The earlier crashes today were likely caused by the runner container still being in a stale state from before the domain renewal.
I have a cron job cleaning up zombie runners every 5 minutes, which seems to be working (log shows it removed 5 runners since setup).
Will monitor for the next few days and report back. If it stays stable, I'll mark this as resolved.
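For anyone wanting to reproduce the cleanup job, here is a minimal sketch of what such a script could look like. It is not the exact script running on the server. It assumes runner containers are named <appid>-runner (as in the docker rm -f command used in this thread) and that stale ones sit in the "created" state; the function names are hypothetical.

```python
import subprocess


def select_stale_runners(names):
    """Keep only names that follow the <appid>-runner pattern (assumed naming)."""
    return [n for n in names if n.endswith("-runner")]


def list_created_containers():
    """Ask Docker for the names of all containers currently in the 'created' state."""
    result = subprocess.run(
        ["docker", "ps", "-a", "--filter", "status=created",
         "--format", "{{.Names}}"],
        check=True, capture_output=True, text=True,
    )
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


def main():
    """Remove every stale runner; intended to be called from the cron script."""
    for name in select_stale_runners(list_created_containers()):
        # Same command that was run by hand earlier: docker rm -f <appid>-runner
        subprocess.run(["docker", "rm", "-f", name], check=True)
        print(f"removed {name}")
```

The filtering is kept in its own function so it can be checked without touching Docker. A cron entry would call main() at whatever interval is wanted (every 5 minutes in my case).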
-
Server crashes caused by stopped app's runner container stuck in restart loop
@nebulon Yes, here's the full timeline of changes:
- Server was stable on Ubuntu 20.04 + kernel 5.4 for months
- Upgraded to Ubuntu 22.04 + kernel 5.15 (following Cloudron upgrade docs) — instability started
- Upgraded to Ubuntu 24.04 + kernel 6.8 (following Cloudron upgrade docs) — issue persists
- Installed fail2ban and smartmontools via apt
- No other custom modifications
All upgrades were done following the official Cloudron documentation. The crashes happen on both kernel 5.15 and 6.8, so it doesn't seem kernel-specific.
One thing that may be relevant: Docker is using the cgroupfs driver with cgroup v2. The Cloudron systemd unit explicitly sets --exec-opt native.cgroupdriver=cgroupfs. Could there be a compatibility issue with Ubuntu 24.04's default cgroup v2?
The server just crashed again twice in one hour. Happy to provide SSH access if that would help debug this. This is urgent as my mail server runs on this machine.
-
Server crashes caused by stopped app's runner container stuck in restart loop
Yes, I followed your upgrade docs, since you suggested upgrading because support for the old Ubuntu version was being discontinued. The problem has been happening ever since. And it just happened again, right now. Twice today.
-
Server crashes caused by stopped app's runner container stuck in restart loop
It happened again. It happens every few days.

-
Server crashes caused by stopped app's runner container stuck in restart loop
Quick update: I just noticed cloudron-support --troubleshoot was reporting:
[FAIL] Database migrations are pending. Last migration in DB: /20260217120000-mailPasswords-create-table.js
This migration has been pending since Feb 17, which is exactly when the instability started. I missed this earlier. Just applied it:
cloudron-support --apply-db-migrations
[OK] Database migrations applied successfully
I've also stopped the Mattermost container that was in a restart loop (it was failing to connect to MySQL on boot and never recovering).
Will monitor for the next few days and report back. Fingers crossed this was the missing piece.
-
Server crashes caused by stopped app's runner container stuck in restart loop
Thanks @girish for looking into this.
You're right: this isn't just about the stopped app. After collecting detailed logs, I found multiple containers in incorrect states on every boot:
<appid-1>-runner Created (stopped app - Lychee)
<appid-2> Restarting (1) (Mattermost)
<appid-3> Restarting (1) (Kimai)
The Mattermost container is the main culprit. On boot, it tries to connect to MySQL before it's ready, fails, and enters an infinite restart loop:
error: Failed to ping DB error="dial tcp 172.18.30.1:3306: connect: connection refused"
Error: failed to initialize platform: cannot create store: error setting up connections
This restart loop seems to degrade Docker networking over time. The Docker journal shows a clear cascade:
- Boot → Mattermost enters restart loop (MySQL not ready yet)
- Docker resolver starts failing: first external DNS timeouts, then internal (172.18.0.1:53)
- Error: listen EADDRNOTAVAIL: address not available 172.18.0.1:3003 on every boot
- Eventually host MySQL becomes unreachable → full server lockup
For journalctl -u docker, there are no explicit error-level entries from the Docker daemon itself, only info-level "ignoring event" / "cleaning up dead shim" messages repeating every 5 minutes for the same container, plus error-level DNS timeout entries from the resolver.
I've stopped both Mattermost and the Lychee runner for now. Will monitor.
Environment details:
- Cloudron 9.1.3
- Ubuntu 24.04.4 LTS, Kernel 6.8.0-106-generic
- Dedicated Server: 8 CPUs, 32GB RAM
- ~35 containers on the cloudron network
- Docker: Cgroup Driver: cgroupfs, Cgroup Version: 2
- Hardware check by Hetzner: all clean (CPU, disks, NIC)
- Issue started ~3 weeks ago, persisted through kernel 5.15 → 6.8 upgrade
Happy to provide SSH access or full logs if needed.
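In case it helps with reproducing the check, the container states listed above can be spotted after each boot with something like the sketch below. It assumes the input comes from docker ps -a --format "{{.Names}}|{{.Status}}" and that the status text starts with "Created" or "Restarting" for the problem containers, as in the listing above. The function name and sample data are illustrative only.

```python
def find_unhealthy(ps_output):
    """Return (name, status) pairs for containers stuck in Created or Restarting.

    ps_output is expected to hold one "name|status" pair per line.
    """
    flagged = []
    for line in ps_output.splitlines():
        if "|" not in line:
            continue  # skip blank or malformed lines
        name, status = line.split("|", 1)
        if status.startswith(("Created", "Restarting")):
            flagged.append((name, status))
    return flagged


# Illustrative sample, modelled on the states seen on this server.
sample = "\n".join([
    "appid-1-runner|Created",
    "appid-2|Restarting (1) 5 seconds ago",
    "appid-4|Up 11 hours",
    "appid-5|Exited (0) 2 hours ago",
])

for name, status in find_unhealthy(sample):
    print(f"{name}: {status}")
```

Containers that are Up or cleanly Exited are ignored, so only the ones worth looking at are printed.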
-
Server crashes caused by stopped app's runner container stuck in restart loop
Update: Confirmed that Cloudron recreates the runner container on every boot, even though the app is stopped via the dashboard.
After each reboot:
- Main container: Exited (0) ✓
- Runner container: Created ← this is the problem
- Redis addon: Up ← also still running
The runner in "Created" state triggers the scheduler loop → "cannot join network namespace" errors every 15-60 min → eventually cascading into Docker DNS failure → MySQL unreachable → full server lockup.
I've been manually removing the runner with docker rm -f <appid>-runner after each reboot, but this is not sustainable.
Is there a way to prevent the scheduler from recreating the runner for a stopped app? Or should I uninstall the app entirely to stop this cycle? The app's domain has expired but I'd like to keep the data for when I renew it.
Thanks for any guidance.
-
Server crashes caused by stopped app's runner container stuck in restart loop
A domain expired for one of my apps. I stopped the app via the Cloudron dashboard. However, the runner container remained in "Created" state and kept trying to join the network namespace of the stopped app container, causing cascading failures:
- Runner repeatedly fails with: Cannot restart container <appid>-runner: cannot join network namespace of container: Container <id> is restarting, wait until the container is running
- This eventually causes Docker DNS resolution failures (internal Docker DNS timeouts)
- Host MySQL becomes unreachable (ECONNREFUSED 127.0.0.1:3306)
- SSH stops accepting connections
- Server becomes completely unresponsive, requiring hard reboot
This has been happening daily for the past week.
What I did
- Stopped the app via Cloudron dashboard → runner remained in "Created" state
- docker rm -f <appid>-runner removed the stuck runner
- Main container shows "Exited (0)" and redis addon is still running; both untouched
Questions
- Will Cloudron's scheduler recreate the runner container for a stopped app? If so, how do I prevent this?
- Is there a proper way to fully stop an app including its runner when the domain has expired?
- Should I also stop the redis addon container for this app?
Relevant box.log pattern (repeating every 15-60 min):
box:scheduler could not run task runner: (HTTP code 500) server error - Cannot restart container <appid>-runner: cannot join network namespace of container
Also seeing on every boot:
Error: listen EADDRNOTAVAIL: address not available 172.18.0.1:3003
cloudron-support --troubleshoot
Vendor: System manufacturer Product: System Product Name
Linux: 6.8.0-106-generic
Ubuntu: noble 24.04
Execution environment: none none
Processor: Intel(R) Xeon(R) CPU E3-1245 V2 @ 3.40GHz BIOS Intel(R) Xeon(R) CPU E3-1245 V2 @ 3.40GHz To Be Filled By O.E.M. CPU @ 3.4GHz x 8
RAM: 32796076KB
Disk: /dev/sda3 909G
[OK] node version is correct
[OK] IPv6 is enabled and public IPv6 address is working
[OK] docker is running
[OK] docker version is correct
[OK] MySQL is running
[OK] netplan is good
[OK] DNS is resolving via systemd-resolved
[OK] unbound is running
[OK] nginx is running
[OK] dashboard cert is valid
[OK] dashboard is reachable via loopback
[FAIL] Database migrations are pending. Last migration in DB: /20260217120000-mailPasswords-create-table.js. Last migration file: /package.json. Please run 'cloudron-support --apply-db-migrations' to apply the migrations.
[OK] Service 'mysql' is running and healthy
[OK] Service 'postgresql' is running and healthy
[OK] Service 'mongodb' is running and healthy
[OK] Service 'mail' is running and healthy
[OK] Service 'graphite' is running and healthy
[OK] Service 'sftp' is running and healthy
[OK] box v9.1.3 is running
[OK] Dashboard is reachable via domain name
[OK] Domain is valid and has not expired
-
Daily GeoIP Database Download Limit Reached
Yes, you are right. After a deeper investigation I found that it's another Docker instance that uses the license key. Sorry, my bad: I got the IP wrong at first, which is why I assumed it was Cloudron. My apologies.
-
Daily GeoIP Database Download Limit Reached
Hello,
Recently, I've been receiving this email from MaxMind:
Dear MaxMind Customer,
Your account has reached the daily limit for database downloads. Any additional download attempts today from your account will fail. Learn more about database download limits.
To avoid download errors we recommend that you limit your downloads of each database to no more than once per day per server. You can sign in to your account and check your GeoIP download history for information on IP addresses you are downloading databases from. To ensure that you are downloading your databases efficiently, you may consult the update schedule for GeoIP databases.
If you have any questions, please contact us at support@maxmind.com.
Sincerely,
The Team at MaxMind
When I check the download history, I can see it really is downloading the same file every few minutes, e.g. GeoLite2-City_20240301.tar.gz
I wonder if it's happening due to a failure? Can this process be improved?
Thank you.
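To illustrate the kind of improvement I mean, here is a minimal sketch of a guard that follows MaxMind's "no more than once per day per server" recommendation quoted above. It is only an assumption about how a downloader could throttle itself, not how Cloudron or any other tool actually does it; the names are hypothetical.

```python
from datetime import datetime, timedelta

# Minimum spacing between downloads, per the "once per day" recommendation.
MIN_INTERVAL = timedelta(hours=24)


def should_download(last_download, now):
    """Return True if no download is recorded yet or at least 24 hours have passed."""
    return last_download is None or now - last_download >= MIN_INTERVAL


# Illustrative timestamps: a retry 5 minutes after the last download is refused.
last = datetime(2024, 3, 1, 12, 0)
print(should_download(last, datetime(2024, 3, 1, 12, 5)))
print(should_download(last, datetime(2024, 3, 2, 12, 0)))
```

The important part is that the last-download time would be recorded even when a later step fails, so a failing update does not turn into a download every few minutes.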