@girish @nebulon Server crashed again last night, but this time the pattern is different: no containers in a restart loop, no runner issues. The cron cleanup job is working, and all containers were stable (Up 11 hours) before the crash.
The Docker journal shows the DNS resolver dying on its own:
23:38 - External DNS timeouts begin (185.12.64.2)
23:57 - Internal Docker DNS fails (172.18.0.1:53 i/o timeout)
23:59 - [resolver] connect failed: dial tcp 172.18.0.1:53: i/o timeout
00:xx - Server becomes unresponsive
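To pin down exactly when each resolver stops answering, I can run a small probe during the crash window. This is just a minimal sketch of my own (not a Cloudron tool): it sends a bare DNS A query over UDP to both resolvers once a minute and logs timeouts; the IPs are the ones from the journal above, and example.com is an arbitrary liveness name.

```python
#!/usr/bin/env python3
"""Probe the internal Docker DNS and the external resolver once a minute
and log timeouts with UTC timestamps, to catch the exact failure time."""
import socket
import struct
import time
from datetime import datetime, timezone

RESOLVERS = ["172.18.0.1", "185.12.64.2"]  # Docker embedded DNS, external DNS
QNAME = "example.com"                      # any resolvable name works here

def dns_query(server: str, name: str, timeout: float = 2.0) -> bool:
    """Send a minimal DNS A query over UDP; True if any reply comes back."""
    # 12-byte header: ID, flags (RD set), QDCOUNT=1, AN/NS/AR counts = 0
    packet = struct.pack(">HHHHHH", 0x1234, 0x0100, 1, 0, 0, 0)
    for label in name.split("."):
        packet += bytes([len(label)]) + label.encode()
    packet += b"\x00" + struct.pack(">HH", 1, 1)  # QTYPE=A, QCLASS=IN
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.settimeout(timeout)
        try:
            s.sendto(packet, (server, 53))
            s.recvfrom(512)
            return True
        except OSError:  # timeout or network error both count as a failure
            return False

while True:
    stamp = datetime.now(timezone.utc).isoformat(timespec="seconds")
    for server in RESOLVERS:
        status = "ok" if dns_query(server, QNAME) else "TIMEOUT"
        print(f"{stamp} {server} {status}", flush=True)
    time.sleep(60)
```

Running this from ~22:30 UTC should show whether the internal resolver (172.18.0.1) dies first or only after the external timeouts start.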
There's also a container (different ID each time) producing "ignoring event" / "cleaning up dead shim" messages every minute; I'm not sure if it's related.
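In case it helps, here's a rough sketch I can use to check whether those changing IDs always trace back to the same app. It assumes a systemd journal and that the shim messages carry an id= field; the exact log format varies by containerd version, so the regex may need adjusting.

```python
#!/usr/bin/env python3
"""Map container IDs from "cleaning up dead shim" journal lines to names,
to see whether the noisy container is the same app under different IDs."""
import re
import subprocess

# Last 24h of the Docker daemon journal (use containerd.service instead
# if the shim messages live there on your setup).
log = subprocess.run(
    ["journalctl", "-u", "docker.service", "--since", "-24h", "--no-pager"],
    capture_output=True, text=True, check=True,
).stdout

# The id= field format varies by containerd version; adjust as needed.
ids = set(re.findall(r"id=([0-9a-f]{12,64})", log))
for cid in sorted(ids):
    inspect = subprocess.run(
        ["docker", "inspect", "--format", "{{.Name}} {{.Config.Image}}", cid],
        capture_output=True, text=True,
    )
    name = inspect.stdout.strip() if inspect.returncode == 0 else "(already gone)"
    print(cid[:12], name)
```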
This happens at roughly the same time every night (~23:00-00:00 UTC). All previous fixes are still in place (no restart loops, domain renewed, hardware checks clean). I'm running out of ideas on my end.
Would it be possible to get SSH-level support to debug this? I can provide access anytime. This is really urgent as it's been impacting my mail service daily for weeks now.
Thank you.
