Something broke remote checks.. local OK

robi

This is odd: the recent update seems to have broken update checks and TG notifications for a few sites, yet not others.

I run HTTPS checks with a keyword and for some reason the app from the container cannot connect to the IP of a select few domains any more.

Ping from the app's terminal gets no data back, just blank.

Ping from the host via SSH works just fine.
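To compare the two vantage points with the same test, a small TCP probe can be run once inside the app container and once on the host. This is only a sketch: the function name `can_connect`, the default port 443 and the 5 second timeout are assumptions for illustration, not anything Uptime Kuma or Cloudron ships.

```python
import socket


def can_connect(host: str, port: int = 443, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the name (if any) and opens the socket
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # covers timeouts, refused connections and DNS resolution failures
        return False
```

Calling `can_connect("google.com")` in both places should give the same answer on a healthy server. If it returns `True` on the host and `False` in the container, the problem sits between the container and the outside world rather than with the monitored site.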

So now the dashboard is screaming bloody murder in lots of red, when 2 domains are actually up.

The logs show warnings for the hosts it's timing out on as well as other odd errors:

Jan 12 21:46:34 2023-01-12T21:46:34-08:00 [MONITOR] ERROR: Cannot send notification to TG Alert - Uptime-bot
Jan 12 21:46:34 TypeError: Cannot read properties of undefined (reading 'data')
Jan 12 21:46:34 at Telegram.send (/app/code/server/notification-providers/telegram.js:21:39)
Jan 12 21:46:34 at processTicksAndRejections (node:internal/process/task_queues:96:5)
Jan 12 21:46:34 at async Function.sendNotification (/app/code/server/model/monitor.js:1098:21)
Jan 12 21:46:34 at async beat (/app/code/server/model/monitor.js:660:21)
Jan 12 21:46:34 at async Timeout.safeBeat [as _onTimeout] (/app/code/server/model/monitor.js:728:17)

All remote monitors seem to be broken, while all local host/domain monitors are working.

I set up a monitor for google.com and it's down too

Update: demo cloudron kuma instance doesn't appear to have the same issues.

Update2: cloning a previous version from backup doesn't help.
Also, a new install on a separate subdomain doesn't make it work any better.

girish

@robi said in Upgrade broke remote checks?:

Jan 12 21:46:34 TypeError: Cannot read properties of undefined (reading 'data')

Maybe upstream missed some db migration. We use the telegram updates as well but our instance is doing OK. An idea could be to delete the telegram notification and/or the websites throwing an error and add them again.

robi

@girish the DB size is at 62MB keeping data for 180 days. Doesn't look like that's the issue as it works for private IPs but not public IPs.

On a new instance, I even added 1.1.1.1, and it also fails by timing out. So it's not DNS related.
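The reasoning here (a raw IP such as 1.1.1.1 also times out, so DNS is not the cause) can be written down as a small decision helper. This is a sketch under assumptions: `resolves` and `diagnose` are hypothetical names, and the three labels are just illustrative.

```python
import socket


def resolves(hostname: str) -> bool:
    """Return True if the system resolver can turn hostname into an address."""
    try:
        socket.getaddrinfo(hostname, None)
        return True
    except socket.gaierror:
        return False


def diagnose(dns_ok: bool, ip_reachable: bool) -> str:
    """Classify a failure from a name lookup result and a raw-IP probe result."""
    if not ip_reachable:
        # A literal IP needs no lookup, so DNS cannot explain this failure.
        return "network"
    if not dns_ok:
        return "dns"
    return "ok"
```

In this thread the raw-IP probe failed, so `diagnose(..., ip_reachable=False)` returns `"network"` no matter what the lookup does, which matches the conclusion that routing or firewalling is involved.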

Something about docker/container but not host. (base image?)
So odd.

girish

@robi Does disabling the telegram notification make things any better? (The stack trace you posted is the telegram notification crashing.)

robi

@girish no, because a new instance which doesn't have notifications set up has the same issue.

The crash seems to come from being unable to reach Telegram.

So this is something with outbound container networking..

girish

@robi strange.. might want to report this upstream then. Not sure why something with no notifications is erroring in telegram code.

robi

@girish that's the wrong conclusion.

The original app with TG notifications has the crash.

The new test app w/o TG notifications has only the monitors failing.

The CL UI also seems slower than usual.
I'll send you an email with logs.

robi

Things I checked:

Network configuration: The Docker container may not be configured to use the host's network, which would prevent it from accessing public IPs.

Firewall rules: A firewall on the host machine may be blocking outgoing connections from the Docker container.

DNS resolution: The Docker container may not be able to resolve public DNS names to IP addresses.

Network address translation (NAT): The host's NAT (masquerade) rules may be missing or misconfigured, which would prevent the Docker container from accessing public IPs.

Inadequate permissions: The user running the container may not have the necessary permissions to access the host's network.

Network isolation: The network namespace of the container may be isolated from the host machine, so that it only has access to local IPs.

robi

Things are getting worse as other container apps are now failing, not able to reach their DB, or failing upgrades (DB migration), etc.

Not sure what's going on, but container networking has something to do with it.

robi

Email is no longer accessible, all clients cannot connect.

Checking services, email service is green, but logs are empty.. restarting the mail service fails with a long red error.

Cloudron Error
Command failed: docker run --restart=always -d --name="mail" --net cloudron --net-alias mail --log-driver syslog --log-opt syslog-address=udp://127.0.0.1:2514 --log-opt syslog-format=rfc5424 --log-opt tag=mail -m 429916160 --memory-swap 536870912 --dns 172.18.0.1 --dns-search=. -e CLOUDRON_MAIL_TOKEN="xxxxxx" -e CLOUDRON_RELAY_TOKEN="xxxxxx" -e LOGLEVEL=info -v "/home/yellowtent/boxdata/mail:/app/data" -v "/home/yellowtent/platformdata/addons/mail:/etc/mail:ro" -p 587:2587 -p 993:9993 -p 4190:4190 -p 25:2587 -p 465:2465 -p 995:9995 --label isCloudronManaged=true --read-only -v /run -v /tmp cloudron/mail:3.7.4@sha256:8ddbf13ee3fd479e18923c7bf1370d9d8aa5f12a94cbbda5afac8b5a4af72a28

docker: Error response from daemon: driver failed programming external connectivity on endpoint mail (ddb03fa18c2bf483ec4782d27b2e31a9f774bd5e835b1a15c830d2d38ee82b50): (iptables failed: ip6tables --wait -t filter -A DOCKER ! -i br-e5579f54c902 -o br-e5579f54c902 -p tcp -d fd00:c107:d509::17 --dport 4190 -j ACCEPT: ip6tables: No chain/target/match by that name. (exit status 1)).
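The useful part of that error is buried at the end: which binary failed, on which table and chain, and why. A rough way to pull those fields out is sketched below. The regular expression and the name `parse_iptables_failure` are assumptions written to fit this one message; they are not a Docker or Cloudron API and may not match other variants of the error.

```python
import re
from typing import Optional

# Matches the "(iptables failed: ...)" tail of the Docker daemon error above.
IPTABLES_FAILURE = re.compile(
    r"iptables failed: (?P<binary>ip6?tables) .*?-t (?P<table>\S+) "
    r"-A (?P<chain>\S+) .*?: ip6?tables: (?P<reason>[^()]+?)\s*"
    r"\(exit status (?P<status>\d+)\)"
)


def parse_iptables_failure(message: str) -> Optional[dict]:
    """Extract binary, table, chain, reason and exit status, or None if absent."""
    match = IPTABLES_FAILURE.search(message)
    return match.groupdict() if match else None
```

For the message above this yields the binary `ip6tables`, the table `filter` and the chain `DOCKER`, with the reason "No chain/target/match by that name.". In other words, Docker tried to append a rule to a `DOCKER` chain in the IPv6 filter table, and the command failed because something it referred to was not there.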

I thought to restart docker, but if it fails to restart, all sites will be down ;-/

The plot thickens.

jdaviescoates

@robi said in Something broke remote checks.. local OK:

The plot thickens.

SSDNodes having a wobble?

girish

@robi I rebooted the server and it fixed the iptables issue at least. The mail container seems to be back now and I am also able to wget/curl etc. from the uptime kuma container.

I am not sure what messed up your iptables though. Did you upgrade recently? I do see that there are some non-Cloudron docker containers on the server (like watchtower), not sure what effect they have.
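One way to confirm this state without restarting anything is to check whether the `DOCKER` chain is present in a rules dump such as the output of `ip6tables -t filter -S`. The helper below is a sketch: `chain_exists` is a hypothetical name, and the sample dump in the usage note is illustrative, not taken from this server.

```python
def chain_exists(rules_dump: str, chain: str) -> bool:
    """Return True if the dump declares the chain.

    In `iptables -S` style output, built-in chains appear as "-P NAME POLICY"
    and user-defined chains as "-N NAME".
    """
    for line in rules_dump.splitlines():
        parts = line.split()
        if len(parts) >= 2 and parts[0] in ("-N", "-P") and parts[1] == chain:
            return True
    return False
```

With a dump like `"-P INPUT ACCEPT\n-P FORWARD DROP\n-N DOCKER"`, `chain_exists(dump, "DOCKER")` is true. If the chain is missing from the real output, that would be consistent with the "No chain/target/match by that name" error when starting the mail container.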

robi

@girish Did not think to look at iptables, ugh.

Yes, stuff works at least! Thank you.

There were some package upgrades. But don't remember which.

Ooh, watchtower from 2 years ago? Looks like I forgot about playing with that. Nope, no effect, but it would be nice to see those in the Cloudron UI.

Happy Sunday

girish

@robi ah, indeed. Maybe you installed something a long time ago! It's been running for 2 years lol

robi

@girish done https://forum.cloudron.io/topic/8487/enhance-app-proxy-app-with-path-and-port-support
