Regular short getaddrinfo EAI_AGAIN outages
-
I recently enabled email notifications on my Uptime Kuma and I've noticed there are regular short outages. Something to do with getaddrinfo EAI_AGAIN
Any idea what's causing this and how to resolve it?
Apparently
EAI_AGAIN is a DNS lookup timed out error, means it is a network connectivity error or proxy related error.
According to https://stackoverflow.com/questions/40182121/whats-the-cause-of-the-error-getaddrinfo-eai-again
Oh, and here it says:
EAI_AGAIN means the DNS server replied that it cannot currently fulfill the request. (If you want the hairy details, the RCODE field in the response is set to 2, SERVFAIL.)
There is no single solution because it entirely depends on why the DNS server sends that back. Maybe it's overloaded, maybe the network is down, maybe it got the same reply from its upstream server.
In general, the best you can do is wait a while and try again. Hope that helps.
I wonder what the specific issue is in my case (perhaps just network issues with Netcup?)
-
@jdaviescoates said in Regular short getaddrinfo EAI_AGAIN outages:
perhaps just network issues with Netcup?
I've just created a load of monitors to watch apps I've got on some Hetzner Could VPS servers too. I wonder if I'll see the same issues on there or not...
-
Do you have other domains by any chance? And does this happen only for the
coop
domain ? Could be a TLD issue that their authoritative servers are flaky. -
@girish said in Regular short getaddrinfo EAI_AGAIN outages:
Do you have other domains by any chance?
Yes, lots.
@girish said in Regular short getaddrinfo EAI_AGAIN outages:
And does this happen only for the coop domain ?
Not sure yet, but shall keep an eye on it.
@girish said in Regular short getaddrinfo EAI_AGAIN outages:
Could be a TLD issue that their authoritative servers are flaky.
Sounds plausible.
-
This is an Uptime Kuma quirk that I solved by upping "Retries" on each affected monitor to at least 2. Just one of my 25 monitors needed retries upping to 5 for a fix.
Hope this helps!
-
@RoundHouse1924 Thanks, although surely that could just mean that when it tried once it was broken but by the time it retried it was working again? Although I guess the time between retries is tiny?
-
-
I went and checked our instances' logs. I found sporadic
ENOTFOUND
but very spaced out (like months apart). -
I don't see this in our instance atleast - https://status.cloudron.io . Which TLD are you using?
-
@mazarian as a way to debug this further, maybe you can add a www.cloudron.io or some other domain (io or com) , to your Uptime Kuma. This well help us narrow if this is a general DNS/network issue or specific to your TLD.
-
That's a fantastic idea! I will add it. I ended up migrating it off Cloudron for the time being because I have come to depend on UK for work and all the notifications were bogging me down.
I will restart the old instance to test and will report back what I find out.
-
I do have the same kind of shortages with uptime kuma. I did add Cloudron to see. Any idea of what is happening ?
I have a dedicated server with hetzner.
-
@jrl-abstract27 this is to do with the local DNS server (unbound) not resolving . In Cloudron 8 (the next release), we are removing unbound altogether and it will use your network's resolver via systemd-resolved. Maybe this issues gets sorted out with that.
-
thanks @girish
-
@girish said in Regular short getaddrinfo EAI_AGAIN outages:
@jrl-abstract27 this is to do with the local DNS server (unbound) not resolving . In Cloudron 8 (the next release), we are removing unbound altogether and it will use your network's resolver via systemd-resolved. Maybe this issues gets sorted out with that.
Hi, i just upgraded to Cloudron 8.0.3, rebooted server and i see Unbound still appears in the services pages, is that normal? does that mean uptime kuma still uses it?
-
unbound
is still used in Cloudron, but its usage is drastically reduced now. It is used for directly querying the nameservers to check if DNS records are already in-sync to avoid hitting NXDOMAIN for newly installed apps as well as for email DNS record lookup.The rest now uses whatever the default setup, of the environment the server is running in, is.
-
To conclude what @nebulon said, for Uptime Kuma this means that it uses the system DNS (and not unbound).
-
@factord right, so this means this is either a local DNS issue or uptime kuma issue. Quick idea is to just set up the server's
/etc/resolv.conf
with Google DNS maybe and check if it mitigates the issue.