DNS lookup failure MX for yandex.com
- 
Does this still happen after you wait a few hours? Those lookups often hit rate limits quickly, and the query runs every time the dashboard fetches the status in the mail view.
- 
Where is this coming from? The mail server logs? Can you check whether unbound is running with systemctl status unbound? You can also restart unbound from the Services view.
- 
@girish The Email event log. unbound is running, although its log is empty.
Why do all logs download as log.txt? They should have unique names. Other emails are being delivered just fine.
- 
@girish Yes, it took a few seconds the first time, but it returned the same record I found with an online tool. Is the lookup timeout too short?
- 
@robi I think the delay is the source of the problem. Do you know why it takes so long? IIUC, Node.js uses the c-ares library underneath for DNS queries, and ARES_OPT_TIMEOUTMS has a default of 5 seconds. If it doesn't get a response within that time, it treats the lookup as a failure (DNS resolution is UDP based, so it has to rely on timeouts). From a quick read of https://github.com/haraka/Haraka/blob/master/outbound/mx_lookup.js, it doesn't seem like Haraka can configure these timeouts.
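For reference, here is a minimal sketch of what a longer lookup timeout could look like outside of Haraka. It assumes a Node.js release whose dns Resolver accepts the timeout and tries options (older releases ignore them). The 15-second value, the retry count, the helper name makePatientResolver and the MX_DOMAIN environment variable are illustrative choices, not anything Haraka or Cloudron ships.

```javascript
'use strict';

const { Resolver } = require('dns').promises;

// Hypothetical helper: build a resolver that talks to the local unbound
// instance and waits longer than the default before giving up.
function makePatientResolver(timeoutMs = 15000, tries = 2) {
    // `timeout` is in milliseconds; `tries` is how many times each
    // nameserver is contacted before the lookup is reported as failed.
    const resolver = new Resolver({ timeout: timeoutMs, tries });
    resolver.setServers(['127.0.0.1']); // unbound listens here on Cloudron
    return resolver;
}

// Only touch the network when a domain is supplied, for example:
//   MX_DOMAIN=yandex.com node mx-timeout.js
if (process.env.MX_DOMAIN) {
    makePatientResolver()
        .resolveMx(process.env.MX_DOMAIN)
        .then((records) => console.log(records))
        .catch((e) => console.log('MX lookup failed:', e.code));
}
```

If the lookup succeeds here but fails with the default settings, that would support the idea that the timeout, rather than the record itself, is the problem.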
- 
@girish said in DNS lookup failure MX for yandex.com:

    host -t MX yandex.com 127.0.0.1

I'm having the same issue here, but for a different domain: gov.bc.ca. Running

    host -t MX gov.bc.ca 127.0.0.1

gives me this output:

    ;; connection timed out; no servers could be reached

The DNS servers set in netplan are the following (so they shouldn't be a problem):

    nameservers:
      addresses: [1.1.1.1, 1.0.0.1, 8.8.8.8, 213.186.33.99]
      search: []

Are there any concerns with the above? Is there a different place the DNS addresses should be set? The only email failing to send is to the gov.bc.ca domain; everything else works fine. External checks show the MX record for gov.bc.ca without any problem, so it's not a DNS issue on their end.

The only thing in the unbound service logs is this, repeatedly:

    Nov 22 16:16:10 vps-<name> unbound[2484160]: [2484160:0] info: generate keytag query _ta-4f66. NULL IN

From the mail logs, I only see these two entries for the gov.bc.ca attempts:

    2021-11-22T08:58:37.000Z [NOTICE] [3306618A-B219-42B7-A2F5-6DBF86CE783E.1.1] [outbound] MX Lookup for gov.bc.ca failed: Error: queryMx ETIMEOUT gov.bc.ca
    2021-11-22T08:58:37.000Z [INFO] [3306618A-B219-42B7-A2F5-6DBF86CE783E.1.1] [outbound] bouncing mail: Too many failures (DNS lookup failure: Error: queryMx ETIMEOUT gov.bc.ca)
- 
@d19dotca Does host -t NS gov.bc.ca 127.0.0.1 output anything? You can also try the above command after systemctl restart unbound. FWIW, the above commands work on my servers.
- 
@girish Unfortunately it doesn't. If I take out the 127.0.0.1 it works fine, though (which I think is only the case because I temporarily added 1.1.1.1 to the /etc/resolv.conf file to see if that would help at all):

    host -t NS gov.bc.ca
    gov.bc.ca name server pubdns-k.spanbc.ca.
    gov.bc.ca name server pubdns-c.spanbc.ca.

I tried restarting unbound, but I get the same issue when running the host command with 127.0.0.1, FYI:

    host -t NS gov.bc.ca 127.0.0.1
    ;; connection timed out; no servers could be reached

Any suggestions then? I'm always a bit confused when it comes to DNS on Cloudron servers. Does Cloudron force its own DNS lookup server on a VPS, so that any local config isn't really applicable the way it would be without Cloudron? Is that why we try the host command with 127.0.0.1, because it sends the query through Cloudron's local DNS server (unbound)?
- 
@d19dotca Yes, all DNS requests go via unbound (which is running on 127.0.0.1). Docker is configured to make DNS requests via unbound, so all apps indirectly use unbound as well. I am not sure why unbound is unable to get the nameservers of that specific domain. Can you edit /etc/unbound/unbound.conf.d/cloudron-network.conf and enable debugging there?

    verbosity: 5
    log-queries: yes

Then restart unbound and check if we get additional hints.
- 
@girish I made the change (and quickly reverted it after seeing the log grow so fast), but it was on for a few minutes. I ran the test, and here's the file link for download (it's 3 MB): https://filesharing.d19.ca/f.php?h=32DPrTGN&d=1

There are hundreds of lines in there for it, it seems. Here are some quick snippets from my very brief review.

It seems the initial NS records are found:

    2021-11-23T21:24:40+0000 vps-8b86529d unbound[1459245]: [1459245:0] info: incoming scrubbed packet:
    ;; ->>HEADER<<- opcode: QUERY, rcode: NOERROR, id: 0
    ;; flags: qr ; QUERY: 1, ANSWER: 0, AUTHORITY: 6, ADDITIONAL: 2
    ;; QUESTION SECTION:
    gov.bc.ca. IN A
    ;; ANSWER SECTION:
    ;; AUTHORITY SECTION:
    gov.bc.ca. 300 IN NS pubdns-c.spanbc.ca.
    gov.bc.ca. 300 IN NS pubdns-k.spanbc.ca.
    hcgv77huiaek95dvf2mlh6mgc3747u7d.ca. 300 IN NSEC3 1 1 5 - hcif66mnpd6ucerv6dkg8nodve36k0ma TXT RRSIG ;{flags: optout}
    hcgv77huiaek95dvf2mlh6mgc3747u7d.ca. 300 IN RRSIG NSEC3 8 2 3600 20211127185149 20211120210941 6810 ca. CaN+r3F3jFEa+PKhUj1YVtegRPO83dQ9Ak9eFGgi4QCmIsOfTye0EgHad7+a1TtqOkLW6VwVghc6Gh83kecuulKRmM6IFwCMQI/TT/6jN53Mabhm+Zy3PZdqCMeaP2Fjs6PPsXbQVUbw0H/dSBP1l0mdKX72feKSPzQXd92++mA= ;{id = 6810}
    j7ndutk162v2aatm9t1tqeeftjri3jcv.ca. 300 IN NSEC3 1 1 5 - j7oh4h2jucnrgkn54kf5t3gj4v55cuel NS DS RRSIG ;{flags: optout}
    j7ndutk162v2aatm9t1tqeeftjri3jcv.ca. 300 IN RRSIG NSEC3 8 2 3600 20211129061916 20211122023917 6810 ca. opOLaNq6jn5w8EarGGa5tElQPbywUYC3OW1IJCQjnIwJS8fbO0RDKpE0p+Nv0gndmF8ELCqUJmSuCmRti7FeZDLMvkKzSfmwrx2BILlpiMNBArSswNhI9HbpoW+Dt8Gl+u2/jX7qbOMXNBZEx8Nn/PBrAWWvnwIx3Ur0xgB89Us= ;{id = 6810}
    ;; ADDITIONAL SECTION:
    pubdns-c.spanbc.ca. 300 IN A 142.34.50.57
    pubdns-k.spanbc.ca. 300 IN A 142.34.208.20
    ;; MSG SIZE rcvd: 594

I do see a few of these timeouts, though:

    2021-11-23T21:24:42+0000 vps-8b86529d unbound[1459245]: [1459245:0] debug: timeout udp

I don't know what these mean exactly, but for reference:

    2021-11-23T21:25:10+0000 vps-8b86529d unbound[1459245]: [1459245:0] info: 2vRDCD mod2 pubdns-k.spanbc.ca. A IN
    2021-11-23T21:25:10+0000 vps-8b86529d unbound[1459245]: [1459245:0] info: 4RDd mod2 rep gov.bc.ca. NS IN
    2021-11-23T21:25:10+0000 vps-8b86529d unbound[1459245]: [1459245:0] info: 5RDdc mod2 rep gov.bc.ca. NS IN
- 
@d19dotca In the logs, I see dnssec status: not expected. Can you try disabling DNSSEC? See https://www.nlnetlabs.nl/documentation/unbound/howto-turnoff-dnssec/ . You can just add val-permissive-mode: yes in the unbound config. https://dnssec-analyzer.verisignlabs.com/gov.bc.ca confirms the domain has some DNSSEC errors.
- 
@girish I tried this and restarted the unbound server after adding that parameter to /etc/unbound/unbound.conf.d/cloudron-network.conf, but my host commands still fail in exactly the same way. Current config values:

    server:
      port: 53
      interface: 127.0.0.1
      interface: 172.18.0.1
      do-ip6: no
      access-control: 127.0.0.1 allow
      access-control: 172.18.0.1/16 allow
      cache-max-negative-ttl: 30
      cache-max-ttl: 300
      val-permissive-mode: yes

I ran the restart command, but it still seems to fail.
- 
@d19dotca I am afraid I have to debug on your server to help further then. Can you drop a mail to support?
- 
@girish Sent the email from the server's support page and allowed remote access for you. Thank you so much in advance, Girish! Very odd issue, I'd love to know what's going on there. For what it's worth, I tried changing verbosity to 2 and logging the queries, and my host commands now come back with a SERVFAIL error, whereas before they came back with nothing beyond what's noted earlier. Not sure if that's progress or not, haha. I've set it back, so it's not verbose right now. Here's what I got after changing the verbosity to 2:

    host -t NS gov.bc.ca 127.0.0.1
    Using domain server:
    Name: 127.0.0.1
    Address: 127.0.0.1#53
    Aliases:

    Host gov.bc.ca not found: 2(SERVFAIL)
- 
Trying to debug this further now. I cannot make much sense of the unbound logs. So, I wrote a simple node script to do DNS queries:

    #!/usr/bin/env node

    'use strict';

    const { Resolver } = require('dns').promises;
    const resolver = new Resolver();

    (async function () {
        try {
            const nameservers = await resolver.resolveMx('the.domain');
            console.log(nameservers);
        } catch (e) {
            console.log('Exception when looking up name server: ', e);
        }
    })();

I get:

    Exception when looking up name server:  Error: queryMx ESERVFAIL the.domain
        at QueryReqWrap.onresolve [as oncomplete] (internal/dns/promises.js:169:17) {
      errno: undefined,
      code: 'ESERVFAIL',
      syscall: 'queryMx',
      hostname: 'the.domain'
    }

So, it's not an unbound issue but a general network issue. Trying to see what else we can try here. Of course, replacing the.domain with something like cloudron.io works. So, the problem is the network connectivity between this server and the nameservers of this specific domain.
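To narrow down where the failure sits, the same MX lookup can be run through unbound and through a public resolver and the results compared. This is a rough sketch, not what was run on the server: the helper names explain and lookupVia, the choice of 1.1.1.1 as the comparison resolver, and the LOOKUP_DOMAIN environment variable are assumptions for illustration. The error codes are the ones Node.js reports for DNS failures (ETIMEOUT and ESERVFAIL both appear earlier in this thread).

```javascript
'use strict';

const { Resolver } = require('dns').promises;

// Rough, informal reading of the DNS error codes Node.js can report.
function explain(code) {
    switch (code) {
        case 'ETIMEOUT': return 'no reply from the resolver before the timeout';
        case 'ESERVFAIL': return 'resolver replied, but could not complete the lookup';
        case 'ENOTFOUND': return 'domain name not found';
        case 'ENODATA': return 'domain exists but has no record of this type';
        default: return 'unexpected error: ' + code;
    }
}

// Query the MX records of `domain` through a single resolver address.
async function lookupVia(server, domain) {
    const resolver = new Resolver();
    resolver.setServers([server]);
    try {
        const records = await resolver.resolveMx(domain);
        return { server, ok: true, records };
    } catch (e) {
        return { server, ok: false, code: e.code, hint: explain(e.code) };
    }
}

// Only touch the network when a domain is supplied, for example:
//   LOOKUP_DOMAIN=gov.bc.ca node compare-resolvers.js
if (process.env.LOOKUP_DOMAIN) {
    (async function () {
        for (const server of ['127.0.0.1', '1.1.1.1']) {
            console.log(await lookupVia(server, process.env.LOOKUP_DOMAIN));
        }
    })();
}
```

If 1.1.1.1 answers while 127.0.0.1 times out or returns SERVFAIL, that would be consistent with the conclusion above: the records exist, but this server likely cannot reach the domain's own nameservers, which unbound has to contact directly.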
 



