Solved: Domain resolving error. Check DNS configuration on...

Jerry · ‎2020-04-19

sk120558 does not apply - just FYI

The problem is self-explanatory from the screenshot; please have a look.

It is a fresh R80.40 all-in-one, dual-stack setup.

The error is only an ALERT, not a BLOCK/DROP/DENY - just so you know 🙂

See the screenshot and tell me if you spot any clues, as I'm struggling to find any.

1. DNS resolution works on v4 in both directions (fwd and rev)

2. DNS resolution works on v6 for rev only! I wonder why ...

PS. Resolving from the gateway with nslookup or dig: dig resolves ALL, while nslookup resolves v6 only for REV, not FWD, queries!

I think I found the bug, chaps! See my screenshot.

Cheers

Jerry

PhoneBoy · ‎2020-04-19

Possibly but the log is related to CPMI traffic.

Jerry · ‎2020-04-19

Indeed! And that was my point 🙂 Something isn't right. I'm debugging it myself today and will post an update once I know more. I'll share anything I find; otherwise I'll raise an SR with TAC, as I believe the SG resolves all names just fine but struggles with the ND (IPv6) fwd resolver daemon 😛

Jerry

G_W_Albrecht · ‎2020-04-20

Exactly the same issue can be found on our customers 5400 R80.40 JT 25 GWs - we have SR#6-0001976805 open for it now...

CCSP - CCSE / CCTE / CTPS / CCME / CCSM Elite / SMB Specialist

Jerry · ‎2020-04-20

Awesome news, thanks! Please let me know if TAC finds a resolution. I'm keen to have this fixed on my 13500 and 14600 devices, which run the same OS as the one you mentioned.

Cheers

Jerry

G_W_Albrecht · ‎2020-04-20

I have pointed CP to this CheckMates topic, but I would assume someone in the R80.40 team already knows of this bug... Does it appear to be cosmetic only in your case?

Jerry · ‎2020-04-20

It is indeed sort of cosmetic, as it generates an ALERT rather than a DROP by the rules. However, it claims that the DNS resolver is not working as it should, so I'd assume it is a serious flaw, I guess? Cheers.

Jerry

Ilya_Yusupov · ‎2020-04-20

Hi @Jerry ,

I saw the same in the past; the issue then was that TCP DNS was not matched by an implied rule and we had no explicit rule to allow TCP DNS.

Can you please check whether TCP DNS is allowed by your RB?
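One way to test the TCP path independently of nslookup or dig is to send a DNS query over tcp/53 by hand. The following is only a minimal sketch in Python, using the standard DNS wire format with the 2-byte length prefix that DNS over TCP requires. The function names are made up for illustration, and it assumes a host with Python 3 that sits behind the same rulebase; nothing in this thread says it was run on the gateway.

```python
import socket
import struct

# Record type codes from the DNS specification.
QTYPE = {"A": 1, "PTR": 12, "AAAA": 28}


def build_query(name, qtype="A", query_id=0x1234):
    """Build a single-question DNS query with the recursion-desired flag set."""
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", QTYPE[qtype], 1)  # class IN


def frame_for_tcp(message):
    """DNS over TCP prefixes every message with its length (2 bytes, big-endian)."""
    return struct.pack("!H", len(message)) + message


def _recv_exact(sock, count):
    """Read exactly count bytes or raise if the peer closes early."""
    data = b""
    while len(data) < count:
        chunk = sock.recv(count - len(data))
        if not chunk:
            raise ConnectionError("connection closed before full reply")
        data += chunk
    return data


def query_over_tcp(server, name, qtype="A", timeout=3.0):
    """Return the raw reply bytes; raises OSError if tcp/53 is blocked or times out."""
    with socket.create_connection((server, 53), timeout=timeout) as sock:
        sock.sendall(frame_for_tcp(build_query(name, qtype)))
        (length,) = struct.unpack("!H", _recv_exact(sock, 2))
        return _recv_exact(sock, length)
```

Calling `query_over_tcp("1.2.3.4", "X.ipv6.domain.co", "AAAA")` with your own DNS server and name either returns reply bytes, which means TCP DNS gets through, or raises an `OSError`, which suggests the connection is dropped somewhere on the path.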

Thanks,

Ilya

Jerry · ‎2020-04-20

@Ilya_Yusupov

Thanks for the tip, I appreciate it a lot. However, I need to make you aware that R80.40 does not allow UDP/TCP 53 through Implied Rules at all; you can "Configure it", but this isn't recommended afaik 😉

Regarding tcp/53: my SG is allowed to query its local DNS server over both protocols on port 53, and that works. Somehow, though, it does not work in a multi-stack environment when you send fwd or rev DNS questions over either protocol, i.e.:
when you query a v6 FQDN, you get no answer; when you query a v4 FQDN, you get the full answer; when you query a v6 IP, you get the rev DNS answer, and the same for IPv4. Do the same on other devices on the network and there is no issue.
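The four cases described above (fwd and rev, v4 and v6) can be checked in one go. This is a rough sketch rather than anything used in the thread: the helper names `forward`, `reverse` and `check_matrix` are hypothetical, and it relies on Python's `socket` module, which goes through the operating system's resolver and so may behave differently from nslookup or dig, which talk to the DNS server directly.

```python
import socket


def forward(name, family):
    """Return the addresses for name in one address family, or [] on failure."""
    try:
        infos = socket.getaddrinfo(name, None, family, socket.SOCK_STREAM)
    except socket.gaierror:
        return []
    return sorted({info[4][0] for info in infos})


def reverse(address):
    """Return the host name for an IPv4 or IPv6 address, or None on failure."""
    try:
        return socket.gethostbyaddr(address)[0]
    except (socket.herror, socket.gaierror):
        return None


def check_matrix(name_v4, name_v6, addr_v4, addr_v6, fwd=forward, rev=reverse):
    """Report which of the four lookups succeed; fwd and rev can be swapped for testing."""
    return {
        "v4 fwd": bool(fwd(name_v4, socket.AF_INET)),
        "v4 rev": rev(addr_v4) is not None,
        "v6 fwd": bool(fwd(name_v6, socket.AF_INET6)),
        "v6 rev": rev(addr_v6) is not None,
    }
```

On a host showing the symptom described here, `check_matrix(...)` called with your own names and addresses would be expected to report `False` only for `"v6 fwd"`. Running the same call on another device on the network gives a direct comparison.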

Should you want to t-shoot that with me I'd be more than happy to help and show you all around 🙂

Cheers!

Jerry

Ilya_Yusupov · ‎2020-04-20

Thanks @Jerry, I will contact you offline to schedule a remote session.

Jerry · ‎2020-04-20

I'll show you an example here, just bear with me; this may clarify the issue. Hold on.

Jerry

Jerry · ‎2020-04-20

[Expert@cp:0]# nslookup X.ipv6.domain.co *** HOST RESOLVES fine by other devices ***
Server: 1.2.3.4
Address: 1.2.3.4#53

*** Can't find X.ipv6.domain.co: No answer

[Expert@cp:0]# nslookup X.ipv4.domain.co
Server: 1.2.3.4
Address: 1.2.3.4#53

Name: X.ipv4.domain.co
Address: 1.2.3.4

[Expert@cp:0]#

and rev dns v4:

[Expert@cp:0]# nslookup 1.2.3.5
Server: 1.2.3.5
Address: 1.2.3.5#53

5.3.2.1.in-addr.arpa name = X.ipv4.domain.co.

[Expert@cp:0]#
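For reference, the name that a reverse lookup actually queries can be derived from the address itself, which helps when checking that the matching reverse zone exists on the DNS server. A small sketch using Python's standard `ipaddress` module follows; `ptr_name` is a made-up helper, and `2001:db8::1` is a documentation address used as a stand-in because the thread shows no real IPv6 address.

```python
import ipaddress


def ptr_name(address):
    """Return the PTR query name for an IPv4 or IPv6 address."""
    return ipaddress.ip_address(address).reverse_pointer


print(ptr_name("1.2.3.5"))      # 5.3.2.1.in-addr.arpa
print(ptr_name("2001:db8::1"))  # 32 reversed nibbles followed by ip6.arpa
```

The IPv4 result matches the name in the nslookup output above. An IPv6 address expands to 32 single-digit labels under ip6.arpa, so its reverse zone is delegated and configured separately from the in-addr.arpa one.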

Jerry

Ilya_Yusupov · ‎2020-04-21

Thanks @Jerry, I will take it up with RnD and get back to you.

Jerry · ‎2020-04-21

Thanks!

One more thing,

My DNS servers are so-called DUAL servers: they resolve both v4 and v6. R80.30 handles that just fine and I can prove it anytime.
When the SG looks up an FQDN or an IP, they answer with v4 or v6 respectively, regardless of the direction. I'll post something here from the latest R80.30 take, just bear with me. This will be proof that on the same network (my own lab), with the same DNS server, the answers are normal. Let me craft it for you here. Cheers!

Jerry

G_W_Albrecht · ‎2020-04-23

Still no feedback in SR#6-0001976805 open for it - strange...

G_W_Albrecht · ‎2020-04-27

Finally an update in SR#6-0001976805 from CP:

Currently, it doesn't seem to be a bug but looks more like a failure of the DNS Resolver.
Please follow these steps to restart the DNS resolver:

Run the command  # killall wsdnsd

Push policy.

To confirm it is working, please verify the 3 kernel tables for the resolver:
#fw ctl multik print_bl dns_reverse_domains_tbl
#fw ctl multik print_bl dns_reverse_cache_tbl
#fw ctl multik print_bl dns_reverse_unmatched_cache
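If you want to repeat the table check after every policy push, the three commands above can be wrapped in a short script. This is just a sketch under assumptions: it presumes Python 3.7 or later is available where the commands run, and it only counts non-empty output lines, since the thread does not describe the output format. A count above zero is therefore a hint that a table is populated, not proof that the resolver is healthy.

```python
import subprocess

# The three table dumps quoted from the SR, split into argument lists.
RESOLVER_TABLE_COMMANDS = [
    ["fw", "ctl", "multik", "print_bl", "dns_reverse_domains_tbl"],
    ["fw", "ctl", "multik", "print_bl", "dns_reverse_cache_tbl"],
    ["fw", "ctl", "multik", "print_bl", "dns_reverse_unmatched_cache"],
]


def run(command):
    """Run one command and return its stdout ("" if it cannot be run or times out)."""
    try:
        done = subprocess.run(command, capture_output=True, text=True, timeout=30)
    except (OSError, subprocess.TimeoutExpired):
        return ""
    return done.stdout


def table_line_counts(runner=run):
    """Map each table name to the number of non-empty lines its dump printed."""
    counts = {}
    for command in RESOLVER_TABLE_COMMANDS:
        lines = [line for line in runner(command).splitlines() if line.strip()]
        counts[command[-1]] = len(lines)
    return counts
```

`table_line_counts()` returns a dictionary keyed by table name. The `runner` parameter exists only so the function can be tried with canned output on a machine that has no `fw` binary.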

In case the restart does not resolve the DNS alerts, please proceed with debugs of
the wsdnsd daemon:

Ilya_Yusupov, any comment ?

Jerry · ‎2020-04-27

Got it, and all works just fine. All 3 tables are populated, so I presume all is fine and we do not need a call today, Ilya? 🙂

Jerry

Ilya_Yusupov · ‎2020-04-27

@G_W_Albrecht @Jerry thank you guys for the update.

@Jerry - if you have solved the problem, there is no need for a remote session today. I will review the solution with RnD and update here.

Jerry · ‎2020-04-27

Alerts are still happening with the DNS issue, I wonder why.
Let's have that WebEx today (in 90 mins) after all. Alright?

Jerry

Ilya_Yusupov · ‎2020-04-27

@Jerry - no problem.

G_W_Albrecht · ‎2020-04-28

Any news about this issue ?

Jerry · ‎2020-04-29

More to come tomorrow 🙂 I have a session with Ilya to see what the issue is. Fingers crossed 😛

cheers

Jerry

Jerry · ‎2020-05-01

Alright guys, it turns out to be a completely non-CP-related problem. After a deep dive (together with @Ilya_Yusupov and others) we discovered issues with the Microsoft DNS server (apparently a Windows 2019 Datacenter DNS server), which was having trouble with zone transfers of the reverse DNS zones to and from its partner (2 DNS servers, 1 PRI, 1 SEC, for 2 zones).

Errors like this put the CP SG under a bit of stress, and some recursive queries like rev.dns.ip.cn were simply overloading the udp/53 socket. It happened, and now it all works like a charm, as I focused on fixing it last night and did a deep dive into the crap of MS DNS technology 😛

Now no alerts are recorded and CP is NOT struggling any longer with recursive DNS queries (IPv6/IPv4).

Bear in mind it all happened in an environment with BOTH STACKS, IPv4 and IPv6, which coexist for ALL services incl. NAT 624/426 etc.

Cheers to all and happy R80.30 🙂

Jerry

B_P · ‎2020-06-01

I am seeing this on an R80.40 JFH.T48 gateway with Windows Server 2019 DNS, but I have only IPv4 enabled. What did you do to fix the 2019 DNS?

I would also like to add that I am seeing this error when the internal DNS server queries the internet forwarder. I'm even seeing it on ICMP traffic. That does not make any sense.

Jerry · ‎2020-06-02

Hi,
thanks for your update, really appreciate it.
As mentioned earlier, I was seeing it on v4 and v6 with a WIN2019 DC DNS server, and when I corrected the REV-DNS setup on the DNS server the alerts started to ease, but they still remain. I do believe this isn't about CP, however, but about how the WIN2019 DNS server works, especially when it comes to rev-dns for IPv6 (ND-NA) mode. On IPv4 things stayed the same throughout, as I was mainly seeing the errors in relation to IPv6 rev-dns records, not IPv4 ones.

Cheers

Jerry

B_P · ‎2020-06-04

For me, this started happening after upgrading from R80.30 to R80.40. What changed in regards to DNS between these versions?

Jerry · ‎2020-06-04

Indeed! Exactly the same thing happened to me: all the issues were introduced with R80.40, whilst on R80.30 nothing similar ever happened. Weird, isn't it? 🙂

Jerry

Ilya_Yusupov · ‎2020-06-09

Hi @Jerry,

What exactly happens on R80.40? Do you still see the same issue we investigated on the previous version?

Can we do another remote session to investigate it on R80.40?

Jerry · ‎2020-06-09

Hi Ilya, thanks for coming back to me on that matter.
As a matter of fact, despite me fixing DNS on the Win2019 server, the Alerts on the all-in-one standalone R80.40 are still happening.
I'm happy to do another session with you anytime you're available, but just so you know, those Alerts come with "Accept", not "Drop", so they're not obstructing, just annoying, if you know what I mean. I'll show you one from today below; let me craft it so that you can see what type of Alert it is. Please let me know via email if you have time for another (WebEx-based) session. Cheers

Jerry

Jerry · ‎2020-06-09

As mentioned earlier, this one is from just NOW. Destinations unreachable by IPv6 routing are dropped; reachable ones are allowed but still alerted. This isn't happening on my other standalone devices under R80.30 (latest take), only on R80.40 (latest). Any clues?

Jerry
