Domain resolving error. Check DNS configuration on the gateway (0) - bug in R80.40?

sk120558 does not apply - just FYI

 

The problem is self-explanatory from the screenshot; please have a look.

It is a fresh R80.40 all-in-one, dual-stack infrastructure.

The error is just an ALERT, not a BLOCK/DROP/DENY, just so you know 🙂

See the screenshot and tell me if you find any clues, as I'm struggling to find any.

 

1. DNS resolution works on v4 in both directions (forward and reverse).

2. DNS resolution on v6 works for reverse lookups only! I wonder why...

PS: when resolving from the gateway itself, dig resolves everything, but nslookup resolves v6 reverse queries only, not forward ones!
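
The four checks above (forward and reverse, v4 and v6) can be reproduced from any host with Python's standard library. This is only a minimal sketch: it assumes the system resolver points at the same DNS server, `check_matrix` is a hypothetical helper name, and the host name and addresses you pass in are your own. Note that `getaddrinfo` goes through the OS resolver (hosts file included), so it is not a pure DNS test.

```python
import socket

def forward(name, family):
    """Return the addresses the system resolver gives for name (A or AAAA)."""
    try:
        infos = socket.getaddrinfo(name, None, family, socket.SOCK_STREAM)
    except OSError:
        return []  # NXDOMAIN, no data, or resolver failure
    return sorted({info[4][0] for info in infos})

def reverse(address):
    """Return the PTR name for address, or None when the lookup fails."""
    try:
        return socket.gethostbyaddr(address)[0]
    except OSError:
        return None

def check_matrix(name, ipv4, ipv6, fwd=forward, rev=reverse):
    """Run the four lookups; fwd/rev are injectable so the logic can be tested offline."""
    return {
        "fwd_v4": bool(fwd(name, socket.AF_INET)),
        "fwd_v6": bool(fwd(name, socket.AF_INET6)),
        "rev_v4": rev(ipv4) is not None,
        "rev_v6": rev(ipv6) is not None,
    }
```

Running `check_matrix(...)` on the gateway and on another device should make the difference visible: the symptom described here would show up as `fwd_v6` being False while the other three are True.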

 

I think I found the bug, chaps! See my screenshot.

error-alert.jpg

Cheers

Jerry
33 Replies
Admin
Possibly but the log is related to CPMI traffic.
Platinum
Indeed, and that was my point 🙂 Something isn't right. I'm debugging it myself today and will post an update once I know more. I'll share anything I find; otherwise I'll raise an SR with TAC, as I believe the SG resolves all names just fine but struggles with the ND (IPv6) forward resolver daemon 😛
Jerry
Sapphire

Exactly the same issue can be found on our customer's 5400 GWs running R80.40 JT 25. We have SR#6-0001976805 open for it now...

 

Platinum
Awesome news, thanks! Please let me know if TAC finds a resolution. I'm keen to have this fixed on my 13500 and 14600 devices, which run the same OS as the one you mentioned.

Cheers
Jerry
Sapphire

I have pointed CP to this CheckMates topic, but I would assume someone in the R80.40 team knows of this bug already... Does it appear to be cosmetic only in your case?

Platinum
It is indeed sort of cosmetic, as by the rules it generates an ALERT, not a DROP. However, it claims that the DNS resolver is not working as it should, so I'd assume it is a serious flaw, I guess? Cheers.
Jerry
Employee+

Hi @Jerry ,

 

I saw the same in the past. The issue then was that TCP DNS was not matched by an implied rule, and we didn't have an explicit rule allowing TCP DNS.

Can you please check whether TCP DNS is allowed by your rulebase?
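
One way to test this from a client behind the gateway is to send the same question over UDP and over TCP and compare. The sketch below is an illustration only: it hand-builds a minimal query following RFC 1035, assumes the server is given as an IPv4 literal, and the function names are made up for this example.

```python
import socket
import struct

def build_query(name, qtype=1, query_id=0x1234):
    """Build a minimal DNS query: 12-byte header plus one question (QCLASS IN)."""
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)  # RD set, 1 question
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", qtype, 1)

def _recv_exact(sock, n):
    """Read exactly n bytes from a stream socket."""
    data = b""
    while len(data) < n:
        chunk = sock.recv(n - len(data))
        if not chunk:
            raise ConnectionError("connection closed early")
        data += chunk
    return data

def query_udp(server, payload, timeout=3.0):
    """Send the query over UDP/53 (server is assumed to be an IPv4 literal)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(payload, (server, 53))
        return sock.recvfrom(4096)[0]

def query_tcp(server, payload, timeout=3.0):
    """Send the query over TCP/53; each message carries a 2-byte length prefix."""
    with socket.create_connection((server, 53), timeout=timeout) as sock:
        sock.sendall(struct.pack("!H", len(payload)) + payload)
        length = struct.unpack("!H", _recv_exact(sock, 2))[0]
        return _recv_exact(sock, length)
```

If `query_udp(server, build_query(name))` returns a response while `query_tcp` with the same arguments times out or is refused, that would suggest tcp/53 is not permitted somewhere on the path. Use `qtype=28` for an AAAA question.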

 

Thanks,

Ilya  

Platinum
@Ilya_Yusupov

Thanks for the tip, I do appreciate it a lot. However, I need to make you aware that R80.40 has no UDP/TCP 53 allowed by implied rules whatsoever. You can "Configure it", but this isn't recommended afaik 😉

Regarding tcp/53: my SG is allowed to query its local DNS server on port 53 over both protocols, and that works. Somehow, though, it does not work in a multi-stack environment for forward or reverse DNS queries over either protocol, i.e.:
when you query a v6 FQDN, you get no answer; when you query a v4 FQDN, you get a full answer; when you query a v6 IP, you get a reverse DNS answer, and the same for IPv4. Do the same on other devices on the network and there is no issue.

Should you want to troubleshoot that with me, I'd be more than happy to help and show you around 🙂

Cheers!
Jerry
Employee+

Thanks @Jerry, I will contact you offline to schedule a remote session.

Platinum
I'll show you an example here, which may clarify the issue. Just bear with me, hold on.
Jerry
Platinum

[Expert@cp:0]# nslookup X.ipv6.domain.co *** HOST RESOLVES fine by other devices ***
Server: 1.2.3.4
Address: 1.2.3.4#53

*** Can't find X.ipv6.domain.co: No answer

[Expert@cp:0]# nslookup X.ipv4.domain.co
Server: 1.2.3.4
Address: 1.2.3.4#53

Name: X.ipv4.domain.co
Address: 1.2.3.4

[Expert@cp:0]#


and rev dns v4:

[Expert@cp:0]# nslookup 1.2.3.5
Server: 1.2.3.5
Address: 1.2.3.5#53

5.3.2.1.in-addr.arpa name = X.ipv4.domain.co.

[Expert@cp:0]#
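
For reference, the reverse lookup above is a PTR query for the name built from the reversed address, and IPv6 uses the nibble format under ip6.arpa instead of in-addr.arpa. A small sketch using Python's standard `ipaddress` module shows both forms. The IPv6 address is a documentation-prefix placeholder, not one from this environment, and `ptr_name` is just a name chosen for the example.

```python
import ipaddress

def ptr_name(address):
    """Return the reverse-lookup (PTR) owner name for an IPv4 or IPv6 address."""
    return ipaddress.ip_address(address).reverse_pointer

print(ptr_name("1.2.3.5"))      # 5.3.2.1.in-addr.arpa
print(ptr_name("2001:db8::1"))  # 32 reversed nibbles followed by ip6.arpa
```

The IPv6 name has one label per hex digit, so the reverse zone on the DNS server has to be delegated and populated at the right nibble boundary for these queries to be answered.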

Jerry
Employee+

Thanks @Jerry, I will take it up with RnD and get back to you.

Platinum
Thanks!

One more thing,

My DNS servers are so-called DUAL servers: they resolve v4 as well as v6. R80.30 handles that just fine and I can prove it anytime.
When the SG looks up an FQDN or an IP, they answer with v4 or v6 respectively, regardless of direction. I'll post something here from the latest R80.30 take, just bear with me. It will be proof that on the same network (my own lab), with the same DNS server, the answers are normal and working. Let me craft it for you here. Cheers!
Jerry
Sapphire

Still no feedback in SR#6-0001976805 open for it - strange...

Sapphire

Finally an update in SR#6-0001976805 from CP:

Currently, it doesn't seem to be a bug but looks more like a failure of the DNS resolver.
Please follow these steps to restart the DNS resolver:

Run the command: # killall wsdnsd

Push policy.

To confirm it is working, please verify the 3 kernel tables for the resolver:
# fw ctl multik print_bl dns_reverse_domains_tbl
# fw ctl multik print_bl dns_reverse_cache_tbl
# fw ctl multik print_bl dns_reverse_unmatched_cache
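
The verification step above could be scripted roughly as follows. This is a sketch under assumptions: it only wraps the three commands quoted in the SR reply, the output format of `print_bl` is not shown in this thread, so counting non-blank lines is just a crude "is it populated" signal, and the helper names are invented for the example.

```python
import shutil
import subprocess

# Table names exactly as quoted in the SR reply.
TABLES = (
    "dns_reverse_domains_tbl",
    "dns_reverse_cache_tbl",
    "dns_reverse_unmatched_cache",
)

def table_command(table):
    """Command line from the SR reply for dumping one resolver table."""
    return ["fw", "ctl", "multik", "print_bl", table]

def count_lines(output):
    """Count non-blank lines; a rough heuristic, not a parser of the real format."""
    return sum(1 for line in output.splitlines() if line.strip())

def check_tables():
    """Dump each table and report line counts; returns None when fw is not on PATH."""
    if shutil.which("fw") is None:
        return None
    report = {}
    for table in TABLES:
        result = subprocess.run(table_command(table), capture_output=True, text=True)
        report[table] = count_lines(result.stdout)
    return report
```

On the gateway in expert mode, `check_tables()` would return a dict of counts per table; anywhere else it simply returns None.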

If the restart does not resolve the DNS alerts, please proceed with debugs of
the wsdnsd daemon:

Ilya_Yusupov, any comment?

Platinum
Got it, and all works just fine. All 3 tables are full of entries, so I presume all is fine and we do not need a call today, Ilya? 🙂
Jerry
Employee+

@G_W_Albrecht @Jerry thank you guys for the update, I will review the solution with RnD and update.

@Jerry - if you solved the problem, there is no need for a remote session today. I will review the solution with RnD and update here.

Platinum
Alerts for the DNS issue are still happening, I wonder why.
Let's have that WebEx today (in 90 mins.) after all. Alright?
Jerry
Employee+

@Jerry  - no problem.

Sapphire

Any news about this issue ?

Platinum
More to come tomorrow 🙂 I have a session with Ilya to see what the issue is. Fingers crossed 😛

cheers
Jerry
Platinum

Alright guys, it turns out to be a completely non-CP-related problem. After a deep dive (together with @Ilya_Yusupov and others) we discovered issues with the Microsoft DNS server (apparently a Windows 2019 Datacenter DNS server), which was having problems with zone transfers of the reverse DNS zones to and from its partner (2 DNS servers, 1 primary and 1 secondary, for 2 zones).

Errors like this put the CP SG under a bit of stress, and some recursive queries like rev.dns.ip.cn were simply overloading the udp/53 socket. It happened, and now it all works like a charm, as I focused on fixing it last night and did a deep dive into the crap of MS DNS technology 😛

Now no alerts are recorded and CP is NOT struggling any longer with DNS recursive queries (IPv6/IPv4).

Bear in mind it all happened in an environment where you've got BOTH STACKS, IPv4 and IPv6, which coexist for ALL services inc. NAT 624/426 etc.

Cheers to all and happy R80.30 🙂

Jerry

Nickel

I am seeing this on an R80.40 JFH.T48 gateway with Windows Server 2019 DNS, but I have only IPv4 enabled. What did you do to fix the 2019 DNS?

I would also like to add that I am seeing this error when the internal DNS server queries the internet forwarder. I'm even seeing it on ICMP traffic. That does not make any sense.

BlockICMPDNS.png

Platinum
Hi,
thanks for your update, I really appreciate it.
As mentioned earlier, I was seeing it on v4 and v6 with a Win2019 DC DNS server. When I corrected the reverse DNS setup on the DNS server, the alerts started to ease but they still remain. I do believe, however, that this isn't about CP but about how the Win2019 DNS server works, especially when it comes to reverse DNS for IPv6 (ND-NA) mode. On IPv4 things stayed the same all along, as the errors I was seeing mainly related to IPv6 rather than IPv4 reverse DNS records.

Cheers
Jerry
Nickel

For me, this started happening after upgrading from R80.30 to R80.40. What changed in regards to DNS between these versions?

Platinum
Indeed! Exactly the same thing happened to me: all the issues were introduced with R80.40, whilst on R80.30 nothing similar ever happened. Weird, isn't it? 🙂
Jerry
Employee+

Hi @Jerry,

 

What exactly happens on R80.40? Do you still see the same issue that we investigated on the previous version?

Can we do another remote session to investigate it on R80.40?

Platinum
Hi Ilya, thanks for coming back to me on that matter.
As a matter of fact, despite me fixing DNS on the Win2019 server, the alerts on the all-in-one standalone R80.40 are still happening.
I'm happy to do another session with you anytime you're available, but just so you know, those alerts come with "Accept", not "Drop", so they're not obstructing anything, just annoying, if you know what I mean. I'll show you one from today below; let me craft it so that you can see what type of alert it is. Please let me know via email should you have time for another (WebEx-based) session. Cheers
Jerry
Platinum

As mentioned earlier, this one is from just now. Destinations unreachable via IPv6 routing are dropped; reachable ones are allowed but still alerted. This isn't happening on my other standalone devices under R80.30 (latest take), only on R80.40 (latest). Any clues?

test.jpg

Jerry