William_Chang
Participant

Domain Object when FQDN has multiple DNS results

We are running R80.40. More and more outbound destinations on the Internet are now hosted by cloud service providers or CDNs. We don't do HTTPS inspection (decryption). We use Domain Objects with FQDN very often.

Question: How does a Domain Object with FQDN handle a DNS query that returns multiple IPs for a single FQDN?

How does Check Point implement the Domain Object with FQDN? There is very little information about the details.

1. What's the frequency/intervals of the gateway querying the DNS? (Is it adjustable via CLI or GUI?)

2. Will it cache all of the IPs resolved for the same FQDN, and update the cache the next time the gateway queries DNS?

3. If it caches multiple IPs, will it cache only IPv4 and discard IPv6 when the gateway is configured for IPv4 only (to save cache space)?

4. If the multiple IP results are not cached together and the gateway caches only one of them, the gateway could deny traffic when the server uses a different IP from the same query to the same DNS server.

Summary: When the Domain Object with FQDN resolves to multiple IPs (very common, since many cloud service providers have multiple DCs), will the gateway correctly permit traffic to any of the IPs resolved from DNS?

 

 

18 Replies
Wolfgang
Leader

Hello @William_Chang 

Resolving DNS names with multiple IPs is no problem. With the Domains Tool (domains_tool) you can check every entry on the gateway.

The cache is updated every 60s starting from R80.20.

Have a look at  How do Domain Objects work?  and  Domain Objects in R8x 

Wolfgang

0 Kudos
JanVC
Collaborator

It will be a problem if the DNS reply is performance/geo location based.

For example: ondemand.database.windows.net

 

root@server:~# dig ondemand.database.windows.net +short
dataslice6.eastus.database.windows.net.
dataslice6eastus.trafficmanager.net.
cr3.eastus1-a.control.database.windows.net.
40.79.153.12

root@server:~# dig ondemand.database.windows.net +short
dataslice6.eastus.database.windows.net.
dataslice6eastus.trafficmanager.net.
cr5.eastus1-a.control.database.windows.net.
40.78.225.32

 

This correlates with the SQL gateways used by Azure (check East US region):

https://docs.microsoft.com/en-us/azure/azure-sql/database/connectivity-architecture

 

The firewall can resolve the FQDN to IP A while the client resolves it to IP B, causing a mismatch, and the traffic is dropped.

When checking with domains_tool, you can see it caches only the latest reply and not all recent replies.
It seems logical to cache only the latest result, but in this case it can cause mismatches.

Opened a TAC case, but there was nothing the engineer could do.
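
To make the mismatch concrete, here is a minimal Python sketch. It only models the idea that a rule with a Domain Object matches on whatever IPs the gateway currently has cached for the FQDN. The function name and the set-based cache are illustrative assumptions, not Check Point's implementation; the two IPs are the answers from the dig output above.

```python
def rule_matches(dst_ip: str, gateway_cache: set) -> bool:
    """Return True if the destination IP is among the IPs cached for the FQDN."""
    return dst_ip in gateway_cache

# The gateway resolved the FQDN to answer A; the client got answer B a moment later.
gateway_cache = {"40.79.153.12"}
client_dst = "40.78.225.32"

print(rule_matches(client_dst, gateway_cache))                      # False
print(rule_matches(client_dst, gateway_cache | {"40.78.225.32"}))  # True
```

The second call shows why a cache holding all recent replies would avoid the drop.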

Bob_Zimmerman
Advisor

As with any distributed system, to ensure consistency, there must be a single point of truth. The firewall needs to use exactly the same resolver as the clients behind it. This resolver should cache replies and provide the same answer until the TTL of that answer expires.

JanVC
Collaborator

so in this case the full 9 seconds 🙂

 

 

root@server:~# dig ondemand.database.windows.net

; <<>> DiG 9.11.5-P4-5.1+deb10u5-Debian <<>> ondemand.database.windows.net
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 16830
;; flags: qr rd ra; QUERY: 1, ANSWER: 4, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 512
;; QUESTION SECTION:
;ondemand.database.windows.net. IN A

;; ANSWER SECTION:
ondemand.database.windows.net. 299 IN CNAME dataslice6.eastus.database.windows.net.
dataslice6.eastus.database.windows.net. 9 IN CNAME dataslice6eastus.trafficmanager.net.
dataslice6eastus.trafficmanager.net. 9 IN CNAME cr3.eastus1-a.control.database.windows.net.
cr3.eastus1-a.control.database.windows.net. 21023 IN A 40.79.153.12

;; Query time: 30 msec
;; SERVER: 8.8.8.8#53(8.8.8.8)
;; WHEN: Fri May 14 18:30:03 CEST 2021
;; MSG SIZE rcvd: 188

root@server:~#

Benedikt_Weissl
Advisor

Dale_Lobb
Collaborator

Passive DNS learning only works for Non-FQDN Domain Objects and some Updatable Objects, according to sk161612.

Multiple SKs state that an FQDN is resolved by a direct DNS query via the DNS servers defined on the gateway.

The vast majority of FQDNs will work fine in nearly all scenarios, except where the answer is not the same for every querier. This happens mostly with global load balancers, where the query is resolved based on the querier's geolocation or on performance metrics of the possible answers. As JanVC said, that is going to happen a lot with cloud-based apps.

If you have multiple internal DNS resolvers, the only fully safe way to use FQDN objects is to have all internal DNS resolvers forward queries for non-local domains to a single outbound-to-the-Internet DNS server. That way, when any internal DNS server queries a name, the sole forwarder will either resolve the query or answer from its cache of already resolved queries, for the TTL of the answer. Thus all internal resolvers, including the gateway's DNS agents, will have the same answer for the same length of time.
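
The effect of a single caching forwarder can be illustrated with a toy Python model. Everything here is an assumption for illustration: `CachingForwarder` and `rotating_upstream` are made-up names, and the upstream simply hands out a different IP on every query to mimic a load-balanced authority. Real resolvers are considerably more involved.

```python
import itertools

class CachingForwarder:
    """Toy caching resolver: serves one cached answer per name until its TTL expires."""

    def __init__(self, upstream):
        self.upstream = upstream   # callable: name -> (ip, ttl_seconds)
        self.cache = {}            # name -> (ip, expiry_time)

    def resolve(self, name, now):
        cached = self.cache.get(name)
        if cached and now < cached[1]:
            return cached[0]
        ip, ttl = self.upstream(name)
        self.cache[name] = (ip, now + ttl)
        return ip

def rotating_upstream(ips, ttl):
    """Hypothetical load-balanced authority: a different IP on every query."""
    cycle = itertools.cycle(ips)
    return lambda name: (next(cycle), ttl)

upstream = rotating_upstream(["40.79.153.12", "40.78.225.32"], ttl=9)
forwarder = CachingForwarder(upstream)

name = "ondemand.database.windows.net"
gateway_answer = forwarder.resolve(name, now=0)  # gateway asks first
client_answer = forwarder.resolve(name, now=5)   # client asks within the TTL
print(gateway_answer == client_answer)           # True: both get the cached answer

# Without the shared cache, each asker hits the rotating upstream directly.
print(upstream(name)[0] == upstream(name)[0])    # False: the answers differ
```

Within the TTL, every asker behind the forwarder sees the same IP; once the TTL expires, the answer may change for everyone at the same time.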

0 Kudos
JanVC
Collaborator

@Bob_Zimmerman Well my lab proved me wrong by caching two entries for ondemand.database.windows.net, I didn't even need to go test DNS passive learning 🙂
I used 8.8.8.8 and a standalone windows server 2019 DNS server configured on the gateway (R80.20 jumbo 190 and R80.30 jumbo 228)

 

Every 2.0s: domains_tool -d ondemand.database.windows.net 60001:0117 19:56:06 2021
---------------------------------------------------------------------------------------------------
| Given Domain name: ondemand.database.windows.net FQDN: yes |
---------------------------------------------------------------------------------------------------
| IP address | sub-domain |
---------------------------------------------------------------------------------------------------
| 40.79.153.12 | no |
| 40.78.225.32 | no |
---------------------------------------------------------------------------------------------------
Total of 2 IP addresses found

 

dig ondemand.database.windows.net

; <<>> DiG 9.11.5-P4-5.1+deb10u5-Debian <<>> ondemand.database.windows.net
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 63464
;; flags: qr rd ra; QUERY: 1, ANSWER: 4, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 512
;; QUESTION SECTION:
;ondemand.database.windows.net. IN A

;; ANSWER SECTION:
ondemand.database.windows.net. 299 IN CNAME dataslice6.eastus.database.windows.net.
dataslice6.eastus.database.windows.net. 9 IN CNAME dataslice6eastus.trafficmanager.net.
dataslice6eastus.trafficmanager.net. 8 IN CNAME cr3.eastus1-a.control.database.windows.net.
cr3.eastus1-a.control.database.windows.net. 20862 IN A 40.79.153.12

;; Query time: 19 msec
;; SERVER: 8.8.8.8#53(8.8.8.8)
;; WHEN: Mon May 17 20:08:23 CEST 2021
;; MSG SIZE rcvd: 188

=> The A record is cached for 20862 seconds; only the CNAMEs are cached for 9 seconds.
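
Put differently, the A record itself is long-lived, but the name can be pointed at a different A record as soon as the shortest TTL in the CNAME chain runs out. A minimal Python sketch of that reading, using the ANSWER section above as input (the helper name `effective_ttl` is made up for this example):

```python
def effective_ttl(answer_section: str) -> int:
    """Smallest TTL in a dig ANSWER section; the whole chain is only valid until then."""
    ttls = []
    for line in answer_section.strip().splitlines():
        fields = line.split()
        # dig answer lines look like: name TTL class type rdata
        if len(fields) >= 5 and fields[1].isdigit():
            ttls.append(int(fields[1]))
    return min(ttls)

ANSWER = """
ondemand.database.windows.net. 299 IN CNAME dataslice6.eastus.database.windows.net.
dataslice6.eastus.database.windows.net. 9 IN CNAME dataslice6eastus.trafficmanager.net.
dataslice6eastus.trafficmanager.net. 8 IN CNAME cr3.eastus1-a.control.database.windows.net.
cr3.eastus1-a.control.database.windows.net. 20862 IN A 40.79.153.12
"""

print(effective_ttl(ANSWER))  # 8
```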

Need to recheck what is going on in production.

 

@Dale_Lobb , sk161612 states

Value 3: Enabled for Updatable Objects, Non-FQDN objects, and FQDN objects
So you're saying there are other SKs contradicting this info?

0 Kudos
Dale_Lobb
Collaborator

Ah, I guess I missed the Value 3 in sk161612.   Certainly the default for R80.40 does not do FQDN objects. 

As to other SKs, they could all use some updating. sk120633 and sk90401 mention neither passive DNS learning nor sk161612. sk161632 also never mentions passive DNS learning; it does reference sk161612, but only as a referral for handling errors encountered in the domains tool. BTW, some of the options listed in sk161612 (-hc, -report) do not work, at least not in R80.40.

Today is actually the first time I have seen sk161612, although I have been aware of and using FQDN objects for some time, and have actually encountered the issue mentioned above because the default setting for passive DNS learning does not support FQDN objects. We have been planning a sole DNS forwarder to take care of the issue, but now it seems that maybe I do not have to go down that route.

 

 

0 Kudos
JanVC
Collaborator

[Expert@xxx:0]# domains_tool -d ondemand.database.windows.net
Internal erorr, for more information use DEBUG mode
[Expert@xxx:0]#

 

It seems the domains_tool or the DNS probing is not working as intended on production
- checked DNS servers and they are working correctly
- checked tcpdump and request is done every 60 seconds to both DNS servers and correct response is seen

 

It seems there is a binary difference between R80.40 (T94) and R80.20 (T190)

ls -lh /opt/CPsuite-R80.40/fw1/bin/domains_tool
-rwxr-x--- 1 admin bin 30K Jan 27 2020 /opt/CPsuite-R80.40/fw1/bin/domains_tool

 

ls -lh /opt/CPsuite-R80.20/fw1/bin/domains_tool
-rwxr-x--- 1 admin bin 43K Sep 17 2018 /opt/CPsuite-R80.20/fw1/bin/domains_tool

 

TAC case is opened

0 Kudos
JanVC
Collaborator

TAC confirmed it as a bug in R80.20

https://supportcenter.checkpoint.com/supportcenter/portal?eventSubmit_doGoviewsolutiondetails=&solut...

 

Symptoms
  • When using two domain objects that translate to the same IP address in two different rules of the access policy, the rules using these domain objects might not be matched correctly.
Cause

The DNS cache structure in R80.10 and R80.20 supports only one domain per IP.

0 Kudos
William_Chang
Participant

Yes, # domains_tool -hc and # domains_tool -report are not working in R80.40.

One important condition for using DNS Passive Learning: for it to work, the DNS traffic must pass through the Security Gateway / Cluster.

Like most Check Point documents, sk161612 fails to describe "DNS Passive Learning" in depth:

1. How big is the cache and how long the new DNS entry will last in cache?

2. Is there any way to adjust the cache size?

3. How do we clear one particular entry in the cache?

4. Are we using the same domains_tool to check the entries in the cache?

0 Kudos
William_Chang
Participant

Could any expert here please explain the discrepancy between "dig" and "domains_tool"?

#dig svc.intersight.com +short
34.195.174.165
34.225.148.197
52.86.5.111

_______________________________

# domains_tool -d svc.intersight.com -m
---------------------------------------------------------------------------------------------------
| Given Domain name: svc.intersight.com FQDN: yes |
---------------------------------------------------------------------------------------------------
| IP address | sub-domain |
---------------------------------------------------------------------------------------------------
| 52.86.5.111 | no |
| 13.35.89.31 | no |
| 34.225.148.197 | no |
| 13.35.89.6 | no |
| 34.195.174.165 | no |
| 13.35.89.58 | no |
| 13.35.89.71 | no |
---------------------------------------------------------------------------------------------------

Checking those 13.35.89.x addresses:

# dig -x 13.35.89.31 +short
server-13-35-89-31.lax3.r.cloudfront.net.

These are obviously CDN addresses, which means Cisco Intersight is using CDN acceleration.

However, this raises two questions:

1. Is there any CLI/tool to find out the CDN for an FQDN based on the geolocation of the DNS server used by the gateway (to validate that the additional entries in domains_tool are correct for the particular domain)?

2. More importantly, if the gateway permits traffic based on the IPs listed for the particular FQDN, including the CDN IPs, the same CDN IPs would most likely be used for other hosted FQDNs, so the firewall would permit traffic that is not intended.

In this example, if the gateway permits traffic for the FQDN "svc.intersight.com", which includes the 13.35.89.x CDN IPs, and the same CDN IPs are also used for other sites (hypothetically speaking, www.gamble.com), wouldn't the rule also permit that traffic?
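
The concern can be sketched in a few lines of Python. This is a hypothetical model only: the mapping of www.gamble.com to a CloudFront IP is invented for illustration, and the rule is reduced to a plain destination-IP check, which is how an IP-based match behaves without HTTPS inspection.

```python
# Hypothetical DNS view: two unrelated sites fronted by the same CDN edge IP.
dns = {
    "svc.intersight.com": {"34.195.174.165", "13.35.89.31"},
    "www.gamble.com": {"13.35.89.31"},  # hypothetical, for illustration only
}

# What the gateway would hold for the Domain Object in the allow rule.
allowed_ips = dns["svc.intersight.com"]

def permitted(hostname: str) -> bool:
    """An IP-based rule sees only the destination address, never the hostname."""
    return any(ip in allowed_ips for ip in dns[hostname])

print(permitted("svc.intersight.com"))  # True
print(permitted("www.gamble.com"))      # True, although the rule never named it
```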

 

 

 

 

0 Kudos
Benedikt_Weissl
Advisor

See sk120633 for an explanation of the discrepancy. Basically, the gateway also resolves www.$yourdomainobject.
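
A rough Python sketch of what that would mean for the table above. The stub mapping for the www. name is an assumption, inferred from the four extra 13.35.89.x entries in the domains_tool output; it was not verified with a separate query.

```python
# Stub resolver data. The www. entry is an assumption inferred from the
# extra 13.35.89.x rows in the domains_tool output, not a verified lookup.
STUB_DNS = {
    "svc.intersight.com": {"34.195.174.165", "34.225.148.197", "52.86.5.111"},
    "www.svc.intersight.com": {"13.35.89.31", "13.35.89.6", "13.35.89.58", "13.35.89.71"},
}

def expected_cache(domain: str, resolve=lambda n: STUB_DNS.get(n, set())) -> set:
    """Union of the answers for the name itself and for www.<name>."""
    return resolve(domain) | resolve("www." + domain)

print(len(expected_cache("svc.intersight.com")))  # 7, the same count as the table
```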

0 Kudos
William_Chang
Participant

Thank you Benedikt!

However, this goes against common sense for an FQDN. Is there any reason why Check Point adds www. to the FQDN as well? Is there any way to disable this odd behavior? If the FQDN is www.checkpoint.com, it will include www.www.checkpoint.com, which doesn't make any sense.

0 Kudos
Benedikt_Weissl
Advisor

I don't know the reasoning behind this. I guess it's to save time and/or effort if you're using FQDNs to allow web surfing to specific sites. For example, if you want to allow traffic to "www.checkpoint.com", all you have to configure is ".checkpoint.com" and the "www" is included, thus saving some time.

Maybe a Check Point employee can comment on this?

0 Kudos
mk1
Collaborator

Hello all,

I have the following question. Let's say I want to allow access to a particular S3 bucket (TCP/443) in AWS, for instance my-s3.s3.eu-central-1.amazonaws.com.
Let's see what we have for it:

user@server:~$ dig @1.1.1.1 my-s3.s3.eu-central-1.amazonaws.com

; <<>> DiG 9.16.1-Ubuntu <<>> @1.1.1.1 my-s3.s3.eu-central-1.amazonaws.com
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 32911
;; flags: qr rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
;; QUESTION SECTION:
;my-s3.s3.eu-central-1.amazonaws.com. IN A

;; ANSWER SECTION:
my-s3.s3.eu-central-1.amazonaws.com. 300 IN CNAME s3-r-w.eu-central-1.amazonaws.com.
s3-r-w.eu-central-1.amazonaws.com. 5 IN A 52.219.140.16

;; Query time: 64 msec
;; SERVER: 1.1.1.1#53(1.1.1.1)
;; WHEN: Fri Aug 13 16:26:13 CEST 2021
;; MSG SIZE rcvd: 101

As you can see, it is a CNAME to s3-r-w.eu-central-1.amazonaws.com, whose A record has a TTL of no more than 5 seconds!

After downloading the official list of Amazon IPs for S3 in that region (eu-central-1), I found 198 PTR records pointing to s3-r-w.eu-central-1.amazonaws.com. I know PTR records are not checked when I use an FQDN; I mention this only to show that any of those IPs could come back as an A record when resolving s3-r-w.eu-central-1.amazonaws.com.

So as far as I understand, the firewall must have all of these IPs in its cache (as shown by domains_tool -d) in order to allow access to that S3 bucket. Having checked, I can say for sure that we don't have all 198 A records that my-s3.s3.eu-central-1.amazonaws.com could lead to. That means it is quite possible for a user to have an IP address for that S3 bucket in his DNS cache that is missing on the firewall, so his access to the service is blocked.

We already have complaints from users who cannot reach their S3 buckets, and I can see in the logs that they don't match the rule containing their S3 bucket. The Domain Object is called .my-s3.s3.eu-central-1.amazonaws.com and it's configured as FQDN.
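
A toy Python simulation gives a feel for the scale of the problem. It rests on loud assumptions: the authority is modelled as stepping through the pool one IP per TTL (real S3 DNS does not rotate this neatly), the gateway looks up once every 60 seconds as mentioned earlier in the thread, and the gateway is generously assumed to keep every answer it has ever seen.

```python
def simulate(pool_size=198, ttl_s=5, poll_s=60, duration_s=600):
    """Toy model: the authority moves to the next IP in the pool every ttl_s seconds."""

    def current(t):
        # Index of the IP being served at time t (assumed strict rotation).
        return (t // ttl_s) % pool_size

    gateway_seen = set()
    misses = 0
    total = 0
    for t in range(0, duration_s, ttl_s):
        if t % poll_s == 0:
            gateway_seen.add(current(t))     # gateway lookup, keeps every answer
        total += 1
        if current(t) not in gateway_seen:   # client resolves now and connects
            misses += 1
    return misses, total

print(simulate())  # (110, 120)
```

In this model, 110 of 120 client connections in ten minutes land on an IP the gateway has never resolved. The exact numbers mean little, but they show why a 5-second TTL and a 60-second lookup interval do not fit together.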

Have you ever had a similar issue?

Thank you!

Tobias_Moritz
Advisor

This is the logical effect of the approach Check Point uses here. As already described in this thread, the passive DNS learning feature from sk161612 can help if you configure it correctly and can make sure that the firewall that shall allow this traffic also sees the DNS requests for this FQDN.

As far as I know, there is no way to share this information between different firewalls (gateways) managed by the same management server. Please correct me, if I'm wrong here.

0 Kudos
mk1
Collaborator

Unfortunately I can't use "DNS Passive Learning", because the DNS requests are not passing the firewall.

0 Kudos