Solved: R80.40+ and GNAT. How to clear the NAT table?

RamGuy239 · 2021-07-15

Greetings.

I have a customer who is facing Hide-NAT exhaustion, but it looks like a case of the NAT table not clearing correctly. The CPview History data suggests this is not normal for the customer, but usage suddenly maxed out: CPview -> NAT shows all capacity in use, and the logs show messages such as "NAT Hide failure - There are currently not available ports for hide operation.".

The customer already did the smart thing and changed the NAT rule for this traffic to use another address for Hide-NAT. They did this yesterday, but CPview -> NAT still shows the old Hide-NAT address at full capacity. We know the new address is in place because traffic is working and it shows up in the statistics as well.

What we don't understand is why the statistics for the old address are still showing 99% used.

The customer is running a Check Point High Availability cluster on a set of two Check Point Quantum 28800 Plus appliances running R80.40 with the latest general availability jumbo hotfix accumulator take 118.

I've tried to verify a few things. GNAT is enabled by default on R80.40 as long as there are more than six firewall instances / CoreXL workers. The logs also show the format protocol, source IP, destination IP, destination port, which indicates GNAT; without GNAT the output would have been protocol, source IP and destination IP (no destination port).

Just to make sure I got a few outputs from the customer:

[Expert@TK-FW-12:0]# fw ctl get int fwx_alloc_entry_expiration
fwx_alloc_entry_expiration = 300

[Expert@TK-FW-12:0]# fw ctl get int fwx_alloc_free_port_timeout
fwx_alloc_free_port_timeout = 1

[Expert@TK-FW-12:0]# dynamic_split -p
Dynamic Balancing is currently off

[Expert@TK-FW-12:0]# fw ctl get int fwx_gnat_enabled
fwx_gnat_enabled = 1

Configuring Check Point CoreXL...

=================================

CoreXL is currently enabled with 60 IPv4 firewall instances and 4 IPv6 firewall instances.

These are all default settings for R80.40. GNAT is active and the default expiration time for NAT is 300 seconds. I was confused as to why CoreXL Split / Dynamic Balancing would not be in use on an appliance with so many cores, but it seems it was never on by default on R80.40; that came with R81+. On R80.40 it has to be enabled manually regardless of the number of cores.
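
For repeat checks on both cluster members, the same parameters can be collected in one pass. This is a minimal sketch, assuming an expert-mode bash shell on the gateway; show_gnat_params is a hypothetical helper name, and it only wraps the fw ctl get int calls already shown above:

```shell
# Print the GNAT-related kernel parameters discussed above in one pass.
# Only 'fw ctl get int <name>' is used, exactly as in the outputs above.
# show_gnat_params is a hypothetical helper name, not a Check Point command.
show_gnat_params() {
    local param
    for param in fwx_gnat_enabled \
                 fwx_alloc_entry_expiration \
                 fwx_alloc_free_port_timeout; do
        fw ctl get int "$param"
    done
}

# Run in expert mode on each cluster member:
#   show_gnat_params
```

Comparing the output from both members is a quick way to rule out a parameter that was changed on only one of them.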

It's been over 18 hours since the customer changed the NAT rule from the old IP to the new IP. Still, CPview -> NAT is showing 99% used on the old IP. Running cppcap for over five minutes on the active member shows no traffic or existing connections that could be keeping the old hide-NAT table alive:

[Expert@TK-FW-12:0]# cppcap -f "src 185.176.215.252 and dst 146.192.252.64" -DNT
0 packets captured (0 B)

This doesn't make much sense to me. If no traffic passing the firewall is using the old IP for hide-NAT, why doesn't the table clear? Is this a bug?

The commands I know for manually clearing the NAT table are not working on R80.40 with GNAT enabled, so I have no way to verify the current state of the NAT table or to clear it manually. I therefore can't tell whether CPview is simply showing the wrong data, whether the NAT table is somehow stuck and not clearing, or whether something else is going on.

Normally I would have used:

fw tab -t fwx_alloc -s, to view the statistics of the NAT-table
fw tab -t fwx_alloc -x, to clear it

But fwx_alloc does not exist on R80.40 with GNAT enabled:
No such table fwx_alloc

sk165153 for GNAT doesn't really give me any similar commands to run. All it contains is information on how to do kernel debugs.

Do we have any commands that can be used on R80.40+ with GNAT enabled?

Certifications: CCSA, CCSE, CCSM, CCSM ELITE, CCTA, CCTE, CCVS, CCME

Sam2 · 2021-07-16

How was the NAT rule configured? Can you set the source to be a network containing just 3 addresses? Or maybe a network range? That should work similar to a NAT pool.

Is the traffic being NATed to a mix of destinations, or is this a ton of connections being NATed from a single host?

fw tab -s | egrep 'alloc|nat'

You might be able to use the above command to print out a few tables that might be GNAT related. I am currently running R80.10, so I can't run anything myself to confirm.

Sam2 · 2021-07-16

Sorry, I meant an IP range, not a network range.

RamGuy239 · 2021-07-20

It was a simple rule:

Rule 236:
src: group with all internal subnets, dst: 146.192.252.64, service: any, translate: hide-nat 185.176.215.244

It was located right above the other NAT rule, rule 237:
src: group with all internal subnets, dst: 146.192.252.64, service: any, translate: hide-nat 185.176.215.252

But it seems the issue got solved. The customer changed the NAT rule to use an IP range instead, so additional IPs are available for hide-NAT. The new rule looks like this:

src: group with all internal subnets, dst: 146.192.252.64, service: any, translate: hide-nat 185.176.215.178-181

After making this change and manually restarting the application running on 146.192.252.64, things are looking much better. The NAT table is no longer growing endlessly.
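
Why a range helps can be sketched with rough arithmetic. This assumes (it is not confirmed anywhere in this thread) that each hide address draws from a high-port pool of 10000-60000 for a given protocol, destination IP and destination port, so please verify the actual pool for your version. hide_nat_capacity is a hypothetical helper that takes the first and last value of the final octet of the hide range:

```shell
# Rough upper bound on concurrent hide-NAT connections to ONE destination
# (same protocol, destination IP and destination port).
# ASSUMPTION: each hide address offers the high-port pool 10000-60000.
# hide_nat_capacity is a hypothetical helper, not a Check Point command.
hide_nat_capacity() {
    local first_octet="$1" last_octet="$2"
    local addresses=$((last_octet - first_octet + 1))
    local ports_per_address=$((60000 - 10000 + 1))
    echo $((addresses * ports_per_address))
}

hide_nat_capacity 252 252   # single hide address 185.176.215.252, prints 50001
hide_nat_capacity 178 181   # range 185.176.215.178-181, prints 200004
```

Under that assumption, the four-address range gives roughly four times the headroom towards 146.192.252.64, which fits the improvement seen here. It does not explain why the old entries were not released.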

AaronCP · 2021-07-22

Hey @RamGuy239,

The name of the table for GNAT has changed to fwx_alloc_global.

fw tab -t fwx_alloc_global -s
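
To see whether the table actually drains, the summary can be sampled over a window longer than the 300-second fwx_alloc_entry_expiration. This is a minimal sketch, assuming an expert-mode bash shell; sample_gnat_table is a hypothetical helper name, and no assumption is made about the columns that -s prints, so the output is logged as-is with a timestamp:

```shell
# Sample the GNAT allocation table summary at a fixed interval, printing a
# timestamp before each sample so growth or drain can be compared later.
# Usage: sample_gnat_table COUNT INTERVAL_SECONDS
# sample_gnat_table is a hypothetical helper name, not a Check Point command.
sample_gnat_table() {
    local count="${1:-7}"
    local interval="${2:-60}"
    local i=0
    while [ "$i" -lt "$count" ]; do
        date '+%Y-%m-%d %H:%M:%S'
        fw tab -t fwx_alloc_global -s
        i=$((i + 1))
        # Sleep between samples, but not after the last one.
        if [ "$i" -lt "$count" ]; then
            sleep "$interval"
        fi
    done
}

# Example: 7 samples, 60 seconds apart, covers 360 seconds in total:
#   sample_gnat_table 7 60
```

If the reported values stay flat across the whole window while no matching traffic is seen, that points at entries not expiring rather than at a display problem.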

Thanks,

Aaron.

RamGuy239 · 2021-07-22

Hi @AaronCP

Thank you for the commands. They will come in very handy!

YosiHavilo · 2021-07-28

Hi,

With the move from dynamic port allocation to GNAT, the table changed to a global table: fwx_alloc_global.

Deleting the allocation table (fwx_alloc / fwx_alloc_global) is not recommended.

In any case, you can't delete the fwx_alloc_global table because it is a global table, and we currently don't have a command to do it.

If you delete the table, you can get link collisions.

I suggest that you open a support ticket so we can investigate.

RamGuy239 · 2021-07-28

We currently have a TAC case open. It's moving rather slowly because R&D needs to get involved.
