Contributor

Bi-directional NAT is not working post VMSS deployment

Hi All,

 

For some reason, bi-directional NAT is not working for one of our destination-NATed traffic flows in a VMSS deployment (2 instances). Return traffic from the destination is not able to reach the original source.

R80.40 - Build 105

 

Am I missing something?

 

Rule -

Original Src - 10.22.x.x-23, 10.22.y.y-23 (network group consisting of networks)

Original Dst - 10.22.8.40

Original Port - 443

Xlated Src - Original

Xlated Dst - 10.22.133.10 (Type Static)

Xlated Port- Original

 

 

Log -

Can't see the "Additional NAT Rule -1" entry, which generally appears in the traffic log.

 

 

@PhoneBoy --- Any help will be highly appreciated.

21 Replies
Participant

Do you see NATing in fw monitor?

Champion

R80.40 - Build 105, and which Jumbo HFA?

Contributor

HOTFIX_R80_40_JUMBO_HF_MAIN Take: 78

Admin

Did you read the help with bi-directional NAT?
This option is only relevant for automatic NAT rules.

In any case, not sure I understand what is happening here.
Can you be far more explicit about what you expect, what is actually happening, etc?

0 Kudos
Reply
Highlighted

Do you see a drop/out of state for the return packet (Source: 10.22.133.10, Destination: 10.22.x.x-23, 10.22.y.y-23)?

Contributor

Yup, exactly.

Contributor

This is observed for east-west return traffic.

Basically, the incoming packet from the CP internal LB arrives at one GW instance (eth1), and the return traffic arrives at the other GW instance (eth1) via the CP internal LB.

I added LocalGatewayInternal (Xlated Source, type Hide NAT), but still no luck.

Note: the traffic of interest here is passive FTP.

0 Kudos
Reply
Highlighted

Does this Destination NAT work for other connections?

With two firewall instances, I would expect a 50:50 chance that it works: the internal LB selects the firewall instance based on the IPs of the packet, and since NAT modifies one of these IPs, the load balancer could select a different instance for the return packet (as in your case).

But if so, I would expect more problems, not just for passive FTP.

Doing a Xlated Source NAT with LocalGatewayInternal should solve the problem.

Did you check the log to see that the source IP is modified correctly?

Running "dynamic_objects -l" on the instance should show you the IP attached to LocalGatewayInternal.

If the Source NAT is correct, is a UDR used for the subnet in which your destination 10.22.133.10 is deployed?

If so, please make sure that the subnet of the internal interface of the VMSS is routed directly (next hop type Virtual network) and not forwarded to the internal LB; otherwise we would have the same problem.
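To illustrate the instance-selection point above, here is a toy Python model. It assumes the load balancer picks an instance from a symmetric hash of the two IP addresses only; this is a simplification for illustration, not Azure's actual algorithm. The instance names and client addresses are hypothetical; the destination IPs are the ones from the NAT rule in this thread.

```python
import hashlib

INSTANCES = ["gw-0", "gw-1"]  # hypothetical names for the two VMSS instances


def pick_instance(ip_a: str, ip_b: str) -> str:
    """Toy symmetric hash: the same IP pair maps to the same instance in both directions."""
    key = "|".join(sorted((ip_a, ip_b))).encode()
    return INSTANCES[hashlib.sha256(key).digest()[0] % len(INSTANCES)]


def same_instance(client: str, original_dst: str, translated_dst: str) -> bool:
    """True if the forward packet and the server's reply would reach the same instance."""
    forward = pick_instance(client, original_dst)   # client -> original destination
    reply = pick_instance(translated_dst, client)   # real server -> client
    return forward == reply


if __name__ == "__main__":
    # Hypothetical client addresses, used only to exercise the model.
    clients = [f"10.22.0.{i}" for i in range(1, 101)]
    matches = sum(same_instance(c, "10.22.8.40", "10.22.133.10") for c in clients)
    print(f"{matches} of {len(clients)} flows return through the same instance")
```

Without destination NAT the IP pair is identical in both directions, so the model always returns the same instance. With destination NAT the reply carries a different pair and can hash to the other instance. Hiding the source behind the gateway's own eth1 address sidesteps this, because the reply is then addressed to that instance directly.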

 

Contributor

Hi @Matthias_Haas,

Please find my responses inline below your comments.

<does this Destination NAT work for other connections?

We have only this single application using DNAT; no other use case so far.

<Did you check the log to see that the Source IP is modified correctly?

Yes, the source NAT is happening properly. I have validated with tcpdump and fw monitor that the translated source is the FW eth1 IP and the destination is 10.22.133.10.

<If the Source NAT is correct, is a UDR used for the subnet in which your destination 10.22.133.10 is deployed?

Yes, UDR is added on the FW internal interface eth1 subnet for destination 10.22.133.10 via Virtual network (next hop).

Contributor

Also, we enabled one service on a random port, 2222, and it works fine. We have observed the issue only with passive FTP connections (port 21, data ports 2000-4000).

These are already allowed in the access rule.

No issue was observed for this FTP service with a single instance (VMSS solution) or with the earlier deployed cluster gateway 😞

0 Kudos
Reply
Highlighted

Abhishek,

<Yes UDR is added on the FW internal interface eth1 subnet for destination 10.22.133.10 via Virtual network (next hop)

I mean the UDR attached to the subnet of the destination 10.22.133.10 (relevant for the return packet).

Update: make sure that the eth1 subnet is not forwarded to the internal LB.

Contributor

OK, that UDR is pointing towards the VMSS internal LB. (Hence we thought of adding Source Hide NAT to overcome the asymmetry of the return traffic.)

Basically, for all the subnets, as per design we have a default route pointing towards the VMSS internal LB so that Check Point can inspect the traffic.

If, in this case, I change the route for the CP subnet to go directly via the Azure fabric, that would lead to CP missing the return traffic, wouldn't it?

0 Kudos
Reply
Highlighted

<If in this case, I change the route for CP subnet directly via Azure fabric, that would lead to CP missing the return traffic, isn't it?

Exactly (just for the internal/eth1 segment of the VMSS).

0 Kudos
Reply
Highlighted

[Attached image: 49E78284-54B3-4602-89A3-A97AAB31A534.jpeg]
That's the route you need to add.

Contributor

Yes, that's done as per the doc. Routing-wise we are sorted; no split or asymmetric scenario.

Finally, with end-to-end packet captures, we realized Check Point is specifically dropping the 227 PASV response towards the client whenever we enable SNAT.

 

Zdebug -

@;266774109;[cpu_2];[fw4_1];fw_log_drop_ex: Packet proto=6 x.x.x.x:21 -> y.y.y.y:17024 dropped by fw_post_vm_chain_handler Reason: Handler 'ftp_code' drop;
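When there are many such drop lines, it can help to pull the fields out programmatically. The following is a minimal Python sketch, assuming the drop lines all follow the shape of the single line quoted above; the regular expression and the `parse_drop` helper are illustrative names, not part of any Check Point tooling.

```python
import re

# Pattern derived only from the one drop line quoted above; other builds may format it differently.
DROP_RE = re.compile(
    r"Packet proto=(?P<proto>\d+) "
    r"(?P<src>\S+):(?P<sport>\d+) -> (?P<dst>\S+):(?P<dport>\d+) "
    r"dropped by (?P<func>\w+) Reason: (?P<reason>[^;]+);"
)


def parse_drop(line: str):
    """Return the fields of a drop line as a dict, or None if the line does not match."""
    match = DROP_RE.search(line)
    return match.groupdict() if match else None


if __name__ == "__main__":
    sample = (
        "@;266774109;[cpu_2];[fw4_1];fw_log_drop_ex: Packet proto=6 "
        "x.x.x.x:21 -> y.y.y.y:17024 dropped by fw_post_vm_chain_handler "
        "Reason: Handler 'ftp_code' drop;"
    )
    print(parse_drop(sample))
```

Grouping the parsed records by `reason` and source port makes it easy to confirm that only the FTP control channel (source port 21) is affected.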

 

kernel debug -

@;237812014;26Oct2020  7:53:08.451195;[cpu_1];[fw4_2];fw_xlate_scan_ftp_cmd: 227 command;

@;237812014;26Oct2020  7:53:08.451196;[cpu_1];[fw4_2];fw_xlate_anticipate_cookie: changing packet to <y.y.y.y, 9dd>;

@;237812014;26Oct2020  7:53:08.451197;[cpu_1];[fw4_2];fw_xlate_update_packet: new field (len=16, delta=-1) is 'y,y,y,y,9,221';

@;237812014;26Oct2020  7:53:08.451199;[cpu_1];[fw4_2];fw_xlate_update_length: Got -3 from fwseqvalid_reg_offset_deltas;

@;237812014;26Oct2020  7:53:08.451200;[cpu_1];[fw4_2];fw_post_vm_chain_handler: handler function returned action DROP;

@;237812014;26Oct2020  7:53:08.451202;[cpu_1];[fw4_2];fw_log_drop_ex: Packet proto=6 y.y.y.y:21 -> x.x.x.x:61627 dropped by fw_post_vm_chain_handler Reason: Handler 'ftp_code' drop;

@;237812014;26Oct2020  7:53:08.451204;[cpu_1];[fw4_2];After  POST VM: <dir 1, y.y.y.y:21 -> x.x.x.x:61627 IPP 6> (len=87) TCP flags=0x18 (PUSH-ACK), seq=3417305397, ack=951427193, data end=3417305444 ;

@;237812014;26Oct2020  7:53:08.451205;[cpu_1];[fw4_2];POST VM Final action=DROP;

@;237812014;26Oct2020  7:53:08.451205;[cpu_1];[fw4_2]; -----  Stateful POST VM outbound Completed -----
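For readers following the debug: a 227 reply carries the data-channel address as six decimal numbers, four for the IP and two for the port, with port = p1 * 256 + p2. The field 'y,y,y,y,9,221' therefore means port 9 * 256 + 221 = 2525, which is 0x9dd, matching the "<y.y.y.y, 9dd>" line. The Python sketch below shows this decoding and how rewriting the address can change the payload length. The helper names are hypothetical and the IPs in the example are the ones from the NAT rule in this thread, not the masked addresses in the debug, so the length delta it produces need not equal the delta=-1 logged above.

```python
import re

PASV_RE = re.compile(r"\((\d+),(\d+),(\d+),(\d+),(\d+),(\d+)\)")


def parse_pasv(reply: str) -> tuple[str, int]:
    """Extract (ip, port) from a 227 reply; port = p1 * 256 + p2."""
    match = PASV_RE.search(reply)
    if match is None:
        raise ValueError("no PASV address field found")
    h1, h2, h3, h4, p1, p2 = (int(g) for g in match.groups())
    return f"{h1}.{h2}.{h3}.{h4}", p1 * 256 + p2


def rewrite_pasv(reply: str, new_ip: str, new_port: int) -> tuple[str, int]:
    """Replace the address field and return (new reply, change in length)."""
    field = ",".join(new_ip.split(".") + [str(new_port >> 8), str(new_port & 0xFF)])
    rewritten = PASV_RE.sub(f"({field})", reply, count=1)
    return rewritten, len(rewritten) - len(reply)


if __name__ == "__main__":
    reply = "227 Entering Passive Mode (10,22,133,10,9,221)"
    print(parse_pasv(reply))                        # address and port announced by the server
    print(rewrite_pasv(reply, "10.22.8.40", 2525))  # same reply as a NAT device might rewrite it
```

Because the rewritten text can be shorter or longer than the original, a device that rewrites it also has to adjust TCP sequence numbers for the rest of the connection, which is where the fwseqvalid_reg_offset_deltas line in the debug appears to come in.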

 

 

@PhoneBoy --- We have already opened a TAC case, SR#6-0002342606, but it is not getting proper attention. Can you please suggest something and highlight this to the appropriate Check Point resources? Thanks in advance 🙂

Contributor

Just to add: I have tried all permutations and combinations of the Global NAT settings, a custom TCP service (with protocol set to None), and the SKs found by searching the internet for this ftp_code drop, all with no luck.

Admin

Contributor

Yes, I have gone through that SK, but we aren't using FTP over TLS 😞. Plus, the client in our case never receives a response from the server. (The 227 PASV response from the server is not relayed back to the client; CP is dropping it.)

I have already tried a custom TCP service allowing port 21 and >1024 (with None as protocol), with no luck.

Is there a way I can force Check Point to bypass the standard inspection for this traffic?

Contributor

@Timothy_Hall --- do you have any insights on this?

Champion

Not sure here, but it looks like the rewrite of the 227 command by NAT is running afoul of TCP sequence validation?  Try setting these Inspection Settings to Inactive (or Accept if Inactive is not available):

  • TCP Out of Sequence
  • TCP Off-Path Sequence Inference
  • Sequence Verifier

There are some kernel variables that seem to be relevant to your debug, but I wouldn't recommend trying to tamper with these except under the guidance of TAC as they are not documented and doing so may have nasty side effects:

ecm_seqval = 29
fwseqvalid_exact_syn_on_rst = 0
fwseqvalid_flush_syn_ack = 1
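As background on why a NAT rewrite and sequence validation can interact: when a payload is rewritten to a different length, every later sequence number in that direction, and every ACK coming back, has to be shifted by the accumulated delta. The Python sketch below shows that bookkeeping in its simplest form. It is a generic illustration, not Check Point's implementation, and says nothing about what the kernel variables above do; the class and method names are hypothetical, and 32-bit wraparound in the comparisons is ignored for brevity. The numbers in the example are the "data end" and delta values from the kernel debug earlier in the thread.

```python
MOD = 2 ** 32  # TCP sequence numbers are 32-bit


class SeqOffsetTracker:
    """Track length changes made to one direction of a TCP stream."""

    def __init__(self) -> None:
        # Each entry: (original sequence number just after the edited data, cumulative delta)
        self.edits: list[tuple[int, int]] = []

    def record_rewrite(self, seq_end: int, delta: int) -> None:
        """Note that the payload ending at original seq_end changed length by delta bytes."""
        total = (self.edits[-1][1] if self.edits else 0) + delta
        self.edits.append((seq_end, total))

    def translate_seq(self, seq: int) -> int:
        """Map a sender sequence number to the value the receiver will see."""
        delta = 0
        for seq_end, total in self.edits:
            if seq >= seq_end:
                delta = total
        return (seq + delta) % MOD

    def translate_ack(self, ack: int) -> int:
        """Map a receiver ACK (rewritten numbering) back to the sender's numbering."""
        delta = 0
        for seq_end, total in self.edits:
            if ack >= seq_end + total:
                delta = total
        return (ack - delta) % MOD


if __name__ == "__main__":
    tracker = SeqOffsetTracker()
    tracker.record_rewrite(seq_end=3417305444, delta=-1)  # values taken from the debug output
    print(tracker.translate_seq(3417305444))  # next server segment as the client sees it
    print(tracker.translate_ack(3417305443))  # client ACK mapped back for the server
```

If this bookkeeping and the sequence validation logic disagree about the expected numbers, the rewritten segment can look invalid, which would be consistent with the suggestion above to relax the sequence-related Inspection Settings as a test.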

 

Gaia 3.10 Immersion Self-paced Video Series
now available at http://www.maxpowerfirewalls.com
Contributor

Thanks, Timothy, for the quick reply.

  • TCP Off-Path Sequence Inference - inactive
  • Sequence Verifier - inactive
  • TCP Out of Sequence - was "Drop" by default; changed it to "Accept & log"

Even this didn't make any difference; the issue still persists.

The TAC team is following up with the R&D department, but they might take some time. If you have any suggestions or papers to read about the kernel parameters you mentioned, I would like to explore them in the meantime.

Note: those kernel parameters are set to their defaults at the moment; I have attached a screenshot showing this.

 

Tx,

Abhishek
