David_Guan
Participant

R80.10 VSX Performance Issue - VPN tunnel encryption failed

Hi all,

We are currently having a performance issue with a VPN tunnel and need your help.

We have a dedicated VS (VS4) acting as a site-to-site VPN gateway. It runs a single VPN tunnel, and the remote gateway is a Fortigate firewall.

The issue is that during large file transfers (SFTP) through the VPN tunnel, the data pipe is closed for no apparent reason and the transfer fails. We managed to capture the packet drops, as shown below:

;[kern];[tid_8];[fw4_0];fw_log_drop_conn: Packet <dir 1, 10.92.47.4:42216 -> 10.56.61.101:22 IPP 6>, dropped by do_outbound, Reason: encryption failed;
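
To tally drops like the one above across a long debug capture, a small parser can help. The following is a minimal sketch that assumes every drop line follows exactly the layout of the sample above; the names `parse_drop_line` and `count_reasons` are illustrative and not part of any Check Point tooling.

```python
import re
from collections import Counter

# Pattern modelled on the single sample line in this post. Drop messages
# with a different layout will simply not match and are skipped.
DROP_RE = re.compile(
    r"Packet <dir (?P<dir>\d+), "
    r"(?P<src>[\d.]+):(?P<sport>\d+) -> (?P<dst>[\d.]+):(?P<dport>\d+) "
    r"IPP (?P<proto>\d+)>, dropped by (?P<func>\w+), Reason: (?P<reason>[^;]+);"
)

def parse_drop_line(line):
    """Return the fields of one drop line as a dict, or None if it does not match."""
    match = DROP_RE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    for key in ("dir", "sport", "dport", "proto"):
        fields[key] = int(fields[key])
    return fields

def count_reasons(lines):
    """Count how many parsed drop lines carry each drop reason."""
    parsed = (parse_drop_line(line) for line in lines)
    return Counter(p["reason"] for p in parsed if p is not None)

sample = (
    ";[kern];[tid_8];[fw4_0];fw_log_drop_conn: Packet <dir 1, "
    "10.92.47.4:42216 -> 10.56.61.101:22 IPP 6>, dropped by do_outbound, "
    "Reason: encryption failed;"
)
print(parse_drop_line(sample))
print(count_reasons([sample, "unrelated line"]))
```

Grouping by reason (or by source and destination) makes it easier to see whether the drops are limited to one stream or spread across the tunnel.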

When we removed the IP address filter from the zdebug command, we saw more drops on other traffic streams with the same reason.

While watching the output of "fwaccel stats -d", we can see the counters for "Encryption Errors" and "Decryption Errors" keep increasing.

Reason                Value            Reason                Value
--------------------  ---------------  --------------------  ---------------
general reason        2                PXL decision          22376
fragment error        0                hl - spoof viol       0
F2F not allowed       0                hl - TCP viol         0
corrupted packet      0                hl - new conn         0
clr pkt on vpn        41790            partial conn          0
encrypt failed        36731            drop template         0
decrypt failed        34614018         outb - no conn        20
interface down        0                cluster error         0
XMT error             0                template quota        0
anti spoofing         0                Attack mitigation     0
local spoofing        0                sanity error          0
monitored spoofed     1383646751       QXL decision          0
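
To see which counters are actually moving, two snapshots of this output can be compared. The sketch below assumes each data row has the form "reason value reason value" as in the output above; the function names and the second sample are hypothetical and only there for illustration.

```python
import re

# One data row: <reason> <value> <reason> <value>. Reasons may contain
# spaces and dashes, so the reason groups are matched lazily.
ROW_RE = re.compile(r"^\s*(.+?)\s+(\d+)\s+(.+?)\s+(\d+)\s*$")

def parse_drop_counters(text):
    """Parse the two-column drop table into a {reason: value} dict."""
    counters = {}
    for line in text.splitlines():
        match = ROW_RE.match(line)
        if match is None:
            continue  # header, separator, or blank line
        counters[match.group(1)] = int(match.group(2))
        counters[match.group(3)] = int(match.group(4))
    return counters

def counter_deltas(before, after):
    """Return only the counters whose value changed between two snapshots."""
    return {
        name: after[name] - before.get(name, 0)
        for name in after
        if after[name] != before.get(name, 0)
    }

first_sample = """\
Reason Value Reason Value
-------------------- --------------- -------------------- ---------------
clr pkt on vpn 41790 partial conn 0
encrypt failed 36731 drop template 0
decrypt failed 34614018 outb - no conn 20
"""

# Hypothetical second snapshot, made up for this example only.
second_sample = first_sample.replace("36731", "36800")

print(counter_deltas(parse_drop_counters(first_sample),
                     parse_drop_counters(second_sample)))
```

Taking the snapshots a known interval apart, for example right before and right after a failing transfer, ties the increase to the traffic being tested.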

The command 'fwaccel conns | grep "10.56.61.101"' helped confirm that encryption/decryption is offloaded to SecureXL:
10.56.61.101 22 10.92.47.4 36654 6 ...AC......... 2/0 0/2 0 0
10.56.61.101 22 10.92.47.4 36090 6 ...AC......... 2/0 0/2 0 0
10.92.47.4 36654 10.56.61.101 22 6 ...AC......... 2/0 0/2 0 0
10.92.47.4 36090 10.56.61.101 22 6 ...AC......... 2/0 0/2 0 0

We have logged a case with TAC and provided all the information we could, but so far we have not received any clear indication of what is causing the encryption/decryption errors.

One thing we would like to test is the suggestion from Timothy Hall in the discussions below: use the commands "sim vpn off; fwaccel off; fwaccel on" to see whether we still get errors when encryption/decryption is handled by INSPECT. However, we have never done this and don't understand the possible impact, so could someone share their experience?

R80.10 gateway, can't set sim_clamp_vpn_mss 

ICMP is sometimes drop when send via IPSec Tunnel 

What else could cause this issue? If the issue goes away without SecureXL, should we just leave it that way? I believe SecureXL is still worthwhile for offloading encryption and decryption, so even if the workaround works it cannot be a long-term fix, and we would still need to find the root cause.

Hope someone can help out here.

David

4 Replies
PhoneBoy
Admin

Disabling SecureXL is a useful troubleshooting step, but it should never be "the solution" to a problem. :)

When you do packet captures, do you see packets on that same session that are close to or at the MTU limit?

Possibly any related ICMP messages?

And have you tried enabling MSS clamping?

David_Guan
Participant

Hi Dameon,

Thanks for the suggestions.

We captured packets with debug enabled on the Check Point gateway. The tcpdump showed many encrypted packets larger than 1500 bytes, and the fw monitor capture showed quite a few ICMP Type 3 Code 4 messages (Destination Unreachable, fragmentation needed) from the SFTP server to our MFT gateway, which was trying to pull a large amount of data. The SFTP server kept sending the same ICMP Type 3 Code 4 messages, and then the MFT gateway re-initiated the TCP session by sending a TCP SYN, so I assume the original session had failed. We still need to capture packets on both end servers to understand what happened, but I believe the SecureXL encryption errors are related to the ICMP Type 3 Code 4 messages.

I will also try to get the current MSS clamping settings on the Check Point gateway. I don't believe they have been changed, so most likely they are at their defaults. What is the default behaviour? How does Check Point R80.10 SecureXL handle packets that exceed the MTU, and the ICMP Type 3 Code 4 messages?

PhoneBoy
Admin

If the sftp server is sending packets that are at or near MTU, then when you add the IPSec headers, the packets will be above the MTU value.

It gets a little more complicated if that packet has "don't fragment" set in its headers.
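
As a rough illustration of the arithmetic, the sketch below estimates the size of an ESP tunnel-mode packet for a given inner packet. It assumes IPv4, AES-CBC (16-byte IV and block size) and HMAC-SHA1-96 (12-byte ICV); the thread does not say which algorithms this tunnel uses, so those defaults and the function name are assumptions.

```python
def esp_tunnel_packet_size(inner_len, block_size=16, iv_len=16,
                           icv_len=12, nat_t=False):
    """Estimate the on-wire size of an IPv4 ESP tunnel-mode packet.

    Defaults assume AES-CBC with HMAC-SHA1-96. These are illustrative
    assumptions, not values taken from the tunnel discussed here.
    """
    outer_ip_header = 20          # new outer IPv4 header, no options
    esp_header = 8                # SPI (4 bytes) + sequence number (4 bytes)
    udp_header = 8 if nat_t else 0  # NAT-T encapsulates ESP in UDP
    # The ESP trailer adds a pad-length byte and a next-header byte, and
    # the encrypted part is padded up to a multiple of the cipher block size.
    to_encrypt = inner_len + 2
    padded = -(-to_encrypt // block_size) * block_size  # ceiling to block size
    return outer_ip_header + udp_header + esp_header + iv_len + padded + icv_len

for inner in (1400, 1500):
    outer = esp_tunnel_packet_size(inner)
    print(inner, outer, "exceeds a 1500-byte MTU" if outer > 1500 else "fits in a 1500-byte MTU")
```

Under these assumptions a full-size 1500-byte inner packet grows to 1560 bytes, so it has to be fragmented, or, if "don't fragment" is set, dropped with an ICMP Type 3 Code 4 message sent back to the sender.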

I'd start with this SK for some background: MTU and Fragmentation Issues in IPsec VPN 

That will probably lead you to this SK: New VPN features in R77.20 and later 

Basically you want to "clamp" MSS on traffic going through the VPN so you don't end up with oversized IPsec traffic.
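
To get a feel for what a clamped value might look like, here is a minimal sketch that finds the largest inner packet that still fits in the path MTU after ESP encapsulation and derives a TCP MSS from it. It assumes IPv4, TCP without options, AES-CBC and HMAC-SHA1-96; it is not how fw_clamp_vpn_mss or sim_clamp_vpn_mss compute their value, and the function names are hypothetical.

```python
def esp_tunnel_packet_size(inner_len, block_size=16, iv_len=16, icv_len=12):
    """Estimated IPv4 ESP tunnel-mode size (assumes AES-CBC + HMAC-SHA1-96)."""
    padded = -(-(inner_len + 2) // block_size) * block_size  # payload + 2-byte trailer, block aligned
    return 20 + 8 + iv_len + padded + icv_len                # outer IP + ESP header + IV + data + ICV

def largest_inner_packet(path_mtu):
    """Largest inner IP packet whose encrypted form still fits in path_mtu."""
    inner = path_mtu
    while inner > 0 and esp_tunnel_packet_size(inner) > path_mtu:
        inner -= 1
    return inner

def clamped_mss(path_mtu, ip_header=20, tcp_header=20):
    """TCP MSS that keeps the encrypted packet within path_mtu."""
    return largest_inner_packet(path_mtu) - ip_header - tcp_header

print(largest_inner_packet(1500), clamped_mss(1500))
```

With a 1500-byte path MTU and the assumed algorithms, this gives an inner packet limit of 1438 bytes and an MSS of 1398. Other ciphers, NAT-T, or a smaller path MTU change the result, so the SK articles remain the reference for the actual settings.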

David_Guan
Participant

Thanks PhoneBoy!

We have read through both articles you recommended and found that in our environment only fw_clamp_vpn_mss is enabled, while sim_clamp_vpn_mss is still disabled. We are scheduling a change to enable sim_clamp_vpn_mss. If that does not work, as mentioned in https://community.checkpoint.com/t5/General-Topics/R80-10-gateway-can-t-set-sim-clamp-vpn-mss/td-p/1..., we will disable SecureXL for VPN traffic only, to force all VPN traffic through INSPECT.

Will keep you posted.
