dj0Nz
Advisor

VPN performance on a 3800

Dear community,

I am currently investigating an issue on a CPSG 3800 cluster running only S2S VPNs. Throughput is limited to roughly 300 Mbps because CPU 0 is constantly at 100% load. Besides normal SND tasks, the process shown below is causing the load:

vpn_fw_traffic_probs_20230720_perf-c0.jpg

Has any of you had a similar issue and found a solution?

Cheers,
Michael

 

 

0 Kudos
1 Solution

Accepted Solutions
dj0Nz
Advisor

Okay, the problem is finally "solved": Check Point provided PoC hardware (a 6400) with better CPUs but lower VPN throughput according to the datasheet. Single-thread performance is what matters in this case. Bandwidth is now at the interface maximum (1 Gbps) with CPU 0 load at about 30%.

Conclusion: Either "someone" fixes the datasheets or performance figures must be divided by ten...

54 Replies
Tal_Paz-Fridman
Employee

A similar issue was resolved in a Jumbo Hotfix (JHF). Check that you have the latest one installed. For example, from the R81.10 list of resolved issues:

https://sc1.checkpoint.com/documents/Jumbo_HFA/R81.10/R81.10/R81.10-List-of-all-Resolved-Issues.htm

 

PRJ-42145, PMTR-88118 SecureXL SNDs may reach 100% CPU utilization and are not released in some Site to Site VPN scenarios.
0 Kudos
dj0Nz
Advisor

Thank you for your reply, but Take 95 is already installed.

0 Kudos
Tal_Paz-Fridman
Employee

I suggest contacting TAC and referencing the fix in the JHF (PRJ-42145, PMTR-88118) so that they can raise it with the relevant owners in R&D.

0 Kudos
dj0Nz
Advisor

In the meantime we kind of "solved" it by setting P2 integrity to SHA1 after reviewing sk73980.
But honestly: this gateway is specified to achieve 2.75 Gbps of IPsec VPN throughput.
It should reach that with up-to-date crypto and NOT by using deprecated ciphers.

TAC case is open. 😉

0 Kudos
the_rock
Legend

Just my personal honest opinion: I would NOT apply the settings from that sk, because they essentially "solve" the speed problem but severely impact VPN security.

0 Kudos
Chris_Atkinson
Employee

That's a bit generic... E.g. No one wants 3DES. Whereas the AES-NI friendly protocols are better on both fronts.

CCSM R77/R80/ELITE
0 Kudos
the_rock
Legend

Hey Chris,

I had more than one instance where TAC suggested that sk on the phone to customers, and every single time they got an argument back about security, which is 100% valid. TAC's response was always that while that is true, it should help the speed. In my view, that is sort of a hard sell for a security company.

0 Kudos
Chris_Atkinson
Employee

We all have different objectives & situations/ constraints that we're dealing with.  Security is a sound argument but often doesn't fly if an appliance is undersized for the task at hand.

Ultimately it's a balance like most things.

 

CCSM R77/R80/ELITE
0 Kudos
the_rock
Legend

I get it, but if you were a customer and a security vendor told you that, I'm sure you would not be too happy about it.

0 Kudos
dj0Nz
Advisor

Absolutely right! I wouldn't do that either, but you know how "urgent" things can get. 😎

0 Kudos
Tal_Paz-Fridman
Employee

Do you know how many SNDs there were on the machine, and what the CPU usage on those SNDs was when the issue happened?

0 Kudos
dj0Nz
Advisor

There's dynamic balancing configured and working fine. All other cores are at ~90% idle.

0 Kudos
Bob_Zimmerman
Authority

Hashes for HMACs are vastly different from the same hashes used for file integrity. They don't depend nearly as much on the security of the hash, since they're computed per packet, and finding collisions retrospectively isn't useful.

MD5 provides more than enough integrity assurance for HMAC use for the next century. The only reason to use anything longer is checkbox compliance, where somebody cares more about not seeing the string 'md5' in a configuration than about the actual security of the environment.
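To make the per-packet point concrete, here is a minimal sketch in Python using only the standard library. It is an illustration, not how the gateway implements it: the key, the packet bytes, and the helper names `packet_tag` and `verify_packet` are made up for this example. The truncation lengths follow the IPsec integrity algorithms HMAC-MD5-96, HMAC-SHA-1-96 and HMAC-SHA-256-128.

```python
import hashlib
import hmac

# Truncated tag lengths in bytes, as used by the IPsec integrity algorithms
# HMAC-MD5-96, HMAC-SHA-1-96 and HMAC-SHA-256-128.
ICV_LENGTH = {"md5": 12, "sha1": 12, "sha256": 16}


def packet_tag(key: bytes, packet: bytes, digest: str) -> bytes:
    """Compute the keyed tag for ONE packet (hypothetical helper)."""
    full = hmac.new(key, packet, digest).digest()
    return full[:ICV_LENGTH[digest]]


def verify_packet(key: bytes, packet: bytes, tag: bytes, digest: str) -> bool:
    """Recompute the tag and compare in constant time."""
    return hmac.compare_digest(packet_tag(key, packet, digest), tag)


# Illustrative values only: a made-up session key and payload.
key = bytes(range(32))
packet = b"example ESP payload"

tags = {name: packet_tag(key, packet, name) for name in ICV_LENGTH}
ok = verify_packet(key, packet, tags["md5"], "md5")
tampered_ok = verify_packet(key, packet + b"x", tags["md5"], "md5")
```

An attacker without the key cannot produce a valid tag for a modified packet, whichever hash is underneath. That is a different threat from finding two files with the same unkeyed digest.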

dj0Nz
Advisor

Fully agreed. But there are "checkboxes".

Bob_Zimmerman
Authority

I've successfully argued with compliance assessors for a few standards (including PCI-DSS) that the proscription against MD5 and SHA1 doesn't apply to HMACs.

Next, I need to convince them that the requirement for passwords is "expiration every 90 days or better" and that no expiration is better, like NIST says. It just hasn't come up in an assessment yet.

0 Kudos
Chris_Atkinson
Employee

So VPN and no blades other than FW?

Have you tuned / altered your CoreXL config at all?

CCSM R77/R80/ELITE
0 Kudos
dj0Nz
Advisor

Yes, FW/VPN only, and we altered different aspects of CoreXL, including disabling dynamic balancing / multiqueueing, with no impact on the main problem. We also checked whether KSFW mode makes a difference but, as expected, it does not.

From the perf top output above I assume that this is not an issue that can be solved at the CoreXL level. It looks like a software bug to me.

0 Kudos
Chris_Atkinson
Employee

Understood. Specific issues aside, the point is that VPN performance may require tuning for best results, and it can also depend on the testing methodology.

Hope the SR is resolved for you quickly.

CCSM R77/R80/ELITE
0 Kudos
dj0Nz
Advisor

Thanks a lot!

0 Kudos
Ruan_Kotze
Advisor

I've had one case where a customer with around 80 incoming tunnels changed the community from "tunnel per gateway pair" to "tunnel per host pair" and that immediately caused the appliance to become unresponsive.  Something to double-check?

0 Kudos
dj0Nz
Advisor

Yes, we checked that, too. It's already "tunnel per gateway pair". Only a handful of VPNs and about 30 firewall rules...

0 Kudos
the_rock
Legend

That setting is usually checked for permanent tunnel...is that the case? Or is it just regular tunnel?

Andy

0 Kudos
dj0Nz
Advisor

This is a permanent tunnel connecting to Zscaler. We configured it according to sk174848 and it has been working fine for months. The customer did not change anything, but all of a sudden packet loss started last Monday.

My guess is that traffic never got above ~350 Mbps but no one noticed. Today, after changing P2 integrity from SHA256 to SHA1, we noticed rates above 500 Mbps with CPU 0 load at about 50%, which is okay.

Currently we're waiting for R&D to come back with suggestions on how to switch back to sha256. 😉
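As a rough sanity check on the numbers in this thread, the single-core ceiling can be extrapolated from throughput and CPU 0 load. This is a back-of-the-envelope sketch in Python. It assumes throughput scales linearly with the load on the saturated core, which is a simplification, and the function name is made up for illustration.

```python
DATASHEET_MBPS = 2750  # the 2.75 Gbps figure quoted in this thread


def projected_ceiling_mbps(observed_mbps: float, cpu_load: float) -> float:
    """Extrapolate the throughput at 100% load on the bottleneck core.

    Assumes linear scaling with CPU load (a simplification).
    """
    if not 0 < cpu_load <= 1:
        raise ValueError("cpu_load must be in (0, 1]")
    return observed_mbps / cpu_load


# SHA256: ~350 Mbps with CPU 0 saturated.
sha256_ceiling = projected_ceiling_mbps(350, 1.0)
# SHA1: ~500 Mbps with CPU 0 at about 50%.
sha1_ceiling = projected_ceiling_mbps(500, 0.5)
# How far the SHA256 ceiling is from the datasheet figure.
shortfall_factor = DATASHEET_MBPS / sha256_ceiling
```

Under that assumption the SHA1 ceiling comes out at about 1 Gbps, while the SHA256 ceiling is short of the datasheet figure by a factor of roughly eight.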

0 Kudos
the_rock
Legend

K, got it, fair enough : - )

Here is what an escalation engineer from Dallas gave me a few months ago when a customer had a similar issue. The client never implemented it, since they had more pressing projects on the go, so I am not sure whether it would help.

Andy

 

https://supportcenter.checkpoint.com/supportcenter/portal?eventSubmit_doGoviewsolutiondetails=&solut...

 

fw_clamp_tcp_mss_control - change to true in GuiDBedit

mss_value - change to 1200

fw_clamp_vpn_mss -> # fw ctl set int fw_clamp_vpn_mss 1

sim_clamp_vpn_mss -> change to 1 in $PPKDIR/conf/simkern.conf
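For context on why an MSS clamp is suggested at all, here is a rough sketch in Python of the ESP tunnel-mode overhead arithmetic. It assumes IPv4, no IP or TCP options, and no NAT-T (UDP) encapsulation. The function names are made up for illustration and are not Check Point tooling.

```python
IPV4_HEADER = 20        # bytes, no IP options
TCP_HEADER = 20         # bytes, no TCP options
ESP_HEADER = 8          # SPI (4) + sequence number (4)
ESP_TRAILER_FIXED = 2   # pad length (1) + next header (1)


def esp_tunnel_size(inner_len: int, icv_len: int,
                    iv_len: int = 0, block: int = 4) -> int:
    """Wire size of one inner IPv4 packet after ESP tunnel-mode encapsulation."""
    align = max(block, 4)  # ESP pads to at least a 4-byte boundary
    unpadded = inner_len + ESP_TRAILER_FIXED
    padding = (-unpadded) % align
    return IPV4_HEADER + ESP_HEADER + iv_len + unpadded + padding + icv_len


def max_tcp_mss(mtu: int, icv_len: int,
                iv_len: int = 0, block: int = 4) -> int:
    """Largest TCP MSS whose encapsulated packet still fits in `mtu`."""
    for inner_len in range(mtu, IPV4_HEADER + TCP_HEADER, -1):
        if esp_tunnel_size(inner_len, icv_len, iv_len, block) <= mtu:
            return inner_len - IPV4_HEADER - TCP_HEADER
    raise ValueError("MTU too small for ESP tunnel mode")


# NULL encryption with a 16-byte ICV (HMAC-SHA-256-128), 1500-byte MTU.
mss_null_sha256 = max_tcp_mss(1500, icv_len=16)   # 1414
# NULL encryption with a 12-byte ICV (HMAC-SHA-1-96).
mss_null_sha1 = max_tcp_mss(1500, icv_len=12)     # 1418
# A segment clamped to MSS 1200 becomes a 1240-byte inner packet.
clamped_wire_size = esp_tunnel_size(1200 + 40, icv_len=16)
```

Under these assumptions a clamp of 1200 leaves a wide safety margin below a 1500-byte path MTU. The actual margin needed depends on the path and on any additional encapsulation.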

0 Kudos
Timothy_Hall
Champion

If your VPN peer supports it, in P2 try using a Galois Counter Mode version of AES such as AES-GCM-128 or AES-GCM-256. GCM takes the encryption and hashing functions and stitches them into a single operation, which can be fully offloaded to the AES-NI silicon. This should give you a nice boost; we switch to a GCM-based algorithm from 3DES in a lab exercise of my Gateway Performance Optimization Course and the speed increase is impressive. The 3800 does have 8 cores, but they are low-power Atom cores and not very speedy. The performance of the 3800 was dissected in this earlier thread:

Maximum reachable bandwidth 3800

Gateway Performance Optimization R81.20 Course
now available at maxpowerfirewalls.com
dj0Nz
Advisor

This would be right if we had encryption in P2. But this is a VPN tunnel with Zscaler ZIA and we're only using the hash function. See sk174848.

Maybe that's part of the problem? Don't know but our L3 engineer forwarded it to R&D. I'm pretty curious.

BTW: I don't expect the box to provide more than 700-800 Mbps throughput.

0 Kudos
G_W_Albrecht
Legend

The explanation can be found here: sk177966: Secure Network Distributor (SND) shows high CPU usage when 3DES encryption is enabled

CCSE CCTE CCSM SMB Specialist
0 Kudos
G_W_Albrecht
Legend

sk174848 tells you to configure the encryption properties according to the following recommendation:

5.2.2.2. IKE SA Phase 2 – NULL or AES + MD5

So why did you use SHA256 in P2? Usually, AES-128 / SHA-1 is good enough for P2...

CCSE CCTE CCSM SMB Specialist
0 Kudos
dj0Nz
Advisor

Tunnel has been created according to sk174848.

See the configuration screenshots there. But it also performs badly with SHA1, and I bet it's worse with MD5. But that's not the point: the issue is that the hashing nearly saturates one CPU core, and if an appliance is specified for 2.75 Gbps I would expect it to deliver something near that number.

 
