VPN performance on a 3800
Dear community,
I am currently investigating an issue on a CPSG 3800 cluster running only S2S VPNs. Throughput is limited to roughly 300 Mbps because CPU 0 is constantly at 100% load. Besides normal SND tasks, the process shown below is causing the load:
Has any of you had a similar issue and found a solution?
Cheers,
Michael
Accepted Solutions
Okay, the problem is finally "solved": Check Point provided POC hardware (a 6400) with better CPUs but lower VPN throughput according to the datasheet. Single-thread performance is what matters in this case. Bandwidth is now at the interface maximum (1 Gbps) with CPU 0 load at about 30%.
Conclusion: Either "someone" fixes the datasheets or performance figures must be divided by ten...
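To put numbers on the "divided by ten" remark, here is a rough back-of-the-envelope sketch. It assumes throughput scales linearly with the load on the bottleneck core, which is a simplification and not something measured in this thread. The helper name `single_core_ceiling_mbps` is made up for illustration; the input figures are the ones quoted in the thread (300 Mbps at 100% on the 3800, 1 Gbps at about 30% on the 6400, and the 2.75 Gbps datasheet figure mentioned further down).

```python
def single_core_ceiling_mbps(observed_mbps: float, core_load: float) -> float:
    """Extrapolate throughput at 100% load on the bottleneck core.

    Assumes linear scaling with core load (an assumption, not a measurement).
    """
    if not 0 < core_load <= 1:
        raise ValueError("core_load must be in the range (0, 1]")
    return observed_mbps / core_load


# Figures quoted in the thread.
DATASHEET_3800_MBPS = 2750  # datasheet IPsec throughput for the 3800

ceiling_3800 = single_core_ceiling_mbps(300, 1.00)   # CPU 0 already saturated
ceiling_6400 = single_core_ceiling_mbps(1000, 0.30)  # interface-limited at 1 Gbps
datasheet_ratio = DATASHEET_3800_MBPS / ceiling_3800

print(f"3800 single-core ceiling: {ceiling_3800:.0f} Mbps")
print(f"6400 extrapolated ceiling: {ceiling_6400:.0f} Mbps")
print(f"3800 datasheet / observed: {datasheet_ratio:.1f}x")
```

The 6400 figure is purely an extrapolation, since the test was capped by the 1 Gbps interface. The ratio for the 3800 comes out a little above nine, which is close to the factor of ten mentioned above.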
A similar issue was resolved in the JHFs. Check that you have the latest one installed. For example, from R81.10:
https://sc1.checkpoint.com/documents/Jumbo_HFA/R81.10/R81.10/R81.10-List-of-all-Resolved-Issues.htm
PRJ-42145, PMTR-88118 | SecureXL | SNDs may reach 100% CPU utilization and are not released in some Site to Site VPN scenarios. |
Thank you for your reply. But take 95 is already installed.
I suggest contacting TAC and referencing the fix in the JHF (PRJ-42145, PMTR-88118) so that they can communicate it with relevant owners in R&D.
In the meantime we kind of "solved" it by setting P2 integrity to SHA1 after reviewing sk73980.
But honestly: This gateway is specified to achieve 2.75 Gbps IPsec performance.
It should reach that with up to date crypto and NOT by using deprecated ciphers.
TAC case is open. 😉
Just my personal honest opinion... I would NOT apply the settings from that sk, because it essentially "solves" speed but severely impacts security as far as VPN goes.
That's a bit generic... E.g. No one wants 3DES. Whereas the AES-NI friendly protocols are better on both fronts.
Hey Chris,
I had more than one instance where TAC suggested that sk on the phone to customers, and every single time they got an argument back about security, which is 100% valid. TAC's response was always that while it's true, it should help the speed. In my view, that is sort of a hard sell for a security company...
We all have different objectives & situations/ constraints that we're dealing with. Security is a sound argument but often doesn't fly if an appliance is undersized for the task at hand.
Ultimately it's a balance like most things.
I get it, but if you were a customer and a security vendor told you that, I'm sure you would not be too happy about it.
Absolutely right! I wouldn't do that either, but you might know how "urgent" things can get. 8)
Do you know how many SNDs there were on the machine, and what the CPU usage on those SNDs was when the issue happened?
There's dynamic balancing configured and working fine. All other cores are at ~90% idle.
Hashes for HMACs are vastly different from the same hashes used for file integrity. They don't depend nearly as much on the security of the hash, since they're computed per packet, and finding collisions retrospectively isn't useful.
MD5 provides more than enough integrity assurance for HMAC use for the next century. The only reason to use anything longer is checkbox compliance, where somebody cares more about not seeing the string "md5" in a configuration than they care about the actual security of the environment.
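For readers less familiar with the distinction, the following is a minimal sketch of a per-packet HMAC using only Python's standard `hmac` and `hashlib` modules. The key, the 1400-byte dummy packet, and the helper names (`packet_tag`, `verify`, `seconds_per_tag`) are illustrative assumptions, not anything taken from a gateway. The timing loop only shows the relative cost of the three hashes on whatever machine runs it; it says nothing about the 3800's cores or the kernel code path.

```python
import hashlib
import hmac
import os
import timeit

KEY = os.urandom(32)       # illustrative secret key, not a real IPsec key
PACKET = os.urandom(1400)  # dummy payload, roughly one full-size packet


def packet_tag(algorithm: str, key: bytes, packet: bytes) -> bytes:
    """Compute a keyed integrity tag over a single packet."""
    return hmac.new(key, packet, algorithm).digest()


def verify(algorithm: str, key: bytes, packet: bytes, tag: bytes) -> bool:
    """Recompute the tag and compare it in constant time."""
    return hmac.compare_digest(packet_tag(algorithm, key, packet), tag)


def seconds_per_tag(algorithm: str, rounds: int = 2000) -> float:
    """Average wall-clock time for one tag on this machine."""
    total = timeit.timeit(lambda: packet_tag(algorithm, KEY, PACKET), number=rounds)
    return total / rounds


for name in ("md5", "sha1", "sha256"):
    tag = packet_tag(name, KEY, PACKET)
    micros = seconds_per_tag(name) * 1e6
    print(f"HMAC-{name}: {len(tag)} byte tag, about {micros:.2f} us per packet")
```

An attacker without the key cannot produce a valid tag for a modified packet, which is why collision attacks on the bare hash do not translate directly into attacks on the HMAC.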
Fully agreed. But there are "checkboxes".
I've successfully argued with compliance assessors for a few standards (including PCI-DSS) that the proscription against MD5 and SHA1 doesn't apply to HMACs.
Next, I need to convince them that the requirement for passwords is "expiration every 90 days or better" and that no expiration is better, like NIST says. It just hasn't come up in an assessment yet.
So VPN and no blades other than FW?
Have you tuned / altered your CoreXL config at all?
Yes, FW/VPN only, and we altered different aspects of CoreXL, including disabling dynamic balancing / multiqueueing, with no impact on the main problem. We also checked whether KSFW mode makes a difference but, as expected, it does not.
From the perf top output above I assume that this is not an issue that can be solved at the CoreXL level. Looks like a software bug to me.
Understood. The point is that, specific issues aside, VPN performance may require tuning for best results and can also depend on the testing methodology.
Hope the SR is resolved for you quickly.
Thanks a lot!
I've had one case where a customer with around 80 incoming tunnels changed the community from "tunnel per gateway pair" to "tunnel per host pair" and that immediately caused the appliance to become unresponsive. Something to double-check?
Yes, we checked that, too. It's already "tunnel per gateway pair". Only a handful of VPNs and about 30 firewall rules...
That setting is usually checked for permanent tunnel...is that the case? Or is it just regular tunnel?
Andy
This is a permanent tunnel connecting to Zscaler. We configured it according to sk174848 and it has been working fine for months. The customer did not change anything, but all of a sudden packet loss started last Monday.
My guess is that traffic never got above ~350 Mbps but no one noticed. Today, after changing P2 integrity from sha256 to sha1 we noticed rates above 500 Mbps with cpu 0 load at about 50% which is okay.
Currently we're waiting for R&D to come back with suggestions on how to switch back to sha256. 😉
K, got it, fair enough : - )
Here is what an escalation guy from Dallas gave me a few months ago when a customer had a similar issue. The client never implemented it, since they had more pressing projects on the go, so I'm not sure whether this is something that would help.
Andy
fw_clamp_tcp_mss_control -> change to true in GuiDBedit
mss_value -> change to 1200
fw_clamp_vpn_mss -> change to 1 (# fw ctl set int fw_clamp_vpn_mss 1)
sim_clamp_vpn_mss -> change to 1 in $PPKDIR/conf/simkern.conf
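As background on why an MSS of 1200 would be a conservative clamp, here is a rough sketch of the ESP tunnel-mode overhead arithmetic. It assumes IPv4, a 1500-byte path MTU, no NAT-T UDP encapsulation, and no IP or TCP options; none of that was stated in the thread. The function names are made up for illustration, and the ICV and IV sizes are the standard ones for HMAC-SHA-1-96, HMAC-SHA-256-128, and AES-CBC.

```python
IPV4_HEADER = 20  # outer and inner IPv4 header, no options
TCP_HEADER = 20   # TCP header, no options
ESP_HEADER = 8    # SPI (4 bytes) + sequence number (4 bytes)
ESP_TRAILER = 2   # pad length (1 byte) + next header (1 byte)


def max_inner_packet(mtu: int, iv_len: int, block_size: int, icv_len: int) -> int:
    """Largest inner IP packet whose ESP tunnel-mode encapsulation fits in mtu."""
    budget = mtu - IPV4_HEADER - ESP_HEADER - iv_len - icv_len
    # Inner packet + padding + trailer must align to the cipher block size
    # (4-byte alignment when no encryption is used).
    padded = (budget // block_size) * block_size
    return padded - ESP_TRAILER


def max_tcp_mss(mtu: int, iv_len: int, block_size: int, icv_len: int) -> int:
    """Largest TCP MSS that avoids fragmenting the encapsulated packet."""
    return max_inner_packet(mtu, iv_len, block_size, icv_len) - IPV4_HEADER - TCP_HEADER


# (iv_len, block_size, icv_len) per illustrative Phase 2 proposal
PROFILES = {
    "NULL + HMAC-SHA-1-96": (0, 4, 12),
    "NULL + HMAC-SHA-256-128": (0, 4, 16),
    "AES-CBC + HMAC-SHA-256-128": (16, 16, 16),
}

for name, (iv_len, block_size, icv_len) in PROFILES.items():
    print(f"{name}: max MSS {max_tcp_mss(1500, iv_len, block_size, icv_len)}")
```

Under these assumptions all three proposals leave room for an MSS of roughly 1400, so 1200 keeps a comfortable margin for extra encapsulation on the path.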
If your VPN peer supports it, in P2 try using a Galois/Counter Mode version of AES such as AES-GCM-128 or AES-GCM-256. GCM takes the encryption and hashing functions and stitches them into a single operation, which can be fully offloaded to the AES-NI silicon. This should give you a nice boost; we switch to a GCM-based algorithm from 3DES in a lab exercise of my Gateway Performance Optimization Course and the speed increase is impressive. The 3800 does have 8 cores, but they are low-power Atom cores and not very speedy. The performance of the 3800 was dissected in this earlier thread:
Maximum reachable bandwidth 3800
This would be right if we had encryption in P2. But this is a VPN tunnel with Zscaler ZIA and we're only using the hash function. See sk174848.
Maybe that's part of the problem? Don't know but our L3 engineer forwarded it to R&D. I'm pretty curious.
BTW: I don't expect the box to provide more than 700-800 Mbps throughput.
The explanation can be found here: sk177966: Secure Network Distributor (SND) shows high CPU usage when 3DES encryption is enabled
sk174848 tells you to configure the encryption properties according to the following recommendation:
5.2.2.2. IKE SA Phase 2 – NULL or AES + MD5
So why did you use sha256 in P2? Usually, AES-128 / SHA-1 is good enough for P2...
Tunnel has been created according to sk174848.
See the configuration screenshots there. But it also performs badly with sha1, and I bet it's worse with md5. But that's not the point: the issue is that the hashing nearly fully saturates one CPU core, and if an appliance is specified for 2.75 Gbps performance I would expect it to deliver something near that number.
