
Performance of open server hardware and Power Profile Settings

Hello CheckMates,

I want to share something I learned after a long period of troubleshooting.

Some weeks ago we upgraded an open server cluster to newer hardware and from R77.30 to R80.30. Everything worked after the upgrade, but we noticed much higher CPU utilization. Monitoring showed 20-30% more utilization than before, with the same traffic and the same blades enabled as with R77.30.

Sometimes we saw traffic loss because of the high utilization. I knew we had some load peaks in network traffic (database replication, storage transfers, etc.), but none of this was a problem with R77.30.

We tried every optimization we could think of:

  • Rule base
  • CoreXL / SND
  • Multiqueue
  • disabling blades like IPS, AV, AB etc. (we tried with only firewall and VPN enabled, same results)
  • analyzing traffic and optimizing with the fw ctl fast_accel feature
  • disabling logging
  • changing from USFW to kernel mode
  • investigation with TAC

Long story short, nothing changed. The newer hardware has much more CPU power, but utilization was much higher than on the old system. Our main problem was traffic loss during peak loads of network traffic. These peaks last only a very short time but are very high (4-5 Gb/s).

One day in our office kitchen I overheard a performance discussion among the virtualization guys that sounded like a similar problem. That was another long story (slow-running Citrix XenApp on VMware), but one with a solution: the problem was solved by changing the power settings in the BIOS of the hardware. They had found that some applications suffer if the hardware does not always provide the full power of all CPUs.

That same evening I checked our settings and voilà:

"Balanced…" was active, and I changed it to "Maximum Performance".

BIOS-setting-HP-power-profile.png

After that, everything was fine: utilization was as expected and there was no more traffic loss at high load peaks.

I knew these settings existed, but I was not aware of their effect on performance. And in all the time the case was open I never thought to check the BIOS settings (shame on me).

Finally, here are two monitoring screenshots from before and after changing the power profile setting. Network traffic was roughly the same for both.

HP power profile "Balanced Power and Performance" (CPU utilization 8 cores, one working day)

BIOS-setting-HP-power-profile_balanced.png


HP power profile "Maximum Performance" (CPU utilization 8 cores, one working day)

BIOS-setting-HP-power-profile_maximum.png

A simple solution to a long-running story.

Wolfgang

2 Replies
Champion

Thanks for the great writeup. I have seen this power saving issue before, along with the infamous "CPU fan 0 RPM" failure that caused monstrous CPU load as the CPU downclocked to keep itself from literally bursting into flames. I mentioned both issues in my book even though they are (somewhat) rare:

Firewall Open Hardware BIOS Tuning


If utilizing open hardware instead of a Check Point appliance for your firewall, it is
important to make several BIOS configuration adjustments to ensure optimal
performance. On the other hand, all Check Point firewall appliances have already had
their BIOS settings tuned appropriately at the factory; only Check Point TAC knows the
BIOS password for the appliances anyway and they will not disclose it. Updating the
BIOS of a Check Point firewall appliance is extremely rare (and only Check Point TAC
can actually perform the upgrade), but the following SK reference is included for the sake of
completeness: sk120915: Check Point Appliances BIOS Firmware versions map.


If using an open hardware firewall, it may be possible to change the number of cores
per CPU that are presented to the Gaia hardware via the BIOS. In the server’s BIOS,
configure an appropriate number of cores per CPU such that the total number of cores
presented by the underlying hardware matches the total number of licensed cores. For
more information about how this setting impacts Check Point CoreXL licensing, see
section “The Trial License Core Crunch” in Chapter 6.


The following are general recommendations for open hardware firewalls; specific
BIOS setting names will vary slightly between hardware vendors, so you may need to do
a little research and exploration to find the relevant settings to adjust. If using open
hardware for the firewall that has multiple power supplies, a power supply problem can
actually cause system CPUs to be placed in a disabled state which is absolutely
disastrous from a performance perspective. See the following SK for an explanation of
this rare but nasty corner case: sk103348: Output of 'cat /proc/cpuinfo' command and of
'top' command show only one CPU core on multi...
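
Both the cores-per-CPU setting and the power supply corner case look the same from the operating system side: fewer cores are visible than you expect. The following is a minimal sketch, not something from the book, that counts the `processor` entries in saved `cat /proc/cpuinfo` output and compares the result with the number you expect. The function names and the expected count are hypothetical and only serve the illustration.

```python
import re


def count_logical_cores(cpuinfo_text):
    """Count the 'processor : N' entries in /proc/cpuinfo-style text."""
    return len(re.findall(r"^processor\s*:\s*\d+", cpuinfo_text, flags=re.MULTILINE))


def check_core_count(cpuinfo_text, expected_cores):
    """Return a one-line verdict comparing visible cores with the expected count.

    expected_cores is whatever you believe should be visible, for example
    the number of licensed cores (an assumption you supply yourself).
    """
    found = count_logical_cores(cpuinfo_text)
    if found == expected_cores:
        return f"OK: {found} cores visible"
    return f"MISMATCH: {found} cores visible, {expected_cores} expected"
```

To use it, save the output of `cat /proc/cpuinfo` from the firewall to a file, read that file into a string on any machine with Python 3, and pass the string to `check_core_count` together with the core count you expect.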

In general any BIOS setting that permits the CPU frequency to vary from its base
processor speed (either faster or slower) should be disabled. While the Gaia operating
system itself performs just fine with varying CPU frequencies, portions of the Check
Point firewall code such as the Secure Network Distributor (SND), Dynamic Dispatcher,
and ClusterXL assume that all cores are equal at all times in regards to clock speed and
processing power.

If firewall CPU clock speeds vary in the slightest these features will perform in a
sub-optimal fashion; CPU clock speed adjustments can take many forms but here is a
sampling of those that should be disabled on open hardware firewalls:

- Intel Turbo Boost/Turbo Mode
- Intel SpeedStep
- Energy Saving: P-States, C-States, & HPC Optimizations
- SMT/Hyperthreading (only supported on Check Point firewall appliances)
- IO Non Posted Prefetching
- X2APIC Support
- Dynamic Power Capping Function

Other relevant BIOS settings to check:
- CPU and Memory Speed: Maximum Performance
- Memory Channel Mode: Independent
- Thermal/Fan Mode: Maximum Performance
- AES-NI Support: Enabled

Gaia 3.10 Immersion Self-paced Video Series
now available at http://www.maxpowerfirewalls.com
Admin

An interesting story for sure.
I wouldn't have even considered that this might cause a CPU load issue, but I guess it makes sense with respect to the amount of power being used.