Khalid_Aftas
Contributor

R80.20 CoreXL & VSX best practices

Hello,

 

During the last months I've heard multiple versions from CP TAC regarding the best practices for core affinity for FWK in VSX R80.20:

 

- Give all available cores (minus SND) to all VSs, and add FWK instances (each time we have performance issues) with the dynamic dispatcher (current setup: 28 cores to all VSs; some CPU cores are maxed out, some are doing nothing)

- Assign specific cores to specific VSs and their FWKs (e.g. VS1 -> CPU 4-12, VS2 -> CPU 13-20)

 

Where is the truth ? 😄

 

Kr,

 

Khalid

16 Replies
G_W_Albrecht
Legend

Where is the documentation of these best practices available? I only have the Check Point VSX Administration Guide R80.20, which explains CoreXL configuration starting on p. 87.

Khalid_Aftas
Contributor

That's what I asked TAC when they suggested what the best practices were.

That's why I'm asking the community...

HeikoAnkenbrand
Champion

More information:

R80.30 VSX Administration Guide

@Kaspars_Zibarts  will share best practices on leveraging VSX technology to provide scalable and optimized security while keeping maximum performance. 
Presentation - a nice presentation, 100 points from me 👍

And more Tuning tips from me:

R80.x Architecture and Performance Tuning - Link Collection
R80.x - Top 20 Gateway Tuning Tips 

 

PS:
In your overview you should consider whether SMT (R80.x - Performance Tuning Tip - SMT (Hyper Threading)) is on or off. There can be massive performance differences with CoreXL if the cores are not assigned correctly. The correct use of MQ (R80.x - Performance Tuning Tip - Multi Queue) should also be observed. The dynamic dispatcher (sk105261: CoreXL Dynamic Dispatcher in R80.10 and above) only brings a better distribution of connections in some situations, so I would only use it in specific cases.
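The static-vs-dynamic dispatcher difference from sk105261 can be illustrated with a small conceptual sketch (this is not Check Point's actual code; the hashing, the load values and the function names are made up for illustration): the static dispatcher hashes the connection tuple to a fixed FWK instance regardless of load, while the dynamic dispatcher picks the least-loaded instance.

```python
# Conceptual sketch only - NOT Check Point's implementation.
# Static dispatch: hash the connection 5-tuple to a fixed FWK instance.
# Dynamic dispatch (sk105261 idea): pick the instance with the lowest load.

import zlib

def static_dispatch(conn_tuple, n_instances):
    """Hash the 5-tuple to a fixed instance, regardless of current load."""
    key = "-".join(map(str, conn_tuple)).encode()
    return zlib.crc32(key) % n_instances

def dynamic_dispatch(instance_loads):
    """Pick the least-loaded FWK instance (first one on a tie)."""
    return min(range(len(instance_loads)), key=lambda i: instance_loads[i])

loads = [90, 15, 60, 15]  # hypothetical per-instance CPU load in percent
conn = ("10.0.0.1", 40000, "10.0.0.2", 443, 6)
print("static :", static_dispatch(conn, len(loads)))  # fixed, load-blind
print("dynamic:", dynamic_dispatch(loads))            # -> 1
```

This also shows why the dynamic dispatcher only helps in some situations: when all instances are roughly equally loaded, both strategies behave the same.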

 

Kaspars_Zibarts
Authority

Actually it's very difficult to prescribe the "best" model for VSX when it comes to CoreXL. Everyone is using it in different ways, so the solution will end up quite different in each case. But I understand your frustration - it's not easy, and it takes years to get a good understanding. And then, when you think you know it all - bam! New version and new tricks 🙂

First things first - all I know from "inside" is that you should not be running R80.20 on gateways; upgrade to R80.30 with the latest jumbo. I just heard that the feedback and case numbers on R80.20 gateways (not mgmt!) were not great. We have been running VSX on R80.30 since January and it's been great. You might want to read this too if you decide to upgrade:

https://community.checkpoint.com/t5/General-Topics/First-impressions-R80-30-on-gateway-one-step-forw...

Secondly, I'm not really good on VSX running on open servers - they seem to behave somewhat differently; it looks like open servers are more efficient and you can just run all VSes sharing the same FWK cores. At least that's what I've heard from "big" customers. We run appliances, a mix of 23800, 26000 and 41000 chassis, and they all needed tweaking to get the best results.

Your next decision will be based on the blades you use - is it just FW, or also advanced blades? Basically, VSX runs better without hyperthreading (SMT) enabled if you only use FW and most traffic is accelerated. If you see a lot of PXL and you use advanced blades, you will definitely benefit from the extra cores.

One special high-CPU case for us, for example, was Identity Awareness (pdpd and pepd) - therefore we run those on dedicated cores so that they do not affect real firewalling.

To give you the short answer - I prefer dedicated cores for each VS, and even for processes, as it really helps troubleshooting, especially in high-CPU cases. Plus you are protecting your other VSes from being impacted.

It's a lot of careful work to plan your CoreXL split manually, especially if you use hyperthreading - you must consider CPU core siblings! That's very important.

But it would be very difficult to give you an exact answer without knowing the exact circumstances.

 

Khalid_Aftas
Contributor

Thx for your insight.

We are running a CP hardware 14000-series cluster with 32 cores.

Currently we have 2 VSs running the FW/IPS/URL Filtering blades (no HTTPS inspection). You can see the CPU affinity below; only 2 cores are almost always maxed out, the rest are doing nothing (dynamic dispatcher is on).

The plan is to segment the traffic further into new VSs and also upgrade to R80.30.


[Expert@EU933055-OSS:0]# fw ctl affinity -l
Mgmt: CPU 0
eth2-01: CPU 1
eth2-02: CPU 1
eth2-03: CPU 1
eth2-04: CPU 1
eth2-05: CPU 2
eth2-06: CPU 2
eth2-07: CPU 2
eth2-08: CPU 2
VS_0: CPU 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31
VS_0 fwk: CPU 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31
VS_0 smt_status: CPU 1 2 3
VS_1: CPU 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31
VS_1 fwk: CPU 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31
VS_2: CPU 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31
VS_2 fwk: CPU 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31
VS_3: CPU 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31
VS_3 fwk: CPU 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31


Kaspars_Zibarts
Authority

Is hyperthreading enabled? You can run this command to check:
if [ `grep ^"cpu cores" /proc/cpuinfo | head -1 | awk '{print $4}'` -ne `grep ^"siblings" /proc/cpuinfo | head -1 | awk '{print $3}'` ]; then echo HT; else echo no-HT; fi
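The one-liner compares two fields in /proc/cpuinfo: "cpu cores" (physical cores per package) versus "siblings" (logical CPUs per package); when siblings exceeds cpu cores, SMT is on. The same check rendered in Python, as an illustration rather than a replacement (the sample text below is a made-up excerpt, not output from this box):

```python
# Illustration of the /proc/cpuinfo check from the shell one-liner:
# HT/SMT is enabled when "siblings" (logical CPUs per package) differs
# from "cpu cores" (physical cores per package).

def ht_enabled(cpuinfo_text):
    fields = {}
    for line in cpuinfo_text.splitlines():
        if ":" in line:
            key, _, value = line.partition(":")
            fields.setdefault(key.strip(), value.strip())  # keep first CPU's values
    return int(fields["siblings"]) != int(fields["cpu cores"])

# Hypothetical excerpt from a 16-core box with HT on (32 logical CPUs):
sample = "processor : 0\nsiblings : 32\ncpu cores : 16\n"
print("HT" if ht_enabled(sample) else "no-HT")  # -> HT
```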
Khalid_Aftas
Contributor

HT is on 🙂

Kaspars_Zibarts
Authority

Yeah, the HT sibling allocation is wrong, I'm afraid. How many instances are set for each VS? You can check with this:

cat $FWDIR/state/local/VSX/local.vsall | grep "vs create vs" | awk '{print "VS-"$4" instances: "$12}'
Khalid_Aftas
Contributor

Here you go

[Expert@Exxxxxxxx:0]# cat $FWDIR/state/local/VSX/local.vsall | grep "vs create vs" | awk '{print "VS-"$4" instances: "$12}'
VS-1 instances: 1
VS-1 instances: 1
VS-3 instances: 12
VS-3 instances: 12
VS-2 instances: 12
VS-2 instances: 12

Kaspars_Zibarts
Authority

I would start with something like this. It's not ideal, as VS-2 is stretched over 2 physical CPUs, but it might help fix the overloaded CPUs.

[image: proposed per-VS core allocation]

Note that cores 16-18 must not be used for FWKs at all! They are the HT siblings of cores 0-2, which are used for SND.

Which two cores are maxing out BTW?

What are the throughput, connections per second and concurrent connections on each VS? You can check that with cpview in the corresponding vsenv.

Kaspars_Zibarts
Authority

Actual commands to achieve this:
fw ctl affinity -s -d -vsid 0 1 -cpu 3 19
fw ctl affinity -s -d -vsid 2 -cpu 4-9 20-25
fw ctl affinity -s -d -vsid 3 -cpu 10-15 26-31
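The sibling math behind these commands can be sketched as follows, assuming (as on this 16-physical-core box) that logical CPU i+16 is the HT sibling of physical core i - check /sys/devices/system/cpu/cpuN/topology/thread_siblings_list on your own box, since enumeration can differ. Each VS gets a physical range plus the matching sibling range, so FWKs never land on the siblings of the SND cores. The helper name is made up for illustration:

```python
# Sketch of the HT-sibling pairing behind the affinity commands above,
# assuming logical CPU i+16 is the sibling of physical core i
# (verify via /sys/devices/system/cpu/cpuN/topology/thread_siblings_list).

N_PHYS = 16  # physical cores; logical CPUs are 0-31

def affinity_cmd(vsids, phys):
    """Build an 'fw ctl affinity' line: physical cores plus their siblings."""
    lo, hi = min(phys), max(phys)
    if lo == hi:
        cpus = f"{lo} {lo + N_PHYS}"
    else:
        cpus = f"{lo}-{hi} {lo + N_PHYS}-{hi + N_PHYS}"
    return f"fw ctl affinity -s -d -vsid {vsids} -cpu {cpus}"

print(affinity_cmd("0 1", [3]))          # -> ... -vsid 0 1 -cpu 3 19
print(affinity_cmd("2", range(4, 10)))   # -> ... -vsid 2 -cpu 4-9 20-25
print(affinity_cmd("3", range(10, 16)))  # -> ... -vsid 3 -cpu 10-15 26-31
```

Keeping each FWK on a physical core together with its own sibling (and away from the SND cores' siblings) is exactly what the proposed split does.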

Khalid_Aftas
Contributor

CPU 16 and 17 are always in RED 🙂
Kaspars_Zibarts
Authority

You must exclude cores 16-18 from the FWK pool. As said before, they are the HT siblings of 0-2 and must not be used for FWKs.
A simple test would be (all VSes sharing the same resources):
fw ctl affinity -s -d -vsid 0-3 -cpu 3-15 19-31
Khalid_Aftas
Contributor

Such a change has an impact on live prod traffic, I guess?

Will plan this.

 

Correct me if I'm wrong - your recommendation is still to dedicate specific cores per VS?

Kaspars_Zibarts
Authority

Correct, I would stick with dedicated cores for VS as it allows easier troubleshooting and better resource protection.
I have done it during the daytime without any impact on our VSX, but the choice is yours of course. I'm not CP and cannot promise anything 🙂
Kaspars_Zibarts
Authority

Plus you can always do these changes on the standby node first, fail over and see what happens, then do the other one.