CoreXL SND
Dear All,
I have a few questions about CoreXL SND:
1. Assume my firewall has two SNDs. When traffic reaches the firewall, how is it decided which SND should process it? Is that also based on the load of the two SNDs?
2. Does the working position of the SND correspond to the 'little i'?
3. Is the load on an SND core generally lower than that on a Firewall Instance? If full load occurs, what are the possible reasons?
4. After R80, the SND allocates work based on the load of the Firewall Instances, so has the Global Dispatcher table been completely abandoned?
Thank you.
How many total cores does the system have (8 or more?), and what blades are enabled?
To show the current mapping use: fw ctl affinity -l -r
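For example, a minimal check sequence (fw ctl multik stat is a suggested companion command here, not something mentioned above; output varies by version):
fw ctl affinity -l -r   # which cores are SNDs vs. Firewall instances
fw ctl multik stat      # per-instance status, core assignment and connection counts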
Performance Tuning Guide
ATRG: CoreXL
https://support.checkpoint.com/results/sk/sk98737
Dynamic Balancing
https://support.checkpoint.com/results/sk/sk164155
Hi Chris,
My firewall has 8 CPU cores.
In this case you will also want to review the multi-queue configuration; use: mq_mng --show
Is the gateway configured for a large MTU? Which version / JHF is used?
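A minimal sketch of those checks (the version/MTU commands are standard Gaia/Linux ones, shown here as a suggestion):
mq_mng --show    # per-interface MQ mode and queue counts
fw ver           # gateway version
cpinfo -y all    # installed hotfix (JHF) takes
ip link show     # interface MTUs (look for e.g. "mtu 9000")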
Hey Chris,
Extremely helpful links! Just curious if you know, and I welcome, as always, any other opinions @GigaYang @Timothy_Hall @AkosBakos
Do you guys have any idea what OTHER represents in the screenshot below? It's my eve-ng lab and I gave it 10 CPU cores when it was created a few months ago, though I never checked this until I saw this post. It's an R81.20 jumbo 92 gateway (this one is not a cluster).
Tx!
Andy
Are you using a trial/eval license? Any chance your security gateway container is only allowing 8 cores?
Also is it being presented to the VM as 10 full cores or 5x2 with SMT?
Hey Tim,
That's right, it's an eval license. And as for your 2nd question, it's 10 cores altogether.
Btw @Timothy_Hall, sorry if this may sound like a dumb question, but would the output below indicate each interface's multi-queue is configured for 8 CPU cores?
[Expert@CP-GW:0]# mq_mng -o
Total 10 cores. Available for MQ 2 cores
i/f driver driver mode state mode (queues) cores
actual/avail
------------------------------------------------------------------------------------------------
eth0 vmxnet3 Kernel Up Auto (8/8) 0,1,0,1,0,1,0,1
eth1 vmxnet3 Kernel Up Auto (8/8) 0,1,0,1,0,1,0,1
eth2 vmxnet3 Kernel Up Auto (8/8) 0,1,0,1,0,1,0,1
[Expert@CP-GW:0]#
Andy
Looks like the Check Point code and MQ are only recognizing/using 8 cores (the max for vmxnet3 is 8 cores/queues anyway). 10 cores may not actually be supported; can you increase the number of cores to 12 or 16? Pretty sure you will see all cores being utilized with those core counts.
Will do it shortly and let you know, great suggestion.
Andy
Just tested it, and I believe it makes sense that it's a license limitation.
Andy
[Expert@CP-GW:0]# mq_mng -o
Total 16 cores. Available for MQ 2 cores
i/f driver driver mode state mode (queues) cores
actual/avail
------------------------------------------------------------------------------------------------
eth0 vmxnet3 Kernel Up Auto (8/8) 0,1,0,1,0,1,0,1
eth1 vmxnet3 Kernel Up Auto (8/8) 0,1,0,1,0,1,0,1
eth2 vmxnet3 Kernel Up Auto (8/8) 0,1,0,1,0,1,0,1
[Expert@CP-GW:0]#
I assigned 16 cores.
Hi @GigaYang
- First: what is the version of your GW? R81.20, which take?
- Is Dynamic Balancing enabled?
https://community.checkpoint.com/t5/General-Topics/R80-40-Dynamic-split-of-CoreXL/m-p/66872#M13714
The task of the SND in my (poor) words: it handles the traffic between the interfaces and the FW workers. (for easier understanding)
Here is a thread about packet flow:
SND
Q2:
the "small" "i" means the outside of the incoming interface. The "big" "I" means the inside ... and so on
Q3:
There are a lot of possible scenarios. When a lot of blades are enabled, there can be a lot of traffic that can't be accelerated.
Here is the SK: https://support.checkpoint.com/results/sk/sk32578
Then there are rulebase issues, where templating gets stopped.
You can check with the fwaccel stat command:
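Something like this (a sketch; the exact output wording varies by version):
fwaccel stat
# check that "Accelerator Status" is on, and look at the
# "Accept Templates" line -- if it says "disabled from rule #X",
# that rule is where templating stopped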
Check these things first, before you move further.
Q4:
What do you mean here? Yes, if Dynamic Balancing is enabled, the GW will do everything for the best performance.
Akos
Hi AkosBakos,
My firewall is on R82 now, upgraded from R81.20.
If you are using R82 (or the latest JHFAs for R81.20) AND you are on a Quantum Lightspeed or Quantum Force 9000/19000/29000 appliance, something called UPPAK will be enabled instead of the traditional KPPAK (fwaccel stat to check). Basically, most of SecureXL executes in user space (usim) instead of as a driver in kernel space (sim) if UPPAK is active.
If UPPAK is enabled the SND cores will always register at 100% CPU utilization regardless of traffic load, at least as reported by the Gaia/Linux tools top/vmstat/etc. This is EXPECTED BEHAVIOR due to the migration from interrupt-based to poll-mode processing in UPPAK mode which leverages something called DPDK. Check Point-based status tools such as the CPU screen of cpview will show you the "real" CPU load on the SND cores based on how much traffic they are actually handling.
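To make the check concrete (a sketch using the commands mentioned above):
fwaccel stat   # reports whether UPPAK or the traditional KPPAK is active
cpview         # the CPU screen shows the "real" SND load under UPPAK,
               # unlike top/vmstat which will sit at ~100% on SND cores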
This new SND CPU behavior in UPPAK mode has caused and will continue to cause confusion going forward; check out the DPDK Wikipedia page for more info. I initially assumed this poll-mode behavior was a quite undesirable "busy wait" approach (which traditionally fell into the realm of sloppy programming and was to be desperately avoided at all costs), but it allows modern systems to scale to much greater capacities while keeping the infamous killer of network performance known as jitter to a minimum.
Hi @Timothy_Hall,
Thanks a lot. We will buy two 9300 gateways in 2025.
Just a couple of nagging "admin" notes.
1. Fixed the title, CoreXL is without a space in the middle.
2. The diagram you are referring to is not 100% accurate, according to R&D. Not a big deal, though; the principles it shows still stand.
"The devil is in the detail"
Point 2: Is there a more accurate diagram somewhere?
Á
You got excellent responses so far. I always refer to the ATRG link @Chris_Atkinson gave. Will say though, in R81.20, I never had a need to modify those settings manually.
Andy
I'll take a crack here, I think you are conflating Multi-Queue with the Dynamic Dispatcher:
1. Assume my firewall has two SNDs. When traffic reaches the firewall, how is it decided which SND should process it? Is that also based on the load of the two SNDs?
Assuming that all interfaces with Multi-Queue (MQ) enabled are set for Auto or Dynamic mode (mq_mng --show to check), the NIC driver (which is what actually implements MQ, it is not technically Check Point code) runs a hash calculation of the L3 src/dst IP addresses and L4 src/dst ports, and assigns that flow's packets to an SND core for handling. The assignment at this level is not load-based to my knowledge, as opposed to something like the Dynamic Dispatcher which assigns connections/flows to a Firewall Worker Instance based on load. I believe the flow's reply packets will also be hashed to that same SND core. You can observe how well the traffic balancing is working in cpview via the Advanced... SecureXL... Network Per CPU screen.
If the traffic balance is way out of whack between the SNDs this is usually because someone has messed with the MQ configuration (see the first sentence of the last paragraph) which you should NEVER DO in R80.40 and later, and this will be especially disastrous if Dynamic Split is enabled. All SNDs are considered to be equal as far as processing power. If the traffic is well-balanced between the SNDs but CPU utilization is way out of whack between the SNDs, generally this is a Check Point code issue with SecureXL. Most of the time TAC will be required to assist here, but I will discuss some undocumented techniques for doing this yourself in my CPX 2025 Vegas speech on the CheckMates track.
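A hedged way to observe this in practice (both tools are mentioned above; screens and output vary by version):
mq_mng --show   # confirm all interfaces are in Auto/Dynamic MQ mode
cpview          # then Advanced -> SecureXL -> Network Per CPU to see
                # how evenly flows are hashed across the SND cores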
2. Does the working position of the SND correspond to the 'little i'?
Traditionally (prior to R80.20) "i" would only indicate the entrance to the slowpath, and packets in the medium/fast path would not traverse it at all. In R80.20+ the first packet of every new connection/session always goes slowpath, and the rest of the connection is then hopefully offloaded to the Medium Path or fastpath.
However, new capture tools like fw monitor -F (available in R80.20+) now show all packets coming into the SND as "i", so I'm not sure what to think now. In the modern releases (R80.20+), I suppose "i" could be interpreted as the point where a packet is handed off from the Gaia Linux OS code (NIC driver & ring buffer) to the Check Point code (SecureXL dispatcher or worker instance code). If someone from R&D could further clarify this, that would be helpful.
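For reference, a sketch of the -F syntax: five comma-separated fields (src, sport, dst, dport, proto), where 0 acts as a wildcard; the address below is made up:
fw monitor -F "10.1.1.100,0,0,0,0"   # everything from 10.1.1.100
fw monitor -F "0,0,0,0,0"            # all traffic (use with care)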
3. Is the load on an SND core generally lower than that on a Firewall Instance? If full load occurs, what are the possible reasons?
This is highly dependent on the distribution of traffic between the fastpath, medium path, and slowpath (fwaccel stats -s). On a firewall with no deep inspection blades enabled (APCL, TP etc.) a high percentage of traffic will be completely processed on the SND cores only (other than the first packet of a new connection/session, which always goes slowpath). However, the bulk of traffic on modern firewalls is examined by deep inspection in the Medium Path, and sometimes the slowpath, on a Firewall Worker Instance. So the inspection operations are much more intensive on a worker instance compared to an SND, which is why there generally tend to be more worker instances than SND instances on most firewalls, unless the percentage of fastpath traffic is extremely high.
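A quick sketch of reading that distribution (field names differ slightly across versions):
fwaccel stats -s
# "Accelerated pkts" -> fastpath, handled entirely on the SNDs
# "PXL pkts"         -> Medium Path, inspected on worker instances
# "F2F pkts"         -> slowpath, fully processed by a worker instance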
When you say "full load" I assume you mean either just the SNDs are saturated or only the Worker Instances are saturated. Dynamic Split can help with this if there is enough spare CPU capacity overall. The most common cause of high load on worker instances is excessive slowpath/F2F traffic. The most common cause of high load on SNDs is a very high amount of fastpath traffic, or possibly an MQ or Check Point SecureXL code issue.
4. After R80, the SND allocates work based on the load of the Firewall Instances, so has the Global Dispatcher table been completely abandoned?
As far as the SND that runs the Dynamic Dispatcher is concerned, all Worker Instances are equal in overall capability, unless the server architecture has Intel's P-cores and E-cores present, which is a whole other can of worms. But for the most part, when the first packet of a new connection/session arrives at an SND, it assigns the connection and all its subsequent packets to the least-loaded Worker Instance. This assignment is tracked in the SNDs by what I believe you are calling the "Global Dispatcher Table", which can be viewed with fw ctl multik gconn; this is necessary because all the packets of a single connection must always be handled by the same Worker Instance, even with Hyperflow.
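As a sketch, the two commands that expose this bookkeeping (output format is version dependent):
fw ctl multik gconn   # global connections table: which worker instance
                      # owns each connection
fw ctl multik stat    # per-instance load the Dynamic Dispatcher balances against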
Amazing answer, THX @Timothy_Hall
Indeed! I always think Tim's book is superb, which it is, but then sometimes I see even better explanations here 🙂
Yes. The book is great, I have bought it.
When traffic goes via the Accelerated Path, why does the SND core load increase?
If it's not the fw_worker that is loaded, the load must arise somewhere. That's the SND 🙂
You said it, that's so true.
@GigaYang I think the link below would also be helpful.
Andy
I believe the little "i" still has the same meaning as before: a trap before the fw VM.
Hey Giga,
Just something I noticed, though this sort of made sense to me already 🙂
Andy
CP-GW> set dynamic-balancing state enable
Dynamic Balancing is not supported on open server appliances
CP-GW>
Yep, but you can force Dynamic Split active on a system that does not support it for lab purposes. Here is the secret expert mode command we use in my Gateway Performance Optimization course (which is implemented in VMware); a reboot will be required:
dynamic_split -o enable_automation
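After the reboot, a hedged way to confirm the state from clish (per sk164155; the command form mirrors the set command shown earlier):
show dynamic-balancing state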
Let's see if it works, will report soon.
Andy
Guess eve-ng is special lol
Andy
[Expert@CP-GW:0]# dynamic_split -o enable_automation
Dynamic Balancing is not supported on open server appliances
[Expert@CP-GW:0]#
