Dynamic dispatcher issue with R80.30
Hi again,
I already posted a related question here: https://community.checkpoint.com/t5/Next-Generation-Firewall/Investigating-CPU-core-consumption-in-R...
I did some analysis and have now realized that the CoreXL connection distribution no longer seems to work properly:
[Expert@FW:0]# fw ctl multik stat
ID | Active | CPU | Connections | Peak
----------------------------------------------
0 | Yes | 15 | 78152 | 89916
1 | Yes | 7 | 64497 | 68297
2 | Yes | 14 | 57302 | 64495
3 | Yes | 6 | 36117 | 50217
4 | Yes | 13 | 13611 | 35532
5 | Yes | 5 | 3130 | 29384
6 | Yes | 12 | 767 | 26404
7 | Yes | 4 | 548 | 25565
8 | Yes | 11 | 618 | 25357
9 | Yes | 3 | 420 | 25236
10 | Yes | 10 | 503 | 27162
These are all CoreXL CPUs, yet they carry very different numbers of connections.
[Expert@FW:0]# fw ctl multik dynamic_dispatching get_mode
Current mode is On
[Expert@FW:0]# fw ctl affinity -l -r
CPU 0:
CPU 1:
CPU 2:
CPU 3: fw_9
mpdaemon fwd pdpd lpd pepd dtpsd in.acapd dtlsd in.asessiond rtmd vpnd cprid cpd
CPU 4: fw_7
mpdaemon fwd pdpd lpd pepd dtpsd in.acapd dtlsd in.asessiond rtmd vpnd cprid cpd
CPU 5: fw_5
mpdaemon fwd pdpd lpd pepd dtpsd in.acapd dtlsd in.asessiond rtmd vpnd cprid cpd
CPU 6: fw_3
mpdaemon fwd pdpd lpd pepd dtpsd in.acapd dtlsd in.asessiond rtmd vpnd cprid cpd
CPU 7: fw_1
mpdaemon fwd pdpd lpd pepd dtpsd in.acapd dtlsd in.asessiond rtmd vpnd cprid cpd
CPU 8:
CPU 9:
CPU 10: fw_10
mpdaemon fwd pdpd lpd pepd dtpsd in.acapd dtlsd in.asessiond rtmd vpnd cprid cpd
CPU 11: fw_8
mpdaemon fwd pdpd lpd pepd dtpsd in.acapd dtlsd in.asessiond rtmd vpnd cprid cpd
CPU 12: fw_6
mpdaemon fwd pdpd lpd pepd dtpsd in.acapd dtlsd in.asessiond rtmd vpnd cprid cpd
CPU 13: fw_4
mpdaemon fwd pdpd lpd pepd dtpsd in.acapd dtlsd in.asessiond rtmd vpnd cprid cpd
CPU 14: fw_2
mpdaemon fwd pdpd lpd pepd dtpsd in.acapd dtlsd in.asessiond rtmd vpnd cprid cpd
CPU 15: fw_0
mpdaemon fwd pdpd lpd pepd dtpsd in.acapd dtlsd in.asessiond rtmd vpnd cprid cpd
Any ideas?
Regards, Thomas
Accepted Solutions
Hi @TomShanti
Rather than statically assigning new connections to a CoreXL FW instance based on the packet's IP addresses and IP protocol (a static hash function), the dynamic assignment mechanism is based on the utilization of the CPU cores on which the CoreXL FW instances are running.
The dynamic decision is made on the first packet of each connection: every CoreXL FW instance is assigned a rank, and the instance with the lowest rank is selected.
The rank of each CoreXL FW instance is calculated from the CPU utilization of its core (evaluated only for the first packet).
The higher the CPU utilization, the higher the instance's rank, and the less likely it is to be selected by the CoreXL SND.
The CoreXL Dynamic Dispatcher therefore allows better load distribution and helps mitigate connectivity issues during traffic peaks, because connections opened at a high rate that a static decision would have assigned to the same CoreXL FW instance are now spread across several instances.
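If you want to see the load this ranking is based on, it helps to watch per-core CPU utilization next to the connection counts. A minimal check with standard tools (the exact cpview menu names vary by version):
[Expert@FW:0]# fw ctl multik stat   # connections per CoreXL FW instance
[Expert@FW:0]# cpview               # the CPU view shows utilization per core
[Expert@FW:0]# top                  # press 1 to expand the per-core CPU lines
Compare the utilization of the cores hosting fw instances (CPU 3-7 and 10-15 in your output) with their connection counts; an instance on a busy core will legitimately receive fewer new connections.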
The following points can lead to an asymmetrical distribution:
- Elephant flows that drive up the CPU utilization of a single core
- Other FW processes that increase the CPU usage of a core.
In your example these are the processes:
mpdaemon fwd pdpd lpd pepd dtpsd in.acapd dtlsd in.asessiond rtmd vpnd cprid cpd
I suspect in particular the Identity Awareness processes:
pdpd, pepd
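A quick way to check whether those daemons are really the ones loading the fw instance cores is to look at their CPU usage and the core they last ran on (PSR column). A small sketch using standard Linux tools available in Expert mode:
[Expert@FW:0]# ps -eo pid,psr,pcpu,comm | grep -E 'pdpd|pepd|vpnd|fwd'
[Expert@FW:0]# ps -eo pid,psr,pcpu,comm | sort -k3 -rn | head -15
If pdpd/pepd show sustained CPU on the cores that also host fw instances (CPU 3-7 and 10-15 above), moving their affinity or reducing their load would be the next thing to look at.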
We have the same problem with the asymmetric distribution.
How can I recognize elephant flows?
If you are using a kernel-mode firewall, the command fw ctl multik print_heavy_conn will show all elephant flows (aka "heavy" connections) detected over the last 24 hours.
If using USFW or VSX, you'll need to use the CPMonitor and connstat tools.
See here for further reading: sk164215: How to Detect and Handle Heavy Connections
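For reference, on a kernel-mode gateway the check is a single command (it lists the heavy connections detected over the last 24 hours; output omitted here, the format is described in sk164215):
[Expert@FW:0]# fw ctl multik print_heavy_conn
On USFW or VSX, the connstat tool from sk164215 fills the same role.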
Interestingly, the phenomenon vanished when we switched cluster members.
Regards, Thomas
