Investigating CPU core consumption in R80.30 kernel 3.10 UMFW
Hi all,
I am stuck and need your help. Since the beginning of July, CPU consumption on our R80.30 gateway (kernel 3.10, user-mode firewall) has jumped significantly. CPU cores now stay busy through the night, which was not the case before.
The cores you can see in bright blue and purple are CoreXL CPUs.
However, I have no clue how to find out which clients are consuming these resources.
None of the methods that worked in older releases seem to work in R80.30 UMFW.
Any ideas?
Thanks and regards
Tom
Accepted Solutions
My guess is that fwk0_dev_0 acts as the liaison between the multiple fwk firewall worker threads (the fw instance threads that handle packet processing), the single fwmod kernel driver instance, and the high-priority cluster thread.
In UMFW the fw instances are threads of fwk0_dev_0, so by default top shows the CPU utilization of all these threads summed under the main process. top can also display the utilization per thread (for example with the -H option, or by pressing H in interactive mode).
A small sample calculation for the utilization of process fwk0_dev_0:
$$\text{fwk0\_dev\_0} = \sum_{x=0}^{N} \text{fwk0\_}x + \sum_{x=0}^{N} \text{fwk0\_dev\_}x + \text{fwk0\_kissd} + \text{fwk0\_hp}, \qquad N = \text{max\_CoreXL\_number}$$
Threads of process fwk0_dev_0:
- fwk0_X -> fw instance thread that handles packet processing
- fwk0_dev_X -> thread that handles communication between the fw instances and other CP daemons
- fwk0_kissd -> legacy Kernel Infrastructure (obsolete)
- fwk0_hp -> high-priority cluster thread
Read more here:
Performance Tuning Tip – User Mode Firewall vs. Kernel Mode Firewall
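The per-thread breakdown described above can be checked from a per-thread top listing. The following is only a minimal sketch: it assumes procps top run in batch mode with threads shown (top -H -b -n 1) and its default column layout, with %CPU in field 9 and the thread name in field 12. The function name fwk_thread_cpu is made up for this illustration.

```shell
# Sum %CPU of all fwk0_* threads from per-thread top output read on stdin.
# Assumption: default procps column order, i.e. %CPU is field 9 and
# COMMAND (the thread name) is field 12. fwk_thread_cpu is a hypothetical helper.
fwk_thread_cpu() {
  awk '
    $12 ~ /^fwk0_/ {
      printf "%-12s %6.1f\n", $12, $9   # one line per fwk0_* thread
      total += $9
    }
    END { printf "%-12s %6.1f\n", "TOTAL", total }
  '
}
```

On a gateway it could be fed with `top -H -b -n 1 | fwk_thread_cpu`. The TOTAL line should then be close to what plain top reports for the fwk0_dev_0 process, although the two samples are not taken at exactly the same moment.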
Are you trying to see the connections that run on each core?
Use "fw ctl affinity -l -r" to see whether a core is running SND or a FW instance.
fwaccel conns --> prints the connections handled by SecureXL, including the CPU ID
fw -i <instance number> tab -t connections -u --> prints the FW connections of the requested instance
We are working on adding a top-connections view to cpview for USFW, which will also help in this case.
I will post an update once it is integrated into the Jumbo.
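To get from the per-instance connection dump to "who is it, in terms of IPs", the raw table output can be aggregated by source address. This is a rough sketch, not a Check Point tool: it assumes the raw output of fw tab -t connections -u prints one entry per line in the form `<direction, src_ip, src_port, dst_ip, dst_port, proto; ...>` with all fields in hex, and the helper name top_sources is made up for this example.

```shell
# Count table entries per source IP from raw connections-table output on stdin.
# Assumption: each entry line looks like
#   <direction, src_ip, src_port, dst_ip, dst_port, proto; ...>   (hex fields)
# top_sources is a hypothetical helper; $1 is the number of rows to show (default 10).
top_sources() {
  awk '
    function hex2dec(h,    i, n) {
      h = tolower(h); n = 0
      for (i = 1; i <= length(h); i++)
        n = n * 16 + index("0123456789abcdef", substr(h, i, 1)) - 1
      return n
    }
    /^</ && NF >= 6 {
      src = $2; sub(/,$/, "", src)      # second field: source IP in hex
      if (length(src) != 8) next        # skip anything that is not a 32-bit address
      a = hex2dec(substr(src, 1, 2)); b = hex2dec(substr(src, 3, 2))
      c = hex2dec(substr(src, 5, 2)); d = hex2dec(substr(src, 7, 2))
      count[a "." b "." c "." d]++
    }
    END { for (ip in count) print count[ip], ip }
  ' | sort -k1,1nr -k2,2 | head -n "${1:-10}"
}
```

For a single instance this could be used as `fw -i 0 tab -t connections -u | top_sources 10`. The table may hold more than one entry per connection, so the counts are better read as a relative ranking than as exact connection numbers.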
It is a regular GW and the issue appeared out of the blue.
The fwaccel stats output looks the same as before.
Regards Tom
And have you confirmed with the top command that it is actually the fwk workers causing the high load? Which blades are you running?
Yes, they are. The blades are FW and IPS; "ips off" has no impact on the CPU core load.
Regards Thomas
Hi Heiko,
The question now is how to get a list of the connections that a given core is currently handling ...
Regards Tom
I am tagging @shais, who might have an idea what the problem is.
You can use the command "fw ctl multik stat" to print the connection distribution.
Note: to check this for IPv6, use "fw6 ctl multik stat".
Hi,
sadly this is not enough to find out what exactly is occupying the cores.
It gives me this:
[Expert@FW:0]# fw ctl multik stat
ID | Active | CPU | Connections | Peak
----------------------------------------------
0 | Yes | 15 | 61802 | 89916
1 | Yes | 7 | 50544 | 68297
2 | Yes | 14 | 50979 | 64495
3 | Yes | 6 | 35115 | 48966
4 | Yes | 13 | 17028 | 35532
5 | Yes | 5 | 3343 | 29384
6 | Yes | 12 | 770 | 26404
7 | Yes | 4 | 438 | 25565
8 | Yes | 11 | 589 | 25357
9 | Yes | 3 | 505 | 25236
10 | Yes | 10 | 452 | 27162
[Expert@FW:0]# fw ctl affinity -l -r
CPU 0:
CPU 1:
CPU 2:
CPU 3: fw_9
mpdaemon fwd pdpd lpd pepd dtpsd in.acapd dtlsd in.asessiond rtmd vpnd cprid cpd
CPU 4: fw_7
mpdaemon fwd pdpd lpd pepd dtpsd in.acapd dtlsd in.asessiond rtmd vpnd cprid cpd
CPU 5: fw_5
mpdaemon fwd pdpd lpd pepd dtpsd in.acapd dtlsd in.asessiond rtmd vpnd cprid cpd
CPU 6: fw_3
mpdaemon fwd pdpd lpd pepd dtpsd in.acapd dtlsd in.asessiond rtmd vpnd cprid cpd
CPU 7: fw_1
mpdaemon fwd pdpd lpd pepd dtpsd in.acapd dtlsd in.asessiond rtmd vpnd cprid cpd
CPU 8:
CPU 9:
CPU 10: fw_10
mpdaemon fwd pdpd lpd pepd dtpsd in.acapd dtlsd in.asessiond rtmd vpnd cprid cpd
CPU 11: fw_8
mpdaemon fwd pdpd lpd pepd dtpsd in.acapd dtlsd in.asessiond rtmd vpnd cprid cpd
CPU 12: fw_6
mpdaemon fwd pdpd lpd pepd dtpsd in.acapd dtlsd in.asessiond rtmd vpnd cprid cpd
CPU 13: fw_4
mpdaemon fwd pdpd lpd pepd dtpsd in.acapd dtlsd in.asessiond rtmd vpnd cprid cpd
CPU 14: fw_2
mpdaemon fwd pdpd lpd pepd dtpsd in.acapd dtlsd in.asessiond rtmd vpnd cprid cpd
CPU 15: fw_0
mpdaemon fwd pdpd lpd pepd dtpsd in.acapd dtlsd in.asessiond rtmd vpnd cprid cpd
All:
Interface eth4: has multi queue enabled
Interface eth5: has multi queue enabled
Interface eth6: has multi queue enabled
Interface eth7: has multi queue enabled
Interface eth10: has multi queue enabled
Interface eth11: has multi queue enabled
Interface eth12: has multi queue enabled
Interface eth13: has multi queue enabled
Interface eth8: has multi queue enabled
Interface eth9: has multi queue enabled
So I can see that the CPUs are probably loaded because of a large number of connections. But I want to know who is causing them (in terms of IPs) ...
Regards Thomas
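The fw ctl multik stat table above can be condensed into a per-instance share of the current connections, which makes the skew easier to see. This is a small sketch that parses only the column layout shown in this post (ID | Active | CPU | Connections | Peak); the function name multik_share is made up for illustration.

```shell
# Summarise "fw ctl multik stat" output read from stdin.
# Assumption: columns are ID | Active | CPU | Connections | Peak, as posted above.
# multik_share is a hypothetical helper name.
multik_share() {
  awk -F'|' '
    # Data rows start with a numeric instance ID; header, separator and prompt lines are skipped.
    $1 ~ /^[ \t]*[0-9]+[ \t]*$/ {
      n++
      id[n] = $1 + 0; cpu[n] = $3 + 0; conns[n] = $4 + 0
      total += conns[n]
    }
    END {
      for (i = 1; i <= n; i++) {
        share = (total > 0) ? 100 * conns[i] / total : 0
        printf "fw_%d cpu=%d conns=%d share=%.1f%%\n", id[i], cpu[i], conns[i], share
      }
    }
  '
}
```

Run as `fw ctl multik stat | multik_share`. For the figures posted above, instances 0 to 3 together hold roughly 90% of the 221565 current connections, while instances 6 to 10 hold well under 1% each.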
The traditional command to see connection distribution by firewall worker instance is fw ctl multik gconn, but it does not currently work if USFW is enabled. This limitation is rectified in the latest R80.40 Jumbo HFAs.
If that command doesn't work for you, connstat must be used instead: sk85780: How to use the 'connstat' utility
Hi Timothy,
thanks for the connstat hint.
I did an investigation today and it at least gives a clue where to start.
I will try to run it at night to identify the hosts causing the permanent load ...
Troubleshooting on R80.30 UMFW is very hard, as there seems to be no way to get any insight into what is consuming a core ...
Regards Thomas
Thanks, can you help me figure out why this command does not work?
[Expert@FW:0]# fw -i 0 -t connections -u
Unknown command "-t"
Usage:
fw ver [-h] ... # Display version
fw kill [-sig_no] procname # Send signal to a daemon
fw putkey ... # Client server keys
fw sam ... # Control sam server
fw sam_policy ... # SAM policy editor
fw fetch targets # Fetch last policy
fw amw fetch # Fetch Anti-Bot & Anti-Virus policy
fw tab [-h] ... # Kernel tables content
fw showuptables [-h] ... # Formatted Unified Policy kernel tables content
fw monitor [-h] ... # Monitor VPN-1/FW-1 traffic
fw ctl [args] # Control kernel
fw lichosts # Display protected hosts
fw log [-h] ... # Display logs
fw logswitch [-h target] [+|-][oldlog] # Create a new log file;
# the old log is moved
fw repairlog ... # Log index recreation
fw mergefiles ... # log files merger
fw lslogs ... # Remote machine log file list
fw fetchlogs ... # Fetch logs from a remote host
fw up_execute [args] # Offline rulebase execution
fw gtp ... # GTP commands
fw balance ... # Logical servers
Regards Thomas
Thanks, the command is now working.
Is the instance ID the same as the ID in the "fw ctl multik stat" output?
Thanks Thomas
The ID stands for the instance number.
fw -i 0 tab -t connections -u
Hi @TomShanti,
to your question:
CUT>>>
the question now is how to get a list of the connections a certain core currently handles ...
<<<CUT
You can only see the total number of connections per core:
# fw ctl multik stat
With UMFW you can't list the individual connections per core, because the assignment of a session to a core is not stored in the connection table.
Since R80.10 this assignment is controlled by the CoreXL Dynamic Dispatcher.
Rather than statically assigning new connections to a CoreXL FW instance based on packet's IP addresses and IP protocol (static hash function), the new dynamic assignment mechanism is based on the utilization of CPU cores, on which the CoreXL FW instances are running.
The dynamic decision is made for first packets of connections, by assigning each of the CoreXL FW instances a rank, and selecting the CoreXL FW instance with the lowest rank.
The rank for each CoreXL FW instance is calculated according to its CPU utilization.
The higher the CPU utilization, the higher the CoreXL FW instance's rank is, hence this CoreXL FW instance is less likely to be selected by the CoreXL SND.
The CoreXL Dynamic Dispatcher allows for better load distribution and helps mitigate connectivity issues during traffic "peaks", as connections opened at a high rate that would have been assigned to the same CoreXL FW instance by a static decision, will now be distributed to several CoreXL FW instances.
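The selection step described above can be illustrated with a toy example. This is only a sketch of the idea "pick the instance with the lowest rank": it assumes the rank is simply the instance's CPU utilization in percent, because the real rank formula is not given in this thread, and the function name pick_instance and the sample figures in the usage note are made up.

```shell
# Pick the CoreXL FW instance with the lowest rank.
# Input on stdin: "<instance> <cpu_utilization_percent>", one instance per line.
# Assumption: rank == CPU utilization; the real rank calculation is not shown here.
# pick_instance is a hypothetical helper, not a Check Point command.
pick_instance() {
  awk '
    # Keep the first instance seen with the strictly lowest rank.
    NR == 1 || $2 + 0 < best_rank { best = $1; best_rank = $2 + 0 }
    END { if (NR > 0) print best }
  '
}
```

For example, `printf 'fw_0 85\nfw_1 40\nfw_2 12\n' | pick_instance` selects fw_2, the least loaded instance. A static hash on IP addresses and protocol would instead keep sending a given address pair to the same instance regardless of its load.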
Hi @TomShanti
Did you ever find the root cause of the spikes?
