net-harry
Contributor

High CPU utilization and affinity

Hi,

In one of our R80.20 firewalls, the CPU utilization on one core is very high while the other cores are almost idle. This started after the installation of Jumbo Hotfix Accumulator (JHF) Take 183 on the firewall cluster.

While perhaps not related to the upgrade itself, the CPU affinity is now in an undesirable state as eth6 and eth7 are 10 Gbps interfaces while eth0, eth2 and eth3 are 1 Gbps interfaces.

firewall> fw ctl affinity -l -r
CPU 0: eth6 eth7 eth0
CPU 1: eth2 eth3
CPU 2: fw_5
lpd rtmd fwd wsdnsd mpdaemon in.asessiond cpd cprid
CPU 3: fw_4
lpd rtmd fwd wsdnsd mpdaemon in.asessiond cpd cprid
CPU 4: fw_3
lpd rtmd fwd wsdnsd mpdaemon in.asessiond cpd cprid
CPU 5: fw_2
lpd rtmd fwd wsdnsd mpdaemon in.asessiond cpd cprid
CPU 6: fw_1
lpd rtmd fwd wsdnsd mpdaemon in.asessiond cpd cprid
CPU 7: fw_0
lpd rtmd fwd wsdnsd mpdaemon in.asessiond cpd cprid
CPU 8:
CPU 9:
CPU 10:
CPU 11:
All:
The current license permits the use of CPUs 0, 1, 2, 3, 4, 5, 6, 7 only.


We previously used the following procedure to change affinity and improve utilization across cores:

  1. Use “cpconfig” to change CoreXL to 3 cores for SND/IRQ and 5 cores for Firewall workers (previously 2 cores were used for SND/IRQ and 6 for Firewall workers)
  2. Use “sim affinity -s” to allocate one SND/IRQ core to each 10 Gbps interface and a separate core to the other interfaces (eth6 to CPU 1, eth7 to CPU 2, and all other interfaces to CPU 0)
  3. Use “taskset_us_all” to assign user-space processes to the Firewall worker cores only: taskset_us_all -l 3-7
  4. Update /etc/rc.local with the “taskset_us_all” command so it survives a reboot
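For reference, the steps above map roughly to the following commands on the gateway. This is a sketch only: the CoreXL split and the sim affinity bindings are set interactively, and the interface-to-CPU assignments shown are the ones from this plan, not defaults.

```shell
# Step 1: change the CoreXL split (menu-driven)
cpconfig        # -> "Configure Check Point CoreXL" -> 3 SND/IRQ + 5 workers

# Step 2: statically pin interface SoftIRQ processing to SND cores
sim affinity -s
# eth6 -> CPU 1, eth7 -> CPU 2, all other interfaces -> CPU 0

# Step 3: keep user-space daemons off the SND cores
taskset_us_all -l 3-7

# Step 4: persist the taskset change across reboots
echo 'taskset_us_all -l 3-7' >> /etc/rc.local
```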


I have a few questions that I hope you could answer:

  1. Does this procedure look correct, or should something be changed? For example, is it correct to allocate one SND/IRQ core to each 10 Gbps interface and another to the 1 Gbps interfaces?
  2. The server has 12 cores, but we only have a license for 8. If I understand correctly, it is recommended to modify the BIOS so that only 8 cores are visible to the OS, but what are the advantages compared to just specifying 8 cores in CoreXL?


Thanks for your help!

Best regards,

Harry

7 Replies
PhoneBoy
Admin

To me, that looks right.
You might consider licensing those additional cores so you can leverage all the cores (and potentially use multiqueue).

Timothy_Hall
Champion

You didn't say which kind of core was saturated (SND vs. Firewall Worker) after the JHF application, but I am assuming it is an SND core, due to a fix included in your Jumbo take that makes much more traffic eligible for full acceleration by SecureXL in some circumstances: sk166700: High CPU after upgrade from R77.x to R80.x when running only Firewall and Monitoring blade...   This is a great problem to have.  🙂

Generally you should avoid static CPU allocations for SoftIRQ via sim affinity wherever possible, and instead enable Multi-Queue on your 10 Gbps interfaces; one CPU core (even if dedicated to only one 10 Gbps interface) will start getting saturated around 4-5 Gbps and begin losing frames (RX-DRP, as shown by netstat -ni).  What I would recommend:

1) If more than 70% of your traffic is fully-accelerated (Accelerated pkts [not conns] shown by fwaccel stats -s) configure a 4/4 split with cpconfig.  Otherwise your 3/5 split should be fine for now.

2) Run sim affinity -a to set all interface affinities back to auto mode.  You may need to reboot after doing this, can't remember.

3) Enable Multi-Queue on your 10Gbps interfaces; Multi-Queue can only be active on a maximum of 5 physical interfaces in your kernel version.  You will most definitely need to reboot after making this change.

4) After reboot all SND/IRQ cores will be able to service the 10Gbps interfaces, thus spreading the load out more evenly among them and hopefully avoiding excessive RX-DRP frame loss.
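The four steps above would look something like this at the CLI. A sketch only: it assumes the cpmq tool that manages Multi-Queue on R80.20, and eth6/eth7 as the 10 Gbps interfaces from the original post.

```shell
# 1) Check the fully-accelerated packet ratio before deciding on the split
fwaccel stats -s        # compare "Accelerated pkts" against total packets

# 2) Return interface affinity to automatic mode
sim affinity -a

# 3) Enable Multi-Queue on the 10 Gbps interfaces (interactive; max 5 interfaces)
cpmq set                # select eth6 and eth7
cpmq get -a             # verify the Multi-Queue configuration
reboot

# 4) After reboot, confirm the drops are gone
netstat -ni             # watch the RX-DRP column for eth6/eth7
```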

As far as having 12 cores but only being licensed for 8: I have seen some strange effects when there is this kind of mismatch, but based on your command outputs I think your firewall is handling the situation fine.  Your taskset core allocations as you have them configured are OK; be sure to update them if you change the number of SND cores, to prevent wayward processes from grabbing CPU time on the SND cores and trashing their CPU fast cache.
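One detail worth noting on the taskset point: a core list like 3-7 is equivalent to the hex affinity bitmask 0xf8 (bits 3 through 7 set), which some affinity tools expect instead of a list. A quick way to convert a range to a mask in plain POSIX shell (nothing Check Point-specific here):

```shell
# Compute the hex CPU-affinity mask for a contiguous core range,
# e.g. cores 3-7 -> bits 3..7 set -> 0xf8.
lo=3
hi=7
mask=0
for cpu in $(seq "$lo" "$hi"); do
  mask=$(( mask | (1 << cpu) ))
done
printf '0x%x\n' "$mask"   # prints 0xf8 for cores 3-7
```

Remember to recompute the range (and the mask, if you use one) whenever the SND/worker split changes.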

"Max Capture: Know Your Packets" Video Series
now available at http://www.maxpowerfirewalls.com
net-harry
Contributor

Thank you very much @Timothy_Hall for your suggestions!

In the end I went with my initial plan using sim affinity, since we are (unfortunately) running the tg3 and be2net drivers, which from what I understand do not support Multi-Queue.
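For anyone checking their own hardware before planning Multi-Queue: ethtool reports which driver a NIC uses (the interface name below is just an example).

```shell
# Show the kernel driver behind an interface; tg3 and be2net generally
# lack Multi-Queue support, while drivers like igb/ixgbe have it.
ethtool -i eth6 | grep '^driver'
```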

Will try to ensure that we use better NICs when we refresh the hardware on the open servers.

Best regards,

Harry

Timothy_Hall
Champion

tg3 and be2net in use on your firewall?  My condolences...

the_rock
Advisor

"tg3 and be2net in use on your firewall?  My condolences..." That made me laugh, LOL

the_rock
Advisor

I will share a very simple way I have fixed this with customers a few times (can't guarantee it will work):

cpconfig -> disable CoreXL -> reboot -> cpconfig -> re-enable CoreXL -> reboot again -> check (make sure you do it on both firewalls if it's a cluster)

I never figured out why this works, but I guess, like with anything else, it probably "resets" the CoreXL config and starts fresh.
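Spelled out step by step, the reset procedure above would be (a sketch; cpconfig is menu-driven, and on a cluster you would do this one member at a time, standby first):

```shell
cpconfig                 # -> "Configure Check Point CoreXL" -> Disable
reboot
cpconfig                 # -> "Configure Check Point CoreXL" -> Enable (same split as before)
reboot
fw ctl affinity -l -r    # verify the resulting affinity map
```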

 

Let us know if you tried that.

 

Andy

the_rock
Advisor

However, if what I suggested above fails, I would try the article below to debug it and send the output to TAC:

 

sk43443
