
R80.20 VSX Performance Optimization Questions - MultiQ, CoreXL


Hello,

I have seen that one of my cores is nearly always at 100% si. It serves one of our 10GE interfaces, the one that leads to our internet firewall.

So I tried to optimize this using MultiQueue and CoreXL affinity.

I still see 100% si on one of the cores. Did I do something wrong, or how can I assign more cores to MultiQueue?

Maybe somebody can give me a clue.

Thanks in advance,

 

Jan

 

 

Here is my configuration:

Check Point 23800 in VSLS mode (at the moment all virtual systems are running on one member)

VS5 is the biggest virtual system (10 CoreXL instances) and VS4 is another one (4 CoreXL instances). The other VSs are Virtual Switches. Some (2 or 3) of them will become unnecessary, but I didn't have enough time in the last maintenance window to remove them.

 

Mgmt: CPU 0
Sync: CPU 24
eth2-01: CPU 2
eth2-02: CPU 26
eth4-01: CPU 3
eth4-04: CPU 27
eth1-01: CPU 1
eth1-04: CPU 25
VS_0: CPU 5 29
VS_0 fwk: CPU 5 29
VS_1: CPU 15 16 17 18 39 40 41 42
VS_1 fwk: CPU 15 16 17 18 39 40 41 42
VS_2: CPU 15 16 17 18 39 40 41 42
VS_2 fwk: CPU 15 16 17 18 39 40 41 42
VS_3: CPU 15 16 17 18 39 40 41 42
VS_3 fwk: CPU 15 16 17 18 39 40 41 42
VS_3 pepd: CPU 23 47
VS_3 pdpd: CPU 23 47
VS_4: CPU 12 13 14 36 37 38
VS_4 fwk: CPU 12 13 14 36 37 38
VS_5: CPU 6 7 8 9 10 11 30 31 32 33 34 35
VS_5 fwk: CPU 6 7 8 9 10 11 30 31 32 33 34 35
VS_5 pepd: CPU 23 47
VS_5 pdpd: CPU 23 47
VS_9: CPU 15 16 17 18 39 40 41 42
VS_9 fwk: CPU 15 16 17 18 39 40 41 42
VS_17: CPU 15 16 17 18 39 40 41 42
VS_17 fwk: CPU 15 16 17 18 39 40 41 42
VS_18: CPU 15 16 17 18 39 40 41 42
VS_18 fwk: CPU 15 16 17 18 39 40 41 42
VS_19: CPU 15 16 17 18 39 40 41 42
VS_19 fwk: CPU 15 16 17 18 39 40 41 42
VS_20: CPU 15 16 17 18 39 40 41 42
VS_20 fwk: CPU 15 16 17 18 39 40 41 42
VS_23: CPU 15 16 17 18 39 40 41 42
VS_23 fwk: CPU 15 16 17 18 39 40 41 42
Interface eth4-02: has multi queue enabled
Interface eth4-03: has multi queue enabled
Interface eth1-02: has multi queue enabled
Interface eth1-03: has multi queue enabled

 

 

cpmq get -v

Active ixgbe interfaces:
eth1-01 [Off]
eth1-02 [On]
eth1-03 [On]
eth1-04 [Off]
eth4-01 [Off]
eth4-02 [On]
eth4-03 [On]
eth4-04 [Off]

Active igb interfaces:
Mgmt [Off]
Sync [Off]
eth2-01 [Off]
eth2-02 [Off]

The rx_num for ixgbe is: 4

multi-queue affinity for ixgbe interfaces:
CPU | TX | Vector | RX Bytes
-------------------------------------------------------------
0 | 0 | eth4-02-TxRx-0 (92) | 13051584193
| | eth4-03-TxRx-0 (164) |
| | eth1-02-TxRx-0 (85) |
| | eth1-03-TxRx-0 (125) |
1 | 2 | eth4-02-TxRx-2 (108) | 3464794311
| | eth4-03-TxRx-2 (180) |
| | eth1-02-TxRx-2 (101) |
| | eth1-03-TxRx-2 (141) |
2 | 4 | |
3 | 6 | |
4 | 8 | |
5 | 10 | |
6 | 12 | |
7 | 14 | |
8 | 16 | |
9 | 18 | |
10 | 20 | |
11 | 22 | |
12 | 24 | |
13 | 26 | |
14 | 28 | |
15 | 30 | |
16 | 32 | |
17 | 34 | |
18 | 36 | |
19 | 38 | |
20 | 40 | |
21 | 42 | |
22 | 44 | |
23 | 46 | |
24 | 1 | eth4-02-TxRx-1 (100) | 4631540381
| | eth4-03-TxRx-1 (172) |
| | eth1-02-TxRx-1 (93) |
| | eth1-03-TxRx-1 (133) |
25 | 3 | eth4-02-TxRx-3 (116) | 8284222848
| | eth4-03-TxRx-3 (188) |
| | eth1-02-TxRx-3 (109) |
| | eth1-03-TxRx-3 (149) |
26 | 5 | |
27 | 7 | |
28 | 9 | |
29 | 11 | |
30 | 13 | |
31 | 15 | |
32 | 17 | |
33 | 19 | |
34 | 21 | |
35 | 23 | |
36 | 25 | |
37 | 27 | |
38 | 29 | |
39 | 31 | |
40 | 33 | |
41 | 35 | |
42 | 37 | |
43 | 39 | |
44 | 41 | |
45 | 43 | |
46 | 45 | |
47 | 47 | |

top - 11:15:31 up 24 min, 1 user, load average: 2.85, 3.07, 2.74
Tasks: 663 total, 5 running, 658 sleeping, 0 stopped, 0 zombie
Cpu0 : 0.0%us, 0.3%sy, 0.0%ni, 3.6%id, 0.0%wa, 0.7%hi, 95.4%si, 0.0%st
Cpu1 : 0.0%us, 0.3%sy, 0.0%ni, 86.5%id, 0.0%wa, 1.0%hi, 12.2%si, 0.0%st
Cpu2 : 0.0%us, 0.0%sy, 0.0%ni, 98.7%id, 0.0%wa, 0.0%hi, 1.3%si, 0.0%st
Cpu3 : 0.0%us, 0.3%sy, 0.0%ni, 94.1%id, 0.0%wa, 0.0%hi, 5.6%si, 0.0%st
Cpu4 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu5 : 0.0%us, 0.3%sy, 0.0%ni, 99.7%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu6 : 5.3%us, 2.0%sy, 0.0%ni, 91.7%id, 0.0%wa, 0.0%hi, 1.0%si, 0.0%st
Cpu7 : 11.6%us, 1.7%sy, 0.0%ni, 85.5%id, 0.0%wa, 0.0%hi, 1.3%si, 0.0%st
Cpu8 : 10.3%us, 1.3%sy, 0.0%ni, 87.4%id, 0.0%wa, 0.0%hi, 1.0%si, 0.0%st
Cpu9 : 11.3%us, 0.7%sy, 0.0%ni, 86.7%id, 0.0%wa, 0.0%hi, 1.3%si, 0.0%st
Cpu10 : 7.0%us, 3.0%sy, 0.0%ni, 89.4%id, 0.0%wa, 0.0%hi, 0.7%si, 0.0%st
Cpu11 : 4.0%us, 4.3%sy, 0.0%ni, 91.4%id, 0.0%wa, 0.0%hi, 0.3%si, 0.0%st
Cpu12 : 3.7%us, 2.0%sy, 0.0%ni, 94.0%id, 0.0%wa, 0.0%hi, 0.3%si, 0.0%st
Cpu13 : 3.3%us, 1.3%sy, 0.0%ni, 95.4%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu14 : 4.3%us, 1.7%sy, 0.0%ni, 93.0%id, 0.0%wa, 0.0%hi, 1.0%si, 0.0%st
Cpu15 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu16 : 0.3%us, 1.3%sy, 0.0%ni, 98.3%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu17 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu18 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu19 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu20 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu21 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu22 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu23 : 0.7%us, 0.3%sy, 0.0%ni, 99.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu24 : 0.0%us, 0.0%sy, 0.0%ni, 67.5%id, 0.0%wa, 2.0%hi, 30.5%si, 0.0%st
Cpu25 : 0.0%us, 0.0%sy, 0.0%ni, 45.4%id, 0.0%wa, 1.3%hi, 53.3%si, 0.0%st
Cpu26 : 0.0%us, 0.0%sy, 0.0%ni, 93.4%id, 0.0%wa, 0.7%hi, 6.0%si, 0.0%st
Cpu27 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu28 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu29 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu30 : 6.3%us, 1.0%sy, 0.0%ni, 91.7%id, 0.0%wa, 0.0%hi, 1.0%si, 0.0%st
Cpu31 : 11.3%us, 1.3%sy, 0.0%ni, 86.4%id, 0.0%wa, 0.0%hi, 1.0%si, 0.0%st
Cpu32 : 7.3%us, 0.7%sy, 0.0%ni, 91.1%id, 0.0%wa, 0.0%hi, 1.0%si, 0.0%st
Cpu33 : 8.6%us, 1.3%sy, 0.0%ni, 89.0%id, 0.0%wa, 0.0%hi, 1.0%si, 0.0%st
Cpu34 : 12.6%us, 1.3%sy, 0.0%ni, 84.7%id, 0.0%wa, 0.0%hi, 1.3%si, 0.0%st
Cpu35 : 3.6%us, 4.0%sy, 0.0%ni, 92.1%id, 0.0%wa, 0.0%hi, 0.3%si, 0.0%st
Cpu36 : 1.7%us, 1.0%sy, 0.0%ni, 97.4%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu37 : 2.3%us, 2.0%sy, 0.0%ni, 95.4%id, 0.0%wa, 0.0%hi, 0.3%si, 0.0%st
Cpu38 : 5.0%us, 1.7%sy, 0.0%ni, 93.0%id, 0.0%wa, 0.0%hi, 0.3%si, 0.0%st
Cpu39 : 0.3%us, 0.0%sy, 0.0%ni, 99.7%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu40 : 0.7%us, 1.7%sy, 0.0%ni, 97.7%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu41 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu42 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu43 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu44 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu45 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu46 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu47 : 1.0%us, 0.3%sy, 0.0%ni, 98.7%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 65746320k total, 7982200k used, 57764120k free, 93136k buffers
Swap: 33551672k total, 0k used, 33551672k free, 1834796k cached

Kernel Interface table
Iface MTU Met RX-OK RX-ERR RX-DRP RX-OVR TX-OK TX-ERR TX-DRP TX-OVR Flg
Mgmt 1500 0 102959 0 0 0 158433 0 0 0 BMRU
Sync 1500 0 516139 0 0 0 819709 0 0 0 BMRU
bond1 1500 0 6980926 0 0 0 7490991 0 0 0 BMmRU
bond2 1500 0 785635483 0 0 0 776472832 0 0 0 BMmRU
bond3 1500 0 10596109 0 0 0 10870748 0 0 0 BMmRU
bond4 1500 0 43445054 0 0 0 49262579 0 0 0 BMmRU
eth1-01 1500 0 48534 0 0 0 4379061 0 0 0 BMsRU
eth1-02 1500 0 24755088 0 0 0 772492689 0 0 0 BMsRU
eth1-03 1500 0 43445385 0 0 0 49262991 0 0 0 BMsRU
eth1-04 1500 0 3876598 0 0 0 5029630 0 0 0 BMRU
eth2-01 1500 0 2649139 0 0 0 9736 0 0 0 BMsRU
eth2-02 1500 0 7947157 0 0 0 10861543 0 0 0 BMsRU
eth4-01 1500 0 6932527 0 0 0 3112070 0 0 0 BMsRU
eth4-02 1500 0 760924909 0 0 0 3989946 0 0 0 BMsRU
eth4-03 1500 0 0 0 0 0 0 0 0 0 BMsU
eth4-04 1500 0 0 0 0 0 0 0 0 0 BMU
lo 16436 0 21980 0 0 0 21980 0 0 0 LRU
[Expert@fw-lan-02:0]#

 

Accepted Solution

Re: R80.20 VSX Performance Optimization Questions - MultiQ, CoreXL

It looks like the setup is OK, but indeed MQ seems to be hammering mostly CPU 0.

You can add cores with the cpmq set rx_num command; this is example output from our gateway:

[Expert@vsx:0]# cpmq get rx_num ixgbe
The rx_num for ixgbe is: 6

Remember to do this on the "standby" chassis first (so fail all your VSs over to one box first). And remember to re-allocate the CoreXL cores first so they do not overlap with the SecureXL (SND) cores.

After the reboot you will need to set the MQ affinity:

cpmq set affinity

Or just type cpmq to get the full help.
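The whole sequence described above can be sketched as follows (illustrative only; the queue count of 6 matches the example output above, and the syntax should be confirmed against cpmq's built-in help on your version):

```shell
# Run on the standby member, after failing all VSs over to the other box.
cpmq get rx_num ixgbe      # current number of RX queues (4 in the original post)
cpmq set rx_num ixgbe 6    # raise the queue count for all ixgbe interfaces
# Re-allocate the CoreXL/FWK cores first so they do not overlap the MQ cores.
reboot                     # the new rx_num only takes effect after a reboot
cpmq set affinity          # then re-apply the multi-queue affinity
cpmq                       # with no arguments, prints the full help
```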

 

 

 


14 Replies
Re: R80.20 VSX Performance Optimization Questions - MultiQ, CoreXL
Thank you. I will try this.
Re: R80.20 VSX Performance Optimization Questions - MultiQ, CoreXL

By the way, I heard that R80.20 was not a great release for gateways; maybe try upgrading to R80.30. We run 23800 VSX with MQ with no major issues.

Re: R80.20 VSX Performance Optimization Questions - MultiQ, CoreXL
Or if you are confident go to R80.40 instead, which will give you the 3.10 kernel by default and a better MQ implementation.
Regards, Maarten
Re: R80.20 VSX Performance Optimization Questions - MultiQ, CoreXL
Will migrating to kernel 3.10 require a clean install?
Do you do this over the management adapter? Last time that seemed really slow.
Regards,

Jan
Re: R80.20 VSX Performance Optimization Questions - MultiQ, CoreXL
The kernel update does not need a clean install; the filesystem update that came with R80.20 did require one, but on a gateway that should not really make any difference.
You can prepare the download while the gateway is running by using installer download and selecting the R80.xx version you want.
Once downloaded, you need to run installer verify first to see if the installer finds any issues.
When you press the <Tab> key after entering installer download in clish, you should see a list of available packages. Just run the download anytime that suits you. The verify and installation are then both local only and should not be slow at all.
Regards, Maarten
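The clish workflow above can be sketched like this (CPUSE command names from memory; the package identifiers are whatever <Tab> completion lists, so verify the exact syntax on your Gaia version):

```shell
# Gaia clish (CPUSE) sketch -- <package> stands for the name shown by completion.
show installer packages available-for-download   # list the offered packages
installer download <package>                     # safe while the gateway is running
installer verify <package>                       # local pre-check for blocking issues
installer upgrade <package>                      # run this in the maintenance window
```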
Re: R80.20 VSX Performance Optimization Questions - MultiQ, CoreXL
This was the point. I did not remember that it was the filesystem change.
Re: R80.20 VSX Performance Optimization Questions - MultiQ, CoreXL
Funnily enough, when we did upgrades in January the 23800 appliances did not support 3.10 🙂 It works like a charm using 2.6.
Re: R80.20 VSX Performance Optimization Questions - MultiQ, CoreXL
Hello,

the behaviour doesn't change: 100% CPU on all MultiQueue interfaces. It's the same on a 15600 with R80.30.
Do you have a hint where to look for what could be wrong?

Best regards,

Jan
Re: R80.20 VSX Performance Optimization Questions - MultiQ, CoreXL
So it actually has changed, since previously only one MQ core (CPU 0) got most of the traffic? Are you still running 4 MQ cores? Since your other cores have plenty of headroom, I would suggest increasing the number of MQ cores. Start with 8 and possibly go up even more. Just remember to re-calculate the FWK cores correctly.
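For the FWK side of that recalculation, the per-VS affinity can be moved off the new MQ cores with fw ctl affinity; a hedged sketch (the VS ID and core list below are illustrative, not a recommendation for this particular box):

```shell
# Illustrative only -- choose cores that do not overlap the new MQ/SND cores.
cpmq set rx_num ixgbe 8                              # grow MQ from 4 to 8 queues
fw ctl affinity -l                                   # review current per-VS assignment
fw ctl affinity -s -d -vsid 5 -cpu 8 9 10 11 12 13   # e.g. pin VS5's FWK instances
fw ctl affinity -l -r                                # verify, listed per CPU
```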
Re: R80.20 VSX Performance Optimization Questions - MultiQ, CoreXL
Hello,
we have another firewall (R80.30 with VSX) that has similar problems.

I activated MultiQueue on the four 10GE interfaces with 6 cores (HT) and a 4096 RX/TX ring size.
The cores are at a 100% si rate nearly the whole time.
Do you have a clue what could be wrong with the configuration?

fw ctl affinity -l
Mgmt: CPU 3
Sync: CPU 19
VS_0: CPU 4 20
VS_0 fwk: CPU 4 20
VS_1: CPU 5 6 7 8 9 10 11 12 13 14 15 21 22 23 24 25 26 27 28 29 30 31
VS_1 fwk: CPU 5 6 7 8 9 10 11 12 13 14 15 21 22 23 24 25 26 27 28 29 30 31
VS_2: CPU 5 6 7 8 9 10 11 12 13 14 15 21 22 23 24 25 26 27 28 29 30 31
VS_2 fwk: CPU 5 6 7 8 9 10 11 12 13 14 15 21 22 23 24 25 26 27 28 29 30 31
VS_3: CPU 5 6 7 8 9 10 11 12 13 14 15 21 22 23 24 25 26 27 28 29 30 31
VS_3 fwk: CPU 5 6 7 8 9 10 11 12 13 14 15 21 22 23 24 25 26 27 28 29 30 31
VS_4: CPU 5 6 7 8 9 10 11 12 13 14 15 21 22 23 24 25 26 27 28 29 30 31
VS_4 fwk: CPU 5 6 7 8 9 10 11 12 13 14 15 21 22 23 24 25 26 27 28 29 30 31
VS_6: CPU 5 6 7 8 9 10 11 12 13 14 15 21 22 23 24 25 26 27 28 29 30 31
VS_6 fwk: CPU 5 6 7 8 9 10 11 12 13 14 15 21 22 23 24 25 26 27 28 29 30 31
VS_7: CPU 5 6 7 8 9 10 11 12 13 14 15 21 22 23 24 25 26 27 28 29 30 31
VS_7 fwk: CPU 5 6 7 8 9 10 11 12 13 14 15 21 22 23 24 25 26 27 28 29 30 31
VS_22: CPU 5 6 7 8 9 10 11 12 13 14 15 21 22 23 24 25 26 27 28 29 30 31
VS_22 fwk: CPU 5 6 7 8 9 10 11 12 13 14 15 21 22 23 24 25 26 27 28 29 30 31
VS_24: CPU 5 6 7 8 9 10 11 12 13 14 15 21 22 23 24 25 26 27 28 29 30 31
VS_24 fwk: CPU 5 6 7 8 9 10 11 12 13 14 15 21 22 23 24 25 26 27 28 29 30 31
Interface eth3-01: has multi queue enabled
Interface eth3-02: has multi queue enabled
Interface eth1-01: has multi queue enabled
Interface eth1-02: has multi queue enabled

Re: R80.20 VSX Performance Optimization Questions - MultiQ, CoreXL

What's the traffic level in general on those interfaces? Is it the 6 MQ CPUs that are running at 100%? If the FWK cores are not too loaded, you will probably need to change the split and allocate more cores to MQ/SecureXL to deal with the load.

Re: R80.20 VSX Performance Optimization Questions - MultiQ, CoreXL
The traffic level is 2.1 Gbit/s RX and 2.2 Gbit/s TX across these 4 interfaces. Only the MQ CPUs are at 100%: CPUs 0, 1 and 2 are at 100%, and the HT sibling CPU 24 is also at 100%. CPUs 25 and 26 are between 5% and 25%.

Thanks,
Jan

Re: R80.20 VSX Performance Optimization Questions - MultiQ, CoreXL

Try changing the split and adding more cores to MQ. It's hard to give an exact number, but I would try doubling it: 12 HT cores for MQ and the rest for FWK. If you have any graphs of CPU load and throughput, you might be able to correlate them and calculate a better split.
