Martin_Oles
Contributor

Unusually high CPU after VSX migration from R77.30 to R80.30 Jumbo Take 215

Hi,

I am just wondering if anyone has observed similar behavior. We are running a VSX cluster of Check Point 12200 appliances (4 CPUs, 8 GB of memory, CPAC-4-10 line card).
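
For reference, the core count and memory are easy to confirm from the expert shell with standard Linux tools (nothing Check Point specific):

[Expert@FW01:0]# grep -c ^processor /proc/cpuinfo
[Expert@FW01:0]# free -m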

Prior to the migration from R77.30 to R80.30 JHF Take 215, CPU utilization looked like this:

[Expert@FW01:2]#top
top - 15:08:42 up 541 days, 13:07,  2 users,  load average: 0.96, 1.04, 0.95
Tasks: 151 total,   1 running, 150 sleeping,   0 stopped,   0 zombie
Cpu(s):  4.6%us,  2.3%sy,  0.0%ni, 87.2%id,  0.1%wa,  0.2%hi,  5.5%si,  0.0%st
Mem:   8029532k total,  7977484k used,    52048k free,   419772k buffers
Swap: 18908408k total,      544k used, 18907864k free,  2963120k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND                                                        
16043 admin      0 -20  818m 278m  21m S   52  3.6 120937:59 fwk2_dev                                                       
 9718 admin     15   0  448m  73m  25m S   16  0.9   9965:41 fw_full                                                        
16026 admin      0 -20  649m 109m  21m S    6  1.4  31184:29 fwk1_dev      

The virtual system had only one virtual instance (CPU) and was working just fine.

But with the very same rulebase and configuration, the CPUs are now through the roof:

[Expert@FW01:2]# top
top - 09:45:29 up 2 days,  6:48,  5 users,  load average: 4.04, 4.22, 4.39
Tasks: 163 total,   1 running, 162 sleeping,   0 stopped,   0 zombie
Cpu0  :  0.0%us,  0.0%sy,  0.0%ni, 21.3%id,  0.0%wa,  0.3%hi, 78.3%si,  0.0%st
Cpu1  : 56.1%us, 10.0%sy,  0.0%ni, 27.6%id,  0.0%wa,  0.0%hi,  6.3%si,  0.0%st
Cpu2  : 64.3%us, 10.3%sy,  0.0%ni, 18.3%id,  0.0%wa,  0.0%hi,  7.0%si,  0.0%st
Cpu3  : 62.1%us, 12.0%sy,  0.0%ni, 19.9%id,  0.0%wa,  0.0%hi,  6.0%si,  0.0%st
Mem:   8029492k total,  4010952k used,  4018540k free,   284872k buffers
Swap: 18908408k total,        0k used, 18908408k free,  1106088k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
26699 admin      0 -20 1741m 1.0g 110m S  203 13.6 653:26.79 fwk2_dev_0
27524 admin     15   0  595m  95m  39m S   12  1.2  32:44.97 fw_full
10245 admin     15   0     0    0    0 S    4  0.0  11:56.26 cphwd_q_init_ke
18641 admin      0 -20  809m 194m  46m S    3  2.5  32:04.54 fwk1_dev_0

I have since assigned three virtual instances (CPUs) to the virtual system, but it is still not enough. I am also observing quite high load on CPU 0, where the dispatcher (SND) runs.
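
For completeness, this is how I have been checking the per-instance load after the change: I switch into the VS context with vsenv and read the CoreXL counters. The Dynamic Dispatcher check is my reading of sk105261, so please correct me if the syntax differs on R80.30:

[Expert@FW01:0]# vsenv 2
[Expert@FW01:2]# fw ctl multik stat
[Expert@FW01:2]# fw ctl multik dynamic_dispatching get_mode

fw ctl multik stat shows the connection count and peak per CoreXL instance, which makes it easy to see whether one instance is saturated while the others sit idle.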

I have checked the SIM affinity, and it looks fine to me:

[Expert@FW01:0]# sim affinity -l -r -v
eth1-02 : 0
eth1-03 : 0
eth1-01 : 0
Mgmt : 0

[Expert@FW01:0]# fw ctl affinity -l -a -v
Interface eth1 (irq 234): CPU 0
Interface eth7 (irq 115): CPU 0
Interface Mgmt (irq 99): CPU 0
Interface eth1-01 (irq 226): CPU 0
Interface eth1-02 (irq 234): CPU 0
Interface eth1-03 (irq 67): CPU 0
VS_0 fwk: CPU 1 2 3
VS_1 fwk: CPU 1 2 3
VS_2 fwk: CPU 1 2 3
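
Should the pinning ever need to be changed, my understanding from the VSX documentation is that fwk instances are assigned to cores with fw ctl affinity -s -d (treat this as a sketch; I have left ours at the values shown above):

[Expert@FW01:0]# fw ctl affinity -s -d -vsid 2 -cpu 1 2 3
[Expert@FW01:0]# fw ctl affinity -l -a -v

With only four cores there is not much room to play with: CPU 0 stays dedicated to the SND/dispatcher and the three fwk instances share cores 1-3.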

In the affected virtual system I can observe a surprisingly high amount of PSLXL traffic, but I have no pre-upgrade numbers to compare it against.

[Expert@FW01:2]# fwaccel stats -s
Accelerated conns/Total conns : 26754/115454 (23%)
Accelerated pkts/Total pkts : 5344525154/10554405196 (50%)
F2Fed pkts/Total pkts : 5629452/10554405196 (0%)
F2V pkts/Total pkts : 21331262/10554405196 (0%)
CPASXL pkts/Total pkts : 0/10554405196 (0%)
PSLXL pkts/Total pkts : 5204250590/10554405196 (49%)
QOS inbound pkts/Total pkts : 0/10554405196 (0%)
QOS outbound pkts/Total pkts : 0/10554405196 (0%)
Corrected pkts/Total pkts : 0/10554405196 (0%)
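
To figure out why so much traffic ends up in PSLXL, I have been looking at whether templating is disabled by some rule and at the violation counters, roughly like this:

[Expert@FW01:2]# fwaccel stat
[Expert@FW01:2]# fwaccel stats -p

As far as I understand, fwaccel stat reports the accelerator status and whether Accept/Drop/NAT templates are disabled by a specific rule, and fwaccel stats -p breaks down the packets forwarded off the accelerated path by violation reason.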

I also tried turning IPS off, but that did not help either:

[Expert@FW01:2]# ips stat
IPS Status: Manually disabled
IPS Update Version: 635158746
Global Detect: Off
Bypass Under Load: Off


[Expert@FW01:2]# enabled_blades
fw ips

[Expert@FW01:0]# vsx stat -l

VSID: 2
VRID: 2
Type: Virtual System
Name: ntra
Security Policy: Standard
Installed at: 14Sep2020 20:03:55
SIC Status: Trust
Connections number: 118972
Connections peak: 119651
Connections limit: 549900

I also saw the drop below in the "fw ctl zdebug + drop" output; it disappeared after I added the extra virtual instances. As the reason text says, SecureXL could not hand packets over to firewall instance 0 because that instance was fully utilized:

@;166488350;[kern];[tid_0];[fw4_0];fw_log_drop_ex: Packet proto=6 10.20.30.40:50057 -> 15.114.24.198:443 dropped by cphwd_pslglue_handle_packet_cb_do Reason: F2P: Instance 0 is currently fully utilized;
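
While this was happening, watching the instance counters in a loop made the saturation of instance 0 easy to see; plain Linux watch from the expert shell is enough:

[Expert@FW01:2]# watch -d -n 1 'fw ctl multik stat'

The -d flag highlights the counters that change between refreshes, so a single instance pinned at its peak stands out immediately.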

I do have a support case open for this, but so far nothing really helpful has come out of it. Am I missing something?

Thank you for your opinions and advice.
