What does high si mean on a FW Worker core?
Timothy Hall, I have an Open Server (R80.10) that has a 4 core license on it. I've set CoreXL to consume 2 cores, and SND to 2 cores. What does it mean when the FW Worker cores are using a high % of si in the top output? I thought si was typically only seen in SND cores for taking traffic off of network interfaces?
This FW is running just the FW blade, and 94% of packets are accelerated.
Interfaces eth0 and eth2 are bonded together, and eth1 and eth3 are bonded together (all 10G interfaces)
For example:
[Expert@fw1:0]# fw ctl affinity -l
eth3: CPU 0
eth2: CPU 1
eth1: CPU 0
eth0: CPU 1
eth6: CPU 0
eth4: CPU 1
Kernel fw_0: CPU 3
Kernel fw_1: CPU 2
Daemon mpdaemon: CPU 2 3
Daemon wsdnsd: CPU 2 3
Daemon fwd: CPU 2 3
Daemon in.geod: CPU 2 3
Daemon lpd: CPU 2 3
Daemon in.asessiond: CPU 2 3
Daemon cpd: CPU 2 3
Daemon cprid: CPU 2 3
The current license permits the use of CPUs 0, 1, 2, 3 only.
[Expert@fw1:0]# top
top - 21:05:07 up 27 min, 1 user, load average: 1.73, 1.51, 1.37
Tasks: 178 total, 3 running, 175 sleeping, 0 stopped, 0 zombie
Cpu0 : 0.0%us, 0.0%sy, 0.0%ni, 76.3%id, 0.0%wa, 2.0%hi, 21.7%si, 0.0%st
Cpu1 : 0.0%us, 0.0%sy, 0.0%ni, 91.7%id, 0.0%wa, 1.3%hi, 7.0%si, 0.0%st
Cpu2 : 2.7%us, 1.7%sy, 0.0%ni, 52.7%id, 0.0%wa, 0.0%hi, 43.0%si, 0.0%st
Cpu3 : 2.3%us, 2.0%sy, 0.0%ni, 51.5%id, 1.3%wa, 0.0%hi, 42.9%si, 0.0%st
<removed extra unused cores at 100% idle>
Mem: 16262312k total, 5321028k used, 10941284k free, 52204k buffers
Swap: 17824108k total, 0k used, 17824108k free, 1995872k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
5703 admin 15 0 0 0 0 R 43 0.0 11:00.16 fw_worker_1
5702 admin 15 0 0 0 0 R 43 0.0 10:45.52 fw_worker_0
7249 admin 15 0 591m 104m 40m S 7 0.7 1:48.70 fw_full
6696 admin 15 0 2884 1580 1320 S 0 0.0 0:06.24 netflowd
18626 admin 15 0 2244 1136 832 R 0 0.0 0:00.03 top
1 admin 15 0 1972 720 624 S 0 0.0 0:03.51 init
[Expert@fw1:0]# fwaccel stats -s
Accelerated conns/Total conns : 97389/109164 (89%)
Accelerated pkts/Total pkts : 66910366/70502200 (94%)
F2Fed pkts/Total pkts : 2872625/70502200 (4%)
PXL pkts/Total pkts : 719209/70502200 (1%)
QXL pkts/Total pkts : 0/70502200 (0%)
Thanks for any insight you can provide as to what may be going on here.
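The percentages in the `fwaccel stats -s` output can be recomputed from the raw counters to see how many packets actually leave the fully accelerated path. The following is a minimal sketch using the figures posted above; the function names are illustrative and not part of any Check Point tool.

```python
# Counters copied from the `fwaccel stats -s` output above: (part, total).
COUNTERS = {
    "Accelerated conns": (97389, 109164),
    "Accelerated pkts": (66910366, 70502200),
    "F2Fed pkts": (2872625, 70502200),
    "PXL pkts": (719209, 70502200),
    "QXL pkts": (0, 70502200),
}


def percent(part: int, total: int) -> float:
    """Return part/total as a percentage, or 0.0 when total is zero."""
    return 100.0 * part / total if total else 0.0


def summarize(counters):
    """Map each counter name to its percentage of the total."""
    return {name: percent(part, total) for name, (part, total) in counters.items()}


def non_accelerated_packets(counters):
    """Sum the packets counted on the F2F, PXL and QXL paths."""
    return sum(counters[name][0] for name in ("F2Fed pkts", "PXL pkts", "QXL pkts"))


if __name__ == "__main__":
    for name, pct in summarize(COUNTERS).items():
        print(f"{name:<18} {pct:6.2f}%")
    total_pkts = COUNTERS["Accelerated pkts"][1]
    slow = non_accelerated_packets(COUNTERS)
    print(f"Non-accelerated pkts: {slow} ({percent(slow, total_pkts):.2f}%)")
```

With these counters the accelerated, F2F and PXL packet counts add up exactly to the total, so roughly 5% of packets (about 3.6 million) are not fully accelerated. The displayed percentages appear to be truncated rather than rounded.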
Are you using a lot of VLANs ?
46 VLAN interfaces are configured.
Hristo Grigorov, can you comment further on VLANs and si usage?
I read somewhere that a large number of VLANs may cause high si, but in your case 46 does not look like a lot to me. Are they all on a single physical interface?
They are all on a single bond interface (2 physical interfaces bonded as a single logical interface).
I'm not sure if that is the reason for the high si value. Most likely it is not.
The distinction between sy and si in the context of Check Point's kernel code has never been real clear to me. Both are definitely measuring CPU slices consumed in kernel space. And si is usually associated with SoftIRQ runs but can include other operations as well.
Given that you are running at around 90% acceleration for throughput and sessions, it is strange that your worker cores are so busy. I wonder whether, because the number of physical cores exceeds the number of licensed cores, the SoftIRQ processing has somehow become "disconnected" from the Sim driver on cores 0 & 1 and the SoftIRQ handling is happening on cores 2 & 3 instead. Please post the following:
cat $FWDIR/conf/fwaffinity.conf
sim affinity -l
cat /proc/interrupts
now available at maxpowerfirewalls.com
[Expert@fw1:0]# cat $FWDIR/conf/fwaffinity.conf
# Process / Interface Affinity Settings
<removed>
i default auto
[Expert@fw1:0]# sim affinity -l
eth0 : 1
eth1 : 0
eth2 : 1
eth3 : 0
eth4 : 1
eth6 : 0
[Expert@fw1:0]# cat /proc/interrupts
CPU0 CPU1 CPU2 CPU3 CPU4 CPU5 CPU6 CPU7 CPU8 CPU9 CPU10 CPU11
0: 50263369 0 0 0 0 0 0 0 0 0 0 0 IO-APIC-edge timer
1: 3 0 0 0 0 0 0 0 0 0 0 0 IO-APIC-edge i8042
4: 494 0 0 0 0 0 0 0 0 0 0 0 IO-APIC-edge serial
8: 3 0 0 0 0 0 0 0 0 0 0 0 IO-APIC-edge rtc
9: 0 0 0 0 0 0 0 0 0 0 0 0 IO-APIC-level acpi
12: 4 0 0 0 0 0 0 0 0 0 0 0 IO-APIC-edge i8042
51: 564647 0 0 0 0 0 0 0 0 0 0 0 PCI-MSI-X cciss0
68: 1 0 0 0 0 0 0 0 0 0 0 0 PCI-MSI-X eth4
76: 7508 98507 0 0 0 0 0 0 0 0 0 0 PCI-MSI-X eth4-TxRx-0
91: 840148957 0 0 0 0 0 0 0 0 0 0 0 PCI-MSI-X eth3-TxRx-0
99: 5 0 0 0 0 0 0 0 0 0 0 0 PCI-MSI-X eth3
115: 541398 365993378 0 0 0 0 0 0 0 0 0 0 PCI-MSI-X eth2-TxRx-0
123: 4 0 0 0 0 0 0 0 0 0 0 0 PCI-MSI-X eth2
139: 872685073 0 0 0 0 0 0 0 0 0 0 0 PCI-MSI-X eth1-TxRx-0
147: 6 0 0 0 0 0 0 0 0 0 0 0 PCI-MSI-X eth1
163: 572345 312908224 0 0 0 0 0 0 0 0 0 0 PCI-MSI-X eth0-TxRx-0
171: 5 0 0 0 0 0 0 0 0 0 0 0 PCI-MSI-X eth0
185: 34 0 0 0 0 0 0 0 0 0 0 0 IO-APIC-level uhci_hcd:usb1
218: 26 0 0 0 0 0 0 0 0 0 0 0 IO-APIC-level ehci_hcd:usb2
219: 1 0 0 0 0 0 0 0 0 0 0 0 PCI-MSI-X eth6
226: 65 0 0 0 0 0 0 0 0 0 0 0 IO-APIC-level ehci_hcd:usb3
227: 25721946 245191 0 0 0 0 0 0 0 0 0 0 PCI-MSI-X eth6-TxRx-0
NMI: 13838 4943 16791 16626 426 465 638 577 477 560 565 1051
LOC: 50265571 50265560 50265544 50265476 50265327 50265254 50262717 50265015 50264943 50264870 50264797 50264728
ERR: 0
MIS: 0
(sorry for the rotten formatting on the last command - I tried a few different fonts, but cannot get it to space out more nicely)
Thanks for any insight!
Sorry for the delay in responding; I had to think about this one for a while and I was at CPX Vegas. Everything looks correct as far as the SND cores and SoftIRQ handling. As for why you are seeing a lot of si utilization on your workers even with such a high percentage of templating and fully accelerated traffic, I can think of the following things to check:
1) Are you using a cluster? State sync is handled on the worker cores, and if your sync network is unhealthy or overloaded it can cause high CPU use on these cores. Please post the output of fw ctl pstat and cphaprob syncstat. An easy thing to try if you suspect this is the cause is to shut down your standby member and see if the CPU use normalizes on the active.
2) While SecureXL is forming accept templates at a very high percentage on your gateway, NAT lookups must still happen on the worker cores unless NAT templates are enabled. Is most of your traffic NATted? If so, this may be one of the situations where you will want to enable NAT templates, given the very high rate of acceleration and connection templating. Run fwaccel stat to check the status of NAT templates. They are off by default in R80.10 and earlier; if you decide to enable them, make sure you have the latest GA Jumbo HFA loaded.
3) I suppose it is possible that there are long-running unaccelerated elephant flows or an extremely fast burst of new, non-templated connections performing numerous rule base lookups on the worker cores and causing the utilization, but that seems unlikely. What is the new connection rate (connections/sec) reported by cpview on the overview screen? Also in cpview go to the CPU & Network screens and look under Top-Connections for possible elephants.
--
"IPS Immersion Training" Self-paced Video Class
Now Available at http://www.maxpowerfirewalls.com
1) Yes, this is an HA New Mode cluster. Here is the output of the commands you requested:
# fw ctl pstat
System Capacity Summary:
Memory used: 7% (928 MB out of 11910 MB) - below watermark
Concurrent Connections: 63878 (Unlimited)
Aggressive Aging is enabled, not active
Hash kernel memory (hmem) statistics:
Total memory allocated: 1245708288 bytes in 304128 (4096 bytes) blocks using 1 pool
Total memory bytes used: 0 unused: 1245708288 (100.00%) peak: 363027380
Total memory blocks used: 0 unused: 304128 (100%) peak: 92080
Allocations: 2769517885 alloc, 0 failed alloc, 2768911449 free
System kernel memory (smem) statistics:
Total memory bytes used: 1864138360 peak: 1880021812
Total memory bytes wasted: 2943676
Blocking memory bytes used: 6567460 peak: 6845516
Non-Blocking memory bytes used: 1857570900 peak: 1873176296
Allocations: 158130399 alloc, 0 failed alloc, 158128460 free, 0 failed free
vmalloc bytes used: 1854176256 expensive: no
Kernel memory (kmem) statistics:
Total memory bytes used: 690286080 peak: 974801324
Allocations: 2927646014 alloc, 0 failed alloc
2927038444 free, 0 failed free
External Allocations: 11520 for packets, 140558384 for SXL
Cookies:
1118031906 total, 0 alloc, 0 free,
43774 dup, 794853300 get, 41272552 put,
1194132460 len, 377912 cached len, 0 chain alloc,
0 chain free
Connections:
268803174 total, 219140235 TCP, 47903781 UDP, 1753991 ICMP,
5167 other, 1942 anticipated, 212500 recovered, 63878 concurrent,
390501 peak concurrent
Fragments:
635641 fragments, 228997 packets, 1487 expired, 0 short,
0 large, 0 duplicates, 0 failures
NAT:
1219601/0 forw, 1364547/0 bckw, 959319 tcpudp,
1624778 icmp, 1420265-806129 alloc
Sync:
Version: new
Status: Able to Send/Receive sync packets
Sync packets sent:
total : 299435577, retransmitted : 16, retrans reqs : 11, acks : 11403
Sync packets received:
total : 880511, were queued : 91, dropped by net : 52
retrans reqs : 16, received 954 acks
retrans reqs for illegal seq : 0
dropped updates as a result of sync overload: 0
Callback statistics: handled 179 cb, average delay : 2, max delay : 105
# cphaprob syncstat
Sync Statistics (IDs of F&A Peers - 1 ):
Other Member Updates:
Sent retransmission requests................... 11
Avg missing updates per request................ 6
Old or too-new arriving updates................ 2
Unsynced missing updates....................... 0
Lost sync connection (num of events)........... 13
Timed out sync connection ..................... 0
Local Updates:
Total generated updates ....................... 31102774
Recv Retransmission requests................... 16
Recv Duplicate Retrans request................. 0
Blocking Events................................ 0
Blocked packets................................ 0
Max length of sending queue.................... 0
Avg length of sending queue.................... 0
Hold Pkts events............................... 179
Unhold Pkt events.............................. 179
Not held due to no members..................... 0
Max held duration (sync ticks)................. 0
Avg held duration (sync ticks)................. 0
Timers:
Sync tick (ms)................................. 100
CPHA tick (ms)................................. 100
Queues:
Sending queue size............................. 512
Receiving queue size........................... 256
2) This GW should be doing very little NAT. Here is the fwaccel stat. I'll see what rule is stopping template offloads...
# fwaccel stat
Accelerator Status : on
Accept Templates : disabled by Firewall
Layer BDC-DMZ-Policy Security disables template offloads from rule #32
Throughput acceleration still enabled.
Drop Templates : disabled
NAT Templates : disabled by user
NMR Templates : enabled
NMT Templates : enabled
Accelerator Features : Accounting, NAT, Cryptography, Routing,
HasClock, Templates, Synchronous, IdleDetection,
Sequencing, TcpStateDetect, AutoExpire,
DelayedNotif, TcpStateDetectV2, CPLS, McastRouting,
WireMode, DropTemplates, NatTemplates,
Streaming, MultiFW, AntiSpoofing, Nac,
ViolationStats, AsychronicNotif, ERDOS,
McastRoutingV2, NMR, NMT, NAT64, GTPAcceleration,
SCTPAcceleration
Cryptography Features : Tunnel, UDPEncapsulation, MD5, SHA1, NULL,
3DES, DES, CAST, CAST-40, AES-128, AES-256,
ESP, LinkSelection, DynamicVPN, NatTraversal,
EncRouting, AES-XCBC, SHA256
3) The connections/sec seem to average between 350-450 connections/sec in cpview.
I noticed one potential flow that was a constant 9-30 Mbps on 'port 0' in cpview. Logs identify this as a VPN (I see IKE(500) and ESP (50) tied to the source/destination). I'm guessing that traffic is not accelerated? But I wouldn't expect that 1 flow to impact both FW worker cores, right?
1) State sync network looks fine.
2) Adjusting rule 32 to enable more templating might help, depends. If you make the templating adjustments and the CPU drops noticeably on the workers that would seem to support the "lots of rulebase lookups causing load on the workers" theory I floated earlier.
3) That connection rate is busy but not excessive. As for the VPN flow, in general that should be accelerated by SecureXL unless you are using SHA-384, some kind of GCM mode for AES, or some other inspection is being called for that requires PXL/F2F. Please post the output of enabled_blades and check your VPN settings for those two algorithms. On a screen of cpview you should be able to see live throughput in SXL/PXL/F2F; roughly what are you seeing there?
4) Can you characterize the CPU load on the workers a bit further? If you sit and watch it for awhile is it pretty constant or does it drop to zero and jump back up over and over?
Beyond this point it is going to be tough to figure out what is going on. Under the Advanced screen in cpview you can look at various live statistics for the individual CoreXL worker instances, which may provide some insight into what they are busy trying to do. A lot of those stats are not well documented, but feel free to run them by me if something jumps out at you. Beyond that, it will probably require a TAC case...
RE: 3) Let me clarify about the VPN flow. It is not a VPN terminating on the Check Point FW. A client/server behind the FW is initiating a VPN tunnel to an external host.
# enabled_blades
fw mon
Per cpview:
Traffic Rate:
                          Total        FW       PXL  SecureXL
Inbound packets/sec        168K     1,062       672      166K
Outbound packets/sec       168K     1,014       672      166K
Inbound bits/sec           656M    1,524K    2,718K      652M
Outbound bits/sec          677M    2,913K    2,794K      671M
Connections/sec             397       285         0       112
Concurrent Connections:
                          Total        FW       PXL  SecureXL
Connections              67,526    24,047       276    43,203
Non-TCP                   6,101       700         0     5,401
TCP handshake             1,511         0         0     1,511
TCP established          39,603    23,182        20    16,401
TCP closed               20,311       165       256    19,890
Templates:
% Connections from templates    25%
% Unused templates              45%
4. It is pretty constant. And, in top, the 'si' number for a core almost always matches the %cpu for the corresponding fw_worker process. For example:
]# top
top - 11:12:34 up 7 days, 14:35, 1 user, load average: 0.45, 0.90, 1.03
Tasks: 178 total, 3 running, 175 sleeping, 0 stopped, 0 zombie
Cpu0 : 0.0%us, 0.0%sy, 0.0%ni, 48.0%id, 0.0%wa, 2.3%hi, 49.7%si, 0.0%st
Cpu1 : 0.0%us, 0.0%sy, 0.0%ni, 79.9%id, 0.0%wa, 2.6%hi, 17.5%si, 0.0%st
Cpu2 : 2.3%us, 3.0%sy, 0.0%ni, 63.6%id, 0.0%wa, 0.0%hi, 31.1%si, 0.0%st
Cpu3 : 4.0%us, 3.6%sy, 0.0%ni, 63.4%id, 0.0%wa, 0.0%hi, 29.0%si, 0.0%st
<removed>
5703 admin 15 0 0 0 0 R 31 0.0 2857:49 fw_worker_1
5702 admin 15 0 0 0 0 R 29 0.0 2704:07 fw_worker_0
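The observation that `si` tracks the fw_worker %CPU can be checked by joining the affinity map with the `top` figures. This is a small sketch using the numbers posted in this thread; it assumes that process fw_worker_N corresponds to kernel instance fw_N in the `fw ctl affinity -l` output, and the helper name is hypothetical.

```python
# From `fw ctl affinity -l`: Kernel fw_0 -> CPU 3, Kernel fw_1 -> CPU 2.
# Assumption: process fw_worker_N belongs to kernel instance fw_N.
AFFINITY = {"fw_worker_0": 3, "fw_worker_1": 2}

# %si per CPU and %CPU per worker process, from the `top` output above.
SI_PERCENT = {0: 49.7, 1: 17.5, 2: 31.1, 3: 29.0}
WORKER_CPU_PERCENT = {"fw_worker_1": 31, "fw_worker_0": 29}


def compare(affinity, si, worker_pct):
    """Return (worker, cpu, worker %CPU, cpu %si, absolute difference) rows."""
    rows = []
    for worker, cpu in sorted(affinity.items()):
        diff = abs(worker_pct[worker] - si[cpu])
        rows.append((worker, cpu, worker_pct[worker], si[cpu], diff))
    return rows


if __name__ == "__main__":
    for worker, cpu, w_pct, si_pct, diff in compare(AFFINITY, SI_PERCENT, WORKER_CPU_PERCENT):
        print(f"{worker} on CPU{cpu}: %CPU={w_pct} %si={si_pct} diff={diff:.1f}")
```

For both workers the difference is well under one percentage point, which is consistent with nearly all of the fw_worker time being accounted as softirq time on its core.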
Only TCP and UDP connections can be accelerated so that transiting VPN traffic will not be accelerated and will go F2F.
The fw_worker processes are just user-space representations for the Firewall Worker core instances, but I can't figure out what the heck they could be doing with so much of your traffic being accelerated and templated. As mentioned earlier try poking around with cpview and look at the CoreXL instance statistics under Advanced.
If I read correctly, you have bond1 (eth0+eth2) bound to CPU#1 and bond2 (eth1+eth3) bound to CPU#0. Yet according to the IRQ table, bond2 is assigned to CPU#0 only, while bond1 is assigned to both CPU#0 and CPU#1. Am I missing something obvious here, or is this not normal?
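The comparison in the question above can be made concrete by computing each NIC queue's share of interrupts per CPU. This is a rough sketch: the sample is the `/proc/interrupts` output from earlier in the thread, trimmed to the TxRx queues and to CPU0/CPU1 (all other CPU columns were zero), and the function names are illustrative.

```python
# Trimmed copy of the /proc/interrupts rows posted earlier (CPU2..CPU11 were all 0).
SAMPLE = """\
           CPU0       CPU1
 76:       7508      98507   PCI-MSI-X  eth4-TxRx-0
 91:  840148957          0   PCI-MSI-X  eth3-TxRx-0
115:     541398  365993378   PCI-MSI-X  eth2-TxRx-0
139:  872685073          0   PCI-MSI-X  eth1-TxRx-0
163:     572345  312908224   PCI-MSI-X  eth0-TxRx-0
227:   25721946     245191   PCI-MSI-X  eth6-TxRx-0
"""


def parse_interrupts(text):
    """Parse /proc/interrupts-style text into {irq_name: {cpu: count}}."""
    lines = text.strip("\n").splitlines()
    cpus = lines[0].split()
    table = {}
    for line in lines[1:]:
        fields = line.split()
        if not fields or not fields[0].endswith(":"):
            continue
        counts = fields[1:1 + len(cpus)]
        # Skip rows (for example ERR or MIS) that lack one counter per CPU.
        if len(counts) < len(cpus) or not all(c.isdigit() for c in counts):
            continue
        table[fields[-1]] = dict(zip(cpus, map(int, counts)))
    return table


def cpu_share(table):
    """Convert raw counts into a percentage share per CPU for each IRQ."""
    shares = {}
    for name, counts in table.items():
        total = sum(counts.values())
        shares[name] = {cpu: (100.0 * n / total if total else 0.0)
                        for cpu, n in counts.items()}
    return shares


def dominant_cpu(table):
    """Return the CPU that handled the most interrupts for each IRQ."""
    return {name: max(counts, key=counts.get) for name, counts in table.items()}


if __name__ == "__main__":
    for name, share in sorted(cpu_share(parse_interrupts(SAMPLE)).items()):
        print(name, {cpu: f"{pct:.2f}%" for cpu, pct in share.items()})
```

These counters are cumulative since boot, so the small CPU0 count for eth0 and eth2 (well under 1% of their totals) does not by itself show that those queues are currently serviced on CPU0. The dominant CPU per queue matches the `sim affinity -l` listing.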
bond0 = eth0 and eth2
bond1 = eth1 and eth3
bond2 = eth4 and eth6
bond1 is the only interface with vlans on it. The others are not tagged.
I have not specifically tied any interfaces to a core via sim affinity commands. I simply have 2 cores assigned to SND and 2 cores assigned to fw_workers (CoreXL). The choice of SND core for each interface happens automatically.
I see what you are saying in the /proc/interrupts table. For eth0, for example, when I refresh it, I only see CPU1 incrementing. Are the SND cores dynamically allocated to different interfaces based on interface/cpu load? If so, there may be slow times where everything dials back to CPU0? (Maybe Timothy Hall would know the answer to that one).
It does seem that CPU#0 is processing more interrupts than CPU#1. It would indeed be interesting to hear Tim's comments on this.
If traffic levels drop low enough, Automatic SIM Affinity will move all SoftIRQ processing onto Core 0. This can be easily observed on a standby cluster member that is not moving any traffic, or a firewall deployed in a lab environment with very little traffic passing through it. The examination and potential reassignment of SoftIRQ handling to a less loaded SND/IRQ core happens every 60 seconds, so there could be brief periods where CPU 0 might get overwhelmed by a sudden traffic burst. As long as RX-DRPs are below 0.1% it is not a cause for concern, and it is normal to see slightly higher interrupt numbers on CPU 0. Since about R77 or so Automatic SIM Affinity seems to work pretty well and I haven't had to set up any manual interface affinities.
Phillip Runner Please provide output of netstat -ni so we can see if you have too many RX-DRPs.
It looks like we are still well below 0.1%:
]# netstat -ni
Kernel Interface table
Iface MTU Met RX-OK RX-ERR RX-DRP RX-OVR TX-OK TX-ERR TX-DRP TX-OVR Flg
bond0 1500 0 7859487232 0 0 0 7733430999 0 0 0 BMmRU
bond1 1500 0 40051486705 0 26777 0 40054256168 0 0 0 BMmRU
<removing 46 vlan interfaces (none of which have RX-DRP)>
eth0 1500 0 3416428494 0 0 0 3546523603 0 0 0 BMsRU
eth1 1500 0 20280097009 0 12369 0 19922028133 0 0 0 BMsRU
eth2 1500 0 4443061045 0 0 0 4186909631 0 0 0 BMsRU
eth3 1500 0 19771400863 0 14408 0 20132239246 0 0 0 BMsRU
eth4 1500 0 539299 0 0 0 590542 0 0 0 BMsRU
eth6 1500 0 20763324 0 0 0 512879905 0 0 0 BMsRU
lo 16436 0 547800 0 0 0 547800 0 0 0 LRU
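The 0.1% guideline can be checked directly against the counters in the table above. The sketch below assumes the ratio is defined as RX-DRP divided by RX-OK (the thread does not spell out the denominator); the sample holds only the bond and slave rows relevant here, and the names are illustrative.

```python
# Rows copied from the `netstat -ni` output above (subset).
SAMPLE = """\
Iface MTU Met RX-OK RX-ERR RX-DRP RX-OVR TX-OK TX-ERR TX-DRP TX-OVR Flg
bond0 1500 0 7859487232 0 0 0 7733430999 0 0 0 BMmRU
bond1 1500 0 40051486705 0 26777 0 40054256168 0 0 0 BMmRU
eth1 1500 0 20280097009 0 12369 0 19922028133 0 0 0 BMsRU
eth3 1500 0 19771400863 0 14408 0 20132239246 0 0 0 BMsRU
"""

THRESHOLD_PCT = 0.1  # guideline quoted in the thread


def rx_drop_percent(text):
    """Return {interface: RX-DRP as a percentage of RX-OK} (assumed definition)."""
    lines = [line for line in text.splitlines() if line.strip()]
    header = lines[0].split()
    ok_idx = header.index("RX-OK")
    drp_idx = header.index("RX-DRP")
    result = {}
    for line in lines[1:]:
        fields = line.split()
        if len(fields) != len(header):
            continue  # skip rows that do not match the header layout
        rx_ok = int(fields[ok_idx])
        rx_drp = int(fields[drp_idx])
        result[fields[0]] = 100.0 * rx_drp / rx_ok if rx_ok else 0.0
    return result


if __name__ == "__main__":
    for iface, pct in rx_drop_percent(SAMPLE).items():
        status = "OK" if pct < THRESHOLD_PCT else "CHECK"
        print(f"{iface:<6} RX-DRP {pct:.6f}%  {status}")
```

On these figures every interface is several orders of magnitude below the threshold; bond1, the busiest, comes out at roughly 0.00007%.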
I guess I'll live with it 🙂
I just thought it was so strange that the fw_workers were doing so much based on how much was accelerated, and that it was all categorized as si in top.
Also interesting:
# mpstat -P ALL 2
16:44:24 CPU %user %nice %sys %iowait %irq %soft %steal %idle intr/s
16:44:26 all 0.42 0.00 0.33 0.00 0.33 15.46 0.00 83.46 78445.50
16:44:26 0 0.00 0.00 0.00 0.00 2.50 32.50 0.00 65.00 52825.50
16:44:26 1 0.00 0.00 0.00 0.00 1.00 12.00 0.00 87.00 25628.50
16:44:26 2 2.50 0.00 2.00 0.00 0.00 71.00 0.00 24.50 0.00
16:44:26 3 2.50 0.00 2.50 0.00 0.00 70.00 0.00 25.00 0.00
16:44:26 4 0.00 0.00 0.00 0.00 0.00 0.00 0.00 100.00 0.00
16:44:26 5 0.00 0.00 0.00 0.00 0.00 0.00 0.00 100.00 0.00
16:44:26 6 0.00 0.00 0.00 0.00 0.00 0.00 0.00 100.00 0.00
16:44:26 7 0.00 0.00 0.00 0.00 0.00 0.00 0.00 100.00 0.00
16:44:26 8 0.00 0.00 0.00 0.00 0.00 0.00 0.00 100.00 0.00
16:44:26 9 0.00 0.00 0.00 0.00 0.00 0.00 0.00 100.00 0.00
16:44:26 10 0.00 0.00 0.00 0.00 0.00 0.00 0.00 100.00 0.00
16:44:26 11 0.00 0.00 0.00 0.00 0.00 0.00 0.00 100.00 0.0
Only the SND cores 0 and 1 have intr/s. However, cores 2 and 3 have a high %soft value...
I'll see if I can turn on priority queues / heavy connection evaluations (they are off by default in R80.10) and then maybe I can see what the CPU top-connections are in cpview. Maybe that will provide a clue.
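The pattern noted above (softirq time on cores that take no hardware interrupts) can be picked out of `mpstat` output mechanically. This is a minimal sketch over a trimmed copy of the sample posted above; the 10% cut-off is an arbitrary illustrative value and the function names are hypothetical.

```python
# Trimmed copy of the `mpstat -P ALL 2` sample above (idle cores 5-11 omitted).
SAMPLE = """\
16:44:24 CPU %user %nice %sys %iowait %irq %soft %steal %idle intr/s
16:44:26 all 0.42 0.00 0.33 0.00 0.33 15.46 0.00 83.46 78445.50
16:44:26 0 0.00 0.00 0.00 0.00 2.50 32.50 0.00 65.00 52825.50
16:44:26 1 0.00 0.00 0.00 0.00 1.00 12.00 0.00 87.00 25628.50
16:44:26 2 2.50 0.00 2.00 0.00 0.00 71.00 0.00 24.50 0.00
16:44:26 3 2.50 0.00 2.50 0.00 0.00 70.00 0.00 25.00 0.00
16:44:26 4 0.00 0.00 0.00 0.00 0.00 0.00 0.00 100.00 0.00
"""


def parse_mpstat(text):
    """Parse mpstat rows into {cpu_label: {column: float}}."""
    lines = [line.split() for line in text.splitlines() if line.strip()]
    header = lines[0][1:]  # drop the timestamp column; header[0] is "CPU"
    rows = {}
    for fields in lines[1:]:
        values = fields[1:]
        if len(values) != len(header):
            continue
        rows[values[0]] = {name: float(v) for name, v in zip(header[1:], values[1:])}
    return rows


def soft_without_interrupts(rows, min_soft=10.0):
    """List CPUs with %soft >= min_soft but no hardware interrupts (intr/s == 0)."""
    return [cpu for cpu, r in rows.items()
            if cpu != "all" and r["%soft"] >= min_soft and r["intr/s"] == 0.0]


if __name__ == "__main__":
    print(soft_without_interrupts(parse_mpstat(SAMPLE)))
```

On this sample the check flags CPUs 2 and 3, the two firewall worker cores. It only identifies the pattern; it says nothing about which kernel work is being accounted as softirq time there.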
Yeah heavy conns might shed some light, or poking around in the CoreXL worker instances under Advanced in cpview.
I think the key lies exactly in what you say, Phillip. With that much traffic accelerated, it must be something that happens well before packets go for processing by the firewall module.
Have you tried (if possible) disconnecting one of the bonded links to see if that makes any difference to si? Or why not even disable SXL for a brief time...
Sometimes just playing around with setup helps to narrow down where the problem comes from.
We may be able to try a couple of these things during a maintenance window. They are production systems, so I can't just pull the trigger on those.
Hi phlrnnr,
Did you solve the issue?