R81.20 High CPU being reported by Solarwinds and TOP
Hello,
I ran into an issue where the active and standby firewalls both show CPU usage at 155% in TOP and Solarwinds. The process using the most CPU is usim_x86. From what I have gathered, this is SecureXL. I turned off SecureXL on the standby unit and it still runs at 100%, no matter how long I wait.
After reviewing this further and talking to support, it seems this happens when SecureXL operates in User Space Mode. From what I have gathered and seen so far, the only fix is to use a different monitoring method or to make sure the cluster uses kernel mode instead.
We have 2 clusters running R81.20 and both clusters have this happening.
Has anyone else seen this behavior? If so, were you able to resolve it in a different way?
Thanks!!
B
This may actually be normal for UPPAK. How many SNDs do you have in your system?
fwaccel off only disables new template creation; it does not disable SecureXL entirely.
As others have said, this is normal with USFW, as documented in sk180299, and does not indicate a problem.
Thanks for the replies. I am not saying this indicates a problem.
To me, it causes an issue with monitoring. When using Solarwinds for monitoring, it is going to show these 2 CPUs at a constant 100%.
Also, does this mean that I should ignore what TOP shows for these two CPUs? Should someone only pay attention to CPVIEW when reviewing a high CPU issue?
cpview is the CLI tool to use. The SK also provides the revised SNMP OIDs, which presumably you can update Solarwinds to use.
Thanks for the reply.
So it sounds like any firewall I upgrade to R81.20, I have to make sure Solarwinds is using this SNMP OID?
THANKS
The answer to your last question is yes. That's your best bet and, as others have said, this is 100% normal. Keep in mind that kernel mode is limited to 40 physical cores, while user mode does not have that limitation. I also rely on "old school" Linux commands for things like this, such as top and ps -auxw.
Hope that helps.
Andy
Thanks for the reply. Unfortunately, with TOP you are not getting the actual number though, are you? At least that is my thought. Are the CPUs actually running 100%?
Thanks
That's why you should use cpview.
Andy
Unfortunately, CPVIEW is not what we use for monitoring and reporting.
That's fine, but it gives very accurate results.
Andy
@70a89920-abf3-4 wrote:
Are the CPUs actually running 100%?
The answer to this gets into how preemptive multitasking works and how utilization is measured. Most processors can't actually be idle. Sometimes the OS kernel can shut off cores it isn't using to save power and heat, and sometimes it can adjust the clock speed, but a processor core which is on and not in a debugging mode is always running an instruction.
Preemptive multitasking (how most operating-system-level multitasking works today) works by splitting instructions up into chunks, then a piece of code called the scheduler determines which chunks will run where during a given time slice. The scheduler also has something called an idle loop which it schedules when there is no work to be done on a given core in a given time slice. The scheduler reports the processor's utilization to top and other tools based on how many time slices in the last second were spent running work chunks versus running the idle loop.
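To make the measurement concrete, here is a minimal Python sketch of how a tool in the style of top can turn two samples of the per-core counters in Linux's /proc/stat into a busy percentage. The sample lines and core numbers are made up for illustration, and this is only an approximation of what top actually does.

```python
from typing import Dict

# Order of the first eight counters on a per-core "cpuN" line in /proc/stat.
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")


def parse_cpu_line(line: str) -> Dict[str, int]:
    """Turn a 'cpuN ...' line into a dict of tick counters."""
    parts = line.split()
    return dict(zip(FIELDS, (int(v) for v in parts[1:1 + len(FIELDS)])))


def busy_percent(before: Dict[str, int], after: Dict[str, int]) -> float:
    """Share of ticks between two samples that were not spent idle."""
    deltas = {name: after[name] - before[name] for name in FIELDS}
    total = sum(deltas.values())
    if total == 0:
        return 0.0
    idle = deltas["idle"] + deltas["iowait"]
    return 100.0 * (total - idle) / total


# Hypothetical samples: a core dedicated to a polling user-space process
# accumulates only "user" ticks, so it never shows any idle time.
polling_before = parse_cpu_line("cpu2 1000 0 50 10 0 0 0 0")
polling_after = parse_cpu_line("cpu2 1100 0 50 10 0 0 0 0")

# Hypothetical samples: an ordinary core that was mostly idle.
normal_before = parse_cpu_line("cpu5 1000 0 50 9000 0 0 0 0")
normal_after = parse_cpu_line("cpu5 1010 0 55 9085 0 0 0 0")

print(busy_percent(polling_before, polling_after))  # 100.0
print(busy_percent(normal_before, normal_after))    # 15.0
```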
User-mode networking involves running essentially a whole other operating system kernel with its own scheduler in a VM. This normally increases the latency of scheduling decisions, since the inner scheduler has to make the decision, then request time from the outer scheduler, which has to make its own decision. To avoid this scheduler-within-a-scheduler problem, it's common to have the outer OS kernel dedicate particular cores to the inner user-mode kernel's scheduler. From the perspective of the outer OS kernel's scheduler, it's scheduling work for the user 100% of the time slices on the specific cores, and never scheduling the idle loop. This is what top reports.
cpview is able to get information from the user-mode kernel's scheduler, which more accurately reflects the processor time capacity available for handling traffic. Both tools are correct, they're just asking different parts of the system.
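The difference between the two views can be shown with a toy model in Python. This is only a sketch under simplifying assumptions: it is not how usim_x86 or cpview actually account for time, and the function name and traffic pattern are hypothetical. It models a loop that polls for packets on every cycle without ever sleeping, then reports both what the outer scheduler would see and what the loop itself knows.

```python
from typing import List, Tuple


def poll_loop_stats(packets_per_poll: List[int]) -> Tuple[float, float]:
    """Return (outer OS view, inner view) of utilization in percent.

    Each list entry is the number of packets found on one poll cycle.
    """
    total_polls = 0
    useful_polls = 0
    for packets in packets_per_poll:
        total_polls += 1          # the loop spins whether or not anything arrived
        if packets > 0:
            useful_polls += 1     # only these cycles did real forwarding work
    if total_polls == 0:
        return 0.0, 0.0
    # The outer scheduler never gets to run its idle loop on this core,
    # so every time slice counts as busy.
    os_view = 100.0
    # The polling loop knows how many cycles actually handled traffic.
    inner_view = 100.0 * useful_polls / total_polls
    return os_view, inner_view


# Hypothetical light traffic: packets show up on 3 of 10 polls.
traffic = [0, 0, 3, 0, 1, 0, 0, 0, 2, 0]
os_view, inner_view = poll_loop_stats(traffic)
print(f"outer scheduler sees {os_view:.0f}%, polling loop sees {inner_view:.0f}%")
```

With this traffic pattern the outer view is 100% while the inner view is 30%, which mirrors the gap between top and cpview described above.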
So when it comes to monitoring, has anyone had any luck using the MIB mentioned in sk180299?
I get that this is the norm and this is how it is going to be unless you change modes, at least for this hardware. I am working with our monitoring team to try to get a better reading in Solarwinds.
Just curious to know if anyone has had any luck.
Yes, my customers are using those OIDs/MIBs. The only caveat might be the JHF you are using: is it Take 79 or higher?
PRJ-52416, PRHF-31929 | Gaia OS | SNMP query for OID 1.3.6.1.4.1.2620.1.6.7.5.1.5 (CPU utilization per CPU core) and the "cpstat os -f cpu" command may return an incorrect value. Refer to sk182447.
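For anyone wiring the per-core OID above into a custom check, here is a rough Python sketch that parses text in the numeric "OID = TYPE: value" style that net-snmp tools print. The sample output, the Gauge32 type, and the instance suffixes are assumptions made up for illustration; check what your gateway actually returns, and keep in mind the note above that this OID may return an incorrect value on affected takes.

```python
import re
from typing import Dict

# OID quoted in the entry above (CPU utilization per CPU core).
PER_CORE_CPU_OID = "1.3.6.1.4.1.2620.1.6.7.5.1.5"

# Matches lines such as ".1.3.6.1... = Gauge32: 12" (format assumed).
LINE_RE = re.compile(r"^\.?(?P<oid>[0-9.]+)\s*=\s*\w+:\s*(?P<value>\d+)\s*$")


def per_core_usage(walk_output: str) -> Dict[str, int]:
    """Map each instance suffix under the per-core OID to its reported value."""
    prefix = PER_CORE_CPU_OID + "."
    usage: Dict[str, int] = {}
    for line in walk_output.splitlines():
        match = LINE_RE.match(line.strip())
        if not match or not match.group("oid").startswith(prefix):
            continue  # skip other OIDs and anything that does not parse
        suffix = match.group("oid")[len(prefix):]
        usage[suffix] = int(match.group("value"))
    return usage


# Hypothetical walk output, not captured from a real gateway.
SAMPLE = """\
.1.3.6.1.4.1.2620.1.6.7.5.1.5.1.0 = Gauge32: 12
.1.3.6.1.4.1.2620.1.6.7.5.1.5.2.0 = Gauge32: 7
.1.3.6.1.4.1.2620.1.6.7.5.1.4.1.0 = Gauge32: 99
"""

usage = per_core_usage(SAMPLE)
average = sum(usage.values()) / len(usage)
print(usage, average)
```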