Martijn
Collaborator

SNMP CPU load on VSX

Hi All,

We have a customer running VSX who uses SNMP to monitor the load on the virtual systems.

He configured CPU monitoring per VS, and below is a graph of the CPU load on a virtual system with only one firewall instance.

All virtual systems can use all the CPUs on the appliance (except the SND CPUs).

The customer has the feeling that the CPU load never rises above 15%, even when the firewall is under load during a nightly backup job going through this gateway.

Does 15% in SNMP correspond to 100% load on the gateway? How is this calculated? The virtual system does not have a real CPU, just a firewall instance attached. Is the percentage in SNMP the real load on a CPU?

And what about virtual systems with multiple firewall instances? Is the load in SNMP the average of the load on all CPUs for that virtual system?

Regards,

Martijn.

CPU Graph

0 Kudos
9 Replies
Peter_Sandkuijl
Employee

Hi Martijn,

While I don't have the complete answer, here is some food for thought: which SNMP monitoring tool are you using?

Not all tools handle the information provided over SNMPv3 well.

My notes show the following (things may have improved since; this is from the end of July):

CA eHealth: no SNMPv3 support

PRTG: cannot handle tables, so it is not good for visualizing the asg tables (when using Scalable Platforms)

Zabbix:
- has SNMPv3 support
- context aware for VSX
- can handle tables

Icinga/Nagios: can do everything and is free, but you need someone who is able to handle it

ManageEngine: can handle everything, but privacy is not OK

SolarWinds and HP: can handle everything. Can even make coffee, pizza, etc. 🙂

In general, load is shown relative to the whole system, though this may differ depending on the OID queried. Each VS process can address a core. With a single instance that means only one core at a time, though the scheduler may move it to another core if that one is less loaded. Core assignment is a real-time process in the Linux kernel.

BR

Peter !!

0 Kudos
genisis__
Advisor

Guys,

I need some clarification on which OIDs represent CPU utilisation at the virtual system or virtual switch level.

Based on some feedback in the forum and what I have researched, it looks like the only OID I require is "fw instancescpu/fw instances cpu total".

In PRTG I have added this per virtual system and then created a graph which combines them all into one view. The only other thing I need to work on is adding the total number of CPUs so there is a max value.

The idea is to have an upper limit value, the total number of worker cores available, to compare against what is being used.

I've had a look but I can't find an OID that represents the total worker core count.

 

CPU utilisation for VSs

0 Kudos
Henrik_Noerr1
Contributor

Hey,

 

I think I answered this in another of your replies? You should use CHECKPOINT-MIB::fwInstancesCPUTable.

This gives you the CPU load per fwk thread. It also helps with the max value, which will always be 100%.

With the approach you show above, you will be hard pressed to show it intuitively for VSs that have different numbers of CoreXL instances.
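For anyone who wants to look at the raw table before wiring it into a monitoring tool, here is a minimal sketch that only builds a Net-SNMP `snmpwalk` command line for `CHECKPOINT-MIB::fwInstancesCPUTable`. Everything except the table name is an assumption: the helper name `build_snmpwalk_argv`, the SHA/AES protocol choice, the host, the credentials, and the context string are placeholders, and it presumes the Check Point MIB file is installed on the polling host.

```python
import shlex


def build_snmpwalk_argv(host, user, auth_pass, priv_pass, context=None):
    """Build (but do not run) an SNMPv3 snmpwalk command for the per-thread CPU table.

    Hypothetical helper: the auth/priv protocols below are assumptions and
    must match whatever is configured on the gateway.
    """
    argv = [
        "snmpwalk", "-v3",
        "-l", "authPriv",
        "-u", user,
        "-a", "SHA", "-A", auth_pass,
        "-x", "AES", "-X", priv_pass,
        "-m", "+CHECKPOINT-MIB",  # load the MIB in addition to the defaults
    ]
    if context:
        # SNMPv3 context name; the value the gateway expects per VS is not
        # stated in this thread, so it is left to the caller.
        argv += ["-n", context]
    argv += [host, "CHECKPOINT-MIB::fwInstancesCPUTable"]
    return argv


if __name__ == "__main__":
    # Placeholder values only.
    cmd = build_snmpwalk_argv("192.0.2.10", "monitor", "authsecret", "privsecret",
                              context="example-context")
    print(shlex.join(cmd))
```

Printing the command rather than running it keeps the sketch safe to try anywhere. Paste the output into a shell on the poller once the placeholders are replaced.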

 

/Henrik

0 Kudos
genisis__
Advisor

I found SK170756 yesterday, "How to monitor CPU usage per VS via SNMP in Gaia Kernel 3.10", which says to use 'fwInstancesCPU', which is basically what I'm using. In PRTG it's not a separate value ('fw instancescpu/fw instances cpu total'), so I'm not quite sure how to get the maximum worker cores value.

I only see this value in the R81 MIB file 'Checkpoint-R81-MIB' and not in the 'Checkpoint-MIB' file, which is strange.

Capture.PNG

0 Kudos
Magnus-Holmberg
Advisor

Have you tested that this actually works? Previously the OID did not take into account how many instances were allocated to a VS; instead it used an average of all the CPU cores on the box in its calculation.
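To illustrate why box-wide averaging matters, here is a small arithmetic sketch. It assumes the reported value is simply the sum of the instance loads divided by the total core count, which is my reading of the behaviour described above and not a confirmed formula. The function name `box_wide_average` and the 8-core figure are made up for the example.

```python
def box_wide_average(instance_loads, total_cores):
    """Average per-instance loads (0-100 each) over every core on the box.

    Assumed model only: idle cores count as 0% and pull the average down.
    """
    if total_cores <= 0:
        raise ValueError("total_cores must be positive")
    if len(instance_loads) > total_cores:
        raise ValueError("more busy instances than cores")
    return sum(instance_loads) / total_cores


if __name__ == "__main__":
    # One fully saturated firewall instance on a hypothetical 8-core appliance.
    print(box_wide_average([100], 8))
```

Under that assumption, a VS with a single instance could never report more than 100 divided by the core count, so a low ceiling on the graph would not mean the instance has headroom.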

https://www.youtube.com/c/MagnusHolmberg-NetSec
0 Kudos
genisis__
Advisor

It seems to be working.

I added the OID per VS and then created a chart which combines them all in one graph. I also compared it with the top output and it seems about right.

All I'm missing now is an OID to get the total number of worker cores, so I can add this to the graph as an upper limit value.

0 Kudos
Henrik_Noerr1
Contributor

There is no such OID. But fwInstancesCPU gives you the number.

When polling fwInstancesCPUTable you get the following: the vdName, the threads (look for the fwkX_Y entries, which give you the number of CoreXL instances) and the CPU load for these threads.

See the data below from my own environment, showing VS 4 with 6 cores.

 

> fwInstancesCPUTable,agent_host=10.200.11.4,fwInstancesCPUInstanceName=fwk4_0,host=<pollerhost>,hostname=<secret firewall>,vdName=<secret vs> fwInstancesCPUUsage=6i 1625604524000000000
> fwInstancesCPUTable,agent_host=10.200.11.4,fwInstancesCPUInstanceName=fwk4_5,host=<pollerhost>,hostname=<secret firewall>,vdName=<secret vs> fwInstancesCPUUsage=14i 1625604524000000000
> fwInstancesCPUTable,agent_host=10.200.11.4,fwInstancesCPUInstanceName=fwk4_3,host=<pollerhost>,hostname=<secret firewall>,vdName=<secret vs> fwInstancesCPUUsage=5i 1625604524000000000
> fwInstancesCPUTable,agent_host=10.200.11.4,fwInstancesCPUInstanceName=fwk4_dev_1,host=<pollerhost>,hostname=<secret firewall>,vdName=<secret vs> fwInstancesCPUUsage=0i 1625604524000000000
> fwInstancesCPUTable,agent_host=10.200.11.4,fwInstancesCPUInstanceName=fwk4_kissd,host=<pollerhost>,hostname=<secret firewall>,vdName=<secret vs> fwInstancesCPUUsage=0i 1625604524000000000
> fwInstancesCPUTable,agent_host=10.200.11.4,fwInstancesCPUInstanceName=fwk4_1,host=<pollerhost>,hostname=<secret firewall>,vdName=<secret vs> fwInstancesCPUUsage=7i 1625604524000000000
> fwInstancesCPUTable,agent_host=10.200.11.4,fwInstancesCPUInstanceName=fwk4_2,host=<pollerhost>,hostname=<secret firewall>,vdName=<secret vs> fwInstancesCPUUsage=3i 1625604524000000000
> fwInstancesCPUTable,agent_host=10.200.11.4,fwInstancesCPUInstanceName=fwk4_hp,host=<pollerhost>,hostname=<secret firewall>,vdName=<secret vs> fwInstancesCPUUsage=0i 1625604524000000000
> fwInstancesCPUTable,agent_host=10.200.11.4,fwInstancesCPUInstanceName=fwk4_4,host=<pollerhost>,hostname=<secret firewall>,vdName=<secret vs> fwInstancesCPUUsage=38i 1625604524000000000
> fwInstancesCPUTable,agent_host=10.200.11.4,fwInstancesCPUInstanceName=fwk4_dev_5,host=<pollerhost>,hostname=<secret firewall>,vdName=<secret vs> fwInstancesCPUUsage=0i 1625604524000000000
> fwInstancesCPUTable,agent_host=10.200.11.4,fwInstancesCPUInstanceName=fwk4_dev_4,host=<pollerhost>,hostname=<secret firewall>,vdName=<secret vs> fwInstancesCPUUsage=0i 1625604524000000000
> fwInstancesCPUTable,agent_host=10.200.11.4,fwInstancesCPUInstanceName=fwk4_dev_3,host=<pollerhost>,hostname=<secret firewall>,vdName=<secret vs> fwInstancesCPUUsage=0i 1625604524000000000
> fwInstancesCPUTable,agent_host=10.200.11.4,fwInstancesCPUInstanceName=fwk4_dev_0,host=<pollerhost>,hostname=<secret firewall>,vdName=<secret vs> fwInstancesCPUUsage=0i 1625604524000000000
> fwInstancesCPUTable,agent_host=10.200.11.4,fwInstancesCPUInstanceName=fwk4_service,host=<pollerhost>,hostname=<secret firewall>,vdName=<secret vs> fwInstancesCPUUsage=0i 1625604524000000000
> fwInstancesCPUTable,agent_host=10.200.11.4,fwInstancesCPUInstanceName=fwk4_dev_2,host=<pollerhost>,hostname=<secret firewall>,vdName=<secret vs> fwInstancesCPUUsage=0i 1625604524000000000

 

The above translates to this graph, with no need to know how many CoreXL instances are assigned, and with the added benefit of seeing individual threads spiking.
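Counting the instances from that output can be sketched as below. It assumes the worker threads are exactly the names of the form `fwk<vsid>_<number>` (so `fwk4_dev_1`, `fwk4_kissd`, `fwk4_hp` and `fwk4_service` are skipped), and that each line carries the `fwInstancesCPUInstanceName` and `fwInstancesCPUUsage` fields as shown above. The function name `worker_usage` and the shortened sample are mine; the usage values are copied from the data above.

```python
import re
from collections import defaultdict

NAME_RE = re.compile(r"fwInstancesCPUInstanceName=([^,\s]+)")
USAGE_RE = re.compile(r"fwInstancesCPUUsage=(\d+)i")
WORKER_RE = re.compile(r"fwk(\d+)_(\d+)")  # e.g. fwk4_0, but not fwk4_dev_0

# Subset of the poller output above, placeholders kept as posted.
SAMPLE = """\
> fwInstancesCPUTable,agent_host=10.200.11.4,fwInstancesCPUInstanceName=fwk4_0,host=<pollerhost>,hostname=<secret firewall>,vdName=<secret vs> fwInstancesCPUUsage=6i 1625604524000000000
> fwInstancesCPUTable,agent_host=10.200.11.4,fwInstancesCPUInstanceName=fwk4_5,host=<pollerhost>,hostname=<secret firewall>,vdName=<secret vs> fwInstancesCPUUsage=14i 1625604524000000000
> fwInstancesCPUTable,agent_host=10.200.11.4,fwInstancesCPUInstanceName=fwk4_3,host=<pollerhost>,hostname=<secret firewall>,vdName=<secret vs> fwInstancesCPUUsage=5i 1625604524000000000
> fwInstancesCPUTable,agent_host=10.200.11.4,fwInstancesCPUInstanceName=fwk4_dev_1,host=<pollerhost>,hostname=<secret firewall>,vdName=<secret vs> fwInstancesCPUUsage=0i 1625604524000000000
> fwInstancesCPUTable,agent_host=10.200.11.4,fwInstancesCPUInstanceName=fwk4_kissd,host=<pollerhost>,hostname=<secret firewall>,vdName=<secret vs> fwInstancesCPUUsage=0i 1625604524000000000
> fwInstancesCPUTable,agent_host=10.200.11.4,fwInstancesCPUInstanceName=fwk4_1,host=<pollerhost>,hostname=<secret firewall>,vdName=<secret vs> fwInstancesCPUUsage=7i 1625604524000000000
> fwInstancesCPUTable,agent_host=10.200.11.4,fwInstancesCPUInstanceName=fwk4_2,host=<pollerhost>,hostname=<secret firewall>,vdName=<secret vs> fwInstancesCPUUsage=3i 1625604524000000000
> fwInstancesCPUTable,agent_host=10.200.11.4,fwInstancesCPUInstanceName=fwk4_4,host=<pollerhost>,hostname=<secret firewall>,vdName=<secret vs> fwInstancesCPUUsage=38i 1625604524000000000
"""


def worker_usage(lines):
    """Return {vsid: {instance_number: usage_percent}} for worker threads only."""
    per_vs = defaultdict(dict)
    for line in lines:
        name = NAME_RE.search(line)
        usage = USAGE_RE.search(line)
        if not (name and usage):
            continue  # not a table row we recognise
        worker = WORKER_RE.fullmatch(name.group(1))
        if worker is None:
            continue  # helper thread such as fwk4_dev_1 or fwk4_kissd
        vsid, instance = int(worker.group(1)), int(worker.group(2))
        per_vs[vsid][instance] = int(usage.group(1))
    return {vsid: dict(sorted(inst.items())) for vsid, inst in per_vs.items()}


if __name__ == "__main__":
    for vsid, inst in worker_usage(SAMPLE.splitlines()).items():
        loads = list(inst.values())
        print(f"VS {vsid}: {len(inst)} instances, "
              f"peak {max(loads)}%, mean {sum(loads) / len(loads):.1f}%")
```

For the sample this finds six worker instances on VS 4, matching the count given above. Note that this is a per-VS instance count, not an appliance-wide worker core total, since instances from different VSs may share cores.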

vs4.PNG

 

0 Kudos
genisis__
Advisor

Great!

It could be down to how PRTG reads the information. Equally, though, I would like a simple total value for worker cores. The above is great for breaking it down at the VS level (and that is something I would like to do as well), but I also want it at the appliance level. For example, in cpview I can see how many worker cores and SND cores are allocated.

0 Kudos
genisis__
Advisor

Had a look but still not seeing this value:

Capture.PNG

 

We can see from the descriptions that they don't really expose the number of cores allocated as a single value.

0 Kudos