Marc0523
Participant

PRTG giving false memory readings

We use PRTG for monitoring our network, including our firewalls.

We have three clusters of 5100 appliances for our 3 sites.

 

For some reason, PRTG reports that our available physical memory is down to 3%.

SmartConsole (and CPView) show 49% free.

 

I assume the issue is what PRTG is asking our firewalls for, or how it asks, but damned if I know how to fix this.

We are using SNMP to get the information from the firewalls. This works correctly for CPU usage, for example, but not for memory usage.

 

 

Any ideas what I could look at as a fix for this?  


Accepted Solutions
Vincent_Bacher
Advisor
Seems your PRTG is using the standard HOST-RESOURCES-MIB. There are several ways to query the memory usage of CP firewalls.
One of them is Check Point's own MIB.
 
[Image: cp-memory-mib.png]
 

 

snmpwalk -On -v3 -l authPriv -u ***** -a SHA512 -A '*****' -x AES128 -X '*****' x.x.x.x .1.3.6.1.4.1.2620.1.6.7.4
.1.3.6.1.4.1.2620.1.6.7.4.1.0 = Counter64: 100844179456
.1.3.6.1.4.1.2620.1.6.7.4.2.0 = Counter64: 19621507072
.1.3.6.1.4.1.2620.1.6.7.4.3.0 = Counter64: 66484576256
.1.3.6.1.4.1.2620.1.6.7.4.4.0 = Counter64: 19621507072
.1.3.6.1.4.1.2620.1.6.7.4.5.0 = Counter64: 46863069184
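
A minimal sketch of how these values could be turned into a percentage. It assumes that .3, .4 and .5 are total, active and free real memory in bytes (the CHECKPOINT-MIB names memTotalReal64, memActiveReal64 and memFreeReal64 are an assumption here, since the walk above only shows numeric OIDs). The sample values support that reading, because active plus free adds up exactly to the total. The function name is made up for illustration.

```python
# Values copied from the snmpwalk output above (assumed to be bytes).
CP_MEM_BASE = "1.3.6.1.4.1.2620.1.6.7.4"

sample = {
    CP_MEM_BASE + ".3.0": 66484576256,  # assumed: total real memory
    CP_MEM_BASE + ".4.0": 19621507072,  # assumed: active (used) real memory
    CP_MEM_BASE + ".5.0": 46863069184,  # assumed: free real memory
}


def cp_memory_percent_free(values):
    """Return free real memory as a percentage of total real memory."""
    total = values[CP_MEM_BASE + ".3.0"]
    free = values[CP_MEM_BASE + ".5.0"]
    if total <= 0:
        raise ValueError("total real memory must be positive")
    return 100.0 * free / total


pct_free = cp_memory_percent_free(sample)
print(f"{pct_free:.1f}% free")
```

For the sample device this comes out at roughly 70.5% free, which is the kind of figure a custom PRTG sensor built on these OIDs would show.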

 

 

In our Zabbix SNMP monitoring template (hopefully soon to be replaced by Prometheus) we use metrics like vm.memory.free[memAvailReal.0], which is in subtree
1.3.6.1.4.1.2021.4 from a standard Unix MIB (UCD-SNMP-MIB), and with this you also get metrics from a CP device.

 

snmpwalk -On -v3 -l authPriv -u ***** -a SHA512 -A '*****' -x AES128 -X '*****' x.x.x.x 1.3.6.1.4.1.2021.4
.1.3.6.1.4.1.2021.4.1.0 = INTEGER: 0
.1.3.6.1.4.1.2021.4.2.0 = STRING: swap
.1.3.6.1.4.1.2021.4.3.0 = INTEGER: 33554300 kB
.1.3.6.1.4.1.2021.4.4.0 = INTEGER: 33554300 kB
.1.3.6.1.4.1.2021.4.5.0 = INTEGER: 64926344 kB
.1.3.6.1.4.1.2021.4.6.0 = INTEGER: 38869112 kB
.1.3.6.1.4.1.2021.4.11.0 = INTEGER: 72423412 kB
.1.3.6.1.4.1.2021.4.12.0 = INTEGER: 16000 kB
.1.3.6.1.4.1.2021.4.13.0 = INTEGER: 21704 kB
.1.3.6.1.4.1.2021.4.14.0 = INTEGER: 4448 kB
.1.3.6.1.4.1.2021.4.15.0 = INTEGER: 7176164 kB
.1.3.6.1.4.1.2021.4.18.0 = Counter64: 33554300
.1.3.6.1.4.1.2021.4.19.0 = Counter64: 33554300
.1.3.6.1.4.1.2021.4.20.0 = Counter64: 64926344
.1.3.6.1.4.1.2021.4.21.0 = Counter64: 38869112
.1.3.6.1.4.1.2021.4.22.0 = Counter64: 72423412
.1.3.6.1.4.1.2021.4.23.0 = Counter64: 16000
.1.3.6.1.4.1.2021.4.24.0 = Counter64: 21704
.1.3.6.1.4.1.2021.4.25.0 = Counter64: 4448
.1.3.6.1.4.1.2021.4.26.0 = Counter64: 7176164
.1.3.6.1.4.1.2021.4.100.0 = INTEGER: noError(0)
.1.3.6.1.4.1.2021.4.101.0 = STRING:
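
A rough sketch of two ways to read these numbers, using the UCD-SNMP-MIB object names for the OIDs shown (memTotalReal = .5, memAvailReal = .6, memBuffer = .14, memCached = .15). The function names are hypothetical, and which of the two figures tracks cpview more closely was not verified here.

```python
# Values in kB, copied from the snmpwalk output above.
ucd_mem = {
    "memTotalReal": 64926344,  # .1.3.6.1.4.1.2021.4.5.0
    "memAvailReal": 38869112,  # .1.3.6.1.4.1.2021.4.6.0
    "memBuffer": 4448,         # .1.3.6.1.4.1.2021.4.14.0
    "memCached": 7176164,      # .1.3.6.1.4.1.2021.4.15.0
}


def percent_avail_real(mem):
    """memAvailReal as a percentage of memTotalReal."""
    return 100.0 * mem["memAvailReal"] / mem["memTotalReal"]


def percent_avail_plus_cache(mem):
    """Same, but also counting buffers and cache as reclaimable memory."""
    reclaimable = mem["memAvailReal"] + mem["memBuffer"] + mem["memCached"]
    return 100.0 * reclaimable / mem["memTotalReal"]


print(f"memAvailReal only:       {percent_avail_real(ucd_mem):.1f}%")
print(f"plus buffers and cache:  {percent_avail_plus_cache(ucd_mem):.1f}%")
```

With the sample values the first figure is about 59.9% and the second about 70.9%, so the choice of formula alone moves the result by more than ten points.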

 

 

I never compared which of them best matches the metrics from cpview.



and now to something completely different - CCVS, CCAS, CCTE, CCCS, CCSM elite


11 Replies
Vincent_Bacher
Advisor

Which OID are you querying?

Bob_Zimmerman
Authority

This is definitely it. Linux has about ten different ways of measuring "free memory", and all of them are misleading in one way or another.

Totally unused RAM is wasted RAM, so the system keeps a lot of caches and inactive pages around. These count against memory which is free-as-in-totally-unused, since they have data in them. The priority of the pages is very low, so any memory pressure can reclaim them. Take this output from one of my firewalls:

[Expert@SomeVsxFirewall:0 ACTIVE]# free -h
              total        used        free      shared  buff/cache   available
Mem:            16G        5.0G        854M         44M        9.5G        9.5G
Swap:           17G          0B         17G

The "free" column is memory which is totally unused (roughly). This firewall only has 854 MB totally unused, which is about 5%. 9.5 GB of the memory which isn't free is buffers and low-priority cache, most of which can be flushed at a moment's notice to make room for some other process. The "available" column is what most people think of as free memory. Going from that one, I've got 59% of my RAM free.
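
The difference between the two columns can be sketched in a few lines. The sample text below is an illustrative `free -m` style rendering of the output above, with the rounded `-h` values converted to approximate MiB figures, so the numbers are not exact. The helper name `parse_free_mem` is made up for this example.

```python
# Illustrative `free -m` style text; values approximate the -h output above.
FREE_OUTPUT = """\
              total        used        free      shared  buff/cache   available
Mem:          16384        5120         854          44        9728        9728
Swap:         17408           0       17408
"""


def parse_free_mem(text):
    """Return the Mem: row of `free` output as a {column: value} dict."""
    lines = text.strip().splitlines()
    header = lines[0].split()
    for line in lines[1:]:
        parts = line.split()
        if parts and parts[0] == "Mem:":
            return dict(zip(header, (int(p) for p in parts[1:])))
    raise ValueError("no 'Mem:' row found")


mem = parse_free_mem(FREE_OUTPUT)
free_pct = 100.0 * mem["free"] / mem["total"]
available_pct = 100.0 * mem["available"] / mem["total"]
print(f"free: {free_pct:.1f}%  available: {available_pct:.1f}%")
```

A monitoring tool that reports the first figure will show about 5% on this box, while one that reports the second will show about 59%.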

Marc0523
Participant

Exactly. Directly on the firewall, or in SmartConsole, I can get useful metrics on memory, but PRTG is giving me useless information.

Marc0523
Participant

PRTG does not tell you which OID is in use, so I needed to do a packet capture to find out!

 

Object Name: 1.3.6.1.2.1.25.2.3.1.3.1 (iso.3.6.1.2.1.25.2.3.1.3.1)
Object Name: 1.3.6.1.2.1.25.2.3.1.4.1 (iso.3.6.1.2.1.25.2.3.1.4.1)
Object Name: 1.3.6.1.2.1.25.2.3.1.5.1 (iso.3.6.1.2.1.25.2.3.1.5.1)
Object Name: 1.3.6.1.2.1.25.2.3.1.6.1 (iso.3.6.1.2.1.25.2.3.1.6.1)

 

This isn't my skillset, so I am not sure how to read the above. These are all get-request packets.

 

PRTG has 4 "Channels" 

Downtime (ID -4)
Percent Available Memory (ID 0)
Available Memory (ID 1)
Total Memory (ID 2)

I am viewing Percent Available Memory (ID 0).

 

I assume the 4 OIDs are linked to the 4 channels above, but I am not sure how.
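
For reference, a small sketch of how these four OIDs can be read. They sit in the hrStorageTable of HOST-RESOURCES-MIB, where columns 3 to 6 are hrStorageDescr, hrStorageAllocationUnits, hrStorageSize and hrStorageUsed, and the trailing 1 is the row index (typically the physical memory row on a Linux agent). How PRTG maps them to its channels is an assumption: total would plausibly be size times allocation units, and available the same minus used. The numeric values in the example are made up to reproduce a 3% reading and are not taken from the capture.

```python
HR_STORAGE_ENTRY = "1.3.6.1.2.1.25.2.3.1."
HR_STORAGE_COLUMNS = {
    3: "hrStorageDescr",
    4: "hrStorageAllocationUnits",
    5: "hrStorageSize",
    6: "hrStorageUsed",
}


def describe_oid(oid):
    """Split an hrStorageTable OID into (column name, row index)."""
    if not oid.startswith(HR_STORAGE_ENTRY):
        raise ValueError("not an hrStorageTable OID: " + oid)
    column, index = (int(p) for p in oid[len(HR_STORAGE_ENTRY):].split("."))
    return HR_STORAGE_COLUMNS[column], index


def percent_available(allocation_units, size, used):
    """Assumed channel formula: (size - used) / size, both in allocation units."""
    total_bytes = size * allocation_units
    used_bytes = used * allocation_units
    return 100.0 * (total_bytes - used_bytes) / total_bytes


for oid in ("1.3.6.1.2.1.25.2.3.1.3.1", "1.3.6.1.2.1.25.2.3.1.6.1"):
    print(oid, "->", describe_oid(oid))

# Hypothetical values: 1024-byte units, with "used" including cache.
print(f"{percent_available(1024, 64926344, 62978554):.1f}% available")
```

If the agent counts buffers and cache as used in hrStorageUsed, a formula like this would report a very low percentage even on a healthy firewall.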

the_rock
Legend

CPView is 100% correct, so it has to be a PRTG issue. Not sure which OID you are using.

Andy

Timothy_Hall
Legend

Different monitoring tools present the concept of "free" vs "available" memory differently, which can be highly misleading. The only one I pay any attention to anymore is free -m. Please see my CPX 2024 slide deck for how to properly interpret the output of this command:

Be Your Own TAC: Advanced Gateway Troubleshooting Commands

Bottom line: ignore the "free" value and focus on the "available" value. Hopefully there are separate OIDs for each of these.

Attend my 60-minute "Be your Own TAC: Part Deux" Presentation
Exclusively at CPX 2025 Las Vegas Tuesday Feb 25th @ 1:00pm
the_rock
Legend

This is why I keep telling everyone to buy your book, because the way you explain things is just SUPERB, in my opinion. With certain things, like that command (free -m or free -g), it's easy to misread the output and get the wrong information, but once you read it properly, you see it 100% matches the output from cpview as far as free memory goes.

Thank you and keep doing amazing work!

Andy

Hugo_vd_Kooij
Advisor

Just read the slide deck. The screenshot is brilliant. A key indicator I like to keep track of is swap space usage.

On an average firewall you shouldn't see a lot of swap usage. In our monitoring solution we track it and flag a warning if swap usage goes above 1 GB. Above 2 GB the monitoring goes critical.

Once you go over 2 GB Swap usage firewalls tend to become unhappy.
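
A minimal sketch of such a check, assuming swap usage is derived from two kB values such as memTotalSwap and memAvailSwap from the 1.3.6.1.4.1.2021.4 subtree shown earlier in the thread. The function name and the exact boundary handling (strictly above the threshold) are assumptions for illustration.

```python
KB_PER_GB = 1024 * 1024


def swap_status(total_swap_kb, avail_swap_kb, warn_gb=1.0, crit_gb=2.0):
    """Classify swap usage as ok / warning / critical using the thresholds above."""
    used_gb = (total_swap_kb - avail_swap_kb) / KB_PER_GB
    if used_gb > crit_gb:
        return "critical"
    if used_gb > warn_gb:
        return "warning"
    return "ok"


# Swap values from the snmpwalk sample in this thread: nothing in use.
print(swap_status(33554300, 33554300))

# Hypothetical device with 1.5 GB of swap in use.
print(swap_status(33554300, 33554300 - int(1.5 * KB_PER_GB)))
```

The sample firewall from the snmpwalk output has no swap in use, so it would be classified as ok.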

<< We make miracles happen while you wait. The impossible jobs take just a wee bit longer. >>
Marc0523
Participant

This helped a lot, thanks.

I used the above to find the official Check Point MIB, and then went through PRTG's overcomplicated process to import the OIDs into PRTG.

This gave me memory readings of about 50%, which is the true value.

 

Thanks a lot all for the support.

Vincent_Bacher
Advisor

Sounds good, great 🙂

If you have some spare time, have a look at Skyline and onboard yourself to Prometheus-based monitoring. 🙂

