belteto
Participant

Physical memory vs FW memory. Explanation needed!

Hi All!

 

I am trying to understand two memory parameters shown in cpview on our VSX VSLS gateways: Physical memory and FW memory.

What are the differences and similarities between these two parameters?

It is a little foggy to me, and we are currently investigating an issue related to them.

Could someone explain it to me? It seems to me that FW memory is the more important of the two to monitor.

I attached a picture from cpview, taken when the FW memory was fully utilised while the physical memory was still at 50%.

In this state the gateway stops processing traffic and logs a lot of 'internal rule base error' drops, but the gateway itself remains available.

All input on this is highly welcome.

Thanks in advance.

 

14 Replies
PhoneBoy
Admin

Physical memory refers to the entire appliance.
Firewall memory refers to the memory allocated to the various processes and such related to firewall functions.
More information is definitely required to assist in troubleshooting this (for example version/JHF level, precise error messages and such).

belteto
Participant

Thanks for the explanation and offer to help.

To check my understanding (correct me if I'm wrong):

Physical memory usage is always higher than FW memory usage because:

Physical memory usage = FW memory usage + OS base memory usage

And physical memory usage increases when FW memory usage increases.

This is what we see on our other VSX gateways (each with 10 VSs): physical memory usage is ~3 GB higher than FW memory usage.

 

In this particular case, after the reboot and the latest hotfix (R81.10 Take 87), FW memory usage is still higher than physical memory usage, and it keeps rising very slowly.

pic attached.

 

This VSX cluster is a 3-node cp26000 cluster with 96 GB RAM, running R81.10 Take 87 (VSLS, with 39 VSs and 5 virtual switches on it).

No error message is visible now. Only when the FW memory was 100% full did we get 'internal rule base error' drop messages in the logs, nothing more.

 

TAC is already on it, and R&D will possibly be involved.

I'd like to pick your and the community's brains; maybe you have seen something similar.

 

PhoneBoy
Admin

It's possible there is a memory leak somewhere.
I recommend getting the TAC involved. 

belteto
Participant

Yes, we (and TAC as well) suspect a memory leak, which is why they recommended applying Take 87, which contains memory fixes (as they told us).

Maybe not all memory issues were fixed, so they are still investigating.

the_rock
Legend

Yea, you got that right, but as @PhoneBoy said, it's possible you have a memory leak going on here. To be able to help you properly, can you send us the outputs of the commands below:

top

free -m

ps -auxw

cpview (look at initial screen for memory usage)

cpwd_admin list

enabled_blades

cat /proc/cpuinfo

cat /proc/meminfo

cpstat fw -f all

Cheers,

Andy
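
For anyone who wants to gather the outputs requested above in one go, here is a minimal sketch. The collect helper, the file labels, and the output directory are made up for illustration; top is run in batch mode (-b -n 1) so it exits after one sample, and cpview is left out because it is interactive. Commands that do not exist on the machine are recorded as "command not found" rather than stopping the run.

```bash
#!/bin/bash
# Sketch only: save each requested diagnostic output to its own file.
# The directory name and labels are illustrative, not a Check Point convention.

outdir=$(mktemp -d "${TMPDIR:-/tmp}/memdiag.XXXXXX")

# collect LABEL COMMAND [ARGS...]
# Runs COMMAND and stores stdout and stderr in $outdir/LABEL.txt.
collect() {
  label=$1
  shift
  if command -v "$1" >/dev/null 2>&1; then
    "$@" >"$outdir/$label.txt" 2>&1
  else
    echo "command not found: $1" >"$outdir/$label.txt"
  fi
}

collect top       top -b -n 1
collect free      free -m
collect ps        ps auxw
collect cpwd      cpwd_admin list
collect blades    enabled_blades
collect cpuinfo   cat /proc/cpuinfo
collect meminfo   cat /proc/meminfo
collect cpstat_fw cpstat fw -f all

echo "Outputs saved in $outdir"
```

The resulting directory can then be archived and attached to the thread or to the TAC case.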

belteto
Participant

Hi!

Attached the outputs.

Of the 39 VSs, 3 have additional blades enabled.

All the others have only FW, and around 90-99% of all connections are accelerated.

 

Thx

Balint

Teddy_Brewski
Collaborator

Hello @belteto .  Did you find anything with the TAC?

We've recently experienced the same issue with R81.20 Take 90 on open servers. The FW memory got consumed and we ended up with 'internal rule base error' drops.

The case with the TAC went nowhere. They provided a huge list of kernel settings that need to be enabled at the moment the memory is saturated, which hasn't happened again so far.

Which brings me to another question: does anyone know how to monitor FW memory via SNMP? I can get values for RAM - Real Active and RAM - Real Free, but they are of no use here.

Thank you.

 

 

the_rock
Legend

Hey Teddy,

Can you send what you see on the screen below when running cpview? You can also check history by running cpview -t and then pressing t to enter the desired date. By the way, do you see anything consuming high memory in the output of top or ps -auxw? What does free -m show?

Andy

 

Screenshot_1.png

 

my lab:

 

[Expert@CP-GW:0]# free -m
              total        used        free      shared  buff/cache   available
Mem:          23309        6555        8494          32        8259       15303
Swap:          8191           0        8191
[Expert@CP-GW:0]#
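
To make the free -m figures easier to compare between machines, the "Mem:" line can be turned into percentages of total RAM. The following is only a sketch: mem_summary is a made-up helper name, and the column positions assume the procps-ng layout shown in the lab output (total, used, free, shared, buff/cache, available). Keep in mind that this is the operating system's view of memory; by itself it says nothing about the FW memory counter in cpview.

```bash
#!/bin/bash
# Sketch: summarise the "Mem:" line of `free -m` as percentages of total RAM.
# Assumed columns after the label: total used free shared buff/cache available

mem_summary() {
  awk '$1 == "Mem:" {
    printf "used=%.1f%% cache=%.1f%% available=%.1f%%\n",
           100 * $3 / $2, 100 * $6 / $2, 100 * $7 / $2
  }'
}

# Lab values from the post above; on a live system use: free -m | mem_summary
printf '%s\n' 'Mem: 23309 6555 8494 32 8259 15303' | mem_summary
```

For the lab values this prints used=28.1% cache=35.4% available=65.7%. The "available" column is the kernel's estimate of memory that can be handed to applications without swapping, so it is usually a better health indicator than "free".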

Teddy_Brewski
Collaborator

Hello @the_rock 

We didn't have enough patience and time to identify what was consuming high memory in top. It was "Mad Max" emergency troubleshooting in the middle of the night. Even the initial troubleshooting went in the wrong direction: FW memory values were overlooked and everybody was focused on the state of the cluster, which was perfectly fine and healthy. The failover fixed the issue, and only in the morning did we notice the 'internal rule base' errors and start replaying cpview, which revealed FW memory exhaustion:

Untitled1x.png

And this is how it looks now:

Untitled1.png

For some weeks now we haven't experienced any memory increase, so it's still under observation.

 

the_rock
Legend

Ok, fair enough. So, at this point, do commands top and ps -auxw show any process consuming high memory?

Andy

Teddy_Brewski
Collaborator

Nothing high:

Untitled1xxx.png

As per 'ps -auxw', the heaviest memory consumer (2.2%) is:

admin 19372 1.7 2.2 2646548 1438580 ? S<Ll Nov16 638:14 fwk
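
As a reading aid for the ps line above: in the standard ps aux layout the fifth and sixth columns are VSZ and RSS in KiB. The sketch below converts them to MiB; ps_mem_mib is a made-up helper name, and the column order (USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND) is assumed.

```bash
#!/bin/bash
# Sketch: convert the VSZ and RSS columns (KiB) of `ps aux` lines to MiB.
# Assumed columns: USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND

ps_mem_mib() {
  awk '{ printf "%s pid=%s mem=%s%% vsz=%.1fMiB rss=%.1fMiB\n",
         $11, $2, $4, $5 / 1024, $6 / 1024 }'
}

# The fwk line quoted above; on a live system, for example:
#   ps aux --sort=-rss | sed -n '2,11p' | ps_mem_mib
printf '%s\n' 'admin 19372 1.7 2.2 2646548 1438580 ? S<Ll Nov16 638:14 fwk' | ps_mem_mib
```

For the quoted line this prints fwk pid=19372 mem=2.2% vsz=2584.5MiB rss=1404.9MiB, that is, roughly 1.4 GiB resident for that fwk process.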

 

the_rock
Legend

That looks normal.

belteto
Participant

Hi Teddy!

In our case, a memory leak was identified; it was caused by the process of the dynamic_balancing feature (dsd).
The next jumbo hotfix solved the issue.

There is no way to monitor FW memory directly via SNMP; no specific OID is assigned to that parameter.

To get the data via SNMP, we created custom OIDs that run a script which queries the counters with fw ctl pstat.

Added to /etc/snmp/userDefinedSettings.conf:

pass .1.3.555.1 /usr/local/bin/mem_pass.sh max

pass .1.3.555.2 /usr/local/bin/mem_pass.sh used

 

/usr/local/bin/mem_pass.sh 

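#!/bin/bash
# SNMP "pass" helper: prints OID, type and value for one FW memory counter.
# $1 selects the counter: "max" or "used" (set in userDefinedSettings.conf).

# Take both values from the line of fw ctl pstat that contains "Physical".
max=$(fw ctl pstat | grep Physical | awk '{print $9}')
used=$(fw ctl pstat | grep Physical | awk '{print substr($5,2)}')

if [[ $1 =~ max ]]
then
  echo .1.3.555.1.0
  echo integer
  echo "$max"
fi

if [[ $1 =~ used ]]
then
  echo .1.3.555.2.0
  echo integer
  echo "$used"
fi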

 

 

Hope it helps!

Balint
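
For readers adapting mem_pass.sh, here is a sketch of a variant that calls fw ctl pstat only once and keeps the parsing in a function that can be tested offline. It is not the script from the post: parse_fw_mem and emit_pass are made-up names, the selector is matched exactly instead of with =~, and the sample line (including its numbers) is hypothetical, with a layout inferred only from the awk field numbers used in the original script. Check the real fw ctl pstat output on your version before relying on it.

```bash
#!/bin/bash
# Sketch of a single-call variant of mem_pass.sh (not the original script).
# ASSUMPTION: the matched line looks like
#   Physical memory used:  11% (1790 MB out of 15959 MB) - below watermark
# which is inferred from fields $5 and $9 used in the original script.

# Read fw ctl pstat output on stdin and print "<used> <max>".
parse_fw_mem() {
  awk '/Physical/ { print substr($5, 2), $9; exit }'
}

# emit_pass SELECTOR USED MAX
# Prints the three lines (OID, type, value) that the original script prints.
emit_pass() {
  case "$1" in
    max)  printf '%s\n%s\n%s\n' .1.3.555.1.0 integer "$3" ;;
    used) printf '%s\n%s\n%s\n' .1.3.555.2.0 integer "$2" ;;
    *)    return 1 ;;
  esac
}

# On the gateway the input would come from: fw ctl pstat | parse_fw_mem
# Here a hypothetical sample line is used so the sketch runs anywhere.
sample='Physical memory used:  11% (1790 MB out of 15959 MB) - below watermark'
fields=$(printf '%s\n' "$sample" | parse_fw_mem)
used=${fields% *}
max=${fields#* }
emit_pass used "$used" "$max"
```

With the sample line this prints .1.3.555.2.0, integer and 1790 on three separate lines. Running the real script by hand with the argument max or used should produce the same three-line shape, which is a quick sanity check before polling it over SNMP.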

Teddy_Brewski
Collaborator

Thanks a lot!
