knassif
Participant

memory leak

We have been having a memory leak issue on our gateway firewalls. We have applied all kinds of patches and hotfixes and still face the issue. If we let memory usage reach the maximum, the firewall becomes unresponsive, we can't access it via SSH, and we have to reboot it. I'd appreciate any help, or hearing from anyone who has experienced the same issue. Screenshot attached.

current version is R81.20 T89

56 Replies
the_rock
Legend

Btw, not sure if this is related, but I see the fix below listed in jumbo take 90.

Andy

 

Screenshot_1.png

knassif
Participant

Yeah, I did see that and asked TAC whether it could fix it. We have seen many memory leak fixes in previous hotfixes and installed them all with no luck. We don't use the Anti-Virus blade, by the way.

the_rock
Legend

OK, fair enough. I'm sure that, collectively, we will be able to help you fix this issue. Please upload the debugs when you can and I will review them later tonight (I'm in Canada, Eastern time zone).

Andy

knassif
Participant

Should we allocate more cores per VSX instance? The 16200 Turbo has 48 cores, see attached.

the_rock
Legend

100% you should. I was actually just going to suggest that, but you "beat" me to it.

Andy

knassif
Participant

We're going to need a maintenance window to do that, as I believe it requires downtime.

the_rock
Legend

That's right... do NOT do this during work hours, it will cause downtime.

Andy

knassif
Participant

So can you assign something like 20 to each VSX instance? (We have 15 VSX instances on that 16200 gateway, and the total number of cores on it is 48.)

Number of Cores: 48

Total Memory: 131072 MB

the_rock
Legend

Sorry, I was thinking of something else. No, you can't do that, my bad. If it's 48 cores and 15 instances, then 3 each would make sense: 15 x 3 = 45.

Andy

knassif
Participant

It says the number of instances is not limited by the physical cores on the server, so I think it should be fine to add more (take a look at the screenshot). My concern, though, is that adding more resources while the leak is still there may not fix the problem.

the_rock
Legend

I get it, totally valid concern. I hate to put it like this, but the only way to know for sure would be to give it a go. I'm not saying it would fix the problem, but at least it would eliminate that possibility.

Andy

knassif
Participant

The strange part is that these firewalls barely have any traffic on them, so I can't understand why the fwk process, which is responsible for traffic, has a leak.

the_rock
Legend

That is indeed odd, I agree. From the debugs you sent, I can clearly see fwk processes consuming 0.5% of memory here and there, then 1.7% a few times, then 2%, then a bit more, so those "bits and pieces" add up to a big issue. A few of them even show CPU consumption of 17%.
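
To see how those per-process percentages add up, the fwk rows of a process listing can be summed. The following is a minimal sketch, assuming the listing was captured with `ps -eo pid,pmem,rss,comm` on the gateway. The sample rows and process names are hypothetical and are not taken from the attached debugs, and summed RSS can overstate real usage when processes share pages.

```python
from typing import Tuple


def sum_fwk_memory(ps_output: str, prefix: str = "fwk") -> Tuple[float, int]:
    """Return (total %MEM, total RSS in KiB) for processes whose name starts with prefix.

    Expects text shaped like the output of `ps -eo pid,pmem,rss,comm`,
    including the header row.
    """
    total_pct = 0.0
    total_rss_kib = 0
    for line in ps_output.strip().splitlines()[1:]:  # skip the header row
        fields = line.split(None, 3)
        if len(fields) < 4:
            continue  # ignore malformed rows
        _pid, pmem, rss, comm = fields
        if comm.startswith(prefix):
            total_pct += float(pmem)
            total_rss_kib += int(rss)
    return total_pct, total_rss_kib


# Hypothetical sample for illustration only (not real data from this thread).
SAMPLE = """\
  PID %MEM   RSS COMMAND
 1201  0.5 671088 fwk0_dev_0
 1202  1.7 2281701 fwk2_dev_0
 1203  2.0 2684354 fwk6_dev_0
 1300  0.1 134217 sshd
"""

pct, rss_kib = sum_fwk_memory(SAMPLE)
print(f"fwk total: {pct:.1f}% of RAM, {rss_kib / 1024 / 1024:.2f} GiB RSS")
```

Running the same function on listings captured a few hours apart would show whether the fwk total keeps climbing, which is the pattern a leak would produce.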

Andy

knassif
Participant

Take a look at the help text below: it says the number of instances is not limited by the number of physical cores on the server or appliance, so having 48 cores should not matter.

 

 

 

CoreXL

What can I do here?

Configure the number of firewall instances on this Virtual System. When you change the number of firewall instances on a Virtual System, there is a downtime for that Virtual System.

 

 

Important - Each firewall instance that you create uses additional system memory. A Virtual System with five instances would use approximately the same amount of memory as five separate Virtual Systems.

What background information do I need to know?

CoreXL creates multiple firewall instances which are different firewalls. You can use CoreXL to increase the performance of the Virtual System on a server or appliance with multiple cores.

Use SmartConsole to configure the number of CoreXL Firewall instances on the Virtual System

  • In 32-bit VSX, you can assign up to 10 CoreXL Firewall instances on a Virtual System.

  • In 64-bit VSX, you can assign up to 32 CoreXL Firewall instances on a Virtual System.

The number of instances is not limited by the physical cores on the server or appliance.

Tell me about the fields...

  • Number of instances: Select the number of firewall instances for the Virtual System.

 

 

 

Getting Here - Double-click a Virtual System > CoreXL
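
The memory warning in the help text can be turned into a quick sanity check before picking a number. This is a rough sketch that only applies the two rules quoted above: the per-VS instance limits (10 on 32-bit, 32 on 64-bit) and the approximation that a VS with N instances uses about as much memory as N separate Virtual Systems. The function name and the candidate values are illustrative assumptions, not a Check Point sizing formula.

```python
MAX_INSTANCES_32BIT = 10  # per Virtual System, from the help text
MAX_INSTANCES_64BIT = 32  # per Virtual System, from the help text


def vs_equivalents(num_vs: int, instances_per_vs: int, is_64bit: bool = True) -> int:
    """Approximate memory footprint, expressed in single-instance VS equivalents.

    Raises ValueError if instances_per_vs is outside the limit quoted in the help text.
    """
    limit = MAX_INSTANCES_64BIT if is_64bit else MAX_INSTANCES_32BIT
    if not 1 <= instances_per_vs <= limit:
        raise ValueError(f"instances per VS must be between 1 and {limit}")
    return num_vs * instances_per_vs


# 15 Virtual Systems, as on the gateway discussed in this thread.
for n in (3, 5, 10, 20):
    print(f"{n} instances per VS -> about {vs_equivalents(15, n)} VS equivalents of memory")
```

Under these assumptions, a value such as 40 instances per VS would be rejected outright, since it exceeds the 64-bit limit of 32.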


the_rock
Legend

It's been some time since I worked on VSX, but okay, that explanation makes sense. Having said that, I would be very careful with the number you put there. I'm not sure whether it is related to the number of VSs (you have 15, so I wonder what would happen if you set it to 10) or to the amount of RAM, but I would maybe start with 5 and see if that makes it any better.

Just my logical approach...

Andy

the_rock
Legend

Hey @knassif 

Since this issue was really bugging me and I wanted to help to the best of my ability, I called a customer I used to work with a while ago. I recall they had literally the same issue about 2 years ago, though on R80.40, and changing the number of instances was the exact solution. He told me that after they changed it from 3 to 10, the problem never reappeared. Again, I can't guarantee that will work for you, but it's certainly worth a try.

Personally, I don't see that making the situation worse than it is...

Andy

knassif
Participant

How many, though? If we have 48 cores, should we allocate 40 to each VSX instance? What I'm seeing is a 0.1% increase in memory on a few VSs (see attached): fwk10, fwk6 and fwk2 have each increased 0.1% in about a couple of hours. Our memory leak is growing at about 100 MB per hour or so; not sure if that translates into the 0.1%.

the_rock
Legend

Well, if you do the simple math, that works out. If 0.1% is about 100 MB, then 1% is about 1 GB, and 100% is about 100 GB of RAM, which is roughly in line with your 128 GB. Don't do 40; let's start with 15 or 20 and see if that helps.
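
For anyone who wants to check the numbers, here is a small sketch of that arithmetic using the figures posted in this thread (131072 MB total memory, roughly 100 MB per hour of growth). The time-to-exhaustion figure is only an upper bound, since it assumes the whole 128 GB is free to leak into, which is not the case on a running gateway.

```python
TOTAL_MEMORY_MB = 131072      # "Total Memory: 131072 MB" as posted in the thread
LEAK_RATE_MB_PER_HOUR = 100   # approximate growth rate reported in the thread


def pct_of_total(mb: float, total_mb: float = TOTAL_MEMORY_MB) -> float:
    """Express an amount of memory as a percentage of total RAM."""
    return 100.0 * mb / total_mb


def hours_until_full(free_mb: float, rate_mb_per_hour: float = LEAK_RATE_MB_PER_HOUR) -> float:
    """Hours until free_mb is consumed at a constant leak rate."""
    return free_mb / rate_mb_per_hour


one_tenth_pct_mb = TOTAL_MEMORY_MB * 0.001
print(f"0.1% of RAM = {one_tenth_pct_mb:.0f} MB")
print(f"100 MB/hour = {pct_of_total(LEAK_RATE_MB_PER_HOUR):.3f}% of RAM per hour")
# Assumption: all 128 GB is available to the leak; real headroom is smaller.
print(f"Upper bound to exhaustion: {hours_until_full(TOTAL_MEMORY_MB) / 24:.1f} days")
```

So a 0.1% step in the process listing corresponds to a bit more than 100 MB on this box, which is consistent with the reported growth over an hour or two.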

Andy

knassif
Participant

Attached are the debugs. We were given this SK as well. Let me know if you need me to run any commands or if you find anything in the debugs. I was not able to upload more than 143 MB.

https://support.checkpoint.com/results/sk/sk182648

 

the_rock
Legend

This is good, I will see what I can review later, but if not, then Friday, for sure.

Andy

the_rock
Legend

Just had a look at the debugs, and here is my gut feeling, keeping in mind that I have no idea what has been done so far in your TAC case. Just my 2 cents: since the fwk process seems to be more or less the cause, is it possible that the firewall is running in kernel mode?

You can check with the command below:

cpprod_util FwIsUsermode

If it returns 0, I would follow the steps below to switch it to 1 (enable user mode):

https://community.checkpoint.com/t5/General-Topics/R80-x-Performance-Tuning-Tip-User-Mode-Firewall-v...

1) Run the following clish commands:
    # cpprod_util FwSetUsFwmachine 1
    # cpprod_util FwSetUsermode 1
2) Edit the boot.conf file (vi $FWDIR/boot/boot.conf) with the following:
    KERN_INSTANCE_NUM 62
3) Reboot.

the_rock
Legend

@knassif Disregard my last comment. Based on the SK below, user mode is enabled by default on the 16k Turbo anyway.

Andy

https://support.checkpoint.com/results/sk/sk167052

PhoneBoy
Admin

What specifically is this VS doing?
Is it using a PPPoE interface, operating as an explicit proxy, or does one of the other conditions in this SK apply? https://support.checkpoint.com/results/sk/sk32578 

knassif
Participant

No PPPoE, and I don't believe it is operating as an explicit proxy. We do use FQDN objects and updatable objects in the policies, though.

knassif
Participant

We do have this procedure for BPDU on the firewall, though, as one of our VSs is a layer 2 firewall:

In Active/Standby Bridge Mode only, you can disable BPDU forwarding to avoid such blocking mode:

After the line:

. /etc/init.d/functions

Add this line:

/sbin/sysctl -w net.bridge.bpdu_forwarding=0

 

G_W_Albrecht
Legend

Escalate through the manager and give the SR# to your local CP SE to escalate from the sales side! Memory leaks can be a PITA sometimes; it is hard work to find the reason.

CCSP - CCSE / CCTE / CTPS / CCME / CCSM Elite / SMB Specialist
knassif
Participant

We did.
