CoreXL

knassif · ‎2024-11-12

We have been having a memory leak issue on our gateway firewalls. We have applied all kinds of patches and hotfixes and still face the issue. If we let memory usage reach the maximum, the firewall becomes unresponsive; we can't access it via SSH and have to reboot it. I'd appreciate any help, or hearing from anyone who has experienced the same issue. Screenshot attached.

Current version is R81.20 T89.

the_rock · ‎2024-11-13

Btw, not sure if this is related, but I see the below in Jumbo 90.

Andy

knassif · ‎2024-11-13

Yeah, I did see that and asked TAC whether it could fix it. We have seen many memory leak fixes in previous hotfixes and installed them all with no luck. We don't use the Anti-Virus blade, by the way.

the_rock · ‎2024-11-13

K, fair enough. I'm sure that, collectively here, we will be able to help you fix this issue. Please upload the debugs when you can and I will review them later tonight (I'm in Canada, EST time zone).

Andy

knassif · ‎2024-11-14

Should we allocate more cores per VSX instance? The 16200 Turbo has 48 cores; see attached.

the_rock · ‎2024-11-14

100% you should. I was just actually going to suggest that, but you "beat" me to it.

Andy

knassif · ‎2024-11-14

We're going to need a maintenance window to do that, as I believe it requires downtime.

the_rock · ‎2024-11-14

That's right... do NOT do this during work hours, it will cause downtime.

Andy

knassif · ‎2024-11-14

So can you assign something like 20 to each VSX instance? (We have 15 VSX instances on that 16200 gateway, and it has 48 cores in total.)

Number of Cores: 48

Total Memory: 131072 MB

the_rock · ‎2024-11-14

Sorry, I was thinking of something else... no, you can't do that, my bad. So if it's 48 cores and 15 instances, then 3 each would make sense: 15 x 3 = 45.

Andy

knassif · ‎2024-11-14

It says the number of instances is not limited by the physical cores on the server, so I think it should be fine to add more; take a look at the screenshot. My concern, though, is that adding more resources while the leak is still there may not fix the problem.

the_rock · ‎2024-11-14

I get it, totally valid concern. I hate to put it like this, but the only way to know for sure would be to give it a go... not saying it would fix the problem, but at least it would eliminate that possibility.

Andy

knassif · ‎2024-11-14

The strange part is that these firewalls barely have any traffic on them, so I can't understand why the fwk process, which is responsible for handling traffic, is leaking.

the_rock · ‎2024-11-14

That is indeed odd, I agree. I will say, when I reviewed the debugs you sent, I could clearly see fwk consuming, say, 0.5% memory here and there, then 1.7% a few times, then 2%, then a bit more, so those "bits and pieces" add up to a big issue. A few of them even show CPU consumption of 17%.

Andy
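
To see how those "bits and pieces" add up, one option is to total the %MEM of every fwk process from a `ps -eo pmem,comm` listing and compare two snapshots taken a few hours apart. The sketch below is only an illustration: the process names and percentages in the sample text are made up, and matching on the `fwk` name prefix is an assumption about how the processes appear on the gateway.

```python
# Sample text in the shape of `ps -eo pmem,comm` output.
# Process names and values are hypothetical, for illustration only.
BEFORE = """\
%MEM COMMAND
 0.5 fwk1_dev
 1.7 fwk2_dev
 2.0 fwk6_dev
 0.3 cpd
"""

AFTER = """\
%MEM COMMAND
 0.5 fwk1_dev
 1.8 fwk2_dev
 2.1 fwk6_dev
 0.3 cpd
"""


def parse_ps(ps_text, prefix="fwk"):
    """Map command name -> %MEM for commands whose name starts with prefix."""
    usage = {}
    for line in ps_text.splitlines()[1:]:  # skip the header row
        fields = line.split(None, 1)
        if len(fields) != 2:
            continue
        pct, name = fields[0], fields[1].strip()
        if name.startswith(prefix):
            usage[name] = usage.get(name, 0.0) + float(pct)
    return usage


def total_percent(usage):
    """Sum of %MEM across all matched processes, rounded to one decimal."""
    return round(sum(usage.values()), 1)


def growth(before, after):
    """Processes whose %MEM went up between the two snapshots."""
    deltas = {name: round(after[name] - before.get(name, 0.0), 1) for name in after}
    return {name: delta for name, delta in deltas.items() if delta > 0}


before, after = parse_ps(BEFORE), parse_ps(AFTER)
print("total fwk %MEM before:", total_percent(before))
print("total fwk %MEM after: ", total_percent(after))
print("grew between snapshots:", growth(before, after))
```

On a live system you would feed it the real command output captured at two points in time rather than the sample strings.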

knassif · ‎2024-11-14

Take a look at the help text below; it says the number of instances is not limited by the number of physical cores on the server or appliance, so the 48 cores should not matter.

CoreXL

What can I do here?

Configure the number of firewall instances on this Virtual System. When you change the number of firewall instances on a Virtual System, there is a downtime for that Virtual System.

Important - Each firewall instance that you create uses additional system memory. A Virtual System with five instances would use approximately the same amount of memory as five separate Virtual Systems.

What background information do I need to know?

CoreXL creates multiple firewall instances which are different firewalls. You can use CoreXL to increase the performance of the Virtual System on a server or appliance with multiple cores.

Use SmartConsole to configure the number of CoreXL Firewall instances on the Virtual System

In 32-bit VSX, you can assign up to 10 CoreXL Firewall instances on a Virtual System.
In 64-bit VSX, you can assign up to 32 CoreXL Firewall instances on a Virtual System.

The number of instances is not limited by the physical cores on the server or appliance.

Tell me about the fields...

Number of instances: Select the number of firewall instances for the Virtual System.

Getting Here - Double-click a Virtual System > CoreXL


the_rock · ‎2024-11-14

It's been some time since I worked on VSX, but okay, that explanation makes sense. Having said that, I would be very careful with the number you put there. I'm not sure whether it's tied to the number of VSs (you have 15) or to the amount of RAM, so I'm wondering what would happen if you set 10 there. I would maybe start with 5 and see if that makes it any better.

Just my logical approach...

Andy
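
For a sense of scale, here is a rough sketch of how the total instance count grows with the per-VS setting, based on the help text's note that a Virtual System with five instances uses about as much memory as five separate Virtual Systems. It assumes all 15 VSs get the same number of instances, which may not be how you would actually configure it, and the function name is just for illustration.

```python
VS_COUNT = 15        # Virtual Systems on the gateway (from the thread)
PHYSICAL_CORES = 48  # cores on the 16200 Turbo (from the thread)


def total_instances(per_vs, vs_count=VS_COUNT):
    """Total CoreXL firewall instances if every VS gets the same setting."""
    return per_vs * vs_count


for per_vs in (3, 5, 10, 20):
    total = total_instances(per_vs)
    ratio = total / PHYSICAL_CORES
    print(f"{per_vs:>2} per VS -> {total:>3} instances, {ratio:.1f}x the physical cores")
```

The count is not capped by the 48 cores, but each extra instance still costs memory, so on a box that is already leaking the total is worth keeping an eye on.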

the_rock · ‎2024-11-15

Hey @knassif

Since this issue was really bugging me and I want to help to the best of my ability, I called a customer I used to work with a while ago, since I recall they had literally the same issue about 2 years ago, though on R80.40. Changing the number of instances was the exact solution. He told me that after they changed it from 3 to 10, the problems never reappeared. Again, I can't guarantee that would work for you, but it's certainly worth a try.

Personally, I don't see that making the situation worse than it is...

Andy

knassif · ‎2024-11-14

How many, though? If we have 48, should we allocate 40 for each VSX instance? What I'm seeing is an increase of 0.1% of memory on a few VSs (see attached): fwk10, fwk6 and fwk2 have increased 0.1% in about a couple of hours. Our memory leak is growing at about 100 MB per hour or so; not sure if that translates into the 0.1%.

the_rock · ‎2024-11-14

Well, if you do the simple math, that's it. If 0.1% is 100 MB, then 1% is 10 times that, so 1 GB x 100 is 100 GB of RAM. Don't do 40; let's start with 15 or 20 and see if that helps.

Andy
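
A quick sanity check of that math, using the 131072 MB total reported earlier in the thread and the roughly 100 MB per hour leak rate. This is a back-of-the-envelope sketch, and the function names are made up for illustration.

```python
TOTAL_MEMORY_MB = 131072  # "Total Memory" reported in the thread
LEAK_MB_PER_HOUR = 100    # rough leak rate reported in the thread


def percent_to_mb(percent, total_mb=TOTAL_MEMORY_MB):
    """Convert a %MEM figure into megabytes of RAM."""
    return total_mb * percent / 100


def hours_until_full(free_mb, leak_mb_per_hour=LEAK_MB_PER_HOUR):
    """Hours until free_mb is consumed at a constant leak rate."""
    return free_mb / leak_mb_per_hour


step_mb = percent_to_mb(0.1)
print(f"0.1% of RAM is about {step_mb:.0f} MB")

# Upper bound only: this pretends all RAM is free when the leak starts.
hours = hours_until_full(TOTAL_MEMORY_MB)
print(f"{TOTAL_MEMORY_MB} MB at {LEAK_MB_PER_HOUR} MB/h lasts about {hours / 24:.0f} days")
```

On this box 0.1% works out to about 131 MB, so a 0.1% step on one fwk process is in the same ballpark as an hour or so of leaking at 100 MB per hour.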

knassif · ‎2024-11-14

Attached are the debugs. We were given this SK as well. Let me know if you need me to run any commands or if you find anything in the debugs. I was not able to upload more than 143 MB.

https://support.checkpoint.com/results/sk/sk182648

the_rock · ‎2024-11-14

This is good, I will see what I can review later, but if not, then Friday, for sure.

Andy

the_rock · ‎2024-11-14

Just had a look at the debugs, and here are my gut feelings/assumptions, considering I have no clue what has been done so far in the TAC case you have going on. Just my 2 cents, but since the fwk process seems to be more or less causing this, is it possible that the fw is running in kernel mode?

You can check with the command below:

cpprod_util FwIsUsermode

If it shows 0, I would follow the steps below to switch it to 1 (enable it):

https://community.checkpoint.com/t5/General-Topics/R80-x-Performance-Tuning-Tip-User-Mode-Firewall-v...

1) Run the following clish commands:
    # cpprod_util FwSetUsFwmachine 1
    # cpprod_util FwSetUsermode 1
2) Edit the boot.conf file (vi $FWDIR/boot/boot.conf) with the following:
    KERN_INSTANCE_NUM 62
3) Reboot.

the_rock · ‎2024-11-14

@knassif Disregard my last comment. Based on the SK below, I see user mode is enabled by default on the 16k Turbo anyway.

Andy

https://support.checkpoint.com/results/sk/sk167052

PhoneBoy · ‎2024-11-13

What specifically is this VS doing?
Is it using a PPPoE interface, operating as an explicit proxy, or does one of the other conditions in this SK apply? https://support.checkpoint.com/results/sk/sk32578

knassif · ‎2024-11-13

No PPPoE, and I don't believe it is operating as an explicit proxy. We do use FQDN objects and updatable objects in the policies, though.

knassif · ‎2024-11-13

We do have this procedure for BPDU applied on the firewall, though, as one of our VSs is a Layer 2 firewall:

In Active/Standby Bridge Mode only, you can disable BPDU forwarding to avoid such blocking mode:

After the line:

. /etc/init.d/functions

Add this line:

/sbin/sysctl -w net.bridge.bpdu_forwarding=0

G_W_Albrecht · ‎2024-11-13

Escalate through the manager and give the SR# to your local CP SE so they can escalate from the sales side! Memory leaks can be a PITA sometimes; it's hard work to find the cause.

CCSP - CCSE / CCTE / CTPS / CCME / CCSM Elite / SMB Specialist

knassif · ‎2024-11-13

We did.

