Solved: Memory increase in FW's model SMB 1800 after updating to R81.10

Luis1980 · ‎2023-07-25

Hello everyone.

After upgrading two SMB 1800 firewalls in a backup-center environment to R81.10, memory usage has increased beyond what is normal for their workload. One of them is exceeding 80%. TAC is investigating but has not given a clear answer. Can someone help with this issue? Thank you.

This is Check Point's 1800 Appliance R81.10.05 - Build 301

R81.10.05 (996001301)

Amir_Ayalon · ‎2023-12-12

Hi

In general, if you open an SR and you feel it's going nowhere, you can ask for R&D involvement (a Task).

After briefly looking into it (we are still looking), we found that the memory reserved for some processes has doubled compared with the previous build. It's not a real issue, as the memory is only reserved, not allocated, and can be freed if needed. It does, however, change the amount of memory presented as "Free".

This will explain part of the "lost memory" you see.
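One possible way to look at "reserved" versus "allocated" memory per process is to compare the VSZ (virtual size) and RSS (resident set) columns of `ps aux`. Whether this gap is exactly what R&D means by "reserved" is an assumption; the helper name `vsz_rss_gap` is made up for this sketch, and it relies on the standard `ps aux` column order seen in the output posted in this thread.

```shell
# Print VSZ, RSS and their difference (all in KB) for "fw sfwd" rows of `ps aux` output.
# Assumption: standard `ps aux` layout, where field 5 is VSZ and field 6 is RSS.
vsz_rss_gap() {
    awk '$11 == "fw" && $12 == "sfwd" {
        printf "pid=%s vsz_kb=%d rss_kb=%d gap_kb=%d\n", $2, $5, $6, $5 - $6
    }'
}

# Example using a row posted in this thread:
echo 'root 19491 0.5 3.1 456128 251968 ? Ssl 00:08 4:15 fw sfwd' | vsz_rss_gap
```

On a live unit the same function could be fed with `ps aux | vsz_rss_gap`.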

TAC's remarks are only relevant in cases where sizing is a problem: many hosts and concurrent connections.

If you feel this is not the case and TAC's answers are not satisfying, please contact us.

thanks


K_R_V · ‎2024-01-04

On our firewalls, it looks like a memory leak. The graph of free memory shows that we lose around 3 MB every 5 minutes.

When the limit is reached, SFWD is restarted:

sfwd_periodic_memory_rss: sfwd RSS (230 MB) is over the defined limit (230 MB). sfwd is exiting!

[3 Jan 19:12:43] sfwd: Wed Jan 3 19:12:43 2024

sfwd_periodic_memory_rss: sfwd RSS (230 MB) is over the defined limit (230 MB). sfwd is exiting!

[4 Jan 4:07:58] sfwd: Thu Jan 4 04:07:58 2024
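As a rough sketch of the arithmetic behind a leak rate like this, the helper below converts "MB lost per interval" into MB per hour and estimates how long it would take to reach a limit. The function name `leak_eta` is hypothetical, and the 122 MB starting value is made up for illustration. It also assumes, purely for the sake of the example, that the whole loss shows up as sfwd RSS growth, which the thread does not confirm.

```shell
# Convert a leak rate into MB/hour and estimate hours until a limit is reached.
# Arguments: current_mb limit_mb lost_mb interval_minutes
leak_eta() {
    awk -v cur="$1" -v lim="$2" -v lost="$3" -v mins="$4" 'BEGIN {
        rate = lost * 60 / mins    # MB per hour
        printf "rate_mb_per_hour=%.1f hours_to_limit=%.1f\n", rate, (lim - cur) / rate
    }'
}

# 3 MB per 5 minutes and the 230 MB limit come from the post;
# the 122 MB starting point is a made-up value.
leak_eta 122 230 3 5
```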


Lesley · ‎2023-07-25

We had the same issue and saw it was related to SFWD. After upgrading to version R81.10.07, it looks more stable now.

Do you see any output from these commands?

cat $FWDIR/log/sfwd.elg | grep -i '360 MB' -A 1

ps aux | grep -i sfwd

Then I know if we have the same issue.
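The limit in the log message is not always 360 MB (other posts in this thread show 230 MB and 350 MB), so a search that does not depend on the exact value may be useful. The following is a minimal sketch that counts sfwd exits per limit value. The function name `sfwd_exit_summary` is made up for illustration, and the message format is assumed to match the log lines quoted in this thread.

```shell
# Summarise sfwd RSS-limit exits in a log file, whatever the limit value is.
# Usage: sfwd_exit_summary /path/to/sfwd.elg
sfwd_exit_summary() {
    grep 'sfwd is exiting!' "$1" |
        sed -n 's/.*over the defined limit (\([0-9][0-9]*\) MB).*/\1/p' |
        sort -n | uniq -c |
        awk '{ printf "limit_mb=%s exits=%s\n", $2, $1 }'
}
```

On a gateway this would be called as `sfwd_exit_summary "$FWDIR/log/sfwd.elg"`; no output means no matching exit messages were found.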

-------
If you like this post please give a thumbs up(kudo)! 🙂

Luis1980 · ‎2023-07-25

Hello, these are the results. Do we have the same problem?

[Expert@1]# cat $FWDIR/log/sfwd.elg | grep -i '360 MB' -A 1
sfwd_periodic_memory_rss: sfwd RSS (360 MB) is over the defined limit (360 MB). sfwd is exiting!
[sfwd 2929 4156636416]@FWI-BCP[21 Jul 19:03:27] sfwd: Fri Jul 21 19:03:27 2023
--
sfwd_periodic_memory_rss: sfwd RSS (360 MB) is over the defined limit (360 MB). sfwd is exiting!
[sfwd 19308 4156308736]@FWI-BCP[22 Jul 20:45:15] sfwd: Sat Jul 22 20:45:15 2023
--
sfwd_periodic_memory_rss: sfwd RSS (360 MB) is over the defined limit (360 MB). sfwd is exiting!
[sfwd 3169 4157357312]@FWI-BCP[23 Jul 22:27:03] sfwd: Sun Jul 23 22:27:03 2023
--
sfwd_periodic_memory_rss: sfwd RSS (360 MB) is over the defined limit (360 MB). sfwd is exiting!
[sfwd 19491 41515
[Expert@1]# ps aux | grep -i sfwd
root 11822 0.0 0.0 4992 4352 ? S Jun15 1:09 /bin/bash /pfrm2.0/etc/sfwd_check_state.sh
root 14052 0.0 0.0 6528 3648 pts/0 S+ 13:05 0:00 grep -i sfwd
root 19491 0.5 3.1 456128 251968 ? Ssl 00:08 4:15 fw sfwd

The second firewall shows different results, and it is the one with the highest memory usage:

[Expert@2]# cat $FWDIR/log/sfwd.elg | grep -i '360 MB' -A 1
[Expert@2]# ps aux | grep -i sfwd
root 349 0.0 0.0 6592 1920 pts/1 S+ 13:06 0:00 grep -i sfwd
root 10885 0.0 0.0 4992 4416 ? S Jun15 1:09 /bin/bash /pfrm2.0/etc/sfwd_check_state.sh
root 17924 1.2 2.3 404864 187520 ? Ssl Jun15 696:21 fw sfwd
root 19574 0.0 0.0 4928 4352 ? S Jun15 1:04 /bin/bash /pfrm2.0/etc/sfwd_check_state.sh

Lesley · ‎2023-07-25

I would upgrade the units and share the above output with TAC. Of course, make sure you collect fresh output after the upgrade.

We still get "sfwd RSS (360 MB) is over the defined limit (360 MB). sfwd is exiting!" after the upgrade. We are going to try increasing the 360 MB value. I don't have any results from that yet.


Luis1980 · ‎2023-07-25

Does that mean the problem is not solved even after updating to R81.10.07? I have passed this on to my integrator to add to the case, in case they tell me it is the same problem.

Sbolton · ‎2023-12-11

Any resolution to this? We are seeing this same issue with ANY SMB that we've moved from local to central management as well.

Amir_Ayalon · ‎2023-12-11

Hi

It might be a false indication in the management.

We are looking into it.

You are welcome to open an SR and ask for a Task to be assigned to R&D.

Thanks

Sbolton · ‎2023-12-11

I've had a ticket open for the past two months (they've transferred and closed the old ones multiple times). I've been moving a single customer into Smart-1 Cloud, and since we upgraded to R81.10.08 and went to central management, that device has been crashing fairly consistently. I'm seriously wondering if this is an issue with R81.10.05 and above, not R81.10 in general.

the_rock · ‎2023-12-11

I have seen a few posts about it on the forum recently. I know one customer who has the software option set to auto-update and, so far, no issues. Mind you, theirs is locally managed, not centrally, so I'm not sure whether that makes a whole lot of difference.

Best regards,

Andy

Sbolton · ‎2023-12-12

Supposedly, in R81.10.05 they added several services, some of which only work in locally managed mode, not centrally managed.

the_rock · ‎2023-12-12

Logically, that would certainly explain it. Not sure if they provide custom fixes for those appliances, like they do for regular Gaia. Personally, I'm not a big fan of those, as they can be an issue when trying to install the next recommended jumbo take, but maybe it works a bit differently on SMB.

Andy

Chris_Atkinson · ‎2023-12-11

Which specific gateway model is it in your case, and does the issue persist with R81.10.08 Build 996001683?

(30 November 2023: the R81.10.08 Build 996001683 image has been released for 1500 / 1600 / 1800 appliances, replacing Build 996001608; refer to sk181079.)

CCSM R77/R80/ELITE

Sbolton · ‎2023-12-12

1590 SMB, went from R80.20.50 to R81.10.08. Constant issues ever since.

the_rock · ‎2023-12-12

So what's the latest from TAC? To me, this seems like more than enough reason for an RMA if no one can figure out why the crashes are happening...

Andy

Sbolton · ‎2023-12-12

The first response from TAC was, "Buy a better box." I told them that we had zero issues on the old firmware and that central management is supposed to reduce memory usage, not increase it. They replied asking for additional logs and for a RAM-intensive debug to be run on the device at hand. It's been a constant back and forth ever since.

the_rock · ‎2023-12-12

Buy a better box? Wow... if those were the exact words, I'm truly speechless.

Andy

Sbolton · ‎2023-12-12

Not the exact words; it was more along the lines of: "We notice a high RAM usage from Dr. Spark. We would recommend that you purchase a firewall that can handle the traffic at this location." A little more sophisticated, but basically the same thing.

the_rock · ‎2023-12-12

But still, even that answer makes no sense to me. Thinking about it logically, it can't be a hardware issue: if everything worked fine BEFORE installing the new software, it's obviously a software problem, not a hardware one.

Andy

Sbolton · ‎2023-12-12

I completely agree. That is why both my sales team and I didn't appreciate that response. I hope they fix their issues soon, because this is not okay.

the_rock · ‎2023-12-12

Of course, nothing about what you described is okay; I'm with you 100%. Why not try to escalate the issue further through TAC? Just a suggestion...

Andy

_Val_ · ‎2023-12-13

Andy, the answer may make sense if the issue is related to the traffic volume.

This is one of the first messages I convey during my performance optimization training sessions: Sometimes, unfortunately, there is no other way but to buy a bigger box.

I believe TAC has some reasons to suggest that, and without further details, it is unfair to challenge their judgement.

the_rock · ‎2023-12-13

You are 100% correct in saying I don't have all the facts, that is true, BUT going purely on logic, it would indicate a software issue, not a hardware one. If you see the response from @Sbolton, even their CP sales team was surprised at the answer they received from support. The way I look at it is this. Take the most basic example, though it's not the same product "family", if you will: if you install a Windows update on a Windows 11 PC and stuff starts breaking, it's totally logical to say the issue is with the update and not with your PC.

Best regards,

Andy

K_R_V · ‎2023-12-29

"Sometimes, unfortunately, there is no other way but to buy a bigger box."

We have just replaced 100+ 1200R firewalls running R77.20.XX with 1570R firewalls running R81.10.XX (1 CPU / 1 GB RAM -> 4 CPU / 2 GB RAM). We now have memory issues on all the new firewalls, with outages on a regular basis, all involving the SFWD process. There are fewer than 100 sessions on these firewalls and the bandwidth is around 1 Mbps! There is something seriously wrong with these versions. We had fewer issues on a smaller box with older software!

We have had a TAC case open for more than 4 months, with R&D involved, but there is no progress.

Problems we see:

Antivirus Updates: SFWD crashes randomly during antivirus updates, causing complete outages of the firewalls.
IPS Updates: SFWD restarts after almost every IPS update due to memory consumption.
IPS Schedule Timer: Spark devices do not adhere to the IPS update timer configured in the dashboard.
Segmentation Fault: Some firewalls experience segmentation faults, resulting in complete outages.
High Memory Usage: Memory increases after firewall reboot, potentially indicating a memory leak issue.

We suspect all of these problems are related to memory consumption.

All crashes are resolved with sfwd_restart.

Example memory usage: 60 MB free, no free swap:

Mem: 1959048 1792108 60700

Swap: 524284 524264 20

Concurrent Connections: 0% (79 out of 99900) - below watermark

Bits/sec 289K
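To put the Mem and Swap figures above into percentages, here is a small sketch. It assumes the three numbers on each row are total, used and free in KB (the Swap row is consistent with that reading, but the column headers were not posted), and the function name `used_percent` is made up for illustration.

```shell
# Percentage used for each "Label: total used free" row.
# Assumption: columns are total, used, free (KB); the header line was not posted.
used_percent() {
    awk '$1 ~ /:$/ && $2 > 0 { printf "%s %.1f%% used\n", $1, 100 * $3 / $2 }'
}

# The figures posted above:
printf 'Mem: 1959048 1792108 60700\nSwap: 524284 524264 20\n' | used_percent
```

Under that assumption, RAM is a little over 90% used and swap is effectively exhausted, even though only 79 concurrent connections are open.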

the_rock · ‎2023-12-29

I understand your points : - )

Happy New Year!

Best,

Andy

Sbolton · ‎2024-01-02

Hi KristofVermael,

Quick question for you: what firmware version are you currently running? I was on R81.10.08 Build 996001608 specifically, and TAC had me upgrade to R81.10.08 Build 996001683 just before the holidays. The customer hasn't had crashes since then, but I am waiting for traffic at that location to go back up (people are out for the holidays) before making the judgement call that it resolved our issue.

K_R_V · ‎2024-01-02

Hey,

Most of them are running R81.10.05 (996001301), but we are testing R81.10.08 (996001696), provided by TAC. It is hard to tell whether this version is an improvement: memory consumption is still high on our test firewall without any traffic, SFWD is still restarting with every IPS update due to memory, and the IPS update schedule is still not respected. We did not have a major outage on these firewalls …

sfwd.elg log:

sfwd_periodic_memory_rss: sfwd RSS (350 MB) is over the defined limit (350 MB). sfwd is exiting!

We are waiting on a fixed version to upgrade them all.

Sbolton · ‎2024-01-02

Thanks for the heads up. Please let me know if it goes anywhere! I'm hoping that this is resolved soon.

Alex- · ‎2024-01-03

Do you know if it's related to the 1570R series? Our 15x0 units ran stable on R80.20.50 and were upgraded to R81.10.07 in anticipation of the former's EoS. Even though overall memory usage has increased, they've been stable for months.

K_R_V · ‎2024-01-03

I don't think it is related to the 1570R, we see it also on 1590 devices.

I could be wrong, but I think IPS updates have something to do with it. If the updates are set to "gateway auto update", each update can trigger an SFWD restart on devices with 2 GB RAM. I'm now monitoring the memory usage very closely to see whether there is a memory leak.

I'm hoping that this is resolved soon, but the case has been open for 4+ months now.
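For anyone who wants to track sfwd memory over time in a similar way, the following is a minimal sketch, not the method used in this post. It reads the `VmRSS` line that Linux exposes in `/proc/<pid>/status`. The function name `rss_kb_from_status` and the sampling loop are hypothetical, and whether these tools behave identically on an SMB appliance is an assumption.

```shell
# Extract the VmRSS value (in kB) from a /proc/<pid>/status-style file.
rss_kb_from_status() {
    awk '$1 == "VmRSS:" { print $2 }' "$1"
}

# Hypothetical sampling loop (not run here): one timestamped sample every 5 minutes.
# Replace <PID> with the PID of the "fw sfwd" process shown by ps aux.
# pid=<PID>
# while sleep 300; do
#     echo "$(date '+%F %T') $(rss_kb_from_status "/proc/$pid/status")"
# done
```

A steadily rising series between IPS updates would point toward a leak, while a jump only at update time would point toward the update itself.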
