prakash82
Explorer

Gateway Firewall Memory above 90%, have to fail over and reboot every week

Hi Team,

We have to reboot the firewall every week, and lately every 4 days, because memory usage climbs above 95%. We logged a TAC ticket, and it has been ongoing for the last 6 months.

Here is the output of some memory commands:

 

# free -k -t
              total        used        free      shared  buff/cache   available
Mem:       32364908    16109064     2495696     2373016    13760148    11234316
Swap:      32653180       13824    32639356
Total:     65018088    16122888    35135052

# free -mt
              total        used        free      shared  buff/cache   available
Mem:          31606       15729        2476        2317       13400       10967
Swap:         31887          13       31874
Total:        63494       15743       34350

##free -
              total        used        free      shared  buff/cache   available
Mem:            30G         15G        2.4G        2.3G         13G         10G
Swap:           31G         13M         31G

cpstat -f memory os

Total Virtual Memory (Bytes): 66578522112
Active Virtual Memory (Bytes): 21650284544
Total Real Memory (Bytes): 33141665792
Active Real Memory (Bytes): 21636128768
Free Real Memory (Bytes): 11505537024
Memory Swaps/Sec: -
Memory To Disk Transfers/Sec: -

 

fw ctl pstat -l

Virtual System Capacity Summary:
Physical memory used: 52% (14161 MB out of 26865 MB) - below watermark
Kernel memory used: 5% (1443 MB out of 26865 MB) - below watermark
Virtual memory used: 46% (12524 MB out of 26865 MB) - below watermark
Used: 12524 MB by FW, 36992 MB by zeco
Concurrent Connections: 27733 (Unlimited)
Aggressive Aging is enabled, in detect mode

Kernel memory (kmem) statistics:
Total memory bytes used: 5858565214 peak: 9054649462
Allocations: 3628203140 alloc, 0 failed alloc
3443143165 free, 0 failed free

Cookies:
3564956855 total, 100334 alloc, 100334 free,
65 dup, 962627186 get, 1389385943 put,
3857272616 len, 2247045568 cached len, 0 chain alloc,
0 chain free

Connections:
175215155 total, 155360803 TCP, 19159365 UDP, 694936 ICMP,
51 other, 4 anticipated, 30 recovered, 27734 concurrent,
86440 peak concurrent

Fragments:
59335 fragments, 29654 packets, 0 expired, 0 short,
0 large, 0 duplicates, 0 failures

NAT:
56657638/0 forw, 75376564/0 bckw, 8398319398 tcpudp,
1174823 icmp, 135659996-158143676 alloc

Sync: Run "cphaprob syncstat" for cluster sync statistics.

Handles:
table name "kbufs"
922796 handles, 247 pools, 247 maximum pool(s)
1156268149 allocated, 0 failed, 1155345353 freed
249 pool(s) allocated, 0 failed, 2 freed, 0 not preallocated

 

##cat /proc/meminf
MemTotal: 32364908 kB
MemFree: 2535036 kB
MemAvailable: 11228176 kB
Buffers: 1388 kB
Cached: 12348760 kB
SwapCached: 336 kB
Active: 9984912 kB
Inactive: 3153224 kB
Active(anon): 3437964 kB
Inactive(anon): 784552 kB
Active(file): 6546948 kB
Inactive(file): 2368672 kB
Unevictable: 12813376 kB
Mlocked: 12813448 kB
SwapTotal: 32653180 kB
SwapFree: 32639356 kB
Dirty: 6620 kB
Writeback: 0 kB
AnonPages: 13601812 kB
Mapped: 4004008 kB
Shmem: 2373016 kB
Slab: 1367792 kB
SReclaimable: 454988 kB
SUnreclaim: 912804 kB
KernelStack: 21120 kB
PageTables: 66828 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 48786480 kB
Committed_AS: 12609164 kB
VmallocTotal: 34359738367 kB
VmallocUsed: 1601140 kB
VmallocChunk: 34357976016 kB
Percpu: 18656 kB
AnonHugePages: 4771840 kB
HugePages_Total: 48
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
DirectMap4k: 678616 kB
DirectMap2M: 12601344 kB
DirectMap1G: 22020096 kB

 

I have tried disabling the Monitoring blade, which made no difference.

#enabled_blade
fw vpn urlf av appi ips identityServer anti_bot ThreatEmulation mon Scrub

 

# fw ctl multik stat
ID | Active | CPU | Connections | Peak
-----------------------------------------------
0 | Yes | 15 | 987 | 3080
1 | Yes | 30 | 1063 | 3357
2 | Yes | 14 | 1036 | 3164
3 | Yes | 29 | 1019 | 3369
4 | Yes | 13 | 1052 | 3257
5 | Yes | 28 | 1077 | 3242
6 | Yes | 12 | 1072 | 3579
7 | Yes | 27 | 1036 | 3359
8 | Yes | 11 | 1041 | 3320
9 | Yes | 26 | 994 | 3242
10 | Yes | 10 | 1035 | 3478
11 | Yes | 25 | 1082 | 3216
12 | Yes | 9 | 1047 | 3155
13 | Yes | 24 | 1024 | 3450
14 | Yes | 8 | 1061 | 3242
15 | Yes | 23 | 1097 | 3178
16 | Yes | 7 | 1136 | 3519
17 | Yes | 22 | 999 | 3164
18 | Yes | 6 | 1037 | 3252
19 | Yes | 21 | 1054 | 3258
20 | Yes | 5 | 1063 | 3260
21 | Yes | 20 | 1061 | 3215
22 | Yes | 4 | 1057 | 3175
23 | Yes | 19 | 982 | 3227
24 | Yes | 3 | 1022 | 3200
25 | Yes | 18 | 883 | 2080
26 | Yes | 2 | 789 | 2152

13 Replies
_Val_
Admin

Version in use? HFA level?

prakash82
Explorer

R81.20

HOTFIX_R81_20_JUMBO_HF_MAIN Take: 26

the_rock
MVP Gold

R81? R81.10? Any clue what may have triggered this 6 months ago? Was a major upgrade done around that time?

Andy

Best,
Andy
Bob_Zimmerman
MVP Gold

How are you monitoring the memory usage? Most monitoring tools report Linux memory usage in a deeply misleading way.

Are there actual problems which lead you to reboot, or is the reboot a proactive step because you're worried you will see a problem if you leave it?

I don't see any indication of a memory limitation in the output you've shared. It has used 13 MB of swap, so it's on the border of needing more RAM, but not urgently.

prakash82
Explorer

The output in my original post was taken on the day I restarted the firewall. Below is the current memory usage output; swap and memory usage have gone up over the last few days.

We reboot when the memory hits around 96%-98% (every 5-6 days).

 

##free -
              total        used        free      shared  buff/cache   available
Mem:            30G         16G        2.0G        3.9G         11G        7.5G
Swap:           31G         92M         31G

 

# fw ctl pstat

Virtual System Capacity Summary:
Physical memory used: 61% (16554 MB out of 26865 MB) - below watermark
Kernel memory used: 5% (1493 MB out of 26865 MB) - below watermark
Virtual memory used: 55% (14809 MB out of 26865 MB) - below watermark
Used: 14809 MB by FW, 36992 MB by zeco
Concurrent Connections: 54877 (Unlimited)
Aggressive Aging is enabled, in detect mode

Kernel memory (kmem) statistics:
Total memory bytes used: 6669823069 peak: 9344514107
Allocations: 2317003416 alloc, 0 failed alloc
1942725061 free, 0 failed free

Cookies:
2575526124 total, 199577 alloc, 199577 free,
126 dup, 1197655591 get, 1701396094 put,
3329215880 len, 2602510663 cached len, 0 chain alloc,
0 chain free

Connections:
292118456 total, 258662121 TCP, 32263738 UDP, 1192372 ICMP,
225 other, 7 anticipated, 30 recovered, 54876 concurrent,
86440 peak concurrent

Fragments:
151211 fragments, 75576 packets, 1 expired, 0 short,
0 large, 0 duplicates, 0 failures

NAT:
143738715/0 forw, 195701560/0 bckw, 21799931406 tcpudp,
3099770 icmp, 348895944-259862984 alloc

 

Bob_Zimmerman
MVP Gold

How are you monitoring the memory usage? Most monitoring tools report Linux memory usage in a deeply misleading way. A healthy Linux system should only have 50-100 MB of free RAM. Any other unused RAM is filled with cache either of disk data or data from processes which have ended. Most monitoring systems will report this as ~98% memory usage.
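
To make the difference concrete, here is a minimal Python sketch (not taken from any Check Point tool) that computes two "used" percentages from the /proc/meminfo values posted earlier in this thread. The function names are made up for illustration, and it is only an assumption that a monitoring tool would use the first formula. The point is simply that counting cache as used gives about 92% for this gateway, while counting only memory that is not available gives about 65%.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values (kB)."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that are not "Key: value" pairs
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def usage_percentages(info):
    """Return (naive, effective) used-memory percentages.

    naive:     everything that is not MemFree counts as used (cache included)
    effective: only memory that is not MemAvailable counts as used
    """
    total = info["MemTotal"]
    naive = 100.0 * (total - info["MemFree"]) / total
    effective = 100.0 * (total - info["MemAvailable"]) / total
    return naive, effective


# Values copied from the /proc/meminfo output in the original post.
SAMPLE = """\
MemTotal: 32364908 kB
MemFree: 2535036 kB
MemAvailable: 11228176 kB
"""

if __name__ == "__main__":
    naive, effective = usage_percentages(parse_meminfo(SAMPLE))
    print(f"naive used:     {naive:.1f}%")
    print(f"effective used: {effective:.1f}%")
```

To run it against a live system instead of the sample, pass the contents of /proc/meminfo to `parse_meminfo`.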

prakash82
Explorer

I monitor the memory from SmartConsole. What's the best way to monitor it?

 

the_rock
MVP Gold

Maybe also run cpview and check there.

Andy

Best,
Andy
prakash82
Explorer

Attached the CPView

Chris_Atkinson
MVP Gold CHKP

As asked above, are you rebooting because you actually encounter issues or an outage?

Is this a 15600 or 7000 appliance? Note that both of these models can have up to 64 GB of memory installed.

There are also some additional proactive measures you can take if this is connection table related per: https://community.checkpoint.com/t5/General-Topics/R80-x-Performance-Tuning-Tip-Connection-Table/td-...

CCSM R77/R80/ELITE
Chris_Atkinson
MVP Gold CHKP

The output suggests you have Threat Emulation enabled. Note that if the web extraction component is enabled, per sk145773 there are additional RAM requirements that you should be aware of.

CCSM R77/R80/ELITE
the_rock
MVP Gold

Good point Chris.

Best,
Andy
Vincent_Bacher
Advisor

We run Prometheus, and this would be a nice use case for process-exporter: monitor all relevant CP processes to get an overview of which one is going nuts, and then you have the right candidate(s) to dig into further. 🙂
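
As a rough illustration of the idea (this is not process-exporter itself, just a hypothetical stand-in), the following Python sketch reads VmRSS from /proc/<pid>/status and lists the processes with the largest resident memory. The function names are invented for the example, and it assumes a Python 3 interpreter is available on the machine where it is run.

```python
import os


def parse_status(status_text):
    """Return (name, rss_kb) from the text of a /proc/<pid>/status file.

    Kernel threads have no VmRSS line, so rss_kb is 0 for them.
    """
    name, rss_kb = "?", 0
    for line in status_text.splitlines():
        if line.startswith("Name:"):
            name = line.split(":", 1)[1].strip()
        elif line.startswith("VmRSS:"):
            rss_kb = int(line.split()[1])
    return name, rss_kb


def top_by_rss(proc_root="/proc", limit=10):
    """Return up to `limit` (rss_kb, pid, name) tuples, largest RSS first."""
    rows = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # only numeric directories are processes
        try:
            with open(os.path.join(proc_root, entry, "status")) as fh:
                name, rss_kb = parse_status(fh.read())
        except OSError:
            continue  # process exited while we were scanning
        rows.append((rss_kb, int(entry), name))
    rows.sort(reverse=True)
    return rows[:limit]


if __name__ == "__main__":
    if os.path.isdir("/proc"):
        for rss_kb, pid, name in top_by_rss():
            print(f"{rss_kb:>12} kB  pid={pid:<8} {name}")
```

Running something like this periodically and comparing the results over several days would show which process, if any, keeps growing between reboots.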

and now to something completely different - CCVS, CCAS, CCTE, CCCS, CCSM elite
