We have a couple of CP5600 appliances operating in different locations with very similar configurations and about the same load. Each is running R80.10 - T189.
Location B is stable and running without issues, but at Location A we have to reboot about once every 45 days due to memory issues. Whatever is happening affects the data plane, i.e. the firewall stops forwarding packets.
This is the memory output for location A:
System Capacity Summary:
Memory used: 77% (4455 MB out of 5731 MB) - below watermark
Concurrent Connections: 10410 (Unlimited)
Aggressive Aging is enabled, not active
Hash kernel memory (hmem) statistics:
Total memory allocated: 3737321472 bytes in 912432 (4096 bytes) blocks using 14 pools
Initial memory allocated: 599785472 bytes (Hash memory extended by 3137536000 bytes) - 3.1GB?
Memory allocation limit: 4806672384 bytes using 512 pools
Total memory bytes used: 0 unused: 3737321472 (100.00%) peak: 3426386556
Total memory blocks used: 0 unused: 912432 (100%) peak: 861288
Allocations: 4163792559 alloc, 0 failed alloc, 4140371486 free
System kernel memory (smem) statistics:
Total memory bytes used: 4598247500 peak: 4608745920
Total memory bytes wasted: 3721660
Blocking memory bytes used: 4784944 peak: 9567848
Non-Blocking memory bytes used: 4593462556 peak: 4599178072
Allocations: 13741524 alloc, 0 failed alloc, 13738637 free, 0 failed free
vmalloc bytes used: 4588389496 expensive: no
Kernel memory (kmem) statistics:
Total memory bytes used: 4143730832 peak: 4231943656
Allocations: 4177522403 alloc, 0 failed alloc
4154099588 free, 0 failed free
External Allocations: 16896 for packets, 88628453 for SXL
Cookies:
3778625491 total, 0 alloc, 0 free,
150073 dup, 300575262 get, 2794359219 put,
2072999334 len, 2707089222 cached len, 0 chain alloc,
0 chain free
Connections:
388319874 total, 136725382 TCP, 231455561 UDP, 19560665 ICMP,
578266 other, 30721 anticipated, 195046 recovered, 10410 concurrent,
159214 peak concurrent
Fragments:
1118953332 fragments, 2706956154 packets, 3456 expired, 0 short,
0 large, 0 duplicates, 848 failures
NAT:
67013/0 forw, 52962/0 bckw, 982 tcpudp,
0 icmp, 5906-17579 alloc
Sync: off
[Expert@LocationA:0]# free -m
total used free shared buffers cached
Mem: 7744 7580 164 0 333 1837
-/+ buffers/cache: 5409 2334
Swap: 18394 0 18394
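As a quick sanity check on the "3.1GB?" note in the hmem section above, the arithmetic can be reproduced from the reported numbers. This is only a sketch: the variable names are my own, and the values are copied by hand from the Location A output.

```python
# Values copied from the Location A hmem statistics above.
total_allocated = 3_737_321_472     # "Total memory allocated" (bytes)
initial_allocated = 599_785_472     # "Initial memory allocated" (bytes)
reported_extension = 3_137_536_000  # "Hash memory extended by" (bytes)
blocks = 912_432                    # number of blocks reported
block_size = 4096                   # block size in bytes

# The extension should be the growth beyond the initial allocation.
extension = total_allocated - initial_allocated
extension_gb = extension / 10**9    # decimal gigabytes
extension_gib = extension / 2**30   # binary gibibytes

print("extension matches report:", extension == reported_extension)
print("blocks * block size matches total:", blocks * block_size == total_allocated)
print(f"extension: {extension_gb:.2f} GB ({extension_gib:.2f} GiB)")
```

So the hash memory at Location A has grown by about 3.14 GB (2.92 GiB) beyond its initial allocation, which is consistent with the "3.1GB" guess.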
This is Location B:
System Capacity Summary:
Memory used: 9% (539 MB out of 5731 MB) - below watermark
Concurrent Connections: 8560 (Unlimited)
Aggressive Aging is enabled, not active
Hash kernel memory (hmem) statistics:
Total memory allocated: 599785472 bytes in 146432 (4096 bytes) blocks using 1 pool
Total memory bytes used: 0 unused: 599785472 (100.00%) peak: 274277488
Total memory blocks used: 0 unused: 146432 (100%) peak: 69627
Allocations: 1607331344 alloc, 0 failed alloc, 1607117916 free
System kernel memory (smem) statistics:
Total memory bytes used: 967638752 peak: 986044552
Total memory bytes wasted: 4180014
Blocking memory bytes used: 5820820 peak: 14955252
Non-Blocking memory bytes used: 961817932 peak: 971089300
Allocations: 151132250 alloc, 0 failed alloc, 151129180 free, 0 failed free
vmalloc bytes used: 956763424 expensive: no
Kernel memory (kmem) statistics:
Total memory bytes used: 401812380 peak: 640749756
Allocations: 1758439658 alloc, 0 failed alloc
1758224295 free, 0 failed free
External Allocations: 76032 for packets, 89765022 for SXL
Cookies:
1450833429 total, 836424 alloc, 836424 free,
251 dup, 433718314 get, 2695578081 put,
2263227759 len, 2298121504 cached len, 0 chain alloc,
0 chain free
Connections:
1628040697 total, 660800638 TCP, 927853823 UDP, 39386225 ICMP,
11 other, 288832 anticipated, 441738 recovered, 8560 concurrent,
161987 peak concurrent
Fragments:
302418965 fragments, 2297426537 packets, 2476610 expired, 0 short,
0 large, 0 duplicates, 1969 failures
NAT:
0/0 forw, 0/0 bckw, 0 tcpudp,
0 icmp, 0-27257 alloc
Sync: off
[Expert@locationB:0]# free -m
total used free shared buffers cached
Mem: 7744 7555 189 0 419 4896
-/+ buffers/cache: 2239 5504
Swap: 18394 0 18394
The only difference I can find between the two is that Location A has extended its hash kernel memory (hmem) and Location B has not. What would cause this behavior?
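One more rough comparison that may help narrow this down: the number of allocations that have not been freed yet on each gateway. This is a minimal sketch, assuming the alloc and free counters in the output above are cumulative and have not wrapped. The dictionary layout and the `outstanding` helper are my own, and the numbers are copied by hand from the two outputs.

```python
# (alloc, free) pairs copied from the "Allocations:" lines above.
counters = {
    "Location A": {
        "hmem": (4_163_792_559, 4_140_371_486),
        "smem": (13_741_524, 13_738_637),
        "kmem": (4_177_522_403, 4_154_099_588),
    },
    "Location B": {
        "hmem": (1_607_331_344, 1_607_117_916),
        "smem": (151_132_250, 151_129_180),
        "kmem": (1_758_439_658, 1_758_224_295),
    },
}


def outstanding(alloc: int, free: int) -> int:
    """Allocations that have not been freed yet (alloc minus free)."""
    return alloc - free


# Build {location: {pool: outstanding allocations}} for easy comparison.
result = {
    location: {pool: outstanding(a, f) for pool, (a, f) in pools.items()}
    for location, pools in counters.items()
}

for location, pools in result.items():
    for pool, count in pools.items():
        print(f"{location} {pool}: {count:,} allocations not yet freed")
```

By this measure Location A is holding roughly 23.4 million hmem and kmem allocations against roughly 0.2 million at Location B, while smem is about the same on both. Outstanding allocations are not necessarily a leak, but the gap lines up with the hmem growth seen only at Location A.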