Hello everyone,
One of our customers is constantly having issues with failovers due to a routed PNOTE. We noticed that they have around 12k OSPF routes. After the failover, we see that the routed daemon on the newly active node is already using 3 GB of memory. Here's the output from the currently active node after failover:
[Global] ch02-01> show routed memory
1_01:
Total Memory Usage: 32 MB
Core: 3 MB
BGP: 360 B
MFC: 236 B
OSPF: 28 MB
Policy: 2 KB
2_01:
Total Memory Usage: 3 GB
Core: 3 MB
BGP: 360 B
MFC: 236 B
OSPF: 3 GB
Policy: 2 KB
[Global] MAB-CL1-ch02-01> show routed resources
1_01:
Total Uptime : 3 hrs 9 mins 44 secs
Total User Time : 3 mins 2 secs
Total System Time : 13 secs
Page Faults : 85
Page Reclaims : 16070
Total Swaps: : 0
Voluntary Context Switches : 624724
Involuntary Context Switches : 70850
2_01:
Total Uptime : 21 days 1 hr 49 mins 52 secs
Total User Time : 12 hrs 3 mins 26 secs
Total System Time : 27 mins 46 secs
Page Faults : 807
Page Reclaims : 1023538
Total Swaps: : 0
Voluntary Context Switches : 63305978
Involuntary Context Switches : 12788009
At first, we thought there was a memory leak, but there are no core dumps or anything else that points to one. TAC agreed with us on this as well. What I'm wondering is whether it's normal for 12k OSPF routes to result in 3 GB of memory use, or whether we should look for an underlying issue that produces this symptom.
If this is normal, how much would route aggregation help with the situation (before we go down the path of increasing memory)?
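For context on the question above, here is a rough back-of-the-envelope calculation of OSPF memory per route on each member. It is only a sketch: it assumes both members hold roughly the same 12k OSPF routes, and that the "28 MB" and "3 GB" figures are rounded values in binary units. The variable and function names are just for illustration.

```python
# Rough per-route memory estimate from the "show routed memory" figures.
# Assumptions: ~12k OSPF routes on both members, binary units (1 MB = 1024**2 bytes),
# and reported values that are rounded by the CLI.
ROUTES = 12_000


def bytes_per_route(total_bytes: int, routes: int = ROUTES) -> float:
    """Average OSPF memory per route, in bytes."""
    return total_bytes / routes


ospf_1_01 = 28 * 1024**2  # "OSPF: 28 MB" on member 1_01
ospf_2_01 = 3 * 1024**3   # "OSPF: 3 GB" on member 2_01

per_route_1_01 = bytes_per_route(ospf_1_01)
per_route_2_01 = bytes_per_route(ospf_2_01)
ratio = ospf_2_01 / ospf_1_01

print(f"1_01: {per_route_1_01 / 1024:.1f} KB per route")
print(f"2_01: {per_route_2_01 / 1024:.1f} KB per route")
print(f"2_01 uses about {ratio:.0f}x the OSPF memory of 1_01")
```

If the route counts really are similar, the two members differ by roughly two orders of magnitude in OSPF memory for the same table size, which is what makes me doubt that the route count alone explains the 3 GB.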
As always, thank you very much.
Cheers!