Ryan_Ryan
Advisor

CP Manager 100% CPU postgres

Hi all,

 

We have a TAC case open for this, but it has been going nowhere for weeks. We have a manager running R81.10 T87. About a month ago it started running at 100% CPU across all 12 cores for several hours a day, and it has gotten progressively worse: now, within 15 minutes of a reboot, all cores are at 100%. The culprit seems to be postgres. We are at the stage where we need to reboot it several times a day, policy install takes 20-60 minutes, and it randomly drops our GUI sessions.

 

I have attached a text file with some command outputs; we are not sure where else to go from here. Revisions have been cleared (fewer than 10). Any help appreciated!

 

3 Replies
Timothy_Hall
Legend

Hmm, given the nice (NI) value of 0 on those postgres processes, that is definitely not log indexing, which is expected to consume CPU continually in the background at low priority. Has TAC run the cpm doctor tool yet? That tool does a good job of sniffing out postgres database issues, which is what this appears to be.
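
For anyone wanting to repeat the nice-value check, here is a rough Python sketch that filters the text printed by `ps -eo pid,ni,pcpu,comm` for postgres processes running at nice 0 with high CPU. The helper names, the 50% threshold, and the sample data are my own assumptions for illustration, not anything from TAC or from the attachment. Keep in mind that `ps` reports %CPU averaged over the life of the process, so `top` is the better view of what is busy right now.

```python
from typing import List, NamedTuple


class Proc(NamedTuple):
    pid: int
    nice: int
    cpu: float
    comm: str


def parse_ps(text: str) -> List[Proc]:
    """Parse the text printed by `ps -eo pid,ni,pcpu,comm`, header included."""
    procs = []
    for line in text.strip().splitlines()[1:]:  # first line is the header
        parts = line.split(None, 3)
        if len(parts) < 4:
            continue
        pid, ni, cpu, comm = parts
        try:
            procs.append(Proc(int(pid), int(ni), float(cpu), comm.strip()))
        except ValueError:
            continue  # skip rows whose NI or %CPU is not numeric (e.g. "-")
    return procs


def busy_foreground_postgres(procs: List[Proc], min_cpu: float = 50.0) -> List[Proc]:
    """Return postgres processes at nice 0 using at least min_cpu percent CPU."""
    return [p for p in procs
            if p.comm.startswith("postgres") and p.nice == 0 and p.cpu >= min_cpu]


# Made-up sample, not taken from the attachment in this thread.
SAMPLE = """\
  PID  NI %CPU COMMAND
 4321   0 98.5 postgres
 4400  19 12.0 postgres
 5000   0  1.0 cpm
"""

for p in busy_foreground_postgres(parse_ps(SAMPLE)):
    print(p.pid, p.nice, p.cpu, p.comm)
```

A postgres process at nice 19 would point toward background work such as log indexing; anything this filter returns is running at normal priority and is worth a closer look.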

You can also try poking around in the $FWDIR/log/cpm.elg and $FWDIR/log/postgres.elg log files to see whether any errors are being thrown that might be helpful. The latter log file may even show you which specific SELECT command is continually being run and give you an idea of where to look. If it is indeed some kind of database issue, that is not something you can normally fix yourself, and you'll have to rely on TAC.
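
If the log is too big to eyeball, something like the following Python sketch can pull out error lines and count which SELECT statements repeat most often. It assumes the file uses stock PostgreSQL severity keywords (ERROR, FATAL, PANIC) and that statements are actually being logged, which I have not verified for postgres.elg on a manager. The function name, the sample lines, and the table names in them are hypothetical.

```python
import re
from collections import Counter
from typing import Iterable, List, Tuple

ERROR_RE = re.compile(r"\b(ERROR|FATAL|PANIC)\b")
SELECT_RE = re.compile(r"\bSELECT\b.*", re.IGNORECASE)
# Quoted strings and bare integers, so that queries differing only in
# their literal values are counted as the same statement.
LITERAL_RE = re.compile(r"'[^']*'|\b\d+\b")


def summarize(lines: Iterable[str]) -> Tuple[List[str], Counter]:
    """Return (error lines, counts of SELECT statements with literals masked)."""
    errors: List[str] = []
    selects: Counter = Counter()
    for raw in lines:
        line = raw.rstrip("\n")
        if ERROR_RE.search(line):
            errors.append(line)
        match = SELECT_RE.search(line)
        if match:
            selects[LITERAL_RE.sub("?", match.group(0))] += 1
    return errors, selects


# Made-up sample lines; the table and column names are hypothetical.
SAMPLE = [
    "LOG:  statement: SELECT * FROM objects WHERE id = 17",
    "LOG:  statement: SELECT * FROM objects WHERE id = 42",
    "ERROR:  canceling statement due to statement timeout",
    "LOG:  statement: SELECT name FROM users WHERE name = 'bob'",
]

errors, selects = summarize(SAMPLE)
print(len(errors), "error line(s)")
for statement, count in selects.most_common(3):
    print(count, statement)
```

To run it against the real file, pass an open file handle instead of `SAMPLE`, for example `summarize(open(path, errors="replace"))` with `path` pointing at the postgres.elg file mentioned above.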

Ryan_Ryan
Advisor

Spot on as always, Tim! We had to run a cleanup script, and everything looks to be working well after that (plus a stop/start). The script seems to be hidden behind sk180294 (although we hadn't upgraded recently).

the_rock
Legend

I once worked with TAC on this, and the customer and I got rather frustrated that it was taking so long to come up with any reasonable suggestion. What we ended up doing was restoring a backup from a time when everything worked fine, and that actually solved the issue.

Please don't ask me how, as I have no clue; we never really found the reason why this happened in the first place, since the upgrade was done 6 months BEFORE they started noticing the problem.
