ashah
Explorer

Cluster member at 100% CPU

Hello Experts, 

I have upgraded a Check Point 4600 cluster from R80.10 to R80.30 (because these gateway appliances are undersized, we didn't go to R80.40).

After a successful upgrade of both cluster members, the secondary member went to 100% CPU utilization and has stayed there. Check Point TAC suggested applying a hotfix to see whether that resolves the issue, but because the member is at 100% CPU, we cannot download or install the hotfix.

The top command shows a large number of clish processes running. We tried killing some of them, but they keep respawning.

This node is currently standby and is not interrupting traffic, but it is a real concern because this is a production cluster.

Any help would be appreciated.

Timothy_Hall
Champion

I've seen this type of behavior before; it sounds like a corrupt Gaia configuration database, or one that has grown too large. The Gaia database is separate from the /config/active file, which contains a text version of your Gaia OS config. Older versions had issues with the Gaia database size growing out of control.

See Scenario 2 here which details how to rebuild the Gaia database (you won't lose your Gaia config):

https://supportcenter.checkpoint.com/supportcenter/portal?eventSubmit_doGoviewsolutiondetails=&solut...

If that doesn't help, there are a variety of other conditions that can cause this. Search for the string "Timeout waiting for response from database server" in SecureKnowledge and you'll get plenty of hits.
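
Before searching SecureKnowledge, it may help to know whether that exact message is being logged on the affected member. The following is a minimal sketch, not a Check Point tool: the function name count_db_timeouts is made up for illustration, and the log path in the usage comment is an assumption about where the message would be written.

```shell
#!/bin/sh
# Count the lines in a log file that contain the database timeout message.
# Hypothetical helper for illustration; takes the log file path as $1.
count_db_timeouts() {
    # -F matches the text literally, -c prints the number of matching lines.
    # grep exits non-zero when there are no matches (it still prints 0),
    # so "|| true" keeps the function from aborting a "set -e" script.
    grep -c -F "Timeout waiting for response from database server" "$1" || true
}

# Usage (the path is an assumption, adjust to the log you want to check):
#   count_db_timeouts /var/log/messages
```

A count that keeps climbing while the clish processes respawn would be consistent with the database problem described above, but it does not prove it.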

 

the_rock
Legend

I know you said a high number of clish jobs are running, but is there one process in particular causing this, as shown by ps -auxw or top? Also, what happens if you reboot the box? Not sure if you tried that. I get that the 4600 might be undersized to properly run even R80.30, though it is a bit peculiar that only one member has this issue, since both are the same hardware.
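
One way to answer the "which process" question without reading a busy top screen is to group ps output by command name and to see which parent keeps spawning the clish processes. This is a rough sketch under stated assumptions: the function names summarize_by_command and parents_of are invented for illustration, and it assumes a procps-style ps that accepts the -o column options shown in the usage comments.

```shell
#!/bin/sh
# Both helpers read lines of "pid ppid pcpu comm" on stdin.
# Command names containing spaces are cut at the first word.

# Print "count total_cpu command" per command name, highest total %CPU first.
summarize_by_command() {
    LC_ALL=C awk 'NF >= 4 { n[$4]++; cpu[$4] += $3 }
        END { for (c in n) printf "%d %.1f %s\n", n[c], cpu[c], c }' |
        LC_ALL=C sort -k2,2nr -k3,3
}

# Print "parent_pid count" for every parent of processes named $1,
# most children first. A single parent with many children is the
# likely source of the respawning.
parents_of() {
    LC_ALL=C awk -v name="$1" '$4 == name { p[$2]++ }
        END { for (k in p) printf "%s %d\n", k, p[k] }' |
        LC_ALL=C sort -k2,2nr -k1,1n
}

# Usage on a live system (empty "=" headers suppress the header line):
#   ps -e -o pid= -o ppid= -o pcpu= -o comm= | summarize_by_command | head
#   ps -e -o pid= -o ppid= -o pcpu= -o comm= | parents_of clish
```

If parents_of clish points at one parent PID, looking up that PID in the ps output shows which daemon or script is launching clish, which is usually more useful than killing the children one by one.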
