bramotten
Explorer

High CPU usage on management server [r81.10]


We upgraded from R80.40 to R81.10 some time ago. In the beginning, the 100% CPU usage on the management server was expected, as it was reformatting the logs for R81.10. Unfortunately, we have been getting sporadic spikes in CPU usage, causing recent traffic not to be shown in the console.

Our situation:

2x open servers in HA (active-standby), on-premise

2x Azure Check Point appliances in HA (active-standby)

1x Security Management Server, 8 vCPU, 32 GB (VMware)

We have a VPN tunnel between Azure and on-premise.

(Attachment: checkpoint.PNG)

top and htop are attached.

 

sincerely,

 

Bram

11 Replies
Timothy_Hall
Champion

The nice (NI) value shown in top for the CPU-heavy Java processes is 19, which means they are set to minimum CPU priority. These are going to be your log_indexer and SOLR processes, which will get kicked off the CPU immediately if some other process needs it. There have been several other threads about this, and it is expected behavior. If you are having issues with logs not showing up in a timely fashion, there could be contention for the hard drive, although your waiting-for-I/O (wa) is showing as zero.
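To see this yourself, the NI column in top can be checked for those processes. A minimal sketch of pulling the nice value out of a top line with awk (the sample line below is hypothetical, hand-copied for illustration):

```shell
# Hypothetical line copied from top for the busy java process.
# Columns: PID USER PR NI VIRT RES S %CPU %MEM TIME+ COMMAND
sample="12345 admin 39 19 6200m 3.1g S 600.0 9.7 123:45 java"

# NI is the 4th column; 19 is the lowest scheduling priority,
# so these processes yield the CPU to anything at normal priority.
ni=$(echo "$sample" | awk '{print $4}')
echo "nice=$ni"
```

On a live system, `ps -eo pid,ni,comm --sort=ni` gives the same nice values without the interactive display.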

New 2021 IPS/AV/ABOT Immersion Self-Guided Video Series
now available at http://www.maxpowerfirewalls.com
jb1
Participant

Hello,

did you solve this issue? I currently have two environments with the same symptoms as you: high CPU usage (after the management upgrade to R81.10, CPU is at 90%). Before the upgrade it was 20-30%. In one environment, I observed that the issue was caused by log_exporter, but after a while (2 days) the CPU levels were OK. For some reason, after the upgrade, I see a lot of wa (I/O wait) when running the "top" command on the management server. Like Timothy suggested, my first culprit was the disk, but after checking the actual disk r/w I can see that there are spikes up to 150M, which should still be OK. Memory is OK, NICs are OK, so I don't know what is causing CPU waits in such numbers. At the top we have the java process at 600%, followed by the indexer and exporter at 100%; all of them have an NI value of 19. What I did notice is that when wa kicks in, the java process drops from 600 to 100. Then it rises again until there are wa's again...
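For readers chasing the same wa symptom: a quick way to watch I/O wait over time is vmstat, whose wa column is the percentage of CPU time stalled on disk. A small sketch parsing a hypothetical vmstat sample line (column positions assumed from standard Linux vmstat output):

```shell
# Hypothetical data line from 'vmstat 1'; the columns are:
# r b swpd free buff cache si so bi bo in cs us sy id wa st
sample="2 0 0 812344 90120 912345 0 0 150 40 500 900 10 5 70 15 0"

# 'wa' is the 16th field: CPU time spent waiting for I/O
wa=$(echo "$sample" | awk '{print $16}')
echo "iowait=${wa}%"
```

A wa that stays in the double digits while the java/indexer processes stall is a hint that the disk, not the CPU, is the bottleneck.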

 

P.S. Before the upgrade everything was working OK, so something changed. Two different environments.

 

Br J

Dorit_Dor
Employee

If you recently upgraded, make sure that it's not still re-indexing your logs (after an upgrade to R81.x we re-index all the logs).

jb1
Participant

Hello Dorit, nice to hear from you. It seems you were right: both systems are now running OK. After your suggestion I did some further investigation and noticed that all the environments that had a manually added custom value of "days_to_index (value)" in $INDEXERDIR/log_indexer_custom_settings.conf prior to the upgrade had the high CPU issue, and this file survives an upgrade. As stated in https://community.checkpoint.com/t5/Management/Cannot-view-previous-logs-after-upgrade-to-R81/td-p/1... R81 should only re-index logs for 24h unless triggered manually, but that is something a reboot does 🙂
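For anyone checking their own system: based on the "days_to_index (value)" notation quoted above, a manually added entry in $INDEXERDIR/log_indexer_custom_settings.conf might look something like the line below. The exact syntax is an assumption from the post, so verify against your own file before editing:

```
:days_to_index (90)
```

If such a line survived the upgrade, the whole 90-day window of logs gets re-indexed by the new engine, which would explain a long-running CPU spike.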

What was weird, and made me look away from indexing, is that the process that was consuming CPU was java, not log_indexer as in previous versions when re-indexing.

Thank you for your response!

Miri_Ofir
Employee

@jb1 The Java process that was working hard during re-indexing is actually the SOLR daemon (you can see it in top -c).
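The `-c` flag makes top show full command lines instead of just the program name, which is how the generic "java" entry can be tied to SOLR. A toy sketch of that identification step, with a made-up command line (the actual SOLR path on a Check Point server will differ):

```shell
# Hypothetical full command line as shown by 'top -c' for the java process
cmdline="java -jar /opt/solr/start.jar"

# Match on the command line, not the process name
case "$cmdline" in
  *solr*) msg="this java process is the SOLR daemon" ;;
  *)      msg="this java process is something else" ;;
esac
echo "$msg"
```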

The default value of Days to Index is 24 hours. I don't know what the reason was to change it prior to the upgrade, and usually it is not necessary, but specifically in upgrades from R80.x to R81/R81.x, users might change this value to a longer time. In R81 we upgraded the SOLR indexing engine, and the new engine cannot read from indexes that were created prior to the upgrade.

Now after reindex is completed, I'm sure you notice much faster log queries and reports.

jb1
Participant

Hello,

Ah OK, I can see it now. The reason why this value is/was changed is because of SmartEvent reports. At some point the customer wanted the ability to create SmartEvent reports for 90/180 days. To achieve that, disk space was added to the management server, and logs were imported back into SmartEvent and re-indexed so that the customer could do reports for more than 14 days.

 

Unfortunately, that did not play out well. After the upgrade to R81, I've noticed that the re-index for old logs is not done properly, as some chunks of indexed logs are randomly missing. I've deleted the new index files and ran another re-index, but the result was almost the same in the two environments where SmartEvent history is "required". I've contacted TAC on this issue.

Thank you for your reply.

Br J

Yes, I do have to say, the new reports and queries do feel faster, kudos for that 🙂

Miri_Ofir
Employee

I see. Please share the SR with me privately; I will monitor the case with TAC.

Fernando_Lopez
Contributor

Hi jb1!

Have you got any updates on this case?

I'm having a similar problem and I'd like to know if you could share any info that could help.

Thanks!

jb1
Participant

Hello Fernando!

If you are talking about the high CPU, then the issue was the SOLR daemon re-indexing old files. If you are talking about the problem where some logs were not indexed, then I just received word from TAC yesterday, actually, saying that they found a problem and a portfix is available (in this environment). They also mentioned that this fix will be implemented in the next JHF, which will be out in a couple of weeks. Hope this helps.

 

Br J

Fernando_Lopez
Contributor

Hi jb1! 

Thanks for your reply.

Already solved mine; the SMS was stuck trying to re-index log files.

As it was a VMware machine, I was able to reinstall the SMS entirely, and that somehow solved the problem.

jb1
Participant

Hello Fernando,

A drastic move 🙂 Were any changes made in $INDEXERDIR/log_indexer_custom_settings.conf, specifically with the days_to_index parameter?
