kamilazat
Advisor

SmartUpdate is extremely slow after MDS upgrade to R81.20

Hello everyone.

We are experiencing heavy lagging on SmartUpdate after upgrading from R81.10 to R81.20 JHF Take 98. It does NOT hang or freeze, but everything works extremely slowly. This did not happen before the update so something feels wrong with the current state of the system.

At first we assumed this was normal, since reindexing causes heavy load until it finishes, but it has been more than a week since the upgrade and the lag is still there. SmartConsole doesn't lag the way SmartUpdate does.

We tried following sk41793 and sk112334 for debugging on global domain (since we connect to SU from global) and SmartConsole, but none of the logs gave us anything. At this point we're kind of lost as to where (and how) to look further. cpwd doesn't show any restarts of any processes, and we also didn't see any complaints in cpd, messages etc. The only strange thing is that we see that fwm keeps writing so much information even after we turned off the debug parameters as described in sk41793:

[Expert@HostName]# fw debug fwm off TDERROR_ALL_SU=0
[Expert@HostName]# fw debug fwm off TDERROR_ALL_cprep=0
[Expert@HostName]# fw debug fwm off TDERROR_ALL_cpget=0
[Expert@HostName]# fw debug fwm off TDERROR_ALL_cpms=0
[Expert@HostName]# fw debug fwm off OPSEC_DEBUG_LEVEL=0
[Expert@HostName]# fw debug fwm off SU_DEBUG_LEVEL=0
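To put a number on "so much information", one option is to measure how fast the log file grows over a fixed interval. This is only a rough sketch: `log_growth` is a made-up helper name, and the path in the example comment assumes the fwm debug output goes to `$FWDIR/log/fwm.elg` in the relevant domain context, which the post does not state.

```shell
#!/bin/bash
# Print how many bytes a file grew over an interval (default: 10 seconds).
# A negative number means the file shrank, for example after a log rotation.
log_growth() {
    local file=$1 interval=${2:-10}
    local before after
    before=$(wc -c < "$file") || return 1
    sleep "$interval"
    after=$(wc -c < "$file") || return 1
    echo $((after - before))
}

# Example (assumed path, not confirmed by the post):
#   log_growth "$FWDIR/log/fwm.elg" 60
```

Comparing the result before and after turning the debug flags off would show whether the flags actually changed how much fwm writes.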

Also, here's the output of the top command, in case it helps:

top - 10:35:52 up 8 days, 16:11, 1 user, load average: 4.76, 4.88, 6.09
Threads: 2743 total, 10 running, 2733 sleeping, 0 stopped, 0 zombie
%Cpu0 : 21.0 us, 2.9 sy, 0.0 ni, 71.4 id, 1.9 wa, 0.0 hi, 2.9 si, 0.0 st
%Cpu1 : 13.3 us, 7.6 sy, 1.9 ni, 76.2 id, 0.0 wa, 1.0 hi, 0.0 si, 0.0 st
%Cpu2 : 19.2 us, 3.8 sy, 1.9 ni, 75.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu3 : 45.7 us, 7.6 sy, 2.9 ni, 42.9 id, 0.0 wa, 0.0 hi, 1.0 si, 0.0 st
%Cpu4 : 24.5 us, 3.8 sy, 2.8 ni, 67.0 id, 0.9 wa, 0.0 hi, 0.9 si, 0.0 st
%Cpu5 : 35.6 us, 8.7 sy, 3.8 ni, 51.9 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu6 : 49.1 us, 1.9 sy, 0.9 ni, 47.2 id, 0.0 wa, 0.9 hi, 0.0 si, 0.0 st
%Cpu7 : 24.8 us, 4.8 sy, 1.9 ni, 67.6 id, 0.0 wa, 0.0 hi, 1.0 si, 0.0 st
%Cpu8 : 50.5 us, 2.9 sy, 0.0 ni, 45.7 id, 0.0 wa, 0.0 hi, 1.0 si, 0.0 st
%Cpu9 : 18.1 us, 5.7 sy, 1.9 ni, 72.4 id, 1.9 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu10 : 23.1 us, 13.5 sy, 1.0 ni, 62.5 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu11 : 20.2 us, 10.6 sy, 1.0 ni, 68.3 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu12 : 18.1 us, 2.9 sy, 6.7 ni, 71.4 id, 1.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu13 : 15.2 us, 7.6 sy, 1.9 ni, 74.3 id, 1.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu14 : 29.0 us, 4.7 sy, 3.7 ni, 61.7 id, 0.9 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu15 : 26.4 us, 1.9 sy, 0.0 ni, 71.7 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem : 98416716 total, 12708860 free, 47303964 used, 38403892 buff/cache
KiB Swap: 33551748 total, 32049184 free, 1502564 used. 48368700 avail Mem

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ P COMMAND
9144 admin 20 0 2670296 2.318g 43640 R 61.0 2.5 394:32.66 9 cpd
6412 admin 20 0 1009428 673704 50348 S 58.1 0.7 698:26.05 12 fwm
15303 admin 20 0 22292 4320 2080 R 55.2 0.0 0:00.58 6 esc_db_c+
27786 cp_post+ 20 0 1819428 1.671g 1.552g R 32.4 1.8 96:26.72 3 postgres
14022 admin 20 0 25.035g 0.014t 12356 R 28.6 15.7 0:06.16 10 qtp28026+
13681 admin 20 0 25.035g 0.014t 12356 R 25.7 15.7 0:08.37 8 qtp28026+
6360 admin 20 0 25.035g 0.014t 12356 R 19.0 15.7 0:08.79 0 qtp28026+
13682 admin 20 0 25.035g 0.014t 12356 S 18.1 15.7 0:08.54 5 qtp28026+
11493 admin 20 0 25.035g 0.014t 12356 S 16.2 15.7 0:07.78 7 qtp28026+
12813 admin 20 0 25.035g 0.014t 12356 S 16.2 15.7 0:10.29 1 qtp28026+
12802 admin 20 0 25.035g 0.014t 12356 R 15.2 15.7 0:09.57 3 qtp28026+
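Since the table above is per thread (note the "Threads:" summary line), the load of a multi-threaded process is spread over several rows. Below is a minimal sketch that sums %CPU per command name. It assumes the input comes from batch mode, e.g. `top -H -b -n 1`, and `sum_cpu_by_command` is just an illustrative name.

```shell
#!/bin/bash
# Sum per-thread %CPU by command name from batch-mode top output on stdin.
# Only the first word of the COMMAND column is used as the key.
sum_cpu_by_command() {
    awk '
        # Header row: record which columns hold %CPU and COMMAND.
        $1 == "PID" {
            for (i = 1; i <= NF; i++) {
                if ($i == "%CPU") cpu_col = i
                if ($i == "COMMAND") cmd_col = i
            }
            in_table = 1
            next
        }
        # Data rows: accumulate the CPU share per command name.
        in_table && NF >= cmd_col && $1 ~ /^[0-9]+$/ {
            total[$cmd_col] += $cpu_col
        }
        END {
            for (name in total) printf "%.1f %s\n", total[name], name
        }
    ' | sort -rn
}

# Example: top -H -b -n 1 | sum_cpu_by_command | head
```

Applied to the rows shown above, the seven qtp28026+ threads add up to 139.0, more than cpd (61.0) or fwm (58.1) on their own.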

 

Any ideas and recommendations would be very much appreciated.

 

Cheers!

0 Kudos
21 Replies
the_rock
Legend

Just curious, is it the same whether you open it from SmartConsole or from the CP folder where GuiDBedit also "resides"?

Andy

0 Kudos
kamilazat
Advisor

Hi Andy!

Interesting approach. But the behavior is the same.

0 Kudos
the_rock
Legend

Fair enough...what jumbo is installed?

Andy

0 Kudos
kamilazat
Advisor

JHF Take 89. We installed it right after upgrading.

0 Kudos
the_rock
Legend

I know 98 (the latest one) is recommended at this point. Not sure if it's worth trying that take...

Andy

0 Kudos
AkosBakos
Leader

Hi @kamilazat 

Take 96 has a relevant fix. Maybe it covers the same behaviour:

image.png

Before you do any deeper investigation, install Take 98 as @the_rock mentioned.

My experience: it lowered the load of my SmartLog by at least 40%.

We have almost 4 MGMT servers on Take 98, and I can say it is safe to install! We are waiting for your feedback 🙂

Akos

 

----------------
\m/_(>_<)_\m/
0 Kudos
the_rock
Legend

Hey @AkosBakos ...on a slightly different note, though still relevant to jumbo 98, do you have it installed on any firewalls? I ask because there was a post here by someone saying they installed it and it broke remote access. That's pretty significant: a pretty large customer of mine, with people connecting from all over the world, asked me about installing that jumbo hotfix, and I told them to wait, specifically because of that post.

Will see if I can find it.

Andy

0 Kudos
kamilazat
Advisor

@the_rock @AkosBakos oh my... That was an unfortunate typo. We have JHF Take 98 🙂

EDIT: We collected cpinfo from global MDS and HCP found that there are cpd coredumps that happened on 8th and 9th March. Do you think it may be relevant? It bugs me how an upgrade can cause cpd to crash.🤔
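To line the crash dates up against the upgrade date, it can help to list the core files with their timestamps. This is a small sketch under assumptions: `list_core_dumps` is a hypothetical helper, and both the dump directory and the file naming in the example comment are assumed, not taken from the HCP report.

```shell
#!/bin/bash
# List core dump files whose names start with a given process name,
# newest first, with modification times. Prints nothing if none match.
list_core_dumps() {
    local dir=$1 name=$2
    ls -lt "$dir"/"$name"* 2>/dev/null
}

# Example (assumed Gaia user-mode dump directory):
#   list_core_dumps /var/log/dump/usermode cpd
```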

0 Kudos
the_rock
Legend

No worries. Did you try cprestart to see if that helps?

Andy

0 Kudos
AkosBakos
Leader

I don't suggest it, because it starts everything at the "same" time. evstart starts only the indexing.

Akos

----------------
\m/_(>_<)_\m/
0 Kudos
the_rock
Legend

True true, I just figured since it's management, it's pretty safe to do.

Andy

0 Kudos
kamilazat
Advisor

So you think it may be related to indexing? Can you elaborate on it a bit? I can't imagine how log indexing can be related. I'll need a good reason to propose a MW for that 🙂

 

0 Kudos
the_rock
Legend

That was also my thinking, hence why I suggested cprestart.

Andy

0 Kudos
AkosBakos
Leader

Hi @kamilazat 

This can be a good reason for the maintenance window:

  • Some indexed logs are not visible. Although some queries yield results, some show empty lines.

https://support.checkpoint.com/results/sk/sk168315

----------------
\m/_(>_<)_\m/
0 Kudos
kamilazat
Advisor

@AkosBakos I will try to arrange a MW and run evstop/evstart as soon as possible. I'll update you with the results.

0 Kudos
the_rock
Legend

Screenshot attached.

Andy

0 Kudos
AkosBakos
Leader

Hi @kamilazat 

Even without a specific reason, try evstop and then evstart on the MGMT.

This solved a lot of problems in the past.

Akos

----------------
\m/_(>_<)_\m/
the_rock
Legend

Not a bad idea 🙂

0 Kudos
kamilazat
Advisor

That will trigger reindexing, correct? If yes, it will take some time to report back due to maintenance window restrictions 🙂

0 Kudos
AkosBakos
Leader

Yes, exactly!

----------------
\m/_(>_<)_\m/
0 Kudos
the_rock
Legend

For what it's worth, here is the AI Copilot answer 🙂

Andy

There could be several reasons why SmartUpdate is slow after upgrading MDS to R81.20. Here are some potential causes and solutions:

  1. Database Re-indexing:

    • After an upgrade, the system may need to re-index the database, which can temporarily slow down operations. This is especially true if there are a large number of objects or logs to process.
  2. Resource Utilization:

    • The upgrade process might have left some temporary files or processes running, consuming system resources. Check for any unnecessary processes and clean up temporary files.
  3. Configuration Issues:

    • There might be configuration issues or mismatches that occurred during the upgrade. Ensure that all configurations are correctly set and that there are no conflicts.
  4. Network Latency:

    • Network issues can also cause delays. Ensure that the network connection between the MDS and the SmartUpdate server is stable and has low latency.
  5. Jumbo Hotfixes:

    • Ensure that you have applied the latest Jumbo Hotfix Accumulator for R81.20. These hotfixes often contain performance improvements and bug fixes that can resolve such issues.
  6. Logs and Monitoring:

    • Check the logs for any errors or warnings that might indicate the cause of the slowdown. Use monitoring tools to identify any bottlenecks in the system.
0 Kudos
