Hello everyone.
We are seeing heavy lag in SmartUpdate after upgrading from R81.10 to R81.20 JHF Take 98. It does NOT hang or freeze, but every action is extremely slow. This did not happen before the upgrade, so something seems wrong with the current state of the system.
At first we assumed this was normal, with reindexing causing heavy load until it finished, but it has been more than a week since the upgrade and the lag is still there. SmartConsole does not lag the way SmartUpdate does.
We followed sk41793 and sk112334 to debug the global domain (since we connect to SmartUpdate from global) and SmartConsole, but none of the logs showed anything useful. At this point we are not sure where (or how) to look further. cpwd does not show restarts of any processes, and we did not see any complaints in cpd, messages, etc. The only strange thing is that fwm keeps writing a large amount of output even after we turned off the debug parameters as described in sk41793:
[Expert@HostName]# fw debug fwm off TDERROR_ALL_SU=0
[Expert@HostName]# fw debug fwm off TDERROR_ALL_cprep=0
[Expert@HostName]# fw debug fwm off TDERROR_ALL_cpget=0
[Expert@HostName]# fw debug fwm off TDERROR_ALL_cpms=0
[Expert@HostName]# fw debug fwm off OPSEC_DEBUG_LEVEL=0
[Expert@HostName]# fw debug fwm off SU_DEBUG_LEVEL=0
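To put a number on how much fwm is still writing, something like the following could be used. This is only a minimal sketch: the helper names file_size and growth_bytes are made up for illustration, and the log path in the example comment ($FWDIR/log/fwm.elg) is an assumption that may differ on a Multi-Domain Server.

```shell
#!/bin/bash
# Sketch: report how many bytes a log file grows by over a short interval.
# file_size and growth_bytes are hypothetical helpers, not Check Point tools.

# Print the current size of a file in bytes (0 if it does not exist).
file_size() {
    if [ -f "$1" ]; then
        wc -c < "$1" | tr -d ' '
    else
        echo 0
    fi
}

# growth_bytes FILE [SECONDS]: sample the size, wait, sample again,
# and print the difference in bytes (default interval: 10 seconds).
growth_bytes() {
    local file="$1" interval="${2:-10}" before after
    before=$(file_size "$file")
    sleep "$interval"
    after=$(file_size "$file")
    # A negative value means the file was rotated or truncated in between.
    echo $(( after - before ))
}

# Example (the path is an assumption, adjust for your environment):
# growth_bytes "$FWDIR/log/fwm.elg" 10
```

Running it a few times before and after switching the debug flags off would show whether the write rate actually changes.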
Also, here is the output of the top command, in case it helps:
top - 10:35:52 up 8 days, 16:11, 1 user, load average: 4.76, 4.88, 6.09
Threads: 2743 total, 10 running, 2733 sleeping, 0 stopped, 0 zombie
%Cpu0 : 21.0 us, 2.9 sy, 0.0 ni, 71.4 id, 1.9 wa, 0.0 hi, 2.9 si, 0.0 st
%Cpu1 : 13.3 us, 7.6 sy, 1.9 ni, 76.2 id, 0.0 wa, 1.0 hi, 0.0 si, 0.0 st
%Cpu2 : 19.2 us, 3.8 sy, 1.9 ni, 75.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu3 : 45.7 us, 7.6 sy, 2.9 ni, 42.9 id, 0.0 wa, 0.0 hi, 1.0 si, 0.0 st
%Cpu4 : 24.5 us, 3.8 sy, 2.8 ni, 67.0 id, 0.9 wa, 0.0 hi, 0.9 si, 0.0 st
%Cpu5 : 35.6 us, 8.7 sy, 3.8 ni, 51.9 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu6 : 49.1 us, 1.9 sy, 0.9 ni, 47.2 id, 0.0 wa, 0.9 hi, 0.0 si, 0.0 st
%Cpu7 : 24.8 us, 4.8 sy, 1.9 ni, 67.6 id, 0.0 wa, 0.0 hi, 1.0 si, 0.0 st
%Cpu8 : 50.5 us, 2.9 sy, 0.0 ni, 45.7 id, 0.0 wa, 0.0 hi, 1.0 si, 0.0 st
%Cpu9 : 18.1 us, 5.7 sy, 1.9 ni, 72.4 id, 1.9 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu10 : 23.1 us, 13.5 sy, 1.0 ni, 62.5 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu11 : 20.2 us, 10.6 sy, 1.0 ni, 68.3 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu12 : 18.1 us, 2.9 sy, 6.7 ni, 71.4 id, 1.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu13 : 15.2 us, 7.6 sy, 1.9 ni, 74.3 id, 1.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu14 : 29.0 us, 4.7 sy, 3.7 ni, 61.7 id, 0.9 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu15 : 26.4 us, 1.9 sy, 0.0 ni, 71.7 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem : 98416716 total, 12708860 free, 47303964 used, 38403892 buff/cache
KiB Swap: 33551748 total, 32049184 free, 1502564 used. 48368700 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ P COMMAND
9144 admin 20 0 2670296 2.318g 43640 R 61.0 2.5 394:32.66 9 cpd
6412 admin 20 0 1009428 673704 50348 S 58.1 0.7 698:26.05 12 fwm
15303 admin 20 0 22292 4320 2080 R 55.2 0.0 0:00.58 6 esc_db_c+
27786 cp_post+ 20 0 1819428 1.671g 1.552g R 32.4 1.8 96:26.72 3 postgres
14022 admin 20 0 25.035g 0.014t 12356 R 28.6 15.7 0:06.16 10 qtp28026+
13681 admin 20 0 25.035g 0.014t 12356 R 25.7 15.7 0:08.37 8 qtp28026+
6360 admin 20 0 25.035g 0.014t 12356 R 19.0 15.7 0:08.79 0 qtp28026+
13682 admin 20 0 25.035g 0.014t 12356 S 18.1 15.7 0:08.54 5 qtp28026+
11493 admin 20 0 25.035g 0.014t 12356 S 16.2 15.7 0:07.78 7 qtp28026+
12813 admin 20 0 25.035g 0.014t 12356 S 16.2 15.7 0:10.29 1 qtp28026+
12802 admin 20 0 25.035g 0.014t 12356 R 15.2 15.7 0:09.57 3 qtp28026+
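Since the list above is per thread, it may be easier to read with %CPU summed per command name. A rough sketch, assuming the same column layout as above (%CPU in field 9, COMMAND in field 13, which depends on how top is configured); sum_cpu_by_command is a made-up helper name:

```shell
# Sum %CPU per COMMAND from top's process lines read on stdin.
# Assumes the column layout shown above: %CPU is field 9, COMMAND is field 13.
# Header and summary lines are skipped because their first field is not a PID.
sum_cpu_by_command() {
    LC_ALL=C awk '
        $1 ~ /^[0-9]+$/ && NF >= 13 { cpu[$13] += $9 }
        END { for (c in cpu) printf "%.1f %s\n", cpu[c], c }
    ' | LC_ALL=C sort -rn
}

# Example: one batch-mode snapshot in thread view, aggregated by command.
# top -H -b -n 1 | sum_cpu_by_command
```

Applied to the process lines shown above, the qtp28026+ threads add up to roughly 139% of one core, ahead of cpd (61.0) and fwm (58.1). Truncated names such as qtp28026+ are grouped exactly as top displays them.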
Any ideas and recommendations would be very much appreciated.
Cheers!