Slowness in displaying log in SMS.

Bachan · ‎2023-08-29

Hi Team,

We have multiple gateways (approx. 8 to 10) managed by a single SMS server.

The SMS server is hosted on a VM running R80.40 Take 197.

Initially, the log delay in SmartConsole was around 2 to 3 hours. After performing "Install Database" on the SMS, the issue was resolved.

Now we are observing the issue again, intermittently. This time the delay is 15 to 20 minutes during peak hours.

-Connectivity from the gateways to the SMS server is fine.

-CPU and memory utilization was fine during peak hours.

-cpwd_admin list showed everything as Established.

-Logs are growing in $FWDIR/log.

-df -kh showed ample free space.

-LVM manager also showed around 400 GB of free space for logs.

We first observed this issue 15 to 20 days ago. I have checked the indexing logs and found the errors below. Please let me know if this points to anything.

Logs

I see the logs below starting from Aug 8, which is when the issue was first observed.

[Expert@]# cat log_indexer.elg.1 | grep error
[4069522240][28 Aug 15:42:55] CMappedBinaryFile::error opening file /opt/CPsuite-R80.40/fw1/log/pg_gaia_dump.log
[4069522240][28 Aug 15:42:55] CLogFile::Open2: error: open (/opt/CPsuite-R80.40/fw1/log/pg_gaia_dump.log) for reading failed
[4080003904][28 Aug 16:07:39] CMappedBinaryFile::error opening file /opt/CPsuite-R80.40/fw1/log/pg_gaia_dump.log
[4080003904][28 Aug 16:07:39] CLogFile::Open2: error: open (/opt/CPsuite-R80.40/fw1/log/pg_gaia_dump.log) for reading failed
[4100971328][28 Aug 16:27:36] CMappedBinaryFile::error opening file /opt/CPsuite-R80.40/fw1/log/pg_gaia_dump.log
[4100971328][28 Aug 16:27:36] CLogFile::Open2: error: open (/opt/CPsuite-R80.40/fw1/log/pg_gaia_dump.log) for reading failed
[4109364032][28 Aug 17:06:14] CMappedBinaryFile::error opening file /opt/CPsuite-R80.40/fw1/log/pg_gaia_dump.log
[4109364032][28 Aug 17:06:14] CLogFile::Open2: error: open (/opt/CPsuite-R80.40/fw1/log/pg_gaia_dump.log) for reading failed
[4069522240][28 Aug 17:31:25] CMappedBinaryFile::error opening file /opt/CPsuite-R80.40/fw1/log/pg_gaia_dump.log
[4069522240][28 Aug 17:31:25] CLogFile::Open2: error: open (/opt/CPsuite-R80.40/fw1/log/pg_gaia_dump.log) for reading failed
[4069522240][28 Aug 18:07:22] CMappedBinaryFile::error opening file /opt/CPsuite-R80.40/fw1/log/pg_gaia_dump.log
[4069522240][28 Aug 18:07:22] CLogFile::Open2: error: open (/opt/CPsuite-R80.40/fw1/log/pg_gaia_dump.log) for reading failed
[4080003904][28 Aug 18:34:53] CMappedBinaryFile::error opening file /opt/CPsuite-R80.40/fw1/log/pg_gaia_dump.log
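
To check whether these indexer errors cluster around the peak hours when logs are delayed, the matches can be counted per hour. This is only a minimal sketch: the function name errors_per_hour is hypothetical, and it assumes every matching line starts with the [id][DD Mon HH:MM:SS] prefix seen in the excerpt above.

```shell
# Hypothetical helper: count indexer error lines per hour.
# Expects lines shaped like the excerpt above, e.g.
#   [4069522240][28 Aug 15:42:55] CLogFile::Open2: error: open (...) for reading failed
errors_per_hour() {
    # --color=never keeps terminal color codes out of the pasted output
    grep --color=never 'error' "$1" \
        | cut -d']' -f2 \
        | cut -d'[' -f2 \
        | cut -d: -f1 \
        | sort | uniq -c \
        | awk '{ print $2, $3, $4 ":00", $1 }'
}
```

Usage would be something like: errors_per_hour log_indexer.elg.1, which prints one line per hour followed by the number of matching lines.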

======================================================================================================

[Expert@]# cat pg_gaia_dump.log
Thu Jan 19 14:10:59 IST 2023 Initiating psql_dumpall_client
Thu Jan 19 14:10:59 IST 2023 PID of psql_dumpall_client is 44430
Thu Jan 19 14:10:59 IST 2023 Noticed change in file size of dump file
Thu Jan 19 14:10:59 IST 2023 Previous recorded dump file size is -1
Thu Jan 19 14:10:59 IST 2023 Sleeping for 1 minute
Thu Jan 19 14:10:59 IST 2023
Thu Jan 19 14:11:59 IST 2023 Current size of dump file is 135036928
Thu Jan 19 14:11:59 IST 2023 Noticed change in file size of dump file
Thu Jan 19 14:11:59 IST 2023 Previous recorded dump file size is 0
Thu Jan 19 14:11:59 IST 2023 Sleeping for 1 minute
Thu Jan 19 14:11:59 IST 2023
Thu Jan 19 14:12:59 IST 2023 Current size of dump file is 330268672
Thu Jan 19 14:12:59 IST 2023 Noticed change in file size of dump file
Thu Jan 19 14:12:59 IST 2023 Previous recorded dump file size is 135036928
Thu Jan 19 14:12:59 IST 2023 Sleeping for 1 minute
Thu Jan 19 14:12:59 IST 2023
Thu Jan 19 14:13:59 IST 2023 Current size of dump file is 671607745
Thu Jan 19 14:13:59 IST 2023 Noticed change in file size of dump file
Thu Jan 19 14:13:59 IST 2023 Previous recorded dump file size is 330268672
Thu Jan 19 14:13:59 IST 2023 Sleeping for 1 minute
Thu Jan 19 14:13:59 IST 2023
Thu Jan 19 14:14:59 IST 2023 Current size of dump file is 671607745
Thu Jan 19 14:14:59 IST 2023 Dumpall client has exited with status 0
Thu Jan 19 14:14:59 IST 2023 PID of psql_dumpall_client is 44430
Thu Jan 19 14:14:59 IST 2023 psql_dumpall_client has completed and located in /opt/CPsuite-R80.40/fw1/tmp/migrate//ren_db.gz
Tue Aug 8 15:43:00 IST 2023 Initiating psql_dumpall_client
Tue Aug 8 15:43:00 IST 2023 PID of psql_dumpall_client is 19608
Tue Aug 8 15:43:00 IST 2023 Noticed change in file size of dump file
Tue Aug 8 15:43:00 IST 2023 Previous recorded dump file size is -1
Tue Aug 8 15:43:00 IST 2023 Sleeping for 1 minute
Tue Aug 8 15:43:00 IST 2023
pg_dump: Dumping the contents of table "dbindexnetworkobject_data" failed: PQgetResult() failed.
pg_dump: Error message from server: FATAL: terminating connection due to administrator command
pg_dump: The command was: COPY public.dbindexnetworkobject_data (objid, abacusserver, appliancetype, checkpointobjid, clustermember, cpproductsinstalled, cpmitype, dag, dlesession, domainid, external, firewall, floodgate, folder, internal, ipaddr, ipaddr6, ipsblade, isbridge, junction, logserver, management, mediaip, netmask, netmask6, objfromxaddr, permissionprimitivepresetid, primarymanagement, profiletype, readprimitiveid, realtimemonitor, sicname, slimfwhardwaretype, sofawaredag, sofawaremode, sofawareoriginated, sofawarevpn, svnversionname, usedglobally, vendor, versionmajor, versionminor, versionoptionpack, vsclustermember, vsclusternetobj, vsnetobj, vsxclustermember, vsxclusternetobj, vsxnetobj, opid, fromversion, toversion, editingsession, deleted) TO stdout;
pg_dumpall: pg_dump failed on database "cpm", exiting
Tue Aug 8 15:44:00 IST 2023 Current size of dump file is 61849346
Tue Aug 8 15:44:00 IST 2023 Noticed change in file size of dump file
Tue Aug 8 15:44:00 IST 2023 Previous recorded dump file size is 0
Tue Aug 8 15:44:00 IST 2023 Sleeping for 1 minute
Tue Aug 8 15:44:00 IST 2023
Tue Aug 8 15:45:00 IST 2023 Current size of dump file is 61849346
Tue Aug 8 15:45:00 IST 2023 Dumpall client has exited with status 1
Tue Aug 8 15:45:00 IST 2023 Postgres Dump failed! Please check postgres logs for more information
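
Since pg_gaia_dump.log accumulates runs over many months, it may help to list only the dump runs that ended with a non-zero status. A minimal sketch, assuming the "Dumpall client has exited with status N" wording shown above stays the same; the function name failed_dumps is hypothetical.

```shell
# Hypothetical helper: show only the dump runs that did not exit with status 0.
failed_dumps() {
    # The exit status is the last field on the matching line
    grep 'Dumpall client has exited with status' "$1" | awk '$NF != 0'
}
```

Run against the file above (failed_dumps pg_gaia_dump.log), this would print only the Aug 8 line with status 1.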

=========================================================================================================

[Expert@]# psql_client cpm postgres -c "SELECT pg_size_pretty(pg_database_size('cpm'));"
 pg_size_pretty
----------------
 12 GB
(1 row)

=======================================================================================================

1) View LVM storage overview
2) Resize lv_current/lv_log Logical Volume
3) Quit
Select action: 1
LVM overview
============
               Size(GB)   Used(GB)   Configurable   Description
lv_current     150        54         yes            Check Point OS and products
lv_log         1500       1201       yes            Logs volume
lv_snapshot    53         53         no             Snapshot volume
upgrade        165        N/A        no             Reserved for version upgrade
swap           64         N/A        no             Swap volume size
free           115        N/A        no             Unused space
               -------    ----
total          2047       N/A        no             Total size

press ENTER to continue.

==========================================================================================================

df -kh
Filesystem                       Size  Used  Avail  Use%  Mounted on
/dev/mapper/vg_splat-lv_current  150G  54G   97G    36%   /
/dev/sda1                        291M  43M   233M   16%   /boot
tmpfs                            63G   4.5M  63G    1%    /dev/shm
/dev/mapper/vg_splat-lv_log      1.5T  1.2T  301G   80%   /var/log
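
For a quick check of which filesystems are getting full, df output can be filtered by usage percentage. A minimal sketch, assuming the usual six-column df layout with no spaces in mount points; the function name over_threshold and the 75 percent limit are illustrative choices, not Check Point recommendations.

```shell
# Hypothetical helper: read df output on stdin and print mount points
# whose usage is at or above the given percentage.
over_threshold() {
    awk -v limit="$1" 'NR > 1 {
        use = $5
        sub(/%/, "", use)              # "80%" -> "80"
        if (use + 0 >= limit) print $6, $5
    }'
}
# Usage: df -P | over_threshold 75
```

Fed the df output above with a limit of 75, this would report only /var/log at 80%.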

the_rock · ‎2023-08-29

Does rebooting the mgmt server help? Also, it seems you need to free up some space in /var/log, for sure.
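
Before deleting anything, it is worth seeing which directories under /var/log hold the most data. A minimal sketch using GNU du and sort; the function name largest_dirs is hypothetical, and what is actually safe to remove on a management server is not covered here.

```shell
# Hypothetical helper: list the largest first-level directories under a path.
largest_dirs() {
    # $1 = directory to inspect, $2 = number of entries to show (default 10)
    # -x stays on one filesystem, -k reports sizes in KB so sort -n works
    du -xk --max-depth=1 "$1" 2>/dev/null | sort -rn | head -n "${2:-10}"
}
# Usage: largest_dirs /var/log 15
```

The first line of the output is the total for the directory itself, and the following lines are its subdirectories in descending size.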

Andy

Timothy_Hall · ‎2023-08-29

Almost certainly an oversubscribed disk path, which is common in VMware environments. How many CPUs are allocated to the VM?

Please provide the output of the following, ideally while logs are heavily delayed:

vmstat 5 5

iostat 5 5

cpstat mg -f log_server

cpstat ls -f logging

cpstat ls -f indexer
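
Because the output is only useful if it is taken while logs are delayed, the commands above can be wrapped so that one call saves everything to a timestamped file. A minimal sketch: the function name capture_diag and the output file name are hypothetical, and a command that is missing or fails is noted in the file rather than stopping the run.

```shell
# Hypothetical wrapper: run the requested commands and save the output to one file.
capture_diag() {
    # $1 = output directory (default: current directory)
    # $2 = sampling interval in seconds (default 5), $3 = sample count (default 5)
    out="${1:-.}/logdelay_$(date +%Y%m%d_%H%M%S).txt"
    for cmd in "vmstat ${2:-5} ${3:-5}" "iostat ${2:-5} ${3:-5}" \
               "cpstat mg -f log_server" "cpstat ls -f logging" "cpstat ls -f indexer"; do
        echo "=== $cmd ($(date)) ==="
        if command -v "${cmd%% *}" >/dev/null 2>&1; then
            # $cmd is unquoted on purpose so the string splits into words
            $cmd 2>&1 || echo "exit status $? from: $cmd"
        else
            echo "not found: ${cmd%% *}"
        fi
    done > "$out"
    echo "$out"    # print the path of the file that was written
}
```

Calling capture_diag with a directory of your choice during a peak-hour delay would leave a single file that can be attached to the thread.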

Gateway Performance Optimization R81.20 Course
now available at maxpowerfirewalls.com

Bachan · ‎2023-08-30

Please find the output requested

vmstat 5 5
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b     swpd     free  buff     cache  si  so  bi     bo     in     cs  us  sy  id  wa  st
 6  0  2266624  1264844    24  87677676   0   0   1      1      0      0  23   2  74   1   0
 5  0  2266624  1178028    24  87759444   0   0   9  27263  16206  13657  25   2  73   0   0
 2  0  2266624   973744    24  87958660   0   0  18   1256  16373  14428  25   2  73   0   0
 7  0  2266624   878004    24  88049072   0   0  10     46  17848  16472  26   2  71   0   0
 3  0  2266624   825164    24  88095748   0   0   1  12844  13950  11700  19   2  79   0   0
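
One way to read a vmstat sample like this is to average the idle (id) and I/O wait (wa) columns over the live samples. A minimal sketch, assuming the standard 17-column vmstat layout shown above; the first data row is skipped because vmstat reports averages since boot there. The function name vmstat_summary is hypothetical.

```shell
# Hypothetical helper: average CPU idle and I/O wait from vmstat output on stdin.
vmstat_summary() {
    # NR > 3 skips the two header lines and the since-boot first sample;
    # column 15 is "id" and column 16 is "wa" in the layout above
    awk 'NR > 3 { idle += $15; iowait += $16; n++ }
         END { if (n) printf "samples=%d avg_idle=%.1f avg_iowait=%.1f\n", n, idle / n, iowait / n }'
}
# Usage: vmstat 5 5 | vmstat_summary
```

A consistently high avg_iowait during a delay would point toward the disk path, while a value near zero would suggest looking elsewhere.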

===============================================================================

[Expert@]# iostat 5 5
Linux 3.10.0-957.21.3cpx86_64 (PR-PC-CS) 08/30/23 _x86_64_ (20 CPU)

avg-cpu: %user %nice %system %iowait %steal %idle
4.10 19.14 2.35 0.64 0.00 73.77

Device: tps kB_read/s kB_wrtn/s kB_read kB_wrtn
sda 174.03 1089.52 1089.52 2147483647 2147483647
dm-0 131.79 1089.52 1089.52 2147483647 2147483647
dm-1 56.97 121.11 1089.52 238704250 2147483647
dm-2 0.00 0.00 0.00 2408 0

avg-cpu: %user %nice %system %iowait %steal %idle
3.02 14.61 1.72 0.00 0.00 80.64

Device: tps kB_read/s kB_wrtn/s kB_read kB_wrtn
sda 14.40 0.00 0.00 0 0
dm-0 6.40 0.00 0.00 0 0
dm-1 8.00 0.00 0.00 0 0
dm-2 0.00 0.00 0.00 0 0

avg-cpu: %user %nice %system %iowait %steal %idle
4.02 15.18 2.00 0.02 0.00 78.77

Device: tps kB_read/s kB_wrtn/s kB_read kB_wrtn
sda 110.20 0.00 0.00 0 0
dm-0 93.80 0.00 0.00 0 0
dm-1 18.20 0.80 0.00 4 0
dm-2 0.00 0.00 0.00 0 0

avg-cpu: %user %nice %system %iowait %steal %idle
3.88 14.41 1.89 0.00 0.00 79.82

Device: tps kB_read/s kB_wrtn/s kB_read kB_wrtn
sda 36.20 0.00 0.00 0 0
dm-0 26.00 0.00 0.00 0 0
dm-1 12.20 2.40 0.00 12 0
dm-2 0.00 0.00 0.00 0 0

avg-cpu: %user %nice %system %iowait %steal %idle
3.35 16.09 1.89 0.08 0.00 78.59

Device: tps kB_read/s kB_wrtn/s kB_read kB_wrtn
sda 67.40 0.00 0.00 0 0
dm-0 65.20 0.00 0.00 0 0
dm-1 2.40 0.00 0.00 0 0
dm-2 0.00 0.00 0.00 0 0

========================================================================

[Expert@]# cpstat mg -f log_server

Log Receive Rate: 10401
Log Receive Rate Peak: 43748
Log Receive Rate Last 10 Minutes: 9817
Log Receive Rate Last Hour: 13394

[Expert@]# cpstat ls -f indexer
No product has flag 'ls'
[Expert@]# cpstat ls -f logging
No product has flag 'ls'

[Expert@]# exit
logout

Timothy_Hall · ‎2023-08-31

Were these commands run while the logs were being heavily delayed? It doesn't look like it, as this system has plenty of CPU, memory, and disk speed. If they were, this does not look like a resource issue to me and could be a Check Point code issue. Keep in mind that the Log Indexer was significantly improved in R81 (or is it R81.10?) by getting rid of SOLR and using a much more efficient and scalable logging/indexing mechanism, and you are still on R80.40.

If you can identify when heavy log delays occurred in the last 30 days, first determine the day of the month (e.g. the 20th of August for this example). Once you have that, post the results of these commands run from expert mode:

sar -f /var/log/sa/sa20

sar -f /var/log/sa/sa20 -r

sar -f /var/log/sa/sa20 -n EDEV
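
The sa20 in the commands above is just "sa" plus the two-digit day of the month, so the file for any other day can be derived from its date. A minimal sketch, assuming the /var/log/sa location used above and a date given as YYYY-MM-DD; the function name sa_file_for_date is hypothetical.

```shell
# Hypothetical helper: map a YYYY-MM-DD date to its daily sysstat file.
sa_file_for_date() {
    # ${1##*-} strips everything up to the last "-", leaving the day (e.g. 20)
    printf '/var/log/sa/sa%s\n' "${1##*-}"
}
# Usage: sar -f "$(sa_file_for_date 2023-08-20)" -r
```

Note that these files are overwritten as the month rolls over, so this only works for days still within the retention window.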


googjinkim · ‎2023-09-07

Hi,

Try applying the solution in the link below.

https://support.checkpoint.com/results/sk/sk167511
