Arturxr
Contributor

VIP ping delay and active addresses of the MGMT interface

Hello, we've discovered high ping latency to the VIP address and to the active node's address on the MGMT interface of our Check Point cluster. This seems to be causing communication issues between the Identity Collector and Check Point: timeout errors occur periodically, and some events may not reach Check Point. Furthermore, the pdp monitor user command returns "daemon did not respond or not running!", while the pdp monitor ip command does return a result.
I also can't run the cpinfo -y all command (the output freezes), and the gateway periodically appears red in SmartConsole.

Could you tell me what could be causing the ping latency to the VIP address and to the active MGMT interface address?


Accepted Solutions
Arturxr
Contributor

It seems the problem has been resolved; we will keep monitoring it. We enabled CoreXL Dynamic Balancing, following https://sc1.checkpoint.com/documents/R81.10/WebAdminGuides/EN/CP_R81.10_PerformanceTuning_AdminGuide...

In short:

set dynamic-balancing state enable

reboot
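For reference, after the reboot the current state can be checked from Expert mode; this is only a sketch based on the same guide, so verify the command is available on your build:

dynamic_balancing -p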


38 Replies
the_rock
MVP Diamond

Hey @Arturxr 

When did this issue start? Also, for what it's worth, can you make sure the time is set correctly and synchronized? Sometimes that can definitely cause these problems; I've seen it a few times.
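For example, a quick check from Gaia clish on both cluster members could look like this (just a sketch; the exact command set may vary slightly by version):

show clock
show ntp servers
show ntp active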

Best,
Andy
"Have a great day and if its not, change it"
Arturxr
Contributor

I checked the time and everything is set correctly. The problem was noticed recently, when some users stopped being identified by their access roles; the ping delays were discovered by accident. Judging by the graphs in the monitoring system, the delays have always been there, usually throughout working hours when there is a lot of user traffic. After restarting the cluster, the situation seemed to worsen and the delays increased even more.

the_rock
MVP Diamond

Fair enough. Are you able to figure out based on the policy revisions if there were any changes that could have impacted this behavior?

Best,
Andy
"Have a great day and if its not, change it"
Arturxr
Contributor

We did not make any changes to the policy; the problem arose on its own.

the_rock
MVP Diamond

Do you see any drops if you run zdebug and grep for the specific IP?

fw ctl zdebug + drop | grep x.x.x.x (just replace x.x.x.x with affected ip address)

Best,
Andy
"Have a great day and if its not, change it"
Arturxr
Contributor

I tried looking at the Identity Collector address, as well as the address of the active node, but I didn't notice any drops.

Vincent_Bacher
MVP Silver

The error message ‘daemon did not respond or not running!’ for pdp commands usually appears when the system has a load spike or is generally overloaded.
Have you used top to check whether the CPU utilisation is OK or too high?
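For example, from Expert mode you could look at the overall picture first and then at the pdpd threads specifically; a sketch:

top
top -H -p $(pidof pdpd)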

and now to something completely different - CCVS, CCAS, CCTE, CCCS, CCSM elite
Arturxr
Contributor

Yes, the strangest thing is that the load is not high and does not exceed 48% during the day, but there are problems as if the load were high.

Vincent_Bacher
MVP Silver

Nevertheless, it would be interesting to see what problem the pdpd daemon has, because normally the message is only displayed when there is a load problem.
To be on the safe side, I would analyse the daemon using perf.
First, collect data with

perf record -p $(pidof pdpd)

Then wait a while and cancel with Ctrl+C.
Then display the result with:

perf report
It may be that everything is OK, but it's better to be safe than sorry. As I said, “daemon did not respond or not running!” does not appear without reason.
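A time-bounded variant that avoids the manual Ctrl+C would be the following sketch (the 30-second window is just an example):

perf record -p $(pidof pdpd) -- sleep 30
perf report --stdio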

and now to something completely different - CCVS, CCAS, CCTE, CCCS, CCSM elite
Arturxr
Contributor

I'll try to collect it tomorrow, since user traffic has dropped for now and Check Point seems to have recovered. Still, it seems to me that the root of the problem lies in packet delays on the mgmt interface.

the_rock
MVP Diamond

Can you try the command below when the issue is occurring? Just replace eth0 with the actual mgmt interface name:


[Expert@CP-GW:0]# ethtool -S eth0
NIC statistics:
Tx Queue#: 0
TSO pkts tx: 0
TSO bytes tx: 0
ucast pkts tx: 28236
ucast bytes tx: 7204990
mcast pkts tx: 0
mcast bytes tx: 0
bcast pkts tx: 0
bcast bytes tx: 0
pkts tx err: 0
pkts tx discard: 0
drv dropped tx total: 0
too many frags: 0
giant hdr: 0
hdr err: 0
tso: 0
ring full: 0
pkts linearized: 0
hdr cloned: 0
giant hdr: 0
Tx Queue#: 1
TSO pkts tx: 0
TSO bytes tx: 0
ucast pkts tx: 3385966
ucast bytes tx: 330360905
mcast pkts tx: 0
mcast bytes tx: 0
bcast pkts tx: 0
bcast bytes tx: 0
pkts tx err: 0
pkts tx discard: 0
drv dropped tx total: 0
too many frags: 0
giant hdr: 0
hdr err: 0
tso: 0
ring full: 0
pkts linearized: 0
hdr cloned: 0
giant hdr: 0
Tx Queue#: 2
TSO pkts tx: 0
TSO bytes tx: 0
ucast pkts tx: 63375
ucast bytes tx: 6192225
mcast pkts tx: 0
mcast bytes tx: 0
bcast pkts tx: 0
bcast bytes tx: 0
pkts tx err: 0
pkts tx discard: 0
drv dropped tx total: 0
too many frags: 0
giant hdr: 0
hdr err: 0
tso: 0
ring full: 0
pkts linearized: 0
hdr cloned: 0
giant hdr: 0
Tx Queue#: 3
TSO pkts tx: 0
TSO bytes tx: 0
ucast pkts tx: 3371959
ucast bytes tx: 291134280
mcast pkts tx: 0
mcast bytes tx: 0
bcast pkts tx: 2
bcast bytes tx: 84
pkts tx err: 0
pkts tx discard: 0
drv dropped tx total: 0
too many frags: 0
giant hdr: 0
hdr err: 0
tso: 0
ring full: 0
pkts linearized: 0
hdr cloned: 0
giant hdr: 0
Tx Queue#: 4
TSO pkts tx: 0
TSO bytes tx: 0
ucast pkts tx: 57490
ucast bytes tx: 6673403
mcast pkts tx: 0
mcast bytes tx: 0
bcast pkts tx: 0
bcast bytes tx: 0
pkts tx err: 0
pkts tx discard: 0
drv dropped tx total: 0
too many frags: 0
giant hdr: 0
hdr err: 0
tso: 0
ring full: 0
pkts linearized: 0
hdr cloned: 0
giant hdr: 0
Tx Queue#: 5
TSO pkts tx: 0
TSO bytes tx: 0
ucast pkts tx: 3357203
ucast bytes tx: 289774189
mcast pkts tx: 0
mcast bytes tx: 0
bcast pkts tx: 0
bcast bytes tx: 0
pkts tx err: 0
pkts tx discard: 0
drv dropped tx total: 0
too many frags: 0
giant hdr: 0
hdr err: 0
tso: 0
ring full: 0
pkts linearized: 0
hdr cloned: 0
giant hdr: 0
Tx Queue#: 6
TSO pkts tx: 0
TSO bytes tx: 0
ucast pkts tx: 23619
ucast bytes tx: 5230029
mcast pkts tx: 0
mcast bytes tx: 0
bcast pkts tx: 0
bcast bytes tx: 0
pkts tx err: 0
pkts tx discard: 0
drv dropped tx total: 0
too many frags: 0
giant hdr: 0
hdr err: 0
tso: 0
ring full: 0
pkts linearized: 0
hdr cloned: 0
giant hdr: 0
Tx Queue#: 7
TSO pkts tx: 0
TSO bytes tx: 0
ucast pkts tx: 51158
ucast bytes tx: 7085314
mcast pkts tx: 0
mcast bytes tx: 0
bcast pkts tx: 0
bcast bytes tx: 0
pkts tx err: 0
pkts tx discard: 0
drv dropped tx total: 0
too many frags: 0
giant hdr: 0
hdr err: 0
tso: 0
ring full: 0
pkts linearized: 0
hdr cloned: 0
giant hdr: 0
Rx Queue#: 0
LRO pkts rx: 370577
LRO byte rx: 570807857
ucast pkts rx: 10543134
ucast bytes rx: 2463264301
mcast pkts rx: 0
mcast bytes rx: 0
bcast pkts rx: 212155
bcast bytes rx: 21748729
pkts rx OOB: 0
pkts rx err: 0
drv dropped rx total: 0
err: 0
fcs: 0
rx buf alloc fail: 0
Rx Queue#: 1
LRO pkts rx: 0
LRO byte rx: 0
ucast pkts rx: 0
ucast bytes rx: 0
mcast pkts rx: 0
mcast bytes rx: 0
bcast pkts rx: 0
bcast bytes rx: 0
pkts rx OOB: 0
pkts rx err: 0
drv dropped rx total: 0
err: 0
fcs: 0
rx buf alloc fail: 0
Rx Queue#: 2
LRO pkts rx: 0
LRO byte rx: 0
ucast pkts rx: 0
ucast bytes rx: 0
mcast pkts rx: 0
mcast bytes rx: 0
bcast pkts rx: 0
bcast bytes rx: 0
pkts rx OOB: 0
pkts rx err: 0
drv dropped rx total: 0
err: 0
fcs: 0
rx buf alloc fail: 0
Rx Queue#: 3
LRO pkts rx: 0
LRO byte rx: 0
ucast pkts rx: 0
ucast bytes rx: 0
mcast pkts rx: 0
mcast bytes rx: 0
bcast pkts rx: 0
bcast bytes rx: 0
pkts rx OOB: 0
pkts rx err: 0
drv dropped rx total: 0
err: 0
fcs: 0
rx buf alloc fail: 0
Rx Queue#: 4
LRO pkts rx: 0
LRO byte rx: 0
ucast pkts rx: 0
ucast bytes rx: 0
mcast pkts rx: 0
mcast bytes rx: 0
bcast pkts rx: 0
bcast bytes rx: 0
pkts rx OOB: 0
pkts rx err: 0
drv dropped rx total: 0
err: 0
fcs: 0
rx buf alloc fail: 0
Rx Queue#: 5
LRO pkts rx: 0
LRO byte rx: 0
ucast pkts rx: 0
ucast bytes rx: 0
mcast pkts rx: 0
mcast bytes rx: 0
bcast pkts rx: 0
bcast bytes rx: 0
pkts rx OOB: 0
pkts rx err: 0
drv dropped rx total: 0
err: 0
fcs: 0
rx buf alloc fail: 0
Rx Queue#: 6
LRO pkts rx: 0
LRO byte rx: 0
ucast pkts rx: 0
ucast bytes rx: 0
mcast pkts rx: 0
mcast bytes rx: 0
bcast pkts rx: 0
bcast bytes rx: 0
pkts rx OOB: 0
pkts rx err: 0
drv dropped rx total: 0
err: 0
fcs: 0
rx buf alloc fail: 0
Rx Queue#: 7
LRO pkts rx: 0
LRO byte rx: 0
ucast pkts rx: 0
ucast bytes rx: 0
mcast pkts rx: 0
mcast bytes rx: 0
bcast pkts rx: 0
bcast bytes rx: 0
pkts rx OOB: 0
pkts rx err: 0
drv dropped rx total: 0
err: 0
fcs: 0
rx buf alloc fail: 0
tx timeout count: 0
[Expert@CP-GW:0]#

Best,
Andy
"Have a great day and if its not, change it"
Vincent_Bacher
MVP Silver

A shorter and easier-to-read alternative would be netstat -ni, or watch netstat -ni to monitor whether errors are increasing and how fast.

netstat -ni
Kernel Interface table
Iface   MTU    Met  RX-OK      RX-ERR  RX-DRP  RX-OVR  TX-OK      TX-ERR  TX-DRP  TX-OVR  Flg
eth0    1500   0    56492512   0       1719    0       54322958   0       0       0       BMRU
lo      65536  0    81189170   0       0       0       81189170   0       0       0       LMdPORU




and now to something completely different - CCVS, CCAS, CCTE, CCCS, CCSM elite
the_rock
MVP Diamond

Yep...or even ethtool -S may help too.

Best,
Andy
"Have a great day and if its not, change it"
Arturxr
Contributor

No RX or TX errors detected.

Vincent_Bacher
MVP Silver

So, as I suspected, the issue with ping replies seems to be more of a symptom than the cause, which points more towards a load issue.

and now to something completely different - CCVS, CCAS, CCTE, CCCS, CCSM elite
Arturxr
Contributor

no stats available

Vincent_Bacher
MVP Silver

For me it sounds that there are multiple symptoms and the question is, which is the root cause. The packet delay could as well be just a symptom. But hard to say at this point.

and now to something completely different - CCVS, CCAS, CCTE, CCCS, CCSM elite
the_rock
MVP Diamond

Could be "red herring", as they say...hard to tell, for sure.

Best,
Andy
"Have a great day and if its not, change it"
Vincent_Bacher
MVP Silver

However, since even ‘cpinfo -y all’ freezes when the problem occurs, I wonder whether this can really be attributed to the interface. Of course, the output of the command may be delayed due to network problems, but this can be determined by calling ‘time cpinfo -y all’.

 time cpinfo -y all | tail -n 1

This is Check Point CPinfo Build 914000250 for GAIA


real    0m2.280s
user    0m1.170s
sys     0m0.958s

If the command takes forever to execute according to the time display, I believe that this is less of a problem with the mgmt interface. Or, to be more precise, it may not be the only problem.

and now to something completely different - CCVS, CCAS, CCTE, CCCS, CCSM elite
Arturxr
Contributor

real 3m17.391s
user 0m5.854s
sys 0m7.796s

Arturxr
Contributor

Among the suspicious entries, the output showed the following:
25.95% pdpd [kernel.kallsyms] [k] native_queued_spin_lock_slowpath

Vincent_Bacher
MVP Silver

I guess, a high percentage of CPU time spent in native_queued_spin_lock_slowpath indicates significant contention and potential performance issues with the pdpd daemon.
I'd say this could point to suboptimal CoreXL/SND configuration, such as allocating too many SND cores, which can lead to excessive locking overhead.

However, I doubt that adjusting CoreXL settings specifically for PDP would be advisable.

Overall, this supports my assumption that offloading IDC → Firewall connections by introducing a dedicated, upstream PDP instance could improve the situation.
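To see the current core allocation before changing anything, something like this from Expert mode should give the picture (a sketch using the standard CoreXL commands):

fw ctl affinity -l -r
fw ctl multik stat

The first command shows which cores handle the interfaces (SND) versus the CoreXL firewall instances; the second lists the firewall instances and their connection counts.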

and now to something completely different - CCVS, CCAS, CCTE, CCCS, CCSM elite
Arturxr
Contributor

Can you tell me if there is a guide on how to do this? That is, is it possible to allocate an entire core to one process?

Vincent_Bacher
MVP Silver

Regarding my assumption about the process, maybe @Timothy_Hall is the best contact for performance improvement. He is way better than I am when it comes to performance optimisation.

and now to something completely different - CCVS, CCAS, CCTE, CCCS, CCSM elite
the_rock
MVP Diamond

That's actually a super valid point, Vince.

Best,
Andy
"Have a great day and if its not, change it"
Timothy_Hall
MVP Gold

Sounds to me like memory problems which are leading to network problems.  Please provide the output of the Super Seven run on the problematic gateway for further analysis:

https://community.checkpoint.com/t5/Scripts/S7PAC-Super-Seven-Performance-Assessment-Commands/m-p/40...

Also are you using the Identity Collector software?  If not, trying to perform all IA functions on the gateway itself can overload the pdpd daemon and cause IA issues.
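In the meantime, a quick look at memory pressure on the gateway itself can be taken from Expert mode (standard commands, shown here only as a sketch):

free -m
fw ctl pstat

In the fw ctl pstat output, failed memory allocations would be a strong hint in that direction.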

New Book: "Max Power 2026" Coming Soon
Check Point Firewall Performance Optimization
Arturxr
Contributor

I ran the script; everything looks good in general, but something is still confusing:

fwaccel stats -s
Accelerated conns/Total conns : 143/89086 (0%)
LightSpeed conns/Total conns : 0/89086 (0%)
Accelerated pkts/Total pkts : 54565446822/61758703001 (88%)
LightSpeed pkts/Total pkts : 0/61758703001 (0%)
F2Fed pkts/Total pkts : 7193256179/61758703001 (11%)
F2V pkts/Total pkts : 265061601/61758703001 (0%)
CPASXL pkts/Total pkts : 47012465094/61758703001 (76%)
PSLXL pkts/Total pkts : 7121643784/61758703001 (11%)
CPAS pipeline pkts/Total pkts : 0/61758703001 (0%)
PSL pipeline pkts/Total pkts : 0/61758703001 (0%)
CPAS inline pkts/Total pkts : 0/61758703001 (0%)
PSL inline pkts/Total pkts : 0/61758703001 (0%)
QOS inbound pkts/Total pkts : 0/61758703001 (0%)
QOS outbound pkts/Total pkts : 0/61758703001 (0%)
Corrected pkts/Total pkts : 0/61758703001 (0%)

especially: Accelerated conns/Total conns : 143/89086 (0%)

Timothy_Hall
MVP Gold

Please post the full Super Seven results along with the output of enabled_blades.

I'm assuming that fwaccel stat reports that Accept templates are fully enabled?  If so a zero templating rate is caused by policy layer construction or the use of protocol signatures with services in the policy.  Please post the output of fwaccel templates -R for further diagnosis.
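For reference, the templating state can be confirmed with something like this (a sketch; the grep pattern is just for convenience):

fwaccel stat | grep -i templates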

 

New Book: "Max Power 2026" Coming Soon
Check Point Firewall Performance Optimization
Arturxr
Contributor

fwaccel templates -R

fwaccel: illegal option -- R
Usage: fwaccel templates <options>

Options:
-m <max entries> - max number of entries to print
-s - print only the number of offloaded templates
-S - prints statistics
-d - prints drop templates
-h - this help message

Accept templates flags (one or more of the below flags):
U - unidirectional
N - NAT
A - accounted
S - pxl enabled
Q - qxl enabled
I - NAC enabled
O - created for rule with/below dynamic object
X - created for NAT rule with translated dynamic object
E - created for NAT rule with IDA object
M - created for rule with/below domain object
T - created for rule with/below time object
Z - created for rule with/below Sec Zone object
B - created for rule with/below IDA support object
R - created for rule with/below Traceroute object
P - created for a connection that may match on a service with src port
Drop templates flags (one or more of the below flags):
D - drop template
L - log drop action
