Hello CheckMates,
We are troubleshooting a persistent issue with fwFullyUtilizedDrops on a cluster of two Check Point 26000 appliances running R81.20 (Take 120) in Kernel Mode (KPPAK).
Our environment handles approximately 350,000 packets/sec (as reported by CPView Overview) with about 200,000 active sessions. We are observing constant fwFullyUtilizedDrops (around 2,000 to 2,500 drops/sec during peak hours), which represents roughly a 1% packet loss. These drops are invisible in zdebug drop or zdebug + drop.
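As a quick check on the loss figure, the drop rates quoted above can be expressed as a percentage of the packet rate. This is only a minimal sketch using the numbers from this post; the function name `drop_percentage` is illustrative and not part of any Check Point tooling.

```python
def drop_percentage(drops_per_sec: float, packets_per_sec: float) -> float:
    """Return drops as a percentage of the total packet rate."""
    if packets_per_sec <= 0:
        raise ValueError("packets_per_sec must be positive")
    return 100.0 * drops_per_sec / packets_per_sec


PACKETS_PER_SEC = 350_000  # packet rate reported by CPView Overview in this post

low = drop_percentage(2_000, PACKETS_PER_SEC)   # lower bound of the peak-hour drop rate
high = drop_percentage(2_500, PACKETS_PER_SEC)  # upper bound of the peak-hour drop rate
print(f"{low:.2f}% to {high:.2f}%")
```

With these inputs the range works out to somewhat under 1%, so "roughly 1%" is a rounded-up figure.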
Here is a summary of our optimization efforts so far:
What worked:
What did NOT work (No impact or negative impact on drops):
fw amw unload and ips off. While CPU usage decreased, fwFullyUtilizedDrops remained exactly the same, suggesting the bottleneck is not at the Worker processing level.
The SMT Hypothesis: We are now considering disabling SMT (Hyperthreading). Our hypothesis is that at 350,000 pps, the SND instances are suffering from resource contention or cache thrashing on the physical cores, hindering their ability to empty the buffers and inject packets into the CoreXL queues.
Questions:
fwFullyUtilizedDrops by disabling SMT?
cpconfig in R81.20 (as per sk172304), is the fwboot bootconf set_core_override command a safe alternative to a BIOS/TAC intervention for a production test?
fast_accel test confirm that the SND is at its absolute physical limit for processing interrupts?
We are including the output of the s7pac command at the time of writing, in case it is useful.
Finally, we are planning an upgrade to R82.10 in the coming weeks to leverage the structural benefits of Polling mode. However, we are proceeding with caution as our environment includes a critical Site-to-Site VPN with a partner where coordination for troubleshooting would be extremely difficult in the event of a regression.
Thank you for your time and insights.
Turning off SMT will probably fix the problem, but not for the reasons you think.
Are you using Dynamic Split? I don't think you are, because, if I'm understanding your architecture correctly, you have 10 real cores trying to execute both SND and Worker functions due to how SMT divides cores. The 26000 uses 2x Intel Xeon Gold 6254 CPUs @ 3.10GHz (18C, 36T each). So there are 2 sockets, but your split appears to be set for only one socket. With SMT, socket 1 is cores 0-35 (18 real & 18 siblings) and socket 2 is cores 36-71 (18 real & 18 siblings).
Your SNDs are cores 0-4 and cores 36-40, which would make sense if there were a single socket, as these would be sibling cores. But you have two sockets, so cores 0-4 are siblings with cores 18-22 (all workers), and cores 36-40 are siblings with cores 54-58 (all workers). So you have 10 of your 36 real cores trying to perform both SND and worker functions.
It's easy to see how the CoreXL queues could overflow if the worker is constantly getting clobbered by SND functions. Since the SNDs run in the kernel for KPPAK mode, the busier the SNDs get, the more the conflicting worker instances get kicked off the sibling core to make way for the kernel, which has ultimate scheduling priority over measly USFW worker processes. This may also cause significant traffic between NUMA nodes, further impacting performance. Adding fast_accel rules would make this situation much worse on the 10 overlapping real cores, by making the conflicting SNDs even busier than they were before.
Check my work here, please @Bob_Zimmerman.
Just checked one of my dual-socket boxes (a 16200, running R82 jumbo 60), and I see cores 0-11 on socket 0, 12-23 on socket 1, 24-35 on socket 0, and 36-47 on socket 1. Pretty sure the hyperthreads are 24-47. Still agreed, this suggests something is wrong with CoreXL and dynamic split.
The output of 'cpstat os -f multi_cpu -o 1 -c 5' looks like the load is relatively well spread. A drop debug should show which instance dropped the traffic, or which instance the SND was trying to send the traffic to. Is it consistent, or do the drops implicate many instances?
Edit: Incidentally, the 16200 I checked is under light load, and has SNDs on cores 0, 1, 12, 13, 24, 25, 36, and 37. In other words, the first two cores of every set of cores. This pattern holds for all of my non-VSX firewalls with two sockets. On a dual-socket system running VSX, I see SNDs on 0, 1, 24, and 25. On an asymmetric single-socket system (a 9300, which uses an Intel i5-13400E), I see SNDs on 0 and 1 (which seems to be the hyperthread for 0), and none on the e-core complex (12-15).
Are these 26000 units running VSX or MDPS?
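Since the two replies above assume different hyperthread numbering, the sibling pairing is worth reading from the system rather than inferring. The sketch below assumes the gateway exposes the standard Linux sysfs topology files (`thread_siblings_list`); the helper names and the two example layouts are hypothetical, and treating every non-SND CPU as a worker is a simplification.

```python
from pathlib import Path


def parse_cpu_list(text: str) -> list[int]:
    """Parse a Linux CPU list such as '0,36' or '0-1' into sorted CPU ids."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            start, end = part.split("-")
            cpus.update(range(int(start), int(end) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def read_sibling_groups(sysfs_root: str = "/sys/devices/system/cpu") -> set[tuple[int, ...]]:
    """Collect the SMT sibling groups (one group per physical core) from sysfs."""
    groups = set()
    for path in Path(sysfs_root).glob("cpu[0-9]*/topology/thread_siblings_list"):
        groups.add(tuple(parse_cpu_list(path.read_text())))
    return groups


def mixed_role_cores(groups, snd_cpus):
    """Return sibling groups that contain both an SND CPU and a non-SND CPU."""
    snd = set(snd_cpus)
    return sorted(g for g in groups if snd & set(g) and set(g) - snd)


# SND CPUs as described in this thread: 0-4 and 36-40.
snd_cpus = list(range(0, 5)) + list(range(36, 41))

# Hypothetical layout A: the sibling of CPU n is n+36 (hyperthreads numbered last).
layout_a = {(n, n + 36) for n in range(36)}

# Hypothetical layout B: siblings are n and n+18 inside each 36-CPU socket block.
layout_b = {(n, n + 18) for n in range(18)} | {(n, n + 18) for n in range(36, 54)}

print(len(mixed_role_cores(layout_a, snd_cpus)))  # physical cores shared by SND and non-SND
print(len(mixed_role_cores(layout_b, snd_cpus)))
```

On a live system, `mixed_role_cores(read_sibling_groups(), snd_cpus)` would use the real pairing instead of either assumed layout. Under layout A the SNDs occupy whole physical cores; under layout B ten physical cores are shared, which is the overlap described earlier in the thread.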
dynamic_balancing -p reports the status as 'on'. However, there is clearly a configuration issue because our $FWDIR/conf/dynamic_split.conf file shows AUTOMATION_MODE=0, which may indicate the mechanism is effectively in a static or 'frozen' state and unable to perform automatic adjustments (although we are not sure what AUTOMATION_MODE=0 really means).
fw zdebug drop or even fw ctl zdebug + drop. Although it is understood that these drops typically occur when the CoreXL input queue is full, they remain invisible in the debug output despite SNMP reporting a consistent rate of 2,000 to 2,500 drops/sec during peak hours. At this moment, we haven't found a reliable way to capture these events or pinpoint exactly where in the datapath the discard is happening.
After extensive troubleshooting, we finally identified the root cause, and it turned out to be a monitoring configuration error on our end.
We had assigned OID iso.3.6.1.4.1.2620.1.1.25.13.0 (fwLoggedTotal) to our Zabbix item instead of iso.3.6.1.4.1.2620.1.1.25.26.0 (fwFullyUtilizedDrops). We were never actually observing fwFullyUtilizedDrops at all — we were monitoring the total number of logged connections.
In hindsight, the most obvious clue was right there from the beginning: the drops were completely invisible in zdebug drop and in SmartConsole logs. That inconsistency should have immediately pointed to a measurement problem rather than an actual performance issue — but none of us caught it.
We want to thank Timothy Hall and Bob Zimmerman for their thorough and technically sound analysis of our CoreXL/SMT configuration. Even though it was based on a false premise, the findings regarding SND/Worker core overlap on our dual-socket setup are real and worth addressing independently.
Lesson learned: always validate your data source before analyzing the data.
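A check along these lines could catch this kind of mix-up before any analysis starts. It is a minimal sketch: the two OIDs are the ones quoted in this thread and should still be verified against the Check Point MIB, the function names are illustrative, and the item dictionary stands in for whatever the monitoring system (Zabbix in our case) actually exports.

```python
from typing import Optional

# OID values as quoted in this thread; verify against the vendor MIB before relying on them.
EXPECTED_OIDS = {
    "fwLoggedTotal": "1.3.6.1.4.1.2620.1.1.25.13.0",
    "fwFullyUtilizedDrops": "1.3.6.1.4.1.2620.1.1.25.26.0",
}


def normalize_oid(oid: str) -> str:
    """Convert forms like 'iso.3.6.1...' or '.1.3.6.1...' to plain dotted numeric."""
    oid = oid.strip().lstrip(".")
    if oid.startswith("iso."):
        oid = "1." + oid[len("iso."):]  # 'iso' is arc 1 of the OID tree
    return oid


def resolve_name(oid: str) -> Optional[str]:
    """Return the known counter name for an OID, or None if it is not in the table."""
    numeric = normalize_oid(oid)
    for name, expected in EXPECTED_OIDS.items():
        if expected == numeric:
            return name
    return None


def find_mismatches(items: dict[str, str]) -> list[tuple[str, str, Optional[str]]]:
    """List (item name, configured OID, counter it really points to) for wrong items."""
    problems = []
    for item_name, configured in items.items():
        expected = EXPECTED_OIDS.get(item_name)
        if expected is None:
            continue  # unknown item: nothing to compare against
        numeric = normalize_oid(configured)
        if numeric != expected:
            problems.append((item_name, numeric, resolve_name(numeric)))
    return problems


# The misconfiguration described above: the drops item pointed at the logging counter.
items = {"fwFullyUtilizedDrops": "iso.3.6.1.4.1.2620.1.1.25.13.0"}
for name, oid, actual in find_mismatches(items):
    print(f"{name} is configured with {oid}, which is {actual}")
```

Keying the table by counter name means a wrong item reports which counter it is really reading, not just that it is wrong.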
Closing this thread as resolved.