This post presents a technical analysis of behavior observed in a customer environment: the symptoms, the investigation, the root cause, and the corrective action applied.
During the analysis, the environment presented intermittent packet loss affecting communication between internal networks and services traversing the Check Point Firewall (R82.10). The issue was not constant, which made the initial identification more complex and required a deeper investigation.
From an operational standpoint, one of the main observed behaviors was the firewall dropping traffic unexpectedly. Legitimate traffic was being matched against the cleanup rule without any clear indication of misconfiguration in security policies or NAT rules. The behavior appeared random and inconsistent across different traffic flows.
The initial investigation focused on common areas such as:
- Security policy evaluation
- NAT rules
- Connection table utilization
- CPU and memory consumption
No abnormal behavior or resource exhaustion was identified in these components that could justify the observed symptoms.
The analysis then moved to the operating-system level, where /var/log/messages was reviewed. The log contained repeated kernel "neighbour table overflow" entries:
Apr 8 11:20:03 2026 FW-CP-01 kernel:[140434.823181] neighbour: arp_cache: neighbor table overflow!
Apr 8 11:20:03 2026 FW-CP-01 kernel:[140434.823291] neighbour: arp_cache: neighbor table overflow!
Apr 8 11:20:03 2026 FW-CP-01 kernel:[140434.823416] neighbour: arp_cache: neighbor table overflow!
This finding pointed to a capacity problem in the ARP/neighbor table at the operating-system level.
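To see when the overflow started and how often it recurred, the matching kernel messages can be counted per minute. The following is a minimal sketch and was not part of the original investigation. The function name count_overflows_per_minute is hypothetical, and the code assumes the timestamp layout shown in the log excerpt above (month, day, then HH:MM:SS as the third field).

```shell
# Count "neighbor table overflow" kernel messages per minute.
# Reads syslog lines on stdin and prints "<count> <month> <day> <HH:MM>".
# Assumption: field 3 of each line is the HH:MM:SS timestamp.
count_overflows_per_minute() {
    awk '
        # Match both spellings (neighbor / neighbour), case-insensitively.
        tolower($0) ~ /neighbou?r table overflow/ {
            split($3, t, ":")
            key = $1 " " $2 " " t[1] ":" t[2]
            # Remember first-seen order so output follows the log order.
            if (!(key in count)) order[++n] = key
            count[key]++
        }
        END {
            for (i = 1; i <= n; i++) print count[order[i]], order[i]
        }
    '
}
```

A possible invocation is count_overflows_per_minute < /var/log/messages, which prints one line per minute in which at least one overflow message was logged.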
From a topology perspective, the client environment is designed with the core switch operating at Layer 2, delegating Layer 3 responsibilities to the Check Point firewall. As a result, the firewall is responsible for both inter-VLAN routing and maintaining ARP resolution (IP-to-MAC mapping) for all connected devices.
Considering the high number of devices in the environment, this design leads to a significant increase in ARP table entries on the firewall.
To support the analysis, the following command was used to monitor the ARP table size:
ip -s neigh | wc -l
The number of ARP entries repeatedly approached and then reached the configured limit of the ARP cache.
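The entry count can also be expressed as a share of the configured limit, which makes it easier to spot the table filling up before it overflows. This is a minimal sketch under stated assumptions: arp_utilization and ARP_WARN_PCT are hypothetical names, the 80% default warning threshold is arbitrary, and the limit is passed in explicitly rather than read from the system.

```shell
# Report neighbor-table utilization as a percentage of a given limit.
# $1 = current number of entries, $2 = configured limit (both integers).
# Returns non-zero when utilization is at or above ARP_WARN_PCT
# (hypothetical variable, assumed default: 80).
arp_utilization() {
    count=$1
    limit=$2
    pct=$(( count * 100 / limit ))   # integer division, rounds down
    echo "entries=$count limit=$limit used=${pct}%"
    [ "$pct" -lt "${ARP_WARN_PCT:-80}" ]
}
```

With the command shown above, this could be called as arp_utilization "$(ip -s neigh | wc -l)" 4096, where 4096 is the limit that was configured at the time of the incident.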
Based on this evidence, the root cause was identified as an insufficient ARP table size. The cache-size parameter was set to its default value of 4096 entries, which is not adequate for the scale and characteristics of the client’s environment.
Once this threshold was reached:
- The system began aggressively removing entries from the ARP table
- ARP resolution became inconsistent
- The firewall was intermittently unable to resolve destination MAC addresses
- Valid traffic could not be properly forwarded
This behavior directly explains the packet loss observed and the fact that legitimate traffic was being dropped and matched against the cleanup rule.
As a corrective action, the ARP table size was increased using the following command:
set arp table cache-size 16000
The change was applied immediately, without requiring a reboot or policy installation.
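To confirm the new limit and keep it across reboots, the configuration can be checked and saved from Gaia Clish. This is a sketch that assumes standard Gaia Clish behavior. The save step is not part of the original change record and is included here as an assumption about persistence.

```
show arp table cache-size
save config
```

The first command should report the value that was just set (16000 in this case). The second writes the running configuration to disk.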
After the adjustment:
- The “neighbour table overflow” messages were no longer observed
- Packet loss symptoms were eliminated
- Traffic stopped being unexpectedly matched against the cleanup rule
- Overall network stability was restored
In conclusion, the observed behavior was caused by ARP table exhaustion due to the number of devices in an architecture where Layer 3 functions are centralized on the firewall. The implemented adjustment resolved the issue, and this analysis documents the scenario for future reference and preventive actions.
Reference SK:
https://support.checkpoint.com/results/sk/sk43772