Hello!
I am trying to find a mysterious source of packet loss using my R80.10 JHF 225 gateways. The administrator of the access layer is saying their switch is receiving "pause frames" from the firewall and so it's dropping packets it cannot deliver in a timely manner. I am not sure how to evaluate this - from reading, it does not appear that they would necessarily show up in a packet capture. I've also read that those perhaps exclusively originate from an endpoint or a switch. I tried a tcpdump from the gateway and wireshark filter "macc.opcode == pause" - no results.
In the specific scenario I am troubleshooting, which I hope is indicative of the larger problem, an attempt to connect to an HTTPS server reliably gets SYN, SYN/ACK, ACK, Client Hello ... Client Hello ... RST (from server). We've seen this pattern before with a QoS/CoS issue on our switch hardware.
In searching for similar issues, I found https://community.checkpoint.com/t5/General-Topics/Ifconfig-dropped-explanation/m-p/24447#M4885 but ifconfig does not report any Rx or Tx errors, so our situation does not map well to that scenario.
I'm not getting indications that the gateway is under any meaningful load, although cpview does show 195,627 "Instance High CPU" drops against an "Inbound Packets/sec" rate of around 70k.
How can I determine whether the gateway is telling the switch to suspend passing packets?
This situation is covered in my "Max Power" book. Rather than trying to copy/paste and reformat it, I'm just going to be lazy and post screenshots of the relevant two pages. Let me know if you have further questions.
They are seeing the pause frames counter increment on the interface directly connected to the gateway.
Yes it is, please provide output of netstat -ni for that interface so we can see if you have RX-OVR as well.
Solid row of zeroes on RX-OVR.
Iface    MTU   Met  RX-OK        RX-ERR  RX-DRP  RX-OVR  TX-OK       TX-ERR  TX-DRP  TX-OVR  Flg
eth1-04  1500  0    13471586055  0       0       0       8257090756  0       0       0       BMRU
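For anyone who wants to check these counters across many interfaces rather than by eye, here is a minimal Python sketch that parses `netstat -ni` style text and reports any non-zero error, drop, or overrun counters. The function names `parse_netstat_ni` and `nonzero_error_counters` are made up for illustration, and the sample text is just the output posted above.

```python
def parse_netstat_ni(text):
    """Parse `netstat -ni` style output into {iface: {column: int}}."""
    rows = {}
    header = None
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "Iface":
            header = fields
            continue
        if header is None or len(fields) != len(header):
            continue  # skip banner lines or rows that do not match the header
        row = dict(zip(header, fields))
        rows[row["Iface"]] = {
            name: int(value)
            for name, value in row.items()
            if name not in ("Iface", "Flg")  # these two columns are not numeric
        }
    return rows


def nonzero_error_counters(counters):
    """Return only the ERR/DRP/OVR counters that are above zero."""
    suffixes = ("-ERR", "-DRP", "-OVR")
    return {k: v for k, v in counters.items() if k.endswith(suffixes) and v}


# Sample copied from the output posted in this thread.
SAMPLE = """\
Iface MTU Met RX-OK RX-ERR RX-DRP RX-OVR TX-OK TX-ERR TX-DRP TX-OVR Flg
eth1-04 1500 0 13471586055 0 0 0 8257090756 0 0 0 BMRU
"""

stats = parse_netstat_ni(SAMPLE)
print(nonzero_error_counters(stats["eth1-04"]))  # prints {} for this sample
```

An empty result for an interface matches the "solid row of zeroes" described above: no receive overruns have been counted on that NIC.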
Sounds like your NIC is barely keeping up with the load. It is sending pauses but not actually running out of buffer space and losing frames to overruns. But it sounds like you are getting packet loss somewhere anyway. It doesn't look like the loss is happening on the firewall side; I'm assuming the switch counters look clean? The issue might be somewhere else in the network, since the presence of pause frames does not necessarily indicate the loss is happening right there...
So it sounds like the switch is actually honoring the pause request (i.e. has flow control enabled), but has insufficient buffer capacity to hold the frames for very long and is dropping some whenever the firewall NIC requests a pause. I suppose you could turn off the rx pause function on the firewall with ethtool -A, and see who blinks first and starts losing some frames. This will be indicated on the firewall by the RX-OVR counter. Be warned that anytime you toggle interface options with ethtool there is a chance it will cause a reset on the target interface and possibly all other interfaces using that same driver (i.e. igb, ixgbe, etc.), which will cause a brief bounce, so you may want to schedule an outage window before toggling it.
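To see "who blinks first", it helps to compare counters before and after a change rather than reading raw totals. The sketch below diffs two snapshots of `ethtool -S <iface>` text and reports which flow-control counters moved. Treat the details as assumptions: the counter names (`tx_flow_control_xoff` and friends) are what igb/ixgbe drivers commonly expose but they vary by driver, the sample values are invented for illustration, and `parse_ethtool_stats` / `counter_deltas` are hypothetical helper names.

```python
def parse_ethtool_stats(text):
    """Parse `ethtool -S` style 'name: value' lines into {name: int}."""
    stats = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        value = value.strip()
        # Skip the "NIC statistics:" banner and any non-numeric values.
        if sep and value.lstrip("-").isdigit():
            stats[name.strip()] = int(value)
    return stats


def counter_deltas(before, after, keyword="flow_control"):
    """Return counters containing `keyword` that changed between snapshots."""
    return {
        name: after[name] - before[name]
        for name in after
        if keyword in name and name in before and after[name] != before[name]
    }


# Hypothetical snapshots; counter names are driver-specific assumptions.
BEFORE = """\
NIC statistics:
     rx_packets: 1000
     tx_flow_control_xon: 5
     tx_flow_control_xoff: 7
     rx_flow_control_xon: 0
     rx_flow_control_xoff: 0
"""
AFTER = """\
NIC statistics:
     rx_packets: 2500
     tx_flow_control_xon: 9
     tx_flow_control_xoff: 12
     rx_flow_control_xon: 0
     rx_flow_control_xoff: 0
"""

deltas = counter_deltas(parse_ethtool_stats(BEFORE), parse_ethtool_stats(AFTER))
print(deltas)
```

If the transmit-side flow-control counters keep climbing between snapshots, the NIC is the one generating the pause frames the switch is counting. Passing `keyword="rx_"` or similar would let the same helper watch other counters your driver exposes.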
I'd turn them both off (make sure they have done the same on the switch side) and let the chips fall where they may.
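As a rough sketch of what "turn them both off" looks like on the firewall side, the snippet below only builds the `ethtool` command lines and prints them; it does not run anything. In ethtool's terms, `tx` controls whether the NIC sends pause frames and `rx` controls whether it honors pause frames it receives. The interface name is taken from the output earlier in the thread, and `pause_show_command` / `pause_set_command` are hypothetical helper names. Whether the setting survives a reboot on a given gateway is not covered here.

```python
def pause_show_command(iface):
    """Command to display current pause (flow control) settings."""
    return ["ethtool", "-a", iface]


def pause_set_command(iface, rx, tx):
    """Command to set rx/tx pause; True means 'on', False means 'off'."""
    def onoff(flag):
        return "on" if flag else "off"
    return ["ethtool", "-A", iface, "rx", onoff(rx), "tx", onoff(tx)]


# Print the commands for review; run them by hand in a maintenance window,
# since changing NIC options may bounce the interface.
show_cmd = " ".join(pause_show_command("eth1-04"))
set_cmd = " ".join(pause_set_command("eth1-04", rx=False, tx=False))
print(show_cmd)  # ethtool -a eth1-04
print(set_cmd)   # ethtool -A eth1-04 rx off tx off
```

Running the show command before and after the change is a simple way to confirm the new setting actually took effect on the NIC.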