Contributor

Demonstrating pause frames

Hello!

I am trying to find a mysterious source of packet loss on my R80.10 JHF 225 gateways. The administrator of the access layer says their switch is receiving "pause frames" from the firewall, so it is dropping packets it cannot deliver in a timely manner. I am not sure how to evaluate this. From my reading, pause frames would not necessarily show up in a packet capture, and I have also read that they perhaps originate exclusively from an endpoint or a switch. I tried a tcpdump from the gateway with the Wireshark filter "macc.opcode == pause" and got no results.

In the specific scenario I am troubleshooting, which I hope is indicative of the larger problem, an attempt to connect to an HTTPS server reliably gets SYN, SYN/ACK, ACK, Client Hello ... Client Hello ... RST (from server). We have seen this pattern before with a QoS/CoS issue on our switch hardware.

In searching for similar issues, I found https://community.checkpoint.com/t5/General-Topics/Ifconfig-dropped-explanation/m-p/24447#M4885 but ifconfig does not report any Rx or Tx errors, so our situation does not map well to that scenario.

I'm not getting indications that the gateway is under any meaningful load, although cpview does show 195,627 "Instance High CPU" drops against an "Inbound Packets/sec" rate of around 70k.

How can I determine whether the gateway is telling the switch to suspend passing packets?
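
For reference, a minimal sketch of what a pause frame looks like on the wire, assuming the standard IEEE 802.3x layout: EtherType 0x8808 (MAC Control), opcode 0x0001 (PAUSE), followed by a 16-bit pause time in quanta of 512 bit times. Many NICs appear to consume these frames in hardware before they reach the host, which would be consistent with an empty tcpdump. The source MAC in the example is made up, and `parse_pause_frame` is a hypothetical helper name.

```python
import struct

PAUSE_DST_MAC = bytes.fromhex("0180c2000001")  # reserved MAC Control multicast address
MAC_CONTROL_ETHERTYPE = 0x8808
PAUSE_OPCODE = 0x0001


def parse_pause_frame(frame: bytes):
    """Return the pause time in quanta if `frame` is a PAUSE frame, else None."""
    if len(frame) < 18:
        return None  # too short to hold MACs, EtherType, opcode and pause time
    ethertype, opcode, quanta = struct.unpack("!HHH", frame[12:18])
    if ethertype != MAC_CONTROL_ETHERTYPE or opcode != PAUSE_OPCODE:
        return None
    return quanta  # 0 means "resume" (XON); non-zero means "pause" (XOFF)


# Hypothetical frame: destination MAC, made-up source MAC, EtherType, opcode, pause time
example = PAUSE_DST_MAC + bytes.fromhex("001c7f000001") + bytes.fromhex("88080001ffff")
print(parse_pause_frame(example))
```

A result of 0 corresponds to an XON (resume) frame and any non-zero value to an XOFF (pause) request.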

 

1 Solution

Accepted Solutions
Champion

This situation is covered in my "Max Power" book.  Rather than trying to copy/paste and reformat it, I'm just going to be lazy and post screenshots of the relevant two pages.  Let me know if you have further questions.

 

[Attached screenshots of the two book pages: Flow1.jpg, flow2.jpg]

 

 

R80.40 addendum for book "Max Power 2020" now available
for free download at http://www.maxpowerfirewalls.com

12 Replies
Admin
Perhaps a better question would be: how are they determining they are receiving pause frames from the gateway?
Contributor

They are seeing the pause frames counter increment on the interface  directly connected to the gateway. 

Contributor
Thank you! Love your book.
Contributor
So this is potentially a concerning number:
tx_flow_control_xon: 359283
rx_flow_control_xon: 0
tx_flow_control_xoff: 506156
rx_flow_control_xoff: 0
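
A small sketch for pulling just the flow-control counters out of driver statistics. It assumes the values above came from `ethtool -S <interface>`, which the thread does not state explicitly. The sample text reuses the posted numbers, and the helper names are hypothetical.

```python
def parse_ethtool_stats(text: str) -> dict:
    """Parse 'name: value' lines (as printed by `ethtool -S`) into a dict of ints."""
    stats = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        value = value.strip()
        if sep and value.isdigit():  # skips the "NIC statistics:" header line
            stats[name.strip()] = int(value)
    return stats


def flow_control_counters(stats: dict) -> dict:
    """Keep only the counters whose name mentions flow control."""
    return {k: v for k, v in stats.items() if "flow_control" in k}


# Sample built from the values posted in this thread
sample = """NIC statistics:
     tx_flow_control_xon: 359283
     rx_flow_control_xon: 0
     tx_flow_control_xoff: 506156
     rx_flow_control_xoff: 0
"""

for name, value in flow_control_counters(parse_ethtool_stats(sample)).items():
    print(f"{name:24} {value}")
```

Counter names vary by driver, so the `"flow_control"` substring match is an assumption that fits the names shown here.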
Champion

Yes, it is. Please provide the output of netstat -ni for that interface so we can see whether you have RX-OVR as well.

 

Contributor

Solid row of zeroes on RX-OVR.

Iface    MTU   Met  RX-OK        RX-ERR  RX-DRP  RX-OVR  TX-OK       TX-ERR  TX-DRP  TX-OVR  Flg
eth1-04  1500  0    13471586055  0       0       0       8257090756  0       0       0       BMRU
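
If this check needs repeating across several interfaces, the same reading of the table can be sketched in code. This assumes the column layout shown above, and `parse_netstat_ni` is a hypothetical helper, not part of any Check Point tooling.

```python
def parse_netstat_ni(text: str) -> list:
    """Turn `netstat -ni` output into one dict per interface, keyed by column name."""
    rows, header = [], None
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "Iface":
            header = fields  # column names come from the header row
        elif header and len(fields) == len(header):
            rows.append(dict(zip(header, fields)))
    return rows


# Sample taken from the output posted in this thread
sample = """Iface MTU Met RX-OK RX-ERR RX-DRP RX-OVR TX-OK TX-ERR TX-DRP TX-OVR Flg
eth1-04 1500 0 13471586055 0 0 0 8257090756 0 0 0 BMRU
"""

for row in parse_netstat_ni(sample):
    overruns = int(row["RX-OVR"])
    print(row["Iface"], "RX-OVR =", overruns, "(overruns seen)" if overruns else "(clean)")
```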


 

Champion

It sounds like your NIC is only barely keeping up with the load: it is sending pauses, but it is not actually running out of buffer space and losing frames to overruns. You are apparently getting packet loss somewhere anyway, though. It doesn't look like the loss is happening on the firewall side; I'm assuming the switch counters look clean? The issue might be somewhere else in the network, since the presence of pause frames does not necessarily indicate that the loss is happening right there...

 

Contributor
According to the switch's administrator and Juniper, the switch is dropping frames whenever it receives a pause.
At the moment, I am monitoring how the "tx_flow_control_xoff" counter accumulates, and I plan to watch traffic patterns in the console at the times it increments.
But yes, we started looking into this because we are seeing issues with VoIP traffic from time to time.
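
One way to make that monitoring easier to correlate with traffic is to turn periodic counter readings into per-interval rates. The sketch below assumes the readings are collected elsewhere as (timestamp in seconds, counter value) pairs. Only the first value is from this thread; the later samples are invented for illustration.

```python
def counter_rates(samples):
    """Convert (timestamp, counter) readings into (timestamp, delta, per-second rate)."""
    rates = []
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        elapsed = t1 - t0
        delta = v1 - v0
        if elapsed <= 0 or delta < 0:
            continue  # skip clock glitches and counter resets
        rates.append((t1, delta, delta / elapsed))
    return rates


# Hypothetical readings of tx_flow_control_xoff taken 10 seconds apart
samples = [(0, 506156), (10, 506156), (20, 506656)]

for timestamp, delta, rate in counter_rates(samples):
    print(f"t={timestamp}s  +{delta} pause requests  ({rate:.1f}/s)")
```

Intervals with a non-zero delta are the ones worth lining up against the VoIP complaints and the switch's drop counters.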
Champion

So it sounds like the switch is actually honoring the pause request (i.e. it has flow control enabled) but has insufficient buffer capacity to hold the frames for very long, so it drops some whenever the firewall NIC requests a pause.  I suppose you could turn off the rx pause function on the firewall with ethtool -A and see who blinks first and starts losing frames.  On the firewall, this will be indicated by the RX-OVR counter.  Be warned that any time you toggle interface options with ethtool, there is a chance it will reset the target interface and possibly all other interfaces using the same driver (e.g. igb, ixgbe), which will cause a brief bounce, so you may want to schedule an outage window before toggling it.

 

Contributor
That's a great idea, thanks! Should I disable both rx and tx pause on the interface, or just rx since we're looking for RX-OVR?
Champion

I'd turn them both off (make sure they have done the same on the switch side) and let the chips fall where they may.
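
A dry-run sketch of the commands this would involve, assuming the standard `ethtool -A <interface> rx on|off tx on|off` syntax and using the interface name from this thread. It only prints the commands and never runs them. As I read the ethtool man page, `tx` controls whether the NIC sends pause frames and `rx` whether it honors the ones it receives. If pause autonegotiation is enabled, `autoneg off` may also be needed, which is an assumption worth checking with `ethtool -a` first.

```python
import shlex


def show_pause_command(interface: str) -> list:
    """Command that displays the current pause settings (read-only)."""
    return ["ethtool", "-a", interface]


def set_pause_command(interface: str, rx: bool = False, tx: bool = False) -> list:
    """Command that sets rx/tx pause; both default to off, as suggested above."""
    def flag(enabled: bool) -> str:
        return "on" if enabled else "off"
    return ["ethtool", "-A", interface, "rx", flag(rx), "tx", flag(tx)]


# Print only; run these by hand inside a maintenance window, since the
# interface may bounce when the setting changes.
print(shlex.join(show_pause_command("eth1-04")))
print(shlex.join(set_pause_command("eth1-04")))
```

Afterwards, RX-OVR in netstat -ni on the firewall and the drop counters on the switch should show which side starts losing frames.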
