Brandon_Cotter
Contributor

Demonstrating pause frames

Hello!

I am trying to find a mysterious source of packet loss using my R80.10 JHF 225 gateways. The administrator of the access layer is saying their switch is receiving "pause frames" from the firewall and so it's dropping packets it cannot deliver in a timely manner. I am not sure how to evaluate this: from what I have read, pause frames would not necessarily show up in a packet capture. I've also read that they may originate only from an endpoint or a switch. I tried a tcpdump on the gateway and the Wireshark filter "macc.opcode == pause" and got no results.

In the specific scenario I am troubleshooting that I hope is indicative of the larger problem, an attempt to connect to an https server reliably gets SYN-SYN/ACK-ACK-Client Hello ... Client Hello ... RST (from server). We've seen it before with a QoS/CoS issue on our switch hardware.

In searching for similar issues, I found https://community.checkpoint.com/t5/General-Topics/Ifconfig-dropped-explanation/m-p/24447#M4885 but ifconfig does not report any Rx or Tx errors, so our situation does not map well to that scenario.

I'm not getting indications that the gateway is under any meaningful load, although cpview does show 195,627 "Instance High CPU" drops against an "Inbound Packets/sec" rate of around 70k.

How can I determine whether the gateway is telling the switch to suspend passing packets?
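
On the capture question above: an 802.3x pause frame is a MAC Control frame (EtherType 0x8808, opcode 0x0001) that is normally sent to the reserved multicast address 01:80:C2:00:00:01. NICs typically consume these frames in hardware rather than handing them to the OS, which would explain an empty tcpdump on the gateway. If a capture is taken somewhere that does see them (for example a tap), filtering on the EtherType, e.g. `ether proto 0x8808`, is one way to find them. The sketch below shows how such a frame could be recognized by hand. The source MAC in the example frame is made up, and the helper names are illustrative only.

```python
import struct

PAUSE_DST_MAC = bytes.fromhex("0180c2000001")  # reserved MAC Control multicast address
MAC_CONTROL_ETHERTYPE = 0x8808
PAUSE_OPCODE = 0x0001


def parse_pause_frame(frame: bytes):
    """Return the pause time in quanta if `frame` is an 802.3x PAUSE frame, else None."""
    if len(frame) < 18:  # 14-byte Ethernet header + 2-byte opcode + 2-byte pause time
        return None
    ethertype, opcode, quanta = struct.unpack("!HHH", frame[12:18])
    if ethertype != MAC_CONTROL_ETHERTYPE or opcode != PAUSE_OPCODE:
        return None
    return quanta


def pause_seconds(quanta: int, link_bps: int) -> float:
    """One pause quantum is 512 bit times, so the duration depends on link speed."""
    return quanta * 512 / link_bps


# Hand-built example frame; the source MAC is invented for illustration.
example = (
    PAUSE_DST_MAC
    + bytes.fromhex("001c7f000001")
    + struct.pack("!HHH", MAC_CONTROL_ETHERTYPE, PAUSE_OPCODE, 0xFFFF)
)
print(parse_pause_frame(example), pause_seconds(0xFFFF, 10**9))
```

A pause time of 0 tells the peer to resume immediately, which is presumably what driver counters with "xon" in the name track, while a nonzero pause time corresponds to "xoff".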

 

1 Solution

Accepted Solutions
Timothy_Hall
Legend

This situation is covered in my "Max Power" book.  Rather than trying to copy/paste and reformat it, I'm just going to be lazy and post screenshots of the relevant two pages.  Let me know if you have further questions.

 

[Attached screenshots: Flow1.jpg, flow2.jpg]

 

 

Gateway Performance Optimization R81.20 Course
now available at maxpowerfirewalls.com


12 Replies
PhoneBoy
Admin
Perhaps a better question would be: how are they determining they are receiving pause frames from the gateway?
Brandon_Cotter
Contributor

They are seeing the pause frames counter increment on the interface  directly connected to the gateway. 

Brandon_Cotter
Contributor
Thank you! Love your book.
Brandon_Cotter
Contributor
So this is potentially a concerning number:
tx_flow_control_xon: 359283
rx_flow_control_xon: 0
tx_flow_control_xoff: 506156
rx_flow_control_xoff: 0
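
Counters with these names are typically exposed by the NIC driver through `ethtool -S <interface>`. A minimal sketch for pulling out just the flow-control lines, assuming Python 3.7+ is available wherever the output is processed (the function names are illustrative, not part of any Check Point tooling):

```python
import subprocess


def parse_flow_control(stats_text: str) -> dict:
    """Extract the *flow_control* counters from `ethtool -S`-style "name: value" lines."""
    counters = {}
    for line in stats_text.splitlines():
        name, sep, value = line.partition(":")
        name, value = name.strip(), value.strip()
        if sep and "flow_control" in name and value.isdigit():
            counters[name] = int(value)
    return counters


def read_flow_control(iface: str) -> dict:
    """Run `ethtool -S <iface>` (usually needs root) and parse the result."""
    out = subprocess.run(
        ["ethtool", "-S", iface], capture_output=True, text=True, check=True
    )
    return parse_flow_control(out.stdout)


# The four lines posted above, used as sample input.
sample = """\
tx_flow_control_xon: 359283
rx_flow_control_xon: 0
tx_flow_control_xoff: 506156
rx_flow_control_xoff: 0
"""
print(parse_flow_control(sample))
```

Nonzero tx_* values with rx_* at zero mean this NIC has been sending flow-control frames and has not received any.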
Timothy_Hall
Legend

Yes it is. Please provide the output of netstat -ni for that interface so we can see whether you have RX-OVR as well.

 

Brandon_Cotter
Contributor

Solid row of zeroes on RX-OVR.

Iface    MTU   Met  RX-OK        RX-ERR  RX-DRP  RX-OVR  TX-OK       TX-ERR  TX-DRP  TX-OVR  Flg
eth1-04  1500  0    13471586055  0       0       0       8257090756  0       0       0       BMRU

netstat-ni.png
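
For checking this repeatedly rather than by eye, the receive-side counters can be read out of the `netstat -ni` output with a few lines of code. This is a rough sketch that assumes the column layout shown above; other netstat versions may print different columns, and the function names are illustrative.

```python
def parse_netstat_ni(text: str) -> dict:
    """Map interface name -> {column: value} from `netstat -ni` output."""
    rows = {}
    header = None
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "Iface":
            header = fields[1:]  # column names after the interface column
            continue
        if header and len(fields) == len(header) + 1:
            rows[fields[0]] = dict(zip(header, fields[1:]))
    return rows


def rx_loss(row: dict) -> dict:
    """Return the receive-side error counters as integers."""
    return {key: int(row[key]) for key in ("RX-ERR", "RX-DRP", "RX-OVR")}


# The output posted above, used as sample input.
sample = """\
Iface MTU Met RX-OK RX-ERR RX-DRP RX-OVR TX-OK TX-ERR TX-DRP TX-OVR Flg
eth1-04 1500 0 13471586055 0 0 0 8257090756 0 0 0 BMRU
"""
print(rx_loss(parse_netstat_ni(sample)["eth1-04"]))
```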

 

Timothy_Hall
Legend

Sounds like your NIC is barely keeping up with the load. It is sending pauses but not actually running out of buffer space and losing frames to overruns. But it sounds like you are getting packet loss somewhere anyway. It doesn't look like the loss is happening on the firewall side; I'm assuming the switch counters look clean? The issue might be somewhere else in the network. The presence of pause frames does not necessarily indicate the loss is happening right there...

 

Brandon_Cotter
Contributor
According to the switch's administrator and Juniper, the switch is dropping frames whenever it receives a pause.
At the moment, I am monitoring accumulation of that "tx_flow_control_xoff" and plan to try to monitor traffic patterns at that time in the console.
But yes, we started looking into this since we're seeing issues with VOIP traffic from time to time.
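
One way to watch that counter accumulate is to sample it on an interval and log only the increases with a timestamp, so bursts can be lined up against VoIP complaints or traffic graphs. A minimal sketch follows; `read_counters` is a hypothetical callable that returns a dict of counter names to integers (for example, something that parses NIC statistics output), and the interval and sample count are arbitrary.

```python
import time


def counter_deltas(previous: dict, current: dict) -> dict:
    """Per-counter increase between two samples; counters new in `current` count from 0."""
    return {name: value - previous.get(name, 0) for name, value in current.items()}


def watch(read_counters, interval_s: float = 5.0, samples: int = 12):
    """Poll `read_counters()` and print timestamped increases.

    `read_counters` is a hypothetical callable returning {counter_name: int}.
    """
    previous = read_counters()
    for _ in range(samples):
        time.sleep(interval_s)
        current = read_counters()
        grew = {n: d for n, d in counter_deltas(previous, current).items() if d > 0}
        if grew:
            print(time.strftime("%H:%M:%S"), grew)
        previous = current
```

For example, `counter_deltas({"tx_flow_control_xoff": 506156}, {"tx_flow_control_xoff": 506200})` gives an increase of 44 for that interval.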
Timothy_Hall
Legend

So it sounds like the switch is actually honoring the pause request (i.e. has flow control enabled), but has insufficient buffer capacity to hold the frames for very long and is dropping some whenever the firewall NIC requests a pause.  I suppose you could turn off the rx pause function on the firewall with ethtool -A, and see who blinks first and starts losing some frames.  This will be indicated on the firewall by the RX-OVR counter.  Be warned that any time you toggle interface options with ethtool there is a chance it will cause a reset on the target interface and possibly all other interfaces using that same driver (i.e. igb, ixgbe, etc.), which will cause a brief bounce, so you may want to schedule an outage window before toggling it.

 

Brandon_Cotter
Contributor
That's a great idea, thanks! Should I disable both rx and tx pause on the interface, or just rx since we're looking for RX-OVR?
Timothy_Hall
Legend

I'd turn them both off (make sure they have done the same on the switch side) and let the chips fall where they may.
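
As a rough sketch of what that looks like in practice: `ethtool -a <interface>` shows the current pause settings and `ethtool -A <interface> rx off tx off` turns both directions off. In ethtool's own terms, `tx` controls whether the NIC sends pause frames and `rx` whether it honors pause frames it receives. The helper below parses the `-a` output so the before and after states can be compared; the sample text is illustrative and not taken from this gateway, and whether the setting persists across a reboot is not covered here.

```python
def parse_pause_params(text: str) -> dict:
    """Parse `ethtool -a <iface>` output into {"Autonegotiate": bool, "RX": bool, "TX": bool}."""
    params = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        if sep and value.strip() in ("on", "off"):
            params[name.strip()] = value.strip() == "on"
    return params


def disable_pause_cmd(iface: str) -> list:
    """argv for turning off both pause directions; running it may bounce the interface."""
    return ["ethtool", "-A", iface, "rx", "off", "tx", "off"]


# Illustrative `ethtool -a` output (assumed values, not from the thread).
sample = """\
Pause parameters for eth1-04:
Autonegotiate:  off
RX:             on
TX:             on
"""
print(parse_pause_params(sample))
print(" ".join(disable_pause_cmd("eth1-04")))
```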

