RX frame errors
Hello,
I am seeing RX frame errors increase on some interfaces connected to a Gigamon. Has anyone experienced this and found a solution? We have replaced the NIC cards, cables, and SFPs, but the issue remains. The gateways are 6400 models; we don't see the issue on the 16000 Turbo hardware.
Thanks
Have you done a performance assessment on the 6400 to ensure it is not overloaded?
You can start with this: https://community.checkpoint.com/t5/Scripts/S7PAC-Super-Seven-Performance-Assessment-Commands/m-p/40...
No, we haven't done a performance assessment. However, that firewall has no load at the moment; it is not taking live traffic yet. We are trying to find and fix the issue before it takes on load.
Can you please run cpview and go all the way to the bottom, where it shows drops?
Andy
cpview is not showing errors; however, ifconfig shows frame errors on the interfaces forming the bond.
And you said they constantly keep increasing? If the answer to that question is yes, when did this start happening?
Andy
Yes, it constantly keeps increasing. We don't see this behavior on the 16000s, which are connected to the same Gigamon device; the only difference is that the 16000s use different drivers for the interfaces.
Below is the output from the 6400 showing the RX errors:
[Expert@idboinfw007:0]# ethtool -i eth1-02
driver: i40e
version: 2.10.19.82
firmware-version: 6.80 0x8000a368 0.0.0
expansion-rom-version:
bus-info: 0000:01:00.1
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: yes
supports-priv-flags: yes
[Expert@idboinfw007:0]# ethtool -s eth1-02
[Expert@idboinfw007:0]# ethtool -S eth1-02
NIC statistics:
rx_packets: 166537830
tx_packets: 20506
rx_bytes: 12314087925
tx_bytes: 2542744
rx_errors: 0
tx_errors: 0
rx_dropped: 0
tx_dropped: 0
collisions: 0
rx_length_errors: 10991
rx_crc_errors: 0
rx_unicast: 0
tx_unicast: 0
rx_multicast: 20502
tx_multicast: 20506
rx_broadcast: 166517328
tx_broadcast: 0
rx_unknown_protocol: 0
tx_linearize: 0
tx_force_wb: 0
tx_busy: 0
rx_alloc_fail: 0
rx_pg_alloc_fail: 0
tx-0.packets: 20500
tx-0.bytes: 2542000
rx-0.packets: 166537426
rx-0.bytes: 12314035213
tx-1.packets: 1
tx-1.bytes: 124
rx-1.packets: 86
rx-1.bytes: 11712
tx-2.packets: 2
tx-2.bytes: 248
rx-2.packets: 62
rx-2.bytes: 7954
tx-3.packets: 3
tx-3.bytes: 372
rx-3.packets: 256
rx-3.bytes: 33046
veb.rx_bytes: 0
veb.tx_bytes: 0
veb.rx_unicast: 0
veb.tx_unicast: 0
veb.rx_multicast: 0
veb.tx_multicast: 0
veb.rx_broadcast: 0
veb.tx_broadcast: 0
veb.rx_discards: 0
veb.tx_discards: 0
veb.tx_errors: 0
veb.rx_unknown_protocol: 0
veb.tc_0_tx_packets: 0
veb.tc_0_tx_bytes: 0
veb.tc_0_rx_packets: 0
veb.tc_0_rx_bytes: 0
veb.tc_1_tx_packets: 0
veb.tc_1_tx_bytes: 0
veb.tc_1_rx_packets: 0
veb.tc_1_rx_bytes: 0
veb.tc_2_tx_packets: 0
veb.tc_2_tx_bytes: 0
veb.tc_2_rx_packets: 0
veb.tc_2_rx_bytes: 0
veb.tc_3_tx_packets: 0
veb.tc_3_tx_bytes: 0
veb.tc_3_rx_packets: 0
veb.tc_3_rx_bytes: 0
veb.tc_4_tx_packets: 0
veb.tc_4_tx_bytes: 0
veb.tc_4_rx_packets: 0
veb.tc_4_rx_bytes: 0
veb.tc_5_tx_packets: 0
veb.tc_5_tx_bytes: 0
veb.tc_5_rx_packets: 0
veb.tc_5_rx_bytes: 0
veb.tc_6_tx_packets: 0
veb.tc_6_tx_bytes: 0
veb.tc_6_rx_packets: 0
veb.tc_6_rx_bytes: 0
veb.tc_7_tx_packets: 0
veb.tc_7_tx_bytes: 0
veb.tc_7_rx_packets: 0
veb.tc_7_rx_bytes: 0
port.rx_bytes: 23697907860
port.tx_bytes: 2624768
port.rx_unicast: 986953
port.tx_unicast: 0
port.rx_multicast: 91526543
port.tx_multicast: 20506
port.rx_broadcast: 166517328
port.tx_broadcast: 0
port.tx_errors: 0
port.rx_dropped: 0
port.tx_dropped_link_down: 0
port.rx_crc_errors: 0
port.illegal_bytes: 0
port.mac_local_faults: 0
port.mac_remote_faults: 0
port.tx_timeout: 0
port.rx_csum_bad: 0
port.rx_length_errors: 10991
port.link_xon_rx: 0
port.link_xoff_rx: 0
port.link_xon_tx: 0
port.link_xoff_tx: 0
port.rx_size_64: 6437141
port.rx_size_127: 222539300
port.rx_size_255: 26399150
port.rx_size_511: 1818432
port.rx_size_1023: 67740
port.rx_size_1522: 1769061
port.rx_size_big: 0
port.tx_size_64: 0
port.tx_size_127: 0
port.tx_size_255: 20506
port.tx_size_511: 0
port.tx_size_1023: 0
port.tx_size_1522: 0
port.tx_size_big: 0
port.rx_undersize: 0
port.rx_fragments: 0
port.rx_oversize: 0
port.rx_jabber: 0
port.VF_admin_queue_requests: 0
port.arq_overflows: 0
port.tx_hwtstamp_timeouts: 0
port.rx_hwtstamp_cleared: 0
port.tx_hwtstamp_skipped: 0
port.fdir_flush_cnt: 1
port.fdir_atr_match: 0
port.fdir_atr_tunnel_match: 0
port.fdir_atr_status: 0
port.fdir_sb_match: 0
port.fdir_sb_status: 1
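With this many counters, the few non-zero error-type entries are easy to miss. Below is a minimal Python sketch, assuming the `ethtool -S` text has been saved or pasted into a string, that keeps only the non-zero error-like counters. The `ERROR_HINTS` substrings and both function names are illustrative choices, not part of ethtool or any Check Point tooling.

```python
import re

# Excerpt of the `ethtool -S eth1-02` output posted above.
SAMPLE = """\
NIC statistics:
     rx_packets: 166537830
     rx_errors: 0
     rx_length_errors: 10991
     rx_crc_errors: 0
     port.rx_crc_errors: 0
     port.rx_length_errors: 10991
     port.rx_undersize: 0
     port.rx_oversize: 0
"""

# Assumption: counter names containing one of these substrings are "error-like".
ERROR_HINTS = ("error", "drop", "discard", "undersize", "oversize",
               "fragments", "jabber", "illegal", "fault", "csum_bad")


def parse_ethtool_stats(text):
    """Return {counter_name: int_value} for every 'name: number' line."""
    stats = {}
    for line in text.splitlines():
        match = re.match(r"^\s*(\S+):\s+(-?\d+)\s*$", line)
        if match:
            stats[match.group(1)] = int(match.group(2))
    return stats


def nonzero_error_counters(stats):
    """Keep only error-like counters whose value is not zero."""
    return {name: value for name, value in stats.items()
            if value != 0 and any(hint in name for hint in ERROR_HINTS)}


if __name__ == "__main__":
    for name, value in nonzero_error_counters(parse_ethtool_stats(SAMPLE)).items():
        print(name, value)
```

On the excerpt above this leaves only rx_length_errors and port.rx_length_errors, both at 10991, while the CRC, undersize, and oversize counters stay at zero.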
Might be worth opening a TAC case.
Is 10991 roughly how many framing errors are being reported by ifconfig and netstat -ni? If not, please run these commands within a few seconds of each other:
netstat -ni | grep eth1-02
ethtool -S eth1-02
ifconfig eth1-02
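If it helps to keep the three readings close together in time, they can be captured back to back from a small script. This is only a sketch, assuming Python 3 is available on the box where it runs; `capture`, `run_command`, and `filter_lines` are hypothetical helper names, and the commands themselves are exactly the ones listed above.

```python
import subprocess
import time

IFACE = "eth1-02"  # interface name taken from this thread

# The three commands requested above, as argument lists.
COMMANDS = [
    ["netstat", "-ni"],
    ["ethtool", "-S", IFACE],
    ["ifconfig", IFACE],
]


def run_command(argv):
    """Run one command and return its combined stdout/stderr as text."""
    result = subprocess.run(argv, stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT, universal_newlines=True)
    return result.stdout


def capture(commands, runner=run_command, clock=time.time):
    """Run each command in order; return (seconds_since_start, argv, output)."""
    start = clock()
    results = []
    for argv in commands:
        output = runner(argv)
        results.append((clock() - start, argv, output))
    return results


def filter_lines(text, needle):
    """Keep only lines mentioning `needle` (stands in for `| grep`)."""
    return [line for line in text.splitlines() if needle in line]
```

Calling `capture(COMMANDS)` returns the three outputs with their elapsed offsets, and `filter_lines(output, IFACE)` can be applied to the netstat output to mimic the grep. The `runner` parameter exists only so the logic can be exercised without the real commands.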
It is quite strange that both physical interfaces of the bond are reporting the exact same number of framing errors. Assuming they are actively incrementing, this would suggest some kind of regular emanation from the switch that the NIC thinks is not a properly formed Ethernet frame (perhaps a bridging/STP advertisement or some other kind of proprietary media test?). On the firewall, run sar -n EDEV: is it reporting a consistent number of rxfram/s errors in each 10-minute sample period all day long? It could also be some kind of invalid frame getting sent to the broadcast address and being flooded by the switch, but I was under the impression that a switch will not forward an invalid frame, so it is likely something the switch itself is creating.
Unfortunately there is no easy way to see what these supposedly invalid frames actually are with a packet capture on the firewall, as the bad frames will be simply discarded by the NIC hardware.
Please provide output from the following commands, run from expert mode on the firewall; there may be some other side effects of this condition that will help point to the issue:
ethtool -i eth1-02
ethtool -S eth1-04
ethtool -S eth1-02
Make sure all elements of the bond configuration are EXACTLY the same on both the firewall and switch side.
I assume the network counters on the switchport side are error-free?
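The "actively incrementing" and "consistent rate" questions above can also be checked from a few timestamped readings of the cumulative counter. The sketch below is a rough illustration only: the sample readings are made up (only the starting value 10991 comes from this thread), the 20% tolerance is arbitrary, and counter resets or wraps are not handled.

```python
def interval_rates(samples):
    """samples: list of (unix_seconds, cumulative_counter) in time order.

    Returns the events-per-second rate for each consecutive pair.
    """
    rates = []
    for (t0, c0), (t1, c1) in zip(samples, samples[1:]):
        if t1 <= t0:
            raise ValueError("timestamps must be strictly increasing")
        rates.append((c1 - c0) / (t1 - t0))
    return rates


def is_steady(rates, tolerance=0.2):
    """True if the mean rate is positive and every rate is within
    `tolerance` (as a fraction of the mean) of that mean."""
    if not rates:
        return False
    mean = sum(rates) / len(rates)
    if mean <= 0:
        return False
    return all(abs(rate - mean) <= tolerance * mean for rate in rates)


# Hypothetical readings taken 10 minutes apart; not real data from this gateway.
SAMPLES = [(0, 10991), (600, 11051), (1200, 11111), (1800, 11172)]

if __name__ == "__main__":
    rates = interval_rates(SAMPLES)
    print("rates per second:", rates)
    print("steady:", is_steady(rates))
```

A steady result would fit a periodic control or discovery frame; a bursty one would point more toward traffic-dependent corruption.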
Attached are the netstat, ethtool, and sar outputs.
ethtool -i and -S output attached.
The only difference in the interface firmware details is these settings; I don't know whether it could be related:
supports-eeprom-access: yes
supports-register-dump: yes
OK, that helped a lot. From what I can tell there is a constant, regular stream of frames that are too small (at least according to i40e) coming from the Gigamon; in the distant past these were called "runts" while too-long frames were called "jabbers". I highly doubt these framing errors are actually legitimate frames getting corrupted, so this would appear to be just cosmetic and not impact real traffic. This assertion is confirmed by the netstat output showing that these framing errors are not even incrementing RX-ERR.
For this stream of framing errors to be so consistent, it must be some kind of regular emanation from the Gigamon itself, probably:
- STP/Bridge announcements (disabling Spanning Tree is NOT an option, but you could try portfast and see if that helps)
- LLDP (try disabling this on the Gigamon ports if enabled)
- CDP (Cisco Discovery Protocol; try disabling this on the Gigamon ports if enabled)
- Gigamon Discovery (appears to be some kind of proprietary Gigamon discovery protocol; try disabling it)
- If there are any other discovery/probing/health-check protocols enabled on the Gigamon, try disabling them on the relevant switchports, including possibly some proprietary 802.1q VLAN trunking discovery/health-check/probing if the ports are trunked
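For a sense of scale, the length errors can be put next to the total frames the port has received, using the port-level counters from the eth1-02 output posted earlier. This is plain arithmetic, not a diagnosis: a small share does not by itself prove the errors are harmless, and the sketch assumes that unicast + multicast + broadcast is a fair stand-in for total received frames. The function names are illustrative.

```python
# Port-level counters copied from the `ethtool -S eth1-02` output in this thread.
PORT_RX = {
    "port.rx_unicast": 986953,
    "port.rx_multicast": 91526543,
    "port.rx_broadcast": 166517328,
    "port.rx_length_errors": 10991,
}


def total_received(stats):
    """Sum of unicast, multicast, and broadcast frames seen by the port."""
    return (stats["port.rx_unicast"]
            + stats["port.rx_multicast"]
            + stats["port.rx_broadcast"])


def error_share(stats, counter="port.rx_length_errors"):
    """Fraction of received frames represented by the given error counter."""
    return stats[counter] / total_received(stats)


if __name__ == "__main__":
    print("total frames:", total_received(PORT_RX))
    print("length-error share: %.4f%%" % (100 * error_share(PORT_RX)))
```

With these numbers the share works out to roughly 0.004% of received frames.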
Understood. After checking back with my colleagues, they suspect it may be related to Gigamon discovery; however, we can't disable that since this is how the Gigamon device works. Is it possible, though, to get a driver/firmware update for these interfaces so that those frame drops/errors go away? It seems the 40Gig interface on the 16000 firewall has a way of handling it and not showing it in its stats.
Yes, it could be a driver update issue; the current Gaia i40e 2.10.19.82 driver is from early 2020. You can see the changelog for the i40e driver at the URL below, and while it doesn't seem to have any fixes directly relevant to this issue, Check Point TAC may have a newer driver available.
Also, one more question: what code version and Jumbo HFA level are you using on your gateway? The i40e driver version is 2.10.19.82 for R81.10 and later; in R80.40 and earlier it was 2.7.12.
https://github.com/dmarion/deb-i40e/blob/master/debian/changelog
The 6400 and the 16000 are both on R81.20 T10. Yes, I think it is the latest firmware; it dates from 2020, and they don't update those drivers often for some reason, but I will check with TAC whether newer driver/firmware versions are available. Thanks.
It is also a VSX.