- CheckMates
- :
- Products
- :
- Quantum
- :
- Security Gateways
- :
- Re: Weird 10Gb interface hangup
- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Mute
- Printer Friendly Page
Are you a member of CheckMates?
×- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Weird 10Gb interface hangup
This more of a "friday" post for fun. Although problem was real - in one of our 5900 clusters running R80.10 the standby member out of blue produced some obscure errors on one of the 10Gb bond trunks (eth1-04)
Jun 27 19:29:19 2018 fwf2 kernel: ixgbe 0000:06:00.1: eth1-04: Detected Tx Unit Hang
Jun 27 19:29:19 2018 fwf2 kernel: Tx Queue <0>
Jun 27 19:29:19 2018 fwf2 kernel: TDH, TDT <37a>, <124>
Jun 27 19:29:19 2018 fwf2 kernel: next_to_use <124>
Jun 27 19:29:19 2018 fwf2 kernel: next_to_clean <37a>
Jun 27 19:29:19 2018 fwf2 kernel: ixgbe 0000:06:00.1: eth1-04: tx_buffer_info[next_to_clean]
Jun 27 19:29:19 2018 fwf2 kernel: time_stamp <2a1b97a3e>
Jun 27 19:29:19 2018 fwf2 kernel: jiffies <2a1b98956>
Jun 27 19:29:19 2018 fwf2 kernel: ixgbe 0000:06:00.1: eth1-04: tx hang 1 detected on queue 0, resetting adapter
Jun 27 19:29:19 2018 fwf2 kernel: ixgbe 0000:06:00.1: eth1-04: Reset adapter
Jun 27 19:29:19 2018 fwf2 kernel: ixgbe 0000:06:00.1: eth1-04: RXDCTL.ENABLE on Rx queue 0 not cleared within the polling period
Jun 27 19:29:19 2018 fwf2 kernel: bonding: bond0: link status down for idle interface eth1-04, disabling it in 200 ms.
Jun 27 19:29:19 2018 fwf2 kernel: ixgbe: eth1-04: ixgbe_setup_mrqc: configure Symmetric RSS
Jun 27 19:29:19 2018 fwf2 kernel: ixgbe: eth1-04: ixgbe_up_complete: Double vlan mode is not set
Jun 27 19:29:19 2018 fwf2 kernel: ixgbe 0000:06:00.1: eth1-04: detected SFP+: 6
And then few seconds later all interfaces in the expansion slot started reporting continuously
Jun 27 19:30:55 2018 fwf2 kernel: ixgbe 0000:05:00.0: eth1-01: -1 Spoofed packets detected
Jun 27 19:30:55 2018 fwf2 kernel: ixgbe 0000:06:00.1: eth1-04: -1 Spoofed packets detected
Jun 27 19:30:55 2018 fwf2 kernel: ixgbe 0000:06:00.0: eth1-03: -1 Spoofed packets detected
Jun 27 19:30:55 2018 fwf2 kernel: ixgbe 0000:05:00.1: eth1-02: -1 Spoofed packets detected
It was resolved by node reboot. The only relevant SK I found was this Intermittent outages of TCP traffic on 10GbE interfaces in IP Appliances running Gaia OS but it's not applicable to R80.10 nor 5900 and offload is definitely disabled on interfaces.
Here's the best part - the display on the appliance at the time showed this
Does this mean firewall needs to go to toilet?? P-p-p-peee....
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
I guess some things never change...
I think I used a wrong ISO file that time.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Sounds like the NIC hardware is what went into the toilet, the display was telling you that the "stream" of outbound packets was no longer getting handled by the NIC, and that the firewall's bladder was too full which can certainly be uncomfortable to say the least. 🙂
--
Second Edition of my "Max Power" Firewall Book
Now Available at http://www.maxpowerfirewalls.com
CET (Europe) Timezone Course Scheduled for July 1-2
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
I guess some things never change...
I think I used a wrong ISO file that time.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Sorry to bring an old post back to live, but I'm having a similar error on a 10Gb interface on a 5900. A reboot initially fixed the issue, but it came back, and again required a reboot (not to mention a disk check on each reboot).
Did you have any more problems with your 5900 after your reboot?
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
What code version is the firewall using? Make sure you have the latest GA Jumbo HFA applied as updated NIC driver versions are sometimes bundled in Jumbo HFAs.
CET (Europe) Timezone Course Scheduled for July 1-2
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Product version Check Point Gaia R80.10
OS build 479
OS kernel version 2.6.18-92cpx86_64
OS edition 64-bit
Using GA Jumbo HFA Take 169
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Last weekend we had a 5900 with a 10GB bond (2 interfaces), part of a VSX cluster, with exactly the same problem, a messages files completely filled with anti-spoofing messages. Older messages file was no longer available.
Code running R80.20 with jumbo 118
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Sorry I should have posted back with details on this. This was a confirmed by CP to be a nic manufacturer hardware issue. The 5900 was affected, and I think maybe a couple of other models? However, it isn't always an issue on these models. There are some checks you can do to see if you have the issue, but unfortunately, I moved out of the firewall world, and don't have access to check the details. In short, contact CP support.
Found another detail from the past - affected 4 port cards, but not 2 port cards. This went to R&D for investigation and they confirmed is was not software/driver related,but rather HW design.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Thanks for the followup, trying to distinguish NIC hardware problems from NIC driver problems can be pretty tough.
CET (Europe) Timezone Course Scheduled for July 1-2
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Hi, it would be interesting to understand if you have checked DMESG after the issue occurs and can confirm if your seeing a VETO bit message just after the ixgbe interfaces being taken offline. I had an opportunity to look at something similar and was fortunate enough to also capture an "error level 5" message from the PCIE drivers also being captured (effectively stating they we're going to sleep). Subsequently, I found that either a reboot or reloading the ixgbe driver (this reloads all ixgbe interfaces so take care) brings it back into service.
Did you ever find a resolution?
Kind Regards
Ju
