Kaspars_Zibarts
Employee

Weird 10Gb interface hangup

This is more of a "Friday" post for fun, although the problem was real: in one of our 5900 clusters running R80.10, the standby member out of the blue produced some obscure errors on one of the 10Gb bond trunks (eth1-04):

 

Jun 27 19:29:19 2018 fwf2 kernel: ixgbe 0000:06:00.1: eth1-04: Detected Tx Unit Hang
Jun 27 19:29:19 2018 fwf2 kernel: Tx Queue <0>
Jun 27 19:29:19 2018 fwf2 kernel: TDH, TDT <37a>, <124>
Jun 27 19:29:19 2018 fwf2 kernel: next_to_use <124>
Jun 27 19:29:19 2018 fwf2 kernel: next_to_clean <37a>
Jun 27 19:29:19 2018 fwf2 kernel: ixgbe 0000:06:00.1: eth1-04: tx_buffer_info[next_to_clean]
Jun 27 19:29:19 2018 fwf2 kernel: time_stamp <2a1b97a3e>
Jun 27 19:29:19 2018 fwf2 kernel: jiffies <2a1b98956>
Jun 27 19:29:19 2018 fwf2 kernel: ixgbe 0000:06:00.1: eth1-04: tx hang 1 detected on queue 0, resetting adapter
Jun 27 19:29:19 2018 fwf2 kernel: ixgbe 0000:06:00.1: eth1-04: Reset adapter
Jun 27 19:29:19 2018 fwf2 kernel: ixgbe 0000:06:00.1: eth1-04: RXDCTL.ENABLE on Rx queue 0 not cleared within the polling period
Jun 27 19:29:19 2018 fwf2 kernel: bonding: bond0: link status down for idle interface eth1-04, disabling it in 200 ms.
Jun 27 19:29:19 2018 fwf2 kernel: ixgbe: eth1-04: ixgbe_setup_mrqc: configure Symmetric RSS
Jun 27 19:29:19 2018 fwf2 kernel: ixgbe: eth1-04: ixgbe_up_complete: Double vlan mode is not set
Jun 27 19:29:19 2018 fwf2 kernel: ixgbe 0000:06:00.1: eth1-04: detected SFP+: 6
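
The bracketed values in the hang dump above are hexadecimal, which makes them hard to read at a glance. Here is a minimal Python sketch that decodes them, using only the values copied from the log. Two things are assumptions on my part and not stated anywhere in this thread: that `time_stamp` is the jiffies value recorded when the buffer at `next_to_clean` was queued, and that the difference from `jiffies` is therefore how long it had been waiting. Converting jiffies to seconds depends on the kernel's HZ setting, which is not given here, so the sketch stops at jiffies.

```python
import re

# Values copied from the log excerpt above (timestamp and host prefix trimmed).
HANG_DUMP = """\
TDH, TDT <37a>, <124>
next_to_use <124>
next_to_clean <37a>
time_stamp <2a1b97a3e>
jiffies <2a1b98956>
"""


def parse_hang_dump(text):
    """Return {field_name: int} for every '<hex>' value in the dump."""
    values = {}
    for line in text.splitlines():
        found = re.findall(r"<([0-9a-fA-F]+)>", line)
        if not found:
            continue
        # Field names sit before the first '<'; "TDH, TDT" names two values.
        names = [n.strip() for n in line.split("<", 1)[0].split(",")]
        for name, hexval in zip(names, found):
            values[name] = int(hexval, 16)
    return values


fields = parse_hang_dump(HANG_DUMP)

# Assumption: time_stamp is when the stuck buffer was queued, in jiffies.
stalled_jiffies = fields["jiffies"] - fields["time_stamp"]

for name, value in fields.items():
    print(f"{name:>14} = {value}")
print(f"stalled for {stalled_jiffies} jiffies (seconds = jiffies / HZ, HZ unknown)")
```

With these values, TDH and `next_to_clean` both decode to 890, TDT and `next_to_use` both decode to 292, and the gap between `time_stamp` and `jiffies` is 3864 jiffies.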

 

And then, a few seconds later, all interfaces in the expansion slot started reporting this continuously:

Jun 27 19:30:55 2018 fwf2 kernel: ixgbe 0000:05:00.0: eth1-01: -1 Spoofed packets detected
Jun 27 19:30:55 2018 fwf2 kernel: ixgbe 0000:06:00.1: eth1-04: -1 Spoofed packets detected
Jun 27 19:30:55 2018 fwf2 kernel: ixgbe 0000:06:00.0: eth1-03: -1 Spoofed packets detected
Jun 27 19:30:55 2018 fwf2 kernel: ixgbe 0000:05:00.1: eth1-02: -1 Spoofed packets detected
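
If the messages file fills up with these lines, a per-interface count makes it easier to see which ports are involved. The following is a rough Python sketch, not a Check Point tool: the regular expression is my own guess at the line layout based purely on the excerpts in this post, and `SAMPLE` is just those excerpts pasted in. To use it on a real box you would feed it the lines of your own messages file instead.

```python
import re
from collections import Counter

# Lines copied from the excerpts in this post.
SAMPLE = """\
Jun 27 19:29:19 2018 fwf2 kernel: ixgbe 0000:06:00.1: eth1-04: Detected Tx Unit Hang
Jun 27 19:30:55 2018 fwf2 kernel: ixgbe 0000:05:00.0: eth1-01: -1 Spoofed packets detected
Jun 27 19:30:55 2018 fwf2 kernel: ixgbe 0000:06:00.1: eth1-04: -1 Spoofed packets detected
Jun 27 19:30:55 2018 fwf2 kernel: ixgbe 0000:06:00.0: eth1-03: -1 Spoofed packets detected
Jun 27 19:30:55 2018 fwf2 kernel: ixgbe 0000:05:00.1: eth1-02: -1 Spoofed packets detected
"""

# Assumed layout: "kernel: ixgbe <pci-address>: <interface>: <message>"
EVENT_RE = re.compile(
    r"kernel: ixgbe [0-9a-f:.]+: (?P<iface>[\w-]+): (?P<msg>.*)"
)


def summarize(lines):
    """Count Tx hang and spoofed-packet messages per interface."""
    hangs, spoofs = Counter(), Counter()
    for line in lines:
        match = EVENT_RE.search(line)
        if not match:
            continue  # not an ixgbe per-device message
        if "Detected Tx Unit Hang" in match["msg"]:
            hangs[match["iface"]] += 1
        elif "Spoofed packets detected" in match["msg"]:
            spoofs[match["iface"]] += 1
    return hangs, spoofs


hangs, spoofs = summarize(SAMPLE.splitlines())
print("Tx hangs:", dict(hangs))
print("Spoofed-packet messages:", dict(spoofs))
```

On the sample above this reports one Tx hang on eth1-04 and one spoofed-packet message on each of the four eth1-0x interfaces.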

 

It was resolved by rebooting the node. The only relevant SK I found was "Intermittent outages of TCP traffic on 10GbE interfaces in IP Appliances running Gaia OS", but it's not applicable to R80.10 or the 5900, and offload is definitely disabled on the interfaces.

 

Here's the best part - the display on the appliance at the time showed this

 

 

Does this mean firewall needs to go to toilet?? P-p-p-peee.... 


Accepted Solutions
AlekseiShelepov
Advisor

I guess some things never change...

I think I used a wrong ISO file that time.


11 Replies
Timothy_Hall
Legend

Sounds like the NIC hardware is what went into the toilet, the display was telling you that the "stream" of outbound packets was no longer getting handled by the NIC, and that the firewall's bladder was too full which can certainly be uncomfortable to say the least.  🙂

--
Second Edition of my "Max Power" Firewall Book
Now Available at http://www.maxpowerfirewalls.com

Gateway Performance Optimization R81.20 Course
now available at maxpowerfirewalls.com
Mike_Jones
Participant

Sorry to bring an old post back to life, but I'm having a similar error on a 10Gb interface on a 5900. A reboot initially fixed the issue, but it came back and again required a reboot (not to mention a disk check on each reboot).

Did you have any more problems with your 5900 after your reboot?

Timothy_Hall
Legend

What code version is the firewall using? Make sure you have the latest GA Jumbo HFA applied, as updated NIC driver versions are sometimes bundled in Jumbo HFAs.

 

Mike_Jones
Participant

Product version Check Point Gaia R80.10
OS build 479
OS kernel version 2.6.18-92cpx86_64
OS edition 64-bit

Using GA Jumbo HFA Take 169

 

Maarten_Sjouw
Champion
Well, bringing back this old one again.
Last weekend we had a 5900 with a 10Gb bond (2 interfaces), part of a VSX cluster, with exactly the same problem: a messages file completely filled with anti-spoofing messages. The older messages file was no longer available.
It is running R80.20 with Jumbo 118.
Regards, Maarten
Mike_Jones
Participant

Sorry, I should have posted back with details on this. This was confirmed by CP to be a NIC manufacturer hardware issue. The 5900 was affected, and I think maybe a couple of other models. However, it isn't always an issue on these models. There are some checks you can do to see if you have the issue, but unfortunately I moved out of the firewall world and don't have access to check the details. In short, contact CP support.

Found another detail from the past: it affected 4-port cards, but not 2-port cards. This went to R&D for investigation, and they confirmed it was not software/driver related, but rather HW design.

Maarten_Sjouw
Champion
Thanks Mike, we will check with TAC.
Regards, Maarten
Timothy_Hall
Legend

Thanks for the follow-up. Trying to distinguish NIC hardware problems from NIC driver problems can be pretty tough.

bad_joojoo
Participant

Hi, it would be interesting to know whether you checked dmesg after the issue occurred and can confirm whether you're seeing a VETO bit message just after the ixgbe interfaces are taken offline. I had an opportunity to look at something similar and was fortunate enough to also capture an "error level 5" message from the PCIe drivers (effectively stating they were going to sleep). Subsequently, I found that either a reboot or reloading the ixgbe driver (this reloads all ixgbe interfaces, so take care) brings it back into service.

 

Did you ever find a resolution?

Kind Regards

Ju
