kamilazat
Collaborator

Gateway doesn't show up on MHO GaiaPortal

Hello all.

We are trying to add a new gateway to an existing Maestro environment. Currently each security group has 3 licensed gateways that work perfectly fine, but adding a new gateway doesn't work.

- We installed R81.20SP

- lldpctl on MHO shows the device, but it doesn't show up in the Unassigned Gateways section.

- We already checked the cables and other functionality. 

- auto-clone is enabled on the SMO Master.

We are trying to add new gateways to both security groups and the same issue appears for both sides. No error messages in /var/log/messages on MHO.

Where else should we look?

 

Cheers!

0 Kudos
24 Replies
Daniel_Szydelko
Advisor

Hello,

At this stage you need to be sure that:

- the new SGM has the correct SP software version properly installed

- the downlinks are connected according to sk158652

- the relevant ports on the MHOs are configured as downlinks (orch_stat -p -v)

BR

Daniel. 

0 Kudos
kamilazat
Collaborator

@Daniel_Szydelko Thank you for the suggestions.

We have double-checked that R81.20SP is installed on the GW. I am attaching a screenshot of orch_stat -p -v.

The same steps were followed on a different Maestro environment with the same MHO model, same ISO and same gateway model (16600), and it worked there.

Additionally, we found sk173410, which describes exactly the same situation, except that in our case web ssl-port has not been changed.

Maybe we should open a TAC ticket, what do you think?

Edit: Noticed that I uploaded the wrong image. Corrected.

0 Kudos
emmap
Employee

What DAC/QSFPs/cables are used for the downlink? If you do a 'show maestro port x/x/x optic-info' for the downlink port what do you get? 

0 Kudos
kamilazat
Collaborator

@emmap Here's the output:

mho> show maestro port 1/23/1 optic-info
Physical Port: 1/23/1
Vendor Name: PROLABS
Serial Number: CPT186AC2608A
Part Number: NA
Check Point Part Number: 1368NIY4463
Enforcement: Supported
Check Point SKU: CPAC-DAC-100G-3M-B
Material ID: 321296
Product Type: 100GBASE-CR4-3M
Speed: 100G

0 Kudos
emmap
Employee

OK so that's detected properly, that's good. The MHO is reporting that the link is physically down, can you see if the SGM on the other end is showing the same? Do you have out of band access to it?

0 Kudos
kamilazat
Collaborator

We can take outputs from the SGM, yes. Would ethtool be enough, or is some other information needed?

Edit: I'm thinking about running the script mentioned here.

 

0 Kudos
kamilazat
Collaborator

@emmap I'm attaching the output of the script mentioned here.

 

What do you think, is the "Invalid numeric literal at line 2, column 0" message related to the one-liner or to the line card?

 

0 Kudos
emmap
Employee

Not sure, but it seems like it's detecting the cable. Have you tried a different downlink port on the MHO?

0 Kudos
kamilazat
Collaborator

Yes, we tried that as well. Currently, it looks like the correct ISO is installed and the cable appears to be working. We have taken a cpinfo from the gateway but didn't see any apparent errors in it. The only repeating error message is this one on the MHO:

error: smo_rest_api_request_url: smo_rest_api_request_with_url: ERROR! Failed to Handle request. smo_rest_api_type = -1

But web ssl-port is 443 already, so sk171592 does not really help. 

Where else can we look, any ideas?
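Side note for readers: when a single message repeats like this, it can help to count which function names actually show up after "error:" in the MHO log, to confirm that nothing else is hiding in the noise. The snippet below is only a rough sketch. The function name count_error_sources and the syslog prefix in the sample are hypothetical; only the error text itself is taken from this thread.

```python
import re
from collections import Counter


def count_error_sources(log_text: str) -> Counter:
    """Count how often each function name follows 'error:' in the log text."""
    counts = Counter()
    for line in log_text.splitlines():
        # Capture the first identifier after a lowercase "error: " prefix.
        match = re.search(r"error: (\w+):", line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Hypothetical sample: the timestamp/host prefix is invented for illustration,
# the message after "error:" is the one quoted in the post above.
SAMPLE_LOG = """\
Jan  1 00:00:01 mho example: error: smo_rest_api_request_url: smo_rest_api_request_with_url: ERROR! Failed to Handle request. smo_rest_api_type = -1
Jan  1 00:00:31 mho example: error: smo_rest_api_request_url: smo_rest_api_request_with_url: ERROR! Failed to Handle request. smo_rest_api_type = -1
Jan  1 00:00:45 mho example: some unrelated informational line
"""

for name, count in count_error_sources(SAMPLE_LOG).most_common():
    print(f"{name}: {count}")
```

On a real system the text would come from a copy of /var/log/messages rather than from an inline sample.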

0 Kudos
AkosBakos
Leader

Hi,

Has the RMA arrived?

Akos

----------------
\m/_(>_<)_\m/
0 Kudos
kamilazat
Collaborator

The RMA is in progress. It was for an appliance where we didn't have console access. But the problem remains in another Maestro deployment as well. I posted to troubleshoot multiple similar issues. We have this gateway where we do have console access, but it still doesn't show up.

0 Kudos
emmap
Employee

What does 'ethtool [ifname]' from the SGM say? I think the ifname is ethsBP1-01 but double check with 'ifconfig'.

We're still talking about a 16600HS SGM, yes?

0 Kudos
kamilazat
Collaborator

Yes, the model is the same, 16600HS.

I attached the ethtool info from both BP interfaces (two of them appear).

0 Kudos
emmap
Employee

They're both showing no link, same as the MHO. If they're cabled up and like that, there's a layer 1/hardware issue here to chase down.
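Side note for readers: the check described here comes down to the "Link detected" line that ethtool prints for an interface. If you collect ethtool output from several SGMs, a small helper like the sketch below can read that line for you. The function name link_detected is hypothetical, and the sample text is illustrative only; it is not the attachment from this thread.

```python
import re
from typing import Optional


def link_detected(ethtool_output: str) -> Optional[bool]:
    """Return True/False from the 'Link detected' line, or None if it is absent."""
    match = re.search(
        r"^\s*Link detected:\s*(yes|no)\s*$", ethtool_output, re.MULTILINE
    )
    if match is None:
        return None
    return match.group(1) == "yes"


# Illustrative sample only, shaped like typical ethtool output for a port
# with no link; the interface name is the one suggested earlier in the thread.
SAMPLE = """Settings for ethsBP1-01:
\tSpeed: Unknown!
\tDuplex: Unknown! (255)
\tAuto-negotiation: on
\tLink detected: no
"""

print(link_detected(SAMPLE))  # prints False
```

A result of False on the SGM together with a DOWN state on the MHO port points at the physical layer (cable, seating, transceiver, port) rather than at the Maestro configuration.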

0 Kudos
kamilazat
Collaborator

You were right, we noticed that the socket didn't "click" in, which is a surprisingly common issue 🙂

Now, the problem still remains. orch_stat shows PLUGGED, but DOWN. We ran HCP on MHO and it told us to run the command below on the problematic port. I added the other flags for more information as well.

mlxlink -d /dev/mst/mt53100_pci_cr0 -m -e -p 23 -c --show_ber_monitor

And here's the output:

Operational Info
---------------
State : Polling
Physical state : ETH_AN_FSM_AN_GOOD_CHECK
Speed : N/A
Width : N/A
FEC : N/A
Loopback Mode : No Loopback
Auto Negotiation : ON

Supported Info
-------------
Enabled Link Speed (Ext.) : 0x00000200 (100G_4X)
Supported Cable Speed (Ext.) : 0x000002F2 (400G,50G_2X,40G,25G,10G,1G)

Troubleshooting Info
------------------
Status Opcode : 0
Group Opcode : N/A
Recommendation : No issue was observed.

Physical Counters and BER Info
---------------------------
Time Since Last Clear [Min] : N/A
Effective Physical Errors : N/A
Effective Physical BER : N/A
Raw Physical BER : N/A
Raw Physical Errors Per Lane : N/A

EYE Opening Info
--------------
Physical : 0, 0, 0, 0
Height Eye Opening [mV] : N/A, N/A, N/A, N/A
Phase Eye Opening [psec] : N/A, N/A, N/A, N/A

Module Info
----------
Identifier : QSFP28
Compliance : 100GBASE-CR4 or 25GBASE-CR CA-L
Style Technology : Copper cable
Cable Type : Passive copper cable
OUI : Other
Vendor Name : PROLABS
Vendor Part Number : 13B8NEV4463
Vendor Serial Number : CFT1858B2688A
Rev : A1
Attenuation (5g,7g,12g) [dB]: 0,0,0
FW Version : N/A
Wavelength [nm] : N/A
Transfer Distance [m] : 3
Digital Diagnostic Monitoring: No
Power Class : 1.5 W max
CDR RX : N/A
CDR TX : N/A
LOS Alarm : N/A
Temperature [C] : N/A
Voltage [mV] : N/A
Bias Current [mA] : N/A
RX Power Current [dBm] : N/A
TX Power Current [dBm] : N/A

BER Monitor Info
--------------
BER Monitor State : Normal
BER Monitor Type : Post FEC / No FEC BER monitoring

As far as I guesstimate, auto-negotiation went well (ETH_AN_FSM_AN_GOOD_CHECK state) and I understand that speed information will not be shown until there is actual traffic between the devices. On the other hand, the gateway still shows 0 RX or TX. No errors or anything, all counters (except mgmt interface) are zero.

sk180812 talks about turning off auto-negotiation, although the Physical State is different in that example. I'm not sure we can turn it off on the MHO, but on the GW it's possible. Do you think it's worth a try?
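Side note for readers: in the output above, the field to watch is State under Operational Info. The sketch below parses that section into key/value pairs and checks it. It rests on an assumption that is not confirmed anywhere in this thread: that mlxlink shows "Active" in State once the link is established, so "Polling" would mean the port is still trying to bring the link up. The function names parse_mlxlink_fields and link_is_up are hypothetical; the sample text is copied from the output posted above.

```python
def parse_mlxlink_fields(text: str) -> dict:
    """Parse 'Key : Value' lines from mlxlink text output into a dict."""
    fields = {}
    for line in text.splitlines():
        # Section headers and separator lines contain no " : " and are skipped.
        key, sep, value = line.partition(" : ")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def link_is_up(fields: dict) -> bool:
    # Assumption: "Active" is the State value reported for an established link.
    return fields.get("State") == "Active"


# Operational Info section as posted in this thread.
OPERATIONAL_INFO = """Operational Info
---------------
State : Polling
Physical state : ETH_AN_FSM_AN_GOOD_CHECK
Speed : N/A
Width : N/A
FEC : N/A
Loopback Mode : No Loopback
Auto Negotiation : ON
"""

fields = parse_mlxlink_fields(OPERATIONAL_INFO)
print(f"{fields['State']} -> link up: {link_is_up(fields)}")
# prints: Polling -> link up: False
```

If that assumption holds, the N/A values for Speed, Width and FEC would follow from the link not being up, not from a lack of traffic.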

0 Kudos
emmap
Employee

The symptoms are different. What does ethtool on the SGM show now?

0 Kudos
kamilazat
Collaborator

Sorry for the late answer. I attached the full ethtool output for ethsBP1-01.

Do you think turning off auto-negotiation can be an option?

0 Kudos
emmap
Employee

There's still no link detected. You can try disabling auto-negotiation; if that doesn't work, I'd suggest opening a TAC case from here, as it still seems to be a hardware issue somewhere.

0 Kudos
Daniel_Szydelko
Advisor

Hello,

If possible, create a new SR and ask for a remote session with a Maestro TAC engineer. That will be a faster way to resolve it.

You can also go through some log files on the MHO (/var/log/messages, /var/log/smartd.log) to see if anything useful can be found regarding this situation.

It's strange that the link state is DOWN. You can try running ethtool on a similar, working SGM to see what it looks like from that side.

BR

Daniel.

0 Kudos
AkosBakos
Leader

Hi,

Did you install any kind of Jumbo onto the clean R81.20 SP image? Or were you planning to do that after the enrollment?

To be sure, do an FCD and start the process again.

Did you use Check_Point_R81.20_T634_ScalablePlatform.iso?

Do you have another 100G DAC cable? Just as a test, to be 100% sure the cable isn't faulty.

Akos

----------------
\m/_(>_<)_\m/
0 Kudos
kamilazat
Collaborator

No JHF was installed onto the clean image. We did an FCD a couple of times, triple-checked that it is the correct .iso, and tried other cables.

0 Kudos
AkosBakos
Leader

I wish you the best 🙂

----------------
\m/_(>_<)_\m/
0 Kudos
kamilazat
Collaborator

That's relieving :)))

0 Kudos
kamilazat
Collaborator

Hi again everyone.

I wanted to update the post. TAC decided that an RMA would be the best solution.

Thank you all for your time!

Cheers!

0 Kudos