You were right: we noticed that the socket didn't "click" in, which is a surprisingly common issue 🙂
The problem remains, though. orch_stat shows the port as PLUGGED but DOWN. We ran HCP on the MHO, and it told us to run the command below on the problematic port. I added the other flags to get more information.
mlxlink -d /dev/mst/mt53100_pci_cr0 -m -e -p 23 -c --show_ber_monitor
And here's the output:
Operational Info
---------------
State : Polling
Physical state : ETH_AN_FSM_AN_GOOD_CHECK
Speed : N/A
Width : N/A
FEC : N/A
Loopback Mode : No Loopback
Auto Negotiation : ON
Supported Info
-------------
Enabled Link Speed (Ext.) : 0x00000200 (100G_4X)
Supported Cable Speed (Ext.) : 0x000002F2 (400G,50G_2X,40G,25G,10G,1G)
Troubleshooting Info
------------------
Status Opcode : 0
Group Opcode : N/A
Recommendation : No issue was observed.
Physical Counters and BER Info
---------------------------
Time Since Last Clear [Min] : N/A
Effective Physical Errors : N/A
Effective Physical BER : N/A
Raw Physical BER : N/A
Raw Physical Errors Per Lane : N/A
EYE Opening Info
--------------
Physical : 0, 0, 0, 0
Height Eye Opening [mV] : N/A, N/A, N/A, N/A
Phase Eye Opening [psec] : N/A, N/A, N/A, N/A
Module Info
----------
Identifier : QSFP28
Compliance : 100GBASE-CR4 or 25GBASE-CR CA-L
Style Technology : Copper cable
Cable Type : Passive copper cable
OUI : Other
Vendor Name : PROLABS
Vendor Part Number : 13B8NEV4463
Vendor Serial Number : CFT1858B2688A
Rev : A1
Attenuation (5g,7g,12g) [dB]: 0,0,0
FW Version : N/A
Wavelength [nm] : N/A
Transfer Distance [m] : 3
Digital Diagnostic Monitoring: No
Power Class : 1.5 W max
CDR RX : N/A
CDR TX : N/A
LOS Alarm : N/A
Temperature [C] : N/A
Voltage [mV] : N/A
Bias Current [mA] : N/A
RX Power Current [dBm] : N/A
TX Power Current [dBm] : N/A
BER Monitor Info
--------------
BER Monitor State : Normal
BER Monitor Type : Post FEC / No FEC BER monitoring
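For comparing runs before and after a change, output like the above can be turned into a dictionary. This is only a minimal sketch: it assumes each section title is followed by a dashed underline and that fields use the "key : value" layout seen in the paste. The function name parse_mlxlink and the shortened SAMPLE text are made up for illustration and are not part of mlxlink.

```python
def parse_mlxlink(text):
    """Group 'key : value' lines under the section title that precedes them."""
    sections = {}
    current = None
    lines = text.splitlines()
    for i, line in enumerate(lines):
        stripped = line.strip()
        # Skip blank lines and the dashed underline below each section title.
        if not stripped or set(stripped) == {"-"}:
            continue
        nxt = lines[i + 1].strip() if i + 1 < len(lines) else ""
        if nxt and set(nxt) == {"-"}:
            # A line followed by dashes is treated as a section title.
            current = stripped
            sections[current] = {}
        elif ":" in stripped and current is not None:
            # Split on the first colon only, so values may contain colons.
            key, _, value = stripped.partition(":")
            sections[current][key.strip()] = value.strip()
    return sections


# Shortened copy of the paste above (hypothetical sample for the sketch).
SAMPLE = """\
Operational Info
---------------
State : Polling
Physical state : ETH_AN_FSM_AN_GOOD_CHECK
Speed : N/A
Module Info
----------
Attenuation (5g,7g,12g) [dB]: 0,0,0
"""

parsed = parse_mlxlink(SAMPLE)
op = parsed["Operational Info"]
# Fields the tool could not report for this port.
missing = [key for key, value in op.items() if value == "N/A"]
print(op["State"], op["Physical state"], missing)
```

Saving the parsed result from each run makes it easy to see which fields changed, for example whether State moves away from Polling after a configuration change.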
As far as I can tell, auto-negotiation went well (the physical state is ETH_AN_FSM_AN_GOOD_CHECK), and my understanding is that speed information is not shown until there is actual traffic between the devices. On the other hand, the gateway still shows 0 RX and 0 TX. There are no errors, and all counters (except on the mgmt interface) are zero.
sk180812 talks about turning off auto-negotiation, although the Physical state in that example is different. I'm not sure we can turn it off on the MHO, but it is possible on the GW. Do you think it's worth a try?
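One thing that can be checked directly from the paste is whether the enabled speed is among those the cable reports. The two hex values under Supported Info look like bitmasks, so a bitwise AND shows the overlap. This is a rough sketch that assumes both masks use the same bit layout; it does not try to map bits to speed names, and the helper set_bits is made up for illustration.

```python
ENABLED_LINK_SPEED = 0x00000200      # "Enabled Link Speed (Ext.)" from the paste
SUPPORTED_CABLE_SPEED = 0x000002F2   # "Supported Cable Speed (Ext.)" from the paste


def set_bits(mask):
    """Return the positions of the bits that are set in mask, lowest first."""
    return [bit for bit in range(mask.bit_length()) if mask & (1 << bit)]


# Bits present in both masks: speeds that are enabled and that the cable reports.
common = ENABLED_LINK_SPEED & SUPPORTED_CABLE_SPEED

print("enabled bits:  ", set_bits(ENABLED_LINK_SPEED))
print("cable bits:    ", set_bits(SUPPORTED_CABLE_SPEED))
print("common mask:   ", hex(common))
print("enabled speed covered by cable:", common == ENABLED_LINK_SPEED)
```

The common mask comes out non-zero here, so the bit labelled 100G_4X on the Enabled line is also set in the cable's mask, even though the decoded list on the cable line does not show a 100G entry. If that reading of the masks is right, a speed mismatch between port and cable seems less likely than an issue with negotiation itself.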