sebfuuu
Explorer

VSX UPPAK issues

Hello everyone,

I noticed a few interesting issues with VSX and UPPAK.

 

Appliance 9300
R81.20 JHF 111

 

 

* First issue:

When traffic passes through two different VSs on the same VSX hardware via a Virtual Switch, we see 1-2% packet loss.

With zdebug we see this output that matches the traffic:

 

@;628145396.20082;[uspace];[tid_1];[SIM4];prepare_cut_through:do_routing returned invalid out_ifn 65535, conn:<x.x.x.x,53,y.y.y.y,46771,17>;

@;628145396.20083;[uspace];[tid_1];[SIM4];sim_pkt_send_drop_notification:(5,0) received drop, reason: Interface Down (8), conn:<y.y.y.y,46771,x.x.x.x,53,17>;

@;628145396.20084;[uspace];[tid_1];[SIM4];sim_pkt_send_drop_notification:no track is needed for this drop - not sending a notificaion, conn:<y.y.y.y,46771,x.x.x.x,53,17>;

 

The packet is always dropped on the first VS it traverses, regardless of the type of traffic or which VSs are involved.

(More than two VS see this behavior)
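
For anyone who wants to quantify this from a saved debug capture, here is a minimal Python sketch that tallies drop notifications by reason and connection. The regular expression is inferred only from the three sample lines quoted above, and reading the `conn:<...>` tuple as source, source port, destination, destination port, protocol is an assumption, not a documented format. `count_drops` is a hypothetical helper name.

```python
import re
from collections import Counter

# Field layout inferred from the sample lines in this post only.
# Assumption: conn:<src,sport,dst,dport,proto>.
DROP_RE = re.compile(
    r"received drop, reason: (?P<reason>.+?) \((?P<code>\d+)\), "
    r"conn:<(?P<src>[^,]+),(?P<sport>\d+),(?P<dst>[^,]+),(?P<dport>\d+),(?P<proto>\d+)>"
)


def count_drops(lines):
    """Count drop notifications per (reason, src, dst, dport, proto)."""
    counts = Counter()
    for line in lines:
        m = DROP_RE.search(line)
        if m:
            key = (m["reason"], m["src"], m["dst"], int(m["dport"]), int(m["proto"]))
            counts[key] += 1
    return counts


# The three lines quoted in the post, used as sample input.
sample = [
    "@;628145396.20082;[uspace];[tid_1];[SIM4];prepare_cut_through:do_routing returned invalid out_ifn 65535, conn:<x.x.x.x,53,y.y.y.y,46771,17>;",
    "@;628145396.20083;[uspace];[tid_1];[SIM4];sim_pkt_send_drop_notification:(5,0) received drop, reason: Interface Down (8), conn:<y.y.y.y,46771,x.x.x.x,53,17>;",
    "@;628145396.20084;[uspace];[tid_1];[SIM4];sim_pkt_send_drop_notification:no track is needed for this drop - not sending a notificaion, conn:<y.y.y.y,46771,x.x.x.x,53,17>;",
]

for key, n in count_drops(sample).items():
    print(n, key)
```

Only lines containing "received drop, reason:" are counted, so the routing and "no track" lines are ignored. Feeding it a longer capture would show whether the drops cluster on particular connections or ports.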

After changing the SecureXL mode from UPPAK to KPPAK, this issue goes away.

TAC case was raised but the customer was not willing to share the data needed to proceed with the case at that time.

 

 

* Second issue:

After changing the SecureXL mode to KPPAK, we get RX errors on many of the 10G fiber interfaces.

We have confirmed this on four 9300 appliances (two different clusters).

If you switch to UPPAK, the errors go away.

If you go back to KPPAK you get the same amount of errors on the same interfaces.

Rebooting the firewall or switch, or disconnecting and reconnecting the SFP and cable, has no impact.
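
To compare the two modes with numbers rather than by eye, one option is to snapshot the RX error counters before and after a mode change and diff them. The sketch below reads the standard Linux sysfs counter `rx_errors`. It is an assumption that this kernel counter reflects the same errors described here in both UPPAK and KPPAK mode, so check it against whatever tool you normally use. The interface names and numbers are made up for illustration.

```python
from pathlib import Path


def read_rx_errors(ifaces, sysfs_root="/sys/class/net"):
    """Read the kernel rx_errors counter for each interface from Linux sysfs."""
    return {
        name: int(Path(sysfs_root, name, "statistics", "rx_errors").read_text())
        for name in ifaces
    }


def rx_error_delta(before, after):
    """Return the per-interface growth of rx_errors between two snapshots."""
    return {name: after[name] - before[name] for name in before if name in after}


# Hypothetical snapshots taken some time apart in the same SecureXL mode.
before = {"eth1-01": 1200, "eth1-02": 0}
after = {"eth1-01": 1850, "eth1-02": 0}
print(rx_error_delta(before, after))
```

On a live system, `read_rx_errors(["eth1-01", "eth1-02"])` would produce the snapshots. Taking them over the same time window in each mode keeps the comparison fair.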

No TAC case raised yet.

 

 

I have not seen these issues on larger 9xxx appliances with VSX.

So one guess would be the Intel E-core / P-core architecture (as in sk183438).

USFW is used.

 

 

 

My question for the community:

Has anyone seen issues like this?

 

 

 

Thanks again to everyone for this great community.

/Seb

 

2 Replies
Henrik_Noerr1
Advisor

Your experience mirrors ours.

Furthermore, we have a 9400 VSX cluster waiting to be onboarded with production traffic, still with no VSs built.

One node becomes unresponsive every 2-4 days and needs to be power cycled from LOM. We RMA'd both nodes, with the same result. Which node in the cluster crashes varies. Jumbo take 113.

We are investigating options to replace the model with a 9700+.

 

/Henrik

Lesley
Authority

The RX errors could be related to: https://support.checkpoint.com/results/sk/sk182825

It does not match 100%, though.

