cem82
Contributor

OSPF drops on cluster failover since R81.10 upgrade from R80.30

Hi

After upgrading from R80.30 to R81.10 (active/standby ClusterXL), we have found that whenever the cluster fails over we lose OSPF and our advertised routes. This worked fine on R80.30 with the same clish config, with the router ID set to the VIP on both cluster members. Under stable conditions we do see the OSPF routes synced to the standby, but historically, if we ran "show ospf neighbors" on the standby, we would see the same neighbors as on the active; now we see none on the standby. After a few minutes everything is fine again on either side of the cluster; it's just upon failover that things drop for a few minutes.

Is there some other config or anything else required on R81.10? I didn't notice anything in the ClusterXL or Advanced Routing admin guides.
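For anyone comparing state across the members, the standard Gaia clish show commands are useful on both the active and the standby (output will of course differ per environment):

show ospf neighbors
show ospf interfaces
show route ospf

Running these side by side on each member is how we spotted that the standby had no neighbors after failover.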

Thanks

8 Replies
Chris_Atkinson
Employee

R81.10 with JHF T55 is working well with OSPF in a cluster environment from customer testing that I've been involved with.

CCSM R77/R80/ELITE
cem82
Contributor

We're also running JHF Take 55. Do you need to do anything additional beyond the type of config below, or anything ClusterXL-related?

 

set router-id <cluster VIP>

set ospf instance default area backbone on
set ospf instance default interface <interface> area backbone on
set ospf instance default interface <interface> priority 1

and any associated route filters / route redistribution. There are a few other OSPF modifications alongside "set ospf instance default", but nothing I could imagine causing problems, since they were also there prior to the upgrade.

Chris_Atkinson
Employee

What allowances have been made in the security policy itself for OSPF & IGMP?

Is graceful restart configured in your case?
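On the traffic side, one way to confirm that OSPF (IP protocol 89) and IGMP packets are actually arriving on a member is a capture from expert mode; the interface name below is a placeholder for your OSPF-facing interface:

tcpdump -ni eth1 'ip proto 89 or ip proto igmp'

OSPF hellos to the 224.0.0.5 multicast address should appear here regardless of what the policy then does with them.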

cem82
Contributor

The FW policy hasn't changed at all for some time prior to or after the upgrade, so I'd imagine that's fine. Graceful restart and graceful-restart-helper are both enabled. Looking in SmartLog with either cluster member or the cluster object as source or destination, there are no drops.

Chris_Atkinson
Employee

Graceful restart is new to R81.10, so if you're comparing to R80.30 you should disable it for a like-for-like comparison.

R81.10 "OSPFv2 Graceful Restart in ClusterXL (RFC standard)" Source sk98226

 

The above may seem contrary to what you might expect, but note the following stated in sk95968:

[Image: OSPF CXL.png — excerpt from sk95968 on OSPF behavior in ClusterXL]
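For a like-for-like test, graceful restart can be toggled from clish. The syntax below is a sketch following the same "set ospf instance default" pattern used earlier in the thread; confirm the exact command against the R81.10 Gaia Advanced Routing Administration Guide for your build:

set ospf instance default graceful-restart off
save config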

cem82
Contributor

Thanks for pointing that out. Looking at the admin guide, it seems to refer to that only for VRRP clusters? As part of a TAC case on another cluster running R80.30 (where graceful restart was already enabled), we were advised to enable graceful-restart-helper to resolve an issue we had there, and we did.

 

Hopefully, when I upgrade the other cluster, that other issue doesn't come back if disabling GR is the fix here 🙂

cem82
Contributor

I've tested, and that did appear to do the trick 🙂 Thanks heaps! Much better than the other suggestion I was given, which was to change the kernel value fwha_cluster_hide_active_only to 0.

 

I hope this doesn't re-introduce the problem we had where another FW rebooting caused OSPF to drop on the Check Point, but I'll cross that bridge if it still happens on R81.10 🙂 TAC advised for that case to enable GR and GR-helper, so touch wood we don't encounter it again.

Chris_Atkinson
Employee

I understand why they would recommend that, but it perhaps warrants its own TAC investigation to determine why GR didn't perform in the way indicated, whether due to configuration on the peer (a missing GR helper) or another issue (especially if it persists with T66 and higher).
