RS_Daniel
Advisor

HA ClusterXL Connectivity Upgrade

Hi mates,

I am facing an issue with a cluster upgrade and have no idea why the problem appeared. I hope someone can provide some guidance or help. The scenario is described below:

*Two 4400 R80.10 appliances in HA cluster --> Upgrade to R80.30

*We are performing the Connectivity Upgrade procedure because the customer needs zero downtime if possible (we followed the steps described in CP_R80.30_Installation_and_Upgrade_Guide).

*After upgrading the standby member, we ran the connectivity upgrade with the cphacu start command. The member states were OK: Ready for the upgraded member and Active(!) for the old one.

*When we did the failover with the cpstop command on the old member, traffic in some VLANs started to lose connectivity and the connection to the SMS was lost. As the traffic this cluster handles is very critical, we had to run cpstart on the old member, which is handling the traffic right now.

*During the review after the failed upgrade, we saw that the RouteD pnote is in state "Problem" on the upgraded appliance. On the active member, the RouteD pnote state is "ok". Taking into account that this is a cluster and that it uses OSPF, is that the expected state for the RouteD pnote?

*At the moment, the active member is running R80.10 and the other member R80.30. The version in the SmartConsole object is set to R80.10 so that the customer can still install policy.

*We opened a case with TAC, who is asking for a maintenance window in order to troubleshoot during the failover. That implies downtime, and the customer wants it only as a last resort, after all other possible attempts have been made.

Any ideas on how to solve this? Thanks in advance.

3 Replies
PhoneBoy
Admin

Given the nature of the issue, gathering the necessary debugs during a maintenance window seems to be the only real course of action.
The various connectivity upgrade options were not meant to be operated long-term, so you definitely want to get to the bottom of this soon.
Bryce_Myers
Collaborator

Did you verify that you're running the same router-id on both cluster members?
The cphacu start command should pull over the dynamic routes when it runs. Do you have its output?
RS_Daniel
Advisor

Hi,

Thank you for your help. We had a maintenance window yesterday; TAC collected some information and will update the ticket. I saw that when the upgraded member handles the traffic, the RouteD pnote state is ok. I do not have the output from the cphacu command, but it showed "Connectivity upgrade status: Ready for Failover". We also verified the routes with the "show route summary" command, and the router-ID is the same on both gateways. Traffic was still dropped in the same VLANs. Basic connectivity was OK, and ping to and from those VLANs was successful, but users could not start sessions. We restarted those services with no change, and mail in those VLANs could not be sent, failing with a timeout error message. I will keep you posted on how it goes.
