Scott_Paisley
Advisor

Standby Manager resets to active

I failed over the management servers yesterday, then failed back to the server we normally use as Active. Now the standby keeps resetting itself to active, so I have a clash.

I have a ticket open, but has anybody else seen this?


Accepted Solutions
Scott_Paisley
Advisor

Here is the TAC solution. Trying it now.

Solution

  1. Run the cpstop command.
     
  2. To find out the current status, run:

    # cpprod_util FwIsActiveManagement

    0 - means Standby.
    1 - means Active.
     
  3. To set the Management station to Standby status, run:

    # cpprod_util FwSetActiveManagement 0

     
  4. To set the Management station to Active status, run:

    # cpprod_util FwSetActiveManagement 1
     
  5. Run the cpstart command.
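
Not part of the TAC note, but the five steps above could be wrapped in one shell function, roughly as sketched below. The names set_mgmt_ha_role, run and DRY_RUN are made up for illustration; only cpstop, cpstart and the two cpprod_util calls come from the steps themselves. Pass 0 on the server that should end up Standby and 1 on the one that should end up Active. The sketch does not check exit codes, so treat it as a starting point rather than a finished tool.

```shell
#!/bin/bash
# Sketch only: wraps the five TAC steps in one function.
# set_mgmt_ha_role, run and DRY_RUN are hypothetical names, not Check Point tools.

run() {
    # With DRY_RUN=1, print the command instead of executing it.
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

set_mgmt_ha_role() {
    local role="$1"   # 0 = Standby, 1 = Active
    case "$role" in
        0|1) ;;
        *) echo "usage: set_mgmt_ha_role 0|1" >&2; return 2 ;;
    esac
    run cpstop                                      # step 1
    run cpprod_util FwIsActiveManagement            # step 2: value before the change
    run cpprod_util FwSetActiveManagement "$role"   # step 3 (0) or step 4 (1)
    run cpprod_util FwIsActiveManagement            # repeat of step 2 to confirm the new value
    run cpstart                                     # step 5
}
```

Setting DRY_RUN=1 before calling set_mgmt_ha_role 0 only prints the commands in order, which is a cheap way to review the sequence before touching a live server.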

12 Replies
Timothy_Hall
Legend

An SMS in a Management HA setup should never set itself active without human intervention.  Are you saying that you started with the Primary in active and Secondary in standby, then you set Secondary active (and Primary went standby), then you set Primary back to active (and Secondary went standby), and now you are saying the Secondary went active again on its own?  Are you seeing a state of "Advanced" or "Collision"?

Gateway Performance Optimization R81.20 Course
now available at maxpowerfirewalls.com
Scott_Paisley
Advisor

I am seeing collision.

Due to an earlier issue, we normally run our Secondary as the Active, and our Primary as the Backup, and all is fine.

I made the primary active yesterday, then made the secondary active again. I can set the primary as standby and do a full sync, but a few minutes later I get the error that says both are active.

Timothy_Hall
Legend

If you are in a collision state, you must force a manual sync to get back into the usual active/standby. Because both SMSs have changes, you need to pick a "winner" and a "loser". The winner syncs over the top of the loser; any changes on the loser are lost, except (I think) for SIC certificate changes/revocations, which are merged. It would be a great idea to take backups of both SMSs first, in case you overwrite the "wrong" changes that you actually needed, as a manual sync cannot be undone.

Scott_Paisley
Advisor

A manual sync does not resolve the problem.

It shows a successful sync, then a few minutes later it goes back to collision.

Timothy_Hall
Legend

Yep, that would be a TAC case then.

the_rock
Legend

I could be wrong, as I have not set up or played with mgmt HA in some time, but I don't believe there is a preempt option in the dashboard like the one you can set for gateways, where one member always acts as master even when it comes back after a reboot. I think that for mgmt HA you have to fail them over manually. From what you described, that sort of behaviour does not sound normal to me. Are there any logs you can see, either in the dashboard or on the command line? Maybe the messages file, or any core dumps?

Scott_Paisley
Advisor

Yeah, I did a manual failover, and a manual failback, but the failback didn't 'stick' on one of the boxes. Trying the TAC fix now.

the_rock
Legend

OK, great, let us know if that works. I think I have seen that procedure before; I believe there is an sk for that. Hope it goes well.

genisis__
Leader

It does sound odd. The real question is what triggered this behaviour, i.e. what is the root cause of the issue.

Silly question but what values are returned when you run the following on both SMS appliances?

cpprod_util FwIsActiveManagement

Primary Should = 1

Secondary Should = 0
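
Once you have the value from each SMS, the comparison comes down to something like the sketch below. classify_ha_pair is a made-up helper name, and the two values are typed in by hand after running cpprod_util FwIsActiveManagement on each server. The only thing it assumes is the 0 = Standby / 1 = Active meaning from the TAC steps. Which server should return 1 depends on which one you intend to be Active.

```shell
#!/bin/bash
# Sketch only: classify the pair of values read from the two management servers.
# classify_ha_pair is a hypothetical helper, not a Check Point command.

classify_ha_pair() {
    local first="$1" second="$2"
    case "$first:$second" in
        1:1) echo "collision: both report Active" ;;
        0:0) echo "no active: both report Standby" ;;
        1:0|0:1) echo "ok: one Active, one Standby" ;;
        *) echo "unexpected values: '$first' '$second'" >&2; return 2 ;;
    esac
}
```

For example, classify_ha_pair 1 1 prints "collision: both report Active".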

 

Scott_Paisley
Advisor

Both devices returned 1, which explains the problem.

Since I set the standby to zero, it appears to be working.

What is interesting is that it didn't work from the GUI.

the_rock
Legend

Yes, that's very odd indeed. Let us know if they ever find a reason why it happened; I would be curious to know.
