ikafka
Collaborator

Quantum Spark 1600 HA down problem

I configured two Quantum Spark 1600 appliances in HA. The device that should be passive appears as down. When I check its interfaces, some of them do not seem to have been passed over from the active device. The screenshots are below.

 

diag.png

HA configuration list of the active device

active.png

Network interfaces of active device 

activa device network.png

Network interfaces of the down device

passive.png

When I log in to the down device, I get this web GUI.

ekran.png

 

I reset the secondary device a couple of times and added it as the HA secondary again, and this happens every time: not all interface IPs come through.


Accepted Solutions
ikafka
Collaborator

Hi,

The problem is solved.

TAC suggested an upgrade.
After upgrading the primary and secondary members, the problem persisted. As you can see in the picture above, there are VLAN interfaces on the primary member that are not passed to the secondary member. (When adding the secondary, the VLAN interfaces are not passed at all; normally they are.)
So I manually added the VLAN interface IP addresses that had not been passed. After waiting a while, the problem was fixed. I had tried this on the previous version and it didn't work. I think the secondary device has trouble getting the VLAN interface IP addresses when clustering on SMB devices.
As a result, my problem is solved. Thank you.
related upgrade package link
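
The check described above (finding interfaces that have an IP on the primary but not on the secondary) can be sketched in a few lines of Python. This is only an illustration, not a Check Point tool: the function name `missing_on_secondary`, the interface names, and all IP addresses below are hypothetical placeholders, and the two dictionaries would have to be filled in by hand from each member's interface list.

```python
def missing_on_secondary(primary, secondary):
    """Return names of interfaces that have an IP on the primary
    but no IP (or no entry at all) on the secondary."""
    return sorted(
        name
        for name, ip in primary.items()
        if ip and not secondary.get(name)
    )


# Hypothetical interface-to-IP maps, copied by hand from each member.
primary = {
    "LAN1": "192.0.2.1",
    "LAN1.100": "198.51.100.1",
    "LAN1.200": "203.0.113.1",
}
secondary = {
    "LAN1": "192.0.2.2",
}

# Interfaces that would need to be added manually on the secondary.
print(missing_on_secondary(primary, secondary))
```

Any name the function returns is a candidate for manual configuration on the secondary member, as was done here.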


9 Replies
Chris_Atkinson
Employee

For context, which firmware version/build is used here?

 

CCSM R77/R80/ELITE
ikafka
Collaborator

Both devices run the same version: R81.10.05 (996001301

RS_Daniel
Advisor

Hello,

According to sk167453, traffic from the standby member goes through the sync interface. In our case, the active member dropped traffic from the standby member. Try creating a rule with the two members as source, Any as destination, and Accept as the action.

If that doesn't fix the issue, try changing the "OS advanced settings - Use unique ICMP ID" value to true so that both members can do their monitoring independently.

Regards

 

ikafka
Collaborator

The result has not changed. The device is still down. I rebooted it, and it showed LOST during the reboot. Once the device is back up, it appears as down again. And when I try to access it from the web, I get an ERR_CONNECTION_TIMED_OUT error. This Quantum series is strange.

ikafka
Collaborator

Here is the cphaprob state output on the down device:

 

Cluster Mode:   High Availability (Active Up) with IGMP Membership

ID         Unique Address  Assigned Load   State

1          10.231.149.1    100%            ACTIVE(!)
2 (local)  10.231.149.2    0%              DOWN


Active PNOTEs: LPRB, IAC, COREXL

Last member state change event:
   Event Code:                 CLUS-110600
   State change:               INIT -> DOWN
   Reason for state change:    Incorrect configuration - Sync interface has not been detected
   Event time:                 Fri Jun 23 18:35:00 2023

Cluster failover count:
   Failover counter:           0
   Time of counter reset:      Fri Jun 23 21:32:33 2023 (reboot)

But when I check, the sync interface status is up, and ping to the active device succeeds. Here is the cphaprob state output from the active device:

upupupuup.png
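
When comparing this output across reboots or between members, it can help to pull out just the member states, the active PNOTEs, and the reason for the last state change. The following is a minimal Python sketch that parses text in the layout pasted above. The function name `parse_cphaprob_state` and the regular expression are illustrative assumptions based only on this one sample; other versions or cluster modes may format the output differently.

```python
import re

# Excerpt of the output pasted above, used as sample input.
SAMPLE = """\
Cluster Mode:   High Availability (Active Up) with IGMP Membership

ID         Unique Address  Assigned Load   State

1          10.231.149.1    100%            ACTIVE(!)
2 (local)  10.231.149.2    0%              DOWN

Active PNOTEs: LPRB, IAC, COREXL

Last member state change event:
   Event Code:                 CLUS-110600
   State change:               INIT -> DOWN
   Reason for state change:    Incorrect configuration - Sync interface has not been detected
"""

# One member row: ID, optional "(local)", IPv4 address, load percentage, state.
MEMBER_RE = re.compile(
    r"^(?P<id>\d+)\s+(?P<local>\(local\)\s+)?"
    r"(?P<addr>\d+(?:\.\d+){3})\s+(?P<load>\d+)%\s+(?P<state>\S.*)$"
)


def parse_cphaprob_state(text):
    """Extract members, active PNOTEs and the last state-change reason."""
    members, pnotes, reason = [], [], None
    for line in text.splitlines():
        stripped = line.strip()
        match = MEMBER_RE.match(stripped)
        if match:
            members.append({
                "id": int(match.group("id")),
                "local": match.group("local") is not None,
                "address": match.group("addr"),
                "load": int(match.group("load")),
                "state": match.group("state"),
            })
        elif stripped.startswith("Active PNOTEs:"):
            names = stripped.split(":", 1)[1]
            pnotes = [p.strip() for p in names.split(",") if p.strip()]
        elif stripped.startswith("Reason for state change:"):
            reason = stripped.split(":", 1)[1].strip()
    return {"members": members, "pnotes": pnotes, "reason": reason}


result = parse_cphaprob_state(SAMPLE)
local = next(m for m in result["members"] if m["local"])
print(local["state"], result["pnotes"], result["reason"])
```

For the sample above, the local member is reported as DOWN with three active PNOTEs, and the reason string is the one that points at the sync interface.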

RS_Daniel
Advisor

Hello,

Yes, Quantum Spark appliances have more issues and bugs than regular Gaia appliances. You have the COREXL PNOTE, so I would start there. Compare the output of "fw ctl multik stat" on both members. You can also check cpview > cpu > overview: both members should have the same number of CoreXL_FW and OTHER CPUs. Do they match? If not, you need to check this with TAC.

Regards

ikafka
Collaborator

TAC suggested an upgrade to version R81.10.07. I will rebuild the cluster after the upgrade and post whether the problem is solved.

Chris_Atkinson
Employee

sk174423 provides further guidance on CoreXL configuration for Spark appliances and how to align if different.

Is this the only PNOTE remaining?

CCSM R77/R80/ELITE