R80.30 to R81.10 Cluster

charlokt · ‎2022-09-23

Hello,

I have a site-to-site VPN that carries client card inquiries, and it works with my R80.30 cluster.

When I upgrade one member to R81.10 and make it active, clients can query their cards at first, but after 2 to 3 hours with the R81.10 member active the card data no longer loads (it seems to be related to the size of the card query file).

When a failover returns the R80.30 member to active, customers can query their cards again.

What could be going wrong in this upgrade? It only happens with this VPN; the other services work fine.

I have not been able to complete the upgrade to R81.10 because of this issue, which is critical for the core.

Greetings.

Chris_Atkinson · ‎2022-09-23

Was the latest JHF for R81.10 installed, and have you been troubleshooting the issue with TAC? What has already been attempted?

CCSM R77/R80/ELITE

charlokt · ‎2022-09-23

Yes. Last night we worked with a TAC engineer; for the first 3 hours everything worked fine. After that I had to make the R80.30 member active, and it works.

Current hotfix take: 66

Chris_Atkinson · ‎2022-09-23

Is MSS clamping configured? Can you compare a packet capture of a successful transaction with one of a failure?
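
For context on the MSS question, here is a rough sketch of the arithmetic behind clamping: the TCP payload per packet has to shrink by whatever the tunnel adds, otherwise larger responses need fragmentation while small ones still pass. The 80-byte tunnel overhead is an assumed, illustrative figure (real IPsec overhead depends on the cipher, NAT-T and so on), and `clamped_mss` is a hypothetical helper, not a Check Point command.

```python
def clamped_mss(link_mtu: int, tunnel_overhead: int,
                ip_header: int = 20, tcp_header: int = 20) -> int:
    """Largest TCP payload that still fits in one packet after tunnel overhead.

    tunnel_overhead is an assumption supplied by the caller; measure it
    from a packet capture rather than trusting a fixed number.
    """
    mss = link_mtu - tunnel_overhead - ip_header - tcp_header
    if mss <= 0:
        raise ValueError("overhead exceeds the link MTU")
    return mss


if __name__ == "__main__":
    print(clamped_mss(1500, 0))   # plain Ethernet, no tunnel: 1460
    print(clamped_mss(1500, 80))  # assumed 80 bytes of tunnel overhead: 1380
```

Comparing the MSS values in the SYN packets of a working and a failing capture against a figure like this may show whether the two members treat the tunnel differently.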

CCSM R77/R80/ELITE

the_rock · ‎2022-09-23

Just to make sure I'm not misunderstanding something: are you saying that the problem starts when the standby member is upgraded to R81.10 and then made active?

Best,
Andy
"Have a great day and if its not, change it"

charlokt · ‎2022-09-23

Yes, the upgrade is being done as follows:

One member is on R80.30 (Take 256) and the other is on R81.10 (Take 66).

When the R81.10 member is made active, users can open the page and query their cards for at least 2 or 3 hours. Then it fails, and for now I have to make the R80.30 member active again, which works. (It is a site-to-site VPN connection over which they run card status queries.)

I have not upgraded both members to R81.10 so that I keep the R80.30 member as a fallback, since everything works fine on it.

the_rock · ‎2022-09-23

OK, that's a tricky situation. I say that because even if you work with TAC, I'm not sure how much they would be able to help in a case like this, since the two members are currently on different versions. Did you try running any debugs when this fails on the R81.10 member?

Best,
Andy
"Have a great day and if its not, change it"

charlokt · ‎2022-09-23

So far only a CPinfo has been sent to TAC. I monitored the traffic for the first 2-3 hours. Then it fails on R81.10; once I make the R80.30 member active, it no longer fails after 3 hours.

the_rock · ‎2022-09-23

Is there anything in the logs or the messages file when this happens?

Best,
Andy
"Have a great day and if its not, change it"

charlokt · ‎2022-09-23

During the problem with customer card queries (6:30 a.m.), only these entries appear in the messages file:

Configuration changed from localhost by user admin by the service dbset
Sep 23 06:36:10 2022 XXXXXXX[22579]: admin localhost t +installer:update_status_message Processing candidates 98%

(Multiple similar messages, from 0% to 98%.)

Then I had to make the R80.30 member active:

Sep 23 07:22:38 2022 XXXXXX last message repeated 2 times
Sep 23 07:22:38 2022 XXXXXX fwk: CLUS-116505-1: State change: READY -> ACTIVE(!) | Reason: All other machines are dead (timeout), Interface Sync is down (Cluster Control Protocol packets are not received)
Sep 23 07:22:38 2022 XXXXXX fwk: CLUS-116505-1: State change: READY -> ACTIVE(!)

After the failover, the same message appears on the now-active R80.30 member:
admin localhost t +installer:update_status_message Processing candidates 71%
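
The state-change entries in the excerpt above have a regular shape, so they can be pulled out of a long messages file with a small script and lined up against the time the card queries started failing. This is a minimal sketch, assuming the line format matches the excerpts above; `parse_state_change` is a hypothetical helper, not a Check Point tool.

```python
import re
from typing import Optional

# Pattern inferred from the two "State change" lines quoted above (an assumption,
# not an official log format definition).
STATE_CHANGE = re.compile(
    r"(?P<code>CLUS-\d+-\d+): State change: "
    r"(?P<old>\S+) -> (?P<new>\S+)"
    r"(?: \| Reason: (?P<reason>.*))?$"
)


def parse_state_change(line: str) -> Optional[dict]:
    """Return code, old state, new state and reason, or None if no match."""
    match = STATE_CHANGE.search(line.strip())
    return match.groupdict() if match else None


if __name__ == "__main__":
    sample = ("Sep 23 07:22:38 2022 XXXXXX fwk: CLUS-116505-1: "
              "State change: READY -> ACTIVE(!)")
    print(parse_state_change(sample))
```

Lines without a reason, like the second entry above, come back with `reason` set to `None`.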

the_rock · ‎2022-09-23

Sadly, that message is not useful. Did TAC suggest leaving VPN debugs on until the issue happens?

Best,
Andy
"Have a great day and if its not, change it"

charlokt · ‎2022-09-23

The VPN does not log any disconnection. It is very strange: when users check the status of their cards, the file that should be displayed is not shown if it is larger than 1 KB.

the_rock · ‎2022-09-23

Sorry, but I'm out of ideas then. Just follow whatever TAC suggests next, I guess. Maybe someone else will have an idea you can try.

Best,
Andy
"Have a great day and if its not, change it"
