charlokt
Explorer

R80.30 to R81.10 Cluster

Hello,

I have a site-to-site VPN over which clients make card inquiries, and it works with my R80.30 cluster.

When I upgrade a member to R81.10 and that member becomes active, clients can query their cards at first, but after 2 to 3 hours with the R81.10 member active the results no longer load (it seems to be related to the size of the card query file).

When we fail over so that the R80.30 member is active again, customers can query their cards again.

What could be going wrong with this upgrade? It only happens with this VPN; the other services work fine.

I have not been able to complete the upgrade to R81.10 because of this issue, which is critical for the core.

Regards.

0 Kudos
12 Replies

Was the latest JHF for R81.10 installed, and have you been troubleshooting the issue with TAC? What has already been attempted?

0 Kudos
charlokt
Explorer

Yes. Last night we worked with a TAC engineer, and for the first 3 hours everything worked fine. I then had to make the R80.30 member active, and it works.

 

Current Hotfix 66

0 Kudos

Is MSS clamping configured? Can you compare a packet capture of a successful transaction with one of a failed transaction?
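To make the MSS question concrete, here is a minimal sketch of the arithmetic involved. It assumes IPv4 and TCP headers of 20 bytes each with no options; the `tunnel_overhead` value of 80 bytes is purely illustrative, since the real IPsec overhead depends on the cipher, integrity algorithm, and whether NAT-T is in use. The function name `tcp_mss` is hypothetical and not part of any Check Point tool.

```python
IPV4_HEADER = 20  # bytes, IPv4 header without options
TCP_HEADER = 20   # bytes, TCP header without options


def tcp_mss(mtu: int, tunnel_overhead: int = 0) -> int:
    """Largest TCP payload that fits in one packet without fragmentation.

    tunnel_overhead is the number of extra bytes the VPN encapsulation adds
    (an assumption you must measure or look up for your own tunnel).
    """
    mss = mtu - tunnel_overhead - IPV4_HEADER - TCP_HEADER
    if mss <= 0:
        raise ValueError("overhead leaves no room for TCP payload")
    return mss


if __name__ == "__main__":
    print(tcp_mss(1500))                      # plain Ethernet path
    print(tcp_mss(1500, tunnel_overhead=80))  # illustrative tunnel overhead
```

If the MSS seen in the SYN packets of a failing capture is larger than what the tunnel can carry, large responses could be dropped while small ones still pass.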

0 Kudos
the_rock
Champion

Just to make sure I'm not misunderstanding something: are you saying the problem starts when the standby member is upgraded to R81.10 and made active?

0 Kudos
charlokt
Explorer

Yes, the upgrade is being done as follows:

One member is on R80.30 (Take 256) and the other member is on R81.10 (Take 66).

When the R81.10 member is active, users can open the page and query their cards for at least two or three hours. Then it fails, and for now I have to make the R80.30 member active, at which point it works. (It is a site-to-site VPN connection over which they make card status queries.)

I have not moved both members to R81.10, so that I keep R80.30 as a fallback where everything works fine.

0 Kudos
the_rock
Champion

OK, that's a tricky situation. The reason I say that is that even if you work with TAC, I'm not sure how much they would be able to help in a case like this, since the two members are currently on different versions. Did you try running any debugs when this fails on the R81.10 member?

Andy

0 Kudos
charlokt
Explorer

Only a CPinfo has been sent to TAC for the moment. I tracked traffic for the first 2-3 hours. Then it fails on R81.10; I make the R80.30 member active, and it no longer fails after 3 hours.

0 Kudos
the_rock
Champion

Are there any entries in the logs or the messages file when this happens?

0 Kudos
charlokt
Explorer

While the problem with customer card queries was occurring (6:30 a.m.), only these records appear in the messages file:

Configuration changed from localhost by user admin by the service dbset
Sep 23 06:36:10 2022 XXXXXXX[22579]: admin localhost t +installer:update_status_message Processing candidates 98%

From 0% to 98% (multiple similar messages)

Then I had to make the R80.30 member active:

Sep 23 07:22:38 2022 XXXXXX last message repeated 2 times
Sep 23 07:22:38 2022 XXXXXX fwk: CLUS-116505-1: State change: READY -> ACTIVE(!) | Reason: All other machines are dead (timeout), Interface Sync is down (Cluster Control Protocol packets are not received)
Sep 23 07:22:38 2022 XXXXXX fwk: CLUS-116505-1: State change: READY -> ACTIVE(!)

After the change, the same kind of message appears on the active R80.30 member:
admin localhost t +installer:update_status_message Processing candidates 71%

 

0 Kudos
the_rock
Champion

Sadly, that message is not useful. Did TAC suggest leaving VPN debugs on until the issue happens?

0 Kudos
charlokt
Explorer

The VPN does not log any disconnection. It is very strange: when users check the status of their cards, the file that should be shown to them is not displayed if it is larger than 1 KB.
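One way to pin down the exact size at which responses stop getting through is a binary search over payload sizes. The sketch below assumes a hypothetical `probe(size)` callable that returns `True` when a transfer of `size` bytes succeeds across the tunnel (for example, a wrapper around a test request); it also assumes that once a size fails, every larger size fails too.

```python
from typing import Callable


def largest_passing_size(
    probe: Callable[[int], bool], low: int = 1, high: int = 1500
) -> int:
    """Return the largest size in [low, high] for which probe(size) is True.

    Returns 0 if no size in the range passes. probe is a hypothetical
    user-supplied check; the search assumes failures are monotonic in size.
    """
    best = 0
    while low <= high:
        mid = (low + high) // 2
        if probe(mid):
            best = mid       # mid passes, try larger sizes
            low = mid + 1
        else:
            high = mid - 1   # mid fails, try smaller sizes
    return best


if __name__ == "__main__":
    # Stand-in probe that simulates a path dropping anything over 1024 bytes.
    print(largest_passing_size(lambda size: size <= 1024))
```

Comparing the threshold found while each member is active would show whether the limit really differs between R80.30 and R81.10.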

0 Kudos
the_rock
Champion

Sorry, but I'm out of ideas then... just follow whatever TAC suggests next, I guess. Maybe someone else will have an idea you can try.

0 Kudos