Hello guys,
I have an issue with my VSX cluster - to be specific, with just one VS in it. The VS itself is quite simple: it has two physical interfaces, one for the outside and one for the inside. The inside one is a VLAN interface with about 6-7 VLANs connected to it, while the outside one uses only one VLAN. The VSX cluster runs VSLS, and since it consists of two physical devices, the VSLS configuration has been adjusted so that each member is active for one of the (currently) two VSs (device 1 => active for VS1, device 2 => active for VS2). VS1 is working perfectly fine, but for VS2 I can see the following behavior:
=> Both the active and the standby member can ping a public IP, e.g. 1.1.1.1, via their outside interface. This works without any problems for everything that can be reached via the outside interface.
=> Only the active member can ping an internal IP via the internal interface (on a specific VLAN, 255, for my current tests). When the standby member tries to reach the test system via ICMP, it does not receive any replies. However, in a packet capture on the active member I could see that when the standby member pings the test host, the reply traffic is actually forwarded to the active member instead of the standby one.
I have not seen such behavior before and wanted to ask whether any of you is familiar with such an issue, and what the possible root cause could be. Please note that, in addition to the issue just described, I am currently also facing general issues with CCP traffic on the internal interface. Both members send out CCP packets, but neither receives the other member's traffic. "fw zdebug" did not list any drops, regardless of the CCP mode (tested automatic, multicast & broadcast). For this issue I am in contact with our network team in order to verify the switches and what they actually receive. I am not sure if this is related at all, but I think it's good to mention for the overall picture.
Additional information:
- Devices: 2x 23k appliances
- VSX software: GAiA R80.20 (Jumbo Hotfix Take 47 - required because we have a custom hotfix running on top of it)