Hello,
I've been setting up a lab to get more hands-on experience with VSX clusters and VSLS (R81.10 Take 95).
It took a moment to get all of the blade updates, and after reading through sk106496 (Software Blades updates on VSX - FAQ), I see that the majority of the blade updates are 'proxied' in some way via the VSX gateways. Regarding updatable objects, the same SK mentions that the VS itself needs direct access to the internet for updates.
I got some initial errors in my threat policy install, since I already had some exceptions that used these objects.
I took a look at sk121877 (Package of Updatable Objects is missing on the Security Gateway) and noticed that the files were missing within the VS. After I got internet access resolved, the package did indeed update and the errors went away.
But... my issue is: how do you resolve this on the Standby VS? I still get policy install errors on my 2nd VSX node, since it doesn't have any updates. Since only one VS is technically active, how do you ensure that the updates are present in the event you need to fail over? Is there any synchronization of these updates between the active and standby VS members? I'm tempted to shut down one member and see if the 2nd box eventually updates, but it seems like I am missing something here.
Regarding cluster failovers, failing over the main box (VS0) feels very similar to a typical ClusterXL setup. It felt a little weird to fail over VS0 and then still see the single VS I had (VS1) running as active there. But I understand from reading further SKs (sk95133) that this is normal and that you can fail over an individual VS. So it's kinda cool to jump into each virtual system and fail over one by one.
Now... my question 🙂 I am used to a non-VSX cluster where I can set the priority of the cluster members, allowing me to fail over from member A to member B AND keep member B as the active node when member A comes back up.
What I noticed with both VS0 and VS1 is that they both go back to member A once that node is available again. Is this by design, and is it configurable? I can see why the VSs themselves might do this to balance load in a VSLS setup, but I am curious about the sticky priority of VS0. I am trying to understand more about the typical patching situation, where you would want to gracefully move everything active off one chassis so you can patch/reboot it.
I see sk56060 as a reference point, and it notes that "If there are only two physical VSX members, then the simplest way would be to run the clusterXL_admin down command (refer to sk55081: Best Practices - Manual fail-over in ClusterXL) on the VSX cluster member, from which we move the instances of Virtual Systems in Active state." If I do this on VS0 and VS1 to isolate a node (member A, for example), wouldn't that member just come back as "active" after reboot? Or do you use the '-p' option flag so the member stays down after reboot?
Is the "vsx_util redistribute_vsls" method they mention the best way to go each time: move everything to one cluster member, then move it to the other node, and finally revert to normal at the end to 'balance' back?
The main situation here is upgrades and JHF patching, so if there is an SK that covers that, I'll take a look 😉
I know that's a lot, so I appreciate any time taken to review this and point me in the right direction 😉