Let me make sure I understand this properly. Say, just as an example, you have an HA cluster (active/passive), let's call it cp-cluster, where cp01 is the master and cp02 is the standby. Are you saying that when cp01 is active everything works fine, but when cp02 is active and cp01 is standby, that's when you have a problem connecting to 2 out of 3 remote sites?
If so, then we would need to run a bunch of captures and VPN debugs to figure out why:
vpn debug trunc
vpn debug ikeon
-generate some traffic
vpn debug ikeoff
Get the ike.elg and vpnd.elg files from the $FWDIR/log dir.
Also, it would not hurt to run fw monitor to see what happens with the traffic.
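If it helps, here is a rough sketch of that sequence as a small script, to be run on whichever member is active when the problem shows up. It only prints the commands unless you set RUN=1. PEER_IP, OUT_DIR, RUN and the step helper are my own placeholders, not anything from your setup, and the fw monitor filter is just one plausible way to narrow the capture to a single peer.

```shell
#!/bin/sh
# Sketch only: prints each command by default; set RUN=1 to actually run them.
# PEER_IP and OUT_DIR are hypothetical placeholders, adjust to your environment.
PEER_IP="${PEER_IP:-203.0.113.10}"
OUT_DIR="${OUT_DIR:-/var/tmp/vpn_debug}"
RUN="${RUN:-0}"
# Falls back to the literal text $FWDIR when the variable is not set (dry run off-box).
LOG_DIR="${FWDIR:-\$FWDIR}/log"

# Print the command, and execute it only when RUN=1.
step() {
    echo "+ $*"
    if [ "$RUN" = "1" ]; then
        "$@"
    fi
}

step vpn debug trunc     # start with fresh debug files
step vpn debug ikeon     # turn IKE debug on

echo "# Generate traffic to the failing remote sites now."
echo "# In a second session, assuming the peer is $PEER_IP:"
echo "#   fw monitor -e \"accept host($PEER_IP);\""
if [ "$RUN" = "1" ]; then
    printf 'Press Enter once the traffic has been generated: '
    read -r reply
fi

step vpn debug ikeoff    # turn IKE debug off

# Copy the debug files somewhere safe before they get overwritten.
step mkdir -p "$OUT_DIR"
step cp "$LOG_DIR/ike.elg" "$LOG_DIR/vpnd.elg" "$OUT_DIR/"
```

Run it once as-is to check the printed commands look right, then again with RUN=1 after failing over to cp02.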
Cheers mate.
Andy