This "Cluster policy installation failed" message no longer means only that the atomic load/commit failed or timed out on one of the cluster members. In R81+ it can also indicate that a cluster sanity check failed during policy installation. You'll need to look in $FWDIR/log/cphaconf.elg on both members for clues about what is wrong. So far I've seen this message indicate:
1) One of the cluster members is set for MVC and one is not (sk179969: Policy installation fails with error "Policy installation failed on gateway. Clusterpolicy...")
2) The state of cluster enablement in cpconfig is incorrect: enabled for a non-cluster object, or disabled for a gateway that is part of a cluster object (sk180980: Policy installation failure with error message "Policy installation failed on gateway. Clu...")
There are probably some other sanity checks I haven't run into yet.
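As a rough sketch of the log check mentioned above, something like the following could be run on each member right after a failed install. The function name `scan_elg` and the keyword list are my own assumptions, not documented cphaconf message strings, so treat any hits as a starting point for reading the full log.

```shell
# Hypothetical helper: print recent log lines that look like failures.
# The keyword list is a guess, not a documented set of cphaconf messages.
scan_elg() {
    log_file="${1:-$FWDIR/log/cphaconf.elg}"
    if [ ! -r "$log_file" ]; then
        echo "cannot read $log_file" >&2
        return 2
    fi
    # Look only at the tail so old entries don't drown out the last install.
    tail -n 200 "$log_file" | grep -i -E 'fail|error|mismatch|timeout'
}

# Run on each cluster member after the failed policy install:
# scan_elg
```

The function returns grep's exit status, so a non-zero result with no output just means none of the guessed keywords appeared in the last 200 lines.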
The fact that you can't ARP on the sync network is a definite problem, and may be another of the new sanity checks: making sure that the sync network is working, assuming state sync is enabled on the cluster object. ARP is never denied by a security policy or antispoofing, so the ARP failure on the sync network is where I'd look. The high CPU is probably a symptom of the problem rather than the cause, unless it is so extreme that it is causing a commit timeout on one of the gateways.
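One way to confirm the ARP symptom from expert mode is to check whether the peer's sync address has a resolved entry in the neighbor table. This is only a sketch: `peer_arp_resolved` is a hypothetical helper, and the interface name `eth2` and address `192.0.2.2` are placeholders for your own sync interface and peer sync IP.

```shell
# Hypothetical helper: succeed only if the neighbor table (read from stdin)
# holds a resolved MAC address for the given peer IP.
peer_arp_resolved() {
    peer_ip="$1"
    # Resolved entries typically carry an "lladdr <mac>" field;
    # FAILED and INCOMPLETE entries typically do not.
    awk -v ip="$peer_ip" '$1 == ip && /lladdr/ { found = 1 } END { exit (found ? 0 : 1) }'
}

# On each member (interface and address below are placeholders):
# ping -c 3 -I eth2 192.0.2.2
# ip neigh show dev eth2 | peer_arp_resolved 192.0.2.2 && echo "ARP OK" || echo "ARP not resolving"
# cphaprob -a if    # lists the interfaces ClusterXL is monitoring, including sync
```

If the entry never resolves on either member, that points at the sync network itself rather than at the policy.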