DHCP Relay Troubles when both Cluster Members are Online

We're running R80.40 jumbo main Take 77.
We're having a strange issue that we can replicate quite easily. We have an IP subnet, our Guest network, that relays DHCP through the Check Point gateway back to our internal DHCP server (Windows) to obtain addresses.
With both cluster members brought up via "clusterXL_admin up", as shown below, new DHCP clients fail to obtain an IP address.
[Expert@CP-GW5600-HA:0]# cphaprob state
Cluster Mode:   High Availability (Active Up) with IGMP Membership
ID         Unique Address  Assigned Load   State          Name
1        100%            ACTIVE         CP-GW5600
2 (local)        0%              STANDBY        CP-GW5600-HA
If I run "clusterXL_admin down" on one of the members, new clients still cannot obtain an IP address. If we then reboot the member we "downed" and it stays offline for a couple of minutes, the active cluster member starts handling DHCP requests and new clients are able to obtain an IP address. Such a client keeps its IP address even after the rebooted member comes back up and goes to a STANDBY state. If that same client forgets the network or releases its IP address, it is then unable to renew or obtain a new address until one of the cluster members goes offline for a reboot.
To me, this appears to be either a bug in R80.40 Take 77 or some sort of MAC caching issue with the cluster VIP when the members are in an Active/Standby state.
The strange part is that this also happened 2 weeks ago, and rebooting one of the cluster members fixed the issue at that time. This morning it started happening again out of the blue, with no admins working on the firewalls.
Thoughts on what could be happening and how to solve it? 
I assume this worked previously?
Have you done any troubleshooting to see what the issue might be?

Hi, yes, it had been working for a couple of years with little to no issues until recently. We’ve been on R80.40 Take 77 for 6-8 months with no issues. I’ve been working with CP support and we started to run through that same article you listed, but the case is now being escalated and I plan to reach out again tomorrow. I figured I’d post here to see if anyone has any suggestions.

At this point, support has been unable to find any dropped traffic, but the engineer wanted to escalate the case to someone more familiar with tracking down these types of logs and traffic.
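
For anyone trying to narrow down something similar, here is a rough sketch of how the DHCP relay traffic and any kernel drops could be watched on each cluster member. It assumes expert mode on Gaia with the standard tcpdump and "fw ctl zdebug" tools available. The interface name is a placeholder rather than anything from our setup, and the grep pattern assumes drop messages print endpoints as ip:port, so adjust it if your output looks different.

```shell
#!/bin/bash
# Sketch only: run in expert mode on each cluster member.
# Interface names passed in are placeholders, not taken from this thread.

# BPF filter matching the DHCP/BOOTP server (67) and client (68) ports.
DHCP_FILTER='udp and (port 67 or port 68)'

# Capture DHCP traffic on one interface.
# -nn: no name or port resolution, -e: print source/destination MAC addresses.
capture_dhcp() {
    tcpdump -nn -e -i "$1" "$DHCP_FILTER"
}

# Show kernel drop messages that mention port 67 or 68.
# Assumption: drops are printed with endpoints in "ip:port" form.
watch_dhcp_drops() {
    fw ctl zdebug + drop | grep -E ':(67|68)([^0-9]|$)'
}

# Example (hypothetical interface name):
#   capture_dhcp eth1
#   watch_dhcp_drops
```

Running the capture on both the client-facing and the server-facing interface of the active member should show whether requests arrive, whether they are relayed to the DHCP server, and which MAC addresses the replies use, which would help confirm or rule out the MAC caching theory.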


Hi Rory - did you ever get a resolution for this problem?

