Hi,
I am currently facing a strange problem. At least it's strange to me; I hope it isn't for you, so that you can help :).
Status Quo
Three devices, A, B and C, are in the same VLAN x. Devices B and C form a cluster, and device C is the device under investigation.
Device C is connected to this VLAN / network via two interfaces that form a bond; let's say eth1 and eth2 form bond1.
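For reference, the bond state on device C could be summarised with something like the sketch below. It assumes device C runs Linux with the kernel bonding driver, which exposes its status as text under /proc/net/bonding/bond1. The function name and the sample text are hypothetical; in particular, the bonding mode and MAC addresses in the sample are invented and not taken from my setup.

```python
from pathlib import Path

# Hypothetical sample in the layout the Linux bonding driver uses for
# /proc/net/bonding/<bond>. Mode and MAC addresses are invented.
SAMPLE_BOND = """\
Bonding Mode: fault-tolerance (active-backup)
Currently Active Slave: eth1
MII Status: up

Slave Interface: eth1
MII Status: up
Permanent HW addr: 02:00:00:00:00:01

Slave Interface: eth2
MII Status: down
Permanent HW addr: 02:00:00:00:00:02
"""


def parse_bond_status(text):
    """Extract the mode, the active slave and per-slave details from bond status text."""
    status = {"mode": None, "active_slave": None, "slaves": {}}
    current = None  # name of the slave section currently being read
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank line or a line without "key: value" layout
        key, value = key.strip(), value.strip()
        if key == "Bonding Mode":
            status["mode"] = value
        elif key == "Currently Active Slave":
            status["active_slave"] = value
        elif key == "Slave Interface":
            current = value
            status["slaves"][current] = {}
        elif current is not None and key in ("MII Status", "Permanent HW addr"):
            status["slaves"][current][key] = value
    return status


if __name__ == "__main__":
    # Read the real file when it exists, otherwise fall back to the sample.
    path = Path("/proc/net/bonding/bond1")
    text = path.read_text() if path.exists() else SAMPLE_BOND
    print(parse_bond_status(text))
```

Comparing this output between Situation II and Situation III would show which slave is active and which MAC address each slave reports in each case.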
Situation I
When eth1 and eth2 are up, device C can ping device A and device B => All good.
Situation II
When disabling eth1 and leaving eth2 up, device C can ping device A and device B => All good.
Situation III
When disabling eth2 and leaving eth1 up, after a while (ARP timeout) device C can no longer ping device B, but it can still ping device A and all other devices within this VLAN.
It can be observed that the ARP entry for device B changes to state "incomplete" after a while on device C. ARP requests from device C for device B are visible via tcpdump on device C and device B. ARP replies are visible on device B but not on device C. If the ARP entry for device A is deleted manually on device C, both ARP requests and replies can be observed on device C and device A.
To my understanding it cannot be a physical issue. Why would a ping to device A be possible but not to device B if this cable were no longer working correctly? Why can device C receive ARP replies from device A and all other devices besides device B? Why does device C receive ARP replies again once the interface (eth2) is back up?
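The ARP observation above could be tracked over time with a sketch like the following. It assumes device C is a Linux host with the iproute2 `ip` tool, whose `ip neigh show` output ends each line with the neighbour state. The function name is hypothetical, and the sample addresses come from the documentation range 192.0.2.0/24, not from my network.

```python
import shutil
import subprocess

# Hypothetical sample in the layout printed by `ip neigh show`.
SAMPLE_NEIGH = """\
192.0.2.10 dev bond1 lladdr 02:00:00:00:00:0a REACHABLE
192.0.2.11 dev bond1  INCOMPLETE
"""


def unresolved_neighbors(ip_neigh_output):
    """Return (ip, device, state) tuples for entries that are INCOMPLETE or FAILED."""
    unresolved = []
    for line in ip_neigh_output.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        ip, state = fields[0], fields[-1]
        # The device name follows the "dev" keyword when it is printed.
        dev = fields[fields.index("dev") + 1] if "dev" in fields[:-1] else None
        if state in ("INCOMPLETE", "FAILED"):
            unresolved.append((ip, dev, state))
    return unresolved


if __name__ == "__main__":
    # Use the live neighbour table when `ip` is available, else the sample.
    if shutil.which("ip"):
        result = subprocess.run(
            ["ip", "neigh", "show"], capture_output=True, text=True, check=False
        )
        output = result.stdout
    else:
        output = SAMPLE_NEIGH
    for entry in unresolved_neighbors(output):
        print(entry)
```

Running this periodically on device C in Situation III should show the moment the entry for device B stops resolving while the entry for device A stays resolved.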
Thanking you in advance
k_b