Hi there,
I'm about to roll out IPv6 in our enterprise network, and am testing various scenarios in the lab before rolling out in production. Something is puzzling me and telling me this is not right...
We're running 2 x 5800 appliances in HA (active/passive). For IPv4 we run OSPFv2 to announce a default route and a few connected routes, and this works like a charm - not a single hiccup/ping loss when failing over between the two nodes. The other OSPF neighbors see no state change either, as the process is clustered.
But when I try to mimic the same setup for IPv6 (using OSPFv3), the OSPF session drops to INIT (as seen on a neighboring device), which causes downtime until it reconverges and returns to FULL.
I have both an IPv4 and an IPv6 ping running towards two distant hosts. When flipping over the nodes (using clusterXL_admin down on the active node), there are no ping timeouts on the IPv4 ping, but IPv6 fails immediately and only comes back when OSPFv3 reconverges (after about 16 ping timeouts - at least 15 pings too many 😉).
<CR_LAB>%Nov 23 10:53:53:483 2020 CR_LAB OSPFV3/5/OSPFv3_NBR_CHG: OSPFv3 1 Neighbor 172.20.10.10(Vlan-interface10) received 1-Way and its state from FULL to INIT.
<CR_LAB>
<CR_LAB>%Nov 23 10:54:06:456 2020 CR_LAB OSPFV3/5/OSPFv3_NBR_CHG: OSPFv3 1 Neighbor 172.20.10.10(Vlan-interface10) received LoadingDone and its state from LOADING to FULL.
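For the record, the gap between those two log lines roughly matches the ~16 lost pings. A quick sketch (illustration only, just parsing the two timestamps above):

```python
from datetime import datetime

# Timestamps from the CR_LAB log: FULL -> INIT at 10:53:53.483,
# LOADING -> FULL at 10:54:06.456 (same day, so time-of-day is enough).
t_down = datetime.strptime("10:53:53.483", "%H:%M:%S.%f")
t_up = datetime.strptime("10:54:06.456", "%H:%M:%S.%f")

gap = (t_up - t_down).total_seconds()
print(f"neighbor was out of FULL for {gap:.1f} s")  # ~13.0 s
```

So the neighbor spends about 13 seconds out of FULL, plus hello processing on either side, which lines up with the observed ping loss.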
CR_LAB is a core router (HPE Comware) - router-id 172.20.127.11
FW-A and FW-B are Check Point R80.40 JHF Take_87 nodes.
<CR_LAB>disp ospfv3 peer 172.20.10.10
OSPFv3 Process 1 with Router ID 172.20.127.11
Area 0.0.0.0 interface Vlan-interface10's neighbors
Router ID: 172.20.10.10 Address: FE80::131:10
State: Full Mode: Nbr is slave Priority: 1
DR: 172.20.127.11 BDR: 172.20.127.81 MTU: 1500
Options is 0x000013 (-|R|-|x|E|V6)
Dead timer due in 00:00:37
Neighbor is up for 03:38:55
Neighbor state change count: 16
Database Summary List 0
Link State Request List 0
Link State Retransmission List 0
Neighbor interface ID: 168461066
GR state: Normal
Grace period: 0 Grace period timer: Off
DD Rxmt Timer: Off LS Rxmt Timer: Off
The Check Point cluster is using the same router ID, and I can confirm the link-local IP is identical on the ClusterXL interface.
These are the relevant OSPFv3 configuration lines:
set ipv6 ospf3 instance default rfc1583-compatibility off
set ipv6 ospf3 instance default graceful-restart-helper on
set ipv6 ospf3 instance default area backbone on
set ipv6 ospf3 instance default interface eth1 area backbone on
set ipv6 ospf3 instance default interface eth1 cost 100
set ipv6 ospf3 instance default interface eth1 priority 1
set ipv6 ospf3 instance default export-routemap export_ipv6 preference 1 on
set routemap export_ipv6 id 100 on
set routemap export_ipv6 id 100 allow
set routemap export_ipv6 id 100 match network ::/0 exact
set routemap export_ipv6 id 100 match protocol static
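In other words, the route map is meant to export only the static default route: a prefix is redistributed only if it matches ::/0 exactly and comes from protocol static. A small sketch of that match logic (illustration only, not Gaia code; Route and matches_export are hypothetical names):

```python
from dataclasses import dataclass
from ipaddress import IPv6Network

@dataclass
class Route:
    prefix: IPv6Network
    protocol: str

def matches_export(route: Route) -> bool:
    """Mimics routemap export_ipv6 id 100: all match clauses must hold."""
    exact_default = route.prefix == IPv6Network("::/0")  # 'match network ::/0 exact'
    is_static = route.protocol == "static"               # 'match protocol static'
    return exact_default and is_static

print(matches_export(Route(IPv6Network("::/0"), "static")))           # True
print(matches_export(Route(IPv6Network("2001:db8::/32"), "static")))  # False
```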
I know GR isn't applicable when using ClusterXL, but that's the default setting, and the behavior is the same with it turned off.
Does anyone have a clue what I've done wrong? Is a clustered OSPFv3 process running on ClusterXL really supposed to change neighbor state during failover?
Thanks in advance,
Morten Gade Sørensen