Create a Post
cancel
Showing results for 
Search instead for 
Did you mean: 
Vladimir
Champion
Champion

Is the Link State Monitoring dead?

This subject is occasionally being brought-up and Tim Hall has written about CCP behaviour on interfaces connected to VLANs with no pingable hosts.

I do recall, from IPSO days, that there was a feature called Link State Monitoring and have decided to give it a shot on a pair of 15400s running R80.10 and destined to be connected to a L2 only switch.

Even with "Link State Monitoring" enabled on both members (sk31336), when HA cluster members are connected to a Layer 2 switch, the only interfaces of the cluster that remain "UP" during reboot of the second HA member, are those that have other pingable hosts on same VLAN:


[Expert@CXLM01:0]# cat $FWDIR/boot/modules/fwkern.conf
fwha_forw_packet_to_not_active=1
fwha_monitor_if_link_state=1
[Expert@CXLM01:0]#

[Expert@CXLM01:0]# cphaprob stat

Cluster Mode: High Availability (Active Up) with IGMP Membership

Number Unique Address Assigned Load State

1 (local) 192.168.8.11 100% Active Attention
2 192.168.8.12 0% ClusterXL Inactive or Machine is Down

Local member is in current state since Thu Jun 21 08:14:53 2018

[Expert@CXLM01:0]#
[Expert@CXLM01:0]# cphaprob -a if

Required interfaces: 7
Required secured interfaces: 1

eth2-01 Inbound: DOWN (48.8 secs) Outbound: DOWN (49 secs) non sync(non secured), broadcast
eth2-03 UP non sync(non secured), broadcast (External interface connected to the VLAN with the router)
eth2-04 Disconnected non sync(non secured), broadcast
eth2-05 Inbound: DOWN (48.6 secs) Outbound: DOWN (49 secs) non sync(non secured), broadcast
Mgmt UP non sync(non secured), broadcast (Management interface connected to the VLAN with SMS)
Sync Inbound: DOWN (48.8 secs) Outbound: DOWN (48.9 secs) sync(secured), broadcast
eth3-01 Inbound: DOWN (48.6 secs) Outbound: DOWN (48.9 secs) non sync(non secured), broadcast (eth3-01.332)
eth3-01 Inbound: DOWN (48.6 secs) Outbound: DOWN (48.9 secs) non sync(non secured), broadcast (eth3-01.24)

Virtual cluster interfaces: 10

eth2-01 10.255.100.10 VMAC address: 00:1C:7F:00:00:01
eth2-03 192.168.8.9 VMAC address: 00:1C:7F:00:00:01
eth2-05 10.255.101.10 VMAC address: 00:1C:7F:00:00:01
Mgmt 192.168.20.60 VMAC address: 00:1C:7F:00:00:01
eth3-01.48 192.168.48.10 VMAC address: 00:1C:7F:00:00:01
eth3-01.332 172.16.32.10 VMAC address: 00:1C:7F:00:00:01
eth3-01.40 192.168.40.10 VMAC address: 00:1C:7F:00:00:01
eth3-01.24 192.168.24.10 VMAC address: 00:1C:7F:00:00:01
eth3-01.32 192.168.32.10 VMAC address: 00:1C:7F:00:00:01
eth3-01.324 172.16.24.10 VMAC address: 00:1C:7F:00:00:01

[Expert@CXLM01:0]#

When testing a cluster failover using "clusterXL_admin down/up", you may be lulled in a false sense of security, since failover will be flowless: "Standby" will become "Active" and all the interfaces of both member will remain "Up".

Member 1:

[Expert@CXLM01:0]# cphaprob stat

Cluster Mode: High Availability (Active Up) with IGMP Membership

Number Unique Address Assigned Load State

1 (local) 192.168.254.1 100% Active
2 192.168.254.2 0% Standby

Local member is in current state since Thu Jun 21 08:14:53 2018

[Expert@CXLM01:0]# clusterXL_admin down
Setting member to administratively down state ...
Member current state is Down
[Expert@CXLM01:0]# cphaprob stat

Cluster Mode: High Availability (Active Up) with IGMP Membership

Number Unique Address Assigned Load State

1 (local) 192.168.254.1 0% Down
2 192.168.254.2 100% Active

Local member is in current state since Thu Jun 21 08:32:58 2018

[Expert@CXLM01:0]# cphaprob -a if

Required interfaces: 7
Required secured interfaces: 1

eth2-01 UP non sync(non secured), broadcast
eth2-03 UP non sync(non secured), broadcast
eth2-04 Disconnected non sync(non secured), broadcast
eth2-05 UP non sync(non secured), broadcast
Mgmt UP non sync(non secured), broadcast
Sync UP sync(secured), broadcast
eth3-01 UP non sync(non secured), broadcast (eth3-01.332)
eth3-01 UP non sync(non secured), broadcast (eth3-01.24)

Virtual cluster interfaces: 10

eth2-01 10.255.100.10 VMAC address: 00:1C:7F:00:00:01
eth2-03 192.168.8.9 VMAC address: 00:1C:7F:00:00:01
eth2-05 10.255.101.10 VMAC address: 00:1C:7F:00:00:01
Mgmt 192.168.20.60 VMAC address: 00:1C:7F:00:00:01
eth3-01.48 192.168.48.10 VMAC address: 00:1C:7F:00:00:01
eth3-01.332 172.16.32.10 VMAC address: 00:1C:7F:00:00:01
eth3-01.40 192.168.40.10 VMAC address: 00:1C:7F:00:00:01
eth3-01.24 192.168.24.10 VMAC address: 00:1C:7F:00:00:01
eth3-01.32 192.168.32.10 VMAC address: 00:1C:7F:00:00:01
eth3-01.324 172.16.24.10 VMAC address: 00:1C:7F:00:00:01

[Expert@CXLM01:0]#

Member 2:

[Expert@CXLM02:0]# cphaprob stat

Cluster Mode: High Availability (Active Up) with IGMP Membership

Number Unique Address Assigned Load State

1 192.168.254.1 100% Active
2 (local) 192.168.254.2 0% Standby

Local member is in current state since Thu Jun 21 08:18:04 2018

[Expert@CXLM02:0]# cphaprob stat

Cluster Mode: High Availability (Active Up) with IGMP Membership

Number Unique Address Assigned Load State

1 192.168.254.1 0% Down
2 (local) 192.168.254.2 100% Active

Local member is in current state since Thu Jun 21 08:32:15 2018

[Expert@CXLM02:0]# cphaprob -a if

Required interfaces: 7
Required secured interfaces: 1

eth2-01 UP non sync(non secured), broadcast
eth2-03 UP non sync(non secured), broadcast
eth2-04 Disconnected non sync(non secured), broadcast
eth2-05 UP non sync(non secured), broadcast
Mgmt UP non sync(non secured), broadcast
Sync UP sync(secured), broadcast
eth3-01 UP non sync(non secured), broadcast (eth3-01.332)
eth3-01 UP non sync(non secured), broadcast (eth3-01.24)

Virtual cluster interfaces: 10

eth2-01 10.255.100.10 VMAC address: 00:1C:7F:00:00:01
eth2-03 192.168.8.9 VMAC address: 00:1C:7F:00:00:01
eth2-05 10.255.101.10 VMAC address: 00:1C:7F:00:00:01
Mgmt 192.168.20.60 VMAC address: 00:1C:7F:00:00:01
eth3-01.48 192.168.48.10 VMAC address: 00:1C:7F:00:00:01
eth3-01.332 172.16.32.10 VMAC address: 00:1C:7F:00:00:01
eth3-01.40 192.168.40.10 VMAC address: 00:1C:7F:00:00:01
eth3-01.24 192.168.24.10 VMAC address: 00:1C:7F:00:00:01
eth3-01.32 192.168.32.10 VMAC address: 00:1C:7F:00:00:01
eth3-01.324 172.16.24.10 VMAC address: 00:1C:7F:00:00:01

[Expert@CXLM02:0]#

This is a bit annoying, especially from the point of view of monitoring interface states on remote equipment and possible automation actions associated with state changes.

Anyone cares to chime-in?

3 Replies
PhoneBoy
Admin
Admin

CCP monitors the entire path (not just the link).

As such, what you're observing is expected behavior.

In R80.20, we'll have a way to monitor just the link (without the path).

Rolf_Sommerhald
Explorer

In R80.20, we'll have a way to monitor just the link (without the path).


That's exactly what we need, but could not find yet how to do it in ClusterXL R80.20 Administration Guide
, besides the new Automatic and Unicast Modes for CCP.

Where can we find more about checking Link State only (on Gaia), and how to disable end-to-end path checking by CCP on selected interfaces? Thanks.

0 Kudos
PhoneBoy
Admin
Admin

It's possible this requires the new kernel...will check.

0 Kudos

Leaderboard

Epsum factorial non deposit quid pro quo hic escorol.

Upcoming Events

    CheckMates Events