MladenAntesevic
Collaborator

Cluster failover when enabling SNMP

Hello,

I am facing a strange issue when I try to enable SNMP v3 read-only on my cluster gateways. I have two 6600 gateways in a cluster running R81. I have tried twice to enable SNMP v3 through the Gaia Portal, and both times, when I pressed Apply, I was disconnected from the Gaia Portal without any changes being made. Investigating further, I noticed that both times a cluster failover occurred, with the following status:

CP-2> cphaprob state

Cluster Mode: High Availability (Active Up) with IGMP Membership

ID         Unique Address  Assigned Load   State     Name

1          10.xxx.xxx.6    0%              STANDBY   CP-1
2 (local)  10.xxx.xxx.7    100%            ACTIVE    CP-2


Active PNOTEs: None

Last member state change event:
Event Code: CLUS-114704
State change: STANDBY -> ACTIVE
Reason for state change: No other ACTIVE members have been found in the cluster
Event time: Thu Apr 1 22:33:50 2021

Last cluster failover event:
Transition to new ACTIVE: Member 1 -> Member 2
Reason: ROUTED PNOTE
Event time: Thu Apr 1 22:33:50 2021

Cluster failover count:
Failover counter: 3
Time of counter reset: Fri Mar 26 10:54:34 2021 (reboot)


So I get this ROUTED PNOTE and a cluster failover every time I try to enable SNMP v3. The cluster runs OSPF and peers with Cisco Nexus switches and Cisco 4431 routers. Can anyone advise how to solve this issue?
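To compare the failover reason and counter before and after each SNMP attempt, the `cphaprob state` output can be captured to a text file and parsed. The following is only a minimal sketch: the function name `parse_cphaprob_state` and the `SAMPLE` text are illustrative (the sample is trimmed from the output above), and it assumes the "section header ending in a colon, then `key: value` lines" layout seen in this post, which may differ on other versions.

```python
# Hypothetical helper, not a Check Point tool: parses saved `cphaprob state` text.
SAMPLE = """\
Cluster Mode: High Availability (Active Up) with IGMP Membership

Active PNOTEs: None

Last cluster failover event:
Transition to new ACTIVE: Member 1 -> Member 2
Reason: ROUTED PNOTE
Event time: Thu Apr 1 22:33:50 2021

Cluster failover count:
Failover counter: 3
Time of counter reset: Fri Mar 26 10:54:34 2021 (reboot)
"""


def parse_cphaprob_state(text):
    """Return {(section, key): value} for every 'key: value' line."""
    info = {}
    section = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.endswith(":"):
            # A line such as "Last cluster failover event:" opens a section.
            section = line[:-1]
            continue
        if ":" in line:
            # Split on the first colon only, so timestamps stay intact.
            key, value = line.split(":", 1)
            info[(section, key.strip())] = value.strip()
    return info


if __name__ == "__main__":
    state = parse_cphaprob_state(SAMPLE)
    print(state[("Last cluster failover event", "Reason")])
    print(state[("Cluster failover count", "Failover counter")])
```

Running this against a capture taken before and after the change makes it easy to confirm whether the failover counter increased and whether the reason was again the ROUTED PNOTE.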

5 Replies
funkylicious
Advisor

What do you see in the /var/log/routed.log.* files? Does anything in particular stand out?

MladenAntesevic
Collaborator

I see that the routed daemon on the master restarted when I tried to enable SNMP. Here are the logs from both of my cluster members. I tried twice: first at 15:55, when failover to the second cluster member occurred, and then later at 22:33, when the cluster master failed back to the primary:

Apr 1 15:55:33.550102 task_cmd_terminate(194): command subsystem terminated.
Apr 1 15:55:33.550207
Apr 1 15:55:33.550207 Exit routed[28500] version routed-10.17.2020-15:36:15
Apr 1 15:55:33.550207
Apr 1 15:55:34 trace_on: Tracing to "/var/log/routed.log" started
Apr 1 15:55:34 trace_on: Version routed-10.17.2020-15:36:15 (ice_main 995000083)
Apr 1 16:20:54.702109 cpcl_cxl_get_memberip_from_id(837): Sync IP is: xxx.xxx.xxx.7
Apr 1 16:24:00.166579 cpcl_cxl_get_memberip_from_id(837): Sync IP is: xxx.xxx.xxx.7
Apr 1 22:33:48.962633 recv(header) returns 0
Apr 1 22:33:48.962633 peer_remove(130): Entering !!!!
Apr 1 22:33:50.087259 mvc_check_for_cu: upgrade finished
Apr 1 22:33:50.087259 cpcl_master_init(6196): entering
Apr 1 22:33:50.087259 entering cpcl_master_init()
Apr 1 22:33:50.087259 cpcl_master_init(6254): sockpath is /tmp/sockvrf0
Apr 1 22:33:50.087259 leaving cpcl_master_init()
Apr 1 22:33:50.087259 cpcl_master_init(6330): leaving
Apr 1 22:33:50.087259 CLUSTER: Proto 7 enables sending in cluster
Apr 1 22:33:52.105413 cpcl_vrf_master_listen_accept(6098): entering cpcl_vrf_master_listen_accept
Apr 1 22:33:52.105413 cpcl_vrf_master_listen_accept(6170): leaving cpcl_vrf_master_listen_accept
Apr 1 22:33:52.105464 cpcl_vrf_recv_from_instance_manager(5918): instance 0 entering cpcl_vrf_recv_from_instance_manager
Apr 1 22:33:52.105464 cpcl_vrf_recv_from_instance_manager(5949): instance 0 recv returned 4
Apr 1 22:33:52.105464 cpcl_vrf_recv_from_instance_manager(5975): instance 0 received fd 36
Apr 1 22:33:52.105464 cpcl_vrf_recv_from_instance_manager(6065): Deleting CPCL_IM_Peer_Task !!!!
Apr 1 22:33:52.105464 cpcl_vrf_recv_from_instance_manager(6071): instance 0 leaving cpcl_vrf_recv_from_instance_manager
Apr 1 22:33:53.125719 cpcl_vrf_master_send_vrf_finish(5798): instance 0 entering cpcl_vrf_master_send_vrf_finish
Apr 1 22:33:53.125719 cpcl_vrf_master_send_vrf_finish(5829): instance id 0 sending CLUSTER_INITIAL_VRF_SCM_XFER_DONE to peer 1

 

Apr 1 15:55:32.870143 recv(header) returns 0
Apr 1 15:55:32.870143 peer_remove(130): Entering !!!!
Apr 1 15:55:34.021994 cpcl_master_init(6196): entering
Apr 1 15:55:34.021994 entering cpcl_master_init()
Apr 1 15:55:34.021994 cpcl_master_init(6254): sockpath is /tmp/sockvrf0
Apr 1 15:55:34.021994 leaving cpcl_master_init()
Apr 1 15:55:34.021994 cpcl_master_init(6330): leaving
Apr 1 15:55:34.021994 CLUSTER: Proto 7 enables sending in cluster
Apr 1 15:55:36.040136 cpcl_vrf_master_listen_accept(6098): entering cpcl_vrf_master_listen_accept
Apr 1 15:55:36.040136 cpcl_vrf_master_listen_accept(6170): leaving cpcl_vrf_master_listen_accept
Apr 1 15:55:36.040188 cpcl_vrf_recv_from_instance_manager(5918): instance 0 entering cpcl_vrf_recv_from_instance_manager
Apr 1 15:55:36.040188 cpcl_vrf_recv_from_instance_manager(5949): instance 0 recv returned 4
Apr 1 15:55:36.040188 cpcl_vrf_recv_from_instance_manager(5975): instance 0 received fd 36
Apr 1 15:55:36.040188 cpcl_vrf_recv_from_instance_manager(6065): Deleting CPCL_IM_Peer_Task !!!!
Apr 1 15:55:36.040188 cpcl_vrf_recv_from_instance_manager(6071): instance 0 leaving cpcl_vrf_recv_from_instance_manager
Apr 1 15:55:37.050531 cpcl_vrf_master_send_vrf_finish(5798): instance 0 entering cpcl_vrf_master_send_vrf_finish
Apr 1 15:55:37.050531 cpcl_vrf_master_send_vrf_finish(5829): instance id 0 sending CLUSTER_INITIAL_VRF_SCM_XFER_DONE to peer 2
Apr 1 22:33:48.258116 task_cmd_terminate(194): command subsystem terminated.
Apr 1 22:33:48.258177
Apr 1 22:33:48.258177 Exit routed[28553] version routed-10.17.2020-15:36:15
Apr 1 22:33:48.258177
Apr 1 22:33:49 trace_on: Tracing to "/var/log/routed.log" started
Apr 1 22:33:49 trace_on: Version routed-10.17.2020-15:36:15 (ice_main 995000083)
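The restart shows up in these logs as an `Exit routed[<pid>]` line followed about a second later by `trace_on: Tracing to ... started`. A rough sketch for pulling those pairs out of a saved log, so their timestamps can be lined up against the failover times, could look like the following. The function name `find_routed_restarts` and the two regular expressions are assumptions based only on the lines quoted in this thread, not on any documented log format.

```python
import re

# Patterns inferred from the log lines quoted above (an assumption).
EXIT_RE = re.compile(r"^(\w{3}\s+\d+\s+[\d:.]+)\s+Exit routed\[(\d+)\]")
START_RE = re.compile(r"^(\w{3}\s+\d+\s+[\d:.]+)\s+trace_on: Tracing to")


def find_routed_restarts(lines):
    """Pair each 'Exit routed[pid]' line with the next 'trace_on' start line."""
    restarts = []
    pending = None  # (exit_time, pid) waiting for the matching start line
    for line in lines:
        m = EXIT_RE.match(line)
        if m:
            pending = (m.group(1), int(m.group(2)))
            continue
        m = START_RE.match(line)
        if m and pending:
            restarts.append(
                {"exit_time": pending[0], "pid": pending[1], "start_time": m.group(1)}
            )
            pending = None
    return restarts


SAMPLE_LINES = [
    "Apr 1 15:55:33.550102 task_cmd_terminate(194): command subsystem terminated.",
    "Apr 1 15:55:33.550207 Exit routed[28500] version routed-10.17.2020-15:36:15",
    'Apr 1 15:55:34 trace_on: Tracing to "/var/log/routed.log" started',
]

if __name__ == "__main__":
    for restart in find_routed_restarts(SAMPLE_LINES):
        print(restart)
```

On a gateway, the same function could be fed the lines of a copied `/var/log/routed.log` file instead of `SAMPLE_LINES`.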

the_rock
Advisor

I saw a similar post about this a few weeks ago, but I can't recall now what the outcome was, apologies. You may wish to contact TAC and open a case for this. Personally, I find this behaviour very unexpected.

genisis__
Advisor

Feels like a bug.

Is the installation running JHFA23?

Is CCP being encrypted? I have seen weird issues when this is on (pre-R81).

Has the same procedure been attempted using clish rather than the WebUI?

MladenAntesevic
Collaborator

No, I am not running JHFA23. I will check the release notes to see whether there is something similar to my case. CCP is not being encrypted; I left it at the default, unicast and unencrypted.

I will try to do the same thing using clish.
