Hi all, we have an issue with our VSX HA cluster (two gateways, Active/Standby): whenever the Standby is rebooted, for whatever reason, the Sync interface remains DOWN afterwards. In the past when this occurred, a physical power-down of the Standby restored the link, but a normal reboot does not (nor does bouncing the link).
We're in the process of eliminating physical problems, in particular by replacing the cable and SFP for this link. But I was wondering whether there are any other troubleshooting steps I could try in the meantime?
[ACTIVE] SYNC (eth3-04) <----> (eth3-04) SYNC [STANDBY]
Currently we have no HA resiliency, all VS are DOWN on the standby which isn't ideal.
Interface counters show no incrementing RX or TX on either side.
cphaprob syncstat does show incrementing SENT sync messages, but no received messages.
My theory is that the SFP/transceiver is faulty. Perhaps the SFP does not lose power during a normal reboot but does during a full physical power-down, which would explain why only a power-down brings the link back up. I'm not sure, though.
I appreciate any thoughts!
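While waiting on replacement hardware, one option is to look at the physical link and the transceiver from the OS side. The following is a minimal sketch, not an official procedure: it assumes the standard Linux `ethtool` utility is available in expert mode on the gateway, and the function names `link_detected` and `check_link` are made up for illustration. Not every NIC driver or SFP exposes module diagnostics, so `ethtool -m` may return nothing.

```shell
#!/bin/sh
# Sketch only: inspect physical link state and SFP data for the sync port.
# Assumes the standard Linux ethtool binary is present (an assumption here).

# Read `ethtool <iface>` output on stdin and print the "Link detected" value.
link_detected() {
    awk -F': ' '/Link detected:/ { print $2 }'
}

# Run the live checks against one interface (hypothetical helper).
check_link() {
    iface="$1"
    if ! command -v ethtool >/dev/null 2>&1; then
        echo "ethtool not found" >&2
        return 1
    fi
    echo "Link detected: $(ethtool "$iface" 2>/dev/null | link_detected)"
    # Module EEPROM and optical levels, where the driver and SFP support it.
    ethtool -m "$iface" 2>/dev/null || echo "no module data for $iface"
    # Driver counters, filtered to error and drop lines (may print nothing).
    ethtool -S "$iface" 2>/dev/null | grep -Ei 'err|drop' || true
}
```

Usage would be `check_link eth3-04` on each member. If the port reports no link on both sides and the module data is missing or unreadable, that points towards the transceiver or cable rather than ClusterXL.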
Could you run the commands below and share the results?
cphaprob stat
cphaprob -a if
tcpdump -nni <name interface sync> port 8116
Thanks for the reply, Tron. Please see below (I omitted some details such as hostname/IP).
Even running tcpdump without a port specified shows no packets at all on the interface, so the link seems to be completely dead, which makes me think it must be a physical issue.
Standby_Gateway:0> cphaprob stat
Cluster Mode: Virtual System Load Sharing (Primary Up)
ID Unique Address Assigned Load State Name
1 x.x.x.x 100% ACTIVE(!) Primary_Gateway
2 (local) x.x.x.x 0% DOWN Standby_Gateway
Active PNOTEs: IAC
Last member state change event:
Event Code: CLUS-110205
State change: ACTIVE(!) -> DOWN
Reason for state change: Interface eth3-04 is down (disconnected / link down)
Event time: Mon Aug 7 13:39:34 2023
Last cluster failover event:
Transition to new ACTIVE: Member 1 -> Member 2
Reason: Available on member 1
Event time: Mon Aug 7 13:39:01 2023
Cluster failover count:
Failover counter: 7
Time of counter reset: Tue Sep 6 17:00:37 2022 (reboot)
Cluster name: Cluster
Virtual Devices Status on each Cluster Member
=============================================
ID | Weight| Primary | Standby
| | |
| | |
| | | [local]
-------+-------+-----------+-----------
2 | 10 | ACTIVE(!) | DOWN
3 | 10 | ACTIVE(!) | DOWN
---------------+-----------+-----------
Active | 2 | 0
Weight | 20 | 0
Weight (%) | 100 | 0
Legend: Init - Initializing, Active! - Active Attention
Down! - ClusterXL Inactive or Virtual System is Down
Standby_Gateway:0> cphaprob -a if
vsid 0:
------
CCP mode: Manual (Unicast)
Required interfaces: 1
Required secured interfaces: 0
Interface Name: Status:
eth1-01 UP
eth3-04 (S) DOWN (72062 secs)
S - sync, HA/LS - bond type, LM - link monitor, P - probing
Virtual cluster interfaces: 1
eth1-01 x.x.x.x
[Expert@Standby_Gateway:0]# tcpdump -nni eth3-04 port 8116
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth3-04, link-type EN10MB (Ethernet), capture size 262144 bytes
^C
0 packets captured
0 packets received by filter
0 packets dropped by kernel
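For repeated checks, the `cphaprob -a if` output shown above can be filtered in a script. This is a rough sketch that assumes the output format matches the sample in this thread (interface name, `(S)` marker, then status); the function name `down_sync_ifaces` is hypothetical, and other versions may lay the columns out differently.

```shell
#!/bin/sh
# Sketch only: list sync interfaces that `cphaprob -a if` reports as DOWN.
# Assumes lines shaped like "eth3-04 (S) DOWN (72062 secs)", as in this thread.

# Read `cphaprob -a if` output on stdin; print each sync interface that is DOWN.
down_sync_ifaces() {
    awk '$2 == "(S)" && $3 == "DOWN" { print $1 }'
}
```

On a cluster member this could be run as `cphaprob -a if | down_sync_ifaces`. Empty output means no sync interface was reported as DOWN in that format.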
Thanks for your response.
From the information you provided, we can see:
Interface eth3-04 is down (disconnected / link down)
This is what puts the HA state into DOWN. Let's check what this interface is and where it is physically connected. Does it go through any switches?
Have there been any recent changes?
How are the links cabled? Are the gateways directly connected to each other (not recommended) or connected via a switch?
My preferred way is to have two sync interfaces in a non-LACP bond (e.g. round robin works) going to two separate switches.
True, there are switches between them; they are not directly connected, so this could also be a factor.
Dear Bro,
Please check the status of the physical interface, or compare the VLAN access configuration for that interface.
Why is a direct cable between the firewalls not recommended? In my opinion, a switch is an added point of failure.
I think this comes down to the user's needs. You can plug the cable directly between the two devices as long as both are in the same rack.
If the devices are located in two different racks, going through a switch keeps the cabling tidier and makes it easier to swap cables when there is a problem at the physical layer.
First off, there is Check Point's guidance on supported topologies for the sync network. Note that every one of them includes a switch.
I could build out a couple of failure scenarios, but @Bob_Zimmerman has already done a better job of it than I can in this CheckMates post here.
If you are concerned about a switch being a single point of failure, then it is likely a SPOF for other things in your environment as well. Solve this issue with two sync interfaces in a non-LACP bond (e.g. round robin works) going to two separate switches.
Hi all, to confirm: it was a faulty SFP, so it was indeed a physical issue. Check Point approved an RMA for the SFP, and the replacement SFP brought the link back online.
Thanks all for your assistance.
It is good news =))