dphonovation
Collaborator

Traffic to secondary member of ClusterXL is dropped using VxLan

I have the following:

<Site1 ClusterXL> <--------- Site-to-Site IPsec Tunnel ---------> <Site2 ClusterXL>

Member1-Site1: 10.10.171.2/24          Member1-Site2: 10.20.171.2/24
Member2-Site1: 10.10.171.3/24          Member2-Site2: 10.20.171.3/24
VIP:           10.10.171.1             VIP:           10.20.171.1

 

 

 

Site-to-Site Tunnel 1 Encryption Domain: 10.11.171.0/24. Site1 has a cluster VIP here, 10.11.171.1.

Site-to-Site Tunnel 2 Encryption Domain: 10.12.171.0/24. Site2 has a cluster VIP here, 10.12.171.1.

 

 

 

Across that IPsec tunnel I have a Check Point native VxLan interface on each cluster, pointed back at the opposite cluster:

Member1-Site1: 172.31.0.2/29           Member1-Site2: 172.31.0.5/29
Member2-Site1: 172.31.0.3/29           Member2-Site2: 172.31.0.6/29
VxLan VIP Site1: 172.31.0.1            VxLan VIP Site2: 172.31.0.4
Remote addr: 10.12.171.1               Remote addr: 10.11.171.1

 

 

I then have a route from Site1: route 10.20.171.0/24 via 172.31.0.4

And a route from Site2 back: route 10.10.171.0/24 via 172.31.0.1
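
For anyone who wants to sanity-check the addressing, here is a minimal Python sketch using only the standard `ipaddress` module. It checks that all six VxLan addresses are usable hosts in 172.31.0.0/29 and that each static route points at a cluster VIP. The dictionary keys and function names are illustrative labels of mine, not anything from Gaia or ClusterXL.

```python
import ipaddress

# VxLan overlay addresses as listed in the post (labels are illustrative).
vxlan_net = ipaddress.ip_network("172.31.0.0/29")
vxlan_addrs = {
    "site1-vip": "172.31.0.1",
    "site1-fw1": "172.31.0.2",
    "site1-fw2": "172.31.0.3",
    "site2-vip": "172.31.0.4",
    "site2-fw1": "172.31.0.5",
    "site2-fw2": "172.31.0.6",
}

# Static routes from the post: destination network -> next hop.
routes = {
    "10.20.171.0/24": "172.31.0.4",  # configured on Site1
    "10.10.171.0/24": "172.31.0.1",  # configured on Site2
}


def off_net(addrs, net):
    """Return the labels whose address is not a usable host in net."""
    hosts = set(net.hosts())
    return [name for name, ip in addrs.items()
            if ipaddress.ip_address(ip) not in hosts]


def next_hop_is_vip(routes, addrs):
    """Map each route to True if its next hop is one of the cluster VIPs."""
    vips = {ip for name, ip in addrs.items() if name.endswith("-vip")}
    return {dst: hop in vips for dst, hop in routes.items()}


print(off_net(vxlan_addrs, vxlan_net))
print(next_hop_is_vip(routes, vxlan_addrs))
```

Both routes use the remote cluster's VxLan VIP as next hop, so a packet addressed to a remote member's own IP is handed to whichever member currently holds that VIP.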

 

This works perfectly. I can reach all hosts on 10.10.171.0/24 or 10.20.171.0/24 from either side - except for traffic headed to the standby member in the ClusterXL on the destination net.

 

 

Can anyone shed light on why this might be the case?

 

 

 

8 Replies
the_rock
MVP Gold

If you run a simple zdebug, what do you see? Also, if you run ip r g x.x.x.x (the IP you are trying to reach), does the output look the same as for an IP that does work?

Andy

Best,
Andy
dphonovation
Collaborator

Just saw this in zdebug. A clue!

@;1464977;[vs_0];[tid_0];[fw4_0];fw_log_drop_ex: Packet proto=6 10.10.171.4:44698 -> 10.20.171.3:18192 dropped by fwha_ccl_inbound_late_do Reason: Dropping dynamic routing packet forwarded to wrong member.;
@;1465051;[vs_0];[tid_0];[fw4_0];fw_log_drop_ex: Packet proto=6 10.10.171.12:55319 -> 10.20.171.3:8443 dropped by fwha_ccl_inbound_late_do Reason: Dropping dynamic routing packet forwarded to wrong member.;
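
When there are many of these lines, it can help to pull out the fields and count drops per destination. Below is a rough Python sketch that parses the two lines above with a regular expression. The pattern is only fitted to this sample, and the function names are my own, so treat it as an assumption that other zdebug drop messages follow the same layout.

```python
import re

# Two drop lines copied from the zdebug output above.
SAMPLE = """\
@;1464977;[vs_0];[tid_0];[fw4_0];fw_log_drop_ex: Packet proto=6 10.10.171.4:44698 -> 10.20.171.3:18192 dropped by fwha_ccl_inbound_late_do Reason: Dropping dynamic routing packet forwarded to wrong member.;
@;1465051;[vs_0];[tid_0];[fw4_0];fw_log_drop_ex: Packet proto=6 10.10.171.12:55319 -> 10.20.171.3:8443 dropped by fwha_ccl_inbound_late_do Reason: Dropping dynamic routing packet forwarded to wrong member.;
"""

# Pattern fitted to the sample lines only (an assumption, not an official format).
DROP_RE = re.compile(
    r"proto=(?P<proto>\d+) "
    r"(?P<src>[\d.]+):(?P<sport>\d+) -> (?P<dst>[\d.]+):(?P<dport>\d+) "
    r"dropped by (?P<func>\S+) Reason: (?P<reason>.*?);?$"
)


def parse_drops(text):
    """Return one dict per line that matches the drop pattern."""
    drops = []
    for line in text.splitlines():
        match = DROP_RE.search(line)
        if match:
            drops.append(match.groupdict())
    return drops


def summarize(drops):
    """Count drops per (destination IP, dropping function)."""
    counts = {}
    for drop in drops:
        key = (drop["dst"], drop["func"])
        counts[key] = counts.get(key, 0) + 1
    return counts


for key, count in summarize(parse_drops(SAMPLE)).items():
    print(key, count)
```

In this sample both drops have the standby member 10.20.171.3 as destination and come from the same function, fwha_ccl_inbound_late_do.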


Setting fwha_forw_packet_to_not_active doesn't seem to help either (tried on all members).

 

the_rock
MVP Gold

That would appear to be something routing related, for sure. What is the output of ip route get for the IP you are testing, on both members?

Best,
Andy
dphonovation
Collaborator

On Site 1 FW1:

[Expert@cp-fw1-site1:0]# ip r g 10.20.171.3
10.20.171.3 via 172.31.0.4 dev vxlan7 src 172.31.0.2
cache
[Expert@cp-fw1-site1:0]# ip r g 10.20.171.2
10.20.171.2 via 172.31.0.4 dev vxlan7 src 172.31.0.2

 

On Site 1 FW2:

[Expert@cp-fw2-site1:0]# ip r g 10.20.171.3
10.20.171.3 via 172.31.0.4 dev vxlan7 src 172.31.0.3
cache
[Expert@cp-fw2-site1:0]# ip r g 10.20.171.2
10.20.171.2 via 172.31.0.4 dev vxlan7 src 172.31.0.3
cache

 

 

On Site 2 FW1:

[Expert@cp-fw1-site2:0]# ip r g 10.10.171.3
10.10.171.3 via 172.31.0.1 dev vxlan7 src 172.31.0.5
cache
[Expert@cp-fw1-site2:0]# ip r g 10.10.171.2
10.10.171.2 via 172.31.0.1 dev vxlan7 src 172.31.0.5

 

On Site 2 FW2:

[Expert@cp-fw2-site2:0]# ip r g 10.10.171.3
10.10.171.3 via 172.31.0.1 dev vxlan7 src 172.31.0.6
cache
[Expert@cp-fw2-site2:0]# ip r g 10.10.171.2
10.10.171.2 via 172.31.0.1 dev vxlan7 src 172.31.0.6
cache
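
To compare these outputs without eyeballing them, a small Python sketch like the one below can parse the `ip r g` result lines and group the next hop by site. The lines are copied from the output above; the regular expression only covers this "via ... dev ... src ..." shape, and the helper names are hypothetical.

```python
import re

# "ip r g" result lines for the standby IPs, copied from the four members above.
ROUTE_GET = {
    "cp-fw1-site1": "10.20.171.3 via 172.31.0.4 dev vxlan7 src 172.31.0.2",
    "cp-fw2-site1": "10.20.171.3 via 172.31.0.4 dev vxlan7 src 172.31.0.3",
    "cp-fw1-site2": "10.10.171.3 via 172.31.0.1 dev vxlan7 src 172.31.0.5",
    "cp-fw2-site2": "10.10.171.3 via 172.31.0.1 dev vxlan7 src 172.31.0.6",
}

ROUTE_RE = re.compile(
    r"^(?P<dst>\S+) via (?P<via>\S+) dev (?P<dev>\S+) src (?P<src>\S+)"
)


def parse_route(line):
    """Return dst/via/dev/src from one route-get line, or None if it differs."""
    match = ROUTE_RE.match(line.strip())
    return match.groupdict() if match else None


def next_hops_by_site(outputs):
    """Collect the (via, dev) pairs seen on the members of each site."""
    sites = {}
    for host, line in outputs.items():
        route = parse_route(line)
        if route is None:
            continue
        site = host.rsplit("-", 1)[-1]  # "cp-fw1-site1" -> "site1"
        sites.setdefault(site, set()).add((route["via"], route["dev"]))
    return sites


print(next_hops_by_site(ROUTE_GET))
```

A single (via, dev) pair per site means both members of that cluster agree on the next hop, and only the source address differs per member.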

the_rock
MVP Gold

That seems correct. I also found the article below, but you already said you changed the value. Let's see what others have to say.

https://supportcenter.checkpoint.com/supportcenter/portal?eventSubmit_doGoviewsolutiondetails=&solut...

Also, just wondering, if you compare the traceroute of working and non-working one, where is it failing?

Andy

Best,
Andy
dphonovation
Collaborator

Well, the zdebug drop is being shown on the active member of the opposite site.

dphonovation
Collaborator

And these are the traceroutes:

cp-mgmt-site1> traceroute 10.20.171.2
traceroute to 10.20.171.2 (10.20.171.2), 30 hops max, 40 byte packets
1 10.10.171.2 (10.10.171.2) 2.053 ms 1.682 ms 2.024 ms
2 10.20.171.2 (10.20.171.2) 20.622 ms 20.580 ms 20.607 ms
cp-mgmt-site1> traceroute 10.20.171.3
traceroute to 10.20.171.3 (10.20.171.3), 30 hops max, 40 byte packets
1 10.10.171.2 (10.10.171.2) 2.360 ms 1.796 ms 2.323 ms
2 * * *
3 * * *

 

Meanwhile, the other side's active member is logging the aforementioned drops.

 

What's weird is that the security gateways can ping the standby fine, but I think this is due to an auto NAT rule.

dphonovation
Collaborator

There is something about routing to the vxlan interface from the standby. Oddly, the standby member can ping both active/standby on the other side. But it cannot ping the management server (10.10.171.4):

In this case, FW2 at Site 2 (in standby) is trying to reach a CP MGMT box on the other side via ping.

FW2 at Site 2 is responding with ICMP unreachable from the IP of its member on the Clustered VxLan interface.

 

 

[Expert@cp-fw2-site2:0]# ifconfig vxlan7
vxlan7    Link encap:Ethernet  HWaddr 0E:61:40:26:DB:26
          inet addr:172.31.0.6  Bcast:172.31.0.7  Mask:255.255.255.248
          UP BROADCAST RUNNING MULTICAST  MTU:8000  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:2897 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:0 (0.0 b)  TX bytes:81240 (79.3 KiB)

[Expert@cp-fw2-site2:0]# ip r g 10.10.171.4
10.10.171.4 via 172.31.0.1 dev vxlan7 src 172.31.0.6
  cache
[Expert@cp-fw2-site2:0]# ping -c 1 172.31.0.6
PING 172.31.0.6 (172.31.0.6) 56(84) bytes of data.
64 bytes from 172.31.0.6: icmp_seq=1 ttl=64 time=0.079 ms

--- 172.31.0.6 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.079/0.079/0.079/0.000 ms
[Expert@cp-fw2-site2:0]# ping 10.10.171.4
PING 10.10.171.4 (10.10.171.4) 56(84) bytes of data.
From 172.31.0.6 icmp_seq=1 Destination Host Unreachable

 

Whereas every other host on the same network as fw2-site2 (using fw1, the active member that owns the default gateway VIP) CANNOT ping the standby gateway on the other side, but can ping the mgmt server.
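
One detail in the ifconfig output above stands out: vxlan7 on the standby shows 2897 packets transmitted and 0 received. A minimal Python sketch for flagging that pattern is below. It assumes the old-style `ifconfig` counter lines shown in this post, and the function names are mine. Whether zero RX is expected on a standby member is not something this thread establishes.

```python
import re

# Counter lines copied from the "ifconfig vxlan7" output on cp-fw2-site2.
IFCONFIG = """\
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:2897 errors:0 dropped:0 overruns:0 carrier:0
"""


def packet_counters(text):
    """Return {"RX": n, "TX": n} parsed from old-style ifconfig output."""
    pairs = re.findall(r"\b(RX|TX) packets:(\d+)", text)
    return {direction: int(count) for direction, count in pairs}


def is_one_way(counters):
    """True when the interface has sent packets but never received any."""
    return counters.get("TX", 0) > 0 and counters.get("RX", 0) == 0


counters = packet_counters(IFCONFIG)
print(counters, is_one_way(counters))
```

Running the same check on the active member's vxlan7 would show whether the zero RX count is specific to the standby.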
