dphonovation
Collaborator

Traffic to secondary member of ClusterXL is dropped using VxLan

I have the following:

 

<Site1 ClusterXL> <---------Site2Site IpSec Tunnel ------------> <Site 2 ClusterXL>

Member1-Site1: 10.10.171.2/24                                                    Member1-Site2: 10.20.171.2/24

Member2-Site1: 10.10.171.3/24                                                    Member2-Site2: 10.20.171.3/24

VIP: 10.10.171.1                                                                 VIP: 10.20.171.1

 

 

 

Site-to-Site Tunnel 1 Encryption Domain: 10.11.171.0/24. Site1 has a Cluster VIP here of 10.11.171.1

Site-to-Site Tunnel 2 Encryption Domain: 10.12.171.0/24. Site2 has a Cluster VIP here of 10.12.171.1

 

 

 

Across that IPsec tunnel, each cluster has a Check Point native VxLan interface pointed back at the opposite cluster:

Member1-Site1: 172.31.0.2/29                                                    Member1-Site2: 172.31.0.5/29

Member2-Site1: 172.31.0.3/29                                                    Member2-Site2: 172.31.0.6/29

VxLan VIP Site1: 172.31.0.1                                                     VxLan VIP Site2: 172.31.0.4

Remote addr: 10.12.171.1                                                        Remote addr: 10.11.171.1

 

 

I then have a route from Site1: route 10.20.171.0/24 via 172.31.0.4

And a route from Site2 back: route 10.10.171.0/24 via 172.31.0.1

 

This works perfectly. I can reach all hosts on 10.10.171.0/24 or 10.20.171.0/24 from either side - except for traffic headed to the standby member in the ClusterXL on the destination net.
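
As a sanity check on the addressing described above, here is a minimal Python sketch (run offline, not on the gateways) using the standard `ipaddress` module. The addresses are copied from this post; the dictionary keys and the `covered` helper are illustrative names only. It checks that every VxLan address and both next hops sit in the same /29, and that the active and standby member IPs at Site 2 are covered by the same static route.

```python
import ipaddress

# Derive the VxLan subnet from one member address given in the post.
vxlan_net = ipaddress.ip_interface("172.31.0.2/29").network

# VxLan addresses from the post (key names are illustrative).
vxlan_addrs = {
    "site1_vip": "172.31.0.1", "site1_fw1": "172.31.0.2", "site1_fw2": "172.31.0.3",
    "site2_vip": "172.31.0.4", "site2_fw1": "172.31.0.5", "site2_fw2": "172.31.0.6",
}

# Static routes from the post: (destination prefix, next hop).
routes = {
    "site1": ("10.20.171.0/24", "172.31.0.4"),
    "site2": ("10.10.171.0/24", "172.31.0.1"),
}

def covered(dest, prefix):
    """Return True if the destination address falls inside the prefix."""
    return ipaddress.ip_address(dest) in ipaddress.ip_network(prefix)

all_in_vxlan = all(ipaddress.ip_address(a) in vxlan_net for a in vxlan_addrs.values())
next_hops_ok = all(ipaddress.ip_address(nh) in vxlan_net for _, nh in routes.values())

# The same Site1 route covers the reachable member (.2) and the unreachable one (.3).
member_2_covered = covered("10.20.171.2", routes["site1"][0])
member_3_covered = covered("10.20.171.3", routes["site1"][0])

print(vxlan_net, all_in_vxlan, next_hops_ok, member_2_covered, member_3_covered)
```

All four checks hold for the values in the post, which suggests the addressing and static routes themselves do not treat the standby member differently from the active one.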

 

 

Can anyone shed light on why this might be the case?

 

 

 

0 Kudos
8 Replies
the_rock
Legend

If you run a simple zdebug, what do you see? Also, if you run ip r g x.x.x.x (the IP you are trying to reach), does the output look the same as for an IP that does work?

Andy

0 Kudos
dphonovation
Collaborator

Just saw this in zdebug. A clue!

@;1464977;[vs_0];[tid_0];[fw4_0];fw_log_drop_ex: Packet proto=6 10.10.171.4:44698 -> 10.20.171.3:18192 dropped by fwha_ccl_inbound_late_do Reason: Dropping dynamic routing packet forwarded to wrong member.;
@;1465051;[vs_0];[tid_0];[fw4_0];fw_log_drop_ex: Packet proto=6 10.10.171.12:55319 -> 10.20.171.3:8443 dropped by fwha_ccl_inbound_late_do Reason: Dropping dynamic routing packet forwarded to wrong member.;


This doesn't seem to help either (tried on all members)
fwha_forw_packet_to_not_active to 
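
For anyone who wants to tally drops like the two above, here is a small Python sketch that pulls the fields out of `fw_log_drop_ex` lines. The regular expression is inferred only from the two lines pasted in this post, so other zdebug message formats may well not match; `parse_drop` is a hypothetical helper name, not a Check Point tool.

```python
import re
from collections import Counter

# Pattern inferred from the two drop lines in this thread (an assumption,
# not a documented log format).
DROP_RE = re.compile(
    r"Packet proto=(?P<proto>\d+) "
    r"(?P<src>[\d.]+):(?P<sport>\d+) -> (?P<dst>[\d.]+):(?P<dport>\d+) "
    r"dropped by (?P<func>\S+) Reason: (?P<reason>.*?);?$"
)

def parse_drop(line):
    """Return a dict of fields for one drop line, or None if it does not match."""
    m = DROP_RE.search(line.strip())
    if m is None:
        return None
    fields = m.groupdict()
    for key in ("proto", "sport", "dport"):
        fields[key] = int(fields[key])
    return fields

sample = [
    "@;1464977;[vs_0];[tid_0];[fw4_0];fw_log_drop_ex: Packet proto=6 "
    "10.10.171.4:44698 -> 10.20.171.3:18192 dropped by fwha_ccl_inbound_late_do "
    "Reason: Dropping dynamic routing packet forwarded to wrong member.;",
    "@;1465051;[vs_0];[tid_0];[fw4_0];fw_log_drop_ex: Packet proto=6 "
    "10.10.171.12:55319 -> 10.20.171.3:8443 dropped by fwha_ccl_inbound_late_do "
    "Reason: Dropping dynamic routing packet forwarded to wrong member.;",
]

drops = [d for d in (parse_drop(line) for line in sample) if d is not None]

# Count drops per (destination, dropping function) to see which target is affected.
by_target = Counter((d["dst"], d["func"]) for d in drops)
print(by_target)
```

With the two pasted lines, both drops land on the same destination (10.20.171.3) and come from the same function, fwha_ccl_inbound_late_do.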

 

0 Kudos
the_rock
Legend

That would appear to be something routing related, for sure. What is the output of ip route get, on both members, for the IP you are testing?

0 Kudos
dphonovation
Collaborator

On Site 1 FW1:

[Expert@cp-fw1-site1:0]# ip r g 10.20.171.3
10.20.171.3 via 172.31.0.4 dev vxlan7 src 172.31.0.2
cache
[Expert@cp-fw1-site1:0]# ip r g 10.20.171.2
10.20.171.2 via 172.31.0.4 dev vxlan7 src 172.31.0.2

 

On Site 1 FW2:

[Expert@cp-fw2-site1:0]# ip r g 10.20.171.3
10.20.171.3 via 172.31.0.4 dev vxlan7 src 172.31.0.3
cache
[Expert@cp-fw2-site1:0]# ip r g 10.20.171.2
10.20.171.2 via 172.31.0.4 dev vxlan7 src 172.31.0.3
cache

 

 

On Site 2 FW1:

[Expert@cp-fw1-site2:0]# ip r g 10.10.171.3
10.10.171.3 via 172.31.0.1 dev vxlan7 src 172.31.0.5
cache
[Expert@cp-fw1-site2:0]# ip r g 10.10.171.2
10.10.171.2 via 172.31.0.1 dev vxlan7 src 172.31.0.5

 

On Site 2 FW2:

[Expert@cp-fw2-site2:0]# ip r g 10.10.171.3
10.10.171.3 via 172.31.0.1 dev vxlan7 src 172.31.0.6
cache
[Expert@cp-fw2-site2:0]# ip r g 10.10.171.2
10.10.171.2 via 172.31.0.1 dev vxlan7 src 172.31.0.6
cache
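
To compare these results across members without eyeballing them, here is a rough Python sketch. It assumes each `ip route get` result line looks like the ones pasted in this post (a destination followed by keyword/value pairs such as via, dev and src); `parse_route_get` is a hypothetical helper name, and other iproute2 versions may print extra fields.

```python
def parse_route_get(line):
    """Split one 'ip route get' result line into a dict (format assumed from this thread)."""
    tokens = line.split()
    route = {"dst": tokens[0]}
    # The remaining tokens come in keyword/value pairs: via, dev, src.
    route.update(zip(tokens[1::2], tokens[2::2]))
    return route

# Result lines for 10.20.171.3 copied from the two Site 1 members above.
outputs = {
    "cp-fw1-site1": "10.20.171.3 via 172.31.0.4 dev vxlan7 src 172.31.0.2",
    "cp-fw2-site1": "10.20.171.3 via 172.31.0.4 dev vxlan7 src 172.31.0.3",
}

parsed = {host: parse_route_get(line) for host, line in outputs.items()}

# Both members should agree on next hop and egress device; only src differs.
paths = {(r["via"], r["dev"]) for r in parsed.values()}
same_path = len(paths) == 1
print(same_path, paths)
```

For the lines in this post, both Site 1 members pick the same next hop and device, and only the source address differs per member.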

0 Kudos
the_rock
Legend

That seems correct. I also found the link below, but you already said you changed the value. Let's see what others have to say.

https://supportcenter.checkpoint.com/supportcenter/portal?eventSubmit_doGoviewsolutiondetails=&solut...

Also, just wondering: if you compare the traceroutes for the working and non-working IPs, where does the non-working one fail?

Andy

0 Kudos
dphonovation
Collaborator

Well, the zdebug drop is being shown on the active member of the opposite site.

0 Kudos
dphonovation
Collaborator

And these are the traceroutes:

cp-mgmt-site1> traceroute 10.20.171.2
traceroute to 10.20.171.2 (10.20.171.2), 30 hops max, 40 byte packets
1 10.10.171.2 (10.10.171.2) 2.053 ms 1.682 ms 2.024 ms
2 10.20.171.2 (10.20.171.2) 20.622 ms 20.580 ms 20.607 ms
cp-mgmt-site1> traceroute 10.20.171.3
traceroute to 10.20.171.3 (10.20.171.3), 30 hops max, 40 byte packets
1 10.10.171.2 (10.10.171.2) 2.360 ms 1.796 ms 2.323 ms
2 * * *
3 * * *

 

while the other side's active member is logging the aforementioned drops.

 

What's weird is that the security gateways can ping the standby fine; but I think this is due to an auto NAT rule.

0 Kudos
dphonovation
Collaborator

There is something about routing to the vxlan interface from the standby. Oddly, the standby member can ping both active/standby on the other side. But it cannot ping the management server (10.10.171.4):

In this case, FW2 at Site 2 (in standby) is trying to reach a CP MGMT box on the other side via ping.

FW2 at Site 2 is responding with ICMP unreachable from the IP of its member on the Clustered VxLan interface.

 

 

[Expert@cp-fw2-site2:0]# ifconfig vxlan7
vxlan7   Link encap:Ethernet HWaddr 0E:61:40:26:DB:26
      inet addr:172.31.0.6 Bcast:172.31.0.7 Mask:255.255.255.248
      UP BROADCAST RUNNING MULTICAST MTU:8000 Metric:1
      RX packets:0 errors:0 dropped:0 overruns:0 frame:0
      TX packets:2897 errors:0 dropped:0 overruns:0 carrier:0
      collisions:0 txqueuelen:1000
      RX bytes:0 (0.0 b) TX bytes:81240 (79.3 KiB)

[Expert@cp-fw2-site2:0]# ip r g 10.10.171.4
10.10.171.4 via 172.31.0.1 dev vxlan7 src 172.31.0.6
  cache
[Expert@cp-fw2-site2:0]# ping -c 1 172.31.0.6
PING 172.31.0.6 (172.31.0.6) 56(84) bytes of data.
64 bytes from 172.31.0.6: icmp_seq=1 ttl=64 time=0.079 ms

--- 172.31.0.6 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.079/0.079/0.079/0.000 ms
[Expert@cp-fw2-site2:0]# ping 10.10.171.4
PING 10.10.171.4 (10.10.171.4) 56(84) bytes of data.
From 172.31.0.6 icmp_seq=1 Destination Host Unreachable

 

By contrast, every other host on the same network as fw2-site2 (which uses fw1, the active member that owns the default gateway VIP) can ping the mgmt server but CANNOT ping the standby gateway on the other side.

0 Kudos
