StevePearson
Participant

All outbound traffic being corrupted

I'm hoping someone can clarify my diagnosis here!

I'm working on what should be a simple upgrade from R80.30 to R81.20: a VMware management server and a pair of 5200 gateways in an active/standby ClusterXL.

I built a new management server VM and copied the config over using migrate_server, no problems, so onto the cluster. CPUSE fresh install on the standby member, activated MVC, pushed the access policy, checked the Gaia config, and all looked good.

However, first problem: this gateway could not contact the Check Point download server for updates. A bit of troubleshooting showed it could talk to anything internal but nothing external, so for example I could ping an internal server but not 8.8.8.8. More troubleshooting: I checked the routing table using the CLI command ip route show and spotted that there was no default route showing. I checked the static routes in Gaia, and the default route was showing correctly there. Rebooted, no change. I deleted the default route in Gaia and re-added it, and it then showed in the ip route show output and everything appeared to be OK again: I could ping external addresses and contact the update servers. More testing and rebooting, and all appeared to be OK. The only explanation I can come up with is that the default route was somehow corrupted.
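For anyone wanting to repeat that check after each reboot, here is a minimal sketch of it in Python. It only parses text that looks like ip route show output and reports any default route lines. The function name and the sample addresses and interface names are made up for illustration and are not from my gateways.

```python
def default_routes(ip_route_output):
    """Return the lines of `ip route show` style output that describe a default route."""
    routes = []
    for line in ip_route_output.splitlines():
        fields = line.split()
        # iproute2 normally prints the IPv4 default route as "default via <gw> dev <if> ...".
        # "0.0.0.0/0" is accepted too, in case the output spells it out.
        if fields and fields[0] in ("default", "0.0.0.0/0"):
            routes.append(line.strip())
    return routes


# Hypothetical sample outputs (documentation addresses, invented interface names).
BEFORE = (
    "10.1.1.0/24 dev eth1 proto kernel scope link src 10.1.1.2\n"
    "192.0.2.0/29 dev eth2 proto kernel scope link src 192.0.2.2\n"
)
AFTER = BEFORE + "default via 192.0.2.1 dev eth2 proto static\n"

print("before re-adding:", default_routes(BEFORE))
print("after re-adding: ", default_routes(AFTER))
```

An empty list corresponds to what I saw before re-adding the route: connected networks present, but nothing to send internet-bound traffic to.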

So now I'm ready to do a manual failover to allow the other member to be upgraded.

I ran clusterXL_admin down and instantly all network traffic stopped; after clusterXL_admin up everything flowed normally again. Nothing abnormal in the logs.

I'm assured that the failover worked last time, but I must admit I didn't test it before I started.

Ran admin down again, traffic stopped (running ping verifies this instantly), and ran a packet capture for a short time on the upgraded gateway before running admin up again.

The capture file shows all packets outgoing to the internet as malformed due to incorrect packet lengths; internal ones are OK.

I have now also deleted and re-added all static routes, but that didn't help.

I've never seen this before. My working theory is that this is not related to the upgrade but to something external on the WAN interface side, so maybe a faulty port or cable?

 

4 Replies
TJ_Aus
Contributor

Does your uplink from your cluster to the internet (outgoing direction) include a bonded interface (etherchannel)?

StevePearson
Participant

No, there are no bonds in the setup at all.

the_rock
Legend

I read your post carefully and I must admit I have never experienced such an issue. I find it super odd that such a thing would happen with the routes, especially during an upgrade, as everything would (or SHOULD) get preserved.

What does it show when you run ip r g 8.8.8.8? Does it give the proper route? I also tend to agree with you that this may not even have anything to do with the upgrade, but rather something else. Then again, if everything worked fine BEFORE the upgrade, even that argument may not be correct...

Andy

StevePearson
Participant

After deleting and re-adding the default route it worked fine with ping.

Someone has suggested that the malformed packets could be an MTU problem, and that maybe the upgrade has reset the MTU to 1500 when it should be 9000.
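To make that suggestion concrete, here is a rough sketch of the arithmetic behind an MTU test with ping, assuming plain IPv4 with no IP options (20-byte header) and an 8-byte ICMP echo header. The ping flags shown are the ones from Linux iputils (-M do sets don't-fragment, -s sets the payload size); I'm assuming the ping on the gateway accepts them. The target 192.0.2.1 is a placeholder for the next hop, not a real address from this setup.

```python
IPV4_HEADER = 20  # bytes, assuming no IP options
ICMP_HEADER = 8   # bytes, ICMP echo request header


def max_ping_payload(mtu):
    """Largest ICMP echo payload that fits in one unfragmented IPv4 packet."""
    return mtu - IPV4_HEADER - ICMP_HEADER


def df_ping_command(target, mtu):
    """Build an iputils-style ping command that forbids fragmentation."""
    return "ping -M do -c 3 -s {} {}".format(max_ping_payload(mtu), target)


# 192.0.2.1 is a hypothetical next-hop address used only for illustration.
for mtu in (1500, 9000):
    print(mtu, max_ping_payload(mtu), df_ping_command("192.0.2.1", mtu))
```

If the 8972-byte ping to the next hop fails while the 1472-byte one succeeds, that would point to an interface or a device on the path running at 1500 rather than 9000. The configured value on the interface itself can be read from ip link show.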
