hemh
Participant

Edom environment and vpnd process

Hi Checkmates, 

 

I wanted to share with you an issue we encountered in our company recently.

The management server and gateways are all on R80.40 JHF139.

After cleaning up an old Edom environment, where we removed all Edom objects (user groups and network groups), we had a severe issue: all our site-to-site VPNs went down and SSL VPN users were not able to connect to the portal. We opened an SR with TAC and an engineer took a look at our environment.

From what he saw, he suspected a vpnd process crash, and he said we might need some CPU tuning across our 16 CPUs because one CPU was under heavy load. He said a vpnd process crash produces the same behaviour we saw during the incident: SSLVPN portal unresponsive, authentication failures due to RADIUS communication problems, and site-to-site VPN impacted.

We restored the session revision from before the Edom environment cleanup and that solved the issue. We also applied the CPU tuning as recommended by TAC. A few days later, we tried to clean up the Edom environment again and the issue occurred again.

We could tell the issue was present because this log entry appeared continuously in vpnd.elg: ‘Warning:cp_timed_blocker_handler: XXX’
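For anyone who wants to check their own gateway for the same symptom, here is a minimal sketch that counts how many lines of a log file contain that warning. The function names are made up for illustration, and the path to vpnd.elg is left as a parameter because it depends on your system; only the marker string comes from what we saw.

```python
# Marker text as seen in vpnd.elg during the incident (the part after it varied).
MARKER = "Warning:cp_timed_blocker_handler"


def count_marker_lines(lines, marker=MARKER):
    """Return how many of the given lines contain the marker string."""
    return sum(1 for line in lines if marker in line)


def scan_file(path, marker=MARKER):
    """Count marker lines in the file at `path` (hypothetical helper)."""
    # errors="replace" keeps the scan going if the log contains odd bytes.
    with open(path, "r", errors="replace") as log:
        return count_marker_lines(log, marker)
```

Running scan_file on a copy of vpnd.elg a few minutes apart and comparing the two counts gives a rough idea of whether the warning is still being written continuously.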

TAC's recommendation was to update to the latest JHF150, based on this analysis:

Issue occurs because of the following reasons: 

The vpnd crashes due to an invalid memory access. Further analysis of that access suggests the memory was valid at some point but was freed unexpectedly, causing vpnd to crash.
Another reason is Application mode, which seems to cause vpnd to crash abruptly.
Issue has been resolved in the following fix:

PRJ-27296, VPNRA-761 (Mobile Access): In rare scenarios, when SNX client is used with Application mode on the Mobile Access Blade, the VPND process may unexpectedly exit.


We followed the recommendation and installed JHF150, but it didn't solve the problem.

So we found the solution ourselves by disabling the Edom feature: vpn set_snx_encdom_groups on/off

It was as simple as that. Hope it helps.

2 Replies
the_rock
Champion

Thanks for sharing, that's actually super helpful!

Andy

Timothy_Hall
Champion

Thanks for sharing. The vpnd process on gateways is very old and has had a lot of wildly different functions stuffed into it over the years; unfortunately there have been some stability issues as well, and your story does not surprise me. Thankfully, vpnd appears to be on the way out as its functions are gradually implemented in new daemons (iked & cccd in R81.10+), with other functions such as Visitor Mode and NAT-T being implemented in the Firewall Workers/Instances, which vastly improves scalability & performance: sk168297: Large scale support in VPN Remote Access Visitor-Mode

New 2021 IPS/AV/ABOT Immersion Self-Guided Video Series
now available at http://www.maxpowerfirewalls.com