Policy Install Failed - Problem With The Commit Function

I have a feeling this one is going to require a call to TAC, but does anyone have any experience troubleshooting this one? I have a VSLS VSX cluster containing 3 Virtual Switches and 5 Virtual Systems, and I get this policy install error on just one VS. All the other VSs install policy just fine. The VSX cluster is R77.30 with an R80.10 SMS.

The strange thing is that it happens once the Policy Install progress hits 99%. It was my understanding (based on this very comprehensive and helpful writeup) that the Policy Install procedure was all but completed once the progress bar hit 99%?

When I look at vsx stat -v, it appears that VSX thinks the policy installed. The "Installed at" time matches the time at which the policy install fails.

I am able to verify the Policy in Smart Console without any errors but I'm not sure where to begin troubleshooting this since it appears that the Gateway thinks the policy installed successfully but the management server doesn't.

Thanks!

Re: Policy Install Failed - Problem With The Commit Function

How many cluster members do you have, and did you verify that the policy installed on all members? I would start with the cpd.elg logs.

Re: Policy Install Failed - Problem With The Commit Function

There are 2 members in the cluster. I didn't think to check both members; out of habit I was checking only the one that the VS is active on. However, it does appear that the policy is not installing on the other cluster member. The install dates are different across the two.

I'll start digging through the cpd.elg logs on this Gateway and see if anything interesting and relevant pops up. Thanks for the suggestion!
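For anyone who wants to make the same comparison less by eye, here is a minimal Python sketch. It assumes you have copied each member's "Installed at" value by hand; the member names, the timestamp layout, and the 10-minute tolerance are all assumptions for illustration, not something taken from the actual vsx stat -v output.

```python
from datetime import datetime, timedelta

# Assumed timestamp layout, e.g. "03May2018 10:15:03"; adjust it to match
# whatever your "Installed at" value actually looks like.
ASSUMED_FORMAT = "%d%b%Y %H:%M:%S"

def stale_members(installed_at, tolerance=timedelta(minutes=10), fmt=ASSUMED_FORMAT):
    """Return members whose install time lags the newest one by more than `tolerance`."""
    parsed = {name: datetime.strptime(stamp, fmt) for name, stamp in installed_at.items()}
    newest = max(parsed.values())
    return sorted(name for name, when in parsed.items() if newest - when > tolerance)

# Hypothetical values typed in by hand from each cluster member.
example = {
    "member1": "03May2018 10:15:03",
    "member2": "01May2018 09:00:00",
}
print(stale_members(example))
```

Any member the function returns is one where the last push probably did not land, which is where I would look at cpd.elg first.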

Re: Policy Install Failed - Problem With The Commit Function

The plot thickens... it looks like there is a positive confirmation the policy installed on the Cluster Member that shows the current policy install date:

However, it seems to just enter/exit "addon end_handler" without showing any confirmation that the policy install succeeded (or failed) on the Gateway that doesn't show the current policy install date. 

I was hoping for something a little more definitive in the logs pointing to a reason. But there is definitely a difference between the two cluster members.

I wonder if the cpd process just needs restarting? Maybe it's time for a cpstop/start in a maintenance window?

Re: Policy Install Failed - Problem With The Commit Function

There are a couple of SKs about cpd debug that would show you more messages in the log. For example:

How to debug CPD daemon

Additionally, check fwm.elg. But yes, if you have it as an option, reboot the gateway that exhibits the problem. Also check which Jumbo Hotfix you are on and whether there's a newer version that might have fixes for cpd or policy installation / VSX.

Re: Policy Install Failed - Problem With The Commit Function

Just to be clear, the fwm.elg log only exists on the SMS side, correct?

Thanks for the SK on debugging CPD. I got about 140,000+ lines of output when I ran it while pushing policy. I'm thinking this may be the point where it is more beneficial to engage TAC because without any guidance of what I'm looking for, it seems like searching for a needle in a haystack. 

If nothing else, I can arm TAC with a lot of information when opening the SR to hopefully move things along quickly!
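In the meantime, one way to shrink a haystack like that is a quick keyword filter over a copy of the debug output. This is only a rough Python sketch: the keyword list is my own guess at what might be relevant (errors, failures, SIC, the commit step), not anything taken from an SK, and the sample lines are made up.

```python
import re

# Assumed keywords; swap in whatever TAC or the SK tells you to look for.
ASSUMED_PATTERN = re.compile(r"\b(error|fail\w*|sic|commit)\b", re.IGNORECASE)

def interesting_lines(lines, pattern=ASSUMED_PATTERN):
    """Return (line_number, text) pairs for the lines that match the pattern."""
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, start=1)
        if pattern.search(line)
    ]

# Made-up sample lines, only to show the shape of the result.
sample = [
    "cpd started",
    "SIC handshake problem",
    "policy commit failed",
    "basic info",
]
for number, text in interesting_lines(sample):
    print(f"{number}: {text}")
```

Passing an open file handle instead of the sample list works the same way, and the line numbers make it easy to jump back into the full output for context.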

Re: Policy Install Failed - Problem With The Commit Function

BTW, what does the vsx stat -v output look like on that gateway? Is SIC established to the failing VS?

Re: Policy Install Failed - Problem With The Commit Function

Everything looks OK to my eyes:

Vladimir

Re: Policy Install Failed - Problem With The Commit Function

Please check the status of free RAM on the policy installation target.

Re: Policy Install Failed - Problem With The Commit Function

Thanks for the suggestion! I think the memory looks pretty good:

Re: Policy Install Failed - Problem With The Commit Function

Check free HDD space on mgmt and/or VSX.

Kind regards,
Jozko Mrkvicka

Re: Policy Install Failed - Problem With The Commit Function

This seemed like a great place to start, but disk space looks pretty good. Both VSX cluster members are using the same amount of disk space. The SMS should have plenty, too!

Re: Policy Install Failed - Problem With The Commit Function

Please open an SR with TAC if you have not done so already.

Re: Policy Install Failed - Problem With The Commit Function

Yes, I opened an SR on Friday. TAC supplied a policy debug script. I expect we’ll make some good progress today once I get rolling working through that process.

Thanks,

Dan

Re: Policy Install Failed - Problem With The Commit Function

This one is still making the rounds with TAC. We were provided a vs_reconfigure Bash script to run against the VS to rebuild it. While the script seemed to run successfully, and the GW was able to pull policy from the SMS, we are still unable to push policy to it.

Now we get a SIC error, despite the SIC status still showing as Trust in the output of vsx stat -v.

Strangely, I am able to modify the route table and push the VS config successfully through SmartConsole. I have a feeling we will be resetting SIC on the individual VS, but it seems strange that everything seems to work and communicate up to a certain point.

Very strange... I'll do my best to update with whatever the resolution ends up being as this seems to be a unique one!

Re: Policy Install Failed - Problem With The Commit Function

I would probably do a full rebuild of the box. First run reset_gw on the firewall (this way you will keep all basic non-vsx config) and then vsx_util reconfigure on mgmt. Something seems very "stuck" there if TAC has not been able to resolve it so far. I'm not sure how many times you have done it, but it's not as complicated and dangerous as it sounds. I would avoid resetting SIC on an individual VS; I have never had full success with it. Something always didn't work correctly at the end.

Re: Policy Install Failed - Problem With The Commit Function

I wasn't aware reset_gw was an option. Would you just run this from vs0 to basically blow away all the VSX config from the Gateway?

By "keep all basic non-vsx config", I'm taking that to mean the underlying Gaia config and OS remain. So this isn't a rebuild in the sense of a total reinstall of Gaia + HFAs on the GW? I'm familiar with the "vsx_util reconfigure" process and am pretty comfortable with that.

I was thinking along these lines, but I didn't realize there was a way to remove just the VSX configs! It would save a lot of headache of having to reimage the appliance and put everything back in place. I can mention it to the folks at TAC working with me. Is there an SK explaining this anywhere?

Thanks for the input, this could be very helpful!

-Dan

Re: Policy Install Failed - Problem With The Commit Function

Yep! It has saved me on a number of occasions, and it's a very popular command in my lab, where I rebuild them constantly to test stuff.

Just save your show configuration output as plain text, just in case, of course.

Re: Policy Install Failed - Problem With The Commit Function

Thanks again for this suggestion. I think this is the way to go. Now, I just need to get this squeezed into a maintenance window!

Re: Policy Install Failed - Problem With The Commit Function

If you have two VMs available, I would always suggest lab testing just to make sure. And don't forget the snapshot!

Re: Policy Install Failed - Problem With The Commit Function

Precisely how I planned on spending my day today! 

Re: Policy Install Failed - Problem With The Commit Function

Just to close the loop here in case anyone else should encounter this problem, the final solution was to perform a SIC reset on the individual VS as outlined in sk34098.

Kaspars Zibarts' suggestion of reset_gw would have also worked, since that procedure performs a full SIC reset as part of the vsx_util reconfigure process.

In the end, it came down to TAC feeling very certain the individual SIC reset would resolve it, and my being able to try the SIC reset during the normal course of troubleshooting rather than waiting for a maintenance window to do the reset_gw.

Thanks to all who contributed their suggestions and help here! If nothing else, I learned a handful of other troubleshooting steps and commands through this thread that I otherwise wouldn't have!