Just thought I would pass this along for the next person who might run into this. I have a customer with an environment that has been upgraded through many versions. Every time we do an upgrade we find something... interesting. Well, we did again!
So it turns out the internal CA had a garbage cert in it. cpca_client wouldn't show it, and dumping fwauth.NDB to SQL and looking through the SQL didn't show anything obvious. However, doing a less on fwauth.NDB, you would see
cn=bla.domain^[^[^[.domain.com
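If you would rather not page through the whole file in less, here is a rough Python sketch of the same check. It only looks for the ESC byte (0x1b, which less displays as ^[) in a file and prints the text around each hit. The function names and the amount of context are my own choices for illustration and have nothing to do with any Check Point tool. Run it against a copy of fwauth.NDB, not the live file.

```python
import sys

ESC = 0x1B  # the escape byte that less displays as ^[


def printable(chunk: bytes) -> str:
    """Render bytes as text, showing non-printable bytes as \\xNN."""
    return "".join(chr(b) if 32 <= b < 127 else f"\\x{b:02x}" for b in chunk)


def find_escape_bytes(data: bytes, context: int = 16):
    """Return (offset, surrounding text) for every run of ESC bytes in data."""
    hits = []
    pos = data.find(bytes([ESC]))
    while pos != -1:
        # Treat consecutive ESC bytes as a single hit.
        end = pos
        while end < len(data) and data[end] == ESC:
            end += 1
        start = max(0, pos - context)
        hits.append((pos, printable(data[start:end + context])))
        pos = data.find(bytes([ESC]), end)
    return hits


if __name__ == "__main__":
    # Usage (hypothetical script name): python3 find_esc.py /path/to/copy/of/fwauth.NDB
    if len(sys.argv) > 1:
        with open(sys.argv[1], "rb") as fh:
            for offset, text in find_escape_bytes(fh.read()):
                print(f"offset {offset}: {text}")
```

A binary file can contain ESC bytes for perfectly legitimate reasons, so treat every hit as something to look at, not as proof of a bad cert.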
Check Point has a utility called fix_ndb. With it you can dump fwauth.NDB to text, fix the text file, and recreate fwauth.NDB. This is an internal Check Point utility and is not to be distributed.
Each cert is listed in the output, and how the cert is encoded is listed as well, on a line that looks like
form=ascii
or
form=bin
All of the cert names were readable except the one listed as form=bin. Since it was in bin mode, everything was just hex. We then found we could see the serial number in the cert listed as form=bin. Doing a less on fwauth.NDB showed enough of the serial number to make a match.
After deleting the cert (there are 2 references to it in the output, by the way, and you have to delete both) and recreating fwauth.NDB, the system now upgrades to R80.40.
Oh, the error was logged in /opt/CPsuite-R80.40/fw1/log/upgrade_from_fwm_on_domain_UUID.elg
The error message was
ERROR: management.upgrade.Dispatcher [main: Exception caught during the upgrade: Marshaling Error: Invalid white space character (0x1b) in text output.
It was followed by 900 lines of Java exceptions.
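With that many lines of exceptions, the one line that matters is easy to miss. Here is a small Python sketch that pulls out the offending byte value from a log like this. The regex is based only on the single message quoted above, so it is an assumption that other versions word the error the same way. The function name is made up for illustration.

```python
import re
import sys

# Pattern taken from the one error line quoted in this post; other
# versions may phrase the message differently (assumption).
BAD_CHAR = re.compile(r"Invalid white space character \(0x([0-9a-fA-F]+)\)")


def find_marshaling_errors(lines):
    """Return (line number, byte value) for each line reporting a bad character."""
    found = []
    for number, line in enumerate(lines, start=1):
        match = BAD_CHAR.search(line)
        if match:
            found.append((number, int(match.group(1), 16)))
    return found


if __name__ == "__main__":
    # Usage (hypothetical script name): python3 find_bad_char.py /path/to/upgrade.elg
    if len(sys.argv) > 1:
        with open(sys.argv[1], errors="replace") as fh:
            for number, value in find_marshaling_errors(fh):
                print(f"line {number}: invalid character 0x{value:02x}")
```

The byte value it reports (0x1b here) is the one to go hunting for in fwauth.NDB.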
Big thanks to Arjun in Diamond for putting up with me and Russel from CFG for figuring out all this mess.