I have a cluster of 15400 devices running R80.40, with the FWMGR at R81. Right now both are one patch behind, but they will be upgraded soon.
We have HTTPS inspection enabled, and all traffic was going through (we are set to fail open), but we found out that all of our traffic was also getting an "Internal System Error" and was not actually being inspected. I have seen several posts on here about Error Code 2, and I do see one or two of those, but that is not what we are mostly seeing. We are getting "Internal System Error in HTTPS Inspection due to categorization service timeout".
I am guessing this is a fairly new issue, since SK176925, which I followed to increase some of the values in the rad.conf.c file, is dated 12/13/21.
I followed the instructions in the SK and saw a dramatic drop in the errors, but I am still getting clumps of them. For example, we went from all traffic getting the error to seeing it a few times a day, with about 40 to 50 log entries in less than a minute. Then it goes back to inspecting normally for an hour or so before I see another clump.
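In case anyone wants to compare their own logs against this pattern, here is a minimal Python sketch of how the clumps could be counted. It assumes you have already exported the timestamps of the timeout log entries into a list of `datetime` objects by whatever means you prefer; the function name `find_clumps` and the thresholds (60 seconds between entries, at least 10 entries per clump) are my own illustrative choices, not anything from Check Point.

```python
from datetime import datetime, timedelta


def find_clumps(timestamps, max_gap=timedelta(seconds=60), min_size=10):
    """Group timestamps into bursts.

    Consecutive entries that are no more than max_gap apart belong to the
    same burst. Bursts with fewer than min_size entries are ignored.
    Returns a list of (first_timestamp, last_timestamp, entry_count).
    """
    clumps = []
    current = []
    for ts in sorted(timestamps):
        # A gap larger than max_gap closes the burst we were building.
        if current and ts - current[-1] > max_gap:
            if len(current) >= min_size:
                clumps.append((current[0], current[-1], len(current)))
            current = []
        current.append(ts)
    # Do not forget the final burst.
    if len(current) >= min_size:
        clumps.append((current[0], current[-1], len(current)))
    return clumps


if __name__ == "__main__":
    # Synthetic data only: two bursts plus two stray entries.
    base = datetime(2022, 1, 10, 9, 0, 0)
    sample = [base + timedelta(seconds=i) for i in range(45)]
    sample += [base + timedelta(minutes=30), base + timedelta(minutes=45)]
    sample += [base + timedelta(minutes=65, seconds=i) for i in range(40)]
    for start, end, count in find_clumps(sample):
        print(f"{start} to {end}: {count} entries")
```

Running something like this over a few days of logs would at least show how many clumps there are per day and how large each one is, which makes it easier to tell whether further tuning changes anything.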
I am curious whether other people see this type of activity in their logs. I am not sure if this is normal and we are fixed now, or if it is still broken and our tweak helped but did not fix the situation.
Also, I created a monitoring job off our network which monitors Check Point's CWS and Updates pages, and I frequently get notifications that the sites are not available. I haven't seen a correlation between the times they are down and the clumps of errors we get; just throwing that out there for no good reason at all. 🙂
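So far I have only eyeballed that comparison. A more systematic check could look like the rough Python sketch below. It assumes two inputs that you would have to build yourself: a list of `datetime` values for when the monitor reported a site as unavailable, and a list of `(start, end)` pairs for each burst of errors. The function name `outages_near_bursts` and the five-minute slack are hypothetical choices for illustration.

```python
from datetime import datetime, timedelta


def outages_near_bursts(outages, bursts, slack=timedelta(minutes=5)):
    """Return outages that fall within `slack` of any error burst.

    outages: iterable of datetime (monitor reported the site unavailable)
    bursts:  iterable of (start, end) datetime pairs for each error burst
    Each result is (outage_time, burst_start, burst_end) for the first
    burst that matches.
    """
    matched = []
    for outage in outages:
        for start, end in bursts:
            # Widen the burst window by `slack` on both sides.
            if start - slack <= outage <= end + slack:
                matched.append((outage, start, end))
                break
    return matched


if __name__ == "__main__":
    # Synthetic data only, to show the shape of the inputs.
    day = datetime(2022, 1, 10)
    bursts = [
        (day.replace(hour=9, minute=0), day.replace(hour=9, minute=1)),
        (day.replace(hour=10, minute=5), day.replace(hour=10, minute=6)),
    ]
    outages = [
        day.replace(hour=8, minute=57),
        day.replace(hour=9, minute=30),
        day.replace(hour=10, minute=20),
    ]
    hits = outages_near_bursts(outages, bursts)
    print(f"{len(hits)} of {len(outages)} outages fall near a burst")
```

If almost none of the outages land near a burst over a week or so, that would back up the impression that the two are unrelated.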
Any insight is appreciated.