'Crash' can mean a lot of different things - in our case it means the gateways reload at inopportune times. With respect to R81.10, we've been waiting for R81.20 before going through that exercise but would move much earlier if we were assured there are no stability issues in the R81.x release at all - something I doubt is true.
So the reason for my post is simply to ask whether our experience is common among other customers and, if so, how they got to a stable place.
Here is what happened in more detail:
We were standardized on R80.40 GA Take 154 but were experiencing reloads every 4 to 6 weeks. We were able to manage this with scheduled reboots during controlled service windows. Recently we were encouraged to move to GA Take 161 as it addressed some stability issues and TAC wanted to see if ours were addressed by applying it. Once on Take 161, we saw stability for about two weeks, then suffered unexpected reloads mid-day. Core dump analysis showed 'bad magic' errors that are apparently partially addressed in GA Take 173 but fully fixed in OG Take 180.
Under duress, we moved from GA Take 161 to OG Take 180. That did take care of the 'bad magic' errors, but it caused new crashes, usually in pairs: one reboot with core dump generation, followed by another between 15 and 45 minutes later.
Now we are trying to find our way back to a more workable deployment while R&D sort out the latest issue.