Unexpected crash of ClusterXL active member
Hello, folks.
Today our client's main cluster had a problem.
At about 10:00 am (GMT-5), the client lost all services (Internet, publishing, communication between internal segments).
In practice, the active member of the cluster "died" with no apparent cause.
The client tried to force a failover with the command "clusterXL_admin down", but the command did not work, and they had to restart the machine.
They have already had similar bad experiences with this cluster, and it looks like a problem with the hardware that was sold (Appliance 6000).
Is it possible to determine the root cause that made the equipment simply stop working and caused this outage for the customer?
Best regards.
Check Point: R81.10 with JHF Take 95
If the system crashed, there is going to be a vmcore somewhere.
I would highly recommend engaging with the TAC on this as, if it's a hardware failure, an RMA may be required.
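For anyone hunting for the vmcore mentioned above, here is a minimal Python sketch that walks a directory tree and lists files whose names start with `vmcore`, newest first. The function name `find_crash_dumps` and the idea of scanning a directory such as `/var/log` are assumptions for illustration, not a Check Point tool; the actual dump location on a given gateway should be confirmed with TAC.

```python
import os
from datetime import datetime, timezone


def find_crash_dumps(root, prefix="vmcore"):
    """Return (path, size_bytes, mtime_utc) tuples for files under `root`
    whose name starts with `prefix`, sorted newest first."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.startswith(prefix):
                continue
            path = os.path.join(dirpath, name)
            try:
                st = os.stat(path)
            except OSError:
                # File vanished or is unreadable; skip it.
                continue
            mtime = datetime.fromtimestamp(st.st_mtime, tz=timezone.utc)
            found.append((path, st.st_size, mtime))
    # Newest dump first, since that is the one matching the latest crash.
    found.sort(key=lambda item: item[2], reverse=True)
    return found
```

Calling `find_crash_dumps("/var/log")` (the root path is an assumption) returns an empty list if nothing matches, which would suggest either that no kernel dump was written or that it lives elsewhere.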
TAC may suggest upgrading to R81.20, but it is hard to say whether that would solve anything. As @PhoneBoy said, it sounds like a vmcore was generated, so that would need to be investigated further.
Andy
As soon as possible once the device is "alive" again, collect all logs present on the device (dmesg, /var/log/messages) and a cpinfo. Do this before the logs are overwritten with newer ones, so that TAC has a chance to spot what went wrong.
Jozko Mrkvicka
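The "grab the logs before they rotate" advice above can be sketched in Python as follows. This is only an illustration: `archive_logs` is a hypothetical helper, the glob patterns are assumptions, and it packs files only, so live `dmesg` output would first have to be saved to a file. It does not replace a cpinfo.

```python
import glob
import os
import tarfile
import time


def archive_logs(patterns, dest_dir, label="crash-logs"):
    """Pack every regular file matching the glob `patterns` into a
    timestamped .tar.gz under `dest_dir`.
    Returns (archive_path, list_of_archived_files)."""
    files = sorted(
        {p for pattern in patterns for p in glob.glob(pattern) if os.path.isfile(p)}
    )
    os.makedirs(dest_dir, exist_ok=True)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    archive_path = os.path.join(dest_dir, f"{label}-{stamp}.tar.gz")
    with tarfile.open(archive_path, "w:gz") as tar:
        for path in files:
            # tarfile stores absolute paths without the leading "/".
            tar.add(path)
    return archive_path, files
```

For example, `archive_logs(["/var/log/messages*"], "/var/tmp")` would bundle the current messages file and any rotated copies matching that pattern (both paths are assumptions) into one archive that can be attached to the TAC case.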
Very good point @JozkoMrkvicka
