Mistakenly cold-swapped an RMA drive and I'm seeing the messages described in this SK: https://support.checkpoint.com/results/sk/sk181269
The solution doesn't explain whether the RAID will eventually self-heal and go healthy, or whether the spare drive is no good now. I'm not seeing much activity on the drives that would suggest a rebuild, and if I reboot with the drives in this status the appliance fails to boot until I remove the 'spare/DISC_FAILED' drive.
Hm... I see it says this, but I'm not sure whether that means it will technically self-heal...
Andy
When Storage Devices are configured in a RAID, it is mandatory to replace Storage Devices when the appliance is up and running.
After you replace a Storage Device, it can take several hours for the RAID State to become "ONLINE" and "Flags" to become "NONE"
See, that's funny because hot-swapping a failed disk also doesn't work. You have to take manual steps to get the system to recognize the new drive, as described in sk157874.
But I guess doing this would be easier...
The following workaround is also available:
Reboot the appliance. A reboot will also "wake up" the SATA port that shut down after you swapped the failed disk with a new one.
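For reference, the generic Linux way to re-probe a SATA link without rebooting is the sysfs SCSI host rescan. This is only a sketch: host0 is just an example, and it may or may not match what sk157874 actually prescribes.
# List the SCSI/SATA host adapters the kernel knows about
ls /sys/class/scsi_host/
# Ask a specific host (host0 is only an example) to rescan its links
echo "- - -" > /sys/class/scsi_host/host0/scan
# Confirm the kernel now sees the new disk and check the array state
fdisk -l
cat /proc/mdstat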
What's odd is I did try rebooting with the drive in the server and it fails to boot. If I remove the spare drive and reboot, it boots fine.
insmod: error inserting '/lib/crct10dif_common.ko': -1 File exists
insmod: error inserting '/lib/crc-t10dif.ko': -1 File exists
insmod: error inserting '/lib/sd_mod.ko': -1 File exists
mdadm: /dev/md/2 has been started with 1 drive (out of 2).
mdadm: /dev/md0 has been started with 2 drives.
mdadm: /dev/md1 has been started with 2 drives.
mdadm: /dev/md2 is already in use.
mdadm: /dev/md2 is already in use.
Reading all physical volumes. This may take a while...
Found volume group "vg_splat" using metadata type lvm2
4 logical volume(s) in volume group "vg_splat" now active
mount: error mounting /dev/root on /sysroot as ext3: Invalid argument
setuproot: moving /dev failed: No such file or directory
setuproot: error mounting /proc: No such file or directory
setuproot: error mounting /sys: No such file or directory
switchroot: mount failed: No such file or directory
Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000100
CPU: 0 PID: 1 Comm: init Not tainted 3.10.0-1160.15.2cpx86_64 #1
Hardware name: CheckPoint PH-30-00/To be filled by O.E.M., BIOS 5.6.5 01/13/2016
Call Trace:
[<ffffffff817b02ee>] dump_stack+0x1e/0x20
[<ffffffff817adc6f>] panic+0xe5/0x20a
[<ffffffff810984bd>] do_exit+0xb2d/0xb30
[<ffffffff81098543>] do_group_exit+0x43/0xc0
[<ffffffff810985d4>] SyS_exit_group+0x14/0x20
[<ffffffff817c8361>] sysenter_dispatch+0x7/0x25
Kernel Offset: disabled
Rebooting in 15 seconds..
Maybe get in touch with TAC and see what they say.
Boot it with the single drive, insert the replacement drive, then use the process in sk157874 to get it to recognize the replacement drive and start resilvering the set.
I don't think that's my issue as when I run fdisk -l I see two 1TB drives listed. I am working with support and will update the post with what I find out.
Sounds good, let us know.
That's normal. Here's some relevant command output from a 15600 upgraded from R81.10 (maybe R80.40, I forget) to R81.20 with a healthy RAID:
[Expert@SomeCluster1 STANDBY]# fw ver
This is Check Point's software version R81.20 - Build 012
[Expert@SomeCluster1 STANDBY]# raid_diagnostic
Raid status:
VolumeID:0 RaidLevel: RAID-1 NumberOfDisks:2 RaidSize:447GB State:OPTIMAL Flags:ENABLED
DiskID:0 DiskNumber:0 Vendor:ATA ProductID:SAMSUNG MZ7KM480 Revision:104Q Size:447GB State:ONLINE Flags:NONE
DiskID:1 DiskNumber:1 Vendor:ATA ProductID:SAMSUNG MZ7KM480 Revision:104Q Size:447GB State:ONLINE Flags:NONE
[Expert@SomeCluster1 STANDBY]# fdisk -l
Disk /dev/sda: 480.1 GB, 480103981056 bytes, 937703088 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk label type: dos
Disk identifier: 0x0008f6de
Device Boot Start End Blocks Id System
/dev/sda1 * 63 610469 305203+ fd Linux raid autodetect
/dev/sda2 610470 67713974 33551752+ fd Linux raid autodetect
/dev/sda3 67713975 937697984 434992005 fd Linux raid autodetect
Disk /dev/sdb: 480.1 GB, 480103981056 bytes, 937703088 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk label type: dos
Disk identifier: 0x0008f6de
Device Boot Start End Blocks Id System
/dev/sdb1 * 63 610469 305203+ fd Linux raid autodetect
/dev/sdb2 610470 67713974 33551752+ fd Linux raid autodetect
/dev/sdb3 67713975 937697984 434992005 fd Linux raid autodetect
Disk /dev/md0: 312 MB, 312410112 bytes, 610176 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk /dev/md1: 34.4 GB, 34356920320 bytes, 67103360 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk /dev/md2: 445.4 GB, 445431742464 bytes, 869983872 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
[Expert@SomeCluster1 STANDBY]# mdadm --misc -Q --detail /dev/md0
/dev/md0:
Version : 0.90
Creation Time : Wed Aug 8 06:10:37 2018
Raid Level : raid1
Array Size : 305088 (297.94 MiB 312.41 MB)
Used Dev Size : 305088 (297.94 MiB 312.41 MB)
Raid Devices : 2
Total Devices : 2
Preferred Minor : 0
Persistence : Superblock is persistent
Update Time : Thu May 2 10:36:18 2024
State : clean
Active Devices : 2
Working Devices : 2
Failed Devices : 0
Spare Devices : 0
Consistency Policy : resync
UUID : 00112233:44556677:8899aabb:ccddeefd
Events : 0.36
Number Major Minor RaidDevice State
0 8 1 0 active sync /dev/sda1
1 8 17 1 active sync /dev/sdb1
[Expert@SomeCluster1 STANDBY]# mdadm --misc -Q --detail /dev/md1
/dev/md1:
Version : 0.90
Creation Time : Wed Aug 8 06:10:32 2018
Raid Level : raid1
Array Size : 33551680 (32.00 GiB 34.36 GB)
Used Dev Size : 33551680 (32.00 GiB 34.36 GB)
Raid Devices : 2
Total Devices : 2
Preferred Minor : 1
Persistence : Superblock is persistent
Update Time : Tue Mar 19 00:23:22 2024
State : clean
Active Devices : 2
Working Devices : 2
Failed Devices : 0
Spare Devices : 0
Consistency Policy : resync
UUID : 00112233:44556677:8899aabb:ccddeefe
Events : 0.8
Number Major Minor RaidDevice State
0 8 2 0 active sync /dev/sda2
1 8 18 1 active sync /dev/sdb2
[Expert@SomeCluster1 STANDBY]# mdadm --misc -Q --detail /dev/md2
/dev/md2:
Version : 0.90
Creation Time : Wed Aug 8 06:10:32 2018
Raid Level : raid1
Array Size : 434991936 (414.84 GiB 445.43 GB)
Used Dev Size : 434991936 (414.84 GiB 445.43 GB)
Raid Devices : 2
Total Devices : 2
Preferred Minor : 2
Persistence : Superblock is persistent
Update Time : Fri May 3 09:31:11 2024
State : active
Active Devices : 2
Working Devices : 2
Failed Devices : 0
Spare Devices : 0
Consistency Policy : resync
UUID : 00112233:44556677:8899aabb:ccddeeff
Events : 0.766
Number Major Minor RaidDevice State
0 8 3 0 active sync /dev/sda3
1 8 19 1 active sync /dev/sdb3
[Expert@SomeCluster1 STANDBY]#
On boot, the system tries to identify all disks. Any disks which are part of an existing md(4) set are attached to that set. Any disks which aren't part of an existing set instead result in a new set being created with the new disk attached to it. This is the problem you have hit.
It's possible to fix this live, but risky. It's far simpler to shut down, remove the new drive, boot the system, attach the new drive (after you can log in), then probe the SATA links using the command in sk157874.
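If you want to see this from the running system, here is a quick read-only sketch; sda (surviving disk) and sdb (RMA disk) are assumed device names:
# Which arrays the kernel assembled, and which member each one actually has
cat /proc/mdstat
# Read the md superblock on the large data partition of each disk; the array
# UUID reported for the RMA disk will differ if it was previously a member of
# another appliance's RAID set
mdadm --examine /dev/sda3
mdadm --examine /dev/sdb3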
Thanks. I'm working through our VAR, who opened a case on our behalf, so I definitely don't want to waste the forum's time. My theory is that Check Point shipped us an entire chassis instead of just one spare drive, so I'm guessing there's an OS on those drives that is causing issues with the RAID rebuild. Waiting for feedback to confirm, but that's where I'm at currently. Will update with the outcome.
That sounds logical to me as well.
In that case, the drive from the spare box will definitely have its own md(4) set definition stored on it. At boot, the system is seeing both existing sets. Each has two configured members, and each has only one member present.
The fix is still to boot without the new drive (so the system only has one set), insert the new drive after you can log in, then probe the SATA links to convince the system to take the new drive and stick it in the existing set.
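If probing the links alone isn't enough because the foreign superblocks are still on the new drive, the underlying md(4) clean-up would look something like the sketch below. The device names (sda = surviving disk, sdb = RMA disk) are assumptions, the commands are destructive to the new drive, and this should only be run once TAC has confirmed which disk is which:
# Make absolutely sure sdb is the new/foreign drive before touching it
mdadm --examine /dev/sdb1 /dev/sdb2 /dev/sdb3
# Erase the foreign md superblocks so the drive no longer claims membership
# in the old appliance's arrays
mdadm --zero-superblock /dev/sdb1
mdadm --zero-superblock /dev/sdb2
mdadm --zero-superblock /dev/sdb3
# Add the partitions into the existing arrays and let them resync
mdadm --manage /dev/md0 --add /dev/sdb1
mdadm --manage /dev/md1 --add /dev/sdb2
mdadm --manage /dev/md2 --add /dev/sdb3
cat /proc/mdstat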
Maybe verify with TAC first, but here is my logic...
1) ONLY boot with existing hdd
2) Make sure you can log in
3) If yes, shut down the appliance by running halt from expert mode
4) Unplug the power cable
5) insert the hdd that was sent
6) power on the appliance
7) verify raid status -> cpstat os -f raidInfo from expert mode (see the sketch after this list)
8) if still failing, reboot
9) check again
10) if good, great, if not, I would call TAC
Andy
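For step 7, a couple of read-only commands to keep an eye on things while the mirror catches up; raid_diagnostic should eventually look like the healthy example posted earlier in the thread:
# Check Point's view of the RAID volume and member disks
cpstat os -f raidInfo
# The kernel's view, including any resync/rebuild progress
cat /proc/mdstat
# Poll every minute until the arrays show as clean / ONLINE
while true; do cat /proc/mdstat; sleep 60; done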
Just a final update on this. Worked with our VAR and TAC. It ended up being what we theorized: we had to wipe the spare drive, clear its partition layout, and then remap it with the good drive's partition layout, since it already had an OS installed on it. Once this happened, the drive automatically began to rebuild in the RAID. If we had received just a regular spare drive this would have been avoided, but it was a good learning experience.
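For anyone who lands here with the same problem, cloning the good disk's dos partition layout onto a wiped spare is typically done with sfdisk on Linux. This is just a sketch with assumed device names (sda = good disk, sdb = spare), it is destructive to the spare, and the exact procedure TAC used here may have differed:
# Dump the good disk's partition table and write it onto the spare
sfdisk -d /dev/sda > /tmp/good_disk_layout
sfdisk /dev/sdb < /tmp/good_disk_layout
# Confirm both disks now show identical partitioning
fdisk -l /dev/sda
fdisk -l /dev/sdb
# In this case the rebuild reportedly kicked off on its own once the
# layouts matched; progress shows up here
cat /proc/mdstat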
Thanks for letting us know.
Just had a similar situation on my 23500. The replacement drive would not rebuild no matter what, and the partitions on the replacement were missing. After escalation, TAC had me run echo 0 > /boot/SW_RAID and then activate_sw_raid, which miraculously rebuilt the partition structure to match (I used fdisk -l for comparison). The RAID still did not rebuild. We pulled the replacement drive back out and rebooted with just the good drive, reinserted the replacement, and it started rebuilding; we are now good.
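To recap that sequence as a sketch: the SW_RAID flag and activate_sw_raid are exactly the TAC-supplied steps described in the post above (not something to run without TAC), and sda/sdb are assumed device names:
# TAC-supplied steps as described above -- do not run these without TAC
echo 0 > /boot/SW_RAID
activate_sw_raid
# Compare the partition layout of the good disk and the replacement
fdisk -l /dev/sda
fdisk -l /dev/sdb
# Then confirm whether the mirror is actually resyncing
cat /proc/mdstat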