KostasGR
Advisor

Command to see the health of the SSD in a Check Point appliance?

Hello community,

Do you know a command to see the health of the SSD in a Check Point appliance?

BR,
Kostas

0 Kudos
1 Solution

Accepted Solutions
Bob_Zimmerman
Authority

First, an important note. SSD health isn't really a thing you need to check. Failures due to wear have been exceedingly rare for decades. Almost all failures are due to controller faults. The controller software just does something incorrect and bricks the drive. Controller faults don't have any advance warning signs.

With that understanding, it depends on exactly what you mean by "health". Most of the raw data from the drives can be shown with 'smartctl -a <block device node>':

[Expert@DallasSA]# smartctl -a /dev/sda
smartctl 5.40 2010-10-16 r3189 [i686-redhat-linux-gnu] (local build)
Copyright (C) 2002-10 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF INFORMATION SECTION ===
Device Model:     INTEL SSDSC2BA400G3
Serial Number:    <redacted>
Firmware Version: <redacted>
User Capacity:    400,088,457,216 bytes
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:   9
ATA Standard is:  Not recognized. Minor revision code: 0x0110
Local Time is:    Sun Feb 12 16:06:57 2023 GMT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00)	Offline data collection activity
					was never started.
					Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0)	The previous self-test routine completed
					without error or no self-test has ever 
					been run.
Total time to complete Offline 
data collection: 		 (   0) seconds.
Offline data collection
capabilities: 			 (0x79) SMART execute Offline immediate.
					No Auto Offline data collection support.
					Suspend Offline collection upon new
					command.
					Offline surface scan supported.
					Self-test supported.
					Conveyance Self-test supported.
					Selective Self-test supported.
SMART capabilities:            (0x0003)	Saves SMART data before entering
					power-saving mode.
					Supports SMART auto save timer.
Error logging capability:        (0x01)	Error logging supported.
					General Purpose Logging supported.
Short self-test routine 
recommended polling time: 	 (   1) minutes.
Extended self-test routine
recommended polling time: 	 (   2) minutes.
Conveyance self-test routine
recommended polling time: 	 (   2) minutes.
SCT capabilities: 	       (0x003d)	SCT Status supported.
					SCT Error Recovery Control supported.
					SCT Feature Control supported.
					SCT Data Table supported.

SMART Attributes Data Structure revision number: 1
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0032   100   100   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       35247
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       2469
170 Unknown_Attribute       0x0033   100   100   010    Pre-fail  Always       -       0
171 Unknown_Attribute       0x0032   100   100   000    Old_age   Always       -       0
172 Unknown_Attribute       0x0032   100   100   000    Old_age   Always       -       0
174 Unknown_Attribute       0x0032   100   100   000    Old_age   Always       -       439
175 Program_Fail_Count_Chip 0x0033   100   100   010    Pre-fail  Always       -       498544869989
183 Runtime_Bad_Block       0x0032   100   100   000    Old_age   Always       -       2
184 End-to-End_Error        0x0033   100   100   090    Pre-fail  Always       -       0
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
190 Airflow_Temperature_Cel 0x0022   075   062   000    Old_age   Always       -       25 (Min/Max 22/30)
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       439
194 Temperature_Celsius     0x0022   100   100   000    Old_age   Always       -       36
197 Current_Pending_Sector  0x0032   100   100   000    Old_age   Always       -       0
199 UDMA_CRC_Error_Count    0x003e   100   100   000    Old_age   Always       -       0
225 Load_Cycle_Count        0x0032   100   100   000    Old_age   Always       -       577149
226 Load-in_Time            0x0032   100   100   000    Old_age   Always       -       184
227 Torq-amp_Count          0x0032   100   100   000    Old_age   Always       -       22
228 Power-off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       2093714
232 Available_Reservd_Space 0x0033   100   100   010    Pre-fail  Always       -       0
233 Media_Wearout_Indicator 0x0032   100   100   000    Old_age   Always       -       0
234 Unknown_Attribute       0x0032   100   100   000    Old_age   Always       -       1536
241 Total_LBAs_Written      0x0032   100   100   000    Old_age   Always       -       577149
242 Total_LBAs_Read         0x0032   100   100   000    Old_age   Always       -       171602

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
No self-tests have been logged.  [To run self-tests, use: smartctl -t]


SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

[Expert@DallasSA]# 

Again, for SSDs, this data has no practical purpose. That SSD was used when I got it, I have reinstalled various GAiA versions on it over 30 times, and I have still used less than 1% of the rated lifespan. And SSDs typically last 20+ times longer than their rated lifespan.
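If you still want a compact view rather than the full dump, the attribute table can be filtered down to a handful of rows. The following is a minimal sketch, not part of the original answer: the function name smart_summary and the default attribute IDs (5, 9, 187, 197, 233) are assumptions chosen for illustration, and which IDs are meaningful depends on the drive vendor.

```shell
# Hypothetical helper: filter "smartctl -A" (or "-a") output read from stdin
# down to selected attribute IDs, printing the normalized VALUE and RAW_VALUE.
# The default ID list is an assumption; pass your own as the first argument.
smart_summary() {
  awk -v ids="${1:-5 9 187 197 233}" '
    BEGIN {
      # Build a lookup table of the attribute IDs we want to keep.
      n = split(ids, a, " ")
      for (i = 1; i <= n; i++) want[a[i]] = 1
    }
    # Attribute rows have at least 10 columns; this skips the short rows of
    # the selective self-test table that also begin with a small number.
    ($1 in want) && NF >= 10 {
      printf "%s %s value=%s raw=%s\n", $1, $2, $4, $10
    }
  '
}

# Usage (Expert mode), assuming the disk is /dev/sda as in the output above:
#   smartctl -A /dev/sda | smart_summary
#   smartctl -H /dev/sda        # prints only the overall PASSED/FAILED verdict
```

How to read attribute 233 (Media_Wearout_Indicator) is vendor-specific, so check the drive's datasheet before drawing conclusions from the normalized value.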


4 Replies
Chris_Atkinson
Employee

HCP has tests for this; refer to sk171436.

Also see sk97251: hwdiag lets you test the storage devices offline where further checks are needed.

CCSM R77/R80/ELITE
0 Kudos
frechtb
Explorer

In a RAID, you want this data from each of the member disks.

[Expert@Hostname:0]# find /dev/disk/by-id/ -name '*ATA*' -a ! -name '*part*'

[Expert@Hostname:0]# smartctl -d ata -v 9,raw48 -A /dev/disk/by-id/scsi-SATA_INTEL_SSDSC2KB4PHYF120401SC480BGN
smartctl 5.40 2010-10-16 r3189 [i686-redhat-linux-gnu] (local build)
Copyright (C) 2002-10 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF READ SMART DATA SECTION ===
SMART Attributes Data Structure revision number: 1
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0032   099   099   000    Old_age   Always       -       2
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       17594
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       167
...
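
The two commands above can be combined into a loop that visits every member disk. This is only a sketch under assumptions: the function names list_ata_disks and check_all_disks are made up for illustration, the optional directory argument exists only so the listing can be tried against a test directory, and the loop uses smartctl -H (overall health verdict) instead of the -v 9,raw48 -A form shown above.

```shell
# List whole-disk ATA device links, skipping the per-partition links.
# The directory argument is an assumption added for testing; by default the
# function looks in /dev/disk/by-id as in the find command above.
list_ata_disks() {
  find "${1:-/dev/disk/by-id}" -name '*ATA*' -a ! -name '*part*' | sort
}

# Print the overall SMART health verdict for each disk found.
check_all_disks() {
  list_ata_disks "$@" | while read -r disk; do
    echo "== $disk"
    smartctl -d ata -H "$disk"
  done
}

# Usage (Expert mode):
#   check_all_disks
```

Swap the -H for -A if you want the full attribute table per disk rather than just the verdict.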

0 Kudos
the_rock
Legend

To add to everything that was said here, you can also refer to the link below:

https://supportcenter.checkpoint.com/supportcenter/portal?eventSubmit_doGoviewsolutiondetails=&solut...

Also, another link points out some generic differences, though I'm sure you already know them:

https://www.avast.com/c-ssd-vs-hdd#:~:text=SSD%20vs%20HDD%3A%20What's%20the,less%20energy%2C%20and%2....

Cheers,

Andy

0 Kudos
