Mark6 SMART Data Utility
The m6modsmartdata.py utility displays the SMART data for the disks in a selected module. It is deployed in /usr/bin on all Mark6 units at the sites and at the correlator, and it must be run locally on the unit: as the vlbamon user on site Mark6 units, and as the difx user on correlator Mark6 units.
Usage:
usage : m6modsmartdata.py [options] <slot number>
options: -s - short output, show only serial number and error log
<slot number> must be 1, 2, 3 or 4
Example output, normal form, first disk only:
difx@mark6fx01 VLBADIFX7-2.5.2 ~> m6modsmartdata.py 1
=====================================================================
SMART data for module LBO%0086
=====================================================================
=====================================================================
DISK 0 Information Section
=====================================================================
Model Family: HGST Ultrastar He10
Device Model: HGST HUH721010ALE604
Serial Number: 2YKH788D
LU WWN Device Id: 5 000cca 273f1335b
Firmware Version: LHGNW384
User Capacity: 10,000,831,348,736 bytes [10.0 TB]
Sector Sizes: 512 bytes logical, 4096 bytes physical
Rotation Rate: 7200 rpm
Form Factor: 3.5 inches
Device is: In smartctl database [for details use: -P show]
ATA Version is: ACS-2, ATA8-ACS T13/1699-D revision 4
SATA Version is: SATA 3.2, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is: Thu Jan 16 14:36:39 2020 MST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
=====================================================================
DISK 0 Attributes Section
=====================================================================
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x000b 100 100 016 Pre-fail Always - 0
2 Throughput_Performance 0x0005 133 133 054 Pre-fail Offline - 100
3 Spin_Up_Time 0x0007 148 148 024 Pre-fail Always - 443 (Average 447)
4 Start_Stop_Count 0x0012 100 100 000 Old_age Always - 22
5 Reallocated_Sector_Ct 0x0033 100 100 005 Pre-fail Always - 0
7 Seek_Error_Rate 0x000b 100 100 067 Pre-fail Always - 0
8 Seek_Time_Performance 0x0005 128 128 020 Pre-fail Offline - 18
9 Power_On_Hours 0x0012 100 100 000 Old_age Always - 1083
10 Spin_Retry_Count 0x0013 100 100 060 Pre-fail Always - 0
12 Power_Cycle_Count 0x0032 100 100 000 Old_age Always - 22
22 Helium_Level 0x0023 100 100 025 Pre-fail Always - 100
192 Power-Off_Retract_Count 0x0032 100 100 000 Old_age Always - 114
193 Load_Cycle_Count 0x0012 100 100 000 Old_age Always - 114
194 Temperature_Celsius 0x0002 214 214 000 Old_age Always - 28 (Min/Max 15/36)
196 Reallocated_Event_Count 0x0032 100 100 000 Old_age Always - 0
197 Current_Pending_Sector 0x0022 100 100 000 Old_age Always - 0
198 Offline_Uncorrectable 0x0008 100 100 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x000a 200 200 000 Old_age Always - 0
=====================================================================
DISK 0 Error Log Section
=====================================================================
No Errors Logged
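The attribute rows above follow the usual smartctl column layout, so they can be screened mechanically. The following is a minimal sketch, not part of m6modsmartdata.py: the names SmartAttribute, parse_attribute_rows and at_or_below_threshold are hypothetical, and it assumes the utility's output has been captured as text. It treats a normalized VALUE at or below a nonzero THRESH as the condition worth flagging.

```python
from typing import List, NamedTuple


class SmartAttribute(NamedTuple):
    """One row of the Attributes Section (hypothetical helper type)."""
    attr_id: int
    name: str
    value: int
    worst: int
    thresh: int
    attr_type: str
    raw: str


def parse_attribute_rows(text: str) -> List[SmartAttribute]:
    """Collect attribute rows from captured output, skipping all other lines."""
    rows = []
    for line in text.splitlines():
        # Ten columns; RAW_VALUE may contain spaces, so cap the split at 9.
        fields = line.split(None, 9)
        if len(fields) < 10:
            continue
        # Attribute rows start with a numeric ID and carry a 0x... flag.
        if not fields[0].isdigit() or not fields[2].startswith("0x"):
            continue
        rows.append(SmartAttribute(
            attr_id=int(fields[0]),
            name=fields[1],
            value=int(fields[3]),
            worst=int(fields[4]),
            thresh=int(fields[5]),
            attr_type=fields[6],
            raw=fields[9].strip(),
        ))
    return rows


def at_or_below_threshold(rows: List[SmartAttribute]) -> List[SmartAttribute]:
    """Rows whose normalized VALUE has reached a nonzero THRESH."""
    return [r for r in rows if r.thresh > 0 and r.value <= r.thresh]
```

Applied to the DISK 0 listing above, at_or_below_threshold would return an empty list, since every VALUE is well above its THRESH.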
Example of a disk with an error:
=====================================================================
DISK 6 Information Section
=====================================================================
Device Model: WDC WD8003FRYZ-01JPDB1
Serial Number: 7SJ4ULAW
LU WWN Device Id: 5 000cca 252de608f
Firmware Version: 01.01H02
User Capacity: 8,001,563,222,016 bytes [8.00 TB]
Sector Sizes: 512 bytes logical, 4096 bytes physical
Rotation Rate: 7200 rpm
Form Factor: 3.5 inches
Device is: Not in smartctl database [for details use: -P showall]
ATA Version is: ACS-2, ATA8-ACS T13/1699-D revision 4
SATA Version is: SATA 3.2, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is: Thu Jan 16 14:40:05 2020 MST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
=====================================================================
DISK 6 Attributes Section
=====================================================================
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x000b 100 100 016 Pre-fail Always - 0
2 Throughput_Performance 0x0005 133 133 054 Pre-fail Offline - 100
3 Spin_Up_Time 0x0007 150 150 024 Pre-fail Always - 437 (Average 439)
4 Start_Stop_Count 0x0012 100 100 000 Old_age Always - 37
5 Reallocated_Sector_Ct 0x0033 100 100 005 Pre-fail Always - 0
7 Seek_Error_Rate 0x000b 100 100 067 Pre-fail Always - 0
8 Seek_Time_Performance 0x0005 128 128 020 Pre-fail Offline - 18
9 Power_On_Hours 0x0012 100 100 000 Old_age Always - 3162
10 Spin_Retry_Count 0x0013 100 100 060 Pre-fail Always - 0
12 Power_Cycle_Count 0x0032 100 100 000 Old_age Always - 37
22 Unknown_Attribute 0x0023 100 100 025 Pre-fail Always - 100
192 Power-Off_Retract_Count 0x0032 100 100 000 Old_age Always - 328
193 Load_Cycle_Count 0x0012 100 100 000 Old_age Always - 328
194 Temperature_Celsius 0x0002 230 230 000 Old_age Always - 26 (Min/Max 14/44)
196 Reallocated_Event_Count 0x0032 100 100 000 Old_age Always - 0
197 Current_Pending_Sector 0x0022 100 100 000 Old_age Always - 0
198 Offline_Uncorrectable 0x0008 100 100 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x000a 200 200 000 Old_age Always - 1
=====================================================================
DISK 6 Error Log Section
=====================================================================
ATA Error Count: 1
CR = Command Register [HEX]
FR = Features Register [HEX]
SC = Sector Count Register [HEX]
SN = Sector Number Register [HEX]
CL = Cylinder Low Register [HEX]
CH = Cylinder High Register [HEX]
DH = Device/Head Register [HEX]
DC = Device Command Register [HEX]
ER = Error register [HEX]
ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.
Error 1 occurred at disk power-on lifetime: 2343 hours (97 days + 15 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
84 41 00 00 00 00 00 Error: ICRC, ABRT at LBA = 0x00000000 = 0
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
60 00 00 78 e3 b7 40 00 4d+23:56:00.061 READ FPDMA QUEUED
60 00 08 78 e4 b7 40 00 4d+23:56:00.060 READ FPDMA QUEUED
60 00 08 78 e2 b7 40 00 4d+23:56:00.059 READ FPDMA QUEUED
60 00 00 78 e1 b7 40 00 4d+23:56:00.059 READ FPDMA QUEUED
60 00 08 78 e0 b7 40 00 4d+23:56:00.058 READ FPDMA QUEUED
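The Powered_Up_Time format described in the error log (DDd+hh:mm:SS.sss, with the day part omitted when it is zero) can be converted to milliseconds as sketched below. The function name powered_up_ms is hypothetical. The quoted wrap point of 49.710 days is consistent with a 32-bit millisecond counter, which is the assumption behind WRAP_MS here.

```python
import re

# Optional "<days>d+" prefix, then hh:mm:SS.sss
_STAMP = re.compile(r"^(?:(\d+)d\+)?(\d{1,2}):(\d{2}):(\d{2})\.(\d{3})$")

# Assumption: the counter is 32 bits of milliseconds, hence the wrap.
WRAP_MS = 2 ** 32


def powered_up_ms(stamp: str) -> int:
    """Convert a Powered_Up_Time stamp to milliseconds since power-on."""
    match = _STAMP.match(stamp.strip())
    if match is None:
        raise ValueError("unrecognized Powered_Up_Time: %r" % stamp)
    days, hours, minutes, seconds, millis = match.groups()
    total_seconds = (
        int(days or 0) * 86400
        + int(hours) * 3600
        + int(minutes) * 60
        + int(seconds)
    )
    return total_seconds * 1000 + int(millis)


if __name__ == "__main__":
    print(powered_up_ms("4d+23:56:00.061"))   # stamp from the DISK 6 log
    print(WRAP_MS / 86_400_000)               # wrap point in days
```

Because the counter wraps, a stamp such as 4d+23:56:00.061 is not comparable with the power-on lifetime in hours reported on the "Error N occurred at" line; it only orders commands within one power-on period.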
Example of short form output for a 5 disk module:
difx@mark6fx01 VLBADIFX7-2.5.2 ~> m6modsmartdata.py -s 3
=====================================================================
SMART data for module JPLK%006
=====================================================================
=====================================================================
DISK 0 Information Section
=====================================================================
Serial Number: 7SJ7YAVW
=====================================================================
DISK 0 Error Log Section
=====================================================================
No Errors Logged
=====================================================================
DISK 1 Information Section
=====================================================================
Serial Number: 7SJ3573W
=====================================================================
DISK 1 Error Log Section
=====================================================================
No Errors Logged
=====================================================================
DISK 2 Information Section
=====================================================================
Serial Number: 7SJ56TLW
=====================================================================
DISK 2 Error Log Section
=====================================================================
No Errors Logged
=====================================================================
DISK 4 Information Section
=====================================================================
Serial Number: 7SJ2RA3W
=====================================================================
DISK 4 Error Log Section
=====================================================================
No Errors Logged
=====================================================================
DISK 6 Information Section
=====================================================================
Serial Number: 7SJ4ULAW
=====================================================================
DISK 6 Error Log Section
=====================================================================
ATA Error Count: 1
CR = Command Register [HEX]
FR = Features Register [HEX]
SC = Sector Count Register [HEX]
SN = Sector Number Register [HEX]
CL = Cylinder Low Register [HEX]
CH = Cylinder High Register [HEX]
DH = Device/Head Register [HEX]
DC = Device Command Register [HEX]
ER = Error register [HEX]
ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.
Error 1 occurred at disk power-on lifetime: 2343 hours (97 days + 15 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
84 41 00 00 00 00 00 Error: ICRC, ABRT at LBA = 0x00000000 = 0
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
60 00 00 78 e3 b7 40 00 4d+23:56:00.061 READ FPDMA QUEUED
60 00 08 78 e4 b7 40 00 4d+23:56:00.060 READ FPDMA QUEUED
60 00 08 78 e2 b7 40 00 4d+23:56:00.059 READ FPDMA QUEUED
60 00 00 78 e1 b7 40 00 4d+23:56:00.059 READ FPDMA QUEUED
60 00 08 78 e0 b7 40 00 4d+23:56:00.058 READ FPDMA QUEUED
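For a module with several disks, the short form output can be reduced to one line per disk. The sketch below is illustrative only: summarize_output is a hypothetical name, and it assumes the output has been captured as text with the section headers, "Serial Number:", "No Errors Logged" and "ATA Error Count:" lines exactly as shown in the examples above.

```python
import re
from typing import Dict, Optional

_SECTION = re.compile(r"^DISK (\d+) (?:Information|Attributes|Error Log) Section$")
_ERRORS = re.compile(r"^ATA Error Count: (\d+)")


def summarize_output(text: str) -> Dict[int, Dict[str, Optional[object]]]:
    """Map disk number -> {'serial': str or None, 'errors': int or None}."""
    disks: Dict[int, Dict[str, Optional[object]]] = {}
    current = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        section = _SECTION.match(line)
        if section:
            current = int(section.group(1))
            disks.setdefault(current, {"serial": None, "errors": None})
            continue
        if current is None:
            continue  # banner or module header before the first disk
        if line.startswith("Serial Number:"):
            disks[current]["serial"] = line.split(":", 1)[1].strip()
        elif line == "No Errors Logged":
            disks[current]["errors"] = 0
        else:
            count = _ERRORS.match(line)
            if count:
                disks[current]["errors"] = int(count.group(1))
    return disks


def disks_with_errors(summary: Dict[int, Dict[str, Optional[object]]]) -> list:
    """Disk numbers whose error log reported at least one ATA error."""
    return sorted(d for d, info in summary.items() if info["errors"])
```

On the five-disk listing above, disks_with_errors would report only disk 6. The same parser also works on normal form output, because it ignores lines it does not recognize.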
Remote Mark6 SMART Data Utility
The remotem6modsmartdata.py utility displays the SMART data for the disks in a selected module in a selected Mark6 unit. It is deployed in /usr/difx/bin and can be run as the difx user on any machine that has ssh access to the site and correlator Mark6 units, such as gooey and swc000. The output, in both normal and short forms, is the same as that of the locally run m6modsmartdata.py.
Usage:
Usage: remotem6modsmartdata.py [options] <unit code> <slot>
A program to show SMART data for the disks in a module
in a given slot a given site or playback mark6 unit
options: -s - short output, show only serial number and error log
<unit code> is two letter vlba site code for site mark6 units or
01 to 08 for playback mark6 units
<slot number> must be 1, 2, 3 or 4
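When the remote utility is called from a script, the argument rules in the usage text can be checked before anything is sent over ssh. This is a sketch under stated assumptions: build_remote_command is a hypothetical helper, and it assumes site codes are given in lowercase, as in the "ov" example below. It only builds the argument list and does not run anything.

```python
import re
from typing import List


def build_remote_command(unit_code: str, slot: int, short: bool = False) -> List[str]:
    """Return the argv list for remotem6modsmartdata.py after validating inputs."""
    is_site = re.fullmatch(r"[a-z]{2}", unit_code) is not None      # e.g. "ov"
    is_playback = re.fullmatch(r"0[1-8]", unit_code) is not None    # "01".."08"
    if not (is_site or is_playback):
        raise ValueError(
            "unit code must be a two-letter site code or 01 to 08: %r" % unit_code
        )
    if slot not in (1, 2, 3, 4):
        raise ValueError("slot number must be 1, 2, 3 or 4: %r" % slot)

    argv = ["remotem6modsmartdata.py"]
    if short:
        argv.append("-s")  # short output: serial number and error log only
    argv.extend([unit_code, str(slot)])
    return argv


if __name__ == "__main__":
    print(build_remote_command("ov", 2, short=True))
```

The returned list could be passed to subprocess.run on a machine where the utility is installed, which avoids shell quoting issues.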
Example of short form output from ov-mark6-1 slot 2:
difx@swc000 VLBADIFX7-2.5.2 ~> remotem6modsmartdata.py -s ov 2
***************************************************************************
National Radio Astronomy Observatory computing facilities are exclusively
for the use of authorized personnel, who are expected to abide by the
terms of the NRAO Computing Security and Computing Use Policies.
***************************************************************************
=====================================================================
SMART data for module LBO%0087
=====================================================================
=====================================================================
DISK 0 Information Section
=====================================================================
Serial Number: 2YKHBBED
=====================================================================
DISK 0 Error Log Section
=====================================================================
No Errors Logged
=====================================================================
DISK 1 Information Section
=====================================================================
Serial Number: 2YKH6BHD
=====================================================================
DISK 1 Error Log Section
=====================================================================
No Errors Logged
=====================================================================
DISK 2 Information Section
=====================================================================
Serial Number: 2YKGL7DD
=====================================================================
DISK 2 Error Log Section
=====================================================================
No Errors Logged
=====================================================================
DISK 4 Information Section
=====================================================================
Serial Number: 2YKH3PZD
=====================================================================
DISK 4 Error Log Section
=====================================================================
No Errors Logged
=====================================================================
DISK 6 Information Section
=====================================================================
Serial Number: 2YKGL6JD
=====================================================================
DISK 6 Error Log Section
=====================================================================
No Errors Logged
Example of short form output from mark6fx03 slot 2 (LBO%0019 disk 4 looks like it could use some attention):
difx@swc000 VLBADIFX7-2.5.2 ~> remotem6modsmartdata.py -s 03 2
***************************************************************************
National Radio Astronomy Observatory computing facilities are exclusively
for the use of authorized personnel, who are expected to abide by the
terms of the NRAO Computing Security and Computing Use Policies.
***************************************************************************
=====================================================================
SMART data for module LBO%0019
=====================================================================
=====================================================================
DISK 0 Information Section
=====================================================================
Serial Number: ZA15A643
=====================================================================
DISK 0 Error Log Section
=====================================================================
No Errors Logged
=====================================================================
DISK 1 Information Section
=====================================================================
Serial Number: ZA17ZG7E
=====================================================================
DISK 1 Error Log Section
=====================================================================
No Errors Logged
=====================================================================
DISK 2 Information Section
=====================================================================
Serial Number: ZA17ZMW2
=====================================================================
DISK 2 Error Log Section
=====================================================================
No Errors Logged
=====================================================================
DISK 4 Information Section
=====================================================================
Serial Number: ZA150DWL
=====================================================================
DISK 4 Error Log Section
=====================================================================
ATA Error Count: 845 (device log contains only the most recent five errors)
CR = Command Register [HEX]
FR = Features Register [HEX]
SC = Sector Count Register [HEX]
SN = Sector Number Register [HEX]
CL = Cylinder Low Register [HEX]
CH = Cylinder High Register [HEX]
DH = Device/Head Register [HEX]
DC = Device Command Register [HEX]
ER = Error register [HEX]
ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.
Error 845 occurred at disk power-on lifetime: 3296 hours (137 days + 8 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 00 ff ff ff 0f Error: UNC at LBA = 0x0fffffff = 268435455
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
60 00 08 ff ff ff 4f 00 14:51:08.425 READ FPDMA QUEUED
60 00 00 ff ff ff 4f 00 14:51:08.425 READ FPDMA QUEUED
60 00 08 ff ff ff 4f 00 14:51:06.083 READ FPDMA QUEUED
60 00 08 ff ff ff 4f 00 14:51:03.750 READ FPDMA QUEUED
60 00 08 ff ff ff 4f 00 14:51:02.667 READ FPDMA QUEUED
Error 844 occurred at disk power-on lifetime: 3296 hours (137 days + 8 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 00 ff ff ff 0f Error: UNC at LBA = 0x0fffffff = 268435455
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
60 00 08 ff ff ff 4f 00 14:50:09.277 READ FPDMA QUEUED
60 00 08 ff ff ff 4f 00 14:50:09.276 READ FPDMA QUEUED
60 00 00 ff ff ff 4f 00 14:50:09.275 READ FPDMA QUEUED
2f 00 01 10 00 00 00 00 14:50:09.275 READ LOG EXT
60 00 00 ff ff ff 4f 00 14:50:03.003 READ FPDMA QUEUED
Error 843 occurred at disk power-on lifetime: 3296 hours (137 days + 8 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 00 ff ff ff 0f Error: UNC at LBA = 0x0fffffff = 268435455
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
60 00 00 ff ff ff 4f 00 14:50:03.003 READ FPDMA QUEUED
60 00 00 ff ff ff 4f 00 14:50:02.876 READ FPDMA QUEUED
60 00 00 ff ff ff 4f 00 14:50:02.759 READ FPDMA QUEUED
60 00 00 ff ff ff 4f 00 14:50:02.641 READ FPDMA QUEUED
60 00 00 ff ff ff 4f 00 14:50:02.533 READ FPDMA QUEUED
Error 842 occurred at disk power-on lifetime: 3296 hours (137 days + 8 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 00 ff ff ff 0f Error: UNC at LBA = 0x0fffffff = 268435455
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
60 00 08 ff ff ff 4f 00 14:47:54.370 READ FPDMA QUEUED
60 00 00 ff ff ff 4f 00 14:47:54.370 READ FPDMA QUEUED
60 00 08 ff ff ff 4f 00 14:47:53.270 READ FPDMA QUEUED
60 00 08 ff ff ff 4f 00 14:47:50.920 READ FPDMA QUEUED
60 00 08 ff ff ff 4f 00 14:47:49.845 READ FPDMA QUEUED
Error 841 occurred at disk power-on lifetime: 3296 hours (137 days + 8 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 00 ff ff ff 0f Error: UNC at LBA = 0x0fffffff = 268435455
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
60 00 08 ff ff ff 4f 00 14:46:56.101 READ FPDMA QUEUED
60 00 00 ff ff ff 4f 00 14:46:56.100 READ FPDMA QUEUED
60 00 08 ff ff ff 4f 00 14:46:53.784 READ FPDMA QUEUED
60 00 08 ff ff ff 4f 00 14:46:52.717 READ FPDMA QUEUED
60 00 08 ff ff ff 4f 00 14:46:50.393 READ FPDMA QUEUED
=====================================================================
DISK 6 Information Section
=====================================================================
Serial Number: ZA17ZMT8
=====================================================================
DISK 6 Error Log Section
=====================================================================
No Errors Logged
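The register dumps in the error logs above can be decoded by hand. The sketch below reproduces the "Error: ..." text and the LBA that the log prints from the ER, SN, CL, CH and DH columns. The helper names are hypothetical, the bit table covers only the four flags that are standard ATA error register bits relevant to these examples, and the LBA is assembled as a 28-bit address, which is an assumption that matches the values shown.

```python
from typing import List

# Subset of ATA error register bits (enough for the examples above).
ERROR_BITS = {
    0x80: "ICRC",  # interface CRC error
    0x40: "UNC",   # uncorrectable data error
    0x10: "IDNF",  # sector ID not found
    0x04: "ABRT",  # command aborted
}


def decode_error_register(er: int) -> List[str]:
    """Names of the known flags set in the ER byte, highest bit first."""
    return [name for bit, name in ERROR_BITS.items() if er & bit]


def lba28(sn: int, cl: int, ch: int, dh: int) -> int:
    """Assemble a 28-bit LBA: low nibble of DH, then CH, CL, SN."""
    return ((dh & 0x0F) << 24) | (ch << 16) | (cl << 8) | sn


if __name__ == "__main__":
    # LBO%0019 disk 4: ER=40, SN CL CH DH = ff ff ff 0f
    print(decode_error_register(0x40), hex(lba28(0xFF, 0xFF, 0xFF, 0x0F)))
    # DISK 6 of the earlier example: ER=84, all address registers zero
    print(decode_error_register(0x84), lba28(0x00, 0x00, 0x00, 0x00))
```

Note that 0x0fffffff is the largest value a 28-bit field can hold, so the LBA reported for LBO%0019 disk 4 may not be the true address of the unreadable sector.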