Hello,
I have used SoftRAID for years with a Thunderbay 4 to great success.
I just purchased a new Thunderbay 6/3 and six 12TB Seagate IronWolf Pro drives to create a large home server for my photo/video work. I was hoping to run a RAID 5 with 60TB of storage (all data is also backed up off-site).
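As a quick check on the 60TB figure: RAID 5 spends one disk's worth of space on parity, so usable capacity is roughly (number of disks - 1) x disk size. A minimal sketch follows. The function names are illustrative only, and the result ignores file system and volume overhead, so the size actually reported will be somewhat lower.

```python
def raid5_usable_tb(disk_count, disk_size_tb):
    """Approximate usable RAID 5 capacity in decimal terabytes.

    One disk's worth of space is used for parity, so the usable
    space is (disk_count - 1) * disk_size_tb. Overhead is ignored.
    """
    if disk_count < 3:
        raise ValueError("RAID 5 needs at least 3 disks")
    return (disk_count - 1) * disk_size_tb


def tb_to_tib(terabytes):
    """Convert decimal terabytes (10**12 bytes) to binary tebibytes (2**40 bytes)."""
    return terabytes * 10**12 / 2**40


usable = raid5_usable_tb(6, 12)
print(f"{usable} TB usable, about {tb_to_tib(usable):.1f} TiB")
```

The second figure is included because some tools report sizes in binary units, which makes the same volume look smaller than the decimal number on the drive label.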
I started certifying the drives using a 2018 15" MacBook Pro running 10.14.6. The Thunderbay was connected through a Caldigit TS3+ Dock. All seemed to be going fine for the better part of the last 1.5 days, but when I came home from dinner tonight all drives had failed to certify (see below).
I noticed the computer was asleep when this happened, and I did have "put hard disks to sleep when possible" checked on my Mac. I also read that there could be issues with using a dock.
So, I have connected the Thunderbay directly to my MacBook and unchecked "put hard disks to sleep when possible". I have now restarted certification. After restarting, I was not prompted to resume from before, yet the current offset values seem to be where they were when the error occurred. I have no idea if this means anything; it is just an observation.
However, I am wondering whether I am wasting my time, so I thought I would post the log here to get an opinion.
I have a hard time believing 6 drives are all bad out of the box, but I don't want to waste my time if that is what SoftRAID is suggesting.
Thanks for the help!
Sep 14 20:43:54 - SoftRAID Application: The certify disk command for disk disk9, SN: ZHZ141WF, SATA bus 0, id 3 (Thunderbolt) encountered a verify error (offset 16,777,216, i/o block size = 16,777,216). Error during pass number = 1. This disk should be replaced immediately.
Sep 14 20:43:54 - SoftRAID Application: The certify disk command for disk disk9, SN: ZHZ141WF, SATA bus 0, id 3 (Thunderbolt) encountered a verify error (offset 33,554,432, i/o block size = 16,777,216). Error during pass number = 1. This disk should be replaced immediately.
Sep 14 20:43:54 - SoftRAID Application: The certify disk command for disk disk9, SN: ZHZ141WF, SATA bus 0, id 3 (Thunderbolt) encountered a verify error (offset 50,331,648, i/o block size = 16,777,216). Error during pass number = 1. This disk should be replaced immediately.
Sep 14 20:43:54 - SoftRAID Application: The certify disk command for disk disk9, SN: ZHZ141WF, SATA bus 0, id 3 (Thunderbolt) encountered a verify error (offset 67,108,864, i/o block size = 16,777,216). Error during pass number = 1. This disk should be replaced immediately.
Sep 14 20:43:56 - SoftRAID Application: The certify disk command for disk disk9, SN: ZHZ141WF, SATA bus 0, id 3 (Thunderbolt) failed because this disk has unreliable sectors. It should be replaced immediately (error number = 66).
Sep 14 20:46:04 - SoftRAID Application: The certify disk command for disk disk10, SN: ZHZ0Z8L1, SATA bus 0, id 2 (Thunderbolt) encountered a verify error (offset 16,777,216, i/o block size = 16,777,216). Error during pass number = 1. This disk should be replaced immediately.
Sep 14 20:46:04 - SoftRAID Application: The certify disk command for disk disk10, SN: ZHZ0Z8L1, SATA bus 0, id 2 (Thunderbolt) encountered a verify error (offset 33,554,432, i/o block size = 16,777,216). Error during pass number = 1. This disk should be replaced immediately.
Sep 14 20:46:04 - SoftRAID Application: The certify disk command for disk disk10, SN: ZHZ0Z8L1, SATA bus 0, id 2 (Thunderbolt) encountered a verify error (offset 50,331,648, i/o block size = 16,777,216). Error during pass number = 1. This disk should be replaced immediately.
Sep 14 20:46:05 - SoftRAID Application: The certify disk command for disk disk10, SN: ZHZ0Z8L1, SATA bus 0, id 2 (Thunderbolt) encountered a verify error (offset 67,108,864, i/o block size = 16,777,216). Error during pass number = 1. This disk should be replaced immediately.
Sep 14 20:46:06 - SoftRAID Application: The certify disk command for disk disk10, SN: ZHZ0Z8L1, SATA bus 0, id 2 (Thunderbolt) failed because this disk has unreliable sectors. It should be replaced immediately (error number = 66).
Sep 14 20:47:55 - SoftRAID Application: The certify disk command for disk disk11, SN: ZHZ0NAV4, SATA bus 0, id 1 (Thunderbolt) encountered a verify error (offset 16,777,216, i/o block size = 16,777,216). Error during pass number = 1. This disk should be replaced immediately.
Sep 14 20:47:55 - SoftRAID Application: The certify disk command for disk disk11, SN: ZHZ0NAV4, SATA bus 0, id 1 (Thunderbolt) encountered a verify error (offset 33,554,432, i/o block size = 16,777,216). Error during pass number = 1. This disk should be replaced immediately.
Sep 14 20:47:55 - SoftRAID Application: The certify disk command for disk disk11, SN: ZHZ0NAV4, SATA bus 0, id 1 (Thunderbolt) encountered a verify error (offset 50,331,648, i/o block size = 16,777,216). Error during pass number = 1. This disk should be replaced immediately.
Sep 14 20:47:55 - SoftRAID Application: The certify disk command for disk disk11, SN: ZHZ0NAV4, SATA bus 0, id 1 (Thunderbolt) encountered a verify error (offset 67,108,864, i/o block size = 16,777,216). Error during pass number = 1. This disk should be replaced immediately.
Sep 14 20:47:57 - SoftRAID Application: The certify disk command for disk disk11, SN: ZHZ0NAV4, SATA bus 0, id 1 (Thunderbolt) failed because this disk has unreliable sectors. It should be replaced immediately (error number = 66).
Sep 14 20:48:18 - SoftRAID Application: The certify disk command for disk disk7, SN: ZHZ1DHS6, SATA bus 0, id 5 (Thunderbolt) encountered a verify error (offset 16,777,216, i/o block size = 16,777,216). Error during pass number = 1. This disk should be replaced immediately.
Sep 14 20:48:18 - SoftRAID Application: The certify disk command for disk disk7, SN: ZHZ1DHS6, SATA bus 0, id 5 (Thunderbolt) encountered a verify error (offset 33,554,432, i/o block size = 16,777,216). Error during pass number = 1. This disk should be replaced immediately.
Sep 14 20:48:18 - SoftRAID Application: The certify disk command for disk disk7, SN: ZHZ1DHS6, SATA bus 0, id 5 (Thunderbolt) encountered a verify error (offset 50,331,648, i/o block size = 16,777,216). Error during pass number = 1. This disk should be replaced immediately.
Sep 14 20:48:18 - SoftRAID Application: The certify disk command for disk disk7, SN: ZHZ1DHS6, SATA bus 0, id 5 (Thunderbolt) encountered a verify error (offset 67,108,864, i/o block size = 16,777,216). Error during pass number = 1. This disk should be replaced immediately.
Sep 14 20:48:20 - SoftRAID Application: The certify disk command for disk disk7, SN: ZHZ1DHS6, SATA bus 0, id 5 (Thunderbolt) failed because this disk has unreliable sectors. It should be replaced immediately (error number = 66).
Sep 14 20:58:34 - SoftRAID Application: The certify disk command for disk disk12, SN: ZHZ10Q10, SATA bus 0, id 0 (Thunderbolt) encountered a verify error (offset 16,777,216, i/o block size = 16,777,216). Error during pass number = 1. This disk should be replaced immediately.
Sep 14 20:58:34 - SoftRAID Application: The certify disk command for disk disk12, SN: ZHZ10Q10, SATA bus 0, id 0 (Thunderbolt) encountered a verify error (offset 33,554,432, i/o block size = 16,777,216). Error during pass number = 1. This disk should be replaced immediately.
Sep 14 20:58:34 - SoftRAID Application: The certify disk command for disk disk12, SN: ZHZ10Q10, SATA bus 0, id 0 (Thunderbolt) encountered a verify error (offset 50,331,648, i/o block size = 16,777,216). Error during pass number = 1. This disk should be replaced immediately.
Sep 14 20:58:34 - SoftRAID Application: The certify disk command for disk disk12, SN: ZHZ10Q10, SATA bus 0, id 0 (Thunderbolt) encountered a verify error (offset 67,108,864, i/o block size = 16,777,216). Error during pass number = 1. This disk should be replaced immediately.
Sep 14 20:58:36 - SoftRAID Application: The certify disk command for disk disk12, SN: ZHZ10Q10, SATA bus 0, id 0 (Thunderbolt) failed because this disk has unreliable sectors. It should be replaced immediately (error number = 66).
Sep 14 21:19:08 - SoftRAID Application: The certify disk command for disk disk8, SN: ZHZ1HPWL, SATA bus 0, id 4 (Thunderbolt) encountered a verify error (offset 16,777,216, i/o block size = 16,777,216). Error during pass number = 1. This disk should be replaced immediately.
Sep 14 21:19:08 - SoftRAID Application: The certify disk command for disk disk8, SN: ZHZ1HPWL, SATA bus 0, id 4 (Thunderbolt) encountered a verify error (offset 33,554,432, i/o block size = 16,777,216). Error during pass number = 1. This disk should be replaced immediately.
Sep 14 21:19:08 - SoftRAID Application: The certify disk command for disk disk8, SN: ZHZ1HPWL, SATA bus 0, id 4 (Thunderbolt) encountered a verify error (offset 50,331,648, i/o block size = 16,777,216). Error during pass number = 1. This disk should be replaced immediately.
Sep 14 21:19:08 - SoftRAID Application: The certify disk command for disk disk8, SN: ZHZ1HPWL, SATA bus 0, id 4 (Thunderbolt) encountered a verify error (offset 67,108,864, i/o block size = 16,777,216). Error during pass number = 1. This disk should be replaced immediately.
Sep 14 21:19:10 - SoftRAID Application: The certify disk command for disk disk8, SN: ZHZ1HPWL, SATA bus 0, id 4 (Thunderbolt) failed because this disk has unreliable sectors. It should be replaced immediately (error number = 66).
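A log like the one above is easier to read when grouped by disk. The following is a minimal sketch, assuming every line follows exactly the two message shapes shown in this log. The regular expressions and function names are illustrative and are not part of SoftRAID.

```python
import re
from collections import defaultdict

# Assumed message shapes, taken from the log lines above.
VERIFY_RE = re.compile(
    r"for disk (?P<disk>disk\d+), SN: (?P<sn>\w+),.*?"
    r"verify error \(offset (?P<offset>[\d,]+),"
)
FAILED_RE = re.compile(
    r"for disk (?P<disk>disk\d+), SN: (?P<sn>\w+),.*?"
    r"failed because .*?\(error number = (?P<errno>\d+)\)"
)


def summarize_certify_log(lines):
    """Group verify-error offsets and the final error number by serial number."""
    summary = defaultdict(
        lambda: {"disk": None, "offsets": [], "error_number": None}
    )
    for line in lines:
        match = VERIFY_RE.search(line)
        if match:
            entry = summary[match.group("sn")]
            entry["disk"] = match.group("disk")
            # Offsets are logged with thousands separators, e.g. 16,777,216.
            entry["offsets"].append(int(match.group("offset").replace(",", "")))
            continue
        match = FAILED_RE.search(line)
        if match:
            entry = summary[match.group("sn")]
            entry["disk"] = match.group("disk")
            entry["error_number"] = int(match.group("errno"))
    return dict(summary)


def same_offsets_on_every_disk(summary):
    """Return True when every disk reported the identical list of offsets."""
    return len({tuple(entry["offsets"]) for entry in summary.values()}) == 1
```

Usage would be along the lines of `summary = summarize_certify_log(open("certify.log"))`, where `certify.log` is a hypothetical file holding the pasted lines. If `same_offsets_on_every_disk(summary)` returns True, every disk failed at the same offsets, which makes it easy to see whether the failures share a pattern rather than differing disk by disk.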
Drives should never fail to certify. Certify does not use the SoftRAID driver; it is all OS X. A certify failure therefore tells you that there is a hardware issue.
Try direct connecting without the dock, just to remove a variable.
When you say drives should never fail to certify, do you mean that all drives should complete the process and then you look for reported errors? I thought the point of certification was to detect problems in advance, so your statement is a bit confusing to me.
Also, does the above log give any indication if it is bad disks or my hardware setup? Or is there no way to know?
And yes I have restarted it without the dock. The Thunderbay is directly connected to the MacBook now. Closing in on 24 hours with no issues. Fingers crossed.
We certify disks all the time: new disks, used disks, etc. There is no "random" failure with a certify unless there is a hardware issue, either in the drives or in the enclosure, cable, computer, etc.
A disk should not have ANY I/O errors during certify. Any disk that does should be replaced preemptively. This prevents a great many "early" failures.
If you are interested, here is a video presentation from 2016 from our VP Engineering, Tim Standing, on the importance of certification and disk failures.
http://docs.macsysadmin.se/2016/video/Day2Session5.mp4
Since all your disks failed, there is clearly something wrong. This looks to be a computer/cable/enclosure failure, so use trial and error.
Thanks for the info!
All 6 disks have made it through 2 passes successfully, but I am going to have to cancel the certification before the 3rd pass completes, as I am leaving on a trip for which I need this computer.
I know your videos state that SoftRAID should restart certification where it left off. If this doesn't happen when I return in a week, can I just do a one-pass certification to write zeroes to the drives, or do I need to start over from scratch?
Yes, you can just "zero disk" or do a 1 pass certify.

