Hi, new user here. I am managing an AKiTiO Thunder2 Quad with 4x3TB drives in a RAID 5 configuration.
I originally used SR to configure the RAID setup, but I did not certify the drives before using them. Now, three years later, I have received the first disk failure prediction, reporting 315 unreliable sectors and 0 reallocated sectors. I removed the disk from the volume and tried to certify it, but got these messages:
"The certify disk command for disk disk3, SN: 65PHK71JFO1A, SATA bus 0, id 1 (Thunderbolt) hung while writing (offset 175,405,793,280, i/o block size = 16,777,216). This disk should be replaced immediately."
"The certify disk command for disk disk3, SN: 65PHK71JFO1A, SATA bus 0, id 1 (Thunderbolt) failed because this disk has unreliable sectors. It should be replaced immediately (error number = 66)."
A couple of questions:
1. The drive only has 344 hours of use. Is it best to replace the drive now, or can I continue using it until the drive begins reallocating sectors?
2. Can I still certify the other drives? I have backed up the data on them.
Also this is an unrelated question, but I have found that SR often gives me this error:
"An internal part of the SoftRAID application has stopped functioning properly. Please quit SoftRAID and relaunch it."
Initially I thought this had something to do with the certification process, as I would get several of these errors while certifying my drives and had to manually resume the process every time. But later on I found I was getting this error even if I just had the application open without doing anything. Is this a bug? I am running SR v5.7.2 on a 2015 MacBook Pro running macOS Mojave 10.14.1.
1. If a drive is failing certify, it is probably DOA. "Uncorrectable" sectors can be caused either by the drive itself or by an external event such as a brownout. Certifying the drive would normally either clear those sectors or reallocate them; since certify failed, I think you have a faulty drive there. This is one benefit of Certify: the problem likely would have been caught in an initial certify pass.
2. Yes, you can certify all the disks. It's never too late! Certify is a burn-in process that weeds out many DOA drives before they are put to use.
3. This is a bug we are seeing on macOS 10.14, where the SoftRAIDtool is quitting. We are working on it, but have not been able to reproduce it in house, despite trying to replicate it on a dozen systems.
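As a side note, the numbers in the first certify message quoted above can be unpacked with a little arithmetic. The sketch below is only an illustration: the offset and i/o block size are copied from that message, while the 3 * 10**12 byte disk size is an assumption based on the nominal "3TB" capacity, and the function name locate_hang is made up for this example. It says nothing about how SoftRAID itself tracks certify progress.

```python
# Values copied from the certify error message in the original post.
OFFSET_BYTES = 175_405_793_280
IO_BLOCK_BYTES = 16_777_216  # 16 MiB, i.e. 2**24 bytes

# Assumption: a "3TB" drive is treated as the nominal 3 * 10**12 bytes.
NOMINAL_DISK_BYTES = 3 * 10**12


def locate_hang(offset, block_size, disk_size):
    """Return (block index, leftover bytes, percent of disk) for an offset."""
    block_index, remainder = divmod(offset, block_size)
    percent = 100 * offset / disk_size
    return block_index, remainder, percent


block_index, remainder, percent = locate_hang(
    OFFSET_BYTES, IO_BLOCK_BYTES, NOMINAL_DISK_BYTES
)
print(f"i/o block index: {block_index} (leftover bytes: {remainder})")
print(f"position on disk: about {percent:.1f}% of nominal capacity")
```

Under these assumptions the offset falls exactly on an i/o block boundary and sits only a few percent into the disk, so the write hung early in the pass rather than near the end.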
Thanks for the advice. I have just finished certifying the other three drives, which all turned out OK. But SR would constantly quit on me, and the timing was erratic: sometimes SR would quit after a few minutes, other times after a few hours. The only consistent thing was that it would never last through the night, so I would always wake up in the morning to find that it had crashed again. So instead of the estimated 48 hours, it took a week to complete everything. Unfortunately, that's all the info I can provide regarding the bug. Hopefully you guys can figure out what's wrong.
We have replicated it after a fair amount of effort, and are trying to determine the cause. Thanks!

