RAID 5 volume appearing twice after attempted replacement of erroring drive
Is there a way to nominate which drives comprise a given RAID volume?
I've gotten into a weird situation which I'm not sure how to express to Google. After noticing a drive was reporting I/O errors, I started going through the motions to swap it with a spare. At some point the computer kernel panicked and rebooted immediately. Unfortunately I don't remember exactly what action preceded it, as this was over a Screen Sharing connection. When it came back up, I was presented with the situation shown in the attached screenshots.
Specifically, all the listed Toshiba drives should comprise the single RAID 5 volume 'lax-prod-restore'. As you can see, the volume is listed twice: the first copy is backed by drive IDs 1 and 2 and reports two others missing, and the second copy is backed by drive ID 4 and reports three others missing. The drive with ID 3 is the original problem drive and is no longer associated with any volume.
Drive ID 1 then went on to report I/O errors as well (this wasn't the case before the kernel panic reboot).
Admittedly I was on SoftRAID version 5.5.5 when this happened, although this had been stable for over a year. After seeing the mess I checked for updates and installed 5.6.3 to see if that would help. I was initially averse to upgrading as I lost this same volume in February after installing 5.5.6; at the time I had to roll back to 5.5.5 and restore from backup.
This behavior feels very erratic and unstable, and I have a hunch this may be a hardware problem outside the disks themselves - perhaps the drive enclosure or cable. I feel I've eliminated the computer, as the problems with this volume have spanned two different Mac minis.
Nonetheless, I wanted to ask here if
• anyone else thinks there's weight to that hunch? Any advice would help. I definitely don't want to live with the background risk of routine maintenance causing a kernel panic which destroys a 12TB volume.
• there's any way I can at least get back to my original situation - a single RAID 5 volume backed by 4 drives, one or two of which may be failing?
We do have up to date backups, but our older stuff is cloud-only and I'd like to avoid re-downloading multiple terabytes if possible.
I've attached 3 screenshots of what I'm seeing, and a mocked-up 4th of what I'm expecting. (can't seem to control their order - sorry!)
Brief hardware summary:
macOS Sierra 10.12.6
SoftRAID for ThunderBay versions 5.5.5, then 5.6.3
Mac mini (Late 2012)
Memory 8 GB
Enclosure: OWC ThunderBay 4, connected via Thunderbolt, daisy-chained through 2 x Promise Pegasus (although I have since removed the Pegasus units and connected the ThunderBay directly)
Drives: 4 x 4TB spinning platters (i.e. non-SSD) in RAID 5 for 12TB
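For reference, the 12TB figure comes from the usual RAID 5 arithmetic: one drive's worth of space goes to parity, and the set can tolerate only a single drive failure. A minimal sketch of that calculation follows. The function names are hypothetical and purely illustrative, and have nothing to do with SoftRAID's own tooling.

```python
def raid5_usable_tb(drive_count: int, drive_tb: float) -> float:
    """Usable capacity of a RAID 5 set.

    One drive's worth of capacity is consumed by parity,
    so usable space is (n - 1) * drive size.
    """
    if drive_count < 3:
        raise ValueError("RAID 5 needs at least 3 drives")
    return (drive_count - 1) * drive_tb


def raid5_survives(failed_drives: int) -> bool:
    """RAID 5 can rebuild from parity with at most one failed drive."""
    return failed_drives <= 1


# The setup described above: 4 drives of 4TB each.
print(raid5_usable_tb(4, 4))   # usable capacity in TB
print(raid5_survives(1))       # one failing drive: still recoverable
print(raid5_survives(2))       # two failing drives: volume is lost
```

This is why a second drive (ID 1) starting to report I/O errors is so worrying here.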
Can provide .panic, .sr_supt, .spx files privately on request.
You need to send a support file to support at softraid.
Then we can investigate. Seems like the SoftRAID disk partition maps may be damaged.