Failed disk in 4 x disk RAID-5

(@barrysharp)
Posts: 63
Member
Topic starter
 

I have a RAID-5 configured with SoftRAID using four 2TB disks. One of the disks failed. I pulled the disk and replaced it with another of equal size that matches the three other good disks.

I initialized the replacement disk and then Certified it.

During this time my two file systems (A & B just as an example of names) that resided on the RAID-5 were running in degraded mode.
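
For readers unfamiliar with degraded mode: in a generic RAID-5 layout, each stripe stores a parity block that is the XOR of its data blocks, so any one missing block can be recomputed from the survivors. The sketch below illustrates that idea for a single stripe across four disks. It is a textbook illustration only, not SoftRAID's on-disk format or code, and the function names are made up for the example.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

def rebuild_missing(stripe, missing_index):
    """Recompute the block at missing_index from the other blocks in the stripe."""
    survivors = [blk for i, blk in enumerate(stripe) if i != missing_index]
    return xor_blocks(survivors)

# One stripe on a 4-disk array: three data blocks plus one parity block.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)
stripe = data + [parity]

# Pretend the disk holding block 1 failed: the block can still be
# recomputed from the three surviving blocks.
recovered = rebuild_missing(stripe, 1)
```

A rebuild is the same calculation repeated for every stripe, with the result written to the replacement disk.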

After the replacement disk passed the Certify, I had to figure out how to get the volumes rebuilt. I had thought this process was more or less automatic, but alas not.

I reinstated the Disk Label on the replacement disk to be the same as what was on the failed disk.

I then clicked on the first file system A and added in the replacement disk.

I then clicked on the second file system B and added in the replacement disk.

I then clicked on the file system A and selected Rebuild.

I then clicked on the file system B and selected Rebuild.

This seems to have worked, as both file systems indicated:

RAID-5 - degraded - safeguard enabled
no errors - rebuilding

Did I perform the correct process for replacing the failed disk in a 4x disk RAID-5 and for rebuilding the two file systems ?

Thank you.

 
Posted : 23/02/2016 8:03 pm
(@softraid-support)
Posts: 9200
Member Admin
 

If you see a progress bar on one volume and a rebuild queued message on the other, then yes.

 
Posted : 23/02/2016 8:53 pm
(@barrysharp)
Posts: 63
Member
Topic starter
 

Thank you. That is exactly what I'm seeing. So I'm correct that the Rebuild is not automatic when I insert the new disk. It has to be started manually as I outlined, right ?

 
Posted : 23/02/2016 9:13 pm
(@softraid-support)
Posts: 9200
Member Admin
 

It should have been automatic; I don't know why it did not start automatically. (The default preference is for RAID and Mirror volumes to auto rebuild.)

Glad it is working now.

 
Posted : 23/02/2016 10:19 pm
(@barrysharp)
Posts: 63
Member
Topic starter
 

So if it's supposed to be automatic, should I file a Problem Report ? What log files would you need ?

 
Posted : 24/02/2016 1:44 am
(@softraid-support)
Posts: 9200
Member Admin
 

No need to. We do this kind of testing all the time and would have discovered it if this were a consistent problem. There are a couple of possible reasons why it did not rebuild, including whether the volumes were actually mounted (rebuilds require mounted volumes).

 
Posted : 24/02/2016 11:41 am
(@barrysharp)
Posts: 63
Member
Topic starter
 

Can I just pull one of the 4 disks out of its enclosure and place a new Initialized/Certified disk in its place and have the Rebuild start automatically, or do I have to unmount the complete RAID-5 first to replace the failed disk and then remount to have Rebuild started automatically ? Thank you.

 
Posted : 24/02/2016 12:00 pm
(@softraid-support)
Posts: 9200
Member Admin
 

To replace a disk in a RAID 4/5:
Make sure you have backups!
Best Practice: certify the new disk with three passes, so you know you have a good disk.
Best Practice: Validate the volume (can't be too careful is our motto, this is critical data, right?)
Make sure SoftRAID confirms all the current disks are in sync.
Now select the volume tile and "remove disk".
Select the disk to be removed.
In your case, you need to also do this with the second volume.
Swap out the two disks.
"add disk" to volume 1, and select the newly initialized disk (It will start rebuilding)
"Add disk" to volume 2 (it will be queued for rebuild)

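The last two steps above (the first volume starts rebuilding, the second is queued) can be pictured with a toy model in which only one rebuild runs at a time. This is a sketch under stated assumptions, not SoftRAID's implementation: the class, its method names, and the `auto_rebuild` flag (standing in for the auto-rebuild preference mentioned earlier in the thread) are all hypothetical.

```python
from collections import deque

class RebuildQueue:
    """Toy model: one rebuild runs at a time, the others wait in a queue."""

    def __init__(self, auto_rebuild=True):
        self.auto_rebuild = auto_rebuild  # hypothetical stand-in for the preference
        self.active = None                # volume currently rebuilding
        self.queued = deque()             # volumes waiting their turn

    def add_disk(self, volume):
        """Model 'add disk' on a degraded volume."""
        if not self.auto_rebuild:
            return "degraded"             # nothing starts until a manual rebuild
        return self.start_rebuild(volume)

    def start_rebuild(self, volume):
        """Start the rebuild now, or queue it if another one is running."""
        if self.active is None:
            self.active = volume
            return "rebuilding"
        self.queued.append(volume)
        return "queued"

    def finish_active(self):
        """Mark the running rebuild done and promote the next queued volume."""
        self.active = self.queued.popleft() if self.queued else None
        return self.active

q = RebuildQueue(auto_rebuild=True)
states = [q.add_disk("volume 1"), q.add_disk("volume 2")]
```

With `auto_rebuild=False`, `add_disk` leaves the volume degraded until `start_rebuild` is called explicitly, which corresponds to starting the rebuild by hand.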
 
Posted : 24/02/2016 12:07 pm
(@barrysharp)
Posts: 63
Member
Topic starter
 

I did exactly as you described. The auto Rebuild did not start. Here's the SoftRAID display showing this state after having completed all the steps you gave.

See attachment

I went ahead and manually started the Rebuild.

Here's the SoftRAID log showing what happened.

Feb 24 12:34:14 SoftRAID Tool[579] : The certify disk command for disk disk26, SN: 00 30 E0 02 E0 05 00 63, FireWire bus 0, id 0 completed successfully.
Feb 24 13:06:14 SoftRAID Tool[579] : Initializing the disk disk26, SN: 00 30 E0 02 E0 05 00 63, FireWire bus 0, id 0 to GPT format.
Feb 24 13:06:26 SoftRAIDTool[579] : Initializing EFI partition on disk26.
Feb 24 13:06:29 SoftRAID Tool[579] : The disk initialize command for disk disk26, SoftRAID ID: 06A2578DA383F3C0, FireWire bus 0, id 0 completed successfully.
Feb 24 13:09:48 SoftRAID Tool[579] : Removing a disk from the volume "HandBrake" (disk23).
Feb 24 13:09:56 SoftRAID Tool[579] : The volume remove disk command for volume "HandBrake" (disk23) completed successfully. The disk disk2, Label: "BIG-4-Disk1", SoftRAID ID: 06A07455AB257300, SATA bus 0, id 9 (Thunderbolt) was removed from the volume.
Feb 24 13:10:27 SoftRAID Tool[579] : Removing a disk from the volume "Movies-TVshows-Photos" (disk24).
Feb 24 13:10:31 SoftRAID Tool[579] : The volume remove disk command for volume "Movies-TVshows-Photos" (disk24) completed successfully. The disk disk2, Label: "BIG-4-Disk1", SoftRAID ID: 06A07455AB257300, SATA bus 0, id 9 (Thunderbolt) was removed from the volume.
Feb 24 13:20:04 SoftRAID Tool[579] : Adding a disk to the volume "Movies-TVshows-Photos" (disk24).
Feb 24 13:20:04 SoftRAIDTool[579] : Rebuilding boot caches for volume "Movies-TVshows-Photos".
Feb 24 13:20:05 SoftRAID Tool[579] : The volume add disk command for volume "Movies-TVshows-Photos" (disk24) completed successfully. The disk disk2, Label: "BIG-4-Disk1", SoftRAID ID: 06A2578DA383F3C0, SATA bus 0, id 12 (Thunderbolt) was added to the volume.
Feb 24 13:20:26 SoftRAID Tool[579] : Adding a disk to the volume "HandBrake" (disk23).
Feb 24 13:20:29 SoftRAIDTool[579] : Rebuilding boot caches for volume "HandBrake".
Feb 24 13:20:34 SoftRAID Tool[579] : The volume add disk command for volume "HandBrake" (disk23) completed successfully. The disk disk2, Label: "BIG-4-Disk1", SoftRAID ID: 06A2578DA383F3C0, SATA bus 0, id 12 (Thunderbolt) was added to the volume.
Feb 24 13:28:47 SoftRAID Driver[87] : The RAID volume "Movies-TVshows-Photos" (disk24) is out of sync. A rebuild has started manually.
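
For anyone who wants to check a log like this programmatically, here is a minimal sketch that splits each line into timestamp, source, process ID and message, then picks out the driver lines that mention a rebuild. The line format is inferred only from the excerpt above, so treat the regular expression and the helper names as assumptions rather than a documented SoftRAID log format.

```python
import re

# Inferred layout: "<Mon DD HH:MM:SS> <source>[<pid>] : <message>"
LINE_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+ \d{2}:\d{2}:\d{2}) "
    r"(?P<source>.+?)\[(?P<pid>\d+)\] : (?P<message>.*)$"
)

def parse_line(line):
    """Return the fields of one log line as a dict, or None if it does not match."""
    m = LINE_RE.match(line.strip())
    return m.groupdict() if m else None

def rebuild_events(lines):
    """Return (timestamp, message) for driver lines that mention a rebuild."""
    events = []
    for line in lines:
        entry = parse_line(line)
        if (entry and entry["source"] == "SoftRAID Driver"
                and "rebuild" in entry["message"].lower()):
            events.append((entry["stamp"], entry["message"]))
    return events

# Three lines copied from the log above.
sample = [
    'Feb 24 13:20:26 SoftRAID Tool[579] : Adding a disk to the volume "HandBrake" (disk23).',
    'Feb 24 13:20:29 SoftRAIDTool[579] : Rebuilding boot caches for volume "HandBrake".',
    'Feb 24 13:28:47 SoftRAID Driver[87] : The RAID volume "Movies-TVshows-Photos" (disk24) is out of sync. A rebuild has started manually.',
]
events = rebuild_events(sample)
```

Filtering on the driver as the source keeps the "Rebuilding boot caches" lines from the tool out of the result. In this log the only rebuild event is the manual one at 13:28:47, several minutes after both add disk commands completed.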

 
Posted : 24/02/2016 4:29 pm
(@softraid-support)
Posts: 9200
Member Admin
 

Go to SoftRAID Preferences -> RAID tab
Confirm that "Automatically rebuild RAID volumes" is selected.

thanks

 
Posted : 24/02/2016 5:33 pm
(@barrysharp)
Posts: 63
Member
Topic starter
 

OK...

While the manual Rebuild was underway on Vol 1, I went to SoftRAID Preferences -> RAID tab and found that "Automatically rebuild RAID volumes" was not selected. So I selected it and then quit the Preferences panel. As soon as this was done, the current Rebuild on Vol 1 stopped and the Queued Rebuild for Vol 2 started.

Is this expected? I was a bit disappointed that the ongoing Rebuild for Vol 1 had been stopped by the change I made to Preferences.

Questions:

What do the following check boxes on the RAID tab mean if they are checked?
Enable write cache (writes will be slower and better protection from corrupted volumes)
Fast RAID volume rebuilds
Validating updates incorrect RAID blocks

I would have thought that enabling the write cache would make writes faster but less safe. Why does it state "writes will be slower and better protection from corrupted volumes"?

 
Posted : 24/02/2016 7:14 pm
(@softraid-support)
Posts: 9200
Member Admin
 

The rebuild preference is why it was not rebuilding.
We will investigate what you reported, and see if it is an issue.
(rebuild switching from one volume to the other)

Enable write cache
- and better protection from corrupted volumes
Fast RAID volume rebuilds
Validating updates incorrect RAID blocks

All this is in the online help pages. There is a ? button on every help page, click it and you will be brought to the correct help page for that dialog box.

The text for write cache is correct. When enabled, it will accelerate RAID writes, but in the event of a crash, the data in the RAM cache may not be written out to disk. Unchecking it means all writes go directly to disk, so there is less chance of corruption in the event of a kernel panic.
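
The trade-off can be shown with a toy model of a volume that either holds writes in RAM or sends them straight to disk. This is only an illustration of the general write-cache idea described above; the class and its methods are hypothetical and say nothing about how SoftRAID implements its cache.

```python
class CachedVolume:
    """Toy block store with an optional RAM write cache."""

    def __init__(self, write_cache_enabled):
        self.write_cache_enabled = write_cache_enabled
        self.disk = {}    # blocks that have reached the disk
        self.cache = {}   # blocks held only in RAM

    def write(self, block, data):
        if self.write_cache_enabled:
            self.cache[block] = data   # fast: returns before touching the disk
        else:
            self.disk[block] = data    # slower: goes directly to the disk

    def flush(self):
        """Write everything held in RAM out to the disk."""
        self.disk.update(self.cache)
        self.cache.clear()

    def crash(self):
        """Simulate a kernel panic: whatever was only in RAM is lost."""
        self.cache.clear()

cached = CachedVolume(write_cache_enabled=True)
cached.write(0, b"new data")
cached.crash()      # block 0 never reached the disk

direct = CachedVolume(write_cache_enabled=False)
direct.write(0, b"new data")
direct.crash()      # block 0 is already on the disk
```

After the simulated crash, `cached.disk` is empty while `direct.disk` still holds block 0. That is the difference between the checked and unchecked setting.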

 
Posted : 24/02/2016 7:38 pm
(@softraid-support)
Posts: 9200
Member Admin
 

We were able to reproduce what you described: after "Automatically rebuild RAID volumes" was selected in Preferences, the running rebuild on Vol 1 stopped and the queued rebuild for Vol 2 started. We should be able to fix this in the next update.

 
Posted : 24/02/2016 8:02 pm
(@barrysharp)
Posts: 63
Member
Topic starter
 

Where is this 'write cache' located ? In the computer's RAM managed by SoftRAID or in the disk device ?

 
Posted : 24/02/2016 8:30 pm
(@softraid-support)
Posts: 9200
Member Admin
 

SoftRAID reserves it out of your computer's RAM.

 
Posted : 24/02/2016 9:35 pm