macOS 12.6 w/ SoftRAID 6.3, system had issues copying files, volume has been destroyed

(@anonymous-page)
Active Member Customer

I've been test driving Monterey w/ SoftRAID 6.3 on one of the disks in my ThunderBay Mini. My primary internal disk is still running 10.14.6 w/ SoftRAID 5.x due to software compatibility.

This evening I upgraded to 12.6 and was just testing things out. At one point I dragged a handful of files from the desktop over to a RAID-0 volume, also located in the ThunderBay enclosure. Finder began to repeatedly crash, to the point where the system was not usable. I happened to have a Terminal window open, so I issued a "reboot" from the command line.

Upon rebooting, I got the slashed circle of doom on the Monterey disk. I booted into Recovery mode (granted it was 10.14 Recovery), but Disk Utility did not recognize any volumes on the disk. I then booted back into ol' reliable on my internal disk - macOS 10.14.6. The system came up fine. SoftRAID no longer shows any volumes on the disk that had been running Monterey.

"diskutil list" does show the volume, but reports "+ERROR" in the size column.

/dev/disk10 (synthesized):
   #:                       TYPE NAME                    SIZE       IDENTIFIER
   0:      APFS Container Scheme -                      +ERROR      disk10
                                 Physical Store disk2s2

I tried running a "diskutil repairVolume" on the volume, with the following output:

Performing fsck_apfs -y -x /dev/disk2s2
Checking the container superblock
Checking the EFI jumpstart record
Checking the space manager
error: (oid 0x8adf) cib: invalid o_cksum (0xffffffffffffffff)
error: failed to read spaceman cib 0x8adf
Space manager is invalid
The volume /dev/disk2s2 could not be verified completely

Luckily this was just a test run on a secondary disk. At this point I just assume the volume has been destroyed.

Not pointing the finger at SoftRAID, but it's definitely part of the overall equation here. I plan to re-initialize the disk, re-install and perform more testing. This was all prep in (finally) performing a major upgrade on my primary disk, but for now... I think I'll continue sitting on an out-of-support but stable OS and software stack.

Topic starter Posted : 13/09/2022 11:17 pm
(@softraid-support)
Member Admin

I have no idea what happened, but SoftRAID is not a contributing factor. I have seen several APFS corrupted volumes, but none where just "Finder copies" to another volume triggered it. Very odd.

You had your Monterey on an external, it appears. Perhaps a communication issue with Thunderbolt? I wonder if, (as this is a Thunderbolt issue) the disk ejected momentarily? On external data disks, you simply get a message that the disk disappeared, but on an external boot drive, the system will crash, as it lost its startup volume. If this happened, the disk number could also have changed (they are assigned semi-randomly when connected).

Just a thought. This should not impact your decision on Monterey, but perhaps on using external boot volumes. You may or may not know, but external bootable volumes are apparently going away. Already on M1, you cannot load any (third party) extensions from external startup disks.

I do not know if my speculation is correct, but it makes sense.

Posted : 14/09/2022 1:48 am
(@anonymous-page)
Active Member Customer
Posted by: @softraid-support

I have no idea what happened, but SoftRAID is not a contributing factor.

Lol ... ok. I appreciate your optimism, but you can't just dismiss the idea so easily. After all, every I/O goes through the SoftRAID driver.

Posted by: @softraid-support

I wonder if, (as this is a Thunderbolt issue) the disk ejected momentarily?

No, the disk didn't eject. I can affirm that with the same vigor as your dismissal of SoftRAID's potential interaction. :)

Posted by: @softraid-support

You may or may not know, but external bootable volumes are apparently going away. Already on M1, you cannot load any (third party) extensions from external startup disks.

Sure, but for now, they are completely supported.

Posted by: @softraid-support

I do not know if my speculation is correct, but it makes sense.

If that's the case, then I suppose my OWC ThunderBay is defective (it is not).

Again -- as I said in my original post: "Not pointing the finger at SoftRAID, but it's definitely part of the overall equation here." SoftRAID can't be dismissed, but I have no evidence either way at this point, so I have no reason to believe it was involved. I've been a SoftRAID user since 2013, and have spent thousands of dollars on OWC storage products. Trust me, I don't want this to be a SoftRAID issue. I posted this as a heads up that something really bad happened and it involves two products common to these forums.

I'm going to re-initialize, re-install and continue testing that environment, like I said. Hopefully it will never happen again.

Topic starter Posted : 14/09/2022 11:19 am
(@softraid-support)
Member Admin

@anonymous-page 

For a piece of background, SoftRAID is a pass-through driver. It does not perform any IO on its own. In simple terms, when there is an IO to a SoftRAID volume, "disktool" asks "what disk does this go to", the driver controlling the disk responds, and the disktool in macOS performs the IO. This design is exactly how Apple's RAID driver and the Disk Utility driver work, and is done to avoid driver compatibility issues.

While there are issues with SoftRAID "working", there are no known issues with data corruption, etc, going back 20 years.

I wish I had an answer. It would be easier if it was the SoftRAID driver, we could fix it. Hopefully, you never experience this again!

Posted : 14/09/2022 3:35 pm
(@anonymous-page)
Active Member Customer

Well, unfortunately I've run into issues again with this configuration. Slightly different results when running fsck, but the volumes are toast. And this is why we do lots of testing before major upgrades, kids!

Posted by: @softraid-support

For a piece of background, SoftRAID is a pass-through driver. It does not perform any IO on its own. In simple terms, when there is an IO to a SoftRAID volume, "disktool" asks "what disk does this go to", the driver controlling the disk responds, and the disktool in macOS performs the IO. This design is exactly how Apple's RAID driver and the Disk Utility driver work, and is done to avoid driver compatibility issues.

Sure, I assume the design is similar to intermediate volumes in Linux (e.g., LVM). When you say SoftRAID doesn't perform any I/O of its own, I must ask -- when an I/O destined for a SoftRAID volume arrives at SoftRAID, let's say the synthetic volume is a 2-disk RAID-1 volume for simplicity...wouldn't SoftRAID need to generate the I/O for the mirror... which would be a duplicate with the exception of the physical disk. In the case of a RAID with parity, I would imagine SoftRAID would necessarily need to generate an I/O with much more than just the initial data from the original I/O. It would need to create checksum I/Os, determine offsets based on some mapping, then send those out for any number of disks that are part of the RAID.
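
To make the question concrete, here is a rough sketch of how one logical write fans out into several physical writes. This is generic textbook RAID, not SoftRAID's actual code: the function names, the disk labels, and the parity rotation are all made up for illustration.

```python
from functools import reduce
from operator import xor


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(xor, column) for column in zip(*blocks))


def raid1_writes(offset, data, disks):
    """Mirror: the same payload is written to every member disk."""
    return [(disk, offset, data) for disk in disks]


def raid5_full_stripe_writes(stripe, data_blocks, disks):
    """Full-stripe write: n-1 data blocks plus one computed parity block."""
    n = len(disks)
    if len(data_blocks) != n - 1:
        raise ValueError("need exactly one data block per non-parity disk")
    parity = xor_blocks(data_blocks)
    # Hypothetical layout: the parity block rotates across disks per stripe.
    parity_slot = (n - 1 - stripe) % n
    remaining = iter(data_blocks)
    return [
        (disk, stripe, parity if slot == parity_slot else next(remaining))
        for slot, disk in enumerate(disks)
    ]


if __name__ == "__main__":
    # One logical write becomes two physical writes on a 2-disk mirror.
    print(raid1_writes(4096, b"hello", ["diskA", "diskB"]))
    # Two data blocks become three physical writes on a 3-disk parity set.
    print(raid5_full_stripe_writes(0, [b"\x01\x02", b"\x04\x08"], ["d0", "d1", "d2"]))
```

Either way, something in the stack has to turn one incoming request into multiple outgoing ones, and in the parity case it has to compute a block that was never in the original request.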

Appreciate the response. Not sure what my plan is for now. Obviously I have a serious issue with my setup here and it can't be used for serious work. Although I have multiple backups, I'm not sure I want to risk putting Monterey on a volume on my internal disk...just in case... fun fun.

Topic starter Posted : 19/09/2022 9:45 am
(@softraid-support)
Member Admin

@anonymous-page 

I am not an engineer, so I can only address this in layman's terms.

The SoftRAID driver acts like a traffic cop. When there is IO to a volume and macOS sees that it is a SoftRAID-controlled disk, it queries the driver for where to put the data and writes it out. If I understand, parity data looks like any other data to macOS and the SoftRAID driver just tells macOS to write it out. I am sure lots "could" go wrong, but it is a reliable system that has few, if any, conflicts with macOS. SoftRAID's mechanism works the same way as the Apple RAID driver.

If you go to Monterey, your data volume, assuming it is on HFS, can be repaired using DiskWarrior if the directory goes wonky again. So keep a Big Sur or older volume handy; it can be in the same container that holds your Monterey volume.

Posted : 19/09/2022 11:04 am
(@anonymous-page)
Active Member Customer

@softraid-support 

If I understand, parity data looks like any other data to macOS and the SoftRAID driver just tells macOS to write it out.

I am an engineer. 😀 

Parity data doesn't exist as far as macOS is concerned. It only exists within SoftRAID -- that is, SoftRAID is the one that creates and updates parity data; therefore, it necessarily generates I/O of its own. But, I digress... take care.
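
For anyone following along, a small sketch of why a parity update implies extra I/O. It assumes plain XOR parity and the classic read-modify-write approach; the function names are hypothetical and this is not taken from SoftRAID.

```python
def xor_bytes(a, b):
    """XOR two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("blocks must be the same length")
    return bytes(x ^ y for x, y in zip(a, b))


def small_write(old_data, old_parity, new_data):
    """Update one data block in a parity stripe (read-modify-write).

    The caller asked for one write, but the RAID layer needs two reads
    (old data, old parity) and two writes (new data, new parity).
    """
    new_parity = xor_bytes(xor_bytes(old_parity, old_data), new_data)
    return {"reads": 2, "writes": 2, "new_parity": new_parity}


def rebuild_missing(surviving_blocks, parity):
    """Recover a lost data block by XORing parity with the survivors."""
    result = parity
    for block in surviving_blocks:
        result = xor_bytes(result, block)
    return result


if __name__ == "__main__":
    d0, d1 = b"\x0f", b"\xf0"
    parity = xor_bytes(d0, d1)
    print(small_write(d0, parity, b"\x00"))
    print(rebuild_missing([d1], parity) == d0)
```

The file system above only ever sees the one block it asked to write; the old-data read, the old-parity read, and the parity write all originate in the RAID layer.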

Topic starter Posted : 19/09/2022 11:17 am
(@softraid-support)
Member Admin

@anonymous-page 

You are of course correct. The SoftRAID driver does not write out the parity itself; this is also sent to the macOS disktool to perform the actual writes.

Thanks.

Posted : 19/09/2022 12:04 pm