[Sticky] Big Sur on M1 machines and SoftRAID issues
Try moving the HDDs into the upper slots temporarily, as a test. Do you still get the ejects?
So far, after adjusting the power-saving settings, things have been stable.
I've seen errors on my Thunderbay similar to what @m1macmini reports: the drives randomly go offline, the volume is no longer viable and gets ejected, and macOS gives an error. I have a RAID5 volume with 4 SSDs, and it seems most likely to occur during any "heavy" I/O to the volume.
Check your cabling and make sure it is tightly connected. Try different combinations. Just had a user today figure out it was a bad cable, and reversing it (upside down) made it stable. Go figure.
Disk ejects are the single worst thing about Thunderbolt. The problem has been around since 2014 and has never been completely fixed by Apple/Intel.
@softraid-support, I submitted logs last week. It occurs without the Samsung Touch attached. It has occurred on several occasions, sometimes overnight (I'd told it not to sleep when the display is off, but could be related to the drive/computer going idle). It has crashed when connected to the Caldigit dock, and also when directly connected to the MacBook, but the Caldigit dock was also connected to the other port. However, so far it has not occurred when I leave the Caldigit disconnected from the computer and have the Mercury Quad direct connected and the monitor connected via USB-C.
The Caldigit has several different things connected to it:
- an APC Back-UPS battery
- sometimes a monitor via DisplayPort for video and USB-C for the built-in hub, which has a webcam, Logitech receiver, and scanner attached to it.
- a G-Drive (I have been keeping it turned off but connected during testing)
- an OWC Mercury Dual (also been keeping it plugged in but off)
- sometimes my Samsung T7 Touch or one of my T5s
- sometimes a Yubikey
- sometimes an SD card
But again, I've had it crash when all the drives were turned off and disconnected from the Caldigit and the Mercury Quad was direct connected to the MacBook. Of course, the monitor then has to be connected to the Caldigit via DisplayPort when that is the case.
I'm not entirely sure what direction I should go next with troubleshooting. Try connecting things to the Caldigit one at a time... or uninstall SoftRAID. I was hoping to hear back about the logs and get a recommendation before taking the next troubleshooting steps.
If the cable is the problem, why is it that in _many_ years of using the same cables on my 2013 Mac Pro I never saw the issue? The only change is the insertion of the TB3 to TB2 adapter, and the downstream DisplayPort monitor is never impacted... so I would think the cable is not the problem 🤷‍♂️
The problem is that these are very high speed systems. A cable is not necessarily faulty in the old sense of being simply good or bad; the system can pick up interference that triggers ejects in some circumstances but not in others.
For instance, did you know plugging a cable in next to the Wi-Fi antenna on a 2018 mini can cause disks to eject?
Or leaving a phone next to a Thunderbolt cable can cause problems when a call comes in?
There are weird scenarios, as systems are moving at gigahertz speeds. So you need to keep an open mind about the cause of issues and be willing to just try things. It's frustrating. A speck of dust in the insertion point can cause disks to eject. Apple requires you to clean all ports with 90% isopropyl if you try to report disk ejects to them.
FYI: I once was getting instant ejects on a Mac Pro canister. Just touching a cable, bam. Turns out that one cable was either slightly loose or had a speck of something on the connector point. This stuff can be very sensitive.
Save the panic reports if you get crashes from things like plugging in devices, however. Disk ejections are one thing, but kernel panics are very different and should not occur.
@softraid-support, yeah, I’m not really having disk ejection issues, I had some one night that overlapped a series of kernel panics, so who knows what that was, but it’s mainly kernel panics that are my problem, and as mentioned, several times it’s happened the moment I plugged my Mercury in. I sent that panic report and a SoftRAID diagnostic in via email over the weekend but haven’t heard back yet.
Hi SoftRAID Support and Happy New Year!
Just wanted to mention that I was able to install the B48 SoftRAID drivers on my Intel-based laptop running Big Sur (11.1). I was able to connect my USB-based SoftRAID devices, as well as my Thunderbolt-based devices, on that system and they all mounted correctly. So apparently my original crashing problem with mounting the USB devices affects only my M1 Mac mini system (B48). For now, I'm just not using my USB devices on my M1 system (not needed).
Hope this update helps with isolating the problem. :-)
One problem with isolating/fixing this issue, if there is a generic issue with the M1, is there is no functional debugging kit for the M1. So there is no way for an engineer to track through the code and find the cause of a problem. Apple must have this working, but if so, it is not documented such that third parties can use it.
Hopefully, this is not a generic issue, and we can figure it out. (hardware?)
I think it's very possible this is a generic issue that Apple needs to correct, but in case it's a SoftRAID problem, at least you and others can be aware of it. Have you been able to verify that other USB-based SoftRAID devices work OK on M1? Note that I did forward a bunch of panic reports on to Apple, so if it's their problem, hopefully they will fix it soon. I understand your frustration if you have no debugging capability for M1 systems. Hopefully, you can get one soon.
Btw, I ran into another (generic?) problem using my connected Thunderbolt-based devices. I have a problem shutting down my M1 system if too many Thunderbolt devices are connected during the shutdown. Right now, I have 6 Thunderbolt devices (drive enclosures) connected, 4 of which are SoftRAID-related devices. If I shut down with all devices connected, my M1 Mac mini will appear to shut down properly, but after a few seconds it will restart and give a panic error message. If I remove half of these devices, it will cleanly shut down without restarting. I was wondering if you ran into a similar situation in your testing?
I did find a similar situation on the Apple forum. You can see here:
If you scroll down in the post, you can see my comments (Paul). Anyway, I've sent a bunch of these panic reports onto Apple and hopefully they will fix this soon too. Just wanted to mention here in case others run into this problem.
Thanks again for all your support. In general, my M1 system is running well now on Softraid B48.
Take care and again Happy New Year!
I have not seen the USB kernel panic on the M1 as of yet, but I only did preliminary testing. I will be doing more. I tested with a MEPQ in the office, but it was not extensive.
The second problem is one that has existed for some time. It is indeed a kernel bug. What happens is that at shutdown, macOS gives a "flush cache" command to each disk, one at a time. (It should be multithreaded IMO, but is not.) Larger disks have more cache, so they take longer to respond. If the time to flush cache exceeds the time allotted, you get a kernel panic.
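To make the timing argument concrete, here is a rough sketch in Python. It is only a toy model, not how macOS is actually implemented: the function names, the per-disk flush times, and the deadline are all made-up numbers for illustration, and it assumes a single overall deadline for the whole flush phase (the thread does not say what the real time limit is or how it is measured).

```python
from typing import Sequence


def sequential_flush_exceeds_deadline(
    flush_seconds: Sequence[float], deadline_seconds: float
) -> bool:
    """Disks are flushed one at a time, so the flush times add up.

    Returns True if the running total passes the (hypothetical) deadline,
    which in the scenario described above would mean a kernel panic.
    """
    elapsed = 0.0
    for seconds in flush_seconds:
        elapsed += seconds
        if elapsed > deadline_seconds:
            return True
    return False


def parallel_flush_exceeds_deadline(
    flush_seconds: Sequence[float], deadline_seconds: float
) -> bool:
    """If all disks were flushed concurrently, only the slowest disk matters."""
    return max(flush_seconds, default=0.0) > deadline_seconds


# Hypothetical numbers: each enclosure needs 4 s to flush, deadline is 20 s.
six_enclosures = [4.0] * 6
three_enclosures = [4.0] * 3

print(sequential_flush_exceeds_deadline(six_enclosures, 20.0))
print(sequential_flush_exceeds_deadline(three_enclosures, 20.0))
print(parallel_flush_exceeds_deadline(six_enclosures, 20.0))
```

With these invented numbers, six enclosures flushed one after another run past the deadline while three do not, which would match the report that removing half the devices lets the machine shut down cleanly.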
We have reported this to Apple. We also think we have a solution for SoftRAID volumes. We will be testing this and putting it in a future 6.0 beta, if our idea works.
If you unmount all your volumes, and wait a couple minutes, then shut down, it will avoid the kernel panic.
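If you want to script that workaround, something along these lines might do it. This is an untested sketch, not an official SoftRAID tool: the mount points are placeholders you would replace with your own, the 120 second wait is just my reading of "a couple minutes", and the helper names are made up. It uses the standard `diskutil unmount` command and asks System Events to shut down via `osascript`. By default it only returns the commands it would run (dry run), so you can check them first.

```python
import subprocess
import time
from typing import List, Sequence


def build_unmount_commands(mount_points: Sequence[str]) -> List[List[str]]:
    """Build one `diskutil unmount` command per mount point."""
    return [["diskutil", "unmount", mount_point] for mount_point in mount_points]


def unmount_wait_shutdown(
    mount_points: Sequence[str],
    wait_seconds: int = 120,  # assumption: "a couple minutes"
    dry_run: bool = True,
) -> List[List[str]]:
    """Unmount the volumes, wait, then ask macOS to shut down.

    With dry_run=True nothing is executed; the planned commands are returned.
    """
    commands = build_unmount_commands(mount_points)
    shutdown = ["osascript", "-e", 'tell application "System Events" to shut down']
    commands.append(shutdown)
    if dry_run:
        return commands

    for command in commands[:-1]:
        # check=True stops the script if a volume refuses to unmount.
        subprocess.run(command, check=True)
    time.sleep(wait_seconds)  # give the disks time to settle before shutdown
    subprocess.run(shutdown, check=True)
    return commands


# Placeholder volume names; substitute your own.
for planned in unmount_wait_shutdown(["/Volumes/RAID5", "/Volumes/Backup"]):
    print(" ".join(planned))
```

Because `check=True` aborts on a failed unmount, the script will not go on to shut down while a volume is still busy.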
Regarding my shut down problem, if Apple doesn't provide a fix, it would be great if you guys could provide a Softraid solution. Since the majority of my external devices are using Softraid, I think a Softraid solution would likely allow me to shut down cleanly again.
Thanks so much!