Upgrade from 6.05 t...
 
Notifications
Clear all

Upgrade from 6.05 to 6.1 causes Panics?

(@crhendo)
Active Member Customer

I have two 2108 Mac Mini's with multiple OWC RAID enclosures attached to each of them. I have been running SoftRAID 6.05 on both of them since it was released with steady upgrades of Big Sur to 11.6 a few weeks ago. Over the past few days I have tried upgrading to 6.1 on both of them. The installation itself went quite well however both machines suffered the same problem upon Shutdown which took minutes to complete before a Panic occurred and the system reboots on its own. As both Panics on each system were very similar, I spent many hours researching their cause and concluded that all my RAID enclosures were timing out because of a very odd behaviour that prevented them from closing down satisfactorily. 

I noticed on the smaller Mac mini (OWC Thunderbay 4 {with Apple Thunderbolt 2 adapter) and OWC Thunderbay 4 Mini) that if I performed a PRAM reset on the Mac mini with no devices attached the problem went away. I then added one device, performed a PRAM reset and then closed down without any problems. I then added the second device and performed a PRAM reset again with the same result. I then tried 5 starts and shutdowns without any problems and  sort of concluded that the problem "might" be fixed.

I then tried the same approach on the larger system and gradually added devices in the same manner in this order; LaCie 5big thunderbolt (with Apple Thunderbolt 2 adapter), OWC Thunderbay 6, OWC 4M2. All worked as it should without any problems. The last device I added was an OWC Mercury Elite Pro Dock which of course failed and I returned to very slow shutdowns with a Panic at the end and a reboot.

Any suggestions would be most welcome. I have run out of "spare time" to do any further testing at this stage. I have attached the Panic from the larger system and a SoftRAID report.

I have no idea what is causing this problem.

 

panic(cpu 10 caller 0xffffff801aea1983): "PowerOff timed out in phase 'Notifications completed'. Total 30000 ms:

vfs_unmountall: 729 ms

PowerOff/Restart message to priority client: 3042 ms @ 0x<ptr>, com.apple.iokit.IOStorageFamily(2.1)[7C0E4949-640F-3D1D-97AF-030903A22663]@0x<ptr>->0x<ptr>

PowerOff/Restart message to priority client: 3048 ms @ 0x<ptr>, com.apple.iokit.IOStorageFamily(2.1)[7C0E4949-640F-3D1D-97AF-030903A22663]@0x<ptr>->0x<ptr>

PowerOff/Restart message to priority client: 3040 ms @ 0x<ptr>, com.apple.iokit.IOStorageFamily(2.1)[7C0E4949-640F-3D1D-97AF-030903A22663]@0x<ptr>->0x<ptr>

PowerOff/Restart message to priority client: 3045 ms @ 0x<ptr>, com.apple.iokit.IOStorageFamily(2.1)[7C0E4949-640F-3D1D-97AF-030903A22663]@0x<ptr>->0x<ptr>

PowerOff/Restart message to priority client: 3055 ms @ 0x<ptr>, com.apple.iokit.IOStorageFamily(2.1)[7C0E4949-640F-3D1D-97AF-030903A22663]@0x<ptr>->0x<ptr>

PowerOff/Restart message to priority client: 3030 ms @ 0x<ptr>, com.apple.iokit.IOStorageFamily(2.1)[7C0E4949-640F-3D1D-97AF-030903A22663]@0x<ptr>->0x<ptr>

PowerOff/Restart message to priority client: 553 ms @ 0x<ptr>, com.apple.iokit.IOStorageFamily(2.1)[7C0E4949-640F-3D1D-97AF-030903A22663]@0x<ptr>->0x<ptr>

PowerOff/Restart message to priority client: 502 ms @ 0x<ptr>, com.apple.iokit.IOStorageFamily(2.1)[7C0E4949-640F-3D1D-97AF-030903A22663]@0x<ptr>->0x<ptr>

PowerOff/Restart message to priority client: 502 ms @ 0x<ptr>, com.apple.iokit.IOStorageFamily(2.1)[7C0E4949-640F-3D1D-97AF-030903A22663]@0x<ptr>->0x<ptr>

PowerOff/Restart message to priority client: 502 ms @ 0x<ptr>, com.apple.iokit.IOStorageFamily(2.1)[7C0E4949-640F-3D1D-97AF-030903A22663]@0x<ptr>->0x<ptr>

PowerOff/Restart message to priority client: 502 ms @ 0x<ptr>, com.apple.iokit.IOStorageFamily(2.1)[7C0E4949-640F-3D1D-97AF-030903A22663]@0x<ptr>->0x<ptr>

PowerOff/Restart message to priority client: 502 ms @ 0x<ptr>, com.apple.iokit.IOStorageFamily(2.1)[7C0E4949-640F-3D1D-97AF-030903A22663]@0x<ptr>->0x<ptr>

PowerOff/Restart message to priority client: 231 ms @ 0x<ptr>, com.apple.driver.IOBluetoothHIDDriver(8.0.5d7)[C43039EB-4B29-3526-BF39-27311505DDAD]@0x<ptr>-"@/System/Volumes/Data/SWE/macOS/BuildRoots/38cf1d983f/Library/Caches/com.apple.xbs/Sources/xnu/xnu-7195.141.6/iokit/Kernel/IOPMrootDomain.cpp:5815

Backtrace (CPU 10), Frame : Return Address

0xffffffa1b913bb30 : 0xffffff801a68cfdd 

0xffffffa1b913bb80 : 0xffffff801a7d3fd3 

0xffffffa1b913bbc0 : 0xffffff801a7c45ca 

0xffffffa1b913bc10 : 0xffffff801a631a2f 

0xffffffa1b913bc30 : 0xffffff801a68c7fd 

0xffffffa1b913bd50 : 0xffffff801a68caf3 

0xffffffa1b913bdc0 : 0xffffff801ae9cdca 

0xffffffa1b913be30 : 0xffffff801aea1983 

0xffffffa1b913be50 : 0xffffff801ae0d979 

0xffffffa1b913be80 : 0xffffff801a6d3595 

0xffffffa1b913bef0 : 0xffffff801a6d4504 

0xffffffa1b913bfa0 : 0xffffff801a63113e 

 

Process name corresponding to current thread: kernel_task

 

Mac OS version:

20G165

 

Kernel version:

Darwin Kernel Version 20.6.0: Mon Aug 30 06:12:21 PDT 2021; root:xnu-7195.141.6~3/RELEASE_X86_64

Kernel UUID: C2591F4E-EE82-33CC-8C59-DB81D9AD80DD

KernelCache slide: 0x000000001a400000

KernelCache base:  0xffffff801a600000

Kernel slide:      0x000000001a410000

Kernel text base:  0xffffff801a610000

__HIB  text base: 0xffffff801a500000

System model name: Macmini8,1 (Mac-7BA5B2DFE22DDD8C)

System shutdown begun: YES

Hibernation exit count: 0

 

System uptime in nanoseconds: 120201092818

Last Sleep:           absolute           base_tsc          base_nano

  Uptime  : 0x0000001bfc8b2628

  Sleep   : 0x0000000000000000 0x0000000000000000 0x0000000000000000

  Wake    : 0x0000000000000000 0x000000177880504f 0x0000000000000000

last started kext at 39924698102: >!AHIDKeyboard 224 (addr 0xffffff801bb3f000, size 16384)

loaded kexts:

tw.com.asmedia.driver.AsDirectHW 1.3

com.owc.driver.SCSI-Device0E 1.1.0b16

com.owc.driver.SCSI-Device 1.1.0b16

com.owc.driver.SATA-Command 8.1.0b16

com.lacie.driver.mvumi 1.3.21

com.paragon-software.filesystems.ntfs 199.8.15

com.paragon-software.filesystems.extfs 108.5.11

com.paragon-software.lvm-for-mac 1

com.softraid.driver.SoftRAID 6.1

>!ATopCaseHIDEventDriver 4050.1

>!APlatformEnabler 2.7.0d0

>AGPM 122.1

>X86PlatformShim 1.0.0

@fileutil 20.036.15

@filesystems.autofs 3.0

>AudioAUUC 1.70

>!AGraphicsDevicePolicy 6.3.5

@AGDCPluginDisplayMetrics 6.3.5

>pmtelemetry 1

|IOUserEthernet 1.0.1

>usb.!UUserHCI 1

|IO!BSerialManager 8.0.5d7

@Dont_Steal_Mac_OS_X 7.0.0

>!AHV 1

>BridgeAudioCommunication 140.4

>!AAVEBridge 6.1

>!AFIVRDriver 4.1.0

>!AUpstreamUserClient 3.6.8

>!AMCCSControl 1.14

>!ABridgeAudio!C 140.4

>!AGFXHDA 100.1.433

>!ADiskImages2 1

>ACPI_SMC_PlatformPlugin 1.0.0

>!A!ISlowAdaptiveClocking 4.0.0

>!A!IKBLGraphics 16.0.5

>!AThunderboltIP 4.0.3

>!A!IPCHPMC 2.0.1

>!A!ICFLGraphicsFramebuffer 16.0.5

@!AEthernetAquantiaAqtionFirmware 1.0.36

|SCSITaskUserClient 436.140.1

>BCMWLANFirmware4378.Hashstore 1

>BCMWLANFirmware4377.Hashstore 1

>BCMWLANFirmware4364.Hashstore 1

>BCMWLANFirmware4355.Hashstore 1

>!AFileSystemDriver 3.0.1

@filesystems.tmpfs 1

@filesystems.hfs.kext 556.100.11

@BootCache 40

@!AFSCompression.!AFSCompressionTypeZlib 1.0.0

@!AFSCompression.!AFSCompressionTypeDataless 1.0.0d1

>!ABCMWLANBusInterfacePCIeMac 1

@private.KextAudit 1.0

@filesystems.apfs 1677.141.2

>!A!II210Ethernet 2.3.1

>!AAHCIPort 346.100.2

>!AACPIButtons 6.1

>!ASMBIOS 2.1

>!AACPIEC 6.1

>!AAPIC 1.7

@!ASystemPolicy 2.0.0

@nke.applicationfirewall 311

|IOKitRegistryCompatibility 1

|EndpointSecurity 1

>!AHIDKeyboard 224

>!AActuatorDriver 4440.3

>!AMultitouchDriver 4440.3

>!AInputDeviceSupport 4400.35

>!AHS!BDriver 4050.1

>IO!BHIDDriver 8.0.5d7

|IOSCSIReducedBlockCommandsDevice 436.140.1

@kext.triggers 1.0

|IOATA!F 2.5.3

|IOSCSIParallel!F 3.0.0

>!AGraphicsControl 6.3.5

|IOAVB!F 940.4

>!ASMBus!C 1.0.18d1

>!AHDA!C 283.15

|IOHDA!F 283.15

|IO!BHost!CUARTTransport 8.0.5d7

|IO!BHost!CTransport 8.0.5d7

@!AGPUWrangler 6.3.5

>IOPlatformPluginLegacy 1.0.0

>X86PlatformPlugin 1.0.0

|IOSlowAdaptiveClocking!F 1.0.0

>!A!ILpssUARTv1 3.0.60

>!A!ILpssUARTCommon 3.0.60

>!AOnboardSerial 1.0

>usb.IOUSBHostHIDDevice 1.2

>!ASMBusPCI 1.0.14d1

>!AThunderboltEDMSink 5.0.3

|IONDRVSupport 585.1

>IOPlatformPlugin!F 6.0.0d8

|IOAccelerator!F2 442.9

@!AGraphicsDeviceControl 6.3.5

|IOGraphics!F 585.1

@plugin.IOgPTPPlugin 985.2

>!UAudio 405.39

|IOAudio!F 300.6.1

@vecLib.kext 1.2.0

>!AThunderboltDPOutAdapter 8.1.4

>!AHPM 3.4.4

>!A!ILpssI2C!C 3.0.60

>!A!ILpssI2C 3.0.60

>!A!ILpssDmac 3.0.60

>!AThunderboltDPInAdapter 8.1.4

>!AThunderboltDPAdapter!F 8.1.4

>!AThunderboltPCIUpAdapter 4.1.1

>!AThunderboltPCIDownAdapter 4.1.1

>usb.cdc.ecm 5.0.0

>usb.cdc.ncm 5.0.0

>usb.cdc 5.0.0

>usb.networking 5.0.0

>usb.!UHostCompositeDevice 1.2

>!ABSDKextStarter 3

|IOSurface 290.8.1

@filesystems.hfs.encodings.kext 1

>!ABCMWLANCoreMac 1.0.0

|IOSerial!F 11

|IO80211!FV2 1200.12.2b1

|IOSkywalk!F 1

>IOImageLoader 1.0.0

>corecapture 1.0.4

>usb.!UHostPacketFilter 1.0

|IOUSB!F 900.4.2

>!AXsanScheme 3

|IOAHCIBlock!S 332

|IOEthernetAVB!C 1.1.0

>!AThunderboltNHI 7.2.8

|IOThunderbolt!F 9.3.2

|IOAHCI!F 294.100.1

>usb.!UVHCIBCE 1.2

>usb.!UVHCICommonBCE 1.0

>usb.!UVHCI 1.2

>usb.!UVHCICommon 1.0

>!AEffaceableNOR 1.0

|IOBufferCopy!C 1.1.0

|IOBufferCopyEngine!F 1

|IONVMe!F 2.1.0

>usb.!UXHCIPCI 1.2

>usb.!UXHCI 1.2

>!AEthernetAquantiaAqtion 1.0.64

>mDNSOffloadUserClient 1.0.1b8

>!AEFINVRAM 2.1

>!AEFIRuntime 2.1

>!ASMCRTC 1.0

|IOSMBus!F 1.1

|IOHID!F 2.0.0

$!AImage4 3.0.0

|IOTimeSync!F 985.2

|IONetworking!F 3.4

>DiskImages 493.0.0

|IO!B!F 8.0.5d7

|IOReport!F 47

|IO!BPacketLogger 8.0.5d7

$quarantine 4

$sandbox 300.0

@kext.!AMatch 1.0.0d1

|CoreAnalytics!F 1

>!ASSE 1.0

>!AKeyStore 2

>!UTDM 511.141.1

|IOUSBMass!SDriver 184.140.2

|IOSCSIBlockCommandsDevice 436.140.1

|IO!S!F 2.1

|IOSCSIArchitectureModel!F 436.140.1

>!AMobileFileIntegrity 1.0.5

@kext.CoreTrust 1

>!AFDEKeyStore 28.30

>!AEffaceable!S 1.0

>!ACredentialManager 1.0

>KernelRelayHost 1

|IOUSBHost!F 1.2

>!UHostMergeProperties 1.2

>usb.!UCommon 1.0

>!ABusPower!C 1.0

>!ASEPManager 1.0.1

>IOSlaveProcessor 1

>!AACPIPlatform 6.1

>!ASMC 3.1.9

|IOPCI!F 2.9

|IOACPI!F 1.4

>watchdog 1

@kec.pthread 1

@kec.corecrypto 11.1

@kec.Libm 1

 

 

 

Quote
Topic starter Posted : 08/10/2021 12:37 am
(@softraid-support)
Member Admin

This is unusual, for a couple reasons. I do not see a reason that 6.1 would "cause" these shutdown panics, but the upgrade process is correlated in some way. Resetting PRAM (NVRAM) on 11.6 should no longer fix anything, but it is.

 

Also, you should be encountering a different bug, one where in macOS, with more than 12 drives, you get a watchdog hang on shutdown. Its caused by macOS attempting to flush the disk cache on all drives, but does it synchronously, not asynchronously, as you would expect. You can watch drives blink on, one at a time, which this happens. It this takes too long, you get a panic, with the Watchdog in the backtrace.

 

So this entry is unusual:

panic(cpu 10 caller 0xffffff801aea1983): "PowerOff timed out in phase 'Notifications completed'. Total 30000 ms:

It seems similar to the Watchdog, in that macOS was waiting for notifications from somewhere.

 

So, if you unmount your volumes before shutdown/restart does that eliminate this issue?

ReplyQuote
Posted : 08/10/2021 8:20 am
(@crhendo)
Active Member Customer

Thank you for your quick reply AND for introducing me to the macOS watchdog timer bug. I think I now know some of the answers to the problem I am having.

Firstly, I tried unmounting the drives before shutting down without any luck.

With regard to the watchdog timer, I first discovered this problem during the SoftRAID 6.1 upgrade of my "System A" Mac mini which at the time had the following attached devices; OWC Thunderbay 4, OWC Thunderbay 4 Mini, OWC Mercury Elite Pro and LaCie 5big for a total of 16 drives. After about an hour of diagnosis, I decided to minimise my risk by removing the shared drives (LaCie 5big and OWC Mercury Elite Pro) from System A and attach them to System B, which went without a hitch (and increased the drive count on "System B" from 13 to 20 drives). I then continued my diagnosis on System A which eventually led me to the "reset PRAM" approach which appeared to have fixed the system. Thanks to you I now know that, because I reduced the drive count on System A to 9 drives, the real fix probably occurred because I had not run up against the macOS watchdog timer bug. Observing the individual drives blink on one at a time during shutdown confirms this.

I then moved to System B (which had now grown from 13 to 20 drives) and after a number of successful starts and shutdowns (and backups) I started the upgrade process to 6.1 and ran into the same problem. I then tried my reset PRAM approach until I ran into the same problem when I attached the OWC Mercury Elite Pro which lifted the drive count from 18 to 20.

My outstanding issues (as I see them) are as follows:-

  1. Why haven't I run into this problem before?
  2. Why did the 6.05 to 6.1 upgrade appear to trigger the problem?
  3. Why did my System B only run into the problem when I added the final two drives (from 18 to 20)? The watchdog timer bug should kick in at 13 drives and above?
  4. Can you recommend a workaround for the macOS watchdog timer bug?

I will try to do some more testing over the weekend.

 

ReplyQuote
Topic starter Posted : 08/10/2021 8:03 pm
(@crhendo)
Active Member Customer
Posted by: @softraid-support

 

Also, you should be encountering a different bug, one where in macOS, with more than 12 drives, you get a watchdog hang on shutdown. Its caused by macOS attempting to flush the disk cache on all drives, but does it synchronously, not asynchronously, as you would expect. You can watch drives blink on, one at a time, which this happens. It this takes too long, you get a panic, with the Watchdog in the backtrace.

 

 

When you mention "more than 12 drives" is that number fixed at 12 or is it based on one configuration you have tried. I have just run my System B with the following configurations (3 runs for each test)

 

  1. 13 Drives:- 1TB internal system ssd + Thunderbay 6 (including NVMe SSD) + LaCie 5big
  2. 15 Drives:- 1TB internal system ssd + Thunderbay 6 (including NVMe SSD) + LaCie 5big + OWC Mercury Elite Pro Dock
  3. 15 Drives:- 1TB internal system ssd + Thunderbay 6 (including NVMe SSD) + OWC 4M2 + OWC Mercury Elite Pro Dock + External USB-C NVMe SSD

 

Test 1 booted and shutdown without any Panic

Test 2 booted and shutdown with a Panic

Test 3 booted and shutdown without a panic

 

If this is the macOS watchdog timer bug, it seems that the inclusion of SSD's impacts the number of drives that can be configured before the bug raises a Panic.

 

ReplyQuote
Topic starter Posted : 08/10/2021 9:49 pm
(@softraid-support)
Member Admin

@crhendo

12 is approximate. It depends on the size of the drives (caches) and the time each takes to flush their cache. 12 is the average of HDD"s that cause this panic.

We first saw this with users with 3 or 4 Thunderbay's connected to their systems and later testing narrowed this down to not enclosures, but numbers of drives. As I mentioned, when you shut down, watch the drives flash one by one. the watchdog does not allow enough time for this.

MacOS used to just shut down, but Apple realized the voume corruption caused by shutting down too quickly. A "flush cache" command was issued to each drive at shutdown. My guess is a lower level engineer was tasked with this, as it is done one at a time, rather than all at once.

ReplyQuote
Posted : 09/10/2021 1:09 am
(@softraid-support)
Member Admin

@crhendo

I wish I could say why it is showing up now. I think this has been around since Mojave or so. There is no reason by 6.1 triggered it, that I can think of. We made a lot of changes to the application, but very few to the driver in SoftRAID. perhaps none.

As mentioned the # is not set in stone. But if you unmount your volumes before shutdown, you should be able to avoid this issue. Next shutdown, watch the drive lights. Each disk will stay on for as long as 30 seconds each. that is the drive flushing the cache. larger drives, in general, take longer. SSD's should flush almost instantaneously.

 

Do some testing with unmounting and let me know.

ReplyQuote
Posted : 09/10/2021 1:14 am
(@crhendo)
Active Member Customer

After much testing I have determined that the best solution for me is to unmount the LaCie 5big and then press the "big blue button" to put the the enclosure in a wait state similar to sleep. I can then satisfactorily shut down the system with the remaining 15 drives without causing a Panic. 

The shutdown is still painfully slow with lights flickering one at a time on each enclosure BUT I am rid of the Panic. This is absolutely crazy. Have you raised this problem with Apple?

BTW, simply unmounting the enclosures had NO effect on reducing the number of drives that would need to be "touched" by the incredibly slow shut down process.

I still have no understanding as to why I haven't witnessed this problem before now.

 

ReplyQuote
Topic starter Posted : 12/10/2021 1:51 am
(@softraid-support)
Member Admin

@crhendo

Yes this is a bug filed with Apple. All they need to do is flush the cache asynchronously, I am baffled why this was not an easy fix. I don't remember the OS where this cache was implemented, probably 13 or 14. I do not know why you did not notice this before.

(Maybe there is a secret way around this!)

ReplyQuote
Posted : 12/10/2021 1:59 am
(@crhendo)
Active Member Customer
Posted by: @softraid-support

@crhendo

(Maybe there is a secret way around this!)

I can keep a secret :)

ReplyQuote
Topic starter Posted : 12/10/2021 2:02 am
Share:
close
open