Exploring the Boundaries of Write Cache on Notebook Computers. Is it safe to use?
First, let me say that I believe I understand the OWC statement that ...
"The disadvantage of using a cache is that if you lose power between the time a file has been written to your RAID volume and when the file data gets written out from the cache, that file data will be lost. We recommend that you use a UPS (Uninterruptible Power Supply) on your Mac and disks whenever you are using a RAID 4 or RAID 5 volume. If you are not using a UPS, we recommend that you disable the write cache using the SoftRAID preferences."
So, I work on a MacBook Pro on mains power, with a Thunderbolt connection to an OWC enclosure containing a RAID 5 array (4 SATA SSDs) to which I, or rather Time Machine, frequently back up.
The MacBook effectively has a built-in UPS that will happily run it for several hours. The OWC enclosure is not running on a UPS.
However, if SoftRAID maintains its cached write blocks until it has confirmation that the data is on disk, then what data is lost if the enclosure loses power?
The next time the enclosure is powered on and connected to the same notebook, wouldn't (or shouldn't) SoftRAID simply rewrite any unconfirmed cached writes?
Any thoughts?
This is a very complex area. The first thing to realize is that disk drives hold incoming data in an onboard memory cache, which gets written out to the disk "later". How much later can vary from milliseconds to much longer. In the meantime, the drive has already signaled to macOS that the write is complete, so the system has no way of knowing that the write had not in fact reached the disk.
Drives do this for performance.
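For anyone who wants to see what "asking the drive to really write it" looks like from an application, here is a minimal sketch. It assumes Python on macOS, where the F_FULLFSYNC fcntl asks the drive to flush its buffered data to permanent storage; on systems without it, the sketch falls back to a plain fsync. The function name write_durably is made up for illustration, and none of this is how SoftRAID itself works.

```python
import fcntl
import os


def write_durably(path, data):
    """Write data to path, then ask for it to be flushed to stable storage.

    Returns the name of the flush method that was used.
    (write_durably is a hypothetical helper, not part of any product.)
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        # os.write may write fewer bytes than requested, so loop until done.
        view = memoryview(data)
        while view:
            written = os.write(fd, view)
            view = view[written:]

        # F_FULLFSYNC exists on macOS only; look it up defensively.
        full_sync = getattr(fcntl, "F_FULLFSYNC", None)
        if full_sync is not None:
            try:
                fcntl.fcntl(fd, full_sync)
                return "F_FULLFSYNC"
            except OSError:
                # Some volumes do not support it; fall through to fsync.
                pass

        # Plain fsync pushes the data to the drive, but the drive may
        # still be holding it in its own cache when this returns.
        os.fsync(fd)
        return "fsync"
    finally:
        os.close(fd)
```

Even a full flush only covers the moment it is called. Anything written afterwards is back in the situation described above, and flushing on every write costs the performance the cache was there to provide.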
The SoftRAID driver does have a mechanism for this. If a disk is suddenly disconnected while writes are in progress, the driver tries to keep a record of the last I/Os so it can replay them on startup. It's not perfect, but it works well enough.
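To make the idea of "keep a record and replay it" concrete, here is a toy sketch in Python. It is not SoftRAID's implementation; the class ReplayJournal and its methods are hypothetical names, and a real driver would also have to store this record somewhere that survives a restart.

```python
class ReplayJournal:
    """Toy model: remember each write until the disk confirms it."""

    def __init__(self):
        self.pending = {}   # sequence number -> (block number, data)
        self.next_seq = 0

    def record(self, block, data):
        """Note a write that has been sent to the disk; return its id."""
        seq = self.next_seq
        self.next_seq += 1
        self.pending[seq] = (block, data)
        return seq

    def confirm(self, seq):
        """The disk reported this write as complete, so forget it."""
        self.pending.pop(seq, None)

    def replay(self, write_block):
        """Reissue every unconfirmed write in its original order.

        write_block is a callable taking (block, data).
        Returns the number of writes that were replayed.
        """
        for seq in sorted(self.pending):
            block, data = self.pending[seq]
            write_block(block, data)
        replayed = len(self.pending)
        self.pending.clear()
        return replayed


# Usage: two writes go out, only the first is confirmed before the
# enclosure loses power. On reconnect, the second one is replayed.
journal = ReplayJournal()
first = journal.record(0, b"A")
second = journal.record(1, b"B")
journal.confirm(first)

disk = {}
journal.replay(disk.__setitem__)
print(disk)
```

Notice the gap in this scheme: a write is dropped from the record as soon as the drive confirms it. If the drive confirmed it while the data was still sitting in the drive's own cache, and then lost power, nothing on the Mac side knows it needs to be replayed.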
Here is another interesting fact. A standard Thunderbolt cable, while you are transferring data to a drive, can be carrying over 1MB of data in the actual cable. The technology is extremely fast, and a lot can go wrong. One example is the tendency of Thunderbolt devices to drop off the bus at times, when one of the controller chips in the cable ends "resets" or crashes. This can also cause data integrity issues, and there is little the user can do about it except keep meticulous backups.