
General NVMe Drive Problems (Fatal)

No one thinks WD is a panacea. It's just a go-to concern when discussing 3rd party NVMe issues.

After surveying the googz on the "loss of MMIO space" panic, I'm thinking as follows:

(I am not up to speed on your builds / context so if this doesn't pertain just dismiss it)

There's io mapper config for OC that pertains to Intel VT-d support. This comes up in Thunderbolt config and in getting certain ethernet adapters to work on later Z590+ builds, Monterey and beyond.

You are right to invoke CaseySJ's name regarding details, because he's been the local thought leader on this config in the context of Z690 hacks.

Look up "mapper" in the OC Configuration.pdf to get started with the lingo.
 
Look up "mapper" in the OC Configuration.pdf to get started with the lingo.
OK. So on Page 36 we have 7. DisableIoMapper and 8. DisableIoMapperMapping. Is that what you are referring to?

Looking at what MMIO means (it's in the error message I am seeing), it appears to be how the CPU maps access to RAM and other devices like NVMe drives. If I recall correctly, most OSes use "Address Space Layout Randomization" / ASLR as a security mitigation. So if the problem is that macOS and OpenCore are somehow not explicitly agreeing on which address spaces are reserved for talking to the NVMe drive, then I guess that might explain the semi-random nature of the crashing?

It seems like I might need to understand the DevirtualiseMmio quirk and MmioWhitelist to achieve this, but I have no understanding of how I might add a quirk (if that's the correct terminology) to say "hey, please reserve these memory addresses for talking to my NVMe drives." Any suggestions happily received.
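Reading the Configuration.pdf, I can at least see the shape of what I think would be involved, something like the below under Booter, but the address is entirely made up by me and I have no idea what real values would go there (presumably they'd come from a debug log?):

    <key>Booter</key>
    <dict>
        <key>MmioWhitelist</key>
        <array>
            <dict>
                <!-- Placeholder address, NOT a real value from my system -->
                <key>Address</key>
                <integer>4275044352</integer>
                <key>Comment</key>
                <string>Example: exempt this region from devirtualisation</string>
                <key>Enabled</key>
                <true/>
            </dict>
        </array>
        <key>Quirks</key>
        <dict>
            <key>DevirtualiseMmio</key>
            <true/>
        </dict>
    </dict>

Is that even the right neighbourhood of the config to be poking at?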

FWIW I do have VT-d enabled as I use VMware Fusion to run a few VMs. Losing the ability to use VMware is a bit of a deal breaker (it's one of many reasons I don't want to move to Apple Silicon).

No one thinks WD is a panacea. It's just a go-to concern when discussing 3rd party NVMe issues.
OK. Can you be specific about what motherboard, CPU, version of macOS and version of OpenCore you are using, and what exact WD drive you are suggesting? "WD SN850", for example, isn't explicit enough, as there is now a WD SN850X, and for all I know I'll buy a WD SN850X and get an "oh, that's not proven to be compatible, you need to buy a XXXX".

I'd even go so far as to ask: does anyone have higher confidence in specific drive capacities? I need 2TB, but for obvious financial reasons I'd like to test and gain some confidence using a 250GB or 500GB drive first...

I really, really want to put the SSD compatibility question to bed, but there's a limit to how many different drives I can buy/try unless I spend "Apple money" to maybe get a 2TB drive that works...
 
The thought has crossed my mind to go back to a SATA SSD and keep on going, as that seems totally bulletproof (vs NVMe issues). I just hate to take that speed hit.

Has that idea crossed your mind also?
Seems counterproductive, when the purpose of the Z690/12900KS is purely to get more performance; downgrading to SATA isn't on my radar, especially when some of my workloads are disk I/O intensive.

In all honesty, much of the time my 9900K-based system is perfectly fine, but I found a Z690/12900KS for a reasonable cost and thought it would be a somewhat inexpensive upgrade route. Once my time is taken into account, that's been an epically stupid decision in retrospect.

FWIW I've had 4 Hacks (original Intel NUC, Z270, Z390 and now Z690). The Zx90 boards have all been ASRock, and completely bomb-proof until this 12th Gen system. Hey ho.
 
It seems like I might need to understand the DevirtualiseMmio quirk and MmioWhitelist to achieve this, but I have no understanding of how I might add a quirk (if that's the correct terminology) to say "hey, please reserve these memory addresses for talking to my NVMe drives." Any suggestions happily received.

FWIW I do have VT-d enabled as I use VMware Fusion to run a few VMs. Losing the ability to use VMware is a bit of a deal breaker (it's one of many reasons I don't want to move to Apple Silicon).
MMIO Whitelist should not be necessary with Intel CPUs.
If you already have VT-d enabled, and thus DisableIoMapper:false, you may try DisableIoMapperMapping:true.
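In config.plist terms, as I read the Configuration.pdf, that combination is just two booleans under Kernel -> Quirks (a minimal sketch, not a full config):

    <key>Kernel</key>
    <dict>
        <key>Quirks</key>
        <dict>
            <!-- false leaves VT-d / the io mapper exposed to macOS,
                 so VMware Fusion keeps working -->
            <key>DisableIoMapper</key>
            <false/>
            <!-- per the Configuration.pdf description, this disables mapping
                 PCI bridge device memory in the IOMMU while VT-d stays enabled -->
            <key>DisableIoMapperMapping</key>
            <true/>
        </dict>
    </dict>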

I'd even go so far as to ask: does anyone have higher confidence in specific drive capacities?
I've had good experience with Kioxia XD5 and CD5 drives. They're available in much higher capacities than 2 TB if you go for U.2.
Do NOT use NVMeFix.kext with these enterprise drives.
 
It seems like I might need to understand the DevirtualiseMmio quirk and MmioWhitelist to achieve this, but I have no understanding of how I might add a quirk (if that's the correct terminology) to say "hey, please reserve these memory addresses for talking to my NVMe drives." Any suggestions happily received.

I don't have answers for your questions on iomapper config, but I can wave my hands to explain that MMIO is memory-mapped IO, and in the context of VT-d that means DMA, with all the attendant security implications of letting hot-pluggable PCIe devices (Thunderbolt) and VMs rummage around in the underlying hardware (kernel) address space. Basically, the memory maps used by devices need to be virtualized just as user space is, because like user processes, devices can no longer be considered trusted by the kernel. This is a big change from the PC design age before Thunderbolt, where the case could be considered a security boundary. As to how ASLR fits into this puzzle I don't know the details, but it's safe to assume it does.

On these forums I have never seen iomapper config connected to NVMe drives (it's a late-game hack topic), but that doesn't mean it can't pertain... You are on the bleeding edge.

Iomapper comes up in the context of Z690 and Thunderbolt, and also in regaining compatibility with i225-V rev3 ethernet on Monterey. There are finer points about a need for specialized DMAR config that you will need to suss out.
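If it helps orient you: in the Z690 recipes I've seen, the DMAR piece amounts to dropping the firmware's DMAR table and supplying a patched one with the Reserved Memory Regions removed. A rough sketch of the ACPI section of config.plist only (the .aml filename is a placeholder; the actual patched table has to come from your own board's firmware, so treat this as shape, not a working config):

    <key>ACPI</key>
    <dict>
        <key>Add</key>
        <array>
            <dict>
                <key>Comment</key>
                <string>Patched DMAR, Reserved Memory Regions removed</string>
                <key>Enabled</key>
                <true/>
                <key>Path</key>
                <string>DMAR.aml</string>
            </dict>
        </array>
        <key>Delete</key>
        <array>
            <dict>
                <key>All</key>
                <true/>
                <key>Comment</key>
                <string>Drop the original firmware DMAR</string>
                <key>Enabled</key>
                <true/>
                <key>OemTableId</key>
                <data></data>
                <key>TableLength</key>
                <integer>0</integer>
                <key>TableSignature</key>
                <data>RE1BUg==</data>
            </dict>
        </array>
    </dict>

But that's the part you'll need to suss out against CaseySJ's thread; I'm sketching from memory here.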

You are right to associate this with CaseySJ and his large, valuable Z690 thread, where you will find his reference EFI with all the trimmings.

(—Pertaining to iomapper: I included TB config such that macOS detects the Asus Hero's Z590 Maple Ridge controller, but TB has never been tested on my build.)

The reason iomapper config is pertinent to you is that you may have built from a reference EFI whose iomapper details create an edge case affecting NVMe for your HW combo.

FWIW I do have VT-d enabled as I use VMware Fusion to run a few VMs. Losing the ability to use VMware is a bit of a deal breaker (it's one of many reasons I don't want to move to Apple Silicon).

OK. Can you be specific about what motherboard, CPU, version of macOS and version of OpenCore you are using, and what exact WD drive you are suggesting? "WD SN850", for example, isn't explicit enough, as there is now a WD SN850X, and for all I know I'll buy a WD SN850X and get an "oh, that's not proven to be compatible, you need to buy a XXXX".

I'd even go so far as to ask: does anyone have higher confidence in specific drive capacities? I need 2TB, but for obvious financial reasons I'd like to test and gain some confidence using a 250GB or 500GB drive first...

I don't think drive size has any bearing on your situation.

I'm using a Z590 Asus Hero with an 11900K and an original SN750 Black. This drive has been reliably compatible with my kit since day one, starting with Big Sur.

My back story is that I was gifted a dream build in 2021 which I decided to hack to replace a trusty, venerable 2008 Mac Pro.

I had started my build with a 10900K and the Sabrent Rocket 4, which I chose for PCIe 4, but I had to purchase the kit a month ahead of Intel's 11th gen release.

—This thread came about from my discovering that the Z590 + Rocket 4 combination was so incompatible with Big Sur that two drives were wrecked; that is, they became bricked, for unknown reasons, but possibly related to thermal overload. My thought was that if a normally functioning system can wreck drive HW, not just lose data, that seems a special new kind of concern that deserved its own topic. Along the way I learned that Sabrent is a vertical integrator, their support isn't up to the job of debugging compatibility, and they likely knew they had a thermal edge case.

I switched to a SN750 Black to resolve the Rocket problems, which it did resolve.

Ultimately I got an 11900K to play with, and I sidelined the SN750 for a PCIe 4 SK Hynix Platinum, which has been running well in daily use since I got it 2 years ago.

—I also run this system maximally overclocked, which at full-tilt 300W happens to make its performance on par with an M1 Pro MacBook running on battery. So I learned a lesson about the future of the Mac the hard, expensive way, but not as bad as the poor blighters who bought a 2019 Mac Pro.

I have VT-d enabled and CaseySJ's recommended iomapper config for TB support, which also gives me native i225-V compatibility in Ventura.

—I did run into a hazard where running backups to the SN750, which I now use to stage OS updates, was causing a system freeze. By pure luck, this turned out to be resolved by pulling the drive, cleaning its connector, and re-installing it.

x x x

Re your specific plight...

I looked a little at this "loss of MMIO space" panic on the googz:

That the panic message explicitly says "3rd party NVMe controller" shows Apple engineering knows there's a compatibility hazard.

This problem is reported with actual Macs on Apple Discussions, and also on MacRumors Unsupported, with no satisfying resolution. So it's not just a hack thing.

There are reports of this MMIO problem here on tonymacx86, from 3 years ago on another thread. 3 pages with no resolution.

It's mentioned as intermittent by one poster at MacRumors, with no clear pattern of reappearance.

I saw a couple of posts that describe "fixes":

1) A 15-minute YouTube video about a MacBook reported the fix was to literally hit the laptop.

2) In a post about a hack, the fix was to tear down the build, check everything for cleanliness, and put it back together.

I'm sorry nothing I have to say actually helps. It seems your symptoms are a true outlier, like mine with the Rocket 4.

Last useless thoughts:

There are bugaboos with this tech that can affect any system:

- Manufacturing defects: outlying devices that are unreliable due to varying tolerances during production.

- Borderline mechanical problems: connection / interface alignment and contamination, i.e., dirt or damage.

- ESD (electrostatic-discharge damage): a variation of the above that zaps chips.

- Borderline incompatibility, especially over-clocking.

But in the history of computing, there have been literal bugs (insects).

So there is sometimes credence to silly solutions; moreover, some SW may expose bugs while other SW does not.

—Official Intel CPU exploit mitigation is literally a case of:
- "Doctor, it hurts when I do this!"
- "Well, don't do that."
written into microcode.
 
Disabling VT-d made no difference; it still crashes under heavy CPU+disk load. I am now trying capping the Short/Long power duration TDPs to see if that makes a difference. FWIW I have tried 2 different PSUs, both of which work fine in another system.

I'm becoming increasingly suspicious that the motherboard or CPU is at fault, but determining that is going to necessitate switching my workloads over to Windows Subsystem for Linux. I have a long weekend ahead...
 
There are bugaboos with this tech that can affect any system:

- Manufacturing defects: outlying devices that are unreliable due to varying tolerances during production.

- Borderline mechanical problems: connection / interface alignment and contamination, i.e., dirt or damage.

- ESD (electrostatic-discharge damage): a variation of the above that zaps chips.
How about cosmic rays?
The Earth is subjected to a hail of subatomic particles from the Sun and from beyond our solar system, which could be the cause of glitches that afflict our phones and computers. And the risk is growing as microchip technology shrinks.
 