
Catalina 10.15.6 + 5700 XT: Crashing under load

Status
Not open for further replies.
I had a crash at around 9:03 AM. I decided this time to wait a few minutes before rebooting. Some crazy stuff showed up. Check out all the gpuRestart files:
Screen Shot 2020-08-18 at 9.29.33 AM.png

When I opened the files, they all contain this: "Vendor failed to provide GPURestartReport"
Screen Shot 2020-08-18 at 9.29.49 AM.png

I had a look at the system.log in Console after the freeze. (see attached text file).

There's now just a TON of "DumpGPURestart" errors. It looks like the 5700 XT is attempting to continuously restart and failing to do so. It's not a typical kernel panic which is why it wasn't showing up in the failure logs. The system is still operating during the black screen just without some functions like sound. Go figure.
 

Attachments

  • DumpGPURestart.txt
    29.9 KB · Views: 157

Yea this is exactly what I was getting when using iMacPro1,1.
 
That and switching to version 0.5.8 of OpenCore, using SSDT-Time to create the needed SSDTs (all three), and using a Samsung 970 Pro EVO Plus drive. Note that I didn't need a DSDT anymore.

Just a tip, Poorman: this guy in another thread claims to have solved his issue with OpenCore, a re-install of Catalina and creating 3 SSDTs with SSDT-Time.

 
My hack is running with OpenCore and I still have the problem.
Note that in addition to Blender, trying to run Geekbench ends in a complete freeze. I looked through all the logs I could find, and here is what I see during the few seconds of the Geekbench run:

Code:
2020-08-19 00:31:26.050955+0200 0x949      Default     0x0                  0      0    kernel: <=== HalUsbInMpdu()
2020-08-19 00:31:26.050958+0200 0x949      Fault       0x0                  0      0    kernel: (IOAcceleratorFamily2) void IOAccelFenceMachine::fence_timeout(IOTimerEventSource *): AMDRadeonAccelerator prodding blockFenceInterrupt
2020-08-19 00:31:26.136421+0200 0x856      Error       0x0                  0      0    kernel: (AMDRadeonX6000) [3:0:0]: channel 13 DisplayPipe1 is hung! (lastReadTimestamp=0x0000034b) channelResetMask 0x00000000
2020-08-19 00:31:26.136427+0200 0x856      Fault       0x0                  0      0    kernel: (IOAcceleratorFamily2) void IOAccelEventMachine2::restart_channel(): GPURestartBegin stampIdx=13 type=2
2020-08-19 00:31:26.136466+0200 0x856      Fault       0x0                  0      0    kernel: (IOAcceleratorFamily2) virtual void IOAccelFIFOChannel2::restart(): ring is empty and all finished. Nothing to do.
2020-08-19 00:31:26.136468+0200 0x856      Fault       0x0                  0      0    kernel: (IOAcceleratorFamily2) virtual void IOAccelFIFOChannel2::restart(): GPURestartSkipped stampIdx=13
2020-08-19 00:31:26.136469+0200 0x856      Fault       0x0                  0      0    kernel: (IOAcceleratorFamily2) void IOAccelEventMachine2::restart_channel(): GPURestartEnd stampIdx=13 type=2
2020-08-19 00:31:26.136476+0200 0x856      Fault       0x0                  0      0    kernel: (IOAcceleratorFamily2) void IOAccelEventMachine2::hardwareErrorEvent(): setting restart type to 2 (channel 18)
2020-08-19 00:31:26.136477+0200 0x856      Fault       0x0                  0      0    kernel: (IOAcceleratorFamily2) void IOAccelEventMachine2::hardwareErrorEvent(): GPURestartDequeued stampIdx=18 type=2
2020-08-19 00:31:26.136478+0200 0x856      Error       0x0                  0      0    kernel: (AMDRadeonX6000) [3:0:0]: channel 18 event timeout
2020-08-19 00:31:26.136482+0200 0x856      Error       0x0                  0      0    kernel: (AMDRadeonX6000) [3:0:0]: HW Channel 8 SDMA1_PAGE pending submissions count 0
2020-08-19 00:31:26.136483+0200 0x856      Error       0x0                  0      0    kernel: (AMDRadeonX6000) [3:0:0]: Cannot detect guilty channel
2020-08-19 00:31:26.136484+0200 0x856      Fault       0x0                  0      0    kernel: (IOAcceleratorFamily2) void IOAccelEventMachine2::restart_channel(): GPURestartSkipped stampIdx=18 type=2
2020-08-19 00:31:26.136485+0200 0x856      Fault       0x0                  0      0    kernel: (IOAcceleratorFamily2) void IOAccelEventMachine2::restart_channel(): no channel associated with stamp_idx 18 (type 2)
2020-08-19 00:31:26.136487+0200 0x856      Fault       0x0                  0      0    kernel: (IOAcceleratorFamily2) void IOAccelEventMachine2::hardwareErrorEvent(): setting restart type to 2 (channel 19)
2020-08-19 00:31:26.136488+0200 0x856      Fault       0x0                  0      0    kernel: (IOAcceleratorFamily2) void IOAccelEventMachine2::hardwareErrorEvent(): GPURestartDequeued stampIdx=19 type=2
2020-08-19 00:31:26.136489+0200 0x856      Error       0x0                  0      0    kernel: (AMDRadeonX6000) [3:0:0]: channel 19 event timeout
2020-08-19 00:31:26.136493+0200 0x856      Error       0x0                  0      0    kernel: (AMDRadeonX6000) [3:0:0]: HW Channel 6 SDMA0_PAGE pending submissions count 0
2020-08-19 00:31:26.136494+0200 0x856      Error       0x0                  0      0    kernel: (AMDRadeonX6000) [3:0:0]: Cannot detect guilty channel
2020-08-19 00:31:26.136495+0200 0x856      Fault       0x0                  0      0    kernel: (IOAcceleratorFamily2) void IOAccelEventMachine2::restart_channel(): GPURestartSkipped stampIdx=19 type=2
2020-08-19 00:31:26.136496+0200 0x856      Fault       0x0                  0      0    kernel: (IOAcceleratorFamily2) void IOAccelEventMachine2::restart_channel(): no channel associated with stamp_idx 19 (type 2)
2020-08-19 00:31:26.136532+0200 0x949      Error       0x0                  0      0    kernel: (AMDRadeonX6000) [3:0:0]: failed to submit TS=0x1887 to GFX HW channel
2020-08-19 00:31:26.241258+0200 0x949      Default     0x0                  0      0    kernel: <=== HalUsbInMpdu()
2020-08-19 00:31:26.241265+0200 0x949      Fault       0x0                  0      0    kernel: (IOAcceleratorFamily2) void IOAccelFenceMachine::fence_timeout(IOTimerEventSource *): AMDRadeonAccelerator prodding blockFenceInterrupt
2020-08-19 00:31:26.342381+0200 0x949      Default     0x0                  0      0    kernel: <=== HalUsbInMpdu()
2020-08-19 00:31:26.342385+0200 0x949      Fault       0x0                  0      0    kernel: (IOAcceleratorFamily2) void IOAccelFenceMachine::fence_timeout(IOTimerEventSource *): AMDRadeonAccelerator prodding

I don't know if someone can spot something in the errors displayed.
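
In case it helps anyone comparing their own logs against this one, here is a minimal Python sketch that tallies the driver messages from an excerpt like the one above. It only looks for the message patterns visible in this log ("is hung!", "event timeout", "GPURestartSkipped"). The function name and the shape of the summary are my own invention, not part of any Apple or AMD tooling.

```python
import re
from collections import Counter

# Patterns taken from the messages in the excerpt, e.g.
# "channel 13 DisplayPipe1 is hung!" and "channel 18 event timeout".
HUNG_RE = re.compile(r"channel (\d+) (\S+) is hung!")
TIMEOUT_RE = re.compile(r"channel (\d+) event timeout")


def summarise_gpu_log(text):
    """Count hung channels, event timeouts and skipped restarts in a log dump.

    Hypothetical helper: it only does pattern matching on text that was
    saved from Console, it does not talk to the driver in any way.
    """
    summary = {"hung": Counter(), "timeout": Counter(), "restart_skipped": 0}
    for line in text.splitlines():
        if "AMDRadeonX6000" in line:
            m = HUNG_RE.search(line)
            if m:
                summary["hung"][f"{m.group(1)} ({m.group(2)})"] += 1
            m = TIMEOUT_RE.search(line)
            if m:
                summary["timeout"][m.group(1)] += 1
        if "GPURestartSkipped" in line:
            summary["restart_skipped"] += 1
    return summary
```

Paste the saved log text into `summarise_gpu_log(...)` and print the result. It won't explain the hang, but it makes it easier to see whether the same channels show up across different crashes.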

Have a good night folks.
 
I enabled verbose mode to see if I could spot something in the boot process.
I see these 2 errors related to ACPI management:

Code:
2020-08-19 10:20:39.501263+0200 0x71       Default     0x0                  0      0    kernel: (AppleACPIPlatform) ACPI Error:
2020-08-19 10:20:39.501263+0200 0x71       Default     0x0                  0      0    kernel: (AppleACPIPlatform) ACPI Error:
2020-08-19 10:20:39.501265+0200 0x71       Default     0x0                  0      0    kernel: (AppleACPIPlatform) [USBX]
2020-08-19 10:20:39.501265+0200 0x71       Default     0x0                  0      0    kernel: (AppleACPIPlatform) [USBX]
2020-08-19 10:20:39.501266+0200 0x71       Default     0x0                  0      0    kernel: (AppleACPIPlatform)  Namespace lookup failure, AE_ALREADY_EXISTS
2020-08-19 10:20:39.501266+0200 0x71       Default     0x0                  0      0    kernel: (AppleACPIPlatform)  Namespace lookup failure, AE_ALREADY_EXISTS
2020-08-19 10:20:39.501268+0200 0x71       Default     0x0                  0      0    kernel: (AppleACPIPlatform)  (20160930/dswload-462)
2020-08-19 10:20:39.501269+0200 0x71       Default     0x0                  0      0    kernel: (AppleACPIPlatform)  (20160930/dswload-462)
2020-08-19 10:20:39.501270+0200 0x71       Default     0x0                  0      0    kernel: (AppleACPIPlatform) ACPI Exception: AE_ALREADY_EXISTS,
2020-08-19 10:20:39.501270+0200 0x71       Default     0x0                  0      0    kernel: (AppleACPIPlatform) ACPI Exception: AE_ALREADY_EXISTS,
2020-08-19 10:20:39.501271+0200 0x71       Default     0x0                  0      0    kernel: (AppleACPIPlatform) During name lookup/catalog
2020-08-19 10:20:39.501271+0200 0x71       Default     0x0                  0      0    kernel: (AppleACPIPlatform) During name lookup/catalog
2020-08-19 10:20:39.501273+0200 0x71       Default     0x0                  0      0    kernel: (AppleACPIPlatform)  (20160930/psobject-310)
2020-08-19 10:20:39.501273+0200 0x71       Default     0x0                  0      0    kernel: (AppleACPIPlatform)  (20160930/psobject-310)
2020-08-19 10:20:39.501305+0200 0x71       Default     0x0                  0      0    kernel: (AppleACPIPlatform) ACPI Exception: AE_ALREADY_EXISTS,
2020-08-19 10:20:39.501305+0200 0x71       Default     0x0                  0      0    kernel: (AppleACPIPlatform) ACPI Exception: AE_ALREADY_EXISTS,
2020-08-19 10:20:39.501308+0200 0x71       Default     0x0                  0      0    kernel: (AppleACPIPlatform) (SSDT:SsdtUsbx) while loading table
2020-08-19 10:20:39.501308+0200 0x71       Default     0x0                  0      0    kernel: (AppleACPIPlatform) (SSDT:SsdtUsbx) while loading table
2020-08-19 10:20:39.501309+0200 0x71       Default     0x0                  0      0    kernel: (AppleACPIPlatform)  (20160930/tbxfload-319)
2020-08-19 10:20:39.501310+0200 0x71       Default     0x0                  0      0    kernel: (AppleACPIPlatform)  (20160930/tbxfload-319)
2020-08-19 10:20:39.501324+0200 0x71       Default     0x0                  0      0    kernel: (AppleACPIPlatform) ACPI Error:
2020-08-19 10:20:39.501325+0200 0x71       Default     0x0                  0      0    kernel: (AppleACPIPlatform) ACPI Error:
2020-08-19 10:20:39.501326+0200 0x71       Default     0x0                  0      0    kernel: (AppleACPIPlatform) [_PTS]
2020-08-19 10:20:39.501326+0200 0x71       Default     0x0                  0      0    kernel: (AppleACPIPlatform) [_PTS]
2020-08-19 10:20:39.501327+0200 0x71       Default     0x0                  0      0    kernel: (AppleACPIPlatform)  Namespace lookup failure, AE_ALREADY_EXISTS
2020-08-19 10:20:39.501327+0200 0x71       Default     0x0                  0      0    kernel: (AppleACPIPlatform)  Namespace lookup failure, AE_ALREADY_EXISTS
2020-08-19 10:20:39.501329+0200 0x71       Default     0x0                  0      0    kernel: (AppleACPIPlatform)  (20160930/dswload-462)
2020-08-19 10:20:39.501329+0200 0x71       Default     0x0                  0      0    kernel: (AppleACPIPlatform)  (20160930/dswload-462)
2020-08-19 10:20:39.501330+0200 0x71       Default     0x0                  0      0    kernel: (AppleACPIPlatform) ACPI Exception: AE_ALREADY_EXISTS,
2020-08-19 10:20:39.501330+0200 0x71       Default     0x0                  0      0    kernel: (AppleACPIPlatform) ACPI Exception: AE_ALREADY_EXISTS,
2020-08-19 10:20:39.501332+0200 0x71       Default     0x0                  0      0    kernel: (AppleACPIPlatform) During name lookup/catalog
2020-08-19 10:20:39.501332+0200 0x71       Default     0x0                  0      0    kernel: (AppleACPIPlatform) During name lookup/catalog
2020-08-19 10:20:39.501333+0200 0x71       Default     0x0                  0      0    kernel: (AppleACPIPlatform)  (20160930/psobject-310)
2020-08-19 10:20:39.501333+0200 0x71       Default     0x0                  0      0    kernel: (AppleACPIPlatform)  (20160930/psobject-310)
2020-08-19 10:20:39.501365+0200 0x71       Default     0x0                  0      0    kernel: (AppleACPIPlatform) ACPI Exception: AE_ALREADY_EXISTS,
2020-08-19 10:20:39.501365+0200 0x71       Default     0x0                  0      0    kernel: (AppleACPIPlatform) ACPI Exception: AE_ALREADY_EXISTS,
2020-08-19 10:20:39.501367+0200 0x71       Default     0x0                  0      0    kernel: (AppleACPIPlatform) (SSDT:    ZPTS) while loading table
2020-08-19 10:20:39.501367+0200 0x71       Default     0x0                  0      0    kernel: (AppleACPIPlatform) (SSDT:    ZPTS) while loading table
2020-08-19 10:20:39.501369+0200 0x71       Default     0x0                  0      0    kernel: (AppleACPIPlatform)  (20160930/tbxfload-319)
2020-08-19 10:20:39.501369+0200 0x71       Default     0x0                  0      0    kernel: (AppleACPIPlatform)  (20160930/tbxfload-319)
2020-08-19 10:20:39.501370+0200 0x71       Default     0x0                  0      0    kernel: (AppleACPIPlatform) ACPI Error:
2020-08-19 10:20:39.501370+0200 0x71       Default     0x0                  0      0    kernel: (AppleACPIPlatform) ACPI Error:
2020-08-19 10:20:39.501371+0200 0x71       Default     0x0                  0      0    kernel: (AppleACPIPlatform) 2 table load failures, 14 successful
2020-08-19 10:20:39.501372+0200 0x71       Default     0x0                  0      0    kernel: (AppleACPIPlatform) 2 table load failures, 14 successful
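
Since the log repeats every line, a small Python sketch like the one below can boil it down to the tables that actually failed. It assumes the log text has been saved from Console and only relies on the two message shapes visible above ("(SSDT:...) while loading table" and "N table load failures, M successful"). The function name is hypothetical.

```python
import re

# "(SSDT:SsdtUsbx) while loading table" / "(SSDT:    ZPTS) while loading table"
TABLE_RE = re.compile(r"\((\w+):\s*(\w+)\) while loading table")
# "2 table load failures, 14 successful"
SUMMARY_RE = re.compile(r"(\d+) table load failures, (\d+) successful")


def failed_acpi_tables(text):
    """Return ([(signature, table_id), ...], (failures, successes) or None)."""
    tables = []
    counts = None
    for line in text.splitlines():
        m = TABLE_RE.search(line)
        if m:
            entry = (m.group(1), m.group(2))
            if entry not in tables:  # the log prints each line twice
                tables.append(entry)
        m = SUMMARY_RE.search(line)
        if m:
            counts = (int(m.group(1)), int(m.group(2)))
    return tables, counts
```

On the excerpt above this should report the SsdtUsbx and ZPTS tables, which matches the two failures the kernel counts at the end.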

Below is a screenshot of my config.plist highlighting the two tables / mappings that seem to be involved in the error.

1597826546670.png

I'll try to investigate what the issue is by digging around on the net as, I must admit, all of this Hackintosh configuration is not easy (for me). In the meantime, and again, maybe someone could spot something obvious...
 
To fix the issue with loading the USBX table, I removed SSDT-EC-USBX-DESKTOP.aml, which was probably the generic .aml file that I had narrowed down to my configuration in SSDT-USBX.aml.

Regarding the issue with the _PTS ACPI error, I realised that the patch section was completely incorrect. The values to find and replace were copies of the previous item, and the Enabled flag was set to False.
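
That kind of mistake (a copy-pasted Find/Replace pair, or a patch left disabled) is easy to miss by eye, so here is a rough Python sketch that flags both. It assumes the usual OpenCore layout where the patches live under ACPI → Patch with Comment, Enabled, Find and Replace keys. The function name and the wording of the findings are made up for illustration.

```python
import plistlib


def suspicious_acpi_patches(config):
    """Return (index, comment, reason) for ACPI patches that look wrong.

    `config` is the dict produced by plistlib.load() on a config.plist.
    Flags disabled entries and entries whose Find/Replace pair repeats
    an earlier entry. Sketch only: it says nothing about whether a
    patch is actually correct for a given DSDT.
    """
    findings = []
    seen = {}
    for i, patch in enumerate(config.get("ACPI", {}).get("Patch", [])):
        comment = patch.get("Comment", "")
        if not patch.get("Enabled", False):
            findings.append((i, comment, "disabled"))
        key = (bytes(patch.get("Find", b"")), bytes(patch.get("Replace", b"")))
        if key in seen:
            findings.append((i, comment, f"same Find/Replace as entry {seen[key]}"))
        else:
            seen[key] = i
    return findings
```

To use it, open the file in binary mode, e.g. `with open("config.plist", "rb") as f: print(suspicious_acpi_patches(plistlib.load(f)))`.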

Below is the modified section.

1597897005179.jpg


Despite this change, I still have the same issue when trying to load FixShutdown-USB-SSDT.aml, as shown in the error below.
1597897112568.jpg

From the SSDT content (screenshot below) and the error reported, I understand that the method _PTS is already declared, or something like that, but I might be completely wrong... That would explain why the correction I made to the patch (mentioned above) didn't fix this issue at all.
For the time being, I'll remove FixShutdown-USB-SSDT.aml and its patch, as fixing the shutdown / restart problem is not my priority. Any suggestion will be more than appreciated :)

I don't think these ACPI issues are the source of our GPURestart stuff, just going step by step...
 
If the [_PTS] is in your DSDT (or firmware SSDTs), you may need to search for _PTS and see how many times it shows up, and set that as your ‘count’ parameter in the patch rename... don’t quote me on that, definitely search the forum for that advice, but I do think it’s something like that.
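
If you want a quick number before searching by hand, a minimal Python sketch along these lines counts how often the 4-byte name shows up in a dumped table. The file name is hypothetical. Also note that a raw byte count includes every reference to _PTS, not just the method definition, so treat the result as a starting point rather than the value to put in the patch.

```python
def count_acpi_name(data, name=b"_PTS"):
    """Count non-overlapping occurrences of a 4-byte ACPI name in raw AML."""
    return data.count(name)


def count_acpi_name_in_file(path, name=b"_PTS"):
    """Same as above, but reads a dumped table (e.g. a DSDT.aml) from disk."""
    with open(path, "rb") as f:
        return count_acpi_name(f.read(), name)
```

For example, `count_acpi_name_in_file("DSDT.aml")` on a dump of your own DSDT.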
 
Thank you @canyondust and sorry for the late answer.
I haven't had the chance to test your suggestion so far, as I have left the PC at the shop where I bought it. Because I couldn't find any way to get the graphics card to deliver its full power, I called a professional company to help me install macOS. They tried everything they could, and in the end we installed Windows on the PC and found that I had the same performance issue. So this wasn't a software issue but a hardware issue. I tested the graphics card in another PC with Windows / macOS and had the same issues.
So I'm now waiting for the results of the tests done by the PC reseller. I'll keep you posted, obviously. Maybe it's the same issue you are facing, @poorman9 ...
 
@poorman9 Joining this club. Very similar things are happening on my build. I've tried a heap of things: kexts/kext updates, DSDT.aml patches, different SMBIOS definitions, toggling XMP detection and related BIOS settings, OS updates, etc. Nothing gives a lasting improvement in stability, and I keep getting what seem like random panics/black screens (one happened as I wrote this post!). Unfortunately it doesn't always produce crash reports... so I'm posting what I've got here with the EFI and config.plist for anyone interested in digging around.

GPU works like a charm on the Windows 10 side of my build.
 

Attachments

  • EFI.zip
    38.1 MB · Views: 201
  • config.plist
    7.6 KB · Views: 60
  • Screen Shot 2020-08-24 at 8.59.57 AM.png
    1.3 MB · Views: 63
  • Screen Shot 2020-08-24 at 9.00.04 AM.png
    1.5 MB · Views: 64