With the release of OS X 10.8.3, users have noticed a significant decrease in OpenCL benchmark numbers when running LuxMark on NVIDIA 6xx (Kepler-based) video cards. I have done a number of tests and have concluded that with the new drivers in 10.8.3 the benchmarks are now accurate, whereas in earlier versions of OS X they were not. I will attempt to explain why.
Before we discuss LuxMark and its results, let's first look at the hardware specifications for NVIDIA 4xx / 5xx (Fermi) and 6xx (Kepler) cards. Both GPU families are built from SMs (streaming multiprocessors), each of which is a highly parallel multiprocessor. In Kepler GPUs, NVIDIA has made major changes to the SM architecture and now calls it SMX, for next-generation SM. For those who really want to learn the details, I recommend reading this PDF from NVIDIA detailing the Kepler architecture and how it compares to Fermi. The key change is that NVIDIA has reduced the number of SMs but increased the number of CUDA cores per SM. The end result is an improvement in gaming performance and a decrease in raw computing power (http://www.pcmag.com/article2/0,2817,2402021,00.asp). To summarize the differences, a Fermi GPU can have up to 16 SMs and a Kepler GPU can have up to 8 SMXs.
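To put rough numbers on the "fewer SMs, more cores per SM" trade-off, here is a small sketch of the arithmetic. The cores-per-SM figures (32 for the 16-SM Fermi chips, 192 for a Kepler SMX) are NVIDIA's published values rather than anything measured for this post, and the function and variable names are just for illustration.

```python
# CUDA cores per multiprocessor. These are NVIDIA's published figures
# (an assumption brought in for illustration, not data from this post):
# 32 per SM on the 16-SM Fermi chips, 192 per SMX on Kepler.
CORES_PER_SM = {"fermi": 32, "kepler": 192}


def total_cuda_cores(architecture: str, sm_count: int) -> int:
    """Return the total CUDA core count for a GPU with sm_count SMs."""
    return CORES_PER_SM[architecture] * sm_count


full_fermi = total_cuda_cores("fermi", 16)   # largest Fermi configuration
full_kepler = total_cuda_cores("kepler", 8)  # largest Kepler configuration
seven_smx = total_cuda_cores("kepler", 7)    # a 7-SMX card such as the GTX 670

print("Fermi, 16 SMs :", full_fermi, "cores")
print("Kepler, 8 SMXs:", full_kepler, "cores")
print("Kepler, 7 SMXs:", seven_smx, "cores")
```

So Kepler has far more cores in total, but they are grouped into half as many multiprocessors, which matters for software that scales with the SM count.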
With that out of the way, let's look at LuxMark. It uses OpenCL to measure compute power, using all the SMs available to it. The issue is that OpenCL support in 10.8.2 was broken: it misreported the number of GPU compute units and the clock speed, and it miscalculated results. My EVGA GTX 670 has 7 SMs and a 980 MHz clock, but LuxMark running on 10.8.2 reports 28 SMs and a 705 MHz clock, as seen here.
Now on 10.8.3, LuxMark reports the correct specifications of 7 SMs and a 980 MHz clock, as seen here.
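If you want to check what your own driver reports, the sketch below compares a reported compute unit count and clock against the card's known specification. The numbers are the ones quoted above for my GTX 670. `DeviceReport` and `mismatches` are hypothetical helper names, and `query_opencl_devices` assumes the third-party pyopencl package is installed; I am also assuming LuxMark shows the same values that OpenCL's device queries return, which I have not verified in its source.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class DeviceReport:
    """Compute unit count and clock speed as reported for one GPU."""
    compute_units: int
    clock_mhz: int


def mismatches(reported: DeviceReport, expected: DeviceReport) -> List[str]:
    """List the fields where the driver's report disagrees with the spec."""
    problems = []
    if reported.compute_units != expected.compute_units:
        problems.append("compute units: reported %d, expected %d"
                        % (reported.compute_units, expected.compute_units))
    if reported.clock_mhz != expected.clock_mhz:
        problems.append("clock: reported %d MHz, expected %d MHz"
                        % (reported.clock_mhz, expected.clock_mhz))
    return problems


def query_opencl_devices() -> Dict[str, DeviceReport]:
    """Read the values from the installed driver (assumes pyopencl)."""
    import pyopencl as cl  # optional dependency, only needed for this call
    reports = {}
    for platform in cl.get_platforms():
        for device in platform.get_devices():
            reports[device.name] = DeviceReport(device.max_compute_units,
                                                device.max_clock_frequency)
    return reports


# Values quoted in this post for the EVGA GTX 670.
expected = DeviceReport(compute_units=7, clock_mhz=980)
reported_10_8_2 = DeviceReport(compute_units=28, clock_mhz=705)
reported_10_8_3 = DeviceReport(compute_units=7, clock_mhz=980)

for label, report in (("10.8.2", reported_10_8_2), ("10.8.3", reported_10_8_3)):
    issues = mismatches(report, expected)
    print(label, "->", issues if issues else "matches the card's specification")
```

On a machine with pyopencl installed, calling `query_opencl_devices()` and passing the result for your card to `mismatches` gives the same comparison against live driver output.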
Furthermore, when running LuxMark in Windows, performance numbers are the same or lower than those in OS X 10.8.3. This is evident when comparing scores in the LuxMark results database.
The root cause of the performance decrease is twofold: first, the reduction in SM count in Kepler; second, NVIDIA not optimizing their OpenCL drivers to take advantage of the increased number of CUDA cores. These two factors, combined with inaccurate reporting by the 10.8.2 drivers, account for the decrease in OpenCL benchmark scores. In summary, OpenCL for Kepler was broken in 10.8.2 and has been fixed in 10.8.3.
Related Posts:
10.8.3 - Luxmark OpenCL Scores Significantly Lowered
Optimizing NVIDIA GeForce 4xx and 5xx Graphics Cards Using OpenCL and CUDA
LuxMark Official Information and Downloads