I have an nvidia GTX 750 Ti card, which is advertised as having 640 CUDA cores. Indeed, the nvidia settings application also reports this.
I’m trying to use this card to do OpenCL development on Linux. The OpenCL environment (queried through PyOpenCL, if it makes a difference) reports that the number of compute units is 5. My understanding is that one compute unit on an nvidia device maps to one multiprocessor, which I understand to be 32 SIMD units (each of which I assume is a CUDA core).
Clearly, 5 * 32 is not 640 (rather a quarter of what is expected).
Am I missing something as regards the meaning of a compute unit on nvidia? The card is also driving the graphics output, which will be using some of the computational capability. Is a proportion of the processing capability reserved for graphics use? (If so, can I change this?)
Answer
NVIDIA have a whitepaper for the NVIDIA GeForce GTX 750 Ti, which is worth a read.
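An OpenCL compute unit translates to a streaming multiprocessor in NVIDIA GPU terms. Each Maxwell SMM in your GPU contains 128 processing elements (“CUDA cores”), and 128 * 5 = 640. The SIMD width of the device is still 32, but each compute unit (SMM) has four warp schedulers, so it can issue instructions to four different warps at once (4 * 32 = 128). None of the cores are reserved for graphics; the figure of 32 cores per multiprocessor is simply not correct for this architecture.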
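To make the arithmetic concrete, here is a minimal sketch of how you could turn the compute unit count into a core count. The `CORES_PER_SM` table and both function names are my own illustration, not part of PyOpenCL, and the table only lists a few architectures. The PyOpenCL part assumes an NVIDIA driver that exposes the `cl_nv_device_attribute_query` extension (which is what provides the `*_nv` attributes); I have not run it against your setup.

```python
# Illustrative lookup: CUDA cores per streaming multiprocessor, keyed by
# (major, minor) compute capability. Hand-written and incomplete on purpose.
CORES_PER_SM = {
    (2, 0): 32,   # Fermi
    (2, 1): 48,   # Fermi (later chips)
    (3, 0): 192,  # Kepler
    (3, 5): 192,  # Kepler
    (5, 0): 128,  # Maxwell, e.g. GTX 750 Ti
    (5, 2): 128,  # Maxwell
}

WARP_SIZE = 32  # SIMD width on all of the architectures listed above


def total_cuda_cores(compute_units, capability):
    """Return compute_units * cores-per-SM for a known compute capability."""
    return compute_units * CORES_PER_SM[capability]


def describe_nvidia_devices():
    """Print the estimated core count for each NVIDIA OpenCL device.

    Needs pyopencl and an NVIDIA OpenCL driver, so it is only defined
    here and not called automatically.
    """
    import pyopencl as cl

    for platform in cl.get_platforms():
        if "NVIDIA" not in platform.name:
            continue  # the *_nv attributes are NVIDIA-specific
        for dev in platform.get_devices():
            cap = (dev.compute_capability_major_nv,
                   dev.compute_capability_minor_nv)
            units = dev.max_compute_units
            if cap in CORES_PER_SM:
                print(dev.name, units, "compute units,",
                      total_cuda_cores(units, cap), "CUDA cores")
            else:
                print(dev.name, units, "compute units, capability", cap,
                      "is not in the table")


if __name__ == "__main__":
    # GTX 750 Ti: 5 compute units, compute capability 5.0
    print(total_cuda_cores(5, (5, 0)))
```

For your card, `total_cuda_cores(5, (5, 0))` gives 640, matching the advertised figure, whereas multiplying by the warp size of 32 gives the 160 you calculated.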