First of all, it should be made clear that OpenCL is not a type of hardware but software: an API that applications use to communicate with the GPU through a software-level abstraction of the GPU itself. What sets it apart from the other APIs that applications use to talk to the GPU is that it is not a graphics API but an API for computing, aimed above all at scientific computing.
GPUs for computing
What is GPU computing? GPUs execute programs called shaders, which manipulate the attributes of the primitives that travel through the 3D pipeline, whatever their shape. To any processor, however, data is just a series of binary values, so when a shader unit runs a shader program it is simply processing a set of data, and it can therefore be used to process any type of data.
In the mid-2000s, the arrival of GPUs with unified shaders opened up the possibility of using them in markets beyond PC gaming. The main one was scientific computing, which led to the launch of the NVIDIA Tesla range in 2007.
The difference between these GPUs and those used for gaming is their ability to work with double-precision floating point, which games do not need but science in all its fields does, whether for astronomical calculations or for developing a next-generation drug.
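To give an idea of why double precision matters in scientific work, here is a minimal sketch that adds the same small value many times in single and in double precision. It is only an illustration under stated assumptions: the helper names, the step value of 0.1, and the count of 100,000 are made up for this example, and single precision is emulated on the CPU by rounding through the standard `struct` module rather than measured on a GPU.

```python
import struct


def to_float32(value):
    """Round a Python float (double precision) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", value))[0]


def accumulate(step, count, single_precision):
    """Add `step` to a running total `count` times.

    When `single_precision` is True, the step and every intermediate total
    are rounded to float32, emulating a single-precision accumulator.
    """
    if single_precision:
        step = to_float32(step)
    total = 0.0
    for _ in range(count):
        total += step
        if single_precision:
            total = to_float32(total)
    return total


COUNT = 100_000      # hypothetical number of simulation steps
EXPECTED = 10_000.0  # 0.1 * COUNT in exact arithmetic

single_error = abs(accumulate(0.1, COUNT, single_precision=True) - EXPECTED)
double_error = abs(accumulate(0.1, COUNT, single_precision=False) - EXPECTED)

print(f"single precision error: {single_error:.6f}")
print(f"double precision error: {double_error:.12f}")
```

The single-precision total drifts visibly away from the exact result, while the double-precision total stays very close to it. In a long-running simulation, that accumulated drift is the kind of error scientific code cannot afford.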
OpenCL, an API for GPU computing
Until OpenCL appeared, graphics APIs were designed only for rendering and not for computing, so they were not particularly efficient at running non-graphics algorithms on a GPU. The solution? The development of an API for computing, which was named OpenCL, where CL stands for Computing Language.
But how does OpenCL differ from other APIs? OpenCL code can run on any type of processor, not only on GPUs: it can run on a CPU if needed, and also on DSPs, FPGAs, neural network accelerators, and many others.
The reason is that its model is based on distributed computing: there is a host, which is the CPU, and a series of compute devices, which can be GPUs, DSPs, FPGAs, and so on, to which the tasks to be executed are sent. Each task is handled by a processing element, and once it has been processed the result is sent back to the host, along with or instead of a confirmation that the task was carried out. Each processing element runs its own instance of the program and is therefore a thread of execution with its own program counter.
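The host and device split described above can be sketched in a few lines. This is a plain-Python simulation, not the real OpenCL API: `enqueue_nd_range` and `scale_and_add_kernel` are hypothetical names invented for this example, and a thread pool stands in for the device's processing elements. What it tries to show is the shape of the model: the host prepares the data, launches one instance of the same small program per element, and waits for each one to confirm completion.

```python
from concurrent.futures import ThreadPoolExecutor


def scale_and_add_kernel(global_id, a, x, y, out):
    """Program run once per processing element; `global_id` selects its element."""
    out[global_id] = a * x[global_id] + y[global_id]


def enqueue_nd_range(kernel, global_size, *args, workers=4):
    """Hypothetical host-side helper: launch one kernel instance per index.

    The thread pool plays the role of the compute device. Waiting on each
    future stands in for the confirmation the device sends back to the host.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(kernel, gid, *args) for gid in range(global_size)]
        for future in futures:
            future.result()  # blocks until that instance is done, re-raises errors


# Host side: prepare the input and output buffers, then launch the work.
x = [1.0, 2.0, 3.0, 4.0]
y = [10.0, 20.0, 30.0, 40.0]
out = [0.0] * len(x)

enqueue_nd_range(scale_and_add_kernel, len(x), 2.0, x, y, out)
print(out)  # [12.0, 24.0, 36.0, 48.0]
```

Note that the kernel contains no loop over the data. The loop lives in the launch, which is what allows the same program to be spread over however many processing elements the device offers.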
OpenCL is not a graphical API
It should be clarified that OpenCL does not control the graphics pipeline and is therefore not used to render graphics: a good part of the functions found in OpenGL and other APIs such as Direct3D or Vulkan are not present in OpenCL. What's more, OpenCL was originally designed to work alongside OpenGL and is now designed to work with Vulkan, the current graphics API from the Khronos Group.
Another difference has to do with the programming language used to write the programs that run on the device. Graphics APIs use high-level shading languages, such as GLSL in the case of OpenGL and Vulkan, and HLSL in the case of DirectX.
OpenCL, on the other hand, uses languages derived from general-purpose ones such as C and C++ rather than shader-specific languages. This makes it possible to port programs and algorithms written in those languages to OpenCL so that they can run on every type of device that supports the API, and to benefit from greater versatility than the more limited shading languages offer.
How is OpenCL applied in everyday applications?
OpenCL is widely used in some PC applications, especially multimedia ones. When, for example, we tell Photoshop to apply an image filter, today this is done through OpenCL, and the algorithm runs on the most suitable hardware that supports the API.
Other everyday applications that use OpenCL are video codecs such as AV1, HEVC, and H.264. Most of them are programmed in OpenCL for the reasons discussed above: the same code can also run on the CPU, and developers do not have to rack their brains checking whether the hardware includes a dedicated video codec and optimizing for it.
Curiously, OpenCL is also the reason why the VGA-based 2D part has disappeared from graphics cards: contradictory as it may seem, it is much better to run the 2D graphical interface of an operating system through GPU computing.
DirectX Computing and the NVIDIA Boycott with CUDA
OpenCL use is in decline, especially since DirectX 11 added Compute Shaders to its repertoire and Apple introduced its Metal API. The appearance of graphics APIs with partial support for computing is what made OpenCL begin to lose importance.
It was with the introduction of Compute Shaders that the gradual abandonment of OpenCL began. The latest widely used version of the standard is 1.2, a very rudimentary version compared with what other APIs can do, since it lacks features such as shared virtual memory or SPIR-V for better interaction with Vulkan.
But CUDA is OpenCL's main enemy. NVIDIA has dominated the world of high-performance GPUs for years and has taken advantage of that to make much of the scientific computing world work with CUDA rather than OpenCL, since CUDA ties programs to its hardware. NVIDIA has been able to do this thanks to a total lack of competition for its Tesla range.
How has NVIDIA boycotted OpenCL? By not officially supporting the improvements in OpenCL 2.0, even though equivalent features were available in CUDA. As a result, NVIDIA's Tesla, Quadro, and GeForce GPUs remained at OpenCL 1.2 for years.
The third time lucky?
In the end, to save OpenCL from final disaster, the entire API had to be rethought for its third version. In version 3.0, a good part of the elements that belonged to the core of OpenCL 2.x have been downgraded to optional features, so the base hardware no longer needs to support them. It is therefore possible to run OpenCL 3.0 on hardware that so far only supported OpenCL 1.2 and add only the extensions we want to use, a way to get around NVIDIA's boycott.
All in all, the problem OpenCL faces is that, outside scientific computing, the area where it is used most is video games, especially for physics calculations and collision detection. The existence of Compute Shaders in both Vulkan and DirectX relegates OpenCL to scientific computing, which is currently the absolute domain of CUDA.
One market the API could have reached and succeeded in is embedded devices in the style of the Raspberry Pi, but version 2.0 pushed them aside by focusing too heavily on scientific computing. Version 3.0 is not designed to bring OpenCL to embedded systems, which would adopt it without problems for a multitude of applications. Instead it seeks to win a war that was lost in advance, and within the Khronos Group itself OpenCL already has competition in the form of Vulkan.
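Because so much is optional in this model, an application has to ask at run time what a given device offers before choosing a code path. The sketch below illustrates that idea only. It does not call the OpenCL API: the `reported` string is a made-up example of the space-separated extension list a device reports, and `parse_extensions` and `pick_float_type` are hypothetical helper names. The extension names themselves, such as `cl_khr_fp64` for double precision, are real Khronos extension names.

```python
def parse_extensions(extension_string):
    """Split a space-separated extension list into a set of names."""
    return set(extension_string.split())


def pick_float_type(extensions):
    """Choose double precision only when the device advertises cl_khr_fp64."""
    return "double" if "cl_khr_fp64" in extensions else "float"


# Made-up example of what a device might report; not taken from real hardware.
reported = "cl_khr_fp64 cl_khr_global_int32_base_atomics cl_khr_gl_sharing"

extensions = parse_extensions(reported)
print(pick_float_type(extensions))               # double
print(pick_float_type(parse_extensions("")))     # float
```

The second call shows the fallback: a device that reports nothing optional still gets a working path, which is the point of building on a small mandatory core.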