WebGL* in Chromium*: Behind the scenes
Have you ever wondered how your WebGL* code is executed and what happens before it hits the drivers? You might have heard about a few things already, for example that Chromium* uses a separate process to execute GPU code and has its own wrappers around the GL calls. This article is about exactly that abstraction layer, and it is meant for people who want a better understanding of WebGL or who want to start working on the GPU code in Chromium.
Chromium uses a multi-process (1) architecture. Each web page has its own renderer process, which runs in a sandbox and is very restricted in what it can access. This makes it much harder for malicious web content to mess with your computer. However, this is bad news for GPU acceleration, since the renderer doesn't even have access to the GPU. Chromium solves this by adding an extra process just for the GPU commands. That sounds horrible at first, as it introduces a lot of interprocess communication and the textures also have to be copied between the processes, but it's not as bad as you'd imagine. For example, the textures usually only have to be copied once at initialization, and modern OpenGL is designed to minimize the number of commands that have to be sent to the GPU. The separation can even improve performance, because WebGL can execute independently of all the other rendering and parsing.
The Command Buffer
The GPU process and renderer process communicate using a client-server model, where the renderer is the client. When a renderer wants to execute a GL command like glViewport, it cannot directly call the function from the driver because of the security sandbox. Instead, it creates a representation of the command, puts it into a buffer in shared memory called the CommandBuffer (2), and sends a message to the server saying, in effect, "Hey, I put some stuff into the buffer, please execute it for me." In practice, the renderer puts a number of commands in the buffer before sending a message like, "Hey, I put 10 commands in the buffer, please execute them."
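To make the idea concrete, here is a minimal sketch of a command being serialized into a buffer on the client side and decoded on the server side. It is a toy model under loud assumptions: `ToySharedBuffer`, `ViewportCmd`, `EnqueueViewport`, `ExecutePending`, and the command id are all hypothetical names made up for illustration, the "shared memory" is just a vector, and Chromium's real command layout is different.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical command id; Chromium's real ids and layout differ.
enum class CommandId : std::uint32_t { kViewport = 1 };

// Fixed-size, trivially copyable representation of one glViewport call.
struct ViewportCmd {
  CommandId id;
  std::int32_t x, y, width, height;
};

// Stand-in for the shared-memory block that both processes can see.
struct ToySharedBuffer {
  std::vector<std::uint8_t> bytes = std::vector<std::uint8_t>(4096);
  std::size_t put = 0;  // Advanced by the client: end of the valid commands.
  std::size_t get = 0;  // Advanced by the server: next command to execute.
};

// Client side: serialize the call into the buffer instead of calling GL.
// Returns false when the buffer is full (a real client would flush and wait).
bool EnqueueViewport(ToySharedBuffer& buf, std::int32_t x, std::int32_t y,
                     std::int32_t width, std::int32_t height) {
  ViewportCmd cmd{CommandId::kViewport, x, y, width, height};
  if (buf.put + sizeof(cmd) > buf.bytes.size()) return false;
  std::memcpy(buf.bytes.data() + buf.put, &cmd, sizeof(cmd));
  buf.put += sizeof(cmd);
  return true;
}

// Server side: called when the "please execute" message arrives.
// Decodes every command between get and put and returns how many it ran.
int ExecutePending(ToySharedBuffer& buf, std::vector<ViewportCmd>& executed) {
  assert(buf.get <= buf.put);
  int count = 0;
  while (buf.get + sizeof(ViewportCmd) <= buf.put) {
    ViewportCmd cmd;
    std::memcpy(&cmd, buf.bytes.data() + buf.get, sizeof(cmd));
    buf.get += sizeof(cmd);
    if (cmd.id == CommandId::kViewport) {
      executed.push_back(cmd);  // A real server would call the driver here.
      ++count;
    }
  }
  return count;
}
```

Calling `EnqueueViewport` several times and then `ExecutePending` once corresponds to the "I put 10 commands in the buffer" message: the notification is separate from the commands themselves, so many commands can share one message.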
This is all nice and fast as long as you just send commands and don't ask questions. Most commands don't return a value, and thus most of the communication can be asynchronous. But as soon as you ask the GPU process a question, like, "What is the result of this command?" the communication has to do a round trip, and the renderer has to wait for the result. Checking the result values when you don't need them can make the performance of your application much worse. Even something like checking glGetError has a high cost.
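The cost of asking questions can be illustrated with a small simulation that counts messages and blocking waits. This is only a sketch: `ToyServer`, `ToyClient`, and `RunFrame` are hypothetical, everything runs in one process, and the counters stand in for real interprocess latency.

```cpp
#include <cassert>
#include <cstdint>

// Toy server: "executes" queued commands and remembers the last error.
// The value 0 mirrors GL_NO_ERROR; everything else here is hypothetical.
struct ToyServer {
  int commands_executed = 0;
  std::uint32_t last_error = 0;
  void Execute(int pending) { commands_executed += pending; }
};

class ToyClient {
 public:
  explicit ToyClient(ToyServer& server) : server_(server) {}

  // Fire-and-forget: the command is only queued, nothing is sent yet.
  void Viewport(int, int, int, int) { ++pending_; }

  // One asynchronous message that covers every queued command.
  void Flush() {
    if (pending_ == 0) return;
    server_.Execute(pending_);
    pending_ = 0;
    ++messages_sent_;
  }

  // A query needs an answer, so the client has to flush and then wait.
  std::uint32_t GetError() {
    Flush();
    ++blocking_waits_;
    return server_.last_error;
  }

  int messages_sent() const { return messages_sent_; }
  int blocking_waits() const { return blocking_waits_; }

 private:
  ToyServer& server_;
  int pending_ = 0;
  int messages_sent_ = 0;
  int blocking_waits_ = 0;
};

// Issues n viewport calls, optionally checking for errors after each one.
ToyClient RunFrame(ToyServer& server, int n, bool check_errors) {
  assert(n >= 0);
  ToyClient client(server);
  for (int i = 0; i < n; ++i) {
    client.Viewport(0, 0, 640, 480);
    if (check_errors) client.GetError();
  }
  client.Flush();
  return client;
}
```

In this model, ten calls without error checks need one message and no waiting, while ten calls that each check for an error need ten messages and ten blocking waits.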
The commands you can send to the GPU process are pretty much the same as those in the OpenGL ES 2.0 API (3), apart from a few incompatibilities (4). However, what actually gets executed depends on your platform. It could be OpenGL, it could be DirectX (through ANGLE (5)), and some commands could get executed through GL extensions if the extensions are deemed faster. The results might not be the same as if you wrote native OpenGL ES 2.0 code, because Chromium enforces stricter security measures. For example, it does extra validation of the parameters and clears the allocated buffer memory so that one web page cannot read leftover data from a different page.
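The two safety measures mentioned above can be sketched in a few lines. The function names below are hypothetical and this is not Chromium's code. The only borrowed facts are the numeric values of GL_NO_ERROR and GL_INVALID_VALUE and the OpenGL ES rule that a negative viewport width or height is an invalid value.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

constexpr std::uint32_t kNoError = 0;            // Same value as GL_NO_ERROR.
constexpr std::uint32_t kInvalidValue = 0x0501;  // Same value as GL_INVALID_VALUE.

// Validate before forwarding: OpenGL ES defines a negative viewport width
// or height as GL_INVALID_VALUE, so such a call never needs to reach the driver.
std::uint32_t ValidateViewport(std::int32_t width, std::int32_t height) {
  return (width < 0 || height < 0) ? kInvalidValue : kNoError;
}

// Zero a block of memory before handing it to a new owner, so that
// leftover bytes from the previous owner cannot be read back.
void ClearForNewOwner(std::vector<std::uint8_t>& storage) {
  std::fill(storage.begin(), storage.end(), std::uint8_t{0});
  assert(std::all_of(storage.begin(), storage.end(),
                     [](std::uint8_t b) { return b == 0; }));
}
```

Both checks cost a little time on every call, which is one reason the behavior can differ slightly from native OpenGL ES code.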
There is only one GPU process, and it doesn't care who sends it commands; it just remembers the context for each source and keeps a separate shared memory block and command buffer for each one. It isn't just used for WebGL. Normal rendering is done via Skia (6), which also sends its requests through the command buffer and not directly to the driver. Since a single web page can contain multiple elements that need GL commands (for example, multiple canvas elements), each of them renders into a texture (using FBOs (7)) instead of the framebuffer directly, and the compositor takes care of the page arrangement.
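A rough way to picture "one GPU process, many sources" is a table of per-client state. The sketch below is an assumption-heavy toy: `ToyGpuProcess`, `ToyContextState`, and the integer client ids are invented for illustration and say nothing about how Chromium actually identifies clients or stores contexts.

```cpp
#include <cassert>
#include <cstddef>
#include <map>

// Hypothetical per-client state; a real GL context holds far more.
struct ToyContextState {
  int viewport_width = 0;
  int viewport_height = 0;
};

class ToyGpuProcess {
 public:
  // Applies a viewport command to the state kept for client_id,
  // creating that state the first time the client is seen.
  void Viewport(int client_id, int width, int height) {
    assert(width >= 0 && height >= 0);
    ToyContextState& state = contexts_[client_id];
    state.viewport_width = width;
    state.viewport_height = height;
  }

  // Returns the state for client_id, or nullptr if the client is unknown.
  const ToyContextState* Find(int client_id) const {
    auto it = contexts_.find(client_id);
    return it == contexts_.end() ? nullptr : &it->second;
  }

  std::size_t context_count() const { return contexts_.size(); }

 private:
  std::map<int, ToyContextState> contexts_;
};
```

The point of the sketch is only that a command from one client never touches the state of another, even though a single process serves them all.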
This diagram shows an outline of what happens when the glViewport command gets executed, starting from the left and eventually getting to the GPU:
Below is a simplified (8) stack trace of the renderer (GPU client) when glViewport gets called from WebGL, with the most recent function call at the top:
    gpu::gles2::GLES2CmdHelper::Viewport
    gpu::gles2::GLES2Implementation::Viewport
    blink::WebGLRenderingContextBase::viewport
    blink::HTMLCanvasElement::getContext
    v8::internal::Builtin_HandleApiCall
    ...
At the top of the trace is GLES2CmdHelper. This class takes care of the whole business with the command buffer by creating a representation of the command, putting it into the buffer, and (eventually) sending a message to the GPU process that there is a command it should handle.
On the GPU server side:
    gfx::GLApiBase::glViewportFn
    gpu::gles2::GLES2DecoderImpl::DoViewport
    gpu::gles2::GLES2DecoderImpl::HandleViewport
    gpu::gles2::GLES2DecoderImpl::DoCommands
    gpu::CommandParser::ProcessCommands
    gpu::GpuScheduler::PutChanged
    gpu::CommandBufferService::Flush
    content::GpuChannel::HandleMessage
    base::MessageLoop::Run
    ...
The GPU process sits there and waits in a loop for messages (things to do), as you can see at the bottom of this simplified stack trace. When a message arrives, the GPU process jumps through a couple of callbacks and handlers depending on the type of message. In this case, the message is something like, "I put some commands into your buffer." The buffer gets flushed (meaning the buffer position is synchronized between the two processes, while the renderer can keep adding commands to it), and via another set of callbacks execution reaches the GpuScheduler, which starts processing the commands. The GLES2DecoderImpl is the main monster class that handles all the GPU commands and sends them to the driver. Again, it checks whether the parameters to glViewport are valid. Now we have almost hit the drivers: GLApiBase contains thousands of lines of auto-generated bindings that do the equivalent of forwarding each call, with its arguments, to the matching function in the driver.
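The server-side path from the stack trace (process commands, handle one command, validate it, call the binding) can be sketched roughly as follows. This is a simplified illustration, not Chromium's implementation: `ToyCommand`, `ToyDriver`, `ToyDecoder`, and the command id are hypothetical, and the "driver" is just a callback supplied by the caller.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical wire format: a command id followed by four integer arguments.
struct ToyCommand {
  std::uint32_t id;
  std::int32_t args[4];
};

constexpr std::uint32_t kViewportId = 1;  // Illustrative id, not Chromium's.

// Stand-in for the generated bindings: a table of driver entry points.
struct ToyDriver {
  std::function<void(std::int32_t, std::int32_t, std::int32_t, std::int32_t)>
      viewport;
};

class ToyDecoder {
 public:
  explicit ToyDecoder(ToyDriver driver) : driver_(std::move(driver)) {
    assert(static_cast<bool>(driver_.viewport));
  }

  // Dispatches every command by id and returns how many were rejected,
  // either because the id is unknown or because validation failed.
  int DoCommands(const std::vector<ToyCommand>& commands) {
    int rejected = 0;
    for (const ToyCommand& c : commands) {
      switch (c.id) {
        case kViewportId:
          if (!HandleViewport(c)) ++rejected;
          break;
        default:
          ++rejected;
      }
    }
    return rejected;
  }

 private:
  // Validates the arguments, then forwards the call to the driver binding.
  bool HandleViewport(const ToyCommand& c) {
    if (c.args[2] < 0 || c.args[3] < 0) return false;  // Negative size.
    driver_.viewport(c.args[0], c.args[1], c.args[2], c.args[3]);
    return true;
  }

  ToyDriver driver_;
};
```

A caller would fill in `ToyDriver::viewport` with whatever should stand in for the real driver call, for example a lambda that records its arguments. Only commands that pass validation ever reach that callback.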
Effectively, WebGL allows you to execute arbitrary code on the GPU. If there is an exploit targeting the drivers, it could possibly break out and take control of your computer. But the GPU process doesn't have free rein either: it is sandboxed too, just slightly less restrictively than the renderer so that it can call the 3D API directly. That makes it much less likely that an exploit could do damage. Drivers on various platforms are quite prone to bugs. That is why Chromium wraps each of them to work around the issues and blacklists old and buggy hardware, drivers, and GL extensions. If you are trying to figure out why some feature is not working on your device, you can find the blacklists in the source code (11) in a fairly readable format. You might have also noticed in the glViewport example that extra parameter checks are done before the command even reaches the driver, which significantly decreases the chances of triggering a driver bug.
Having to deal with an extra process for GPU commands adds a number of complications, but the benefits are worth it. If we gave the renderer rights to access the GPU, it would be difficult to make sure that all GL commands go through the safety controls, and eventually something would leak through. The client-side API to the command buffer doesn't even have any external dependencies, which means NaCl (12) has a smaller attack surface. Right now, even if a GL command hits a bug, only the GPU process crashes, and it can be restarted. Despite the extra overhead of communication between processes, the perceived performance is better, because GPU work can execute independently from the rest of rendering and take advantage of multi-core CPUs.
(1) Chromium Design Documents: Multi-process Architecture
(2) Chromium Design Documents: GPU Command Buffer
(3) OpenGL ES 2.0 Reference
(4) Chromium Design Documents: GPU Command Buffer - OpenGL ES 2.0 incompatibilities
(5) The ANGLE* project
(6) Skia* project
(7) OpenGL Framebuffer Object (FBO)
(8) The full stack trace of both the renderer and GPU process
(10) Blink layout engine
(11) You can look at the disabled features or list of systems where software rendering is used. Try restarting Chromium with the “--ignore-gpu-blacklist” command-line flag to ignore both of those lists.
(12) Google Native Client, also known as NaCl