Multithreaded rendering in a game engine with C#

A game running at 60 FPS has to render a frame roughly every 16 milliseconds, meaning that all the logic for collision detection, animation, obstacle avoidance, physics, etc. must happen in that very short time. You also need to prepare for rendering and then send the instructions to the GPU. Multithreading seems like the most reasonable option if you have more than one core available (and who doesn’t these days?).

One of the ways to do multithreaded rendering in games is to use a double buffer. At a high level the concept is simple: given two threads, update and render, the update thread fills a buffer with commands that carry enough information to render a frame (we’ll call them RenderCommands). Once the buffer is complete the buffers are switched, and the render thread draws the RenderCommands from the buffer that was just filled while the update thread starts filling the other one.

Graph and possibly a better explanation of double buffer from http://bit.ly/110JuB3
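To make the idea concrete before any threads get involved, here is a minimal sketch of the buffer part on its own. DoubleBuffer<T> and its members are hypothetical names made up for this illustration, they are not part of the sample, and there is deliberately no locking: whoever calls Swap() has to make sure nobody else is touching the buffers at that moment.

```csharp
using System.Collections.Generic;

// Hypothetical helper (not part of the sample): two lists whose roles can be exchanged.
public sealed class DoubleBuffer<T>
{
    private List<T> _writing = new List<T>();
    private List<T> _reading = new List<T>();

    // The producer (the update thread in this article) appends here.
    public void Add(T item) => _writing.Add(item);

    // The consumer (the render thread) reads the last completed frame here.
    public IReadOnlyList<T> Reading => _reading;

    // Exchanges the two roles and empties the new write buffer.
    // No synchronization: the caller must guarantee exclusive access during the swap.
    public void Swap()
    {
        var previousReading = _reading;
        _reading = _writing;
        _writing = previousReading;
        _writing.Clear();
    }
}
```

Everything added before a Swap() becomes visible through Reading after it, and the old read buffer is recycled for the next frame, so no lists are allocated per frame.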

You might be wondering what a render command is. It’s the smallest amount of information we need in order to tell the GPU what to render. A render command for a cube-only engine (i.e. our engine can only draw cubes) can be as simple as:

public class RenderCommand
{
    public float Radius { get; set; }
    public Color Color { get; set; }
    public Matrix World { get; set; }
}

There are many ways to implement this double buffer technique. The implementation we are going to see in this example is based on the Quake 3 source code, implemented in modern C#. Questions, comments and optimizations welcome :D. By the way, a really in-depth review of the Quake 3 code is linked in the references at the end of this post.

The idea

The update thread is the red one and the render thread is the blue one.

The diagram above describes the flow of the update thread (red) and the render thread (blue); the hatched squares represent blocking.

At the initial stage, the render thread waits for the render commands to become available; it is signalled by the update thread. Once that happens, the render thread swaps the buffers, signals to the update thread that the buffers have been swapped and that it is about to start drawing, and then starts drawing.

This signalling process also works well when the update thread takes longer to finish its frame: the render thread simply waits for the render commands to become available.

Finally, the situation where rendering takes longer is also covered: as the diagram shows, the update thread waits until rendering is finished.
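The handshake described above can be tried out in a plain console program, without XNA. The sketch below is only an illustration under a few assumptions: the class and variable names are made up, the "render commands" are just integers, and it uses three AutoResetEvent instances rather than whatever event type the sample uses, because an auto-reset signal is consumed by exactly one wait and so needs no explicit Reset() calls.

```csharp
using System.Collections.Generic;
using System.Threading;

// Hypothetical console-only sketch of the update/render handshake (no XNA involved).
public static class HandshakeSketch
{
    // Runs `frames` update/render cycles and returns what the render side saw, in order.
    public static List<int> Run(int frames)
    {
        var renderCompleted = new AutoResetEvent(false); // render -> update: previous frame is drawn
        var commandsReady = new AutoResetEvent(false);   // update -> render: buffer is filled
        var buffersSwapped = new AutoResetEvent(false);  // render -> update: safe to fill again

        var updating = new List<int>();
        var rendering = new List<int>();
        var rendered = new List<int>();

        var update = new Thread(() =>
        {
            for (int frame = 0; frame < frames; frame++)
            {
                updating.Add(frame);        // stands in for the game logic: one command per frame
                renderCompleted.WaitOne();  // block while the previous frame is still being drawn
                commandsReady.Set();        // hand the filled buffer over
                buffersSwapped.WaitOne();   // block until the swap is done
            }
        });

        var render = new Thread(() =>
        {
            for (int frame = 0; frame < frames; frame++)
            {
                renderCompleted.Set();      // ready for new commands
                commandsReady.WaitOne();    // sleep until the update thread delivers
                var previous = rendering;   // swap: the update thread is blocked at this point
                rendering = updating;
                updating = previous;
                updating.Clear();
                buffersSwapped.Set();       // release the update thread
                rendered.AddRange(rendering); // "draw" while the next update runs
            }
        });

        update.Start();
        render.Start();
        update.Join();
        render.Join();

        renderCompleted.Dispose();
        commandsReady.Dispose();
        buffersSwapped.Dispose();
        return rendered;
    }
}
```

Calling HandshakeSketch.Run(5) runs five frames in lockstep; since every Set() is matched by exactly one WaitOne(), neither thread can get a frame ahead of the other.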

The implementation

For the purposes of this example we will be using XNA. To have something to show and compare against, I’m starting off with the 3D primitives sample from XNA Creators code club.

I am going to skip the details of how to draw vertices, since many other blog posts cover that, and focus on the threading and concurrency issues.

So, for a start, we need to create the update thread. To that end we will instantiate a class called UpdateLoop inside a Task that simply loops on executing Update, as follows.

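Task.Factory.StartNew(() =>
{
    var gl = new UpdateLoop(_renderer);
    gl.Loop();
}, TaskCreationOptions.LongRunning); // Loop() never returns, so hint that it should get its own thread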

The loop in UpdateLoop (note: the loop is not time stepped, i.e. you should not use a while (true) like this in production):

public void Loop()
{
    _stopwatch.Start();
    while (true)
    {
        Update();
    }
}

The code for the loop is (I think) pretty self-explanatory: each call to Update() inside the while loop follows the sequence described in the diagram below.
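Regarding the note about time stepping: one common way to avoid a free-running while (true) is a fixed time step driven by a Stopwatch. The following is a rough sketch, not the sample's code; FixedStepLoop, its constructor parameters and the use of a CancellationToken to stop the loop are all assumptions made for the illustration.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

// Hypothetical fixed time step variant of the loop (not part of the sample).
public sealed class FixedStepLoop
{
    private readonly Action _update;
    private readonly TimeSpan _step;

    public FixedStepLoop(Action update, TimeSpan step)
    {
        _update = update;
        _step = step;
    }

    // Calls update once per step until the token is cancelled; returns the number of updates run.
    public int Run(CancellationToken token)
    {
        var stopwatch = Stopwatch.StartNew();
        var next = TimeSpan.Zero; // time at which the next update is due
        int updates = 0;

        while (!token.IsCancellationRequested)
        {
            var now = stopwatch.Elapsed;
            if (now >= next)
            {
                _update();
                updates++;
                next += _step; // schedule from the previous deadline so small delays do not accumulate
            }
            else
            {
                // Sleep instead of spinning; how precise this is depends on the OS timer.
                Thread.Sleep(next - now);
            }
        }
        return updates;
    }
}
```

If an update overruns its step, the loop runs the following updates back to back until it has caught up, which is the usual behaviour of a fixed time step. In the example in this post the update thread is also paced by the render thread, because EndFrame() blocks.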

It is probably interesting to see what AddCube() looks like:

public void AddCube(Cube primitive)
{
    // World matrix: rotate the cube first, then move it to its position.
    var world = Matrix.CreateFromYawPitchRoll(
                    primitive.Rotation.X,
                    primitive.Rotation.Y,
                    primitive.Rotation.Z) *
                Matrix.CreateTranslation(primitive.Position);

    // Only the update thread writes to this buffer.
    _updatingRenderCommands.Add(
        new RenderCommand
        {
            Color = primitive.Color,
            Radius = primitive.Radius,
            World = world
        });
}

As you can see from the sequence diagram, after looping on renderer.AddCube() there is a call to renderer.EndFrame(). This is where we signal that the render commands are ready; the update thread then waits for the render buffers to be swapped.

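public void EndFrame()
{
    _renderCompleted.WaitOne();   // wait until the render thread has finished the previous frame
    _renderComandsReady.Set();    // tell the render thread the commands are ready
    _renderActive.WaitOne();      // wait until the buffers have been swapped
}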
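Update() itself is not listed in this post, so here is a rough sketch of the shape the text describes: loop over the cubes calling AddCube(), then call EndFrame(). All the types below (this cut-down Cube, IRenderer, RecordingRenderer and UpdateLoopSketch) are hypothetical stand-ins so that the snippet compiles without XNA; the real UpdateLoop in the sample may well look different.

```csharp
using System.Collections.Generic;

// Hypothetical stand-ins so the sketch compiles without XNA.
public sealed class Cube
{
    public float Radius { get; set; }
}

public interface IRenderer
{
    void AddCube(Cube cube);
    void EndFrame();
}

// Fake renderer that only records which calls it received, in order.
public sealed class RecordingRenderer : IRenderer
{
    public List<string> Calls { get; } = new List<string>();
    public void AddCube(Cube cube) => Calls.Add("AddCube");
    public void EndFrame() => Calls.Add("EndFrame");
}

public sealed class UpdateLoopSketch
{
    private readonly IRenderer _renderer;
    private readonly List<Cube> _cubes;

    public UpdateLoopSketch(IRenderer renderer, List<Cube> cubes)
    {
        _renderer = renderer;
        _cubes = cubes;
    }

    public void Update()
    {
        // Game logic (moving and rotating the cubes) would go here.

        // Emit one render command per cube into the updating buffer.
        foreach (var cube in _cubes)
        {
            _renderer.AddCube(cube);
        }

        // With the real renderer this call blocks until the buffers have been swapped.
        _renderer.EndFrame();
    }
}
```

The point of the shape is that EndFrame() is called exactly once per update, after all the commands for that frame have been added.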

From the render thread point of view, this is what the sequence diagram looks like:

In my game class (the main class that inherits from XNA’s Game class), in Draw(), we call _renderer.Draw():

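public void Draw(GraphicsDevice device, Matrix view, Matrix projection)
{
    _renderActive.Reset();           // the update thread will block on this at the end of EndFrame()
    _renderCompleted.Set();          // the previous frame is drawn, let the update thread proceed
    _renderComandsReady.WaitOne();   // sleep until the update thread has commands for us

    _renderCompleted.Reset();
    _renderComandsReady.Reset();
    SwapBuffers();
    _renderActive.Set();             // buffers swapped, release the update thread

    // Create the cube primitive lazily, on the first Draw call.
    _cubePrimitive = _cubePrimitive ?? new CubePrimitive(device);
    foreach (var renderingRenderCommand in _renderingRenderCommands)
    {
        _cubePrimitive.Draw(renderingRenderCommand.World,
                            view,
                            projection,
                            renderingRenderCommand.Color);
    }
}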

This is probably the most complex method in the whole example. _renderActive is reset first so that the update thread, which waits on it at the end of _renderer.EndFrame(), blocks until the buffers have been swapped. We then set _renderCompleted to unblock the update thread and wait for _renderComandsReady to be signalled, effectively putting the renderer to sleep until there are commands to render.

Before calling SwapBuffers(), _renderCompleted is reset so that if the update thread reaches the end of its next frame early, it will sleep in EndFrame() until the render thread sets the event again at the start of the next Draw() call.

Immediately after, _renderComandsReady is reset, which ensures that the render thread will go to sleep on the next Draw() call until there are some commands to render.

I am not terribly sure the explanation above is clearer than the actual code, to be honest, but after 6 attempts I’m giving up.

private void SwapBuffers()
{
    if (_updatingRenderCommands == _bufferedRenderCommandsA)
    {
        _updatingRenderCommands = _bufferedRenderCommandsB;
        _renderingRenderCommands = _bufferedRenderCommandsA;
    }
    else if (_updatingRenderCommands == _bufferedRenderCommandsB)
    {
        _updatingRenderCommands = _bufferedRenderCommandsA;
        _renderingRenderCommands = _bufferedRenderCommandsB;
    }
    _updatingRenderCommands.Clear();
}

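Finally, SwapBuffers(). There is no synchronization inside the method itself; we are just switching buffers. That is fine because of where it is called from: by the time Draw() calls it, the update thread has signalled _renderComandsReady and is blocked waiting on _renderActive, which is only set after the swap.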

And that is pretty much all. The complete sample is available from GitHub.

References and interesting related articles

Most excellent Quake 3 code review http://fabiensanglard.net/quake3/index.php

Threading your game loop http://www.altdevblogaday.com/2011/07/03/threading-and-your-game-loop/

For the craic http://en.wikipedia.org/wiki/Multiple_buffering

Roundcrisis