FBO Shader Simulation

Here is a list of articles about GPU programming in XVR.

Introduction

Game of Life screen shot

This tutorial shows how to render off-screen into Frame Buffer Objects (FBOs) using shaders written in the OpenGL Shading Language (GLSL). Using a pair of FBOs allows simulations to run autonomously inside your graphics card, with near-zero burden on the CPU. Conway's famous Game of Life is introduced as an example application for parallel computations on the GPU. On modern hardware (at the time of writing, that is) it runs so fast, even for more than a million cells, that I added some nice motion blur effects to let you see what is going on ;-) The algorithm outline is presented using code excerpts. The shader programs are also briefly described.

Hint: Before digging through the following explanations, try out the demo. Once you have an idea of how it looks, it will be much easier to understand this tutorial.

Simulation inside the graphics card

Modern graphics cards offer tremendous computing power which, for highly parallel workloads, far exceeds that available on the CPU. Programmable T&L pipelines not only allow more and more complex and realistic scenes to be rendered in real time, but also allow graphics processing units (GPUs) to be used for applications beyond graphics. Examples are solutions of large equation systems, physical simulations and collision detection. Fed back into the graphics pipeline, the results enable visual effects of a quality formerly known only from cinema. A good web resource on GPU usage for arbitrary computations can be found here.

Game of Life

The Game of Life (GOL) is a cellular automaton devised by the British mathematician John Horton Conway in 1970. Cellular automata are widely used for simulations in physics and chemistry.

The world of GOL consists of a matrix of rectangular cells, each of which is either dead or alive. The fate of each cell is determined by the state of its 8 neighboring cells according to two simple rules:

  1. If the cell is currently dead and exactly 3 of its neighbors are alive, new life is born in that cell. Admittedly a strange life form, needing three parents to produce a child, but that's the way it is defined ;-)
  2. If the cell is currently alive, it survives only if 2 or 3 neighbor cells are alive. Otherwise it dies of loneliness or overcrowding.

With a given initial state of enough living cells a complex, self-organizing evolution emerges, very attractive to watch.
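As a point of comparison for the shader version developed later, the two rules can be written down as a plain CPU reference. This is only a sketch in Python: the name `life_step` is made up for this illustration, and it assumes that cells outside the grid count as dead, which the tutorial does not specify.

```python
def life_step(grid):
    """Return the next generation of a Game of Life grid.

    `grid` is a list of rows, each a list of 0 (dead) or 1 (alive).
    Assumption for this sketch: cells outside the grid are dead.
    """
    rows, cols = len(grid), len(grid[0])
    nxt = [[0] * cols for _ in range(rows)]
    for y in range(rows):
        for x in range(cols):
            # Count the living cells among the 8 neighbours.
            n = 0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    if (dy or dx) and 0 <= y + dy < rows and 0 <= x + dx < cols:
                        n += grid[y + dy][x + dx]
            if grid[y][x]:
                nxt[y][x] = 1 if n in (2, 3) else 0   # Rule 2: survival
            else:
                nxt[y][x] = 1 if n == 3 else 0        # Rule 1: birth
    return nxt


# A "blinker": three cells in a row flip between horizontal and vertical.
blinker = [[0, 0, 0],
           [1, 1, 1],
           [0, 0, 0]]
after_one = life_step(blinker)
after_two = life_step(after_one)
```

Note that `life_step` builds a fresh grid instead of modifying its input, so every cell sees the same previous generation.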

Frame buffer objects and shaders

Frame buffer objects are basically rectangular pieces of RAM in your graphics card. You can use them to render stuff off-screen (invisible to the user). Rendering something invisible would be a waste of time if you couldn't use the results any further. Besides reading them back to CPU memory to compute something useful, you can also use FBOs as textures. That means FBOs can also serve as input for rendering. Invoking a shader program while doing so allows you to "abuse" graphical rendering for arbitrary computations. That is what we do in this tutorial.

It is important to understand that a single FBO cannot act as input (texture) and output (render target) at the same time. You can either read from it or write to it, but not both simultaneously. To get around that problem you need at least two FBOs, one for input and the other for output.

If you want to feed your simulation with the result of the previous step, you have to take the double-buffer approach. In the first step, FBO A acts as input for the computation resulting in FBO B. In the next step FBO B acts as input and the result is rendered into FBO A. By swapping between A and B (called FboFront and FboBack in this tutorial) you can implement simulations inside hardware shaders with near zero burden for the CPU.
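Why two buffers are needed can be seen even without a GPU. The following Python sketch uses a deliberately trivial rule (each cell takes the previous value of its left neighbour) and compares a double-buffered step with an in-place update. All names here (`shift_rule`, `step_double_buffered`, `step_in_place`) are hypothetical and not part of the tutorial's code.

```python
def shift_rule(src, i):
    """Toy rule: a cell takes the previous value of its left neighbour."""
    return src[i - 1] if i > 0 else 0


def step_double_buffered(buffers, front):
    """Read from buffers[front], write to the other buffer, return the new front index."""
    back = 1 - front
    src, dst = buffers[front], buffers[back]
    for i in range(len(src)):
        dst[i] = shift_rule(src, i)
    return back  # the buffer just written becomes the new front


def step_in_place(cells):
    """Same rule, but reading and writing the same buffer."""
    for i in range(len(cells)):
        cells[i] = shift_rule(cells, i)  # may read values that were already overwritten


# Double-buffered: the single living cell moves one position per step.
buffers = [[1, 0, 0, 0], [0, 0, 0, 0]]
front = 0
front = step_double_buffered(buffers, front)
first_result = list(buffers[front])
front = step_double_buffered(buffers, front)
second_result = list(buffers[front])

# In place: the cell is erased before its right neighbour can read it.
cells = [1, 0, 0, 0]
step_in_place(cells)
```

On a GPU the situation is worse than in this sequential sketch, because the order in which fragments are processed is not under your control.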

Algorithm outline

We load the initial state of our world from a picture file into a texture object. A green pixel in the picture means a living cell, a black pixel a dead one. This texture is passed immediately to our simulation function FboRender() to perform the initial simulation step.

var Texture= CVmTexture(FboInitTextureName);
FboRender(Texture);

In the first simulation step, the pixels are treated as dead or living cells, the rules mentioned above are evaluated, and a new grid of pixels is generated, each again depicting a dead or living cell. Here is (slightly simplified) what happens inside FboRender():

SceneBegin();
FboArray[FboBack].start();                 // Redirect rendering into the back FBO
SetActiveTexture(Texture,VR_NO_FILTER,0);  // Bind the input grid as a texture, unfiltered
FboShader.Start();                         // Fragment shader applies the GOL rules
glBegin(GL_QUADS);                         // One quad covering the whole FBO
    glTexCoord(0.0, 0.0); glVertex(0.0, 0.0, 0.0);
    glTexCoord(1.0, 0.0); glVertex(FboWidth, 0.0, 0.0);
    glTexCoord(1.0, 1.0); glVertex(FboWidth, FboHeight, 0.0);
    glTexCoord(0.0, 1.0); glVertex(0.0, FboHeight, 0.0);
glEnd();
FboShader.Stop();
SceneEnd();

This basically renders our texture onto a rectangle covering the whole render target, using a shader (FboShader) that applies the rules of GOL described above. But instead of rendering to the screen (the frame buffer, that is), we render to a frame buffer object (FBO). If you wonder why we have an array of FBOs here, be a little patient; the reason will be described later. That's all for the simulation step. Now we want to display the result of our first step on the screen by calling FboDisplay(). Here is the full function:

function FboDisplay()
{
    SetActiveTexture(FboArray[FboFront],VR_NO_FILTER,0);
    ScreenShader.Start();
    glBegin(GL_QUADS);
         glTexCoord(0.0, 0.0); glVertex(-1.0, -1.0, -0.1);
         glTexCoord(1.0, 0.0); glVertex(-1.0, +1.0, -0.1);
         glTexCoord(1.0, 1.0); glVertex(+1.0, +1.0, -0.1);
         glTexCoord(0.0, 1.0); glVertex(+1.0, -1.0, -0.1);
    glEnd();	
    ScreenShader.Stop();
}

FboDisplay() basically transports the pixels we have just generated in the initial step to your computer's screen (in technical terms: the frame buffer) by rendering a textured rectangle. In order to add some nice motion blur effects, it invokes a second shader (ScreenShader, see below).

Having described the first step of the simulation and introduced the most important parts of the simulation loop, what is left is how to close the loop, in other words, how to perform step 2, 3,... Here is how it goes:

In the initial step, we used a texture loaded from a picture file as input, applied the GOL rules in a shader and rendered the result to an FBO, which was then displayed on your screen. The next simulation steps are basically the same, with one important difference: they use the results of the previous simulation step as input instead of the initial texture. We find those results in the FBO we used as a render target in the previous step. That is why we use an array of FBOs. We have two of them: one for the previous state of our world (the dead and living cells) and one for the current results. This approach is very similar to OpenGL double buffering: in steps 2, 3,... we use one FBO as the back buffer and the other as the front buffer. After the simulation step is performed, both buffers are swapped:

FboFront = FboBack;
FboBack = 1 - FboFront;

The Shaders

The tutorial program uses two shaders, one to perform the simulation (life.sh) and the other one (screen.sh) to copy the grid to the screen and add some visual sugar. Both shader programs do their actual work in the fragment shader only. The vertex shaders of both programs therefore merely transform vertices in the usual OpenGL way and pass them down the pipeline:

[VERTEX SHADER]
 
varying vec2  TexCoord;
 
void main(void)
{
	TexCoord = gl_MultiTexCoord0.st;
	gl_Position = ftransform();
}

Life.sh

This fragment shader performs the life simulation by applying the GOL rules to each cell (pixel). To do so, we have to count the number of living neighbor cells, which first requires locating them. In a normal program, we would have an array of cells which we would access by indices i and j. In a fragment shader, we have a texture instead, which we access by texture coordinates s and t. The coordinates of the current cell are given by the texture coordinates of our fragment, stored in the varying TexCoord. The coordinates of the neighboring cells are determined as follows:


float gapS = 1.0 / TexWidth;       // Horizontal gap between two texels/pixels
float gapT = 1.0 / TexHeight;           // Vertical gap between two texels/pixels
 
vec2 Offsets[8];
Offsets[0] = vec2(0.0,gapT);            // North
Offsets[1] = vec2(gapS,gapT);           // Northeast
Offsets[2] = vec2(gapS,0.0);            // East
Offsets[3] = vec2(gapS,-gapT);          // Southeast
Offsets[4] = vec2(0.0,-gapT);           // South
Offsets[5] = vec2(-gapS,-gapT);         // Southwest
Offsets[6] = vec2(-gapS,0.0);           // West
Offsets[7] = vec2(-gapS,gapT);          // Northwest 
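To see which cells these eight offsets actually address, the lookup can be replayed on the CPU. The Python sketch below makes two assumptions that the tutorial does not state: the fragment samples the centre of its texel, and coordinates outside the range 0..1 wrap around (as with a repeating texture). The function name `neighbour_texels` is hypothetical.

```python
import math


def neighbour_texels(x, y, width, height):
    """Return the (x, y) texel indices reached by the eight offsets from texel (x, y).

    Assumptions for this sketch: texel-centre sampling, nearest-texel lookup,
    and wrap-around addressing at the texture borders.
    """
    gap_s = 1.0 / width    # Horizontal gap between two texels
    gap_t = 1.0 / height   # Vertical gap between two texels
    offsets = [
        (0.0, gap_t),      # North
        (gap_s, gap_t),    # Northeast
        (gap_s, 0.0),      # East
        (gap_s, -gap_t),   # Southeast
        (0.0, -gap_t),     # South
        (-gap_s, -gap_t),  # Southwest
        (-gap_s, 0.0),     # West
        (-gap_s, gap_t),   # Northwest
    ]
    # Texture coordinates of the centre of texel (x, y).
    s = (x + 0.5) * gap_s
    t = (y + 0.5) * gap_t
    result = []
    for ds, dt in offsets:
        # Wrap into [0, 1), then scale back to texel indices.
        nx = math.floor(((s + ds) % 1.0) * width)
        ny = math.floor(((t + dt) % 1.0) * height)
        result.append((nx, ny))
    return result


inner = neighbour_texels(1, 1, 4, 4)
corner = neighbour_texels(0, 0, 4, 4)
```

For an inner texel the offsets land on the eight surrounding texels. For a corner texel the result depends entirely on the wrap mode, so the behaviour at the borders of the world is determined by how the texture is set up.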
 

Next we have to count the number of living neighbors. Intuitively, we would loop over the neighboring cells and count the living ones. We can break the loop when we have found four or more living neighboring cells, because our cell will die in any case by overcrowding (Rule 2).

int neighbours = 0;
for (int i = 0; i < 8 && neighbours < 4; i++)
{
    if ( texture2D(Tex, TexCoord+Offsets[i]).g > 0.5)   // Living cells are green
        neighbours ++;
}

Unfortunately, this doesn't work with the current version of the GLSL compiler. The loop is too complicated, so we have to unroll it ourselves:

if ( texture2D(Tex, TexCoord+Offsets[0]).g > 0.5)
    neighbours ++;
if ( texture2D(Tex, TexCoord+Offsets[1]).g > 0.5)
    neighbours ++;
 
[...]
 
if ( texture2D(Tex, TexCoord+Offsets[7]).g > 0.5)
    neighbours ++;
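The claim made earlier, that counting may stop once four living neighbours have been found, can be checked exhaustively on the CPU. The following Python sketch compares the outcome of the rules for a full count and for a count capped at four, over all 256 neighbour configurations and both cell states. The helper names `next_alive` and `count_capped` are made up for this check.

```python
from itertools import product


def next_alive(alive, neighbours):
    """GOL rules: birth on exactly 3 neighbours, survival on 2 or 3."""
    if alive:
        return neighbours in (2, 3)
    return neighbours == 3


def count_capped(states, cap=4):
    """Count living neighbours, but stop as soon as `cap` have been found."""
    n = 0
    for s in states:
        if n >= cap:
            break
        n += s
    return n


mismatches = 0
for states in product((0, 1), repeat=8):      # every neighbour configuration
    for alive in (False, True):               # both states of the centre cell
        full = next_alive(alive, sum(states))
        capped = next_alive(alive, count_capped(states))
        if full != capped:
            mismatches += 1
```

No configuration gives a different result, because any count of four or more leads to a dead cell either way. The unrolled shader version simply counts all eight neighbours, which gives the same outcome.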

Now neighbours contains the number of living neighboring cells and we can determine whether the current cell lives (we draw a green pixel) or is dead (black pixel).

vec4 color = texture2D(Tex, TexCoord);
	  
if (color.g > 0.5) {                           // If cell is living
    if (neighbours < 2 || neighbours > 3)      // Too few or too many neighbours, cell dies
         color = vec4(1.0, 0.0, 0.0, 1.0);		
    else if (color.r > 0.0)                    // Cell survives but ages			
         color.r = color.r - colorstep;	
} else {                                       // Dead cell
    if (neighbours == 3)                       // Exactly three neighbours
         color = vec4( 1.0, 1.0, 0.0, 1.0);    // Cell is born
    else if (color.r > 0.0)
         color.r = color.r - colorstep;        // Stays dead but fades away
}
gl_FragColor = color;                          // Draw the pixel 
 

You may wonder why we also write to the red channel (color.r) of the pixel. The red value is not rendered to the screen directly; it encodes the time since a cell changed from living to dead or vice versa. This is used by the second shader to display a nice motion blur.
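How the red channel behaves over time is easier to follow with concrete numbers. The Python sketch below mirrors the branch structure of the fragment shader for a single cell. The value `colorstep=0.25` is an arbitrary choice for illustration (the tutorial does not give the real value), and the clamp at zero is an assumption about what a normalized colour buffer would store. `update_cell` is a hypothetical name.

```python
def update_cell(r, g, neighbours, colorstep=0.25):
    """Return the new (r, g) pair of one cell, following the shader's branches.

    g > 0.5 means the cell is alive. r is set to 1.0 whenever the state
    changes and then decays by `colorstep` per simulation step.
    """
    if g > 0.5:                                # Living cell
        if neighbours < 2 or neighbours > 3:
            return (1.0, 0.0)                  # Dies: age marker reset to 1.0
    else:                                      # Dead cell
        if neighbours == 3:
            return (1.0, 1.0)                  # Born: age marker reset to 1.0
    if r > 0.0:
        r = max(0.0, r - colorstep)            # State unchanged: marker fades
    return (r, g)


# A newly born cell that keeps exactly two living neighbours:
history = []
state = update_cell(0.0, 0.0, 3)               # birth
history.append(state)
for _ in range(5):
    state = update_cell(state[0], state[1], 2) # survives and ages
    history.append(state)
```

In this run the red value starts at 1.0 at birth and reaches 0.0 after four further steps, while green stays at 1.0 throughout.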

Screen.sh

This shader is really simple. It reads the pixels from our FBO and writes them to the screen. It just applies some nice colors for the motion blur effect:

const vec4 livingFadeFrom = vec4(0.0, 1.0, 0.0, 1.0);   /* Color of newly born cell */
const vec4 livingFadeTo  = vec4(0.0, 1.0, 1.0, 1.0);    /* Color of alive but very old cell */
const vec4 deadFadeFrom = vec4(0.0, 0.5, 0.0, 1.0);     /* Color of cell just died */
const vec4 deadFadeTo = vec4(0.0, 0.0, 0.0, 1.0);       /* Color of cell dead for a long time */

Having defined our colors for living and dead cells, we just have to mix them. The amount of each color is determined by the red component of the pixel.

if ( color.g > 0.5) {
    /* Fade color of living cell from livingFadeFrom to livingFadeTo */
    color = livingFadeFrom * color.r + livingFadeTo * (1.0 - color.r);
} else {
    /* Fade color of dead cell from deadFadeFrom to deadFadeTo */
    color = deadFadeFrom * color.r + deadFadeTo * (1.0 - color.r);
}
	
gl_FragColor = color;
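The blend above is an ordinary linear interpolation, with the red value as the weight. The Python sketch below reproduces it with the four colour constants from the shader. The helper `mix` is written to match the formula x*(1-a) + y*a of the GLSL built-in of the same name, and `screen_color` is a hypothetical name for the shader body.

```python
def mix(x, y, a):
    """Component-wise x*(1-a) + y*a, the same formula as GLSL's mix()."""
    return tuple(xi * (1.0 - a) + yi * a for xi, yi in zip(x, y))


LIVING_FADE_FROM = (0.0, 1.0, 0.0, 1.0)   # Newly born cell
LIVING_FADE_TO = (0.0, 1.0, 1.0, 1.0)     # Alive but very old cell
DEAD_FADE_FROM = (0.0, 0.5, 0.0, 1.0)     # Cell that just died
DEAD_FADE_TO = (0.0, 0.0, 0.0, 1.0)       # Cell dead for a long time


def screen_color(r, g):
    """Map the simulation's (r, g) pair of one cell to its display colour."""
    if g > 0.5:
        # Living cell: r = 1 gives the "from" colour, r = 0 the "to" colour.
        return mix(LIVING_FADE_TO, LIVING_FADE_FROM, r)
    # Dead cell: same scheme with the dead colours.
    return mix(DEAD_FADE_TO, DEAD_FADE_FROM, r)


newborn = screen_color(1.0, 1.0)
half_faded = screen_color(0.5, 0.0)
long_dead = screen_color(0.0, 0.0)
```

A cell that has just died is drawn in dark green and fades to black as its red value decays, which is what produces the trail behind moving structures.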

Demo and source code

The Game of Life demo is available online on my homepage. Please note that you need Internet Explorer and an OpenGL 2.0 graphics card to run the demo. An archive containing the project file and all source code is also available.

See Also