XNA 4.0 Light Pre-Pass: Optimization Round One

Following some suggestions on my previous posts, I decided to reconstruct the z-buffer from my linear depth buffer and optimize the lighting pass.

To achieve this, I took the following steps:

  • Changed the light-accumulation render target: it now also has a depth/stencil surface (DiscardContents);
  • Right after binding it, and before the light rendering, I draw a full-screen quad with a shader that outputs the z-buffer, using our linear depth buffer as input. I know it’s not precise, since we lose a lot of information close to the near plane, but this fake z-buffer is only used in the lighting stage, and with coarse light volumes. (I get some artifacts when the geometry and lights are close to the far plane; maybe I can fix that with some bias);
  • Instead of drawing screen-aligned quads, I’m now using a convex mesh that fits the light volume (just a sphere, scaled by the light’s radius). I could switch between front-face and back-face culling, depending on whether the light volume touches the camera’s near plane, as seen here, but I left this for next time. I’ve inverted the winding order of my light mesh, so I don’t need to change the culling state, and the depth compare function is set to GreaterEqual;
  • For each light, compute the appropriate WorldViewProjection matrix (using the scale and position of each light), set the light properties as usual and render. I’m using this technique to recompute the pixel view-space position.
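
As a rough illustration of the z-reconstruction step, here is a minimal Python sketch of the math involved (it is not the HLSL from this project). It assumes the linear depth buffer stores the view-space distance divided by the far plane distance, and a Direct3D-style projection where depth goes from 0 at the near plane to 1 at the far plane. The function name and the near/far values are made up for the example.

```python
def linear_to_zbuffer_depth(linear_depth, near, far):
    """Map linear depth (assumed: view-space distance / far) to a
    Direct3D-style z-buffer value in [0, 1]."""
    # Recover the positive view-space distance along the camera axis.
    distance = linear_depth * far
    # Anything closer than the near plane cannot be represented; clamp it.
    distance = max(distance, near)
    # Standard perspective depth: 0 at the near plane, 1 at the far plane.
    return (far / (far - near)) * (1.0 - near / distance)


if __name__ == "__main__":
    near, far = 1.0, 1000.0  # hypothetical clip planes
    for distance in (1.0, 2.0, 10.0, 100.0, 1000.0):
        z = linear_to_zbuffer_depth(distance / far, near, far)
        print(f"distance {distance:7.1f} -> z-buffer {z:.6f}")
```

With these planes, a point only 10 units away already maps to a z-buffer value of about 0.9, and everything beyond 100 units is squeezed into the last 1% of the range. That compression may be related to the far-plane artifacts mentioned above, although I haven’t confirmed it.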

Here is a comparison of the area being affected by the lights:

In a test with 500 lights (341 visible from the exact camera startup position in my project), the screen-aligned technique takes approximately draw: 3 ms and gpu: 28 ms. When I switch to the mesh-based technique, those values drop to approximately draw: 1.7 ms and gpu: 16.7 ms, roughly 40% lower in both cases. The draw time decreases because we no longer need to compute the screen-aligned quads. Note that I don’t know if these measurements are 100% correct; I’m using the technique described here. My CPU is an i5-430 and my GPU is an HD5650.

It proved to be a great step toward better performance, even with the extra z-reconstruction pass. I would like to hear your results, criticism, and suggestions.

By the way, here is the full source.

See you next time!



About jcoluna

Game developer and musician
This entry was posted in XNA.

2 Responses to XNA 4.0 Light Pre-Pass: Optimization Round One

  1. Eclectus says:

    Hi 🙂

    I am not sure it matters whether the camera plane touches the light or not. I think that by inverting the winding order as you have done, you’re always going to be rendering the inside of the sphere, and that this will be fine. Is there an advantage to testing the camera plane if you don’t need to flip the winding order?

  2. jcoluna says:

    Hey there!
    In my case, I need to invert the order when the light hits the far plane (our back-faces will be clipped). We could also render the back-faces to stencil only (using depth test greater), and then render the front-faces using depth test less, only where the stencil was written (http://www.talula.demon.co.uk/DeferredShading.pdf). This way we get a really tight volume in x, y and z. I don’t know the real benefits (we have to draw the light meshes twice), but I’ll test it in the future.
    See ya!
