XNA: Adding fog to a LPP (or deferred) pipeline

Hi folks,

After a long delay, here is one more gift for you: how to add fog to your scene. I will talk about global fog, although it can be easily extended to local fog volumes like spheres, cubes, etc.

Fog is a great visual feature in any renderer. It not only helps create atmosphere and environmental cues, but also improves the depth perception of your scene. As a bonus, you can use it to camouflage the unpleasant pop-in/out side effect of distant object culling.

Here we have the same view with and without fog:

The effect is basically a color interpolation based on the depth of the current pixel. You can have a look here and here for some good tutorials, as I won't go into the gritty details of the physics behind it (it's not that hard, though: don't be scared to learn the real deal).

In the fixed-function pipeline era, it was really simple to add fog to your scene: just enable a flag, set up two or three parameters and voilà, it's done. Now, in the programmable age, you have to implement it yourself, creating lots of shader permutations (or maybe using dynamic branching) to apply the fog formula at the end of the processing.

But hey: what came to mind when I said "the depth of the current pixel"? After lots of posts talking about our beloved GBuffer, you should know that we already have the depth buffer as a render target.

With the depth buffer at hand, all you need to do is perform a full-screen pass that reads the depth value of each pixel, computes the interpolation formula and alpha-blends the fog color over the scene.

Sounds easy, eh? Indeed it is. I've implemented both the EXP and EXP2 formulas explained in the links above; just change the fog type to see the difference (check LightPrePass.cs).

Here is a snippet of the shader:

float4 PixelShaderFunctionExp(VertexShaderOutput input) : COLOR0
{
    float depthValue = tex2D(depthSampler, input.TexCoord).r;

    // skip pixels still at the far plane (skybox/background)
    clip(-depthValue + 0.9999f);

    float mix = saturate(1 - exp(-depthValue * FogDensity));

    return float4(FogColor.rgb, mix);
}

I'm doing a "clip" to avoid fogging the skybox or background, where the depth buffer is still at the far plane. You can use the stencil buffer or another trick if you wish.

The issue with this approach is that it doesn't work for objects that don't write to the depth buffer, like particle effects and other transparent objects. In that case, you can change those shaders to use the global fog parameters and apply the same formula stated above (that's how I'm doing it in my personal project).

Anyway, here is the sample source with all the code and assets to play around. Use it at your own risk!!

Ah, by the way, some folks asked me how to add game objects in my framework by code instead of the XML. I put some examples in the initialization of this sample, so take a look if you want.

That is it, see you around!


Posted in XNA | 39 Comments

Light Shafts + Tone Mapping

Greetings strangers!

This time I will show you a simple approach to an effect that will give our renderer a more "artistic" look: light shafts, also known as god rays. Needless to say, the source code + assets are available at the bottom of this post.


You can find a good description of the effect here. It's a more physically based tutorial, so if you get shivers down your spine every time you see an integral symbol, skip it and follow my trail. The effect I implemented can be described as a simple radial blur, using the light position (in screen space) as the center of the effect, masked by a texture.

The masking is needed to prevent foreground objects from bleeding into the scene, as the light is the only thing that needs to be "shafted". Keep in mind that this example/tutorial is only valid for directional lights, but it can easily be extended to support point lights.

To find the center of the radial blur, you need to project the light position into screen space. As we are dealing with a directional light, you can use a simple trick to create a fake light position: camera position + light direction * (a value between the near and far plane). And the hack season has begun!

With this projected value in hand (in the [-1..1] range), we compute an "intensity" factor. The effect should be at its maximum when the user is looking straight at the light, and fade out smoothly as the light moves toward the screen border. You can see what I'm doing in the file PostProcessingComponent.cs, method PreRender.

note: in the last post I talked about a game framework using components, and that's what I'm using from now on. The rendering code itself is not tied to the game layer, so if you don't like it, just use the .fx and LightShaftEffect.cs code.

Mask Creation

We need to figure out what will be blurred. We can't just blur everything: we need to detect what is in the light's layer, in our case the background (remember that I'm dealing only with directional lights in this example). Thankfully, we already have what we need: the depth buffer! Since we are using a LPP approach, it's already in our GBuffer, and even downsampled 2x to save some bandwidth. All we need to do is mask out the foreground pixels and voilà, it's done. Actually, I'm using the z value itself, not just a binary mask. This allows foreground objects to bleed a little bit, just to give it a special taste. Check out the file LightShaftEffect.fx, method PixelShaderConvertRGB. Remember folks: this is not a physically based approach!

In the same pass where I write this mask to the alpha channel of a render target (I'm using a 1/4-sized RT), I downsample the color buffer with simple linear filtering. This will save some texture bandwidth in the next step.


Radial Blur

This is the most costly step, but it is as simple as it could be: for each pixel, we take many texture samples along the direction from the pixel's position to the light's position (in screen space). I'm applying some attenuation based on the distance to the light, and the spacing between texture fetches is also customizable. In fact, there are lots of customizable parameters: take a look at the shader and also at the level file, coluna_level.xml, where the component containing the post-processing effect is stored along with the default values for its parameters.

I'm doing 40 texture fetches per pixel, and that is a lot. It's important to keep the blur source (the mask + downsampled RGB) small to avoid texture cache misses. As it's blurred as hell, you're going to lose the high frequencies anyway.
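The blur itself can be sketched in C on a 1D scanline (a toy model of the 2D shader; the names and the decay parameter are mine, not the actual effect file): each output pixel walks toward the light position, accumulating attenuated samples.

```c
#include <math.h>

#define NUM_SAMPLES 40  /* matches the 40 fetches mentioned above */

/* src: one scanline of the mask+color buffer; x: current pixel;
   light_x: projected light position; decay: per-sample attenuation. */
float radial_blur_1d(const float *src, int width, int x, int light_x,
                     float decay) {
    float step = (light_x - x) / (float)NUM_SAMPLES; /* fetch spacing */
    float sum = 0.0f, weight = 1.0f, total = 0.0f;
    for (int i = 0; i < NUM_SAMPLES; i++) {
        int sx = x + (int)(step * i);
        if (sx < 0) sx = 0;
        if (sx >= width) sx = width - 1;
        sum += src[sx] * weight;
        total += weight;
        weight *= decay;  /* samples farther along contribute less */
    }
    return sum / total;   /* normalized: a constant image stays constant */
}
```

The 2D version does the same thing with float2 texture coordinates instead of an integer index.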

Final Mix

The output is a simple sum: original source + (blurred version * blurred version's alpha). You could use some luminance and threshold formulas, but for the sake of simplicity I'm doing just what I said above.

Bonus: Tone Mapping

To give it an even sexier look, I added a tone mapping algorithm to the final mix, so you can control how the colors are displayed on screen. You can change the contrast, saturation, exposure and color balance of the scene. Thanks to our LPP renderer, the input source is already in HDR, so we won't get color banding when doing this color space transformation. The technique I'm using is explained here. Here are some examples of the same view using different tone mapping parameters:

That’s it folks, I hope you enjoyed it. As usual, the source code is here (***my public dropbox folder is down for a while, I will try to move the source files somewhere else. Sorry about that***). Feel free to use it, at your own risk!! Any comments, suggestions and donations are appreciated.



Errata: in the previous post, I forgot to add a serializer class, so the loading code was duplicating all the entries instead of sharing them (the SharedResourceList stuff). It's fixed now.

Posted in XNA | 33 Comments

XNA: Creating a Game Framework

Hi all,

After a long time, I’m back! I’m also back to Brazil, for good, and it took me a while to settle down and start writing again.

This time I won’t talk about any fancy graphics technique or rendering optimizations. Instead, I will talk about an implementation of a game framework to be used on top of the rendering system I’m developing (and available for download on this blog), and also some tips about how to save/load your levels created with this framework. Ah, the source code is here, use it at your own risk!

Component Based System

I’ve worked with two different game architectures in my previous jobs: an inheritance-oriented one, and a component based one.

In the inheritance-oriented one, when you create a new object class, you pick a starting class (one that inherited from the base game object class at some point) and add/override/implement the relevant methods. In this architecture you sometimes see a big inheritance tree like "GameObject -> PhysicsObject -> RenderObject -> FlyingObject -> ShipObject -> SuperShipObject" and so on. You can also use multiple inheritance in some languages, which makes things more complicated and harder to maintain (in my humble opinion).

In the component-based system, you have smaller classes responsible for simpler tasks, like rendering or playing sounds (named components), and you attach them to a container (called a GameObject). In the example above, you could have a GameObject with a render component, a physics component and a ship component to achieve the same goal.

In both cases you can have a hierarchy of game objects (aka a scene graph), so maybe your render object could be a child of the physics object, or whatever. In my experience, a component-based system proved easier to understand, develop, maintain and extend, so I chose it for my game framework. The Unity3D game engine is a good place to see it in action.

GameObject and BaseComponent

The two core classes of the game framework are the GameObject and the BaseComponent: the game object can have multiple components and children, and has a transform plus some basic events. The transform is propagated to its children and to all its components. The base component is where the magic happens: the base class itself doesn't do anything special, but here are some examples of its descendants:

Render Component: responsible for rendering a mesh/model. The mesh itself is added/removed from an independent "render world", so we don't need to traverse the whole scene graph looking for render components during the drawing step. When it receives a "transform changed" event, it signals the render world that its mesh has been touched, so the render world will render it properly in the next frame.

Camera Component: it uses the Game Object’s transform plus some specific members (FovY, aspect ratio, near and far planes, viewport, etc) to define how the render world will be drawn.

Light Component: it also uses the Game Object’s transform plus some specific members (light type, radius, color, etc). It’s added to the render world, so we can query quickly what lights touch a given volume.

The base component has some "Update()" methods that may be implemented in the subclasses. The "Update()" method is called every frame for each component that overrides it (we keep a list of all the components that really need to be updated). Sometimes we need an update at a fixed rate, and sometimes just once per frame, so I created two different updates. I also added a "PreRender" method, called right before rendering, so you can generate a mesh using the current camera, etc.

Here is an example of a torch, in my framework: the “torch” GO (game object for short) has only a mesh component (the burning wood) and has a child, “fire” (selected in the inspector). This GO has two components: a particle emitter and a light component. This extra GO was created because I needed an offset from the pivot of the torch to the burning tip.

There is an infinite number of components that can be implemented in this architecture, without deep inheritance trees or thousands of lines of code. In the source code provided, you can find the three examples above plus a particle emitter component. A good starting point for extending it is to create a physics component. I wrote two different physics components in my "official" XNA engine, using JigLibX and Box2D, and they worked quite well.

Accessing Components

Sometimes a component may need to interact with another component. Let's say you want your torch's light to flicker. You could either inherit from your LightComponent class and create a LightThatFlickersComponent, or create a LightModulatorComponent that inherits from BaseComponent. It has parameters like the light's minimum/maximum radius, color and intensity, along with a modulation type (i.e. sine, random, etc). When the component starts, it asks its GO for all the light components attached to it, and then updates those lights accordingly. There are different methods in the GO class to query components by type: you can search for a single component in the GO itself or in the hierarchy starting at that GO, get a list of all occurrences of a component type in the tree, etc.
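A minimal sketch of this query pattern (in C with hypothetical names; the real framework is C# and much richer): components carry a type tag, and a GO can be searched directly or depth-first through its hierarchy.

```c
#include <stddef.h>

enum comp_type { COMP_RENDER, COMP_LIGHT, COMP_PARTICLE };

typedef struct component { enum comp_type type; } component_t;

typedef struct game_object {
    component_t *components[8]; int num_components;
    struct game_object *children[8]; int num_children;
} game_object_t;

/* first matching component on this GO only */
component_t *get_component(game_object_t *go, enum comp_type type) {
    for (int i = 0; i < go->num_components; i++)
        if (go->components[i]->type == type) return go->components[i];
    return NULL;
}

/* depth-first search over the hierarchy rooted at `go` */
component_t *get_component_in_children(game_object_t *go, enum comp_type type) {
    component_t *c = get_component(go, type);
    if (c) return c;
    for (int i = 0; i < go->num_children; i++) {
        c = get_component_in_children(go->children[i], type);
        if (c) return c;
    }
    return NULL;
}
```

This mirrors the torch example: querying the torch GO itself misses the light, while the hierarchy search finds it on the "fire" child.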

Saving and loading

One important feature of any game framework is saving and loading a level/game state/whatever. Creating all the levels in code may not be an option if you work with level designers or your game doesn't fit the procedurally-generated-content genre. Shawn Hargreaves has an excellent post about how to use the XNA/C# serializer, so I won't repeat it all here. I will note some points that I think are important and weren't too obvious when I faced them. By default, the serializer saves all the public properties, so you may face some issues like I did:

[ContentSerializer(Optional=true)]: when you add new properties to a class, the XML loader will cry a river saying it didn't find that property in the XML. Add this attribute to allow the game to load without it.

[ContentSerializer(Ignore=true)]: use this if you don't want a given member to be saved, like the "GlobalTransform" in the BaseComponent, which is just a facade for the GO's global transform.

[Browsable(false)]: at some point you will create an editor for your game, and a property grid is a good choice for inspecting properties. Use this attribute if you don't want a property to show up in the property grid (like an internal GUID of the GO).

[ContentSerializer(SharedResource=true)]: if a component needs to store a reference to a GO or another component, use this attribute to prevent the reference from being serialized as a property inside the component. This way, it will be stored as a simple reference to an object that was already serialized.

[ContentSerializer]: forces a member to be serialized (i.e. you don't need to create public access to it).

An important class that is missing so far is the SharedResourceList. It’s a list that serializes only the references to the elements, and it’s used by the GameObject class to store its children.

I created a class named GameWorld that is responsible for managing the GO tree. It has a single root GO, some lists of updatable components, an instance of the render world and some helper methods. You load a level the same way you load a texture or a mesh:

GameObject root = Content.Load<GameObject>("levels/jcoluna.xml");

With the loaded root in hand, you pass it to the game world, which calls all the initialize methods and prepares the scene graph for updating and rendering. Take a look at the code and ask if you have any doubts, it's not rocket science.

One last tip: keep the level files and the other assets (textures, meshes) in different content projects, and do not reference the content processor where it's not needed. Otherwise, every time you change a class in the game framework, all the textures and meshes will be reimported (this is not done in this example, sorry).

I would like to thank Rudi Bravo for his help during the development of my framework; he did a great job on the dirty serialization stuff and the Box2D integration.

That is it, keep safe, healthy, happy and donating!


Posted in XNA | 10 Comments

XNA Light Pre-Pass: Instancing

Almost everyone out there wants to output as many meshes/triangles/details as possible. We know that individual draw calls can hurt performance, so it would be nice if there were a way to draw lots of meshes with a single draw call. Well, that is possible, and it's called geometry instancing. A good starting point can be found here.

Basically, we choose a mesh to be rendered (hundreds or thousands of copies), and we fill an additional array holding the per-instance data. This data is usually the world transform of each individual instance of that mesh, plus some other information like color, texture offset (to access a different part of a texture atlas, e.g.), etc. In the rendering step, we set this additional array as a secondary vertex buffer and issue a single draw call taking the number of copies as an argument. It's REALLY easy (easier than I thought it would be). Take a look at this code snippet:

// Create a vertex declaration matching the per-instance data layout.
// In my case it's just a world transform, so four Vector4s (one float4x4) is enough
VertexDeclaration _instanceVertexDeclaration = new VertexDeclaration
(
    new VertexElement(0,  VertexElementFormat.Vector4, VertexElementUsage.TextureCoordinate, 0),
    new VertexElement(16, VertexElementFormat.Vector4, VertexElementUsage.TextureCoordinate, 1),
    new VertexElement(32, VertexElementFormat.Vector4, VertexElementUsage.TextureCoordinate, 2),
    new VertexElement(48, VertexElementFormat.Vector4, VertexElementUsage.TextureCoordinate, 3)
);

// Copy the per-instance data (in this example, just an array of matrices)
// into the additional vertex buffer
_instanceVertexBuffer.SetData(_instanceTransforms, 0, _subMeshes.Count, SetDataOptions.Discard);

// Bind both the mesh vertex buffer and our per-instance buffer
GraphicsDevice.SetVertexBuffers(
    new VertexBufferBinding(meshPart.VertexBuffer, meshPart.VertexOffset, 0),
    new VertexBufferBinding(_instanceVertexBuffer, 0, 1));

// Use the instanced Draw* method, passing the instance count as the last argument
GraphicsDevice.DrawInstancedPrimitives(
    PrimitiveType.TriangleList, 0, 0,
    meshPart.NumVertices, meshPart.StartIndex,
    meshPart.PrimitiveCount, _subMeshes.Count);

In my renderer, all you need to do is call "mesh.SetInstancingEnabled(true)", and the code will take care of grouping the visible meshes (I named these instancing groups) according to their sub-meshes. The instancing technique is used in all three stages: shadow generation, rendering to the GBuffer and the reconstruct-lighting stage. The main shader was changed because with instancing we get the world transform (and any other per-instance data) from a vertex input rather than from the usual shader parameter.

Instancing offers a huge improvement in speed, and you can enable/disable the instancing in the code to check the difference. By the way, the code is here. Use it at your own risk!

That is it, see you next time!


Posted in XNA | 27 Comments

XNA Light Pre-Pass: ambient light, SSAO and more

Hi folks,

I’ve added a few improvements to the LPP renderer, to make it shine: ambient lighting using cubemaps, SSAO, an example of a dual-layer shader and some other small changes. Here is a screenshot:

As usual, the full source code is available here. Use it at your own risk!

Ambient Light

Ambient light (or indirect light) is light that doesn't necessarily come from a single, direct light source: it's the light rays that bounce off the objects in the scene and end up filling the darkness with subtle lighting. It prevents the rendering from going black where there is no direct light. In real life, you can see it everywhere: even under the shadow of a tree, you can still see yourself, because light doesn't come only directly from the sun: the clouds, the building walls, the floor, even the tree leaves are reflecting light at you. There are many ways to achieve this effect, from the simplistic approach of using a constant value across all objects, to more elaborate solutions like real-time global illumination.

In this sample I'm presenting two versions:

Constant ambient

This is the easiest way to get ambient lighting: just add a constant term to the lighting equation. In LPP, the final pixel color would be something like this:

Color = diffuse_texture * ( diffuse_lighting + constant_ambient) + specular_texture * specular_light;

As you can see, the scene looks "flat": the lighting is constant across the whole scene.

Cubemap Lighting

As we experience in real life, bounced light is not constant in all directions. We have two options: either we invent the most anticipated algorithm ever, the one that creates the perfect global illumination solution for real-time games, or we hack it. I'll go with the second.

So the question is: how do we store information about the lighting that surrounds an object? We could use spherical harmonics, paraboloid mapping or a cubemap (or lightmaps, light grids, etc). I chose cubemaps for a few reasons: they are easy to visualize, generate, load and bind to a material.

You can check this tutorial on how it works, but the basics are: a cubemap is used to store the lighting coming from all directions. It can be seen as a box surrounding the object, where brighter areas mean more light from that direction. Ambient light is low-frequency data: to generate it, we first need a cubemap of the original scene (your skybox is a good start) and then convolve (blur) it. This way we get rid of all the detail (high frequencies) and keep only what matters. You will have some blue nuances where the sky used to be, some orange tones where the Sun tints the horizon, and so on. You can have multiple cubemaps in your scene, each best representing its section of the world: just capture a cubemap from a given point of view, use a tool to process it, and at run time choose the appropriate cubemap to bind to the mesh. I use and recommend this ATI tool.
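To make the lookup concrete, this is roughly what happens under the hood when the shader samples a cubemap with a normal (a sketch of the face-selection rule only; names are mine, not code from the renderer): the dominant axis of the direction picks the face.

```c
#include <math.h>

enum face { POS_X, NEG_X, POS_Y, NEG_Y, POS_Z, NEG_Z };

/* Which cubemap face a lookup direction (x, y, z) maps to:
   the axis with the largest absolute value wins, and its sign
   selects the positive or negative face. */
enum face cube_face(float x, float y, float z) {
    float ax = fabsf(x), ay = fabsf(y), az = fabsf(z);
    if (ax >= ay && ax >= az) return x >= 0.0f ? POS_X : NEG_X;
    if (ay >= az)             return y >= 0.0f ? POS_Y : NEG_Y;
    return z >= 0.0f ? POS_Z : NEG_Z;
}
```

So a vertex normal pointing mostly upward samples the +Y face of the convolved cubemap, which is exactly where the blurred "sky" color ends up.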

The shader now needs to fetch the correct texel of the cubemap, using either the vertex normal or the pixel normal (I'm using the vertex normal in this example), and add it to the lighting equation, which would look like this:

ambient = texCUBE(ambient_cubemap, vertex_normal);
Color = diffuse_texture * ( diffuse_lighting + ambient) + specular_texture * specular_light;

As you can see, there are different shades on the character: his face receives more light than his back. That's because the skybox has a strong yellow Sun right in front of the character, and less intense tones behind him. Some bluish tones can be noticed on his head too. The shader LPPMainEffect.fx has some defines to control what kind of ambient light you want. The ambient cubemap is modulated by an ambient color, so you can tweak the values per mesh.


SSAO

Screen-space ambient occlusion: I bet you've heard about it, so I will skip the introductions. Here is the version without it. Notice that the character seems to float on the ground, since there are no direct shadows from his feet.

I tried a lot of different implementations, using depth only and depth + normals. I ended up with the latter, although I'm not happy with it: I'm pretty sure there is some good soul out there who can improve it and share the code back with me. I'm using a half-resolution depth buffer, and the SSAO texture is also half-res. I do some blur filtering, using the depth to avoid bleeding into the foreground; you may notice a thin silhouette around the SSAO sometimes, mostly due to the downsampled buffers. There are lots of parameters to tweak; maybe you can find a setup that works great. I'm applying the SSAO map (which is like a black-and-white opacity texture) over the whole composition: if you prefer, you could use it to modulate only the ambient term, but I'm comfortable with the results I got.
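As a toy illustration of the core idea (1D and depth-only, with hypothetical names; the actual implementation works in 2D with depth + normals): a pixel gets darkened in proportion to how many of its neighbors are closer to the camera than it is.

```c
#define AO_SAMPLES 4

/* depth: one scanline of the (downsampled) depth buffer; x: current pixel;
   bias: tolerance to avoid self-occlusion on flat surfaces.
   Returns 1 for fully open, 0 for fully occluded. */
float ssao_1d(const float *depth, int width, int x, float bias) {
    static const int offsets[AO_SAMPLES] = { -2, -1, 1, 2 };
    float occlusion = 0.0f;
    for (int i = 0; i < AO_SAMPLES; i++) {
        int sx = x + offsets[i];
        if (sx < 0 || sx >= width) continue;
        /* a neighbor closer than us (minus the bias) occludes this pixel */
        if (depth[sx] < depth[x] - bias) occlusion += 1.0f / AO_SAMPLES;
    }
    return 1.0f - occlusion;
}
```

Real SSAO samples a kernel of offsets around the pixel and weighs each comparison by distance (and, in the depth + normals flavor, by the angle between the normals), but the occlusion test is the same in spirit.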

The shader uses ideas and snippets from lots of different samples, so if you see some of your (or someone else) code being used, give me a shout and I’ll credit you (or remove the source).

The SSAO creates some contact shadows where the feet are close to the ground, and his arms also project some shadows onto his chest.

Dedicated Specular Buffer

The Xbox port of XNA doesn't provide an RGBA64 texture format. That means that if we use the HdrBlendable format, we have only 2 bits for specular light (I used to store the specular lighting in the alpha channel). This is obviously not enough, so now, at the lighting stage, I render to two separate lighting buffers: a diffuse one and a specular one. Another advantage is that we can have proper specular colors. It didn't show up as a performance issue on Xbox, but I'd rather use RGBA64 if it were available (maybe in the next XNA release?).

In the reconstruct shading stage, we need to remember to fetch both diffuse and specular light buffers, and use them accordingly.


Dual-Layer Shader

I've implemented a kind of multi-material shader, where you transition from one material to another according to the vertex color. In this case, I also use a pattern in the alpha channel of the main diffuse texture to mask/unmask the second layer. This way we don't get the smooth (and sometimes unnatural) blending of default weighting, but a layer that reveals itself in an interesting fashion. Look at the image below: I've drawn some "scratches" into the alpha channel of the main diffuse (the tiled floor texture), so the second layer (the gravel) shows up first as small scratches on the surface, and where the vertex color gets more intense, it replaces the first layer. All the settings for this material are exposed in the 3ds Max shader that I also provide with the source; it's just a matter of enabling some checkboxes, selecting the textures and exporting the FBX.


I've added a RenderWorld structure where all the submeshes are placed. The renderer queries this structure using the camera frustum or a light volume, so it would be easy to replace it with a kd-tree, quadtree or any structure you like.

Please take some time to watch me on youtube singing some cover songs, and be plagued with my brazilian accent =)

Here is also the video for my DreamBuildPlay entry:

That is it. See you next time!


Posted in XNA | 22 Comments

XNA Light Pre-Pass: culling, blending and particles

After the shortest summer of my life, I'm back! Thanks to all who donated: I'm almost able to buy an Xbox 360 with the money from this blog!

This time I'm releasing the feature everybody was asking for: blending! I'm not doing any lighting on the blended objects, though (sorry!). But it's a good starting point: we can have explosions, sparkles, transparency and other effects without any lighting. The full source code + assets is here, use it at your own risk!!

I changed the code a lot, so I will divide the changes into three topics: culling (for both the main rendering and shadows), blending and particles.


Culling

Obviously every renderer needs some sort of culling. Although I'm using only frustum culling, I added a new step to the model pre-processor: the generation of metadata for each submesh. At compile time, I loop through all the submeshes of a given model and compute their local bounding boxes. I assign this information, plus some other properties like "cast shadows" and "render queue", to their "Tag" property (a modern "void pointer"). I need to do this because sometimes you have a single model with hundreds of submeshes, so using just one big volume around them all is not a good approach. I created a SubMesh class that holds this metadata, the transformed bounding box in global space, and the information for that submesh (effect, transform, modelMeshPart), so I can cull every submesh individually.
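The "transformed bounding box in global space" can be computed along these lines (a C sketch with hypothetical types; the real code uses XNA's BoundingBox and Matrix): transform the eight corners of the local box by the world matrix and take the min/max.

```c
#include <float.h>

typedef struct { float m[3][4]; } mat34_t;   /* rotation+scale | translation */
typedef struct { float min[3], max[3]; } aabb_t;

/* Recompute a submesh's global-space AABB from its local box and world
   transform. The result is conservative: it bounds the rotated box. */
aabb_t transform_aabb(const aabb_t *local, const mat34_t *world) {
    aabb_t out;
    for (int a = 0; a < 3; a++) { out.min[a] = FLT_MAX; out.max[a] = -FLT_MAX; }
    for (int c = 0; c < 8; c++) {            /* each corner of the local box */
        float p[3] = { (c & 1) ? local->max[0] : local->min[0],
                       (c & 2) ? local->max[1] : local->min[1],
                       (c & 4) ? local->max[2] : local->min[2] };
        for (int a = 0; a < 3; a++) {        /* world = M * p + t */
            float w = world->m[a][0] * p[0] + world->m[a][1] * p[1] +
                      world->m[a][2] * p[2] + world->m[a][3];
            if (w < out.min[a]) out.min[a] = w;
            if (w > out.max[a]) out.max[a] = w;
        }
    }
    return out;
}
```

This runs per submesh whenever its transform changes, and the resulting box is what gets tested against the camera frustum or light volume.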

You can also extend the processor to read some extra information, like the “cast shadows” and “render queue” above, as right now they are the same for the whole model.

For shadow rendering, I also use those bounding boxes to check whether they lie inside the light volume. For spot lights it's very easy, since I'm using a frustum as their volume. For directional lights, I ignore the near clip plane when the view has the same direction as the light, as the geometry can be behind the frustum.


Blending

Using blending (additive, alpha, whatever) sounds easy, and it is: we just need to do it at the right stage. In my case, that's right after reconstructing the lighting, when we have all the solid objects on screen (or almost all of them) and the z-buffer properly constructed.

I introduced the Render Queue concept into the renderer: a list of objects to be drawn at a specific stage. I have only three stages for now:

  • Default: for objects that need the full LPP pipeline, i.e. rendering to the GBuffer and reconstructing the lighting;
  • SkipGBuffer: for objects that don't need to be drawn to the GBuffer, like skyboxes, purely reflective models or other crazy things;
  • Blend: for objects that need to be drawn after all the opaque ones, like particle systems or transparent models.

In this example, that model would be drawn in the "Blend" stage, using a custom shader (you can also use a custom shader in the "SkipGBuffer" stage). That shader draws only the outline of the mesh, using fresnel math and additive blending. Have a look at the .fx file to see how it's done.


Particles

At first I thought about using some 3rd-party particle library, but I changed my mind because I don't want to tie myself (or anyone who downloads this code) to any library. So I took the excellent XML Particles sample from MSDN and modified it to fit my pipeline. I didn't convert the particles into submeshes or another generic mesh: I just store them as a list of visible emitters and render them after the "Blend" meshes. I compute an approximate local bounding box for every particle emitter at creation time (using velocity, lifetime and particle size), and then use it with the emitter's transform to generate a global bounding box for culling. I'm also sorting the emitters (not individual particles) back to front, to get a better composition. Note that the particles in the sample are very fill-rate intensive: they are huge, and worse, they are so transparent that we need lots of them on screen. Remember that even if a texture is fully transparent, it costs the same to render as if it were opaque.
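The emitter bound computation can be sketched like this (hypothetical names and a simplified symmetric reach; the actual sample has richer per-particle parameters): the farthest a particle can travel in its lifetime, padded by the particle size.

```c
typedef struct { float min[3], max[3]; } box_t;

/* Approximate an emitter's local AABB at creation time from its maximum
   particle velocity, particle lifetime and particle size. Assumes particles
   can travel in either direction along each axis (hence symmetric bounds). */
box_t emitter_bounds(const float max_velocity[3], float life_time,
                     float particle_size) {
    box_t b;
    for (int a = 0; a < 3; a++) {
        /* farthest a particle can get from the emitter, padded by its size */
        float reach = max_velocity[a] * life_time + particle_size;
        b.min[a] = -reach;
        b.max[a] =  reach;
    }
    return b;
}
```

Transforming this local box by the emitter's world transform each frame gives the global bounding box used for frustum culling.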

Bonus: other optimizations

The last thing I did was to cache lots of shader parameters, to avoid accessing the parameter map every frame. There is some work left to be done there, but I'll do it next. I also removed lots of per-frame memory allocations, so if you disable text rendering (which creates a lot of strings every frame), you will only see List.Sort() allocating memory (if you know how to fix that while using the same List class, please let me know!).

Next time I will add the Xbox version, and also the specular light buffer: on the Xbox, the light buffer is HdrBlendable, which means we have only 2 bits for specular (it's a shame we can't use RGBA64). I did this in my XNA engine, and I didn't see any performance issues.

See you next time!


Posted in XNA | 23 Comments

XNA Light Pre-Pass: Skinned Meshes

Skinned animation is one of the most common animation techniques today (if not the most common), so I couldn't leave it out of this series. You can find more about it here. The main idea is to store a skeleton (where the number of bones is far lower than the number of vertices) and a mesh where every vertex/normal/tangent/etc is associated with one or more bones (usually fewer than four). When the skeleton animates (using pre-computed keyframes and forward kinematics, or procedural animation like ragdolls, inverse kinematics, or even Kinect), the vertices "follow" their bones, and so your full mesh animates.

This feature integrates easily into my current LPP renderer: the only extra information we need is the array of bone matrices. I based the importer/processor heavily on the XNA Skinned Model sample; even the animated model is the "Dude" you can find in that sample. The full source code + assets is here, use it at your own risk!

I changed mainly two things on the importer:

  1. The key frames and the mesh can be scaled, so you can use the scale property in the importer options;
  2. Remember my ubershader approach described some posts ago? For skinned meshes, I add the define "SKINNED_MESH" (LightPrePassProcessor.cs). This way, the shader decides at compile time how to handle this special mesh type.

The main shader now handles that define, using a different vertex format and a special algorithm that computes the skinned vertex from the bone matrices + vertex information, like this:

struct VertexShaderInput
{
    float4 Position : POSITION0;
    float2 TexCoord : TEXCOORD0;
    float3 Normal : NORMAL0;
    float3 Binormal : BINORMAL0;
    float3 Tangent : TANGENT0;
    float4 BoneIndices : BLENDINDICES0;
    float4 BoneWeights : BLENDWEIGHT0;
};

// Blend between the weighted bone matrices.
float4x4 skinTransform = 0;
skinTransform += Bones[input.BoneIndices.x] * input.BoneWeights.x;
skinTransform += Bones[input.BoneIndices.y] * input.BoneWeights.y;
skinTransform += Bones[input.BoneIndices.z] * input.BoneWeights.z;
skinTransform += Bones[input.BoneIndices.w] * input.BoneWeights.w;

float4 skinPos = mul(input.Position, skinTransform);

I also added a new class, "SkinnedMesh", inherited from "Mesh", which also holds the bone matrices and binds them to the effect when needed. I kept the animation data separate from the mesh itself to make it easier to swap the animation player at some point (the animation player in the XNA sample is very simple: it has no animation blending/transitions/etc.).
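A rough sketch of how that class could look (the member names here are my guesses based on the description; check the actual source for the real API):

```csharp
// Hypothetical sketch: a skinned mesh that carries its bone palette.
public class SkinnedMesh : Mesh
{
    // Bone matrices, updated by the animation player each frame.
    public Matrix[] BoneMatrices;

    // Called before drawing: pushes the bone palette to the uber shader,
    // assuming the shader declares a "Bones" matrix array parameter.
    public void BindBones(Effect effect)
    {
        effect.Parameters["Bones"].SetValue(BoneMatrices);
    }
}
```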

I think that’s all, this sample is really short compared to the last one, the code is very straight forward to understand, I swear.

For the next entry, I'm thinking about transparency, the Xbox project + optimizations, or some post-processing FX. Feel free (and welcome!!) to add suggestions, comments, complaints, fixes, etc.

See ya!


Posted in XNA | 27 Comments

XNA Light Pre-Pass: Cascade Shadow Maps

Hi folks!

After a long time without opening Visual Studio (I deserved a little vacation after DBP), I'm back! I've submitted my game (far from what I expected it to be); here are some screenshots:

I would like to thank Virginia Broering, Rudi Bravo, Rafael Moraes, Felipe Frango, Rodrigo Cox, Marina Benites, Fabiano Lima, and especially Justin Kwok, who provided me his old Xbox to test the game.

In this post I will talk about my implementation of Cascade Shadow Maps, a common technique for handling directional-light shadows. Some good descriptions can be found here and here. As before, the full source code for this sample can be found here: use it at your own risk!!

The basic idea is to split the view frustum into smaller volumes (usually smaller frustums, or frusta), and to generate a shadow map for each volume. This way we get a better distribution of our precious shadow-map texels in each region: the volume closest to the camera is smaller than the farthest one, so we have more shadow resolution close to the camera. The drawback is more draw calls/state changes, since we render to more than one shadow map. We can use shadow-map atlasing to optimize things: we just need to create one big shadow map that fits all our smaller shadow maps and offset the texel fetch in the pixel shader.

You should measure/change/measure your application to decide the best number of splits (or cascades) and the resolution of the shadow maps. I'm using 3 cascades, at 1024×1024 each on PC and 640×640 on Xbox. Also, it's a good idea to limit it to the main/dominant/call_it_as_you_like directional light, since it's a performance-hungry feature.
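To make the quadratic distribution mentioned in the steps below concrete, here is a sketch of how the split distances can be computed (this is my own illustration, not the exact code from the sample): cascade i ends at near + (far − near)·(i/N)², so the near cascades stay much tighter than the far ones.

```csharp
// Sketch: quadratic cascade split distances (names are mine, not the sample's).
public static float[] ComputeSplits(float near, float far, int cascadeCount)
{
    float[] splits = new float[cascadeCount + 1];
    for (int i = 0; i <= cascadeCount; i++)
    {
        float t = (float)i / cascadeCount;        // 0..1
        splits[i] = near + (far - near) * t * t;  // quadratic: tighter near splits
    }
    return splits;
}
```

With near = 1, far = 100 and 3 cascades this gives roughly {1, 12, 45, 100}, versus {1, 34, 67, 100} for a linear distribution.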

The steps needed to have it working are:

  • At initialization, create a big render target that fits all 3 shadow maps (if you want to change the sample to use 2 or 4 or whatever, go ahead). On PC, that gives a 3072×1024 texture. The texture's format should be SINGLE (floating point), with a depth buffer attached, and as before, we use the DISCARD_CONTENTS flag. We output the depth to that texture, in LINEAR space.
  • For each frame:
    • Bind the render target and clear it to white, i.e. the farthest value we can have;
    • Compute the range of each sub-frustum. You could use a linear distribution, but it won't give you the best resolution distribution. I'm using a quadratic distribution, so the first frustum is way smaller than the last one;
    • For each cascade:
      • Compute an orthographic camera that fits the sub-frustum and points in the light's direction. In my sample I'm using the "Fit to scene" technique described in the links above, where each new sub-frustum overlaps the previous ones. This way we can use some tricks to avoid shadow jittering when the camera moves. I didn't like the results I got, since we lose a lot of resolution with that trick, so I left a boolean to toggle it on/off. Compute the viewProjection matrix for this camera;
      • Compute the correct viewport for this sub-frustum;
      • Draw each mesh that is inside this sub-frustum, using the same technique we already use for spot light shadows, but with this new viewProjection matrix.
    • When rendering the directional light that has this cascade shadow map, choose the correct technique and send the parameters to the shader: the shadow map, the view-projection matrix for each sub-frustum and also its range. I put the ranges into a single Vector3, as I have 3 cascades.
    • The shader first computes the pixel position in view space (we need that for lighting anyway), and then uses its Z to pick the correct shadow map. I'm using just a single line to do that; take a look at the shader LightingLPP.fx;
    • Convert the pixel position from camera space to world space, and then to light space. Fetch the correct shadow-map texel (remember that we are using one big render target holding all the cascades, so we need to offset the coordinates according to the selected cascade). Do the depth comparison; I'm using 4-sample PCF to reduce aliasing.
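The cascade selection and atlas offset from the last two steps can be sketched in HLSL like this (parameter names are my own, and this single-tap version skips the 4-sample PCF; the real code is in LightingLPP.fx):

```hlsl
// Sketch: pick a cascade from the view-space depth, then offset into the atlas.
float3 CascadeDistances;          // far range of each of the 3 cascades
float4x4 LightViewProj[3];        // light view-projection, one per cascade

float ShadowTerm(float viewDepth, float3 worldPos)
{
    // Pick the first cascade whose range contains this pixel.
    int cascade = 0;
    if (viewDepth > CascadeDistances.x) cascade = 1;
    if (viewDepth > CascadeDistances.y) cascade = 2;

    // Project into light space and build shadow-map UVs.
    float4 lightPos = mul(float4(worldPos, 1), LightViewProj[cascade]);
    float2 uv = lightPos.xy / lightPos.w * float2(0.5f, -0.5f) + 0.5f;

    // Atlas offset: 3 maps packed side by side (e.g. 3072x1024 on PC).
    uv.x = (uv.x + cascade) / 3.0f;

    // Depth comparison against the stored light-space depth (small bias
    // against acne; the sample averages 4 such taps as PCF).
    float stored = tex2D(shadowSampler, uv).r;
    return (lightPos.z / lightPos.w <= stored + 0.001f) ? 1.0f : 0.0f;
}
```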

That's it! Here are some screenshots of this sample:

See you next time!

Posted in XNA | 8 Comments

XNA 4.0 Light Pre-Pass: Alpha masking

One important feature of any renderer implementation is alpha-masking support. Vegetation, chains and wired fences would be a nightmare to model and also a candidate to become a triangle-hungry monster. The idea of alpha masking is to decide whether a given pixel should be rendered using the alpha channel of a texture (we just need one channel, so we can store it in the diffuse's alpha). If the value is bigger than a threshold, we draw it; otherwise it is skipped.

With the introduction of fully shader-based pipelines, even this basic behaviour must be implemented in the pixel shader. In HLSL we use the clip(value) function, which discards the pixel if value is negative. Note that the computation for that pixel is still performed; its result is just not written to the render target (or backbuffer). To effectively skip the processing, we could use dynamic branching and the [branch] attribute on Xbox (I will not go into details here).

In the pixel shader, all we need to do is clip(diffuse.a - alphaReference), so values less than alphaReference evaluate to a negative result, discarding the pixel. We can play with the alphaReference value at run-time to make objects appear/disappear (useful for spawning effects).

Now the cool section: how to integrate it into my light pre pass pipeline (if you don’t know what I’m talking about, take a look at my old posts).  As usual, you can get the full source code here. Use it at your own risk!!

First, we need to do the alpha masking in 3 different stages:

  • when rendering to the G-Buffer;
  • when reconstructing the light;
  • when drawing to a shadow map.

Thinking ahead, we may need to mix alpha masking with fresnel/reflection/skinned meshes/multi-layer materials/etc., so it's better to adopt a solution now that prevents something like "shader_fresnel_alpha_skinned_dual_layer.fx". I introduce you…uber shaders!!

An uber shader is just a big shader that implements lots of behaviours (fresnel/reflection/etc.), and the application decides which path to follow. I will use pre-processor parameters (#define/#ifdef) to construct the shader flow, since it's a compile-time-only process. I must confess I'm not a big fan of uber shaders, since sometimes the code gets messy, tricky to follow and not very human-readable, but for now I'm OK with it.
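The pattern looks roughly like this (a sketch: the FRESNEL path is purely illustrative, only ALPHA_MASKING exists in this sample):

```hlsl
// Sketch of the uber-shader pattern: one source file, many compile-time paths.
half4 PixelShaderGBuffer(PixelShaderInput input) : COLOR0
{
#ifdef ALPHA_MASKING
    // Only alpha-masked materials pay for this diffuse fetch.
    half4 diffuseMap = tex2D(diffuseMapSampler, input.TexCoord);
    clip(diffuseMap.a - AlphaReference);
#endif

#ifdef FRESNEL
    // Hypothetical future path: not implemented in this sample.
#endif

    // ...common normal/depth output follows, shared by every variant...
    half4 normalDepth = half4(0, 0, 0, 0);
    return normalDepth;
}
```

Each mesh is compiled with only the defines it needs, so the dead paths cost nothing at run-time.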

I’ve added the option to enable/disable alpha-masking and also the alphaReference value on 3DMax®, so we need to find a way to get that information and store into our processed mesh. To accomplish that, we need to make some changes on our Pipeline Processor (it took me a while to have it working properly, so accept this as a good gift :p ):

  • in the model processor (our LightPrePassProcessor.cs), we need to extract the alpha information from the original material and store a list of "defines" (for now, I'm handling only alpha masking, but the idea is to gather all kinds of information, like fresnel/reflection/etc.). After that, we put this list into the material's OpaqueData, like lppMaterial.OpaqueData.Add("Defines", listOfDefines);
  • we have to extend a material processor: I've created a class named LightPrePassMaterialProcessor to handle the "defines" we pushed in the first step and send them to the effect processor;
  • we also need to extend the EffectProcessor, a job for the LightPrePassFXProcessor class. It only reads the "defines" information stored in the context's parameters and copies it to its "Defines" property.
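A condensed sketch of that three-step plumbing (simplified from the description above; the real code lives in the three processor classes):

```csharp
// 1) Model processor: stash the defines in the material's opaque data.
List<string> listOfDefines = new List<string>();
if (alphaMasked)                       // assumed flag read from the material
    listOfDefines.Add("ALPHA_MASKING");
lppMaterial.OpaqueData.Add("Defines", listOfDefines);

// 2) Material processor: forward them to the effect processor via the
//    context parameters.
// 3) Effect processor: join them into the semicolon-separated string that
//    XNA's EffectProcessor.Defines property expects.
effectProcessor.Defines = string.Join(";", listOfDefines.ToArray());
```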

With these steps working, we can focus on the shader itself. All we need to do is put the alpha check inside an "#ifdef ALPHA_MASKING … #endif" region (ALPHA_MASKING is the key I chose for it; it's in LightPrePassProcessor.cs). Here is a small snippet, from the technique that renders to the GBuffer:

//read our diffuse
half4 diffuseMap = tex2D(diffuseMapSampler, input.TexCoord);
clip(diffuseMap.a - AlphaReference);

 Note that as we don’t need the diffuse for the rest of this technique (remember we just output normals/depth on this technique), we can put the texture fetch inside the alpha mask region. We need also to support backface lighting, since almost anything that uses alpha-masking is not a closed mesh. To do that, we need to use the VFACE semantics (available only on SM3+) if we detect that macro, like this:

struct PixelShaderInput
{
    float4 Position : POSITION0;
    float3 TexCoord : TEXCOORD0;
    float Depth : TEXCOORD1;
    float3 Normal : TEXCOORD2;
    float3 Tangent : TEXCOORD3;
    float3 Binormal : TEXCOORD4;
#ifdef ALPHA_MASKING
    float Face : VFACE;
#endif
};

At this point you’ve probably got the idea of uber shaders (this is just the beginning, though). We need to extend it to the shadow map generation, including texture coordinates to the vertex input/output and performing the clip() inside the pixel shader. Remember also to set the culling to none on the technique declaration.

The trees on this sample were generated by Tree[d], an awesome free tool to generate trees.

I would like to thank the guys who donated some money. It's not about the money itself: to get to the point of donating anything, someone has read my blog, downloaded the code, run it, enjoyed it, returned to the blog, and clicked the button to make a donation. This means I'm doing a good job of sharing the knowledge, and it motivates me to continue this series of samples.

Thanks guys, see you!


Posted in XNA | 28 Comments

XNA 4.0 Light Pre-Pass: casting shadows

It’s been a long time since my last post, but I didn’t give up on this blog. I was working really hard on a project aimed at DreamBuildPlay, but things didn’t go as I expected. It’s hard to keep everyone as motivated as you, and frustrating to see all your weeks of work being used in teapots and boxes (well, at least I have Crytek’s Sponza model).

Here is a screenshot of the editor I've been working on, with the game running inside its viewport:

It features cascade shadow maps for directional lights, spot lights (with shadows too), full HDR rendering + post-processing (bloom, tone mapping and screen-space lens flare), a gamma-corrected pipeline and SSAO, running at 60fps on Xbox 360 (that's a little vague, since it depends on the overdraw, simultaneous shadow-casting lights, etc., but it works very well).

Ok, let’s move on. On this post I will talk (and release the code as usual) about shadows, specifically for spot lights (the easiest). I’m using plain old and good shadow mapping, which basically consists in rendering the scene using the light point of view, storing the depth of each fragment on a texture (shadow map). Then, at the lighting stage, we recompute each pixel to be lit in that same light space and compare the Z of this pixel with the Z stored on the shadow map, lighting or shadowing that pixel. The full source code is here, use at your own risk.

Since I don’t want to use the PRESERVE_CONTENTS flag on any render target I use, I have to generate all shadow textures before the lighting stage begins: we cannot switch render targets from “shadow 0 -> lightAccumulation->shadow 1->lightAccumulation->etc”, otherwise we would lose its contents. The solution I’m using is:

  • at the initialization stage, create render targets for the maximum number of simultaneous lights you want to allow to cast shadows in a single frame (you can create them at different resolutions);
  • at the beginning of the rendering stage, determine the visible lights;
  • sort these lights using any heuristic; I chose something like the light's radius divided by its distance to the camera;
  • select the highest-rated lights as shadow casters, generate the shadow maps and assign each light its shadow map + light view-projection matrix (we could use the heuristic to select the shadow-map resolution);
  • render meshes to the GBuffer as usual;
  • render the lights to the accumulation buffer, using the shadow information generated before.
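The sorting step can be sketched like this (the Light members and variable names are assumptions based on the description, not the sample's exact API):

```csharp
// Sketch: rate visible lights so the most prominent ones cast shadows.
// Radius and Position are assumed members of the sample's Light class.
visibleLights.Sort((a, b) =>
{
    float scoreA = a.Radius / Vector3.Distance(a.Position, cameraPosition);
    float scoreB = b.Radius / Vector3.Distance(b.Position, cameraPosition);
    return scoreB.CompareTo(scoreA); // descending: best candidates first
});

// The first maxShadowCasters lights get a shadow map this frame.
```

Big lights close to the camera score high; small or distant lights fall off the list and are rendered without shadows.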

Remember that when rendering a shadow map we should not reuse the meshes culled by our main camera: for spot lights, we can compute a frustum using the spot cone angle, an aspect ratio of 1.0f, and the light's transform. Compare this frustum against all the world's meshes (or use any partitioning structure you like) and pick only the meshes that intersect it. The code for constructing that frustum is in Light.cs, in the method "UpdateSpotValues()".
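In XNA this boils down to building a BoundingFrustum from a perspective projection that matches the cone (a sketch with my own variable names; the sample's actual code is in Light.UpdateSpotValues()):

```csharp
// Sketch: cull shadow casters with a frustum built from the spot light.
// spotConeAngle is the full cone angle in radians; lightView is the inverse
// of the light's world transform; near plane of 0.1f is an assumption.
Matrix lightProjection = Matrix.CreatePerspectiveFieldOfView(
    spotConeAngle, 1.0f, 0.1f, light.Radius);
BoundingFrustum lightFrustum = new BoundingFrustum(lightView * lightProjection);

foreach (Mesh mesh in world.Meshes)
{
    // Only meshes intersecting the cone are drawn into the shadow map.
    if (lightFrustum.Intersects(mesh.GlobalBoundingBox))
        shadowCasters.Add(mesh);
}
```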

I’ve added another technique to the main shader (LPPMainEffect.fx) , that outputs the shadow for that model: I already had the technique to write to GBuffer and the one to reconstruct the lighting. This way makes easier to use some ubber shader tricks to allow alpha-masked geometry or skinned meshes, since we can use #defines to change the behavior of the three stages accordingly.

The result is here:

Soon I will port the optimizations I did in my engine: reconstruction of the Z-buffer and stencil tricks for optimized per-pixel lighting. I can also put up the Xbox version (it took me 15 minutes to fix the compilation problems, but I lost that version somewhere), although it has some useless per-frame allocations (aka iterators).

I hope you enjoy it. See ya!


Posted in XNA | 39 Comments