XNA Parallel-split shadow maps

Parallel-split shadow maps are here. Had some struggles with getting it to work today and yesterday. It made the Directional Lighting class huge, but I will factor it out later. As with the previous shadow mapping scheme, it uses the depth data of the G-Buffer so no geometry is re-rendered for the shadow map projection phase.

This method of taking advantage of the G-Buffer in deferred rendering is, perhaps surprisingly, called forward shadow mapping. Each pixel’s position from the camera-view buffer is transformed with the light’s view and projection matrices, and its depth is then compared against the depth stored in the buffer rendered from the light’s view. The resulting shadow term gets multiplied by the light term, and finally by the diffuse color. I decided to skip blurring the shadow map, but that can be done in an extra pass if needed.

Forward Shadowing

(By the way, in the above webpage, the links to the blurred images are incorrect. Add “blur” before the extension to view the blurred examples in full size, i.e. “main512blur.jpg”.)

We still need to render the scene at different distances, once for each frustum split. I am using 1024×1024 render targets. That took a toll on the busy Sponza scene 😦 It went from 65 fps down to 45. Sparser scenes are still plenty fast, though: the scene in this video usually runs at over 100 fps without a screen recorder on.

At first I decided to split up the shadow renders into several passes. Not actual effect passes, but repeating the same rendering technique several times. For each pass, the shader would take in different parameters for the light’s view matrix, split distances, and corresponding depth map. Initially this rendered all shadow maps at the same starting depth (the near distance). I noticed an overlapping effect in the lighting, because the closest split was very bright and the farther ones were darker.

This was a side effect of the light buffer accumulating color values, so no, this won’t work. The shadow renders need to be split by distance. A basic depth conversion formula can convert the depth map value to linear view space. This is the one found in this depth of field tutorial:

float linearZ = (-camNear * camFar) / (depthVal - camFar);

Also, since camFar is always going to be 1, we can just drop the multiplication for the numerator.
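The conversion above can be wrapped in a small helper. This is only a sketch: the function name and the idea of passing camNear and camFar as shader constants are my assumptions, and the tiny guard on the denominator is an addition that is not in the original formula (the denominator reaches zero when depthVal equals camFar).

```hlsl
// Assumed shader constants, set from the camera's near and far distances.
float camNear;
float camFar;

// Hypothetical helper: converts a stored depth value to linear view-space depth.
// Algebraically the same as (-camNear * camFar) / (depthVal - camFar),
// with the sign moved into the denominator.
float LinearizeDepth(float depthVal)
{
    // Guard (an addition): avoid dividing by zero at the far plane.
    float denom = max(camFar - depthVal, 1e-6f);
    return (camNear * camFar) / denom;
}
```

With camFar fixed at 1, this reduces to camNear / (1 - depthVal), which is the simplification mentioned above.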

Eventually I was able to split the distances well, but the shader still wasn’t clipping out-of-bounds pixels very well. It also had a strange rendering bug where the closer maps were showing dim shadows over the farther ones and clamping at odd angles. This was more apparent when the camera was facing directly opposite the light’s direction.

Finally I just bit the bullet and put all of the depth map rendering in a single pass. This fixed everything as we now don’t have brightness accumulation over the split regions and the view matrices are perfectly lined up. The shader is still disorganized however, with more branching in some places, more constants being loaded, as well as throwing in all four of the depth map textures to sample in.

Ideally I would like to have reduced the need to select render textures or texel offsets by doing multiple light view matrix transformations at once. One time I was thinking, “it would be great if you could output multiple positions at once on the vertex shader, just as you can with render targets on the pixel shader”. Then I quickly realized that’s what the geometry shader does. Derp. Too bad it’s not available for XNA use. MJP says it brings crappy performance anyways.

So here’s the start of my somewhat odd parallel-split shadowing function. There are several different ways to get the trick done, and this is how I managed it.

float shadow = 1.0f;
if (shadowing >= 1)
{
    float shadowIndex;
    if (linearZ > cascadeSplits.z)
    {
        shadowIndex = 3;
    }
    else if (linearZ > cascadeSplits.y)
    {
        shadowIndex = 2;
    }
    else if (linearZ > cascadeSplits.x)
    {
        shadowIndex = 1;
    }
    else
    {
        shadowIndex = 0;
    }

    float4 shadowMapPos = mul(position, lightViewProj[shadowIndex]);
    float2 shadowTexCoord = shadowMapPos.xy / shadowMapPos.w / 2.0f + float2( 0.5, 0.5 );
    shadowTexCoord.y = 1 - shadowTexCoord.y;

    float shadowDepth = 0;
    float occluderDepth = (shadowMapPos.z / shadowMapPos.w) - DepthBias;

    if (linearZ < cascadeSplits.x)
    {
        shadowDepth = tex2D(shadowMapSampler[0], shadowTexCoord).r;
        shadow = LinearFilter4Samples(shadowMapSampler[0], 0.3f, shadowTexCoord, occluderDepth);
    }
    else if (linearZ < cascadeSplits.y)
    {
        shadowDepth = tex2D(shadowMapSampler[1], shadowTexCoord).r;
        shadow = LinearFilter4Samples(shadowMapSampler[1], 0.3f, shadowTexCoord, occluderDepth);
    }
    else if (linearZ < cascadeSplits.z)
    {
        shadowDepth = tex2D(shadowMapSampler[2], shadowTexCoord).r;
        shadow = LinearFilter4Samples(shadowMapSampler[2], 0.3f, shadowTexCoord, occluderDepth);
    }
    else
    {
        shadowDepth = tex2D(shadowMapSampler[3], shadowTexCoord).r;
        shadow = LinearFilter4Samples(shadowMapSampler[3], 0.3f, shadowTexCoord, occluderDepth);
    }
}

This code resides in the same function used to calculate directional lighting, and each light can be set to cast shadows or not. I recommend using a very low number of lights as the depth map rendering makes this process expensive quickly. Besides, unless you have some weird sci-fi setting with several suns, it just looks plain wrong when you have many directional shadows going on.

As you can probably tell, I am using four different depth maps and four parallel splits for the whole render. There’s some branching involved, unfortunately, as I can’t pass anything but literals to sampler array indexes. However I was able to replace another if-else statement with just adding up booleans as numbers to get the index for the light view matrix. This code:

    float shadowIndex;
    if (linearZ > cascadeSplits.z)
    {
        shadowIndex = 3;
    }
    else if (linearZ > cascadeSplits.y)
    {
        shadowIndex = 2;
    }
    else if (linearZ > cascadeSplits.x)
    {
        shadowIndex = 1;
    }
    else
    {
        shadowIndex = 0;
    }

was condensed into the expression you see near the top of the final example:

    float shadowIndex = 3 -
        ((linearZ < cascadeSplits.x) + (linearZ < cascadeSplits.y) +
        (linearZ < cascadeSplits.z));

Edit: Turns out that it IS possible to do texture sampling with variable indexes. Just use tex2Dgrad instead of tex2D to use the samplers, and the program will happily compile the code with variables passed for the shadowMapSampler array. We don’t need texture-coordinate derivatives for a shadow map lookup, so the last two parameters (ddx and ddy) are set to zero.

This gets rid of all the if-else statements, and now the entire shadow lookup code is shorter and looks much better. The code is now almost 1/3 its original size and there are fewer comparisons to do.

float shadow = 1.0f;
if (shadowing >= 1)
{
    float shadowIndex = 3 -
        ((linearZ < cascadeSplits.x) + (linearZ < cascadeSplits.y) +
        (linearZ < cascadeSplits.z));

    float4 shadowMapPos = mul(position, lightViewProj[shadowIndex]);
    float2 shadowTexCoord =
        shadowMapPos.xy / shadowMapPos.w / 2.0f + float2( 0.5, 0.5 );
    shadowTexCoord.y = 1 - shadowTexCoord.y;

    float shadowDepth = 0;
    float occluderDepth = (shadowMapPos.z / shadowMapPos.w) - DepthBias;

    shadowDepth = tex2Dgrad(shadowMapSampler[shadowIndex], shadowTexCoord, 0, 0).r;
    shadow = LinearFilter4Samples(shadowMapSampler[shadowIndex], 0.3f,
        shadowTexCoord, occluderDepth);
}

Basically, I grouped the far distances of the first three splits into a Vector3, and then compare them to the linear depth output for that pixel. X is closest, and Z is the farthest. For four different splits, 3 will be the maximum index. It follows that if linearZ is closer than the first split, the same is also true for the second and the third, so we start by adding up the total true statements and then subtracting that total from the maximum. If all statements are false, then the last split and light view matrix will be used, so the index stays at 3.
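To make the counting trick described above concrete, here is a minimal sketch of it as a standalone function. The function name and the example split distances in the comments are hypothetical, chosen only for illustration. The explicit int casts are there for readability.

```hlsl
// Hypothetical helper: picks the split index (0..3) for a pixel.
// cascadeSplits holds the far distances of the first three splits,
// with x the closest and z the farthest.
int SelectSplit(float linearZ, float3 cascadeSplits)
{
    // Count how many split distances this pixel is closer than (0 to 3).
    int closerThan = (int)(linearZ < cascadeSplits.x)
                   + (int)(linearZ < cascadeSplits.y)
                   + (int)(linearZ < cascadeSplits.z);

    // Subtract the count from the maximum index.
    return 3 - closerThan;
}

// Worked values, assuming cascadeSplits = float3(10, 30, 80):
//   linearZ =   5 -> 3 comparisons true -> index 0
//   linearZ =  20 -> 2 comparisons true -> index 1
//   linearZ =  50 -> 1 comparison true  -> index 2
//   linearZ = 200 -> 0 comparisons true -> index 3
```

Note that the sum has to be subtracted from 3 as a whole, which is why the count is computed first and subtracted afterwards.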

Everything else is mostly standard shadow mapping work. In the original version, a simpler-to-read branching statement follows that determines which depth map to compare against and sample from. But if there’s a way to clean it up some more, I’d like to know. LinearFilter4Samples is an adaptation of the manual linear filtering function available here. It is necessary for filtering these shadow maps since they use the Single (32-bit float) format, and thus can only be interpolated after the shadow comparison has been made. “0.3” is just a way to attenuate shadow darkness so the shadows don’t appear completely dark.
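For readers who have not seen manual linear filtering before, the idea described above can be sketched roughly as follows. This is not the original filtering function, only a guess at its general shape: the name FilterShadow4Samples, the ShadowMapSize constant, and the way the darkness value is applied are all my assumptions. It expects the shadow map sampler to use point filtering.

```hlsl
// Assumed constant: shadow map resolution in texels (1024 in this post).
static const float ShadowMapSize = 1024.0f;

// Hypothetical 4-sample filter: compare first, then interpolate the results.
float FilterShadow4Samples(sampler2D shadowMap, float darkness,
                           float2 texCoord, float occluderDepth)
{
    float texel = 1.0f / ShadowMapSize;

    // Locate the 2x2 block of texels surrounding the sample point.
    float2 texelPos = texCoord * ShadowMapSize - 0.5f;
    float2 weights = frac(texelPos);                    // bilinear weights
    float2 base = (floor(texelPos) + 0.5f) * texel;     // center of top-left texel

    // Depth comparison per texel: shadowed texels get the darkness value
    // (an assumption about how 0.3 is used), lit texels get 1.
    float s00 = (tex2Dgrad(shadowMap, base, 0, 0).r
                 < occluderDepth) ? darkness : 1.0f;
    float s10 = (tex2Dgrad(shadowMap, base + float2(texel, 0), 0, 0).r
                 < occluderDepth) ? darkness : 1.0f;
    float s01 = (tex2Dgrad(shadowMap, base + float2(0, texel), 0, 0).r
                 < occluderDepth) ? darkness : 1.0f;
    float s11 = (tex2Dgrad(shadowMap, base + float2(texel, texel), 0, 0).r
                 < occluderDepth) ? darkness : 1.0f;

    // Interpolate the comparison results, not the raw depths.
    return lerp(lerp(s00, s10, weights.x),
                lerp(s01, s11, weights.x), weights.y);
}
```

The key design point is the order of operations: interpolating raw depth values and then comparing would give hard, incorrect edges, while comparing each texel first and blending the four results gives a soft transition.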

From here on the resulting pixel color just gets multiplied with the brightness output of the directional light that casts the shadow. I don’t know whether it’s more accurate to replace brightness with the shadow coefficient instead of multiplying brightness with shadow, but it still looks fine either way. So there you have it: directional lighting with the G-Buffer and shadow mapping in one fell swoop.
