Rendering Fields of Grass

using DirectX 11 in GRID Autosport

Richard Kettlewell
Codemasters
MOTIVATION

 Current implementation engineered for PS3/Xbox 360


MOTIVATION

 High-end PC can do much better


 DirectX 11
 Compute Shaders

 Lots of interesting techniques online


 Outerra (http://goo.gl/tYlcjN)
 Nvidia (http://goo.gl/F43iTY)
GOALS

 High density
 Keep all data on GPU, for efficiency

 Get rid of polygonal look of terrain


 Flat polys with grass textures are unconvincing

 Interaction
 Wind
 Deformation
OUR APPROACH

 Generate
 Populate Append Buffer with blades of grass

 Render
 Read Append Buffer
 Construct geometry in Vertex Shader
 Rasterise using Alpha-To-Coverage
 No sorting required
ART PROCESS

 Simple world space map


 RGB defines grass colour
 Alpha defines grass height
 2K x 2K

 Wastes resolution
 Simplest approach given time constraints
 UV mapping onto terrain would be better
 Doesn’t scale well for large point-to-point tracks
GENERATING GRASS

 Render Terrain using custom shader


 Orthographic top-down render, centred around viewer
 Output to Append Buffer, not Render Target
 Every pixel could be a blade of grass
 Debug mode outputs to render target, for visualisation
GENERATING GRASS

 Every pixel could be a blade of grass


 Control density using viewport size
 Spreads the pixels over more/less distance

 Need to cull unimportant blades


 Set Scissor Rectangle around view segment

 Frustum cull against main scene camera

 Read world space map (discard if alpha < threshold)

 Scissor Rectangle
 Create bounding box from circle segment
 View position

 2 extents

 Any axis intersection

 Extra points around viewer


 Fixes problem when looking down
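
The scissor rectangle steps on this slide can be sketched on the CPU as follows. This is a minimal illustration, not the game's code: it assumes the view segment lies on the X/Z ground plane, that `heading` and `halfFov` are given in radians, and that the "extra points around viewer" amount to a fixed padding. The names `Rect`, `segmentBounds` and `viewerPad` are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

constexpr float kPi = 3.14159265358979f;

// Axis-aligned rectangle on the ground plane (X/Z), in world units.
struct Rect { float minX, minZ, maxX, maxZ; };

// Grow r so that it contains the point (x, z).
inline void extend(Rect& r, float x, float z) {
    r.minX = std::min(r.minX, x);
    r.maxX = std::max(r.maxX, x);
    r.minZ = std::min(r.minZ, z);
    r.maxZ = std::max(r.maxZ, z);
}

// Bounding box of a circle segment built from the view position, the two
// arc extents, any axis direction inside the segment, and padding around
// the viewer (viewerPad is an assumed, hypothetical parameter).
inline Rect segmentBounds(float viewX, float viewZ, float heading,
                          float halfFov, float radius, float viewerPad) {
    assert(radius >= 0.0f && halfFov >= 0.0f && viewerPad >= 0.0f);
    Rect r{viewX, viewZ, viewX, viewZ};

    // The two extents of the arc.
    const float edges[2] = {heading - halfFov, heading + halfFov};
    for (float a : edges)
        extend(r, viewX + radius * std::cos(a), viewZ + radius * std::sin(a));

    // Where the arc crosses +X, +Z, -X or -Z it reaches past both extents.
    for (int i = 0; i < 4; ++i) {
        const float a = 0.5f * kPi * static_cast<float>(i);
        const float delta = std::remainder(a - heading, 2.0f * kPi);
        if (std::fabs(delta) <= halfFov)
            extend(r, viewX + radius * std::cos(a), viewZ + radius * std::sin(a));
    }

    // Extra points around the viewer, so looking down still covers nearby grass.
    extend(r, viewX - viewerPad, viewZ - viewerPad);
    extend(r, viewX + viewerPad, viewZ + viewerPad);
    return r;
}
```

The resulting rectangle would still need converting from world units to pixels of the orthographic Generate viewport before being used as a scissor rectangle.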
GENERATING GRASS

 LODs
 Vital for performance
 Distance based
 Each LOD discards increasing amounts of grass
 Remaining blades are scaled up to fill gaps
GENERATING GRASS

 LODs
 Feather distances randomly, to break up transitions
 Randomise distance calculation
 Fade grass height towards zero over last 15%
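
A rough CPU-side sketch of the LOD rules above. The slides do not give the actual discard ratios or LOD distances, so several choices here are assumptions made only for illustration: LOD n keeps one blade in 2^n, surviving blades are scaled by the inverse of the kept fraction, and LODs are evenly spaced by a hypothetical `lodInterval`. The two random inputs are expected in [0, 1].

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

struct LodResult {
    bool  keep;        // false: the blade is discarded
    float widthScale;  // scale-up applied to surviving blades to fill gaps
    float heightFade;  // 1 near the viewer, falling to 0 at the draw distance
};

inline LodResult evaluateLod(float distance, float maxDistance, float lodInterval,
                             float feather, float rndDistance, float rndKeep) {
    assert(maxDistance > 0.0f && lodInterval > 0.0f);

    // Randomise the distance so LOD transitions are feathered, not hard rings.
    const float d = std::max(0.0f, distance + (rndDistance - 0.5f) * feather);
    if (d >= maxDistance) return {false, 0.0f, 0.0f};

    // Assumption: LOD n keeps one blade in 2^n.
    const int lod = static_cast<int>(d / lodInterval);
    const float keepFraction = std::ldexp(1.0f, -lod);  // 2^-lod
    if (rndKeep >= keepFraction) return {false, 0.0f, 0.0f};

    // Fade grass height towards zero over the last 15% of the draw distance.
    const float fadeStart = 0.85f * maxDistance;
    const float heightFade =
        d <= fadeStart ? 1.0f : (maxDistance - d) / (maxDistance - fadeStart);

    return {true, 1.0f / keepFraction, heightFade};
}
```

In the real Generate stage the random inputs would presumably come from the per-blade random texture rather than from function arguments.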
RANDOMISATION

 Generate texture at load time


 Fill a 64x64 RGBA texture with rand()

 Provides 4 random numbers per grass blade


 Align texture to orthographic projection

 Used for
 Rotation
 Position
 Scale
 Varying Albedo
 Etc
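
A minimal sketch of the load-time random texture, assuming 8 bits per channel and that the 64x64 texture simply tiles across the pixels of the orthographic Generate pass. Both are assumptions; the slides only state the size, the use of rand() and the alignment to the projection. `makeNoiseTexture` and `bladeRandoms` are hypothetical names.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <vector>

constexpr int kNoiseSize = 64;

// Fill a 64x64 RGBA8 texture with rand(), once, at load time.
inline std::vector<std::uint8_t> makeNoiseTexture(unsigned seed) {
    std::srand(seed);
    std::vector<std::uint8_t> texels(kNoiseSize * kNoiseSize * 4);
    for (std::uint8_t& t : texels)
        t = static_cast<std::uint8_t>(std::rand() & 0xFF);
    return texels;
}

// Four random numbers in [0, 1] for the blade at Generate pixel (px, py).
// The texture wraps, so pixel (px + 64, py) sees the same values as (px, py).
inline std::array<float, 4> bladeRandoms(const std::vector<std::uint8_t>& texels,
                                         int px, int py) {
    assert(texels.size() == static_cast<std::size_t>(kNoiseSize * kNoiseSize * 4));
    const int x = ((px % kNoiseSize) + kNoiseSize) % kNoiseSize;
    const int y = ((py % kNoiseSize) + kNoiseSize) % kNoiseSize;
    const std::size_t base = static_cast<std::size_t>(y * kNoiseSize + x) * 4;
    return {texels[base + 0] / 255.0f, texels[base + 1] / 255.0f,
            texels[base + 2] / 255.0f, texels[base + 3] / 255.0f};
}
```

The four channels could then drive rotation, position offset, scale and albedo variation, in whatever assignment suits the shader.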
APPEND BUFFER

 Represents every valid pixel from Generate stage
 DirectX 11 Structured Buffer
 Each element represents one grass blade
 Output to this instead of Render Target
 16 byte aligned
 Pack 16bit values where possible
 f32tof16
 f16tof32

struct Instance
{
    float3 position;
    float specular;
    float3 albedo;
    uint vertexOffsetAndSkew;
    float2 rotation;
    float2 scale;
};
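
For reference, a CPU-side mirror of the Instance element can confirm the 16 byte alignment claim: the fields add up to 48 bytes, a multiple of 16. The sketch below also shows how two 16-bit values share one uint, as the name vertexOffsetAndSkew suggests. Which value sits in the low or high half is not stated in the slides, so `packHalves` and its ordering are assumptions, and the inputs are taken to be already converted to half-float bit patterns (the job f32tof16 does in HLSL).

```cpp
#include <cassert>
#include <cstdint>

// CPU-side mirror of the HLSL Instance struct (float3 becomes float[3]).
struct Instance {
    float         position[3];
    float         specular;
    float         albedo[3];
    std::uint32_t vertexOffsetAndSkew;
    float         rotation[2];
    float         scale[2];
};
static_assert(sizeof(Instance) == 48, "Instance should be 48 bytes");
static_assert(sizeof(Instance) % 16 == 0, "Instance should be a multiple of 16 bytes");

// Pack two 16-bit half-float bit patterns into one 32-bit value.
// Assumed layout: first value in the low 16 bits, second in the high 16 bits.
inline std::uint32_t packHalves(std::uint16_t low, std::uint16_t high) {
    return static_cast<std::uint32_t>(low) |
           (static_cast<std::uint32_t>(high) << 16);
}

inline std::uint16_t unpackLow(std::uint32_t packed) {
    return static_cast<std::uint16_t>(packed & 0xFFFFu);
}

inline std::uint16_t unpackHigh(std::uint32_t packed) {
    return static_cast<std::uint16_t>(packed >> 16);
}
```

For example, 0x3C00 and 0x4000 are the half-float bit patterns for 1.0 and 2.0, so packing them in that order gives 0x40003C00.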
DrawInstancedIndirect

 Allows the GPU to control Draw arguments


 Because we don’t know how many grass instances the GPU generated
 Avoids copying the AppendBuffer structure count back to CPU

 Same as DrawInstanced, except arguments come from GPU buffer


 VertexCountPerInstance
 InstanceCount
 StartVertexLocation
 StartInstanceLocation

 Create ID3D11Buffer with D3D11_RESOURCE_MISC_DRAWINDIRECT_ARGS


DrawInstancedIndirect

 Use CopyStructureCount to copy size of Append Buffer into Constant Buffer


 Populate buffer using Compute Shader
 Dispatch a single thread
 Is there a better way?

// Indirect draw arguments, read by DrawInstancedIndirect
RWBuffer<uint> g_drawInstancedBuffer : register( u0 );

// Append Buffer element count, written by CopyStructureCount
cbuffer BufferCounter : register( b12 )
{
    uint numInstances;
}

// Dispatched as a single thread to fill in the four draw arguments
[numthreads( 1, 1, 1 )]
void cp()
{
    g_drawInstancedBuffer[ 0 ] = 6u;           // vertexCountPerInstance
    g_drawInstancedBuffer[ 1 ] = numInstances; // instanceCount
    g_drawInstancedBuffer[ 2 ] = 0u;           // startVertexLocation
    g_drawInstancedBuffer[ 3 ] = 0u;           // startInstanceLocation
}
DrawInstancedIndirect

 Avoid dispatching high instance counts with low vertex counts


 http://www.slideshare.net/DevCentralAMD/vertex-shader-tricks-bill-bilodeau

 Prefer dispatching a single large instance


 Reconstruct vertex/instance ID in Vertex Shader
 Use SV_VertexID

// Indirect draw arguments, read by DrawInstancedIndirect
RWBuffer<uint> g_drawInstancedBuffer : register( u0 );

// Append Buffer element count, written by CopyStructureCount
cbuffer BufferCounter : register( b12 )
{
    uint numInstances;
}

// One large instance: 6 verts for every grass blade
[numthreads( 1, 1, 1 )]
void cp()
{
    g_drawInstancedBuffer[ 0 ] = 6u * numInstances; // vertexCountPerInstance
    g_drawInstancedBuffer[ 1 ] = 1u;                // instanceCount
    g_drawInstancedBuffer[ 2 ] = 0u;                // startVertexLocation
    g_drawInstancedBuffer[ 3 ] = 0u;                // startInstanceLocation
}
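
With a single large instance, the Vertex Shader has to recover the blade index and the corner within the blade from SV_VertexID. A minimal CPU-side sketch of that arithmetic, assuming 6 verts per blade as in the compute shader above. The two-triangle corner layout in `cornerPosition` is hypothetical; the slides do not list the hardcoded vertex data.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

constexpr std::uint32_t kVertsPerBlade = 6;

// Draw arguments in DrawInstancedIndirect order:
// vertexCountPerInstance, instanceCount, startVertexLocation, startInstanceLocation.
inline std::array<std::uint32_t, 4> makeDrawArgs(std::uint32_t numInstances) {
    return {kVertsPerBlade * numInstances, 1u, 0u, 0u};
}

struct BladeVertex {
    std::uint32_t blade;   // index into the Append Buffer
    std::uint32_t corner;  // 0..5, vertex within the blade's two triangles
};

// What the Vertex Shader would compute from SV_VertexID.
inline BladeVertex decodeVertexId(std::uint32_t vertexId) {
    return {vertexId / kVertsPerBlade, vertexId % kVertsPerBlade};
}

struct Float2 { float x, y; };

// Hypothetical corner table: a unit-height quad as two triangles.
// y = 1 marks the top verts, the ones that would receive skew.
inline Float2 cornerPosition(std::uint32_t corner) {
    assert(corner < kVertsPerBlade);
    static const Float2 kQuad[6] = {
        {-0.5f, 0.0f}, {0.5f, 0.0f}, {-0.5f, 1.0f},
        {-0.5f, 1.0f}, {0.5f, 0.0f}, {0.5f, 1.0f}};
    return kQuad[corner];
}
```

For 100 blades the draw covers 600 vertices in one instance, and vertex 13 decodes to blade 2, corner 1.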
INITIAL RESULTS
GEOMETRY OR FINS?

 Initial implementation used geometry


 Inspired by Outerra tech
 Heavily vertex bound
 Difficult to make grass look soft with few verts per blade
 Difficult to achieve desired grass density

 We were using only 5 verts per blade


 Contributing to spiky results
 Could tessellate close grass?

Outerra grass geometry


GEOMETRY OR FINS?

 Use fins instead?


 More traditional approach to rendering grass
 Alpha Testing/ATOC
 Each ‘blade’ now represents one billboard
 Easier to add variety via UV shifting
 Softer grass can be painted into texture

 Most of existing Generation tech still valid


RENDERING

 Vertex data hardcoded in shader
 Use SV_VertexID to generate it
 Construct matrix from position and rotation
 Apply scale to all verts
 Apply skew to top verts
 Texture format (DXT1)
 Red: Diffuse tint
 Green: Specular map
 Blue: Alpha
 Negative LOD Bias
 3/4 mip

// sin/cos for rotation matrix
float s = instance.rotation.x;
float c = instance.rotation.y;

// world matrix
float3 worldPosition = instance.position;
float4x4 m = float4x4(
    float4( c, 0, s, worldPosition.x ),
    float4( 0, 1, 0, worldPosition.y ),
    float4( -s, 0, c, worldPosition.z ),
    float4( 0, 0, 0, 1 )
);
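
To make the matrix concrete, here is a CPU-side sketch of what it does to a billboard corner. It assumes the shader multiplies as mul( m, float4( local, 1 ) ), which is what the translation sitting in the fourth column suggests. The way scale and skew are applied in `bladeCorner` is a guess for illustration only.

```cpp
#include <cassert>
#include <cmath>

struct Float3 { float x, y, z; };

// Local billboard corner with scale on all verts and skew on the top verts.
// cornerY is 0 at the base and 1 at the top, so skew only moves the top.
inline Float3 bladeCorner(float cornerX, float cornerY,
                          float scaleX, float scaleY, float skew) {
    return {cornerX * scaleX + skew * cornerY, cornerY * scaleY, 0.0f};
}

// Equivalent of multiplying the world matrix by (local, 1):
// rotate about Y by (s, c) = (sin, cos), then translate to the blade position.
inline Float3 transformToWorld(Float3 local, float s, float c, Float3 worldPosition) {
    assert(std::fabs(s * s + c * c - 1.0f) < 1e-3f);  // expects a unit sin/cos pair
    return {c * local.x + s * local.z + worldPosition.x,
            local.y + worldPosition.y,
            -s * local.x + c * local.z + worldPosition.z};
}
```

Storing sin and cos in the instance, rather than an angle, means the Vertex Shader avoids trigonometry for each of the blade's verts.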
LIGHTING

 Calculated per instance


 More efficient than per vertex/per pixel
 Inaccurate for large billboards
 Normals
 Combine terrain and billboard normal
 Randomise albedo
 Small amount of noise makes big difference
 Darken terrain under grass
 Terrain shader reads grass map for height
 Specular
 Use terrain normal and apply random reduction factor
 Fade effects in distance for smooth transition to terrain
SHADOWS

 Game creates a screen-space mask from depth pre-pass


 Pixel Shaders read mask instead of cascades

 One sample per grass instance


 What if grass instance is partially occluded?

 Solution
 Read shadow cascades directly
SSAO

 Same problem as shadows (screen-space mask)


 Expensive to add grass to depth pre-pass
 Must cope with screen-space problem (no shadow cascades!)
 SSAO also includes undercar shadow
 Leaks around car edges

 Solution
 Use depth buffer to compare 2 sample points
 Read SSAO from sample with furthest depth value
 Solves car occluding grass
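
The SSAO solution above boils down to a depth comparison between two candidate sample points. A tiny sketch, assuming a conventional depth buffer where larger values are further from the camera (a reversed-Z buffer would flip the comparison). Where the two sample points are taken is not stated in the slides, and `ScreenSample` and `grassSsao` are hypothetical names.

```cpp
#include <cassert>

// One screen-space sample: depth from the depth buffer, ssao from the mask.
struct ScreenSample {
    float depth;  // assumed: larger is further from the camera
    float ssao;   // 0 = fully occluded, 1 = unoccluded
};

// Use the SSAO value from whichever sample is further away, so a car in
// front of the grass does not leak its occlusion onto the blades behind it.
inline float grassSsao(ScreenSample a, ScreenSample b) {
    assert(a.ssao >= 0.0f && a.ssao <= 1.0f && b.ssao >= 0.0f && b.ssao <= 1.0f);
    return a.depth >= b.depth ? a.ssao : b.ssao;
}
```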
SELF OCCLUSION

 Tall grass should occlude neighbours


 Treat height map like normal map
 Sample neighbours to estimate slope

 Use normal and sun direction to estimate occlusion


 Artist controlled strength
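
One way the self occlusion idea could be expressed, as a hedged sketch: neighbouring grass heights give a slope via central differences, the slope gives a pseudo-normal, and the lighting that normal loses relative to flat ground is taken as the occlusion. The exact formula used in the game is not given in the slides, so this darkening term is an assumption. `toSun` is a unit vector pointing towards the sun.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

struct Float3 { float x, y, z; };

inline float dot(Float3 a, Float3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// Returns a shade factor: 1 = unoccluded, lower = darkened by taller neighbours.
inline float selfOcclusion(float hLeft, float hRight, float hDown, float hUp,
                           float texelSize, Float3 toSun, float strength) {
    assert(texelSize > 0.0f);

    // Sample neighbours to estimate the slope of the grass height field.
    const float dhdx = (hRight - hLeft) / (2.0f * texelSize);
    const float dhdz = (hUp - hDown) / (2.0f * texelSize);

    // Treat the height map like a normal map: normalize(-dh/dx, 1, -dh/dz).
    const float invLen = 1.0f / std::sqrt(dhdx * dhdx + 1.0f + dhdz * dhdz);
    const Float3 n{-dhdx * invLen, invLen, -dhdz * invLen};

    // Assumed model: occlusion is the light lost relative to flat ground.
    const float lit = std::max(0.0f, dot(n, toSun));
    const float flatLit = std::max(0.0f, toSun.y);
    const float shade = 1.0f - strength * std::max(0.0f, flatLit - lit);
    return std::min(1.0f, std::max(0.0f, shade));
}
```

With this model, grass whose neighbours grow taller in the direction of the sun is darkened, while uniform-height grass is left untouched. `strength` stands in for the artist control.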
DEFORMATION

 Cars/dynamic objects should flatten grass


 Render objects into F32 height texture
 Pass 1: Render centred around viewer
 Pass 2: Update texture into world space tiled texture
 Prevents texel swimming
 Fade edges of texture as it wraps around
 Use skidmarks not wheels

 Read height value in Generate stage


 If height intersects grass, modify the albedo, scale and skew, to appear squashed/flattened
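
Two pieces of the deformation scheme lend themselves to a small sketch. First, addressing a world space tiled texture: a world position always maps to the same texel, which is what stops the texels swimming as the viewer moves. Second, deciding how far a blade should be flattened once an object's height intersects it. Both functions are illustrative assumptions with hypothetical names, not the game's code.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Map a world coordinate to a texel of a wrapping, world-space tiled texture.
// The same world position always lands on the same texel.
inline int wrapTexel(float world, float texelSize, int textureSize) {
    assert(texelSize > 0.0f && textureSize > 0);
    const int texel = static_cast<int>(std::floor(world / texelSize));
    const int wrapped = texel % textureSize;
    return wrapped < 0 ? wrapped + textureSize : wrapped;
}

// 0 = object is above the blade tip, 1 = blade fully flattened.
// objectY is the height read from the deformation texture.
inline float flattenAmount(float groundY, float bladeHeight, float objectY) {
    assert(bladeHeight > 0.0f);
    const float overlap = (groundY + bladeHeight) - objectY;
    return std::min(1.0f, std::max(0.0f, overlap / bladeHeight));
}
```

The flatten amount could then drive the albedo, scale and skew changes applied in the Generate stage.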
PERFORMANCE

 Worst Case (ms)
 1920 x 1200
 4xMSAA

GPU                 Generate   Render   Total
AMD R9 290X         1.3        1.5      2.8
Nvidia GTX 780Ti    1.4        1.8      3.2
Nvidia GTX 560Ti    3.9        3.6      7.5
Intel HD5200        5.1        9.4*     14.5
*MSAA Disabled
PERFORMANCE

 Average Case (ms)
 1920 x 1200
 4xMSAA

GPU                 Generate   Render   Total
AMD R9 290X         1.5        0.2      1.7
Nvidia GTX 780Ti    1.6        0.3      1.9
Nvidia GTX 560Ti    4.1        0.8      4.9
Intel HD5200        5.2        2.0*     7.2
*MSAA Disabled
FUTURE IMPROVEMENTS

 One Generate per LOD


 Wind
 Prototyped, but too subtle on short grass
 Similar to deformation
 Render car speed instead of height
 Bleed speed values out over texture
 Read in Generate stage, to increase existing sine wave sway

 Flowers
 Meshes
 Gravel / small rocks
 Improve art authoring pipeline
 World space map is naïve, wastes texture space
 Translucency
WE ARE HIRING!

http://www.codemasters.com/uk/working-for-us/southam/
THANKS FOR LISTENING!

Questions?
