GPU Pro 2 (Edited by W.Engel) (2011)

GPU Pro 2 is a comprehensive guide on advanced rendering techniques, edited by Wolfgang Engel, featuring contributions from various experts in the field. The book covers a wide range of topics including geometry manipulation, facial animation, and global illumination, providing practical solutions for graphics programming challenges. It includes bibliographical references and example programs available online to support the content discussed in the chapters.
GPU Pro 2

Editorial, Sales, and Customer Service Office
A K Peters, Ltd.
5 Commonwealth Road, Suite 2C
Natick, MA 01760
www.akpeters.com

Copyright © 2011 by A K Peters, Ltd.

All rights reserved. No part of the material protected by this copyright notice may be reproduced or utilized in any form, electronic or mechanical, including photocopying, recording, or by any information storage and retrieval system, without written permission from the copyright owner.

Library of Congress Cataloging-in-Publication Data

GPU Pro 2 : advanced rendering techniques / edited by Wolfgang Engel.
p. cm.
Includes bibliographical references.
ISBN 978-1-56881-718-7 (hardback)
1. Rendering (Computer graphics) 2. Graphics processing units--Programming. 3. Computer graphics. 4. Real-time data processing. I. Engel, Wolfgang. II. Title: GPU Pro 2.

Front cover art courtesy of Lionhead Studios. All Fable artwork appears courtesy of Lionhead Studios and Microsoft Corporation. "Fable" is a registered trademark or trademark of Microsoft Corporation in the United States and/or other countries. © 2010 Microsoft Corporation. All Rights Reserved. Microsoft, Fable, Lionhead, the Lionhead logo, Xbox, and the Xbox logo are trademarks of the Microsoft group of companies.

Back cover images are generated with CryENGINE 3 by Anton Kaplanyan. The Crytek/Frank Meinl Sponza model is publicly available.

Printed in India.

Contents

Acknowledgments
Web Materials

I Geometry Manipulation (Wolfgang Engel, editor)

1 Terrain and Ocean Rendering with Hardware Tessellation (Xavier Bonaventura)
   1.1 DirectX 11 Graphics Pipeline
   1.2 Definition of Geometry
   1.3 Vertex Position, Vertex Normal, and Texture Coordinates
   1.4 Tessellation Correction Depending on the Camera Angle
   1.5 Conclusions
   Bibliography
2 Practical and Realistic Facial Wrinkles Animation (Jorge Jimenez, Jose I. Echevarria, Christopher Oat, and Diego Gutierrez)
   2.1 Background
   2.2 Our Algorithm
   2.3 Results
   2.4 Discussion
   2.5 Conclusion
   2.6 Acknowledgments
   Bibliography
3 Procedural Content Generation on the GPU (Aleksander Netzel and Pawel Rohleder)
   3.1 Abstract
   3.2 Introduction
   3.3 Terrain Generation and Rendering
   3.4 Environmental Effects
   3.5 Putting It All Together
   3.6 Conclusions and Future Work
   Bibliography

II Rendering (Christopher Oat, editor)

1 Pre-Integrated Skin Shading (Eric Penner and George Borshukov)
   1.1 Introduction
   1.2 Background and Previous Work
   1.3 Pre-Integrating the Effects of Scattering
   1.4 Scattering and Diffuse Light
   1.5 Scattering and Normal Maps
   1.6 Shadow Scattering
   1.7 Conclusion and Future Work
   1.8 Appendix A: Lookup Textures
   1.9 Appendix B: Simplified Skin Shader
   Bibliography
2 Implementing Fur Using Deferred Shading (Donald Revie)
   2.1 Deferred Rendering
   2.2 Fur
   2.3 Techniques
   2.4 Fur Implementation Details
   2.5 Conclusion
   2.6 Acknowledgments
   Bibliography
3 Large-Scale Terrain Rendering for Outdoor Games (Ferenc Pintér)
   3.1 Introduction
   3.2 Content Creation and Editing
   3.3 Runtime Shading
   3.4 Performance
   3.5 Possible Extensions
   3.6 Acknowledgments
   Bibliography
4 Practical Morphological Antialiasing (Jorge Jimenez, Belen Masia, Jose I. Echevarria, Fernando Navarro, and Diego Gutierrez)
   4.1 Overview
   4.2 Detecting Edges
   4.3 Obtaining Blending Weights
   4.4 Blending with the Four-Neighborhood
   4.5 Results
   4.6 Discussion
   4.7 Conclusion
   4.8 Acknowledgments
   Bibliography
5 Volume Decals (Emil Persson)
   5.1 Introduction
   5.2 Decals as Volumes
   5.3 Conclusions
   Bibliography

III Global Illumination Effects (Carsten Dachsbacher, editor)

1 Temporal Screen-Space Ambient Occlusion (Oliver Mattausch, Daniel Scherzer, and Michael Wimmer)
   1.1 Introduction
   1.2 Ambient Occlusion
   1.3 Reverse Reprojection
   1.4 Our Algorithm
   1.5 SSAO Implementation
   1.6 Results
   1.7 Discussion and Limitations
   1.8 Conclusions
   Bibliography
2 Level-of-Detail and Streaming Optimized Irradiance Normal Mapping (Ralf Habel, Anders Nilsson, and Michael Wimmer)
   2.1 Introduction
   2.2 Calculating Directional Irradiance
   2.3 H-Basis
   2.4 Implementation
   2.5 Results
   2.6 Conclusion
   2.7 Appendix A: Spherical Harmonics Basis Functions without Condon-Shortley Phase
   Bibliography
3 Real-Time One-Bounce Indirect Illumination and Shadows using Ray Tracing (Holger Gruen)
   3.1 Overview
   3.2 Introduction
   3.3 Phase 1: Computing Indirect Illumination without Indirect Shadows
   3.4 Phase 2: Constructing a 3D Grid of Blockers
   3.5 Phase 3: Computing the Blocked Portion of Indirect Light
   3.6 Future Work
   Bibliography
4 Real-Time Approximation of Light Transport in Translucent Homogenous Media (Colin Barré-Brisebois and Marc Bouchard)
   4.1 Introduction
   4.2 In Search of Translucency
   4.3 The Technique: The Way Out is Through
   4.4 Performance
   4.5 Discussion
   4.6 Conclusion
   4.7 Demo
   4.8 Acknowledgments
   Bibliography
5 Diffuse Global Illumination with Temporally Coherent Light Propagation Volumes (Anton Kaplanyan, Wolfgang Engel, and Carsten Dachsbacher)
   5.1 Introduction
   5.2 Overview
   5.3 Algorithm Detail Description
   5.4 Injection Stage
   5.5 Optimizations
   5.6 Results
   5.7 Conclusion
   5.8 Acknowledgments
   Bibliography

IV Shadows (Wolfgang Engel, editor)

1 Variance Shadow Maps Light-Bleeding Reduction Tricks (Wojciech Sterna)
   1.1 Introduction
   1.2 VSM Overview
   1.3 Light-Bleeding
   1.4 Solutions to the Problem
   1.5 Sample Application
   1.6 Conclusion
   Bibliography
2 Fast Soft Shadows via Adaptive Shadow Maps (Pavlo Turchyn)
   2.1 Percentage-Closer Filtering with Large Kernels
   2.2 Application to Adaptive Shadow Maps
   2.3 Soft Shadows with Variable Penumbra Size
   2.4 Results
   Bibliography
3 Adaptive Volumetric Shadow Maps (Marco Salvi, Kiril Vidimče, Andrew Lauritzen, Aaron Lefohn, and Matt Pharr)
   3.1 Introduction and Previous Approaches
   3.2 Algorithm and Implementation
   3.3 Comparisons
   3.4 Conclusions and Future Work
   3.5 Acknowledgments
   Bibliography
4 Fast Soft Shadows with Temporal Coherence (Daniel Scherzer, Michael Schwärzler, and Oliver Mattausch)
   4.1 Introduction
   4.2 Algorithm
   4.3 Comparison and Results
   Bibliography
5 Mipmapped Screen-Space Soft Shadows (Alberto Aguado and Eugenia Montiel)
   5.1 Introduction and Previous Work
   5.2 Penumbra Width
   5.3 Screen-Space Filter
   5.4 Filtering Shadows
   5.5 Mipmap Level Selection
   5.6 Multiple Occlusions
   5.7 Discussion
   Bibliography

V Handheld Devices (Kristof Beets, editor)

1 A Shader-Based eBook Renderer (Andrea Bizzotto)
   1.1 Overview
   1.2 Page-Peeling Effect
   1.3 Enabling Two Pages Side-by-Side
   1.4 Improving the Look and Antialiasing Edges
   1.5 Direction-Aligned Triangle Strip
   1.6 Performance Optimizations and Power Consumption
   1.7 Putting it Together
   1.8 Future Work
   1.9 Conclusion
   1.10 Acknowledgments
   Bibliography
2 Post-Processing Effects on Mobile Devices (Marco Weber and Peter Quayle)
   2.1 Overview
   2.2 Technical Details
   2.3 Case Study: Bloom
   2.4 Implementation
   2.5 Conclusion
   Bibliography
3 Shader-Based Water Effects (Joe Davis and Ken Catterall)
   3.1 Introduction
   3.2 Techniques
   3.3 Optimizations
   3.4 Conclusion
   Bibliography

VI 3D Engine Design (Wessam Bahnassi, editor)

1 Practical, Dynamic Visibility for Games (Stephen Hill and Daniel Collin)
   1.1 Introduction
   1.2 Surveying the Field
   1.3 Query Quandaries
   1.4 Wish List
   1.5 Conviction Solution
   1.6 Battlefield Solution
   1.7 Future Development
   1.8 Conclusion
   1.9 Acknowledgments
   Bibliography
2 Shader Amortization using Pixel Quad Message Passing (Eric Penner)
   2.1 Introduction
   2.2 Background and Related Work
   2.3 Pixel Derivatives and Pixel Quads
   2.4 Pixel Quad Message Passing
   2.5 PQA Initialization
   2.6 Limitations of PQA
   2.7 Cross Bilateral Sampling
   2.8 Convolution and Blurring
   2.9 Percentage Closer Filtering
   2.10 Discussion
   2.11 Appendix A: Hardware Support
   Bibliography
3 A Rendering Pipeline for Real-Time Crowds (Benjamin Hernandez and Isaac Rudomin)
   3.1 System Overview
   3.2 Populating the Virtual Environment and Behavior
   3.3 View-Frustum Culling
   3.4 Level of Detail Sorting
   3.5 Animation and Draw Instanced
   3.6 Results
   3.7 Conclusions and Future Work
   3.8 Acknowledgments
   Bibliography

VII GPGPU (Sebastien St-Laurent, editor)

1 2D Distance Field Generation with the GPU (Philip Rideout)
   1.1 Vocabulary
   1.2 Manhattan Grassfire
   1.3 Horizontal-Vertical Erosion
   1.4 Saito-Toriwaki Scanning with OpenCL
   1.5 Signed Distance with Two Color Channels
   1.6 Distance Field Applications
   Bibliography
2 Order-Independent Transparency using Per-Pixel Linked Lists (Nicolas Thibieroz)
   2.1 Introduction
   2.2 Algorithm Overview
   2.3 DirectX 11 Features Requisites
   2.4 Head Pointer and Nodes Buffers
   2.5 Per-Pixel Linked List Creation
   2.6 Per-Pixel Linked Lists Traversal
   2.7 Multisampling Antialiasing Support
   2.8 Optimizations
   2.9 Tiling
   2.10 Conclusion
   2.11 Acknowledgments
   Bibliography
3 Simple and Fast Fluids (Martin Guay, Fabrice Colin, and Richard Egli)
   3.1 Introduction
   3.2 Fluid Modeling
   3.3 Solver's Algorithm
   3.4 Code
   3.5 Visualization
   3.6 Conclusion
   Bibliography
4 A Fast Poisson Solver for OpenCL using Multigrid Methods (Sebastien Noury, Samuel Boivin, and Olivier Le Maitre)
   4.1 Introduction
   4.2 Poisson Equation and Finite Volume Method
   4.3 Iterative Methods
   4.4 Multigrid Methods (MG)
   4.5 OpenCL Implementation
   4.6 Benchmarks
   4.7 Discussion
   Bibliography

Contributors

Acknowledgments

The GPU Pro: Advanced Rendering Techniques book series covers ready-to-use ideas and procedures that can solve many of your daily graphics-programming challenges.

The second book in the series wouldn't have been possible without the help of many people. First, I would like to thank the section editors for the fantastic job they did. The work of Wessam Bahnassi, Sebastien St-Laurent, Carsten Dachsbacher, Christopher Oat, and Kristof Beets ensured that the quality of the series meets the expectations of our readers.

The great cover screenshots were taken from Fable III. I would like to thank Fernando Navarro from Microsoft Game Studios for helping us get the permissions to use those shots. You will find the Fable III-related article about morphological antialiasing in this book.

The team at A K Peters made the whole project happen. I want to thank Alice and Klaus Peters, Sarah Cutler, and the entire production team, who took the articles and made them into a book.

Special thanks go out to our families and friends, who spent many evenings and weekends without us during the long book production cycle.

I hope you have as much fun reading the book as we had creating it!

—Wolfgang Engel

P.S. Plans for an upcoming GPU Pro 3 are already in progress.
Any comments, proposals, and suggestions are highly welcome ([email protected]).

Web Materials

Example programs and source code to accompany some of the chapters are available at http://www.akpeters.com/gpupro. The directory structure closely follows the book structure by using the chapter number as the name of the subdirectory. You will need to download the DirectX August 2009 SDK.

General System Requirements

To use all of the files, you will need:

• The DirectX August 2009 SDK
• An OpenGL 1.5-compatible graphics card
• A DirectX 9.0 or 10-compatible graphics card
• Windows XP with the latest service pack; some programs require Vista or Windows 7
• Visual C++ .NET 2008
• 2048 MB RAM
• The latest graphics card drivers

Updates

Updates of the example programs will be periodically posted.

Comments and Suggestions

Please send any comments or suggestions to [email protected].

Geometry Manipulation

The "Geometry Manipulation" section of the book focuses on the ability of graphics processing units (GPUs) to process and generate geometry in exciting ways.

The article "Terrain and Ocean Rendering" looks at the tessellation-related stages of DirectX 11, explains a simple implementation of terrain rendering, and implements the techniques from the ShaderX6 article "Procedural Ocean Effects" by László Szécsi and Khashayar Arman.

Jorge Jimenez, Jose I. Echevarria, Christopher Oat, and Diego Gutierrez present a method to add expressive and animated wrinkles to characters in the article "Practical and Realistic Facial Wrinkles Animation." Their system allows the animator to independently blend multiple wrinkle maps across regions of a character's face. When combined with traditional blend-target morphing for facial animation, this technique can produce very compelling results that enable virtual characters to be much more expressive in both their actions and dialog.
The article "Procedural Content Generation on GPU," by Aleksander Netzel and Pawel Rohleder, demonstrates the generation and rendering of infinite and deterministic heightmap-based terrain utilizing fractal Brownian noise calculated in real time on the GPU. Additionally, it proposes a random tree-distribution scheme that exploits previously generated terrain information. The authors use spectral synthesis to accumulate several layers of approximated fractal Brownian motion. They also show how to simulate erosion in real time.

—Wolfgang Engel

Terrain and Ocean Rendering with Hardware Tessellation

Xavier Bonaventura

Currently, one of the biggest challenges in computer graphics is the reproduction of detail in scenes. To get more realistic scenes you need high-detail models, which slow the computer. To increase the number of frames per second, you can use low-detail models; however, that doesn't seem realistic. The solution is to combine high-detail models near the camera and low-detail models away from the camera, but this is not easy.

One of the most popular techniques uses a set of models at different levels of detail and, at runtime, changes them depending on their distance from the camera. This process is done on the CPU and is a problem because the CPU is not intended for this type of work—a lot of time is wasted sending meshes from the CPU to the GPU.

In DirectX 10, you could change the detail of meshes on the GPU by performing tessellation in the geometry shader, but it's not really the best solution. The output of the geometry shader is limited and is not intended for this type of serial work.

The best solution for tessellation is the recently developed tessellator stage in DirectX 11. This stage, together with the hull and the domain shader, allows the programmer to tessellate very quickly on the GPU.
With this method you can send low-detail meshes to the GPU and generate the missing geometry on the GPU depending on the camera distance, angle, or whatever you want. In this article we will take a look at the new stages in DirectX 11 and how they work. To do this we will explain a simple implementation of terrain rendering, and an implementation of water rendering as it appeared in ShaderX6 [Szécsi and Arman 08], but using these tools.

Figure 1.1. DirectX 11 pipeline.

1.1 DirectX 11 Graphics Pipeline

The DirectX 11 graphics pipeline [Microsoft] adds three new stages to the DirectX 10 pipeline: the hull shader stage, the tessellator stage, and the domain shader stage (see Figure 1.1). The first and third are programmable and the second is configurable. These come after the vertex shader and before the geometry shader, and they are intended to perform tessellation on the graphics card.

1.1.1 Hull Shader Stage

The hull shader stage is the first part of the tessellation block. The data used in it is new in DirectX 11; it uses a new primitive topology called the control point patch list. As its name suggests, it represents a collection of control points, wherein the number in every patch can go from 1 to 32. These control points are required to define the mesh.

The output data of this stage is composed of two parts—one is the input control points, which can be modified, and the other is some constant data that will be used in the tessellator and domain shader stages.

To calculate the output data there are two functions. The first is executed for every patch, and there you can calculate the tessellation factor for every edge of the patch and inside it. It is defined in the high-level shader language (HLSL) code, and the attribute [patchconstantfunc("func_name")] must be specified.
The other function, the main one, is executed for every control point in the patch, and there you can manipulate this control point. In both functions you have the information of all the control points in the patch; in addition, in the second function, you have the ID of the control point that you are processing. An example of a hull shader header is as follows:

    HS_CONSTANT_DATA_OUTPUT TerrainConstantHS(
        InputPatch<VS_CONTROL_POINT_OUTPUT, INPUT_PATCH_SIZE> ip,
        uint PatchID : SV_PrimitiveID )

    [domain("quad")]
    [partitioning("integer")]
    [outputtopology("triangle_cw")]
    [outputcontrolpoints(OUTPUT_PATCH_SIZE)]
    [patchconstantfunc("TerrainConstantHS")]
    HS_OUTPUT hsTerrain(
        InputPatch<VS_CONTROL_POINT_OUTPUT, INPUT_PATCH_SIZE> p,
        uint i : SV_OutputControlPointID,
        uint PatchID : SV_PrimitiveID )

For this example, we clarify some of the elements:

• HS_CONSTANT_DATA_OUTPUT is a struct and it must contain SV_TessFactor and SV_InsideTessFactor. Their types depend on [domain(type_str)].
• INPUT_PATCH_SIZE is an integer and it must match the control point primitive.
• [domain("quad")] can be either "quad", "tri", or "isoline".
• [partitioning("integer")] can be either "fractional_odd", "fractional_even", "integer", or "pow2".

The type of SV_DomainLocation can differ between domain shaders: if the [domain(...)] is "quad" or "isoline", its type is float2, but if the domain is "tri", its type is float3.

1.2 Definition of Geometry

To tessellate the terrain and water, we need to define the initial geometry—to do this, we will divide a big quad into different patches.

Figure 1.3. Division of the terrain into a grid: Vx and Ex represent vertices and edges, respectively, in every patch, where x is the index used to access them. Inside 0 and Inside 1 represent the directions of the tessellation inside a patch.

Figure 1.4. Lines between patches when tessellation factors are wrong.

The tessellation can be
applied to a lot of shapes, but we will use the most intuitive shape: a patch with four points. We will divide the terrain into patches of the same size, like a grid (Figure 1.3). For every patch, we will have to decide the tessellation factor on every edge and inside it. It's very important that two patches that share the same edge have the same tessellation factor, otherwise you will see some lines between patches (Figure 1.4).

1.2.1 Tessellation Factor

In the tessellator stage you can define different kinds of tessellation (fractional_even, fractional_odd, integer, or pow2). In the terrain rendering we will use integer, but in addition we will impose one more restriction: this value must be a power of two, to avoid a wave effect. This way, when a new vertex appears it will remain until the tessellation factor decreases again, and its x- and z-values will not change. The only value that will change will be the y-coordinate, to avoid popping. In the ocean rendering we will not impose the power-of-two restriction, because the ocean is dynamic and the wave effect goes unnoticed.

The tessellation factor ranges from 1 to 64 when the type of partitioning is integer, but the user has to define the minimum and the maximum values. We do not want the minimum of the tessellation factor to be 1 if there are few patches,
In terrain rendering, this x will be lation factor will be rounded to the nearest integer to get a power-uFtwo tessellation factor (see Equa tion (1.1)). In the ocean rendering, 2 will not he rounded (see Equation (1.2)) ginal trig for d< min(), ted) = § promettteog JA SHEED +m) foe mrin(d) < d < max(d), ginin(2eie) for d> max(d) (1) gmax(teoe) for d< min(d), to(d) =} gilf(eeng 10 +min( oq 2) for min(d) max(d), where diff(2) = max(x) — min(r) and dis the distance from the point where we want to calculate the tessellation factor to the camera. The distances defined by the user are min(d) and max(d) and min(éeiyg,) and max(Feog,) are the tessella- on factors defined by the user. For the tessellation factors, we use the logy in order to get a range from 0 to 6 instead of from 1 to 64, The final value te(d) is calculated five times for every patch, using different distances—four for the e and one for inside, As we said before, when an edge is shared by two patches the tessellation factor must be the same. To do this we will calculate five different: distances in every patch, one for each edge and one inside, To calculate the tessellation factor for each edge, we calculate the distance hetween the camera and the central point of the edge. This way, in two adjacent patches with the same edge, the distance at the middle point of this edge will be the same because they share the two vertices that we use to calenlate it. Th calculate the tessellation factor inside the patch in U and V directions, we calculate the distance between the camera position and the middle point of the patch. 10 | Geometry Manipulation 1.3. Vertex Position, Vertex Normal, and Texture Coordinates In the domain shader we have to reconstruct every final vertex and we have to calculate the position, normal, and texture coordinates. This is the part where the difference between terrain and ocean rendering is more important. 
1.3.1 Terrain

In terrain rendering (see Figure 1.5) we can easily calculate the x- and z-coordinates with a single interpolation between the positions of the vertices of the patch, but we also need the y-coordinate, which represents the height of the terrain at every point, and the texture coordinates. Since we have defined the terrain ourselves, to calculate the texture coordinates we have only to take the final x- and z-positions and divide by the size of the terrain. This is because the positions of the terrain range from 0 to the size of the terrain, and we want values from 0 to 1 to map the whole texture over it.

Once we have the texture coordinates, to get the height and the normal of the terrain at a vertex, we read the information from a heightmap and a normal map in world coordinates, combined in one texture. To apply this information we have to use mipmap levels, or we will see some popping when new vertices appear. To reduce this popping, we fetch the value from the mipmap level whose texel density matches the vertex density in the area where the vertex is located. To do this, we linearly interpolate between the minimum and the maximum mipmap levels depending on the distance (see Equation (1.3)). Four patches that share a vertex have to use the same mipmap level at that vertex to be coherent; for this reason, we calculate one mipmap level for each vertex of a patch. Then, to calculate the mipmap level for the other vertices, we have only to interpolate between the mipmap levels of the vertices of the patch. In Equation (1.3), $\operatorname{diff}(x) = \max(x) - \min(x)$, $M = \mathrm{MipmapLevel}$, and $d$ is the distance from the point to the camera:

$$\operatorname{Mipmap}(d) = \begin{cases} \min(M) & \text{for } d \le \min(d),\\ \frac{\operatorname{diff}(M)}{\operatorname{diff}(d)}(d-\min(d))+\min(M) & \text{for } \min(d) < d < \max(d),\\ \max(M) & \text{for } d \ge \max(d). \end{cases} \tag{1.3}$$
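Equation (1.3) is again a clamped linear ramp. A minimal C sketch (our naming; minM and maxM stand for min(MipmapLevel) and max(MipmapLevel), which the chapter derives from the texture size and patch count):

```c
/* Mipmap level for sampling the combined height/normal texture,
   following Equation (1.3): the level rises linearly with the
   vertex-to-camera distance d and is clamped to [minM, maxM]. */
float mipmap_level(float d, float minD, float maxD,
                   float minM, float maxM)
{
    if (d <= minD) return minM;   /* close-up: finest allowed level */
    if (d >= maxD) return maxM;   /* far away: coarsest level       */
    return (maxM - minM) * (d - minD) / (maxD - minD) + minM;
}
```

Because the level is a function only of a vertex's own distance, the four patches sharing a vertex agree on its level, as the text requires.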
To calculate the minimum and the maximum values for the mipmap level variables, we use the following equations, where textSize is the size of the texture that we use for the terrain:

$$\min(\mathrm{MipmapLevel}) = \log_2(\mathrm{textSize}) - \log_2\!\left(\mathrm{sqrtNumPatch} \cdot 2^{\max(t_{\log_2})}\right),$$
$$\max(\mathrm{MipmapLevel}) = \log_2(\mathrm{textSize}) - \log_2\!\left(\mathrm{sqrtNumPatch} \cdot 2^{\min(t_{\log_2})}\right).$$

Figure 1.5. Terrain rendering.

We have to keep in mind that we use only square textures with a power-of-two size. If the minimum value is less than 0, we use 0.

1.3.2 Ocean

Calculating the final vertex position in ocean rendering (see Figure 1.6) is more difficult than in terrain rendering. Here we do not have a heightmap; we have to calculate the final position depending on the waves and on the position in world coordinate space. To get realistic motion, we will use the technique explained in ShaderX6, developed by Szécsi and Arman [Szécsi and Arman 08].

First, we have to imagine a single wave with a wavelength ($\lambda$), an amplitude ($a$), and a direction ($\mathbf{k}$). Its velocity ($v$) can be represented by the deep-water dispersion relation

$$v = \sqrt{\frac{g\lambda}{2\pi}}.$$

Then the phase ($\varphi$) at time ($t$) at a point ($\mathbf{p}$) is

$$\varphi = \frac{2\pi}{\lambda}\left(\mathbf{p}\cdot\mathbf{k} + vt\right).$$

Finally, the displacement ($\mathbf{s}$) to apply at that point is

$$\mathbf{s} = a\left[-\cos\varphi,\ \sin\varphi\right],$$

where the first component is a horizontal offset along $\mathbf{k}$ and the second is the vertical offset. An ocean is not a simple wave, and we have to combine all the waves to get a realistic motion:

$$\mathbf{p}' = \mathbf{p} + \sum_i \mathbf{s}(\mathbf{p}, a_i, \lambda_i, \mathbf{k}_i).$$

All $a_i$, $\lambda_i$, and $\mathbf{k}_i$ values are stored in a file.

Figure 1.6. Ocean rendering.

At every vertex we also need to calculate the normal—it is not necessary to access a normal texture, because it can be calculated with a formula. We have a parametric function whose parameters are the x- and z-coordinates that we used to calculate the position; if we calculate the cross product of the two partial derivatives, we get the normal vector at that point:
$$\mathbf{n} = \frac{\partial \mathbf{p}'}{\partial x} \times \frac{\partial \mathbf{p}'}{\partial z}.$$

1.4 Tessellation Correction Depending on the Camera Angle

So far we have assumed that the tessellation factor depends only on the distance from the camera; nevertheless, a patch does not look the same seen from one direction as from another. For this reason, we apply a correction to the tessellation factor that we calculated previously. This correction is not applied to the final tessellation factor, because we want the final one to be a power-of-two value; it is applied to the unrounded value $x$ that we use in $2^x$ to calculate the final tessellation factor.

The angle used to calculate this correction is the angle between the unit vector along an edge ($\hat{e}$) and the unit vector that goes from the middle of this edge to the camera ($\hat{a}$): $\alpha = \arccos(\hat{e}\cdot\hat{a})$. This angle is mapped to a multiplier for $x$ that varies linearly with $\alpha$ (see Figure 1.7).

Figure 1.7. Angle correction.

The rank value is used to decide how important the angle is compared with the distance. The programmer can decide the value, but is advised to use values from 0 to 1, which reduce or increase the tessellation factor by up to 50%. If you decide to use a rank of 0.4, then the tessellation factor will be multiplied by a value between 0.8 and 1.2, depending on the angle.

To be consistent with this modification, we have to apply the correction to the value that we use to access the mipmap level in a texture. It is very important to understand that four patches that share the same vertex must have the same mipmap value at that vertex.
To calculate the angle of the camera at this point, we calculate the mean of the angles between the camera vector ($\hat{a}$) and every vector along the edges that share the point (see Figure 1.8). We do not know the information about the vertices of the other patches in the hull shader, but these vectors can be calculated in the vertex shader, because we know the size of the terrain and the number of patches.

Figure 1.8. Mipmap angle correction.

1.5 Conclusions

As we have shown in this article, hardware tessellation is a powerful tool that reduces the information transfer from the CPU to the GPU. Three new stages added to the graphics pipeline allow great flexibility in using hardware tessellation advantageously. We have seen the application in two fields—terrain and water rendering—but it can be used for similar meshes. The main point to bear in mind is that we can use other techniques to calculate the tessellation factor, but we always have to keep the tessellation factors and mipmap levels consistent across all the patches to avoid lines between them. In addition, we have seen that it is better if we can use functions to represent a mesh, as for the ocean, because the resolution can then be as high as the tessellation factor allows. If we use a heightmap, as we do for the terrain, it is possible that the texture will not have enough information, and we will have to interpolate between texels.

Bibliography

[Microsoft] Microsoft. "Windows DirectX Graphics Documentation."

[Szécsi and Arman 08] László Szécsi and Khashayar Arman. "Procedural Ocean Effects." Hingham, MA: Charles River Media, 2008.

Practical and Realistic Facial Wrinkles Animation

Jorge Jimenez, Jose I. Echevarria, Christopher Oat, and Diego Gutierrez

Virtual characters in games are becoming more and more realistic, with recent advances, for instance, in the techniques of skin rendering [d'Eon and Luebke 07, Hable et al.
09, Jimenez and Gutierrez 10] or behavior-based animation.¹ To avoid lifeless representations and to make the action more engaging, increasingly sophisticated algorithms are being devised that capture subtle aspects of the appearance and motion of these characters. Unfortunately, facial animation and the emotional aspect of the interaction have not been traditionally pursued with the same intensity. We believe this is an important aspect, especially given the current trend toward story-driven AAA games and their movie-like, real-time cut scenes.

Without even realizing it, we often depend on the subtleties of facial expression to give us important contextual cues about what someone is saying, thinking, or feeling. For example, a wrinkled brow can indicate surprise, while a furrowed brow may indicate confusion or inquisitiveness. In the mid-1800s, a French neurologist named Guillaume Duchenne performed experiments that involved applying electric stimulation to his subjects' facial muscles. Duchenne's experiments allowed him to map which facial muscles were used for different facial expressions. One interesting fact that he discovered was that smiles resulting from true happiness utilize not only the muscles of the mouth, but also those of the eyes. It is this subtle but important additional muscle movement that distinguishes a genuine, happy smile from an inauthentic or sarcastic smile. What we learn from this is that facial expressions are complex and sometimes subtle, but extraordinarily important in conveying meaning and intent. In order to allow artists to create realistic, compelling characters, we must allow them to harness the power of subtle facial expression.

¹Euphoria NaturalMotion technology.
We present a method to add expressive, animated wrinkles to characters' faces, helping to enrich stories through subtle visual cues. Our system allows the animator to independently blend multiple wrinkle maps across regions of a character's face. We demonstrate how combining our technique with state-of-the-art, real-time skin rendering can produce stunning results that enhance the personality and emotional state of a character (see Figures 2.1 and 2.2).

This enhanced realism has little performance impact. In fact, our implementation has a memory footprint of just 96 KB. Performance-wise, the execution time of our shader is 0.31 ms, 0.1 ms, and 0.09 ms on a low-end GeForce 8600GT, mid-range GeForce 9800GTX+, and mid-high range GeForce 295GTX, respectively. Furthermore, it is simple enough to be added easily to existing rendering engines without requiring drastic changes, even allowing existing bump/normal textures to be reused, as our technique builds on top of them.

Figure 2.2. The same scene (a) without and (b) with animated facial wrinkles. Adding them helps to increase visual realism and conveys the mood of the character.

2.1 Background

Bump maps and normal maps are well-known techniques for adding the illusion of surface features to otherwise coarse, undetailed surfaces. The use of normal maps to capture the facial detail of human characters has been considered standard practice for the past several generations of real-time rendering applications. However, using static normal maps unfortunately does not accurately represent the dynamic surface of an animated human face. In order to simulate dynamic wrinkles, one option is to use length-preserving geometric constraints along with artist-placed wrinkle features to dynamically create wrinkles on animated meshes [Larboulette and Cani 04].
Since this method actually displaces geometry, the underlying mesh must be sufficiently tessellated to represent the finest level of wrinkle detail. A dynamic facial-wrinkle animation scheme presented recently [Oat 07] employs two wrinkle maps (one for stretch poses and one for compress poses), and allows them to be blended to independent regions of the face using artist-animated weights along with a mask texture. We build upon this technique, demonstrating how to dramatically optimize its memory requirements. Furthermore, our technique allows us to easily include more than two wrinkle maps when needed, because we no longer map negative and positive values to different textures.

2.2 Our Algorithm

The core idea of this technique is the addition of wrinkle normal maps on top of the base normal maps and blend shapes (see Figure 2.3 (left) and (center) for example maps). For each facial expression, wrinkles are selectively applied by using weighted masks (see Figure 2.3 (right) and Table 2.1 for the mask and weights used in our examples). This way, the animator is able to manipulate the wrinkles on a per-blend-shape basis, allowing art-directed blending between poses and expressions. We store a wrinkle mask per channel of an RGBA texture; hence, we can store up to four zones per texture. As our implementation uses eight zones, we require storing and accessing only two textures. Note that when the contribution of multiple blend shapes in a zone exceeds a certain limit, artifacts can appear in the wrinkles. In order to avoid this problem, we clamp the value of the summation to the [0, 1] range.

While combining various displacement maps consists of a simple sum, combining normal maps involves complex operations that should be avoided in a time-constrained environment like a game. Thus, in order to combine the base

Figure 2.3. Base map (left), wrinkle map (center), and mask map (right). The wrinkle map is selectively applied on top of the base normal map by using a wrinkle mask.
The use of partial-derivative normal maps reduces this operation to a simple addition. The yellowish look of the maps is due to the encoding and storage in the R and G channels that this technique employs. Wrinkle-zone colors in the mask do not represent the actual channels of the mask maps; they are put together just for visualization purposes.

Table 2.1. Weights used for each expression and zone (see color meaning in the mask map of Figure 2.3).

and wrinkle maps, a special encoding is used: partial-derivative normal maps [Acton 08]. It has two advantages over the conventional normal map encoding:

1. Instead of reconstructing the z-value of a normal, we just have to perform a vector normalization, saving valuable GPU cycles;

2. More important for our purposes, the combination of various partial-derivative normal maps is reduced to a simple sum, similar to combining displacement maps.

float3 WrinkledNormal( Texture2D<float2> baseTex,
                       Texture2D wrinkleTex,
                       Texture2D maskTex[2],
                       float4 weights[2],
                       float2 texcoord ) {
    float3 base;
    base.xy = baseTex.Sample( AnisotropicSampler16, texcoord ).gr;
    base.xy = -1.0 + 2.0 * base.xy;
    base.z = 1.0;

    #ifdef WRINKLES
        float2 wrinkles = wrinkleTex.Sample( LinearSampler, texcoord ).gr;
        wrinkles = -1.0 + 2.0 * wrinkles;

        float4 mask1 = maskTex[0].Sample( LinearSampler, texcoord );
        float4 mask2 = maskTex[1].Sample( LinearSampler, texcoord );
        mask1 *= weights[0];
        mask2 *= weights[1];

        base.xy += mask1.r * wrinkles;
        base.xy += mask1.g * wrinkles;
        base.xy += mask1.b * wrinkles;
        base.xy += mask1.a * wrinkles;
        base.xy += mask2.r * wrinkles;
        base.xy += mask2.g * wrinkles;
        base.xy += mask2.b * wrinkles;
        base.xy += mask2.a * wrinkles;
    #endif

    return normalize( base );
}

Listing 2.1. HLSL code of our technique.
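On the CPU, the effect of Listing 2.1 can be checked with plain floats standing in for the texture fetches. The following C++ sketch is our own (names are hypothetical); it shows that combining the maps is nothing more than a weighted sum of derivatives followed by one normalization:

```cpp
#include <cmath>

struct Vec3 { float x, y, z; };

// CPU-side sketch of Listing 2.1: 'baseX/baseY' is the partial-derivative
// base normal, 'wrinkleX/wrinkleY' the wrinkle-map derivative, and each of
// the eight mask channels has already been multiplied by its blend-shape
// weight (maskTimesWeight).
Vec3 wrinkledNormal(float baseX, float baseY,
                    float wrinkleX, float wrinkleY,
                    const float maskTimesWeight[8])
{
    Vec3 n = { baseX, baseY, 1.0f };
    for (int zone = 0; zone < 8; ++zone) {
        // Combining partial-derivative maps is a plain weighted sum.
        n.x += maskTimesWeight[zone] * wrinkleX;
        n.y += maskTimesWeight[zone] * wrinkleY;
    }
    // Reconstruction is a single normalization, no z-reconstruction needed.
    float len = std::sqrt(n.x * n.x + n.y * n.y + n.z * n.z);
    return Vec3{ n.x / len, n.y / len, n.z / len };
}
```

With only one zone active at weight 1, the wrinkle derivative is added exactly once on top of the base, mirroring what the shader does per pixel.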
We are using a linear instead of an anisotropic sampler for the wrinkle and mask maps because the low-frequency nature of their information does not require higher quality filtering. This code is a more readable version of the optimized code found in the web material.

This encoding must be run as a simple preprocess. Converting a conventional normal n = (n_x, n_y, n_z) to a partial-derivative normal n' = (n'_x, n'_y, n'_z) is done by using the following equations:

n'_x = n_x / n_z,    n'_y = n_y / n_z.

Figure 2.4. Neutral, surprise, anger, and surprise-and-anger poses. The net result of applying both surprise and anger expressions on top of the neutral pose is an unwrinkled forehead. In order to accomplish this, we use positive and negative weights in the forehead wrinkle zones for the surprise and anger expressions, respectively.

In runtime, reconstructing a single partial-derivative normal n' to a conventional normal n is done as follows:

n = (n'_x, n'_y, 1) / ||(n'_x, n'_y, 1)||.

Note that in the original formulation of partial-derivative normal mapping there is a minus sign in both the conversion and reconstruction phases; removing it from both steps allows us to obtain the same result, with the additional advantage of saving another GPU cycle.

Then, combining different partial-derivative normal maps consists of a simple summation of their (x, y)-components before the normalization step. As Figure 2.3 reveals, expression wrinkles are usually low frequency. Thus, we can reduce map resolution to spare storage and lower bandwidth consumption, without visible loss of quality. Calculating the final normal map is therefore reduced to a summation of weighted partial-derivative normals (see Listing 2.1).

A problem with facial-wrinkle animation is the modeling of compound expressions, in which wrinkles result from the interactions among the basic expressions they are built upon. For example, if we are surprised, the frontalis muscle contracts the skin, producing wrinkles in the forehead.
If we then suddenly become angry, the corrugator muscles are triggered, expanding the skin in the forehead, thus causing the wrinkles to disappear. To be able to model these kinds of interactions, we let mask weights take negative values, allowing them to cancel each other. Figure 2.4 illustrates this particular situation.

2.2.1 Alternative: Using Normal Map Differences

An alternative to the use of partial-derivative normal maps for combining normal maps is to store differences between the base and each of the expression wrinkle maps (see Figure 2.5 (right)) in a manner similar to the way blend-shape interpolation is usually performed. As differences may contain negative values, we perform a scale-and-bias operation so that all values fall in the [0, 1] range, enabling storage in regular textures:

d(x, y) = 0.5 + 0.5 (w(x, y) − b(x, y)),

where w(x, y) is the normal at pixel (x, y) of the wrinkle map, and b(x, y) is the corresponding value from the base normal map. When DXT compression is used for storing the differences map, it is recommended that the resulting normal be renormalized after adding the delta, in order to alleviate the artifacts caused by the compression scheme (see web material for the corresponding listing).

Figure 2.5. Base map (left), wrinkle map (center), and difference map (right). We calculate a wrinkle-difference map by subtracting the base normal map from the wrinkle map. In runtime, the wrinkle-difference map is selectively added on top of the base normal map by using a wrinkle mask (see Figure 2.3 (right) for the mask). The grey color of the image on the right is due to the bias and scale introduced when computing the difference map.
Partial-derivative normal mapping has the following advantages over the differences approach:

• It can be a little bit faster because it saves one GPU cycle when reconstructing the normal, and also allows us to add only two-component normal derivatives instead of a full (x, y, z) difference; these two-component additions can be done two at once, in only one cycle. This translates to a measured performance improvement of 1.12× on the GeForce 8600GT, whereas we have not observed any performance gain on either the GeForce 9800GTX+ or the GeForce 295GTX.

• It requires only two channels to be stored vs. the three channels required for the differences approach. This provides higher quality because 3Dc can be used to compress the wrinkle map for the same memory cost.

On the other hand, the differences approach has the following advantages over the partial-derivative normal mapping approach:

• It uses standard normal maps, which may be important if this cannot be changed in the production pipeline.

• Partial-derivative normal maps cannot represent anything that falls outside of a 45° cone around (0, 0, 1). Nevertheless, in practice, this problem proved to have little impact on the quality of our renderings.

The suitability of each approach will depend on both the constraints of the pipeline and the characteristics of the art assets.

2.3 Results

For our implementation we used DirectX 10, but the wrinkle-animation shader itself could be easily ported to DirectX 9. However, to circumvent the limitation that only four blend shapes can be packed into per-vertex attributes at once, we used the DirectX 10 stream-out feature, which allows us to apply an unlimited number of blend shapes using multiple passes [Lorach 07]. The base normal map has a resolution of 2048 × 2048, whereas the difference wrinkle and mask maps have a resolution of 256 × 256 and 64 × 64, respectively, as they contain only low-frequency information.
We use 3Dc compression for the base and wrinkle maps, and DXT for the color and mask maps. The high-quality scanned head model and textures were kindly provided by XYZRGB, Inc., with the wrinkle maps created manually, adding the missing touch to the photorealistic look of the images. We used a mesh resolution of 13063 triangles, mouth included, which is a little step ahead of current generation games; however, as current high-end systems become mainstream, it will be more common to see such high polygon counts, especially in cinematics.

To simulate the subsurface scattering of the skin, we use the recently developed screen-space approach [Jimenez and Gutierrez 10, Jimenez et al. 10b], which transfers computations from texture space to screen space by modulating a convolution kernel according to depth information. This way, the simulation is reduced to a simple post-process, independent of the number of objects in the scene and easy to integrate in any existing pipeline. Facial-color animation is achieved using a recently proposed technique [Jimenez et al. 10a], which is based on in vivo melanin and hemoglobin measurements of real subjects. Another crucial part of our rendering system is the Kelemen/Szirmay-Kalos model, which provides realistic specular reflections in real time [d'Eon and Luebke 07]. Additionally, we use the recently introduced filmic tone mapper [Hable 10], which yields really crisp blacks.

Figure 2.6. Closeups showing the wrinkles produced by nasalis (nose), frontalis (forehead), and mentalis (chin) muscles.

Figure 2.7. Transition between various expressions. Having multiple mask zones for the forehead wrinkles allows their shape to change according to the animation.

GPU               | Shader execution time
GeForce 8600GT    | 0.31 ms
GeForce 9800GTX+  | 0.1 ms
GeForce 295GTX    | 0.09 ms

Table 2.2. Performance measurements for different GPUs.
The times shown correspond specifically to the execution of the code of the wrinkles shader.

For the head shown in the images, we have not created wrinkles for the zones corresponding to the cheeks, because the model is tessellated enough in this zone, allowing us to produce geometric deformations directly on the blend shapes. Figure 2.6 shows different close-ups that allow the added wrinkles to be appreciated in detail. Figure 2.7 depicts a sequential blending between compound expressions, illustrating that adding facial-wrinkle animation boosts realism and adds mood to the character (frames taken from the movie are included in the web material).

Table 2.2 shows the performance of our shader using different GPUs, from the low-end GeForce 8600GT to the high-end GeForce 295GTX. An in-depth examination of the compiled shader code reveals that the wrinkle shader adds a per-pixel arithmetic instruction/memory access count of 9/3. Note that animating wrinkles is useful mostly for near-to-medium distances; for far distances it can be progressively disabled to save GPU cycles. Besides, when similar characters share the same (u, v) arrangement, we can reuse the same wrinkles, further improving the use of memory resources.

2.4 Discussion

From direct observation of real wrinkles, it may be natural to assume that shading could be enhanced by using techniques like ambient occlusion or parallax occlusion mapping [Tatarchuk 07]. However, we have found that wrinkles exhibit very little to no ambient occlusion, unless the parameters used for its generation are pushed beyond their natural values. Similarly, self-occlusion and self-shadowing can be thought to be an important feature when dealing with wrinkles, but in practice we have found that the use of parallax occlusion mapping is most often unnoticeable in the specific case of facial wrinkles.
Furthermore, our technique allows the incorporation of additional wrinkle maps, like the lemon pose used in [Oat 07], which allows stretching wrinkles already found in the neutral pose. However, we have not included them because they have little effect on the expressions we selected for this particular character model.

2.5 Conclusion

Compelling facial animation is an extremely important and challenging aspect of computer graphics. Both games and animated feature films rely on convincing characters to help tell a story, and a critical part of character animation is the character's ability to use facial expression. We have presented an efficient technique for achieving animated facial wrinkles for real-time character rendering. When combined with traditional blend-target morphing for facial animation, our technique can produce very compelling results that enable virtual characters to accompany both their actions and dialog with increased facial expression. Our system requires very little texture memory and is extremely efficient, enabling true emotional and realistic character renderings using technology available in widely adopted PC graphics hardware and current generation game consoles.

2.6 Acknowledgments

Jorge would like to dedicate this work to his eternal and most loyal friend Kazán. We would like to thank Belen Masia for her very detailed review and support, Wolfgang Engel for his editorial efforts and ideas to improve the technique, and Xenxo Alvarez for helping to create the different poses. This research has been funded by a Marie Curie grant from the Seventh Framework Programme (grant agreement no.: 251415), the Spanish Ministry of Science and Technology (TIN2010-21543), and the Gobierno de Aragón (projects OTRI 2009/0111 and CTPP05/09). Jorge Jimenez was additionally funded by a grant from the Gobierno de Aragón.
The authors would also like to thank XYZRGB Inc. for the high-quality head scan.

Bibliography

[Acton 08] Mike Acton. "Ratchet and Clank Future: Tools of Destruction Technical Debriefing." Technical report, Insomniac Games, 2008.

[d'Eon and Luebke 07] Eugene d'Eon and David Luebke. "Advanced Techniques for Realistic Real-Time Skin Rendering." In GPU Gems 3, edited by Hubert Nguyen, Chapter 14, pp. 293–347. Reading, MA: Addison-Wesley, 2007.

[Hable et al. 09] John Hable, George Borshukov, and Jim Hejl. "Fast Skin Shading." In ShaderX7, edited by Wolfgang Engel, pp. 161–173. Hingham, MA: Charles River Media, 2009.

[Hable 10] John Hable. "Uncharted 2: HDR Lighting." Game Developers Conference, 2010.

[Jimenez and Gutierrez 10] Jorge Jimenez and Diego Gutierrez. "Screen-Space Subsurface Scattering." In GPU Pro, edited by Wolfgang Engel, Chapter V.7. Natick, MA: A K Peters, 2010.

[Jimenez et al. 10a] Jorge Jimenez, Timothy Scully, Nuno Barbosa, Craig Donner, Xenxo Alvarez, Teresa Vieira, Paul Matts, Veronica Orvalho, Diego Gutierrez, and Tim Weyrich. "A Practical Appearance Model for Dynamic Facial Color." ACM Transactions on Graphics 29:6 (2010), Article 141.

[Jimenez et al. 10b] Jorge Jimenez, David Whelan, Veronica Sundstedt, and Diego Gutierrez. "Real-Time Realistic Skin Translucency." IEEE Computer Graphics and Applications 30:4 (2010), 32–41.

[Larboulette and Cani 04] C. Larboulette and M. Cani. "Real-Time Dynamic Wrinkles." In Proc. of the Computer Graphics International. Washington, DC: IEEE Computer Society, 2004.

[Lorach 07] T. Lorach. "DirectX 10 Blend Shapes: Breaking the Limits." In GPU Gems 3, edited by Hubert Nguyen, Chapter 3, pp. 53–67. Reading, MA: Addison-Wesley, 2007.

[Oat 07] Christopher Oat. "Animated Wrinkle Maps." In SIGGRAPH '07: ACM SIGGRAPH 2007 courses, pp. 33–37. New York: ACM, 2007.

[Tatarchuk 07] Natalya Tatarchuk.
"Practical Parallax Occlusion Mapping." In ShaderX5, edited by Wolfgang Engel, pp. 75–105. Hingham, MA: Charles River Media, 2007.

3
Procedural Content Generation on the GPU
Aleksander Netzel and Pawel Rohleder

3.1 Abstract

This article emphasizes on-the-fly procedural creation of content related to the video games industry. We demonstrate the generating and rendering of infinite and deterministic heightmap-based terrain utilizing fractal Brownian noise calculated in real time on a GPU. We take advantage of a thermal erosion algorithm proposed by David Cappola, which greatly improves the level of realism in heightmap generation. In addition, we propose a random tree distribution algorithm that exploits previously generated terrain information. Combined with a natural-looking sky model based on Rayleigh and Mie scattering, we achieved very promising quality results at real-time frame rates. The entire process can be seen in our DirectX10-based demo application.

3.2 Introduction

Procedural content generation (PCG) refers to the wide process of generating media algorithmically. Many existing games use PCG techniques to generate a variety of content, from simple, random object placement over procedurally generated landscapes to fully automatic creation of weapons, buildings, or AI enemies. Game worlds tend to be increasingly rich, which requires a lot of effort that we can minimize by utilizing PCG techniques. One of the basic PCG techniques in real-time computer graphics applications is heightmap-based terrain generation [Olsen 04].

3.3 Terrain Generation and Rendering

Many different real-time, terrain-generation techniques have been developed over the last few years. Most of them utilize procedurally generated noise for creating a heightmap. The underlying noise-generation method should be fast enough to get at least close to real time and generate plausible results.
The most interesting technique simulates 1/f noise (called "pink noise"). Because this kind of noise occurs widely in nature, it can be easily implemented on modern GPUs and has a good performance/speed ratio. In our work, we decided to use an approximation of fractal Brownian motion (fBm). Our implementation uses spectral synthesis, which accumulates several layers of noise together (see Figure 3.1). The underlying noise-generation algorithm is simple Perlin noise, which is described in [Green 05]. Although its implementation relies completely on the GPU, it is not fast enough to be calculated with every frame because of other procedural generation algorithms. We therefore used a system in which everything we need is generated on demand. The terrain is divided into smaller nodes (the number of nodes has to be a divisor of the heightmap size, so that there won't be any glitches after generation), and the camera is placed in the center of all nodes. Every node that touches the center node has a bounding box. Whenever a collision between the camera and any of the bounding boxes is detected, a new portion of the heightmap is generated (along with other procedural content). Based on the direction of the camera collision and the position of the nodes in world space, we determine new UV coordinates for noise

Figure 3.1. Procedurally generated terrain with random tree distribution.

Figure 3.2. Red: camera; yellow: AABB; green: different patches in the grid. (a) Camera in the middle. (b) Collision detected with AABB. (c) New generation of procedural content, new AABB. (d) New row that is generated.
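The spectral synthesis step can be sketched on the CPU as follows. This is our own C++ sketch; the demo evaluates Perlin noise on the GPU, whereas here a cheap hash-based value noise stands in for it. Each octave doubles the frequency and halves the amplitude, which approximates the 1/f spectrum mentioned above:

```cpp
#include <cmath>

// Deterministic hash noise in [-1, 1] for integer lattice points
// (a stand-in for Perlin noise; not the demo's actual generator).
static float hashNoise(int x, int y)
{
    unsigned int n = (unsigned int)(x + y * 57);
    n = (n << 13) ^ n;
    unsigned int m = (n * (n * n * 15731u + 789221u) + 1376312589u) & 0x7fffffffu;
    return 1.0f - m / 1073741824.0f;
}

// Bilinearly interpolated value noise.
static float valueNoise(float x, float y)
{
    int xi = (int)std::floor(x), yi = (int)std::floor(y);
    float fx = x - xi, fy = y - yi;
    float a = hashNoise(xi, yi),     b = hashNoise(xi + 1, yi);
    float c = hashNoise(xi, yi + 1), d = hashNoise(xi + 1, yi + 1);
    float ab = a + fx * (b - a), cd = c + fx * (d - c);
    return ab + fy * (cd - ab);
}

// Spectral synthesis: accumulate several octaves of noise.
float fbm(float x, float y, int octaves)
{
    float sum = 0.0f, amplitude = 0.5f, frequency = 1.0f;
    for (int i = 0; i < octaves; ++i) {
        sum += amplitude * valueNoise(x * frequency, y * frequency);
        frequency *= 2.0f;   // each octave doubles frequency...
        amplitude *= 0.5f;   // ...and halves amplitude (1/f spectrum)
    }
    return sum;              // roughly in [-1, 1]
}
```

Because the noise is a pure function of position, the same UV coordinates always reproduce the same heights, which is what makes the terrain deterministic and endless.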
This operation guarantees, with respect to the heightmap generation algorithm, that terrain will be contimous, endless, and realistic We could also optimize processing power by half, knowing that at most there are two rows of heightmaps that should be regenerated, We simply copy specified rows or columns and generate one additional row/cohumn via the pixel shader (in Figure 32, all patches with a blue outline). When a camera collision occurs in the comer where two AABBs overlap, we have to generate a new row and cohunn of data, To overcome the problem with camera collision and the newly generated AABB, we have to make the bounding boxes a little bigger or smaller, so there will he free space between them Our algorithm also utilizes a level-of-detail (LOD) system. Since we are as- sured of the relative positions of all nodes, we can precalculate all variations (there are only a small number) of index buffers. The center node has to have the lowest LOD (highest triangle count). Since we know that the camera will always be placed in the center node, we don’t have to switeh index buffers for dif. ferent terrain LOD because the distance between the patch containing the camera and other patches is always the same (that’s basically how our system works) Another natural phenomenon we decided to simulate is erosion. Erosion is defined as the transperting and depositing of solid mat wind, water, gravity, or living organisms. The simplest type of erosion is thermal erosion [Marak 97], wherein temperature changes cause small portions of the materials to crumble and pile up at the bottom of an incline, The algorithm is iterative and is as follows for each iteration: for every terrain point that hes an altitude higher than the given threshold value (called talus angle (T)), some of its material will be moved to neighboring points. In the case that many of the neighbors’ heights are above the talus angle, material has to be properly distributed. 
The implementation is fairly simple. We compare every point of the heightmap with the heights of the neighboring points and calculate how much material has to be moved. Because pixel shader limitations restrict us from scattering (before UAVs in DirectX 11), we use an approach proposed by David Cappola [Capolla 08]. To overcome the pixel shader limitation, we make two passes. In the first pass, we evaluate the amount of material that has to be moved from a given point to its neighbors; in the second pass, we actually accumulate the material, using the information provided by the first pass.

To improve the visual quality of our terrain, we generate a normal map, which is pretty straightforward: we take the heightmap and apply a Sobel filter to it. Using the sun position, we can calculate shading (Phong diffuse). Since we target DX10 and higher, we use texture arrays to render terrain with many textures [Dudask 07].

Collision detection is managed by copying the GPU-generated heightmap onto an offline surface and reading the height data array. Then we can check for collision using a standard method on the CPU and determine an accurate position inside an individual terrain quad using linear interpolation. Alternatively, without involving the CPU, we could render to a 1 × 1 render target instead; however, since our tree-placement technique is not "CPU free" (more on that later), we need to be able to check for height/collision without stressing the GPU.

3.4 Environmental Effects

Since we want to simulate a procedurally rich natural environment, we have to be able to render more than just terrain with a grass texture—we want plants. In a natural environment, plant distribution is based on many conditions such as soil quality, exposure (wind and sun), or the presence of other plants.
In our work, we used a heightmap-based algorithm to determine tree placement on the generated terrain.

Our tree-placement technique consists of two separate steps. First, we generate a tree-density map, which characterizes tree placement. The next step involves rendering instanced tree meshes; we use the density map to build a stream of world matrices. The height and slope of the terrain are the most common factors on which plant growth depends, and both are easy to calculate in our case since we already have a heightmap and a normal map. Using a simple pixel shader, we calculate the density for every pixel as a mix of slope, height, and some arbitrarily chosen parameters. As a result, we get a texture with the tree distribution, where the value 1.0 corresponds to the maximum number of trees and 0.0 corresponds to no trees. To give an example, we present one of our versions (we tried many) in Listing 3.1 and the result in Figure 3.3. There are no strict rules on how to implement this shader; anything that suits your needs is acceptable.

When the tree-density map is ready, we have to build a stream of world matrices for instancing. Because one tree-density-map texel can enclose a huge range in world space, we cannot simply place one tree per texel, because the
float p = 0.0;

// Calculate slope.
float f_slope_range = 0.1;
float f_slope_min = 1.0 - f_slope_range;
float3 v_normal = g_Normalmap.Sample( samClamp, IN.UV ).xyz * 2.0 - 1.0;
float f_height = g_Heightmap.Sample( samClamp, IN.UV ).x * 2.0 - 1.0;
float f_slope = dot( v_normal, float3( 0, 1, 0 ) );
f_slope = saturate( f_slope - f_slope_min );
float f_slope_val = smoothstep( 0.0, f_slope_range, f_slope );

// Get relative height.
float f_rel_height_threshold = 0.002;
float4 v_heights = 0;
v_heights.x = g_Heightmap.Sample( samClamp, IN.UV - float2( 1.0 / f_HM_size, 0.0 ) ).x; // Left
v_heights.y = g_Heightmap.Sample( samClamp, IN.UV + float2( 1.0 / f_HM_size, 0.0 ) ).x; // Right
v_heights.z = g_Heightmap.Sample( samClamp, IN.UV - float2( 0.0, 1.0 / f_HM_size ) ).x; // Up
v_heights.w = g_Heightmap.Sample( samClamp, IN.UV + float2( 0.0, 1.0 / f_HM_size ) ).x; // Down
v_heights = v_heights * 2.0 - 1.0;
v_heights = abs( v_heights - f_height );
v_heights = step( v_heights, f_rel_height_threshold );

p = dot( f_slope_val, v_heights ) * 0.25;
return p;

Listing 3.1. Tree-density map pixel shader.

generated forest would be too sparse. To solve this issue, we have to increase the trees-per-texel ratio; therefore, we need one more placement technique.

We assign each tree type a different radius that determines how much space this type of tree owns (in world space). It can be compared to the situation in a real environment where bigger trees take more resources and prevent smaller trees from growing in their neighborhood. Also, we want our trees to be evenly but randomly distributed across the patch corresponding to one density-map texel. Our solution is to divide the current patch into a grid wherein the cell size is determined by the biggest tree radius in the whole patch. The total number of cells is a mix of the density of the current texel and the space that the texel encloses in world space. In the center of every grid cell, we place one tree and move its position by a pseudorandom offset within the cell to remove repetitive patterns.
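The grid-and-jitter placement for one patch can be sketched as follows. This is a C++ sketch of our own (the hash, the cell-count formula, and all names are assumptions, not the demo's code); the actual stream would also carry rotation and scale per tree:

```cpp
#include <vector>

struct TreePos { float x, z; };

// Deterministic per-cell pseudorandom value in [0, 1).
static float hash01(int x, int y)
{
    unsigned int n = (unsigned int)(x * 73856093 ^ y * 19349663);
    n = (n ^ (n >> 13)) * 1274126177u;
    return (n & 0xffffffu) / 16777216.0f;
}

// Place one tree per grid cell of the patch covered by a density-map texel.
// Cell size is derived from the largest tree radius; the jitter keeps each
// tree inside its own cell so neighboring radii never overlap.
std::vector<TreePos> placeTrees(float patchOriginX, float patchOriginZ,
                                float patchSize, float maxTreeRadius,
                                float density)   // density in [0, 1]
{
    std::vector<TreePos> trees;
    int cellsPerSide = (int)(density * patchSize / (2.0f * maxTreeRadius));
    if (cellsPerSide <= 0)
        return trees;                            // density too low: no trees
    float cell = patchSize / cellsPerSide;
    for (int j = 0; j < cellsPerSide; ++j)
        for (int i = 0; i < cellsPerSide; ++i) {
            // Cell center plus a jitter bounded by the free space in the cell.
            float jx = (hash01(i, j)     - 0.5f) * (cell - 2.0f * maxTreeRadius);
            float jz = (hash01(j, i + 1) - 0.5f) * (cell - 2.0f * maxTreeRadius);
            trees.push_back({ patchOriginX + (i + 0.5f) * cell + jx,
                              patchOriginZ + (j + 0.5f) * cell + jz });
        }
    return trees;
}
```

Because the jitter is a pure hash of the cell indices, revisiting the same patch reproduces the same forest, which matches the deterministic-terrain requirement.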
ss to be evenly ry grid cell, wo place one tree and a | Geometry Manipulation (®) 0) © Figure 3.3. (a) Heightmap, (b) normal map, and {c) tree-density map. Using camera frustum, we process only visible texels to form a density map. Based on the distance to the camera, the tree’s LOD is determined so that only ‘trees close to the camera will be rendered with a complete list of faces. After all these steps, we have a second stream prepared and we are ready to render all the trees. We also input the maximum number of trees that we are abo ut to render because there can easily become too many, especially when we have a large texel-to-world-space ratio. Care has to be taken with this approach since we cannot simply process a tree-den texels close to the camera first. If we don't, we might use all “available” tr for the farthest patches, and there won't be any close to the situation, we may use billboards for the farthest trees, or use a smooth transition Into the fog color ‘The last part of our procedural gen: scattering simulation |West 08,0°Neil 05]. Our implementation follows the tech nique described in GPU Gems 2. We first calculate optical depth to use it as a lookup table for further generating Mie and Rayleigh textures. Mie and Rayleigh toxtures are updated in every frame (using rendering to multiple render targets (MRT)) and are then sampled during sky-dome rendering. This method is effi- cient, fast, and gives visually pleasing results. y map row by row; we have to process mera. Tn this ation is the Rayleigh-Mie atmospheric 3.5 Putting It All Together In Figure 3.4 we present: what our rendering loop looks like. As described earlier, vwe perform only one fall generation of procedural content whenever a collision of the camera with one of the bounding boxes is detected. When collision is detected, we transform camera position into UV position for heightmap generation. After generating the heightmap, we calculate the erosion and the normal map. 
The last part of this generation step is to calculate the new AABB.

Figure 3.4. Rendering loop: after the scene update, if a collision with an AABB is detected, we generate the noise, the normal map, the tree-density map, and the tree-positions stream, and calculate the new AABB.

The tree-position stream is calculated in each frame, since it depends on the camera orientation (see Figure 3.5).

Figure 3.5. (a) Heightmap, (b) erosion map, (c) normal map, and (d) tree-density map.

3.6 Conclusions and Future Work

We implemented all of the techniques described in this article using Microsoft DirectX 10. All parameters controlling algorithm behavior can be changed at run time. Table 3.1 shows the minimum, maximum, and average number of frames per second (fps) in our framework, with 200 iterations of the erosion algorithm.

As we can see, the average number of frames is close to the maximum number of frames, because usually only the tree stream is generated. The tenfold drop in fps at the minimum is due to the full generation of procedural content.

              NVIDIA GTX 260 | ATI Radeon HD 5000
fps average        351       |        352
fps minimum         66       |         17
fps maximum        385       |

Table 3.1. Minimum, maximum, and average number of frames per second.

Therefore, we could procedurally generate content in every frame and keep up with real time, but our system doesn't require that. The obvious optimization is to put the generation of the textures into another thread. For instance, erosion iterations can be divided over several frames.

The possibilities for developing further procedural generation are almost endless; we mention the techniques we will be developing in the near future. Tree-density mapping can be used not only for tree generation but also for other plant-seeding systems like grass and bushes. Since we can assume that bushes grow close to trees, we can place some of them into the grid containing one tree. Of course,
we must take into account that bushes are unlikely to grow directly in a tree's shadow. Also, the same density map can be used for grass placement. Since grass is likely to grow further away (because natural conditions for grass growth are less strict), we can blur the density map to get a grass-density map. Therefore, without much more processing power, we can generate grass and bushes.

The next step could involve generating rivers and even whole cities, but cities put some conditions on the generated terrain: terrain under and around a city should be almost flat. The best solution is to combine artist-made terrain with procedural terrain. The only challenge is to achieve a seamless transition between the two.

Another interesting idea is to use other types of erosion (i.e., the erosion described in ShaderX) or different noise generators. For heightmap generation, since every function that returns a height for a given (x, y) can be used, the options are practically limitless. For example, one might take advantage of the possibilities provided by the latest version of DirectX 11 compute shaders, which provide many new features that make procedural generation easier.

In conclusion, PCG offers the possibility to generate virtual worlds in a fast and efficient way. Many different techniques may be used to create rich environments, such as terrain with dense vegetation, with a very small amount of artist/level-designer work. Our application could be easily extended or integrated into an existing game editor. Since our techniques offer interactive frame rates, it could be used for current games, like flight simulators or any other open-space game with a rich environment.

Bibliography

[Capolla 08] D. Capolla. "GPU Terrain." Available at http://www.maxbox.com/GPU, 2008.

[Dudask 07] B.
Dudask. "Texture Arrays for Terrain Rendering." Available at http://developer.download.nvidia.com/SDK/10.5/direct3d/Source/TextureArrayTerrain/doc/TextureArrayTerrain.pdf, 2007.

[Green 05] S. Green. "Implementing Improved Perlin Noise." In GPU Gems 2, edited by Hubert Nguyen, pp. 73-85. Reading, MA: Addison-Wesley, 2005.

[Marak 97] I. Marak. "Thermal Erosion." Available at http://www.cg.tuwien.ac.at/hostings/cescg/CESCG97/marak/node11.html, 1997.

[Olsen 04] J. Olsen. "Real-Time Procedural Terrain Generation." Available at http://oddlabs.com/download/terrain_generation.pdf, 2004.

[O'Neil 05] S. O'Neil. "Accurate Atmospheric Scattering." In GPU Gems 2, edited by Hubert Nguyen. Reading, MA: Addison-Wesley, 2005.

[West 08] M. West. "Random Scattering: Creating Realistic Landscapes." Available at http://www.gamasutra.com/view/feature/1648/random_scattering_.php, 2008.

Rendering

In this section we cover new techniques in the field of real-time rendering. Every new generation of game or interactive application must push the boundaries of what is possible to render and simulate in real time in order to remain competitive and engaging. The articles presented here demonstrate some of the latest advancements in real-time rendering that are being employed in the newest games and interactive rendering applications.

The first article in the rendering section is "Pre-Integrated Skin Shading," by Eric Penner and George Borshukov. This article presents an interesting and very efficient shading model for rendering realistic skin. It can be evaluated entirely in a pixel shader and does not require extra rendering passes for blurring, thus making it a very scalable skin-rendering technique.

Our next article is "Implementing Fur in Deferred Shading," by Donald Revie. The popularity of deferred shading has increased dramatically in recent years. One of the limitations of working in a deferred-rendering engine is that
techniques involving alpha blending, such as fur rendering, become difficult to implement. In this article we learn a number of tricks that enable fur to be rendered in a deferred-shading environment.

The third article in the rendering section is "Large-Scale Terrain Rendering for Outdoor Games," by Ferenc Pintér. This article presents a host of production-proven techniques that allow for large, high-quality terrains to be rendered on resource-constrained platforms such as current-generation consoles. This article provides practical tips for all areas of real-time terrain rendering, from the content-creation pipeline to final rendering.

The fourth article in this section is "Practical Morphological Antialiasing," by Jorge Jimenez, Belen Masia, Jose I. Echevarria, Fernando Navarro, and Diego Gutierrez. The authors take a new, high-quality antialiasing algorithm and demonstrate a highly optimized GPU implementation. This implementation is so efficient that it competes quite successfully with hardware-based antialiasing schemes in both performance and quality. This technique is particularly powerful because it provides a natural way to add antialiasing to a deferred-shading engine.

We conclude the section with Emil Persson's "Volume Decals" article. This is a practical technique to render surface decals without the need to generate special geometry for every decal. Instead, the GPU performs the entire projection operation. The author shows how to use volume textures to render decals on arbitrary surfaces while avoiding texture stretching and shearing artifacts.

The diversity of the rendering methods described in this section represents the wide breadth of new work being generated by the real-time rendering community. As a fan of new and clever interactive rendering algorithms, reading and editing these articles has been a great joy. I hope you will enjoy reading them and will find them as useful and relevant as I do.
—Christopher Oat

Pre-Integrated Skin Shading
Eric Penner and George Borshukov

1.1 Introduction

Rendering realistic skin has always been a challenge in computer graphics. Human observers are particularly sensitive to the appearance of faces and skin, and skin exhibits several complex visual characteristics that are difficult to capture with simple shading models. One of the defining characteristics of skin is the way light bounces around in the dermis and epidermis layers. When rendering using a simple diffuse model, the light is assumed to immediately bounce equally in all directions after striking the surface. While this is very fast to compute, it gives surfaces a very "thin" and "hard" appearance. In order to make skin look more "soft" it is necessary to take into account the way light bounces around inside a surface. This phenomenon is known as subsurface scattering, and substantial recent effort has been spent on the problem of realistic, real-time rendering with accurate subsurface scattering.

Current skin-shading techniques usually simulate subsurface scattering during rendering by either simulating light as it travels through skin, or by gathering incident light from neighboring locations. In this chapter we discuss a different approach to skin shading: rather than gathering neighboring light, we pre-integrate the effects of scattered light. Pre-integrating allows us to achieve the nonlocal effects of subsurface scattering using only locally stored information and a custom shading model. What this means is that our skin shader becomes just that: a simple pixel shader. No extra passes are required, and no blurring is required, in texture space nor screen space. Therefore, the cost of our algorithm scales directly with the number of pixels shaded, just like simple shading models such as Blinn-Phong, and it can be implemented on any hardware, with minimal programmable shading support.
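For reference, the "simple diffuse model" mentioned above is only a few lines. This sketch is illustrative code, not from the chapter; it shows the kind of purely local model whose per-pixel cost the authors aim to match.

```python
def lambert_diffuse(n, l, albedo):
    """Simple local diffuse shading: all light is assumed to bounce out
    at the exact point where it strikes the surface, with no subsurface
    transport -- the source of the "thin" and "hard" look described above.
    n and l are unit-length surface-normal and light-direction vectors.
    """
    n_dot_l = max(0.0, n[0] * l[0] + n[1] * l[1] + n[2] * l[2])
    return tuple(channel * n_dot_l for channel in albedo)
```

Everything this model reads is available at the shaded pixel, so its cost scales with the number of pixels shaded; the chapter's shader keeps that property while folding the effects of scattering into the shading terms themselves.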
1.2 Background and Previous Work

Several offline and real-time approaches have been based on an approach taken from film called texture-space diffusion (TSD). TSD stores incoming light in texture space and uses a blurring step to simulate diffusion. The first use of this technique was by [Borshukov and Lewis 03, Borshukov and Lewis 05] in the Matrix sequels. They rendered light into a texture-space map and then used a custom blur kernel to gather scattered light from all directions. Based on extensive reference to real skin, they used different blur kernels for the red, green, and blue color channels, since different wavelengths of light scatter differently through skin.

Since the texture-space diffusion approach used texture-blur operations, it was a very good fit for graphics hardware and was adopted for use in real-time rendering [Green 04, Gosselin et al. 04]. While TSD approaches achieved much more realistic results, the simple blurring operations performed in real time couldn't initially achieve the same level of quality as the expensive, original nonseparable blurs used in film.

A concept that accurately describes how light diffuses in skin and other translucent materials is known as the diffusion profile. For a highly scattering translucent material it is assumed that light scatters equally in all directions as soon as it hits the surface. A diffusion profile can be thought of as a simple plot of how much of this diffused light exits the surface as a function of the distance from the point of entry. Diffusion profiles can be calculated using measured scattering parameters via mathematical models known as dipole [Jensen et al. 01] or multipole [Donner and Jensen 05] diffusion models. The dipole model works for simpler materials, while the multipole model can simulate the effect of several
‘The dipole model works for simpler materials, while the multipole model can simulate the effect of several esults, the simple blurring operations performed in real layers, each with different scattering parameters, The work by [d'Fon and Luebke 07] sets th ent high bar in real-time skin rendering, combining the concept of fast Gaussian texture-space diffusion with the rigor of physically based diffusion profiles. Their approach uses a sum of Gaussians to approximate a multipole diffusion profile for skin, allowing very large diffusion profile to be simulated using several separable Ganssian blurs. More recent approaches have achieved marked performance improvements. For example, [Hable et al. 09] have presented an optimized texture-space blur kernel. while [Jimenez et al. 09] have applied the technique in screen space. 1.3. Pre-Integrating the Effects of Scattering We have taken a different approach to the problem of subsurface scattering in skin and have departed from texture-spsce diffusion (see Figure 1.1). Instead, we wished to see how far we could push realistic skin rendering while maintaining the honefits of a local shading model. Locel shacling models have the advantage of not requiring additional rendering passes for each object, and scale linearly with the number of pixels shaded, ‘Therefore, rather than trying to achieve subsur- 1. Pre-Integrated Skin Shading Figure 1.1. Our pres shading approach uses the same diffusion profiles as textnre-space diffusion, but uses a local shading model. Note bow light bleeds over lighting boundaries and into shadows, (Mesh and textures courtesy of XYZRGB. 
sgrated ski face scattering by gathering incoming light from nearby locations (performing an integration during runtime), we instead seek to pre-integrate the effects of sub surface scattering in skin, Pre-integration is used in many domains and simply refers to integrating a fanction in advance, such that calculations that rely on the fimction’s integral can be accelerated later. Image convolution and blurring are just a form of numerical integration. ‘The obvious caveat of pre-integration is that in order to pre-integrate a func tion, we need to know that it won't change in the future. Since the incident light on skin can conceivably be almost arbitrary, it seems as though precomputing this effect will prove difficult, especially for changing surfaces. However, by focusing only on skin rather than arbitrary materials, and choosing specifically where and what to pre-integrate, we found what we believe is a happy medium. In total, we model, pre-integrate the effect of scattering in three special ateps: on the lighting on sinall surface details, and on occluded light (shadows). By applying all of these in tandem, we achieve similar results to texture-space diffusion approaches in a completely local pixel shader, with few additional constraints ‘To understand the reasoning behind our approach, it first helps to picture a completely flat pioco of skin under uniform directional light. In this particular case, no visible scattering will occur because the incident light is the same ey- erywhere. The only three things that introduce visible scattering are changes in
