# GPU Foliage Instancing and Shaders

In a series of [videos](https://www.youtube.com/watch?v=jw00MbIJcrk) by Acerola, he discusses techniques for achieving different grass aesthetics. A comment on the [first video](https://www.youtube.com/watch?v=Y0Ko0kvwfgA) contains some really interesting ideas:

> Hi, fellow tech artist here: Overall nice look. I would add some additional effects. Grass should animate a little more in zones of related movements, and in waves. Then there is lighting. Calculate a standard fresnel effect (dot(pos-cam, viewDir) as uv in a fresnel LUT), and use the sky color as a tint where grass is seen very head-on. It will highlight the curves of the terrain. Also grass is a 3D structure in a vertical position, so calculating some light based on viewing direction (dot(pos-cam, lightDir)) would work wonders. As grass animates, it changes how it faces light, so you could also do something here, to make it change lighting as it waves. Lastly: grass is shiny, so adding some soft specularity (again use the pos-camPos as a pseudo normal) is a nice final touch.

> Add some occasional clouds to the lighting aspect of the grass, with shadows occasionally going over the grass creating darker and lighter areas of lighting. It would make the grass feel a lot more natural and pretty. The sense of depth would greatly increase. Add random shine glints to the grass and directional sunlight to make the grass have even more depth. It'd be very pretty looking.

This gave me an idea for different effects, where we could pull the local maxima and minima from terrain maps and use them to add additional highlights or shader effects. I also thought about level of detail, where certain clusters of foliage up close (like the tallest grass, most likely to be closest to the camera) could have full 3D structure, while most would be billboards, and at a certain distance everything additionally drops down to sprites.
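The dot-product tricks from that comment can be prototyped on the CPU before committing them to a shader. Below is a minimal Python sketch of the fresnel-style sky tint and the pseudo-normal lighting it describes; the function names and the 0.3/0.4/0.6 blend weights are my own illustrative choices, not taken from the videos:

```python
import math

def normalize(v):
    # Normalize a 3-vector; return the zero vector unchanged to avoid division by zero.
    m = math.sqrt(sum(c * c for c in v))
    return tuple(c / m for c in v) if m else v

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def lerp(a, b, t):
    # Component-wise linear interpolation between two color tuples.
    return tuple(x + (y - x) * t for x, y in zip(a, b))

def grass_lighting(pos, cam_pos, cam_forward, light_dir, base_color, sky_color):
    """Tint grass toward the sky color where it is viewed head-on (fresnel-like
    term), and add cheap directional light using pos - cam as a pseudo-normal,
    as the quoted comment suggests."""
    pseudo_n = normalize(tuple(p - c for p, c in zip(pos, cam_pos)))
    # Fresnel-style term: 1 when the blade lies straight along the view direction.
    fresnel = max(0.0, dot(pseudo_n, normalize(cam_forward)))
    # Directional term against the light, using the same pseudo-normal.
    diffuse = max(0.0, dot(pseudo_n, normalize(light_dir)))
    lit = tuple(c * (0.4 + 0.6 * diffuse) for c in base_color)  # arbitrary ambient/diffuse split
    return lerp(lit, sky_color, 0.3 * fresnel)  # arbitrary tint strength
```

In a real vertex or fragment shader these would be `dot`/`lerp` intrinsics with the fresnel term feeding a LUT lookup instead of a fixed blend weight.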
This is pretty easy with a fixed camera, but with a moving camera the pop-in where the transitions between LODs occur needs to be smoothed out. Another idea: because skewing a 2D texture thins the pixels, instead of the top two vertices moving equal distances, the leeward vertex of the foliage would move slightly farther to maintain line thickness and reduce the appearance of stretching in the texture. As an extension of that, for closer LODs, the textures themselves could be generated with different characteristics so that they always maintain proportions and detail.

> You actually don't even need the whole mesh past a few meters. Instead of swapping it out for an entire lower mesh count model, you can just plug in the grass tops. On top of this, past a certain distance you don't even have to apply a skew; as you can only see the tips, just revert back to a simple lateral translation.

In the [third video](https://www.youtube.com/watch?v=PNvlqsXdQic) in the series he uses chunking to reduce the need for the GPU to worry about culling. As for chunking, I'm also thinking of a video I saw the other day that mentioned pathfinding in Runescape and how it doesn't use squares but arbitrary shapes to better fit the dynamic nature of the game. If chunks aren't square (or are otherwise deformed from squares), then we can hide them more easily and apply different effects to each chunk, like colors or foliage types, and handle them all together without it being super obvious.

> I'd recommend checking out the Ghost of Tsushima grass video on GDC. They go pretty in-depth on how they did their grass system, which uses the chunking not only for performance but also to allow for procedural and designed variation (chunks of grass either using different colors and meshes or just being deformed in different directions to mimic variety irl), in addition to allowing easy integration with their wind sim and guiding system.
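The leeward-vertex idea above can be sketched in 2D. When a quad of height `h` is sheared sideways by `s`, the blade's long axis stretches from `h` to `sqrt(h^2 + s^2)`, thinning its texels by the inverse factor; widening the top edge by that same factor should roughly cancel the thinning. This is a hypothetical sketch under that assumption, with my own function name and compensation formula:

```python
import math

def skew_tops(left_x, right_x, height, skew):
    """Skew the top edge of a grass quad sideways by `skew` world units.
    Rather than translating both top vertices equally, move the leeward
    (downwind) vertex slightly farther so the top edge widens enough to
    offset the texel thinning caused by the shear."""
    width = right_x - left_x
    # How much the blade's long axis stretches under the shear.
    stretch = math.sqrt(height * height + skew * skew) / height
    extra = width * (stretch - 1.0)
    if skew >= 0:
        # Leaning right: the right vertex is leeward and travels farther.
        return (left_x + skew, right_x + skew + extra)
    # Leaning left: the left vertex is leeward.
    return (left_x + skew - extra, right_x + skew)
```

For small lean angles `extra` is tiny, so the correction only becomes visible at the strong skews where the thinning would be noticeable anyway.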
Really interesting video that would be a great follow-up for anyone interested in this!

> You can use a division structure (think 1 chunk has 4 sub-chunks that each have 4 sub-chunks), then frustum cull the chunks instead of individual blades for way faster culling.

> I see that other people have already mentioned using billboards at distance, but there are some other, intermediate optimizations that I think you could use as well. Just as a thought experiment, I know you're no longer working on this.
>
> - The GDC talk on the grass in Ghost of Tsushima widens the grass at further distances while drawing fewer blades. They make the blades twice as wide for half as many blades. While they generate their grass blades in the geometry shader, this should still be possible with the 3D models you are using in the vertex shader.
> - The same GDC talk mentions how they use a grass model that forms a V, which gives them more coverage with fewer blades. It will make the grass look different, though.
>
> I read through the shader code for the grass and there are a number of smaller optimizations that could be employed. I don't know if they would actually do any good, and the compiler may even do them for you, but they could be worth trying out just to see.
>
> - You use `v.uv.y * v.uv.y` three times in the vertex shader, twice in `_Scale * v.uv.y * v.uv.y`. If you put them into a temporary variable, the GPU would keep them in a register rather than computing them every time they're needed.
> - `RotateAroundXInDegrees` and `RotateAroundYInDegrees` are fairly expensive functions, but much of that stuff can either be pre-computed or computed once (per vertex). `RotateAroundXInDegrees` is only called once, with a constant value passed into "Degrees", so you can store `m` as a constant since it never, ever changes. `RotateAroundYInDegrees` is similar: the `m` for `idHash * 180.0` only needs to be computed once per vertex.
> - You can inline the `RotateAround_InDegrees` functions to avoid the overhead from function calls.
> - Floating-point arithmetic is non-associative, which means optimizations that may seem obvious won't be performed by the compiler, because you wouldn't get exactly the same answer. `UNITY_PI / 180.0` should just be a constant, but in order to tell the compiler that, you have to put it in parentheses, or just `#define` a constant.
> - If you put `positionBuffer[instanceID].displacement` into a temporary variable, you'll be telling the compiler to load the value into a register rather than reading from cache, or worse, VRAM, every time it needs it.
> - There are likely some code motion optimizations to be made, particularly in concert with the above, but I don't know enough about the hardware or compiler to really give many good recommendations. I will say that putting all of the `positionBuffer[instanceID]` accesses next to each other improves temporal locality and may have an effect on performance.
>
> As I mentioned, a lot of this stuff may not even help. It's all very much "try it and see". The compiler may already do most/all of this, and so the end result is just making your code impossible to maintain to get an extra 3 fps. There's also Amdahl's law: I may be focusing all of this attention on parts of the procedure that take up 1/10 of the actual execution time.

Acerola made a video about why Pokemon Legends Arceus looks bad and provides some good tips on how to improve the look of the [grass](https://youtu.be/1bu8ePFm-wQ?t=309) in particular. However, his assertion that there is no way to fix the repetitive pattern on the water without increasing texture sizes is incorrect, as demonstrated by Jasper's video on Super Mario Galaxy 2 (below).

# Water Textures

Included here because it is very similar to foliage shaders.
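Stochastic texture sampling comes up below as the fix for distant tiling, so here is a rough CPU-side illustration of the core idea: hash each virtual tile of UV space to a random offset so repeats of a small texture stop lining up. Real implementations also blend between neighboring tiles to hide the seams this creates; that part is omitted here, and all names and hash constants are arbitrary:

```python
def tile_hash(ix, iy):
    # Cheap deterministic integer hash mapped to [0, 1); constants are arbitrary.
    h = (ix * 374761393 + iy * 668265263) & 0xFFFFFFFF
    h = ((h ^ (h >> 13)) * 1274126177) & 0xFFFFFFFF
    return (h & 0xFFFF) / 0x10000

def stochastic_sample(texture, u, v, tiles=4):
    """Sample `texture` (a 2D list indexed [y][x], UVs in [0, 1)) with a
    per-tile random UV offset, so a small texture no longer shows an
    obvious repeating pattern when tiled over a large surface."""
    w, h = len(texture[0]), len(texture)
    # Which virtual tile this UV falls in.
    tx, ty = int(u * tiles), int(v * tiles)
    # Per-tile offset: the same tile always gets the same offset (stable frame to frame).
    ou, ov = tile_hash(tx, ty), tile_hash(ty, tx)
    x = int((u + ou) * w) % w
    y = int((v + ov) * h) % h
    return texture[y][x]
```

In a shader this would be a few extra texture fetches plus a blend across tile borders, which matches the quoted comment's note that it costs "a few extra texture samples".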
> The tiling of the water at a distance is solvable by using stochastic texture sampling, which is a pretty straightforward thing to do, though it requires a few extra texture samples. In this case, I think it would have been worth it.

There is a quick mention and demonstration of the Xbox version of Halo in [this video](https://www.youtube.com/watch?v=jSjnxJ6HWmw&t=757s) and how its water looks better than the remaster. And in [this video](https://www.youtube.com/watch?v=8rCRsOLiO7k) by Jasper, he breathlessly explains how Super Mario Galaxy 2's water textures still look great over 10 years later. It is a very interesting technique because it takes a handful of low-res textures and produces results that are sharp at much higher resolutions due to displacement and layer blending math. Something that particularly caught my eye was taking two kind-of-crappy static noise textures and, by scrolling and blending them with different thresholds, generating really nice looking water surface effects with zero simulation or animation effort.

# Fog

Acerola has a [video](https://www.youtube.com/watch?v=EFt_lLVDeRo) talking about Silent Hill's fog and how to implement it. He demonstrates an issue with [[Unity Engine]] (and possibly other engines) where depth maps are not collected from models without shader maps.

## Idea 1

I had the idea of making a type of textured fog by plotting a simple noise texture vs location coordinates, and then drawing it in based on the depth map. This could be implemented in many ways. One probably very intensive way would be to take the current camera's position and then iteratively apply the fog color to a new buffer at various sampling distances between the camera and the depth map range by vector.
This wouldn't be terribly different than ray tracing through the whole depth and accumulating fogginess as the ray goes, but it has a few possible advantages:

- An entire layer of fog could be sampled as a plane at a given distance from the camera and then cut out iteratively, or alternatively just
- The fog can be smoothed and blurred at each step (so starting with the farthest layer would be ideal)
- There will be no aliasing/graininess like a single-pass ray tracer
- The fog buffer can be applied with varying intensities, effects, blend styles, colors, etc., or used as a new type of depth map for other things to use, like a displacement map

The code might look something like this for a per-pixel calculation:

```elixir
depth_map: scene.get_depth_map
camera: scene.player_camera
fog: [
  texture: noisemap.generate/seamless (100 100)
  tint: color://rgba/0xFFFF
  max: 512 #;; range at which fog becomes opaque, or otherwise the limit of fog's effect
  min: 5 #;; range before fog begins having an effect; everything up until this point is perfectly clear
  samples: 10
  def/fn increment [] [ (max - min) / samples ]
  def/fn intensity [depth] [
    range: 1.0 / <| max - min #;; intended to convert the max-min into a float between 0 and 1
    step: range / samples #;; should give us the amount that each sample should impact the output
    depth * step
  ]
  def/fn map [coordinate] [
    x: coordinate.x % texture.width
    y: coordinate.z / 4 + coordinate.y % texture.height
    tint * texture.at (x y)
  ]
]

def/fn depth_to_distance [depth depth_map] [
  #;; depth_map.range needs to give us the actual maximum distance in units that can be represented by the map
  step: depth_map.range / depth_map.color_depth
  depth * step
]

def/fn xy_to_angle [pixel fov res] [
  #;; this function needs to return an approximation of the angle that pixel is from the center of the camera
  #;; there are all kinds of possible issues with this implementation, but this is the principle of the thing
  center: res / 2
  hstep: fov / res.width
  vstep: fov / res.height #;; assumes square fov
  hangle: center.x - pixel.x * hstep
  vangle: center.y - pixel.y * vstep
  Vector2.new hangle vangle
]

limited_depth: depth_map.clamp fog/max fog/min
closest: limited_depth.min #;; darkest pixel is closest
farthest: limited_depth.max #;; brightest pixel is farthest
fog_buffer: limited_depth.pixels.map [
  angle: xy_to_angle pixel camera.field_of_view camera.resolution
  #;; this may give weird results because the depth will not be locked into the sample depths
  #;; specified by the fog data; what might need to happen is that depths are quantized to their
  #;; nearest foggable sample instead
  depth: pixel.luminance
  accumulate [
    distance: depth_to_distance depth limited_depth
    vector: Vector3.new/from [distance: distance yaw: angle.x pitch: angle.y]
    coordinate: camera.position + vector
    #;; done with math to reduce branching
    limit: depth == farthest |> .to Float #;; map true to 1.0 and false to 0.0
    intensity: max limit fog.intensity depth
    fogginess: fog.map coordinate |> * intensity
    depth: depth - fog.increment
    either depth > max closest fog.min [ break fogginess ] [ fogginess ]
  ]
]
```

# References

- https://godotshaders.com/shader/wandering-clipmap-stylized-grass/