# Raytracing
**Rasterization** isn't the only way to render a 3D scene into a 2D image. An alternative is **raytracing**. At a high level, the comparison below shows how rasterization and raytracing differ. Assume there are
- $N$ objects, each occupying $d\sim10^1$ pixels
- $p\sim10^6$ pixels in total on the screen
**Rasterization**: $O(N*d)$. Advantageous when geometry is simple, i.e. $d\sim10^1 \ll p\sim10^6$
```
for each object:
for each pixel occupied by the object:
render pixel using vertex and/or fragment shader
```
**Raytracing**: $O(p*N)$, can be optimized to $O(p\log N)$. Advantageous when geometry is complex, i.e. $N$ is large s.t. $\log N \lll N$
```
for each pixel/ray:
for each object intercepting the ray:
render pixel similar to fragment shader
```
The main advantage of raytracing over rasterization is revealed when realistic rendering of a complex scene is desired, i.e. when we want the rendering pipeline to compute shadows, transparency, complex illumination, realistic materials, etc.
More detailed pseudocode for raytracing looks something like this:
```java
Image Raytrace(Camera cam, Scene scene, int width, int height) {
    Image image = new Image(width, height);
    for (int i = 0; i < height; i++) {
        for (int j = 0; j < width; j++) {
            Ray ray = RayThruPixel(cam, i, j);        // Ray casting
            Intersection hit = Intersect(ray, scene); // Ray-obj intersection
            image[i][j] = FindColor(hit);
        }
    }
    return image;
}
```
Buckle up! We need to recall some fond memories about linear algebra, namely [[0_background#Coordinate Frames|coordinate frame]] and [[2_viewing#3. Perspective Projection: `gluPerspective()`|perspective projection]].
## Ray-Object Intersection
We need the following information for each ray:
- For **primary rays**: Point of *intersection*, material, normals, texture coordinates
- For **shadow rays**: *Intersection* vs. no intersection
The ray-object intersection problem that concerns us: if we parameterize the ray using $t\in[t_0,t_1]$, where is the first intersection of the ray with a surface in the scene?
### Ray-sphere intersection
We've established that a ray takes the form $\mathbf{p}(t)=\mathbf{e}+t\mathbf{d}$, where $\mathbf{e},\mathbf{d}$ are the eye location and ray direction respectively. Recall that we can define a circle using the [[0_background#Implicit circle equation|implicit curve]] equation $\Vert\mathbf{p}-\mathbf{c}\Vert^2-r^2=0$ (or define a surface using an implicit surface function). Note that $\mathbf{p}$ can now be parameterized by $t$ using the ray function $\mathbf{p}(t)$, so we have the following. (If the $\Vert\bullet\Vert^2$ notation (square of the [Euclidean norm](https://en.wikipedia.org/wiki/Norm_(mathematics)#Euclidean_norm) or $L^2$ norm) seems cryptic, review the note on [[0_background#Implicit Line/Curve|implicit curve]] and convince yourself that $\mathbf{v}\cdot\mathbf{v}=\Vert \mathbf{v}\Vert^2$.)
$\begin{align*}
&f(t)=\Vert\mathbf{p}(t)-\mathbf{c}\Vert^2-r^2=0 \;\equiv\; \Vert\mathbf{e}+t\mathbf{d}-\mathbf{c}\Vert^2-r^2=0 \\
\underset{\text{rearrange}}{\rightarrow}&(\mathbf{d}\cdot\mathbf{d})t^2+2\mathbf{d}\cdot(\mathbf{e-c})t+(\mathbf{e-c})\cdot(\mathbf{e-c})-r^2=0 \\
\underset{\text{Solve for t}}{\rightarrow}& t=\frac{-\mathbf{d} \cdot(\mathbf{e}-\mathbf{c}) \pm \sqrt{(\mathbf{d} \cdot(\mathbf{e}-\mathbf{c}))^2-(\mathbf{d} \cdot \mathbf{d})\left((\mathbf{e}-\mathbf{c}) \cdot(\mathbf{e}-\mathbf{c})-r^2\right)}}{\mathbf{d} \cdot \mathbf{d}}
\end{align*}$
Now we can see that $f(t)=0$ is a [quadratic equation](https://en.wikipedia.org/wiki/Quadratic_equation) w.r.t. $t$, whose roots we can characterize by evaluating the [discriminant](https://en.wikipedia.org/wiki/Quadratic_equation#Discriminant) $\Delta=b^2-4ac$. The root cases correspond exactly to 4 conditions regarding ray-sphere intersection:
![[Pasted image 20221227150538.png|300]]
1. 2 real positive roots ($\Delta>0$): There are 2 ray-sphere intersections, we pick the intersection that's closest to eye, i.e. the smaller root.
2. 2 roots are the same ($\Delta=0$): Ray is tangent to the sphere. (Need $t>0$)
3. 1 positive root, 1 negative root ($\Delta>0$): Ray originates inside the sphere
4. Both roots are complex ($\Delta<0$): No intersection.
Once we've found $t$ and the intersection point $\mathbf{p}(t)$, we can calculate the surface normal at the intersection point simply by normalizing the vector from the sphere's center $\mathbf{c}$ to the intersection point $\mathbf{p}$:
$\mathbf{n}=\frac{\mathbf{p-c}}{\Vert\mathbf{p-c}\Vert}$
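The derivation and case analysis above can be condensed into a small routine. A minimal sketch in Python (the function name and conventions are mine, not from the course):

```python
import math

def ray_sphere_intersect(e, d, c, r):
    """Solve ||e + t d - c||^2 = r^2 for the smallest positive t.

    e: eye/origin, d: ray direction, c: sphere center (3-tuples); r: radius.
    Returns the nearest t > 0, or None if the ray misses the sphere."""
    ec = [e[k] - c[k] for k in range(3)]            # e - c
    a = sum(d[k] * d[k] for k in range(3))          # d . d
    b = 2.0 * sum(d[k] * ec[k] for k in range(3))   # 2 d . (e - c)
    cc = sum(ec[k] * ec[k] for k in range(3)) - r * r
    disc = b * b - 4.0 * a * cc                     # discriminant
    if disc < 0:
        return None                                 # complex roots: no intersection
    sq = math.sqrt(disc)
    t0 = (-b - sq) / (2.0 * a)                      # smaller root
    t1 = (-b + sq) / (2.0 * a)                      # larger root
    if t0 > 0:
        return t0                                   # both hits in front: pick nearest
    if t1 > 0:
        return t1                                   # ray originates inside the sphere
    return None                                     # sphere is behind the ray
```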
### Ray-triangle intersection
The triangle spans a 2D plane which is embedded in 3D space. Therefore, determining whether a ray and a triangle intersect can be done in 2 steps: 1) find the intersection of the ray with the 2D plane spanned by the triangle (if the ray isn't parallel to the plane), 2) determine whether the intersection lies within the triangle. Recall that [[0_background#Barycentric Coordinates|barycentric coordinates]] are particularly suited for the second task. Also review [[0_background#Vector Form Line and Plane|the plane's vector form definition]].
**Step 1**: (Using [[0_background#Line and Plane|plane's vector form definition]]) We define the plane spanned by $\triangle\mathbf{abc}$ and find the ray-plane intersection. As before, the ray takes the form $\mathbf{p}(t)=\mathbf{e}+t\mathbf{d}$. We define the plane's normal $\mathbf{n}$ using two edges of the triangle.
$\begin{align*}
&\mathbf{n}\coloneqq\text{normalize}(\mathbf{(c-a)}\times\mathbf{(b-a)})\\
\text{Ray: }&\mathbf{p}(t)=\mathbf{e}+t\mathbf{d}\\
\text{Plane: }&\mathbf{n}\cdot(\mathbf{p}(t)) = \mathbf{n\cdot a} \text{ where } \mathbf{n,a} \text{ are known}\\\\
\text{Solve for t: }& t=\frac{\mathbf{n}\cdot(\mathbf{a-e})}{\mathbf{n\cdot d}}
\end{align*}$
Note that $\mathbf{n\cdot d}=0$ iff the ray is parallel to the plane, in which case the ray-plane intersection is undefined.
**Step 2**: (Using [[0_background#Barycentric Coordinates|barycentric coordinates]]) Determine whether $\mathbf{p}(t)$ lies in $\triangle\mathbf{abc}$. We start by computing the barycentric coordinates $(\beta,\gamma)$ of $\mathbf{p}(t)$ w.r.t. $\triangle\mathbf{abc}$.
$\begin{align*}
&\mathbf{p}(t)=\mathbf{a}+\beta\mathbf{(b-a)}+\gamma\mathbf{(c-a)}\\
&\mathbf{p}(t)\in\triangle\mathbf{abc} \text{ iff } \alpha,\beta,\gamma\in(0,1), \text{ where } \alpha=1-\beta-\gamma
\end{align*}$
The above two steps can be condensed into one: solve for $t,\beta,\gamma$ from the following vector-valued equation using [Cramer's rule](https://en.wikipedia.org/wiki/Cramer%27s_rule):
$\begin{align*}
&\mathbf{e}+t\mathbf{d}=\mathbf{a}+\beta\mathbf{(b-a)}+\gamma\mathbf{(c-a)}\\
\equiv &\left[\begin{array}{lll}
x_a-x_b & x_a-x_c & x_d \\
y_a-y_b & y_a-y_c & y_d \\
z_a-z_b & z_a-z_c & z_d
\end{array}\right]\left[\begin{array}{l}
\beta \\
\gamma \\
t
\end{array}\right]=\left[\begin{array}{l}
x_a-x_e \\
y_a-y_e \\
z_a-z_e
\end{array}\right]
\end{align*}$
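Solving this $3\times 3$ system with Cramer's rule can be sketched as follows (function and helper names are mine; the inside-triangle test uses $\beta,\gamma\ge 0$ and $\beta+\gamma\le 1$):

```python
def ray_triangle_intersect(e, d, a, b, c):
    """Solve the 3x3 system for (beta, gamma, t) via Cramer's rule.

    e: eye, d: ray direction, a/b/c: triangle vertices (3-tuples).
    Returns t if the hit is inside triangle abc and in front of the eye, else None."""
    col1 = [a[k] - b[k] for k in range(3)]   # first matrix column:  a - b
    col2 = [a[k] - c[k] for k in range(3)]   # second matrix column: a - c
    col3 = [d[k] for k in range(3)]          # third matrix column:  d
    rhs  = [a[k] - e[k] for k in range(3)]   # right-hand side:      a - e

    def det3(u, v, w):
        # determinant of the 3x3 matrix with columns u, v, w
        return (u[0] * (v[1] * w[2] - v[2] * w[1])
              - v[0] * (u[1] * w[2] - u[2] * w[1])
              + w[0] * (u[1] * v[2] - u[2] * v[1]))

    M = det3(col1, col2, col3)
    if abs(M) < 1e-12:
        return None                          # ray (nearly) parallel to the plane
    beta  = det3(rhs, col2, col3) / M        # replace column 1 by rhs
    gamma = det3(col1, rhs, col3) / M        # replace column 2 by rhs
    t     = det3(col1, col2, rhs) / M        # replace column 3 by rhs
    if t > 0 and beta >= 0 and gamma >= 0 and beta + gamma <= 1:
        return t
    return None
```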
## Raytracing for Transformed Object
See [note](https://cseweb.ucsd.edu/~alchern/teaching/cse167_2021/8-2RayTracingPart2Whiteboard.pdf) by Prof. Albert Chern.
We have an optimized ray-sphere test on simple geometry primitives, but what if we want to ray trace an ellipsoid? Solution: Ellipsoids are transformed spheres, so apply the inverse transform to the ray, and use ray-sphere intersection.
![[7_raytracing 2023-02-28 11.31.46.excalidraw.svg|400]]
We can establish that $M$ is a transformation that maps the **canonical** (i.e. model) frame (where ray-shape intersection is easily calculated) to the **world** frame. Note that we will apply the trick of [[1_transformations#3.3 Transforming Normals|transforming normals]].
$\begin{align*}
\begin{array}{ll}
\text{Ray in world frame: }&\mathbf{p}=\mathbf{p_0}+t \mathbf{p_1}\\
\text{Shape in world frame: }&\mathbf{p}=M \mathbf{x}\\\\
\text{Ray in canonical frame: } & \mathbf{p}'=M^{-1}\mathbf{p_0}+tM^{-1}\mathbf{p_1}\\
\text{Shape in canonical frame: }& \mathbf{p}'=M^{-1}M \mathbf{x}=\mathbf{x}\\\\
\text{Intersection in world frame: } & \boxed{\mathbf{p}=M \mathbf{p'}}\\
\text{Normal in world frame: } &\boxed{\mathbf{n}=M^{-T}\mathbf{n}'}
\end{array}
\end{align*}$
Finally, the distance from the origin to intersection point in the world frame is given by $\Vert \mathbf{p} -\mathbf{p}_0\Vert$.
#note $\mathbf{p_0}=[p_{0x},\;p_{0y},\;p_{0z},\;1]^T,\; \mathbf{p_1}=[p_{1x},\;p_{1y},\;p_{1z},\;0]^T$. The ray direction $\mathbf{p}_1$ has its homogeneous coordinate set to $0$ so that translations have no effect on it.
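The whole trick can be sketched as a wrapper around a canonical-frame intersection test. In this sketch, `canonical_intersect` is a hypothetical callback, and `M`, `Minv` are assumed to be 4x4 row-major lists (the caller supplies the inverse rather than computing it here):

```python
def intersect_transformed(p0, p1, M, Minv, canonical_intersect):
    """Intersect a world-frame ray with a transformed object.

    p0, p1: homogeneous ray origin (w = 1) and direction (w = 0);
    M: model matrix, Minv: its inverse (4x4 row-major lists);
    canonical_intersect(q0, q1) -> (point, normal) in the canonical frame, or None."""
    def matvec(A, x):
        return [sum(A[i][j] * x[j] for j in range(4)) for i in range(4)]

    # Ray in canonical frame: p' = M^{-1} p0 + t M^{-1} p1
    q0, q1 = matvec(Minv, p0), matvec(Minv, p1)
    hit = canonical_intersect(q0, q1)
    if hit is None:
        return None
    pc, nc = hit
    p = matvec(M, pc)                                   # intersection in world frame: p = M p'
    MinvT = [[Minv[j][i] for j in range(4)] for i in range(4)]
    n = matvec(MinvT, nc)                               # normal in world frame: n = M^{-T} n'
    length = (n[0]**2 + n[1]**2 + n[2]**2) ** 0.5
    n = [n[0] / length, n[1] / length, n[2] / length, 0.0]
    dist = sum((p[k] - p0[k])**2 for k in range(3)) ** 0.5  # ||p - p0||
    return p, n, dist
```

As a sanity check, an ellipsoid obtained by scaling the unit sphere by 2 along $x$ maps a hit at canonical point $(1,0,0)$ to world point $(2,0,0)$, while the normal stays $(1,0,0)$ after renormalization.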
## Acceleration: Grids
- Simplest acceleration, for example 5x5x5 grid
- For each grid cell, store overlapping triangles
- March ray along grid (need to be careful with this), test against each triangle in grid cell
- More sophisticated: kd-tree, octree, BSP-tree
- Or use (hierarchical) bounding boxes
- Try to implement some acceleration in HW 4
[slides](https://online.ucsd.edu/assets/courseware/v1/a6c172b8a5f91b0eaaaa1ba28dd0753b/asset-v1:CSE+167+1T2022+type@asset+block/slides_raytrace1.pdf)
## Ray Casting Details
### Ray through pixel
Given the 3D camera location together with the 2D pixel location, we want to find the corresponding origin and direction of the ray. We need to recall how the coordinate frame in [[1_transformations#4.2 `gluLookAt`: modeview transformation matrix|gluLookAt]] is constructed.
![[7_raytracing 2023-03-02 16.33.18.excalidraw.svg|400]]
To set up the following derivation, assume:
- $(\text{height}, \text{width})$, in units of pixels, describe the size of the screen
- $\mathbf{p}_0$ is the eye/camera location
- $(c_u,c_v)$ is the 2D point on the screen that the eye/camera's look-at vector passes through (the screen center)
- $-w$ is the camera's look-at direction
We can make the following observations:
$\begin{align*}
\begin{array}{l|ll}
&\text{Horizontal}&\text{Vertical}\\
\hline
\text{Screen center coord}&c_u=\text{width}/2&c_v=\text{height}/2\\
\text{Normalized dist to center} &\Delta_u=(j-c_u)/c_u&\Delta_v =(c_v-i)/c_v\\
\text{Scaling factor} &\alpha=\tan(\frac{\text{fovx}}{2})\Delta_u&\beta=\tan(\frac{\text{fovy}}{2})\Delta_v
\end{array}
\end{align*}$
Finally the ray can be expressed as:
$\begin{align*}
\boxed{\mathbf{p}=\mathbf{p_0}+t\frac{\alpha u+\beta v-w}{\Vert \alpha u+\beta v-w \Vert_2}, \;\text{where} \begin{cases}
\alpha=\tan(\frac{\text{fovx}}{2})\frac{j-\text{width/2}}{\text{width}/2}\\
\beta=\tan(\frac{\text{fovy}}{2})\frac{\text{height/2}-i}{\text{height}/2}
\end{cases}}
\end{align*}$
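The boxed formula translates directly into code. A sketch, assuming `u`, `v`, `w` are the camera-frame unit vectors, fov angles are in radians, and `i`/`j` are the row/column pixel indices (names are mine):

```python
import math

def ray_thru_pixel(p0, u, v, w, fovx, fovy, width, height, i, j):
    """Ray through pixel (i, j): origin p0, direction normalize(alpha*u + beta*v - w).

    p0: eye location; u, v, w: camera-frame unit vectors (w points away from
    the view direction); fovx, fovy: field of view in radians."""
    alpha = math.tan(fovx / 2) * (j - width / 2) / (width / 2)
    beta  = math.tan(fovy / 2) * (height / 2 - i) / (height / 2)
    d = [alpha * u[k] + beta * v[k] - w[k] for k in range(3)]
    norm = math.sqrt(sum(x * x for x in d))   # normalize the direction
    d = [x / norm for x in d]
    return p0, d
```

For the center pixel, $\alpha=\beta=0$ and the ray direction reduces to $-w$, as expected.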
### Shadows, shading, recursive ray-tracing
**Numerical inaccuracy** may cause the computed intersection to lie slightly below the surface, causing the surface to incorrectly shadow itself
- Solution: Move the intersection point a little towards the light source before shooting shadow ray.
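A minimal sketch of that offset (the epsilon value here is a scene-scale-dependent tuning constant of my choosing, not a course-prescribed one):

```python
EPS = 1e-4  # assumed tuning constant; depends on scene scale

def shadow_ray(hit_point, light_pos):
    """Build a shadow ray whose origin is nudged toward the light,
    so the surface does not incorrectly shadow itself."""
    d = [light_pos[k] - hit_point[k] for k in range(3)]
    norm = sum(x * x for x in d) ** 0.5
    d = [x / norm for x in d]                          # unit direction to light
    origin = [hit_point[k] + EPS * d[k] for k in range(3)]
    return origin, d
```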
Otherwise, **shading** is done similarly as OpenGL:
- Ambient: r,g,b
- Attenuation constant: linear, quadratic. $L=\frac{L_0 }{\text{const}+\text{linear}*d+\text{quadratic}*d^2}$
- Per-light model params:
- Directional light: Direction + r,g,b
- Point light: Location + r,g,b
We have the following **material properties** (same as OpenGL):
- Diffuse reflectance: `r,g,b`
- Specular reflectance: `r,g,b`
- Emission: `r,g,b`
- Shininess: `s`
**Recursive ray tracing** calls `GetColor()` and `RayTrace()` recursively. For each pixel:
1. Trace primary eye ray, find intersection
2. Trace secondary **[shadow rays](https://www.scratchapixel.com/lessons/3d-basic-rendering/introduction-to-shading/ligth-and-shadows)** to all lights. (`color = visible ? illumination model : 0`)
3. Trace **[reflected ray](https://www.scratchapixel.com/lessons/3d-basic-rendering/introduction-to-shading/reflection-refraction-fresnel)**. (`color += reflectivity * color of reflected ray`)
Putting it all together, we have:
$\begin{align*}
\boxed{I=K_sI_R+K_tI_t+K_a+K_e+\sum\limits_{i=1}^nV_iL_i \bigg(K_d\max(I_i\cdot n,0)+K_s\max(h_i\cdot n,0)^s \bigg)}
\end{align*}$
- $K_s, K_t$: Recursive specularities and transmission
- $I_R$: Secondary rays for mirror reflection
- $I_t$: Secondary rays for transmission
- $K_a, K_e$: Global ambient, emission term
- $K_d, K_s$: Per-light diffuse, specular term
- $V_i, L_i$: Per-light visibility and intensity. $V_i=0$ if the path to the light source is blocked
- $I_i$: Per-light, light direction
- $h_i$: Per-light half angle
- $s$: Shininess
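The recursion itself can be sketched as follows. Here `scene.intersect`, `scene.make_ray`, `shade_local`, and the fields on `hit` are all hypothetical names (only the control flow is the point); the `reflect` helper is the standard mirror formula $r = d - 2(d\cdot n)n$:

```python
MAX_DEPTH = 5  # assumed recursion cutoff, not course-prescribed

def reflect(d, n):
    """Mirror direction r = d - 2 (d . n) n, with n assumed unit length."""
    dn = sum(d[k] * n[k] for k in range(3))
    return tuple(d[k] - 2.0 * dn * n[k] for k in range(3))

def trace(ray, scene, depth=0):
    """Skeleton of recursive ray tracing: local shading plus mirror reflection."""
    if depth >= MAX_DEPTH:
        return (0.0, 0.0, 0.0)                     # cut off the recursion
    hit = scene.intersect(ray)
    if hit is None:
        return (0.0, 0.0, 0.0)                     # background color
    color = shade_local(hit, scene)                # ambient/emission + shadow-ray terms
    if any(k > 0 for k in hit.specular):           # recurse only for reflective materials
        r = reflect(ray.direction, hit.normal)
        secondary = scene.make_ray(hit.point, r)   # offset origin to avoid self-hits
        reflected = trace(secondary, scene, depth + 1)
        color = tuple(c + k * rc for c, k, rc in zip(color, hit.specular, reflected))
    return color
```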
### (optional) An alternative view of ray-casting
See [note](https://cseweb.ucsd.edu/~alchern/teaching/cse167_2021/8-1RayTracingWhiteboard.pdf) by Prof. Albert Chern.
![[Pasted image 20221226141525.png]]
To specify a ray, we need 2 quantities: 1) Origin, 2) direction. So an example ray class would look like:
```cpp
class Ray{
Vec3 o // origin
Vec3 d // direction
}
```
**Problem setup**: We use $i$ to index image-plane pixels in the $v$ (vertical) direction, and $j$ to index pixels in the $u$ (horizontal) direction. The image plane is unit distance (i.e. 1) away from the eye. For a pixel $(i,j)$, we want to compute the direction $d\in \mathbb{R}^3$ of the ray originating at the eye ($e\in \mathbb{R}^3$) and passing through the image plane at pixel $(i,j)$. We use orthonormal vectors $\{u,v,w\}\subset\mathbb{R}^3$ to construct a right-handed coordinate system located at the eye $e$, with the view direction being $-w$ and the camera up direction being $v$.
$\begin{align*}
\text{ray}&=\mathbf{e}+t \mathbf{d}, \quad\text{where }\begin{cases}\mathbf{e}=\text{eye/camera location}\\\mathbf{d}=\text{normalize}(\alpha\mathbf{u}+\beta\mathbf{v}-\mathbf{w})\end{cases}
\end{align*}$
We can see that $(\alpha, \beta)$ are the coordinates of the $(i,j)$ pixel center on the image plane, expressed in the $\{u,v,w\}$ coordinate system. What exactly should they be?
$\begin{align*}
\alpha &= \frac{(j+0.5)-\text{width}/2}{\text{width}/2}\cdot\tan(\theta_x)\\
\beta &= \frac{(i+0.5)-\text{height}/2}{\text{height}/2}\cdot\tan(\theta_y)\\
\text{where }&\begin{cases}\text{width}=r-l,\; \text{height}=t-b \\
\theta_x=\frac{\text{fovx}}{2},\; \theta_y=\frac{\text{fovy}}{2}
\end{cases}
\end{align*}$
Alternatively, we can define $\alpha, \beta$ using $l,r;\;t,b$ explicitly (where $n_x,n_y$ are the number of pixels in the horizontal and vertical directions):
$\begin{align*}
\alpha &= l + (r-l)(j+0.5)/n_x \\
\beta &= b + (t-b)(i+0.5)/n_y
\end{align*}$
## (pseudo) Code Examples
### Ray-scene intersection
A scene is simply a collection of objects; therefore the ray-scene intersection is simply the intersection of the ray with the nearest object.
```python
def intersection(ray, scene):
    min_dist = float('inf')
    hit_object = None
    for obj in scene:
        t = intersect(ray, obj)
        if t > 0 and t < min_dist:  # keep the nearest hit in front of the origin
            min_dist = t
            hit_object = obj
    return intersectionInfo(min_dist, hit_object)
```
### Color at intersection point
```python
def Raytrace(camera, scene, width: int, height: int):
    image = Image(width, height)
    for i in range(height):
        for j in range(width):
            ray = rayThruPixel(camera, i, j)
            hit = intersection(ray, scene)
            image[i, j] = FindColor(hit)
    return image
```
## Acceleration structures
### Bounding volume hierarchies (BVH)
```start-multi-column
ID: ID_c7mh
Number of Columns: 2
Largest Column: standard
```
![[Pasted image 20230307093901.png|400]]
--- column-end ---
![[Pasted image 20230307094027.png|260]]
=== end-multi-column
Key: Testing intersection against a bounding box is much cheaper than testing against the enclosed geometry, e.g., a sphere.
### Grids: uniform spatial subdivision, binary space partitioning
![[7_raytracing 2023-03-07 09.42.24.excalidraw|600]]
**Problem**: Too little benefit if the grid is too coarse, too much cost if the grid is too fine.
**Solution**: ***Octree traversal***. Cons: more complex neighbor finding.
- Screen space coherence
Intersecting with vertical grid lines (at $t_0,\;t_0+t_1,\; t_0+2t_1,\;\cdots$), where the lines sit at $x=x_0+n\alpha$:
$\begin{align*}
p_{0_x}+tp_{1_x}&= x_0+n\alpha\\
\Longleftrightarrow t&= \underset{t_0}{\underbrace{ \frac{x_0-p_{0_x}}{p_{1_x}} }}+n \underset{t_1}{\underbrace{ \frac{\alpha}{p_{1_x}} }}
\end{align*}$
Intersecting with horizontal grid lines (at $\hat{t}_0,\;\hat{t}_0+\hat{t}_1,\; \hat{t}_0+2\hat{t}_1,\;\cdots$), where the lines sit at $y=y_0+n\alpha$:
$\begin{align*}
\hat{t}_0 = \frac{y_0-p_{0_y}}{p_{1_y}},\quad \hat{t}_1 = \frac{\alpha}{p_{1_y}}
\end{align*}$
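Because the crossing parameters form an arithmetic sequence, each subsequent grid crossing costs only one addition when marching the ray. A sketch of the vertical-line case (the helper name is mine), solving $p_{0_x}+tp_{1_x}=x_0+n\alpha$ for $t$:

```python
def x_crossings(p0x, p1x, x0, alpha, count):
    """t values where the ray p0 + t p1 crosses vertical lines x = x0 + n*alpha.

    t_0 = (x0 - p0x) / p1x is the first crossing; t_1 = alpha / p1x is the
    constant spacing between crossings."""
    t0 = (x0 - p0x) / p1x
    t1 = alpha / p1x
    return [t0 + n * t1 for n in range(count)]
```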
Other accelerations: Check last hit first, beam tracing, pencil tracing, cone tracing
Ray tracing is "embarrassingly parallelizable".
### Bounding box test
![[7_raytracing 2023-03-07 10.09.27.excalidraw]]
$\begin{align*}
t>0,\; t\in[t_{x\min}, t_{x\max}],\; t\in[t_{y\min}, t_{y\max}]
\end{align*}$
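The interval conditions above can be sketched as a slab test, extended to all three axes (function name and the parallel-ray handling are my choices):

```python
def ray_box_intersect(p0, p1, box_min, box_max):
    """Slab test: intersect the per-axis parameter intervals [t_min, t_max];
    the ray hits the box iff the intersection is non-empty for t > 0."""
    t_lo, t_hi = 0.0, float('inf')             # running interval, enforcing t > 0
    for k in range(3):
        if p1[k] == 0.0:                       # ray parallel to this slab
            if not (box_min[k] <= p0[k] <= box_max[k]):
                return False                   # origin outside the slab: no hit
            continue
        t0 = (box_min[k] - p0[k]) / p1[k]
        t1 = (box_max[k] - p0[k]) / p1[k]
        if t0 > t1:
            t0, t1 = t1, t0                    # order entry/exit
        t_lo, t_hi = max(t_lo, t0), min(t_hi, t1)
        if t_lo > t_hi:
            return False                       # slab intervals don't overlap
    return True
```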
Hierarchical bbox test (p. 305 of the book):
```cpp
function bvh-node::create(object array A, int AXIS) {
    N = A.length();
    if (N == 1) {
        left = A[0]; right = NULL; bbox = bound(A[0]);
    } else if (N == 2) {
        left = A[0]; right = A[1]; bbox = combine(bound(A[0]), bound(A[1]));
    } else {
        // Find midpoint m of bounding box of A along AXIS
        // Partition A into lists of size k and N-k around m
        left = new bvh-node(A[0...k], (AXIS+1) mod 3);
        right = new bvh-node(A[k+1...N-1], (AXIS+1) mod 3);
        bbox = combine(left->bbox, right->bbox);
    }
}
```
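A matching traversal can be sketched as follows. The `ray_box` and `ray_object` intersection tests are passed in as hypothetical callbacks, and leaves store their object(s) directly in `left`/`right`, as in the build routine above:

```python
from dataclasses import dataclass
from typing import Any

@dataclass
class BVHNode:
    left: Any    # child node, object, or None
    right: Any   # child node, object, or None
    bbox: Any    # bounding volume of everything below this node

def bvh_hit(node, ray, ray_box, ray_object):
    """Traverse the BVH: test the bounding box first, recurse only on overlap.

    ray_box(ray, bbox) -> bool and ray_object(ray, obj) -> t-or-None are the
    caller's intersection tests. Returns the nearest hit t, or None."""
    if not ray_box(ray, node.bbox):
        return None                            # prune the whole subtree
    hits = []
    for child in (node.left, node.right):
        if child is None:
            continue
        if isinstance(child, BVHNode):
            t = bvh_hit(child, ray, ray_box, ray_object)
        else:                                  # leaf slot holds an object directly
            t = ray_object(ray, child)
        if t is not None:
            hits.append(t)
    return min(hits) if hits else None         # nearest intersection
```

The pruning step is where the $O(p\log N)$ behavior comes from: subtrees whose bounding boxes the ray misses are never visited.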
### Uniform spatial subdivision
Different idea: divide space rather than objects
- In BVH, each object is in one of two sibling nodes
    - A point in space may be inside both nodes
- In spatial subdivision, each point in space is in one node
    - But an object may lie in multiple spatial nodes
- Simplest is the uniform grid (seen already)
    - Challenge is keeping all objects within a cell
    - And in traversing the grid
### BSP Trees
![[7_raytracing 2023-03-07 10.23.02.excalidraw|600]]
- Used for visibility and ray tracing
- Book considers only axis-aligned splits for ray tracing
    - Sometimes called a kd-tree when axis-aligned
- Split space (binary space partition) along planes
- Fast queries and back-to-front (painter’s) traversal
- Construction is conceptually simple
    - Select a plane as root of the sub-tree
    - Split into two children along this root
    - Use a random polygon for the splitting plane (may need to split polygons that intersect it)
[slides](https://cseweb.ucsd.edu//~viscomp/classes/cse167/wi23/slides/lecture16.pdf)