
# Transformations, Coordinate

Linear and affine transformations in $\mathbb{R}^2, \mathbb{R}^3$

---

## 1. Transformation in 2D: Scale, Shear, Rotation

1. Scale

   $\begin{align*} \text{scale}(s_x, s_y) &= S\in\mathbb{R}^{2\times2}= \begin{bmatrix}s_x&0\\0&s_y\end{bmatrix} \\ \text{scale}(s_x, s_y, s_z) &= S\in\mathbb{R}^{3\times3}= \begin{bmatrix}s_x&0&0\\0&s_y&0\\0&0&s_z\end{bmatrix} \end{align*}$

2. Shear

   $\begin{align*} \text{Shear} = S\in\mathbb{R}^{2\times2}=\begin{bmatrix}1&-a\\0&1\end{bmatrix} \end{align*}$

3. Rotation (linear; 2D rotations commute with each other). Note that $R$ is orthogonal, i.e. $R^TR=I$ (check out [this article](https://www.cuemath.com/algebra/rotation-matrix/) for the derivation).

   $\begin{align*} \text{Rotation}(\theta) = R\in\mathbb{R}^{2\times2}=\begin{bmatrix}\cos\theta&-\sin\theta\\\sin\theta & \cos\theta\end{bmatrix} \end{align*}$

**Composition** of transformations: matrices apply right to left, and the order matters in general.

$\begin{align*} x_3 &= Rx_2, \; x_2=Sx_1 \\ \rightarrow x_3 &= R(Sx_1)=(RS)x_1\neq (SR)x_1 \end{align*}$

**Inverting** (composite) transforms

- Option 1: Get the composite matrix, then invert it
- Option 2: Invert each transform and reverse the order

$\begin{align*} M &= M_1M_2M_3 \rightarrow M^{-1}=M_3^{-1}M_2^{-1}M_1^{-1} \\ M^{-1}M &= M_3^{-1}\left(M_2^{-1}\left(M_1^{-1} M_1\right) M_2\right) M_3 = I \end{align*}$

---

## 2. Rotation in 3D

### 2.1. Rotation about x/y/z axis

Check out this [post on deriving rotation matrices](https://www.cuemath.com/algebra/rotation-matrix/).

#lemma Composition of 3D rotations is **non-commutative** in general.

#lemma Rotation matrices are always linear and orthogonal, i.e. $R^TR=I$.
A fancier way of saying this is that $R\in\mathbb{SO}^3$, the [special orthogonal group](https://en.wikipedia.org/wiki/Orthogonal_group) (orthogonal matrices with $\det R=1$).

$\begin{align*} R_z=\left(\begin{array}{ccc} \cos \theta & -\sin \theta & 0 \\ \sin \theta & \cos \theta & 0 \\ 0 & 0 & 1 \end{array}\right)\; R_y=\left(\begin{array}{ccc} \cos \theta & 0 & \sin \theta \\ 0 & 1 & 0 \\ -\sin \theta & 0 & \cos \theta \end{array}\right)\; R_x=\left(\begin{array}{ccc} 1 & 0 & 0 \\ 0 & \cos \theta & -\sin \theta \\ 0 & \sin \theta & \cos \theta \end{array}\right) \end{align*}$

The rows of a (composite) rotation matrix can be interpreted as the axes of a new coordinate frame: each coordinate of $R_{uvw}p$ is the projection of $p$ onto the corresponding row of $R_{uvw}$.

$\begin{align*} R_{uvw}&=\begin{bmatrix}u_x&u_y&u_z\\v_x&v_y&v_z\\w_x&w_y&w_z\end{bmatrix}=\begin{bmatrix}-u^T-\\-v^T-\\-w^T-\end{bmatrix} \\ &\rightarrow R_{uvw}p=\begin{bmatrix}u\cdot p \\ v\cdot p\\w\cdot p\end{bmatrix} \end{align*}$

### 2.2. Rotation about arbitrary axes (Rodrigues' formula)

To express rotation about an arbitrary axis, we need ***Rodrigues' formula***, aka the ***angle-axis*** representation. It parameterizes the rotation matrix using two variables: $\underset{\text{angle}}{\underbrace{\theta}}, \underset{\text{axis}}{\underbrace{ a }}$, meaning rotate by angle $\theta$ about the unit axis $a=[x,y,z]^T$. Here $[\bullet]$ denotes the [[0_background#matrix|skew symmetric]] operator, so that $[a]b=a\times b$.

$\begin{align*} \text{Rot}(a,\theta)&=\cos\theta I_3+(1-\cos\theta)aa^T+\sin\theta[a]\\ &=\begin{bmatrix}\cos\theta&0&0\\0&\cos\theta&0\\0&0&\cos\theta\end{bmatrix} + (1-\cos\theta)\begin{bmatrix}x^2&xy&xz\\xy&y^2&yz\\xz&yz&z^2\end{bmatrix}+\sin\theta\begin{bmatrix}0&-z&y\\z&0&-x\\-y&x&0\end{bmatrix} \end{align*}$

Regrouped as $\cos\theta(I_3-aa^T)+aa^T+\sin\theta[a]$, the formula consists of 3 parts: 1) the component perpendicular to $a$, kept in its original direction (hence cosine), 2) the component along $a$, which is unchanged (hence 1), 3) the perpendicular component turned by 90° about $a$ (hence sine).
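
As a minimal sketch of the formula above, the snippet below builds $\text{Rot}(a,\theta)$ term by term in plain Python (nested lists rather than a matrix library, to stay dependency-free). The helper names `skew`, `rodrigues` and `matvec` are illustrative choices, not part of any library.

```python
import math

def skew(a):
    """Return the skew-symmetric matrix [a], so that [a] b equals a x b."""
    x, y, z = a
    return [[0.0, -z, y],
            [z, 0.0, -x],
            [-y, x, 0.0]]

def rodrigues(axis, theta):
    """Rot(a, theta) = cos(theta) I + (1 - cos(theta)) a a^T + sin(theta) [a]."""
    norm = math.sqrt(sum(c * c for c in axis))
    a = [c / norm for c in axis]              # the formula assumes a unit axis
    c, s = math.cos(theta), math.sin(theta)
    K = skew(a)
    return [[c * (i == j) + (1.0 - c) * a[i] * a[j] + s * K[i][j]
             for j in range(3)]
            for i in range(3)]

def matvec(M, v):
    """Multiply a 3x3 matrix by a 3-vector."""
    return [sum(M[i][j] * v[j] for j in range(3)) for i in range(3)]

# Rotate the x-axis by 90 degrees about the z-axis; it should land on the y-axis.
R = rodrigues([0.0, 0.0, 1.0], math.pi / 2)
rotated = matvec(R, [1.0, 0.0, 0.0])
print(rotated)
```

With $a$ set to the z-axis the result should reduce to $R_z$ from Section 2.1, which makes a convenient sanity check.
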

To derive Rodrigues' formula, let's observe the following setup (all lower-case letters are vectors in $\mathbb{R}^3$, except for the angles $\theta,\phi$ which are in $\mathbb{R}$), where we have

- $a$: Rotation axis (normalized, i.e. a unit vector). $\theta$: Rotation angle.
- $b$: The vector to be rotated. $b'$: Rotated vector (around axis $a$ by angle $\theta$)
- $b_\perp$: The projection of $b$ onto the xy-plane. $b'_\perp$: The projection of $b'$ onto the xy-plane

![[1_transformations 2023-01-18 14.22.15.excalidraw.png|400]]

The goal is to express $b'$ in terms of $a$, $\theta$ and $b$. In this setup the z-direction is along $a$, the x-direction is along $b_\perp$, and the y-direction is along $a\times b$. Now we are ready to decompose $b'$ into 3 components:

1. Along the z-direction (where we get the first component):

   $b'_z=b_z=\text{proj}_a(b)=\frac{a\cdot b}{\cancelto{1}{\Vert a \Vert^2}}a=a(a^Tb)=aa^Tb$

   To set up the rest of the derivation, we need the projection of $b$ onto the xy-plane: $b_\perp=b-b'_z=(I_3-aa^T)b$

2. Along the x-direction: Note that $b'_x$ points in the same direction as $b_\perp$ with magnitude $\Vert b'_\perp \Vert\cos\theta$, and that $\Vert b'_\perp \Vert=\Vert b_\perp \Vert$ because rotation preserves length.

   $\begin{align*} b'_x &= \frac{b_\perp}{\Vert b_\perp \Vert}\Vert b'_\perp \Vert\cos\theta=b_\perp\cos\theta\\ &= (I_3-aa^T)b \cos\theta \end{align*}$

3. Along the y-direction: Note that $\phi$ is the angle between $a$ and $b$, so $\Vert b_\perp \Vert=\Vert b \Vert\sin\phi$ and $\Vert a\times b \Vert=\Vert a \Vert\Vert b \Vert\sin\phi$.

   $\begin{align*} b'_y &= \Vert b_\perp \Vert\sin\theta\frac{a\times b}{\Vert a\times b \Vert} = \cancel{\Vert b \Vert}\cancel{\sin\phi}\sin\theta\frac{a\times b}{\cancelto{1}{\Vert a \Vert} \cancel{\Vert b \Vert}\cancel{\sin\phi}}\\ &= \sin \theta[a]b \end{align*}$

Putting it all together, we have

$\begin{align*} b' &= b'_x + b'_y + b'_z = (I_3-aa^T)b \cos\theta+ \sin \theta[a]b+aa^Tb\\ &= \underset{\text{Rot}(a,\theta)}{\underbrace{ \Big(\cos\theta I_3+(1-\cos\theta)aa^T + \sin \theta[a]\Big) }}b \end{align*}$

Voilà! If the derivation still looks cryptic, review [[0_background#Dot Product|dot product]] and [[0_background#Coordinate Frames|coordinate frames]].
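
The three components can also be checked numerically. The sketch below follows the derivation term by term; the function name `rotate_by_components` and the test vectors are assumptions chosen for illustration. Rotating by 120° about the diagonal $(1,1,1)/\sqrt{3}$ cyclically permutes the coordinate axes, so $(1,2,3)$ should map to $(3,1,2)$ up to rounding.

```python
import math

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def cross(u, v):
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]

def rotate_by_components(a, theta, b):
    """Rotate b about the unit axis a by theta, summing the three components."""
    b_z = [dot(a, b) * ai for ai in a]                   # a a^T b, along the axis
    b_perp = [bi - zi for bi, zi in zip(b, b_z)]         # (I - a a^T) b
    b_x = [math.cos(theta) * p for p in b_perp]          # cos(theta) b_perp
    b_y = [math.sin(theta) * c for c in cross(a, b)]     # sin(theta) [a] b
    return [x + y + z for x, y, z in zip(b_x, b_y, b_z)]

a = [1.0 / math.sqrt(3.0)] * 3       # unit vector along the (1, 1, 1) diagonal
b = [1.0, 2.0, 3.0]
b_rot = rotate_by_components(a, 2.0 * math.pi / 3.0, b)
print(b_rot)
```
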

#lemma $\text{Rot}(-a,\theta)=\text{Rot}(a,-\theta)$. Can you prove it?

---

## 3. Homogeneous Coordinate For Translation & Rotation

### 3.1 Homogeneous coordinate

Previously we've been working with vectors in $\mathbb{R}^3$. Now, in the context of homogeneous coordinates, we "augment" vectors to be in $\mathbb{R}^4$:

$p=\begin{bmatrix}x\\y\\z\\w\end{bmatrix}\underset{\text{de-homogenize}}{=}\begin{bmatrix}x/w\\y/w\\z/w\\1\end{bmatrix}$

As we will soon see, this is a neat trick that makes it possible to specify translation and rotation in a single compact matrix form. Later, we'll also see that homogeneous coordinates come in handy for [[2_viewing#Perspective Projection: `gluPerspective()`|viewing and projection]]. In fact, we constrain $w\geq0$, and use $w=0$ to represent a point at infinity (de-homogenizing requires $w\neq0$).

### 3.1 **Translation**:

Given a translation vector $t\in \mathbb{R}^3$, the translation matrix $T\in \mathbb{R}^{4\times 4}$ is:

$\begin{align*} T &= \begin{bmatrix}I_3&t\\0&1\end{bmatrix} \end{align*}$

Example: translate $p$ to $p'$:

$p'=Tp=\begin{bmatrix}1&0&0&t_x\\0&1&0&t_y\\0&0&1&t_z\\0&0&0&1\end{bmatrix}\begin{bmatrix}p_x\\p_y\\p_z\\1\end{bmatrix}=\begin{bmatrix}p_x+t_x\\p_y+t_y\\p_z+t_z\\1\end{bmatrix}=\begin{bmatrix}p+t\\1\end{bmatrix}$

### 3.2 **Rotation** and translation:

Given a translation vector $t\in \mathbb{R}^3$ and a rotation matrix $R\in \mathbb{SO}^3$, what's the $4\times4$ transformation matrix that combines both transformations? In this case the order matters: translation followed by rotation generally gives a different result than rotation followed by translation. ORDER MATTERS!
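
The claim that order matters is easy to check numerically. The following is a small sketch in plain Python, assuming a rotation of 90° about the z-axis and a unit translation along x; the helpers `matmul`, `translation`, `rotation_z` and `apply` are hypothetical names introduced only for this example.

```python
import math

def matmul(A, B):
    """Multiply two 4x4 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def translation(t):
    """Homogeneous translation matrix [[I, t], [0, 1]]."""
    tx, ty, tz = t
    return [[1.0, 0.0, 0.0, tx],
            [0.0, 1.0, 0.0, ty],
            [0.0, 0.0, 1.0, tz],
            [0.0, 0.0, 0.0, 1.0]]

def rotation_z(theta):
    """Homogeneous rotation about the z-axis, [[R_z, 0], [0, 1]]."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0, 0.0],
            [s, c, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

def apply(M, p):
    """Apply a 4x4 matrix to a 3D point (w = 1), then de-homogenize."""
    h = [p[0], p[1], p[2], 1.0]
    x, y, z, w = (sum(M[i][j] * h[j] for j in range(4)) for i in range(4))
    return [x / w, y / w, z / w]

T = translation([1.0, 0.0, 0.0])
R = rotation_z(math.pi / 2)
p = [1.0, 0.0, 0.0]

rotate_then_translate = apply(matmul(T, R), p)   # computes R p + t
translate_then_rotate = apply(matmul(R, T), p)   # computes R (p + t)
print(rotate_then_translate, translate_then_rotate)
```

Up to rounding, the first result is $(1,1,0)$ and the second is $(0,2,0)$: the rightmost matrix in the product is the one applied first.
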

#### Rotation, then translation

$\begin{align*} M &= \begin{bmatrix}R&t\\0&1\end{bmatrix} \end{align*} ,\;\text{where } R\in\mathbb{SO}^3$

Example: $p'=(TR)p=Mp=Rp+t$

$\begin{align*} p'=Mp=(TR)p = \begin{bmatrix}1&0&0&t_x\\0&1&0&t_y\\0&0&1&t_z\\0&0&0&1\end{bmatrix} \begin{bmatrix}R&0\\0&1\end{bmatrix} \begin{bmatrix}p_x\\p_y\\p_z\\1\end{bmatrix}= \begin{bmatrix}R&t\\0&1\end{bmatrix}\begin{bmatrix}p_x\\p_y\\p_z\\1\end{bmatrix} = \begin{bmatrix}Rp+t\\1\end{bmatrix}\in\mathbb{R}^4 \end{align*}$

#### Translation, then rotation

$\begin{align*} M &= \begin{bmatrix}R&Rt\\0&1\end{bmatrix} \end{align*} ,\;\text{where } R\in\mathbb{SO}^3$

Example: $p'=(RT)p=Mp=R(p+t)=Rp+Rt$

$\begin{align*} p'=Mp=(RT)p = \begin{bmatrix}R&0\\0&1\end{bmatrix} \begin{bmatrix}1&0&0&t_x\\0&1&0&t_y\\0&0&1&t_z\\0&0&0&1\end{bmatrix} \begin{bmatrix}p_x\\p_y\\p_z\\1\end{bmatrix}= \begin{bmatrix}R&Rt\\0&1\end{bmatrix}\begin{bmatrix}p_x\\p_y\\p_z\\1\end{bmatrix} = \begin{bmatrix}Rp+Rt\\1\end{bmatrix}\in\mathbb{R}^4 \end{align*}$

### 3.3 Transforming Normals

Motivation (the problem of transforming normals): non-uniform scaling and shear give incorrect normals when normals are transformed like ordinary vectors. E.g. applying a shear to a rectangle results in a slanted post-transform normal that's no longer perpendicular to the tangent plane/line.

Solution: In addition to $M$, which transforms all tangent vectors $t$ correctly, construct $Q\in\mathbb{R}^{3\times3}$ (in general $Q\neq M$) to transform normal vectors $n$. Here $M$ denotes the $3\times3$ linear part (upper-left block) of the $4\times4$ transform. We can solve for $Q$ by leveraging the fact that $t\perp n$ and requiring that $Mt\perp Qn$.

$\begin{align*} Mt\perp Qn &\rightarrow Qn\cdot Mt = (Qn)^TMt=0 \\ &\rightarrow n^T(Q^TM)t = 0\\ \because n\perp t&\rightarrow n\cdot t = n^Tt= 0 \;\therefore Q^TM = I\\ \therefore &\boxed{Q=(M^{-1})^T} \end{align*}$

Choosing $Q^TM=I$ is sufficient for the condition to hold for every tangent/normal pair.

#lemma Rotation preserves normal properties. Note that if $M\in\mathbb{SO}^3$, i.e. $M$ represents a pure rotation matrix, then $Q=M$ since $M^T=M^{-1}$.
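
To make the boxed result concrete, here is a hedged sketch that shears a tangent/normal pair and compares the naive transform $Mn$ with $Qn$. It uses the fact that for a $3\times3$ matrix, $(M^{-1})^T$ equals the cofactor matrix divided by the determinant; the shear matrix, the vectors, and the helper names (`normal_matrix`, `matvec`) are assumptions made for illustration.

```python
def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def cross(u, v):
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]

def matvec(M, v):
    """Multiply a 3x3 matrix by a 3-vector."""
    return [dot(row, v) for row in M]

def normal_matrix(M):
    """Return Q = (M^-1)^T for an invertible 3x3 M (cofactor matrix / det)."""
    r0, r1, r2 = M
    det = dot(r0, cross(r1, r2))
    cofactors = [cross(r1, r2), cross(r2, r0), cross(r0, r1)]
    return [[c / det for c in row] for row in cofactors]

shear = [[1.0, 1.0, 0.0],     # x' = x + y
         [0.0, 1.0, 0.0],
         [0.0, 0.0, 1.0]]
t = [0.0, 1.0, 0.0]           # tangent of a vertical edge
n = [1.0, 0.0, 0.0]           # its normal, perpendicular to t

t_new = matvec(shear, t)                      # the tangent follows M
n_naive = matvec(shear, n)                    # wrong: normal transformed by M
n_new = matvec(normal_matrix(shear), n)       # right: normal transformed by Q

print(dot(n_naive, t_new), dot(n_new, t_new))
```

The first dot product is nonzero (the naively transformed normal is no longer perpendicular), while the second is zero.
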

#note Normals are not affected by translation, so $M$ should not include translation; it should only contain the linear part (rotation and/or scaling).

---

## 4. Coordinate Frames & `gluLookAt`

### 4.1 Rotating coordinate frames

![[1_transformations 2023-01-18 21.47.58.excalidraw.png|400]]

We can interpret the demonstrated rotation in 2 ways:

1. Rotate vector $p$ clockwise by $\theta$ to $p'$
2. Rotate coordinate frame $xy$ counterclockwise by $\theta$ to $uv$

This gives a new way of interpreting a 2D rotation matrix. With the counterclockwise convention of Section 1, the rotation shown here is $R(-\theta)=R(\theta)^T=\begin{bmatrix}\cos\theta&\sin\theta\\-\sin\theta & \cos\theta\end{bmatrix}$: the first row is the coordinates of $u$ in the xy coordinate frame, and the second row is the coordinates of $v$ in the xy coordinate frame. In other words, with $x,y$ denoting the basis vectors of the old frame:

$\begin{bmatrix}u\\v\end{bmatrix}=\begin{bmatrix}\cos\theta&\sin\theta\\-\sin\theta&\cos\theta\end{bmatrix}\begin{bmatrix}x\\y\end{bmatrix}$

#lemma The rows of a rotation matrix are the axes of the new (rotated) coordinate frame, expressed in the old frame. This extends to 3D rotation matrices.

### 4.2 `gluLookAt`: model-view transformation matrix

![[1_transformations 2023-01-18 22.27.04.excalidraw.png|400]]

`void gluLookAt(eyex, eyey, eyez, centerx, centery, centerz, upx, upy, upz)`: Camera located at **eye**, looking at object **center**, with **up** direction. Building the matrix behind this OpenGL function involves 3 steps:

1. [[0_background#Coordinate Frames|Construct coordinate frame]] at the camera (i.e. eye) location, using $a=(\text{eye}-\text{obj. center}), b=\text{up}$.

   $\begin{align*} w = \frac{a}{\Vert a \Vert}, \; u=\frac{b\times w}{\Vert b\times w\Vert}, \; v=w\times u\end{align*}$

2. Define the rotation matrix from the 3 orthonormal vectors $u,v,w$

   $\begin{align*} R_{uvw}=\begin{bmatrix}-u^T-\\-v^T-\\-w^T-\end{bmatrix}=\begin{bmatrix}u_x&u_y&u_z\\v_x&v_y&v_z\\w_x&w_y&w_z\end{bmatrix} \end{align*}$

3. Apply the transformation from the world frame to the camera frame at **eye**: this consists of 1) translation and 2) rotation.
   Note that <mark style="background: #FFF3A3A6;">translation must precede rotation</mark> (translation brings the eye to the origin), and that $M\in\mathbb{R}^{4\times4}, R\in\mathbb{R}^{3\times3}, t\in\mathbb{R}^3$.

   $\begin{align*} M=\begin{bmatrix}R&0\\0&1\end{bmatrix}\begin{bmatrix}I_3&t\\0&1\end{bmatrix}=\begin{bmatrix}R&Rt\\0&1\end{bmatrix} \end{align*}$

   In the context of `gluLookAt()`, $R=R_{uvw}, t=-(\text{eye})=-e$, therefore we finally have the model-view matrix

   $\begin{align*} M=\begin{bmatrix}R_{uvw}&0\\0&1\end{bmatrix}\begin{bmatrix}I_3&-e\\0&1\end{bmatrix}=\begin{bmatrix}R_{uvw}&-R_{uvw}e\\0&1\end{bmatrix}= \begin{bmatrix}u^T&-u\cdot e\\v^T&-v\cdot e\\w^T&-w\cdot e\\0&1\end{bmatrix} \end{align*}$

---

## 5. Scene Graphs: Combining Transformations

![[Pasted image 20221214153032.png]]

A **scene graph** is a DAG (directed acyclic graph) where each child stores its local coordinate frame w.r.t. its parent. A **child's coordinate frame** can be transformed back to the **world coordinate frame** by **depth-first traversing** the DAG.

- Start with the **viewing transformation** (i.e. camera position) at the root.
- Traverse depth-first downward until hitting a leaf. At each node, cascade (multiply in) that node's local transformation, and push the node onto a stack.
- Traverse back up, applying the transformations popped from the stack.
- Finally, the cascaded transformation is the object's **pose in world frame**.

---
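
The traversal above can be sketched as follows. This is a simplified illustration, not how any particular engine does it: it assumes a tree-shaped graph with unique node names, translation-only local transforms, and an identity viewing transformation by default. `Node`, `translation` and `world_transforms` are hypothetical names. Instead of pushing and popping a shared matrix, the explicit stack carries each parent's cumulative transform alongside the node.

```python
from dataclasses import dataclass, field

def matmul(A, B):
    """Multiply two 4x4 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def translation(tx, ty, tz):
    """Homogeneous translation matrix [[I, t], [0, 1]]."""
    return [[1.0, 0.0, 0.0, tx],
            [0.0, 1.0, 0.0, ty],
            [0.0, 0.0, 1.0, tz],
            [0.0, 0.0, 0.0, 1.0]]

IDENTITY = translation(0.0, 0.0, 0.0)

@dataclass
class Node:
    name: str
    local: list                                   # 4x4 transform w.r.t. the parent
    children: list = field(default_factory=list)

def world_transforms(root, view=IDENTITY):
    """Depth-first traversal returning {node name: cumulative 4x4 transform}."""
    result = {}
    stack = [(root, view)]                        # (node, parent's cumulative transform)
    while stack:
        node, parent = stack.pop()
        cumulative = matmul(parent, node.local)   # cascade the local transform
        result[node.name] = cumulative
        for child in node.children:
            stack.append((child, cumulative))
    return result

hand = Node("hand", translation(0.0, -1.0, 0.0))
arm = Node("arm", translation(1.0, 0.0, 0.0), [hand])
body = Node("body", translation(0.0, 2.0, 0.0), [arm])

poses = world_transforms(body)
hand_origin = [row[3] for row in poses["hand"][:3]]   # translation column of the pose
print(hand_origin)
```

Here the hand's origin ends up at $(1,1,0)$ in the world frame, the sum of the three translations along the path from the root.
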