# Vectors

Vectors are quantities that have magnitude and direction.

## Norms

The L1 norm is "taxicab distance", given by

$ |\mathbf{u}|_1 = \sum_{i=1}^{n} |u_i| \tag{1}$

or `np.linalg.norm(u, 1)` in [numpy](https://numpy.org/doc/stable/reference/generated/numpy.linalg.norm.html). This is the sum of the absolute values of the coordinates of the vector.

The L2 norm is "helicopter distance", given by

$ |\mathbf{u}|_2 = \sqrt{\sum_{i=1}^{n} u_i^2} \tag{2}$

or `np.linalg.norm(u)` (`np.linalg.norm(u, 2)` is equivalent). Note that in 2-D the vector is the hypotenuse of the right triangle formed with its coordinate components, which is why equation (2) is just Pythagoras' theorem. The L2 norm is frequently just called the magnitude or absolute value of the vector.

## Operations

### Addition

$\begin{align*} \mathbf{u} + \mathbf{v} = \left( \begin{array}{c} u_1 + v_1 \\ u_2 + v_2 \\ \vdots \\ u_n + v_n \\ \end{array} \right). \tag{3} \end{align*}$

In NumPy this is `u + v` or `np.add(u, v)`. Geometrically this is one diagonal of the parallelogram formed by $\mathbf{u}$ and $\mathbf{v}$; equivalently, travel along $\mathbf{u}$ to its tip and then start a new vector with the magnitude and direction of $\mathbf{v}$ from there.

### Subtraction

$\begin{align*} \mathbf{u} - \mathbf{v} = \left( \begin{array}{c} u_1 - v_1 \\ u_2 - v_2 \\ \vdots \\ u_n - v_n \\ \end{array} \right) \tag{4} \end{align*}$

In NumPy this is `u - v` or `np.subtract(u, v)`. Geometrically this is the other diagonal of the parallelogram formed by $\mathbf{u}$ and $\mathbf{v}$.

### Scalar Multiplication

$\begin{align*} \lambda \cdot \mathbf{u} = \left( \begin{array}{c} \lambda \cdot u_1 \\ \lambda \cdot u_2 \\ \vdots \\ \lambda \cdot u_n \\ \end{array} \right) \tag{5} \end{align*}$

In NumPy this is `u * l` or `np.multiply(u, l)`. Scalar multiplication scales the magnitude of a vector. If $\lambda$ is positive, the vector is scaled by a factor of $\lambda$ in its current direction. If $\lambda$ is negative, the magnitude is scaled by $|\lambda|$ and the direction is reflected about the origin.

### Dot Product of two vectors

The dot product of two vectors $\mathbf{u}$ and $\mathbf{v}$ is given by

$ \mathbf{u} \cdot \mathbf{v} = \sum_{i=1}^{n} u_i v_i \tag{6}$

This is sometimes written as $\langle \mathbf{u}, \mathbf{v} \rangle$. In NumPy this is `np.dot(u, v)` or `u @ v`. Note from formula (6) that $\mathbf{u}$ and $\mathbf{v}$ have to have the same shape; in NumPy, if they do not, a `ValueError` is raised.

#### Important properties of dot products

1. The dot product of a vector with itself is the square of the L2 norm.
   $ \langle \mathbf{u}, \mathbf{u} \rangle = |\mathbf{u}|^2 \tag{7}$
2. The dot product of a vector with an orthogonal vector is zero; this follows from (10) below.
   $\langle \mathbf{u}, \mathbf{v} \rangle = 0 \iff \theta = 90^\circ \tag{8}$
   This can be used as a check on the calculation of a cross product, since the result should be orthogonal to both of its arguments.
3. Another way to calculate the dot product of two arbitrary vectors is to take the product of their norms and the cosine of the angle between them.
   $\langle \mathbf{u}, \mathbf{v} \rangle = |\mathbf{u}||\mathbf{v}|\cos(\theta) \tag{9}$
   Dividing by the product of the norms gives the standard form of the equation for cosine similarity.
4. The following properties hold for the relationship between the sign of the dot product and the angle between the vectors
   $\begin{align*} \begin{cases} \begin{array}{lr} \langle \mathbf{u}, \mathbf{v} \rangle > 0 & \text{if } \theta < 90^\circ, \\ \langle \mathbf{u}, \mathbf{v} \rangle < 0 & \text{if } \theta > 90^\circ, \\ \langle \mathbf{u}, \mathbf{v} \rangle = 0 & \text{if } \theta = 90^\circ.\\ \end{array} \end{cases} \tag{10} \end{align*}$
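To make these operations concrete, here is a minimal NumPy sketch of the norms, elementwise operations and dot-product properties above; `u` and `v` are arbitrary example vectors chosen for illustration:

```python
import numpy as np

u = np.array([3.0, 4.0])
v = np.array([1.0, 2.0])

# L1 norm (equation 1): |3| + |4| = 7
print(np.linalg.norm(u, 1))
# L2 norm (equation 2): sqrt(3**2 + 4**2) = 5
print(np.linalg.norm(u))

# Addition, subtraction and scalar multiplication (equations 3-5)
print(u + v, u - v, 2.5 * u)

# Dot product (equation 6): 3*1 + 4*2 = 11
print(u @ v)

# Property (7): <u, u> equals the squared L2 norm
print(np.isclose(u @ u, np.linalg.norm(u) ** 2))

# Property (9) rearranged: cosine similarity of u and v
print((u @ v) / (np.linalg.norm(u) * np.linalg.norm(v)))
```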
### Dot product of a matrix and a vector

The dot product of a matrix $\mathbf{M}$ with a vector $\mathbf{v}$ is simply the dot product of $\mathbf{v}$ with each of the rows of $\mathbf{M}$, as follows

$\begin{align*} \mathbf{M} \cdot \mathbf{v} &= \left( \begin{array}{c} M_1 \cdot \mathbf{v} \\ M_2 \cdot \mathbf{v} \\ \vdots \\ M_n \cdot \mathbf{v} \\ \end{array} \right) \tag{11} \end{align*}$

Note that $\mathbf{M} \cdot \mathbf{v}$ ends up as a column vector with the same number of rows as there are rows in $\mathbf{M}$.

This is done in NumPy using `np.dot(M, v)` or `M @ v`, as with vectors. The shapes need to be compatible or NumPy will raise a `ValueError`. "Compatible" here means that $\mathbf{M}$ needs to have as many columns as there are rows in $\mathbf{v}$.

# Linear Transformations and Matrix multiplication

[[Linear, Affine and related Transforms|Linear transformations]] can be performed by multiplying a vector by a matrix. This matrix can be thought of as specifying the mapping for each of the basis vectors of the space before transformation. These basis vectors are usually written either $\mathbf{i},\mathbf{j}$ and $\mathbf{k}$, or $\hat{i}, \hat{j}$ and $\hat{k}$. So in 2-D space there will be a 2x2 matrix where the first column specifies where $\hat{i} = (1,0)$ lands after the transformation and the second column does the same for $\hat{j}=(0,1)$, and likewise for higher-dimensional spaces.

If we know that some transformation maps the basis vectors as follows

$\begin{align*} \hat{i} =& \begin{pmatrix} 1 \\ 0 \end{pmatrix} \rightarrow \begin{pmatrix} a \\ c \end{pmatrix} \\ \hat{j} =& \begin{pmatrix} 0 \\ 1 \end{pmatrix} \rightarrow \begin{pmatrix} b \\ d \end{pmatrix} \\ \end{align*}$

then the transformation is fully described by the matrix

$\begin{align*} \left( \begin{array}{cc} a & b \\ c & d \\ \end{array}\right) \end{align*}$

...and the mapping of an arbitrary point $\begin{pmatrix} x \\ y \end{pmatrix}$ is given by

$\begin{align*} \left( \begin{array}{cc} a & b \\ c & d \\ \end{array}\right) \left( \begin{array}{c} x \\ y \\ \end{array}\right) = x \begin{pmatrix} a \\ c \end{pmatrix} + y \begin{pmatrix} b \\ d \end{pmatrix} = \begin{pmatrix} ax + by \\ cx + dy \end{pmatrix} \tag{12} \end{align*}$
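As a sketch of equations (11) and (12), the following builds a 2-D transformation matrix column by column from the images of the basis vectors; the specific choice of a 90-degree counter-clockwise rotation is just an illustrative assumption:

```python
import numpy as np

# Columns are the images of the basis vectors: a 90-degree
# counter-clockwise rotation sends i-hat (1,0) to (0,1)
# and j-hat (0,1) to (-1,0).
M = np.array([[0.0, -1.0],
              [1.0,  0.0]])

v = np.array([3.0, 1.0])

# Matrix-vector product (equation 11): dot v with each row of M
print(M @ v)  # [-1.  3.]

# Same result as the column picture in equation (12):
# x * (first column) + y * (second column)
print(v[0] * M[:, 0] + v[1] * M[:, 1])  # [-1.  3.]
```

Reading the columns of `M` directly off the images of $\hat{i}$ and $\hat{j}$ is exactly the construction described above.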
# Matrix addition and subtraction

To add or subtract matrices, simply add or subtract the elements. So $A + B$ is a matrix $C$ with the same shape as $A$ and $B$ where $C_{i,j} = A_{i,j} + B_{i,j}$, as follows:

$\begin{align*} A+B = \left( \begin{array}{cccc} A_{1,1} + B_{1,1} & A_{1,2} + B_{1,2} & \cdots & A_{1,n} + B_{1,n} \\ A_{2,1} + B_{2,1} & A_{2,2} + B_{2,2} & \cdots & A_{2,n} + B_{2,n} \\ \vdots \\ A_{n,1} + B_{n,1} & A_{n,2} + B_{n,2} & \cdots & A_{n,n} + B_{n,n} \\ \end{array}\right)\\ \end{align*} $

...and similarly $A - B$ is a matrix $C$ with the same shape as $A$ and $B$ where $C_{i,j} = A_{i,j} - B_{i,j}$

$\begin{align*} A-B = \left( \begin{array}{cccc} A_{1,1} - B_{1,1} & A_{1,2} - B_{1,2} & \cdots & A_{1,n} - B_{1,n} \\ A_{2,1} - B_{2,1} & A_{2,2} - B_{2,2} & \cdots & A_{2,n} - B_{2,n} \\ \vdots \\ A_{n,1} - B_{n,1} & A_{n,2} - B_{n,2} & \cdots & A_{n,n} - B_{n,n} \\ \end{array}\right) \tag{13} \end{align*}$

These operations are performed in NumPy using `A + B` or `np.add(A, B)`, and `A - B` or `np.subtract(A, B)`, as you might expect.

# Matrix Multiplication

If $A$ is an $m \times n$ matrix and $B$ is an $n \times p$ matrix, the matrix product $C = AB$ (denoted without multiplication signs or dots) is defined to be the $m \times p$ matrix such that

$c_{ij}=a_{i1}b_{1j}+a_{i2}b_{2j}+\ldots+a_{in}b_{nj}=\sum_{k=1}^{n} a_{ik}b_{kj}, \tag{14}$

where $a_{ik}$ are the elements of matrix $A$, $b_{kj}$ are the elements of matrix $B$, and $i = 1, \ldots, m$, $k=1, \ldots, n$, $j = 1, \ldots, p$. In other words, $c_{ij}$ is the dot product of the $i$-th row of $A$ and the $j$-th column of $B$.

This is performed using `np.matmul()` or `@` in NumPy. The number of columns in the first matrix has to equal the number of rows in the second matrix for this operation to be defined.

# Determinants and areas

The determinant of a matrix is the signed area of the image of the fundamental basis under the linear transformation the matrix represents; equivalently, it is the factor by which the transformation scales areas. This implies:

1. The determinant of the product of two matrices $A$ and $B$ is the product of their determinants. That is $\det A \det B = \det (AB) \tag{15}$
2. The determinant of the product of a singular matrix $S$ and a non-singular matrix $M$ is zero, so $MS$ performs a singular linear transformation (mapping the space into fewer dimensions). Likewise, the product of two non-singular matrices performs a non-singular transformation, because the fundamental basis maps to a shape which has non-zero area and is able to tile the plane.
3. The determinant of the inverse of a matrix is the inverse of the determinant of the matrix, ie $\det M^{-1} = \frac{1}{\det M} \tag{16}$

More about finding the [[Determinant of a matrix|determinant of a matrix]] here.
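Here is a minimal sketch checking equation (14) and the determinant properties (15) and (16), assuming two arbitrary example matrices (`A` is chosen non-singular so that (16) applies):

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
B = np.array([[0.0, 1.0],
              [5.0, 6.0]])

# Elementwise addition and subtraction (equation 13)
print(A + B)
print(np.subtract(A, B))

# Matrix product (equation 14): C[i, j] is the dot product
# of row i of A with column j of B
C = A @ B
print(C)

# Property (15): det(AB) = det(A) * det(B)
print(np.isclose(np.linalg.det(C), np.linalg.det(A) * np.linalg.det(B)))

# Property (16): det(inv(A)) = 1 / det(A)
print(np.isclose(np.linalg.det(np.linalg.inv(A)), 1 / np.linalg.det(A)))
```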
# Eigenvalues and Eigenvectors

An _eigenvector_ of some matrix $\mathbf{A}$ is some vector that has its direction unchanged by the transformation specified by $\mathbf{A}$, and the corresponding _eigenvalue_ is the scalar $\lambda$ by which the magnitude of the eigenvector is scaled by $\mathbf{A}$.

## Simple example

For a trivial example, $\mathbf{A} = \begin{pmatrix} 2 & 0\\ 0 & 3 \end{pmatrix}$ has eigenvectors $\begin{pmatrix} 1 \\ 0 \end{pmatrix}$ and $\begin{pmatrix} 0 \\ 1 \end{pmatrix}$, and the corresponding eigenvalues are 2 and 3.

## Finding eigenvalues and eigenvectors

[[Eigenvalues and eigenvectors#The eigenvector equation|Eigenvector equation]]: $\mathbf{Ax} = \lambda \mathbf{x}$

Use this for checking eigenvectors. Alternative form:

$(\mathbf{A} - \lambda \mathbb{I})\mathbf{x} = \mathbf{0}. \tag{17}$

Find the determinant of $\mathbf{A} - \lambda \mathbb{I}$ to get the characteristic polynomial $p(\lambda)$. To find the eigenvalues, solve $p(\lambda)=0$. A shortcut for the characteristic polynomial of a 2x2 matrix is

$p(\lambda) = \lambda^2 -(\mathop{\mathrm{tr}} \mathbf{A})\lambda + \det \mathbf{A}$

where $\mathop{\mathrm{tr}} \mathbf{A}$ is the "trace" of $\mathbf{A}$ (ie the sum of the values on the leading diagonal).

To find the eigenvalues and eigenvectors of some matrix:

1. Find $\mathbf{A}-\mathbb{I}\lambda$. (You can do this by just taking $\mathbf{A}$ and writing "$-\lambda$" next to each value on the leading diagonal.)
2. Find the roots of $p(\lambda) = \det(\mathbf{A}-\mathbb{I}\lambda)$. These are the eigenvalues of $\mathbf{A}$.
3. Write the simultaneous equations derived from $(\mathbf{A}-\mathbb{I}\lambda)\mathbf{x} = \mathbf{0}$.
4. Plug each eigenvalue into one of these equations and solve it for $y$ as a function of $x$.
5. Write down an eigenvector. If your solution looks like $ay = bx$, an eigenvector will be $\begin{pmatrix}a\\b\end{pmatrix}$.

The eigenvalues of a triangular matrix are the values on the leading diagonal (from top left to bottom right).

More about [[Eigenvalues and eigenvectors]] here.

# Resources

* Linear Algebra and Its Applications by Gilbert Strang
* Elementary Linear Algebra, 8th Edition by Ron Larson
* Linear Algebra Done Right by Sheldon Axler
* [Khan Academy Algebra Course](https://www.khanacademy.org/math/algebra)

# Some standard notation