import numpy as np
np.random.seed(140)
The expository will assume familiarity with Linear Algebra, especially topics such as eigenvectors, eigenvalues, and singular values. This section will summarize vector and matrix norm and will establish the notation used throughout the rest of the text. It will also demonstrate a couple results that will be used later in the text.
A vector norm $\rho$ on some vector space $V$ over the field $\mathbb{R}$ or $\mathbb{C}$ is a mapping from $V$ to $\mathbb{R}$ such that the following properties hold for all $u,v \in V$ and $a \in F$
- $\rho(u+v) \leq \rho(u) + \rho(v)$ (Triangle Inequality)
- $\rho(av) = |a| \rho(v)$ (Absolutely Homogeneous )
- $\rho(v) = 0 \Rightarrow v = 0$
The most common types of vector norms are known as "p-norms", which are norms that take the following form
$$|| x ||_p = \left(\sum_{i} |x_i|^p \right)^{1/p} $$where $p$ is any real number greater than or equal to one. We can compute p-norms using Numpy with the function np.linalg.norm
. We will demonstrate computing the different norms on the following vector $v$
v = np.random.normal(size=2)
v
The 1-norm is often called the manhattan distance or the taxicab norm. This norm calculates the distance from a vector to the origin by only traveling along the main axes in a coordinate system.
$$|| x ||_1 = \sum_{i} |x_i| $$np.linalg.norm(v, 1)
The 2-norm mesasures the Euclidean distance from a point to the origin.
$$|| x ||_2 = \sqrt{\sum_{i} |x_i|^2} = \sqrt{x^* x} $$This is the most common norm and it is often convention to omit the $2$ in the subscript. Note that the distribution of the 2-norm for a vector of two independent standard normal random variables has the Rayleigh distribution. Also, the distribution of $||v||^2$ where $v$ is a vector of $n$ iid standard normals is Chi-squared(n).
np.linalg.norm(v, 2)
Theorem: Unitary transformations are isometries with respect to the 2-norm. That is, Unitary matrices do not change the 2-norm of a vector.
Proof: Suppose $U$ is a unitary matrix. Then $||Ux|| = \sqrt{(Ux)^* (Ux)} = \sqrt{x^* U^* U x} = \sqrt{x^* x} = ||x||$
It is a well known result that the p-norm of a vector approaches the maximum absolute value as $p \longrightarrow \infty$. It turns out that the maximum absolute value of a vector satisfies all three properties of a norm. Thus the $\infty-norm$, also called the maximum norm, can be defined as
$$|| x ||_{\infty} = \max_i(|x_i|) $$np.linalg.norm(v, np.inf)
Note this is equivalent to using Python's max
function on the absolute values.
max(np.abs(v))
Likewise, we can define norms for matrices as long as they satisfy the following condition for $M, N \in F^{m \times n}$ and $a \in F$ where $F$ is $\mathbb{R}$ or $\mathbb{C}$
- $\rho(M+N) \leq \rho(M) + \rho(N)$ (Triangle Inequality)
- $\rho(aM) = |a| \rho(M)$ (Absolutely Homogeneous)
- $\rho(M) \geq 0$
- $\rho(M) = 0 \Rightarrow M = 0$
Similar to vector norms, we have a family of norms known as "p-norms" for matricies. Suppose $A \in F^{m \times n}$ matrix and $p \geq 1$
$$ || A ||_p = \sup_{x \in F^n \\ x \neq 0} \frac{||Ax||_p}{||x||_p} = \sup_{x \in F^n \\||x||_p = 1} ||Ax||_p$$The np.linalg.norm
function can also be applied to matricies to find the matrix norm.
A = np.random.normal(size=(2,2))
A
When $p=2$, this norm is known as the Operator norm. We will denote the operator norm of a matrix $A$ as $||A||$.
np.linalg.norm(A,2)
Theorem: The operator norm of a diagonal matrix is equal to its maximum element
Proof: Suppose $D$ is a diagonal matrix with $(D)_{ii} = d_i$. Suppose the greatest element is $d_m$, then
$$ \begin{align} \begin{aligned} ||D|| &= \sup_{||x||_p = 1} ||Dx|| \\ &= \sup_{||x|| = 1} \sqrt{\sum_i (d_i x_i)^2} \\ &\leq \sup_{||x|| = 1} \sqrt{\sum_i (d_m x_i)^2} \\ &= |d_m| \sup_{||x|| = 1} \sqrt{\sum_i (x_i)^2}\\ &= |d_m| \end{aligned} \end{align} $$Now define $e_m$ to be a vector of all zeros except a one in the $m$th component
$$||D||=\sup_{||x||_p = 1} ||Dx|| \geq ||De_m|| = |d_m|$$Thus the two inequalities imply $||D|| = |d_m|$
Suppose $A$ is a matrix and $A^*$ is it's hermetian adjoint.
$$||A||_F = \sqrt{\text{Trace}(A^*A)} $$This norm is well known and can be computed with Numpy.
np.linalg.norm(A, 'fro')