Norms

import numpy as np
np.random.seed(140)

The expository will assume familiarity with Linear Algebra, especially topics such as eigenvectors, eigenvalues, and singular values. This section will summarize vector and matrix norm and will establish the notation used throughout the rest of the text. It will also demonstrate a couple results that will be used later in the text.

Vector Norms

A vector norm $\rho$ on some vector space $V$ over the field $\mathbb{R}$ or $\mathbb{C}$ is a mapping from $V$ to $\mathbb{R}$ such that the following properties hold for all $u,v \in V$ and $a \in F$

$\rho(u+v) \leq \rho(u) + \rho(v)$ (Triangle Inequality)
$\rho(av) = |a| \rho(v)$ (Absolutely Homogeneous )
$\rho(v) = 0 \Rightarrow v = 0$

p-norms

The most common types of vector norms are known as "p-norms", which are norms that take the following form

$$|| x ||_p = \left(\sum_{i} |x_i|^p \right)^{1/p} $$

where $p$ is any real number greater than or equal to one. We can compute p-norms using Numpy with the function np.linalg.norm. We will demonstrate computing the different norms on the following vector $v$

v = np.random.normal(size=2)
v

array([-0.46716782,  0.29599552])

1-norm

The 1-norm is often called the manhattan distance or the taxicab norm. This norm calculates the distance from a vector to the origin by only traveling along the main axes in a coordinate system.

$$|| x ||_1 = \sum_{i} |x_i| $$

np.linalg.norm(v, 1)

0.7631633402193927

2-norm

The 2-norm mesasures the Euclidean distance from a point to the origin.

$$|| x ||_2 = \sqrt{\sum_{i} |x_i|^2} = \sqrt{x^* x} $$

This is the most common norm and it is often convention to omit the $2$ in the subscript. Note that the distribution of the 2-norm for a vector of two independent standard normal random variables has the Rayleigh distribution. Also, the distribution of $||v||^2$ where $v$ is a vector of $n$ iid standard normals is Chi-squared(n).

np.linalg.norm(v, 2)

0.5530453150208221

Theorem: Unitary transformations are isometries with respect to the 2-norm. That is, Unitary matrices do not change the 2-norm of a vector.

Proof: Suppose $U$ is a unitary matrix. Then $||Ux|| = \sqrt{(Ux)^* (Ux)} = \sqrt{x^* U^* U x} = \sqrt{x^* x} = ||x||$

$\infty$-norm

It is a well known result that the p-norm of a vector approaches the maximum absolute value as $p \longrightarrow \infty$. It turns out that the maximum absolute value of a vector satisfies all three properties of a norm. Thus the $\infty-norm$, also called the maximum norm, can be defined as

$$|| x ||_{\infty} = \max_i(|x_i|) $$

np.linalg.norm(v, np.inf)

0.46716782126474604

Note this is equivalent to using Python's max function on the absolute values.

max(np.abs(v))

0.46716782126474604

Matrix Norms

Likewise, we can define norms for matrices as long as they satisfy the following condition for $M, N \in F^{m \times n}$ and $a \in F$ where $F$ is $\mathbb{R}$ or $\mathbb{C}$

$\rho(M+N) \leq \rho(M) + \rho(N)$ (Triangle Inequality)
$\rho(aM) = |a| \rho(M)$ (Absolutely Homogeneous)
$\rho(M) \geq 0$
$\rho(M) = 0 \Rightarrow M = 0$

Matrix p-norm

Similar to vector norms, we have a family of norms known as "p-norms" for matricies. Suppose $A \in F^{m \times n}$ matrix and $p \geq 1$

$$ || A ||_p = \sup_{x \in F^n \\ x \neq 0} \frac{||Ax||_p}{||x||_p} = \sup_{x \in F^n \\||x||_p = 1} ||Ax||_p$$

The np.linalg.norm function can also be applied to matricies to find the matrix norm.

A = np.random.normal(size=(2,2))
A

array([[ 0.18548011, -0.82809842],
       [-0.03192753,  1.00155352]])

Operator norm

When $p=2$, this norm is known as the Operator norm. We will denote the operator norm of a matrix $A$ as $||A||$.

np.linalg.norm(A,2)

1.307451049240857

Theorem: The operator norm of a diagonal matrix is equal to its maximum element

Proof: Suppose $D$ is a diagonal matrix with $(D)_{ii} = d_i$. Suppose the greatest element is $d_m$, then

$$ \begin{align} \begin{aligned} ||D|| &= \sup_{||x||_p = 1} ||Dx|| \\ &= \sup_{||x|| = 1} \sqrt{\sum_i (d_i x_i)^2} \\ &\leq \sup_{||x|| = 1} \sqrt{\sum_i (d_m x_i)^2} \\ &= |d_m| \sup_{||x|| = 1} \sqrt{\sum_i (x_i)^2}\\ &= |d_m| \end{aligned} \end{align} $$

Now define $e_m$ to be a vector of all zeros except a one in the $m$th component

$$||D||=\sup_{||x||_p = 1} ||Dx|| \geq ||De_m|| = |d_m|$$

Thus the two inequalities imply $||D|| = |d_m|$

Frobenius norm

Suppose $A$ is a matrix and $A^*$ is it's hermetian adjoint.

$$||A||_F = \sqrt{\text{Trace}(A^*A)} $$

This norm is well known and can be computed with Numpy.

np.linalg.norm(A, 'fro')

1.3131179271021802