Scalar Product and Norms

  • Scalar product

  • Norms

  • Cauchy-Schwarz inequality and angles

  • Orthogonality

  • Hyperplanes and half-spaces

Scalar product

The scalar product (or, dot product) between two vectors $x, y \in \mathbf{R}^n$ is the scalar denoted $x^Ty$, and defined as
 $$x^Ty = \sum_{i=1}^n x_i y_i.$$

The scalar product is sometimes denoted $\langle x, y \rangle$. The motivation for our notation above will come later, when we define the matrix-vector product.

We say that the vectors are orthogonal if $x^Ty = 0$.

Matlab syntax
>> x = [1; 2; 3]; y = [4; 5; 6];
>> scal_prod = x'*y;

Examples:
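
As a small illustration of the definition, the sketch below computes the scalar product of two vectors both as the explicit sum of products and as x'*y, and then tests a second pair for orthogonality. The vectors u and v and all variable names are chosen here for illustration only.

```matlab
x = [1; 2; 3]; y = [4; 5; 6];
s_sum = sum(x .* y);   % explicit sum of the products x_i*y_i
s_mat = x' * y;        % the same value written as x^T y
% both equal 1*4 + 2*5 + 3*6 = 32

u = [1; -1; 0]; v = [1; 1; 5];   % illustrative vectors (an assumption, not from the text)
is_orth = (u' * v == 0);         % true: 1*1 + (-1)*1 + 0*5 = 0
```

The exact comparison with zero is safe here because the entries are small integers; with general floating-point data a tolerance would be the more careful choice.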

Vector norms

Measuring the size of a scalar value is unambiguous — we just take the magnitude (absolute value) of the number. However, when we deal with higher dimensions, and try to define the notion of size, or length, of a vector, we are faced with many possible choices.

Norms are real-valued functions that satisfy a basic set of rules that a sensible notion of size should involve. You can consult the formal definition of a norm here. In this course, we focus on the following three popular norms for a vector $x \in \mathbf{R}^n$:

The Euclidean norm:
 $$\|x\|_2 := \sqrt{\sum_{i=1}^n x_i^2} = \sqrt{x^Tx},$$
corresponds to the usual notion of distance in two or three dimensions. The set of points with equal $l_2$-norm is a circle (in 2D), a sphere (in 3D), or a hyper-sphere in higher dimensions.

The $l_1$-norm:
 $$\|x\|_1 = \sum_{i=1}^n |x_i|,$$
corresponds to the distance travelled on a rectangular grid to go from one point to another.

The $l_\infty$-norm:
 $$\|x\|_\infty := \max_{1 \le i \le n} |x_i|,$$
is useful in measuring peak values.

Matlab syntax
>> x = [1; 2; -3];
>> r2 = norm(x,2); % l2-norm
>> r1 = norm(x,1); % l1 norm
>> rinf = norm(x,inf); % l-infty norm

Examples:

  • A given vector will in general have different "lengths" under different norms. For example, the vector $x = [1,-2,3]^T$ yields $\|x\|_2 \approx 3.7417$, $\|x\|_1 = 6$, and $\|x\|_\infty = 3$.

  • Sample standard deviation.
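
The values quoted in the first item can be checked directly. The second item can be read as a statement about the Euclidean norm: assuming the usual definition of the sample standard deviation with normalization by n-1 (which is also the default of Matlab's std), it equals the l_2-norm of the centered vector divided by sqrt(n-1). A minimal sketch, with illustrative variable names:

```matlab
x = [1; -2; 3];
r2 = norm(x, 2);      % sqrt(1 + 4 + 9) = sqrt(14), about 3.7417
r1 = norm(x, 1);      % 1 + 2 + 3 = 6
rinf = norm(x, inf);  % max(1, 2, 3) = 3

n = length(x);
xc = x - mean(x);                  % centered vector
s_norm = norm(xc, 2) / sqrt(n-1);  % standard deviation via the l2-norm
s_builtin = std(x);                % built-in, normalizes by n-1 by default
```

The two values s_norm and s_builtin should agree up to rounding error.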

Cauchy-Schwarz inequality, angles

The Cauchy-Schwarz inequality allows us to bound the scalar product of two vectors in terms of their Euclidean norms.

Cauchy-Schwarz inequality: For any two vectors $x, y \in \mathbf{R}^n$, we have
 $$|x^Ty| \le \|x\|_2 \cdot \|y\|_2,$$
with equality if and only if $x, y$ are collinear.

When neither of the vectors $x, y$ is zero, we can define the corresponding angle as the $\theta \in [0, \pi]$ such that
 $$\cos \theta = \frac{x^Ty}{\|x\|_2 \|y\|_2}.$$
The Cauchy-Schwarz inequality guarantees that the right-hand side lies in $[-1, 1]$, so that $\theta$ is well defined. The notion above generalizes the usual notion of angle between two directions in two dimensions, and is useful in measuring the similarity (or, closeness) between two vectors. When the two vectors are orthogonal, that is, $x^Ty = 0$, we do obtain that their angle is $\theta = 90^\circ$.

The Cauchy-Schwarz inequality can be generalized to other norms, using the concept of dual norm.

Example:
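
A numerical check can make the inequality and the angle formula concrete. The sketch below bounds the absolute value of the scalar product by the product of the Euclidean norms, computes the angle in degrees, and verifies that the bound is attained for a collinear pair. The vectors and variable names are illustrative assumptions.

```matlab
x = [1; 2; 3]; y = [4; 5; 6];
lhs = abs(x' * y);                 % |x^T y| = 32
rhs = norm(x, 2) * norm(y, 2);     % sqrt(14)*sqrt(77), about 32.83
holds = (lhs <= rhs);              % true, as the inequality guarantees

cos_theta = (x' * y) / rhs;        % cosine of the angle between x and y
theta_deg = acosd(cos_theta);      % angle in degrees; small, since x and y are nearly aligned

z = -2 * x;                        % collinear with x, pointing the opposite way
tight = abs(abs(x' * z) - norm(x, 2) * norm(z, 2)) < 1e-12;  % equality up to rounding
```

A small angle corresponds to a cosine close to 1, which is why the cosine is often used as a similarity score between vectors.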

Orthonormal basis

A basis $(u_i)_{i=1}^n$ is said to be orthogonal if $u_i^Tu_j = 0$ whenever $i \ne j$. If, in addition, $\|u_i\|_2 = 1$ for every $i$, we say that the basis is orthonormal.

Example: An orthonormal basis in $\mathbf{R}^2$. The collection of vectors $\{u_1, u_2\}$, with
 $$u_1 = \frac{1}{\sqrt{2}} \left(\begin{array}{c} 1 \\ 1 \end{array}\right), \;\; u_2 = \frac{1}{\sqrt{2}} \left(\begin{array}{c} 1 \\ -1 \end{array}\right),$$
forms an orthonormal basis of $\mathbf{R}^2$.
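
One way to check orthonormality numerically is to stack the basis vectors as the columns of a matrix U (a name introduced here for illustration) and compare U'*U with the identity, since its (i,j) entry is the scalar product of u_i and u_j. A minimal sketch for the example above:

```matlab
u1 = [1; 1] / sqrt(2);
u2 = [1; -1] / sqrt(2);
U = [u1, u2];                  % basis vectors as columns
G = U' * U;                    % entry (i,j) is u_i^T u_j
err = max(max(abs(G - eye(2))));   % largest deviation from the identity
is_orthonormal = (err < 1e-12);    % true up to rounding error
```

A tolerance is used rather than an exact comparison because 1/sqrt(2) is not represented exactly in floating point.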

Hyperplanes and half-spaces

Hyperplanes

A hyperplane is a set described by a single affine equality. Precisely, a hyperplane in $\mathbf{R}^n$ is a set of the form
 $$\mathbf{H} = \left\{ x ~:~ a^Tx = b \right\},$$
where $a \in \mathbf{R}^n$, $a \ne 0$, and $b \in \mathbf{R}$ are given. When $b = 0$, the hyperplane is simply the set of points that are orthogonal to $a$; when $b \ne 0$, the hyperplane is a translation, along direction $a$, of that set.

Hyperplanes are affine sets, of dimension $n-1$ (see the proof here). Thus, they generalize the usual notion of a plane in $\mathbf{R}^3$. Hyperplanes are very useful because they allow us to separate the whole space into two regions. The notion of half-space formalizes this.

Example:
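
The description of a hyperplane as a translate, along the direction a, of the set of vectors orthogonal to a can be illustrated numerically. In the sketch below the data a, b and the vector v are illustrative assumptions: x0 is the point of the hyperplane that lies along a, and adding any multiple of a vector orthogonal to a does not leave the hyperplane.

```matlab
a = [1; 2; 2]; b = 3;            % illustrative data, with a nonzero
x0 = (b / (a' * a)) * a;         % point of the hyperplane lying along the direction a
on_H = abs(a' * x0 - b) < 1e-12;         % a^T x0 = b, up to rounding

v = [2; -1; 0];                  % orthogonal to a: 1*2 + 2*(-1) + 2*0 = 0
x1 = x0 + 5 * v;                 % moving along v stays in the hyperplane
still_on_H = abs(a' * x1 - b) < 1e-12;   % a^T x1 = a^T x0 + 5*a^T v = b
```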

Half-spaces

A half-space is a subset of $\mathbf{R}^n$ defined by a single affine inequality. Precisely, a half-space in $\mathbf{R}^n$ is a set of the form
 $$\mathbf{H} = \left\{ x ~:~ a^Tx \le b \right\},$$
where $a \in \mathbf{R}^n$, $a \ne 0$, and $b \in \mathbf{R}$ are given.

Based on the notion of angle between two vectors, we can understand the meaning of an inequality of the form $a^Tx \le b$, where $a \in \mathbf{R}^n$ and $b \in \mathbf{R}$ are given.

Let us examine the case when $b = 0$. The condition $a^Tx > 0$ means that the angle between $x$ and $a$ is acute, while $a^Tx < 0$ means that the angle is obtuse. The set $\{ x : a^Tx \le 0 \}$ therefore defines a half-space with boundary passing through $0$, and outward-pointing vector $a$. When $b \ne 0$, the half-space $\{ x : a^Tx \le b \}$ is a translated version of the case $b = 0$, along the direction $a$.
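
The link between the sign of the scalar product and the angle can be checked on a small case with b = 0. In the sketch below the vector a, the two test points, and the helper ang are illustrative assumptions: a point making an acute angle with a falls outside the half-space, while a point making an obtuse angle falls inside.

```matlab
a = [1; 1]; b = 0;               % illustrative data
x_acute = [2; 1];                % a^T x = 3 > 0: acute angle with a
x_obtuse = [-1; -3];             % a^T x = -4 < 0: obtuse angle with a
in_acute = (a' * x_acute <= b);      % false: outside the half-space
in_obtuse = (a' * x_obtuse <= b);    % true: inside the half-space

% Angle with a, in degrees (helper defined here for illustration).
ang = @(x) acosd((a' * x) / (norm(a, 2) * norm(x, 2)));
angles = [ang(x_acute), ang(x_obtuse)];   % first below 90, second above 90
```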