Cauchy–Schwarz inequality


In mathematics, the Cauchy–Schwarz inequality is a useful inequality encountered in many different settings, such as linear algebra, analysis, probability theory, vector algebra and other areas. It is considered to be one of the most important inequalities in all of mathematics.^[1] It has a number of generalizations, among them Hölder's inequality.

The inequality for sums was published by Augustin-Louis Cauchy (1821), while the corresponding inequality for integrals was first proved by Viktor Bunyakovsky (1859). The modern proof of the integral inequality was given by Hermann Amandus Schwarz (1888).^[1]

1 Statement of the inequality
2 Proof
3 Alternative proof
4 Special cases
- 4.1 Rⁿ
- 4.2 L²
5 Applications
- 5.1 Probability theory
6 Generalizations
- 6.1 Positive functionals on C*- and W*-algebras
- 6.2 Positive maps
  - 6.2.1 Kadison–Schwarz inequality
  - 6.2.2 2-positive maps
7 Physics
8 See also
9 Notes
10 References
11 External links

Statement of the inequality

The Cauchy–Schwarz inequality states that for all vectors x and y of an inner product space it is true that

|\langle x,y\rangle| ^2 \leq \langle x,x\rangle \cdot \langle y,y\rangle,

where $\langle\cdot,\cdot\rangle$ is the inner product (the dot product on $\mathbb R^n$ is a familiar example). Equivalently, by taking the square root of both sides, and referring to the norms of the vectors, the inequality is written as

|\langle x,y\rangle| \leq \|x\| \cdot \|y\|.\,

^[2]

Moreover, the two sides are equal if and only if x and y are linearly dependent (or, in a geometrical sense, they are parallel or one of the vectors' magnitudes is zero).

If $x_1,\ldots, x_n\in\mathbb C$ and $y_1,\ldots, y_n\in\mathbb C$ are arbitrary complex numbers, the inner product is the standard inner product, and the bar denotes complex conjugation, then the inequality may be restated more explicitly as

|x_1\bar{y}_1 + \cdots + x_n \bar{y}_n|^2 \leq (|x_1|^2 + \cdots + |x_n|^2) (|y_1|^2 + \cdots + |y_n|^2).

When viewed in this way the numbers x₁, ..., x_n, and y₁, ..., y_n are the components of x and y with respect to an orthonormal basis of V.

Even more compactly written:

\left| \sum_{i=1}^n x_i \bar{y}_i \right|^2 \leq \sum_{j=1}^n |x_j|^2 \sum_{k=1}^n |y_k|^2 .

Equality holds if and only if x and y are linearly dependent, that is, one is a scalar multiple of the other (which includes the case when one or both are zero).
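The explicit complex form and its equality case can be checked numerically. The following is a minimal sketch, assuming the standard inner product on Cⁿ (linear in the first argument); the helper name `inner` and the sample vectors are illustrative choices, not part of the statement.

```python
def inner(x, y):
    """Standard inner product on C^n, linear in the first argument."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

# Two arbitrary sample vectors in C^3 (illustrative values).
x = [1 + 2j, 3 - 1j, -2j]
y = [2 - 1j, 1j, 4 + 0j]

lhs = abs(inner(x, y)) ** 2                  # |<x, y>|^2
rhs = inner(x, x).real * inner(y, y).real    # <x, x> * <y, y>
print(lhs <= rhs)  # True

# Equality case: w is a scalar multiple of x, so x and w are linearly dependent.
c = 2 - 3j
w = [c * a for a in x]
gap = inner(x, x).real * inner(w, w).real - abs(inner(x, w)) ** 2
print(abs(gap) < 1e-9)  # True: the two sides agree up to rounding
```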

The finite-dimensional case of this inequality for real vectors was proven by Cauchy in 1821, and in 1859 Cauchy's student Bunyakovsky noted that by taking limits one can obtain an integral form of Cauchy's inequality. The general result for an inner product space was obtained by Schwarz in the year 1888.

Proof

Let u, v be arbitrary vectors in a vector space V over F with an inner product, where F is the field of real or complex numbers. We prove the inequality

\big| \langle u,v \rangle \big| \leq \left\|u\right\| \left\|v\right\|,

and that equality holds only when either u or v is a multiple of the other.

If v = 0 it is clear that we have equality, and in this case u and v are also linearly dependent (regardless of u). We henceforth assume that v is nonzero. We also assume that $\langle u, v \rangle \ne 0$; otherwise the inequality is obviously true, because neither $\left\| u \right\|$ nor $\left\| v \right\|$ can be negative (and equality then forces u = 0, so u and v are again linearly dependent).

Let

z= u-\frac {\langle u, v \rangle} {\langle v, v \rangle} v.

Then, by linearity of the inner product in its first argument, one has

\langle z, v \rangle = \left\langle u -\frac {\langle u, v \rangle} {\langle v, v \rangle} v, v\right\rangle = \langle u, v \rangle - \frac {\langle u, v \rangle} {\langle v, v \rangle} \langle v, v \rangle = 0,

i.e., z is a vector orthogonal to the vector v (indeed, z is the projection of u onto the hyperplane orthogonal to v). We can thus apply the Pythagorean theorem to

u= \frac {\langle u, v \rangle} {\langle v, v \rangle} v+z,

which gives

\left\|u\right\|^2 = \left|\frac{\langle u, v \rangle}{\langle v, v \rangle}\right|^2 \left\|v\right\|^2 + \left\|z\right\|^2 = \frac{|\langle u, v \rangle|^2}{\left\|v\right\|^2} + \left\|z\right\|^2 \geq \frac{|\langle u, v \rangle|^2}{\left\|v\right\|^2},

and, after multiplication by $\left\| v \right\|^2$ , the Cauchy–Schwarz inequality. Moreover, if the relation '≥' in the above expression is actually an equality, then $\left\| z \right\|^2 = 0$ and hence $z = 0$ ; the definition of z then establishes a relation of linear dependence between u and v. This establishes the theorem.
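The steps of this proof can be traced on a concrete pair of vectors. The sketch below assumes the standard inner product on C³; the vectors `u` and `v` and the helper `inner` are illustrative. It forms z, confirms that z is orthogonal to v, and checks the Pythagorean decomposition of ‖u‖².

```python
def inner(x, y):
    """Standard inner product on C^n, linear in the first argument."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

# Illustrative vectors with v nonzero and <u, v> != 0.
u = [1 + 1j, 2 + 0j, -1j]
v = [1j, 1 + 0j, 1 - 1j]

# z = u - (<u, v> / <v, v>) v
coeff = inner(u, v) / inner(v, v)
z = [a - coeff * b for a, b in zip(u, v)]

# z is orthogonal to v (up to rounding).
orthogonality_error = abs(inner(z, v))

# Pythagorean theorem: ||u||^2 = |<u, v>|^2 / ||v||^2 + ||z||^2.
norm_u_sq = inner(u, u).real
decomposition = abs(inner(u, v)) ** 2 / inner(v, v).real + inner(z, z).real

print(orthogonality_error < 1e-12)
print(abs(norm_u_sq - decomposition) < 1e-12)
```

Since ‖z‖² is nonnegative, dropping it from the decomposition gives the inequality; it vanishes exactly when u is a multiple of v.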

Alternative proof

Let u, v be arbitrary vectors in a vector space V over F with an inner product, where F is the field of real or complex numbers.

If $\langle u, v \rangle = 0$ , the theorem holds trivially.

If not, then $u \ne 0$ , $v \ne 0$ . Choose $\lambda = \frac{|\langle v, u \rangle|}{\langle v, u \rangle}$ . Then $|\lambda| = 1$ and

0 \le \left\|\frac{\lambda v}{\|v\|} - \frac{u}{\|u\|}\right\|^2 = |\lambda|^2 \frac{\|v\|^2}{\|v\|^2} - 2 \text{Re}\left(\left\langle \frac{\lambda v}{\|v\|}, \frac{u}{\|u\|} \right\rangle\right) + \frac{\|u\|^2}{\|u\|^2} = 2 - 2 \frac{\lambda \langle v, u \rangle}{\|v\|\|u\|}.

It follows that

|\langle u, v \rangle| = \lambda \langle v, u \rangle \le \|u\|\|v\|.

Special cases

Rⁿ

In Euclidean space $\mathbb R ^n$ with the standard inner product, the Cauchy–Schwarz inequality is

\left(\sum_{i=1}^n x_i y_i\right)^2\leq \left(\sum_{i=1}^n x_i^2\right) \left(\sum_{i=1}^n y_i^2\right).

To prove this form of the inequality, consider the following quadratic polynomial in z.

(x_1 z + y_1)^2 + \cdots + (x_n z + y_n)^2 = \left( \sum x_i^2 \right) z^2 + 2 \left( \sum x_i y_i \right) z + \sum y_i^2

Since it is nonnegative, it has at most one real root in z, whence its discriminant is less than or equal to zero, that is,

\left(\sum ( x_i \cdot y_i ) \right)^2 - \sum {x_i^2} \cdot \sum {y_i^2} \le 0,

which yields the Cauchy–Schwarz inequality.
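The discriminant argument can be followed on a small example. This is a minimal sketch with illustrative vectors in R³; the names `a`, `b`, `c` for the coefficients of the quadratic are introduced here for convenience.

```python
# Illustrative real vectors.
x = [1.0, 2.0, 3.0]
y = [4.0, -5.0, 6.0]

def p(z):
    """The quadratic (x_1 z + y_1)^2 + ... + (x_n z + y_n)^2."""
    return sum((xi * z + yi) ** 2 for xi, yi in zip(x, y))

# Coefficients of p(z) = a z^2 + b z + c.
a = sum(xi * xi for xi in x)
b = 2 * sum(xi * yi for xi, yi in zip(x, y))
c = sum(yi * yi for yi in y)

# p is a sum of squares, so it is nonnegative and its discriminant is <= 0.
discriminant = b * b - 4 * a * c
print(discriminant <= 0)  # True

# Dividing the discriminant by 4 gives (sum x_i y_i)^2 - sum x_i^2 * sum y_i^2.
print((b / 2) ** 2 <= a * c)  # True
```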

An equivalent proof for $\mathbb R ^n$ starts with the summation below.

Expanding the brackets we have:

\sum_{i=1}^n \sum_{j=1}^n \left( x_i y_j - x_j y_i \right)^2 = \sum_{i=1}^n x_i^2 \sum_{j=1}^n y_j^2 + \sum_{j=1}^n x_j^2 \sum_{i=1}^n y_i^2 - 2 \sum_{i=1}^n x_i y_i \sum_{j=1}^n x_j y_j,

collecting together identical terms (albeit with different summation indices) we find:

\frac{1}{2} \sum_{i=1}^n \sum_{j=1}^n \left( x_i y_j - x_j y_i \right)^2 = \sum_{i=1}^n x_i^2 \sum_{i=1}^n y_i^2 - \left( \sum_{i=1}^n x_i y_i \right)^2.

Because the left-hand side of the equation is a sum of the squares of real numbers it is greater than or equal to zero, thus:

\sum_{i=1}^n x_i^2 \sum_{i=1}^n y_i^2 - \left( \sum_{i=1}^n x_i y_i \right)^2 \geq 0.
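The identity behind this argument can be confirmed exactly with integer arithmetic. The sketch below uses illustrative integer vectors; the variable names are chosen here and do not come from the text.

```python
# Illustrative integer vectors, so every quantity is computed exactly.
x = [1, 2, 3]
y = [4, -5, 6]
n = len(x)

# Double sum of (x_i y_j - x_j y_i)^2 over all i, j.
double_sum = sum(
    (x[i] * y[j] - x[j] * y[i]) ** 2
    for i in range(n)
    for j in range(n)
)

# sum x_i^2 * sum y_i^2 - (sum x_i y_i)^2
gap = (
    sum(xi * xi for xi in x) * sum(yi * yi for yi in y)
    - sum(xi * yi for xi, yi in zip(x, y)) ** 2
)

print(double_sum == 2 * gap)  # True: half the double sum equals the gap
print(gap >= 0)               # True: the Cauchy-Schwarz inequality
```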

Yet another approach when n ≥ 2 (n = 1 is trivial) is to consider the plane containing x and y. More precisely, recoordinatize Rⁿ with any orthonormal basis whose first two vectors span a subspace containing x and y. In this basis only $x_1,~x_2,~y_1$ and $y_2$ can be nonzero, and the inequality reduces to the algebra of the dot product in the plane, which is related to the angle between two vectors, from which we obtain the inequality:

|x \cdot y| = \|x\| \|y\| | \cos \theta | \le \|x\| \|y\|.

When n = 3 the Cauchy–Schwarz inequality can also be deduced from Lagrange's identity, which takes the form

\langle x,x\rangle \cdot \langle y,y\rangle = |\langle x,y\rangle|^2 + |x \times y|^2

from which readily follows the Cauchy–Schwarz inequality.

Another proof for general n uses the technique used to prove the inequality of arithmetic and geometric means.

L²

For the inner product space of square-integrable complex-valued functions, one has

\left|\int_{\mathbb{R}^n} f(x) \overline{g(x)}\,dx\right|^2\leq\int_{\mathbb{R}^n} \left|f(x)\right|^2\,dx \cdot \int_{\mathbb{R}^n}\left|g(x)\right|^2\,dx.

A generalization of this is the Hölder inequality.
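The integral form of the inequality can be illustrated numerically. The sketch below makes several assumptions that are not in the text: it works in one dimension, truncates the real line to [−8, 8], approximates the integrals by midpoint sums, and uses two illustrative rapidly decaying functions `f` and `g`.

```python
import cmath
import math

def midpoint_integral(h, a, b, n):
    """Midpoint-rule approximation of the integral of h over [a, b]."""
    dx = (b - a) / n
    return sum(h(a + (k + 0.5) * dx) for k in range(n)) * dx

def f(x):
    # Complex-valued, square-integrable sample function (illustrative).
    return cmath.exp(-x * x + 1j * x)

def g(x):
    # Real-valued, square-integrable sample function (illustrative).
    return complex(math.exp(-abs(x)), 0.0)

A, B, N = -8.0, 8.0, 4000  # truncation interval and grid size (assumptions)

lhs = abs(midpoint_integral(lambda x: f(x) * g(x).conjugate(), A, B, N)) ** 2
rhs = (midpoint_integral(lambda x: abs(f(x)) ** 2, A, B, N)
       * midpoint_integral(lambda x: abs(g(x)) ** 2, A, B, N))

print(lhs <= rhs)  # True
```

The midpoint sums are themselves finite sums of the kind covered by the Rⁿ case, so the discretized inequality holds for them as well, not only for the limiting integrals.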

Applications

The triangle inequality for the standard norm is often shown as a consequence of the Cauchy–Schwarz inequality, as follows: given vectors x and y:

\begin{align} \|x + y\|^2 & = \langle x + y, x + y \rangle \\ & = \|x\|^2 + \langle x, y \rangle + \langle y, x \rangle + \|y\|^2 \\ & = \|x\|^2 + 2 \text{ Re} \langle x, y \rangle + \|y\|^2\\ & \le \|x\|^2 + 2|\langle x, y \rangle| + \|y\|^2 \\ & \le \|x\|^2 + 2\|x\|\|y\| + \|y\|^2 \\ & = \left (\|x\| + \|y\|\right)^2. \end{align}

Taking square roots gives the triangle inequality.
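The chain of (in)equalities above can be followed on concrete vectors. This is a minimal sketch assuming the standard inner product on C²; the vectors and helper names are illustrative.

```python
import math

def inner(x, y):
    """Standard inner product on C^n, linear in the first argument."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

def norm(x):
    return math.sqrt(inner(x, x).real)

# Illustrative vectors in C^2.
x = [1 + 2j, 3 - 1j]
y = [-2 + 1j, 1j]
s = [a + b for a, b in zip(x, y)]  # x + y

# ||x + y||^2 = ||x||^2 + 2 Re<x, y> + ||y||^2
expanded = norm(x) ** 2 + 2 * inner(x, y).real + norm(y) ** 2
# Upper bound after replacing Re<x, y> by |<x, y>| and applying Cauchy-Schwarz.
bound = (norm(x) + norm(y)) ** 2

print(abs(norm(s) ** 2 - expanded) < 1e-9)  # True
print(norm(s) <= norm(x) + norm(y))         # True: the triangle inequality
```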

The Cauchy–Schwarz inequality allows one to extend the notion of "angle between two vectors" to any real inner product space, by defining:

\cos\theta_{xy}=\frac{\langle x,y\rangle}{\|x\| \|y\|}.

The Cauchy–Schwarz inequality proves that this definition is sensible, by showing that the right-hand side lies in the interval [−1, 1], and justifies the notion that (real) Hilbert spaces are simply generalizations of the Euclidean space.
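As an illustration of this definition, the following sketch computes the angle between two vectors of Rⁿ with the standard dot product. The function name `angle` is hypothetical. The clamping step guards only against floating-point rounding, since the Cauchy–Schwarz inequality already guarantees that the exact quotient lies in [−1, 1].

```python
import math

def angle(x, y):
    """Angle between nonzero real vectors x and y, in radians."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    cos_theta = dot / (norm_x * norm_y)
    # Exact arithmetic would keep this in [-1, 1]; clamp to absorb rounding.
    cos_theta = max(-1.0, min(1.0, cos_theta))
    return math.acos(cos_theta)

theta = angle([1.0, 0.0, 0.0, 0.0], [1.0, 1.0, 0.0, 0.0])
print(abs(theta - math.pi / 4) < 1e-12)  # True: the angle is 45 degrees
```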

It can also be used to define an angle in complex inner product spaces, by taking the absolute value of the right-hand side, as is done when extracting a metric from quantum fidelity.

The Cauchy–Schwarz inequality is used to prove that the inner product is a continuous function with respect to the topology induced by the inner product itself.

The Cauchy–Schwarz inequality is commonly used to prove Bessel's inequality.

Probability theory

Let X and Y be random variables with finite variance, and suppose Var(X) > 0. Then:

\text{Var}\left(Y\right)\ge\frac{\text{Cov}\left(Y,X\right)\text{Cov}\left(Y,X\right)}{\text{Var}\left(X\right)}.

In fact we can define an inner product on the set of random variables using the expectation of their product:

\langle X, Y \rangle \triangleq \operatorname{E}(X Y),

and so, by the Cauchy–Schwarz inequality,

|\operatorname{E}(XY)|^2 \leq \operatorname{E}(X^2) \operatorname{E}(Y^2).

Moreover, if μ = E(X) and ν = E(Y), then

\begin{align} |\operatorname{Cov}(X,Y)|^2 &= |\operatorname{E}( (X - \mu)(Y - \nu) )|^2 \\ &= | \langle X - \mu, Y - \nu \rangle |^2\\ &\leq \langle X - \mu, X - \mu \rangle \langle Y - \nu, Y - \nu \rangle \\ & = \operatorname{E}( (X-\mu)^2 ) \operatorname{E}( (Y-\nu)^2 ) \\ & = \operatorname{Var}(X) \operatorname{Var}(Y), \end{align}

where Var denotes variance and Cov denotes covariance.
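The covariance bound can be observed on simulated data. The sketch below is an assumption-laden illustration: it draws a finite sample and uses empirical moments (dividing by the sample size), which are the moments of the uniform distribution on the sample, so the inequality applies to them exactly. The data-generating model and all names are hypothetical.

```python
import math
import random

random.seed(0)  # fixed seed so the run is reproducible

# Illustrative sample: Y depends linearly on X plus independent noise.
xs = [random.gauss(0.0, 1.0) for _ in range(1000)]
ys = [0.5 * x + random.gauss(0.0, 1.0) for x in xs]

def mean(values):
    return sum(values) / len(values)

def cov(a, b):
    """Empirical covariance E((A - E A)(B - E B)) over the sample."""
    ma, mb = mean(a), mean(b)
    return mean([(p - ma) * (q - mb) for p, q in zip(a, b)])

var_x = cov(xs, xs)
var_y = cov(ys, ys)
cov_xy = cov(xs, ys)

print(cov_xy ** 2 <= var_x * var_y)  # True: |Cov(X, Y)|^2 <= Var(X) Var(Y)

# Equivalently, the correlation coefficient lies in [-1, 1].
corr = cov_xy / math.sqrt(var_x * var_y)
print(-1.0 <= corr <= 1.0)  # True
```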

Generalizations

Various generalizations of the Cauchy–Schwarz inequality exist in the context of operator theory, e.g. for operator-convex functions, and operator algebras, where the domain and/or range of φ are replaced by a C*-algebra or W*-algebra.

This section lists a few of such inequalities from the operator algebra setting, to give a flavor of results of this type.

Positive functionals on C*- and W*-algebras

One can discuss inner products as positive functionals. Given a Hilbert space L²(m), m being a finite measure, the inner product < · , · > gives rise to a positive functional φ by

\phi (g) = \langle g, 1 \rangle.

Since < f, f > ≥ 0, φ(f*f) ≥ 0 for all f in L²(m), where f* is the pointwise conjugate of f. So φ is positive. Conversely, every positive functional φ gives a corresponding inner product < f, g >_φ = φ(g*f). In this language, the Cauchy–Schwarz inequality becomes

| \phi(g^*f) |^2 \leq \phi(f^*f) \phi(g^*g),

which extends verbatim to positive functionals on C*-algebras.

We now give an operator theoretic proof for the Cauchy–Schwarz inequality which passes to the C*-algebra setting. One can see from the proof that the Cauchy–Schwarz inequality is a consequence of the positivity and conjugate-symmetry axioms of the inner product.

Consider the positive matrix

M = \begin{bmatrix} f^*\\ g^* \end{bmatrix} \begin{bmatrix} f & g \end{bmatrix} = \begin{bmatrix} f^*f & f^* g \\ g^*f & g^*g \end{bmatrix}.

Since φ is a positive linear map whose range, the complex numbers C, is a commutative C*-algebra, φ is completely positive. Therefore

M' = (I_2 \otimes \phi)(M) = \begin{bmatrix} \phi(f^*f) & \phi(f^* g) \\ \phi(g^*f) & \phi(g^*g) \end{bmatrix}

is a positive 2 × 2 scalar matrix, which implies it has nonnegative determinant:

\phi \left (f^*f \right) \phi \left (g^*g \right) - \left | \phi \left (g^*f \right) \right |^2 \geq 0 \quad \text{i.e.} \quad \phi\left (f^*f \right) \phi \left (g^*g \right) \geq \left | \phi \left (g^*f \right ) \right|^2.

This is precisely the Cauchy–Schwarz inequality. If f and g are elements of a C*-algebra, f* and g* denote their respective adjoints.
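The matrix argument can be made concrete in the simplest commutative setting. The sketch below assumes the counting measure on three points, so that functions are vectors in C³ and φ(h) = ⟨h, 1⟩ is the sum of the entries of h; this choice of measure, the sample functions, and the helper names are illustrative assumptions.

```python
def phi(h):
    """phi(h) = <h, 1> for the counting measure on n points: the sum of entries."""
    return sum(h)

def star_times(a, b):
    """Pointwise product a* b, where a* is the pointwise conjugate of a."""
    return [p.conjugate() * q for p, q in zip(a, b)]

# Illustrative functions on a three-point space.
f = [1 + 2j, 3 - 1j, -2j]
g = [2 - 1j, 1j, 4 + 0j]

# Entries of M' = [[phi(f*f), phi(f*g)], [phi(g*f), phi(g*g)]].
m11 = phi(star_times(f, f)).real
m12 = phi(star_times(f, g))
m21 = phi(star_times(g, f))
m22 = phi(star_times(g, g)).real

# M' is Hermitian with nonnegative diagonal entries and nonnegative determinant.
det = m11 * m22 - (m12 * m21).real
print(abs(m12 - m21.conjugate()) < 1e-12)  # True
print(det >= 0)                            # True: phi(f*f) phi(g*g) >= |phi(g*f)|^2
```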

We can also deduce from above that every positive linear functional is bounded, corresponding to the fact that the inner product is jointly continuous.

Positive maps

Positive functionals are special cases of positive maps. A linear map Φ between C*-algebras is said to be a positive map if a ≥ 0 implies Φ(a) ≥ 0. It is natural to ask whether inequalities of Schwarz-type exist for positive maps. In this more general setting, usually additional assumptions are needed to obtain such results.

Kadison–Schwarz inequality

The following theorem is named after Richard Kadison.

Theorem. If $\Phi$ is a unital positive map, then for every normal element $a$ in its domain, we have $\Phi(a^*a) \ge \Phi(a^*) \Phi(a)$ and $\Phi(a^*a) \ge \Phi(a) \Phi(a^*)$ .

This extends the fact $\varphi(a^*a) \cdot 1 \ge \varphi(a)^* \varphi(a) = |\varphi(a)|^2$ , when $\varphi$ is a unital positive linear functional.

The case when $a$ is self-adjoint, i.e. $a = a^*$ , is sometimes known as Kadison's inequality.

2-positive maps

When Φ is 2-positive, a stronger assumption than merely positive, one has something that looks very similar to the original Cauchy–Schwarz inequality:

Theorem (Modified Schwarz inequality for 2-positive maps).^[3] For a 2-positive map Φ between C*-algebras, for all a, b in its domain,

\Phi(a)^*\Phi(a) \leq \Vert\Phi(1)\Vert\Phi(a^*a), \qquad (1)

\Vert\Phi(a^*b)\Vert^2 \leq \Vert\Phi(a^*a)\Vert \cdot \Vert\Phi(b^*b)\Vert. \qquad (2)

A simple argument for (2) is as follows. Consider the positive matrix

M= \begin{bmatrix} a^* & 0 \\ b^* & 0 \end{bmatrix} \begin{bmatrix} a & b \\ 0 & 0 \end{bmatrix} = \begin{bmatrix} a^*a & a^* b \\ b^*a & b^*b \end{bmatrix}.

By 2-positivity of Φ,

(I_2 \otimes \Phi) M = \begin{bmatrix} \Phi(a^*a) & \Phi(a^* b) \\ \Phi(b^*a) & \Phi(b^*b) \end{bmatrix}

is positive. The desired inequality then follows from the properties of positive 2 × 2 (operator) matrices.

Part (1) is analogous. One can replace the matrix $\begin{bmatrix} a & b \\ 0 & 0 \end{bmatrix}$ by $\begin{bmatrix} 1 & a \\ 0 & 0 \end{bmatrix}.$

Physics

The general formulation of the Heisenberg uncertainty principle is derived using the Cauchy–Schwarz inequality.

Notes

[1] J. Michael Steele, The Cauchy–Schwarz Master Class: an Introduction to the Art of Mathematical Inequalities, Ch. 1.

References

J.M. Aldaz, S. Barza, M. Fujii and M.S. Moslehian, Advances in Operator Cauchy--Schwarz inequalities and their reverses, Ann. Funct. Anal. 6 (2015), no. 3, 275--295.


External links

Earliest Uses: The entry on the Cauchy–Schwarz inequality has some historical information.
Example of application of Cauchy–Schwarz inequality to determine Linearly Independent Vectors Tutorial and Interactive program.


