Gaussian elimination

From Infogalactic: the planetary knowledge core
(Redirected from Gauss–Jordan elimination)
Jump to: navigation, search

Lua error in package.lua at line 80: module 'strict' not found. In linear algebra, Gaussian elimination (also known as row reduction) is an algorithm for solving systems of linear equations. It is usually understood as a sequence of operations performed on the associated matrix of coefficients. This method can also be used to find the rank of a matrix, to calculate the determinant of a matrix, and to calculate the inverse of an invertible square matrix. The method is named after Carl Friedrich Gauss (1777–1855), although it was known to Chinese mathematicians as early as 179 CE (see History section).

To perform row reduction on a matrix, one uses a sequence of elementary row operations to modify the matrix until the lower left-hand corner of the matrix is filled with zeros, as much as possible. There are three types of elementary row operations: 1) Swapping two rows, 2) Multiplying a row by a non-zero number, 3) Adding a multiple of one row to another row. Using these operations, a matrix can always be transformed into an upper triangular matrix, and in fact one that is in row echelon form. Once all of the leading coefficients (the left-most non-zero entry in each row) are 1, and in every column containing a leading coefficient has zeros elsewhere, the matrix is said to be in reduced row echelon form. This final form is unique; in other words, it is independent of the sequence of row operations used. For example, in the following sequence of row operations (where multiple elementary operations might be done at each step), the third and fourth matrices are the ones in row echelon form, and the final matrix is the unique reduced row echelon form.

\left[\begin{array}{rrr|r}
1 & 3 & 1 & 9 \\
1 & 1 & -1 & 1 \\
3 & 11 & 5 & 35
\end{array}\right]\to
\left[\begin{array}{rrr|r}
1 & 3 & 1 & 9 \\
0 & -2 & -2 & -8 \\
0 & 2 & 2 & 8
\end{array}\right]\to
\left[\begin{array}{rrr|r}
1 & 3 & 1 & 9 \\
0 & -2 & -2 & -8 \\
0 & 0 & 0 & 0
\end{array}\right]\to
\left[\begin{array}{rrr|r}
1 & 0 & -2 & -3 \\
0 & 1 & 1 & 4 \\
0 & 0 & 0 & 0
\end{array}\right]

Using row operations to convert a matrix into reduced row echelon form is sometimes called Gauss–Jordan elimination. Some authors use the term Gaussian elimination to refer to the process until it has reached its upper triangular, or (non-reduced) row echelon form. For computational reasons, when solving systems of linear equations, it is sometimes preferable to stop row operations before the matrix is completely reduced.

Definitions and example of algorithm

The process of row reduction makes use of elementary row operations, and can be divided into two parts. The first part (sometimes called Forward Elimination) reduces a given system to row echelon form, from which one can tell whether there are no solutions, a unique solution, or infinitely many solutions. The second part (sometimes called back substitution) continues to use row operations until the solution is found; in other words, it puts the matrix into reduced row echelon form.

Another point of view, which turns out to be very useful to analyze the algorithm, is that row reduction produces a matrix decomposition of the original matrix. The elementary row operations may be viewed as the multiplication on the left of the original matrix by elementary matrices. Alternatively, a sequence of elementary operations that reduces a single row may be viewed as multiplication by a Frobenius matrix. Then the first part of the algorithm computes an LU decomposition, while the second part writes the original matrix as the product of a uniquely determined invertible matrix and a uniquely determined reduced row echelon matrix.

Row operations

<templatestyles src="Module:Hatnote/styles.css"></templatestyles>

There are three types of elementary row operations which may be performed on the rows of a matrix:

Type 1: Swap the positions of two rows.
Type 2: Multiply a row by a nonzero scalar.
Type 3: Add to one row a scalar multiple of another.

If the matrix is associated to a system of linear equations, then these operations do not change the solution set. Therefore, if one's goal is to solve a system of linear equations, then using these row operations could make the problem easier.

Echelon form

<templatestyles src="Module:Hatnote/styles.css"></templatestyles>

For each row in a matrix, if the row does not consist of only zeros, then the left-most non-zero entry is called the leading coefficient (or pivot) of that row. So if two leading coefficients are in the same column, then a row operation of type 3 (see above) could be used to make one of those coefficients zero. Then by using the row swapping operation, one can always order the rows so that for every non-zero row, the leading coefficient is to the right of the leading coefficient of the row above. If this is the case, then matrix is said to be in row echelon form. So the lower left part of the matrix contains only zeros, and all of the zero rows are below the non-zero rows. The word "echelon" is used here because one can roughly think of the rows being ranked by their size, with the largest being at the top and the smallest being at the bottom.

For example, the following matrix is in row echelon form, and its leading coefficients are shown in red.

\left[ \begin{array}{cccc} 0&\color{red}{\mathbf{2}}&1&-1 \\ 0&0&\color{red}{\mathbf{3}}&1 \\ 0&0&0&0 \end{array} \right]

It is in echelon form because the zero row is at the bottom, and the leading coefficient of the second row (in the third column), is to the right of the leading coefficient of the first row (in the second column).

A matrix is said to be in reduced row echelon form if furthermore all of the leading coefficients are equal to 1 (which can be achieved by using the elementary row operation of type 2), and in every column containing a leading coefficient, all of the other entries in that column are zero (which can be achieved by using elementary row operations of type 3).

Example of the algorithm

Suppose the goal is to find and describe the set of solutions to the following system of linear equations:

\begin{alignat}{7}
2x &&\; + \;&& y             &&\; - \;&& z  &&\; = \;&& 8 & \qquad (L_1) \\
-3x &&\; - \;&& y             &&\; + \;&& 2z &&\; = \;&& -11 & \qquad (L_2) \\
-2x &&\; + \;&& y &&\; +\;&& 2z  &&\; = \;&& -3 &  \qquad (L_3)
\end{alignat}

The table below is the row reduction process applied simultaneously to the system of equations, and its associated augmented matrix. In practice, one does not usually deal with the systems in terms of equations but instead makes use of the augmented matrix, which is more suitable for computer manipulations. The row reduction procedure may be summarized as follows: eliminate x from all equations below L_1, and then eliminate y from all equations below L_2. This will put the system into triangular form. Then, using back-substitution, each unknown can be solved for.

System of equations Row operations Augmented matrix
\begin{alignat}{7}
2x &&\; + \;&& y             &&\; - \;&& z  &&\; = \;&& 8 & \\
-3x &&\; - \;&& y             &&\; + \;&& 2z &&\; = \;&& -11 & \\
-2x &&\; + \;&& y &&\; +\;&& 2z  &&\; = \;&& -3 &
\end{alignat} 
\left[ \begin{array}{ccc|c}
2 & 1 & -1 & 8 \\
-3 & -1 & 2 & -11 \\
-2 & 1 & 2 & -3
\end{array} \right]
\begin{alignat}{7}
2x &&\; + && y &&\; - &&\; z &&\; = \;&& 8 &  \\
&& && \frac{1}{2}y &&\; + &&\; \frac{1}{2}z &&\; = \;&& 1 & \\
&& && 2y &&\; + &&\; z &&\; = \;&& 5 &
\end{alignat} L_2 + \frac{3}{2}L_1 \rightarrow L_2
L_3 + L_1 \rightarrow L_3

\left[ \begin{array}{ccc|c}
2 & 1 & -1 & 8 \\
0 & 1/2 & 1/2 & 1 \\
0 & 2 & 1 & 5
\end{array} \right]

\begin{alignat}{7}
2x &&\; + && y \;&& - &&\; z \;&& = \;&& 8 &  \\
&& && \frac{1}{2}y \;&& + &&\; \frac{1}{2}z \;&& = \;&& 1 & \\
&& && && &&\; -z \;&&\; = \;&& 1 &
\end{alignat} L_3 + -4L_2 \rightarrow L_3 \left[ \begin{array}{ccc|c}
2 & 1 & -1 & 8 \\
0 & 1/2 & 1/2 & 1 \\
0 & 0 & -1 & 1
\end{array} \right]
The matrix is now in echelon form (also called triangular form)
\begin{alignat}{7}
2x &&\; + && y \;&& &&\; \;&& = \;&& 7 &  \\
&& && \frac{1}{2}y \;&&  &&\; \;&& = \;&& \frac{3}{2} & \\
&& && && &&\; -z \;&&\; = \;&& 1 &
\end{alignat} L_2+\frac{1}{2}L_3 \rightarrow L_2
L_1 - L_3 \rightarrow L_1
\left[ \begin{array}{ccc|c}
2 & 1 & 0 & 7 \\
0 & 1/2 & 0 & 3/2 \\
0 & 0 & -1 & 1
\end{array} \right]
\begin{alignat}{7}
2x &&\; + && y \;&& &&\; \;&& = \;&& 7 &  \\
&& && y \;&&  &&\; \;&& = \;&& 3 & \\
&& && && &&\; z \;&&\; = \;&& -1 &
\end{alignat} 2L_2 \rightarrow L_2
-L_3 \rightarrow L_3
\left[ \begin{array}{ccc|c}
2 & 1 & 0 & 7 \\
0 & 1 & 0 & 3 \\
0 & 0 & 1 & -1
\end{array} \right]
\begin{alignat}{7}
x &&\;  &&  \;&& &&\; \;&& = \;&& 2 &  \\
&& && y \;&&  &&\; \;&& = \;&& 3 & \\
&& && && &&\; z \;&&\; = \;&& -1 &
\end{alignat} L_1 - L_2 \rightarrow L_1
\frac{1}{2}L_1 \rightarrow L_1
\left[ \begin{array}{ccc|c}
1 & 0 & 0 & 2 \\
0 & 1 & 0 & 3 \\
0 & 0 & 1 & -1
\end{array} \right]

The second column describes which row operations have just been performed. So for the first step, the x is eliminated from L_2 by adding \begin{matrix}\frac{3}{2}\end{matrix} L_1 to L_2. Next x is eliminated from L_3 by adding L_1 to L_3. These row operations are labelled in the table as

L_2 + \frac{3}{2}L_1 \rightarrow L_2
L_3 + L_1 \rightarrow L_3.

Once y is also eliminated from the third row, the result is a system of linear equations in triangular form, and so the first part of the algorithm is complete. From a computational point of view, it is faster to solve the variables in reverse order, a process known as back-substitution. One sees the solution is z = -1, y = 3, and x = 2. So there is a unique solution to the original system of equations.

Instead of stopping once the matrix is in echelon form, one could continue until the matrix is in reduced row echelon form, as it is done in the table. The process of row reducing until the matrix is reduced is sometimes referred to as Gauss-Jordan elimination, to distinguish it from stopping after reaching echelon form.

History

The method of Gaussian elimination appears in the Chinese mathematical text Chapter Eight Rectangular Arrays of The Nine Chapters on the Mathematical Art. Its use is illustrated in eighteen problems, with two to five equations. The first reference to the book by this title is dated to 179 CE, but parts of it were written as early as approximately 150 BCE.[1][2] It was commented on by Liu Hui in the 3rd century.

The method in Europe stems from the notes of Isaac Newton.[3][4] In 1670, he wrote that all the algebra books known to him lacked a lesson for solving simultaneous equations, which Newton then supplied. Cambridge University eventually published the notes as Arithmetica Universalis in 1707 long after Newton left academic life. The notes were widely imitated, which made (what is now called) Gaussian elimination a standard lesson in algebra textbooks by the end of the 18th century. Carl Friedrich Gauss in 1810 devised a notation[which?] for symmetric elimination that was adopted in the 19th century by professional hand computers to solve the normal equations of least-squares problems. The algorithm that is taught in high school was named for Gauss only in the 1950s as a result of confusion over the history of the subject.[5]

Some authors use the term Gaussian elimination to refer only to the procedure until the matrix is in echelon form, and use the term Gauss-Jordan elimination to refer to the procedure which ends in reduced echelon form. The name is used because it is a variation of Gaussian elimination as described by Wilhelm Jordan in 1887. However, the method also appears in an article by Clasen published in the same year. Jordan and Clasen probably discovered Gauss–Jordan elimination independently.[6]

Applications

The historically first application of the row reduction method is for solving systems of linear equations. Here are some other important applications of the algorithm.

Computing determinants

To explain how Gaussian elimination allows the computation of the determinant of a square matrix, we have to recall how the elementary row operations change the determinant:

  • Swapping two rows multiplies the determinant by -1
  • Multiplying a row by a nonzero scalar multiplies the determinant by the same scalar
  • Adding to one row a scalar multiple of another does not change the determinant.

If the Gaussian elimination applied to a square matrix A produces a row echelon matrix B, let d be the product of the scalars by which the determinant has been multiplied, using above rules. Then the determinant of A is the quotient by d of the product of the elements of the diagonal of B: det(A) = ∏diag(B) / d.

Computationally, for a n×n matrix, this method needs only O(n3) arithmetic operations, while solving by elementary methods requires O(2n) or O(n!) operations. Even on the fastest computers, the elementary methods are impractical for n above 20.

Finding the inverse of a matrix

<templatestyles src="Module:Hatnote/styles.css"></templatestyles>

A variant of Gaussian elimination called Gauss–Jordan elimination can be used for finding the inverse of a matrix, if it exists. If A is a n by n square matrix, then one can use row reduction to compute its inverse matrix, if it exists. First, the n by n identity matrix is augmented to the right of A, forming a n by 2n block matrix [A | I]. Now through application of elementary row operations, find the reduced echelon form of this n by 2n matrix. The matrix A is invertible if and only if the left block can be reduced to the identity matrix I; in this case the right block of the final matrix is A−1. If the algorithm is unable to reduce the left block to I, then A is not invertible.

For example, consider the following matrix

 A =
\begin{bmatrix}
2 & -1 & 0 \\
-1 & 2 & -1 \\
0 & -1 & 2
\end{bmatrix}.

To find the inverse of this matrix, one takes the following matrix augmented by the identity, and row reduces it as a 3 by 6 matrix:

 [ A | I ] = 
\left[ \begin{array}{rrr|rrr}
2 & -1 & 0 & 1 & 0 & 0\\
-1 & 2 & -1 & 0 & 1 & 0\\
0 & -1 & 2 & 0 & 0 & 1
\end{array} \right].

By performing row operations, one can check that the reduced row echelon form of this augmented matrix is:

 [ I | B ] = 
\left[ \begin{array}{rrr|rrr}
1 & 0 & 0 & \frac{3}{4} & \frac{1}{2} & \frac{1}{4}\\[3pt]
0 & 1 & 0 & \frac{1}{2} & 1 & \frac{1}{2}\\[3pt]
0 & 0 & 1 & \frac{1}{4} & \frac{1}{2} & \frac{3}{4}
\end{array} \right].

One can think of each row operation as the left product by an elementary matrix. Denoting by B the product of these elementary matrices, we showed, on the left, that BA = I, and therefore, B = A−1. On the right, we kept a record of BI = B, which we know is the inverse desired. This procedure for finding the inverse works for square matrices of any size.

Computing ranks and bases

The Gaussian elimination algorithm can be applied to any m \times n matrix A. In this way, for example, some 6 \times 9 matrices can be transformed to a matrix that has a row echelon form like

 T=
\begin{bmatrix}
a & * & * & *& * & * & * & * & * \\
0 & 0 & b & * & * & * & * & * & * \\
0 & 0 & 0 & c & * & * & * & * & * \\
0 & 0 & 0 & 0 & 0 & 0 & d & * & * \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & e \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0
\end{bmatrix}

where the *s are arbitrary entries and a, b, c, d, e are nonzero entries. This echelon matrix T contains a wealth of information about A: the rank of A is 5 since there are 5 non-zero rows in T; the vector space spanned by the columns of A has a basis consisting of the first, third, fourth, seventh and ninth column of A (the columns of a, b, c, d, e in T), and the *s tell you how the other columns of A can be written as linear combinations of the basis columns. This is a consequence of the distributivity of the dot product in the expression of a linear map as a matrix.

All of this applies also to the reduced row echelon form, which is a particular row echelon form.

Computational efficiency

The number of arithmetic operations required to perform row reduction is one way of measuring the algorithm's computational efficiency. For example, to solve a system of n equations for n unknowns by performing row operations on the matrix until it is in echelon form, and then solving for each unknown in reverse order, requires n(n+1) / 2 divisions, (2n3 + 3n2 − 5n)/6 multiplications, and (2n3 + 3n2 − 5n)/6 subtractions,[7] for a total of approximately 2n3 / 3 operations. Thus it has arithmetic complexity of O(n3); see Big O notation. This arithmetic complexity is a good measure of the time needed for the whole computation when the time for each arithmetic operation is approximately constant. This is the case when the coefficients are represented by floating point numbers or when they belong to a finite field. If the coefficients are integers or rational numbers exactly represented, the intermediate entries can grow exponentially large, so the bit complexity is exponential.[8] However, there is a variant of Gaussian elimination, called Bareiss algorithm that avoids this exponential growth of the intermediate entries, and, with the same arithmetic complexity of O(n3), has a bit complexity of O(n5).

This algorithm can be used on a computer for systems with thousands of equations and unknowns. However, the cost becomes prohibitive for systems with millions of equations. These large systems are generally solved using iterative methods. Specific methods exist for systems whose coefficients follow a regular pattern (see system of linear equations).

To put an n by n matrix into reduced echelon form by row operations, one needs n^3 arithmetic operations; which is approximately 50% more computation steps.[9]

One possible problem is numerical instability, caused by the possibility of dividing by very small numbers. If, for example, the leading coefficient of one of the rows is very close to zero, then to row reduce the matrix one would need to divide by that number so the leading coefficient is 1. This means any error that existed for the number which was close to zero would be amplified. Gaussian elimination is numerically stable for diagonally dominant or positive-definite matrices. For general matrices, Gaussian elimination is usually considered to be stable, when using partial pivoting, even though there are examples of stable matrices for which it is unstable.[10]

Generalizations

The Gaussian elimination can be performed over any field, not just the real numbers.

Gaussian elimination does not generalize in any simple way to higher order tensors (matrices are array representations of order 2 tensors); even computing the rank of a tensor of order greater than 2 is a difficult problem.

Pseudocode

As explained above, Gaussian elimination writes a given m × n matrix A uniquely as a product of an invertible m × m matrix S and a row-echelon matrix T. Here, S is the product of the matrices corresponding to the row operations performed.

The formal algorithm to compute T from A follows. We write A[i,j] for the entry in row i, column j in matrix A with 1 being the first index. The transformation is performed in place, meaning that the original matrix A is lost and successively replaced by T.

 for k = 1 ... min(m,n):
   Find the k-th pivot:
   i_max  := argmax (i = k ... m, abs(A[i, k]))
   if A[i_max, k] = 0
     error "Matrix is singular!"
   swap rows(k, i_max)
   Do for all rows below pivot:
   for i = k + 1 ... m:
     m := A[i, k] / A[k, k]
     Do for all remaining elements in current row:
     for j = k + 1 ... n:
       A[i, j]  := A[i, j] - A[k, j] * m
     Fill lower triangular matrix with zeros:
     A[i, k]  := 0

This algorithm differs slightly from the one discussed earlier, because before eliminating a variable, it first exchanges rows to move the entry with the largest absolute value to the pivot position. Such partial pivoting improves the numerical stability of the algorithm; some other variants are used.

Upon completion of this procedure the augmented matrix will be in row-echelon form and may be solved by back-substitution.

With modern computers, Gaussian elimination is not always the fastest algorithm to compute the row echelon form of matrix. There are computer libraries, like BLAS, that exploit the specifics of the computer hardware and of the structure of the matrix to choose the best algorithm automatically.

Notes

  1. Calinger (1999), pp. 234–236
  2. Lua error in package.lua at line 80: module 'strict' not found.
  3. Grcar (2011a), pp. 169-172
  4. Grcar (2011b), pp. 783-785
  5. Grcar (2011b), p. 789
  6. Lua error in package.lua at line 80: module 'strict' not found.
  7. Farebrother (1988), p. 12
  8. Lua error in package.lua at line 80: module 'strict' not found.
  9. J. B. Fraleigh and R. A. Beauregard, Linear Algebra. Addison-Wesley Publishing Company, 1995, Chapter 10
  10. Golub & Van Loan (1996), §3.4.6

References

  • Lua error in package.lua at line 80: module 'strict' not found..
  • Lua error in package.lua at line 80: module 'strict' not found..
  • Lua error in package.lua at line 80: module 'strict' not found..
  • Lua error in package.lua at line 80: module 'strict' not found..
  • Lua error in package.lua at line 80: module 'strict' not found..
  • Lua error in package.lua at line 80: module 'strict' not found.
  • Lua error in package.lua at line 80: module 'strict' not found.
  • Lua error in package.lua at line 80: module 'strict' not found..
  • Lua error in package.lua at line 80: module 'strict' not found..
  • Lua error in package.lua at line 80: module 'strict' not found., Chapter 5 deals with Gaussian elimination.
  • Lua error in package.lua at line 80: module 'strict' not found..
  • Lua error in package.lua at line 80: module 'strict' not found.

External links