How are Linear Algebraic Equations Solved?

Part 3

There are a lot of reasons for using a better decomposition than LU, and the tutorial on “What is Hard..” explains some of those.  Instead we would like to use a Singular Value Decomposition in most cases.  The SVD for a matrix A has these properties:

·         A = U * S * VT

·         If A is of size m rows by n columns then theoretically

o   U is of size m by m

o   S is a diagonal matrix of size m by n, with at most min(m,n) non-zero elements on that diagonal

o   VT is of size n by n.

·         Both U*UT  and UT * U are identity matrices (of size m) so U and UT are inverses of each other

·         Both V*VT  and VT * V are identity matrices (of size n) so V and VT are inverses of each other

·         The diagonal elements of S are non-increasing.  That means they typically decrease, but values can repeat.

Here is how one solves A*X=B when an SVD is available:

·         We have                                              A*X=B

·         Substituting the SVD we get       U * S * VT * X = B

·         Multiply through by UT :                UT *U * S * VT * X = UT *B

·         But U*UT  = I, so                                S * VT * X = UT *B

·         But UT *B is easily computed so we can call it B’ and have

S * VT * X = B’

·         Which we relabel as                        S * X’ = B’                            (1)     because when we get X’ then computing X is easy.

·         Now since S is a diagonal matrix, it is verse is just its transpose, with each diagonal element replaced by its inverse.

·         That is,                                                  X’ = S-1*B’.

Now that seems easy… but remember that S can have some singular values which are zero, and in that case we can’t compute the elements of  S-1.  Suppose the last two singular values are zero.  Then we simply have to realize that the last two equations in (1) are impossible to satisfy (unless we are so lucky that the last two elements of B’ are also zero!).  In fact, they represent conflicting information within the system of equations.  What we have to do is just drop those two equations, which can be done by setting the last two values of S-1 to zero.  Then we have two equations with all zeros on the left side and something hopefully zero on the right, but probably not.   When we modify S-1 that way we call it a pseudo-inverse and write it as S+.  So we can continue the solution process, realizing we have made an unavoidable error which will affect the final results.

·         So we have                                         X’ = S+*B’             which is easily computed,

·         and then we have                           VT * X = X’

·         and                                                        V* VT * X = V*X’

·         and finally                                           X = V*X’

·         and it is nice to look back and see this means      X =  V* S+*B’   = V* S+* UT *B.

# Example

Let’s return to the 3 by 3 matrix problem we used earlier:

1*x + 2 *y + 3*z = 14

2*x + 2*y + 3*z = 15

1*x – 2*y + 2*z  = 3

The corresponding matrix problem, A*X=B has

A=

 1 2 3 2 2 3 1 -2 2

and B=

 14 15 3

The SVD of A is U * S * VT where

U=

 0.653 0.1881 -0.7336 0.7263 0.1189 0.677 0.2146 -0.9749 -0.059

and S=

 5.627 0 0 0 2.817 0 0 0 0.6309

and V=

 0.4123 -0.1949 0.89 0.4139 0.9103 0.0076 0.8116 -0.3653 -0.456

We want to compute X = V* S+* UT *B .

UT *B is easy to compute:

 20.68 1.492 -0.292

S+ is

 0.1777 0 0 0 0.355 0 0 0 1.585

So S+* UT *B is the product of the above matrix times the above vector and is

 3.675 0.5298 -0.4628

And finally X= V times the above which is

 1 2 3

Just as we thought!

(If there are arithmetic errors here it is due to my copying computed values into these figures!)

# Summary

So if you have the SVD (that is, the U,S,and V) for a matrix A, it is very straightforward to compute a solution unless the singular values drop to zero.

However, there is another problem: if the singular values drop very near to zero then the latter elements in S+ can then become huge, so the latter elements of S+* UT *B also become huge, and the final solution then has huge elements, typically with random signs.  More on this in the next tutorial.