**Solving Ill-Conditioned Systems with the Solve* Programs**

The Solve* programs can solve a set of equations in up to seven ways. (The programs currently offered do not include all of these methods at this time; if you would like to try a method not currently delivered, email me a request.) The reason for offering several is that we cannot know for sure which is most appropriate for your problem. Here is a description of each way.

In each case, the problem we are solving is the ordinary equations portion of the three-way problem:

Ax = b

where A has m rows and n columns, and the SVD of A is

A=USV^{T}.

**Note: In ALL CASES, we begin by scaling the rows of the matrix A to unit norm, and carry along the right-hand-side values and the error estimates, if present.** Then the rest of the solution process is as described:
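As a concrete sketch of this preprocessing step (in Python with NumPy; the function name `scale_rows` is ours, not taken from the Solve* programs):

```python
import numpy as np

def scale_rows(A, b, err=None):
    """Scale each row of A to unit Euclidean norm, carrying along the
    right-hand-side values and the per-equation error estimates."""
    norms = np.linalg.norm(A, axis=1)
    norms[norms == 0] = 1.0           # leave all-zero rows untouched
    As = A / norms[:, None]
    bs = b / norms
    errs = None if err is None else err / norms
    return As, bs, errs
```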

**CLS, or Classical Least Squares.** This method solves the equations exactly as described in Solution Method. No use is made of any error estimates, if provided.
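In SVD terms, the classical least-squares solution is x = VS^{-1}U^{T}b. A minimal sketch, assuming a reasonably well-conditioned A (the Solve* programs' own RTPI routine is not reproduced here):

```python
import numpy as np

def cls_solve(A, b):
    """Classical least squares via the SVD: x = V S^{-1} U^T b."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return Vt.T @ ((U.T @ b) / s)
```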

**WLS, or Weighted Least Squares.** This method takes the median of the scaled error estimates. Each equation is then rescaled so that its error estimate exactly equals that median. The equations are then solved exactly as described in Solution Method.
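A sketch of the WLS rescaling, assuming the rows have already been scaled to unit norm as noted above (the function name is our assumption):

```python
import numpy as np

def wls_scale(A, b, err):
    """Rescale each equation so its error estimate equals the median
    of the given (already row-scaled) error estimates; the rescaled
    system is then solved exactly as in CLS."""
    med = np.median(err)
    f = med / err                     # per-equation scale factor
    return A * f[:, None], b * f
```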

**DIS, or Discrepancy Method.** This is the first of the regularized methods. In the regularized methods we (usually) augment the ordinary equations with a scaled set of equations which ask for the solution to be all zeros. That is, Ax = b becomes the augmented system:

(A)x = b (m rows by n columns)

(λI)x = 0 (n rows by n columns; that is, I is an identity matrix of size n by n)

So we add n rows to A, each of which is all zeros except for λ in one position. The right-hand-side vector b is augmented by n zeros. We relabel this augmented system as A′x = b′. When we pick a value of λ and solve, we get a solution x_{λ} which has a larger least-squares residual than the minimum. This residual vector is

res(λ) = A x_{λ} − b.

Generally, the norm of res(λ) is an increasing function of λ. The discrepancy method adjusts λ until the norm of res(λ) equals the norm of the vector of error estimates. That is, we have induced a *discrepancy* equal to the expected error vector.

Now, to perform the three-way system solution with DIS, we
follow the description in Solution Method
except that instead of solving the ordinary equations with x_{a}= RTPI(A_{o})*b_{o}, we compute x_{a}
using this discrepancy method, with the given set of error estimates.
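The discrepancy search can be sketched as a bisection on λ (on a log scale), solving the augmented system at each trial value. Everything below is an illustrative assumption on our part, not the Solve* implementation:

```python
import numpy as np

def discrepancy_solve(A, b, err, lam_lo=1e-12, lam_hi=1e3, iters=60):
    """Adjust lambda until ||A x_lambda - b|| matches ||err||, relying
    on the residual norm being an increasing function of lambda."""
    n = A.shape[1]
    target = np.linalg.norm(err)
    baug = np.concatenate([b, np.zeros(n)])
    for _ in range(iters):
        lam = np.sqrt(lam_lo * lam_hi)          # log-scale bisection
        Aaug = np.vstack([A, lam * np.eye(n)])  # augmented system A'x = b'
        x, *_ = np.linalg.lstsq(Aaug, baug, rcond=None)
        if np.linalg.norm(A @ x - b) < target:
            lam_lo = lam                        # residual too small: raise lambda
        else:
            lam_hi = lam
    return x, lam
```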

**AUT, or Picard-Jones Auto-Regularized Method.** This method ignores the given error estimates, if any, and finds a value for λ another way. (So it applies even when no error estimates are available.) The Picard Condition says that for a good solution one should expect the coefficients in P = S^{-1}U^{T}b to decline. The general reason for this is that the elements of P become the coefficients of an orthogonal expansion for the solution: x = V*P. (That is, x is a sum of scaled columns of V, the right set of singular vectors.) In orthogonal expansions, such as Fourier series or other classical mathematical expansions, the coefficients form a sort of “spectrum”, and the “high frequency” terms generally decline toward zero after some point. This is not a very exact rule.

The approach is fairly straightforward: one computes P and analyses it for a low point followed by a significant rise. Traditionally the low point is found using a moving average centered on each successive element of P. However, the elements of P tend to oscillate, so such an odd-length moving average tends to echo that oscillation. (If three elements are being averaged, for example, one sum will have two large values and one small, and the next will have two small values and one large.) So in our method we use an even number of values in the moving average. We then find the minimum of the moving averages and check whether there is a significant rise following that point. If so, we do a secondary search within the minimum moving-average interval (say P_{i} for i from j to k) by starting at the right end (P_{k}) and backing up as long as the earlier element of P is smaller (in absolute value) than its successor. We stop at the beginning of this range of elements, or when the preceding element is larger than the current element, P_{i}. We call r = i the “usable rank”.

Then we compute the truncated SVD solution, x_{r} = VS^{-1}U^{T}b, where 1/s_{i} is set to zero (or s_{i} to infinity, if you prefer that terminology) after s_{r}. We then compute the residual vector associated with x_{r}:

res(r) = A x_{r} − b.

We then estimate that the average error (or standard deviation, or sigma) in the original b_{i} is norm(res(r))/sqrt(m−r). A key trick here is that final division by sqrt(m−r). Think of this estimated error as an adjusted root-mean-square value: if we divided by sqrt(m), the formula would be exactly the RMS of the residual vector. It is easy to set up a sample ill-conditioned problem with random errors in the RHS having a standard deviation of a known amount, s. When such a problem is solved with this method, the resulting estimated average error is dramatically closer to s than it would be without the adjustment of dividing by sqrt(m−r). The use of an even number of elements in the moving average, plus the secondary search for a precise usable rank, plus the special adjusted RMS error estimate, are the main meaning of the “Jones” addition in the “Picard-Jones” method.
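The rank-picking heuristic described above might be sketched as follows; the window length, the simplified rise test, and the function name are our assumptions, not the Solve* code:

```python
import numpy as np

def auto_usable_rank(A, b, window=4):
    """Even-length moving average over |P| (P = S^{-1} U^T b), a
    secondary search near the minimum window, a truncated SVD solution,
    and the adjusted-RMS error estimate norm(res)/sqrt(m - r)."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    absP = np.abs((U.T @ b) / s)
    ma = np.convolve(absP, np.ones(window) / window, mode='valid')
    j = int(np.argmin(ma))                 # start of the minimum window
    k = j + window - 1                     # right end of that window
    while k > j and absP[k - 1] < absP[k]:
        k -= 1                             # back up while predecessors shrink
    r = k + 1                              # usable rank (values kept)
    x = Vt[:r].T @ ((U[:, :r].T @ b) / s[:r])   # truncated SVD solution
    m = A.shape[0]
    sigma = np.linalg.norm(A @ x - b) / np.sqrt(m - r)
    return r, x, sigma
```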

The following methods are not currently used in any Solve* program on our web site:

**PIC, or Simple Picard Method.** The purpose of this method is to detect quite mild ill-conditioning which can be missed by the more stringent requirements of the AUT method. If AUT detects ill-conditioning, then PIC is considered irrelevant and not done. PIC is a much simpler method. We just compute the Picard condition vector, P, as follows:

P_{i} = (1/s_{i}) * (U^{T}b)_{i} if s_{i} > EPS(A), or else zero.

Then we look to see if the elements of P are *rising* at the end of the vector. If so, we look backward from the last element of P for the first element whose preceding element is *larger*. That is, we look for the first local minimum counting backward from the last element. The position of that element becomes the *usable rank*, r. To solve the system, we then just do a truncated SVD solution keeping r singular values.
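A sketch of this simpler test (our reading of the rule above; `EPS(A)` is approximated by the usual machine-precision scale for the matrix):

```python
import numpy as np

def picard_usable_rank(A, b):
    """Compute P_i = (U^T b)_i / s_i (zero where s_i <= EPS(A)); if |P|
    is rising at the end, back up to the first local minimum from the
    end and take its position as the usable rank."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    eps = np.finfo(float).eps * max(A.shape) * s[0]   # EPS(A), roughly
    P = np.where(s > eps, (U.T @ b) / np.maximum(s, eps), 0.0)
    absP = np.abs(P)
    r = len(P)                           # default: keep everything
    if absP[-1] > absP[-2]:              # rising at the end?
        k = len(P) - 1
        while k > 0 and absP[k - 1] < absP[k]:
            k -= 1                       # first local minimum from the end
        r = k + 1
    return r
```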

**MER, or Matrix Error Method.** This method only applies if the user gives a positive value for the estimated percentage error in each element of A. The idea here is to find whether some of the latter singular values contribute so little to the matrix A that they can be discarded. Here is how we make that decision based on the provided estimated percent error in the coefficients, e. Take a vector Z (of length equal to the number of singular values) and set its elements to zero. Then set its last element equal to the last singular value. Build the matrix B = UZV^{T} (treating Z as a diagonal matrix). Compare the average absolute value of B to the average absolute value of A. If

avg_abs_val(B) < .01*e*avg_abs_val(A)

then this singular value can be neglected. Then do the calculation over, but set the last two elements of Z equal to the last two singular values. If the above condition still holds, then the last two singular values can be neglected. Repeat until the above condition fails. We have then gone one step too far, so we know exactly how many singular values must be kept. This number becomes our *usable rank*, r. To solve the system, we then just do a truncated SVD solution keeping r singular values.
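The decision loop above can be sketched as follows (the function name and details are our assumptions):

```python
import numpy as np

def matrix_error_rank(A, e):
    """Discard trailing singular values while the matrix B they rebuild
    stays below e percent of A's average absolute value."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    thresh = 0.01 * e * np.mean(np.abs(A))
    n = len(s)
    r = n
    for k in range(1, n + 1):
        Z = np.zeros_like(s)
        Z[n - k:] = s[n - k:]          # last k singular values only
        B = (U * Z) @ Vt               # B = U diag(Z) V^T
        if np.mean(np.abs(B)) < thresh:
            r = n - k                  # last k values can be neglected
        else:
            break                      # gone one step too far
    return r
```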

**RNK, or User-Given Rank Method.** In this method we interactively ask the user what usable rank they would like to try, and do a truncated SVD solution keeping that many singular values.
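Several of the methods above end by "doing a truncated SVD solution keeping r singular values"; that common final step might look like this sketch:

```python
import numpy as np

def truncated_svd_solve(A, b, r):
    """Truncated SVD solution: keep the r largest singular values and
    zero the reciprocals of the rest."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return Vt[:r].T @ ((U[:, :r].T @ b) / s[:r])
```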
