June 21, 2004

Newton Raphson

It is used to find the solution for f(x) = 0
It uses also the derivative f '(x)
It uses f(x₀), f '(x₀) and a point x₀ to find an approximation to a line of the function f(x).

Using the points (x₀, f(x₀)) and (x₁, 0).
1) m = (f(x₀) - 0)/(x₀ - x₁)
2) m = f '(x₀)
putting 1) and 2) together:
f '(x₀) = (f(x₀) - 0)/(x₀ - x₁)
f '(x₀)(x₀-x₁) = f(x₀)
(x₀-x₁) = f(x₀)/f '(x₀)
x₁ = x₀ - f(x₀)/f '(x₀)

We can arrive to the same iteration equation of Newton Raphson using Taylor Expansion
We can approximate a continuous function using a Taylor polynomial expansion around x₀.
f(x) = f(x₀) + f '(x₀)(x-x₀) + f "(x₀)(x-x₀)²/2 + f '"(x₀)(x-x₀)³/6 + .... + fⁿ(x-x₀)ⁿ/n! ....
To approximate Newton Raphson, we are going to use only two terms in the expansion
Newton Rapson approximate f(x) using a line. The quadratic, cubic, etc terms are eliminated from the Taylor Expansion.
f(x) ≈ f(x₀) + f '(x₀)(x-x₀) + e; e is the error of using a line instead of the complete polynomial
We want to obtain x, using the approximation below where f(x₁) = 0....

        f(x₁) = f(x₀) + f '(x₀)(x₁-x₀)
        f(x₁)-f(x₀) = f '(x₀)(x₁-x₀)
        [f(x₁)-f(x₀)]/f '(x₀) = x₁-x₀
        [f(x₁)-f(x₀)]/f '(x₀) + x₀= x₁
        [0-f(x₀)]/f '(x₀) + x₀= x₁
        x_{1 =}x₀- f(x₀)/f '(x₀)

In the same way, you could create a new numerical method faster than Newton Raphson that uses 3 terms in the approximation instead of 2
The approximation will be a quadratic instead of a line
The approximation will require a second derivative

f(x)=f(x₀) + f '(x₀)(x-x₀) + f ''(x₀)(x-x₀)² / 2
0 = f(x₀) + f '(x₀)(x₁-x₀) + f ''(x₀)(x₁-x₀)² / 2
Solve for (x₁-x₀) using -b ± √(b²-4ac) / 2a

Use the term of x₁closer to x₀.

Example of Newton Raphson

sin(x) - ½ = 0
f(x) = sin(x) - ½ = 0 f '(x) = cos(x) (in radians)
Start with x₀ = 0
x₁= x₀- f(x₀)/f '(x₀)

i	x_i	f(x_i)	f '(x_i)	x_i+1
0	0	f(x₀) = sin(0)-.5 = -.5	f '(x₀)= cos(0) = 1	x₁= 0-(-.5) / 1 = -.5
1	.5	f(x₁) = sin(.5)-.5 = -.02	f '(x₁)= cos(.5) = .8795	x₂= .5-(-.02) / .8795 = .522
2	.522	f(x₂)=sin(.522)-.5 =-.00139	f '(x₂)= cos(.522) = .8668	x₃= .522-(-.00139) / .8668 = .5235

Exact solution for sin(x) -.5 = 0 in radians is 30° that is 30π/180 rad = .5235 rad.

One of the disadvantages of Newton Raphson is that you need teh derivative f '(x)
It is possible to obtain the derivative f '(x) of f(x) numerically.

f '(x) = f(x+ε) - f(x) / ε

The exact solutions requires that ε → 0. We can do an approximation by using a small ε (like 1x10)^-5.

Example:

                f(x) = sin(x)
                f '(x) ≈ sin(x+ε) - sin(x) / ε
                if ε = .001    x = 0
                f '(0) ≈ sin(0 + .001) - sin(0) / .001 = 1
                exact solution is f '(x) = cos(x)
                f '(0) = cos(0) = 1
                f '(.5) ≈ sin(.5 + .001) - sin(.5) / .001 = .8773
                Exact solution f '(.5) = cos(.5) = .8773

When Newton Raphson may not give a good approximation or it may not converge....

Newton Raphson uses a line to approximate the function.
If the function is not approximate well using a line at the starting point, the method may not converge.

June 22, 2004

Order of Convergence

Some methods are faster than others finding the solution of f(x) = 0.
Assume p is a zero (solution)

E_n= P - P_n is the error in the iteration "n"

If there are two positive constants that exist such that n ≠ 0 and R > 0 and

lim _n→∞|P-P_n+1| / |P-P_n|^R = lim_n→∞|E_n+1| / |E_n|^R = A

Then the sequence is said to converge to P with an order of convergence R.

This means E_n+1= A|E_n|^R

Example let E_n= .001 and A = 1 and R = 2

E_n+1= 1 |.001|²

= 1(1x10^-3)²

=1x10^-6=.000001

If R = 1, the convergence is called "linear".

If R = 2, the convergence is called "quadratic"

If f(x) has a simple root and we use Newton Raphson the R = 2 (quadratic convergence)

Usually, the more roots an equation has, the slower the convergence will be.

Secant Method

The Newton Raphson method requires the evaluation of two functions at each iteration: f(x_n) and f '(x_n)
The secant method will require only one evaluation of f(x).
The secant method's order of convergence with a single root is R ≈ 1.618
The secant method uses two initial points, P₀, P₁ close to the root.

(1) m = [f(x₁) - f(x₀)]/ (x₁ - x₀) using (x₀, f(x₀)) and (x₁, f(x₁))

(2) m = [f(x₂) - f(x₁)]/ (x₂ - x₁) using (x₁, f(x₁)) and (x₂, 0)

Putting (1) and (2) together....

[f(x₁) - f(x₀)]/ (x₁ - x₀) = [0 - f(x₁)]/ (x₂ - x₁)

x₂ - x₁= - f(x₁)_*(x₁ - x₀) / [f(x₁) - f(x₀)]

x₂ = x₁ - f(x₁)_*(x₁ - x₀) / [f(x₁) - f(x₀)]

June 23, 2004

The secant method approximates Newton Raphson if x_1→x₀

lim   x₂= lim   [x₁ - f(x₁)_*(x₁ - x₀) / [f(x₁) - f(x₀)]]
x_1→x₀ x_1→x₀
             = x₀ - f(x₀)_*(lim (x₁ - x₀) / [f(x₁) - f(x₀)])        note: tends to 1/f '(x₀)
                                  x_1→x₀

x₂ = x₀ - f(x₀) / f '(x₀) equation for Newton Raphson

Example

Secant Method

f(x) = sin(x) - .5 = 0 x in rad.

x₀= 0 x₁ = 1

x_n+1= x_n- f(x_n)(x_n- x_n-1) / f(x_n) - f( x_n-1)

n	x_n	f(x_n)	x_n+1
0	x₀= 0	f(x₀) = sin(0) - .5 = -.5
1	x₁= 1	f(x₁) = sin(1) - .5 = .3415	x₂ = 1 - .3415(1 - 0) / .3415 - (-.5) = .594
2	x₂= .594	f(x₂) = sin(.594) - .5 = .0597	x₃ = .594 - .0597(.594 - 1) / .0597 - (.3415) = .50798
3	x₃= .50798	f(x₃) = sin(.50798) - .5 = -.0135	x₄ = .50798 - [-.0135(.50798 - .594) / (-.0135 - .0597)] = .5238

Exact solution: x = 30°_*π/180 = .5235

In summary: to solve f(x) = 0 you may use....

Bisection
False Position (Regula Falsi)
Newton Raphson
Secant Method

The Solution of Linear Systems

Example:

6x + 3y + 2z = 29

3x + y + 2z = 17

2x + 3y + 2z = 21

This can be expressed as: Ax = B where A is a matrix and B is a vector.

| 6     3    2 |       | X |               | 29 |
| 3    1    2  |       | Y |       =      | 17 |
| 2    3    2 |       | Z |                | 21 |

A X B

Each linear equation represents a plane in the n^thdimension
The solution x,y,z represents the intersection of the planes
Not all the systems of linear equations have solutions

(1) 2x + 3y = 6 (3) x = (6 - 3y)/2 from (1)

(2) 4x + 6y = 10

substitute (3) in (2)

4_*(6-3y)/2 + 6y = 10

12 - 6y + 6y = 10

12 = 10

Geometrically, these equations represent 2 planes that never intersect.

You also can have a system of equations with an infinite number of solutions....

(1) 2x + 3y = 6

(2) 4x + 6y = 12

(1) and (2) are the same equation so both are the same plane. Therefore, the number of solutions is infinite.

Upper Triangular Systems of Linear Equations

is called upper triangular if a_ij = Ø when i > j.

Example

A = | 1    2     4 |                This is upper triangular.
       | 0    6    7  |              a₂₁= 0    where i=2, j=1    i > j.
       | 0    0    3 |

Lower Triangular Systems of Linear Equations

a matrix A is called lower triangular if a_ij = Ø when i < j

Example

A = | 7    0     0 |
       | 8    2    0 |
       | 3    5    5 |    lower triangular

Advantage

An upper triangular system of linear equations is easy to solve by obtaining the value of x_n from the last equation and substituting the x_n into the (n-1)^thequation to obtain x_n-1 and so on. This is called back substitution.

Back Substitution

3x + 4y + z = 2

3y + 2z = 1

5z = 10

z = 10/5 = 2

y = 1-2(2) / 3 = -1

x = 2-2-4(-1) / 3 = 4/3

Gaussian Elimination

It is a method to solve equations of linear equations
It transforms an arbitrary system of linear equations into upper triangular
It solves the upper triangular system of equations using back substitution
The transformation to upper triangular uses the following properties that do not affect the solution.

Interchange equations: the order of the two equations do not affect the solution
Multiplying an equation by a constant will not affect the solution. (scaling)
Replacement: an equation can be replaced by the sum of itself and a non-zero multiple of another equation.

June 24, 2004

We will use these three properties to transform an arbitrary system of linear equations into an upper triangular system and then solve it with back-substitution.

Example

1x + 3y + 4z = 19

8x + 9y + 3z = 35

x + y + z = 6 Solve system using gaussian elimination.

Augmented matrix.

Choose a pivot row and pivot element. In this case, 1 is the pivot element.

    | 1    3    4    19 |
    | 8    9    3    35|
    | 1    1    1     6 |

We need to make element a₂₁= 8 to be Ø so we multiply the first equation by 8 and subtract the result from second equation. We also multiply the first equation by -1 and add it to the third equation.

↓

     | 1            3             4             19        |
     | 8-8=0    9-24=-15    3-32=-29    35-152=-117 |
       | 1-1=0    1-3=-2         1-4=-3            6-19=-13 |

Multiply second row by -1/15

↓

     | 1    3      4         19   |
     | 0    1    29/15    117/15 |
     | 0    -2     -3      -13 |

We need to make element a₃₂= 0 so we multiply second equation by 2 and add it to third equation.

↓

     | 1        3                4                        19              |
     | 0        1                29/15                         117/15            |
     | 0    -2+2=0    -3+2*29/15      -13+2*117/15 |

Back substitution (-3+2*29/15)z = -13 + 2*117/15 .

z = 3

1y + (29/15)(3) = 117/15, y = 117/15 - (29/15)(3) = 2, y = 2

x + 3(2) + 4(3) = 19, x + 6 + 12 = 19, x = 19 - 18 = 1, x = 1

Implementing Gauss Elimination Using Matlab

function x = gauss(A,B)

% Input - A is a NxN nonsingular matrix
% - B is a Nx1 matrix
% Output - X is a Nx1 matrix that is the solution of Ax =B
% Get dimensions of A

[N N] = size(A);

% Initialize X with zeroes

x = zeroes (N,1);

% Obtain augmented matrix

Aug = [A,B];

% Gaussian Elimination
%For all rows Pivot 1 to N

for P=1:N

%Choose a pivot. We will ignore the case when pivot is 0 for now.

Piv = Aug (P,P);

%Divide pivotal row by pivot

    for K=P:N+1                                        short notation: Aug(P,P:N+1) = Aug(P,P:N+1)/Piv
    Aug (P,K) = Aug(P,K)/Piv;
    end

% Make 0's in the other elements in pivotal column for all remaining rows

for K=P+1:N

%for all remaining columns

        m = Aug(K,P);    % element that we want to be 0
        for i=P:N+1
            Aug(K,i) = Aug(K,i)                    short notation: Aug(K, P:N+1) = Aug(K, P:N+1)

                - m*Aug(P,i);                                                    -m*Aug(P, P:N+1);
        end
    end

% Back substitution

    for K=N:1
        sum = 0
        for i=K+1:N
            sum=sum+Aug(K,i)*X(i,1)
        end
        X(K,1)=Aug(K,N+1) - sum
    end

% Result is in X.

June 25, 2004

(1) Prevent pivot = 0

If pivot = 0, switch rows to prevent division by zero

(2) To reduce computing error, you may choose as the next pivot the largest number in the rows (or columns) that are still left to transform.

        | 3    6    1    7 |            by using rule (2), you don't need (1).
        | 2    4    3    0 |
        | 7    6    0    0 |

↓

use the largest number as pivot; switch rows if necessary

        | 7    6    0    0 |        new pivot = 7
        | 2    4    3    0 |
        | 3    6    1    7 |

Triangular Factorization

A matrix A has a triangular factorization if it can be expressed as the product of a lower triangular matrix and an upper triangular matrix.
Triangular factorization is useful when you want to solve multiple systems of linear equations that have the same A.

Ax = b₁
Ax = b₂
_:
Ax = b_n

Triangular factorization will save time instead of using Gauss Elimination n times.

A = LU Lower Tri Matrix, Upper Tri Matrix

        | a₁₁a₁₂ ....... a_1n |               | L₁₁ 0 ....... 0 | | U₁₁U₁₂ ....... U_1n|
|a₂₁ a₂₂ ....... a_2n |       =       |L₂₁L₂₂ ...... 0 |   |0    U₂₂ ....... U_2n |
|_: |                |_: 0 |   |0_:|
|a_n1 a_n2....... a_nn |                 |L_n1L_n2...... L_nn |   |0     0....... U_nn |

A = L U

Assume the linear system Ax=B where A has a triangular factorization A=LU then....

AX = B

↓

(LU)X =B

Then if we define UX = Y (1)

L(UX) = B

LY = B (2)

Then we can solve X by first solving Y in (2) that is a lower triangular system of linear equations using "forward substitution".

Once Y is found, we can solve (1) using back substitution.

We solve (2) using forward substitution
We solve (1) using back substitution

LU Factorization Example

6x₁+ 1x₂+ 4x₃= -3

5x₁+ 3x₂+ 2x₃= 21

1x₁- 4x₂+ 3x₃= 10

        A =    | 6      1    4 |
                  | 5     3    2 |
                  | 1    -4    3 |

Add identity matrix to the left

    A =    | 1      0    0 |    | 6      1    4 |
              | 0     1    0 |    | 5      3    2 |
              | 0    0    1 |    | 1     -4    3 |

will become L will become U

Apply Gaussian Elimination to the matrix in the right hand side and also do the same steps in the matrix in the left.

multiply first row by 1/6

                    | 1/6      0    0 |    | 1      1/6    4/6 |
      A =        | 0       1    0 |    | 5       3       2 |
                    | 0       0    1 |    | 1      -4      3 |

multiply first row by -5 and add to second row
multiply first row by -1 and add to third row

                    | 1/6       0    0 |    | 1      1/6            4/6       |
         =         | -5/6     1    0 |    | 5       -5/6 +3    -20/6 +2 |
                    | -1/6     0    1 |    | 0     -1/6-4       -4/6+3     |

multiply second row by 6/13

                   | 1/6       0    0 |    | 1      1/6            4/6     |
        =         | -5/6     1    0 |    | 0       13/6          -8/6     |
                   | -1/6     0    1 |    | 0     -25/6          14/6     |

multiple second row by 25/6 and add to third row

                  | 1/6                         0           0 |    | 1            1/6                    4/6           |
       =         | -5/13                         6/13         0 |    | 0             1                   -8/13           |
                  | (-5/13)(-25/6)-1/6     25/13       1 |    | 0             0         -8/13(25/6)+14/6     |

                  | 1/6                         0           0 |    | 1            1/6                    4/6           |
       =         | -5/13                         6/13         0 |    | 0             1                   -8/13          |
                  | -138/78                      25/13       1 |    | 0             0                    -18/78           |

multiply third row by -78/18

                  | 1/6                         0               0      |    | 1            1/6                    4/6       |
    A   =      | -5/13                       6/13              0       |    | 0             1                   -8/13       |
                  | -138/78                 -150/18       -78/18     |    | 0             0                     1             |

L U

AX = B LUX = B UX = Y (1) LY = B (2)

Solve (2) and then solve (1) from (2)

            B = | -3 |
                  | 21 |
                  | 10 |

LY = B

1/6Y₁+ 0Y₂+ 0Y₃= -3

-5/13Y₁+ 6/13Y₂+ 0Y₃= 21

-138/18Y₁- 150/18Y₂- 78/18Y₃= 10

Solve Y₁. Y₁ = (-3)6 = -18

Y₂ = (21 + 5/13(-18))13/6 = 21*13/6 + 5(-18)/6 = 61/2

Y₃= ((10+138/18(-18) + 150/18(61/2))(-18/78)

Solve (1) UX = Y

1X₁ + 1/6X₂+ 4/6X₃= -18

0X₁ + 1X₂- 8/13X₃= 61/2

0X₁ + 0X₂+ 1X₃= Y₃

We can solve this system using back substitution.

X₃= Y₃

X₂= 61/2 + 8/13Y₃

X₁= -18 - 1/6X₂- 4/6X₃

LU factorization is better than Gaussian elimination when you have more than one system of equations with the same A.

Factor A in LU = O(N³)
Gauss Elimination = O(N³)
Back Substitution = O(N²)
Forward Substitution = O(N²)