CS314 Class Notes

CS314 Class Notes: June 21st - June 25th, 2004

by Bethany Wagenaar

Monday June 21st, 2004

Newton-Raphson

It is used to find the solution for ƒ(x) = 0.
It also uses the derivative ƒ′(x).
It uses a point x₀ together with ƒ(x₀) and ƒ′(x₀) to approximate the function ƒ(x) with a line (the tangent at x₀).

- Newton-Raphson is very fast; faster than methods we have seen so far.

{1} m = (ƒ(x₀) - 0) / (x₀ - x₁)    using the points (x₀, ƒ(x₀)) & (x₁, 0)

{2} m = ƒ′(x₀)

Putting {1} and {2} together:

ƒ′(x₀) = (ƒ(x₀) - 0) / (x₀ - x₁)

ƒ′(x₀)(x₀ - x₁) = ƒ(x₀)

x₀ - x₁ = ƒ(x₀) / ƒ′(x₀)

x₁ = x₀ - ƒ(x₀) / ƒ′(x₀)

- We can arrive at the same Newton-Raphson iteration equation using the Taylor expansion.

- We can approximate a smooth (sufficiently differentiable) function using a Taylor polynomial around x₀.

ƒ(x) = ƒ(x₀) + ƒ′(x₀)(x - x₀) + ƒ″(x₀)(x - x₀)²/2 + ƒ‴(x₀)(x - x₀)³/6 + ... + ƒ⁽ⁿ⁾(x₀)(x - x₀)ⁿ/n! + ...

- To derive Newton-Raphson, we use only the first two terms of the expansion.

- Newton-Raphson approximates ƒ(x) using a line; the quadratic, cubic, etc. terms are dropped from the Taylor expansion.

ƒ(x) = ƒ(x₀) + ƒ′(x₀)(x - x₀) + ε ← ε is the error from using a line instead of the complete polynomial

- We want to obtain x₁ using the approximation.

Find solution using approximation

↓

ƒ(x₁) ≈ ƒ(x₀) + ƒ′(x₀)(x₁ - x₀)

(ƒ(x₁) - ƒ(x₀)) / ƒ′(x₀) = x₁ - x₀

(ƒ(x₁) - ƒ(x₀)) / ƒ′(x₀) + x₀ = x₁

(0 - ƒ(x₀)) / ƒ′(x₀) + x₀ = x₁    ← we want ƒ(x₁) = 0

x₁ = x₀ - ƒ(x₀)/ƒ′(x₀)

♦ Note for the three-term method that follows: a quadratic has two solutions; choose the one closer to x₀.

- In the same way, you could create a new numerical method faster than Newton Raphson that uses three terms in the approximation instead of two.

- The approximation will be a quadratic instead of a line.

- The approximation will require a second derivative.

ƒ(x) ≈ ƒ(x₀) + ƒ′(x₀)(x - x₀) + ƒ″(x₀)(x - x₀)²/2

0 = ƒ(x₀) + ƒ′(x₀)(x₁ - x₀) + ƒ″(x₀)(x₁ - x₀)²/2

Solve for (x₁ - x₀) using the quadratic formula (-b ± √(b² - 4ac)) / (2a), with a = ƒ″(x₀)/2, b = ƒ′(x₀), c = ƒ(x₀).

Solve for x₁.

Use the value of x₁ closer to x₀.
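
The three-term step above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `quadratic_step` is made up for this sketch, the second derivative is supplied by hand, and the code falls back to the Newton-Raphson step when ƒ″(x₀) = 0.

```python
import math

def quadratic_step(f, df, d2f, x0):
    """One step of the three-term (quadratic) Taylor method.

    Solves 0 = f(x0) + f'(x0) d + f''(x0) d^2 / 2 for d = x1 - x0
    and keeps the solution closer to x0 (the smaller |d|).
    """
    a = d2f(x0) / 2.0
    b = df(x0)
    c = f(x0)
    if a == 0:
        # No quadratic term: this is exactly the Newton-Raphson step.
        return x0 - c / b
    disc = b * b - 4 * a * c
    if disc < 0:
        raise ValueError("the quadratic approximation has no real root")
    sq = math.sqrt(disc)
    d_plus = (-b + sq) / (2 * a)
    d_minus = (-b - sq) / (2 * a)
    d = d_plus if abs(d_plus) <= abs(d_minus) else d_minus
    return x0 + d

f = lambda x: math.sin(x) - 0.5
x0 = 0.5
x1_quadratic = quadratic_step(f, math.cos, lambda x: -math.sin(x), x0)
x1_newton = x0 - f(x0) / math.cos(x0)
print(x1_quadratic, x1_newton, math.pi / 6)
```

Starting from the same x₀ = 0.5, the quadratic step lands closer to π/6 than the Newton-Raphson step, at the price of needing ƒ″.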

Example of Newton Raphson

ƒ(x) = sin(x) - ½ = 0        ƒ′(x) = cos(x)

↑ x is in radians, not degrees; in degrees the derivative would carry an extra factor of π/180.

Start with x₀ = 0.

x_i+1 = x_i - ƒ(x_i)/ƒ′(x_i)    ← Newton-Raphson iteration equation

i | x_i  | ƒ(x_i)                            | ƒ′(x_i)                   | x_i+1
0 | 0    | ƒ(x₀) = sin(0) - .5 = -.5         | ƒ′(x₀) = cos(0) = 1       | x₁ = 0 - (-.5/1) = .5
1 | .5   | ƒ(x₁) = sin(.5) - .5 = -.02       | ƒ′(x₁) = cos(.5) = .8776  | x₂ = .5 - (-.02/.8776) ≈ .522
2 | .522 | ƒ(x₂) = sin(.522) - .5 = -.00139  | ƒ′(x₂) = cos(.522) = .8668 | x₃ = .522 - (-.00139/.8668) = .5236

The exact solution of sin(x) - .5 = 0 is x = 30°, that is, 30·π/180 rad = π/6 ≈ .5236 rad.
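
The iteration in this example can be reproduced with a few lines of Python. This is a minimal sketch, not the course's reference code: the name `newton_raphson`, the tolerance `tol`, and the iteration cap `max_iter` are assumptions made for the illustration.

```python
import math

def newton_raphson(f, df, x0, tol=1e-10, max_iter=50):
    """Return (root estimate, list of all iterates) for f(x) = 0."""
    xs = [x0]
    x = x0
    for _ in range(max_iter):
        slope = df(x)
        if slope == 0:
            raise ZeroDivisionError("f'(x) = 0: the tangent never crosses the x-axis")
        x_next = x - f(x) / slope      # x_{i+1} = x_i - f(x_i) / f'(x_i)
        xs.append(x_next)
        if abs(x_next - x) < tol:      # stop when the step is tiny
            break
        x = x_next
    return xs[-1], xs

root, iterates = newton_raphson(lambda x: math.sin(x) - 0.5, math.cos, 0.0)
print(iterates[:4])            # x0, x1, x2, x3
print(root, math.pi / 6)
```

Because the sketch keeps full precision instead of rounding ƒ(x₁) to -.02, its x₂ differs slightly from the hand-computed value in the table.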

- One of the disadvantages of Newton Raphson is that you need the derivative ƒ′(x).

- It is possible to obtain the derivative of ƒ(x) numerically.

ƒ′(x) ≈ (ƒ(x + ε) - ƒ(x)) / ε
↑ The exact derivative requires that ε → 0.
We can do an approximation by using a small ε (like 1 × 10⁻⁵).

Example:

ƒ(x) = sin(x)

ƒ′(x) ≈ (sin(x + ε) - sin(x)) / ε

If ε = .001 and x = 0:

ƒ′(0) ≈ (sin(0 + .001) - sin(0)) / .001 = .9999998 ≈ 1
Exact solution is ƒ′(x) = cos(x), so ƒ′(0) = cos(0) = 1

ƒ′(.5) ≈ (sin(.5 + .001) - sin(.5)) / .001 = .8773
Exact solution is ƒ′(.5) = cos(.5) = .8776
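
The same numerical derivative can be checked in Python. A minimal sketch, assuming a helper name `forward_difference` chosen for this illustration:

```python
import math

def forward_difference(f, x, eps=1e-5):
    """Approximate f'(x) with (f(x + eps) - f(x)) / eps."""
    return (f(x + eps) - f(x)) / eps

approx_0 = forward_difference(math.sin, 0.0, eps=0.001)
approx_half = forward_difference(math.sin, 0.5, eps=0.001)
print(approx_0, math.cos(0.0))      # approximation vs exact at x = 0
print(approx_half, math.cos(0.5))   # approximation vs exact at x = 0.5
```

With ε = .001 the approximation at x = .5 agrees with cos(.5) to about three decimal places; a smaller ε reduces the difference until floating-point round-off starts to dominate.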

- When Newton-Raphson may give a poor approximation or fail to converge:

Newton-Raphson uses a line to approximate the function.
If the function is not approximated well by a line at the starting point, the method may not converge.

Tuesday June 22nd, 2004

Order of Convergence

- Some methods are faster than others at finding the solution of ƒ(x) = 0.

- Assume

p is a zero of ƒ(x) and p_n is the approximation at iteration "n"
E_n = p - p_n is the error at iteration "n"
If two positive constants A and R exist such that

Lim_n→∞ |p - p_n+1| / |p - p_n|^R = Lim_n→∞ |E_n+1| / |E_n|^R = A

then the sequence is said to converge to p with an order of convergence R.

This means that, for large n,

|E_n+1| ≈ A |E_n|^R

Example:

Let E_n = .001, A = 1, R = 2
→ E_n+1 = 1·|.001|² = 1·(1 × 10⁻³)² = (1/1000)² = 1 × 10⁻⁶ = .000001
→ The larger the exponent R, the faster the error will decrease.

If R = 1, the convergence is called "linear".

If R = 2, the convergence is called "quadratic".

If ƒ(x) has a simple root and we use Newton-Raphson, then R = 2 (quadratic convergence).

Usually, the higher the multiplicity of the root (a repeated root instead of a simple one), the slower the convergence will be.
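
The order of convergence can also be estimated from the errors of a real run. The sketch below applies Newton-Raphson to sin(x) - .5 (a simple root at π/6) and estimates R from three consecutive errors with R ≈ log(E_n+1/E_n) / log(E_n/E_n-1). That estimator follows from E_n+1 ≈ A·E_n^R but is not stated in the notes, and the variable names are illustrative.

```python
import math

f = lambda x: math.sin(x) - 0.5
p = math.pi / 6              # exact zero, used only to measure the error

x = 0.0
errors = []                  # will hold E0, E1, E2, E3
for _ in range(4):
    errors.append(abs(p - x))
    x = x - f(x) / math.cos(x)   # Newton-Raphson step

# Taking logs of E_{n+1} = A * E_n^R for two consecutive n and
# subtracting eliminates A and leaves an estimate of R.
r_estimate = math.log(errors[3] / errors[2]) / math.log(errors[2] / errors[1])
a_estimate = errors[3] / errors[2] ** 2   # A, assuming R = 2
print(errors)
print(r_estimate, a_estimate)
```

The estimate comes out close to 2, which matches the quadratic convergence expected for a simple root.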

Secant Method

- The Newton-Raphson method requires two function evaluations at each iteration:
ƒ(x_n) and ƒ′(x_n)

- The secant method requires only one new evaluation of ƒ(x) per iteration.

- The secant method's order of convergence with a simple root is R ≈ 1.618.

- The secant method uses two initial points p₀, p₁ close to the root.

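{1} m = (ƒ(x₁) - ƒ(x₀)) / (x₁ - x₀)    using (x₀, ƒ(x₀)) & (x₁, ƒ(x₁))

{2} m = (ƒ(x₂) - ƒ(x₁)) / (x₂ - x₁) = (0 - ƒ(x₁)) / (x₂ - x₁)    using (x₁, ƒ(x₁)) & (x₂, 0)

Putting {1} & {2} together:

(ƒ(x₁) - ƒ(x₀)) / (x₁ - x₀) = (0 - ƒ(x₁)) / (x₂ - x₁)

x₂ - x₁ = -ƒ(x₁)(x₁ - x₀) / (ƒ(x₁) - ƒ(x₀))

x₂ = x₁ - ƒ(x₁)(x₁ - x₀) / (ƒ(x₁) - ƒ(x₀))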

Wednesday June 23rd, 2004

The Secant Method approximates Newton-Raphson.

If x₁ → x₀:

Lim_x₁→x₀ x₂ = Lim_x₁→x₀ (x₁ - ƒ(x₁)(x₁ - x₀) / [ƒ(x₁) - ƒ(x₀)])

= x₀ - ƒ(x₀) · (Lim_x₁→x₀ (x₁ - x₀) / [ƒ(x₁) - ƒ(x₀)])

NOTE: Lim_x₁→x₀ (x₁ - x₀) / [ƒ(x₁) - ƒ(x₀)] = 1/ƒ′(x₀), because it is the reciprocal of the limit that defines the derivative.

x₂ = x₀ - ƒ(x₀)/ƒ′(x₀) ← Newton-Raphson

Example

Secant Method

ƒ(x) = sin(x) - 0.5 = 0

x₀ = 0, x₁ = 1

x_n+1 = x_n - ƒ(x_n)(x_n - x_n-1) / [ƒ(x_n) - ƒ(x_n-1)] ← Secant Method

n | x_n         | ƒ(x_n)                             | x_n+1
0 | x₀ = 0      | ƒ(x₀) = sin(0) - .5 = -.5          | -
1 | x₁ = 1      | ƒ(x₁) = sin(1) - .5 = .3415        | x₂ = 1 - [.3415(1 - 0) / (.3415 - (-.5))] = .594
2 | x₂ = .594   | ƒ(x₂) = sin(.594) - .5 = .0597     | x₃ = .594 - [.0597(.594 - 1) / (.0597 - .3415)] = .50798
3 | x₃ = .50798 | ƒ(x₃) = sin(.50798) - .5 = -.0135  | x₄ = .50798 - [(-.0135)(.50798 - .594) / (-.0135 - .0597)] = .5238

Exact Solution

x = 30° = 30·π/180 rad = π/6 ≈ .5236 rad
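
The secant iteration of this example can be sketched in Python as below. The function name `secant`, the tolerance, and the iteration cap are assumptions made for this illustration; note that each pass through the loop evaluates ƒ only once.

```python
import math

def secant(f, x0, x1, tol=1e-10, max_iter=50):
    """Return (root estimate, list of all iterates) for f(x) = 0."""
    xs = [x0, x1]
    f0, f1 = f(x0), f(x1)
    for _ in range(max_iter):
        if f1 == f0:
            break                      # horizontal secant line, cannot continue
        x2 = x1 - f1 * (x1 - x0) / (f1 - f0)
        xs.append(x2)
        if abs(x2 - x1) < tol:
            break
        x0, f0 = x1, f1                # reuse the previous function value
        x1, f1 = x2, f(x2)             # the only new evaluation of f
    return xs[-1], xs

root, iterates = secant(lambda x: math.sin(x) - 0.5, 0.0, 1.0)
print(iterates[:5])           # x0, x1, x2, x3, x4
print(root, math.pi / 6)
```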

In Summary

To solve ƒ(x) = 0

You may use:

Bisection
False Position (Regula Falsi)
Newton Raphson ← faster than the others but you need the derivative
Secant Method

The Solution of Linear Systems

Example

6x + 3y + 2z = 29
3x + y + 2z = 17
2x + 3y + 2z = 21

This can be expressed as AX = B, where A is the matrix of coefficients, X is the vector of unknowns, and B is the vector of right-hand sides.

| 6   3   2 |   | X |             | 29 |
| 3   1   2 |   | Y |      =     | 17 |
| 2   3   2 |   | Z |             | 21 |

{ A } { X } { B }

- Each linear equation in n unknowns represents a hyperplane in n-dimensional space (a plane when n = 3).

- The solution x, y, z represents the intersection of the planes.

Not all the systems of linear equations have solutions.

{1} 2x + 3y = 6
{2} 4x + 6y = 10

From {1}:   x = (6 - 3y)/2   {3}

Substitute {3} in {2}:

4 * (6 - 3y)/2 + 6y = 10
12 - 6y + 6y = 10
12 = 10 → No Solution

Geometrically, these equations represent two parallel lines that never intersect (with three unknowns they would be parallel planes).

- You can also have a system of equations with an infinite number of solutions:

{1} 2x + 3y = 6
{2} 4x + 6y = 12

{1} & {2} are the same equation (multiply {1} by 2 and you obtain {2}), so both describe the same line.
Therefore, the number of solutions is infinite.

Upper Triangular Systems of Linear Equations

A matrix A is called upper triangular if a_ij = 0 when i > j

Example:
A = | 1   2   4 |
    | 0   6   7 |
    | 0   0   3 |
// this A is upper triangular
a₂₁ = 0; i = 2, j = 1   → i > j

A matrix A is called lower triangular if a_ij = 0 when i < j

Example:
A = | 9   0   0 |
       | 8   2   0 |
       | 3   5   5 |
// this A is lower triangular
a₁₂ = 0; i = 1, j = 2   → i < j

We can have an upper triangular system of linear equations:

a₁₁x₁ + a₁₂x₂ + a₁₃x₃ + ··· + a_1,n-1 x_n-1 + a_1,n x_n = b₁
        a₂₂x₂ + a₂₃x₃ + ··· + a_2,n-1 x_n-1 + a_2,n x_n = b₂
                a₃₃x₃ + ··· + a_3,n-1 x_n-1 + a_3,n x_n = b₃
                                ·
                                ·
                                ·
                                              a_n,n x_n = b_n

- An upper triangular system of linear equations is easy to solve: obtain the value of x_n from the last equation, substitute x_n into the (n-1)th equation to obtain x_n-1, and so on. This is called back substitution.

Back Substitution

x_n = b_n / a_n,n
x_n-1 = (b_n-1 - a_n-1,n x_n) / a_n-1,n-1
·
·
·
x₁ = (b₁ - a_1,n x_n - a_1,n-1 x_n-1 - ··· - a₁₂x₂) / a₁₁

Example:
                3x + 4y + z = 2
                        3y + 2z = 1
                                5z = 10

z = 10/5 = 2
y = (1 - 2(2)) / 3 = -3/3 = -1
x = (2 - 2 - 4(-1)) / 3 = 4/3
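
Back substitution is short enough to sketch directly. The Python below is an illustrative sketch (the name `back_substitution` and the list-of-lists matrix layout are choices made here, not taken from the notes) and it reproduces the example above.

```python
def back_substitution(U, b):
    """Solve U x = b for an upper triangular U with nonzero diagonal."""
    n = len(b)
    x = [0.0] * n
    for k in range(n - 1, -1, -1):           # last equation first
        # Subtract the terms whose unknowns are already solved.
        s = sum(U[k][j] * x[j] for j in range(k + 1, n))
        x[k] = (b[k] - s) / U[k][k]
    return x

U = [[3.0, 4.0, 1.0],
     [0.0, 3.0, 2.0],
     [0.0, 0.0, 5.0]]
b = [2.0, 1.0, 10.0]
x = back_substitution(U, b)
print(x)    # x, y, z of the example
```

The two nested loops over at most N terms each are what give back substitution its O(N²) cost.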

Gaussian Elimination

It is a method to solve systems of linear equations.
It transforms an arbitrary system of linear equations into upper triangular.
It solves the upper triangular system of equations using back substitution.
The transformation to upper triangular uses the following properties that do not affect the solution:

Interchange
The order of two equations does not affect the solution.

    x - y = 5        →     3x + 2y = 6
    3x + 2y = 6            x - y = 5

Scaling
Multiplying an equation by a non-zero constant does not affect the solution.

    {1} x - y = 5        →  multiply {1} by 2  →     2x - 2y = 10
    {2} 3x + 2y = 6                                  3x + 2y = 6
    equivalent (same solution)

Replacement

An equation can be replaced by the sum of itself and a non-zero multiple of another equation.

    {1} x - y = 5        →  multiply {1} by 3 &  →   x - y = 5
    {2} 3x + 2y = 6         add it to {2}            6x - y = 21
    3·{1} is 3x - 3y = 15, and (3x - 3y) + (3x + 2y) = 15 + 6 gives 6x - y = 21

Thursday June 24th, 2004
We will use these three properties to transform an arbitrary system of linear equations into an upper triangular system and then solve it with back substitution.
Example:
1x + 3y + 4z = 19
8x + 9y + 3z = 35
x + y + z = 6
Solve system using Gaussian Elimination.
Augmented Matrix
pivot row → | 1   3   4   19 | (-8) (-1)
                      | 8   9   3   35 | (+1)
                      | 1   1   1    6   |          (+1)
Element a₁₁ = 1 is the pivot element of the pivot row.
We need to make element a₂₁ = 8 equal to 0, so we multiply the first equation by 8 & subtract the result from the second equation. In the same way, to make a₃₁ = 1 equal to 0, we subtract the first equation from the third.
| 1                3                 4               19                |
| 8-8= 0    9-8*3= -15   3-4*8= -29    35-19*8= -117       |
| 1-1=0        1-3= -2         1-4 = -3          6-19 = -13        |
| 1              3              4             19    |
| 0            -15          -29         -117   | (-1/15)
| 0             -2            -3             -13 |
| 1              3              4             19      |
| 0            1          29/15     117/15   | (+2)
| 0             -2            -3             -13    | (+1)
Now we need to make element a₃₂ = 0 so we need to multiply the second equation {2} by 2 and add it to the third equation {3}.
| 1                     3                       4                       19      |
| 0                     1                   29/15            117/15   |
| 0                -2+2 = 0          -3 + 2*29/15    -13 + 2*117/15 |
Note: you can do everything with decimals in the exam.
Back Substitution
    (-3 + 2*29/15)z = -13 + 2*117/15
    (13/15)z = 39/15
    z = 3
    1y + (29/15)(3) = 117/15
    y = 117/15 - (29/15)*3 = 30/15
    y = 2
    1x + 3(2) + 4(3) = 19
    x = 19 - 3*2 - 4*3
      = 19 - 6 - 12
    x = 1
Implementing Gauss Elimination Using Matlab
function x = gauss(A, B)
% Solve Ax = B by Gaussian elimination followed by back substitution.
% Input  - A is an NxN nonsingular matrix (nonsingular means det(A) ~= 0,
%          so the system has a unique solution; a singular matrix does not)
%        - B is an Nx1 matrix
% Output - x is an Nx1 matrix that is the solution of Ax = B

% Get the dimensions of A (square matrix, N x N)
N = size(A, 1);
% Initialize x with zeros (N x 1 zero matrix)
x = zeros(N, 1);
% Obtain augmented matrix
Aug = [A, B];

% Gaussian Elimination
% For all rows: pivot row p = 1 to N
for p = 1:N
    % Choose a pivot. We will ignore the case when pivot is 0 for now.
    piv = Aug(p, p);
    % Divide pivotal row by pivot.
    % Short notation: Aug(p, p:N+1) = Aug(p, p:N+1) / piv;
    for k = p:N+1
        Aug(p, k) = Aug(p, k) / piv;
    end
    % Make 0's in the other elements in pivotal column for all remaining rows
    for k = p+1:N
        m = Aug(k, p); % element that we want to be 0
        % for all remaining columns
        for i = p:N+1
            Aug(k, i) = Aug(k, i) - (m * Aug(p, i));
        end
    end
end

% Back Substitution (every diagonal element of Aug is now 1)
for k = N:-1:1
    s = 0;
    for i = k+1:N
        s = s + Aug(k, i) * x(i, 1);
    end
    x(k, 1) = Aug(k, N+1) - s;
end

Friday June 25th, 2004
1. Prevent pivot = 0
   If the pivot is 0, switch rows to prevent division by zero.
2. To reduce computing error, choose as the next pivot the entry with the largest absolute value among the rows (or columns) that are still left to transform, and switch rows if necessary. * By using Rule 2, you don't need Rule 1 (for a nonsingular matrix the largest candidate is never 0).
| 3   6   1   7 |
| 2   4   3   0 |
| 7   6   0   0 |   →   Use the largest number as pivot. Switch rows if necessary.
Switch rows one and three. Matrix now contains a pivot of 7.
| 7   6   0   0 | → New pivot is 7.
| 2   4   3   0 |
| 3   6   1   7 |
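
Rule 2 applied to the pivot column is commonly called partial pivoting. The Python sketch below combines it with Gaussian elimination and back substitution. It is an illustration under assumptions: the name `gauss_partial_pivot` is made up here, only rows are switched (the pivot is searched in the current column), and the pivot row is not normalized to 1, so the back substitution divides by the diagonal.

```python
def gauss_partial_pivot(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    # Augmented matrix [A | b] as a list of float rows
    aug = [[float(v) for v in row] + [float(rhs)] for row, rhs in zip(A, b)]
    for p in range(n):
        # Rule 2: row with the largest |entry| in column p, rows p..n-1
        best = max(range(p, n), key=lambda r: abs(aug[r][p]))
        if aug[best][p] == 0.0:
            raise ValueError("matrix is singular")
        aug[p], aug[best] = aug[best], aug[p]      # switch rows
        for r in range(p + 1, n):
            m = aug[r][p] / aug[p][p]              # multiplier for row r
            for c in range(p, n + 1):
                aug[r][c] -= m * aug[p][c]
    # Back substitution on the upper triangular result
    x = [0.0] * n
    for k in range(n - 1, -1, -1):
        s = sum(aug[k][j] * x[j] for j in range(k + 1, n))
        x[k] = (aug[k][n] - s) / aug[k][k]
    return x

# Thursday's example: the solution is x = 1, y = 2, z = 3
solution = gauss_partial_pivot([[1, 3, 4], [8, 9, 3], [1, 1, 1]], [19, 35, 6])
# A system whose first pivot is 0: y = 2 and x + y = 3
zero_pivot = gauss_partial_pivot([[0, 1], [1, 1]], [2, 3])
print(solution, zero_pivot)
```

The second call would divide by zero without the row switch, which is why Rule 2 makes Rule 1 unnecessary.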

Triangular Factorization
A matrix A has a triangular factorization if it can be expressed as the product of a lower triangular matrix and an upper triangular matrix.
Triangular Factorization is useful when you want to solve multiple systems of linear equations that have the same A.
                                 Ax = b₁
                                 Ax = b₂
                                 Ax = b₃
                                        .
                                        .
                                        .
                                 Ax = b_n
Triangular factorization will save time instead of using Gauss Elimination n times.
A = LU
| a₁₁   a₁₂   ····   a_1n |       | 1      0      ····   0 |   | u₁₁   u₁₂   ····   u_1n |
| a₂₁   a₂₂   ····   a_2n |       | l₂₁    1      ····   0 |   | 0     u₂₂   ····   u_2n |
|  ·                   ·  |   =   |  ·                   · |   |  ·                   ·  |
|  ·                   ·  |       |  ·                   · |   |  ·                   ·  |
| a_n1  a_n2  ····   a_nn |       | l_n1   l_n2   ····   1 |   | 0     0     ····   u_nn |
            A                                 L                            U

Assume the linear system AX = B, where A has a triangular factorization A = LU. Then
                                    AX = B
                                        ↓
                                    (LU)X = B
                                    L(UX) = B
We define    UX = Y    {1}
so that      LY = B    {2}
We can then solve for X in two steps:
- First solve {2} for Y. It is a lower triangular system of linear equations, so we use "forward substitution".
- Once Y is found, solve {1} for X using back substitution.
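
The two-step solve can be sketched in Python as follows. The small L and U below are hypothetical matrices invented for this illustration (their product is A = | 2 4 ; 1 5 |), and the function names are likewise not from the notes.

```python
def forward_substitution(L, b):
    """Solve L y = b for a lower triangular L, first equation first."""
    n = len(b)
    y = [0.0] * n
    for k in range(n):
        s = sum(L[k][j] * y[j] for j in range(k))
        y[k] = (b[k] - s) / L[k][k]
    return y

def back_substitution(U, y):
    """Solve U x = y for an upper triangular U, last equation first."""
    n = len(y)
    x = [0.0] * n
    for k in range(n - 1, -1, -1):
        s = sum(U[k][j] * x[j] for j in range(k + 1, n))
        x[k] = (y[k] - s) / U[k][k]
    return x

def lu_solve(L, U, b):
    """Solve (L U) x = b: first L y = b, then U x = y."""
    return back_substitution(U, forward_substitution(L, b))

# Hypothetical factors: L U = [[2, 4], [1, 5]]
L = [[2.0, 0.0],
     [1.0, 3.0]]
U = [[1.0, 2.0],
     [0.0, 1.0]]
x = lu_solve(L, U, [10.0, 11.0])   # 2x + 4y = 10, x + 5y = 11
print(x)
```

Here the intermediate vector is Y = (5, 2) and the final solution is X = (1, 2).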
LU - Factorization
Example: 6x₁ + 1x₂ - 4x₃ = -3
               5x₁ + 3x₂ + 2x₃ = 21
               1x₁ - 4x₂ + 3x₃ = 10
A = | 6    1   -4 |
    | 5    3    2 |
    | 1   -4    3 |

A = | 1   0   0 |   | 6    1   -4 | (1/6)
    | 0   1   0 |   | 5    3    2 |
    | 0   0   1 |   | 1   -4    3 |
    will become L   will become U

Apply Gaussian Elimination to the matrix on the right hand side. For every step, record the inverse of that step in the matrix on the left hand side, so that the product of the two matrices is always A: scaling a row by 1/c on the right puts c on the diagonal of the left matrix, and "subtract m times row p from row k" on the right puts m in position (k, p) on the left.
A = | 6   0   0 |   | 1    1/6   -4/6 | (-5) (-1)
    | 0   1   0 |   | 5     3      2  | (+1)
    | 0   0   1 |   | 1    -4      3  |      (+1)

  = | 6   0   0 |   | 1     1/6      -4/6   |
    | 5   1   0 |   | 0   -5/6+3    20/6+2  |
    | 1   0   1 |   | 0   -1/6-4     4/6+3  |

  = | 6   0   0 |   | 1     1/6    -4/6 |
    | 5   1   0 |   | 0    13/6    32/6 | (6/13)
    | 1   0   1 |   | 0   -25/6    22/6 |

  = | 6    0     0 |   | 1     1/6    -4/6  |
    | 5   13/6   0 |   | 0      1     32/13 | (25/6)
    | 1    0     1 |   | 0   -25/6    22/6  | (+1)

  = | 6     0     0 |   | 1    1/6           -4/6         |
    | 5    13/6   0 |   | 0     1            32/13        |
    | 1   -25/6   1 |   | 0     0     32/13(25/6)+22/6    |

  = | 6     0     0 |   | 1    1/6    -4/6   |
    | 5    13/6   0 |   | 0     1     32/13  |
    | 1   -25/6   1 |   | 0     0    181/13  | (13/181)

  = | 6     0       0    |   | 1    1/6    -4/6  |
    | 5    13/6     0    |   | 0     1     32/13 |
    | 1   -25/6   181/13 |   | 0     0      1    |

              L                        U

In this example the 1's end up on the diagonal of U and the pivots on the diagonal of L; either convention gives a valid factorization A = LU.
7 steps - If we use a calculator with decimals we can do it in 3 or 4 steps.
AX = B
LUX = B    UX = Y   {1}
LY = B    {2}
Solve {2} then solve {1}.
From {2}   LY = B
     6y₁ +       0y₂ +         0y₃ = -3
     5y₁ +  (13/6)y₂ +         0y₃ = 21
     1y₁ -  (25/6)y₂ + (181/13)y₃ = 10
B = | -3 |
    | 21 |
    | 10 |
Solve for y (forward substitution)
y₁ = -3/6 = -1/2
y₂ = (21 - 5(-1/2)) * 6/13 = (47/2)(6/13) = 141/13
y₃ = (10 - (-1/2) + (25/6)(141/13)) * 13/181 = (724/13)(13/181) = 4
Now solve {1}   UX = Y
1x₁ + (1/6)x₂ -   (4/6)x₃ = -1/2
0x₁ +     1x₂ + (32/13)x₃ = 141/13
0x₁ +     0x₂ +       1x₃ = 4
We can solve the system using back substitution.
x₃ = 4
x₂ = 141/13 - (32/13)(4) = 13/13 = 1
x₁ = -1/2 - (1/6)(1) + (4/6)(4) = 2
Check in the first equation: 6(2) + 1(1) - 4(4) = -3
Cost of each operation:
- Factor A into LU = O(N³)
- Gauss Elimination = O(N³)
- Back Substitution = O(N²)
- Forward Substitution = O(N²)
- Advantage of LU → solve multiple systems of equations with the same A.
If we want to solve multiple systems of equations with the same A, then it is better to use LU factorization instead of Gaussian Elimination: A is factored once in O(N³), and each additional system then costs only one forward and one back substitution, O(N²).
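
The idea of factoring once and reusing L and U can be sketched in Python. This is a minimal illustration under assumptions, not the notes' algorithm: `crout` (1's on the diagonal of U, no row interchanges) and the helper names are chosen here, the matrix is taken from the three equations of the LU example as written (with -4x₃ in the first one), and the second right-hand side b2 is hypothetical.

```python
def crout(A):
    """Factor A = L U with 1's on the diagonal of U (no pivoting)."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    U = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for j in range(n):
        for i in range(j, n):          # column j of L
            L[i][j] = A[i][j] - sum(L[i][k] * U[k][j] for k in range(j))
        if L[j][j] == 0.0:
            raise ZeroDivisionError("zero pivot: a row interchange is needed")
        for i in range(j + 1, n):      # row j of U
            U[j][i] = (A[j][i] - sum(L[j][k] * U[k][i] for k in range(j))) / L[j][j]
    return L, U

def forward_substitution(L, b):
    y = [0.0] * len(b)
    for k in range(len(b)):
        y[k] = (b[k] - sum(L[k][j] * y[j] for j in range(k))) / L[k][k]
    return y

def back_substitution(U, y):
    n = len(y)
    x = [0.0] * n
    for k in range(n - 1, -1, -1):
        x[k] = (y[k] - sum(U[k][j] * x[j] for j in range(k + 1, n))) / U[k][k]
    return x

A = [[6.0, 1.0, -4.0],
     [5.0, 3.0, 2.0],
     [1.0, -4.0, 3.0]]
L, U = crout(A)                        # O(N^3), done only once
b1 = [-3.0, 21.0, 10.0]                # right-hand side of the example
b2 = [6.0, 5.0, 1.0]                   # hypothetical second right-hand side
x1 = back_substitution(U, forward_substitution(L, b1))   # O(N^2)
x2 = back_substitution(U, forward_substitution(L, b2))   # O(N^2)
print(x1, x2)
```

Since b2 was chosen equal to the first column of A, its solution is (1, 0, 0), which makes the second solve easy to check by hand.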
