6/23/03

 

 

 

Newton-Raphson method

 

Used to solve f(x) = 0.

If f(x), f'(x), f''(x) are continuous, then we can use this information to find the solution of f(x) = 0.

 

 

 

Newton-Raphson iteration

1) m = f'(p0)

2) m = [f(p1) - f(p0)] / [p1 - p0]

         → m = -f(p0) / [p1 - p0]   (since f(p1) = 0)

 

Equating 1) and 2), we get p1 = p0 - f(p0)/f'(p0)
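
A minimal Python sketch of this update rule (the function name, tolerance, and iteration cap are my own choices, not from the lecture):

import math

def newton_raphson(f, fprime, p0, tol=1e-10, max_iter=50):
    # repeat p_{k+1} = p_k - f(p_k)/f'(p_k) until the update is tiny
    p = p0
    for _ in range(max_iter):
        p_next = p - f(p) / fprime(p)   # one Newton step
        if abs(p_next - p) < tol:
            return p_next
        p = p_next
    return p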

 

Newton-Raphson and Taylor polynomial

 

A Taylor polynomial is used to approximate a continuous function by a polynomial around a point p0.

 

f(x) = f(p0) + f'(p0)(x - p0) + f''(p0)(x - p0)^2 / 2! + ...

 

The Newton-Raphson method approximates f(x) using the Taylor expansion up to two terms.

 

Substituting x = p1 and using f(p1) = 0:

 

f(p1) = 0

→ 0 = f(p0) + f'(p0)(p1 - p0) + f''(p0)(p1 - p0)^2 / 2! + ...

→ -f(p0) ≈ f'(p0)(p1 - p0)   (dropping the higher-order terms)

→ p1 = p0 - f(p0)/f'(p0)

 

We could improve Newton-Raphson by using three terms in the Taylor expansion instead of two. This will use a quadratic equation to approximate f(x) instead of a linear equation.  (This improved method also needs a 2nd derivative.)

 

As an example, let's apply Newton-Raphson to an equation.

 

E.g.  sin (x) = 1/2

x is in radians; if x were in degrees, then f'(x) would be more complicated.

 

f(x) = sin (x) - 1/2 = 0

f'(x) = cos (x)

 

Iteration equation:

p_{k+1} = p_k - f(p_k) / f'(p_k)

Start with p0 = 0

 

k | p_k   | f(p_k)                   | f'(p_k)           | p_{k+1}
0 | 0     | sin(0) - .5 = -.5        | cos(0) = 1        | .5
1 | .5    | sin(.5) - .5 = -.02      | cos(.5) = .8775   | .522
2 | .522  | sin(.522) - .5 = -.00139 | cos(.522) = .8665 | .5235
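
A small self-contained Python loop that reproduces this table (using full precision, so the last digits differ slightly from the hand-rounded values above):

import math

p = 0.0                               # p0 = 0
for k in range(3):
    f  = math.sin(p) - 0.5            # f(p_k)
    fp = math.cos(p)                  # f'(p_k)
    p_next = p - f / fp               # p_{k+1}
    print(k, round(p, 4), round(f, 4), round(fp, 4), round(p_next, 4))
    p = p_next
# the iterates approach pi/6 ≈ 0.5236, the solution of sin(x) = 1/2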

 

 

 

 

 

 

 

 

 

 

Newton-Raphson is a very fast method, but it requires the derivative of the function as input. However, in most cases it is possible to compute the derivative numerically:

f'(x) = lim (e → 0) [f(x + e) - f(x)] / e

 

We can use a small e to approximate the derivative.

E.g. f(x) = sin(x) - .5 ; e = .001

f'(x) = cos(x)

 

f'(0) ≈ [f(0 + e) - f(0)] / e

→ [ sin(0 + .001) - sin(0) ] / .001

→ 1.0000, which matches the exact value cos(0) = 1

 

f'(.5) ≈ [f(.5 + e) - f(.5)] / e

→ [ sin(.5 + .001) - sin(.5) ] / .001

→ .8773

and the exact value is cos(.5) = .8775

 

The value e = .001 worked well here because it is small compared to the scale over which f(x) varies.
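
A minimal sketch of this forward-difference estimate (the helper name is mine):

import math

def forward_difference(f, x, e=1e-3):
    # f'(x) ≈ [f(x + e) - f(x)] / e
    return (f(x + e) - f(x)) / e

f = lambda x: math.sin(x) - 0.5
print(forward_difference(f, 0.0))     # ≈ 1.0000, exact value is cos(0) = 1
print(forward_difference(f, 0.5))     # ≈ 0.8773, exact value is cos(0.5)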

 

 

6/24/03

 

 

 

Order of Convergence

 

An indicator of how fast a method converges to a solution. It is used to label the convergence speeds of different methods.

Assume that p is a solution of f(x) = 0 and that E_n = p - p_n is the approximation error at step n. If two constants A > 0 and R > 0 exist such that

 

lim (n → ∞) { | p - p_{n+1} | / | p - p_n |^R }

= lim (n → ∞) { |E_{n+1}| / |E_n|^R }

= A

 

Then the sequence is said to converge to p with an order of convergence R. This means,

 

→ |E_{n+1}| ≈ A·|E_n|^R for large n

 

A problem with convergence: the Newton-Raphson method uses the derivative (the tangent line) to estimate where the zero lies. If the tangent line is a poor model of the function near the starting point and the initial approximation is far from the solution, the method may not converge.

 

This function has one real root and two complex roots.

 

 

Iterations do not converge with Newton-Raphson if we start at p0, because the tangent line at p0 intersects the x-axis far from where the root actually is.

 

Solution: choose another p0.

In this case it will converge.

 

 

So, in conclusion, we still need to have some knowledge about the equation to choose a starting point.
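
The figures from the lecture are not reproduced here. As an illustrative stand-in with the same character (one real root, two complex roots), f(x) = x^3 - 2x + 2 makes Newton-Raphson cycle forever when started at p0 = 0, but converge when started near the real root. A minimal sketch:

f  = lambda x: x**3 - 2*x + 2          # one real root near -1.7693, two complex roots
fp = lambda x: 3*x**2 - 2

def newton_steps(p, n=6):
    seq = [round(p, 4)]
    for _ in range(n):
        p = p - f(p) / fp(p)
        seq.append(round(p, 4))
    return seq

print(newton_steps(0.0))    # [0.0, 1.0, 0.0, 1.0, ...]  cycles, never converges
print(newton_steps(-2.0))   # approaches the real root near -1.7693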

 

Secant Method

 

- The Newton-Raphson method requires the evaluation of two functions at each iteration.

i.e. f(p_k) and f'(p_k)

- The Secant method doesn't require the derivative. The method uses two initial points that, unlike bisection and regula falsi, do not need to bracket a zero.

- The two initial points should be close to the root. The line through the points (p_k, f(p_k)) and (p_{k+1}, f(p_{k+1})) is drawn, and its intersection with the x-axis gives p_{k+2}.

 

 

 

The slope m = [f(p1) - f(p0)] / [p1 – p0] , and

                   m = [0 - f(p1)] / [p2 – p1]

 

Equating both, we get

p2 = p1 - f(p1) * (p1 - p0) / [f(p1) - f(p0)]

 

Therefore, in general we have,

p_{k+1} = p_k - f(p_k) * (p_k - p_{k-1}) / [f(p_k) - f(p_{k-1})]

 

Similarity between Newton-Raphson and Secant method

 

p_{k+1} = p_k - f(p_k) * (p_k - p_{k-1}) / [f(p_k) - f(p_{k-1})]

As p_k → p_{k-1}, the difference quotient [f(p_k) - f(p_{k-1})] / (p_k - p_{k-1}) approaches the derivative f'(p_{k-1}).

So the Secant method turns into the Newton-Raphson method in the limit p_k → p_{k-1}.

 

E.g. f(x) = sin(x) – 1/2

Start with p0 = 0 and p1 = 1

 

k | p_k    | f(p_k)  | p_{k+1}
0 | 0      | -0.5    |
1 | 1      | 0.3415  | 1 - [.3415(1 - 0)] / (.3415 + .5) = .594
2 | .594   | .0597   | .594 - [.0597(.594 - 1)] / (.0597 - .3415) = .50798
3 | .50799 | -.0136  | .50799 - [-.0136(.50799 - .594)] / (-.0136 - .0597) ≈ .5238
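
A small self-contained Python loop for the secant iteration on this example (full precision, so the digits differ slightly from the hand computation above):

import math

f = lambda x: math.sin(x) - 0.5
p_prev, p = 0.0, 1.0                     # p0 and p1
for k in range(1, 5):
    p_next = p - f(p) * (p - p_prev) / (f(p) - f(p_prev))
    print(k, round(p, 5), round(f(p), 5), round(p_next, 5))
    p_prev, p = p, p_next
# the iterates approach pi/6 ≈ 0.5236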

 

 

 

 

 

 

 

 

6/25/03

 

 

 

Order of Convergence (revisit)

 

If the method converges as n → ∞ and |E_{n+1}| ≈ A·|E_n|^R for some A and R (with |E_n| < 1),

then the method is said to converge to p with an order of convergence R.

If R = 1,  then convergence is linear

If R = 2,  then convergence is quadratic.

 

In Newton-Raphson with a simple root → R = 2

In Newton-Raphson with a repeated (double) root → R = 1

In the Secant method with a simple root → R ≈ 1.618

In the Secant method with a repeated root → R = 1

In Bisection → R = 1 and A = 1/2

In Regula Falsi → R = 1
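
We can check the quadratic order of Newton-Raphson empirically. A minimal sketch (names are mine) that estimates R from successive errors of the sin(x) = 1/2 example:

import math

f  = lambda x: math.sin(x) - 0.5
fp = lambda x: math.cos(x)
root = math.pi / 6                       # exact solution of sin(x) = 1/2

# collect Newton-Raphson errors E_n = |p - p_n|
errors, p = [], 0.0
for _ in range(4):
    errors.append(abs(root - p))
    p = p - f(p) / fp(p)

# estimate R as log(E_{n+1}/E_n) / log(E_n/E_{n-1})
for e0, e1, e2 in zip(errors, errors[1:], errors[2:]):
    print(math.log(e2 / e1) / math.log(e1 / e0))   # approaches 2 (quadratic)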

 

The solution of the linear system Ax = B

 

We will see how to solve this system of equations in two ways:

- Exact (direct) solution

- Iterative solution

 

Definition

 

N Dimensional vector

 

X = (x_1, x_2, x_3, x_4, ..., x_n)

 

It represents a point in an N-dimensional space.

N-dimensional space is the set of all N-dimensional vectors.

 

A vector used to define a position → position vector.

A displacement between two points → displacement vector.

 

Operation with Vectors

 

Equality → X = Y iff x_j = y_j for all j = 1, 2, ..., n

Sum → X + Y = (x_1 + y_1, x_2 + y_2, ..., x_n + y_n)

Negative → -Y = (-y_1, -y_2, ..., -y_n)

Difference → X - Y = (x_1 - y_1, x_2 - y_2, ..., x_n - y_n)

Scalar multiplication → cX = (c·x_1, c·x_2, ..., c·x_n)

Linear combination → cX + dY = (c·x_1 + d·y_1, c·x_2 + d·y_2, ..., c·x_n + d·y_n)

Dot product → X · Y = x_1·y_1 + x_2·y_2 + ... + x_n·y_n   (a scalar)

Euclidean norm → || X || = (x_1^2 + x_2^2 + ... + x_n^2)^(1/2)

 

In scalar multiplication cX, if |c| > 1, then cX is a stretched version of X, and if |c| < 1, then cX is a shrunken version of X.

 

Displacement of a vector

 

If  x  and y represent two points,  y - x represents the displacement vector from x to y.

Distance between them is ||y - x|| = ((y_1 - x_1)^2 + (y_2 - x_2)^2 + ... + (y_n - x_n)^2)^(1/2)
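
A minimal Python sketch of these vector operations on plain lists (function names are mine):

import math

def dot(x, y):
    # dot product: sum of componentwise products
    return sum(xi * yi for xi, yi in zip(x, y))

def norm(x):
    # Euclidean norm: (x_1^2 + ... + x_n^2)^(1/2)
    return math.sqrt(dot(x, x))

def distance(x, y):
    # length of the displacement vector y - x
    return norm([yi - xi for xi, yi in zip(x, y)])

print(dot([1, 2, 3], [4, 5, 6]))    # 32
print(norm([3, 4]))                 # 5.0
print(distance([0, 0], [3, 4]))     # 5.0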

 

Vector Algebra

 

 Y + X = X + Y    (commutative)

X + (Y + Z) = (X + Y) + Z  (associative)

X = X + 0 = 0 + X  (additive identity)

X - X = X + (-X) = 0  (additive inverse)

(a + b) X = aX + bX  (distributive)

a(X + Y) = aX + aY  (distributive with vector)

a(bX) = (ab)X  (associative with vector)

 

Example of a linear system

 

6x + 3y + 2z = 29

3x +  y + 2z = 17

2x + 3y + 2z = 21

 

Each equation represents a plane. The solution (x,y,z) of this system is the intersection of the 3 planes.

These systems can be represented as

[ 6  3  2 ] [ x ]   [ 29 ]
[ 3  1  2 ] [ y ] = [ 17 ]
[ 2  3  2 ] [ z ]   [ 21 ]
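
A quick numerical check of this system (NumPy's built-in solver is used here only as a stand-in for the solution methods developed later):

import numpy as np

A = np.array([[6., 3., 2.],
              [3., 1., 2.],
              [2., 3., 2.]])
b = np.array([29., 17., 21.])

print(np.linalg.solve(A, b))    # [2. 3. 4.]  ->  x = 2, y = 3, z = 4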

 

 

 

6/26/03

 

 

Introduction to Matlab

 

The T.A. gave a lecture on Matlab; the slides can be found at www.cs.purdue.edu/homes/cs314/matlabp.pdf

 

 

6/27/03

 

 

Matrices

 

A matrix A is a collection of vectors. You can see an M x N matrix as

1) M rows of N-dimensional vectors

2) N columns of M-dimensional vectors

 

Operators

 

Equality: A = B iff a_ij = b_ij for all i, j

Sum:  A + B = [a_ij + b_ij]_mxn

Negation: -A = [-a_ij]_mxn

Difference:  A - B = [a_ij - b_ij]_mxn

Scalar multiplication: cA = [c·a_ij]_mxn

Weighted sum:  pA + qB = [p·a_ij + q·b_ij]_mxn

 

Properties

 

X + Y = Y + X    (commutative)

X + (Y + Z) = (X + Y) + Z  (associative)

X + 0 = 0 + X  (additive identity)

X - X = X + (-X) = 0  (additive inverse)

(a + b) X = aX + bX  (distributive over scalar addition)

a(X + Y) = aX + aY  (distributive over matrix addition)

a(bX) = (ab)X  (associative for scalar multiplication)

 

Matrix multiplication

 

A·X = Y, where y_j = Σ_k a_jk · x_k

In general, if A is m x n and B is n x p, then C = A·B is m x p, with c_ij = Σ_k a_ik · b_kj

E.g.

| 2  3 | | 2 |   | 13 |
| 4  1 | | 3 | = | 11 |

    A      X       Y

 

# of multiplications = M.N.P

# of additions = (N-1).M.P
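
A minimal sketch of the triple loop behind C = A·B; the M·N·P multiplication count is visible in the nesting (names are mine):

def mat_mul(A, B):
    # A is M x N, B is N x P; C = A*B is M x P with c_ij = sum_k a_ik * b_kj
    M, N, P = len(A), len(B), len(B[0])
    C = [[0] * P for _ in range(M)]
    for i in range(M):
        for j in range(P):
            for k in range(N):
                C[i][j] += A[i][k] * B[k][j]    # M*N*P multiplications in total
    return C

print(mat_mul([[2, 3], [4, 1]], [[2], [3]]))    # [[13], [11]]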

Special matrix

 

Zero matrix: 0_mxn = [0]_mxn

          | 0  0 |
O_3x2  =  | 0  0 |
          | 0  0 |

Identity matrix

 

I_n = [d_ij]_nxn, where d_ij = 1 when i = j and d_ij = 0 when i ≠ j

→ A·I = A

 

Properties of matrix multiplication

 

(AB)C = A(BC)  (associative for multiplication)

IA = AI = A  (identity)

A(B + C) = AB + AC  (left distributive)

(A + B).C = AC + BC  (right distributive)

No commutative property: in general, A·B ≠ B·A

 

Inverse of a non-singular matrix

 

An n x n matrix A is called non-singular (or invertible) if there exists an n x n matrix B such that A·B = B·A = I;

then B is the inverse of A.

→ B = A^-1

→ A·A^-1 = A^-1·A = I

If A does not have an inverse, then it is called a singular matrix.

 

A matrix is singular if

1) A row or a column is full of zeros

2) A column or a row is a linear combination of other columns or rows

 

Determinant

 

- It is a scalar quantity associated with a square matrix

- How do we calculate the determinant?

  If A = [a_11] is a 1 x 1 matrix, then det(A) = a_11

  If A = [a_ij] is an n x n matrix with n > 1, then

            det(A) = Σ_j (-1)^(i+j) · a_ij · A_ij,

  where A_ij is the minor of a_ij: the determinant of the matrix obtained by deleting row i and column j (cofactor expansion along row i).

e.g. A = [5] → det(A) = 5

A = | 1  3 |   → det(A) = -13
    | 7  8 |
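
A minimal recursive sketch of this cofactor expansion, expanding along the first row (names are mine; fine for small matrices, far too slow for large ones):

def det(A):
    # base case: 1 x 1 matrix
    if len(A) == 1:
        return A[0][0]
    # det(A) = sum_j (-1)^j * a_0j * det(minor of a_0j)
    total = 0
    for j in range(len(A)):
        minor = [row[:j] + row[j+1:] for row in A[1:]]
        total += (-1) ** j * A[0][j] * det(minor)
    return total

print(det([[5]]))               # 5
print(det([[1, 3], [7, 8]]))    # -13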