6/23/03

 

 

 

Newton-Raphson method

 

Used to solve f(x) = 0.

If f(x), f'(x), f''(x) are continuous, then we can use this information to find the solution of f(x) = 0.

 

 

 

Newton-Raphson iteration

1) m = f'(p0)

2) m = [f(p1) - f(p0)] / [p1 - p0]

         → m = -f(p0) / [p1 - p0]   (since f(p1) = 0)

 

Equating 1) and 2), we get p1 = p0 - f(p0)/f'(p0)
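
A minimal Python sketch of this update rule (the function name, tolerance, and iteration cap are my own choices, not from the lecture):

import math

def newton_raphson(f, fprime, p0, tol=1e-10, max_iter=50):
    # repeat p_{k+1} = p_k - f(p_k)/f'(p_k) until the update is tiny
    p = p0
    for _ in range(max_iter):
        p_next = p - f(p) / fprime(p)   # one Newton step
        if abs(p_next - p) < tol:
            return p_next
        p = p_next
    return p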

 

Newton-Raphson and Taylor polynomial

 

A Taylor polynomial is used to approximate a continuous function by a polynomial around a point p0.

 

f(x) = f(p0) + f'(p0)(x - p0) + f''(p0)(x - p0)^2 / 2! + ...

 

The Newton-Raphson method approximates f(x) using the Taylor expansion up to two terms.

 

Substituting x = p1 and using f(p1) = 0:

 

f(p1) = 0

→ 0 = f(p0) + f'(p0)(p1 - p0) + f''(p0)(p1 - p0)^2 / 2! + ...

→ -f(p0) ≈ f'(p0)(p1 - p0)   (dropping the higher-order terms)

→ p1 = p0 - f(p0)/f'(p0)

 

We could improve Newton-Raphson by using three terms in the Taylor expansion instead of two. This will use a quadratic equation to approximate f(x) instead of a linear equation.  (This improved method also needs a 2nd derivative.)

 

As an example, let's apply Newton-Raphson to an equation.

 

E.g.  sin (x) = 1/2

x is in radians; if x were in degrees, then f'(x) would be more complicated.

 

f(x) = sin (x) - 1/2 = 0

f'(x) = cos (x)

 

Iteration equation:

p_{k+1} = p_k - f(p_k) / f'(p_k)

Start with p0 = 0

 

k | p_k   | f(p_k)                   | f'(p_k)           | p_{k+1}
0 | 0     | sin(0) - .5 = -.5        | cos(0) = 1        | .5
1 | .5    | sin(.5) - .5 = -.02      | cos(.5) = .8775   | .522
2 | .522  | sin(.522) - .5 = -.00139 | cos(.522) = .8665 | .5235
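
A small self-contained Python loop that reproduces this table (using full precision, so the last digits differ slightly from the hand-rounded values above):

import math

p = 0.0                               # p0 = 0
for k in range(3):
    f  = math.sin(p) - 0.5            # f(p_k)
    fp = math.cos(p)                  # f'(p_k)
    p_next = p - f / fp               # p_{k+1}
    print(k, round(p, 4), round(f, 4), round(fp, 4), round(p_next, 4))
    p = p_next
# the iterates approach pi/6 ≈ 0.5236, the solution of sin(x) = 1/2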

 

 

 

 

 

 

 

 

 

 

Newton-Raphson is a very fast method, but it requires the derivative of the function as input. However, in most cases it is possible to compute the derivative numerically:

f'(x) = lim (e → 0) [f(x + e) - f(x)] / e

 

We can use a small e to approximate the derivative.

E.g. f(x) = sin(x) - .5 ; e = .001

f'(x) = cos(x)

 

f'(0) ≈ [f(0 + e) - f(0)] / e

→ [ sin(0 + .001) - sin(0) ] / .001

→ 1.0000, which matches the exact value cos(0) = 1

 

f'(.5) ≈ [f(.5 + e) - f(.5)] / e

→ [ sin(.5 + .001) - sin(.5) ] / .001

→ .8773

and the exact value is cos(.5) = .8775

 

The value e = .001 worked well here because it is small compared to the scale over which f(x) varies.
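
A minimal sketch of this forward-difference estimate (the helper name is mine):

import math

def forward_difference(f, x, e=1e-3):
    # f'(x) ≈ [f(x + e) - f(x)] / e
    return (f(x + e) - f(x)) / e

f = lambda x: math.sin(x) - 0.5
print(forward_difference(f, 0.0))     # ≈ 1.0000, exact value is cos(0) = 1
print(forward_difference(f, 0.5))     # ≈ 0.8773, exact value is cos(0.5)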

 

 

6/24/03

 

 

 

Order of Convergence

 

An indicator of how fast a method converges to a solution. It is used to label the convergence speeds of different methods.

Assume that p is a solution of f(x) = 0 and that E_n = p - p_n is the approximation error at step n. If two constants A > 0 and R > 0 exist such that

 

lim (n → ∞) { | p - p_{n+1} | / | p - p_n |^R }

= lim (n → ∞) { |E_{n+1}| / |E_n|^R }

= A

 

Then the sequence is said to converge to p with an order of convergence R. This means,

 

→ |E_{n+1}| ≈ A·|E_n|^R for large n

 

A problem with convergence: the Newton-Raphson method uses the derivative (the tangent line) to estimate where the zero lies. If the tangent line is a poor model of the function near the starting point and the initial approximation is far from the solution, the method may not converge.

 

This function has one real root and two complex roots.

 

 

Iterations do not converge with Newton-Raphson if we start at p0, because the tangent line at p0 intersects the x-axis far from where the root actually is.

 

Solution: choose another p0.

In this case it will converge.

 

 

So, in conclusion, we still need to have some knowledge about the equation to choose a starting point.
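
The figures from the lecture are not reproduced here. As an illustrative stand-in with the same character (one real root, two complex roots), f(x) = x^3 - 2x + 2 makes Newton-Raphson cycle forever when started at p0 = 0, but converge when started near the real root. A minimal sketch:

f  = lambda x: x**3 - 2*x + 2          # one real root near -1.7693, two complex roots
fp = lambda x: 3*x**2 - 2

def newton_steps(p, n=6):
    seq = [round(p, 4)]
    for _ in range(n):
        p = p - f(p) / fp(p)
        seq.append(round(p, 4))
    return seq

print(newton_steps(0.0))    # [0.0, 1.0, 0.0, 1.0, ...]  cycles, never converges
print(newton_steps(-2.0))   # approaches the real root near -1.7693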

 

Secant Method

 

- The Newton-Raphson method requires the evaluation of two functions at each iteration.

i.e. f(p_k) and f'(p_k)

- The Secant method doesn't require the derivative. The method uses two initial points that, unlike bisection and regula falsi, do not need to bracket a zero.

- The two initial points should be close to the root. The line through the points (p_k, f(p_k)) and (p_{k+1}, f(p_{k+1})) is drawn, and its intersection with the x-axis gives p_{k+2}.

 

 

 

The slope m = [f(p1) - f(p0)] / [p1 – p0] , and

                   m = [0 - f(p1)] / [p2 – p1]

 

Equating both, we get

p2 = p1 - f(p1) * (p1 - p0) / [f(p1) - f(p0)]

 

Therefore, in general we have,

p_{k+1} = p_k - f(p_k) * (p_k - p_{k-1}) / [f(p_k) - f(p_{k-1})]

 

Similarity between Newton-Raphson and Secant method

 

p_{k+1} = p_k - f(p_k) * (p_k - p_{k-1}) / [f(p_k) - f(p_{k-1})]

As p_k → p_{k-1}, the difference quotient [f(p_k) - f(p_{k-1})] / (p_k - p_{k-1}) approaches the derivative f'(p_{k-1}).

So the Secant method turns into the Newton-Raphson method in the limit p_k → p_{k-1}.

 

E.g. f(x) = sin(x) – 1/2

Start with p0 = 0 and p1 = 1

 

k | p_k    | f(p_k)  | p_{k+1}
0 | 0      | -0.5    |
1 | 1      | 0.3415  | 1 - [.3415(1 - 0)] / (.3415 + .5) = .594
2 | .594   | .0597   | .594 - [.0597(.594 - 1)] / (.0597 - .3415) = .50798
3 | .50799 | -.0136  | .50799 - [-.0136(.50799 - .594)] / (-.0136 - .0597) ≈ .5238
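
A small self-contained Python loop for the secant iteration on this example (full precision, so the digits differ slightly from the hand computation above):

import math

f = lambda x: math.sin(x) - 0.5
p_prev, p = 0.0, 1.0                     # p0 and p1
for k in range(1, 5):
    p_next = p - f(p) * (p - p_prev) / (f(p) - f(p_prev))
    print(k, round(p, 5), round(f(p), 5), round(p_next, 5))
    p_prev, p = p, p_next
# the iterates approach pi/6 ≈ 0.5236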

 

 

 

 

 

 

 

 

6/25/03

 

 

 

Order of Convergence (revisit)

 

If the method converges as n → ∞ and |E_{n+1}| ≈ A·|E_n|^R for some A and R (with |E_n| < 1),

then the method is said to converge to p with an order of convergence R.

If R = 1,  then convergence is linear

If R = 2,  then convergence is quadratic.

 

In Newton-Raphson with a simple root → R = 2

In Newton-Raphson with a repeated (double) root → R = 1

In the Secant method with a simple root → R ≈ 1.618

In the Secant method with a repeated root → R = 1

In Bisection → R = 1 and A = 1/2

In Regula Falsi → R = 1
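
We can check the quadratic order of Newton-Raphson empirically. A minimal sketch (names are mine) that estimates R from successive errors of the sin(x) = 1/2 example:

import math

f  = lambda x: math.sin(x) - 0.5
fp = lambda x: math.cos(x)
root = math.pi / 6                       # exact solution of sin(x) = 1/2

# collect Newton-Raphson errors E_n = |p - p_n|
errors, p = [], 0.0
for _ in range(4):
    errors.append(abs(root - p))
    p = p - f(p) / fp(p)

# estimate R as log(E_{n+1}/E_n) / log(E_n/E_{n-1})
for e0, e1, e2 in zip(errors, errors[1:], errors[2:]):
    print(math.log(e2 / e1) / math.log(e1 / e0))   # approaches 2 (quadratic)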

 

The solution of the linear system Ax = B

 

We will see how to solve this system of equations in two ways:

- Exact (direct) solution

- Iterative solution

 

Definition

 

N Dimensional vector

 

X = (x_1, x_2, x_3, x_4, ..., x_n)

 

It represents a point in an N-dimensional space.

N-dimensional space is the set of all N-dimensional vectors.

 

A vector used to define a position → position vector.

A displacement between two points → displacement vector.

 

Operation with Vectors

 

Equality → X = Y iff x_j = y_j for all j = 1, 2, ..., n

Sum → X + Y = (x_1 + y_1, x_2 + y_2, ..., x_n + y_n)

Negative → -Y = (-y_1, -y_2, ..., -y_n)

Difference → X - Y = (x_1 - y_1, x_2 - y_2, ..., x_n - y_n)

Scalar multiplication → cX = (c·x_1, c·x_2, ..., c·x_n)

Linear combination → cX + dY = (c·x_1 + d·y_1, c·x_2 + d·y_2, ..., c·x_n + d·y_n)

Dot product → X · Y = x_1·y_1 + x_2·y_2 + ... + x_n·y_n   (a scalar)

Euclidean norm → || X || = (x_1^2 + x_2^2 + ... + x_n^2)^(1/2)

 

In scalar multiplication cX, if |c| > 1, then cX is a stretched version of X, and if |c| < 1, then cX is a shrunken version of X.

 

Displacement of a vector

 

If  x  and y represent two points,  y - x represents the displacement vector from x to y.

Distance between them is ||y - x|| = ((y_1 - x_1)^2 + (y_2 - x_2)^2 + ... + (y_n - x_n)^2)^(1/2)
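
A minimal Python sketch of these vector operations on plain lists (function names are mine):

import math

def dot(x, y):
    # dot product: sum of componentwise products
    return sum(xi * yi for xi, yi in zip(x, y))

def norm(x):
    # Euclidean norm: (x_1^2 + ... + x_n^2)^(1/2)
    return math.sqrt(dot(x, x))

def distance(x, y):
    # length of the displacement vector y - x
    return norm([yi - xi for xi, yi in zip(x, y)])

print(dot([1, 2, 3], [4, 5, 6]))    # 32
print(norm([3, 4]))                 # 5.0
print(distance([0, 0], [3, 4]))     # 5.0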

 

Vector Algebra

 

 Y + X = X + Y    (commutative)

X + (Y + Z) = (X + Y) + Z  (associative)

X = X + 0 = 0 + X  (additive identity)

X - X = X + (-X) = 0  (additive inverse)

(a + b) X = aX + bX  (distributive)

a(X + Y) = aX + aY  (distributive with vector)

a(bX) = (ab)X  (associative with vector)

 

Example of a linear system

 

6x + 3y + 2z = 29

3x +  y + 2z = 17

2x + 3y + 2z = 21

 

Each equation represents a plane. The solution (x,y,z) of this system is the intersection of the 3 planes.

These systems can be represented as

[ 6  3  2 ] [ x ]   [ 29 ]
[ 3  1  2 ] [ y ] = [ 17 ]
[ 2  3  2 ] [ z ]   [ 21 ]
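
A quick numerical check of this system (NumPy's built-in solver is used here only as a stand-in for the solution methods developed later):

import numpy as np

A = np.array([[6., 3., 2.],
              [3., 1., 2.],
              [2., 3., 2.]])
b = np.array([29., 17., 21.])

print(np.linalg.solve(A, b))    # [2. 3. 4.]  ->  x = 2, y = 3, z = 4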

 

 

 

6/26/03

 

 

Introduction to Matlab

 

The T.A. gave a lecture on Matlab; the slides can be found at www.cs.purdue.edu/homes/cs314/matlabp.pdf

 

 

6/27/03

 

 

Matrices

 

A matrix A is a collection of vectors. You can see an M x N matrix as

1) M rows of N-dimensional vectors

2) N columns of M-dimensional vectors

 

Operators

 

Equality: A = B iff a_ij = b_ij for all i, j

Sum:  A + B = [a_ij + b_ij]_mxn

Negation: -A = [-a_ij]_mxn

Difference:  A - B = [a_ij - b_ij]_mxn

Scalar multiplication: cA = [c·a_ij]_mxn

Weighted sum:  pA + qB = [p·a_ij + q·b_ij]_mxn

 

Properties

 

X + Y = Y + X    (commutative)

X + (Y + Z) = (X + Y) + Z  (associative)

X + 0 = 0 + X  (additive identity)

X - X = X + (-X) = 0  (additive inverse)

(a + b) X = aX + bX  (distributive over scalar addition)

a(X + Y) = aX + aY  (distributive over matrix addition)

a(bX) = (ab)X  (associative for scalar multiplication)

 

Matrix multiplication

 

A·X = Y, where y_j = Σ_k a_jk · x_k

In general, if A is m x n and B is n x p, then C = A·B is m x p, with c_ij = Σ_k a_ik · b_kj

E.g.

| 2  3 | | 2 |   | 13 |
| 4  1 | | 3 | = | 11 |

    A      X       Y

 

# of multiplications = M.N.P

# of additions = (N-1).M.P
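
A minimal sketch of the triple loop behind C = A·B; the M·N·P multiplication count is visible in the nesting (names are mine):

def mat_mul(A, B):
    # A is M x N, B is N x P; C = A*B is M x P with c_ij = sum_k a_ik * b_kj
    M, N, P = len(A), len(B), len(B[0])
    C = [[0] * P for _ in range(M)]
    for i in range(M):
        for j in range(P):
            for k in range(N):
                C[i][j] += A[i][k] * B[k][j]    # M*N*P multiplications in total
    return C

print(mat_mul([[2, 3], [4, 1]], [[2], [3]]))    # [[13], [11]]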

Special matrix

 

Zero matrix: 0_mxn = [0]_mxn

          | 0  0 |
O_3x2  =  | 0  0 |
          | 0  0 |

Identity matrix

 

I_n = [d_ij]_nxn, where d_ij = 1 when i = j and d_ij = 0 when i ≠ j

→ A·I = A

 

Properties of matrix multiplication

 

(AB)C = A(BC)  (associative for multiplication)

IA = AI = A  (identity)

A(B + C) = AB + AC  (left distributive)

(A + B).C = AC + BC  (right distributive)

No commutative property: in general, A·B ≠ B·A

 

Inverse of a non-singular matrix

 

An n x n matrix A is called non-singular (or invertible) if there exists an n x n matrix B such that A·B = B·A = I;

then B is the inverse of A.

→ B = A^-1

→ A·A^-1 = A^-1·A = I

If A does not have an inverse, then it is called a singular matrix.

 

A matrix is singular if

1) A row or a column is full of zeros

2) A column or a row is a linear combination of other columns or rows

 

Determinant

 

- It is a scalar quantity associated with a square matrix

- How do we calculate the determinant?

  If A = [a_11] is a 1 x 1 matrix, then det(A) = a_11

  If A = [a_ij] is an n x n matrix with n > 1, then

            det(A) = Σ_j (-1)^(i+j) · a_ij · A_ij,

  where A_ij is the minor of a_ij: the determinant of the matrix obtained by deleting row i and column j (cofactor expansion along row i).

e.g. A = [5] → det(A) = 5

A = | 1  3 |   → det(A) = -13
    | 7  8 |
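
A minimal recursive sketch of this cofactor expansion, expanding along the first row (names are mine; fine for small matrices, far too slow for large ones):

def det(A):
    # base case: 1 x 1 matrix
    if len(A) == 1:
        return A[0][0]
    # det(A) = sum_j (-1)^j * a_0j * det(minor of a_0j)
    total = 0
    for j in range(len(A)):
        minor = [row[:j] + row[j+1:] for row in A[1:]]
        total += (-1) ** j * A[0][j] * det(minor)
    return total

print(det([[5]]))               # 5
print(det([[1, 3], [7, 8]]))    # -13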