6/23/03
Newton-Raphson is used to solve f(x) = 0. If f(x), f'(x), and f''(x) are continuous, then we can use this information to find a solution of f(x) = 0.
Newton-Raphson iteration:
1) m = f'(p0)
2) m = [f(p1) - f(p0)] / [p1 - p0] => m = -f(p0) / [p1 - p0] (since we take f(p1) = 0, i.e. p1 is where the tangent line crosses the x-axis)
Equating 1) and 2), we get p1 = p0 - f(p0)/f'(p0)
A Taylor polynomial is used to approximate a continuous function by a polynomial:
f(x) = f(p0) + f'(p0)(x - p0) + f''(p0)(x - p0)^2 / 2! + ...
The Newton-Raphson method approximates f(x) using the Taylor expansion up to two terms. Setting x = p1 and requiring f(p1) = 0:
=> 0 = f(p0) + f'(p0)(p1 - p0) + f''(p0)(p1 - p0)^2 / 2! + ...
Dropping the second-order and higher terms:
=> -f(p0) = f'(p0)(p1 - p0)
=> p1 = p0 - f(p0)/f'(p0)
We could improve Newton-Raphson by using three terms of the Taylor expansion instead of two, i.e. using a quadratic equation to approximate f(x) instead of a linear one. (This improved method also needs the 2nd derivative.)
E.g. solve sin(x) = 1/2. Here x is in radians; if x were in degrees, f'(x) would be more complicated.
f(x) = sin(x) - 1/2 = 0
f'(x) = cos(x)
Iteration equation: pk+1 = pk - f(pk)/f'(pk)
Start with p0 = 0.
i | pk    | f(pk)                    | f'(pk)            | pk+1
0 | 0     | sin(0) - .5 = -.5        | cos(0) = 1        | .5
1 | .5    | sin(.5) - .5 = -.02      | cos(.5) = .8775   | .522
2 | .522  | sin(.522) - .5 = -.00139 | cos(.522) = .8665 | .5235
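A minimal Python sketch of this iteration (the function names and tolerance are my own choices, not from the lecture):

import math

def newton_raphson(f, fprime, p0, tol=1e-6, max_iter=20):
    # Newton-Raphson iteration: p_{k+1} = p_k - f(p_k) / f'(p_k)
    p = p0
    for _ in range(max_iter):
        p_next = p - f(p) / fprime(p)
        if abs(p_next - p) < tol:
            return p_next
        p = p_next
    return p

# Solve sin(x) - 1/2 = 0 starting at p0 = 0; the root is pi/6 ~ 0.5236.
print(newton_raphson(lambda x: math.sin(x) - 0.5, math.cos, 0.0))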
Newton-Raphson is a very fast method, but it requires the derivative of the function as input. However, in most cases it is possible to compute the derivative numerically:
f'(x) = lim e->0 { [f(x + e) - f(x)] / e }
We can use a small e to approximate the derivative.
E.g. f(x) = sin(x) - .5 ; e = .001 ; exact derivative f'(x) = cos(x)
f'(0) ≈ [f(0 + e) - f(0)] / e = [sin(0 + .001) - sin(0)] / .001 ≈ 1, which is equal to cos(0) = 1 (exact)
f'(.5) ≈ [f(.5 + e) - f(.5)] / e = [sin(.5 + .001) - sin(.5)] / .001 ≈ .8773, while the exact value is cos(.5) = .8775
The value e = .001 worked well because it is small compared to the scale over which f(x) varies.
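The same finite-difference idea in a short Python sketch (the helper name is my own; e = .001 follows the example above):

import math

def numerical_derivative(f, x, e=1e-3):
    # forward difference: f'(x) ~ [f(x + e) - f(x)] / e
    return (f(x + e) - f(x)) / e

f = lambda x: math.sin(x) - 0.5
print(numerical_derivative(f, 0.0))   # ~1.0000 (exact: cos(0) = 1)
print(numerical_derivative(f, 0.5))   # ~0.8773 (exact: cos(0.5))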
6/24/03
Order of Convergence
An indicator of how fast a method converges to a solution. It is used to compare the convergence speeds of different methods.
Assume that p is a solution of f(x) = 0, and that En = p - pn is the approximation error. If two constants A != 0 and R > 0 exist such that
lim n->∞ { |p - pn+1| / |p - pn|^R } = lim n->∞ { |En+1| / |En|^R } = A
then the sequence is said to converge to p with an order of convergence R. This means that, for large n,
=> |En+1| ≈ A·|En|^R
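As an illustration (my own addition, not from the lecture), R can be estimated numerically from successive errors, since |En+1| ≈ A·|En|^R implies R ≈ log(En+2/En+1) / log(En+1/En):

import math

# Newton-Raphson errors for f(x) = sin(x) - 0.5 starting at p0 = 0 (root p = pi/6)
p_exact = math.pi / 6
p = 0.0
errors = [abs(p_exact - p)]
for _ in range(4):
    p = p - (math.sin(p) - 0.5) / math.cos(p)
    errors.append(abs(p_exact - p))

# estimate R from consecutive error triples
for e0, e1, e2 in zip(errors, errors[1:], errors[2:]):
    if e0 > 0 and e1 > 0 and e2 > 0:
        print(math.log(e2 / e1) / math.log(e1 / e0))   # tends toward R = 2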
A problem with convergence is that the Newton-Raphson method uses the derivative (the tangent line) to estimate where the zero is. If the tangent line is a poor approximation of the function and the initial approximation is far from the solution, the method may not converge.
(The example function sketched in class has one real root and two complex roots.) Iterations do not converge with Newton-Raphson if we start at p0, because the tangent line at p0 and its intersection with the x-axis do not approximate where the real root actually is.
Solution: choose another p0. In this case it will converge. So, in conclusion, we still need to have some knowledge about the equation to choose a starting point.
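The function from the class sketch is not reproduced in these notes, but a commonly used illustration of this behavior (my own choice, not necessarily the one drawn in class) is f(x) = x^3 - 2x + 2, which has one real root and two complex roots. Starting at p0 = 0 the Newton-Raphson iterates oscillate between 0 and 1, while a better starting point converges:

def f(x):
    return x**3 - 2*x + 2        # one real root near x = -1.7693, two complex roots

def fprime(x):
    return 3*x**2 - 2

def newton_trace(p, steps=8):
    trace = [p]
    for _ in range(steps):
        p = p - f(p) / fprime(p)
        trace.append(p)
    return trace

print(newton_trace(0.0))    # oscillates: 0, 1, 0, 1, ... (does not converge)
print(newton_trace(-2.0))   # converges to the real root, about -1.7693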
The Secant Method
- The Newton-Raphson method requires the evaluation of two functions at each iteration, i.e. f(pk) and f'(pk).
- The Secant method doesn't require the derivative. The method uses two initial points that do not need to have a zero in between, unlike bisection and regula falsi.
- The two initial points have to be close to the root. Between two points pk and pk+1, create a line; its intersection with the x-axis gives pk+2.
The slope m = [f(p1) - f(p0)] / [p1 - p0], and
m = [0 - f(p1)] / [p2 - p1]
Equating both, we get
p2 = p1 - f(p1) * [ (p1 - p0) / (f(p1) - f(p0)) ]
Therefore, in general we have
pk+1 = pk - f(pk) * [ (pk - pk-1) / (f(pk) - f(pk-1)) ]
Similarity between Newton-Raphson and the Secant method:
pk+1 = pk - f(pk) * [ (pk - pk-1) / (f(pk) - f(pk-1)) ]
If pk-1 → pk, then lim [ (pk - pk-1) / (f(pk) - f(pk-1)) ] = 1/f'(pk), the reciprocal of the derivative.
So the Secant method turns into the Newton-Raphson method, pk+1 = pk - f(pk)/f'(pk), in the limit pk-1 → pk.
E.g. f(x) = sin(x) - 1/2
Start with p0 = 0 and p1 = 1
i | pk     | f(pk)  | pk+1
0 | 0      | -0.5   |
1 | 1      | 0.3415 | 1 - [.3415(1-0)] / (.3415 + .5) = .594
2 | .594   | .0597  | .594 - [.0597(.594-1)] / (.0597 - .3415) = .50798
3 | .50798 | -.0136 | .50798 - [-.0136(.50798 - .594)] / (-.0136 - .0597) = .5238
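A minimal Python sketch of the secant iteration applied to this example (function names and tolerance are my own):

import math

def secant(f, p0, p1, tol=1e-6, max_iter=20):
    # p_{k+1} = p_k - f(p_k) * (p_k - p_{k-1}) / (f(p_k) - f(p_{k-1}))
    for _ in range(max_iter):
        p2 = p1 - f(p1) * (p1 - p0) / (f(p1) - f(p0))
        if abs(p2 - p1) < tol:
            return p2
        p0, p1 = p1, p2
    return p1

# Iterates pass through ~0.594, ~0.508, ~0.524, matching the table above.
print(secant(lambda x: math.sin(x) - 0.5, 0.0, 1.0))   # ~0.5236 = pi/6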
6/25/03
Order of Convergence (revisit)
If the method is converging as n -> ∞ and |En+1| ≈ A·|En|^R for some A and R (with En < 1), then the method is said to converge to p with an order of convergence R.
If R = 1, the convergence is linear.
If R = 2, the convergence is quadratic.
Newton-Raphson with a simple (single) root → R = 2
Newton-Raphson with a double root → R = 1
Secant method with a simple root → R = 1.618
Secant method with a double (or higher multiplicity) root → R = 1
Bisection → R = 1 and A = 1/2
Regula Falsi → R = 1
The solution of the linear system Ax = B
We will see how to solve this system of equations in two ways:
- Exact solution
- Iterative solution
Definition: N-dimensional vector
X = (x1, x2, x3, ..., xn)
It represents a point in an N-dimensional space. N-dimensional space is the set of all N-dimensional vectors.
A vector used to define a position → position vector.
A displacement between two points → displacement vector.
Operations with Vectors (a short code sketch of these operations follows the distance formula below):
Sum → X + Y = (x1 + y1, x2 + y2, ..., xn + yn)
Negative → -Y = (-y1, -y2, ..., -yn)
Difference → X - Y = (x1 - y1, x2 - y2, ..., xn - yn)
Scalar multiplication → cX = (cx1, cx2, ..., cxn)
Linear combination → cX + dY = (cx1 + dy1, cx2 + dy2, ..., cxn + dyn)
Dot product → X · Y = x1·y1 + x2·y2 + ... + xn·yn (a scalar)
Euclidean norm → ||X|| = (x1^2 + x2^2 + ... + xn^2)^(1/2)
In scalar multiplication cX, if |c| > 1 then cX is a stretched version of X, and if |c| < 1 then cX is a shrunken version of X.
If X and Y represent two points, Y - X represents the displacement vector from X to Y.
The distance between them is ||Y - X|| = ( (y1 - x1)^2 + (y2 - x2)^2 + ... + (yn - xn)^2 )^(1/2)
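A brief Python sketch of these vector operations (plain lists are used for illustration; in practice a library such as NumPy would be used):

import math

x = [1.0, 2.0, 3.0]
y = [4.0, 5.0, 6.0]

vec_sum = [xi + yi for xi, yi in zip(x, y)]                   # X + Y
scaled = [2.0 * xi for xi in x]                               # cX with c = 2
dot = sum(xi * yi for xi, yi in zip(x, y))                    # X . Y = 32 (a scalar)
norm = math.sqrt(sum(xi ** 2 for xi in x))                    # ||X|| = sqrt(14)
dist = math.sqrt(sum((yi - xi) ** 2 for xi, yi in zip(x, y))) # ||Y - X||
print(vec_sum, scaled, dot, norm, dist)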
Y + X = X + Y (commutative)
X + (Y + Z) = (X + Y) + Z (associative)
X = X + 0 = 0 + X (additive identity)
X - X = X + (-X) = 0 (additive inverse)
(a + b)X = aX + bX (distributive over scalar addition)
a(X + Y) = aX + aY (distributive over vector addition)
a(bX) = (ab)X (associative for scalar multiplication)
6x + 3y + 2z = 29
3x + y + 2z = 17
2x + 3y + 2z = 21
Each equation represents a plane. The solution (x, y, z) of this system is the intersection of the 3 planes.
These systems can be represented as
| 6 3 2 |   | x |   | 29 |
| 3 1 2 | . | y | = | 17 |
| 2 3 2 |   | z |   | 21 |
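As a quick check (my own addition), this system can be solved directly in Python with NumPy:

import numpy as np

A = np.array([[6.0, 3.0, 2.0],
              [3.0, 1.0, 2.0],
              [2.0, 3.0, 2.0]])
b = np.array([29.0, 17.0, 21.0])

x = np.linalg.solve(A, b)   # exact (up to round-off) solution of Ax = b
print(x)                    # [2. 3. 4.]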
6/26/03
Introduction to Matlab
The T.A. gave a lecture on Matlab; the slides can be found at www.cs.purdue.edu/homes/cs314/matlabp.pdf
6/27/03
A matrix A is a collection of vectors. You can see an M x N matrix as
1) M rows of N-dimensional vectors
2) N columns of M-dimensional vectors
Equality: A = B only if aij = bij for all i, j
Sum: A + B = [aij + bij]mxn
Difference: A - B = [aij - bij]mxn
Scalar multiplication: cA = [caij]mxn
Weighted sum: pA + qB = [paij + qbij]mxn
X + Y = Y + X (commutative)
X + (Y + Z) = (X + Y) + Z (associative)
X + 0 = 0 + X = X (additive identity)
X - X = X + (-X) = 0 (additive inverse)
(a + b)X = aX + bX (distributive for scalars)
a(X + Y) = aX + aY (distributive over matrix addition)
a(bX) = (ab)X (associative for scalars)
Matrix-vector multiplication: A·X = Y, where yi = ∑k aik·xk (each entry of Y is the dot product of a row of A with X). More generally, if A is m x n and B is n x p, then C = A·B is m x p with cij = ∑k aik·bkj.
E.g.
| 2 3 |   | 2 |   | 13 |
| 4 1 | . | 3 | = | 11 |
   A        X        Y
# of multiplications = M·N·P
# of additions = (N-1)·M·P
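A plain-Python sketch of the triple loop behind these operation counts (in practice NumPy's A @ B would be used):

def matmul(A, B):
    m, n, p = len(A), len(B), len(B[0])
    C = [[0] * p for _ in range(m)]
    for i in range(m):           # M rows of the result
        for j in range(p):       # P columns of the result
            for k in range(n):   # N multiplications per entry
                C[i][j] += A[i][k] * B[k][j]
    return C

print(matmul([[2, 3], [4, 1]], [[2], [3]]))   # [[13], [11]]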
Zero matrix: 0 = [0]mxn
       | 0 0 |
O3x2 = | 0 0 |
       | 0 0 |
Identity matrix
In = [dij]nxn, where dij = 1 when i = j and dij = 0 when i ≠ j
=> A·I = A
(AB)C = A(BC) (associative for multiplication)
IA = AI = A (identity)
A(B + C) = AB + AC (left distributive)
(A + B)C = AC + BC (right distributive)
There is no commutative property: in general, AB ≠ BA.
An n x n matrix A is called non-singular or invertible if there exists an n x n matrix B such that A·B = B·A = I; then B is the inverse of A.
→ B = A^-1
→ A·A^-1 = A^-1·A = I
If A does not have an inverse, then it is called a singular matrix.
A matrix is singular if
1) a row or a column is full of zeros, or
2) a column or a row is a linear combination of other columns or rows.
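For illustration (my own addition), these cases can be checked numerically with NumPy:

import numpy as np

A = np.array([[1.0, 3.0],
              [7.0, 8.0]])
print(np.linalg.inv(A))    # A is non-singular, so the inverse exists

S = np.array([[1.0, 2.0],
              [2.0, 4.0]]) # second row = 2 * first row -> singular
print(np.linalg.det(S))    # 0.0 (up to round-off)
# np.linalg.inv(S) would raise numpy.linalg.LinAlgError: Singular matrix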
Determinant
- It is a scalar quantity associated with a square matrix.
- How do we calculate the determinant?
If A = [a11] is a 1 x 1 matrix, then det(A) = a11.
If A = [aij]nxn is a square matrix with n > 1, then expanding along any row i,
det(A) = ∑j (-1)^(i+j) · aij · Mij
where Mij is the minor of aij, i.e. the determinant of the submatrix obtained by deleting row i and column j.
E.g. A = [5] → det(A) = 5
A = | 1 3 |  → det(A) = 1·8 - 3·7 = -13
    | 7 8 |
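A small recursive Python sketch of this cofactor expansion along the first row (fine for small matrices; real code would use an LU factorization, as numpy.linalg.det does):

def det(A):
    n = len(A)
    if n == 1:                 # base case: det([a11]) = a11
        return A[0][0]
    total = 0
    for j in range(n):         # expand along row i = 0
        minor = [row[:j] + row[j+1:] for row in A[1:]]
        total += (-1) ** j * A[0][j] * det(minor)
    return total

print(det([[5]]))              # 5
print(det([[1, 3], [7, 8]]))   # -13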