by Bethany Wagenaar
Jump to Day:
Newton-Raphson
- Newton-Raphson is very fast; faster than methods we have seen so far.
1 m =
ƒ(x0) - 0 Using
the points (x0,ƒ(x0) &
(x1, 0))
x0 - x1
2 m = ƒ′(x0)
Putting 1 and 2 together:
ƒ′(x0) = ƒ(x0) - 0
x0 - x1
ƒ′(x0)(x0 - x1) = ƒ(x0)
(x0 - x1) =
ƒ(x0)
  ƒ′(x0)
x1 = x0 -
ƒ(x0)
  ƒ′(x0)
- We can arrive to the same iteration equation of Newton Raphson using the Taylor expansion.
- We can approximate a continuous function using a Taylor polynomial around x0.
ƒ(x) = ƒ(x0) + ƒ′(x0)(x - x0) + ƒ″(x0)(x - x0)2/2 + ƒ′″(x0)(x - x0)3/6 + ... + ƒn(x - x0)n/n! + ...
- To approximate Newton Raphson, we are going to use only two terms in the expansion.
- Newton Raphson approximates ƒ(x) using a line; the quadratic, cubic, etc. terms are eliminated from the Taylor expansion.
f(x) ≈ ƒ(x0) + ƒ′(x0)(x - x0) + ε ← This is the error using a line instead of the complete polynomial
- We want to obtain x1 using the approximation.
Find solution using approximation
↓
ƒ(x1) = ƒ(x0) + ƒ(x0)(x1 - x0)
ƒ(x1) - ƒ(x0)/ƒ′(x0) = x1 - x0
ƒ(x1) - ƒ(x0)/ƒ′(x0) + x0 = x1
0 - ƒ(x0)/ƒ′(x0) + x0 = x1
x1 = x0 - ƒ(x0)/ƒ′(x0)
♦ Quadratic has two solutions - choose the one closer to x0
- In the same way, you could create a new numerical method faster than Newton Raphson that uses three terms in the approximation instead of two.
- The approximation will be a quadratic instead of a line.
- The approximation will require a second derivative.
ƒ(x) = ƒ(x0) + ƒ′(x0)(x - x0) + ƒ″(x0)(x - x0)2/2
0 = ƒ(x0) + ƒ′(x0)(x1 - x0) + ƒ″(x0)(x1 - x0)2/2
Solve for (x1 - x0) using -b ± √b^2 - 4ac / 2a
Solve for x1.
Use the term of x1 closer to x0.
Example of Newton Raphson
ƒ(x) = sin(x) - ½ = 0 ƒ′(x) = cos(x)
↑ In radians - no degrees because we would need to multiply fractions.
Start with x0 = 0.
x1 = x0 - ƒ(x0)/ƒ′(x0) Iteration equation Newton Raphson
i |
xi |
f(xi) |
ƒ′(xi) |
xi + 1 |
0 | 0 | ƒ(x0) = sin(0) - .5 = -.5 | ƒ′(x0) = cos(0) = 1 | x1 = 0 - .5/1 = .5 |
1 | .5 | ƒ(x1) = sin(.5) - .5 = -.02 | ƒ′(x1) = cos(.5) = .8795 | x2 = -.5 - (-.02/.8795) = .582 |
2 | .522 | ƒ(x2) = sin(.522) - .5 = -.00139 | ƒ′(x2) = cos(.522) = .8668 | x3 = -.522 - (-.00139/.8668) = .5235 |
Exact solution for sin(x) - .5 = 0 in radians is 30° that is 30 π/180 rad = .5235 rad
- One of the disadvantages of Newton Raphson is that you need the derivative ƒ′(x).
- It is possible to obtain the derivative of ƒ(x) numerically.
ƒ′(x) = ƒ(x + ε) - ƒ(x)/ε
↑ The exact solution requires that ε → 0.
We can do an approximation by using a small ε (like 1 x 10-5).
Example:
ƒ(x) = sin(x)
ƒ′(x) ≈ sin(x + ε) - sin(x)/ε
if ε = .001 x = 0
ƒ′(0) ≈ (sin(0 + .001) - sin(0)) / .001 = 1
Exact solution is ƒ′(x) = cos(x)
ƒ′(0) = cos(0) = 1
ƒ′(.5) ≈ (sin(.5 + .001) - sin(.5)) / .001 = .8773
Exact solution is ƒ′(.5) = cos(.5) = .8773
- When Newton Raphson may not give a good approximation or if it may not converge:
Newton Raphson uses a line to approximate the function.
If the function is not approximated well using a line at the
starting point, the method may not converge
Order of Convergence
- Some methods are faster than others at finding the solution of ƒ(x) = 0.
- Assume
Then the sequence is said to converge to p with an order of convergence R.
This means
En+1 = A |En|R
Example:
Let En = .001
A = 1
R = 2
→ En+1 = 1|.001|2 = 1 x 10-3
= (1/1000)2 → The larger the exponent, the faster the
error will decrease.
= 1(1 x 10-3)2
= 1 x 10-6 = .000001
If R = 1, The convergence is called "linear".
If R = 2. The convergence is called "quadratic".
If ƒ(x) has a simple root and we use Newton Raphson, the R = 2 (quadratic convergence).
Usually the more roots an equation has, the slower the convergence will be.
Secant Method
- The Newton Raphson method requires the evaluation of two
functions of each iteration:
ƒ(xn) and ƒ′(xn)
- The secant method will require only one evaluation of ƒ(x).
- The secant method's order of convergence with a simple root is R ≈ 1.618.
- The secant method uses two initial points p0, p1 close to the root.
2 m = ƒ(x2) - ƒ(x1)/x2 - x1 using (x1, ƒ(x1)) & (x2, 0)
Putting 1 & 2 together
ƒ(x1) - ƒ(x0)/x1 - x0 = 0 - ƒ(x1)/x2 - x1
x2 - x1 = -ƒ(x1)((x1 - x0)/ƒ(x1) - ƒ(x0))
x2 = x1 - ƒ(x1)((x1 - x0)/ƒ(x1) - ƒ(x0))
The Secant Method approximates Newton-Raphson.
if x1→x0
Limx1→x0 x2 = Limx1→x0 (x1 - ƒ(x1)*(x1 - x0) / [ƒ(x1) - ƒ(x0)])
= x0 - ƒ(x0) * (Limx1→x0 (x1 - x0) / [ƒ(x1) - ƒ(x0)])
NOTE: (Limx1→x0 (x1 - x0) / [ƒ(x1)-ƒ(x0)]) tends to equal 1/ƒ′(x)
x2 = x0 - ƒ(x0)/ƒ′(x0) ← Newton Raphson
Example
Secant Method
ƒ(x) = sin(x) - 0.5 = 0
x0 = 0, x1 = 1
xn+1 = xn - [ƒ(xn)*(xn - xn-1) / ƒ(xn) - ƒ(xn-1)] ← Secant Method
n | xn | f(xn) | xn+1 |
0 |
x0 = 0 |
f(x0) = sin(n) - .5 = -.5 |
- |
1 | x1 = 1 |
f(x1) = sin(1) - .5 = .3415 |
x2 = 1 - [.3415(1-0) / .3415-(-.5)] = .594 |
2 | x2 = .594 |
f(x2) = sin(.594) - .5 = .0597 |
x3 = .594 - [(.0597*(.594-1) / (.597-.3415)] = .50798 |
3 | x3 = .50798 |
f(x3) = sin(.50798) - .5 = - .0135 |
x4 = .50798 - [(-.0135)*(.50798-.594) / (-.0135- .0597)] = .5238 |
Exact Solution
x = 30° = π/180 = .5235
In Summary
To solve ƒ(x0) = 0
You may use:
The Solution of Linear Systems
Example
6x + 3y + 2z = 29
3x + y + 2z = 17
2x + 3y + 2z = 21
This can be expressed as AX = B where A is a matrix and B is a vector.
| 6 3 2 | | X |
| 29 |
| 3 1 2 | | Y |
= | 17 |
| 2 3 2 | | Z |
| 21 |
{ A } { X } { B }
- Each linear equation represents a plane in the nth dimension.
- The solution x, y, z represents the intersection of the planes.
Not all the systems of linear equations have solutions.
{1} 2x + 3y = 6
{2} 4x + 6y = 10
from {1} ← {3} x = (6 - 3y)/2
substitute {3} in {2}
4 * (6 - 3y)/2 + 6y = 10
12 - 6y + 6y = 10
12 = 10 → No Solution
Geometrically, these equations represent 2 planes that never intersect.
- You can also have a system of equations with an infinite number of solutions:
{1} 2x + 3y = 6
{2} 4x + 6y = 10
{1} & {2} are the same equation so both are the same plane.
Therefore, the number of solutions is infinite.
[multiply {1} by 2 and you obtain {2}]
Upper Triangular Systems of Linear Equations
A matrix A = | a11 a12 a13
···· a1n |
| a21 a22 a23 ····
a2n |
| ·
|
| ·
|
| ·
|
| an1 an2 an3 ···
ann |
is called upper triangular if aij = 0 when i > j
Example:
A = | 1 2 4 |
| 0 6 7 |
| 0 0 3 |
// this A is upper triangular
a21 = 0; i = 2, j = 1 → i > j
A matrix A is called lower triangular if aij = 0 when i < j
Example:
A = | 9 0 0 |
| 8 2 0 |
| 3 5 5 |
// this A is lower triangular
a12 = 0; i = 1, j = 2 → i < j
We can have an upper triangular system of linear equations:
a11x1 + a12x2 + a13x3
+ ··· + a1n-1xn-1 + a1nxn
= b1
a22x2 + a23x3
+ ··· + a2n-1xn-1 + a2nxn
= b2
a33x3
+ ··· + a3n-1xn-1 + a3nxn
= b3
- An upper triangular system of linear equations is easy to solve by obtaining the value of xn from the last equation and substituting the xn into the (n-1)th equation to obtain xn-1 and so on . This is called back substitution.
Back Substitution
xn = bn/ann
xn-1 = bn-1 - an-1xn
/ an-1n-1
·
·
·
x1 = (b1 - a1nxn - a1n-1xn-1
+ ···· + a12x2) / a11
Example:
3x + 4y + z = 2
3y + 2z = 1
5z = 10
z = 10/5 = 2
y = (1 - 2(2)) / 3 = -3/3 = -1
x = 2 - 2 - 4(-1) / 3 = 4/3
Gaussian Elimination
- Interchange equations
x - y = 5 → 3x + 2 = 6
The order of two equations do not affect the solution.
3x + 2 = 6 x - y = 5- Multiplying an equation by a constant will not affect the solution (scaling).
{1} x - y = 5 → 2x - 2y = 10
{2} 3x + 2 = 6 multiply 3x + 2 = 6
{1} by 2
equivalent (same solution)- Replacement
An equation can be replaced by the sum of itself and a non-zero multiple of another equation.
{1} x - y = 5 → 3x - 3y = 15
{2} 3x + 2 = 6 multiply 3x + 2 = 6
{1} by 3 &
add it to {2}
6x - y = 21
We will use these three properties to transform an arbitrary system of linear equations into an upper triangular system and then solve it with back substitution.
Example:
1x + 3y + 4z = 19
8x + 9y + 3z = 35
x + y + z = 6Solve system using Gaussian Elimination.
Augmented Matrix
pivot row → | 1 3 4 19 | (-8) (-1)
| 8 9 3 35 | (+1)
| 1 1 1 6 | (+1)element a1,1 = 1 is the pivot element on pivot
We need to make element a21 = 8 to be 0 so we multiply the first equation by 8 & subtract the result from the second equation.
pivot in red
| 1 3 4 19 |
| 8-8= 0 9-8*3= -15 3-4*8= -29 35-19*8= -117 |
| 1-1=0 1-3= -2 1-4 = -3 6-19 = -13 || 1 3 4 19 |
| 0 -15 -29 -117 | (-1/15)
| 0 -2 -3 -13 || 1 3 4 19 |
| 0 1 29/15 117/15 | (+2)
| 0 -2 -3 -13 | (+1)Now we need to make element a32 = 0 so we need to multiply the second equation {2} by 2 and add it to the third equation {3}.
| 1 3 4 19 |
| 0 1 29/15 117/15 |
| 0 -2+2 = 0 -3 + 2*29/15 -13 + 2*117/15 |
Can do everything with decimals in exam.Back Substitution
(-3 + 2*29/15)z = -13 + 2*117/15
z = 31y + 29/15(3) = 117/15
y = 117/15 - 29/15*3 = 2
y = 21x + 3(2) + 4(3) = 19
x = 19 - 3*2 - 4*3
= 19 - 6 - 12
x = 1Imple
menting Gauss Elimination Using Matlab function x = gauss(A, B) // A & B are coefficients [A (n x n) matrix, B (n x 1) matrix which results in x (n x 1 matrix), x is the solution]
% Input - A is a NxN nonsingular matrix (nonsingular means that it has a solution (det ≠ 0) singular means it does not have a solution)
% - B is a Nx1 matrix
% Output - x is a Nx1 matrix that is the solution of Ax = B
% Get the dimensions of A
[N, N] = size(A); → // n x n (square matrix)
% Initialize x with zeros
x = zeros (N, 1); → // zero matrix
% Obtain augmented matrix
Aug = [A, B];
% Gaussian Elimination
%For all rows
%Pivot 1 to N
for p = 1:N
% Choose a pivot. We will ignore the case when pivot is 0 for now.
Piv = Aug (p, p);
% Divide pivotal row by pivot.
for k = p:N+1 % makes an upper triangular matrix
Aug (p, k) = Aug (p, k) / piv; % Aug(p,p:N+1) / piv
% short notation → Aug(p,p:N+1)
end
% make 0's in the other elements in pivotal column for all remaining rows
for k = p+1:N
% for all remaining columns
m = Aug (k, p); % element that we want to be 0
for i = p:N+1
Aug (k, i) = Aug (k, i) - (m * Aug (p, i));
end
end
% Back Substitution
for k = N:1 (Step - 1)
sum = sum + Aug(k,i) * x(i,1)
end
x(k,1)
1. Prevent pivot = 0
If pivot = 0, switch rows to prevent division by zero.2. To reduce computing error, you may choose as the next pivot the largest number in the rows (or column) that are still left to transform (use absolute value). * By using Rule 2, you don't need Rule 1.
| 3 6 1 7 |
| 2 4 3 0 |
| 7 6 0 0 | → Use the largest number as pivot. Switch rows if necessary.Switch rows one and three. Matrix now contains a pivot of 7.
| 7 6 0 0 | → New pivot is 7.
| 2 4 3 0 |
| 3 6 1 7 |
Triangular Factorization
A matrix A has a triangular factorization if it can be expressed as the product of a lower triangular matrix and an upper triangular matrix.Triangular Factorization is useful when you want to solve multiple systems of linear equations that have the same A.
Ax = 1b1
Ax = 1b2
Ax = 1b3
.
.
.
Ax = 1bnTriangular factorization will save time instead of using Gauss Elimination n times.
A = LU
| a11 a12 ···· a1n | | 1 0 ···· 0 | | u11 u12 ···· u1n |
| a21 a22 ···· a2n | | l11 1 ···· 0 | | 0 u22 ···· u2n |
| · | == | · | | · |
| · | | · | | · |
| an1 an2 ···· ann | | ln1 ln2 ···· 1 | | 0 0 ···· unn |
A L U
Assume the linear system Ax = B.
Where A has a triangular factorization A = LU then
AX = B
↓
(LU)X = BThen we define UX = Y 1
L(UX) = B
LY = B 2Then we can solve x by first solving y in 2 that is the lower triangular system of linear equations using "forward substitution"
Once y is found, we can solve 1 using back substitution.
- We solve 2 using foward substitution.
- We solve 1 using back substitution.LU - Factorization
Example: 6x1 + 1x2 - 4x3 = -3
5x1 + 3x2 + 2x3 = 21
1x1 - 4x2 + 3x3 = 10A = | 6 1 4 |
| 5 3 2 |
| 1 -4 3 |
A = | 1 0 0 | | 6 1 4 | (1/6)
| 0 1 0 | | 5 3 2 |
| 0 0 1 | | 1 -4 3 |
will become L will become UApply Gaussian Elimination to the matrix on the right hand side and also do the same steps to the matrix on the left hand side.
A = | 1/6 0 0 | | 1 1/6 4/6 | (-5) - (-1)
| 0 1 0 | | 5 3 2 | (+1) -
| 0 0 1 | | 1 -4 3 | - (+1)= | 1/6 0 0 | | 1 1/6 4/6 |
| -5/6 1 0 | | 0 -3/6+3 -20/6+2 |
| -1/6 0 1 | | 0 -1/6-4 -4/6+3 |= | 1/6 0 0 | | 1 1/6 4/6 |
= | 1/6 0 0 | | 1 1/6 4/6 |
|-5/6 1 0 | | 0 13/6 -8/6 | (6/13)
| -1/6 0 1 | | 0 -25/6 14/6 |
|-5/13 6/13 0 | | 0 1 -8/13 | (25/6)
| -1/6 0 1 | | 0 -25/6 14/6 | (+1)= | 1/6 0 0 | | 1 1/6 4/6 |
|-5/13 6/13 0 | | 0 1 -8/13 |
| 25/6(-5/13)(-1/6) 25/13 1 | | 0 0 -8/13(25/6)+14/6 |= | 1/6 0 0 | | 1 1/6 4/6 |
|-5/13 6/13 0 | | 0 1 -8/13 |
| -138/78 25/13 1 | | 0 0 -18/78 | (-78/18)= | 1/6 0 0 | | 1 1/6 4/6 |
|-5/13 6/13 0 | | 0 1 -8/13 |
| -138/18 -150/18 -78/18 | | 0 0 1 |L U
7 steps - If we use a calculator with decimal we can do it in 3 or 4 steps.
AX = B
LUX = B UX = Y 1
LY = B 2Solve 2 then solve 1.
From 2 LY = B
1/6y1 + 0y2 + 0y3 = -3
-5/13y1 + 6/13y2 + 0y3 = 21
-138/18y1 - 150/18y2 - 78/18y3 = 10B =| -3 |
| 21 |
| 10 |Solve y
y1 = (-3)6 = -18
y2 = (21 + 5/13(-18))13/6
= 21*13/6 + 5(-18)/6 = 61/2
y3 = (10 + 138/18(-18) + 150/18(61/2))(-18/78)Now solve 1 UX = Y
1x1 = 1/6x2 + 4/6x3 = -18
0x1 + 1x2 - 8/13x3 = 61/2
0x1 = 0x2 + 1x3 = y3 *We can solve the system using back substitution.
x3 = y3
x2 = 61/2 + 8/13*1/3
x1 = -18 - 1/6x2 - 4/6x3LU Factorization is better than Gaussian Elimination when you have more than one system of equations with the same A.
Factor A in LU = O(N3)
Gauss Elimination = O(N3)
Back Substitution = O(N2)
Forward Substitution = O(N2)- Advantage of LU → solve multiple systems of equations with the same A.
If we want to solve multiple systems of equations with the same A, then it is better to use LU factorization instead of Gaussian Elimination.