Monday, June 28, 2004

 

-If we want to solve multiple systems of equations with the same matrix A, it is better to use LU factorization than to repeat Gaussian elimination:

            Assume we have m systems of linear equations, all sharing the same N × N matrix A:

                        Ax_1 = b_1

                        Ax_2 = b_2

                        ...

                        Ax_m = b_m

 

-If we use Gaussian elimination m times

            Cost:  m · O(N³) = O(m·N³)                      <--- good for a single system

-If we use LU factorization (factor A once, then forward/back substitute for each b)

            Cost:  O(N³) + m · O(N²)                        <--- good for multiple systems

-Since m · O(N³) grows much faster than O(N³) + m · O(N²), LU will be faster
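-A sketch of this reuse in Python (a minimal example assuming NumPy and SciPy are available; scipy.linalg.lu_factor and lu_solve split the work into factor-once, substitute-per-b):

            import numpy as np
            from scipy.linalg import lu_factor, lu_solve

            A = np.array([[3.0, 1.0],
                          [1.0, 3.0]])
            bs = [np.array([5.0, 7.0]), np.array([1.0, 2.0])]

            lu, piv = lu_factor(A)            # O(N^3), paid once
            xs = [lu_solve((lu, piv), b)      # O(N^2) per right-hand side
                  for b in bs]
            print(xs[0])                      # [1. 2.] for the first system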

 

 

Iterative Methods for Systems of Linear Equations

 

-Iterative methods are better than the exact methods (Gaussian elimination and LU factorization) when N is very large and A is a sparse matrix (sparse meaning that most of the elements in the matrix are zeros)

-For example, to solve partial differential equations numerically, very often you need to solve large (N > 100,000) and sparse systems. Solving this kind of system using Gaussian elimination or LU factorization would take a very long time:

            N = 100,000        O((10⁵)³) = O(10¹⁵)

 

Jacobi Iteration

-Assume the system:

            3x + y = 5

            x + 3y = 7

-We can write them as:

            x = (5 - y) / 3               y = (7 - x) / 3

-This suggests the following method:

            x_{k+1} = (5 - y_k) / 3

            y_{k+1} = (7 - x_k) / 3

-Assume the starting point:

            x_0 = 0                                                  y_0 = 0

            x_1 = (5 - 0) / 3 = 1.667                        y_1 = (7 - 0) / 3 = 2.333

            x_2 = (5 - 2.333) / 3 = 0.889                    y_2 = (7 - 1.667) / 3 = 1.778

            x_3 = (5 - 1.778) / 3 = 1.074                    y_3 = (7 - 0.889) / 3 = 2.037

            x_4 = (5 - 2.037) / 3 = 0.988                    y_4 = (7 - 1.074) / 3 = 1.975

            x_∞ = 1                                                  y_∞ = 2
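-A minimal Python sketch of this Jacobi iteration (the function name and iteration count are illustrative choices, not from the notes):

            def jacobi_2x2(steps=25):
                x, y = 0.0, 0.0                # starting point x_0 = 0, y_0 = 0
                for _ in range(steps):
                    x_new = (5.0 - y) / 3.0    # uses only the old y_k
                    y_new = (7.0 - x) / 3.0    # uses only the old x_k
                    x, y = x_new, y_new        # advance to (x_{k+1}, y_{k+1})
                return x, y

            print(jacobi_2x2())                # approaches (1.0, 2.0)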

 

Gauss-Seidel Iteration

-It is similar to Jacobi, but we do not wait until the end of the iteration to use the new values of x, y, etc.

-Gauss-Seidel uses the new approximations as soon as they are available

-Example:

            3x + y = 5

            x + 3y = 7

   We can write them as:

            x = (5 – y) / 3               y = (7 – x) / 3

   Now for the method:

            x_{k+1} = (5 - y_k) / 3

            y_{k+1} = (7 - x_{k+1}) / 3

   Assume the starting point:

            x_0 = 0                                                  y_0 = 0

            x_1 = (5 - 0) / 3 = 1.667                        y_1 = (7 - 1.667) / 3 = 1.778

            x_2 = (5 - 1.778) / 3 = 1.074                    y_2 = (7 - 1.074) / 3 = 1.975

            x_3 = (5 - 1.975) / 3 = 1.008                    y_3 = (7 - 1.008) / 3 = 1.997

            x_∞ = 1                                                  y_∞ = 2

   Gauss-Seidel typically converges faster than Jacobi
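-The same sketch adapted to Gauss-Seidel; the only change is that each new value overwrites the old one and is used immediately:

            def gauss_seidel_2x2(steps=10):
                x, y = 0.0, 0.0
                for _ in range(steps):
                    x = (5.0 - y) / 3.0        # x_{k+1} is available at once...
                    y = (7.0 - x) / 3.0        # ...and is used right away here
                return x, y

            print(gauss_seidel_2x2())          # reaches (1.0, 2.0) in fewer steps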

 

-Jacobi and Gauss-Seidel may not converge in some cases

-For example, if we rearrange the previous equations:

            3x + y = 5                    x + 3y = 7

            x + 3y = 7        to         3x + y = 5

   Then if we create the iteration equations:

            x = 7 - 3y

            y = 5 - 3x

   Now for the method:

            x_{k+1} = 7 - 3y_k

            y_{k+1} = 5 - 3x_{k+1}

   Assume the starting point:

            x_0 = 0                                                  y_0 = 0

            x_1 = 7 - 3(0) = 7                                   y_1 = 5 - 3(7) = -16

            x_2 = 7 - 3(-16) = 55                                y_2 = 5 - 3(55) = -160               Does not converge!

 

 

Tuesday, June 29, 2004

 

-How can we ensure convergence?

 

Strictly Diagonally Dominant Matrices

-A matrix A is strictly diagonally dominant if:

            | a_kk | > Σ | a_kj |              where the sum runs over j = 1 to N with j ≠ k, and this holds for every row k = 1, 2, …, N

-This means that the absolute value of each element on the diagonal has to be larger than the sum of the absolute values of all the other elements in the same row

 

Condition for convergence of Jacobi and Gauss-Seidel

-The Jacobi and Gauss-Seidel iterations will converge if A is strictly diagonally dominant

-Example:

            3x + y = 5                    A = | 3  1 |      row 1: |3| > |1|
            x + 3y = 7                        | 1  3 |      row 2: |3| > |1|

            A is a strictly diagonally dominant matrix, so the Jacobi method converges

-Example:

            x + 3y = 7                    A = | 1  3 |      row 1: |1| < |3|
            3x + y = 5                        | 3  1 |      row 2: |1| < |3|

            A is not a strictly diagonally dominant matrix, so the Jacobi method may not converge
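-A small Python check of the definition (a sketch; the function name is an illustrative choice):

            import numpy as np

            def is_strictly_diagonally_dominant(A):
                A = np.abs(np.asarray(A, dtype=float))
                diag = np.diag(A)
                off = A.sum(axis=1) - diag     # sum of |a_kj| over j != k, per row
                return bool(np.all(diag > off))

            print(is_strictly_diagonally_dominant([[3, 1], [1, 3]]))   # True
            print(is_strictly_diagonally_dominant([[1, 3], [3, 1]]))   # False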

 

Systems of non-linear equations

-Newton’s Method for systems of non-linear equations

 Assume:

            f1(x,y) = x² – y – 0.2 = 0

            f2(x,y) = y² – x – 0.3 = 0

-How do we solve this system?

-We extend Newton’s Method for equations of two variables

-Using Taylor’s expansion for functions of two variables around x0,y0 we have:

            f1(x,y) = f1(x0,y0) + (∂f1(x0,y0) / ∂x)(x - x0) + (∂f1(x0,y0) / ∂y)(y - y0) + …

            f2(x,y) = f2(x0,y0) + (∂f2(x0,y0) / ∂x)(x - x0) + (∂f2(x0,y0) / ∂y)(y - y0) + …

 

 

Wednesday, June 30, 2004

 

-Writing the approximation in matrix form we have:

            | f1(x,y) |    | f1(x0,y0) |    | ∂f1(x0,y0)/∂x   ∂f1(x0,y0)/∂y | | x - x0 |
            | f2(x,y) |  = | f2(x0,y0) | +  | ∂f2(x0,y0)/∂x   ∂f2(x0,y0)/∂y | | y - y0 |  + …

-The matrix containing the partial derivatives is known as the Jacobian Matrix J

            | f1(x,y) |    | f1(x0,y0) |             | x - x0 |
            | f2(x,y) |  = | f2(x0,y0) | + J(x0,y0)  | y - y0 |  + …

-To obtain the Newton method we approximate f1(x,y) and f2(x,y) using only these two terms and drop the error

-Then we set f1(x,y) = 0 and f2(x,y) = 0, and write x = x1, y = y1 for the next iterate:

            | 0 |    | f1(x0,y0) |             | x1 - x0 |
            | 0 |  = | f2(x0,y0) | + J(x0,y0)  | y1 - y0 |

   Multiplying both sides by J(x0,y0)⁻¹ and moving the function values to the left:

                           | f1(x0,y0) |    | x1 - x0 |
            -J(x0,y0)⁻¹    | f2(x0,y0) |  = | y1 - y0 |

   Adding (x0, y0) to both sides:

            | x0 |               | f1(x0,y0) |    | x1 |
            | y0 |  - J(x0,y0)⁻¹ | f2(x0,y0) |  = | y1 |

 

-This is the Newton method for systems of non-linear equations

-Notice the similarity with the Newton method for one variable:

            x1 = x0 - f(x0) / f'(x0)

            | x1 |    | x0 |               | f1(x0,y0) |
            | y1 |  = | y0 |  - J(x0,y0)⁻¹ | f2(x0,y0) |

 

-Example:

            f1(x,y) = x² – y – 0.2 = 0

            f2(x,y) = y² – x – 0.3 = 0

 

                   | ∂f1/∂x   ∂f1/∂y |
            J  =   | ∂f2/∂x   ∂f2/∂y |

                   | 2x   -1 |                       | 2y    1 |
            J  =   | -1   2y |              J⁻¹  =   | 1    2x |  * (1 / (2x·2y - 1))        <--- J⁻¹ is the adjugate (transposed cofactor) matrix divided by the determinant

 

-You can also use Gaussian elimination to obtain the inverse of a matrix.

                  | 3  6  1 |
            A  =  | 7  8  2 |
                  | 4  5  1 |

            | 1  0  0 | 3  6  1 |
            | 0  1  0 | 7  8  2 |     <--- do Gaussian elimination on both blocks at once; when A becomes the identity matrix, the other block is A⁻¹
            | 0  0  1 | 4  5  1 |
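-A hedged sketch of the same idea in Python (here the blocks are written in the order [A | I]; reducing the A block to the identity turns the other block into A⁻¹, exactly as above):

            import numpy as np

            def inverse_by_gauss_jordan(A):
                A = np.asarray(A, dtype=float)
                n = A.shape[0]
                aug = np.hstack([A, np.eye(n)])            # augmented matrix [A | I]
                for k in range(n):
                    p = k + np.argmax(np.abs(aug[k:, k]))  # partial pivoting
                    aug[[k, p]] = aug[[p, k]]
                    aug[k] /= aug[k, k]                    # scale the pivot row to 1
                    for i in range(n):
                        if i != k:
                            aug[i] -= aug[i, k] * aug[k]   # clear column k elsewhere
                return aug[:, n:]                          # right block is now A^-1

            A = [[3, 6, 1], [7, 8, 2], [4, 5, 1]]
            print(inverse_by_gauss_jordan(A) @ A)          # should be close to I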

 

-Example:

   Iteration 0:   x0 = 0,  y0 = 0

            f1(0, 0) = 0² - 0 - 0.2 = -0.2                  f2(0, 0) = 0² - 0 - 0.3 = -0.3

                     | 2(0)   1    |
            J⁻¹  =   | 1      2(0) |  * (1 / (4(0)(0) - 1))

            | x1 |    | 0 |         | 0  1 | | -0.2 |    | -0.3 |
            | y1 |  = | 0 |  - (-1) | 1  0 | | -0.3 |  = | -0.2 |

   Iteration 1:   x1 = -0.3,  y1 = -0.2

            f1(x1, y1) = (-0.3)² - (-0.2) - 0.2 = 0.09      f2(x1, y1) = (-0.2)² - (-0.3) - 0.3 = 0.04

                     | 2(-0.2)   1       |
            J⁻¹  =   | 1         2(-0.3) |  * (1 / (4(-0.3)(-0.2) - 1))

            | x2 |    | -0.3 |    | 0.5263    -1.3158 | | 0.09 |    | -0.2947   |
            | y2 |  = | -0.2 |  - | -1.3158   0.7895  | | 0.04 |  = | -0.113158 |

   Iteration 2:   x2 = -0.2947,  y2 = -0.113158

            f1(x2, y2) = (x2)² - y2 - 0.2 = 0.00000609      f2(x2, y2) = (y2)² - x2 - 0.3 = 0.0075
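-A sketch of this two-variable Newton iteration in Python; it uses numpy.linalg.solve on J rather than forming J⁻¹ explicitly, which is equivalent here:

            import numpy as np

            def F(v):                          # the system f1 = f2 = 0
                x, y = v
                return np.array([x**2 - y - 0.2,
                                 y**2 - x - 0.3])

            def J(v):                          # the Jacobian matrix
                x, y = v
                return np.array([[2*x, -1.0],
                                 [-1.0, 2*y]])

            v = np.array([0.0, 0.0])           # starting point (x0, y0)
            for _ in range(5):
                v = v - np.linalg.solve(J(v), F(v))   # (x,y) <- (x,y) - J^-1 F
            print(v, F(v))                     # F(v) should now be near zero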

 

Interpolation and Polynomial Approximation

-We want to approximate functions using polynomials

-Taylor Approximation

            A polynomial P_N(x) can be used to approximate a continuous function f(x):

                        f(x) ≈ P_N(x) = Σ (f^(k)(x0) / k!) · (x - x0)^k             where the sum runs from k = 0 to N

                        f(x) = P_N(x) + E_N(x)

                        E_N(x) = (f^(N+1)(c) / (N + 1)!) · (x - x0)^(N+1)           for some c with x0 < c < x

-Examples:

            sin(x) = x - (x³ / 3!) + (x⁵ / 5!) - (x⁷ / 7!) + …

            e^x = 1 + x + (x² / 2!) + (x³ / 3!) + …
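-A quick sketch of evaluating such a partial sum (here for sin; the function name and number of terms are illustrative):

            import math

            def taylor_sin(x, terms=4):
                # partial sum of x - x^3/3! + x^5/5! - x^7/7! + ...
                return sum((-1)**k * x**(2*k + 1) / math.factorial(2*k + 1)
                           for k in range(terms))

            print(taylor_sin(math.pi/4), math.sin(math.pi/4))   # both near 0.7071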

 

Methods to evaluate a polynomial

-Assume:

            f(x) = 2x⁴ + 4x³ + 3x² + 2x + 1

                      2*x*x*x*x + 4*x*x*x + 3*x*x + 2*x + 1

          Number of multiplications = 10

          Number of sums = 4

-This is too expensive; we can factor out the x using “Horner’s Method”

            f(x) = (((2x + 4)x + 3)x + 2)x + 1

                      Number of multiplications = 4

                      Number of sums = 4
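-Horner’s method in Python (a sketch; coefficients are listed from the highest degree down):

            def horner(coeffs, x):
                result = 0.0
                for c in coeffs:               # one multiply and one add per coefficient
                    result = result * x + c
                return result

            print(horner([2, 4, 3, 2, 1], 2.0))   # 2*16 + 4*8 + 3*4 + 2*2 + 1 = 81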

 

 

Thursday, July 1, 2004

 

Polynomial Approximation using N+1 points

-Suppose that a function y = f(x) is known at N+1 points (x0, y0), (x1, y1), …, (xN, yN)

-We want to build a polynomial P(x) of degree N that will pass through the N+1 points

-Once we have the polynomial, we can use it to approximate f(x)

-If x0 < x < xN, the approximation of f(x) using P(x) is called interpolation

-If x < x0 or xN < x, the approximation of f(x) using P(x) is called extrapolation

 

 

-We are going to see two methods to build the polynomial from the N+1 points

 

Lagrange Approximation

-Assume two points (x0, y0), (x1, y1). The polynomial that passes through these points is a straight line.

 

 

#1 -> m = (y - y0) / (x - x0)  and  #2 -> m = (y1 - y0) / (x1 - x0)

 

Combining #1 and #2

            (y - y0) / (x - x0) = (y1 - y0) / (x1 - x0)

            (y - y0) = ((y1 - y0) / (x1 - x0)) * (x - x0)

            y = y0 + ((y1 - y0) / (x1 - x0)) * (x - x0)

 

            at x = x0                                                           at x = x1

                y = y0 + ((y1 - y0) / (x1 - x0)) * (x0 - x0)             y = y0 + ((y1 - y0) / (x1 - x0)) * (x1 - x0)

                            y = y0                                                               y = y1

 

-Lagrange uses a similar approach

-The function can also be written as:

            y = P1(x) = y0((x - x1) / (x0 - x1)) + y1((x - x0) / (x1 - x0))

            y0 = P1(x0) = y0((x0 - x1) / (x0 - x1)) + y1((x0 - x0) / (x1 - x0)) = y0(1) + y1(0)

            y1 = P1(x1) = y0((x1 - x1) / (x0 - x1)) + y1((x1 - x0) / (x1 - x0)) = y0(0) + y1(1)

-For P2(x)

            P2(x) = y0(((x - x1)(x - x2)) / ((x0 - x1)(x0 - x2))) + y1(((x - x0)(x - x2)) / ((x1 - x0)(x1 - x2))) + y2(((x - x0)(x - x1)) / ((x2 - x0)(x2 - x1)))

 

-Example:

            P2(x2) = y0(0) + y1(0) + y2(1) = y2

     For P3(x)

            P3(x) = y0(((x - x1)(x - x2)(x – x3)) / ((x0 - x1)(x0 - x2)(x0 - x3))) + y1(((x - x0)(x - x2)(x - x3)) / ((x1 - x0)(x1 - x2)(x1 - x3)))

+ y2(((x - x0)(x - x1)(x - x3)) / ((x2 - x0)(x2 - x1)(x2 - x3))) + y3(((x - x0)(x - x1)(x - x2)) / ((x3 - x0)(x3 - x1)(x3 - x2)))

-In general

            P_N(x) = Σ y_K L_{N,K}(x)                                              where the sum runs from K = 0 to N

            where L_{N,K}(x) = ( ∏ (x - x_J) ) / ( ∏ (x_K - x_J) )                 with both products over J = 0 to N, skipping J = K

 

-Example

            We want to approximate sin(x) on the interval [0, π/2] with a polynomial of degree 3 built from 4 values

                        x             sin(x)

                        0              0

                        π/6           0.5

                        π/3           0.866

                        π/2           1

           

            P3(x) = 0(((x - π/6)(x - π/3)(x – π/2)) / ((0 - π/6)(0 - π/3)(0 - π/2))) + 0.5(((x - 0)(x - π/3)(x - π/2)) / ((π/6 - 0)(π/6 - π/3)(π/6 - π/2)))

+ 0.866(((x - 0)(x - π/6)(x - π/2)) / ((π/3 - 0)(π/3 - π/6)(π/3 - π/2))) + 1(((x - 0)(x - π/6)(x - π/3)) / ((π/2 - 0)(π/2 - π/6)(π/2 - π/3)))

 

            P3(x) = (0.5 / ((π/6)(-π/6)(-π/3))) * (x)(x - π/3)(x - π/2) + (0.866 / ((π/3)(π/6)(-π/6))) * (x)(x - π/6)(x - π/2)

                        + (1 / ((π/2)(π/3)(π/6))) * (x)(x - π/6)(x - π/3)

 

                     = 1.74(x)(x – 1.047)(x – 1.5708) – 3.0164(x)(x – 0.5236)(x – 1.5708) + 1.1611(x)(x – 0.5236)(x – 1.047)

 

            Example for sin(π/4) = 0.7071

            π/4 = 0.7854

            P3(π/4) = 1.74(0.7854)(0.7854 – 1.047)(0.7854 – 1.5708) – 3.0164(0.7854)(0.7854 – 0.5236)(0.7854 – 1.5708)

   + 1.1611(0.7854)(0.7854 – 0.5236)(0.7854 – 1.047)

= 0.7054          (close to the true value of sin(π/4))
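-The general formula is short to code; a sketch that reproduces the example above:

            import math

            def lagrange(xs, ys, x):
                total = 0.0
                for k, (xk, yk) in enumerate(zip(xs, ys)):
                    L = 1.0                        # build L_{N,K}(x)
                    for j, xj in enumerate(xs):
                        if j != k:
                            L *= (x - xj) / (xk - xj)
                    total += yk * L
                return total

            xs = [0.0, math.pi/6, math.pi/3, math.pi/2]
            ys = [0.0, 0.5, 0.866, 1.0]
            print(lagrange(xs, ys, math.pi/4))     # about 0.7054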

 

 

Friday, July 2, 2004

 

Newton Polynomials

-In Lagrange polynomials, P_{N-1}(x) and P_N(x) are not related, so we cannot use P_{N-1}(x) to build P_N(x)

-Newton Polynomials can be built incrementally, so we can build P_N(x) using P_{N-1}(x)

-Assume:

            P1(x) = a0 + a1(x - x0)

            P2(x) = a0 + a1(x - x0) + a2(x - x0)(x – x1)

P3(x) = a0 + a1(x - x0) + a2(x - x0)(x – x1) + a3(x - x0)(x – x1)(x - x2)

                     = P2(x) + a3(x - x0)(x – x1)(x - x2)

P_N(x) = a0 + a1(x - x0) + a2(x - x0)(x – x1) + … + a_N(x - x0)…(x - x_{N-1})

 

-How to compute a0, a1, … aN

-Assume we want to build P1(x) with points (x0, f(x0)) and (x1, f(x1))

-We want

            #1 -> P1(x0) = f(x0)                   #2 -> P1(x1) = f(x1)

                        #3 -> P1(x) = a0 + a1(x - x0)     Newton Polynomial N = 1

-Substitute #1 in #3

            P1(x0) = f(x0) = a0 + a1(x0 - x0) = a0

-Substitute #2 in #3

            P1(x1) = f(x1) = a0 + a1(x1 - x0)

                             a1 = (f(x1) - a0) / (x1 - x0)

                             a1 = (f(x1) - f(x0)) / (x1 - x0)

-Now if we want to obtain P2(x) with an additional point (x2, f(x2))

            P2(x) = a0 + a1(x - x0) + a2(x - x0)(x – x1)

P2(x2) = f(x2)

P2(x2) = f(x2) = a0 + a1(x2 - x0) + a2(x2 - x0)(x2 - x1)

 

            -Substituting the values of a0 and a1

            f(x2) = f(x0) + ((f(x1) – f(x0)) / (x1 – x0)) * (x2 - x0) + a2(x2 - x0)(x2 - x1)

            a2 = (1 / (x2 - x0)(x2 - x1)) * [ f(x2) – f(x0) - ((f(x1) – f(x0)) / (x1 – x0)) * (x2 - x0) ]

            a2 = (1 / (x2 - x1)) * [ (f(x2) – f(x0)) / (x2 - x0) - (f(x1) – f(x0)) / (x1 – x0) ]

            a2 = (1 / (x2 - x1)) * [ ((f(x2) – f(x0))(x1 – x0) - (f(x1) – f(x0))(x2 - x0)) / ((x2 - x0)(x1 – x0)) ]

            a2 = (1 / (x2 - x1)) * [ (x1f(x2) - x0f(x2) - x1f(x0) + x0f(x0) - x2f(x1) + x0f(x1) + x2f(x0) - x0f(x0) + x1f(x1) - x1f(x1)) / ((x2 - x0)(x1 - x0)) ]

a2 = (f(x2)(x1 – x0) - f(x1)(x1 – x0) - f(x1)(x2 – x1) +  f(x0)(x2 – x1)) / ((x2 - x1)(x2 - x0)(x1 – x0))

a2 = ((f(x2) - f(x1))(x1 – x0) – (f(x1) - f(x0))(x2 – x1)) / ((x2 - x1)(x2 - x0)(x1 – x0))

a2 = (1 / (x2 - x0)) * [ ((f(x2) - f(x1))(x1 - x0)) / ((x2 - x1)(x1 - x0)) - ((f(x1) - f(x0))(x2 - x1)) / ((x2 - x1)(x1 - x0)) ]

a2 = (1 / (x2 - x0)) * [ (f(x2) - f(x1)) / (x2 - x1) - (f(x1) - f(x0)) / (x1 - x0) ]

where:

   m1 = (f(x1) - f(x0)) / (x1 – x0)

   m2 = (f(x2) - f(x1)) / (x2 - x1)

 

-Observe that m1 and m2 are the slopes of the lines between x0 -> x1 and x1 -> x2; they are called “divided differences”

-The divided differences of a function f(x) are defined as follows:

            f[xK] = f(xK)

            f[xK-1, xK] = (f[xK] – f[xK-1]) / (xK - xK-1)

            f[xK-2, xK-1, xK] = (f[xK-1, xK] – f[xK-2, xK-1]) / (xK - xK-2)

            …       

            f[xK-J, …, xK] = (f[xK-J+1, …, xK] - f[xK-J, …, xK-1] ) / (xK - xK-J)

            where aK = f[x0, x1, …, xK]

 

-Example

  -Build polynomials of degree 1, 2, and 3 to approximate f(x) = sin(x) on the interval [0, π/2]

            x0 = 0

            x1 = π/6

            x2 = π/3

            x3 = π/2

            P1(x) = a0 + a1(x - x0)

            P2(x) = a0 + a1(x - x0) + a2(x - x0)(x – x1)

                     = P1(x) + a2(x - x0)(x – x1)

P3(x) = a0 + a1(x - x0) + a2(x - x0)(x – x1) + a3(x - x0)(x – x1)(x - x2)

         = P2(x) + a3(x - x0)(x – x1)(x - x2)

 

xK                     f[xK]                    f[xK-1, xK]                                  f[xK-2, xK-1, xK]                              f[xK-3, xK-2, xK-1, xK]

0                      sin(0) = 0 = a0

π/6 = 0.5236           sin(π/6) = 0.5           (.5 - 0) / (.5236 - 0) = 0.9549 = a1

π/3 = 1.0472           sin(π/3) = 0.866         (.866 - .5) / (1.0472 - .5236) = 0.699       (.699 - .9549) / (1.0472 - 0) = -0.244 = a2

π/2 = 1.5708           sin(π/2) = 1             (1 - .866) / (1.5708 - 1.0472) = 0.256       (.256 - .699) / (1.5708 - .5236) = -0.423      (-.423 + .244) / (1.5708 - 0) = -0.1139 = a3
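-A sketch of building the divided-difference table and evaluating the Newton form in Python (it reproduces the coefficients above; the function names are illustrative):

            import math

            def divided_differences(xs, ys):
                n = len(xs)
                table = list(ys)
                coeffs = [table[0]]                # a_0 = f[x_0]
                for j in range(1, n):
                    for k in range(n - 1, j - 1, -1):
                        table[k] = (table[k] - table[k - 1]) / (xs[k] - xs[k - j])
                    coeffs.append(table[j])        # a_j = f[x_0, ..., x_j]
                return coeffs

            def newton_eval(coeffs, xs, x):
                result = coeffs[-1]                # nested (Horner-like) evaluation
                for a, xk in zip(coeffs[-2::-1], xs[len(coeffs) - 2::-1]):
                    result = result * (x - xk) + a
                return result

            xs = [0.0, math.pi/6, math.pi/3, math.pi/2]
            a = divided_differences(xs, [math.sin(v) for v in xs])
            print(a)                               # ~ [0, 0.9549, -0.2443, -0.1139]
            print(newton_eval(a, xs, math.pi/4))   # ~ 0.705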