In [1]:
#The Hilbert matrix!
#=
Suppose that we are interested in representing a function
f(x) over the interval $[0,1]$ with a polynomial approximation.
This problem was considered by Hilbert around 1895.
https://blogs.mathworks.com/cleve/2017/06/07/hilbert-matrices-2/

This is a standard computation in numerical analysis where
we seek some way to represent $f(x)$ on the computer, and
polynomials can represent any continuous function to arbitrary
accuracy. (Internally, this is often closely related to how
the computer computes things likes $\sin(x)$)

Back to the problem at hand, if we approximate $f(x)$
with $p(x; c) = \sum_{j=1}^c c_i x^{j-1}$
in the sense that we want:
$\int_0^1 (p(x; c) - f(x))^2 dx$ to be as small as possible,
then we can show that is equivalent to minimizing
$\int_0^1 (p(x; c) - f(x))^2 dx = c^T A c - 2 c^T b + \theta$.
where the term $A_{ij} = \int_0^1 x^{i-1} x^{j-1} \, dx$,
$b_i = \int_0^1 f(x) x^{i-1} \, dx$, and
$\theta = \int_0^t f(x)^2 \, dx$.
Just to be clear, here we have that
$c^T A c = \int_0^1 p(x; c)^2\, dx > 0$ as long as $c \not= 0$.
hence the matrix $A$ is positive definite!

Thus, the solution to
$\min c^T A c - 2 c^T b + \theta$ is given by the linear system
$ A c = b$.

The entries $A_{ij} = \int_0^1 x^{i-1} x^{j-1} \, dx = 1/(i+j-1)$.
If we are interested in approximating the function $f(x) = \sqrt(x)$,
then $b_j = \int_0^1 x^{1/2} x^{j-1} \, dx = 1/(j+1/2)$
=#

# Let's solve this!
function hilbert(n::Integer)
  A = zeros(n,n)
  for j=1:n
    for i=1:n
      A[i,j] = 1/(i+j-1)
    end
  end
  return A
end
hilbert(5)
Out[1]:
5×5 Array{Float64,2}:
 1.0       0.5       0.333333  0.25      0.2     
 0.5       0.333333  0.25      0.2       0.166667
 0.333333  0.25      0.2       0.166667  0.142857
 0.25      0.2       0.166667  0.142857  0.125   
 0.2       0.166667  0.142857  0.125     0.111111
In [2]:
##

function approxprob(k)
  A = hilbert(k)
  b = zeros(k)
  for j=1:k
    b[j] = 1/(j+1/2)
  end
  return A,b
end

A, b = approxprob(10)
c = A\b
Out[2]:
10-element Array{Float64,1}:
     0.050125504504893836
     4.962389186708006   
   -39.6988848966821     
   240.83876938369386    
  -903.1418533684586     
  2107.32402860115       
 -3065.1899162723134     
  2699.4777563465864     
 -1315.992587192786      
   272.372678821189      
In [3]:
##
using Polynomials
using Plots
pyplot()
p = Poly(c)
plot(x -> p(x), range(0,stop=1, length=50))
gui()
In [4]:
##
using LinearAlgebra
plot()
for k in [2,5,10,20,50,100,200]
  A,b = approxprob(k)
  c = A\b
  p = Poly(c)
  @show norm(A*c - b)
  plot!(x -> p(x), range(0,stop=1, length=50), lab="$k")
end
gui()
norm(A * c - b) = 0.0
norm(A * c - b) = 
2.237726045655905e-16
norm(A * c - b) = 9.714809109303893e-14
norm(A * c - b) = 1.7787071781219873e-11
norm(A * c - b) = 2.2651987868916156e-11
norm(A * c - b) = 1.0034213656586302e-11
norm(A * c - b) = 1.7557534347824278e-11
In [5]:
## This appears to work great!
In [6]:
## Let's make a small perturbation
using LinearAlgebra
plot()
for k in [2,5,10,20,50,100,200]
  A,b = approxprob(k)
  b .+= 1e-11*randn(size(A,1))
  c = A\b
  p = Poly(c)
  @show norm(A*c - b)
  plot!(x -> p(x), range(0,stop=1, length=50), lab="$k")
end
gui()
norm(A * c - b) = 0.0
norm(A * c - b) = 
2.603703785810335e-16
norm(A * c - b) = 1.1279985277169437e-13
norm(A * c - b) = 8.267518385518345e-11
norm(A * c - b) = 2.064674461775667e-9
norm(A * c - b) = 6.122712291014314e-10
norm(A * c - b) = 1.7620527753694228e-9
In [7]:
##
In [8]:
##
# But if you try solving with this system with a random right hand side...
for k in [2,5,10,20,50,100,200]
  A,b = approxprob(k)
  b = randn(k)
  c = A\b
  @show k, norm(A*c-b)/norm(b)
end
# it doesn't work at all!
# so, here's a question, is there a simple to evaluate
# function f(x) (e.g. int_0^1 f(x) x^j dx has a
# simple expression, such that this shows the ill-conditioning?)
(k, norm(A * c - b) / norm(b)) = (2, 2.0212924706521545e-16)
(k, norm(A * c - b) / norm(b)) = (5, 1.1614306026808627e-12)
(k, norm(A * c - b) / norm(b)) = (10, 0.00012168923838704268)
(k, norm(A * c - b) / norm(b)) = (20, 4.981787636840716)
(k, norm(A * c - b) / norm(b)) = (50, 25.799161758160032)
(k, norm(A * c - b) / norm(b)) = (100, 12.72573043984492)
(k, norm(A * c - b) / norm(b)) = (200, 5.699287261004569)
In [9]:
## Added in class
for k in [2,5,10,20,50,100,200]
  A,b = approxprob(k)
  b = randn(k)
  Q,R = qr(A)
  c = R\(Q'*b)
  @show k, norm(A*c-b)/norm(b)
end
(k, norm(A * c - b) / norm(b)) = (2, 1.3112690765531248e-16)
(k, norm(A * c - b) / norm(b)) = (5, 7.638154709888193e-12)
(k, norm(A * c - b) / norm(b)) = (10, 0.00020059606431641901)
(k, norm(A * c - b) / norm(b)) = (20, 2.8948649007006733)
(k, norm(A * c - b) / norm(b)) = (50, 134.15575704935512)
(k, norm(A * c - b) / norm(b)) = (100, 27.468569076019325)
(k, norm(A * c - b) / norm(b)) = (200, 409.2051384668827)