Description
In this problem we will consider a very special symmetric matrix. Recall that the second-order finite difference scheme for a function 𝑓(𝑥) is given by

$$f_n'' = \frac{f_{n+1} - 2f_n + f_{n-1}}{h^2}$$
where $f_n = f(x_0 + nh)$ and $h = (x_N - x_0)/N$.
Define the vector f such that f𝑛 = 𝑓𝑛. Then the corresponding derivative vector using finite differences can be written as f″ = 𝐷f. Some care must be taken about the boundary conditions; for simplicity we will assume that 𝑓0 = 𝑓𝑁 = 0. Under these assumptions the matrix 𝐷 is a tridiagonal matrix:

$$\begin{bmatrix} f_1'' \\ f_2'' \\ f_3'' \\ \vdots \\ f_{N-2}'' \\ f_{N-1}'' \end{bmatrix} =
\underbrace{\begin{bmatrix}
-2 & 1 & & & \\
1 & -2 & 1 & & \\
 & 1 & -2 & \ddots & \\
 & & \ddots & \ddots & 1 \\
 & & & 1 & -2
\end{bmatrix}}_{D_N}
\begin{bmatrix} f_1 \\ f_2 \\ f_3 \\ \vdots \\ f_{N-2} \\ f_{N-1} \end{bmatrix}$$
You can construct the matrix 𝐷 in Julia using the diagm function. For simplicity, keep everything dense in this problem, although there is clearly a lot of structure that could be exploited.
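For example, a minimal sketch of building the dense matrix with diagm (the helper name make_D is our own; for N = 10 this gives the 9 × 9 matrix 𝐷10 used below):

using LinearAlgebra

# Dense (N-1) × (N-1) second-difference matrix: -2 on the diagonal,
# 1 on the sub- and superdiagonals.
make_D(N) = diagm(0 => fill(-2.0, N - 1),
                  1 => fill(1.0, N - 2),
                 -1 => fill(1.0, N - 2))

D10 = make_D(10)   # 9 × 9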
1. Consider the matrix 𝐷. Using the ansatz (i.e. hypothesis) v𝑛 = sin(𝜋𝑛𝑘/𝑁), where 1 ≤ 𝑘 ≤ 𝑁 − 1, show that v is an eigenvector of 𝐷𝑁. What is the corresponding eigenvalue? Remember that the eigenvalues should be real!
2. In class we discussed several ways to find eigenvalues of matrices. The simplest algorithm for finding eigenvalues is the power method. Suppose that a symmetric matrix 𝐴 has the eigendecomposition 𝐴 = 𝑋Λ𝑋⁻¹. Recall that for a symmetric matrix the eigenvectors associated with distinct eigenvalues are orthogonal. Starting with a random initial vector x, which can be written as a sum over the eigenvectors,
$$\mathbf{x} = \sum_n c_n \mathbf{v}_n$$
show that

$$A^k \mathbf{x} = \sum_n \lambda_n^k c_n \mathbf{v}_n$$
where we order the eigenvalues by their magnitude, |𝜆1| > |𝜆2| > ⋯ > |𝜆𝑁|, and hence

$$A^k \mathbf{x} = c_1 \lambda_1^k \left( \mathbf{v}_1 + \mathcal{O}\!\left( \left( \frac{\lambda_2}{\lambda_1} \right)^{k} \mathbf{v}_2 \right) \right).$$
This shows why the power method converges to the leading eigenvector.
3. The power method gives an approximation ṽ1 to the leading eigenvector. Since it is only an approximation, it is not necessarily true that 𝐴ṽ1 = 𝜆1ṽ1 exactly.
Instead we will say that the best approximation to 𝜆1 is the value of 𝛼 that minimizes

$$\| A \tilde{\mathbf{v}}_1 - \alpha \tilde{\mathbf{v}}_1 \|_2^2 = \sum_i \Big( \sum_k A_{ik} (\tilde{\mathbf{v}}_1)_k - \alpha (\tilde{\mathbf{v}}_1)_i \Big)^2 .$$
By differentiating this expression with respect to 𝛼, show that

$$\lambda_1 = \frac{\tilde{\mathbf{v}}_1 \cdot (A \tilde{\mathbf{v}}_1)}{\tilde{\mathbf{v}}_1 \cdot \tilde{\mathbf{v}}_1} .$$

This is called the Rayleigh quotient.
4. Implement the power method as a function power_method(A, N, x) for a symmetric matrix 𝐴, which iterates for a fixed number of iterations 𝑁 starting from the initial vector x and returns ṽ1 together with 𝜆1 approximated by the Rayleigh quotient above.
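One possible sketch of such a function (not the only valid implementation; it assumes A is square and x has matching length):

using LinearAlgebra

function power_method(A, N, x)
    v = x / norm(x)                  # normalize the initial vector
    for _ in 1:N
        v = A * v
        v /= norm(v)                 # re-normalize each iteration to avoid overflow
    end
    λ = dot(v, A * v) / dot(v, v)    # Rayleigh quotient from part 3
    return v, λ
end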
5. Run this on the matrix 𝐷10, which is of size (9 × 9), using the true largest-magnitude eigenvalue from part 1 as the reference value. Plot the relative error between your result and the true value as a function of 𝑁; to get a smooth plot, use the same initial vector each time. Remember to normalize the vector at each iteration to avoid overflow! The initial random vector should be complex. (Extra credit: show that the relationship is what you would expect analytically!)
6. A more advanced method that we discussed to find all the eigenvalues was the QR algorithm. We define 𝐴(0) = 𝐴 and then
𝑄(𝑘)𝑅(𝑘) = 𝐴(𝑘)
𝐴(𝑘+1) = 𝑅(𝑘)𝑄(𝑘)
This will normally converge to an upper-triangular matrix. How does this work? We call two matrices 𝐴 and 𝐵 similar if 𝐵 = 𝑄𝐴𝑄⊤ where 𝑄 is orthogonal. Show that if 𝐴 has eigenvalue pairs (𝜆𝑖, v𝑖) then 𝐵 has eigenpairs (𝜆𝑖, 𝑄v𝑖).
7. Show that in the QR algorithm 𝐴(𝑘+1) is similar to 𝐴(𝑘) and hence 𝐴. Therefore if the algorithm converges to an upper-triangular matrix we can read off the eigenvalues from the diagonal and the eigenvectors will be given by the columns of 𝑄(1)𝑄(2)𝑄(3) ⋯ 𝑄(𝑁).
8. Implement the QR algorithm in a function qr_algorithm(A, N) for a matrix 𝐴 with 𝑁 iterations. It should return the resulting matrix 𝐴(𝑁) and the accumulated eigenvector matrix. Run the algorithm on a random symmetric matrix of size 10 × 10 for 100 iterations. How close to upper-triangular is the resulting matrix?
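A possible sketch, using Julia's built-in qr factorization and accumulating the Q factors as described in part 7:

using LinearAlgebra

function qr_algorithm(A, N)
    Ak = copy(float(A))
    Qtotal = Matrix{eltype(Ak)}(I, size(A)...)
    for _ in 1:N
        Q, R = qr(Ak)                 # A⁽ᵏ⁾ = Q⁽ᵏ⁾ R⁽ᵏ⁾
        Ak = R * Matrix(Q)            # A⁽ᵏ⁺¹⁾ = R⁽ᵏ⁾ Q⁽ᵏ⁾
        Qtotal *= Matrix(Q)           # accumulate Q⁽¹⁾Q⁽²⁾⋯
    end
    return Ak, Qtotal
end

A = randn(10, 10); A = (A + A') / 2   # random symmetric matrix
Ak, Q = qr_algorithm(A, 100)
norm(tril(Ak, -1))                    # size of the below-diagonal part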
9. Run the QR algorithm on the matrix 𝐷10 of size (9 × 9) for a suitable number of iterations to get convergence. Compare the results to the theoretical ones. Can you find all the eigenvalues and eigenvectors?
10. Using the fact that the eigendecomposition of a symmetric matrix gives orthogonal matrices (which are easy to invert) propose a method to solve the linear system
𝐷x = b
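For instance, since the eigendecomposition gives D = QΛQᵀ with Q orthogonal, the system can be solved with one transpose and one diagonal division. A minimal sketch, assuming the symmetric matrix D and the right-hand side b have already been constructed:

using LinearAlgebra

λ, Q = eigen(Symmetric(D))
x = Q * ((Q' * b) ./ λ)    # x = Q Λ⁻¹ Qᵀ b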
11. Solve the system for b = 0.01^2*sin.(2π*(0.01:0.01:0.99)) and 𝐷 as a 99 × 99 matrix. Plot the resulting vector x. On the same axis, plot sin.(2π*(0.01:0.01:0.99))/4π^2. Is it similar to x? We have just solved the boundary value problem 𝑓″ = sin(2𝜋𝑥) with 𝑓(0) = 𝑓(1) = 0. In the next problem set we will see how to do this even more quickly using Fourier analysis.
Exercise 2: Low-rank approximation
In this problem we will use the SVD to produce low-rank approximations for matrices. We will then use this to compress images and write fast multiplication algorithms.
In the last problem set we saw that we could exploit structural zeros to speed up algorithms. A matrix is of rank 𝑟 if we can write it in the form

$$A = \sum_{i=1}^{r} \sigma_i \mathbf{u}_i \mathbf{v}_i^\top$$
where the u𝑖 and v𝑖 are vectors of length 𝑁, so each u𝑖v𝑖⊤ is an (𝑁 × 𝑁) matrix of rank 1.
The SVD of a square 𝑁 × 𝑁 matrix 𝐴 exactly writes 𝐴 in this form:

$$A = U \Sigma V^\top =
\begin{bmatrix}
| & | & & | \\
\mathbf{u}_1 & \mathbf{u}_2 & \cdots & \mathbf{u}_N \\
| & | & & |
\end{bmatrix}
\begin{bmatrix}
\sigma_1 & & & \\
 & \sigma_2 & & \\
 & & \ddots & \\
 & & & \sigma_N
\end{bmatrix}
\begin{bmatrix}
\text{---}\ \mathbf{v}_1^\top\ \text{---} \\
\vdots \\
\text{---}\ \mathbf{v}_N^\top\ \text{---}
\end{bmatrix}
= \sum_{i=1}^{N} \sigma_i \mathbf{u}_i \mathbf{v}_i^\top$$
Truncating the summation after the largest 𝑟 singular values results in a rank-𝑟 approximation of the matrix 𝐴. In fact the Eckart–Young–Mirsky theorem shows this is the best rank-𝑟 approximation to 𝐴.
Define the rank-𝑟 approximation of 𝐴 as

$$A_r := \sum_{i=1}^{r} \sigma_i \mathbf{u}_i \mathbf{v}_i^\top$$

where the u𝑖 and v𝑖 are the singular vectors of 𝐴 and the 𝜎𝑖 are its singular values.
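A minimal sketch of forming 𝐴𝑟 from Julia's svd (the function name rank_r_approximation is our own):

using LinearAlgebra

function rank_r_approximation(A, r)
    U, σ, V = svd(A)                                 # A = U * Diagonal(σ) * V'
    return U[:, 1:r] * Diagonal(σ[1:r]) * V[:, 1:r]'
end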
2. Make a square matrix M of size (20 × 20) that is all zeros, except for an axis-aligned cross centred in the middle. Each arm of the cross should have total width 2 and extend a distance 5 out from the centre in each direction parallel to the axes. The cross should consist of 1s.
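One possible reading of this geometry (a sketch only; here the "centre" is taken to be rows and columns 10–11, so each arm spans indices 6–15):

M = zeros(20, 20)
M[6:15, 10:11] .= 1.0   # vertical arm, width 2
M[10:11, 6:15] .= 1.0   # horizontal arm, width 2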
3. Remembering that the rank is the column rank, i.e. the number of linearly independent columns, what should the rank of M be? Use the SVD of M to confirm this. What does the rank-1 approximation of M look like?
4. Now add small random Gaussian noise of intensity 0.1 using the randn function. Plot the new matrix (using heatmap). How should this affect the (column) rank of the matrix?
5. Plot the singular values 𝜎𝑛 as a function of 𝑛. What do they tell us? What is a suitable rank-𝑛 approximation to take? What does it look like?
using TestImages, Images, Plots

img = testimage("mandrill")
imgmat = Array(channelview(Float64.(Gray.(img))))
heatmap(imgmat, c=ColorGradient(:grays), showaxis=false, clims=(0, 1), yflip=true)
Here imgmat is a standard Julia matrix that contains greyscale pixel values.
7. Plot the rank-50 approximation to imgmat (using heatmap). How does it compare to the original?
8. Plot the singular values as a function of 𝑛. You should see an “elbow” shape. Approximately where does it occur?
9. Create an interactive visualization that shows the low-rank approximation for different values of 𝑟. What do you observe? After which 𝑟 are you happy with the quality of the image? Is it related to where the elbow is?
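If you are working in a Pluto notebook, one option is a slider bound to 𝑟. This sketch assumes PlutoUI is installed and reuses the rank_r_approximation helper sketched earlier (@bind itself is provided by Pluto):

using PlutoUI, Plots

@bind r Slider(1:100, show_value=true)

heatmap(rank_r_approximation(imgmat, r),
        c=ColorGradient(:grays), clims=(0, 1), yflip=true, showaxis=false)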
Exercise 3: Dynamic mode decomposition
In this problem we will use the SVD to predict the future! Suppose that we have some data that satisfies an ODE,
ẋ = f(x)
We can solve the ODE and take time snapshots of the result which we can stack into a matrix as follows:

$$X_{n,m} :=
\begin{bmatrix}
| & | & & | \\
\mathbf{x}(t_n) & \mathbf{x}(t_{n+1}) & \cdots & \mathbf{x}(t_m) \\
| & | & & |
\end{bmatrix}$$
We will assume that the dynamics is linear. This means that we expect that two consecutive snapshots should satisfy
x𝑛+1 = 𝐴x𝑛
for some matrix 𝐴.
If we have 𝑁 snapshots then we can write this as a matrix equation,
𝑋2,𝑁 = 𝐴𝑋1,(𝑁−1)
1. Suppose that we have an eigen-decomposition for 𝐴, i.e. 𝐴 = 𝑊Λ𝑊⁻¹, and that we can write the initial time snapshot as a sum over the eigenbasis of 𝐴,

$$\mathbf{x}_0 = \sum_j c_j \mathbf{w}_j$$
where w𝑗 is the 𝑗th eigenvector of 𝐴. Show that future time snapshots can be written as

$$\mathbf{x}_n = \sum_j c_j \lambda_j^n \mathbf{w}_j$$

where 𝜆𝑗 is the 𝑗th eigenvalue of 𝐴.
2. We are now going to try to find the eigendecomposition of 𝐴 without ever constructing it! Start by calculating the SVD of 𝑋1,(𝑁−1) = 𝑈Σ𝑉⊤. Find an expression for 𝑈⊤𝐴𝑈 in terms of 𝑋2,𝑁, Σ, 𝑉 and 𝑈. We can then calculate the eigenspectrum of 𝐴 by taking the eigenvalue decomposition of 𝑈⊤𝐴𝑈, since the two matrices are similar, and then use the result from [1.5].
3. Write a function that calculates the eigenspectrum of 𝐴 given 𝑋1,𝑁. Instead of using the full SVD, use a truncated SVD keeping only the first 𝑟 singular values and singular vectors; i.e. use the matrices Vr = V[:, 1:r], Ur = U[:, 1:r] and Σr = Σ[1:r, 1:r] in the expression above for 𝑉, 𝑈, Σ. Your function should be of the form aspectrum(X, r). The reason for truncating is in case 𝐴 is not full rank, in which case some entries of Σ might be nearly 0 and dividing by them would be a bad idea.

4. Test your function on a random 10 × 10 matrix 𝐴, generating some data 𝑋 of size (10 × 11) from it starting from a random x0. Compare the results of eigen(A) with aspectrum(X, 10). Remember that eigenvectors are only defined up to a multiplicative constant.
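A possible sketch of aspectrum as described in part 3 (the slicing convention used to build the shifted matrices from X is an assumption):

using LinearAlgebra

function aspectrum(X, r)
    X1 = X[:, 1:end-1]                  # X_{1,(N-1)}
    X2 = X[:, 2:end]                    # X_{2,N}
    U, σ, V = svd(X1)
    Ur, Σr, Vr = U[:, 1:r], Diagonal(σ[1:r]), V[:, 1:r]
    Atilde = Ur' * X2 * Vr * inv(Σr)    # ≈ UᵀAU, without ever forming A
    λ, W = eigen(Atilde)
    return λ, Ur * W                    # eigenvalues and approximate eigenvectors of A
end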
5. We are now going to apply this to some dynamical data. Use the ODE integrator code below to generate some data for 𝑁 coupled oscillators with 𝑥1(0) = 0.5 and all the others 𝑥𝑖(0) = 0 for N = 10 from 𝑡 = 0 to 𝑡 = 10. The coupled system can be written as

$$\begin{bmatrix} \ddot{x}_1 \\ \ddot{x}_2 \\ \ddot{x}_3 \\ \vdots \\ \ddot{x}_N \end{bmatrix} =
\begin{bmatrix}
-2 & 1 & & & \\
1 & -2 & 1 & & \\
 & 1 & -2 & 1 & \\
 & & \ddots & \ddots & \ddots \\
 & & & 1 & -2
\end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ \vdots \\ x_N \end{bmatrix}$$
or as a system of first-order equations,

$$\begin{bmatrix} \dot{x}_1 \\ \dot{y}_1 \\ \dot{x}_2 \\ \dot{y}_2 \\ \dot{x}_3 \\ \dot{y}_3 \\ \vdots \\ \dot{x}_N \\ \dot{y}_N \end{bmatrix} =
\begin{bmatrix}
0 & 1 & & & & & & \\
-2 & 0 & 1 & & & & & \\
 & & 0 & 1 & & & & \\
1 & 0 & -2 & 0 & 1 & & & \\
 & & & & 0 & 1 & & \\
 & & 1 & 0 & -2 & 0 & 1 & \\
 & & & & \ddots & \ddots & \ddots & \\
 & & & & & & 0 & 1 \\
 & & & & 1 & 0 & -2 & 0
\end{bmatrix}
\begin{bmatrix} x_1 \\ y_1 \\ x_2 \\ y_2 \\ x_3 \\ y_3 \\ \vdots \\ x_N \\ y_N \end{bmatrix}$$
The output is a data matrix with rows 𝑥1, 𝑥̇1, 𝑥2, 𝑥̇2, ⋯
Generate a plot of 𝑥1(𝑡) to check that everything went according to plan.
6. Split the data into two parts 𝑋1 and 𝑋2, the first half from 𝑡 = 0 to 𝑡 = 5 and the second half from 𝑡 = 5 to 𝑡 = 10. Calculate the spectrum of 𝐴 with 𝑟 = 10 using 𝑋1.
7. Use the first column of 𝑋2 as the initial condition. Use the spectrum you found to predict the future dynamics. [Hint: use the initial condition to find the 𝑐𝑗s, which is a matrix solve. Then use the equation from part 1 of this exercise to calculate the prediction.]
8. Plot the prediction for the 10 springs on the same axis as the true solution. What happens?
9. Repeat [3.6–3.7] for 𝑟 = 15 and 𝑟 = 20. What do you observe?
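A sketch of the prediction step from part 7, assuming X1 and X2 are the two halves of the data from part 6 and aspectrum is the function from part 3 (the real part is taken because the eigenpairs come in complex-conjugate pairs):

λ, W = aspectrum(X1, 10)
x0 = X2[:, 1]                              # initial condition
c = W \ x0                                 # solve W c = x₀ (least squares if W is not square)
predict(n) = real.(W * (c .* λ .^ n))      # x_n = Σ_j c_j λ_j^n w_j
prediction = hcat([predict(n) for n in 0:size(X2, 2)-1]...)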
ODE Code:
struct RKMethod{T}
    c::Vector{T}; b::Vector{T}; a::Matrix{T}; s::Int

    # Make sure that the matrices and vectors have the correct size
    function RKMethod(c::Vector{T}, b::Vector{T}, a::Matrix{T}, s::Int) where T
        lc = length(c); lb = length(b); sa1, sa2 = size(a)
        if lc == sa1 == sa2 == s - 1 && lb == s
            new{T}(c, b, a, s)
        else
            error("Sizes should be (s = $s): length(c) = s-1 ($lc), length(b) = s ($lb), size(a) = (s-1, s-1) ($sa1, $sa2)")
        end
    end
end

function (method::RKMethod)(f, x, t, h)
    # Extract the parameters
    a, b, c, s = method.a, method.b, method.c, method.s
    # Vector to hold the k terms
    k = [f(t, x)]
    for i in 1:s-1
        tn = t + c[i] * h
        xn = x + h * sum(a[i, j] * k[j] for j in 1:i)
        push!(k, f(tn, xn))
    end
    return x + h * sum(b .* k)
end
function integrate(method, f, x0, t0, tf, h)
    # Calculate the number of time steps
    N = ceil(Int, (tf - t0) / h)
    hf = tf - (N - 2) * h
    # Initiate tracking vectors
    xs = [copy(x0)]
    ts = [t0]
    # Step
    for i in 1:N-1
        push!(xs, method(f, xs[i], ts[i], h))
        push!(ts, ts[i] + h)
    end
    # Special last step
    push!(xs, method(f, xs[N-1], ts[N-1], hf))
    push!(ts, ts[N-1] + hf)
    return ts, xs
end
using Plots

c = [1/2, 1/2, 1]; b = (1/6) .* [1, 2, 2, 1]; a = [1/2 0 0; 0 1/2 0; 0 0 1]
rk4 = RKMethod(c, b, a, 4)

function build_springmat(N)
    springmat = zeros(2N, 2N)
    for i = 1:2N
        (i < 2N) && (springmat[i, i+1] = 1.0)
        if iseven(i)
            springmat[i, i-1] = -2.0
            (i > 3) && (springmat[i, i-3] = 1.0)
        end
    end
    springmat
end

N = 10
const spmat = build_springmat(N)
spf(t, x) = spmat * x
x0 = zeros(2N); x0[1] = 0.5
ts, xs = integrate(rk4, spf, x0, 0.0, 20.0, 0.005)
X = hcat(xs...)
plot(ts, X[1:2:2N, :]', leg=:outertopright, box=:on)
Exercise 4: Fourier integrals
In lectures we saw that we could write periodic functions in the form

$$f(x) = \sum_{n=-N}^{N} \hat{f}_n \, e^{inx}, \qquad \text{where} \qquad \hat{f}_n = \frac{1}{2\pi} \int_0^{2\pi} f(x) \, e^{-inx} \, dx .$$
1. Consider the saw-tooth function,

$$f(x) = \begin{cases} x & 0 \le x < \pi \\ 2\pi - x & \pi \le x < 2\pi \\ f(\operatorname{mod}(x, 2\pi)) & \text{otherwise} \end{cases}$$
Calculate the Fourier series coefficients analytically.
2. Write a function fourier_coefficients(f::Vector, n::Int) that takes in a vector of samples of 𝑓 uniformly distributed over [0, 2𝜋] and returns an approximation to $\hat{f}_n$ by calculating the integral using the trapezoidal rule.
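One possible sketch, assuming the samples include both endpoints x = 0 and x = 2π (if your sampling convention differs, the spacing changes accordingly):

function fourier_coefficients(f::Vector, n::Int)
    M = length(f)
    x = range(0, 2π, length=M)
    g = f .* exp.(-im .* n .* x)                    # integrand f(x) e^{-inx}
    h = step(x)
    integral = h * (sum(g) - (g[1] + g[end]) / 2)   # trapezoidal rule
    return integral / (2π)                          # f̂_n = (1/2π) ∫₀^{2π} f(x) e^{-inx} dx
end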
3. Now calculate $\hat{f}_n$ with your trapezoidal code using 100 points, for 𝑛 = −3, −2, …, 3. How do they compare to the theoretical results?
4. Fix 𝑛 to be 1. Plot the relative error between the theoretical result and the result from fourier_coefficients for calculating $\hat{f}_1$, using a number of points between 10 and 1000. What does the convergence look like? Does it agree with what we discussed in class?
5. Now consider the smooth periodic function exp(cos(𝑥)). Repeat [4.3–4.4] for this function. The analytical result is given by $\hat{f}_n = I_{|n|}(1)$. You can calculate this using besseli(abs(n), 1) from SpecialFunctions.jl.
6. Plot the magnitude of 𝑓𝑛̂ as a function of 𝑛 for the two functions. How do the coefficients decay? Is this what you expected?