Lab 6 - Solutions
Task 1. Using a NumPy function, how would you create a one-dimensional NumPy array of the numbers from 10 to 100,
counting by 5? Note that this is of the form:
[10 15 20 25 30 ... 85 90 95 100]
Repeat the task using a NumPy array built from a list comprehension.
Answer
np.arange(10, 105, 5)
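Using a list comprehension instead (this produces the same array):
np.array([x for x in range(10, 105, 5)])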
Task 2. What will be the output of the following pieces of code? Try to explain your answers.
(a)
x=np.array([10,20,30])
x[0]=10.1
print(x)
(b)
X=np.array([[1.,2.],[3.,4.]])
X*=np.array([[10.,100.]])
print(X)
Answer
(a) Since the original array is an integer array, 10.1 is truncated to the integer 10 on assignment, so the final array is the same as the original array x: [10 20 30].
(b) X is a 2 × 2 array. Multiplying it by the 1 × 2 array broadcasts the latter across both rows, so the element-wise calculation performed is
[[1.*10.   2.*100.]
 [3.*10.   4.*100.]]
and X becomes [[10., 200.], [30., 400.]].
(b) np.array([[3,6,7],
[2,3,1]])
(c) np.array([[3,2],
[6,3],
[7,1]])
(d) np.array([[3,6,7],
[2,3,1],
[3,6,7]])
Answer
Task 4. Write a function that performs 0-1 scaling on a NumPy array. The function should accept three arguments: the NumPy array to scale, and the lower and upper bounds of the original scale.
Answer
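A minimal sketch of one possible solution (the function name scale01 is our choice, not prescribed by the task):
import numpy as np

def scale01(x, lower, upper):
    # Linearly map values from [lower, upper] onto [0, 1]
    return (x - lower) / (upper - lower)

print(scale01(np.array([1., 3., 5.]), 1, 5))   # [0.  0.5 1. ]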
Task 5. Write a function gamma_sample, which takes the parameters n_samples, k (assumed to be an integer) and theta as arguments and which returns n_samples draws from the Ga(k, θ) distribution. One draw X should be calculated as
$$X = \frac{-\sum_{i=1}^{k} \log(U_i)}{\theta},$$
where the U_i are i.i.d. draws from the U(0, 1) distribution.
Use your function to draw a sample of size 10,000 from the Ga(3, 0.5) distribution. Compute its empirical mean and
variance, which should be close to 6 and 12, respectively.
The mathematics behind the method is that −log(U_i)/θ produces a random draw from the Expo(θ) distribution (using the inverse CDF method) and that the sum of k such variates yields a draw from the Ga(k, θ) distribution.
Answer
The most efficient solution is to create an n_samples × k matrix of exponential realisations and then compute the row sums.
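A sketch implementing this approach:
import numpy as np

def gamma_sample(n_samples, k, theta):
    # n_samples-by-k matrix of U(0, 1) draws
    u = np.random.uniform(size=(n_samples, k))
    # -log(U)/theta is a draw from Expo(theta); row sums give Ga(k, theta) draws
    return -np.log(u).sum(axis=1) / theta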
x = gamma_sample(10000, 3, 0.5)
print(np.mean(x))   # should be close to 6
print(np.var(x))    # should be close to 12
Task 6. Consider a covariance matrix of the form
$$\Sigma = \begin{pmatrix} \sigma_1^2 & \sigma_{12} & \dots & \sigma_{1n} \\ \sigma_{12} & \sigma_2^2 & \dots & \sigma_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ \sigma_{1n} & \sigma_{2n} & \dots & \sigma_n^2 \end{pmatrix}$$
Write a function cor_matrix, which takes a covariance matrix as its argument and returns the corresponding correlation matrix, whose (i, j) entry is σ_ij /(σ_i σ_j).
Test your function on the covariance matrix
$$\Sigma = \begin{pmatrix} 9 & 3 & 1 \\ 3 & 4 & 1 \\ 1 & 1 & 1 \end{pmatrix}.$$
Answer
def cor_matrix(S):
    # The standard deviations are the square roots of the diagonal entries
    sds = np.sqrt(np.diagonal(S))
    # Divide each entry by the product of the corresponding standard deviations
    return S / (sds[:, None] @ sds[None, :])
S = np.array([[9, 3, 1],
[3, 4, 1],
[1, 1, 1]])
print(cor_matrix(S))
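For this Σ the standard deviations are 3, 2 and 1, so the printed correlation matrix is (up to floating-point display)
[[1.         0.5        0.33333333]
 [0.5        1.         0.5       ]
 [0.33333333 0.5        1.        ]]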
Task 7. The conjugate gradient method is an iterative method for solving a system of linear equations
Ax = b
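For reference, the solution below implements the standard conjugate gradient recurrences: starting from x_0 = 0, r_0 = p_0 = b, iterate
$$\alpha_k = \frac{r_k^\top r_k}{p_k^\top A p_k}, \qquad x_{k+1} = x_k + \alpha_k p_k, \qquad r_{k+1} = r_k - \alpha_k A p_k,$$
$$\beta_k = \frac{r_{k+1}^\top r_{k+1}}{r_k^\top r_k}, \qquad p_{k+1} = r_{k+1} + \beta_k p_k,$$
stopping once the residual norm becomes sufficiently small. The method requires A to be symmetric and positive definite.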
Answer
import math
import numpy as np

def cg_solve(A, b):
    if A.shape[0] != A.shape[1]:
        raise ValueError("A needs to be a square matrix.")
    n = A.shape[0]
    if n != b.shape[0]:
        raise ValueError(
            "Length of b needs to be the number of rows/columns of A"
        )
    x = np.zeros(n)
    r = b
    p = r
    gamma = r @ r
    for k in range(1, n+1):
        d = A @ p
        # Step size along the current search direction p
        alpha = gamma / (p @ d)
        x = x + alpha * p
        # Update the residual
        r = r - alpha * d
        gamma_old = gamma
        gamma = r @ r
        # Stop early once the residual is numerically zero
        if math.sqrt(gamma) < 1e-8:
            break
        # New search direction, conjugate to the previous ones
        beta = gamma / gamma_old
        p = r + beta * p
    return x
A = np.array([[6, 0, 2, 3],
[0, 5, 1, 1],
[2, 1, 4, 2],
[3, 1, 2, 3]])
b = np.array([0, -11, 4, -5])
print(cg_solve(A, b))
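The matrix A above is symmetric and positive definite, so the call prints the exact solution, [ 1. -2.  3. -4.], up to the 1e-8 tolerance.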
Task 8. A model called Geometric Brownian motion (GBM) is used to model stock prices in the Black-Scholes model.
In this task we will simulate one trajectory from GBM. For a given initial share price S0 , a given percentage drift µ and
a given percentage volatility σ , we can simulate GBM with time step ∆t as follows.
$$S_t = S_0 \cdot \exp\left( \sum_{i=1}^{t} \left[ \Delta t \cdot \left( \mu - \frac{\sigma^2}{2} \right) + \sigma \sqrt{\Delta t} \cdot \varepsilon_i \right] \right)$$
where the ε_i are i.i.d. realisations from the N(0, 1) distribution. You can draw n realisations from the N(0, 1) distribution using np.random.normal(size=n).
Draw one realisation of the path for S0 = 100, µ = 0.1, σ = 0.2, ∆t = 0.01 and n = 1000.
If the realisation of the path is stored in a NumPy array s, then you can plot the path using
import matplotlib.pyplot as plt
plt.figure()
plt.plot(s)
plt.show()
Suppose you could make exactly one purchase followed by exactly one sale, when would you carry out these two
transactions to maximise your profit?
Answer
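First draw the per-step increments inside the exponential, using the parameter values from the task (a sketch; variable names other than s_0, x and s are our choice):
s_0 = 100
mu = 0.1
sigma = 0.2
dt = 0.01
n = 1000
# Per-step increments: drift term plus scaled N(0, 1) noise
x = dt * (mu - sigma**2 / 2) + sigma * np.sqrt(dt) * np.random.normal(size=n)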
s = s_0 * np.exp(np.cumsum(x))
We cannot simply take the minimum and the maximum of the share price, as the maximum might occur before
the minimum, in which case we would have to sell the shares before we would have bought them.
We start by calculating the cumulative minimum of the price.
s_cummin = np.minimum.accumulate(s)
profit = s - s_cummin
We sell on the day on which this potential profit is largest, and buy when the share price was minimal before the sell date.
sell = np.argmax(profit)
buy = np.argmin(s[:sell])
We can add these to the plot (we will learn more about plotting in three weeks' time).
plt.figure()
plt.plot(s)
plt.axvline(buy)
plt.axvline(sell)
plt.show()
(b) (Harder) Gaussian processes are a commonly used method for regression. A critical component of these models
is the kernel function k , which largely defines our belief about the properties of the function. For a set of inputs
x = (x1 , ..., xn ), we can use this kernel to calculate the covariance matrix:
$$K = \begin{pmatrix} k(x_1, x_1) & k(x_1, x_2) & \dots & k(x_1, x_n) \\ k(x_2, x_1) & k(x_2, x_2) & \dots & k(x_2, x_n) \\ \vdots & \vdots & \ddots & \vdots \\ k(x_n, x_1) & k(x_n, x_2) & \dots & k(x_n, x_n) \end{pmatrix} \tag{2}$$
• evaluate, which takes as input the n × 1 array X and evaluates the squared exponential kernel for every pair
of inputs. This function should make use of the code from part (a). Note that an exception should be raised if X
is not two-dimensional.
Use the following code as a quick check of the covariance matrix produced by the class:
XX=np.random.uniform(-1,1,[200,1])
XX-=np.mean(XX)
XX/=np.std(XX)
YY=np.linspace(-3,3,200).reshape((200,1))
kern=SquaredExponential(0.5,1.6)
KK=kern.evaluate(XX)
try:
    np.linalg.cholesky(KK+1e-8*np.eye(200))
except np.linalg.LinAlgError:
    print('Something wrong with SE class')
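The Cholesky factorisation succeeds only for (numerically) positive definite matrices, which any valid covariance matrix must be, so it serves as a quick sanity check here; the small 1e-8 jitter added to the diagonal guards against rounding error when K is nearly singular.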
(c) As mentioned above, the kernel function largely defines the properties of a Gaussian process, the prior distribution
of which is specified as follows:
f (·) ∼ GP(µ(·), k(·, ·))
Often, µ(·) is assumed to be zero and the specification of our Gaussian process (prior) model only requires a choice
of kernel function, such as the Squared exponential from part (b).
While the idea of a distribution over functions may seem complicated, we are only ever interested in the properties
of our GP prior at a finite set of points x and a fortunate property of Gaussian processes is that the distribution at
any finite set of points takes the form of the multivariate Normal:
f (x) ∼ MVN(0, K)
where K takes the form provided in (2). Therefore, to plot samples from the GP prior we can use the following procedure:
1. Choose a vector of inputs x.
2. Evaluate the kernel function at all pairs of points in x to produce K.
3. Sample from MVN(0, K).
Define a Python class GP that can be used to represent a Gaussian process prior. The class should be set up as follows:
• An __init__ method that takes two arguments, X (input values) and kernel (an instance of a kernel class), and stores these as attributes of the class.
• A method prior_sample that takes an input n and produces n samples from your GP prior at the inputs X.
If set up correctly, the following code should produce a plot containing 5 samples from the GP prior:
k=SquaredExponential(0.5,1.)
XX=np.linspace(-3,3,200).reshape((200,1))
XX-=np.mean(XX)
XX/=np.std(XX)
YY=np.linspace(-3,3,200).reshape((200,1))
model=GP(XX,k)
samp=model.prior_sample(5)
plt.figure()
for i in range(5):
    plt.plot(XX,samp[i,:])
Answer
(a) There are two solutions to this task, one of which generalises better than the other. The first solution works in the case where we are interested in two n × 1 arrays: broadcasting the n × 1 array against its 1 × n transpose gives the full matrix of pairwise differences.
X=np.random.normal(0,1,[20,1])
C=X-X.T   # (20, 1) - (1, 20) broadcasts to a (20, 20) matrix
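A sketch of one way the second solution could generalise (our illustration, handling two arrays of different lengths via explicit broadcasting):
X = np.random.normal(0, 1, [20, 1])
Y = np.random.normal(0, 1, [30, 1])
# (20, 1) minus (1, 30) broadcasts to the (20, 30) matrix of pairwise differences
C = X - Y.T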
(b)
class SquaredExponential:
    def __init__(self,lengthscale,amplitude):
        if lengthscale<=0:
            raise ValueError('lengthscale must be positive')
        if amplitude<=0:
            raise ValueError('amplitude must be positive')
        self.lengthscale=lengthscale
        self.amplitude=amplitude
    def evaluate(self,X):
        if len(X.shape)!=2:
            raise ValueError('X must be two-dimensional')
        # Squared pairwise differences via broadcasting, as in part (a)
        diff=(X-X.T)**2
        C=self.amplitude*np.exp(-diff/(2*self.lengthscale**2))
        return C
To ensure positive parameters, we could also have used the TransformedParameter class from last week to represent the parameters of the model. This is a better solution since it would allow us to update the parameters of the model in unconstrained space, which is often where estimation is performed.
class PowerTransform:
    def __init__(self,power):
        self.power=power
    def forward(self,x):
        if self.power==0:
            return np.exp(x)
        else:
            return x**self.power
    def inverse(self,x):
        if self.power==0:
            return np.log(x)
        else:
            return x**(1/self.power)
class ExpTransform(PowerTransform):
    def __init__(self):
        super().__init__(0)
class TransformedParameter:
    def __init__(self,value,name,transform):
        self.name=name
        self.transform=transform
        self.unconstrained_value=self.transform.inverse(value)
    def get_value(self):
        return self.transform.forward(self.unconstrained_value)
    def set_value(self,val):
        self.unconstrained_value=self.transform.inverse(val)
    value=property(get_value,set_value)
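A brief (hypothetical) illustration of why updating in unconstrained space is convenient:
p = TransformedParameter(0.5, 'lengthscale', ExpTransform())
# An optimiser may take an unconstrained step in log-space...
p.unconstrained_value += 1.0
# ...and the constrained value remains positive
print(p.value)   # exp(log(0.5) + 1.0) ≈ 1.359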
class SquaredExponential:
    def __init__(self,lengthscale,amplitude):
        if lengthscale<=0:
            raise ValueError('lengthscale must be positive')
        if amplitude<=0:
            raise ValueError('amplitude must be positive')
        self.lengthscale=TransformedParameter(lengthscale,'lengthscale',
                                              ExpTransform())
        self.amplitude=TransformedParameter(amplitude,'amplitude',
                                            ExpTransform())
    def evaluate(self,X):
        if len(X.shape)!=2:
            raise ValueError('X must be two-dimensional')
        diff=(X-X.T)**2
        C=self.amplitude.value*np.exp(-diff/(2*self.lengthscale.value**2))
        return C
(c)
class GP:
    def __init__(self,X,kernel):
        self.kernel=kernel
        self.X=X
    def prior_sample(self,n):
        # The GP prior evaluated at the inputs X is MVN(0, K)
        KK=self.kernel.evaluate(self.X)
        return np.random.multivariate_normal(np.zeros(self.X.shape[0]),KK,[n])
k=SquaredExponential(0.5,1.)
XX=np.linspace(-3,3,200).reshape((200,1))
XX-=np.mean(XX)
XX/=np.std(XX)
YY=np.linspace(-3,3,200).reshape((200,1))
model=GP(XX,k)
samp=model.prior_sample(5)
plt.figure()
for i in range(5):
    plt.plot(XX,samp[i,:])