random: why is my python implementation of metropolis algorithm (mcmc) so slow?

dimanche 24 février 2019

why is my python implementation of metropolis algorithm (mcmc) so slow?

I'm trying to implement the Metropolis algorithm (a simpler version of the Metropolis-Hastings algorithm) in Python.

Here is my implementation:

def Metropolis_Gaussian(p, z0, sigma, n_samples=100, burn_in=0, m=1):
"""
Metropolis Algorithm using a Gaussian proposal distribution.
p: distribution that we want to sample from (can be unnormalized)
z0: Initial sample
sigma: standard deviation of the proposal normal distribution.
n_samples: number of final samples that we want to obtain.
burn_in: number of initial samples to discard.
m: this number is used to take every mth sample at the end
"""
# List of samples, check feasibility of first sample and set z to first sample
sample_list = [z0]
_ = p(z0) 
z = z0
# set a counter of samples for burn-in
n_sampled = 0

while len(sample_list[::m]) < n_samples:
    # Sample a candidate from Normal(mu, sigma),  draw a uniform sample, find acceptance probability
    cand = np.random.normal(loc=z, scale=sigma)
    u = np.random.rand()
    try:
        prob = min(1, p(cand) / p(z))
    except (OverflowError, ValueError) as error:
        continue
    n_sampled += 1

    if prob > u:
        z = cand  # accept and make candidate the new sample

    # do not add burn-in samples
    if n_sampled > burn_in:
        sample_list.append(z)

# Finally want to take every Mth sample in order to achieve independence
return np.array(sample_list)[::m]

When I try to apply my algorithm to an exponential function it takes very little time. However, when I try it on a t-distribution it takes ages, considering that it's not doing that many calculations. This is how you can replicate my code:

t_samples = Metropolis_Gaussian(pdf_t, 3, 1, 1000, 1000, m=100)
plt.hist(t_samples, density=True, bins=15, label='histogram of samples')
x = np.linspace(min(t_samples), max(t_samples), 100)
plt.plot(x, pdf_t(x), label='t pdf')
plt.xlim(min(t_samples), max(t_samples))
plt.title("Sampling t distribution via Metropolis")
plt.xlabel(r'$x$')
plt.ylabel(r'$y$')
plt.legend()

This code takes quite a long time to run and I'm not sure why. In my code for Metropolis_Gaussian, I am trying to improve efficiency by

Not adding to the list repeated samples
Not recording burn-in samples

The function pdf_t is defined as follows

from scipy.stats import t
def pdf_t(x, df=10):
    return t.pdf(x, df=df)

random

dimanche 24 février 2019

why is my python implementation of metropolis algorithm (mcmc) so slow?

Aucun commentaire:

Enregistrer un commentaire