random: stats.rv_continuous slow when using custom pdf

Tuesday, 21 August 2018

Ultimately, I am trying to visualise the copula between two PDFs, each estimated from data with a KDE. Suppose that, for one of the KDEs, I have discrete x, y data held in a tuple called data (sorted x values in data[0], the corresponding pdf values in data[1]). I need to generate random variables with this distribution in order to perform the probability integral transform (and ultimately obtain the uniform distribution). My method for generating the random variables is as follows:

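import scipy.stats as st
from scipy import interpolate

# Piecewise-linear interpolant of the tabulated KDE values.
pdf1 = interpolate.interp1d(data[0], data[1])

class pdf1_class(st.rv_continuous):
    # Only _pdf is overridden, so the cdf and ppf fall back to
    # scipy's generic numerical routines.
    def _pdf(self, x):
        return pdf1(x)

# Restrict the support to the range covered by the data.
pdf1_rv = pdf1_class(a=data[0][0], b=data[0][-1], name='pdf1_class')

pdf1_samples = pdf1_rv.rvs(size=10000)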

However, this method is extremely slow. I also get the following warnings:

IntegrationWarning: The maximum number of subdivisions (50) has been achieved. If increasing the limit yields no improvement it is advised to analyze the integrand in order to determine the difficulties. If the position of a local difficulty can be determined (singularity, discontinuity) one will probably gain from splitting up the interval and calling the integrator on the subranges. Perhaps a special-purpose integrator should be used.
  warnings.warn(msg, IntegrationWarning)

IntegrationWarning: The occurrence of roundoff error is detected, which prevents the requested tolerance from being achieved. The error may be underestimated.
  warnings.warn(msg, IntegrationWarning)

Is there a better way to generate the random variables?
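
One possible direction, offered as a sketch rather than a definitive answer: when only _pdf is supplied, rv_continuous has to obtain the cdf by numerically integrating the pdf and then invert it by root finding for every sample, which is most likely where both the slowness and the IntegrationWarning messages come from. Since the pdf is already tabulated on a grid, the cdf can instead be built once with the trapezoid rule and inverted by interpolation (inverse transform sampling). The helper names tabulated_cdf and sample_from_tabulated_pdf are made up for this illustration, and the Gaussian grid is only a stand-in for the real data tuple, assumed to hold sorted x values and non-negative pdf values. Inverting the cdf by linear interpolation is an approximation that should be adequate on a reasonably fine grid.

```python
import numpy as np

def tabulated_cdf(x, y):
    """CDF values at the grid points x for a pdf tabulated as y (trapezoid rule)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Area of each trapezoid between neighbouring grid points, accumulated.
    areas = 0.5 * (y[1:] + y[:-1]) * np.diff(x)
    cdf = np.concatenate(([0.0], np.cumsum(areas)))
    return cdf / cdf[-1]  # normalise so the CDF ends at exactly 1

def sample_from_tabulated_pdf(x, y, size, rng=None):
    """Inverse transform sampling: push uniform draws through the inverted CDF."""
    rng = np.random.default_rng() if rng is None else rng
    cdf = tabulated_cdf(x, y)
    u = rng.uniform(size=size)
    # Swapping the roles of x and cdf in np.interp evaluates the inverse CDF.
    return np.interp(u, cdf, np.asarray(x, dtype=float))

# Stand-in for the question's `data` tuple (assumption: a standard normal pdf).
grid = np.linspace(-5.0, 5.0, 1001)
data = (grid, np.exp(-0.5 * grid**2) / np.sqrt(2.0 * np.pi))

pdf1_samples = sample_from_tabulated_pdf(
    data[0], data[1], size=10000, rng=np.random.default_rng(0)
)

# Probability integral transform: evaluate the CDF at the given values.
uniform_values = np.interp(pdf1_samples, data[0], tabulated_cdf(data[0], data[1]))
```

The same tabulated cdf can be applied directly to the original observations with np.interp, so for the copula plot it may not be necessary to draw new samples at all.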
