vendredi 29 novembre 2019

2-D regression dependent truncated normal/laplace distribution

I need your help in creating a normal or better truncated laplace distribution which linearly depends on another laplace distributed variable.

This is what I achieved so far unfortunately with NaNs in the derived y-distribution:

import numpy as np
from matplotlib import pyplot as plt
from scipy.stats import gaussian_kde, truncnorm

slope = 0.2237
intercept = 1.066
spread = 4.8719

def dependency(x):
    y_lin = slope * x + intercept
    lower = slope / spread * 3 * x
    upper = slope * spread / 3 * x + 2 * intercept

    y_lin_noise = np.random.laplace(loc=0, scale=spread, size=len(y_lin)) + y_lin

    y_lin_noise[y_lin_noise < lower] = np.nan  # This is the desperate solution where
    y_lin_noise[y_lin_noise > upper] = np.nan  # NaNs are introduced

    return y_lin_noise

max = 100
min = 1
mean = 40
sigma = 25

x = truncnorm((min-mean)/sigma, (max-mean)/sigma, loc=mean, scale=sigma).rvs(5000)
y = dependency(x)

# Plotting
xx = np.linspace(np.nanmin(x), np.nanmax(x), 100)
yy = slope * xx + intercept
lower = slope/spread*3*xx
upper = slope*spread/3*xx + 2*intercept

mask = ~np.isnan(y) & ~np.isnan(x)
x = x[mask]
y = y[mask]

xy = np.vstack([x, y])
z = gaussian_kde(xy)(xy)
idz = z.argsort()
x, y, z = x[idz], y[idz], z[idz]

fig, ax = plt.subplots(figsize=(5, 5))
plt.plot(xx, upper, 'r-.', label='upper constraint')
plt.plot(xx, lower, 'r--', label='lower constraint')

ax.scatter(x, y, c=z, s=3)
plt.xlabel(r'$\bf X_{laplace}$')
plt.ylabel(r'$\bf Y_$')
plt.plot(xx, yy, 'r', label='regression model')
plt.legend()
plt.tight_layout()
plt.show()

Result

What I want to get in the end is a y-distribution without NaNs, so that for every x there is a corresponding y within in the range of the upper/lower thresholds. So to say, a lower/upper-truncated distribution around the regression line.

I am looking forward to ideas!

Thank you and best regards.




Aucun commentaire:

Enregistrer un commentaire