I have a population of objects and would like to renew it by discarding a certain percentage and adding some new ones. The discarded objects should on average belong to a lower percentile of the population, i.e. they should in general have a slightly lower "value" than the average of the whole population.
I am working with a list of objects that have some attributes, among them an ID and, say, a "value". I would like to draw a random sample of unique objects from this list so that the mean "value" of the sample has a certain bias compared to the mean "value" of the whole population.
Below is a minimal example:
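from random import seed, random

import numpy as np

# Seed both generators: the population uses `random`, the sampling further down uses NumPy.
seed(243)
np.random.seed(243)

class Object:
    def __init__(self, ID, Value):
        self.ID = ID
        self.Value = Value

# Build a population of 50 objects with uniform random values in [0, 1).
population = []
for i in range(50):
    value = random()
    population.append(Object(i, value))

# Mean "value" of the whole population.
population_list = [x.Value for x in population]
mu = np.mean(population_list)
print(mu)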
The approach I am trying is to assign a weight to each Object ID and then use np.random.choice to sample the IDs. This seems to make sense, since if I use a uniform distribution for the weights I get the same mean for the sample:
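Ids = [x.ID for x in population]
Weights = np.ones(len(Ids)) / len(Ids)  # uniform weights

test_means = []
# A for loop avoids reusing the leftover loop variable i (which would start at 49 here).
for _ in range(1000):
    # Draw 25 distinct IDs, then average the values of the selected objects.
    Sample = np.random.choice(Ids, p=Weights, size=25, replace=False)
    Sample_list = [x.Value for x in population if x.ID in Sample]
    test_means.append(np.mean(Sample_list))
print(np.mean(test_means))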
However, I can't figure out how to construct the weights so that my sample is systematically biased. Ideally, I would like to be able to control the bias, so that the mean of the sample statistically approaches, say, 0.80 times the population mean.
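To make the goal concrete, here is a rough sketch of one possible weighting scheme. It is an assumption for illustration, not an established answer. The weights are proportional to exp(-lam * value), so a larger lam favours low values. Because sampling without replacement does not simply reproduce the weighted mean, lam is tuned by simulation with a bisection search. The names tilted_weights, mean_of_sample_means and calibrate_lam, the parameter lam, and the search range [0, 20] (which assumes values roughly in [0, 1]) are all hypothetical.

```python
import numpy as np


def tilted_weights(values, lam):
    """Weights proportional to exp(-lam * value); lam > 0 favours low values."""
    values = np.asarray(values, dtype=float)
    # Subtracting the minimum does not change the normalised weights,
    # it only keeps the exponentials in a safe numeric range.
    w = np.exp(-lam * (values - values.min()))
    return w / w.sum()


def mean_of_sample_means(values, lam, size, n_rep, rng):
    """Monte Carlo estimate of the average sample mean for a given lam."""
    values = np.asarray(values, dtype=float)
    p = tilted_weights(values, lam)
    idx = np.arange(len(values))
    means = [
        values[rng.choice(idx, size=size, replace=False, p=p)].mean()
        for _ in range(n_rep)
    ]
    return float(np.mean(means))


def calibrate_lam(values, target_ratio, size, n_rep=300,
                  lo=0.0, hi=20.0, n_iter=15, seed=0):
    """Bisection on lam so the average sample mean is roughly
    target_ratio times the population mean (assumes target_ratio < 1
    and that the target is reachable for lam in [lo, hi])."""
    target = target_ratio * float(np.mean(values))
    for _ in range(n_iter):
        mid = 0.5 * (lo + hi)
        # Re-seeding at every step reuses the same random stream,
        # which reduces noise between successive evaluations.
        rng = np.random.default_rng(seed)
        if mean_of_sample_means(values, mid, size, n_rep, rng) > target:
            lo = mid  # sample mean still too high: tilt harder
        else:
            hi = mid  # sample mean too low: tilt less
    return 0.5 * (lo + hi)
```

With the example above, this could be used as lam = calibrate_lam(population_list, 0.8, size=25), followed by Weights = tilted_weights(population_list, lam) in the call to np.random.choice. The result is only approximate, since the calibration itself relies on noisy simulation, and not every target is reachable: with 25 out of 50 objects the sample mean can never drop below the mean of the 25 lowest values.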
Would appreciate some good ideas or an alternative approach.