I'm trying to run a Monte Carlo-style analysis in which I randomly reassign the participants in my experiment to new groups and then reanalyze the data under the new grouping. Here's what I want to do:
Participants are originally in eight groups of four participants each. I want to randomly reassign each participant to a new group, but no participant should end up in a new group with another participant from their original group.
Here is how far I got with this:
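import itertools as it
import random

import pandas as pd

# One row per participant: 8 original groups x 4 participants each.
data = list(it.product(range(8), range(4)))
test_df = pd.DataFrame(data=data, columns=['group', 'partid'])
test_df['new_group'] = None

for idx, row in test_df.iterrows():
    start_group = row['group']
    # New groups already taken by this participant's original group-mates.
    takens = test_df.query('group == @start_group')['new_group'].values
    # New groups that already hold 4 participants.
    fulls = test_df.groupby('new_group').count().query('partid >= 4').index.values
    possibles = [x for x in test_df['group'].unique()
                 if (x not in takens) and (x not in fulls)]
    # random.choice raises IndexError when possibles is empty (the dead end described below).
    test_df.loc[idx, 'new_group'] = random.choice(possibles)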
The basic idea here is that I randomly reassign each participant to a new group under two constraints: (a) the new group doesn't already contain one of their original group partners, and (b) the new group doesn't already have 4 or more participants reassigned to it.
The problem with this approach is that, quite often, by the time we try to reassign the last original group, the only remaining open slots are in the same new group, so the constraints can't be satisfied. I could just re-randomize whenever it fails until it succeeds, but that feels silly. Also, I want to make 100 random reassignments, so that approach could get very slow.
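For reference, the re-randomize-until-it-succeeds idea could be sketched roughly as below. This is only a sketch: the names is_valid and reassign_by_retry and the max_tries cap are hypothetical, and it uses plain tuples instead of the DataFrame. It shuffles all 32 participants, cuts the list into consecutive blocks of four, and starts over whenever a block contains two people from the same original group. I haven't measured how many shuffles get rejected, so the speed would need to be timed.

```python
import random

N_GROUPS = 8      # number of original groups, from the setup above
GROUP_SIZE = 4    # participants per group, from the setup above


def is_valid(new_groups):
    """True if no new group holds two participants from one original group."""
    return all(
        len({orig for orig, _ in members}) == len(members)
        for members in new_groups
    )


def reassign_by_retry(rng=random, max_tries=100_000):
    """Shuffle everyone, cut into blocks of GROUP_SIZE, retry until valid.

    max_tries is an arbitrary safety cap (an assumption, not a tuned value).
    """
    participants = [(g, p) for g in range(N_GROUPS) for p in range(GROUP_SIZE)]
    for _ in range(max_tries):
        rng.shuffle(participants)
        new_groups = [
            participants[i:i + GROUP_SIZE]
            for i in range(0, len(participants), GROUP_SIZE)
        ]
        if is_valid(new_groups):
            return new_groups
    raise RuntimeError("no valid reassignment found within max_tries")


if __name__ == "__main__":
    for new_group, members in enumerate(reassign_by_retry()):
        print(new_group, members)
```

Because every shuffle is equally likely and only the invalid ones are thrown away, each valid regrouping comes out with the same probability, which may matter for a Monte Carlo analysis.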
So there must be a smarter way to do this. I also feel like there should be a simpler way to solve this, given how simple the goal feels (but I realize that can be misleading...)
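One construction that can never hit a dead end, sketched under my own assumptions (the function name reassign_by_shifts and its variables are hypothetical): give each member of an original group a random "slot", pick one distinct cyclic shift per slot, and send the member in slot k of the group at position i to new group (i + shift_k) mod 8, with the positions and new-group labels shuffled. Distinct shifts send group-mates to distinct new groups, and each new group receives exactly one person per slot. The catch is that this only reaches a structured subset of the valid regroupings, so it is not a uniform sample.

```python
import random

N_GROUPS = 8      # number of original groups, from the setup above
GROUP_SIZE = 4    # participants per group, from the setup above


def reassign_by_shifts(rng=random):
    """Return {(group, partid): new_group}, valid by construction.

    Requires GROUP_SIZE <= N_GROUPS so the shifts can all be distinct.
    """
    # One distinct cyclic shift per slot within a group.
    shifts = rng.sample(range(N_GROUPS), GROUP_SIZE)
    # Random position for each original group and random new-group labels.
    positions = list(range(N_GROUPS))
    rng.shuffle(positions)
    labels = list(range(N_GROUPS))
    rng.shuffle(labels)

    assignment = {}
    for group in range(N_GROUPS):
        members = list(range(GROUP_SIZE))
        rng.shuffle(members)  # decide which member takes which slot
        for slot, partid in enumerate(members):
            target = (positions[group] + shifts[slot]) % N_GROUPS
            assignment[(group, partid)] = labels[target]
    return assignment


if __name__ == "__main__":
    for (group, partid), new_group in sorted(reassign_by_shifts().items()):
        print(group, partid, new_group)
```

The resulting dictionary could be mapped back onto the 'group' and 'partid' columns of test_df to fill 'new_group'.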