I'm trying to make my code faster by removing some for loops and using arrays instead. The slowest step right now is generating the random lists.
Context: I have a number of mutations in a chromosome, and I want to generate 1000 random "chromosomes" with the same length and the same number of mutations, but with randomized mutation positions.
Here is what I'm currently running to generate these randomized mutation positions:
import numpy as np

iterations=1000
Chr_size=1000000
num_mut=500
randbps=[]
for k in range(iterations):
    # draw num_mut distinct positions for one randomized chromosome
    listed=np.random.choice(range(Chr_size),num_mut,replace=False)
    randbps.append(listed)
I want to do something similar to what is covered in this question, generating all of the chromosomes in a single call:
np.random.choice(range(Chr_size),size=(num_mut,iterations),replace=False)
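However, replace=False then applies to the array as a whole. I need it applied to each chromosome separately: positions must be unique within a chromosome but may repeat across chromosomes.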
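For illustration, here is a minimal sketch of one way to get per-chromosome sampling in a single 2-D array. It is not taken from the linked question. It assumes num_mut is much smaller than Chr_size, so that repeats within a row are rare: every row is drawn with replacement at once, and only the rows that happen to contain a repeat are redrawn. The function name `sample_positions` and the `seed` parameter are hypothetical, introduced only for this example.

```python
import numpy as np

def sample_positions(iterations, chr_size, num_mut, seed=None):
    """Return an (iterations, num_mut) array whose rows each hold distinct positions.

    Hypothetical helper: rejection sampling, reasonable only when
    num_mut is much smaller than chr_size.
    """
    if num_mut > chr_size:
        raise ValueError("cannot draw more distinct positions than chr_size")
    rng = np.random.default_rng(seed)
    # Draw every row at once, allowing repeats for now.
    out = rng.integers(0, chr_size, size=(iterations, num_mut))
    while True:
        # After sorting a row, a repeat shows up as two equal neighbours.
        s = np.sort(out, axis=1)
        bad = (s[:, 1:] == s[:, :-1]).any(axis=1)
        if not bad.any():
            return out
        # Redraw only the rows that contained a repeat.
        out[bad] = rng.integers(0, chr_size, size=(int(bad.sum()), num_mut))

randbps = sample_positions(iterations=1000, chr_size=1000000, num_mut=500, seed=0)
```

Each row of `randbps` then plays the role of one `listed` array from the loop above. With 500 positions out of 1000000, only a minority of rows should need a redraw, so the loop is expected to finish after a few passes.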
Further context: later in the script I go through each randomized chromosome and count the number of mutations in a given window:
for l in range(len(randbps)):
    arr=np.asarray(randbps[l])
    for i in range(chr_last_window[f])[::step]:
        # number of mutations strictly inside the window (i, i+window)
        counter=((i < arr) & (arr < i+window)).sum()
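The inner loop over windows could possibly be vectorized as well. The sketch below sorts the positions once and uses `np.searchsorted` to count, for every window start at the same time, the positions strictly between `start` and `start + window`, which matches the comparison in the loop above. The function name `window_counts` and the example values are hypothetical; in the real script the starts would come from `range(chr_last_window[f])[::step]`.

```python
import numpy as np

def window_counts(positions, starts, window):
    """For each start, count positions p with start < p < start + window."""
    arr = np.sort(np.asarray(positions))
    starts = np.asarray(starts)
    # Number of positions strictly below the upper edge of each window.
    upper = np.searchsorted(arr, starts + window, side="left")
    # Number of positions at or below the lower edge of each window.
    lower = np.searchsorted(arr, starts, side="right")
    return upper - lower

# Small made-up example: window starts 0, 2, 4, 6, 8 with window size 4.
example_positions = np.array([5, 1, 9, 3, 7])
example_starts = np.arange(0, 10, 2)
example_counts = window_counts(example_positions, example_starts, 4)
```

This replaces one boolean mask per window with two binary searches per window, at the cost of one sort per chromosome.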