Monday, May 10, 2021

Eliminating repeats when using the random module in Python

I have some code that generates a random sequence:

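import random
# Desired count of each base; the counts sum to 100
selection60 = {"A": 20, "T": 20, "G": 30, "C": 30}
sseq60 = []
for k in selection60:
    # Append each base the requested number of times
    sseq60 = sseq60 + [k] * int(selection60[k])
    random.shuffle(sseq60)
# Drawing all 100 items without replacement gives a random permutation
sequence = "".join(random.sample(sseq60, 100))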

The output in this case is:

GACCCCTCTGTACTATTAAAAGGCGTCACCGCGCCGAAAGAGCTGCAAGGCAATAGTGGACCAGAATCAAACGAAGGATTGCTTAGGTAATGGAATACAA

However, I would also like to check that no run of more than 10 identical bases is created. For example:

GACCCCCCCCCCCTATTAAAAGGCGTCATCGCGCCGAAAGAGTTGCAAGGCAATAGTGGAGCAGAATTAAACGAAGGATTGCTTAGGTAATGGAATAAAA

This sequence contains 11 Cs at the beginning, which should not be allowed. The distribution of the letters should be uniform. Does the random.sample function take care of this by itself, or does it need to be implemented?
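
For what it's worth, random.sample only draws items without replacement; it does not look at run lengths, so the check has to be added separately. One possible approach is rejection sampling: shuffle the pool, measure the longest run, and reshuffle if the run is too long. The sketch below assumes that "repeat" means a run of identical consecutive bases. The names longest_run, shuffled_sequence, max_run and max_tries are hypothetical helpers made up for this example, not part of the random module.

```python
import random
from itertools import groupby


def longest_run(seq):
    """Return the length of the longest stretch of identical consecutive items."""
    return max((sum(1 for _ in group) for _, group in groupby(seq)), default=0)


def shuffled_sequence(counts, max_run=10, max_tries=10_000):
    """Shuffle a pool with fixed letter counts until no run exceeds max_run.

    Hypothetical helper: raises RuntimeError if no valid shuffle is found
    within max_tries attempts.
    """
    # Build the pool: each base repeated the requested number of times
    pool = [base for base, n in counts.items() for _ in range(n)]
    for _ in range(max_tries):
        random.shuffle(pool)
        if longest_run(pool) <= max_run:
            return "".join(pool)
    raise RuntimeError("no valid sequence found; the constraints may be too tight")


selection60 = {"A": 20, "T": 20, "G": 30, "C": 30}
sequence = shuffled_sequence(selection60)
print(sequence, longest_run(sequence))
```

Because every shuffle is equally likely and invalid ones are simply discarded, each valid arrangement remains equally likely, and the letter counts stay exactly as given in selection60. With these counts a run longer than 10 should be rare, so the loop will usually accept the first shuffle.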



