Monday, October 19, 2020

Python: replacing multiple positions in a string, each with multiple letters

I am trying to write Python code that finds restriction enzyme sites within a DNA sequence. Restriction enzymes cut at specific DNA sequences, but some are less strict. For example, XmnI cuts this sequence:

GAANNNNTTC

Where N can be any nucleotide (A, C, G, or T). If my math is right, that's 4^4 = 256 unique sequences that it can cut. I want to make a list of these 256 short sequences, then check each one against a (longer) input DNA sequence. However, I'm having a hard time generating the 256 sequences. Here's what I have so far:

cutsequencequery = "GAANNNNTTC"
Nseq = ["A", "C", "G", "T"]
querylist = []
if "N" in cutsequencequery:
    Nlist = [cutsequencequery.replace("N", t) for t in Nseq]
    for j in list(Nlist):
        querylist.append(j)

for i in querylist:
    print(i)
print(len(querylist))

And here is the output:

GAAAAAATTC
GAACCCCTTC
GAAGGGGTTC
GAATTTTTTC
4

So it's switching every N to the same letter (A, C, G, or T) at once, but I think I need another loop (or 3?) to generate all 256 combinations. Is there an efficient way to do this that I'm not seeing?
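
One possible approach, sketched here under the assumption that N is the only ambiguity code in the pattern: build one pool of choices per position and let the standard library's itertools.product enumerate every combination. The names expand_n, NUCLEOTIDES, and pools are made up for illustration and are not from the code above.

```python
from itertools import product

NUCLEOTIDES = "ACGT"  # assumed alphabet; N stands for any one of these


def expand_n(pattern, wildcard="N", alphabet=NUCLEOTIDES):
    """Return every concrete sequence obtained by filling each wildcard independently."""
    # One pool of choices per position: the whole alphabet for a wildcard,
    # or just the fixed letter itself otherwise.
    pools = [alphabet if base == wildcard else base for base in pattern]
    # product(*pools) yields one tuple of letters per combination.
    return ["".join(combo) for combo in product(*pools)]


querylist = expand_n("GAANNNNTTC")
print(len(querylist))  # 256
print(querylist[0], querylist[-1])  # GAAAAAATTC GAATTTTTTC
```

Because each N gets its own pool, the four positions vary independently, which str.replace cannot do since it swaps every N for the same letter. Checking the results against a longer sequence could then be as simple as testing `q in dna` for each q in querylist, where dna is a hypothetical input string.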



