I am trying to write Python code that finds restriction enzyme sites within a DNA sequence. Restriction enzymes cut at specific DNA sequences, but some are less strict. For example, XmnI cuts this sequence:
GAANNNNTTC
Where N can be any nucleotide (A, C, G, or T). If my math is right, that's 4^4 = 256 unique sequences that it can cut. I want to make a list of these 256 short sequences, then check each one against a (longer) input DNA sequence. However, I'm having a hard time generating the 256 sequences. Here's what I have so far:
    cutsequencequery = "GAANNNNTTC"
    Nseq = ["A", "C", "G", "T"]
    querylist = []
    if "N" in cutsequencequery:
        Nlist = [cutsequencequery.replace("N", t) for t in Nseq]
        for j in list(Nlist):
            querylist.append(j)
    for i in querylist:
        print(i)
    print(len(querylist))
And here is the output:
GAAAAAATTC
GAACCCCTTC
GAAGGGGTTC
GAATTTTTTC
4
So it's replacing every N with the same letter (A, C, G, or T) at once, but I think I need another loop (or 3?) to generate all 256 combinations. Is there an efficient way to do this that I'm not seeing?
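For reference, here is a minimal sketch of one possible way to fill each N independently, assuming the standard-library `itertools.product` is acceptable. The function name `expand_n` and its `alphabet` parameter are hypothetical, introduced only for illustration. The idea is that `str.replace` swaps all four N positions for the same letter, whereas `product` yields every 4-letter combination.

```python
from itertools import product


def expand_n(pattern, alphabet="ACGT"):
    """Return every sequence obtained by filling each N in pattern independently.

    expand_n is a hypothetical helper name, not part of any library.
    """
    n_count = pattern.count("N")
    # Turn each N into a positional placeholder, e.g. "GAA{}{}{}{}TTC".
    template = pattern.replace("N", "{}")
    # product(alphabet, repeat=n_count) yields all len(alphabet)**n_count tuples.
    return [template.format(*combo) for combo in product(alphabet, repeat=n_count)]


querylist = expand_n("GAANNNNTTC")
print(len(querylist))  # 256
```

With the list built, something like `[q for q in querylist if q in dna]` would report which variants occur in a longer sequence, where `dna` is an assumed variable holding the input string. If only the match positions matter, a regular expression such as `GAA[ACGT]{4}TTC` with the `re` module may avoid building the list at all.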