I have a set of strings, each several million characters long. I want to split them into substrings of random length, which I can already do without any issue.
However, my question is: how can I apply some sort of weight to the choice of substring length? My code runs in Python 3, so I would like to find a Pythonic solution. In detail, my aim is to:
- split the strings into substrings ranging in length from 1e4 (10,000) to 8e6 (8,000,000) characters.
- have the script choose a short length (near 1e4) more often than a long length (near 8e6) for the newly generated substrings, i.e. a descending likelihood gradient over length.
Thanks for the help!
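
One possible approach, sketched here under assumptions rather than as a definitive answer: draw each length from a log-uniform distribution, whose density falls off as 1/length, so short lengths come up more often than long ones. The names `random_length` and `split_weighted` are hypothetical helpers invented for this illustration, and the choice of a log-uniform shape is an assumption, since the question does not say how steep the gradient should be.

```python
import math
import random

MIN_LEN = 10_000      # 1e4, lower bound from the question
MAX_LEN = 8_000_000   # 8e6, upper bound from the question


def random_length(rng, lo=MIN_LEN, hi=MAX_LEN):
    """Draw an integer length in [lo, hi] with density proportional to 1/length."""
    # Uniform in log-space means short lengths are more likely than long ones.
    value = math.exp(rng.uniform(math.log(lo), math.log(hi)))
    # Clamp to guard against floating-point rounding at the edges.
    return min(hi, max(lo, int(value)))


def split_weighted(text, rng=None, lo=MIN_LEN, hi=MAX_LEN):
    """Yield consecutive substrings of text with weighted random lengths.

    The last piece is whatever remains, so it may be shorter than lo.
    """
    if rng is None:
        rng = random.Random()
    pos = 0
    while pos < len(text):
        length = random_length(rng, lo, hi)
        yield text[pos:pos + length]
        pos += length
```

It could be used as `pieces = list(split_weighted(my_string))`. If a linear rather than 1/length gradient is preferred, `int(random.triangular(lo, hi, lo))` (mode set to the lower bound) gives a density that decreases in a straight line from `lo` to `hi`.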