I wrote a word-splitting function. It splits a word into pieces at random positions. For example, if the input is 'runtime', each of the outputs below is possible:
['runtime']
['r','untime']
['r','u','n','t','i','m','e'] ....
But its runtime is very high when I want to split 100k words. Do you have any suggestions to optimize it or write it smarter?
def random_multisplitter(word):
    from numpy import mod
    spw = []
    length = len(word)
    rand = random_int(word)
    if rand == length:  # probability of not splitting
        return [word]
    else:
        div = mod(rand, (length + 1))  # defining division points
        bound = length - div
        spw.append(div)
        while div != 0:
            rand = random_int(word)
            div = mod(rand, (bound + 1))
            bound = bound - div
            spw.append(div)
        result = spw
        b = 0
        points = []
        for x in range(len(result) - 1):  # calculating splitting points
            b = b + result[x]
            points.append(b)
        xy = 0
        t = []
        for i in points:
            t.append(word[xy:i])
            xy = i
        if word[xy:len(word)] != '':
            t.append(word[xy:len(word)])
        if type(t) != list:
            return [t]
        return t
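
One possible direction, sketched under assumptions rather than as a drop-in replacement: draw a single random bitmask and cut the word wherever a bit is set. This assumes it is acceptable for every possible split to be equally likely, which is not the same distribution the function above gets from random_int. The name random_split and the rng parameter are made up for this sketch. It uses only the standard library's random module, so it also avoids calling numpy's mod on plain Python integers inside a loop, which is likely one source of overhead (the built-in % does the same job on scalars).

```python
import random


def random_split(word, rng=random):
    """Split word at a uniformly random subset of the gaps between characters.

    Hypothetical helper: every one of the 2**(len(word) - 1) possible
    splits is equally likely, unlike the original random_int-driven scheme.
    """
    n = len(word)
    if n < 2:
        # Nothing to cut in an empty or single-character word.
        return [word]
    # One random draw per word: bit i set means "cut after character i".
    mask = rng.getrandbits(n - 1)
    pieces = []
    start = 0
    for i in range(n - 1):
        if (mask >> i) & 1:
            pieces.append(word[start:i + 1])
            start = i + 1
    # The remainder after the last cut is always non-empty.
    pieces.append(word[start:])
    return pieces
```

For a list of 100k words this could be called as [random_split(w) for w in words], where words stands for whatever list is being split.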