Tuesday, February 24, 2015

How to get the indices for randomly selected rows in a list (Python)

Okay, I don't know if I phrased it badly or something, but I can't seem to find anything similar here for my problem.


So I have a 2D list, where each row represents a case and each column represents a feature (for machine learning). In addition, I have a separate list (a single column) holding the labels.


I want to randomly select rows from the 2D list to train a classifier and use the remaining rows to test its accuracy. So I need to know the indices of all the rows I used for training, to avoid reusing them for testing.


I think there are two parts to the question: 1) how to randomly select rows, and 2) how to get their indices.
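
One way to handle both parts at once, as a minimal sketch: sample the row indices rather than the rows themselves, so the indices used for training are known by construction and the rest can go to testing. The helper name `split_indices`, the toy data, and the 4/2 split are assumptions for illustration, not part of the original code.

```python
import random

def split_indices(n_rows, n_train, seed=None):
    """Return (train_idx, test_idx): disjoint, sorted lists of row indices."""
    rng = random.Random(seed)
    # sample() draws without replacement, so no index is picked twice
    train_idx = sorted(rng.sample(range(n_rows), n_train))
    train_set = set(train_idx)
    # every index that was not sampled goes to the test set
    test_idx = [i for i in range(n_rows) if i not in train_set]
    return train_idx, test_idx

# Toy stand-ins for the real data (hypothetical values)
feature_list = [[1, 0, 0], [0, 1, 0], [0, 0, 1],
                [1, 1, 0], [0, 1, 1], [1, 0, 1]]
labels = ['a', 'b', 'a', 'b', 'a', 'b']

train_idx, test_idx = split_indices(len(feature_list), 4, seed=0)

# Use the same indices on both lists so rows and labels stay aligned
train_X = [feature_list[i] for i in train_idx]
train_y = [labels[i] for i in train_idx]
test_X = [feature_list[i] for i in test_idx]
test_y = [labels[i] for i in test_idx]
```

Because the same index lists are applied to both `feature_list` and `labels`, each training row keeps its label, and no row can end up in both sets.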


Again, I have no idea why I can't find good info here by searching (maybe I just suck).


Sorry, I'm still new to the community, so I might have made a lot of formatting mistakes. If you have any suggestions, please let me know.


Here's the part of the code I'm using to build the 2D list:



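# Assumed imports: Python 2 with BeautifulSoup 4 (the original snippet omits them)
import urllib2
from bs4 import BeautifulSoup

# 273 = number of cases
# Build each row separately; [[0]*n]*273 would repeat one shared row 273 times
feature_list = [[0] * len(mega_list) for _ in range(273)]
# create counters to use as indices later
link_count = 0
feature_count = 0
#print len(mega_list)
for link in url_list[:-1]:

    # set up the url
    samp_url = 'http://ift.tt/1zEgj4L' + link
    samp_url = "%20".join(samp_url.split())

    # soup it for keywords
    samp_soup = BeautifulSoup(urllib2.urlopen(samp_url).read())
    keywords = samp_soup.find('meta')['content']
    keywords = keywords.split(',')

    for keys in keywords:
        #print 'megalist: '+ str(mega_list.index(keys))
        if keys in mega_list:
            feature_list[link_count][mega_list.index(keys)] = 1

    # move on to the next row for the next link
    link_count += 1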


mega_list: a list with all keywords


feature_list: the 2D list; in each row (case), the cell for a mega_list keyword is set to 1 if that keyword appears in the case, otherwise 0
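
To make the two descriptions above concrete, here is a small sketch that builds the same kind of 0/1 rows without any network access. The helper name `build_row` and the keyword values are hypothetical, and the sketch assumes mega_list contains no duplicate keywords.

```python
def build_row(keywords, mega_list):
    """Return a 0/1 row: 1 where the mega_list keyword appears in keywords."""
    # Map each keyword to its column once instead of calling .index() repeatedly
    position = {word: i for i, word in enumerate(mega_list)}
    row = [0] * len(mega_list)  # a fresh list for every case
    for key in keywords:
        if key in position:     # keywords not in mega_list are ignored
            row[position[key]] = 1
    return row

# Hypothetical data: four known keywords, three cases
mega_list = ['python', 'list', 'random', 'index']
cases = [['python', 'list'], ['random'], ['index', 'python', 'unknown']]

feature_list = [build_row(keywords, mega_list) for keywords in cases]
# feature_list == [[1, 1, 0, 0], [0, 0, 1, 0], [1, 0, 0, 1]]
```

Each call to `build_row` returns its own list, so setting a cell for one case never changes another case's row.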




