Friday, 16 June 2017

How to randomly select parts of matrices on the same indexes?

I have three matrices that are linked by shared indexes: ratings has the same row index as side_subscriber and the same column index as side_eclipse. I need to extract some data to train an algorithm (a recommender system based on matrix factorization, taken from Ethan Rosenthal's blog post on matrix recommendation systems, which I'm modifying slightly so that it can predict on my own data), but I need the random selection to use the same indexes in all three matrices. How can I do that? I am not sure that the following code does the job:

import numpy as np

# For the given user, sample 10 non-zero column positions (with replacement)
# from that user's row in each matrix.
test_ratings = np.random.choice(ratings[user, :].nonzero()[0],
                                size=10,
                                replace=True)
test_side_subscriber = np.random.choice(side_subscriber[user, :].nonzero()[0],
                                        size=10,
                                        replace=True)
test_side_eclipse = np.random.choice(side_eclipse[user, :].nonzero()[0],
                                     size=10,
                                     replace=True)

Some data, if needed:

side_subscriber:

hashtag_id     321   322   323   324   325   326   327   328    329   333   \
subscriber_id                                                                
54              0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0    0.0   0.0   
150             0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0    0.0   0.0   
152             0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0    0.0   0.0   
156             0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0    0.0   0.0   
161            44.0   0.0   0.0   0.0   0.0   0.0   0.0  26.0   96.0   0.0   
172            66.0  75.0  18.0   0.0  96.0   0.0   0.0   0.0  144.0   0.0   
185             0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0    0.0   0.0   
190             0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0    0.0   0.0   
292             0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0    0.0   0.0   
294             0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0    0.0   0.0   
298             0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0    0.0   0.0   
365             0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0    0.0   0.0   
372             0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0    0.0   0.0   
375             0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0    0.0   0.0   
378             0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0    0.0   0.0   
402             0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0    0.0   0.0   
407             0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0    0.0   0.0   
410             0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0    0.0   0.0   
412             0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0    0.0   0.0   
413             0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0    0.0   0.0 

side_eclipse:

hashtag_id  321   322   323   324   325   326   327   328   329   333   ...   \
eclipse_id                                                              ...    
6521           1     1     1     0     0     0     0     0     0     0  ...    
6527           0     0     0     1     1     1     1     0     0     0  ...    
6530           0     0     0     0     0     0     0     1     1     0  ...    
6532           0     0     0     0     0     0     0     1     1     0  ...    
6547           0     0     0     0     1     0     0     0     0     1  ...    
6553           0     0     0     0     0     0     0     0     0     0  ...    
6557           0     0     0     0     0     0     0     0     0     0  ...    
6559           0     0     0     0     0     0     0     0     0     0  ...    
6577           0     0     0     0     1     0     0     0     0     0  ...    
6580           0     0     0     0     1     0     0     0     0     0  ...    
6583           0     0     0     0     1     0     0     0     0     0  ...    
6588           0     0     0     0     0     0     0     0     0     0  ...    
6591           0     0     0     0     0     0     0     0     0     0  ...    
6606           1     1     0     0     0     0     0     0     0     0  ...    
6609           1     1     0     0     0     0     0     0     0     0  ...    
6612           0     0     0     0     0     0     0     0     0     0  ...    
6617           1     1     0     0     0     0     0     0     0     0  ...    
6620           0     0     0     0     0     0     0     0     0     0  ...    
6635           0     0     0     0     0     0     0     0     0     0  ...    
6638           0     0     0     0     1     0     0     0     0     0  ...    
6641           0     0     0     0     1     0     0     0     0     0  ...    
6649           1     1     0     0     0     0     0     0     0     0  ...    
6664           0     0     0     0     0     0     0     0     0     0  ...    
6667           0     0     0     0     0     0     0     0     0     0  ...    
6670           0     0     0     0     0     0     0     0     0     0  ...    
6675           0     0     0     0     0     0     0     0     0     0  ...    
6679           0     0     0     0     0     0     0     0     0     0  ...    
6694           0     0     0     0     0     0     0     0     0     0  ...    
6697           0     0     0     0     0     0     0     0     0     0  ...    
6701           0     0     0     0     0     0     0     0     0     0  ...   

And, finally, ratings:

hashtag_id  2126  2206  2268  2270  2271  2272  2470  2533  2545  2546  
eclipse_id                                                              
6521           0     0     0     0     0     0     0     0     0     0  
6527           0     0     0     0     0     0     0     0     0     0  
6530           0     0     0     0     0     0     0     0     0     0  
6532           0     0     0     0     0     0     0     0     0     0  
6547           0     0     0     0     0     0     0     0     0     0  
6553           0     0     0     0     0     0     0     0     0     0  
6557           0     0     0     0     0     0     0     0     0     0  
6559           0     0     0     0     0     0     0     0     0     0  
6577           0     0     0     0     0     0     0     0     0     0  
6580           0     0     0     0     0     0     0     0     0     0  
6583           0     0     0     0     0     0     0     0     0     0  
6588           0     0     0     0     0     0     0     0     0     0  
6591           0     0     0     0     0     0     0     0     0     0  
6606           0     0     0     0     0     0     0     0     0     0  
6609           0     0     0     0     0     0     0     0     0     0  
6612           0     0     0     0     0     0     0     0     0     0  
6617           0     0     0     0     0     0     0     0     0     0  
6620           0     0     0     0     0     0     0     0     0     0  
6635           0     0     0     0     0     0     0     0     0     0  
6638           0     0     0     0     0     0     0     0     0     0  
6641           0     0     0     0     0     0     0     0     0     0  
6649           0     0     0     0     0     0     0     0     0     0  
6664           0     0     0     0     0     0     0     0     0     0  
6667           0     0     0     0     0     0     0     0     0     0  
6670           0     0     0     0     0     0     0     0     0     0  
6675           0     0     0     0     0     0     0     0     0     0  
6679           0     0     0     0     0     0     0     0     0     0  
6694           0     0     0     0     0     0     0     0     0     0  
6697           0     0     0     0     0     0     0     0     0     0  
6701           0     0     0     0     0     0     0     0     0     0  
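
For what it's worth, three separate np.random.choice calls each draw their own positions, so the three selections are not guaranteed to line up. One possible approach is to draw the positions once and reuse them for every matrix. The sketch below is only an illustration under assumptions: the small arrays are made-up stand-ins for the real data, ratings is assumed to share its row positions with side_subscriber and its column positions with side_eclipse, and sample_shared_columns is a hypothetical helper, not something from the blog post. It also samples without replacement so that no position is picked twice, which differs from replace=True in the code above.

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded only so the sketch is reproducible

# Made-up stand-ins for the real matrices (shapes are assumptions):
# ratings and side_subscriber share row positions,
# ratings and side_eclipse share column positions.
ratings = np.array([[0, 3, 0, 5, 1, 0],
                    [2, 0, 0, 4, 0, 1],
                    [0, 0, 0, 0, 2, 3]])
side_subscriber = np.arange(12).reshape(3, 4)  # 3 rows, like ratings
side_eclipse = np.arange(24).reshape(4, 6)     # 6 columns, like ratings


def sample_shared_columns(ratings, user, size, rng):
    """Hypothetical helper: pick distinct non-zero column positions for one user."""
    candidates = ratings[user, :].nonzero()[0]
    size = min(size, candidates.size)  # never ask for more than is available
    return rng.choice(candidates, size=size, replace=False)


user = 0
test_cols = sample_shared_columns(ratings, user, size=2, rng=rng)

# Reuse the same positions everywhere so the selections stay aligned.
test_ratings = ratings[user, test_cols]
test_side_eclipse = side_eclipse[:, test_cols]
test_side_subscriber = side_subscriber[user, :]
```

Because test_cols is computed once, the selected entries of ratings and the selected columns of side_eclipse refer to the same positions. If the real matrices are pandas DataFrames rather than NumPy arrays, the same idea should apply with .iloc or after converting with .to_numpy().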



