Friday, 24 July 2015

Modify values in random rows of a data.table by category

I have a data.table with two columns, say city and score:

library(data.table)
data.table(city = sample(c("Cape Town", "New York", "Tel Aviv"), size = 15, replace = TRUE), score = sample(x = 1:10, size = 15, replace = TRUE))
         city score
 1:  Tel Aviv     5
 2:  New York     5
 3:  New York     8
 4: Cape Town    10
 5:  Tel Aviv     7
 6:  New York    10
 7:  Tel Aviv     8
 8: Cape Town     2
 9:  Tel Aviv     2
10: Cape Town     2
11: Cape Town     5
12:  New York     1
13:  Tel Aviv     3
14: Cape Town     6
15:  New York     5

I want to set the score to 0 in two random rows per city (i.e., two rows for Tel Aviv, two for New York, etc.). Note that there will always be more than two rows for every city (my real data are quite large). Ideally, I would like a solution based on data.table commands. Thank you!
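One possible approach, offered as a minimal sketch rather than a definitive answer: sample two global row numbers per city with the special symbol `.I`, then update those rows by reference with `:=`. The names `dt` and `idx`, the fixed seed, and the evenly split `city` column are assumptions made only for illustration; they are not part of the data above.

```r
library(data.table)

set.seed(42)  # assumption: fixed seed only so the sketch is reproducible

# Illustrative data: five rows per city, so every group has at least two rows
dt <- data.table(
  city  = rep(c("Cape Town", "New York", "Tel Aviv"), each = 5),
  score = sample(1:10, size = 15, replace = TRUE)
)

# .I holds the row numbers of dt; within each city, draw two of them at random
idx <- dt[, .(row = sample(.I, 2L)), by = city]$row

# Set score to 0 for the sampled rows, modifying dt in place
dt[idx, score := 0L]
```

This relies on every city having at least two rows, which the question guarantees. With a single-row group, `sample(.I, 2L)` would instead draw from `1:.I`, so that case would need separate handling.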



