I have data where each row is a person. I want to make a randomly generated unique ID, so I can identify them in analysis.
Here is a sample data frame:
df <- data.frame(
  # 5 values x 10000 repetitions = 50000 rows, matching the other columns
  gender = rep(c("M", "F", "M", "M", "F"), times = 10000),
  qtr = sample(1:99, 50000, replace = TRUE),
  result = sample(100:1000, 50000, replace = TRUE)
)
To generate a unique ID, I am using stringi:
library(stringi)
library(magrittr)
library(dplyr)  # mutate() comes from dplyr, not tidyr
df <- df %>%
  mutate(UniqueID = do.call(paste0, Map(stri_rand_strings, n = 50000, length = c(2, 6),
                                        pattern = c('[A-Z]', '[0-9]'))))
However, when I check whether the new variable UniqueID is unique by running the code below, the result is less than 50000, so there are some duplicates:
length(unique(unlist(df[c("UniqueID")])))
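A few duplicates are not surprising here: two capital letters followed by six digits allow only 26^2 * 10^6 = 676,000,000 distinct IDs, and drawing 50,000 of them independently is a birthday-problem situation. The sketch below uses a small made-up vector `ids` as a stand-in for df$UniqueID (it is not my real data) to show a more direct duplicate check with base R's anyDuplicated() and duplicated(), and then computes the expected number of colliding pairs for my case.

```r
# Hypothetical stand-in for df$UniqueID: four IDs, one of them repeated
ids <- c("AB123456", "CD654321", "AB123456", "EF000001")

# anyDuplicated() returns 0 if all values are distinct,
# otherwise the index of the first value that repeats an earlier one
first_dup <- anyDuplicated(ids)

# duplicated() flags every value that repeats an earlier one
n_dups <- sum(duplicated(ids))

# Expected number of colliding pairs when n IDs are drawn
# independently and uniformly from N possible IDs: choose(n, 2) / N
n <- 50000
N <- 26^2 * 10^6
expected_pairs <- n * (n - 1) / (2 * N)
```

With these numbers, expected_pairs comes out a little under 2, which is consistent with seeing a handful of duplicates in 50,000 rows.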
Is there a way to generate a unique ID which is truly unique, with no duplicates?
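One approach that might work, given here only as a minimal sketch: instead of drawing each ID independently, sample distinct integers from the whole ID space without replacement and decode each integer into the two-letter, six-digit format. The helper name `make_unique_ids` is made up for illustration and is not part of stringi or any other package; it relies only on base R (sample.int, LETTERS, sprintf).

```r
# Hypothetical helper: draws n distinct integers from the 26^2 * 10^6 possible
# IDs and decodes each one into two capital letters followed by six digits.
make_unique_ids <- function(n) {
  space <- 26^2 * 10^6                 # number of distinct IDs available
  stopifnot(n <= space)                # cannot draw more distinct IDs than exist

  # Sampling without replacement (the default) means no integer is repeated
  codes <- sample.int(space, n) - 1L   # shift to the range 0 .. space - 1

  letter_part <- codes %/% 10^6        # 0 .. 675, encodes the two letters
  digit_part  <- codes %% 10^6         # 0 .. 999999, encodes the six digits

  paste0(
    LETTERS[letter_part %/% 26 + 1],
    LETTERS[letter_part %% 26 + 1],
    sprintf("%06d", as.integer(digit_part))
  )
}

ids <- make_unique_ids(50000)
anyDuplicated(ids)  # 0 when every ID is distinct
```

Because each integer maps to exactly one ID string, distinct integers give distinct IDs. Under that assumption, something like df$UniqueID <- make_unique_ids(nrow(df)) would fill the column.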
I have seen these questions, but they don't explain how to make the generated random values unique:
Generating unique random numbers in dataframe column in R
Create a dataframe with random numbers in each column
Thanks