Thursday, June 1, 2017

How to repeat a random, stratified subsampling of a data frame 100 times and save each result under a unique file name

I am new to R and am trying to create multiple subsamples of a data frame. My data are assigned to 4 strata (STRATUM = 1, 2, 3, 4), and I want to randomly keep only a specified number of rows in each stratum. To achieve this, I import my data, sort by the stratification value, then assign a random number to each row. I need to reuse the original random number assignments in future analyses, so I save a .csv with these values. Next, I subset the data by stratum and keep only the number of records I want to retain in each one. Finally, I rejoin the subsets and save the result as a new .csv. The code works; however, I want to repeat this process 100 times. In each iteration I want to save both the .csv with the random numbers assigned and the final .csv of randomly selected plots. I am unsure how to get this block of code to repeat 100 times, and how to assign a unique file name for each iteration. Any help would be much appreciated.

DataFiles <- "//Documents/flownData_JR.csv"
PlotsFlown <- read.table(file = DataFiles, header = TRUE, sep = ",")
#Sort the data by the stratification
FlownStratSort <- PlotsFlown[order(PlotsFlown$STRATUM),]
#Create a new column with a random number (no duplicates)
#sample(n) returns a random permutation of 1..n, so the row count is not hard-coded
FlownStratSort$RAND_NUM <- sample(nrow(FlownStratSort))
#Sort by the stratum, then random number
FLOWNRAND <- FlownStratSort[order(FlownStratSort$STRATUM, FlownStratSort$RAND_NUM), ]
#Save a csv file with the random numbers
write.table(FLOWNRAND, file = "//Documents/RANDNUM1_JR.csv", sep = ",", row.names = FALSE, col.names = TRUE)
#Subset the data by stratum
FLOWNRAND1 <- FLOWNRAND[which(FLOWNRAND$STRATUM=='1'),]
FLOWNRAND2 <- FLOWNRAND[which(FLOWNRAND$STRATUM=='2'),]
FLOWNRAND3 <- FLOWNRAND[which(FLOWNRAND$STRATUM=='3'),]
FLOWNRAND4 <- FLOWNRAND[which(FLOWNRAND$STRATUM=='4'),]
#Remove data from each stratum, specifying the number of records we want to retain
#(head() returns at most n rows, so a short stratum is not padded with NA rows)
FLOWNRAND1 <- head(FLOWNRAND1, 34)
FLOWNRAND2 <- head(FLOWNRAND2, 21)
FLOWNRAND3 <- head(FLOWNRAND3, 7)
FLOWNRAND4 <- head(FLOWNRAND4, 7)
#Rejoin the data
FLOWNRAND_uneven <- rbind(FLOWNRAND1, FLOWNRAND2, FLOWNRAND3, FLOWNRAND4)
#Save the table with plots removed from each stratum flown in 2017
write.table(FLOWNRAND_uneven, file = "//Documents/Flown_RAND_uneven_JR.csv", sep = ",", row.names = FALSE, col.names = TRUE)
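
One possible way to repeat the block above 100 times is to wrap the steps in a function and call it from a for loop, building each file name from the iteration number with sprintf(). This is only a minimal sketch under stated assumptions: the helper subsample_once(), the keep vector, the out_dir folder, the PLOT_ID column, the numbered file name pattern and the toy data frame are all hypothetical stand-ins and not part of the original script. The real data would still be read with read.table() as above.

```r
# Hypothetical helper: one iteration of the subsampling described above.
# `keep` maps each STRATUM value (as a name) to the number of rows to retain.
subsample_once <- function(plots, keep, i, out_dir) {
  # A random permutation of 1..n gives every row a unique random number
  plots$RAND_NUM <- sample(nrow(plots))
  plots <- plots[order(plots$STRATUM, plots$RAND_NUM), ]

  # Unique file name from a zero-padded iteration number, e.g. RANDNUM_001_JR.csv
  rand_file <- file.path(out_dir, sprintf("RANDNUM_%03d_JR.csv", i))
  write.csv(plots, rand_file, row.names = FALSE)

  # Keep the first keep[[s]] rows of each stratum, then rejoin the pieces
  parts <- lapply(names(keep), function(s) {
    head(plots[plots$STRATUM == s, ], keep[[s]])
  })
  uneven <- do.call(rbind, parts)

  out_file <- file.path(out_dir, sprintf("Flown_RAND_uneven_%03d_JR.csv", i))
  write.csv(uneven, out_file, row.names = FALSE)
  invisible(uneven)
}

# Toy stand-in for the real data (assumption: 137 rows, STRATUM in 1..4)
set.seed(1)  # optional: makes the 100 random draws reproducible
PlotsFlown <- data.frame(PLOT_ID = 1:137,
                         STRATUM = rep(1:4, times = c(60, 40, 20, 17)))
keep <- c("1" = 34, "2" = 21, "3" = 7, "4" = 7)
out_dir <- tempdir()  # assumption: replace with the real output folder

for (i in 1:100) {
  subsample_once(PlotsFlown, keep, i, out_dir)
}
```

Each pass writes two files, one with the random numbers and one with the retained plots, so the random assignments from any iteration can be reloaded later. Zero-padding the counter with %03d keeps the files in numeric order when sorted by name.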



