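Stratified Sampling in R
The sampleBy function in the doBy package was useful in R, but it is no longer supported.
So I wrote a replacement, sample.by, shown below. It draws a random sample of a given proportion from each group of a column (stratified sampling).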
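## usage : sample.by(df, by.col.loc, prop)
##   df         : data frame to sample from
##   by.col.loc : position (index) of the grouping column
##   prop       : proportion of each group to put in the sample
## returns a list of sample.df (sampled rows) and rest.df (remaining rows)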
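sample.by <- function(df, by.col.loc=1, prop=0.1) {
  sample.df <- data.frame()
  rest.df <- data.frame()
  ## one data frame per level of the grouping column (empty levels dropped)
  dat <- split(df, df[[by.col.loc]], drop=TRUE)
  for(i in seq_along(dat)) {
    n.rows <- NROW(dat[[i]])
    ## TRUE for round(n.rows * prop) randomly chosen rows of this group;
    ## sample.int avoids the sample() surprise with length-one vectors
    picked <- seq_len(n.rows) %in% sample.int(n.rows, round(n.rows * prop))
    sample.df <- rbind(sample.df, dat[[i]][picked, , drop=FALSE])
    rest.df <- rbind(rest.df, dat[[i]][!picked, , drop=FALSE])
  }
  ## shuffle the rows so the groups are not returned in blocks
  list(sample.df=sample.df[sample.int(NROW(sample.df)), , drop=FALSE],
       rest.df=rest.df[sample.int(NROW(rest.df)), , drop=FALSE])
}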
## example : put 70% of the rows of each Species (column 5) of iris in sample.df
sample.by(iris, 5, 0.7)
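
The same idea can also be written by sampling row numbers per group instead of growing data frames with rbind. The following is only a rough sketch, not a drop-in replacement: the name stratified.split is hypothetical (it is not from doBy), the seed is arbitrary, and unlike sample.by it keeps the rows in their original order rather than shuffling them.

```r
## Hypothetical helper (name is an assumption, not from any package):
## stratified split based on row numbers, base R only.
stratified.split <- function(df, by.col.loc = 1, prop = 0.1) {
  ## row numbers of df, grouped by the levels of the chosen column
  rows.by.group <- split(seq_len(NROW(df)), df[[by.col.loc]], drop = TRUE)
  picked <- unlist(lapply(rows.by.group, function(rows) {
    ## index into rows explicitly so a one-row group is handled correctly
    rows[sample.int(length(rows), round(length(rows) * prop))]
  }), use.names = FALSE)
  ## TRUE for every row of df that was drawn into the sample
  is.picked <- seq_len(NROW(df)) %in% picked
  list(sample.df = df[is.picked, , drop = FALSE],
       rest.df = df[!is.picked, , drop = FALSE])
}

set.seed(1)  # arbitrary seed, only to make the run repeatable
res <- stratified.split(iris, 5, 0.7)
table(res$sample.df$Species)  # rows per species in the sample
table(res$rest.df$Species)    # rows per species in the rest
```

Since iris has 50 rows per species, a proportion of 0.7 should give 35 rows per species in sample.df and 15 in rest.df. One difference to keep in mind: rows whose grouping value is NA are never drawn here and end up in rest.df.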