Thursday, October 1, 2020

Sample from a grouped dataframe with specified probabilities in R

Below, I first group my data.frame (d) by two categorical variables: gender (2 levels: M/F) and sector (5 levels: Education, Industry, NGO, Private, Public). I then want to sample from the levels of sector with probabilities c(.2, .3, .3, .1, .1), and from the levels of gender with probabilities c(.4, .6).

I'm using the code below to try to achieve this, but it fails with an error. Is there a fix for that?

Could you also comment on whether my code, in general, does what I describe?

d <- read.csv('https://raw.githubusercontent.com/rnorouzian/d/master/su.csv')

library(tidyverse)

set.seed(1)
(out <- d %>%
  group_by(gender,sector) %>%
  slice_sample(n = 2, weight_by = c(.4, .6, .2, .3, .3, .1, .1))) # `Error:  incorrect number of probabilities`
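
A note on the error: in slice_sample(), weight_by is evaluated against the rows being sampled, so it needs one weight per row (within each group, when the data are grouped), not one probability per factor level. A 7-element vector therefore does not match the group sizes. One possible workaround, offered only as a sketch, is to turn the level probabilities into a per-row weight column and sample from the ungrouped data. The toy data frame d_toy, the column w, the sample size of 50, the use of replace = TRUE, and the assignment of .4 to M and .6 to F are all assumptions made for illustration; they do not come from the original data.

```r
library(dplyr)

set.seed(1)

# Hypothetical stand-in for d: 20 rows per gender x sector cell
d_toy <- expand.grid(
  gender = c("M", "F"),
  sector = c("Education", "Industry", "NGO", "Private", "Public"),
  id = 1:20,
  stringsAsFactors = FALSE
)

# Assumed mapping of the stated probabilities to levels
p_gender <- c(M = .4, F = .6)
p_sector <- c(Education = .2, Industry = .3, NGO = .3, Private = .1, Public = .1)

# One weight per row: the cell's target probability, split evenly
# over the rows of that cell so that cell sizes do not distort it
d_w <- d_toy %>%
  group_by(gender, sector) %>%
  mutate(w = unname(p_gender[as.character(gender)] *
                    p_sector[as.character(sector)]) / n()) %>%
  ungroup()

# Sample from the whole (ungrouped) data using the per-row weights
out <- d_w %>%
  slice_sample(n = 50, weight_by = w, replace = TRUE)

# Compare realized cell counts with the targets
count(out, gender, sector)
```

With replace = TRUE each draw lands in a given gender-by-sector cell with probability equal to the product of the two level probabilities, which treats gender and sector as independent. Sampling without replacement would only approximate those proportions.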


