I have a one hot encoded M x N matrix, A, with the following properties:
- One or more columns in each row can equal 1
- Every column in the matrix will have exactly one cell with a value of one (all other cells will be zero)
- M << N
I also have an M x 1 array, B, that contains integers (i.e. the number of random samples I want to select from each row). Each cell of B has the following property:
B[i] <= np.sum(A[i])
I'm looking for the most efficient way to randomly sample a subset of the ones in each row of A. The number of samples returned for each row is given by the integer value in the corresponding cell of B. The output will be an M x N matrix, let's call it C, where B == np.sum(C, axis=1)
A = np.array([[0, 0, 1, 0, 0, 1, 0, 0],
[0, 1, 0, 0, 0, 0, 1, 1],
[1, 0, 0, 1, 1, 0, 0, 0]])
B = np.array([1, 3, 2])
A valid output of running this algorithm would be
array([[0, 0, 1, 0, 0, 0, 0, 0],
[0, 1, 0, 0, 0, 0, 1, 1],
[1, 0, 0, 0, 1, 0, 0, 0]])
Another possible output would be
array([[0, 0, 0, 0, 0, 1, 0, 0],
[0, 1, 0, 0, 0, 0, 1, 1],
[0, 0, 0, 1, 1, 0, 0, 0]])
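To make the requirements on C concrete, here is a small check that both example outputs pass. The helper name `is_valid_sample` is made up for illustration, and it assumes A, B and C are integer NumPy arrays:

```python
import numpy as np

def is_valid_sample(A, B, C):
    """Return True if C keeps only ones that exist in A, with B[i] ones in row i."""
    return bool(
        C.shape == A.shape
        and np.isin(C, (0, 1)).all()            # C is binary
        and np.all(C <= A)                      # every 1 in C is also a 1 in A
        and np.array_equal(C.sum(axis=1), B)    # row i keeps exactly B[i] ones
    )

A = np.array([[0, 0, 1, 0, 0, 1, 0, 0],
              [0, 1, 0, 0, 0, 0, 1, 1],
              [1, 0, 0, 1, 1, 0, 0, 0]])
B = np.array([1, 3, 2])

C1 = np.array([[0, 0, 1, 0, 0, 0, 0, 0],
               [0, 1, 0, 0, 0, 0, 1, 1],
               [1, 0, 0, 0, 1, 0, 0, 0]])
C2 = np.array([[0, 0, 0, 0, 0, 1, 0, 0],
               [0, 1, 0, 0, 0, 0, 1, 1],
               [0, 0, 0, 1, 1, 0, 0, 0]])

ok1 = is_valid_sample(A, B, C1)
ok2 = is_valid_sample(A, B, C2)
ok_a = is_valid_sample(A, B, A)  # A itself has row sums [2, 3, 3], not B
```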
I'm looking for a way to generate X such random samples as fast as possible.
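One possible approach, offered as a sketch rather than a benchmarked answer: because every column holds exactly one 1, each column belongs to exactly one row, so a single sort over the N columns (grouped by row, shuffled within each row by random keys) can replace M separate per-row draws. The function name `sample_ones` and the `rng` parameter are assumptions made for illustration, and the code relies on the stated properties of A and B holding.

```python
import numpy as np

def sample_ones(A, B, rng=None):
    """Keep B[i] randomly chosen ones in row i of A.

    Assumes every column of A contains exactly one 1 and B[i] <= A[i].sum().
    """
    rng = np.random.default_rng() if rng is None else rng
    M, N = A.shape
    row_of_col = A.argmax(axis=0)                  # row that owns each column's single 1
    keys = rng.random(N)                           # one random key per column
    # Primary sort key: owning row; secondary: random key (shuffles within each row).
    order = np.lexsort((keys, row_of_col))
    counts = np.bincount(row_of_col, minlength=M)  # number of ones per row
    starts = np.cumsum(counts) - counts            # where each row's group begins in `order`
    sorted_rows = row_of_col[order]
    rank = np.arange(N) - starts[sorted_rows]      # position of each column inside its group
    keep = order[rank < B[sorted_rows]]            # first B[i] columns of each shuffled group
    C = np.zeros_like(A)
    C[row_of_col[keep], keep] = 1
    return C

A = np.array([[0, 0, 1, 0, 0, 1, 0, 0],
              [0, 1, 0, 0, 0, 0, 1, 1],
              [1, 0, 0, 1, 1, 0, 0, 0]])
B = np.array([1, 3, 2])

rng = np.random.default_rng(0)
C = sample_ones(A, B, rng)
samples = [sample_ones(A, B, rng) for _ in range(5)]  # X = 5 independent samples
```

Each call costs one O(N log N) sort regardless of M. For many samples, `row_of_col`, `counts` and `starts` depend only on A and could be computed once and reused; whether this beats a per-row `rng.choice` loop for a given M, N and X would need to be measured.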