Friday, 19 June 2020

Multidimensional random draw without replacement with 'predrawn' samples in PyTorch

I have an (N, I) tensor of N rows, each holding I indices between 0 and Z, e.g., N=5, I=3, Z=100:

foo = tensor([[83,  5, 85],
              [ 7, 60, 66],
              [89, 25, 63],
              [58, 67, 47],
              [12, 46, 40]], device='cuda:0')

Now I want to efficiently add X new random indices to each row (i.e., indices between 0 and Z that are not yet included in that row!), e.g., with X=4:

foo_new = tensor([[83,  5, 85,  9, 43, 53, 42],
                  [ 7, 60, 66, 85, 64, 22,  1],
                  [89, 25, 63, 38, 24,  4, 75],
                  [58, 67, 47, 83, 43, 29, 55],
                  [12, 46, 40, 74, 21, 11, 52]], device='cuda:0')

In the end, each row of the tensor would have I+X unique indices between 0 and Z, where the first I indices are the ones from the initial tensor and the other X indices are drawn uniformly at random from the remaining indices between 0 and Z.

So it's like a multidimensional random draw without replacement from indices 0 to Z, where the first I draws (in each row) are enforced to result in the indices given by the initial tensor.

How would I do this efficiently, especially with potentially large Z?

What I tried so far (which was quite slow):

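import torch

device = torch.cuda.current_device()
# Boolean mask of shape (N, Z): True where an index is not yet in the row
notinfoo = torch.ones((N, Z), dtype=torch.bool, device=device)
N_row = torch.arange(N, device=device).unsqueeze(dim=-1)
notinfoo[N_row, foo] = False
# Per row: shuffle the Z-I unused indices and append the first X of them
foo_new = torch.stack([torch.cat((f, torch.arange(Z, device=device)[nf][torch.randperm(Z-I, device=device)][:X])) for f, nf in zip(foo, notinfoo)])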
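
One possible way to avoid the per-row Python loop above, offered as a sketch rather than a benchmarked solution: give every candidate index a random key per row, push the keys of the indices already in foo below every random key, and take the X largest keys per row. The function name extend_rows is hypothetical. The sketch assumes that foo holds int64 indices in [0, Z), that X <= Z - I, and that an (N, Z) float tensor fits in memory, which may not hold for very large Z.

```python
import torch


def extend_rows(foo: torch.Tensor, Z: int, X: int) -> torch.Tensor:
    """Append X new indices per row, drawn without replacement from [0, Z).

    Hypothetical helper. Assumes foo is an (N, I) int64 tensor with unique
    entries per row and that X <= Z - I.
    """
    N, I = foo.shape
    # One random key in [0, 1) for every (row, candidate index) pair.
    scores = torch.rand(N, Z, device=foo.device)
    # Indices already present get a key below every random key,
    # so they can never be among the X largest.
    scores.scatter_(1, foo, -1.0)
    # The X largest keys per row select X distinct unused indices.
    new_idx = scores.topk(X, dim=1).indices
    # Keep the original indices first, then the newly drawn ones.
    return torch.cat((foo, new_idx), dim=1)


foo = torch.tensor([[83, 5, 85],
                    [7, 60, 66],
                    [89, 25, 63],
                    [58, 67, 47],
                    [12, 46, 40]])
foo_new = extend_rows(foo, Z=100, X=4)
print(foo_new.shape)  # torch.Size([5, 7])
```

Because the keys are independent and identically distributed, the X largest among the unused indices form a uniformly random subset, barring ties between keys. The cost is O(N·Z) memory for the key tensor, so this trades memory for the removal of the loop.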


