My goal is to sample k integers from 0, ... n-1 without duplication. The order of sampled integers doesn't matter. At every each call (which occurs very often), n and k will slightly vary but not much (n is about 250,000 and k is about 2,000). I've come up with the following amortized O(k) algorithm:
- Prepare an array A with items 0, 1, 2, ... , n-1. This takes O(n) but since n is relatively stable, the cost can be made amortized constant.
- Sample a random number r from [0:i] where i = n - 1. Here the cost is in fact related to n, but as n is not VERY BIG, this dependency is not critical.
- Swap the rth item and the ith item in the array A.
- Decrease i by 1.
- Repeat k times the steps 2~4; now we have a random permutation of length k at the tail of A. Copy this.
- We should roll back A to its initial state (0, ... , n-1) to keep the cost of the step 1 constant. This can be done by push r to a stack of length k at each pass of step 2. Preparation of the stack requires amortized constant cost.
I think uniform sampling of permutation/combination should be an exhaustively studied problem, so either (1) there is a much better solution, or at least (2) my solution is a (minor modification of) a well-known solution. Thus,
- In case (1), I want to know that better solution.
- In case (2), I want to find a reference.
Please help me. Thanks.
Aucun commentaire:
Enregistrer un commentaire