Friday, December 1, 2017

Does Array#sample guarantee random order?

Does it only guarantee a random subset, or also random order?

The documentation says:

Choose [...] n random elements from the array.

The elements are chosen by using random and unique indices into the array in order to ensure that an element doesn't repeat itself unless the array already contained duplicate elements.

I see two possible interpretations. For example, for [1, 2, 3].sample(2):

  • Return [1, 2], [1, 3], [2, 1], [2, 3], [3, 1] or [3, 2], each with probability 1/6.
  • Return [1, 2], [1, 3] or [2, 3], each with probability 1/3.
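
To make the two interpretations concrete, here is a small sketch that enumerates the outcome space of each one. It uses Array#permutation and Array#combination only to list the possible results; it says nothing about how Array#sample is actually implemented.

```ruby
array = [1, 2, 3]
n = 2

# Interpretation 1: order matters, so every ordered selection is a distinct outcome.
ordered = array.permutation(n).to_a

# Interpretation 2: only the subset matters, and elements keep their original relative order.
unordered = array.combination(n).to_a

puts "Ordered outcomes (#{ordered.size}): #{ordered.inspect}"
puts "Unordered outcomes (#{unordered.size}): #{unordered.inspect}"
```

With 6 equally likely ordered outcomes, each would occur with probability 1/6; with 3 equally likely unordered outcomes, each would occur with probability 1/3.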

I tested it, and the first interpretation is what I observed. From reading the source code, I also got the impression that this is what it does in general. But I'm worried that I'm overlooking something, or that this is just a side effect of the current implementation that could change in the future (or that already differs in other Ruby implementations).

Is my first interpretation what it's supposed to do? Can I rely on the result not only being a random subset but also having a random order? And shouldn't the documentation be clearer about this?


Here's my test code with statistics, in case you want to try it yourself:

array = (1..3).to_a
n = 2

# Tally how often each (ordered) sample occurs over one million draws.
count = Hash.new(0)
(10**6).times do
  count[array.sample(n)] += 1
end

puts "#{count.size} different samples occurred."
puts "Smallest was #{count.keys.min}, largest was #{count.keys.max}."
puts "Frequencies ranged from #{count.values.min} to #{count.values.max}."

Example output:

6 different samples occurred.
Smallest was [1, 2], largest was [3, 2].
Frequencies ranged from 165698 to 167234.
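
If the random order matters and I'd rather not depend on behavior the documentation doesn't spell out, one workaround would be to shuffle the sample explicitly. This is only a minimal sketch: sample_in_random_order is a hypothetical helper name I made up, not part of Ruby, and the optional random: argument is there just so a seeded generator can be passed in for reproducible runs.

```ruby
# Hypothetical helper (not part of Ruby's core library):
# draws n distinct-index elements, then explicitly shuffles them,
# so the random order no longer depends on how Array#sample orders its result.
def sample_in_random_order(array, n, random: Random.new)
  array.sample(n, random: random).shuffle(random: random)
end

p sample_in_random_order([1, 2, 3], 2)
```

The extra shuffle costs O(n) on top of the sample, which seems negligible for small n.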



