Friday, 28 July 2017

Evenly distributed random in Bash

I've been using "shuf" and "sort -R" to shuffle my music playlist, but it feels like certain songs get played more than others.

To test this, I used the following command, which shuffles the alphabet and records the first letter of the shuffle. It repeats this 1000 times and then counts how many times each letter was picked. A perfectly even distribution would give each letter about 38 picks (1000/26), but the result is always lopsided:

printf '%s\n' {a..z} > alphabet.txt
for i in {1..1000}; do
    # Shuffle the alphabet and keep the first letter of each shuffle
    perl -MList::Util=shuffle -e 'print shuffle(<STDIN>);' < alphabet.txt | head -1
done > results.txt
# Count how often each letter was picked, least frequent first
sort results.txt | uniq -c | sort -n
rm results.txt alphabet.txt

Which results in something like:

 29 w
 30 u
 31 d
 32 i
 33 v
 34 c
 34 m
 36 a
 36 g
 36 k
 36 n
 36 r
 36 z
 38 y
 39 x
 40 b
 40 e
 40 o
 42 p
 43 f
 43 h
 43 s
 44 j
 44 l
 52 q
 53 t

Notice how 't' was selected 53 times, but 'w' only 29. I believe the songs I hear most often are like the 't', and there are songs I rarely get in the mix (like the 'w').
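
For a sense of scale, here is a rough sketch of how much spread chance alone would produce if every letter were equally likely on each run. It assumes the 1000 runs are independent, so each letter's count follows a binomial distribution. The function name expected_spread is made up for this illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the original test): expected count and standard
# deviation for one item when `runs` independent picks are made uniformly
# from `items` items.
expected_spread() {
    local runs=$1 items=$2
    LC_ALL=C awk -v n="$runs" -v k="$items" 'BEGIN {
        p = 1 / k                      # chance of picking one particular item
        mean = n * p                   # expected number of picks
        sd = sqrt(n * p * (1 - p))     # binomial standard deviation
        printf "%.1f %.1f\n", mean, sd
    }'
}

expected_spread 1000 26   # prints: 38.5 6.1
```

Under that assumption, counts of 29 and 53 sit roughly 1.6 and 2.4 standard deviations from the mean, so a spread like the one above does not by itself show that the shuffle is biased.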

Can anyone come up with a Bash/Perl/Python/etc. command that would distribute the random results more evenly?
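
One possible answer, sketched under the assumption that "more evenly" means every song gets played equally often: instead of drawing a fresh random pick each time, play through one complete shuffle, then reshuffle and go again. Each item then appears exactly once per pass. The function name shuffle_passes is hypothetical; shuf is the GNU coreutils tool mentioned at the top.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print `passes` complete shuffles of the lines in `file`,
# one after another, so every line appears exactly `passes` times in total.
shuffle_passes() {
    local file=$1 passes=$2 i
    for ((i = 0; i < passes; i++)); do
        shuf "$file"    # one full random ordering of the file per pass
    done
}

# Example with the alphabet test: 40 passes over 26 letters give 1040 picks,
# with every letter appearing exactly 40 times.
printf '%s\n' {a..z} > alphabet.txt
shuffle_passes alphabet.txt 40 | sort | uniq -c
rm alphabet.txt
```

The order within each pass is still random, so the same song can occasionally close one pass and open the next.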



