mercredi 17 novembre 2021

Why is this simple shuffle algorithm — sorting by random() — biased?

In this thread we see this simple and beautiful algorithm to shuffle an array:

function shuffle<T>(array: T[]): T[] {
  return array.sort(() => Math.random() - 0.5);
}

And we can see comments that say this algorithm is biased. But I made a simple script to create an empirical probability distribution of the index that the last element of the array ends after the shuffle:

function shuffle(array) {
  return array.sort(() => Math.random() - 0.5);
}

function generateDistribution(iterations = 10_000, arrayLength = 10) {
  const testArray = Array(arrayLength - 1).fill("test");
  const testTarget = "target";
  testArray.push(testTarget);

  const results = {};

  for (let index = 0; index < iterations; index++) {
    countTargetPosition();
  }

  return generateResult();

  function countTargetPosition() {
    const shuffled = shuffle(testArray);
    shuffled.forEach((value, index) => {
      if (value === testTarget) {
        results[index] = results[index] + 1 || 1;
      }
    });
  }

  function generateResult() {
    return Object.entries(results).map(([index, count]) => {
      return {
        [index]: count / iterations,
      };
    });
  }
}

const result = generateDistribution()
document.write(`<h1>Result</h1>`)

document.write(JSON.stringify(result))

We expect an unbiased algorithm to have a uniform distribution, and the result is very close to that, even for array with 100 elements. Why is this algorithm biased then?




Aucun commentaire:

Enregistrer un commentaire