mercredi 28 novembre 2018

What is the general formula for calculating the probability of generating a duplicate random number?

I am generating random order tracking numbers that should be short, yet should not be duplicate.

For a tracking number consisting of 3 digits, we will generate a duplicate random number after 40 attempts on average.

If we bump it to 12 digits, it will take on average 1.3 million attempts to generate a duplicate random number.

What is a more general formula for calculating how many attempts on average it will take to generate a duplicate random number up to a predefined limit?

Empirically, I can figure it out using this code, but I am looking for a more general solution:

/**
 * Calculates the average iteration at which we will generate
 * a random integer that has been generated before.
 * This numbers is much smaller than expected due to birthday paradox.
 */

// Generate random numbers up to (exclusive) this number:
const RANGE = 1000000000000;

let foundCollisions = 0;
const foundAt = [];

while(true) {
    let i = 0;
    const exists = {}

    while(true) {
        i++;

        const random = Math.floor(Math.random() * RANGE);

        if (exists[random]) {
            // We found a duplicate number:
            break;
        } else {
            exists[random] = true;
        }
    }

    foundCollisions++;
    foundAt.push(i);

    // Calculate the average iteration at which we found a duplicate number:
    const averageFoundAt = foundAt.reduce((total, num) => total + num) / foundCollisions;
    console.log(`Found ${foundCollisions} collisions, average at ${Math.round(averageFoundAt)}th iteration.`);
}




Aucun commentaire:

Enregistrer un commentaire