I found this bit of PHP code for generating random strings (alphabetical, alphanumeric, numeric, and hexadecimal).
<?php
function random($length = 8, $seeds = 'alpha') {
// Possible seeds
$seedings['alpha'] = 'abcdefghijklmnopqrstuvwqyz';
$seedings['numeric'] = '0123456789';
$seedings['alphanum'] = 'abcdefghijklmnopqrstuvwqyz0123456789';
$seedings['hexidec'] = '0123456789abcdef';
// Choose seed
if (isset($seedings[$seeds])) {
$seeds = $seedings[$seeds];
}
// Seed generator
list($usec, $sec) = explode(' ', microtime());
$seed = (float) $sec + ((float) $usec * 100000);
mt_srand($seed);
// Generate
$str = '';
$seeds_count = strlen($seeds);
for ($i = 0; $length > $i; $i++) {
$str .= $seeds{mt_rand(0, $seeds_count - 1)};
}
return $str;
}
?>
If I run this function with the default arguments (so it is generating 8 character strings, alphabetical only) and generate 1,000,000 strings, I'd think my collision rate would be low:
26^8 = 208,827,064,576
1,000,000 / 208,827,064,576 ~= 0.0004%
In actuality, when I run that on my machine, I get a 90% collision rate! Only 10% of my generated strings are unique.
Actually, it is suspiciously close to 10%. Generating multiple sets of 1,000,000 random strings, I find that each set generates...
- 100,032 unique strings
- 100,035 unique strings
- 100,032 unique strings
- 100,028 unique strings
- 100,030 unique strings
- you get the idea...
So what gives? Obviously it has to do with how I'm seeding mt_srand
, or how php implements mt_rand
, or something else.
So...
Aucun commentaire:
Enregistrer un commentaire