I have a system requirement to generate an 11-character string whose 8 rightmost characters must be unique.
As I understand it, this happens at most a few hundred times per day. Unfortunately, due to speed concerns, I was asked not to use a DB to simply retrieve the nextval() of a sequence.
So I am left to test various ways of generating a random number that is as good as possible, and I've come up with a solution based on the SecureRandom class.
I decided to test it to see how likely it is that a generated string would repeat itself. I tested with a HashMap (String, String) for 10 million generations and it looked good. I was hoping to run it overnight for about a billion random strings, but that failed with: Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
The test code I have so far is this:
import java.math.BigInteger;
import java.security.SecureRandom;
import java.util.HashMap;

public class Main {
    public static final BigInteger BASE = BigInteger.valueOf(62);
    public static final String DIGITS = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";

    public static void main(String[] args) {
        long lStartTime = System.nanoTime();
        HashMap<String, String> orders = new HashMap<String, String>();
        // One SecureRandom instance is enough; there is no need to create one per iteration.
        SecureRandom randObj = new SecureRandom();
        for (int i = 0; i < 960000000; i++) {
            BigInteger bigRand = new BigInteger(128, randObj);
            StringBuilder result = new StringBuilder();
            // Peel off up to 11 base-62 digits, least significant first.
            while (bigRand.signum() > 0 && result.length() < 11) { // number > 0
                BigInteger[] divmod = bigRand.divideAndRemainder(BASE);
                bigRand = divmod[0];
                int digit = divmod[1].intValue();
                result.insert(0, DIGITS.charAt(digit));
            }
            // Look up by String; a StringBuilder would never equal a String key.
            String key = result.toString();
            if (orders.containsKey(key)) {
                System.out.println("Duplicate key found!: " + key);
            } else {
                orders.put(key, key); // No such key
            }
        }
        long lEndTime = System.nanoTime();
        long difference1 = lEndTime - lStartTime;
        double difference = (double) difference1 / 1000000000;
        System.out.println("Elapsed seconds: " + difference);
        System.out.println("Elapsed exact: " + difference1);
    }
}
Do you have any suggestions on how to show that we can rely on this method of generating random numbers, i.e. that the likelihood of getting the same string twice is small enough?
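For reference, one way to estimate this without storing a billion strings is the birthday approximation: for n uniformly random draws from a space of N values, the chance of at least one repeat is roughly 1 - exp(-n(n-1)/(2N)). The sketch below is only a rough estimate under the assumption that the strings are uniform over 62^8 values (the 8 rightmost characters) or 62^11 values (all 11). The class name CollisionEstimate and the figure of 500 strings per day are made up for illustration.

```java
final class CollisionEstimate {

    // Birthday approximation: probability of at least one repeat among
    // n uniform draws from a space of the given size.
    static double collisionProbability(double n, double space) {
        // -expm1(-x) equals 1 - exp(-x) but stays accurate for tiny x.
        return -Math.expm1(-n * (n - 1) / (2.0 * space));
    }

    public static void main(String[] args) {
        double space8 = Math.pow(62, 8);   // 8 rightmost characters
        double space11 = Math.pow(62, 11); // all 11 characters
        double perDay = 500;               // assumption: "a few hundred per day"

        for (int years : new int[] {1, 10, 50}) {
            double n = perDay * 365 * years;
            System.out.printf("%d year(s), %.0f strings: 8 chars %.3e, 11 chars %.3e%n",
                    years, n,
                    collisionProbability(n, space8),
                    collisionProbability(n, space11));
        }

        // The 10-million-string test, measured against each space.
        System.out.printf("10 million strings: 8 chars %.3e, 11 chars %.3e%n",
                collisionProbability(1e7, space8),
                collisionProbability(1e7, space11));
    }
}
```

The estimate depends heavily on whether uniqueness is needed over 8 or over 11 characters. Either way, a random generator can only make a repeat unlikely; it cannot guarantee uniqueness.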
I stumbled across this question: random number generator test. The answer looks interesting, but I didn't quite understand how to apply it to my case (statistics was my hardest course; I barely passed it on the second attempt...)
I am also not sure how to adjust this random generator so that the length of the generated string can be set dynamically. There must be better ways to do this than what I did here...
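To make the length question concrete, here is a minimal sketch of what a configurable-length version could look like, drawing one base-62 character at a time with SecureRandom.nextInt(bound) instead of converting a BigInteger. The class name RandomBase62 and the method generate are hypothetical names used only for this example. It controls the length of the string and does nothing to enforce uniqueness.

```java
import java.security.SecureRandom;

final class RandomBase62 {
    private static final String DIGITS =
            "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";

    // A single shared instance; SecureRandom is safe to reuse across calls.
    private static final SecureRandom RANDOM = new SecureRandom();

    // Returns a string of exactly `length` characters, each drawn
    // uniformly from the 62-character alphabet above.
    static String generate(int length) {
        if (length < 0) {
            throw new IllegalArgumentException("length must be >= 0");
        }
        StringBuilder sb = new StringBuilder(length);
        for (int i = 0; i < length; i++) {
            sb.append(DIGITS.charAt(RANDOM.nextInt(DIGITS.length())));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(generate(11));
        System.out.println(generate(8));
    }
}
```

Because every position is filled explicitly, the result always has the requested length, whereas a BigInteger conversion stops early when the remaining value reaches zero.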
Thanks!