mardi 2 juin 2015

Java Unique Number Generator collision test runs out of memory

I have a system requirement to generate an 11 characters string where 8 rightmost digits must be unique.

Now from my understanding, at most this happens few hundreds of times per day. Due to speed concerns, I was asked to avoid using a DB to simply retrieve the nextval() in a sequence unfortunately.

So I am left to test various ways to generate a random number as good as possible, and I've come up with a solution based on SecureRandom class.

I decided to test it, to see how likely it is that a generated string would repeat itself; I tested using a HashMap (string, string) for 10 million generations - looks good, and was hoping to test for the night for billion random strings, but that has failed due to Exception in thread "main" java.lang.OutOfMemoryError: Java heap space

The test code I have so far is this:

public class Main {
public static BigInteger BASE = BigInteger.valueOf(62);
public static final String DIGITS = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";

public static void main(String[] args) {
    // TODO Auto-generated method stub
    long lStartTime = System.nanoTime();
    HashMap<String, String> orders = new HashMap<String, String>();

    for (int i = 0; i < 960000000; i++) {
        SecureRandom randObj = new SecureRandom();
        BigInteger BigRand = new BigInteger(128, randObj);
        String rand = BigRand.toString(62);

        StringBuilder result = new StringBuilder();
        while (BigRand.compareTo(BigInteger.ZERO) == 1 && result.length()<11) { // number > 0
              BigInteger[] divmod = BigRand.divideAndRemainder(BASE);
              BigRand = divmod[0];
              int digit = divmod[1].intValue();
              result.insert(0, DIGITS.charAt(digit));
            }
        String doesKeyExistString = orders.get(result);
        if (doesKeyExistString != null) {
            System.out.print("Duplicate key found!: "+result.toString()+"\n");
        } else {
            orders.put(result.toString(), result.toString());  // No such key
        }
    }
    long lEndTime = System.nanoTime();
    long difference1 = lEndTime - lStartTime;
    double difference = (double)difference1/1000000000;
    System.out.println("Elapsed seconds: " + difference);
    System.out.println("Elapsed exact: " + difference1);

}

Do you have any suggestions how to prove that we can rely on this method of generating random numbers, with likelyhood of getting the same string twice small enough?

I stumbled across this question: random number generator test The answer looks interesting, but I didn't quite understand how to apply this to my case (Statistics was my hardest course, I barely passed it the second attempt...)

I am also not sure, how to adjust this random generator to dynamically set the length of the generated number.. there have to be better ways to do this than what I did here...

Thanks!




Aucun commentaire:

Enregistrer un commentaire