Thursday, July 20, 2017

JDK9 randomization on immutable sets and maps

Reading this question and the answer given by Eugene, I found that JDK9 immutable sets and maps will introduce a source of randomness that will affect their traversal. This means that iteration order will indeed be random, at least among different runs of the JVM.

As the spec doesn't guarantee any traversal/iteration order for sets and maps, this is absolutely fine. In fact, code must never rely on implementation-specific details, but on the spec instead.
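When a particular order is actually needed, one way to stay within the spec is to impose that order explicitly instead of relying on whatever the set happens to produce. A minimal sketch, assuming Java 9 or later; the class name `StableOrder` and the method `sortedView` are made up for illustration:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class StableOrder {
    // Copies the elements into a TreeSet, which iterates in natural (sorted)
    // order no matter how the source set happens to iterate.
    static List<String> sortedView(Set<String> source) {
        return new ArrayList<>(new TreeSet<>(source));
    }

    public static void main(String[] args) {
        Set<String> words = Set.of("just", "a", "test");
        System.out.println(sortedView(words)); // prints [a, just, test]
    }
}
```

The printed order here depends only on the natural ordering of `String`, so it stays the same across JVM runs and JDK versions.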

I know that today, with JDK 8, if I have, e.g., a HashSet and do this (taken from the linked answer):

Set<String> wordSet = new HashSet<>(Arrays.asList("just", "a", "test"));

System.out.println(wordSet);

for (int i = 0; i < 100; i++) {
    wordSet.add("" + i);
}

for (int i = 0; i < 100; i++) {
    wordSet.remove("" + i);
}

System.out.println(wordSet);

Then the iteration order of the elements will change and the two outputs will differ. This is because adding 100 elements grows the internal capacity of the HashSet and rehashes the elements, and removing them afterwards does not shrink it back. This is perfectly valid behavior, and it's not what I'm asking about here.

However, with JDK9, if I do this:

Set<String> set = Set.of("just", "a", "test");
System.out.println(set);

And then run the same code in another instance of the JVM, the outputs can be different, because randomization has been introduced.

So far, I've found this excellent video on YouTube (at 44:55), in which Stuart Marks says that one motivation for this randomization is:

(...) that people write applications that have inadvertent dependencies on iteration order. (...) So, anyway, iteration order is a big deal and I think there's a lot of code out there that has latent dependencies on iteration order that has not been discovered yet. (...) So, our response to this is to deliberately randomize the iteration order in Set and Map in the new collections. So whereas before the iteration order of collections was unpredictable but stable, these are predictably unpredictable. So every time the JVM starts up, we get a random number and we use that as a seed value that gets mixed in with the hash values. So, if you run a program that initializes a set and then prints out the elements in any order, you get an answer, and then, if you invoke the JVM again and run that same program, the set of elements usually would come out in a different order. So, the idea here is that (...) if there are iteration order dependencies in your code, what used to happen in the past, is a new JDK release came out and you test your code and (...) it'd take hours of debugging to trace it down to some kind of change in iteration order. What that meant was there was a bug in that code that depended on the iteration order. Now, if you vary the iteration order more often, like every JVM invocation, then (we hope) that weird behavior will manifest itself more frequently, and in fact we hope while you're doing testing...

So, the motivation is clear, and it's also clear that this randomization will only affect the new immutable sets and maps.
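To make the "seed value mixed in with the hash values" idea concrete, here is a rough sketch of how a salt can change the slot each element lands in, and therefore the traversal order. It is only an illustration under my own assumptions, not a copy of the JDK's code: the class `SaltedSlots`, the methods `slotFor` and `layout`, the table size of 8, and the clock-based salt are all hypothetical.

```java
import java.util.Arrays;

class SaltedSlots {
    // Maps an element to a slot index, mixing a salt into its hash code.
    // Math.floorMod keeps the index non-negative for negative hash values.
    static int slotFor(Object element, int salt, int tableSize) {
        return Math.floorMod(element.hashCode() ^ salt, tableSize);
    }

    // Places the elements into a fixed-size table using linear probing.
    // Assumes tableSize is larger than the number of elements.
    // Walking the returned array by index mimics a traversal of the set.
    static String[] layout(String[] elements, int salt, int tableSize) {
        String[] table = new String[tableSize];
        for (String e : elements) {
            int idx = slotFor(e, salt, tableSize);
            while (table[idx] != null) {
                idx = (idx + 1) % tableSize; // move to the next slot on collision
            }
            table[idx] = e;
        }
        return table;
    }

    public static void main(String[] args) {
        String[] words = {"just", "a", "test"};
        // A different salt on each run, taken from the clock for illustration only.
        int salt = (int) System.nanoTime();
        System.out.println(Arrays.toString(layout(words, salt, 8)));
    }
}
```

Running this several times should usually show the three words in different slots, which is the "predictably unpredictable" effect described in the talk: the contents are the same, only the layout varies with the salt.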

My question is: Are there any other motivations behind this randomization, and what other advantages does it offer?



