Tuesday, October 6, 2020

How could these alternate numpy `uniform` vs `random` constructions possibly differ?

I had some code that randomly initialized some numpy arrays with:

rng = np.random.default_rng(seed=seed)
new_vectors = rng.uniform(-1.0, 1.0, target_shape).astype(np.float32)  # [-1.0, 1.0)
new_vectors /= vector_size

And all was working well, all project tests passing.

Unfortunately, uniform() returns np.float64, though downstream steps only want np.float32, and in some cases this array is very large (think: millions of 400-dimensional word-vectors). So the temporary np.float64 return value momentarily uses 3X the RAM necessary: during the astype(), the 8-byte-per-value float64 temporary coexists with the 4-byte-per-value float32 copy, versus the 4 bytes per value ultimately kept.
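A quick check of the memory math (with an illustrative shape, not the project's actual millions of vectors):

```python
import numpy as np

rng = np.random.default_rng(seed=42)

tmp = rng.uniform(-1.0, 1.0, (1000, 400))   # uniform() always returns float64
print(tmp.dtype)    # float64
print(tmp.nbytes)   # 3_200_000 bytes for the temporary (8 bytes/value)

small = tmp.astype(np.float32)              # 1_600_000 bytes kept (4 bytes/value)
print(small.nbytes)
```

While astype() runs, both arrays exist, so peak usage is 8 + 4 = 12 bytes per value, 3X the 4 bytes per value of the final float32 array.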

Thus, I replaced the above with what should, by definition, be equivalent:

rng = np.random.default_rng(seed=seed)
new_vectors = rng.random(target_shape, dtype=np.float32)  # [0.0, 1.0)
new_vectors *= 2.0  # [0.0, 2.0)
new_vectors -= 1.0  # [-1.0, 1.0)
new_vectors /= vector_size

And after this change, all closely-related functional tests still pass, but a single distant, fringe test, relying on far-downstream calculations from the vectors initialized this way, has started failing, and failing in a very reliable way. It's a stochastic test: it passes with a large margin for error under the original uniform() construction, but always fails under the random() replacement. So something has changed, but in some very subtle way.

The superficial values of new_vectors seem properly and similarly distributed in both cases. And again, all the "close-up" tests of functionality still pass.
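For what it's worth, a quick sanity check of summary statistics (with an illustrative seed and size, not the project's) shows nothing obviously different between the two constructions:

```python
import numpy as np

n = 1_000_000

rng = np.random.default_rng(seed=0)
a = rng.uniform(-1.0, 1.0, n).astype(np.float32)  # original construction

rng = np.random.default_rng(seed=0)
b = rng.random(n, dtype=np.float32)               # replacement construction
b *= 2.0
b -= 1.0

# Mean near 0, std near 1/sqrt(3) ~= 0.577, range within [-1, 1) in both cases.
for arr in (a, b):
    print(arr.mean(), arr.std(), arr.min(), arr.max())
```

(Note the two arrays are not element-wise equal even with the same seed, since the two methods consume the underlying bit stream differently.)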

So I'd love theories about what non-intuitive differences this 3-line change may have introduced that could show up far downstream.
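One testable theory (just a hypothesis, not a confirmed cause): rng.random(dtype=np.float32) draws values from a grid of multiples of 2**-24 on [0.0, 1.0), and the subsequent *= 2.0 and -= 1.0 are exact in float32, so the scale-and-shift construction can only ever produce multiples of 2**-23. By contrast, uniform() drawn at float64 and then cast to float32 keeps much finer granularity, especially near zero. A probe:

```python
import numpy as np

n = 100_000

rng = np.random.default_rng(seed=1)
a = rng.uniform(-1.0, 1.0, n).astype(np.float32)  # float64 draw, then cast

rng = np.random.default_rng(seed=1)
b = rng.random(n, dtype=np.float32)  # j / 2**24 for integer j in [0, 2**24)
b *= 2.0                             # exact: j / 2**23
b -= 1.0                             # exact: (j - 2**23) / 2**23 is representable

def on_lattice(x, step=2.0**-23):
    """Fraction of values that are exact multiples of `step`."""
    scaled = x.astype(np.float64) / step  # division by a power of 2 is exact
    return np.mean(scaled == np.round(scaled))

print(on_lattice(b))  # 1.0: every value is a multiple of 2**-23
print(on_lattice(a))  # well below 1.0: finer-grained values survive the cast
```

If anything downstream is sensitive to the low-order bits (or to exact zeros, ties, or the coarser set of distinct values), this could plausibly surface only in a far-downstream stochastic test. Again, this is one theory to test, not a diagnosis.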

(I'm still trying to find a minimal test that detects whatever's different. If you'd enjoy doing a deep-dive into the affected project, seeing the exact close-up tests that succeed, the one fringe test that fails, and commits with and without the tiny change, see https://github.com/RaRe-Technologies/gensim/pull/2944#issuecomment-704512389. But really, I'm just hoping a numpy expert might recognize some tiny corner-case where something non-intuitive happens, or offer some testable theories of same.)

Any ideas, proposed tests, or possible solutions?



