random: What's the fastest way of finding a random index in a Python list, a large number of times?

vendredi 11 décembre 2020

What's the fastest way of finding a random index in a Python list, a large number of times?

What's the best (fastest) way of extracting a random value from a list, a large number (>1M) of times?

I am currently in a situation where I have a graph represented as an adjacency list, whose inner lists can have vastly different length (in the range [2, potentially 100k]).

I need to iterate over this list to generate random walks, so my current solution is along the lines of

Get random node
Pick a random index from that node's adjacency list
Move to the new node
Repeat

This works well enough when the graph isn't too big, however now I am working with a graph that contains >440k nodes, with a very large variance in the number of edges each node has.

The function I am using at the moment to extract the random index is

node_neighbors[int(random.random() * number_neighbors_of_node)]

This sped up the computation over my previous implementation, however it's still unacceptably slow for my purposes.

The number of neighbors of a node can go from 2 to tens of thousands, I cannot remove small nodes, I have to generate tens of thousands of random walks in this environment.

From profiling the code, most of the generation time is spent looking for these indexes, so I'm looking for a method that can reduce the time taken to do so. However, if it's possible to sidestep it entirely by modifying the algorithm, that would also be great.

Thanks!

random

vendredi 11 décembre 2020

What's the fastest way of finding a random index in a Python list, a large number of times?

Aucun commentaire:

Enregistrer un commentaire