What's the best (fastest) way of extracting a random value from a list, a large number (>1M) of times?
I am currently in a situation where I have a graph represented as an adjacency list, whose inner lists can have vastly different length (in the range [2, potentially 100k]).
I need to iterate over this list to generate random walks, so my current solution is along the lines of
- Get random node
- Pick a random index from that node's adjacency list
- Move to the new node
- Repeat
This works well enough when the graph isn't too big, however now I am working with a graph that contains >440k nodes, with a very large variance in the number of edges each node has.
The function I am using at the moment to extract the random index is
node_neighbors[int(random.random() * number_neighbors_of_node)]
This sped up the computation over my previous implementation, however it's still unacceptably slow for my purposes.
The number of neighbors of a node can go from 2 to tens of thousands, I cannot remove small nodes, I have to generate tens of thousands of random walks in this environment.
From profiling the code, most of the generation time is spent looking for these indexes, so I'm looking for a method that can reduce the time taken to do so. However, if it's possible to sidestep it entirely by modifying the algorithm, that would also be great.
Thanks!
Aucun commentaire:
Enregistrer un commentaire