random: Does Pandas use Numpy as a random number generator?

Monday, 4 December 2017


I want to get reproducible samples of data. A quick experiment suggests that numpy.random.seed does influence pandas.DataFrame.sample, but I could not find this documented anywhere.

Does anybody know whether pandas uses NumPy as its random number generator?

What I tried

I ran the following script a couple of times and always got the same results back:

#!/usr/bin/env python

import pandas as pd
import numpy as np


df = pd.DataFrame([(1, 2, 1),
                   (1, 2, 2),
                   (1, 2, 3),
                   (4, 1, 612),
                   (4, 1, 612),
                   (4, 1, 1),
                   (3, 2, 1),
                   ],
                  columns=['groupid', 'a', 'b'],
                  index=['India', 'France', 'England', 'Germany', 'UK', 'USA',
                         'Indonesia'])
np.random.seed(0)
print(df.sample(n=1))
print(df.sample(n=1))
print(df.sample(n=1))
print(df.sample(n=1))
print(df.sample(n=1))

This prints the following rows on every run (only the index labels are shown here):

Indonesia
France
Indonesia
USA
England
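
For reproducible samples that do not depend on undocumented behaviour, one option is the random_state parameter of DataFrame.sample, which accepts an integer seed or a numpy.random.RandomState object. The following is a minimal sketch using a shortened version of the DataFrame above. The variable names (first, second, seq_a, seq_b, global_a, global_b) are only for illustration, and the last check, on the global seed, reflects what I observe rather than a documented guarantee.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    [(1, 2, 1), (1, 2, 2), (1, 2, 3), (4, 1, 612),
     (4, 1, 612), (4, 1, 1), (3, 2, 1)],
    columns=['groupid', 'a', 'b'],
    index=['India', 'France', 'England', 'Germany', 'UK', 'USA',
           'Indonesia'])

# An integer seed makes each call reproducible on its own,
# regardless of NumPy's global random state.
first = df.sample(n=1, random_state=0)
second = df.sample(n=1, random_state=0)

# A shared RandomState object yields a reproducible *sequence* of draws.
rng = np.random.RandomState(0)
seq_a = [df.sample(n=1, random_state=rng).index[0] for _ in range(5)]
rng = np.random.RandomState(0)
seq_b = [df.sample(n=1, random_state=rng).index[0] for _ in range(5)]

# Observed (not documented, as far as I can tell): with no random_state,
# reseeding NumPy's global generator reproduces the sample.
np.random.seed(0)
global_a = df.sample(n=1)
np.random.seed(0)
global_b = df.sample(n=1)

print(first.equals(second), seq_a == seq_b, global_a.equals(global_b))
```

Passing random_state explicitly keeps the sampling reproducible even if other code in the same process also draws from NumPy's global generator.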
