Sunday, April 8, 2018

Pandas: Random integer between values in two columns

How can I create a new column that holds a random integer between the values of two columns in the same row?

Example df:

import pandas as pd
import numpy as np

data = pd.DataFrame({'start': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
                     'end': [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]})
data = data[['start', 'end']]  # select by name so 'start' comes first on any pandas version

Now I am trying something like this:

data['rand_between'] = data.apply(lambda x: np.random.randint(data.start, data.end))

or

data['rand_between'] = np.random.randint(data.start, data.end)

But it doesn't work, of course, because data.start is a Series, not a number. How can I use numpy.random with data from the columns as a vectorized operation?
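For reference, here is a minimal sketch of what I think a working version could look like. It assumes a NumPy version recent enough to provide np.random.default_rng (1.17 or later), whose integers method accepts array-like bounds. The seed and the column name rand_rowwise are just illustrative choices on my part. Note that the upper bound is exclusive by default, so each result satisfies start <= value < end.

```python
import numpy as np
import pandas as pd

data = pd.DataFrame({'start': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
                     'end': [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]})

# Seeded generator so the sketch is reproducible; the seed value is arbitrary.
rng = np.random.default_rng(0)

# Vectorized: low and high are arrays of equal length, giving one draw per row.
# The upper bound is exclusive, so start <= value < end.
data['rand_between'] = rng.integers(data['start'].to_numpy(),
                                    data['end'].to_numpy())

# Row-wise alternative: axis=1 passes each row (not each column) to the lambda,
# and the lambda must read from the row, not from the whole DataFrame.
data['rand_rowwise'] = data.apply(
    lambda row: rng.integers(row['start'], row['end']), axis=1)

print(data)
```

If the upper bound should be inclusive, passing endpoint=True to rng.integers should do it. The row-wise version is only there for comparison; on a large frame I would expect the vectorized call to be much faster.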



