Friday, January 25, 2019

How to assign random values from a list to a column in a pandas dataframe?

I am working with Python in BigQuery and have a large dataframe df (about 7 million rows). I also have a list day_list that holds some dates (say, all the days in a given month).

I am trying to create an additional column "rand_day" in df that holds, in each row, a random value from day_list.

I tried both a loop and apply, but on a dataset this large both are too slow to be practical.

My first attempt used a loop:

from random import sample

df["rand_day"] = ""

# Assumes df["row_nr"] holds the same labels as df's index.
for i in df["row_nr"]:
  rand_day = sample(day_list,1)[0]
  df.loc[i,"rand_day"] = rand_day

My second attempt used apply, first defining a function and then calling it:

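from random import sample

def random_day():
  # Pick one random day from day_list.
  rand_day = sample(day_list,1)[0]
  return rand_day

# axis=1 calls the function once per row rather than once per column.
df["rand_day"] = df.apply(lambda row: random_day(), axis=1)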

Any tips on this? Thank you
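
One possible direction, offered as a minimal sketch rather than a definitive answer: draw all the random picks in a single vectorized NumPy call instead of once per row. The df and day_list built here are small hypothetical stand-ins for the real data, and the seed is only there to make the run reproducible.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-ins for the df and day_list described in the question.
day_list = pd.date_range("2019-01-01", "2019-01-31").tolist()
df = pd.DataFrame({"row_nr": range(1_000)})

rng = np.random.default_rng(seed=0)  # seed is an assumption, for reproducibility only

# Draw one random position into day_list for every row, in a single call.
positions = rng.integers(0, len(day_list), size=len(df))

# Index an array of the days with those positions to fill the whole column at once.
days = pd.Series(day_list).to_numpy()
df["rand_day"] = days[positions]
```

Because the sampling happens inside NumPy rather than in a Python-level loop, this approach should scale much better to millions of rows, though the actual timing on a 7 million row dataframe is not something measured here.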



