random: Adding column of random floats to data frame, but with equal values for equal data frame entries

Friday, 19 July 2019

I have a column of integers, some unique and some repeated. I want to add a column of random floats between 0 and 1, one per row, such that all rows sharing the same integer get the same float.

The code I'm providing builds a column of ints and a second column of random floats. What I need is for rows with the same int (the three 1s, or the two 6s) to share a single float, with that float still randomly generated. The ints in my real data are 8 digits long and the data set has about 500,000 rows, so I am trying to be as efficient as possible.

I have a working solution that iterates over the data frame after it has been created, but generating the random column and then looping through it to check for matching ints is slow. Is there a more efficient method?

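import numpy as np
import pandas as pd

# Example data: repeated integers that should each share one random float
col1 = [1, 1, 1, 2, 3, 3, 3, 4, 5, 6, 6, 7]
# Current behavior: an independent random float for every row
col2 = np.random.uniform(0, 1, len(col1))

# Building from a dict keeps col1 as integers instead of coercing it to float
df1 = pd.DataFrame({"col1": col1, "col2": col2})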
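
One possible vectorized approach, offered as a sketch rather than a definitive answer: draw a single float per distinct integer, then broadcast those floats back to the rows. The version below uses `pd.factorize` to label each row with the position of its integer among the distinct values. The names `df`, `rng`, `codes`, and `per_value` are illustrative, and the sketch assumes the integer column has no missing values (`pd.factorize` labels those as -1).

```python
import numpy as np
import pandas as pd

# Same toy data as in the question
col1 = [1, 1, 1, 2, 3, 3, 3, 4, 5, 6, 6, 7]
df = pd.DataFrame({"col1": col1})

rng = np.random.default_rng()

# codes[i] is the position of row i's integer among the distinct integers;
# uniques holds the distinct integers in order of first appearance
codes, uniques = pd.factorize(df["col1"])

# One random float in [0, 1) per distinct integer
per_value = rng.uniform(0, 1, size=len(uniques))

# Fancy indexing expands the per-integer floats back to one value per row
df["col2"] = per_value[codes]
```

This avoids a Python-level loop over the rows, so it should scale to a few hundred thousand rows much better than iterating; actual timings on the real data were not measured here.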
