I have a 149x5 NumPy array. I need to save about 30% of its values, selected at random from the whole array. The selected values should also be deleted from the data.
What I have so far:
import pandas as pd
from random import randint

# Load dataset
data = pd.read_csv('iris.csv')
# Randomly select 30% (45) of the rows from the dataset
random_rows = data.sample(45)
# List for the values to be saved
values = []
# Iterate over the selected rows and pick one value per row at random
for index, row in random_rows.iterrows():
    # Random column position between 0 and 4 (inclusive)
    rand_selector = randint(0, 4)
    # Somehow save the deleted value and its position in the data object
    value = ??  # <-------
    values.append(value)
    # Delete the random value
    del row[rand_selector]
In addition, the saved values will later be compared with the values imputed in their place by other methods (data imputation), so I need the position of each deleted value in the original dataset.
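To make the goal concrete, here is a minimal sketch of the kind of result I am after. It is only one possible approach, under several assumptions. The function name `mask_random_values` is hypothetical. "Deleting" a value is taken to mean replacing it with NaN, since a single cell cannot be removed from a rectangular array. The stand-in data is random floats with made-up column names rather than the real `iris.csv`.

```python
import numpy as np
import pandas as pd


def mask_random_values(data, frac=0.3, seed=0):
    """Blank out one randomly chosen cell in a random fraction of the rows.

    Hypothetical helper: returns a masked copy of `data` plus a table
    recording each removed value together with its row and column label.
    """
    rng = np.random.default_rng(seed)
    n_rows, n_cols = data.shape
    n_pick = int(round(frac * n_rows))

    # Positional row indices, sampled without replacement
    row_pos = rng.choice(n_rows, size=n_pick, replace=False)
    # One random column position per selected row
    col_pos = rng.integers(0, n_cols, size=n_pick)

    masked = data.copy()
    records = []
    for r, c in zip(row_pos, col_pos):
        # Remember where the value was and what it was before masking it
        records.append({
            "row": data.index[r],
            "column": data.columns[c],
            "value": data.iat[r, c],
        })
        masked.iat[r, c] = np.nan  # assumption: "deleted" means set to NaN

    return masked, pd.DataFrame(records)


# Stand-in for the real dataset (assumption: all columns are numeric)
demo_rng = np.random.default_rng(42)
data = pd.DataFrame(demo_rng.random((149, 5)), columns=list("abcde"))

masked, removed = mask_random_values(data, frac=0.3)
```

With this layout, `removed` holds one row per deleted value, so an imputed value could later be looked up with `imputed.at[row, column]` and compared against the saved `value`. If the dataset has a non-numeric column (such as a species label), that column would probably need to be excluded from the random column choice first.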