Saturday, January 5, 2019

Sampling rows with sample size greater than length of DataFrame

I've been asked to generate a new variable from the data in an existing one. Specifically, I need to draw values at random (using the random function) from the original variable so that the result has at least 10x as many observations as the original, and then save that as a new variable.

This is my dataset: https://archive.ics.uci.edu/ml/machine-learning-databases/forest-fires/forestfires.csv

The variable I want to work with is area.

This is my attempt, but it raises a "'module' object is not callable" error:

import pandas as pd
import random as rand

dataFrame = pd.read_csv("https://archive.ics.uci.edu/ml/machine-learning-databases/forest-fires/forestfires.csv")

area = dataFrame['area']

random_area = rand(area)

print(random_area)
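
For context on the error: `rand` in the attempt above is an alias for the `random` module itself, and calling a module raises `TypeError`. One possible way to get more observations than the original has is to draw with replacement. The following is only a minimal sketch under stated assumptions: `area_values` is a made-up stand-in for `dataFrame['area'].tolist()` so the snippet runs without downloading the CSV, and `n_samples` is an illustrative name.

```python
import random

# Stand-in for dataFrame['area'].tolist(); these values are made up for illustration.
area_values = [0.0, 0.36, 0.43, 6.36, 10.73]

# Target size: 10x as many observations as the original variable.
n_samples = 10 * len(area_values)

# Fixed seed only so the sketch is reproducible; drop it for fresh draws.
random.seed(0)

# random.choices draws with replacement, so k may exceed len(area_values).
random_area = random.choices(area_values, k=n_samples)

print(len(random_area))
```

If staying in pandas is preferred, `area.sample(n=10 * len(area), replace=True)` should behave similarly on the Series; without `replace=True`, pandas refuses to take a sample larger than the population.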



