I'm creating a tool that automatically draws random names and contact information from a .csv file chosen by user input.
I want to use the .sample(4) function to give me 4 random rows for each column in my DataFrame.
I have 2 problems I'm trying to figure out:
- The different .csv files that it can pull in, based on user input, have different numbers of columns. Each spreadsheet represents a month, and each month has a different number of events. However, one thing is consistent across all the spreadsheets: the first 3 columns are name and contact info, and the last 2 are also contact info. I'm assuming there is a way to write "give me a .sample(4) for each column (however many there are) between the first 3 columns and the last 2 columns". That way, whether there are 50 events or 10 events, it will know how many .sample(4) draws to generate.
- I only want the sample to choose a row if, for the specific column it is looking at, that row has a "Y" value instead of NaN.
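To make the two requirements concrete, here is a rough sketch of the behavior described above, not a tested solution. It assumes the event columns are exactly those between the first 3 and the last 2 (selected with `columns[3:-2]`), and that an entry is marked with the literal string "Y". The function name `draw_winners`, the `seed` parameter, and all of the demo data are made up for illustration.

```python
import pandas as pd


def draw_winners(entries, n=4, seed=None):
    """Return {event column name: sampled rows} for every event column.

    Hypothetical helper: assumes the first 3 and last 2 columns are
    contact info and everything in between is an event column.
    """
    # Positional slice: skip the first 3 columns and the last 2.
    event_columns = entries.columns[3:-2]
    winners = {}
    for column in event_columns:
        # Keep only rows marked "Y" for this event; NaN compares as False.
        eligible = entries[entries[column] == "Y"]
        # Cap the draw size, since sample() raises ValueError when asked
        # for more rows than are available (without replacement).
        size = min(n, len(eligible))
        winners[column] = eligible.sample(size, random_state=seed)
    return winners


# Made-up data: 3 contact columns, 2 event columns, 2 contact columns.
demo = pd.DataFrame({
    "first": ["Ann", "Bob", "Cy", "Di", "Ed", "Flo"],
    "last": ["A", "B", "C", "D", "E", "F"],
    "email": ["a@x.org", "b@x.org", "c@x.org", "d@x.org", "e@x.org", "f@x.org"],
    "event_a": ["Y", "Y", None, "Y", "Y", "Y"],
    "event_b": [None, "Y", None, None, "Y", None],
    "phone": ["1", "2", "3", "4", "5", "6"],
    "city": ["P", "Q", "R", "S", "T", "U"],
})

result = draw_winners(demo, n=4, seed=0)
for column, rows in result.items():
    print(column)
    print(rows[["first", "last", "email"]])
```

Each value in the returned dict is a DataFrame of full rows, so the contact columns come along with every draw. An event with fewer than 4 "Y" entries simply returns all of them.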
I found an explanation here: Use sample() function to apply in a range of column
It explains how to do almost the exact opposite of what I'm trying to do. That answer selects a random sample for every row, whereas I want a random sample (of full rows) for every column.
month = input("What month are you drawing for? ")
year = input("What year are you drawing for? ")
import pandas
ticket_entries = pandas.read_csv(month + year + '.csv')
ticket_entries.sample(4)