I have a CSV of the format
Team, Player
What I want to do is apply a filter to the field Team, then take a random subset of 3 players from EACH team.
So for instance, my CSV looks like:
Man Utd, Ryan Giggs
Man Utd, Paul Scholes
Man Utd, Paul Ince
Man Utd, Danny Pugh
Liverpool, Steven Gerrard
Liverpool, Kenny Dalglish
...
I want to end up with an XLS consisting of 3 random players from each team, or only 1 or 2 where a team has fewer than 3, e.g.:
Man Utd, Paul Scholes
Man Utd, Paul Ince
Man Utd, Danny Pugh
Liverpool, Steven Gerrard
Liverpool, Kenny Dalglish
I started out using xlrd; my original post is here.
I am now trying to use Pandas, as I believe it will be more flexible in the future.
So, in pseudocode, what I want to do is:
foreach(team in csv)
    print random 3 players + team they are assigned to
I've been looking through Pandas and trying to find the best approach to doing this, but I can't find anything similar to what I want to do (it's a difficult thing to Google!). Here's my attempt so far :
import pandas as pd
from collections import defaultdict
import csv
columns = defaultdict(list)  # each value in each column is appended to a list
with open('C:\\Users\\ADMIN\\Desktop\\CSV_1.csv') as f:
    reader = csv.DictReader(f)  # read rows into a dictionary format
    for row in reader:  # read a row as {column1: value1, column2: value2,...}
        print(row)
        #for (k,v) in row.items():  # go over each column name and value
        #    columns[k].append(v)   # append the value into the appropriate list
        #                           # based on column name k
I have commented out the last two lines as I am not really sure whether they are needed. I now have each row being printed, so I just need to select 3 random rows for each football team (or 1 or 2 where a team has fewer).
How can I accomplish this? Any tips/tricks?
Thanks.
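One possible way to do this in pandas, as a minimal sketch rather than a definitive answer: shuffle all rows, group by Team, and keep the first 3 rows of each group, so teams with fewer than 3 players keep whatever they have. The helper name `sample_players`, its `n` and `seed` parameters, and the in-memory `CSV_TEXT` standing in for the real file are all assumptions made for illustration.

```python
import io

import pandas as pd

# Sample data copied from the question; in practice this would be the CSV on disk.
CSV_TEXT = """Team, Player
Man Utd, Ryan Giggs
Man Utd, Paul Scholes
Man Utd, Paul Ince
Man Utd, Danny Pugh
Liverpool, Steven Gerrard
Liverpool, Kenny Dalglish
"""


def sample_players(df, n=3, seed=None):
    """Return up to n randomly chosen rows per team (hypothetical helper)."""
    shuffled = df.sample(frac=1, random_state=seed)  # shuffle every row
    # head(n) keeps the first n rows of each group, or all of them if fewer exist
    picked = shuffled.groupby("Team", sort=False).head(n)
    return picked.sort_values("Team").reset_index(drop=True)


# skipinitialspace drops the blank after each comma in "Team, Player"
players = pd.read_csv(io.StringIO(CSV_TEXT), skipinitialspace=True)
result = sample_players(players, n=3, seed=0)
print(result)
```

Passing the real file path to `pd.read_csv` instead of the `io.StringIO` wrapper should work the same way, and `result.to_excel(...)` with `index=False` could then write the spreadsheet, assuming an Excel engine such as openpyxl is installed.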