mercredi 30 janvier 2019

Randomly sampling numerals from a list whose aggregate needs to be at least greater than a given benchmark

I have a list of tuples formed by 1000 object ids and their scores, i.e.:

scored_items = [('14',534.9),('4',86.0),('78',543.21),....].

Let T be the aggregated score of the top 20 highest scoring items.

That's easy. Using python:

top_20 = sorted(score_items, key=lambda k: k[1],reverse = True)[:20] T = sum(n for _, n in top_20)

Next, let t equal a quarter of T. I.e. in python: t = math.ceil(T/4)

My question is: what's the most efficient way to randomly select 20 items (without replacement) from scored_items such that their aggregated score is equal to or greater than (but never lower than) t? They may or may not include items from top_20.

Would prefer an answer in Python, and would prefer to not rely on external libraries much


Background: This is an item-ranking algorithm that is strategy proof according to an esoteric - but useful - Game Theory theorem (source: section 2.5 in this research paper). Strategy proof essentially means it's tough to game it.

I'm a neophyte python programmer and have been mulling how to solve this problem for a while now, but just can't seem to wrap my head around it. Would be great to know how the experts would approach and solve this.

I suppose the most simplistic (and least performant perhaps) way is to keep randomly generating sets of 20 items till their scores' sum exceeds or equals t.

But there has to be a better way to do this right?




Aucun commentaire:

Enregistrer un commentaire