mardi 13 août 2019

Conditional sampling of binary vectors (?)

I'm trying to find a name for my problem, so I don't have to re-invent wheel when coding an algorithm which solves it...

I have say 2,000 binary (row) vectors and I need to pick 500 from them. In the picked sample I do column sums and I want my sample to be as close as possible to a pre-defined distribution of the column sums.

A tiny example:

Out of the vectors:

110
010
011
110
100

I need to pick 2 to get column sums 2, 1, 0. The solution (exact in this case) would be

110
100

My ideas so far

  • one could maybe call this a binary multidimensional knapsack, but I did not find any algos for that
  • Linear Programming could help, but I'd need some step by step explanation as I got no experience with it
  • as exact solution is not always feasible, something like simulated annealing brute force could work well



Aucun commentaire:

Enregistrer un commentaire