I'm working with a very long list of numbers, say 1.5 billion. I need a way to specify a percentage of the numbers to keep and discard the rest. I know I can use a random number generator to decide whether to keep each one, but the problem is that the set of kept and discarded numbers must always be the same. Meaning, if I run the program and it decides to discard indexes 2, 5, and 10, the next time I run the program it must discard 2, 5, and 10 as well. This is very important.
I'm also facing an issue with memory. If I generate a huge list of bools to determine which numbers are discarded and which are kept (assuming we go that way), the profiler says the program uses around 15 GB of memory, which is already too much considering I have yet another list of 1.5 billion numbers. Here's my code for that, in case it matters:
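    static bool[] GenerateShouldAddList(int totalCombos, decimal percentToAdd)
    {
        // Unseeded, so the sequence differs from run to run.
        Random RNG = new Random();
        bool[] bools = new bool[totalCombos];
        // percentToAdd is a fraction, e.g. 0.25m for 25%.
        int percent = (int)(percentToAdd * 100);
        for (int i = 0; i < totalCombos; i++)
        {
            // The upper bound is exclusive: this yields 0..99, exactly 100 values.
            int randNum = RNG.Next(0, 100);
            bools[i] = randNum < percent;
        }
        return bools;
    }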
So, to avoid making a huge list, is there a way to write a function that takes the index number (say index 5364), the total count (1.5 billion), and the percentage to keep, and returns whether I should add that specific index or not? If I run each index through that function one at a time, I should be left with only the percentage of numbers I specified. Most importantly, the function must always return the same result for the same index (as long as the total count and the percentage don't change). I suspect this isn't possible, but I'm hoping there are people on here who are much smarter than me. Any help is appreciated!
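
One way to get such a function is to hash the index instead of drawing from a random number generator. This is a minimal sketch, not a definitive implementation. The names `Sampler`, `Mix`, `ShouldKeep`, and the `seed` parameter are made up for illustration. `Mix` uses the SplitMix64 mixing constants, and any other well-mixed hash would do. The sketch assumes that keeping approximately the requested percentage is acceptable, since hashing does not guarantee an exact count. It uses no array, and the result depends only on the index, the fraction, and the seed.

```csharp
using System;

static class Sampler
{
    // SplitMix64-style mixer: maps a 64-bit input to a well-scrambled 64-bit output.
    static ulong Mix(ulong x)
    {
        unchecked
        {
            x += 0x9E3779B97F4A7C15UL;
            x = (x ^ (x >> 30)) * 0xBF58476D1CE4E5B9UL;
            x = (x ^ (x >> 27)) * 0x94D049BB133111EBUL;
            return x ^ (x >> 31);
        }
    }

    // fractionToKeep is in [0, 1], e.g. 0.25 keeps roughly 25% of all indexes.
    // seed is a hypothetical extra parameter: change it to get a different
    // (but still repeatable) selection.
    public static bool ShouldKeep(long index, double fractionToKeep, ulong seed = 12345UL)
    {
        ulong h = Mix(unchecked((ulong)index) ^ Mix(seed));
        // Use the top 53 bits to build a double uniformly spread over [0, 1).
        double u = (h >> 11) * (1.0 / 9007199254740992.0); // divide by 2^53
        return u < fractionToKeep;
    }
}
```

Usage would look like `if (Sampler.ShouldKeep(i, 0.25)) { /* keep item i */ }` inside the loop over all 1.5 billion indexes. The total count is not needed as an input, because each index is decided independently of the others. If the kept count has to match the percentage exactly, this approach is not enough on its own.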