Wednesday, 23 May 2018

php - distributed randomness across array

In my Laravel 5.6 application I've got a start date (start), an end date (end) and a number (x). I must pick x random dates between start and end. Since I'm writing a small generator, I want these dates to be distributed as evenly as possible.

I would like to avoid too many duplicate dates and picks that are too close together, and I want the picked dates to span the whole range from start to end.

I've already found a solution. Even though it works pretty well, I'm not happy with it, and I would like to find a better one with your help.

Step-by-step explanation of what I've done:

  1. Create an associative array containing all the days from start to end, keyed by timestamp, each with the value **x** * **x**. This way every day starts with the same weight.

  2. Pick a $random number between 0 and the sum of all weights. Let's say $x = 8, so I need 8 dates picked between $start = 01/04/18 and $end = 30/04/18, for a total of 30 days. Each day starts with a weight of 8 * 8 = 64, so the sum of the weights will be $weightsSum = 30 * 64 = 1920.

  3. Iterate over the dates array, adding the current $weight to a $sum variable. As soon as $sum >= $random, I pick that date.

  4. Here comes my idea: I map over the dates array and subtract some weight based on the distance between the picked item and the currently mapped one. For instance, let's say that $date[5] has been picked. The difference in days between $date[0] and $date[5] is 5 days, and I've chosen this formula: $weight - round($weight / $diffInDays / 2). The new weight for $date[0] will be 64 - round(64 / 5 / 2) = 58. The new weight for $date[25] will be 62, and for $date[5] it will be 32. Of course, $diffInDays for $date[5] is set to 1; otherwise I'd get a division by zero.
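
To make steps 2 and 3 concrete, here is a minimal, framework-free sketch of a weighted pick by cumulative sum. The helper name pickWeighted() is hypothetical, and PHP's built-in random_int() stands in for the \RNG helper used in my application. It also draws from 1 instead of 0, which is an assumption on my part: that way a day with weight w owns exactly w of the possible outcomes.

```php
<?php

/**
 * Pick one key from a [key => weight] array with probability
 * proportional to its weight (steps 2 and 3 of the list above).
 *
 * pickWeighted() is a hypothetical helper name, not part of Laravel.
 * Keys and weights are assumed to be non-negative integers.
 */
function pickWeighted(array $weights): int
{
    $total = (int) array_sum($weights);

    if ($total < 1) {
        throw new InvalidArgumentException('No positive weight left to pick from.');
    }

    // Draw from 1..total so that a key with weight w owns exactly w outcomes.
    $random = random_int(1, $total);

    $sum = 0;
    foreach ($weights as $key => $weight) {
        $sum += $weight;
        if ($sum >= $random) {
            return $key;
        }
    }

    // Not reachable when all weights are non-negative integers.
    throw new LogicException('Cumulative sum never reached the drawn number.');
}

// Numbers from the example: 30 days, each weighted 8 * 8 = 64.
$weights = array_fill(0, 30, 64);

echo array_sum($weights), PHP_EOL;    // 1920
echo pickWeighted($weights), PHP_EOL; // some index between 0 and 29
```

A key whose weight has dropped to 0 can never be returned by this sketch, because the cumulative sum does not move when it is added.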

This way, the weight of the chosen day drops drastically, but the day can still be picked. A day near the previously chosen one also becomes less likely to be picked, which is what I need as well.
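
The weight reduction from step 4 can be checked in isolation with a small sketch. The function name penalize() is hypothetical, and plain array indexes stand in for consecutive days so the distance is simply the difference between two indexes.

```php
<?php

/**
 * Lower every weight according to its distance from the picked index,
 * using the formula from step 4: weight - round(weight / diffInDays / 2).
 *
 * penalize() is a hypothetical name; indexes stand in for consecutive days.
 */
function penalize(array $weights, int $picked): array
{
    $result = [];
    foreach ($weights as $index => $weight) {
        // The picked day has distance 0; use 1 to avoid a division by zero.
        $diffInDays = max(1, abs($index - $picked));
        $result[$index] = $weight - (int) round($weight / $diffInDays / 2);
    }

    return $result;
}

// 30 days at weight 64, with day 5 picked.
$weights = penalize(array_fill(0, 30, 64), 5);

echo $weights[0], PHP_EOL;  // 58
echo $weights[5], PHP_EOL;  // 32
echo $weights[25], PHP_EOL; // 62
```

One thing this makes visible: days 4 and 6 also drop to 32, the same as the picked day, because all three use a distance of 1 in the formula.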

Let's get to the code:

__buildDatesArray()

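$entriesCount = $this->entries->sum('quantity');

// Work on copies so the original start and end dates are not mutated.
$base = $this->startDate->copy()->startOfDay();
$lastDay = $this->endDate->copy()->startOfDay();

// Every day from start to end (both included) gets the same initial weight x * x.
// Storing the day before calling addDay() keeps the start date in the array.
while ($base->lte($lastDay)) {
    $this->dates->put($base->timestamp, $entriesCount * $entriesCount);
    $base->addDay();
}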
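
For anyone who wants to try this outside Laravel, here is a rough equivalent that uses only built-in PHP classes. buildDates() is a hypothetical helper name, and it assumes both the start and the end day should be part of the array, as described in step 1.

```php
<?php

/**
 * Build [timestamp => weight] for every day from $start to $end inclusive.
 *
 * buildDates() is a hypothetical helper; my real code uses Carbon and a
 * Laravel collection instead of DatePeriod and a plain array.
 */
function buildDates(DateTimeImmutable $start, DateTimeImmutable $end, int $x): array
{
    $first = $start->setTime(0, 0);
    // DatePeriod excludes its end date by default, so move it one day forward.
    $stop = $end->setTime(0, 0)->modify('+1 day');

    $dates = [];
    foreach (new DatePeriod($first, new DateInterval('P1D'), $stop) as $day) {
        $dates[$day->getTimestamp()] = $x * $x;
    }

    return $dates;
}

$utc = new DateTimeZone('UTC');
$dates = buildDates(
    new DateTimeImmutable('2018-04-01', $utc),
    new DateTimeImmutable('2018-04-30', $utc),
    8
);

echo count($dates), PHP_EOL;     // 30
echo array_sum($dates), PHP_EOL; // 1920
```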

__pickARandomDate()

$weightsSum = $this->dates->sum();

// Draw a number between 0 and the total weight.
$random = \RNG::generateInt(0, $weightsSum);

$sum = 0;

foreach ($this->dates as $timestamp => $weight)
{
    // Walk the cumulative weights until they reach the drawn number.
    $sum += $weight;

    if ($sum >= $random)
    {
        // Lower every weight according to its distance from the picked day.
        $this->dates = $this->dates->map(function ($weight, $currentTimestamp) use ($timestamp) {
            $diffInDays = abs($currentTimestamp - $timestamp) / 86400;

            // The picked day itself has a distance of 0; use 1 to avoid a division by zero.
            if ($diffInDays <= 0)
            {
                $diffInDays = 1;
            }

            return $weight - round($weight / $diffInDays / 2);
        });

        return Carbon::createFromTimestamp($timestamp);
    }
}

Any ideas on how to make this better? How can I pick dates in a well-distributed fashion? I'm open to anything. Thank you in advance!



