Wednesday, 7 September 2016

Default random engine for PRNG in C++ generates same output for every instance of a class - proper seed?

I'm not experienced with pseudo-random number generation (PRNG), but I have been giving it some thought recently because I want to test some things, and generating the data manually is difficult and, frankly, quite error-prone.

I have the following class:

#include <QObject>
#include <QList>
#include <QVector3D>
#include <random>
#include <functional>

// TaskCommData is part of a Task instance (a QRunnable).
// It contains all the data required for partially controlling the runnable
// and what it processes inside its run() method
class TaskCommData : public QObject
{
    friend class Task;
    Q_OBJECT
    // Property is used to abort the run() of the Task and also signal the TaskManager that the Task has changed its running status
    Q_PROPERTY(bool running
               READ isRunning
               WRITE setRunningStatus
               NOTIFY signalRunningStatusChanged)
public:
    QString getId() const;  // Task ID
    bool isRunning() const;
signals:
    void signalRunningStatusChanged(QString id, bool running);
public slots:
    void slotAbort();
private:
    bool running;
    QList<QVector3D> data; // Some data in the form of a list of 3D vectors
    QString id;

    // PRNG related members
    std::default_random_engine* engine;
    std::uniform_int_distribution<>* distribution;
    std::function<int()> dice;

    // Private constructor (don't allow creation of TaskCommData outside the Task class, which instantiates it as a class member)
    explicit TaskCommData(QString id, QObject *parent = 0);

    void setRunningStatus(bool running);
    QList<QVector3D>* getData();
    void generateData();
};

This object is created and attached to a set of QRunnables in a Qt 5.7 based application. The important parts are listed below:

#include <QDebug>
#include "TaskCommData.h"

// ...

TaskCommData::TaskCommData(QString _id, QObject *parent)
    : QObject(parent),
      running(false),
      id(_id)
{
    this->engine = new std::default_random_engine();
    this->distribution = new std::uniform_int_distribution<int>(0, 1);
    this->dice = std::bind(*this->distribution, *this->engine);

    generateData();
}

// ...

void TaskCommData::generateData()
{
    QString s;
    s += QString("Task %1: Generated data [").arg(this->id);
    for(int i = 0; i < 10; ++i) {
        this->data.append(QVector3D(dice(), dice(), dice()));   // PROBLEM occurs here but it's probably just the aftermath
        s += "[" + QString::number(this->data.at(i).x()) + ","
                 + QString::number(this->data.at(i).y()) + ","
                 + QString::number(this->data.at(i).z()) + "]";
    }
    s += "]";
    qDebug() << s;
}

Upon initialization I get the following output from qDebug() (I create 10 instances of Task, each of which instantiates one TaskCommData):

"Task task_0: Generated data [[1,0,0][0,1,0][1,1,0][1,0,1][0,0,1][0,1,1][0,0,0][1,1,1][0,1,1][1,0,1]]"
Added task "task_0" (sleep:  0)
"Task task_1: Generated data [[1,0,0][0,1,0][1,1,0][1,0,1][0,0,1][0,1,1][0,0,0][1,1,1][0,1,1][1,0,1]]"
Added task "task_1" (sleep:  1315)
"Task task_2: Generated data [[1,0,0][0,1,0][1,1,0][1,0,1][0,0,1][0,1,1][0,0,0][1,1,1][0,1,1][1,0,1]]"
Added task "task_2" (sleep: 7556)
"Task task_3: Generated data [[1,0,0][0,1,0][1,1,0][1,0,1][0,0,1][0,1,1][0,0,0][1,1,1][0,1,1][1,0,1]]"
Added task "task_3" (sleep:  4586)
"Task task_4: Generated data [[1,0,0][0,1,0][1,1,0][1,0,1][0,0,1][0,1,1][0,0,0][1,1,1][0,1,1][1,0,1]]"
Added task "task_4" (sleep: 5328)
"Task task_5: Generated data [[1,0,0][0,1,0][1,1,0][1,0,1][0,0,1][0,1,1][0,0,0][1,1,1][0,1,1][1,0,1]]"
Added task "task_5" (sleep: 2189)
"Task task_6: Generated data [[1,0,0][0,1,0][1,1,0][1,0,1][0,0,1][0,1,1][0,0,0][1,1,1][0,1,1][1,0,1]]"
Added task "task_6" (sleep: 470)
"Task task_7: Generated data [[1,0,0][0,1,0][1,1,0][1,0,1][0,0,1][0,1,1][0,0,0][1,1,1][0,1,1][1,0,1]]"
Added task "task_7" (sleep: 6789)
"Task task_8: Generated data [[1,0,0][0,1,0][1,1,0][1,0,1][0,0,1][0,1,1][0,0,0][1,1,1][0,1,1][1,0,1]]"
Added task "task_8" (sleep: 6793)
"Task task_9: Generated data [[1,0,0][0,1,0][1,1,0][1,0,1][0,0,1][0,1,1][0,0,0][1,1,1][0,1,1][1,0,1]]"
Added task "task_9" (sleep: 9347)

As you might have guessed from the output, I'd like more variety. Obviously there can't be that much variety, since a single chunk of data (a QVector3D) contains three binary values, but identical output for every task means something has clearly gone wrong.

You might have also noticed the (sleep: ...) in the output. It's the output which comes from my TaskManager class that creates a bunch of Tasks and their respective TaskCommDatas:

void TaskManager::initData()
{
    // Setup PRNG
    std::default_random_engine generator;
    std::uniform_int_distribution<int> distribution(0,10000); // Between 0 and 10000ms
    auto dice = std::bind(distribution, generator);

    this->tasks.reserve(this->taskCount);
    qDebug() << "Adding" << this->taskCount << "tasks...";
    int msPauseBetweenChunks = 0;

    for(int taskIdx = 0; taskIdx < this->taskCount; ++taskIdx) {
        msPauseBetweenChunks = dice();
        Task* task = new Task("task_" + QString::number(taskIdx), msPauseBetweenChunks);
        task->setAutoDelete(false);
        const TaskCommData *taskCommData = task->getCommData();

        // Manage connections
        connect(taskCommData, SIGNAL(signalRunningStatusChanged(QString, bool)),
                this, SLOT(slotRunningStatusChanged(QString, bool)));
        connect(this, SIGNAL(signalAbort()),
                taskCommData, SLOT(slotAbort()));
        this->tasks.insert(task->getCommData()->getId(), task);
        qDebug() << "Added task " << task->getCommData()->getId() << " (sleep: " << msPauseBetweenChunks << ")";
    }

    emit signalCurrentlyRunningTasks(this->tasksRunning, this->taskCount);
}

Here I have the same thing (though not as a class member) and it works (the range is different but still).

Initially I had the same code snippet (the random-number-related part of TaskManager::initData()) inside void TaskCommData::generateData(); that is, the engine, distribution and timer were on the stack and destroyed once they went out of scope. The result was the same, though: the same set of random numbers repeated over and over.

Then I decided that the problem comes from the seed (or rather the lack of one). So I changed my code to:

// ...
std::chrono::nanoseconds nanoseed = std::chrono::duration_cast<std::chrono::nanoseconds>(std::chrono::system_clock::now().time_since_epoch());
qDebug() << "Setting PRNG engine to seed" << nanoseed.count();
this->engine = new std::default_random_engine();
this->engine->seed(nanoseed.count());
this->distribution = new std::uniform_int_distribution<int>(0, 1);
this->dice = std::bind(*this->distribution, *this->engine);

generateData();
// ...

I get a slightly better result:

Setting PRNG engine to seed 1473233571281947000
"Task task_0: Generated data [[1,0,0][0,1,1][0,0,0][0,1,1][1,0,0][1,0,0][0,0,1][1,1,1][1,0,0][1,0,0]]"
Added task  "task_0"  (sleep:  0 )
Setting PRNG engine to seed 1473233571282947700
"Task task_1: Generated data [[1,0,1][1,0,0][1,0,1][0,0,1][1,1,0][0,0,1][0,0,1][0,1,0][0,1,0][0,1,0]]"
Added task  "task_1"  (sleep:  1315 )
Setting PRNG engine to seed 1473233571282947700
"Task task_2: Generated data [[1,0,1][1,0,0][1,0,1][0,0,1][1,1,0][0,0,1][0,0,1][0,1,0][0,1,0][0,1,0]]"
Added task  "task_2"  (sleep:  7556 )
Setting PRNG engine to seed 1473233571283948400
"Task task_3: Generated data [[0,0,1][1,0,1][0,1,1][1,1,1][1,0,0][0,0,0][0,0,1][1,1,0][0,1,1][0,0,1]]"
Added task  "task_3"  (sleep:  4586 )
Setting PRNG engine to seed 1473233571283948400
"Task task_4: Generated data [[0,0,1][1,0,1][0,1,1][1,1,1][1,0,0][0,0,0][0,0,1][1,1,0][0,1,1][0,0,1]]"
Added task  "task_4"  (sleep:  5328 )
Setting PRNG engine to seed 1473233571284950700
"Task task_5: Generated data [[0,0,0][1,1,0][0,0,1][0,0,1][0,1,1][1,0,0][1,0,0][1,0,1][0,0,0][0,0,0]]"
Added task  "task_5"  (sleep:  2189 )
Setting PRNG engine to seed 1473233571284950700
"Task task_6: Generated data [[0,0,0][1,1,0][0,0,1][0,0,1][0,1,1][1,0,0][1,0,0][1,0,1][0,0,0][0,0,0]]"
Added task  "task_6"  (sleep:  470 )
Setting PRNG engine to seed 1473233571285950800
"Task task_7: Generated data [[0,0,0][1,0,0][0,1,1][1,0,0][1,0,1][0,1,0][1,0,1][0,1,0][1,1,0][0,0,1]]"
Added task  "task_7"  (sleep:  6789 )
Setting PRNG engine to seed 1473233571285950800
"Task task_8: Generated data [[0,0,0][1,0,0][0,1,1][1,0,0][1,0,1][0,1,0][1,0,1][0,1,0][1,1,0][0,0,1]]"
Added task  "task_8"  (sleep:  6793 )
Setting PRNG engine to seed 1473233571286950900
"Task task_9: Generated data [[1,0,1][1,1,1][1,0,0][1,1,0][0,1,1][0,0,0][1,0,1][1,0,1][0,0,0][1,0,1]]"
Added task  "task_9"  (sleep:  9347 )

However, there is still too much repetition: consecutive tasks often get the same seed and therefore the same data (task_1 and task_2, task_3 and task_4, and so on). This also has the huge disadvantage that it's bound to how fast the TaskCommData object is created and what happens between the creation of two instances of this class. The faster the creation, the smaller the difference measured with std::chrono::system_clock::now(). It doesn't seem like a good way to generate a seed (of course I might be mistaken :D).

Any idea how to solve this problem? Even if the problem is the seed, I still don't understand why things work just fine in TaskManager::initData() but not here.



