mercredi 29 juillet 2015

Boost Geometry RTree poor query performance under MSVS 2013/2015

I am trying to run the following piece of code in Visual C++ 2013 and 2015. But the performance for query is no good. It takes about 3,000 seconds to finish 1 million searches. I compile the same piece of code in Clang under OS X 10.10 and g++ on under ubuntu 15.04. The query part only takes less than 0.3 second under os x or linux. The difference is about 10,000 times. Something is not right. Any idea is greatly appreciated!

The compiler options I used for VS is as follows:

/Yu"stdafx.h" /GS /GL /analyze- /W3 /Gy /Zc:wchar_t /Zi /Gm- /O2 /sdl /Fd"Release\vc140.pdb" /Zc:inline /fp:precise /D "_NO_DEBUG_HEAP=1" /D "WIN32" /D "NDEBUG" /D "_CONSOLE" /D "_UNICODE" /D "UNICODE" /errorReport:prompt /WX- /Zc:forScope /Gd /Oy- /Oi /MD /Fa"Release\" /EHsc /nologo /Fo"Release\" /Fp"Release\ConsoleApplication1.pch" 

Below is the code:

// TestSpatialIndex.cpp : Defines the entry point for the console application.
#include "stdafx.h"
#include <iostream>
#include <boost/geometry.hpp>
#include <boost/geometry/geometries/point_xy.hpp>
#include <boost/geometry/algorithms/distance.hpp>

#include <chrono>
#include <ctime>
#include <tuple>
#include <string>
#include <iostream>
#include <fstream>
#include <sstream>

namespace bg = boost::geometry;
namespace bgi = boost::geometry::index;

int main(int argc, char *argv[]) {

    using Point = bg::model::d2::point_xy<double>;
    using Box = bg::model::box<Point>;
    using Value = std::pair<Point, int>;
    using RTree = bgi::rtree<Value, bgi::rstar<8>>;

    std::vector<Value> vv;
    std::vector<Box> vb;

    for (size_t i = 0; i < 1000000; ++i) {
        auto x = (rand() % 360000000) / 1000000.0 - 180;
        auto y = (rand() % 180000000) / 1000000.0 - 90;
        vv.push_back(std::make_pair(Point(x, y), i));
    }

    for (size_t i = 0; i < 1000000; ++i) {
        auto x = (rand() % 360000000) / 1000000.0 - 180;
        auto y = (rand() % 180000000) / 1000000.0 - 90;
        vb.push_back(Box(Point(x, y), Point(x + 0.005, y + 0.005)));
    }
    std::chrono::duration<double> elapsed_seconds;
    std::chrono::time_point<std::chrono::system_clock> start, end;
    start = std::chrono::system_clock::now();

    RTree rt2(vv.begin(), vv.end());

    end = std::chrono::system_clock::now();

    elapsed_seconds = end - start;

    std::cout << "finished rt2 computation in "
        << "elapsed time: " << elapsed_seconds.count() << "s\n";

    start = std::chrono::system_clock::now();
    for (auto &it : vb) {
        std::vector<Value> v_res;
        rt2.query(bgi::intersects(it), std::back_inserter(v_res));
    }
    end = std::chrono::system_clock::now();
    elapsed_seconds = end - start;

    std::cout << "finished rt2 query in "
        << "elapsed time: " << elapsed_seconds.count() << "s\n";

    return 0;
}

Update: I found statements like auto x = (rand() % 360000000) / 1000000.0 - 180; are the culprits. The original purpose was to return latitude and longitude coordinates with 6 decimal places. But RAND_MAX in MSVC is 32767, so (rand() % 360000000) / 1000000.0 is equivalent to rand()/1000000.0 which returns a value ranging from 1e-6 to 0.032767. Therefore a 0.005x0.005 region encloses a lot of points. On the other hand in clang and gcc RAND_MAX is 2147483647, which is the largest signed integer representable in 32 bits. The above code will generate uniformly distributed latitude and longitude pairs under clang and gcc. Hence the number of points returned from a intersection query in MSVC is way much more than that from clang or gcc. And that's why it is 10k times slower in MSVC. In VC, I changed (rand() % 360000000) / 1000000.0 to (rand() % 36000) / 100.0 and (rand() % 180000000) / 1000000.0 to (rand() % 18000) / 100.0, the performance is now comparable.




Aucun commentaire:

Enregistrer un commentaire