Thursday, 18 April 2019

curand_init prevents __global__ method from compiling in CUDA

When I build this code with a large block (64 x 10 threads, indexed by i and j), the kernel does not run. Nothing I try to print from inside it appears.

If I lower the 64 to below 50, it runs again and works. I am new to CUDA, so I may well be doing something wrong. But if I comment out curand_init, the code works fine, so the problem seems to be related to curand_init.

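__global__ void initializePart(Particle *dev_particle) {
    // Global 2D thread coordinates.
    int i = threadIdx.x + blockIdx.x * blockDim.x;
    int j = threadIdx.y + blockIdx.y * blockDim.y;

    // Per-thread generator state; seed is defined elsewhere (not shown).
    curandState state;
    curand_init(seed, i, j, &state);

    // curand_uniform returns a float in (0, 1]; scale it to (-1000, 1000].
    double random = curand_uniform(&state) * (1000 - (-1000)) + (-1000);
}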

dim3 grid(1, 1, 1);
dim3 block(64, 10, 1);
// The kernel takes a Particle*, so pass the device pointer itself, not *dev_particle.
initializePart<<<grid, block>>>(dev_particle);

Once compiled, the kernel will not run unless I lower the 64 to below 50. If I put printf("test") inside the kernel, nothing is printed at all. I just want each thread (i, j) to print a different random number, so 640 random numbers in total between -1000 and 1000.
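The goal described above can be sketched as a small standalone program. This is only an illustration under assumptions: the kernel name `printRandom`, the helper `scaleToRange`, the seed value 1234, and the choice to pass the seed as a kernel argument are all hypothetical and not taken from the original code. It also checks the launch for errors, since a kernel that fails to launch prints nothing.

```cuda
#include <cstdio>
#include <curand_kernel.h>

// Hypothetical helper: map a uniform sample u in (0, 1] onto (lo, hi].
__host__ __device__ inline float scaleToRange(float u, float lo, float hi) {
    return u * (hi - lo) + lo;
}

// Hypothetical kernel: every thread prints one random number.
__global__ void printRandom(unsigned long long seed) {
    int i = threadIdx.x + blockIdx.x * blockDim.x;
    int j = threadIdx.y + blockIdx.y * blockDim.y;
    int id = j * blockDim.x * gridDim.x + i;  // unique linear thread id

    curandState state;
    // Same seed for all threads, one subsequence per thread (an assumed choice).
    curand_init(seed, id, 0, &state);

    float r = scaleToRange(curand_uniform(&state), -1000.0f, 1000.0f);
    printf("thread (%d, %d): %f\n", i, j, r);
}

int main() {
    dim3 grid(1, 1, 1);
    dim3 block(64, 10, 1);  // 640 threads, as in the question
    printRandom<<<grid, block>>>(1234ULL);

    // cudaGetLastError reports launch failures; cudaDeviceSynchronize
    // waits for the kernel and reports errors raised while it ran.
    cudaError_t err = cudaGetLastError();
    if (err == cudaSuccess) err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        printf("kernel failed: %s\n", cudaGetErrorString(err));
        return 1;
    }
    return 0;
}
```

If the launch is rejected, the error string printed by the host should say why, which is more informative than a missing printf from the device.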

Any idea what the problem could be?
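One plausible cause, which nothing in the post confirms, is that the kernel needs more registers per thread than a 640-thread block can be given (curand_init is fairly heavy, especially in debug builds), so the launch is rejected. A hedged way to check this is to ask the runtime what the kernel can support. In the sketch below, `initKernel` is a stand-in for the kernel in the question, and `launchFits`, the `out` buffer, and the seed parameter are hypothetical names introduced only for illustration.

```cuda
#include <cstdio>
#include <curand_kernel.h>

// Stand-in kernel using the same cuRAND calls as the question.
// The seed parameter and the out buffer are assumptions for this sketch.
__global__ void initKernel(unsigned long long seed, float *out) {
    int i = threadIdx.x + blockIdx.x * blockDim.x;
    int j = threadIdx.y + blockIdx.y * blockDim.y;
    curandState state;
    curand_init(seed, i, j, &state);
    out[j * blockDim.x * gridDim.x + i] = curand_uniform(&state);
}

// Hypothetical helper: does a block of this shape stay within the
// per-kernel thread limit reported by the runtime?
static bool launchFits(int maxThreadsPerBlock, dim3 block) {
    return block.x * block.y * block.z <= (unsigned int)maxThreadsPerBlock;
}

int main() {
    // Query the resource usage of the compiled kernel (no launch needed).
    cudaFuncAttributes attr;
    cudaError_t err = cudaFuncGetAttributes(&attr, initKernel);
    if (err != cudaSuccess) {
        printf("cudaFuncGetAttributes failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    dim3 block(64, 10, 1);
    printf("registers per thread: %d\n", attr.numRegs);
    printf("max threads per block for this kernel: %d\n", attr.maxThreadsPerBlock);
    printf("block of 64 x 10 fits: %s\n",
           launchFits(attr.maxThreadsPerBlock, block) ? "yes" : "no");
    return 0;
}
```

If the reported maximum is below 640, splitting the work across more blocks with fewer threads each (for example a larger grid with a smaller block) would be one way to stay within the limit.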



