mercredi 2 décembre 2020

BigQuery - Random numbers repeating when generated inside arrays

I made a BigQuery query which involves generating an array of random numbers for each row. I use the random numbers to decide which elements to include from an array that exists in my source table.

I had a lot of trouble getting the arrays of random numbers to not repeat themselves over every single row. I found a workaround, but is this expected behavior? I'll post two "methods" (one with desired results, one with bad results) below. Note that both methods work fine if you don't use an array, but just generate a single random number.

Method 1 (BAD Results):

SELECT
    (
        SELECT
            ARRAY(
                SELECT AS STRUCT
                    RAND() AS random
                FROM UNNEST(GENERATE_ARRAY(0, 10, 1)) AS _time
            ) AS random_for_times
    )
FROM UNNEST(GENERATE_ARRAY(0, 10, 1))

Method 2 (GOOD results):

SELECT
    (
        SELECT
            ARRAY(
                SELECT AS STRUCT
                    RAND() AS random
                FROM UNNEST(GENERATE_ARRAY(0, 10, 1)) AS _time
            ) AS random_for_times
        FROM (SELECT NULL FROM UNNEST([0]))
    )
FROM UNNEST(GENERATE_ARRAY(0, 10, 1))

Example Results - Method 1 (BAD):

Row 1
0.5431173080158003
0.5585452983410205
...
Row 2   
0.5431173080158003
0.5585452983410205
...

Example Results - Method 2 (GOOD):

Row 1
0.49639706531271377
0.1604380522058521
...
Row 2   
0.7971869432989377
0.9815667330115473
...



Aucun commentaire:

Enregistrer un commentaire