mercredi 24 juillet 2019

Select n random rows from table per group of codes

I have a table full of customer details from insurance policies or quotes. Each one is assigned an output code that relates to a marketing campaign and each occurs 4 times, one per "batch" which just represents a week in the month. I need to select a random 25 percent of the rows per code, per batch number (1-4) to put into another table so I can then hold those rows back and prevent the customer being marketed to.

All the solutions I've seen on stack so far instruct how to do this for a specific number of rows per group using a ROW_NUMBER in an initial CTE query then selecting from that where rn <= a given number. I need to do this but select 25 percent of each group instead.

I've tried this solution but the specific row number doesn't move me any further forward;

Select N random rows in group

Using the linked solution, this is how my code currently is without a complete where clause because I know this isn't quite what I need.

;WITH AttributionOutput AS (
SELECT [Output Code], BatchNo, MonthandYear
FROM [dbo].[Direct_Marketing_UK]
WHERE MonthandYear = 'Sep2019'
And [Output Code] NOT IN ('HOMELIVE','HOMELIVENB','HOMENBLE')
GROUP BY [Output Code], BatchNo, MonthandYear
HAVING COUNT(*) >= 60
)

, CodeandBatch AS (
SELECT  dmuk.PK_ID,
    dmuk.MonthandYear,
    dmuk.PackNo,
    dmuk.BatchNo,
    dmuk.CustomerKey,
    dmuk.URN,
    dmuk.[Output Code],
    dmuk.[Quote/Renewal Date],
    dmuk.[Name],
    dmuk.[Title],
    dmuk.[Initial],
    dmuk.[Forename],
    dmuk.[Surname],
    dmuk.[Salutation],
    dmuk.[Address 1],
    dmuk.[Address 2],
    dmuk.[Address 3],
    dmuk.[Address 4],
    dmuk.[Address 5],
    dmuk.[Address 6],
    dmuk.[PostCode],
    ROW_NUMBER() OVER(PARTITION BY dmuk.[Output Code], dmuk.BatchNo ORDER BY newid()) as rn
FROM [dbo].[Direct_Marketing_UK] dmuk INNER JOIN
 AttributionOutput ao ON dmuk.[Output Code] = ao.[Output Code]
                            AND dmuk.BatchNo = ao.BatchNo
                            AND dmuk.MonthandYear = ao.MonthandYear
)

SELECT URN,
   [Output Code],
   [BatchNo]
FROM CodeandBatch
WHERE rn <= 

I can't see how a ROW_NUMBER() can help me to grab 25 percent of the rows from every combination of Output Code and batch number.




Aucun commentaire:

Enregistrer un commentaire