I'm trying to return the following result in a MariaDB online compiler (inspired from this old solution Select 2 products from each category in MySQL ):
Return random groups of 2 records from each categories of database test1.csv
.
test1
dataset:
id,name,category
150,apple,fruit
153,orange,fruit
187,kiwi,fruit
189,banana,fruit
142,beef,meat
141,pork,meat
143,rabbit,meat
146,chicken,meat
410,pepsi,drinks
413,milk,drinks
418,coffee,drinks
415,mate,drinks
581,carrot,vegetables
586,tomato,vegetables
589,cucumber,vegetables
582,broccoli,vegetables
I reproduce the test on MariaDB Compiler here: https://onecompiler.com/mariadb/3ymv4a6e6
Code:
-- create
CREATE TABLE test1 (
id INTEGER PRIMARY KEY,
name TEXT NOT NULL,
category TEXT NOT NULL
);
-- insert
INSERT INTO test1 VALUES (150,'apple','fruit');
INSERT INTO test1 VALUES (153,'orange','fruit');
INSERT INTO test1 VALUES (187,'kiwi','fruit');
INSERT INTO test1 VALUES (189,'banana','fruit');
INSERT INTO test1 VALUES (142,'beef','meat');
INSERT INTO test1 VALUES (141,'pork','meat');
INSERT INTO test1 VALUES (143,'rabbit','meat');
INSERT INTO test1 VALUES (146,'chicken','meat');
INSERT INTO test1 VALUES (410,'pepsi','drinks');
INSERT INTO test1 VALUES (413,'milk','drinks');
INSERT INTO test1 VALUES (418,'coffee','drinks');
INSERT INTO test1 VALUES (415,'mate','drinks');
INSERT INTO test1 VALUES (581,'carrot','vegetables');
INSERT INTO test1 VALUES (586,'tomato','vegetables');
INSERT INTO test1 VALUES (589,'cucumber','vegetables');
INSERT INTO test1 VALUES (582,'broccoli','vegetables');
-- fetch
SET @prev=0,@rownum=0;
SELECT id, name, category
FROM (
SELECT *,
IF( @prev <> category,
@rownum := 1,
@rownum := @rownum+1
) AS rank,
@prev := category,
@rownum
FROM (
SELECT * FROM test1
ORDER BY category, rand()
) random_test1
) test1_ranked
WHERE rank <= 2;
As shown above the code only returns 2 random records and from one single and same category.
An example of intended result on given dataset test1 would be:
2 records (or more, randomly selected) x 4 categories (all available categories, will be much more in production dataset)
For example:
150,apple,fruit
153,orange,fruit
142,beef,meat
141,pork,meat
418,coffee,drinks
415,mate,drinks
589,cucumber,vegetables
582,broccoli,vegetables
What is the way to control/expand the random selection to all categories and not just to a single one?
I looked at the order By and the rank and Rand docs but I couldn't find mentions for subcategories/groups application.
I found a similar python and pandas example of the exact expected result: 5. Pandas sample rows by group
df.groupby('color').apply(lambda x: x.sample(n=3)).reset_index(drop = True)
How to achieve the same result in MariaDb?
Your help is appreciated.
Aucun commentaire:
Enregistrer un commentaire