samedi 22 juillet 2023

In a MYSQL table of word definitions, within sql add 3 columns/fields of definitions taken randomly from other rows

I have a MYSQL table called aa_ngl_defs of the following form:

id Word POS Definition
1 the article used to point to something already mentioned
2 be verb used to show the identity of a thing
3 and conj used to join words or groups of words
4 of prep belonging to or connected with something
5 to prep used to indicate place or direction
6 a article one particular thing or one of a class of things
7 in prep used to indicate being inside of or surrounded by something else
8 have verb to own, possess, or hold something
9 it pronoun a thing that has been previously mentioned
10 you pronoun used to refer to the person the speaker is addressing
11 he pronoun used to refer to a male person that is the subject
12 for prep indicating the purpose or need of something
13 they pronoun used to refer to two or more people, animals, or things
14 not adverb used to make an expression negative
15 that adj used to identify a specific person, thing, fact or idea
16 we pronoun used to refer to the speaker and another person as the subject
17 on prep touching and being supported or physically in contact by something

The table has unique contiguous integers in the id column.

I want to create a new table by adding 3 columns of definitions taken randomly from other rows within this table. This would allow me to create quizzes with 3 distractors. I need reasonable performance speed but not necessarily high speed, as the table has 3,500 rows but I only need to generate a new quiz once-a-week.

So that it looks like this:

id  Word    POS Correct_Definition   Distractor1    Distractor2   Distractor3

The online quiz generators Quizlet, Kahoot and Blooket are able to take a simple list like this and produce quizzes with 3 distractors. That's what I'm trying to emulate.

The following works OK for adding 1 extra distractor definition:

    SELECT t1.id, t1.Word, t1.Definition, t2.Definition
    FROM aa_ngl_defs AS t1
    LEFT JOIN aa_ngl_defs AS t2 
      ON t1.id <> t2.id -- Ensure we don't join the row with itself
    ORDER BY RAND()
    LIMIT 3;

But I don't know how to add two more distractor definitions, and also my query took 17s which I suspect will multiply exponentially if I try to add more columns.

I tried to incorporate the following to improve performance

    SELECT *
      FROM aa_ngl_defs AS r1 JOIN
        (SELECT (RAND() * (SELECT MAX(id) FROM aa_ngl_defs)) AS id
        )
            AS r2
     WHERE r1.id >= r2.id
    ORDER BY r1.id ASC
    LIMIT 1;

... and this works very fast to select one row at random, but I am completely lost as to how incorporate this to add the three distractors. I think I need nested select statements or perhaps variables.

Obviously, since each row will have 3 distractors there will be some repeats of distractors across rows, but I can't have repeated distractors within any single row. Randomness only needs to be reasonably random, not strictly so.




Aucun commentaire:

Enregistrer un commentaire