I am currently writing an application that generates random data, specifically random names. I have made decent progress, but I am not satisfied with many of the generated names. The problem lies in my production rules, which I've attached at the bottom of this post.
The basic idea is consonant, vowel, consonant, vowel, but some consonant rules themselves expand to a consonant followed by a vowel (such as b<VO>).
I have not finished the rules yet, but the final version would follow the format shown below. Rather than finishing them as they are, though, I would like a better basis for the production rules.
I have tried to find a reference that discusses either a CFG already created for English-sounding words, or an English reference that breaks down the basic patterns of letter combinations in words. Unfortunately, I have not been able to find a useful resource to help me advance further than I already have. Does anyone know of a place I should look, or a reference I can consult?
Also: in your opinion, would a context-sensitive grammar work better?
//These variables will be the production rules that the functions below use to generate strings
var A = ['a','aa'];
A.probabilities = [90,10]; //probability of each option corresponding to the entries
A.name = "A";
var B = ['br','bl','b<VO>','b'];
B.probabilities = [22,22,21,35];
B.name = "B";
var C = ['ch','cr','ck','c<VO>','c'];
C.probabilities = [25,5,5,25,40];
C.name = "C";
var D = ['d<R>','db<VO>','d<VO>','d'];
D.probabilities = [20,1,35,49];
D.name = "D";
var E = ['e','ee'];
E.probabilities = [90,10];
E.name = "E";
var K = ['k<B>','k<R>','k<VO>','k'];
K.probabilities = [5,15,40,40];
K.name = "K";
var O = ['o','oo'];
O.probabilities = [90,10];
O.name = "O";
var Q = ['qu'];
Q.probabilities = [100];
Q.name = "Q";
var R = ['rh<VO>','r<VO>','r'];
R.probabilities = [8,32,60];
R.name = "R";
var S = ['sh','sc','sw','s<VO>','s'];
S.probabilities = [25,5,5,25,40];
S.name = "S";
var T = ['tr','t<VO>','t'];
T.probabilities = [30,30,40];
T.name = "T";
var CO = ['<B>','<C>','<D>','f','g','h','j','<K>','l','m','n','p','<Q>','<R>','s','t','v','w','x','y','z']; //for now, Y is here
CO.probabilities = [2.41,4.49,6.87,3.59,3.25,9.84,0.24,1.24,6.5,3.88,10.9,3.11,0.153,9.67,10.2,14.6,1.58,3.81,0.242,3.19,0.12];
CO.name = "CO";
var VO = ['<A>','<E>','i','<O>','u'];
VO.probabilities = [21.43,33.33,18.28,19.7,7.23];
VO.name = "VO";
var VOCO = ['<VO>','<CO>'];
VOCO.probabilities = [38.1,61.9];
VOCO.name = "VOCO";
var rules = [A,B,C,D,E,K,O,Q,R,S,T,CO,VO,VOCO]; //this will contain all of the production rule references; since we know they are rules, we don't need < > here
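Since each rule keeps its entries and its probabilities in two parallel arrays, a small sanity check can catch mismatches early. The sketch below is only an illustration: `checkRule` is a hypothetical helper name, and the assumption that each rule's weights should total roughly 100 is mine. As posted, the weights for D add up to 105, which may or may not matter depending on how the generator uses them.

```javascript
// Hypothetical helper (not part of the original code): returns a list of
// human-readable problems found in a single production rule.
function checkRule(rule) {
  var problems = [];

  // Every entry needs exactly one weight.
  if (rule.probabilities.length !== rule.length) {
    problems.push(rule.name + ": " + rule.length + " entries but " +
                  rule.probabilities.length + " probabilities");
  }

  // Assumption: weights are meant as percentages, so they should total about 100.
  // A small tolerance allows for rounded letter-frequency figures.
  var total = rule.probabilities.reduce(function (sum, w) { return sum + w; }, 0);
  if (Math.abs(total - 100) > 0.5) {
    problems.push(rule.name + ": probabilities sum to " + total + ", not 100");
  }

  return problems;
}

// The D rule exactly as posted.
var D = ['d<R>','db<VO>','d<VO>','d'];
D.probabilities = [20,1,35,49];
D.name = "D";

console.log(checkRule(D));
```

Running the same check over every element of `rules` would flag any rule whose arrays drift out of step as more productions are added.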
And to generate a name:
var result = "<CO><VO>".repeat(getRandomInt(1,4));
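The functions that turn a template such as `<CO><VO>` into a finished name are not shown in the post, so here is a minimal sketch of how that expansion could work. The helper names (`pickWeighted`, `expand`, `makeRule`) and the small `demoRules` table are hypothetical stand-ins, not the original code. The weights are normalized by their total, so they do not need to sum to exactly 100.

```javascript
// Hypothetical helpers; the names below are not from the original post.

// Pick one entry from a rule, using its parallel `probabilities` array as weights.
function pickWeighted(rule, random) {
  var total = rule.probabilities.reduce(function (sum, w) { return sum + w; }, 0);
  var roll = random() * total;
  for (var i = 0; i < rule.length; i++) {
    roll -= rule.probabilities[i];
    if (roll < 0) return rule[i];
  }
  return rule[rule.length - 1]; // guard against floating-point round-off
}

// Replace the leftmost <NAME> token with a weighted pick from the rule of that
// name, and repeat until no tokens remain.
function expand(template, ruleTable, random) {
  random = random || Math.random;
  var token = /<([A-Z]+)>/;
  var match;
  while ((match = token.exec(template)) !== null) {
    var rule = ruleTable[match[1]];
    if (!rule) throw new Error("Unknown rule: " + match[1]);
    template = template.replace(match[0], pickWeighted(rule, random));
  }
  return template;
}

// Build a rule in the same shape as the ones above: an array of entries
// carrying `probabilities` and `name` properties.
function makeRule(name, entries, probabilities) {
  var rule = entries.slice();
  rule.probabilities = probabilities;
  rule.name = name;
  return rule;
}

// Tiny stand-in grammar, keyed by rule name for direct lookup.
var demoRules = {
  B: makeRule("B", ['br', 'b<VO>', 'b'], [30, 30, 40]),
  VO: makeRule("VO", ['a', 'e', 'i'], [40, 40, 20]),
  CO: makeRule("CO", ['<B>', 'n', 't'], [20, 40, 40])
};

console.log(expand("<CO><VO>".repeat(2), demoRules));
```

Passing the random source as a parameter is a design choice that makes the generator repeatable: a fixed function can be supplied in tests, while `Math.random` is used by default.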