Suppose we have a sequence s of length n in which k distinct symbols appear. For example:
s=1,2,3,4,5,6,1,2,3,4,5,6,1,2,3,4,5,6,1,2,3,4,5,6
n=24
k=6 (symbols 1,...,6)
Observe that each symbol appears exactly the same number of times. So, from the symbol counts alone, we could infer that the symbols are drawn uniformly at random.
I'm looking for a way to determine the probability of a subsequence repetition. In particular, in the given example s is composed of four consecutive blocks ss1,ss2,ss3,ss4, with each ss=1,2,3,4,5,6.
In the simplest case we may suppose that each symbol appears the same number of times as the others, but I also need a way to deal with cases such as the following:
s1=1,2,3,4,4,1,2,3,4,4,1,2,3,4,4
in which the repeated block is 1,2,3,4,4.
In general, the problem may be summarized as follows:
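To make "the repeated block" concrete, here is a minimal sketch that finds the shortest block whose back-to-back copies reproduce a whole sequence. The function name `smallest_repeating_block` is made up for illustration, and the sketch assumes the sequence consists only of exact, complete copies of the block.

```python
def smallest_repeating_block(seq):
    """Return the shortest block b such that seq equals b repeated a whole number of times."""
    seq = list(seq)
    n = len(seq)
    for p in range(1, n + 1):
        # p must divide n, and p-length prefix repeated n/p times must rebuild seq
        if n % p == 0 and seq[:p] * (n // p) == seq:
            return seq[:p]
    return seq  # only reached for an empty sequence


s = [1, 2, 3, 4, 5, 6] * 4
s1 = [1, 2, 3, 4, 4] * 3
print(smallest_repeating_block(s))   # [1, 2, 3, 4, 5, 6]
print(smallest_repeating_block(s1))  # [1, 2, 3, 4, 4]
```

If no shorter block exists, the whole sequence is returned, which corresponds to "no repetition".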
Given:
- A sequence *s* of length *n* in which appear *k* symbols
- An interval I=s_i,...,s_j of s that contains at least two consecutive, identical subsequences (e.g. 1,2,3,1,2,3,1,2,3)
Determine the probability that the repetition occurred by chance rather than by construction.
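One possible way to attach a number to this, sketched here under explicit assumptions, is a Monte Carlo permutation test. The null model is an assumption: the same symbols with the same counts, placed in random order. The statistic is also a choice, namely the length of the longest contiguous stretch formed by a block (of at least `min_block` symbols) repeated at least twice. The names `longest_tandem_stretch` and `repetition_p_value` are hypothetical.

```python
import random


def longest_tandem_stretch(seq, min_block=2):
    """Length of the longest stretch made of a block (>= min_block symbols) repeated at least twice."""
    n = len(seq)
    best = 0
    for p in range(min_block, n // 2 + 1):
        run = 0  # consecutive positions i with seq[i] == seq[i + p]
        for i in range(n - p):
            if seq[i] == seq[i + p]:
                run += 1
                if run >= p:  # at least two full copies of a length-p block
                    best = max(best, run + p)
            else:
                run = 0
    return best


def repetition_p_value(seq, trials=2000, min_block=2, seed=0):
    """Estimate how often a random reordering of seq repeats at least as much as seq does."""
    rng = random.Random(seed)
    observed = longest_tandem_stretch(seq, min_block)
    shuffled = list(seq)
    hits = 0
    for _ in range(trials):
        rng.shuffle(shuffled)  # assumed null model: same symbol counts, random order
        if longest_tandem_stretch(shuffled, min_block) >= observed:
            hits += 1
    # add-one correction so the estimate is never exactly zero
    return observed, (hits + 1) / (trials + 1)


observed, p = repetition_p_value([1, 2, 3, 4, 5, 6] * 4)
print(observed, p)
```

A small estimated value suggests the repetition is unlikely to arise from random ordering alone. Setting `min_block=2` ignores trivial repeats such as the adjacent 4,4 in s1. A different null model (for example, symbols drawn independently and uniformly from the k symbols) would give a different answer.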