basic analysis III: repeats
I counted the repeats of characters (average distance in characters),
the average distance of repeating words counted in words,
and the average distance of repeating words counted in lines.
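As a rough illustration of this kind of count, here is a minimal Python sketch. It assumes that a "repeat" is every occurrence after the first one, and that the distance is the gap between two consecutive occurrences of the same item. The function name `repeat_stats` is made up for this example, and the exact counting convention behind the tables may differ.

```python
from collections import defaultdict

def repeat_stats(items):
    """Return {item: (repeats, average_distance)} for items seen more than once.

    Assumption: distance = gap in positions between consecutive occurrences.
    """
    last_seen = {}            # item -> position of its previous occurrence
    gaps = defaultdict(list)  # item -> list of gaps between occurrences
    for pos, item in enumerate(items):
        if item in last_seen:
            gaps[item].append(pos - last_seen[item])
        last_seen[item] = pos
    return {item: (len(g), sum(g) / len(g)) for item, g in gaps.items()}

# A string is a sequence of characters, so this counts character repeats.
stats = repeat_stats("daiin.chol.daiin")
for char in sorted(stats):
    repeats, avg = stats[char]
    print(char, repeats, round(avg, 2))
```

The same function works on a list of words instead of a string, which then gives the distance counted in words.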
char cA | repeated | avg char.dist. in chars | char cB | repeated | avg char.dist. in chars | char cAB | repeated | avg char.dist. in chars |
. | 10899 | 6,12 | . | 22541 | 6,30 | . | 36136 | 6,31 |
a | 3577 | 18,64 | a | 9240 | 15,37 | a | 14281 | 15,96 |
c | 5056 | 13,19 | c | 7261 | 19,56 | c | 13314 | 17,11 |
d | 3158 | 21,11 | d | 8876 | 16,00 | d | 12973 | 17,56 |
e | 3764 | 17,71 | e | 14295 | 9,93 | e | 20070 | 11,35 |
f | 156 | 425,53 | f | 289 | 487,59 | f | 505 | 448,96 |
g | 42 | 1525,21 | g | 30 | 4702,10 | g | 96 | 2363,27 |
h | 6413 | 10,40 | h | 10183 | 13,95 | h | 17856 | 12,76 |
i | 3614 | 18,44 | i | 7398 | 19,20 | i | 11732 | 19,42 |
k | 2709 | 24,61 | k | 7365 | 19,29 | k | 10934 | 20,84 |
l | 3004 | 22,20 | l | 6569 | 21,62 | l | 10518 | 21,66 |
m | 391 | 170,19 | m | 604 | 235,11 | m | 1116 | 204,14 |
n | 1825 | 36,53 | n | 4028 | 35,27 | n | 6141 | 37,1 |
o | 8867 | 7,52 | o | 14011 | 10,14 | o | 25468 | 8,95 |
p | 439 | 151,20 | p | 1078 | 131,55 | p | 1630 | 139,65 |
q | 1131 | 58,95 | q | 4207 | 33,76 | q | 5423 | 42,02 |
r | 2383 | 27,98 | r | 4342 | 32,72 | r | 7456 | 30,56 |
s | 2423 | 27,51 | s | 4187 | 33,92 | s | 7387 | 30,84 |
t | 2238 | 29,75 | t | 3891 | 36,50 | t | 6944 | 32,81 |
y | 4513 | 14,77 | x | 28 | 4908,61 | v | 9 | 6505,11 |
z | 2 | 21500,50 | y | 11599 | 12,25 | x | 35 | 6378,63 |
 | | | | | | y | 17655 | 12,91 |
 | | | | | | z | 2 | 30010,5 |
processed 1816 lines | processed 2647 lines | processed 5214 lines |
processed 66678 chars | processed 142053 chars | processed 227864 chars |
Chars
Because I replaced each space with a dot,
the average distance between dots also resembles the word length.
However, the average dot distance is slightly higher because of the extra end-of-line dots I used in the text.
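The link between dot distance and word length can be checked on a toy line. This is only a sketch under my own assumptions: words are separated by a single dot, and the sample words are just picked from the tables. Each gap between two dots covers one word plus the dot itself, so the dot distance should sit about one above the word length.

```python
def avg_dot_distance(text):
    """Average gap between consecutive dots, counted in characters."""
    positions = [i for i, ch in enumerate(text) if ch == "."]
    gaps = [b - a for a, b in zip(positions, positions[1:])]
    return sum(gaps) / len(gaps)

def avg_word_length(text):
    """Average length of the dot-separated words (empty pieces skipped)."""
    words = [w for w in text.split(".") if w]
    return sum(len(w) for w in words) / len(words)

sample = "daiin.chol.shedy.ol.qokeedy."
print(avg_dot_distance(sample), avg_word_length(sample))
```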
Words
A word with an average word distance of 1 was repeated directly.
For example: two two.
The significance of this average distance count lies in identifying written numbers.
For example, if I refer to paragraph three two one, section four, page six six five.
Or in Latin: x x v iij
Words that have a repeat of 2 cannot be common numbers like 0 to 9.
But more on that later.
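To pick out directly repeated words, something like the following sketch could be used. It assumes the text is already split into a list of words; the helper names `word_gaps` and `directly_repeated` are invented for this example and are not the program behind the tables.

```python
def word_gaps(words):
    """Map each repeated word to the list of gaps (in words) between its occurrences."""
    last = {}
    gaps = {}
    for i, word in enumerate(words):
        if word in last:
            gaps.setdefault(word, []).append(i - last[word])
        last[word] = i
    return gaps

def directly_repeated(words):
    """Words whose average word distance is exactly 1, like 'two two'."""
    return sorted(w for w, g in word_gaps(words).items() if sum(g) / len(g) == 1)

words = "paragraph three two one section four page six six five".split()
print(directly_repeated(words))
print(word_gaps("x x v iij".split()))
```

A word that also occurs elsewhere in the text gets a higher average, so this only finds words that always repeat directly. Checking for any gap equal to 1 would be the looser variant.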
Lines
The most repeated word occurs every 6 or 7 lines in Currier B and AB.
In Currier A, that is every 4 lines of text.
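The distance in lines can be sketched the same way. Here I assume that a word is counted at most once per line, so a repeat inside the same line does not give a distance of 0. That is a guess on my part, and the name `line_repeat_stats` is made up for this example.

```python
def line_repeat_stats(lines, sep="."):
    """Return {word: (repeats, average_distance_in_lines)}.

    Assumption: each word counts once per line, so same-line repeats are ignored.
    """
    last_line = {}
    gaps = {}
    for line_no, line in enumerate(lines):
        for word in {w for w in line.split(sep) if w}:
            if word in last_line:
                gaps.setdefault(word, []).append(line_no - last_line[word])
            last_line[word] = line_no
    return {w: (len(g), sum(g) / len(g)) for w, g in gaps.items()}

lines = ["daiin.chol.daiin", "shedy.ol", "daiin.ol", "chol.shedy", "daiin"]
line_stats = line_repeat_stats(lines)
for word in sorted(line_stats):
    print(word, line_stats[word])
```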
If we compare this information with other languages, we could also see whether:
* the VMS has the same language characteristics
* chars, words and lines match common verbs and words.
word cA | repeated | avg word dist. in words | word cB | repeated | avg word dist. in words | word cAB | repeated | avg word dist. in words |
qoekol | 2 | 1 | dalo | 2 | 3 | qoekol | 2 | 1 |
okolaiin | 2 | 1 | chtl | 2 | 3 | okolaiin | 2 | 1 |
dyky | 2 | 7 | arod | 2 | 13 | choekeey | 2 | 2 |
sham | 2 | 7 | olchdaiin | 2 | 17 | chtl | 2 | 3 |
p | 2 | 7 | cheked | 2 | 25 | olchdaiin | 2 | 17 |
chekal | 2 | 11 | ssheey | 2 | 28 | shcphy | 2 | 22 |
k | 2 | 18 | qopcheos | 2 | 52 | cheked | 2 | 25 |
shcphy | 2 | 22 | ykeody | 2 | 67 | ssheey | 2 | 28 |
okod | 2 | 25 | ko | 2 | 76 | qotchoiin | 2 | 42 |
arary | 2 | 36 | qokod | 2 | 77 | chordy | 2 | 50 |
qotchoiin | 2 | 42 | tchody | 2 | 84 | qopcheos | 2 | 52 |
okalal | 2 | 46 | daiindy | 2 | 89 | qokchaiin | 2 | 54 |
okan | 2 | 48 | oldam | 2 | 96 | *eody | 2 | 56 |
………. | ………. | ………. | ||||||
cheor | 59 | 188,5 | dal | 133 | 165,72 | schey | 5 | 5723,5 |
shey | 60 | 190,93 | okain | 136 | 167,84 | cholody | 5 | 5760,25 |
dol | 63 | 176,06 | qokar | 137 | 167,54 | par | 5 | 5764,5 |
y | 64 | 148,54 | qol | 145 | 142,73 | s | 243 | 154,84 |
shy | 65 | 170,36 | otedy | 151 | 152,09 | dal | 253 | 145,28 |
shor | 69 | 149,4 | okaiin | 168 | 132,36 | al | 260 | 144,95 |
cheol | 71 | 159,29 | qokal | 169 | 132,23 | qokaiin | 262 | 141,15 |
aiin | 73 | 148,06 | al | 186 | 123,48 | dy | 270 | 135,96 |
chey | 78 | 145,34 | dar | 188 | 122,21 | qokedy | 272 | 123,87 |
dal | 85 | 126,65 | shey | 204 | 110,14 | qokain | 279 | 122,47 |
dain | 93 | 121,63 | qokaiin | 240 | 92,25 | shey | 283 | 134 |
dar | 95 | 118,61 | ar | 249 | 92,7 | qokeedy | 305 | 110,9 |
or | 96 | 119,57 | chey | 250 | 92,54 | qokeey | 308 | 121,4 |
cthy | 101 | 111,95 | or | 250 | 92,73 | dar | 318 | 118,13 |
ol | 101 | 112,76 | qokeey | 264 | 87,02 | chey | 344 | 110,08 |
chy | 104 | 106,45 | qokedy | 271 | 84,79 | ar | 350 | 108,43 |
sho | 106 | 107,48 | qokain | 275 | 80,49 | or | 363 | 104,68 |
shol | 118 | 92,49 | qokeedy | 305 | 75,78 | chol | 396 | 95,05 |
dy | 124 | 89,59 | daiin | 315 | 73,38 | shedy | 426 | 79,44 |
s | 164 | 68,48 | aiin | 351 | 65,51 | aiin | 469 | 79,32 |
chor | 182 | 62,41 | shedy | 417 | 55,49 | chedy | 501 | 67,54 |
chol | 280 | 40,26 | ol | 421 | 55,07 | ol | 537 | 70,59 |
daiin | 511 | 22,2 | chedy | 491 | 47,13 | daiin | 863 | 43,8 |
processed 1816 lines | processed 2648 lines | processed 5215 lines | ||||||
processed 11415 words | processed 23206 words | processed 37920 words |
word cA | repeated | avg word dist. in lines | word cB | repeated | avg word dist. in lines | word cAB | repeated | avg word dist. in lines |
dyky | 2 | 1 | olchdaiin | 2 | 1 | olchdaiin | 2 | 1 |
sham | 2 | 1 | arod | 2 | 2 | shcphy | 2 | 3 |
chekal | 2 | 1 | cheked | 2 | 3 | cheked | 2 | 3 |
shcphy | 2 | 3 | ssheey | 2 | 3 | ssheey | 2 | 3 |
okod | 2 | 3 | qopcheos | 2 | 6 | qopcheos | 2 | 6 |
arary | 2 | 4 | ykeody | 2 | 7 | qotchoiin | 2 | 7 |
okalal | 2 | 5 | cher | 2 | 9 | chean | 2 | 8 |
qotchoiin | 2 | 7 | tchody | 2 | 9 | chordy | 2 | 8 |
p | 2 | 7 | daiindy | 2 | 9 | ko | 2 | 9 |
………. | ………. | ………. | ||||||
qotchy | 51 | 27,6 | lchedy | 113 | 22,6 | okeey | 162 | 31,91 |
cthol | 53 | 33,08 | okeey | 113 | 23,28 | cheol | 164 | 31,64 |
shey | 56 | 32,65 | chckhy | 115 | 23,14 | cheey | 167 | 30,31 |
dol | 58 | 30,56 | chdy | 120 | 21,42 | shol | 174 | 28,99 |
cheor | 58 | 30,72 | cheey | 122 | 21,68 | qokal | 176 | 25,08 |
shy | 62 | 28,54 | dal | 125 | 20,23 | okaiin | 193 | 26,21 |
y | 63 | 23,4 | okain | 128 | 20,35 | dain | 200 | 26,15 |
shor | 65 | 25,66 | qokar | 128 | 20,45 | chor | 206 | 25,13 |
cheol | 67 | 26,97 | qol | 130 | 18,29 | s | 228 | 22,78 |
aiin | 71 | 24,39 | otedy | 137 | 19,18 | qokedy | 231 | 19,9 |
chey | 76 | 23,79 | okaiin | 152 | 16,77 | al | 231 | 22,47 |
dal | 80 | 21,39 | qokal | 156 | 16,4 | dal | 235 | 21,69 |
or | 86 | 21,27 | al | 172 | 15,23 | dy | 246 | 20,68 |
dar | 89 | 20,2 | dar | 176 | 14,93 | qokain | 249 | 18,71 |
dain | 90 | 20 | shey | 184 | 13,96 | qokaiin | 249 | 20,44 |
ol | 95 | 19,09 | ar | 224 | 11,77 | shey | 257 | 20,3 |
chy | 97 | 18,23 | or | 226 | 11,71 | qokeedy | 263 | 17,52 |
cthy | 97 | 18,59 | qokaiin | 227 | 11,17 | qokeey | 276 | 18,65 |
sho | 100 | 18,16 | qokedy | 230 | 11,43 | dar | 293 | 17,68 |
shol | 107 | 16,16 | chey | 231 | 11,43 | ar | 308 | 16,96 |
dy | 113 | 15,72 | qokeey | 233 | 11,26 | chey | 321 | 16,23 |
s | 154 | 11,62 | qokain | 245 | 10,35 | or | 328 | 15,94 |
chor | 170 | 10,64 | qokeedy | 263 | 10,04 | chol | 347 | 14,96 |
chol | 237 | 7,58 | daiin | 285 | 9,26 | shedy | 380 | 12,12 |
daiin | 432 | 4,18 | aiin | 314 | 8,37 | aiin | 420 | 12,21 |
 | | | shedy | 371 | 7,12 | chedy | 446 | 10,33 |
 | | | ol | 372 | 7,11 | ol | 479 | 10,88 |
 | | | chedy | 437 | 6,04 | daiin | 751 | 6,93 |
processed 1816 lines | processed 2648 lines | processed 5215 lines |
processed 11415 words | processed 23206 words | processed 37920 words |
When we test digraphs, trigraphs, etc. (or is the correct spelling digram, trigram, etc.?), we can see which combinations are favoured in the text. Later on, we can also use this same test to look for a possible key, if any.
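A minimal sketch of such a test in Python, assuming that the combinations are counted inside words only and never across the dot separator. The function name `ngram_counts` and the sample text are made up for this example.

```python
from collections import Counter

def ngram_counts(text, n, sep="."):
    """Count all character combinations of length n inside each word."""
    counts = Counter()
    for word in text.split(sep):
        for i in range(len(word) - n + 1):
            counts[word[i:i + n]] += 1
    return counts

sample = "daiin.aiin.dain"
digrams = ngram_counts(sample, 2)
trigrams = ngram_counts(sample, 3)
print(digrams.most_common())
print(trigrams.most_common())
```

Counting across word boundaries instead would only need the text passed in as one string with the dots kept, which may be the more useful variant when looking for a key.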