Greek text attempt (2015)

febr 2015

Orthographic transcription is a transcription method that employs the standard spelling system of each target language. Transcription as a mapping from sound to script must be distinguished from transliteration, which creates a mapping from one script to another that is designed to match the original script as directly as possible.

Transliteration is the conversion of a text from one script to another.
The Voynich manuscript (VMS) is transliterated (or mapped) into EVA script, where, for example, one Voynich letter is transliterated as ‘f’. My goal is to translate that into a plain language, which will finally be translated into a modern language such as English to enable us to understand the text (which is called transcription).

The path I follow is interesting and could be useful in future investigations involving similar transliteration efforts. I do not claim this method is the best or new; I simply write about my effort and use this blog to free my mind of my thoughts. Writing this blog is satisfying in itself, even when I do not reach a good final result.

For now I assume the text is written in ancient Greek, because it has the best language DNA match.

Step 1. Choose a piece of text that is representative of the VMS as a whole.
I took the entire text and removed the labels of the drawings, as well as the very small text fragments that stand on their own. The resulting working text is called VMS CAB NST (NST = no small text).

One step later I found that the text can be improved by splitting the glyph transliterated as ‘sh’ into the 3 characters ‘c6h’. The reason is that the symbol looks more like the two characters of ‘ch’ with something added than like the EVA letter ‘s’ with an ‘h’ attached.
After comparing the words and the language DNA for that change, it seemed logical to me. The changed text will be referred to as CAB NST2.

After further analysis a new text was made for comparison. This text is called CAB NST3 and has the following replacements:

sh => c6h => 5
ch => 7
ai => 3
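
As a minimal sketch, the replacements above could be applied as ordered string substitutions. The function name to_nst3 and the sample words are only illustrative assumptions; the real input would be the CAB NST text.

```python
# Ordered substitutions taken from the list above.
# The name to_nst3 and the sample input are illustrative assumptions.
NST3_REPLACEMENTS = [
    ("sh", "c6h"),  # CAB NST  -> CAB NST2: split 'sh' into 'c', '6', 'h'
    ("c6h", "5"),   # CAB NST2 -> CAB NST3
    ("ch", "7"),
    ("ai", "3"),
]


def to_nst3(text: str) -> str:
    """Apply the replacements one after another, in the listed order."""
    for old, new in NST3_REPLACEMENTS:
        text = text.replace(old, new)
    return text


print(to_nst3("shedy chol daiin"))  # 5edy 7ol d3in
```

The order matters only for the first two pairs: ‘sh’ has to become ‘c6h’ before ‘c6h’ can become ‘5’.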

Step 2. Identify the correct transliteration of the VMS into EVA.
Remove false character combinations in 2grams, 3grams, 4grams and possibly 5grams.
Let us call those “defects”.

Step 3. Try to match characters from the VMS text to characters in the chosen language. Use:

  • language DNA match
  • 1 letter word match based on repeats
  • 2grams match based on repeats and AVG
  • etc..

The difficulty is that a specific match in one part may give a contradictory result in another. This is because one cannot follow the path one-to-one but has to skip and mix letter paths.

After many iterations of the steps a language was chosen that has the best match: Greek

febr 2015

The 2grams in the VMS were compared with the 2grams in Greek.

Also measured are the repeats of those 2grams in the text and their average repeat distance (AVG).

For every 2gram, the repeats can be expressed as a % of the total repeats of all 2grams and set against its AVG.

In any normal language (we take Modern Greek in this example) there is a simple exponential connection between the AVG and the % of the total. That is logical: if, for example, a 2gram repeats itself at a high frequency, its % of the total repeats of all 2grams must be equally high. If that is not the case, we call it a faulty 2gram (or a “defect” in the 2grams).
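
To make the three measured values concrete, here is a small sketch of how ‘repeated’, the % of the total and the AVG could be computed. It assumes that n-grams are taken with a sliding window over the text with all whitespace removed, and that distance is counted in characters; the function name ngram_stats is hypothetical and this is not necessarily how my own figures were produced. In the tables further down, the AVG is close to 100 divided by the % (for ‘che’: 100 / 4,24 ≈ 23,6 against a listed 23,65), which is the kind of regular relation meant here.

```python
from collections import defaultdict


def ngram_stats(text: str, n: int = 2) -> dict:
    """Return {ngram: (repeated, pct_of_total, avg_distance)}.

    Assumptions: whitespace is dropped, n-grams come from a sliding
    window, and distance is measured in characters between the start
    positions of consecutive occurrences.
    """
    chars = "".join(text.split())
    positions = defaultdict(list)
    for i in range(len(chars) - n + 1):
        positions[chars[i:i + n]].append(i)

    total = sum(len(pos) for pos in positions.values())
    stats = {}
    for gram, pos in positions.items():
        gaps = [b - a for a, b in zip(pos, pos[1:])]
        avg = sum(gaps) / len(gaps) if gaps else None  # None: occurs once
        stats[gram] = (len(pos), 100.0 * len(pos) / total, avg)
    return stats


for gram, (repeated, pct, avg) in ngram_stats("abcabcab").items():
    print(gram, repeated, round(pct, 2), avg)
```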

In the following screen view you see the CAB NST text, where the columns are sorted first on AVG (low to high) and then sorted on ‘repeated’ (high to low).

The faulty 2grams are:

rr, eh, s*, l*

[Screenshot: 2gramfaults]

The 2gram ‘eh’ is remarkable because it is repeated only 4 times, within 3 consecutive lines.
Ah, I did investigate that before!
See this page. That earlier finding has now been confirmed by this method.
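
A case like ‘eh’ (few repeats, all close together) can be sketched as a simple check: compare the observed average gap between occurrences with the gap expected if the same number of occurrences were spread evenly over the text. This is only one possible reading of “faulty”; the function name clustered_ngrams and the threshold min_ratio are assumptions for illustration.

```python
def clustered_ngrams(text: str, n: int = 2, min_ratio: float = 0.25) -> list:
    """List n-grams whose repeats sit much closer together than expected.

    Returns (ngram, repeated, observed_avg_gap, expected_gap) tuples,
    most strongly clustered first. min_ratio is a hypothetical threshold.
    """
    chars = "".join(text.split())          # assumption: ignore whitespace
    total = len(chars) - n + 1             # number of n-gram positions
    positions = {}
    for i in range(total):
        positions.setdefault(chars[i:i + n], []).append(i)

    flagged = []
    for gram, pos in positions.items():
        if len(pos) < 2:
            continue                       # a single occurrence has no gap
        observed = (pos[-1] - pos[0]) / (len(pos) - 1)
        expected = total / len(pos)        # gap if evenly spread
        if observed / expected < min_ratio:
            flagged.append((gram, len(pos), observed, expected))
    return sorted(flagged, key=lambda row: row[2] / row[3])


sample = "qqqq" + "abc" * 40               # 'qq' occurs only at the start
print(clustered_ngrams(sample))
```

In this toy sample only ‘qq’ is flagged, because its three occurrences lie within four characters while the rest of the text repeats evenly.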

3grams

The 3grams in the VMS were analyzed the same way.

For every 3gram, the repeats are expressed as a % of the total and set against the average repeat distance (AVG). The faulty 3grams are those that have an irregular repeat compared to their % of the total.

But in order to get the right pattern we first have to remove the very low repeats, because otherwise the Greek language would be minimized to almost nothing:
[Screenshot: 3gramfaultstry1]
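
Removing the very low repeats can be sketched as a simple count filter. The cut-off min_repeats below is a hypothetical parameter, not the value I actually used, and the function name frequent_ngrams is also just for illustration.

```python
from collections import Counter


def frequent_ngrams(text: str, n: int = 3, min_repeats: int = 4) -> Counter:
    """Count n-grams and keep only those repeated at least min_repeats times.

    Assumptions: whitespace is ignored and n-grams come from a sliding window.
    """
    chars = "".join(text.split())
    counts = Counter(chars[i:i + n] for i in range(len(chars) - n + 1))
    return Counter({g: c for g, c in counts.items() if c >= min_repeats})


print(frequent_ngrams("abcabcabcabcxyz", n=3, min_repeats=4))
```

The same filter would have to be applied to both the VMS text and the comparison language, so that the two patterns stay comparable.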

The removed faulty VMS 3grams are:

3gram repeated
ehd 2
htl 2
keh 3
ehy 2
fyc 2
ot* 2
ate 2
ey* 3
ol* 3
cod 2
tlo 2
kil 2
kyr 2

Next I removed the low-frequency letters and used the CAB NST2 (!) text.

This immediately behaves very well, because there are no defects left in the text.

4grams

Comparing the 3grams and 4grams, I stumbled upon a big problem: the occurrence percentages are much too high!

On the right you see Latin. In the VMS, the top n-grams such as ‘aiin’ occur 4 to 6 times more often! How can we lower that?

3grams, CAB NST2 compared with Latin (avg dist = average distance in graphs):

#   graph  %         delta  repeated  avg dist  delta  x     Latin  %       delta  repeated  avg dist
1   che    4,242403  0,43   4850      23,65     -      4,15  que    1,022   0,23   858       98,11
2   c6h    3,81204   0,24   4358      26,34     2,69   4,79  ter    0,7957  0,10   668       126,1
3   iin    3,57324   0,00   4085      28,07     1,73   5,17  est    0,6908  0,06   580       145,35
4   edy    3,569742  0,03   4081      25,19     -2,88  5,68  qua    0,6289  0,03   528       159,5
5   aii    3,534753  0,82   4041      28,38     3,19   5,89  ili    0,6003  0,01   504       164,76
6   qok    2,711639  0,45   3100      36,7      8,32   4,58  ent    0,592   0,02   497       169,41
7   cho    2,262032  0,05   2586      44,32     7,62   3,93  qui    0,5753  0,05   483       173,98
8   6he    2,213047  0,12   2530      45,35     1,03   4,18  unt    0,53    0,01   445       188,54

Even compared with Greek the factor is high, although the pattern is different.

3grams, CAB NST2 compared with Greek (avg dist = average distance in graphs):

#   graph  %         delta  repeated  avg dist  delta  x     Greek  %       delta  repeated
1   che    4,242403  0,43   4850      23,65     -      1,19  καὶ    3,5751  1,70   3009
2   c6h    3,81204   0,24   4358      26,34     2,69   2,03  αὐτ    1,8737  0,01   1577
3   iin    3,57324   0,00   4085      28,07     1,73   1,91  τοῦ    1,8677  0,89   1572
4   edy    3,569742  0,03   4081      25,19     -2,88  3,64  ὐτο    0,9814  0,21   826
5   aii    3,534753  0,82   4041      28,38     3,19   4,57  τὸν    0,7735  0,06   651
6   qok    2,711639  0,45   3100      36,7      8,32   3,79  των    0,7153  0,03   602
7   cho    2,262032  0,05   2586      44,32     7,62   3,29  σεν    0,6867  0,09   578
8   6he    2,213047  0,12   2530      45,35     1,03   3,70  εὶπ    0,5976  0,03   503
9   hed    2,090586  0,02   2390      43,07     -2,28  3,67  πεν    0,5703  0,00   480
10  oke    2,070468  0,21   2367      48,21     5,14   3,65  της    0,5667  0,01   477

4grams, CAB NST2 compared with Latin:

#   graph  %         delta  repeated  avg dist  delta  x     Latin  %       delta  repeated  avg dist
1   aiin   4,650814  1,49   3726      21,92     -      6,77  fili   0,6872  0,17   433       23,65
2   c6he   3,15796   0,70   2530      32,3      10,38  6,14  terr   0,5142  0,02   324       26,34
3   hedy   2,456469  0,54   1968      37,35     5,05   4,99  tque   0,492   0,01   310       28,07
4   ched   1,917244  0,08   1536      47,92     10,57  4     erra   0,4793  0,01   302       25,19
5   daii   1,836111  0,11   1471      55,5      7,58   3,93  omin   0,4666  0,07   294       28,38
6   qoke   1,722524  0,13   1380      58,76     3,26   4,39  ixit   0,392   0,00   247       36,7
7   okee   1,593959  0,10   1277      63,25     4,49   4,12  itqu   0,3873  0,00   244       44,32
8   eedy   1,489109  0,19   1193      61,61     -1,64  3,86  ibus   0,3857  0,01   243       45,35
9   cheo   1,298134  0,02   1040      78,07     16,46  3,42  domi   0,3793  0,01   239       43,07
10  okai   1,279411  0,02   1025      79,53     1,46   3,43  erun   0,373   0,00   235       48,21

4grams, CAB NST2 compared with Greek:

#   graph  %         delta  repeated  avg dist  delta  x     Greek  %       delta  repeated  avg dist
1   aiin   4,650814  1,49   3726      21,92     -      3,02  αυτο   1,541   0,23   854       67,61
2   c6he   3,15796   0,70   2530      32,3      10,38  2,41  υτου   1,3082  0,54   725       79,64
3   hedy   2,456469  0,54   1968      37,35     5,05   3,18  ιπεν   0,7723  0,01   428       135,76
4   ched   1,917244  0,08   1536      47,92     10,57  2,53  ειπε   0,7578  0,09   420       137,94
5   daii   1,836111  0,11   1471      55,5      7,58   2,75  αυτω   0,6676  0,09   370       156,32
6   qoke   1,722524  0,13   1380      58,76     3,26   2,99  ησεν   0,5756  0,07   319       182,56
7   okee   1,593959  0,10   1277      63,25     4,49   3,14  αυτη   0,507   0,05   281       198,4
8   eedy   1,489109  0,19   1193      61,61     -1,64  3,26  τους   0,4565  0,04   253       229,85
9   cheo   1,298134  0,02   1040      78,07     16,46  3,1   κυρι   0,4186  0,01   232       244,74
10  okai   1,279411  0,02   1025      79,53     1,46   3,1   πρὸς   0,4132  0,02   229       246,33
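
Judging from the numbers, the x column in the tables above appears to be the VMS % divided by the % of the comparison-language n-gram at the same rank. A small sketch of that calculation, using the first three 4gram rows against Latin (the function name rank_factors is an assumption, and decimal commas are written as points for Python):

```python
# First three rows of the 4gram table against Latin.
vms_pct = [4.650814, 3.15796, 2.456469]    # aiin, c6he, hedy
latin_pct = [0.6872, 0.5142, 0.492]        # fili, terr, tque


def rank_factors(left: list, right: list) -> list:
    """Divide the percentages rank by rank and round to two decimals."""
    return [round(a / b, 2) for a, b in zip(left, right)]


print(rank_factors(vms_pct, latin_pct))  # [6.77, 6.14, 4.99]
```

These values agree with the x column for the same rows (6,77, 6,14 and 4,99).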
