# basic (crypto)analysis V: Bacon

## Bacon cipher

Francis Bacon (1605), not to be confused with Roger Bacon (13th century), used an alphabet of 24 letters and replaced each letter with a block of 5 characters containing only ‘a’ or ‘b’.

For example, c encodes to => aaaba

An encoding example:

Chedy => aaaba aabbb aabaa aaabb babba
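The encoding above can be reproduced with a short sketch. It assumes the usual 24-letter table in which i/j and u/v share a code and each letter's index is written as a five-digit binary number with ‘a’ for 0 and ‘b’ for 1; that assumption matches both examples given here. The names `bacon_encode` and `bacon_decode` are only illustrative.

```python
# 24-letter alphabet: i/j and u/v share one code.
ALPHABET = "abcdefghiklmnopqrstuwxyz"


def _code(index):
    # Five-digit binary number written with 'a' for 0 and 'b' for 1.
    return format(index, "05b").replace("0", "a").replace("1", "b")


DECODE = {_code(i): letter for i, letter in enumerate(ALPHABET)}
ENCODE = {letter: code for code, letter in DECODE.items()}
ENCODE["j"] = ENCODE["i"]
ENCODE["v"] = ENCODE["u"]


def bacon_encode(text):
    # Letters outside the alphabet (spaces, digits) are skipped.
    return " ".join(ENCODE[ch] for ch in text.lower() if ch in ENCODE)


def bacon_decode(code):
    # Keep only a/b, cut into blocks of five, ignore an incomplete tail.
    stream = "".join(ch for ch in code.lower() if ch in "ab")
    usable = len(stream) - len(stream) % 5
    blocks = (stream[i:i + 5] for i in range(0, usable, 5))
    # Blocks with no letter assigned (indices 24 to 31) decode to '?'.
    return "".join(DECODE.get(block, "?") for block in blocks)


print(bacon_encode("Chedy"))
print(bacon_decode("aaaba aabbb aabaa aaabb babba"))
```

Decoding always uses the fixed table; only the step that produces the a/b stream would change in the variations discussed next.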

We can now follow different rules on decoding or, and that is the method I choose, change the encoding when necessary.

The decoding always follows the Bacon rules.

From transcription to encoding we could use other rules.

We could use, for example:

an a becomes aaaaa, etc.,

or an a becomes a Bacon ‘a’ as an abbreviation for ‘aaaaa’, but only as an exception,

or only o, e, c become a Bacon ‘a’ and all other letters a ‘b’,

and many other variations.
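As an illustration of such a variation, the following sketch turns a transcription into an a/b stream by set membership and cuts it into groups of five. The default set (o, e, c) is just the example named above, and the function names are hypothetical.

```python
def to_bacon_stream(text, a_letters=frozenset("oec")):
    # Letters in a_letters become 'a', every other letter becomes 'b'.
    # Non-letters (spaces, punctuation) are dropped.
    return "".join(
        "a" if ch in a_letters else "b"
        for ch in text.lower()
        if ch.isalpha()
    )


def groups_of_five(stream):
    # Cut the stream into complete blocks of five; a short tail is ignored.
    usable = len(stream) - len(stream) % 5
    return [stream[i:i + 5] for i in range(0, usable, 5)]


stream = to_bacon_stream("chedy ol")
print(stream)
print(groups_of_five(stream))
```

Each group can then be looked up in the standard Bacon table, so only the set passed as `a_letters` has to vary between trials.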

For software that searches for a good combination, these would be the characteristics of a good Bacon-encoded text:

For Latin: about 60-70% of the symbols are ‘a’ and the rest are ‘b’.

The average distance between two ‘a’s is 1.5; for ‘b’ it is 3.

The maximum number of ‘a’s in a row is 17; for ‘b’ the maximum is 5.

The share of ‘bb’ occurrences is 10%, with an average distance of 9.

The share of ‘aa’ occurrences is 40%, with an average distance of 2 to 3.
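A candidate a/b stream could be measured against these characteristics with something like the sketch below. How "distance" and the percentages are counted is not spelled out here, so the code makes assumptions: distance is the average gap between successive start positions of a pattern, and the ‘aa’/‘bb’ shares are counted over all overlapping pairs. All names are illustrative.

```python
from itertools import groupby


def positions(stream, pattern):
    # Start indices of every (overlapping) occurrence of pattern.
    return [i for i in range(len(stream) - len(pattern) + 1)
            if stream.startswith(pattern, i)]


def mean_gap(stream, pattern):
    # Average distance between successive occurrences; None if fewer than two.
    pos = positions(stream, pattern)
    if len(pos) < 2:
        return None
    return (pos[-1] - pos[0]) / (len(pos) - 1)


def longest_run(stream, symbol):
    # Length of the longest unbroken run of symbol (0 if it never occurs).
    return max((len(list(g)) for s, g in groupby(stream) if s == symbol),
               default=0)


def bacon_profile(stream):
    # Anything that is not 'a' or 'b' (for example spaces) is ignored.
    stream = "".join(ch for ch in stream if ch in "ab")
    pairs = max(len(stream) - 1, 1)
    return {
        "share_a": stream.count("a") / max(len(stream), 1),
        "gap_a": mean_gap(stream, "a"),
        "gap_b": mean_gap(stream, "b"),
        "max_run_a": longest_run(stream, "a"),
        "max_run_b": longest_run(stream, "b"),
        "share_aa": len(positions(stream, "aa")) / pairs,
        "share_bb": len(positions(stream, "bb")) / pairs,
    }


print(bacon_profile("aaaba aabbb aabaa aaabb babba"))
```

A search routine could reject any letter-to-a/b assignment whose profile falls far outside the ranges listed above before attempting a full decode.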

## Using First or Last position letters for Bacon Encoding

If we took the first letter of every word in the entire text cAB or, for that matter, the last letter of every word, we would have 37,905 characters.

There are then two obvious possibilities if Bacon had been used on these letters: either all letters represent a linear Bacon group and thus form words, or they must first be encoded into Bacon code and then decoded.

The first case is the most obvious, because the letters would form a Bacon series.

For example:

kcoocca => could form into (using the 24-letter alphabet) =>

abaabaaabaabbababbabaaabaaaabaaaaaa*

Decoding this would not form kcoocca, because the length of that word is not a multiple of 5.

But why would we encode and decode if it still comes down to the same?

Yes, that is a good point.

We would still have the same problem: which VMS letter represents which Latin letter?

The second Bacon approach would mean that we do not use the standard Bacon alphabet but another encoding principle, for example: only ‘y’, ‘o’, ‘c’ in the VMS form a Bacon ‘b’ and all other letters form a Bacon ‘a’.

Although all that is a bit far-fetched, everything is possible of course.

Let us look at the possible length of the text then.

Decoding 37,905 characters gives a text that is a factor of 5 smaller, that is, 7,581 characters.

If we also leave out the text in the drawings and tags, this becomes even smaller.

These roughly 7,500 characters would represent the entire message, and that is about 8 to 9 pages of written VMS text.

All that trouble for 8 pages of real text? Very unlikely.
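The arithmetic behind this estimate can be checked in a few lines. The page size of 850 characters is the average used further on in this post; the function names are only illustrative.

```python
CHARS_PER_CODE = 5    # one Bacon group of five symbols yields one letter
CHARS_PER_PAGE = 850  # average characters on a page, as used later on


def decoded_length(source_chars):
    # Number of plaintext letters hidden in source_chars Bacon symbols.
    return source_chars // CHARS_PER_CODE


def pages(chars):
    # Rough number of pages needed for chars characters.
    return chars / CHARS_PER_PAGE


letters = decoded_length(37905)
print(letters, round(pages(letters), 1))
```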

That is why I leave this option where it is, and also because once I have programmed all Bacon permutation possibilities for the whole text, I can always run them on a part of the text.

I also looked at the DNA of those letters, and it is almost the same as the DNA of cAB.

## Words with 5 characters only

I also investigated the possibility of using only words that are 5 characters long, because Bacon encodes each letter in 5 characters.

Their DNA is the same as that of the entire VMS.

There are 9,629 such words in cAB.

Encoding and decoding those into readable characters would give the same number of characters, one per word.

With an average of 850 characters per page, that is about 11 pages of written text.
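To make the "one character per word" step concrete, here is a minimal sketch that keeps only the five-character words and reads each one as a single Bacon group. The a/b rule (o, e, c count as ‘a’) is only the example variation mentioned earlier, not a proposed solution, and the names are hypothetical.

```python
ALPHABET = "abcdefghiklmnopqrstuwxyz"  # 24 letters, i/j and u/v merged


def word_to_letter(word, a_letters=frozenset("oec")):
    # Only five-character words can stand for one Bacon group.
    if len(word) != 5:
        return None
    # 'a' counts as binary 0 and 'b' as binary 1.
    bits = "".join("0" if ch in a_letters else "1" for ch in word)
    index = int(bits, 2)
    # Indices 24 to 31 have no letter in the 24-letter table.
    return ALPHABET[index] if index < len(ALPHABET) else "?"


def decode_five_letter_words(words, a_letters=frozenset("oec")):
    letters = (word_to_letter(w, a_letters) for w in words)
    return "".join(letter for letter in letters if letter is not None)


print(decode_five_letter_words(["chedy", "ol", "ooooo", "daiin"]))
print(round(9629 / 850, 1))  # pages for 9,629 decoded characters
```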

Is that worth all the programming and CPU time?

At this point I hesitate. Once I can successfully rule out the Bacon 5-character theory, I can use the exact same routine to search for other combinations.

And in a few weeks' time I could have processed every combination.

First I have to filter out the label texts, and then I would have to write a permutation table of the 17 VMS letters against a Bacon ‘a/b/null’.

Each letter has to be combined with ‘a / b / nothing’ until everything has been tried.

If that does not produce any readable text, this Bacon option can be ruled out!
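Such a permutation table could be generated lazily, as in the sketch below. With three roles per letter and 17 letters there are 3^17 = 129,140,163 assignments, so they are produced one at a time rather than stored. The letter set passed in is a placeholder and the function names are hypothetical.

```python
from itertools import product

ROLES = ("a", "b", None)  # None stands for 'null': the letter is dropped


def count_assignments(n_letters):
    # Every letter independently takes one of the three roles.
    return len(ROLES) ** n_letters


def assignments(letters):
    # Yield every mapping letter -> role, one at a time.
    for roles in product(ROLES, repeat=len(letters)):
        yield dict(zip(letters, roles))


def apply_assignment(text, assignment):
    # Build the a/b stream; nulls and unknown characters are skipped.
    return "".join(
        assignment[ch] for ch in text if assignment.get(ch) is not None
    )


print(count_assignments(17))
first = next(assignments("oc"))
print(first, apply_assignment("coco", first))
```

Each resulting stream would then be cut into groups of five and decoded with the standard Bacon table, after which the output can be tested for readable text.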