Cracking The Vigenere Cipher

I wanted to use Vigenere as the cipher for the 3rd message on the competition to win a domain name, but changed my mind due to the number of online tools that might help decode this cipher easily even if you know nothing about how Vigenere works.

Before you start reading about how deciphering Veginere works, I invite you to take a look how Vigenere is used to cipher messages from the Wikipedia article here. In fact, Vigenere was called “le chiffre indĂ©chiffrable” or the undecipherable cipher, because any means of cryptanalysis invented before it was defeated, till Charles Babbage found a clever, very clever, way to crack it.

After finishing reading the Code Book the last week, I started decoding the different enigmas proposed at the end of the book; and yesterday I started deciphering the 4th enigma which is a Vigenere cipher (that I finished yesterday too), and I find it pretty amazing for starters to try. I must mention that this short article will contain the solution to the enigma.

I chose to go old way while deciphering this, finding the key by hand, then decoding the message step by step.

0. The message to decode :

K Q O W E F V J P U J U U N U K G L M E K J I
N M W U X F Q M K J B G W R L F N F G H U D W
U U M B S V L P S N C M U E K Q C T E S W R E
E K O Y S S I W C T U A X Y O T A P X P L W P
...

The complete message can be found here.

1. Finding the key length:

Babbage’s method for finding the key length was that the repeated text in the ciphered message is probably produced from the same repeated letters from the clear message and which are ciphered with the same part of the key. Hence, finding an estimate between the repeated series of letters in the ciphered message would give us a good hint about the length of the key.

Clear : T H I S I S A C L E A R T E X T T H I S C A N B E E A S L Y D E C
Key :   C O D E C O D E C O D E C O D E C O D E C O D E C O D E C O D E C
Crypt : V V L W K G D G N S D V V S A X V V L W E O Q F G S D W N M G I E

As you can see the series of letters T, H, I is ciphered to the same series V, V, L, because they were situated at the same relative position according the key; notice also that E, A is ciphered to the same S, D.
The key length is at worst the spacing between these repetitions in the ciphered text, and usually it is the greatest common divisor of the spacing of (most) the repetitions (why? give it a second thought.).

285181560
This is the table of the distances from the Vigenere message (done by hand)

Anchor
Repetition
Distances
Divisors
WXIZAYG 2 190
1 | 2 | 5 | 10 | 19 | 38 | 95 | 190
EFVJ 2 220
1 | 2 | 4 | 5 | 10 | 11 | 20 | 22 …
WUU 3 95
1 | 5 | 19 | 95
EEK 3 200
1 | 2 | 4 | 5 | 8 | 10 | 20 | 25 …
UUN 2 130
1 | 2 | 5 | 10 | 13 | 26 | 65 | 130
MEK 2 135 ..
EKJ 2 515 ..
UXF 2 135 ..
JBJ 2 60 ..
GFB 2 30
1 | 2 | 3 | 5 | 6 | 10 | 15 | 30
PNT 2 35
 5 | 7
As you can see, all the repetitions have 5 as a common divisor, so we will suppose that the key length is 5.

2. Breaking the text into (key-length) sets:

Now that we know the length of the key, we will separate our cipher text to several sets. We do this by grouping all the letters ciphered with the first character of the key together, the second group will consist of the characters coded with the second letter of the key etc…

C O D E

T H I S
I S A C
L E A R
T E X T
T H I S
C A N B
E E A S
L Y D E
C

Each group is ciphered with one letter of the key. You can see that the resulting groups are the result of a simple Caesar cipher. (notice that we group the ciphered text, not the clear one –do we have the clear one anyway?-)

Here is a sample code that takes a ciphered text and a key length and returns a list of simple Ceasar ciphers from the Vigenere cipher (C# code):


3. Deciphering the Ceasar groups:

Now that we have the groups from the Vigenere cipher, it’s very easy to decipher each one apart (each one needs 26 tries to decrypt, so a key length of 5 gives us a 26^5 cases to try).

But we can do better! With frequency analysis, we need only to locate the Caesar key for each group, so if the text is for example in French or English, we can locate the letter “e” (most frequent one) and deduce the key from this letter.

If, for example, in the first group, the letter “s” is the most repeated one, we know that “e” is replaced by “s” and hence the key is “s”“e” = “m”.

The following code, analyses the frequencies and tries to find the key (supposing that the text is long enough and that the letters are distributed uniformly)


4. Cracking Vigenere :

The key, is the concatenation of all the Caesar keys we found in each group from the above step. Now that we have the key, reversing Vigenere is a simple child's game, here is a sample code that does this:


You can of course try different key lengths with this program and see if the text has any meaning without going through stage 1 to find the key length. (or you can calculate it directly using Friedman test)

5. Source Code :

The complete source code is available here: https://gist.github.com/944422.

I really wanted the last message of the Win a free domain name competition to be a Vigenere cipher since it is so much fun, but unfortunately, it won’t be. Have fun ;)

Posted in , . Bookmark the permalink. RSS feed for this post.

comments powered by Disqus

Swedish Greys - a WordPress theme from Nordic Themepark. Converted by LiteThemes.com.