Vigenère Cipher

Suggested experiment

  • Click SECRET to let the computer choose the plaintext and keep it invisible.
  • Click on the KEY LENGTH tab.
  • Adjust the shift amount until the red line is on the highest bar. This is the shift with the highest coincidence index, so it is the most likely key length.
  • Click on the FREQUENCY tab.
  • Choose KEY POSITION 1.
  • Adjust the slider until the red line is on the highest point on the graph. This is the key character that decrypts to a plaintext with the most English-like letter distribution (as found by the dot product of the probability vectors).
  • Click ADD TO KEY to add this key to the KEY field.
  • Repeat for each of the other key positions.
  • Click DECRYPT to see if the key you constructed works.

Overview

The Vigenère cipher has been used since at least the mid 1500s, and has often been called unbreakable, even in the early 20th century, despite the fact that it was broken in the 19th century. The plaintext is encrypted by adding each character to the corresponding character in the key (wrapping around at the end of the alphabet), where the key is repeated as necessary to make it as long as the message. If the key is long and random, then even if the original plaintext had an uneven distribution of characters (as in standard English text), the ciphertext will have a much more uniform distribution. For example, even if the letter "e" is very common in the plaintext, it will encrypt to any of several letters, depending on its location, and so none of those letters will tend to be as common as the "e" was in the original plaintext. This is why it has often seemed unbreakable.
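The addition described above can be sketched in a few lines of Python. This is only an illustration, not the applet's code: it assumes a 26-letter lowercase alphabet and text with no spaces or punctuation, and the function name vigenere is made up for the example.

```python
import string

ALPHABET = string.ascii_lowercase  # assumption: plain 26-letter alphabet


def vigenere(text: str, key: str, decrypt: bool = False) -> str:
    """Add (or, when decrypting, subtract) the repeating key, modulo 26."""
    sign = -1 if decrypt else 1
    out = []
    for i, ch in enumerate(text):
        # The key is repeated as necessary: position i uses key[i mod len(key)].
        shift = ALPHABET.index(key[i % len(key)])
        out.append(ALPHABET[(ALPHABET.index(ch) + sign * shift) % 26])
    return "".join(out)


ciphertext = vigenere("attackatdawn", "lemon")
print(ciphertext)                                  # lxfopvefrnhr
print(vigenere(ciphertext, "lemon", decrypt=True))  # attackatdawn
```

Note how the three "t"s in the plaintext become "x", "f" and "f": the same letter encrypts differently depending on which key character it lines up with.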

However, if the key is N characters long, and you look at every Nth character in the ciphertext, then the "e" will always be encrypted to the same ciphertext character, and so that particular character will occur unusually often. In fact, every Nth character is essentially being encrypted with a shift cipher, which can be broken by frequency analysis. In this way, each character in the key can be found, by looking at frequencies for every Nth ciphertext character, starting at each possible starting position.
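As a small illustration of "every Nth character", the sketch below splits a ciphertext into N groups, one per key position. The helper name columns is hypothetical, and the sample ciphertext is "attackatdawn" encrypted with the key "lemon" under the usual 26-letter alphabet.

```python
def columns(ciphertext: str, n: int) -> list[str]:
    """Group the ciphertext by key position: every nth character, from each start."""
    return [ciphertext[start::n] for start in range(n)]


# Key length 5, so there are five groups, each encrypted with one key character.
print(columns("lxfopvefrnhr", 5))  # ['lvh', 'xer', 'ff', 'or', 'pn']
```

The third group is "ff": both of those positions held a plaintext "t" and both met the key character "m", so they encrypted to the same letter. With a long message, each group is an ordinary shift cipher with enough letters for frequency analysis.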

So the cipher is easily broken if the key length N is known. But even if it isn't known, it is easily found. If you look at every (N+1)th character, you'll eventually see "e" encrypted by every character in the key, so the distribution will appear much more even than if you'd looked at every Nth character (or every character at a multiple of N). So the correct key length can be found by trying various periods, and seeing which one gives the most uneven character distribution.
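One way to put a number on "uneven" is the sum of the squared letter frequencies, which is 1 when a single letter dominates and about 1/26 when all letters are equally common. That measure is an assumption chosen for this sketch (the text does not say which one to use), and the function names are made up. The candidate periods must not exceed the ciphertext length.

```python
from collections import Counter


def unevenness(text: str) -> float:
    """Sum of squared letter frequencies: higher means a more uneven distribution."""
    counts = Counter(text)
    return sum((c / len(text)) ** 2 for c in counts.values())


def period_score(ciphertext: str, period: int) -> float:
    """Average unevenness of the groups formed by taking every `period`th character."""
    groups = [ciphertext[start::period] for start in range(period)]
    return sum(unevenness(g) for g in groups) / period


def guess_key_length(ciphertext: str, max_period: int) -> int:
    """Try periods 1..max_period and return the one with the most uneven groups."""
    return max(range(1, max_period + 1), key=lambda p: period_score(ciphertext, p))
```

Multiples of the true key length also score well, so in practice the smallest high-scoring period is the one to try first.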

However, there is an even simpler method than that. Suppose that when looking at every Nth character, an "e" encrypts to an "x". Then the letter "x" will be unusually common in those positions. If the ciphertext is placed above a copy of itself, shifted by N characters, then an "x" will be above an "x" unusually often. By a similar argument, every letter will appear above itself unusually often. The number of times a letter appears above a matching letter is called the "coincidence index". If the coincidence index is calculated for several different shifts, the shift with the highest coincidence index will likely be the key length.
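Counting matches against a shifted copy takes very little code. The sketch below follows the definition in this section (a raw count of matching positions); the function names are hypothetical and it is not necessarily how the applet computes its bars.

```python
def coincidences(ciphertext: str, shift: int) -> int:
    """Count positions where a letter sits above the same letter in the shifted copy."""
    return sum(a == b for a, b in zip(ciphertext, ciphertext[shift:]))


def likely_key_length(ciphertext: str, max_shift: int) -> int:
    """Return the shift in 1..max_shift that gives the most coincidences."""
    return max(range(1, max_shift + 1), key=lambda s: coincidences(ciphertext, s))


# An extreme case: a plaintext of nothing but "e", encrypted with the key "abc".
print(coincidences("efgefgefg", 3))  # 6
print(coincidences("efgefgefg", 1))  # 0
```

With real text the difference is less dramatic, but the key length and its multiples still tend to stand out from the other shifts.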

Key Length

After entering the plaintext in the INPUT TEXT box, entering the key in the KEY box, and clicking ENCRYPT (or simply clicking SECRET to let the computer make up the plaintext and key), the ciphertext will become visible. Click on the KEY LENGTH tab to discover the key length. The bar chart shows the coincidence index for each shift, so the tallest bar will likely give the key length. Use the shift buttons to move the red line, and to see how many matches each shift gives.

Frequency

After finding the key length, click on the FREQUENCY tab to work out each character of the key. Enter the key length you discovered into the KEY LENGTH box. Choose a KEY POSITION of 1 to start working out the first character of the key. Shift the slider to find the most likely key character, which is the one that corresponds to the highest bar (the one whose decrypted letter distribution has the largest dot product with the letter frequencies of standard English). When you have found the key character in this position, click ADD TO KEY to add it to the end of the key, and then advance to the next position and repeat. Click on DECRYPT to see the results of the current key guess at any time.
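The dot-product test can be sketched as follows. The frequency table holds commonly cited approximate percentages for English and is an assumption, as the applet's own table is not given here; the function names are also made up. Summing the English frequency of each decrypted letter is the same as taking the dot product of the letter counts with the frequency table.

```python
# Approximate English letter frequencies in percent, a..z (assumed values).
ENGLISH = [
    8.167, 1.492, 2.782, 4.253, 12.702, 2.228, 2.015, 6.094, 6.966,
    0.153, 0.772, 4.025, 2.406, 6.749, 7.507, 1.929, 0.095, 5.987,
    6.327, 9.056, 2.758, 0.978, 2.360, 0.150, 1.974, 0.074,
]


def score(group: str, shift: int) -> float:
    """Dot product of the letter counts, after undoing `shift`, with ENGLISH."""
    return sum(ENGLISH[(ord(ch) - ord("a") - shift) % 26] for ch in group)


def best_key_char(group: str) -> str:
    """Return the key character whose decryption looks most like English."""
    best = max(range(26), key=lambda s: score(group, s))
    return chr(ord("a") + best)


def recover_key(ciphertext: str, key_length: int) -> str:
    """Apply best_key_char to every key position in turn."""
    return "".join(
        best_key_char(ciphertext[pos::key_length]) for pos in range(key_length)
    )


# A group made only of "h" is most English-like if "h" stands for "e": key "d".
print(best_key_char("hhhhhhhhhh"))  # d
```

This assumes lowercase text with only the letters a to z. On short groups the highest score is not always the right key character, which is why it is worth clicking DECRYPT to check the result.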

The Big Question

Finally, there is the question of how to pronounce "Vigenère". Although it's difficult to write without IPA symbols, it has been transcribed variously as "veezh-nare" or "veezh-en-air" or "Vee-zhen-yehr" or "Vee-zhun-aire", where the "zh" is pronounced like the "s" in "measure" or "Asia", and the entire word is pronounced with a French accent (obviously!).