by Neal R. Wagner
Copyright © 2002 by Neal R. Wagner. All rights reserved.
NOTE: This site is obsolete. See book draft (in PDF):
People have used cryptography for thousands of years. For example, the Caesar Cipher, which was used during the time of Julius Caesar, wraps the alphabet from A to Z into a circle. The method employs a fixed shift, say of 3, to transform A to D, B to E, and so on until W to Z, X to A, Y to B, and Z to C. Thus a message ATTACK becomes DWWDFN and appears incomprehensible to someone intercepting the message. (Well, incomprehensible to someone not very smart.) At the other end, one can reverse the transformation by stepping 3 letters in the opposite direction to change DWWDFN back to ATTACK.
This example illustrates many concepts and terminology from cryptography. The original message is also called the plaintext. The transformed message is also called the ciphertext or the encrypted message, and the process of creating the ciphertext is encryption. The process of getting the original message back is called decryption, using a decryption algorithm. Thus one decrypts the ciphertext.
The basic method used, moving a fixed distance around the circle of letters, is the encryption algorithm. In this case the decryption algorithm is essentially the same. The specific distance moved, 3 in this case, is the key for this algorithm, and in this type of symmetric key system, the key is the same for both encryption and decryption. Usually the basic algorithm is not kept secret, but only the specific key. The idea is to reduce the problem of keeping an entire message secure to the problem of keeping a single short key secure, following Law C1 in the Introduction to Cryptography.
For this simple algorithm there are only 26 possible keys: the shift distances of 0, 1, 2, etc. up to 25, although 0 leaves the message unchanged, so a key equal to 0 is not going to keep many secrets. If the key is greater than 25, just divide by 26 and take the remainder. (Thus the keys just form the integers modulo 26, the group Z26 described in the section Cryptographer's Favorites.)
If an interceptor of this message suspects the nature of the algorithm used, it is easy to try each of the 25 keys (leaving out 0) to see if any meaningful message results -- a method of breaking a code known as exhaustive search. In this case the search is short, though it still might pose problems if the letters in the ciphertext are run together without blanks between words.
The Caesar Cipher is just a special case of the cryptograms from the previous chapter, since with a shift of 3 for example, the cyprtogram key is:
Alphabet: ABCDEFGHIJKLMNOPQRSTUVWXYZ Translated to: DEFGHIJKLMNOPQRSTUVWXYZABC
Here is a computer implementation of the Caesar cipher: Java source.
The Beale Cipher is a just simple extension of the Caesar Cipher, but it is easy to use by hand and it provides excellent security.
Consider the Caesar cipher of the previous section, and associate the letters A through Z with the numbers 0 through 25, that is, A is associated with 0, B with 1, C with 2, and so on until Z with 25. One can represent the previous shift of 3 in the example by the letter D, so that each letter specifies a shift. A special encryption method called the Beale cipher starts with a standard text (the key in this case) like the U.S. Constitution (WE THE PEOPLE . . .) and with the message to encrypt, say ATTACK. Write down the letters of the standard text on one line, followed by the letters of the message on the next line. In each column, the upper letter is interpreted as a shift to use in a Caesar cipher on the letter in the second row. Thus below in the second column, the E in the first row means a shift of 4 is applied to the letter T in the second row, to get the letter X.
Standard text (key): WETHEP Message: ATTACK Encrypted message: WXMHGZThe person receiving the encrypted message must know what the standard text is. Then this receiver can reverse the above encryption by applying the shifts in the opposite direction to get the original message back. This method will handle a message of any length by just using more of the standard text. Notice that in this example the two Ts came out as different letters in the encrypted message. For more security, one should not use a standard text as well known as the one in this example. Instead the sender and receiver could agree on a page of a book they both have with them as the start of their standard text.
In fact, the original historical Beale cipher consisted of three messages: one in the clear and the other two encrypted. The first encrypted message used the start of the U.S. Constitution just as above, and told of a buried treasure. The third message was to tell where to find the treasure, but it has never been decrypted. In fact, if the standard text is not known, it can be very hard to cryptanalyze a Beale cipher.
All the security of this system resides with the secrecy of the standard text. There are a number of subtle pitfalls with this method, as with most of cryptography. For example, suppose you make a trip to, ummmm, Karjackistan, and you want to communicate in secret with your friend back home. You buy two copies of a cheap detective novel, and agree on a page as above. The Karjackistan Secret Police might notice the novel you are carrying, and might digitize the entire book and try all possible starting points within its text, as possible ways to decrypt your transmissions. If that didn't work, they could try taking every third letter from every starting point, or try other more complex schemes.
Here is a computer implementation of the Beale cipher: Java source.
It may be surprising to the reader that there exist simple ``perfect'' encryption methods, meaning that there is a mathematical proof that cryptanalysis is impossible. The term ``perfect'' in cryptography also means that after an opponent receives the ciphertext he has no more information than before receiving the ciphertext.
The simplest of these perfect methods is called the one-time pad. Later discussion explains why these perfect methods are not practical to use in modern communications. However, for the practical methods there is always the possibility that a clever researcher or even a clever hacker could break the method. Also cryptanalysts can break these other methods using brute-force exhaustive searches. The only issue is how long it takes to break them. With current strong cryptographic algorithms, the chances are that there are no short-cut ways to break the systems, and current cryptanalysis requires decades or millennia or longer to break the algorithms by exhaustive search. (The time to break depends on various factors including especially the length of the cryptographic key.) To summarize, with the practical methods there is no absolute guarantee of security, but experts expect them to remain unbroken. On the other hand, the One-Time Pad is completely unbreakable.
The One-Time Pad is just a simple variation on the Beale Cipher. It starts with a random sequence of letters for the standard text (which is the key in this case). Suppose for example one uses RQBOPS as the standard text, assuming these are 6 letters chosen completely at random, and suppose the message is the same. Then encryption uses the same method as with the Beale Cipher, except that the standard text or key is not a quotation from English, but is a random string of letters.
Standard text (random key): RQBOPS Message: ATTACK Encrypted message: RJUORCSo, for example, the third column uses the letter B, representing a rotation of 1, to transform the plaintext letter T into the ciphertext letter U. The receiver must have the same random string of letters around for decryption: RQBOPS in this case. As the important part of this discussion, I want to show that this method is perfect as long as the random standard text letters are kept secret. Suppose the message is GIVEUP instead of ATTACK. If one had started with random letters LBYKXN as the standard text, instead of the letters RQBOPS, then the encryption would have taken the form:
Standard text (random key): LBYKXN Message: GIVEUP Encrypted message: RJUORCThe encrypted message (ciphertext) is the same as before, even though the message is completely different. An opponent who intercepts the encrypted message but knows nothing about the random standard text gets no information about the original message, whether it might be ATTACK or GIVEUP or any other six-letter message. Given any message at all, one could construct a standard text so that the message is encrypted to yield the ciphertext RJUORC. An opponent intercepting the ciphertext has no way to favor one message over another. It is in this sense that the one-time pad is perfect.
In this century spies have often used one-time pads. The only requirement is text (the pad) of random letters to use for encryption or decryption. (In fact, even now I would not want to be found in a hostile country with a list of random-looking letters.) The party communicating with the spy must have exactly the same text of random letters. This method requires the secure exchange of pad characters: as many such characters as in the original message. In a sense the pad behaves like the encryption key, except that here the key must be as long as the message. But such a long key defeats a goal of cryptography: to reduce the secrecy of a long message to the secrecy of a short key. If storage and transmission costs keep dropping, the one-time pad might again become an attractive alternative.
Law PAD1: The one-time pad is a method of key transmission, not message transmission. [Blakeley]
Later sections will dwell more on random number generation, but for now just note that the one-time pad requires a truly random sequence of characters. If instead, one used a random number generator to create the sequence of pad characters, such a generator might depend on a single 32-bit integer seed for its starting value. Then there would be only 232 different possible pad sequences and a computer could quickly search through all of them. Thus if a random number generator is used, it needs to have at least 128 bits of seed, and the seed must not be derived solely from something like the current date and time. (Using the current time and date would be terrible, allowing immediate cryptanalysis.)
Exercise: Write a program that will generate duplicate copies of sequences of random characters, for use in a one-time pad. [Ans: A Java program which, when executed, generates a unique Postscript file that will print such a pad is here. The output of this program is here (Postscript). or here (PDF). Here are two images to use in constructing a wheel to encrypt and decrypt by hand using a one-time pad: Outer wheel: Postscript version, PDF version, Inner wheel: Postscript version, PDF version.]