Cryptography

Cryptography is one way to solve the security challenges raised in the introduction. The basic idea is that, using some secret information, it is possible to protect the confidentiality and the integrity of the information that flows in the Web. This section aims to introduce the basic elements of cryptography so that one can understand how they are used in providing secure services in the Web.

Terminology

Consider the case when Kathy wants to send a message to Tom so that another person (Betsy) cannot find out the original information from the message. The message that Kathy wants to send is the cleartext or plaintext. To assure the secrecy of the message, Kathy encrypts the cleartext into an encrypted message. When Tom gets the message, he decrypts the encrypted message and retrieves the cleartext. Even if Betsy has access to the encrypted message (that is, she eavesdrops the transmission channel) she cannot find the cleartext. In the previous example, Kathy and Tom are using cryptography to achieve confidentiality: the message sent by Kathy is understood only by Tom. In addition to confidentiality, cryptography can be used for the following purposes:

	Authentication: the receiver of the message can ascertain its origin.
	Integrity: the receiver can verify if the message was modified during the transmission.
	Non-repudiation: the sender cannot deny that she sent the message.

Secret-Key Encryption Algorithms

An entire family of cryptographic algorithms has been designed to encrypt messages when Kathy and Tom share a common secret -- the key. These algorithms are called secret-key encryption algorithms because they are based on the fact that only Kathy and Tom know the secret key. We start with a simple example of a secret-key encryption algorithm. Let us suppose that Kathy wants to send a message to Tom and they both know a key. To encrypt, Kathy computes the XOR (exclusive or) operation between the characters from the message and the characters from the key. At the end she gets the encrypted message which is sent to Tom. To decrypt, Tom applies the operation XOR between the characters of the encrypted message and the characters of the key. This simple-XOR algorithm is an example of a polyalphabetic substitution cipher. It is considered a weak encryption algorithm because there are methods to derive the key and the clear text from the encrypted text.

There are secret-key encryption algorithms that are considered strong. The rest of this section will metion the most often used algorithms. The algorithms are complicated, and the details can be found in [SCHN96].

	DES (Digital Encryption Standard): In 1977, this algorithm was adopted by the U.S. government as the federal standard for the encryption of commercial and sensitive-yet-unclassified government computer data. DES is a block cipher (encrypts data in blocks of 64 bits) and relies on a key of 56 bits. Many people unknowingly use DES, because the standard UNIX operating system authenticates users with a variation of DES. Some cryptographers consider that the key is too short and a brute force attack (the attacker tries all the possible keys) can break a message. To address this problem a more secure variant called Triple DES (TDES) has been proposed, which applies DES three times with different keys. The DES Applet has an example using the DES algorithm.
	IDEA (International Data Encryption Algorithm): The algorithm is a block cipher algorithm with blocks of 64 bits and a key of 128 bits. The algorithms is used in the popular Pretty Good Privacy (PGP) program.
	RC2 and RC4: These are two secret-key algorithms developed by Ron Rivest at RSA Data Security Inc. The algorithms have not been published (although a routine claiming to be RC4 was posted anonymously on the Internet) but are considered strong. The RC4 algorithm with a key of 40 bits is used in many Web browsers and other software products available in the U.S. and on the international market.
	Skipjack: The algorithm was proposed by the U.S. National Security Agency (NSA) to replace DES in the future. There is not much information about the algorithm, which is intended to be implemented only in hardware (in the Clipper chip). The algorithm is considered strong but has raised controversy because of the way it is used. Besides implementing the Skipjack algorithm, the chip contains a key-escrow mechanism to allow law enforcement agencies to decrypt messages.

A general problem with secret-key algorithms is the need for key management: the communication partners must use another (secure) channel to agree on a common key. Key-exchange protocols are designed to solve this problem. For example, the Diffie-Hellman algorithm can be used to create a common (secret) key as follows:

Kathy and Tom agree on a large prime number n and another number g. These numbers are not necessarily secret.
Kathy generates a random number x and sends to Tom the value X = (g^x) mod n.
Tom generates a random number y and sends to Kathy the value Y = (g^y) mod n.
Kathy receives the value of Y and computes K_x = (Y^x) mod qn.
Tom received the value of X and computes K_y = (X^y) mod n.

The algorithm ensures that the values of K_x and K_y are equal and can be used as the key in a secret-key encryption algorithm. A third party (Betsy) cannot determine the value of the secret key by eavesdropping. The reason is that it is difficult to determine K_x (= K_y) given the values X and Y.

Public-Key Encryption Algorithms

In an attempt to solve the problem of key management for secret-key algorithms an entire new family of cryptographic algorithms, called public-key algorithms, has been developed. The basic idea in these algorithms is to use different keys for encryption and for decryption. Tom has two keys: a public key that is known by any other party that wants to communicate with Tom and a private key that is known only to Tom. If Kathy wants to send a secure message to Tom, she encrypts the message with Tom's public key and sends the message to Tom. Tom receives the message and decrypts it with his private key.

The advantage of public key algorithms is that they do not require a secret shared by both partners or a secure channel to exchange the key. One disadvantage of these algorithms is that they are slower than the secret key algorithms. Because of that, in practice, the communication is done using a hybrid protocol: the partners use the public-key protocol to exchange a secret key that is used subsequently in a secret-key based communication.

The "de facto" standard in public key encryption algorithms is the RSA algorithm (designed by Ron Rivest, Adi Shamir and Leonard Adleman) [RSA]. The algorithm has two phases. In the first phase, Tom generates two keys (a public and a private one) and makes public the public-key. In the second phase, Kathy encrypts a message and Tom decrypts it:

Tom generates two large numbers p and q, and computes the modulus, n=pq.
Tom chooses a public exponent, which is a number e that is smaller than n and relatively prime to (p-1)(q-1). In practice the value of e is 65337.
Tom computes a private exponent d=e^-1 mod (p-1)(q-1).
Tom sends the pair (n, e) as a public key to Kathy. Tom keeps the pair (n, d) as a private key.
Kathy encrypts a message m with Tom's public key: c = m^e mod n and sends the encrypted message c to Tom.
Tom decrypts encrypted message c with the private key to recover message m: m = c^d mod n.

For two-way communication, Kathy would also generate her pair of keys and publish the public key so that Tom can use it to encrypt messages he sends. The RSA Algorithm Javascript Page has a working example of the RSA algorithm.

The RSA algorithm is used in many applications. It has gotten more public attention since it has been used in Web browsers. For example, Netscape Navigator uses it for secure transactions over the Web.

Digital Signatures

The algorithms presented so far are used to preserve the confidentiality of messages. As mentioned in the beginning of this page other issues related to security are integrity and non-repudiation. These goals can be accomplished by using public key algorithms and one-way hash functions. The basics of public key algorithms have been presented earlier. In this section we describe one-way hash functions and how they are used in the verification of integrity.

The purpose of the hash functions is to create a digest of a message that has the following properties:

	The digest is relatively short (typically between 128 and 256 bits).
	Given a message it is easy to generate the hash-value.
	Given the hash value it is difficult to reconstruct the message.
	Given a hash value it is difficult to find a message which has the given hash value.

The algorithms for computing secure hash functions are intricate. We refer the reader to [SCHN96] for several examples. The most popular algorithms are Message Digest 5 (MD5), designed by Ron Rivest, and Secure Hash Algorithm (SHA), designed by the U.S. National Institute of Science and Technology (NIST) and NSA and proposed as a standard. Consider the following situation: Kathy sends a message to Tom and Tom wants to verify the integrity of the message. In order to do so, the following protocol is used between Kathy and Tom:

Kathy computes the SHA digest of the message and encrypts the digest with her private key. The encrypted digest is called the digital signature of the message.
Kathy sends the message and the digital signature to Tom.
Tom receives the message and the digital signature. He decrypts the digital signature using Kathy's public key. After that, he computes the digest and compares it with the value decrypted. If the two digests are the same, Tom is sure that the message has not been altered during transmission, and that Kathy is the person who signed the message.

Using one-way hash functions and public-key cryptographic algorithms it is possible to verify the integrity of the message and also to ensure non-repudiation.

Public-Key Certificates

The algorithms presented in the sections for public-key encryption and digital signatures raises the following question: How does Kathy know that the key she has is Tom's public key? This is part of a more general issue in security -- authentication, or how to ensure that a certain public key belongs to a person. The solution is to use a trusted third party (called in cryptography slang "Trent"). The rest of this section presents the solution to authentication using public-key certificates. The basic idea for the authentication protocol used in the Web is the following: Tom goes (physically) to a Certificates Authority and presents his ID and his public key. The Certificates Authority issues a certificate that contains the following:

Tom's distinguished name (DN) (e.g., C=US, O=Depaul, CN=Tom Smith),
Tom's RSA public key,
issuer's distinguished name (e.g., C=US, O=AuthorityOrg, CN=Authority),
validity period,
serial number, and
issuer's digital signature.

The term "distinguished name" is used twice in the certificate definition. This term is used in the X.500 standard to denote a unique name for a person. A person's distinguished name contains the name of the person's country (C), their organization (O), their name (CN), and other information required to identify the person. The issuer of the certificate "signs" the certificate with his private key using the procedure presented above in the section "Digital Signatures." The signature ensures that the certificate itself has not been altered after the issuer sends it. This structure of the certificate is defined in the X.509 standard.

With a public key certificate, communication between Kathy and Tom proceeds as follows. When Kathy wants to send a message, she needs Tom's public key, so she asks Tom for his certificate. Tom sends the certificate. Kathy verifies the validity of the certificate (explained below) and extracts Tom's public key. Then Kathy generates a random key and sends it to Tom using the public key encryption algorithm. From this point forward, Kathy and Tom can communicate using a secret key algorithm. Note that Tom can also ask Kathy to present her public key certificate for a two-way communication.

This description of the communication between Kathy and Tom omitted the detail of how to verify the authenticity of the certificate. (If Kathy gets a phony certificate from a person imitating Tom, then that person can supply their own public key in the certificate, which Kathy would use to encrypt the message, permitting the imitator to decrypt the message with their own private key.)

To verify the certificate, Kathy needs the public key of the issuer of the certificate. There are two ways to get it: either the issuer is well-known and has publicized the public-key in the media, or there is another Certificate Authority that can certify the public key of the issuer with a public key certificate. This can go on until Kathy knows, and trusts, the public key of an authority. This process of certification is called a certificate chain and the entire system that supports the process of certification and verification is called the "public key infrastructure."

Many books and papers describe, to various levels of detail, the cryptographic algorithms presented in this section. Anyone interested in this topic can find more information in [SCHN96, NETS96a, NETS96b, VERI96].

Rich Aliano <raliano@shrike.depaul.edu>

Home Digital Certificates Firewalls Cryptography SSL JAVA