To now, have been concerned with protecting message content (ie secrecy) by encrypting the message. Will now consider how to protect message integrity (ie protection from modification), as well as confirming the identity of the sender. Generically this is the problem of message authentication, and in eCommerce applications is arguably more imprtant than secrecy.
In general, want to send an "electronic signature" along with the message to validate its contents, and the senders identity. Unless the message is processed in a way that provides both encryption and authentication (as when a chaining mode is used with a block cipher), the electronic signature is separate from the encrypted message. Unless the amount of information sent is to be doubled, some means is needed to create a "digest" (MAC) of the message of a suitable, fairly small, size.
show here we see Alice sending both a message, and the signature (Auth) to Bob. Bob recomputes the signature on the message as he received it, and confirms that it is the same as the one Alice sent him. If so, the message is assumed to be unmodified.
As mentioned above, if a message is sent encrypted using a chaining mode of a block cipher (CBC or CFB) then implicitly the message is also authenticated, since any external modification would corrupt the decryption, and since only the sender & reciever supposedly know the key used. However since the algorithm is symmtric, there is no way to prove to a 3rd party who did what.
Can also use block cipher chaining mdoes to create a separate authenticator, by just sending the last block. However this suffers from being a bit too small for acceptable use today.
Hash functions are used to "digest" or "condense" a message down to a fixed size, which can then be signed, in a way that makes finding other messages with the same hash extremely difficult (so the signature wont apply easily to other messages). The hash needs to be large enough to resist "birthday attacks" on it - see Stallings 8.4 and 8A.
h
should have the
following properties:
These are the "mathematical" specifications for good hash functions. Essentially it must be extremely difficult to find 2 messages with the same hash, and the hash should not be related to the message in any obvious way (ie it should be a complex non-linear function of the message). There are quite a few similarities in the evolution of hash functions & block ciphers, and in the evolution of the design requirements on both.
This is the first generally used hash function designed by Ronald Rivest. Whilst not broken, it is no longer favoured for use due to its greater complexity and slower speed.
One of the widely used hash functions - has appeared in a number of security uses, eg secure email PEM/PGP, S/Key passwords etc
Approach is a bit like chaining a block cipher - a previous value is combined with the current chunk of message through a few rounds, before the updating the current value. The final values is the hash.
MD5 is the current, and very widely used, member of this family of hash functions. Some progress has been made analysing it, but of more concern is its 128-bit size, which is starting to look a bit small.
Again see the message broken into blocks, processed along with the buffer value using 4 rounds, and the result added to the input buffer to make the new buffer value. Repeat till run out of message, and use final buffer value as hash. nb. due to padding always have a full final block (with length in it). Show Schneier description of MD5 here.
R(a,b,c,d,Mi,s,ti): a = b +
((a+F(b,c,d)+Mi+ti)<<s)
F(b,c,d)
is a different nonlinear function in each
round
ti
is a value derived from sine
Each round mixes the buffer input with the next "word" of the message in a complex, non-linear manner. A different non-linear function is used in each of the 4 rounds (but the same function for all 16 steps in a round). The 4 buffer words (a,b,c,d) are rotated from step to step so all are used and updated. See Schneier section 18.5 or Stallings 9.1 for details.
SHA is one of the newer generation of hash functions, more resistant to cryptanalysis, and now probably preferred for new applications.
(67452301,efcdab89,98badcfe,10325476,c3d2e1f0)
(A,B,C,D,E) <-
((E+F(t,B,C,D)+(A<<5)+Wt+Kt),A,(B<<30),C,D)
Can see SHA shares much in common with MD5, but with 20 instead of 16 steps in each of the 4 rounds. Note the 4 constants are based on sqrt(2,3,5,10). Show Schneier description of SHA here.
SHA-1 is probbaly the preferred hash function for new applications. Currently no problems are known with it.
RIPEMD-160 was developed in Europe as a result of an EC call for new authentication algorithms in the early-90's. The original version evolved into this, which was finally accepted. It is similar to SHA/MD5 - slower, but likely more secure.
HAVAL is a local Australian hash algorith developed by Profs Pieprzyk & Seberry at UOW. It has a rather different approach to the MD family, and hence is rather less likely to be susceptible to the known attacks on them. It is quite fast, and has been included in some security products, like Tripwire.
Have many proposals to use block ciphers to create a hash, but one larger than the standard block sized used by the ciphers - to avoid birthday attacks. Most of the results have been flawed - see Schneier 18.11. Not much joy to be had with this approach.