Hashing information

9.4 Hashing information

Hashing is a one-way operation: data can be converted to a hash value, but a hash value cannot be converted back to meaningful data. It is used in conjunction with encryption to ensure that messages are not tampered with in transit. Widely used hashing algorithms include Message Digest 5 (MD5) and Secure Hash Algorithm 1 (SHA-1), although practical collision attacks have been published against both, so the longer SHA variants are the safer choice for new designs.

When a hash value is produced from a block of plain text, it should be computationally infeasible to generate a different block of text that yields the same hash value. A standard property of hashing algorithms is that a small change in the input text creates a large change in the hash value. A given hash algorithm always produces output values of the same length, regardless of the amount of input text.
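The two properties just described (fixed output length and a large change in the hash for a small change in the input) can be observed directly. The following is a minimal sketch, not part of the original listing: the class name AvalancheDemo, its method names, and the sample strings are illustrative assumptions, and SHA-256 is used via the SHA256.Create() factory method.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper class, introduced only for illustration.
public static class AvalancheDemo
{
    // Returns the SHA-256 digest of the UTF-8 bytes of the text.
    public static byte[] Hash(string text)
    {
        using (SHA256 sha = SHA256.Create())
        {
            return sha.ComputeHash(Encoding.UTF8.GetBytes(text));
        }
    }

    // Counts the bit positions at which two equal-length digests differ.
    public static int DifferingBits(byte[] a, byte[] b)
    {
        int count = 0;
        for (int i = 0; i < a.Length; i++)
        {
            int x = a[i] ^ b[i];
            while (x != 0)
            {
                count += x & 1;
                x >>= 1;
            }
        }
        return count;
    }

    // Prints both digests, their length, and how many bits differ.
    public static void Run()
    {
        byte[] h1 = Hash("Transfer 100 dollars");
        byte[] h2 = Hash("Transfer 900 dollars"); // one character changed
        Console.WriteLine(BitConverter.ToString(h1));
        Console.WriteLine(BitConverter.ToString(h2));
        Console.WriteLine("Digest length: {0} bytes", h1.Length);
        Console.WriteLine("Differing bits: {0} of {1}",
            DifferingBits(h1, h2), h1.Length * 8);
    }
}
```

Calling AvalancheDemo.Run() prints two 32-byte digests. For a well-behaved hash, roughly half of the 256 bits would be expected to differ even though only one input character changed.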

In practice, a hash value is generated for a given message, and then the message and the hash value are encrypted together. When the message is decrypted, the hash recomputed from the decrypted message must match the hash that accompanied it; otherwise, the message may have been tampered with. Even though it would be impossible for a hacker to read this encrypted message in transit, it would be possible for him to alter the contents of the transmission, which could result in misinterpreted communications.
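The hash-then-encrypt procedure described above can be sketched as follows. This is an illustration under stated assumptions rather than a recommended protocol: the class name SealedMessage and its methods are hypothetical, SHA-256 and AES (with the default CBC mode and PKCS7 padding) are stand-ins for whichever hash and cipher an application actually uses, and the key and IV are assumed to be shared securely by some other means. Current practice would more commonly use an authenticated encryption mode or a keyed hash computed over the ciphertext.

```csharp
using System;
using System.Linq;
using System.Security.Cryptography;

// Hypothetical helper class, introduced only for illustration.
public static class SealedMessage
{
    private const int HashLength = 32; // SHA-256 digest size in bytes

    private static byte[] Digest(byte[] data)
    {
        using (SHA256 sha = SHA256.Create())
        {
            return sha.ComputeHash(data);
        }
    }

    // Appends the hash of the message, then encrypts message and hash together.
    public static byte[] Seal(byte[] message, byte[] key, byte[] iv)
    {
        byte[] combined = message.Concat(Digest(message)).ToArray();
        using (Aes aes = Aes.Create())
        using (ICryptoTransform encryptor = aes.CreateEncryptor(key, iv))
        {
            return encryptor.TransformFinalBlock(combined, 0, combined.Length);
        }
    }

    // Decrypts, recomputes the hash, and returns the message only if the
    // hashes match. Returns null if the data appears to have been altered.
    public static byte[] Open(byte[] sealedBytes, byte[] key, byte[] iv)
    {
        byte[] combined;
        try
        {
            using (Aes aes = Aes.Create())
            using (ICryptoTransform decryptor = aes.CreateDecryptor(key, iv))
            {
                combined = decryptor.TransformFinalBlock(
                    sealedBytes, 0, sealedBytes.Length);
            }
        }
        catch (CryptographicException)
        {
            return null; // damaged padding also indicates tampering
        }
        if (combined.Length < HashLength) return null;

        byte[] message = combined.Take(combined.Length - HashLength).ToArray();
        byte[] receivedHash = combined.Skip(combined.Length - HashLength).ToArray();
        return Digest(message).SequenceEqual(receivedHash) ? message : null;
    }
}
```

A receiver would call SealedMessage.Open and treat a null result as a rejected message. Flipping even a single bit of the ciphertext garbles the decrypted bytes, so the recomputed hash no longer matches.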

Another useful application of hashing is the secure storage of passwords. If an application stores username and password pairs in a database, a hacker who gains access to this database can simply read them off. If the password is hashed, the hacker cannot tell what the original password was. When the legitimate user enters a password into your application, the entered password is hashed, and if the result matches the value in the database, the user is granted access.
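The store-and-compare flow described above might look like the sketch below. Note the deliberate substitution: instead of a single plain hash, the sketch uses PBKDF2 with a random per-user salt, which is a common way to slow down guessing attacks against stolen hashes. The class name PasswordStore, the "salt:hash" storage format, and the iteration count are illustrative assumptions, and the static Rfc2898DeriveBytes.Pbkdf2 method assumes .NET 6 or later.

```csharp
using System;
using System.Security.Cryptography;

// Hypothetical helper class, introduced only for illustration.
public static class PasswordStore
{
    private const int SaltSize = 16;       // bytes
    private const int HashSize = 32;       // bytes
    private const int Iterations = 100000; // assumed work factor

    // Produces the string to store in the database: "salt:hash" in Base64.
    public static string HashPassword(string password)
    {
        byte[] salt = new byte[SaltSize];
        using (RandomNumberGenerator rng = RandomNumberGenerator.Create())
        {
            rng.GetBytes(salt);
        }
        byte[] hash = Derive(password, salt);
        return Convert.ToBase64String(salt) + ":" + Convert.ToBase64String(hash);
    }

    // Hashes the entered password with the stored salt and compares the result.
    public static bool Verify(string enteredPassword, string stored)
    {
        string[] parts = stored.Split(':');
        if (parts.Length != 2) return false;

        byte[] salt = Convert.FromBase64String(parts[0]);
        byte[] expected = Convert.FromBase64String(parts[1]);
        byte[] actual = Derive(enteredPassword, salt);

        // Accumulate differences so the loop does not exit early on a mismatch.
        int diff = expected.Length ^ actual.Length;
        for (int i = 0; i < expected.Length && i < actual.Length; i++)
        {
            diff |= expected[i] ^ actual[i];
        }
        return diff == 0;
    }

    private static byte[] Derive(string password, byte[] salt)
    {
        return Rfc2898DeriveBytes.Pbkdf2(
            password, salt, Iterations, HashAlgorithmName.SHA256, HashSize);
    }
}
```

The database would hold only the value returned by HashPassword. Because the salt is random, hashing the same password twice gives different stored values, yet Verify still accepts the correct password for each.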

This may pose a problem if the user forgets a password, because the application cannot determine the original password from the hash. A system should be in place for replacing passwords from an administrator’s account. More importantly, if a hacker can write to the database and can guess the hashing algorithm used, he could generate a hashed password of his own, replace the existing one, and gain access. For this reason, where data integrity can be compromised, the hashing procedure should be combined with a form of encryption such as 3DES.

Hashing can also be used to prevent unauthorized data mining of online services. Suppose you provide an Internet-based service that is accessed via a custom-made client (e.g., a DLL that provides currency conversion based on live exchange rates), and you want only paying customers to access the service. The last thing you want is for a competitor to use a packet-sniffing tool to determine what data you are sending to the server and then create a product that uses your service without paying you. The obvious solution is to use asymmetric encryption; however, let us imagine that performance is the overriding factor, and asymmetric encryption would cause an unacceptable processing overhead.

A keyed hash of the data (that is, a hash of the payload with a secret string of characters appended), included in the header, creates only a small overhead, but it makes the header impractical to re-create without knowing the hash key. This affords no security against your competitors’ reading what is being sent back and forth to your server, but it prevents them from generating their own client; however, you should take care that the client cannot be disassembled to view this key easily. A tool such as Dotfuscator (www.preemptive.com) can be used to obfuscate the code and help hide this key from prying eyes.
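A keyed hash for a request header can be sketched with the HMACSHA256 class. HMAC is a standard keyed-hash construction and is used here in place of the simpler "payload plus appended secret" approach. The class name RequestSigner, its method names, and the sample payload are illustrative assumptions, and the key is assumed to be known to both the client and the server.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper class, introduced only for illustration.
public static class RequestSigner
{
    // Client side: computes the keyed hash to send in the request header.
    public static string Sign(string payload, byte[] key)
    {
        using (HMACSHA256 hmac = new HMACSHA256(key))
        {
            byte[] tag = hmac.ComputeHash(Encoding.UTF8.GetBytes(payload));
            return Convert.ToBase64String(tag);
        }
    }

    // Server side: recomputes the keyed hash and compares it with the header.
    public static bool IsAuthentic(string payload, string tag, byte[] key)
    {
        string expected = Sign(payload, key);
        if (tag == null || tag.Length != expected.Length) return false;

        // Accumulate differences so the loop does not exit early on a mismatch.
        int diff = 0;
        for (int i = 0; i < expected.Length; i++)
        {
            diff |= expected[i] ^ tag[i];
        }
        return diff == 0;
    }
}
```

The client would send Sign(payload, key) alongside the payload, for example for a payload such as "from=USD&to=EUR". The server rejects any request for which IsAuthentic returns false. As the text notes, this does nothing to hide the payload; it only stops clients that do not hold the key from producing valid requests.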

A real-world example of this system in use is the Google toolbar. This utility can display Google’s page rank for any given Web page. Google does not want people to be able to data-mine these values using automatic processes, so the request that the toolbar component makes to the Google server contains a keyed hash code for the Web site in question. It is difficult to predict this hash code, and requests made without it return an error. Full-blown asymmetric encryption was not used in this case because it would have created unacceptable overhead for the servers to return data that is basically available to anyone.

9.4.1 Hashing algorithms

.NET provides support for two families of hashing algorithms: Secure Hash Algorithm (SHA) and Message Digest (MD5), in the classes SHA1Managed and MD5CryptoServiceProvider, respectively.

SHA is specified by the Secure Hash Standard (SHS). The hash is generated from 64-byte blocks, which are transformed by a combination of one-way operations and a function of previous block transforms. The specification for SHA is widely available and can be implemented easily in any other language, so it is suitable for use on solutions with clients written in other languages or on other platforms. The specification is available in RFC 3174 (ftp://ftp.rfc-editor.org/in-notes/rfc3174.txt).

Hashing algorithms do not involve the same high-level mathematics as RSA or elliptic curve encryption, but this is not to say that it is advisable to try to develop your own hashing algorithm. Cyclic redundancy check (CRC) functions are similar in function to hashing algorithms: they provide a fixed-length checksum for any given input. Although CRC functions generally provide higher throughput than hashing algorithms, they are designed to detect accidental corruption and do not afford the same level of security against deliberate tampering.

There are four different variations of SHA available for use in .NET: SHA1Managed (20-byte hash), SHA256Managed (32-byte hash), SHA384Managed (48-byte hash), and SHA512Managed (64-byte hash). The longer the hash, the more difficult it is for a hacker to create a new message with the same hash. SHA1 was long regarded as sufficient, but practical collisions against it have since been demonstrated, so SHA256Managed or a longer variant is the safer default for new applications.
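The digest lengths listed above can be checked with a short sketch. The class name DigestLengths and its methods are illustrative assumptions. The sketch obtains each algorithm through the factory methods SHA1.Create(), SHA256.Create(), SHA384.Create(), and SHA512.Create() rather than the Managed classes named in the text, because recent .NET versions mark the Managed classes as obsolete; the digests produced are the same.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper class, introduced only for illustration.
public static class DigestLengths
{
    // Returns the number of bytes the algorithm produces for the given text.
    public static int LengthOf(HashAlgorithm algorithm, string text)
    {
        return algorithm.ComputeHash(Encoding.UTF8.GetBytes(text)).Length;
    }

    // Prints the digest length of each SHA variant for the same input.
    public static void Run()
    {
        string[] names = { "SHA1", "SHA256", "SHA384", "SHA512" };
        HashAlgorithm[] algorithms =
        {
            SHA1.Create(), SHA256.Create(), SHA384.Create(), SHA512.Create()
        };

        for (int i = 0; i < algorithms.Length; i++)
        {
            // Lengths printed are 20, 32, 48, and 64 bytes respectively.
            Console.WriteLine("{0}: {1} bytes",
                names[i], LengthOf(algorithms[i], "hello"));
            algorithms[i].Dispose();
        }
    }
}
```

Because every variant derives from the common HashAlgorithm base class, code written against that base class can switch algorithms by changing only the line that creates the instance.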

9.4.2 Using SHA

Create a new Windows application in Visual Studio .NET as usual, and draw two textboxes on the form named tbPlaintext and tbHashed. A button named btnHash is also needed. Click on the button and enter the following code:


C#

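private void btnHash_Click(object sender, System.EventArgs e)
{
    // Convert the entered text to bytes and compute its SHA1 hash.
    byte[] entered = Encoding.ASCII.GetBytes(tbPlaintext.Text);
    byte[] hash = new SHA1Managed().ComputeHash(entered);
    // Display the hash as hexadecimal. Decoding raw hash bytes as ASCII
    // would replace every byte above 127 with '?'.
    tbHashed.Text = BitConverter.ToString(hash);
}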

VB.NET

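Private Sub btnHash_Click(ByVal sender As Object, _
    ByVal e As System.EventArgs)
    ' Convert the entered text to bytes and compute its SHA1 hash.
    Dim entered() As Byte = _
        Encoding.ASCII.GetBytes(tbPlaintext.Text)
    Dim hash() As Byte = _
        New SHA1Managed().ComputeHash(entered)
    ' Display the hash as hexadecimal. Decoding raw hash bytes as ASCII
    ' would replace every byte above 127 with "?".
    tbHashed.Text = BitConverter.ToString(hash)
End Sub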

This code converts the text entered in tbPlaintext into a byte array and then passes this byte array to the ComputeHash method of a new instance of the SHA1Managed class, which generates the hash code. By substituting SHA1Managed with SHA512Managed or even MD5CryptoServiceProvider, the hashing will take place using that algorithm instead of SHA1.

You will also require the relevant namespaces:

C#

using System.Text;
using System.Security.Cryptography;

VB.NET

Imports System.Text
Imports System.Security.Cryptography

To test this, run it from Visual Studio .NET, type some text into the textbox provided, and press the button. A fixed-length hash will appear in the second textbox as shown in Figure 9.4. A small change in the plain text will cause a large change in the hash value, which will always remain the same length.


Figure 9.4 Secure hashing application.