One Way Hashing

What is One Way Hashing

Soham Kamble
7 min readMar 7, 2021

A one-way hash function, also known as a message digest, fingerprint, or compression function, is a mathematical function that takes a variable-length input string and converts it into a fixed-length binary sequence. Furthermore, a one-way hash function is designed in such a way that it is hard to reverse the process, that is, to find a string that hashes to a given value (hence the name one-way.) A good hash function also makes it hard to find two strings that would produce the same hash value.

All modern hash algorithms produce hash values of 128 bits and higher.
Even a slight change in an input string should cause the hash value to change drastically. Even if 1 bit is flipped in the input string, at least half of the bits in the hash value will flip as a result. This is called an avalanche effect.

Hash Algorithms

The Microsoft cryptographic providers support these hash algorithms: MD4, MD5, SHA, and SHA256.

  1. MD4 & MD5

Both MD4 and MD5 were invented by Ron Rivest. MD stands for Message Digest. Both algorithms produce 128-bit hash values. MD5 is an improved version of MD4.

2. SHA

SHA stands for Secure Hash Algorithm. It was designed by NIST and NSA. SHA produces 160-bit hash values, longer than MD4 and MD5. SHA is generally considered more secure than other algorithms and is the recommended hash algorithm.

3. SHA256

SHA256 is a 256-bit modern version of SHA and is only supported by the Microsoft Enhanced RSA and AES Cryptographic Provider.

Common Uses of One Way Hash Functions

  1. There are two types of one-way hashing algorithms, fast and slow, fast is used for file verification, and slow for password hashing.
  2. The input of a one-way hashing algorithm cannot be determined by analyzing the output or running it through another function.
  3. When a hacker has a list of password hashes from a stolen database they work out what one-way hashing algorithm was used and then guess as many possible passwords as they can, when they have an output that matches an entry in the database they know that the input is the user’s password.

Common Miss-uses of One Way Hash Functions

  1. A one-way hashing algorithm is a type of encryption and can be used to securely store data for retrieval at a later date with the use of a password and another function.

Hashing Example using Python and Sha-256 hashing algorithm

In the code editor, enter the following command to import the constructor method of the SHA-256 hash algorithm from the hashlib module:

from hashlib import sha256

In the line below, create an instance of the sha256 class:

h = sha256()

Next, use the update() method to update the hash object:

h.update(b’python1990K00L’)

Then, use the hexdigest() method to get the digest of the string passed to the update() method:

hash = h.hexdigest()

The digest is the output of the hash function.

Finally, print the hash variable to see the hash value in the console:

print(hash)

To run the script, click on the “run” button at the top of the screen. On the console, you should see the following output:

d1e8a70b5ccab1dc2f56bbf7e99f064a660c08e361a35751b9c483c88943d082

To recap, you provide the hash function a string as input and get back another string as output that represents the hashed input:

Input:

python1990K00L

Hash (SHA-256):

d1e8a70b5ccab1dc2f56bbf7e99f064a660c08e361a35751b9c483c88943d082

Try hashing the string python. Did you get the following hash?

11a4a60b518bf24989d481468076e5d5982884626aed9faeb35b8576fcd223e1

Using Cryptographic Hashing for More Secure Password Storage

The irreversible mathematical properties of hashing make it a phenomenal mechanism to conceal passwords at rest and in motion. Another critical property that makes hash functions suitable for password storage is that they are deterministic.

A deterministic function is a function that given the same input always produces the same output. This is vital for authentication since we need to have the guarantee that a given password will always produce the same hash; otherwise, it would be impossible to consistently verify user credentials with this technique.

To integrate hashing in the password storage workflow, when the user is created, instead of storing the password in cleartext, we hash the password and store the username and hash pair in the database table. When the user logs in, we hash the password sent and compare it to the hash connected with the provided username. If the hashed password and the stored hash match, we have a valid login. It’s important to note that we never store the cleartext password in the process, we hash it and then forget it.

Whereas the transmission of the password should be encrypted, the password hash doesn’t need to be encrypted at rest. When properly implemented, password hashing is cryptographically secure. This implementation would involve the use of a salt to overcome the limitations of hash functions.

Limitations of Hash Functions

Hashing seems pretty robust. But if an attacker breaks into the server and steals the password hashes, all that the attacker can see is random-looking data that can’t be reversed to plaintext due to the architecture of hash functions. An attacker would need to provide an input to the hash function to create a hash that could then be used for authentication, which could be done offline without raising any red flags on the server.
The attacker could then either steal the cleartext password from the user through modern phishing and spoofing techniques or try a brute force attack where the attacker inputs random passwords into the hash function until a matching hash is found.
A brute-force attack is largely inefficient because the execution of hash functions can be configured to be rather long.
Since hash functions are deterministic (the same function input always results in the same hash), if a couple of users were to use the same password, their hash would be identical. If a significant amount of people are mapped to the same hash that could be an indicator that the hash represents a commonly used password and allow the attacker to significantly narrow down the number of passwords to use to break in by brute force.
Additionally, through a rainbow table attack, an attacker can use a large database of precomputed hash chains to find the input of stolen password hashes. A hash chain is one row in a rainbow table, stored as an initial hash value and a final value obtained after many repeated operations on that initial value. Since a rainbow table attack has to re-compute many of these operations, we can mitigate a rainbow table attack by boosting hashing with a procedure that adds unique random data to each input at the moment they are stored. This practice is known as adding salt to a hash and it produces salted password hashes.
With a salt, the hash is not based on the value of the password alone. The input is made up of the password plus the salt. A rainbow table is built for a set of unsalted hashes. If each pre-image includes a unique, unguessable value, the rainbow table is useless. When the attacker gets a hold of the salt, the rainbow table now needs to be re-computed, which ideally would take a very long time, further mitigating this attack vector.

No Need for Speed

According to Jeff Atwood, “hashes, when used for security, need to be slow.” A cryptographic hash function used for password hashing needs to be slow to compute because a rapidly computed algorithm could make brute-force attacks more feasible, especially with the rapidly evolving power of modern hardware. We can achieve this by making the hash calculation slow by using a lot of internal iterations or by making the calculation memory intensive.

A slow cryptographic hash function hampers that process but doesn’t bring it to a halt since the speed of the hash computation affects both well-intended and malicious users. It’s important to achieve a good balance of speed and usability for hashing functions. A well-intended user won’t have a noticeable performance impact when trying a single valid login.

Collision Attacks Deprecate Hash Functions

Since hash functions can take an input of any size but produce hashes that are fixed-size strings, the set of all possible inputs is infinite while the set of all possible outputs is finite. This makes it possible for multiple inputs to map to the same hash. Therefore, even if we were able to reverse a hash, we would not know for sure that the result was the selected input. This is known as a collision and it’s not a desirable effect.

A cryptographic collision occurs when two unique inputs produce the same hash. Consequently, a collision attack is an attempt to find two pre-images that produce the same hash. The attacker could use this collision to fool systems that rely on hashed values by forging a valid hash using incorrect or malicious data. Therefore, cryptographic hash functions must also be resistant to a collision attack by making it very difficult for attackers to find these unique values.

or simple hashing algorithms, a simple Google search will allow us to find tools that convert a hash back to its cleartext input. The MD5 algorithm is considered harmful today and Google announced the first SHA1 collision in 2017. Both hashing algorithms have been deemed unsafe to use and deprecated by Google due to the occurrence of cryptographic collisions.

Google recommends using stronger hashing algorithms such as SHA-256 and SHA-3. Other options commonly used in practice are bcrypt, scrypt, among many others that you can find in this list of cryptographic algorithms. However, as we’ve explored earlier, hashing alone is not sufficient and should be combined with salts.

--

--

Responses (4)