Password Hashing and Storage Basics

Mark Ilott
Jan 22, 2023

And why you should never re-use passwords

A far too simple password. Photo by Towfiqu barbhuiya on Unsplash

I’ve found myself explaining the technicalities of password management in back-end systems a few times recently — both to technical teams unfamiliar with the issues, and to friends and family curious about how passwords get stolen and why password re-use is such a bad idea.

While the topic can be quite complex, especially the theory behind the various hashing algorithms, at a basic level the key concepts are not difficult to understand, and I think it would be helpful if more people had some basic knowledge. I’ll provide links to more in-depth information as we go, but for the most part this explanation will cover key concepts rather than detail.

Let’s have a look at a typical interaction with a web-based application we use to do something useful, like share pictures of our favourite widgets. Let’s call it Widgetology.

Plaintext

In the beginning there is plaintext, the technical term for your unencrypted password (as opposed to plain text, which is the stuff you’re reading right here). We come up with a super secret password no one would ever guess — say like “MyP@ssword1” — and we enter it into the Widgetology web site as we set up our account.

Next time we want to access our account we type in our username and password. There’s some magic in the background as our password is verified and, all going well, we have access to the Widgetology app and our personal data.

Somehow the Widgetology app knows our account password and can compare it to the password we enter and confirm we are who we say we are.

Fortunately it’s not because Widgetology has simply stored our plaintext password in a database, for that would be extraordinarily insecure. If the Widgetology system is compromised and the database stolen, the attacker would immediately have full access to all accounts. It would also mean that Widgetology admin staff would have access to our passwords, and while I’m sure they are a trustworthy bunch, you can never know for sure.

Hashing

Instead, we use a process called hashing to obscure the plaintext password in storage, in a way that ensures we can still verify your password when you log in using plaintext.

Hashing is a one-way transformation of the password, often loosely described as one-way encryption. One-way simply means that once the password has been hashed, the result cannot be decrypted back into the original. When you create an account or modify your password, this scrambled version of the password, called a hash, is stored in the Widgetology user database with your account for future logins.

If the stored password cannot be decrypted, then how can we compare it to the password you enter?

The hashing process will always produce the same result given the same plaintext input. So we take the password you enter, apply the hashing process, and compare the result to the hash we have stored in the database. If they match, Widgetology knows you have entered the correct password.
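As a rough illustration of the store-then-compare flow described above, here is a minimal Python sketch using only the standard library. The function names and the in-memory `user_db` dictionary are assumptions made for this example, not part of any real Widgetology system, and plain SHA256 is used only because it is the algorithm discussed in this article.

```python
import hashlib
import hmac

# Hypothetical in-memory "user database": username -> stored hash (hex string).
user_db = {}

def hash_password(plaintext: str) -> str:
    """Return the SHA256 hash of the password as a hex string."""
    return hashlib.sha256(plaintext.encode("utf-8")).hexdigest()

def register(username: str, plaintext: str) -> None:
    """Store only the hash, never the plaintext."""
    user_db[username] = hash_password(plaintext)

def verify(username: str, plaintext: str) -> bool:
    """Hash the login attempt and compare it to the stored hash."""
    stored = user_db.get(username)
    if stored is None:
        return False
    # compare_digest avoids leaking information through comparison timing.
    return hmac.compare_digest(stored, hash_password(plaintext))

register("mark", "MyP@ssword1")
print(verify("mark", "MyP@ssword1"))  # True
print(verify("mark", "wrong-guess"))  # False
```

Note that the database never holds the password itself, only the 64-character hash.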

If a hacker or rogue admin manages to steal the Widgetology user database, all they have is the hash which cannot be used to log in, and cannot be decrypted to reveal your plaintext password.

For a long time this was how passwords were stored, and in many systems, including Windows itself, they still are. Unfortunately there are a few flaws in the plan.

Hashing algorithms have evolved and improved over the years, with older versions not considered as secure as they once were. It’s beyond the scope here, but typically the issue is that limitations in the algorithm (or in the computing capability available when it was invented) allow for too many collisions, where different plaintext can produce the same hash.

Technology isn’t really the problem here though.

Let’s look at the SHA-2 family of algorithms as an example (the SHA256 version specifically). SHA256 is a very popular hashing algorithm that was, and still is, extremely common in password management. The algorithm itself is considered secure: there is no known practical way to reverse a hash back to its input, so that’s not the issue. People are the issue.

Let’s take our secure password above as an example:

Plaintext: “MyP@ssword1”
SHA256 Hash: 55BFFD094830B5D09311BB357C415D8D1323F8185EE2F0C1F94E96C3E2BDD1B5

Well it turns out quite a few other Widgetology users had the same idea, and every one of these “MyP@ssword1” passwords generates the same hash. So inside the user database there will be a number of hashes that look just like mine. What’s worse is, across a million other applications and a billion users, turns out that all over the planet there are user databases full of the exact same hash.
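To make the point concrete, the short Python sketch below hashes a few passwords and groups users by their stored hash. The usernames and the third password are invented for the example.

```python
import hashlib
from collections import defaultdict

def sha256_hex(plaintext: str) -> str:
    return hashlib.sha256(plaintext.encode("utf-8")).hexdigest()

# Made-up users; two of them picked the same password.
stored = {
    "alice": sha256_hex("MyP@ssword1"),
    "bob": sha256_hex("MyP@ssword1"),
    "carol": sha256_hex("correct horse battery staple"),
}

# Group usernames by hash: identical passwords land in the same bucket.
users_by_hash = defaultdict(list)
for user, stored_hash in stored.items():
    users_by_hash[stored_hash].append(user)

shared = [sorted(users) for users in users_by_hash.values() if len(users) > 1]
print(shared)  # [['alice', 'bob']]
```

Anyone holding the database can see which accounts share a password without knowing what that password is.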

A SHA256 hash cannot be reversed, but in many cases it just doesn’t need to be. An attacker will simply compare the stolen data against a precomputed list of hashes (commonly called a rainbow table, although strictly speaking a rainbow table is a more compact variant of the same idea). The list covers a massive number of common words, phrases, and all the usual symbol and number substitutions we all think help keep us secure, along with their corresponding hashes. The attacker will simply look up my hash in the rainbow table, and they will have my password in seconds. My Widgetology account has been “hacked”.
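The lookup described above can be sketched in a few lines of Python. This is a plain precomputed dictionary rather than a true rainbow table, and the tiny password list is an assumption for illustration; real lists run to billions of entries.

```python
import hashlib

def sha256_hex(plaintext: str) -> str:
    return hashlib.sha256(plaintext.encode("utf-8")).hexdigest()

# Illustrative list of common passwords (a real attacker's list is vastly larger).
common_passwords = ["password", "123456", "qwerty", "MyP@ssword1", "letmein"]

# Built once, then reused against every stolen database: hash -> plaintext.
lookup = {sha256_hex(p): p for p in common_passwords}

# Stands in for a hash taken from a stolen user database.
stolen_hash = sha256_hex("MyP@ssword1")
print(lookup.get(stolen_hash))                 # MyP@ssword1
print(lookup.get(sha256_hex("zq8#Lr0vT!m2")))  # None (not in the list)
```

No hashing happens at attack time: the expensive work was done in advance, so each lookup is close to instant.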

If I’ve been a little bit more obscure with my password choice and it doesn’t appear in a rainbow table I will definitely be a little safer. But there’s still an issue.

The attacker has the hash list on their own PC, and can decipher the more complex passwords at their leisure. This is done using automated brute-force attacks. A program will create hashes using words, phrases, symbols and numbers and compare them to the stolen hash list. Brute-forcing can take a lot of time, especially if your password is long and not just full of common words. It is helped along by two factors, however. First, SHA256 is a very fast algorithm, so a lot of hashes can be generated without a lot of computing power. Second, the attacker has the hash list for every user from the database and can compare each guess with the full list in milliseconds.
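Here is a small Python sketch of that kind of automated guessing against unsalted hashes. The word list, substitution rules, suffixes and stolen accounts are all made up for the example; real cracking tools use far larger rule sets.

```python
import hashlib
from itertools import product

def sha256_hex(plaintext: str) -> str:
    return hashlib.sha256(plaintext.encode("utf-8")).hexdigest()

# Made-up stolen database: hash -> username.
stolen = {
    sha256_hex("Widget2023"): "alice",
    sha256_hex("w1dget!"): "bob",
    sha256_hex("zq8#Lr0vT!m2"): "carol",
}

base_words = ["widget", "password"]
substitutions = {"i": "i1", "a": "a@", "o": "o0"}  # each letter and its common swap
suffixes = ["", "!", "1", "2023"]

def variants(word):
    """Yield the word with common symbol swaps, in lower case and capitalised."""
    options = [substitutions.get(ch, ch) for ch in word]
    for combo in product(*options):
        candidate = "".join(combo)
        yield candidate
        yield candidate.capitalize()

cracked = {}
for word in base_words:
    for candidate in variants(word):
        for suffix in suffixes:
            guess = candidate + suffix
            guess_hash = sha256_hex(guess)  # one hash per guess...
            if guess_hash in stolen:        # ...checked against every user at once
                cracked[stolen[guess_hash]] = guess

print(cracked)  # {'alice': 'Widget2023', 'bob': 'w1dget!'}
```

The long random password survives, while the two built from a common word plus predictable tweaks fall almost immediately.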

As computing power improved and rainbow tables of known password hashes got larger and larger, this brute force problem became a major issue. Storing passwords as simple hashes is now considered insecure, although unfortunately it is still very common.

The solution is to make password hashes unique, even if the passwords are not. Pass the salt…

Salting

Like sprinkling salt on your dinner, adding a salt to a password hash adds some randomness and makes each password hash unique.

Here’s how it works:

You enter your plaintext password as usual in the account creation process, and the Widgetology back end takes over to create your account. This time, instead of just hashing your password, the system creates a random string of characters — a salt — and adds it to your input. Your password becomes the password + salt.

My password: MyP@ssword1
Salt: XElWz9WPwSLK3y0jUP6KhO
Salted password: MyP@ssword1XElWz9WPwSLK3y0jUP6KhO

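The salted password is then hashed and stored in the user database. (Strictly speaking, Bcrypt mixes the salt into its internal setup rather than appending it to the password as text, but the effect is the same.) Using Bcrypt, a common password hashing function with salting built in, it would look like this: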

Salted Hash: $2y$10$XElWz9WPwSLK3y0jUP6KhOHepv.KF4zj6z4J3XXyYRye.VXnPsMA2

Where:

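  • $2y is the algorithm identifier (a version of Bcrypt, which is built on the Blowfish cipher)
  • $10 is the cost (or complexity/time); each increase of 1 doubles the work, so a cost of 10 means 2^10 = 1,024 rounds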
  • XElWz9WPwSLK3y0jUP6KhO is the salt (always 22 characters)
  • Hepv.KF4zj6z4J3XXyYRye.VXnPsMA2 is the hash of the password+salt (always 31 characters)
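The field layout in the list above can be checked with a few lines of Python. The `parse_bcrypt` helper and the `BcryptParts` type are hypothetical names written for this article; the sketch only splits the stored string and does not compute or verify a Bcrypt hash.

```python
from typing import NamedTuple

class BcryptParts(NamedTuple):
    version: str
    cost: int
    salt: str
    digest: str

def parse_bcrypt(hash_string: str) -> BcryptParts:
    """Split a stored Bcrypt string into version, cost, salt and hash."""
    # Splitting "$2y$10$<53 chars>" on "$" gives ["", "2y", "10", "<53 chars>"].
    _, version, cost, payload = hash_string.split("$")
    if len(payload) != 53:
        raise ValueError("expected 22 salt characters followed by 31 hash characters")
    return BcryptParts(version, int(cost), payload[:22], payload[22:])

parts = parse_bcrypt("$2y$10$XElWz9WPwSLK3y0jUP6KhOHepv.KF4zj6z4J3XXyYRye.VXnPsMA2")
print(parts.version, parts.cost)  # 2y 10
print(parts.salt)                 # XElWz9WPwSLK3y0jUP6KhO
print(parts.digest)               # Hepv.KF4zj6z4J3XXyYRye.VXnPsMA2
print(2 ** parts.cost)            # 1024
```

Everything needed to re-check a password (version, cost and salt) travels with the hash in one string, which is why no separate salt column is required.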

In addition to the salt, modern password hashing algorithms have been deliberately slowed down. It may take anywhere from a fraction of a second to a second or two longer to create a password hash using modern salting techniques, depending on the configured cost, but we don’t do it often and it is barely noticeable to us. To an attacker trying to create millions of hash guesses, however, that extra time is considerable. This extra complexity and time is configurable, so applications can make it even harder if required (at the cost of more computing power), and we can all make it harder over time as computing power increases.
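To get a feel for a configurable cost, the sketch below times a deliberately slow hash at three settings. It uses PBKDF2 from the Python standard library as a stand-in for Bcrypt (which needs a third-party package), and the iteration counts are arbitrary values chosen for the demonstration, not recommendations.

```python
import hashlib
import secrets
import time

def slow_hash(plaintext: str, salt: bytes, iterations: int) -> bytes:
    """PBKDF2-HMAC-SHA256: repeats the underlying hash `iterations` times."""
    return hashlib.pbkdf2_hmac("sha256", plaintext.encode("utf-8"), salt, iterations)

salt = secrets.token_bytes(16)
for iterations in (1_000, 10_000, 100_000):
    start = time.perf_counter()
    slow_hash("MyP@ssword1", salt, iterations)
    elapsed = time.perf_counter() - start
    print(f"{iterations:>7} iterations: {elapsed * 1000:.1f} ms")
```

The exact timings depend on the machine, but each tenfold increase in iterations should cost roughly ten times as long, for the legitimate login and for every attacker guess alike.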

You may have noticed we are storing the salt with the hash. This means an attacker who has stolen the database knows the random salt string we added to each password to generate the hash. Turns out this isn’t very helpful for our attacker.
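A minimal sketch of storing the salt next to the hash might look like the Python below. PBKDF2 from the standard library stands in for Bcrypt here, and the record layout, function names and iteration count are assumptions for the example.

```python
import hashlib
import hmac
import secrets

ITERATIONS = 100_000  # assumed work factor for this sketch

def create_record(plaintext: str) -> dict:
    """Generate a fresh random salt and store it alongside the hash."""
    salt = secrets.token_bytes(16)
    digest = hashlib.pbkdf2_hmac("sha256", plaintext.encode("utf-8"), salt, ITERATIONS)
    return {"salt": salt.hex(), "hash": digest.hex()}

def verify(plaintext: str, record: dict) -> bool:
    """Re-hash the login attempt with the stored salt and compare."""
    salt = bytes.fromhex(record["salt"])
    digest = hashlib.pbkdf2_hmac("sha256", plaintext.encode("utf-8"), salt, ITERATIONS)
    return hmac.compare_digest(digest.hex(), record["hash"])

mine = create_record("MyP@ssword1")
yours = create_record("MyP@ssword1")
print(mine["hash"] == yours["hash"])  # False: same password, different salts
print(verify("MyP@ssword1", mine))    # True
print(verify("MyP@ssword1x", mine))   # False
```

The salt is not a secret. Its job is only to make every stored hash different, so it has to be kept where the login check can read it.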

In theory the attacker could add the salt to a common password, generate the hash, then compare it to our stored hash to see if it matches.

Common password + salt = hash?

It will work, but they have to do that for every password in their big list of common passwords with my salt until they find a match. With the slow Bcrypt algorithm on our side, this could take a very very long time.

What’s worse for the hacker is that every salt is unique, which means that every user who has come up with the same password as me has a different hash. The attacker has to go through this common password + salt = hash process individually for each user in the database. What can be done in seconds with a SHA256 database may take thousands of years with a Bcrypt hash list.
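Some back-of-the-envelope arithmetic shows where the difference comes from. Every number in the Python sketch below (guess count, user count and hash rates) is an assumption picked for illustration, not a benchmark of any real hardware, and it models the worst case of trying every guess against every user.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600

def unsalted_hashes(guesses: int, users: int) -> int:
    """Unsalted: each guess is hashed once and checked against every user."""
    return guesses

def salted_hashes(guesses: int, users: int) -> int:
    """Salted: each guess must be re-hashed with every user's own salt."""
    return guesses * users

guesses = 1_000_000_000    # assumed size of the attacker's guess list
users = 1_000_000          # assumed number of accounts in the stolen database
fast_rate = 1_000_000_000  # assumed guesses per second for a fast hash
slow_rate = 10_000         # assumed guesses per second for a slow, salted hash

fast_seconds = unsalted_hashes(guesses, users) / fast_rate
slow_years = salted_hashes(guesses, users) / slow_rate / SECONDS_PER_YEAR

print(f"Unsalted fast hash: about {fast_seconds:.0f} second(s)")  # about 1 second(s)
print(f"Salted slow hash: about {slow_years:,.0f} years")         # about 3,169 years
```

Two effects multiply together: the salt forces the work to be repeated per user, and the slow algorithm makes each unit of work far more expensive.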

Adding salt to a hash increases the difficulty of rainbow table lookups and brute-forcing significantly, and is now the minimum standard for password storage.

Problem solved. Almost.

Password Spraying

Stealing a user database isn’t the only way an attacker can hack your account, and it’s not the most common.

The admins at Widgetology limit the number of times you can enter an incorrect password before your account will be blocked. This prevents attackers from simply trying passwords continually until they guess the right one.
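A per-account limit like this can be sketched as a simple failure counter. The `LoginGuard` class and the threshold of five attempts are assumptions for the example; real systems usually add time-based unlocking and persistent storage.

```python
MAX_FAILED_ATTEMPTS = 5  # assumed limit for this sketch

class LoginGuard:
    """Tracks consecutive failed logins per username."""

    def __init__(self, max_failures: int = MAX_FAILED_ATTEMPTS):
        self.max_failures = max_failures
        self.failures = {}

    def is_locked(self, username: str) -> bool:
        return self.failures.get(username, 0) >= self.max_failures

    def record_attempt(self, username: str, success: bool) -> None:
        if success:
            # A successful login resets the counter.
            self.failures.pop(username, None)
        else:
            self.failures[username] = self.failures.get(username, 0) + 1

guard = LoginGuard()
for _ in range(5):
    guard.record_attempt("mark", success=False)
print(guard.is_locked("mark"))   # True
print(guard.is_locked("alice"))  # False
```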

However, what is more difficult to defend against is an attacker using a single password against multiple accounts, a technique called password spraying. If they know or can guess usernames, they can over time simply try passwords against large numbers of accounts without being blocked.
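The Python sketch below shows why a per-account counter misses this pattern, and one possible signal that does catch it: counting how many different accounts a single source has failed against. The event data, addresses and thresholds are all invented for the example.

```python
from collections import Counter, defaultdict

LOCKOUT_THRESHOLD = 5  # assumed per-account failure limit
SPRAY_THRESHOLD = 20   # assumed limit on distinct accounts per source

# Made-up failed login events: (source address, username).
events = [("203.0.113.7", f"user{i}") for i in range(100)]  # one guess per account
events += [("198.51.100.4", "mark")] * 3                    # someone mistyping their own password

# View 1: failures per account (what a lockout policy sees).
failures_per_account = Counter(username for _, username in events)

# View 2: distinct accounts targeted per source.
accounts_per_source = defaultdict(set)
for source, username in events:
    accounts_per_source[source].add(username)

locked = [u for u, n in failures_per_account.items() if n >= LOCKOUT_THRESHOLD]
suspicious = [s for s, users in accounts_per_source.items() if len(users) >= SPRAY_THRESHOLD]

print(locked)      # []
print(suspicious)  # ['203.0.113.7']
```

No account ever reaches the lockout limit, yet one source has quietly tried 100 of them. An attacker who spreads the attempts across many addresses would evade this check too, which is part of what makes spraying hard to stop.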

A variation, known as credential stuffing, is to try your username/password combination across multiple sites. Your email address is likely to be your username across many of the applications you use, so it’s only your password protecting you against this technique.

Which leads us to the primary point of this article…

Don’t Re-Use Passwords Across Sites!

When we put all this together, hopefully it’s become apparent that using the same password, and especially the same username/password, across multiple sites is a bad idea!

Many sites still use simple hashing for password storage and their databases are vulnerable to brute-force attacks if stolen. Even salted passwords can be cracked; it’s just a matter of time (with password complexity playing a big part).

As soon as a password database is leaked attackers all over the internet start brute-forcing the passwords, and from there start using the username/password combinations to try logging in to other sites.

Want to know if your own super-secret passwords are already out there? Have I Been Pwned will tell you. If you’ve been using your email address for more than a few years you will almost certainly be on the list.
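For the technically curious, the site's Pwned Passwords service can be queried without ever sending it your password: you send only the first five characters of the password's SHA-1 hash and search the returned list of hash endings yourself. The Python sketch below covers the local half of that exchange. It makes no network call, the helper names are hypothetical, and the sample response text is fabricated for the demonstration.

```python
import hashlib

def split_sha1(password: str) -> tuple:
    """Return (first 5, remaining 35) characters of the upper-case SHA-1 hex digest."""
    digest = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
    return digest[:5], digest[5:]

def count_in_response(suffix: str, response_text: str) -> int:
    """Find our suffix in a response made of "SUFFIX:COUNT" lines."""
    for line in response_text.splitlines():
        candidate, _, count = line.partition(":")
        if candidate.strip().upper() == suffix:
            return int(count)
    return 0

prefix, suffix = split_sha1("MyP@ssword1")
url = f"https://api.pwnedpasswords.com/range/{prefix}"  # only the prefix leaves your machine

# Fabricated response body, for illustration only (real counts will differ).
sample_response = f"0018A45C4D1DEF81644B54AB7F969B88D65:3\n{suffix}:42\n"
print(count_in_response(suffix, sample_response))  # 42
```

A count of zero means the password was not found in the breach data, not that it is a strong password.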

Other mistakes, mis-configurations and malicious insiders can cause plaintext passwords to leak. Admins are human too.

In addition to avoiding re-use, use complex passwords that are machine generated whenever possible. Use a password manager like 1Password, Bitwarden or LastPass to manage your passwords for you and help make them complex and unique. (Note that LastPass has had some bad publicity lately after users’ password databases were stolen. It’s still far more secure than managing your own passwords, assuming you don’t use a simple-to-guess password on your LastPass account. 1Password uses a less convenient method of securing your password database that is designed to withstand that type of breach: your password database cannot be decrypted without a key that is only stored locally with you.)
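As a rough idea of what "machine generated" means, the Python sketch below builds a password from a cryptographically secure random source. The character set and the default length of 20 are assumptions for the example; a password manager does essentially the same job and remembers the result for you.

```python
import secrets
import string

# Letters, digits and punctuation: 94 possible characters per position.
ALPHABET = string.ascii_letters + string.digits + string.punctuation

def generate_password(length: int = 20) -> str:
    """Pick each character independently using the OS's secure random source."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))

print(generate_password())  # different every run
```

The `secrets` module is used rather than `random` because `random` is predictable by design and not intended for security purposes.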

If you are creating your own passwords and need to remember them, use phrases rather than words to make them longer, and use uncommon words or nonsense sentences. Length is the most effective way to increase complexity.
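The effect of length can be put into numbers. The Python sketch below estimates the strength of a randomly chosen password in bits, then builds a random passphrase. The figure of 7,776 words matches a typical dice-based word list, and the eight-word list in the code is a tiny stand-in used only to keep the example short. The estimate applies only to random choices, not to words a person picks themselves.

```python
import math
import secrets

def entropy_bits(choices_per_position: int, length: int) -> float:
    """Bits of entropy when each position is picked at random, independently."""
    return length * math.log2(choices_per_position)

print(round(entropy_bits(94, 8), 1))    # 52.4  (8 random printable characters)
print(round(entropy_bits(7776, 4), 1))  # 51.7  (4 random words from 7,776)
print(round(entropy_bits(7776, 6), 1))  # 77.5  (6 random words from 7,776)

# Tiny hypothetical word list; a real one would have thousands of entries.
WORDS = ["copper", "lantern", "orbit", "velvet", "marble", "thistle", "canyon", "ember"]

def random_passphrase(words, count: int = 4, separator: str = "-") -> str:
    return separator.join(secrets.choice(words) for _ in range(count))

print(random_passphrase(WORDS))  # different every run
```

Four random words are about as strong as eight random symbols and far easier to remember, and each extra word adds roughly 13 more bits.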

And always turn on two-factor authentication when it is available — preferably using an authenticator app rather than SMS.
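For readers wondering what an authenticator app actually does, the six-digit codes are typically time-based one-time passwords (TOTP): an HMAC of the current 30-second time step, keyed with a secret shared between the app and the site. The Python below is a minimal sketch of that calculation using the published test secret from the HOTP and TOTP specifications. It is for understanding only, not a replacement for a vetted library.

```python
import hashlib
import hmac
import struct
import time
from typing import Optional

def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
    """HMAC-based one-time password: HMAC-SHA1 of the counter, truncated to digits."""
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F  # low 4 bits of the last byte select a 4-byte window
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)

def totp(secret: bytes, now: Optional[float] = None, step: int = 30, digits: int = 6) -> str:
    """Time-based variant: the counter is the number of 30-second steps since 1970."""
    if now is None:
        now = time.time()
    return hotp(secret, int(now // step), digits)

secret = b"12345678901234567890"       # test secret from the specifications
print(hotp(secret, 0))                 # 755224
print(totp(secret, now=59, digits=8))  # 94287082
print(totp(secret))                    # changes every 30 seconds
```

Because the code depends on both the shared secret and the current time, a stolen password alone is not enough to log in.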

For the Technical Readers

Part of the prompt for this article was being asked by a vendor to use SHA256 hashes in the password store. It still happens.

Even if your role is not involved in the build of authentication processes it is something all developers and network/sysadmins should understand. Auth0 has a series of blog articles that explain the technical concepts with examples, and I highly recommend starting there:

Hashing Passwords

Using Bcrypt to Hash and Salt

Hope that helps you all, please leave a comment if you have any questions or suggestions.

About the Author

Mark is a 20+yr IT industry veteran, with technical expertise and entrepreneurial experience in a range of fields. For many years now his passion has been building on AWS, and he holds certifications including Solution Architect Professional and the Advanced Networking Specialty. He wants you to know that he is an infrastructure and architecture expert who writes code, not a professional software developer — just in case you spot something odd.

Mark has spent the last few years in the banking industry, where he is currently building and leading a team focused on using AWS serverless technologies to reduce costs and speed up the time to market for new services and products.

You can also find Mark on:

GitHub — https://github.com/markilott

LinkedIn — https://www.linkedin.com/in/markilott/
