Statistics of 62K Passwords

A couple of days ago, LulzSec published a batch of 62K random logins (emails and passwords). At first, I grabbed it to make sure that neither I nor any of my contacts had passwords exposed. Later I decided to run a few stats on this rare dump of data. Here are a few interesting facts.

Password length

The dump’s average password length is 7.63 characters. I was surprised, because I thought most users would use something like 4 characters, but then I remembered that a lot of sites nowadays enforce a minimum length of 6-8 characters, so this makes sense. As you should know, and as you can find in Hacking: The Art of Exploitation, longer passwords are much harder to crack, so this is definitely a case where size does matter.
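
For anyone who wants to reproduce this kind of number, here is a minimal sketch of computing the average length and the length distribution. It assumes the logins have already been parsed into a plain list of password strings; the function name `length_stats` and the sample list are made up for illustration and are not taken from the dump.

```python
from collections import Counter

def length_stats(passwords):
    """Return (average length, Counter mapping length -> number of passwords)."""
    lengths = [len(p) for p in passwords]
    average = sum(lengths) / len(lengths)
    return average, Counter(lengths)

# Made-up sample, not taken from the dump.
sample = ["123456", "password", "tigger", "abc", "correcthorse"]
average, histogram = length_stats(sample)

print(f"average length: {average:.2f}")
# Crude text histogram: one '#' per password of that length.
for length in sorted(histogram):
    print(f"{length:2d} {'#' * histogram[length]}")
```

Keeping the counts in a `Counter` makes it easy to feed the same data to a plotting library later.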

Here’s a short graph depicting the distribution of password lengths (note that the edge groups have fewer than 10 passwords each and so aren’t really visible here):

[Figure: Passwords by length]

Common Passwords

Not surprisingly, the most common password is 123456 with 569 occurrences, followed by its “more secure” cousin 123456789 with 184. The 3rd most common password is… “password” (132 occurrences)! The other top-10 passwords are interesting: some are plain words such as “romance”, “mystery”, “tigger” and “shadow”, and “102030” makes quite a few appearances.
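
A ranking like this one can be produced with a simple frequency count. The following is only a sketch, assuming a list of password strings is already available; `most_common_passwords` and the sample data are hypothetical names and values chosen for illustration.

```python
from collections import Counter

def most_common_passwords(passwords, n=10):
    """Return the n most frequent passwords as (password, count) pairs."""
    return Counter(passwords).most_common(n)

# Made-up sample, not taken from the dump.
sample = ["123456", "password", "123456", "tigger", "123456", "password"]
top = most_common_passwords(sample, n=2)

for password, count in top:
    print(f"{password}: {count}")
```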

The 10th most used password is actually quite intriguing: “ajcuivd289”. Everyone on the internet seems baffled as to the source of this password. My guess is that some worm resets the password of every account it hacks into to this value. Edit: As Marc comments below, the logins with this password seem “clustered”, which makes it more likely that they are actually the result of some bot creating accounts. Thanks Marc!

A couple hundred passwords are just not-so-random keyboard taps (“123qwe”, “asdf1234”, etc.). 789 passwords are identical to the username, and twice that many are part of the username followed by some digits (most look like birth years).
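
One possible way to detect username-derived passwords is sketched below. It rests on several assumptions that are not spelled out in the post: the "username" is taken to be the part of the email before the `@`, comparison is case-insensitive, and "part of the username" means a stem of at least 3 characters that appears somewhere in the username. The function name `username_relation` is made up.

```python
import re

# Splits a password into a leading stem and its trailing run of digits.
TRAILING_DIGITS = re.compile(r"^(.*?)(\d+)$")

def username_relation(email, password, min_stem=3):
    """Return 'exact', 'username+digits', or None for an (email, password) pair."""
    username = email.split("@", 1)[0].lower()
    pw = password.lower()
    if pw == username:
        return "exact"
    match = TRAILING_DIGITS.match(pw)
    # Assumed rule: the stem before the digits must be part of the username.
    if match and len(match.group(1)) >= min_stem and match.group(1) in username:
        return "username+digits"
    return None

# Made-up logins, not taken from the dump.
logins = [
    ("dana@example.com", "dana"),
    ("dana.levi@example.com", "dana1984"),
    ("bob@example.com", "123456"),
]
for email, password in logins:
    print(email, password, username_relation(email, password))
```

The `min_stem` threshold is there to avoid counting one or two letter coincidences as matches; a different cutoff will change the totals.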

Inside Passwords

12179 of the passwords are all numeric, and some are 14 digits long! That’s just crazy. While 34717 (that’s more than half) of the passwords contain at least one digit, only 1262 contain capital letters and 533 contain special characters!
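
Counts like these come down to a few character-class checks per password. Here is a rough sketch, assuming that "special character" means anything that is neither a letter nor a digit; the function name `composition_counts` and the sample list are invented for the example.

```python
def composition_counts(passwords):
    """Count passwords by the kinds of characters they contain."""
    counts = {"all_numeric": 0, "has_digit": 0, "has_upper": 0, "has_special": 0}
    for p in passwords:
        counts["all_numeric"] += p.isdigit()
        counts["has_digit"] += any(c.isdigit() for c in p)
        counts["has_upper"] += any(c.isupper() for c in p)
        # Assumption: "special" means not a letter and not a digit.
        counts["has_special"] += any(not c.isalnum() for c in p)
    return counts

# Made-up sample, not taken from the dump.
sample = ["123456", "password", "Tigger1", "p@ss", "12345678901234"]
counts = composition_counts(sample)

for name, n in counts.items():
    # Show both the raw count and the share of the sample.
    print(f"{name}: {n} ({n / len(sample):.0%})")
```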

Some Common Words

418 passwords contain the word “love”. “sex” is in 125, “jesus” in 67. More people prefer cats (414) to dogs (291). And the language battle – 6 javas, 2 pythons and 17 “ruby”s (guess which one is also a name).
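
Word counts of this kind can be approximated with a case-insensitive substring search. This is only a sketch with a made-up function name (`count_containing`) and made-up sample data. Note that plain substring matching also counts passwords where the word is part of something longer, so "cat" would match a password built on "catherine".

```python
def count_containing(passwords, words):
    """Return {word: number of passwords containing it, ignoring case}."""
    lowered = [p.lower() for p in passwords]
    return {word: sum(word in p for p in lowered) for word in words}

# Made-up sample, not taken from the dump.
sample = ["iloveyou", "LoveCats", "hotdog", "ruby1990", "123456"]
word_counts = count_containing(sample, ["love", "cat", "dog", "ruby"])

for word, n in word_counts.items():
    print(f"{word}: {n}")
```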

 

I’d like to sum this up with urging you to never use the same password twice and use a password manager in order to generate secure passwords! Using a password manager ensures that even if a certain site is breached, it doesn’t mean all of your passwords are revealed, and secure passwords are just harder to brute force.


  1. I think passwords are really unwieldy by definition. They really go against what we know about the cognitive capabilities of people. It’s about time we figured out nicer means of dealing with authentication. I guess some kind of combination of physical (dongle, mobile phone, etc.) and biometric means might be one nice alternative.

    It’s going to be interesting to see where things will go. We already have the tech needed for more robust schemes. It’s just a matter of finding some commonly accepted replacement for good old passwords.

    I am going to be really surprised if we’re still stuck with text based passwords after a decade or two. There are better alternatives out there.

  2. p says:

    Seriously?

    Don’t validate the actions of black hat criminals and extortionists.

  3. Marc says:

    Regarding ajcuivd289, I believe that is either a single person or group of people signing up multiple times, possibly a bot.

    When looking at the actual list, which is possibly organised by creation date, you can see that accounts using that password are often created in tight groups. Email providers and usernames (mostly female names) are also narrowly distributed for accounts using that password.

  4. Jason says:

    Aviv, interesting post. How did you parse the file, given that the data came in 3 different formats? Would you be willing to share so I can learn?

  5. Rob says:

    This is awesome. I would suggest you present the data as percentages rather than actual counts.

  6. Joe says:

    I think the phrasing in the second sentence of the second paragraph is off. It doesn’t make any sense.

  7. LolHai says:

    language battle?

    I bet C wins, a lot of matches :D

  8. Zach says:

    How many passwords were actually “strong” (i.e. a seemingly random combination of uppercase, lowercase, numbers, and symbols, and longer than 10 characters)?

    This would be an interesting estimate of “what percentage of people use password managers”?

  9. Ben Sizer says:

    “I’d like to sum this up with urging you to never use the same password twice and use a password manager in order to generate secure passwords!”

    Why?

    If a site is storing your password and it gets leaked, it doesn’t matter how ‘secure’ it was any more – it’s now completely insecure.

    And if a site is not storing your password or not being leaked, the chance of someone guessing it is still minuscule, unless it’s one of those top 20 passwords.

    It is interesting how many coders tell other coders “obscurity is not security” but then they think they can keep an online account secure by forcing people to guess a string of characters. If someone has the time and resources, eventually they will guess correctly. So complex passwords are not a solution, just an aid, and their use has to be balanced against the downsides, e.g. having to remember many passwords, or having to use a password manager (with a single point of failure), etc.

    We talk so much about how bad people’s passwords are, but hardly anybody is losing access to their account that way compared to the hundreds of thousands compromised through poor coding and network security. Yet still we suggest users should try harder, despite most of these problems not being their fault. I agree with Juho: passwords are inherently unwieldy. The solution isn’t to force people to use even more unwieldy passwords, but to find a different authentication mechanism.

    • @Ben – I’ve updated that paragraph to note my real intention – a password manager means that even if a site that stored your password in plaintext is hacked, the rest of your accounts online are safe. That’s opposed to using the same password everywhere. And using it to generate secure ones just makes it less likely it’ll get brute forced in case you are a target.

  10. Jonathan Waller says:

    @Ben

    You have misinterpreted the idea that obscurity is not security. The point is that a secure system remains secure even if someone finds out how it works. If you rely on obscurity, then once someone works out how you do things and breaks it, everybody’s account is compromised. If someone figures out your password, only your account is compromised.

    Also, a good system will throttle login attempts to an account, e.g. with a 1 second delay between attempts. With this in place and an 8 character alphanumeric password it will take 91718 years to try every combination, so it is very probable you will be dead before they break into your account. With a 12 character alphanumeric password you could be perfectly happy with a billion attempts per second.

    So complex passwords with a secure system are in fact brilliantly secure to a high degree of probability (billions to one for cracking in your lifetime). So passwords are in fact a solution, given a perfect implementation.

    Of course as you say weak systems are the larger problem.

    What I find most shocking about these cases is that the passwords were retrievably stored in the database. There is a solution to the problem called one way hashing which is a fancy algorithm which nobody knows how to reverse efficiently. So if somebody steals the database they only have a set of one way hashes and so cannot extract the original passwords, thus the other accounts sharing a password remain secure. This is the reason why many sites reset your password rather than sending you the password (since the site owners can’t undo the hash).

    Of course even with one way hashing it is not perfect since the attacker could install software on the server which captures the passwords as you log in, but this is significantly harder to do and they must wait until you log in to capture your password.

  11. Zach says:

    @Ben Sizer:

    - “Why?

    If a site is storing your password and it gets leaked, it doesn’t matter how ‘secure’ it was any more – it’s now completely insecure.”

    That’s a totally misinformed rationalization.

    The point of using a password manager is **so that** you don’t have to use the same password twice. Obviously if a site’s database gets leaked then someone has your password and it’s now totally insecure. What they don’t have is your password for EVERY OTHER WEBSITE YOU VISIT.

    We’ve already seen on twitter the past couple of days people randomly trying these email / password combinations on facebook / paypal / amazon.com and then running up a bunch of charges on these innocent people’s accounts. That simply would not have happened if they had used strong passwords and a password manager.

    - “And if a site is not storing your password or not being leaked, the chance of someone guessing it is still minuscule, unless it’s one of those top 20 passwords.”

    Wrong again. All sites store your password, it’s just that they don’t always store it in plaintext. Sometimes, for example, they store the MD5 hash. If an attacker were to simply download an MD5 password database from a compromised site and run John The Ripper against it, I guarantee you in less than 2 days most of the passwords would be cracked, including most of the ones not in the top-20.

    That’s anything but minuscule.

  12. Jason says:

    Just want to say neat article

  13. Perry Vivino says:

    I think youve created some actually interesting points. Not as well many people would basically think about this the way you just did. Im truly impressed that theres so very much about this topic thats been uncovered and you did it so nicely, with so a lot class. Superior one you, man! Actually excellent things here.

  14. Bob says:

    “it’s more secure cousin” should be “its more secure cousin”

    My soul died slightly!

    It’s = it is

  15. In this day and age, I can’t believe so many people still use things like “password” for a password. How many horror stories do you have to hear before you throw a couple @ signs or & in for good measure?

  16. I really like seeing progressive articles just like this with high quality information compiled and also talked about. I think that if you had dug even a tiny bit deeper, this actual post could almost become a good academic article or perhaps academic resource. I just added your website to my RSS reader to watch for what you might have in the future.

  17. Loads of excellent writing here. It was indeed very helpful and insightful while being straight forward and to the point. Thanks for the posting.

