June 17, 2016
Introduction to GPU Password Cracking: Owning the LinkedIn Password Dump
This blog was written by Martin Bos, Senior Principal Security Consultant - TrustedSec
Unless you’ve been living under a rock for the past few months, you have probably heard about the dump from the 2012 LinkedIn hack being released. TrustedSec was able to acquire a copy of the list and use it for research purposes. Our friends over at Korelogic have already posted an excellent analysis of the list showing the most common words, patterns, and other statistics, so we are not going to rehash that information. Instead, the LinkedIn list offers an opportunity for us at TrustedSec to share our password recovery methodology step by step and show how we attack large password breach lists. The passwords gained from these types of breaches are very valuable to us on penetration tests because people often reuse passwords across work and social media. Our hope is that by now everyone on this list has reset their password and is no longer using the password they used for LinkedIn in 2012. However, since we cannot be sure, we have no plans to share the list, so please don’t ask.
The list we received contained 167,370,909 entries in a SHA1 unsalted hash format. The list contains a large number of duplicate hashes which is valuable for statistical analysis but we don’t need that to go over cracking methodology. After removing all of the duplicates and blank lines we were left with 117,205,871 unique hashes to crack.
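The cleanup step can be sketched with standard tools. This is a minimal illustration, not the exact commands used for the real dump: the file names raw.hash and unique.hash are placeholders, and the tiny sample simply reuses two hashes that appear later in this post.

```shell
#!/bin/bash
# Build a tiny stand-in for the raw dump (hypothetical file names):
# two distinct hashes, one duplicate, and one blank line.
workdir=$(mktemp -d)
printf '%s\n' \
  'f7d5b2c833ef067bf3d5764e3dd28c1b97c94385' \
  '' \
  'cab5a9547e82cb7bf7c86f32df74f7da69d527c0' \
  'f7d5b2c833ef067bf3d5764e3dd28c1b97c94385' > "$workdir/raw.hash"

# Drop empty lines, then sort and keep a single copy of each hash.
grep -v '^$' "$workdir/raw.hash" | sort -u > "$workdir/unique.hash"

total=$(wc -l < "$workdir/raw.hash")
unique=$(wc -l < "$workdir/unique.hash")
echo "total lines: $total, unique hashes: $unique"
```

On a list of this size, sort will spill to temporary files on disk, so make sure there is enough free space before running it.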
At TrustedSec, we have a large password cracking server that was provided to us by Jeremi Gosney and the fine folks over at Sagitta. It is more than capable of loading up the whole 117 million hashes at one time; however, because not everyone has a box of this size, I decided to split up the list into more manageable chunks. You can decide what is a manageable number of lines based on your hardware specifications.
I used 10 million lines per chunk to split up my list, but you may want to use fewer.
root@kracker:~/LINKEDIN_WORKING/cudaHashcat/HASHFILES# split -dl 10000000 --additional-suffix=.txt linkedin.hash link
root@kracker:~/LINKEDIN_WORKING/cudaHashcat/HASHFILES# ls
link00.txt link00.txt.new link01.txt link02.txt link03.txt link04.txt link05.txt link06.txt link07.txt link08.txt link09.txt link10.txt link11.txt linkedin.hash
This gave me 12 hash files to work with (link00.txt through link11.txt). My plan was to run all of the basic password attacks against each of these lists to get the low-hanging fruit out of the way, then recombine the hash lists and start the more advanced password recovery tactics. This post is also meant to be a tutorial on how to use Cudahashcat, so I will try to showcase each of the attack modes even though it may not be totally necessary. The first thing to identify in any hash list is the type of hash. We know the list is in unsalted SHA1 format, so we need to find the mode for that type of hash. Executing Cudahashcat with the -h flag will show us all of our options.
Hash types:
900 = MD4
0 = MD5
5100 = Half MD5
100 = SHA1
10800 = SHA-384
1400 = SHA-256
1700 = SHA-512
5000 = SHA-3(Keccak)
10100 = SipHash
6000 = RipeMD160
6100 = Whirlpool
6900 = GOST R 34.11-94
11700 = GOST R 34.11-2012 (Streebog) 256-bit
11800 = GOST R 34.11-2012 (Streebog) 512-bit
You can see here that SHA1 is mode 100, so that is what we are going to be using for the entire exercise. The next thing we want to determine is the attack mode we want to start with.
Attack modes:
0 = Straight
1 = Combination
3 = Brute-force
6 = Hybrid dict + mask
7 = Hybrid mask + dict
To begin with, I always do a brute force of 1-6 characters to get the party started. Cudahashcat uses what are referred to as password masks, in which a placeholder represents the character set for each position.
?l = abcdefghijklmnopqrstuvwxyz
?u = ABCDEFGHIJKLMNOPQRSTUVWXYZ
?d = 0123456789
?s = !"#$%&'()*+,-./:;<=>?@[\]^_`{|}~
?a = ?l?u?d?s
?b = 0x00 - 0xff
This will be particularly useful later on when we want to tailor our attack a little more, but to get started we are going to use the ?a mask to represent all of the characters.
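It helps to know how large that search space is before starting. The sketch below is a rough count, assuming that the built-in ?s set also contains the space character (33 symbols in total), which makes ?a a 95-character set. The variable names are only for illustration.

```shell
#!/bin/bash
# Assumed charset sizes: ?l=26, ?u=26, ?d=10, ?s=33 (including space), so ?a=95.
charset_size=$((26 + 26 + 10 + 33))
max_len=6

total=0
per_len=1
for ((len = 1; len <= max_len; len++)); do
  per_len=$((per_len * charset_size))   # candidates of exactly $len characters
  total=$((total + per_len))
  echo "length $len: $per_len candidates"
done
echo "total for lengths 1-$max_len: $total"
```

Under that assumption the incremental run covers 742,912,017,120 candidates, almost all of them at length 6. Each extra position multiplies the work by 95, which is why full brute force stops being practical a few characters later.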
root@kracker:~/LINKEDIN_WORKING/cudaHashcat# ./cudaHashcat64.bin -m 100 --remove HASHFILES/link01.txt -o linkedin.cracked -i -a 3 ?a?a?a?a?a?a
The flags I am using for this attack are as follows:
-m 100 = The hash mode. Remember, it is SHA1.
--remove = Removes each hash from the file once it has been recovered.
-o = The output file for your cracked hashes.
-i = Increment mode, so the cracking will start at 1 character and move up in increments to the number of ?a you defined on the command line.
-a 3 = Our attack mode. We are using brute force for this example.
?a?a?a?a?a?a = The maximum length we want to brute force up to. In this example, it is 6, but you can use 5, 7, 8, or whatever you want.
I will use this attack against each of my hash lists. You could easily write a quick for loop in bash to iterate through each of the files.
#!/bin/bash
# Run from the cudaHashcat directory; loops over every split hash file.
for file in HASHFILES/link*.txt
do
    ./cudaHashcat64.bin -m 100 --remove "$file" -o linkedin.cracked -i -a 3 '?a?a?a?a?a?a'
done
Next, let’s take a look at a simple wordlist attack. This is attack mode 0; however, since it is the default attack, we don’t have to specify an -a flag on the command line. The basis of this attack is very simple: it goes through a wordlist, hashes each word, and checks whether the result matches one of our hashes. Obviously, this attack is only as good as your wordlist collection. At the end of the article, I will link to some resources to download some wordlists to get you started. I will begin by checking the Rockyou.txt wordlist.
root@kracker:~/LINKEDIN_WORKING/cudaHashcat# ./cudaHashcat64.bin -m 100 --remove HASHFILES/link01.txt -o linkedin.cracked /wordlists/rockyou.txt
One really cool thing about Cudahashcat is that you can specify an entire directory of wordlists with a *, so instead of having one giant list, you can have multiple smaller lists.
root@kracker:~/LINKEDIN_WORKING/cudaHashcat# ./cudaHashcat64.bin -m 100 --remove HASHFILES/link01.txt -o linkedin.cracked /wordlists/*
Once again, you can just use a quick for loop to iterate through all of your hash files.
#!/bin/bash
# Run from the cudaHashcat directory; tries every wordlist against every split hash file.
for file in HASHFILES/link*.txt
do
    ./cudaHashcat64.bin -m 100 --remove "$file" -o linkedin.cracked /wordlists/*
done
Once you have burned through all of your wordlists, it’s time to add some rules. Cudahashcat has rule files that have one command per line. For a thorough breakdown of the rule-based attack, you can see the Hashcat Wiki. For the most part, all of the effective rules have been written already and are included with Cudahashcat. In order to use a rule file, we specify -r on the command line and the path to the rule file.
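To make the idea of a rule more concrete, here is a toy sketch of what a handful of common rule functions do to a single word. The apply_rule helper is hypothetical and written only for this illustration. It handles one function per rule, whereas real rule lines can chain several, and the actual work happens inside Cudahashcat rather than in bash.

```shell
#!/bin/bash
# Toy interpreter for a few single-function hashcat rules (requires bash 4+).
apply_rule() {
  local word=$1 rule=$2 lower
  case $rule in
    :)   printf '%s\n' "$word" ;;                  # ":" leaves the word unchanged
    l)   printf '%s\n' "${word,,}" ;;              # "l" lowercases every letter
    u)   printf '%s\n' "${word^^}" ;;              # "u" uppercases every letter
    c)   lower=${word,,}                           # "c" capitalizes the first letter,
         printf '%s\n' "${lower^}" ;;              #     lowercasing the rest
    \$?) printf '%s%s\n' "$word" "${rule:1}" ;;    # "$X" appends character X
    ^?)  printf '%s%s\n' "${rule:1}" "$word" ;;    # "^X" prepends character X
    *)   echo "unsupported rule: $rule" >&2; return 1 ;;
  esac
}

# Apply each rule in turn to the same base word.
for rule in : u c '$1' '^!'; do
  apply_rule password "$rule"
done
```

Each rule line in a file is applied to every word in the wordlist, so a file with 64 rules turns one pass over the dictionary into 64 passes.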
./cudaHashcat64.bin -m 100 --remove HASHFILES/link01.txt -o linkedin.cracked -r rules/best64.rule /wordlists/*
The next attack we want to look at is the Hybrid Attack. This is a combination of the dictionary attack and the mask attack.
6 = Hybrid dict + mask
7 = Hybrid mask + dict
We choose either -a 6 or -a 7 depending on what we want to do. The 6 attack appends the mask we define to each word, and the 7 attack prepends it.
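As a toy illustration of the difference, the sketch below builds the candidates for a two-word dictionary and a single ?d mask by hand. The word list is made up, and the ordering here is not necessarily the order in which Cudahashcat tries candidates.

```shell
#!/bin/bash
# Toy illustration only: Cudahashcat generates these candidates itself.
words=(pass omg)                 # hypothetical two-word dictionary
digits=(0 1 2 3 4 5 6 7 8 9)     # what the mask ?d expands to

append=()    # -a 6 style: dictionary word on the left, mask on the right
prepend=()   # -a 7 style: mask on the left, dictionary word on the right
for w in "${words[@]}"; do
  for d in "${digits[@]}"; do
    append+=("$w$d")
    prepend+=("$d$w")
  done
done

echo "-a 6 candidates: ${append[*]}"
echo "-a 7 candidates: ${prepend[*]}"
```

Two words and a ten-character mask give 20 candidates in each direction. The count is always the wordlist size multiplied by the mask keyspace, so long masks on big wordlists get expensive quickly.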
root@kracker:~/LINKEDIN_WORKING/cudaHashcat# ./cudaHashcat64.bin -m 100 --remove HASHFILES/link01.txt -o linkedin.cracked -i -a 6 /wordlists/rockyou.txt ?a?a?a?a
We can once again use the increment flag to start at 1 character and move up to 4 characters, which we define with ?a?a?a?a. You can see in the output below that we have a wordlist as the left input and a mask as the right input.
Session.Name...: cudaHashcat
Status.........: Running
Input.Left.....: File (/wordlists/rockyou.txt)
Input.Right....: Mask (?a) [1]
Hash.Target....: File (HASHFILES/link10.txt)
Hash.Type......: SHA1
Time.Started...: Wed Jun 15 08:08:37 2016 (8 secs)
Time.Estimated.: 0 secs
Speed.GPU.#1...: 36307.1 kH/s
Speed.GPU.#2...: 31949.1 kH/s
Speed.GPU.#3...: 20723.3 kH/s
Speed.GPU.#4...: 22585.2 kH/s
Speed.GPU.#5...: 24069.9 kH/s
Speed.GPU.#6...: 23748.2 kH/s
Speed.GPU.#7...: 27342.7 kH/s
Speed.GPU.#8...: 22934.5 kH/s
Speed.GPU.#*...: 209.7 MH/s
Recovered......: 82753/2924027 (2.83%) Digests, 0/1 (0.00%) Salts
Recovered/Time.: CUR:N/A,N/A,N/A AVG:588157.69,35289464.00,846947072.00 (Min,Hour,Day)
Progress.......: 1362613120/1362613120 (100.00%)
Rejected.......: 151905/1362613120 (0.01%)
Likewise, we can run it the other direction as well and see what shakes out.
root@kracker:~/LINKEDIN_WORKING/cudaHashcat# ./cudaHashcat64.bin -m 100 --remove HASHFILES/link01.txt -o linkedin.cracked -i -a 7 ?a?a?a?a /wordlists/rockyou.txt
You can see in the output below that this type of attack is very effective for catching passwords with random special characters and other oddities in the middle of the words.
f7d5b2c833ef067bf3d5764e3dd28c1b97c94385:style&zo
cab5a9547e82cb7bf7c86f32df74f7da69d527c0:16633p/s
2d56ee1f63a4bb297bc79f76367c10acef0b1155:ling71.t
da6e3d2b461a9710f7b7d505e0346438629ad286:mc333x7s
f76d5db19a723711295b7ceb1942fcc852a0d3eb:ayudame*+8
At this point, we have pretty much exhausted all of the easy stuff. Let’s check and see how many we have cracked.
root@kracker:~/LINKEDIN_WORKING/cudaHashcat# wc -l linkedin.cracked
61033579 linkedin.cracked
Looks like we are a little over 50% cracked (61,033,579 of 117,205,871 unique hashes). This is pretty normal for this stage of the attack. At this point, it’s usually prudent to do a little analysis of the list and see what the most common patterns are. We can recycle these patterns back into Cudahashcat and hopefully crack some more passwords. There are several tools out there to analyze wordlists, but the one I like the best is called PACK and is available at The Sprawl. The first thing we need to do is extract just the passwords from our Cudahashcat output file.
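cut -d : -f 2- linkedin.cracked > linkedin.analyze
Using -f 2- rather than -f 2 keeps everything after the first colon, so passwords that themselves contain a colon are not truncated. The next thing I like to do is use the statsgen.py tool to get the top masks.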
[*] Advanced Masks:
[+] ?l?l?l?l?l?l?l?l: 05% (1859090)
[+] ?l?l?l?l?l?l: 04% (1411462)
[+] ?l?l?l?l?l?l?d?d: 04% (1365203)
[+] ?l?l?l?l?l?l?l: 04% (1344455)
[+] ?d?d?d?d?d?d?d?d: 04% (1340848)
[+] ?l?l?l?l?l?l?l?l?l: 03% (1087191)
[+] ?d?d?d?d?d?d?d?d?d?d: 03% (1036375)
[+] ?l?l?l?l?d?d?d?d: 03% (987014)
[+] ?d?d?d?d?d?d?d: 02% (776811)
[+] ?l?l?l?l?l?l?l?l?l?l: 02% (769990)
[+] ?l?l?l?l?l?l?l?d?d: 02% (730153)
[+] ?d?d?d?d?d?d: 02% (686769)
[+] ?l?l?l?l?l?d?d?d?d: 02% (671888)
[+] ?l?l?l?l?l?d?d: 01% (636132)
[+] ?l?l?l?d?d?d?d: 01% (572769)
[+] ?l?l?l?l?l?d?d?d: 01% (546086)
[+] ?l?l?l?l?l?l?l?d: 01% (542754)
[+] ?l?l?l?l?l?l?d?d?d?d: 01% (535142)
[+] ?l?l?l?l?l?l?l?l?d?d: 01% (505372)
[+] ?l?l?d?d?d?d: 01% (464088)
[+] ?l?l?l?l?d?d: 01% (452142)
[+] ?l?l?l?l?l?l?l?l?l?l?l: 01% (431507)
[+] ?l?l?l?l?l?l?d?d?d: 01% (418555)
[+] ?l?l?l?l?l?l?d: 01% (349865)
[+] ?l?l?l?l?l?l?l?l?d: 01% (345546)
We add the list of masks to a file and give it the .hcmask extension. We can now use the mask file with Cudahashcat, and the tool will iterate through each of the masks one by one.
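For readers wondering where such masks come from, the idea is simply to replace every character of a cracked password with its character class and then count how often each resulting pattern occurs. The sketch below is a rough stand-in for that idea and not how statsgen.py is implemented. The to_masks helper and the sample passwords are made up for the illustration, and only the ?l, ?u, ?d and ?s classes are handled.

```shell
#!/bin/bash
# Filter: reads passwords on stdin, writes one mask per line on stdout.
to_masks() {
  LC_ALL=C tr 'a-z' 'l' \
    | LC_ALL=C tr 'A-Z' 'u' \
    | LC_ALL=C tr '0-9' 'd' \
    | LC_ALL=C tr -c 'lud\n' 's' \
    | sed 's/./?&/g'          # prefix every class letter with "?"
}

# Hypothetical sample of cracked passwords.
sample=$(printf '%s\n' monkey dragon abc123 'style&zo')

# Count identical masks and list the most frequent first.
printf '%s\n' "$sample" | to_masks | sort | uniq -c | sort -rn

top=$(printf '%s\n' "$sample" | to_masks | sort | uniq -c | sort -rn | head -n 1)
```

The order of the tr calls matters: lowercase letters are collapsed to l first, so by the time the last step runs, anything that is not l, u, d or a newline must have been a special character.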
root@kracker:~/LINKEDIN_WORKING/cudaHashcat# ./cudaHashcat64.bin -m 100 --remove HASHFILES/link01.txt -o linkedin.cracked -a 3 linkedin.hcmask
Another useful attack is to take the most common masks from other password breaches or wordlists that we have already cracked. In this example, we will generate a list of masks from the rockyou.txt wordlist.
root@kracker:~/LINKEDIN_WORKING/cudaHashcat/tools/PACK# python statsgen.py /wordlists/rockyou.txt -o rockyou.masks
Using the mask file output by the statsgen tool, we can tailor our attack and use the maskgen tool to optimize our mask file based on occurrence, complexity, or optindex, together with a target cracking time. I will make one mask file for each mode. The target time is defined in seconds, so we decide how long we want to run our attack for and the tool selects as many masks as will fit into that time frame.
python maskgen.py rockyou.masks --targettime 3600 --optindex -q -o rockyou-optindex.hcmask
python maskgen.py rockyou.masks --targettime 3600 --complexity -q -o rockyou-complexity.hcmask
python maskgen.py rockyou.masks --targettime 3600 --occurrence -q -o rockyou-occurrence.hcmask
Now that I have three mask files, I can combine them into one file and remove any duplicates. You also have to be sure to remove the occurrence number at the end of each mask line. Let’s also remove any masks that are 6 characters or shorter, since we already did a brute force for anything 6 characters or smaller.
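root@kracker:~/LINKEDIN_WORKING/cudaHashcat/tools/PACK# cat rockyou-optindex.hcmask rockyou-complexity.hcmask rockyou-occurrence.hcmask | cut -d , -f 1 | sed -r '/^.{,12}$/d' | sort -u > rockyou.hcmask
The sed expression deletes lines of 12 characters or fewer because each mask position takes two characters (for example ?l), so 12 characters corresponds to a 6-position mask. Then I use the newly created mask file to attack the LinkedIn list again.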
root@kracker:~/LINKEDIN_WORKING/cudaHashcat# ./cudaHashcat64.bin -m 100 --remove HASHFILES/link01.txt -o linkedin.cracked -a 3 rockyou.hcmask
At this point, we have probably cracked enough hashes that we can combine the remaining hashes from the split lists into one.
root@kracker:~/LINKEDIN_WORKING/cudaHashcat/HASHFILES# cat *.txt | sort -u > linkedin.remaining
Now we can run our newly created rockyou.hcmask file against the remaining hashes in the list.
root@kracker:~/LINKEDIN_WORKING/cudaHashcat# ./cudaHashcat64.bin -m 100 --remove HASHFILES/linkedin.remaining -o linkedin.cracked -a 3 rockyou.hcmask
At this point, I have one more type of attack I would like to show. This is called the combinator attack. Just as the hybrid attack uses a dictionary on one side and a mask on the other, the combinator attack uses a dictionary on both sides. This is a very effective attack for recovering long passwords that may have otherwise been missed. Here is an example taken from the hashcat wiki. If our dictionary contains the words:
pass
12345
omg
Test
Hashcat creates the following password candidates:
passpass
pass12345
passomg
passTest
12345pass
1234512345
12345omg
12345Test
omgpass
omg12345
omgomg
omgTest
Testpass
Test12345
Testomg
TestTest
Additionally, we can add a single rule to either side of the dictionary.
-j = Single rule applied to each word on the left dictionary
-k = Single rule applied to each word on the right dictionary
If we wanted to add a hyphen between the words and a ! at the end, we would use the following two rules.
-j '$-' -k '$!'
Which would give us this:
pass-pass!
pass-12345!
pass-omg!
pass-Test!
12345-pass!
12345-12345!
12345-omg!
12345-Test!
omg-pass!
omg-12345!
omg-omg!
omg-Test!
Test-pass!
Test-12345!
Test-omg!
Test-Test!
This attack is most effective with smaller wordlists, as it can otherwise take an extremely long time. For this example, I am just going to use a simple English wordlist with 394748 words in it. You can feel free to mix it up and use different dictionaries on each side.
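The candidate generation described above can be mimicked in a few lines of bash, which also makes it easy to see why the wordlist size matters so much. This is only a sketch of the idea: the variable names are made up, the separator and suffix stand in for the -j '$-' and -k '$!' rules, and the real attack runs inside Cudahashcat.

```shell
#!/bin/bash
left=(pass 12345 omg Test)     # dictionary from the example above
right=("${left[@]}")           # same dictionary on the right-hand side
sep='-'                        # stands in for -j '$-' (append "-" to the left word)
suffix='!'                     # stands in for -k '$!' (append "!" to the right word)

candidates=()
for l in "${left[@]}"; do
  for r in "${right[@]}"; do
    candidates+=("${l}${sep}${r}${suffix}")
  done
done
printf '%s\n' "${candidates[@]}"

# The candidate count is the product of the two dictionary sizes.
english_words=394748
pairs=$((english_words * english_words))
echo "english.txt x english.txt: $pairs candidates"
```

Four words produce 16 candidates, while the 394748-word list combined with itself produces roughly 156 billion, which is why large dictionaries on both sides become impractical.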
root@kracker:~/LINKEDIN_WORKING/cudaHashcat# ./cudaHashcat64.bin -m 100 --remove HASHFILES/linkedin.remaining -o linkedin.cracked -a 1 /wordlists/english.txt /wordlists/english.txt
After I let that run, I will do it a few more times with various rules. Here is an example with the rules we mentioned above.
root@kracker:~/LINKEDIN_WORKING/cudaHashcat# ./cudaHashcat64.bin -m 100 --remove HASHFILES/linkedin.remaining -o linkedin.cracked -j '$-' -k '$!' -a 1 /wordlists/english.txt /wordlists/english.txt
The last part of this attack is using the expander to create a bigger dictionary out of our already cracked list. The expander tool can be found inside the hashcat-utils download. Here is an example from the Hashcat wiki on how the expander works.
$ echo pass1 | ./expander.bin | sort -u
1
1p
1pas
1pass
a
as
ass
ass1
ass1p
p
pa
pas
pass
pass1
s
s1
s1p
s1pa
s1pas
ss
ss1
ss1p
ss1pa
Let’s try expanding the passwords we have already cracked. Using the words you have already cracked from the list can improve your chances of cracking more of it, because many times passwords will follow patterns and themes. Normally we would be using much smaller dictionaries for this, because it’s not often we have a hash list with 117 million hashes in it. This attack may take too long and be unrealistic here, but I just wanted to show an example. It works exceptionally well with smaller wordlists.
root@kracker:~/LINKEDIN_WORKING/cudaHashcat# cat linkedin.cracked | cut -d : -f 2- > linkedin.dic
root@kracker:~/LINKEDIN_WORKING/cudaHashcat/tools/hashcat-utils# ./expander.bin < ../../linkedin.dic | sort -u > ../../linkedin.exp
Then we use the expanded dictionary on both sides of the combinator attack.
root@kracker:~/LINKEDIN_WORKING/cudaHashcat# ./cudaHashcat64.bin -m 100 --remove HASHFILES/linkedin.remaining -o linkedin.cracked -a 1 linkedin.exp linkedin.exp
Another effective attack method I use is to combine all the rule files into one big file and then use that with a dictionary file made from my already cracked hashes. This is very effective because it layers more patterns on top of passwords that are already known to be in use. For this attack, I will use a set of rules created by the folks over at Korelogic for the Crack Me if You Can Contest. They are available here.
root@kracker:~/LINKEDIN_WORKING/cudaHashcat# cat linkedin.cracked | cut -d : -f 2- | sort -u > linkedin.dic
root@kracker:~/LINKEDIN_WORKING/cudaHashcat# cat rules/KoreLogic/*.rules | sort -u > rules/KoreLogicBigRule.rule
root@kracker:~/LINKEDIN_WORKING/cudaHashcat# ./cudaHashcat64.bin -m 100 --remove HASHFILES/linkedin.remaining -o linkedin.cracked -r rules/KoreLogicBigRule.rule linkedin.dic
This, of course, can take a very long time, so it should be one of your last-ditch efforts to recover a password. At this point, I have cracked about 85% of the LinkedIn list and I am pretty happy with the results. I will probably continue to modify the attacks we talked about in this article with different masks and wordlists and try to get more passwords. A nice collection of wordlists to get you started is available here: https://github.com/danielmiessler/SecLists/tree/master/Passwords