Real or Fake? How to Spoof Email
I briefly mentioned how easy it is to forge email sender addresses in a previous blog post that described the steps I took to determine whether a suspicious email was legitimate or a phishing attempt. In this post, we will take a deeper dive into why email sender addresses are so easy to forge and show how it's done.
Spoofed emails with forged sender information are a major problem as they can help phishing emails appear more realistic and bypass email filters. Spoofed emails include messages that appear to be from another organization or external emails that appear as if they came from inside an organization.
Take a Trip Through Time
Email moving between organizations on the Internet today uses the Extended Simple Mail Transfer Protocol (ESMTP) and various extensions that add features to the protocol. ESMTP is descended from the Simple Mail Transfer Protocol (SMTP), defined by RFC 788 in November 1981. ESMTP, introduced in RFC 1869 and currently specified in RFC 5321, has a variety of features and extensions that the original 1980s version of SMTP didn't have but fundamentally operates the same way. It is still possible to send an email to a modern mail server using the original 1981 version of the protocol.
Nearly universal reliance on a protocol from a very different era is the root of today's email spoofing problem. The SMTP protocol was designed with the assumption that every computer sending an email across a network could be trusted to provide accurate sender information.
This assumption made sense in 1981: The Internet as we know it didn't exist yet. Instead, there were loosely connected networks of mainframes and minicomputers managed by professional IT personnel running a variety of network protocols (RFC 791, defining IPv4, was only released two (2) months before SMTP and TCP/IP had just begun its global rollout). Users were typically only able to interact with email via the user accounts, email boxes, and mail transfer agents (the software that sends email from one computer to another) that were provided for them by the institution they worked for (likely an academic institution, government agency, or research facility as commercial use of the nascent Internet was generally prohibited). A user with malicious intent usually didn't have the opportunity to mess around with the email system, or if they found a way, they could be swiftly identified and dealt with.
Network access for the average person would have been unheard of in 1981. Commercial ISPs and hosting providers didn't exist yet. Even connecting a personal computer (likely to be an Apple II, TRS-80, or Commodore PET with less than 100 kB of RAM) to any sort of remote network had only become realistic earlier in the year with the release of the first RS-232 serial modem (the Hayes Smartmodem, running at 300bps). Even if someone could send spoofed phishing emails when SMTP was created, why would they? A victim couldn't be tricked into entering their password, payment card details, or other sensitive information into a malicious website in 1981: it would be nine (9) more years before the first prototype website went live.
Today, billions of people connect to the Internet directly with personal computers vastly more powerful than the mainframes of the 1980s, virtual servers can be rented for a few dollars a month, open source mail server software is available for free on the Internet, and vast sums of money and sensitive information are transferred across the Internet every day. This means anyone with a few bucks and the technical know-how can easily take advantage of the trusting nature of a protocol that dates to 1981 to send spoofed emails and try to scam millions of people.
Nuts and Bolts
Understanding how email is transmitted via SMTP (and, by extension, ESMTP) is useful for understanding how spoofing happens. There are a variety of ESMTP extensions in common use today that allow things like encrypted sessions, but we will be omitting those from this discussion for simplicity as they don't directly impact the email spoofing problem.
Most email will originate either in an email client program (e.g., Outlook Desktop, Apple Mail, etc.) or in a webmail interface (Gmail, Hotmail, Outlook web, etc.). A new email will be transmitted to a mail server to be routed to its recipient(s). Emails to recipients at the same organization as the sender will usually be put directly into the recipient's mailbox to await retrieval or routed to another internal mail server for delivery if necessary. The process becomes more complex and SMTP connections become necessary when an email is destined for one (1) or more recipients at a different organization across the Internet.
Once the sending mail server determines that an email is addressed to someone outside the organization, it must determine an appropriate mail server to forward the email to so it can be delivered to the recipient's mailbox. A special 'mail exchanger' (MX) DNS record is used to identify the mail server(s) that receive email destined for a given domain. For example, if an email is addressed to [email protected], the sending mail server would look up the MX record for example.com and, assuming the DNS is properly configured, receive a prioritized list of valid mail servers that can receive inbound mail for any email address at example.com.
Now that the sending mail server has a list of valid mail servers to send an email to, it will establish a connection to one (1) of the highest priority servers using the SMTP protocol and fall back to lower-priority servers if the higher-priority servers don't respond. The SMTP protocol will be used to transfer the email across the Internet and the receiving mail server will either accept the message or provide an error message that will be returned to the sender (e.g., if the recipient does not exist on the system or their mailbox is full). The receiving mail server will then route the message to the user's mailbox, possibly passing through anti-spam and anti-virus systems or other internal mail servers along the way.
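The lookup-and-fallback behavior described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the MX records are passed in as hypothetical (preference, hostname) pairs rather than fetched from DNS, the helper names order_mx and pick_server are our own, and can_connect stands in for whatever connection attempt a real mail server would make. Lower preference numbers indicate higher priority.

```python
from typing import Callable, Iterable, List, Optional, Tuple


def order_mx(records: Iterable[Tuple[int, str]]) -> List[str]:
    """Return hostnames sorted by MX preference (lowest number first)."""
    return [host for _preference, host in sorted(records)]


def pick_server(
    records: Iterable[Tuple[int, str]],
    can_connect: Callable[[str], bool],
) -> Optional[str]:
    """Try each server in priority order and return the first that responds."""
    for host in order_mx(records):
        if can_connect(host):
            return host
    return None  # nobody answered; a real server would queue and retry later


# Hypothetical MX records for example.com.
records = [(20, "mx2.example.com"), (10, "mx1.example.com")]
print(order_mx(records))
```

Note that this sketch breaks ties between equal-preference servers alphabetically, whereas real mail servers typically spread the load across them.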
The SMTP protocol, used once the connection is established, is very simple. The sending mail server:
- Identifies itself to the receiving mail server
- Transmits the sender's email address
- Transmits one (1) or more recipient addresses
- Transmits the contents of the message itself
- Repeats the process to send another message over the same connection or informs the receiving mail server that it is done sending messages
The receiving mail server will provide responses throughout this process to indicate whether each command was successful or failed.
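To make the order of those steps concrete, here is a minimal Python sketch that builds the list of lines a sending mail server would transmit for a single message. The function name smtp_commands and the addresses used are hypothetical, and the sketch ignores server replies entirely; a real sender would wait for a response after each command before continuing.

```python
from typing import List


def smtp_commands(
    helo_host: str,
    mail_from: str,
    rcpt_to: List[str],
    body_lines: List[str],
) -> List[str]:
    """Return the client-side lines of a basic single-message SMTP session."""
    commands = [f"HELO {helo_host}", f"MAIL FROM:<{mail_from}>"]
    commands += [f"RCPT TO:<{rcpt}>" for rcpt in rcpt_to]
    commands.append("DATA")
    # A body line that starts with "." gets an extra "." so that it cannot
    # be mistaken for the lone "." that ends the message.
    commands += ["." + line if line.startswith(".") else line for line in body_lines]
    commands.append(".")
    commands.append("QUIT")
    return commands


# Hypothetical placeholder values.
for line in smtp_commands(
    "mail.sender.example",
    "someone@sender.example",
    ["first@recipient.example", "second@recipient.example"],
    ["Hello,", "This is a test."],
):
    print(line)
```

Every value in this list, including the identifying host name and the sender address, is supplied by the sending side.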
A basic SMTP session to transmit an email message from Martin at Bishop Security to Alice and Bob at SETEC Astronomy could look like this (commands from the sending mail server in these examples are denoted with a >, replies from the receiving mail servers are denoted with a <):
< 220 SETEC-Mail Simple Mail Transfer Service Ready
> HELO mail.bishopsecurity.example
< 250 SETEC-Mail
> MAIL FROM:<[email protected]>
< 250 OK
> RCPT TO:<[email protected]>
< 250 OK
> RCPT TO:<[email protected]>
< 250 OK
> DATA
< 354 Start mail input; end with <CRLF>.<CRLF>
> Hi Alice and Bob, I heard your company has been hacked.
> Let me know if there's anything I can do to help.
> -Martin
> .
< 250 OK
> QUIT
< 221 SETEC-Mail Service closing transmission channel
This exchange would deliver a simple three (3) line plaintext email message (the three lines sent between the DATA command and the lone period) to Alice and Bob.
Sender Spoofing
Unfortunately, with a few exceptions, SMTP does not have a mechanism to prevent senders from setting the information provided to identify the mail server and sender however they choose. If a mail server is configured to accept messages destined for a particular domain, it will usually accept the message, no questions asked. From there, it's up to the receiving organization's anti-spam filter and the human recipients of the message to determine the nature of the message. If the rise of phishing has taught us anything, it's that we can't rely on anti-spam filters or human recipients to make the correct determination with acceptable reliability.
This blind trust creates our first opportunity for spoofing. Let's consider an example where there is an actual well-known company named 'Real Benefits Company,' and it's plausible that the company that Alice and Bob work for might contract with them. This example shows a hypothetical message with a forged sender, i.e., this message is not actually being sent by a person at 'Real Benefits Company':
< 220 SETEC-Mail Simple Mail Transfer Service Ready
> HELO mail.realbenefitscompany.example
< 250 SETEC-Mail
> MAIL FROM:<[email protected]>
< 250 OK
> RCPT TO:<[email protected]>
< 250 OK
> RCPT TO:<[email protected]>
< 250 OK
> DATA
< 354 Start mail input; end with <CRLF>.<CRLF>
> Good morning,
> Setec Astronomy has partnered with us for a new benefit program.
> We need you to confirm your bank details to receive your benefits.
> Please enter your bank account details at the following link:
> http://www.fakebenefits.example/verify
> -Carol, Director, Partner Enrollment, Real Benefits Company
> .
< 250 OK
> QUIT
< 221 SETEC-Mail Service closing transmission channel
Here we see that the sender information transmitted over SMTP (the HELO and MAIL FROM: commands above) has been forged and that the message body is trying to convince our employees to follow a link to a website and enter sensitive personal information, a classic phishing ruse. The use of a real company's name and what appears to be a legitimate email address at that company's actual domain makes this email look more legitimate. Only the link to a site at a different domain name would give this away as a phishing email.
The sender and recipient information transmitted via SMTP can be thought of as the information written on the outside of an envelope that is to be sent via the postal service. Anyone can write whatever they choose as a return address on an envelope and the postal service does not verify that the return address is accurate before delivering the letter, nor does SMTP.
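The blind trust described in this section can be illustrated with a toy receiving server written in Python. This is a deliberately simplified sketch, not how any particular mail server is implemented: the class name ToyReceiver, the addresses, and the exact reply texts are our own assumptions. The point is that the only thing this server checks is whether the recipient belongs to its own domain; the claimed host name and sender are accepted as given.

```python
from typing import Optional


class ToyReceiver:
    """A toy receiving server that checks recipients but never the sender."""

    def __init__(self, local_domain: str) -> None:
        self.local_domain = local_domain.lower()
        self.in_data = False

    def handle(self, line: str) -> Optional[str]:
        """Return the reply to one client line (None while receiving the body)."""
        if self.in_data:
            if line == ".":
                self.in_data = False
                return "250 OK"
            return None  # body lines are collected silently
        upper = line.upper()
        if upper.startswith("HELO "):
            return "250 " + self.local_domain  # claimed host name is not verified
        if upper.startswith("MAIL FROM:"):
            return "250 OK"  # claimed sender address is not verified
        if upper.startswith("RCPT TO:"):
            address = line[len("RCPT TO:"):].strip().strip("<>").lower()
            if address.endswith("@" + self.local_domain):
                return "250 OK"
            return "550 Recipient not handled here"
        if upper == "DATA":
            self.in_data = True
            return "354 Start mail input; end with <CRLF>.<CRLF>"
        if upper == "QUIT":
            return "221 Service closing transmission channel"
        return "500 Command unrecognized"


server = ToyReceiver("setecastronomy.example")
# Hypothetical forged sender: accepted without question.
print(server.handle("MAIL FROM:<anyone@realbenefitscompany.example>"))
```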
Adding Headers
The three-line plaintext email shown above is far simpler than typical email in 2021, which often includes subject lines, HTML formatting, in-line images, and file attachments. These extra features are not part of SMTP; they are defined in other standards and are part of the message contents from the perspective of the SMTP connection.
Message headers are a part of the message contents that provide additional information for the recipient's mail program to display including a subject, the recipient(s) of the message on the To: line, the recipients of the message on the CC: line, the sender, and where replies should be sent. There are many more possible header fields as defined in RFC 5322, but for our discussion on spoofing, we can focus on this basic set.
Here is the same email exchange from our original example with some headers in the message to provide information that will be displayed by the recipient's email program. We will leave the HTML, images, and attachments out to keep this example brief and readable:
< 220 SETEC-Mail Simple Mail Transfer Service Ready
> HELO mail.bishopsecurity.example
< 250 SETEC-Mail
> MAIL FROM:<[email protected]>
< 250 OK
> RCPT TO:<[email protected]>
< 250 OK
> RCPT TO:<[email protected]>
< 250 OK
> DATA
< 354 Start mail input; end with <CRLF>.<CRLF>
> Date: Mon, 13 Dec 2021 03:13:37 -0500
> Subject: Hacking Help
> From: Martin <[email protected]>
> To: Alice <[email protected]>
> CC: Bob <[email protected]>
>
> Hi Alice and Bob, I heard your company has been hacked.
> Let me know if there's anything I can do to help.
> -Martin
> .
< 250 OK
> QUIT
< 221 SETEC-Mail Service closing transmission channel
Note that the sender and recipient information in the header duplicates information that was already provided in the MAIL FROM: and RCPT TO: SMTP commands. If the information transmitted via SMTP can be thought of as the information written on the outside of an envelope, the header can be thought of as the introductory information written on top of a letter that is inside the envelope. An envelope, whether physical or electronic, just needs the information necessary to get the message to its destination or return it if it can't be delivered as addressed. The header on the message inside the envelope should contain all relevant information, even if it duplicates information on the envelope, so that the envelope can be discarded once the letter is received.
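The envelope-versus-letter split can also be seen in code. The following Python sketch uses the standard library's email package to build the letter, while the envelope is kept in ordinary variables. The addresses are hypothetical placeholders and the variable names are our own; nothing is actually sent.

```python
from email.message import EmailMessage

# Hypothetical placeholder addresses used only for this sketch.
MARTIN = "martin@bishopsecurity.example"
ALICE = "alice@setecastronomy.example"
BOB = "bob@setecastronomy.example"

# The envelope: what would be sent with MAIL FROM: and RCPT TO:.
envelope_sender = MARTIN
envelope_recipients = [ALICE, BOB]

# The letter: headers plus body, all of which travels after the DATA command.
letter = EmailMessage()
letter["Subject"] = "Hacking Help"
letter["From"] = f"Martin <{MARTIN}>"
letter["To"] = f"Alice <{ALICE}>"
letter["Cc"] = f"Bob <{BOB}>"
letter.set_content(
    "Hi Alice and Bob, I heard your company has been hacked.\n"
    "Let me know if there's anything I can do to help.\n"
    "-Martin\n"
)

# From SMTP's point of view, the headers are just the first lines of the payload.
data_payload = letter.as_string()
header_sender = letter["From"].addresses[0].addr_spec

print(data_payload)
```

Python's smtplib reflects the same separation: SMTP.sendmail(from_addr, to_addrs, msg) takes the envelope sender and recipients as arguments separate from the message text.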
Header Spoofing
Generating messages with sender and recipient information in the headers that doesn't match the information provided via the SMTP envelope is a common and usually benign practice, used by email distribution lists and the BCC function. This example shows a basic BCC email:
< 220 SETEC-Mail Simple Mail Transfer Service Ready
> HELO mail.bishopsecurity.example
< 250 SETEC-Mail
> MAIL FROM:<[email protected]>
< 250 OK
> RCPT TO:<[email protected]>
< 250 OK
> RCPT TO:<[email protected]>
< 250 OK
> DATA
< 354 Start mail input; end with <CRLF>.<CRLF>
> Date: Mon, 13 Dec 2021 03:13:37 -0500
> Subject: Hacking Help
> From: Martin <[email protected]>
> To: Alice <[email protected]>
>
> Hi Alice, I heard your company has been hacked.
> Let me know if there's anything I can do to help.
> -Martin
> .
< 250 OK
> QUIT
< 221 SETEC-Mail Service closing transmission channel
In this example, both Alice's and Bob's email addresses are included in the SMTP recipients (the two RCPT TO: commands), but Bob's email address is not included in the message headers (only Alice appears on the To: line). In this case, the message will be sent to both Alice and Bob as per the SMTP envelope information. Mail programs do not typically have access to SMTP envelope information; it is used and discarded by mail servers before the message is delivered to a mailbox. This means that Alice's email program, and Alice by extension, is unaware that Bob also received the email because Bob's information is not included in the message header. Alice has no way to know that Bob was BCC'd on this email unless Martin or Bob informs her (or Bob doesn't notice he was BCC'd and succumbs to the temptation of the reply-all button).
To use our real-world example: the postal service has no way of preventing Martin from writing a letter to Alice, printing two (2) copies of the letter, mailing one (1) copy to Alice, and mailing the other copy to Bob. Alice has no way of knowing that Bob also received a copy of the letter unless Martin or Bob informs her. Fortunately, physical letters don't have a reply-all button, so it's harder for Bob to accidentally let Alice know that he received a copy.
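Seen from a mail server that still has the envelope, a BCC recipient is simply an envelope recipient who is not named in the To: or CC: headers. The Python sketch below computes that difference. The function name hidden_recipients and the addresses are hypothetical, chosen only for illustration.

```python
from email.message import EmailMessage
from email.utils import getaddresses
from typing import List


def hidden_recipients(envelope_recipients: List[str], letter: EmailMessage) -> List[str]:
    """Return envelope recipients that appear in neither To: nor Cc:."""
    header_values = [str(v) for v in letter.get_all("To", []) + letter.get_all("Cc", [])]
    visible = {addr.lower() for _name, addr in getaddresses(header_values)}
    return [rcpt for rcpt in envelope_recipients if rcpt.lower() not in visible]


# Hypothetical placeholder addresses.
letter = EmailMessage()
letter["From"] = "Martin <martin@bishopsecurity.example>"
letter["To"] = "Alice <alice@setecastronomy.example>"
letter.set_content("Hi Alice, I heard your company has been hacked.\n")

bcc = hidden_recipients(
    ["alice@setecastronomy.example", "bob@setecastronomy.example"],
    letter,
)
print(bcc)
```

Alice's mail program cannot run this comparison because it only ever sees the letter, not the envelope.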
Manipulating header information can also be used for nefarious purposes, e.g., making an email look like it's coming from a legitimate organization when it is really from an attacker by changing the From: header. Consider the following example:
< 220 SETEC-Mail Simple Mail Transfer Service Ready
> HELO mail.realbenefitscompany.example
< 250 SETEC-Mail
> MAIL FROM:<[email protected]>
< 250 OK
> RCPT TO:<[email protected]>
< 250 OK
> DATA
< 354 Start mail input; end with <CRLF>.<CRLF>
> Date: Mon, 13 Dec 2021 03:13:37 -0500
> Subject: Hacking Help
> From: Carol <[email protected]>
> To: Alice <[email protected]>
>
> Good morning,
> Setec Astronomy has partnered with us for a new benefit program.
> We need you to confirm your bank details to receive your benefits.
> Please enter your bank account details at the following link:
> http://www.fakebenefits.example/verify
> -Carol, Director, Partner Enrollment, Real Benefits Company
> .
< 250 OK
> QUIT
< 221 SETEC-Mail Service closing transmission channel
Here we can see that the SMTP sender information (the HELO and MAIL FROM: commands) and the message's From: header have both been forged to show that this message is from Carol at the well-known benefits company. Alice's email program will display this message as if it had come from Carol, a potentially legitimate employee of a real company.
To carry our post office analogy further: Anyone can write a letter that purports to be from another person and/or organization, stuff it in an envelope, and mail it (with or without a fake return address on the envelope). The postal service does not have a way of viewing the contents of a letter and verifying that the purported sender is who they claim to be, nor does SMTP have a way of verifying the message headers.
In the previous example, the attacker was impersonating a legitimate employee of a real external company, but this same technique can be used to impersonate internal personnel as well:
< 220 SETEC-Mail Simple Mail Transfer Service Ready
> HELO mail.setecastronomy.example
< 250 SETEC-Mail
> MAIL FROM:<[email protected]>
< 250 OK
> RCPT TO:<[email protected]>
< 250 OK
> DATA
< 354 Start mail input; end with <CRLF>.<CRLF>
> Date: Mon, 13 Dec 2021 03:13:37 -0500
> Subject: Hacking Help
> From: David <[email protected]>
> To: Alice <[email protected]>
>
> Hello Alice,
> We signed an emergency contract with Bishop Security.
> Please immediately transfer $1,000,000 to account CPE-1704-TKS
> -David, Director, Procurement, SETEC Astronomy
> .
< 250 OK
> QUIT
< 221 SETEC-Mail Service closing transmission channel
Here we can see that the attacker has forged the SMTP sender information (the HELO and MAIL FROM: commands) and the From: header to make this message appear as though it is coming from someone inside the target organization. An attacker who understands the organization's structure (perhaps by performing research on social media) can use real names and email addresses in this manner to attempt to trick users.
Forging message headers for nefarious purposes is not limited to altering the sender: As with our BCC example, the recipient information can be altered as well. To: and CC: headers can make it appear as if other people received a copy of the email to lend additional legitimacy to the message. This example shows how this can be leveraged in a spear phishing email:
< 220 SETEC-Mail Simple Mail Transfer Service Ready
> HELO mail.setecastronomy.example
< 250 SETEC-Mail
> MAIL FROM:<[email protected]>
< 250 OK
> RCPT TO:<[email protected]>
< 250 OK
> DATA
< 354 Start mail input; end with <CRLF>.<CRLF>
> Date: Mon, 13 Dec 2021 03:13:37 -0500
> Subject: Hacking Help
> From: David <[email protected]>
> To: Alice <[email protected]>
> CC: Bob <[email protected]>
>
> Hello Alice,
> We signed an emergency contract with Bishop Security.
> Please immediately transfer $1,000,000 to account CPE-1704-TKS
> Bob the CFO has authorized this transfer.
> -David, Director, Procurement, SETEC Astronomy
> .
< 250 OK
> QUIT
< 221 SETEC-Mail Service closing transmission channel
Here we can see that Alice is the only recipient listed on the SMTP envelope (the single RCPT TO: command), but the attacker has forged the CC: header to make it appear to Alice that Bob also received a copy of this message even though he did not. Part of the message (the line stating that Bob the CFO has authorized the transfer) attempts to convince Alice that the transfer of funds has been approved by Bob who, from Alice's perspective, would surely protest if the transfer were not approved, seeing as Bob also apparently received this email.
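A mail server that still holds the envelope could, in principle, notice this kind of forged CC: line by looking for header recipients in its own domain that the envelope never addressed. The Python sketch below shows the idea. The function name phantom_local_recipients and the addresses are hypothetical, and the result should be treated as a hint rather than proof, since legitimate mail may be delivered to co-workers in separate SMTP transactions.

```python
from email.message import EmailMessage
from email.utils import getaddresses
from typing import List


def phantom_local_recipients(
    envelope_recipients: List[str],
    letter: EmailMessage,
    local_domain: str,
) -> List[str]:
    """Return To:/Cc: addresses at local_domain that are absent from the envelope."""
    delivered = {rcpt.lower() for rcpt in envelope_recipients}
    header_values = [str(v) for v in letter.get_all("To", []) + letter.get_all("Cc", [])]
    phantoms = []
    for _name, addr in getaddresses(header_values):
        addr = addr.lower()
        if addr.endswith("@" + local_domain.lower()) and addr not in delivered:
            phantoms.append(addr)
    return phantoms


# Hypothetical placeholder addresses mirroring the example above.
letter = EmailMessage()
letter["From"] = "David <david@setecastronomy.example>"
letter["To"] = "Alice <alice@setecastronomy.example>"
letter["Cc"] = "Bob <bob@setecastronomy.example>"
letter.set_content("Bob the CFO has authorized this transfer.\n")

phantoms = phantom_local_recipients(
    ["alice@setecastronomy.example"], letter, "setecastronomy.example"
)
print(phantoms)
```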
Mismatched Spoofing
Our previous examples had forged SMTP sender information and message From: headers that matched. Keep in mind that forged SMTP information is usually invisible to the recipient and therefore is not necessary to fool them. Instead, a combination of forged but mismatched SMTP and message headers can be used to bypass basic forms of filtering.
Consider an organization that has configured their receiving mail servers to reject emails from an external mail server that claims to be from an email address within the organization. The disconnect between SMTP envelope information and message headers allows this rudimentary check to be bypassed. Consider this example:
< 220 SETEC-Mail Simple Mail Transfer Service Ready
> HELO mail.realbenefitscompany.example
< 250 SETEC-Mail
> MAIL FROM:<[email protected]>
< 250 OK
> RCPT TO:<[email protected]>
< 250 OK
> DATA
< 354 Start mail input; end with <CRLF>.<CRLF>
> Date: Mon, 13 Dec 2021 03:13:37 -0500
> Subject: Hacking Help
> From: David <[email protected]>
> To: Alice <[email protected]>
> CC: Bob <[email protected]>
>
> Hello Alice,
> We signed an emergency contract with Bishop Security.
> Please immediately transfer $1,000,000 to account CPE-1704-TKS
> Bob the CFO has authorized this transfer.
> -David, Director, Procurement, SETEC Astronomy
> .
< 250 OK
> QUIT
< 221 SETEC-Mail Service closing transmission channel
Here the SMTP sender information (the HELO and MAIL FROM: commands) is forged to appear as if the message is coming from Carol at a real benefits company. Meanwhile, the message's From: header is forged to appear as if the message is from an internal employee. This message will be accepted if the receiving mail server is only checking the SMTP sender information to filter out spoofed internal emails, as the SMTP sender is external. Alice's email program will not display the SMTP envelope information and will instead show this message as coming from David, an internal employee, due to the mismatched forged From: header.
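The gap between the two checks can be sketched in Python. Both functions below assume the message arrived from an external mail server. The function names, addresses, and the decision to reject outright are our own assumptions for illustration; a real filter would need exceptions for legitimate third-party services that send on the organization's behalf.

```python
from email import message_from_string
from email.utils import parseaddr


def domain_of(address: str) -> str:
    """Return the lowercased domain part of an email address."""
    return address.rsplit("@", 1)[-1].lower()


def envelope_only_check(mail_from: str, local_domain: str) -> bool:
    """Accept (True) unless the envelope sender claims the local domain."""
    return domain_of(mail_from) != local_domain.lower()


def header_aware_check(mail_from: str, raw_message: str, local_domain: str) -> bool:
    """Also refuse mail whose From: header claims the local domain."""
    if not envelope_only_check(mail_from, local_domain):
        return False
    from_header = message_from_string(raw_message).get("From", "")
    _name, header_from = parseaddr(from_header)
    return domain_of(header_from) != local_domain.lower()


# Hypothetical placeholder addresses mirroring the mismatched example above.
raw = (
    "From: David <david@setecastronomy.example>\n"
    "To: Alice <alice@setecastronomy.example>\n"
    "\n"
    "Hello Alice,\n"
)
envelope_sender = "carol@realbenefitscompany.example"

print(envelope_only_check(envelope_sender, "setecastronomy.example"))
print(header_aware_check(envelope_sender, raw, "setecastronomy.example"))
```

The envelope-only check lets the mismatched message through, while the header-aware check refuses it. Neither check can tell whether the external envelope sender is itself genuine.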
Stopping Spoofing
Email spoofing is a powerful tool for attackers, who use it to send both basic phishing emails and more convincing, highly targeted spear phishing emails. Organizations should use the various settings available in their mail servers to filter out the basic attacks, but this will likely not be sufficient.
More advanced tools are available that are specifically designed to stop spoofing. These include:
- Sender Policy Framework (SPF)
- DomainKeys Identified Mail (DKIM)
- Domain-based Message Authentication, Reporting and Conformance (DMARC)
These tools are much more powerful than the basic controls built into mail servers but require more maintenance and cooperation between organizations to be effective. We will cover how these tools can be used to stop spoofing in an upcoming blog post.