How to tell who Leaks your Email Address

Before I describe a technique you can use to tell where your email addresses are being leaked from, let’s take a small step back and review a little about email spam in general first.

Unsolicited email spam has been a problem since SMTP (Simple Mail Transfer Protocol) was originally defined. Thankfully, software like SpamAssassin [1] and numerous other tools have become fairly good at filtering email, so end users rarely see the full scale of the problem. Unfortunately, while significant improvements have been made, two root issues remain a challenge:

  1. Did the email come from where it said it did?
  2. Is the email content spam, phishing, other or legitimate?

Did the email come from where it said it did?

The first problem is a technical one and related to the SMTP protocol [2],[3]. SMTP defines how a message is transported and not the message content. This meant that for a very long time it was extremely easy to forge the author and origin of an email by simply setting whatever you like as the FROM, REPLY-TO and RETURN-PATH addresses.
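To make the point concrete, here is a minimal sketch using Python's standard email library. It only builds a message in memory and sends nothing. The addresses and the build_message helper are made up for illustration. It shows that the From and Reply-To headers are just text inside the message, entirely separate from the envelope sender that SMTP itself carries.

```python
from email.message import EmailMessage


def build_message(header_from, reply_to, to, subject, body):
    """Build a message whose headers are whatever the author typed in."""
    msg = EmailMessage()
    msg["From"] = header_from      # free text: nothing in SMTP checks it
    msg["Reply-To"] = reply_to     # likewise chosen by the sender
    msg["To"] = to
    msg["Subject"] = subject
    msg.set_content(body)
    return msg


# Hypothetical addresses on reserved example domains.
envelope_sender = "someone@example.net"   # what SMTP's MAIL FROM would carry
msg = build_message(
    header_from="boss@example.com",
    reply_to="boss@example.com",
    to="hr@example.com",
    subject="Test",
    body="Hello",
)

# The header the reader sees and the envelope sender need not agree.
headers_match_envelope = msg["From"] == envelope_sender
```

The receiving server normally records the envelope sender in Return-Path, which is why a mismatch between that and From is one of the signals modern filters look at.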

While I was at university, I would often prank classmates by sending them emails such as job offers from bill@microsoft.com. The emails I sent were never intended to be malicious and were the most basic emails you can create: usually a single sentence, and very easy to identify as not legitimate. However, one time this almost got me into real trouble.

I was working part-time for a local company while I was at university. It was a small organisation that had its headquarters in another town, and HR was run from there. One day we were doing some paperwork and decided it would be funny to send an email to HR from the boss saying that X employees down in Y town deserved a bonus this week. A day later we received a call from the boss which went something like this (paraphrasing): “We received a strange email here at the office, was that you?” “Yes! We were just messing around, and it should be fairly obvious that it was a fake email.” “Well, you’ve caused quite the stir, don’t do it again.” Thankfully my boss was fairly passive about it.

Two or three years later, when I no longer worked for this company, I received a call from my old boss, who told me they were having some problems with the computers in the office and suspected that malware had been installed. The first real shock came when he asked “Does this have anything to do with you?”. Damn! This is when the full impact of my prank those years earlier hit me. What had he been thinking about me all this time?

In hindsight, this was fairly silly and I certainly don’t recommend doing it. People can be suspicious of the unknown and it’s usually not a good idea to destroy any credibility you have with a silly prank. I guess the moral of the story here is that if you’re a bit of a prankster, think a little about the impact before you go ahead.

A Small Demo

Now, a small demo. Let’s see if we can recreate this little bit of nostalgia. Here I’m going to use Papercut “The Simple SMTP Desktop Email Receiver” [4] and Telnet [5] (use the link to enable it if you are on Windows).

First, I’ve installed Papercut on a virtual machine on my LAN with the address 192.168.1.238.

Back in my university days, I would put a random search into the AltaVista or Google search engines. It was not the most efficient or correct way, but there were always one or more hits and I didn’t really have any need for scripts. I would probe any website returned on the front page for a mail server by attempting to connect to port 25 (port 25 is SMTP and is used for sending mail; port 110 is POP3 and is used for retrieving mail from the server). Let’s say www.surmise.it showed up in those results and a DNS lookup for this domain told us that the IP address was 192.168.1.238. I would open telnet and connect to the machine like this (note: I’m using Windows here but it works exactly the same on Linux):

After you connect you should receive some information about the mail server software and its version:

Now we’re ready to enter some commands. Let’s perform the following:

The arrows in red are the lines I’ve typed in manually, and this (at a very basic level) illustrates how SMTP works. After these commands have been processed by the server we can see the results in Papercut:
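For readers without the screenshots, the following is a rough sketch of the kind of command sequence a minimal SMTP session uses. It is a generic illustration based on the protocol described in [2] and [3], not a transcript of the exact lines typed in the demo. The smtp_commands helper and the example.com addresses are hypothetical. The function only builds the list of lines a client would send and never opens a connection.

```python
def smtp_commands(client_name, envelope_from, envelope_to, message_lines):
    """Return the lines a client sends in a minimal SMTP session, in order."""
    commands = [
        f"HELO {client_name}",            # introduce ourselves to the server
        f"MAIL FROM:<{envelope_from}>",   # envelope sender
        f"RCPT TO:<{envelope_to}>",       # envelope recipient
        "DATA",                           # start of headers and body
    ]
    for line in message_lines:
        # A line starting with "." must be doubled ("dot-stuffing") so it
        # is not mistaken for the end-of-message marker.
        commands.append("." + line if line.startswith(".") else line)
    commands.append(".")                  # a lone dot ends the message
    commands.append("QUIT")
    return commands


# Hypothetical example values; nothing is sent anywhere.
session = smtp_commands(
    "client.example",
    "sender@example.com",
    "recipient@example.com",
    ["Subject: Hello", "", "Just a test message."],
)
```

Each entry in session corresponds to one line typed into telnet, with the server answering a numeric status code after every command.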

Today it’s unlikely that this would work, thanks to improvements that have been made around email such as SPF (Sender Policy Framework) [6] and Email Authentication/Validation [7]. So no more fun, but obviously for good reason.
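As a rough idea of what an SPF check involves, here is a simplified sketch. It assumes the domain's SPF record has already been fetched from DNS as a string, and it only understands the ip4, ip6 and all mechanisms. Real SPF evaluation also handles include, a, mx, redirect and more, all of which this sketch silently skips. The spf_allows function name and the sample record are made up for illustration.

```python
import ipaddress


def spf_allows(spf_record, sender_ip):
    """Return the qualifier of the first matching term: '+', '-', '~' or '?'.

    Returns None if the string is not an SPF record at all.
    Simplification: only ip4, ip6 and all are evaluated.
    """
    terms = spf_record.split()
    if not terms or terms[0].lower() != "v=spf1":
        return None
    ip = ipaddress.ip_address(sender_ip)
    for term in terms[1:]:
        qualifier = "+"                        # default qualifier is pass
        if term[0] in "+-~?":
            qualifier, term = term[0], term[1:]
        name, _, value = term.partition(":")
        name = name.lower()
        if name in ("ip4", "ip6") and value:
            if ip in ipaddress.ip_network(value, strict=False):
                return qualifier
        elif name == "all":
            return qualifier                   # catch-all, usually last
    return "?"                                 # nothing matched: neutral


# Hypothetical record using documentation-only address ranges.
record = "v=spf1 ip4:192.0.2.0/24 -all"
allowed = spf_allows(record, "192.0.2.10")
rejected = spf_allows(record, "198.51.100.7")
```

Here "+" means the sending IP is authorised and "-" means the domain owner says it is not, which is roughly what would stop the telnet trick against a domain that publishes such a record.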

Is the email content spam, phishing, other or legitimate?

The second problem is one of natural language processing. How does an email client determine that the content of an email is not phishing or something similar? Can a computer actually understand the contents of the email? Let’s leave that question there for now and not get into the Turing Test.

This is a harder problem to solve and it’s a constant battle between spammers and filters to get the right balance of blocking spam while allowing legitimate emails through. It’s probably the reason we have a spam folder at all, since there would be no use for this if the filtering was perfect.

As with all things computing, spam detection is being improved all the time. Established techniques include shared blacklists, manual user reports, identifying mass emails with similar content, regular expressions that match known bad content and many others. On top of these, ML (Machine Learning) [8], which is a subset of AI (Artificial Intelligence) [9], can improve spam detection rates quite considerably and is probably one of the best tools we have.
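To give a flavour of the regular-expression approach, here is a toy sketch. The patterns, weights and threshold are invented for illustration and are not taken from SpamAssassin or any real filter. Each rule that matches adds its weight to a score, and the message is flagged once the score reaches the threshold.

```python
import re

# Hypothetical rules: (pattern, weight). Real filters use far more rules.
RULES = [
    (re.compile(r"\bverify your account\b", re.IGNORECASE), 2.0),
    (re.compile(r"\bclick here\b", re.IGNORECASE), 1.5),
    (re.compile(r"\b(?:free|winner|prize)\b", re.IGNORECASE), 1.0),
    (re.compile(r"https?://\d{1,3}(?:\.\d{1,3}){3}"), 2.5),  # link to a raw IP
]
THRESHOLD = 3.0  # assumed cut-off for this sketch


def spam_score(text):
    """Sum the weights of every rule whose pattern appears in the text."""
    return sum(weight for pattern, weight in RULES if pattern.search(text))


def looks_like_spam(text):
    return spam_score(text) >= THRESHOLD


suspicious = looks_like_spam("Click here to verify your account")
harmless = looks_like_spam("Lunch at noon?")
```

A machine learning filter replaces the hand-written weights with ones learned from examples of spam and legitimate mail, which is a large part of why it adapts better.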

However, like a lot of technologies, this can be a double-edged sword [10], and I may explain this in more depth in a future article related to GPT-2 [11]. In essence, the better we get at recognising that content actually came from a human, the better spammers get at making content seem like it came from a human.

How to tell who leaks your Email Address

It’s fairly straightforward to determine where your email address(es) have been leaked from, but this technique does require you to purchase a personal domain. Usually you can pick up a domain fairly cheaply, but the price can vary depending on the suffix or TLD (Top Level Domain) [12] (.com, .net etc).

You should also be aware that many individuals and companies pre-purchase seemingly valuable domains so that they can sell them some time later at a much higher price (called domain squatting). If you do intend to buy a domain, try to purchase something at the base price from a well-known domain name registrar. surmise.it was purchased from 123-reg.co.uk, and I’ll use this domain as the example.

Once you have your domain you should have a control panel where you can log in and manage it. As surmise.it points to separate web hosting, configuration changes for this domain can be made in CPanel on the hosting rather than on the domain name registrar, but many domain name registrars support email configuration independent of hosting.

Within CPanel I have the option to create actual @surmise.it email accounts where the email collected will be stored on the server until it is downloaded by a client over POP3, or I can simply set up forwarders instead. For the purpose of this example we’ll use forwarders.

Depending on your configuration and whether you use hosting or not, your control or management panel may be very different to the images displayed, so make sure to do some research before spending any money. The domain email configuration may also allow you to set up individual forwarders or just one rule to forward everything, i.e. send everything we receive at xxxx@surmise.it to another email account such as relaki@yahoo.com, with xxxx being anything you can think of. Keep that in mind.

Most people today use a central email account hosted in the cloud and managed by a company such as Gmail or Yahoo. So let’s imagine that we’re going to forward everything to relaki@yahoo.com, but for the purpose of this article we will illustrate how it is done manually for each address.

Before we can put our plan into action we need somewhere to publish an email address or create an account, so let’s choose Amazon here and set up a forwarder from amazon@surmise.it to relaki@yahoo.com.

Now that we have this set up, we can sign up to Amazon and create the account with the email address amazon@surmise.it. Can you see where this is going?

Let’s imagine that we add a second forwarder for youtube@surmise.it to relaki@yahoo.com, and then go ahead and create an account on YouTube using the email address youtube@surmise.it.

If you do this for each account and one of these email addresses gets leaked, you now know where it was leaked from!

You can also take this one step further and avoid the company@ format altogether. Using any format you like, you can map the email address you used for each forwarder to the company it belongs to using an Excel spreadsheet.

joe.nightlight@surmise.it -> relaki@yahoo.com [Amazon]
T3467867@surmise.it -> relaki@yahoo.com [YouTube]
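If a spreadsheet feels heavy, the same mapping can be kept in a few lines of code. The sketch below is only an illustration: the ALIASES table mirrors the two example forwarders above, and leak_source is a hypothetical helper that takes the To header of an unwanted email and reports which company that address was given to.

```python
from email.utils import parseaddr

# Alias -> company it was handed to (lower-cased for matching).
ALIASES = {
    "joe.nightlight@surmise.it": "Amazon",
    "t3467867@surmise.it": "YouTube",
}


def leak_source(to_header):
    """Return the company an alias was given to, or 'unknown'."""
    # parseaddr copes with both "Name <addr>" and bare "addr" forms.
    _, address = parseaddr(to_header)
    return ALIASES.get(address.lower(), "unknown")


first = leak_source("Joe <joe.nightlight@surmise.it>")
second = leak_source("T3467867@surmise.it")
third = leak_source("someone.else@surmise.it")
```

An "unknown" result for an address on your own domain suggests the sender is guessing addresses rather than using a leaked one.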

I’m sure you get the idea.

Warning: The only problem you face with this technique is that you can’t afford to lose access to your domain. If you let the domain expire it would mean that anyone else could purchase the domain and have access to all of your email accounts. So either never let the domain expire, or update all of your accounts to your real email before it does.

References

[1] https://spamassassin.apache.org/index.html
[2] https://tools.ietf.org/html/rfc821
[3] https://tools.ietf.org/html/rfc5321
[4] https://github.com/ChangemakerStudios/Papercut
[5] https://blogs.technet.microsoft.com/danielmauser/2015/03/18/tip-installing-telnet-client-via-command-line/
[6] https://en.wikipedia.org/wiki/Sender_Policy_Framework
[7] https://en.wikipedia.org/wiki/Email_authentication
[8] https://en.wikipedia.org/wiki/Machine_learning
[9] https://en.wikipedia.org/wiki/Artificial_intelligence
[10] https://towardsdatascience.com/openais-gpt-2-the-model-the-hype-and-the-controversy-1109f4bfd5e8
[11] https://github.com/openai/gpt-2
[12] https://en.wikipedia.org/wiki/Top-level_domain

