Outlook doesn't automatically send email links to Bing

The other week a blog post was shared around which claimed that if you send an email containing links to an Outlook user, those links are automatically visited by a Bingbot user agent. The author also says that these URLs get included in Bing’s public search index. The article is light on details about the second part but he offered more information in the HN comments.

This report upset people for various reasons (cf. Hacker News, Lobsters), not all of which resonated with me. I do have skin in the game: my primary email these days is an outlook.com address and my consulting business is also on M365. For me, automatic scanning for malware and phishing behind links is actually pretty desirable. The problem I have with the article is that it’s a bit anecdotal and messy. Actual third-party customers are involved, and it’s hard to be certain that the link-visiting and the Bing indexing occurred as a result of the email delivery.

I set out to reproduce this by sending new, unique links to four different kinds of hosted Outlook email address under my control. After 16 days, the only hits to those links have been from my own user agent, or from an on-click scan immediately before or after my own request (on some accounts). There were no hits in the 20+ minutes between delivery and when I clicked, no hits afterward, and there has been no Bing activity or indexing of the new URLs at all. It’s impossible to prove a negative, but I am confident in saying that Microsoft does not routinely auto-click or index your email links, at least at the time I ran my experiment.

The Experiment

I drafted the following email, which I hoped would be casual enough to get past basic spam filters (narrator: it was), and sent it from my SDF account thomask@sdf.org.

Hi Tom,

Thanks for the chat the other day. This is the meetup I was talking about: [first url]

You can check how many seats are left here: [second url]

Cheers

The URLs pointed to brand new paths on my website, hosted on the same domain as the sender email address. There were reports of some scanners tweaking query parameters, so I wanted to have a “basic” URL and a dynamic one. I set up four pairs of paths, one pair per destination:

  • /emails/apple1/
  • /emails/apple1/req.php?qty=1&code=1234
  • /emails/banana2/
  • /emails/banana2/req.php?qty=2&code=bruh
  • /emails/cumquat3/
  • /emails/cumquat3/req.php?qty=3&code=abcdef
  • /emails/durian4/
  • /emails/durian4/req.php?qty=1&code=delicious
  • /emails/ (a blank page to prevent a directory listing if spidered)

These each return a small valid HTML document. I tested that they work using curl locally, to avoid any risk of a browser uploading the URLs for cloud history or security scanning. I also confirmed that subdirectories of my site do get indexed, so a search for site:thomask.sdf.org/ru-declensions, for example, returns results.
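The same check can be scripted instead of run by hand. The following is only a sketch, not what I actually ran: it assumes the base URL https://thomask.sdf.org (as it appears in the Referer fields of the logs further down), and the `build_urls` and `check` helpers and the `link-check-sketch` user agent are made-up names for illustration.

```python
from urllib.parse import urljoin
from urllib.request import Request, urlopen

# Assumed base URL, taken from the Referer fields in the access logs.
BASE = "https://thomask.sdf.org"

# The eight test paths listed above (the blank /emails/ index is left out).
PATHS = [
    "/emails/apple1/",
    "/emails/apple1/req.php?qty=1&code=1234",
    "/emails/banana2/",
    "/emails/banana2/req.php?qty=2&code=bruh",
    "/emails/cumquat3/",
    "/emails/cumquat3/req.php?qty=3&code=abcdef",
    "/emails/durian4/",
    "/emails/durian4/req.php?qty=1&code=delicious",
]


def build_urls(base=BASE, paths=PATHS):
    """Return the absolute URL for every test path."""
    return [urljoin(base, p) for p in paths]


def check(url, timeout=10):
    """Fetch one URL with a plain GET and return the HTTP status code.

    urlopen raises urllib.error.HTTPError for 4xx/5xx responses.
    """
    req = Request(url, headers={"User-Agent": "link-check-sketch"})
    with urlopen(req, timeout=timeout) as resp:
        return resp.status


# Usage (this makes real requests, so the hits will appear in the access log):
# for url in build_urls():
#     print(check(url), url)
```

Like curl, this talks to the server directly, so no browser history sync or browser-side security scanning is involved.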

There is no robots.txt and no noindex header. The point is that these measures shouldn’t be necessary, even if they would work as protection against indexing email links.

I used destination email addresses under my control on the following domains:

  • outlook.com - personal email account (M365 Family subscription)
  • cerberuscs.com.au - my consulting business (M365 Business)
  • utas.edu.au - email account from my alma mater, which uses M365 but their MX is hosted by a service called Mimecast
  • outlook.com again - a new free account on default settings which I created for this experiment

After sending all the emails one after the other, I waited at least 20 minutes to see whether delivery alone produced any hits. It did not. I then opened each email through Outlook desktop or Outlook web access and clicked once on each link.
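Checking that waiting window comes down to filtering the access log by timestamp. Here is a rough sketch, assuming the log is in the combined format shown in the results below. The `parse` and `hits_between` helpers are hypothetical names, and the start and end times are whatever delivery and first-click times you recorded yourself.

```python
import re
from datetime import datetime

# Start of a combined-format log line: client, two ignored fields,
# [timestamp], then "METHOD target protocol".
LINE_RE = re.compile(r'^(\S+) \S+ \S+ \[([^\]]+)\] "(\S+) (\S+) [^"]*"')


def parse(line):
    """Return (client, timestamp, method, target), or None if the line does not match."""
    m = LINE_RE.match(line)
    if not m:
        return None
    client, stamp, method, target = m.groups()
    when = datetime.strptime(stamp, "%d/%b/%Y:%H:%M:%S %z")
    return client, when, method, target


def hits_between(lines, start, end, prefix="/emails/"):
    """Return parsed hits on the test paths with start <= time < end."""
    found = []
    for line in lines:
        hit = parse(line)
        if hit and hit[3].startswith(prefix) and start <= hit[1] < end:
            found.append(hit)
    return found
```

An empty list for the window between delivery and the first click corresponds to the "no hits" result described above.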

Result: outlook.com (M365 Family)

This account has SafeLinks turned on, which means all links in emails are automatically bounced through https://nam12.safelinks.protection.outlook.com/.... When I clicked each link, a HEAD request from 104.47.66.126, which is a Microsoft IP, arrived immediately before the request from my browser.

Then, in the following 25 seconds, some different Microsoft IPs checked a handful of URLs and also looked for favicons. There was no activity after this.

104.47.66.126 - - [06/Jul/2022:12:43:45 +0000] "HEAD /emails/apple1/ HTTP/1.1" 200 0 "-" "-"
(MY IP) - - [06/Jul/2022:12:43:46 +0000] "GET /emails/apple1/ HTTP/2.0" 200 135 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/103.0.5060.66 Safari/537.36 Edg/103.0.1264.44"
104.47.66.126 - - [06/Jul/2022:12:43:52 +0000] "HEAD /emails/apple1/req.php?qty=1&code=1234 HTTP/1.1" 200 0 "-" "-"
(MY IP) - - [06/Jul/2022:12:43:52 +0000] "GET /emails/apple1/req.php?qty=1&code=1234 HTTP/2.0" 200 135 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/103.0.5060.66 Safari/537.36 Edg/103.0.1264.44"
40.94.33.40 - - [06/Jul/2022:12:43:57 +0000] "GET /emails/apple1/ HTTP/2.0" 200 135 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.130 Safari/537.36"
40.94.30.143 - - [06/Jul/2022:12:44:02 +0000] "GET /emails/apple1/req.php HTTP/2.0" 200 128 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.130 Safari/537.36"
40.94.30.143 - - [06/Jul/2022:12:44:02 +0000] "GET /emails/apple1/req.php?qty=1&code=1234 HTTP/2.0" 200 135 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.130 Safari/537.36"
40.94.33.89 - - [06/Jul/2022:12:44:02 +0000] "GET /favicon.ico HTTP/2.0" 404 188 "https://thomask.sdf.org/emails/apple1/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.130 Safari/537.36"
40.94.30.223 - - [06/Jul/2022:12:44:08 +0000] "GET /favicon.ico HTTP/2.0" 404 188 "https://thomask.sdf.org/emails/apple1/req.php" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.130 Safari/537.36"
40.94.30.223 - - [06/Jul/2022:12:44:08 +0000] "GET /favicon.ico HTTP/2.0" 404 188 "https://thomask.sdf.org/emails/apple1/req.php?qty=1&code=1234" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.130 Safari/537.36"
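One way to pick scanner traffic out of a log like this is to group the hits by client address, method and user agent, leaving out your own clicks. This is a sketch only. `OWN_IP` is a hypothetical placeholder (203.0.113.7, a documentation address) standing in for the redacted "(MY IP)", and `summarize` is a made-up helper name.

```python
import re
from collections import Counter

# client - - [time] "METHOD target protocol" status size "referer" "user agent"
LINE_RE = re.compile(
    r'^(\S+) \S+ \S+ \[[^\]]+\] "(\S+) \S+ [^"]*" \S+ \S+ "[^"]*" "([^"]*)"'
)

OWN_IP = "203.0.113.7"  # hypothetical stand-in for the redacted client address


def summarize(lines, own_ip=OWN_IP):
    """Count hits per (client, method, user agent), skipping our own requests."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.match(line)
        if m and m.group(1) != own_ip:
            counts[m.groups()] += 1
    return counts


# Usage:
# with open("access.log") as fh:
#     for (client, method, agent), n in summarize(fh).items():
#         print(n, client, method, agent)
```

Applied to the log above, this kind of grouping separates the on-click HEAD requests, which carry an empty user agent, from the follow-up GETs, which carry a Chrome 79 user agent rather than the Edge 103 string my browser sent.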

Result: cerberuscs.com.au (M365 Business)

For whatever reason my business account doesn’t have SafeLinks turned on. There were just the two hits from my own computer.

(MY IP) - - [06/Jul/2022:12:44:04 +0000] "GET /emails/banana2/ HTTP/2.0" 200 135 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/103.0.5060.66 Safari/537.36 Edg/103.0.1264.44"
(MY IP) - - [06/Jul/2022:12:44:06 +0000] "GET /emails/banana2/req.php?qty=2&code=bruh HTTP/2.0" 200 136 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/103.0.5060.66 Safari/537.36 Edg/103.0.1264.44"

Result: utas.edu.au (with Mimecast protection)

This account also had click-through links, in the format https://protect-au.mimecast.com/s/.... Each click resulted in several visits from other addresses before my browser was eventually redirected.

89.184.210.96 - - [06/Jul/2022:12:44:20 +0000] "GET /emails/cumquat3/ HTTP/2.0" 200 135 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/103.0.5060.66 Safari/537.36 Edg/103.0.1264.44"
139.99.245.97 - - [06/Jul/2022:12:44:26 +0000] "GET /emails/cumquat3/ HTTP/2.0" 200 135 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/103.0.5060.66 Safari/537.36 Edg/103.0.1264.44"
(MY IP) - - [06/Jul/2022:12:44:27 +0000] "GET /emails/cumquat3/ HTTP/2.0" 200 135 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/103.0.5060.66 Safari/537.36 Edg/103.0.1264.44"
212.70.27.78 - - [06/Jul/2022:12:44:30 +0000] "GET /emails/cumquat3/req.php?qty=3&code=abcdef HTTP/1.1" 200 149 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/103.0.5060.66 Safari/537.36 Edg/103.0.1264.44"
77.95.119.136 - - [06/Jul/2022:12:44:30 +0000] "GET /emails/cumquat3/req.php?qty=3&code=abcdef HTTP/2.0" 200 138 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/103.0.5060.66 Safari/537.36 Edg/103.0.1264.44"
212.70.27.78 - - [06/Jul/2022:12:44:31 +0000] "GET /emails/cumquat3/req.php?qty=3&code=abcdef HTTP/1.1" 200 149 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/103.0.5060.66 Safari/537.36 Edg/103.0.1264.44"
150.107.73.234 - - [06/Jul/2022:12:44:37 +0000] "GET /emails/cumquat3/req.php?qty=3&code=abcdef HTTP/2.0" 200 138 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/103.0.5060.66 Safari/537.36 Edg/103.0.1264.44"
(MY IP) - - [06/Jul/2022:12:44:38 +0000] "GET /emails/cumquat3/req.php?qty=3&code=abcdef HTTP/2.0" 200 138 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/103.0.5060.66 Safari/537.36 Edg/103.0.1264.44"
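In this log the other clients present the same user agent string as my browser, so the user agent alone does not separate them from my own clicks. The client address and the ordering do. A small sketch of that idea, again using a hypothetical `OWN_IP` placeholder (203.0.113.7) for the redacted address and a made-up `scanners_before_click` helper:

```python
import re
from collections import defaultdict

# Only the client address and the request target are needed here.
LINE_RE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]+\] "\S+ (\S+) [^"]*"')

OWN_IP = "203.0.113.7"  # hypothetical stand-in for the redacted client address


def scanners_before_click(lines, own_ip=OWN_IP):
    """Map each target to the other clients that requested it before our first hit.

    Assumes the lines are in chronological order, as in an access log.
    Targets we never requested ourselves collect every client that hit them.
    """
    seen = defaultdict(list)  # target -> other clients seen so far
    clicked = set()           # targets we have already fetched ourselves
    for line in lines:
        m = LINE_RE.match(line)
        if not m:
            continue
        client, target = m.groups()
        if client == own_ip:
            clicked.add(target)
        elif target not in clicked:
            seen[target].append(client)
    return dict(seen)
```

For the excerpt above, this would list 89.184.210.96 and 139.99.245.97 against /emails/cumquat3/, since both arrive before my own request for that path.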

Result: outlook.com (free account)

Slightly later than the others, I thought I should see what happens on an unpaid outlook.com email address on default settings, in case they treat their free users any differently. This account did not have SafeLinks, and no scanning or other hits occurred at all apart from my own clicks in the outlook.com web client.

(MY IP) - - [06/Jul/2022:13:12:51 +0000] "GET /emails/durian4/ HTTP/2.0" 200 135 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/103.0.5060.66 Safari/537.36 Edg/103.0.1264.44"
(MY IP) - - [06/Jul/2022:13:12:52 +0000] "GET /emails/durian4/req.php?qty=1&code=delicious HTTP/2.0" 200 138 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/103.0.5060.66 Safari/537.36 Edg/103.0.1264.44"

Conclusion

In the tests I performed, the hits on the new URLs were pretty much what I expected: certainly nothing nefarious, and no sign of Bing getting up in my business. It’s clear that there isn’t a general Outlook policy of “click all the links” or “scan all the links” or “submit all the links to Bing”, because that simply hasn’t happened across a range of account types.

That doesn’t necessarily mean the traffic described in the original report was false. Without knowing what goes on behind the scenes at Outlook.com and Bing I can’t say with certainty that they never click links or index things on Bing. There could be other reasons why that traffic occurred for the original author’s site.

Until I receive information to the contrary, I am comfortable that my email links are not being misused.