Bug 184573 - [GTK] Webkit should spoof as Safari on a Mac for Outlook.com
Summary: [GTK] Webkit should spoof as Safari on a Mac for Outlook.com
Status: REOPENED
Alias: None
Product: WebKit
Classification: Unclassified
Component: WebKitGTK
Version: Other
Hardware: PC Linux
Importance: P2 Normal
Assignee: Michael Catanzaro
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2018-04-12 17:26 PDT by Ryan Farmer
Modified: 2018-05-15 07:18 PDT
CC List: 5 users

See Also:


Attachments
Patch (2.79 KB, patch)
2018-04-13 09:10 PDT, Michael Catanzaro
no flags
Patch (1.51 KB, patch)
2018-05-09 19:06 PDT, Michael Catanzaro
no flags
Patch (1.61 KB, patch)
2018-05-10 09:12 PDT, Michael Catanzaro
no flags

Description Ryan Farmer 2018-04-12 17:26:38 PDT
I originally opened this as a bug on GNOME Web, but was told to open one here.

When you visit Outlook.com in GNOME Web with the default user agent, you get a "basic" version of the webmail site which is intended for obsolete web browsers.

According to Microsoft, the only supported browsers are Internet Explorer, Edge, Safari, and Chrome, and it seems like anything else is an "unknown" that gets the bare bones site.

In fact, spoofing the UA as Safari 11.1 on a Mac loads the full version, which works fine. According to a similar bug related to Google websites, WebKit tells Google that it is Chromium so that Google will load the full site and stop spamming the user with prompts to get Chrome. Apparently, Michael Catanzaro hit a wall trying to get Google to stop doing this on their end, and it's likely a lost cause trying to get Microsoft to stop this idiocy as well.
Comment 1 Michael Catanzaro 2018-04-13 09:05:10 PDT
Yeah, user agent evangelism is a lost cause, let's just add it to the quirks list. Websites forfeit their right to receive an accurate user agent from WebKit when they send fallback content to our users.
Comment 2 Michael Catanzaro 2018-04-13 09:10:51 PDT
Created attachment 337894
Patch
Comment 3 EWS Watchlist 2018-04-13 09:13:03 PDT
Attachment 337894 did not pass style-queue:


ERROR: Tools/ChangeLog:3:  Please consider whether the use of security-sensitive phrasing could help someone exploit WebKit: spoof  [changelog/unwantedsecurityterms] [3]
ERROR: Source/WebCore/ChangeLog:3:  Please consider whether the use of security-sensitive phrasing could help someone exploit WebKit: spoof  [changelog/unwantedsecurityterms] [3]
Total errors found: 2 in 4 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 4 Michael Catanzaro 2018-04-13 09:13:09 PDT
Ah, we already have a quirk for Outlook, but only for one particular university installation:

    // Microsoft Outlook Web App forces users with WebKitGTK+'s standard user
    // agent to use the light version. Earlier versions even blocks users from
    // accessing the calendar.
    if (domain == "mail.ntu.edu.tw")
        return true;

Sadly this is one of those cases where we have to add a quirk separately for every domain found to be running Outlook.

Now, we could run into one problem here. The quirk I added only works if the resource that makes the decision to send the crap fallback site is hosted on outlook.live.com. In testing Google quirks, I've found that sometimes the quirk has to be set for various other domains as well. So I'm not confident this will actually work without testing. We might need to use the quirk for all of live.com, for instance, or restrict it to other subdomains. If you're enterprising and willing to rebuild WebKit to test it to make sure the quirk works, that would be ideal.
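To make the limitation concrete, here is a minimal sketch of an exact-host quirk check, assuming the list is simply extended with outlook.live.com. The function name `hostNeedsMacintoshPlatformQuirk` and the use of `std::string` are illustrative assumptions, not the contents of the attached patch or the actual WebCore code.

```cpp
#include <string>
#include <unordered_set>

// Hypothetical stand-in for a host-based user agent quirk check.
// std::string is used only to keep the sketch self-contained; the real
// code operates on WebKit's own URL and string types.
static bool hostNeedsMacintoshPlatformQuirk(const std::string& host)
{
    // Exact-match list: every host that serves the fallback decision
    // has to be listed separately.
    static const std::unordered_set<std::string> quirkHosts = {
        "mail.ntu.edu.tw",  // existing entry quoted above
        "outlook.live.com", // host discussed in this bug
    };
    return quirkHosts.count(host) > 0;
}
```

With an exact-match list like this, a request to any other host (outlook.com itself, or another live.com subdomain) still gets the default user agent, which is why testing against the live site matters.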
Comment 5 Michael Catanzaro 2018-04-17 21:37:23 PDT
Comment on attachment 337894
Patch

I guess we'll find out whether it works or not once it gets released.
Comment 6 WebKit Commit Bot 2018-04-17 22:05:30 PDT
Comment on attachment 337894
Patch

Clearing flags on attachment: 337894

Committed r230749: <https://trac.webkit.org/changeset/230749>
Comment 7 WebKit Commit Bot 2018-04-17 22:05:31 PDT
All reviewed patches have been landed.  Closing bug.
Comment 8 Michael Catanzaro 2018-05-07 07:34:43 PDT
Please test this in 2.20.2 so we can find out if it worked.
Comment 9 Ryan Farmer 2018-05-09 00:34:34 PDT
I don't see WebKitGTK 2.20.2 built for Fedora 28 yet.

I'll keep an eye out for an RPM on Koji.
Comment 11 Ryan Farmer 2018-05-09 12:50:26 PDT
It seems to only work if you aren't already logged in and go to outlook.com and log in (full website), or if you go directly to outlook.live.com (full website).

If you go to outlook.com and you're already signed into your account, there's a redirect to outlook.live.com/owa and you get the basic webmail.

So, it seems that the patch fixes it under certain conditions, but not if you go directly to "outlook.com" and get the redirect.

This site is such a mess.
Comment 12 Ryan Farmer 2018-05-09 12:57:04 PDT
Michael, it looks like the patch is only spoofing for outlook.live.com.

It looks like going after outlook.com too is going to be what it takes to get this to work. :/
Comment 13 Michael Catanzaro 2018-05-09 13:19:46 PDT
OK, we can try that...
Comment 14 Michael Catanzaro 2018-05-09 19:06:23 PDT
Created attachment 340060
Patch
Comment 15 EWS Watchlist 2018-05-09 19:08:41 PDT
Attachment 340060 did not pass style-queue:


ERROR: Source/WebCore/ChangeLog:3:  Please consider whether the use of security-sensitive phrasing could help someone exploit WebKit: spoof  [changelog/unwantedsecurityterms] [3]
Total errors found: 1 in 2 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 16 Michael Catanzaro 2018-05-09 19:12:40 PDT
Please test this F28 scratch build once it's finished (and let me know if the build fails): https://koji.fedoraproject.org/koji/taskinfo?taskID=26866056
Comment 17 Ryan Farmer 2018-05-09 22:35:34 PDT
Seems like the fallback site is still loading if you go to outlook.com and get the redirect.
Comment 18 Ryan Farmer 2018-05-09 22:39:05 PDT
I just noticed that with the package webkit2gtk3-jsc, I get:

Error unpacking rpm package webkit2gtk3-jsc-2.20.2-2.fc28.x86_64
Error unpacking rpm package webkit2gtk3-jsc-2.20.2-2.fc28.x86_64
error: unpacking of archive failed on file /usr/lib64/libjavascriptcoregtk-4.0.so.18.7.10;5af3da69: cpio: read
webkit2gtk3-jsc-2.20.2-2.fc28.x86_64 was supposed to be installed but is not!

The other two packages (WebKit itself and the GTK2 plugin package) installed though.
Comment 19 Michael Catanzaro 2018-05-10 07:19:06 PDT
(In reply to Ryan Farmer from comment #17)
> Seems like the fallback site is still loading if you go to outlook.com and
> get the redirect.

Strange. What I have there is the most aggressive heuristic I could come up with: any request with the base domain "outlook.com" or "live.com" will get the special user agent. That includes outlook.live.com. So that tells us that there is some other domain that needs to receive the quirk. Your best bet is to open the web inspector (Ctrl+Shift+I) and switch to the Resources tab before loading the page, then see whether any resources are being loaded from elsewhere.
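As a rough illustration of that heuristic, here is a sketch of a base-domain match, assuming a plain suffix comparison. The helper names are hypothetical, and the real code would presumably compute the base domain with WebKit's own facilities (including the public suffix list) rather than with string matching.

```cpp
#include <cstddef>
#include <string>

// Returns true when `host` equals `baseDomain` or is a subdomain of it.
// Simplified suffix check, used here only for illustration.
static bool hostIsInBaseDomain(const std::string& host, const std::string& baseDomain)
{
    if (host == baseDomain)
        return true;
    if (host.size() <= baseDomain.size())
        return false;
    // Require a '.' right before the suffix so that "notoutlook.com"
    // does not match "outlook.com".
    const std::size_t offset = host.size() - baseDomain.size();
    return host[offset - 1] == '.'
        && host.compare(offset, baseDomain.size(), baseDomain) == 0;
}

// Hypothetical wrapper: the two base domains named in this comment.
static bool hostGetsSpecialUserAgent(const std::string& host)
{
    return hostIsInBaseDomain(host, "outlook.com")
        || hostIsInBaseDomain(host, "live.com");
}
```

Under this sketch, outlook.com, outlook.live.com, and any other live.com subdomain match, while a resource served from an unrelated base domain would still see the default user agent.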

(In reply to Ryan Farmer from comment #18)
> I just noticed that with the package webkit2gtk3-jsc, I get:
> 
> Error unpacking rpm package webkit2gtk3-jsc-2.20.2-2.fc28.x86_64
> Error unpacking rpm package webkit2gtk3-jsc-2.20.2-2.fc28.x86_64
> error: unpacking of archive failed on file
> /usr/lib64/libjavascriptcoregtk-4.0.so.18.7.10;5af3da69: cpio: read
> webkit2gtk3-jsc-2.20.2-2.fc28.x86_64 was supposed to be installed but is not!
> 
> The other two packages (WebKit itself and the GTK2 plugin package) installed
> though.

Not sure why it failed, but it shouldn't matter, as the code change is in WebCore.
Comment 20 Ryan Farmer 2018-05-10 08:34:50 PDT
Outlook.com redirects too quickly to catch what it's trying to load in the inspector in Web, so I loaded it in Opera with the inspector on and JavaScript disabled.

Outlook.com seems to load a gigantic nasty obfuscated JavaScript file from ow1.res.office365.com:

https://ow1.res.office365.com/owamail/20180427.03.02/scripts/owa.mail.js

In the basic version of the site, it also looks like we get some CSS files and numerous images from r1.res.office365.com 

Once you're actually on the full mail website, there are resources pulled in from Skype, AOL, and tons of other stuff, but since the redirect itself only seems to pull in "owa.mail.js", it might be an idea to go after office365.com

-----

So it looks like a duck and quacks like a duck. Maybe office365.com needs the spoof too.

Right now, it's sitting at:

Go to outlook.com and sign in: Get full Outlook Mail.

Go to outlook.com while already signed in: Get fallback site.

Go to outlook.live.com while already signed in: Get full Outlook Mail.

Go to outlook.live.com and sign in: Get full Outlook Mail.

Go to office.com and click on any of the Outlook Mail icons while signed in: Get directed to outlook.com and get the fallback site.

-----

I've been browsing around with the user agent set to Safari 11.1 on Mac OS to see if anything related to Office 365 breaks if you spoof as Safari, and found that actually not much of the Office 365 site loads anything from office365.com. Most of it is from live.com subdomains and various CDNs.

Should WebKit go after all of office365.com with a UA spoof and see if that fixes Outlook?

With the way Microsoft has set Outlook/Office up, it looks dodgy as hell (obfuscript, vendor prefixed CSS, etc.), and I'm concerned that if the spoof goes after subdomains, Microsoft might rename them for some reason later on.

Plus, it seems like Outlook and Office 365 are more or less the same service these days. Gone is the simple utilitarian Hotmail.
Comment 21 Michael Catanzaro 2018-05-10 09:12:30 PDT
Created attachment 340093
Patch
Comment 22 EWS Watchlist 2018-05-10 09:13:59 PDT
Attachment 340093 did not pass style-queue:


ERROR: Source/WebCore/ChangeLog:3:  Please consider whether the use of security-sensitive phrasing could help someone exploit WebKit: spoof  [changelog/unwantedsecurityterms] [3]
Total errors found: 1 in 2 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 23 Michael Catanzaro 2018-05-10 09:14:43 PDT
Here's another scratch build that will use the quirk for office365.com: https://koji.fedoraproject.org/koji/taskinfo?taskID=26877229
Comment 24 Ryan Farmer 2018-05-10 11:25:25 PDT
Blast.

Well, that doesn't fix the Outlook.com redirect either.

Interestingly though, now if you click on any link after it dumps you out at the basic site, it goes from the fallback site to the full site. (Inbox, Junk Mail, a piece of mail, etc.)

There are some more domains that have resources fetched on the full site:

skype.com
skypeassets.com
bing.com
msecnd.net
azureedge.net
advertising.com
adnxs.com
aolcdn.com

The azureedge.net and msecnd.net domains only load images.

The skype.com domain loads frames and scripts. It also uses XMLHttpRequest a lot. skypeassets.com shows up in fetches a lot.

The bing.com domain loads numerous scripts. 

The adnxs.com domain loads a script.

The advertising.com domain loads a script.

The aolcdn.com domain loads a script.

Also, I went back and searched through the script that the Outlook.com redirect loads, looking for domains where it attempts to load something, and found:

office.net
adnxs.com
sharepointonline.com
outlookweb.visualstudio.com


So the candidates are many; however, the best ones seem to be:

skype.com
skypeassets.com
bing.com

Possibly:

adnxs.com
advertising.com
aolcdn.com

This should all be advertising, though. Could this be getting loaded even with Web's ad blocker?

Less likely:

azureedge.net
msecnd.net
sharepointonline.com
office.net
visualstudio.com

I'm just spitballing at this point, though. Whoever wrote these sites is deranged.
Comment 25 Ryan Farmer 2018-05-10 11:55:27 PDT
Yeah, the more I look at this, the more I notice that Skype (skype.com and skypeassets.com) is _everywhere_ on the full site. 

At first I didn't think a lot of it, but the idea occurs that perhaps if the Skype APIs don't like what they see, they knock you back to the fallback site.

Curiouser and curiouser.

But of course there's no telling. If I had a way of spoofing the UA per domain on my end, I could just set them all to Web's default UA and play around with them until I found the magical combo that doesn't ever result in a fallback site.
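For reference, the kind of per-domain override being wished for here could look roughly like the sketch below. It assumes a simple table keyed by domain that is consulted from the full host toward its parent domains. The class name `UserAgentOverrides` and the user agent strings in the usage note are placeholders, not real WebKit APIs or real user agent values.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <utility>

// Hypothetical per-domain user agent override table for experimentation.
class UserAgentOverrides {
public:
    explicit UserAgentOverrides(std::string defaultUserAgent)
        : m_default(std::move(defaultUserAgent))
    {
    }

    void set(const std::string& domain, const std::string& userAgent)
    {
        m_overrides[domain] = userAgent;
    }

    // Tries the full host first, then each parent domain in turn, and
    // falls back to the default user agent when nothing matches.
    std::string userAgentFor(const std::string& host) const
    {
        std::string candidate = host;
        while (true) {
            auto it = m_overrides.find(candidate);
            if (it != m_overrides.end())
                return it->second;
            const std::size_t dot = candidate.find('.');
            if (dot == std::string::npos)
                return m_default;
            candidate.erase(0, dot + 1);
        }
    }

private:
    std::string m_default;
    std::map<std::string, std::string> m_overrides;
};
```

Usage would be along the lines of constructing `UserAgentOverrides overrides("default-ua")`, calling `overrides.set("skype.com", "safari-mac-ua")`, and then toggling entries one domain at a time until the fallback site stops appearing.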
Comment 26 Michael Catanzaro 2018-05-10 14:20:21 PDT
I think you're going to have to build WebKit yourself if we want to ensure this works. See https://trac.webkit.org/wiki/BuildingGtk
Comment 27 Michael Catanzaro 2018-05-15 07:18:58 PDT
I'm going to roll out the initial fix, since it didn't work :(