Every time I view-source a page on a high-profile, large-scale website, I find myself digging deeper and deeper, fascinated by how much you can learn about that site’s architecture from simply reading its front-end code.
I was doing it this morning with Twitter and thought I’d share my findings, none of which are earth-shattering but which are interesting nonetheless.
It appears that Twitter is using Facebook’s Scribe for logging, and that they’re sending client-side logs directly to scribe.twitter.com:
var scribeUrl = (window.location.protocol.match(/s\:$/) ? 'https' : 'http') + '://scribe.twitter.com';
scribeUrl += '?category=webclient&log=' + encodeURIComponent(stringifyLite(report)) + '&ts=' + (new Date()).getTime();
(new Image()).src = scribeUrl;
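The snippet above is an instance of the image-beacon pattern: the report is serialized into a query string and sent as a GET request by assigning the URL to an image's src. Here is a minimal sketch of that pattern. The helper names buildBeaconUrl and sendReport are hypothetical, and JSON.stringify stands in for Twitter's stringifyLite, whose implementation is not shown on the page.

```javascript
// Hypothetical helper: only the URL shape is taken from Twitter's snippet.
function buildBeaconUrl(protocol, report, now) {
  // Mirror the page's scheme so an HTTPS page does not load an HTTP resource.
  var scheme = /s:$/.test(protocol) ? 'https' : 'http';
  return scheme + '://scribe.twitter.com' +
    '?category=webclient' +
    // Assumption: JSON.stringify replaces the page's own stringifyLite.
    '&log=' + encodeURIComponent(JSON.stringify(report)) +
    // The timestamp also makes each URL unique, which defeats caching.
    '&ts=' + now;
}

// Hypothetical helper: browser-only, because it needs window and Image.
function sendReport(report) {
  var url = buildBeaconUrl(window.location.protocol, report, Date.now());
  // Assigning src fires the request; the response itself is ignored.
  new Image().src = url;
}
```

Keeping the URL construction separate from the send step makes the serialization easy to check outside a browser.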
Since Twitter is a heavy client-side application that uses its own API just as any other client app would, it also uses some sort of client-side feature management to determine which features are enabled for the viewing user. I’m assuming this is used for user sampling, for gradual release of features, and for turning off features when the system is overloaded.
twttr._initialDeciderFeatures = {"tweet_stream_search":1,"phoenix_puffin":1,"tweet_stream_retweets_by_others":1,"tweet_geo_component":1 …[truncated]… };
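As a rough illustration of how client code might consult such a map, here is a minimal sketch. It assumes that a value of 1 means "enabled", which the page does not confirm. The isFeatureEnabled helper and the flag set to 0 are hypothetical and are not taken from Twitter's code.

```javascript
// Hypothetical flag map, shaped like twttr._initialDeciderFeatures.
// The 0 value is invented for illustration; the page only shows 1s.
var features = { tweet_stream_search: 1, tweet_geo_component: 0 };

// Hypothetical helper: a flag is on only if it is present and equal to 1,
// so unknown or missing flags are treated as disabled.
function isFeatureEnabled(flags, name) {
  return Object.prototype.hasOwnProperty.call(flags, name) && flags[name] === 1;
}

if (isFeatureEnabled(features, 'tweet_stream_search')) {
  // Render the search stream here.
}
```

Defaulting unknown flags to "off" is a design choice that suits gradual rollouts: a client that has never heard of a feature will not show it.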
There’s also an indication of which domains they use for development (localhost.twitter.com on port 3000) and staging (staging*.twitter.com):
twttr.domains = { local: 'twitter.com', remote: 'api.twitter.com' };
var match = window.location.hostname.match(/^(staging\d+.[a-zA-Z0-9_]*?).twitter.com$/i);
if (match) {
  twttr.domains.local = match[1] + '.twitter.com';
  twttr.domains.remote = 'api-' + match[1] + '.twitter.com';
}
if (document.location.hostname === "localhost.twitter.com") {
  twttr.domains.local = 'localhost.twitter.com:3000';
  twttr.domains.remote = 'api.localhost.twitter.com:3000';
}
twttr.hosts = { local: twttr.proto + "://" + twttr.domains.local, remote: twttr.proto + "://" + twttr.domains.remote };
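To make the host-selection logic above easier to follow, here is a sketch that wraps it in a pure function taking the hostname as an argument. The function name resolveDomains and the example hostname staging12.foo.twitter.com are hypothetical. The sketch also escapes the dots in the pattern, which is a deliberate change from the code as it appears on the page.

```javascript
// Hypothetical wrapper around the domain logic shown on the page.
function resolveDomains(hostname) {
  // Production defaults.
  var domains = { local: 'twitter.com', remote: 'api.twitter.com' };

  // Dots are escaped here so they match only a literal "."; the pattern
  // as it appears on the page leaves them unescaped.
  var match = hostname.match(/^(staging\d+\.[a-zA-Z0-9_]*?)\.twitter\.com$/i);
  if (match) {
    // Staging hosts talk to a matching "api-" prefixed API host.
    domains.local = match[1] + '.twitter.com';
    domains.remote = 'api-' + match[1] + '.twitter.com';
  }

  if (hostname === 'localhost.twitter.com') {
    // Local development runs on port 3000.
    domains.local = 'localhost.twitter.com:3000';
    domains.remote = 'api.localhost.twitter.com:3000';
  }
  return domains;
}

// Example with a made-up staging hostname.
var staging = resolveDomains('staging12.foo.twitter.com');
```

With that made-up hostname, the captured group is staging12.foo, so the API host becomes api-staging12.foo.twitter.com.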
They are using LABjs to optimize asset loading.
What I was really interested in was figuring out whether they track which frontend server responds to each request. There’s no sign of it anywhere in the code, but they might be using HTTP headers for that. I’ll investigate that some other time and update this post if there’s anything to share.