Demystifying SEO with experiments

aresant · on Jan 27, 2015

"Figure 4 shows the result of an experiment we ran to test rendering with JavaScript for better web performance. . . [The test failed and] even after we turned off the experiment, it took almost a month for pages in the enabled group to recover..."

This is the incredibly frustrating part of White Hat SEO w/Google because:

(a) At nearly every SEO conference, by Google's own actions in releasing Page Speed Insights, and by Matt Cutts engaging and talking about page speed being an important indicator it seems like speed is pretty !@$!@% important to Google and worth pouring resources into. (1) (2)

(b) As a result a well equipped and well connected organization like Pinterest launches tests designed to improve this important signal. I'm going to assume they're organizationally smart enough to not damage or ignore other important Google signal ranks like usability, time on site, etc that you have to balance w/the JS page speed test.(3)

(c) Google penalizes them.

WTF!

My frustration as a customer acquisition guy - encompassing CRO / SEM / SEO / etc - is that I try to discuss and push best practices for my own projects, for clients, and for public facing blogs / presentations / etc.

I get that they don't want people gaming / pushing - but when they push out a "best practices" methodology like page speed, and then execute a penalty as described by Pinterest, I just want to throw my hands up.

(1) https://developers.google.com/speed/pagespeed/insights/

(2) http://www.webpronews.com/today-on-the-matt-cutts-show-page-...

(3) I'm going to add the disclaimer of not having seen the JavaScript that pinterest used and perhaps they're not properly weighting / aware of other important SEO signals that GOOG penalizes when using Javascript, but I'm sure they are. Happy to answer more on that directly via my profile or this thread.

mattmanser · on Jan 28, 2015

One of my clients has a site that gets about 1k visitors per day, heavily dependant on SEO, nothing massive, but big enough to see the effects of certain changes.

After I refactored one of their pages to go from 1.5 sec initial/8 sec total page load to 0.04 sec/4 sec load times there was no discernible change in rankings or traffic.

Page load as SEO is totally over egged for the most part as it's one of the things developers can easily measure and point a finger at, along with number of requests, CSS file size, etc. while everything actually meaningful for SEO is absolutely black box.

Every time I get handed a "SEO site report" by a client I can immediately tell if the SEO guy is any good or not if the report actually contains anything else other than recommendations to improve page loads, reduce requests and all those practically useless, but very expensive in developer time, actions. Oh, and the obligatory spreadsheet with all the external inbound links domains. Because that's really insightful.

motherwell · on Jan 28, 2015

Everyone - repeat after me:

SEO DOES NOT HAPPEN IN ISOLATION!

I repeat:

SEO DOES NOT HAPPEN IN ISOLATION!

If one improves a factor on their own site, it doesn't mean that will make a lick of difference in rankings because:

1. That may not be the factor holding a site back, and

2. It may be the site is performing as good as possible because users do not want that type of site in the SERPs.

SERP construction - what goes where - has a bigger impact on rankings than any one other factor, e.g. maps vs images versus news, and that has less to do with one's own site, and more to do with user feedback.

mhoad · on Jan 28, 2015

While I too couldn't comment on the actual JavaScript they put in place for the test and how it might have impacted other factors I just wanted to make a quick comment on the idea of a page speed penalty.

1. I have never in my life seen evidence of this happening before (enterprise consultant in this field). I am positive in my mind that it was not a case of "page is faster = drop in rankings".

I would be willing to bet that it had a lot more to do with the "rendering content in JS" which has traditionally been a huge issue for search engines (despite what they claim).

But every single time I have seen someone claim this or something like this there has been many other factors at play.

2. It is highly likely that the amount of impact that any given variable (in this case pagespeed) is not the same across the board. In fact it is much more aligned with the kinds of industries you are in and the keywords relating to that search.

As an example page speed is MUCH more likely to be a bigger factor for an e-commerce website and keywords showing any kind of transactional intent than it would be for someone looking for detailed information on a medical condition for example.*

*This isn't a confirmed fact that I know of at all but it seems to be a relatively well established theory in some more advanced SEO circles I believe.

jnem · on Jan 28, 2015

The main issue with JS and SEO is that most JS Frameworks we are talking about are used to create SPAs (Single Page Applications). Usability-wise, these are great; everybody hates page loads. SEO-wise, there is a problem because the search engines can’t see all the HTML because everything is being rendered client side.

Basically, with a traditional web page written in PHP, the HTML is constructed from a template. The template spits out HTML from the server to the client. Every time you click a new link to load a new page, it’s delivered to your client via a GET request (EDIT: Your client/browser "asks" for the file from the server, the server sends the requested document back to the client).

With a SPA, JS is doing some DOM manipulation, and you’re not making round-trips to the server to display new content. For example, if you were to look at the source of a SPA written in Angular, you might see some <div ng-view> </div> elements, but almost no actual html. The web crawlers would see something similar.

There are several tactics for circumventing this issue, and Im curious if the Pintrest team considered them during this experiment. Anybody on the team here?

dangayle · on Jan 28, 2015

Angular.js, courtesy of Google.

freebs · on Jan 28, 2015

I agree with you that this definitely a rendering content in JS problem.

Most of the SEO experiments with increasing page speed from bad to average or great have shown a very small increase in rankings. We're talking one small signal in the whole algorithm. In this case, trying to go from average or good page speed to great page speed just for rankings is quite foolish.

If they want page speed to be the best it can be, and they absolutely must render the whole page in JS, they should have at least tried to use PhantomJS or something else to render the page properly for googlebot.

There's always the traditional options that can increase speed: minimize the amount of JS, increase the quality of JS written, less HTTP requests, proper DB calls, image sizing, proper caching, etc.

pierrefar · on Jan 28, 2015

I work at Google in web search, and I have a few comments about this discussion.

Firstly, I think this whole discussion about page speed is the wrong way to approach it. The primary motivation for page speed should be user happiness, which affects the key metrics you care about like user acquisition, conversion, and revenue. The fact it's a (small) ranking signal is a nice benefit, a cherry on top. Here is a nice case study about page speed and user metrics from Lonely Planet:

http://cdn.oreillystatic.com/en/assets/1/event/88/Performanc...

The conversion rate graph on slide 9 is what pretty much every study that looks at performance and user engagement finds. And here is one from Google search about the effect of page speed on searchers:

http://googleresearch.blogspot.co.uk/2009/06/speed-matters.h...

Secondly, there could be other issues with how the experiment was conducted on a technical level:

1. The experiment was about using JavaScript. Was Googlebot allowed to crawl the JS files? If robots.txt blocked crawling, that would have translated to less content visible to Googlebot, and so less content to index, which can easily result in a loss of ranking.

Note that the JS file itself may have been crawlable but it may have made an API call that was blocked. Same end result in terms of indexing.

2. Related to (1), we only started rendering documents as part of our indexing process a few months ago. When was this experiment conducted? If it was before full rendering was the norm, it's very likely we didn't index JS-inserted content that we could now, which, again, may have resulted in lower ranking.

For both of these, using the Fetch and Render feature in Webmaster Tools gives you the definitive view of how our indexing system sees your content. Before running any such experiement, it's worth running a few tests using Fetch and Render.

compbio · on Jan 28, 2015

Rude: A site that optimizes for speed to gain an SEO boost, does not deserve this SEO boost. A site like Pinterest should optimize for speed, regardless of what Google says.

If you turn back an experiment meant to enhance usability and engagement, you show that you care more for search engines than for search engine users. You are of course free to place your priorities so that your business survives, but then you can't make a claim to the woes of a whitehat SEO and you will drown in a turmoil of mystification and algo update-fears.

I do not understand why they even experiment with javascript rendering to increase the page-speed when they do not even currently defer loading of javascript to </body>. There are a lot of requests that could be combined into a single request and they could save over 50%(!) of the resource size by minifying html and lossless compression of the pinned images on pinimg.com. The mobile site gets a pagespeed score of 58 / 100. This could be 90+ / 100. It should be 90+ / 100 before you even think of javascript rendering. And if you go down that route, better study WAI-ARIA, before you make your site inaccessible to some. Good SEO follows from good design, good development, good content strategy, good UX, good accessibility. All of these things can be optimized in isolation and will benefit SEO.

If by javascript rendering they meant one of those fancy endless scrolls, then it is likely they did not implement this correctly. You can expect a lot from a bot that is able to parse javascript, but not that it spends 2 minutes scrolling down with the mouse till it reaches the end of a board and has seen all the content.

admyral · on Jan 28, 2015

For some businesses I've worked with, SEO is their lifeblood. Risking potential revenue to "test" whether certain SEO strategies will yield positive results is not possible. As always, Google picks the winners and losers.

mhoad · on Jan 28, 2015

I do this for a living and I couldn't emphasise enough how much of a bad idea it is to build a business around SEO.

Is it a really important and potentially profitable thing to get right and do well at? Without any doubt the answer to that is yes in almost all cases.

Is it also at the same time where you should spend all of your marketing time and spend? No, not unless you want to spend the rest of your days worrying that the next major update will potentially wipe you out.

ssharp · on Jan 28, 2015

I find the advice of "don't rely too much on one channel" to be a bit insincere. Yes, that's good advice, but avoiding that trap, for many businesses, can be very hard. And when you're constantly trying to grow, increasing reliance on channels that provide growth is either difficult or stupid to avoid.

For channels where you don't have strong control, which are many -- SEO, social, etc. -- it's best to get as much overlap in channels you do control. So do things like convert organic and social traffic to emails as efficiently as possible. It's still going to hurt when you take hits, but you should be in a better position to rebound.

Also, when Genius got their spanking from Google last year, that advice was thrown around a lot. "Why are you so reliant on organic search?" For Genius, it makes 100% sense to be reliant on organic search. I search for lyrics a lot but almost never go to Genius directly. Why? It's not because they're bad, it's because it's easier to type in the Google bar "$song_title lyrics". And most of the time, Genius is right up there and that's who I click on because, in my opinion, they offer the best lyrics experience. And for people not familiar with Genius, they will always be most likely to become familiar with them from search results for "$song_title lyrics". It's a perfect fit.

dante9999 · on Jan 28, 2015

I don't think Google penalizes speed improvement. They penalize use of JavaScript for html rendering. Google bot probably cannot render all page content with JS, I'm sure in most cases it does not make all requests for static assets, it just makes GET to your url and parses html that is in response. If your initial response to first HTTP GET contains blank <body> that is filled dynamically by JavaScript Google bot will only see this blank body.

I'm pretty sure it would be completely inefficient for Google to measure page load times by making hundreds and hundreds of requsts to create full DOM representation.

pbowyer · on Jan 28, 2015

> and perhaps they're not properly weighting / aware of other important SEO signals that GOOG penalizes when using Javascript, but I'm sure they are. Happy to answer more on that directly via my profile or this thread.

I for one would very much like to see this discussed in this thread.

I've avoided SEO all my life (not determinate enough for my brain to handle) but I have one client (car hire) for whom SEO is their lifeblood, and I need to get over my discomfort and understand more.

IndianAstronaut · on Jan 28, 2015

>As a result a well equipped and well connected organization like Pinterest launches tests designed to improve this important signal.

One may think so, but I have worked at an organization where SEO was well known about but due to the lack of attention their recommendations were given by engineering teams, many of them left. The site subsequently suffered some penalties.

ckluis · on Jan 28, 2015

Solid fundamentals will help everything long term, but short term dealing with the BS is super frustrating.

z3t4 · on Jan 28, 2015

Google can not see JS-rendered stuff. They only do simple HTTP-requests. Googlebot doesn't even load the .js files ... Try searching your http log for \.js.*Googlebot

willu · on Jan 27, 2015

Unfortunately this only reinforces the mystical nature of SEO. 1. Why does Webmaster Tools tell you duplicate titles are a problem but changing them has no impact? 2. Why does repeating pin descriptions improve traffic drastically when we're told not to duplicate content? 3. Why do some changes have a lingering impact while others revert back to pre-change behavior?

That said, I applaud the scientific approach to coping with the black box.

freebs · on Jan 28, 2015

Webmaster tools will tell you about duplicate titles even if there are only a few. A few duplicate titles won't penalize your rankings. Perhaps this is your case. I've definitely seen results for sites that have had 60% of their page titles as dupes.

You also have to realize that it's very important on what you are changing the them to. You should have done keyword research to see how often people are searching for particular terms. If you changed the title to be a non-duplicate but used some obscure words that a normal person would never use, you won't see many results.

Titles are just one piece of the pie, they should be consistent with everything else on the entire page, including the meta description, headings, content, image alts, links, etc.

iopq · on Jan 28, 2015

The e-commerce site I worked with had the best relevance with the word "USD" since it was on every page. But you're not supposed to just repeat a word tons of times to get high relevance on a keyword, right?

motherwell · on Jan 28, 2015

What makes you think that the solution they implemented was a good one, and should have worked?

dangayle · on Jan 28, 2015

I agree. I took one look at their test and thought: That's missing the point. They actually don't care if the title tags would hash differently, they care if the title tags are descriptive of the content in such a way that the pages can be distinguished by a human or nlp bot.

But then again, what more could they do? Likely adding the board's owner would have had a much higher statistical relevance.

voyweb · on Jan 28, 2015

Think they could have had more fun experimenting with image indexing. Say you Google a nice ___location, for example: Edinburgh, Scotland. On that SERP there is two locations for images, the 5th and the knowledge box to the right has a map and image.

The first image in the 5th position image area is Wikipedia (hard to beat that), but the last three are local blogs and Flickr (easier to beat). The very last image is the same image used in the knowledge box which sits nicely in eye line with the 1st position SERP link.

After a quick bit of detective work I've found that in Google Images, Pinterest link back to their page but are not the image source.

Type into Images search: site:pinterest.com intitle:Edinburgh, Scotland

Back to the Edinburgh, Scotland SERP and looking at those images in the 5th position we can see that all images are both the page and source.

We can use the Flickr image that's third in the 5th position image area as grounds to warrant even a small experiment to test if the theory is correct. The theory being that if Pinterest was the page and source could they see a benefit from it reflected in their organic search traffic.

What Pinterest lack is content, which they stated in the post. What they don't lack is images and titles.

nlh · on Jan 27, 2015

I'm presuming the negative results they saw from "rendering with JavaScript" meant, specifically, they moved certain page rendering tasks to the client side vs. server side. (It wasn't explicitly clear that was the case, but implicitly so).

If so, that's a big reinforcement of the importance of server-side rendering for SEO purposes or, for you JavaScript fans, isometric applications.

I know this is talked about a lot anecdotally, but it's interesting to see it so starkly laid out in an experiment by a major site.

dmnd · on Jan 28, 2015

I'd love to hear about that experiment in more detail. People often cite Google's Understanding web pages better[1] as evidence that it's now OK to render everything with JS, but this is the first time I've seen someone publish actual evidence.

[1]: http://googlewebmastercentral.blogspot.com/2014/05/understan...

somberi · on Jan 28, 2015

When I spoke with some senior Google search guys, this is what I walked away with:

Be a good player. Provide good content that your users care about. Positively add to the Web. Google will find you.

IndianAstronaut · on Jan 28, 2015

Not all sites are about content. A friend of mine recently struggled with this. He is able to get top quality swords and has a cool selling platform. Problem is that this does him no good on search rankings.

schoen · on Jan 28, 2015

That's an interesting point conceptually: the best result for some purposes might be best for a reason that comes from the offline world. For example, if you want to buy rare things, you want to find the dealer with the best expertise and access to sources of those things. That dealer might have an incredibly minimal web site with almost no content -- maybe just contact information.

The original link-structure analysis idea in PageRank was meant to address issues like this a little bit: if everybody links to that dealer's page, it's a good suggestion that that dealer is important, regardless of the content of the page. But there are also things that people don't talk about on the web that much, or don't link to on the web that much (especially if they relate to a secretive, insular, or otherwise not-heavily-web-using community).

You could say it's no fair expecting search engines to know about social facts they can't possibly observe, but in any case it's a reminder of how complicated the idea of relevance or the best result really is!

weavie · on Jan 28, 2015

This is why a lot of SEO advice tends to be start a blog to run along side the selling platform. Write lots of stuff about quality swords to suck in the search engines. Really this shouldn't be necessary. If you want to buy a sword, ideally the SE would know you are looking for shops and not for info about swords. It would then rank according to customer satisfaction or something.. how it would determine this I'm not sure.

chromaton · on Jan 28, 2015

Reach out to sword fans who are active on sword forums, blogs, etc. and offer them a couple swords to review.

darkxanthos · on Jan 28, 2015

That's cute until your large business moves down in rankings and you start losing thousands of dollars because who knows why.

Apofis · on Jan 31, 2015

I wonder what they have to say about this:

https://medium.com/technology-musings/on-the-future-of-metaf...

lurchpop · on Jan 28, 2015

On the duplicate title test, I wonder if they saw no difference because they put the unique element after the pipe (e.g. "... on Pinterest | {pins}" ).

Maybe google ignores after the pipe because that's where people always put branding: {title} | {meaningless company name}.

sixQuarks · on Jan 28, 2015

This doesn't demystify SEO. There are just so many factors involved in SEO, and unknown factors. Something that works today may not work tomorrow. The only true guideline to go by is to create great content for humans, period.

julieahn · on Jan 28, 2015

I can't agree with you enough. I wish I closed my post with the exact word you wrote.

dangayle · on Jan 28, 2015

I agree with you about the great content for humans bit, but that doesn't mean that you shouldn't still aim to maximize your current traffic through technical means.

If it gives you a whitehat traffic increase of even a few percentage points, that can still mean a big deal, hundreds if not millions of dollars.

sixQuarks · on Jan 28, 2015

Here's the crazy thing about SEO - specifically SEO in relation to Google. This is a gut feeling based on over 10 years of building sites primarily with organic traffic:

I don't think you should try to do "best practices" with SEO. Over the past couple of years, I feel like Google is penalizing sites that try to dot all the i's and cross all the t's. And why shouldn't they? White hat SEO is still gaming the system in a way. In google's eyes, the pages that contain the very best content for humans should show up higher in search, despite them not being optimized for SEO.

I've been seeing more success with pages that are not optimized, pages for which I didn't pay any attention to SEO. The content on those pages are geared for humans and contain great info, that's it. I don't pay attention to URL, title tags, meta tags, etc. Google is getting very good at filtering this out, not sure if they have a team of humans that are whitelisting sites now, but I've given up on trying to optimize for SEO and it's worked wonders.

web007 · on Jan 28, 2015

Good to see AB testing applied to SEO instead of just UX changes. The tl;dr version:

A/B testing is:

  bucket(hash(experiment, user identifier))

A/B testing for SEO is:

  bucket(hash(experiment, url))

kanzure · on Jan 28, 2015

> bucket(hash(experiment, url))

Unfortunately it often seems to be per ___domain, and not per url. And then you have to factor in backlinks...

weavie · on Jan 28, 2015

I don't think I have ever come across PInterest by searching. Am I just searching for the wrong things? I thought PInterest was largely a glorified bookmarking service - what original content is there that the search engines could pick up?

JonLim · on Jan 28, 2015

Personal guess, as I never have either, but I'd hazard a guess that they show up for a lot of recipes, fashion, and interior decoration queries.

Kiro · on Jan 28, 2015

Are they comparing two URLs on the same ___domain? Is that really worthwhile? How much is it about the ___domain and how much about the single URL on the ___domain?

If I link to a specific URL, do I give PR to that URL or to the ___domain itself?

greyman · on Jan 28, 2015

To that URL. PR is always url-specific.

codezero · on Jan 28, 2015

Is there a chance that a site as large as Pinterest might have their search rankings dominated by some hand picked value rather than the many other factors that might affect a typical site?

lugg · on Jan 28, 2015

The time to index is certainly dependent on their size, but I don't see how this sort of a/b tests on SEO wouldn't work on other sites.

The sample size of pages they have to play with, and the fact that they get indexed within a couple of days does make it all the more viable though.

wyck · on Jan 28, 2015

This is pretty meaningless in context, when you visit pinterest the site is login gated. Sure that might look good on paper but it's a short term strategy.

dchuk · on Jan 28, 2015

The homepage might be, but they have 184,000,000 pages indexed in Google: https://www.google.com/search?q=site%3Apinterest.com&oq=site...

butler14 · on Jan 28, 2015

Whoever wrote this doesn't understand what a strategy is or a tactic is. Makes reading on difficult.

compbio · on Jan 28, 2015

Pinterest likely cloaks traffic. Internal site traffic to a pinboard will require a log-in/register to continue viewing the board. Traffic from the Google index is allowed to continue viewing the board. That is treating search engines differently than human users (unless they throw up this log-in wall for crawling Googlebots too, which would severely hamper crawl-ability of the site).

You do not change the page titles of a site so you can get a few more visitors from Google's algorithm, you change the page titles of a site, because they are ambiguous for all your users. If you want to create more unique page titles you can credit the username that created the board to the pagetitle, instead of a meaningless and everchanging "number of pins on this board". For example "Mickey Mouse on Pinterest by John Doe" or "Mickey Mouse | John Doe | Pinterest".

You run A/B tests to test if user engagement with the site increases. If you run A/B tests to test if certain changes increase your search engine rankings/Google visitors then you are reverse engineering Google. Especially with a large site like Pinterest this may gain you some ill-gotten benefit over sites that do play nice:

"If we discover a site running an experiment for an unnecessarily long time, we may interpret this as an attempt to deceive search engines and take action accordingly." [1]

Even on a site like Pinterest I see low-hanging on-page SEO stuff that could be implemented better. For instance the header for a pinboard starts at line 788. Proper content stacking/HTML code ordering ensures that information retrieval bots do not have to wade through many menu's of boiletplate text, before they get to the unique meat of the page.

There is basically one single way to do legit SEO and most of the tips and techniques for that are transparently written in the Google Webmaster Guidelines [3]. The good news is that this has not changed much at all over the years, so one can stop algo chasing, and start improving the site for all users and all search engines.

BTW: The blog has no canonical tag [2] and puts the _entire_ article inside the contents of '<meta name="twitter:description"'.

[1] http://googlewebmastercentral.blogspot.nl/2012/08/website-te...

[2] http://googlewebmastercentral.blogspot.nl/2009/02/specify-yo...

[3] https://support.google.com/webmasters/answer/35769?hl=en "Following these guidelines will help Google find, index, and rank your site."

franze · on Jan 28, 2015

first: i will not comment on the actual findings teased in this blog post, because we miss lots of information, data and context (javascript to make rendering faster, was it really the first pageview that was faster or was this aimed at the second, client side rendering actually makes rendering of the first pageview slower (please, proof me wrong))

second: this is the way SEO should be done - a systematic analytics dev. dirven approach - and they solved one of the challenges big sites regularly face SEO wise: running multiple onpage (SEO is just one aspect) tests simultaneously over chunks of their sites.

most of the time you are stuck with setting a custom variable (or virtual tracker) in google analytics of the pages you changed (and a control group) the issue with this approach is that GA only reports a sample of data (50 000 rows a day) and for big sites this sample becomes insignificant very fast, especially if you run tests. additionally it's not easy to compare the traffic figures of the tracked page-group with log-data like crawling, so you need a custom built solution to connect these dots.

this leads us to a serious limitation of the GA and pinterest approach: connecting their data with google serp impressions, average rankings and clicks. yeah, traffic is the goal of SEO, but it is pretty late in the funnel, crawling is pretty early in the funnel, you can optimize everything in between. for the in between we are stuck with google webmaster tools for reliable data (at least it's data directly from google and not some third party). so to get most out of such tests you must set them up in a way that they traceable via google webmaster tools.

and to make something traceable in google webmaster tools basically means you have to sice and dice them via namespaces in the URL.

simple setup

   www.example.com/ -> verify in google webmaster tools
   www.example.com/a/ -> verify in google webmaster tools to get data only for this segment
   www.example.com/b/ -> verify in google webmaster tools, ...
   ...

make tests on /a/ -> if it performs better than the rest of the site, good

the issue there is that to have a control group you need basically move a comparable chunk of the site to a new namespace i.e. /z/ and site redirects are their own hassle but well on big sites most of the time are worth it. also you don't have to move millions of pages most of the time a sample on the scale of 50 000 pages is enough (p.s.: every (test) segment should of course have it's own sitemap.xml to get communicated / indexed data)

one more thing: doing positive result tests it actually quite hard - doing negative result tests is much easier. make a test group of pages slow, see how your traffic plumbles. make your titles duplicate, see your traffic plumble, ... yeah, these tests suck business wise, from an SEO and development point of view they are a lot of fun.

shameless plug: hey pinterest, check out my contacts on my profile. the goal of my company is to make all SEO agencies - including my own - redundant. we should do stuff.

butler14 · on Jan 28, 2015

This reads like a poor attempt at a software engineer dabbling with SEO. 'Growth' team indeed.

Hire an SEO - or at least a digital marketer with SEO credentials - and do some proper optimisation.