Ari Nahmani covers the latest in advanced technical SEO at SMX Munich (Muenchen) 2016. The talk discusses the deprecated HTML snapshot, JavaScript crawlability and indexing, new frameworks, prerendering, server-side rendering, prerender.io, isomorphic JavaScript, and other technical issues related to protecting your index health in the future.
11. Today’s Session
• Technical SEO issues around e-commerce /
large site architecture
• Preventing index bloat & preserving crawl
budget as a core methodology
• Current solutions & upcoming threats (JS,
AJAX, new frameworks, pre-rendering)
23. Index Bloat Prevention: Sorts & Filters
<link rel="canonical" href="http://www.site.com/guys/tees/" />
• Basic Solution: Strip out the unnecessary
parameters
24. Solution: Filtering Out All Facet Params
• PROS:
– Avoids diluted / dupe URLs (the canonical is a request, not a directive)
• CONS:
– If you want/need specific parameters indexed and exposed (size, color), you need properly coded canonical tag logic; otherwise it is a recipe for major leaks and confusion.
– Considerations w/ pagination & view-all page
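The canonical logic described above can be sketched as a small URL normalizer. This is only an illustration under assumptions: the allowlist `INDEXABLE_PARAMS` and the function name `canonical_url` are hypothetical, and the parameters you keep depend on which facets you actually want indexed.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Hypothetical allowlist: the only facet parameters we want indexed.
INDEXABLE_PARAMS = ("color", "size")


def canonical_url(url, allowed=INDEXABLE_PARAMS):
    """Return url with every query parameter dropped except the allowed ones."""
    parts = urlsplit(url)
    kept = [(k, v) for k, v in parse_qsl(parts.query) if k in allowed]
    # Sort so the same facet combination always yields one canonical URL,
    # regardless of the order the parameters arrived in.
    kept.sort()
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


print(canonical_url("http://www.site.com/guys/tees/?sort=price&size=m&color=blue"))
# prints http://www.site.com/guys/tees/?color=blue&size=m
```

Passing an empty allowlist (`allowed=()`) gives the basic solution of stripping all facet parameters.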
31. Index Bloat Prevention: JS + AJAX
AJAX Refinement V1 - NO URL CHANGE,
but inactive, different href= URL exists
32. AJAX Facet Refinements V1 (NO URL CHANGE)
• PROS:
– Theoretically no parameters exposed to bloat the
index
• CONS:
– Users can’t share refined / filtered content to
friends, no accurate bookmarking. (Terrible UX)
– Googlebot will still crawl hidden href= links or other JS framework links like Angular's ng-href= (check canonical logic!)
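One way to find those hidden links before Googlebot does is to scan the page source for both attribute types. The sketch below is an assumption-laden illustration, not a full crawler: the class name `LinkAudit`, the helper `find_links`, and the sample markup are all hypothetical, and it only inspects `<a>` tags in static HTML.

```python
from html.parser import HTMLParser


class LinkAudit(HTMLParser):
    """Collect URLs from href= and ng-href= attributes on <a> tags."""

    LINK_ATTRS = ("href", "ng-href")

    def __init__(self):
        super().__init__()
        self.found = []  # list of (attribute name, URL) pairs

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name in self.LINK_ATTRS and value:
                self.found.append((name, value))


def find_links(html):
    """Return every (attribute, URL) pair found in the given HTML string."""
    audit = LinkAudit()
    audit.feed(html)
    return audit.found


# Hypothetical facet markup: one plain link, one Angular-style link.
sample = (
    '<a href="/guys/tees/?color=blue">Blue</a>'
    '<a ng-href="/guys/tees/?size=m">M</a>'
)
for attr, url in find_links(sample):
    print(attr, url)
```

Each URL found this way can then be checked against the canonical it is supposed to resolve to.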
37. Index Bloat Prevention: JS + AJAX
Google preferred the pushState URL version; we had to reinforce it
(via normal inline href=, canonical, XML sitemap)
38. AJAX Facet Refinements V2 (PushState URL Change)
• PROS:
– Users can now share / bookmark the correct content
– Added to browser history
• CONS:
– Still need to have a consistent canonical structure due to Googlebot crawling pushState() URLs
– A different hidden URL structure via AJAX facets may require further unpredictable canonicalization logic / further dev work
49. Indexing AJAX & JS: How To Decide?
• HTML Snapshot (_escaped_fragment_=)
– ‘Dumbed down’ HTML template
– Pre-rendered (to bots): 3rd party service (prerender.io) or server side (PhantomJS / headless browser)
• Pre or realtime rendered (to users & bots)
• Trust Googlebot (VALIDATE!)
• Progressive enhancement
51. Indexing AJAX & JS: HTML Snapshot
• Upon crawl of URL with _escaped_fragment_=, serve ‘dumbed down’ HTML version of page.
• Not pre-rendered, rather simplified.
• For example, on e-commerce → a view-all category listing with no dynamic facets.
Amazing results from our clients.
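Detecting a snapshot request comes down to checking the query string for the `_escaped_fragment_` parameter defined by the (now deprecated) AJAX crawling scheme, which maps a `#!state` URL to `?_escaped_fragment_=state`. The following is a minimal sketch of that check and of the reverse mapping; the function names are hypothetical, and how the simplified template is actually served is left out.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

PARAM = "_escaped_fragment_"


def is_snapshot_request(url):
    """True when a crawler asks for the HTML snapshot of a page."""
    query = urlsplit(url).query
    # keep_blank_values=True so that "?_escaped_fragment_=" still counts.
    return any(k == PARAM for k, _ in parse_qsl(query, keep_blank_values=True))


def pretty_url(url):
    """Map ...?_escaped_fragment_=state back to the ...#!state URL users see."""
    parts = urlsplit(url)
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    state = next((v for k, v in pairs if k == PARAM), "")
    rest = [(k, v) for k, v in pairs if k != PARAM]
    fragment = "!" + state if state else ""
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(rest), fragment))


crawl = "http://www.site.com/guys/tees/?_escaped_fragment_=color=blue"
print(is_snapshot_request(crawl))
print(pretty_url(crawl))
```

A request handler could branch on `is_snapshot_request` and return the simplified view-all template instead of the JavaScript application.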
53. Indexing AJAX & JS: Pre-rendering
Upon crawl of a URL with _escaped_fragment_=, either:
1. prerender.io – middleware via reverse proxy that serves a pre-rendered, cached HTML page to bots
OR
2. Server side – the server pre-renders the JS into cached HTML pages to serve to bots, or renders it in real time (headless browser).
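The difference between serving a cached pre-render and rendering in real time can be illustrated with a small time-based cache. This is a sketch under stated assumptions and not how prerender.io or any particular server is implemented: `SnapshotCache` and `fake_render` are hypothetical names, and `fake_render` stands in for whatever headless browser or service call actually produces the HTML.

```python
import time


class SnapshotCache:
    """Cache rendered HTML per URL; re-render once an entry is older than ttl seconds."""

    def __init__(self, render, ttl=3600.0, clock=time.monotonic):
        self.render = render  # hypothetical callable: url -> rendered HTML
        self.ttl = ttl
        self.clock = clock
        self._store = {}  # url -> (timestamp, html)

    def get(self, url):
        now = self.clock()
        hit = self._store.get(url)
        if hit is not None and now - hit[0] < self.ttl:
            return hit[1]  # pre-rendered copy, served from cache
        html = self.render(url)  # cache miss or stale entry: render in real time
        self._store[url] = (now, html)
        return html


def fake_render(url):
    # Stand-in for a headless browser or a prerendering service call.
    return "<html><body>snapshot of " + url + "</body></html>"


cache = SnapshotCache(fake_render, ttl=60)
print(cache.get("http://www.site.com/guys/tees/"))
```

A short `ttl` behaves closer to real-time rendering, while a long one behaves like a pre-rendered cache; the right value depends on how often the underlying listing changes.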
60. Indexing AJAX & JS: Server Side
bit.ly/javascriptseo
75. Summing It Up
• Index Bloat, Crawl Budget, & Testing: Large sites are prone to serious index bloat and wasted crawl budget. Preventing both needs diligent testing and an OCD-like attention to detail with the basics. Test often & automate!
• JS/AJAX: pushState(), JS frameworks, and AJAX present both discovery and bloat challenges. Know the options: short-term fixes like the HTML snapshot (G+B), and long-term redesigns with modern frameworks with built-in server-side rendering.
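As one example of the kind of check that can be automated, the sketch below verifies that a page carries exactly one canonical tag and that it points where you expect. The names `CanonicalFinder` and `check_canonical` are hypothetical, and the check runs on static HTML only, so it says nothing about tags injected by JavaScript.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collect the href of every <link rel="canonical"> tag in a page."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        rels = (attr.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rels and attr.get("href"):
            self.canonicals.append(attr["href"])


def check_canonical(html, expected):
    """Return None if the page is fine, otherwise a short problem description."""
    finder = CanonicalFinder()
    finder.feed(html)
    if len(finder.canonicals) != 1:
        return "expected exactly one canonical tag, found %d" % len(finder.canonicals)
    if finder.canonicals[0] != expected:
        return "canonical is %s, expected %s" % (finder.canonicals[0], expected)
    return None


page = '<head><link rel="canonical" href="http://www.site.com/guys/tees/" /></head>'
print(check_canonical(page, "http://www.site.com/guys/tees/"))
```

Run against a list of filtered and sorted URLs on a schedule, a check like this can flag canonical regressions early.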