Making AJAX crawlable
Katharina Probst, Engineer, Google
Bruce Johnson, Engineering Manager, Google
In collaboration with: Arup Mukherjee, Erik van der Poel, Li Xiao, Google

The problem of AJAX for web crawlers
  • Web crawlers don't always see what the user sees
  • JavaScript produces dynamic content that is not seen by crawlers
  • Example: a Google Web Toolkit application that looks like this to a user... (screenshot not reproduced) ...but a web crawler only sees this: <script src='showcase.js'></script>

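To make the example above concrete, here is a minimal sketch of a crawler that reads HTML without executing JavaScript. The `VisibleText` class and `visible_text` helper are hypothetical names introduced for illustration; real crawlers are far more elaborate. It only shows that a page consisting of a script tag yields no indexable text.

```python
from html.parser import HTMLParser


class VisibleText(HTMLParser):
    """Collects text outside <script> elements, as a non-executing crawler might."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._in_script = False

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self._in_script = True

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_script = False

    def handle_data(self, data):
        # Script source is never executed, so it contributes no content.
        if not self._in_script and data.strip():
            self.chunks.append(data.strip())


def visible_text(html: str) -> str:
    """Return the text a crawler would see without running any JavaScript."""
    parser = VisibleText()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


# The page from the slide: everything the user sees is produced by showcase.js.
print(repr(visible_text("<script src='showcase.js'></script>")))
```
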
Why does this problem need to be solved?
  • Web 2.0: more content on the web is created dynamically (~69%)
  • Over time, this hurts search
  • Developers are discouraged from building dynamic apps
  • Not solving AJAX crawlability holds back progress on the web!

A crawler's view of the web, with and without AJAX
  • (diagram not reproduced)

Goal: crawl and index AJAX
  • Crawling and indexing AJAX is needed for users and developers
  • Problem: which AJAX states can be indexed? Explicit opt-in is needed by the web server
  • Problem: we don't want to cloak. Users and search engine crawlers need to see the same content
  • Problem: how could the logistics work? That's the remainder of the presentation

Possible solutions
  • Option 1: crawlers execute all the web's JavaScript
      • This is expensive and time-consuming
      • Only major search engines would even be able to do this, and probably only partially
      • Indexes would be more stale, resulting in worse search results
  • Option 2: web servers execute their own JavaScript at crawl time
      • Avoids the above problems
      • Gives more control to webmasters
      • Can be done automatically
      • Does not require ongoing maintenance

Overview of proposed approach - crawl time
  • Crawling is enabled by mapping between
      • "pretty" URLs: www.example.com/page?query#!mystate
      • "ugly" URLs: www.example.com/page?query&_escaped_fragment_=mystate

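The pretty-to-ugly mapping can be sketched as a small function. This is an illustration under stated assumptions, not the proposal's reference implementation: the name `pretty_to_ugly` is hypothetical, and percent-encoding the state is an assumption, since the slides do not say how special characters are handled.

```python
from urllib.parse import quote


def pretty_to_ugly(url: str) -> str:
    """Map a '#!' URL to its '_escaped_fragment_' form; other URLs are returned unchanged."""
    base, marker, state = url.partition("#!")
    if not marker:
        return url  # no '#!' marker, so the site has not opted in for this URL
    # Append to an existing query string with '&', otherwise start one with '?'.
    separator = "&" if "?" in base else "?"
    # Assumption: the state is percent-encoded so it survives as a query value.
    return f"{base}{separator}_escaped_fragment_={quote(state, safe='')}"


print(pretty_to_ugly("www.example.com/page?query#!mystate"))
```

A plain `#` fragment without the `!` is left alone, which matches the idea that indexable states are an explicit opt-in.
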
Overview of proposed approach - search time
  • Nothing changes!

Agreement between participants
  • Web servers agree to
      • opt in by indicating indexable states
      • execute JavaScript for ugly URLs (no user agent sniffing!)
      • not cloak, by always giving the same content to browser and crawler regardless of request (or risk elimination, as before)
  • Search engines agree to
      • discover URLs as before (Sitemaps, hyperlinks)
      • modify pretty URLs to ugly URLs
      • index content
      • display pretty URLs

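The web server's side of the agreement might look roughly like the sketch below. It decides what to return from the URL alone and never inspects the user agent. `SNAPSHOTS`, `SHELL`, and `respond` are hypothetical names, and the lookup table stands in for a server that actually executes its own JavaScript to produce the content for a state.

```python
from urllib.parse import parse_qs, urlsplit

# Hypothetical pre-rendered states; a real server would run its own JavaScript
# to produce the same content a browser shows for that state.
SNAPSHOTS = {"GOOG": "<html><body>Quote for GOOG</body></html>"}

# The regular page, whose content is filled in client-side.
SHELL = "<html><body><script src='showcase.js'></script></body></html>"


def respond(request_url: str) -> str:
    """Choose the response from the URL only; the user agent is never consulted."""
    query = parse_qs(urlsplit(request_url).query, keep_blank_values=True)
    if "_escaped_fragment_" not in query:
        return SHELL  # ordinary request: serve the JavaScript application
    state = query["_escaped_fragment_"][0]
    # Assumption: unknown states fall back to the regular page.
    return SNAPSHOTS.get(state, SHELL)


print(respond("http://example.com/stocks.html?_escaped_fragment_=GOOG"))
```

Because the decision depends only on the requested URL, a browser and a crawler asking for the same ugly URL receive the same content, which is the no-cloaking condition above.
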
Summary: Life of a URL
  • http://example.com/stocks.html#GOOG could easily be changed to
  • http://example.com/stocks.html#!GOOG which can be crawled as
  • http://example.com/stocks.html?_escaped_fragment_=GOOG but will be displayed in the search results as
  • http://example.com/stocks.html#!GOOG

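The last step of the URL's life, turning the crawled form back into the form shown in search results, can be sketched as follows. The function name `ugly_to_pretty` is hypothetical, and the sketch assumes `_escaped_fragment_` is the last query parameter and that its value was percent-encoded; neither detail is stated on the slides.

```python
from urllib.parse import unquote


def ugly_to_pretty(url: str) -> str:
    """Map an '_escaped_fragment_' URL back to its '#!' form for display."""
    # Assumption: the parameter is last, either as the whole query or appended with '&'.
    for separator in ("?_escaped_fragment_=", "&_escaped_fragment_="):
        base, marker, state = url.partition(separator)
        if marker:
            return f"{base}#!{unquote(state)}"
    return url  # not an ugly URL, leave it untouched


print(ugly_to_pretty("http://example.com/stocks.html?_escaped_fragment_=GOOG"))
```
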
Feedback is welcome
  • We are currently working on a proposal and prototype implementation
  • Check out the blog post on the Google Webmaster Central Blog: http://googlewebmastercentral.blogspot.com
  • We welcome feedback from the community at the Google Webmaster Help Forum (link is posted in the blog entry)

