Why is it important?
Both are right.
How HTTP requests work
Ultimately, this led to a bad page experience due to the higher loading times, and it needed to be addressed. When the changes started being made, the impact on the website's crawling, indexing, and speed performance was really noticeable.
What is the DOM?
DOM stands for Document Object Model:
- Document: This is the web page itself.
- Object: Every element on the web page (e.g. the head and body elements, headings, paragraphs, and images).
- Model: Describes the hierarchy within the document (e.g. the <title> goes into the <head>).
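A tiny, made-up page illustrates that hierarchy:

```html
<!-- Document: the web page -->
<html>
  <head>
    <!-- Object: each element, such as the title -->
    <title>Black cats</title>
  </head>
  <body>
    <!-- Model: the h1 and p live inside the body, the title inside the head -->
    <h1>Black cats</h1>
    <p>Everything about black cats.</p>
  </body>
</html>
```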
The issues this presents
If it took your browser several seconds to fully render the web page, and the page source didn’t contain much body content, then how will search engines figure out what this page is about?
They’ll need to render the page, similar to what your browser just did, but without having to display it on a screen. Search engines use a so-called “headless browser.”
Back in July 2016, Google shared how many documents they knew about, and it's safe to say that since then, that number has increased massively.
Google simply doesn’t have the capacity to render all of these pages. They don’t even have the capacity to crawl all of these pages—which is why every website has an assigned crawl budget.
Websites have an assigned render budget as well. This lets Google prioritize their rendering efforts, meaning they can dedicate more time to rendering pages that they expect visitors to search for more often.
Note: we keep talking about "search engines," but from here on out, we'll be focusing on Google specifically. Mind you, Bing—which also powers Yahoo and DuckDuckGo—handles JavaScript in a similar way, but since their market share is much smaller than Google's, we'll stick with Google.
The illustration above explains Google's processes from crawling to ranking conceptually. It has been greatly simplified; in reality, many more (sub)processes are involved.
We’ll explain every step of the process:
1. Crawl Queue: keeps track of every URL that needs to be crawled, and it is updated continuously.
2. Crawler: when the crawler ("Googlebot") receives URLs from the Crawl Queue, it requests their HTML.
3. Processing: the HTML is analyzed, and:
   a) URLs found are passed on to the Crawl Queue for crawling.
   b) The need for indexing is assessed—for instance, if the HTML contains a meta robots noindex, then it won't be indexed (and won't be rendered either!). The HTML is also checked for any new and changed content; if the content didn't change, the index isn't updated.
   c) URLs that need to be rendered are passed on to the Render Queue. Please note that Google can already use the initial HTML response while rendering is still in progress.
   d) URLs are canonicalized (note that this goes beyond the canonical link element; other canonicalization signals, such as XML sitemaps and internal links, are taken into account as well).
4. Render Queue: keeps track of every URL that needs to be rendered, and—similar to the Crawl Queue—it's updated continuously.
5. Renderer: when the renderer (Web Rendering Service, or "WRS" for short) receives URLs, it renders them and sends the rendered HTML back for processing. Steps 3a, 3b, and 3d are repeated, but now using the rendered HTML.
6. Index: content is analyzed to determine relevance, structured data, and links, and PageRank and layout are (re)calculated.
7. Ranking: the ranking algorithm pulls information from the index to provide Google users with the most relevant results.
When pages rely heavily on JavaScript, Google can't fully process them until they have been rendered. Until then, Google can't:
- forward URLs that need to be crawled to the Crawl Queue, or
- forward information that needs to be indexed to the Index phase.
This makes the whole crawling and indexing process very inefficient and slow.
Imagine having a site with 50,000 pages, where Google needs to do a double-pass and render all of those pages. That doesn’t go down great, and it negatively impacts your SEO performance—it will take forever for your content to start driving organic traffic and deliver ROI.
Rest assured, when you continue reading you’ll learn how to tackle this.
Avoid search engines having to render your pages
Based on the initial HTML response, search engines need to be able to fully understand what your pages are about and what your crawling and indexing guidelines are. If they can’t, you’re going to have a hard time getting your pages to rank competitively.
Include essential content in the initial HTML response
If you can't prevent your pages from needing to be rendered by search engines, then at least make sure essential content, such as the title and meta elements that go into the head, is included in the initial HTML response. This enables Google to get a good first impression of your page.
First impressions matter a lot. Does yours look good?
All pages should have unique URLs
Every page on your site needs to have a unique URL; otherwise Google will have a really hard time exploring your site and figuring out what your pages need to rank for.
Don’t use fragments in URLs to load new pages, as Google will mostly ignore these. While it may be fine for visitors to check out your “About Us” page on
https://example.com#about-us, search engines will often disregard the fragment, meaning they won’t learn about that URL.
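As a quick sketch of the difference (the paths are placeholders):

```html
<!-- Risky: the "page" only lives behind a fragment and is loaded with JavaScript -->
<a href="#about-us">About us</a>

<!-- Better: a unique, crawlable URL -->
<a href="https://example.com/about-us">About us</a>
```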
Include navigational elements in your initial HTML response
All navigational elements should be present in the HTML response. Including your main navigation is a no-brainer, but don’t forget about your sidebar and footer, which contain important contextual links.
And especially in eCommerce, this one is important: pagination. While infinite scrolling makes for a cool user experience, it doesn’t work well for search engines, as they don’t interact with your page. So they can’t trigger any events required to load additional content.
Here’s an example of what you need to avoid, as it requires Google to render the page to find the navigation link:
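A sketch of that pattern, where the target URL is only known once JavaScript runs (the handler name is made up for the example):

```html
<!-- The URL is hidden inside JavaScript, so Google has to render the page to discover it -->
<span onclick="goToPage('/category/shoes')">Shoes</span>
```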
Instead, do this:
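For instance, a plain anchor element that exposes the URL in the initial HTML:

```html
<!-- The URL sits right in the HTML, no rendering required -->
<a href="/category/shoes">Shoes</a>
```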
Fixing it after you go live is an option, but you might find that your organic performance has already deteriorated.
Google won't find and follow onclick or button-type links unless they've rendered the page. If you want Google to reliably find and follow your links, they need to be presented in plain HTML, as good, clean internal linking is one of the most critical things for Google.
Send clear, unambiguous indexing signals
Meta robots directives
- If you've got a <meta name="robots" content="noindex, follow" /> included in the initial HTML response that you've overwritten with an index directive using JavaScript, Google is unlikely to ever see the change: when they encounter the noindex, they decide not to spend precious rendering resources on the page. On top of that, even if they were to discover that the noindex has been changed to index, Google generally adheres to the most restrictive directive, which is the noindex in this case.
- But what if you do it the other way around, having an index directive in the initial HTML response that you overwrite with a noindex using JavaScript? In that case, Google is likely to just index the page, because it's allowed according to the initial HTML response. Only after the page has been rendered does Google find out about the noindex and remove the page from its index. So for a (brief) period of time, a page which you didn't want to be indexed was in fact indexed and possibly even ranking.
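As a concrete sketch of the first scenario, which is exactly the setup to avoid:

```html
<head>
  <!-- Initial HTML response: Google sees this and may skip rendering the page entirely -->
  <meta name="robots" content="noindex, follow" />
  <script>
    // Overwriting the directive with JavaScript comes too late: Google either never
    // renders the page, or sticks with the more restrictive noindex anyway.
    document.querySelector('meta[name="robots"]')
      .setAttribute('content', 'index, follow');
  </script>
</head>
```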
Find out by comparing the initial HTML to the rendered HTML and prevent nasty SEO surprises!
Overwriting canonical links causes similar mayhem.
Google's John Mueller has said: "We (currently) only process the rel=canonical on the initially fetched, non-rendered version." And Martin Splitt has explained that this "undefined behaviour" leads to guesswork on Google's part—and that's really something you should avoid.
rel="nofollow" link attribute value
The same goes for adding the rel="nofollow" link attribute value through JavaScript when it's absent from the initial HTML response: Google may have already crawled the link before the page is rendered. Again, this is a waste of crawl budget and only leads to confusion.
The other way around, having the nofollow in the initial HTML response and removing it through JavaScript, doesn't work well either: Google only discovers the change after rendering the page, if it renders the page at all.
Don't forget to include other directives in your initial HTML response as well, such as:
- link rel="prev"/"next" attributes
- link rel="alternate" hreflang attributes
- link rel="alternate" mobile attributes
- link rel="amphtml" attribute
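A minimal sketch of what these can look like in the initial HTML response (all URLs are placeholders):

```html
<head>
  <link rel="prev" href="https://example.com/products/" />
  <link rel="next" href="https://example.com/products/?page=3" />
  <link rel="alternate" hreflang="de" href="https://example.com/de/products/" />
  <link rel="alternate" media="only screen and (max-width: 640px)" href="https://m.example.com/products/" />
  <link rel="amphtml" href="https://example.com/products/amp/" />
</head>
```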
Watch your status codes as well; for instance, a /404 page that returns the status code 200 while loading resources sends conflicting signals. Also, heavy inlined scripts early in the <head> increase the time required for the bot to crawl the page. Since all resources are parsed in order of appearance in the code, it is also important to ensure that crucial page information like metadata and critical CSS files are not being delayed by inlined scripts and are placed near the top of the <head>.
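For example, a sketch of a <head> where the crucial metadata and critical CSS come before any scripts (file names are placeholders):

```html
<head>
  <meta charset="utf-8">
  <title>Black cats | Example store</title>
  <meta name="description" content="Everything about black cats.">
  <link rel="canonical" href="https://example.com/black-cats/">
  <!-- Critical CSS inlined early, so rendering isn't held up -->
  <style>/* critical CSS goes here */</style>
  <!-- Scripts come last and are deferred, so they don't delay the metadata above -->
  <script src="/js/app.js" defer></script>
</head>
```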
What Googlebot cannot render well, it cannot index well, so make sure that the content and links you want to be crawled are found in the rendered version of your page. For the content to get its full weight for ranking purposes, also ensure that it is "visible" on page load, without user interactions (scroll, click, etc.).
Optimizing for the various speed metrics ensures that the rendering process does not "time out", which would leave search engines with an incomplete picture of your page, or worse, of your entire site. The faster search engines can render your pages, the more they tend to be willing to crawl.
While optimizing for the critical rendering path and limiting JS bundle sizes help, also audit the impact of third-party scripts on your browser’s main thread as they can significantly impact performance. A sports car towing a trailer will still feel painfully slow.
Leverage code splitting and lazy loading
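As a minimal sketch of code splitting via a dynamic import, where a hypothetical reviews widget (the module path and function name are made up) is only fetched when the visitor asks for it instead of being shipped in the main bundle:

```html
<button id="show-reviews">Show reviews</button>
<div id="reviews"></div>
<script type="module">
  // The reviews module is split out of the main bundle and only loaded on demand.
  document.querySelector('#show-reviews').addEventListener('click', async () => {
    const { renderReviews } = await import('/js/reviews.js');
    renderReviews(document.querySelector('#reviews'));
  });
</script>
```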
Implement image lazy loading with the loading attribute
Lazy-loading images is a great way to improve page load speed, but you don’t want Google to have to fully render a page to figure out what images are included.
Here's an example of an image that's lazy-loaded using the loading attribute:
<img src="/images/cat.png" loading="lazy" alt="Black cat" width="250" height="250">
By including images via the loading attribute, you get the best of both worlds:
- Search engines are able to extract the image URLs directly from the HTML (without having to render it).
- Your visitors’ browsers know to lazy-load the image.
Don’t assume everyone has the newest iPhone and access to fast internet
Don't make the mistake of assuming everyone is walking around with the newest iPhone and has access to 4G and a strong WiFi signal. Plenty of your visitors are on older devices and slower connections, so be sure to test your site's performance on different and older devices—and on slower connections. And don't just rely on lab data; instead, rely on field data.
While there are lots of rendering options (e.g. pre-rendering) out there, covering them all is outside the scope of this article. Therefore, we'll cover the most common rendering options that help provide search engines (and users!) with a better experience:
- Server-side rendering
- Dynamic rendering
Server-side rendering is the process of rendering web pages on the server before sending them to the client (browser or crawler), instead of just relying on the client to render them.
Pros:
- Every element that matters for search engines is readily available in the initial HTML response.
- It provides a fast First Contentful Paint ("FCP").

Cons:
- Slow Time to First Byte ("TTFB"), because the server has to render web pages on the fly.
Dynamic Rendering means that a server responds differently based on who made a request. If it’s a crawler, the server renders the HTML and sends that back to the client, whereas a visitor needs to rely on client-side rendering.
This rendering option is a workaround and should only be used temporarily. While it sounds like cloaking, Google doesn't consider it cloaking as long as the dynamic rendering produces the same content for both request types.
Pros:
- Every element that matters for search engines is readily available in the initial HTML response sent to search engines.
- It's often easier and faster to implement.

Cons:
- It makes debugging issues more complex.
Don't take shortcuts when serving Google a server-side rendered version. After years of saying they could handle CSR (client-side rendered) websites perfectly, Google is now actively promoting dynamic rendering setups. This is also beneficial for non-Google crawlers, like competing search engines that can't handle JavaScript yet, and for social media preview requests.
Make sure to always include basic SEO elements, internal links, structured data markup, and all textual content within the initial response to Googlebot. Set up proper monitoring for the detection of Googlebot (and others), since you don't want to take any risks. Google may add new IP ranges, new user agents, or a combination of the two. Not all providers of pre-baked solutions are as fast as they should be in keeping up to date with identifying Googlebot.
Let's take a step back. Are these pages in the index? Structured data markup is an additional index processing phase. Its appearance requires clean technical signals during crawling, effective rendering, and unique value for the index.
Your job as an SEO is to make sure search engine bots understand your content. In my experience, it's a best practice to have a hybrid model where the content and important elements for SEO are delivered server-side rendered, and then you sprinkle all the UX/CX improvements for visitors on top as a client-side rendered "layer". This way you get the best of both worlds, in an organized manner. You can also choose to combine rendering options to cover the full spectrum of what's possible.
It’s essential you separate content, links, schema, metadata, etc. from visitor improvements. That way you deliver greatness.
How do I check what my rendered pages look like?
You can use Google's testing tools to fetch and test a page, showing you what your rendered page looks like in a screenshot.
Don’t worry about your page getting cut off in the screenshot; that’s totally fine. You can also check what your rendered pages look like using Google Search Console’s URL Inspection tool.
Note that you can also access the HTML tab, which shows you the rendered HTML. This can be helpful for debugging.
Important page elements that you'd assume would show up in the rendered HTML, such as page titles, meta descriptions, and internal links from menus and breadcrumbs, are often not consistent across templates. Check your templates, specifically for differences between the initial HTML and the rendered HTML!
What about social media crawlers?
Social media crawlers like those from Facebook, Twitter, LinkedIn, and Slack need to have easy access to your HTML as well so that they can generate meaningful snippets.
If they can't find a page's Open Graph or Twitter Card markup, or—if those aren't available—your title and meta description, they won't be able to generate a meaningful snippet. This means your snippet will look bad, and it's likely you won't get much traffic from these social media platforms.
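A minimal sketch of such markup in the initial HTML response (titles, descriptions, and URLs are placeholders):

```html
<head>
  <title>Black cats | Example store</title>
  <meta name="description" content="Everything about black cats.">
  <!-- Open Graph markup, used by Facebook, LinkedIn, Slack, and others -->
  <meta property="og:title" content="Black cats">
  <meta property="og:description" content="Everything about black cats.">
  <meta property="og:image" content="https://example.com/images/cat.png">
  <meta property="og:url" content="https://example.com/black-cats/">
  <!-- Twitter Card markup -->
  <meta name="twitter:card" content="summary_large_image">
  <meta name="twitter:title" content="Black cats">
  <meta name="twitter:description" content="Everything about black cats.">
</head>
```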
🤖 What do Google bots look for?
Google's crawlers continuously look for new content, and when they find it, a separate process takes over. That process first looks at the initial HTML response, which needs to include all of the essential content search engines need to understand what the page is about and how it relates to other pages.
No, social media crawlers can't render JavaScript. Therefore, it's highly recommended to either use server-side rendering or dynamic rendering. Otherwise, Facebook (and other social media platforms) won't be able to generate a good snippet for your URL when it's shared.