How to Get Google to Index Your Website
Google’s indexing process is complicated, with many phases affecting each other.
To get Google to index your website quickly, first make sure there are no roadblocks preventing it from indexing your site in the first place.
Secondly, do whatever you can to notify Google that you have fresh content and want your website to be indexed. Bear in mind that poor content quality and a lack of internal links can be deal-breakers in the indexing process.
Finally, boost your website’s popularity by building external links to your website and getting people to talk about your content on social media.
If your content isn’t on Google, does it even exist?
For your website to be visible on the dominant search engine, it first needs to be indexed. In this article, we’ll show you how to get Google to index your site quickly and efficiently and what roadblocks to avoid hitting.
Google’s indexing process in a nutshell
Before diving into how to get your website indexed, let’s go over a simplified explanation of how Google’s indexing process works.
Google’s index can be compared to a massive library – one that’s larger than all the libraries in the world combined!
The index contains billions and billions of pages, from which Google picks the most relevant ones when users make search queries.
With this much content that keeps on changing, Google must constantly search for new content, content that’s been removed, and content that’s been updated – all to keep its index up-to-date.
In order for Google to rank your site, it first needs to go through these three phases:
- Discovery: By processing XML sitemaps and following links on other pages Google already knows about, the search engine discovers new and updated pages and queues them for crawling.
- Crawling: Google then goes on to crawl each discovered page and passes on all the information it finds to the indexing processes.
- Indexing: Among other things, the indexing processes handle content analysis, render pages and determine whether or not to index them.
Google’s indexing process is highly complex, with lots of interdependencies among the steps included in the process. If some part of the flow goes wrong, that affects other phases as well.
For instance, on August 10, 2020, the SEO community noticed a flurry of changes in search rankings. Many argued that this meant Google was rolling out a significant update. But the next day, Google announced that the changes were in fact caused by a bug in its indexing system that affected rankings.
To shed some light on how complicated and intertwined the indexing process is, Gary Illyes explained the Caffeine workflow in a Twitter thread.
This tweet suggests that a bug in the indexing phase can have a big effect on the process that follows it – in this case messing up the ranking system.
Alongside this event, it’s important to note that in May 2020, Google rolled out a broad core update that impacted the indexing process. Since then, Google has been slower to index new content and pickier about the content it decides to index. It seems as if its quality-filtering process has become a lot stricter than it previously was.
How to check if Google has indexed your website?
There are several quick ways to check whether Google has indexed your website, or whether it’s still stuck in the preceding phases of discovery and crawling.
Feedback from Google Search Console
Use Google Search Console’s Index Coverage Report to get a quick overview of your website’s indexing status. This report provides feedback on the more technical details of your site’s crawling and indexing process.
The report returns four kinds of statuses:
- Valid: these pages were indexed successfully.
- Valid with warnings: these pages were indexed, but there are some issues you may want to check out.
- Excluded: these pages weren’t indexed, as Google picked up clear signals that it shouldn’t index them.
- Error: Google could not index these pages for some reason.
The Index Coverage report lets you quickly check your site’s overall indexing status, while Google Search Console’s URL Inspection tool lets you zoom in on individual pages.
If the URL Inspection tool shows you the URL isn’t indexed yet, you can use the very same tool to request indexing.
Check the URL’s cache
Check whether your URL has a cached version in Google, either by typing cache:https://example.com into Google or the address bar, or by clicking the little arrow pointing downwards under the URL on a SERP.
If you see a result, Google has indexed your URL. Here’s an example for one of our articles:
The date included in the screenshot refers to the last time the website was indexed. Keep in mind that it doesn’t say anything about when it was last crawled. The website may have been crawled again later without Google indexing its updates, as Gary Illyes pointed out in this tweet.
At the same time, checking a URL’s cache isn’t foolproof either: you may see a cached page even though the page has since been removed from Google’s index.
If it ranks, it’s indexed
Another way to verify whether your pages have been indexed is to check whether they are ranking, either with a rank tracker or simply by checking Google Search Console’s Performance data to see if you’re getting clicks and impressions.
Searching for the exact page title or URL
Alternatively, to see if a page is indexed, you can search for the exact page title by putting it in quotes ("Your page's title"), use the intitle: search operator with your page’s title (intitle:"Your page's title"), or just enter the URL into Google.
You can also check whether your page is indexed by using the site: query for the page. Here’s an example: entering site:https://www.contentkingapp.com/academy/control-crawl-indexing/ can show whether that page is indexed.
However, this approach is not always reliable!
We’ve seen instances where pages are ranking, but they aren’t showing up for site: queries. So never rely on this check alone.
How to get Google to index your website quickly
In order to get your website indexed by Google, you need to get rid of any roadblocks that would prevent Google from indexing it in the first place.
Secondly, you should make it easier for Google to discover your content with a push. Remember that Google is always aiming to provide their users with high-quality content to adequately answer their queries. Make sure your content fits this bill.
Finally, boost the popularity of your content by winning backlinks and having people talk about your content on social media.
1. Prevent robots directives from impacting indexing
A common reason why Google doesn’t index your content is the robots noindex directive. While this directive helps you prevent duplicate content issues, it sends Google a strong signal not to index certain pages on your website. Robots directives can be implemented in the HTML source (as a meta robots tag) or in the HTTP header (as an X-Robots-Tag).
In your HTML source, the meta robots tag may look something like this:
<meta name="robots" content="noindex,follow" />.
Only implement these directives on pages you definitely don’t want to be indexed. If a page you do want indexed is having indexing issues, double-check that the noindex directive isn’t implemented on it.
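To double-check a page for the noindex directive, you could run a small script like the one below. It's a minimal sketch using only Python's standard library, and it assumes you have already fetched the page's HTML and response headers yourself. The names `RobotsMetaParser` and `is_noindexed` are made up for this example, and the sketch ignores edge cases such as the `none` directive or user-agent-specific header values.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots"> and <meta name="googlebot"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() in ("robots", "googlebot"):
            # Directives are comma-separated, e.g. "noindex,follow"
            content = attr.get("content", "")
            self.directives += [d.strip().lower() for d in content.split(",")]


def is_noindexed(html, headers=None):
    """Return True if the HTML or an X-Robots-Tag header contains noindex."""
    parser = RobotsMetaParser()
    parser.feed(html)
    header_directives = []
    for name, value in (headers or {}).items():
        if name.lower() == "x-robots-tag":
            header_directives += [d.strip().lower() for d in value.split(",")]
    return "noindex" in parser.directives or "noindex" in header_directives


if __name__ == "__main__":
    page = '<head><meta name="robots" content="noindex,follow" /></head>'
    print(is_noindexed(page))
```

Running a check like this across the pages you want indexed is a quick way to spot a noindex directive that slipped in by accident.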
2. Set up canonical tags correctly
Although canonical tags aren’t as strong a signal as meta robots directives, their incorrect use can lead to indexing issues. Make sure the pages you want to get indexed aren’t canonicalized to other URLs.
One thing I’ve seen is sites that get so caught up in ensuring their pages canonicalize that they end up canonicalizing to pages that are also marked with noindex. Google needs clear, consistent signals, so canonicalizing your content to a page marked noindex could stop the affected pages’ performance in their tracks.
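To see whether a page points its canonical tag at a different URL, something like the following sketch may help. It assumes you already have the page's HTML as a string, and `canonical_target` and `is_canonicalized_elsewhere` are hypothetical helper names. The comparison is deliberately naive: it only ignores a trailing slash.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class CanonicalParser(HTMLParser):
    """Remembers the href of the first <link rel="canonical"> tag it sees."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        rel_values = (attr.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rel_values and self.canonical is None:
            self.canonical = attr.get("href")


def canonical_target(html):
    """Return the canonical URL declared in the HTML, or None if there is none."""
    parser = CanonicalParser()
    parser.feed(html)
    return parser.canonical


def is_canonicalized_elsewhere(page_url, html):
    """Return True if the page declares a canonical URL other than itself."""
    target = canonical_target(html)
    if target is None:
        return False
    # Resolve relative canonical hrefs against the page URL before comparing
    absolute = urljoin(page_url, target)
    return absolute.rstrip("/") != page_url.rstrip("/")
```

If a page turns out to be canonicalized elsewhere, it's worth running the canonical target through a noindex check as well, so you don't send Google the conflicting signals described above.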
3. Don’t disallow content you want to get indexed
The robots.txt file is an important tool that sends signals to all search engines about the crawlability of your URLs. It can be set to let Google know it should ignore certain parts of your website.
Make sure that the URLs you want to be indexed aren’t disallowed in robots.txt. Messing up your robots.txt can lead to new content and content updates not being indexed. Be aware that anyone can make mistakes in the robots.txt file – even big companies such as Ryanair.
To check what pages are blocked by robots.txt, check the “Indexed, though blocked by robots.txt” report in Google Search Console.
The robots.txt file may be simple to use, but it is also quite powerful in terms of causing a big mess. I’ve seen many cases where websites were “ready to go” and were pushed live with a blanket disallow rule still in robots.txt, resulting in all pages being blocked for search engines and nobody being able to find the website through Google Search. Meanwhile, the client starts to wonder why Google isn’t indexing anything. One line of code can pass unnoticed and block Google from finding all your website’s content!
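A quick way to catch this kind of mistake is to test your most important URLs against your robots.txt rules. Here's a minimal sketch using Python's built-in `urllib.robotparser`. The function name `blocked_urls` and the `Disallow: /` rule in the demo are our own illustrative assumptions. Note that Python's parser may not match Google's behavior in every edge case (wildcards, for instance), so treat the result as a first check only.

```python
from urllib.robotparser import RobotFileParser


def blocked_urls(robots_txt, urls, user_agent="Googlebot"):
    """Return the URLs that the given robots.txt content disallows for user_agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(user_agent, url)]


if __name__ == "__main__":
    # A single leftover line like this blocks the entire site
    leftover_rules = "User-agent: *\nDisallow: /"
    important_pages = ["https://example.com/", "https://example.com/blog/"]
    print(blocked_urls(leftover_rules, important_pages))
```

Running a check like this as part of your release process can flag a blocked site before Google ever sees it.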
4. Prevent crawler traps and optimize crawl budget
To make sure you get the most out of Google crawling your website, avoid creating crawler traps. Crawler traps are structural issues within a website that result in crawlers finding a virtually infinite number of irrelevant URLs, in which the crawlers can get lost.
You should make sure that the technical foundation of your website is up to par, and that you are using proper tools that can quickly detect crawler traps that Google may be wasting your valuable crawl budget on.
My advice: make sure that all URL variations that need to be blocked off, truly are blocked off!
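One rough signal of a crawler trap is a single path that shows up with many different query strings. The sketch below counts those variations in a list of URLs, for example from a crawl export or your server logs. The function name `suspected_traps` and the default threshold of 3 are arbitrary choices for this example, not established values.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def suspected_traps(urls, threshold=3):
    """Map host+path to its number of distinct query strings, if at or above threshold."""
    variants = defaultdict(set)
    for url in urls:
        parts = urlsplit(url)
        variants[(parts.netloc, parts.path)].add(parts.query)
    return {
        f"{host}{path}": len(queries)
        for (host, path), queries in variants.items()
        if len(queries) >= threshold
    }
```

Paths that come back with a high count are candidates for a closer look. Whether they are actual traps depends on your site.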
5. Feed Google indexable content through an XML sitemap
Once you are sure there is no blockage on your side, you should make it easy for Google to discover your URLs and to understand your website’s infrastructure in general. XML sitemaps are a great way to do this.
All newly published content or updated content that needs to be indexed should be added to your XML sitemap(s) automatically. To make your content easy for Google to find, submit your XML sitemap(s) to Google Search Console.
My go-to way of getting anything indexed quickly is always to verify the site in Search Console and then submit the XML sitemap there.
Always make sure your XML sitemap contains all the pages you want indexed and is organized so Google can read it, with sitemap indexes if required.
For me, this has been the best way to knock on Google’s door and let them know they can tour the website ASAP and crawl and index everything found there. You can always check back to see when the sitemap was submitted and last read by Google.
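If your CMS doesn't generate a sitemap for you, a basic one is easy to build. The sketch below produces a sitemap in the sitemaps.org format from a list of URLs and last-modified dates. The function name `build_sitemap` and the example URLs are placeholders. A real setup would pull the list from your CMS or database whenever content is published or updated.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Build sitemap XML from an iterable of (url, last_modified_date) tuples."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod in pages:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        ET.SubElement(url, "lastmod").text = lastmod
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


if __name__ == "__main__":
    print(build_sitemap([
        ("https://example.com/", "2024-01-15"),
        ("https://example.com/blog/new-post/", "2024-01-20"),
    ]))
```

Save the output as a file on your site, such as sitemap.xml, and submit its URL in Google Search Console.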
6. Manually submit your URLs to Google Search Console
While Google will discover, crawl, and potentially index your new or updated pages on its own, it still pays to give it a push by submitting URLs into Google Search Console. This way, you can also speed up the ranking process.
You can submit your URLs in GSC’s URL Inspection tool.
7. Submit a post through Google My Business
Submitting a post through Google My Business gives Google an extra push to crawl and index URLs that you’ve included there. We don’t recommend doing this for just any post. Keep in mind that the post will be shown in the Google My Business knowledge panel on the right-hand side for branded searches.
8. Automatic indexing via the Google Indexing API
Websites that have many short-lived pages, such as job postings, event announcements, or livestream videos, can use Google’s Indexing API to automatically request that Google crawl and index new content and content changes. Because it allows you to push individual URLs, it’s an efficient way to help Google keep its index fresh.
With the Indexing API, you can:
- Update a URL: notify Google of a new or updated URL to crawl
- Remove a URL: notify Google that you have removed an outdated page from your website
- Get the status of a request: see when Google last received each kind of notification for a URL
Although Google doesn’t recommend feeding it content types other than jobs and events, I have managed to index regular pages using the API. One thing I’ve noticed is that the API seems to work better for new pages than for re-indexing. Google might enforce this restriction at some point, but for now it’s working fine. RankMath has a plugin that can make the job a lot easier, but it requires a bit of setup.
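To give you an idea of what a request to the Indexing API looks like, here's a sketch that builds (but doesn't send) the HTTP requests for an update, a removal, and a status lookup. It assumes you have already obtained an OAuth 2.0 access token for a service account with access to your Search Console property. That part is left out, and the helper names `build_notification` and `build_status_request` are our own. Check the endpoints against Google's current documentation before relying on them.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

PUBLISH_ENDPOINT = "https://indexing.googleapis.com/v3/urlNotifications:publish"
METADATA_ENDPOINT = "https://indexing.googleapis.com/v3/urlNotifications/metadata"


def build_notification(url, access_token, deleted=False):
    """Build a POST request telling Google a URL was updated (or removed)."""
    body = {"url": url, "type": "URL_DELETED" if deleted else "URL_UPDATED"}
    return Request(
        PUBLISH_ENDPOINT,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {access_token}",
        },
        method="POST",
    )


def build_status_request(url, access_token):
    """Build a GET request for the notification status of a URL."""
    return Request(
        f"{METADATA_ENDPOINT}?{urlencode({'url': url})}",
        headers={"Authorization": f"Bearer {access_token}"},
        method="GET",
    )
```

To actually send a request, pass it to `urllib.request.urlopen` with a valid token. The API is subject to quotas, so it makes sense to batch and prioritize the URLs that matter most.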
9. Provide Google with high-quality content only
Google’s aim is to return high-quality content to its users, as quickly as possible. Therefore, always focus on providing Google with the best content you can possibly produce to increase your chances of being indexed quickly.
With Google’s strict content evaluation and the never-ending competition, creating and optimizing great content is a process that will never cease.
Apart from generating new content, focus on improving what’s already in place. Update underperforming content so that it returns better answers to potential visitors. If you have low-quality or outdated content on your website, consider either removing it completely or discouraging Google from spending its precious crawl budget on it.
10. Prevent duplicate content
Another way to turn Google’s crawl budget into a massive waste is to have duplicate content. This term refers to very similar, or identical, content that appears on multiple pages within your own website, or on other websites.
Overall, duplicate content can be truly confusing for Google. In principle, Google indexes only one URL for each unique set of content. But it’s hard for the search engine to determine which version to index, and this is subsequently reflected in its search results. And as the identical versions keep on competing against each other, performance drops for all of them.
Duplicate content can turn into a serious problem, mainly for eCommerce website owners, who have to find a way to signal to Google which parts of their website to index and which to keep hidden.
To this end, you can use robots.txt disallow for filters and parameters, or you can implement canonicalized URLs. But as mentioned in the first part of this article, be very careful what you are implementing, as even a tiny change can have a negative impact.
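Before deciding what to disallow or canonicalize, it helps to know which URLs actually serve the same content. The sketch below groups URLs whose body text is identical after normalizing whitespace and letter case. It assumes you have already extracted the main text of each page, and `duplicate_groups` is a hypothetical name. It only catches exact duplicates, so near-duplicates need a fuzzier comparison.

```python
import hashlib
import re
from collections import defaultdict


def duplicate_groups(pages):
    """Group URLs with identical normalized text; pages maps URL -> body text."""
    groups = defaultdict(list)
    for url, text in pages.items():
        # Collapse whitespace and ignore case so trivial differences don't matter
        normalized = re.sub(r"\s+", " ", text).strip().lower()
        digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
        groups[digest].append(url)
    return [sorted(urls) for urls in groups.values() if len(urls) > 1]
```

Each group it returns is a set of URLs competing with each other, which makes them candidates for a canonical tag or a robots.txt rule.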
11. Leverage internal links and avoid using nofollow
Internal links play a huge role in making Google understand the topics of your website and its inner hierarchy. By implementing strategically placed internal links, you will make it easier for Google to understand what your content is about and how it helps users.
Make sure that you avoid using the rel="nofollow" attribute on your internal links, as the nofollow attribute value signals to Google that it shouldn’t follow the link to the target URL. This results in no link value being passed as well.
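To find internal links that carry the nofollow attribute value, you could scan a page's HTML with a sketch like this one. It treats a link as internal when it resolves to the same host as the page, which is a simplification (subdomains count as external here). The names `NofollowLinkParser` and `nofollowed_internal_links` are made up for the example.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit


class NofollowLinkParser(HTMLParser):
    """Collects same-host links whose rel attribute includes nofollow."""

    def __init__(self, page_url):
        super().__init__()
        self.page_url = page_url
        self.host = urlsplit(page_url).netloc
        self.nofollowed_internal = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr = dict(attrs)
        href = attr.get("href")
        if not href:
            return
        # Resolve relative links against the page URL
        target = urljoin(self.page_url, href)
        rel_values = (attr.get("rel") or "").lower().split()
        if urlsplit(target).netloc == self.host and "nofollow" in rel_values:
            self.nofollowed_internal.append(target)


def nofollowed_internal_links(page_url, html):
    """Return absolute URLs of internal links on the page marked nofollow."""
    parser = NofollowLinkParser(page_url)
    parser.feed(html)
    return parser.nofollowed_internal
```

Any URL this returns is an internal link that Google is being told not to follow.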
If you need new pages indexed fast, be strategic about how you internally link to them.
Adding internal links on your home page and site-wide areas like the header and footer will significantly speed up the process of crawling and indexing.
Consider creating dynamic areas on your home page that show your latest content, whether that’s a blog post, news article, or product.
You can also use links within a mega menu that list the latest URLs within your site’s different taxonomies.
12. Build relevant backlinks to your content
It’s not an overstatement to say that link building is one of the most important disciplines in this field. A commonly repeated rule of thumb is that links account for more than half of your SEO success.
Via inbound links, often called backlinks, Google can discover your website. And as links also transfer a portion of their authority, you will get indexed faster if a backlink is coming from a high-authority website, and it will significantly affect your rankings.
To help you boost your indexing and ranking options, here is a whole list of highly effective link building strategies.
13. Create buzz around your content on social media
Earlier in this article, we mentioned that Google has become much stricter when it comes to what content they index. When you create buzz around your content on social media, it signals to Google that the content is popular, which speeds up the indexing process. For instance, posting your content on Twitter together with a few popular hashtags can really help in speeding up the indexing process.
On top of that, creating buzz around your content will also lead to newsletter inclusions and backlinks!
Because of Google’s access to Twitter’s “firehose data stream”, you’ll find that all content types, but especially news content, will be discovered quickly if they get shared on Twitter.
Getting your website indexed properly by Google can turn out to be a hell of a job. You have to tackle many technical as well as content-oriented and PR-based challenges. And with the recent Google core update in May 2020, indexing new pages has become even harder.
But with a proper strategy and checklist in place, you can get Google to index the most important parts of your website and boost your SEO performance with high rankings.