SEO Requirements for a New Website
Building websites is hard. Building SEO-friendly websites is even harder. It's all too common that a development team's members are high-fiving each other after a massive project, and yet it turns out that the website's not up to contemporary SEO standards.
There are many reasons why this can happen. The four most common are:
- The development team wasn't briefed properly.
- The development team wasn't educated well enough by the SEO specialists.
- SEOs were involved way too late in the process.
- The development team thought they had "this SEO thing" covered.
Requirements engineering—figuring out what you need to build—is a true science and is essential for the success of every development project, on and off the web.
If you don't know what the project's goals are, what to build, and what standards to adhere to, how can you make the project successful and build exactly what your customer wants?
In this article we'll cover all the SEO requirements you need to consider when building or maintaining a website. It's useful for everyone in the process: the customer who wants to have the website built, the SEO specialists who are tasked with ensuring the new website is SEO-proof, and the web development firms that are doing the actual building.
Often, there are so many people contributing to a new website launch that it's difficult for an SEO to maintain line of sight on why things have been done a certain way. I find that a checklist detailing pre-launch, deployment day and post-launch checks is crucial for creating the foundation required for good SEO performance. This checklist should be integrated into the website's overall development plan.
At Blue Array, we often find that companies only engage an SEO agency once their website is well-established. More often than not, we find a myriad of issues (mostly technical) which could have been avoided if SEO had been a bigger priority from the beginning.
Often it's very easy for SEOs to be indignant about a lack of inclusion in the development process, or requirements not being hit. The thing is, that's our job - to empathise, to explain the risks, to embed ourselves in the process, and to educate. Having the actual SEO down is the easy bit if that's your entire job - the politics is the more delicate art.
When do SEO specialists need to be involved?
SEO specialists need to be involved before, during, and after the building of a new website. It's important to keep them informed and updated whenever new websites are being built and changes are being made.
New websites are often screwed up as they are rushed. SEO is left untouched until the last minute and content isn't thoroughly thought through during the planning stages. Involve SEO specialists right from the start in order to prevent this from happening!
When you're launching a new website development project, involve SEOs from the get-go, because the new website's SEO requirements will heavily influence its price.
All too often, SEO specialists' involvement in the development process starts way too late. They then quickly realize that the new website isn't going to be SEO-proof as-is, so they compile a list of SEO requirements for which an additional budget is required. This results in a messy project flow, higher costs, and delays… none of which makes the customer happy!
Now let's get our hands dirty and move on to the actual SEO requirements!
Something we see time and time again is SEO requirements getting de-prioritised by the development team and/or business stakeholders, often in favour of hitting the website launch deadlines. This can generally be avoided with strong communication between teams and by involving the SEO team at the start of the project. If we only see your project a week before launch, we are, more often than not, going to find something that could delay your launch.
All the requirements in this section are no-brainers. They've been industry standards for years, so they should come as no surprise to anyone.
A responsive design means that the design adjusts itself to the device it's being used on. Responsive designs are great for both visitors and search engine crawlers. They provide a consistent experience for visitors to your website, regardless of the device they're using. Google values websites that provide their visitors with a good mobile experience (and other search engine crawlers are doing the same). The reason behind this is that in late 2015, the number of mobile searches in Google surpassed the number of desktop searches. So making sure your website caters to mobile visitors is very important.
Responsive designs make SEO specialists' lives easier as well, because they only have one URL to promote per page. In the past, you'd have separate desktop and mobile sites that would both get linked to from other websites, and you'd need to consolidate those link and relevancy signals. This would always be sub-optimal compared to just getting links to a single URL.
Accelerated Mobile Pages
If you're working on a website for a publisher, then you need to consider implementing Accelerated Mobile Pages (AMP). We don't recommend implementing AMP for any other type of website.
AMP is an open-source initiative and format by Google with the aim of speeding up the web experience for mobile users. AMP pages are essentially stripped-down versions of pages that are optimized to load fast on mobile devices.
So why is AMP only useful for publishers? Because for publishers, it enables you to get into the Google News carousel, which can drive lots and lots of traffic… but other types of sites can't benefit from this, which makes the cons outweigh the pros.
HTTPS stands for Hyper Text Transfer Protocol Secure. It's the secure version of HTTP, the protocol over which data is sent between your browser and the website you're visiting. Using HTTPS makes it harder for people to try and eavesdrop on you.
Google's been pushing for websites to adopt HTTPS and has made it into a minor ranking factor because of that. While it may help a little, it's not going to provide a significant competitive edge in terms of SEO. To support their cause though, they're showing the "Not Secure" notice in the URL for sites that contain forms and aren't running on HTTPS.
Nowadays, HTTPS is a must-have. So any new website that's being built today should be served over HTTPS. That's why ContentKing can check a website to see if it's available via HTTPS and whether its HTTPS certificate is valid.
Make sure to load all resources over HTTPS, because you want to avoid so-called "mixed content" issues. Mixed content issues occur when some resources are loaded over HTTP instead of HTTPS, thereby leaving the page unsafe.
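As a rough illustration of how such issues could be spotted before launch, here is a minimal Python sketch that scans a page's HTML for resources referenced over plain HTTP. The set of tags it inspects (img, script, iframe, and stylesheet links) is an assumption made for this example and is not exhaustive.

```python
from html.parser import HTMLParser


class MixedContentFinder(HTMLParser):
    """Collects sub-resource URLs that are referenced over plain HTTP."""

    def __init__(self):
        super().__init__()
        self.insecure = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag in ("img", "script", "iframe"):
            url = attributes.get("src")
        elif tag == "link" and "stylesheet" in (attributes.get("rel") or "").lower():
            # Only stylesheet links are loaded as resources; canonical links are not.
            url = attributes.get("href")
        else:
            return
        if url and url.lower().startswith("http://"):
            self.insecure.append(url)


def find_mixed_content(html):
    """Return the insecure resource URLs found in an HTML string, in document order."""
    finder = MixedContentFinder()
    finder.feed(html)
    return finder.insecure
```

Ordinary hyperlinks (a href) are deliberately ignored, since linking to an HTTP page does not load it as part of the current page.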
The reasoning for not relying on client-side rendering is:
- Search engines' page-rendering resources are limited. Rendering a page can easily cost twenty times as many resources as crawling a regular HTML page, so search engines can only allocate a small portion of their resources to this. This results in waiting days, if not weeks, for your content to be indexed.
- Content that isn't indexed doesn't rank. Until your content is indexed, you'll get zero traffic from search engines.
SEO aside, client-side rendering also makes for a higher Time to Interactive (TTI). This means that visitors will have to wait longer before they're able to interact with the page.
Studies have shown that visitors like fast-loading pages. They decrease bounce rates and raise conversion rates. Amazon found that their revenue increased by 1% for every 100ms decrease in load time. On top of that, having fast loading pages helps your SEO. Especially on the first page of Google, page speed can really make a difference. And, as of May 2021, Google will take into account more metrics that aim to measure user experience. This new set of metrics is called Core Web Vitals.
There are hundreds of tweaks you can make to your website and web server that will help you get better page speed, but here are the most common best practices that you should take into account:
- Use a content delivery network
- Optimize your images
- Reduce server response time to <500 ms
- Use browser caching and file compression
We always find page speed is one of the hardest things to optimise for as it involves canvassing multiple stakeholders and making fundamental updates to your site.
However, it's one of the most under-rated SEO factors that the community speaks about. One thing we advise avoiding is using tools like GTmetrix and Pingdom to work out what to optimise, as they don't really give you a true overview of what's going on or what needs to be fixed.
Hierarchy of headings
Headings, H1–H6, are used to provide hierarchy and clarity for your web pages. By using headings appropriately, you can ensure that visitors can scan your pages quickly and help search engine crawlers to grasp your page's content and structure.
An H1 heading on a page should convey its main topic. For that reason, you shouldn't put the H1 heading around the logo, and you should only use one H1 heading per page.
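A heading check like this is easy to automate. The Python sketch below counts H1 headings and, as an extra assumption that goes beyond the text above, also flags places where a heading level is skipped (for example an H1 followed directly by an H3).

```python
from html.parser import HTMLParser

HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class HeadingCollector(HTMLParser):
    """Records the level of every heading in document order."""

    def __init__(self):
        super().__init__()
        self.levels = []

    def handle_starttag(self, tag, attrs):
        if tag in HEADING_TAGS:
            self.levels.append(int(tag[1]))


def audit_headings(html):
    """Return a list of human-readable problems; an empty list means no issues found."""
    collector = HeadingCollector()
    collector.feed(html)
    levels = collector.levels
    problems = []
    h1_count = levels.count(1)
    if h1_count != 1:
        problems.append(f"expected exactly one H1, found {h1_count}")
    for previous, current in zip(levels, levels[1:]):
        if current > previous + 1:
            problems.append(f"H{previous} is followed by H{current}, skipping a level")
    return problems
```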
The robots.txt file contains the rules of engagement for crawlers. You use it to tell crawlers what sections they can't access and to give them hints that help them discover your content efficiently by referencing the location of your XML sitemap.
Here's what's important to keep in mind when dealing with robots.txt:
- Different search engines interpret the robots.txt differently.
- Be careful not to Disallow files that are required to render pages. Doing so keeps search engines from rendering your pages, and it could hurt your SEO performance.
- And last but not least: robots.txt is very powerful. Be careful around it, as you can easily make your whole site inaccessible to search engines. Therefore it's important to monitor your robots.txt file.
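One way to guard against accidentally blocking render-critical files is to test them against the robots.txt rules. This is a minimal sketch using Python's standard urllib.robotparser; the file paths and the "Googlebot" user agent are illustrative assumptions. Note that this parser applies rules in file order, which is not necessarily how every search engine resolves conflicting rules, so treat the result as an approximation.

```python
from urllib.robotparser import RobotFileParser


def blocked_render_resources(robots_txt, resource_urls, user_agent="Googlebot"):
    """Return the resource URLs that the given robots.txt content disallows."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in resource_urls if not parser.can_fetch(user_agent, url)]


# Hypothetical robots.txt that blocks a JavaScript directory.
example_robots_txt = """\
User-agent: *
Disallow: /assets/js/
Sitemap: https://example.com/sitemap.xml
"""

example_resources = [
    "https://example.com/assets/js/app.js",
    "https://example.com/assets/css/site.css",
]
```

Calling blocked_render_resources(example_robots_txt, example_resources) returns only the JavaScript file, which is the kind of finding that should be reviewed before launch.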
XML sitemaps are an efficient way of telling search engines about the content you have on your website. Therefore, they play an important role in making sure your content is crawled quickly after publishing or updating.
Best practices surrounding XML sitemaps:
- Keep the XML Sitemap up to date with your website's content.
- Make sure it's clean: only indexable pages should be included.
- Reference the XML Sitemap from your robots.txt file.
- Don't list more than 50,000 URLs in a single XML Sitemap.
- Make sure the (uncompressed) file size doesn't exceed 50MB.
- Don't obsess about the
Keep in mind that there are special XML sitemaps for images and news articles.
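To illustrate the URL and file size limits, here is a small Python sketch that splits a list of indexable URLs into sitemap files of at most 50,000 entries and checks the uncompressed size. Function names are hypothetical, and only the loc element is emitted to keep the example short.

```python
from xml.sax.saxutils import escape

MAX_URLS_PER_SITEMAP = 50_000
MAX_BYTES_PER_SITEMAP = 50 * 1024 * 1024  # 50MB, uncompressed


def build_sitemaps(urls, max_urls=MAX_URLS_PER_SITEMAP):
    """Split URLs into chunks and return one sitemap XML string per chunk."""
    sitemaps = []
    for start in range(0, len(urls), max_urls):
        chunk = urls[start:start + max_urls]
        # escape() turns characters such as & into XML entities.
        entries = "".join(f"  <url><loc>{escape(url)}</loc></url>\n" for url in chunk)
        sitemaps.append(
            '<?xml version="1.0" encoding="UTF-8"?>\n'
            '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n'
            f"{entries}"
            "</urlset>\n"
        )
    return sitemaps


def within_size_limit(sitemap_xml):
    """Check the uncompressed size of a sitemap against the 50MB limit."""
    return len(sitemap_xml.encode("utf-8")) <= MAX_BYTES_PER_SITEMAP
```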
HTTP status codes
It's all too common that "Page not found" error pages aren't returning an HTTP status code 404. This status code clearly communicates that the page doesn't exist.
HTTP status code 404 should be used for pages that never existed and can be used for pages that used to exist. We're deliberately writing "can" because there's an alternative status code that's more definitive: HTTP status code 410. HTTP status code 410 signals to search engines that the page has been removed and will never return. Because of its definitive nature, be careful with this one as there's no turning back.
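The choice between the two status codes can be expressed as a small decision function. This Python sketch is only an illustration; the two boolean flags are hypothetical inputs that a CMS would need to track.

```python
from http import HTTPStatus


def status_for_missing_page(existed_before, removed_permanently):
    """Pick the status code for a URL that currently has no page behind it."""
    if existed_before and removed_permanently:
        # 410 Gone: the page was removed on purpose and will not return.
        return HTTPStatus.GONE
    # 404 Not Found: the page never existed, or may still come back.
    return HTTPStatus.NOT_FOUND
```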
Information Architecture is the art and science of designing a structure for presenting your website's content. It's about defining what content is present and how it's made accessible. Information Architecture is where User Experience (UX) and Search Engine Optimization (SEO) meet. Everyone benefits from having a good information architecture, so this is an important phase in building a new website.
In this section, we'll be focusing on the Information Architecture SEO requirements for web development. Part of that is defining the URL structure, and how it behaves under certain conditions.
Pick your SEO battles. Make sure you do your homework about the type of website and the SEO specifics for that type of website.
Is it a locally oriented website? Then make sure you're investing time in local SEO (e.g. mentioning the city and surrounding villages or regions in the content). It's one of the most forgotten things for smaller, locally oriented websites.
Is it an e-commerce website? Then make sure you incorporate the buying or comparing intent of your target audience in the content, meta information, etc.
Is it a content website? Then check if you can find content gaps in your competitors' content.
URLs should be lowercase, descriptive, readable, and short.
Ideally, URLs shouldn't have extensions such as .html, .php and .aspx as it allows you to switch platforms without having to redirect these URLs. Whatever structure you pick, make sure it's used consistently across your entire website.
Furthermore, avoid using parameters in URLs as much as possible. They don't show visitors what to expect when going to a page, and they can also cause crawl issues.
The website needs to support the defining of a template that details how URLs are built up.
Things to consider:
- Are you using subdirectories in your URLs or not?
- Are you using a trailing slash (a slash at the end of each URL) or not?
- Also, make sure you're able to manually overwrite the URL template if need be.
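A URL template along these lines could be sketched as follows in Python. The function names, the manual_path parameter, and the example segments are hypothetical; the slug rule shown here only keeps ASCII letters and digits, so a real implementation would need to decide how to handle other characters.

```python
import re


def slugify(text):
    """Lowercase the text and replace runs of other characters with single hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def build_url(base_url, segments, trailing_slash=True, manual_path=None):
    """Build a URL from subdirectory segments, or from a manually defined path."""
    if manual_path is not None:
        # A manual overwrite takes precedence over the template.
        path = manual_path.strip("/")
    else:
        slugs = [slugify(segment) for segment in segments]
        path = "/".join(slug for slug in slugs if slug)
    url = f"{base_url.rstrip('/')}/{path}"
    if trailing_slash and not url.endswith("/"):
        url += "/"
    return url
```

For example, build_url("https://example.com", ["Shoes", "adidas Ultra Boost"]) yields a lowercase, extension-free URL with a trailing slash, and the trailing_slash flag keeps that choice consistent across the whole site.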
Something to keep in mind when preparing to launch a new website is ensuring it's future-proof for potential optimization and company growth in the future. What might fit the company's website requirements right now may one day be obsolete, so make sure to clarify what long term goals they hope to achieve with their site and service.
Quick ways to support future improvement include creating a site structure that can accommodate additional subdirectories or categories when necessary, and making sure that structured data, content strategies and user experience plans can be adapted for varying topics. Also, if your client is planning international expansion (either now or one day), check if your suggested hreflang implementation can cater to that. Thinking about the potential bigger picture and upcoming plans will save you and your company tons of time and stress in the future.
Managing expectations when you're replatforming your website is tough: you should be looking to change as little as possible. Every tweak to site architecture, URL structure, meta data, copy or page layout is an additional risk, taking a little longer for search engines to understand. You're walking away from years of optimisation and changing everything at once makes it harder to uncover what's gone wrong.
If you're doing it right, your company is probably paying a lot of money for a website that looks/works pretty much just like the last one – that's hard to do when all everyone sees is an opportunity to redesign the whole thing.
Filters, variants, and multiple categories
If the new website contains filters, describe how they're going to affect the URL structure, and whether the resulting URLs should be accessible and indexable for search engines or not.
Do the same for page variants.
Example: you're going to be building a new eCommerce website for selling shoes. Each shoe is available in different colors and sizes, easily leading to dozens of product variants. Which pages do you want to be accessible and indexable, and which not?
And what about blog articles that are in multiple categories? Or products that are in multiple categories? There may be very good reasons to do so from a user point of view, but from an SEO point of view, it can really be a headache.
We recommend making sure that your articles and products have a primary category that can be indexed. Other categories an article or product is in should be unindexable, in order to prevent duplicate content.
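Example: a blog article is in categories A, B and C, leading to the following URLs:
- https://example.com/blog/a/example-article
- https://example.com/blog/b/example-article
- https://example.com/blog/c/example-article
Category A is the primary category, so this article will be indexable on https://example.com/blog/a/example-article. https://example.com/blog/b/example-article and https://example.com/blog/c/example-article should both be canonicalized to the primary URL.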
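The primary-category rule can be sketched in a few lines of Python. The /blog/{category}/{slug} URL pattern and the category names a, b and c are assumptions chosen to match the example URL above; the function name is hypothetical.

```python
def canonical_map(base_url, categories, primary, slug):
    """Map every category URL of an article to the URL of its primary category."""
    if primary not in categories:
        raise ValueError(f"primary category {primary!r} is not one of {categories!r}")
    primary_url = f"{base_url}/blog/{primary}/{slug}"
    return {
        f"{base_url}/blog/{category}/{slug}": primary_url
        for category in categories
    }
```

The primary URL maps to itself, which gives it a self-referencing canonical, while every other category URL points at the primary one.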
Smart templates that save you time
From an SEO standpoint, it's important to be able to define templates for the title tag, meta description, and headings. They help you optimize consistently, while also saving you a lot of work.
You need to be able to define these templates on a website level, as well as for category pages and for all pages in a certain category. Specificity wins, so the most specific template for a page will be used.
|Level||Element||Template|
|Website||Title||$pageName - HappyShoes|
|Pages in Category "adidas"||Title||$pageName - Buy adidas shoes|
You've got a page named "adidas Ultra Boost size 44", so the title becomes: "adidas Ultra Boost size 44 - Buy adidas shoes".
Manually overwriting smart defaults
On a page level, you should be able to overwrite these templates with manually defined ones.
That way you can quickly define a default title tag template for all of your blog articles. Or all the pages surrounding a certain service.
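The "most specific template wins" rule, including manual overwrites, could look roughly like this in Python. It uses the standard string.Template class, which happens to share the $pageName placeholder syntax from the example; the dictionary layout and function name are assumptions made for illustration.

```python
from string import Template

# Website-level default and category-level templates from the example above.
WEBSITE_TEMPLATE = "$pageName - HappyShoes"
CATEGORY_TEMPLATES = {"adidas": "$pageName - Buy adidas shoes"}


def resolve_title(page_name, category=None, manual_title=None):
    """Return the title for a page, preferring the most specific definition."""
    if manual_title:
        # A manually defined title on the page level overrides every template.
        return manual_title
    template = CATEGORY_TEMPLATES.get(category, WEBSITE_TEMPLATE)
    return Template(template).substitute(pageName=page_name)
```

The same lookup order would apply to meta description and heading templates.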
Crawling & Indexing
All the requirements described in this section are aimed at letting a search engine's crawling and indexing process run smoothly. You want search engines to learn quickly about any new or updated content. And upon crawling your content, you want them to understand it as quickly and as well as possible.
Robots directives let you choose how crawlers should treat your pages, with the most well-known directives being the noindex and nofollow directives. You can define these robots directives using the meta tag in the <head> section of a page, but you can also do it through X-Robots-Tag in the HTTP header.
By default, the robots directives should allow indexing, and not have the nofollow directive applied.
Here again, you want to define the robots directives on the following three levels:
- The website level
- The category/segment level
- The page level
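A three-level setup like this boils down to picking the most specific value that has been defined. The Python sketch below assumes each level stores its directives as a plain string or None; the function names are hypothetical.

```python
DEFAULT_DIRECTIVES = "index, follow"


def resolve_robots(website=None, category=None, page=None):
    """Return the most specific directives: page beats category beats website."""
    for directives in (page, category, website):
        if directives is not None:
            return directives
    # Nothing defined anywhere: allow indexing and following links.
    return DEFAULT_DIRECTIVES


def robots_meta_tag(directives):
    """Render the directives as a meta tag for the <head> section."""
    return f'<meta name="robots" content="{directives}">'


def x_robots_header(directives):
    """Render the directives as an HTTP response header (name, value) pair."""
    return ("X-Robots-Tag", directives)
```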
Canonical URLs inform search engines that they should prefer one page over other identical or similar pages. For instance, if you have three near identical pages—A, B, and C—and you want page A to be indexed, you then canonicalize both B and C to A.
By default, canonical URLs should be self-referencing—telling search engines they're the right variant to index.
When it comes to canonicals, you only want to define these on the page level.
Best practices surrounding canonical URLs:
- Use absolute URLs, including the domain and protocol.
- Define only one canonical URL per page.
- Define the canonical URL in the page's <head> section or HTTP header.
- Point to an indexable page.
The requirements described in this section apply to websites that do international SEO.
For a website that's available in multiple languages and/or regions, be sure to use the hreflang attribute. The hreflang attribute is used to indicate what language your content is in and what geographical region your content is meant for. You can define the hreflang attribute by including it in the <head> section, or using the HTTP header.
Say you have an English, Dutch, and French version of the website. You can use the hreflang attribute to point search engines to the translated versions of your pages. The page's English version would then carry hreflang annotations for the English, Dutch, and French versions.
Best practices surrounding hreflang:
- Reference both the page itself and its translated variants.
- Make sure to have bidirectional hreflang references.
- Correctly define language and region combinations.
- Always set an x-default hreflang value.
- The hreflang attribute and the canonical URL must match.
- Use absolute URLs when defining the hreflang attribute.
- Use only one method to implement the hreflang attribute.
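For the English, Dutch, and French setup mentioned earlier, the annotations for the <head> section could be generated with a sketch like the one below. The URLs are hypothetical, and the optional x-default entry is the commonly used fallback value for visitors who match none of the listed languages.

```python
def hreflang_links(versions, x_default=None):
    """Render <link> elements for every language version of a page.

    versions maps an hreflang value (e.g. "en", "nl", "fr") to an absolute URL.
    """
    urls = list(versions.values()) + ([x_default] if x_default else [])
    for url in urls:
        if not url.startswith(("https://", "http://")):
            raise ValueError(f"hreflang URLs must be absolute: {url}")
    links = [
        f'<link rel="alternate" hreflang="{code}" href="{url}" />'
        for code, url in versions.items()
    ]
    if x_default:
        links.append(f'<link rel="alternate" hreflang="x-default" href="{x_default}" />')
    return links


# Hypothetical URLs for the three language versions.
example_versions = {
    "en": "https://example.com/",
    "nl": "https://example.com/nl/",
    "fr": "https://example.com/fr/",
}
```

Emitting the same full list on every language version keeps the references bidirectional, because each page then references itself and all of its variants.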
Using structured data, you can provide additional information about your content. For instance, you can use Schema.org to do this for search engines, while Open Graph serves platforms such as Facebook, LinkedIn, and Slack, and Twitter Cards do this for Twitter. By using structured data, you ensure you have full control of how your content is presented.
Schema.org is often used to mark up reviews and to communicate who wrote an article and what organization a website belongs to. Or in the case of a local business, you can use Schema.org to explain the type of business and what its opening hours are. There are lots of opportunities here, so make sure your website supports the defining of Schema.org properties. The markup should be added to the HTML.
There are multiple ways to define Schema, but the most popular (and Google-preferred) one is using the JSON-LD format.
You should be able to define these on the following levels:
- The website level (e.g. for defining Organization)
- The page level (e.g. for defining Reviews)
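As an illustration of a website-level definition, this Python sketch renders an Organization as a JSON-LD script element. The organization name and URL are placeholder values, and the function name is hypothetical.

```python
import json


def json_ld_script(data):
    """Render a dictionary of Schema.org properties as a JSON-LD script element."""
    # Escape "</" so a value can never close the script element early.
    payload = json.dumps(data, indent=2).replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'


# Placeholder website-level data; a real site would fill in its own details.
organization = {
    "@context": "https://schema.org",
    "@type": "Organization",
    "name": "HappyShoes",
    "url": "https://example.com",
}
```

Page-level items such as reviews would go through the same function with a different dictionary.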
Structured data is about confidence. When SEOs talk about structured data for websites, they are typically referring to Schema.org. Using the Schema.org vocabulary involves annotating information on a page, taking text (which can be hard for machines to understand and disambiguate) and specifying explicitly what those elements are and/or mean. A machine can then more confidently use these clearly labelled items (knowing that they have been validated by a webmaster). For example, Google has the option to use this information for populating rich snippets, updating their knowledge graph, updating various SERP features, supporting local transactions, and semantic association. It's vital to ensure accuracy of the information and that it is present on the page. Refer to Google's documentation on the item types they support, which also covers implementation details.
Open Graph markup
There are four required Open Graph properties that you should be able to define: og:title, og:type, og:image, and og:url.
There are also two recommended properties; use these to provide even more context about the content.
You should be able to define these on the following levels:
- The Website level (e.g. for defining a default image)
- The Category/Segment level (e.g. for defining a default image for all blog articles)
- The Page level (e.g. for defining Reviews)
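The layering of website, category, and page values could work as in this Python sketch: more specific levels override less specific ones, and the result is checked against the four properties the Open Graph protocol marks as required (og:title, og:type, og:image, og:url). The function name and example values are hypothetical.

```python
from html import escape

REQUIRED_PROPERTIES = ("og:title", "og:type", "og:image", "og:url")


def open_graph_tags(website, category=None, page=None):
    """Merge the three levels and render one meta tag per Open Graph property."""
    # Later dictionaries win, so page values override category and website values.
    merged = {**website, **(category or {}), **(page or {})}
    missing = [name for name in REQUIRED_PROPERTIES if not merged.get(name)]
    if missing:
        raise ValueError(f"missing required Open Graph properties: {', '.join(missing)}")
    return [
        f'<meta property="{escape(name)}" content="{escape(value)}">'
        for name, value in merged.items()
    ]
```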
Twitter Card markup
While Twitter Cards are quite similar to Open Graph, with Twitter Cards, there are four different types:
- Summary Card
- Summary Card with Large Image
- App Card
- Player Card
Here are the required Twitter Card properties:
But we highly recommend also including the three properties below to provide more context about the content:
Media such as images and PDF files can drive a lot of traffic, and therefore it's essential that you be able to optimize them.
While there are multiple factors that determine whether or not an image ranks highly, the website should support the defining of an alt attribute and a title attribute for each image. Also, when you're uploading media you need to make sure the URL path makes sense.
For instance, WordPress' default URL path, which contains the year and month, makes for very long URLs, and can also give people the false impression that your media is outdated when it's served from a path that carries an old date.
Using image compression to decrease the file size of your images is recommended too. This improves page load speed.
To ensure that search engines learn about your images quickly and easily, be sure to use an image XML sitemap.
As we described in the Information Architecture section, the way in which content is accessible for visitors and search engines plays a key role in SEO.
Therefore, it's essential that you be able to manage your navigation. There are various navigation types, such as:
- Main navigation
- Sidebar navigation
- Footer navigation
Make sure you can manage these for the website, and also specifically for certain sections of the website. Take for example an eCommerce website that sells shoes, pants, and sweaters. In the shoes section, you may want to define a sidebar and footer that contain nothing but links to your most important product pages and sub-category pages about shoes.
In order for any new website to be SEO-proof when it's launched, it has to have good specifications. Use this article to your advantage, and be sure to click through to the in-depth articles about the topics mentioned for additional information.
And once you have all the SEO requirements for the new website down pat, be sure to get ready for the next step: preparing for a future website migration.