Search Engine Optimisation
Last updated: September 1, 2019
If you’re looking for a straight-talking, understandable introduction to the world of search engine optimisation then you’re in the right place. This in-depth guide covers the fundamentals of SEO.
You might be a complete beginner seeking the knowledge to improve the search engine rankings of your own website, or a seasoned professional looking to broaden your SEO outlook. Either way, we hope this guide can help you in some way.
What Does SEO Stand For?
SEO stands for “Search Engine Optimisation”.
It has different interpretations, but we’d define SEO as:
Search Engine Optimisation is the art of enhancing a website’s online presence with the goal of generating increased organic search engine referral traffic.
In reality, SEO is a lot more complicated, but that’s it in simple terms.
Let’s dive right into it.
The Ranking Process
Let’s learn how Google search works.
When somebody performs a search on Google, a complex operation takes place within a fraction of a second.
Nobody outside of Google actually knows exactly how their algorithms operate (and probably very few within Google know everything). It’s a closely guarded secret. But through years of experimentation, the SEO community have developed an understanding of how search works.
I believe the process is similar to this:
1 - Query interpretation
Google will look at the search query and try to interpret what the searcher wants. They will consider factors such as:
- What device is the searcher using?
- What location is the searcher in?
- What sort of results is the searcher expecting (e.g. shopping or news)?
This is where a factor called RankBrain probably comes in. RankBrain is a bit of an enigma because nobody outside of Google truly knows what it is and how it works. What we do know is that it’s an important part of their query processing.
With the huge growth in mobile voice search, people use natural conversational language when searching. 15% of searches on Google are new search queries that Google has never seen before. RankBrain is believed to be a machine learning algorithm that uses historical search data to predict what the user-intent behind a new search query might be.
RankBrain probably attempts to understand the true intent, both explicit and implicit, behind an ambiguous query.
2 - Retrieval
Once the query has been processed the algorithm will search Google’s index and retrieve matching pages. Imagine the index as Google’s huge database of the internet. It contains billions of pages that they’ve discovered through crawling.
The index is split into thousands of shards. Each shard is a partition of the total index and contains millions of webpages. Sharding their index allows Google to return search results faster. This is because each shard can be searched simultaneously which is faster than searching the entire index in one go.
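To make the idea of sharding concrete, here is a minimal sketch in Python. It is not how Google's index works internally (that isn't public). The three tiny shards, the URLs and the function names are all made up for illustration. Each shard is searched at the same time and the partial results are merged at the end.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical toy index: each shard maps a page URL to its text.
SHARDS = [
    {"a.example/tv": "why your tv looks strange", "a.example/seo": "seo basics"},
    {"b.example/soap": "the soap opera effect on a tv"},
    {"c.example/cake": "how to bake a cake"},
]

def search_shard(shard, term):
    """Return the URLs in one shard whose text contains the term."""
    return [url for url, text in shard.items() if term in text]

def search_index(shards, term):
    """Query every shard concurrently and merge the partial results."""
    with ThreadPoolExecutor(max_workers=len(shards)) as pool:
        partial_results = list(pool.map(lambda shard: search_shard(shard, term), shards))
    return sorted(url for partial in partial_results for url in partial)

results = search_index(SHARDS, "tv")
print(results)
```

With only three small shards the speed-up is invisible, but the shape is the same at scale: no single worker ever has to scan the whole index.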
3 - Scoring
At this point, the query has been processed and Google has retrieved any pages from their index that are a match. Now they score those pages to determine the ranking order.
The actual signals that Google use to determine the rankings are probably the most closely guarded part of the search process. The aim of the scoring process is to calculate the relevance and quality of the page in relation to the user’s query.
Google will look at the content on the page and score it depending on how well it satisfies the needs of the searcher.
- Are appropriate words and phrases included in a natural format?
- Does the page reference related terms and synonyms?
- How well is the page designed?
- If the search is on a mobile, is the page optimised for smartphones?
- Does the page load promptly for visitors?
- How many other trusted websites link to this page?
- They may have historical data on how users have engaged with the page in the past
These are just some examples that simplify a hugely complex process.
Each matched page will receive an overall score and that decides the initial ranking.
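As a rough illustration of scoring, the sketch below combines a few signals into one overall score and sorts pages by it. The signal names, the weights and the example pages are assumptions invented for this example; Google's real signals and weights are not public.

```python
# Hypothetical signal weights; Google's real signals and weights are not public.
WEIGHTS = {"relevance": 0.5, "links": 0.3, "speed": 0.2}

def score_page(signals):
    """Combine per-signal values (each between 0 and 1) into one weighted score."""
    return sum(WEIGHTS[name] * signals.get(name, 0.0) for name in WEIGHTS)

def rank_pages(pages):
    """Order page URLs from highest to lowest overall score."""
    return sorted(pages, key=lambda url: score_page(pages[url]), reverse=True)

pages = {
    "example.com/a": {"relevance": 0.9, "links": 0.2, "speed": 0.5},
    "example.com/b": {"relevance": 0.6, "links": 0.9, "speed": 0.9},
    "example.com/c": {"relevance": 0.3, "links": 0.1, "speed": 1.0},
}
ranking = rank_pages(pages)
print(ranking)
```

Notice that the most relevant page (a) doesn't win here, because page b is stronger on links and speed. That's the point of blending several signals.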
4 - Ranking Adjustments
Before actually presenting the searcher with the set of results, some final adjustments will be made. Google have spam filters that check whether any of the results are over-optimised or have characteristics associated with spam.
If there are multiple results from the same website they may cluster these into a single result with sub-links (known as sitelinks).
In some cases, they may highlight certain pages at the top of the page as a featured snippet. These are pages that Google thinks may strongly align with the intent of the searcher.
The Stages of a Google Search
Accessibility / Indexation
The most important (and basic) SEO consideration is making sure that search engines can access your website. If a search engine is stopped from visiting your site then they cannot discover your content. If they’re not aware your content exists then it will never rank on their search engine.
This discovery process is called ‘crawling’. Search engines will send bots (or crawlers as they’re known) to your website. These bots will crawl the content on your pages and follow any hyperlinks on your pages to discover more content.
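The crawl-and-follow loop can be sketched in a few lines of Python. This is a simplified illustration, not a real search engine bot: the site is a made-up dictionary of paths and HTML held in memory, whereas a real crawler would fetch each page over HTTP.

```python
from collections import deque
from html.parser import HTMLParser

class LinkParser(HTMLParser):
    """Collect the href of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        href = dict(attrs).get("href")
        if tag == "a" and href:
            self.links.append(href)

# Hypothetical site: path -> HTML. A real crawler would fetch these over HTTP.
SITE = {
    "/": '<a href="/guides">Guides</a> <a href="/about">About</a>',
    "/guides": '<a href="/guides/seo">SEO</a> <a href="/">Home</a>',
    "/guides/seo": '<a href="/guides">Back</a>',
    "/about": "No links here.",
    "/orphan": "Nothing links to this page.",
}

def crawl(site, start="/"):
    """Breadth-first crawl: visit a page, queue its unseen links, repeat."""
    discovered = [start]
    queue = deque([start])
    while queue:
        page = queue.popleft()
        parser = LinkParser()
        parser.feed(site.get(page, ""))
        for link in parser.links:
            if link not in discovered:
                discovered.append(link)
                queue.append(link)
    return discovered

crawled = crawl(SITE)
print(crawled)
```

The /orphan page exists on the site but is never discovered, because no other page links to it.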
Following this crawling process, they may add your URLs to their index. This is where elements of technical SEO come into play. There are various conditions that can affect how search engines crawl and index your content.
This is very important.
Factors that Influence Indexation and Crawlability
When a crawler first comes to your site they will look for a file called robots.txt in the root of your domain. The robots.txt file can include directives to the crawler on which parts of your website can and cannot be crawled. For example, you may want to tell a crawler not to crawl your checkout pages.
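Here is a small sketch of how a crawler might read those directives, using the robots.txt parser from Python's standard library. The robots.txt content, the /checkout/ path and the "ExampleBot" user agent are assumptions made up for this example.

```python
from urllib.robotparser import RobotFileParser

# Example robots.txt content; the /checkout/ path is an assumption for illustration.
ROBOTS_TXT = """\
User-agent: *
Disallow: /checkout/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# "ExampleBot" is a hypothetical crawler name.
can_crawl_home = parser.can_fetch("ExampleBot", "https://www.example.com/")
can_crawl_checkout = parser.can_fetch("ExampleBot", "https://www.example.com/checkout/basket")
print(can_crawl_home, can_crawl_checkout)
```

You can use the same check to test your own robots.txt rules before you publish them.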
Most sites have an XML sitemap which references all the important pages on the website. Search engines such as Google provide a tool called Search Console where you can submit your XML sitemap. The search engines will then crawl the XML sitemap to discover the URLs on your website. This is one of the easiest ways to help a search engine discover and crawl a page.
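For reference, a very basic XML sitemap can be generated like this. It's a minimal sketch that only writes the required loc element for each URL. The example.com URLs and the build_sitemap function name are placeholders.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

def build_sitemap(urls):
    """Return a sitemap XML string listing each URL in a <url><loc> entry."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")

sitemap_xml = build_sitemap([
    "https://www.example.com/",
    "https://www.example.com/guides/seo",
])
print(sitemap_xml)
```

In practice most content management systems or SEO plugins generate the sitemap for you, so a script like this is mainly useful for custom-built sites.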
There are various ways to tell search engines not to index a URL. The most common is the noindex meta tag. It’s a meta tag that can be included in the <head> section of a page. When a search engine crawls a URL and discovers a noindex tag, it will know not to index the URL. If the URL is already indexed, in most cases it will promptly remove the URL from the index.
Alternatively, the X-Robots-Tag HTTP response header can also be used to specify a noindex.
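If you want to audit your own pages for noindex, a check along these lines is one way to do it. This is a simplified sketch: the class and function names are made up, the headers are passed in as a plain dictionary with an exact "X-Robots-Tag" key, and bot-specific meta tags (such as name="googlebot") are ignored.

```python
from html.parser import HTMLParser

class RobotsMetaParser(HTMLParser):
    """Collect the directives from any <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "meta" and (attributes.get("name") or "").lower() == "robots":
            content = attributes.get("content") or ""
            self.directives.update(part.strip().lower() for part in content.split(","))

def is_noindex(html, headers=None):
    """Return True if the meta tag or the X-Robots-Tag header asks for noindex."""
    parser = RobotsMetaParser()
    parser.feed(html)
    header_value = (headers or {}).get("X-Robots-Tag", "")
    header_directives = {part.strip().lower() for part in header_value.split(",")}
    return "noindex" in parser.directives or "noindex" in header_directives

page = '<html><head><meta name="robots" content="noindex, follow"></head></html>'
print(is_noindex(page))
print(is_noindex("<html><head></head></html>", {"X-Robots-Tag": "noindex"}))
```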
If you have a URL that’s very similar to another URL then you might not want both to be indexed. In this scenario, you can tell search engines which version to index by referencing a canonical URL.
The canonical URL is the version chosen to be indexed and will be crawled most frequently. In doing so you will affect how search engine bots crawl and index your content. Canonicalisation is one of the most debated areas of SEO and is too in-depth to fully explain here.
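The usual way to reference a canonical URL is a link element with rel="canonical" in the head of the page. The sketch below reads that value from a page's HTML. The class name, function name and example URLs are placeholders, and it doesn't cover canonicals declared in HTTP headers or sitemaps.

```python
from html.parser import HTMLParser

class CanonicalParser(HTMLParser):
    """Remember the href of the first <link rel="canonical"> tag."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        rel_values = (attributes.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rel_values and self.canonical is None:
            self.canonical = attributes.get("href")

def find_canonical(html):
    """Return the canonical URL declared in the HTML, or None if there isn't one."""
    parser = CanonicalParser()
    parser.feed(html)
    return parser.canonical

# A filtered URL such as /shoes?sort=price could point back to the main /shoes page.
page = '<head><link rel="canonical" href="https://www.example.com/shoes"></head>'
print(find_canonical(page))
```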
How a website links its pages together will form its information architecture. Important pages will be linked to frequently and from prominent places (like the homepage and primary navigation). When search engine bots crawl your website they will develop an understanding of your information architecture and which pages are important.
This will affect how search engines crawl your site.
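One simple way to see which pages your own internal linking treats as important is to count how many other pages link to each one. The link graph below is made up for illustration, and a raw count like this is only a rough proxy for what search engines actually infer.

```python
from collections import Counter

# Hypothetical internal link graph: page -> pages it links to.
LINKS = {
    "/": ["/shoes", "/bags", "/contact"],
    "/shoes": ["/", "/shoes/trainers", "/contact"],
    "/bags": ["/", "/contact"],
    "/shoes/trainers": ["/", "/shoes"],
    "/contact": ["/"],
}

def inbound_link_counts(links):
    """Count how many distinct pages link to each page (self-links ignored)."""
    counts = Counter()
    for source, targets in links.items():
        for target in set(targets):
            if target != source:
                counts[target] += 1
    return counts

counts = inbound_link_counts(LINKS)
for page, total in counts.most_common():
    print(page, total)
```

Pages in the primary navigation (the homepage and /contact here) collect the most internal links, while deeper pages such as /shoes/trainers collect the fewest.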
Think of Search Console as your communication point with Google. It’s where they send you information about how your site performs on search, and will highlight any problems to you.
You can also use Search Console to influence how Google crawls your site.
Create content that users love and search engines will love you too. Sounds easy right?
Unfortunately, it’s not that easy but it’s such a huge part of SEO it cannot be ignored. To rank on Google you need to have content that meets the needs, wants and desires of potential visitors. They’re searching on Google because they seek something, and you need to provide that ‘something’.
Content starts with research. You need to understand how your audience searches on Google. The phrases, terminology and language they use. Most niches are already heavily saturated so to compete you need to have better content than the competition. What problem can you solve for your target audience?
Content optimisation starts with keyword research. Keyword research is the process of discovering what keywords your target audience use when searching on Google. Tools for this include the free AdWords Keyword Planner and paid services such as Moz Keyword Explorer.
It’s important not to build content around individual keywords. Instead, you should segment your keyword research by searcher-intent. Map keywords into groups that have the same search-intent and then build content that satisfies that search-intent.
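As a very rough sketch of grouping keywords by intent, the example below buckets keywords using a handful of modifier words. The word lists and intent labels are assumptions for illustration only. Real intent analysis usually means looking at what actually ranks for each keyword.

```python
from collections import defaultdict

# Hypothetical modifier words used as a crude stand-in for real intent analysis.
INTENT_MODIFIERS = {
    "transactional": ("buy", "cheap", "price"),
    "informational": ("how", "what", "why"),
}

def classify_intent(keyword):
    """Return the first intent whose modifier words appear in the keyword."""
    words = keyword.lower().split()
    for intent, modifiers in INTENT_MODIFIERS.items():
        if any(word in modifiers for word in words):
            return intent
    return "other"

def group_by_intent(keywords):
    """Map each intent label to the list of keywords that share it."""
    groups = defaultdict(list)
    for keyword in keywords:
        groups[classify_intent(keyword)].append(keyword)
    return dict(groups)

groups = group_by_intent([
    "buy running shoes",
    "how to clean running shoes",
    "cheap trainers",
    "running shoes",
])
print(groups)
```

Each group then becomes a candidate for one piece of content, rather than one page per keyword.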
Hummingbird and Conversational Search
Google’s natural language processing capabilities have come a long way. Many years ago their information retrieval systems relied on matching the terms in a search query with documents that contain the same terms.
Today things are very different. Their ability to understand the true intent behind a search query has evolved. They released a new version of their algorithm in 2013 called Hummingbird. Hummingbird was an evolution of their search algorithm that was better at interpreting conversational language.
With this in mind, content should be written in a natural manner. No overusing keywords. The best approach is to hire professional copywriters who know and understand the subject matter.
We can demonstrate this in action right now. Take this example search query:
“Why does my TV look strange?”
If you search for this on Google, the search results will return pages that discuss ‘the soap opera effect’. Google knows that the searcher seeks information about the soap opera effect, even though it wasn’t explicitly stated in the search query. Google understood the user-intent.
This example demonstrates how redundant it is to over-optimise for specific keywords. Google clearly believes that to answer the user’s search query, winning pages must discuss the soap opera effect, so any pages that were highly optimised around “why does my TV look strange” probably wouldn’t rank as well.
Google’s ability to understand natural language and relate words to concepts is highly sophisticated. An effective on-page SEO strategy is to ensure your content is written by an expert on the topic you want to rank for. An expert will naturally reference phrases and concepts related to your target topic.
Google released an algorithm update called Panda in 2011 which was designed to reward high-quality content, and demote low-quality content. This is still relevant today and content quality is a massive SEO consideration.
Every URL from your website that is indexed on Google will contribute towards Google’s perception of quality in your website.
Some signals of low-quality content:
- Duplicate or near duplicate pages indexed (internal and external)
- Thin content that provides nothing of value
- Ambiguous content where the purpose is unclear
Every page has to have a clear purpose. If search engines struggle to understand the difference between two pages then it will inhibit the ability of either to rank.
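To spot near-duplicate pages on your own site, one common technique is to compare overlapping word sequences (shingles) using Jaccard similarity. The sketch below is an illustration of that technique only. It is not a description of how Panda or any Google system detects duplicates, and the function names are made up.

```python
def shingles(text, size=3):
    """Return the set of overlapping word n-grams (shingles) in the text."""
    words = text.lower().split()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}

def jaccard_similarity(text_a, text_b, size=3):
    """Share of shingles the two texts have in common (0.0 to 1.0)."""
    a, b = shingles(text_a, size), shingles(text_b, size)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

similarity = jaccard_similarity(
    "red running shoes for men",
    "red running shoes for women",
)
print(similarity)
```

A score close to 1.0 suggests two pages are saying almost the same thing, which makes them candidates for merging or canonicalising.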
Your information architecture can help algorithms understand the context of your website. The hierarchical organisation of your content will provide signals on relevance and importance. Your site should have a logical linking structure that makes sense both to humans and search engines.
A popular approach is called the Hub and Spoke strategy. This involves organising your content into logical groups. Each group has a central hub page which is optimised around a broad topic. The hub page links out to closely related pages (the spokes) which are optimised around the same theme but at a more specific level.
The benefit of this approach is that it sends clear signals to search engines as to what your pages are about. It helps search engines understand the contextual relevance of your website.
Structured data is any form of data that exists in a structured, machine-readable format. More and more of the web is evolving into a structured format and the ability to structure your web data has never been easier.
On the web, the most common way to structure your data is with Schema. Schema can help a search engine recognise entities and elements which have a semantic relationship to an entity.
For example, on an e-commerce product page, it would be appropriate to include product Schema. This tells a search engine that this isn’t just any page, it is a transactional, e-commerce product page.
The product itself will be the primary entity. The on-page price would use the price Schema, the description would use the description Schema, and the stock status would use the availability Schema. With this in place, search engines will recognise the semantic relationship these elements have with each other.
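Schema is often added to a page as a JSON-LD script. The sketch below builds product markup using the schema.org Product and Offer types. The product details and the product_json_ld function name are invented for the example, and a real product page would normally include more properties.

```python
import json

def product_json_ld(name, description, price, currency, in_stock):
    """Build schema.org Product markup as a JSON-LD string."""
    availability = "https://schema.org/InStock" if in_stock else "https://schema.org/OutOfStock"
    data = {
        "@context": "https://schema.org",
        "@type": "Product",
        "name": name,
        "description": description,
        "offers": {
            "@type": "Offer",
            "price": f"{price:.2f}",
            "priceCurrency": currency,
            "availability": availability,
        },
    }
    return json.dumps(data, indent=2)

# Made-up product used purely as an example.
markup = product_json_ld("Red Trainers", "Lightweight running shoes.", 49.99, "GBP", True)
print(markup)
```

The output would sit inside a script tag of type application/ld+json on the product page.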
Why is structured data important? Well, look at it this way. If Google crawls a page on your website and there is no structured data, Google needs to figure out for itself the purpose of the page. By utilising structured data you can make these signals much clearer.
Structured data isn’t a ranking signal so you don’t get any organic boost for using it. What it does do is help search engines understand the purpose of your content. It helps search engines recognise entities and associated properties in your content. It’s a clear signal and it removes ambiguity.
Become an Authority
When it comes to SEO, the importance of trust and credibility cannot be overstated. Search engines will look for independent signals of trustworthiness before they consider ranking a website. Your website must become an authority in your online space if you want search engines to send you traffic.
This is one of the most difficult parts of SEO. Generally, when measuring authority, search engines will look for ‘off-site’ signals. This means independent signals from domains other than your own, i.e. websites you don’t have any control over. It’s hard because what reason do third-party publishers and authors have to help you?
But if you can develop a positive reputation which is recognisable online to search engines, then they are going to place more trust in your site.
If you start reading about SEO it won’t be long before you come across a topic called backlinks.
We know trust is a hugely important factor in the ranking process. But how does Google judge how credible a website actually is? Primarily they look for off-site, independent signals, the main one being backlinks.
A backlink is a hyperlink from one URL to another. In the eyes of Google, this is seen as a vote of confidence in the URL being linked to. The more backlinks a website has from trusted sources, the more authority it will have.
The context of your backlinks will also influence how relevant search engines think you are for certain topics. It seems obvious but if you want to rank for a certain topic, then it will help if your links are from pages that discuss the same topic. If a high proportion of your links are irrelevant and off-topic, they’re not going to carry as much weight.
It’s important to understand a concept called PageRank. PageRank is an algorithm Google uses to measure how important pages are. In simple terms, every indexed page has what’s called a PageRank score. This score is determined by the backlinks that a page has and the PageRank scores of those linking pages.
When a page links to another page, some of the PageRank from the linking page is passed to the linked page. Pages that have lots of backlinks from pages with high PageRank will develop their own PageRank and grow in trust.
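The idea of rank being passed through links can be sketched with the classic, simplified PageRank calculation. This is the textbook version, not Google's current implementation. The four-page graph is made up, the damping value of 0.85 is the commonly quoted default, and the code assumes every linked page also appears as a key in the graph.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank by repeated iteration over a page -> outlinks graph."""
    pages = list(links)
    rank = {page: 1.0 / len(pages) for page in pages}
    for _ in range(iterations):
        # Every page starts each round with a small baseline share.
        new_rank = {page: (1.0 - damping) / len(pages) for page in pages}
        for page, targets in links.items():
            if targets:
                # A page splits its rank equally between the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outlinks spreads its rank evenly over all pages.
                for other in pages:
                    new_rank[other] += damping * rank[page] / len(pages)
        rank = new_rank
    return rank

# Hypothetical link graph: page -> pages it links to.
GRAPH = {
    "home": ["guide", "about"],
    "guide": ["home"],
    "about": ["home", "guide"],
    "orphan": ["home"],
}
ranks = pagerank(GRAPH)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 3))
```

The home page ends up with the highest score because every other page links to it, while the orphan page, which has no backlinks at all, only keeps the baseline share.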
The algorithm has evolved over the years to reward more PageRank to links that are likely to be clicked. Long story short, backlinks from trusted, relevant and popular pages, that will refer ‘real traffic’ will benefit your SEO.
PageRank is too complex to fully explain here but if you’d like to learn more then I’d highly recommend you read this excellent explanation by Dixon Jones on Majestic.
E-A-T is a popular trend amongst the SEO community at the moment. It stands for Expertise, Authoritativeness and Trustworthiness and comes from Google’s Search Quality Rater Guidelines.
But, it’s not a ranking factor. It’s a concept Google uses to explain some of the ways they measure credibility and importance. There is a huge amount of debate amongst the SEO community about how E-A-T truly affects SEO.
Expertise, authority, and trust are things that search engine algorithms are getting better at recognising. Recent algorithm updates such as Medic seemed to demote the rankings for medical websites. In reality, it affected websites in many other niches too.
The quality rater guidelines discuss another concept called YMYL which stands for ‘Your Money or Your Life’. These are websites which could affect the financial wellbeing, health or happiness of a potential visitor. For websites in these areas, Google needs a higher level of trust.
YMYL websites should consider ways to demonstrate their expertise, authority, and trust.
- Accurate content written by a knowledgeable expert
- Links from highly relevant and trusted sources
- Clear and transparent information about who is responsible for your website
When you visit a website, information is sent back and forth over a connection between your browser and web server. Most websites ensure this connection is secure by operating over what’s called HTTPS.
HTTPS stands for Hypertext Transfer Protocol Secure. The alternative to HTTPS is HTTP. When a website operates over HTTPS, there is an added layer of security. Data going back and forth between the browser and the server is encrypted. This makes it harder for anyone to intercept the data on this journey (known as a man-in-the-middle attack). HTTPS is enabled by installing an SSL/TLS certificate on the web server.
Search engines value user security highly. In fact, there is a very small ranking boost for websites that operate over HTTPS. The benefit is tiny: if two sites were equal on every other signal and only one used HTTPS, the HTTPS one would win.
HTTPS is a pretty basic security layer that all websites should have, especially any that pass sensitive data over a connection. Imagine if your credit card or password details were compromised on a website. HTTPS should be enabled as standard to protect user data, and not for SEO benefit.
The dreaded algorithm update. SEOs give a collective shudder when they hear those words.
If you’ve had any involvement in SEO then you’ve probably heard SEOs speak about Penguins and Pandas. They’re not talking about cute furry animals.
Google updates its search algorithms all the time. Every day in fact. Most of the changes go unnoticed. Other times, the updates shake up the rankings so much, the SEO community notice the ranking changes and go nuts.
Some of these algorithm changes are designed to demote the rankings of websites that use underhand SEO tactics. If a website engages in manipulative SEO practices, then it’s at risk of being impacted by algorithm changes which could destroy its organic visibility.
Penguin was an algorithm update that was first released in 2012. It was designed to punish websites that had engaged in artificial link building tactics. If a website was targeted by Penguin, the website would have to remove the artificial links to regain its organic visibility.
In the past, Google would run the Penguin algorithm incrementally. If a website was affected by Penguin, the website wouldn’t see any change in Penguin’s effect until the next iteration of Penguin was run. This could be many months or even a year.
In 2016 Google released an iteration called Penguin 4.0. A major change with Penguin 4.0 is that it became a real-time algorithm. This meant that Google would process Penguin adjustments much sooner than before.
The Panda algorithm was first released in 2011. Panda was all about lowering the rankings of websites with poor content. Prior to Panda, Google’s search results were littered with awful, low-quality content: pages that had gamed their way into the search results using spam tactics. Panda targeted thin content, duplicate content, and content lacking in genuine value to human readers.
Panda used to be released in iterations every few months but is now baked into the core algorithm. Whilst Google no longer confirms when Panda is updated, it’s very likely Panda’s inner workings still play a core role in ranking.
Notable Algorithm Updates
Not only do search engines want to rank the most relevant and authoritative pages, but they also want to reward websites that provide a great experience for users. In recent years user-experience has become more important for SEO. You need to think beyond keywords and backlinks.
Remember Google’s primary goal is to satisfy the search intent of its users. Make no mistake, if they send you search traffic, at some level they will be measuring how happy your site makes those visitors. Exactly how they do this, we don’t know. There are concepts such as ‘the long click’ and ‘dwell time’ which Google can track through their logs and could be UX signals.
To speculate, I believe that they use a level of machine learning to predict the quality of experience a visitor might have on your site. Google will have access to vast amounts of training data on which to learn the characteristics of great and poor UX.
Your best bet is to focus on the wants, needs and goals of your audience, and make these easily achievable on your website.
How fast your website loads is hugely important for both users and search engines. Users obviously hate slow websites and search engines don’t want to send users to slow websites. With the growth of mobile traffic, the page load expectation for most users has increased too.
Google has officially made page speed a ranking factor and provides many resources to help make websites faster. Speed is hugely important for Google so it should be for you too.
The majority of web browsing is done on smartphones and Google recognise the importance of this. In 2015 they updated their algorithm to give a small mobile search ranking boost to websites that were mobile-friendly. Mobile-friendly means the website offers a mobile-optimised experience for visitors browsing on smartphones.
This could be a responsive website, an adaptive website or a separate mobile website. Either way, you simply have to have a mobile version of your website if you wish to compete on Google.
How well does your website actually help your users achieve their goals? This should be a question website owners regularly ask themselves.
People visit a website for all sorts of reasons. It could be to read a news article, retrieve an answer to a question, or purchase an item. There’s always a reason.
Helping your visitors achieve their web browsing goals is one of the best SEO tactics you can use. Ultimately, if Google sends you search traffic and you consistently fulfill the needs, wants and desires of those visitors, Google is probably going to send you more traffic.
Satisfied and happy visitors will bring about SEO success both directly and indirectly. If your website’s visitors continually engage with your content in a positive way, search engines will recognise this as a signal that your website meets the needs of that audience.
Is SERP Click Through Rate a Ranking Signal?
There is a never-ending debate amongst the SEO community about whether Google uses SERP click-through rate (CTR) as a ranking signal. There is some logic to this argument as engagement with the SERPs is something Google measure. It makes sense that a SERP result that garners many clicks is probably a strong indicator that the link matches the user-intent of the query.
Likewise, there is speculation that web searchers pogo-sticking back and forth to the search results could be a negative signal.
However, Google has repeatedly stated CTR is not a ranking factor. This would be a noisy signal and wouldn’t necessarily demonstrate that a webpage best meets the needs of the searcher.
Ultimately if you can optimise to attract more clicks from Google and then satisfy the search intent of those clicks, you’ve got the right strategy.
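For reference, click-through rate is simply clicks divided by impressions. The sketch below works it out for a couple of made-up pages. The figures are invented, although Search Console reports the same two numbers for your own pages.

```python
def click_through_rate(clicks, impressions):
    """Return CTR as a percentage; zero impressions gives 0.0."""
    if impressions == 0:
        return 0.0
    return 100.0 * clicks / impressions

# Hypothetical figures: (page, clicks, impressions).
rows = [
    ("/guides/seo", 50, 1000),
    ("/guides/backlinks", 12, 800),
]
for page, clicks, impressions in rows:
    print(page, click_through_rate(clicks, impressions))
```

A page with lots of impressions but a low CTR is often a good candidate for a better title and meta description.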
One of the most annoying parts of a website browsing experience is when you are overwhelmed by aggressive advertising. Users hate intrusive advertising, and so do search engines. In fact, in Google’s Search Quality Rater Guidelines, raters are repeatedly told to rate web pages with aggressive advertising as ‘low quality’.
This doesn’t mean you can’t display any adverts. That would be crazy because many websites rely on advertising as a monetisation strategy. But the key here is to put users first. Don’t overload your pages with banners, don’t fill your content with affiliate links, do put the user first.