What Lies Beneath – The Importance of Data Segmentation in Digital Marketing

By Will King | September 19, 2019

Last updated: September 20, 2019

Have you ever noticed a trend in a dataset but struggled to understand the cause? In web marketing we spend lots of time looking at charts and data, but to truly understand what’s going on, you need to delve below the top level. 

Read on to learn how we uncovered stories behind the data through the power of segmentation.

Why Do We Need to Segment?

Data is a truly wonderful thing. It allows us as marketers to measure our work and prove our value. It can tell stories and provide mind-blowing insights. 

It can also be misleading. In fact, very misleading.

When we consider web analytics, we tend to look at top-level totals and averages. This can be useful: we can learn the total number of sessions our site generated or how many orders we received. These are all valuable metrics.

But top-level averages rarely tell us ‘why’. We can see the end result, but we can’t see the story behind the numbers. It’s foolish to draw conclusions from top-level stats without understanding the ‘why’. This is where segmentation comes in.

I’m going to run through three case studies in which I’ve applied segmentation to SEO work. These stories demonstrate the power and importance of this tactic.

Making Sense Out of Search Console

Search Console Average Position

Consider this screenshot from the Search Console Performance report. It looks alarming, doesn’t it? Every SEO’s worst nightmare. The average position for this website had been on the up but has suddenly plummeted. Google must have hit it with an algorithm update!

Before we panic, let’s take a closer look at what’s going on here. If we look at the clicks report, clicks are actually going up. How can that be, considering rankings have apparently dropped?

Search Console Clicks

The real cause is simply how Search Console presents the Performance report. The average position metric is directly linked to the ‘impressions’ metric. Average position is not where you rank; it’s your position averaged over every impression your site received.

If your site is shown to a user on Google search for any query, it will affect your reported average position. You need to consider how seasonality and topic interest affect search demand. If your pages experience a sudden growth or decline in impressions, this will affect the reported average position.

In our case above, we were able to diagnose why this had happened by digging into the landing pages. When we segment by landing page, the data tells a very different story. It’s clear that certain URLs have seen a growth in impressions on pages 2 and 3 of the search results. This drags down the overall average position. When we exclude these URLs from the top-level report, we see the trend we would expect.

The lesson here is that peaks and troughs in search impressions can distort the average position.
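
To make the distortion concrete, here is a minimal Python sketch that treats average position as an impression-weighted mean across landing pages. This is a simplification of how Search Console aggregates the metric, and the URLs and figures are invented purely for illustration. Neither page changes rank; only the impressions of the lower-ranking page grow.

```python
def average_position(pages):
    """Impression-weighted average position across a list of pages."""
    total_impressions = sum(p["impressions"] for p in pages)
    if total_impressions == 0:
        return None  # no impressions, so no meaningful average
    weighted = sum(p["position"] * p["impressions"] for p in pages)
    return weighted / total_impressions


# Hypothetical data: "/seasonal" sits on page 3 of the results in both
# periods, but its impressions jump from 1,000 to 9,000.
before = [
    {"url": "/guide", "position": 3.0, "impressions": 9000},
    {"url": "/seasonal", "position": 25.0, "impressions": 1000},
]
after = [
    {"url": "/guide", "position": 3.0, "impressions": 9000},
    {"url": "/seasonal", "position": 25.0, "impressions": 9000},
]

print(round(average_position(before), 1))  # 5.2
print(round(average_position(after), 1))   # 14.0

# Segmenting out the URL that gained impressions shows nothing moved.
stable_only = [p for p in after if p["url"] != "/seasonal"]
print(round(average_position(stable_only), 1))  # 3.0
```

The reported average falls from 5.2 to 14.0 even though no individual ranking changed, which is the pattern described above.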

Giving Keywords Some Context

If we look at Search Console’s query data, we get more opportunities to segment. The top-level query numbers are useful, but they become a lot more interesting if you group them. 

I like to open a Google Sheets spreadsheet and use the Analytics for Sheets plugin to hook into the Search Console API. This allows you to export all your data from Search Console, rather than the 1,000 rows the interface normally limits you to.

With your query data in a spreadsheet, we can apply some Excel magic to give it more context. I like to know what the split is between brand/non-brand traffic. I also like to understand the individual performance of these separate groupings. 

As you can see below, our keyword data suddenly paints a bigger picture. We can get a sense of how many visitors seek out this site on Google and how many discover the site through generic search terms.

Keyword Summary
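
The same brand/non-brand grouping can be sketched outside a spreadsheet. The Python below is one possible approach, not the exact method used for the summary above: the brand term "acme", the query rows and the helper names `label_query` and `summarise` are all hypothetical, and a real brand list would need misspellings and product names adding to it.

```python
from collections import defaultdict

BRAND_TERMS = ("acme",)  # hypothetical brand name; extend with variants


def label_query(query, brand_terms=BRAND_TERMS):
    """Label a query 'brand' if it contains any brand term."""
    q = query.lower()
    return "brand" if any(term in q for term in brand_terms) else "non-brand"


def summarise(rows):
    """Total clicks and impressions per group, plus click share and CTR."""
    totals = defaultdict(lambda: {"clicks": 0, "impressions": 0})
    for row in rows:
        group = label_query(row["query"])
        totals[group]["clicks"] += row["clicks"]
        totals[group]["impressions"] += row["impressions"]

    all_clicks = sum(t["clicks"] for t in totals.values())
    summary = {}
    for group, t in totals.items():
        summary[group] = {
            "clicks": t["clicks"],
            "impressions": t["impressions"],
            "click_share": t["clicks"] / all_clicks if all_clicks else 0.0,
            "ctr": t["clicks"] / t["impressions"] if t["impressions"] else 0.0,
        }
    return summary


# Invented rows in the shape of a Search Console query export.
rows = [
    {"query": "acme shoes", "clicks": 120, "impressions": 400},
    {"query": "Acme discount code", "clicks": 80, "impressions": 600},
    {"query": "running shoes", "clicks": 150, "impressions": 5000},
    {"query": "best trail shoes", "clicks": 50, "impressions": 4000},
]

for group, stats in summarise(rows).items():
    print(group, stats)
```

Simple substring matching is a deliberate shortcut here; it will mislabel generic queries that happen to contain the brand string, so it is worth spot-checking the labels.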

The Power of Power Laws

Power laws are one of the most interesting and valuable areas of statistics for marketers. Think of a power law as a relationship in which a small share of one quantity accounts for a disproportionately large share of another. Kevin Indig wrote a great piece about this subject on his personal site.

I’m sure you’ve heard of the 80/20 principle in which 20% of your input is responsible for 80% of your output. But this methodology can be applied to all sorts of data. 

My favourite is to use it on the Landing pages report in Google Analytics. Segment by ‘Google/Organic’ and export to Excel.

In your spreadsheet, we can calculate what proportion of total sessions or revenue each tier of top landing pages contributed. In an ideal world, you would expect your top 20% of pages to generate 80% of your traffic. But I’ve rarely found this to be the case.

What I normally see is somewhere in the region of 10% of pages contributing 90% of total traffic. Reverse this and 90% of pages generate only 10% of traffic. For example, the site in the screenshot below had thousands upon thousands of pages, but when we segmented the data, we discovered that 70% of traffic came from 3% of pages!

Power Law Example

This gets even more severe as you dig into the bottom 90%. In fact, if you were to look at your bottom 25% of all pages, they probably contributed less than 1% of your output.

I won’t digress into the causes of this or the SEO problems it can create; that’s a whole different story. But by applying segments to our landing page data, we’ve uncovered a whole new angle on landing page performance.
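
The landing page calculation described in this section can be sketched in a few lines of Python. The session counts below are invented to mimic a long-tailed site and the function names are hypothetical; with a real export you would pass in the sessions column from the Landing pages report.

```python
def share_of_top(sessions_by_page, top_fraction):
    """Share of total sessions generated by the top `top_fraction` of pages."""
    ranked = sorted(sessions_by_page, reverse=True)
    total = sum(ranked)
    if total == 0:
        return 0.0
    top_n = max(1, round(len(ranked) * top_fraction))
    return sum(ranked[:top_n]) / total


def share_of_bottom(sessions_by_page, bottom_fraction):
    """Share of total sessions generated by the bottom `bottom_fraction` of pages."""
    ranked = sorted(sessions_by_page)  # ascending, weakest pages first
    total = sum(ranked)
    if total == 0:
        return 0.0
    bottom_n = max(1, round(len(ranked) * bottom_fraction))
    return sum(ranked[:bottom_n]) / total


# Hypothetical site: 3 strong pages and a long tail of 97 weak ones.
sessions = [5000, 3000, 1000] + [10] * 97

for fraction in (0.03, 0.10, 0.20):
    print(f"top {fraction:.0%} of pages: {share_of_top(sessions, fraction):.1%} of sessions")
print(f"bottom 25% of pages: {share_of_bottom(sessions, 0.25):.1%} of sessions")
```

Running the tiers side by side (top 3%, 10%, 20%, bottom 25%) is what exposes how concentrated the traffic is; a single top-level total hides it.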

Diagnosing a Conversion Dilemma

Next, we’ll discuss an e-commerce website suffering a downward trend in conversion rate for no obvious reason. Heading into its busiest trading period, it was crucial that conversion rate was as close to forecast as possible.

Unfortunately, conversion rate was nowhere near where it should have been. I went through the normal steps to try to find the cause:

  • Extensive UX and funnel testing across all devices
  • Analysis of session recordings
  • Competitor analysis on price

The drop did seem to be isolated to organic search traffic. So could it have been a traffic quality issue? It looked that way, but I needed something more concrete.

I exported all organic landing page traffic from Google by day over a two-month period. I then classified every page with a label. This allowed me to analyse how different classifications of pages performed.

The odd thing was that every classification of page showed the conversion rate we’d expect: no downward trend, and pretty much where it should be for the time of year. I felt I was onto something, but I needed to dig further.

Next, I looked at how each segment contributed to overall performance. It was here that I found the problem. I discovered the conversion rate drop was caused by a shift in landing page traffic distribution.

It was very easy to see what percentage of total traffic each page classification contributed. For example, category pages would normally generate about 50% of traffic, product pages 30% and so on. 

By examining the relationship between each page classification I found that there had been a big shift in landing page distribution. 

Before the conversion rate drop, product pages contributed about 30% of sessions. But because of an increase in organic rankings for category pages, product pages accounted for only about 20% of total traffic. For this site, product pages tended to convert at about 3%, whereas category pages converted at only about 1.25%.

The shift in traffic percentage away from product pages meant that a much bigger chunk of traffic was entering the site through category pages. And because category pages converted at a lower rate than product pages, that meant total conversion rate was dropping. 

With this new insight, we were able to re-forecast with a model that applied the expected conversion rate by ‘page-type’. This was hugely valuable in helping the business accurately plan stock and resource. 
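
The arithmetic behind this can be sketched as follows. The 3% and 1.25% conversion rates and the 30% to 20% product page share come from the figures above; everything else is an assumption made for illustration. In particular, the sketch folds all non-product traffic into the category rate, because rates for other page types are not given, and the session numbers are invented. It is not the actual forecasting model that was used.

```python
# Conversion rate by page type (figures quoted in the article).
RATES = {"product": 0.03, "category": 0.0125}


def blended_rate(mix, rates=RATES):
    """Overall conversion rate for a traffic mix given as shares summing to 1."""
    return sum(share * rates[page_type] for page_type, share in mix.items())


def forecast_orders(sessions_by_type, rates=RATES):
    """Expected orders when each page type converts at its own rate."""
    return sum(n * rates[page_type] for page_type, n in sessions_by_type.items())


# Assumption: everything that is not a product page converts like a category page.
mix_before = {"product": 0.30, "category": 0.70}
mix_after = {"product": 0.20, "category": 0.80}

print(f"before: {blended_rate(mix_before):.3%}")  # before: 1.775%
print(f"after:  {blended_rate(mix_after):.3%}")   # after:  1.600%

# Hypothetical month of 100,000 organic sessions under the new mix.
print(forecast_orders({"product": 20_000, "category": 80_000}))
```

Neither segment's own rate changes in this example, yet the overall rate drops from about 1.78% to 1.60% purely because of the shift in mix. Forecasting per page type, as `forecast_orders` does, keeps that effect visible.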

This story is a fine example of why top-level averages can be misleading. I wouldn’t have been able to diagnose the cause of the conversion rate drop without segmenting traffic by page type.

Conclusion

Where do you go from here? Think about how these data analysis tactics can be applied to your business. Where are there opportunities to break your data into interesting blocks?

Think about your products, pages, customers, stores, locations. Slice these up and analyse how individual portions behave and look at the relationships in between. You might be surprised.