Deciding How to Represent Website Data – Part 2: Segmenting Aggregate Data

By Megalytic Staff - February 27, 2015

Web analytics guru Avinash Kaushik famously wrote that all data in aggregate is "crap". Time on site. All visits. Total revenue. None of it tells a good story. As Avinash says, the only way to get insight from data is to segment it into parts.
For example, it is mildly useful to know a website receives 15,000 Sessions (visits) per month. That aggregate metric gives you a rough idea of scale, but not much else. If you knew that over 50% of those Sessions came from Organic Search, you’d start to get some insight into what makes the website tick. Clearly, content marketing and SEO are an important part of the story for such a website.
Breaking down Sessions by Acquisition Channel (e.g., Organic Search, Referral, Social, etc.) is an example of how we can start segmenting aggregate data – Avinash’s recommended path to enlightenment.
But once you are enlightened, how do you communicate those insights to others? Using the right data visualization helps.
In Part 1 of our series on representing website data, we looked at different ways to visualize time-series web analytics data. In this, the second part of a three-part series, we look at aggregate data for fixed periods of time – how to segment it and how to best represent the segmented data to communicate your insights.


Segmenting Data by Dimensions

Google Analytics reports are made up of dimensions and metrics. Metrics are the quantitative measurements; dimensions are attributes that describe characteristics of your users and their visits. Consider, for example, the metric Sessions (website visits). One dimension is the City that the Session comes from. When I’m sitting in New York and I visit your website, Google Analytics will assign the value “New York” to the City attribute of that Session.

We already mentioned another example above – the Acquisition Channel attribute is assigned a value that indicates the method by which the visitor arrived at your website. A dimension divides your data into mutually exclusive parts. That is, the attribute can have only one value for any given Session. A Session cannot have a City attribute of both New York and San Francisco.

The fact that dimension values are mutually exclusive means we can use a pie chart to represent the segmentation of data by dimension. Because the parts don’t overlap, they always add up to 100%.
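To make the "adds up to 100%" point concrete, here is a minimal Python sketch of segmenting Sessions by a dimension. The records, the `channel` field name, and the `segment_by` helper are all made up for illustration; they are not Google Analytics or Megalytic APIs.

```python
from collections import Counter

# Made-up example records: one dict per Session, each with exactly one
# value for the "channel" dimension (not real Google Analytics rows).
sessions = (
    [{"channel": "Organic Search"}] * 13
    + [{"channel": "Social"}] * 5
    + [{"channel": "Paid Search"}] * 4
    + [{"channel": "Referral"}] * 3
)


def segment_by(records, dimension):
    """Count records per dimension value and return each value's share in percent."""
    counts = Counter(record[dimension] for record in records)
    total = sum(counts.values())
    return {value: 100.0 * count / total for value, count in counts.items()}


shares = segment_by(sessions, "channel")

# Print the segments from largest to smallest share.
for channel, share in sorted(shares.items(), key=lambda item: -item[1]):
    print(f"{channel:15s} {share:5.1f}%")

# Each Session lands in exactly one segment, so the shares cover the whole.
print(f"Total: {sum(shares.values()):.1f}%")
```

Because every record has exactly one value for the dimension, no Session is counted twice and none is left out, which is what makes the slices of a pie a fair picture of the whole.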

Despite the pie chart’s bad rap, there is one thing a pie chart can do better than any other chart. It can help your audience visualize the relationship between the parts and the whole.

 

Traffic by Channel Pie Chart

 

For example, the above pie chart helps us immediately see Organic Search contributes more than half of the Sessions to this site. We can also see no other Channel is nearly as important.

The knock against pie charts is that it is difficult to tell the relative size differences between the slices. For example, without the percentage labeling, would it be immediately obvious that Social is 25% larger than Paid Search, or almost double the size of Referral? Probably not.

However, if illustrating the relative sizes of the dimension values is your primary goal – rather than presenting the size relative to the whole – then a bar chart may tell the clearer story.

 

Traffic by Channel Bar Chart

 

Above we see the same data represented using a bar chart. Here, it is clearer that Social is larger than Paid Search and almost double the size of Referral. You can also see – just as in the pie – that Organic Search is by far the largest. What you cannot see in this bar chart is that Organic Search is more than 50% of the total. To do that, you’d have to mentally rearrange and stack the smaller bars on top of each other to “see” they don’t add up to the blue bar. That mental feat is well beyond me (and probably beyond your intended audience), which is why I find the pie chart useful. It’s the only way to clearly visualize that Organic Search has more than 50% of the total.
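The two charts answer two different questions, and the arithmetic behind each is easy to sketch in Python. The Session counts below are hypothetical, chosen only to mirror the relationships described here (they are not the data behind the charts), and the helper names are my own.

```python
# Hypothetical Session counts per Channel, chosen only to mirror the
# relationships described in the text (not the charts' real data).
sessions_by_channel = {
    "Organic Search": 7800,
    "Social": 2500,
    "Paid Search": 2000,
    "Referral": 1300,
}


def share_of_whole(counts, key):
    """Part-to-whole question (the pie chart's strength): key's percent of the total."""
    return 100.0 * counts[key] / sum(counts.values())


def relative_size(counts, a, b):
    """Part-to-part question (the bar chart's strength): how many times larger a is than b."""
    return counts[a] / counts[b]


organic_share = share_of_whole(sessions_by_channel, "Organic Search")
social_vs_paid = relative_size(sessions_by_channel, "Social", "Paid Search")
social_vs_referral = relative_size(sessions_by_channel, "Social", "Referral")

print(f"Organic Search share of total: {organic_share:.1f}%")
print(f"Social vs Paid Search: {social_vs_paid:.2f}x")
print(f"Social vs Referral: {social_vs_referral:.2f}x")
```

Deciding which of these two numbers you want the reader to take away is a reasonable way to choose between the pie and the bar.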

Changing Chart Types in Megalytic

In case you hadn’t guessed, the example charts shown here are produced from Google Analytics data by the Megalytic reporting tool. If you’d like to follow along with your own data, you can create a 14-day trial account (no credit card required).

Megalytic makes it easy to change between chart styles using the chart type selector in the widget editor.

Geographic Dimensions

Geographic Region is another frequently used dimension. Besides being helpful in showing where your website visitors are located, important marketing insights can be gleaned from geographic data. For example, if you notice your website receives a lot of traffic from Germany, you may decide to provide content written in German – particularly if you notice German traffic converts at a lower rate than traffic from English-speaking countries.

Google Analytics provides a variety of geographic dimensions, including Continent, Country, Region and City (among others). A unique property of geographic data is that it can be represented on a map.

 

Web Traffic by Country in Europe - Map

 

Here, we are using a map to illustrate the countries, and the size of the circle to indicate the relative amount of traffic coming from each location. Using this type of visualization, the reader can quickly see the United Kingdom is the largest source of traffic – significantly larger than all the other European countries. We use a legend on the right-hand side to indicate the country names and the exact Session counts. This is a useful way to clarify the data, as not everyone knows the location of every country. For example, without the legend, it might be difficult to tell there are more visitors from the Netherlands (801) than Belgium (368).
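If you ever draw a map like this yourself, one detail worth getting right is how circle size tracks the metric. How Megalytic sizes its circles is not described here; a common convention for proportional-symbol maps is to make the circle's area, not its radius, proportional to the value. The Python sketch below assumes that convention. The Netherlands and Belgium counts are the ones quoted above, while the United Kingdom count, the 40-pixel maximum, and the `circle_radius` helper are placeholders of my own.

```python
import math

# Netherlands and Belgium use the Session counts quoted in the text; the
# United Kingdom figure is a made-up placeholder for the largest country.
sessions_by_country = {"United Kingdom": 3000, "Netherlands": 801, "Belgium": 368}


def circle_radius(value, max_value, max_radius=40.0):
    """Radius in pixels such that the circle's AREA is proportional to value."""
    return max_radius * math.sqrt(value / max_value)


largest = max(sessions_by_country.values())
radii = {
    country: circle_radius(count, largest)
    for country, count in sessions_by_country.items()
}

for country, radius in radii.items():
    print(f"{country:15s} radius {radius:5.1f}px")
```

Scaling the radius directly would exaggerate the differences, since area grows with the square of the radius. Either way, circles are hard to compare precisely, which is another reason the legend with exact numbers earns its place.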

When to Use Tables

Data visualization is great, but sometimes it just makes sense to use a table to show data – even geographic data. This is particularly true when you want the reader to process multiple metrics at the same time.

As mentioned above, geographic data can provide useful marketing insights relating to language. As we can see from the map visualization, Germany is the second largest source of Sessions. However, as can be seen in the table below, Germany has a conversion rate (2.37%) that is far below the site average (3.96%).

The hypothesis backed up by this data is that because this website is English-only, engagement from non-English-speaking countries will be low. To make this point, I want to show the reader several metrics across the top European countries: Sessions, Avg Session Duration, Completions and Conversion Rate.

 

Web Traffic by Country in Europe - Table

 

In this case, a table is a good choice because it is a straightforward way to show four metrics per Country – one in each column. When using a table to make a point like this, it’s a good idea to include a text description of the insight the data shows. In this case, we might write something like this to go with the table:

The data clearly show that this website could benefit from local language content. It seems that there is a particularly significant opportunity in Germany, where we received over 1,000 visits. As shown by the low Avg Session Duration (average length of visit) of 2:12, the engagement is much lower than in English-speaking United Kingdom. This lack of engagement translates into a lower Conversion Rate -- only 2.37% in Germany vs 5.53% in the United Kingdom and 5.04% in Ireland. In fact, conversion rates are also well above average in Belgium (5.98%) and Sweden (4.52%) where English is widely spoken.
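For readers who assemble this kind of table by hand, here is a rough Python sketch of the arithmetic. It assumes Conversion Rate is Completions divided by Sessions. The rows are invented for illustration and are not the figures from the table above, and the function and variable names are my own.

```python
# Illustrative rows only: (country, sessions, completions). These numbers are
# invented for the sketch and are not the figures from the article's table.
rows = [
    ("United Kingdom", 2000, 110),
    ("Germany", 1000, 24),
    ("Netherlands", 800, 30),
    ("Belgium", 400, 24),
]


def conversion_rate(sessions, completions):
    """Conversion rate in percent, assuming rate = completions / sessions."""
    return 100.0 * completions / sessions if sessions else 0.0


# The overall average is weighted by Sessions: total completions over total
# sessions, not the plain mean of the per-country rates.
total_sessions = sum(s for _, s, _ in rows)
total_completions = sum(c for _, _, c in rows)
average = conversion_rate(total_sessions, total_completions)

print(f"{'Country':<16}{'Sessions':>10}{'Completions':>13}{'Conv. Rate':>12}")
for country, sessions, completions in rows:
    rate = conversion_rate(sessions, completions)
    flag = "  below average" if rate < average else ""
    print(f"{country:<16}{sessions:>10}{completions:>13}{rate:>11.2f}%{flag}")
print(f"{'All listed':<16}{total_sessions:>10}{total_completions:>13}{average:>11.2f}%")
```

Flagging the rows that fall below the overall rate does in code what the written description does in prose: it points the reader at the insight rather than leaving them to find it among the numbers.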

Conclusion

When looking at non-time series data about your website, the key to insight is segmenting along dimensions. Depending on what point you want to communicate, the dimension values (e.g., Channels, Countries) can be represented as slices of a pie, bars in a chart, circles on a map, or rows in a table. Rather than getting caught up in dogma (e.g., “pie charts are bad”), let the insights you want to communicate guide your decisions about how to present the data.

Next in our data visualization series we’ll look at how to represent data that compares results across time periods.

