Tag Archives: KPIs

Prospects vs Customers

Understanding your Website Visitors: Prospects vs Customers

A key metric for most websites is their conversion rate. It is a measure of how well their website is performing but, as it is an average, it can mask serious issues. People who already know you and your website (as they are existing customers) are much more likely to purchase during a single visit than people just discovering your website for the first time. But the data you use is for both groups combined.

L3 Analytics has developed an approach that identifies these two visitor segments (potentially with more granularity), allowing you to understand the behaviour of each. The type of visitor is recorded in a custom dimension (which could be an Adobe eVar or similar in other tools). Based on experiences with our clients, this can be used to expose some interesting behaviour. A website conversion rate might be 3.5% overall but only 0.4% for prospects, giving a very different picture of performance and of where your internal priorities should lie.
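To make the masking effect concrete, here is a small sketch in Python. The session and order counts are invented for illustration (they are not client data) and were chosen only so that the blended rate comes out at 3.5% and the prospect rate at 0.4%.

```python
# Hypothetical session and order counts, chosen only to reproduce the
# 3.5% overall and 0.4% prospect rates mentioned above.
segments = {
    "prospect": {"sessions": 7000, "orders": 28},
    "customer": {"sessions": 3000, "orders": 322},
}

def conversion_rate(orders, sessions):
    """Orders divided by sessions."""
    return orders / sessions

total_sessions = sum(s["sessions"] for s in segments.values())
total_orders = sum(s["orders"] for s in segments.values())

overall = conversion_rate(total_orders, total_sessions)
by_segment = {
    name: conversion_rate(s["orders"], s["sessions"])
    for name, s in segments.items()
}

print(f"overall:  {overall:.1%}")
for name, rate in by_segment.items():
    print(f"{name}: {rate:.1%}")
```

With these made-up numbers, the single 3.5% figure is mostly a reflection of how existing customers behave, which is the point of splitting the two groups.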

How this Works

It is actually more difficult than it first seems to capture a Visitor Type. The most important lesson is that the Visitor Type MUST be set at the start of the session. This allows us to calculate the conversion rates for prospects vs customers. If the value were instead updated at the moment of purchase, only customers would ever appear to place an order (as prospects become customers after doing so). But, on their next visit to the website, we need to know they are then a customer.

The exception is if the visitor logs in (having registered in a previous session) during the course of the session. After logging in, we know definitely what their Visitor Type was at the start of the session and the value should be updated to reflect this. It does mean this value needs to be passed through from internal systems.

Finally, we want to know if people are customers whether they are logged in or not. As a Visitor Type is not personally identifiable, we feel comfortable recording this after a visitor logs out.

The solution to all this is to use two cookies. One cookie is a short term cookie (equivalent to session-based) recording the value as at the start of the session (or after login). The other is a cookie with an expiration date in the distant future that records the actual Visitor Type as at that point in time. The short term cookie is the one recorded in your analytics tool.

How this Works in Practice

Creating the Cookies

The logic for defining the cookies for the simple use case where there are two Visitor Types only – prospects and customers – is as follows:

On every page load

  • If short cookie exists
    • Read current value
    • Write current value back out with 30 min expiration date
  • If short cookie doesn’t exist
    • If logged in
      • Identify visitor type (“prospect” or “customer”)
        • Set short cookie to visitor type with expiration of 30 min
        • Set long cookie to visitor type with expiration of 2 years
    • If not logged in
      • If long cookie exists
        • Set short cookie to value of long cookie with expiration of 30 min
      • Otherwise
        • Set short cookie to “prospect” with expiration of 30 min
        • Set long cookie to “prospect” with expiration of 2 years

On Login (not on registration)

  • Identify visitor type (“prospect” or “customer”)
    • Set short cookie to visitor type with expiration of 30 min
    • Set long cookie to visitor type with expiration of 2 years

On Purchase

  • If long cookie set to “prospect”
    • Set long cookie to “customer” with expiration of 2 years

Note that each cookie needs to be set as “path global” (i.e. with the path /) so it can be read across the entire website. Both should be set as first party cookies.
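The logic above can be sketched as a few small functions. This is an illustration in Python rather than the JavaScript you would actually deploy; the cookie names `vt_short` and `vt_long`, the `CookieJar` class and the `backend_type` argument (the Visitor Type passed through from internal systems) are all hypothetical names introduced for the example.

```python
from dataclasses import dataclass, field

SHORT_TTL_MIN = 30                 # short cookie lifetime in minutes
LONG_TTL_MIN = 2 * 365 * 24 * 60   # roughly 2 years, in minutes

@dataclass
class CookieJar:
    """Toy stand-in for browser cookies: name -> (value, minutes to expiry)."""
    cookies: dict = field(default_factory=dict)

    def get(self, name):
        entry = self.cookies.get(name)
        return entry[0] if entry else None

    def set(self, name, value, ttl_min):
        self.cookies[name] = (value, ttl_min)

def on_page_load(jar, logged_in, backend_type=None):
    """Apply the 'On every page load' rules and return the short cookie value."""
    short = jar.get("vt_short")
    if short is not None:
        # Write the current value back out to refresh the 30 min expiry.
        jar.set("vt_short", short, SHORT_TTL_MIN)
    elif logged_in:
        jar.set("vt_short", backend_type, SHORT_TTL_MIN)
        jar.set("vt_long", backend_type, LONG_TTL_MIN)
    else:
        long_value = jar.get("vt_long")
        if long_value is not None:
            jar.set("vt_short", long_value, SHORT_TTL_MIN)
        else:
            jar.set("vt_short", "prospect", SHORT_TTL_MIN)
            jar.set("vt_long", "prospect", LONG_TTL_MIN)
    return jar.get("vt_short")

def on_login(jar, backend_type):
    """On login (not registration), both cookies take the known type."""
    jar.set("vt_short", backend_type, SHORT_TTL_MIN)
    jar.set("vt_long", backend_type, LONG_TTL_MIN)

def on_purchase(jar):
    """Only the long cookie changes, so the current session stays 'prospect'."""
    if jar.get("vt_long") == "prospect":
        jar.set("vt_long", "customer", LONG_TTL_MIN)

# A new visitor who purchases, then returns in a later session.
jar = CookieJar()
first_session = on_page_load(jar, logged_in=False)
on_purchase(jar)
same_session = jar.get("vt_short")
jar.cookies.pop("vt_short")        # simulate the short cookie expiring
next_session = on_page_load(jar, logged_in=False)
print(first_session, same_session, next_session)
```

The design point this shows is that the purchase only touches the long cookie, so the order is credited to a prospect session while the next session starts as a customer.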

Capturing Visitor Type using Google Tag Manager

The following logic can be used with any Tag Management System, but the examples provided are for GTM.  The first step is to create a variable for each of the two cookies. These are very straightforward, simply grabbing the value from the two first party cookies.

session cookie variable

Then another variable needs to be created to record the Visitor Type to be used within the analytics tag. (The JavaScript required for this was provided by someone else, potentially Simo. If I remembered for sure, I would give due credit for this, apologies for not doing so.) The key reason for using this approach instead of just using the value within the short cookie is that, based on experience, the cookies have not always been created before the page tag fires and therefore we need to have a fall back option.

code for visitor type

The final step is to use this latest GTM variable within your GA tags to populate a Custom Dimension. Don’t forget to define this Custom Dimension within your Google Analytics configuration as well (or other tools). As the value for this custom dimension can change from session to session, the scope should be set at session level.

The Caveats for Visitor Type data

The first caveat is that this data will not be, and could never be, 100% accurate. For visitors who are not logged in, it is just not possible to know definitely whether they are an existing customer or a prospect.

The accuracy will improve over time as customers return and can be identified as such even if not logged in. However, this does not work if they delete their cookie or use a different device.

There is also a skew towards customers for purchases. To explain that, imagine two unidentified visitors entering the website, although both are actually customers. As such, they are identified as “prospect”. They both do some research while not logged in. One then exits the website while the other decides to purchase the products they found. During the checkout process, this visitor logs in and is identified as a “customer” with their behaviour during the entire session recorded as for a customer.

The results will show a 0% conversion rate for prospects and a 100% conversion rate for customers. Reality is a 50% conversion rate for customers. Again, nothing can be done here except being aware of this skew in the data.
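The two-visitor scenario can be written out as a short Python sketch. The session lists are simply the example from the text expressed as data, and `conversion_rates` is a helper name made up for the illustration.

```python
from collections import Counter

def conversion_rates(sessions):
    """sessions: iterable of (visitor_type, converted) pairs, one per session."""
    totals, orders = Counter(), Counter()
    for visitor_type, converted in sessions:
        totals[visitor_type] += 1
        orders[visitor_type] += int(converted)
    return {t: orders[t] / totals[t] for t in totals}

# What actually happened: two existing customers, one of whom purchased.
actual = [("customer", False), ("customer", True)]

# What gets recorded: the non-purchaser never logs in and stays a "prospect";
# the purchaser logs in at checkout and the session is relabelled "customer".
recorded = [("prospect", False), ("customer", True)]

print(conversion_rates(actual))    # {'customer': 0.5}
print(conversion_rates(recorded))  # {'prospect': 0.0, 'customer': 1.0}
```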

Extending the List of Visitor Types

This really depends on your business and how you will use the information. For some businesses, there is a stage in between prospect and customer of “registered” (visitors who are in your email database but have not yet purchased). You could split prospects by New & Returning. Or better, by some sort of segmentation such as New, Aware, Research and Interested, based on session scoring. For customers, there are many segments or personas that could be applied, taking this information directly from the back end database e.g. loyal customers, discount shoppers, inactive customers, family shoppers, etc.

Using the Visitor Type Information

With the tracking in place, it is then just a matter of analysing the data. If using Google Analytics, you can create a segment for each Visitor Type (so you can apply to all reports), create a Custom Report with Visitor Type as the first dimension (comparing performance side by side for selected metrics) or apply as a secondary dimension to many reports (for ad hoc analysis).

As noted at the start of this blog post, the obvious metric to look at is Conversion Rate. Beyond that, all the engagement metrics should demonstrate clear differences in behaviour. There are likely differences in the traffic sources used by prospects and customers (discovering website vs already being aware of it), the entry points into the website and the content being viewed.

An interesting one is looking at marketing costs and seeing just how much is spent on existing customers. This will allow a more accurate Cost of Acquisition to be calculated, taking into account only new customers, and only the marketing spend for prospects.

So, have I convinced you of the value of this piece of information? I find it provides the clearest and most valuable segmentation into performance, especially when it comes to driving actions. Of course, while full details have been provided on how to set up Visitor Type tracking, we would be happy to help out any companies who would like our assistance in doing so – please contact us on enquiries@l3analytics.com or +44 (0)20 8004 0835.

Is it Engagement or Likelihood to Convert?

Engagement is a buzz word. It is a quest. It is an altar at which many worship. As is frustratingly typical, Avinash says it better than I can.  Engagement is one of those jargon words which are used very freely (including by myself) within Digital Analytics without really meaning much.  I did go so far as to include Engagement as one of the MeasureCamp swear words.

According to Eric Peterson, “engagement is an estimate of the degree and depth of visitor interaction on the site against a clearly defined set of goals”.  The common metrics used to measure this “degree and depth of visitor interaction” are frequency/recency of visit, the number of pages viewed and the time spent on the website.  Or you can create a calculated metric taking multiple factors into account as Eric did…

Unfortunately, engagement is actually fairly meaningless – you don’t get paid for engaged visitors. These are nice metrics, and it is great to be able to show that people care more about your website because they are visiting more frequently and spending more time on it. But it distracts from the real purpose of having a website. Multiple people have tried to tie engagement to ROI but that still misses the point: your boss doesn’t care if visitors are engaged, s/he cares whether they converted or not (however you define conversion).

Changing the Name = Changing the Focus

So what we really care about is whether a metric suggests the visitor is likely to convert in the future or not.  Therefore I say we should start calling these metrics Likelihood to Convert (LtC) metrics, not Engagement metrics.  Is this just semantics?  Well yes, but by changing the language, we change both the purpose of our analysis and how we communicate with non analysts – i.e. the management team.

For a simple example, a basic engagement metric is Frequency of Visit.  So a visitor who is visiting on a regular basis is more “engaged”.  Brilliant.  Does it mean anything?  No.

But change the question to whether visitors who visit on a regular basis are more likely to convert.  Now it becomes interesting.  If this link exists, then frequency is a predictor of conversions and it should be described as such, i.e. as a LtC metric.  If you can’t demonstrate this link, then frequency may be a nice warm fuzzy metric but it doesn’t add value.

If you announce to the management team that Conversions remain steady but the key Engagement metric of Frequency has dropped 10%, they are likely to say so what, it was a good week.  If you announce that Conversions remain steady but the key LtC metric of Frequency has dropped 10% indicating Conversions will drop in the future, you will get a reaction (and ideally actions).

Really?  Is there really a difference?

I admit, I have had to think through this a few times myself to really understand what the difference is between Engagement and LtC.  The answer lies in which metrics qualify for each, what they represent and how they can be used.

Ask any analyst what Engagement metrics are and they will list off frequency, time on site, average page views per visit, etc.  But do these metrics indicate the visitor is more likely to convert – yes for some websites but not for other sites.  So Engagement metrics are not automatically LtC metrics.

Instead you need to cast the net wider for relevant metrics and include those normally described as micro conversion points.  Examples here include viewing Product pages, creating Baskets, interacting with tool X, viewing the Contact Us page and viewing a video for more than 90 seconds.  These wouldn’t be considered Engagement metrics but they can definitely indicate that the visitor is likely to convert at a later date.  Again not on all websites, so we can say that LtC metrics are business specific.

As to how they can be used, a true LtC metric trends with or in advance of conversions.  They can be used as predictors of future business performance and as warning signals that an issue is developing – so the issue can be reacted to before it impacts business performance.  They can also be used as an evaluation or success metric e.g. certain campaigns are expected to move a particular LtC metric, not directly deliver conversions.
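One rough way to check whether a candidate metric "trends with or in advance of conversions" is to correlate it against conversions at different lags. The sketch below is only one possible test, not a prescribed method; the weekly figures are invented and the function names are my own.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

def lagged_correlation(metric, conversions, lag):
    """Correlate metric in week t with conversions in week t + lag."""
    if lag == 0:
        return pearson(metric, conversions)
    return pearson(metric[:-lag], conversions[lag:])

# Hypothetical weekly data in which conversions follow the candidate
# metric one week later (each conversion value is 5x last week's metric).
metric = [10, 12, 11, 15, 14, 18, 17, 20]
conversions = [50, 50, 60, 55, 75, 70, 90, 85]

same_week = lagged_correlation(metric, conversions, lag=0)
one_week_ahead = lagged_correlation(metric, conversions, lag=1)
print(f"lag 0: {same_week:.2f}, lag 1: {one_week_ahead:.2f}")
```

If the correlation is stronger at a positive lag than at lag 0, as it is by construction here, the metric is a candidate leading indicator. A correlation on real data would of course need more history and more scepticism than this toy series.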

Your Thoughts

So the key elements of LtC metrics are:

  • There is no typical list of these metrics, instead they are specific to a business
  • They must trend with the number of conversions, either current or in advance
  • They can be used as predictors of future performance

Do you agree?  Does changing this name change the way you or your management team will look at and use your Digital Analytics data?  Or is Likelihood to Convert just a new description of the old Engagement buzzword?

Change to Definition of a Visit in Google Analytics

Google announced on Thursday that they were changing the definition of a visit/session in Google Analytics – http://analytics.blogspot.com/2011/08/update-to-sessions-in-google-analytics.html.  The key difference is that a new visit will be recorded whenever a visitor re-enters a website with different traffic source information.  Previously, there had to be 30 min of inactivity before a new visit could commence.  There is a second change with closing a browser no longer ending a session.

This was announced as a small change by Google, with most users seeing less than a 1% change.  That doesn’t tie in with the data I am seeing over a number of accounts and feedback from other sources where changes of 10%+ have occurred.  This impact on data is most obvious in ratio metrics such as conversion rate, bounce rate and average time on site.

Which metrics are affected?

One thing to make clear is that the only absolute metric that should have changed is visits.  There should be no impact on unique visitors, page views, conversions, time on site, etc.  Any change here is due to different factors not relating to this change in the definition of a visit.

As visits have increased, any metric which is an absolute number divided by visits will have decreased.  This can be seen in metrics such as page views per visit, conversion rates and average time on site.
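As a quick worked example in Python (with made-up numbers), holding conversions constant while visits rise by 10% shows how a per-visit ratio moves even though nothing about actual performance has changed.

```python
# Hypothetical figures: conversions are unchanged, but the new definition
# of a visit counts 10% more visits for the same activity.
conversions = 300
visits_old_definition = 10_000
visits_new_definition = 11_000

old_rate = conversions / visits_old_definition
new_rate = conversions / visits_new_definition
relative_change = new_rate / old_rate - 1

print(f"old: {old_rate:.2%}  new: {new_rate:.2%}  change: {relative_change:+.1%}")
```

Note that a 10% rise in visits produces a fall of about 9.1% in the ratio, not 10%, because the ratio scales with 1 / 1.1.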

It is a different situation for bounce rate.  If a visitor has clicked through to a website on one traffic source, viewed a single page and then re-entered on a different traffic source, this is now counted as a bounce for that first traffic source.  As such it is likely that the bounce rate will have increased for many websites.

Another metric with a unique impact is % New Visits.  Any visitor who re-enters the website and creates a second visit will be treated as a Return Visit for that second visit.  Therefore the % New Visits will decline.

What will the impact be on my metrics?

The scale of the change will vary from website to website.  It depends on whether visitors will be re-entering the site from multiple traffic sources or not.  Some websites will see no impact.  A website which has visitors using Google or a 3rd party website (aggregators) as their internal search tool will likely see a big change as visitors are constantly re-entering the website.

I am not certain which traffic sources will be most affected by this.  There was no clear trend in my admittedly brief investigations that I could see.  The more cynical people out there are sure to suggest that Google Adwords will report a lot more traffic and conversions.

Can I still use Google Analytics data?

Let’s get to the key question, how much is this likely to impact your understanding of business performance and your ability to use Google Analytics data to improve performance? In the long term, not at all.

Neither definition of a visit is inaccurate; the numbers were correct previously and are correct now based on the definition at the time.  It makes it difficult to compare week on week numbers right now but that won’t be a problem after a few weeks.  After that, you will be evaluating performance against numbers that all use the new definition, so the change will have no impact on your understanding of performance.

The biggest issue is for people who have just launched a new marketing campaign or website tool/feature.  It is going to be very difficult for them to evaluate the impact of this launch.  They are the big losers here.  For everyone else, we are just going to have to grit our teeth and ride it out for a couple of weeks.  Arguably there is even an opportunity here to get a better understanding of business performance by comparing data from the two definitions.

How do I check my numbers?

The change happened on the 11th Aug but it will hit your numbers at different times depending on your time zone.  I recommend extracting data for key metrics at daily level and checking the week on week change.  If your numbers are affected, you should see a step change occurring on either the 11th or 12th Aug which will continue for seven days before settling down.

Excel example of how to check on change in data
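The same check can be sketched outside Excel. The Python below uses invented daily visit numbers, with a 10% step built in from the 11th Aug, purely to show what the week on week pattern looks like; it is not data from a real account.

```python
from datetime import date, timedelta

def week_on_week_change(daily_values):
    """Fractional change of each day versus the same weekday one week earlier."""
    return [
        (current - previous) / previous
        for previous, current in zip(daily_values, daily_values[7:])
    ]

# Three hypothetical weeks of daily visits starting Thu 28 Jul 2011.
# The third week (from 11 Aug) is the same pattern inflated by 10%.
base_week = [1000, 1100, 1050, 1020, 980, 700, 750]
visits = base_week + base_week + [v * 11 // 10 for v in base_week]

start = date(2011, 7, 28)
dates = [start + timedelta(days=i) for i in range(len(visits))]
changes = week_on_week_change(visits)

for day, change in zip(dates[7:], changes):
    print(day.isoformat(), f"{change:+.1%}")
```

In this toy series the week on week change sits at zero until the 11th and then shows a constant uplift for seven days. Once the comparison week also falls under the new definition, the change drops back towards zero.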

Why has Google done this?

As to why Google has made this change, I think it is an attempt to make the web analytics data easier to understand, particularly for marketers.  The definition of visits is simpler, every time someone enters the website, it is a visit.  As mentioned in the announcement, it will align with data for multi channel funnels which was likely a key factor in this decision.  So better for marketers and light users of web analytics, maybe not so good for experienced web analysts who disagree that this is the common definition of a visit.

Given the outcry we are already seeing, I can understand why this was not announced in advance or available for testing.  It is a change that Google wanted to make, they knew there would be some unhappy people out there and so just made this a quick rather than drawn out process.

My conclusion

This is going to be very annoying for a few weeks and then shouldn’t matter.  There will be some difficult discussions with senior management over why/how the numbers have changed and getting them to understand that this doesn’t mean the website is performing any differently.  To make these discussions easier, I recommend using unique visitors and even page views as a measure of traffic for the next couple of weeks until you have a clear view of the data.

As someone has said in the comments on the announcement, make sure you annotate your GA accounts with the date of this change.  If Google wants to make everyone’s life a bit easier, they could push out a global annotation stating this.

What if visitor counts are inflated

There has been some research recently suggesting that monthly unique visitor counts for a website are inflated by 2 to 4 times.  This means that if your web analytics tool is reporting 1.6m visitors for the month, the actual number of people who visited your website is between 400k and 800k.  Details of this research can be found in a press release from Scout Analytics with similar numbers found for any website using Google/DoubleClick Ad Planner.

Ignoring the methodologies used to calculate this and whether the findings are correct or not, the question I wanted to discuss was – if visitor counts are inflated, does it matter??

First of all, the absolute numbers.  Your web analytics tool says you had 1.5m visitors.  Maybe you only had 0.5m.  To me, this doesn’t matter.  If you are a publisher who is focussed still on the number of eyeballs that view your content for selling to advertisers, then yes, you would like to report the higher number.  But in terms of web analysis, the actual number of visitors to your website doesn’t matter, it is the trend over time that matters and with the level of visitor inflation remaining consistent, this trend should still hold true.

What about frequency of visit, whether the average number of visits per visitor or the proportion of visitors who make 1 visit, 2-3 visits, 4-6 visits or 7+ visits?  Well if visitor counts are inflated then these numbers are very inaccurate.  Let’s look at the data for Feb ’10 for very.co.uk, the new online department store in the UK, from Ad Planner.

Ad Planner claims there were 3.1m unique visitors based on cookies for very.co.uk in Feb, 1.2m actual unique visitors to the website with these people having made 4.6m visits.  First of all, the suggestion here is that the visitor count for very.co.uk was inflated by 2.6 times in Feb (but we are still ignoring whether this is accurate or not).  The interesting thing however is that the average number of visits per visitor could be either 1.48 or 3.83 depending on which visitor count is accurate (assuming either is).  That is a big difference.  Just imagine what the difference is for those proportions too.  And all this is the type of difference that would mean you should have very different business strategies.  Visitor counts being inflated may just matter after all…
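The arithmetic behind those figures is simple enough to show directly. The three inputs in this Python snippet are the Ad Planner numbers quoted above; nothing else is assumed.

```python
# Ad Planner figures for very.co.uk, Feb '10, as quoted in the text.
cookie_based_visitors = 3.1e6
actual_people = 1.2e6
visits = 4.6e6

inflation_factor = cookie_based_visitors / actual_people
visits_per_cookie_visitor = visits / cookie_based_visitors
visits_per_person = visits / actual_people

print(f"inflation factor:          {inflation_factor:.1f}x")
print(f"visits per cookie visitor: {visits_per_cookie_visitor:.2f}")
print(f"visits per actual person:  {visits_per_person:.2f}")
```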

I was lucky enough to be at a presentation by Avinash recently where a big topic of discussion was campaign attribution.  One of the points he raised was that if the number of visits to conversion for an ecommerce website is 1 or 2, visitor based attribution is fairly irrelevant.  It is only when the visitor makes multiple visits prior to making the purchase that visitor based campaign attribution becomes relevant.  But if visitor counts are inflated, the number of visits to conversion is very likely to be under-reported, and suddenly the behaviour of your website visitors is quite different to what you may think it is.

So visitor level campaign attribution could be important after all, based on the logic from Avinash, whatever the data for visits to conversion may say.  Well yes but no.  The idea of visitor counts being inflated is due to visitors using multiple devices to access a website and also some level of cookie deletion.  And what it means when it comes to visitor level campaign attribution is that you are only recording a proportion of the visits that led to that conversion.

It would mean that whatever campaign attribution method you may use – last click (can we now say this is generally agreed to be less useful?), first click, even weighting, proportional weighting – they all only count some of the visits leading to the conversion, so the data and the conclusions drawn from the data are incorrect.  The conversion for the visitor may be recorded on their 2nd visit, the first being via a generic search term and the second being via an affiliate.  Simplistic example but this would still lead to various combinations of value assigned to the different campaigns depending on the attribution model.

What might be missing (if visitor counts are inflated) are those other visits by that visitor prior to the purchase – with these visits maybe coming via an organic generic search term, a link on twitter and also those two visits from paid brand search terms.  What all this might just possibly mean is that the data that is being used to determine budget allocation for the next year based on the carefully researched campaign attribution method just might not be that useful after all.
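To illustrate, here is a hedged Python sketch of three simple attribution models applied to the recorded two-visit path and to a fuller path that includes the missing visits. The channel labels, and the order in which the missing visits occurred, are my assumptions for the example; the model functions are deliberately simplistic.

```python
def last_click(path):
    """All credit to the final visit's channel."""
    return {path[-1]: 1.0}

def first_click(path):
    """All credit to the first visit's channel."""
    return {path[0]: 1.0}

def even_weighting(path):
    """Equal credit to every visit in the path."""
    credit = {}
    share = 1.0 / len(path)
    for channel in path:
        credit[channel] = credit.get(channel, 0.0) + share
    return credit

# What the analytics tool recorded for the converting cookie.
recorded_path = ["generic search", "affiliate"]

# Hypothetical full journey, including visits made on other devices
# or before a cookie deletion (order assumed for illustration).
full_path = [
    "organic generic search",
    "twitter link",
    "paid brand search",
    "paid brand search",
    "generic search",
    "affiliate",
]

for model in (last_click, first_click, even_weighting):
    print(model.__name__)
    print("  recorded:", model(recorded_path))
    print("  full:    ", model(full_path))
```

In this example last click is unaffected, but first click and even weighting hand out credit quite differently once the unseen visits are included, which is exactly the risk for budget decisions described above.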

All of this is of course just hypothetical.  Various claims have been made that visitor counts are inflated but it doesn’t appear yet that this is universally agreed.  Personally I can imagine that as people use their work computer, home computer and mobile phone to access websites, that reported visitor counts are a little higher than actual fact.  And if they are, the above are a few ways in which the data that is being relied upon to make business decisions may be a little flawed, meaning those decisions that are being made might end up being flawed as well.  And this matters.

This post was originally published on AussieWebAnalyst on 10th Mar ’10


New or Returning, Visits or Visitors

Everyone likes to know if the people visiting their website are seeing it for the first time or are regulars.  This is even more important when they are paying for the traffic: is the money going on acquiring new visitors (potential new customers), or is it just providing a convenient entry point for people who would be coming to the site anyway?

Due to cookie deletion and multiple computer usage, it is difficult to get a true picture of the split between people who have never seen a website before and those who have.  However, recording whether the visitor had a cookie from this website previously does at least give an indication of this new/returning split.

What I like to be able to do is to segment out new visitors for a time period (week or month) and examine their behaviour on the website compared to visitors who had visited previously.  The new visitor segment should include all visits during that time period by these visitors, not just their initial visit.

Frustratingly, this information is usually not available as default in a web analytics tool unless you can segment at visitor level.  However, as long as you have one of the four metrics from New and Returning Visits or Visitors, you can calculate the other three.  And most tools will give at least one number.   As examples:

  • Google Analytics gives New Visits and Return Visits
  • SiteCatalyst provides Return Visits
  • HBX contains Returning Visitors

The key to this is knowing that the first time a site is visited, that is both a new visit and a new visitor.  And as any subsequent visits by these people will be reported as a return visit, the number of new visits equals the number of new visitors.

With that logic in mind, it is simple to calculate all four metrics once you have a single one.  For example, assume that the tool available is SiteCatalyst (without access to visitor level segmentation via Data Warehouse or Discover):

  • The number of Return Visits is available but none of the other three metrics
  • Total Visits minus Return Visits gives New Visits
  • New Visits equals New Visitors
  • Total Unique Visitors minus New Visitors gives Return Visitors

And now it is easy to calculate the proportion of Visits that were New or Returning or to calculate the proportion of Visitors that were New or Returning.
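The SiteCatalyst steps above translate into a few lines of Python. The input numbers are hypothetical and the function name is my own; the only logic used is the new visits = new visitors identity described in the text.

```python
def new_and_returning(total_visits, total_unique_visitors, return_visits):
    """Derive the other three metrics from Return Visits and the two totals."""
    new_visits = total_visits - return_visits
    new_visitors = new_visits  # a first visit is both a new visit and a new visitor
    return_visitors = total_unique_visitors - new_visitors
    return {
        "new_visits": new_visits,
        "return_visits": return_visits,
        "new_visitors": new_visitors,
        "return_visitors": return_visitors,
    }

# Hypothetical month: 10,000 visits from 7,000 unique visitors,
# of which the tool reports 4,000 as Return Visits.
result = new_and_returning(
    total_visits=10_000, total_unique_visitors=7_000, return_visits=4_000
)
share_new_visits = result["new_visits"] / 10_000
share_new_visitors = result["new_visitors"] / 7_000

print(result)
print(f"new visits: {share_new_visits:.1%}, new visitors: {share_new_visitors:.1%}")
```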

The same principle can be applied to Google Analytics:

  • New and Returning Visits are available (note that this metric is visits, not visitors as it is titled in the report)
  • New Visits equals New Visitors
  • Total Unique Visitors minus New Visitors gives Return Visitors

Of course, these numbers don’t mean that much on their own but do become more useful when trended over time or across different segments.

An interesting thing to look at can be the split in New and Returning Visitors for different time periods – day, week and month.  This can indicate the scale of the issue with cookie deletion, but more on that another time.

This post was originally published on AussieWebAnalyst on 2nd Nov ’09

Removing Daily Seasonality

While I generally begin to look at web analytics data at a weekly or monthly level, there are times when it is useful to drill down to daily numbers.  This can be when examining the reason for a change in the data or simply to review the previous day’s performance.  But an issue arises which can make it difficult to interpret and extract useful insights from this daily data.

Most metrics, when viewed at daily level, contain a form of daily seasonality.  This is most clear in metrics such as visits, page views or sales which are absolute numbers.  There is a recurring pattern throughout the week with peaks and troughs on the same day/s each week.  An example of this pattern can be seen in Figure 1 below.

While this makes any chart pretty to look at, it makes it difficult to really identify trends or spikes in the data.  Is a data point high because there was a spike or because it was a Monday?  It is school holidays but should the number of visits on that Sat really be that low?  And of course, what day did we start to see traffic decline from and how much of a change is it really?

Figure 1

A common method used to remove daily seasonality is to smooth the line out using a moving average.  As it is a weekly pattern, a seven point moving average should lead to a nice smooth line.  Unfortunately, as can be seen in Figure 2, this means you get a nice smooth line, hiding most of those interesting spikes and step changes and general data trends.  You can see overall trends but you cannot pinpoint particular days when a change occurred.  It is also difficult to clearly identify a change immediately, as each day only contributes one seventh to each data point.
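A tiny Python example shows why the moving average hides the detail. The series is invented: a flat 100 visits a day with a single spike of 170.

```python
def moving_average(values, window=7):
    """Trailing moving average; the first result covers the first full window."""
    return [
        sum(values[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(values))
    ]

# Hypothetical series: flat at 100 with one spike of 170 on day 8.
visits = [100] * 7 + [170] + [100] * 7
smoothed = moving_average(visits)
print(smoothed)
```

The 70% spike shows up as a 10% bump spread across seven consecutive points, so you can see that something happened but not on which day.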

Figure 2

What I advise doing instead is to remove the daily seasonality from each data point, resulting in a line that is unaffected by what day of the week it is.  Using this method means that it is clear to see if the performance each day was good or bad. For example, in Figure 3, it can be seen that the relatively worst day for visits was actually the 25th Aug, even though visits for that day were higher than for other days during the reported period.  The technique for removing daily seasonality can be applied each day, meaning that you can identify and react to a change in performance immediately.

Figure 3

The difficulty then is in calculating the daily seasonality across a week.  This can be done properly using SPSS or a similar tool but I use a quick hack workaround in Excel that, while not 100% accurate, gets the job done.  The steps to calculate daily seasonality for a metric (using the examples of visits) are as follows, with the example displayed in Figure 4:

  1. Extract historical daily visits data.  You will need at least 6 weeks, more if the period includes a number of known factors that could impact on traffic e.g. school holidays, public holidays, product releases, marketing campaigns, etc.
  2. Reorder the data so that each column contains a single week and each row contains only data for a particular day of the week.
  3. Recreate this table but replace the visits for each day with the % that visits for that day contributed to total visits for that week.
  4. Add two more columns to calculate the mean and median for each row of data.
  5. Delete all weeks which contain days which don’t reflect the general pattern.  In this example, weeks 5 and 6 were deleted.  At this point, the mean and the median should be relatively similar for each day of the week.
  6. The daily seasonality pattern is achieved by multiplying the daily mean by 7.

Figure 4

This daily seasonality pattern can then be used for removing daily seasonality for that metric for any day.  Simply divide the value for each day by the relevant daily seasonality in order to remove it.  I generally do this using a vlookup against the day of the week for each date.
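For anyone who prefers code to spreadsheets, the steps can be sketched in Python. This is an approximation of the Excel workaround, not a proper seasonal decomposition. The weekly data is invented, uses only three weeks for brevity (fewer than the six recommended above), and assumes any atypical weeks have already been removed by hand as in step 5.

```python
from statistics import mean

def daily_seasonality(weeks):
    """weeks: list of weeks, each a list of 7 daily values in the same weekday order.

    Returns 7 index values (averaging 1.0) describing the weekly pattern.
    """
    # Step 3: each day's share of its own week's total.
    shares = [[value / sum(week) for value in week] for week in weeks]
    # Steps 4 and 6: mean share per weekday, multiplied by 7.
    return [7 * mean(day_shares) for day_shares in zip(*shares)]

def remove_seasonality(values, weekday_indexes, pattern):
    """Divide each value by the index for its day of the week."""
    return [value / pattern[day] for value, day in zip(values, weekday_indexes)]

# Hypothetical "clean" weeks of visits (first day of the week at index 0).
weeks = [
    [120, 110, 100, 100, 90, 80, 100],
    [240, 220, 200, 200, 180, 160, 200],
    [60, 55, 50, 50, 45, 40, 50],
]
pattern = daily_seasonality(weeks)

# Three raw daily values falling on weekday 0, weekday 5 and weekday 5 again.
adjusted = remove_seasonality([120, 80, 90], [0, 5, 5], pattern)
print([round(p, 2) for p in pattern])
print([round(a, 1) for a in adjusted])
```

After adjustment, the raw 120 on the busiest day and the raw 80 on the quietest day come out level, while the raw 90 on the quietest day stands out as a relatively good day.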

Going back to the reason for web analytics, you can use this technique to clean data so that you can instantly identify good and bad days, whether this is historical data or just for the preceding day.  If you are using this for historical data, you can identify the interesting days to investigate further (play with by segmenting).  If you are using this on an ongoing basis, you can see instantly what performance was like for the previous day and, if need be, investigate and react to a change accordingly.

Currently, in order to be able to do this sort of analysis, you need to extract the data into Excel.  Hopefully one day, web analytics tools will allow you to upload a daily seasonality pattern for a metric so that you can display the daily data with this seasonality removed.  And my dream is of a tool that would incorporate the ability to automatically create the pattern for any selected metric (with manual overrides for tweaking of course).

The other key use that I have found for a daily seasonality pattern is it can be used in forecasting daily traffic levels.  If you are able to forecast what the week’s traffic should be, this can easily be multiplied out using the daily seasonality pattern to forecast traffic at a daily level.
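As a minimal Python sketch of that forecasting use, assuming you already have a weekly forecast and a seasonality pattern (both hypothetical here): spread the weekly total evenly across seven days, then scale each day by its index.

```python
def forecast_daily(weekly_total, pattern):
    """Split a weekly forecast into daily values using a seasonality pattern.

    pattern: 7 index values averaging 1.0, one per day of the week.
    """
    return [weekly_total / 7 * index for index in pattern]

# Hypothetical pattern and weekly forecast.
pattern = [1.2, 1.1, 1.0, 1.0, 0.9, 0.8, 1.0]
daily_forecast = forecast_daily(7000, pattern)
print([round(value) for value in daily_forecast])
```

Because the pattern averages 1.0, the daily values add back up to the weekly forecast.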

A copy of the Excel file containing all the data, charts and formulae used in the examples above can be downloaded here – Daily Seasonality File.

This post was originally published on AussieWebAnalyst on 26th Nov ’08