Writing Custom Dimensions to Google Analytics from Snowflake DB

Like many data geeks, I can trace my curiosity back to Google Analytics. Last week, Census released our Google Analytics integration. I could write a long list of game-changing applications for creating custom dimensions in Google Analytics from data in a Snowflake data warehouse, but I keep coming back to one that I find particularly interesting. Here’s what I’m thinking.

Census, as you’d imagine, has a long sales cycle. It’s a freemium product in the middle of multiple stakeholders and touchpoints. It should come as no surprise that it’s easy for us to lose the connection between early marketing efforts and later sales outcomes. Lead scores based on demographics, activity, content consumption, or product usage have been really helpful for us to aggregate signals into a few metrics that show how similar new leads are to past customers.

The problem is that those metrics are only helpful for understanding the funnel after a trial is started. Growth is a process of expanding what works by experimenting and iterating. So to advance our experimentation towards the top of the funnel, we need valid signals early in the journey.

This is finally where Google Analytics comes in. GA is the platform we trust for web analytics and a quick way to understand attribution. But as you know, it was built for ecommerce—not for B2B SaaS. Firing a conversion based on an event gives us a really shallow view of success.

Send data to Google Analytics from Snowflake

With Clearbit Reveal and account-based scoring, we can put score-based thresholds on incoming traffic (for example, high/mid/low score). With reverse ETL data integration for Google Analytics, we can map these thresholds to custom dimensions, measure the volume of high-quality traffic a given channel brings in, and evaluate channels and costs against traffic quality. Sure, the signal is imperfect (all models are), but it’s a lot stronger than the far-less-frequent conversion events. It feels a lot better to me than living and dying by button clicks.
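As an illustrative sketch only: the bucketing step might look like the function below before the tier label is synced to a GA custom dimension. The thresholds and tier names here are my own assumptions, not anything Census or Clearbit prescribes.

```javascript
// Bucket an account score into the high/mid/low tiers mentioned above.
// Thresholds (80, 40) are illustrative assumptions.
function scoreTier(score) {
  if (score >= 80) return 'high';
  if (score >= 40) return 'mid';
  return 'low';
}

scoreTier(85); // "high" — this label is what lands in the custom dimension
```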

I recently gave a presentation, Marketing as a Data Product: Operational Analytics for Growth, on how to get consistent metrics across Google Analytics, your ads platforms, and HubSpot. It shows how I did this. Check it out!

Yes, you can use Census for free. You can even send custom dimensions from Google Sheets.

Google Analytics 4 Pageview Custom Dimensions and Content Groupings

I decided it was time to learn GA4, so I figured I’d implement it net new and figure it out along the way. A couple of things I rely on heavily on my sites are Content Groupings and Custom Dimensions. Luckily, GA4 handles these a lot better than UA, even if they are still a bit cumbersome.

I learned that there is only one Content Group parameter, content_group, instead of UA’s five slots. At first glance, event-scoped custom dimensions seem capable of handling everything content groups used to handle, thanks to the new event-first (rather than session-first) paradigm.

To set custom dimensions, you can just send arbitrary event parameters like canonical_name or page_author.
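A minimal sketch of how such parameters are set with gtag.js. The gtag() stub below just mimics what the real snippet does (push to dataLayer) so it runs anywhere; canonical_name and page_author are my own parameter names, not reserved GA4 fields.

```javascript
// Stand-in for the real gtag() installed by the GA4 snippet:
var dataLayer = [];
function gtag() { dataLayer.push(arguments); }

// Send a page_view with a content group and arbitrary custom parameters.
gtag('event', 'page_view', {
  content_group: 'Country',       // GA4's single content_group slot
  canonical_name: 'New Zealand',  // arrives on the hit as ep.canonical_name
  page_author: 'Trevor'           // another illustrative custom parameter
});
```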

Shown below is an example of the query parameters sent with the page_view hit. Notice the “ep.” (event parameter) prefix: I only passed canonical_name as a parameter, and the GA code adds the prefix when the hit is sent. The same is true for content groups.

The query string parameters correspond to the Measurement Protocol

  • en: page_view
  • ep.criteria_id: 2554
  • ep.canonical_name: New%20Zealand
  • ep.content_group4:
  • ep.content_group3: NZ
  • ep.content_group1: Country
  • ep.content_group2: Active
  • ep.debug_mode: true

Notice that I set the debug_mode parameter so that I could take advantage of DebugView in the Google Analytics UI. That thing is pretty neat and helpful for spot-checking custom parameters.

One thing that seemed silly was that I couldn’t just arbitrarily name my content groups and assign them in the UI, but hey, it’s still a lot better than UA.

Defining and Naming Your Custom Dimensions

After you send the parameter with the hit, you have to set up the custom dimension in the GA4 UI. You can set them up by going into the left panel and selecting “Custom Definitions.”

The nice thing is that, if you’ve already set a parameter, you can just select the parameter name from the list and all you have to do is provide a Dimension Name and Description.

Pageview Custom Dimensions in Reports

To see the custom dimensions in a real report, you can go into the left panel and select Engagement > Pages and screens. Then you can find the dimension by clicking the big blue plus sign (+) and selecting whichever you like.

If you’re curious about how it works and you want to see it for yourself, you can go to the site and open up your Network console and filter for “?collect” to see the GA hits. The site is here: https://analyticscodes.com/ Please visit and send some data into that GA4 account!

9 Pro Tips for GTM Auto Event Tracking and the Click Element Variable

This week I did a Google Tag Manager implementation where I had no control over the site’s source code and no access to the site’s developers. It’s an imperfect situation, but luckily GTM Auto Event Tracking affords a solution that is very effective given these constraints.

Google Tag Manager Should Be Easy

This post is meant to share a few tips on how to implement, customize, and debug Google Tag Manager when you are using Auto Event Tracking. This is especially helpful when you want to track events on pages that you do not have any control over. The heart of Auto Event Tracking, and the focus of this post, is the {{Click Element}} Auto Event Variable, also known in the dataLayer as “gtm.element.” Through the whole event tracking process, there are three tips:

  1. Setting up Click Event Tracking using “phantom” event triggers
  2. Custom JavaScript Variables to collect data about the page/interaction
  3. Debugging Auto Event Variables and Triggers
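Tip 2 in miniature: the kind of logic that lives inside a Custom JavaScript Variable that walks up from {{Click Element}} (gtm.element) to the nearest link. In GTM the element comes from the trigger; here it is passed in as an argument (with mock DOM nodes) so the sketch runs anywhere.

```javascript
// Climb from the clicked element to the nearest ancestor <a> and
// return its href; returns undefined if no link wraps the click.
function nearestLinkHref(clickElement) {
  var el = clickElement;
  while (el && el.tagName !== 'A') {
    el = el.parentElement; // keep climbing until we hit an <a> or run out
  }
  return el ? el.href : undefined;
}

// Mock a <span> nested inside an <a href="/pricing">:
var link = { tagName: 'A', href: '/pricing', parentElement: null };
var span = { tagName: 'SPAN', parentElement: link };
nearestLinkHref(span); // "/pricing"
```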

Read More

Google Analytics to Google Spreadsheets is Data to Insights

When you reach the limits of Google Analytics custom reports and you still need more, Google Spreadsheets and the Google Analytics Add-On can take you past sampling, data consistency and dimension challenges.

This post is all about the Google Analytics + Google Spreadsheets workflow. It is an end-to-end example of how you can extract more value out of your Google Analytics data when you work with it in its raw(ish) form in Google Spreadsheets.

Blog Post Traffic Growth and Decay with Age

The end goal is to look at how blog post traffic grows or decays with time and ultimately Forecast Organic Traffic to Blog Posts (coming soon). But this post sets the foundation by showing how the data is extracted, cleaned, and organized using a pivot table in order to get there.

There is also a bit of feature engineering to make the analysis possible. We will extract the date each post was published from its URL and effectively turn “Date Posted” and “Post Age” into custom dimensions of each page. But enough setup. Let’s start.

This post assumes a few minor things but should be easy to follow otherwise:

  • Google Spreadsheets Google Analytics Add-On already plugged in
  • Basic familiarity with Regular Expressions aka. RegEx (helpful but not necessary)
  • Basic familiarity with Pivot Tables in Google Spreadsheets
  • Blog URLs with dates as subdirectories eg “/2015/08” (or collect post date as Custom Dimension)

Creating Your Custom Report

Once you have the Google Analytics Add-On up and running, this is actually pretty simple. It just takes a bit of trial and error to ensure that you’ve queried exactly what you want. From the Report Configuration tab, there are usually only a few, but very important, fields to worry about: Date Ranges, Dimensions, Metrics, and Filters.

Google Analytics Add-On Report Configuration

The purpose of this report is to look at sessions by blog post URL by month, over the entire history of the site.

I chose Last N Days because I know roughly the age of my site, and a date range that is too broad is fine. I chose sessions because it reflects how many people land on a given landing page in a given month of a given year. So that is all pretty straightforward.

Filters can get a bit tricky. A good practice for setting up filters is to first build them using the Google Analytics Query Explorer. That will enable you to rapidly test your queries before you are ready to enter them into your spreadsheet.

There are several filter operators that you can use but I kept this one fairly simple. It consists of three components (separated by semicolons):

ga:medium==organic; Only organic traffic (== means equals)
ga:landingPagePath=@20; Only landing page URLs that contain 20 (=@ means contains)
ga:landingPagePath!@? Only landing page URLs that do not contain a query string, because that’s mostly garbage (!@ means does not contain)

I used ga:landingPagePath rather than page title because titles can be less consistent than URLs: they are more likely to change and will sometimes show up as your 404 page title. Blog post URLs are a more consistent unique identifier for a post, though it is important to note that sometimes people change blog post URLs. We will deal with that later.

Cleaning the Data for Consistency

Even with good data collection practices, cleaning the data is extremely important for accurate analysis. In my case, I had changed a couple blog post URLs over time and had to manually omit a few that my query’s filter did not catch. In this case, data cleansing becomes very important for two reasons: 1. Posts that are not combined by their unique and consistent URL will show as two separate posts in the pivot table which will skew summary statistics and 2. Posts that are not real URLs with a small number of sessions will really skew summary statistics. Consistency is key for Pivot Tables.

So in this case, for small exceptions, I just deleted the rows. For more common exceptions, I used the REGEXREPLACE function. This is a powerful tool for data cleansing in Google Spreadsheets. It lets you select the part of a string that matches a RegEx pattern and replace it with whatever you want. In this case, I searched for what I wanted to remove and replaced it with an empty string, something along the lines of =REGEXREPLACE(A2, "/blog", "") (the cell reference here is illustrative).

I used this to remove ”/blog” from URLs created before I transitioned from Squarespace to WordPress, and to remove date numbers that had changed when a post was updated.

Extracting Blog Post Posting Dates

Extracting the date that a post was published is actually pretty simple. Again, I used another RegEx function, REGEXEXTRACT, with a pattern along the lines of =REGEXEXTRACT(A2, "\d{4}/\d{1,2}") (cell reference illustrative).

The RegEx pattern finds any four digits, then a slash, then any one or two digits. The extracted string is then split into two cells by the slash, which yields the year in one column and the month in the next. I combined the month and year of the post date, and the month and year of the Google Analytics date, into Date objects so that I could use the DATEDIF function to calculate the age of the blog post as a dimension of the post. Maybe this is too much gory detail, but hopefully it’s useful to somebody.
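The same feature engineering can be sketched outside Sheets. The function below pulls the posted date out of a blog URL and computes post age in months; the URL shape ("/2015/08/…") matches the assumption listed above.

```javascript
// Extract "YYYY/MM" from a URL and compute the post's age in months
// relative to a Google Analytics year/month. Returns null if the URL
// has no date in it.
function postAgeMonths(url, gaYear, gaMonth) {
  var m = url.match(/(\d{4})\/(\d{1,2})/); // four digits, slash, 1-2 digits
  if (!m) return null;
  var postYear = Number(m[1]);
  var postMonth = Number(m[2]);
  return (gaYear - postYear) * 12 + (gaMonth - postMonth);
}

postAgeMonths('/2015/08/my-post/', 2016, 1); // 5 (months old)
```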

Finally, we end up with something that looks like this. This allows for pivoting the data about each post by each month of age.

Google Analytics Add-On Report Breakdown

Pivot Tables FTW!

Finally, all this work pays off. The result is one table of blog post URLs by Age and one table of URLs by summary statistics.

Google Analytics data pivot table

The blog post age table lets me see whether posts show a natural growth or decay of traffic over time. Its main purpose is to create the chart below, which shows that growth and decay don’t look all that consistent. But this is an important finding, and it helps frame my thinking about modeling blog traffic in 2016.

(Not shown here, the Google Spreadsheets SPARKLINE function is a great way to visualize Google Analytics Data.)

The summary statistics pivot table will be used for forecasting 2016 traffic. That happens to be the topic of my next post, in which the tables are used to answer: 1. What is the likelihood of reaching 25,000 sessions in 2016 if things stay the same? and 2. What needs to change in order to reach 25,000 sessions in 2016?

So far, this post has shown a few things:

  1. If you work with Google Analytics and don’t know RegEx, go learn! It’s an incredibly useful tool.
  2. My blog posts do not demonstrate consistent growth or decay
  3. I might just be as big of a nerd as my girlfriend thinks I am.

Hope it might be useful in thinking about similar problems or maybe even creating new ones!

Track REST APIs with the Google Analytics Measurement Protocol

Google Analytics got a whole lot more interesting when the Measurement Protocol was introduced. We already knew GA was the industry standard for web analytics, but with the Measurement Protocol it has become the analytics platform for anything and everything that can be made digital. With some clever instrumentation, we can now use it to track products through the supply chain or track users’ interactions in a store. All you need is a way to collect digital data and send HTTP requests to Google Analytics, and you can track anything.

I had to try it out for myself. While I could have fitted #rhinopug with a tracking device or instrumented my coffee machine with an Arduino, I took the easier (but equally cool) route to getting data: a Web API. As my proof of concept, I chose to track the SwellPath team’s group chat application called GroupMe.

Google Analytics Measurement Protocol

GA Dashboard Courtesy of Mike Arnesen

Tracking a chat app turned out to be a pretty cool way to walk that physical/digital line. While we are humans working in the same office, it’s interesting to compare contextual information from what we can see and hear against a very objective measure of communication: desktop and mobile messaging. This concept is similar to other measures of digital communication, like the Twitter firehose or brand mentions from news APIs. Those are probably much more relevant to (and could actually affect) a website’s performance but, let’s be honest, this one’s a lot more fun.

Mapping Data to Google Analytics

Digital messaging actually fits the Google Analytics reporting interface pretty well. The main reason is this: timestamps. Everything in Google Analytics is a time-based hit, and we rely heavily on timestamps to analyze it all. We ask Google Analytics how different landing pages perform as seasons change and what time of day users are most likely to convert (in order to bid intelligently on ads). Likewise, there is a natural rhythm to work-based communication. Of course (or hopefully) it’s pretty quiet on the weekends and generally pretty active as people start each workday.

The other reason that human communication maps to the Google Analytics reporting interface is that message creation is a lot like content consumption. When we really think about what a “hit” schema looks like, it has a few entities that go together something like this:

[actor] did [event] on [location] at [timestamp]

This “hit” schema works equally well for describing message creation as it does for content consumption.

With every hit, the [actor], a.k.a. User, is assigned attributes like Device or New/Returning, and the [event], a.k.a. Event, Pageview, or otherwise, has attributes like URL and Page Title for Pageviews, or Action and Label in the case of Events. The [location] is an interesting one. For web, it’s the page the user is browsing, but it’s also the physical location of the user: a lat/lon pair with appropriate geographic information. The [location] attributes are generally handled by Google Analytics automatically, but speaking from experience, the real art of a good collection strategy is mapping the right information to the right attribute of each entity.

To make sense of the idea of mapping information to attributes let’s get back on track and talk about GroupMe. It boils down to this: you have data and you want it to appear in Google Analytics in a way that you can logically sort/filter/analyze it. This is where the mapping comes in.

GroupMe’s API gives you data about a group’s messages like this:

{
  "count": 123,
  "messages": [
    {
      "id": "1234567890",
      "source_guid": "GUID",
      "created_at": 1302623328,
      "user_id": "1234567890",
      "group_id": "1234567890",
      "name": "John",
      "avatar_url": "http://i.groupme.com/123456789",
      "text": "Hello world ☃☃",
      "system": true,
      "favorited_by": [],
      "attachments": [
        { "type": "image", "url": "http://i.groupme.com/123456789" },
        { "type": "image", "url": "http://i.groupme.com/123456789" },
        { "type": "location", "lat": "40.738206", "lng": "-73.993285", "name": "GroupMe HQ" },
        { "type": "split", "token": "SPLIT_TOKEN" },
        { "type": "emoji", "placeholder": "☃", "charmap": [] }
      ]
    }
  ]
}
If this doesn’t make sense to you, go read up on JSON. Essentially, when you ask the GroupMe API for the most recent messages, it returns a list of messages with, among other things, the sender’s name and user ID, the message text, the number of likes, and the location. So we have information about each of the “hit” entities: the user, event, place, and time are all described. The only thing missing that is critical to web analytics metrics is something similar to Page. For that reason, I decided to use Google Analytics Events to describe each GroupMe message. Each hit maps GroupMe data to Google Analytics as follows:

Google Analytics Parameter ← GroupMe Data / JSON Key

  • User ID ← GroupMe User ID / user_id
  • Client ID ← GroupMe Source GUID / source_guid
  • Custom Dimension (User) ← GroupMe Username / name
  • Event Category ← “GroupMe Chat” (static)
  • Event Action ← “Post” (static)
  • Event Label ← Truncated text of message / text
  • Event Value ← Count of likes / count(favorited_by)
  • Queue Time ← Difference between now and timestamp / current time – created_at
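The mapping above can be sketched as the construction of a single Measurement Protocol event hit. All values below are illustrative stand-ins, not real IDs.

```javascript
// One GroupMe message mapped onto Measurement Protocol parameters:
const params = {
  v: 1,                            // protocol version
  tid: 'UA-XXXXXX-XX',             // the GA property
  cid: 'source-guid-from-groupme', // Client ID <- source_guid
  uid: '1234567890',               // User ID <- user_id
  t: 'event',
  ec: 'GroupMe Chat',              // Event Category (static)
  ea: 'Post',                      // Event Action (static)
  el: 'Hello world',               // Event Label <- truncated text
  ev: 2,                           // Event Value <- count(favorited_by)
  qt: 45000,                       // Queue Time in ms <- now - created_at
  cd1: 'John'                      // custom dimension <- name
};

// URL-encode the parameters into a /collect hit:
const hitUrl = 'https://www.google-analytics.com/collect?' +
  Object.entries(params)
    .map(([k, v]) => k + '=' + encodeURIComponent(v))
    .join('&');
```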

Each GroupMe message is then sent to Google Analytics as an HTTP request with data mapped to GA parameters as shown above. Collect data for a few days and it looks like this:

Measurement Protocol Specific Values: Queue Time and Client ID

If you come with a Web analytics frame of mind, there may be two things that are unfamiliar to you: Client ID and Queue Time. These are both a pain to get right but functionally awesome.

The Client ID is something you don’t have to think about for web data collection; it’s automatically read from a cookie that Google Analytics sets for you. It is very important, though: it is the key for differentiating two devices that, by their collectible attributes, “look” the same but are not. The CID must follow very specific rules to be valid, and lucky for me, GroupMe offers a GUID for each message that fits the specification.

Queue Time is awesome. It is the single most important factor in getting the time value of a Measurement Protocol “hit” right. It is the delta (a cool way to say difference) between the time the event occurred and the time the hit was collected. If you send the hit to Google after the event took place, Google’s servers calculate the time delta and record the hit at the time it actually took place.

This was especially important for the method I used to get data from GroupMe and send it to Google Analytics, because I was only getting the messages from the GroupMe API once an hour. Without Queue Time, the hit timing would be very low fidelity, with spikes each hour when the data was collected and sent. By calculating the Queue Time when each message was sent, I got accurate timing and didn’t have to worry about burning through API limits or wasting lots of HTTP calls. (Think about it: without Queue Time, your data is only as accurate as the frequency at which your hits are sent, which in this case was an hourly cron job.)

Google Analytics Measurement Protocol API

Don’t call it a hack. Ok, call it a hack.

Lessons Learned / How I’d Do it Next Time

This ended up working out pretty well thanks to a fair amount of luck and plenty of read-the-docs, code, debug, repeat. I got lucky when I realized I hadn’t accounted for things like the mandatory Client ID parameter and … the fact that my server doesn’t run Python cron jobs. As a result, I ended up writing my first PHP script, and here I am sharing 100-some lines of amateur code. But hey, this proof of concept works!

If I were to do this again, I would answer a few questions before I started:

Get to know the API

  • Will the API I want to track give me all the data I need?
  • Are events timestamped or do I have a way to approximate that?
  • How difficult is authentication and how long does it last for?
  • Am I going to operate safely within the API rate limits?
  • What about Terms and Conditions of the API data?

Map the Data to Google Analytics

  • How will I avoid recording the same hit twice?
  • What type of Google Analytics Hit will I use?
  • How should I map the API’s data to a Google Analytics hit?
  • Can I write some code to automate this?

How the Code Works

The code I wrote to automate this is listed below, but if you are unfamiliar with PHP, or code in general, the instructions given to the computer boil down to this:

Call the GroupMe API to see if there are any new messages since last time
  If no: stop.
  If yes: continue
Call API to get/make a map of User IDs to User Names to send with hits
For each message that was returned:
  map it to GA parameters
  send it as an event to GA
  For each like of each message:
    map it to GA parameters
    send it as an event to GA
Write the most recent message ID to a .txt file (to keep track of what has been sent)

Wait for about an hour and repeat with the next cron job

It was a fun project and luckily a successful proof of concept for tracking non-website data in Google Analytics. If you’re thinking about doing a Measurement Protocol project, leave a comment or tweet me at @realtrevorfaux (don’t worry, I’m not tracking it). If you’re interested in other cool ways to track offline transactions, check out Google Analytics Enhanced Ecommerce. I really look forward to what comes of the Measurement Protocol with things like IoT. Connect, collect, and analyze all the things!

The PHP code that I used is below. Give me a break; this is the first (and maybe last) PHP programming I’ve ever done.


<?php
// script configuration stuff
$token = "abc123";        // from dev page
$group_id = "1234567";    // from dev page
$memory_file = "last_id.txt";

$UAID = "UA-XXXXXX-XX";   // Google Analytics UA Code
$member_names_map = makeNameMap();

// saved last message id from file
$since_id = file_get_contents($memory_file);

// endpoint to get latest messages
$url = 'https://api.groupme.com/v3/groups/' . $group_id . '/messages?token=' . $token . '&since_id=' . $since_id;

// call the groupme api
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$response = curl_exec($ch);
$http_status = curl_getinfo($ch, CURLINFO_HTTP_CODE);
curl_close($ch);

// 304 means no new messages since $since_id; otherwise handle them
if ($http_status === 304) {
    echo "API RETURNED: " . $http_status . "\n";
} else {
    handleMessages($response);
}

function handleMessages($response_obj) {
    global $UAID;
    global $member_names_map;

    $json = json_decode($response_obj);
    $messages = $json->response->messages;
    echo "API RETURNED " . count($messages) . " MESSAGES\n";
    $timestamp = time();

    foreach ($messages as $message) {

        $queue_time = ($timestamp - $message->created_at) * 1000; // qt is in ms

        // one event per message posted
        $post_hit_params = array(
            'v'   => 1,
            'tid' => $UAID,
            'cid' => $message->source_guid,
            'uid' => $message->user_id,
            't'   => 'event',
            'ec'  => "GroupMe Chat",
            'ea'  => "Post",
            'el'  => substr($message->text, 0, 100),
            'ev'  => count($message->favorited_by),
            'qt'  => $queue_time,
            'cd1' => $message->name,
            'cd2' => $message->user_id
        );
        sendGAHit($post_hit_params);

        // one event per like of each message
        foreach ($message->favorited_by as $id) {

            $name = $member_names_map->$id;

            $like_hit_params = array(
                'v'   => 1,
                'tid' => $UAID,
                'cid' => $message->source_guid,
                'uid' => $id,
                't'   => 'event',
                'ec'  => "GroupMe Chat",
                'ea'  => "Like",
                'el'  => substr($message->text, 0, 100),
                'qt'  => $queue_time,
                'cd1' => $name,
                'cd2' => $id
            );
            sendGAHit($like_hit_params);
        }
    }

    // messages come newest-first, so the first one is the most recent
    $last_message = current($messages);
    writeMemoryFile($last_message->id);
}

function sendGAHit($params) {

    $query_string = http_build_query($params);
    $url = "https://www.google-analytics.com/collect?" . $query_string;

    // send hit to GA
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    $response = curl_exec($ch);
    $http_status = curl_getinfo($ch, CURLINFO_HTTP_CODE);
    curl_close($ch);

    echo "GA RETURNED: " . $http_status . "\n";
}

function writeMemoryFile($last_message_id) {
    global $memory_file;
    // write last ID to file for next time
    $memory = fopen($memory_file, "w");
    fwrite($memory, $last_message_id);
    fclose($memory);

    echo "LAST ID WRITTEN TO FILE: " . $last_message_id . "\n";
}

function makeNameMap() {
    global $token;
    global $group_id;
    $url = 'https://api.groupme.com/v3/groups/' . $group_id . '?token=' . $token;

    // call the groupme api
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    $response = curl_exec($ch);
    curl_close($ch);

    // build a user_id => nickname lookup for attributing likes
    $json = json_decode($response);
    $members = $json->response->members;
    $member_names_map = new stdClass;

    foreach ($members as $member) {
        $user_id = $member->user_id;
        $member_names_map->$user_id = $member->nickname;
    }

    return $member_names_map;
}

Product Scope Custom Dimension & Metrics in Google Analytics

Google Analytics Enhanced Ecommerce provides insight into Ecommerce performance in detail that was impossible using standard Google Analytics methods. To get the most out of Enhanced Ecommerce, you must understand the breadth of Enhanced Ecommerce data collection capabilities. This tutorial will take you from Ecommerce product page to Enhanced Ecommerce Product Data.

Enhanced Ecommerce Product Page

A pretty standard Ecommerce Product Page. Thanks to smarthome.com

Enhanced Ecommerce Reporting

Enhanced Ecommerce Product Reporting with Custom Dimensions and Metrics.

Product Data Problem and Solution

The new product schema was created to answer questions that are specific to products rather than pages or events. Consider event hits: their schema uses a sentence-like structure to describe an action. This does not map well to a product entity that is not bound to a specific instance in time. The product schema is more like pageview hits, which collect attributes of a page entity over time. But still, the data is collected on the page in a different fashion.

For more, see Carmen Mardiros’ great conceptual explanation of Enhanced Ecommerce Data Layer and my slides on Enhanced Ecommerce schema.

Google Analytics Enhanced Ecommerce assigns a collection of properties to each product.  At least one property is mandatory (either name or id), and other properties are optional: brand, category (up to five tiers), variant, price, quantity, coupon, and position. These properties describe either an attribute of the product or the context of the product action. These standard features provide a holistic understanding of how users interact with products in an Ecommerce store. But they don’t give the whole picture of how customers interact with products within the context of any particular business.


For instance, you may want to understand how your customers’ shopping behavior changes based on a product’s qualitative dimensions, like availability, backorder date, release date, physical dimensions, or appearance. Likewise, there is more to understand about products’ quantitative information, such as cost of goods sold, previous price, discount, or profit. This is where custom dimensions and metrics come in.

Product Data Collection

Google Analytics collects data about product entities in a way that appropriately fits the concept of a product: each product is represented by a JavaScript Object. An Object is simply a thing represented by a collection of properties about that thing. In other words, a product entity would be represented by an Object with a name property of “Clapper Light Switch” and a price property with a value of 30.00.

To extend the ability of an Object to describe a product entity, we can specify additional properties like “cost of goods sold” or “backorder date.” At the code level, this means adding one more “key”:”value” pair to the product entity object. The only difference is that the property’s name will be a placeholder such as “dimension9” or “metric4.” The dimension or metric name will be assigned later within the Google Analytics interface. In Universal Analytics, it would look like this:

ga("create", "UA-XXXXX-Y");
ga("require", "ec");
ga("ec:addProduct", {
   "id": "81301",
   "name": "Xantech AC1 Controlled AC Outlet",
   "price": "78.55",
   "brand": "Xantech",
   "category": "AC1 Controlled AC Outlet",
   "variant": "white",
   "dimension3": "In stock - Ships Today", // stock status custom dim.
   "metric2": 5,                           // rating stars custom metric
   "quantity": 1 });
ga("ec:setAction", "add");
ga("send", "event", "Product", "Add to Cart", "Xantech AC1 Controlled AC Outlet");

and using the Google Tag Manager data layer, it would look like this:

dataLayer.push({
   "event": "addToCart",
   "ecommerce": {
      "currencyCode": "USD",
      "add": {
         "products": [{
            "id": "81301",
            "name": "Xantech AC1 Controlled AC Outlet",
            "price": "78.55",
            "brand": "Xantech",
            "category": "AC1 Controlled AC Outlet",
            "variant": "white",
            "dimension3": "In stock - Ships Today",  // stock status
            "metric2": 5,                            // review stars
            "quantity": 1
         }]
      }
   }
});
For a great working example of this, see:  https://ga-dev-tools.appspot.com/enhanced-ecommerce/

Setting Custom Dimension & Metric Names

The first thing to note is that, if you are using the Google Tag Manager data layer to collect product data, you should make sure you have checked the “Enable Enhanced Ecommerce Features” and “Use data layer” boxes shown below.

Google Tag Manager Add to Cart Event

Whether you are using the data layer or standard Universal Analytics collection code, you will have to do two things:

  1. Ecommerce and Enhanced Ecommerce reports must be enabled. Just go to Admin > choose a View > Ecommerce Settings and toggle Enable Enhanced Ecommerce Reporting to “ON.”
  2. Custom Dimensions and Metrics must be activated, named, and configured as you want them to appear in Google Analytics reports. This is also done within the Admin interface. Go to Admin > choose a Property > Custom Definitions and click Custom Dimensions or Custom Metrics. Set the name, set the scope to “Product,” and set the state to “On.” For Custom Metrics, also set the appropriate formatting type.

Enhanced Ecommerce Product Custom Metric Setup

Note that these custom dimensions and metrics must be set at the product level of the hit. Otherwise, the data will not be collected as expected, if at all.

Enhanced Ecommerce Product Data Reports

Now you are ready to appreciate all your shiny new insights. To find these reports within the Reporting interface go to Conversions > Ecommerce >  select a Product or Sales report.

Enhanced Ecommerce Reporting

From there you can see all your products and sort, filter, and view them by any of their newly recorded properties. Metrics can also be added to Custom Reports to provide aggregate insights.

Inspecting, Debugging and Perfecting Product Data Collection

At this point you may see something funny in your reports, or nothing at all. In my experience, Enhanced Ecommerce data is published as soon as the hit it was sent with is recorded, which is usually relatively quick (under 10 minutes). If something looks amiss, don’t worry: there are usually a few simple fixes to make sure the data is being collected and reported correctly. Assuming you have done everything correctly up to this point, here is a list of things to check:

Debugging Enhanced Ecommerce Hits

This is what you are looking for in the Google Analytics Debugger.

  • Make sure the hit is being sent to the right property.
  • Use mandatory product fields (name or id) as report dimensions.  This is helpful when starting out. If you are looking at a  report with a primary dimension of Product List, but are not yet collecting product list data, the report will appear to be empty.
  • Make sure Ecommerce data is sent with a standard Google Analytics hit. Enhanced Ecommerce data is buffered on the page until a standard Google Analytics hit is sent; the Ecommerce data is then collected with that hit.
  • Make sure the JavaScript object data structure is correct and without errors. Use my data layer debugger bookmarklet to verify that the data is in the data layer. Also, keep an eye on the JavaScript console and use jshint.com to make sure there are no errors and everything is formatted correctly.
  • Check that each object pushed to the data layer is valid by using JSON.stringify to view the objects in the data layer. Just type the following command into your JavaScript console and inspect the object in JSON (JavaScript Object Notation):
JSON.stringify(dataLayer[0]); // where 0 is the index of the first object in the data layer array
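That buffering behavior is easy to see with the command queue that the official analytics.js snippet installs: commands pile up in `ga.q` until the library loads and flushes them with the next standard hit. A sketch using that same stub pattern:

```javascript
// The official analytics.js snippet installs a command queue like this,
// so calls made before the library loads are buffered rather than lost:
var ga = function () { (ga.q = ga.q || []).push(arguments); };

// Enhanced Ecommerce data is buffered on the page...
ga('ec:addProduct', { name: 'AC1 Controlled AC Outlet', price: '78.55' });
ga('ec:setAction', 'add');

// ...and only travels with the next standard hit:
ga('send', 'event', 'UX', 'click', 'add to cart');

// Inspecting ga.q shows the buffered commands in order:
// ga.q[0][0] === 'ec:addProduct', ga.q[2][0] === 'send'
```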

Accessing Nested Values with Google Tag Manager Variables

When Google Analytics event hits carry Enhanced Ecommerce info, you may want to use a product’s attributes as the values for the event’s Category, Action, or Label, or even a Custom Dimension or Metric. Similarly, product data can be applied as a custom dimension on the pageview hit that, for example, is sent on a product page to carry product-detail-view Enhanced Ecommerce information. In these cases, if you are using Google Tag Manager, you can access the values of the product or actionField data using a Data Layer v2 Variable. These GTM Variables allow you to access property values that are nested within objects or arrays in the data layer.

For instance, if you wanted to access the name of a product that was just added to a shopping cart (as shown above), you would use the following format without the quotes: “ecommerce.add.products.0.name”. Note that the 0 specifies the index (zero-based) of the object within the array enclosed in [brackets].
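Conceptually, the variable walks that dotted path one segment at a time, and numeric segments index into arrays. A rough sketch of the resolution logic (the `getNested` helper is illustrative, not GTM’s actual implementation):

```javascript
// Resolve a GTM-style dotted path like 'ecommerce.add.products.0.name'
// against a nested object. Numeric segments index into arrays.
function getNested(obj, path) {
  return path.split('.').reduce(function (node, key) {
    return node == null ? undefined : node[key];
  }, obj);
}

// An add-to-cart push shaped like the example above:
var pushed = {
  ecommerce: {
    add: {
      products: [{ name: 'AC1 Controlled AC Outlet', price: '78.55' }]
    }
  }
};

var productName = getNested(pushed, 'ecommerce.add.products.0.name');
// → 'AC1 Controlled AC Outlet'
```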

Thanks, Simo for getting me back on track with this.

A Note on Product Hit Scope Dimensions and Metrics

Product-scoped Custom Dimensions and Metrics won’t be applied to the event or page on which they happen, or to the user who interacts with them. That requires setting a separate dimension or metric at the hit level, so make sure to configure the hit-scoped Dimension or Metric accordingly. Check out this old-but-good explanation of hit scope by Justin Cutroni.
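To make the distinction concrete, here is a sketch using the analytics.js command queue (the slot numbers are examples: `dimension3` is assumed to be product-scoped and `dimension1` hit-scoped in the Admin UI):

```javascript
// analytics.js buffers commands in a queue before the library loads;
// this stub mirrors the official snippet and lets us inspect the calls.
var ga = function () { (ga.q = ga.q || []).push(arguments); };

// Product scope: the value travels inside the product object only.
ga('ec:addProduct', { name: 'AC1 Outlet', dimension3: 'In stock' });

// Hit scope: the value is set on the tracker and applied to the hit itself.
ga('set', 'dimension1', 'video viewer');
ga('send', 'pageview');
```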

Start Collecting and Start Optimizing!

This may seem complicated, but the power it provides is well worth the time spent on a detailed implementation. Please leave a comment if you have any questions, or send me an email at t@tfox.us and let’s get this started!

Thank you to SmartHome.com for the pretend data. I want everything in your store.

Data-Driven Marketing Starts With the Data Layer

The term “digital marketing” should describe the marketing methods as much as it describes the marketing medium.

Too often, marketers are satisfied with their “digital marketing” efforts when they send a tweet, “blast” an email, or run a search ad, because their message is transmitted over a digital medium. Unfortunately, this misunderstanding so vastly underestimates the potential power of digital marketing that, well…I just had to write a blog post about it.

If you take just one idea away from this blog post, take this: the access to data that the digital medium affords, when interpreted correctly and acted upon confidently, infinitely strengthens the performance of the digital marketing medium.

If you take two ideas away from this, take this too: The data layer is the very first place to start to capture and leverage your data. It provides the capability to merge data at its source. This is the foundation for data-driven action in real-time.

I wrote this post to illustrate the value of the data layer in the context of today’s digital marketing stack. I explain the present state of the digital marketing medium and how and why the data layer has a role in it. If you’re done reading, please tell your friends. The share buttons are above. Otherwise please read and leave a comment below.


What is the data layer?

It is essentially a menu of available data from which data consumers (defined and explained later) can order within your web app or page. Technically speaking, the data layer is a JavaScript data structure that receives and holds data about the user, the user’s interactions, the app, the view or page, and the context of the user interaction within an app or page. The data layer can then be accessed by a tag management system or application for data collection or to trigger marketing actions.

The tag management system sits on top of the data layer and is responsible for routing data to data consumers. This could be as simple as a few lines of home-baked JavaScript or as intricate as Google Tag Manager. Ultimately, a tag management system is the logic that decides where the data is routed and, to some extent, how the data in the data layer will be used.
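In miniature, the provider/router pattern looks something like this (the `routeHits` function and the rule shape are invented for illustration; real tag managers are far more involved):

```javascript
// Providers push facts into the data layer (an ordinary array)...
var dataLayer = [];
dataLayer.push({ pageType: 'product', userValue: 'high' });
dataLayer.push({ event: 'addToCart', productName: 'AC1 Outlet' });

// ...and a (very simplified) tag manager routes messages to consumers,
// firing a tag whenever a message's 'event' matches a rule.
function routeHits(layer, rules) {
  var fired = [];
  layer.forEach(function (message) {
    rules.forEach(function (rule) {
      if (message.event === rule.onEvent) {
        fired.push(rule.tagName);
      }
    });
  });
  return fired;
}

var fired = routeHits(dataLayer, [
  { onEvent: 'addToCart', tagName: 'GA add-to-cart event' }
]);
// → ['GA add-to-cart event']
```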

Why does the data layer exist?

To understand the role that the data layer plays in digital marketing, it helps to understand two currently sexy terms: Web App and the Digital Marketing Stack. I will explain those later but let me pull a little Chuck Palahniuk on you and set you up with these three eerily similar but presently unrelated premises:

The data layer exists because:

  • Web apps have multiple players providing and consuming data within the app.
  • Similarly to the concept of data independence, data access should be flexible and independent from the external user-facing layer of an application.
  • Data access and acquisition should not be barriers to data-driven action.

And now a story…

How the data layer came to be

(and A Brief History of the Web Apps.)

In the beginning there was HTML. It was and is the commonly used document structure for web documents.  In a very basic sense, the HTML language codes how text is hierarchically structured in a document. So as you remember in 1997, we could request an HTML document from a server far away and we could view it in our browser as a webpage, and things were great.

Sometimes, people would add some style to these documents and, depending on the browser you were using, things may or may not have been so great. The problem with adding style to an HTML document was that writing the style rules to the document itself meant that any time there was a need to change the style of a document, it meant changing the document itself. This was very messy and inefficient.

Then came CSS, a separate additional layer of style rules that your browser would apply to an HTML document.  Colors, sizes and blinking distraction exploded on the page in a barf of glory. People in the Wild Wild West of the web now had a simple method of keeping their theme of yellow Courier font over the top of a picture and John Wayne across all the pages of their site by using the same style sheet across their site. It wasn’t always tasteful and it certainly wasn’t interactive but it was consistent. Web developers hailed the separation of style and content layers.

Hello World! Here comes JavaScript!…and ten more pop-ups about things you wouldn’t bring up at a dinner party. JavaScript could both blow your mind and ruin your browsing experiences depending on your browser. Like document style, it was often a jagged mess until people realized the virtue in separating the JavaScript code (the behavior of the page), from the content and style of the document. Recognize the theme?

JavaScript is what is responsible for the prolific < booming voice > WEB APP </ booming voice >.  It is behind all of the shifting, scaling, constantly updating, interactivity that we now know and love. Possibly, the most powerful aspect of what JavaScript offers is the ability to get more data from a server without requesting a whole new web page. This is called AJAX and is what makes Facebook login for your favorite ecommerce site possible. It is also what makes web analytics products like Google Analytics possible.

Today, interactive web apps have become the norm. From Facebook, to Gmail, to eBay, to your website, web apps do more than just serve dynamic content from a server. With the help of AJAX, they can host an entire ecosystem of plugins, pixels, scripts, snippets, iframes and APIs that communicate with the other servers of the web! There are data providers (players that add data to the page, like an embedded YouTube video, a Qualaroo survey, or the app itself) and there are data consumers (players that take data from the page for their own use, such as analytics platforms like Google Analytics, MixPanel and SiteCatalyst; marketing tags like AdWords, Facebook or AdRoll; email marketing platforms like Campaign Monitor or Mailchimp; and testing and optimization platforms like Optimizely), all using this data to trigger their own events with logic defined by each specific data consumer.

“There ought to be a layer!” some smart person shouted after this mix of independent data and logic got pretty messy. Again, a new web feature arrived and a new layer arose. Each layer of an application separated the business of each feature from the business of other features. By separating all the layers, it became possible to manipulate and modify each layer without affecting other layers or the application as a whole. The web is stable. Occam put his Razor away.

Enter the data layer. There is data in the application, there is data in the users’ interactions, and there is data coming from outside sources. The data layer exists to separate all this data from the business logic of the application. This way the application is only responsible for serving the user, and all other data consumers have a central point of access for the data they use.

How the data layer works (by example)

Let’s say that your brand just produced a video to promote a new product.

Your digital marketing stack looks like this: Your brand’s website uses Google Analytics to track user interactions and session source from several paid and earned channels that you cultivate. You have remarketing tags for Facebook, YouTube and Adwords and you use Optimizely for testing and dynamic content. Finally, you use Google Tag Manager to manage and trigger your marketing tags.

Because you’re smart, you recognize that just posting this video on YouTube is not going to have any impact on sales. (The right people have to see it!) You also recognize, based on your last video campaign, that users who saw the video were 15% more likely to convert, making it very important for users to view the video.

So you feature your video on the homepage. As users come in, you use JavaScript to check their referrer and place that information in the data layer. Based on the referrer you can infer their likelihood to convert; you know that a user who comes in via a certain branded paid search campaign is very likely to convert. You tag users who visit from this campaign with a value of “high” and place that data in the data layer. You could capture this in Google Analytics in a “User Value” custom dimension.
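Sketching that referrer check in JavaScript (the campaign name and the “high”/“low” values are invented for illustration):

```javascript
var dataLayer = [];

// Infer a likelihood-to-convert tag from the referral source.
// 'brand-search' stands in for your high-converting branded campaign.
function tagUserValue(referrer) {
  var value = /brand-search/.test(referrer) ? 'high' : 'low';
  dataLayer.push({ userValue: value }); // later read into a GA custom dimension
  return value;
}

var paid = tagUserValue('https://www.google.com/?utm_campaign=brand-search'); // → 'high'
var organic = tagUserValue('https://example.com/some-blog-post');             // → 'low'
```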

As users interact with the home page, some watch your new video and are tagged as “video viewer” (again, this could be captured as a Custom Dimension) and some navigate past the home page without viewing the video.  These users are not tagged. Because users of this segment have proven to be worth such a high value based on their referral source, you have set up Google Tag Manager to tag this specific audience with both Facebook and YouTube video remarketing tags. This user must see this video!

A couple days later, the user returns via a Facebook video post tagged with UTM parameters that are again read by JavaScript. This time, because the user came from the video post, we use Google Tag Manager to tag this user as a “video viewer.” When this happens, another event happens: this user is added to an Optimizely audience. Optimizely then cookies this user so that when they navigate back to the home page, instead of seeing the video front and center, they see a persuasive call to action above the video that shortcuts them toward the ultimate conversion page. Finally, a transaction is registered in Google Analytics along with a few events and a branded search campaign and Facebook video campaign source for attribution.

In this case, there were four players involved, but the most important thing to recognize is that without the data layer to unify all the data that each player was consuming, each player would have continued to act independently. The data layer provided the capability to merge data at its source, making it possible for all these players to interact with each other in real-time. The value created by each part of the digital marketing stacks working together is greater than the sum value of their parts.



Value is in the aggregate. Be a data gangster.

So much data! So much opportunity!

There is data everywhere. As long as there is a common thread between a single user and your website/application, there will be data that describes that relationship. Managing this data, finding the value in it and finding the value that it provides to other data role players is the key to maximizing digital marketing success.

To learn more about the data layer in action, see: Enhanced Ecommerce Using the Data Layer.

Event Tracking on Squarespace Using Google Tag Manager

I like Google Tag Manager and Squarespace for the same reason: they simplify and streamline the processes of website design and Google Analytics implementation while maintaining the power and functionality of the product. For the old sliced-bread analogy, they are the greatest things to come along since the knife.

Squarespace is a great plug-and-play, drag-and-drop Content Management System (CMS). It allows users to easily add all kinds of content in a simple, non-technical way.

Google Tag Manager is a Tag Management System for Google Analytics. (You could say TMS, but nobody does.) It allows for on-page tagging and most other advanced analytics capabilities of Google Analytics, and it greatly minimizes the technical implementation and actual code that must be added to a website in order to track user interaction. It’s semi-technical; in fact, it’s still pretty technical, but once the code is implemented by your friendly webmaster or neighborhood hacker, the backend is pretty accessible to a fairly tech-savvy person. I recommend checking out this short Google Tag Manager video to get a basic understanding of its features and capabilities. When I watched it, it blew my mind, but then again, I love analytics.

“Stepping Into” Google Tag Manager

That clever heading is a JavaScript joke. Don’t worry if you don’t get it; I wouldn’t have either until this year. If you do, then you are in pretty good shape to follow this guide. To get you up to speed, here are a few great references to learn what’s going on with the technical side of the following implementation.

If you were to do this from scratch, you would need a basic understanding of JavaScript, HTML, CSS and jQuery. Actually, since Squarespace uses YUI instead of jQuery, you will also have to understand YUI. I’ll explain later. You can find everything you need to know about learning how to code here.

For the YUI to jQuery translation, use jsrosettastone.com. It does exactly what it sounds like: lists translations of all the functionalities of jQuery into YUI.

For help references, use Google Tag Manager Help, the Squarespace Developer Center and the Holy Grail, stackoverflow.com, if you ever get hung up.

Step 1: Get Your Google Tag Manager Account and Container

Now that you are a technically proficient hack, let’s dive in!

This has been covered many times so I will let Google take it from here. Go to Google Tag Manager’s website and follow steps 1-3. Do not follow step 4! Do it yourself!

Also check out Justin Cutroni’s extremely helpful guide for setting up your account.

Step 2:  Insert the Container and Initialize the Data Layer

This is where the fun starts; the beginning of analytics tracking glory!

Inserting the Container:

Google Tag Manager documentation recommends that you place your container just below the opening body tag. One way to do this is by adding the container into a code block at the top of each page, but that is just a lot of work. Instead, the functional alternative, which works just as well, is “injecting” the container in the header section.

Initializing the Data Layer:

The container is the brain that runs Google Tag Manager on your site but it doesn’t really do anything too special without the data layer.

You will also have to insert a bit of code above the container to initialize the data layer. Technically, this bit of code creates the data layer array (if one does not already exist), which will be filled as user interactions push information onto it. The code looks like this:

    window.dataLayer = window.dataLayer || [];

<!-- Google Tag Manager container goes here -->

Step 3: Insert Code to Push Events to the Data Layer

Edit: Now much of this can be handled by Google Tag Manager’s auto-event listeners.

This method of event tracking offers the agility that differentiates GTM from previous methods of Google Analytics event tracking. It is truly where the magic happens.

Inject this snippet of JavaScript into the footer. This way, loading the script will not slow down loading the visual elements of the page.

This snippet of code says, “every time a user clicks a link to a social media site, push the corresponding key/value pair to the data layer.” Most commonly this is done using jQuery, but Squarespace uses YUI, so you have to play the game between the lines.


 YUI().use('event', 'node', function (Y) {
   // (1) These variables reference the link 'class' attribute.
   // Each social media site link has its own 'class' attribute.
   // Note that by using an element class rather than an element id,
   // any button that shares that class will fire the GTM rule
   // triggered by that class.
   var twitterLink = Y.all('a.social-twitter');
   var gPlusLink = Y.all('a.social-google');
   var linkedInLink = Y.all('a.social-linkedin');

   // (2) The 'socialView' function listens for a button click (2a),
   // then pushes a corresponding event to the data layer (2b).
   var socialView = function (socialLink, socialClick) {
     socialLink.on('click', function (e) {             // (2a)
       dataLayer.push({'event': socialClick});         // (2b)
       // Logs used for debugging:
       console.log(socialClick + ' was pushed to the dataLayer');
     });
   };

   // (3) Finally, 'socialView' (2) is called for each link class variable (1),
   // pairing each set of links with the event value pushed to the data layer.
   socialView(twitterLink, 'twitterClick');
   socialView(gPlusLink, 'gPlusClick');
   socialView(linkedInLink, 'linkedInClick');
 });

Step 4: Setup Tag Manager Rules and Tags

The Google Tag Manager interface is not immediately intuitive so give yourself some time to play around with it. Once you get lost in it, you will be amazed by the power it provides.

For this example, we will go through event tracking on the Google+ button on the footer of each page. This will also track any button click for any other button with a designated div “class” attribute of  “social-google.” I use the “class” attribute because my intent is to measure navigation to my social media profiles rather than on-page UX button testing. This can be modified pretty easily to track other events. Let me know if you have questions.

Part 1: Setup the Tag

This tag records both an Event (or Social event) and a Custom Dimension. This way it’s possible to track navigation to my Google+ page in aggregate and also at the visitor level.

Part 2: Setup the Rule to fire the Tag

We are tracking a JavaScript event (not to be confused with a Google Analytics event), so we use the {{event}} macro to trigger the rule: the rule fires when the {{event}} value matches the string assigned as the data layer variable.

Note: The {{event}} macro must exactly match the data layer value, and be aware that the match is case sensitive!

In the case of the Google+ button, the data layer variable is “gPlusClick.”

Part 3: Create a Container Version

Create a version of your container. Name your container version with a specific and consistent naming convention. This will really help in keeping track of progress and contained tags.

Step 5: Debugging and Google Analytics Real-Time Feature

Good job so far!

Now, to make sure this is all working properly, preview the new container version. There are a few ways to check if your tags are working. First, preview your container in “Preview and Debug” mode. If the tags are firing on the target user interaction, publish the new container and use the Real-Time feature in Google Analytics to make sure that the virtual pageviews and/or events are being reported properly.

If you followed this whole process or you managed to customize your own event tracking, give yourself a pat on the back! If you have any questions, please comment below or reach me on Twitter.