SEO with the Google Search Console API and Python

The thing I enjoy most about SEO is thinking at scale. Postmates is fun because sometimes its more appropriate to size opportunities on a logarithmic scale than a linear one.

But there is a challenge that comes along with that: opportunities scale logarithmically, but I don’t really scale… at all. That’s where scripting comes in.

SQL, Bash, Javascript, and Python regularly come in handy to identify opportunities and solve problems. This example demonstrates how scripting can be used in digital marketing to solve the challenges of having a lot of potentially useful data.

Visualize your Google Search Console data for free with Keyword Clarity. Import your keywords with one click and find patterns with interactive visualizations.

Scaling SEO with the Google Search Console API

Most, if not all, big ecommerce and marketplace sites are backed by databases. And the bigger these places are, the more likely they are to have multiple stakeholders managing and altering data in the database. From website users to customer support, to engineers, there several ways that database records can change. As a result, the site’s content grows, changes, and sometimes disappears.

It’s very important to know when these changes occur and what effect the changes will have on search engine crawling, indexing and results. Log files can come in handy but the Google Search Console is a pretty reliable source of truth for what Google sees and acknowledges on your site.

Getting Started

This guide will help you start working with the Google Search Console API, specifically with the Crawl Errors report but the script could easily be modified to query Google Search performance data or interact with sitemaps in GSC.

Want to learn about how APIs work? See: What is an API?

To get started, clone the Github Repository: https://github.com/trevorfox/google-search-console-api and follow the “Getting Started” steps on the README page. If you are unfamiliar with Github, don’t worry. This is an easy project to get you started.

Make sure you have the following:

Now for the fun stuff!

Connecting to the API

This script uses a slightly different method to connect to the API. Instead of using the Client ID and Client Secret directly in the code. The Google API auth flow accesses these variables from the client_secret.json file. This way you don’t have to modify the webmaster.py file at all, as long as the client_secret.json file is in the /config folder.

try:
    credentials = pickle.load(open("config/credentials.pickle", "rb"))
except (OSError, IOError) as e:
    flow = InstalledAppFlow.from_client_secrets_file('client_secret.json', scopes=OAUTH_SCOPE)
    credentials = flow.run_console()
    pickle.dump(credentials, open("config/credentials.pickle", "wb"))

webmasters_service = build('webmasters', 'v3', credentials=credentials)

For convenience, the script saves the credentials to the project folder as a pickle file. Storing the credentials this way means you only have to go through the Web authorization flow the first time you run the script. After that, the script will use the stored and “pickled” credentials.

Querying Google Search Console with Python

The auth flow builds the “webmasters_service” object which allows you to make authenticated API calls to the Google Search Console API. This is where Google documentation kinda sucks… I’m glad you came here.

The script’s webmasters_service object has several methods. Each one relates to one of the five ways you can query the API. The methods all correspond to verb methods (italicized below) that indicate how you would like to interact with or query the API.

The script currently uses the “webmaster_service.urlcrawlerrorssamples().list()” method to find how many crawled URLs had given type of error.

gsc_data = webmasters_service.urlcrawlerrorssamples().list(siteUrl=SITE_URL, category=ERROR_CATEGORY, platform='web').execute()

It can then optionally call “webmaster_service.urlcrawlerrorssamples().markAsFixed(…)” to note that the URL error has been acknowledged- removing it from the webmaster reports.

Google Search Console API Methods

There are five ways to interact with the Google Search Console API. Each is listed below as “webmaster_service” because that is the variable name of the object in the script.

webmasters_service.urlcrawlerrorssamples()

This allows you to get details for a single URL and list details for several URLs. You can also programmatically mark URL’s as Fixed with the markAsFixed method. *Note that marking something as fixed only changes the data in Google Search Console. It does not tell Googlebot anything or change crawl behavior.

The resources are represented as follows. As you might imagine, this will help you find the source of broken links and get an understanding of how frequently your site is crawled.

{
 "pageUrl": "some/page-path",
 "urlDetails": {
 "linkedFromUrls": ["https://example.com/some/other-page"],
 "containingSitemaps": ["https://example.com/sitemap.xml"]
 },
 "last_crawled": "2018-03-13T02:19:02.000Z",
 "first_detected": "2018-03-09T11:15:15.000Z",
 "responseCode": 404
}

webmasters_service.urlcrawlerrorscounts()

If you get this data, you will get back the day-by-day data to recreate the chart in the URL Errors report.

Crawl Errors

 

 

 

webmasters_service.searchanalytics()

This is probably what you are most excited about. This allows you to query your search console data with several filters and page through the response data to get way more data than you can get with a CSV export from Google Search Console. Come to think of it, I should have used this for the demo…

The response looks like this with a “row” object for every record depending on you queried your data. In this case, only “device” was used to query the data so there would be three “rows,” each corresponding to one device.

{
 "rows": [
 {
 "keys": ["device"],
 "clicks": double,
 "impressions": double,
 "ctr": double,
 "position": double
 },
 ...
 ],
 "responseAggregationType": "auto"
}

webmasters_service.sites()

Get, list, add and delete sites from your Google Search Console account. This is perhaps really useful if you are a spammer creating hundreds or thousands of sites that you want to be able to monitor in Google Search Console.

webmasters_service.sitemaps()

Get, list, submit and delete sitemaps to Google Search Console. If you want to get into fine-grain detail into understanding indexing with your sitemaps, this is the way to add all of your segmented sitemaps. The response will look like this:

{
   "path": "https://example.com/sitemap.xml",
   "lastSubmitted": "2018-03-04T12:51:01.049Z",
   "isPending": false,
   "isSitemapsIndex": true,
   "lastDownloaded": "2018-03-20T13:17:28.643Z",
   "warnings": "1",
   "errors": "0",
  "contents": [
    { 
    "type": "web",
    "submitted": "62"    "indexed": "59"
    }
  ]
}

Modifying the Python Script

You might want to change the Search Console Query or do something with response data. The query is in webmasters.py and you can change the code to iterate through any query. The check method checker.py is used to “operate” on every response resource. It can do things that are a lot more interesting than printing response codes.

Query all the Things!

I hope this helps you move forward with your API usage, python scripting, and Search Engine Optimization… optimization. Any question? Leave a comment. And don’t forget to tell your friends!

 

9 Pro Tips for GTM Auto Event Tracking and the Click Element Variable

This week I did a Google Tag Manager implementation where I had no control over the site’s source code and no access to the site’s developers. It’s an imperfect situation but luckily, GTM Auto Event Tracking affords a solution that is very effective considering these constraints.

Google Tag Manager Should Be Easy

This post is meant share a few tips on how to implement, customize, and debug Google Tag Manager in a situation where you are using Auto Event Tracking. This is especially helpful when you want  to track events on pages that you do not have any control over. The heart of Auto Event Tracking and the focus is this post is the {{Click Element}} Auto Event Variable, also known in the dataLayer as “gtm.element.” Through the whole event tracking process, there are three tips:

  1. Setting up Click Event Tracking using “phantom” event triggers
  2. Custom JavaScript Variables to collect data about the page/interaction
  3. Debugging Auto Event Variables and Trigger

Read More

IFTTT + NFC Tags: a Maker Channel Tutorial

I geeked out recently over the Android task automation app,  Automate. The reason I got so excited was that it provided a very elegant solution for creating a custom NFC sensor for a project I’ve been working on. What I also came to find out was that Automate opened up a lot of possibilities beyond this project.

Automate can be easily extended to prototype a ton of IoT projects. This tutorial shows how your Android phone can easily connect your physical world to any cloud service with a mashup of NFC, Automate and IFTTT.

Android IoT Automation with IFTTT

Example Use Case: NFC Enabled Sustainable Clothing Care

The project I’ve been working on is about improving the quality and sustainability of personal clothing care by instrumenting clothes with NFC tags. This way a clothing owner can be more careful about how they wash their clothes and more mindful about how they wear them. I had been looking for a good way to prototype this solution and my first idea was to create unique ID QR codes for all my clothing but that was a bit cumbersome. I had thought to wire up an Arduino but that seemed like a lot of work just for a prototype. I finally realized that the best way to build this prototype was sitting right in front of me.

Read More

“Hello World!” with Automate for Android

I’ll admit, my posts have been lame lately, but don’t worry, I just found something that I’m pretty excited about! (This is not a paid endorsement, but it will get a bit geeky. ) Automate by LamaLab is a clean, simple, fun and free (!) way to turn your IoT imagination into reality. The app puts all your Android’s interfaces at your disposal to create powerful automation and expressive working prototypes.

Automate by LlamaLab

Automate by LamaLab is a clean, simple, and fun way to turn your IoT imagination into reality.

I found it yesterday while looking for a good way to hack together a custom Android NFC reader. After looking at a few other options, I ended up geeking out on Automate for the rest of the day! It’s awesome! So I wanted to share a quick up and running tutorial so you can join the fun with your very own “Hello World!”

Read More

QR Code Scan-to-Vote App Using IFTTT and Google Sheets

In Defence of QR Codes

If QR codes aren’t being used effectively, it is more the fault of the publisher than the technology. The technology, despite its looks, is quite elegant and offers a lot of opportunity. However, there are some pretty significant barriers to adoption and use. So the burden is upon the publisher to use QR codes in the right context and provide a compelling reason to scan.

Mona Lisa QR Code

Some contexts are better than others. As this discussion illustrates, places where written language is not based on Basic Latin characters, like much of Asia, QR codes can be used to help users avoid quite a lot of mobile typing. The beauty of QR codes, a type of character encoding in their own right, is that they are universal. They are like high fidelity emotions; a lot of meaning packed into a small print expression. This universal characteristic is why I decided to use both together in the example below.

From a pure technological standpoint, QR codes offer quite a lot. They store up to several paragraphs of text or complete contact information, they can uniquely ID physical objects like NFC, and they can also trigger phones to send SMS messages, download apps and open web pages. (No wonder marketers are so eager to use them.)

The trigger to open webpages is a useful feature but publishers’ use of this trigger have fallen short of its potential. A smartphone’s web browser is much more than just a content browser. It is a interface to the blossoming world of Web 2.0 Web services. This is the unrealized potential of QR codes.

QR Codes as Application Interfaces

The technology, on the surface, is simple, it’s utility is elegant, and it’s recognition is universal. It is, in essence, just a medium but its capabilities are broad. We just have to think within the right application.

The Web is your Application

Most of us use our mobile Web browsers to request documents from Web servers (most often HTML web pages) or interact with Web apps. And that is traditionally what QR codes have been used for. But if we think of a mobile web browser beyond just a medium for viewing documents and more a means of sending HTTP requests, there is a lot of World Wide Web territory that opens up.

Beneath the document-based surface of the Web, there is a lot happening. There is a lot of data being passed around from place to place. Most often this happens as data is requested and sent from an API to dynamically update a web page (think of your favorite travel booking site) or connect one website or service to another (think of Twitter streams on web pages).

See what Web Services are available via IFTTT

These API’s are, very often, available to authenticated users of the service or even the public. There are API’s for updating the price of items on eBay, posting to Facebook, getting the current weather conditions, or getting the structured data from your favorite Reddit page. And all of these API’s and many more, are RESTful, meaning that sending a request to a specifically structured URL will return some structure data or tell a service to do something for you. This is where mobile web browsers come in.

The QR Code is your Interface

So all we have to do to make a web service do something is send a request to a URL. And all we need to trigger a request to a URL is scan a QR code that contains that URL. This means that scanning QR codes becomes a physical interface to the power and connectedness of the digital world. It is now the same as pressing buttons to control the Web!

Cat Button

The question is now, what would you want this button to do? Depending on what you can connect to, this could be everything from light switch, a messaging service, an alert system, a doorbell, some kind of Kanban system … or a voting system.

Why not Mobile Apps or IoT Hardware?

If you’ve gotten this far and you’ve done more than just scanned for cat Gifs, you are probably asking, “Why not use a Mobile App or IoT hardware where you can literally use a button as a trigger?”

The answer is cost and ease. Their is a huge cost to DIY mobile applications and connected devices. Using QR codes is as simple as generating the codes and printing the papers. Again, the context must be considered. QR codes as triggers are probably best used where use is passive and takes place over a short timespan making investment in an app or hardware infeasible.

Also persuading the download and use of mobile apps is a massive barrier. The same could be said for QR code scanner apps but for most applications, QR code scanner apps are more universal than any application that would offer the same functionality. Additionally, in places where QR codes have a stronger value proposition, like Asia, (see above) QR code scanners will be found in apps that are already commonly used, like WeChat.

Uniquely QR Codes

There is also an interesting combination of properties of QR codes that can make or break them as triggers. If the publisher uses QR codes with intention, these properties can become a features of the technology rather than a weakness.

  • Contextual – Users have to be in the same physical location of the QR code to receive or send the data that the QR contains. In a way, the location of the QR code acts as an implicit authentication. If and only if users are nearby can they scan the QR code.
  • Inherent barrier to use – This one is difficult to view as a feature but the barrier could serve as proof of motivation. This barrier can allow us to infer some things about their user and/or the strength of their motivation. Digital fluency, to some degree, is required. There is also a measure of curiosity and/or motivation that is needed.
  • Visual – You know them when you see them and they are their own call to action. They are the same as putting a big link on a page that says “Click Here!”
  • Coded – Again, they hold more information or instructions than one could ever hope to print in the same space and have it make sense. They also allow for long, dynamic URLs or if you so choose, intentional misdirection.
  • Can hold and send information about their context – Depending on how dynamic the publisher chooses to be in creating QR codes, the codes could contain an element of mass customization. The QR code’s URL could send the destination URL (web page or web service) information about the context of the QR code. This could allow the receiver of the Web request a customized response or action in the same way that a search query on your favorite ecommerce site provides customized information on the following page.
  • Duplication – Ok, it’s really hard to see how this could be anything but a weakness. It is somewhat difficult to ensure that a URL is not requested twice, either by scanning a code twice or browser refresh.

Motivating this Example


The intent is to demonstrate two things: dynamically generated QR codes that hold contextual data and web requests to a web service rather than a web page. To do this, I decided to create a QR code ballot where votes are tallied in a cloud service.

There have been other applications of QR codes for voting. In most of these, the QR code provides a link to an online form (hopefully mobile-friendly) where voters can enter their responses. Unlike these applications, in this version, scanning the QR code is the actual act of voting. This infrastructureless mashplication is also free and a MVP of what could be done.

Building a Prototype

For a test case, I used a project that has been happening in my neighborhood for a couple years called HK Walls. This project pairs wall donors to artists from around the world to create awesome public art. At the moment, each of these walls, has nothing to signify that it is part of the project or the project itself.

The HK Walls project does fit perfectly into this voting idea because it is location-based and inherently mobile (it has its own hashtag). It is also clearly visual and lends itself to normative voting. Art is either the perfect or the worst use case for this but this is just a test. There are plenty of other applications for this if you use your imagination.

To build the prototype, very little is needed; just Google Spreadsheets, IFTTT, a B/W printer and some spare time.

Step 1: IFTTT Recipe Setup

This step provides the base URL for the IFTTT Maker channel. It even creates the spreadsheets for all future steps. See my recipe.

IFTTT Recipe: QR Code Scan to Vote connects maker to google-drive

Step 2: Dynamically Generate Lots of QR Codes

I used Google Spreadsheets and Google Charts’ QR code API to generate the QR codes. Each QR code was dynamically generated by using Google Charts’ QR code API to encode a combination of an IFTTT Maker Channel URL, the location of the HK Wall and the variant emoji. The result was 88 QR codes with different contextual (location) and normative (emoji) information.

Voting With QR Codes

The normative emojis are placed above each location’s row of emojis so that they can be cut out with the set of QR codes. After finishishing step 3, I added a column of QR codes that would send scanners back to the chart.

See the Google Spreadsheet QR Code Generator for Voting

Step 3: Setup and Publish the Results Chart

The fifth column of QR codes holds a QR code that links scanners to the aggregate results of the poll (average rating values for each wall). This is a simple chart based on a pivot table of average scores. The chart was also published so it could have its own URL for scanning.

The Google Chart URL shows this on a mobile browser:

Step 4: Print the QR Codes and Rock the Vote!

If I were to do this in real life, I would print these all out and paste them up on the walls. But it is not my place to facilitate judging of free public artwork. That seems to oppose the spirit of the project. I encourage you to though to use this for your own Democracy!

Use Your Imagination

I had to see this was possible. I had to know that QR Codes could control the connected web. I had to see that it could scale with minimal technical know-how.

I am satisfied with the result, I think there is some testing to do, (I am still thinking of ways to gather some actual data of QR code use in Hong Kong.) and there are many applications yet to explore. I hope that this might motivate your curiosity to find what’s left to find.

Dynamically Pre-fill Google Forms with URL Parameters

Why? Because, connect all the things! Why else? Because the less information people have to fill in to your form, the more likely they are to complete it – especially if you are asking them to fill in information that they will have to look up! Dynamically pre-filling forms is a great conversion optimization hack to show your prospective respondents a little love. This love will increase form completion rates and response accuracy.

The question originally came to from someone who had read my post about Mailchimp Reporting with Google Spreadsheets (same products but different applications) But hey, sometimes you don’t even know where to start your Google search! His question was this:

I send a reminder email to my customers through MailChimp. The email contains the customer’s account code and some other data unique to the customers. A link on the email sends the customers to a Google Form where they will answer some questions. I want to have the customer’s unique customer code populate a field so I that I know  who the response came from, and in doing so, reduce the number of fields a customer would have to complete.

Do you have a solution for this? MailChimp and Google apps said there isn’t, but Google did suggest it might be resolved with a script (which I have no idea how to do).

It seemed interesting. I thought about the script. I was hoping this would not be the answer because that sounded like trying to breed an ostrich and an alligator. Then I remembered a project I had worked on in tracking Wufoo form submissions in Google Analytics. That was an Ostrigator but hey, it worked!

The key was passing data around on query strings. I knew there was dynamic values in Mailchimp in the form of Merge Tags. Then all we needed was a link to Google Forms that would allow you to pre-populate the forms. Lo and behold: Pre-populate form answers! So all we would have to do is match the right merge tags as to the right form URL query string parameters and we would have pre-populated forms from the dynamic values in the emails. #masscustomize all the things!

Merge Tags in Google Form Link Parameters

Step 1. Get a Pre-Filled Google Form URL

To get started, go to your Google Form editing page and click responses. Select “Get pre-filled URL”

Hubdango Google Forms

Step 2. Pre-fill the Form with Merge Tags

Find the merge tags that you want to use and enter them into the form boxes. *When you do this, make sure to use merge tags that are accurate for your whole mailing list. Missing information is not a problem but if your merge tags are inaccurate, this could cause confusion and/or complaints.

User Validation

Step 3. Copy the Pre-filled Google Form URL

The merge tags have been appended to to the form’s URL as query string parameters. Now you have your link to your Google Form that will automatically fill the values of your form with the values of your Mailchimp Merge Tags.
User Validation 2

How Query String Parameters Work

At this point, you may be wondering how this works. You have a URL that looks like this:

https://docs.google.com/forms/d/1MnRQJ3XZxsRBSVp28M7Sm1M8Gm5jniE/viewform

followed by this:

?entry.1040949360=*%7CFNAME%7C*&entry.271521054=*%7CLNAME%7C* ...

The part after starting with with the question mark (?) is the query string. It is made up of key-value pairs connected with ampersands (&). Query strings are used to pass information to webpages or “resources.” The server that handles the request will know what the query string’s keys mean and depending on each key’s value, will dynamically generate or modify the resource. (See how query strings works with a API’s) In this case, Google’s server just takes the value of each secretly encrypted key and places the value in the form field associated with that key.

But why does the ‘First Name’ Merge tag look like *%7CFNAME%7C* ?

You have seen strings on the internet, often on URL’s that have something like “%20” in them. This is called URL encoding. It is one of those web standards that allows the magic of the web to work. Because the pipe character (|) is not a normal character it is encoded after the percent symbol as “%7C” .

Don’t worry when you paste this into your Mailchimp email, Mailchimp will automatically decode the URL and it will end up looking like this:

?entry.1040949360=*|FNAME|*&entry.271521054=*|CLNAME|* ...

Now go answer every other question you have ever had about URLs.

Now that you’re back, go setup your new email campaign.
Mailchimp Merge Tags in links
Then when your email recipient and prospective form respondent gets their Mailchimp email, the link will be dynamically modified to look like this:

?entry.1040949360=Trevor&entry.271521054=Fox ...

and… Hello World! The Mailchimp has correctly rendered the links with the dynamic values from the merge tags.
Rendered links with Mailchimp Merge Tags

Digital Marketing is Science + Art

When it’s all said and done, an appealing email subject like and effective copy will only get respondents to go to the form page. After that, their is a lot left to optimize. As digital marketers, we are doing much more than trying to persuade. We are delivering a message, removing friction and optimizing experiences. We are measuring and testing all the way! Don’t forget, our job is just as much about perfecting the message as it is the medium.

To build more Digital Marketing Technical Skills checkout my $10k Tech Skills Series.

 

 

Track REST APIs with the Google Analytics Measurement Protocol

Google Analytics got a whole lot more interesting when the Measurement Protocol was introduced. We already knew GA was the industry standard for web analytics but with the Measurement Protocol it has become the analytics platform of anything and everything that can be made digital. With some clever instrumentation, we can now use it to track products through the supply chain or track users interactions in a store. All you need is a way to collect digital data and send HTTP requests to Google Analytics and you can track anything.

I had to try it out for myself. While I could have fitted #rhinopug with a tracking device or instrumented my coffee machine with an Arduino, I took the easier (but equally cool) route to getting data: a Web API. As my proof of concept, I chose to track the SwellPath team’s group chat application called GroupMe.

Google Analytics Measurement Protocol

GA Dashboard Courtesy of Mike Arnesen

Tracking a chat app turned out to be a pretty cool way to walk that physical/digital line. While we are humans working in the same office, its interesting to compare contextual information from what we can see and hear to the very objective measure of communication; desktop and mobile messaging. This concept is similar to other measures of digital communication like Twitter firehose or brand mentions from news API’s. Those are probably much more relevant to, and could actually affect a website’s performance but, let’s be honest, this one’s a lot more fun.

Mapping Data to Google Analytics

Digital messaging is actually pretty appropriate for the Google Analytics reporting interface. The main reason is this: timestamps. We rely heavily on timestamps to analyze everything in Google Analytics which are all time-based hits. We ask Google Analytics how different landing pages perform as seasons change and what time of user’s are most likely to convert (in order to bid intelligently on ads). Likewise, there is also a natural rhythm to work-based communication. Of course, (or hopefully) its pretty quiet on the weekends and generally pretty active as people start each workday.

The other reason that human communication maps to the Google Analytics reporting interface is that message creation is a lot like content consumption. When we really think about what a “hit” schema looks like, it has a few entities what go together something like this:

[actor] did [event] on [location] at [timestamp]

This “hit” schema works equally well for describing message creation as it does content consuming.

With every hit, the [actor] a.k.a. User is assigned some attributes like Device or New/Returning and the [event] a.k.a. Event, Pageview or otherwise, will have attributes like URL and  Page Title for Pageviews or Action and Label in the case of Events. The [location] is an interesting one. For web, its the page that the user is browsing but it’s also the physical location of the user a Lat,Lon pair with appropriate geographic information. The [location] attributes are generally handled by Google Analytics automatically but speaking from experience, the real art of a good collection strategy is mapping the right information to the right attribute of each entity.

To make sense of the idea of mapping information to attributes let’s get back on track and talk about GroupMe. It boils down to this: you have data and you want it to appear in Google Analytics in a way that you can logically sort/filter/analyze it. This is where the mapping comes in.

GroupMe’s API gives you data about a group’s messages like this:

{
  "count": 123,
  "messages": [
    {
      "id": "1234567890",
      "source_guid": "GUID",
      "created_at": 1302623328,
      "user_id": "1234567890",
      "group_id": "1234567890",
      "name": "John",
      "avatar_url": "http://i.groupme.com/123456789",
      "text": "Hello world ☃☃",
      "system": true,
      "favorited_by": [
        "101",
        "66",
        "1234567890"
      ],
      "attachments": [
        {
          "type": "image",
          "url": "http://i.groupme.com/123456789"
        },
        {
          "type": "image",
          "url": "http://i.groupme.com/123456789"
        },
        {
          "type": "location",
          "lat": "40.738206",
          "lng": "-73.993285",
          "name": "GroupMe HQ"
        },
        {
          "type": "split",
          "token": "SPLIT_TOKEN"
        },
        {
          "type": "emoji",
          "placeholder": "☃",
          "charmap": [
            [
              1,
              42
            ],
            [
              2,
              34
            ]
          ]
        }
      ]
    }
  ]
}

If this doesn’t make sense to you, go read up on JSON. But essentially what you get when you ask the GroupMe API for the most recent messages, it returns a list of messages with, among other things, the sender’s name and user ID, the message, the number of likes, and the location. So we have information about each of the “hit” entities. The user, event, place and time are all described. The only thing missing that is critical to web analytics metrics is something similar to Page. For that reason I decided to use Google Analytics Events to describe each GroupMe message. Each hit maps GroupMe data to Google Analytics as follows:

Google Analytics Parameter GroupMe Data / JSON Keys
User ID GroupMe User ID / user_id
Client ID GroupMe Source GUID / source_guid
Custom Dimension (User) GroupMe Username / name
Event Category “GroupMe Chat”
Event Action “Post”
Event Label Truncated Text of Message / text
Event Value Count of Likes / count(favorited_by)
Queue Time Difference between Now and Timestamp /current time – created_at

Then each GroupMe message is sent to Google Analytics on an HTTP request with data mapped to GA parameters as shown above. Collect data for a few days and then it looks like this:

Measurement Protocol Specific Values: Queue Time and Client ID

If you come with a Web analytics frame of mind, there may be two things that are unfamiliar to you: Client ID and Queue Time. These are both a pain to get right but functionally awesome.

The Client ID is something you don’t have to think about for web data collection; it’s automatically collected from a cookie that Google Analytics sets for you. It is very important though. It is the key for differentiating two devices that, by their collectible attributes “look” the same but are not. The CID must follow very specific rules to be valid and lucky for me, GroupMe offers a GUID for each message that fits the specifications.

Queue Time is awesome. This is the single most important factor in getting the time value of at Measurement Protocol “hit” right. It is the delta (a cool way to say difference) between the time that the event occurred and the time that the hit was collected. If you send the hit to Google after the hit took place, Google’s servers calculate the time delta and record the hit at the time that it actually took place.

This was especially important for the method I used to get data from GroupMe and send it to Google Analytics. Because I was only getting the messages from the GroupMe API once an hour. Without the Queue Time, the hit timing would be very low fidelity, with spikes each hour when the data was collected and sent. By calculating the Queue Time when each message was sent, I got accurate timing and didn’t have to worry about burning through API limits or wasting lots of HTTP calls. (Think about it, without Queue Time, your data is only as accurate as the frequency that your hits are sent which was a cron job in this case.)

Google Analytics Measurement Protocol API

Don’t call it a hack. Ok, call it a hack.

Lessons Learned / How I’d Do it Next Time

This ended up working out pretty well thanks to a fair amount of luck and plenty of read the docs, code, debug, repeat. I got lucky when I realized I hadn’t accounted for things like the mandatory Client ID parameter and … the fact that my server doesn’t run Python cron jobs. As a result I ended up writing my first PHP script and here I am sharing 100-some lines of amateur code. But hey, this proof of concept works!

If I were to do this again, I would answer a few questions before I started:

Get to know the API

  • Will the API I want to track give me all the data I need?
  • Are events timestamped or do I have a way to approximate that?
  • How difficult is authentication and how long does it last for?
  • Am I going to operate safely within the API rate limits?
  • What about Terms and Conditions of the API data?

Map the Data to Google Analytics

  • How will I avoid making recording the same hit twice?
  • What type of Google Analytics Hit will I use?
  • How should I map the API’s data to a Google Analytics hit?

Automate!

  • Can I write some code to automate this?

How the Code Works

The code I wrote to automate this is listed below but if you are unfamiliar with PHP or code in general the instructions that are given to the computer are essentially this:

Call the GroupMe API to see if there are any new messages since last time
  If no: stop.
  If yes: continue
Call API to get/make a map of User ID’s to User Names to send with hits
For each message that was returned:
  map it to GA parameters
  send it as an event to GA
  For each like of each message:
    map it to GA parameters
    send it as an event to GA
Write the the most recent message ID to a .txt file (to keep track of what has been sent)

Wait for about an hour and repeat with the next cron job

It was a fun project and luckily a successful proof of concept for tracking non-website data in Google Analytics. If you’re thinking about doing a Measurement Protocol project, leave a comment or tweet me at @realtrevorfaux (don’t worry, I’m not tracking it). If you’re interested in other cool ways to track offline transactions, check out Google Analytics Enhanced Ecommerce, I really look forward to what is to come of the Measurement Protocol with things like IoT. Connect, collect, and analyze all the things!

The PHP code that I used is below. Give me a break, this is the first PHP programming (and maybe last) I’ve ever done.

<?php

// script configuration stuff
$token = "abc123"; // from dev page
$group_id = "1234567";  // from dev page
$memory_file = "last_id.txt";

$UAID = "UA-XXXXXX-XX"; // Google Analytics UA Code
$member_names_map = makeNameMap();

// saved last message id to file
$since_id = file_get_contents("last_id.txt");

// endpoint to get lastest messages
$url = 'https://api.groupme.com/v3/groups/'. $group_id .'/messages?token=' .$token. "&since_id=". $since_id;

// call the groupme api
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$response = curl_exec($ch);
$http_status = curl_getinfo($ch, CURLINFO_HTTP_CODE);
curl_close($ch);

// check response code and do the rest if no change.
if ($http_status === 304){
  echo "API RETURNED: ". $http_status ." n";
} else {
  $message_count = $response->count;
  echo "API RETURNED ". $message_count ."MESSAGESn";
  handleMessages($response);
}


function handleMessages ($response_obj){

  $json = json_decode($response_obj);
  $messages = $json->response->messages;
  $timestamp = time();

  foreach ($messages as $message) {

    global $UAID;
    $queue_time = $timestamp - $message->created_at;

    $post_hit_params = array (
      'v'=>1,
      'tid'=>$UAID,
      'uid'=>$message->user_id,
      'cid'=>$message->source_guid,
      't'=>'event',
      'ec'=>"GroupMe Chat",
      'ea'=>"Post",
      'el'=>$message->text,
      'ev'=>count($message->favorited_by),
      'qt'=> $queue_time,
      'cd1'=> $message->name,
      'cd2'=> $message->user_id
    );

    sendGAHit($post_hit_params);

    $favorited_by = $message->favorited_by;

    foreach ($favorited_by as $id) {

      $name = $member_names_map->$id;

      $like_hit_params = array (
        'v'=>1,
        'tid'=>$UAID,
        'uid'=>$id,
        'cid'=>$message->source_guid,
        't'=>'event',
        'ec'=>"GroupMe Chat",
        'ea'=>"Like",
        'el'=>$message->text,
        'ev'=>1,
        'qt'=> $queue_time,
        'cd1'=> $name,
        'cd2'=> $id
      );

      sendGAHit($like_hit_params);
    }
  }

  // get last message/id from this call's messges
  $last_message = current($messages);
  $last_message_id = $last_message->id;
  writeMemoryFile($last_message_id);
}

function sendGAHit ($params){

  $query_string = http_build_query($params);
  $url = "www.google-analytics.com/collect?". $query_string;

  // send hit to GA
  $ch = curl_init();
  curl_setopt($ch, CURLOPT_URL, $url);
  curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
  $response = curl_exec($ch);
  $http_status = curl_getinfo($ch, CURLINFO_HTTP_CODE);
  curl_close($ch);

  echo "n";
  var_dump($params);
  sleep(1);
}

function writeMemoryFile ($last_message_id){
  global $memory_file;
  // write last ID to file for next time
  $memory = fopen($memory_file, "w");
  fwrite($memory, $last_message_id);
  fclose($memory);

  echo "LAST ID WRITTEN TO FILE: ". $last_message_id ."n";
}

function makeNameMap(){
  global $token;
  global $group_id;
  $url = 'https://api.groupme.com/v3/groups/'. $group_id .'?token='.$token;
  // call the groupme api
  $ch = curl_init();
  curl_setopt($ch, CURLOPT_URL, $url);
  curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
  $response = curl_exec($ch);
  $http_status = curl_getinfo($ch, CURLINFO_HTTP_CODE);
  curl_close($ch);

  $json = json_decode($response);
  $members = $json->response->members;
  $member_names_map = new stdClass;

  foreach ($members as $member) {

    $user_id = $member->user_id;
    $nickname = $member->nickname;
    $members_names_map->$user_id = $nickname;

  }

  return $members_names_map;
}

?>

 

Product Scope Custom Dimension & Metrics in Google Analytics

Google Analytics Enhanced Ecommerce provides insight into Ecommerce performance in detail that was impossible using standard Google Analytics methods. To get the most out of Enhanced Ecommerce, you must understand the breadth of Enhanced Ecommerce data collection capabilities. This tutorial will take you from Ecommerce product page to Enhanced Ecommerce Product Data.

Enhanced Ecommerce Product Page

A pretty standard Ecommerce Product Page. Thanks to smarthome.com

Enhanced Ecommerce Reporting

Enhanced Ecommerce Product Reporting with Custom Dimensions and Metrics.

Product Data Problem and Solution

The new product schema was created to provide insight in a way that answers questions that are specific to products rather than pages or events. Consider event hits; their schema uses a sentence-like structure to describe an action.  This does not map well to a product entity that is not bound by specific instances in time. The product schema is more similar to pageview hits that collect attributes of a page entity over time. But still, the data is collected on the page is in a different fashion.

For more, see Carmen Mardiros’ great conceptual explanation of Enhanced Ecommerce Data Layer and my slides on Enhanced Ecommerce schema.

Google Analytics Enhanced Ecommerce assigns a collection of properties to each product.  At least one property is mandatory (either name or id), and other properties are optional: brand, category (up to five tiers), variant, price, quantity, coupon, and position. These properties describe either an attribute of the product or the context of the product action. These standard features provide a holistic understanding of how users interact with products in an Ecommerce store. But they don’t give the whole picture of how customers interact with products within the context of any particular business.

 Understand how users interact with products based on qualitative dimensions like its availability, backorder date, release date or physical dimensions or appearance.

For instance, you may want to understand how your customers’ shopping behavior changes for products based on qualitative dimensions like its availability, backorder date, release date or physical dimensions or appearance. Likewise, there is more to understand about products’ quantitative information such as the cost of goods sold, previous price, discount, or profit. This is where custom dimensions and metrics come in.

Product Data Collection

Google Analytics collects data about product entities in a way that appropriately fits the concept of a product; each product is represented by Javascript Object. An Object is simply a thing represented by a collection of properties about that thing. In other words, a product entity would be represented by an Object that has a name property of  “Clapper Light Switch” and a price property with a value of 30.00.

To extend the ability for an Object to describe a product entity, we can specify additional properties like “cost of goods sold,” or “backorder date.” At the code level, this means adding one more property “key”:”value” pair to the product entity object. The only difference is that the properties name will be a placeholder such as “dimension9” or “metric4.” The dimension or metric name will be assigned later within the Google Analytics interface. In Universal Analytics it would look like this:

ga("create", "UA-XXXXX-Y");
ga("require", "ec");
ga("ec:addProduct", {
   "id": "81301",
   "name": "Xantech AC1 Controlled AC Outlet",
   "price": "78.55",
   "brand": "Xantech",
   "category": "AC1 Controlled AC Outlet",
   "variant": "white",
   "dimension3": "In stock - Ships Today", // stock status custom dim.
   "metric2": 5,                           // rating stars custom metric
   "quantity": 1 });
ga("ec:setAction", "add");
ga("send", "event", "Product", "Add to Cart", "Xantech AC1 Controlled AC Outlet");

and using the Google Tag Manager data layer, it would look like this:

dataLayer.push({ 
   "event": "addToCart",
   "ecommerce": { 
      "currencyCode": "USD",
      "add": {
         "products": [{
            "id": "81301",
            "name": "Xantech AC1 Controlled AC Outlet",    
            "price": "78.55", "brand": "Xantech",            
            "category": "AC1 Controlled AC Outlet", 
            "variant": "white", 
            "dimension3": "In stock - Ships Today",  // stock status
            "metric2": 5,                            // review stars
            "quantity": 1 
         }]
      }
   }
});

For a great working example of this, see:  https://ga-dev-tools.appspot.com/enhanced-ecommerce/

Setting Custom Dimension & Metric Names

The first thing to note is, if you are using the Google Tag Manager data layer to collect product data, make sure you have checked the “Enable Enhanced Ecommerce Features” and “Use data layer” boxes shown below. Google Tag Manager Add to Cart Event No matter if you are using the data layer or standard Universal Analytics collection code, you will have to do two things:

  1. Ecommerce and Enhanced Ecommerce Reports must be enabled.  Just go into your Admin section > choose a View > Ecommerce Settings and toggle  Enable Enhanced Ecommerce Reporting to “ON.”
  2.  Custom Dimension and Metric must be activated, named, and configured as you want them to appear in Google Analytics reports. This is also done within the Admin interface. Go to Admin > choose a Property > Custom Definitions and click Custom Dimensions or Custom Metrics. Set the name, the scope to “Product”, and the state to “On.” For Custom Metrics, set the appropriate formatting type.

Enhanced Ecommerce Product Custom Metric Setup Note that these hits must be set at the product hit level. Otherwise, the data will not be collected as expected, if at all.

Enhanced Ecommerce Product Data Reports

Now you are ready to appreciate all your shiny new insights. To find these reports within the Reporting interface go to Conversions > Ecommerce >  select a Product or Sales report.

Enhanced Ecommerce Reporting

From there you can see all your products and sort, filter, and view them by any their newly recorded properties. Metrics can also be added to Custom Reports to provide aggregate insights.

Inspecting, Debugging and Perfecting Product Data Collection

You may see something funny in your reports at this point or nothing at all. In my experience, Enhanced Ecommerce data is published as soon as the hit it was sent with is recorded. This is usually relatively quick. (Under 10 minutes) If something looks amiss, don’t worry. There are usually a few simple fixes that can be made to make sure the data is being collected and reported correctly. Assuming you have done everything correctly up to this point, there may be a few things you need to check and fix. Here’s a list:

Debugging Enhanced Ecommerce Hits

This is what you are looking for in the Google Analytics Debugger.

  • Make sure the hit is being sent to the right property.
  • Use mandatory product fields (name or id) as report dimensions.  This is helpful when starting out. If you are looking at a  report with a primary dimension of Product List, but are not yet collecting product list data, the report will appear to be empty.
  • Make sure Ecommerce data is sent with a standard Google Analytics Hit. Enhanced Ecommerce data is buffered on the page until a stand Google Analytics hit is sent. Then the Ecommerce data is collected with that hit.
  • Make sure Javascript Object data structure is correct and without errors. Use my data layer debugger bookmarklet to verify that the data is in the data layer. Also, keep an eye on the Javascript console and use jshint.com to make sure there are no errors and everything is formatted correctly.
  • Try this cool method of inspecting each object that is pushed to the data layer is valid by using JSON.stringify to view Objects in the data layer. Just type the following command into your Javascript console and inspect the object in JSON (JavaScript Object Notation).
JSON.stringify(dataLayer[0]) 
// where 0 is the index of the first object in the data layer array

Accessing Nested Values with Google Tag Manager Variables

When Google Analytics event hits carry Enhanced Ecommerce info, you may want to use a product’s attributes as the values for the event’s Category, Action, or Label or event Custom Dimension or Metric. Similarly, product data can be applied as a custom dimension on the pageview hit that, for example, is sent to on a product page to carry product detail view Enhanced Ecommerce information. In these case, if you are using Google Tag Manager, you can access the values of the product or actionField data using a Data Layer v2 Variable. These GTM Variables allow you to access property values that are nested within Objects or Arrays in the data layer.

For instance, if you wanted to access the name of a product that was just added to a shopping cart (as shown above), you would use the following format without the quotes: “ecommerce.add.products.0.name”. Note that the 0 is specifying the index (zero-based count) of the Object noted in the array that is enclosed in [brackets].

Thanks, Simo for getting me back on track with this.

A Note on Product Hit Scope Dimensions and Metrics

Custom Dimensions and Metrics are product level, and won’t be applied to an event or page on which the happen or to a user that interacts with them. That is done by setting the dimension or metric at the hit level. Just make sure to configure the hit Dimension or Metric accordingly.  Check out this old but good explanation of  hit scope by Justin Cutroni.

Start Collecting and Start Optimizing!

This may seem complicated, but the power that it provides is well worth time spent in a detailed implementation. Please leave a comment if you have any questions. Or send me an email at t@tfox.us and let’s get this started!

Thank you to SmartHome.com for the pretend data. I want everything in your store.

Scale Optimizely Testing with Google Tag Manager

You already know that Google Tag Manager is an incredibly powerful tool but you may not know that it can literally do anything. In fact, Google Tag Manager made me pancakes this morning. Not really, but a man can dream.

Google Tag Manager is comprised of three components: Tags (the things that do stuff with data), Triggers (formerly Rules, the things that decide when/if stuff is done), and Variables (formerly Macros, the things that express the data). This post is one post of many that shows that, thanks to the Custom HTML Tag, Google Tag Manager can do virtually anything.

Custom HTML Tags let you place any HTML, CSS, or Javascript on to the page that you might want. This could be anything from a tracking pixel, to CSS rules similar to Optimizely’s global CSS insert, to JavaScript AJAX commands to communicate with another API like Google Analytics, Lytics or if you must, SiteCatalyst. So when we say Google Tag Manager can do anything, we mean it. But that’s just because Javascript can do anything.

On Page Data Collection with Google Tag Manager

That being said, when we approach any on-page data collection problem, the question boils down to what is available and what needs to be passed on? To solve the problem simply clarify those two variables and use Google Tag Manager to draw the line between input and output.

Optimizley Goal Tracking is an excellent example of this. The input is pretty clear. We want to measure specific actions (most often clicks) that users might take or experience that will give us insight into the effect that our experiment is having. Aside from pageview tracking, output is less understood.

At a high level, we know what we want to see in terms of outputs: lines on a graph, hopefully with our new variation showing an improvement on the original. But how does it get there? While you can easily attach event triggers to clicks within the Optimizely interface. There are other events like Form Submit Events, YouTube Video Events , and Scroll Depth Events that are not quite as easy to capture.

This is where Custom Events and Optimizely’s Javascript API come in. At a basic level, whenever a Optimizley Goal happens, it is pushed into a queue of all the Goals that may have or may yet happen on the page. This queue of events are sent to Optimizely’s servers using Javascript AJAX (Asynchronous JavaScript and XML) when the Javascript commands can be executed. (This is much like how the  Google Tag Manager Data Layer works.)

At the Javascript code level, this is how it looks when a Goal is pushed onto the queue.

Optimizely Event Javascript

* Now is a good time to note, that Optimizely does not differentiate between Click Events and Custom Events. They are all Custom Events at the code level. The only thing that differentiates Event Goals is their names. (This is what you see when you click “Advanced”in the Goal setup process.) That is also what is used in the above code for “eventName”.

So now that we have a pretty good idea about what our output should be, it’s time to draw the line between input and output with Google Tag Manager.

Optimizely Custom Event Goal Setup

After a few Optimizely test implementations I came to realize I was repeating myself and I am a HUGE D.R.Y. principal person. With each implementation I found myself in the experiment setup clicking on an element to track and going through the process of making sure that this element actually tracked as expected. The problem was that every time I would setup these Goals, I would say to myself, “we are already tracking this with Google Analytics using Google Tag Manager.” So this is how I came up with this (fairly) universal method for tracking Optimizely Event Goals.

With Google Tag Manager, we specify triggers to determine when a Google Analytics Event is fired. This trigger also signals when Optimizely Event Goal actions take place. So the light went on and I decided to make both things happen at the same time.

Tracking Optimizely Event Goals With GTM

As I mentioned before, Google Tag Manager relies on “events” being pushed to the data layer similar to Optimizely Custom Event Tracking. Google Tag Manager’s syntax looks like this:

dataLayer.push()

Where the “event” value, in this case: “customizeCar”, is used to trigger the firing of GTM tags. With each push to the dataLayer, additional attributes can also be specified to associate with that event. (This makes GTM much more scalable to use for multiple tag types. Also, now is a good time to read the documentation on this.)

Optimizely, on the other hand, sends a single event name for each goal. (Queue light bulb) We can just use the dataLayer event value from GTM for Optimizely Goal Events!

Configuring GTM Custom HTML Tags

You will need two things to do this: a Data Layer Variable to get the value of the dataLayer event and a, you guessed it, Custom HTML tag to pass the event to Optimizely.

GTM MenuData Layer Event Name Variable

GTM already provides a {{event}} auto-event variable but this is not what we are looking for. The Auto-Event Variable aka: {{event}} is used for listening for events and triggering tags but it won’t return the value of the current dataLayer “event” variable. We will need to make a variable to do this. Our new varialble will look like this:

Universal Optimizely Custom Event Tag with Goal Tag Manager

To send the event to Optimizely we will need a Custom HTML tag to run a little JavasScript script to interact with the Optimizely Javascript API and send the Custom Event Goals.

GTM Custom HTML Tag

Trigger the Custom Event Goal

This is where the logic happens! Decide which dataLayer auto event variable you would like to send to Optimzely as Custom Event Goals and set them up as triggers, as shown below. Because this trigger is (fairly) universal, it can be setup with multiple custom event triggers (shown below) as long as the dataLayer event names are unique.

GTM Customize

So that (fairly) universal tag works best when you are passing in data layer events that have unique names. Putting in the planning up front can really pay off down the road. We can apply the same idea to a few other metrics that you might have considered while setting up Optimizely likeForm Submit Events, YouTube Video Events , and Scroll Depth Events.

Form Submit Custom Event Goal

The Custom HTML tag will look like this:

The Trigger will look like this – be sure to specify your form by element ID, Class or maybe data attributes. This is what the tag would look like if it had a distinctive element ID:

GTM Custom Form Event

Instead of setting it all up, download and merge this container including tag, variables and triggers here. [Instructions]

YouTube Video Custom Event Goals

If you are doing a test that involves optimizing a video, you want to track video plays with Optimizely.

The Custom HTML tag will look like this:

GTM Custom YouTube Event

This one is a little more complex so check out this explanation, or download and merge the complete container here[Instructions]

Page Scroll Depth Custom Event Goals

You might use this as a negative goal. The Custom HTML tag will look like this:

GTM Custom Scroll Event

This one is a little more complex so check out this explanation, or download and merge this container here[Instructions]

I hope all this offers an idea of the flexibility and limitless power of Google Tag Manager and answers the question, can I track that? Just remember, Optimizely is a great testing platform but Google Analytics is a great web analytics platform. If you need help with either, let us know. This is what we do and we love it.

Agile Strategy for Data Collection and Analytics

If you are like most people doing business online, it seems like there is always a long list of digital to-dos that are somewhere between “that will happen in Q4” and “that should have happened by Q4 last year.” Aside from the constant stream of daily hiccups that arise due to the asynchronous nature of our medium, if you are like most others managing a website, you face broader development challenges of slow servers, uncooperative CMS’s, or lame mobile experiences impacting your online success.

This is not failure that you have to accept! Let me introduce you to a little thing that has been bouncing around in the software/web development community that will make your online business operations feel less like swimming in peanut butter. It’s called Agile Development and it’s sexy. It’s fast and sexy like a cheetah wearing high heels.

We can apply these principles of Agile Development to data collection, analytics, and optimization to provide two exceptional benefits: rapid access to data and insight, and safeguards against constantly changing web properties.

For data collection, analytics, and optimization:

  • An Agile approach provides action before traditional methods provides insight
  • An Agile approach safeguards against the constant variability of the web medium

“If you fail to plan, you are planning to fail!” — Ben Franklin

Learning from Feature Driven Development

The Agile Development concept covers an array of development methodologies and practices, but I would like drill into one especially coherent and efficient method of Agile called Feature Driven Development.

Feature-Driven Development essentially works like this: an overall project is planned as a whole then it is separated into discrete pieces and each of these pieces is designed and developed as a component and added to the whole. This way, instead of having many semi-functional components, the project’s most valuable components are complete and fully functioning.

Phased Implementation (Not Iteration)

Because you might have already heard something about Agile Development, it is important, at this time to dispel the notion that Agile development is defined by iterations upon products. In a sense that is true but mostly it is the complete opposite of the Agile approach. The only iterations that happen are the planning, implementation, and completion of a new feature. This is not the same as adding layers upon existing features (more on this with the Definition of Done). The difference here is planning and the ability to see the project and business objectives as a whole.

Step 1: Develop an Overall Model

You must plan! Planning in an organization can be hard to motivate and difficult to initiate, but these planning steps will actually provide you with better, more actionable data sooner than not.

Understand the system. This is digital. There are a lot of moving parts. It is very important to really know how your digital presence affects your physical business and your overall business strategy and vice versa. Additionally, there are likely many components within your business that are (or could be) affected by the data that can be collected. This leads to my next suggestions.

Ask questions and seek multiple perspective. This is time to confront your assumptions about your business, your pain-points, and your data needs. It is important to really know the processes and decisions that are taking place and how they are (or are not) or could be affected by data. Communicating with those who interact with and make decisions on the data at any level will be extremely insightful.

Be strategic. Look at the big picture of the future, define your goal and work backwards. Agility does not come by luck but rather by being aware of and prepared for all foreseeable possibilities. Consider how things will change and what parts of your digital presence are shifting. How will redesigns, platform changes, and code freezes affect your strategy? This is generally the best way to face an analytics problem so this step applies very well to analytics. Agile was created to solve the problems of being short-sighted and reactive.

Step 2: Define the Parts of Your Plan

This is where the fun starts. There are multiple ways an analytics strategy can be divided and further subdivided into parts. When considering how to divide the project into parts, the goal should be to get to define parts at their most discrete, independent or atomic level. This will be helpful in prioritizing the parts into steps. Ultimately,  these parts can be regrouped based on similarity and development implementation process.

By Web Property and Section

An organization’s web presence is often not limited to a single site or app. There may be different properties or sections of web properties with different intents. Inevitably, some of these properties or sections will have a bigger impact on your organization’s goals and thus would be prioritized differently.

By Data Scope (User, Page/Screen, Event)

Each web properties has layers of data that can be gathered from it. Data about the user in the app or website’s database, information about the content of the page, and information about how the user interacts with the app or website can all be thought of discretely. These differ in terms of intelligence and the actual development work that is required to collect the data.

By Data Use

Another way to divide up the data-collection needs is by end use. For instance, you may be an ecommerce store that has different people or teams who are merchandising, planning and creating content, managing email, social campaigns, or paid media campaigns and/or optimizing the application and user experience. The data needs for each initiative will often overlap with other initiatives but sometimes data needs will be very different from others. These different data needs can be thought of as different parts of your strategy.

By Data Depth

Think 80/20 rule in terms of granularity. Some data is instantly useful. For instance, you may not be tracking clicks on your main call-to-actions or “Buy” buttons. These clicks are likely key micro-conversions and having this interaction insight can literally change your strategy overnight. Another layer of depth would be knowing what product was added to the cart as part of that event. A further layer would be configuring Google Analytics’ Enhanced Ecommerce to understand how users interact with products from the product page to the checkout. Each of these examples provide varying depths of data but also require varying amounts of development time.

Other features like Google Adwords Dynamic Remarketing and Google Analytics Content Groupings can be thought of similarly as they need more information to associate with the user or page.

Step 3: Prioritize

This is the most important step. This is where the unique value of the Agile approach really shines. This can drastically lower the cost and shorten the time to data-driven action. All the planning and foresight that took place before can be leveraged to make the right decisions for the most success.

Consider Goals

Duh. The whole reason you are gathering data is to be data-driven. The parts of your plan that most directly affect your top-line goals should be at the top of the list. Think about every time you have said or heard “If we only knew abc we could achieve xyz.” Now depending on the value of xyz, prioritize data collection.

Consider Time

This is what Agile is all about! With goal impact in mind communicate with relevant parties and your development team or partners to understand how long it will take to implement the code to start gathering data. Sometimes the value of data will scale to the development time, other times it may be as simple as using a Google Tag Manager click listener on calls-to-actions to send events to Google Analytics within a few minutes. Overall, its good to have some data to orient your decisions right away so go for the quick wins first and work with that as code is being implemented to get the real data gold.

Consider Cost

Unfortunately, bottom lines still exist and often development resource cost will have to be justified in implementing code to gather data. Some data collection might be cost prohibitive but it is possible that by gathering data that is easier to gather, such as standard Ecommerce implementation will give you the rationalization to get more in depth data down the road. Overall, get the valuable data that comes cheap, squeeze the life out of it until you need more depth.

Step 4: Implementation Cycle (Plan, Implement, QA)

Now, for the moment we’ve all been waiting for, let the collection begin! This is the step that most people think of when they think of Agile development; sprinting to complete a feature and then releasing it.  For Agile analytics, this works the same way. Now that there is a list of analytics “features” or streams of data that have been prioritized each step should be planned, implemented and tested successively.

Plan

This is a more detailed plan than the overall model. This plan defines how the data will be collected. For example, this is when Google Analytics Event or Custom Dimension naming conventions would be defined and documented. Be explicit. This will really improve the efficiency of the process.

Implement

Buy your development partner beer and pizza and pass your documentation on to them. Keep them happy and maintain a good relationship. There will be more implementations in the future. Hopefully, your documentation is clear but be open and responsive to questions; this is all about speed and accuracy.

Quality Assurance

This should happen in your development environment so that when the code is implemented on the site, the data that is reported is clear and accurate. Be thorough as this implementation should stay this way well into the future. If changes are to be made, be discreet, just as in implementation.

These three steps can happen simultaneously. For example, planning can happen on a future part as implementation and QA is happening on the present part.

Start Optimizing!

Agile is not simple but it’s also not magic. Speeding up the time to data-driven action is made possible by the planning that happens up front. Being proactive is not only a practice of Agile but also general best practice in analytics. It is the planning that makes agile efficiency possible. It may seem difficult, but putting in the effort to plan will put you in a position to act proactively agilely into the future.  Happy optimizing!