Category archive: EN – Implementations – Data QA

How to make sure your whole site is tracked, and only once


It's very important that we track our site properly. However, it may happen that:
a) we are not tracking part of it
b) we are tracking it twice.

Tracking, yes, but just once, please. Let's see in this post how to detect a possible issue and how to fix it.

This issue can easily happen when we migrate from hardcoded tracking to a tag management solution.

Let's see it with an example: a migration from hardcoded Google Analytics (GA) to GA via Google Tag Manager (GTM). After the migration, we need to make sure that:

  • The tracking code is included in every page
  • There are no duplicated codes (GA & GTM at the same time)
    Otherwise we will be counting everything twice if both codes are in place for the same page loads or user interactions.

How can we make sure we do it correctly?

We can use tools like Screaming Frog, which is widely used for SEO purposes. Digital analysts can use it too, and it is actually a good idea to run it when we perform an analytics audit or take on a new client or project.

Just download the tool from their website. The price is 149 pounds per year (less than 200 euros), which is fairly cheap for the value it offers. Just consider that the cost of a faulty analytics implementation can easily be higher than that.

There’s also a free version, but it can only check 500 URLs. Depending on the size of your site, that may be enough for you.

How to start using Screaming Frog.

If you have got the paid version, once you open the tool, click on “Licence” and enter the keys. And then:

1) Customizing the configuration

This is recommended for two reasons:

  • Avoid stuff we don’t need
    It’s always distracting and makes the check slower. For example, images or CSS.
  • Subdomains are not included by default
    If your site has subdomains, then we should select them.

2) Using the filters to see whether the GA & GTM codes are included or not

We can filter pages «containing» and «not containing» a given code. We should check both options, for both GA and GTM.

The ideal result of this is:

  • Contains GA: no results
    Hardcoded GA has been removed from all pages
  • Does not contain GA: all pages
    Hardcoded GA has been removed from all pages
  • Contains GTM: all pages
    GTM is in the whole site
  • Does not contain GTM: no results
    GTM is in the whole site

The most important filter is «Does not contain GTM» (to detect immediately whether some pages don't have GTM). But, as explained before, we also need to make sure that pages don't contain GA & GTM at the same time. «Does not contain GA» is not really necessary.
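The four filters boil down to two yes/no questions per page. As a minimal sketch of that logic in Python, assuming the two marker strings below are what the hardcoded GA snippet and the GTM snippet contain in your page source (adjust them to whatever text you put in your filters):

```python
# Assumed markers: strings that typically appear in the page source.
GA_MARKER = "google-analytics.com/analytics.js"  # hardcoded GA snippet
GTM_MARKER = "googletagmanager.com/gtm.js"       # GTM container snippet


def tracking_status(html: str) -> str:
    """Classify a page by which tracking snippets its source contains."""
    has_ga = GA_MARKER in html
    has_gtm = GTM_MARKER in html
    if has_ga and has_gtm:
        return "duplicated"    # both codes: everything is counted twice
    if has_gtm:
        return "ok"            # migrated: GTM only
    if has_ga:
        return "not migrated"  # still on hardcoded GA, GTM is missing
    return "untracked"         # no tracking code at all
```

After a clean migration, every page should come back as "ok".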


Once the filters are ready, we just need to enter the domain name and click on «start».

Wait until the crawl reaches 100% before looking at the results.

Click on “custom” (blue arrow) and select the filter you want to apply (red arrow). Remember, each filter is about containing or not containing a specific code.

Now you can get a list of all the pages in your site (including subdomains if you did include them) matching the condition we are applying in the filter. This list can be exported to Excel.

3) Detecting tracking issues and fixing them.

  • Pages without GTM code

This is easy. We just need to use the filter «not containing» GTM.
If everything was done properly, we will not get any results here.

– Next step if there's something wrong: make sure you include the GTM code in these pages as well. Then check again.

  • Pages having GA & GTM at the same time

You need to select «contain» for both Google Analytics & Google Tag Manager, export each list to Excel, put the two lists together and highlight «duplicated values» to get the pages that have duplication issues.

– Next step if there's something wrong: remove one of them. If we are migrating from hardcoded to GTM, then the hardcoded code is the one that should be removed. Then check again.
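If you prefer a script to the spreadsheet step, the duplicated pages are simply the intersection of the two exported lists. A minimal Python sketch, assuming each export is a CSV with the URL in a column called "Address" (check the header in your own export; the sample data is made up):

```python
import csv
import io


def read_urls(csv_text: str, column: str = "Address") -> set[str]:
    """Collect the URLs found in one exported list."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row[column].strip() for row in reader if row.get(column)}


def duplicated_pages(contains_ga_csv: str, contains_gtm_csv: str) -> list[str]:
    """Pages that appear in both the 'contains GA' and 'contains GTM' lists."""
    return sorted(read_urls(contains_ga_csv) & read_urls(contains_gtm_csv))


# Made-up sample exports, inlined as text to keep the example self-contained.
ga_export = "Address\nhttps://www.example.com/\nhttps://www.example.com/old/\n"
gtm_export = "Address\nhttps://www.example.com/\nhttps://www.example.com/new/\n"
both = duplicated_pages(ga_export, gtm_export)
```

With real files you would read their contents into the two strings first. Here `both` holds only the home page, the one URL present in both lists.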

And you? How do you make sure your site is well tracked and every page is included just once?

Any idea? Any comment? Any complaint? Leave your comment and I will get back to you. You can also contact me via email or through my LinkedIn and Twitter profiles.

First steps with the tagging solution Signal (formerly BrightTag)

There are three basic concepts that you need to understand to start working with Signal: data elements, inputs and outputs.

  • The Data Elements are the variables we want to collect
    The set of Data Elements is the Data Dictionary
  • The Inputs are what we want to know about the user and their visit in order to answer our business questions (in plain English)
    They can be based on page loads or user interactions, and they can come from a web browser or an app.
  • The Outputs are the tags we use to send data to digital analytics tools, like Google or Adobe Analytics

Let’s see how they work 🙂

1) Data Elements

The data elements are the variables: the data we want to get from our website or app and send to our analytics tool. They are simply what other tools call “variables”.

Defining them and creating the tagging map is what Signal calls Data Binding.
That's our way of telling Signal: in this input, you have to collect these data elements (variables).

For example, for every page load (that is an input) we want to collect (among other stuff):
– The name of the page.
– The page primary category.
– The page subcategory.

So, in the Data Dictionary, we have to create a Data Element for all the variables.
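To make the relationship between Data Elements, the Data Dictionary and Data Binding more concrete, here is a small Python model of the idea. This is only an illustration: the class names, element names and the `collect` helper are hypothetical and are not Signal's API.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DataElement:
    """One variable we want to collect (hypothetical model)."""
    name: str


@dataclass
class Input:
    """A page load or interaction, with the data elements bound to it."""
    name: str
    elements: list[DataElement] = field(default_factory=list)


# The Data Dictionary: the full set of data elements.
PAGE_NAME = DataElement("page_name")
PRIMARY_CATEGORY = DataElement("page_primary_category")
SUB_CATEGORY = DataElement("page_sub_category")
DATA_DICTIONARY = {e.name: e for e in (PAGE_NAME, PRIMARY_CATEGORY, SUB_CATEGORY)}

# Data binding: "in this input, collect these data elements".
page_load = Input("page load", [PAGE_NAME, PRIMARY_CATEGORY, SUB_CATEGORY])


def collect(inp: Input, page_data: dict) -> dict:
    """Keep only the values of the elements bound to this input."""
    return {e.name: page_data.get(e.name) for e in inp.elements}
```

Anything present on the page but not bound to the input is ignored, which is why each variable needs its own Data Element first.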

2) Inputs

The inputs are what we want to know about the navigation and behaviour of our users: the pages that are being loaded or the key interactions during the visit.

For example:
– I want to know that the product pages are being loaded
> I create an input “product pages”

– I want to know that as the page loads, we collect the product name, category and subcategory
> I create a Data Element for each (previous example) and include them within the input «product pages»

We should create an input for each type of page load or user interaction.
In other words, we need one input for product pages, another for the home page, another for the checkout page and so on. The reason is that some data elements are common to every page type (e.g. page name) while others apply only to a specific type, or to some of them but not all (e.g. product name is something we want to collect on the product, category and checkout pages, but not on the homepage).

To track user interactions, we follow the same process as for page loads. If we want to know whether, for example, users sign in or sign up (and both have two steps: first click + actual success), we would create an input for each of the interactions.
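One way to picture "one input per page type or interaction" is as a simple mapping from each input to the data elements bound to it. The Python sketch below is purely illustrative; all input and element names are made up.

```python
# Hypothetical element names; "page_name" is common to every page-load input.
COMMON = {"page_name"}

INPUTS = {
    "home page": COMMON,
    "category pages": COMMON | {"product_name"},
    "product pages": COMMON | {"product_name", "product_category", "product_subcategory"},
    "checkout page": COMMON | {"product_name"},
    # One input per interaction step (first click + actual success).
    "sign up click": set(),
    "sign up success": set(),
}


def inputs_collecting(element: str) -> list[str]:
    """Names of the inputs that have this data element bound to them."""
    return sorted(name for name, bound in INPUTS.items() if element in bound)
```

Asking for `"product_name"` returns the category, checkout and product inputs but not the home page, which is exactly the distinction described above.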


3) Outputs

Keeping in mind what has been explained so far, Signal already knows what actions we want to collect (product pages load, sign up etc.) and what we want to know about it (the name of the product, the category etc.)

The third step is actually sending the data to the analytics tool we use. To do so, we need to create an output for each input, specifying the vendor (Adobe Analytics in this example).



Last idea

There are two steps:
– Signal getting data from the browser or app (data elements & inputs)
– Signal sending the data to the analytics tool (outputs)

Either step can be the reason why data is missing in the analytics tool, so we need to keep this in mind when we detect data collection issues during the QA process.
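As a rough illustration of that two-step reasoning, the Python helper below takes what was collected (step one) and what was sent (step two) and says where to look first. It is a hypothetical sketch, not part of Signal, and it assumes both dictionaries use the same variable names.

```python
from typing import Optional


def diagnose(collected: Optional[dict], sent: Optional[dict]) -> str:
    """Say which of the two steps to look at first when data is missing."""
    if not collected:
        return "step 1: nothing was collected from the browser or app"
    if not sent:
        return "step 2: data was collected but nothing was sent"
    missing = sorted(name for name in collected if name not in sent)
    if missing:
        return "step 2: collected but not sent: " + ", ".join(missing)
    return "ok"
```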

Also, we need the same inputs and outputs for each data element. I will explain these three concepts and other functionalities of Signal in more detail in another post.



Data QA with Charles Debugging Proxy (basic level)

Digital Analytics is intended to transform data into actionable insights. Sure, you have heard it a million times. However, if the data collection has not been audited beforehand (so we can validate it), we may be making decisions based on data that is just wrong.

This first and necessary step, which is forgotten in many cases, is one of my geeky obsessions and a priority in every analytics project. It’s something we should do to make sure we are properly collecting the data we want. It can also be used to see what other companies track, e.g. the side filters the user may apply when looking for some content.

1 – Tag Assistant: check easily and quickly if there’s something wrong in your Google Analytics implementation

You just need to install the Google Tag Assistant plugin for Chrome (developed by Google itself) and click on the blue label that appears in the right corner of the browser. This label will be red if any error is detected, green if everything is OK and blue if there are things that can be improved.

Let’s see an example:


Oh, it seems that the GA code has been placed outside the <head> tag. This means that the “tracking beacon” may not be sent before the user leaves the page, which has a logical, negative impact on the quality of the data we are collecting. To fix it, you just need to click on ‘more info’ and follow the recommendations from Google Analytics Support. In this case, the solution would be simply to paste the GA code inside the <head> tag.
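The same check can be approximated without the plugin. The Python sketch below uses the standard library HTML parser to report whether a script mentioning Google Analytics sits inside or outside <head>. It assumes the page has explicit <head> tags and that the snippet mentions "google-analytics.com"; it is a rough illustration, not how Tag Assistant itself works.

```python
from html.parser import HTMLParser

MARKER = "google-analytics.com"  # assumed to appear in the GA snippet


class GASnippetLocator(HTMLParser):
    """Record whether each script mentioning GA is inside or outside <head>."""

    def __init__(self):
        super().__init__()
        self.in_head = False
        self.in_script = False
        self.locations = []  # "head" or "outside head", one entry per GA script

    def handle_starttag(self, tag, attrs):
        if tag == "head":
            self.in_head = True
        elif tag == "script":
            self.in_script = True
            # External script: look at the src attribute.
            if MARKER in (dict(attrs).get("src") or ""):
                self._record()

    def handle_endtag(self, tag):
        if tag == "head":
            self.in_head = False
        elif tag == "script":
            self.in_script = False

    def handle_data(self, data):
        # Inline script: look at the script body.
        if self.in_script and MARKER in data:
            self._record()

    def _record(self):
        self.locations.append("head" if self.in_head else "outside head")


def ga_snippet_locations(html: str) -> list[str]:
    locator = GASnippetLocator()
    locator.feed(html)
    locator.close()
    return locator.locations
```

Any "outside head" entry corresponds to the situation flagged in the example above.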

Needless to say, the UA code shown by Tag Assistant and the one in your GA property should be the same.
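That comparison is easy to script as well. A minimal Python sketch, assuming the property ID follows the usual UA-digits-digits pattern and appears as plain text in the page source:

```python
import re

UA_PATTERN = re.compile(r"\bUA-\d+-\d+\b")


def property_ids(html: str) -> list[str]:
    """All distinct UA codes found in the page source."""
    return sorted(set(UA_PATTERN.findall(html)))


def matches_property(html: str, expected: str) -> bool:
    """True when the page references exactly the expected property."""
    return property_ids(html) == [expected]
```

A page that returns more than one ID is not necessarily wrong, since some sites send data to several properties on purpose, but it is worth confirming.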

Just by taking this little step, we would be doing something basic -but important- to understand what’s wrong with our data collection and get a first idea of what we need to do to fix it.

2 – Debugging tools to check what we are actually sending to Google Analytics

My favorite tools for performing a basic analysis are Fiddler and Charles. I will use Charles in this example. You just need to download it (a free trial is available), open the browser and visit the website you want to audit.

Then, in Charles you need to look in “” in the menu on the left (structure or sequence), and click on “request” on the right.

Let’s see a basic example with a downloads marketplace called Fileplaza


What can we see in the screenshots above?

  • What is being collected?
    A pageview
  • What page is being collected? (I have just visited the homepage)
    The URI (URL without domain) being collected in Google Analytics will be: /
  • What’s the title of the page we are collecting?
    Free Software Download and Tech News – File Plaza
  • What’s the referrer that is sending the visit? (I am in the UK and have searched for “” on Google)
  • What’s the Google Analytics UA code of the property in which FilePlaza is collecting its data?
    Actually, there are two UA-48223160-1 & UA-23547102-20 (a different UA code in each screenshot)
  • Why are there two screenshots showing the same data but with different UA codes?
    Because they are sending traffic to two GA properties.
  • Are they using Google Tag Manager?
    No, otherwise we would see GTM-XXXXXX
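Everything in the list above is read from the query string of the request that the proxy shows. As a hedged illustration, this Python sketch decodes such a request using the Universal Analytics parameter names (t, dp, dt, dr, tid). The sample hit is invented and shortened for the example, not copied from the screenshots.

```python
from urllib.parse import parse_qs, urlsplit

# Parameter name -> what it answers in the checklist.
FIELDS = {
    "t": "hit type",
    "dp": "page path",
    "dt": "page title",
    "dr": "referrer",
    "tid": "property ID",
}


def describe_hit(url: str) -> dict:
    """Translate the query string of a GA hit into readable labels."""
    params = parse_qs(urlsplit(url).query)
    return {label: params[key][0] for key, label in FIELDS.items() if key in params}


# Invented sample hit.
hit = (
    "https://www.google-analytics.com/collect?v=1&t=pageview&dp=%2F"
    "&dt=Free%20Software%20Download&tid=UA-48223160-1"
)
summary = describe_hit(hit)
```

Parameters that are absent from the request, such as the referrer in this sample, are simply left out of the summary.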

Now we know what is being sent. But is it correct?

This is quite simple to answer. What we see should be the same as what we want to collect with GA.
Otherwise, there’s something wrong. For example, if I perform a specific interaction that is supposed to be tracked with an event, but Charles doesn’t show that event, it means that the interaction is not sending the event. And that’s something we should fix.
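This "expected versus observed" comparison can be written down as a tiny check. The Python sketch below assumes each hit has already been decoded into a dictionary of parameters; the expected hits are whatever your tagging plan says should fire.

```python
def missing_hits(expected: list[dict], observed: list[dict]) -> list[dict]:
    """Expected hits for which no observed hit carries all the same values."""

    def seen(exp: dict) -> bool:
        return any(all(hit.get(k) == v for k, v in exp.items()) for hit in observed)

    return [exp for exp in expected if not seen(exp)]
```

For example, if the plan expects a pageview and a download event but the proxy only shows the pageview, the function returns the event as missing. An empty list means everything expected was observed.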

Let’s navigate a bit more …

1) I visit a product page: /home_education/teaching_tools/kid_pix_deluxe/


In the screenshots above, I see:

  • A page view is being collected
  • The page being collected has changed
    The URI in Google Analytics will now be /home_education/teaching_tools/kid_pix_deluxe/
  • The data is being sent again to the two properties.

2) I click on the ‘download’ button, which leads me to a download page: windows/home___education/teaching_tools/download/kid_pix_deluxe/

3) I click to download the software:

Oh! Now there’s something new… It looks like my click to download has triggered an event (first red arrow) that is sending the data below to the two GA properties.

1- Event category (ec): descarga
2- Event action (ea): final -I assume it is the button type, since it’s the ‘final’ button to download-
3- Event label (el): Kid Pix Deluxe -that’s the name of the software I have just downloaded-

As Fileplaza is a marketplace of downloads, the download buttons are key. What Charles shows is that they use events to measure the success of this specific user interaction.
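To confirm that the same event reaches both properties, the captured requests can be grouped by property ID. A minimal Python sketch, assuming you have copied the query strings of the hits out of the proxy; the sample strings are reconstructed from the values described above, not copied from the tool.

```python
from urllib.parse import parse_qs


def events_by_property(hit_queries: list[str]) -> dict:
    """Group (category, action, label) of event hits by GA property ID."""
    result = {}
    for query in hit_queries:
        params = {key: values[0] for key, values in parse_qs(query).items()}
        if params.get("t") == "event":
            event = (params.get("ec"), params.get("ea"), params.get("el"))
            result.setdefault(params.get("tid", "unknown"), []).append(event)
    return result


# Reconstructed sample hits: two events and one pageview.
queries = [
    "v=1&t=event&tid=UA-48223160-1&ec=descarga&ea=final&el=Kid%20Pix%20Deluxe",
    "v=1&t=event&tid=UA-23547102-20&ec=descarga&ea=final&el=Kid%20Pix%20Deluxe",
    "v=1&t=pageview&tid=UA-48223160-1&dp=%2F",
]
events = events_by_property(queries)
```

If one property is missing from the result, or its event values differ, the two implementations have drifted apart.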

Let’s summarise what Charles is showing:

  1. Fileplaza measures the success of the downloads using events.
    Do they do it correctly? Yes
  2. The events contain relevant data about the interaction: action being performed by the user (download), button being clicked and name of the software
    Do they do it correctly? Yes
  3. For a reason, FilePlaza wants to collect the data in two different properties of GA
    Do they do it correctly? Yes

I like Charles, but which tool you use is the least important thing here. I am quite tool agnostic and there are others that are also very good, like Wasp (recommended for a data layer), Google Analytics Debugger or Data Slayer (recommended for ecommerce). Normally, I use the console, in which you can install plugins for most tools, like Google Analytics or Adobe Analytics.
