Effective Ways to Process Data within the dataLayer

This article is aimed at technical users of Google Tag Manager.

In the world of web analytics and digital marketing, the dataLayer is a key concept, especially when working with Google Tag Manager (GTM). It's an array that serves as a central repository for storing and managing the data you want to track on your website. However, when you want to read data from it, you’ll quickly run into problems caused by its dynamic nature, as the position of specific data points may not be consistent. In this article, we'll explore strategies to overcome these challenges, so that you can handle the dataLayer in the way that best fits your use case.


Understanding the dataLayer Dynamics

Before diving into the solutions, it's important to understand that the dataLayer is an array in which each element is an object containing different pieces of information. Since it's an array, the order of these objects depends on when they were pushed into it. What you can retrieve from the dataLayer depends heavily on the website. For example, user data such as the login status or the page type might be useful to retrieve. As a marketing consultancy that is not directly involved in customer processes, we cannot always be aware of what the customer’s tech team is working on or changing. This makes it necessary for us to work in a way that adapts to what data the customer provides and how they provide it.
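To make the point about ordering concrete, here is a minimal sketch. The keys `pageType` and `loginStatus` and the pushed values are assumptions chosen for illustration, and a local array stands in for `window.dataLayer` so the snippet also runs outside the browser.

```javascript
// A local stand-in for window.dataLayer (assumption for illustration).
const dataLayer = [];

// Entries land in the array in the order they are pushed.
dataLayer.push({ event: "gtm.js" });
dataLayer.push({ pageType: "product", loginStatus: "logged-in" }); // hypothetical keys
dataLayer.push({ event: "gtm.dom" });

// The position of the user data depends on push order, so we search for it
// by its properties instead of relying on a fixed index.
const userEntryIndex = dataLayer.findIndex(
  (entry) => entry !== null && typeof entry === "object" && "loginStatus" in entry
);

console.log(userEntryIndex);
```

If the site pushed the user data before or after other entries, the index would differ, which is why the later examples search by property rather than by position.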

Case 1: Retrieving Nested and Variable Position Data

In the first case we want to show you how to deal with inconsistent positions of data within the dataLayer. We recently had a case where the data of a page type and the login status of a user were always in the first position of the dataLayer. This made it easy for us to use this data as it was reliable due to this structure having been consistent for years. We could just wait for the dataLayer to be available and use the first dataLayer index. We used this approach in various experiments to work with the data.

This became a problem when the customer changed this structure one day. The data was still stored within the first dataLayer element, but it was now nested inside an array. The location therefore changed from dataLayer[0] to dataLayer[0][0], which broke our approach.
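The effect of that change can be sketched as follows. The property names and values are assumptions, and `readPageType` is a hypothetical helper that mimics reading from a fixed position.

```javascript
// Structure before the change: the data sits directly in the first entry.
const before = [{ pageType: "home", loginStatus: "guest" }];

// Structure after the change: the same data, wrapped in an extra array.
const after = [[{ pageType: "home", loginStatus: "guest" }]];

// Hypothetical helper that relies on the fixed position dataLayer[0].
function readPageType(layer) {
  return layer[0].pageType;
}

console.log(readPageType(before)); // "home"
console.log(readPageType(after)); // undefined, because layer[0] is now an array
```

Note that the fixed-position read does not throw. It silently returns `undefined`, which is part of why such a change is easy to miss.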

Even if we had looped over the array each time, we wouldn’t have been able to catch this change immediately. The hot fix we applied didn’t work for long, as the structure changed again a few days later, and the data was now located at an unpredictable index. Consequently, we decided to build a more robust approach. The requirements were that it should serve all the places where we used the dataLayer information, and that it should be easy to update so that we could react quickly to changes in the structure of the data.

The adapted script checks the dataLayer for the necessary data. It uses an interval to poll until the dataLayer is available and then keeps checking it for the latest changes. It works as follows:

  1. The interval delay is set relatively low as we know that the data will be populated quickly. If this affected site performance, we could easily increase it.
  2. The code inside the interval first checks whether the dataLayer is available, using the early return pattern.
  3. We then have a variable which indicates whether the current entry meets our conditions.
  4. In the for loop we can define the different special cases that each dataLayer can have. We have added our special case, in which an entry can itself be an array, but only for the first position to save resources.
  5. For each entry we then check with the “checkDataLayerReady” function if the necessary properties are available.
  6. If that’s the case, we send a CustomEvent including the data so that the listener can use it directly. Additionally, we store the data in a global variable so that it is also available to code that runs after the event has fired.
  7. Note that before we fire the event and store the data we can also canonicalise it. If the data structure changes completely, we can canonicalise it here so that the code that processes the data later does not have to be changed.
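The steps above can be sketched roughly as follows. This is a reconstruction under assumptions, not the original script: only the function name `checkDataLayerReady` comes from the text, while the required keys, the delay, the event name `dataLayerInfoReady`, the global `__dataLayerInfo` and the helpers `findReadyEntry`, `canonicalise` and `pollDataLayer` are hypothetical. The sketch also assumes that polling can stop once the data has been found.

```javascript
const REQUIRED_KEYS = ["pageType", "loginStatus"]; // assumed property names
const POLL_DELAY_MS = 50; // assumed "relatively low" delay (step 1)

// Step 5: does this entry carry all properties we need?
function checkDataLayerReady(entry) {
  return (
    entry !== null &&
    typeof entry === "object" &&
    !Array.isArray(entry) &&
    REQUIRED_KEYS.every((key) => key in entry)
  );
}

// Steps 3 and 4: loop over all entries and handle the special case where
// the first position holds an array instead of an object.
function findReadyEntry(layer) {
  for (let i = 0; i < layer.length; i++) {
    const candidates =
      i === 0 && Array.isArray(layer[i]) ? layer[i] : [layer[i]];
    for (const candidate of candidates) {
      const isReady = checkDataLayerReady(candidate);
      if (isReady) return candidate;
    }
  }
  return null;
}

// Step 7: map whatever the site provides onto one stable shape.
function canonicalise(entry) {
  return { pageType: entry.pageType, loginStatus: entry.loginStatus };
}

// Steps 1, 2 and 6: poll, return early, then publish the data.
function pollDataLayer(root) {
  const timer = setInterval(() => {
    if (!Array.isArray(root.dataLayer)) return; // early return (step 2)
    const entry = findReadyEntry(root.dataLayer);
    if (!entry) return;
    clearInterval(timer); // assumption: stop once the data is found
    const detail = canonicalise(entry);
    root.__dataLayerInfo = detail; // hypothetical global for late readers
    root.dispatchEvent(new CustomEvent("dataLayerInfoReady", { detail }));
  }, POLL_DELAY_MS);
  return timer;
}

// Only start polling when running in a browser.
if (typeof window !== "undefined") pollDataLayer(window);
```

Code that needs the data could listen for the `dataLayerInfoReady` event on `window` and read `event.detail`, or fall back to `window.__dataLayerInfo` if it runs after the event has fired. Because all structural knowledge lives in `findReadyEntry` and `canonicalise`, a further change on the customer side would only require touching those two functions.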

Case 2: Optimising for Efficiency

In the previous scenario we checked all elements currently stored in the dataLayer in each interval cycle. If you are sure that already-checked entries will not change between interval cycles, you can instead remember the index of the dataLayer that needs to be checked next. This approach prevents you from looping over elements that you have previously checked, thereby improving performance and reducing resource consumption.

The following code uses concepts similar to the previous example, so those parts are only marked with comments.

  1. In the code, we first create a variable that represents the index of the dataLayer we need to check next, which at the moment is 0 as we haven’t checked anything yet.
  2. We again use an interval to continuously check the newest changes to the dataLayer.
  3. Within each interval we check the new elements in the dataLayer.
  4. We deliberately store the length of the dataLayer as the next index to check. If the dataLayer hasn’t changed, there is no value at that index yet. Since the previous interval already checked everything up to the last index of the array, the current interval only needs to start at the next one. If there is no new data, we simply do not enter the loop.
  5. As a safety mechanism, we also added a timeout that stops the interval after 5 seconds, since we know that the desired data should be available before then. If that does not hold for your use case, you can remove the timeout; in that case, it might make sense to increase the interval delay to save resources.
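A minimal sketch of this index-based variant, under the same assumptions as before: the required keys, the delay, the event name `dataLayerInfoReady` and the helpers `scanNewEntries` and `pollNewEntries` are hypothetical, and only the 5-second limit and the name `checkDataLayerReady` come from the text.

```javascript
const REQUIRED_KEYS = ["pageType", "loginStatus"]; // assumed property names
const POLL_DELAY_MS = 50; // assumed delay
const MAX_WAIT_MS = 5000; // stop after 5 seconds (step 5)

// Same readiness check as in the previous example.
function checkDataLayerReady(entry) {
  return (
    entry !== null &&
    typeof entry === "object" &&
    !Array.isArray(entry) &&
    REQUIRED_KEYS.every((key) => key in entry)
  );
}

// Steps 3 and 4: check only entries from startIndex onwards and report
// which index has to be checked next.
function scanNewEntries(layer, startIndex) {
  for (let i = startIndex; i < layer.length; i++) {
    if (checkDataLayerReady(layer[i])) {
      return { match: layer[i], nextIndex: i + 1 };
    }
  }
  // Nothing found: the current length is the first index not yet checked.
  // If nothing is pushed until the next cycle, the loop is not entered.
  return { match: null, nextIndex: layer.length };
}

function pollNewEntries(root) {
  let nextIndex = 0; // step 1: nothing has been checked yet
  const timer = setInterval(() => {
    // Same early return as in the previous example.
    if (!Array.isArray(root.dataLayer)) return;
    const result = scanNewEntries(root.dataLayer, nextIndex);
    nextIndex = result.nextIndex;
    if (!result.match) return;
    clearInterval(timer);
    clearTimeout(stopTimer);
    // Same event dispatch as in the previous example.
    root.dispatchEvent(
      new CustomEvent("dataLayerInfoReady", { detail: result.match })
    );
  }, POLL_DELAY_MS);
  // Step 5: safety net that ends polling if the data never arrives.
  const stopTimer = setTimeout(() => clearInterval(timer), MAX_WAIT_MS);
  return timer;
}

// Only start polling when running in a browser.
if (typeof window !== "undefined") pollNewEntries(window);
```

Keeping the scan in its own function makes the index bookkeeping easy to verify in isolation: calling `scanNewEntries` twice on an unchanged array returns the same `nextIndex` and does no work the second time.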

Conclusion

The GTM dataLayer and the Adobe Analytics dataLayer are powerful tools in your analytics arsenal, but their dynamic nature can pose challenges in data retrieval. By understanding and implementing the strategies outlined above, you can make your data collection both efficient and robust, allowing you to make informed decisions based on accurate data. Whether you're dealing with nested structures or aiming to read each entry only once, these techniques will help you navigate the complexities of dataLayer management. These were solutions to recent problems we faced, but there are many more ways to check the dataLayer, depending on the use case you are trying to address.

Simon Giglhuber

Simon Giglhuber is an Experimentation Developer at Up Reply. Proficient with tools like Optimizely and Dynamic Yield, he's helped in optimising the online interface for our clients. In his blog entries, Simon will focus on JavaScript technicalities and frontend aspects essential for effective A/B testing and personalisation.
