In this article, we will build a full integration of Google Analytics clickstream data directly into Google BigQuery, this time at no cost. Since you’re here, you likely know that clickstream data is generated each time a user performs a tracked action on a website, such as viewing a page or clicking a button. However, the Google Analytics interface typically aggregates this information by factors like location and marketing channel, rather than providing detailed, individual hit-level data.
We’ve spent some time looking for a guide on how to pass Google Analytics clickstream data through to Google BigQuery without any cost. Information on the subject is fragmented, and for good reason. Previously, users who were paying for Google Analytics 360 had a native clickstream data integration with Google BigQuery, but that came with a hefty $150,000 price tag. The cheaper alternative was to manually collect clickstream data with Google Tag Manager, pass it to Google Analytics, and then store it in Google BigQuery. In the end, you’d still be paying Google Cloud storage costs.
What if there was another way?
Turns out, there is. If you’re a new, fledgling business and need to keep your costs low, you can work with a Google BigQuery sandbox environment to house your Google Analytics clickstream data. I’m here to share how you can do that using this step-by-step guide.
The basic premise is that we will…
At a fundamental level, two clickstream parameters must be captured by our custom implementation: the client ID (client_id) and the event timestamp.
⚠️ NOTE: By default, Google Analytics 4 does not timestamp individual events or record their order. Every event in a batch request shares the timestamp of the batch itself, so under the default configuration you cannot tell when a specific event occurred or in what sequence the events in a batch fired. To mitigate this, we need a custom timestamp variable, which is fully explained later in the tutorial.
❗WARNING: Before setting this up, please take into account the cardinality issue that this implementation may cause in Google Analytics 4. Given that we are capturing client IDs and timestamps for every event on a site, you may quickly create dimensions that exceed the daily limit of 500 unique values in Google Analytics 4. I strongly recommend that you keep the collected values for analysis in Google BigQuery instead: creating custom dimensions for client_id and timestamp values will rapidly lead to cardinality issues.
Like we mentioned in the introduction, Google BigQuery already stores client_id values as “user_pseudo_id” values. However, if you’d like to store client_id values as custom dimensions inside your Google Analytics 4 property, you can create a custom capture of the client_id value and pass it to Google Analytics. Here’s how.
In Google Tag Manager, navigate to your active container’s variables tab. From there, create a new user-defined Custom JavaScript variable named Custom JavaScript — client_id Capture.
In the custom JavaScript field, paste the following code:
This JavaScript code defines an anonymous function that extracts and returns the Google Analytics Client ID (_ga) from the browser’s cookies. It uses regular expressions to parse the cookie value and retrieve the Client ID, returning null if the ID is not found.
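As a minimal sketch of what such a function could look like, assuming the standard `_ga` cookie format of `GA1.<n>.<random>.<timestamp>`, the helper below takes a cookie string and returns the last two numeric segments as the Client ID. The name `extractClientId` is hypothetical and used only for illustration; it is written as a named function so it can be tried outside the browser, and the commented wrapper shows the anonymous form that a Google Tag Manager Custom JavaScript variable expects.

```javascript
// Hypothetical helper (the name is an assumption, not from the article).
// Assumes the standard _ga cookie format: GA1.<n>.<random>.<timestamp>
function extractClientId(cookieString) {
  // Match the cookie named exactly "_ga" (not "_ga_<stream id>") and
  // capture the last two numeric segments, which form the Client ID.
  var match = /(?:^|;\s*)_ga=GA\d+\.\d+\.(\d+\.\d+)/.exec(cookieString || "");
  return match ? match[1] : null;
}

// In Google Tag Manager, a Custom JavaScript variable must be a single
// anonymous function, so the same logic would be wrapped like this:
//
// function() {
//   var match = /(?:^|;\s*)_ga=GA\d+\.\d+\.(\d+\.\d+)/.exec(document.cookie);
//   return match ? match[1] : null;
// }
```

Anchoring the pattern on `_ga=` keeps the function from picking up the separate `_ga_<stream id>` session cookies that Google Analytics 4 also sets.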
The client_id capture variable now needs to be passed through to Google Analytics 4 as a custom parameter. At the time of writing, your Google Analytics 4 configuration tag has likely been migrated to a Google Tag. We will pass these values using shared event settings in the Google Tag configuration, via a Google Tag: Event Settings variable.
Go to your Google Tag configuration and navigate to shared event settings.
Click on the Event Settings Variable drop-down and select ‘New Variable…’ to create a Google Tag: Event Settings variable.
Open the drop-down for Google Analytics User Properties, and replicate the following settings:
Save the Google Tag: Event Settings variable.
Your configuration tag should now look something like this:
Let’s preview our Google Tag Manager container to ensure the values are being collected and passed through correctly to Google Analytics 4.
We can confirm that our configuration tag has fired, along with the Event Settings Variable, shared_event_settings.
At the same time, we also receive confirmation from the debug view in Google Analytics 4 that the client ID value has been passed through. It is shown via the orange highlighted box, indicating that the scope of the capture pertains to users.
Now that everything is working as intended, push the Google Tag Manager container live, naming it something like ‘Creation of client_id capture’.
Now that we have the client IDs of our user interactions recorded in Google Analytics 4, we need to determine when those events occurred. As mentioned earlier, Google Analytics 4 lacks true timestamping and sequential indicators. This may be a bug or a missing feature on Google’s end, but at the time of writing, no solution has yet been presented via official channels.
To mitigate this shortcoming, we need to record timestamp values for events on our own. Begin by navigating to Google Tag Manager, and creating a custom JavaScript variable named Custom JavaScript — Timestamp Capture.
In the custom JavaScript code field, paste the following:
This JavaScript code defines an anonymous function that returns the current timestamp.
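A minimal sketch of such a function follows. It assumes the timestamp is returned as milliseconds since the Unix epoch; the original may have returned a formatted string instead. The name `captureTimestamp` and its optional `now` argument are hypothetical, added only so the logic can be checked outside Google Tag Manager.

```javascript
// Hypothetical helper (the name and the optional argument are assumptions).
// Returns milliseconds since the Unix epoch for the given Date,
// or for the current moment when no Date is supplied.
function captureTimestamp(now) {
  var date = now || new Date();
  return date.getTime();
}

// As a Google Tag Manager Custom JavaScript variable, this would be
// written as a single anonymous function:
//
// function() {
//   return new Date().getTime();
// }
```

The value comes from the visitor's device clock and is evaluated each time a tag reads the variable, so it reflects when the event fired in the browser rather than when Google Analytics received it.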
Go ahead and save the custom JavaScript variable.
We now need to pass these values to Google Analytics 4. We need the timestamp value to be captured for ALL events recorded on our site, and there are two ways to do this:
Either solution works — pick the one that gives your tag configuration the most flexibility.
Next, go ahead and save the Google Analytics 4 configuration tag.
We can now preview the output in the Google Tag Manager preview window to confirm that the custom JavaScript timestamp capture is occurring for all events.
At the same time, we also receive confirmation from the debug view in Google Analytics 4 that the timestamp capture value has been passed through. The values shown in the Google Tag Manager preview and in Google Analytics 4 will differ by a few milliseconds as the data is passed from one platform to the other.
Now that everything is working as intended, push the Google Tag Manager container live, naming it something like ‘Creation of timestamp capture’.
We’ve got everything we need from Google Analytics 4 and Google Tag Manager. Time to hop over to Google BigQuery.
The last step is to store the clickstream data in Google BigQuery and have it readily accessible for analysis. The caveat is, we want to do this for free. Luckily, Google Cloud Platform offers a BigQuery sandbox environment that lets us store our data with the following limitations:
Even though the data is only available for 60 days, from a reporting and visualization standpoint, this is more than adequate for a start-up or small business. It allows you to test out whether this level of granularity is suited for your data endeavours, and upgrading to a pay-as-you-go plan is quite simple.
For Country, select your country.
For Terms of Service, select the checkbox if you agree to the terms of service.
Optional: If you are asked about email updates, select the checkbox if you want to receive email updates.
Click Agree and continue.
Click Create project.
On the New Project page, do the following:
For Project name, enter a name for your project.
For Organization, select an organization or select No organization if you are not part of one. Managed accounts, such as those associated with academic institutions, must select an organization.
If you are asked to select a Location, click Browse and select a location for your project.
Click Create. You are redirected back to the BigQuery page in the Google Cloud console.
That’s it: we have now successfully enabled the BigQuery sandbox. A BigQuery sandbox notice is now displayed on the BigQuery page:
Navigate to Google Analytics 4 and select the Admin section of your property. For your desired property, scroll down to the Product Links grouping, and click on ‘BigQuery links’.
Click the ‘Link’ button, and choose the Google BigQuery project we just created in our sandbox environment. Make sure you’re logged into both Google Analytics and Google BigQuery with the same set of credentials.
Click ‘Confirm’, then select a Google Cloud region for your data. Click ‘Next’.
On the Configure Settings step, notice that we can only select Daily exports given that we are in a sandbox environment. This should be more than enough as we are not anticipating thousands of users on our site creating millions of events that necessitate real-time analysis.
Click ‘Next’. Review your settings, and click ‘Submit’. You should receive confirmation that the link has been successfully created.
As per Google’s instructions, once the linkage is complete, data should start flowing to your BigQuery project within 24 hours. With the daily export, one table containing the previous day’s data is exported each day. Note that the Google Analytics 4 export is not retroactive: only data collected after the link is created will appear in BigQuery.
Once the data is loaded into Google BigQuery, you can run a query on the synced Google Analytics 4 data table to preview your implementation:
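As a rough sketch of what such a preview query might look like, the helper below assembles the SQL text for the standard Google Analytics 4 export tables (`events_*`), reading `user_pseudo_id` alongside the custom timestamp from `event_params`. The function name `buildPreviewQuery`, the project and dataset IDs, and the parameter name `custom_timestamp` are all placeholders; substitute the names you actually configured. It also assumes the timestamp was sent as a number, which the export stores under `int_value`; a string value would be read from `string_value` instead.

```javascript
// Hypothetical helper: builds a preview query for the GA4 export tables.
// projectId, datasetId and paramName are placeholders you must supply.
function buildPreviewQuery(projectId, datasetId, paramName) {
  // Reject anything that is not a plain identifier, since the values
  // are concatenated directly into the SQL text.
  var safe = /^[A-Za-z0-9_-]+$/;
  if (!safe.test(projectId) || !safe.test(datasetId) || !safe.test(paramName)) {
    throw new Error("Identifiers may only contain letters, digits, _ and -");
  }
  return [
    "SELECT",
    "  user_pseudo_id,",
    "  event_name,",
    "  event_timestamp,",
    "  (SELECT ep.value.int_value",
    "   FROM UNNEST(event_params) AS ep",
    "   WHERE ep.key = '" + paramName + "') AS custom_timestamp",
    "FROM `" + projectId + "." + datasetId + ".events_*`",
    "ORDER BY user_pseudo_id, custom_timestamp",
    "LIMIT 100"
  ].join("\n");
}

// Example with placeholder IDs; paste the printed SQL into the BigQuery console.
console.log(buildPreviewQuery("my-project", "analytics_123456789", "custom_timestamp"));
```

Ordering by `user_pseudo_id` and then by the custom timestamp lays each visitor's events out in the sequence they fired, which is the clickstream view this guide set out to build.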
That concludes this guide on how to pass Google Analytics clickstream data to BigQuery for free.
If you still have any questions, please feel free to reach out using our contact form and we'll be more than happy to help.
Copyright © Tagmetrix