Why the Numbers in Your BI Tool Don’t Always Match Your Source Data

By Jack Zagorski | 8 min read

When creating reports, there is a high chance that you will run into the following problem: the numbers you see in your BI tool (e.g. Google Data Studio) don’t match their counterparts in the dashboard of the cloud service from which they came (e.g. Facebook Ads).

This problem usually concerns:

  1. Unique metrics
  2. Total calculated metrics (rates and averages)

In this article, we will explain why the numbers don’t always add up and how you can reconcile the discrepancies. The formulas and techniques described below are applicable to any combination of cloud service and BI tool, but we will be using Facebook Ads and Google Data Studio as examples.


Want more Data Studio tips & tricks?

  1. Learn how to work with multiple date ranges.
  2. Get 3 free dashboard templates for beginners.
  3. Find out how to track your Instagram follower count.

 

Unique Metrics

As opposed to total metrics, which count the number of times a certain action or event took place, unique metrics usually measure the number of people who took an action or witnessed an event. Popular unique metrics include reach, ad frequency, unique page views, unique users, and sessions per unique user.

The main reason this problem occurs with unique metrics is that they simply cannot be summed

To demonstrate why, we will use one of the most common unique metrics: reach.

Let’s say that you are running an ad on Facebook and, for the sake of simplicity, that only 1 person has viewed this ad 1 time per day for the last 7 days. In the morning, you decide to log into Facebook Ads Manager and check the reach of the ad for the last 7 days, and you see that it was 1. This would be a sad, yet accurate insight.

Then, you log into your BI tool to view the same data, but you see that the reach for the last 7 days is 7. If you are like most users of Dataddo, you have your ETL tool pull campaign data from Facebook Ads and send it to your BI tool daily.

Aha! This is the source of the problem. Because the same user viewed your ad every day, your ETL tool sent a daily reach of 1 to your BI tool, which—when summed over 7 days—yielded a reach of 7 for the last 7 days. This is of course incorrect because the reach was actually 1. 

To see how this looks in real life, check out screenshots of the data from one of our Facebook Ads campaigns. The first is from our Facebook Ads Manager:

FB Ads (Reach)

Screenshot from Facebook Ads Manager. Even though we can see the reach for each of the last 7 days, the “Results” row at the bottom is set to display the weekly totals. 2,228 is the accurate weekly total for reach.

Now, compare the screenshot above with the following screenshot of the same campaign data, fed to Google Data Studio daily by Dataddo:

GDS (Reach)

Screenshot from Google Data Studio dashboard. The daily reaches match the numbers in the dashboard above, but the total is discrepant. This is because it’s simply the sum of the daily reaches, which we know is incorrect. The correct weekly reach is 2,228.

So, what is the solution to this problem? This is a tough one because it’s impossible to get the reach for a longer interval of time by summing the reaches for shorter intervals of time.

The most reasonable way to get around the problem is to create more flows (i.e. pipelines) in your ETL tool that pull data at all the intervals you need. You could, for example, have one flow that pulls daily so you can stay granular with your insights, another that pulls weekly for your weekly reports, and another that pulls monthly for your monthly reports. Of course, this won’t get rid of discrepant numbers, but at least it will give you the data you need at the levels you need.

Here is a screenshot of the same Facebooks Ads data displayed in Google Data Studio, only it’s fed by a flow that pulls weekly instead of daily:

GDS (weekly reach)

Screenshot from Google Data Studio dashboard. Compare this with our screenshot from Facebooks Ads Manager above—the reach matches!

The same logic applies to any other unique metric.

 

Total Calculated Metrics (Rates and Averages)

Like for unique metrics, the numbers for total calculated rates and averages are also often discrepant. This is because they, too, cannot be aggregated over time intervals or campaigns.

Let’s say you are running two ad campaigns via Facebook Ads. One has generated 1,000 impressions and 100 clicks for a click-through rate (CTR) of 10%, while the other has generated 100 impressions and 50 clicks for a CTR of 50%. Your ETL tool exports all three data sets (impressions, clicks, and CTR) to your BI tool, which, by default, is set to aggregate the numbers in each of the three columns. 

The aggregated sums for impressions and reach would be 1,100 and 150, respectively, which would be correct. But your CTR would display as 60%, which is obviously incorrect.

For a real-life example, see below this screenshot from one of our reports in Google Data Studio. The totals for clicks and impressions are accurate, but the total click-through rate is absolutely not 1.75%!

GDS (Incorrect CTR)

Screenshot from Google Data Studio dashboard. Note: rates and averages are exported as “floats,” i.e. numbers that have decimal places. To see them properly as percentages, you will need to create another column that divides the float number by 100, then modify the field type so it displays as a percentage. Otherwise, 0.18%, for example, will wrongly display as 18%.

One common, yet unsuitable solution is to simply average the CTRs. In our ad campaign example, where we have one CTR of 10% and another CTR of 50%, averaging would yield a total CTR of 30%. But this is also incorrect! We can’t just calculate the average, we have to calculate the weighted average.

How do we calculate weighted average? To illustrate, let’s stick with CTR, whose basic formula is:

clicks / impressions = CTR

To get the weighted average for total CTR, we have to use the following formula:

(clicks #1 + click #2) / (impressions #1 + impressions #2) = total CTR for both campaigns

So, in our ad campaign example, the result would be:

(100 + 50) / (1,000 + 100) = total CTR of 13.6% for both campaigns

This math of course explains the discrepancy, but it won’t eliminate it from the dashboard in your BI tool. To eliminate it, we recommend exporting only the data that you need to calculate rates and averages, but not the rate and average data itself, and then automating the calculations.

So, in case of CTR, you would export all the click and impression data, but not the CTR data. Then, in your BI tool, you would create a calculated field for CTR using the following formula:

sum (clicks) / sum (impressions)

This will automatically display the CTR for individual time intervals or campaigns, as well as the total CTR. Like this:

GDS (Correct CTR)

Screenshot from Google Data Studio dashboard.

And voilà! The data displayed in your BI tool should now match your source data. Here is the screenshot from our Facebook Ads Manager again:

FB Ads (CTR)

Screenshot from Facebook Ads Manager.

There are dozens of other rates and averages worth tracking, all of which are calculated using similar formulas. See:

 

An Understanding of Discrepancies + Proper Formulas = Accurate Insights

99% of the time, when the data displayed in a dashboarding app doesn’t match the equivalent data displayed in the dashboard of a tool or app, it’s due either to improper calculations or summing of metrics that should not be summed.

If you are having a discrepancy that cannot be reconciled by any of the above techniques, reach out to our solutions team. We’ll be happy to help!

 

See how Dataddo can help your business

Just a few quick steps to get your data to your dashboard for better analysis, without the hassle.

Start 14-day free trial


Category: Industry Insights

Comments