How to deal with attribution windows on large Digital Media data sets
One of the challenges of working with digital media data is "attribution windows". This is the practice by publishers of restating their numbers within a given window of time, e.g., DCM's window is 14 days, YouTube's 35, and Facebook's 28.
If you use the connector to append only the last day's data to the existing data set (history), your data will likely be incorrect, because earlier days inside the window are later restated. The solution is to get the last N days of data (where N is the window length) and replace those days in the existing data set. This can be done in Magic, but the flow is complex and, given the large size of the data sets, the run time may be prohibitively long. For example, at one client, after just 9 months the DCM data set was 30m rows, and processing the attribution window with Magic took 10 hours.
Fortunately there is a simple SQL solution. In Redshift (or MySQL if your data sets are smaller), input both the history and the window data sets, then use this SQL:
SELECT * FROM history
WHERE Date < (SELECT MIN(Date) FROM window)
UNION ALL
SELECT * FROM window
Then write the result out to the history data set.
We found that for a 30m row history data set this takes about 90 minutes in Redshift, with most of that time spent reading and writing data.
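To make the keep-then-append logic concrete, here is a minimal, self-contained sketch of the same query run against an in-memory SQLite database. The table and column names (`history`, `restated`, `impressions`) are illustrative stand-ins for your actual Redshift tables, and `restated` plays the role of the window data set; the WHERE/UNION ALL pattern is the same.

```python
import sqlite3

# In-memory database standing in for Redshift; table names are hypothetical.
con = sqlite3.connect(":memory:")
cur = con.cursor()

cur.execute("CREATE TABLE history (date TEXT, impressions INTEGER)")
cur.execute("CREATE TABLE restated (date TEXT, impressions INTEGER)")

# History holds stale figures for the days still inside the window.
cur.executemany("INSERT INTO history VALUES (?, ?)", [
    ("2024-01-01", 100),
    ("2024-01-02", 90),   # stale: publisher later restated this day
    ("2024-01-03", 80),   # stale
])

# The connector pulls the last N days (here N = 2 plus the new day)
# with the restated numbers.
cur.executemany("INSERT INTO restated VALUES (?, ?)", [
    ("2024-01-02", 95),
    ("2024-01-03", 120),
    ("2024-01-04", 70),
])

# Keep only history rows older than the window, then append the window rows.
rows = cur.execute("""
    SELECT * FROM history
    WHERE date < (SELECT MIN(date) FROM restated)
    UNION ALL
    SELECT * FROM restated
    ORDER BY date
""").fetchall()

for row in rows:
    print(row)
```

The dates 2024-01-02 and 2024-01-03 come out with the restated figures (95 and 120) rather than the stale ones, and 2024-01-04 is appended as a new day, which is exactly the replacement the attribution window requires.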