How to deal with attribution windows on large Digital Media data sets
One of the challenges of working with digital media data is "attribution windows": the practice by publishers of restating their numbers within a given window of time, e.g., DCM's window is 14 days, YouTube's is 35, and Facebook's is 28.
If you use the connector to append only the last day's data to the existing data set (the history), your data will likely be incorrect. The solution is to get the last N days of data (where N is the window length) and replace those days in the existing data set. This can be done in Magic, but the code is complex and, given the large sizes of the data sets, the run time may be prohibitively long. For example, at one client the DCM data set reached 30m rows after just 9 months, and processing the attribution window with Magic took 10 hours.
Fortunately there is a simple SQL solution. In Redshift (or MySQL, if your data sets are smaller), input both the history and the window data sets, then use this SQL:
SELECT * FROM history
WHERE Date < (SELECT MIN(Date) FROM window)
UNION ALL
SELECT * FROM window
then write the result out to the history data set.
We found that for a 30m row history data set this takes about 90 minutes in Redshift, with most of that time spent reading and writing the data.
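To make the logic concrete, here is a small sketch of the same window-replace pattern in Python using SQLite. The table names and columns (`history`, `window`, `Date`, `impressions`) and the sample rows are hypothetical stand-ins; in practice the tables would live in Redshift and hold millions of rows.

```python
import sqlite3

# In-memory stand-in for the warehouse. "window" is quoted because it
# is a reserved keyword in recent SQLite versions.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute('CREATE TABLE history ("Date" TEXT, impressions INTEGER)')
cur.execute('CREATE TABLE "window" ("Date" TEXT, impressions INTEGER)')

# History holds a stale figure for its most recent day, which the
# publisher has since restated.
cur.executemany("INSERT INTO history VALUES (?, ?)", [
    ("2024-01-01", 100),
    ("2024-01-02", 200),
    ("2024-01-03", 250),  # stale value, later restated to 300
])
# The window re-pull carries the restated numbers for the last N days.
cur.executemany('INSERT INTO "window" VALUES (?, ?)', [
    ("2024-01-03", 300),
    ("2024-01-04", 150),
])

# Keep only history rows older than the window, then append the window,
# so restated days come from the fresh pull rather than the stale history.
cur.execute('''
    SELECT * FROM history
    WHERE "Date" < (SELECT MIN("Date") FROM "window")
    UNION ALL
    SELECT * FROM "window"
    ORDER BY "Date"
''')
rows = cur.fetchall()
# 2024-01-03 now carries the restated 300, not the stale 250
```

The result of this query would then be written back out as the new history data set, ready for the next day's window pull.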