Understanding My Data... preventing duplicates after DataFusion and creating a Table in Analyzer
Hello, I'm new to Domo and working with data in general.
I've managed to connect my data from Greenhouse (applicant tracking system). I have a LOT of data and have used DataFusion to combine my large datasets, and that has all worked fine. I'd like to create a few tables that update in real time (for example, current candidate pipelines for open requisitions, offer status, etc.). However, I noticed a lot of duplicate rows when working with my data in Analyzer. Questions:
- How do I get rid of the duplicates in Analyzer if my data comes from a DataFusion?
- Why are there duplicates? If my dataset updates every day, does it keep my historical data? How do I make sure I am getting the most up-to-date data?
- Do DataFusions update when the underlying dataset updates, or do I have to update them manually?
Comments
Your duplicates are caused by the joins you set up. If a value appears more than once in a column you are joining on, each matching row from the other dataset is repeated for every occurrence, so you end up with duplicates. You'll need a column of unique values to join on.
Here's a better explanation of what's going on: Joins and Duplicates
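To make the fan-out concrete, here is a minimal sketch in plain Python rather than Domo itself. The rows, the column names (`candidate_id`, `job`), and the helper functions are hypothetical and only illustrate the idea; they are not Greenhouse's actual schema or anything DataFusion exposes.

```python
from collections import Counter

# Hypothetical sample rows; column names are illustrative only.
candidates = [
    {"candidate_id": 1, "name": "Ana"},
    {"candidate_id": 2, "name": "Ben"},
]
applications = [
    {"candidate_id": 1, "job": "Engineer"},
    {"candidate_id": 1, "job": "Analyst"},  # candidate 1 appears twice
    {"candidate_id": 2, "job": "Engineer"},
]


def inner_join(left, right, key):
    """Pair every left row with every right row that shares the key value."""
    return [{**l, **r} for l in left for r in right if l[key] == r[key]]


def duplicated_keys(rows, key):
    """Return the key values that occur more than once in rows."""
    counts = Counter(row[key] for row in rows)
    return sorted(k for k, n in counts.items() if n > 1)


joined = inner_join(candidates, applications, "candidate_id")
print(len(candidates), "candidates ->", len(joined), "joined rows")
print("repeated join keys:", duplicated_keys(applications, "candidate_id"))
```

Candidate 1 shows up twice in the joined result because the join key repeats in `applications`. Running a check like `duplicated_keys` on each input before joining is one way to find which side is causing the extra rows.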
Whether your dataset keeps historical data depends on how the data pull is set up. If it's set to return all data, it will include history. If it only returns something like the current or previous day, you'll need to set up an ETL process to append each day's new records onto a historical dataset. Here's when and how to do that: When to Append
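The append pattern can be sketched roughly as follows, again in plain Python rather than a Domo ETL. The `snapshot_date` column, the `application_id` key, the sample dates, and both helper functions are assumptions made for illustration. The idea is to stamp each daily pull with its date, stack it onto the history, and then pick the newest row per record when you only want the current state.

```python
def append_snapshot(history, daily, snapshot_date):
    """Return history plus the daily rows, each stamped with the pull date."""
    return history + [{**row, "snapshot_date": snapshot_date} for row in daily]


def latest_per_key(rows, key):
    """Keep only the most recent row for each key value.

    Assumes snapshot_date is an ISO 'YYYY-MM-DD' string, so plain string
    comparison orders dates correctly.
    """
    latest = {}
    for row in rows:
        current = latest.get(row[key])
        if current is None or row["snapshot_date"] > current["snapshot_date"]:
            latest[row[key]] = row
    return list(latest.values())


# Hypothetical daily pulls; values are made up for the example.
history = []
history = append_snapshot(
    history, [{"application_id": 11, "status": "interview"}], "2024-01-01"
)
history = append_snapshot(
    history,
    [
        {"application_id": 11, "status": "offer"},
        {"application_id": 12, "status": "screen"},
    ],
    "2024-01-02",
)

current = latest_per_key(history, "application_id")
print(len(history), "historical rows,", len(current), "current rows")
```

The full `history` keeps every daily state, which is what a trend over time needs, while `current` holds one row per application, which suits a live pipeline table.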
DataFusions will update automatically as the datasets they're built from update.
Hope that helps,
Valiant1