Understanding My Data... preventing duplicates after DataFusion and creating a Table in Analyzer
Hello, I'm new to Domo and working with data in general.
I've managed to connect my data from Greenhouse (an applicant tracking system). I have a LOT of data and have used DataFusion to combine my large datasets. That has all worked fine. I'd like to create a few tables that update in real time (for example: current candidate pipelines for open requisitions, offer status, etc.). However, I noticed that there are a lot of duplicate rows when I manipulate my data in Analyzer. Questions:
-How do I get rid of the duplicates in Analyzer if my data comes from a DataFusion?
-Why are there duplicates? If my dataset updates every day, does it keep my historical data? How do I make sure I am getting the most up-to-date data?
-Do DataFusions update when the underlying dataset updates, or do I have to update them manually?
Comments
Your duplicates are most likely caused by the joins you set up. If the same value appears more than once in a column you are joining on, every matching row from the other dataset is repeated once for each occurrence, so you end up with duplicates. To avoid that, join on a column (or a combination of columns) whose values are unique.
Here's a better explanation of what's going on: Joins and Duplicates
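To make the join behavior concrete, here is a minimal sketch in plain Python (not Domo itself). The `inner_join` helper, the sample rows, and the field names `candidate_id`, `name`, and `job` are all made up for illustration and are not real Greenhouse fields. It only shows how a repeated join value multiplies rows.

```python
from collections import defaultdict

def inner_join(left, right, key):
    """Inner-join two lists of dict rows on `key`, like a SQL-style join."""
    # Group the right-hand rows by their join value.
    index = defaultdict(list)
    for row in right:
        index[row[key]].append(row)
    # Emit one output row per (left row, matching right row) pair.
    joined = []
    for lrow in left:
        for rrow in index.get(lrow[key], []):
            joined.append({**lrow, **rrow})
    return joined

# Hypothetical sample data: candidate 1 has two applications.
candidates = [
    {"candidate_id": 1, "name": "Ana"},
    {"candidate_id": 2, "name": "Ben"},
]
applications = [
    {"candidate_id": 1, "job": "Engineer"},
    {"candidate_id": 1, "job": "Analyst"},
    {"candidate_id": 2, "job": "Engineer"},
]

joined = inner_join(candidates, applications, "candidate_id")
for row in joined:
    print(row)
```

Two candidates go in, but three rows come out, because candidate 1 matches two application rows. If you then count candidates from the joined data without accounting for that, Ana is counted twice.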
Your question about historical data depends on how the data pull is set up. If it's set to return all data, then the dataset will have history. If it only returns something like the current or previous day, you'll need to set up an ETL process to append your new daily records onto a historical dataset. Here's when and how to do that: When to Append
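As a rough illustration of the append idea, here is a small plain-Python sketch (again, not Domo ETL). The helpers `append_snapshot` and `latest_per_key`, the `batch_date` column, and the sample fields are assumptions made up for this example. It appends each day's rows to a running history, then picks the most recent row per candidate when only the current state is needed.

```python
def append_snapshot(history, daily_rows, batch_date):
    """Return history plus the day's rows, each stamped with batch_date."""
    return history + [{**row, "batch_date": batch_date} for row in daily_rows]

def latest_per_key(history, key):
    """Keep only the most recent row for each value of `key`."""
    latest = {}
    for row in history:
        current = latest.get(row[key])
        # ISO dates (YYYY-MM-DD) compare correctly as plain strings.
        if current is None or row["batch_date"] >= current["batch_date"]:
            latest[row[key]] = row
    return list(latest.values())

# Hypothetical daily pulls that each return only that day's records.
day1 = [
    {"candidate_id": 1, "stage": "Screen"},
    {"candidate_id": 2, "stage": "Onsite"},
]
day2 = [{"candidate_id": 1, "stage": "Offer"}]

history = []
history = append_snapshot(history, day1, "2024-01-01")
history = append_snapshot(history, day2, "2024-01-02")

current = latest_per_key(history, "candidate_id")
print(len(history))  # every daily record is kept
print(current)       # one row per candidate, the newest one
```

The history keeps every record for trend reporting, while the per-key filter gives a view with one up-to-date row per candidate.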
DataFusions will update automatically as the datasets they're built from update.
Hope that helps,
Valiant1