Understanding My Data... preventing duplicates after DataFusion and creating a Table in Analyzer
Hello, I'm new to Domo and working with data in general.
I've managed to connect my data from Greenhouse (an applicant tracking system). I have a LOT of data and have used DataFusion to combine my large datasets. That has all worked fine. I'd like to create a few tables that update in real time (for example: current candidate pipelines for open requisitions, offer status, etc.). I noticed that there are a lot of duplicate rows when manipulating my data in Analyzer. Questions:
-How do I get rid of the duplicates in Analyzer if my data comes from a DataFusion?
-Why are there duplicates? If my dataset updates every day, does it keep my historical data? How do I make sure I am getting the most up-to-date data?
-Do DataFusions update when the dataset updates, or do I have to update them manually?
Comments
Your duplicates are caused by the joins you set up. If the same value appears more than once in a column you are joining on, the matching rows from the other dataset are repeated for each occurrence, and you end up with duplicates. You'll need a column with unique values to join on.
Here's a better explanation of what's going on: Joins and Duplicates
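To make the join behavior concrete, here is a minimal sketch in plain Python (not Domo itself). The table and column names (`candidates`, `applications`, `offers`, `candidate_id`) are made up for illustration and are not Greenhouse's actual schema; `inner_join` and `repeated_keys` are hypothetical helpers.

```python
from collections import Counter

# Hypothetical sample rows; the column names are assumptions for illustration.
candidates = [
    {"candidate_id": 1, "name": "Ana"},
    {"candidate_id": 2, "name": "Ben"},
]
applications = [
    {"candidate_id": 1, "requisition": "REQ-10"},
    {"candidate_id": 1, "requisition": "REQ-11"},
    {"candidate_id": 2, "requisition": "REQ-10"},
]
offers = [
    {"candidate_id": 1, "offer_status": "draft"},
    {"candidate_id": 1, "offer_status": "sent"},
]


def inner_join(left, right, key):
    """Return one merged row for every left/right pair whose key matches."""
    return [{**l, **r} for l in left for r in right if l[key] == r[key]]


def repeated_keys(rows, key):
    """Return the key values that appear more than once in rows."""
    counts = Counter(row[key] for row in rows)
    return sorted(k for k, n in counts.items() if n > 1)


# candidate_id is unique in candidates, so the result has one row per application.
joined = inner_join(candidates, applications, "candidate_id")
print(len(joined))  # 3

# candidate_id repeats on both sides here, so candidate 1 yields 2 x 2 = 4 rows.
multiplied = inner_join(applications, offers, "candidate_id")
print(len(multiplied))  # 4

# Checking for repeated join keys before joining shows where duplicates will come from.
print(repeated_keys(applications, "candidate_id"))  # [1]
```

The same check applies to whatever columns your DataFusion joins on: if a key repeats in both inputs, the output row count multiplies.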
Whether your dataset keeps historical data depends on how the data pull is set up. If it's set to return all data, then it will have history. If it only returns something like the current or previous day, you'll need to set up an ETL process to append your new daily records onto a historical dataset. Here's when and how to do that: When to Append
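The append idea can be sketched as follows. This is plain Python showing the logic only; in Domo you would build it as an ETL or DataFlow. The field names (`application_id`, `stage`, `updated_at`) and both helper functions are assumptions for illustration.

```python
def append_daily(history, daily_rows):
    """Append today's rows to the historical rows without dropping anything."""
    return history + daily_rows


def latest_per_key(rows, key="application_id", stamp="updated_at"):
    """Keep only the most recent row for each key (ISO dates sort as strings)."""
    latest = {}
    for row in rows:
        current = latest.get(row[key])
        if current is None or row[stamp] >= current[stamp]:
            latest[row[key]] = row
    return sorted(latest.values(), key=lambda r: r[key])


# Hypothetical data: yesterday's history plus today's pull.
history = [
    {"application_id": 101, "stage": "Phone Screen", "updated_at": "2024-01-01"},
    {"application_id": 102, "stage": "Onsite", "updated_at": "2024-01-01"},
]
today = [
    {"application_id": 101, "stage": "Offer", "updated_at": "2024-01-02"},
    {"application_id": 103, "stage": "Applied", "updated_at": "2024-01-02"},
]

full_history = append_daily(history, today)   # 4 rows, every change preserved
current_view = latest_per_key(full_history)   # 3 rows, one per application

for row in current_view:
    print(row["application_id"], row["stage"])
```

Keeping both outputs separate is a design choice: the appended dataset answers "what changed over time", while the latest-per-key view is what a current pipeline table would read from.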
DataFusions will update automatically as the datasets they're built from update.
Hope that helps,
Valiant1