Option to Remove the 10,000 Duplicate Error for ETL
Because of how fast/easy ETL flows are, I prefer to use them for most flows. My problem is that sometimes I hit the following error, "Error joining data. The left input cannot include over 10,000 duplicates. Please switch your inputs, group your data, or remove duplicates before joining". The Dojo suggests that this error is only shown to stop users from accidentely making cartesian joins, but I know that's not what I'm doing. For example, if I want to join an item table onto a transaction table, then I have to do it in SQL because I'll get the error. This leads me to having multiple flows for the same end result.
I suggest you give admins the option to disable the error.
Best Answer
-
The super secret workaround we all use...
Right outer join instead of left.
This keeps people who don't know what they are doing from breaking stuff, but enables the rest of us to go on blissfully joining.
It'll seem awkward at first, but in a bit, you won't even have to think about it.
You want to set it up to pull all records from the table on the right, and only those which match from the left. Simple reverse of what you're used to.
DataMaven
Breaking Down Silos - Building Bridges
**Say "Thanks" by clicking a reaction in the post that helped you.
**Please mark the post that solves your problem by clicking on "Accept as Solution"3
Answers
-
The super secret workaround we all use...
Right outer join instead of left.
This keeps people who don't know what they are doing from breaking stuff, but enables the rest of us to go on blissfully joining.
It'll seem awkward at first, but in a bit, you won't even have to think about it.
You want to set it up to pull all records from the table on the right, and only those which match from the left. Simple reverse of what you're used to.
DataMaven
Breaking Down Silos - Building Bridges
**Say "Thanks" by clicking a reaction in the post that helped you.
**Please mark the post that solves your problem by clicking on "Accept as Solution"3 -
@DataMaven have you noticed this not working lately? I have a flow disabling because of this error even though it is a right join.
0 -
@Figauro the question isn't whether you're using a LEFT or a RIGHT join.
The question is which side of the JOIN has the duplicates? (the many side of the 1:M relationship). You want the M side to be on the right.
If you think you already have that, then apply a GROUP BY `SHIP FROM ID` on the LEFT side before the JOIN. This guarantees non-duplicate values on the LEFT.
Jae Wilson
Check out my 🎥 Domo Training YouTube Channel 👨💻
**Say "Thanks" by clicking the ❤️ in the post that helped you.
**Please mark the post that solves your problem by clicking on "Accept as Solution"0
Categories
- All Categories
- 1.8K Product Ideas
- 1.8K Ideas Exchange
- 1.5K Connect
- 1.2K Connectors
- 297 Workbench
- 6 Cloud Amplifier
- 8 Federated
- 2.9K Transform
- 100 SQL DataFlows
- 614 Datasets
- 2.2K Magic ETL
- 3.8K Visualize
- 2.5K Charting
- 729 Beast Mode
- 53 App Studio
- 40 Variables
- 677 Automate
- 173 Apps
- 451 APIs & Domo Developer
- 45 Workflows
- 8 DomoAI
- 34 Predict
- 14 Jupyter Workspaces
- 20 R & Python Tiles
- 394 Distribute
- 113 Domo Everywhere
- 275 Scheduled Reports
- 6 Software Integrations
- 121 Manage
- 118 Governance & Security
- Domo Community Gallery
- 32 Product Releases
- 10 Domo University
- 5.4K Community Forums
- 40 Getting Started
- 30 Community Member Introductions
- 108 Community Announcements
- 4.8K Archive