Redshift vs. MySQL vs. ETL
Hi Everyone,
This Knowledge Base Article suggests guidelines for when to use certain In-Domo processing options, but admittedly I default to using Redshift for all of my transforms regardless of dataset input size. Am I missing out on faster data flow processing times by not following the guidelines in the article?
I would be particularly interested to know whether any of the options load input DataSets or process output DataSets faster or slower than the others.
Best Answer
DDalt,
Thank you for reaching out with your question. Redshift is great for many use cases, especially those that require SQL transformations on larger data. The downside is that the Redshift service and the resources allotted to processes are managed by Amazon, which removes some of the environment controls that Domo otherwise has for MySQL and Magic ETL.
MySQL will generally present less variance in DataFlow run times, but it does not automatically index data as Redshift does, so it is better suited to smaller input row counts. MySQL also doesn't share all of Redshift's functionality; the window functions available in Redshift are one example. To get the best performance out of MySQL DataFlows, you should employ indexes for joins and consider the other optimizations discussed here:
http://knowledge.domo.com?cid=optimizingdataflow
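To make the "employ indexes for joins" advice concrete, here is a minimal sketch of the idea. It uses Python's built-in sqlite3 module as a stand-in for a MySQL DataFlow, and the table, column, and index names are hypothetical. In MySQL itself the comparable statement would be something like `ALTER TABLE customers ADD INDEX (customer_id)`; treat this as an illustration of the technique, not as Domo's recommended script.

```python
import sqlite3

# Hypothetical join used only for illustration.
JOIN_SQL = (
    "SELECT o.order_id, c.region "
    "FROM orders o JOIN customers c ON o.customer_id = c.customer_id"
)


def build_demo(with_index: bool) -> sqlite3.Connection:
    """Create two small in-memory tables, optionally indexing the join column."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (order_id INTEGER, customer_id INTEGER, amount REAL)"
    )
    conn.execute("CREATE TABLE customers (customer_id INTEGER, region TEXT)")
    conn.executemany(
        "INSERT INTO customers VALUES (?, ?)",
        [(i, f"region_{i % 5}") for i in range(1000)],
    )
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)",
        [(i, i % 1000, float(i)) for i in range(5000)],
    )
    if with_index:
        # Index the column used in the JOIN condition.
        conn.execute("CREATE INDEX idx_customers_id ON customers (customer_id)")
    return conn


def query_plan(conn: sqlite3.Connection) -> list:
    """Return the planner's description of how it will execute the join."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + JOIN_SQL).fetchall()
    return [row[-1] for row in rows]


def join_row_count(conn: sqlite3.Connection) -> int:
    """Run the join and count the rows it returns."""
    return len(conn.execute(JOIN_SQL).fetchall())


if __name__ == "__main__":
    for flag in (False, True):
        demo = build_demo(with_index=flag)
        print("index:", flag, "| plan:", query_plan(demo))
```

The index changes how the join is executed, not what it returns, so comparing the printed plans with and without the index is a quick way to check whether the engine is actually using it.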
Magic ETL is well suited to larger input DataSets, and could be considered as an alternative to Redshift for many use cases. It will begin to process data through the transformations as the input data comes in, rather than waiting for all of the input data to load completely, as Redshift does.
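The difference between loading everything first and processing rows as they arrive can be sketched conceptually. The example below is not Domo's implementation; the function names are hypothetical and the "transformation" is a placeholder. It only shows how the order of work differs between the two styles.

```python
from typing import Iterable, Iterator, List


def source(n: int, log: List[str]) -> Iterator[int]:
    """Simulate input rows arriving one at a time, logging each read."""
    for i in range(n):
        log.append(f"read {i}")
        yield i


def transform(row: int) -> int:
    """Placeholder transformation."""
    return row * 2


def batch_pipeline(rows: Iterable[int], log: List[str]) -> List[int]:
    """Load all input first, then transform (the load-then-process style)."""
    loaded = list(rows)  # nothing is transformed until every row is read
    out = []
    for row in loaded:
        log.append(f"transform {row}")
        out.append(transform(row))
    return out


def streaming_pipeline(rows: Iterable[int], log: List[str]) -> Iterator[int]:
    """Transform each row as soon as it arrives (the streaming style)."""
    for row in rows:
        log.append(f"transform {row}")
        yield transform(row)


if __name__ == "__main__":
    batch_log: List[str] = []
    batch_pipeline(source(3, batch_log), batch_log)
    print(batch_log)

    stream_log: List[str] = []
    list(streaming_pipeline(source(3, stream_log), stream_log))
    print(stream_log)
```

In the batch log all reads come before any transform, while in the streaming log reads and transforms interleave. Both produce the same output; the streaming style simply starts useful work before the input has finished loading.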
To summarize: Redshift is good for larger data, MySQL should be used for smaller data inputs, and Magic ETL is good for both small and large data inputs. Each use case will determine which tool is the best fit, but they all have their place in your toolbox for data manipulation and preparation.
Regards,
Darius Rose