Historical Comparisons

Hello,

We have completed our 2nd full year in Domo and we are starting to see performance issues with the dataflows. We have been integrating data sources left and right and our instance is around 500M rows. Not sure if that is typical or not but it should give you an idea of the data we are processing.

Some of our problems can be alleviated by creating true historical datasets that do not need to be re-processed every day and we will be working on this process this year.

I am looking for recommendations/best practices for staging/reporting on historical data. Typically we are doing yoy, ytd, mtd, dtd, 52 weeks, etc. I have found the Domo POP charts to be too inflexible for our needs. My current approach is having today datasets and historical. The historical dataflows are beginning to take hours to run and cards are starting to lag which is not making people happy. Our business day ends at 6am so if a historical comparison datasets takes 2-3 hours our users are waiting 1/2 the morning for Domo to update. The real problem is when a dataflow mysteriously takes 5-8 hours to complete. I have titlted at this windmill from year 1 with Domo engineering and the answer has always been that the dataflows on average are running at acceptable times regardless of spikes in processing times and that Domo is constantly working on improving their processing/prioritizing of jobs.

Should I create the comparison totals in beast modes? How will the card function based on tens of millions of rows? Do I need to consider aggregating data in a datawarehouse instead of Domo?

Thank you in advance for your recommendations.

Comments

NewsomSolutions

I'm going to follow this b/c I am bringing in data and some may come in at the billion row level.

But a few questions:

1. When you have slow dataflows, are they using Magic ETL, MySQL, or RedShift?

2. Have you tried thinning out your datasets and possibly structuring your data differently for the cards? What I mean is that if you currently have one dataset, that may be denormalized so you can point lots of cards to it, and it has lots of columns and lots of metrics. You may benefit from splitting that up to two at the output and point cards to one or the other depending on the KPI.

3. In your 'aggregation' question - I'd aggregate at whatever the lowest level you can in ETL that doesn't skew your beastmodes in the card. For example, if you need to know the details of each transaction, you don't want to sum(transaction count) by day. But maybe you can sum(cash sales) and sum(credit sales) something like that to help.

cwolman

1. Both mysql and redshift exhibit these fluctuations in run times.

2. We have separate datasets and dataflows for different metrics that need to be calculated totally differently.

3. The aggregrations are an attempt to roll up the data for faster performing dataflows.

I am curious if it would be better to do the comparison totals in beast modes instead of dataflows or if there is a more efficient way to stage the data in a dataflow other than left joining the data back to itself multiple times based on time periods.

NewsomSolutions

Great.

1. First off, MySQL is slower in performance than Magic ETL. To me so is RedShift, but officially I'm not sure that is the case.

2. Nice

3. Makes sense.

If you are using a card that does time period comparisons, does that not work for you vs writing out your own dataflow/BM comparisons?

cwolman

1. If magic ETL offers a performance boost then I guess I will need to start dragging tiles around. It always seemed simpler to me to type the sql.

The Domo Period Over Period cards do not allow custom summary numbers and you are limited to using a date field and only 1 metric. There are other formatting limitations but all in all they are too limited for our purposes.

I have no problem creating beast modes or dataflows but I do not want to go down the wrong path and find out I chose poorly. I need the direction moving forward to be the correct and proven method for working within the domo platform.

NewsomSolutions

As for the DF, yea, SQL is so much simpler and as a former DBA it was my go-to, but too many drags on mysql forced me to make the change. It sucks sometimes setting up and it will take some time to get used to it and how to think about it, but performance wise for me it was a much better move.

As for the comparisons, if you go the DF route, you may run into a problem with too many sources and too many cards. May be a trial / error...set it up in DF then combine what you've aggregated their with some less complex beast modes.

I'd also reach out to your CSM and see if they can sync you up with a Technical Consultant and get some recommendations.

Also - you can use workbench to do some of your ETL work too, not sure that would help you or not, but don't forget you may can do some there.

cwolman

I started looking at utilizing the magic etl option and realized that it cannot do complex joins so this will prevent me from using magic etl. Our POS data has start and end dates and I need to create rows for each day in between depending on the specifics of other columns in the row. It does not appear that magic etl can handle this so I will need to continue with mySQL or Redshift at least for the workhorse portion of the initial transforms. The other option as you pointed out would be to build this logic into the workbench side before sending it to Domo. The downside to using workbench is if the logic changes I would need to pull all of the data again from the source. Maybe I can book an appointment at the Brilliance Bar at Domopalooza and get some solid recommendations. Thank you for your responses.

Thank you.

rado98

Thing I do which may or may no help you is the following

I create large large datesets by appending hystorical and current data. I need the data updated through the day so my timing would be different to yours.

I would then run the historical part of the data, say all data from last year backwards once per day, realistically it only needs to be run once but I like to make sure the data comes through.

I would run the current data every hour or so.

I then append the historical to the current, this takes very little time.

I do manually change the dataset cutoff date which is not ideal, you could automate that somehow.

cwolman

How do you append data to the historical dataset quickly? It usually takes a couple hours to load the historical data, a few minutes to append, and then another couple hours to create and index the historical output. In my case it takes around 3-4 hours to append 25k records to a 20M historical dataset. Unfortunately this is one dataset from one source that still needs to be joined to a much larger dataset that combines multiple datasources.

Quick Links

Find more posts tagged with

Other Categories

Product Ideas
Have a Domo product enhancement idea? Submit or upvote on ideas in the Ideas Exchange.
Ideas Exchange
Suggest & vote on new features you would like to see implemented in the Domo Product.
Data Connections
Ask questions about Connectors, Workbench, Cloud Amplifier and get best practices from Domo peers
Connectors
Connectors, Custom Connectors, Writeback
Workbench
Ask questions about Workbench, a secure, client-side solution for uploading your on-premise data to Domo.
Cloud Integrations
Ask questions about Cloud Integrations and Federated Data connection to your data warehouse or lake.
Data & ETL
Ask questions about Magic ETL, SQL DataFlows, DataFusion, Dataset Views and get best practices from Domo peers
Magic ETL
Ask Magic ETL questions and get answers from Domo peers
SQL DataFlows
Ask SQL DataFlow questions and get answers from Domo peers
Datasets
Ask DataFusion and Dataset Views questions and get answers from Domo peers
Visualize & Apps
Ask questions about Beast Mode, Cards, Charting, Dashboards, Stories, Variables and get best practices from Domo peers
Dashboards
Ask Cards, Dashboards, and Stories questions and get answers from Domo peers
App Studio
Ask questions about building apps in App Studio.
Pro-code Components
Ask questions about pro-code components and Domo Bricks and get answers from Domo peers.
Charting & Analyzer
Ask questions about charting and Analyzer and get answers from Domo peers.
Calculations & Variables (Beast Mode)
Ask questions about using calculated fields and Variables (Beast Modes) in Analyzer.
AI & Data science
Ask questions about DomoAI and get answers from Domo peers.
Domo AI & AI Chat
Ask questions about AI Chat and AI assistants.
Managing AI
Ask questions about managing AI with AI Playground, AI projects, AI models, and more.
Jupyter Workspaces
Ask questions about Jupyter Workspaces, Notebooks, and file share.
Automate
Ask questions about App Framework, Workflows, Domo Bricks, Domo Developer, API and get best practices from Domo peers
Workflows
Ask questions about Task Center, building automations with Domo Workflows, and executing JavaScript or Python code with Code Engine.
Alerts
Ask questions about managing alerts in Domo and get answers from Domo peers.
Distribute
Ask questions about Domo Everywhere, Scheduled Reports, Mobile and get best practices from Domo peers
Domo Everywhere
Ask questions about embedded analytics with Domo Everywhere.
Reporting
Ask questions about Scheduled Reports, Report Builder, and Slideshow Publications.
Manage
Ask questions about Governance Administration, Approvals, Teams, Alerts, and Buzz and get best practices from Domo peers
Governance & Security
Ask questions about People, Groups, Roles, Sandbox, Activity log, Buzz, Teams, Approvals and PDP and get best practices from Domo peers
Navigation & Productivity
Ask questions about navigation, Projects & Tasks, Goals, and Buzz chat.
APIs
Ask APIs and Developer.domo.com questions and get answers from Domo peers
Add-ins & Plugins
Ask questions about plugins, Microsoft add-ins, and other third-party software integrations.
Domo Community Gallery
Watch how our Customers are using Domo to solve their complex problems.
Product Releases
Domo support and product teams are here to live-answer questions about the most recent product releases. Please post questions in this Forum board for all users to benefit (rather than submitting a support ticket).
Domo University
Questions or discussions related to Domo University, trainings and certifications
Community Forums
Getting Started
Welcome to Domo's Community Forums! You'll find everything you need to get started in this category.
Community Announcements
Get the latest from Domo's Community Team.
Social Groups
Archive
Old or outdated content that could still be found helpful.