
Historical Comparisons

cwolman

Hello,

We have completed our 2nd full year in Domo and we are starting to see performance issues with the dataflows. We have been integrating data sources left and right and our instance is around 500M rows. Not sure if that is typical or not but it should give you an idea of the data we are processing.

Some of our problems can be alleviated by creating true historical datasets that do not need to be re-processed every day and we will be working on this process this year.

I am looking for recommendations/best practices for staging/reporting on historical data. Typically we are doing yoy, ytd, mtd, dtd, 52 weeks, etc. I have found the Domo POP charts to be too inflexible for our needs. My current approach is having separate today and historical datasets. The historical dataflows are beginning to take hours to run and cards are starting to lag, which is not making people happy. Our business day ends at 6am, so if a historical comparison dataset takes 2-3 hours our users are waiting half the morning for Domo to update. The real problem is when a dataflow mysteriously takes 5-8 hours to complete. I have tilted at this windmill since year 1 with Domo engineering, and the answer has always been that the dataflows on average are running at acceptable times regardless of spikes in processing times and that Domo is constantly working on improving their processing/prioritizing of jobs.

Should I create the comparison totals in beast modes? How will the card function based on tens of millions of rows? Do I need to consider aggregating data in a datawarehouse instead of Domo?

Thank you in advance for your recommendations.

Comments

NewsomSolutions

I'm going to follow this b/c I am bringing in data and some may come in at the billion row level.

But a few questions:

1. When you have slow dataflows, are they using Magic ETL, MySQL, or RedShift?

2. Have you tried thinning out your datasets and possibly structuring your data differently for the cards? What I mean is that you may currently have one denormalized dataset with lots of columns and lots of metrics so that you can point lots of cards to it. You may benefit from splitting that into two at the output and pointing cards to one or the other depending on the KPI.

3. On your 'aggregation' question - I'd aggregate in ETL at the lowest level you can that doesn't skew your beast modes in the card. For example, if you need to know the details of each transaction, you don't want to sum(transaction count) by day. But maybe you can sum(cash sales) and sum(credit sales), or something like that, to help.
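To make point 3 concrete, here is a minimal sketch of rolling transactions up to one row per day while keeping cash and credit sales as separate additive measures. It is plain Python rather than a Domo dataflow, and the row layout (`sale_date`, `payment_type`, `amount`) is an assumption made up for illustration.

```python
from collections import defaultdict
from datetime import date

# Hypothetical transaction rows: (sale_date, payment_type, amount)
transactions = [
    (date(2024, 1, 1), "cash", 10.0),
    (date(2024, 1, 1), "credit", 25.0),
    (date(2024, 1, 1), "cash", 5.0),
    (date(2024, 1, 2), "credit", 40.0),
]


def aggregate_daily(rows):
    """Roll transactions up to one row per day with additive measures."""
    daily = defaultdict(lambda: {"cash_sales": 0.0, "credit_sales": 0.0})
    for sale_date, payment_type, amount in rows:
        if payment_type == "cash":
            daily[sale_date]["cash_sales"] += amount
        elif payment_type == "credit":
            daily[sale_date]["credit_sales"] += amount
    return dict(daily)


daily_totals = aggregate_daily(transactions)
```

Because both measures are plain sums, a card can still add them up over any date range without the totals being skewed by the earlier roll-up.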

cwolman

1. Both mysql and redshift exhibit these fluctuations in run times.

2. We have separate datasets and dataflows for different metrics that need to be calculated totally differently.

3. The aggregations are an attempt to roll up the data for faster performing dataflows.

I am curious if it would be better to do the comparison totals in beast modes instead of dataflows or if there is a more efficient way to stage the data in a dataflow other than left joining the data back to itself multiple times based on time periods.
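One alternative to joining the data back to itself once per time period is conditional aggregation: read the rows once and add each amount to every period bucket it belongs to. This is the same pattern as `SUM(CASE WHEN ... THEN amount END)` in SQL. The sketch below is plain Python, not a Domo dataflow; the function name, the `(day, amount)` row shape, and the choice of MTD, YTD, and prior-year YTD are assumptions for illustration.

```python
from datetime import date


def period_totals(daily_sales, as_of):
    """Compute MTD, YTD, and prior-year YTD in one pass over (day, amount) rows."""
    # Same calendar day one year earlier; Feb 29 falls back to Feb 28.
    try:
        prior_as_of = as_of.replace(year=as_of.year - 1)
    except ValueError:
        prior_as_of = as_of.replace(year=as_of.year - 1, day=28)

    totals = {"mtd": 0.0, "ytd": 0.0, "prior_ytd": 0.0}
    for day, amount in daily_sales:
        if day.year == as_of.year and day <= as_of:
            totals["ytd"] += amount
            if day.month == as_of.month:
                totals["mtd"] += amount
        elif day.year == prior_as_of.year and day <= prior_as_of:
            totals["prior_ytd"] += amount
    return totals


# Hypothetical daily totals
daily_sales = [
    (date(2023, 1, 15), 100.0),
    (date(2023, 3, 10), 50.0),
    (date(2023, 6, 1), 70.0),
    (date(2024, 1, 20), 30.0),
    (date(2024, 3, 5), 20.0),
    (date(2024, 3, 9), 10.0),
    (date(2024, 3, 20), 99.0),
]
result = period_totals(daily_sales, as_of=date(2024, 3, 10))
```

Whether this runs faster as a dataflow transform or as a beast mode on the card is not something this sketch can answer. It only shows that the comparison totals do not require one self-join per period.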

NewsomSolutions

Great.

1. First off, MySQL is slower in performance than Magic ETL. To me so is RedShift, but officially I'm not sure that is the case.

2. Nice

3. Makes sense.

If you are using a card that does time period comparisons, does that not work for you vs writing out your own dataflow/BM comparisons?

cwolman

1. If magic ETL offers a performance boost then I guess I will need to start dragging tiles around. It always seemed simpler to me to type the sql.

The Domo Period Over Period cards do not allow custom summary numbers and you are limited to using a date field and only 1 metric. There are other formatting limitations but all in all they are too limited for our purposes.

I have no problem creating beast modes or dataflows but I do not want to go down the wrong path and find out I chose poorly. I need the direction moving forward to be the correct and proven method for working within the domo platform.

NewsomSolutions

As for the DF, yea, SQL is so much simpler and as a former DBA it was my go-to, but too many drags on mysql forced me to make the change. It sucks sometimes setting up and it will take some time to get used to it and how to think about it, but performance wise for me it was a much better move.

As for the comparisons, if you go the DF route, you may run into a problem with too many sources and too many cards. It may be trial and error: set it up in the DF, then combine what you've aggregated there with some less complex beast modes.

I'd also reach out to your CSM and see if they can sync you up with a Technical Consultant and get some recommendations.

Also - you can use Workbench to do some of your ETL work too. Not sure whether that would help you or not, but don't forget you may be able to do some of it there.

cwolman

I started looking at utilizing the magic etl option and realized that it cannot do complex joins so this will prevent me from using magic etl. Our POS data has start and end dates and I need to create rows for each day in between depending on the specifics of other columns in the row. It does not appear that magic etl can handle this so I will need to continue with mySQL or Redshift at least for the workhorse portion of the initial transforms. The other option as you pointed out would be to build this logic into the workbench side before sending it to Domo. The downside to using workbench is if the logic changes I would need to pull all of the data again from the source. Maybe I can book an appointment at the Brilliance Bar at Domopalooza and get some solid recommendations. Thank you for your responses.

Thank you.
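For reference, the date expansion described above (one output row per day between a start and end date) can be sketched as follows. In SQL this is commonly done by joining to a calendar table with a `BETWEEN` condition; the version below is plain Python, and the column names (`item`, `price`, `start_date`, `end_date`) are hypothetical. The extra conditions that depend on other columns in the row are not specified in the thread, so they are left out.

```python
from datetime import date, timedelta


def expand_to_daily(rows):
    """Yield one row per calendar day from start_date to end_date, inclusive."""
    for row in rows:
        span = (row["end_date"] - row["start_date"]).days
        # A row whose end_date precedes its start_date yields nothing.
        for offset in range(span + 1):
            yield {
                "item": row["item"],
                "price": row["price"],
                "day": row["start_date"] + timedelta(days=offset),
            }


# Hypothetical POS row with a start and end date
pos_rows = [
    {
        "item": "A",
        "price": 9.99,
        "start_date": date(2024, 1, 30),
        "end_date": date(2024, 2, 2),
    },
]
daily_rows = list(expand_to_daily(pos_rows))
```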

rado98

One thing I do, which may or may not help you, is the following.

I create large datasets by appending historical and current data. I need the data updated throughout the day, so my timing would be different from yours.

I would then run the historical part of the data, say all data from last year backwards, once per day. Realistically it only needs to be run once, but I like to make sure the data comes through.

I would run the current data every hour or so.

I then append the historical to the current, this takes very little time.

I do manually change the dataset cutoff date which is not ideal, you could automate that somehow.
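One way the cutoff could be automated is to derive it from the run date instead of editing it by hand. The sketch below is plain Python and assumes the rule "everything before January 1 of the current year is historical", which matches the "last year backwards" split above. The function names and the `day` field are hypothetical.

```python
from datetime import date


def cutoff_for(today):
    """Assumed rule: rows dated before January 1 of the current year are historical."""
    return date(today.year, 1, 1)


def split_by_cutoff(rows, cutoff):
    """Partition rows into (historical, current) around the cutoff date."""
    historical = [r for r in rows if r["day"] < cutoff]
    current = [r for r in rows if r["day"] >= cutoff]
    return historical, current


def combine(historical, current):
    """Append the frequently refreshed current rows to the historical rows."""
    return historical + current


# Hypothetical rows
rows = [
    {"day": date(2023, 12, 31), "sales": 10.0},
    {"day": date(2024, 1, 1), "sales": 20.0},
    {"day": date(2024, 5, 5), "sales": 30.0},
]
cutoff = cutoff_for(date(2024, 6, 15))
historical, current = split_by_cutoff(rows, cutoff)
combined = combine(historical, current)
```

With a rule like this, the historical side only changes when the year rolls over, so only the smaller current side needs to be rebuilt through the day.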

cwolman

How do you append data to the historical dataset quickly? It usually takes a couple hours to load the historical data, a few minutes to append, and then another couple hours to create and index the historical output. In my case it takes around 3-4 hours to append 25k records to a 20M historical dataset. Unfortunately this is one dataset from one source that still needs to be joined to a much larger dataset that combines multiple datasources.
