Indexing Help

Hi

I'm still new to Domo and recently linked two datasets together using Magic ETL. However, when I use this new dataset on a dashboard, filtering the data takes a long time.

I wanted to see if indexing the new dataset would improve the filtering speed. Does anyone know the best way to go about this?

Thanks in advance

Best Answer

  • GrantSmith
    GrantSmith Coach
    Answer ✓

    One thing I always take into account when designing ETL pipelines is to be a teapot: short and stout. By short, I mean filtering the data as early as you can and selecting only the columns you need. Less data means faster processing. On the stout side, do as many things in parallel as possible. Instead of chaining joins one after another, join independent tables to each other where you can so that multiple joins run at the same time. This lets the parallelism work efficiently and keeps your pipeline from waiting on slow steps.
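
To make the teapot idea concrete, here is a rough sketch in plain Python. It is not Domo or Magic ETL code (Magic ETL is built from tiles in the UI); the datasets, column names, and the helpers `shorten`, `left_join`, and `build_output` are all made up for illustration. It only shows the shape of the pipeline: trim rows and columns first, then run two independent joins side by side before the final join.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical sample data standing in for the ETL input datasets.
orders = [
    {"order_id": 1, "customer_id": 10, "product_id": 100, "region": "West", "notes": "gift"},
    {"order_id": 2, "customer_id": 11, "product_id": 101, "region": "East", "notes": ""},
    {"order_id": 3, "customer_id": 10, "product_id": 101, "region": "West", "notes": "rush"},
]
customers = [{"customer_id": 10, "name": "Ada"}, {"customer_id": 11, "name": "Bo"}]
products = [
    {"product_id": 100, "title": "Kettle", "category_id": 7},
    {"product_id": 101, "title": "Cup", "category_id": 7},
]
categories = [{"category_id": 7, "category": "Kitchen"}]


def shorten(rows, keep_columns, predicate):
    """'Short': filter rows and drop unneeded columns before any join."""
    return [{c: r[c] for c in keep_columns} for r in rows if predicate(r)]


def left_join(left, right, key):
    """Left join two lists of dicts on `key` (right side assumed unique on key)."""
    lookup = {r[key]: r for r in right}
    return [{**row, **lookup.get(row[key], {})} for row in left]


def build_output():
    # Short: keep only West orders and only the columns needed downstream.
    slim_orders = shorten(
        orders,
        ["order_id", "customer_id", "product_id"],
        lambda r: r["region"] == "West",
    )
    # Stout: these two joins do not depend on each other, so they are
    # submitted together instead of being chained one after the other.
    with ThreadPoolExecutor() as pool:
        orders_customers = pool.submit(left_join, slim_orders, customers, "customer_id")
        products_categories = pool.submit(left_join, products, categories, "category_id")
        # The final join waits only for the two independent branches.
        return left_join(orders_customers.result(), products_categories.result(), "product_id")


result = build_output()
```

In Python itself the threads will not speed up this kind of in-memory work, so treat the executor as a stand-in for the branch layout only. In a Magic ETL dataflow the equivalent would be filter and select-columns steps placed right after the inputs, with separate join branches that meet at a final join.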


Answers

  • ArborRose
    ArborRose Coach
    edited April 30

    You don't need to explicitly create indexes as you would in a traditional database. Domo automatically optimizes for performance. Slow performance may be due to the complexity of the dashboard or to how you have things structured.

