Partitioning beta. Where to start?

dkonig
dkonig Member
edited July 2023 in Datasets

I've got access to the partitioning beta and am trying to figure out where to start. I have a number of legacy datasets (marketing platforms like Facebook, LinkedIn, Google Ads, and Google Analytics) and dataflows that are clunky at best and take the long route to handle recursive data. I'm looking to simplify these dataflows and make them more efficient, but I am stuck. For example, do you rebuild your datasets, especially if you are using connectors? If so, how best to do that to leverage partitioning?

Answers

  • @user029082 I suggest checking out this recording from one of the previous Lunch & Learn streams. There are some great examples and use cases for partitioning!

    https://community-forums.domo.com/main/events/30-partitioning-in-theory-and-in-practice

  • dkonig
    dkonig Member

    Yes. I remember this video. I'll need to go back and re-watch since I saw it before I got beta access.

  • dkonig
    dkonig Member
    edited July 2023

    This video is from back in January. It definitely mentions the use cases I am interested in, but I am still not sure exactly how to configure things. I'm still unclear on what to partition on, especially when datasets created by connectors without partitioning are used in the dataflow.

  • @user029082 In that case I would recommend reaching out directly to Andrea or the beta team to help you with your specific use cases.

  • dkonig
    dkonig Member

    Maybe I'm just not understanding how/why to use partitioning. With a connector that only has append and replace options, can you even use partitioning to pull in new data and replace (update), let's say, the last 30 days?

  • MichelleH
    MichelleH Coach
    Answer ✓

    @dkonig It depends on the connector. The ones that support partitioning will have a "merge" method, which uses partitions. If the connector you have been using does not have the merge method, there may be another version of the connector that does. When paired with MagicETL subset processing (which I assume is the beta you are referring to), Domo detects which rows in the input datasets have been added or updated and only processes those rows. In many cases this results in a much faster data pipeline.
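
    To make the merge idea concrete, here is a rough sketch in Python (pandas) of what a partition-aware update does conceptually. This is not Domo's internals, and the dates and columns are made up; the point is that only the partitions present in the new pull get replaced:

    ```python
    # Illustrative sketch only -- not Domo's implementation. The dataset is
    # treated as one partition per date, and an incoming pull overwrites
    # only the partitions it actually contains.
    import pandas as pd

    # Existing dataset, partitioned by date
    existing = pd.DataFrame({
        "date": ["2023-07-01", "2023-07-01", "2023-07-02", "2023-07-03"],
        "campaign": ["A", "B", "A", "A"],
        "clicks": [100, 50, 120, 90],
    })

    # New pull covering only the two most recent dates (still changing)
    incoming = pd.DataFrame({
        "date": ["2023-07-02", "2023-07-03"],
        "campaign": ["A", "A"],
        "clicks": [125, 95],
    })

    def merge_by_partition(existing, incoming, partition_col="date"):
        """Replace only the partitions present in `incoming`; keep the rest."""
        touched = set(incoming[partition_col])
        untouched = existing[~existing[partition_col].isin(touched)]
        return pd.concat([untouched, incoming], ignore_index=True)

    updated = merge_by_partition(existing, incoming)
    print(updated.sort_values(["date", "campaign"]))
    # 2023-07-01 rows are untouched; 2023-07-02 and 2023-07-03 were swapped out.
    ```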

    Here are a couple other articles that talk about partitioning theory:

  • dkonig
    dkonig Member
    edited July 2023

    Hmm. Ok. So if the connector doesn't have merge, then it's going to either replace or append no matter what. I only gain the benefit of partitioning in MagicETL if I need to use that data for other reporting. The partitioning allows me to update just the partitions I want.

    For example, let's take Google Ads. There is usually some time for recent data to 'cure', so a recent range of dates may have data that changes from day to day. The connector would run every day and replace the entire dataset, and then, via MagicETL, I could set up a dataflow that keeps all the old existing data, which does not change, and just grabs the newer data based on whatever settings I give it. This saves me all the reprocessing of the old data. Is that about right?
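
    Put in code terms, the pattern I have in mind looks roughly like this (a sketch outside of Domo; the 30-day window and the `date` column are just placeholders):

    ```python
    # Rough sketch of the pattern above: keep the settled history as-is and
    # refresh only the last 30 days from the new connector pull.
    # The window length and column name are assumptions.
    from datetime import date, timedelta
    import pandas as pd

    CURE_WINDOW_DAYS = 30

    def refresh_recent(history, fresh_pull, date_col="date"):
        """Keep rows older than the cutoff from `history`; take rows on or
        after the cutoff from `fresh_pull` (the still-changing dates)."""
        cutoff = pd.Timestamp(date.today() - timedelta(days=CURE_WINDOW_DAYS))
        settled = history[pd.to_datetime(history[date_col]) < cutoff]
        recent = fresh_pull[pd.to_datetime(fresh_pull[date_col]) >= cutoff]
        return pd.concat([settled, recent], ignore_index=True)

    # Usage: `history` is the previous dataflow output, `fresh_pull` is
    # today's connector run (which, without merge, still pulls everything).
    # updated = refresh_recent(history, fresh_pull)
    ```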

  • dkonig
    dkonig Member

    @MichelleH Does my previous comment make sense or am I still not grasping?