Store max transaction number

I have a general ledger transaction dataset that is almost 90Mil rows. It takes three hours to do a full replace daily. The dataset has a unique sequential transaction number that grows as transactions are posted.

Most of the time, letting the general ledger just run overnight is sufficient. But several times a month there are instances where I need to be able to refresh the general ledger in the middle of the day and get the results in a timely manner.

Therefore, I want to be able to only pull new transactions since the dataset last ran. I believe there is a way to store the maximum transaction number from the last pull. Which would allow this to become an append of any transaction numbers greater than the stored maximum transaction from the last pull, rather than having to do a full replace.

Best Answers

  • ArborRose
    ArborRose Coach
    Answer ✓

    I would handle it in an ETL. You have a huge dataset. I deal with 15-20Mil rows. Depending upon your data, you may be able to pull parts of transactions (such as current year, current quarter, etc) and append. I pull past years less frequently than I pull quarterly transactions. I combine the datasets to get a full set for use.

    I am currently experimenting with pulling my data through Python because the Domo json connnector is too slow for large data. You probably found what I did - that Postman can pull data much quicker than the oAuth No Code. I'm interested in any tricks you have for that large set btw….do share.

    ** Was this post helpful? Click Agree or Like below. **
    ** Did this solve your problem? Accept it as a solution! **

  • MichelleH
    MichelleH Coach
    Answer ✓

    @barb_barney_88 I'd also suggest exploring subset processing in MagicETL, which will allow you to request smaller amounts of data on your Workbench dataset (last 30 days, etc.) and merge with the existing output data.

  • GrantSmith
    GrantSmith Coach
    Answer ✓

    what you’re referring to is the special lastvalue parameter in a workbench query. Some basic information can be found here as an example https://domo-support.domo.com/s/article/4407020117527?language=en_US&topicId=0TO5w000000ZapUGAS&topicName=Workbench%205.1 you can ignore the fact that this is OLAP Cube documentation but just search for lastvalue to see the documentation on it. This way you can track the latest id you’ve pulled in and have it pick up from there moving forward

    **Was this post helpful? Click Agree or Like below**
    **Did this solve your problem? Accept it as a solution!**

Answers

  • ArborRose
    ArborRose Coach
    Answer ✓

    I would handle it in an ETL. You have a huge dataset. I deal with 15-20Mil rows. Depending upon your data, you may be able to pull parts of transactions (such as current year, current quarter, etc) and append. I pull past years less frequently than I pull quarterly transactions. I combine the datasets to get a full set for use.

    I am currently experimenting with pulling my data through Python because the Domo json connnector is too slow for large data. You probably found what I did - that Postman can pull data much quicker than the oAuth No Code. I'm interested in any tricks you have for that large set btw….do share.

    ** Was this post helpful? Click Agree or Like below. **
    ** Did this solve your problem? Accept it as a solution! **

  • MichelleH
    MichelleH Coach
    Answer ✓

    @barb_barney_88 I'd also suggest exploring subset processing in MagicETL, which will allow you to request smaller amounts of data on your Workbench dataset (last 30 days, etc.) and merge with the existing output data.

  • GrantSmith
    GrantSmith Coach
    Answer ✓

    what you’re referring to is the special lastvalue parameter in a workbench query. Some basic information can be found here as an example https://domo-support.domo.com/s/article/4407020117527?language=en_US&topicId=0TO5w000000ZapUGAS&topicName=Workbench%205.1 you can ignore the fact that this is OLAP Cube documentation but just search for lastvalue to see the documentation on it. This way you can track the latest id you’ve pulled in and have it pick up from there moving forward

    **Was this post helpful? Click Agree or Like below**
    **Did this solve your problem? Accept it as a solution!**