Extract 1 row / few rows from big dataset (>100M rows)

When I try to extract the _BATCH_LAST_RUN_ from a MySQL dataset with 106M rows, the ETL takes 3h20min. Is there a way to extract only one row from the dataset? For the moment, I have to wait until the full 106M-row dataset is loaded, in both Magic ETL and SQL.

Comments

  • Jarvis (Domo Employee)

    Hi,

    It depends on how your dataset is configured. For a standard dataset, the full dataset must load. If you can partition your dataset, you may be able to speed this up.

    Jarvis

  • Yeah, this is a tough one...

    In a nutshell, any ETL tool (except Fusion) has to transfer your data into its transformation engine (a SQL database, or Magic's ... data processing environment) before it can transform it. That is why your ETLs take so long.

    Your goal should be to use VIEWS, which go directly to Adrenaline (our database layer), to transform or subset your data.
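
To illustrate the difference between the two approaches, here is a minimal sketch that uses an in-memory SQLite database as a stand-in for the database layer. It is not Domo or Adrenaline code; the table name big_dataset, the view name latest_batch, and the column batch_last_run are hypothetical. The ETL-style path moves every row out of the database before reducing it, while the view-style path lets the database do the reduction and returns a single row.

```python
import sqlite3

# Stand-in for the database layer: an in-memory SQLite database.
# Table, view, and column names are hypothetical, not Domo objects.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE big_dataset (id INTEGER PRIMARY KEY, batch_last_run TEXT)"
)
conn.executemany(
    "INSERT INTO big_dataset (batch_last_run) VALUES (?)",
    [(f"2020-01-{day:02d} 00:00:00",) for day in range(1, 31)],
)

# ETL-style: transfer every row out of the database, then reduce it.
all_rows = conn.execute("SELECT batch_last_run FROM big_dataset").fetchall()
latest_via_etl = max(row[0] for row in all_rows)

# View-style: the database computes the result, only one row is transferred.
conn.execute(
    "CREATE VIEW latest_batch AS "
    "SELECT MAX(batch_last_run) AS batch_last_run FROM big_dataset"
)
view_rows = conn.execute("SELECT batch_last_run FROM latest_batch").fetchall()
latest_via_view = view_rows[0][0]

# Rows transferred by each approach, and whether the answers agree.
print(len(all_rows), len(view_rows))
print(latest_via_etl == latest_via_view)
```

Both paths give the same answer, but the first one scales with the size of the table while the second always moves a single row, which is the reason a view is the better fit for a 106M-row dataset.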

     

    Until about 5 months ago, your options for this were VERY limited... but stay tuned.

    There are product updates in the pipeline that will make this story MUCH BETTER. Find your Domo Customer Success Manager (CSM) and ask them how you can create views in Domo. We've got new features coming to the UI that are really going to help you.

    That said, if you're pretty nerdy, you can use the JavaCLI to create a view without a pretty user interface.

    Use get-schema to get the schema of an existing dataset in Domo, then use create-dataview... to create a dataview based on the schema you pulled, with filters added.

    Jae Wilson