How do I get started with MySQL Dataflows?

Hello all, I have roughly two years experience with creating Magic ETLs, but no experience with MySQL dataflows. I'm seeking to experiment with MySQL just for the sake of experimentation.

Background: I've written thousands of Case When Statements for Beast Modes and calculated fields, but have very little actual SQL experience outside of that.

I know I have to input two or more datasets then apply some sort of join in the transform, but I don't know exactly what all to do in between.

For example, I have no idea whether I should index each table individually in the input dataset, or index afterwards in the join transform.

For simplicity, assume I have a table1 and table2. I want to left join table2 onto table1; joining on 'Name', 'Year', and 'Type'.

What would the indexing look like in this scenario?

Tagged:

Best Answers

  • MichelleH
    MichelleH Coach
    Answer ✓

    @TheScotsman To start, I would suggest learning more about SQL (specifically SELECT statements). Here is a good tutorial to get you started on the basics: https://www.w3schools.com/sql/default.asp . You can also learn more about MySQL Domo dataflows here: https://domo-support.domo.com/s/article/360042922994?language=en_US

    While knowledge of SQL is definitely helpful, unless you have a pressing reason to use MySQL dataflows (complex joins, etc.) you are likely better off sticking with MagicETL since there are some significant performance gains.

  • ArborRose
    ArborRose Coach
    Answer ✓

    It might be having a problem with aliasing - using asterisk. Asterisk is a representation for all.

    SELECT
    s.Name AS State_Name,
    s.Year AS State_Year,
    s.Gender AS State_Gender,
    s.Id AS State_Id,
    n.Count AS National_Count
    FROM state_names s
    LEFT OUTER JOIN national_names n
    ON s.Name = n.Name
    AND s.Year = n.Year
    AND s.Gender = n.Gender;

    ** Was this post helpful? Click Agree or Like below. **
    ** Did this solve your problem? Accept it as a solution! **

Answers

  • MichelleH
    MichelleH Coach
    Answer ✓

    @TheScotsman To start, I would suggest learning more about SQL (specifically SELECT statements). Here is a good tutorial to get you started on the basics: https://www.w3schools.com/sql/default.asp . You can also learn more about MySQL Domo dataflows here: https://domo-support.domo.com/s/article/360042922994?language=en_US

    While knowledge of SQL is definitely helpful, unless you have a pressing reason to use MySQL dataflows (complex joins, etc.) you are likely better off sticking with MagicETL since there are some significant performance gains.

  • I tried to mimic what I did in a Magic ETL (nothing is wrong in that ouput dataset). On the MySQL dataflow, the National_Count column is coming in null on the preview. I tried executing the dataflow and it wouldn't run due to a "duplicate entry". For reference, the two datasets (US Baby Names) can be found at https://domo-support.domo.com/s/article/360043931814?language=en_US What's wrong with the following code:

    SELECT
    s.Name as State_Name,
    s.Year as State_Year,
    s.Gender AS State_Gender,
    s.Id as State_Id,
    s.*,
    n.Count as National_Count
    FROM
    state_names s
    LEFT OUTER JOIN
    national_names n
    ON
    s.Name = n.Name AND s.Year = n.Year AND s.Gender = n.Gender;

  • ArborRose
    ArborRose Coach
    Answer ✓

    It might be having a problem with aliasing - using asterisk. Asterisk is a representation for all.

    SELECT
    s.Name AS State_Name,
    s.Year AS State_Year,
    s.Gender AS State_Gender,
    s.Id AS State_Id,
    n.Count AS National_Count
    FROM state_names s
    LEFT OUTER JOIN national_names n
    ON s.Name = n.Name
    AND s.Year = n.Year
    AND s.Gender = n.Gender;

    ** Was this post helpful? Click Agree or Like below. **
    ** Did this solve your problem? Accept it as a solution! **

  • Sorry…I should have expanded on that. You say it gave a duplicate entry. When you give it an explicit list (stating each one) such as s.Name, s.Year….
    and then you also include s.* you are repeating yourself. s.Name, s.Year are included in s.* and thus would repeat or duplicate. By explicitly specifying the columns you need, you can avoid conflicts.

    ** Was this post helpful? Click Agree or Like below. **
    ** Did this solve your problem? Accept it as a solution! **