Accept two different column names as the same column within the card

Using DataSet via email to load data.

Sometimes the date column name is "Source Date (America / New York)"

Sometimes the date column name is "Source Date (America / Chicago)"

Is there a way, within the card, to have domo accept either as the Source Date?

That is, NOT to build a data flow to address this issue, but some way with the Card to assign the Source Date to Card and have it work whether the file comes through with "Source Date (America / New York)" or "Source Date (America / Chicago)" as the column header ...

Best Answer

  • ST_Superman
    ST_Superman Domo Employee
    Answer ✓

    OK...

     

    I have a solution for you, but I have to do this in a MySQL dataflow.  You will need to use your emailed data set as the input for the dataflow.

     

    In my case, I just created a very simple dataset with only two columns, but it should function the same way regardless of how many columns you have.

     

    Here is my sample data set:3.png

     

     

    You will need to add several tranforms to accomplish this, I will walk you through it:

     

    Step 1: Create a new "table" transform4.png

     

    Here is what you enter for the first transform:

     

    SELECT GROUP_CONCAT(
    CONCAT(
    'SELECT ',QUOTE(COLUMN_NAME),' AS `Time Zone`,`',COLUMN_NAME,'` AS `Date`, `Total` FROM dojo_user_13426_test') SEPARATOR ' UNION ALL ') as statement
    FROM INFORMATION_SCHEMA.COLUMNS
    WHERE TABLE_NAME='dojo_user_13426_test'
    AND COLUMN_NAME NOT IN ('Total')

    I'm assuming that you will have more than just two columns, but you will need to replace the end of the concat statement where I say " AS `Date`, `Total` ... 

    you will need to replace the `Total` with a list of all of the other columns you want from your data set.  It could look something like this:

     AS `Date`, `Area`, `Region`, `Sales` FROM your_table_name')

    You will also have to add all of those column names into the last line where it says 

    "AND COLUMN_NAME NOT IN ('Total')"  This could look like this:

    AND COLUMN_NAME NOT IN ('Area', 'Region', 'Sales')

    Please note that you need to use the back tick ` the first time you call out the field names and when entering the field names into the NOT IN clause you use a single quote '

     

    The default for the output of this step is transform_data_1 and I left that as my output as well.  If you change the name, you will also need to change the name in step 3

     

     

    On to Step 2:

    This time you need to create a SQL transform (not a table)

    DROP PROCEDURE IF EXISTS newname;

    Step 3: also a SQL transform

    CREATE PROCEDURE newname()

    BEGIN

    SELECT statement into @sql2 from `transform_data_1`;

    SET @str=concat('create table renamed_date as ',@sql2);

    PREPARE q FROM @str;
    EXECUTE q;

    END

     

    Step 4: also a SQL transform

    DROP TABLE IF EXISTS renamed_date;

    step 5: SQL transform

    call newname;

     

    Step 6: your output data set

     

    SELECT * FROM renamed_date

     

    Name your output data set and run the dataflow.  You will get a new data set that looks like this:2.png

    You can see that I have a field for the Time Zone and the Date.

     

    Just to test, I changed the original field name for my data set and ran it again:5.png

     

     

Answers

  • ST_Superman
    ST_Superman Domo Employee

    create a Calculated Field:

     

    ifnull(`Source Date (America / New York)`,`Source Date (America / Chicago)`)

  • Thank you for the reply!  I did try your suggestion and received this error message:

    Invalid Formula : A column in this calculation did not exist.

    To clarify, the data comes through with either  `Source Date (America / New York)` OR `Source Date (America / Chicago)` as the column header, not a mix of both ... 

     

    Thanks again!

  • ST_Superman
    ST_Superman Domo Employee

    I just created a data set and tried this out.  It worked for me.  Let me know if I'm setting this up incorrectly:

    1.png

     

     

    The "Combined Date" calculated field is:

    IFNULL(`Source Date (America / Chicago)`,`Source Date (America / New York)`)
  • Thank you, i appreciate your efforts!

     

    To clarify, the data comes through with either  `Source Date (America / New York)` 

    OR `Source Date (America / Chicago)` as the column header, not a mix of both ... 

  • ST_Superman
    ST_Superman Domo Employee

    I'm not sure how the same data set can change field names depending on the row of data.  Would you mind sharing a screenshot or excel sample of your data set?  That should help me find a solution

  • It is not at row level.  

    In this instance, we are using  the DataSet via email connector to load the data.

    This data is generated external to the company.

    Sometimes, within the file, the date column name is named "Source Date (America / New York)"

    Sometimes, within the file, the date column name is named "Source Date (America / Chicago)"

    What i was hoping to do, within the card, is handle either column header as the same.

    However, if you build the card using "Source Date (America / New York)" as a field, and the file is submitted with the column named "Source Date (America / Chicago)", the card breaks.

  • ST_Superman
    ST_Superman Domo Employee

    Do the files come from two different sources?  Or is it completely random which of the two date columns will be used?

     

    Unfortunately, I don't believe that this can be solved outside of an ETL.  I would suggest that the easiest fix would actually be to pick a standard format for receiving the file.  Perhaps you can have the file changed to show two separate columns, one for timezone and one for date?  Otherwise, databases don't care much for changing field names.  

     

    If you can get the Chicago Files to come in as one dataset and the New York files to be another dataset, then you could use an ETL to append the two data sets into a single dataset.

  • No, the files come from the same source.  it does appear to be random, the source people do not seem to have control over which gets sent, or at least they do NOT know how to control it.

    There are NOT New York and Chicago files, it is just one file sent weekly.  This column heading has been switching back and forth ... 

  • ST_Superman
    ST_Superman Domo Employee
    Answer ✓

    OK...

     

    I have a solution for you, but I have to do this in a MySQL dataflow.  You will need to use your emailed data set as the input for the dataflow.

     

    In my case, I just created a very simple dataset with only two columns, but it should function the same way regardless of how many columns you have.

     

    Here is my sample data set:3.png

     

     

    You will need to add several tranforms to accomplish this, I will walk you through it:

     

    Step 1: Create a new "table" transform4.png

     

    Here is what you enter for the first transform:

     

    SELECT GROUP_CONCAT(
    CONCAT(
    'SELECT ',QUOTE(COLUMN_NAME),' AS `Time Zone`,`',COLUMN_NAME,'` AS `Date`, `Total` FROM dojo_user_13426_test') SEPARATOR ' UNION ALL ') as statement
    FROM INFORMATION_SCHEMA.COLUMNS
    WHERE TABLE_NAME='dojo_user_13426_test'
    AND COLUMN_NAME NOT IN ('Total')

    I'm assuming that you will have more than just two columns, but you will need to replace the end of the concat statement where I say " AS `Date`, `Total` ... 

    you will need to replace the `Total` with a list of all of the other columns you want from your data set.  It could look something like this:

     AS `Date`, `Area`, `Region`, `Sales` FROM your_table_name')

    You will also have to add all of those column names into the last line where it says 

    "AND COLUMN_NAME NOT IN ('Total')"  This could look like this:

    AND COLUMN_NAME NOT IN ('Area', 'Region', 'Sales')

    Please note that you need to use the back tick ` the first time you call out the field names and when entering the field names into the NOT IN clause you use a single quote '

     

    The default for the output of this step is transform_data_1 and I left that as my output as well.  If you change the name, you will also need to change the name in step 3

     

     

    On to Step 2:

    This time you need to create a SQL transform (not a table)

    DROP PROCEDURE IF EXISTS newname;

    Step 3: also a SQL transform

    CREATE PROCEDURE newname()

    BEGIN

    SELECT statement into @sql2 from `transform_data_1`;

    SET @str=concat('create table renamed_date as ',@sql2);

    PREPARE q FROM @str;
    EXECUTE q;

    END

     

    Step 4: also a SQL transform

    DROP TABLE IF EXISTS renamed_date;

    step 5: SQL transform

    call newname;

     

    Step 6: your output data set

     

    SELECT * FROM renamed_date

     

    Name your output data set and run the dataflow.  You will get a new data set that looks like this:2.png

    You can see that I have a field for the Time Zone and the Date.

     

    Just to test, I changed the original field name for my data set and ran it again:5.png

     

     

This discussion has been closed.