Summarize many columns in MySQL

SQL newbie. We have a dataset which is sourced by a google-doc. The data has many columns with short strings in them.

 

Our goal is to summarize it to two columns

Column 1: Order Number

Column 2: A concat of each column on a given row (with the column headers prepended in) 

 

I have created a transform that extracts all of the headers and puts them into a single column
select
concat('''\r\n',column_name,': ','''',', ','`',column_name, '`') as 'Formatted For Select'
from
information_schema.columns
where table_name = 'consulting_sales_debrief_data'


I am trying to create an output that would provide the two columns I want

SELECT
`Sales Order Line Identifier (ie, SO#####_#)` AS 'Order Number',
CONCAT('',(SELECT GROUP_CONCAT(`Formatted For Select` SEPARATOR ', ') FROM `transform_data_1`) ) AS 'This Returns a string of the column names in quotes and name with backticks-- basically I want this as text in my output select',
CONCAT('Timestamp: ',`Timestamp`,'\r\nEmail Address: ',`Email Address`,'\r\nCompany Name: ',`Company Name`,'\r\nProject Name: ',`Project Name`,'\r\n3. Sales Rep Name: ',`3. Sales Rep Name`) AS 'This produced the desired output, but I have to name each column manually'

 

The issue is that the column names are changed on a fairly regular basis, so we really just want to summarize all the data into one column and include whatever the column names happen to be when the SQL is run. Any help would be appreciated
FROM `consulting_sales_debrief_data`

Best Answer

  • GrantSmith
    GrantSmith Coach
    Answer ✓

    Hi @matthewstryker 

     

    SQL doesn't handle reading different/variable column names in an SQL statement very well. What I would suggest is instead of storing the values in a wide format (columns as values) I'd recommend storing them as a narrow (also known as Tidy) format so you have two columns: Metric/Name and Value.

     

    Your Format (Wide):

    IDSalesProfit Rabbits pulled from a hat
    1100500

     

    New Format (Narrow)

    IDMetricValue
    1Sales100
    1Profit50
    1Rabbits pulled from a hat0

     

     

    This will allow you to dynamically handle different column names and to utilize the GROUP_CONCAT function to automatically group the different values together in the format you're wanting.

     

    If it's not possible to change the format of your Google Doc you could utilize a Magic ETL 2.0 (Beta) dataflow with the new Dynamic Unpivot tile to format your Google Doc data into a narrow format.

    **Was this post helpful? Click Agree or Like below**
    **Did this solve your problem? Accept it as a solution!**

Answers

  • GrantSmith
    GrantSmith Coach
    Answer ✓

    Hi @matthewstryker 

     

    SQL doesn't handle reading different/variable column names in an SQL statement very well. What I would suggest is instead of storing the values in a wide format (columns as values) I'd recommend storing them as a narrow (also known as Tidy) format so you have two columns: Metric/Name and Value.

     

    Your Format (Wide):

    IDSalesProfit Rabbits pulled from a hat
    1100500

     

    New Format (Narrow)

    IDMetricValue
    1Sales100
    1Profit50
    1Rabbits pulled from a hat0

     

     

    This will allow you to dynamically handle different column names and to utilize the GROUP_CONCAT function to automatically group the different values together in the format you're wanting.

     

    If it's not possible to change the format of your Google Doc you could utilize a Magic ETL 2.0 (Beta) dataflow with the new Dynamic Unpivot tile to format your Google Doc data into a narrow format.

    **Was this post helpful? Click Agree or Like below**
    **Did this solve your problem? Accept it as a solution!**
  • Hi Grant,

     

    This worked great, thank you!

     

    -Matt