Sum Columns based on Unit Number

Hello, I am trying to sum a hours column based on unit number, each unit number appears multiple times in the data set (for each time it needed service) I need to sum the hours of work done to each unit, any recomendations? Thanks!

Best Answers

  • jhl
    jhl Member
    Answer ✓

    hm... then I would suggest an SQL Dataflow, putting in something like this pseudocode

     

    select UnitNumber, count(UnitNumber)

    from Dataset

    group by UnitNumber;

     

    or you might want to sum() up another value column (the above is assuming that it is just about the number of appearances in the dataset). If necessary, you can add your identifying column (if its not UnitNumber) and just join that dataset with your owner dataset directly in the same dataflow. Set the dataflow to update every time your data updates and there will only be a slight delay between your original data source and the dataflow.

     

    This will only make sense for you if you know any SQL, if not there are more instructions about using the dataflow types here.

     

    HTH?

     

  • ST_-Superman-_
    Answer ✓

    If you prefer to not use the MySQL redshift transform for this, I would recommend trying out the new "Rank and Window" tile available in the Magic ETLs now.

    Knowledge Base Article for Rank and Window action

     


    “There is a superhero in all of us, we just need the courage to put on the cape.” -Superman

Answers

  • jhl
    jhl Member

    If you just need this as a table view, you can just pick unit numbers and hours spent, put both in a table and use SUM in the aggregation field.

     

    If you want to brute-force it, this might help:

    Use Beast Mode to make a as many extra columns as you need, for example

    Case when `Unit Number`=1 then sum(`Hours`) end

     

    HTH

     

  • I would use this, but after creating the sum columns I need to join this data with another data set, we're trying to match hours worked to specific owners, but the sheet with hours doesnt include the owner so I need to sum hours then join the data, Thanks!

  • jhl
    jhl Member

    Is there a missing identifying column? I am not sure why you couldn't join first and then sum up.

  • The issue is twofold, firstly there are duplicates, but we need single sum for each car, so if we have unit 5015 listed 30 times we need to sum that into one row, the we need to join the data so we can have the owner identifying column, thanks for the help!

  • jhl
    jhl Member
    Answer ✓

    hm... then I would suggest an SQL Dataflow, putting in something like this pseudocode

     

    select UnitNumber, count(UnitNumber)

    from Dataset

    group by UnitNumber;

     

    or you might want to sum() up another value column (the above is assuming that it is just about the number of appearances in the dataset). If necessary, you can add your identifying column (if its not UnitNumber) and just join that dataset with your owner dataset directly in the same dataflow. Set the dataflow to update every time your data updates and there will only be a slight delay between your original data source and the dataflow.

     

    This will only make sense for you if you know any SQL, if not there are more instructions about using the dataflow types here.

     

    HTH?

     

  • ST_-Superman-_
    Answer ✓

    If you prefer to not use the MySQL redshift transform for this, I would recommend trying out the new "Rank and Window" tile available in the Magic ETLs now.

    Knowledge Base Article for Rank and Window action

     


    “There is a superhero in all of us, we just need the courage to put on the cape.” -Superman
  • Awesome, Thanks!