Distinct values with earliest datestamp.

@ValiantSpur

 

I have a dimension table where new values found in my datasets are added, with a datestamp populated in a column. I have found that I need a final sweep in SQL to remove duplicates, because of how Domo handles some invisible characters found in some data feeds.

A SELECT DISTINCT statement works for the majority, but a handful of values keep producing duplicates. I realized that DISTINCT won't collapse rows that have the same company name but different datestamp values.

What do I need to add to the SQL so that it identifies duplicate values in one column and keeps only the row with the earliest datestamp?

@ValiantSpur - This is similar to another one you solved, but different enough that I want to make sure you get the solution if you give it to me.

DataMaven
Breaking Down Silos - Building Bridges

Comments

  • You could try something like this:

    SELECT `CompanyName`, MIN(`Datestamp`) AS `Datestamp`
    FROM dataset
    GROUP BY `CompanyName`

    That would leave you with a single entry for each unique CompanyName and the earliest date value associated with it.
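
    If the dimension table carries columns beyond these two, the GROUP BY result will not include them. As a rough sketch, assuming the same `dataset` table and a hypothetical extra column `Source` (not something mentioned in the question), you could join the earliest datestamp back to the original rows to keep the whole row:

    ```sql
    -- Sketch only: `dataset`, `CompanyName`, and `Datestamp` are taken from the query above.
    -- `Source` is a hypothetical extra column, and `FirstSeen` is an illustrative alias.
    SELECT d.`CompanyName`, d.`Datestamp`, d.`Source`
    FROM dataset d
    JOIN (
        -- Earliest datestamp for each company name
        SELECT `CompanyName`, MIN(`Datestamp`) AS `FirstSeen`
        FROM dataset
        GROUP BY `CompanyName`
    ) e
      ON  d.`CompanyName` = e.`CompanyName`
      AND d.`Datestamp`   = e.`FirstSeen`
    ```

    One caveat with this approach: if two rows for the same company share the same earliest datestamp, both will come back, so you may still need a SELECT DISTINCT or another tie-breaker on top.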

     

    Is that what you're looking for, or am I missing something?

     

    Sincerely,

    ValiantSpur

     
