Parsing text with common delimiters

Options
Rich
Rich Contributor

So I've now had a chance to reuse this solution about three times and that tells me that it is common enough that I should share it. The principle here is that you have a piece of text with a good naming convention that is delimited with a common character. For example, "Name_Date_Product_Channel" which is seperated by underscores. 

 

This piece of SQL code I'm including below will allow you to parse through the text and pull individual elements out as columns you name. You'll edit the elements in orange, green, and blue below.


Select `your_column_here`,
-- Parse Column 1 through 4.
Case WHEN (LENGTH(`your_column_here`) - LENGTH(REPLACE(`your_column_here`, '_', ''))) BETWEEN 1 AND 4 
THEN REPLACE(LOWER(SUBSTRING_INDEX(`your_column_here`,'_',1)),' ','')
ELSE NULL
END as 'Name',
CASE
WHEN (LENGTH(`your_column_here`) - LENGTH(REPLACE(`your_column_here`, '_', ''))) BETWEEN 1 AND 4
THEN REPLACE(LOWER(SUBSTRING_INDEX(SUBSTRING_INDEX(`your_column_here`,'_',2),'_',-1)),' ','')
ELSE NULL
END as 'Date',
CASE
WHEN (LENGTH(`your_column_here`) - LENGTH(REPLACE(`your_column_here`, '_', ''))) BETWEEN 2 AND 4
THEN REPLACE(LOWER(SUBSTRING_INDEX(SUBSTRING_INDEX(`your_column_here`,'_',3),'_',-1)),' ','')
ELSE NULL
END as 'Product',
CASE
WHEN (LENGTH(`your_column_here`) - LENGTH(REPLACE(`your_column_here`, '_', ''))) BETWEEN 3 AND 4
THEN REPLACE(LOWER(SUBSTRING_INDEX(SUBSTRING_INDEX(`your_column_here`,'_',4),'_',-1)),' ','')
ELSE NULL
END as 'Channel'
FROM your_table_here

 

The case statments above can be extended to as many parsings as you need. If no match is found (in otherwords a null occurs in your data) it will just have a blank field. 

 

I hope you find this as helpful as I have found it recently. If you use it and like it please upvote this entry!