How to remove special characters from cells and parse data in sql

I have data stored as text and am wanting to convert it to int with-in SQL.  Currently the text reads (8'-6 11/16").  I am needing to remove the foot symbol ('), the dash (-), and the inches (") symbol.  Then take each number (stored as text) and convert to int.  At which point I would rebuild the cell to read 102 11/16".  If it is possible to convert each number (stored as text) to decimal, then I could do a easy conversion and rebuild into fraction format. The SUBSTRING() command is not recognized in the "Transform" of MySQL DataFLow.

Best Answer

  • Boudrel1
    Boudrel1 Member
    Answer ✓

    Thanks for your help,

     

    I was able to use the SUBSTRING_INDEX() to get the specific substring I was looking for.  One issue I had was that my data does not always have a fraction (/) character.  So, I had to run a SUBSTRING_INDEX() to get everything between the (-) and (").

    SUBSTRING_INDEX()_1.png

     

    I then needed to get the fraction seperated.  So I used the same SUBSTRING_INDEX() method, but added the the commands: WHERE and REGEXP '/'.  With the WHERE and REGEXP command added I was able to pull only the fraction and not the whole number.

    SUBSTRING_INDEX()_2.png

     

     

    In line 18, 19, and 20 of my output table I was able to get only the fraction and skipped the rows that did not contain a fraction.  

     

    Thanks for your help with the SUBSTRING_INDEX().

Answers

  • What kinds of problems are you having with SUBSTRING()?  I've used that plenty of times in dataflows, so I'm surprised to see you're having an issue.

    What kind of variety do you see in your string data?  Do each of these parameters (feet, inches, fraction numerator and denominator) exist every time (like 8'-0 0/16" vs 8'  for eight feet)?  Or are the excluded if they won't exist as part of that measurement?

    If they are consistent, a series of substring functions should work great.  SUBSTRING_INDEX() could work really well here, too.

    Once you get SUBSTRING() or SUBSTRING_INDEX() working, using a dataflow to convert or cast to decimal should be simple.

     

    substring.PNG

     

    Aaron
    MajorDomo @ Merit Medical

    **Say "Thanks" by clicking the heart in the post that helped you.
    **Please mark the post that solves your problem by clicking on "Accept as Solution"
  • The SUBSTRING() does not work in my situation because I am trying to use it on a column with varying information.  In my column I will have information in just inches (8 1/2") as well as information in ft and in (5'-6 1/8").  I have been successful in using the LEFT() command to remove all the (") special characters.  I am unable to get the RIGHT() command to work properly.  I want to use RIGHT() to keep everything to the right of the (-) in 5'-6 1/8.  The LEFT() command removed the (") but when I try the RIGHT() command the result I get is (1/8).  I am missing the 6.

     

    LEFT() CODE:

         SELECT LEFT(height, LOCATE('"',height)-1) AS length_in, name
         FROM 96280_shipcon_table

     

    RIGHT() CODE:

          SELECT RIGHT(length_in, LOCATE('-',length_in)+1) AS length_in_fin, name
          FROM length_in

     

    LEFT() Command.pngRIGHT() Command.png

     

    Thanks for your help.

  • Variability makes it more complex for sure.  Will there always be a fraction?  That could matter in your approach as well.

    I tinkered a little bit.  There has to be a better way but see if some of this helps (also see attached):

    select
    '8''-4 3/16"' as length
    , case when '8''-4 3/16"' like '%''%' -- if there are feet measurements
    then cast(substring_index('8''-4 3/16"','''',1) as unsigned) -- take the digits until the feet character
    else 0 -- otherwise no feet
    end feet
    , case when '8''-4 3/16"' not like '%''%' -- if there are no feet measurements
    then cast(substring_index('8''-4 3/16"',' ',1) as unsigned) -- take everything up to the space
    else TRIM(SUBSTRING('8''-4 3/16"', LOCATE('-','8''-4 3/16"')+1,2)) -- if there are feet measurements, take everything between the - and the space
    end inches
    , SUBSTRING('8''-4 3/16"', LOCATE(' ','8''-4 3/16"')+1, length('8''-4 3/16"')-LOCATE(' ','8''-4 3/16"')-1) inches_fraction
    Aaron
    MajorDomo @ Merit Medical

    **Say "Thanks" by clicking the heart in the post that helped you.
    **Please mark the post that solves your problem by clicking on "Accept as Solution"
  • Boudrel1
    Boudrel1 Member
    Answer ✓

    Thanks for your help,

     

    I was able to use the SUBSTRING_INDEX() to get the specific substring I was looking for.  One issue I had was that my data does not always have a fraction (/) character.  So, I had to run a SUBSTRING_INDEX() to get everything between the (-) and (").

    SUBSTRING_INDEX()_1.png

     

    I then needed to get the fraction seperated.  So I used the same SUBSTRING_INDEX() method, but added the the commands: WHERE and REGEXP '/'.  With the WHERE and REGEXP command added I was able to pull only the fraction and not the whole number.

    SUBSTRING_INDEX()_2.png

     

     

    In line 18, 19, and 20 of my output table I was able to get only the fraction and skipped the rows that did not contain a fraction.  

     

    Thanks for your help with the SUBSTRING_INDEX().