Parse out the domain of an email address

Hi Everyone,

 

In a data flow, how do you parse out the domain portion of an email address so it can be put it in a separate field?  Please let me know if you have had any experience with this.

 

Thanks,

 

Mark

Comments

  • ckatzman
    ckatzman Contributor

    In your SELECT statement, could you embed a LOCATE() function within SUBSTR() function?

     

    SUBSTR(`email`, (LOCATE('@', `email`) + 1))

     

    If that works correctly, I would expect it to pull from a value of 'john.smith@generic.com' and return a value of 'generic.com'.

    **Say thank you by clicking the 'thumbs up'
    **Be sure to select the answer that represents the best solution and mark as "Accept as Solution"
  • kshah008
    kshah008 Contributor

    @MarkFlachs, did ckatzman's reply help you out? 

  • ETL:

    Long story short, you will create a new column, copy email to that new column and then strip out the data pre '@'. Here are the steps:

    1. Pull in your Input DataSet
    2. Add a Constants (new column) set Type = text and Constant = empty string
    3. Set a Column Value to copy data from, ie copy the Email field
    4. Replace Text of the new column (2 parts in this step):
      1. Select your new column and under 'Enter a term' =  ^[^@]*      Be sure to set to 'Use RegEx' and Replace found term = Empty string
      2. Select your new column and under 'Enter a term' =      No other setting needed and Replace found term = Empty string

    Hope that helps!

    Parse_Domain_from_email.png