Magic ETL

Magic ETL

Isolate email addresses from a column using Magic ETL or BeastMode

Member
edited March 2024 in Magic ETL

Hello, I am trying to isolate email addresses from an Information column in a Dataset. A sample of the data from the column would be something like 'someone's name [email@email.com]'

I don't necessarily need the name part, but I do need the Email Addresses, which are always bracketed in this column by '[]'.

I am using a Split Column tile in Magic ETL, and in the Choose Delimiter to Split On section I have 'Custom' selected and using the following REGEX code that I'm sure is wrong:

REGEX_EXTRACT([Information], "\[(.*?)\]", 0)

The code compiles, but the newly created column is copying all the text from the Information column (Column that's being split) exactly.

Any ideas how I can only grab the email address that's inside the brackets? I'm sure I'm missing something.

Thank you!

Welcome!

It looks like you're new here. Members get access to exclusive content, events, rewards, and more. Sign in or register to get started.
Sign In

Best Answer

  • Coach
    Answer ✓

    REGEX can be tricky, specially when it comes to replacing, so you might want to try a couple of different options until you find one that works, here's one you can try:

    1. REGEXP_REPLACE(`Information`,'.*([a-zA-Z0-9._-]+@[a-zA-Z0-9._-]+\.[a-zA-Z0-9_-]+).*','$1')

Answers

  • Coach
    edited March 2024

    it’s not matching anything so it’s returning the original string. You can use a formula tile and REGEXP_REPLACE

    1. REGEXP_REPLACE(`Information`, '^.*\[([^\]+).*$', '$1')
    **Was this post helpful? Click Agree or Like below**
    **Did this solve your problem? Accept it as a solution!**
  • alternatively you can split part the split part

    1. SPLIT_PART(SPLIT_PART(`Information`, '[',2), ']', 1)
    **Was this post helpful? Click Agree or Like below**
    **Did this solve your problem? Accept it as a solution!**
  • For some reason the Formula Tile gives me an Unknown Error Occurred when I try REGEX in the formula bar. That's where I tried it first.

  • Sorry @blittle - Appears it didn't get adequately submitted. I've updated the formula in the comment above.

    **Was this post helpful? Click Agree or Like below**
    **Did this solve your problem? Accept it as a solution!**
  • Hi Grant, Thank you for the help, I appreciate it!
    I tried the amended formula you posted, and 'validate formula' seems to like it, but when you run the preview, I am getting a new error.

    RegExError1.png

    Any thoughts?

    Thanks again!

  • Coach
    Answer ✓

    REGEX can be tricky, specially when it comes to replacing, so you might want to try a couple of different options until you find one that works, here's one you can try:

    1. REGEXP_REPLACE(`Information`,'.*([a-zA-Z0-9._-]+@[a-zA-Z0-9._-]+\.[a-zA-Z0-9_-]+).*','$1')
  • Wow @marcel_luthi, that's definitely closer than I've been! That did semi work. I'm terrible with REGEX, so it's all trial and error for me.
    It seems to be grabbing the email address, but only returning the last character of the person's name, the @ symbol and domain (example: S@gmail.com). So it appears to be looking at what's inside the brackets, but only grabbing a portion.

    Picture1.png

    Is it only grabbing a character count of 1 to the left of the '@' symbol?

  • Member
    edited March 2024

    I think I got the following to work…

    1. REGEXP_REPLACE(`Information`, '.?([a-zA-Z0-9.-]+@[a-zA-Z0-9.-]+.[a-zA-Z0-9_-]+).', '$1')

    It seems to be returning the full email address after an initial test, but I need to open up my filter to look at more rows.

Welcome!

It looks like you're new here. Members get access to exclusive content, events, rewards, and more. Sign in or register to get started.
Sign In