Split varying full names into First Name, Middle Name, and Last Name
Hi there,
Back with another question.
I have a Full Name column in varying format:
Alex McCandless
Alexander David Thoreau
Bobby Del Rio
Cameron Diaz Camilo Carter
Dave O'Donald
Jack E. Topez
...
and I need to parse this into three columns:
- First Name - the first word of the string, seems easily accomplished with the split column tile
- Last Name - the last word of the string, how can this be done?
- Middle Name - anything in between First Name and Last Name, I was thinking substring of the Full Name up to LENGTH of the First Name and ends with a space, but unsure how to accomplish
Thank you in advance for any advice!
Best Answer
-
The best way to get 100% of the way there is to fix the data input. If that isn't a realistic option, splitting each word into columns and saying that the first word is FN the last word is LN and everything in between is middle might be the best option. Without knowing which name is exactly which, there is no 100% answer.
If this helps, feel free to agree, accept or awesome it!
4
Answers
-
The best way to get 100% of the way there is to fix the data input. If that isn't a realistic option, splitting each word into columns and saying that the first word is FN the last word is LN and everything in between is middle might be the best option. Without knowing which name is exactly which, there is no 100% answer.
If this helps, feel free to agree, accept or awesome it!
4 -
Agree with @damen that there is not a perfect solution since you have to assume a single first name and single last name, but here are some things you can do to get close if you can't fix the source data. Leveraging the formula tile in Magic ETL, you can create formula fields and then reference within that same tile. Here are some useful ones to create.
Start by eliminating multiple spaces between words and beginning and trailing spaces with the squash_whitespace function
full_name = squash_whitespace(`full_name`)
find the length of the name
str_length = length(`full_name`)
remove all spaces and find the length
str_length_no_space = LENGTH(REPLACE(`full_name`,' ',''))
subtract the two to find the number of spaces
num_spaces = `str_length` - `str_length_no_space`
first name is straightforward using the split_part function
first_name = SPLIT_PART(`full_name`,' ',1)
last name will utilize the num_spaces with the split_part function
last_name = SPLIT_PART(`full_name`,' ',`num_spaces`+1)
middle name is going to be a little trickier. I would suggest utilize a case statement using the num_spaces field and then concat accordingly
CASE WHEN `num_spaces` = 2 THEN SPLIT_PART(`full_name`,2) WHEN `num_spaces` = 3 THEN CONCAT(SPLIT_PART(`full_name`,2),' ',SPLIT_PART(`full_name`,3)) END
You could extend this out further if there are additional spaces.
Hope this helps.
**Check out my Domo Tips & Tricks Videos
**Make sure to any users posts that helped you.
**Please mark as accepted the ones who solved your issue.1
Categories
- All Categories
- 1.9K Product Ideas
- 1.9K Ideas Exchange
- 1.6K Connect
- 1.3K Connectors
- 302 Workbench
- 6 Cloud Amplifier
- 9 Federated
- 2.9K Transform
- 104 SQL DataFlows
- 637 Datasets
- 2.2K Magic ETL
- 3.9K Visualize
- 2.5K Charting
- 761 Beast Mode
- 65 App Studio
- 42 Variables
- 704 Automate
- 182 Apps
- 458 APIs & Domo Developer
- 53 Workflows
- 11 DomoAI
- 39 Predict
- 16 Jupyter Workspaces
- 23 R & Python Tiles
- 401 Distribute
- 116 Domo Everywhere
- 277 Scheduled Reports
- 8 Software Integrations
- 132 Manage
- 129 Governance & Security
- 8 Domo Community Gallery
- 38 Product Releases
- 12 Domo University
- 5.4K Community Forums
- 40 Getting Started
- 30 Community Member Introductions
- 111 Community Announcements
- 4.8K Archive