Split varying full names into First Name, Middle Name, and Last Name
Hi there,
Back with another question.
I have a Full Name column in varying format:
Alex McCandless
Alexander David Thoreau
Bobby Del Rio
Cameron Diaz Camilo Carter
Dave O'Donald
Jack E. Topez
...
and I need to parse this into three columns:
- First Name - the first word of the string, seems easily accomplished with the split column tile
- Last Name - the last word of the string, how can this be done?
- Middle Name - anything in between First Name and Last Name, I was thinking substring of the Full Name up to LENGTH of the First Name and ends with a space, but unsure how to accomplish
Thank you in advance for any advice!
Best Answer
-
The best way to get 100% of the way there is to fix the data input. If that isn't a realistic option, splitting each word into columns and saying that the first word is FN the last word is LN and everything in between is middle might be the best option. Without knowing which name is exactly which, there is no 100% answer.
If this helps, feel free to agree, accept or awesome it!
4
Answers
-
The best way to get 100% of the way there is to fix the data input. If that isn't a realistic option, splitting each word into columns and saying that the first word is FN the last word is LN and everything in between is middle might be the best option. Without knowing which name is exactly which, there is no 100% answer.
If this helps, feel free to agree, accept or awesome it!
4 -
Agree with @damen that there is not a perfect solution since you have to assume a single first name and single last name, but here are some things you can do to get close if you can't fix the source data. Leveraging the formula tile in Magic ETL, you can create formula fields and then reference within that same tile. Here are some useful ones to create.
Start by eliminating multiple spaces between words and beginning and trailing spaces with the squash_whitespace function
full_name = squash_whitespace(`full_name`)
find the length of the name
str_length = length(`full_name`)
remove all spaces and find the length
str_length_no_space = LENGTH(REPLACE(`full_name`,' ',''))
subtract the two to find the number of spaces
num_spaces = `str_length` - `str_length_no_space`
first name is straightforward using the split_part function
first_name = SPLIT_PART(`full_name`,' ',1)
last name will utilize the num_spaces with the split_part function
last_name = SPLIT_PART(`full_name`,' ',`num_spaces`+1)
middle name is going to be a little trickier. I would suggest utilize a case statement using the num_spaces field and then concat accordingly
CASE WHEN `num_spaces` = 2 THEN SPLIT_PART(`full_name`,2) WHEN `num_spaces` = 3 THEN CONCAT(SPLIT_PART(`full_name`,2),' ',SPLIT_PART(`full_name`,3)) END
You could extend this out further if there are additional spaces.
Hope this helps.
**Check out my Domo Tips & Tricks Videos
**Make sure to any users posts that helped you.
**Please mark as accepted the ones who solved your issue.1
Categories
- All Categories
- 1.7K Product Ideas
- 1.7K Ideas Exchange
- 1.5K Connect
- 1.2K Connectors
- 292 Workbench
- 4 Cloud Amplifier
- 8 Federated
- 2.8K Transform
- 95 SQL DataFlows
- 602 Datasets
- 2.1K Magic ETL
- 3.7K Visualize
- 2.4K Charting
- 697 Beast Mode
- 43 App Studio
- 39 Variables
- 658 Automate
- 170 Apps
- 441 APIs & Domo Developer
- 42 Workflows
- 5 DomoAI
- 32 Predict
- 12 Jupyter Workspaces
- 20 R & Python Tiles
- 386 Distribute
- 111 Domo Everywhere
- 269 Scheduled Reports
- 6 Software Integrations
- 113 Manage
- 110 Governance & Security
- 8 Domo University
- 30 Product Releases
- Community Forums
- 39 Getting Started
- 29 Community Member Introductions
- 98 Community Announcements
- Domo Community Gallery
- 4.8K Archive