JSON Splits working in ETL preview, but show different results in dataset

Hello!

I have created a handful of column splits that preview perfectly in the ELT; however, once I RUN→Dataset, the results are off.

Example of JSON

FieldValue{attribute=PRIMITIVE, value=Known}, FieldValue{attribute=PRIMITIVE, value=1663001799.0}, FieldValue{attribute=PRIMITIVE, value=100.0},

image.png

image.png image.png image.png
Tagged:

Answers

  • You are splitting the string by comma. Replacing value equals with actual values. This may work in the preview because the data is consistent. But in execution you will have inconsistent field structures like extra spaces, missing fields, extra commas, new line variations, etc.

    You might try trimming whitespaces after splitting.

    TRIM(`Stage Name`)

    Use regular expressions for extracting data instead of splitting it on commas.

    value=([^}]+)

    You may also have inconsistent field counts. Some rows may have more or less entries. When working with JSON, I try to handle the extraction as real JSON rather that treating it as text and "grab" bits out of the string. To actually parse the JSON for values. In a Python tile for example, something like this:

    import re
    import pandas as pd

    def process(df):
    def extract_values(row):
    matches = re.findall(r'value=([^}]+)', row)
    return pd.Series(matches)

    df[['Stage Name', 'Stage Timestamp', 'Score']] = df['stage'].apply(extract_values)
    return df

    ** Was this post helpful? Click Agree or Like below. **
    ** Did this solve your problem? Accept it as a solution! **