write_dataframe fails to write output of pd.pivot_table having Nulls

Options
Pritesh
Pritesh Member
edited March 2023 in R & Python Tiles

Domo python tile fails to write the pivoted dataframe with pd.pivot_table if it has nulls while converting - even if you add 'fill_value=np.nan'.

See below:

The pivot function works perfectly fine - it does generate the df, however, when you try to write it, it fails - see below if I write the unpivoted df, it works fine and also see the print(pivot_df) in console.

So, it's evident that pivot_table is fine, but Domo fails while writing it with native method "write_dataframe".

Any idea? or workaround? (other than using Domo native Pivot tile)

Tagged:

Best Answer

  • rco
    rco Domo Employee
    Answer ✓
    Options

    write_dataframe is failing here due to column names, not values. Each item in pivot_df.columns is expected to be a non-null non-empty string. Either pivot_table is returning a DataFrame with null column names, empty string column names, or non-string column names.

    Randall Oveson <randall.oveson@domo.com>

Answers

  • JacobFolsom
    JacobFolsom Domo Employee
    edited March 2023
    Options

    @Pritesh I haven't bumped into that one, but a workaround to consider since you are using scripting tiles in Magic ETL may be to handle the null values prior to the python tile using a formula tile.

    For example, I have a couple columns in my dataset with a null value and choose to write formulas to set defaults for text or numeric types.

    Give that a try and let me know if it helps.

    Jacob Folsom
    **Say “Thanks” by clicking the “heart” in the post that helped you.
    **Please mark the post that solves your problem by clicking on "Accept as Solution"
  • Pritesh
    Pritesh Member
    edited April 2023
    Options

    Hi @JacobFolsom - Thanks for the response and suggestion. However, in this case it wouldn't help because the NULLs are generated real-time while pivoting it. Please see an example below to further explain this:

    For e.g., if you have flat data like below (where you see Mary doesn't have a row for .NET and Mike doesn't have a row for PYTHON) that you try to pivot on column "Skills"…

    …it turns something like following:

    So, these NULLs are generated automatically for the missing rows for "Skills" values in that dataset. And, if the data is incrementing periodically with new skill, you can't predict those (or it wouldn't make sense to do that).

    The native PIVOT tile of Domo gracefully takes care of these NULLs but the write_dataframe surprisingly fails it, even with the 'fill_value=np.nan' of pandas.pivot_table.

    I wanted to check if anyone has already found a workaround for this before going to Domo support (not even sure if Domo provides any support on scripting tile).

  • rco
    rco Domo Employee
    Answer ✓
    Options

    write_dataframe is failing here due to column names, not values. Each item in pivot_df.columns is expected to be a non-null non-empty string. Either pivot_table is returning a DataFrame with null column names, empty string column names, or non-string column names.

    Randall Oveson <randall.oveson@domo.com>

  • Pritesh
    Options

    Hi @rco - Thanks for the response.

    I figured that the issue was the column name as integer. In my case, the column names were 101, 102, 103 and so on… which was causing this. Converting the column contaning this numbers to "String" type helped resolving the issue.