I am using the Domo Dimensions Calendar as an input to a Python script that updates an dataset of the next 365 days as columns
|2023-01-27 | 2023-01-28 | 2023-01-29 | 2023-01-30 | ........ | 2024-01-27
So, when the script runs tomorrow the first column will be 2023-01-28, and so on.
It works, to a point. The first daily run of this script produces an error when running Domo's built-in write_dataframe code:
"Column referenced but not found: 2023-01-26" (The referenced column is yesterday's date).
I then open the ETL, run a preview, which fails.
Run a second preview, which succeeds.
My guess is that when updating the existing dataset the write_dataframe code notes the mismatch between the first columns in the output dataframe of the script and the output dataset of the ETL.
I am open to any ideas
Should/can I try to recursively update the dataset, using it as an input as well as an output?
# Import the domomagic package into the script
from domomagic import *
import pandas as pd
import numpy as np
from datetime import *
# read data from inputs into a data frame
cal = read_dataframe('1 year')
# write your script here
# read data from inputs into a data frame
print('CAL first date value: ' + str(cal.iloc[0,0]))
print('CAL last date value: ' + str(cal.iloc[-1,0]))
# Convert the dt column from datetime to date by copying
# renaming and copying back again.
# have to do this as the column is called 'dt' and
# subsequent ETL flows expect this column name
# drop all columns except for dt
df = cal.loc[:, ['dt']]
# rename dt to DateTime
df.rename(columns = {'dt':'DateTime'}, inplace = True)
# Convert datetime to date
df['Date'] = pd.to_datetime(df['DateTime']).dt.date
# rename the new date column back to dt
df.rename(columns = {'Date':'dt'}, inplace = True)
# drop DateTime column, leaving dt column as date type
df.drop('DateTime', axis='columns', inplace=True)
# insert a blank value column, so that we have a values column for the coming pivot command
df['Value'] = 0
# pivot the new df so that dates appear as columns
temp_df = df.pivot(index=None, columns='dt', values='Value')
# drop all rows except the first
new_df = temp_df.drop(temp_df.index.to_list()[1:] ,axis = 0 )
# set all row values to 0
new_df[:] = 0
# to avoid array dict errors, set columns headers to string type
new_df.columns = new_df.columns.astype(str)
print(new_df)
# write a data frame so it's available to the next action
write_dataframe(new_df)