"Column referenced but not found" Python error creating date-sensitive columns in ETL

har_d_har1 · 2023-01-03T17:07:32+00:00


Objective:

To take an existing dataset that holds each event's start date, end date, and total value, and parse that information into a daily value that appears in one column per date between the start and end dates.

Method:

Using the Domo Dimensions Calendar, select only the rows from the current date to two years ahead.

Use the result in a Python script that appends new columns to an existing table, one per calendar row. For example, today's script will add columns '2022-12-21' to '2024-12-21' to the data table.
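As a rough illustration of the column names this method is expected to produce, here is a minimal sketch that uses only the standard library. The helper name `calendar_column_names` is hypothetical, and the sketch assumes the window runs from the start date to the same day two years later, inclusive; the real list comes from the Domo Dimensions Calendar.

```python
from datetime import date, timedelta

def calendar_column_names(start, years=2):
    """Return one 'YYYY-MM-DD' string per day from start to start + years, inclusive."""
    # Assumption: start is not 29 February (replace() would raise ValueError).
    end = start.replace(year=start.year + years)
    days = (end - start).days
    return [str(start + timedelta(days=n)) for n in range(days + 1)]

names = calendar_column_names(date(2022, 12, 21))
print(names[0], names[-1], len(names))
```

`str()` on a `date` gives the ISO form, which matches the `str(dt)` column names built in the script.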

Problem:

However, the script regularly fails with "Column referenced but not found: 2022-12-09" (This is today's example) when writing the output dataframe.

I used to think that this was a timezone problem (I am in GMT) and that the referenced column represented yesterday's date in my timezone, which was still today's date in the server's timezone.

However, today's error referenced a date 12 days ago!

The problem is not consistent, though, and tends not to occur in the afternoon GMT, which again hints at a timezone problem. I am now not so sure and need expert guidance, as I am a Python noob.
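To make the timezone hypothesis concrete, here is a minimal sketch showing how the same instant can fall on two different calendar dates. The helper `today_in` and the zone "America/Denver" are assumptions for illustration only; the post does not say which timezone the server uses, and this would not by itself explain an error that names a date 12 days in the past.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo  # standard library from Python 3.9

def today_in(tz_name, now_utc=None):
    """Return the calendar date in tz_name for a UTC instant (defaults to now)."""
    if now_utc is None:
        now_utc = datetime.now(timezone.utc)
    return now_utc.astimezone(ZoneInfo(tz_name)).date()

# The same early-morning GMT instant seen from two zones.
instant = datetime(2022, 12, 21, 3, 0, tzinfo=timezone.utc)
print(today_in("Europe/London", instant))   # 2022-12-21
print(today_in("America/Denver", instant))  # 2022-12-20 (hypothetical server zone)
```

If the calendar filter and `date.today()` disagree in this way, `str(today)` can name a column that the calendar did not supply, which would fit failures that appear only in the morning GMT.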

# Import the domomagic package into the script 
from domomagic import *
import pandas as pd
import numpy as np
from datetime import *

# read data from inputs into a data frame
cal = read_dataframe('CAL.WHERE')
eventdf = read_dataframe('EVENTS.APPEND')

# for each value in the dt column of the calendar
# create an array of length = number of calendar columns
cols = [None] * cal.shape[0]
# create an array of default values for each new column to be added to the dataframe
vals = [0] * cal.shape[0]
rng = range(cal.shape[0])
ind = [str(x) for x in rng]
today = date.today()

for i, row in cal.iterrows():
 dt = row['dt'].date()  
 cols[i]= str(dt)

# now we need to turn our cols (columns) and vals (rows) into a dictionary
# so that we can then in turn convert that into a new dataframe
zip_iterator = zip(cols, vals)
dictionary = dict(zip_iterator)

# create a zero-filled dataframe with one row per event (same index as eventdf)
new_df = pd.DataFrame(dictionary, index=eventdf.index)

# create a new dataframe from our two input dataframes: df and new_df
output_df = pd.concat([eventdf, new_df], axis=1, ignore_index=False)

# iterate through each row in the dataframe to populate date columns between
# start and end dates with the daily value
rowcount = 0
skip = 0
for i, row in output_df.iterrows():
 # make sure the ID is a string so it can be concatenated into the log messages
 eventid = str(row["ID"])
  
 start_dt = row["Start Date"].date()
 end_dt = row["End Date"].date()
 # if either of the date values are NaT, then skip this row
 if pd.isnull(start_dt):
  print('Row ' + str(rowcount) + '. ' + eventid + " skipping as start date is null")
  skip = skip + 1
  continue
 if pd.isnull(end_dt):
  print('Row ' + str(rowcount) + '. ' + eventid + " skipping as end date is null")
  skip = skip + 1
  continue   
 # we are only interested in future values, so ignore any dates prior to today
 if start_dt < today:
  print(eventid + " begins before today. Setting start date to today")
  start_dt = today
   
 # if the event ends before today, skip it
 if end_dt < today:
  print('Row ' + str(rowcount) + '. ' + eventid + " skipping as end date before today")
  skip = skip + 1
  continue
  
 effort = row["Story Daily Effort"]  
 dates = pd.date_range(start=start_dt, end=end_dt)
 #print(eventid + ': ' + str(start_dt) + ' - ' + str(end_dt))
 col_nm = str(start_dt)
 j = output_df.columns.get_loc(col_nm)
 output_df.iloc[i, j:j+len(dates)] = effort
 rowcount = rowcount + 1 
print("Skipped " + str(skip))
print("Successfully processed " + str(rowcount))
# write a data frame so it's available to the next action
write_dataframe(output_df)
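
The row-by-row fill above can also be sketched without looking up a start column by name, so that a date missing from the calendar is skipped rather than raising an error. This is a minimal sketch on toy data, not the Domo script itself: the function `fill_daily_columns` and the sample rows are hypothetical, and it assumes 'Start Date' and 'End Date' are datetime columns, as in the script.

```python
import pandas as pd

def fill_daily_columns(events, date_cols, today):
    """Return events plus one column per name in date_cols, holding the
    daily effort on each date the event covers (from today on) and 0 elsewhere."""
    today_ts = pd.Timestamp(today)
    col_dates = pd.to_datetime(pd.Index(date_cols))
    daily = {}
    for name, day in zip(date_cols, col_dates):
        # Comparisons against NaT are False, so rows with missing dates stay 0.
        covered = (events["Start Date"] <= day) & (day <= events["End Date"])
        covered = covered & (day >= today_ts)
        daily[name] = events["Story Daily Effort"].where(covered, 0)
    return pd.concat([events, pd.DataFrame(daily, index=events.index)], axis=1)

# Toy input: one event that started yesterday, one with a missing start date.
events = pd.DataFrame({
    "ID": [1, 2],
    "Start Date": pd.to_datetime(["2022-12-20", None]),
    "End Date": pd.to_datetime(["2022-12-22", "2022-12-23"]),
    "Story Daily Effort": [2.0, 5.0],
})
date_cols = ["2022-12-21", "2022-12-22", "2022-12-23"]
result = fill_daily_columns(events, date_cols, "2022-12-21")
print(result.loc[0, date_cols].tolist())  # [2.0, 2.0, 0.0]
```

Because each date column is computed from the column names that actually exist, the result never refers to a date that the calendar input did not supply.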

Accepted answers

har_d_har1

To work around this issue and allow my ETL to complete, I have created a separate ETL dataflow just to produce a table of columns from the Domo Dimensions Calendar, which I then use as an Input DataSet to the above code, essentially removing lines 11-30 of the script above.
