Redshift Flow

We have a Redshift flow that runs daily. The flow holds the past three years of data from three different flows, filtered by date. Recently, some changes were made to the historical data, and I cannot see them reflected in the Redshift flow. However, if I write the same code in a separate Redshift dataflow, it works. Can anyone please help?

Answers

  • jessdoe (Contributor)

    Do you see the changes reflected in the three flows powering your Redshift dataflow? Were the changes made to existing columns, or did they require new columns to be added? Does the date filter cover the full date range affected by the recent changes? It sounds like your dataflow is recursive, and changes to historical data in recursive dataflows aren't always straightforward. You might consider rebuilding this in ETL and partitioning the data, possibly on the same date field.

  • lemon76

    Hey there! Yes, I see changes reflected in the three flows. The changes were to one of the flows. The raw data was replaced entirely for 2022. It isn't a recursive flow.

  • jessdoe (Contributor)
    Answer ✓

    If it's not recursive, when did the dataflow last run? That's the only reason I can think of, aside from a transform within the dataflow excluding 2022 data and thus preventing the changes made to the 2022 data from appearing in the dataflow output.
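
One way to narrow this down might be to compare the dataflow's input and output year by year, so a stale or filtered-out year stands out. The sketch below is plain Python rather than Redshift SQL. The row shape (an ISO date string plus a numeric amount), the function names, and the sample values are all assumptions made up for illustration, not anything taken from the dataflow discussed here.

```python
from collections import defaultdict


def yearly_fingerprint(rows):
    """Map each year to (row count, rounded amount total).

    `rows` is an iterable of (iso_date, amount) pairs; this shape is hypothetical.
    """
    counts = defaultdict(int)
    totals = defaultdict(float)
    for iso_date, amount in rows:
        year = int(iso_date[:4])  # "2022-05-01" -> 2022
        counts[year] += 1
        totals[year] += amount
    return {year: (counts[year], round(totals[year], 2)) for year in counts}


def mismatched_years(input_rows, output_rows):
    """Return the sorted years whose fingerprint differs between input and output."""
    source = yearly_fingerprint(input_rows)
    target = yearly_fingerprint(output_rows)
    all_years = source.keys() | target.keys()
    return sorted(y for y in all_years if source.get(y) != target.get(y))


# Made-up sample: the 2022 input row was replaced, but the output still holds the old value.
input_rows = [("2021-03-01", 10.0), ("2022-05-01", 25.0), ("2023-01-15", 7.5)]
output_rows = [("2021-03-01", 10.0), ("2022-05-01", 20.0), ("2023-01-15", 7.5)]

print(mismatched_years(input_rows, output_rows))  # prints [2022]
```

If only the changed year shows up as a mismatch, that points at either a run that predates the change or a transform that drops that year. A year missing from the output entirely would suggest the date filter.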