Adrenaline dataflows should not re-evaluate the schema of an inbound dataset.

I have discovered that a dataset brought into an Adrenaline dataflow has its schema re-evaluated based on the first N rows that come in.

In my case, this resulted in a Google Sheet dataset coming into the dataflow with an important text field reassigned as a number. All the text values were then set to NaN. This, of course, rendered the dataflow useless, as the text values were required for the processing.

Rather than re-evaluating a schema, Adrenaline dataflows should take the schema from the dataset itself. That schema has been established and accepted prior to processing. I was told that this is working as designed, but that is insanity. A dataflow should never assume it knows better what the schema of a dataset should be and override the datatypes defined in that dataset.
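
To make the difference concrete, here is a minimal Python sketch of the two behaviors. It is not the product's actual implementation; the function names (`infer_type`, `cast`), the sample size of 3, and the `part_codes` column are all made up for illustration. It only assumes what I described above: the type is guessed from the first few rows, and values that do not fit the guessed type become NaN.

```python
import math

def infer_type(values, sample_size):
    """Guess 'number' if every sampled value parses as a float, else 'text'."""
    try:
        for v in values[:sample_size]:
            float(v)
    except ValueError:
        return "text"
    return "number"

def cast(values, column_type):
    """Cast values to the column type; unparseable numbers become NaN."""
    if column_type == "text":
        return list(values)
    out = []
    for v in values:
        try:
            out.append(float(v))
        except ValueError:
            out.append(math.nan)
    return out

# Hypothetical column: declared as text in the dataset,
# but the first rows happen to look numeric.
declared_type = "text"
part_codes = ["1001", "1002", "1003", "A-17", "B-22"]

# Behavior I ran into: re-infer the type from the first rows, then cast.
sampled_type = infer_type(part_codes, sample_size=3)
resampled = cast(part_codes, sampled_type)

# Behavior I am asking for: trust the type the dataset already declares.
preserved = cast(part_codes, declared_type)
```

With sampling, the column is guessed to be a number and the later text values are lost as NaN. With the declared schema, every value survives unchanged.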
