Data Science - Outliers

Jones01 Contributor
edited May 3 in Magic ETL

Hi,

New to using the data science tiles in ETL.

I have a dataset with sales by category by day.

I can run this through the outliers tile for one category and it will correctly mark the outliers.

What is the best way to do this per category?

Thanks

Best Answer

  • david_cunningham
    Answer ✓

    If you don't want a separate tile for each column you want to detect outliers on, one option is to use the Python tile.

    You can write one or more functions that run through the columns you want to evaluate and generate everything in one tile. You can also customize how the outlier detection works. The standard deviation and mean absolute deviation methods from the Outlier Detection tile can still be implemented, and you can adapt them more precisely to your particular data and context.
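
As a rough illustration of that idea applied to the per-category case in the question, here is a minimal pandas sketch. The function name `flag_outliers_by_group`, the column names `category` and `sales`, and the sample values are all made up for the example, and the thresholds are arbitrary. Reading the input and writing the output inside the Python tile is left out and would use the tile's own helpers.

```python
import pandas as pd


def flag_outliers_by_group(df, group_col, value_col, threshold=3.0, method="std"):
    """Return a copy of df with a boolean 'is_outlier' column.

    Each row is scored against the statistics of its own group, so a value
    that is normal for one category can still be flagged in another.
    (Hypothetical helper; names and defaults are assumptions.)
    """
    out = df.copy()
    grouped = out.groupby(group_col)[value_col]
    center = grouped.transform("mean")
    abs_dev = (out[value_col] - center).abs()

    if method == "std":
        # Sample standard deviation within each group.
        spread = grouped.transform("std")
    elif method == "mad":
        # Mean absolute deviation from the group mean.
        spread = abs_dev.groupby(out[group_col]).transform("mean")
    else:
        raise ValueError("method must be 'std' or 'mad'")

    # Groups with zero or undefined spread get NaN scores and are never flagged.
    spread = spread.where(spread > 0)
    out["is_outlier"] = (abs_dev / spread) > threshold
    return out


# Made-up daily sales for two categories with very different scales.
data = pd.DataFrame({
    "category": ["A"] * 8 + ["B"] * 8,
    "sales": [10, 11, 9, 10, 12, 10, 11, 100,
              200, 210, 190, 205, 195, 200, 202, 198],
})

flagged = flag_outliers_by_group(data, "category", "sales", threshold=2.0)
print(flagged[flagged["is_outlier"]])
```

Using `groupby(...).transform(...)` keeps the result aligned row by row with the input, so no join back to the original data is needed. Note that with only a few rows per group a high standard deviation threshold may never be reached, so the threshold should be chosen with the group sizes in mind.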

    David Cunningham

