Way to Use Window Function to Find the Duration a Temperature Variable was Above a Threshold

random_user_098765
edited May 2023 in Beast Mode

I want to find a way to use a window function to find the duration a temperature variable was above a threshold, the timestamp at which the the temperature was above a threshold, and the range of temperatures when they were above the threshold.

The data is organized as an identifier column, the timestamp column, and the temperature column. Below is an example of the data that I am working with.

Identifier

Timestamp

Temperature

1

2022-07-08 10:00:00

47

1

2022-07-08 10:30:00

53

1

2022-07-08 11:00:00

57

1

2022-07-08 11:30:00

59

1

2022-07-08 12:00:00

44

2

2022-07-08 10:00:00

48

.

.

.

Let's say that my threshold temperature is 50. I would thus want to find the time until the temperature goes back below 50, find the timestamp when the temperature was first above the threshold, find the range of these temperatures as the max and min, and find the Identifier that corresponds to the temperature going above the threshold. For example in the above table, I would want the following:

'Identifier: 1, Timestamp of first excursion: 2022-07-08 10:30:00, Temperature Range: 53-59, Excursion Duration: 1 hour 30 minutes'.

This dataset has over 400 million rows so I believe that the windowing function will be much more helpful than Magic ETL.

I am open to any suggestions and please feel free to send me any documentation that you think may be helpful.

Thanks! 😀

Answers

  • @random_user_098765

    i may be missing something here, but I don’t think think I window function is needed.

    create a variable for the temperature threshold.

    Min temperature:

    Min(case when ‘temperature’>=variable then ‘temperature’ end)
    

    Do something similar for max.

    min time:

    Min(case when ‘temperature’>=variable then ‘time stamp’ end)
    
    

    Then graph by identifier


    “There is a superhero in all of us, we just need the courage to put on the cape.” -Superman
  • random_user_098765
    edited May 2023

    Hi @ST_-Superman-_ ,

    This is not exactly what I am looking for. I want to see how I can lead until a condition occurs so for the above example, I start in row 2 and I want to aggregate until a temperature drops below 50, which would be row 5. From this, I want to gather the first timestamp the temperature was above 50, the first timestamp the temperature drops below 50, the time difference between each of these timestamps, and the maximum and minimum temperatures during the time the temperatures were above 50.

    I may have showed a too simplistic example as one identifier could have many temperature excursions above 50.

    I thus think that a window function is useful for this problem because once I find out a way to use a lead function based on a condition, I am very confident I could do this.