I want to calculate difference between yesterday and day before yesterday.

I want to calculate the difference between yesterday and day before yesterday to see no. of files received in that day. my dates are in the below format in one column, i tried the datefiff function but couldn't get the desired results

please help

Tagged:

Best Answer

  • MichelleH
    MichelleH Coach
    Answer ✓

    You could use a beast mode like the one below to sum the file collected if the date field is one or two days before today:

    sum(case when DATE(`local_collect_date_time`) = CURRENT_DATE() - 1 then `FilesCollected` else 0 end)
    -
    sum(case when DATE(`local_collect_date_time`) = CURRENT_DATE() - 2 then `FilesCollected` else 0 end)
    

Answers

  • MichelleH
    MichelleH Coach
    Answer ✓

    You could use a beast mode like the one below to sum the file collected if the date field is one or two days before today:

    sum(case when DATE(`local_collect_date_time`) = CURRENT_DATE() - 1 then `FilesCollected` else 0 end)
    -
    sum(case when DATE(`local_collect_date_time`) = CURRENT_DATE() - 2 then `FilesCollected` else 0 end)
    

  • Aditya_Jain
    Aditya_Jain Contributor

    For Yesterday's value, you can try the following - (assuming you always want to compare against the current date and not the date from the column)

    CASE
    WHEN
    YourDateColumn = DATE_SUB(CURDATE(),INTERVAL 1 DAY)
    THEN
    YourValue ColumnEND

    Let me know if this does not work!

    'Happy to Help'
  • @Dheeraj_1996 in your dataset what does each row represent? do you have a row per each file received and the local_collect_date_time is the timestamp of when the file was received?

    If what you want to know is how many more or less files you received yesterday than you did the day before yesterday, @MichelleH approach is a nice and clean solution for that, keep in mind that this one assumes you have a field FilesCollected that tells you how many entries you collected on the given timestamp, otherwise you can just change that part of the formula with a 1 and it'll work the same. The reason I'm asking is since you mentioned what you want is to see no. of files received in that day which could be to count entries with a timestamp that falls in the given date, which you can do by choosing only the first portion of the formula she suggested.

    Also keep in mind that all of these are based on the actual calendar day and not a 24 hours window, which if you have timestamps, it might be something else stakeholders might care about to have a more "realtime" metric.

  • My dataset loads every morning at 3AM, and it appends all the data in it by date.. I want to see how many files, records I have received in that particular day. lets say I have 3 files today in the dataset and 1 file yesterday. I have received 2 files today..

  • @Dheeraj_1996 then @MichelleH answer is what you're looking for. From the sample data you've provided, assuming you were running it on the 25th line 1 will return 12 and line 3 returns 5, so the difference would be 7.

    If the idea is not to compare this to the current date, but be able to tell how many new files were received each day, then you'd need to use a window function instead, leveraging the LAG version (this will work since you've stated you have entries for every single day with no date gaps), it'd look something like:

    COUNT(`local_collect_date_time`) - (LAG(COUNT(`local_collect_date_time`), 1) OVER( ORDER BY date(`local_collect_date_time`) ASC))
    

    And you'd graph by local_collect_date_time grouping by day, with that you can get something that looks like:

    Keep in mind that since this is a lag function, the first date in your date range will have no previous day information available, so the delta will be the start date.