Effective Dating

I want to capture appointment differential data and report back using effective dating. So that if I want a snapshot of a specific date, I'm pulling the most recent changes up to that moment and ignoring the rest. I've done this plenty in SQL, but looking for a solution using Domo ETL. I don't want to re-invent the wheel. Does anyone have a link to an example?

** Was this post helpful? Click Agree or Like below. **
** Did this solve your problem? Accept it as a solution! **

Tagged:

Best Answer

  • ColemenWilson
    Answer ✓

    The way I've done this is utilizing the rank and window tile to rank in descending order grouped by the unique field, in your case it looks like you'd group by `patient`. Then add a filter tile to only keep the records with a rank of 1. This will keep only the most recent date, and corresponding row, for each patient.

    If I solved your problem, please select "yes" above

Answers

  • I'm not 100% sure what you're trying to do here. Could you share a bit more and perhaps provide some examples of what a sample data table would look like before and after your ETL?

    If I solved your problem, please select "yes" above

  • @ArborRose I am not sure I fully understand the ask but if you are just trying to look at the current report then you could do a group by to get the max date and then join that back to your data and filter when the date = max date.

    **If this answer solved your problem be sure to like it and accept it as a solution!

  • @Colemen, I mocked up an example of how I think I will implement this, below.

    Every morning I gather the current appointments. I need to build a dataset of all appointments, but only where there are changes in any field. This keeps the dataset smaller.

    In this example, assume I want to see a view of what the appointments are on Aug 24, 2016. Effective dating would show only the most recent record for each person up to and including our date. But would not show anything beyond the date.

    Looking at Felicia on 2/15/17, we would see the canceled appointment but we would not see the rescheduled appointment as that hasn't happened yet.

    Below is my theory on gathering the records every day to get only the changes. I haven't tried this yet.

    Then I will need to retrieve the effective dated records, with something like group by and max date.

    ** Was this post helpful? Click Agree or Like below. **
    ** Did this solve your problem? Accept it as a solution! **

  • ArborRose
    ArborRose Coach
    edited February 14

    A follow-up comment to my mock-up. I need to see what was recorded based on the entries at that time I take a snapshot of the changes. A record may say an appointment took place on date X. It does not matter what the date X says. I need to order my records based on what was in the data…not based on what the data says. So my effective dating mockup is relying on an archive timestamp, not on the appointment date. That's just a random field in the dataset, no different than type or description.

    ** Was this post helpful? Click Agree or Like below. **
    ** Did this solve your problem? Accept it as a solution! **

  • ColemenWilson
    Answer ✓

    The way I've done this is utilizing the rank and window tile to rank in descending order grouped by the unique field, in your case it looks like you'd group by `patient`. Then add a filter tile to only keep the records with a rank of 1. This will keep only the most recent date, and corresponding row, for each patient.

    If I solved your problem, please select "yes" above

  • That is what I am doing in example I showed. I created that ETL yesterday and ran it. Now I have this morning's data…it appears to work. Based on what I see, I have records of just the changes or new appointments. I'm still validating the data. Next, I will need to do the query portion of retrieving based on the effective date.

    Btw, I tried to do this scenario using the mySQL ETL. I wasn't successful. I needed to append a dataset to gather the changes. And it wouldn't let me use the dataset I was modifying also as an input table.

    ** Was this post helpful? Click Agree or Like below. **
    ** Did this solve your problem? Accept it as a solution! **

  • Awesome. Let me know if you find that anything is not working.

    If I solved your problem, please select "yes" above

  • ArborRose
    ArborRose Coach
    edited February 14

    I appreciate the help and insights. I may not be doing this right…but the unique field I'm using is my archive timestamp and my partition is set to all the other fields. This puts every record in a chronologic order. Then when I remove duplicates on the next tile, it leaves me all the records that are new or changed. If my theory is good, it should also deviate between multiple changes within the same day. (If I decide to put data more than once in a day.)

    ** Was this post helpful? Click Agree or Like below. **
    ** Did this solve your problem? Accept it as a solution! **

  • What if on a given day, a patient schedules more than one appointment? Wouldn't using a rank of 1 give me only one of the appointments, or a change in appointment, but the others would be lost?

    ** Was this post helpful? Click Agree or Like below. **
    ** Did this solve your problem? Accept it as a solution! **

  • Do you have a timestamp or just a date?

    If I solved your problem, please select "yes" above

  • I'm using current_timestamp(). Which gives me a date and a time. I plan to only pull once per date. Which makes the time unnecessary. But if I (or another coworker) triggers it to pull more than once in a day, I need to make sure it can tell which pull is more recent.

    ** Was this post helpful? Click Agree or Like below. **
    ** Did this solve your problem? Accept it as a solution! **